| Title: | Exploratory Data Analysis for Public Policy Applied to Culture |
|---|---|
| Description: | Implementation of frequency tables and bar charts for qualitative variables and checkbox fields. This package implements tables and charts used in reports at Funarte (National Arts Foundation) and OBEC (Culture and Creative Economy Observatory) in Brazil, and its main purpose is to simplify the use of R for people with a background in the humanities and arts. Examples and details can be viewed in this presentation from 2026: <https://formacao2026.netlify.app/assets/modulo_3/modulo3#/title-slide>. |
| Authors: | Gilberto Sassi [aut, cre] |
| Maintainer: | Gilberto Sassi <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-28 07:35:57 UTC |
| Source: | https://github.com/cran/datacult |
The x-axis shows the frequency, and each data label shows the percentage.
bar_chart( data, variable, axis_title = "", fill = "blue", axis_title_axis_size = 22, axis_text_axis_size = 22, data_label_size = 15, sort = TRUE )
data |
a dataframe object |
variable |
variable name |
axis_title |
y-axis label. Defaults to "". |
fill |
bar color. Character scalar with a hexadecimal color or a named color. Defaults to "blue". |
axis_title_axis_size |
integer scalar. Size of axis labels. Defaults to 22. |
axis_text_axis_size |
integer scalar. Size of values annotated in axis. Defaults to 22. |
data_label_size |
integer scalar. Size of data labels. Defaults to 15. |
sort |
logical scalar. If TRUE, the bars are ordered by frequency. Defaults to TRUE. |
a ggplot2 object
df <- data.frame(var = sample(c("A", "B"), size = 100, replace = TRUE))
bar_chart(df, var)
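The numbers that such a chart displays (a frequency per category, a percent label, and an optional ordering by frequency) can be sketched in base R. This is only an illustration, not the package's implementation: the names `chart_data`, `category`, `n`, and `label` are hypothetical, and the sketch assumes the percentage is taken over all rows.

```r
# Hypothetical base R sketch of the summary behind a frequency bar chart.
# Not the package's code; column names are chosen for illustration only.
df <- data.frame(var = c("A", "B", "B", "A", "B"))

# Frequency of each category.
counts <- table(df$var)
chart_data <- data.frame(category = names(counts), n = as.integer(counts))

# Assumption: the data label is the share of all rows, formatted as a percent.
chart_data$label <- sprintf("%.1f%%", 100 * chart_data$n / sum(chart_data$n))

# Equivalent of sort = TRUE: order the categories by frequency.
chart_data <- chart_data[order(chart_data$n, decreasing = TRUE), ]

print(chart_data)
```

Each row of `chart_data` corresponds to one bar: `n` gives the bar length and `label` the text drawn next to it.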
Builds a bar chart with ggplot2 for a checkbox field, where the user can select two or more options. Each option is stored in a separate column.
bar_chart_checkbox( data, columns, sucess = "checked", labels = NULL, axis_title = "", fill = "blue", axis_title_axis_size = 22, axis_text_axis_size = 22, data_label_size = 15, sort = TRUE )
data |
dataframe object |
columns |
character vector. Columns to count. |
sucess |
character scalar. Category that indicates success. Defaults to "checked". |
labels |
character vector. Label of each category. Defaults to NULL. |
axis_title |
y-axis label. Defaults to "". |
fill |
bar color. Character scalar with a hexadecimal color or a named color. Defaults to "blue". |
axis_title_axis_size |
integer scalar. Size of axis labels. Defaults to 22. |
axis_text_axis_size |
integer scalar. Size of values annotated in axis. Defaults to 22. |
data_label_size |
integer scalar. Size of data labels. Defaults to 15. |
sort |
logical scalar. If TRUE, the bars are ordered by frequency. Defaults to TRUE. |
a ggplot2 object
df <- data.frame( x1 = c("checked", "checked", "unchecked"), x2 = c("checked", "unchecked", "checked") )
bar_chart_checkbox(df, c("x1", "x2"))
Number of missing values in a vector.
n_missing(x)
x |
an atomic vector |
an integer scalar
n_missing(c(1, NA, 3))
Number of non-missing values in a vector.
n_no_missing(x)
x |
an atomic vector |
scalar integer
n_no_missing(c(1, NA, 3))
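For readers new to R, the two counts described above can be reproduced with base R alone. The helpers `count_missing` and `count_present` below are hypothetical names used for illustration; they are an assumption about the behavior, not the package's source code.

```r
# Hypothetical base R equivalents; not the package's own functions.

# Number of NA entries in an atomic vector.
count_missing <- function(x) {
  sum(is.na(x))
}

# Number of entries that are not NA.
count_present <- function(x) {
  sum(!is.na(x))
}

x <- c(1, NA, 3)
count_missing(x)
count_present(x)
```

The two counts always add up to `length(x)`, which is a quick way to check a column before building a frequency table.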
Each category (or option) is a variable in the dataframe, and the function counts the number of successes in each column. The percent column is the ratio of this count to the number of observations.
tab_freq_checkbox( data, columns, sucess = "checked", labels = NULL, variable_name = NULL )
data |
dataframe object |
columns |
character vector. Columns to count. |
sucess |
character scalar. Category that indicates success. Defaults to "checked". |
labels |
character vector. Label of each category. Defaults to NULL. |
variable_name |
character scalar. Name of field in the form. Defaults to NULL. |
Missing values are ignored.
a dataframe with n and percent
data <- data.frame( x1 = c("checked", "checked", "unchecked"), x2 = c("checked", "unchecked", "checked") )
tab_freq_checkbox(data, c("x1", "x2"))
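The counting behind a checkbox frequency table can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the output names `manual_tab` and `option` are hypothetical, and `percent` is computed here as a plain ratio of the count to the number of rows (the package may scale or format it differently).

```r
# Hypothetical base R sketch of a checkbox frequency table.
data <- data.frame(
  x1 = c("checked", "checked", "unchecked"),
  x2 = c("checked", "unchecked", "checked")
)
columns <- c("x1", "x2")
sucess <- "checked"  # spelled as in the function's argument name

# Count, for each column, the rows equal to the success category.
# na.rm = TRUE leaves missing values out of the count.
n <- vapply(
  columns,
  function(col) sum(data[[col]] == sucess, na.rm = TRUE),
  integer(1)
)

# Assumption: percent is the count divided by the number of observations.
percent <- n / nrow(data)

manual_tab <- data.frame(option = columns, n = unname(n), percent = unname(percent))
print(manual_tab)
```

Because each option is its own column, the percentages need not sum to one: a respondent who checks several options is counted once in every column they checked.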
Bins a continuous variable and counts the occurrences in each interval.
tab_freq_cont( data, variable, breaks = NULL, labels = NULL, include_lowest = TRUE, right = FALSE )
data |
dataframe object |
variable |
variable name as character |
breaks |
either an integer or a numeric vector. Defaults to NULL. |
labels |
labels for each interval. Defaults to NULL, in which case the interval names use brackets and parentheses. |
include_lowest |
logical value. If |
right |
logical value. If |
If breaks = NULL, the number of intervals is ceiling(1 + log2(n)), where n is the sample size.
a dataframe with the frequency distribution
tab_freq_cont(iris, "Sepal.Width")
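The default rule ceiling(1 + log2(n)) and the binning step can be tried out in base R. The sketch below is an assumption about how such a table could be built, not the package's code: it uses `cut()` with equal-width intervals, maps `include_lowest` and `right` to the `include.lowest` and `right` arguments of `cut()`, and uses the hypothetical names `k`, `bins`, and `freq`.

```r
# Hypothetical base R sketch of a frequency table for a continuous variable.
x <- iris$Sepal.Width
n_obs <- length(x)

# Number of intervals from the rule ceiling(1 + log2(n)).
k <- ceiling(1 + log2(n_obs))

# Assumption: equal-width intervals, closed on the left ([a, b)),
# mirroring include_lowest = TRUE and right = FALSE.
bins <- cut(x, breaks = k, include.lowest = TRUE, right = FALSE)

# Count the observations that fall in each interval.
counts <- table(bins)
freq <- data.frame(
  interval = names(counts),
  n = as.integer(counts),
  percent = as.numeric(counts) / n_obs
)

print(freq)
```

With the 150 rows of `iris`, the rule gives nine intervals. Passing a numeric vector to `breaks` in `cut()` instead of a single number sets the interval limits explicitly.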