I would like to create an automated knitr report that will produce histograms for each numeric field within my dataframe. My goal is to do this without having to specify the
I am using child Rmd files in markdown, also works in sweave.
in Rmd use following snippet:
```{r run-numeric-md, include=FALSE}
out = NULL
for (i in c(1:num_vars)) {
out = c(out, knit_child('da-numeric.Rmd'))
}
```
da-numeric.Rmd looks like:
Variabele `r num_var_names[i]`
------------------------------------
Missing : `r sum(is.na(data[[num_var_names[i]]]))`
Minimum value : `r min(na.omit(data[[num_var_names[i]]]))`
Percentile 1 : `r quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[2]`
Percentile 99 : `r quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[100]`
Maximum value : `r max(na.omit(data[[num_var_names[i]]]))`
```{r results='asis', comment="" }
warn_extreme_values=3
d1 = quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[2] > warn_extreme_values*quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[1]
d99 = quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[101] > warn_extreme_values*quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[100]
if(d1){cat('Warning : Suspect extreme values in left tail')}
if(d99){cat('Warning : Suspect extreme values in right tail')}
```
``` {r eval=TRUE, fig.width=6, fig.height=2}
library(ggplot2)
v <- num_var_names[i]
hp <- ggplot(na.omit(data), aes_string(x=v)) + geom_histogram( colour="grey", fill="grey", binwidth=diff(range(na.omit(data[[v]]))/100))
hp + theme(axis.title.x = element_blank(),axis.text.x = element_text(size=10)) + theme(axis.title.y = element_blank(),axis.text.y = element_text(size=10))
```
see my datamineR package on github https://github.com/hugokoopmans/dataMineR