My goal is to create a function that, when looped over multiple variables of a data frame, will return a new data frame containing the percents and 95% confidence intervals.
The nice thing about all the functions you're using is that they are already vectorized (sd is the exception, since it reduces a whole vector to one number; qt already accepts vector arguments, and you can make the vectorization over a specific argument explicit with Vectorize). This means you can pass vectors to them without needing to write a single loop. I left out the parts of your function that deal with preparing the input and prettying up the output.
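To make the vectorization point concrete, here is a minimal sketch with made-up counts (the vector `count` and the total `N` below are illustrative values, not taken from your data). Arithmetic and sqrt work element-wise, so one expression handles every level at once:

```r
# Hypothetical counts for three levels of a factor (illustrative values only)
count <- c(a = 11, b = 7, c = 14)
N <- sum(count)

prop <- count / N                        # element-wise division: one proportion per level
se <- sqrt(prop * (1 - prop) / (N - 1))  # sqrt and arithmetic are vectorized too

# sd() is different: it reduces a whole vector to a single number
sd(count)

# Vectorize() wraps a function so it is applied over the named argument
qt_df <- Vectorize(qt, "df")
qt_df(0.975, c(5, 10, 31))               # one quantile per df value
```

The same pattern is what lets the function below compute every level of a variable in one pass.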
t1.props <- function(var, data=mtcars) {
    N <- nrow(data)
    levels <- names(table(data[,var]))
    count <- unclass(table(data[,var]))       # counts
    prop <- count / N                         # proportions
    se <- sqrt(prop * (1-prop)/(N-1))         # standard errors of props.
    lprop <- log(prop) - log(1-prop)          # logit (log-odds) of prop
    lse <- se / (prop*(1-prop))               # se on the logit scale
    stat <- Vectorize(qt, "df")(0.975, N-1)   # t quantile (same value as qt(0.975, N-1))
    llower <- lprop - stat*lse                # lower limit on the logit scale
    lupper <- lprop + stat*lse                # upper limit on the logit scale
    lower <- exp(llower) / (1 + exp(llower))  # lower ci, back on the proportion scale
    upper <- exp(lupper) / (1 + exp(lupper))  # upper ci, back on the proportion scale
    data.frame(variable=var,
               level=levels,
               perc=100*prop,
               lower=100*lower,
               upper=100*upper)
}
So, the only explicit applying/looping comes when you apply the function to multiple variables, as follows:
## Apply your function to two variables
do.call(rbind, lapply(c("cyl", "am"), t1.props))
#   variable level   perc    lower    upper
# 4      cyl     4 34.375 19.49961 53.11130
# 6      cyl     6 21.875 10.34883 40.44691
# 8      cyl     8 43.750 27.09672 61.94211
# 0       am     0 59.375 40.94225 75.49765
# 1       am     1 40.625 24.50235 59.05775
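A side note on the interval arithmetic inside t1.props: the interval is built on the log-odds (logit) scale and then mapped back to a proportion. Base R's stats package provides that pair of transformations as qlogis and plogis, so the hand-written expressions could be swapped for them if you prefer. A minimal check, using an arbitrary proportion chosen only for illustration:

```r
p <- 0.34375                              # arbitrary proportion for illustration

logit_manual  <- log(p) - log(1 - p)      # the expression used in t1.props
logit_builtin <- qlogis(p)                # built-in logit

back_manual  <- exp(logit_manual) / (1 + exp(logit_manual))
back_builtin <- plogis(logit_builtin)     # built-in inverse logit

# Both comparisons should agree up to floating-point tolerance
all.equal(logit_manual, logit_builtin)
all.equal(back_manual, back_builtin)
```

Working on the logit scale is what keeps the back-transformed limits inside 0 and 100 percent.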
As for the loop in your code, it isn't particularly important in terms of efficiency, but you can see how much easier code is to read when it's concise, and the apply functions offer a lot of simple one-line solutions.
I think the most important thing to change in your code is the use of assign and get. Instead, you can store variables in lists or another data structure, and use setNames, names<-, or names(...) <- to name the components when needed.
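For example, rather than creating one object per variable with assign and retrieving it with get, the per-variable results can live in a single named list. The sketch below is only an illustration: `vars` and the helper `prop_table` are hypothetical names, and mtcars stands in for your data.

```r
vars <- c("cyl", "am")                    # hypothetical set of variables to summarise

# Hypothetical helper: percentages for one variable of a data frame
prop_table <- function(var, data = mtcars) {
  100 * prop.table(table(data[[var]]))
}

# One list element per variable, named after the variable
results <- setNames(lapply(vars, prop_table), vars)

results[["cyl"]]                          # look up by name instead of get("...")

# names(...) <- does the same job on an existing list
unnamed <- lapply(vars, prop_table)
names(unnamed) <- vars
identical(names(unnamed), names(results))
```

Because everything sits in one list, you can pass it straight to lapply or do.call without building object names as strings.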