I'm trying to solve the following problem in R: I have a dataframe with two variables (number of successes, and number of total trials).
# A tibble: 4 x 2
We can use pmap after renaming the columns to match the argument names of 'prop.test' (x for the successes, n for the trials)
pmap(setNames(df, c("x", "n")), prop.test)
Or use map2, which passes the two columns to 'prop.test' positionally
map2(df$Success, df$N, prop.test)
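To see that the two calls above are equivalent, here is a minimal self-contained sketch. The data frame is rebuilt from the values printed in this answer; the names by_name, by_position, p_by_name and p_by_position are only for illustration.

```r
library(purrr)
library(tibble)

# Data rebuilt from the values shown in this answer
df <- tibble(Success = c(38, 12, 27, 9), N = c(50, 50, 50, 50))

# pmap() matches list elements to arguments by name, hence the rename to x and n
by_name <- pmap(setNames(df, c("x", "n")), prop.test)

# map2() passes its two inputs positionally: x first, then n
by_position <- map2(df$Success, df$N, prop.test)

# Both return one "htest" object per row; pull out the p-values to compare
p_by_name <- map_dbl(by_name, "p.value")
p_by_position <- map_dbl(by_position, "p.value")
```

Either way the result is a list with one test object per row, so which one to use is mostly a matter of taste.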
The problem with map is that it loops over the columns of the dataset, because a data frame is a list of column vectors
df %>%
  map(~ .x)
#$Success
#[1] 38 12 27 9
#$N
#[1] 50 50 50 50
So, inside map, .x is a whole column vector, and we cannot do .x$Success or .x$N
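A small sketch of what map actually iterates over, assuming the same df as in this answer (the names n_iterations, column_classes and row_props are made up for the example):

```r
library(purrr)
library(tibble)

df <- tibble(Success = c(38, 12, 27, 9), N = c(50, 50, 50, 50))

# map() visits each column once, so there are 2 iterations, not 4
n_iterations <- length(map(df, ~ .x))

# Each .x is a plain numeric vector, so there is nothing to index with $
column_classes <- map_chr(df, class)

# Row-wise work needs the columns supplied in parallel, e.g. with map2
row_props <- map2_dbl(df$Success, df$N, ~ .x / .y)
```

This is why the row-wise solutions here use map2 or pmap rather than a plain map over the data frame.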
As @Steven Beaupre mentioned, if we need to create new columns with the p-value and confidence interval, we can call map2 inside mutate and extract the pieces from each test result
res <- df %>%
  mutate(newcol = map2(Success, N, prop.test),
         pval = map_dbl(newcol, ~ .x[["p.value"]]),
         CI = map(newcol, ~ as.numeric(.x[["conf.int"]]))) %>%
  select(-newcol)
# A tibble: 4 x 4
# Success N pval CI
#
#1 38.0 50.0 0.000407
#2 12.0 50.0 0.000407
#3 27.0 50.0 0.671
#4 9.00 50.0 0.0000116
The 'CI' column is a list column holding two values per row (the lower and upper bound), which can be unnested to give 'long' format data with two rows per original row
res %>%
  unnest(cols = CI)
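For reference, a self-contained sketch of the long format, assuming tidyr 1.0.0 or later (where unnest takes a cols argument) and the same df as in this answer; the name long is only for illustration:

```r
library(dplyr)
library(purrr)
library(tidyr)

df <- tibble(Success = c(38, 12, 27, 9), N = c(50, 50, 50, 50))

res <- df %>%
  mutate(newcol = map2(Success, N, prop.test),
         pval = map_dbl(newcol, ~ .x[["p.value"]]),
         CI = map(newcol, ~ as.numeric(.x[["conf.int"]]))) %>%
  select(-newcol)

# Each row's two CI bounds become two rows: lower bound first, then upper
long <- res %>%
  unnest(cols = CI)
```

The other columns (Success, N, pval) are repeated for both rows of each pair, so the long form has 8 rows.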
Or create 3 separate columns (p-value, lower bound, upper bound)
df %>%
  mutate(newcol = map2(Success, N, ~ prop.test(.x, n = .y) %>%
                         {tibble(pvalue = .[["p.value"]],
                                 CI_lower = .[["conf.int"]][[1]],
                                 CI_upper = .[["conf.int"]][[2]])})) %>%
  unnest(cols = newcol)
# A tibble: 4 x 5
# Success N pvalue CI_lower CI_upper
#
#1 38.0 50.0 0.000407 0.615 0.865
#2 12.0 50.0 0.000407 0.135 0.385
#3 27.0 50.0 0.671 0.395 0.679
#4 9.00 50.0 0.0000116 0.0905 0.319