dplyr::mutate to add multiple values

旧时难觅i 2020-11-30 04:15

There are already a couple of issues about this on the dplyr GitHub repo, and at least one related SO question, but I don't think any of them quite covers my question.

6 Answers
  •  春和景丽
    2020-11-30 04:29

    Here are some possibilities with rowwise and nesting.

    library("dplyr")
    library("tidyr")
    

    A data frame with repeated x/n combinations, for fun:

    dd <- data.frame(x=c(3, 4, 3), n=c(10, 11, 10))
    

    A version of the CI function that returns a data frame, like @Joran's:

    get_binCI_df <- function(x,n) {
      binom.test(x, n)$conf.int %>% 
        setNames(c("lwr", "upr")) %>% 
        as.list() %>% as.data.frame()
    }
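
    As a quick check of what the helper returns, here is a minimal sketch that calls it on a single x/n pair. The name `get_binCI_df2` is hypothetical; it is a base-R rewrite, assumed equivalent to the version above, so the block runs without dplyr loaded.

    ```r
    # Hypothetical base-R equivalent of get_binCI_df(), written without %>%
    get_binCI_df2 <- function(x, n) {
      ci <- binom.test(x, n)$conf.int   # numeric vector: lower and upper bound
      data.frame(lwr = ci[1], upr = ci[2])
    }

    res <- get_binCI_df2(3, 10)
    str(res)   # one row with two numeric columns, lwr and upr
    ```

    Returning a one-row data frame, rather than a bare vector, is what lets the result be bound back onto the original rows in the approaches that follow.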
    

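    Grouping by x and n, as before, removes the duplicate. The helper gets the first element of each group's .$x and .$n, because a group can contain several identical rows.

    dd %>% group_by(x,n) %>% do(get_binCI_df(.$x[1],.$n[1]))
    # # A tibble: 2 x 4
    # # Groups:   x, n [2]
    #       x     n        lwr       upr
    #   <dbl> <dbl>      <dbl>     <dbl>
    # 1     3    10 0.06673951 0.6524529
    # 2     4    11 0.10926344 0.6920953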
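
    A caveat when using do() on grouped data: inside a group with more than one row, .$x and .$n are vectors rather than scalars. binom.test() interprets a length-2 x as counts of successes and failures and ignores n, so passing the whole vector silently changes the interval. A small base-R sketch of that behaviour:

    ```r
    # With a length-2 x, binom.test() reads it as c(successes, failures)
    # and ignores n, so both calls describe 3 successes out of 6 trials.
    ci_vec    <- as.numeric(binom.test(c(3, 3), c(10, 10))$conf.int)
    ci_scalar <- as.numeric(binom.test(3, 6)$conf.int)

    same <- isTRUE(all.equal(ci_vec, ci_scalar))
    print(same)
    ```

    Indexing with [1], or collapsing duplicates with distinct() before the call, keeps the arguments scalar.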
    

    Using rowwise keeps all the rows but drops x and n unless you put them back with cbind(., ...), as Ben does in his original post.

    dd %>% rowwise() %>% do(cbind(., get_binCI_df(.$x,.$n)))
    # Source: local data frame [3 x 4]
    # Groups: <by row>
    #
    # # A tibble: 3 x 4
    #       x     n        lwr       upr
    # * <dbl> <dbl>      <dbl>     <dbl>
    # 1     3    10 0.06673951 0.6524529
    # 2     4    11 0.10926344 0.6920953
    # 3     3    10 0.06673951 0.6524529
    

    It feels like nesting could work more cleanly, but this is as good as I can get. Using mutate means I can refer to x and n directly instead of .$x and .$n, but mutate expects a single value per row, so the returned data frame needs to be wrapped in list() and then unnested.

    dd %>% rowwise() %>% mutate(ci=list(get_binCI_df(x, n))) %>% unnest(ci)
    # # A tibble: 3 x 4
    #       x     n        lwr       upr
    #   <dbl> <dbl>      <dbl>     <dbl>
    # 1     3    10 0.06673951 0.6524529
    # 2     4    11 0.10926344 0.6920953
    # 3     3    10 0.06673951 0.6524529
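
    To see what the list() wrapping produces before unnesting, here is a minimal sketch that stops at the intermediate step. The helper is redefined in base R only so that the block is self-contained, and the name `nested` is illustrative.

    ```r
    library(dplyr)

    dd <- data.frame(x = c(3, 4, 3), n = c(10, 11, 10))

    # Base-R stand-in for the helper defined earlier
    get_binCI_df <- function(x, n) {
      ci <- binom.test(x, n)$conf.int
      data.frame(lwr = ci[1], upr = ci[2])
    }

    # ci becomes a list column holding one 1-row data frame per input row
    nested <- dd %>%
      rowwise() %>%
      mutate(ci = list(get_binCI_df(x, n))) %>%
      ungroup()

    class(nested$ci)   # a list, not a numeric column
    nested$ci[[1]]     # the data frame stored for the first row
    ```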
    

    Finally, something like this was an open issue for dplyr as of 5 Oct 2017; see https://github.com/tidyverse/dplyr/issues/2326. If it is implemented, that will be the easiest way!
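
    For readers on a newer dplyr: from version 1.0.0, mutate() accepts an unnamed expression that returns a data frame and adds its columns directly. The sketch below assumes dplyr >= 1.0.0 and redefines the helper in base R so the block is self-contained; treat it as an illustration rather than the resolution of the issue linked above.

    ```r
    library(dplyr)

    dd <- data.frame(x = c(3, 4, 3), n = c(10, 11, 10))

    # Base-R stand-in for the helper defined earlier
    get_binCI_df <- function(x, n) {
      ci <- binom.test(x, n)$conf.int
      data.frame(lwr = ci[1], upr = ci[2])
    }

    # Assumes dplyr >= 1.0.0: an unnamed data-frame result in mutate()
    # is unpacked into the new columns lwr and upr, one row at a time.
    res <- dd %>%
      rowwise() %>%
      mutate(get_binCI_df(x, n)) %>%
      ungroup()

    res
    ```

    This keeps all three rows along with x and n, without cbind() or a list column.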
