dplyr::mutate to add multiple values

旧时难觅i 2020-11-30 04:15

There are already a couple of issues about this on the dplyr GitHub repo, and at least one related SO question, but I don't think any of them quite covers my question.

6 Answers
  •  悲哀的现实
    2020-11-30 04:27

    This is an old question (with plenty of good answers), but it's a great use case for tidyverse's broom package, which tidies the output of test and modeling objects (such as those returned by binom.test, lm, etc.).

    It's more verbose than other methods, but I think it matches your desire for a more expressive approach.

    The process is:

    1. Define the groups that you'll run binom.test on (in this case, the groups are defined by x and n) and nest them, creating a separate data.frame for each group (stored as a list-column within the full data.frame)
    2. map the binom.test call to the x and n values from each group
    3. tidy the binom.test output for each group (this is where broom comes in)
    4. unnest the tidied test output data.frames into the full data.frame

    Now you're left with a data.frame where each row contains the x and n values, combined with all of the output from the corresponding binom.test, neatly formatted with a separate column for each piece of output (point estimate, lower/upper confidence limits, p-value, etc.).
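
    To see what step 3 contributes on its own, here is a minimal sketch that tidies a single binom.test result. The name `single` is only for illustration and is not part of the original answer. The result should be a one-row data.frame whose columns include estimate, p.value, conf.low and conf.high:

    ```r
    library(broom)

    # Exact binomial test for 3 successes out of 10 trials (default null: p = 0.5)
    single <- tidy(binom.test(3, 10))

    # A single row; the column names are what unnest() later spreads out
    nrow(single)
    names(single)
    ```

    The full pipeline makes this same call once per (x, n) group.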

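    library(tidyverse)
    library(broom)
    dd <- data.frame(x = c(3, 4), n = c(10, 11))
    dd %>%
      group_by(x, n) %>%
      nest() %>%                                            # 1. one nested data.frame per (x, n) group
      mutate(test = map(data, ~tidy(binom.test(x, n)))) %>% # 2-3. run the test per group, then tidy it
      unnest(test)                                          # 4. expand the tidied output into columns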
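    #> # A tibble: 2 x 11
    #> # Groups:   x, n [2]
    #>       x     n data  estimate statistic p.value parameter conf.low conf.high
    #>   <dbl> <dbl> <lis>    <dbl>     <dbl>   <dbl>     <dbl>    <dbl>     <dbl>
    #> 1     3    10 <tib…    0.3           3   0.344        10   0.0667     0.652
    #> 2     4    11 <tib…    0.364         4   0.549        11   0.109      0.692
    #> # … with 2 more variables: method <chr>, alternative <chr>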
    

    From here, you can get to your exact desired format by renaming and selecting the output variables you want:

    dd %>%
      group_by(x, n) %>%
      nest() %>%
      mutate(test = map(data, ~tidy(binom.test(x, n)))) %>%
      unnest(test) %>%
      rename(lwr = conf.low, upr = conf.high) %>%
      select(x, n, lwr, upr)
    #> # A tibble: 2 x 4
    #> # Groups:   x, n [2]
    #>       x     n    lwr   upr
    #>   <dbl> <dbl>  <dbl> <dbl>
    #> 1     3    10 0.0667 0.652
    #> 2     4    11 0.109  0.692
    

    As mentioned, it's verbose, much more so than (for example) @joran's beautifully succinct answer:

    dd %>% 
        group_by(x,n) %>%
        do(foo(.$x,.$n))
    

    However, the benefit of the broom approach is that you won't need to define a function foo (or get_binCI). It's fully self-contained, and in my opinion far more expressive and flexible.
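
    For comparison, the do() version depends on a helper that has to be defined separately. The definition of foo is not reproduced in this answer, so the following is only a guess at its shape: a hypothetical function that returns the confidence limits as a one-row data.frame, which is the kind of value do() expects.

    ```r
    library(dplyr)

    dd <- data.frame(x = c(3, 4), n = c(10, 11))

    # Hypothetical stand-in for the helper (foo / get_binCI) that do() relies on;
    # the original definition is not shown in this answer.
    foo <- function(x, n) {
      ci <- binom.test(x, n)$conf.int
      data.frame(lwr = ci[1], upr = ci[2])
    }

    # do() calls foo once per group and prepends the grouping columns x and n
    res <- dd %>%
      group_by(x, n) %>%
      do(foo(.$x, .$n))
    ```

    With broom, tidy() plays the role of this helper, so nothing extra needs to be written or maintained.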
