`group_by` and keep grouping levels as nested data frame's name

痞子三分冷 提交于 2021-01-07 06:57:45

问题


This question is related to and inspired by can't use emmeans inside map

I am doing several steps of data analysis with the following code. I want to keep my grouping factor's levels as the nested data frames' names and uses those names to identify each of the steps along the way, instead of using the default enumeration [[1]], [[2]], [[3]], etc. I don't understand the error I got. Please see how I can fix my code.

library(dplyr)
library(purrr)
library(emmeans)
data("warpbreaks")
wb_emm <-  warpbreaks %>%
  group_by(tension) %>% 
  setNames(unique(.x$tension)) %>%
  nest() %>%
  mutate(models=map(data,~glm(breaks~wool,data=.x))) %>%
  mutate(jt = map(models, ~emmeans::joint_tests(.x, data = .x$data))) %>%
  mutate(means=map(models,~emmeans::emmeans(.x,"wool",data=.x$data))) %>%
  mutate(p_cont = map(means, ~emmeans::contrast(.x, "pairwise",infer = c(T,T))))

Error in unique(.x$tension) : object '.x' not found

I originally did group_by(tension) %>% setNames(unique(tension)) and got Error in unique(tension) : object 'tension' not found I also tried split(.$tension) but it is conflicted with nest()

But the tension levels are legible.

 unique(warpbreaks$tension)
[1] L M H
Levels: L M H

The code runs well without the setNames(unique(.x$tension)) %>% step.

wb_emm$p_cont
[[1]]
 contrast estimate   SE  df asymp.LCL asymp.UCL z.ratio p.value
 A - B        16.3 6.87 Inf      2.87      29.8 2.378   0.0174 

Confidence level used: 0.95 

[[2]]
 contrast estimate   SE  df asymp.LCL asymp.UCL z.ratio p.value
 A - B       -4.78 4.27 Inf     -13.1      3.59 -1.119  0.2630 

Confidence level used: 0.95 

[[3]]
 contrast estimate   SE  df asymp.LCL asymp.UCL z.ratio p.value
 A - B        5.78 3.79 Inf     -1.66      13.2 1.523   0.1277 

Confidence level used: 0.95 

Thank you.

Update: from the second solution provided by Ronak Shah below, I tried on diamonds but the names were unchanged. The code runs with either ungroup()%>% or ungroup%>%.

diamonds %>%
  group_by(cut) %>%
  nest() %>% 
  ungroup %>%
  mutate(models=map(data,~glm(price ~ x + y + z + clarity + color,data=.x)),
         jt = map(models, ~emmeans::joint_tests(.x, data = .x$data)),
         means=map(models,~emmeans::emmeans(.x,"color",data=.x$data)),
         p_cont = map(means, ~emmeans::contrast(.x, "pairwise",infer = c(T,T))),
         across(models:p_cont, stats::setNames,  .$cut)) -> diamond_result
> diamond_result$jt
[[1]]
 model term df1 df2 F.ratio p.value
 x            1 Inf 611.626 <.0001 
 y            1 Inf   2.914 0.0878 
 z            1 Inf 100.457 <.0001 
 clarity      7 Inf 800.852 <.0001 
 color        6 Inf 256.796 <.0001 

回答1:


You need to add setNames in the map step :

library(tidyverse)

warpbreaks %>%
  group_by(tension) %>% 
  nest() %>%
  ungroup %>%
  mutate(models=map(data,~glm(breaks~wool,data=.x)),
        jt = map(models, ~emmeans::joint_tests(.x, data = .x$data)),
        means=map(models,~emmeans::emmeans(.x,"wool",data=.x$data)),
        p_cont = setNames(map(means, 
                  ~emmeans::contrast(.x, "pairwise",infer = c(T,T))),.$tension))

If you want to name all the list output use across :

warpbreaks %>%
  group_by(tension) %>% 
  nest() %>%
  ungroup %>%
  mutate(models=map(data,~glm(breaks~wool,data=.x)),
         jt = map(models, ~emmeans::joint_tests(.x, data = .x$data)),
         means=map(models,~emmeans::emmeans(.x,"wool",data=.x$data)),
         p_cont = map(means, ~emmeans::contrast(.x, "pairwise",infer = c(T,T))),
         across(models:p_cont, setNames,  .$tension)) -> result

result$jt

#$L
# model term df1 df2 F.ratio p.value
# wool         1 Inf   5.653 0.0174 


#$M
# model term df1 df2 F.ratio p.value
# wool         1 Inf   1.253 0.2630 


#$H
# model term df1 df2 F.ratio p.value
# wool         1 Inf   2.321 0.1277 


来源:https://stackoverflow.com/questions/65334449/group-by-and-keep-grouping-levels-as-nested-data-frames-name

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!