Why won't pipe operator let me combine successive mutations?

Submitted by 心已入冬 on 2021-02-11 12:44:47

Question


I want to mutate columns from this...

[Image: screenshot of the original data, not included]

...into...

[Image: screenshot of the desired result, not included]

When I did the following...

villastats<-villastats%>% 
  mutate(HG = ifelse(HomeTeam == "Aston Villa", villastats$FTHG, ifelse(HomeTeam != "Aston Villa", 0, 0)))
villastats<-villastats%>% 
  mutate(AG = ifelse(AwayTeam == "Aston Villa", villastats$FTAG, ifelse(AwayTeam != "Aston Villa", 0, 0)))
villastats<-villastats%>%
  mutate(THG=cumsum(villastats$HG))
villastats<-villastats%>%
  mutate(TAG=cumsum(villastats$AG))
villastats<-villastats%>%
  mutate(Tot=THG+TAG)

...it produced the result shown above that I wanted. I want to do all the mutations at once, so I tried

villastats<-villastats%>% 
  mutate(HG = ifelse(HomeTeam == "Aston Villa", villastats$FTHG, ifelse(HomeTeam != "Aston Villa", 0, 0)))%>%
  mutate(AG = ifelse(AwayTeam == "Aston Villa", villastats$FTAG, ifelse(AwayTeam != "Aston Villa", 0, 0)))%>%
  mutate(THG=cumsum(villastats$HG))
  mutate(TAG=cumsum(villastats$AG))%>%
  mutate(Tot=THG+TAG)

This didn't work. The first two lines work fine but when I add the third line it tells me

Error: Column THG must be length 38 (the number of rows) or one, not 0

Where am I going wrong? Why is it doing this?


Answer 1:


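  1. When you use villastats$ inside a pipe that starts from the object villastats (as you are doing), villastats$FTHG refers to the version of the variable before the first step in your pipeline, not to the result of the preceding steps. That is where your error comes from: HG is created inside the pipeline, so if the original villastats has no HG column, villastats$HG is NULL and cumsum(villastats$HG) has length 0. For instance,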

    someframe <- data.frame(a = 1:3, b = 11:13) # <---------------------------\
    someframe %>%                                                             |
      mutate(a = a + 1) %>% # <-------------------------------------\         |
      mutate(a = a + 2) %>%       # <--- this 'a' is referring to --/         |
      mutate(a = someframe$a + 3) # <--- this 'someframe$a' is referring to --/
    

    In some simpler magrittr pipes, this is "fine" in that the version of the variable at the beginning is no different than at the time of referencing it. However, if there are fewer rows (dplyr::filter), different values (mutate(a = a+2)) or just reordering (arrange), then a can be very different from someframe$a. In the best case, you get an error because the length of the vector you're referencing is incompatible with the operation you're doing. In the worst case, it gives you no warning or error but all of your calculations are silently wrong.

  2. You can place all of your mutate operations in one call, as in

    villastats %>% 
      mutate(
        HG = ifelse(HomeTeam == "Aston Villa", FTHG,
                    ifelse(HomeTeam != "Aston Villa", 0, 0)),
        AG = ifelse(AwayTeam == "Aston Villa", FTAG,
                    ifelse(AwayTeam != "Aston Villa", 0, 0)),
        THG = cumsum(HG),
        TAG = cumsum(AG),
        Tot = THG+TAG
      )
    

    Chaining a separate mutate call for each column is not wrong, but it is slower and perhaps a little harder to read.

  3. Your ifelses are unnecessarily nested. The first comparison HomeTeam == "Aston Villa" and the second comparison HomeTeam != "Aston Villa" are perfectly complementary, so the inner ifelse always returns 0. You can reduce all of those to just

    villastats %>% 
      mutate(
        HG = ifelse(HomeTeam == "Aston Villa", FTHG, 0),
        AG = ifelse(AwayTeam == "Aston Villa", FTAG, 0),
        THG = cumsum(HG),
        TAG = cumsum(AG),
        Tot = THG + TAG
      )
    
  4. Not that you asked, but I urge dplyr::if_else in place of base ifelse. The latter drops some classes (try ifelse(TRUE, Sys.time(), Sys.time()) for an example) and allows the programmer to be sloppy by mixing objects of different classes in the "yes" and "no" options. if_else won't let you do if_else(TRUE, "1", -3.14), since the two values have incompatible types. (Before dplyr 1.1.0 it even complained about if_else(TRUE, 0, 0L), a double next to an integer; newer versions cast compatible types to a common type.) Use it and be declarative, meaning write 0L instead of 0 if you expect the result of your normal operation to be an integer, etc.
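
The points above can be checked with a small sketch. The data frame `games` below is a hypothetical stand-in for villastats with made-up scores; only the column names are taken from the question. It shows why the original object yields a length-0 vector, how bare column names pick up the current pipeline state, and how a stale `games$` reference can go wrong without any error.

```r
library(dplyr)

# Hypothetical toy data (assumption: made-up values, same column names as the question).
games <- data.frame(
  HomeTeam = c("Aston Villa", "Leeds", "Aston Villa"),
  AwayTeam = c("Fulham", "Aston Villa", "Everton"),
  FTHG = c(2L, 1L, 3L),
  FTAG = c(0L, 4L, 1L)
)

# HG does not exist in the original object, so games$HG is NULL
# and cumsum() of it has length 0, which mutate() rejects.
stale_len <- length(cumsum(games$HG))
stale_len

# Bare column names refer to the data as it is at that step of the pipeline.
result <- games %>%
  mutate(
    HG  = if_else(HomeTeam == "Aston Villa", FTHG, 0L),
    AG  = if_else(AwayTeam == "Aston Villa", FTAG, 0L),
    THG = cumsum(HG),
    TAG = cumsum(AG),
    Tot = THG + TAG
  )
result$Tot  # 2 6 9

# Silent failure: after arrange() the rows are reordered, but games$FTHG
# still has the original order. The lengths match, so nothing complains.
mixed <- games %>%
  arrange(desc(FTHG)) %>%
  mutate(stale = games$FTHG, fresh = FTHG)
mixed[, c("FTHG", "stale", "fresh")]
```

In the last result, `fresh` matches FTHG row by row while `stale` does not, which is the "silently wrong" case described in point 1.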



Source: https://stackoverflow.com/questions/64325649/why-wont-pipe-operator-let-me-combine-successive-mutations
