Adding prefix or suffix to most data.frame variable names in piped R workflow

人走茶凉 提交于 2019-11-27 21:29:38

You can pass functions to rename_at, so do

 means14 <- dat14 %>%
  group_by(class) %>%
  select(-ID) %>%
  summarise_all(funs(mean(.))) %>% 
  rename_at(vars(-class),function(x) paste0(x,"_2014"))

After additional experimenting since posting this question, I've found that the setNames function will work with the piping as it returns a data.frame:

dat14 %>%
  group_by(class) %>%
  select(-ID) %>%
  summarise_each(funs(mean(.))) %>%
  setNames(c(names(.)[1], paste0(names(.)[-1],"_mean_2014"))) 

  class speed_mean_2014 power_mean_2014 force_mean_2014
1     a       0.5572500             0.8       0.5519802
2     b       0.2850798             0.6       1.0888116

This is a bit quicker, but not totally what you want:

dat14 %>%
  group_by(class) %>%
  select(-ID) %>%
  summarise_each(funs(mean(.))) -> means14 

names(means14)[-1] %<>% paste0("_mean_2014")

if you haven't used the %<>%-operator before definitely check this link out, its a super-useful tool.

you can also use it for recomputing or rounding some columns, like this df$meancolumn %<>% round() , and so on, it just comes up very often and just saves you a lot of writing

As of February 2017 you can do this with the dplyr command rename_(...).

In the case of this example you could do.

dat14 %>%
  group_by(class) %>%
  select(-ID) %>%
  summarise_each(funs(mean(.))) %>%
  rename_(names(.)[-1], paste0(names(.)[-1],"_mean_2014"))) 

This is rather similar to the answer with set_names but works with tibbles too!

This is more of a step back, but you might think of reshaping your data in order to apply the function to multiple years at the same time. This will preserve tidyness. If you're going to want to end up comparing different years, it might make sense to have the year be a separate variable in a dataframe, rather than storing the year in the names. You should be able to use summarise_ to get the mean_year behavior. See http://cran.r-project.org/web/packages/dplyr/vignettes/nse.html

library(dplyr)
library(tidyr)
set.seed(1)
dat14 <- data.frame(ID = 1:10, speed = runif(10), power = rpois(10, 1),
                    force = rexp(10), class = rep(c("a", "b"),5))

dat14 %>% 
  gather(variable, value, -ID, -class) %>% 
  mutate(year = 2014) %>% 
  group_by(class, year, variable)%>% 
  summarise(mean = mean(value))`
zerweck

While Sam Firkes solution using setNames() ist certainly the only solution keeping an unbroken pipe, it will not work with the tbl objects from dplyr, since the column names are not accessible by methods from the usual base R naming functions. Here is a function that you can use within a pipe with tbl objects as well, thanks to this solution by hrbrmstr. It adds predefined prefixes and suffixes at the specified column indices. Default is all columns.

tbl.renamer <- function(tbl,prefix="x",suffix=NULL,index=seq_along(tbl_vars(tbl))){
  newnames <- tbl_vars(tbl) # Get old variable names
  names(newnames) <- newnames
  names(newnames)[index] <- paste0(prefix,".",newnames,suffix)[index] # create a named vector for .dots
  rename_(tbl,.dots=newnames) # rename the variables
}

Example usage (Assume auth_users beeing an tbl_sql object):

auth_user %>% tbl_vars
tbl.renamer(auth_user) %>% tbl_vars
auth_user %>% tbl.renamer %>% tbl_vars
auth_user %>% tbl.renamer(index = c(1,5)) %>% tbl_vars
标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!