Question:
(Somewhat related question: Enter new column names as string in dplyr's rename function)
In the middle of a dplyr chain (%>%), I would like to replace multiple column names with functions of their old names (using tolower or gsub, etc.)
    library(tidyr); library(dplyr)
    data(iris)

    # This is what I want to do, but I'd like to use dplyr syntax
    names(iris) <- tolower(gsub("\\.", "_", names(iris)))

    iris %>%
      gather(measurement, value, -species) %>%
      group_by(species, measurement) %>%
      summarise(avg_value = mean(value))
I see ?rename takes the argument replace as a named character vector, with new names as values, and old names as names.
So I tried:
iris %>% rename(replace=c(names(iris)=tolower( gsub("\\.", "_", names(iris) ) ) ))
but this (a) returns Error: unexpected '=' in iris %>% ... and (b) requires referencing by name the data frame from the previous operation in the chain, which in my real use case I couldn't do.
    iris %>%
      rename(replace = c( )) %>%   # ideally the fix would go here
      gather(measurement, value, -species) %>%
      group_by(species, measurement) %>%
      summarise(avg_value = mean(value))
    # I realize I could mutate down here instead,
    # once the column names turn into values,
    # but that's not the point

    # ---- Desired output looks like: -------
    # Source: local data frame [12 x 3]
    # Groups: species
    #
    #       species  measurement avg_value
    # 1      setosa sepal_length     5.006
    # 2      setosa  sepal_width     3.428
    # 3      setosa petal_length     1.462
    # 4      setosa  petal_width     0.246
    # 5  versicolor sepal_length     5.936
    # 6  versicolor  sepal_width     2.770
    # ... etc ....
Answer 1:
I think you're looking at the documentation for plyr::rename, not dplyr::rename. You would do something like this with dplyr::rename:
iris %>% rename_(.dots=setNames(names(.), tolower(gsub("\\.", "_", names(.)))))
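In dplyr 1.0.0 and later, rename_() is deprecated, and the same idea can be expressed with rename_with(), which applies a function to the column names. The following is only a minimal sketch, assuming dplyr >= 1.0.0 is installed; the helper name clean_nm is made up for illustration and is not part of any package:

```r
library(dplyr)

# Hypothetical helper: lower-case a name and turn dots into underscores.
clean_nm <- function(x) tolower(gsub("\\.", "_", x))

# rename_with() applies the function to every column name by default.
renamed <- iris %>%
  rename_with(clean_nm)

names(renamed)
```

Because the function receives the names directly, there is no need to refer to the data frame by name inside the chain.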
Answer 2:
This is a very late answer, written in May 2017.
As of dplyr 0.5.0.9004, soon to be 0.6.0, many new ways of renaming columns, compatible with the magrittr pipe operator %>%, have been added to the package.
Those functions are:
- rename_all
- rename_if
- rename_at
There are many different ways of using those functions, but the one relevant to your problem, using the stringr package, is the following:

    df <- df %>%
      rename_all(
        funs(
          stringr::str_to_lower(.) %>%
            stringr::str_replace_all(., '\\.', '_')
        )
      )
And so, carry on with the plumbing :) (no pun intended).
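For the two selective variants listed above, a short sketch may help. It assumes a dplyr version in which rename_if() and rename_at() are still available (they are superseded in recent releases), and it passes plain functions rather than funs(), which has been deprecated since dplyr 0.8.0:

```r
library(dplyr)

# rename_if(): rename only the columns for which the predicate is TRUE
# (here, the numeric columns).
upper_numeric <- iris %>% rename_if(is.numeric, toupper)

# rename_at(): rename only the columns selected inside vars().
lower_sepal <- iris %>% rename_at(vars(starts_with("Sepal")), tolower)

names(upper_numeric)
names(lower_sepal)
```

In both cases the Species column keeps its original name, because it is neither numeric nor matched by starts_with("Sepal").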
Answer 3:
Here's a way around the somewhat awkward rename syntax:
    myris <- iris %>% setNames(tolower(gsub("\\.", "_", names(.))))
Answer 4:
For this particular [but fairly common] case, the function has already been written in the janitor package:
    library(janitor)
    iris %>% clean_names()
    ##   sepal_length sepal_width petal_length petal_width species
    ## 1          5.1         3.5          1.4         0.2  setosa
    ## 2          4.9         3.0          1.4         0.2  setosa
    ## 3          4.7         3.2          1.3         0.2  setosa
    ## 4          4.6         3.1          1.5         0.2  setosa
    ## 5          5.0         3.6          1.4         0.2  setosa
    ## 6          5.4         3.9          1.7         0.4  setosa
    ## .          ...         ...          ...         ...     ...
so all together,
    iris %>%
      clean_names() %>%
      gather(measurement, value, -species) %>%
      group_by(species, measurement) %>%
      summarise(avg_value = mean(value))
    ## Source: local data frame [12 x 3]
    ## Groups: species [?]
    ##
    ##       species  measurement avg_value
    ## 1      setosa petal_length     1.462
    ## 2      setosa  petal_width     0.246
    ## 3      setosa sepal_length     5.006
    ## 4      setosa  sepal_width     3.428
    ## 5  versicolor petal_length     4.260
    ## 6  versicolor  petal_width     1.326
    ## 7  versicolor sepal_length     5.936
    ## 8  versicolor  sepal_width     2.770
    ## 9   virginica petal_length     5.552
    ## 10  virginica  petal_width     2.026
    ## 11  virginica sepal_length     6.588
    ## 12  virginica  sepal_width     2.974
Answer 5:
My eloquent attempt using base, stringr and dplyr:
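EDIT: library(tidyverse) now attaches dplyr and stringr, but the %<>% assignment pipe still requires library(magrittr).

    library(tidyverse)
    library(magrittr)   # provides %<>%
    # OR
    # library(dplyr)
    # library(stringr)
    # library(magrittr)

    names(iris) %<>%   # assignment pipe: the result is written back to names(iris)
      tolower() %>%
      str_replace_all("\\.", "_")   # escape the dot; a bare "." matches any character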
I do this for building functions with piping.
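    my_read_fun <- function(path) {
      df <- read.csv(path)   # assumption: the original reader call was lost in extraction
      names(df) %<>%
        tolower() %>%
        str_replace_all("_", ".")
      df %>% select(a, b, c, g)   # a, b, c, g are the placeholder columns from the original
    }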
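One detail worth checking when replacing dots in names: the pattern given to str_replace_all() or gsub() is a regular expression, in which an unescaped "." matches any character. A small base R sketch of the difference, using two iris column names as sample input:

```r
nm <- c("Sepal.Length", "Petal.Width")

# Unescaped "." is a regex wildcard, so every character is replaced.
wildcard <- gsub(".", "_", nm)

# Escaping the dot, or using fixed = TRUE, matches a literal period only.
escaped <- tolower(gsub("\\.", "_", nm))
literal <- tolower(gsub(".", "_", nm, fixed = TRUE))

print(wildcard)
print(escaped)
```

With stringr, the fixed = TRUE variant corresponds to wrapping the pattern in fixed(".").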
Answer 6:
Both select() and select_all() can be used to rename columns.
If you wanted to rename only specific columns you can use select:
    iris %>%
      select(sepal_length = Sepal.Length, sepal_width = Sepal.Width, everything()) %>%
      head(2)

      sepal_length sepal_width Petal.Length Petal.Width Species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
rename does the same thing, just without having to include everything():
    iris %>%
      rename(sepal_length = Sepal.Length, sepal_width = Sepal.Width) %>%
      head(2)

      sepal_length sepal_width Petal.Length Petal.Width Species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa
select_all() works on all columns and can take a function as an argument:
    iris %>% select_all(tolower)
    iris %>% select_all(~gsub("\\.", "_", .))
or combining the two:
    iris %>%
      select_all(~gsub("\\.", "_", tolower(.))) %>%
      head(2)

      sepal_length sepal_width petal_length petal_width species
    1          5.1         3.5          1.4         0.2  setosa
    2          4.9         3.0          1.4         0.2  setosa