Automatically generate new variable names using dplyr mutate

谎友^ 2020-12-11 08:09

I would like to create variable names dynamically while using dplyr, although I'd be fine with a non-dplyr solution as well.

For example:

data(iris)         


        
3 Answers
  • 2020-12-11 08:47

    You can use mutate_all (or mutate_at for specific columns) to lag the columns, then prepend lag_ to the column names.

    data(iris)
    library(dplyr) 
    
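    # Lag every non-grouping column within each species.
    lag_iris <- iris %>%
      group_by(Species) %>%
      mutate_all(lag) %>%
      ungroup()
    # Prefix every column name. Species is the grouping column, so it is
    # renamed to lag_Species but its values are not lagged.
    colnames(lag_iris) <- paste0('lag_', colnames(lag_iris))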
    
    head(lag_iris)
    
      lag_Sepal.Length lag_Sepal.Width lag_Petal.Length lag_Petal.Width lag_Species
                 <dbl>           <dbl>            <dbl>           <dbl>      <fctr>
    1               NA              NA               NA              NA      setosa
    2              5.1             3.5              1.4             0.2      setosa
    3              4.9             3.0              1.4             0.2      setosa
    4              4.7             3.2              1.3             0.2      setosa
    5              4.6             3.1              1.5             0.2      setosa
    6              5.0             3.6              1.4             0.2      setosa
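
    If you would rather keep the original columns and have dplyr build the new names for you, a minimal sketch along these lines may help. It assumes dplyr 1.0.0 or later, where across() and its .names argument are available, and it lags only the numeric columns; the "lag_{.col}" pattern is just one possible naming scheme.

    ```r
    library(dplyr)

    lag_iris <- iris %>%
      group_by(Species) %>%
      # Lag each numeric column within its species and name the result
      # lag_<original column name>; the original columns are kept.
      mutate(across(where(is.numeric), dplyr::lag, .names = "lag_{.col}")) %>%
      ungroup()

    # Show the original columns followed by the four new lag_ columns.
    head(lag_iris)
    ```

    Because the new names come from .names, no separate colnames() step is needed, and Species is left untouched.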
    
  • 2020-12-11 08:49

    Since you're also happy with a non-dplyr solution, try this:

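    # Shift x down by n positions, padding the start with NA.
    lagger <- function(x, n) c(rep(NA, n), head(x, -n))
    # Add one lag_ column per original column (no grouping by Species);
    # the factor Species comes back as its integer codes.
    iris[paste0("lag_", names(iris))] <- lapply(iris, lagger, n = 1)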
    
    head(iris,2)[-(1:5)]
    #  lag_Sepal.Length lag_Sepal.Width lag_Petal.Length lag_Petal.Width lag_Species
    #1               NA              NA               NA              NA          NA
    #2              5.1             3.5              1.4             0.2           1
    
  • 2020-12-11 08:51

    Here is a data.table approach. I chose the numeric columns in this case. The idea is to pick the column names and build the new column names in advance, then apply shift(), which works like lag() and lead() in the dplyr package, to each of the chosen columns.

    library(data.table)
    
    # Create a df for this demo.
    mydf <- iris
    
    # Choose the columns to lag and create the new column names.
    cols <- names(iris)[sapply(iris, is.numeric)]
    anscols <- paste0("lag_", cols)
    
    # Apply shift() to each of the chosen columns.
    setDT(mydf)[, (anscols) := shift(.SD, 1, type = "lag"),
                .SDcols = cols]
    
         Sepal.Length Sepal.Width Petal.Length Petal.Width   Species lag_Sepal.Length lag_Sepal.Width
     1:          5.1         3.5          1.4         0.2    setosa               NA              NA
     2:          4.9         3.0          1.4         0.2    setosa              5.1             3.5
     3:          4.7         3.2          1.3         0.2    setosa              4.9             3.0
     4:          4.6         3.1          1.5         0.2    setosa              4.7             3.2
     5:          5.0         3.6          1.4         0.2    setosa              4.6             3.1
     ---                                                                                             
    146:          6.7         3.0          5.2         2.3 virginica              6.7             3.3
    147:          6.3         2.5          5.0         1.9 virginica              6.7             3.0
    148:          6.5         3.0          5.2         2.0 virginica              6.3             2.5
    149:          6.2         3.4          5.4         2.3 virginica              6.5             3.0
    150:          5.9         3.0          5.1         1.8 virginica              6.2             3.4
         lag_Petal.Length lag_Petal.Width
      1:               NA              NA
      2:              1.4             0.2
      3:              1.4             0.2
      4:              1.3             0.2
      5:              1.5             0.2
     ---                                 
    146:              5.7             2.5
    147:              5.2             2.3
    148:              5.0             1.9
    149:              5.2             2.0
    150:              5.4             2.3
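
    The call above lags across the whole table, so the first row of each species picks up the last value of the previous one. If you want the lag to restart within each species, as in the grouped dplyr answer, a sketch along these lines may work; it assumes that per-species lagging is what you are after, and the name dt is only illustrative.

    ```r
    library(data.table)

    # Work on a copy so the built-in iris data frame is left as it is.
    dt <- as.data.table(iris)

    # Choose the numeric columns and build the new column names.
    cols <- names(dt)[sapply(dt, is.numeric)]
    anscols <- paste0("lag_", cols)

    # Apply shift() to each chosen column, separately for each species.
    dt[, (anscols) := shift(.SD, 1, type = "lag"), by = Species, .SDcols = cols]
    ```

    With by = Species, the first row of every species gets NA in the lag_ columns instead of a value from the preceding species.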
    