Rename multiple dataframe columns, referenced by current names

-上瘾入骨i 2020-12-24 05:40

I want to rename some random columns of a large data frame and I want to use the current column names, not the indexes. Column indexes might change if I add

5 Answers
  • 2020-12-24 05:53
    names(mydf)[names(mydf) == "MyName.1"] = "MyNewName" # 13 characters shorter. 
    

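    You may eventually want to rename several columns at once. In that case, use %in% instead of == and supply a vector of old names in place of "MyName.1" and an equal-length vector of new names in place of "MyNewName". Note that the new names are then assigned in the order the matching columns appear in the data frame, not in the order of the old-name vector.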
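
    For several columns at once, here is a minimal sketch that pairs each old name with its new name through match(), so the result does not depend on the order of the columns (with %in%, replacements land in column order instead). The data frame and all column names below are made up for illustration.

    ```r
    # Hypothetical example data; the column names are invented for this sketch.
    mydf <- data.frame(MyName.2 = 4:6, Other = 7:9, MyName.1 = 1:3)

    old <- c("MyName.1", "MyName.2")
    new <- c("MyNewName", "MyOtherName")

    # match() gives, for each old name, its position in names(mydf),
    # so every new name lands on the intended column.
    idx <- match(old, names(mydf))
    stopifnot(!anyNA(idx))  # fail early if an old name is not present
    names(mydf)[idx] <- new

    names(mydf)
    ```

    The stopifnot() guard is a design choice: without it, a misspelled old name would produce an NA index and a confusing error later on.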

  • 2020-12-24 05:55
    names(mydf) <- sub("MyName\\.1", "MyNewName", names(mydf))
    

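    This generalizes to renaming several columns at once: use a shared stem as the pattern so that every name containing it is changed. sub is already applied to each name in turn; switch to gsub only if the pattern can occur more than once within a single name. Because the pattern is not anchored, it also matches longer names such as "MyName.10".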
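
    As a sketch of the stem idea, assuming a made-up data frame whose columns share the prefix "MyName.":

    ```r
    # Hypothetical example data; the column names are invented for this sketch.
    mydf <- data.frame(MyName.1 = 1:3, MyName.2 = 4:6, Other = 7:9)

    # "^" anchors the match at the start of the name and "\\." matches a literal dot,
    # so only names beginning with "MyName." are changed.
    names(mydf) <- sub("^MyName\\.", "MyNewName.", names(mydf))

    names(mydf)
    ```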

  • 2020-12-24 06:09

    You can use the str_replace function of the stringr package:

    library(stringr)
    names(mydf) <- str_replace(names(mydf), fixed("MyName.1"), "MyNewName")
    
  • 2020-12-24 06:17

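    The trouble with changing column names of a data.frame via names<- is that, almost unbelievably, the entire data.frame is copied, even when it's in .GlobalEnv and no other variable points to it. This was the behavior in R before version 3.1.0; later versions make a shallow copy instead.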

    The data.table package has a setnames() function that changes column names by reference, without copying the whole dataset. data.table differs in that it doesn't copy-on-write, which can be very important for large datasets (you did say your data set was large). Simply provide the old and the new names:

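    library(data.table)
    # DT is your data.table (setnames() also accepts a plain data.frame)
    setnames(DT, "MyName.1", "MyNewName")
    # or, more explicitly:
    setnames(DT, old = "MyName.1", new = "MyNewName")
    ?setnames  # open the help page for further options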
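
    To see the by-reference behavior, here is a small sketch that renames two columns in one call and compares the object's memory address before and after. The table and its column names are made up; address() is the helper exported by data.table.

    ```r
    library(data.table)

    # Hypothetical example data; the column names are invented for this sketch.
    DT <- data.table(MyName.1 = 1:3, MyName.2 = 4:6, Other = 7:9)

    before <- address(DT)  # memory address of DT before renaming

    # old and new are matched pairwise, so column order in DT does not matter
    setnames(DT,
             old = c("MyName.1", "MyName.2"),
             new = c("MyNewName", "MyOtherName"))

    names(DT)
    identical(before, address(DT))  # same address means DT itself was not copied
    ```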
    
  • 2020-12-24 06:17

    plyr has a rename function for just this purpose:

    library(plyr)
    mydf <- rename(mydf, c("MyName.1" = "MyNewName"))
    