How do I add a column to a data frame in R

逝去的感伤 2020-12-15 12:58

I have imported data from a file into a data frame in R. It looks something like this:

Name      Count   Category
A         100     Cat1
C         10      Cat2
         


        
5 Answers
  • 2020-12-15 13:34

    Check out:

    • cut()
    • recode() in the car package
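
    As a minimal sketch of the cut() suggestion, assuming the goal is to bin the numeric Count column: the break points, the labels, and the column name Size below are hypothetical choices for illustration, not something taken from the question.

    ```r
    # Example data with the same columns as in the question
    df <- data.frame(Name = c("A", "C", "D", "E", "H", "Z", "M"),
                     Count = c(100, 10, 40, 30, 3, 20, 50),
                     stringsAsFactors = FALSE)

    # Hypothetical bins: [0, 10), [10, 100), [100, 1000)
    # right = FALSE closes each interval on the left, so 10 falls in "Tens"
    df$Size <- cut(df$Count,
                   breaks = c(0, 10, 100, 1000),
                   labels = c("Ones", "Tens", "Hundreds"),
                   right = FALSE)
    ```

    cut() returns a factor, and values outside the outermost breaks become NA, so the breaks should span the whole range of the data.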
  • 2020-12-15 13:41

    Perhaps simpler and more readable using ifelse and %in%:

    df <- data.frame(Name = c('A', 'C', 'D', 'E', 'H', 'Z', 'M'),
                     Count = c(100, 10, 40, 30, 3, 20, 50), stringsAsFactors = FALSE)

    cat1 <- c("A", "D")
    cat2 <- c("C", "Z")
    cat3 <- c("E", "H")
    cat10 <- c("M")

    df$Category <- ifelse(df$Name %in% cat1, "Cat1",
                   ifelse(df$Name %in% cat2, "Cat2",
                   ifelse(df$Name %in% cat3, "Cat3",
                   ifelse(df$Name %in% cat10, "Cat10",
                   NA))))
    
       Name Count Category
    1    A   100     Cat1
    2    C    10     Cat2
    3    D    40     Cat1
    4    E    30     Cat3
    5    H     3     Cat3
    6    Z    20     Cat2
    7    M    50    Cat10
    
  • 2020-12-15 13:44

    [Update following the OP's comment and altered Q]

    DF <- data.frame(Name = c("A","C","D","E","H","Z","M"),
                     Count = c(100,10,40,30,3,20,50), stringsAsFactors = FALSE)
    lookup <- data.frame(Name = c("A","C","D","E","H","Z","M"),
                         Category = paste("Cat", c(1,2,1,3,3,2,10), sep = ""),
                         stringsAsFactors = FALSE)
    

    Using the data frames above, we can do a database-style merge. You need to set up lookup with the Name/Category combinations you want, which is fine as long as there aren't very many Names. At least each Name only needs to be listed once in lookup, and in no particular order (you don't have to list all the Cat1 Names first, and so on):

    > merge(DF, lookup, by = "Name")
      Name Count Category
    1    A   100     Cat1
    2    C    10     Cat2
    3    D    40     Cat1
    4    E    30     Cat3
    5    H     3     Cat3
    6    M    50    Cat10
    7    Z    20     Cat2
    > merge(DF, lookup, by = "Name", sort = FALSE)
      Name Count Category
    1    A   100     Cat1
    2    C    10     Cat2
    3    D    40     Cat1
    4    E    30     Cat3
    5    H     3     Cat3
    6    Z    20     Cat2
    7    M    50    Cat10
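
    One thing to watch with the merge approach: by default merge() keeps only the rows whose Name appears in both data frames. The sketch below uses a cut-down version of the data plus a hypothetical Name "Q" that is missing from lookup, to show how all.x = TRUE keeps such rows with an NA Category.

    ```r
    DF <- data.frame(Name = c("A", "C", "Q"),      # "Q" is a made-up Name
                     Count = c(100, 10, 7),
                     stringsAsFactors = FALSE)
    lookup <- data.frame(Name = c("A", "C"),       # no entry for "Q"
                         Category = c("Cat1", "Cat2"),
                         stringsAsFactors = FALSE)

    # Default (inner) merge: the unmatched row for "Q" is dropped
    inner <- merge(DF, lookup, by = "Name")

    # all.x = TRUE keeps every row of DF; unmatched rows get NA for Category
    left <- merge(DF, lookup, by = "Name", all.x = TRUE)
    ```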
    

    For binning a numeric vector by magnitude, one option is indexing:

    foo <- function(x) {
        out <- character(length = length(x))
        chars <- c("Ones", "Tens", "Hundreds", "Thousands")
        out[x < 10] <- chars[1]
        out[x >= 10 & x < 100] <- chars[2]
        out[x >= 100 & x < 1000] <- chars[3]
        out[x >= 1000 & x < 10000] <- chars[4]
        return(factor(out, levels = chars))
    }
    

    An alternative that scales better is:

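    bar <- function(x, cats = c("Ones", "Tens", "Hundreds", "Thousands")) {
        # floor(log10(x)) is the number of integer digits minus one, so add 1 to index cats;
        # assumes x >= 1 (values below 1 or with more digits than length(cats) get no label)
        out <- cats[floor(log10(x)) + 1]
        factor(out, levels = cats)
    }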
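
    As a usage sketch, the two functions can be applied to the Count values from the example data. The definitions are repeated so the block runs on its own, and the column name Magnitude is a hypothetical choice.

    ```r
    foo <- function(x) {
        out <- character(length = length(x))
        chars <- c("Ones", "Tens", "Hundreds", "Thousands")
        out[x < 10] <- chars[1]
        out[x >= 10 & x < 100] <- chars[2]
        out[x >= 100 & x < 1000] <- chars[3]
        out[x >= 1000 & x < 10000] <- chars[4]
        factor(out, levels = chars)
    }

    bar <- function(x, cats = c("Ones", "Tens", "Hundreds", "Thousands")) {
        out <- cats[floor(log10(x)) + 1]
        factor(out, levels = cats)
    }

    df <- data.frame(Name = c("A", "C", "D", "E", "H", "Z", "M"),
                     Count = c(100, 10, 40, 30, 3, 20, 50),
                     stringsAsFactors = FALSE)

    by_index <- foo(df$Count)       # explicit range tests
    by_log   <- bar(df$Count)       # digit count via log10
    df$Magnitude <- by_log          # attach the labels as a new column
    ```

    Both versions return a factor with the same levels, so the result can be used directly for grouping or tabulation, for example with table(df$Magnitude).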
    
  • 2020-12-15 13:49

    You can use ifelse. If your data frame were called df you would do:

    # Count is the numeric column in the example data
    df$cat <- ifelse(df$Count < 100, "Ones", "Hundreds")
    df$cat <- ifelse(df$Count < 1000, df$cat, "Thousands")
    
  • 2020-12-15 13:50

    You can use a map. (UPDATED to use stringsAsFactors = FALSE)

    df <- data.frame( Name = c('A', 'C', 'D', 'E', 'H', 'Z', 'M'), 
                      Count = c(100,10,40,30,3,20,50), stringsAsFactors = FALSE)
    Categories <- list(Cat1 = c('A','D'), 
                       Cat2 = c('C','Z'), 
                       Cat3 = c('E','H'), 
                       Cat10 = 'M')
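    nams <- names( Categories )            # category labels
    nums <- sapply(Categories, length)     # number of Names in each category
    # Repeat each label once per member, then name each entry by its member,
    # giving a lookup vector from Name to Category
    CatMap <- unlist( Map( rep, nams, nums ) )
    names(CatMap) <- unlist( Categories )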
    
    df <- transform( df, Category = CatMap[ Name ])
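
    The same named-vector lookup can be written more compactly. This is a sketch of a variant, not the code above: it assumes R 3.2 or later for lengths(), and it uses unname() so the new column does not carry the Names along as element names.

    ```r
    df <- data.frame(Name = c("A", "C", "D", "E", "H", "Z", "M"),
                     Count = c(100, 10, 40, 30, 3, 20, 50),
                     stringsAsFactors = FALSE)
    Categories <- list(Cat1 = c("A", "D"),
                       Cat2 = c("C", "Z"),
                       Cat3 = c("E", "H"),
                       Cat10 = "M")

    # One entry per Name: the value is the category label, the element name is the Name
    CatMap <- setNames(rep(names(Categories), times = lengths(Categories)),
                       unlist(Categories, use.names = FALSE))

    # Look up each Name; a Name absent from CatMap would give NA
    df$Category <- unname(CatMap[df$Name])
    ```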
    