Assign value to group based on condition in column

爱一瞬间的悲伤 2020-12-01 16:40

I have a data frame that looks like the following:

> df = data.frame(group = c(1,1,1,2,2,2,3,3,3), 
                 date = c(1,2,3,4,5,6,7,8,9),
                 value = c(3,4,3,4,5,6,6,4,9))


        
3 Answers
  • 2020-12-01 17:08

    Here's a quick data.table one

    library(data.table)
    setDT(df)[, newValue := date[value == 4L], by = group]
    df
    #    group date value newValue
    # 1:     1    1     3        2
    # 2:     1    2     4        2
    # 3:     1    3     3        2
    # 4:     2    4     4        4
    # 5:     2    5     5        4
    # 6:     2    6     6        4
    # 7:     3    7     6        8
    # 8:     3    8     4        8
    # 9:     3    9     9        8
    

    Here's a similar dplyr version

    library(dplyr)
    df %>%
      group_by(group) %>%
      mutate(newValue = date[value == 4L])
    

    Or a possible base R solution using merge after filtering the data (both inputs have a date column, so the result contains date.x and date.y and will need some renaming afterwards)

    merge(df, df[df$value == 4, c("group", "date")], by = "group")
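    The renaming can also be done before the merge, which avoids the suffixed columns altogether. A minimal sketch, assuming the question's data with the value column as shown in the output above; the name `lookup` is only illustrative:

    ```r
    # Data as in the question (value column taken from the printed output)
    df <- data.frame(group = c(1,1,1,2,2,2,3,3,3),
                     date  = c(1,2,3,4,5,6,7,8,9),
                     value = c(3,4,3,4,5,6,6,4,9))

    # One row per group: the date on which value equals 4
    lookup <- df[df$value == 4, c("group", "date")]

    # Rename before merging so no date.x / date.y columns are created
    names(lookup)[names(lookup) == "date"] <- "newValue"

    res <- merge(df, lookup, by = "group")
    ```

    Note that merge drops any group that has no row with value 4, because the default is an inner join; passing all.x = TRUE keeps such groups with NA in newValue.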
    
  • 2020-12-01 17:13

    One more base R path:

    df$newValue <- ave(`names<-`(df$value==4,df$date), df$group, FUN=function(x) as.numeric(names(x)[x]))
    df
       group date value newValue
    1      1    1     3        2
    2      1    2     4        2
    3      1    3     3        2
    4      2    4     4        4
    5      2    5     5        4
    6      2    6     6        4
    7      3    7     6        8
    8      3    8     4        8
    9      3    9     9        8
    10     3   11     7        8
    

    I tested this on groups of varying length. The date column is assigned as the names of the logical vector value == 4. Within each group, ave then picks the name (that is, the date) at the position where that vector is TRUE and converts it back to a number.
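    To see what happens inside a single group, the steps can be run by hand. This is only a sketch of the idea for group 3, using setNames in place of the backticked `names<-` call; the names `flag` and `g3` are illustrative:

    ```r
    df <- data.frame(group = c(1,1,1,2,2,2,3,3,3,3),
                     date  = c(1,2,3,4,5,6,7,8,9,11),
                     value = c(3,4,3,4,5,6,6,4,9,7))

    # Logical vector "value is 4", carrying the dates as names
    flag <- setNames(df$value == 4, df$date)

    # The piece of that vector that ave() would hand to FUN for group 3
    g3 <- flag[df$group == 3]

    # Names are stored as character, so convert the selected name back to a number
    picked <- as.numeric(names(g3)[g3])
    ```

    ave recycles this single number across all rows of the group, which is why the approach works for groups of different sizes.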

    Data

    df = data.frame(group = c(1,1,1,2,2,2,3,3,3,3), 
                     date = c(1,2,3,4,5,6,7,8,9,11),
                     value = c(3,4,3,4,5,6,6,4,9,7))
    
  • 2020-12-01 17:27

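    Here is a base R option. It assumes the rows are sorted by group and that each group has exactly one row with value equal to 4.

    df$newValue = rep(df$date[which(df$value == 4)], table(df$group))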
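    The one-liner is easier to follow when split into its two ingredients: one date per group, and the number of rows per group. A small sketch with the question's data (the intermediate names `hits` and `sizes` are illustrative):

    ```r
    df <- data.frame(group = c(1,1,1,2,2,2,3,3,3),
                     date  = c(1,2,3,4,5,6,7,8,9),
                     value = c(3,4,3,4,5,6,6,4,9))

    # One date per group, in row order: the dates where value is 4
    hits <- df$date[which(df$value == 4)]

    # Number of rows in each group, in sorted group order
    sizes <- as.vector(table(df$group))

    # Repeat the i-th date sizes[i] times
    df$newValue <- rep(hits, times = sizes)
    ```

    Because `hits` follows row order while `sizes` follows sorted group order, the two only line up when the rows are already sorted by group.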
    

    Another alternative using lapply

    do.call(rbind, lapply(split(df, df$group), 
      function(x){x$newValue = rep(x$date[which(x$value == 4)], 
                        each = length(x$group)); x}))
    
    #    group date value newValue
    #1.1     1    1     3        2
    #1.2     1    2     4        2
    #1.3     1    3     3        2
    #2.4     2    4     4        4
    #2.5     2    5     5        4
    #2.6     2    6     6        4
    #3.7     3    7     6        8
    #3.8     3    8     4        8
    #3.9     3    9     9        8
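    If the rows are not sorted, or some group has no row with value 4, a lookup via match may be more forgiving. This is a swapped-in alternative rather than part of the options above; the unsorted data with an extra group 4 is made up for illustration:

    ```r
    # Hypothetical unsorted data; group 4 has no row with value 4
    df <- data.frame(group = c(3,1,2,1,3,2,4),
                     date  = c(7,1,4,2,8,5,10),
                     value = c(6,3,4,4,4,5,1))

    # Rows where value is 4, used as a group -> date lookup table
    lookup <- df[df$value == 4, c("group", "date")]

    # match() returns, for each row, the position of its group in the lookup
    # (NA when the group is absent, first position when it occurs more than once)
    df$newValue <- lookup$date[match(df$group, lookup$group)]
    ```

    Groups without a matching row get NA instead of raising an error, and the original row order is left untouched.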
    