Finding maximum value of one column (by group) and inserting value into another data frame in R


All,

I was hoping someone could find a solution to an issue of mine that isn't necessarily causing headaches, but, as of right now, invites the possibility for huma

3 Answers
  • 2020-12-07 05:00

    If you know SQL, you could use the sqldf function from the sqldf package: http://cran.r-project.org/web/packages/sqldf/index.html

    library(sqldf)  # provides sqldf(), which runs SQL against data frames
    df <- sqldf("select year, max(x1), max(x2), max(x3), max(x4) from Data group by year")
    df
      year max(x1) max(x2) max(x3) max(x4)
    1 1998      30      10      30       2
    2 2000      95      90      25      90
    3 2005      90      90       5      40
    
  • 2020-12-07 05:06

    It sounds like you're just looking for aggregate:

    > aggregate(cbind(x1, x2, x3, x4) ~ country1 + year, Data, max)
      country1 year x1 x2 x3 x4
    1        B 1998 30 10 30  2
    2        A 2000 95 90 25 90
    3        C 2005 90 90  5 40
    

    It's not very clear from your question how you want to proceed from there, though.
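
    One possible next step, assuming the goal in the title is to copy the per-group maxima into a second data frame, is a keyed merge. This is only a sketch: the `other` data frame, its `region` column, and the `max_` prefix are hypothetical names made up for illustration, and `Data` is rebuilt from the sample posted elsewhere in this thread.

```r
# Sample data, reconstructed from the dput() output posted in this thread.
Data <- data.frame(
  country1 = c(rep("A", 7), rep("B", 3), rep("C", 3)),
  country2 = c("B", "C", "D", "E", "F", "G", "H", "A", "D", "I", "A", "D", "X"),
  year = c(rep(2000L, 7), rep(1998L, 3), rep(2005L, 3)),
  x1 = c(50L, 70L, 10L, 95L, 10L, 5L, 10L, 5L, 30L, 10L, 10L, 90L, 49L),
  x2 = c(30L, 2L, 90L, 10L, 10L, 5L, 30L, 10L, 6L, 9L, 15L, 0L, 90L),
  x3 = c(1L, 5L, 20L, 10L, 10L, 0L, 25L, 30L, 9L, 7L, 2L, 0L, 5L),
  x4 = c(20L, 90L, 30L, 5L, 0L, 0L, 40L, 2L, 0L, 0L, 6L, 40L, 0L)
)

# Per-group maxima, as in the aggregate() call above.
maxima <- aggregate(cbind(x1, x2, x3, x4) ~ country1 + year, Data, max)

# Prefix the value columns so they cannot clash with columns in the target.
value_cols <- c("x1", "x2", "x3", "x4")
names(maxima)[match(value_cols, names(maxima))] <- paste0("max_", value_cols)

# Hypothetical target data frame (an assumption, not part of the question).
other <- data.frame(
  country1 = c("A", "B", "C"),
  year = c(2000L, 1998L, 2005L),
  region = c("north", "south", "east")
)

# Left join: keep every row of 'other' and attach the matching maxima.
result <- merge(other, maxima, by = c("country1", "year"), all.x = TRUE)
print(result)
```

    Matching on the key columns rather than copying values over by row position should avoid manual lookups, and with `all.x = TRUE` any row of `other` without a matching group gets NA instead of being dropped.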

  • 2020-12-07 05:16

    You can also use ddply from the plyr package, assuming your sample data is stored in data.

    data<-structure(list(country1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 2L, 2L, 2L, 3L, 3L, 3L), .Label = c("A", "B", "C"), class = "factor"), 
        country2 = structure(c(2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 4L, 
        9L, 1L, 4L, 10L), .Label = c("A", "B", "C", "D", "E", "F", 
        "G", "H", "I", "X"), class = "factor"), year = c(2000L, 2000L, 
        2000L, 2000L, 2000L, 2000L, 2000L, 1998L, 1998L, 1998L, 2005L, 
        2005L, 2005L), x1 = c(50L, 70L, 10L, 95L, 10L, 5L, 10L, 5L, 
        30L, 10L, 10L, 90L, 49L), x2 = c(30L, 2L, 90L, 10L, 10L, 
        5L, 30L, 10L, 6L, 9L, 15L, 0L, 90L), x3 = c(1L, 5L, 20L, 
        10L, 10L, 0L, 25L, 30L, 9L, 7L, 2L, 0L, 5L), x4 = c(20L, 
        90L, 30L, 5L, 0L, 0L, 40L, 2L, 0L, 0L, 6L, 40L, 0L)), .Names = c("country1", 
    "country2", "year", "x1", "x2", "x3", "x4"), class = "data.frame", row.names = c(NA, 
    -13L))
    
    install.packages("plyr")  # only needed once
    library(plyr)
    # For each country1/year group, take the max of every numeric column
    ddply(data, .(country1, year), numcolwise(max))
    
      country1 year x1 x2 x3 x4
    1        A 2000 95 90 25 90
    2        B 1998 30 10 30  2
    3        C 2005 90 90  5 40
    