Compute the number of distinct values in col2 for each distinct value in col1 in R

清歌不尽 2020-12-21 11:34

I have a dataframe like this:

df <- data.frame(
          SchoolID=c("A","A","B","B","C","D"),
          Country=c("XX","XX","XX","YY"


        
3 Answers
  • 2020-12-21 12:07

    One approach, which does not rely on third-party libraries:

    > as.data.frame(rowSums(table(df[!duplicated(df), ]), na.rm=T))
      rowSums(table(df[!duplicated(df), ]), na.rm = T)
    A                                                1
    B                                                2
    C                                                1
    D                                                1
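
    To see why this works, the one-liner can be split into its three steps. This is a minimal sketch on toy data; the last two Country values ("ZZ" and "XX") are made up for illustration, because the data frame in the question is cut off:

    ```r
    # Toy data: the last two Country values are assumptions, not from the question.
    df <- data.frame(
      SchoolID = c("A", "A", "B", "B", "C", "D"),
      Country  = c("XX", "XX", "XX", "YY", "ZZ", "XX")
    )

    # Step 1: drop repeated (SchoolID, Country) rows so each pair appears once.
    pairs <- df[!duplicated(df), ]

    # Step 2: cross-tabulate the pairs; every cell is now either 0 or 1.
    tab <- table(pairs)

    # Step 3: summing each row counts the distinct countries per school.
    counts <- rowSums(tab)
    print(counts)  # A = 1, B = 2, C = 1, D = 1
    ```

    The deduplication in step 1 is what makes the row sums count distinct values rather than rows.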
    
  • 2020-12-21 12:11
    aggregate(Country ~ SchoolID, df, function(x) length(unique(x)))
    

    Or

    tapply(df$Country, df$SchoolID, function(x) length(unique(x)))
    

    Or

    library(data.table) 
    setDT(df)[, .(NumberOfCountry = length(unique(Country))), by = SchoolID]
    

    Or, with data.table versions later than 1.9.5, use uniqueN():

    setDT(df)[, .(NumberOfCountry = uniqueN(Country)), by = SchoolID]
    

    Or

    library(dplyr)
    df %>% 
      group_by(SchoolID) %>% 
      summarise(NumberOfCountry = n_distinct(Country))
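
    The variants differ mainly in the shape of what they return. As a rough check, the two base R versions can be compared on toy data; in this sketch the last two Country values are assumed, since the question's data is truncated:

    ```r
    # Toy data: the last two Country values are assumptions, not from the question.
    df <- data.frame(
      SchoolID = c("A", "A", "B", "B", "C", "D"),
      Country  = c("XX", "XX", "XX", "YY", "ZZ", "XX")
    )

    # Helper used by both variants: number of distinct values in a vector.
    n_unique <- function(x) length(unique(x))

    # aggregate() returns a data frame with one row per SchoolID.
    by_aggregate <- aggregate(Country ~ SchoolID, df, n_unique)

    # tapply() returns a named one-dimensional array instead.
    by_tapply <- tapply(df$Country, df$SchoolID, n_unique)

    # Look up the tapply() result by school name and compare the counts.
    schools <- as.character(by_aggregate$SchoolID)
    stopifnot(all(by_aggregate$Country == as.vector(by_tapply[schools])))
    print(by_aggregate)
    ```

    aggregate() is convenient when the result feeds into further data frame operations, while tapply() is handy for a quick lookup by name.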
    
  • 2020-12-21 12:18

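    If the data is in a SQL table (here tbl_stacko, with columns School and Country), try this:

    select School, count(Country) as NumberOfCountry
    from (
      select distinct School, Country
      from tbl_stacko
    ) temp
    group by School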
    