Counting unique items in data frame

予麋鹿 2020-12-09 00:22

I want a simple count of the number of subjects in each condition of a study. The data look something like this:

subjectid  cond   obser variable
1234                


        
4 answers
  • 2020-12-09 00:26

    Use the ddply function from the plyr package:

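    library(plyr)
    # Random example data: 7 rows, subject ids drawn from 1:3, conditions from 1:2
    df <- data.frame(subjectid = sample(1:3, 7, replace = TRUE),
                     cond = sample(1:2, 7, replace = TRUE), obser = sample(1:7))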
    
    > ddply(df, .(cond), summarize, NumSubs = length(unique(subjectid)))
      cond NumSubs
    1    1       1
    2    2       2
    

    The ddply function "splits" the data-frame by the cond variable, and produces a summary column NumSubs for each sub-data-frame.
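
    To make the "split" idea concrete, here is a minimal base R sketch of the same split, apply, combine steps, with no packages needed. It assumes the six example rows shown in the last answer of this thread; the names ids_by_cond, n_subs and result are made up for illustration.

```r
# Example data copied from the snippet shown in the last answer of this thread.
dat <- data.frame(
  subjectid = c(1234, 1234, 2143, 3456, 3456, 3456),
  cond      = c(1, 1, 2, 1, 1, 1),
  obser     = c(1, 2, 1, 1, 2, 3),
  variable  = c(12, 14, 19, 12, 14, 13)
)

# Split: one vector of subject ids per condition.
ids_by_cond <- split(dat$subjectid, dat$cond)

# Apply: count the distinct ids in each piece.
n_subs <- sapply(ids_by_cond, function(ids) length(unique(ids)))

# Combine: one row per condition, with the count in NumSubs.
result <- data.frame(cond = names(n_subs), NumSubs = unname(n_subs))
print(result)
```

    Note that cond comes back as a character column here, because it is taken from the names of the split list; convert it with as.numeric() if that matters.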

  • 2020-12-09 00:26

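    Or, if you like SQL and don't mind installing a package:

    library(sqldf)
    # group by is needed to get one count per condition rather than a single overall count
    sqldf("select cond, count(distinct subjectid) as NumSubs from dat group by cond")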
    
  • 2020-12-09 00:32

    Just to give you even more choice, you could also use tapply (here a is the data frame holding your data):

    tapply(a$subjectid, a$cond, function(x) length(unique(x)))
    1 2 
    2 1 
    
  • 2020-12-09 00:52

    Using your snippet of data that I loaded into object dat:

    > dat
      subjectid cond obser variable
    1      1234    1     1       12
    2      1234    1     2       14
    3      2143    2     1       19
    4      3456    1     1       12
    5      3456    1     2       14
    6      3456    1     3       13
    

    Then one way to do this is to use aggregate to count the unique subjectid values within each cond (assuming that is what you meant by "Ss"):

    > aggregate(subjectid ~ cond, data = dat, FUN = function(x) length(unique(x)))
      cond subjectid
    1    1         2
    2    2         1
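
    As a cross-check, the same counts can be obtained by first dropping duplicate subject/condition pairs and then tabulating. This is only a sketch: it assumes dat holds the six rows shown above, and the names subj_cond, counts and agg are made up for illustration.

```r
# Rebuild the example data shown above so the snippet runs on its own.
dat <- data.frame(
  subjectid = c(1234, 1234, 2143, 3456, 3456, 3456),
  cond      = c(1, 1, 2, 1, 1, 1),
  obser     = c(1, 2, 1, 1, 2, 3),
  variable  = c(12, 14, 19, 12, 14, 13)
)

# Keep one row per distinct (subjectid, cond) pair.
subj_cond <- unique(dat[, c("subjectid", "cond")])

# Tabulate the remaining rows by condition.
counts <- table(subj_cond$cond)
print(counts)

# Compare with the aggregate() result; both should agree.
agg <- aggregate(subjectid ~ cond, data = dat,
                 FUN = function(x) length(unique(x)))
stopifnot(all(as.vector(counts) == agg$subjectid))
```

    With both approaches, a subject who appears under more than one condition is counted once in each of those conditions.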
    