I want a simple count of the number of subjects in each condition of a study. The data look something like this:
subjectid cond obser variable
     1234    1     1       12
     1234    1     2       14
     2143    2     1       19
     3456    1     1       12
     3456    1     2       14
     3456    1     3       13
Use the ddply function from the plyr package:
library(plyr)
# Random example data: without set.seed(), the counts shown below will vary between runs
df <- data.frame(subjectid = sample(1:3, 7, TRUE),
                 cond = sample(1:2, 7, TRUE), obser = sample(1:7))
> ddply(df, .(cond), summarize, NumSubs = length(unique(subjectid)))
  cond NumSubs
1    1       1
2    2       2
The ddply function "splits" the data frame by the cond variable, and produces a summary column NumSubs for each sub-data-frame.
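To make the "split, then summarize" idea concrete, here is a rough base-R sketch of the same steps, with no plyr involved. It assumes the six-row snippet from the question has been loaded into a data frame called dat; the names pieces, counts, and result are only illustrative.

```r
# The six-row snippet from the question
dat <- data.frame(
  subjectid = c(1234, 1234, 2143, 3456, 3456, 3456),
  cond      = c(1, 1, 2, 1, 1, 1),
  obser     = c(1, 2, 1, 1, 2, 3),
  variable  = c(12, 14, 19, 12, 14, 13)
)

# Split: one sub-data-frame per level of cond
pieces <- split(dat, dat$cond)

# Apply: count the distinct subject ids within each piece
counts <- sapply(pieces, function(piece) length(unique(piece$subjectid)))

# Combine: put the per-condition counts back into a data frame
result <- data.frame(cond = names(counts), NumSubs = unname(counts))
print(result)
```

With this data, condition 1 has two distinct subjects (1234 and 3456) and condition 2 has one (2143).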
Or, if you like SQL and don't mind installing a package:
library(sqldf)
sqldf("select cond, count(distinct subjectid) from dat group by cond")
Just to give you even more choice, you could also use tapply (here a is the data frame holding your data):
tapply(a$subjectid, a$cond, function(x) length(unique(x)))
1 2
2 1
Using your snippet of data that I loaded into object dat:
> dat
  subjectid cond obser variable
1      1234    1     1       12
2      1234    1     2       14
3      2143    2     1       19
4      3456    1     1       12
5      3456    1     2       14
6      3456    1     3       13
Then one way to do this is to use aggregate to count the unique subjectid values (assuming that is what you meant by "Ss"):
> aggregate(subjectid ~ cond, data = dat, FUN = function(x) length(unique(x)))
  cond subjectid
1    1         2
2    2         1
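As a cross-check on the aggregate result, another base-R route is to drop duplicate (subjectid, cond) pairs first and then tabulate cond. This is only a sketch under the same assumption that dat holds the six rows shown above; the names pairs and counts are illustrative.

```r
# The six-row snippet from the question
dat <- data.frame(
  subjectid = c(1234, 1234, 2143, 3456, 3456, 3456),
  cond      = c(1, 1, 2, 1, 1, 1),
  obser     = c(1, 2, 1, 1, 2, 3),
  variable  = c(12, 14, 19, 12, 14, 13)
)

# Keep one row per distinct (subjectid, cond) combination
pairs <- unique(dat[, c("subjectid", "cond")])

# Each remaining row is one subject in one condition, so tabulating cond
# gives the number of subjects per condition
counts <- table(pairs$cond)
print(counts)
```

Note that a subject who appears in more than one condition is counted once in each of them, which matches what the length(unique(x)) approaches do.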