I have a data set which looks like the following (partially):
id name dummy
1 Jane 1
1 Jane 0
1 Jane 1
2 Mike 0
2 Mike 0
2 Mik
ave can be used to produce the group-wise sum while keeping the original row positions:
x[with(x, ave(dummy, name, FUN=sum))>0,]
## id name dummy
## 1 1 Jane 1
## 2 1 Jane 0
## 3 1 Jane 1
## 9 3 Tom 1
## 10 3 Tom 1
## 11 3 Tom 0
## 12 3 Tom 0
ave is similar to aggregate, but instead of returning one value per group it repeats the aggregated value for every row of that group:
with(x, ave(dummy, name, FUN=sum))
## [1] 2 2 2 0 0 0 0 0 2 2 2 2
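To make the difference between aggregate and ave concrete, here is a minimal sketch in base R. The data frame is a reconstruction inferred from the printed output above (three Jane rows, five Mike rows, four Tom rows), so treat it as an assumption rather than the original data.

```r
# Hypothetical reconstruction of the example data, inferred from the output shown
x <- data.frame(
  id    = rep(1:3, times = c(3, 5, 4)),
  name  = rep(c("Jane", "Mike", "Tom"), times = c(3, 5, 4)),
  dummy = c(1, 0, 1,  0, 0, 0, 0, 0,  1, 1, 0, 0)
)

# aggregate() collapses the data to one row per group
agg <- aggregate(dummy ~ name, data = x, FUN = sum)

# ave() returns a vector as long as the input, aligned with the original rows
per_row <- with(x, ave(dummy, name, FUN = sum))

# Because per_row lines up with the rows of x, it works directly as a row filter
kept <- x[per_row > 0, ]
```

Since per_row has one entry per row of x, no merge back onto the original data is needed before subsetting.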
You can use plyr::ddply too:
require(plyr)
ddply(df, .(name), function(x) subset(x, !all(dummy == 0)))
## id name dummy
## 1 1 Jane 1
## 2 1 Jane 0
## 3 1 Jane 1
## 4 3 Tom 1
## 5 3 Tom 1
## 6 3 Tom 0
## 7 3 Tom 0
Note that !all(dummy == 0) can be replaced by any(dummy != 0).
Assuming df is your data.frame, use tapply and [ to subset the rows you want:
> ind <- with(df, tapply(dummy, name, sum))
> df[df$name %in% names(ind)[ind!=0], ]
id name dummy
1 1 Jane 1
2 1 Jane 0
3 1 Jane 1
9 3 Tom 1
10 3 Tom 1
11 3 Tom 0
12 3 Tom 0
Another alternative:
> result <- split(df, df$name)[with(df, tapply(dummy, name, function(x) sum(x)!=0))]
> do.call(rbind, result)
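One thing to be aware of with the split / rbind variant is that the row names of the result get prefixed with the group name. The sketch below illustrates this on a reconstruction of the example data (an assumption inferred from the output shown earlier) and shows one way to reset the row names if that matters.

```r
# Hypothetical reconstruction of the example data, inferred from the output shown
df <- data.frame(
  id    = rep(1:3, times = c(3, 5, 4)),
  name  = rep(c("Jane", "Mike", "Tom"), times = c(3, 5, 4)),
  dummy = c(1, 0, 1,  0, 0, 0, 0, 0,  1, 1, 0, 0)
)

# One logical flag per group: TRUE when the group has at least one non-zero dummy
keep <- with(df, tapply(dummy, name, function(x) sum(x) != 0))

# Keep only the flagged groups, then stack them back into one data frame
result <- do.call(rbind, split(df, df$name)[keep])

# Row names now carry the group name as a prefix, e.g. "Jane.1", "Tom.9"
prefixed <- rownames(result)

# Optional: drop the prefixed row names and renumber from 1
rownames(result) <- NULL
```

If the original row numbers need to be preserved, the tapply and %in% version above may be the simpler choice, since it subsets df directly.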
A possible solution:
subset(dat, as.logical(ave(dummy, id, FUN = any)))
# id name dummy
# 1 1 Jane 1
# 2 1 Jane 0
# 3 1 Jane 1
# 9 3 Tom 1
# 10 3 Tom 1
# 11 3 Tom 0
# 12 3 Tom 0
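The as.logical wrapper is needed because ave writes the result of FUN back into a copy of its input vector, so the TRUE/FALSE values from any come back as 1/0 in the type of dummy. A small sketch of the intermediate steps, using a reconstruction of the example data (an assumption inferred from the output above, with dummy stored as integer):

```r
# Hypothetical reconstruction of the example data, with an integer dummy column
dat <- data.frame(
  id    = rep(1:3, times = c(3, 5, 4)),
  name  = rep(c("Jane", "Mike", "Tom"), times = c(3, 5, 4)),
  dummy = c(1L, 0L, 1L,  0L, 0L, 0L, 0L, 0L,  1L, 1L, 0L, 0L)
)

# any() runs once per id; ave() stores each result back into a copy of dummy,
# so flag is numeric 0/1 rather than logical
flag <- ave(dat$dummy, dat$id, FUN = any)

# Convert the 0/1 flags to TRUE/FALSE so they can be used as a row filter
keep <- as.logical(flag)

res <- dat[keep, ]
```

Without the conversion, subset would reject the numeric flag vector, because it requires a logical condition.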
An alternative with data.table:
library(data.table)
setDT(dat)[, if (any(dummy)) .SD, by = id]
Or with dplyr:
library(dplyr)
dat %>%
  group_by(id) %>%
  filter(any(dummy))