I have a df (“df”) containing multiple time series (value ~ time) whose observations are grouped by 3 factors: temp, rep, and species. These data need to be trimmed at the lower
We can use mapply to find the indices of the rows we want to exclude:
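# Collect the out-of-range row indices for every group, then drop those rows.
# SIMPLIFY = FALSE plus unlist() keeps this working when groups differ in
# how many rows fall outside their limits.
df[-unlist(with(df_thresholds,
    mapply(function(x, y, z, min_x, max_x)
             which(df$species == x & df$temp == y & df$rep == z &
                   (df$value < min_x | df$value > max_x)),
           species, temp, rep, min_value, max_value,
           SIMPLIFY = FALSE))), ]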
#   species temp rep time value
#2        A   10   1    2     4
#3        A   10   1    3     8
#6        A   20   1    2     4
#7        A   20   1    3     9
#9        A   10   2    1     2
#10       A   10   2    2     4
#11       A   10   2    3    10
#12       A   10   2    4    16
#......
In mapply we pass all the columns of df_thresholds. For each row of df_thresholds we filter df to the matching species, temp and rep, find the indices of the rows whose value lies below min_value or above max_value, and then exclude those rows from the original data frame.
The row indices collected from the mapply call are
#[1] 1 4 5 8 25 28
which are the rows we want to exclude from df since they fall outside the allowed range.
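For anyone who wants to try the idea in isolation, here is a minimal self-contained sketch. The toy df and df_thresholds are made-up values (not the data from the question); only the column names follow the ones used above. It also guards against one pitfall: if no row is out of range, the index vector is empty and df[-integer(0), ] returns zero rows rather than the full data frame.

```r
# Toy data: hypothetical values, used only for illustration
df <- data.frame(
  species = rep(c("A", "B"), each = 4),
  temp    = 10,
  rep     = 1,
  time    = rep(1:4, times = 2),
  value   = c(1, 4, 8, 20, 2, 5, 9, 30),
  stringsAsFactors = FALSE
)

# One row of limits per species/temp/rep group (also made up)
df_thresholds <- data.frame(
  species   = c("A", "B"),
  temp      = c(10, 10),
  rep       = c(1, 1),
  min_value = c(2, 1),
  max_value = c(10, 10),
  stringsAsFactors = FALSE
)

# Row indices of df that fall outside the limits of their own group.
# SIMPLIFY = FALSE always returns a list, so unlist() yields a plain
# integer vector even when groups have different numbers of bad rows.
drop_idx <- unlist(with(df_thresholds,
  mapply(function(x, y, z, min_x, max_x)
           which(df$species == x & df$temp == y & df$rep == z &
                 (df$value < min_x | df$value > max_x)),
         species, temp, rep, min_value, max_value,
         SIMPLIFY = FALSE)),
  use.names = FALSE)

# Guard: negative indexing with an empty vector would drop every row,
# so only subset when there is something to remove.
trimmed <- if (length(drop_idx) > 0) df[-drop_idx, ] else df
```

With these toy values, rows 1 and 4 (species A) and row 8 (species B) fall outside their limits, so trimmed keeps the remaining five rows.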