Question
This should be so simple but for some reason data.table is not doing what I expect. I want to take the max of two values in a row to determine if a row should be filtered or not. What appears to be happening is that the max() function is looking at the entire column which is not what I want. Here's the code:
> test_dt <- data.table(value1 = 1:10, value2 = 2:11, value3 = 3:12)
> test_dt[max(value1, value2, value3) < 7]
Empty data.table (0 rows) of 3 cols: value1,value2,value3
Here's what I expect:
value1 value2 value3
1: 1 2 3
2: 2 3 4
3: 3 4 5
4: 4 5 6
What am I doing wrong here? This should be so trivial but I appear to be missing something critical.
Answer 1:
You want the parallel max, pmax. See ?max for details:
test_dt[pmax(value1, value2, value3) < 7]
# value1 value2 value3
# 1: 1 2 3
# 2: 2 3 4
# 3: 3 4 5
# 4: 4 5 6
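To see why the original filter returned no rows, it may help to compare the two functions on plain vectors. This is only a rough sketch in base R, reusing the column values from the question as standalone vectors; the names overall_max, single_condition, row_max and keep are made up for illustration.

```r
# Same values as the three columns in the question
value1 <- 1:10
value2 <- 2:11
value3 <- 3:12

# max() collapses all of its arguments into one number
overall_max <- max(value1, value2, value3)

# so the comparison is a single FALSE, which selects no rows
single_condition <- overall_max < 7

# pmax() compares element by element, giving one maximum per row
row_max <- pmax(value1, value2, value3)

# one TRUE/FALSE per row, which is what a row filter needs
keep <- row_max < 7
```

Here overall_max is 12, so the condition in the question is never true for any row, while keep is TRUE only for the first four rows.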
If you really want speed, you can use pmax.int; again, see ?max for details.
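As a quick sanity check, something like the following should show that pmax.int gives the same values as pmax here. It is a sketch on plain integer vectors (per ?max, pmax.int is meant for atomic vectors without classes); the names fast and same are just illustrative, and no timing is claimed.

```r
# Same values as the three columns in the question
value1 <- 1:10
value2 <- 2:11
value3 <- 3:12

# pmax.int is the faster internal version for plain atomic vectors
fast <- pmax.int(value1, value2, value3)

# compare against the regular pmax result
same <- all(fast == pmax(value1, value2, value3))
```

Inside the data.table call this would be test_dt[pmax.int(value1, value2, value3) < 7].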
Source: https://stackoverflow.com/questions/33768702/r-data-table-using-max-in-i-statement