Question
There are two data tables of the following structure:
DT1 <- data.table(ID=c("A","B","C"), P0=c(1,10,100), key="ID")
DT2 <- data.table(ID=c("B","B","B","A","A","A","C","C","C"), t=rep(1:3,3), P=c(NA,30,50,NA,4,6,NA,200,700))
In data table DT2, all NAs in column P shall be replaced by the corresponding values of P0 from data table DT1.
If DT2 is ordered by ID like DT1, the problem can be solved like this:
setorder(DT2,ID)
idxr <- which(DT2[["t"]]==1)
set(DT2, i=idxr, j="P", value=DT1[["P0"]])
But how can the data tables be "merged" without ordering DT2 first?
Answer 1:
We can join the two datasets on 'ID'. For rows where 'P' is NA, we assign 'P0' to 'P', and then remove the 'P0' column by assigning NULL to it.
library(data.table)  # v1.9.6+
DT2[DT1, on='ID'][is.na(P), P:= P0][, P0:= NULL][]
Or, as @DavidArenburg mentioned, we can use an ifelse condition while joining on 'ID' to replace the NA elements in 'P'.
DT2[DT1, P := ifelse(is.na(P), i.P0, P), on = 'ID']
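To illustrate how the two forms above differ, here is a minimal sketch using the question's data. The names `res` and `DT2b` are introduced here for illustration only and are not part of the original answer. The chained form returns a new table whose rows follow the ID order of DT1, while the `:=` join updates its table by reference and leaves the row order as it was.

```r
library(data.table)  # assumes v1.9.6 or later for the on= argument

DT1 <- data.table(ID = c("A", "B", "C"), P0 = c(1, 10, 100), key = "ID")
DT2 <- data.table(ID = rep(c("B", "A", "C"), each = 3),
                  t  = rep(1:3, 3),
                  P  = c(NA, 30, 50, NA, 4, 6, NA, 200, 700))
DT2b <- copy(DT2)  # hypothetical second copy so both forms can be compared

# Form 1: returns a new table; rows come back in the order of DT1's IDs.
res <- DT2[DT1, on = "ID"][is.na(P), P := P0][, P0 := NULL][]

# Form 2: updates DT2b by reference; its original row order is kept.
DT2b[DT1, P := ifelse(is.na(P), i.P0, P), on = "ID"]

res$ID[1]   # first ID in the returned table
DT2b$ID[1]  # first ID in the updated table
```

If the original row order of DT2 matters, the second form is probably the safer choice, since nothing has to be re-sorted afterwards.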
Answer 2:
Here's another option: a join restricted to the rows where 'P' is NA.
DT2[is.na(P), P := DT1[.SD, P0]]
DT2
# ID t P
# 1: B 1 10
# 2: B 2 30
# 3: B 3 50
# 4: A 1 1
# 5: A 2 4
# 6: A 3 6
# 7: C 1 100
# 8: C 2 200
# 9: C 3 700
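As far as I can tell, `DT1[.SD, P0]` works here because DT1 is keyed by ID and ID happens to be the first column of `.SD`. The following sketch is a variant, not the answerer's exact code: it names the join column with `on=`, so it should not depend on the key or on column position. DT1 is deliberately created without a key to show this.

```r
library(data.table)  # assumes v1.9.6 or later for the on= argument

# Same data as in the question, but DT1 has no key set.
DT1 <- data.table(ID = c("A", "B", "C"), P0 = c(1, 10, 100))
DT2 <- data.table(ID = rep(c("B", "A", "C"), each = 3),
                  t  = rep(1:3, 3),
                  P  = c(NA, 30, 50, NA, 4, 6, NA, 200, 700))

# i selects only the rows where P is NA; .SD is that subset of DT2.
# The inner join looks up P0 in DT1 for each of those rows, matched on ID.
DT2[is.na(P), P := DT1[.SD, on = "ID", P0]]

DT2
```

Only the NA rows are touched, and DT2 keeps its original row order.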
Source: https://stackoverflow.com/questions/33981797/r-updating-nas-in-a-data-table-with-values-of-another-data-table