Question
I want to do an inner join of two data frames in which the Value column of the result is the difference between the Value columns of the two inputs.
df1 = data.frame(Term = c("T1","T2","T3"), Sec = c("s1","s2","s3"), Value =c(10,30,30))
df2 = data.frame(Term = c("T1","T2","T3"), Sec = c("s1","s3","s2"), Value =c(40,20,10))
df1
Term Sec Value
T1 s1 10
T2 s2 30
T3 s3 30
df2
Term Sec Value
T1 s1 40
T2 s3 20
T3 s2 10
The result I want is
Term Sec Value
T1 s1 30
T2 s2 20
T3 s3 10
Basically, I am joining the two tables (rows are matched on Sec) and, for the Value column, taking
Value= abs(df1$Value - df2$Value)
I have struggled but could not find a way to do this conditional merge in base R. If it is not possible in base R, dplyr should probably be able to do it with inner_join(), but I am not very familiar with that package.
So any suggestion using base R and/or dplyr would be appreciated.
EDIT
I have included my original data as requested. It is here:
https://jsfiddle.net/6z6smk80/1/
DF1 is the first table and DF2 is the second; DF2 starts at row 168.
The logic is the same: I want to join these two tables, which have 160 rows each, by ID and take the difference of the Value column between them. The resulting data set should have the same number of rows (160), with an extra column diff.
Answer 1:
Here is a "base R" solution using the merge() function on the Sec column shared by your original df1 and df2 data frames:
df_merged <- merge(df1, df2, by="Sec")
df_merged$Value <- abs(df_merged$Value.x - df_merged$Value.y)
df_merged <- df_merged[, c("Sec", "Term.x", "Value")]
names(df_merged)[2] <- "Term"
> df_merged
Sec Term Value
1 s1 T1 30
2 s2 T2 20
3 s3 T3 10
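The same pattern should carry over to the larger data described in the edit. Here is a minimal sketch, assuming the two tables share a key column named ID and each has a Value column (those names come from the question's description; the three-row DF1 and DF2 below are made-up stand-ins, not the real data):

```r
# Hypothetical stand-ins for the real 160-row tables
DF1 <- data.frame(ID = c(1, 2, 3), Value = c(10, 30, 30))
DF2 <- data.frame(ID = c(3, 1, 2), Value = c(20, 40, 10))

# merge() does an inner join by default (all = FALSE);
# suffixes controls how the two clashing Value columns are renamed
joined <- merge(DF1, DF2, by = "ID", suffixes = c(".1", ".2"))

# Keep both original values and add the absolute difference as a new column
joined$diff <- abs(joined$Value.1 - joined$Value.2)

# If every ID occurs exactly once in each table, no rows are lost or duplicated
stopifnot(nrow(joined) == nrow(DF1))
print(joined)
```

If the row count changes after the merge, some IDs are either missing from one table or duplicated in one of them.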
Answer 2:
Using data.table's binary join, you can modify columns while joining. nomatch = 0L makes sure that you are doing an inner join.
library(data.table)
setkey(setDT(df2), Sec)
setkey(setDT(df1), Sec)[df2, .(Term, Sec, Value = abs(Value - i.Value)), nomatch = 0L]
# Term Sec Value
# 1: T1 s1 30
# 2: T2 s2 20
# 3: T3 s3 10
Answer 3:
As this is a dplyr question, here is a dplyr solution:
First use inner_join and then transmute to keep the variables you need and compute the new one.
library(dplyr)
inner_join(df1, df2, by = "Sec") %>%
  transmute(Term = Term.x, Sec, Value = abs(Value.x - Value.y))
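For completeness, here is a self-contained version of the pipeline above that can be pasted into a fresh session. It assumes dplyr is installed; the `expected` data frame is just the desired result copied from the question, used as a sanity check:

```r
library(dplyr)

# Example data from the question
df1 <- data.frame(Term = c("T1", "T2", "T3"), Sec = c("s1", "s2", "s3"), Value = c(10, 30, 30))
df2 <- data.frame(Term = c("T1", "T2", "T3"), Sec = c("s1", "s3", "s2"), Value = c(40, 20, 10))

# Join on Sec; the clashing Term and Value columns get .x / .y suffixes
result <- inner_join(df1, df2, by = "Sec") %>%
  transmute(Term = Term.x, Sec, Value = abs(Value.x - Value.y))

# Desired output as stated in the question
expected <- data.frame(Term = c("T1", "T2", "T3"), Sec = c("s1", "s2", "s3"), Value = c(30, 20, 10))
stopifnot(isTRUE(all.equal(result, expected)))
print(result)
```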
Source: https://stackoverflow.com/questions/31179805/inner-join-with-conditions-in-r