R: cbind based on matching the first few letters or numbers of a cell

我们两清 submitted on 2020-01-02 22:38:40

Question


I have df1 like this:

df1 <- data.frame(A=c("x01","x02","y03","z02","x04"), B=c("A01BB01","A02BB02","C02AA05","B04CC10","C01GX02"))

    A       B
1 x01 A01BB01
2 x02 A02BB02
3 y03 C02AA05
4 z02 B04CC10
5 x04 C01GX02

I have df2 like this.

  X     Y
1 a A01BB
2 b   A02
3 c  C02A
4 d   B04
5 e C01GX

df2 <- data.frame(X=c("a","b","c","d","e"), Y=c("A01BB","A02","C02A","B04","C01GX"))

I want to match the first few letters/numbers of df1$B against the values in df2$Y, and then merge the two data frames based on the best match. The expected result is a data frame like this:

  A       B   X     Y
1 x01 A01BB01   a A01BB
2 x02 A02BB02   b   A02
3 y03 C02AA05   c  C02A
4 z02 B04CC10   d   B04
5 x04 C01GX02   e C01GX

Could you show me how to do this? Thanks.

The match may only occur at the start of the string: the matched portion must not appear in the middle or at the end of the values in df1$B. Is there an efficient way of doing this in R?


Answer 1:


You can use pmatch for this kind of matching:

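with(c(df1,df2),{
  i <- pmatch(Y,B)             # for each Y, the row of df1 whose B starts with it
  j <- match(seq_along(B), i)  # invert: for each row of df1, the matching row of df2
  data.frame(A,B,X = X[j],Y = Y[j])
})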
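
Note that pmatch returns NA for a prefix that partially matches more than one entry. As a rough alternative, assuming "best match" means the longest df2$Y value that a df1$B value starts with, something like the sketch below might work. The helper name longest_prefix_index is made up for this illustration and is not part of the original answer; startsWith requires R 3.3.0 or later.

```r
# Sample data from the question (plain character columns, not factors)
df1 <- data.frame(A = c("x01", "x02", "y03", "z02", "x04"),
                  B = c("A01BB01", "A02BB02", "C02AA05", "B04CC10", "C01GX02"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(X = c("a", "b", "c", "d", "e"),
                  Y = c("A01BB", "A02", "C02A", "B04", "C01GX"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: for each string in `full`, return the index of the
# longest element of `prefixes` that it starts with, or NA if none does.
longest_prefix_index <- function(full, prefixes) {
  vapply(full, function(s) {
    hits <- which(startsWith(s, prefixes))   # prefixes that s begins with
    if (length(hits) == 0L) return(NA_integer_)
    hits[which.max(nchar(prefixes[hits]))]   # keep the longest one
  }, integer(1), USE.NAMES = FALSE)
}

j <- longest_prefix_index(df1$B, df2$Y)      # row of df2 for each row of df1
result <- cbind(df1, df2[j, , drop = FALSE]) # unmatched rows get NA in X and Y
rownames(result) <- NULL
result
```

Because the comparison uses startsWith, a df2$Y value that only occurs in the middle or at the end of a df1$B value is never treated as a match.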


Source: https://stackoverflow.com/questions/6592214/r-cbind-based-on-match-first-few-letters-or-number-of-a-cells
