R: find largest common substring starting at the beginning

星月不相逢 2021-02-19 18:33

I've got 2 vectors:

word1 <- "bestelling"
word2 <- "bestelbon"

Now I want to find the largest common substring that starts at the

11 answers

闹比i (original poster)

2021-02-19 19:03
A little messy, but it's what I came up with:
```
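largest_subset <- Vectorize(function(word1, word2) {
    # Count the positions at which the prefixes of both words are identical,
    # then return the prefix of word1 of that length.
    substr(word1, 1, sum(substring(word1, 1, 1:nchar(word1)) == substring(word2, 1, 1:nchar(word2))))
})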
```
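It produces a warning if the words differ in length (more precisely, when the longer length is not a multiple of the shorter one), but the result is still correct: a recycled prefix is always shorter than the prefix it is compared against, so the two can never be equal. The function compares the prefix of each word ending at every position, from the first character onward. Prefixes match only up to the first differing character, so the number of TRUE values is the length of the common prefix, and substr() extracts word1 up to that position. I vectorized it so you can apply it to vectors of words.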
```
> word1 <- c("tester","doesitwork","yupyppp","blanks")
> word2 <- c("testover","doesit","yupsuredoes","")
> largest_subset(word1,word2)
    tester doesitwork    yupyppp     blanks 
    "test"   "doesit"      "yup"         "" 
```
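If the warning is a nuisance, one possible variant compares the two words character by character, only up to the length of the shorter one. This is a minimal sketch rather than the answer's original method; the names `common_prefix` and `common_prefix_v` are made up for illustration.
```
# Hypothetical helper (name not from the original answer).
common_prefix <- function(word1, word2) {
    # Only the first n characters can be shared, n being the shorter length.
    n <- min(nchar(word1), nchar(word2))
    if (n == 0) return("")
    chars1 <- strsplit(substr(word1, 1, n), "")[[1]]
    chars2 <- strsplit(substr(word2, 1, n), "")[[1]]
    # cumprod() turns every position after the first mismatch into 0,
    # so the sum is the length of the shared prefix.
    k <- sum(cumprod(chars1 == chars2))
    substr(word1, 1, k)
}

# Vectorized wrapper, returning an unnamed character vector.
common_prefix_v <- Vectorize(common_prefix, USE.NAMES = FALSE)
```
Because both character vectors have the same length, nothing is recycled and no warning is raised. For the question's input, `common_prefix("bestelling", "bestelbon")` gives `"bestel"`.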