Finding the indexes of multiple/overlapping matching substrings

时光总嘲笑我的痴心妄想 submitted on 2019-12-17 03:20:21

Question


I have a string, s="CCCGTGCC", and a substring ss="CC". I want to get all the indexes in s at which an occurrence of ss starts, including overlapping occurrences. In my example I would want to get back the array c(1,2,7).

Is there any string function that achieves this? Notice that my string is in the form "CCCGTGCC", and not c("C","C","C","G","T","G","C","C").

grep only reports whether there is a match anywhere in the string, not the indexes of the matches within the string, unless I'm missing something.


Answer 1:


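Try gregexpr with perl=TRUE and a Perl-style look-ahead assertion (see ?regex). A look-ahead matches a position without consuming any characters, so overlapping occurrences are all reported, each with match length 0: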

gregexpr("(?=CC)","CCCGTGCC",perl=TRUE)
[[1]]
[1] 1 2 7
attr(,"match.length")
[1] 0 0 0
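
To get a plain integer vector rather than the list-with-attributes that gregexpr returns, the call above can be wrapped in a small helper. This is only a sketch: the name find_overlapping is hypothetical and not part of the original answer, and it assumes the pattern contains no regex metacharacters (they would have to be escaped first).

```r
# Hypothetical helper (not from the original answer): return the 1-based start
# positions of every occurrence of `pattern` in `s`, overlapping ones included.
# Assumes `pattern` contains no regex metacharacters.
find_overlapping <- function(s, pattern) {
  # Wrap the pattern in a zero-width look-ahead so a match consumes no characters
  m <- gregexpr(paste0("(?=", pattern, ")"), s, perl = TRUE)[[1]]
  pos <- as.integer(m)                     # strips match.length and other attributes
  if (pos[1] == -1L) integer(0) else pos   # gregexpr reports -1 when nothing matches
}

find_overlapping("CCCGTGCC", "CC")           # 1 2 7

# For contrast, without the look-ahead the matches cannot overlap:
as.integer(gregexpr("CC", "CCCGTGCC")[[1]])  # 1 7
```

The contrast at the end shows why the look-ahead matters: a plain "CC" match consumes positions 1 and 2, so the occurrence starting at position 2 is skipped.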


Source: https://stackoverflow.com/questions/7878992/finding-the-indexes-of-multiple-overlapping-matching-substrings
