Find the repeating substring a string is composed of, if it exists

Submitted by 断了今生、忘了曾经 on 2020-02-05 01:08:48

Question


How would you go about splitting a normal string into as many identical pieces as possible while using all of its characters? For example,

a = "abab"

would return "ab", whereas

b = "ababc"

would return "ababc", as it can't be split into identical pieces using all letters.


Answer 1:


This is very similar, but not identical, to "How can I tell if a string repeats itself in Python?". That question only asks whether a string is made up of identical repeating substrings, not what the repeating substring (if any) is.

The accepted (and by far the best-performing) answer to that question can be adapted to return the repeating substring if there is one, or the whole string otherwise:

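def repeater(s):
    # Look for s inside s+s, skipping the trivial matches at offsets 0 and len(s).
    i = (s+s)[1:-1].find(s)
    if i == -1:
        return s  # no repeating unit shorter than s itself
    else:
        # The slice dropped the first character, so the unit length is i + 1.
        return s[:i+1]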

Examples:

>>> repeater('abab')
'ab'
>>> repeater('ababc')
'ababc'
>>> repeater('xyz' * 1000000)
'xyz'
>>> repeater('xyz' * 50 + 'q')
'xyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzxyzq'
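
The trick works because if s reappears inside s+s at an offset strictly between 0 and len(s), then s equals a rotation of itself, and the earliest such offset is the length of the shortest repeating unit. As a sanity check, here is a minimal sketch of a slower reference version that tries each candidate unit length directly. The name repeater_bruteforce is hypothetical (it is not from the answer), and repeater is restated only so the block runs on its own:

```python
def repeater(s):
    # Same approach as the answer above, restated so this block is self-contained.
    i = (s + s)[1:-1].find(s)
    return s if i == -1 else s[:i + 1]


def repeater_bruteforce(s):
    # Hypothetical reference version: try each candidate unit length from
    # shortest to longest and return the first unit that rebuilds s exactly.
    n = len(s)
    for size in range(1, n // 2 + 1):
        if n % size == 0 and s[:size] * (n // size) == s:
            return s[:size]
    return s  # no proper repeating unit, so s is its own unit


# Cross-check the two versions on a few inputs, including edge cases.
for sample in ["abab", "ababc", "aaaa", "abcabcabc", "a", ""]:
    assert repeater(sample) == repeater_bruteforce(sample)
```

The brute-force version does more work per call, so it is probably only useful for testing the faster one.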



Answer 2:


Since the repeating substring has to cover the whole string, with no characters before or after the repetitions, an anchored regular expression could also be used:

In[4]: re.sub(r'^([a-z]+)\1$',r'\1','abab')
Out[4]: 'ab'
In[5]: re.sub(r'^([a-z]+)\1$',r'\1','ababc')
Out[5]: 'ababc' 

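([a-z]+) captures a candidate substring, and \1 matches that same captured text again. As written, this pattern only matches strings made of exactly two copies of the substring.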

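EDIT: to allow more than two copies, repeat the backreference, and make the group lazy so that it captures the shortest unit (a greedy [a-z]+ would capture 'abcabc' here):

re.sub(r'^([a-z]+?)\1{1,}$',r'\1','abcabcabcabc')
'abc'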
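
A minimal sketch of the regex approach wrapped in a function. The name repeating_unit is hypothetical, and using . with re.DOTALL instead of [a-z] is an assumption made here so that any characters are accepted. The group is lazy (+?) so that it settles on the shortest unit; a greedy group would settle on the longest one instead, for example 'abcabc' for 'abcabcabcabc'.

```python
import re

# Lazy group (+?) finds the shortest unit; \1+ requires at least one more copy.
# re.DOTALL lets "." match newlines too (an assumption, not in the original).
_UNIT = re.compile(r"(.+?)\1+", re.DOTALL)


def repeating_unit(s):
    # fullmatch requires the pattern to cover the entire string.
    m = _UNIT.fullmatch(s)
    return m.group(1) if m else s


print(repeating_unit("abab"))          # ab
print(repeating_unit("ababc"))         # ababc
print(repeating_unit("abcabcabcabc"))  # abc
```

Backtracking may make this noticeably slower than the find-based function on long inputs, so it is probably best kept for short strings.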


Source: https://stackoverflow.com/questions/43035406/find-the-repeating-substring-a-string-is-composed-of-if-it-exists
