Question:

What's the best way to count the number of occurrences of a given string, including overlaps, in Python? Is it the most obvious way:

    def function(string, str_to_search_for):
        count = 0
        for x in range(len(string) - len(str_to_search_for) + 1):
            if string[x:x+len(str_to_search_for)] == str_to_search_for:
                count += 1
        return count

    function('1011101111', '11')  # returns 5

Or is there a better way in Python?

Answer 1:

Well, this might be faster since it does the comparing in C:

    def occurrences(string, sub):
        count = start = 0
        while True:
            # find returns -1 when there are no more matches, so start becomes 0
            start = string.find(sub, start) + 1
            if start > 0:
                count += 1
            else:
                return count
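Whether this is actually faster depends on the inputs and the interpreter, so it is worth measuring. Below is a minimal sketch using the standard-library timeit module. The helper name naive and the test string are made up for illustration; they are not from the original answers.

```python
import timeit

def naive(string, sub):
    # Slice-and-compare at every position, as in the question.
    n = len(sub)
    return sum(1 for i in range(len(string) - n + 1) if string[i:i + n] == sub)

def occurrences(string, sub):
    # str.find-based version from the answer above.
    count = start = 0
    while True:
        start = string.find(sub, start) + 1
        if start > 0:
            count += 1
        else:
            return count

text = '1011101111' * 1000  # hypothetical test data

# Both versions should agree before their timings are compared.
assert naive(text, '11') == occurrences(text, '11')

for fn in (naive, occurrences):
    seconds = timeit.timeit(lambda: fn(text, '11'), number=20)
    print(fn.__name__, round(seconds, 4))
```

The printed timings will vary from machine to machine, so no expected output is shown.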

Answer 2:

    >>> import re
    >>> text = '1011101111'
    >>> len(re.findall('(?=11)', text))
    5

If you don't want to load the whole list of matches into memory (which is unlikely ever to be a problem), you could do this instead:

    >>> sum(1 for _ in re.finditer('(?=11)', text))
    5

As a function (re.escape makes sure the substring doesn't interfere with the regex):

    >>> def occurrences(text, sub):
    ...     return len(re.findall('(?={0})'.format(re.escape(sub)), text))
    ...
    >>> occurrences(text, '11')
    5
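To see why the escaping matters, here is a small sketch that compares the function above with a variant that skips re.escape. The name occurrences_unescaped is hypothetical and exists only for this comparison.

```python
import re

def occurrences(text, sub):
    # Escaped: `sub` is matched literally.
    return len(re.findall('(?={0})'.format(re.escape(sub)), text))

def occurrences_unescaped(text, sub):
    # Hypothetical variant without re.escape, for comparison only.
    return len(re.findall('(?={0})'.format(sub), text))

print(occurrences('a.b.c', '.'))            # 2: the two literal dots
print(occurrences_unescaped('a.b.c', '.'))  # 5: '.' matches any character
```

Without escaping, the dot is treated as a regex wildcard, so every position that is followed by a character counts as a match.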

Answer 3:

You can also try the third-party regex module (available from PyPI), which supports overlapping matches.

    import regex as re

    def count_overlapping(text, search_for):
        return len(re.findall(search_for, text, overlapped=True))

    count_overlapping('1011101111', '11')  # 5

Answer 4:

Python's str.count counts non-overlapping substrings:

    In [3]: "ababa".count("aba")
    Out[3]: 1

Here are a few ways to count overlapping sequences; I'm sure there are many more :)

Look-ahead regular expressions

How to find overlapping matches with a regexp?

    In [10]: re.findall("a(?=ba)", "ababa")
    Out[10]: ['a', 'a']

Generate all substrings

    In [11]: data = "ababa"

    In [17]: sum(1 for i in range(len(data)) if data.startswith("aba", i))
    Out[17]: 2

Answer 5:

    s = "bobobob"
    sub = "bob"
    ln = len(sub)
    # Each True in the generator counts as 1 when summed
    print(sum(sub == s[i:i+ln] for i in range(len(s) - (ln - 1))))

Answer 6:

My answer to the "bob" question from the course:

    s = 'azcbobobegghaklbob'
    total = 0
    for i in range(len(s) - 2):
        if s[i:i+3] == 'bob':
            total += 1
    print('number of times bob occurs is:', total)

Answer 7:

How to find a pattern in another string with overlapping

This function (another solution!) receives a pattern and a text, and returns a list of all the matching substrings found in the text, together with their positions.

    import re

    def occurrences(pattern, text):
        """
        input: a pattern (regular expression) to search for, and a text
        returns: a list of matching substrings and their positions
        """
        # Wrapping the pattern in a lookahead lets the matches overlap
        p = re.compile('(?=({0}))'.format(pattern))
        matches = re.finditer(p, text)
        return [(match.group(1), match.start()) for match in matches]

    print(occurrences('ana', 'banana'))
    print(occurrences('.ana', 'Banana-fana fo-fana'))

[('ana', 1), ('ana', 3)]
[('Bana', 0), ('nana', 2), ('fana', 7), ('fana', 15)]

Answer 8:

Here is my edX MIT "find bob"* solution (*find the number of "bob" occurrences in a string named s), which basically counts overlapping occurrences of a given substring:

    s = 'azcbobobegghakl'
    count = 0

    while 'bob' in s:
        count += 1
        # Skip past the first two characters of the match just found
        s = s[(s.find('bob') + 2):]

    print("Number of times bob occurs is: {}".format(count))
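The `+ 2` step is tied to the pattern 'bob'. For a pattern that overlaps itself more heavily, such as 'aaa', skipping two characters would miss matches. A minimal sketch of a generalization that resumes the search one character after each match follows. The function name count_overlapping is an assumption for illustration, not part of the original answer.

```python
def count_overlapping(s, sub):
    # Hypothetical generalization: resume one character past each match start.
    count = 0
    pos = s.find(sub)
    while pos != -1:
        count += 1
        pos = s.find(sub, pos + 1)
    return count

print(count_overlapping('azcbobobegghakl', 'bob'))  # 2
print(count_overlapping('aaaa', 'aaa'))             # 2
```

Passing the start index to find also avoids re-slicing the string on every iteration.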

Answer 9:

That can be solved using regex.

    import re

    def function(string, sub_string):
        # re.escape keeps regex metacharacters in sub_string literal
        match = re.findall('(?=' + re.escape(sub_string) + ')', string)
        return len(match)

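Answer 10:

    def count_overlaps(string, look_for):
        start = 0
        matches = 0

        while True:
            start = string.find(look_for, start)
            # The source is cut off after "if start"; the lines from here on
            # are an assumed completion based on the usual str.find loop.
            if start < 0:
                break
            start += 1
            matches += 1

        return matches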

Answer 11:

A function that takes two strings as input and counts how many times sub occurs in string, including overlaps. To check whether sub is a substring, I used the in operator.

    def count_Occurrences(string, sub):
        count = 0
        for i in range(0, len(string) - len(sub) + 1):
            # The window has the same length as sub, so "in" acts as an equality test
            if sub in string[i:i+len(sub)]:
                count = count + 1
        print('Number of times sub occurs in string (including overlaps):', count)

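Answer 12:

For a duplicate question, I decided to step through the string and compare three characters at a time, e.g.:

    counted = 0
    # `string` is assumed to be defined already
    for i in range(len(string)):
        # Slide one position at a time so that overlapping matches are counted
        if string[i:i+3] == 'xox':
            counted = counted + 1
    print(counted)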

Answer 13:

An alternative very close to the accepted answer, but using the while condition as the test instead of an if inside the loop:

    def countSubstr(string, sub):
        count = 0
        while sub in string:
            count += 1
            # Drop everything up to and including the first character of the match
            string = string[string.find(sub) + 1:]
        return count

This avoids while True: and is a little cleaner, in my opinion.

Answer 14:

If the strings are large, you want to use Rabin-Karp. In summary:

a rolling window of substring size, moving over a string
a hash with O(1) overhead for adding and removing (i.e. move by 1 char)
implemented in C or relying on pypy
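The three points above can be sketched in pure Python as follows. This is an illustrative sketch only: the function name, the base, and the modulus are assumptions rather than tuned values, and returning 0 for an empty substring is a choice made here. As the answer notes, the approach is unlikely to pay off in CPython without a C implementation or PyPy.

```python
def rabin_karp_count(text, sub, base=256, mod=(1 << 61) - 1):
    # base and mod are illustrative choices, not tuned values.
    n, m = len(text), len(sub)
    if m == 0 or m > n:
        return 0
    high = pow(base, m - 1, mod)  # weight of the window's leading character

    # Hash the pattern and the first window of the text.
    target = window = 0
    for i in range(m):
        target = (target * base + ord(sub[i])) % mod
        window = (window * base + ord(text[i])) % mod

    count = 0
    for i in range(n - m + 1):
        # Compare the actual strings only when the hashes agree,
        # which rules out hash collisions.
        if window == target and text[i:i + m] == sub:
            count += 1
        if i + m < n:
            # Slide by one character: drop text[i], append text[i + m].
            window = ((window - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return count

print(rabin_karp_count('1011101111', '11'))  # 5
print(rabin_karp_count('ababa', 'aba'))      # 2
```

Each window update costs O(1) regardless of the substring length, which is where the benefit for long patterns comes from.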

Answer 15:

    def count_substring(string, sub_string):
        counter = 0
        for i in range(len(string)):
            if string[i:].startswith(sub_string):
                counter = counter + 1
        return counter

The code above loops through the string once and checks, at each position, whether the rest of the string starts with the substring being counted.
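Note that string[i:] copies the tail of the string on every iteration. str.startswith accepts an optional start index, so the same check can be done without slicing. A minimal sketch that reuses the function name from this answer:

```python
def count_substring(string, sub_string):
    # str.startswith takes a start index, so no slice is created.
    return sum(1 for i in range(len(string)) if string.startswith(sub_string, i))

print(count_substring('ababa', 'aba'))       # 2
print(count_substring('1011101111', '11'))   # 5
```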

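Answer 16:

If you want to count the occurrences of every substring of length 5, overlaps included (adjust for other lengths if wanted):

    from collections import defaultdict

    def MerCount(s):
        d = defaultdict(int)  # maps each 5-character substring to its count
        for i in range(len(s) - 4):
            d[s[i:i+5]] += 1
        return d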
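For other window lengths, the same idea can be written with collections.Counter and a length parameter. This is a sketch; the name kmer_counts and the parameter k are assumptions for illustration.

```python
from collections import Counter

def kmer_counts(s, k=5):
    # Count every length-k window of s, overlaps included.
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))

counts = kmer_counts('abababa', k=3)
print(counts['aba'])  # 3
print(counts['bab'])  # 2
```

A Counter returns 0 for substrings that never occur, so lookups for absent keys do not raise.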

Answer 17:

    sum([1 for i in range(len(string) - len(str_to_search_for) + 1)
         if string[i:i+len(str_to_search_for)] == str_to_search_for])

In the list comprehension, we slide through the bigger string one position at a time, using a window as long as the smaller string. The number of window positions is the length of the bigger string minus the length of the smaller string, plus one. At each position, we compare that part of the bigger string with the smaller string and add a 1 to the list if they match. The sum of all these 1s gives the total number of matches found.

Source: string count with overlapping occurrences
