Most Pythonic Way to Split an Array by Repeating Elements

前端未结

关注

 11  1526

星月不相逢 2021-02-13 09:51

I have a list of items that I want to split based on a delimiter. I want all delimiters to be removed and the list to be split when a delimiter occurs twice. F

11条回答

刺人心 (楼主)

2021-02-13 10:10

Use a generator function to maintain state of your iterator through the list, and the count of the number of separator chars seen so far:

l = ['a', 'b', 'X', 'X', 'c', 'd', 'X', 'X', 'f', 'X', 'g'] 

def splitOn(ll, x, n):
    cur = []
    splitcount = 0
    for c in ll:
        if c == x:
            splitcount += 1
            if splitcount == n:
                yield cur
                cur = []
                splitcount = 0
        else:
            cur.append(c)
            splitcount = 0
    yield cur

print list(splitOn(l, 'X', 2))
print list(splitOn(l, 'X', 1))
print list(splitOn(l, 'X', 3))

l += ['X','X']
print list(splitOn(l, 'X', 2))
print list(splitOn(l, 'X', 1))
print list(splitOn(l, 'X', 3))

prints:

[['a', 'b'], ['c', 'd'], ['f', 'g']]
[['a', 'b'], [], ['c', 'd'], [], ['f'], ['g']]
[['a', 'b', 'c', 'd', 'f', 'g']]
[['a', 'b'], ['c', 'd'], ['f', 'g'], []]
[['a', 'b'], [], ['c', 'd'], [], ['f'], ['g'], [], []]
[['a', 'b', 'c', 'd', 'f', 'g']]

EDIT: I'm also a big fan of groupby, here's my go at it:

from itertools import groupby
def splitOn(ll, x, n):
    cur = []
    for isdelim,grp in groupby(ll, key=lambda c:c==x):
        if isdelim:
            nn = sum(1 for c in grp)
            while nn >= n:
                yield cur
                cur = []
                nn -= n
        else:
            cur.extend(grp)
    yield cur

Not too different from my earlier answer, just lets groupby take care of iterating over the input list, creating groups of delimiter-matching and not-delimiter-matching characters. The non-matching characters just get added onto the current element, the matching character groups do the work of breaking up new elements. For long lists, this is probably a bit more efficient, as groupby does all its work in C, and still only iterates over the list once.

0 讨论(0)

查看其它11个回答