Count number of specific elements in between other elements in list

I am reading a data file. Rows start with consecutive numbers (steps), and sometimes in between each row there is a 0.

E.g. (first value of each row):

1, 0, 2, 0, 3, 4, 5, 0, 0, 0, 6, 0

How can I create a list that counts the number of 0s in between each step?

I want a list like this:

finalList = [1,1,0,0,3,1]

which represents the number of 0s each step contains, i.e. step 1 has 1 zero, step 2 has 1 zero, step 3 has 0 zeros, step 4 has 0 zeros, step 5 has 3 zeros and step 6 has 1 zero.

The following code should work if your data file looks exactly as you described (i.e. it contains nothing but increasing step numbers and zeroes).

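cur = 0
res = []
seen_step = False
with open("file.txt") as f:
    for line in f:
        if line.strip() == '0':
            cur += 1
        else:
            # A new step starts: store the zero count of the previous step, if any
            if seen_step:
                res.append(cur)
            seen_step = True
            cur = 0
# Store the zero count of the last step
if seen_step:
    res.append(cur)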

a = [1,0,2,0,3,4,5,0,0,0,6,0]
finalList = []
count = 0
# Skip the first element (the first step), then count zeroes until the next step
for value in a[1:]:
    if value == 0:
        count += 1
    else:
        finalList.append(count)
        count = 0
# Append the zero count of the last step
finalList.append(count)

Possibly overly clever solution using Python's included batteries:

from itertools import chain, groupby

with open("file.txt") as f:
    # Add extra zeroes after non-zero values so we see a group when no padding exists
    extrazeroes = chain.from_iterable((x, 0) if x else (x,) for x in map(int, f))

    # Count elements in group and subtract 1 if not first group to account for padding
    # The filter condition means we drop non-zero values cheaply
    zerocounts = [sum(1 for _ in g) - bool(gnum) for gnum, (k, g) in enumerate(groupby(extrazeroes)) if k == 0]

    # If leading zeroes (before first non-zero line) can't happen, use this INSTEAD of
    # the line above (extrazeroes is a one-shot iterator, so only one of the two can run):
    # zerocounts = [sum(1 for _ in g) - 1 for k, g in groupby(extrazeroes) if k == 0]

Yes, it's a bit complicated (if you didn't care about including zeroes where there was no gap between two non-zero values it would be much simpler), but it's succinct and should be very fast. If you could omit the 0s in your counts, it would simplify to the much cleaner:

with open("file.txt") as f:
    zerocounts = [sum(1 for _ in g) for k, g in groupby(map(int, f)) if k == 0]

For the record, I'd use the latter if it met requirements. The former should probably not go in production code. :-)
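
To make the difference between the two variants concrete, here is a minimal sketch that runs both on an in-memory list instead of a file. The sample values are an assumption (the list used in the other answers), and the names with_gaps and without_gaps are only illustrative.

```python
from itertools import chain, groupby

values = [1, 0, 2, 0, 3, 4, 5, 0, 0, 0, 6, 0]  # assumed sample data, one int per file line

# Padded variant: one extra 0 after every step, so every step yields a zero group
padded = chain.from_iterable((x, 0) if x else (x,) for x in values)
with_gaps = [sum(1 for _ in g) - 1 for k, g in groupby(padded) if k == 0]

# Simple variant: only real runs of zeroes are counted, steps without zeroes vanish
without_gaps = [sum(1 for _ in g) for k, g in groupby(values) if k == 0]

print(with_gaps)     # [1, 1, 0, 0, 3, 1]
print(without_gaps)  # [1, 1, 3, 1]
```

The simple variant loses the information about which step each count belongs to, which is why it only fits if you really don't need the 0 entries.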

Note that depending on your use case, using groupby may be a good idea for your broader problem; in comments, you mention you're storing all the lines in the file (using f = f.readlines()), which implies you'll be accessing them, possibly based on the values stored in zerocounts. If you have some specific need to process each "step" based on the number of following zeroes, an adaptation of the code above might save you the memory overhead of slurping the file by lazily grouping and processing.
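
One possible shape for such an adaptation is sketched below: a generator that yields (step, zero_count) pairs one at a time, so nothing beyond the current group is held in memory. The function name steps_with_zero_counts is hypothetical, and the sketch assumes the input is an iterable of ints and that zeroes before the first step can be ignored.

```python
from itertools import groupby

def steps_with_zero_counts(numbers):
    """Lazily yield (step, zero_count) pairs from an iterable of ints."""
    step = None  # most recent step still waiting for its zero count
    for is_zero, group in groupby(numbers, key=lambda n: n == 0):
        if is_zero:
            count = sum(1 for _ in group)
            if step is not None:  # zeroes before the first step are skipped
                yield step, count
                step = None
        else:
            for n in group:
                if step is not None:  # previous step had no zeroes after it
                    yield step, 0
                step = n
    if step is not None:  # input ended right after a step
        yield step, 0

pairs = list(steps_with_zero_counts([1, 0, 2, 0, 3, 4, 5, 0, 0, 0, 6, 0]))
print(pairs)  # [(1, 1), (2, 1), (3, 0), (4, 0), (5, 3), (6, 1)]
```

With a file you would pass map(int, f) inside the with block and loop over the pairs directly instead of building a list.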

Note: To avoid slurping the whole file into memory, in Python 2, you'd want to add from future_builtins import map so map is a lazy generator function like it is in Py3, rather than loading the whole file and converting all of it to int up front. If you don't want to stomp map, importing and using itertools.imap over map for int conversion accomplishes the same goal.
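
If the same script has to run on both versions, a small shim along these lines is one way to get a lazy map without stomping the built-in. The name lazy_map is made up for this sketch, and the list of strings stands in for an open file object.

```python
try:
    from itertools import imap as lazy_map  # Python 2: lazy counterpart of map
except ImportError:
    lazy_map = map  # Python 3: the built-in map is already lazy

lines = iter(["1\n", "0\n", "2\n"])  # stand-in for an open file object
numbers = lazy_map(int, lines)

first = next(numbers)    # only the first line has been read and converted so far
remaining = list(lines)  # the other lines were not consumed by the lazy map
```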

I came up with this:

finalList = []
count = 0
step = None

for e in [1, 0, 2, 0, 3, 4, 5, 0, 0, 0, 6, 0]:
    if e > 0:
        if step:
            finalList.append(count)
        step = e
        count = 0
    else:
        count += 1
if step:
    finalList.append(count)

Alternative solution

l = [1, 0, 2, 0, 3, 4, 5, 0, 0, 0, 6, 0]
# temp list: copy of l with a sentinel step appended, so the last step is counted too
_l = l + [max(l) + 1]
# _l.index(i) - _l.index(i - 1) - 1 = number of zeroes between step i - 1 and step i
finalList = [_l.index(i) - _l.index(i - 1) - 1 for i in range(2, max(_l) + 1)]
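
The index arithmetic above rescans the list with index() for every step. A variation on the same idea, sketched here under the assumption that every non-zero value is a step, collects the step positions in one pass and takes the differences. The function name zero_counts_by_position is only illustrative.

```python
def zero_counts_by_position(values):
    """Count the zeroes after each step using the distances between step positions."""
    # Positions of all steps (non-zero values), found in a single pass
    positions = [i for i, v in enumerate(values) if v != 0]
    # Sentinel position one past the end, so the last step gets a count as well
    positions.append(len(values))
    return [nxt - cur - 1 for cur, nxt in zip(positions, positions[1:])]

print(zero_counts_by_position([1, 0, 2, 0, 3, 4, 5, 0, 0, 0, 6, 0]))  # [1, 1, 0, 0, 3, 1]
```

Unlike the index() version, this does not require the steps to be consecutive numbers starting at 1.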

Source: https://stackoverflow.com/questions/34401733/count-number-of-specific-elements-in-between-other-elements-in-list
