Find indexes of repeated elements in an array (Python, NumPy)

前端未结

关注

 5  1227

Assume, I have a NumPy-array of integers, as:

[34,2,3,22,22,22,22,22,22,18,90,5,-55,-19,22,6,6,6,6,6,6,6,6,23,53,1,5,-42,82]

I want to find

相关标签:

5条回答

南笙

2020-12-20 15:04

Using np.diff and the method given here by @WarrenWeckesser for finding runs of zeros in an array:

import numpy as np

def zero_runs(a):  # from link
    iszero = np.concatenate(([0], np.equal(a, 0).view(np.int8), [0]))
    absdiff = np.abs(np.diff(iszero))
    ranges = np.where(absdiff == 1)[0].reshape(-1, 2)
    return ranges

a = [34,2,3,22,22,22,22,22,22,18,90,5,-55,-19,22,6,6,6,6,6,6,6,6,23,53,1,5,-42,82]

zero_runs(np.diff(a))
Out[87]: 
array([[ 3,  8],
       [15, 22]], dtype=int32)

This can then be filtered on the difference between the start & end of the run:

runs = zero_runs(np.diff(a))

runs[runs[:, 1]-runs[:, 0]>5]  # runs of 7 or more, to illustrate filter
Out[96]: array([[15, 22]], dtype=int32)

0 讨论(0)

无人共我

2020-12-20 15:09

Here is a relatively quick, errorless solution which also tells you how many copies were in the run. Some of this code was borrowed from KAL's solution.

# Return the start and (1-past-the-end) indices of the first instance of
# at least min_count copies of element value in container l 
def find_repeat(value, min_count, l):
  look_for = [value for _ in range(min_count)]
  for i in range(len(l)):
    count = 0
    while l[i + count] == value:
      count += 1
    if count >= min_count:
      return i, i + count

0 讨论(0)

耶瑟儿～

2020-12-20 15:18

If you're looking for value repeated n times in list L, you could do something like this:

def find_repeat(value, n, L):
    look_for = [value for _ in range(n)]
    for i in range(len(L)):
        if L[i] == value and L[i:i+n] == look_for:
            return i, i+n

0 讨论(0)

伪装坚强ぢ

2020-12-20 15:23
There really isn't a great short-cut for this. You can do something like:
```
mult = 5
for elem in val_list:
    target = [elem] * mult
    found_at = val_list.index(target)
```
I leave the not-found exceptions and longer sequence detection to you.
0 讨论(0)
发布评论:

提交评论
- 加载中...

小蘑菇

2020-12-20 15:25

Here is a solution using Python's native itertools.

Code

import itertools as it


def find_ranges(lst, n=2):
    """Return ranges for `n` or more repeated values."""
    groups = ((k, tuple(g)) for k, g in it.groupby(enumerate(lst), lambda x: x[-1]))
    repeated = (idx_g for k, idx_g in groups if len(idx_g) >=n)
    return ((sub[0][0], sub[-1][0]) for sub in repeated)

lst = [34,2,3,22,22,22,22,22,22,18,90,5,-55,-19,22,6,6,6,6,6,6,6,6,23,53,1,5,-42,82]    
list(find_ranges(lst, 5))
# [(3, 8), (15, 22)]

Tests

import nose.tools as nt


def test_ranges(f):
    """Verify list results identifying ranges."""
    nt.eq_(list(f([])), [])
    nt.eq_(list(f([0, 1,1,1,1,1,1, 2], 5)), [(1, 6)])
    nt.eq_(list(f([1,1,1,1,1,1, 2,2, 1, 3, 1,1,1,1,1,1], 5)), [(0, 5), (10, 15)])
    nt.eq_(list(f([1,1, 2, 1,1,1,1, 2, 1,1,1], 3)), [(3, 6), (8, 10)])    
    nt.eq_(list(f([1,1,1,1, 2, 1,1,1, 2, 1,1,1,1], 3)), [(0, 3), (5, 7), (9, 12)])

test_ranges(find_ranges)

This example captures (index, element) pairs in lst, and then groups them by element. Only repeated pairs are retained. Finally, first and last pairs are sliced, yielding (start, end) indices from each repeated group.

See also this post for finding ranges of indices using itertools.groupby.

0 讨论(0)