Compare 1 column of 2D array and remove duplicates Python

自古美人都是妖i submitted on 2019-12-12 00:57:18

Question


Say I have a 2D array like:

array = [['abc',2,3,],
        ['abc',2,3],
        ['bb',5,5],
        ['bb',4,6],
        ['sa',3,5],
        ['tt',2,1]]

I want to remove every row whose first-column value appears more than once,
i.e. compare array[x][0] across all rows and return only:

removeDups = [['sa',3,5],
        ['tt',2,1]]

I think it should be something like this (set the first column as a tmp variable, compare tmp with the remaining rows, and set array to whatever compare returns):

for x in range(len(array)):
    tmpCol = array[x][0] 
    del array[x] 
    removed = compare(array, tmpCol) 
    array = copy.deepcopy(removed) 

print repr(len(removed))  #testing 

where compare is (compare the first column of each remaining item with tmp; if it matches, remove the item, otherwise return the original array):

def compare(valid, tmpCol):
    for x in range(len(valid)):
        if  valid[x][0] != tmpCol:
            del valid[x]
            return valid
        else:
            return valid

I keep getting an 'index out of range' error. I've tried other ways of doing this, but I would really appreciate some help!


Answer 1:


Similar to the other answers, but using a plain dictionary instead of importing Counter:

counts = {}

for elem in array:
    # add 1 to counts for this string, creating new element at this key
    # with initial value of 0 if needed
    counts[elem[0]] = counts.get(elem[0], 0) + 1

new_array = []
for elem in array:
    # check that there's only 1 instance of this element.
    if counts[elem[0]] == 1:
        new_array.append(elem)
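
The two loops above can be wrapped in a small helper. This is only a sketch: the function name remove_rows_with_duplicate_key and its col parameter are hypothetical names chosen for illustration, and the sample data is the array from the question.

```python
def remove_rows_with_duplicate_key(rows, col=0):
    """Return only the rows whose value in column `col` occurs exactly once.

    `rows` is a list of lists; `col` is a hypothetical parameter that
    selects which column to compare (0 matches the question).
    """
    counts = {}
    for row in rows:
        # first pass: tally how often each key appears
        counts[row[col]] = counts.get(row[col], 0) + 1
    # second pass: keep rows whose key was seen exactly once
    return [row for row in rows if counts[row[col]] == 1]


array = [['abc', 2, 3],
         ['abc', 2, 3],
         ['bb', 5, 5],
         ['bb', 4, 6],
         ['sa', 3, 5],
         ['tt', 2, 1]]

result = remove_rows_with_duplicate_key(array)
print(result)  # [['sa', 3, 5], ['tt', 2, 1]]
```

Because a new list is built rather than deleting from array in place, no index ever goes stale while the list is being scanned.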



Answer 2:


One option is to build a counter for the first column of your array beforehand and then filter the list on the count, i.e., keep a row only if its first element appears exactly once:

from collections import Counter

count = Counter(a[0] for a in array)
[a for a in array if count[a[0]] == 1]
# [['sa', 3, 5], ['tt', 2, 1]]
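
To make the filter condition concrete, here is a short worked run on the question's data that prints the intermediate counts before filtering. The variable name unique_rows is only illustrative, and the printed key order assumes Python 3.7+, where dictionaries preserve insertion order.

```python
from collections import Counter

array = [['abc', 2, 3],
         ['abc', 2, 3],
         ['bb', 5, 5],
         ['bb', 4, 6],
         ['sa', 3, 5],
         ['tt', 2, 1]]

# tally the first column only
count = Counter(row[0] for row in array)
print(dict(count))  # {'abc': 2, 'bb': 2, 'sa': 1, 'tt': 1}

# keep rows whose first-column value was counted exactly once
unique_rows = [row for row in array if count[row[0]] == 1]
print(unique_rows)  # [['sa', 3, 5], ['tt', 2, 1]]
```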



Answer 3:


You can use a dictionary and count the occurrences of each key. You can also use Counter from the collections module, which does exactly this.

Do as follows:

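from collections import Counter

# count each first-column value once, up front
counts = Counter(k for k, _, _ in array)

removed = []
for k, val1, val2 in array:
    # keep the row only if its key occurs exactly once
    if counts[k] == 1:
        removed.append([k, val1, val2])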
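
As for the 'index out of range' error in the question, a likely cause is deleting from the list while looping over range(len(array)): the range is computed once at the start, but the list gets shorter on every pass. The snippet below is a minimal reproduction under that assumption, using a shortened copy of the data; rows and error_message are illustrative names.

```python
rows = [['abc', 2, 3], ['abc', 2, 3], ['bb', 5, 5]]

error_message = None
try:
    for x in range(len(rows)):  # range(3) is fixed before the loop starts
        del rows[x]             # but each deletion shortens the list
except IndexError as exc:
    # reached once x points past the end of the shrunken list
    error_message = str(exc)

print(error_message)
print(rows)  # one row is left over: deletion also skipped an element
```

Building a new list, as all three answers do, avoids the problem entirely.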


Source: https://stackoverflow.com/questions/41791521/compare-1-column-of-2d-array-and-remove-duplicates-python
