Python: count the number of unique elements in a CSV column

Submitted by 柔情痞子 on 2019-12-07 12:39:48

Question


I'm trying to get the counts of unique items in a csv column using Python.

Sample CSV file (has no header):

AB,asd
AB,poi
AB,asd
BG,put
BG,asd

I've tried this so far.

import csv
from collections import defaultdict, Counter

data = defaultdict(list)
# Group the second-column values under their first-column key
with open('Results/1_sample.csv') as input_file:
    csv_reader = csv.reader(input_file, delimiter=',')
    for row in csv_reader:
        data[row[0]].append(row[1])

for k, v in data.items():
    print(k)
    print(Counter(v))

This gives output in this format:

AB
Counter({'asd': 2, 'poi': 1})
BG
Counter({'put': 1, 'asd': 1})

But I want my output to be like:

AB:2
BG:2
total_unique_count:3 #unique count of column[1], irrespective of the data in column[0]

Answer 1:


You're looking for the pandas SeriesGroupBy method nunique:

In [11]: df
Out[11]:
    0    1
0  AB  asd
1  AB  poi
2  AB  asd
3  BG  put
4  BG  asd

In [12]: g = df.groupby(0)

In [13]: g[1].nunique()
Out[13]:
0
AB    2
BG    2
Name: 1, dtype: int64
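
The session above starts from an existing df and does not cover the total_unique_count line the question asks for. A minimal sketch of the full path follows, assuming the file is loaded with read_csv and header=None so the columns are labelled 0 and 1. The in-memory sample buffer and the names per_key and total_unique_count are illustrative, not part of the original answer.

```python
import io
import pandas as pd

# Stand-in for the question's header-less CSV file; with a real file,
# pass its path to read_csv instead of this in-memory buffer.
sample = io.StringIO("AB,asd\nAB,poi\nAB,asd\nBG,put\nBG,asd\n")

# header=None makes pandas label the columns 0 and 1
df = pd.read_csv(sample, header=None)

# Unique values of column 1 within each group of column 0
per_key = df.groupby(0)[1].nunique()
for key, count in per_key.items():
    print("{}:{}".format(key, count))

# Unique values of column 1 across the whole file, ignoring column 0
total_unique_count = df[1].nunique()
print("total_unique_count:{}".format(total_unique_count))
```

On the sample data this prints the per-key counts followed by the overall count, in the key:count layout the question asks for.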



Answer 2:


Use sets:

data = (('AB', 'asd'),
        ('AB', 'poi'),
        ('AB', 'asd'),
        ('BG', 'put'),
        ('BG', 'asd'))
# Drop duplicate (key, value) pairs
unique_items = set(data)
# One key per unique pair, so counting a key counts its unique values
keys = [entry[0] for entry in unique_items]
for key in set(keys):
    print("Key '{}' appears {} unique times".format(key, keys.count(key)))

Key 'BG' appears 2 unique times
Key 'AB' appears 2 unique times
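
The same set-based idea can be applied while reading the CSV directly, using only the standard library. This is a rough sketch, assuming a two-column file with no header as in the question. The in-memory sample buffer and the names values_by_key and all_values are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Stand-in for the question's header-less CSV file; with a real file,
# open it and pass the file object to csv.reader instead.
sample = io.StringIO("AB,asd\nAB,poi\nAB,asd\nBG,put\nBG,asd\n")

values_by_key = defaultdict(set)  # column 0 value -> set of column 1 values
all_values = set()                # every distinct column 1 value

for row in csv.reader(sample):
    if len(row) < 2:
        continue  # skip blank or malformed lines
    key, value = row[0], row[1]
    values_by_key[key].add(value)
    all_values.add(value)

# Sorting the keys keeps the output order stable
for key in sorted(values_by_key):
    print("{}:{}".format(key, len(values_by_key[key])))
print("total_unique_count:{}".format(len(all_values)))
```

Storing a set per key avoids building the full list of values first, and the separate all_values set gives the overall unique count regardless of what is in the first column.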



Source: https://stackoverflow.com/questions/29634417/python-count-number-of-unique-elements-in-csv-column
