Question
I am fairly new to Python and could not figure out how to do the following.
I have a list of (word, tag) tuples
a = [('Run', 'Noun'),('Run', 'Verb'),('The', 'Article'),('Run', 'Noun'),('The', 'DT')]
I am trying to find all tags that have been assigned to each word and collect their counts. For example, the word "Run" has been tagged twice as 'Noun' and once as 'Verb'.
To clarify: I would like to create another list of tuples that contains (word, tag, count)
Answer 1:
You can use collections.Counter:
>>> import collections
>>> a = [('Run', 'Noun'),('Run', 'Verb'),('The', 'Article'),('Run', 'Noun'),('The', 'DT')]
>>> counter = collections.Counter(a)
>>> counter
Counter({('Run', 'Noun'): 2, ('Run', 'Verb'): 1, ... })
>>> result = {}
>>> for (word, tag), count in counter.items():
...     result.setdefault(word, []).append({tag: count})
...
>>> print(result)
{'Run': [{'Noun': 2}, {'Verb': 1}], 'The': [{'Article': 1}, {'DT': 1}]}
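The dict above groups counts by word, but the question asks for a list of (word, tag, count) tuples. As a minimal sketch, assuming the same list a, the Counter can be flattened directly. The names pair_counts and triples are illustrative only:

```python
from collections import Counter

a = [('Run', 'Noun'), ('Run', 'Verb'), ('The', 'Article'), ('Run', 'Noun'), ('The', 'DT')]

# Count each distinct (word, tag) pair.
pair_counts = Counter(a)

# Flatten every ((word, tag), count) item into a (word, tag, count) tuple.
triples = [(word, tag, count) for (word, tag), count in pair_counts.items()]

print(triples)
# On Python 3.7+, where Counter keeps insertion order:
# [('Run', 'Noun', 2), ('Run', 'Verb', 1), ('The', 'Article', 1), ('The', 'DT', 1)]
```

If a stable order matters regardless of input order, sorted(triples) would sort by word, then tag.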
Answer 2:
Pretty easy with a defaultdict:
>>> from collections import defaultdict
>>> output = defaultdict(defaultdict(int).copy)
>>> for word, tag in a:
...     output[word][tag] += 1
...
>>> output
defaultdict(<function copy>,
{'Run': defaultdict(int, {'Noun': 2, 'Verb': 1}),
'The': defaultdict(int, {'Article': 1, 'DT': 1})})
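The factory defaultdict(int).copy works because each call returns a fresh, empty copy of the inner defaultdict, but it can be hard to read at a glance. A sketch of the same idea with a lambda factory, assuming the same list a, followed by a conversion to plain dicts (the name plain is illustrative):

```python
from collections import defaultdict

a = [('Run', 'Noun'), ('Run', 'Verb'), ('The', 'Article'), ('Run', 'Noun'), ('The', 'DT')]

# Each new word gets its own inner counter that starts every tag at 0.
output = defaultdict(lambda: defaultdict(int))
for word, tag in a:
    output[word][tag] += 1

# Convert to regular dicts so lookups of unknown words or tags
# raise KeyError instead of silently creating new entries.
plain = {word: dict(tags) for word, tags in output.items()}

print(plain)
```

Converting at the end is optional; it mainly helps when the result is handed to code that should not add keys by accident.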
Source: https://stackoverflow.com/questions/39582639/counting-items-inside-tuples-in-python