I want to take every word from a text file, and count the word frequency in a dictionary.
Example: 'this is the textfile, and it is used to take words and co
The following takes the string, splits it into a list with split(), loops over the list with a for loop, and counts the frequency of each word with the list method count(). Each word, i, and its frequency are appended as a tuple to an initially empty list, ls, which is then converted into key/value pairs with dict().
sentence = 'this is the textfile, and it is used to take words and count'.split()
ls = []
for i in sentence:
    word_count = sentence.count(i)  # list.count() returns how many times i occurs
    ls.append((i, word_count))
dict_ = dict(ls)
print(dict_)
Output: {'and': 2, 'count': 1, 'used': 1, 'this': 1, 'is': 2, 'it': 1, 'to': 1, 'take': 1, 'words': 1, 'the': 1, 'textfile,': 1}
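Note that list.count() rescans the whole list on every call, so the loop above repeats the same work for words that occur more than once. A minimal sketch of one way around that, counting each distinct word once by iterating over set(sentence); the name `freq` is only illustrative and not from the original post:

```python
sentence = 'this is the textfile, and it is used to take words and count'.split()

# Iterate over the distinct words only, so count() runs once per unique word.
freq = {word: sentence.count(word) for word in set(sentence)}
print(freq)
```

This still calls count() once per unique word, so a single-pass counter is usually the better choice for large files.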
sentence = "this is the textfile, and it is used to take words and count"
counter_dict = {}
# split the sentence into words and iterate through every word
for word in sentence.lower().split():
    # add the word to counter_dict, initialized with 0
    if word not in counter_dict:
        counter_dict[word] = 0
    # increase its count by 1
    counter_dict[word] += 1
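The membership check can also be folded into the update with dict.get(), which returns a default when the key is missing. A short sketch of that variant, using the same sentence:

```python
sentence = "this is the textfile, and it is used to take words and count"
counter_dict = {}
for word in sentence.lower().split():
    # get() returns 0 for a word not seen yet, so no explicit check is needed
    counter_dict[word] = counter_dict.get(word, 0) + 1
print(counter_dict)
```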
# Open the text file and count word frequency
with open("Counter.txt", 'r') as File_obj:
    w_list = File_obj.read()
print(w_list.split())
di = dict()
for word in w_list.split():
    if word in di:
        di[word] = di[word] + 1
    else:
        di[word] = 1
# Find the most frequently used word and its count
largest = -1
maxusedword = ''
for k, v in di.items():
    print(k, v)
    if v > largest:
        largest = v
        maxusedword = k
print(maxusedword, largest)
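If only the most used word is needed, the search loop can probably be replaced by max() with a key function. A small sketch on an inline sample string (the sample text is made up for illustration, so no Counter.txt is required):

```python
di = {}
for word in "the cat and the hat and the bat".split():
    di[word] = di.get(word, 0) + 1

# max() over the keys, ranked by their counts
maxusedword = max(di, key=di.get)
largest = di[maxusedword]
print(maxusedword, largest)
```

When several words share the highest count, max() returns the first one it encounters.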
One more function:
def wcount(filename):
    counts = dict()
    with open(filename) as file:
        a = file.read().split()
        # words = [b.rstrip() for b in a]
        for word in a:
            if word in counts:
                counts[word] += 1
            else:
                counts[word] = 1
    return counts
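A quick way to try wcount() without preparing a file by hand is to write a temporary one first. This is only a usage sketch; the temporary file and the names `path` and `result` are assumptions for the demo:

```python
import os
import tempfile

def wcount(filename):
    counts = dict()
    with open(filename) as file:
        for word in file.read().split():
            if word in counts:
                counts[word] += 1
            else:
                counts[word] = 1
    return counts

# Write a small sample file so the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('this is the textfile, and it is used to take words and count')
    path = tmp.name

result = wcount(path)
os.remove(path)  # clean up the sample file
print(result)
```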
from collections import Counter
t = 'this is the textfile, and it is used to take words and count'
dict(Counter(t.split()))
>>> {'and': 2, 'is': 2, 'count': 1, 'used': 1, 'this': 1, 'it': 1, 'to': 1, 'take': 1, 'words': 1, 'the': 1, 'textfile,': 1}
Or, better, remove punctuation before counting:
dict(Counter(t.replace(',', '').replace('.', '').split()))
>>> {'and': 2, 'is': 2, 'count': 1, 'used': 1, 'this': 1, 'it': 1, 'to': 1, 'take': 1, 'words': 1, 'the': 1, 'textfile': 1}
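Chaining replace() only handles the characters you list. A more general sketch, assuming Python 3, deletes everything in string.punctuation with str.translate() before splitting; the trailing period in the sample text and the names `table` and `counts` are added here for illustration:

```python
import string
from collections import Counter

t = 'this is the textfile, and it is used to take words and count.'

# str.maketrans('', '', chars) builds a table that deletes every character in chars.
table = str.maketrans('', '', string.punctuation)
counts = dict(Counter(t.translate(table).split()))
print(counts)
```

Note that this also strips apostrophes and hyphens, which may or may not be what you want for words like "don't".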
You can also use a defaultdict with int as the default factory.
from collections import defaultdict
wordDict = defaultdict(int)
text = 'this is the textfile, and it is used to take words and count'.split(" ")
for word in text:
    wordDict[word] += 1
Explanation: we initialize a default dictionary whose values are of type int. This way the default value for any missing key is 0, so we don't need to check whether a key is already present in the dictionary. We then split the text on spaces into a list of words, iterate through the list, and increment each word's count.
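One thing to watch for: merely reading a missing key from a defaultdict inserts it. A tiny sketch of that behavior (the keys used here are arbitrary examples):

```python
from collections import defaultdict

d = defaultdict(int)
d['and'] += 1            # no KeyError: the key starts at int() == 0
lookup = d['nope']       # reading a missing key returns 0 ...
inserted = 'nope' in d   # ... and also adds it to the dictionary
safe = d.get('other')    # .get() returns None and does not insert the key
print(dict(d), lookup, inserted, safe)
```

If you only want to look a word up without adding it, use .get() or an `in` check.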