Efficiently count word frequencies in Python

走了就别回头了 2020-11-29 04:33

I'd like to count frequencies of all words in a text file.

>>> countInFile('test.txt')

should return {'aaa':1, 'bbb':

8 answers
  •  慢半拍i (OP)
    2020-11-29 04:59

    Skip CountVectorizer and scikit-learn.

    The file may be too large to load into memory, but I doubt the Python dictionary of counts will get too large. The easiest option may be to split the large file into 10-20 smaller files and extend your code to loop over them.
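
    A minimal sketch of that loop, assuming the pieces are plain UTF-8 text files and that a "word" is any whitespace-separated token. The name count_in_files is a hypothetical helper, not something from the question. It reads one line at a time, so only the running counts are held in memory.

```python
from collections import Counter


def count_in_files(paths):
    """Accumulate word counts across several files (hypothetical helper).

    Assumes UTF-8 text and whitespace-separated words; no case folding
    or punctuation stripping is done.
    """
    counts = Counter()
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            # Iterating over the file object yields one line at a time,
            # so a whole file is never loaded into memory at once.
            for line in handle:
                # split() with no arguments splits on any run of whitespace.
                counts.update(line.split())
    return counts
```

    Because each file is read line by line, the same function can probably be pointed at the original large file as a one-element list, e.g. count_in_files(['test.txt']), before going to the trouble of splitting it.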
