Python - Counting Words In A Text File

悲哀的现实 2020-12-10 20:49

I'm new to Python and am working on a program that will count the instances of words in a simple text file. The program and the text file will be read from the command line.

4 Answers
  • 2020-12-10 21:25

    I just noticed a typo: you open the file as f but you close it as file. As tripleee said, you shouldn't close files that you open in a with statement. Also, it's bad practice to use the names of builtin functions, like file or list, for your own identifiers. Sometimes it works, but sometimes it causes nasty bugs. And it's confusing for people who read your code; a syntax highlighting editor can help avoid this little problem.
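
    As a minimal sketch of both points, the helper below (read_words is a hypothetical name, not taken from the question) opens the file in a with statement and binds the handle to f. The handle is closed when the block exits, so no explicit close() is needed:

```python
def read_words(path):
    """Return the whitespace-separated words in the file at path."""
    # Bind the handle to f rather than to a builtin name such as list.
    with open(path) as f:
        words = f.read().split()
    # The with block has already closed the handle at this point.
    assert f.closed
    return words
```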

    To print the data in your count dict in descending order of count you can do something like this:

    # list() makes this work in Python 3, where items() returns a view
    items = list(count.items())
    items.sort(key=lambda kv: kv[1], reverse=True)
    print('\n'.join('%s: %d' % (k, v) for k, v in items))
    

    See the Python Library Reference for more details on the list.sort() method and other handy dict methods.
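
    In Python 3, dict.items() returns a view that has no sort() method, and a lambda can no longer take a tuple parameter. A self-contained sketch of the same idea using sorted(), where format_counts is a hypothetical helper name and the sample data is made up:

```python
def format_counts(count):
    """Return 'word: n' lines, highest count first."""
    # sorted() accepts the items view directly and returns a new list.
    items = sorted(count.items(), key=lambda kv: kv[1], reverse=True)
    return '\n'.join('%s: %d' % (k, v) for k, v in items)

print(format_counts({'the': 3, 'cat': 1, 'sat': 2}))
```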

  • 2020-12-10 21:32

    Your final print doesn't have a loop, so it will just print the count for the last word you read, which still remains as the value of word.

    Also, with a with context manager, you don't need to close() the file handle.

    Finally, as pointed out in a comment, you'll want to remove the final newline from each line before you split.

    For a simple program like this, it's probably not worth the trouble, but you might want to look at defaultdict from the collections module to avoid the special case for initializing a new key in the dictionary.
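
    The points above could be combined roughly as follows. This is only a sketch: count_words is a hypothetical name, and the list of sample lines stands in for an open file object.

```python
from collections import defaultdict

def count_words(lines):
    """Count words across an iterable of lines, such as an open file."""
    count = defaultdict(int)  # missing keys start at 0
    for line in lines:
        # strip() drops the trailing newline before splitting on whitespace
        for word in line.strip().split():
            count[word] += 1
    return count

counts = count_words(["the cat sat\n", "the cat\n"])
# Loop over every entry instead of printing only the last word read.
for word, n in counts.items():
    print(word, n)
```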

  • 2020-12-10 21:37

    What you did looks fine to me. You could also use collections.Counter (assuming you are on Python 2.7 or newer), which gives you a bit more information, such as the count for each word. My solution would look like this; it can probably be improved.

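    import sys
    from collections import Counter

    c = Counter()
    # Read the file named on the command line; with closes it automatically
    with open(sys.argv[1], 'r') as f:
        for line in f:
            # update() expects an iterable of words; a single string
            # would be counted character by character
            c.update(line.split())
    for word in c:
        print(word, c[word])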
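
    Counter also has a most_common() method, which returns (word, count) pairs ordered from the highest count down. A small sketch, where top_words is a hypothetical helper and the sample text is made up:

```python
from collections import Counter

def top_words(text, n):
    """Return the n most frequent words in text as (word, count) pairs."""
    return Counter(text.split()).most_common(n)

print(top_words("the cat the cat the", 2))
```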
    
  • 2020-12-10 21:37

    I did something similar using the re library. This program computes the average number of words per line in a text file, whereas you need to find the number of words per line.

    import re

    # this program gets the average number of words per line
    def main():
        # get name of file
        filename = input('Enter a filename:')
        try:
            # open the file; the with statement closes it automatically
            with open(filename, 'r') as infile:
                # read file contents
                contents = infile.read()
        except IOError:
            print('An error occurred when trying to read')
            print('the file', filename)
            return

        # count lines (including a last line with no trailing newline) and words
        lines = len(contents.splitlines())
        count = len(re.findall(r'\w+', contents))
        # avoid dividing by zero when the file is empty
        average = count // lines if lines else 0

        # display file contents
        print(contents)
        print('there is an average of', average, 'words per line')

    # call main
    main()
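
    To get the number of words on each line rather than the average, the same regular expression can be applied line by line. A minimal sketch, with words_per_line as a hypothetical helper and a made-up sample string in place of the file contents:

```python
import re

def words_per_line(text):
    """Return the number of words on each line of text."""
    return [len(re.findall(r'\w+', line)) for line in text.splitlines()]

counts = words_per_line("one two three\nfour five\n")
print(counts)
# The average follows from the per-line counts.
print(sum(counts) / len(counts))
```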
    