Converting plural to singular in a text file with Python

予麋鹿 2021-01-04 23:01

I have txt files that look like this:

word, 23
Words, 2
test, 1
tests, 4

And I want them to look like this:

word, 23
word,          


        
3 answers
  •  萌比男神i
    2021-01-04 23:28

    If you have complex words to singularize, I'd advise against stemming. Use a proper Python package like pattern instead:

    from pattern.text.en import singularize
    
    plurals = ['caresses', 'flies', 'dies', 'mules', 'geese', 'mice', 'bars', 'foos',
               'families', 'dogs', 'child', 'wolves']
    
    singles = [singularize(plural) for plural in plurals]
    print(singles)
    

    returns:

    >>> ['caress', 'fly', 'dy', 'mule', 'goose', 'mouse', 'bar', 'foo', 'family', 'dog', 'child', 'wolf']
    

    It's not perfect, but it's the best I've found. It is roughly 96% accurate according to the docs: http://www.clips.ua.ac.be/pages/pattern-en#pluralization
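
To apply this to the "word, count" lines from the question, something like the sketch below might work. It assumes each line holds one word, a comma, and a count, and that the counts are left as they are (the expected output in the question is cut off, so merging counts is not attempted). The helper names `naive_singularize` and `singularize_lines` are hypothetical; `naive_singularize` is only a stand-in that strips a trailing "s" so the example runs without extra packages, and pattern's `singularize` could be passed in its place.

```python
def naive_singularize(word):
    """Hypothetical stand-in for pattern's singularize: drops one trailing 's'."""
    if len(word) > 1 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def singularize_lines(lines, singularize=naive_singularize):
    """Yield each 'word, count' line with the word lowercased and singularized."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # Keep blank lines untouched.
            yield line
            continue
        # Split only at the first comma; everything after it is kept verbatim.
        word, sep, rest = line.partition(",")
        yield singularize(word.strip().lower()) + sep + rest


sample = ["word, 23", "Words, 2", "test, 1", "tests, 4"]
result = list(singularize_lines(sample))
print("\n".join(result))
```

To use the real library, `singularize_lines(open_file, singularize=singularize)` with `singularize` imported from `pattern.text.en` should behave the same way, assuming pattern is installed for your Python version.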
