Tagging systems need stemming. I use Delicious, and I don't have time to manage and prune my tags. I'm a bit more careful with the tags on my blog, but they aren't perfect either. I also write software for embedded systems that would be much more functional (more helpful to the user) if it included stemming.
For instance:
Parse
Parser
Parsing
should all mean the same thing to whatever system I'm putting them into.
Ideally there's a BSD-licensed stemmer somewhere, but if not, where do I look to learn the common algorithms and techniques for this?
Aside from BSD-licensed stemmers, what other open-source stemmers are out there?
-Adam
Check out NLTK (the Natural Language Toolkit), written in Python. It ships several stemmers, including Porter, Snowball, and Lancaster implementations.
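Another option would be WordNet, along with one of its APIs. Note that WordNet does lemmatization (mapping a word to its dictionary form) rather than rule-based stemming. Some basic information on stemming and lemmatization, including a description of the Porter stemming algorithm, can be found online in Introduction to Information Retrieval (Manning, Raghavan, and Schütze).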
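To give a feel for what suffix-stripping stemmers do, here is a minimal sketch in Python. It is a toy, not the Porter algorithm: the suffix list, the minimum stem length, and the names `toy_stem` and `group_tags` are all assumptions made up for this example. A real stemmer applies many more rules, along with conditions on the shape of the remaining stem.

```python
# Toy suffix-stripping stemmer: a simplified illustration, NOT the Porter algorithm.
# The suffix list and the minimum stem length are assumptions chosen for this example.
SUFFIXES = ("ing", "ers", "er", "es", "ed", "e", "s")
MIN_STEM_LEN = 3


def toy_stem(word):
    """Lowercase the word and strip the first matching suffix,
    provided the remaining stem keeps at least MIN_STEM_LEN characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            return word[: -len(suffix)]
    return word


def group_tags(tags):
    """Map each stem to the list of original tags that reduce to it."""
    groups = {}
    for tag in tags:
        groups.setdefault(toy_stem(tag), []).append(tag)
    return groups


if __name__ == "__main__":
    # The three tags from the question collapse into a single group.
    print(group_tags(["Parse", "Parser", "Parsing", "blog"]))
```

With this rule set, Parse, Parser, and Parsing all reduce to the same stem, so a tagging system could key on the stem and keep the original spellings for display. The rules are crude and will over-stem or under-stem many words, which is exactly the problem the published algorithms address.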
Source: https://stackoverflow.com/questions/595110/stemming-code-examples-or-open-source-projects