Parser to parse search terms and extract valuable information [closed]

Posted by 戏子无情 on 2019-12-04 23:17:00

The problem you describe is called information extraction. A host of algorithms exist: the simplest is regexp matching, and the most capable is structured machine learning. Try regexps first, and look at something like NLTK if you know Python.

Distinguishing "staples in NY" from "cat in hat" is possible if your program knows that "NY" is a location. It can tell either from the capitalization or because "NY" appears in a list of known place names, called a gazetteer.
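As a rough illustration of the regexp-plus-gazetteer idea, here is a minimal Python sketch. The gazetteer contents, the pattern, and the function name parse_query are all made up for this example; a real gazetteer would be far larger and the matching more forgiving.

```python
import re

# Hypothetical gazetteer: a tiny hand-made set of lower-cased location names.
GAZETTEER = {"ny", "new york", "london", "paris"}

# Matches "<what> in <where>", e.g. "staples in NY".
PATTERN = re.compile(r"^(?P<what>.+?)\s+in\s+(?P<where>.+)$", re.IGNORECASE)


def parse_query(query):
    """Split a query into (what, location).

    The text after "in" is treated as a location only if it appears in the
    gazetteer; otherwise the whole query is returned with location None.
    """
    query = query.strip()
    match = PATTERN.match(query)
    if match and match.group("where").lower() in GAZETTEER:
        return match.group("what"), match.group("where")
    return query, None


print(parse_query("staples in NY"))
print(parse_query("cat in hat"))
```

With this toy gazetteer, "staples in NY" is split into a search term and a location, while "cat in hat" is left whole because "hat" is not a known place.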

The problem in general is AI-complete, so expect to put in lots of hard work if you want good results.

You can write such linguistic rules with grammar-based tools such as GATE or graph-expression (http://code.google.com/p/graph-expression/). Example: Token+ in (LocationLookup).

I'm not certain, but in my experience with parsing there are two approaches:

  1. Define a grammar that can parse the expression and collect values / parameters. You might want to build a dictionary of keywords from which you can then deduce the type of search.

  2. Be strict when defining your grammar so that the expression itself tells you the type of search, e.g. LOC: A in B, VALUE $ to Euro, etc.

For a parser generator, see ANTLR or jcup & jflex.
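To make the second approach concrete, here is a small Python sketch of a strict, keyword-prefixed query syntax. It uses plain regexps rather than ANTLR or jflex, and the exact forms ("LOC: <what> in <where>", "VALUE: <amount> <source> to <target>"), the RULES table, and the function name parse_strict are assumptions chosen for illustration, loosely based on the examples above.

```python
import re

# Hypothetical rule table: each search type has a keyword prefix and a pattern.
RULES = {
    "LOC": re.compile(r"^LOC:\s*(?P<what>.+?)\s+in\s+(?P<where>.+)$"),
    "VALUE": re.compile(
        r"^VALUE:\s*(?P<amount>\d+(?:\.\d+)?)\s+(?P<source>\S+)"
        r"\s+to\s+(?P<target>\S+)$"
    ),
}


def parse_strict(query):
    """Return (search_type, fields) for a query that matches one of RULES.

    Raises ValueError when no rule matches, since a strict grammar rejects
    anything it does not recognise instead of guessing.
    """
    query = query.strip()
    for kind, pattern in RULES.items():
        match = pattern.match(query)
        if match:
            return kind, match.groupdict()
    raise ValueError(f"unrecognised query: {query!r}")


print(parse_strict("LOC: staples in NY"))
print(parse_strict("VALUE: 10 USD to EUR"))
```

Because the prefix names the search type, no gazetteer or guessing is needed; the cost is that users have to learn the syntax.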
