mg4j vs. Apache Lucene

Submitted by ε祈祈猫儿з on 2019-12-23 10:39:35

Question


Can anyone provide a simple comparative analysis of these search engines? What advantages does each framework have?

BTW, I've seen the following basic reasons for choosing mg4j given in several academic papers:

  • combining indices over the same collection
  • multi-index queries

Update:

These slides (from mir2ed.org) contain a more recent overview of open source search engines, including Lucene and mg4j, with benchmarks covering memory and CPU usage, index size, search performance, and search quality.


Answer 1:


Jeff Dalton reviewed many open source search engines including Lucene and mg4j in 2007, and updated the comparison in 2009.

I have not used mg4j. I have used Lucene, though. The number one feature of Lucene IMO is its wide adoption and wonderful community of users/developers/committers. This means that there is a fair chance that somebody worked on a use case similar to yours using Lucene. Current weak points of Lucene are its scoring model and its ability to scale to large collections of text. The Lucene developers are working on these issues.

I believe that the choice of a search library is very dependent on your (academic or industrial) setting, the other parts of your application and your use case.



Source: https://stackoverflow.com/questions/5028314/mg4j-vs-apache-lucene
