Is MongoDB a valid alternative to relational db + lucene?

Posted by ぃ、小莉子 on 2019-12-20 08:38:42

Question


On a new project I need to make heavy use of Lucene for a search implementation. The search component will be a very important (and big) piece of the project. Is it valid or practical to replace a relational database + Lucene with MongoDB?

Edit: OK, I will clarify. I'm not asking about risk; I can pay that price on this project. My point is: is MongoDB oriented toward this kind of thing? Can I build a full search engine with the same performance I can get from Lucene? A friend pointed me to MongoDB as an alternative, but I can't tell whether Lucene's performance comes from its document orientation (in which case I would see it in MongoDB too) or whether, on the other hand, the inverted index and optimizations are completely independent of document orientation.


Answer 1:


Technically you can do full text search with MongoDB, but you're missing out on a lot that a full text search provider has to offer. I love MongoDB, but I'd couple it with a full text search provider (such as Lucene or Sphinx) if time to implementation is at all a concern. I think MongoDB's convenient ability to index word arrays is better left to tagging and searching based on tagging than full text search.

Search (information retrieval) isn't just about grabbing any documents that match. If you want your search results to have any relevance at all, you're going to need something along the lines of TF-IDF, phrase matching (words in a sequence score higher), or any number of other IR techniques to improve search precision. If you use MongoDB, you'll need to implement it all from scratch.
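To give a feel for what "from scratch" means, here is a minimal sketch of TF-IDF scoring in plain Python. It is only an illustration under simple assumptions (whitespace tokenization, a smoothed IDF of `log(N / df) + 1`, no phrase matching); the function names and sample documents are made up for this example and are not how Lucene or MongoDB do it.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase + whitespace split is enough for the demo.
    return text.lower().split()

def tf_idf_scores(query, docs):
    """Score each document against the query with a basic TF-IDF sum."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if tf[term]:
                # Rare terms get a larger weight than common ones.
                idf = math.log(n_docs / df[term]) + 1.0
                score += (tf[term] / len(tokens)) * idf
        scores.append(score)
    return scores

docs = [
    "mongodb stores documents",
    "lucene builds an inverted index for full text search",
    "full text search needs ranking",
]
scores = tf_idf_scores("inverted index", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Even this toy version leaves out phrase matching, stemming, and index persistence, which is the kind of work a dedicated search library already covers.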

If you really want to implement it all from scratch but not bother with the raw storage side of things, MongoDB is pretty close to the best DB store that you could implement it on top of (can't think of many others), but that still doesn't make it a great option.




Answer 2:


CouchDB seems to be another possible alternative; it can use Lucene via the couchdb-lucene project.




Answer 3:


MongoDB is a NoSQL database; Lucene and Solr are search engines; and, to add another thing to the comparison, there are caches such as Terracotta along with Ehcache. Each has its own purpose.

If you need search with full-text features such as stemming, relevancy settings (for example, ranking a text match in the product title above a text match in the description), ranking, sound-alike matching, partial word matching, and so on, these are best handled by search-oriented storage systems like Solr and Lucene.
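As a rough illustration of the "title ranks above description" idea, here is a small Python sketch of field-weighted scoring. The weights, field names, and sample products are assumptions invented for this example; Solr and Lucene expose field boosts through their own query syntax and APIs, which this does not reproduce.

```python
# Hypothetical weights: a hit in the title counts three times a hit in the description.
FIELD_WEIGHTS = {"title": 3.0, "description": 1.0}

def field_score(query, product):
    """Sum weighted term occurrences across the configured fields."""
    terms = query.lower().split()
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = product.get(field, "").lower().split()
        score += weight * sum(words.count(t) for t in terms)
    return score

products = [
    {"title": "Garden hose", "description": "Fits any red wagon"},
    {"title": "Red wagon", "description": "A classic toy"},
]
# Highest score first.
ranked = sorted(products, key=lambda p: field_score("wagon", p), reverse=True)
print([p["title"] for p in ranked])
```

A search engine gives you this kind of tuning as configuration, whereas on a general-purpose store you would maintain code like this yourself.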

If your only criterion is faster retrieval and you don't need your presentation data objects to be durable, then simply use a cache like Terracotta.

If you need faster retrieval and also need to collaborate and aggregate data in one data source, and need that aggregated data to be durable, then use a NoSQL store like MongoDB.




Answer 4:


Looks possible, but slower (see here)

  • You will have to do word splitting and stemming yourself.
  • Ranking of queries 'requires user supplied code to do so'
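A minimal sketch of what the do-it-yourself word splitting and stemming might look like in Python. The suffix list is a deliberately naive stand-in (not the Porter stemmer), and the `_keywords` field name is hypothetical; the idea is only that the resulting array is the sort of thing you could store on a document and index as an array field.

```python
import re

# Naive, illustrative suffix stripping; a real stemmer handles far more cases.
SUFFIXES = ("ing", "ed", "es", "s")

def stem(word):
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def keywords(text):
    """Split text into lowercase words and return the sorted unique stems."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return sorted({stem(w) for w in words})

doc = {"title": "Searching indexed documents"}
doc["_keywords"] = keywords(doc["title"])  # hypothetical field name
print(doc["_keywords"])
```

Queries would need to go through the same `keywords` function so that the stems on both sides match, and ranking would still be up to your own code.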



Answer 5:


I'm not familiar with MongoDB, so I can't directly answer the question, but I would like to note that, unlike Lucene (which is about ten years old) and relational databases (which have been around for decades), MongoDB is less than three years old.

At this stage of the game it is likely still maturing. It may be suitable for your needs (and I'm curious to see if anyone familiar with using it will chime in here), but you'll need to factor this into your equation. Are you willing to pay the price to use cutting-edge technology?

Even if it winds up being stable and efficient enough, you may run into issues with limited support in the form of websites/tutorials etc. (due to the small user base). You are also taking the chance that it will be discontinued.

It can be worthwhile to take this chance, but you need to do so with your eyes open and not blinded by the "oh, look at the shiny new toy" effect.




Answer 6:


Another option is to use Elasticsearch (backed by Lucene) with CouchDB: http://www.elasticsearch.org/blog/2010/09/28/the_river_searchable_couchdb.html




Answer 7:


Lucene is an established and stable product. Alas, the same is not yet true of MongoDB. So I would think that Lucene plus an RDBMS is a much less risky option.

Of course, to a certain extent it depends on the nature of the project: just how important is "very important (and big)"? The other thing is, do you have prior experience of MongoDB (I'm guessing not)? If you can get access to people who have some expertise then that would mitigate the risk.




Answer 8:


After attending Devoxx 2011, including a presentation from 10gen, I wrote a short blog post comparing MongoDB to RDBMS databases. MongoDB is one of the popular NoSQL databases. As stated in the earlier replies, MongoDB is a NoSQL database, which is different from the existing mainstream RDBMS databases.

http://blog.iprofs.nl/2011/11/25/is-mongodb-a-good-alternative-to-rdbms-databases-like-oracle-and-mysql




Answer 9:


For full-text search, I used Lucene and Sphinx earlier, but I did not find them that good at fetching the best results for the supplied keyword. So I used MongoLantern, a MongoDB full-text search plugin, which is very good at it. Moreover, in terms of performance, it uses MongoDB as the backend engine, so there are no performance issues at all. I'm waiting for more reviews of MongoLantern's usability in production.

https://sourceforge.net/projects/mongolantern/




Answer 10:


No, it isn't, since MongoDB is not relational.



Source: https://stackoverflow.com/questions/2546494/is-mongodb-a-valid-alternative-to-relational-db-lucene
