NoSQL (MongoDB) vs Lucene (or Solr) as your database

春和景丽 2020-11-28 17:05

With the NoSQL movement growing based on document-based databases, I've looked at MongoDB lately. I have noticed a striking similarity with how to treat items as "Documen

10 answers
  • 2020-11-28 17:29

    Third-party solutions, such as tailing the mongo oplog, are attractive. Some questions remain about whether the two could be tightly integrated from a development/architecture perspective. I don't expect to see a tightly integrated solution for a few reasons (somewhat speculative, subject to clarification, and not up to date with development efforts):

    • mongo is c++, lucene/solr are java
      • maybe lucene could use some mongo libs
      • maybe mongo could rewrite some lucene algorithms, see also:
        • http://clucene.sourceforge.net/
        • http://lucy.apache.org/
    • lucene supports various doc formats
      • mongo is focused on JSON (BSON)
    • lucene uses immutable documents
      • single field updates are an issue, if they are available
    • lucene indexes are immutable with complex merge ops
    • mongo queries are JSON-style documents (written as javascript in the shell)
    • mongo has no text analyzers / tokenizers (AFAIK)
    • mongo doc sizes are limited, that might go against the grain for lucene
    • mongo aggregation ops may have no place in lucene
      • lucene has options to store fields across docs, but that's not the same thing
      • solr somehow provides aggregation/stats and SQL/graph queries
  • 2020-11-28 17:30

    Since no one else mentioned it, let me add that MongoDB is schema-less, whereas Solr enforces a schema. So, if the fields of your documents are likely to change, that's one reason to choose MongoDB over Solr.

  • 2020-11-28 17:31

    We use MongoDB and Solr together and they perform well. You can find my blog post here, where I describe how we use these technologies together. Here's an excerpt:

    [...] However we observe that query performance of Solr decreases when index size increases. We realized that the best solution is to use both Solr and Mongo DB together. Then, we integrate Solr with MongoDB by storing contents into the MongoDB and creating index using Solr for full-text search. We only store the unique id for each document in Solr index and retrieve actual content from MongoDB after searching on Solr. Getting documents from MongoDB is faster than Solr because there is no analyzers, scoring etc. [...]
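
    The pattern in the excerpt (index only the id in the search engine, then fetch the content from the document store) can be sketched roughly as follows. This is an illustration only: `TinySearchIndex` and `TinyDocumentStore` are hypothetical in-memory stand-ins for Solr and MongoDB, not real client APIs, and the `body` field name is an assumption.

```python
from collections import defaultdict


class TinySearchIndex:
    """Hypothetical stand-in for Solr: maps tokens to document ids only."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        # Naive whitespace tokenizer; a real engine would run analyzers here.
        for token in text.lower().split():
            self._postings[token].add(doc_id)

    def search(self, query):
        """Return the ids of documents containing every query token."""
        tokens = query.lower().split()
        if not tokens:
            return []
        ids = set.intersection(*(self._postings.get(t, set()) for t in tokens))
        return sorted(ids)


class TinyDocumentStore:
    """Hypothetical stand-in for MongoDB: primary-key lookups only."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc):
        self._docs[doc["_id"]] = doc

    def find_by_ids(self, ids):
        return [self._docs[i] for i in ids if i in self._docs]


def save(store, index, doc):
    # Full content goes to the store; the index keeps only id + tokens.
    store.insert(doc)
    index.add(doc["_id"], doc["body"])


def search_documents(store, index, query):
    # Step 1: full-text search yields ids. Step 2: fetch content by id.
    return store.find_by_ids(index.search(query))


store, index = TinyDocumentStore(), TinySearchIndex()
save(store, index, {"_id": 1, "title": "Intro", "body": "MongoDB stores documents"})
save(store, index, {"_id": 2, "title": "Search", "body": "Solr indexes documents for search"})
hits = search_documents(store, index, "documents search")
```

    The index never holds the document body, so it stays small, and the second step is a plain lookup by id.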

  • 2020-11-28 17:34

    Also please note that some people have integrated Solr/Lucene into Mongo by having all indexes be stored in Solr and also monitoring oplog operations and cascading relevant updates into Solr.

    With this hybrid approach you can really have the best of both worlds with capabilities such as full text search and fast reads with a reliable datastore that can also have blazing write speed.

    It's a bit technical to set up, but there are lots of oplog tailers that can integrate with Solr. Check out what Rangespan did in this article.

    http://denormalised.com/home/mongodb-pub-sub-using-the-replication-oplog.html
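
    As a rough sketch of what such a tailer does with each entry, assuming the commonly documented oplog fields (`op`, `ns`, `o`, `o2`): the search index here is a plain dict and `fetch_document` is a hypothetical callback that reads the current document from the database. The payload of update entries differs between MongoDB versions, so this sketch re-fetches the whole document instead of interpreting it.

```python
def apply_oplog_entry(index, entry, fetch_document):
    """Mirror one oplog-style entry into `index` (a dict of _id -> document)."""
    op = entry["op"]
    if op == "i":
        # Insert: the full document is carried in "o".
        index[entry["o"]["_id"]] = dict(entry["o"])
    elif op == "u":
        # Update: "o2" identifies the document; re-read and re-index all of it.
        doc_id = entry["o2"]["_id"]
        fresh = fetch_document(doc_id)
        if fresh is not None:
            index[doc_id] = dict(fresh)
    elif op == "d":
        # Delete: "o" carries the _id of the removed document.
        index.pop(entry["o"]["_id"], None)
    # Other ops (for example no-ops and commands) are ignored in this sketch.


# The database is already ahead of the tailer, so it holds the final state.
database = {1: {"_id": 1, "title": "red boot"}}
search_index = {}

entries = [
    {"op": "i", "ns": "shop.items", "o": {"_id": 1, "title": "red shoe"}},
    {"op": "i", "ns": "shop.items", "o": {"_id": 2, "title": "blue hat"}},
    {"op": "u", "ns": "shop.items", "o2": {"_id": 1}, "o": {"$set": {"title": "red boot"}}},
    {"op": "d", "ns": "shop.items", "o": {"_id": 2}},
]

for entry in entries:
    apply_oplog_entry(search_index, entry, database.get)
```

    Re-indexing the whole document on update also fits an engine whose indexed documents are replaced rather than modified field by field.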

  • 2020-11-28 17:34

    MongoDB Atlas will have a lucene-based search engine soon. The big announcement was made at this week's MongoDB World 2019 conference. This is a great way to encourage more usage of their high revenue MongoDB Atlas product.

    I was hoping to see it rolled into the MongoDB Enterprise version 4.2 but there's been no news of bringing it to their on-prem product line.

    More info here: https://www.mongodb.com/atlas/full-text-search

  • From my experience with both, Mongo is great for simple, straightforward usage. The main Mongo disadvantage we've suffered is poor performance on unanticipated queries (you cannot create Mongo indexes for all the possible filter/sort combinations; you simply can't).

    This is where Lucene/Solr prevails big time: with filter query caching in particular, performance is outstanding.
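
    To illustrate why caching per filter helps with arbitrary filter combinations, here is a simplified sketch of the idea. It is not Solr's implementation: the `FilterCache` class and its `scans` counter are hypothetical names. Each individual filter clause is resolved once to a set of document ids, and any later combination of filters is just an intersection of cached sets.

```python
class FilterCache:
    """Caches the matching id set per (field, value) filter clause."""

    def __init__(self, docs):
        self._docs = docs    # dict of doc_id -> document dict
        self._cache = {}     # (field, value) -> frozenset of matching ids
        self.scans = 0       # how many times the collection was scanned

    def _ids_for(self, field, value):
        key = (field, value)
        if key not in self._cache:
            # Cache miss: scan once and remember the result.
            self.scans += 1
            self._cache[key] = frozenset(
                doc_id for doc_id, doc in self._docs.items()
                if doc.get(field) == value
            )
        return self._cache[key]

    def query(self, **filters):
        """Return sorted ids of documents matching all given filters."""
        if not filters:
            return sorted(self._docs)
        sets = [self._ids_for(f, v) for f, v in filters.items()]
        return sorted(frozenset.intersection(*sets))


docs = {
    1: {"color": "red", "size": "M", "brand": "acme"},
    2: {"color": "red", "size": "L", "brand": "acme"},
    3: {"color": "blue", "size": "M", "brand": "zen"},
}
cache = FilterCache(docs)
first = cache.query(color="red", size="M")     # scans twice, one per clause
second = cache.query(size="M", brand="acme")   # reuses the cached size=M set
```

    The second query triggers only one new scan, because the `size` clause was already cached by the first one. No combined index per filter combination is needed.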
