How to decide which NoSQL technology to use?

陌清茗 2021-01-30 02:55

What are the pros and cons of MongoDB (document-based), HBase (column-based), and Neo4j (graph-based)?

I'm particularly interested to know some of the typical use cas

6 answers

渐次进展 (OP)

2021-01-30 03:21
MongoDB:

MongoDB is a document database, unlike a relational database. Its documents hold semi-structured data, such as JSON objects, with no fixed schema.
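To make "schema free" concrete, here is a minimal sketch in plain Python. It does not use MongoDB or its driver; the `users` list and the `find` helper are hypothetical stand-ins for a collection and a simple equality query, only meant to show that documents in one collection need not share the same fields.

```python
import json

# Two "documents" in the same hypothetical collection.
# They share "_id" and "name" but otherwise have different fields.
users = [
    {"_id": 1, "name": "Ada", "email": "ada@example.com"},
    {"_id": 2, "name": "Lin", "tags": ["admin"], "address": {"city": "Oslo"}},
]

def find(collection, **criteria):
    """Return documents whose top-level fields equal all given criteria."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]

# Documents map directly to JSON, nested fields included.
as_json = json.dumps(users[1], sort_keys=True)

# Querying works even though not every document has every field.
matches = find(users, name="Lin")
```

Adding a new field later only means writing it on new documents; older documents simply lack it, which is what lets the schema evolve with the application.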

Key features:
1. The schema can change as the application evolves
2. Full indexing
3. Load balancing and data sharding
4. Data replication
5. Consistency and partition tolerance (CP) in CAP terms (Consistency, Availability, Partition tolerance)
When to use:
1. Real-time analytics
2. High-speed logging
3. Semi-structured data management
When not to use:
1. Highly transactional applications that need strong ACID guarantees (Atomicity, Consistency, Isolation, Durability). An RDBMS is preferred in this case.
2. Data sets that depend heavily on relations (foreign keys and so on)
HBase:

HBase is an open-source, non-relational, distributed column-family database.

Key features:
1. Provides a fault-tolerant way to store large quantities of sparse data (small amounts of information caught within a large collection of empty or unimportant data, such as finding the 50 largest items in a group of 2 billion records, or finding the non-zero items that make up less than 0.1% of a huge collection)
2. Supports a variable schema, where each row can have different columns
3. Can serve as the input and output for MapReduce jobs
4. Compression, in-memory operation, and Bloom filters on a per-column basis (a Bloom filter is a data structure designed to tell you, rapidly and memory-efficiently, whether an element is present in a set)
5. Achieves CP in CAP terms
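Since Bloom filters come up in the feature list, here is a minimal sketch of the idea in plain Python. This is not how HBase implements them; the `BloomFilter` class, its sizes, and the SHA-256 based hashing are assumptions chosen for readability. The point is the contract: "no" is always right, "yes" may be a false positive.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        # One byte per "bit" keeps the code simple; a real filter packs bits.
        self.bits = bytearray(size_bits)

    def _positions(self, item):
        # Derive num_hashes positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
bf.add("row-42")
```

A store can use a filter like this to skip reading a file from disk when the filter says the requested key is definitely not in it.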
When to use HBase:
1. When you load, search (by key or key range), serve, and query data by key
2. When you store rows that don’t conform well to a fixed schema (variable schema)
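The "by key or range" access pattern can be sketched in plain Python as follows. This is an illustration, not the HBase client API: `SortedRowStore` and the `user#date` row-key layout are hypothetical. It relies only on the fact that rows are kept sorted by row key, so a range scan is two binary searches plus a slice, and that each row can carry its own set of columns.

```python
from bisect import bisect_left, insort

class SortedRowStore:
    """Toy key-ordered store: point lookups and [start, stop) range scans."""

    def __init__(self):
        self._keys = []   # row keys, kept in sorted order
        self._rows = {}   # row key -> {column name: value}

    def put(self, row_key, columns):
        if row_key not in self._rows:
            insort(self._keys, row_key)
            self._rows[row_key] = {}
        # Each row may have a different set of columns.
        self._rows[row_key].update(columns)

    def get(self, row_key):
        return self._rows.get(row_key)

    def scan(self, start_key, stop_key):
        """Yield (key, row) pairs with start_key <= key < stop_key."""
        lo = bisect_left(self._keys, start_key)
        hi = bisect_left(self._keys, stop_key)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]


store = SortedRowStore()
store.put("user1#2021-01-01", {"clicks": 3})
store.put("user1#2021-01-02", {"clicks": 5, "referrer": "mail"})
store.put("user2#2021-01-01", {"clicks": 1})

# "$" sorts right after "#", so this range covers every "user1#..." key.
user1_keys = [key for key, _row in store.scan("user1#", "user1$")]
```

The design consequence is that the row key decides which queries are cheap: anything expressible as a key or key prefix is fast, anything else degrades toward a full scan.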
When not to use HBase:
1. Relational analytics
2. Full table scans
3. Data that must be aggregated or analyzed by row instead of by column
Neo4j:

Neo4j is a graph database that uses the property graph data model (data is stored as a graph of nodes and relationships, both of which can carry properties).
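A rough sketch of the property graph model in plain Python, without Neo4j or Cypher. The `nodes` and `relationships` data and both helper functions are made up for illustration. It shows nodes and relationships carrying properties, and a friends-of-friends traversal of the kind that recommendation and social network use cases rely on.

```python
from collections import Counter

# Nodes carry properties.
nodes = {
    "alice": {"label": "Person", "age": 34},
    "bob": {"label": "Person", "age": 29},
    "carol": {"label": "Person", "age": 41},
    "dave": {"label": "Person", "age": 38},
}

# Relationships are typed and carry properties too:
# (source, relationship type, target, properties)
relationships = [
    ("alice", "FRIEND", "bob", {"since": 2015}),
    ("alice", "FRIEND", "carol", {"since": 2018}),
    ("bob", "FRIEND", "dave", {"since": 2019}),
    ("carol", "FRIEND", "dave", {"since": 2020}),
]

def neighbors(node, rel_type):
    """Nodes connected to `node` by `rel_type`, ignoring direction."""
    found = set()
    for src, kind, dst, _props in relationships:
        if kind != rel_type:
            continue
        if src == node:
            found.add(dst)
        elif dst == node:
            found.add(src)
    return found

def recommend_friends(node):
    """Rank friends-of-friends by how many mutual friends they share."""
    direct = neighbors(node, "FRIEND")
    counts = Counter()
    for friend in direct:
        for candidate in neighbors(friend, "FRIEND"):
            if candidate != node and candidate not in direct:
                counts[candidate] += 1
    return counts.most_common()


suggestions = recommend_friends("alice")
```

This toy version scans the whole relationship list on every hop; a graph database stores relationships with the nodes they connect, so following them stays cheap as the graph grows.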

Key features:
1. Supports full ACID (Atomicity, Consistency, Isolation, Durability) rules
2. Supports indexes through Apache Lucene
3. Schema-free, bottom-up data model design
4. Scales well thanks to compact storage and in-memory caching for graphs
When to use:
1. Master data management
2. Network and IT Operations
3. Real-time recommendations
4. Fraud detection
5. Social networks (like Facebook)
When not to use:
1. Bulk queries and scans
2. Applications that require partitioning and sharding of data
Have a look at the comparison of various NoSQL technologies in this article.

Sources:

Wiki, SlideShare, Cloudera, Tutorials Point, Neo4j