nosql | 易学教程

NoSQL for filesystem storage organization and replication?


Question: We've been discussing the design of a data warehouse strategy within our group to meet testing, reproducibility, and data syncing requirements. One of the suggested ideas is to adopt a NoSQL approach using an existing tool rather than try to re-implement much of the same functionality on a file system. I don't know if a NoSQL approach is even the best way to accomplish what we're trying to do, but perhaps if I describe what we need and want, you can help. Most of our files are large, 50+ GB in size

Transaction-like update of two documents using CouchDB


Question: As a newcomer to CouchDB and NoSQL in general, I can't find a good way of updating two documents with a guarantee that either both are updated or neither is. In my use case there is a boolean flag in each document. To illustrate, let's assume I'm talking about documents of type="citizen" with a boolean attribute isKing. I want to ensure there is exactly one king at a time. It gets tricky when I want to change the king. This requires modification of two documents (to set isKing=true for the new
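One possible way around the two-document update, sketched here as an assumption rather than as the answer given in the thread: CouchDB writes a single document atomically and rejects a write that carries a stale `_rev`, so the identity of the king could live in one document instead of a flag on every citizen. The `TinyStore` class below is a hypothetical in-memory stand-in (not a CouchDB client), the `kingdom` document and its `king_id` field are made up for illustration, and revisions are plain integers rather than CouchDB's real revision strings.

```python
class ConflictError(Exception):
    """Raised when a write carries a stale revision (CouchDB answers 409 Conflict)."""


class TinyStore:
    """Hypothetical in-memory stand-in for a store with per-document revisions."""

    def __init__(self):
        self._docs = {}

    def get(self, doc_id):
        # Return a copy so callers cannot mutate stored state directly.
        return dict(self._docs[doc_id])

    def put(self, doc):
        current = self._docs.get(doc["_id"])
        # An existing document may only be overwritten with its latest revision.
        if current is not None and current["_rev"] != doc.get("_rev"):
            raise ConflictError(doc["_id"])
        new_rev = (current["_rev"] if current else 0) + 1
        self._docs[doc["_id"]] = {**doc, "_rev": new_rev}
        return new_rev


def crown(store, citizen_id):
    """Change the king with one single-document write."""
    kingdom = store.get("kingdom")
    kingdom["king_id"] = citizen_id
    store.put(kingdom)  # raises ConflictError if someone else crowned first


store = TinyStore()
store.put({"_id": "kingdom", "king_id": "arthur"})
crown(store, "lancelot")
```

Because there is only one document to write, there is never a moment with zero or two kings; a concurrent writer holding an old revision gets a conflict and has to re-read before retrying.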

SQL to Key Value


Question: I'd like to move from the SQL approach to the key-value approach, because I deal with "big data" and would like to benefit from systems like DynamoDB, Riak or Cassandra. It's quite easy when the data is unrelated, since one can then use a document-based approach (a primary key + data, but no relations). I'd appreciate any theoretical or academic input on how to model my data. Answer 1: I've been using NoSQL for the last 4 years and this is just what I think, what I learnt ... my personal golden rules.

Convert any Elasticsearch response to simple field value format


Question: In Elasticsearch, when running a simple query like: GET miindex-*/mytype/_search { "query": { "query_string": { "analyze_wildcard": true, "query": "*" } } } it returns a response like: { "took": 1, "timed_out": false, "_shards": { "total": 1, "successful": 1, "failed": 0 }, "hits": { "total": 28, "max_score": 1, "hits": [ ... So I read response.hits.hits to get the actual records. However, if you run another type of query, e.g. an aggregation, the response is totally different, like: {
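A minimal sketch of one way to flatten both response shapes into a single list of records, assuming the response has already been parsed into a Python dict. The function name `extract_records` and the output keys (`aggregation`, `key`, `doc_count`, `value`) are hypothetical; only top-level bucket aggregations with a list of buckets and single-value metric aggregations are handled, so nested sub-aggregations would need more work.

```python
def extract_records(response):
    """Flatten a parsed Elasticsearch response into a list of plain dicts.

    Search hits contribute their `_source`; bucket aggregations contribute one
    record per bucket; single-value metric aggregations contribute one record.
    """
    records = []

    # Plain searches: the documents live under hits.hits[*]._source.
    for hit in response.get("hits", {}).get("hits", []):
        records.append(dict(hit.get("_source", {})))

    # Aggregations: each named aggregation has its own sub-structure.
    for name, agg in response.get("aggregations", {}).items():
        buckets = agg.get("buckets")
        if isinstance(buckets, list):
            for bucket in buckets:
                records.append({
                    "aggregation": name,
                    "key": bucket.get("key"),
                    "doc_count": bucket.get("doc_count"),
                })
        elif "value" in agg:
            records.append({"aggregation": name, "value": agg["value"]})

    return records
```

Calling `extract_records(response)` then gives the caller one shape to iterate over, whether the query was a plain search or an aggregation.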

Google App Engine Datastore / NoSQL example with ancestor queries


Question: I'm very used to SQL, and not to the NoSQL paradigm of the App Engine Datastore, so I had to write a piece of example code to understand how to do ancestor queries correctly. I thought I'd share it with you here; maybe it's interesting to someone.

Answer 1:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import webapp2
from google.appengine.ext.webapp.util import run_wsgi_app
import logging
from google.appengine.ext import db

# MODELS
class Child_model(db.Model):
    name = db.StringProperty()

class Parent_model

Azure Storage Table design with multiple query points


Question: I have the following Azure Storage Table, PositionData. PartitionKey: ClientID + VehicleID. RowKey: GUID. Properties: ClientID, VehicleID, DriverID, Date, GPSPosition. Each vehicle will log up to 1,000,000 entities per year per client. Each client could have thousands of vehicles. So I decided to partition by ClientID + VehicleID in order to have small, manageable partitions. When querying by ClientID and VehicleID, the operation performs quickly because we are narrowing the search down to one

Database schema for a dynamic formbuilder


Question: I know that there is already an answer to a similar question, but I don't think that answer is strong enough, so I'll ask with my own specific issues. Assumption: a dynamic form builder; users can create forms whose structure is not known in advance. Solution: on form submission, data will be stored in a two-table structure: a FormSubmissionHeader table that stores some basic data about the submission (formid, userid, datetime, etc.) and FormSubmissionFieldsData(FormSubmissionHeaderID, FIELDID, FIELDVALUE). My
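The two-table layout described in the question could be sketched as follows, using Python's built-in `sqlite3` so it runs without a server. The table and column names come from the question; the column types, the `id` primary key, the composite key on the fields table, and the helper names `save_submission` and `load_submission` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FormSubmissionHeader (
    id       INTEGER PRIMARY KEY,
    formid   INTEGER NOT NULL,
    userid   INTEGER NOT NULL,
    datetime TEXT    NOT NULL
);
CREATE TABLE FormSubmissionFieldsData (
    FormSubmissionHeaderID INTEGER NOT NULL REFERENCES FormSubmissionHeader(id),
    FIELDID                INTEGER NOT NULL,
    FIELDVALUE             TEXT,
    PRIMARY KEY (FormSubmissionHeaderID, FIELDID)
);
""")


def save_submission(conn, formid, userid, submitted_at, fields):
    """Store one header row plus one row per submitted field, in one transaction."""
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "INSERT INTO FormSubmissionHeader (formid, userid, datetime) VALUES (?, ?, ?)",
            (formid, userid, submitted_at),
        )
        header_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO FormSubmissionFieldsData VALUES (?, ?, ?)",
            [(header_id, field_id, value) for field_id, value in fields.items()],
        )
    return header_id


def load_submission(conn, header_id):
    """Return the submitted values of one submission as {FIELDID: FIELDVALUE}."""
    rows = conn.execute(
        "SELECT FIELDID, FIELDVALUE FROM FormSubmissionFieldsData "
        "WHERE FormSubmissionHeaderID = ? ORDER BY FIELDID",
        (header_id,),
    )
    return dict(rows.fetchall())


example_id = save_submission(conn, 7, 3, "2020-01-01T00:00:00", {1: "Alice", 2: "42"})
```

Every value ends up as text in FIELDVALUE, which keeps the schema fixed no matter how a form is defined but pushes type handling into the application.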

MongoDB : where is the limit between “few” and “many”?


Question: I am coming from the relational database world (Rails / PostgreSQL) and transitioning to the NoSQL world (Meteor / MongoDB), so I am learning about denormalization, embedding and true links. It seems that, in many cases, choosing between various database schemas comes down to the number of documents that will be "related" to each other. In this video series, the author distinguishes: one-to-many relationships from one-to-few relationships; many-to-many relationships from few-to-few

Is there any multicore exploiting NoSQL system?


Question: I have been playing with MongoDB since yesterday and absolutely love it. I am trying to import lots of data (2 billion rows) and index it, but it doesn't seem to be using the 8 cores that my system has, and the import is going at normal rates (60,000 records/sec). I can only imagine how long it might take to index two columns in this collection. Are there any MongoDB-type databases that exploit the multicore nature of CPUs? Answer 1: If MongoDB has an Achilles' heel, it's the fact that it only supports single

Atomic counters in Couchbase


Question: I wanted to know if Couchbase supports consistent incremental counters. From what I've read in this doc, it does not; it just encapsulates a read/write operation so you won't need to do it yourself. Of course this doesn't work for me, because the data might change after you read it from the database. Answer 1: Couchbase absolutely does. Just like memcached and Membase Server, it supports the incr/decr operations atomically within a cluster. cb.set("mykey", 1) x = cb.incr("mykey") puts
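The difference between a client-side read/write pair and an atomic increment can be illustrated without Couchbase at all. The sketch below is a purely local model in Python: `CounterStore` is a hypothetical in-memory class, not the Couchbase SDK, and its lock merely plays the role of the server applying each increment as one indivisible step.

```python
import threading


class CounterStore:
    """Hypothetical in-memory key-value store with an atomic incr operation."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data[key]

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def incr(self, key, delta=1):
        # Read, add and write happen under one lock, so no update can be lost.
        with self._lock:
            self._data[key] += delta
            return self._data[key]


store = CounterStore()

# Client-side read/write: two clients read the same value, then both write.
store.set("mykey", 1)
seen_by_a = store.get("mykey")
seen_by_b = store.get("mykey")
store.set("mykey", seen_by_a + 1)
store.set("mykey", seen_by_b + 1)
lost_update_result = store.get("mykey")  # 2: one of the two increments is lost

# Atomic increment: eight concurrent callers, every increment is counted.
store.set("mykey", 1)
threads = [threading.Thread(target=store.incr, args=("mykey",)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
atomic_result = store.get("mykey")  # 9: the initial 1 plus eight increments
```

The first half reproduces the concern raised in the question (the value changes between the read and the write); the second half shows the behaviour the answer attributes to incr/decr.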