How to structure data in Riak?

限于喜欢 提交于 2019-11-27 14:49:32

问题


I'm trying to figure out how to model data in Riak. Let's say you are building something like a CMS with two features, news and products. You need to be able to store this information for multiple clients X and Y. How would you typically structure this?

  1. One bucket per client and then two keys news and products. Store multiple objects under each key and then use map/reduce to order them.

  2. Store both the news and the products in the same bucket, but with a new autogenerated key for each news item and product item. That is, one bucket for X and one for Y.

  3. One bucket per client/feature combination, that is, the buckets would be X-news, X-products, Y-news and Y-products. Then use map/reduce on the whole bucket to return the results in order.

Which would be the best way to handle this problem?


回答1:


I'd create 2 buckets: news and products. Then I'd prefix keys in each bucket with client names. I'd probably also include dates in news keys for easy date ranging.

news/acme_2011-02-23_01
news/acme_2011-02-23_02
news/bigcorp_2011-02-21_01

And optionally prefix product names with category names

products/acme_blacksmithing_anvil
products/bigcorp_databases_oracle

Then in your map/reduce you could use key filtering:

// BigCorp News items
{
  "inputs":{
     "bucket":"news",
     "key_filters":[["starts_with", "bigcorp"]]
  }
  // ... rest of mapreduce job
}

// Acme Blacksmithing items
{
  "inputs":{
     "bucket":"products",
     "key_filters":[["starts_with", "acme_blacksmithing"]]
  }
  // ... rest of mapreduce job
}

// News for all clients from Feb 12th to 19th
{
  "inputs":{
     "bucket":"news",
     "key_filters":[["tokenize", "_", 2],
                    ["between", "2011-02-12", "2011-02-19"]]
  }
  // ... rest of mapreduce job
}



回答2:


An even more efficient approach to this than using key filtering (as per Kev Burns's recommendation) is to use Secondary Indexes or Riak Search, to model this scenario.

Take a look at my answers to Which clustered NoSQL DB for a Message Storing purpose? and Links in Riak: what can they do/not do, compared to graph databases? for a discussion of similar cases.

You have several decisions to make, depending on your use case. In all cases, you would start out with a company bucket, so that each company has a unique key.

1) Whether to store the items of interest in 2 separate buckets (news and products) or in one (something like items_of_interest) depends on your preference and ease of querying. If you're always going to be querying for both news and products for a company in a single query, you might as well store them in a single bucket. But I recommend using 2 separate ones, to keep easier track of them, especially if you'll have something like separate tabs or pages for "Company X - Products" and "Company X - News". And if you need to combine them into a single feed, you would make 2 queries (one for news and one for products), and combine them in the client code (by date or whatever).

2) If a news/product item can have one and only one company that it belongs to, create a secondary index on company_key for each item. That way, you can easily fetch all news or products for a company via a secondary index (2i) query for that company.

3) If there's a many-to-many relationship (if a news/product item can belong to several companies (perhaps the news item is about a joint venture for 2 separate companies)), then I recommend modeling the relationship as a separate Riak object. For example, you could create a mentions bucket, and for each company mentioned in a news story, you would insert a Mention object, with its own unique key, a secondary index for company_key, and the value would contain a type ('news' or 'product') and an item_key (news key or product key). Extracting relationships to separate Riak objects like this allows you to do a lot of interesting things -- tag them arbitrarily using Riak Search, query them for subscription event notifications, etc.



来源:https://stackoverflow.com/questions/5052154/how-to-structure-data-in-riak

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!