Chained map/reduce in CouchDB

回眸只為那壹抹淺笑 submitted on 2020-01-01 11:57:19

Question


In CouchDB, I have a set of items like the following (simplified for example's sake):

{_id: 1, date: "Jul 1", user: "user1"}
{_id: 2, date: "Jul 2", user: "user1"}
{_id: 3, date: "Jul 3", user: "user2"}
...etc...

I'd like to get a list of "most recent activity", sorted by date, with no duplicate user _ids. I can create a view with results like so:

{key: "July 3", _id: 3, user: "user2"}
{key: "July 2", _id: 2, user: "user1"}
{key: "July 1", _id: 1, user: "user1"}

but this contains duplicate entries for the same user. Or I can create a view that maps {key: user, value: date} and reduces to

{key: "user1", mostRecentDate: "July 2"}
{key: "user2", mostRecentDate: "July 3"}

but that isn't sorted by "most recent".

I know that the obvious solution, reducing over the results of another view, isn't supported. BigCouch supports chained map/reduce, but appears to be rather out of date / unsupported (last release 2012).

This seems like a rather common problem - what are some existing solutions (beyond "switch databases")?


Answer 1:


Here is a general idea of how you can do chained map/reduce with CouchDB 1.x. What we want is the ability to pass the results of one map/reduce to another.

  1. Subscribe to the _changes feed filtered by the view. This will give you a list of docs that will actually be emitted by the map function.

  2. Next, query the view for these filtered docs. This is simple, because a view query accepts a list of keys: pass the keys and you get back just the desired subset of the view.

  3. Next, push this result into either a separate database or the same one. Bulk inserts make this faster. If you use a separate database you can even reuse the _ids from the view results, which makes the bulk updates a lot easier.

  4. Within this database we define another view that sorts our results based on value.

    {key: "user1", mostRecentDate: "July 2"}
    {key: "user2", mostRecentDate: "July 3"}

Since you have already gotten to this step, all you need to do is create a view on mostRecentDate in the second database, and you will get user activity sorted by date.
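To make the data flow concrete, here is a rough sketch of the two stages in plain Python, with no CouchDB server involved. It only emulates what the views would return. The function names are made up for illustration, the first stage uses a "latest date" reduce, and the ISO dates (including the year) are an assumption so that string order matches chronological order.

```python
from itertools import groupby
from operator import itemgetter

# Sample documents modelled on the question. The ISO date format and the
# year are assumptions, chosen so that string sorting is chronological.
activity = [
    {"_id": 1, "date": "2014-07-01", "user": "user1"},
    {"_id": 2, "date": "2014-07-02", "user": "user1"},
    {"_id": 3, "date": "2014-07-03", "user": "user2"},
]


def stage_one(docs):
    """Emulate a view that maps {key: user, value: date} and, queried with
    group=true, reduces each user to their latest date."""
    emitted = sorted((d["user"], d["date"]) for d in docs)
    return [
        {"key": user, "value": max(date for _, date in group)}
        for user, group in groupby(emitted, key=itemgetter(0))
    ]


def to_second_db_docs(rows):
    """Step 3: one document per user, reusing the view key as the _id."""
    return [{"_id": r["key"], "mostRecentDate": r["value"]} for r in rows]


def stage_two(docs, descending=True):
    """Step 4: emulate a view keyed on mostRecentDate, read newest first."""
    emitted = [(d["mostRecentDate"], d["_id"]) for d in docs]
    return sorted(emitted, reverse=descending)


recent = stage_two(to_second_db_docs(stage_one(activity)))
print(recent)
```

Because the second database holds exactly one document per user, sorting it by date cannot produce duplicate users.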

I hope you are using a dummy reduce, though: one that returns null and is only used for group=true.
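One way to read the "dummy reduce" remark: with group=true the reduce is called once per distinct key, so a reduce that returns null simply collapses the view to one row per key. Here is a small Python emulation of that behaviour. It is a simplification (a real CouchDB reduce receives [key, docid] pairs, and `query_grouped` is a hypothetical helper, not a CouchDB API).

```python
from itertools import groupby


def dummy_reduce(keys, values, rereduce=False):
    """Ignore the input entirely, like a JavaScript reduce that returns null."""
    return None


def query_grouped(emitted_rows, reduce_fn):
    """Emulate ?group=true: call the reduce once for each distinct key."""
    rows = sorted(emitted_rows, key=lambda kv: kv[0])
    result = []
    for key, group in groupby(rows, key=lambda kv: kv[0]):
        values = [value for _, value in group]
        result.append({"key": key, "value": reduce_fn([key] * len(values), values)})
    return result


# Rows as the map {key: user, value: date} would emit them.
emitted = [("user1", "Jul 1"), ("user1", "Jul 2"), ("user2", "Jul 3")]
distinct = query_grouped(emitted, dummy_reduce)
print(distinct)
```

Note that the value is lost in this arrangement, so anything you need downstream (such as the date) has to be part of the key or come from a reduce that actually computes it.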

Using a list function in step 4 can make your life easier. Bulk updates require the list of docs to be in the form {"docs":[....]}, and a list function lets you produce that in one go.
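If you would rather do that reshaping on the client instead of in a list function, the transformation is small. The sketch below assumes the usual view response shape ({"rows": [{"key": ..., "value": ...}]}) and builds the body you would POST to the target database's _bulk_docs endpoint. The function name and the mostRecentDate field are illustrative choices, not anything CouchDB prescribes.

```python
import json


def bulk_docs_body(view_response):
    """Turn reduced view rows into a {"docs": [...]} body for _bulk_docs.

    The view key (the user) is reused as the _id, so each user ends up
    with exactly one document in the second database.
    """
    docs = [
        {"_id": row["key"], "mostRecentDate": row["value"]}
        for row in view_response["rows"]
    ]
    return json.dumps({"docs": docs})


# Example response from the first-stage view queried with group=true.
view_response = {
    "rows": [
        {"key": "user1", "value": "July 2"},
        {"key": "user2", "value": "July 3"},
    ]
}
body = bulk_docs_body(view_response)
print(body)
```

When a document with the same _id already exists, an update also needs its current _rev, which this sketch leaves out.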



Source: https://stackoverflow.com/questions/25270276/chained-map-reduce-in-couchdb
