I have the following documents, which record time-stamped positions of keywords.
{
_id: willem-aap-1234,
keyword:aap,
position: 10,
profile: { name: willem },
created_at: 1234
},
{
_id: willem-aap-2345,
keyword:aap,
profile: { name: willem },
created_at: 2345
},
{
_id: oliver-aap-1235,
keyword:aap,
profile: { name: oliver },
created_at: 1235
},
{
_id: oliver-aap-2346,
keyword:aap,
profile: { name: oliver },
created_at: 2346
}
Finding the most recent keywords per profile.name can be done by:
map: function(doc) {
  if (doc.profile) {
    emit(
      [doc.profile.name, doc.keyword, doc.created_at],
      { keyword: doc.keyword, position: doc.position, created_at: doc.created_at }
    );
  }
}
reduce: function(keys, values, rered) {
  var r = values[0];
  for (var i = 1; i < values.length; i++) {
    if (r.created_at < values[i].created_at) {
      r = values[i];
    }
  }
  return r;
}
And then query the database with
reduce : true,
group_level : 2,
startkey : [aname],
endkey : [aname,{}]
This gives me the most recent documents for the profile with name aname.
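For reference, here is a minimal sketch of how those parameters could be turned into a view URL. CouchDB expects key parameters to be JSON-encoded, so each value is passed through JSON.stringify and then URL-escaped. The database name "keywords", the design document "stats", the view name "latest", and the helper buildLatestQuery are made-up placeholders, not names from my actual setup.

```javascript
// Hypothetical names: database "keywords", design doc "stats", view "latest".
function buildLatestQuery(aname) {
  var params = {
    reduce: true,
    group_level: 2,
    startkey: [aname],
    endkey: [aname, {}]   // {} sorts after strings and numbers in view collation
  };
  // JSON-encode every value, then escape it for use in a query string.
  var parts = Object.keys(params).map(function (name) {
    return name + "=" + encodeURIComponent(JSON.stringify(params[name]));
  });
  return "/keywords/_design/stats/_view/latest?" + parts.join("&");
}
```

Calling buildLatestQuery("willem") yields a path that can be appended to the server address and fetched with a plain GET request.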
But now I want to count the most recent documents per keyword and sum their positions. I cannot work out how to do this with map/reduce alone.
My use case is:
- find the most recent document per profile.name, per keyword
- count the number of unique profile.name values per keyword
- sum the positions of the most recent documents, per keyword
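To make the intended result concrete, the three steps can be sketched in plain JavaScript outside CouchDB. This only illustrates the expected output; it is not a view. The sample documents and the function name summarize are invented for the example, and treating a missing position as 0 is an assumption.

```javascript
// Hypothetical sample documents; field names follow the documents above.
var docs = [
  { keyword: "aap", position: 10, profile: { name: "willem" }, created_at: 1234 },
  { keyword: "aap", position: 12, profile: { name: "willem" }, created_at: 2345 },
  { keyword: "aap", position: 3,  profile: { name: "oliver" }, created_at: 1235 },
  { keyword: "aap", position: 5,  profile: { name: "oliver" }, created_at: 2346 }
];

function summarize(docs) {
  // Step 1: keep only the most recent document per (profile.name, keyword).
  var latest = {};
  docs.forEach(function (doc) {
    var key = JSON.stringify([doc.profile.name, doc.keyword]);
    if (!latest[key] || latest[key].created_at < doc.created_at) {
      latest[key] = doc;
    }
  });
  // Steps 2 and 3: per keyword, count the profiles and sum the positions.
  var totals = {};
  Object.keys(latest).forEach(function (key) {
    var doc = latest[key];
    if (!totals[doc.keyword]) {
      totals[doc.keyword] = { count: 0, position: 0 };
    }
    totals[doc.keyword].count++;
    totals[doc.keyword].position += doc.position || 0; // assumption: missing position counts as 0
  });
  return totals;
}
```

For the sample data, summarize(docs) returns { aap: { count: 2, position: 17 } }: one most recent document per profile, with positions 12 and 5.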
The only way I can make it work is using the following list function:
function(head, req) {
  var row;
  var counts = {};
  while (row = getRow()) {
    var v = row.value;
    var k = v.keyword;
    if (v.position) {
      if (!counts[k]) {
        counts[k] = {
          position: 0,
          count: 0
        };
      }
      counts[k].position += v.position;
      counts[k].count++;
    }
  }
  return JSON.stringify(counts);
}
Can anyone think of a better way to do this, using map/reduce only?
Thanks
The meaning of some parts is still a bit unclear (for example, what is a "position"?).
But from a purely formal point of view, it seems that your list creates an index on keyword, while your map creates an index on [profile, keyword, timestamp].
If you really need different indexes, then you need several maps, one per index. The only exception is when you already have a map on [a,b,c]: you can then change the "group level" and get two other indexes, [a,b] and [a].
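As a rough illustration of the group-level point, the following plain JavaScript mimics how grouping on a key prefix works with a summing reduce. It is a simulation, not CouchDB code: the rows and the helper groupByLevel are made up for the example, and a real view would also return the groups sorted by key.

```javascript
// Hypothetical rows, shaped like the output of a map emitting [a, b, c] keys.
var rows = [
  { key: ["willem", "aap", 1234], value: 1 },
  { key: ["willem", "aap", 2345], value: 1 },
  { key: ["willem", "noot", 1300], value: 1 },
  { key: ["oliver", "aap", 1235], value: 1 }
];

// Group rows by the first `level` key elements and sum their values.
function groupByLevel(rows, level) {
  var groups = {};
  rows.forEach(function (row) {
    var prefix = JSON.stringify(row.key.slice(0, level));
    groups[prefix] = (groups[prefix] || 0) + row.value;
  });
  return groups;
}
```

With these rows, groupByLevel(rows, 2) groups on [a,b] and groupByLevel(rows, 1) groups on [a], both from the same emitted keys. No group level gives an index on keyword alone, because keyword is not the leading key element.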
Source: https://stackoverflow.com/questions/15320834/couchdb-map-reduce-view-counting-only-the-most-recent-items