couchdb map/reduce view: counting only the most recent items

[亡魂溺海] 提交于 2019-12-08 04:07:43

问题


I have the following documents. Time stamped positions of keywords.

{
  _id: willem-aap-1234,
  keyword:aap,
  position: 10,
  profile: { name: willem },
  created_at: 1234
},
{
  _id: willem-aap-2345,
  keyword:aap,
  profile: { name: willem },
  created_at: 2345
},
{
  _id: oliver-aap-1235,
  keyword:aap,
  profile: { name: oliver },
  created_at: 1235
},
{
  _id: oliver-aap-2346,
  keyword:aap,
  profile: { name: oliver },
  created_at: 2346
}

Finding the most recent keywords per profile.name can be done by:

map: function(doc) {
if(doc.profile)
    emit(
        [doc.profile.name, doc.keyword, doc.created_at], 
        { keyword : doc.keyword, position : doc.position, created_at: doc.created_at }
    );
}

reduce: function(keys, values, rered) {
  var r = values[0];
  for (var i=1; i<values.length; i++)
    if (r.created_at < values[i].created_at)
      r = values[i];
  return r;
}

And then query the database with

reduce : true,
group_level : 2,
startkey : [aname],
endkey : [aname,{}]

This gives me the most recent documents for the profile with name aname.

But now I want to count all most recent documents per keyword, and sum the positions. I cannot get my head around this trying to do it with map/reduce only.

My user case is:

  1. find the most recent documents per profile.user, per keyword
  2. count the number of unique profile.name's per keyword
  3. sum the positions of the most recent document, per keyword

The only way I can make it work is using the following list function:

function(head, req) {
  var row;
  var counts = {};
  while (row = getRow()) {
    var v = row.value;
    var k = v.keyword;

    if (v.position) {
      if (!counts[k])
        counts[k] = { 
          position : 0,
          count : 0
        }
      counts[k].position += v.position;
      counts[k].count++;
    }
  }

  return JSON.stringify(counts);
}

Can anyone think of a better way to do this, using map/reduce only?

Thanks


回答1:


The meaning of some parts are still a bit cloudy (for example, what is a "position"?).

But from a pure formal point of view, it seems that your list creates an index on keyword while your map created an index on [profile, keyword, timestamp].

If you really need different indexes then you need several maps, one per index. The only exception is when you already have a map on [a,b,c], you can change the "group level" and get two other indexes: [a,b] and [a].



来源:https://stackoverflow.com/questions/15320834/couchdb-map-reduce-view-counting-only-the-most-recent-items

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!