Is there a way to remove duplicate records from the results?

♀尐吖头ヾ submitted on 2019-12-12 06:08:50

Question


I have a view whose results contain duplicated documents, as shown below. How can I remove the duplicates and get only the unique results? Thank you in advance.

{
    "total_rows": 9,
    "offset": 0,
    "rows": [
        {
            "id": "xxxx",
            "key": "12345",
            "value": {
                "_id": "abc123",
                "_rev": "4-8db4da81d1e20afcea0a328fb16e7ec8",
                "field1": "abc",
                "field2": "dfr"
            }
        },
        {
            "id": "xxxx",
            "key": "12345",
            "value": {
                "_id": "abc123",
                "_rev": "4-8db4da81d1e20afcea0a328fb16e7ec8",
                "field1": "abc",
                "field2": "dfr"
            }
        }
    ]
}

The view looks like this:

function(doc) {
    if (doc) {
        for (var i in doc.item) {
            emit(doc.item[i].key, doc);
        }
    }
}

The view is called with ...._view/duplicate?key="12345"

When I run the following reduce function, I always get this error: "error":"reduce_overflow_error","reason":"Reduce output must shrink more rapidly:

function (keys, values, rereduce) {
    var uniqueKey = [];
    var newValues = [];
    for (var i = 0; i < values.length; i++) {
        if (uniqueKey.indexOf(values[i]._id) == -1) {
            uniqueKey.push(values[i]._id);
            newValues.push(values[i]);
        }
    }
    return newValues;
}


Answer 1:


The issue is your map function:

function(doc) {
    if (doc) {
        for (var i in doc.item) {
            emit(doc.item[i].key, doc);
        }
    }
}

This emits the same document once for every entry in doc.item, so it should come as no surprise that the view's results contain duplicate documents. If you want to return all the items of a document without any duplicates, you can emit once per document, with something as simple as this:

function(doc) {
  if (doc.item) {
    emit(doc.item, null);
  }
}
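
If the view still has to be queried by an individual item key (as in ?key="12345"), one possible variant is to emit each distinct key at most once per document. This is only a minimal sketch under assumptions: it assumes every entry of doc.item is an object with a string key property, and the emit stub, the rows array, the map name, and the sample document are hypothetical additions so that the code can run outside CouchDB.

```javascript
// Stub standing in for CouchDB's emit(); hypothetical, only for running locally.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: emit each distinct item key at most once per document.
function map(doc) {
  if (!doc || !doc.item) {
    return;
  }
  var seen = {}; // keys already emitted for this document (assumes string keys)
  for (var i in doc.item) {
    var entry = doc.item[i];
    if (!entry) {
      continue; // skip empty entries
    }
    var key = entry.key;
    if (!Object.prototype.hasOwnProperty.call(seen, key)) {
      seen[key] = true;
      emit(key, null); // null value keeps the index small
    }
  }
}

// Hypothetical document in which two items share the key "12345".
map({
  _id: "abc123",
  item: [{ key: "12345" }, { key: "12345" }, { key: "67890" }]
});
console.log(rows);
```

In a design document only the body of map would be used, written as an anonymous function(doc) { ... }, since CouchDB supplies emit itself.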

Note also that I call emit(doc.item, null) and not emit(doc.item, doc). Emitting the whole document is bad practice: every emitted copy is stored in the view index, so the index can grow as large as the database itself. Instead, query the view with the option include_docs=true.
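
To illustrate the include_docs=true option, here is a small sketch of how such a query URL could be built. The helper name buildViewQuery and the host, database, and design document names in the usage are hypothetical; only the view name duplicate and the key "12345" come from the question.

```javascript
// Build a view query URL that filters by key and asks CouchDB to attach
// the full document to each row via include_docs=true.
function buildViewQuery(viewUrl, key) {
  // View keys are JSON values, so a string key must keep its quotes.
  var encodedKey = encodeURIComponent(JSON.stringify(key));
  return viewUrl + "?key=" + encodedKey + "&include_docs=true";
}

// Hypothetical host, database, and design document names.
var url = buildViewQuery(
  "http://localhost:5984/mydb/_design/app/_view/duplicate",
  "12345"
);
console.log(url);
```

With this option each returned row carries the document in a doc field next to id, key, and value, so the view itself can keep emitting null.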



Source: https://stackoverflow.com/questions/37453127/is-there-a-way-to-clear-duplication-record-from-results
