Mongoose - Search for text in three fields based on score or weightage

Submitted by 非 Y 不嫁゛ on 2019-12-22 23:39:17

Question


I am using Mongoose on top of MongoDB. This is how my model looks.

var mongoose = require('mongoose');
var Schema = mongoose.Schema;

var BookSchema = new Schema({
  name: String,
  viewCount: { type: Number, default: 0 },
  description: {
    type: String,
    default: 'No description'
  },
  body: {
    type: String,
    default: ''
  }
});

I need to search for some text across the name, description and body fields. So far this is what I am doing, and it works:

// Case-insensitive regex match on any of the three fields
Book.find().or([
    { 'name': { $regex: term, $options: 'i' } },
    { 'description': { $regex: term, $options: 'i' } },
    { 'body': { $regex: term, $options: 'i' } }
]).exec(function (err, books) {
    if (err) {
      return handleError(res, err);
    }
    return res.status(200).json(books);
});

Problem: I need some mechanism for assigning a weight/score to each field (name, description, body), with name having the highest weight, description a little less, and body the least. When the results come back, I want to sort them by that score/weight.

So far I have looked into this link & weights, but I am not sure what the best way to get the desired result is. I also want to understand whether I need to create the weights every time before I search or whether it is a one-time activity, and how to implement weights with Mongoose.


Answer 1:


A "text index" and search is indeed likely the best option here as long as you are searching for whole words.

Adding a text index to your schema definition is quite simple:

BookSchema.index(
    {
         "name": "text",
         "description": "text",
         "body": "text"
    },
    {
        "weights": {
            "name": 5,
            "description": 2
        }
    }
)

This allows you to perform simple searches with "set" weighting to the fields:

Book.find({ "$text": { "$search": "Holiday School Year" } })
    .select({ "score": { "$meta": "textScore" } })
    .sort({ "score": { "$meta": "textScore" } })
    .exec(function(err,result) {
        // result holds the matching documents, highest score first
    });

Each matched term is scored according to the weight of the field it was found in and the number of occurrences, so matches in the higher-weighted fields contribute the most to the score.
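To make the effect of the weights concrete, here is a rough sketch in plain JavaScript. It is not MongoDB's actual "textScore" formula, which is more involved; the `countWord` and `roughScore` helpers and the sample document are made up for illustration. It only counts how often each search word occurs in a field and multiplies that by the field's weight (a field without an explicit weight, like body in the index above, defaults to 1).

```javascript
// Hypothetical weights mirroring the index definition (body defaults to 1).
var weights = { name: 5, description: 2, body: 1 };

// Count whole-word, case-insensitive occurrences of `word` in `text`.
function countWord(text, word) {
  var tokens = String(text || '').toLowerCase().split(/\W+/);
  return tokens.filter(function (t) { return t === word.toLowerCase(); }).length;
}

// Illustrative only: sum of (occurrences x field weight) over all words and fields.
function roughScore(doc, search) {
  var words = search.split(/\s+/).filter(Boolean);
  return Object.keys(weights).reduce(function (total, field) {
    return total + words.reduce(function (sum, word) {
      return sum + countWord(doc[field], word) * weights[field];
    }, 0);
  }, 0);
}

// Made-up sample document.
var sample = {
  name: 'Holiday Planner',
  description: 'Plan the school year',
  body: 'A holiday guide for every school holiday'
};

// name: 1 x 5, description: 2 x 2, body: 3 x 1
console.log(roughScore(sample, 'Holiday School Year')); // logs 12
```

A single hit in name outweighs two hits in description here, which is the kind of ordering the weighted index is meant to produce.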

The weights are attached to the "index", so they are defined once rather than on every search; changing them means dropping and re-creating the index. Another limitation is that "text search" does not look at "partial" words. For example, "ci" does not match "City" or "Citizen"; for that you would need a regular expression instead.
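If partial matches are required, the regular-expression query from the question can still be used, though it returns no relevance score. A small sketch of building that query follows; `escapeRegExp` and `buildPartialMatchQuery` are hypothetical helper names, and escaping the user's input is an addition not discussed above. The result is a plain object, so nothing here depends on a database connection.

```javascript
// Escape characters that have special meaning in a regular expression,
// so that user input such as "c++" is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build an $or filter doing a case-insensitive partial match on each field.
function buildPartialMatchQuery(term, fields) {
  var pattern = escapeRegExp(term);
  return {
    $or: fields.map(function (field) {
      var clause = {};
      clause[field] = { $regex: pattern, $options: 'i' };
      return clause;
    })
  };
}

var query = buildPartialMatchQuery('ci', ['name', 'description', 'body']);
console.log(JSON.stringify(query));

// Usage, assuming the Book model from the question:
// Book.find(query).exec(function (err, books) { /* ... */ });
```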

If you need more flexibility than that, or generally must be able to change the weighting of results dynamically, then you need something like the aggregation framework or mapReduce.

The aggregation framework, however, cannot perform a "logical" match of a "regular expression" against your terms: it can filter with one through the $match operator, but it cannot evaluate one as a true/false expression inside $project (MongoDB 4.2 and later add the $regexMatch operator for this). You can still work with single words and "exact" matches, if that suits.

Book.aggregate(
    [
        { "$match": {
            "$or": [
                { "name": /Holiday/ },
                { "description": /Holiday/ },
                { "body": /Holiday/ }
            ]
        }},
        { "$project": {
            "name": 1,
            "description": 1,
            "body": 1,
            "score": {
                "$add": [
                    { "$cond": [{ "$eq": [ "$name", "Holiday" ] },5,0 ] },
                    { "$cond": [{ "$eq": [ "$description", "Holiday" ] },2,0 ] },
                    { "$cond": [{ "$eq": [ "$body", "Holiday" ] },1,0 ] }
                ]
            }
        }},
        { "$sort": { "score": -1 } }
    ],
    function(err,results) {

    }
)

Because an aggregation pipeline is just a data structure passed to the query, you can change the weight parameters on each execution to whatever you presently need.
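As a sketch of that idea, the pipeline shown earlier can be generated from a word and a set of weights supplied at call time. `buildScorePipeline` is a hypothetical helper, and it assumes `word` is a plain word with no regular-expression metacharacters.

```javascript
// Build the $match / $project / $sort pipeline for a given word and weights.
// Assumption: `word` contains no regex special characters.
function buildScorePipeline(word, weights) {
  var fields = Object.keys(weights);
  var pattern = new RegExp(word); // unanchored and case-sensitive, like /Holiday/

  var projection = {};
  fields.forEach(function (field) { projection[field] = 1; });
  projection.score = {
    $add: fields.map(function (field) {
      // Add the field's weight only when the field equals the word exactly.
      return { $cond: [{ $eq: ['$' + field, word] }, weights[field], 0] };
    })
  };

  return [
    { $match: { $or: fields.map(function (field) {
        var clause = {};
        clause[field] = pattern;
        return clause;
    }) } },
    { $project: projection },
    { $sort: { score: -1 } }
  ];
}

// The weights can differ on every call.
var pipeline = buildScorePipeline('Holiday', { name: 5, description: 2, body: 1 });
console.log(JSON.stringify(pipeline[1], null, 2));

// Usage, assuming the Book model from the question:
// Book.aggregate(pipeline, function (err, results) { /* ... */ });
```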

mapReduce shares a similar principle: you can include a calculated "score" as the leading element of the primary key that is emitted. mapReduce naturally sorts all emitted input by this key as an optimization for feeding it to a reduce function. However, you cannot further sort or "limit" such a result.

Those are generally your options to look at and decide which best suits your case.



Source: https://stackoverflow.com/questions/32063998/mongoose-search-for-text-in-three-fields-based-on-score-or-weightage
