MongoDB: Case insensitive and accent insensitive

Submitted by 半腔热情 on 2019-12-11 06:07:33

Question


I am searching for the string "JESÚS", but the query returns only the document that contains exactly that string. I need the search to ignore both accents and letter case.

I am using C# and the MongoDB driver.

I have two documents saved in my MongoDB collection:

_id:5d265f3129ea36365c7ca587
TRABAJADOR:"JESUS HERNANDEZ DIAZ"

_id:5d265f01db86a83148404711
TRABAJADOR:"JESÚS HERNÁNDEZ DÍAZ"

In Visual C# with the MongoDB driver:

var filter = Builders<BsonDocument>.Filter.Regex("TRABAJADOR", new BsonRegularExpression(string.Format(".*{0}.*", "JESÚS"), "i"));

var result = collection.Find(filter, new FindOptions() { Collation = new Collation("es", strength: CollationStrength.Primary, caseLevel:true) }).ToList();

output = JsonConvert.SerializeObject(result);
return output;

If I search for "JESÚS", the actual output is:

_id:5d265f01db86a83148404711
TRABAJADOR:"JESÚS HERNÁNDEZ DÍAZ"

But I expect the following output:

_id:5d265f3129ea36365c7ca587
TRABAJADOR:"JESUS HERNANDEZ DIAZ"

_id:5d265f01db86a83148404711
TRABAJADOR:"JESÚS HERNÁNDEZ DÍAZ"

Answer 1:


I recommend creating a text index with the default language set to "none" to make it diacritic-insensitive, and then running a $text search as follows:

db.Project.createIndex(
    {
        "WORKER": "text",
        "TRABAJADOR": "text"
    },
    {
        "background": false,
        "default_language": "none"
    }
)
db.Project.find({
    "$text": {
        "$search": "jesus",
        "$caseSensitive": false
    }
})

Here's the C# code that generated the above queries. I'm using my library MongoDB.Entities for brevity.

using MongoDB.Entities;
using System;
using System.Linq;

namespace StackOverflow
{
    public class Program
    {
        public class Project : Entity
        {
            public string WORKER { get; set; }
            public string TRABAJADOR { get; set; }
        }

        private static void Main(string[] args)
        {
            new DB("test");

            DB.Index<Project>()
              .Key(p => p.WORKER, KeyType.Text)
              .Key(p => p.TRABAJADOR, KeyType.Text)
              .Option(o => o.DefaultLanguage = "none")
              .Option(o => o.Background = false)
              .Create();

            (new[] {
                new Project { WORKER = "JESUS HERNANDEZ DIAZ"},
                new Project { TRABAJADOR = "JESÚS HERNÁNDEZ DÍAZ"}
            }).Save();

            var result = DB.SearchText<Project>("jesus");

            Console.WriteLine($"found: {result.Count()}");
            Console.Read();
        }
    }
}
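
To make the matching behaviour concrete, here is a rough in-memory imitation in plain JavaScript. It is only a sketch and not how MongoDB implements $text. The helper names fold and searchWord are made up for this illustration. It shows that a case- and diacritic-insensitive word search finds both sample values, and that a text search matches whole words, so a fragment such as "jes" would not match.

```javascript
// Illustration only: a rough, in-memory imitation of a case- and
// diacritic-insensitive word search. It is NOT MongoDB's implementation.

// Hypothetical helper: lower-case a string and strip its combining accents.
function fold(text) {
  return text
    .normalize("NFD")                 // "Ú" becomes "U" + combining acute accent
    .replace(/[\u0300-\u036f]/g, "")  // drop the combining marks
    .toLowerCase();
}

// Hypothetical helper: returns the documents whose field contains the
// search term as a whole word, after folding both sides.
function searchWord(docs, field, term) {
  const wanted = fold(term);
  return docs.filter((doc) =>
    fold(doc[field] || "").split(/\s+/).includes(wanted)
  );
}

// The two sample documents from the question.
const docs = [
  { _id: "5d265f3129ea36365c7ca587", TRABAJADOR: "JESUS HERNANDEZ DIAZ" },
  { _id: "5d265f01db86a83148404711", TRABAJADOR: "JESÚS HERNÁNDEZ DÍAZ" },
];

console.log(searchWord(docs, "TRABAJADOR", "jesus").length); // 2
console.log(searchWord(docs, "TRABAJADOR", "jes").length);   // 0 (not a whole word)
```

If you need substring matches rather than whole-word matches, a text index is probably not the right tool, and a pre-folded copy of the field (built with something like fold above) may be worth considering.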



Answer 2:


You need to search both fields to get both documents:

 var filter = Builders<BsonDocument>.Filter;
 // Combine with | (OR) rather than & (AND): each document stores the name in only one of the two fields.
 var query = filter.Regex("TRABAJADOR", new BsonRegularExpression(string.Format(".*{0}.*", "JESÚS"), "i")) | filter.Regex("WORKER", new BsonRegularExpression(string.Format(".*{0}.*", "JESÚS"), "i"));

Replace your first line with these two statements and pass query to your Find call.

I didn't test it; I hope it works for you!
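
One caveat with the regex route: as far as I know, $regex is not collation-aware, so the "i" option only handles letter case and the pattern "JESÚS" still will not match "JESUS". A possible workaround, sketched below in plain JavaScript, is to expand each letter of the search term into a character class listing its accented variants. The helper name accentInsensitivePattern and the VARIANTS table are made up for this illustration and cover only a few Latin letters. The resulting string could then be passed to BsonRegularExpression in place of the raw term.

```javascript
// Hypothetical lookup table: base letter -> the variants the pattern should accept.
// Extend it for whatever alphabet your data uses.
const VARIANTS = {
  a: "aáàâäã",
  e: "eéèêë",
  i: "iíìîï",
  o: "oóòôöõ",
  u: "uúùûü",
  n: "nñ",
  c: "cç",
};

// Reduce a character to its unaccented lower-case base letter.
function baseLetter(ch) {
  return ch.normalize("NFD").replace(/[\u0300-\u036f]/g, "").toLowerCase();
}

// Hypothetical helper: builds a regex source string that matches the term
// regardless of accents on the letters listed in VARIANTS.
function accentInsensitivePattern(term) {
  let pattern = "";
  for (const ch of term) {
    const variants = VARIANTS[baseLetter(ch)];
    if (variants) {
      // List lower- and upper-case forms explicitly instead of relying on
      // the regex engine's case folding for non-ASCII letters.
      pattern += "[" + variants + variants.toUpperCase() + "]";
    } else {
      // Escape regex metacharacters in every other character.
      pattern += ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    }
  }
  return pattern;
}

const pattern = accentInsensitivePattern("JESÚS");
const re = new RegExp(pattern, "i");

console.log(re.test("JESUS HERNANDEZ DIAZ")); // true
console.log(re.test("JESÚS HERNÁNDEZ DÍAZ")); // true
```

Keep in mind that an unanchored, case-insensitive regex like this generally cannot use an index efficiently, so it may be slow on large collections.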



Source: https://stackoverflow.com/questions/56994094/mongodb-case-insensitive-and-accent-insensitive
