Matching an approximate string in a Core Data store

Posted by 放荡痞女 on 2019-12-05 00:21:41

You want your search to be diacritic-insensitive, so that it matches both the 'é' in pensée and the 'e' in pensee. You get this by adding [d] after the comparison operator, like so:

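    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"(songTitle contains[cd] %@)", yourSongSubstring];
The 'c' in [cd] is for case insensitivity. Note that the operator is contains rather than like: like compares against the whole title unless you put * wildcards around the substring yourself.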

Since the words of your search string could appear in any order in the title you are searching, you could tokenize the search string ([... componentsSeparatedByString:@" "]) and then create a predicate like

    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"(songTitle contains[cd] %@) AND (songTitle contains[cd] %@)", songToken1, songToken2];
I'm going from memory here, but AND (or &&) is how the predicate format syntax combines conditions.
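A fixed two-token format string only works when you know the number of words in advance. Here is a minimal sketch of building the same kind of predicate for any number of tokens with NSCompoundPredicate. The helper name SongTitlePredicate is made up for illustration, and songTitle is the attribute name taken from the examples above. It uses CONTAINS because each token is treated as a plain substring.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of any framework): every whitespace-separated
// token of `query` must occur somewhere in songTitle, ignoring case and
// diacritics. Token order does not matter.
static NSPredicate *SongTitlePredicate(NSString *query) {
    NSMutableArray<NSPredicate *> *subpredicates = [NSMutableArray array];
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];

    for (NSString *token in [query componentsSeparatedByCharactersInSet:whitespace]) {
        if (token.length == 0) {
            continue; // repeated spaces produce empty tokens; skip them
        }
        [subpredicates addObject:
            [NSPredicate predicateWithFormat:@"songTitle CONTAINS[cd] %@", token]];
    }

    // All token predicates must hold at once.
    return [NSCompoundPredicate andPredicateWithSubpredicates:subpredicates];
}
```

You would assign the result to your fetch request's predicate. To try it without a store, evaluate it against a dictionary, e.g. [SongTitlePredicate(@"pensee la") evaluateWithObject:@{@"songTitle": @"La Pensée"}]. One thing to decide for yourself: a query with no tokens yields an AND predicate with no subpredicates, which matches everything, so you may prefer to special-case an empty search string.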

I believe the tool you want to use here is SearchKit. I say that as if I've just made your job easy. I haven't, but it should have the tools you need to be successful here. LNC is still offering their SearchKit Podcast for free (very nice).

Each track would be a document in this case, and you'd need to come up with a good way to index them with an identifier that can be used to find them. You can then load them up with metadata, and search them. Perhaps putting the title "in" the document would be helpful here to facilitate the use of Similarity Searching (kSKSearchOptionFindSimilar). That may or may not work really well.
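To make the "each track is a document" idea concrete, here is a rough sketch that indexes titles in memory and runs a similarity search. Assumptions: you are on macOS (SearchKit ships in CoreServices and is not available on iOS), ARC is enabled, and the helper name SimilarTrackIDs plus the "track" scheme and "tracks" index name are invented for illustration. The SearchKit calls are written from the public headers as I remember them, so check the signatures against the reference before relying on this.

```objc
#import <Foundation/Foundation.h>
#import <CoreServices/CoreServices.h>

// Hypothetical helper: index each title as one SearchKit document named after
// its track identifier, then return the identifiers of documents that
// SearchKit considers similar to `query`.
static NSArray<NSString *> *SimilarTrackIDs(NSDictionary<NSString *, NSString *> *titlesByID,
                                            NSString *query) {
    NSMutableData *storage = [NSMutableData data];
    SKIndexRef index = SKIndexCreateWithMutableData((__bridge CFMutableDataRef)storage,
                                                    CFSTR("tracks"),
                                                    kSKIndexInvertedVector,
                                                    NULL);
    if (index == NULL) {
        return @[];
    }

    // Put the title "in" the document as its text content.
    [titlesByID enumerateKeysAndObjectsUsingBlock:^(NSString *trackID, NSString *title, BOOL *stop) {
        SKDocumentRef doc = SKDocumentCreate(CFSTR("track"), NULL, (__bridge CFStringRef)trackID);
        if (doc != NULL) {
            SKIndexAddDocumentWithText(index, doc, (__bridge CFStringRef)title, true);
            CFRelease(doc);
        }
    }];
    SKIndexFlush(index); // make the added documents searchable

    SKSearchRef search = SKSearchCreate(index, (__bridge CFStringRef)query,
                                        kSKSearchOptionFindSimilar);
    NSMutableArray<NSString *> *result = [NSMutableArray array];
    if (search != NULL) {
        enum { kMaxHits = 10 };
        SKDocumentID ids[kMaxHits];
        float scores[kMaxHits];
        CFIndex found = 0;
        SKSearchFindMatches(search, kMaxHits, ids, scores, 1.0, &found);

        if (found > 0) {
            SKDocumentRef docs[kMaxHits];
            SKIndexCopyDocumentRefsForDocumentIDs(index, found, ids, docs);
            for (CFIndex i = 0; i < found; i++) {
                if (docs[i] == NULL) continue;
                [result addObject:(__bridge NSString *)SKDocumentGetName(docs[i])];
                CFRelease(docs[i]);
            }
        }
        CFRelease(search);
    }
    CFRelease(index);
    return result;
}
```

The returned names are the identifiers you chose, so you can map them back to your Core Data objects with a normal fetch. I wouldn't assume the hits come back ordered by relevance; if ranking matters, sort them by the values in the scores array yourself.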

The question you've asked is a good one, but there is certainly no industry standard for it because anyone who solves this problem well (i.e. every major search engine) keeps their algorithms very secret. This is a hard problem; no one is quite ready to give away their answer.

AtoN

Consider q-grams, which are substrings of length q (Gravano et al., 2001).

For two strings s1 and s2, you could find, for each q-gram of s1, the q-gram of s2 with the smallest edit distance. Summing those distances gives a score that is very robust to permuted words and extra characters.

Generally, q should be adapted to your problem domain (experiment with q = 3, 4, 5...).
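As a sketch of the procedure described above (not Gravano et al.'s exact algorithm), the following computes, for each q-gram of s1, the smallest Levenshtein distance to any q-gram of s2, and sums the results. The function names QGrams, EditDistance, and QGramDistance are made up for illustration, and characters are compared as UTF-16 units, which is an assumption that may split surrogate pairs or composed characters.

```objc
#import <Foundation/Foundation.h>

// All substrings of length q. A string no longer than q (or q == 0) is
// returned as a single gram so short inputs still compare.
static NSArray<NSString *> *QGrams(NSString *s, NSUInteger q) {
    if (q == 0 || s.length <= q) {
        return @[s];
    }
    NSMutableArray<NSString *> *grams = [NSMutableArray array];
    for (NSUInteger i = 0; i + q <= s.length; i++) {
        [grams addObject:[s substringWithRange:NSMakeRange(i, q)]];
    }
    return grams;
}

// Levenshtein distance over UTF-16 units, using two rolling rows.
static NSUInteger EditDistance(NSString *a, NSString *b) {
    NSUInteger n = a.length, m = b.length;
    NSUInteger *prev = calloc(m + 1, sizeof(NSUInteger));
    NSUInteger *curr = calloc(m + 1, sizeof(NSUInteger));
    for (NSUInteger j = 0; j <= m; j++) {
        prev[j] = j;
    }
    for (NSUInteger i = 1; i <= n; i++) {
        curr[0] = i;
        for (NSUInteger j = 1; j <= m; j++) {
            NSUInteger cost =
                ([a characterAtIndex:i - 1] == [b characterAtIndex:j - 1]) ? 0 : 1;
            NSUInteger best = prev[j - 1] + cost;   // substitution or match
            best = MIN(best, prev[j] + 1);          // deletion
            best = MIN(best, curr[j - 1] + 1);      // insertion
            curr[j] = best;
        }
        NSUInteger *tmp = prev; prev = curr; curr = tmp;
    }
    NSUInteger distance = prev[m];
    free(prev);
    free(curr);
    return distance;
}

// Sum, over the q-grams of s1, of the distance to the closest q-gram of s2.
// Lower means more similar; note the score is not symmetric in s1 and s2.
static NSUInteger QGramDistance(NSString *s1, NSString *s2, NSUInteger q) {
    NSArray<NSString *> *grams2 = QGrams(s2, q);
    NSUInteger total = 0;
    for (NSString *g1 in QGrams(s1, q)) {
        NSUInteger best = NSUIntegerMax;
        for (NSString *g2 in grams2) {
            best = MIN(best, EditDistance(g1, g2));
        }
        total += best;
    }
    return total;
}
```

Because every q-gram of one string is compared with every q-gram of the other, the cost grows with the product of the two lengths, which is fine for song titles but not for long texts. To get the case and diacritic insensitivity discussed earlier, you could fold both strings first with stringByFoldingWithOptions:locale: using NSCaseInsensitiveSearch | NSDiacriticInsensitiveSearch.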
