morelikethis

How to find similar documents

你说的曾经没有我的故事, posted on 2020-01-01 00:45:13
Question: How do you find documents similar to a given document in Lucene? I do not know what the text is; I only know which document it is. Is there a way to find similar documents in Lucene? I am a newbie, so I may need some hand-holding.
Answer 1: You may want to check the MoreLikeThis feature of Lucene. MoreLikeThis constructs a Lucene query based on terms within a document to find other similar documents in the index. http://lucene.apache.org/java/3_0_1/api/contrib-queries/org/apache/lucene/search
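The idea in the answer (pick terms from a known document, then use them as a query against the rest of the index) can be sketched in plain Python. This is only an illustration of the concept, not Lucene's MoreLikeThis implementation: the whitespace tokenizer, the tf-idf weighting formula, the overlap-count scoring, and all function names below are assumptions made for this sketch.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for a real analyzer."""
    return text.lower().split()


def interesting_terms(doc_id, docs, max_terms=25):
    """Pick the terms of docs[doc_id] with the highest tf-idf weight."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for text in docs.values() for term in set(tokenize(text)))
    tf = Counter(tokenize(docs[doc_id]))
    weights = {t: tf[t] * math.log(1 + n_docs / df[t]) for t in tf}
    return sorted(weights, key=lambda t: (-weights[t], t))[:max_terms]


def more_like_this(doc_id, docs, max_terms=25):
    """Rank the other documents by how many of the selected terms they contain."""
    terms = set(interesting_terms(doc_id, docs, max_terms))
    scores = {}
    for other_id, text in docs.items():
        if other_id == doc_id:
            continue  # never suggest the source document itself
        overlap = len(terms & set(tokenize(text)))
        if overlap:
            scores[other_id] = overlap
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy corpus, invented for this illustration.
docs = {
    "a": "lucene builds an inverted index for search",
    "b": "an inverted index maps terms to documents",
    "c": "cooking pasta needs boiling water",
}
print(more_like_this("a", docs))
```

Only the document identifier is needed as input, which matches the situation in the question: the caller knows which document it is, not what text to search for.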

Aggregating MoreLikeThis Results in RavenDB

情到浓时终转凉″, posted on 2019-12-24 15:44:55
Question: I have been trying out the MoreLikeThis Bundle to bring back a set of documents ordered by the number of matches in a field called 'backyardigans' compared to a key document. This all works as expected. But what I would like to do is order by the number of matches in 3 separate fields added together. An example record would be:

    var data = new Data
    {
        backyardigans = "Pablo Tasha Uniqua Tyrone Austin",
        engines = "Thomas Percy Henry Toby",
        pigs = "Daddy Peppa George Mummy Granny"
    };

If another
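The ordering the question asks for (matches in three fields added together) can be illustrated outside RavenDB with a small Python sketch. It does not use the MoreLikeThis Bundle or any RavenDB API; the candidate records and the function names are hypothetical, and "match" is assumed here to mean a shared whitespace-separated term.

```python
FIELDS = ("backyardigans", "engines", "pigs")


def field_matches(key_doc, other_doc, field):
    """Count distinct whitespace-separated terms shared in one field."""
    key_terms = set(key_doc.get(field, "").split())
    other_terms = set(other_doc.get(field, "").split())
    return len(key_terms & other_terms)


def rank_by_total_matches(key_doc, candidates):
    """Order candidates by matches summed over all FIELDS, highest first."""
    totals = [
        (sum(field_matches(key_doc, doc, f) for f in FIELDS), name)
        for name, doc in candidates.items()
    ]
    ordered = sorted(totals, key=lambda t: (-t[0], t[1]))
    return [(name, total) for total, name in ordered]


# The key document uses the example record from the question.
key = {
    "backyardigans": "Pablo Tasha Uniqua Tyrone Austin",
    "engines": "Thomas Percy Henry Toby",
    "pigs": "Daddy Peppa George Mummy Granny",
}
# Hypothetical candidate records, invented for this illustration.
candidates = {
    "doc1": {"backyardigans": "Pablo Tasha", "engines": "Thomas", "pigs": "Peppa"},
    "doc2": {"backyardigans": "Pablo Tasha Uniqua", "engines": "Gordon", "pigs": ""},
}
print(rank_by_total_matches(key, candidates))
```

Here doc2 has more matches in 'backyardigans' alone, but doc1 ranks first once the three per-field counts are summed.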

Solr MoreLikeThis boosting query fields

时光毁灭记忆、已成空白, posted on 2019-12-22 05:14:58
Question: I am experimenting with Solr's MoreLikeThis feature. My schema deals with articles, and I'm looking for similarities between articles within three fields: articletitle, articletext and topic. The following query works well:

    q=id:(2e2ec74c-7c26-49c9-b359-31a11ea50453)
    &rows=100000000&mlt=true
    &mlt.fl=articletext,articletitle,topic&mlt.boost=true&mlt.mindf=1&mlt.mintf=1

But I would like to experiment with boosting different query fields - i.e. putting more weight on similarities in the
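One parameter that may be worth trying for per-field weighting is `mlt.qf`, which takes field boosts in the DisMax `field^boost` format. The sketch below only builds the request parameters in Python; the boost values are arbitrary, the helper name is hypothetical, and whether `mlt.qf` gives the desired ranking for this schema is an assumption that needs testing against a real index.

```python
from urllib.parse import urlencode


def build_mlt_params(doc_id, field_boosts):
    """Build Solr MoreLikeThis parameters with per-field boosts in mlt.qf.

    field_boosts maps field name -> boost; the same fields go into mlt.fl.
    """
    return {
        "q": f"id:({doc_id})",
        "mlt": "true",
        "mlt.fl": ",".join(field_boosts),
        "mlt.boost": "true",
        "mlt.mindf": 1,
        "mlt.mintf": 1,
        # DisMax-style boosts, e.g. "articletitle^2.0 articletext^1.0"
        "mlt.qf": " ".join(f"{name}^{boost}" for name, boost in field_boosts.items()),
    }


# Arbitrary example boosts: weight title matches more than body or topic.
params = build_mlt_params(
    "2e2ec74c-7c26-49c9-b359-31a11ea50453",
    {"articletitle": 2.0, "articletext": 1.0, "topic": 0.5},
)
print(urlencode(params))
```

The printed string can be appended to the select handler URL of the Solr core being queried.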

Is it possible to have SOLR MoreLikeThis use different fields for model and matches?

纵饮孤独, posted on 2019-12-11 07:24:49
Question: Let's say I have documents with two fields, A and B. I'd like to use SOLR's MoreLikeThis, but with a twist: I'm most interested in boosting documents whose A field is like my model document's B field. (That is, extract MLT's 'interesting terms' from the model B field, but only collect MLT results based on the A field.) I don't see a way to use the mlt.fl fields or mlt.qf boosts to achieve this effect in a single query. (It seems mlt.fl specifies fields used for both discovery of 'interesting

Creating more like this in RavenDB

有些话、适合烂在心里, posted on 2019-12-10 18:58:06
Question: I have these documents in my domain:

    public class Article
    {
        public string Id { get; set; }
        // some other properties
        public IList<string> KeywordIds { get; set; }
    }

    public class Keyword
    {
        public string Id { get; set; }
        public string UrlName { get; set; }
        public string Title { get; set; }
        public string Tooltip { get; set; }
        public string Description { get; set; }
    }

I have this scenario:

    Article A1 has keyword K1
    Article A2 has keyword K1
    One user reads article A1

I want to suggest user to read

Measuring similarity between document sets

回眸只為那壹抹淺笑, posted on 2019-12-10 12:56:21
Question: For illustration purposes, let's assume this is a forum service. I need to calculate the "similarity" among each user's posts, so that the result would be something like:

    among posts by user A, similarity 60%
    among posts by user B, similarity 20%
    ...

I'm dealing with multibyte strings, so I guess I'm stuck with search engines here. We already use Solr and already have moreLikeThis implemented, but I'm not quite sure how to construct the query. Any help appreciated!
Answer 1: Possibly Carrot2 will
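To make the requested number concrete, one possible definition of "similarity among a user's posts" is the mean pairwise cosine similarity of the posts. The sketch below is plain Python, not a Solr or Carrot2 query; it uses character bigrams so that multibyte text needs no word tokenizer. The choice of bigrams, the averaging, and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def char_ngrams(text, n=2):
    """Count overlapping character n-grams; works on any Unicode string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity of two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_pairwise_similarity(posts):
    """Average cosine similarity over all unordered pairs of posts."""
    vectors = [char_ngrams(p) for p in posts]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0  # fewer than two posts: nothing to compare
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


# Hypothetical posts per user, invented for this illustration.
posts_by_user = {
    "A": ["今日は晴れです", "今日は雨です", "明日は晴れです"],
    "B": ["solr query help", "天気がいい"],
}
for user, posts in posts_by_user.items():
    print(user, f"{mean_pairwise_similarity(posts):.0%}")
```

The number of pairs grows quadratically with the number of posts per user, so for large forums this would need sampling or an index-backed approach.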

Solr MoreLikeThis not working for multiple shards?

橙三吉。, posted on 2019-12-10 11:02:27
Question: I have a 5-node SolrCloud cluster with 2 shards per node, Solr version 6.3.0. When I run an mlt query, it only returns results per node and doesn't distribute the query over all shards/nodes, i.e.

    http://10.0.1.15:8983/solr/test_ingest/mlt?q=advertising_id%w72w9424620427042&fl=score&fl=advertising_id&mlt.fl=channel_name&mlt.fl=show_name&mlt.fl=language&mlt.mindf=1

gives no results, while

    http://10.0.1.119:8983/solr/test_ingest/mlt?q=advertising_id%w72w9424620427042&fl=score&fl=advertising_id&mlt.fl

Principles and Code Analysis of MoreLikeThis, Lucene's Similarity Search Component

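折月煮酒, posted on 2019-12-09 19:55:00
MoreLikeThis is a contributed module of Lucene that usefully extends its Query-related functionality. MoreLikeThis provides a set of interfaces for similarity search, making it easier to implement our own similarity search.

What is similarity search: As I understand it, similarity search means finding other results related to a given search result. It offers users an alternative to standard search (query -> results): starting from a result that matches their intent well, they search for new results (result -> results).

Analysis of the MoreLikeThis design: First, to interoperate well with Lucene and to extend it, MoreLikeThis provides a method that returns a Query object, i.e. Lucene's query object. As long as Lucene searches with this object, it obtains similar results, so MoreLikeThis and Lucene integrate seamlessly; Solr provides a good example of this. The method MoreLikeThis provides is as follows:

    /**
     * Return a query that will return docs like the passed lucene document ID.
     *
     * @param docNum the documentID of the lucene doc to generate the 'More Like This" query for.
     * @return a query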

Is it possible to use a more-like-this query on nested fields?

老子叫甜甜, posted on 2019-12-08 12:46:27
Question: I have an "event" type based on a (nested) press article, including the title and the text, which both have multifields. I've tried:

    {
      "query": {
        "nested": {
          "path": "article",
          "query": {
            "mlt": {
              "fields": ["article.title.search", "article.text.search"],
              "max_query_terms": 20,
              "min_term_freq": 1,
              "include": "false",
              "like": [{
                "_index": "myindex",
                "_type": "event",
                "doc": {
                  "article": {
                    "title": "this is the title",
                    "text": "this is the body of the article"
                  }
                }
              }]
            }
          }
        }
      }
    }

But it always returns 0 hits

Solr FieldCollapsing for More Like This queries

只愿长相守, posted on 2019-12-08 07:33:45
Question: I want to use a "More Like This" query to find similar documents and collapse those that have the same value for the field 'image'. I tried to use the Field Collapsing parameters; however, they do not seem to work for "More Like This". Below is a snippet of my code. Can you tell me how to collapse results using the "More Like This" query?

    $url = "http://{$host}:{$port}/solr/{$core}/mlt";
    $data = [
        'stream.body' => $content,
        'fl' => 'image,content,title,signature',
        'start' => 0,
        'order' =>