solr

Using Solr for indexing multiple languages

谁都会走 posted on 2020-01-01 02:38:30
Question: We're setting up Solr to index documents whose title field can be in various languages. After googling I found two options: (1) define a separate schema field for every language, i.e. title_en, title_fr, ..., applying different filters to each language, then query the title field that corresponds to the language; (2) create a separate Solr core for each language and have our app query the correct core. Which one is better? What are the ups and downs? Thanks. Answer 1: There's also a third
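As an illustration of the first option in the question (one title field per language), here is a minimal client-side sketch in Python. It only builds the query string for a request to Solr's /select handler. The set of supported languages, the fallback to English, and the helper names are assumptions made for the example; only the title_<lang> naming comes from the question.

```python
from urllib.parse import urlencode

# Hypothetical set of languages that have their own title_<lang> field.
SUPPORTED_LANGS = {"en", "fr", "de"}


def title_field(lang: str, default: str = "en") -> str:
    """Return the per-language field name, e.g. 'title_fr'."""
    code = lang.lower()
    if code not in SUPPORTED_LANGS:
        code = default  # assumption: fall back to the English field
    return f"title_{code}"


def build_select_params(text: str, lang: str) -> str:
    """Build the URL-encoded query string for a phrase search on the title."""
    # Escape backslashes and quotes so the text can be sent as a quoted phrase.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return urlencode({"q": f'{title_field(lang)}:"{escaped}"', "wt": "json"})


if __name__ == "__main__":
    print(build_select_params("bonjour le monde", "fr"))
```

With the second option (one core per language) the same lookup would choose a core name in the URL path instead of a field name, so the routing logic in the application is of similar size either way.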

Is there a way to include stopwords when searching exact phrases in Solr?

五迷三道 posted on 2020-01-01 02:30:30
Question: I want stopwords excluded except when the search term is within double quotes, e.g. "just like that" should also search for "that". Is this possible? Answer 1: It depends on the configuration of the field you are querying. If the indexing analyzer includes a StopFilterFactory, then the stopwords are simply not indexed, so you cannot query for them afterward. But since Solr keeps the position of the terms in the index, you can instruct it to increment the position value of the
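The point about term positions can be pictured with a toy model in Python. This is not Lucene or Solr code: the stopword list, the whitespace tokenizer, and the function names are assumptions for illustration only. The sketch drops stopwords but keeps each remaining token's original position, so a phrase still lines up across the gap a stopword left behind.

```python
# Hypothetical stopword list for the toy analyzer.
STOPWORDS = {"that", "the", "a", "of"}


def analyze(text, stopwords=STOPWORDS):
    """Lowercase, split on whitespace, drop stopwords, keep original positions."""
    return [(pos, tok) for pos, tok in enumerate(text.lower().split())
            if tok not in stopwords]


def phrase_matches(doc_text, phrase):
    """True if the analyzed phrase occurs in the document with the same gaps."""
    query = analyze(phrase)
    if not query:
        return False  # the phrase consisted only of stopwords
    positions = {}
    for pos, tok in analyze(doc_text):
        positions.setdefault(tok, set()).add(pos)
    first_pos, first_tok = query[0]
    # Try every occurrence of the first query term as a possible start.
    for start in positions.get(first_tok, ()):
        if all(start + (pos - first_pos) in positions.get(tok, ())
               for pos, tok in query[1:]):
            return True
    return False


if __name__ == "__main__":
    print(phrase_matches("it was just like that day", "just like that"))  # True
    print(phrase_matches("just like this", "just like that"))             # True
    print(phrase_matches("just a like", "just like"))                     # False
```

The second call shows the limitation stated in the answer: because "that" was never indexed, the phrase cannot tell "that" apart from any other word in the same slot.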

Declaring unique key as int in solr results in error

非 Y 不嫁゛ posted on 2020-01-01 02:27:20
Question: Declaring <field name="id" type="int" indexed="true" stored="true" required="true" multiValued="false" /> in schema.xml results in the following error: HTTP Status 500 - {msg=SolrCore 'collection1' is not available due to init failure: Error initializing QueryElevationComponent.,trace=org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: Error initializing QueryElevationComponent. at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:860

How to run Solr 4 in Tomcat locally?

↘锁芯ラ posted on 2020-01-01 01:51:09
Question: I've been trying to set up Solr 4.3 on my home PC (in Tomcat 7) but it doesn't run. I have set up Tomcat and deployed the solr.war file, which unpacks and shows up in the Tomcat Web Apps Manager screen, but it's not running, and clicking the start button doesn't do anything (it should already be running in the first place). Here is my solr.xml context file in Tomcat, which also gives the path to the solr.war file and the location of my Solr cores (the default Collection1):

What is the easiest way to implement terms association mining in Solr?

亡梦爱人 posted on 2019-12-31 21:44:20
Question: Association mining seems to give good results for retrieving related terms in text corpora. There are several works on this topic, including the well-known LSA method. The most straightforward way to mine associations is to build a co-occurrence matrix of docs × terms and find the terms that occur in the same documents most often. In my previous projects I implemented it directly in Lucene by iterating over TermDocs (which I got by calling IndexReader.termDocs(Term)). But I can't see anything similar in
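The co-occurrence approach described in the question can be sketched in a few lines of plain Python, independent of Solr or Lucene. The whitespace tokenizer, the sample documents, and the function names are assumptions for the example; a real setup would read terms from the index rather than from raw strings.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """For each unordered term pair, count the documents that contain both."""
    counts = Counter()
    for text in docs:
        # Use a set so a term repeated within one document counts once.
        terms = sorted(set(text.lower().split()))
        counts.update(combinations(terms, 2))
    return counts


def related_terms(counts, term, top_n=3):
    """Return the terms that share the most documents with `term`."""
    scored = Counter()
    for (a, b), n in counts.items():
        if a == term:
            scored[b] += n
        elif b == term:
            scored[a] += n
    return scored.most_common(top_n)


if __name__ == "__main__":
    docs = ["solr lucene search", "solr lucene index", "solr tomcat"]
    counts = cooccurrence_counts(docs)
    print(related_terms(counts, "solr", top_n=1))  # [('lucene', 2)]
```

Raw counts favor frequent terms, so a weighting step (for example dividing by each term's document frequency) is usually needed before the results are useful. That step is left out here to keep the sketch short.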

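Applying natural language processing in real life

浪尽此生 posted on 2019-12-31 15:48:09
Building your own search tool. Author: 白宁超, 2016-04-12 16:31:48. Abstract: Search has become an indispensable part of daily life, whether on Baidu or Google, when finding friends on WeChat, or when looking up a keyword in a piece of text. E-commerce sites such as Amazon, JD.com, Tmall, and Suning also do distinctive things with search (faceted search and so on). For developers, search is often a key feature of an application, especially one driven by large-scale text data. Another kind of search, such as intelligent voice retrieval, uses classification, clustering, neural networks, and similar methods for model evaluation and returns reasonably good matches to the user. It should be stressed that this kind of search returns fuzzy, approximate results ranked by a scoring mechanism, which differs from traditional exact lookup. The scoring of such results relies mainly on precision and recall. The advantages of building your own search tool are flexibility, low development cost, a better understanding of your own tool, and of course the price: it is free. The author spent a great deal of time collecting material, researching, and experimenting, and shares the results here. (This article is original; please cite the source when reposting: Building your own search tool.) Contents: [Text Mining (0)] A quick introduction to natural language processing; [Text Mining (1)] OpenNLP: taming text, all about tokenization; [Text Mining (2)] [NLP] Tika text preprocessing: extracting content from files of various formats; [Text Mining (3)] Building your own search tool. 1 Introduction to the Apache Solr search server. 1.1. What is Solr? Solr is an open-source search server based on Lucene (Java) that is easy to add to Web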

Changing the default operator from OR to AND in Solr (Magento Enterprise)

戏子无情 posted on 2019-12-31 10:42:14
Question: I'm using Solr with Magento Enterprise. I'm trying to change the default search operator from OR to AND to make searches more specific by default. The first thing I tried was to change defaultOperator in schema.xml, which did not have the desired effect (it started using AND between fields, not keywords). <solrQueryParser defaultOperator="AND"/> I then read about LocalParams and tried adding that to several requestHandler sections in solrconfig.xml (I'm just guessing where it's supposed to
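Besides the schema-level setting, Solr's standard query parser accepts a per-request q.op parameter that sets the default operator between the terms of q. The Python sketch below only builds such a query string. Whether Magento's request handler honors q.op depends on how that handler is configured, which the question does not show, so treat this as an assumption to verify rather than a fix. The function name is hypothetical.

```python
from urllib.parse import urlencode


def select_params(user_query: str, operator: str = "AND") -> str:
    """Build a /select query string that sets the default operator per request."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    # q.op applies between the terms of q when no explicit operator is given.
    return urlencode({"q": user_query, "q.op": operator, "wt": "json"})


if __name__ == "__main__":
    print(select_params("red shoes"))  # q=red+shoes&q.op=AND&wt=json
```

Sending the operator with each request keeps the change local to the search code and leaves schema.xml untouched, which makes it easy to compare AND and OR results side by side.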

Suggester(Auto completion) search in solr using NGrams (one collation for Suggester Component)

南楼画角 posted on 2019-12-31 07:19:11
Question: I'm working on autocomplete search with Solr using EdgeNGrams. I use Solr 3.3 and I would like to use collations from the Suggester as an autocomplete solution for multi-term searches. Unfortunately the Suggester returns only one collation for a multi-term search. If the user is searching for names of employees, then autocompletion should be applied, i.e. I want results like Google search. It works fine for me with the configuration below. schema.xml <fieldType name="edgytext" class="solr.TextField"
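To make the EdgeNGram idea concrete, the Python sketch below generates the leading-edge n-grams of each token and checks whether typed prefixes would match. It is a toy model rather than Solr's filter. The parameter names mirror the minGramSize and maxGramSize attributes of EdgeNGramFilterFactory, but the default values and helper names here are assumptions, since the question's configuration is cut off.

```python
def edge_ngrams(token: str, min_gram: int = 1, max_gram: int = 15):
    """Return the prefixes of `token` with lengths min_gram..max_gram."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(name: str, min_gram: int = 1, max_gram: int = 15):
    """Terms a toy index would hold for `name`: edge n-grams of each word."""
    terms = set()
    for tok in name.lower().split():
        terms.update(edge_ngrams(tok, min_gram, max_gram))
    return terms


def matches_prefix(name: str, typed: str) -> bool:
    """True if every typed word is a prefix of some word in `name`."""
    terms = index_terms(name)
    return all(tok in terms for tok in typed.lower().split())


if __name__ == "__main__":
    print(edge_ngrams("solr", 2, 3))             # ['so', 'sol']
    print(matches_prefix("John Smith", "jo sm"))  # True
    print(matches_prefix("John Smith", "ja"))     # False
```

Because each typed word is matched independently against the indexed prefixes, a multi-term input such as "jo sm" can match a full name without relying on collations.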

Solr: where to store additional information?

和自甴很熟 posted on 2019-12-31 05:19:34
Question: I want to provide additional information for each indexed document at index time, and access this information in the same analyzer at query time to compare against it. So, theoretically, it would be great to write this value into some field of the document and also search that field at query time. For example, I have an animals db and I want to find all documents that contain the word 'dog' 3 times (just an example). I can set up my custom BaseTokenFilterFactory for my "animals" field, which will produce