solr | 易学教程

Using Solr for indexing multiple languages

Question: We're setting up Solr to index documents whose title field can be in various languages. After some googling I found two options: (1) define a separate schema field for each language (title_en, title_fr, ...), apply different filters to each, and query the title field that corresponds to the language; (2) create a separate Solr core for each language and have our app query the correct core. Which one is better? What are the upsides and downsides? Thanks. Answer 1: There's also a third
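As a rough illustration of the first option in the question (one title field per language), the application would pick the field name from the document or query language. The sketch below is only an assumption about how that selection might look: `SUPPORTED_LANGUAGES`, `FALLBACK_FIELD`, `title_field`, and `build_query_params` are hypothetical names, and only the `title_<lang>` naming pattern comes from the question.

```python
# Hypothetical sketch: choose a per-language title field and build query parameters.
# Only the title_<lang> naming pattern is taken from the question; everything else is assumed.

SUPPORTED_LANGUAGES = {"en", "fr"}  # assumed set; extend to match the real schema
FALLBACK_FIELD = "title_en"         # assumed fallback for unsupported languages


def title_field(lang):
    """Return the schema field that holds titles in the given language."""
    code = lang.lower()
    return f"title_{code}" if code in SUPPORTED_LANGUAGES else FALLBACK_FIELD


def build_query_params(text, lang):
    """Build a parameter dict for a phrase query against the matching title field."""
    # Escape backslashes and double quotes so the text stays inside the quoted phrase.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return {"q": f'{title_field(lang)}:"{escaped}"', "wt": "json"}


if __name__ == "__main__":
    print(build_query_params("bonjour", "fr"))
```

The second option (one core per language) would move the same decision into the URL of the core being queried instead of the field name.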

Is there a way to include stopwords when searching exact phrases in Solr?

Question: I want stopwords excluded except when the search term is within double quotes, e.g. "just like that" should also search for "that". Is this possible? Answer 1: It depends on the configuration of the field you are querying. If the index-time analyzer includes a StopFilterFactory, then the stopwords are simply not indexed, so you cannot query for them afterward. But since Solr keeps the position of the terms in the index, you can instruct it to increment the position value of the
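To make the idea of term positions in the answer more concrete, here is a toy sketch of a stop filter that drops stopwords but leaves their position slots empty. It is not Solr's or Lucene's implementation; the `STOPWORDS` set and the `analyze` function are assumptions made purely for illustration.

```python
# Toy illustration (not Solr code): remove stopwords but keep the original positions,
# so a gap remains where each stopword used to be.

STOPWORDS = {"that", "the", "a"}  # assumed stopword list


def analyze(text, stopwords=STOPWORDS):
    """Return (term, position) pairs, skipping stopwords but preserving position numbers."""
    result = []
    for position, term in enumerate(text.lower().split()):
        if term not in stopwords:
            result.append((term, position))
    return result


if __name__ == "__main__":
    # "that" is dropped, yet "dog" keeps position 2, leaving a gap at position 1.
    print(analyze("like that dog"))
```

Because the gap is preserved, a phrase match can still tell that "like" and "dog" were not adjacent in the original text, even though the stopword itself is no longer searchable.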

Declaring unique key as int in solr results in error

Question: Declaring <field name="id" type="int" indexed="true" stored="true" required="true" multiValued="false" /> in schema.xml results in the following error: HTTP Status 500 - {msg=SolrCore 'collection1' is not available due to init failure: Error initializing QueryElevationComponent.,trace=org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: Error initializing QueryElevationComponent. at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:860

How to run Solr 4 in Tomcat locally?

Question: I've been trying to set up Solr 4.3 on my home PC (in Tomcat 7), but it doesn't run. I have set up Tomcat and deployed the solr.war file, which unpacks and shows up in the Tomcat Web Apps Manager screen, but it's not running, and clicking the start button doesn't do anything (it should already be running in the first place). Here is my solr.xml context file in Tomcat, which also gives the path to the solr.war file and to my Solr cores (the default Collection1):

What is the easiest way to implement terms association mining in Solr?

Question: Association mining seems to give good results for retrieving related terms in text corpora. There are several works on this topic, including the well-known LSA method. The most straightforward way to mine associations is to build a co-occurrence matrix of docs X terms and find the terms that occur in the same documents most often. In my previous projects I implemented it directly in Lucene by iterating over TermDocs (obtained by calling IndexReader.termDocs(Term)). But I can't see anything similar in
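The co-occurrence approach described in the question can be sketched outside Solr and Lucene as plain counting. The example below is a minimal sketch under stated assumptions: documents are whitespace-tokenized strings, and `cooccurrence_counts` and `related_terms` are hypothetical helper names, not part of any Solr or Lucene API.

```python
# Minimal sketch of document-level term co-occurrence counting (no Solr/Lucene involved).
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count, for each unordered term pair, the number of documents containing both terms."""
    counts = Counter()
    for doc in docs:
        # Use a sorted set so each pair is counted once per document in a canonical order.
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            counts[(a, b)] += 1
    return counts


def related_terms(counts, term, top_n=3):
    """Return the terms that share the most documents with the given term."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == term:
            scores[b] += n
        elif b == term:
            scores[a] += n
    return scores.most_common(top_n)


if __name__ == "__main__":
    docs = ["solr lucene search", "solr lucene index", "solr cache"]
    counts = cooccurrence_counts(docs)
    print(related_terms(counts, "solr", top_n=1))
```

Raw counts favor frequent terms, so a real system would likely normalize them (for example by term frequency), which this sketch leaves out.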

Natural language processing applied in real life

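Building Your Own Search Tool. Author: Bai Ningchao (白宁超), 2016-04-12 16:31:48. Abstract: Search has become an indispensable part of daily life, whether through Baidu or Google, looking up friends on WeChat, or finding keywords in a piece of text. E-commerce sites such as Amazon, JD.com, Tmall, and Suning also have their own refinements to search (faceted search and so on). For developers, search is often a key feature of most applications, especially applications driven by large-scale text data. Another kind of search, such as intelligent voice retrieval, uses classification, clustering, neural networks, and similar methods for model evaluation and returns reasonably good matches to the user. It is worth stressing that such systems return fuzzy, approximate results ranked by a scoring mechanism, which differs from traditional exact matching. The scoring of these results relies mainly on precision and recall. The advantages of building your own search tool are flexibility, low development cost, a better understanding of your own tool, and of course the fact that it is free. The author spent considerable time collecting material, researching, and experimenting to produce this article, which is intended to share technical knowledge. (This article is original; please credit the source when reposting: Building Your Own Search Tool.) Contents: [Text Mining (0)] A quick introduction to natural language processing; [Text Mining (1)] OpenNLP: taming text, all about tokenization; [Text Mining (2)] [NLP] Tika text preprocessing: extracting content from files in various formats; [Text Mining (3)] Building your own search tool. 1 Introduction to the Apache Solr search server. 1.1. What is Solr? Solr is an open-source search server based on Lucene Java that is easy to add to Web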

Changing the default operator from OR to AND in Solr (Magento Enterprise)

Question: I'm using Solr with Magento Enterprise. I'm trying to change the default search operator from OR to AND to make searches more specific by default. The first thing I tried was to change defaultOperator in schema.xml, which did not have the desired effect (it started using AND between fields, not keywords). <solrQueryParser defaultOperator="AND"/> I then read about LocalParams and tried adding that to several requestHandler sections in solrconfig.xml (I'm just guessing where it's supposed to

Suggester(Auto completion) search in solr using NGrams (one collation for Suggester Component)

Question: I'm working on autocompletion search with Solr using EdgeNGrams. I use Solr 3.3, and I would like to use collations from the Suggester as an autocomplete solution for multi-term searches. Unfortunately, the Suggester returns only one collation for a multi-term search. If the user is searching for names of employees, autocompletion should be applied, i.e. I want results like Google search. It works fine for me with the configuration below. schema.xml <fieldType name="edgytext" class="solr.TextField"

Solr: where to store additional information?

Question: I want to provide additional information for each indexed document at index time, and access this information in the same analyzer at query time to compare it. So, theoretically, it would be great to write this value into some field present in the document and also search this field at query time. For example, I have an animals db. I want to find all documents that contain the word 'dog' 3 times (just an example). I can set up a custom BaseTokenFilterFactory for my "animals" field which will produce