solr | 易学教程

Solr 4.4: StopFilterFactory and enablePositionIncrements


Question: While attempting to upgrade from Solr 4.3.0 to Solr 4.4.0, I ran into this exception: java.lang.IllegalArgumentException: enablePositionIncrements=false is not supported anymore as of Lucene 4.4 as it can create broken token streams. That led me to this issue. I need to be able to match queries irrespective of intervening stopwords (which used to work with enablePositionIncrements="true"). For instance: "foo of the bar" would find documents matching "foo bar", "foo of bar", and "foo of the bar
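A toy illustration in plain Python (not Lucene code) of what position increments mean for a phrase like the one in the question. It assumes a stopword list of just "of" and "the", and the function name and `keep_gaps` flag are made up for this sketch. When gaps are kept, a token that follows removed stopwords stays at its original position; when they are collapsed, the surviving tokens are numbered consecutively.

```python
STOPWORDS = {"of", "the"}  # assumed stopword list, for illustration only


def token_positions(text, keep_gaps):
    """Return (token, position) pairs after dropping stopwords.

    keep_gaps=True: a kept token retains the position it had in the
    original text, so removed stopwords leave holes.
    keep_gaps=False: kept tokens are numbered consecutively, as if the
    stopwords had never been there.
    """
    result = []
    next_pos = 0
    for original_pos, token in enumerate(text.lower().split()):
        if token in STOPWORDS:
            continue
        result.append((token, original_pos if keep_gaps else next_pos))
        next_pos += 1
    return result


print(token_positions("foo of the bar", keep_gaps=True))   # [('foo', 0), ('bar', 3)]
print(token_positions("foo of the bar", keep_gaps=False))  # [('foo', 0), ('bar', 1)]
```

In this toy model, "foo of the bar" and "foo bar" produce identical positions only when the gaps are collapsed, which is the matching behaviour the question is asking for.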

How to exclude fields in a SOLR query


Question: I have a Solr query which should fetch all the fields I store, except one. Say I have 20 fields: do I need to hard-code the 19 fields I want to fetch in &fl=[f],[f],[f],...,[f]? Or is there a way to do something like &fl=*,![f]? Here [f] stands for a field name. Answer 1: Unfortunately, the ability to remove a field name via the query string is still an outstanding improvement request. Please see SOLR-3191 for more details. Until this improvement is implemented, you will need to specify
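One way to avoid hard-coding the 19 names by hand is to build the fl value on the client. This is a minimal sketch, assuming the application already knows its list of stored fields; the helper name `fl_without` and the field names are hypothetical.

```python
from urllib.parse import urlencode


def fl_without(all_fields, excluded):
    """Build an fl value listing every field except those in `excluded`."""
    excluded = set(excluded)
    return ",".join(f for f in all_fields if f not in excluded)


# Hypothetical stored fields; replace with the fields from your own schema.
stored_fields = ["id", "title", "author", "body"]

fl = fl_without(stored_fields, ["body"])
print(fl)  # id,title,author

# URL-encoded query string that could be appended to a /select request.
print(urlencode({"q": "*:*", "fl": fl}))
```

The field list lives in one place, so excluding a different field only changes the second argument.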

What is omitNorms and version field in solr schema?


Question: I do not understand when to use omitNorms="true". I have read two or three links, but its meaning is still not clear to me. What does the following mean? "Set to true to omit the norms associated with this field (this disables length normalization and index-time boosting for the field, and saves some memory). Only full-text fields or fields that need an index-time boost need norms." (from the http://wiki.apache.org/solr/SchemaXml page) Answer 1: Norms are stored as a single byte of information in the index per document per field.
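Taking the answer's figure of one byte per document per field at face value, the cost of norms is simple arithmetic. The numbers below are hypothetical and only show the order of magnitude that omitNorms="true" can save.

```python
def norms_bytes(num_docs, num_fields_with_norms):
    """Estimate norm storage at one byte per document per field."""
    return num_docs * num_fields_with_norms


# Hypothetical index: 10 million documents, 5 fields that keep norms.
print(norms_bytes(10_000_000, 5))  # 50000000 bytes, roughly 48 MiB

# Same index after setting omitNorms="true" on 3 of those fields.
print(norms_bytes(10_000_000, 2))  # 20000000 bytes
```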

Solr: DIH for multilingual index & multiValued field?


Question: I have a MySQL table: CREATE TABLE documents ( id INT NOT NULL AUTO_INCREMENT, language_code CHAR(2), tags CHAR(30), text TEXT, PRIMARY KEY (id) ); I have 2 questions about Solr DIH: 1) The language_code field indicates what language the text field is in. Depending on the language, I want to index text to different Solr fields. # pseudo code if language_code == "en": index "text" to Solr field "text_en" elif language_code == "fr": index "text" to Solr field "text_fr" elif language_code ==
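The routing rule in the pseudocode can be spelled out as a small runnable function. This is not DIH configuration; it is only a sketch of the per-row logic, assuming each row arrives as a dict keyed by the table's column names. The function name and the `SUPPORTED_LANGUAGES` set are hypothetical, and only the two languages visible in the question are included.

```python
SUPPORTED_LANGUAGES = {"en", "fr"}  # only the languages shown in the question


def to_solr_doc(row):
    """Map one row of the documents table to a Solr document dict.

    The text is routed to the field text_<language_code>,
    for example text_en or text_fr.
    """
    lang = row["language_code"]
    if lang not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language_code: {lang!r}")
    return {"id": row["id"], f"text_{lang}": row["text"]}


print(to_solr_doc({"id": 1, "language_code": "fr", "tags": "", "text": "bonjour"}))
# {'id': 1, 'text_fr': 'bonjour'}
```

Deriving the field name from the language code keeps the mapping in one line instead of one branch per language.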

How to access Solr from an external IP address?


Question: I have Solr running on my server on localhost in the Jetty container. This seems like an obvious question, but how do I access the web interface from outside the server itself, for example from an external IP address? Obviously, authentication will be an important part of any solution. I am also running Apache2 on the server, if that is a good solution. I'm surprised I can't find anything about this. Answer 1: I finally stumbled upon an answer to this. I don't really need persistent access to the Solr

Facet and Group usage in Solr


Group partitions the results into groups and returns the grouped documents; Facet computes per-group statistics and returns the number of documents in each group.
1. Group usage:
    // Basic group query configuration
    params.set(GroupParams.GROUP, "true");
    // Group the results by the value of the dkeys field; dkeys should preferably not be tokenized
    params.set(GroupParams.GROUP_FIELD, "dkeys");
    params.set(GroupParams.GROUP_LIMIT, "5");
    params.set(GroupParams.GROUP_FORMAT, "grouped");
    params.set(GroupParams.GROUP_MAIN, "false");
Iterating over the Group query results:
    QueryResponse response = solrServer.query(query);
    GroupResponse groupResponse = response.getGroupResponse();
    List<GroupCommand> ls = groupResponse.getValues();
    for (GroupCommand gc : ls) {
        List<Group> list = gc.getValues();
        for (Group g : list) {
            SolrDocumentList sdl
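The difference between the two result shapes can be shown on an in-memory list, without a Solr server. This is a conceptual sketch only: the helper names and sample documents are made up, and `limit` plays the role of the per-group limit of 5 set above.

```python
from collections import Counter, defaultdict


def group_docs(docs, field, limit):
    """Group-style result: the documents themselves, at most `limit` per group."""
    groups = defaultdict(list)
    for doc in docs:
        bucket = groups[doc[field]]
        if len(bucket) < limit:
            bucket.append(doc)
    return dict(groups)


def facet_counts(docs, field):
    """Facet-style result: only the number of documents per value."""
    return dict(Counter(doc[field] for doc in docs))


# Hypothetical documents with an untokenized dkeys field.
docs = [
    {"id": 1, "dkeys": "a"},
    {"id": 2, "dkeys": "b"},
    {"id": 3, "dkeys": "a"},
]

print(group_docs(docs, "dkeys", limit=5))
# {'a': [{'id': 1, 'dkeys': 'a'}, {'id': 3, 'dkeys': 'a'}], 'b': [{'id': 2, 'dkeys': 'b'}]}
print(facet_counts(docs, "dkeys"))
# {'a': 2, 'b': 1}
```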

How do we create a simple search engine using Lucene, Solr or Nutch?


Question: Our company has thousands of PDF documents. How do we create a simple search engine using Lucene, Solr or Nutch? We'll provide a basic Java/JSP web page where people can type in words and perform basic AND/OR queries, and then show them links to all matching PDFs. Answer 1: None of the projects in the Lucene family can natively process PDFs, but there are utilities you can drop in and well-written examples of how to roll your own. Lucene will do pretty much whatever you need it to do, but

Upgrade solr 1.4 index to solr 3.3?


Question: I have an existing index built using Apache Solr 1.4. I want to use this existing index in version 3.3. As you know, the index format has changed since 3.x, so how is it possible to do this? I have exported the existing index (the 1.4 version) to XML using Luke. Answer 1: There are two ways to do this. If your index is unoptimized, simply optimize it; this will upgrade the file format along the way. If your index is already optimized, you can't do this. Instead, use the command line tool

how to reduce solr memory usage?


Question: I use Solr in my application, and there are just hundreds of documents. The memory usage is about 80M; how can I reduce it? Answer 1: 80M is not much. In fact it's pretty much the minimum, and you won't go much lower than that. Some factors that affect memory usage: input document size, multi-threaded document updates, cache size, facet queries, and sorting. References: http://wiki.apache.org/solr/SolrPerformanceFactors#Factors_affecting_memory_usage http://www.nabble.com/Debugging-Solr-memory-usage-heap-problems

Solr associations


Question: For the last couple of days we have been thinking of using Solr as our search engine of choice. Most of the features we need work out of the box or can be easily configured. There is, however, one feature that we absolutely need that seems to be well hidden (or missing) in Solr. I'll try to explain with an example. We have lots of documents that are actually businesses: <document> <name>Apache</name> <cat>1</cat> ... </document> <document> <name>McDonalds</name> <cat>2</cat> ... </document> In addition we