solr | 易学教程

Exact field search with solr/lucene

Question: I have a text field. For a given query, I want to find all documents whose indexed field value is contained in the query, i.e. query.contains(document.field_name). Examples: 1. field_name:"a b" 2. field_name:"a b c". For the query "a b d" I want to find only the first item. An inefficient way to do this is to generate all substrings of the query and index the field as a string. Is it possible to implement this requirement in Solr using existing functionality? If not, what is the most efficient algorithm or way to do this? PS.
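The brute-force approach the question mentions can be sketched as follows. This is only an illustration of that idea, not a recommended solution: it assumes the field is indexed as an untokenized string field, so a quoted value only matches when it equals the whole stored value. The helper names `contiguous_phrases` and `build_exact_match_query` are hypothetical.

```python
def contiguous_phrases(query: str) -> list[str]:
    """Return every contiguous word sequence of the query, longest first."""
    words = query.split()
    phrases = []
    for length in range(len(words), 0, -1):
        for start in range(len(words) - length + 1):
            phrases.append(" ".join(words[start:start + length]))
    # Drop duplicates (e.g. for "a a") while keeping the original order.
    return list(dict.fromkeys(phrases))


def build_exact_match_query(field: str, query: str) -> str:
    """OR together one quoted exact-value clause per contiguous phrase."""
    # Assumption: `field` is a string (untokenized) field in the schema.
    clauses = []
    for phrase in contiguous_phrases(query):
        escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append('"' + escaped + '"')
    return field + ":(" + " OR ".join(clauses) + ")"


if __name__ == "__main__":
    print(build_exact_match_query("field_name", "a b d"))
```

For the query "a b d" the generated clauses include "a b" but not "a b c", so only the first example document would match. The number of clauses grows quadratically with the number of words in the query, which is why the question calls this approach inefficient.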

Partial schema sharing - is it possible?

Question: We have a large number of Solr cores, which all share the same fields but have differing definitions for certain field types (e.g. for different languages). For instance, in the following example, a full_text field of type text exists in two different cores, but I apply different filters to the text field type in each core.

Core 1

    <fields>
      <field name="full_text" type="text" indexed="true" />
    </fields>
    <types>
      <fieldType name="text" class="solr.TextField">
        <analyzer type="index">

indexing document to a specific collection using spring data solr

Question: I am trying to index a document into a specific collection in Solr. The collection name is 'program'. I am using Spring Data Solr. I get the following error when trying to save the document: HTTP ERROR 404 Problem accessing /solr/update. Reason: Not Found. My assumption is that the @SolrDocument annotation is not recognized: spring-data-solr is trying to post the document to /solr/update, whereas it should post it to /solr/program/update. However, I am not sure how to prove it or fix it.
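To make the difference between the two endpoints concrete, here is a minimal sketch of how the update URL changes once a collection name is included. It does not use or reproduce Spring Data Solr; the function `update_url` and the base URL `http://localhost:8983/solr` are assumptions made for illustration.

```python
from typing import Optional


def update_url(base_url: str, collection: Optional[str] = None) -> str:
    """Build the update endpoint, with or without a collection segment."""
    base = base_url.rstrip("/")
    if collection:
        # Collection-specific endpoint, e.g. /solr/program/update
        return f"{base}/{collection}/update"
    # No collection in the path, which is the URL from the 404 above
    return f"{base}/update"


if __name__ == "__main__":
    # Hypothetical host and port, used only for illustration.
    print(update_url("http://localhost:8983/solr"))
    print(update_url("http://localhost:8983/solr", "program"))
```

Comparing the path in the 404 message against these two shapes is one way to check whether the collection name ever reached the client that builds the request.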

Solr DataImportHandler doesn't work with XML Files

Question: I'm very new to Solr. I succeeded in indexing data from my SQL database via DIH. Now I want to import XML files and index them via DIH as well, but it just won't work. My data-config.xml looks like this:

    <dataConfig>
      <dataSource type="FileDataSource" encoding="UTF-8" />
      <document>
        <entity name="dir" processor="FileListEntityProcessor" baseDir="/bla/test2" fileName=".*xml" stream="true" recursive="false" rootEntity="false">
          <entity name="PubmedArticle" processor="XPathEntityProcessor" transformer=

How to restrict Sunspot search with nested models?

Question: I want to filter the Sunspot search results using with(:is_available, true). This works with the User model, but I can't make it work with the Itinerary model.

app/controllers/search_controller.rb:

    class SearchController < ApplicationController
      before_filter :fulltext_actions

      private

      def fulltext_actions
        @itineraries = do_fulltext_search(Itinerary)
        @users = do_fulltext_search(User)
        @itineraries_size = @itineraries.size
        @users_size = @users.size
      end

      def do_fulltext_search(model)
        Sunspot

How to modify search result page given by Solr?

Question: I intend to build a niche search engine. I am using apache-nutch-1.6 as the crawler and apache-solr-3.6.2 as the searcher. I must say there is very little up-to-date information on the web about these technologies. I followed this tutorial, http://wiki.apache.org/nutch/NutchTutorial , and have successfully installed apache and solr on my Ubuntu system. I was also successful in injecting the seed URLs into the webdb and performing the crawl. Using the Solr interface at http://localhost:8983/solr/admin , I can also query the

How do I get the tf and idf score from a Solr query?

Question: Following the Solr documentation (https://cwiki.apache.org/confluence/display/solr/Function+Queries and others), I should be able to put idf(fieldname, 'term') in the field list, just as I do with termfreq(fieldname, 'term'). However, whenever I try this I get an exception: org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request. Looking at the logs, I found: null:java.lang.UnsupportedOperationException: requires a TFIDFSimilarity (such as
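For reference, the field list described in the question could be assembled roughly as below. This is a sketch under assumptions: the helper `tf_idf_params`, the `id` field, and the `tf`/`idf` aliases are illustrative, and the term is assumed to contain no quote characters. As the logged exception indicates, `idf` only works when the field uses a TFIDFSimilarity.

```python
from urllib.parse import urlencode


def tf_idf_params(field: str, term: str, query: str = "*:*") -> dict:
    """Build request parameters that return termfreq and idf as pseudo-fields."""
    # Assumption: `term` contains no single quotes or backslashes.
    field_list = ",".join([
        "id",                                # assumed unique-key field name
        f"tf:termfreq({field},'{term}')",    # raw term frequency per document
        f"idf:idf({field},'{term}')",        # needs a TFIDFSimilarity on the field
    ])
    return {"q": query, "fl": field_list, "wt": "json"}


if __name__ == "__main__":
    params = tf_idf_params("fieldname", "term")
    # The encoded string can be appended to a core's /select handler URL.
    print(urlencode(params))
```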

Solr stops responding (or slows down to molasses)…(Solr newbie)

Question: I am running multi-core Solr under Tomcat 6.0 on Windows Server 2008, with ASP.NET queries via SolrNet. One of the cores is huge, i.e. ~25 million documents (~20 GB of disk space) and several fields. The other 3 cores are much smaller (a few gigabytes each). After a couple of queries to the large index, Solr slows down dramatically and stops responding, i.e. I can't even open the admin console. If I restart Tomcat, things work OK again for a few more queries and then slow to a crawl and stop. I have checked the machine RAM and

How to get a better Lucene/Solr score if word queried was at the beginning of the indexed field?

Question: At some point I read something explaining how to make Lucene/Solr give a better score when the queried word is found at the beginning of the description I indexed. I cannot find it on the net anymore. Does anybody have the links handy? Thank you.

Answer 1: Payloads could help you do that. Payloads let you give an arbitrary boost to any token of your token stream, so you can boost depending on anything: the position in the stream, the font weight, whether the token contains capital letters,