solr

Obtain metadata associated with matched content in Solr/Lucene

空扰寡人 submitted on 2019-12-25 09:33:20
Question: I have a large set of text documents that I will index with Solr, in a format where each line of text has associated metadata. For example:

    #metadata1 A line of text.
    #metadata2 Another long, broken line of
    #metadata3 text that should be searchable.

I'd like to index this so that the content, but not the metadata, is searchable, including phrase matches spanning multiple lines. However, I can't discard the metadata: any match should still carry its associated metadata. E.g. A
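One way to picture the requirement in the question above is a pre-processing step that strips the metadata tags, joins the lines into one searchable string, and remembers where each line starts so a match can be mapped back to its tags. The following is only a sketch under assumptions: the function names are hypothetical, every line is assumed to start with a single `#tag` followed by a space, and the character offsets of a match are assumed to be available from somewhere (they are computed locally here, not obtained from Solr).

```python
from bisect import bisect_right

def build_index_text(lines):
    """Split '#tag text' lines into one searchable string plus a line-offset table."""
    parts, starts, tags = [], [], []
    pos = 0
    for line in lines:
        tag, _, text = line.partition(" ")  # assumes exactly one leading '#tag'
        starts.append(pos)                  # offset where this line's text begins
        tags.append(tag.lstrip("#"))
        parts.append(text)
        pos += len(text) + 1                # +1 for the joining space
    return " ".join(parts), starts, tags

def metadata_for_span(starts, tags, begin, end):
    """Return the tags of every source line overlapped by text[begin:end]."""
    first = bisect_right(starts, begin) - 1
    last = bisect_right(starts, end - 1) - 1
    return tags[first:last + 1]

lines = [
    "#metadata1 A line of text.",
    "#metadata2 Another long, broken line of",
    "#metadata3 text that should be searchable.",
]
text, starts, tags = build_index_text(lines)
begin = text.index("broken line of text")
print(metadata_for_span(starts, tags, begin, begin + len("broken line of text")))
```

The joined string would go into the searchable field and the offset table into a stored, non-indexed field; how match offsets are recovered at query time is left open here.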

How to join two different cores from two different Solr servers?

烈酒焚心 submitted on 2019-12-25 09:20:08
Question: I have some cores on one Solr server and some cores on another Solr server, and I need to join them. The schemas of the cores are different, with no matching attribute names but matching attribute values. I tried join and shards, but neither worked. Can you help me out? attribute1 is in abc:7892/solr/core1; attribute2 and attribute3 are in xyz:8983/solr/core2. {!join from=attribute1 to=attribute2 fromIndex="xyz:8983/solr/core2"} attribute3:* Error message: Cross-core join: no such core
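The join query parser's `fromIndex` expects the name of a core on the same Solr node, not a host and port, which is consistent with the error quoted above. A possible workaround is a two-step join done by the client: fetch the matching key values from one server, then filter the other server by those values. The sketch below only builds the two request URLs. The `http://` scheme, the `rows` limit, the function names, and the direction of the join are assumptions, and the `{!terms}` filter assumes the key values contain no commas.

```python
from urllib.parse import urlencode

# Hosts and core names come from the question; the scheme is an assumption.
CORE1_URL = "http://abc:7892/solr/core1/select"
CORE2_URL = "http://xyz:8983/solr/core2/select"

def core2_keys_url(rows=1000):
    """Step 1: ask core2 for the attribute2 values of docs that have attribute3."""
    params = {"q": "attribute3:*", "fl": "attribute2", "rows": rows, "wt": "json"}
    return f"{CORE2_URL}?{urlencode(params)}"

def terms_filter(keys):
    """Build a terms-query filter on attribute1 from the keys returned by step 1."""
    return "{!terms f=attribute1}" + ",".join(sorted(set(keys)))

def core1_join_url(keys):
    """Step 2: restrict core1 to docs whose attribute1 is one of the fetched keys."""
    params = {"q": "*:*", "fq": terms_filter(keys), "wt": "json"}
    return f"{CORE1_URL}?{urlencode(params)}"

print(core2_keys_url())
print(core1_join_url(["k2", "k1", "k2"]))
```

For large key sets the second request would be better sent as a POST, since the filter can exceed URL length limits.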

Solr Facet and Tokenizer

自作多情 submitted on 2019-12-25 09:09:52
Question: I have a Solr array field that can contain strings made of several separate words as a single value, for example ["Super Ball", "BlaBla", "Info"]. I need to see all 3 of those values as facet values and also have case-insensitive search on the field. If I use the following field type, I see 3 values in the facet, but case-insensitive search doesn't work. <fieldType name="myLower" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.KeywordTokenizerFactory"/>

Solr Facet and Tokenizer

坚强是说给别人听的谎言 submitted on 2019-12-25 09:07:19
Question: I have a Solr array field that can contain strings made of several separate words as a single value, for example ["Super Ball", "BlaBla", "Info"]. I need to see all 3 of those values as facet values and also have case-insensitive search on the field. If I use the following field type, I see 3 values in the facet, but case-insensitive search doesn't work. <fieldType name="myLower" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.KeywordTokenizerFactory"/>

SOLR and accented characters

蹲街弑〆低调 submitted on 2019-12-25 09:05:14
Question: I have an index for occupations (identifier + occupation): <field name="occ_id" type="int" indexed="true" stored="true" required="true" /> <field name="occ_tx_name" type="text_es" indexed="true" stored="true" multiValued="false" /> <!-- Spanish --> <fieldType name="text_es" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words=

Compare strings of text between two tables in a database or locally

China☆狼群 submitted on 2019-12-25 08:42:17
Question: Edit: SQL doesn't work for this. I just found out about Solr/Sphinx and it seems like the right tool for this problem, so if you know Solr or Sphinx I'm eager to hear from you. Basically, I have a .tsv with patent info and a .csv with product names. I need to match each row of the patents column against the product names and extract the occurrences into a new .csv column. You can scroll down and see the example at the end. Original question: SQL newbie here, so bear with me :). I can't figure

Some word is not indexed in solr properly

余生长醉 submitted on 2019-12-25 08:28:21
Question: I don't know what is going wrong. http://IP_ADDRESS/solr/CORE_NAME/select?indent=on&q=Bangalore&wt=json There are more than 100 records in my database that contain the word Bangalore. However, the results contain just 2 records. The query below, on the other hand, works perfectly. http://IP_ADDRESS/solr/CORE_NAME/select?indent=on&q=Bangalor&wt=json Just by removing the letter e from Bangalore, I get many more results containing the word "Bangalore". I think the word "Bangalore" is not

Multiple shards on single machine performance

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-25 08:23:18
Question: Does it make sense to have multiple shards in Elasticsearch if I am going to use only a single machine? Will it improve performance in any way? Same question for Apache Solr: does it make sense to use SolrCloud with ZooKeeper for a single server instance, or should I just create one core without any sharding? Let's assume I am not going to use other machines in the future, so the main point is: how does sharding on a single machine influence search engine performance? Answer 1: There are certain parts of Lucene that's

What is the regular expression to remove spaces in SOLR

不问归期 submitted on 2019-12-25 08:14:20
Question: Using a regex in Solr, how can I remove all spaces, whether leading, trailing, or anywhere else in the value? To remove special characters, we can use PatternReplaceFilterFactory as <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z])" replacement="" replace="all" /> What pattern value would remove spaces wherever they occur? Answer 1: I don't know SOLR but based on your example I guess you could just do <filter class="solr.PatternReplaceFilterFactory" pattern="(\s+)" replacement=""
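To see what the suggested `\s+` pattern does with an empty replacement applied to every match, the same substitution can be tried outside Solr. This is only an illustration in Python, whose `re` module is not the Java regex engine Solr uses (for instance, Python's `\s` also matches non-ASCII whitespace by default); the function name and sample input are hypothetical.

```python
import re

def strip_all_whitespace(value: str) -> str:
    """Delete every run of whitespace: leading, trailing, and in between."""
    return re.sub(r"\s+", "", value)

print(strip_all_whitespace("  New   York\tCity \n"))
```

The capturing group in `(\s+)` is not needed for a plain deletion, so `\s+` behaves the same here.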

Solr Delta Import Query is not working

倖福魔咒の submitted on 2019-12-25 08:04:09
Question: I am trying to import data from MongoDB into Solr 6.0. The full import executes properly, but the delta import is not working. When I execute a delta import I get the result below. Requests: 0, Fetched: 0, Skipped: 0, Processed: 0 The queries in my data config file are as follows: query="" deltaQuery="db.getCollection('customer').find({'jDate':{$gt:'${dih.last_index_time}'}},{'_id' :1});" deltaImportQuery="db.getCollection('customer').find({'_id':'${dataimporter.delta.id}'})" The whole data-config.xml: <?xml