solr | 易学教程

Learning Solr Step by Step: What Is Solr?

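Introduction

Solr is a standalone enterprise search server that exposes a web-service-like API. Users can submit XML documents in a defined format to the search server over HTTP to build the index, and can issue search requests via HTTP GET and receive the results as XML.

Features

Solr is a standalone enterprise search server with a REST-like API. You put documents into it (called "indexing") via XML, JSON, CSV, or binary over HTTP. You query it via HTTP GET and receive XML, JSON, CSV, or binary results. Other features: advanced full-text search capabilities; optimized for high-volume web traffic; standards-based open interfaces (XML, JSON, and HTTP); comprehensive HTML administration interfaces; server statistics exposed over JMX for monitoring; linearly scalable, with automatic index replication, automatic failover, and recovery; near real-time indexing; flexible and adaptable XML configuration; extensible plugin architecture.

Solr uses the Lucene™ search library and extends it: a real data schema with numeric types, dynamic fields, and unique keys; powerful extensions to the Lucene query language; faceted search and filtering; geospatial search with support for multiple points per document and geo polygons; advanced, configurable text analysis; highly configurable and user-extensible caching; performance optimizations; external configuration via XML; an AJAX-based administration interface; monitorable logging; fast near real-time incremental indexing and index replication; highly scalable distributed search with the index sharded across multiple hosts.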
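The index-over-HTTP and query-over-GET workflow described above can be sketched roughly as follows. This is a minimal illustration, not an official client: the base URL, the core name `mycore`, and the helper names `build_add_xml` and `build_select_url` are assumptions made for the example. The sketch only builds the XML update message and the query URL; sending them is left to whatever HTTP client is in use.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumed base URL and core name; adjust to the actual installation.
SOLR_BASE = "http://localhost:8983/solr/mycore"


def build_add_xml(doc):
    """Return an <add><doc>...</doc></add> update message for one document."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_select_url(query, rows=10, wt="json"):
    """Return a GET URL for the select handler with an encoded query string."""
    params = {"q": query, "rows": rows, "wt": wt}
    return SOLR_BASE + "/select?" + urlencode(params)


if __name__ == "__main__":
    # The XML body would be POSTed to SOLR_BASE + "/update";
    # the select URL would be fetched with a plain HTTP GET.
    print(build_add_xml({"id": "1", "title": "hello"}))
    print(build_select_url("title:hello"))
```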

SolrEntityProcessor is called only once for sub-entities

Question: I'm using Solr 4.2, and I am trying to call SolrEntityProcessor as a sub-entity. So far, only one call is made to Solr and a single document is indexed, while all others are ignored. This should be possible, but it doesn't seem to work. Any ideas? Code snippet: <document> <entity dataSource="psql" name="user" query="SELECT * FROM users";> <field column="id" name="user_id" /> <entity name="liked_items" processor="SolrEntityProcessor" url="http://localhost:8983/solr/items" query="user_liking

Solr: Creating a Core (Part 2)

Create a Solr core:

[julong@localhost bin]$ ./solr create -c julong
Copying configuration to new core instance directory: /home/julong/solr-5.5.2/server/solr/julong
Creating new core 'julong' using command: http://localhost:8983/solr/admin/cores?action=CREATE&name=julong&instanceDir=julong
{ "responseHeader":{ "status":0, "QTime":3587}, "core":"julong"}

Delete a Solr core:

[julong@localhost bin]$ ./solr delete -c julong
Deleting core 'julong' using command: http://localhost:8983/solr/admin/cores?action=UNLOAD&core=julong&deleteIndex=true&deleteDataDir=true&deleteInstanceDir=true
{"responseHeader":{ "status"
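The console output above shows that the `solr` script ends up calling the CoreAdmin HTTP endpoint. As a rough sketch, the same two URLs can be assembled programmatically. The host and port are taken from the output above; the helper names `create_core_url` and `unload_core_url` are made up for this example.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the console output above.
ADMIN_CORES = "http://localhost:8983/solr/admin/cores"


def create_core_url(name):
    """URL that asks CoreAdmin to create a core whose instanceDir equals its name."""
    params = {"action": "CREATE", "name": name, "instanceDir": name}
    return ADMIN_CORES + "?" + urlencode(params)


def unload_core_url(name):
    """URL that unloads a core and also removes its index, data and instance dirs."""
    params = {
        "action": "UNLOAD",
        "core": name,
        "deleteIndex": "true",
        "deleteDataDir": "true",
        "deleteInstanceDir": "true",
    }
    return ADMIN_CORES + "?" + urlencode(params)


if __name__ == "__main__":
    print(create_core_url("julong"))
    print(unload_core_url("julong"))
```

Note that the `./solr create` command also copies a configuration set into the instance directory first; calling the CREATE URL alone does not do that step.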

Solr join “not in” subselect

Question: The Solr join documentation (Solr Join) says that: /solr/collection1/select ? fl=xxx,yyy & q={!join from=inner_id to=outer_id}zzz:vvv is equivalent to: SELECT xxx, yyy FROM collection1 WHERE outer_id IN (SELECT inner_id FROM collection1 where zzz = "vvv") How do I write the following in Solr (note the NOT): SELECT xxx, yyy FROM collection1 WHERE outer_id NOT IN (SELECT inner_id FROM collection1 where zzz = "vvv") Let's consider the following example: People Records: 1. name='a', id=1, teacherId=4 2.
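To make the IN versus NOT IN semantics concrete, here is a small in-memory sketch. It does not talk to Solr; the helper names and the sample records in the usage section are hypothetical and are not the question's data. The last helper builds one commonly suggested query form (match everything, then subtract the join via a nested `_query_` clause); treat it as an assumption to verify against your Solr version rather than as the accepted answer.

```python
def join_ids(docs, from_field, inner_pred):
    """Collect from_field values of docs that match the inner query."""
    return {d[from_field] for d in docs if from_field in d and inner_pred(d)}


def select_in(docs, from_field, to_field, inner_pred):
    """Docs whose to_field is among the joined ids (the plain join)."""
    ids = join_ids(docs, from_field, inner_pred)
    return [d for d in docs if d.get(to_field) in ids]


def select_not_in(docs, from_field, to_field, inner_pred):
    """Docs not matched by the join, including docs that lack to_field."""
    ids = join_ids(docs, from_field, inner_pred)
    return [d for d in docs if d.get(to_field) not in ids]


def not_in_query(from_field, to_field, inner_query):
    """One commonly suggested form: all docs minus the join (assumption)."""
    join = "{!join from=%s to=%s}%s" % (from_field, to_field, inner_query)
    return '*:* -_query_:"%s"' % join


if __name__ == "__main__":
    # Hypothetical sample records, made up for illustration.
    docs = [
        {"id": 1, "inner_id": 10, "zzz": "vvv"},
        {"id": 2, "outer_id": 10},
        {"id": 3, "outer_id": 11},
    ]
    matches_inner = lambda d: d.get("zzz") == "vvv"
    print(select_in(docs, "inner_id", "outer_id", matches_inner))
    print(select_not_in(docs, "inner_id", "outer_id", matches_inner))
    print(not_in_query("inner_id", "outer_id", "zzz:vvv"))
```

One difference from SQL is worth keeping in mind: subtracting the join from all documents also returns documents that have no `outer_id` at all, so an extra filter on that field may be needed.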

Solr/Lucene fieldCache OutOfMemory error sorting on dynamic field

Question: We have a Solr core that has about 250 TrieIntFields (declared as dynamicField). There are about 14M docs in our Solr index, and many documents have a value in many of these fields. We need to sort on all of these 250 fields over a period of time. The issue we are facing is that the underlying Lucene fieldCache fills up very quickly. We have a 4 GB box and the index size is 18 GB. After a sort on 40 or 45 of these dynamic fields, the memory consumption is about 90% and we

solr suggester not returning any results

Question: I've followed the Solr wiki article for the suggester almost to the letter: http://wiki.apache.org/solr/Suggester. I have the following XML in my solrconfig.xml: <searchComponent class="solr.SpellCheckComponent" name="suggest"> <lst name="spellchecker"> <str name="name">suggest</str> <str name="classname">org.apache.solr.spelling.suggest.Suggester</str> <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str> <str name="field">description</str> <float name="threshold">0.05<

Solr search for hashtag or mentions

Question: We are using Solr version 3.5 to search through tweets. I am using WordDelimiterFilterFactory with the following settings to be able to search for @username or #hashtags: <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0" splitOnNumerics="0" preserveOriginal="1" handleAsChar="@#"/> I saw the following patch, but it doesn't seem to be working as I expected. Am I missing something?

HTTP ERROR: 404 missing core name in path with solr

Question: I am new to Solr. After installing it on Ubuntu 8.10, I tried to index the exampledocs as per this link and got this error: HTTP ERROR: 404 missing core name in path. This is in Jetty. What should I do to solve this? Answer 1: I got the same error: HTTP ERROR: 404 missing core name in path. In my case I had forgotten to set the solr/home value in the WEB-INF/web.xml file: <env-entry> <env-entry-name>solr/home</env-entry-name> <env-entry-value>/put/your/solr/home/here</env-entry

How can I Schedule data imports in Solr

Question: The wiki page http://wiki.apache.org/solr/DataImportHandler explains how to index data using DataImportHandler, but the example uses a command to initiate the import operation. How can I schedule a job to do this on a regular basis? Answer 1: On UNIX/Linux, cron jobs are your friends! On Windows, there is Task Scheduler. UPDATE: To do it from Java code, since this is a simple GET request, you can use the HTTP Client library. See this tutorial on using the GetMethod. If you need to
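Since the import is triggered by a simple GET request, a scheduled job only needs a small script that issues that request. The following is a minimal sketch under stated assumptions: the core URL, the `/dataimport` handler path, the script name, and the `--run` flag are all made up for the example and must match the actual handler registration in solrconfig.xml.

```python
import sys
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed core URL; the handler path below must match solrconfig.xml.
CORE_URL = "http://localhost:8983/solr/mycore"


def import_url(core_url, command="full-import"):
    """Build the DataImportHandler URL for the given command."""
    return core_url + "/dataimport?" + urlencode({"command": command})


def trigger_import(core_url, command="full-import", timeout=30):
    """Issue the GET request and return the HTTP status code."""
    with urlopen(import_url(core_url, command), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    if "--run" in sys.argv:
        # Only contact the server when explicitly asked to.
        print(trigger_import(CORE_URL))
    else:
        # Dry run: show the URL that would be requested.
        print(import_url(CORE_URL))
```

A crontab entry such as `0 2 * * * /usr/bin/python3 /path/to/dih_import.py --run` (hypothetical path) would then start a full import every night at 02:00.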

Remove results below a certain score threshold in Solr/Lucene?

Question: Is there any built-in functionality in Solr/Lucene to filter out results that fall below a certain score threshold? Say I provide a score threshold of 0.2; then all documents with a score less than 0.2 should be removed from my results. My intuition is that this is possible by updating or customizing Solr or Lucene. Could you point me in the right direction on how to do this? Thanks in advance! Answer 1: You could write your own Collector that would ignore collecting those documents that the scorer
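The answer above points at a custom Lucene Collector. As a lighter-weight illustration of the same idea, the sketch below shows two alternatives that are not taken from the answer: filtering already-returned documents on the client, and building request parameters that use Solr's function range parser (`frange`) over `query($q)`. The helper names are hypothetical, and the `frange` approach is an assumption to verify against your Solr version.

```python
def filter_by_score(docs, min_score):
    """Keep only docs whose 'score' field is at least min_score."""
    return [d for d in docs if d.get("score", 0.0) >= min_score]


def frange_params(query, min_score):
    """Request params that ask Solr to drop docs scoring below min_score.

    Assumes the frange parser and query($q) dereferencing are available.
    """
    return {
        "q": query,
        "fl": "*,score",  # return the score alongside the stored fields
        "fq": "{!frange l=%s}query($q)" % min_score,
    }


if __name__ == "__main__":
    # Hypothetical result documents, made up for illustration.
    results = [{"id": "a", "score": 0.5}, {"id": "b", "score": 0.1}]
    print(filter_by_score(results, 0.2))
    print(frange_params("title:solr", 0.2))
```

Keep in mind that Lucene scores are not normalized across queries, so a fixed cutoff such as 0.2 means something different for every query.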