solr

HBase secondary index with Solr: sorting on an int field fails with "can not sort on multivalued field"

泄露秘密 submitted on 2019-12-16 18:48:02
After syncing an HBase secondary index to Solr, sorting on an int field fails. The error is: { "responseHeader":{ "zkConnected":true, "status":400, "QTime":75, "params":{ "q":"*:*", "sort":"hbase_indexer_fn_read_num desc", "_":"1576474856934"}}, "error":{ "metadata":[ "error-class","org.apache.solr.common.SolrException", "root-error-class","org.apache.solr.common.SolrException"], "msg":"can not sort on multivalued field: hbase_indexer_fn_read_num", "code":400}} The message says the sort field must not be multivalued. Solr displays the values of a multivalued field wrapped in square brackets; a field with multiValued set to false is displayed without brackets. The fix is to edit schema.xml and set multiValued="false": <field name="hbase_indexer_fn_read_num" type="string" indexed="true" multiValued="false"
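Before changing the schema, it can help to list which fields are explicitly declared multivalued. The following is a minimal sketch, not the poster's method: the schema fragment, the `tags` field, and both function names are made up for illustration, and only `hbase_indexer_fn_read_num` comes from the post. It also ignores `multiValued` set on a `<fieldType>` and dynamic fields, which a real check would need to consider.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema fragment; only the first field name comes from the post.
SCHEMA_FRAGMENT = """
<schema name="example" version="1.6">
  <field name="hbase_indexer_fn_read_num" type="string" indexed="true" stored="true" multiValued="false"/>
  <field name="tags" type="string" indexed="true" stored="true" multiValued="true"/>
</schema>
"""


def multivalued_fields(schema_xml):
    """Return names of <field> elements explicitly declared multiValued="true"."""
    root = ET.fromstring(schema_xml)
    return [
        field.get("name")
        for field in root.iter("field")
        if field.get("multiValued", "").lower() == "true"
    ]


def can_sort_on(schema_xml, field_name):
    """True if the field is declared here and not explicitly multivalued."""
    root = ET.fromstring(schema_xml)
    for field in root.iter("field"):
        if field.get("name") == field_name:
            return field.get("multiValued", "").lower() != "true"
    # Not declared in this fragment (it may be a dynamic field): unknown, so say no.
    return False


print(multivalued_fields(SCHEMA_FRAGMENT))
print(can_sort_on(SCHEMA_FRAGMENT, "hbase_indexer_fn_read_num"))
```

Two caveats apply: documents indexed before a schema change generally need to be reindexed, and a `string` field sorts lexicographically, so numeric ordering would call for a numeric field type.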

Can't open lucene index (Java heap space)

99封情书 submitted on 2019-12-16 18:06:29
Question: I want to grab some data from a Lucene index file, but I can't read it. I tried Luke, but it always crashes with java.lang.OutOfMemoryError: Java heap space. Note that -Xmx doesn't help: I tried -Xmx512, -Xmx1024 and even -Xmx2048. I also tried Solr, but it fails with java.lang.OutOfMemoryError: Java heap space too. Any ideas how I can extract some data from Lucene? P.S. I use Lucene 2.3.0. My index file is 1.8 GB. Answer 1: What size is the data you are trying to fetch? Maybe the result set is

Getting Started with ElasticSearch: Wind, Flowers, Snow, and Moon (Part 5)

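梦想的初衷 submitted on 2019-12-15 17:18:48
People often ask 散仙 how to get good at search. The question is a representative one and stands for a whole class of similar questions. 散仙 summarized an answer in an earlier blog post, which is worth rereading. This post covers how to use Luke to inspect an ElasticSearch index. Why write such a post? Anyone who has studied or looked into full-text search knows that the core of search is the inverted index. We can use Google to locate the data we want by keyword among the internet's massive data precisely because the inverted index plays such a large role. In a search system the index is usually invisible: we only know that searching for certain keywords finds the information we want, without knowing what the terms in the inverted index actually look like. That is why we are so often puzzled when a term we search for returns no results. If nothing is returned, that essentially shows the term does not exist in the index. Or sometimes searching for 中国人 finds data while searching for 中国 finds nothing. All of this comes down to the inverted index. What should we do when we hit these problems? Don't panic. Anyone who understands tokenization well can usually find the cause, because data must be tokenized before it is indexed. That is, a document is cut into separate tokens (also called terms), and as long as a search keyword matches one of these tokens, the data can generally be found. Of course, this is a very complex process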
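The 中国人 versus 中国 puzzle can be reproduced with a toy inverted index. This is a simplified sketch under stated assumptions, not how Lucene or ElasticSearch analyzers actually work: both tokenizers and all function names are invented for illustration. It only shows that a query matches when it equals a token that the analyzer emitted at index time.

```python
from collections import defaultdict


def whole_chunk_tokenizer(text):
    """Toy analyzer: each whitespace-separated chunk becomes one token."""
    return text.split()


def bigram_tokenizer(text):
    """Toy analyzer: overlapping two-character tokens within each chunk."""
    tokens = []
    for chunk in text.split():
        if len(chunk) < 2:
            tokens.append(chunk)
        else:
            tokens.extend(chunk[i:i + 2] for i in range(len(chunk) - 1))
    return tokens


def build_index(docs, tokenizer):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenizer(text):
            index[token].add(doc_id)
    return index


def search(index, term):
    """Exact term lookup, which is all an inverted index does at its core."""
    return sorted(index.get(term, set()))


docs = {1: "中国人", 2: "搜索 引擎"}
coarse = build_index(docs, whole_chunk_tokenizer)
fine = build_index(docs, bigram_tokenizer)

print(search(coarse, "中国人"))  # [1]: the whole chunk is a token
print(search(coarse, "中国"))    # []: "中国" was never emitted as a token
print(search(fine, "中国"))      # [1]: the bigram analyzer emitted "中国"
```

Inspecting the actual terms in an index, which is what Luke is used for in this post, answers the same question for a real analyzer.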

Reproducing the Apache Solr Remote Code Execution Vulnerability (CVE-2019-0193)

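回眸只為那壹抹淺笑 submitted on 2019-12-15 10:49:15
QQ study group: 811401950 I. Vulnerability description: On November 16, 2019, Apache published a security advisory for the Apache Solr remote code execution vulnerability (CVE-2019-0193). The vulnerability is in the optional DataImportHandler module, a commonly used module for pulling data from databases or other sources. The entire DIH configuration can be set through the dataConfig parameter of an external request, and because a DIH configuration can contain scripts, this parameter is a security risk. An attacker can use the dataConfig parameter to craft a malicious request and achieve remote code execution. Affected users should upgrade Solr to a fixed version as soon as possible to ensure effective protection against this vulnerability. II. Reproduction steps: 1. Search in bulk for hosts with port 8983 open. 2. Visit the target host, find the vulnerable location, click the core selector button, and intercept the request with Burp. 3. Send it to Repeater, change admin in the request to item (note that this must match the corresponding module), then append config to check whether a configuration file exists. If there is none, the target probably does not have this vulnerability, so this also serves as a detection method. 4. Because the affected module can be modified through external requests, a malicious request payload is used to modify the configuration; once the configuration is modified, malicious scripts can be executed, so requests carrying malicious code can be issued. 5. After the configuration is modified, a request carrying the malicious script can be sent, resulting in remote code execution. The request path and content are as shown in the figure below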

How to use Solr

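冷暖自知 submitted on 2019-12-14 06:06:36
Using Solr. This post uses Solr 7.7.2. Download: http://mirrors.tuna.tsinghua.edu.cn/apache/lucene/solr/7.7.2/solr-7.7.2.zip After downloading, unzip the archive locally. Starting: open a console with cmd, change into the solr-7.7.2\bin folder, and run solr start. The default port is 8983; keep the window open. Once Solr is running, open http://localhost:8983/solr to reach the Solr admin UI. Click Core Admin to create a core. If creation fails with a message about missing configuration files, copy the conf folder under solr-7.7.2\server\solr\configsets\_default into the core's folder, and check the logs. Importing data: to add the driver, create a lib folder in the core and put the jar in it. Then refresh the core by choosing Reload in Core Admin. Edit solrconfig.xml: <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler"> <lst name="defaults"> <str name="config">data-config.xml</str> <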
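Once a `/dataimport` handler like the one above is registered, an import is typically triggered over HTTP. The sketch below only builds the request URLs and sends nothing. The core name `mycore` and the helper `solr_url` are hypothetical, and the base URL assumes the default port mentioned in the post.

```python
from urllib.parse import urlencode


def solr_url(core, handler, params, base="http://localhost:8983/solr"):
    """Build a request URL for a handler on one core (hypothetical helper)."""
    return f"{base}/{core}/{handler.lstrip('/')}?{urlencode(params)}"


# "mycore" is a made-up core name; substitute the core created in Core Admin.
import_url = solr_url("mycore", "/dataimport", {"command": "full-import"})
query_url = solr_url("mycore", "/select", {"q": "*:*", "rows": 10})

print(import_url)
print(query_url)
```

Opening `import_url` in a browser or with any HTTP client would start a full import, assuming the data-config.xml referenced by the handler is valid and the JDBC driver jar has been loaded.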

Solr, Special Chars, and Latin to Cyrillic char conversion

独自空忆成欢 submitted on 2019-12-14 03:48:49
Question: I am trying to set up a search engine using Solr (or Lucene) which could have text in both Latin with special chars (special chars would include Ö or Ç as an example) or Cyrillic chars (examples include Б or б and Ж ж). Anyway, I am trying to find a solution to allow me to search for words with these characters in them, but for users who do not have the key on their keyboard... Example would be (making up words here, hopefully won't offend anyone): "BÖÖK" would be found when searching for
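One common approach to the Latin half of this problem is to fold diacritics at both index and query time, so that accented and unaccented spellings produce the same token. The sketch below illustrates the idea in plain Python using Unicode decomposition. It is an assumption about what the asker wants, not an answer from the thread, and `fold_diacritics` is a made-up name. Inside Solr, an analysis filter such as `solr.ASCIIFoldingFilterFactory` plays a comparable role for Latin text.

```python
import unicodedata


def fold_diacritics(text):
    """Strip combining marks so that e.g. 'Ö' becomes 'O' and 'Ç' becomes 'C'."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


print(fold_diacritics("BÖÖK"))
print(fold_diacritics("Ç"))
# Letters without a decomposition, such as Б and Ж, pass through unchanged.
# Letters that do decompose, such as й, would lose their mark and become и.
print(fold_diacritics("Б Ж"))
```

This does not transliterate between scripts. Matching a Cyrillic word from a Latin-keyboard query would need a separate character mapping, which is outside this sketch.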

How to add shards dynamically to collection in solr?

筅森魡賤 submitted on 2019-12-14 03:43:58
Question: When I create the collection with the following query, I set two shards for collection10: /solr/admin/collections?action=CREATE&name=collection10&numShards=2&replicationFactor=2 My requirement is that I have to add a 3rd shard dynamically after 10000 documents have been indexed in the first two shards. Is it possible to add shards dynamically once we have started the collection and begun indexing into the existing shards? If it is possible, how do we add shards dynamically once after we started the
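For context, the Collections API has a `SPLITSHARD` action that splits an existing shard into two sub-shards, which is one way a collection created with `numShards` can grow later. The sketch below only builds the URLs and sends nothing. The helper name, the base URL, and the shard name `shard1` are assumptions, and whether splitting fits the asker's "add a 3rd shard" requirement is not settled by the excerpt.

```python
from urllib.parse import urlencode


def collections_api_url(action, base="http://localhost:8983/solr", **params):
    """Build a Collections API URL (hypothetical helper; base URL is assumed)."""
    query = urlencode({"action": action, **params})
    return f"{base}/admin/collections?{query}"


# The CREATE call from the question.
create_url = collections_api_url(
    "CREATE", name="collection10", numShards=2, replicationFactor=2
)

# Splitting one existing shard; "shard1" is an assumed shard name.
split_url = collections_api_url(
    "SPLITSHARD", collection="collection10", shard="shard1"
)

print(create_url)
print(split_url)
```

The API also offers `CREATESHARD`, but it applies to collections that use the implicit router, not to one created with `numShards` as shown in the question.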

solr join - return parent and child document

◇◆丶佛笑我妖孽 submitted on 2019-12-14 03:42:10
Question: I am using Solr's (4.0.0-beta) join capability to query an index that has documents with parent/child relationships. The join query works great, but I only get the parent documents in the search results. I believe this is the expected behavior. Is it possible, though, to get both the parent and the child documents returned in the search results (as separate search hits)? For example: Parents: SolrDocument{uid=m_1, media_id=1} SolrDocument{uid=m_2, media_id=2} SolrDocument{uid

Cassandra SOLR Rolling Upgrade

家住魔仙堡 submitted on 2019-12-14 03:13:54
Question: We have a cluster of 12 nodes, 6 DSE-SOLR and 6 DSE-Cassandra. When upgrading from 3.0 to 3.1 we noticed that requests through the SOLR interface were broken until all nodes had been upgraded. Is this limitation still present when upgrading from 3.1 to 3.2? Are there any gotchas to note when making the upgrade? The upgrade path docs say to enable the old gossip protocol until all nodes have been upgraded. Is this per DC or for the entire cluster? Answer 1: Russ, What errors are you getting

How can I boost a result in SOLR based on a parameter?

夙愿已清 submitted on 2019-12-14 03:09:11
Question: I am new to SOLR and I am trying to boost a result based on a parameter "country". For example, I want to set the country to US and move all the results with US to the top. This is how I am doing it right now, but it doesn't work: sort=query({!qf=market v='US'}) desc This is how the dismax request handler is set up: <requestHandler name="dismax" class="solr.SearchHandler" > <lst name="defaults"> <str name="defType">dismax</str> <str name="echoParams">explicit</str> <float name="tie">0.01<
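An alternative to sorting by a function query is the dismax `bq` (boost query) parameter, which adds score to documents matching an extra clause. The sketch below builds such a parameter set. It is one possible direction under stated assumptions, not the accepted answer: `boosted_params`, the sample query `laptop`, and the boost value are made up, and only the `market` field comes from the question. A boost raises matching documents in the ranking but does not strictly guarantee they all come first.

```python
from urllib.parse import urlencode


def boosted_params(user_query, country, boost=10.0):
    """Request parameters that boost documents whose market field matches."""
    return {
        "defType": "dismax",
        "q": user_query,
        # bq adds an optional clause that only influences scoring.
        "bq": f"market:{country}^{boost}",
    }


params = boosted_params("laptop", "US")
query_string = urlencode(params)

print(params["bq"])
print(query_string)
```

The resulting query string can be appended to the select handler URL. Tuning the boost value against real relevance scores is left open here.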