lucene | 易学教程

How to boost fields in solr

Question: I already have the boost determined beforehand. I have a field in the Solr index called boost1. This boost field has a value from 1 to 10, similar to Google's PageRank, and it should be applied to every query run in Solr. Here are the fields in my index: Id, Title, Text, Boost1. I am trying to implement functionality similar to Google's PageRank. Is there a way to do this using Solr?

Answer 1: You can add the boost at query time, e.g. q={
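One way to apply a stored per-document boost at query time is the eDisMax parser's `boost` parameter, which multiplies each document's score by a function value. The sketch below only builds such a request URL. The host, the core name `mycore`, and the use of `Title` and `Text` as `qf` are assumptions for illustration, and this is not necessarily the approach the truncated answer goes on to describe.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; the host and core name are assumptions.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"


def boosted_query_url(user_query):
    """Build a select URL whose scores are multiplied by the boost1 field."""
    params = {
        "defType": "edismax",  # use the eDisMax query parser
        "q": user_query,
        "qf": "Title Text",    # fields to search (taken from the question's schema)
        "boost": "boost1",     # multiplicative boost: score * value of boost1
    }
    return SOLR_SELECT + "?" + urlencode(params)


print(boosted_query_url("new york"))
```

To apply the boost to every query without changing clients, the same parameters can be placed in the request handler's defaults in solrconfig.xml.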

Solr Query with LIKE Clause

Question: I'm working with Solr and I'd like to know whether a query can contain a LIKE clause. For example, I want to find all organizations with "New York" in the title. In SQL, this would be written as Name LIKE 'New York%'. How do you write a LIKE query in Solr? I'm using the SolrNet library, if that makes a difference.

Answer 1: You just search for "New York", but first you need to properly configure your field's analyzer. For example, you might want to start with a field type

Getting error on a specific query

Question: Lucene novice here. I'm using it with Hibernate in a Java client and have been getting this error on a particular query: HSEARCH000146: The query string 'a' applied on field 'name' has no meaningfull tokens to be matched. Validate the query input against the Analyzer applied on this field. Search works fine for all other queries, even those with an empty result set. My test DB does have this record with 'a'. What could be wrong here?

Answer 1: 'a' is a stopword, and will be filtered out of your query
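The effect described in the answer can be illustrated with a toy analyzer. This is plain Python, not the Lucene or Hibernate Search API, and the stop word set below is an illustrative subset chosen for the example; the real list depends on the analyzer configured for the field.

```python
# Illustrative subset of an English stop word list (an assumption for this
# sketch); the actual list depends on the analyzer applied to the field.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it"}


def analyze(text):
    """Toy analyzer: lowercase, split on whitespace, drop stop words."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]


# The query 'a' leaves no tokens at all, which is the situation the
# HSEARCH000146 message reports.
print(analyze("a"))
# Other queries keep their non-stop-word tokens.
print(analyze("A quick test"))
```

If single-letter or common-word queries must match, the usual direction is an analyzer without stop word removal on that field, applied at both index and query time.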

Getting terms matched in a document when searching using a wildcard search

Question: I am looking for a way to find the terms that matched in a document when using a wildcard search in Lucene. I tried to find the terms with the explainer, but this failed. A portion of the relevant code is below. ScoreDoc[] myHits = myTopDocs.scoreDocs; int hitsCount = myHits.Length; for (int myCounter = 0; myCounter < hitsCount; myCounter++) { Document doc = searcher.Doc(myHits[myCounter].doc); Explanation explanation = searcher.Explain(myQuery, myCounter); string myExplanation = explanation
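Conceptually, a wildcard query matches a document because the pattern expands to one or more concrete terms that the document contains. The sketch below shows that idea in plain Python rather than through the Lucene API: given a document's terms (assumed here to be already available as a list), it reports which ones the pattern matches. Note that `fnmatchcase` also interprets `[...]` character classes, which goes beyond the `*` and `?` wildcards discussed here.

```python
from fnmatch import fnmatchcase


def matched_terms(doc_terms, pattern):
    """Return the distinct document terms matching a `*` / `?` wildcard pattern."""
    return sorted({term for term in doc_terms if fnmatchcase(term, pattern)})


# Hypothetical terms of one document, for illustration only.
terms = ["lucene", "lucid", "luck", "solr", "search"]
print(matched_terms(terms, "luc*"))
print(matched_terms(terms, "s?lr"))
```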

How to sort IntPoint or LongPoint field in Lucene 6

Question: I am migrating from Lucene 5.1 to Lucene 6. I found out that IntPoint does not support sorting, because its DocValuesType is frozen to NONE and sorting requires NUMERIC. In Lucene 5.1, I could set the field type of a numeric field so that I could run range-based searches and sort the results. I know I can migrate to LegacyIntField, but I'd like to migrate to the new IntPoint instead. Does anyone know how to index a numeric value so that it supports both range-based queries and sorting? Thank you!

Answer 1: You have to use

Lucene in Neo4j has some misbehaviours in terms of reliable search queries - compared to OrientDB

Question: I'm still evaluating Neo4j vs. OrientDB. Most importantly, I need Lucene as the full-text index engine. So I created the same schema with the same data (300 million lines) on both databases. I also have experience querying different things in both systems. I used the Standard Analyzer on both sides. The OrientDB test query results are all fine, and really good in terms of reliability and speed. The speed of Neo4j is also OK, but the results are rather bad in most cases. So let's

Elasticsearch Overview

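I have been studying Elasticsearch for some time. Here I lay out its core concepts and principles in detail, from a beginner's point of view, under the following 9 topics. Discussion is welcome...

0. Setting out with a question: how did ES come about?

(1) Consider: how do you search data at large scale? For example, when a system holds 1 billion or 10 billion records, we usually approach the system architecture from the following angles:
1) Which database should we use? (MySQL, Sybase, Oracle, Dameng, Shentong, MongoDB, HBase...)
2) How do we eliminate single points of failure? (LVS, F5, A10, ZooKeeper, MQ)
3) How do we keep the data safe? (hot standby, cold standby, multi-site active-active)
4) How do we solve the search problem? (database proxy middleware: mysql-proxy, Cobar, MaxScale, etc.)
5) How do we handle statistics and analytics? (offline, near real time)

(2) How traditional databases cope. For relational data, we usually adopt the following or a similar architecture to relieve query and write bottlenecks. Key points:
1) Master-slave replication addresses data safety.
2) Heartbeat monitoring by the database proxy middleware addresses single points of failure.
3) The proxy middleware distributes queries to the slave nodes and aggregates the results.

(3) How non-relational databases cope. For NoSQL databases, taking MongoDB as an example (the others work on similar principles). Key points:
1) Replica backups keep the data safe.
2) A node election mechanism addresses single points of failure.
3) Shard information is first retrieved from the config database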

Solr 4 Adding Shard to existing Cluster

Question: Background: I just finished reading the Apache Solr 4 Cookbook. In it, the author mentions that shards need to be set up wisely because new ones cannot be added to an existing cluster. However, this was written using Solr 4.0, and at present I am using 4.1. Is this still the case? I wish I hadn't found this issue, and I'm hoping someone can tell me otherwise. Question: Am I expected to know how much data I'll store in the future when setting up shards in a SolrCloud cluster? I have

Lucene search results sort by custom order list (unique to each user)

Question: I have authenticated users in my application who have access to a shared database of up to 500,000 items. Each user has their own public-facing web site and needs the ability to prioritize the items on display (think upvote) on their own site. Out of the 500,000 items, they may have at most 200 prioritized items; the order of the rest of the items is of less importance. Each user will prioritize the items differently. I initially asked a similar MySQL question here Mysql
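The ordering the question asks for can be sketched independently of Lucene: a user's prioritized items come first, in that user's order, and everything else follows. The function and variable names below are hypothetical, and the sketch assumes each user's list of prioritized item ids (at most about 200) can be loaded at query time.

```python
def sort_for_user(item_ids, prioritized):
    """Order items: the user's prioritized ids first (in the user's order),
    then all remaining items in their original order."""
    rank = {item_id: pos for pos, item_id in enumerate(prioritized)}
    unranked = len(rank)  # sort key shared by every non-prioritized item
    # sorted() is stable, so non-prioritized items keep their relative order.
    return sorted(item_ids, key=lambda item_id: rank.get(item_id, unranked))


# Hypothetical result set and one user's priority list.
items = [101, 102, 103, 104, 105]
print(sort_for_user(items, prioritized=[104, 102]))
```

Because the per-user list is small, it can be held in memory and applied as a sort key or score adjustment per request, rather than being stored on each of the shared items.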