lucene

JavaScript nested object to multidimensional array recursive function

青春壹個敷衍的年華 submitted on 2019-12-25 16:46:11
Question: OK, here's one I've been scratching my head over without much success so far; sorry in advance for the very long question... I am using this Lucene Query Parser to parse a string/query, which produces this kind of data structure (notice that the repetition of 'field3' is on purpose). Sample string: field1:val1 AND field2:val2 OR field3:val3 AND field3:val4 Result: { left: { field: "field1", term: "val1" }, operator: "AND", right: { left: { field: "field2", term: "val2" }, operator: "OR" …

Solr - How to search in all fields without passing query field?

三世轮回 submitted on 2019-12-25 16:03:23
Question: I have tried the following: <field name="collector" type="text_general" indexed="true" stored="false" multiValued="true" /> and copied all my fields into it with copyField directives: <copyField source="fullname" dest="collector"/> <copyField source="email" dest="collector"/> <copyField source="city" dest="collector"/> I have also put all the copyField tags below the <fields> </fields> tags. But I can't search in all fields; I have to prefix the query with the field name, like q=fullname:Mayur. I want to search with just q=Mayur. And I …
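The usual fix is to point Solr's default search field (the df parameter) at the catch-all collector field, either per request or once in the /select handler defaults in solrconfig.xml, so that a bare q=Mayur is searched against collector. Below is a minimal SolrJ sketch of the per-request variant, assuming the standard query parser and a local core named mycore (both assumptions, not taken from the question):

```java
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrDocument;

public class SearchAllFields {
    public static void main(String[] args) throws Exception {
        // The core URL is an assumption for this sketch.
        try (HttpSolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {
            SolrQuery query = new SolrQuery("Mayur");  // bare term, no field prefix
            query.set("df", "collector");              // search the copyField target by default
            for (SolrDocument doc : solr.query(query).getResults()) {
                System.out.println(doc.getFieldValue("fullname"));
            }
        }
    }
}
```

Note that collector is indexed but not stored, so results are still read from the original stored fields (fullname here); only the searching goes through the catch-all field.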

Configuring Nutch regex-normalize.xml

丶灬走出姿态 submitted on 2019-12-25 15:22:08
Question: I am using the Java-based Nutch web-search software. In order to prevent duplicate (URL) results from being returned in my search query results, I am trying to remove (i.e. normalize away) the 'jsessionid' expressions from the URLs being indexed when running the Nutch crawler over my intranet. However, my modifications to $NUTCH_HOME/conf/regex-normalize.xml (made prior to running my crawl) do not seem to be having any effect. How can I ensure that my regex-normalize.xml configuration is being …
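A frequent culprit is either the urlnormalizer-regex plugin not being active (it has to appear in plugin.includes in nutch-site.xml) or the pattern itself not matching. The pattern can be sanity-checked on its own with plain java.util.regex before wiring it into regex-normalize.xml; the rule and URL below are illustrative, not taken from the question:

```java
import java.util.regex.Pattern;

public class JsessionidCheck {
    public static void main(String[] args) {
        // The same kind of rule a regex-normalize.xml <regex> entry would carry:
        // strip a ";jsessionid=<id>" path parameter, case-insensitively.
        Pattern p = Pattern.compile("(?i);jsessionid=[0-9a-f]+");
        String url = "http://intranet.example.com/page.do;jsessionid=1E6FEC0D14D044541DD84D2D013D29ED?q=1";
        System.out.println(p.matcher(url).replaceAll(""));
        // prints: http://intranet.example.com/page.do?q=1
    }
}
```

If the pattern holds up in isolation, the remaining suspects are on the Nutch side: the plugin list, and whether the normalizer is actually invoked during the crawl phases being run.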

Solr Customization using Java for modified output?

蹲街弑〆低调 submitted on 2019-12-25 10:30:28
Question: I am developing an application using Solr. Everything is going fine and I am looking ahead to integrating Solr with CodeIgniter or some other framework for the frontend. But there is a problem: I am performing some calculations on the result rows returned by Solr before showing them to users, and doing that in PHP is really not feasible (it takes far too long). I have existing code written in Java, and hence I see no reason to port this application to PHP. How can I do that? Is there any way I can …
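Since the calculation code already exists in Java, one option is to query Solr from Java via SolrJ, run the calculation there, and only hand the finished rows to the CodeIgniter frontend (for example through a small JSON endpoint). A rough sketch of the SolrJ side, with the core name, field names, and the calculation itself all placeholders:

```java
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrDocument;

public class CalculateOverResults {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {
            SolrQuery query = new SolrQuery("*:*");
            query.setRows(100);                          // page through larger result sets in practice
            for (SolrDocument doc : solr.query(query).getResults()) {
                // Placeholder for the existing Java calculation on each row.
                Object id    = doc.getFieldValue("id");
                Object value = doc.getFieldValue("value");
                System.out.println(id + " -> " + value);
            }
        }
    }
}
```

The same loop can just as well live inside a servlet or Spring controller, so the PHP frontend never talks to Solr directly.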

Obtain metadata associated with matched content in Solr/Lucene

空扰寡人 submitted on 2019-12-25 09:33:20
Question: I have a large set of text documents which I will index with Solr, in a format where each line of text has associated metadata. For example: #metadata1 A line of text. #metadata2 Another long, broken line of #metadata3 text that should be searchable. I'd like to index this so that the content is searchable, including phrase matches spanning multiple lines, but the metadata is not. However, I can't discard the metadata: I would like any matches to still carry the associated metadata. E.g. a …
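One layout that satisfies both requirements is to index the concatenated line text in a single analyzed field (so phrases can span the original line breaks) while keeping each metadata/line pair in stored-only fields on the same document, so a hit can be mapped back to its lines and their metadata afterwards. A minimal Lucene-flavoured sketch of that document layout (the field names are made up for illustration):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.TextField;

public class LineDocBuilder {
    /** lines[i] = { metadata, text } for one source line. */
    public static Document build(String[][] lines) {
        Document doc = new Document();
        StringBuilder content = new StringBuilder();
        for (String[] line : lines) {
            content.append(line[1]).append(' ');                          // searchable text only
            doc.add(new StoredField("line", line[0] + "\t" + line[1]));   // metadata kept, never indexed
        }
        doc.add(new TextField("content", content.toString(), Field.Store.NO)); // indexed for (phrase) search
        return doc;
    }
}
```

Highlighting on content, or matching the hit text back against the stored line values, then recovers which metadata entries a phrase match touched.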

Elasticsearch Autocomplete - Completion suggestion from dot & whitespace for matching input

本秂侑毒 submitted on 2019-12-25 08:15:11
Question: I am trying to create auto-complete suggestions based on a title (strings such as "Hunter Game", "Hunter", "HunterGame", "Hunter-Game") and a package name (strings such as "az.com.hsz.hunter.game", "az.com.hsz.hunter-game", "az.com.hsz.hunter_game", "az.com.hsz.hunterGame"). The mapping is as follows: { "app-search-test": { "mappings": { "package": { "properties": { "title": { "type": "string", "analyzer": "autocomplete" }, "package_name": { "type": "string" }, "title-suggest": { "type": "completion", "analyzer": …

Simple ElasticSearch Deployment with Docker

你。 submitted on 2019-12-25 08:03:06
1. What is ElasticSearch?

Elasticsearch is also developed in Java and uses Lucene as its core to implement all of its indexing and search functionality, but its goal is to hide Lucene's complexity behind a simple RESTful API, making full-text search easy. Elasticsearch is more than Lucene plus full-text search, though; it can also be described as:

a distributed, real-time document store in which every field is indexed and searchable
a distributed, real-time analytics search engine
able to scale to hundreds of servers and handle petabytes of structured or unstructured data

Pull the images

docker pull elasticsearch:6.7.2
docker pull mobz/elasticsearch-head:5
docker pull kibana:6.7.2

Run the container

ElasticSearch listens on port 9200 by default. By mapping port 9200 of the host to port 9200 inside the Docker container, we can reach the ElasticSearch service running in the container; we also name this container es.

docker run -d --restart=always --name es -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:6.7.2

Configure cross-origin access

Enter the container. Because the configuration …

Can we make Lucene IndexWriter serializable for ExecutionContext of Spring Batch?

雨燕双飞 submitted on 2019-12-25 07:59:09
Question: This question is related to another SO question of mine. To keep an IndexWriter open for the duration of a partitioned step, I thought of putting the IndexWriter into the partitioner's ExecutionContext and then closing it in a StepExecutionListenerSupport's afterStep(StepExecution stepExecution) method. The challenge I am facing with this approach is that the ExecutionContext needs its objects to be serializable. In light of these two questions, Q1 and Q2, it doesn't seem feasible, because I can't add a no-arg constructor in …
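IndexWriter is not Serializable and is not meant to go into the ExecutionContext. A common workaround is to keep the one shared writer in a Spring singleton bean that the partition workers reference directly, and register that bean as a listener on the manager step so the writer is only closed once every partition has finished. A rough sketch of the idea (bean wiring omitted, class name hypothetical):

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.lucene.index.IndexWriter;
import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.listener.StepExecutionListenerSupport;

public class IndexWriterHolder extends StepExecutionListenerSupport {

    private final IndexWriter writer;   // shared by all partitions, never serialized

    public IndexWriterHolder(IndexWriter writer) {
        this.writer = writer;
    }

    public IndexWriter getWriter() {    // inject the holder into workers instead of using the ExecutionContext
        return writer;
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        try {
            writer.close();             // runs once, after the partitioned step as a whole completes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return stepExecution.getExitStatus();
    }
}
```

Attached to the manager (partitioned) step rather than the worker step, afterStep fires only after all partitions have completed, which is exactly when it is safe to close the writer.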

Phrase query in Lucene 6.2.0

亡梦爱人 submitted on 2019-12-25 07:49:23
Question: I have a document like this: { "_id" : ObjectId("586b723b4b9a835db416fa26"), "name" : "test", "countries" : { "country" : [ { "name" : "russia iraq" }, { "name" : "USA china" } ] } } It is stored in MongoDB, and I am trying to retrieve it using a phrase query (Lucene 6.2.0). My code looks as follows: StandardAnalyzer analyzer = new StandardAnalyzer(); // 1. create the index Directory index = new RAMDirectory(); IndexWriterConfig config = new IndexWriterConfig(analyzer); try { IndexWriter w = new IndexWriter …
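The snippet is cut off, but the usual shape of a Lucene 6.x phrase search over those country names is: index each name value into an analyzed text field, then query that field with a PhraseQuery. A self-contained sketch along those lines (the field name and the hard-coded values stand in for whatever is read from MongoDB):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class PhraseQueryDemo {
    public static void main(String[] args) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        Directory index = new RAMDirectory();

        // 1. Index one Lucene document per country name pulled from the MongoDB document.
        try (IndexWriter w = new IndexWriter(index, new IndexWriterConfig(analyzer))) {
            for (String name : new String[] { "russia iraq", "USA china" }) {
                Document doc = new Document();
                doc.add(new TextField("name", name, Field.Store.YES));
                w.addDocument(doc);
            }
        }

        // 2. Phrase query: both terms must appear adjacently, in order.
        PhraseQuery phrase = new PhraseQuery("name", "russia", "iraq");
        try (DirectoryReader reader = DirectoryReader.open(index)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            TopDocs hits = searcher.search(phrase, 10);
            for (ScoreDoc sd : hits.scoreDocs) {
                System.out.println(searcher.doc(sd.doc).get("name"));
            }
        }
    }
}
```

With StandardAnalyzer the indexed tokens are lowercased, so the terms passed to PhraseQuery must be lowercase too ("russia", "iraq"); mismatched casing is a common reason a phrase query silently returns nothing.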