elasticsearch-5 | 易学教程

_update_by_query for re-indexing Elasticsearch documents

Question: We are using Elasticsearch 5.5. We made some changes to the mapping: (1) added some new fields; (2) added an analyzer to some fields; (3) removed some fields; (4) excluded some existing fields from _all; (5) removed the same analyzer setting from some existing fields. Here is what I understand: for cases 2 and 4 we need to re-index for the changes to take effect. One approach I have learned of for re-indexing is using _update_by_query with conflicts=proceed. I am only aware of the other ways: Re-index
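
As a rough illustration of the `_update_by_query` approach the question mentions, the sketch below only builds the HTTP request and does not send it. The host `http://localhost:9200` and the index name `my-index` are placeholders, not values from the question. Called without a script, `_update_by_query` re-indexes each matching document from its existing `_source`, and `conflicts=proceed` asks it to continue past version conflicts instead of aborting. Whether that is enough for a given mapping change is not something this sketch settles.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_update_by_query_request(base_url, index, conflicts="proceed"):
    """Build (but do not send) a POST <index>/_update_by_query request."""
    params = urlencode({"conflicts": conflicts})
    url = f"{base_url}/{index}/_update_by_query?{params}"
    # match_all touches every document; narrow the query to re-index a subset.
    body = json.dumps({"query": {"match_all": {}}}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and index name, chosen only for illustration.
req = build_update_by_query_request("http://localhost:9200", "my-index")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually run the update against a live cluster.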

How to get array count of nested object in elastic-search

Question: Can someone please help me get an aggregated count of the nested objects in Elasticsearch? Let's say my Elasticsearch object mapping is: { "employe": { "dynamic": "strict", "properties": { "empId":{ "type": "keyword" }, "entities": { "type": "nested" } } } } entities is an array of other objects. I want to get the count of entities for the filtered items. I have tried an Elasticsearch query, but it does not work: { "query": { "bool": { "filter": [ { "terms": { "empId":
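
One way to approach the count described above, sketched under assumptions: a `nested` aggregation on the `entities` path reports, in its `doc_count`, how many nested objects belong to the documents matched by the query. The aggregation name `entities_count` and the sample `empId` values are made up for illustration, and the body is only built here, not sent.

```python
def nested_count_body(emp_ids):
    """Build a search body that counts `entities` objects for the given empIds."""
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "query": {"bool": {"filter": [{"terms": {"empId": emp_ids}}]}},
        # The nested aggregation's doc_count is the number of nested
        # `entities` objects across all matching parent documents.
        "aggs": {"entities_count": {"nested": {"path": "entities"}}},
    }


# Hypothetical empId values.
body = nested_count_body(["e-1", "e-2"])
```

In the search response the figure would be read from `aggregations.entities_count.doc_count`.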

Elasticsearch: Aggregation on filtered nested objects to find unique values

Question: I have an array of objects (tags) in each document in Elasticsearch 5: { "tags": [ { "key": "tag1", "value": "val1" }, { "key": "tag2", "value": "val2" }, ... ] } Now I want to find the unique tag values for a certain tag key, something similar to this SQL query: SELECT DISTINCT(tags.value) FROM tags WHERE tags.key='some-key' I have come up with this DSL so far: { "size": 0, "aggs": { "my_tags": { "nested": { "path": "tags" }, "aggs": { "filter" : { "terms": { "tags.key": "tag1" } }, "aggs": { "my
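
A possible shape for the request the question is reaching for, as a sketch rather than the accepted answer: inside the `nested` aggregation, a named `filter` aggregation narrows to one tag key, and a `terms` sub-aggregation lists the distinct values. This assumes `tags` is mapped as `nested` and that `tags.key` and `tags.value` are `keyword` fields. The aggregation names `only_key` and `unique_values` and the `max_values` parameter are invented here.

```python
def unique_tag_values_body(tag_key, max_values=100):
    """Build a search body listing distinct tags.value for one tags.key."""
    return {
        "size": 0,
        "aggs": {
            "my_tags": {
                "nested": {"path": "tags"},
                "aggs": {
                    # The filter must be its own named aggregation...
                    "only_key": {
                        "filter": {"term": {"tags.key": tag_key}},
                        "aggs": {
                            # ...with the terms aggregation nested under it.
                            "unique_values": {
                                "terms": {"field": "tags.value", "size": max_values}
                            }
                        },
                    }
                },
            }
        },
    }


body = unique_tag_values_body("tag1")
```

The distinct values would then appear as bucket keys under `aggregations.my_tags.only_key.unique_values.buckets`.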

How to get array count of nested of nested object in elastic-search

Question: Can someone please help me get an aggregated count of the nested sets objects in Elasticsearch? Let's say my Elasticsearch object is: { "empId":12121, "entities": [ { "sets": [ { "setId": 1 } ] } ] } entities is an array whose items contain another array called sets. I want to get the count of sets for the filtered items. { "query": { "bool": { "filter": [ { "terms": { "mediaItemId": [346754750,346745565] } } ] } }, "size": 0, "aggs": { "entities_agg": { "sum": { "script": {

Unable to install Search Guard plugin for Elasticsearch-5.x

Question: Due to network restrictions, I am not allowed to install any packages from the internet, so this command is of no use to me for installing Search Guard: bin/elasticsearch-plugin install -b com.floragunn:search-guard-ssl:<version> However, I was able to install Search Guard successfully on a different network by running the above command. For this reason, I tried installing Search Guard from a tar.gz or zip file with the command below, as per the documentation. /usr/share/elasticsearch# bin

Write to elasticsearch from spark is very slow

Question: I am processing a text file and writing the transformed rows from a Spark application to Elasticsearch as below: input.write.format("org.elasticsearch.spark.sql") .mode(SaveMode.Append) .option("es.resource", "{date}/" + dir).save() This runs very slowly, taking around 8 minutes to write 287.9 MB / 1513789 records. How can I tune the Spark and Elasticsearch settings to make it faster, given that network latency will always be there? I am using Spark in local mode and have 16 cores and 64GB RAM. My

What differs between post-filter and global aggregation for faceted search?

Question: A common problem in search interfaces is that you want to return a selection of results, but might also want to return information about all documents (e.g. I want to see all red shirts, but want to know what other colors are available). This is sometimes referred to as "faceted results" or "faceted navigation". The example from the Elasticsearch reference is quite clear in explaining why / how, so I've used it as a base for this question. Summary / Question: It looks like I can use both a
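
To make the two options in the question concrete, here is a hedged sketch of both request bodies. The field names `brand` and `color` and the value `gucci` are assumptions loosely modelled on the shirts example in the Elasticsearch reference, and the aggregation names are invented. With `post_filter`, the aggregation still runs over everything the main query matches and only the returned hits are narrowed afterwards. A `global` aggregation ignores the query entirely, so its counts cover every document in the index, not just the selected brand.

```python
def post_filter_body(color):
    """Hits are narrowed to one color; the color facet still covers the brand."""
    return {
        "query": {"bool": {"filter": [{"term": {"brand": "gucci"}}]}},
        "aggs": {"colors": {"terms": {"field": "color"}}},
        # Applied to the hits only, after aggregations have been computed.
        "post_filter": {"term": {"color": color}},
    }


def global_agg_body(color):
    """The query selects one color; the facet is computed over the whole index."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"brand": "gucci"}},
                    {"term": {"color": color}},
                ]
            }
        },
        "aggs": {
            "all_docs": {
                # `global` escapes the query scope, including the brand filter.
                "global": {},
                "aggs": {"colors": {"terms": {"field": "color"}}},
            }
        },
    }


pf = post_filter_body("red")
ga = global_agg_body("red")
```

To get brand-scoped counts from the `global` variant, the brand condition would have to be repeated as a `filter` aggregation inside `all_docs`.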

Bulk request throws error in elasticsearch 6.1.1

Question: I recently upgraded to Elasticsearch version 6.1.1 and now I can't bulk index documents from a JSON file. When I do it inline, it works fine. Here are the contents of the document: {"index" : {}} {"name": "Carlson Barnes", "age": 34} {"index":{}} {"name": "Sheppard Stein","age": 39} {"index":{}} {"name": "Nixon Singleton","age": 36} {"index":{}} {"name": "Sharron Sosa","age": 33} {"index":{}} {"name": "Kendra Cabrera","age": 24} {"index":{}} {"name": "Young Robinson","age": 20} When I run
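
The excerpt is cut off before the actual error, so the following is not a diagnosis, only a sketch of two requirements that commonly matter for bulk requests on Elasticsearch 6.x: the body must be newline-delimited JSON that ends with a final newline, and the request must carry an explicit content type such as `application/x-ndjson`. The helper name `to_bulk_body` is invented here, and empty `{"index": {}}` action lines assume the target index and type are given in the request URL.

```python
import json


def to_bulk_body(docs):
    """Serialize documents as bulk NDJSON: one action line, then one source line."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {}}))  # index/type taken from the URL
        lines.append(json.dumps(doc))
    # The bulk API expects the last line to be newline-terminated as well.
    return "\n".join(lines) + "\n"


# Header to send with the request body.
BULK_HEADERS = {"Content-Type": "application/x-ndjson"}

docs = [
    {"name": "Carlson Barnes", "age": 34},
    {"name": "Sheppard Stein", "age": 39},
]
payload = to_bulk_body(docs)
```

When posting a file instead, the same two points apply: the file needs a trailing newline, and the client has to send the content type header itself.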

Python elasticsearch.helpers.scan example

Question: Can someone provide an example of the scan API from the Python Elasticsearch client's helpers? res = elasticsearch.helpers.scan(....) How can I get all results from Elasticsearch with the res object? Answer 1: The documentation includes an example, although if I'm reading it right, helpers.scan by default sets search_type=scan, which was removed in ES 5.1. This causes the example code to fail with ES returning No search type for [scan]. We can amend this with preserve_order=True (I am however not sure about the

How to retrieve 1M documents with elasticsearch in Python? [closed]

Question: How can I get 100000 records from Elasticsearch in Python? A MatchAll query only retrieves 10000. Answer 1: As has been pointed out, I'd use the Scan API to do that. import elasticsearch from elasticsearch import Elasticsearch ES_HOST = { "host": "localhost", "port": 9200 } ES_INDEX
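
The answer's code is truncated, so what follows is an independent sketch of the scroll pattern rather than a reconstruction of it. A plain search is capped by `index.max_result_window` (10000 by default), while scrolling pages through the full result set. The function assumes a client object exposing `search`, `scroll` and `clear_scroll` with the keyword arguments used by the `elasticsearch` Python client of the 5.x/6.x era. `_StubClient` is a stand-in invented here so the loop can be exercised without a cluster.

```python
def iter_all_hits(client, index, query=None, page_size=1000, keep_alive="2m"):
    """Yield every hit for `query` by following the scroll cursor page by page."""
    body = {"query": query or {"match_all": {}}}
    resp = client.search(index=index, body=body, scroll=keep_alive, size=page_size)
    scroll_id = resp.get("_scroll_id")
    try:
        while resp["hits"]["hits"]:
            for hit in resp["hits"]["hits"]:
                yield hit
            resp = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = resp.get("_scroll_id", scroll_id)
    finally:
        # Release the server-side scroll context once iteration ends.
        if scroll_id:
            client.clear_scroll(scroll_id=scroll_id)


class _StubClient:
    """Hypothetical in-memory client returning three pages, the last one empty."""

    def __init__(self):
        self.pages = [[{"_id": "1"}, {"_id": "2"}], [{"_id": "3"}], []]
        self.cleared = False

    def _next_page(self):
        return {"_scroll_id": "stub", "hits": {"hits": self.pages.pop(0)}}

    def search(self, **kwargs):
        return self._next_page()

    def scroll(self, **kwargs):
        return self._next_page()

    def clear_scroll(self, **kwargs):
        self.cleared = True


stub = _StubClient()
ids = [hit["_id"] for hit in iter_all_hits(stub, "my-index")]
```

With a real client, `elasticsearch.helpers.scan` wraps a similar loop, so a hand-written version like this is mainly useful for seeing what happens underneath.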