elasticsearch-5

Elasticsearch count in groups by date range

大城市里の小女人 posted on 2019-12-08 03:59:49
Question: I have documents like this: { body: 'some text', read_date: '2017-12-22T10:19:40.223000' } Is there a way to query the count of documents published in the last 10 days, grouped by date? For example: 2017-12-22, 150 2017-12-21, 79 2017-12-20, 111 2017-12-19, 27 2017-12-18, 100 Answer 1: Yes, you can easily achieve that using a date_histogram aggregation, like this: { "query": { "range": { "read_date": { "gte": "now-10d" } } }, "aggs": { "byday": { "date_histogram": { "field": "read_date", "interval": "day" }
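The approach in the answer above can be sketched in Python as follows. This is a minimal sketch, not the answerer's code: the function names and the sample response are illustrative assumptions, `"size": 0` and the `format` parameter are additions (to suppress hits and to get day-level keys), and the two sample counts are taken from the question's example.

```python
def build_daily_count_query(field="read_date", days=10):
    """Build a request body that counts documents per day over the last `days` days."""
    return {
        "size": 0,  # assumption: only the aggregation is wanted, not the hits
        "query": {"range": {field: {"gte": f"now-{days}d"}}},
        "aggs": {
            "byday": {
                "date_histogram": {
                    "field": field,
                    "interval": "day",
                    "format": "yyyy-MM-dd",  # assumption: day-level bucket keys
                }
            }
        },
    }


def rows_from_response(response):
    """Flatten the `byday` buckets of a search response into (date, count) pairs."""
    buckets = response["aggregations"]["byday"]["buckets"]
    return [(bucket["key_as_string"], bucket["doc_count"]) for bucket in buckets]


# Hypothetical response fragment, shaped like a date_histogram result.
sample_response = {
    "aggregations": {
        "byday": {
            "buckets": [
                {"key_as_string": "2017-12-21", "doc_count": 79},
                {"key_as_string": "2017-12-22", "doc_count": 150},
            ]
        }
    }
}

for day, count in rows_from_response(sample_response):
    print(f"{day}, {count}")
```

The body returned by `build_daily_count_query()` would be sent to the index's `_search` endpoint; no client call is made here.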

Write to elasticsearch from spark is very slow

流过昼夜 posted on 2019-12-07 07:12:27
I am processing a text file and writing transformed rows from a Spark application to Elasticsearch as below: input.write.format("org.elasticsearch.spark.sql") .mode(SaveMode.Append) .option("es.resource", "{date}/" + dir).save() This runs very slowly and takes around 8 minutes to write 287.9 MB / 1513789 records. How can I tune the Spark and Elasticsearch settings to make it faster, given that network latency will always be there? I am using Spark in local mode and have 16 cores and 64GB RAM. My Elasticsearch cluster has one master and 3 data nodes with 16 cores and 64GB each. I am reading text file
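To put the figures in the question into perspective, the write throughput can be worked out directly. This is a rough arithmetic sketch: the elapsed time is the question's "around 8 minutes", and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
records = 1_513_789      # record count from the question
size_mb = 287.9          # payload size from the question
elapsed_s = 8 * 60       # "around 8 minutes", so all results are approximate

records_per_s = records / elapsed_s
mb_per_s = size_mb / elapsed_s
# Assumption: 1 MB = 1024 * 1024 bytes.
bytes_per_record = size_mb * 1024 * 1024 / records

print(f"{records_per_s:.0f} records/s")
print(f"{mb_per_s:.2f} MB/s")
print(f"{bytes_per_record:.0f} bytes/record")
```

That is roughly three thousand small records per second in total across the whole job.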

How can I do this in painless script Elasticsearch 5.3

こ雲淡風輕ζ posted on 2019-12-06 12:00:28
Question: We're trying to replicate this ES plugin: https://github.com/MLnick/elasticsearch-vector-scoring. The reason is that AWS ES doesn't allow any custom plugins to be installed. The plugin just computes a dot product and cosine similarity, so I'm guessing it should be really simple to replicate that in a Painless script. It looks like Groovy scripting is deprecated in 5.0. Here's the source code of the plugin. /** * @param params index that a scored are placed in this parameter. Initialize them here. */
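For reference, the two operations the question names can be written in a few lines of plain Python. This is not the plugin's code and not Painless; it is only a sketch of the arithmetic, with hypothetical function names, that could be used to check the output of a ported script.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the vector magnitudes."""
    magnitude = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    if magnitude == 0:
        return 0.0  # assumption: treat a zero vector as having no similarity
    return dot_product(a, b) / magnitude


query_vector = [1.0, 2.0, 3.0]
doc_vector = [4.0, 5.0, 6.0]
print(dot_product(query_vector, doc_vector))
print(cosine_similarity(query_vector, doc_vector))
```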

Elasticsearch: found jar hell in test classpath

安稳与你 posted on 2019-12-06 05:13:41
Question: I am trying to get a MockTransportClient. Here is the code: class CampaignTest extends ESIntegTestCase { var client: Client = null @Before def initClient(): Unit = { val file = Files.createTempDirectory("tempESData") val settings = Settings.builder() .put("http.enabled", "false") .put("path.data", file.toString()).build() client = new MockTransportClient(settings) } @Test def myTest(): Unit = { println("client is"+client) } @After def closeClient(): Unit = { client.close() //close client after

Elasticsearch: found jar hell in test classpath

此生再无相见时 posted on 2019-12-04 09:20:24
I am trying to get a MockTransportClient. Here is the code: class CampaignTest extends ESIntegTestCase { var client: Client = null @Before def initClient(): Unit = { val file = Files.createTempDirectory("tempESData") val settings = Settings.builder() .put("http.enabled", "false") .put("path.data", file.toString()).build() client = new MockTransportClient(settings) } @Test def myTest(): Unit = { println("client is"+client) } @After def closeClient(): Unit = { client.close() //close client after test client = null } and it's throwing an exception: [error] Test testcontrollers.campaign.CampaignTest failed

Elasticsearch - Ordering aggregation by nested aggregation on nested field

老子叫甜甜 posted on 2019-12-04 05:26:33
Question: { "query": { "match_all": {} }, "from": 0, "size": 0, "aggs": { "itineraryId": { "terms": { "field": "iid", "size": 2147483647, "order": [ { "price>price>price.max": "desc" } ] }, "aggs": { "duration": { "stats": { "field": "drn" } }, "price": { "nested": { "path": "prl" }, "aggs": { "price": { "filter": { "terms": { "prl.cc.keyword": [ "USD" ] } }, "aggs": { "price": { "stats": { "field": "prl.spl.vl" } } } } } } } } } } Here, I am getting the error "Invalid terms aggregation order path

bucket_script inside filter aggregation throws error

吃可爱长大的小学妹 posted on 2019-12-04 05:18:46
Question: I am trying to filter out empty buckets inside a filter aggregation block, and I get an error from Elasticsearch. Without this, the response is huge, as I am querying lots of metrics and nested aggregations (this is part of a bigger query, trimmed for simplicity): GET index/type/_search?ignore_unavailable { "size": 0, "aggs": { "groupby_country": { "terms": { "field": "country", "size": 2000 }, "aggs": { "exists__x__filter": { "filter": { "bool": { "filter": [ { "exists": { "field": "x" } } ] } }, "aggs": {

What differs between post-filter and global aggregation for faceted search?

℡╲_俬逩灬. posted on 2019-12-03 16:13:59
A common problem in search interfaces is that you want to return a selection of results, but might also want to return information about all documents (e.g. I want to see all red shirts, but want to know what other colors are available). This is sometimes referred to as "faceted results" or "faceted navigation". The example from the Elasticsearch reference is quite clear in explaining why / how, so I've used this as a base for this question. Summary / Question: It looks like I can use either a post-filter or a global aggregation for this. They both seem to provide the exact same functionality in a
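The two request shapes being compared can be sketched side by side as Python dictionaries. The field name `color`, the aggregation names, and the `match_all` main query are illustrative assumptions, not taken from the question. The structural difference is that aggregations next to a `post_filter` are still scoped by the main query, while a `global` aggregation ignores the main query entirely.

```python
# Variant 1: post_filter. Hits are narrowed to red shirts after the
# aggregations have been computed over everything the main query matched.
post_filter_request = {
    "query": {"match_all": {}},  # assumption: stands in for the real main query
    "aggs": {"colors": {"terms": {"field": "color"}}},
    "post_filter": {"term": {"color": "red"}},
}

# Variant 2: global aggregation. The main query already restricts hits to
# red shirts; the `global` bucket steps outside the query scope so the
# nested terms aggregation sees every document in the index.
global_agg_request = {
    "query": {"term": {"color": "red"}},
    "aggs": {
        "all_docs": {
            "global": {},
            "aggs": {"colors": {"terms": {"field": "color"}}},
        }
    },
}


def color_facet_path(request):
    """Return the key path to the color facet, to show where each variant puts it."""
    aggs = request["aggs"]
    if "colors" in aggs:
        return ["aggs", "colors"]
    return ["aggs", "all_docs", "aggs", "colors"]


print(color_facet_path(post_filter_request))
print(color_facet_path(global_agg_request))
```

With a `match_all` main query the two variants cover the same documents; they diverge once the main query carries other criteria, because only the `global` variant ignores them.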

Content-Type header [application/x-www-form-urlencoded] is not supported [duplicate]

不羁的心 posted on 2019-12-03 12:38:15
This question already has an answer here: "Content-Type header [application/x-www-form-urlencoded] is not supported on Elasticsearch". I have integrated Elasticsearch (version 5.5) into GitLab and am trying to use it. This is the command I send from an external Windows client: curl -XGET gitlab.server:9200/ -H 'Content-Type: application/json' -d '{"query": {"simple_query_string" : {"fields" : ["content"], "query" : "foo bar -baz"}}}' but it doesn't work. On the client I get these errors: {"error":"Content-Type header [application/x-www-form-urlencoded] is not supported","status":406} curl: (6
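The 406 response says the server received `application/x-www-form-urlencoded`, which is what curl sends with `-d` when no Content-Type header takes effect. One possible cause, not confirmed by the source, is that the Windows command prompt does not treat single quotes as quoting characters. A way to sidestep shell quoting altogether is to build the request in code; the sketch below uses only the Python standard library, the `http://` scheme is an assumption, and nothing is actually sent.

```python
import json
import urllib.request


def build_search_request(url, body):
    """Build a GET request with a JSON body and an explicit JSON Content-Type."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="GET",
    )


# Query body copied from the question.
query = {
    "query": {
        "simple_query_string": {"fields": ["content"], "query": "foo bar -baz"}
    }
}

# Host and port are from the question; the scheme is an assumption.
request = build_search_request("http://gitlab.server:9200/", query)
print(request.get_method(), request.full_url)
print(request.header_items())
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out because the host only exists in the question's environment.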

Simple date histogram?

*爱你&永不变心* posted on 2019-12-02 22:06:22
Question: How can I view documents classified per weekday? My data is in a format like this: {"text": "hi","created_at": "2016-02-21T18:30:36.000Z"} For this I am using a dateConversion.groovy script kept in the scripts folder in ES 5.1.1. Date date = new Date(doc[date_field].value); java.text.SimpleDateFormat format = new java.text.SimpleDateFormat(format); format.format(date) When I executed the following code in ES PLUGIN: "aggs": { "byDays": { "terms": { "script": { "lang": "groovy", "file":
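The classification the script is meant to produce can be illustrated client-side in Python. This is only a sketch of the intended result, not a replacement for the Groovy script or the aggregation: the function names are hypothetical, and the second sample document is an assumption added for illustration.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_of(created_at):
    """Map a timestamp like 2016-02-21T18:30:36.000Z to its weekday name."""
    parsed = datetime.strptime(created_at, "%Y-%m-%dT%H:%M:%S.%fZ")
    return WEEKDAYS[parsed.weekday()]  # weekday(): Monday is 0, Sunday is 6


def count_by_weekday(docs):
    """Count documents per weekday, like a terms aggregation on the weekday."""
    return Counter(weekday_of(doc["created_at"]) for doc in docs)


docs = [
    {"text": "hi", "created_at": "2016-02-21T18:30:36.000Z"},     # from the question
    {"text": "hello", "created_at": "2016-02-22T09:00:00.000Z"},  # hypothetical
]
print(count_by_weekday(docs))
```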