elasticsearch-bulk-api

Bulk API error while indexing data into Elasticsearch

谁说我不能喝 submitted on 2021-01-27 23:13:47
Question: I want to import some data into Elasticsearch using the Bulk API. This is the mapping I have created using the Kibana dev tools:

    PUT /main-news-test-data
    {
      "mappings": {
        "properties": {
          "content":      { "type": "text" },
          "title":        { "type": "text" },
          "lead":         { "type": "text" },
          "agency":       { "type": "keyword" },
          "date_created": { "type": "date" },
          "url":          { "type": "keyword" },
          "image":        { "type": "keyword" },
          "category":     { "type": "keyword" },
          "id":           { "type": "keyword" }
        }
      }
    }

and this is my bulk data: {
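As a minimal sketch (not from the question), a bulk body matching this mapping can be built as newline-delimited JSON; the sample field values below are invented:

```python
import json

# Sample documents matching the main-news-test-data mapping; values are invented.
docs = [
    {"id": "1", "title": "Example title", "lead": "Example lead",
     "content": "Body text", "agency": "example-agency",
     "date_created": "2021-01-27", "url": "https://example.com/1",
     "image": "https://example.com/1.jpg", "category": "news"},
]

lines = []
for doc in docs:
    # Action line: index into main-news-test-data, reusing the doc id as _id.
    lines.append(json.dumps({"index": {"_index": "main-news-test-data",
                                       "_id": doc["id"]}}))
    # Source line: the document itself, as exactly one line of JSON.
    lines.append(json.dumps(doc))

# The bulk body is newline-delimited JSON and must end with a newline.
bulk_body = "\n".join(lines) + "\n"
```

The body would then be POSTed to `/_bulk` with `Content-Type: application/x-ndjson`.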

NewLine error in Elasticsearch bulk API post request

笑着哭i submitted on 2020-12-14 05:04:46
Question: I am trying to use the Elasticsearch Bulk API to insert multiple records into an index. My JSON looks something like this: request json. I am inserting a newline (\n) at the end of the document, but I am still getting the newline error. Error:

    {
      "error": {
        "root_cause": [
          {
            "type": "illegal_argument_exception",
            "reason": "The bulk request must be terminated by a newline [\n]"
          }
        ],
        "type": "illegal_argument_exception",
        "reason": "The bulk request must be terminated by a newline [\n]"
      },
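A common cause of this error (an assumption here, since the request body is not shown) is sending the two characters backslash + "n" instead of an actual newline byte, or omitting the final newline entirely. The bodies below are illustrative:

```python
# The literal two-character sequence \n (backslash, "n") is NOT a newline byte,
# so Elasticsearch still sees an unterminated bulk body.
wrong_body = '{"index":{"_index":"test"}}\\n{"field":"value"}\\n'

# A real newline is the single byte 0x0A, produced by "\n" in a normal string.
right_body = '{"index":{"_index":"test"}}\n{"field":"value"}\n'
```

Printing the raw bytes of the body (e.g. `body.encode("utf-8")`) makes it easy to see which variant is being sent.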

Update nested field for millions of documents

余生长醉 submitted on 2020-04-14 07:29:56
Question: I use a bulk update with a script in order to update a nested field, but it is very slow:

    POST index/type/_bulk
    {"update":{"_id":"1"}}
    {"script":{"inline":"ctx._source.nestedfield.add(params.nestedfield)","params":{"nestedfield":{"field1":"1","field2":"2"}}}}
    {"update":{"_id":"2"}}
    {"script":{"inline":"ctx._source.nestedfield.add(params.nestedfield)","params":{"nestedfield":{"field1":"3","field2":"4"}}}}
    ... [many more, split across several batches]

Do you know another way that could be faster?

BULK API : Malformed action/metadata line [3], expected START_OBJECT but found [VALUE_STRING]

怎甘沉沦 submitted on 2019-12-19 05:45:02
Question: Using Elasticsearch 5.5, I am getting the following error while posting this bulk request, and I am unable to figure out what is wrong with it:

    "type": "illegal_argument_exception",
    "reason": "Malformed action/metadata line [3], expected START_OBJECT but found [VALUE_STRING]"

    POST http://localhost:9200/access_log_index/access_log/_bulk
    { "index":{ "_id":11 } }
    { "id":11, "tenant_id":682, "tenant_name":"kcc", "user.user_name":"k0772251", "access_date":"20170821", "access_time":"02:41:44.123+01:30",
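"Malformed action/metadata line [3]" usually means a document was pretty-printed across several lines, so line 3 of the body is a bare fragment instead of a JSON object. A sketch (not from the question) of compacting each object to exactly one line, using a subset of the question's fields:

```python
import json

# A pretty-printed document would span many lines; json.dumps without
# indent emits exactly one line per object, which is what _bulk requires.
doc = {
    "id": 11, "tenant_id": 682, "tenant_name": "kcc",
    "user.user_name": "k0772251", "access_date": "20170821",
}
action = {"index": {"_id": 11}}

bulk_body = json.dumps(action) + "\n" + json.dumps(doc) + "\n"
```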

Specify the _id field using Bulk.IndexMany in ElasticSearch

独自空忆成欢 submitted on 2019-12-11 06:06:26
Question: I'm facing a problem inserting documents using the bulk API (C# NEST v5.4). I have an array of documents, and each document carries its own ID. My code is:

    documents = documents.ToArray();
    Client.Bulk(bd => bd.IndexMany(documents, (descriptor, s) => descriptor.Index(indexName)));

How can I insert the _id manually using the descriptor? Thanks in advance!

Answer 1: You can set _id similarly to how you're setting the index name on the BulkDescriptor. Given the following POCO:

    public class Message
    {
        public

What is the ideal bulk size formula in ElasticSearch?

冷暖自知 submitted on 2019-12-10 12:47:25
Question: I believe there should be a formula to calculate bulk indexing size in Elasticsearch. The following are probably the variables of such a formula:

- Number of nodes
- Number of shards/index
- Document size
- RAM
- Disk write speed
- LAN speed

I wonder if anyone knows or uses a mathematical formula. If not, how do people decide their bulk size? By trial and error?

Answer 1: There is no golden rule for this. Extracted from the doc: There is no "correct" number of actions to perform in a single bulk call. You should
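One practical trial-and-error approach (a sketch, not from the answer) is to cap each bulk request by payload size rather than by document count, then tune the cap empirically; the 5 MB default below is an assumption:

```python
import json

def chunk_by_bytes(docs, max_bytes=5 * 1024 * 1024):
    """Yield lists of docs whose serialized size stays under max_bytes."""
    batch, batch_size = [], 0
    for doc in docs:
        # +1 accounts for the newline that terminates each bulk line.
        line_size = len(json.dumps(doc).encode("utf-8")) + 1
        if batch and batch_size + line_size > max_bytes:
            yield batch
            batch, batch_size = [], 0
        batch.append(doc)
        batch_size += line_size
    if batch:
        yield batch

# Tiny cap purely to demonstrate the chunking behaviour.
batches = list(chunk_by_bytes([{"n": i} for i in range(100)], max_bytes=64))
```

In practice one would sweep the cap (e.g. 1 MB, 5 MB, 15 MB) and keep the value that maximizes indexing throughput on the target cluster.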

How to handle multiple updates / deletes with Elasticsearch?

五迷三道 submitted on 2019-12-06 06:42:57
I need to update or delete several documents.

When I update, I do this:
1. I first search for the documents, setting a larger limit for the returned results (let's say, size: 10000).
2. For each of the returned documents, I modify certain values.
3. I resend the whole modified list to Elasticsearch (bulk index).
This operation repeats until step 1 no longer returns results.

When I delete, I do this:
1. I first search for the documents, setting a larger limit for the returned results (let's say, size: 10000).
2. I delete every found document, sending Elasticsearch the document's _id (10000 requests).
This
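One commonly suggested alternative to the search-then-modify loop (not from the excerpt) is the `_update_by_query` and `_delete_by_query` APIs, which run the whole operation server-side from a single request. A sketch of the request bodies; the field name "status" and its values are invented:

```python
import json

# Delete everything matching a query in one server-side operation,
# instead of searching and issuing one delete per _id.
delete_body = {"query": {"term": {"status": "obsolete"}}}

# Update matching documents in place with a script, instead of
# re-indexing the whole modified list.
update_body = {
    "query": {"term": {"status": "pending"}},
    "script": {"source": "ctx._source.status = 'processed'"},
}

# These would be POSTed to /<index>/_delete_by_query and /<index>/_update_by_query.
delete_json = json.dumps(delete_body)
update_json = json.dumps(update_body)
```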

Elastic Search Bulk API, Pipeline and Geo IP

限于喜欢 submitted on 2019-12-02 10:11:10
Question: I import data to my ELK stack using the Bulk API.

    {"index":{"_index":"waf","_type":"logs","_id":"325d05bb6900440e"}}
    {"id":"325d05bb6900440e","country":"US","ip":"1.1.1.1","protocol":"HTTP/1.1","method":"GET","host":"xxxxx","user_agent":"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36","uri":"/?a=><script>alert(1)</script>","request_duration":1999872,"triggered_rule_ids":["100030"],"action":"challenge","cloudflare_location":
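Given the title's mention of pipelines and GeoIP, a sketch of how a GeoIP ingest pipeline can be applied to bulk data, either via the `?pipeline=` query parameter or per action; the pipeline name "geoip-pipeline" is an assumption, while the "ip" field matches the question's document:

```python
import json

# Pipeline definition: the geoip processor resolves location data
# from the "ip" field of each incoming document.
pipeline_def = {
    "description": "Resolve geo data from the ip field",
    "processors": [{"geoip": {"field": "ip"}}],
}

# Per-action form: name the pipeline inside the bulk action metadata.
action = {"index": {"_index": "waf", "_type": "logs",
                    "_id": "325d05bb6900440e", "pipeline": "geoip-pipeline"}}
doc = {"id": "325d05bb6900440e", "ip": "1.1.1.1"}
bulk_body = json.dumps(action) + "\n" + json.dumps(doc) + "\n"
```

The pipeline itself would first be created with a PUT to `_ingest/pipeline/geoip-pipeline` using `pipeline_def` as the body.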