elasticsearch-bulk-api

Bulk API error while indexing data into Elasticsearch

谁说我不能喝 submitted on 2021-01-27 23:13:47
Question: I want to import some data into Elasticsearch using the Bulk API. This is the mapping I have created using the Kibana dev tools:

    PUT /main-news-test-data
    {
      "mappings": {
        "properties": {
          "content":      { "type": "text" },
          "title":        { "type": "text" },
          "lead":         { "type": "text" },
          "agency":       { "type": "keyword" },
          "date_created": { "type": "date" },
          "url":          { "type": "keyword" },
          "image":        { "type": "keyword" },
          "category":     { "type": "keyword" },
          "id":           { "type": "keyword" }
        }
      }
    }

and this is my bulk data: {
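As a minimal sketch (not from the question), a bulk body matching this mapping can be built as newline-delimited JSON; the sample field values below are invented:

```python
import json

# Sample documents matching the main-news-test-data mapping; values are invented.
docs = [
    {"id": "1", "title": "Example title", "lead": "Example lead",
     "content": "Body text", "agency": "example-agency",
     "date_created": "2021-01-27", "url": "https://example.com/1",
     "image": "https://example.com/1.jpg", "category": "news"},
]

lines = []
for doc in docs:
    # Action line: index into main-news-test-data, reusing the doc id as _id.
    lines.append(json.dumps({"index": {"_index": "main-news-test-data",
                                       "_id": doc["id"]}}))
    # Source line: the document itself, as exactly one line of JSON.
    lines.append(json.dumps(doc))

# The bulk body is newline-delimited JSON and must end with a newline.
bulk_body = "\n".join(lines) + "\n"
```

The body would then be POSTed to `/_bulk` with `Content-Type: application/x-ndjson`.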

NewLine error in Elasticsearch bulk API post request

笑着哭i submitted on 2020-12-14 05:04:46
Question: I am trying to use the Elasticsearch Bulk API to insert multiple records into an index. My JSON looks something like this: request json. I am inserting a newline (\n) at the end of the document, but I am still getting the newline error. Error:

    {
      "error": {
        "root_cause": [
          {
            "type": "illegal_argument_exception",
            "reason": "The bulk request must be terminated by a newline [\n]"
          }
        ],
        "type": "illegal_argument_exception",
        "reason": "The bulk request must be terminated by a newline [\n]"
      },
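A common cause of this error (an assumption here, since the request body is not shown) is sending the two characters backslash + "n" instead of an actual newline byte, or omitting the final newline entirely. The bodies below are illustrative:

```python
# The literal two-character sequence \n (backslash, "n") is NOT a newline byte,
# so Elasticsearch still sees an unterminated bulk body.
wrong_body = '{"index":{"_index":"test"}}\\n{"field":"value"}\\n'

# A real newline is the single byte 0x0A, produced by "\n" in a normal string.
right_body = '{"index":{"_index":"test"}}\n{"field":"value"}\n'
```

Printing the raw bytes of the body (e.g. `body.encode("utf-8")`) makes it easy to see which variant is being sent.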

Update nested field for millions of documents

余生长醉 submitted on 2020-04-14 07:29:56
Question: I use a bulk update with a script in order to update a nested field, but it is very slow:

    POST index/type/_bulk
    {"update":{"_id":"1"}}
    {"script":{"inline":"ctx._source.nestedfield.add(params.nestedfield)","params":{"nestedfield":{"field1":"1","field2":"2"}}}}
    {"update":{"_id":"2"}}
    {"script":{"inline":"ctx._source.nestedfield.add(params.nestedfield)","params":{"nestedfield":{"field1":"3","field2":"4"}}}}
    ... [many more, split across several batches]

Do you know another way that could be faster?

BULK API : Malformed action/metadata line [3], expected START_OBJECT but found [VALUE_STRING]

怎甘沉沦 submitted on 2019-12-19 05:45:02
Question: Using Elasticsearch 5.5, I am getting the following error while posting this bulk request, and I am unable to figure out what is wrong with it:

    "type": "illegal_argument_exception",
    "reason": "Malformed action/metadata line [3], expected START_OBJECT but found [VALUE_STRING]"

    POST http://localhost:9200/access_log_index/access_log/_bulk
    { "index":{ "_id":11 } }
    { "id":11, "tenant_id":682, "tenant_name":"kcc", "user.user_name":"k0772251", "access_date":"20170821", "access_time":"02:41:44.123+01:30",
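"Malformed action/metadata line [3]" usually means a document was pretty-printed across several lines, so line 3 of the body is a bare fragment instead of a JSON object. A sketch (not from the question) of compacting each object to exactly one line, using a subset of the question's fields:

```python
import json

# A pretty-printed document would span many lines; json.dumps without
# indent emits exactly one line per object, which is what _bulk requires.
doc = {
    "id": 11, "tenant_id": 682, "tenant_name": "kcc",
    "user.user_name": "k0772251", "access_date": "20170821",
}
action = {"index": {"_id": 11}}

bulk_body = json.dumps(action) + "\n" + json.dumps(doc) + "\n"
```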

Specify the _id field using Bulk.IndexMany in ElasticSearch

独自空忆成欢 submitted on 2019-12-11 06:06:26
Question: I'm facing a problem inserting documents using the bulk API (C# NEST v5.4). I have an array of documents, and each document carries its own ID. My code is:

    documents = documents.ToArray();
    Client.Bulk(bd => bd.IndexMany(documents, (descriptor, s) => descriptor.Index(indexName)));

How can I insert the _id manually using the descriptor? Thanks in advance!

Answer 1: You can set _id similarly to how you're setting the index name on the BulkDescriptor. Given the following POCO:

    public class Message
    {
        public

What is the ideal bulk size formula in ElasticSearch?

冷暖自知 submitted on 2019-12-10 12:47:25
Question: I believe there should be a formula to calculate bulk indexing size in Elasticsearch. The following are probably the variables of such a formula:

- Number of nodes
- Number of shards/index
- Document size
- RAM
- Disk write speed
- LAN speed

I wonder if anyone knows or uses a mathematical formula. If not, how do people decide their bulk size? By trial and error?

Answer 1: There is no golden rule for this. Extracted from the doc: There is no "correct" number of actions to perform in a single bulk call. You should
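One practical trial-and-error approach (a sketch, not from the answer) is to cap each bulk request by payload size rather than by document count, then tune the cap empirically; the 5 MB default below is an assumption:

```python
import json

def chunk_by_bytes(docs, max_bytes=5 * 1024 * 1024):
    """Yield lists of docs whose serialized size stays under max_bytes."""
    batch, batch_size = [], 0
    for doc in docs:
        # +1 accounts for the newline that terminates each bulk line.
        line_size = len(json.dumps(doc).encode("utf-8")) + 1
        if batch and batch_size + line_size > max_bytes:
            yield batch
            batch, batch_size = [], 0
        batch.append(doc)
        batch_size += line_size
    if batch:
        yield batch

# Tiny cap purely to demonstrate the chunking behaviour.
batches = list(chunk_by_bytes([{"n": i} for i in range(100)], max_bytes=64))
```

In practice one would sweep the cap (e.g. 1 MB, 5 MB, 15 MB) and keep the value that maximizes indexing throughput on the target cluster.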

How to handle multiple updates / deletes with Elasticsearch?

五迷三道 submitted on 2019-12-06 06:42:57
I need to update or delete several documents.

When I update, I do this:
1. I first search for the documents, setting a larger limit for the returned results (let's say, size: 10000).
2. For each of the returned documents, I modify certain values.
3. I resend the whole modified list to Elasticsearch (bulk index).
This operation repeats until step 1 no longer returns results.

When I delete, I do this:
1. I first search for the documents, setting a larger limit for the returned results (let's say, size: 10000).
2. I delete every found document, sending Elasticsearch the document's _id (10000 requests).
This
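One commonly suggested alternative to the search-then-modify loop (not from the excerpt) is the `_update_by_query` and `_delete_by_query` APIs, which run the whole operation server-side from a single request. A sketch of the request bodies; the field name "status" and its values are invented:

```python
import json

# Delete everything matching a query in one server-side operation,
# instead of searching and issuing one delete per _id.
delete_body = {"query": {"term": {"status": "obsolete"}}}

# Update matching documents in place with a script, instead of
# re-indexing the whole modified list.
update_body = {
    "query": {"term": {"status": "pending"}},
    "script": {"source": "ctx._source.status = 'processed'"},
}

# These would be POSTed to /<index>/_delete_by_query and /<index>/_update_by_query.
delete_json = json.dumps(delete_body)
update_json = json.dumps(update_body)
```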

Elastic Search Bulk API, Pipeline and Geo IP

限于喜欢 submitted on 2019-12-02 10:11:10
Question: I import data to my ELK stack using the Bulk API.

    {"index":{"_index":"waf","_type":"logs","_id":"325d05bb6900440e"}}
    {"id":"325d05bb6900440e","country":"US","ip":"1.1.1.1","protocol":"HTTP/1.1","method":"GET","host":"xxxxx","user_agent":"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36","uri":"/?a=><script>alert(1)</script>","request_duration":1999872,"triggered_rule_ids":["100030"],"action":"challenge","cloudflare_location":
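Given the title's mention of pipelines and GeoIP, a sketch of how a GeoIP ingest pipeline can be applied to bulk data, either via the `?pipeline=` query parameter or per action; the pipeline name "geoip-pipeline" is an assumption, while the "ip" field matches the question's document:

```python
import json

# Pipeline definition: the geoip processor resolves location data
# from the "ip" field of each incoming document.
pipeline_def = {
    "description": "Resolve geo data from the ip field",
    "processors": [{"geoip": {"field": "ip"}}],
}

# Per-action form: name the pipeline inside the bulk action metadata.
action = {"index": {"_index": "waf", "_type": "logs",
                    "_id": "325d05bb6900440e", "pipeline": "geoip-pipeline"}}
doc = {"id": "325d05bb6900440e", "ip": "1.1.1.1"}
bulk_body = json.dumps(action) + "\n" + json.dumps(doc) + "\n"
```

The pipeline itself would first be created with a PUT to `_ingest/pipeline/geoip-pipeline` using `pipeline_def` as the body.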