Parse multiline JSON with grok in logstash

余生长醉 提交于 2019-12-30 06:47:12

问题


I've got a JSON of the format:

{
    "SOURCE":"Source A",
    "Model":"ModelABC",
    "Qty":"3"
}

I'm trying to parse this JSON using logstash. Basically I want the logstash output to be a list of key:value pairs that I can analyze using kibana. I thought this could be done out of the box. From a lot of reading, I understand I must use the grok plugin (I am still not sure what the json plugin is for). But I am unable to get an event with all the fields. I get multiple events (one even for each attribute of my JSON). Like so:

{
       "message" => "  \"SOURCE\": \"Source A\",",
      "@version" => "1",
    "@timestamp" => "2014-08-31T01:26:23.432Z",
          "type" => "my-json",
          "tags" => [
        [0] "tag-json"
    ],
          "host" => "myserver.example.com",
          "path" => "/opt/mount/ELK/json/mytestjson.json"
}
{
       "message" => "  \"Model\": \"ModelABC\",",
      "@version" => "1",
    "@timestamp" => "2014-08-31T01:26:23.438Z",
          "type" => "my-json",
          "tags" => [
        [0] "tag-json"
    ],
          "host" => "myserver.example.com",
          "path" => "/opt/mount/ELK/json/mytestjson.json"
}
{
       "message" => "  \"Qty\": \"3\",",
      "@version" => "1",
    "@timestamp" => "2014-08-31T01:26:23.438Z",
          "type" => "my-json",
          "tags" => [
        [0] "tag-json"
    ],
          "host" => "myserver.example.com",
          "path" => "/opt/mount/ELK/json/mytestjson.json"
}

Should I use the multiline codec or the json_lines codec? If so, how can I do that? Do I need to write my own grok pattern or is there something generic for JSONs that will give me ONE EVENT with key:value pairs that I get for one event above? I couldn't find any documentation that sheds light on this. Any help would be appreciated. My conf file is shown below:

input
{
        file
        {
                type => "my-json"
                path => ["/opt/mount/ELK/json/mytestjson.json"]
                codec => json
                tags => "tag-json"
        }
}

filter
{
   if [type] == "my-json"
   {
        date { locale => "en"  match => [ "RECEIVE-TIMESTAMP", "yyyy-mm-dd HH:mm:ss" ] }
   }
}

output
{
        elasticsearch
        {
                host => localhost
        }
        stdout { codec => rubydebug }
}

回答1:


I think I found a working answer to my problem. I am not sure if it's a clean solution, but it helps parse multiline JSONs of the type above.

input 
{   
    file 
    {
        codec => multiline
        {
            pattern => '^\{'
            negate => true
            what => previous                
        }
        path => ["/opt/mount/ELK/json/*.json"]
        start_position => "beginning"
        sincedb_path => "/dev/null"
        exclude => "*.gz"
    }
}

filter 
{
    mutate
    {
        replace => [ "message", "%{message}}" ]
        gsub => [ 'message','\n','']
    }
    if [message] =~ /^{.*}$/ 
    {
        json { source => message }
    }

}

output 
{ 
    stdout { codec => rubydebug }
}

My mutliline codec doesn't handle the last brace and therefore it doesn't appear as a JSON to json { source => message }. Hence the mutate filter:

replace => [ "message", "%{message}}" ]

That adds the missing brace. and the

gsub => [ 'message','\n','']

removes the \n characters that are introduced. At the end of it, I have a one-line JSON that can be read by json { source => message }

If there's a cleaner/easier way to convert the original multi-line JSON to a one-line JSON, please do POST as I feel the above isn't too clean.




回答2:


You will need to use a multiline codec.

input {
  file {
    codec => multiline {
        pattern => '^{'
        negate => true
        what => previous
    }
    path => ['/opt/mount/ELK/json/mytestjson.json']
  }
}
filter {
  json {
    source => message
    remove_field => message
  }
}

The problem you will run into has to do with the last event in the file. It won't show up till there is another event in the file (so basically you'll lose the last event in a file) -- you could append a single { to the file before it gets rotated to deal with that situation.



来源:https://stackoverflow.com/questions/25588222/parse-multiline-json-with-grok-in-logstash

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!