Question
I am trying to parse the @message field from a Postfix log and extract it into multiple fields.
Message:
<22>Sep 17 19:12:14 postfix/smtp[18852]: 28D40A036B: to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 9030A15D0)
LogStash Output:
{
  "@source": "syslog://192.244.100.42/",
  "@tags": [
    "_grokparsefailure"
  ],
  "@fields": {
    "priority": 13,
    "severity": 5,
    "facility": 1,
    "facility_label": "user-level",
    "severity_label": "Notice"
  },
  "@timestamp": "2013-09-17T17:12:06.958Z",
  "@source_host": "192.244.100.42",
  "@source_path": "/",
  "@message": "<22>Sep 17 19:12:14 postfix/smtp[18852]: 28D40A036B: to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 9030A15D0)",
  "@type": "syslog"
}
I've tried to use the grok parser, but the data remains in the @message field. I want to use the syslog parser with regular expressions.
What steps do I follow to parse the @message field?
Answer 1:
The _grokparsefailure tag in your output indicates a problem parsing your logs. What grok filter are you using in your config?
Answer 2:
While we're now at Logstash 5.x, the concepts of grok remain the same.
Unfortunately, Postfix has some really annoying patterns in its logging. Fortunately, a handful of people have written grok patterns that account for most of the data you'll end up seeing in Postfix logs. I will only use a few of them.
The key is to identify the components of the message. If a component conforms to a standard or is widely used (e.g. syslog), a grok pattern has likely already been written for it. For components that have no existing pattern, you can write your own with grok.
Let's break the message into pieces:
<22>Sep 17 19:12:14 postfix/smtp[18852]:
This is very nearly RFC5424 syslog, but it is missing the ver (version) field.
- SYSLOG5424PRI: Priority value
- SYSLOGTIMESTAMP: Self explanatory
- SYSLOGPROG: The application's name and, when present, its PID
28D40A036B: to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 9030A15D0)
This is information specific to Postfix.
- POSTFIX_KEYVALUE_DATA: Used as a component of another pattern to match key=value data (such as relay=..., delay=...).
- POSTFIX_QUEUEID: Self explanatory
- POSTFIX_KEYVALUE: Combines POSTFIX_QUEUEID and POSTFIX_KEYVALUE_DATA.
- POSTFIX_SMTP_DELIVERY: Uses POSTFIX_KEYVALUE to match the information above up to and including status=..., after which comes the SMTP response.
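To make the decomposition above concrete, here is a rough Python sketch that splits the sample line into the same pieces with a single regular expression. The regex fragments are hand-written approximations of what each pattern captures, not the actual grok pattern definitions, and the names LINE_RE, parse_postfix_line and decode_pri are made up for illustration.

```python
import re

SAMPLE = (
    "<22>Sep 17 19:12:14 postfix/smtp[18852]: 28D40A036B: to=<test@gmail.com>, "
    "relay=192.244.100.25[192.244.100.25]:25, delay=0.13, "
    "delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent "
    "(250 2.0.0 Ok: queued as 9030A15D0)"
)

# Approximations only: these are NOT the definitions used by Logstash
# or by any published Postfix pattern file.
LINE_RE = re.compile(
    r"<(?P<syslog5424_pri>\d{1,3})>"                             # ~ SYSLOG5424PRI
    r"(?P<timestamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "  # ~ SYSLOGTIMESTAMP
    r"(?P<program>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?: "             # ~ SYSLOGPROG
    r"(?P<postfix_queueid>[0-9A-F]{6,}): "                       # ~ POSTFIX_QUEUEID
    r"(?P<postfix_keyvalue_data>.*?)"                            # ~ POSTFIX_KEYVALUE_DATA
    r"(?: \((?P<postfix_smtp_response>.*)\))?$"                  # trailing SMTP response
)


def parse_postfix_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def decode_pri(pri):
    """Split a syslog PRI value into (facility, severity); PRI = facility * 8 + severity."""
    return divmod(int(pri), 8)


fields = parse_postfix_line(SAMPLE)
```

For the sample line, decode_pri(fields["syslog5424_pri"]) gives facility 2 (mail) and severity 6 (informational), unlike the user-level/Notice values in the question's output.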
Filter:
filter {
  if [type] == "postfix" {
    grok {
      patterns_dir => "/etc/logstash/patterns"
      match => { "message" => "%{SYSLOG5424PRI}%{SYSLOGTIMESTAMP} %{SYSLOGPROG}: %{POSTFIX_SMTP_DELIVERY}" }
    }
  }
}
Save the Postfix patterns in the directory given by patterns_dir (/etc/logstash/patterns here).
Output:
{
  "postfix_queueid" => "28D40A036B",
  "@timestamp" => 2017-02-23T08:15:32.546Z,
  "postfix_smtp_response" => "250 2.0.0 Ok: queued as 9030A15D0",
  "port" => 50228,
  "postfix_keyvalue_data" => "to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent",
  "syslog5424_pri" => "22",
  "@version" => "1",
  "host" => "10.0.2.2",
  "pid" => "18852",
  "program" => "postfix/smtp",
  "message" => "<22>Sep 17 19:12:14 postfix/smtp[18852]: 28D40A036B: to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 9030A15D0)"
}
All of the grok patterns above are either standard ones or were written by someone else for this purpose. Many people use Postfix, but because its logs are fairly complex, only a few have written patterns for it.
Once the patterns are in place, you can get pretty crafty with your Logstash configuration.
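One natural next step is to break postfix_keyvalue_data into individual fields, since the question asks for multiple fields. Inside Logstash this would normally be done in the pipeline itself (there is a kv filter for this kind of data). The Python sketch below only illustrates the logic. It assumes pairs are separated by ", " and that values never contain that separator, and split_keyvalues is a hypothetical helper name.

```python
def split_keyvalues(data):
    """Split 'key=value, key=value' text into a dict.

    Assumption: pairs are separated by ', ' and no value contains ', '.
    """
    fields = {}
    for pair in data.split(", "):
        key, sep, value = pair.partition("=")
        if sep:  # ignore fragments that are not key=value
            fields[key] = value
    return fields


keyvalue_data = (
    "to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, "
    "delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent"
)
parsed = split_keyvalues(keyvalue_data)
```

With the sample data, parsed contains the six keys to, relay, delay, delays, dsn and status, each mapped to its string value.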
Source: https://stackoverflow.com/questions/18857831/logstash-parse-log-field