I am trying to use Logs Insights with data containing JSON in one of the fields, and to parse the JSON fields
My data looks like the following when I put it in insig
Also another parse with help of regex
Assume your @message
is:
I, [2018-12-19T23:42:52.629855 #23447] INFO -- : [2ce588f1-c27d-4a55-ac05-62a75b39e762] {"method":"GET","path":"/api/v1/professionals/"}
You can extract method by
fields @timestamp, @message
| parse @message /\"method\":\"(?<method_type>.*?)\"/
CloudWatch Insights Logs automatically discovers fields for the following log types:
Lambda logs
CloudWatch Logs Insights automatically discovers log fields in Lambda logs, but only for the first embedded JSON fragment in each log event (Note: emphasis mine). If a Lambda log event contains multiple JSON fragments, you can parse and extract the log fields by using the parse command. For more information, see Fields in JSON Logs.
CloudTrail logs
See fields in JSON logs.
Source: Supported Logs and Discovered Fields
If @message
is I, [2018-12-11T13:20:27] INFO -- : {"method":"GET"}
Then you can select and filter the fields like so:
fields @timestamp, @message, method
| filter method = "GET"
| sort @timestamp desc
It works with nested fields too, i.e. params.format = "json"
or results.0.firstName = "Paul"
.
Would this regex query within the parse command help you?
filter @message like / \"path\":\"/
| parse @message /(?<@endpt>((\/[a-zA-Z0-9_{}()-?]+){1,}))/
Good luck!
Building on @pyb insights, I was able to use parse @message '"path":"*"' as path
to extract the path from any place in the @message
.
You can go on to get your method by piping another parse @message '"method":"*"' as method
without concern for ordering as it is a second global plain text search on @message
In the case where your @message
is:
I, [2018-12-19T23:42:52.629855 #23447] INFO -- : [2ce588f1-c27d-4a55-ac05-62a75b39e762] {"method":"GET","path":"/api/v1/professionals/ID","format":"json","controller":"API::V1::Public::ProfessionalsController","action":"show","status":200,"duration":285.27,"view":222.36,"time":"2018-12-19T23:42:52.344+00:00","params":{"include":"xxx","format":"json","compress":false,"id":"ID"},"@timestamp":"2018-12-19T23:42:52.629Z","@version":"1","message":"[200] GET /api/v1/professionals/ID (API::V1::Public::ProfessionalsController#show)"}
Using:
parse @message '"path":"*"' as path | parse @message '"method":"*"' as method
will result in the fields: path = '/api/v1/professionals/ID'
and method = 'GET'
Note that this is still simply string parsing and as such, it has no concept of nested keys like params.format
would not find json
, however using just format
would, so long as there wasn't another format
string anywhere in your @message
.
Also note that this is for the case where Insights is not discovering your JSON in the message. I belive this is the case that @pyb was referring to in this answer. My logs aren't being discovered either using the following format
info - Request: {"method":"POST","path":"/auth/login/","body":{"login":{"email":"email@example.com","password":"********"}},"uuid":"36d76df2-aec4-4549-8b73-f237e8f14e23","ip":"*.*.*.*"}
You can use the parse command to extract fields.
If @message
is
I, [2018-12-11T13:20:27] INFO -- : {"method":"GET"}
Then you extract the fields like so:
fields @timestamp, @message
| parse "I, [*T*] INFO -- : {"method":"*"}" as @date, @time, @method
| filter method=GET
| sort @timestamp desc
| limit 20
The documentation is rather light for now. I am able to get results by replacing the wildcard *
with a regular expression, but then the parsing fails.