In Bash, how can I parse multiple newline delimited JSON objects from a log file?

懵懂的女人 提交于 2019-12-04 04:53:38

问题


I am parsing through a log file and get result lines (using grep) like the following:

2017-01-26 17:19:40 +0000 docker: {"source":"stdout","log":"I, [2017-01-26T17:19:40.703988 #24]  INFO -- : {\"tags\":\"structured_log\",\"payload\":{\"results\":[{\"baserate\":\"-1\"}]},\"commit_stamp\":1485451180,\"resource\":\"google_price_result_metric\",\"object_id\":\"20170126171940700\"}","container_id":"6ecbf7f64e4c9557e9dd1efbc6666a3c6c53f9cd5c18414ed5633cad8c302e","container_name":"/test-container-b49c8188c3ebe4b93300"}
2017-01-26 17:19:40 +0000 docker: {"container_id":"6ecbf7f64e4c9557e9dd1efbc6666a3c6c53f9cd5c18414ed5633cad8c302e","container_name":"/test-container-b49c8188c3ebe4b93300","source":"stdout","log":"I, [2017-01-26T17:19:40.704364 #24]  INFO -- : method=POST path=/prices.xml format=xml controller=TestController action=prices status=200 duration=1686.51 view=0.08 db=0.62"}

I then extract the JSON objects with the following command:

... | grep -o -E "\{.*$"

I know I can parse a single line with python -mjson.tool like so:

... | grep -o -E "\{.*$" | tail -n1 | python -mjson.tool

But I want to parse both lines (or n lines). How can I do this in bash? (I think xargs is supposed to let me do this, but I am new to the tool and can't figure it out)


回答1:


jq can be told to accept plain text as input, and attempt to parse an extracted subset as JSON. Consider the following example, tested with jq 1.5:

jq -R 'capture("docker: (?<json>[{].*[}])$") | .json? | select(.) | fromjson' <<'EOF'
2017-01-26 17:19:40 +0000 docker: {"source":"stdout","log":"I, [2017-01-26T17:19:40.703988 #24]  INFO -- : {\"tags\":\"structured_log\",\"payload\":{\"results\":[{\"baserate\":\"-1\"}]},\"commit_stamp\":1485451180,\"resource\":\"google_price_result_metric\",\"object_id\":\"20170126171940700\"}","container_id":"6ecbf7f64e4c9557e9dd1efbc6666a3c6c53f9cd5c18414ed5633cad8c302e","container_name":"/test-container-b49c8188c3ebe4b93300"}
2017-01-26 17:19:40 +0000 docker: {"container_id":"6ecbf7f64e4c9557e9dd1efbc6666a3c6c53f9cd5c18414ed5633cad8c302e","container_name":"/test-container-b49c8188c3ebe4b93300","source":"stdout","log":"I, [2017-01-26T17:19:40.704364 #24]  INFO -- : method=POST path=/prices.xml format=xml controller=TestController action=prices status=200 duration=1686.51 view=0.08 db=0.62"}
EOF

...properly yields:

{
  "source": "stdout",
  "log": "I, [2017-01-26T17:19:40.703988 #24]  INFO -- : {\"tags\":\"structured_log\",\"payload\":{\"results\":[{\"baserate\":\"-1\"}]},\"commit_stamp\":1485451180,\"resource\":\"google_price_result_metric\",\"object_id\":\"20170126171940700\"}",
  "container_id": "6ecbf7f64e4c9557e9dd1efbc6666a3c6c53f9cd5c18414ed5633cad8c302e",
  "container_name": "/test-container-b49c8188c3ebe4b93300"
}
{
  "container_id": "6ecbf7f64e4c9557e9dd1efbc6666a3c6c53f9cd5c18414ed5633cad8c302e",
  "container_name": "/test-container-b49c8188c3ebe4b93300",
  "source": "stdout",
  "log": "I, [2017-01-26T17:19:40.704364 #24]  INFO -- : method=POST path=/prices.xml format=xml controller=TestController action=prices status=200 duration=1686.51 view=0.08 db=0.62"
}


来源:https://stackoverflow.com/questions/41904454/in-bash-how-can-i-parse-multiple-newline-delimited-json-objects-from-a-log-file

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!