I receive a very large JSON stream (several GB) from curl and am trying to process it with jq.
I only want to parse the relevant part of the output with jq.
To get:
{"key1": "row1", "key2": "row1"}
{"key1": "row2", "key2": "row2"}
From:
{
"results":[
{
"columns": ["n"],
"data": [
{"row": [{"key1": "row1", "key2": "row1"}], "meta": [{"key": "value"}]},
{"row": [{"key1": "row2", "key2": "row2"}], "meta": [{"key": "value"}]}
]
}
],
"errors": []
}
Do the following, which is equivalent to jq -c '.results[].data[].row[]', but using streaming:
jq -cn --stream 'fromstream(1|truncate_stream(inputs | select(.[0][0] == "results" and .[0][2] == "data" and .[0][4] == "row") | del(.[0][0:5])))'
What this does is:
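- Turn the JSON into a stream of [path, value] events (with --stream).
- Select the events under the path .results[].data[].row[] (with select(.[0][0] == "results" and .[0][2] == "data" and .[0][4] == "row")).
- Discard the leading parts of each path, such as "results",0,"data",0,"row" (with del(.[0][0:5])).
- Turn the resulting stream back into JSON with the fromstream(1|truncate_stream(…)) pattern from the jq FAQ.

For example: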
echo '
{
"results":[
{
"columns": ["n"],
"data": [
{"row": [{"key1": "row1", "key2": "row1"}], "meta": [{"key": "value"}]},
{"row": [{"key1": "row2", "key2": "row2"}], "meta": [{"key": "value"}]}
]
}
],
"errors": []
}
' | jq -cn --stream '
fromstream(1|truncate_stream(
inputs | select(
.[0][0] == "results" and
.[0][2] == "data" and
.[0][4] == "row"
) | del(.[0][0:5])
))'
This produces the desired output.
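To see why the select, del(.[0][0:5]) and 1|truncate_stream steps are each needed, it can help to look at the intermediate events. The sketch below is only an illustration: the one-row input and the variable names sample, events and trimmed are my own, not part of the original data, and it assumes jq is on the PATH.

```shell
# Hypothetical one-row input, trimmed down from the sample above.
sample='{"results":[{"data":[{"row":[{"key1":"row1"}]}]}]}'

# Step 1: the raw events that --stream emits, one per line.
# Leaf values appear as [path, value]; closing events carry a path only.
events=$(printf '%s' "$sample" | jq -c --stream .)
echo "$events"

# Step 2: keep only events under .results[].data[].row and
# drop the first five path components ("results",0,"data",0,"row").
trimmed=$(printf '%s' "$sample" | jq -cn --stream '
  inputs
  | select(.[0][0] == "results" and .[0][2] == "data" and .[0][4] == "row")
  | del(.[0][0:5])')
echo "$trimmed"
```

After step 2 the remaining paths should start with the index inside row, for example [[0,"key1"],"row1"]. Applying 1|truncate_stream strips that leading index and drops events whose path is too short, so fromstream can rebuild each row element as its own top-level object.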