There's valid JSON in a JavaScript block on an HTML page that I want to parse with a shell script.
First of all I would like to get the entire json string from {
to
Parsing HTML with Unix command-line tools is usually not recommended. But if you know your marker string, foo.bar.Processor.message, you can use this sed + jq solution:
sed -n 's/foo\.bar\.Processor\.message(\([^)]*\).*/\1/p' file.html |
jq -r '.head.url | split(";")[1] | split("=")[1]'
347EDAFA2B136D7825745B0A490DE32
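To see what the sed stage hands to jq, here is a minimal sketch run against a made-up page. The sample HTML, the example.com URL, and the exact JSON shape are assumptions inferred from the jq filter above, not taken from the real page.

```shell
# Hypothetical sample page; the JSON shape is assumed from the jq filter.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<html><body>
<script>
foo.bar.Processor.message({"head":{"url":"http://example.com/page;barid=347EDAFA2B136D7825745B0A490DE32"}});
</script>
</body></html>
EOF

# -n suppresses the default output; the trailing p prints only lines
# where the substitution matched.
# In basic regex syntax, ( is a literal parenthesis and \( ... \) is
# the capture group, so \1 is everything up to the first ")".
json=$(sed -n 's/foo\.bar\.Processor\.message(\([^)]*\).*/\1/p' "$tmp")
printf '%s\n' "$json"

rm -f "$tmp"
```

Two limits follow from the pattern itself: sed works line by line, so the call and its JSON argument have to sit on one line, and the capture stops at the first ")", so a JSON string value containing a closing parenthesis would be cut short.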
In the absence of jq, you can use this sed + GNU grep solution:
sed -n 's/foo\.bar\.Processor\.message(\([^)]*\).*/\1/p' file.html |
grep -oP ';barid=\K\w+'
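The sed + grep pipeline can also be wrapped in a small function that reads the page from standard input. This is only a sketch: the function name extract_barid and the sample input are hypothetical, and it assumes a GNU grep built with PCRE support.

```shell
# extract_barid is a hypothetical helper name; it reads HTML on stdin.
extract_barid() {
  sed -n 's/foo\.bar\.Processor\.message(\([^)]*\).*/\1/p' |
    grep -oP ';barid=\K\w+'   # -P (PCRE) and \K are GNU grep features
}

# Hypothetical sample input; the URL shape is an assumption.
barid=$(extract_barid <<'EOF'
<script>
foo.bar.Processor.message({"head":{"url":"http://example.com/page;barid=347EDAFA2B136D7825745B0A490DE32"}});
</script>
EOF
)
printf '%s\n' "$barid"
```

Unlike the jq version, grep does not parse the JSON. It prints the word characters that follow ;barid= wherever that text appears in the captured string, so it does not depend on the value living under .head.url.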