Question
Hi, below is my text file:
{"Author":"john"
"subject":"java"
"title":"java cook book.pdf"}
{"title":"Php book.pdf"
"Author":"Smith"
"subject":"PHP"}
{"Author":"Smith"
"title":"Java book.pdf"}
From the above data I want to extract all titles that contain the word "java". I should get the following output:
java cook book.pdf
Java book.pdf
Please suggest how to do this.
Thanks
Answer 1:
GNU sed
sed -r '/title.*java/I!d;s/.*:.(.*).}$/\1/' file
java cook book.pdf
Java book.pdf
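To make the one-liner easier to follow, here is a small sketch that rebuilds the sample data from the question and runs the same sed command with each part commented. The temporary file and the variable names `sample` and `titles` are only there for illustration, and GNU sed is assumed (the `I` flag is a GNU extension).

```shell
# Recreate the sample data from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"Author":"john"
"subject":"java"
"title":"java cook book.pdf"}
{"title":"Php book.pdf"
"Author":"Smith"
"subject":"PHP"}
{"Author":"Smith"
"title":"Java book.pdf"}
EOF

# Command 1: /title.*java/I!d   deletes every line that does NOT match
#            "title ... java"; the I flag makes the match case-insensitive.
# Command 2: s/.*:.(.*).}$/\1/  keeps only the text between the quotes that
#            follow the colon, dropping the closing quote and brace.
titles=$(sed -r '/title.*java/I!d;s/.*:.(.*).}$/\1/' "$sample")
printf '%s\n' "$titles"
rm -f "$sample"
```

Note that the substitution only applies when the title line ends with `}`. A matching title line without the closing brace would be printed unchanged, quotes and all.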
Answer 2:
You can try something like this with awk:
awk -F: '$1~/title/&&tolower($2)~/java/{gsub(/["}]/,"",$2);print $2}' file
Explanation:
- -F: sets the field separator to :
- $1~/title/ checks whether the first field contains title
- tolower($2)~/java/ checks the second field for java, case-insensitively
- gsub(/["}]/,"",$2) removes the " and } characters from the second field
- print $2 prints the second field
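As a quick illustration of the field splitting described above, this sketch feeds one title line from the question through awk and shows what ends up in each field. The variable names `line`, `first` and `second` are made up for the example.

```shell
# One title line taken from the sample data in the question.
line='"title":"java cook book.pdf"}'

# With -F: the line is split at the colon into two fields.
first=$(printf '%s\n' "$line" | awk -F: '{print $1}')
second=$(printf '%s\n' "$line" | awk -F: '{print $2}')

# Show both fields exactly as awk sees them.
printf '$1 = %s\n$2 = %s\n' "$first" "$second"
```

The second field still carries the surrounding quotes and the closing brace, so both have to be stripped before the title is printed.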
Answer 3:
I would avoid any complex solution and rely on good old grep+awk+tr instead:
$ grep '"title":' test.txt | grep '[Jj]ava' | awk -F: '{print $2}' | tr -d '"}'
java cook book.pdf
Java book.pdf
which works as follows:
- extract all lines which contain "title":
- extract from these lines all which contain either Java or java
- split these lines by : and show the second field
- remove the " and } signs
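The steps above can be sketched as a small shell function. This is only an illustration: the function name `extract_java_titles` and the demo file are hypothetical, and `grep -i` is swapped in for `[Jj]ava` so that spellings such as JAVA are matched too.

```shell
# Hypothetical helper: print every title that contains "java" in any letter case.
extract_java_titles() {
    grep '"title":' "$1" |        # keep only the title lines
        grep -i 'java' |          # -i also matches JAVA, jAvA, ...
        awk -F: '{print $2}' |    # take the field after the first colon
        tr -d '"}'                # strip quotes and closing braces
}

# Small demo file (contents are made up for illustration).
demo=$(mktemp)
cat > "$demo" <<'EOF'
{"Author":"john"
"subject":"java"
"title":"JAVA cook book.pdf"}
{"title":"Php book.pdf"
"Author":"Smith"}
EOF
result=$(extract_java_titles "$demo")
printf '%s\n' "$result"
rm -f "$demo"
```

One limitation worth knowing: a title that itself contains a colon would be cut off at that colon, because awk prints only the second field.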
Answer 4:
You should definitely use a JSON parser to get flawless results. I like the one provided with PHP, and if your file is, as shown, a bunch of JSON blocks separated by blank lines:
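foreach (explode("\n\n", file_get_contents('/your/file.json_blocks')) as $js_block):
    // Pass true to get an associative array; by default json_decode() returns an object
    $json = json_decode(trim($js_block), true);
    // stripos() returns 0 for a match at the very start, so compare against false
    if (isset($json['title']) && stripos($json['title'], 'java') !== false):
        echo trim($json['title']), PHP_EOL;
    endif;
endforeach;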
This will be a lot more surefire than doing the same with any combination of sed/awk/grep et al., simply because JSON follows a specific format and should be handled with a parser. As an example, a simple newline inside the 'title' entry, which has no real meaning to the JSON, would break the solution provided by Jaypal. For a similar problem, see why you shouldn't parse XHTML with regex: RegEx match open tags except XHTML self-contained tags
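The fragility described here can be reproduced from the shell. The sketch below is not the PHP approach from this answer: it swaps in jq as the JSON parser and assumes jq is installed (the jq step is skipped otherwise). The sample is valid JSON, with commas, and has a line break between "title": and its value. The names `sample`, `line_based` and `parsed` are illustrative.

```shell
# Valid JSON, with a line break between "title": and its value.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"Author":"Smith",
"title":
"Java book.pdf"}
EOF

# Line-based approach in the style of the awk answer: it finds nothing,
# because the key and the value are no longer on the same line.
line_based=$(awk -F: '$1~/title/&&tolower($2)~/java/{gsub(/["}]/,"",$2);print $2}' "$sample")

# Parser-based approach (assumes jq is available): whitespace between
# JSON tokens does not matter to a real parser.
parsed=""
if command -v jq >/dev/null 2>&1; then
    parsed=$(jq -r 'select(.title != null) | .title | select(test("java"; "i"))' "$sample")
fi

printf 'awk: [%s]\njq:  [%s]\n' "$line_based" "$parsed"
rm -f "$sample"
```

With jq present, the awk result stays empty while the parser still returns the title, which is the point this answer is making.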
Source: https://stackoverflow.com/questions/17086628/how-to-extract-text-which-matches-particular-fields-in-text-file-using-linux-com