How to extract text that matches particular fields in a text file using Linux commands

Question

Hi, below is my text file:

{"Author":"john"
  "subject":"java"
  "title":"java cook book.pdf"}

{"title":"Php book.pdf"
 "Author":"Smith"
 "subject":"PHP"}

{"Author":"Smith"
"title":"Java book.pdf"}

From the above data I want to extract all titles that contain the word "java", in any letter case. I should get the following output:

java cook book.pdf
Java book.pdf

Please suggest how to do this.

Thanks

Answer 1:

GNU sed

sed -r '/title.*java/I!d;s/.*:.(.*).}$/\1/' file

java cook book.pdf
Java book.pdf
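
For readers less familiar with sed, the one-liner can be unpacked into one command per -e option. This is only a sketch: it assumes GNU sed (the I flag on the address is a GNU extension), and the file name books.txt and the helper java_titles_sed are hypothetical names chosen for illustration.

```shell
# Recreate the sample data from the question in a hypothetical file.
cat > books.txt <<'EOF'
{"Author":"john"
  "subject":"java"
  "title":"java cook book.pdf"}

{"title":"Php book.pdf"
 "Author":"Smith"
 "subject":"PHP"}

{"Author":"Smith"
"title":"Java book.pdf"}
EOF

# Command 1: delete every line that does NOT match title...java (I = ignore case).
# Command 2: on the surviving lines, keep only the text between the quotes
#            that follow the colon, dropping the trailing "} as well.
java_titles_sed() {
  sed -r \
    -e '/title.*java/I!d' \
    -e 's/.*:.(.*).}$/\1/' \
    "$1"
}

java_titles_sed books.txt
```

Note that the substitution only strips the wrapper when the title line ends in "}, as it does in the sample; a matching title line that ends differently would be printed unchanged.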

Answer 2:

You can try something like this with awk:

awk -F: '$1~/title/&&tolower($2)~/java/{gsub(/["}]/,"",$2);print $2}' file

Explanation:

-F: sets the field separator to :
$1~/title/ checks that the first column contains title
tolower($2)~/java/ checks the second column for java, case-insensitively
gsub(..) removes the " characters and the closing }
print $2 prints the second column
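
The steps above can also be laid out as a multi-line awk program, which is easier to read and adapt. This is a minimal sketch; the file name books.txt and the helper java_titles_awk are hypothetical. It strips the closing } along with the quotes, because the brace stays attached to the second field whenever the title is the last entry in a block.

```shell
# Recreate the sample data from the question in a hypothetical file.
cat > books.txt <<'EOF'
{"Author":"john"
  "subject":"java"
  "title":"java cook book.pdf"}

{"title":"Php book.pdf"
 "Author":"Smith"
 "subject":"PHP"}

{"Author":"Smith"
"title":"Java book.pdf"}
EOF

java_titles_awk() {
  awk -F: '
    # Keep lines whose first field names "title" and whose second field
    # contains "java" in any letter case.
    $1 ~ /title/ && tolower($2) ~ /java/ {
      gsub(/["}]/, "", $2)   # drop the quotes and the closing brace
      print $2
    }
  ' "$1"
}

java_titles_awk books.txt
```

Because the field separator is a colon, a title that itself contains a colon would be cut off at that point.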

Answer 3:

I would avoid anything complex and rely on good old grep + awk + tr instead:

$ grep '"title":' test.txt | grep '[Jj]ava' | awk -F: '{print $2}' | tr -d '"}'
java cook book.pdf
Java book.pdf

This works as follows:

extract all lines which contain "title":
extract from these lines all which contain either Java or java
split these lines by : and show second field
remove " and } signs
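
To watch the stages described above, the pipeline can be run in pieces. This is a sketch only: it recreates the question's data in test.txt, and the helper name java_titles_pipeline is hypothetical. The tr set is quoted here so that the shell passes it through unchanged.

```shell
# Recreate the sample data from the question.
cat > test.txt <<'EOF'
{"Author":"john"
  "subject":"java"
  "title":"java cook book.pdf"}

{"title":"Php book.pdf"
 "Author":"Smith"
 "subject":"PHP"}

{"Author":"Smith"
"title":"Java book.pdf"}
EOF

# Stages 1 and 2 only: title lines that mention Java or java.
# The key, quotes and closing brace are still present at this point.
grep '"title":' test.txt | grep '[Jj]ava'

# All four stages wrapped in a hypothetical helper.
java_titles_pipeline() {
  grep '"title":' "$1" | grep '[Jj]ava' | awk -F: '{print $2}' | tr -d '"}'
}

java_titles_pipeline test.txt
```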

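Answer 4:

You should definitely use a JSON parser to get reliable results. I like the one that ships with PHP. Assuming your file is a series of JSON blocks separated by blank lines, and each block is valid JSON (the sample as posted omits the commas between members, which json_decode would reject):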

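foreach (explode("\n\n", file_get_contents('/your/file.json_blocks')) as $js_block):
    // Decode to an associative array; json_decode() returns null on invalid JSON.
    $json = json_decode(trim($js_block), true);
    // stripos() returns 0 for a match at the start, so compare against false.
    if (is_array($json) && isset($json['title']) && stripos($json['title'], 'java') !== false):
        echo trim($json['title']), PHP_EOL;
    endif;
endforeach;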

This is far more reliable than any combination of sed/awk/grep et al., simply because JSON follows a specific format and should be read with a parser. For example, a simple newline in the 'title' entry, which has no real meaning in JSON, will break the solution provided by Jaypal. For a similar problem, see this question on parsing XHTML with regex and why you shouldn't do it: RegEx match open tags except XHTML self-contained tags
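
The fragility is easy to reproduce. The sketch below assumes a hypothetical file, wrapped.txt, in which the title value sits on the line after its key; JSON allows this because whitespace between tokens is insignificant. A line-oriented awk filter of the kind shown in the earlier answer then finds nothing, since the key and the value never share a line.

```shell
# Hypothetical input: valid JSON, but the value is on the line after the key.
cat > wrapped.txt <<'EOF'
{"Author":"Smith",
 "title":
   "Java book.pdf"}
EOF

# Line-oriented filter: it needs "title" and "java" on the same line.
line_based_titles() {
  awk -F: '$1 ~ /title/ && tolower($2) ~ /java/ {gsub(/["}]/, "", $2); print $2}' "$1"
}

line_based_titles wrapped.txt   # prints nothing
```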

Source: https://stackoverflow.com/questions/17086628/how-to-extract-text-which-matches-particular-fields-in-text-file-using-linux-com

Tags

Linux

sed

awk