Question
I've been searching for a long time and have not been able to find a working answer for my problem.
I have a line from an HTML file, extracted with sed '162!d' skinlist.html, which contains the text <a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard">.
I want to extract the text Dwarf Red Beard, but that text is modular (can be changed), so I would like to extract the text between title=" and ".
I cannot, for the life of me, figure out how to do this.
Answer 1:
awk 'NR==162 {print $4}' FS='"' skinlist.html
- set field separator to "
- print only line 162
- print field 4
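To see why the title lands in field 4, it can help to print every field produced by splitting on the double quote. This is a minimal sketch that uses the sample line from the question held in a variable (an assumption made so that it runs without skinlist.html) and -F'"' in place of the FS='"' assignment:

```shell
# Sample line copied from the question; skinlist.html itself is not needed here.
line='<a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard">'

# Print each field that results from splitting the line on the double quote.
printf '%s\n' "$line" | awk -F'"' '{ for (i = 1; i <= NF; i++) printf "%d: [%s]\n", i, $i }'

# Keep only field 4, the value of the title attribute.
title=$(printf '%s\n' "$line" | awk -F'"' '{ print $4 }')
printf '%s\n' "$title"
```

The field number depends on the attribute order: if another quoted attribute came before title, the title would no longer be in field 4.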
Answer 2:
Solution in sed
sed -n '162 s/^.*title="\(.*\)".*$/\1/p' skinlist.html
Extracts line 162 in skinlist.html and captures the title attribute's contents in \1.
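One caveat: the \(.*\) group in the command above is greedy, so if more quoted attributes followed title on the same line, it would capture everything up to the last double quote. A possible variation replaces it with [^"]*, which stops at the first closing quote. The sample line with an extra class attribute below is hypothetical and only there to show the difference:

```shell
# Hypothetical line with an extra attribute after title (not from the question).
line='<a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard" class="skin">'

# Greedy group: runs on to the last double quote on the line.
greedy=$(printf '%s\n' "$line" | sed -n 's/^.*title="\(.*\)".*$/\1/p')

# Negated bracket expression: stops at the first double quote after title=".
strict=$(printf '%s\n' "$line" | sed -n 's/^.*title="\([^"]*\)".*$/\1/p')

printf 'greedy: %s\nstrict: %s\n' "$greedy" "$strict"
```

For the line in the question both forms give the same result, because the title is the last quoted value on it.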
Answer 3:
The shell's variable expansion syntax allows you to trim prefixes and suffixes from a string:
line="$(sed '162!d' skinlist.html)" # extract the relevant line from the file
temp="${line#* title=\"}" # remove from the beginning through the first match of ' title="'
if [ "$temp" = "$line" ]; then
    echo "title not found in '$line'" >&2
else
    title="${temp%%\"*}" # remove from the first '"' through the end
fi
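If the same extraction is needed for more than one line, the two expansions can be wrapped in a function. This is only a sketch: extract_title is a hypothetical name, and the sample line is passed in directly rather than read from skinlist.html.

```shell
# extract_title is a hypothetical helper name, not part of the original answer.
extract_title() {
    line=$1
    # Remove everything from the start through the first ' title="'.
    temp="${line#* title=\"}"
    if [ "$temp" = "$line" ]; then
        # Nothing was removed, so the line has no title attribute.
        echo "title not found in '$line'" >&2
        return 1
    fi
    # Remove everything from the first remaining '"' through the end.
    printf '%s\n' "${temp%%\"*}"
}

title=$(extract_title '<a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard">')
printf '%s\n' "$title"
```

Because only shell expansions are used, no extra process is started for the extraction itself.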
Answer 4:
You can pass it through another sed, or add expressions to that sed, like -e 's/.*title="//g' -e 's/">.*$//g'
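Put together as one command, the extra expressions might look like the sketch below. It builds a small temporary stand-in for skinlist.html (an assumption, since the real file is not available), so line 2 of that file plays the role of line 162:

```shell
# Build a small stand-in for skinlist.html; the real file is not available here.
tmp=$(mktemp)
printf '%s\n' '<ul>' '<a href="/skin/dwarf-red-beard-734/" title="Dwarf Red Beard">' '</ul>' > "$tmp"

# '2!d' keeps only line 2; the next two expressions strip the text
# before the title value and the '">' that follows it.
title=$(sed -e '2!d' -e 's/.*title="//g' -e 's/">.*$//g' "$tmp")
rm -f "$tmp"

printf '%s\n' "$title"
```

Note that the last expression relies on the title being immediately followed by ">, so it would leave extra text behind if another attribute came after title.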
Answer 5:
Also with sed (this pattern only matches titles made up of letters and spaces):
sed -n '162 s/.*"\([a-zA-Z ]*\)"./\1/p' skinlist.html
Source: https://stackoverflow.com/questions/16705927/print-text-between-two-strings-on-the-same-line