Question
I was reading the question "Extract lines between 2 tokens in a text file using bash" because I have a very similar problem. I have to extract text from this XML file (and save it to $variable before printing):
<!--more labels up this line-->
<ExtraDataItem name="GUI/LastVMSelected" value="14cd3204-4774-46b8-be89-cc834efcba89"/>
<!--more labels and text down this line-->
I only need the content of value= (without the surrounding quotes and without 'value='). I think I first have to search for "GUI/LastVMSelected" to find the right line, because other lines could have a similar value field, and the value on that line is the one I want.
Answer 1:
If they are on the same line (as they seem to be from your example), it's even easier. Just:
sed -ne '/name="GUI\/LastVMSelected"/s/.*value="\([^"]*\)".*/\1/p'
Explanation:
- -n: Suppress default print
- /name="GUI\/LastVMSelected"/: only lines matching this pattern
- s/.*value="\([^"]*\)".*/\1/p:
  - substitute the whole line, capturing the parenthesized part (the content of the value attribute)
  - and print the result
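Since the question asks for the result in a variable, the command above can be wrapped in a command substitution. This is a minimal sketch: the temporary sample file and the variable name last_vm are made up for illustration, and the second ExtraDataItem line is a decoy that shows the name filter at work.

```shell
# Hypothetical sample file standing in for the real XML document.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<ExtraDataItem name="GUI/Other" value="not-this-one"/>
<ExtraDataItem name="GUI/LastVMSelected" value="14cd3204-4774-46b8-be89-cc834efcba89"/>
EOF

# Capture the sed output in a variable instead of printing it directly.
last_vm=$(sed -ne '/name="GUI\/LastVMSelected"/s/.*value="\([^"]*\)".*/\1/p' "$sample")

printf '%s\n' "$last_vm"
rm -f "$sample"
```

If more than one line carries that name, the variable will hold one value per line.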
Answer 2:
I'm assuming that you're extracting from an XML document. If that is the case, have a look at the XMLStarlet command-line tools for processing XML. Its documentation covers querying XML documents.
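As a rough sketch of what such a query might look like, assuming the xmlstarlet binary is installed under that name and the file is well-formed XML. The sample document, its ExtraData root element and the variable name vm_id are invented for the example.

```shell
# Hypothetical, well-formed sample document (the root element name is made up).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<ExtraData>
  <ExtraDataItem name="GUI/Other" value="not-this-one"/>
  <ExtraDataItem name="GUI/LastVMSelected" value="14cd3204-4774-46b8-be89-cc834efcba89"/>
</ExtraData>
EOF

if command -v xmlstarlet >/dev/null 2>&1; then
    # sel = select; -t starts a template; -v prints the value of the XPath expression.
    vm_id=$(xmlstarlet sel -t -v '//ExtraDataItem[@name="GUI/LastVMSelected"]/@value' "$sample")
    printf '%s\n' "$vm_id"
else
    echo 'xmlstarlet is not installed; skipping' >&2
fi
rm -f "$sample"
```

Unlike the line-based approaches, this does not care about attribute order or line breaks. If the real file declares a default XML namespace on its root element, the XPath as written will not match and a namespace binding is needed (xmlstarlet sel has an -N option for that).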
Answer 3:
Use this:
for f in `grep "GUI/LastVMSelected" filename.txt | cut -d " " -f3`; do echo ${f:7:36}; done
- grep gets you only the lines you need
- cut splits the lines using some separator, and returns the Nth result of the split
- -d " " sets the separator to space
- -f3 returns the third result (1-based indexing)
- ${f:7:36} extracts the substring starting at index 7 that is 36 characters long. This gets rid of the leading value=" and the trailing quote, slash, etc.
Obviously if the order of the fields changes, this will break, but if you're just after something quick and dirty that works, this should be it.
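If field order is a concern, one possible variation swaps cut and the fixed-offset substring for prefix and suffix stripping with shell parameter expansion. This is a sketch, not the answer's method: the sample file and the names line, rest and vm_id are invented, and it assumes no earlier attribute on the line has a name ending in value.

```shell
# Hypothetical sample file; the attributes are deliberately in a different
# order than in the question, which would break the cut -f3 approach.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<ExtraDataItem value="14cd3204-4774-46b8-be89-cc834efcba89" name="GUI/LastVMSelected"/>
EOF

# Keep only the first matching line.
line=$(grep 'name="GUI/LastVMSelected"' "$sample" | head -n 1)

rest=${line#*value=\"}   # drop everything up to and including value="
vm_id=${rest%%\"*}       # drop everything from the next double quote onward

printf '%s\n' "$vm_id"
rm -f "$sample"
```

This also avoids hard-coding the length of the value.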
Answer 4:
Using my answer from the question you linked:
sed -n '/<!--more labels up this line-->/{:a;n;/<!--more labels and text down this line-->/b;\|GUI/LastVMSelected|s/.*value="\([^"]*\)".*/\1/p;ba}' inputfile
Explanation:
- -n: don't do an implicit print
- /<!--more labels up this line-->/{: if the starting marker is found, then
- :a: label "a"
- n: read the next line
- /<!--more labels and text down this line-->/b: if it's the ending marker, branch to the end of the script, which leaves the loop (use q instead of b to stop reading the file after the first block)
- \|GUI/LastVMSelected|: if the line matches the string
- s/.*value="\([^"]*\)".*/\1/p: replace the whole line with the text between value=" and the next quote, and print it
- ba: branch to label "a"
- }: end if
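To see the effect of restricting the search to the region between the two markers, here is a small sketch that assumes GNU sed (the one-line form with labels and semicolons is not portable to every sed). The sample file and the variable name vm_id are made up. The substitution is anchored with .* on both sides so that only the value itself is printed, and the first line is a decoy outside the markers that should be ignored.

```shell
# Hypothetical sample file with one matching line outside the markers.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<ExtraDataItem name="GUI/LastVMSelected" value="decoy-outside-the-markers"/>
<!--more labels up this line-->
<ExtraDataItem name="GUI/Other" value="not-this-one"/>
<ExtraDataItem name="GUI/LastVMSelected" value="14cd3204-4774-46b8-be89-cc834efcba89"/>
<!--more labels and text down this line-->
EOF

# Only lines between the start and end markers are considered.
vm_id=$(sed -n '/<!--more labels up this line-->/{:a;n;/<!--more labels and text down this line-->/b;\|GUI/LastVMSelected|s/.*value="\([^"]*\)".*/\1/p;ba}' "$sample")

printf '%s\n' "$vm_id"
rm -f "$sample"
```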
Source: https://stackoverflow.com/questions/4860228/how-to-extract-from-a-file-text-between-tokens-using-bash-scripts