Question
How can I print the IP address (86.23.215.130) from the following line? The entire file (not shown) is the stdout of a wget call (hence HTML). It sounds easy, but I couldn't manage it.
...
<tr><td align=center colspan=3 bgcolor="D0D0D0"><font face="Arial, Monospace" size=+3>86.23.215.130</font></td></tr>
...
Thanks
Answer 1:
Why sed? I believe grep is much better:
grep -iohP '(?<=\x3e)([0-9]+\.){3}[0-9]+(?=\x3c)' file
where \x3e means > and \x3c means < (ASCII hex codes).
sed can do this too, but it's not recommended:
sed -rn 's/.*\x3e(([0-9]+\.){3}[0-9]+)\x3c.*/\1/p' file
Thanks to Mr. Sternad, I improved this a little bit.
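The -P option (Perl-style lookarounds) is a GNU grep feature and is not available in every grep, for example BSD grep. As a rough sketch for such systems, the same idea can be expressed with -E by matching the surrounding angle brackets and stripping them afterwards. The function name extract_ip_portable is made up for illustration, and this variant swaps the lookarounds for a tr step rather than reproducing the command above exactly.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not from the thread):
# match ">a.b.c.d<" with extended regex, then strip the angle brackets.
# Reads the files given as arguments, or stdin when there are none.
extract_ip_portable() {
    grep -ohE '>([0-9]+\.){3}[0-9]+<' "$@" | tr -d '<>'
}

# Usage with the sample line from the question:
printf '%s\n' '<tr><td align=center colspan=3 bgcolor="D0D0D0"><font face="Arial, Monospace" size=+3>86.23.215.130</font></td></tr>' |
    extract_ip_portable
```

Like the original, this only checks the shape of the address, not whether each number is a valid octet.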
Answer 2:
If you want to extract the IP address only, you should use the following command:
sed -E -n 's/.*>([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)<.*/\1/p' file.txt
Here is what it does:
-E: switches sed into extended regex mode (-r in GNU sed)
-n: suppresses the automatic printing of every input line
's/something/something2/p': substitutes something with something2 and prints the line if a substitution was made
([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+): captures four runs of digits, separated by dots
\1: is a reference to the captured group above
Note that this regex does not necessarily find correct IP addresses; it matches any four runs of digits separated by dots, so 999.1.1.1 would be accepted too.
If you want more flexibility (and accuracy), you could use the Regexp::Common module for Perl, which provides a pattern that matches only valid IPv4 addresses.
perl -MRegexp::Common -lne 'print $1 if /($RE{net}{IPv4})/' file.txt
Note that you have to correctly anchor your expression, otherwise an invalid IP, like 486.23.215.130, will be reduced to a valid address of 86.23.215.130.
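If Regexp::Common is not installed, the range check can be approximated with the sed command from this answer followed by a small awk filter. This is a minimal sketch, not a full validator: the function name extract_valid_ipv4 is hypothetical, and it relies on the > and < around the address to act as anchors, so 486.23.215.130 is captured whole and then rejected instead of being trimmed to 86.23.215.130.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): extract the dotted quad
# sitting between '>' and '<', then keep it only if every octet is <= 255.
extract_valid_ipv4() {
    sed -E -n 's/.*>([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)<.*/\1/p' |
    awk -F. '{
        ok = (NF == 4)                 # exactly four dot-separated parts
        for (i = 1; i <= NF; i++)
            if ($i + 0 > 255) ok = 0   # reject out-of-range octets
        if (ok) print
    }'
}

# Usage: one valid and one invalid address in the question's markup.
printf '%s\n' \
    '<td><font size=+3>86.23.215.130</font></td>' \
    '<td><font size=+3>486.23.215.130</font></td>' |
    extract_valid_ipv4
```

Only the first address is printed; the second is dropped because its first octet exceeds 255.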
Answer 3:
IPv4 addresses are four groups of 1-3 digits separated by three periods. To print the lines that contain one:
sed -n -E '/[0-9]{1,3}(\.[0-9]{1,3}){3}/p' infile.txt
Answer 4:
What about this here? Any remarks?
grep "size=+3" | awk -F'[<>]' '{print $7}'
I know ... it assumes that the IP is always at the same place in the line containing size=+3. Your suggestions are all far more generally formulated, hence better applicable to any parse input text.
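To see why $7 is the right field, it can help to list every field awk produces when it splits on < or >. The snippet below is only an illustrative sketch; show_fields is a made-up helper name, and the input is the sample line from the question.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): print each field with its
# index, using the same field separator as the awk command above.
show_fields() {
    awk -F'[<>]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

line='<tr><td align=center colspan=3 bgcolor="D0D0D0"><font face="Arial, Monospace" size=+3>86.23.215.130</font></td></tr>'
printf '%s\n' "$line" | show_fields
```

Each opening tag contributes an empty field (the gap between > and the next <) and a field with the tag's contents, so the text after the third tag, the address, lands in field 7. Any change to the number of tags before the address shifts that index.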
Source: https://stackoverflow.com/questions/36101699/extract-ip-address-from-html-document