I have an external HTML site and I need to extract data from a table on that site. However, the source of the HTML page is badly formatted, except for the table in the code, so I
I'm not sure why nobody mentioned a pure Bash solution, despite its limitations (for example, a file where a tag's closing part is not on the same line; still, you said you've cleaned up the .html).
For your purposes, a quick solution would be a one-liner:
sed -n '/<table>/,/<\/table>/p'
Explanation:
Print everything between the two specified tags, in this case <table> and </table>.
You could also easily put the tag in a variable (for example, to target a different element) and change the output on the fly. But the above solution gives what you asked for without external tools.
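The tag-variable idea might look something like the sketch below. The function name extract_tag and its two arguments are made up for illustration, and it assumes the opening tag is followed by a space or ">" on the same line:

```shell
# Hypothetical helper (not from the original answer): print every line from
# the one containing the opening tag through the one containing the closing tag.
extract_tag() {
  tag=$1    # element name, e.g. table or tr (assumed to be a plain word)
  file=$2   # path to the cleaned-up HTML file
  # "[ >]" lets the opening tag carry attributes, e.g. <table class="x">
  sed -n "/<${tag}[ >]/,/<\/${tag}>/p" "$file"
}
```

Calling it as extract_tag table page.html (page.html being a placeholder file name) would print the table block. Note that sed ranges are line-based, and the closing pattern is only checked starting on the line after the opening match, so an element that opens and closes on a single line will not end the range there.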