Extract part of the code and parse HTML in bash

深忆病人 2020-12-10 15:11

I have an external HTML site and I need to extract data from a table on that site. However, the source of the HTML website is badly formatted except for the table in the code, so I

2 answers
  •  孤街浪徒
    2020-12-10 15:55

    I'm not sure why nobody has mentioned a pure Bash solution, despite its limitations (for example, it assumes the closing tag is not on the same line as the opening tag; then again, you said you've cleaned the .html).

    For your purposes a quick solution would be a 1-liner:

    sed -n '/<table>/,/<\/table>/p'

    Explanation: print everything between two specified tags, in this case <table> and </table>.

    You could also easily put the tag name in a variable and change the output on the fly. But the above solution gives what you asked for without external tools.
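A minimal sketch of the tag-variable idea, assuming the opening and closing tags sit on separate lines. The function name `extract_tag` and the sample file are hypothetical, made up for illustration; only standard `sed` range addressing is used. The opening pattern here is `<tag` followed by a space or `>`, so a tag with attributes also matches, which is slightly looser than the one-liner.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print every line from the first one containing
# "<TAG>" or "<TAG " through the next line containing "</TAG>".
extract_tag() {
  local tag=$1 file=$2
  # Double quotes let the shell expand ${tag} inside the sed script.
  # The slash of the closing tag is escaped because / delimits the address.
  sed -n "/<${tag}[ >]/,/<\/${tag}>/p" "$file"
}

# Hypothetical sample input, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<html>
<p>intro
<table class="data">
<tr><td>1</td></tr>
</table>
<p>outro
</html>
EOF

# Prints the three lines from <table class="data"> through </table>.
extract_tag table "$sample"
```

Calling `extract_tag tr "$sample"` would not behave the same way: `sed` only looks for the end of a range starting on the line after the opening match, so an element that opens and closes on one line makes the range run on to the next closing tag or to the end of the file. That is the limitation mentioned above.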
