How to extract data from an HTML table in a shell script?

[愿得一人] 2020-11-30 11:42

I am trying to write a Bash script that extracts data from an HTML table. Below is an example of the table I need to extract data from:

6 answers
  •  被撕碎了的回忆
    2020-11-30 12:03

    Go with (g)awk, it's capable :-). Here is a solution, but please note that it only works with the exact HTML table format you posted.

     awk -F "</*td>|</*tr>" '/<\/*t[rd]>.*[A-Z][A-Z]/ {print $3, $5, $7 }' FILE
    

    Here you can see it in action: https://ideone.com/zGfLe

    Some explanation:

    1. -F sets the input field separator to a regexp (any opening or closing tr or td tag),

    2. then awk works only on lines that match one of those tags AND contain at least two consecutive uppercase letters after it,

    3. then it prints the needed fields ($3, $5 and $7).
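
    The three steps above can be tried on a small input. This is only a minimal sketch: the table posted with the question is not reproduced on this page, so the sample rows, the names COMPONENT_A and COMPONENT_B, and the helper function extract_cells are all hypothetical. It assumes each data row sits on a single line with three td cells; LC_ALL=C is added here so that [A-Z] matches ASCII uppercase letters only.

```shell
#!/bin/sh
# extract_cells is a hypothetical wrapper around the awk one-liner.
# It reads HTML from stdin (or from files given as arguments).
extract_cells() {
  # -F: split each line on any opening or closing <td>/<tr> tag.
  # Pattern: keep only lines with such a tag followed somewhere by
  # two consecutive uppercase letters (skips headers like "Status").
  # Action: fields 3, 5 and 7 are the three cell contents of a row.
  LC_ALL=C awk -F '</*td>|</*tr>' \
    '/<\/*t[rd]>.*[A-Z][A-Z]/ { print $3, $5, $7 }' "$@"
}

# Hypothetical sample table (an assumption, not the asker's data).
extract_cells <<'EOF'
<table border=1>
<tr>
<td><b>Component</b></td>
<td><b>Status</b></td>
</tr>
<tr><td>COMPONENT_A</td><td>OK</td><td>0.4 s</td></tr>
<tr><td>COMPONENT_B</td><td>ERROR</td><td>1.2 s</td></tr>
</table>
EOF
```

    Because the fields are counted by position, a row split across several lines or a row with extra attributes on its tags would not be picked up correctly, which is why this approach depends on the exact table format.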

    HTH
