Regex for links in HTML text

旧巷少年郎 2020-12-16 04:42

I hope this question is not an RTFM one. I am trying to write a Python script that extracts links from a standard HTML webpage (the tags). I hav

8 answers
  •  無奈伤痛
    2020-12-16 05:32

    Answering your two subquestions there.

    1. I've sometimes subclassed SGMLParser (included in the core Python distribution) and must say it's straightforward.
    2. I don't think HTML lends itself to "well defined" regular expressions since it's not a regular language.
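
A minimal sketch of the parser-subclass approach from point 1, under a few assumptions. It targets Python 3, where SGMLParser (from the `sgmllib` module) is no longer available, so it swaps in `html.parser.HTMLParser` from the standard library instead. The names `LinkExtractor` and `extract_links` are made up for illustration and are not part of any library.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper, not a library class)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser passes tag and attribute names in lowercase;
        # attrs is a list of (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return the href values of all <a> tags found in html_text."""
    parser = LinkExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.links


if __name__ == "__main__":
    sample = (
        '<p><a href="http://example.com/">one</a> '
        "<A HREF='/relative'>two</A> "
        '<!-- <a href="commented-out">three</a> --></p>'
    )
    for link in extract_links(sample):
        print(link)
```

The sample also illustrates point 2: the link inside the HTML comment is not reported, because the parser treats the comment as a comment, whereas a simple `href="..."` regex would likely match it.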
