Parsing text from XML node in Python

傲寒 2020-12-02 02:36

I'm trying to extract URLs from a sitemap like this: https://www.bestbuy.com/sitemap_c_0.xml.gz

I've unzipped and saved the .xml.gz file as an .xml file. The struc

3 answers
  •  挽巷 (original poster)
    2020-12-02 03:38

    I know this is a bit of a zombie reply, but I just posted a tool on GitHub that does exactly what you're looking for, and it's in Python. Feel free to take what you need from the source code (or use it as-is). I'm commenting so that other people who come across this thread will have it.

    Here it is: https://github.com/tcaldron/xmlscrape
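
For readers who only need the URLs, here is a minimal standard-library sketch. It is not taken from the xmlscrape source. It assumes the file follows the standard sitemap layout (`<url><loc>` elements in the `http://www.sitemaps.org/schemas/sitemap/0.9` namespace), which the truncated question does not confirm. The helper name `extract_urls` is hypothetical.

```python
import gzip
import xml.etree.ElementTree as ET

# Fully qualified tag name for <loc> in the standard sitemap namespace.
# Assumption: the sitemap in question uses this namespace.
LOC_TAG = "{http://www.sitemaps.org/schemas/sitemap/0.9}loc"


def extract_urls(path):
    """Return the text of every <loc> element in a sitemap file.

    Hypothetical helper: reads either a plain .xml file or a
    gzip-compressed .xml.gz file, chosen by the file extension.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as fh:
        root = ET.parse(fh).getroot()
    # iter() walks the whole tree, so nesting depth does not matter.
    return [loc.text.strip() for loc in root.iter(LOC_TAG) if loc.text]
```

Calling `extract_urls` on the saved file (or directly on the `.xml.gz` download) should return a list of URL strings. If the list comes back empty, the file probably uses a different namespace or tag name, and `LOC_TAG` would need adjusting.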
