Searching through webpage

Submitted by ε祈祈猫儿з on 2019-12-03 04:03:54

You could do something simple like:


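import re
import urllib.request  # Python 3 replacement for Python 2's urllib2

# Download the page and decode the bytes to text (assumes UTF-8)
with urllib.request.urlopen('http://www.domain.com') as response:
    html_content = response.read().decode('utf-8', errors='replace')

# Collect every non-overlapping match of the pattern in the page source
matches = re.findall('regex of string to find', html_content)

if not matches:
    print('I did not find anything')
else:
    print('My string is in the html')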
Bassdread
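
If the thing you are looking for is a plain string rather than a pattern, characters such as `$` or `.` need escaping before they go into `re.findall`. Here is a rough sketch of that idea; `find_in_html`, its `literal` flag, and the sample HTML are made up for illustration and are not part of the answer above, and the case-insensitive matching is just an assumption about what you might want.

```python
import re

def find_in_html(html_content, needle, literal=True):
    """Return every occurrence of `needle` in `html_content`.

    Hypothetical helper: with literal=True the needle is escaped so
    regex metacharacters are matched as ordinary text.
    """
    pattern = re.escape(needle) if literal else needle
    return re.findall(pattern, html_content, flags=re.IGNORECASE)

# Inline sample so the sketch runs without a network connection
sample_html = "<html><body><p>Price: $4.99 (was $5.99)</p></body></html>"

print(find_in_html(sample_html, "$4.99"))                          # literal search
print(find_in_html(sample_html, r"\$\d+\.\d{2}", literal=False))   # regex search
```

An empty list means nothing was found, so the same `if not matches:` check works on the result.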

lxml is awesome: http://lxml.de/parsing.html

I use it regularly with XPath to extract data from HTML.

The other option is http://www.crummy.com/software/BeautifulSoup/ which is great as well.
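
To give an idea of what the lxml plus XPath approach can look like, here is a small sketch. It parses an inline sample page instead of fetching a URL; the markup, the class names, and the variable names are all invented for the example, and lxml has to be installed separately (for example with pip).

```python
from lxml import html

# Invented sample markup, so the sketch runs without a network connection
sample_page = """
<html><body>
  <h1>Catalog</h1>
  <ul>
    <li class="item">Apple</li>
    <li class="item">Banana</li>
    <li class="other">Cherry</li>
  </ul>
</body></html>
"""

# Parse the string into an element tree
tree = html.fromstring(sample_page)

# XPath: text of every <li> whose class attribute is exactly "item"
items = tree.xpath('//li[@class="item"]/text()')

# XPath: does any <li> contain the text "Banana"?
has_banana = bool(tree.xpath('//li[contains(text(), "Banana")]'))

print(items)
print(has_banana)
```

Compared with a regex over the raw source, this searches the parsed document, so you can restrict the search to particular elements or attributes.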
