I am learning to use both the re module and the urllib module in Python and attempting to write a simple web scraper. Here's the code I've writte
You could scrape a bunch of titles with a couple lines of gazpacho:
from gazpacho import Soup

urls = ["http://google.com", "https://facebook.com", "http://reddit.com"]
titles = []
for url in urls:
    # Download and parse the page in one step
    soup = Soup.get(url)
    # mode="first" returns only the first matching <title> element
    title = soup.find("title", mode="first").text
    titles.append(title)
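Inspecting the list afterwards gives something like this (the exact titles depend on what each site currently serves):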
titles
['Google',
 'Facebook - Log In or Sign Up',
 'reddit: the front page of the internet']
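Since the question mentions re and urllib, here is a rough sketch of the same idea using only the standard library. The helper names extract_title and fetch_title are made up for illustration, and the User-Agent header and UTF-8 fallback are assumptions rather than anything from the original code. A regex is a fragile way to read HTML, so treat this as a learning exercise and not a robust parser.

```python
import re
from typing import Optional
from urllib.request import Request, urlopen

# Hypothetical helpers for illustration; not part of gazpacho.
# DOTALL lets the title span several lines, IGNORECASE accepts <TITLE>.
TITLE_RE = re.compile(r"<title\b[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html: str) -> Optional[str]:
    """Return the text of the first <title> element, or None if there is none."""
    match = TITLE_RE.search(html)
    if match is None:
        return None
    # Collapse runs of whitespace so a multi-line title becomes one line.
    return " ".join(match.group(1).split())


def fetch_title(url: str, timeout: float = 10.0) -> Optional[str]:
    """Download a page with urllib and extract its title."""
    # Assumption: some sites reject the default urllib User-Agent.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        # Assumption: fall back to UTF-8 when the server declares no charset.
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_title(html)
```

With these in place, titles = [fetch_title(url) for url in urls] would build the same kind of list as the gazpacho loop, although the titles themselves depend on what each site returns when you run it.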