So I wanted to get all of the pictures (of the NBA teams) on this page: http://www.cbssports.com/nba/draft/mock-draft
However, my code returns a bit more than that.
I know this can be "traumatic", but for automatically generated pages like this one, where you just want to grab the damn images and never come back, a quick-n-dirty regular expression that matches the desired pattern tends to be my choice (not depending on Beautiful Soup is a great advantage):
## Python 2 (urllib.urlopen and the print statement)
import re
import urllib

source = urllib.urlopen('http://www.cbssports.com/nba/draft/mock-draft').read()

## every image name is an abbreviation made of capital letters, so...
## (the dots are escaped so they only match a literal '.')
for link in re.findall(r'http://sports\.cbsimg\.net/images/nba/logos/30x30/[A-Z]+\.png', source):
    print link
    ## the code above just prints the link;
    ## if you want to actually download, set the flag below to True
    actually_download = False
    if actually_download:
        filename = link.split('/')[-1]
        urllib.urlretrieve(link, filename)
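The snippet above is Python 2. For Python 3, the same idea might look roughly like the sketch below. The function names `extract_logo_urls` and `download_logos`, the de-duplication step, and the `sample` markup are my own assumptions for illustration, not part of the original code. It also assumes the page still serves logos under the same URL pattern.

```python
import re
from urllib.request import urlopen, urlretrieve

# Same logo URL pattern as above; the dots are escaped so they match literally.
LOGO_PATTERN = re.compile(
    r'http://sports\.cbsimg\.net/images/nba/logos/30x30/[A-Z]+\.png'
)


def extract_logo_urls(html):
    """Return the logo URLs found in html, in page order, without duplicates."""
    seen = set()
    urls = []
    for url in LOGO_PATTERN.findall(html):
        if url not in seen:  # assumption: each logo is wanted only once
            seen.add(url)
            urls.append(url)
    return urls


def download_logos(page_url, actually_download=False):
    """Print every logo URL on page_url; save the files only if asked to."""
    html = urlopen(page_url).read().decode('utf-8', errors='replace')
    for url in extract_logo_urls(html):
        print(url)
        if actually_download:
            # Save under the last path component, e.g. "BOS.png".
            urlretrieve(url, url.split('/')[-1])


# Made-up markup (hypothetical), only to show the matching without a network call.
sample = (
    '<img src="http://sports.cbsimg.net/images/nba/logos/30x30/BOS.png">'
    '<img src="http://sports.cbsimg.net/images/nba/logos/30x30/LAL.png">'
    '<img src="http://sports.cbsimg.net/images/nba/logos/30x30/BOS.png">'
    '<img src="http://sports.cbsimg.net/images/nba/logos/90x90/NYK.png">'
)
print(extract_logo_urls(sample))
```

To run it against the real page, call `download_logos('http://www.cbssports.com/nba/draft/mock-draft')`, and pass `actually_download=True` once the printed list looks right. Keeping the matching in its own function means the pattern can be checked on a saved copy of the page before anything is downloaded.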
Hope this helps!