Extract image links from the webpage using Python

走了就别回头了 2021-01-07 00:39

I want to get all of the pictures (of the NBA teams) on this page: http://www.cbssports.com/nba/draft/mock-draft

However, my code gives a bit more than that. It

3 Answers
  •  予麋鹿
    予麋鹿 (original poster)
    2021-01-07 01:28

    To save all the images on http://www.cbssports.com/nba/draft/mock-draft:

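    # Python 2 (urllib2, BeautifulSoup 3)
    import os
    import urllib2
    import urlparse
    from BeautifulSoup import BeautifulSoup

    URL = "http://www.cbssports.com/nba/draft/mock-draft"
    default_dir = os.path.join(os.path.expanduser("~"), "Pictures")
    if not os.path.isdir(default_dir):
        os.makedirs(default_dir)  # make sure the target folder exists

    opener = urllib2.build_opener()
    soup = BeautifulSoup(opener.open(URL).read())
    # Keep only <img> tags that have both an alt and a src attribute
    imgs = soup.findAll("img", {"alt": True, "src": True})
    for img in imgs:
        # Resolve relative src values against the page URL
        img_url = urlparse.urljoin(URL, img["src"])
        # Use the last path segment as the file name
        filename = os.path.join(default_dir, img_url.split("/")[-1])
        img_data = opener.open(img_url)
        with open(filename, "wb") as f:
            f.write(img_data.read())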
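
For readers on Python 3, where urllib2 no longer exists, here is a rough sketch of the same idea using only the standard library. It is not the answer's own code: the names ImgCollector, extract_image_urls and save_images are made up for illustration, and html.parser stands in for BeautifulSoup. It assumes the page can be decoded as UTF-8 and that each image URL ends in a usable file name.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit
from urllib.request import urlopen


class ImgCollector(HTMLParser):
    """Collect src values of <img> tags that carry both alt and src."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # Mirror the {"alt": True, "src": True} filter from the answer
        if attr_map.get("src") and "alt" in attr_map:
            self.sources.append(attr_map["src"])


def extract_image_urls(html, base_url):
    """Return absolute image URLs found in an HTML string."""
    parser = ImgCollector()
    parser.feed(html)
    # Relative src values are resolved against the page URL
    return [urljoin(base_url, src) for src in parser.sources]


def save_images(page_url, target_dir):
    """Download every image referenced by page_url into target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    with urlopen(page_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    for img_url in extract_image_urls(html, page_url):
        # Name the file after the last path segment, ignoring any query string
        name = os.path.basename(urlsplit(img_url).path)
        if not name:
            continue
        with urlopen(img_url) as img, open(os.path.join(target_dir, name), "wb") as f:
            f.write(img.read())
```

Calling save_images("http://www.cbssports.com/nba/draft/mock-draft", os.path.expanduser("~/Pictures")) would then play the role of the loop above. Keeping extract_image_urls separate from the download step makes it possible to check which links were picked up before anything is written to disk.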
    

    To pick out one particular image on http://www.cbssports.com/nba/draft/mock-draft, look it up by its src attribute and then download it the same way:

    soup.find("img",{"src":"image_name_from_source"})
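
The same lookup can be sketched in Python 3 without BeautifulSoup. This is only an illustration: ImgBySrc and find_image are hypothetical names, and the match is an exact comparison on the src value, which is how the soup.find call above behaves when given a plain string.

```python
from html.parser import HTMLParser


class ImgBySrc(HTMLParser):
    """Remember the first <img> whose src equals the wanted value."""

    def __init__(self, wanted_src):
        super().__init__()
        self.wanted_src = wanted_src
        self.match = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Keep only the first matching tag, like soup.find
        if tag == "img" and self.match is None and attr_map.get("src") == self.wanted_src:
            self.match = attr_map


def find_image(html, wanted_src):
    """Return the attributes of the matching <img> as a dict, or None."""
    parser = ImgBySrc(wanted_src)
    parser.feed(html)
    return parser.match
```

The returned dict holds the tag's attributes, so its "src" entry can be passed to whatever download step is in use.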
    
