BeautifulSoup: extract text from anchor tag

误落风尘 2020-12-01 00:34

I want to extract:

  • the value of the src attribute of the image tag, and
  • the text of the anchor tag inside the div with class data
5 Answers
  •  慢半拍i (original poster)
    2020-12-01 01:17

    This will help:

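    from bs4 import BeautifulSoup
    
    # Paste the HTML to parse between the triple quotes.
    data = '''
            '''
    
    # Name the parser explicitly so bs4 does not have to guess one.
    soup = BeautifulSoup(data, 'html.parser')
    
    # find_all is the current name for the older findAll alias.
    for div in soup.find_all('div', attrs={'class': 'image'}):
        a = div.find('a')
        print(a['href'])               # link target
        print(a.contents[0])           # first child of the anchor
        print(div.find('img')['src'])  # image URL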
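
The HTML from the question is not reproduced here, so the following is only a rough sketch of the same approach on a made-up snippet. The markup in sample_html, the example.com URLs, and the helper name extract are assumptions for illustration. It collects the src of each img inside a div with class image, and the text of each anchor inside a div with class data.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example only.
sample_html = """
<div class="image">
  <a href="http://www.example.com/eg1"><img src="http://image.example.com/img1.jpg" /></a>
</div>
<div class="data">
  <a href="http://www.example.com/eg1">Content1</a>
</div>
"""


def extract(html):
    """Return (image src values, anchor texts) found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")

    # src attribute of every <img> inside <div class="image">
    img_srcs = [
        img["src"]
        for div in soup.find_all("div", class_="image")
        for img in div.find_all("img", src=True)
    ]

    # visible text of every <a> inside <div class="data">
    link_texts = [
        a.get_text(strip=True)
        for div in soup.find_all("div", class_="data")
        for a in div.find_all("a")
    ]
    return img_srcs, link_texts


if __name__ == "__main__":
    srcs, texts = extract(sample_html)
    print(srcs)
    print(texts)
```

Using get_text(strip=True) rather than contents[0] avoids depending on the text being the first child of the anchor, which may not hold if the anchor starts with whitespace or a nested tag.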
    

    If you are looking into Amazon products then you should be using the official API. There is at least one Python package that will ease your scraping issues and keep your activity within the terms of use.
