BeautifulSoup returns some weird text for the <a> tag

Question

I'm new to web scraping and I'm trying to scrape data from this auction website. However, I ran into a strange problem when trying to get the text of the anchor tag.

Here's the HTML:

<div class="mt50">
  <div class="head_011">
    <a id="item_event_title" href="https://www.storyltd.com/auction/auction.aspx?eid=4158">NO RESERVE AUCTION OF MODERN AND CONTEMPORARY ART  (16-17 APRIL 2019)</a>
  </div>
</div>

Here's my code:

auction_info = LTD_work_soup.find('a', id = 'item_event_title').text
print(auction_info)

This prints "Back To Auction Catalogue" instead of the expected "NO RESERVE AUCTION OF MODERN AND CONTEMPORARY ART (16-17 APRIL 2019)".

Here's the link to the page.

Thank you.

Answer 1:

Here is how you can extract 'NO RESERVE AUCTION OF MODERN AND CONTEMPORARY ART (16-17 APRIL 2019)' from the webpage:

from bs4 import BeautifulSoup
import requests

# Lot page to scrape
page_link = 'https://www.storyltd.com/auction/item.aspx?eid=4158&lotno=2'
page_response = requests.get(page_link, timeout=5)
page_content = BeautifulSoup(page_response.content, "html.parser")

# The title is stored in the value attribute of an input tag
print(page_content.find("input", attrs={"id": "hdnAuctionTitle"})["value"])

Output:

NO RESERVE AUCTION OF MODERN AND CONTEMPORARY ART  (16-17 APRIL 2019)

If you inspect page_content, you will find that this text is stored in the value attribute of an input tag (id "hdnAuctionTitle"), not in the anchor tag.
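
To make the difference concrete, here is a minimal offline sketch. The SAMPLE_HTML string is a hypothetical stand-in for what requests receives, not a copy of the real page, and get_auction_title is an illustrative helper name. It assumes the anchor in the downloaded HTML holds placeholder text (possibly replaced later by JavaScript, which requests does not run) while the title sits in the input tag:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded HTML (an assumption, not the real
# page): the anchor holds placeholder text, the input holds the real title.
SAMPLE_HTML = """
<div class="head_011">
  <a id="item_event_title" href="#">Back To Auction Catalogue</a>
</div>
<input type="hidden" id="hdnAuctionTitle"
       value="NO RESERVE AUCTION OF MODERN AND CONTEMPORARY ART  (16-17 APRIL 2019)">
"""


def get_auction_title(soup):
    """Return the auction title, preferring the input tag over the anchor."""
    hidden = soup.find("input", id="hdnAuctionTitle")
    if hidden is not None and hidden.get("value"):
        # Collapse runs of whitespace (the title contains a double space).
        return " ".join(hidden["value"].split())
    # Fall back to the anchor text if the input tag is missing.
    anchor = soup.find("a", id="item_event_title")
    return anchor.get_text(strip=True) if anchor is not None else None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
anchor_text = soup.find("a", id="item_event_title").get_text(strip=True)
title = get_auction_title(soup)
print(anchor_text)  # what the question's code sees
print(title)        # what the input tag holds
```

Printing page_response.text (or viewing the page source rather than the browser's element inspector) is a quick way to check which of the two values the server actually sends.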

I hope it helps!

Source: https://stackoverflow.com/questions/56587211/beautifulsoup-returns-some-weird-text-for-the-a-tag

Tags

python

html

beautifulsoup

python-requests
