Use BeautifulSoup to get a value after a specific tag

谁说我不能喝 submitted on 2019-12-19 03:13:12

Question


I'm having a hard time getting BeautifulSoup to scrape some data for me. What's the best way to access the date (the actual number, 2008) in this code sample? This is my first time using BeautifulSoup. I've figured out how to scrape URLs off the page, but I can't narrow the search down to the word "Date" and then return only the numeric date that follows it (inside the dd tags). Is what I'm asking even possible?

<div class='dl_item_container clearfix detail_date'>
    <dt>Date</dt>
    <dd>
        2008
    </dd>
</div>

Answer 1:


Find the dt tag by text and find the next dd sibling:

soup.find('div', class_='detail_date').find('dt', text='Date').find_next_sibling('dd').text

The complete code:

from bs4 import BeautifulSoup

data = """
<div class='dl_item_container clearfix detail_date'>
    <dt>Date</dt>
    <dd>
    2008
    </dd>
</div>
"""

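# Name the parser explicitly so bs4 does not have to guess one (and warn about it)
soup = BeautifulSoup(data, 'html.parser')
date_field = soup.find('div', class_='detail_date').find('dt', text='Date')
# strip() removes the whitespace and newlines around the number
print(date_field.find_next_sibling('dd').text.strip())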

Prints 2008.
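
If the label might be missing from some pages, chaining the calls raises an AttributeError as soon as one find() returns None. Below is a minimal sketch of a more defensive variant. The helper name value_after_label is hypothetical and not part of BeautifulSoup, and the sketch assumes bs4 4.4.0 or later, where the text argument is also available under the name string:

```python
from bs4 import BeautifulSoup

HTML = """
<div class='dl_item_container clearfix detail_date'>
    <dt>Date</dt>
    <dd>
        2008
    </dd>
</div>
"""


def value_after_label(soup, label):
    """Return the stripped text of the <dd> following <dt>label</dt>, or None.

    Hypothetical helper for illustration; not a BeautifulSoup API.
    """
    # string= matches a <dt> whose only content is exactly this text
    dt = soup.find('dt', string=label)
    if dt is None:
        return None
    # Skip over whitespace nodes to the next <dd> at the same level
    dd = dt.find_next_sibling('dd')
    if dd is None:
        return None
    # strip=True trims the surrounding whitespace and newlines
    return dd.get_text(strip=True)


soup = BeautifulSoup(HTML, 'html.parser')
print(value_after_label(soup, 'Date'))
print(value_after_label(soup, 'Author'))
```

With the sample markup, the first call returns the string 2008 and the second returns None, because there is no dt tag labelled Author.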



Source: https://stackoverflow.com/questions/25778394/use-beautifulsoup-to-get-a-value-after-a-specific-tag
