Can't get the 'rel' attribute via BeautifulSoup web scraping in Python

Submitted by 假如想象 on 2019-12-11 05:46:28

Question


I am testing a BeautifulSoup 4 web-scraping script against a website. Most of it works, but one attribute is tricky for me to extract because of where it sits in the markup.

The relevant HTML looks like this:

<span class="callseller-description-icon">
<a id="phone-lead" class="callseller-description-link" rel="0501365082" href="#">Show Phone Number</a>

This is what I am trying, but I am not sure it is right:

try:
    phone = soup.find('a', {'id': 'phone-lead'})
    for a in phone:
        phone_result = str(a.get_text('rel').strip().encode("utf-8"))
    print "Phone information:", phone_result
except StandardError as e:
    phone_result = "Error was {0}".format(e)
    print phone_result

What could my mistake be? It is hard to get at the rel value, which holds the phone number.

The error I am getting is:

NavigableString object has no attribute get_text

Answer 1:


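find returns a single element, not a list; if you want all the a tags, use the find_all method. Iterating over the element returned by find walks its children, which is why your loop ends up calling get_text on a NavigableString. To read the rel attribute, use the .get() method or a dictionary-style lookup. You can also pass rel=True to match only those "a" tags that have a "rel" attribute.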

Demo:

  • Using find()

    >>> soup.find('a', {'id': 'phone-lead', 'rel': True}).get('rel')
    ['0501365082']
    
  • Using find_all()

    >>> for a in soup.find_all('a', {'id':'phone-lead', 'rel': True}):
    ...     print(a['rel'])
    ... 
    ['0501365082']
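
For reference, here is a self-contained sketch of the points above, written for Python 3 with the built-in "html.parser" backend. The HTML is the fragment from the question; the closing </span> and the variable names are assumptions added for illustration. It shows that find returns one Tag, that looping over that Tag yields its children (here only the link text), and that the rel value comes back as a list. Note also that the first argument of get_text is a separator string, not an attribute name, so get_text('rel') would not read the attribute in any case.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# HTML fragment from the question; the closing </span> is added here
# (an assumption) so the snippet is well formed.
HTML = """
<span class="callseller-description-icon">
<a id="phone-lead" class="callseller-description-link" rel="0501365082" href="#">Show Phone Number</a>
</span>
"""

soup = BeautifulSoup(HTML, "html.parser")

# find() returns a single Tag (or None if nothing matches), not a list.
link = soup.find("a", {"id": "phone-lead"})
assert isinstance(link, Tag)

# Iterating over a Tag walks its children. The only child here is the
# link text, which is the kind of object the question's loop received.
children = list(link)
print([type(child).__name__ for child in children])

# BeautifulSoup treats "rel" as a multi-valued attribute, so .get("rel")
# returns a list of values rather than a plain string.
rel_values = link.get("rel")
phone_result = rel_values[0] if rel_values else None
print("Phone information:", phone_result)
```

Taking the first item of the list is a design choice for this page, where rel holds a single phone number; a rel with several space-separated tokens would produce several list items.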
    

To get a list of all "rel" values, you can use a list comprehension:

>>> [rel for a in soup.find_all('a', {'id':'phone-lead', 'rel': True}) for rel in a['rel']]
['0501365082']
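
The for clauses in a nested list comprehension run in the same order as the equivalent nested for loops, so the loop over tags has to come before the loop over each tag's rel values. Below is a minimal sketch of that flattening, assuming Python 3 and the "html.parser" backend. Only the first link is taken from the question; the other two are hypothetical markup invented to show the rel=True filter and a rel with more than one token.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration: only the first link comes from
# the question, the other two are invented.
HTML = """
<a id="phone-lead" rel="0501365082" href="#">Show Phone Number</a>
<a rel="nofollow noopener" href="#">External</a>
<a href="#">No rel here</a>
"""

soup = BeautifulSoup(HTML, "html.parser")

# rel=True keeps only the <a> tags that carry a rel attribute.
links = soup.find_all("a", rel=True)

# Outer loop first (over the tags), then the inner loop (over each
# tag's list of rel values), giving one flat list.
all_rels = [rel for a in links for rel in a["rel"]]
print(all_rels)
```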


Source: https://stackoverflow.com/questions/37519993/cant-get-the-tag-rel-via-beautifulsoup-webscrapping-python
