(Beautiful Soup) Get data inside a button tag

Submitted by 早过忘川 on 2021-01-28 14:11:45

Question


I am trying to scrape an imageId from inside a button tag and want this result:

"25511e1fd64e99acd991a22d6c2d6b6c".

When I try:

drawing_url = drawing_url.find_all('button', class_='inspectBut')['onclick'] 

it doesn't work and raises this error:

TypeError: list indices must be integers or slices, not str

Input =

for article in soup.find_all('div', class_='dojoxGridRow'):
    drawing_url = article.find('td', class_='dojoxGridCell', idx='3')
    drawing_url = drawing_url.find_all('button', class_='inspectBut')
    if drawing_url:
        for e in drawing_url:
            print(e)

Output =

    <button class="inspectBut" href="#" 
        onclick="window.open('getImg?imageId=25511e1fd64e99acd991a22d6c2d6b6c&amp;
                 timestamp=1552011572288','_blank', 'toolbar=0, 
                 menubar=0, modal=yes, scrollbars=1, resizable=1, 
                 height='+$(window).height()+', width='+$(window).width())" 
         title="Open Image" type="button">
    </button>
... 
...

Answer 1:


Try this one.

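import re

# Collect the onclick value of every button that has one
btn_onclick_list = [a.get('onclick') for a in soup.find_all('button', onclick=True)]
for click in btn_onclick_list:
    # Capture the run of word characters that follows "imageId="
    match = re.search(r"imageId=(\w+)", click)
    if match:
        print(match.group(1))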
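
If the regular expression feels brittle, the same value can also be read by treating the first argument of window.open as a URL and parsing its query string. This is a minimal sketch and not part of the original answer. The helper name extract_image_id is hypothetical, and it assumes the onclick value has the window.open('...') shape shown in the question, with &amp; already decoded to & (Beautiful Soup does this when it parses the attribute).

```python
import re
from urllib.parse import parse_qs, urlsplit


def extract_image_id(onclick):
    """Return the imageId query parameter from a window.open(...) call, or None."""
    if not onclick:
        return None
    # The first single-quoted argument of window.open is the URL.
    match = re.search(r"window\.open\(\s*'([^']*)'", onclick)
    if match is None:
        return None
    # Split off the query string and look up the imageId parameter.
    query = urlsplit(match.group(1)).query
    values = parse_qs(query).get("imageId")
    return values[0] if values else None


# Shortened version of the onclick value from the question.
onclick = (
    "window.open('getImg?imageId=25511e1fd64e99acd991a22d6c2d6b6c"
    "&timestamp=1552011572288','_blank', 'toolbar=0')"
)
print(extract_image_id(onclick))  # 25511e1fd64e99acd991a22d6c2d6b6c
```

Parsing the query string means the result does not depend on imageId being the first parameter or on which characters the ID contains.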



Answer 2:


You should be searching for

button_list = soup.find_all('button', {'class': 'inspectBut'})

That gives you a list of buttons, and you can then read the onclick attribute of each one:

 [button['onclick'] for button in button_list]

You will still need to do some parsing, but I hope this can get you on the right track.

Your mistake was indexing the result of find_all directly. find_all returns a list of tags, so ['onclick'] has to be applied to each button rather than to the list. Also note that getImg?imageId is not an attribute of the tag; it is part of the onclick value, which is why the ID has to be parsed out of that string.
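
Putting those pieces together, a self-contained sketch could look like the following. The HTML string is a shortened stand-in for the page in the question, the variable names are illustrative, and html.parser is used only so that no extra parser needs to be installed.

```python
import re

from bs4 import BeautifulSoup

# Shortened stand-in for the markup shown in the question.
html = """
<div class="dojoxGridRow">
  <button class="inspectBut" href="#"
    onclick="window.open('getImg?imageId=25511e1fd64e99acd991a22d6c2d6b6c&amp;timestamp=1552011572288','_blank')"
    title="Open Image" type="button"></button>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

image_ids = []
# find_all returns a list of tags, so each button is indexed separately.
for button in soup.find_all("button", class_="inspectBut"):
    onclick = button.get("onclick", "")
    match = re.search(r"imageId=(\w+)", onclick)
    if match:
        image_ids.append(match.group(1))

print(image_ids)  # ['25511e1fd64e99acd991a22d6c2d6b6c']
```

If only one button is expected, find (which returns a single tag or None) can replace find_all, and the result can then be indexed with ['onclick'] directly.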




Answer 3:


You first need to check whether the attribute is present. tag.attrs returns a dictionary of the attributes present on the current tag.

Consider the following Code.

Code:

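from bs4 import BeautifulSoup

# Sample markup with two buttons that both carry an onclick attribute
a = """
<td>
<button class='hi' onclick="This Data"></button>
<button class='hi' onclick="This Second"></button>
</td>"""
soup = BeautifulSoup(a, 'lxml')  # the 'lxml' parser requires the lxml package
# Keep only the buttons that actually have an onclick attribute
print([btn['onclick'] for btn in soup.find_all('button', class_='hi') if 'onclick' in btn.attrs])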

Output:

['This Data', 'This Second']

Or you can simply let find_all filter on the attribute:

[btn['onclick'] for btn in soup.find_all('button', attrs={'class' : 'hi', 'onclick' : True})]
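
To illustrate why the presence check matters, here is a small sketch that compares a button with an onclick attribute to one without. The markup and variable names are made up for this example, and html.parser is used so that lxml is not required.

```python
from bs4 import BeautifulSoup

html = (
    '<div>'
    '<button class="hi" onclick="This Data"></button>'
    '<button class="hi"></button>'
    '</div>'
)
soup = BeautifulSoup(html, "html.parser")
with_handler, without_handler = soup.find_all("button")

# attrs is a dictionary, so a membership test works on the attribute name.
print("onclick" in with_handler.attrs)      # True
print("onclick" in without_handler.attrs)   # False

# get returns None for a missing attribute instead of raising.
print(without_handler.get("onclick"))       # None

# Indexing a missing attribute raises KeyError.
try:
    without_handler["onclick"]
except KeyError:
    print("no onclick on this button")

# Filtering with onclick=True keeps only tags that have the attribute.
print([btn["onclick"] for btn in soup.find_all("button", onclick=True)])
```

Using get or the attribute filter avoids a KeyError when some buttons on the page have no onclick handler.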


Source: https://stackoverflow.com/questions/55056003/beautiful-soup-get-data-inside-a-button-tag
