问题
I try to scrape out an ImageId inside a button tag, want to have the result:
"25511e1fd64e99acd991a22d6c2d6b6c".
When I try:
drawing_url = drawing_url.find_all('button', class_='inspectBut')['onclick']
it doesn't work. Giving an error-
TypeError: list indices must be integers or slices, not str
Input =
for article in soup.find_all('div', class_='dojoxGridRow'):
drawing_url = article.find('td', class_='dojoxGridCell', idx='3')
drawing_url = drawing_url.find_all('button', class_='inspectBut')
if drawing_url:
for e in drawing_url:
print(e)
Output =
<button class="inspectBut" href="#"
onclick="window.open('getImg?imageId=25511e1fd64e99acd991a22d6c2d6b6c&
timestamp=1552011572288','_blank', 'toolbar=0,
menubar=0, modal=yes, scrollbars=1, resizable=1,
height='+$(window).height()+', width='+$(window).width())"
title="Open Image" type="button">
</button>
...
...
回答1:
Try this one.
import re
#for all the buttons
btn_onlclick_list = [a.get('onclick') for a in soup.find_all('button')]
for click in btn_onlclick_list:
a = re.findall("imageId=(\w+)", click)[0]
print(a)
回答2:
You should be searching for
button_list = soup.find_all('button', {'class': 'inspectBut'})
That will give you the button array and you can later get url field by
[button['getimg?imageid'] for button in button_list]
You will still need to do some parsing, but I hope this can get you on the right track.
Your mistake here was that you need to search correct property class
and look for correct html tag, which is, ironically, getimg?imageid
.
回答3:
You first need to check whether the attribute is present or not.
tag.attrs
returns a list of attributes present in the current tag
Consider the following Code.
Code:
from bs4 import BeautifulSoup
a="""
<td>
<button class='hi' onclick="This Data">
<button class='hi' onclick="This Second">
</td>"""
soup = BeautifulSoup(a,'lxml')
print([btn['onclick'] for btn in soup.find_all('button',class_='hi') if 'onclick' in btn.attrs])
Output:
['This Data','This Second']
or you can simply do this
[btn['onclick'] for btn in soup.find_all('button', attrs={'class' : 'hi', 'onclick' : True})]
来源:https://stackoverflow.com/questions/55056003/beautiful-soup-get-data-inside-a-button-tag