Beautiful Soup: get picture size from html

Submitted by 丶灬走出姿态 on 2021-02-08 08:14:04

Question


I want to extract the pictures' widths and heights using Beautiful Soup. All pictures have the same code format:

<img src="http://somelink.com/somepic.jpg" width="200" height="100">

I can extract the links easily with

for pic in soup.find_all('img'):
    print(pic['src'])

But

for pic in soup.find_all('img'):
    print(pic['width'])

is not working for extracting sizes. What am I missing?

EDIT: One of the pictures on the page does not have the width and height in its HTML code. I did not notice this when I first posted, so any solution must take it into account.


Answer 1:


The dictionary-like attribute access should work for width and height as well, if they are specified. You might encounter images that don't have these attributes explicitly set; your current code would raise a KeyError in that case. You can use get() and provide a default value instead:

for pic in soup.find_all('img'):
    print(pic.get('width', 'n/a'))

Or, you can find only img elements that have the width and height specified:

for pic in soup.find_all('img', width=True, height=True):
    print(pic['width'], pic['height']) 
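
As a rough, self-contained sketch of both approaches on a page where one image lacks its size attributes (the second image URL and the helper name image_sizes are made up for illustration, and the built-in 'html.parser' is assumed):

```python
from bs4 import BeautifulSoup

# Hypothetical sample page: the second image has no width/height attributes.
html = """
<img src="http://somelink.com/somepic.jpg" width="200" height="100">
<img src="http://somelink.com/nosize.jpg">
"""

soup = BeautifulSoup(html, "html.parser")


def image_sizes(soup):
    """Return (src, width, height) for every <img>; missing values become None."""
    sizes = []
    for pic in soup.find_all("img"):
        # get() returns None instead of raising KeyError for absent attributes.
        sizes.append((pic.get("src"), pic.get("width"), pic.get("height")))
    return sizes


sizes = image_sizes(soup)
for src, width, height in sizes:
    print(src, width, height)

# Alternatively, keep only the images that declare both attributes.
sized_only = soup.find_all("img", width=True, height=True)
print(len(sized_only))
```

Which variant fits depends on whether images without declared sizes should still show up in the result.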



Answer 2:


Use get() to read attributes that may be missing; it returns None instead of raising a KeyError when the attribute is absent:

for pic in soup.find_all('img'):
    print(pic.get('width'))



Answer 3:


Try this:

>>> from bs4 import BeautifulSoup
>>> html = '<img src="http://somelink.com/somepic.jpg" width="200" height="100">'
>>> soup = BeautifulSoup(html, 'html.parser')
>>> for tag in soup.find_all('img'):
...     print(tag.attrs.get('height'), tag.attrs.get('width'))
...
100 200

You can use the attrs attribute (it is not a method). It is a dict whose keys are the tag's attribute names and whose values are the attribute values.
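
Note that the values in that dict are strings, so a size has to be converted before it can be used as a number. A minimal sketch, assuming plain integer values such as "200" are the common case (the helper to_int and the second image are hypothetical, added only for illustration):

```python
from bs4 import BeautifulSoup


def to_int(value):
    """Convert an attribute string such as "200" to int; return None otherwise."""
    try:
        return int(value)
    except (TypeError, ValueError):
        # TypeError: attribute missing (None); ValueError: non-integer text like "50%".
        return None


html = (
    '<img src="http://somelink.com/somepic.jpg" width="200" height="100">'
    '<img src="http://somelink.com/other.jpg" width="50%">'
)
soup = BeautifulSoup(html, "html.parser")

# One (width, height) pair per image, with None where no integer is available.
dims = [
    (to_int(tag.attrs.get("width")), to_int(tag.attrs.get("height")))
    for tag in soup.find_all("img")
]
print(dims)
```

Returning None for both missing and non-numeric values keeps the loop simple; a caller that needs to tell those cases apart would have to inspect the raw string instead.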



Source: https://stackoverflow.com/questions/36754686/beautiful-soup-get-picture-size-from-html
