Printing specific HTML values with Python

倖福魔咒の submitted on 2019-12-25 12:25:16

Question


I am having trouble printing only one specific value from the HTML I scrape.

This is the specific line of HTML my program looks for:

<input name="form_key" type="hidden" value="MmghsMIlPm5bd2Dw"/>

My code is as follows

import requests, time
from bs4 import BeautifulSoup
from colorama import Fore, Back, Style, init


init(autoreset=True)  # initialise colorama before the first coloured print
print(Fore.CYAN + "Let's begin!")

url = raw_input("Enter URL: ")

print(Fore.CYAN + "\nGetting form key")


r = requests.get(url)

soup = BeautifulSoup(r.content, "html.parser")

data = soup.find_all("input", {'name': 'form_key', 'type':'hidden'})

for data in data:
    print(Fore.YELLOW + "Found Form Key:")
    print(data)

The program scrapes the tag fine, but it prints the entire line, whereas I only want to print the value MmghsMIlPm5bd2Dw (without quotes).

How can I achieve this?

I have tried things like

print soup.find(data).text

And

last_input_tag = soup.find("input", id="value")
print(last_input_tag)

But nothing has really worked.


Answer 1:


If printing data shows you the whole input tag, you can print just the value by asking for that attribute:

print(data.get('value'))

Please refer to the documentation here: https://www.crummy.com/software/BeautifulSoup/bs4/doc/
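A minimal, self-contained sketch of the same idea follows. It assumes the page contains the input tag quoted in the question; the inline html string stands in for r.content, and extract_form_key is a hypothetical helper name, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Stand-in for r.content; assumed to match the line quoted in the question.
html = '<form><input name="form_key" type="hidden" value="MmghsMIlPm5bd2Dw"/></form>'


def extract_form_key(markup):
    """Return the value attribute of the hidden form_key input, or None."""
    soup = BeautifulSoup(markup, "html.parser")
    tag = soup.find("input", {"name": "form_key", "type": "hidden"})
    if tag is None:
        return None  # no matching <input> in the markup
    return tag.get("value")  # None if the tag has no value attribute


form_key = extract_form_key(html)
print(form_key)
```

This also suggests why the attempts in the question fail: .text returns the text between a tag's opening and closing parts, which is empty for an input tag, and id="value" searches for a tag whose id attribute equals "value" rather than for the value attribute itself.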




Answer 2:


More generally, assuming there are multiple tags in the HTML:

from bs4 import BeautifulSoup

html = '''<form><p><input name="form_key" type="hidden" value="MmghsMIlPm5bd2Dw"/>
<input name="form_key" type="hidden" value="abcdefghijklmo"/>
<input name="form_key" type="hidden"/>
</p></form>'''

soup = BeautifulSoup(html, "html.parser")

We can search for all tags with the name input.

tags = soup.find_all('input')

We can then cycle through all the tags and retrieve the value attribute from those that have one. Because a tag's attributes can be accessed much like the keys of a dictionary, we can query them with the tag's .get() method. Here the method looks for an attribute called value:

  • If it finds this attribute, the method returns the value associated with it.
  • If it cannot find the attribute, the method returns the default value you provide as a second argument (or None if you provide no default).
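The dictionary-like behaviour described above can be sketched on a single tag. The one-line markup here is an assumption chosen for illustration; it deliberately has no value attribute.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<input name="form_key" type="hidden"/>', "html.parser")
tag = soup.find("input")

print(tag.attrs)                # the tag's attributes as a dictionary
print(tag.get("name"))          # attribute present: its value is returned
print(tag.get("value"))         # attribute missing, no default given: None
print(tag.get("value", "n/a"))  # attribute missing: the default is returned

try:
    tag["value"]                # subscript access raises for a missing attribute
except KeyError:
    print("KeyError")
```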

To cycle through the tags...

for tag in tags:
    print(tag.get('value', 'value attribute not found'))

=== Output: ===
MmghsMIlPm5bd2Dw
abcdefghijklmo
value attribute not found
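If the tags without a value attribute are not wanted at all, one alternative is to let find_all do the filtering. The sketch below assumes the same three input tags as above; passing True for an attribute in attrs matches any tag that has that attribute, whatever its content.

```python
from bs4 import BeautifulSoup

html = '''<form>
<input name="form_key" type="hidden" value="MmghsMIlPm5bd2Dw"/>
<input name="form_key" type="hidden" value="abcdefghijklmo"/>
<input name="form_key" type="hidden"/>
</form>'''

soup = BeautifulSoup(html, "html.parser")

# attrs={"value": True} keeps only <input> tags that carry a value attribute,
# so the third tag is skipped.
tags_with_value = soup.find_all("input", attrs={"name": "form_key", "value": True})
values = [tag["value"] for tag in tags_with_value]
print(values)
```

Subscript access (tag["value"]) is safe here because every tag in the filtered list is known to have the attribute.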


Source: https://stackoverflow.com/questions/47047998/printing-specific-html-values-with-python
