Question
I am having some trouble printing only a specific value from the scraped HTML.
This is the specific line of HTML my program scrapes for:
<input name="form_key" type="hidden" value="MmghsMIlPm5bd2Dw"/>
My code is as follows
import requests, time
from bs4 import BeautifulSoup
from colorama import Fore, Back, Style, init
print(Fore.CYAN + "Lets begin!")
init(autoreset=True)
url = raw_input("Enter URL: ")
print(Fore.CYAN + "\nGetting form key")
r = requests.get(url)
soup = BeautifulSoup(r.content, "html.parser")
data = soup.find_all("input", {'name': 'form_key', 'type':'hidden'})
for data in data:
    print(Fore.YELLOW + "Found Form Key:")
    print(data)
The program scrapes it fine, but it prints the entire tag, whereas I only want to print "MmghsMIlPm5bd2Dw" (no quotes).
How can I achieve this?
I have tried things like
print soup.find(data).text
And
last_input_tag = soup.find("input", id="value")
print(last_input_tag)
But nothing has really worked.
Answer 1:
If printing data shows you the whole input tag, you should be able to print just the value by asking for its value attribute:
print(data.get('value'))
Please refer to the documentation here: https://www.crummy.com/software/BeautifulSoup/bs4/doc/
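Applied to the tag from the question, this might look like the sketch below. It is only an illustration: the html string is a stand-in for r.content (assuming the page really contains that input tag), the form_key variable name is made up here, and find() is used instead of find_all() on the assumption that only one form key is expected.

```python
from bs4 import BeautifulSoup

# Stand-in for r.content from the question (assumption: the page contains this tag)
html = '<form><input name="form_key" type="hidden" value="MmghsMIlPm5bd2Dw"/></form>'

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag, or None if nothing matches
tag = soup.find("input", {"name": "form_key", "type": "hidden"})

# .get() reads an attribute and returns None if the attribute is absent
form_key = tag.get("value") if tag is not None else None
print(form_key)
```

As for the attempts in the question: soup.find("input", id="value") looks for a tag whose id attribute equals "value", which is not the same as reading the value attribute, and .text only returns the text between an opening and closing tag, which an input tag does not have.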
Answer 2:
More generically... presuming that there are multiple tags in the html:
from bs4 import BeautifulSoup
html = '''<title><p><input name="form_key" type="hidden" value="MmghsMIlPm5bd2Dw"/>
<input name="form_key" type="hidden" value="abcdefghijklmo"/>
<input name="form_key" type="hidden"/>
</p></title>'''
soup = BeautifulSoup(html, "html.parser")
We can search for all tags with the name input:
tags = soup.find_all('input')
We can then cycle through all the tags to retrieve those with value attributes. Because tags can be treated much like dictionaries under the hood, we can query for attributes as though they were keys, using the .get() method. This method looks for an attribute called value:
- If it finds this attribute, the method returns the value associated with the attribute
- If it cannot find the attribute, the .get() method returns the default value you provide as a second argument
To cycle through the tags...
for tag in tags:
    print(tag.get('value', 'value attribute not found'))
=== Output: ===
MmghsMIlPm5bd2Dw
abcdefghijklmo
value attribute not found
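If the tags without a value attribute are not needed at all, one alternative is to filter them out in the search itself. The sketch below is a variation on the example above, not part of the original answer: it assumes only the form_key inputs are of interest, and the values list name is made up here. The filters are passed through attrs because name is also the first positional parameter of find_all().

```python
from bs4 import BeautifulSoup

html = '''<p><input name="form_key" type="hidden" value="MmghsMIlPm5bd2Dw"/>
<input name="form_key" type="hidden" value="abcdefghijklmo"/>
<input name="form_key" type="hidden"/>
</p>'''

soup = BeautifulSoup(html, "html.parser")

# "value": True matches any tag that has a value attribute, whatever it contains
tags = soup.find_all("input", attrs={"name": "form_key", "value": True})

# Every matched tag has the attribute, so plain key access is safe here
values = [tag["value"] for tag in tags]
print(values)
```

With this approach the third input, which has no value attribute, is simply skipped, so no default string is needed.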
Source: https://stackoverflow.com/questions/47047998/printing-specific-html-values-with-python