Question
I've written a Python script that uses the requests module together with the BeautifulSoup library to fill in a few small forms spread across several pages of a website. I need to issue multiple GET and POST requests to accomplish this, as Selenium is not an option here. I'm only interested in modifying the fields in step 2, captioned "personal information".
How to do it: after logging in with the email and password (available in the script), you have to choose the Yes option (selected by default) to go on to step 2. I send a POST request with the appropriate parameters to fill in the fields available in step 2. Other than the names and email, the remaining two options should be Yes.
Normally, I can go on to step 3 only after filling in the fields in step 2 manually. However, when I do the same with the following script, it is able to parse the caption from step 3 even though the fields in step 2 are not properly filled in.
How can I fill in the fields in step 2 using the following script?
I created the email (shamim.techbd2275@gmail.com) and password (TShift123@&) for testing purposes only. Feel free to use them.
log in url
starting url #if necessary
I've tried with:
import requests
from bs4 import BeautifulSoup
url = "https://p-sso.recsolu.com/cas/login?service=https%3A%2F%2Fdb.recsolu.com%2Flogin"
link = "https://p-sso.recsolu.com/cas/login;jsessionid={}?service=https%3A%2F%2Fdb.recsolu.com%2Flogin"
get_link = "https://db.recsolu.com/external/requisitions/9e6Z9Gu7HTHVaz3cje1PUw/apply/0"
post_link = "https://db.recsolu.com/external/requisitions/9e6Z9Gu7HTHVaz3cje1PUw/0"
new_get_link = "https://db.recsolu.com/external/requisitions/9e6Z9Gu7HTHVaz3cje1PUw/apply/1"
ano_new_post = "https://db.recsolu.com/external/requisitions/9e6Z9Gu7HTHVaz3cje1PUw/1"
ano_new_get = "https://db.recsolu.com/external/requisitions/9e6Z9Gu7HTHVaz3cje1PUw/apply/2"
with requests.Session() as s:
    r = s.get(url)
    jsid = dict(s.cookies)['JSESSIONID']
    soup = BeautifulSoup(r.text,"lxml")
    payload = {i['name']:i.get('value','') for i in soup.select('input[name]')}
    payload['username'] = 'shamim.techbd2275@gmail.com'
    payload['password'] = 'TShift123@&'
    sres = s.post(link.format(jsid),data=payload,headers={
        "user-agent":"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.100 Safari/537.36",
        "content-type":"application/x-www-form-urlencoded",
        "referer": "https://p-sso.recsolu.com/cas/login?service=https%3A%2F%2Fdb.recsolu.com%2Flogin",
        'accept-encoding': 'gzip, deflate, br'
        })

    resp = s.get(get_link)
    sauce = BeautifulSoup(resp.text,"lxml")
    new_payload = {i['name']:i.get('value','') for i in sauce.select('input[name]')}
    if new_payload['utf8']:
        first_utf = new_payload['utf8']
    if new_payload['authenticity_token']:
        first_token = new_payload['authenticity_token']
    first_payload = [
        ('utf8',first_utf),
        ('authenticity_token',first_token),
        ('application_form[answers][332]', ''),
        ('application_form[answers][332]', '6211'), #----for yes option
        ('commit', 'Continue')
    ]
    first_payload_ = "{%s}" % ', '.join("'%s': '%s'" % pair for pair in first_payload)
    s.post(post_link,json=first_payload_,headers={
        'user-agent':'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.100 Safari/537.36',
        'content-type': 'multipart/form-data; boundary=----WebKitFormBoundaryVj5OyhKfB43XQNpK',
        "referer":"https://db.recsolu.com/external/requisitions/9e6Z9Gu7HTHVaz3cje1PUw/apply/0",
        })

    res = s.get(new_get_link)
    sp = BeautifulSoup(res.text,"lxml")
    ano_payload = {i['name']:i.get('value','') for i in sp.select('input[name]')}
    if ano_payload['utf8']:
        utf_encoding = ano_payload['utf8']
    if ano_payload['authenticity_token']:
        token_num = ano_payload['authenticity_token']
    second_payload = [
        ('utf8',utf_encoding),
        ('authenticity_token',token_num),
        ('application_form[first_name]', 'shamim'), #----this name should be replaced with the default one there
        ('application_form[last_name]', 'ahmed'), #----this name should be replaced with the default one there
        ('application_form[answers][620]', ''),
        ('application_form[answers][620]', '10168'), #----for yes option
        ('application_form[answers][627]', ''),
        ('application_form[answers][627]', '10182'), #----for yes option
        ('commit', 'Continue')
    ]
    second_payload_ = "{%s}" % ', '.join("'%s': '%s'" % pair for pair in second_payload)
    s.post(ano_new_post,json=second_payload_,headers={
        "user-agent":"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.100 Safari/537.36",
        "content-type":"multipart/form-data; boundary=----WebKitFormBoundaryD1duN64tHN9BodAD",
        "referer":"https://db.recsolu.com/external/requisitions/9e6Z9Gu7HTHVaz3cje1PUw/apply/1",
        })

    response = s.get(ano_new_get)
    soup_obj = BeautifulSoup(response.text,"lxml")
    elem = soup_obj.select_one("h3.section_header b")
    print(elem)
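For reference, a minimal offline sketch of how a payload built like the one above gets serialized. It uses only the standard library and placeholder values (`TOKEN_FROM_PAGE` is hypothetical; the real token comes from the page). `json.dumps` approximates what requests does with the `json=` argument, and `urlencode` shows the urlencoded body that a list of pairs would produce. Whether the server expects urlencoded or multipart data is an assumption that has not been verified here.

```python
import json
from urllib.parse import urlencode, parse_qsl

# Placeholder values for illustration only; the real ones are scraped from the page.
fields = [
    ("utf8", "\u2713"),                       # check mark, as found in the hidden input
    ("authenticity_token", "TOKEN_FROM_PAGE"),  # hypothetical token
    ("application_form[answers][332]", ""),
    ("application_form[answers][332]", "6211"),
    ("commit", "Continue"),
]

# The script joins the pairs into one dict-looking string ...
as_string = "{%s}" % ", ".join("'%s': '%s'" % pair for pair in fields)

# ... and a string passed through JSON serialization becomes a single
# JSON string literal, not a set of named form fields.
json_body = json.dumps(as_string)

# A list of pairs keeps repeated keys when encoded as a urlencoded form body.
form_body = urlencode(fields)

print(type(json.loads(json_body)).__name__)   # the decoded JSON body is one str
print(form_body)
```

If the JSON body really is just one string, the server would have no `application_form[...]` fields to read, which might explain why nothing is saved. Passing the list of pairs directly (for example via the `data=` argument of requests) is one thing that could be tried instead.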
If the following page accidentally shows up while logging in, make sure to click the Continue button.
P.S. When the fields in any step are filled in properly, they are saved automatically, so when I check manually I can see the values already recorded there.
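Since saved values reappear on the form, the same check could in principle be done from a script by re-fetching the step 2 page and reading the input values back. Below is a rough sketch using only the standard library. The `InputCollector` class, the `saved_value` helper and the sample HTML are all hypothetical, and it only covers plain `<input>` elements with a `value` attribute (the checked state of radio buttons would need extra handling).

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collect (name, value) pairs from <input> tags, keeping duplicates."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if attr.get("name"):
            # A missing or valueless attribute is treated as an empty string.
            self.fields.append((attr["name"], attr.get("value") or ""))


def saved_value(html, field_name):
    """Return the last value seen for field_name, or None if it is absent."""
    parser = InputCollector()
    parser.feed(html)
    values = [value for name, value in parser.fields if name == field_name]
    return values[-1] if values else None


# Made-up markup standing in for the re-fetched step 2 page.
sample_html = """
<form>
  <input type="hidden" name="authenticity_token" value="abc123">
  <input type="text" name="application_form[first_name]" value="shamim">
  <input type="text" name="application_form[last_name]" value="">
</form>
"""

print(saved_value(sample_html, "application_form[first_name]"))  # prints: shamim
```

In the real script, `sample_html` would be replaced by the text of a fresh GET on the step 2 URL after the POST, and an empty or unchanged value would suggest the submission was not recorded.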
Source: https://stackoverflow.com/questions/60228705/unable-to-modify-few-fields-in-a-webpage-issuing-a-post-request