How to use request library to send keys to web page in Python?

孤街浪徒 submitted on 2020-01-04 15:15:20

Question


I have a website, https://www.icsi.in/student/Members/MemberSearch.aspx. When I visit it, I have to enter the 'CP number' as 16803 and click Search. The student's information is then displayed, which I need to scrape. Can someone please explain how to pass the 'CP number' in a request and how to press the 'Search' button using requests?

So far I've tried passing both the class name and the id name in the params argument of the requests.get() method.

import requests
r=requests.get('https://www.icsi.in/student/Members/MemberSearch.aspx',params={'dnn_ctr410_MemberSearch_txtCpNumber':16803})

In the code above I tried each of these as the parameter name (class name and id name):

dnn$ctr410$MemberSearch$txtCpNumber

dnn_ctr410_MemberSearch_txtCpNumber

I don't know how to make it work, and I can't use Selenium or a library like mechanize. Can someone please help me?


Answer 1:


I finally got it working with this:

import requests
from bs4 import BeautifulSoup
import pandas as pd

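# A Session keeps cookies between the GET and the POST
s=requests.Session()

# Fetch the page first to obtain the hidden form fields (__VIEWSTATE etc.)
resp=s.get('https://www.icsi.in/student/Members/MemberSearch.aspx')
resp

soup = BeautifulSoup(resp.content,"html5lib")

# Copy every named <input> into the payload, then set the search fields
dictinfo = {i['name']: i.get('value', '') for i in soup.select('input[name]')}
dictinfo['dnn$ctr410$MemberSearch$txtCpNumber'] = 16803
dictinfo["__EVENTTARGET"] = 'dnn$ctr410$MemberSearch$btnSearch'
# (None, value) tuples make requests send each field as multipart/form-data
dictinfo = {k:(None, str(v)) for k,v in dictinfo.items()}

resp = s.post('https://www.icsi.in/student/Members/MemberSearch.aspx', files=dictinfo)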

soup2 = BeautifulSoup(resp.text,"html5lib")

name=soup2.select_one(".name_head").text
#print(name)

info=[str(i.text).strip() for i in soup2.select(".chart_head")]
#print(info)

detail=[str(i.text).strip() for i in soup2.select(".chart_detail")]
#print(detail)

print("Name : ",name)
data=pd.DataFrame({'Info':info,'Details':detail},columns=['Info', 'Details'])
data

Thank you so much, everyone.
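
The key step in the code above is copying every named input (including hidden ones such as __VIEWSTATE and __EVENTVALIDATION) from the fetched page back into the POST. Below is a minimal sketch of just that step using only the standard library. The sample HTML and the helper name collect_form_fields are made up for illustration; the real page carries many more fields.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collect name/value pairs from every <input> tag that has a name."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        if name:
            # A missing (or valueless) value attribute becomes an empty string.
            self.fields[name] = attr_map.get("value") or ""


def collect_form_fields(html):
    """Hypothetical helper: return the form fields found in an HTML string."""
    parser = InputCollector()
    parser.feed(html)
    return parser.fields


# Made-up fragment imitating an ASP.NET WebForms page.
sample_html = """
<form method="post">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="dnn$ctr410$MemberSearch$txtCpNumber" />
</form>
"""

payload = collect_form_fields(sample_html)
# Fill in the search field and name the button that "was clicked".
payload["dnn$ctr410$MemberSearch$txtCpNumber"] = "16803"
payload["__EVENTTARGET"] = "dnn$ctr410$MemberSearch$btnSearch"
print(payload)
```

The resulting dictionary plays the same role as dictinfo above: it is what gets posted back to the page in place of a browser click.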




Answer 2:


The site's submit button triggers a JavaScript or AJAX request. You should try the Selenium automation library, which lets you scrape data from pages that are rendered dynamically (via JS or AJAX).

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Selenium 4 takes the driver path through a Service object
browser = webdriver.Chrome(service=Service('/usr/bin/chromedriver'))
browser.get('https://www.icsi.in/student/Members/MemberSearch.aspx')

# The find_element_by_* helpers were removed in Selenium 4; use find_element(By..., ...)
reg = browser.find_element(By.NAME, 'dnn$ctr410$MemberSearch$txtCpNumber')
reg.send_keys('16803')

sub = browser.find_element(By.CLASS_NAME, 'dnnPrimaryAction')
sub.click()

WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.CLASS_NAME, "rgMasterTable")))

soup = BeautifulSoup(browser.page_source, 'lxml')
table = soup.find("table",{'class':"rgMasterTable"}).find("tr",{'class':"rgRow"})

data = {}
for div in table.find_all("div",{'class':"chart_att"}):
    for div2 in div.find_all("div"):
        # get() returns None when the tag has no class attribute
        _class = div2.get("class") or [""]

        if "chart_row" in _class[0]:
            key = None
            value = None
            for td in div2.find_all("td"):
                _class1 = td.get("class") or [""]

                if "chart_head" in _class1[0]:
                    key = td.text.strip()
                else:
                    value = td.text.strip()
            if key is not None and value is not None:
                data[key] = value

print(data)

Output:

{'Organization': 'RAHUL SHINDE AND COMPANY', 'Designation': 'COMPANY SECRETARIES (*)', 'Membership Number': 'A32412', 'CP Number': '16803', 'Benevolent Member': 'No', 'Address': '25/26, 3RD FLOOR, PAREERA BUILDING NAVJEEVAN WADI, KALBADEVI POST DHOBI TALAV, MARINE LINES', 'City': 'MUMBAI', 'Phone': '', 'Email': 'jurisrahul@gmail.com', 'Mobile': '8369683685'}

Here '/usr/bin/chromedriver' is the path to the Selenium ChromeDriver executable.

Download the Selenium web driver for the Chrome browser:

http://chromedriver.chromium.org/downloads

Install the web driver for the Chrome browser:

https://christopher.su/2015/selenium-chromedriver-ubuntu/

Selenium tutorial:

https://selenium-python.readthedocs.io/




Answer 3:


I tried to solve your problem and was able to fetch the member details by CP number using a curl command. The command that worked is:

curl 'https://www.icsi.in/student/Members/MemberSearch.aspx' -H 'Connection: keep-alive' -H 'Pragma: no-cache' -H 'Cache-Control: no-cache' -H 'Upgrade-Insecure-Requests: 1' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36' -H 'Origin: https://www.icsi.in' -H 'Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryjACWRjNNdWvIyQAt' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3' -H 'Referer: https://www.icsi.in/student/Members/MemberSearch.aspx' -H 'Accept-Encoding: gzip, deflate, br' -H 'Accept-Language: en-US,en;q=0.9' -H 'Cookie: .ASPXANONYMOUS=Yi-3rxhT1QEkAAAAN2Y1ZGE1ZDQtN2FjOC00NmJlLWFmNzEtMTRmYmNjZThiMzAz0; language=en-US; __utma=268070294.1401783362.1559839356.1559839356.1559839356.1; __utmc=268070294; __utmz=268070294.1559839356.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmt=1; __utmb=268070294.9.10.1559839356' --data-binary $'------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="StylesheetManager_TSSM"\r\n\r\n;Telerik.Web.UI, Version=2011.3.1115.35, Culture=neutral, PublicKeyToken=121fae78165ba3d4:en-US:f0ea1c34-9d2c-42a1-84c3-49717427a593:9e1572d6:e25b4b77\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="ScriptManager_TSM"\r\n\r\n;;System.Web.Extensions, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35:en:eb198dbd-2212-44f6-bb15-882bde414f00:ea597d4b:b25378d2;Telerik.Web.UI, Version=2011.3.1115.35, Culture=neutral, PublicKeyToken=121fae78165ba3d4:en:f0ea1c34-9d2c-42a1-84c3-49717427a593:16e4e7cd:58366029\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="__EVENTTARGET"\r\n\r\ndnn$ctr410$MemberSearch$btnSearch\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; 
name="__EVENTARGUMENT"\r\n\r\n\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="__VIEWSTATE"\r\n\r\\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="__VIEWSTATEGENERATOR"\r\n\r\n6A295697\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="__VIEWSTATEENCRYPTED"\r\n\r\n\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="__EVENTVALIDATION"\r\n\r\n1eO5OePdCTECUOCaykvLo65wq/pTry9rKOEYyreRsnnxiTpBbtYAIPUG+bVt2l4zVxPCliqCPAmRPBbYNUAFvgSy+x54jvq73espjGpQssT1TlOP1J3StMO0MDMiQIF/6Cw/jckVtXWV0b6fP7W2qnEzvALQXz6YwtS2urQiOZ+4nHaMevnrjENuHKlgR4D1zA6U+XAhdvds4fwO2pcNHL8nKr/Sog6efTRV40jwCPaJKR0CT5StHsnekIc/9DZY8RsxcF61tgN/HnjkUX6Wu8GlkrgVy6rAoqfteSUduE6MizWzu6DTcZhRYjXasjnDjjnWMBAba8Id8YiqJMIrPEuiU0w6tk1Pf034om2/uXIr1wFD5QUV8yC09x8Z+g+NHU1u7yH2AF/nuetY2PvNO6WSfsD7r1YGL47ZK9ADu/BA7pT+GMIq7Y7oc0kcbszAh2Tuw3YOouV6+LE+zoypa//x8vubNKsBdhZWcA==\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="dnn$ctlHeader$dnnSearch$Search"\r\n\r\nSiteRadioButton\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="dnn$ctlHeader$dnnSearch$txtSearch"\r\n\r\n\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="dnn$ctr410$MemberSearch$txtFirstName"\r\n\r\n\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="dnn$ctr410$MemberSearch$txtLastName"\r\n\r\n\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="dnn$ctr410$MemberSearch$ddlMemberType"\r\n\r\n0\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="dnn$ctr410$MemberSearch$txtMembershipNumber"\r\n\r\n\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="dnn$ctr410$MemberSearch$txtCpNumber"\r\n\r\n16803\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; 
name="dnn$ctr410$MemberSearch$txtCity"\r\n\r\n\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="dnn$ctr410$MemberSearch$txtOrganisation"\r\n\r\n\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="dnn$ctr410$MemberSearch$txtAddress2"\r\n\r\n\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="dnn$ctr410$MemberSearch$txtAddress3"\r\n\r\n\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="dnn$ctr410$MemberSearch$txtEmail"\r\n\r\n\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="dnn_ctr410_MemberSearch_grdMembers_ClientState"\r\n\r\n\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="ScrollTop"\r\n\r\n329\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt\r\nContent-Disposition: form-data; name="__dnnVariable"\r\n\r\n{"__scdoff":"1","__dnn_pageload":"__dnn_setScrollTop();"}\r\n------WebKitFormBoundaryjACWRjNNdWvIyQAt--\r\n' --compressed

If you still cannot get the desired result with the requests module, you can automate the curl request to do the same job that requests would do. With this method you only have to change the value of the "CpNumber" field, which you can find by searching for that name in the curl command. Hope this works for you.
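
The body passed to --data-binary is a multipart/form-data payload: each field sits between boundary lines, and only the value that follows name="dnn$ctr410$MemberSearch$txtCpNumber" has to change from one search to the next. Here is a rough sketch of building such a body in Python. The helper name, the boundary string, and the two-field example are assumptions made for illustration; a real request would also need the current hidden fields from the page.

```python
def build_multipart_body(fields, boundary):
    """Hypothetical helper: encode a dict of text fields as multipart/form-data."""
    lines = []
    for name, value in fields.items():
        lines.append("--" + boundary)
        lines.append('Content-Disposition: form-data; name="{}"'.format(name))
        lines.append("")  # blank line separates the part header from the value
        lines.append(str(value))
    lines.append("--" + boundary + "--")  # closing delimiter
    lines.append("")
    return "\r\n".join(lines)


boundary = "----ExampleBoundary"  # arbitrary string chosen for this sketch
fields = {
    "__EVENTTARGET": "dnn$ctr410$MemberSearch$btnSearch",
    "dnn$ctr410$MemberSearch$txtCpNumber": 16803,  # the only value to vary
}

body = build_multipart_body(fields, boundary)
# The same boundary must be announced in the Content-Type header.
content_type = "multipart/form-data; boundary=" + boundary
print(content_type)
print(body)
```

Note that the delimiter lines in the body carry two more leading dashes than the boundary named in the header, which matches the curl command above.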



Source: https://stackoverflow.com/questions/56481600/how-to-use-request-library-to-send-keys-to-web-page-in-python
