mechanicalsoup

Scraping a website with python 3 that requires login

我是研究僧i submitted on 2021-02-08 09:13:38
Question: Just a question regarding scraping a site that requires authentication. Using BeautifulSoup:

    # import the requests library and BeautifulSoup
    import requests
    from bs4 import BeautifulSoup

    # fetch the login page
    page = requests.get("http://localhost:8080/login?from=%2F")

    # parse the returned HTML
    soup = BeautifulSoup(page.content, 'html.parser')
    print(soup.prettify())

From the output, I think this is the important part:

    <table> <tr> <td> User: </td> <td> <input autocapitalize="off" autocorrect="off" id="j_username" name="j_username"

MechanicalSoup action difficulty with forms

天大地大妈咪最大 submitted on 2020-01-24 21:56:46
Question: First, I am French, so I apologize for any mistakes in my English. I am having a hard time with MechanicalSoup. Here is my HTML page:

    <form class="XFYOY" method="post"><h2 class="vvzhL ">Inscrivez-vous pour voir les photos et vidéos de vos amis.</h2>

That is just the first line. I want to fill in the form automatically, but it has no action attribute and I don't know what to pass to browser.select_form():

    browser.select_form('form[action=/post]')
    browser["emailOrPhone"] = "0689754327"
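The quoted form has no action attribute, so a selector such as form[action=/post] has nothing to match. A form without an action is normally submitted to the URL of the page that contains it. Below is a minimal sketch of that rule using only the standard library; PAGE_URL, FormCollector and submit_target are names made up for illustration and do not come from the post. In MechanicalSoup itself, select_form() takes a CSS selector, so something like browser.select_form('form[method="post"]') may be enough, assuming this is the only POST form on the page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical URL of the page that contains the form (not from the post).
PAGE_URL = "https://example.com/signup/"

# First line of the form as quoted in the question, closed for parsing.
SAMPLE_HTML = (
    '<form class="XFYOY" method="post"><h2 class="vvzhL ">'
    "Inscrivez-vous pour voir les photos et vidéos de vos amis.</h2></form>"
)


class FormCollector(HTMLParser):
    """Collect the attributes of every <form> tag in a document."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.forms.append(dict(attrs))


def submit_target(page_url, form_attrs):
    """Resolve where a form is sent; an absent action means the page itself."""
    action = form_attrs.get("action") or ""
    return urljoin(page_url, action)


collector = FormCollector()
collector.feed(SAMPLE_HTML)
form_attrs = collector.forms[0]
print(form_attrs.get("method"), submit_target(PAGE_URL, form_attrs))
```

The class and method attributes are the only handles this form offers, which is why a selector built on either of them is a more plausible candidate than one built on action.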

logging in to website using requests

青春壹個敷衍的年華 submitted on 2019-12-31 04:06:11
Question: I've tried two completely different methods, but I still can't get the data that is only present after logging in. The first attempt uses requests, but the XPath returns nothing:

    import requests
    from lxml import html

    USERNAME = "xxx"
    PASSWORD = "xxx"
    LOGIN_URL = "http://www.reginaandrew.com/customer/account/loginPost/referer/aHR0cDovL3d3dy5yZWdpbmFhbmRyZXcuY29tLz9fX19TSUQ9VQ,,/"
    URL = "http://www.reginaandrew.com/gold-leaf-glass-top-table"

    def main():
        FormKeyTxt = ""
        session_requests =
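The truncated snippet declares FormKeyTxt, which suggests the login form carries a hidden token that has to be sent back together with the credentials. The sketch below collects every hidden input from a login page and merges in the username and password. The sample markup and the field names form_key, login[username] and login[password] are assumptions for illustration; the real names have to be read from the site's own login form.

```python
from html.parser import HTMLParser

# Hypothetical login page markup; the real form may differ.
SAMPLE_LOGIN_HTML = """
<form action="/customer/account/loginPost/" method="post">
  <input type="hidden" name="form_key" value="abc123">
  <input type="text" name="login[username]">
  <input type="password" name="login[password]">
</form>
"""


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of all <input type="hidden"> tags."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attrs.get("name"):
            self.hidden[attrs["name"]] = attrs.get("value") or ""


def build_login_payload(html_text, username, password):
    """Return the form data to post: hidden fields plus credentials."""
    collector = HiddenInputCollector()
    collector.feed(html_text)
    payload = dict(collector.hidden)
    payload["login[username]"] = username  # assumed field name
    payload["login[password]"] = password  # assumed field name
    return payload


payload = build_login_payload(SAMPLE_LOGIN_HTML, "xxx", "xxx")
print(payload)
```

With requests, such a payload would be posted through the same Session object that fetched the login page, for example session.post(LOGIN_URL, data=payload), so that the cookies issued alongside the token are sent back with it.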

mechanicalsoup is not redirecting to where it should redirect to

浪尽此生 submitted on 2019-12-11 17:52:20
Question: I am trying to make a web-scraping bot that logs into https://adelbert.magister.net/ so that I can scrape data from the site after logging in. My code:

    import mechanicalsoup

    browser = mechanicalsoup.StatefulBrowser(
        soup_config={'features': 'lxml'},
        raise_on_404=True,
        user_agent='bot',
    )
    browser.open("https://adelbert.magister.net/")
    print(browser.get_url())

If you visit the page in a normal browser, it redirects to a URL that looks like this: https://accounts.magister.net/account/login

Difficulty with MechanicalSoup forms

我是研究僧i submitted on 2019-12-11 15:58:02
Question: First, I am French, so I apologize for any mistakes in my English. I am having a hard time with MechanicalSoup. Here is my HTML page:

    <form class="XFYOY" method="post"><h2 class="vvzhL ">Inscrivez-vous pour voir les photos et vidéos de vos amis.</h2>

That is just the first line. I want to fill in the form automatically, but it has no action attribute and I don't know what to pass to browser.select_form():

    browser.select_form('form[action=/post]')
    browser["emailOrPhone"] = "0689754327"

python mechanicalsoup redirection issue

跟風遠走 submitted on 2019-12-11 04:26:43
Question: I have a problem with my code and the redirection feature of my router. I wrote code that finds the login form and logs into the router, but after logging in through login.cgi the router redirects to a URL like http://192.168.1.2/index.asp;session_id=2dfa2490ad2e26a3d073edfdae7d0f45. As far as I understand, the session ID is embedded in the URL, and I need help making my code recognize and retrieve that link. I tried many times using
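In the URL quoted in the question, the session ID sits after a semicolon in the path, which the standard library treats as a path parameter. A minimal sketch of reading it back out is below; extract_session_id is a hypothetical helper name. It assumes the redirected URL is already available as a string, for example from browser.get_url() in MechanicalSoup, which only reflects the redirect if the router issues a real HTTP redirect rather than one driven by JavaScript.

```python
from urllib.parse import urlparse


def extract_session_id(url):
    """Return the session_id path parameter of a URL, or None if absent."""
    # urlparse() exposes the text after ';' in the last path segment as .params
    params = urlparse(url).params
    for item in params.split(";"):
        key, sep, value = item.partition("=")
        if key == "session_id" and sep:
            return value
    return None


# URL shape taken from the question.
redirected = "http://192.168.1.2/index.asp;session_id=2dfa2490ad2e26a3d073edfdae7d0f45"
print(extract_session_id(redirected))
```

Once the ID is known, later requests can append it to each router URL in the same ;session_id=... form.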

Using mechanicalsoup to set value of form element w/o a name

Deadly submitted on 2019-11-28 14:22:25
I have searched through all the MechanicalSoup and BeautifulSoup documentation but can't figure out how to set the value of a form element using its 'id' (because it doesn't have a name).

    import mechanicalsoup

    browser = mechanicalsoup.StatefulBrowser()
    browser.open(my_url)
    form = browser.select_form('form[id="login-form"]')
    browser.get_current_form().print_summary()
    userid = browser.get_current_page().find('input', id='text-userid')
    form.set("text-userid", "user")

This gets me:

    <input class="login-text-box" id="text-userid" placeholder="Email" type="text" value=""/>
    <input class="login-text-box" id=
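Form fields are submitted by name, so an input that only has an id is not something form.set() can address by that id. One possible workaround is to give the tag a name before filling it in. The sketch below shows the idea with BeautifulSoup alone on sample markup modeled on the output quoted above; the name "userid" is hypothetical, and whether the server accepts it is an assumption, since a page like this may build its request in JavaScript instead.

```python
from bs4 import BeautifulSoup

# Sample markup modeled on the output in the question.
SAMPLE_HTML = """
<form id="login-form">
  <input class="login-text-box" id="text-userid" placeholder="Email" type="text" value=""/>
</form>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Locate the input by id, then give it a name so it can be submitted.
field = soup.find("input", id="text-userid")
field["name"] = "userid"  # hypothetical name; the server may expect another
field["value"] = "user"

print(field)
```

In MechanicalSoup, the object returned by browser.get_current_page() is a BeautifulSoup document, so the same assignment could be applied to the tag found there before calling form.set("userid", "user").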

Using mechanicalsoup to set value of form element w/o a name

≡放荡痞女 submitted on 2019-11-26 22:11:00
Question: I have searched through all the MechanicalSoup and BeautifulSoup documentation but can't figure out how to set the value of a form element using its 'id' (because it doesn't have a name).

    import mechanicalsoup

    browser = mechanicalsoup.StatefulBrowser()
    browser.open(my_url)
    form = browser.select_form('form[id="login-form"]')
    browser.get_current_form().print_summary()
    userid = browser.get_current_page().find('input', id='text-userid')
    form.set("text-userid", "user")

This gets me:

    <input class="login

In MechanicalSoup (Python 3.x), how to log out of a website whose logout button is a JavaScript link

断了今生、忘了曾经 submitted on 2019-11-26 14:56:25
Question: I can successfully log in and navigate through a website, but when I inspected the logout button, its link is javascript:__doPostBack('ctl00$lnkBtnLogout','') and follow_link() doesn't work on it (it says "no adapters"). Can someone help me?

Answer 1: From https://github.com/MechanicalSoup/MechanicalSoup: A Python library for automating interaction with websites. MechanicalSoup automatically stores and sends cookies, follows redirects, and can follow links and submit forms. It doesn't do
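The logout href in the question is a JavaScript call rather than a URL, which is why it cannot be followed as a link. In ASP.NET Web Forms pages, __doPostBack typically copies its two arguments into the hidden fields __EVENTTARGET and __EVENTARGUMENT and then submits the page's form. Assuming that is what this site does, the sketch below extracts the two arguments so they can be set on the form by hand; parse_do_postback and postback_fields are hypothetical helper names.

```python
import re

# Matches __doPostBack('target','argument') with single-quoted arguments.
POSTBACK_RE = re.compile(r"__doPostBack\(\s*'([^']*)'\s*,\s*'([^']*)'\s*\)")


def parse_do_postback(href):
    """Return (event_target, event_argument), or None if href is not a postback."""
    match = POSTBACK_RE.search(href)
    if match is None:
        return None
    return match.group(1), match.group(2)


def postback_fields(href):
    """Build the two hidden-field values a Web Forms postback would submit."""
    parsed = parse_do_postback(href)
    if parsed is None:
        raise ValueError("not a __doPostBack link: %r" % href)
    target, argument = parsed
    return {"__EVENTTARGET": target, "__EVENTARGUMENT": argument}


# href taken from the question.
href = "javascript:__doPostBack('ctl00$lnkBtnLogout','')"
print(postback_fields(href))
```

These two values would then be submitted together with the page form's other hidden fields, which the sketch leaves to whatever form-submission tool is in use.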