I'm trying to log in to https://www.voxbeam.com/login using requests to scrape data. I'm a Python beginner; I have mostly done tutorials and some web scraping on my own.
As said above, you should send the values of all the fields in the form. You can find them in the browser's web inspector. This form sends 2 additional hidden values:
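import requests

url = "https://www.voxbeam.com/login"
headers = {'user-agent': "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36"}
data = {'userName':'xxxxxxxxx','password':'yyyyyyyyy','challenge':'zzzzzzzzz','hash':''}
# The inspector shows the '@' of an email as %40 (uuuuuuu%40gmail.com);
# requests URL-encodes form values itself, so pass the raw '@' here.
session = requests.Session()
r = session.post(url, headers=headers, data=data)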
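The hidden values usually change on every page load, so copying them by hand from the inspector only works once. A minimal sketch of reading them from the login page instead, using only the standard library; the helper names `hidden_fields` and `build_login_payload` are made up for this illustration, and it assumes the hidden inputs are present in the served HTML rather than filled in by JavaScript:

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            # A missing value attribute is sent as an empty string.
            self.fields[attributes["name"]] = attributes.get("value") or ""


def hidden_fields(html):
    """Return {name: value} for every hidden input found in the HTML text."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


def build_login_payload(html, username, password):
    """Merge the hidden fields with the visible credentials.

    The field names userName and password come from the form discussed above.
    """
    payload = hidden_fields(html)
    payload.update({"userName": username, "password": password})
    return payload
```

With a `requests.Session()` named `session`, you would fetch the page with `session.get(url).text`, pass that text to `build_login_payload`, and send the result with `session.post(url, headers=headers, data=payload)`. If the site computes `challenge` or `hash` in JavaScript, this will not be enough and one of the alternatives is the safer route.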
Also, many sites protect themselves against bots with hidden form fields, JavaScript, encoded values, and so on. As alternatives, you could:
1) Use the cookies from a manual login:
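import requests

url = "https://www.voxbeam.com"
headers = {'user-agent': "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36"}
# Copy these values from the browser after logging in manually.
cookies = {'PHPSESSID':'zzzzzzzzzzzzzzz', 'loggedIn':'yes'}
s = requests.Session()
# Already logged in through the cookies, so a plain GET fetches the page.
r = s.get(url, headers=headers, cookies=cookies)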
2) Use the Selenium module:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
url = "https://www.voxbeam.com/login"
driver = webdriver.Firefox()
driver.get(url)
# find_element_by_name was removed in Selenium 4; use find_element(By.NAME, ...)
u = driver.find_element(By.NAME, 'userName')
u.send_keys('xxxxxxxxx')
p = driver.find_element(By.NAME, 'password')
p.send_keys('yyyyyyyyy')
p.send_keys(Keys.RETURN)
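Once Selenium has logged in, the two variants can be combined: hand the browser's cookies over to requests and do the actual scraping without a browser. A rough sketch, assuming the site keeps the login state purely in cookies; the helper names `cookies_for_requests` and `copy_cookies` are invented for this example:

```python
def cookies_for_requests(selenium_cookies):
    """Convert the list of cookie dicts from driver.get_cookies() to {name: value}."""
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


def copy_cookies(driver, session):
    """Copy all cookies from a Selenium driver into a requests.Session."""
    session.cookies.update(cookies_for_requests(driver.get_cookies()))
```

After the login above, `copy_cookies(driver, requests.Session())` would give you a session that sends the same cookies as the browser. Domain and path restrictions are dropped in this simplified version, so only use the session against the site you logged in to.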
Try specifying the URL more precisely, as follows:
url = "https://www.voxbeam.com/login?id=loginForm"
This should set the focus on the login form so that the POST method applies.
from webbot import Browser
web = Browser()  # opens a browser window that Python controls
web.go_to('enter your login page url')
# Click the login button first. If your site has a "signin" button, use that text instead; in my case it is "login".
web.click('login')
# id='txtLoginId' varies from site to site; find it by inspecting the Id/Username/Emailid field. In my case it is txtLoginId.
web.type('enter your Id/Username/Emailid', into='Id/Username/Emailid', id='txtLoginId')
web.click('NEXT', tag='span')
# id='txtpasswrd' also varies from site to site; inspect the Password field in the same way. In my case it is txtpasswrd.
web.type('Enter Your Password', into='Password', id='txtpasswrd')
# id="fa fa-home": inspect the remaining buttons you need and proceed in the same way. In my case it is fa fa-home.
web.click('NEXT', id="fa fa-home", tag='span')
web.click('NEXT', tag='span')
How tricky this is depends on how the website handles the login process. What I did was use Charles, a proxy application, to listen to the requests my browser sent to the website's server while I logged in manually. Afterwards I copied the exact headers and cookie shown in Charles into my own Python code, and it worked! I assume the cookie and headers are used to prevent bots from logging in.
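A proxy typically shows the cookies as one raw Cookie header line, while requests expects a dict. A small sketch of converting one into the other; the function name `parse_cookie_header` and the sample values are placeholders, and quoted or otherwise unusual cookie values are not handled:

```python
def parse_cookie_header(raw):
    """Split a raw 'name=value; name2=value2' Cookie header into a dict."""
    cookies = {}
    for part in raw.split(";"):
        # partition keeps any '=' inside the value intact.
        name, separator, value = part.strip().partition("=")
        if separator and name:
            cookies[name] = value
    return cookies
```

The resulting dict can be passed as `cookies=` to `requests.get` or `requests.post`, together with a `headers=` dict holding the other headers copied from the proxy.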