How could I use python-requests to grab a LinkedIn page?

Submitted by ╄→尐↘猪︶ㄣ on 2019-12-04 20:50:27

This is much more complicated than what you've got so far.

You will need to do something like:

  • Load https://www.linkedin.com/uas/login
  • Parse the response with BeautifulSoup to get the login form, including all of its hidden form fields. (The CSRF ones are particularly important, as the server will reject a POST request without the correct values.)
  • Build your POST data dictionary from the parsed login form data + your username and password
  • POST that data to https://www.linkedin.com/uas/login-submit (you might have to fake some of the headers too, as it might only accept requests marked as AJAX)
  • Finally GET http://www.linkedin.com/nhome
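
The parsing and POST-data steps above can be tried offline before touching the real site. The sketch below is only an illustration: it uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages, and the sample form and its hidden field names (csrfToken, sourceAlias, signin) are made up for the example. Only session_key and session_password come from the answer itself; the real page may use different hidden fields.

```python
from html.parser import HTMLParser


class LoginFormParser(HTMLParser):
    """Collect hidden/submit <input> fields inside <form name="login">."""

    def __init__(self):
        super().__init__()
        self.in_login_form = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("name") == "login":
            self.in_login_form = True
        elif tag == "input" and self.in_login_form:
            # Keep only the fields the browser would send without user input
            if attrs.get("type") in ("hidden", "submit") and attrs.get("name"):
                self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_login_form = False


def build_post_data(html, username, password):
    """Merge the parsed form fields with the user's credentials."""
    parser = LoginFormParser()
    parser.feed(html)
    post = dict(parser.fields)
    post["session_key"] = username
    post["session_password"] = password
    return post


# Hypothetical login form; the hidden field names and values are invented.
SAMPLE_HTML = """
<form name="login" action="/uas/login-submit" method="post">
  <input type="text" name="session_key">
  <input type="password" name="session_password">
  <input type="hidden" name="csrfToken" value="ajax:123">
  <input type="hidden" name="sourceAlias" value="abc">
  <input type="submit" name="signin" value="Sign In">
</form>
"""

post = build_post_data(SAMPLE_HTML, "username", "password")
print(post)
```

The resulting dictionary is the kind of thing you would pass as the data argument of the login POST.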

You can see this whole process by opening the developer tools in Chrome or Firefox and going through the login process with the Network tab open.

Something like this should work:

import requests
from bs4 import BeautifulSoup

LOGIN_URL = 'https://www.linkedin.com/uas/login'
SUBMIT_URL = 'https://www.linkedin.com/uas/login-submit'
HOME_URL = 'http://www.linkedin.com/nhome'

# Get login form (the session keeps cookies across requests)
session = requests.Session()
login_response = session.get(LOGIN_URL)
login = BeautifulSoup(login_response.text, 'html.parser')

# Get hidden form inputs (these include the CSRF values)
form = login.find('form', {'name': 'login'})
inputs = form.find_all('input', {'type': ['hidden', 'submit']})

# Create POST data from the named form fields plus your credentials
post = {field.get('name'): field.get('value') for field in inputs if field.get('name')}
post['session_key'] = 'username'  # replace with your username
post['session_password'] = 'password'  # replace with your password

# Post login
post_response = session.post(SUBMIT_URL, data=post)

# Get home page
home_response = session.get(HOME_URL)
home = BeautifulSoup(home_response.text, 'html.parser')
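
The login POST above sends only the default headers. If the server turns out to accept only requests marked as AJAX, one thing to try is adding the conventional X-Requested-With header. This is a guess rather than something confirmed for LinkedIn; build_login_headers is a hypothetical helper name, and the network tab will show which headers the browser actually sends.

```python
def build_login_headers(referer):
    """Return extra headers that make a request look like a browser AJAX call."""
    return {
        # Conventional marker that JavaScript libraries add to AJAX requests
        "X-Requested-With": "XMLHttpRequest",
        # Assumption: the server may also check where the form was loaded from
        "Referer": referer,
    }


headers = build_login_headers("https://www.linkedin.com/uas/login")
print(headers)
```

These could then be passed along with the form data, for example session.post('https://www.linkedin.com/uas/login-submit', data=post, headers=headers).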