I can't log in to Instagram with Requests

Posted by 雨燕双飞 on 2019-12-23 05:37:15

Question


I've been trying to log in to Instagram using the Requests library, but I can't get it to work. The connection always gets refused.

import requests

#Creating URL, usr/pass and user agent variables

BASE_URL = 'https://www.instagram.com/'
LOGIN_URL = BASE_URL + 'accounts/login/ajax/'
USERNAME = '******'
PASSWD = '******'
USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)\
 Chrome/59.0.3071.115 Safari/537.36'

#Setting some headers and refers
session = requests.Session()
session.headers = {'user-agent': USER_AGENT}
session.headers.update({'Referer': BASE_URL})


try:
    #Requesting the base url. Grabbing and inserting the csrftoken

    req = session.get(BASE_URL)
    session.headers.update({'X-CSRFToken': req.cookies['csrftoken']})
    login_data = {'username': USERNAME, 'password': PASSWD}

    #Finally logging in
    login = session.post(LOGIN_URL, data=login_data, allow_redirects=True)
    session.headers.update({'X-CSRFToken': login.cookies['csrftoken']})

    cookies = login.cookies

    #Print the html results after I've logged in
    print(login.text)

#In case of refused connection
except requests.exceptions.ConnectionError:
    print("Connection refused")

I don't know what I'm doing wrong. I would really appreciate it if anyone could post a solution. Please do not suggest the API or Selenium (they're not an option for me at the moment).


Answer 1:


Since Requests doesn't execute JavaScript, you don't have the csrftoken in your cookies.

If you look at the response content, you can find the csrf_token inside the HTML.

Using bs4 and json, you can extract it and use it in your POST request.

from bs4 import BeautifulSoup
import json, random, re, requests

BASE_URL = 'https://www.instagram.com/accounts/login/'
LOGIN_URL = BASE_URL + 'ajax/'

headers_list = [
        "Mozilla/5.0 (Windows NT 5.1; rv:41.0) Gecko/20100101"\
        " Firefox/41.0",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2)"\
        " AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2"\
        " Safari/601.3.9",
        "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:15.0)"\
        " Gecko/20100101 Firefox/15.0.1",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"\
        " (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36"\
        " Edge/12.246"
        ]


USERNAME = '****'
PASSWD = '*****'
# Pick one of the user agents above at random
USER_AGENT = random.choice(headers_list)

session = requests.Session()
# update() keeps the session's default headers instead of replacing them
session.headers.update({'user-agent': USER_AGENT, 'Referer': BASE_URL})

# Fetch the login page; the CSRF token is embedded in an inline script
req = session.get(BASE_URL)
soup = BeautifulSoup(req.content, 'html.parser')
body = soup.find('body')

# Find the <script> tag whose text assigns window._sharedData
pattern = re.compile('window._sharedData')
script = body.find("script", string=pattern)

# Drop the "window._sharedData = " prefix and the trailing ";" to leave plain JSON
script = script.string.replace('window._sharedData = ', '')[:-1]
data = json.loads(script)

csrf = data['config'].get('csrf_token')
login_data = {'username': USERNAME, 'password': PASSWD}
session.headers.update({'X-CSRFToken': csrf})
login = session.post(LOGIN_URL, data=login_data, allow_redirects=True)
print(login.content)
# b'{"authenticated": true, "user": true, "userId": "*******", "oneTapPrompt": false, "status": "ok"}'
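
The token extraction and the check on the login response can also be tried offline, without contacting Instagram. The following is only a minimal sketch: the helper names extract_csrf_token and login_succeeded are hypothetical, the sample HTML is made up for illustration, and a regular expression stands in for BeautifulSoup so that only the standard library is needed. It assumes the page still embeds a "window._sharedData = {...};" assignment inside a script tag, which may no longer hold.

```python
import json
import re

# Assumption: the JSON blob is assigned to window._sharedData and the
# statement ends with ";" right before the closing </script> tag.
SHARED_DATA_RE = re.compile(
    r'window\._sharedData\s*=\s*(\{.*?\})\s*;\s*</script>', re.DOTALL
)


def extract_csrf_token(html):
    """Return config.csrf_token from the embedded JSON, or None if not found."""
    match = SHARED_DATA_RE.search(html)
    if match is None:
        return None
    data = json.loads(match.group(1))
    return data.get('config', {}).get('csrf_token')


def login_succeeded(body):
    """Return True when the login response JSON has "authenticated": true."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON at all, e.g. an HTML error page
        return False
    return isinstance(payload, dict) and payload.get('authenticated') is True


# Made-up sample page for an offline check (not a real Instagram response)
sample_html = (
    '<html><body><script type="text/javascript">'
    'window._sharedData = {"config": {"csrf_token": "abc123"}};'
    '</script></body></html>'
)
sample_token = extract_csrf_token(sample_html)
sample_ok = login_succeeded(b'{"authenticated": true, "status": "ok"}')
```

Returning None or False instead of raising makes it easier to tell a missing token or a rejected login apart from a network error such as requests.exceptions.ConnectionError.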

Keep in mind that most of the data on Instagram is loaded with JavaScript, so you may run into more trouble in the future.

You can refer to this post on how to recover the data: https://stackoverflow.com/a/49831347

Or you can use a different library, such as dryscrape or spynner.



Source: https://stackoverflow.com/questions/50316885/i-cant-login-to-instagram-with-requests
