Flurry Login Requests.Session() Python 3

Question

I had this question answered before here. However, something on the Flurry website has changed, and the answer no longer works.

from bs4 import BeautifulSoup
import requests 

loginurl = "https://dev.flurry.com/secure/loginAction.do"
csvurl = "https://dev.flurry.com/eventdata/.../..."       #URL to get CSV
data = {'loginEmail': 'user', 'loginPassword': 'pass'}

with requests.Session() as session:
    session.headers.update({
         "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36"})
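    # Name the parser explicitly and quote the attribute value, since it contains dots.
    soup = BeautifulSoup(session.get(loginurl).content, "html.parser")
    name = soup.select_one('input[name="struts.token.name"]')["value"]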
    data["struts.token.name"] = name
    data[name] = soup.select_one("input[name={}]".format(name))["value"]
    login = session.post(loginurl, data=data)
    getcsv = session.get(csvurl)

The code above worked great for the last month and then stopped working last week. For the life of me, I can't figure out what has changed on the website. The ID names and tokens all look correct, and the username and password haven't changed. I'm at a loss.

If I log in manually, I can download the CSV just fine using the csvurl.

login.history shows:

[<Response [302]>, <Response [302]>, <Response [302]>, <Response [302]>, <Response [303]>]

If anyone could take a look and figure out where I am going wrong, I would greatly appreciate it.

Thanks.

UPDATE

So from the new login address, I see the post needs to be in this format:

{"data":{"type":"session","id":"bd7d8dc1-4a86-4aed-a618-0b2765b03fb7","attributes":{"scopes":"","email":"myemail","password":"mypass","remember":"false"}}}

What I can't figure out though is how they generated the id. Can anyone take a look?

Answer 1:

You can offer up a dummy session id and it will log you in with a new one. Postman interceptor helped with the redirects.

import requests
import json

def login(email, password, session, session_id=None):
    """ Authenticate with flurry.com, start a fresh session 
        if no session id is provided. """ 
    auth_url = 'https://auth.flurry.com/auth/v1/session'
    login_url = 'https://login.flurry.com'
    auth_method = 'application/vnd.api+json'
    if session_id is None:
        session_id = 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa'
    response = session.request('OPTIONS', auth_url, data='')
    headers = response.headers
    headers.update({'origin': login_url, 'referer': login_url,
                    'accept': auth_method, 'content-type': auth_method})
    data = {'data': {'type': 'session', 'id': session_id, 'attributes': {
            'scopes': '', 'email': email, 'password': password, 'remember': 'false'}}}
    payload = json.dumps(data)
    response = session.request('POST', auth_url, data=payload, headers=headers)
    return response

email, password = 'your-email', 'your-password'
session = requests.Session()
response = login(email, password, session)
# session_id = response.json()['data']['id']

And then you can grab your csv data after hitting the old site:

response = session.request('GET', 'https://dev.flurry.com/home.do')
data = session.request('GET', your_csv_url).text
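Once the CSV text is in hand, it can be turned into rows with the standard library. This is only a minimal sketch: parse_csv_text is a hypothetical helper, and it assumes the download has a header row, which the source does not confirm.

```python
import csv
import io


def parse_csv_text(text):
    """Parse CSV text into a list of dicts keyed by the header row.

    Hypothetical helper: assumes the first line of the text is a header.
    """
    return list(csv.DictReader(io.StringIO(text)))
```

For example, rows = parse_csv_text(data) would give one dict per CSV line, with whatever column names the export actually contains.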

Answer 2:

They now have a new design and a new login page to which they redirect you, which is why you see the 302 and 303 status codes. The login process and the logic behind it, the URLs, and the links to CSV files are all different now, so you have to reimplement the login flow against the new site.
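To see where those redirects actually lead, it can help to list each hop of the response. The sketch below is a hypothetical helper, not part of any answer here. It relies only on the history, status_code, url and headers attributes that a requests response exposes.

```python
def describe_redirects(response):
    """Return (status_code, url, location) for every hop, final response last.

    Hypothetical helper: works on any object shaped like a requests.Response.
    """
    hops = []
    for hop in list(response.history) + [response]:
        # Redirect responses carry the next URL in the Location header.
        hops.append((hop.status_code, hop.url, hop.headers.get("Location")))
    return hops
```

Calling describe_redirects(login) on the response from the question's session.post should show which host each 302 or 303 points to, which makes the move to the new login page visible.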

Answer 3:

You can use the uuid library to generate a UUID for the session id. To use the old interface, you first need to perform a request to https://dev.flurry.com/home.do?isFirstPostLogin=true; after that you can get the CSV (the url_get variable).

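import json
import uuid

import requests


def fetch_csv(username, password, url_get):
    # Wrapped in a function (the name is ours) so that the return below is valid.
    session_id = str(uuid.uuid4())  # avoid shadowing the built-in id()
    payload = {"data":
                {"type": "session",
                 "id": session_id,
                 "attributes": {
                    "scopes": "",
                    "email": username,
                    "password": password,
                    "remember": "false"}
                 }
                }
    headers = {
        'Origin': 'https://login.flurry.com',
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36',
        'Content-Type': 'application/vnd.api+json',
        'Accept': 'application/vnd.api+json',
        'Connection': 'keep-alive',
    }
    with requests.Session() as api:
        req = api.post('https://auth.flurry.com/auth/v1/session', data=json.dumps(payload), headers=headers)
        if req.status_code != 201:
            raise Exception('Login failed')
        # Hit the old site once so the new session is usable there.
        api.get('https://dev.flurry.com/home.do?isFirstPostLogin=true')
        # In Python 3, .content is already bytes; encode the decoded text instead,
        # dropping non-ASCII characters as the original code intended.
        return api.get(url_get).text.encode('ascii', 'ignore')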

Source: https://stackoverflow.com/questions/39210484/flurry-login-requests-session-python-3

Tags

python

python-3.x

session

beautifulsoup

flurry