python urllib3 login + search

Posted by 只愿长相守 on 2020-04-19 03:04:17

Question


import urllib3
import io
from bs4 import BeautifulSoup
import re
import cookielib

http = urllib3.PoolManager()
url = 'http://www.example.com'
headers = urllib3.util.make_headers(keep_alive=True,user_agent='Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6')
r = http.urlopen('GET', url, preload_content=False)

# Params that are then passed to the POST request
params = {
    'login': '/shop//index.php',
    'user': 'username',
    'pw': 'password'
  }
suche = {
    'id' : 'searchfield',
    'name' : 'suche',
    }

# POST request incl. params (login); the response is in response.data
response = http.request('POST', url, params, headers)
suche = http.request('POST', site-to-search? , suche, headers)
html_suche = suche.data

print html_suche

I am trying to log in to a site with this code and then run a search. With this code I get a response saying that I am not logged in.

How can I log in first and then run the search? Thanks.


Answer 1:


Web servers track browser-like client state by setting cookies, which the client must return. By default, urllib3 does not pretend to be a browser, so we need to do a little extra work to relay the cookie back to the server. Here's an example of how to do this with httpbin.org:

import urllib3
http = urllib3.PoolManager()

# httpbin does a redirect right after setting a cookie, so we disable redirects
# for this request
r = http.request('GET', 'http://httpbin.org/cookies/set?foo=bar', redirect=False)

# Grab the set-cookie header and build our headers for our next request.
# Note: This is a simplified version of what a browser would do.
headers = {'cookie': r.headers.get('set-cookie')}
print(headers)
# -> {'cookie': 'foo=bar; Path=/'}

r = http.request('GET', 'http://httpbin.org/cookies', headers=headers)
# The response body is available as bytes in r.data
print(r.data.decode('utf-8'))
# -> {
#      "cookies": {
#        "foo": "bar"
#      }
#    }
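
The example above echoes the whole Set-Cookie value back, attributes such as Path=/ included. A browser would send only the name=value pair. The following is a minimal sketch of that step applied to the login-then-search case from the question, not a definitive implementation. The helper names cookie_header and login_then_search are hypothetical, the http argument is assumed to be a urllib3.PoolManager, the form is assumed to accept URL-encoded fields, and the server is assumed to send a single Set-Cookie header on login.

```python
from http.cookies import SimpleCookie


def cookie_header(set_cookie_value):
    """Reduce one Set-Cookie value to the name=value pairs a client sends back.

    Hypothetical helper: attributes such as Path or HttpOnly are dropped.
    """
    jar = SimpleCookie()
    jar.load(set_cookie_value or "")
    return "; ".join(
        "{}={}".format(name, morsel.value) for name, morsel in jar.items()
    )


def login_then_search(http, login_url, search_url, credentials, query):
    """Log in, then search while relaying the session cookie.

    `http` is assumed to be a urllib3.PoolManager; the URLs and field
    names are supplied by the caller and depend on the target site.
    """
    # Do not follow redirects, otherwise the Set-Cookie header of the
    # login response would be hidden behind the redirected response.
    login = http.request(
        "POST", login_url, fields=credentials,
        encode_multipart=False,  # assumption: the form expects URL-encoded data
        redirect=False,
    )
    headers = {"cookie": cookie_header(login.headers.get("set-cookie"))}
    # Send the cookie back so the server recognises the logged-in session.
    return http.request(
        "POST", search_url, fields=query,
        headers=headers, encode_multipart=False,
    )
```

With the question's data this would be called as login_then_search(urllib3.PoolManager(), url, search_url, params, suche), where search_url stands for the search page that the question leaves unspecified. If the server sets several cookies, each Set-Cookie value would need to be parsed separately.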

(Note: This recipe is useful and urllib3's documentation would benefit from having it. I'd appreciate a pull request which adds something to this effect.)

Another option, as mentioned by Martijn, is to use a higher-level library that behaves more like a browser. robobrowser looks like a great choice for this kind of work, and requests also has provisions for managing cookies for you while using urllib3 underneath. :)
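
As a rough sketch of the requests route: a requests.Session object stores cookies from each response and sends them on later requests, so the login cookie does not have to be relayed by hand. The function name search_after_login is hypothetical, and the URLs and form field names depend on the target site; the session argument is assumed to behave like requests.Session().

```python
def search_after_login(session, login_url, search_url, credentials, query):
    """Log in and search through one cookie-keeping session.

    `session` is assumed to behave like requests.Session(): cookies set by
    the login response are stored and sent with the search request.
    """
    login = session.post(login_url, data=credentials)
    login.raise_for_status()  # stop early if the login request failed

    result = session.post(search_url, data=query)
    result.raise_for_status()
    return result.text
```

It would be called as search_after_login(requests.Session(), url, search_url, params, suche) after import requests, with search_url again standing for the search page the question does not name.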



Source: https://stackoverflow.com/questions/29061135/python-urllib3-login-search
