How do I get Python's Mechanize to POST an ajax request?

痞子三分冷 提交于 2019-12-03 07:53:55

问题


The site I'm trying to spider is using the javascript:

request.open("POST", url, true);

To pull in extra information over ajax that I need to spider. I've tried various permutations of:

r = mechanize.urlopen("https://site.tld/dir/" + url, urllib.urlencode({'none' : 'none'}))

to get Mechanize to get the page but it always results in me getting the login HTML again, indicating that something is wrong. Firefox doesn't seem to add any HTTP data to the POST according to Firebug, and I'm adding an empty field to try and force the urlopen to use "POST" instead of "GET" hoping the site ignores the field. I thought that Mechanize's urlopen DOES include cookies. But being HTTPS it's hard to wireshark the transaction to debug.

Is there a better way?

Also there doesn't seem to be decent API documentation for Mechanize, just examples. This is annoying.


回答1:


This was what I came up with:

req = mechanize.Request("https://www.site.com/path/" + url, " ")
req.add_header("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.7) Gecko/20100713 Firefox/3.6.7")
req.add_header("Referer", "https://www.site.com/path")
cj.add_cookie_header(req)
res = mechanize.urlopen(req)

Whats interesting is the " " in the call to mechanize.Request forces it into "POST" mode. Obviously the site didn't choke on a single space :)

It needed the cookies as well. I debugged the headers using:

hh = mechanize.HTTPHandler()
hsh = mechanize.HTTPSHandler()
hh.set_http_debuglevel(1)
hsh.set_http_debuglevel(1)
opener = mechanize.build_opener(hh, hsh)
logger = logging.getLogger()
logger.addHandler(logging.StreamHandler(sys.stdout))
logger.setLevel(logging.NOTSET)
mechanize.install_opener(opener)

Against what Firebug was showing.



来源:https://stackoverflow.com/questions/3225569/how-do-i-get-pythons-mechanize-to-post-an-ajax-request

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!