mechanize

Python mechanize doesn't work when HTTPS and Proxy Authentication required

不问归期 submitted on 2019-11-30 23:07:50
I use Python 2.7.2 and Mechanize 0.2.5. To access the Internet, I have to go through a proxy server. I wrote the following code, but a URLError occurs on the last line. Does anyone have a solution?

    import mechanize

    br = mechanize.Browser()
    br.set_debug_http(True)
    br.set_handle_robots(False)
    br.set_proxies({
        "http": "192.168.20.130:8080",
        "https": "192.168.20.130:8080",
    })
    br.add_proxy_password("username", "password")
    br.open("http://www.google.co.jp/")   # OK
    br.open("https://www.google.co.jp/")  # Proxy Authentication Required

I don't recommend using Mechanize, it's
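
For comparison, here is a minimal sketch of routing HTTP and HTTPS requests through an authenticating proxy with only the Python 3 standard library instead of mechanize. It assumes the proxy accepts Basic authentication and that putting percent-encoded credentials in the proxy URL is acceptable. `proxy_url` and `make_opener` are hypothetical helper names; the host, port, and credentials are the placeholders from the question.

```python
from urllib.parse import quote
from urllib.request import ProxyHandler, build_opener


def proxy_url(host, port, username, password):
    """Build an http:// proxy URL with percent-encoded credentials."""
    # Percent-encode so characters such as '@' or ':' in the credentials
    # cannot be mistaken for URL delimiters.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return "http://{}:{}@{}:{}".format(user, secret, host, port)


def make_opener(host, port, username, password):
    """Return an opener that sends http and https traffic via the proxy."""
    url = proxy_url(host, port, username, password)
    # The same proxy URL is registered for both schemes, as in the question.
    return build_opener(ProxyHandler({"http": url, "https": url}))
```

Calling `make_opener("192.168.20.130", 8080, "username", "password").open("https://www.google.co.jp/")` would then send the request through the proxy. Whether that avoids the 407 response depends on the authentication scheme the proxy actually uses.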

Web Scraper for dynamic forms in python

我的未来我决定 submitted on 2019-11-30 19:22:44
Question: I am trying to fill in the form on this website: http://www.marutisuzuki.com/Maruti-Price.aspx. It consists of three drop-down lists: the first is the car model, the second is the state, and the third is the city. The first two are static. The third, city, is generated dynamically from the selected state: an onclick JavaScript event fetches the cities for that state. I am familiar with the mechanize module in Python. I came across several links telling me that I

Python mechanize doesn't work when HTTPS and Proxy Authentication required

谁说我不能喝 submitted on 2019-11-30 18:56:17
Question: I use Python 2.7.2 and Mechanize 0.2.5. To access the Internet, I have to go through a proxy server. I wrote the following code, but a URLError occurs on the last line. Does anyone have a solution?

    import mechanize

    br = mechanize.Browser()
    br.set_debug_http(True)
    br.set_handle_robots(False)
    br.set_proxies({
        "http": "192.168.20.130:8080",
        "https": "192.168.20.130:8080",
    })
    br.add_proxy_password("username", "password")
    br.open("http://www.google.co.jp/")  # OK
    br.open(

getaddrinfo error with Mechanize

三世轮回 submitted on 2019-11-30 15:02:47
I wrote a script that goes through all of the customers in our database, verifies that each website URL works, and tries to find a Twitter link on the homepage. We have a little over 10,000 URLs to verify. After a fraction of the URLs are verified, we start getting getaddrinfo errors for every URL. Here's a copy of the code that scrapes a single URL:

    def scrape_url(url)
      url_found = false
      twitter_name = nil
      begin
        agent = Mechanize.new do |a|
          a.follow_meta_refresh = true
        end
        agent.get(normalize_url(url)) do |page|
          url_found = true
          twitter_name = find_twitter_name(page)
        end
        @err << "[#{

Submitting a form in mechanize

寵の児 submitted on 2019-11-30 13:59:43
I'm having issues submitting the result of a form submission (I can submit a form, but I can't submit the form on the page that follows the first). I have:

    browser = mechanize.Browser()
    browser.set_handle_robots(False)
    browser.open('https://www.example.com/login')
    browser.select_form(nr=0)
    browser.form['j_username'] = 'username'
    browser.form['j_password'] = 'password'
    req = browser.submit()

This works, as print req results in:

    <body onload="document.forms[0].submit()">
    <noscript>
    <p>
    <strong>Note:</strong> Since your browser does not support JavaScript, you must press the Continue button once
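
The response shown above looks like an intermediate page that a JavaScript-capable browser would submit automatically through `document.forms[0].submit()`. mechanize does not execute JavaScript, so one plausible approach is to select that first form and submit it by hand. This is a minimal sketch, assuming `browser` is the `mechanize.Browser` from the question and currently holds such a page. `submit_intermediate_form` is a hypothetical helper name.

```python
def submit_intermediate_form(browser):
    """Submit the first form on the page the browser currently holds.

    Assumes `browser` behaves like mechanize.Browser, i.e. it provides
    select_form(nr=...) and submit().
    """
    # forms[0] in the page's onload handler corresponds to nr=0 here.
    browser.select_form(nr=0)
    # The return value is whatever response the second submission produces.
    return browser.submit()
```

After the login `browser.submit()` in the question, calling `submit_intermediate_form(browser)` would play the role of pressing the Continue button. If the site chains several such pages, the call may need repeating.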

Using Mechanize (Python) to fill form

三世轮回 submitted on 2019-11-30 13:46:56
I want to fill in the form on this page using Python mechanize and then record the response. How should I do it? When I search for forms on this page using the following code, it shows only the search form. How should I locate the name of the other form, the one with fields such as name, gender, etc.?

http://aapmaharashtra.org/join-us

Code:

    import mechanize

    br = mechanize.Browser()
    br.open("http://aapmaharashtra.org/join-us")
    for form in br.forms():
        print "Form name:", form.name
        print form

The form you need (with an id="form1") is loaded dynamically on the fly - this is why you don't see it in

Catching timeout errors with ruby mechanize

一世执手 submitted on 2019-11-30 13:00:27
I have a mechanize function that logs me out of a site, but on very rare occasions it times out. The function goes to a specific page and then clicks a logout button. On the rare occasion that mechanize times out, either when loading the logout page or when clicking the logout button, the code crashes. So I added a small rescue, and it seems to be working, as shown below the first piece of code.

    def logmeout(agent)
      page = agent.get('http://www.example.com/')
      agent.click(page.link_with(:text => /Log Out/i))
    end

Logmeout with rescue:

    def logmeout(agent)
      begin
        page = agent.get('http:/

How to install mechanize for Python 2.7?

巧了我就是萌 submitted on 2019-11-30 10:59:39
I saved mechanize in my Python 2.7 directory. But when I type import mechanize into the Python shell, I get an error message that reads:

    Traceback (most recent call last):
      File "<pyshell#0>", line 1, in <module>
        import mechanize
    ImportError: No module named mechanize

Using pip:

    pip install mechanize

or download the mechanize distribution archive, open it, and run:

    python setup.py install

Try this on Debian/Ubuntu:

    sudo apt-get install python-mechanize

You need to follow the installation instructions and not just download the files into your Python27 directory. It has to be installed in the
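
After running one of the install commands above, a quick check from Python can confirm whether the interpreter can now find the package. This is a minimal sketch; `is_importable` is a hypothetical helper name, and the check only shows that the import succeeds, not which version was installed.

```python
import importlib


def is_importable(module_name):
    """Return True if `module_name` can be imported by this interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Raised when the module is not on this interpreter's search path.
        return False
    return True
```

Running `is_importable("mechanize")` in the same interpreter that raised the ImportError shows whether the installation reached that interpreter's site-packages or went to a different Python on the machine.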

Mechanize and Google App Engine

萝らか妹 submitted on 2019-11-30 10:09:01
Has anyone managed to use mechanize in a Google App Engine application?

Michael

I have solved this problem, please see: Python Mechanize + GAEpython code

I found that someone created this project: gaemechanize. But there was no code at the time of writing.

Source: https://stackoverflow.com/questions/1389893/mechanize-and-google-app-engine

Checkbox input using python mechanize

戏子无情 submitted on 2019-11-30 09:26:02
I want to fill in a form using Python mechanize. The form looks like:

    <POST https://10.20.254.39/cloud_computing/vmuser/migrate_vm/cli multipart/form-data
      <TextControl(vm=cli)>
      <TextControl(chost=10.20.14.39)>
      <SelectControl(dhost=[*, 28, 27])>
      <CheckboxControl(live=[on])>
      <CheckboxControl(undefinesource=[on])>
      <CheckboxControl(suspend=[on])>
      <SubmitControl(<None>=Submit) (readonly)>
      <HiddenControl(_formkey=85819e5a-02bb-42c8-891f-3ddac485438b) (readonly)>
      <HiddenControl(_formname=migrate_create) (readonly)>>

How do I set the value of live or undefinesource (checkbox) to True (ticked) or False (untick)
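
One way this is commonly done in mechanize is to look up the control by name and toggle the `selected` flag of its single item. This is a minimal sketch, assuming `form` is the selected mechanize form (for example `br.form` after `br.select_form(nr=0)`) and that each checkbox has exactly one item, as the `[on]` entries in the listing suggest. `set_checkbox` is a hypothetical helper name.

```python
def set_checkbox(form, name, checked):
    """Tick or untick the single-item checkbox called `name` on `form`.

    Assumes `form` behaves like a mechanize form: find_control(name)
    returns a control whose `items` list holds objects with a boolean
    `selected` attribute.
    """
    control = form.find_control(name)
    # A lone checkbox has one item; its `selected` flag is the tick state.
    control.items[0].selected = bool(checked)
```

With the form above, `set_checkbox(br.form, "live", True)` would tick live and `set_checkbox(br.form, "undefinesource", False)` would untick undefinesource before the form is submitted.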