mechanize

Selenium Webdriver vs Mechanize

余生长醉 submitted on 2019-11-29 02:34:55
Question: I am interested in automating repetitive data entry in some forms on a website I frequent. So far, the tools I've found that support this in a headless fashion are Selenium WebDriver and Mechanize. My question is: is there a fundamental technical difference between using one versus the other? Selenium is mostly used for testing, but I've also noticed some people use it for exactly what I'm looking for, automating data entry, with testing becoming a secondary benefit…

Python Mechanize select a form with no name

╄→гoц情女王★ submitted on 2019-11-28 22:38:34
Question: I am attempting to have mechanize select a form from a page, but the form in question has no "name" attribute in the HTML. What should I do? When I try br.select_form(name=""), I get an error saying that no form is declared with that name, and the function requires a name argument. There is only one form on the page; is there some other way I can select it?

Answer: Try br.select_form(nr=0) to select the first form. In the Mechanize source:

    def select_form(self, name=None, predicate=None, nr=None):
        """
        ...
        nr, if supplied, is the sequence number of the form (where 0 is the first).
        """
…
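Index-based selection works because the browser keeps the page's forms in document order. The same idea can be sketched with only the standard library, using a hypothetical single-form page (the HTML and action URL below are made up for illustration):

```python
from html.parser import HTMLParser

class FormCollector(HTMLParser):
    """Collects the attribute dict of every <form> tag, in document order."""
    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.forms.append(dict(attrs))

# Hypothetical page: a single form with no name attribute.
html = ('<html><body><form action="/search" method="get">'
        '<input name="q"></form></body></html>')

collector = FormCollector()
collector.feed(html)

# Index 0 is the first form -- the same idea as br.select_form(nr=0).
first_form = collector.forms[0]
print(first_form["action"])  # -> /search
```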

Python: Clicking a button with urllib or urllib2

我是研究僧i submitted on 2019-11-28 21:40:48
Question: I want to click a button with Python; the form's fields are filled in automatically by the web page. The HTML for the button is:

    <INPUT type="submit" value="Place a Bid">

How would I go about doing this? Is it possible to click the button with just urllib or urllib2, or will I need something like mechanize or twill?

Answer: Use the form target and send any input as post data, like this:

    <form target="http://mysite.com/blah.php" method="GET">
        ......
        <input type="text" name="in1" value="abc">
        <INPUT type="submit" value="Place a Bid">
    </form>

Python:

    # parse…
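At the HTTP level there is no click event: "pressing" the button just means sending the request the form describes. Since this form uses method="GET", the whole submission collapses to the action URL plus a query string. A minimal sketch with the Python 3 stdlib (the urllib2-era equivalent is urllib.urlencode), using the field name and value from the form above:

```python
from urllib.parse import urlencode

# Field taken from the form above; the submit button itself contributes
# no data because it has no name attribute.
fields = [("in1", "abc")]

# method="GET": submitting the form is just fetching this URL.
url = "http://mysite.com/blah.php?" + urlencode(fields)
print(url)  # -> http://mysite.com/blah.php?in1=abc
```

Fetching that URL with urllib is then the entire "click"; mechanize's br.submit() builds the same request automatically.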

Maintaining cookies between Mechanize requests

巧了我就是萌 submitted on 2019-11-28 20:39:39
Question: I'm trying to use the Ruby version of Mechanize to extract my employer's tickets from a ticket management system we're moving away from, which does not supply an API. The problem is that Mechanize doesn't seem to be keeping the cookies between the post call and the get call shown below:

    require 'rubygems'
    require 'nokogiri'
    require 'mechanize'

    @agent = Mechanize.new
    page = @agent.post('http://<url>.com/user_session', {
      'authenticity_token' => '<token>',
      'user_session[login]' => '<login>',
      'user_session[password]' => '<password>',
      'user_session[remember_me]' => '0',
      'commit' => 'Login'
    })
    page = @agent…
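For reference, the cookie handoff the question expects can be spelled out with Python's stdlib http.cookiejar (Ruby's Mechanize keeps an equivalent jar on the agent). The URLs and cookie value here are hypothetical, and the login response is faked so the sketch runs offline:

```python
from email.message import Message
from http.cookiejar import CookieJar
from urllib.request import Request

class FakeLoginResponse:
    """Minimal stand-in for the HTTP response to the login POST;
    the cookie jar only calls .info() on it."""
    def __init__(self, set_cookie):
        self._msg = Message()
        self._msg["Set-Cookie"] = set_cookie

    def info(self):
        return self._msg

jar = CookieJar()

# 1. The login POST's response delivers a session cookie ...
login = Request("http://example.com/user_session")
jar.extract_cookies(FakeLoginResponse("session_id=abc123; Path=/"), login)

# 2. ... which the jar then replays on the follow-up GET.
follow_up = Request("http://example.com/tickets")
jar.add_cookie_header(follow_up)
print(follow_up.get_header("Cookie"))  # -> session_id=abc123
```

If the second request in the real script carries no Cookie header, the usual culprits are a redirect to a different host, a Domain/Path attribute on the cookie that doesn't match the follow-up URL, or the cookie never being set because the login POST failed.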

Force python mechanize/urllib2 to only use A requests?

二次信任 submitted on 2019-11-28 19:42:38
Question: Here is a related question, but I could not figure out how to apply its answer to mechanize/urllib2: "how to force python httplib library to use only A requests". Basically, given this simple code:

    #!/usr/bin/python
    import urllib2
    print urllib2.urlopen('http://python.org/').read(100)

Wireshark reports the following:

    0.000000 10.102.0.79 -> 8.8.8.8 DNS Standard query A python.org
    0.000023 10.102.0.79 -> 8.8.8.8 DNS Standard query AAAA python.org
    0.005369 8.8.8.8 -> 10.102.0.79 DNS Standard query response A 82.94.164.162
    5.004494 10.102.0.79 -> 8.8.8.8 DNS Standard query A python.org…
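One common workaround (a sketch, not the only approach) is to monkey-patch socket.getaddrinfo to pin the address family to AF_INET; urllib2 and mechanize both resolve host names through it, so the patch covers them as well. Whether this actually suppresses the AAAA query depends on the platform resolver, but with glibc an AF_INET hint issues only the A lookup:

```python
import socket

# Keep a reference to the real resolver, then install an IPv4-only wrapper.
_real_getaddrinfo = socket.getaddrinfo

def getaddrinfo_ipv4_only(host, port, family=0, *args, **kwargs):
    # Ignore the caller's family hint and force AF_INET (IPv4 / A records).
    return _real_getaddrinfo(host, port, socket.AF_INET, *args, **kwargs)

socket.getaddrinfo = getaddrinfo_ipv4_only

# Every lookup from here on returns IPv4 results only.
results = socket.getaddrinfo("localhost", 80)
print(all(family == socket.AF_INET for family, *_ in results))  # -> True
```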

WebBrowsing in C# - Libraries, Tools etc. - Anything like Mechanize in Perl? [closed]

有些话、适合烂在心里 submitted on 2019-11-28 19:11:15
Question: Looking for something similar to Mechanize for .NET... If you don't know what Mechanize is: http://search.cpan.org/dist/WWW-Mechanize/

I will maintain a list of suggestions here. Anything for browsing/posting/screen scraping (other than WebRequest and the WebBrowser control):

Parsing:
- HTMLAgilityPack - http://www.codeplex.com/htmlagilitypack

Web app testing:
- WatiN - Web Application Testing Framework (.NET) - http://watin.sourceforge.net/
- Selenium - http://seleniumhq.org/
- Art of Test Design Canvas…

Screen scraping: getting around “HTTP Error 403: request disallowed by robots.txt”

旧城冷巷雨未停 submitted on 2019-11-28 16:05:19
Question: Is there a way to get around the following?

    httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt

Is the only way around this to contact the site owner (barnesandnoble.com)? I'm building a site that would bring them more sales, and I'm not sure why they would deny access at a certain depth. I'm using mechanize and BeautifulSoup on Python 2.6 and hoping for a work-around.

Answer: You can try lying about your user agent (e.g., by trying to make believe you're a human being and not a robot), if you want to get in possible legal trouble with Barnes & Noble. Why not instead get in touch with their…
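For context, mechanize raises this 403 itself, before the request is ever sent, because it fetches and enforces robots.txt client-side (br.set_handle_robots(False) turns the check off, with the caveats the answer raises). The stdlib's urllib.robotparser shows what that check does, using a hypothetical rules file:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules of the kind that trigger the 403.
rules = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# A disallowed path is refused before any HTTP request happens.
print(rp.can_fetch("*", "http://example.com/private/page"))  # -> False
print(rp.can_fetch("*", "http://example.com/public/page"))   # -> True
```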

Python re - escape coincidental parentheses in regex pattern

流过昼夜 submitted on 2019-11-28 12:31:31
Question: I am having trouble with the regex in the following code:

    import mechanize
    import re

    br = mechanize.Browser()
    br.set_handle_robots(False)
    br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
    response = br.open("http://www.gfsc.gg/The-Commission/Pages/Regulated-Entities.aspx?auto_click=1")
    html = response.read()
    br.select_form(nr=0)
    #print br.form
    br.set_all_readonly(False)
    next = re.search(r"""<a href=…
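When the text being matched contains parentheses (or other regex metacharacters) that are meant literally, re.escape is the usual fix: it backslash-escapes every metacharacter in the string so it matches itself. A small self-contained sketch, using a made-up link label rather than the page from the question:

```python
import re

# A link label that happens to contain regex metacharacters.
label = "Regulated Entities (Banking)"

# re.escape turns "(" into "\(" and ")" into "\)", so the
# parentheses match literally instead of opening a group.
pattern = re.escape(label)

html = '<a href="#">Regulated Entities (Banking)</a>'
match = re.search(pattern, html)
print(match.group(0))  # -> Regulated Entities (Banking)
```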

Python mechanize login to website

て烟熏妆下的殇ゞ submitted on 2019-11-28 12:28:52
Question: I'm trying to log into a website using Python and Mechanize; however, I'm running into trouble getting the POST data to behave the way I want. Essentially, I want to replicate this using mechanize and Python:

    wget --quiet --save-cookies cookiejar --keep-session-cookies \
         --post-data "action=login&login_nick=USERNAME&login_pwd=PASSWORD" \
         -O outfile.htm http://domain.com/index.php

The form looks like this:

    <login POST http://domain.com/index.php application/x-www-form-urlencoded
      <TextControl(login_nick=USERNAME)>
      <PasswordControl(login_pwd=PASSWORD)>
      <CheckboxControl(login_auto=[1])>…
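The --post-data string wget sends is just the URL-encoded form fields, which mechanize assembles for you on br.submit(). For comparison, here is that same body built by hand with the Python 3 stdlib; an unticked checkbox like login_auto is simply omitted from the body, matching the wget command above:

```python
from urllib.parse import urlencode

# Field order matters for matching the wget string byte-for-byte,
# so use a list of pairs rather than a dict.
fields = [
    ("action", "login"),
    ("login_nick", "USERNAME"),
    ("login_pwd", "PASSWORD"),
]

body = urlencode(fields)
print(body)  # -> action=login&login_nick=USERNAME&login_pwd=PASSWORD
```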

Ruby SSL error - sslv3 alert unexpected message

匆匆过客 submitted on 2019-11-28 11:30:06
Question: I'm trying to connect to the server https://www.xpiron.com/schedule in a Ruby script. However, when I try connecting:

    require 'open-uri'
    doc = open('https://www.xpiron.com/schedule')

I get the following error message:

    OpenSSL::SSL::SSLError: SSL_connect returned=1 errno=0 state=SSLv2/v3 read server hello A: sslv3 alert unexpected message
        from /usr/local/lib/ruby/1.9.1/net/http.rb:678:in `connect'
        from /usr/local/lib/ruby/1.9.1/net/http.rb:678:in `block in connect'
        from /usr/local/lib/ruby/1.9.1/timeout.rb:44:in `timeout'
        from /usr/local/lib/ruby/1.9.1/timeout.rb:87:in `timeout'
        from /usr/local…
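Handshake errors like "sslv3 alert unexpected message" usually mean the client offered a protocol version or hello format the server rejects, so the usual remedy is to pin the client to a version the server accepts. As a hedged illustration of that idea using Python's stdlib ssl module (an analog, not a fix for the Ruby stack trace itself, where the equivalent knob lives on the SSL context):

```python
import ssl

# Start from the library's recommended client settings ...
ctx = ssl.create_default_context()

# ... then pin the lowest protocol version this client will offer
# in its hello, ruling out the legacy SSLv2/v3-style negotiation.
ctx.minimum_version = ssl.TLSVersion.TLSv1_2

print(ctx.minimum_version == ssl.TLSVersion.TLSv1_2)  # -> True
```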