mechanize

what does mechanize tag br.set_handle_gzip do?

回眸只為那壹抹淺笑 posted on 2019-12-31 03:48:04
Question: I'm trying the Python mechanize module in order to write some scripts. When I run my script I get the following error. What actually is this set_handle_gzip?

    manoj@ubuntu:~/pyth$ python rock.py
    rock.py:15: UserWarning: gzip transfer encoding is experimental!
      br.set_handle_gzip(True)
    Traceback (most recent call last):
      File "rock.py", line 60, in <module>
        br.follow_link(text='Sign out')
      File "/usr/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 569, in follow_link
        return self
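As background for the question above: in HTTP clients generally, "handling gzip" means advertising Accept-Encoding: gzip on the request and decompressing any response body marked Content-Encoding: gzip. The sketch below shows that idea with the Python 3 standard library only. It is not mechanize's implementation, and the helper name decode_body is hypothetical.

```python
import gzip


def decode_body(headers, body):
    """Return the response body, decompressing it when it is gzip-encoded.

    `headers` is a plain dict standing in for real response headers
    (an assumption made to keep the sketch self-contained).
    """
    encoding = headers.get("Content-Encoding", "").lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    return body


# A client that wants compressed responses advertises it on the request.
request_headers = {"Accept-Encoding": "gzip"}

# Simulate a server that answered with a gzip-compressed body.
compressed = gzip.compress(b"<html>hello</html>")
plain = decode_body({"Content-Encoding": "gzip"}, compressed)
```

The UserWarning in the log is a separate matter from the traceback that follows it: the warning is emitted at line 15 of rock.py, while the traceback starts at line 60.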

Mechanize and NTLM Authentication

孤者浪人 posted on 2019-12-30 13:23:12
Question: The following code generates a 401 => Net::HTTPUnauthorized error. From the log:

    response-header: x-powered-by => ASP.NET
    response-header: content-type => text/html
    response-header: www-authenticate => Negotiate, NTLM
    response-header: date => Mon, 02 Aug 2010 19:48:17 GMT
    response-header: server => Microsoft-IIS/6.0
    response-header: content-length => 1539
    status: 401

The script is as follows:

    require 'rubygems'
    require 'mechanize'
    require 'logger'
    agent = WWW::Mechanize.new { |a| a.log =

Extract data from HTML Table with mechanize

耗尽温柔 posted on 2019-12-30 11:41:30
Question: First of all, here is the sample HTML table:

    <tr>
      <td><strong>Kangchenjunga </strong></td>
      <td>8,586m<br /></td>
      <td>28,169ft</td>
      <td><div align="center">Nepal/India </div></td>
      <td>1955; G. Band, J. Brown </td>
    </tr>

ARGV[0] will hold the name of a mountain (the first column), and the return value should be the last column: the people who climbed the mountain for the first time. So I need to check whether a row's first column equals ARGV[0], and if it does, then I should return the
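The lookup described in this question (match the first column, return the last) can be sketched as follows. This is a Python 3 illustration using only the standard-library html.parser, not the Ruby mechanize solution the asker was after. The names RowCollector and first_ascent are hypothetical, and SAMPLE is the row quoted in the question.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being parsed
        self._cell = None  # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Join the fragments and collapse surrounding/internal whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def first_ascent(html, mountain):
    """Return the last column of the row whose first column equals `mountain`."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    for row in parser.rows:
        if row and row[0] == mountain:
            return row[-1]
    return None


SAMPLE = (
    '<tr> <td><strong>Kangchenjunga </strong></td> <td>8,586m<br /></td> '
    '<td>28,169ft</td> <td><div align="center">Nepal/India </div></td> '
    '<td>1955; G. Band, J. Brown </td> </tr>'
)
result = first_ascent(SAMPLE, "Kangchenjunga")
```

Whitespace is normalized before comparing because the sample cell contains a trailing space ("Kangchenjunga "), which would otherwise make an exact match against the command-line argument fail.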

Extract data from HTML Table with mechanize

时间秒杀一切 posted on 2019-12-30 11:41:29
Question: First of all, here is the sample HTML table:

    <tr>
      <td><strong>Kangchenjunga </strong></td>
      <td>8,586m<br /></td>
      <td>28,169ft</td>
      <td><div align="center">Nepal/India </div></td>
      <td>1955; G. Band, J. Brown </td>
    </tr>

ARGV[0] will hold the name of a mountain (the first column), and the return value should be the last column: the people who climbed the mountain for the first time. So I need to check whether a row's first column equals ARGV[0], and if it does, then I should return the

Error - urlopen error [Errno 8] _ssl.c:504: EOF occurred in violation of protocol

こ雲淡風輕ζ posted on 2019-12-30 07:52:12
Question: My aim is to extract the HTML from all the links on the first page of results after entering a Google search term. I work behind a proxy, so this is my approach:

1. I first used mechanize to enter the search term in the form; I've set the proxies and robots handling correctly.
2. After extracting the links, I used an opener built with urllib2.ProxyHandler, installed globally, to open the URLs individually.

However, this gives me the following error, which I am not able to figure out: urlopen error [Errno 8] _ssl.c:504: EOF occurred in
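Step 2 of the approach above, building an opener around a ProxyHandler, looks roughly like this in Python 3, where urllib2 became urllib.request. The proxy address is a placeholder and not taken from the question. This sketch only shows the setup and makes no claim to resolve the SSL error.

```python
import urllib.request

# Hypothetical proxy address; replace with the real proxy host and port.
PROXY = "http://proxy.example.com:8080"

# Route both plain and TLS traffic through the proxy.
proxy_handler = urllib.request.ProxyHandler({"http": PROXY, "https": PROXY})
opener = urllib.request.build_opener(proxy_handler)

# Calling urllib.request.install_opener(opener) would make this opener the
# default for urllib.request.urlopen(); it is left out here so the sketch
# has no global side effects.
has_proxy = any(
    isinstance(h, urllib.request.ProxyHandler) for h in opener.handlers
)
```

Using opener.open(url) directly, rather than installing the opener globally, keeps the proxy configuration local to the code that needs it.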

How to add cookie to existing cookielib CookieJar instance in Python?

空扰寡人 posted on 2019-12-30 00:35:11
Question: I have a CookieJar that's being used with mechanize, and I want to add a cookie to it. How can I go about doing this? make_cookie() and set_cookie() weren't clear enough for me.

    br = mechanize.Browser()
    cj = cookielib.LWPCookieJar()
    br.set_cookiejar(cj)

Answer 1: Managed to figure this out:

    import mechanize
    import cookielib
    br = mechanize.Browser()
    cj = cookielib.LWPCookieJar()
    br.set_cookiejar(cj)
    ck = cookielib.Cookie(version=0, name='Name', value='1', port=None, port_specified=False, domain='www
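The answer is cut off partway through the Cookie constructor, so here is a minimal, self-contained sketch of the same technique in Python 3, where cookielib was renamed http.cookiejar. The domain www.example.com and the helper make_simple_cookie are assumptions for illustration and are not the values from the original answer. mechanize is left out so the block runs with the standard library alone.

```python
from http.cookiejar import Cookie, LWPCookieJar


def make_simple_cookie(name, value, domain, path="/"):
    """Build a session cookie; Cookie() requires every field to be given."""
    return Cookie(
        version=0,
        name=name,
        value=value,
        port=None,
        port_specified=False,
        domain=domain,
        domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path=path,
        path_specified=True,
        secure=False,
        expires=None,   # no expiry: treated as a session cookie
        discard=True,
        comment=None,
        comment_url=None,
        rest={},
    )


cj = LWPCookieJar()
# set_cookie() stores the cookie as-is, without consulting the jar's policy.
cj.set_cookie(make_simple_cookie("Name", "1", "www.example.com"))
names = [cookie.name for cookie in cj]
```

A jar populated this way can then be handed to a browser object, as the question does with br.set_cookiejar(cj).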

BeautifulSoup HTML table parsing

旧巷老猫 posted on 2019-12-29 14:28:11
Question: I am trying to parse information (HTML tables) from this site: http://www.511virginia.org/RoadConditions.aspx?j=All&r=1 Currently I am using BeautifulSoup, and the code I have looks like this:

    from mechanize import Browser
    from BeautifulSoup import BeautifulSoup
    mech = Browser()
    url = "http://www.511virginia.org/RoadConditions.aspx?j=All&r=1"
    page = mech.open(url)
    html = page.read()
    soup = BeautifulSoup(html)
    table = soup.find("table")
    rows = table.findAll('tr')[3]
    cols = rows.findAll('td')

Mechanize and BeautifulSoup for PHP? [closed]

╄→гoц情女王★ posted on 2019-12-29 14:21:53
Question: I was wondering if there is anything similar to Mechanize or BeautifulSoup for PHP?

Answer 1: SimpleTest provides you with similar functionality: http://www.simpletest.org/en/browser_documentation.html

Answer 2: I don't know how powerful BeautifulSoup is, so maybe this won't be as great; but you could try using

Ruby SSL error - sslv3 alert unexpected message

一笑奈何 posted on 2019-12-28 18:13:40
Question: I'm trying to connect to the server https://www.xpiron.com/schedule in a Ruby script. However, when I try connecting:

    require 'open-uri'
    doc = open('https://www.xpiron.com/schedule')

I get the following error message:

    OpenSSL::SSL::SSLError: SSL_connect returned=1 errno=0 state=SSLv2/v3 read server hello A: sslv3 alert unexpected message
        from /usr/local/lib/ruby/1.9.1/net/http.rb:678:in `connect'
        from /usr/local/lib/ruby/1.9.1/net/http.rb:678:in `block in connect'
        from /usr/local/lib/ruby/1.9

Is there a PHP equivalent of Perl's WWW::Mechanize?

巧了我就是萌 posted on 2019-12-27 10:46:31
Question: I'm looking for a library that has functionality similar to Perl's WWW::Mechanize, but for PHP. Basically, it should allow me to submit HTTP GET and POST requests with a simple syntax, then parse the resulting page and return, in a simple format, all forms and their fields, along with all links on the page. I know about cURL, but it's a little too barebones, and the syntax is pretty ugly (tons of curl_foo($curl_handle, ...) statements).

Clarification: I want something more high-level than the