mechanize | 易学教程

Can't download file using user-agent

Question: I am trying to download a PDF using a user agent. This is the link: http://docstore.ohchr.org/SelfServices/FilesHandler.ashx?enc=6QkG1d%2fPPRiCAqhKb7yhsm7cEzPuG%2bIuSsjbaxhB%2b5vM0qgl%2bI%2bWZbIXqai7dQjlQHIySQ1HA8jayiBtura5uBz8YKKmXzyI%2fQxLt%2b1ik4JdCe7BQMFZngbVTePkj7ib If I follow this link in a browser, it downloads a PDF. But when I request it via a user agent, I get a response that includes the following message: There is an end-user problem. If you have reached this site from a web link, -

methods width and height Mechanize

Question: I'm using Mechanize to scrape image URLs, and I'm reading http://mechanize.rubyforge.org/Mechanize/Page/Image.html to find out how to get each image's width and height. In the console I write:

    url = "http://www.bbc.co.uk/"
    page = Mechanize.new.get(url)
    images_url = page.images.map{|img| img.width}.compact

I get the result: ["1", "84", "432", "432", "432", "432", "432", "432", "432", "304", "144", "144", "144", "144", "144", "144", "432", "432", "432", "432", "432", "432", "432", "336", "62", "62", "62", "62",

mechanize._mechanize.LinkNotFoundError

Question: I am trying to imitate a link click using this script:

    #!/usr/bin/env python
    import mechanize
    targetPage = 'http://example.com/'
    clickUrl="http://someurlinsideexample.com/"
    br = mechanize.Browser(factory=mechanize.RobustFactory())
    br.open(targetUrl)
    br.follow_link(url=clickUrl)

but I get this error:

    File "/usr/local/lib/python2.7/dist-packages/mechanize-0.2.5-py2.7.egg/mechanize/_mechanize.py", line 620, in find_link
      raise LinkNotFoundError()
    mechanize._mechanize.LinkNotFoundError

What's wrong with

Mechanize on Ruby 1.9.3 encoding issue

Question: Using the following code (from the Mechanize site, but in a rake task):

    namespace :ans do
      task :grab => :environment do
        a = Mechanize.new { |agent| agent.user_agent_alias = 'Mac Safari' }
        begin
          a.get('http://google.com/') do |page|
            search_result = page.form_with(:name => 'f') do |search|
              search.q = 'Hello world'
            end.submit
            search_result.links.each do |link|
              puts link.text
            end
          end
        end
      end
    end

I get an encoding error:

    rake aborted!
    "\x8B" from ASCII-8BIT to UTF-8

This is whilst using the

Parse html into Rails without new record every time?

Question: I have the following code, which parses an HTML table as simply as possible.

    # Timestamp (Column 1 of the table)
    page = agent.page.search("tbody td:nth-child(1)").each do |item|
      Call.create!(:time => item.text.strip)
    end
    # Source (Column 2 of the table)
    page = agent.page.search("tbody td:nth-child(2)").each do |item|
      Call.create!(:source => item.text.strip)
    end
    # Destination (Column 3 of the table)
    page = agent.page.search("tbody td:nth-child(3)").each do |item|
      Call.create!(:destination =>

keep mechanize page over request boundaries

Question: I'm writing a Ruby application that can post comments to a remote blog on behalf of the user. My problem is that I have to use the same page in the post method of the controller, to keep the session alive and to fill out a captcha:

app/controller/comment_controller.rb

    require 'mechanize'
    class CommentController < ApplicationController
      def new
        agent = Mechanize.new
        @page = agent.get('http://blog.example.com')
        @captcha_src = @page.search("//div[@id='recaptcha_image']").search("//img")[1].attribute

How to get response URL in Python requests module?

Question: I'm trying to log in to a website (login.php) using the Python requests module. If the attempt is successful, the page is redirected to index.php. If not, it stays on login.php. I was able to do the same with the mechanize module:

    import mechanize
    b = mechanize.Browser()
    url = 'http://localhost/test/login.php'
    response = b.open(url)
    b.select_form(nr=0)
    b.form['username'] = 'admin'
    b.form['password'] = 'wrongpwd'
    b.method = 'post'
    response = b.submit()
    print(response.geturl())
    if response

Can't print a specific line from text file

Question: I currently have code that reads an accounts.txt file that looks like this:

    username1:password1
    username2:password2
    username3:password3

The code (thanks to a member here) reads accounts.txt and splits each line into the username and password so I can print them later. When I try to print line 1 with the username and password separated, using this code:

    with open('accounts.txt') as f:
        credentials = [x.strip().split(':') for x in f.readlines()]
    for username,password in credentials:
        print
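
One possible way to pick out a single line, shown as a minimal sketch rather than the accepted answer: parse every line into a (username, password) pair, then index the resulting list instead of looping over it. The helper name parse_credentials is hypothetical, and io.StringIO stands in for open('accounts.txt') so the example runs without a file on disk.

```python
import io


def parse_credentials(lines):
    """Split each non-empty 'username:password' line into a (username, password) tuple."""
    credentials = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first colon only, so a password may itself contain ':'.
        username, password = line.split(":", 1)
        credentials.append((username, password))
    return credentials


# Stand-in for open('accounts.txt'); the contents mirror the sample in the question.
sample = io.StringIO("username1:password1\nusername2:password2\nusername3:password3\n")
credentials = parse_credentials(sample)

# Index the list to select one line instead of iterating over all of them.
first_username, first_password = credentials[0]
print(first_username, first_password)
```

Looping with `for username, password in credentials` visits every pair, so indexing (`credentials[0]` for line 1) is the simpler choice when only one line is wanted.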

ruby: irb gives NameError attempting to use mechanize gem (ubuntu)

Question: On my Ubuntu box, irb (Ruby) gives a NameError when I try to use the mechanize gem:

    $ irb
    irb(main):001:0> require 'mechanize'
    => true
    irb(main):002:0> Mechanize.new
    NameError: uninitialized constant Mechanize
        from (irb):2
        from :0

gem env shows this:

    RubyGems Environment:
      - RUBYGEMS VERSION: 1.3.7
      - RUBY VERSION: 1.8.7 (2010-01-10 patchlevel 249) [x86_64-linux]
      - INSTALLATION DIRECTORY: /usr/lib/ruby/gems/1.8
      - RUBY EXECUTABLE: /usr/bin/ruby1.8
      - EXECUTABLE DIRECTORY: /usr/bin
      - RUBYGEMS

Sending POST parameters with Python using Mechanize

Question: I want to fill out this form using Python:

    <form method="post" enctype="multipart/form-data" id="uploadimage">
      <input type="file" name="image" id="image" />
      <input type="submit" name="button" id="button" value="Upload File" class="inputbuttons" />
      <input name="newimage" type="hidden" id="image" value="1" />
      <input name="path" type="hidden" id="imagepath" value="/var/www/httpdocs/images/" />
    </form>

As you can see, there are two parameters that are named exactly the same, so when I'm using