Question
I need to scrape data from a site that requires me to log in first. I've used Hpricot to scrape other sites successfully, but I'm new to Mechanize and truly baffled by how it works.
I see this example commonly quoted:
require 'rubygems'
require 'mechanize'
a = Mechanize.new
a.get('http://rubyforge.org/') do |page|
  # Click the login link
  login_page = a.click(page.link_with(:text => /Log In/))
  # Submit the login form
  my_page = login_page.form_with(:action => '/account/login.php') do |f|
    f.form_loginname  = ARGV[0]
    f.form_pw         = ARGV[1]
  end.click_button
  my_page.links.each do |link|
    text = link.text.strip
    next unless text.length > 0
    puts text
  end
end
But I've found it extremely cryptic. The part I don't understand in particular is what's going on here:
f.form_loginname  = ARGV[0]
f.form_pw         = ARGV[1]
How have those input tags from the page suddenly become methods? Am I missing something here? When I try to recreate it to log in to AppDataPro (http://www.appdata.com/login), I run into the problem that the input names contain brackets, like this:
<Table> 
<tr><td width="150"> 
   <label for="user_session_username">Username</label><br /> 
</td><td > 
    <input id="user_session_username" name="user_session[username]" size="30" type="text" /> 
</td></tr> 
<tr><td> 
   <label for="user_session_password">Password</label><br /> 
</td><td> 
    <input id="user_session_password" name="user_session[password]" size="30" type="password" /> 
</td></tr> 
</table> 
This is my attempt to use mechanize:
    a = Mechanize.new
    a.get('http://www.appdata.com/login') do |page|
        # Click the login link
        login_page = a.click(page.link_with(:text => /Login/)) #login_page is basically a doc of appdata/login
        my_page = login_page.form_with(:action => '/login') do |f|
            f.user_session[username] =  '****username here?****'
            f.user_session[password] =  '****password here?****'
        end
    end
but it causes the error,
logintest01.rb:21:in `block (2 levels) in <main>': undefined method `user_session' for nil:NilClass (NoMethodError)
What's wrong with what I'm doing?
Answer 1:
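This is the approach I usually take, and it hasn't failed me. Look each field up by its full name, brackets included, and set its value. Here form is the login form you have already fetched, e.g. with page.form_with(:action => '/login'):
# Brackets are fine inside a string, so the literal name attribute works
username_field = form.field_with(:name => "user_session[username]")
username_field.value = "whatever_user"
password_field = form.field_with(:name => "user_session[password]")
password_field.value = "whatever_pwd"
# Submit the form; the return value is the page that comes back
form.submit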
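On the question of how input names become methods: as far as I know, a Mechanize form does not define those methods up front but resolves unknown method names against its field names through method_missing. That would also explain why a name like user_session[username] can't be used as a method, since it isn't a valid Ruby identifier. A toy sketch of the idea, where TinyForm is a hypothetical stand-in and not Mechanize's actual class:

```ruby
# TinyForm is a hypothetical stand-in used only to illustrate the idea;
# it is not Mechanize's real form class.
class TinyForm
  def initialize(field_names)
    # Map each field name to its current value (nil until assigned)
    @fields = {}
    field_names.each { |name| @fields[name] = nil }
  end

  # Read a field by its literal name; works for any string
  def [](name)
    @fields.fetch(name)
  end

  # Assign a field by its literal name
  def []=(name, value)
    raise KeyError, "no field named #{name}" unless @fields.key?(name)
    @fields[name] = value
  end

  # Unknown methods are matched against field names, so
  # form.form_pw = 'x' ends up setting the field called 'form_pw'
  def method_missing(meth, *args)
    name = meth.to_s.chomp('=')
    return super unless @fields.key?(name)
    meth.to_s.end_with?('=') ? (@fields[name] = args.first) : @fields[name]
  end

  def respond_to_missing?(meth, include_private = false)
    @fields.key?(meth.to_s.chomp('=')) || super
  end
end

form = TinyForm.new(['form_loginname', 'user_session[username]'])
form.form_loginname = 'alice'            # resolved through method_missing
form['user_session[username]'] = 'bob'   # bracketed names need the string form
puts form.form_loginname
puts form['user_session[username]']
```

Real Mechanize forms also accept the string form, form['user_session[username]'] = '...', which is probably the shortest way around bracketed names.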
Answer 2:
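Try it without this line, since you are already fetching the login page directly:
login_page = a.click(page.link_with(:text => /Login/))
Or keep the click and start from the home page instead:
a.get('http://www.appdata.com/') do |page|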
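Putting both suggestions together, here is a hedged sketch: fetch the login page directly, fail loudly if the form isn't found (form_with returns nil when nothing matches, which is where the nil:NilClass error comes from), then fill the fields by their full names. The helper name log_in and the LoginFormNotFound error are made up for illustration; the URL, the '/login' action and the field names are taken from the question and may not match the live site.

```ruby
# Hypothetical error class, for illustration only
class LoginFormNotFound < StandardError; end

# log_in is a made-up helper. `agent` is expected to behave like a
# Mechanize instance: it responds to #get and returns a page object.
def log_in(agent, username, password, url = 'http://www.appdata.com/login')
  page = agent.get(url)                       # go straight to the login page
  form = page.form_with(:action => '/login')  # nil if no form matches
  raise LoginFormNotFound, "no form with action /login at #{url}" if form.nil?

  # String keys sidestep the brackets in the field names
  form['user_session[username]'] = username
  form['user_session[password]'] = password
  form.submit                                 # returns the page after login
end

# Usage, assuming the mechanize gem is installed:
#   require 'mechanize'
#   my_page = log_in(Mechanize.new, ARGV[0], ARGV[1])
#   my_page.links.each { |link| puts link.text.strip }
```

Taking the agent as a parameter keeps the helper easy to try out with a stub object before pointing it at the real site.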
Source: https://stackoverflow.com/questions/6629579/using-ruby-with-mechanize-to-log-into-a-website