Question
I am building a script using Mechanize to scrape data from a website. The script is supposed to click the "Read biography" link and then scrape the member's biography from the next page, but it fails with undefined method `click' for nil:NilClass.
Here is the script in the Rake file:
require 'mechanize'
require 'date'
require 'json'

task :testing2 do
  agent = Mechanize.new
  page = agent.get("https://www.congress.gov/members")
  page_links = page.links_with(href: %r{.*/member/\w+})
  member_links = page_links[0...2]

  members = member_links.map do |link|
    member = link.click
    name = member.search('title').text.split('|')[0]
    institution = member.search('td~ td+ td').text.split(':')[0]
    dob = member.search('.birthdate').text.strip[1..4]

    # Get bio
    bio_link = member.link_with(:text => 'Read biography').click
    # click returns the biography page, so search it for the paragraphs
    bio = bio_link.search('p').text.strip

    {
      name: name.strip,
      institution: institution.strip,
      dob: dob,
      bio: bio
    }
  end

  puts JSON.pretty_generate(members)
end
Answer 1:
The code you are using:
member.link_with(:text => 'Read biography')
does not find the link, because the link text contains extra spaces and newline characters, so an exact string comparison fails. Use a regular expression instead:
member.link_with(:text => /Read biography/)
This finds the link.
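The difference between the two criteria can be sketched in plain Ruby, without Mechanize or a network connection. The whitespace in LINK_TEXT is an assumption made for illustration, not the actual markup of the page, and exact_match? and pattern_match? are hypothetical helpers that only mimic how a String criterion and a Regexp criterion behave.

```ruby
# Stand-in for the text reported for the link; the exact whitespace
# is an assumption for illustration.
LINK_TEXT = "\n      Read biography\n    "

# A String criterion has to equal the text exactly.
def exact_match?(text, wanted)
  text == wanted
end

# A Regexp criterion only has to match somewhere inside the text.
def pattern_match?(text, pattern)
  pattern.match?(text)
end

puts exact_match?(LINK_TEXT, 'Read biography')        # false
puts pattern_match?(LINK_TEXT, /Read biography/)      # true
puts exact_match?(LINK_TEXT.strip, 'Read biography')  # true
```

The last line shows that the plain string only matches once the surrounding whitespace is stripped, which is why the regular expression is the simpler fix.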
Answer 2:
There are two calls to click:
member = link.click
and
bio_link = member.link_with(:text => 'Read biography').click
The first is called on the iterator variable, which cannot be nil, so the problematic call is the second.
Try adding debug output, or set a breakpoint before the # Get bio comment, and examine what is wrong. From the information you have provided, it is impossible to say why member.link_with(:text => 'Read biography') returns nil.
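One way to produce that debug output is a lookup that fails loudly and prints the text of every candidate link with inspect, so hidden whitespace becomes visible. This is only a sketch: FakeLink is a hypothetical stand-in for a Mechanize link (only text is modelled), find_link! is a hypothetical helper, and the sample texts are invented.

```ruby
# Hypothetical stand-in for a Mechanize link; only #text is modelled here.
FakeLink = Struct.new(:text)

# Hypothetical helper: returns the first link whose text matches `pattern`
# (a String must be equal, a Regexp must match). If nothing matches, it
# raises and lists the inspected text of every candidate.
def find_link!(links, pattern)
  found = links.find { |link| pattern === link.text }
  return found if found

  seen = links.map { |link| link.text.inspect }.join(', ')
  raise ArgumentError, "no link matching #{pattern.inspect}; saw: #{seen}"
end

links = [FakeLink.new('Home'), FakeLink.new("\n  Read biography\n")]

begin
  find_link!(links, 'Read biography')
rescue ArgumentError => e
  puts e.message # lists each link text, escapes included
end

puts find_link!(links, /Read biography/).text.strip
```

With a real page, something like member.links.map { |l| l.text.inspect } placed before the # Get bio comment would give the same kind of output.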
Source: https://stackoverflow.com/questions/32786256/undefined-method-click-for-nilnilclass-mechanize