Mechanize/Ruby read source code of 404 page

Submitted by ぐ巨炮叔叔 on 2019-12-21 20:54:40

Question


All I'm doing is loading Mechanize and getting a page that returns a 404. But that's exactly what I want: the 404 page has plenty of HTML I'd like to use in my example.

require 'mechanize'

a = Mechanize.new
a.get('http://www.youtube.com/watch?v=e4g8jriw4rg')
a.page
=> nil

I can't seem to find any further info on this.


Answer 1:


You need to handle the exception:

begin
  page = a.get 'http://www.youtube.com/watch?v=e4g8jriw4rg'
rescue Mechanize::ResponseCodeError => e
  puts e.response_code # the status code as a string, e.g. "404"
  page = e.page
end

puts page.title



Answer 2:


Rescuing the exception may have been necessary when the answer above was written (the code changed about 5 years ago), but it no longer is. You can now set allowed_error_codes on the agent instance to an array of Integers or Strings containing the HTTP response codes you want to handle without an exception. The docs (as of this writing) note that "2xx, 3xx and 401 status codes will be handled without checking this list."
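As a rough sketch of that setting, the snippet below allows a list of error codes and then fetches a URL. It assumes allowed_error_codes is reached through the underlying HTTP agent (a.agent) rather than directly on the Mechanize object, which may differ between gem versions. The helper name fetch_allowing is made up for illustration and is not part of Mechanize.

```ruby
# Hypothetical helper (not part of Mechanize): allow the given HTTP error
# codes on the agent, then fetch the URL.
# `agent` is expected to be a Mechanize instance, created after
# `require 'mechanize'` with `Mechanize.new`.
def fetch_allowing(agent, url, codes = [404])
  # Assumption: the setting lives on the underlying HTTP agent,
  # which is exposed through Mechanize#agent.
  agent.agent.allowed_error_codes = codes
  # With the code allowed, get should return the page instead of raising
  # Mechanize::ResponseCodeError.
  agent.get(url)
end
```

With the gem loaded, a call along the lines of page = fetch_allowing(Mechanize.new, 'http://www.youtube.com/watch?v=e4g8jriw4rg') should return the 404 page, so page.code and page.title can be read without a rescue block.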



Source: https://stackoverflow.com/questions/13188195/mechanize-ruby-read-source-code-of-404-page
