scrubyt | 易学教程

Any scrubyt command that clicks a link returns a 403 Forbidden Error


Question: I'm trying to use Scrubyt to navigate around a website, but whenever I use it to click a link it returns a 403 Forbidden error. The website doesn't require a login, so I don't understand why this happens. Might it need some kind of session variable, or the right User-Agent string? Any idea how I might fix this?

Answer 1: I think this may be the same issue as your other question here: How to get 'Next Page' link with Scrubyt

Source: https://stackoverflow.com/questions/169934/any-scrubyt-command-that
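The question suggests the 403 might depend on the User-Agent string. A minimal sketch of how that hypothesis could be checked outside Scrubyt, using Ruby's standard Net::HTTP, is shown here. The helper names `build_request` and `fetch_status` and the header value are assumptions for illustration; the source does not confirm that the site filters on these headers.

```ruby
require 'net/http'
require 'uri'

# Hypothetical browser-like User-Agent value; substitute any realistic string.
BROWSER_USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko)'

# Build a GET request carrying an explicit User-Agent and, optionally,
# a Referer header (some servers reject link clicks that arrive without one).
def build_request(url, user_agent: BROWSER_USER_AGENT, referer: nil)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri)
  request['User-Agent'] = user_agent
  request['Referer'] = referer if referer
  request
end

# Send the request and return the HTTP status code as a string, e.g. "403".
def fetch_status(url, **options)
  uri = URI.parse(url)
  request = build_request(url, **options)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request).code
  end
end
```

Comparing the status returned by `fetch_status` with and without the browser-like header (and with the listing page passed as `referer:`) would indicate whether the headers, rather than a session cookie, are what the server objects to.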

How to get 'Next Page' link with Scrubyt


Question: I'm trying to use Scrubyt to get the details from this page: http://www.nuffieldtheatre.co.uk/cn/events/event_listings.php?section=events. I've managed to get the titles and detail URLs from the list, but I can't use next_page to get the scraper to go to the next page. I assume that's because I'm not using the correct pattern for the next page link. I tried the string "Next Page", and I've also tried the XPath. Any other ideas? The code is below:

    require 'rubygems'
    require 'scrubyt'
    nuffield

Scraping hidden HTML (when visible = false) using Hpricot (Ruby on Rails)


Question: I've come across an issue that I can't seem to get past. I'm also new to Ruby on Rails, hence the number of questions. I am attempting to scrape a webpage such as the following: http://www.yellowpages.com.mt/Malta/Grocers-Mini-Markets-Retail-In-Malta-Gozo.aspx. I would like to scrape the addresses, the phone numbers, and the URL of the next page, which in this case is http://www.yellowpages.com.mt/Malta/Grocers-Mini-Markets-Retail-In-Malta-Gozo+Ismol.aspx. I've been trying

How to get 'Next Page' link with Scrubyt


I'm trying to use Scrubyt to get the details from this page: http://www.nuffieldtheatre.co.uk/cn/events/event_listings.php?section=events. I've managed to get the titles and detail URLs from the list, but I can't use next_page to get the scraper to go to the next page. I assume that's because I'm not using the correct pattern for the next page link. I tried the string "Next Page", and I've also tried the XPath. Any other ideas? The code is below:

    require 'rubygems'
    require 'scrubyt'

    nuffield_data = Scrubyt::Extractor.define do
      fetch 'http://www.nuffieldtheatre.co.uk/cn/events/event_listings.php
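Before adjusting the next_page pattern, it can help to confirm that a link with the exact text "Next Page" exists in the fetched HTML and to see where it points. The following is a rough sketch using only Ruby's standard library, not Scrubyt's own matching logic. The sample markup, the `page` query parameter, the example.com base URL, and the helper names are all hypothetical; the real listing page may be structured differently, and a proper HTML parser would be more robust than a regular expression.

```ruby
require 'uri'

# Hypothetical pager markup, standing in for the real listing page.
SAMPLE_HTML = <<~HTML
  <div class="pager">
    <a href="event_listings.php?section=events&amp;page=1">Previous Page</a>
    <a href="event_listings.php?section=events&amp;page=3">Next Page</a>
  </div>
HTML

# Return the href of the first <a> element whose visible text equals
# link_text (case-insensitive), or nil if there is no such link.
def find_link_href(html, link_text)
  html.scan(%r{<a\b[^>]*\bhref=["']([^"']*)["'][^>]*>(.*?)</a>}mi) do |href, text|
    label = text.gsub(/<[^>]+>/, '').strip # drop nested tags such as <span>
    return href.gsub('&amp;', '&') if label.downcase == link_text.downcase
  end
  nil
end

# Resolve the (possibly relative) href against the URL of the current page.
def next_page_url(base_url, html, link_text = 'Next Page')
  href = find_link_href(html, link_text)
  href && URI.join(base_url, href).to_s
end

puts next_page_url('http://example.com/cn/events/event_listings.php?section=events', SAMPLE_HTML)
```

If this kind of check returns nil on the real page, the link text probably differs from "Next Page" (for example, it may be an image or contain extra whitespace), which would also explain why the string pattern fails.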