nokogiri

HTTP redirection loop RuntimeError in open_uri_redirections gem

血红的双手。 submitted on 2019-12-12 22:19:57
Question: Thanks for your time. I'm writing a Ruby script to parse a CSV of URLs and evaluate them on a variety of dimensions, checking whether certain tags and attributes are present or conform to a certain pattern. I'm using Nokogiri, open-uri, and open_uri_redirections, the patch for open-uri that allows the script to follow redirections. On a handful of problematic domains, the script encounters a runtime error: Loading https://www.exampleproblemdomain.com C:/Ruby24-x64/lib/ruby/2.4.0

ElasticBeanstalk - Rails Nokogiri Deployment Issue

混江龙づ霸主 submitted on 2019-12-12 20:46:04
Question: I have a working Rails application deployed to EC2 through ElasticBeanstalk. I update the website every few weeks without issue. Today I'm running into a problem after committing changes and running "eb deploy": An error occurred while installing nokogiri (1.7.0.1), and Bundler cannot continue. Make sure that `gem install nokogiri -v '1.7.0.1'` succeeds before bundling. I haven't changed anything aside from a few views. The host is the same and the Gemfile is the same. On my local machine, I

Nokogiri with Rspec

霸气de小男生 submitted on 2019-12-12 19:02:56
Question: Is there a way to use Nokogiri in RSpec? In particular, I'm trying to take the response from a controller action, convert it into a Nokogiri object named page, and run Nokogiri-specific parsing like the following: page.search('input[name="some_name"]').size.should == 1 Where do I include Nokogiri? Would that be in spec_helper.rb? How do I convert the ActionController::TestResponse into a Nokogiri object? Or is it possible to run this kind of assertion using plain RSpec syntax? Answer 1: If

nokogiri + mechanize css selector by text

删除回忆录丶 submitted on 2019-12-12 18:44:19
Question: I am new to Nokogiri and so far am most familiar with CSS selectors. I am trying to parse information from a table; below is a sample of the table and the code I'm using. I'm stuck on the appropriate if statement, as it seems to return the whole contents of the table. Table: <div class="holder"> <div class ="row"> <div class="c1"> <!-- Content I Don't need --> </div> <div class="c2"> <span class="data"> <!-- Content I Don't Need --> <span class="data"> </div> </div> ... <div class="row"> <div

Which version of XPath does Nokogiri support?

天大地大妈咪最大 submitted on 2019-12-12 18:33:12
Question: I can't find an official statement of the XPath version that Nokogiri supports. Can anyone help me with this? What I actually want is to extract elements that have an attribute starting with a specified substring. For example, I want to get all book elements whose category attribute starts with the character C. How can I do this with Nokogiri? <?xml version="1.0" encoding="ISO-8859-1"?> <!-- Edited by XMLSpy® --> <bookstore> <book category="COOKING"> <title lang="en">Everyday Italian</title> <author

Issue with unclosed img tag

夙愿已清 submitted on 2019-12-12 12:46:58
Question: Data is presented in HTML format and submitted to the server, which does some preprocessing. It operates on the "src" attribute of "img" tags. After preprocessing and saving, none of the preprocessed "img" tags are self-closed. For example, if the "img" tag was the following: <img src="image.png" /> then after preprocessing with Nokogiri or Hpricot it will be: <img src="/preprocessed_path/image.png"> The code is pretty simple: doc = Hpricot(self.content) doc.search("img").each do |tag| preprocess tag end self

Why doesn't nokogiri install?

倾然丶 夕夏残阳落幕 submitted on 2019-12-12 10:13:37
Question: I'm having a devil of a time installing Nokogiri on Ubuntu 12.04. I use rbenv. $ gem install nokogiri -v '1.6.1' ERROR: While executing gem ... (Errno::EACCES) Permission denied - /home/deploy/.rbenv/versions/2.0.0-p353/lib/ruby/gems/2.0.0/gems/nokogiri-1.6.1/.autotest $ sudo gem install nokogiri -v '1.6.1' ERROR: Error installing nokogiri: nokogiri requires Ruby version >= 1.9.2. $ rbenv sudo gem install nokogiri -v '1.6.1' Building native extensions. This could take a while... ERROR: Error

Nokogiri won't let me bundle install in Rails

徘徊边缘 submitted on 2019-12-12 09:33:15
Question: I've seen this question asked and have tried everything I've seen suggested. I got a new MacBook and am trying to set up an existing app. When I clone the app, it will not bundle install and acts as if Rails is not installed, even though it works in other directories. I tried removing version numbers from the Gemfile and deleting Gemfile.lock. I tried bundle update. I'm on OS X 10.9.4, Rails 4.1.5, and Ruby 2.1.1. The error I am getting: An error occurred while installing nokogiri (1.6.3.1), and Bundler

DRY search every page of a site with nokogiri

ⅰ亾dé卋堺 submitted on 2019-12-12 08:54:56
Question: I want to search every page of a site. My thought is to find all links on a page that stay within the domain, visit them, and repeat. I'll also have to keep track of visited pages so that I don't repeat work. So it starts very easily: page = 'http://example.com' nf = Nokogiri::HTML(open(page)) links = nf.xpath '//a' #find all links on current page main_links = links.map{|l| l['href'] if l['href'] =~ /^\//}.compact.uniq "main_links" is now an array of links from the active page that start with "/" (which
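The "visit, collect, repeat without revisiting" idea in this question can be sketched as a breadth-first walk with a visited set. This is only an illustration under stated assumptions, not the asker's eventual solution: `internal_links`, `crawl`, `fetch_hrefs`, and the `SITE` hash are all hypothetical names, and `SITE` stands in for real pages so the sketch runs without network access. In a real script the callable would fetch the page and return the `href` values of its `a` elements.

```ruby
require 'set'

# Keep only site-relative hrefs (starting with a single "/"), dropping nils,
# protocol-relative URLs ("//host/..."), and duplicates.
def internal_links(hrefs)
  hrefs.compact
       .select { |href| href.start_with?('/') && !href.start_with?('//') }
       .uniq
end

# Breadth-first walk. `fetch_hrefs` is a hypothetical callable that takes a
# path and returns the href values found on that page.
def crawl(start_path, fetch_hrefs)
  visited = Set.new
  queue = [start_path]

  until queue.empty?
    path = queue.shift
    next unless visited.add?(path) # add? returns nil when already visited

    internal_links(fetch_hrefs.call(path)).each do |link|
      queue << link unless visited.include?(link)
    end
  end

  visited.to_a
end

# Stand-in for a real site: path => hrefs found on that page (made-up data).
SITE = {
  '/' => ['/about', '/blog', 'http://other.example/', nil],
  '/about' => ['/', '/blog'],
  '/blog' => ['/blog/post-1', '/about'],
  '/blog/post-1' => ['/']
}.freeze

pages = crawl('/', ->(path) { SITE.fetch(path, []) })
puts pages.inspect
```

Keeping the fetch step behind a callable separates the traversal logic from the HTTP and parsing code, so the loop can be checked against canned data first.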

HTML is read before fully loaded using open-uri and nokogiri

纵饮孤独 submitted on 2019-12-12 08:47:44
Question: I'm using open-uri and Nokogiri with Ruby to do some simple web crawling. The one problem is that sometimes the HTML is read before the page is fully loaded. In such cases, I cannot fetch any content other than the loading icon and the nav bar. What is the best way to tell open-uri or Nokogiri to wait until the page is fully loaded? Currently my script looks like: require 'nokogiri' require 'open-uri' url = "https://www.the-page-i-wanna-crawl.com" doc = Nokogiri::HTML(open(url, ssl_verify_mode: OpenSSL