nokogiri

Using Nokogiri HTML Builder to create fragment with multiple root nodes

喜夏-厌秋 posted on 2019-11-29 06:47:14
Question: I have a simple problem with Nokogiri. I want Nokogiri::HTML::Builder to produce an HTML fragment of the following form:

    <div> #Some stuff in here </div>
    <div> #Some other stuff in here </div>

When I try:

    @builder = Nokogiri::HTML::Builder.new(:encoding => 'UTF-8') do |doc|
      doc.div { doc.p "first test" }
      doc.div { doc.p "second test" }
    end
    @builder.to_html

I get the error "Document has already a root node", which I partly understand. I know I am not wrapping the whole thing

Failing to install Nokogiri gem

旧巷老猫 posted on 2019-11-29 06:21:22
Question: I'm working on a Rails app that allows image attachments for each user account. I'm using paperclip and Amazon Web Services:

    gem 'paperclip'
    gem 'aws-sdk'

When I run bundle install, I get this message:

    extconf failed, exit code 1
    Gem files will remain installed in /usr/local/rvm/gems/ruby-2.1.2/gems/nokogiri-1.6.5 for inspection.
    Results logged to /usr/local/rvm/gems/ruby-2.1.2/extensions/x86_64-darwin-13/2.1.0-static/nokogiri-1.6.5/gem_make.out
    An error occurred while installing nokogiri

Parsing Javascript using Ruby code

戏子无情 posted on 2019-11-29 06:12:55
I'm writing test code in Ruby and trying to parse the HTML source of a website. It has a JavaScript variable that I want to compare against other values. For example:

    <script type="text/javascript" language="JavaScript">
    function GetParam(name) {
      var req_var = { a: 'xyz', b: 'yy.com', c: 'en', d:0, e: 'y' };
    }
    </script>

Here I want to extract the variable req_var from this function. Is it possible to do that? If so, can anyone please help me with that?

karlcow: JavaScript parsers in Ruby: rbnarcissus, RKelly, johnson

You could use a regular expression to parse it out like this: k =
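As a rough illustration of the regular-expression route, the sketch below pulls the req_var literal out of the script text and turns it into a Ruby Hash. It is not the truncated answer's code: the patterns are assumptions that only cover flat literals with single-quoted strings and integers, as in the sample. For anything nested, one of the parsers listed above is likely the safer choice.

```ruby
html = <<~HTML
  <script type="text/javascript" language="JavaScript">
  function GetParam(name) {
    var req_var = { a: 'xyz', b: 'yy.com', c: 'en', d:0, e: 'y' };
  }
  </script>
HTML

# Capture everything between the braces of the req_var assignment.
body = html[/var\s+req_var\s*=\s*\{(.*?)\}\s*;/m, 1]

# Split the captured text into key/value pairs; values are either
# single-quoted strings or plain integers (an assumption of this sketch).
req_var = body.scan(/(\w+)\s*:\s*('[^']*'|\d+)/).map do |key, raw|
  value = raw.start_with?("'") ? raw[1..-2] : Integer(raw)
  [key, value]
end.to_h

puts req_var.inspect
```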

Save all image files from a website

删除回忆录丶 posted on 2019-11-29 04:37:55
I'm creating a small app for myself where I run a Ruby script and save all of the images from my blog. I can't figure out how to save the image files after I've identified them. Any help would be much appreciated.

    require 'rubygems'
    require 'nokogiri'
    require 'open-uri'

    url = '[my blog url]'
    doc = Nokogiri::HTML(open(url))
    doc.css("img").each do |item|
      #something
    end

Phrogz:

    URL = '[my blog url]'
    require 'nokogiri' # gem install nokogiri
    require 'open-uri' # already part of your ruby install

    Nokogiri::HTML(open(URL)).xpath("//img/@src").each do |src|
      uri = URI.join( URL, src ).to_s # make

Failure to install nokogiri libiconv is missing on Yosemite Mac OS X 10.10

人走茶凉 posted on 2019-11-29 02:51:23
Question: Trying to install Nokogiri, I'm getting the following error:

    Maxims-MacBook-Air:ScrapingTheApple maximveksler$ gem install nokogiri
    Fetching: nokogiri-1.6.2.1.gem (100%)
    Building native extensions. This could take a while...
    Building nokogiri using packaged libraries.
    ERROR: Error installing nokogiri:
    ERROR: Failed to build gem native extension.
    /Users/maximveksler/.rvm/rubies/ruby-2.1.2/bin/ruby extconf.rb
    Building nokogiri using packaged libraries.
    -----
    libiconv is missing. please visit http

Adding namespace using Nokogiri's XML Builder

放肆的年华 posted on 2019-11-29 00:28:46
I have been wrecking my head for a few hours, but I can't work out how to add an XMLNS namespace while using the Nokogiri XML Builder class to construct an XML structure. For instance, consider the XML sample below: I can create everything between the GetQuote tags, but creating the "p:ACMRequest" remains a mystery. I came across this reference, https://gist.github.com/428455/7a15f84cc08c05b73fcec2af49947d458ae3b96a, that still doesn't make sense to me. Even referring to the XML documentation, http://www.w3.org/TR/xml-names/, didn't help either. <?xml version="1.0" encoding="UTF

Error installing nokogiri 1.6.0 on mac (libxml2)

徘徊边缘 posted on 2019-11-28 21:15:04
UPDATE: Fixed. I found the answer in another thread. The workaround I used is to tell Nokogiri to use the system libraries instead:

    NOKOGIRI_USE_SYSTEM_LIBRARIES=1 bundle install

====

Trying to install nokogiri 1.6.0 on a Mac. With previous versions, I had no problems, but 1.6.0 refuses to install. This is the error:

    Building native extensions. This could take a while...
    ERROR: Error installing nokogiri:
    ERROR: Failed to build gem native extension.
    /Users/josenriq/.rvm/rubies/ruby-1.9.3-head/bin/ruby extconf.rb
    Extracting libxml2-2.8.0.tar.gz into tmp/i686-apple-darwin11/ports/libxml2/2.8.0...

Why doesn't Nokogiri xpath like xmlns declarations

徘徊边缘 posted on 2019-11-28 20:23:56
I'm using Nokogiri::XML to parse responses from Amazon SimpleDB. The response is something like:

    <SelectResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
      <SelectResult>
        <Item>
          <Attribute><Name>Foo</Name><Value>42</Value></Attribute>
          <Attribute><Name>Bar</Name><Value>XYZ</Value></Attribute>
        </Item>
      </SelectResult>
    </SelectResponse>

If I just hand the response straight over to Nokogiri, all XPath queries (e.g. doc/"//Item/Attribute[Name='Foo']/Value") return an empty array. But if I remove the xmlns attribute from the SelectResponse tag, it works perfectly. Is there some extra thing I

How do I remove a node with Nokogiri?

扶醉桌前 posted on 2019-11-28 20:01:58
How can I remove <img> tags using Nokogiri? I have the following code but it won't work:

    # str = '<img src="canadascapital.gc.ca/data/2/rec_imgs/5005_Pepsi_H1NB.gif"/…; testt<a href="#">test</a>tfbu'
    f = Nokogiri::XML.fragment(str)
    f.search('//img').each do |node|
      node.remove
    end
    puts f

xds2000: Have a try!

    f = Nokogiri::XML.fragment(str)
    f.search('.//img').remove
    puts f

I prefer CSS over XPath, as it's usually much more readable. Switching to CSS:

    require 'nokogiri'
    doc = Nokogiri::HTML('<html><body><img src="foo"><img src="bar"></body></html>')

After parsing the document looks like: doc.to

How do I use XPath in Nokogiri?

[亡魂溺海] posted on 2019-11-28 16:00:50
Question: I have not found any documentation or tutorial for that. Does anything like that exist?

    doc.xpath('//table/tbody[@id="threadbits_forum_251"]/tr')

The code above will get me any table, anywhere, that has a tbody child with the attribute id equal to "threadbits_forum_251". But why does it start with double //? Why is there /tr at the end? See "Ruby Nokogiri Parsing HTML table II" for more details. Can anybody tell me how to extract href, id, alt, src, etc., using Nokogiri? td[3]/div[1]/a