nokogiri | 易学教程

Preventing Nokogiri from escaping characters?

Question: I have created a text node and inserted it into my document like so: #<Nokogiri::XML::Text:0x3fcce081481c "<%= stylesheet_link_tag 'style'%>">]> When I try to save the document with this: File.open('ng.html', 'w+'){|f| f << page.to_html} I get this in the actual document: &lt;%= stylesheet_link_tag 'style'%&gt; Is there a way to disable the escaping and save my page with my ERB tags intact? Thanks!

Answer 1: You are obliged to escape some characters in text elements, such as: " (&quot;), ' (&apos;), < (&lt;), > (&gt;), & (&amp;). If you

Converting nested hash into XML using nokogiri

Question: I have many levels of nested hash like: { :foo => 'bar', :foo1 => { :foo2 => 'bar2', :foo3 => 'bar3', :foo4 => { :foo5 => 'bar5' }}} How can I convert it into XML like this? <foo>bar</foo> <foo1> <foo2>bar2</foo2> <foo3>bar3</foo3> <foo4> <foo5>bar5</foo5> </foo4> </foo1> I have tried the xml.send method, but it converts the above nested hash to: <foo1 foo3="bar3" foo4="foo5bar5" foo2="bar2"/> <foo>bar</foo>

Answer 1: How about this? class Hash def to_xml map do |k, v| text = Hash === v ? v

How to install Nokogiri Gem for Windows

I'm having this problem with nokogiri's gem: Could not open library 'C:\Ruby187\lib\ruby\gems\1.8\gems\nokogiri-1.4.6-x86-mingw32\ext\nokogiri\libxml2.dll' : unknown I read that I had to try the 1.5.0.beta3 version. However, when I run C:\Users\t3en4>gem install nokogiri --pre Fetching: nokogiri-1.5.0.beta.4.gem (100%) ERROR: Error installing nokogiri: The 'nokogiri' native gem requires installed build tools. Please update your PATH to include build tools or download the DevKit from 'http://rubyinstaller.org/downloads' and follow the instructions at 'http://github.com/oneclick/rubyinstaller

Error - “gem install rails” - libxml2 is missing

Question: I've been working through the Rails install instructions (http://railsapps.github.io/installrubyonrails-mac.html) and everything was okay until I got to the gem install rails part under New Rails Application. When I ran that I got libxml2 is missing. Here's the log: http://codecascade.com/sIjhQ/raw I had similar issues installing nokogiri, and the only way I was able to get it resolved was with gem install nokogiri -- --use-system-libraries I'm on OS X 10.10.2. I also have RubyMine installed if

Nokogiri error when running bundle install

Trying to get a cloned Rails app running. When running bundle install I get this error: Using mini_portile (0.5.0) Installing nokogiri (1.6.0) Gem::InstallError: nokogiri requires Ruby version >= 1.9.2. An error occurred while installing nokogiri (1.6.0), and Bundler cannot continue. Make sure that `gem install nokogiri -v '1.6.0'` succeeds before bundling. But this is the output for rbenv version : › rbenv version 1.9.3-p429 (set by /Users/andrewguo/.rbenv/version) When running gem list I get: . . . mini_portile (0.5.0) minitest (2.5.1) multi_json (1.7.7) nokogiri (1.6.0) I've been racking my

Get link and href text from html doc with Nokogiri & Ruby?

I'm trying to use the nokogiri gem to extract all the URLs on the page as well as their link text, and store the link text and URL in a hash. <html> <body> <a href=#foo>Foo</a> <a href=#bar>Bar </a> </body> </html> I would like to return {"Foo" => "#foo", "Bar" => "#bar"} Here's a one-liner: Hash[doc.xpath('//a[@href]').map {|link| [link.text.strip, link["href"]]}] #=> {"Foo"=>"#foo", "Bar"=>"#bar"} Split up a bit to be arguably more readable: h = {} doc.xpath('//a[@href]').each do |link| h[link.text.strip] = link['href'] end puts h #=> {"Foo"=>"#foo", "Bar"=>"#bar"} Another way: h = doc.css('a

How to convert Nokogiri Document object into JSON

I have some parsed Nokogiri::XML::Document objects that I want to print as JSON. I can go the route of making it a string, parsing it into a hash with active-record or Crack, and then calling Hash.to_json; but that is both ugly and depends on way too many libraries. Is there not a simpler way? As per the request in the comment, for example the XML <root a="b"><a>b</a></root> could be represented as JSON: <root a="b"><a>b</a></root> #=> {"root":{"a":"b"}} <root foo="bar"><a>b</a></root> #=> {"root":{"a":"b","foo":"bar"}} That is what I get with Crack now too. And, sure, collisions between entities and

How do I use XPath in Nokogiri?

I have not found any documentation or tutorial for that. Does anything like that exist? doc.xpath('//table/tbody[@id="threadbits_forum_251"]/tr') The code above will get me any table, anywhere, that has a tbody child with the attribute id equal to "threadbits_forum_251". But why does it start with a double //? Why is there /tr at the end? See "Ruby Nokogiri Parsing HTML table II" for more details. Can anybody tell me how to extract href, id, alt, src, etc., using Nokogiri? td[3]/div[1]/a/text()' <--- extracts text How can I extract other things?

Rubens Farias: Seems you need to read a

extract links (URLs), with nokogiri in ruby, from a href html tags?

I want to extract all URLs from a webpage. How can I do that with Nokogiri? Example: <div class="heat"> <a href='http://example.org/site/1/'>site 1</a> <a href='http://example.org/site/2/'>site 2</a> <a href='http://example.org/site/3/'>site 3</a> </div> The result should be a list: l = ['http://example.org/site/1/', 'http://example.org/site/2/', 'http://example.org/site/3/'] You can do it like this: doc = Nokogiri::HTML.parse(<<-HTML_END) <div class="heat"> <a href='http://example.org/site/1/'>site 1</a> <a href='http://example.org/site/2/'>site 2</a> <a href='http://example.org/site/3/'

Save image with Mechanize and Nokogiri?

Question: I'm using Mechanize and Nokogiri to gather some data. I need to save a picture that's randomly generated at each request. In my attempt I'm forced to download all pictures, but the only one I really want is the image located within div#specific. In addition, is it possible to generate Base64 data from it, without saving it or reloading its source? require 'rubygems' require 'mechanize' require 'nokogiri' a = Mechanize.new { |agent| agent.keep_alive = true agent.max_history = 0 } urls =