nokogiri | 易学教程

How to edit docx with nokogiri and rubyzip

Question: I'm using a combination of rubyzip and nokogiri to edit a .docx file. I use rubyzip to unzip the .docx file and nokogiri to parse and change the body of the word/document.xml file, but every time I close rubyzip at the end it corrupts the file, and I can't open or repair it. When I unzip the .docx file on my desktop and check word/document.xml, the content is updated to what I changed it to, but all the other files are messed up. Could someone help me with this issue? Here

How do I use Nokogiri to parse an XML file?

Question: I'm having some issues with Nokogiri. I am trying to parse this XML file: <Collection version="2.0" id="74j5hc4je3b9"> <Name>A Funfair in Bangkok</Name> <PermaLink>Funfair in Bangkok</PermaLink> <PermaLinkIsName>True</PermaLinkIsName> <Description>A small funfair near On Nut in Bangkok.</Description> <Date>2009-08-03T00:00:00</Date> <IsHidden>False</IsHidden> <Items> <Item filename="AGC_1998.jpg"> <Title>Funfair in Bangkok</Title> <Caption>A small funfair near On Nut in Bangkok.</Caption>

Is it possible to plug a JavaScript engine with Ruby and Nokogiri?

I'm writing an application to crawl some websites and scrape data from them. I'm using Ruby, Curl and Nokogiri to do this. In most cases it's straightforward and I only need to ping a URL and parse the HTML data. The setup works perfectly fine. However, in some scenarios, the websites retrieve data based on user input on some radio buttons. This invokes some JavaScript which fetches some more data from the server. The generated URL and posted data are determined by JavaScript code. Is it possible to use: A JavaScript library along with this setup which would be able to determine execute the

Convert XML collection (of Pivotal Tracker stories) to Ruby hash/object

I have a collection of stories in an XML format. I would like to parse the file and return each story as either a hash or a Ruby object, so that I can further manipulate the data within a Ruby script. Does Nokogiri support this, or is there a better tool/library to use? The XML document has the following structure, returned via Pivotal Tracker's web API: <?xml version="1.0" encoding="UTF-8"?> <stories type="array" count="145" total="145"> <story> <id type="integer">16376</id> <story_type>feature</story_type> <url>http://www.pivotaltracker.com/story/show/16376</url> <estimate type="integer">2<

XPath to find all following siblings up until the next sibling of a particular type

Given this XML/HTML:

    <dl>
      <dt>Label1</dt><dd>Value1</dd>
      <dt>Label2</dt><dd>Value2</dd>
      <dt>Label3</dt><dd>Value3a</dd><dd>Value3b</dd>
      <dt>Label4</dt><dd>Value4</dd>
    </dl>

I want to find all <dt> and then, for each, find the following <dd> up until the next <dt>. Using Ruby's Nokogiri I am able to accomplish this like so:

    dl.xpath('dt').each do |dt|
      ct = dt.xpath('count(following-sibling::dt)')
      dds = dt.xpath("following-sibling::dd[count(following-sibling::dt)=#{ct}]")
      puts "#{dt.text}: #{dds.map(&:text).join(', ')}"
    end
    #=> Label1: Value1
    #=> Label2: Value2
    #=> Label3: Value3a, Value3b
    #=>

XPath to select between two HTML comments?

I have a big HTML page, but I want to select certain nodes using XPath: <html> ........ <!-- begin content --> <div>some text</div> <div><p>Some more elements</p></div> <!-- end content --> ....... </html> I can select the HTML after the <!-- begin content --> using: "//comment()[. = ' begin content ']/following::*" I can also select the HTML before the <!-- end content --> using: "//comment()[. = ' end content ']/preceding::*" But how do I write an XPath to select all the HTML between the two comments? I would look for elements that are preceded by the first comment and followed by the second comment

Extracting between <br> tags with Nokogiri?

Question: I am trying to extract the phone number and the address from this site using Nokogiri. Both of them are between <br> tags. How can I do this? In case the site is down, here is an excerpt of some of the HTML from which I wish to extract the phone number and address: <table width="900" style=" margin:8px; padding:5px; font-family:Verdana, Geneva, sans-serif; font-size:12px; line-height:165%; color:#333333; border-bottom:1px solid #cccccc; "><tbody><tr valign="top"><td> <strong>Alana's Cafe<

Installing nokogiri Mac OS X 10.8.2 XCode installed

I'm trying to install nokogiri on Mountain Lion. I was using Ruby 1.8.7 and just upgraded to 1.9.3, but that stopped bundle install from working. Incidentally, I could get around this problem by uninstalling Ruby 1.9.3 and reverting to 1.8.7. However, this is obviously a suboptimal solution, since I don't want to be stuck on 1.8.7 for the rest of time...

    Users-MacBook-Pro:sample_app user$ ls
    Gemfile       app        doc     script
    Gemfile.lock  config     lib     spec
    README.md     config.ru  log     tmp
    Rakefile      db         public  vendor
    Ravins-MacBook-Pro:sample_app user$ bundle
    Fetching gem metadata from https://rubygems.org/.......
    /Users

How do I do a regex search in Nokogiri for text that matches a certain beginning?

Given:

    require 'rubygems'
    require 'nokogiri'
    value = Nokogiri::HTML.parse(<<-HTML_END)
    "<html>
      <body>
        <p id='para-1'>A</p>
        <div class='block' id='X1'>
          <h1>Foo</h1>
          <p id='para-2'>B</p>
        </div>
        <p id='para-3'>C</p>
        <h2>Bar</h2>
        <p id='para-4'>D</p>
        <p id='para-5'>E</p>
        <div class='block' id='X2'>
          <p id='para-6'>F</p>
        </div>
      </body>
    </html>"
    HTML_END

I want to do something like what I can do in Hpricot:

    divs = value.search('//div[@id^="para-"]')

How do I do a pattern search for elements in XPath style? Where would I find the documentation to help me? I didn't see this in the rdocs. Aaron

Rails Tutorial: nokogiri-1.5.2 error on bundle install

After working through the RVM setup, rspec and guard sections of chapter 3 of the Ruby on Rails Tutorial, whenever I run bundle install I get the following error dump:

    Gem::Installer::ExtensionBuildError: ERROR: Failed to build gem native extension.
    /usr/bin/ruby1.9.1 extconf.rb
    /usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require': cannot load such file -- mkmf (LoadError)
    from /usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require'
    from extconf.rb:5:in `<main>'
    Gem files will remain installed in /home/dan/.bundler/tmp/17577/gems/nokogiri-1.5.2 for inspection.
    Results logged