nokogiri

How to edit docx with nokogiri and rubyzip

廉价感情. submitted on 2019-12-03 05:19:32
Question: I'm using a combination of rubyzip and nokogiri to edit a .docx file. I use rubyzip to unzip the .docx file and nokogiri to parse and change the body of the word/document.xml file, but every time I close rubyzip at the end it corrupts the file, and I can't open or repair it. When I unzip the .docx file on the desktop and check word/document.xml, the content is updated to what I changed it to, but all the other files are messed up. Could someone help me with this issue? Here

How do I use Nokogiri to parse an XML file?

喜你入骨 submitted on 2019-12-03 05:13:29
Question: I'm having some issues with Nokogiri. I am trying to parse this XML file: <Collection version="2.0" id="74j5hc4je3b9"> <Name>A Funfair in Bangkok</Name> <PermaLink>Funfair in Bangkok</PermaLink> <PermaLinkIsName>True</PermaLinkIsName> <Description>A small funfair near On Nut in Bangkok.</Description> <Date>2009-08-03T00:00:00</Date> <IsHidden>False</IsHidden> <Items> <Item filename="AGC_1998.jpg"> <Title>Funfair in Bangkok</Title> <Caption>A small funfair near On Nut in Bangkok.</Caption>

Is it possible to plug a JavaScript engine with Ruby and Nokogiri?

情到浓时终转凉″ submitted on 2019-12-03 04:06:09
I'm writing an application to crawl some websites and scrape data from them. I'm using Ruby, Curl and Nokogiri to do this. In most cases it's straightforward and I only need to ping a URL and parse the HTML data. The setup works perfectly fine. However, in some scenarios, the websites retrieve data based on user input on some radio buttons. This invokes some JavaScript which fetches some more data from the server. The generated URL and posted data is determined by JavaScript code. Is it possible to use: A JavaScript library along with this setup which would be able to determine execute the

Convert XML collection (of Pivotal Tracker stories) to Ruby hash/object

懵懂的女人 submitted on 2019-12-03 04:02:20
I have a collection of stories in an XML format. I would like to parse the file and return each story as either hash or Ruby object, so that I can further manipulate the data within a Ruby script. Does Nokogiri support this, or is there a better tool/library to use? The XML document has the following structure, returned via Pivotal Tracker's web API : <?xml version="1.0" encoding="UTF-8"?> <stories type="array" count="145" total="145"> <story> <id type="integer">16376</id> <story_type>feature</story_type> <url>http://www.pivotaltracker.com/story/show/16376</url> <estimate type="integer">2<

XPath to find all following siblings up until the next sibling of a particular type

放肆的年华 submitted on 2019-12-03 03:50:14
Given this XML/HTML: <dl> <dt>Label1</dt><dd>Value1</dd> <dt>Label2</dt><dd>Value2</dd> <dt>Label3</dt><dd>Value3a</dd><dd>Value3b</dd> <dt>Label4</dt><dd>Value4</dd> </dl> I want to find all <dt> and then, for each, find the following <dd> up until the next <dt> . Using Ruby's Nokogiri I am able to accomplish this like so: dl.xpath('dt').each do |dt| ct = dt.xpath('count(following-sibling::dt)') dds = dt.xpath("following-sibling::dd[count(following-sibling::dt)=#{ct}]") puts "#{dt.text}: #{dds.map(&:text).join(', ')}" end #=> Label1: Value1 #=> Label2: Value2 #=> Label3: Value3a, Value3b #=>

XPath to select between two HTML comments?

断了今生、忘了曾经 submitted on 2019-12-03 03:37:23
I have a big HTML page, and I want to select certain nodes using XPath: <html> ........ <!-- begin content --> <div>some text</div> <div><p>Some more elements</p></div> <!-- end content --> ....... </html> I can select the HTML after the <!-- begin content --> using: "//comment()[. = ' begin content ']/following::*" I can also select the HTML before the <!-- end content --> using: "//comment()[. = ' end content ']/preceding::*" But what XPath do I need to select all the HTML between the two comments? I would look for elements that are preceded by the first comment and followed by the second comment

Extracting between <br> tags with Nokogiri?

会有一股神秘感。 submitted on 2019-12-03 00:50:10
Question: I am trying to extract the phone number and the address from this site using Nokogiri. Both of them are between <br> tags. How can I do this? In case the site is down, here is an excerpt of some of the HTML from which I wish to extract the phone number and address: <table width="900" style=" margin:8px; padding:5px; font-family:Verdana, Geneva, sans-serif; font-size:12px; line-height:165%; color:#333333; border-bottom:1px solid #cccccc; "><tbody><tr valign="top"><td> <strong>Alana's Cafe<

Installing nokogiri Mac OS X 10.8.2 XCode installed

和自甴很熟 submitted on 2019-12-03 00:48:48
I'm trying to install nokogiri on Mountain Lion. I was using ruby 1.8.7 and just upgraded to 1.9.3, but that stopped bundle install from working. Incidentally, I could get round this problem by uninstalling ruby 1.9.3 and reverting to 1.8.7. However, this is obviously a suboptimal solution since I don't want to be stuck on 1.8.7 for the rest of time... Users-MacBook-Pro:sample_app user$ ls Gemfile app doc script Gemfile.lock config lib spec README.md config.ru log tmp Rakefile db public vendor Ravins-MacBook-Pro:sample_app user$ bundle Fetching gem metadata from https://rubygems.org/....... /Users

How do I do a regex search in Nokogiri for text that matches a certain beginning?

此生再无相见时 submitted on 2019-12-02 22:06:49
Given: require 'rubygems' require 'nokogiri' value = Nokogiri::HTML.parse(<<-HTML_END) "<html> <body> <p id='para-1'>A</p> <div class='block' id='X1'> <h1>Foo</h1> <p id='para-2'>B</p> </div> <p id='para-3'>C</p> <h2>Bar</h2> <p id='para-4'>D</p> <p id='para-5'>E</p> <div class='block' id='X2'> <p id='para-6'>F</p> </div> </body> </html>" HTML_END I want to do something like what I can do in Hpricot: divs = value.search('//div[@id^="para-"]') How do I do a pattern search for elements in XPath style? Where would I find the documentation to help me? I didn't see this in the rdocs. Aaron

Rails Tutorial: nokogiri-1.5.2 error on bundle install

不打扰是莪最后的温柔 submitted on 2019-12-02 21:57:11
After working through the RVM setup, rspec and guard sections of chapter 3 of the Ruby on Rails Tutorial , whenever I run bundle install I get the following error dump: Gem::Installer::ExtensionBuildError: ERROR: Failed to build gem native extension. /usr/bin/ruby1.9.1 extconf.rb /usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require': cannot load such file -- mkmf (LoadError) from /usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require' from extconf.rb:5:in `<main>' Gem files will remain installed in /home/dan/.bundler/tmp/17577/gems/nokogiri-1.5.2 for inspection. Results logged