nokogiri

Parse XML nodes to CSV with Ruby/Nokogiri

吃可爱长大的小学妹 submitted on 2019-12-03 22:08:26
Question: Ruby parsing newbie here. I've got an XML file that looks like this: <?xml version="1.0" encoding="iso-8859-1"?> <Offers xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://ssc.channeladvisor.com/files/cageneric.xsd"> <Offer> <Model><![CDATA[11016001]]></Model> <Manufacturer><![CDATA[Crocs, Inc.]]></Manufacturer> <ManufacturerModel><![CDATA[11016-001]]></ManufacturerModel> ...lots more nodes <Custom6><![CDATA[<li>Bold midsole stripe for a sporty look.</li>

Using XPath on single node returns elements in all nodes

时光总嘲笑我的痴心妄想 submitted on 2019-12-03 22:02:25
I am parsing an XML doc that looks something like this: <MyBook> <title>Favorite Poems</title> <issn>123-456</issn> <pages>45</pages> </MyBook> <MyBook> <title>Chocolate Desserts</title> <issn>654-098</issn> <pages>100</pages> </MyBook> <MyBook> <title>Jabberwocky</title> <issn>454-545</issn> <pages>19</pages> </MyBook> I use xpath to pull out the MyBook nodes and iterate through them like so: xmldoc.xpath("//MyBook").each do |node| mytitle=node.xpath("//title").text puts mytitle end the output looks like this: Favorite PoemsChocolateDessertsJabberwocky Favorite

Nokogiri in Ruby 2.0

这一生的挚爱 submitted on 2019-12-03 18:13:16
When I require 'nokogiri' in Ruby 2.0, I get an error: `require': cannot load such file -- nokogiri/2.0/nokogiri (LoadError) Does Nokogiri not support Ruby 2.0 yet? I can see nokogiri in gem list. Mike Dalessio: Ruby 2.0 support is not yet available for Windows. Another answer: Yes, it works fine: RUBY_VERSION # => "2.0.0" require 'nokogiri' doc = Nokogiri::HTML('<html><body><p>foo</p></body></html>') doc.at('p').text # => "foo" Another answer: Nokogiri now supports Ruby 2.0, even on Windows. Source: https://stackoverflow.com/questions/15332416/nokogiri-in-ruby-2-0

Nokogiri recursively get all children

北城以北 submitted on 2019-12-03 16:59:35
Question: The Problem: I am running some statistics against various URLs. I want to find the top-level element with the most concentrated number of children. The method that I would like to follow is to identify all top-level elements and then determine what percentage of all the elements on the page belong to each. Goal: Recursively get all children of a given element. Inputs: a Nokogiri Element. Outputs: an array of Nokogiri Elements OR the count of total number of children. Setup: Ruby 1.9.2, Nokogiri gem

How can I get Nokogiri to parse and return an XML document?

ⅰ亾dé卋堺 submitted on 2019-12-03 15:47:19
Here's a sample of some oddness: #!/usr/bin/ruby require 'rubygems' require 'open-uri' require 'nokogiri' print "without read: ", Nokogiri(open('http://weblog.rubyonrails.org/')).class, "\n" print "with read: ", Nokogiri(open('http://weblog.rubyonrails.org/').read).class, "\n" Running this returns: without read: Nokogiri::XML::Document with read: Nokogiri::HTML::Document Without the read returns XML, and with it is HTML? The web page is defined as "XHTML transitional", so at first I thought Nokogiri must have been reading OpenURI's "content-type" from the stream, but that returns 'text/html' :

Can't install Nokogiri for Ruby in Windows

孤人 submitted on 2019-12-03 15:20:36
I know this is simple but I just can't figure it out. I need to run a script in Ruby and it requires Nokogiri. I do have some experience in other languages but not in Ruby. Here is my system : Ruby 2.0.0-p195 (x64) is installed @ C:\Programs\RubyLanguage Ruby Development Kit (mingw64-64-4.7.2-20130224-1432) is installed @ C:\Programs\RubyDevKit When I run gem install nokogiri I get this error: ERROR: Error installing nokogiri: The 'nokogiri' native gem requires installed build tools. Please update your PATH to include build tools or download the DevKit from 'http://rubyinstaller.org/downloads'

XPath to select between two HTML comments?

Deadly submitted on 2019-12-03 14:31:44
Question: I have a big HTML page, but I want to select only certain nodes using XPath: <html> ........ <!-- begin content --> <div>some text</div> <div><p>Some more elements</p></div> <!-- end content --> ....... </html> I can select the HTML after the <!-- begin content --> using: "//comment()[. = ' begin content ']/following::*" I can also select the HTML before the <!-- end content --> using: "//comment()[. = ' end content ']/preceding::*" But how do I write an XPath to select all the HTML between the two

Modifying text inside html nodes - nokogiri

最后都变了- submitted on 2019-12-03 13:29:15
Let's say I have the following HTML: <ul><li>Bullet 1.</li> <li>Bullet 2.</li> <li>Bullet 3.</li> <li>Bullet 4.</li> <li>Bullet 5.</li></ul> What I want to do is replace every period, question mark, or exclamation mark inside an HTML node with itself plus a trailing asterisk, then convert the result back to HTML. So the result would be: <ul><li>Bullet 1.*</li> <li>Bullet 2.*</li> <li>Bullet 3.*</li> <li>Bullet 4.*</li> <li>Bullet 5.*</li></ul> I've been messing around with this a bit in IRB, but can't quite figure it out. Here's the code I have: html = "<ul><li>Bullet 1.</li> <li>Bullet

Parsing Large XML files w/ Ruby & Nokogiri

北城余情 submitted on 2019-12-03 12:59:06
I have a large XML file (about 10K rows) that I need to parse regularly. It is in this format: <summarysection> <totalcount>10000</totalcount> </summarysection> <items> <item> <cat>Category</cat> <name>Name 1</name> <value>Val 1</value> </item> ...... 10,000 more times </items> What I'd like to do is parse each of the individual nodes using Nokogiri to count the number of items in one category. Then I'd like to subtract that number from the totalcount to get an output that reads "Count of Interest_Category: n, Count of All Else: z". This is my code now: #!/usr/bin/ruby require 'rubygems' require

set tag attribute and add plain text content to the tag using nokogiri builder (ruby)

寵の児 submitted on 2019-12-03 11:42:56
Question: I am trying to build XML using Nokogiri with some tags that have both attributes and plain text inside the tag. So I am trying to get to this: <?xml version="1.0"?> <Transaction requestName="OrderRequest"> <Option b="hive">hello</Option> </Transaction> Using builder I have this: builder = Nokogiri::XML::Builder.new { |xml| xml.Transaction("requestName" => "OrderRequest") do xml.Option("b" => "hive").text("hello") end } which renders to: <Transaction requestName="OrderRequest"> <Option b="hive