nokogiri

How do I remove a node with Nokogiri?

不想你离开。 posted on 2019-11-27 12:54:34
Question: How can I remove <img> tags using Nokogiri? I have the following code, but it won't work:
    # str = '<img src="canadascapital.gc.ca/data/2/rec_imgs/5005_Pepsi_H1NB.gif"/…; testt<a href="#">test</a>tfbu'
    f = Nokogiri::XML.fragment(str)
    f.search('//img').each do |node|
      node.remove
    end
    puts f
Answer 1: Try this:
    f = Nokogiri::XML.fragment(str)
    f.search('.//img').remove
    puts f
Answer 2: I prefer CSS over XPath, as it's usually much more readable. Switching to CSS:
    require 'nokogiri'
    doc = Nokogiri::HTML('

How can I get the absolute URL when extracting links using Nokogiri?

杀马特。学长 韩版系。学妹 posted on 2019-11-27 11:35:41
I'm using Nokogiri to extract links from a page, but I would like to get the absolute path even when the link on the page is relative. How can I accomplish this?
Nokogiri is unrelated, other than the fact that it gives you the link anchor to begin with. Use Ruby's URI library to manage paths:
    absolute_uri = URI.join( page_url, href ).to_s
Seen in action:
    require 'uri'
    # The URL of the page with the links
    page_url = 'http://foo.com/zee/zaw/zoom.html'
    # A variety of links to test.
    hrefs = %w[ http://zork.com/ http://zork.com/#id http://zork.com/bar http://zork.com/bar#id http://zork.com
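Since the demonstration above is cut off, here is a short self-contained sketch of the same `URI.join` approach. The helper name `absolutize` and the test hrefs are made up for illustration; only `URI.join` and the page URL come from the answer.

```ruby
require 'uri'

# Hypothetical helper: resolve an href against the URL of the page
# it was found on. Absolute hrefs pass through unchanged.
def absolutize(page_url, href)
  URI.join(page_url, href).to_s
end

page_url = 'http://foo.com/zee/zaw/zoom.html'

# Made-up hrefs covering absolute, relative, root-relative,
# and fragment-only links.
%w[http://zork.com/bar foo.html ../bar /top #id].each do |href|
  puts "#{href} -> #{absolutize(page_url, href)}"
end
```

In a scraper, `href` would typically come from `link['href']` on each anchor node that Nokogiri returns.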

Nokogiri: How to select nodes by matching text?

帅比萌擦擦* posted on 2019-11-27 11:06:55
If I have a bunch of elements like:
    <p>A paragraph <ul><li>Item 1</li><li>Apple</li><li>Orange</li></ul></p>
Is there a built-in Nokogiri method that would get me all, for example, p elements that contain the text "Apple"? (The example element above would match, for instance.)
Nokogiri can do this (now) using jQuery extensions to CSS:
    require 'nokogiri'
    html = '
    <html>
      <body>
        <p>foo</p>
        <p>bar</p>
      </body>
    </html>
    '
    doc = Nokogiri::HTML(html)
    doc.at('p:contains("bar")').text.strip
    => "bar"
Here is an XPath that works:
    require 'nokogiri'
    doc = Nokogiri::HTML(DATA)
    p doc.xpath('//li[contains(text

XPath axis, get all following nodes until

落花浮王杯 posted on 2019-11-27 08:05:59
I have the following example of HTML:
    <!-- lots of html -->
    <h2>Foo bar</h2>
    <p>lorem</p>
    <p>ipsum</p>
    <p>etc</p>
    <h2>Bar baz</h2>
    <p>dum dum dum</p>
    <p>poopfiddles</p>
    <!-- lots more html ... -->
I'm looking to extract all paragraphs following the 'Foo bar' header, until I reach the 'Bar baz' header (the text of the 'Bar baz' header is unknown, so unfortunately I can't use the answer provided by bougyman). Now I can of course use something like
    //h2[text()='Foo bar']/following::p
but that of course will grab all paragraphs following this header. So I have the option to traverse the nodeset

Undefined namespace prefix in Nokogiri and XPath

|▌冷眼眸甩不掉的悲伤 posted on 2019-11-27 07:46:56
Question: I am trying to parse YouTube GData to see if a video with a given id exists. But the tag I need isn't a plain tag; it has a namespace. At the link http://gdata.youtube.com/feeds/api/videos?q=KgfdlZuVz7I there is this tag:
    <openSearch:totalResults>1</openSearch:totalResults>
It uses the openSearch namespace:
    xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/'
but I don't know how to deal with it in Nokogiri and Ruby. Here is part of the code:
    xmlfeed = Nokogiri::HTML(open("http://gdata.youtube.com/feeds/api/videos

finding common ancestor from a group of xpath?

心不动则不痛 posted on 2019-11-27 07:26:39
Question: Say I have:
    html/body/span/div/p/h1/i/font
    html/body/span/div/div/div/div/table/tr/p/h1
    html/body/span/p/h1/b
    html/body/span/div
How can I get the common ancestor? In this case, the common ancestor of "font, h1, b, div" would be "span".
Answer 1: To find common ancestry between two nodes:
    (node1.ancestors & node2.ancestors).first
A more generalized function that works with multiple nodes:
    # accepts node objects or selector strings
    class Nokogiri::XML::Element
      def common_ancestor(*nodes)
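When only the path strings from the question are available (rather than parsed nodes), the same idea can be sketched as a longest-common-prefix comparison. This is a different technique from the answer's `ancestors` intersection, and it assumes that equal segment names at the same depth refer to the same element, which path strings without positions cannot guarantee. The method name `common_ancestor_path` is made up.

```ruby
# Hypothetical helper: given slash-separated element paths, return the
# path segments of the deepest ancestor shared by all of them.
def common_ancestor_path(paths)
  return [] if paths.empty?

  # Drop the last segment of each path: it is the node itself,
  # not one of its ancestors.
  ancestor_lists = paths.map { |path| path.split('/')[0...-1] }
  shortest = ancestor_lists.map(&:length).min

  common = []
  shortest.times do |i|
    segment = ancestor_lists.first[i]
    break unless ancestor_lists.all? { |list| list[i] == segment }
    common << segment
  end
  common
end

paths = %w[
  html/body/span/div/p/h1/i/font
  html/body/span/div/div/div/div/table/tr/p/h1
  html/body/span/p/h1/b
  html/body/span/div
]

puts common_ancestor_path(paths).join('/')
```

With real Nokogiri nodes, the `ancestors` intersection from Answer 1 is more reliable because it compares node identity rather than names.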

How to make Nokogiri transparently return un/encoded Html entities untouched?

▼魔方 西西 posted on 2019-11-27 06:20:00
Question: How can I use Nokogiri while leaving HTML entities (like German umlauts) untouched? I.e.:
    # this is fine
    node = Nokogiri::HTML.fragment('<p>ö</p>')
    node.to_s # => '<p>ö</p>'
    # this is not
    node = Nokogiri::HTML.fragment('<p>&ouml;</p>')
    node.to_s # => '<p>ö</p>'
    # this is what I need
    node = Nokogiri::HTML.fragment('<p>&ouml;</p>')
    node.to_s # => '<p>&ouml;</p>'
I've tried to mess with both PARSE_OPTIONS and :save_with options but could not come up with a way to have Nokogiri just transparently behave like

Error installing Nokogiri on bundle install but already installed

我的未来我决定 posted on 2019-11-27 05:17:21
Question: I'm having issues with bundling my Gemfile. I have Nokogiri installed already, yet when I run bundle install it fails to load Nokogiri.
Installing Nokogiri:
    gem install nokogiri
    Building native extensions. This could take a while...
    Successfully installed nokogiri-1.6.6.2
    Parsing documentation for nokogiri-1.6.6.2
    Done installing documentation for nokogiri after 2 seconds
    1 gem installed
Bundle install:
    bundle install sic@ANTHONYs-iMac
    Fetching gem metadata from https://rubygems.org/.........

Error while installing Nokogiri (1.6.7) on El Capitan

北慕城南 posted on 2019-11-27 05:13:37
Question: One of my developers has updated Nokogiri, and when pulling the updated Gemfile my bundle install fails.
    ➜ my-project git:(master) bundle install
    Fetching source index from https://rubygems.org/
    Using rake 10.4.2
    Using i18n 0.7.0
    Using json 1.8.3
    Using minitest 5.8.3
    Using thread_safe 0.3.5
    Using tzinfo 1.2.2
    Using activesupport 4.2.3
    Using builder 3.2.2
    Using erubis 2.7.0
    Using mini_portile2 2.0.0
    Gem::Ext::BuildError: ERROR: Failed to build gem native extension.
    /Users/me/.rvm/rubies/ruby

Nokogiri was built against LibXML version 2.7.7, but has dynamically loaded 2.7.3

丶灬走出姿态 posted on 2019-11-27 04:42:11
Question: In Rails 3, I've noticed that every time I invoke the framework, whether from rake, rails server, or anything else, I get the following warning:
    Nokogiri was built against LibXML version 2.7.7, but has dynamically loaded 2.7.3
Searching on Google yields a few blog posts, all of which suggest rebuilding Nokogiri using explicit lib and include paths. For example: http://mrflip.github.com/2009-08/nokogiri-hates-libxml2-on-osx.html
But that didn't solve the problem for me. Typing nokogiri -v