nokogiri

Nokogiri: need to turn markup partitioned by `hr` into divs

戏子无情 posted on 2019-12-11 05:20:08
Question: Given markup inside an HTML document that looks like this:
<h3>test</h3> <p>test</p> <hr/> <h3>test2</h3> <p>test2</p> <hr/>
I'd like to produce this:
<div> <h3>test</h3> <p>test</p> </div> <div> <h3>test2</h3> <p>test2</p> </div>
What's the most elegant way to do this with Nokogiri?
Answer 1: Edit: Reworked answer to be a bit cleaner. Edit 2: Small rewrite to shorten by two lines.
require 'nokogiri'
doc = Nokogiri::HTML <<ENDHTML
<h3>test</h3> <p>test</p> <hr/> <h3>test2</h3> <p>test2</p> <hr/>

Rails/Paperclip/S3 mystery errors: undefined method “global_endpoint?”

让人想犯罪 __ posted on 2019-12-11 05:05:54
Question: So I upgraded a Rails app from 3.0 to 4.0 last week, and ever since I've been getting strange errors that seem to point to random places that I haven't changed, and I can't reproduce them. One such error looks like this:
NoMethodError: undefined method `global_endpoint?' for AWS::S3:Class
[GEM_ROOT]/gems/aws-sdk-1.46.0/lib/aws/core/configuration.rb:441
/gems/aws-sdk-1.46.0/lib/aws/core/configuration.rb:441 in "block in add_service"
/gems/aws-sdk-1.46.0/lib/aws/core/configuration.rb:361 in "call"

Find comment or text nodes in a document fragment

别说谁变了你拦得住时间么 posted on 2019-12-11 04:16:44
Question: I have to clean up a Nokogiri::HTML::DocumentFragment document (remove comment nodes and text nodes that contain only whitespace). Here's an example:
html = "<p>paragraph</p><!-- comment --><p>paragraph</p> <p>paragraph</p>"
doc = Nokogiri::HTML::DocumentFragment.parse html
The document fragment looks as you'd expect:
#(DocumentFragment:0x3fc65f9f5870 { name = "#document-fragment", children = [ #(Element:0x3fc65f9f5064 { name = "p", children = [ #(Text "paragraph")] }), #(Comment " comment "

Nokogiri grab text with formatting and link tags, <em>,<strong>, <a>, etc

做~自己de王妃 posted on 2019-12-11 03:03:50
Question: How can I recursively capture all the text with formatting tags using Nokogiri?
<div id="1"> This is text in the TD with <strong> strong </strong> tags <p>This is a child node. with <b> bold </b> tags</p> <div id=2> "another line of text to a <a href="link.html"> link </a>" <p> This is text inside a div <em>inside<em> another div inside a paragraph tag</p> </div> </div>
For example, I would like to capture:
"This is text in the TD with <strong> strong </strong> tags"
"This is a child node.

Traverse elements that aren't children between two elements with Nokogiri [duplicate]

感情迁移 posted on 2019-12-11 02:59:09
Question: This question already has answers here: Nokogiri and Xpath: find all text between two tags (2 answers). Closed 5 years ago.
Using Nokogiri, I'm trying to figure out the best way to select div elements that match a CSS class between two other div elements. Here's some sample HTML of what I'm working with:
<div class="date"> <span>Today</span> </div> <div class="random"></div> <div class="preferred"></div> <div class="preferred"></div> <div class="preferred"></div> <div class="random"></div>

Nokogiri producing different results on heroku?

南楼画角 posted on 2019-12-11 02:51:32
Question: I'm having a very strange problem and I'd appreciate help tracking it down. I'm using the nokogiri gem to parse some HTML, and the file I'm parsing has a weird character in it. I'm not entirely sure what this character is; in vim it shows as ^Q. On my own computer everything works fine, but on Heroku it inserts a </body></html><html> when it hits the character, and selectors only return the elements before the weird character. To illustrate:
Nokogiri::HTML( open("http://thoms.net.nz/e2

How to put a group of <p> inside a <div>

可紊 posted on 2019-12-11 02:35:52
Question: I'd like to figure out a way to get to the HTML result (mentioned further below) by using the following Ruby code and Nokogiri:
require 'rubygems'
require 'nokogiri'
value = Nokogiri::HTML.parse(<<-HTML_END)
"<html> <body> <p id='1'>A</p> <p id='2'>B</p> <h1>Bla</h1> <p id='3'>C</p> <p id='4'>D</p> <p id='5'>E</p> </body> </html>"
HTML_END
# The selected-array is given by the application.
# It consists of a sorted array with all ids of
# <p> that need to be enclosed by the <div>
selected = [

Preventing Nokogiri from escaping characters in URLs

邮差的信 posted on 2019-12-11 02:29:31
Question:
Nokogiri("<a href='*|UNSUB|*'>unsubscribe</a>").to_html
# returns "<a href="*%7CUNSUB%7C*">unsubscribe</a>"
How can I get Nokogiri to not escape the pipes?
Answer 1:
require 'nokogiri'
doc = Nokogiri("<a href='*|UNSUB|*'>unsubscribe</a>")
puts doc.to_html
#=> <a href="*%7CUNSUB%7C*">unsubscribe</a>
puts doc.to_xml
#=> <?xml version="1.0"?>
#=> <a href="*|UNSUB|*">unsubscribe</a>
Alternatively:
puts doc.to_html.gsub('%7C','|')
#=> <a href="*|UNSUB|*">unsubscribe</a>
Source: https://stackoverflow.com

How can I detect errors in an HTML document fragment with Ruby?

痴心易碎 posted on 2019-12-11 01:36:14
Question: I'm collecting some HTML-formatted content from a web form. Before saving this HTML content, I'd like to do a quick sanity check on it to make sure it looks well-formed (no unclosed tags, no invalid markup). Using Ruby and/or any popular gems, can I check an HTML fragment string like:
<p>foo</p><h1>Unclosed H1<p>bar</p>
and discover things like the unclosed h1 tag? I thought Nokogiri would come to my rescue here, but no:
>> Nokogiri::HTML::DocumentFragment.parse("<p>foo</p><h1>Unclosed

Unable to gem install nokogiri

最后都变了- posted on 2019-12-11 01:11:47
Question: When attempting to run gem install nokogiri, I get the following error:
ERROR: Error installing nokogiri: nokogiri requires Ruby version < 2.3, >= 1.9.2.
However, ruby -v reports:
ruby 2.3.0p0 (2015-12-25 revision 53290) [i386-mingw32]
I've attempted to install it locally with gem install --local nokogiri, and it runs through the install process; however, when I attempt to use the gem, it can't find the file:
C:/Ruby23/lib/ruby/site_ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require