nokogiri | 易学教程

How to parse only part of a string-value from an element using Nokogiri? RUBY, Mechanize

Question: How do I extract a number from a string? The XPath is 'td[5]/p/@title' and the HTML is:

    <td valign="top" align="center">
      <p title="6 en su sucursal" style="margin-top: 0px; margin-bottom:0px; cursor:hand">
        <b>10</b>
      </p>
    </td>

From the title attribute's string value "6 en su sucursal" I need to extract only the number 6.

Answer 1: Given some HTML in html, you'd do something like this:

    doc = Nokogiri::HTML(html)
    numbers = doc.xpath('//p[@title]').collect { |p| p[:title].gsub(/[^\d]/, '') }

Then you'll have the …

Is there a way to select all the contents of a node?

Question: Is there a way to select all the contents of a node in Nokogiri?

    <root>
      <element>this is <hi>the content</hi> of my æøå element</element>
    </root>

The result of getting the content of /root/element should be:

    this is <hi>the content</hi> of my æøå element

Edit: It seems the solution is simply to use myElement.inner_html(). The problem I had was in fact that I was relying on an old version of libxml2, which escaped all the special characters.

Answer 1:

    Nokogiri.parse('<root><element>this is …

Ruby: How do I get attribute values from XML with Nokogiri?

Question: How do I get the value of the message attribute ("ready to use")?

    <?xml version="1.0" encoding="UTF-8"?>
    <response status="ok" permission_level="admin" message="ready to use" cached="0">
      <title>kit</title>
    </response>

Thanks

Answer 1:

    require 'rubygems'
    require 'nokogiri'

    string = %Q{
    <?xml version="1.0" encoding="UTF-8"?>
    <response status="ok" permission_level="admin" message="ready to use" cached="0">
      <title>kit</title>
    </response>
    }

    doc = Nokogiri::XML(string)
    doc.css("response").each do |response …

How can I parse a URL using a proxy with Rails?

Question: My app has the following controller action:

    def test
      # get URL
      url = "http://www.coteur.com/surebet.php"
      doc = Nokogiri::HTML(open(url))
      @show = doc.at_css("title").text
      @game_data = Array.new
      doc.css('tbody').each do |tr|
        tr.css("tr").each do |f|
          @game_data.push(f.css("td").text)
        end
      end
    end

And renders the following view:

    <%= @show %>
    <div class="bs-example" data-example-id="hoverable-table">
      <table class="table table-hover">
        <tbody>
          <% if @game_data.empty? %>
            <tr> <td>Nope</td> </tr>
          <% else %> …

Get the values of attributes with namespace, using Nokogiri

Question: I'm parsing a document.xml file extracted from a .docx file using Nokogiri, and I need to get the values of attributes with names like "w:val". This is a sample of the source XML:

    <w:document>
      <w:body>
        <w:p w:rsidR="004D5F21" w:rsidRPr="00820E0B" w:rsidRDefault="00301D39" pcut:cut="true">
          <w:pPr>
            <w:jc w:val="center"/>
          </w:pPr>
      </w:body>
    </w:document>

This is a sample of the code:

    require 'nokogiri'
    doc = Nokogiri::XML(File.open(path))
    doc.search('//w:jc').each do |n|
      puts n['//w:val']
    end

There …

Find tag with id including [] with Nokogiri

Question: I have an HTML element like:

    <div id="spam[500]">

I want to search for this element by id, but it seems that Nokogiri is getting confused by the []. I'm trying:

    doc.css("#spam[#{eggs.id}]")

but to no avail.

Answer 1: Chris, try this and let me know if it works:

    doc = Nokogiri::HTML(page)
    el = doc.xpath("//div[@id='spam[500]']").first

The problem is that you can't access it via CSS (even in the browser). Try setting some CSS attributes for "spam[500]" and they won't be applied. You can access via …

Changing href attributes with nokogiri and ruby on rails

Question: I have an HTML document with links, for example:

    <html>
    <body>
    <ul>
      <li><a href="http://someurl.com/etc/etc">teste1</a></li>
      <li><a href="http://someurl.com/etc/etc">teste2</a></li>
      <li><a href="http://someurl.com/etc/etc">teste3</a></li>
    </ul>
    </body>
    </html>

Using Ruby on Rails, with Nokogiri or some other method, I want to end up with a document like this:

    <html>
    <body>
    <ul>
      <li><a href="http://myproxy.com/?url=http://someurl.com/etc/etc">teste1</a></li>
      <li><a href="http://myproxy.com/?url …

How to make Nokogiri not convert &nbsp; to a space

Question: I fetch an HTML fragment like "<li>市场价&nbsp;" which contains "&nbsp;", but after calling to_s on the Nokogiri NodeSet it becomes "<li>市场价 ". I want to keep the original HTML fragment, and I tried setting the :save_with option for the to_s method, but that failed. Has anyone encountered the same problem and can help? Thank you in advance.

Answer 1: I encountered a similar situation, and what I came up with was a bit of a hack, but it seems to work well.

    nbsp = Nokogiri::HTML("&nbsp;").text
    text.gsub(nbsp, " ")

In my case, I …