nokogiri

Using Nokogiri to scrape a value from Yahoo Finance?

给你一囗甜甜゛ submitted on 2019-12-13 20:57:43
Question: I wrote a simple script:

    require 'rubygems'
    require 'nokogiri'
    require 'open-uri'

    url = "http://au.finance.yahoo.com/q/bs?s=MYGN"
    doc = Nokogiri::HTML(open(url))
    name = doc.at_css("#yfi_rt_quote_summary h2").text
    market_cap = doc.at_css("#yfs_j10_mygn").text
    ebit = doc.at("//*[@id='yfncsumtab']/tbody/tr[2]/td/table[2]/tbody/tr/td/table/tbody/tr[11]/td[2]/strong").text
    puts "#{name} - #{market_cap} - #{ebit}"

The script grabs three values from Yahoo Finance. The problem is that the ebit XPath

Parsing javascript function elements with nokogiri

孤街浪徒 submitted on 2019-12-13 19:25:38
Question: I'm trying to parse out values within a script tag with mechanize + nokogiri. This is as far as I'm able to get:

    1.9.3-p125 :107 > agent.page.search("script")[7]
     => #<Nokogiri::XML::Element:0x3ff1e10f3ff8 name="script" attributes= [#<Nokogiri::XML::Attr:0x3ff1e10f3f80 name="type" value="text/javascript">] children= [#<Nokogiri::XML::CDATA:0x3ff1e10f39b8 "countdownFactory.create('47884', '1333724400000', '');countdownFactory.create('48436', '1333638000000', '');countdownFactory.create('46085',

Should Nokogiri::XML.parse be creating separate Text nodes for linefeeds?

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-13 18:48:57
Question: I have an XML document created by an outside tool:

    <?xml version="1.0" encoding="UTF-8"?>
    <suite>
      <id>S1</id>
      <name>First Suite</name>
      <description></description>
      <sections>
        <section>
          <name>section 1</name>
          <cases>
            <case>
              <id>C1</id>
              <title>Test 1.1</title>
              <type>Other</type>
              <priority>4 - Must Test</priority>
              <estimate></estimate>
              <milestone></milestone>
              <references></references>
            </case>
            <case>
              <id>C2</id>
              <title>Test 1.2</title>
              <type>Other</type>
              <priority>4 - Must Test</priority>

How can I get the first element's text using Nokogiri?

ぐ巨炮叔叔 submitted on 2019-12-13 17:17:07
Question: I am trying to get the text for Last sold date from this HTML:

    <td class="browse-cell-date">
      <span title="Last sold date"> May 2002 </span>
      <button class="btn btn-previous-sales js-btn-previous-sales">
        Previous sales (1) <i class="icon icon-down-open-1"/>
      </button>
      <div class="previous-sales-panel is-hidden">
        <span style="display: block;"> Aug 1997 <span class="fright">£60,000</span> </span>
      </div>
    </td>

I tried:

    date = val.search(".//td[@class='browse-cell-date']").children[1]

It gave me the

How to retrieve the nokogiri processing instruction attributes?

六眼飞鱼酱① submitted on 2019-12-13 15:59:35
Question: I am parsing XML using Nokogiri. I am able to retrieve the stylesheets, but not the attributes of each stylesheet.

    1.9.2p320 :112 > style = xml.xpath('//processing-instruction("xml-stylesheet")').first
     => #<Nokogiri::XML::ProcessingInstruction:0x5459b2e name="xml-stylesheet">
    style.name
     => "xml-stylesheet"
    style.content
     => "type=\"text/xsl\" href=\"CDA.xsl\""

Is there an easy way to get the values of the type and href attributes, or is the only way to parse the content (style.content) of the processing

What's the cleanest way to ignore empty nodes with Nokogiri::XML::Builder

夙愿已清 submitted on 2019-12-13 15:33:08
Question: So let's say I have a builder template like the following:

    builder = Nokogiri::XML::Builder.new(:encoding => 'UTF-8') do |xml|
      xml.environment do |environment|
        environment.title title
        environment.feed feed
        environment.status status
        environment.description description
        # many many more
      end
    end
    builder.to_xml

If feed and description were nil, it could output:

    <?xml version="1.0" encoding="UTF-8"?>
    <environment>
      <title>title</title>
      <feed/>
      <status>status</status>
      <description/>
    </environment>

I

LoadError: incompatible library version - /home/ubuntu/.rvm/gems/ruby-2.3.1@lm5/gems/nokogiri-1.8.2/lib/nokogiri/nokogiri.so

北城以北 submitted on 2019-12-13 14:22:07
Question: I am trying to precompile assets in production mode as follows:

    rake assets:precompile

It works fine on Ubuntu 14.04 (32-bit) and 16.06 (32-bit), but I get a LoadError on 16.04 (64-bit) on AWS EC2. Please help me with this. Thanks in advance. Here's my full stack trace:

    rake aborted!
    LoadError: incompatible library version - /home/ubuntu/.rvm/gems/ruby-2.3.1@lm5/gems/nokogiri-1.8.2/lib/nokogiri/nokogiri.so
    /home/ubuntu/.rvm/gems/ruby-2.3.1@lm5/gems/activesupport-5.0.1/lib

How do I parse this data structure returned by Nokogiri in Ruby?

旧街凉风 submitted on 2019-12-13 08:37:01
Question: I am cycling through an array element and this is the result returned:

    [nil, [#<Nokogiri::XML::Element:0x835386d4 name="a" attributes=[#<Nokogiri::XML::Attr:0x835385f8 name="href" value="http://bham.craigslist.org/web/2961573018.html">] children=[#<Nokogiri::XML::Text:0x835381c0 "Web Designer Full time">]>

What I would like to do is access the href value, and then the text value. How do I do that? I tried this:

    puts i[:href]

But that generates this error:

    TypeError: Symbol as array index

By

How do I select either a th or a td from a table row?

五迷三道 submitted on 2019-12-13 07:58:54
Question: I'm using Nokogiri with Rails 5. How do I select either a "th" element or a "td" element from a table row? My goal is to get the text of all cells in a row (if there is a more generic, elegant solution, I'm all in). Here's what I have:

    text_all_rows = all_rows.map do |row|
      row_values = row.css('td | th').map{|str| str.text }
        .map{|str| str.gsub(/[[:space:]]+/, ' ').gsub(/\A\p{Space}+|\p{Space}+\z/, '') }.join("\t")
      [*row_values]
    end

As you may have noticed "td | th" is not valid syntax for

Cucumber Failing with Nokogiri

好久不见. submitted on 2019-12-13 06:44:06
Question: I just started using Cucumber, and in the simplest of scenarios I get the following error:

    undefined method `has_key?' for #<Nokogiri::XML::Element:0x10677a400> (NoMethodError)
    ./features/step_definitions/web_steps.rb:36:in `/^(?:|I )fill in "([^"]*)" with "([^"]*)"$/'
    features/authentication.feature:9:in `When I fill in "user_name" with "Joe User"'

The scenario is as follows:

    Scenario: Signup
      Given I go to the signup page
      When I fill in "user_name" with "Joe User"

Is this a problem in the