Like the DOMDocument class in PHP, is there a class in Ruby (i.e. in core Ruby) to parse an HTML document and get the values of its node elements?
Ruby Cheerio is a jQuery-style HTML parser for Ruby, a much simplified version of Nokogiri aimed at crawlers. It is the Ruby version of the popular Node.js package cheerio.
Follow the link for a simple crawler example.
gem install ruby-cheerio
require 'ruby-cheerio'
jQuery = RubyCheerio.new("<html><body><h1 class='one'>h1_1</h1><h1>h1_2</h1></body></html>")
jQuery.find('h1').each do |head_one|
  p head_one.text
end
# getting attribute values like jQuery.
p jQuery.find('h1.one')[0].prop('h1','class')
# function chaining similar to jQuery.
p jQuery.find('body').find('h1').first.text
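If an extra gem is not an option, REXML, which ships with standard Ruby installations (as a bundled gem since Ruby 3.0), can do something similar, but only for well-formed markup. It is an XML parser, so typical real-world HTML with unclosed tags will raise a parse error. The following is only a rough sketch under that assumption; the sample markup and variable names are made up for illustration.

```ruby
require 'rexml/document'

# Hypothetical sample markup. It must be well-formed XML for REXML to accept it.
html = "<html><body><h1 class='one'>h1_1</h1><h1>h1_2</h1></body></html>"
doc = REXML::Document.new(html)

# Collect the text of every h1 element via XPath.
headings = REXML::XPath.match(doc, '//h1').map(&:text)

# Read an attribute value from the first h1 whose class is "one".
first_class = REXML::XPath.first(doc, "//h1[@class='one']").attributes['class']

p headings
p first_class
```

For HTML that is not well-formed, a tolerant parser such as Nokogiri (which Ruby Cheerio simplifies) is the safer choice.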