I have a large XML file (about 10K rows) I need to parse regularly that is in this format:
10000
You can dramatically decrease your time to execute by changing your code to the following. Just change the "99" to whatever category you want to check.:
require 'rubygems'
require 'nokogiri'
require 'open-uri'
icount = 0
xmlfeed = Nokogiri::XML(open("test.xml"))
items = xmlfeed.xpath("//item")
items.each do |item|
text = item.children.children.first.text
if ( text =~ /99/ )
icount += 1
end
end
othercount = xmlfeed.xpath("//totalcount").inner_text.to_i - icount
puts icount
puts othercount
This took about three seconds on my machine. I think a key error you made was that you chose the "items" iterate over instead of creating a collection of the "item" nodes. That made your iteration code awkward and slow.