Parsing Large XML files w/ Ruby & Nokogiri

眼角桃花 2021-01-02 07:30

I have a large XML file (about 10K rows), in the following format, that I need to parse regularly:


    10000         


        
5 Answers
  •  醉话见心
    2021-01-02 08:09

    You can dramatically reduce the execution time by changing your code to the following. Just change the "99" to whatever category you want to check:

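    require 'nokogiri'

    icount = 0
    # Parse the whole file; File.open is enough for a local path (open-uri is not needed).
    xmlfeed = File.open("test.xml") { |f| Nokogiri::XML(f) }
    # Collect every <item> node once, then iterate over that collection.
    items = xmlfeed.xpath("//item")
    items.each do |item|
      # Text of the first grandchild node; this assumes the category comes first inside each item.
      text = item.children.children.first.text
      # Regex match, so /99/ also matches values that merely contain "99" (e.g. "199").
      icount += 1 if text =~ /99/
    end

    # Everything that is not in the chosen category.
    othercount = xmlfeed.xpath("//totalcount").inner_text.to_i - icount

    puts icount
    puts othercount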
    

    This took about three seconds on my machine. I think the key mistake was that you iterated over the "items" node instead of creating a collection of the "item" nodes. That made your iteration code awkward and slow.
