Parsing Large XML with Nokogiri

忘了有多久 2021-01-07 10:10

So I'm attempting to parse a 400k+ line XML file using Nokogiri.

The XML file has this basic format:

4 Answers
  •  感动是毒
    2021-01-07 10:59

    You're likely running out of memory because symptomsList grows too large. Why not run the SQL inside the xpath loop, so each record is written as soon as it is read?

    require 'nokogiri'

    # Parse the file; the block form of File.open closes the handle automatically.
    @doc = File.open("Temp.xml") { |f| Nokogiri::XML(f) }

    # Write each symptom as soon as it is read instead of collecting them all first.
    @doc.xpath("//DisorderSign").each do |x|
      sign = x.at('ClinicalSign')
      sign_id = sign['id']
      name = sign.element_children.text
      Symptom.where(:name => name, :signid => sign_id.to_i).first_or_create
    end
    

    It's also possible that the file is simply too large to load into memory in one pass. In that case you could split it into smaller temp files and process them individually.
