So I'm attempting to parse a 400k+ line XML file using Nokogiri.
The XML file has this basic format:
You can also use Nokogiri::XML::Reader. It is more memory-intensive than the Nokogiri::XML::SAX parser, but it lets you keep the XML structure, e.g.:
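require 'nokogiri'

class NodeHandler < Struct.new(:node)
  def process
    # Node processing logic goes here, e.g. read the sign's id and name.
    sign = node.at('ClinicalSign')
    sign_id = sign.attribute('id').text
    name = sign.element_children.text
  end
end

# Stream the file node by node instead of loading it all at once.
Nokogiri::XML::Reader(File.open('./test/fixtures/example.xml')).each do |node|
  # Only handle opening <DisorderSign> elements, not their closing tags.
  if node.name == 'DisorderSign' && node.node_type == Nokogiri::XML::Reader::TYPE_ELEMENT
    # Re-parse just this element so the handler gets a regular DOM node.
    NodeHandler.new(
      Nokogiri::XML(node.outer_xml).at('./DisorderSign')
    ).process
  end
end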
Based on this blog