Keeping attributes when converting XML to Ruby hash

落花浮王杯 submitted on 2019-12-10 20:04:13

Question


I have a large XML document I am looking to parse. In this document, many tags have different attributes within them. For example:

<album>
 <song-name type="published">Do Re Mi</song-name>
</album>

Currently, I am using Rails' hash-parsing library by requiring 'active_support/core_ext/hash'.

When I convert it to a hash, it drops the attributes. It returns:

{"album"=>{"song-name"=>"Do Re Mi"}}

How do I maintain those attributes, in this case, the type="published" attribute?

This seems to have been asked previously in "How can I use XML attributes when converting into a hash with from_xml?", which had no conclusive answer, but that was from 2010, and I'm curious whether things have changed since then. Alternatively, do you know of another way of parsing this XML that would keep the attribute information?


Answer 1:


Converting XML to a hash isn't a good solution. You're left with a hash that is more difficult to parse than the original XML. Plus, if the XML is too big, you'll be left with a hash that won't fit into memory, and can't be processed, whereas the original XML could be parsed using a SAX parser.
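
To illustrate the streaming approach mentioned above, here is a minimal sketch using REXML's stream parser, which ships with Ruby, rather than any particular SAX library. The SongListener class and the sample XML are assumptions made up for this example; the point is only that each element is handled as it is read, so the whole document never has to be held as one big structure.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: collects each <song-name> with its "type" attribute.
class SongListener
  include REXML::StreamListener

  attr_reader :songs

  def initialize
    @songs = []
    @current = nil
  end

  # Called for every opening tag; attrs is a hash of attribute names to values.
  def tag_start(name, attrs)
    @current = { type: attrs['type'], name: String.new } if name == 'song-name'
  end

  # Called for every text node; only keep text that sits inside <song-name>.
  def text(data)
    @current[:name] << data if @current
  end

  # Called for every closing tag; store the finished song.
  def tag_end(name)
    return unless name == 'song-name'
    @songs << @current
    @current = nil
  end
end

xml = <<~XML
  <albums>
    <album>
      <song-name type="published">Do Re Mi</song-name>
    </album>
    <album>
      <song-name type="unpublished">Blah blah blah</song-name>
    </album>
  </albums>
XML

listener = SongListener.new
REXML::Document.parse_stream(xml, listener)

listener.songs.each { |song| puts "#{song[:name]} (#{song[:type]})" }
# Do Re Mi (published)
# Blah blah blah (unpublished)
```

For a genuinely large file you would pass an open File object instead of a string, so the input itself is read incrementally.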

Assuming the file isn't going to overwhelm your memory when loaded, I'd recommend using Nokogiri to parse it, doing something like:

require 'nokogiri'

class Album

  attr_reader :song_name, :song_type
  def initialize(song_name, song_type)
    @song_name = song_name
    @song_type = song_type
  end
end

xml = <<EOT
<xml>
  <album>
   <song-name type="published">Do Re Mi</song-name>
  </album>
  <album>
    <song-name type="unpublished">Blah blah blah</song-name>
  </album>
</xml>
EOT

albums = []
doc = Nokogiri::XML(xml)
doc.search('album').each do |album|
  song_name = album.at('song-name')
  albums << Album.new(
      song_name.text,
      song_name['type']
    )
end

puts albums.first.song_name
puts albums.last.song_type

Which outputs:

Do Re Mi
unpublished

The code starts by defining a class to hold the data you want. Once the XML has been parsed into a DOM, the code loops through all the <album> nodes, extracts the information from each one, creates an instance of the class, and appends it to the albums array.

After running, you'd have an array you could walk to process each item, storing it in a database or manipulating it however you want. Though, if your goal is to insert that information into a database, you'd be smarter to let the DBM read the XML and import it directly.




Answer 2:


The problem is in Active Support's XMLConverter class. Add the following code to one of your initializer files.

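module ActiveSupport
  class XMLConverter
    private
      # Treat a node as plain content only for file uploads, or when
      # "__content__" is its only key, so that a node which also carries
      # attributes is no longer collapsed to its text.
      def become_content?(value)
        value['type'] == 'file' || (value['__content__'] && (value.keys.size == 1 && value['__content__'].present?))
      end
  end
end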

It will give you output like the following.

Example input XML:

xml = '<album>
   <song-name type="published">Do Re Mi</song-name>
</album>'

Hash.from_xml(xml)

The output will be:

{"album"=>{"song_name"=>{"type"=>"published", "__content__"=>"Do Re Mi"}}}
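
Given a hash shaped like the output above, the attribute and the text can be read back with ordinary hash access. The following is only a sketch: the song_fields helper is hypothetical, and the hash literal stands in for the result of Hash.from_xml so that the snippet runs without Rails. It also allows for an element that has no attributes, on the assumption that such an element still comes back as a plain string.

```ruby
# Stand-in for the result of Hash.from_xml(xml) shown above.
parsed = {
  "album" => {
    "song_name" => { "type" => "published", "__content__" => "Do Re Mi" }
  }
}

# Hypothetical helper: returns [text, type] whether the node is a hash
# (element had attributes) or a plain string (element had none).
def song_fields(node)
  if node.is_a?(Hash)
    [node["__content__"], node["type"]]
  else
    [node, nil]
  end
end

title, type = song_fields(parsed.dig("album", "song_name"))
puts "#{title} (#{type})"
# Do Re Mi (published)
```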



Answer 3:


I actually think the culprit is the garbage? method. It checks the type attribute, and if that attribute isn't a hash it returns true, which makes become_hash? return false. That is the last check in the process_hash method, so it returns nil for an element with a type attribute and never builds the hash for it.

For those interested, the code I'm talking about is in the Active Support gem, in active_support/core_ext/hash/conversions.rb.

module ActiveSupport
  class XMLConverter
    private
      def garbage?(value)
        false
      end
  end
end

I just defaulted it to false and it worked for me, but it might not work for everyone.




Answer 4:


As in the question you linked above, Nokogiri is the (short) answer.

If you can provide some sample code, someone might come up with better answers.
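
As one possible starting point for such sample code, the sketch below builds an attribute-preserving hash with REXML, which ships with Ruby, instead of Nokogiri; the same traversal should be expressible with Nokogiri, but that version is not shown here. The element_to_hash helper is hypothetical, and the key layout (attributes as plain keys, text under "__content__") is an assumption chosen to resemble what Hash.from_xml uses internally.

```ruby
require 'rexml/document'

# Hypothetical helper: converts an element to a hash, keeping attributes.
# Limitation: an attribute and a child element with the same name would be
# merged under one key.
def element_to_hash(element)
  hash = {}
  element.attributes.each { |name, value| hash[name] = value }

  element.elements.each do |child|
    converted = element_to_hash(child)
    existing = hash[child.name]
    # Repeated child names are gathered into an array.
    hash[child.name] =
      case existing
      when nil   then converted
      when Array then existing << converted
      else            [existing, converted]
      end
  end

  text = element.text.to_s.strip
  # A leaf with no attributes collapses to its text.
  return text if hash.empty?

  hash['__content__'] = text unless text.empty?
  hash
end

xml = <<~XML
  <album>
    <song-name type="published">Do Re Mi</song-name>
  </album>
XML

root = REXML::Document.new(xml).root
result = { root.name => element_to_hash(root) }

song = result['album']['song-name']
puts song['type']         # published
puts song['__content__']  # Do Re Mi
```

Unlike Hash.from_xml, this sketch leaves key names untouched and does no type casting, so everything comes back as a string.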



Source: https://stackoverflow.com/questions/19309465/keeping-attributes-when-converting-xml-to-ruby-hash
