In Perl, XML::Simple is not able to dereference multi dimensional associative array parsed by Data::Dumper

Posted by 无人久伴 on 2019-12-25 04:29:25

Question


The following is the XML file that I want to parse:

<?xml version="1.0" encoding="UTF-8"?>

<topic id="yerus5" xmlns:ditaarch="http://dita.oasis-open.org/architecture/2005/">



<title/>
  <shortdesc/>
  <body>
<p><b>CCU_CNT_ADDR: (Address=0x004 Reset=32'h1)</b><table id="table_r5b_1xj_ts">
<tgroup cols="4">
  <colspec colnum="1" colname="col1"/>
  <colspec colnum="2" colname="col2"/>
  <colspec colnum="3" colname="col3"/>
  <colspec colnum="4" colname="col4"/>
  <tbody>
        <row>
          <entry>Field</entry>
          <entry>OFFSET</entry>
          <entry>R/W Access</entry>
          <entry>Description</entry>
        </row>
        <row>
          <entry>reg2sm_cnt</entry>
          <entry>15:0</entry>
          <entry>R/W</entry>
          <entry>Count Value to increment in the extenral memory at the specified location.
            Default Value of 1. A Count value of 0 will clear the counter value</entry>
        </row>
        <row>
          <entry>ccu2bus_endianess</entry>
          <entry>24</entry>
          <entry>R/W</entry>
          <entry>Endianess of the data structure bit</entry>
        </row>
        <row>
          <entry>ccu_lane_sel</entry>
          <entry>25</entry>
          <entry>R/W</entry>
          <entry>ccu_lane_sel bit. Indicates the lane selection bit of the 32-bit location to
            update</entry>
        </row>
        <row>
          <entry>ccu_rdinvalid</entry>
          <entry>26</entry>
          <entry>R/W</entry>
          <entry>ccu_rdinvalid bit. Indicates if the read value from the bus needs to be stored
            or not.</entry>
        </row>
      </tbody>
    </tgroup>
  </table></p>


</body>
</topic>

After running the following code:

#!/usr/bin/perl


# use module
use XML::Simple;
use Data::Dumper;

# create object
$xml = new XML::Simple(); #(KeyAttr=>[]);

# read XML file
$data = $xml->XMLin("test.xml");

# access XML data
print Dumper($data);



# dereference hash ref



    # foreach $b (@{$p->{b}})
    # {

# }
foreach $body (@{$data->{body}})
{
 foreach $p (@{$body->{p}})
 {
     foreach $table (@{$p->{table}})
     {

        foreach $tgroup (@{$table->{tgroup}})
        {

            foreach $tbody (@{$tgroup->{tbody}})
            {

                foreach $row (@{$tbody->{row}})
                {
                    foreach $entry ((@{$row->{entry}})->[3])
                    {
                    print $entry,"\n";
                    }

                }
            }
        }
    }
}

}

I am getting this error: Not an ARRAY reference at ppfe.pl line 28. (at foreach $body (@{$data->{body}}))

I want to access the data in every <entry></entry> element, but the code above accesses only the 'Description' column. How can I do that?

With reference to the question above, I am not able to extract the details for each <b></b> text separately. The following is sample output:

Name: CCU_CNT_ADDR: (Address=0x004 Reset=32'h1)
Field: reg2sm_cnt 
OFFSET: 15:0 
Access: R/W 
Description: Count Value to increment in the extenral memory at the specified location. Default Value of 1. A Count value of 0 will clear the counter value 
Field: ccu2bus_endianess 
OFFSET: 24 
Access: R/W 
Description: Endianess of the data structure bit 
 .
 .
 .
 .
 .
 .
 .
Name: CCU_STAT_ADDR: (Address=0x008 Reset=32'h0) 
Field: fifo_cnt 
.
 .
 .
 .
 .
 .
 .

Answer 1:


Don't use XML::Simple.

Even XML::Simple says "don't use XML::Simple".

The use of this module in new code is discouraged. Other modules are available which provide more straightforward and consistent interfaces.

Try instead something like this:

use strict;
use warnings;
use XML::Twig;

XML::Twig->new(
    'twig_handlers' => {
        'entry' => sub { print $_ ->text, "\n" }
    }
)->parsefile ('your_file.xml');

This will print the text content of all the entry elements, which appears to be what you're trying to do?

XML::Twig has two really handy mechanisms. The first is a twig_handler, which finds and processes nodes matching a spec. It works 'as you go' while the file is being parsed, which is particularly useful when handling large XML, or if you want to edit it before processing.

However, it also allows you to 'handle' data afterwards:

my $twig = XML::Twig->new( 'pretty_print' => 'indented_a' )->parsefile('your_xml_file');

foreach my $element ( $twig -> get_xpath ("//entry") )
{
    print $element ->text, "\n";
}

Or you could use a full path to the node as you're doing above:

my @entries = $twig->root->get_xpath("body/p/table/tgroup/tbody/row/entry");

In response to your question though:

The code above accesses only the 'Description' column. How can I do that?

That's because you're doing this:

foreach $entry ((@{$row->{entry}})->[3])

That is, you're taking only the 4th element (index 3) of the entry array, which is the Description column.
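As for the "Not an ARRAY reference" error: by default XML::Simple represents an element that occurs only once as a hash reference (or a plain string) and only uses an array reference when the element repeats. Your file has a single <body>, so $data->{body} is not an array and @{ ... } fails on it. XML::Simple's ForceArray option exists for this. If you must keep walking the structure by hand, the sketch below shows one way to tolerate both shapes. It uses core Perl only; the helper as_list and the hand-written $data are hypothetical stand-ins, not the exact structure XMLin returns for your file.

```perl
use strict;
use warnings;

# Hypothetical helper: return a list whether the value is an array reference,
# a single item (hash reference or string), or undef.
sub as_list {
    my ($value) = @_;
    return ()        unless defined $value;
    return @{$value} if ref($value) eq 'ARRAY';
    return ($value);
}

# Hand-written stand-in for a parsed structure (an assumption for illustration):
# 'body' occurs once, so it is a hash reference rather than an array reference.
my $data = {
    body => {
        row => [
            { entry => [ 'Field',      'OFFSET' ] },
            { entry => [ 'reg2sm_cnt', '15:0' ] },
        ],
    },
};

# Each level is normalised to a list before looping over it.
for my $body ( as_list( $data->{body} ) ) {
    for my $row ( as_list( $body->{row} ) ) {
        print join( ' | ', as_list( $row->{entry} ) ), "\n";
    }
}
```

The same wrapper would be needed at every level of the original nested loops, which is one reason the XPath-based approach above is easier to live with.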

With reference to the comments: I'd suggest you convert your 'entries' into a hash outside the XML data structure.

Like this:

use strict;
use warnings;
use XML::Twig;

use Data::Dumper;

my @headers;

my $column_to_show = 'Field';

sub process_row {
    my %entries;

    my ( $twig, $row ) = @_;
    my @row_entries = map { $_->text } $row->children;
    if (@headers) {
        @entries{@headers} = @row_entries;
        print $column_to_show, " => ", $entries{$column_to_show}, "\n";
    }
    else {
        @headers = @row_entries;
    }
}

my $twig = XML::Twig->new(
    'pretty_print' => 'indented_a',
    twig_handlers  => { 'row' => \&process_row }
)->parsefile ( 'your_file.xml' ); 

What this does is:

  • Fire that handler on each row element.
  • Extract the entry subelements (and their text) into an array, @row_entries.
  • Use the "header" row to turn that into a hash.
  • Print the hash value that matches a specific key, $column_to_show.

Depending on whether you're doing any more with the data than print it, you can turn that into a hash of arrays or similar.

Or you could just print $row_entries[3] instead of course ;).
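The "hash of arrays" idea can be sketched without any XML parsing at all. The example below is only an illustration under stated assumptions: rows_to_columns is a hypothetical helper, and the rows are typed in by hand from the table in the question rather than read from the file. It uses the same hash-slice trick as process_row.

```perl
use strict;
use warnings;

# Hypothetical helper: the first row supplies the column names; the values of
# every later row are appended to the array stored under the matching name.
sub rows_to_columns {
    my @rows = @_;
    return {} unless @rows;

    my @headers = @{ shift @rows };
    my %columns = map { $_ => [] } @headers;

    for my $row (@rows) {
        my %record;
        @record{@headers} = @{$row};    # hash slice: header => value
        push @{ $columns{$_} }, $record{$_} for @headers;
    }
    return \%columns;
}

# Rows copied by hand from the question's table (header row first).
my $columns = rows_to_columns(
    [ 'Field', 'OFFSET', 'R/W Access', 'Description' ],
    [ 'ccu2bus_endianess', '24', 'R/W', 'Endianess of the data structure bit' ],
    [ 'ccu_lane_sel', '25', 'R/W',
      'ccu_lane_sel bit. Indicates the lane selection bit of the 32-bit location to update' ],
);

# Print one whole column by name.
print "$_\n" for @{ $columns->{Field} };
```

In a twig handler you would collect each row's text into an array reference and pass the collected rows to a helper like this once parsing has finished.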




Answer 2:


It is generally quicker and more reliable to use a proper XML parsing module that lets you access the XML data using XPath expressions.

Here's a solution using XML::Twig.

I wasn't sure what you meant about the bold fields in <b>...</b>, as there is only one in the example data you show, but I've accessed that using the XPath //body/p/b and printed it at the start of the output.

The rest of the output is the values of the <entry> elements in each <row>, which I access using //table/tgroup/tbody/row. The contents of the first row are used as field names to label the values in the subsequent rows.

use strict;
use warnings;
use 5.010;

use open qw/ :std :encoding(UTF-8) /;

use XML::Twig;
use List::Util qw/ max /;
use List::MoreUtils qw/ pairwise /;

my $twig = XML::Twig->new;
$twig->parsefile('topic.xml');

say $twig->findvalues('//body/p/b');
say '';

my (@fields, $size);
for my $row ( $twig->findnodes('//table/tgroup/tbody/row') ) {

  unless ( @fields ) {
    @fields = map "$_:", $row->findvalues('entry');
    $size = max map length, @fields;
    next;
  }

  my @values = $row->findvalues('entry');
  say for pairwise { sprintf '%-*s %s', $size, $a, $b } @fields, @values;
  say '---';
}

output

CCU_CNT_ADDR: (Address=0x004 Reset=32'h1)

Field:       reg2sm_cnt
OFFSET:      15:0
R/W Access:  R/W
Description: Count Value to increment in the extenral memory at the specified location.
            Default Value of 1. A Count value of 0 will clear the counter value
---
Field:       ccu2bus_endianess
OFFSET:      24
R/W Access:  R/W
Description: Endianess of the data structure bit
---
Field:       ccu_lane_sel
OFFSET:      25
R/W Access:  R/W
Description: ccu_lane_sel bit. Indicates the lane selection bit of the 32-bit location to
            update
---
Field:       ccu_rdinvalid
OFFSET:      26
R/W Access:  R/W
Description: ccu_rdinvalid bit. Indicates if the read value from the bus needs to be stored
            or not.
---
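The label-alignment step does not depend on XML::Twig or on List::MoreUtils, so it can be tried in isolation. The following is a minimal sketch using only core modules; format_record is a hypothetical helper, and the labels and values are typed in by hand from the table above.

```perl
use strict;
use warnings;
use List::Util qw/ max /;

# Hypothetical helper: pair each label with the value at the same index and
# left-justify the labels to the width of the longest one.
sub format_record {
    my ( $labels, $values ) = @_;
    my $width = max map { length $_ } @{$labels};
    return map { sprintf '%-*s %s', $width, $labels->[$_], $values->[$_] }
        0 .. $#{$labels};
}

my @labels = ( 'Field:', 'OFFSET:', 'R/W Access:', 'Description:' );
my @values = ( 'ccu2bus_endianess', '24', 'R/W', 'Endianess of the data structure bit' );

print "$_\n" for format_record( \@labels, \@values );
print "---\n";
```

Looping over indices stands in for pairwise here; it is a swapped-in technique chosen so that the sketch runs with a stock Perl installation.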


Source: https://stackoverflow.com/questions/31540145/in-perl-xmlsimple-is-not-able-to-dereference-multi-dimensional-associative-ar
