Deserializing XML in Perl

Problem description:

I have an XML file (the output of some application) like this:

<data>
    <call function="get_user_data">
        <output>
            <integer name="my_id" value="-31" />
            <string name="login" value="root" />
            <integer name="ip_list" value="2" />
            <array name="i">
                <item>
                    <ip_address name="user_ip" value="0.0.0.0" />
                    <ip_address name="user_mask" value="0.0.0.0"/>
                </item>
                <item>
                    <ip_address name="user_ip" value="94.230.160.230" />
                    <ip_address name="user_mask" value="255.255.255.0"/>
                </item>
            </array>
            <integer name="modules" value="2" />
            <array name="i">
                <item>
                    <string name="module_name" value="core" />
                </item>
                <item>
                    <string name="module_name" value="devices" />
                </item>
            </array>
            <integer name="addition_modules" value="0"/>
            <array name="i"/>
        </output>
    </call>
</data>

I need to parse it into this Perl structure:

$data = {
    my_id => -31,
    login => "root",
    ip_list => [
        {
            user_ip => "0.0.0.0",
            user_mask => "0.0.0.0"
        },
        {
            user_ip => "94.230.160.230",
            user_mask => "255.255.255.0"
        }
    ],
    modules => [
        {
            module_name => "core"
        },
        {
            module_name => "devices"
        }
    ],
    addition_modules => []
}

Please help me do this!

This is an awful XML format. It's redundant: why give the number of elements for each array when the items themselves are right there? The hierarchy is also poorly defined: the name of the array (and, if you must have it, its number of elements) should be an attribute of the array, not a sibling element placed just before it.

Anyway... your format is so specific that I doubt you can use the usual XML::Simple here, so I pulled out XML::Twig, and I think this will do:

#!/usr/bin/perl 

use strict;
use warnings;

use XML::Twig;
use Test::More tests => 1;

my $expected= {
    my_id => -31,
    login => "root",
    ip_list => [
        { user_ip => "0.0.0.0",
          user_mask => "0.0.0.0"
        },
        { user_ip => "94.230.160.230",
          user_mask => "255.255.255.0"
        }
    ],
    modules => [
        { module_name => "core" },
        { module_name => "devices" }
    ],
    addition_modules => []
};

my $data={};

my $t=XML::Twig->new( twig_handlers => { 'integer[@name="my_id"]' => sub { add_field( $data, $_)},
                                         'string[@name="login"]' => sub { add_field( $data, $_)},
                                         array => sub { array( $data, @_); },
                                       },
              )
         ->parse( \*DATA); # replace with parsefile( 'file.xml') to parse a file

is_deeply( $data, $expected, 'one test to rule them all');


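# handler for each <array>: the array's real name is carried by the
# <integer> element just before it, not by the array's own name attribute
sub array
  { my( $data, $t, $array)= @_;
    my $name= $array->prev_sibling( 'integer')->att( 'name');
    $data->{$name}=[];   # an empty <array/> still yields an empty list
    foreach my $item ($array->children( 'item'))
      { my $item_data={};
        # each child of an <item> is one name/value field
        foreach my $child ($item->children)
          { add_field( $item_data, $child); }
        push @{$data->{$name}}, $item_data;
      }
  }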


# get a name/value pair of attributes and add it to a hash, which
# could be the overall $data or an element in an array
sub add_field
  { my( $data, $elt)= @_;
    $data->{$elt->att( 'name')}= $elt->att( 'value');
  }



__DATA__
<data>
    <call function="get_user_data">
        <output>
            <integer name="my_id" value="-31" />
            <string name="login" value="root" />
            <integer name="ip_list" value="2" />
            <array name="i">
                <item>
                    <ip_address name="user_ip" value="0.0.0.0" />
                    <ip_address name="user_mask" value="0.0.0.0"/>
                </item>
                <item>
                    <ip_address name="user_ip" value="94.230.160.230" />
                    <ip_address name="user_mask" value="255.255.255.0"/>
                </item>
            </array>
            <integer name="modules" value="2" />
            <array name="i">
                <item>
                    <string name="module_name" value="core" />
                </item>
                <item>
                    <string name="module_name" value="devices" />
                </item>
            </array>
            <integer name="addition_modules" value="0"/>
            <array name="i"/>
        </output>
    </call>
</data>
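
The script above names the two top-level fields (my_id and login) explicitly in its handlers. If the application can emit other scalar fields, a more generic variant may be easier to maintain. The following is only a sketch under an assumption that is not confirmed by the question: every `<integer>` immediately followed by an `<array>` is a name/count header for that array, and every other child of `<output>` is a plain name/value field. The helper names `read_output` and `item_fields` are made up for this illustration, and the sample input is a shortened copy of the XML from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::Twig;

# shortened copy of the question's XML, kept inline so the sketch is self-contained
my $xml = <<'XML';
<data>
    <call function="get_user_data">
        <output>
            <integer name="my_id" value="-31" />
            <string name="login" value="root" />
            <integer name="modules" value="2" />
            <array name="i">
                <item><string name="module_name" value="core" /></item>
                <item><string name="module_name" value="devices" /></item>
            </array>
            <integer name="addition_modules" value="0"/>
            <array name="i"/>
        </output>
    </call>
</data>
XML

my $data = {};

# a single handler on <output>: it fires once the whole element has been parsed
XML::Twig->new( twig_handlers => { output => sub { read_output( $data, $_ ) } } )
         ->parse( $xml );

# walk the children of <output> in document order
sub read_output {
    my( $data, $output ) = @_;
    foreach my $elt ( $output->children ) {
        next if $elt->tag eq 'array';    # consumed together with the preceding <integer>
        my $next = $elt->next_sibling;
        if( $elt->tag eq 'integer' && $next && $next->tag eq 'array' ) {
            # assumption: this <integer> only names (and counts) the array that follows
            $data->{ $elt->att( 'name' ) } = [ map { item_fields( $_ ) } $next->children( 'item' ) ];
        }
        else {
            # plain scalar field
            $data->{ $elt->att( 'name' ) } = $elt->att( 'value' );
        }
    }
}

# turn one <item> into a hash of its name/value children
sub item_fields {
    my( $item ) = @_;
    my %fields;
    $fields{ $_->att( 'name' ) } = $_->att( 'value' ) for $item->children;
    return \%fields;
}
```

To run it against a real file, replacing `parse( $xml )` with `parsefile( 'file.xml' )` should work the same way as in the original script. The trade-off is that this version trusts element order: an `<integer>` that happens to sit right before an unrelated `<array>` would be treated as that array's name.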