How to retrieve list of values from result object in PyParsing?

Question

I have a simple example where I would like to parse 2 lines of data.

In [1] from pyparsing import Word, nums, OneOrMore, Optional, Suppress, alphanums, LineEnd, LineStart

       Float = Word(nums + '.' + '-')
       Name = Word(alphanums)
       Line = OneOrMore(Float)('data') + Suppress(Optional(';')) + Optional('%') + Optional(Name)('name')

       Lines = OneOrMore(Line + LineEnd())

       string = ''' 1   10  0       T20
            1   76  0   T76
       '''
       result = Lines.parseString(string)

In [2] result
Out[2] (['1', '10', '0', 'T20', '\n', '1', '76', '0', 'T76', '\n'], {'data': [(['1', '10', '0'], {}), (['1', '76', '0'], {})], 'name': ['T20', 'T76']})

The result object contains all the values I need: the data and name keys each hold a list with one item per line, in line order. How do I get these values out of the result object?

Accessing the data attribute returns only the second row, not both:

In [3] result.data
Out[3] (['1', '76', '0'], {})

In [4] for i in result.data:
           print(i)
       1
       76
       0

The asDict() method also returns only the second row:

In [5]: result.asDict()
Out[5]: {'data': ['1', '76', '0'], 'name': 'T76'}

The asList() method returns everything in a single flat list, which is hard to split back into rows when you don't know the lengths of name and data ahead of time:

In [6]: result.asList()
Out[6]: ['1', '10', '0', 'T20', '\n', '1', '76', '0', 'T76', '\n']

asXML() contains everything I require, but it is in XML format, and the docstring says it will be deprecated soon.

In [7]: print(result.asXML()) # The documentation says this will be deprecated
        <data>
          <data>1</data>
          <ITEM>10</ITEM>
          <ITEM>0</ITEM>
          <name>T20</name>
          <ITEM>
        </ITEM>
          <data>1</data>
          <ITEM>76</ITEM>
          <ITEM>0</ITEM>
          <name>T76</name>
          <ITEM>
        </ITEM>
        </data>

dump() also shows only part of the relevant information, and it returns a string that would have to be parsed again to extract the values.

In [8]: print(result.dump())
        ['1', '10', '0', 'T20', '\n', '1', '76', '0', 'T76', '\n']
        - data: ['1', '76', '0']
        - name: 'T76'

How does one get these values in a Pythonic way?

Answer 1:

Well done on using results names; they are very helpful for accessing the parsed fields. What your parser needs is one more layer of structure, so that each line gets its own data, name, etc. You can add it by redefining Lines so that each Line is wrapped in a Group (which must also be imported from pyparsing):

Lines = OneOrMore(Group(Line) + LineEnd().suppress())

Now, if you print(result.dump()) you get:

[['1', '10', '0', 'T20'], ['1', '76', '0', 'T76']]
[0]:
  ['1', '10', '0', 'T20']
  - data: ['1', '10', '0']
  - name: 'T20'
[1]:
  ['1', '76', '0', 'T76']
  - data: ['1', '76', '0']
  - name: 'T76'
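
For reference, here is a minimal end-to-end sketch of the grammar from the question with only the Lines definition changed as described above. The variable name text is my own choice (the question calls it string), and the sample input is copied from the question; treat this as an illustration rather than the only way to write it.

```python
from pyparsing import (Group, LineEnd, OneOrMore, Optional, Suppress,
                       Word, alphanums, nums)

# Grammar from the question, unchanged
Float = Word(nums + '.' + '-')
Name = Word(alphanums)
Line = (OneOrMore(Float)('data') + Suppress(Optional(';'))
        + Optional('%') + Optional(Name)('name'))

# The only change: Group each line so it keeps its own 'data' and 'name',
# and suppress the line-end token so it does not show up in the results
Lines = OneOrMore(Group(Line) + LineEnd().suppress())

# Sample input from the question ('text' is a name chosen for this sketch)
text = ''' 1   10  0       T20
     1   76  0   T76
'''

result = Lines.parseString(text)
print(result.dump())
```

Without Group, every line writes into the same top-level data and name entries, so the later line overwrites the earlier one; that is why the question only ever saw the second row.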

The output of dump() is not meant to be parsed to get the values; it is meant to show you how the structured values can be retrieved. For instance, you can do:

print(result[1].data)
print(result[1].name)

and get

['1', '76', '0']
T76

or:

for parsed_line in result:
    print("{name}: {data}".format_map(parsed_line))

and get:

T20: ['1', '10', '0']
T76: ['1', '76', '0']
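
To get back to the original question (plain Python lists of the values), the grouped result can be unpacked with ordinary comprehensions. The sketch below assumes the grouped Lines grammar from this answer; the names text, names, data and rows are made up for illustration.

```python
from pyparsing import (Group, LineEnd, OneOrMore, Optional, Suppress,
                       Word, alphanums, nums)

# Grammar from the question, with each line wrapped in Group
Float = Word(nums + '.' + '-')
Name = Word(alphanums)
Line = (OneOrMore(Float)('data') + Suppress(Optional(';'))
        + Optional('%') + Optional(Name)('name'))
Lines = OneOrMore(Group(Line) + LineEnd().suppress())

text = ''' 1   10  0       T20
     1   76  0   T76
'''
result = Lines.parseString(text)

# One entry per parsed line, in input order
names = [line.name for line in result]
data = [line.data.asList() for line in result]

# Or one plain dict per line, keyed by the results names
rows = [line.asDict() for line in result]

print(names)  # ['T20', 'T76']
print(data)   # [['1', '10', '0'], ['1', '76', '0']]
print(rows)
```

Note that the parsed values are still strings; if numbers are needed, something like [float(x) for x in line.data] can be applied per line. If a line has no name, line.name evaluates to an empty string rather than raising an error.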

Source: https://stackoverflow.com/questions/40667527/how-to-retrieve-list-of-values-from-result-object-in-pyparsing

Tags

python

pyparsing