How to retrieve list of values from result object in PyParsing?

Submitted by 不羁岁月 on 2019-12-08 02:56:33

Question


I have a simple example where I would like to parse two lines of data.

In [1] from pyparsing import Word, nums, OneOrMore, Optional, Suppress, alphanums, LineEnd, LineStart

       Float = Word(nums + '.' + '-')
       Name = Word(alphanums)
       Line = OneOrMore(Float)('data') + Suppress(Optional(';')) + Optional('%') + Optional(Name)('name')

       Lines = OneOrMore(Line + LineEnd())

       string = ''' 1   10  0       T20
            1   76  0   T76
       '''
       result = Lines.parseString(string)

In [2] result
Out[2] (['1', '10', '0', 'T20', '\n', '1', '76', '0', 'T76', '\n'], {'data': [(['1', '10', '0'], {}), (['1', '76', '0'], {})], 'name': ['T20', 'T76']})

The result object contains all the values I need: the data and name keys each hold a list with one item per line, in line order. How do I get those values out of the result object?

Accessing the data attribute returns only the last row, not both:

In [3] result.data
Out[3] (['1', '76', '0'], {})

In [4] for i in result.data:
           print(i)
       1
       76
       0

The asDict() method returns only the second row

In [5]: result.asDict()
Out[5]: {'data': ['1', '76', '0'], 'name': 'T76'}

The asList() method returns everything in a single flat list, which is hard to split back into rows when you don't know the lengths of name and data ahead of time:

In [6]: result.asList()
Out[6]: ['1', '10', '0', 'T20', '\n', '1', '76', '0', 'T76', '\n']

asXML() contains everything I require, but it is in XML format, and the docstring says it will be deprecated soon.

In [7]: print(result.asXML()) # The documentation says this will be deprecated
        <data>
          <data>1</data>
          <ITEM>10</ITEM>
          <ITEM>0</ITEM>
          <name>T20</name>
          <ITEM>
        </ITEM>
          <data>1</data>
          <ITEM>76</ITEM>
          <ITEM>0</ITEM>
          <name>T76</name>
          <ITEM>
        </ITEM>
        </data>

dump() also shows only part of the relevant information, and because it returns a string, I would have to parse that string again to extract the values.

In [8]: print(result.dump())
        ['1', '10', '0', 'T20', '\n', '1', '76', '0', 'T76', '\n']
        - data: ['1', '76', '0']
        - name: 'T76'

How does one get these values in a Pythonic way?


Answer 1:


Well done on using results names; they are incredibly helpful when accessing the parsed fields. Your parser needs one more layer of structure, though, so that each line gets its own data and name instead of every line writing to the same two keys (which is why only the last row survives). You can add that layer by importing Group from pyparsing and redefining Lines as:

Lines = OneOrMore(Group(Line) + LineEnd().suppress())

Now, if you print(result.dump()) you get:

[['1', '10', '0', 'T20'], ['1', '76', '0', 'T76']]
[0]:
  ['1', '10', '0', 'T20']
  - data: ['1', '10', '0']
  - name: 'T20'
[1]:
  ['1', '76', '0', 'T76']
  - data: ['1', '76', '0']
  - name: 'T76'
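
For reference, the change can be put together with the grammar from the question as a minimal, self-contained sketch. The variable name text is only illustrative, and the sketch assumes a pyparsing version in which the camelCase parseString is still available:

```python
from pyparsing import (
    Group, LineEnd, OneOrMore, Optional, Suppress, Word, alphanums, nums,
)

# Grammar from the question, unchanged.
Float = Word(nums + '.' + '-')
Name = Word(alphanums)
Line = (OneOrMore(Float)('data') + Suppress(Optional(';'))
        + Optional('%') + Optional(Name)('name'))

# Group gives every line its own sub-result, so the 'data' and 'name'
# of one line no longer overwrite those of the previous line.
# The line-end token is suppressed to keep '\n' out of the results.
Lines = OneOrMore(Group(Line) + LineEnd().suppress())

# 'text' is an illustrative name for the sample input.
text = ''' 1   10  0       T20
     1   76  0   T76
'''
result = Lines.parseString(text)

print(len(result))                 # number of parsed lines
print(result[0].name)              # name of the first line
print(list(result[0].data))        # data of the first line as a plain list
```

Each element of result is itself a results object, so both attribute access (result[0].name) and key access (result[0]['name']) work on it.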

The output of dump() is not meant to be parsed to get the values; it is meant to show you how the structured values can be retrieved. For instance, you can do:

print(result[1].data)
print(result[1].name)

and get

['1', '76', '0']
T76

or:

for parsed_line in result:
    print("{name}: {data}".format_map(parsed_line))

and get:

T20: ['1', '10', '0']
T76: ['1', '76', '0']
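
If plain Python containers are more convenient than the results objects, the grouped lines can be copied into ordinary dicts and lists. The following is only a sketch under the same assumptions as the grammar above; the names text and records are illustrative, and get() is used for name because that field is optional in the grammar:

```python
from pyparsing import (
    Group, LineEnd, OneOrMore, Optional, Suppress, Word, alphanums, nums,
)

Float = Word(nums + '.' + '-')
Name = Word(alphanums)
Line = (OneOrMore(Float)('data') + Suppress(Optional(';'))
        + Optional('%') + Optional(Name)('name'))
Lines = OneOrMore(Group(Line) + LineEnd().suppress())

# 'text' is an illustrative name for the sample input.
text = ''' 1   10  0       T20
     1   76  0   T76
'''
result = Lines.parseString(text)

# One plain dict per parsed line. 'records' is an illustrative name.
# get() returns None when a line has no name; list() copies the
# parsed tokens (still strings) into an ordinary list.
records = [
    {'name': parsed_line.get('name'), 'data': list(parsed_line['data'])}
    for parsed_line in result
]

for record in records:
    print(record['name'], record['data'])
```

The values in data stay strings here, exactly as parsed; converting them to numbers would be a separate step.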


Source: https://stackoverflow.com/questions/40667527/how-to-retrieve-list-of-values-from-result-object-in-pyparsing
