Tips for reading in a complex file - Python

耶瑟儿~ 2020-12-07 06:39

I have complex, variable text files that I want to read into Python, but I'm not sure what the best strategy would be. I'm not looking for you to code anything for me, jus

1 answer
  • 2020-12-07 06:59

    It's a typical task for a syntactic analyzer. In this case, since

    • lexical constructs do not cross line boundaries, and each line holds exactly one construct ("statement")
    • full syntax for a single line can be covered by a set of regexes
    • the structure of compounds (=entities connecting multiple "statements" into something bigger) is simple and straightforward

    a (relatively) simple scannerless parser based on lines, a DFA and the aforementioned set of regexes can be applied:

    • set up the initial parser state (=current position relative to various entities to be tracked) and the parse tree (=data structure representing the information from the file in a convenient way)
    • for each line
      • classify it, e.g. by matching against the regexes applicable to the current state
      • use the matched regex's groups to get the line's statement's meaningful parts
      • using these parts, update the state and the parse tree

    See "get the path in a file inside {} by python" for an example. There, I do not construct a parse tree (it wasn't needed) but only track the current state.
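The steps above could be sketched roughly as follows. Since the question does not show the actual file format, this uses an invented one (`BEGIN name` / `key = value` / `END` blocks with `#` comments); the regexes, the state names `"outside"` and `"inside"`, and the `parse` function are all illustrative assumptions, not part of the original answer.

```python
import re

# Hypothetical line grammar: exactly one statement per line.
BLANK = re.compile(r"^\s*(#.*)?$")                        # empty line or comment
BEGIN = re.compile(r"^BEGIN\s+(?P<name>\w+)\s*$")         # opens a block
END = re.compile(r"^END\s*$")                             # closes a block
FIELD = re.compile(r"^\s*(?P<key>\w+)\s*=\s*(?P<value>.*?)\s*$")

# The DFA: for each state, the statement kinds allowed there, tried in order.
RULES = {
    "outside": [("blank", BLANK), ("begin", BEGIN)],
    "inside": [("blank", BLANK), ("end", END), ("field", FIELD)],
}


def parse(lines):
    """Return {block_name: {key: value}} built from an iterable of lines."""
    state = "outside"   # parser state: where we are relative to a block
    tree = {}           # parse tree: here just nested dicts
    current = None      # the block currently being filled in
    for lineno, line in enumerate(lines, start=1):
        # Classify the line against the regexes valid in the current state.
        for kind, regex in RULES[state]:
            m = regex.match(line)
            if m:
                break
        else:
            raise ValueError(
                f"line {lineno}: unexpected {line!r} in state {state!r}"
            )
        # Use the matched groups to update the state and the parse tree.
        if kind == "begin":
            current = tree.setdefault(m["name"], {})
            state = "inside"
        elif kind == "field":
            current[m["key"]] = m["value"]
        elif kind == "end":
            current = None
            state = "outside"
    if state != "outside":
        raise ValueError("input ended inside an unclosed block")
    return tree


SAMPLE = """\
# demo input in the invented format
BEGIN server
  host = example.org
  port = 8080
END

BEGIN client
  retries = 3
END
"""

if __name__ == "__main__":
    print(parse(SAMPLE.splitlines()))
```

Keeping the allowed regexes in a per-state table means a statement that is valid in one context (a `key = value` line inside a block) is rejected with a line number when it appears somewhere it should not. For a real file, the regexes and states would need to be replaced with ones that match its actual statements.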
