How can I use pyparsing to parse nested expressions that have multiple opener/closer types?


I'd like to use pyparsing to parse an expression of the form: expr = '(gimme [some {nested [lists]}])', and get back a Python list of the form: [[['gi

2 Answers
  • 2020-12-15 08:33

    This should do the trick for you. I tested it on your example:

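    import ast
    import re
    
    def parse(s):
        # Normalize every opener to '[' and every closer to ']'.
        # Note: this discards the bracket types, so "(a]" is accepted as balanced.
        s = re.sub(r"[{(\[]", "[", s)
        s = re.sub(r"[})\]]", "]", s)
        answer = ''
        # Walk the brackets and words, building the text of a Python list literal
        for token in re.findall(r"[\[\]]|[^\[\]\s]+", s):
            if token == '[':
                answer += '['
            elif token == ']':
                answer += '], '
            else:
                answer += repr(token) + ', '
        # Drop the trailing ", " and evaluate the literal safely
        return ast.literal_eval(answer.rstrip(', '))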
    

    Leave a comment if you need more.

  • 2020-12-15 08:43

    Here's a pyparsing solution that uses a self-modifying grammar to dynamically match the correct closing brace character.

    from pyparsing import *
    
    data = '(gimme [some {nested, nested [lists]}])'
    
    opening = oneOf("( { [")
    nonBracePrintables = ''.join(c for c in printables if c not in '(){}[]')
    closingFor = dict(zip("({[",")}]"))
    closing = Forward()
    # initialize closing with an expression
    closing << NoMatch()
    closingStack = []
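    def pushClosing(t):
        # save the closer currently expected, then expect the one matching this opener
        closingStack.append(closing.expr)
        closing << Literal( closingFor[t[0]] )
    def popClosing():
        # a closer just matched: restore the closer expected by the enclosing level
        closing << closingStack.pop()
    opening.setParseAction(pushClosing)
    closing.setParseAction(popClosing)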
    
    matchedNesting = nestedExpr( opening, closing, Word(alphas) | Word(nonBracePrintables) )
    
    print(matchedNesting.parseString(data).asList())
    

    prints:

    [['gimme', ['some', ['nested', ',', 'nested', ['lists']]]]]
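
    As an aside, the closing-stack idea behind this grammar can be sketched in plain Python without pyparsing. This is only an illustration under stated assumptions, not part of the pyparsing solution: the names parse_nested and CLOSER_FOR are made up for this sketch, and words are split on whitespace only (so a token like 'nested,' would stay in one piece).

```python
# Hypothetical helper names; not part of pyparsing.
CLOSER_FOR = dict(zip("({[", ")}]"))  # same opener -> closer mapping as closingFor


def parse_nested(text):
    """Parse bracketed text into nested lists, checking that closers match openers."""
    root = []
    # Each stack entry is (list being filled, closer that must end it).
    stack = [(root, None)]
    word = []

    def flush_word():
        # Append the word collected so far (if any) to the innermost list.
        if word:
            stack[-1][0].append("".join(word))
            word.clear()

    for pos, ch in enumerate(text):
        if ch in CLOSER_FOR:
            flush_word()
            child = []
            stack[-1][0].append(child)
            stack.append((child, CLOSER_FOR[ch]))  # push the expected closer
        elif ch in CLOSER_FOR.values():
            flush_word()
            _, expected = stack.pop()  # pop the closer expected at this level
            if ch != expected:
                raise ValueError(f"unexpected {ch!r} at position {pos}")
        elif ch.isspace():
            flush_word()
        else:
            word.append(ch)

    flush_word()
    if len(stack) != 1:
        raise ValueError(f"missing closer {stack[-1][1]!r}")
    return root


print(parse_nested('(gimme [some {nested [lists]}])'))
# prints: [['gimme', ['some', ['nested', ['lists']]]]]
```

    A mismatched closer such as '(oops]' raises ValueError here, which is the behaviour the push/pop parse actions are meant to give the pyparsing grammar.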
    

    Updated: I posted the self-modifying grammar solution above because I had written it over a year ago as an experiment. Taking a closer look at your original post reminded me of the recursive type definition created by the operatorPrecedence method, so I redid the solution using your original approach, which is much simpler to follow (though it is not thoroughly tested and might have a left-recursion issue with the right input data):

    from pyparsing import *
    
    enclosed = Forward()
    nestedParens = nestedExpr('(', ')', content=enclosed) 
    nestedBrackets = nestedExpr('[', ']', content=enclosed) 
    nestedCurlies = nestedExpr('{', '}', content=enclosed) 
    enclosed << (Word(alphas) | ',' | nestedParens | nestedBrackets | nestedCurlies)
    
    
    data = '(gimme [some {nested, nested [lists]}])' 
    
    print(enclosed.parseString(data).asList())
    

    Gives:

    [['gimme', ['some', ['nested', ',', 'nested', ['lists']]]]]
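
    To make the shape of this recursive grammar more concrete, here is a rough hand-written recursive-descent equivalent in plain Python. It is a sketch under assumptions, not how pyparsing works internally: tokenize, parse_enclosed, parse and PAIRS are hypothetical names, and unlike parseString's default behaviour, parse rejects trailing input. Each recursive call happens only after an opener has been consumed, so this sketch cannot recurse without making progress.

```python
import re

# Hypothetical names for illustration; not part of pyparsing.
PAIRS = {"(": ")", "[": "]", "{": "}"}
TOKEN = re.compile(r"[()\[\]{}]|,|[A-Za-z]+")  # brackets, commas, alphabetic words


def tokenize(text):
    """Split text into bracket, comma and word tokens, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        tokens.append(match.group())
        pos = match.end()
    return tokens


def parse_enclosed(tokens, i=0):
    """Parse one item (a word, a comma, or a bracketed group); return (value, next index)."""
    if i >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[i]
    if tok in PAIRS:
        closer = PAIRS[tok]
        items = []
        i += 1
        # Collect items until any closer appears, then check it is the right one.
        while i < len(tokens) and tokens[i] not in PAIRS.values():
            item, i = parse_enclosed(tokens, i)
            items.append(item)
        if i >= len(tokens) or tokens[i] != closer:
            raise ValueError(f"expected {closer!r}")
        return items, i + 1
    if tok in PAIRS.values():
        raise ValueError(f"unexpected closer {tok!r}")
    return tok, i + 1


def parse(text):
    """Parse exactly one top-level item and wrap it in a list."""
    tokens = tokenize(text)
    value, end = parse_enclosed(tokens)
    if end != len(tokens):
        raise ValueError("trailing input after the first item")
    return [value]


print(parse('(gimme [some {nested, nested [lists]}])'))
# prints: [['gimme', ['some', ['nested', ',', 'nested', ['lists']]]]]
```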
    

    EDITED: Here is a diagram of the updated parser, using the railroad diagramming support coming in pyparsing 3.0.

    [Figure: railroad diagram of the updated parser; image not preserved]
