Python: parsing JSON-like JavaScript data structures (w/ consecutive commas)

南旧 2020-12-10 22:55

I would like to parse JSON-like strings. Their only difference from normal JSON is the presence of consecutive commas in arrays. When there are two such commas, it i

6 answers
  •  悲哀的现实
    2020-12-10 23:02

    I've had a look at Taymon's recommendation, pyparsing, and I successfully hacked the example provided here to suit my needs. It works well at simulating JavaScript's eval() but fails in one situation: trailing commas. A single trailing comma should be optional and ignored (see the tests below), but I can't find a proper way to implement this.

    from pyparsing import *
    
    TRUE = Keyword("true").setParseAction(replaceWith(True))
    FALSE = Keyword("false").setParseAction(replaceWith(False))
    NULL = Keyword("null").setParseAction(replaceWith(None))
    
    jsonString = dblQuotedString.setParseAction(removeQuotes)
    jsonNumber = Combine(Optional('-') + ('0' | Word('123456789', nums)) +
                        Optional('.' + Word(nums)) +
                        Optional(Word('eE', exact=1) + Word(nums + '+-', nums)))
    
    jsonObject = Forward()
    jsonValue = Forward()
    # black magic begins
    commaToNull = Word(',,', exact=1).setParseAction(replaceWith(None))
    jsonElements = ZeroOrMore(commaToNull) + Optional(jsonValue) + ZeroOrMore((Suppress(',') + jsonValue) | commaToNull)
    # black magic ends
    jsonArray = Group(Suppress('[') + Optional(jsonElements) + Suppress(']'))
    jsonValue << (jsonString | jsonNumber | Group(jsonObject) | jsonArray | TRUE | FALSE | NULL)
    memberDef = Group(jsonString + Suppress(':') + jsonValue)
    jsonMembers = delimitedList(memberDef)
    jsonObject << Dict(Suppress('{') + Optional(jsonMembers) + Suppress('}'))
    
    jsonComment = cppStyleComment
    jsonObject.ignore(jsonComment)
    
    def convertNumbers(s, l, toks):
        n = toks[0]
        try:
            return int(n)
        except ValueError:
            return float(n)
    
    jsonNumber.setParseAction(convertNumbers)
    
    def test():
        tests = (
            '[1,2]',       # ok
            '[,]',         # ok
            '[,,]',        # ok
            '[  , ,  , ]', # ok
            '[,1]',        # ok
            '[,,1]',       # ok
            '[1,,2]',      # ok
            '[1,]',        # failure, I got [1, None], I should have [1]
            '[1,,]',       # failure, I got [1, None, None], I should have [1, None]
        )
        for case in tests:
            results = jsonArray.parseString(case)
            print(results.asList())

    if __name__ == "__main__":
        test()
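
For comparison, the trailing-comma rule the failing tests ask for can be sketched without pyparsing, using only the standard library. This is not a fix to the grammar above but a different technique: rewrite the text into strict JSON, then hand it to json.loads. The function name loads_with_elisions is hypothetical, and the sketch assumes double-quoted strings only and that every elided element should become None.

```python
import json
import re

# Either a complete double-quoted string (escapes included) or any single character.
_TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|.', re.DOTALL)

def loads_with_elisions(text):
    """Parse JSON that may contain JavaScript-style array elisions.

    Hypothetical helper: a comma with no value in front of it becomes null,
    and one comma directly before ']' is dropped, as in a JS array literal.
    """
    out = []         # pieces of the rewritten, strictly valid JSON text
    prev = None      # last non-whitespace token seen
    last_comma = -1  # index in `out` of the most recent comma
    for match in _TOKEN.finditer(text):
        tok = match.group()
        if tok.isspace():
            out.append(tok)
            continue
        if tok == ',':
            if prev in ('[', ','):
                out.append('null')   # elided element before this comma
            last_comma = len(out)
            out.append(',')
        elif tok == ']' and prev == ',':
            out[last_comma] = ''     # drop the single trailing comma
            out.append(tok)
        else:
            out.append(tok)
        prev = tok
    return json.loads(''.join(out))

if __name__ == "__main__":
    for case in ('[1,2]', '[,]', '[1,,2]', '[1,]', '[1,,]'):
        print(case, '->', loads_with_elisions(case))
```

With this sketch '[1,]' gives [1] and '[1,,]' gives [1, None], which is what the two failing tests expect. Because strings are consumed as whole tokens, commas and brackets inside string values are left untouched.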
    
