Parsing malformed JSON in JavaScript

别跟我提以往 2020-12-07 04:38

Thanks for looking!

BACKGROUND

I am writing some front-end code that consumes a JSON service which is returning malformed JSON. Specifically, the keys are n

5 Answers
  •  鱼传尺愫
    2020-12-07 05:25

    I was trying to solve the same problem using a regex in JavaScript. I have a Node.js app that parses incoming JSON, but I wanted a "relaxed" version of the parser (see the comments below), since it is inconvenient to put quotes around every key (name). Here is my solution:

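    // Sample input with unquoted keys (same string as the test further down)
    var originalString = '{keyOne: "value1", $keyTwo: "value 2", key-3:{key4:18.34}}';
    var objKeysRegex = /({|,)(?:\s*)(?:')?([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*)(?:')?(?:\s*):/g; // match each object key
    var newQuotedKeysString = originalString.replace(objKeysRegex, "$1\"$2\":"); // wrap each key in double quotes
    var newObject = JSON.parse(newQuotedKeysString);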
    

    Here's a breakdown of the regex:

    • ({|,) matches the character just before a key: { for the first key in an object, or , for every later key.
    • (?:\s*) matches optional whitespace without capturing it.
    • (?:')? matches an optional single quote without capturing it (to be replaced by a double quote later). There will be either zero or one of these.
    • ([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*) captures the name (or key). It starts with a letter, underscore, $, or dot, followed by zero or more alphanumeric characters, underscores, spaces, dashes, dots, or $.
    • (?:')?(?:\s*) repeats the optional single quote and optional whitespace after the key, again without capturing them.
    • The final : is what separates the name from its value.
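
To make the breakdown concrete, here is a small sketch that prints what the two capturing groups pick up for each key. The sample string and the variable names are made up for illustration; only the pattern comes from the answer.

```javascript
// Same pattern as in the answer.
const objKeysRegex = /({|,)(?:\s*)(?:')?([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*)(?:')?(?:\s*):/g;

// Hypothetical sample input mixing a bare key and a single-quoted key.
const sample = "{first: 1, 'second': 2}";

// For every match, keep group 1 (the delimiter) and group 2 (the key name).
const captures = Array.from(sample.matchAll(objKeysRegex), (m) => [m[1], m[2]]);

console.log(captures); // [ [ '{', 'first' ], [ ',', 'second' ] ]
```

Note that the single quotes around second are consumed by the non-capturing groups, so they never reach $2.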

    Now we can use replace() with some dressing to get our newly quoted keys:

    originalString.replace(objKeysRegex, "$1\"$2\":")
    

    Here $1 is either { or , depending on whether the key is the first one in its object. \" adds a double quote, $2 is the name, \" adds another double quote, and : finishes it off. Test it with:

    {keyOne: "value1", $keyTwo: "value 2", key-3:{key4:18.34}}
    

    output:

    {"keyOne": "value1","$keyTwo": "value 2","key-3":{"key4":18.34}}
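
The test above can be reproduced end to end with a sketch like the following. The helper names quoteKeys and parseRelaxedJson are hypothetical, chosen here for illustration; the pattern and the input are the ones from the answer.

```javascript
// Pattern from the answer; the helper names below are assumptions for this sketch.
const OBJ_KEYS_REGEX = /({|,)(?:\s*)(?:')?([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*)(?:')?(?:\s*):/g;

// Wrap every matched key in double quotes and return the new string.
function quoteKeys(text) {
  return text.replace(OBJ_KEYS_REGEX, '$1"$2":');
}

// Quote the keys first, then hand the result to the standard parser.
function parseRelaxedJson(text) {
  return JSON.parse(quoteKeys(text));
}

const input = '{keyOne: "value1", $keyTwo: "value 2", key-3:{key4:18.34}}';
const quoted = quoteKeys(input);
const parsed = parseRelaxedJson(input);

console.log(quoted);               // {"keyOne": "value1","$keyTwo": "value 2","key-3":{"key4":18.34}}
console.log(parsed["key-3"].key4); // 18.34
```

Keeping the quoting step separate from JSON.parse makes it easy to log the rewritten string when parsing fails.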
    

    Some comments:

    • I have not benchmarked this method. From what I gather from other answers here, a regex is faster than eval(), but I have not verified that.
    • For my application, I limit the characters allowed in names with ([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*) for my "relaxed" JSON parser. If you want to allow more characters in names (JSON permits any string as a key), you could instead use ([^'":]+), which matches anything other than a double quote, a single quote, or a colon. That expression lets all sorts of things through, so be careful.
    • One shortcoming is that this method rewrites the incoming data before it is parsed (but I think that's what you wanted?). You could program around that, depending on your needs and the resources available.
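
The caveats above are easier to judge with a couple of edge cases. This is only a sketch: the inputs and the tryParse helper are invented for illustration, and the broad pattern is my reading of how ([^'":]+) would be dropped into the original regex.

```javascript
// Strict key pattern from the answer, and the broader ([^'":]+) variant it mentions.
const strictKeys = /({|,)(?:\s*)(?:')?([A-Za-z_$\.][A-Za-z0-9_ \-\.$]*)(?:')?(?:\s*):/g;
const broadKeys = /({|,)(?:\s*)(?:')?([^'":]+)(?:')?(?:\s*):/g;

// Hypothetical helper: quote the keys, then parse; return null if JSON.parse throws.
function tryParse(text, keyRegex) {
  try {
    return JSON.parse(text.replace(keyRegex, '$1"$2":'));
  } catch (err) {
    return null;
  }
}

// An array value: the strict pattern skips ", 2]" because a key cannot start with a digit.
const withArray = "{a: [1, 2], b: 3}";
const strictResult = tryParse(withArray, strictKeys); // parses to an object
const broadResult = tryParse(withArray, broadKeys);   // null: "2], b" was treated as a key

// A string value containing key-like text trips up even the strict pattern.
const withKeyLikeValue = '{a: "x, b: y"}';
const rewritten = withKeyLikeValue.replace(strictKeys, '$1"$2":');
const keyLikeResult = tryParse(withKeyLikeValue, strictKeys); // null

console.log(strictResult, broadResult, rewritten, keyLikeResult);
```

Because the regex has no notion of being inside a string, any input whose values may contain { or , followed by a name and a colon probably needs a real tokenizer instead.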

    Hope this helps. -John L.
