Parse javascript object declaration which doesn't use strings for property names (using python and BeautifulSoup)

蹲街弑〆低调 submitted on 2019-12-11 20:25:26

Question


I'm doing something very similar to what this user was doing: trying to load a JavaScript object declaration into a Python dictionary. However, unlike in that user's case, my property names aren't enclosed in quotes.

>>> simplejson.loads('{num1: 1383241561141, num2: 1000}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/lalalal/site-packages/simplejson/__init__.py", line 385, in loads
    return _default_decoder.decode(s)
  File "/Users/lalalal/site-packages/simplejson/decoder.py", line 402, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Users/lalalal/site-packages/simplejson/decoder.py", line 418, in raw_decode
    obj, end = self.scan_once(s, idx)
simplejson.decoder.JSONDecodeError: Expecting property name: line 1 column 1 (char 1)

It'd be just splendid if I had the correct JSON notation:

>>> simplejson.loads('{"num1": 1383241561141, "num2": 1000}')
{'num1': 1383241561141, 'num2': 1000}

But I don't. How can I work around this? Maybe it comes down to something as simple as a regex?

Edit: The regex that Martijn wrote gets me halfway there; it just doesn't work when there is whitespace after the opening brace, which happens in some of my example data, e.g. '{ num1: 1383241561141, num2: 1000}'


Answer 1:


Some libraries like RSON support parsing the so-called "relaxed" JSON notation.

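Depending on the actual keys, and if you don't care about the security implications (never use this on external input), eval may give you a functioning dictionary as well. Note that Python evaluates bare keys as variable names, so each one has to resolve to a value in the namespace eval sees; otherwise it raises NameError.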
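
For flat input like the example in the question, a regex that quotes the bare keys before handing the string to the standard json module is another option that avoids eval. The following is only a minimal sketch: the helper name loads_relaxed is hypothetical, and it assumes keys are plain identifiers and that no string value contains text that looks like ", name:".

```python
import json
import re

# Assumption: a key is a bare identifier that follows '{' or ',' and
# precedes ':'. Optional whitespace around it is preserved.
_BARE_KEY = re.compile(r'([{,]\s*)([A-Za-z_$][A-Za-z0-9_$]*)(\s*:)')


def loads_relaxed(text):
    """Hypothetical helper: quote bare keys, then parse as ordinary JSON."""
    quoted = _BARE_KEY.sub(r'\1"\2"\3', text)
    return json.loads(quoted)


print(loads_relaxed('{ num1: 1383241561141, num2: 1000}'))
```

Because the pattern allows whitespace on both sides of the key, it also copes with the "{ num1: ..." variant mentioned in the question's edit. It is not a general JavaScript parser; for anything beyond simple literals a real relaxed-JSON library is the safer route.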




Answer 2:


One simple way to do it in JS:

'{num1: 1383241561141, num2: 1000}'   // the string
  .trim()                             // remove whitespace
  .slice(1,-1)                        // remove endcap braces
  .trim()                             // remove whitespace
  .split(/\s*,\s*/).map(function(a){  // loop over each comma-separated section, named a
     var p=a.split(/\s*:\s*/);        // split section into key/val segments
     this[p[0]]=p[1];                 // assign val to collection under key
     return this;                     // return collection
},{})[0];                             // grab the return once (same on each index)

This routine returns a live object that stringifies like this:

{
    "num1": "1383241561141",
    "num2": "1000"
}

Note that the numbers come back as strings; you can loop through the object again and apply Number(val) to those values to turn them back into real numbers if need be.
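
Since the question is about Python, the same trim, slice, and split steps can be sketched there as well. This is an illustrative port, not part of the original answer: the function name parse_flat_object is hypothetical, and it assumes a flat object whose values are integers, falling back to the raw string when a value is not one.

```python
import re


def parse_flat_object(text):
    """Hypothetical helper: parse a flat '{key: value, ...}' string into a dict."""
    # Drop surrounding whitespace, then the outer braces, then inner whitespace.
    body = text.strip()[1:-1].strip()
    result = {}
    if not body:
        return result
    # Split into comma-separated sections, then each section into key and value.
    for section in re.split(r'\s*,\s*', body):
        key, value = re.split(r'\s*:\s*', section, maxsplit=1)
        try:
            result[key] = int(value)   # assumption: values are integers
        except ValueError:
            result[key] = value        # otherwise keep the raw string
    return result


print(parse_flat_object('{ num1: 1383241561141, num2: 1000}'))
```

Like the JS version, this breaks on nested objects or on values that themselves contain commas or colons, so it only suits simple data like the example.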



Source: https://stackoverflow.com/questions/19745157/parse-javascript-object-declaration-which-doesnt-use-strings-for-property-names
