Question:
I have a malformed string:
a = '(a,1.0),(b,6.0),(c,10.0)'
I need a dict:
d = {'a':1.0, 'b':6.0, 'c':10.0}
I tried:
print (ast.literal_eval(a)) #ValueError: malformed node or string: <_ast.Name object at 0x000000000F67E828>
Then I tried replacing characters to build a 'string dict'. It is ugly and does not work:
b = a.replace(',(','|{').replace(',',' : ').replace('|',', ').replace('(','{').replace(')','}')
print (b)
# {a : 1.0}, {b : 6.0}, {c : 10.0}
print (ast.literal_eval(b))
#ValueError: malformed node or string: <_ast.Name object at 0x000000000C2EA588>
What should I do? Is something missing? Is it possible to use a regex?
Answer 1:
Given the string has the above stated format, you could use regex substitution with backrefs:
import ast
import re
a = '(a,1.0),(b,6.0),(c,10.0)'
a_fix = re.sub(r'\((\w+),', r"('\1',", a)
So you look for the pattern (x, (where x is a sequence of \w characters) and replace it with ('x',. The result is then:
# result
a_fix == "('a',1.0),('b',6.0),('c',10.0)"
and then parse a_fix and convert it to a dict:
result = dict(ast.literal_eval(a_fix))
The result is then:
>>> dict(ast.literal_eval(a_fix))
{'b': 6.0, 'c': 10.0, 'a': 1.0}
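The two steps above can be collected into one helper. This is only a minimal sketch, assuming every key is made of \w characters and every value is a valid Python literal. The name parse_pairs is hypothetical (not from the answer), and the trailing comma is an addition so that a string holding a single pair still parses as a tuple of pairs:

```python
import ast
import re

def parse_pairs(text):
    """Quote the bare keys, then let ast.literal_eval parse the tuples."""
    # (a,  ->  ('a',   for every key made of word characters
    quoted = re.sub(r"\((\w+),", r"('\1',", text)
    # The trailing comma forces a tuple of pairs even when there is only one pair.
    return dict(ast.literal_eval(quoted + ","))

print(parse_pairs('(a,1.0),(b,6.0),(c,10.0)'))
# {'a': 1.0, 'b': 6.0, 'c': 10.0}
```

Because ast.literal_eval only accepts literals, a value that is not a number or quoted string raises ValueError instead of being executed.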
Answer 2:
No need for regexes, if your string is in this format.
>>> a = '(a,1.0),(b,6.0),(c,10.0)'
>>> d = dict([x.split(',') for x in a[1:-1].split('),(')])
>>> print(d)
{'c': '10.0', 'a': '1.0', 'b': '6.0'}
We remove the first opening parenthesis and the last closing parenthesis, then split on ),( to get the key-value pairs. Each pair can then be split on the comma.
To cast to float, the list comprehension gets a little longer:
d = dict([(k, float(v)) for (k, v) in [x.split(',') for x in a[1:-1].split('),(')]])
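The same split-based idea may be easier to read as a plain loop. A sketch under the same assumption as the answer (the string is exactly in the (k,v),(k,v) format); the function name split_pairs is made up for illustration:

```python
def split_pairs(text):
    """Parse '(k,v),(k,v)' using only str methods; values become floats."""
    result = {}
    # Drop the outer parentheses, then split into 'k,v' chunks on '),('.
    for chunk in text[1:-1].split("),("):
        # Unpacking raises ValueError if a chunk does not hold exactly one comma.
        key, value = chunk.split(",")
        result[key] = float(value)
    return result

print(split_pairs('(a,1.0),(b,6.0),(c,10.0)'))
# {'a': 1.0, 'b': 6.0, 'c': 10.0}
```

Malformed input fails loudly here: float() raises ValueError for a value that is not a number.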
Answer 3:
If there are always 2 comma-separated values inside parentheses and the second is of a float type, you may use
import re
s = '(a,1.0),(b,6.0),(c,10.0)'
print({w: float(m) for w, m in re.findall(r'\(([^),]+),([^)]*)\)', s)})
See the Python demo and the (quite generic) regex demo. This pattern matches a (, then one or more chars other than a comma and ) (captured into Group 1), then a comma, then zero or more chars other than ) (captured into Group 2), and finally a ).
The pattern above is suitable when you have pre-validated data. For your current data, the regex can be restricted to
r'\((\w+),(\d*\.?\d+)\)'
See the regex demo
Details:
\( - a literal (
(\w+) - Capturing group 1: one or more word (letter/digit/_) chars
, - a comma
(\d*\.?\d+) - a common integer/float regex: zero or more digits, an optional . (decimal separator) and 1+ digits
\) - a literal closing parenthesis.
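To see the restricted pattern at work, here is a small sketch that plugs it into re.findall and a dict comprehension. The names PAIR and strict_pairs are assumptions for illustration. Note that findall silently skips any pair the pattern does not match:

```python
import re

# The restricted pattern from above: a \w+ key and an unsigned int/float value.
PAIR = re.compile(r"\((\w+),(\d*\.?\d+)\)")

def strict_pairs(text):
    """Build a dict from every (key,number) pair the strict pattern matches."""
    return {key: float(num) for key, num in PAIR.findall(text)}

print(strict_pairs('(a,1.0),(b,6.0),(c,10.0)'))
# {'a': 1.0, 'b': 6.0, 'c': 10.0}
print(strict_pairs('(a,1.0),(b,oops),(c,.5)'))
# {'a': 1.0, 'c': 0.5}
```

If skipping bad pairs is not acceptable, the number of matches can be compared against the number of pairs expected in the input.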
Answer 4:
The reason eval() does not work is that a, b and c are not defined. We can define each of them as its own string form, and eval will then use those strings:
In [11]: text = '(a,1.0),(b,6.0),(c,10.0)'
In [12]: a, b, c = 'a', 'b', 'c'
In [13]: eval(text)
Out[13]: (('a', 1.0), ('b', 6.0), ('c', 10.0))
In [14]: dict(eval(text))
Out[14]: {'a': 1.0, 'b': 6.0, 'c': 10.0}
To do this the regex way:
In [21]: re.sub(r'\((.+?),', r'("\1",', text)
Out[21]: '("a",1.0),("b",6.0),("c",10.0)'
In [22]: eval(_)
Out[22]: (('a', 1.0), ('b', 6.0), ('c', 10.0))
In [23]: dict(_)
Out[23]: {'a': 1.0, 'b': 6.0, 'c': 10.0}
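Defining a, b and c by hand only works when the keys are known in advance. One possible generalisation, sketched here as an assumption rather than as part of the answer, is to pass eval a local namespace that resolves every unknown name to its own string. The class NameAsString and the function eval_pairs are hypothetical names. This relies on eval accepting any mapping for its locals argument, and it still runs eval, so it is only suitable for trusted input:

```python
class NameAsString(dict):
    """Namespace in which any undefined name evaluates to its own string."""
    def __missing__(self, key):
        return key

def eval_pairs(text):
    # Empty builtins plus the custom locals: bare names like a become 'a'.
    pairs = eval(text, {"__builtins__": {}}, NameAsString())
    return dict(pairs)

print(eval_pairs('(a,1.0),(b,6.0),(c,10.0)'))
# {'a': 1.0, 'b': 6.0, 'c': 10.0}
```

For untrusted strings, quoting the keys and using ast.literal_eval avoids executing anything.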