Regex : Remove all commas between a quote separated string [python]

Submitted by 依然范特西╮ on 2020-01-11 11:47:50

Question


What would be an appropriate regex to remove all commas inside the quoted parts of a string such as:

12, 1425073747, "test", "1, 2, 3, ... "

Result:

12, 1425073747, "test", "1 2 3 ... "

What I have that matches correctly:

"((\d+), )+\d+"

However, I obviously can't replace this with $1 $2. I can't use "\d+, \d+" because it will also match 12, 1425073747, which is not what I want. If someone can explain how to recursively parse out the values, that would be appreciated as well.


Answer 1:


This should work for you:

>>> import re
>>> text = '12, 1425073747, "test", "1, 2, 3, ... "'
>>> print(re.sub(r'(?!(([^"]*"){2})*[^"]*$),', "", text))
12, 1425073747, "test", "1 2 3 ... "

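The negative lookahead (?!(([^"]*"){2})*[^"]*$) rejects any comma that is followed by an even number of quotes up to the end of the string. Only commas followed by an odd number of quotes, that is, commas inside a quoted string, are matched and removed. This assumes the quotes in the input are balanced.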
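To make the quote-counting idea concrete, here is a minimal sketch that wraps the same pattern in a helper. The names COMMA_IN_QUOTES and strip_quoted_commas are made up for illustration and are not part of the original answer.

```python
import re

# Matches a comma only when an odd number of double quotes follows it,
# i.e. when the comma sits inside a quoted field (assumes balanced quotes).
COMMA_IN_QUOTES = re.compile(r'(?!(([^"]*"){2})*[^"]*$),')

def strip_quoted_commas(text):
    """Remove commas that appear inside double-quoted substrings."""
    return COMMA_IN_QUOTES.sub("", text)

text = '12, 1425073747, "test", "1, 2, 3, ... "'
print(strip_quoted_commas(text))  # 12, 1425073747, "test", "1 2 3 ... "

# Count the quotes after each comma to see why it is kept or removed.
for m in re.finditer(",", text):
    quotes_after = text.count('"', m.end())
    print(m.start(), quotes_after, "removed" if quotes_after % 2 else "kept")
```

Note that the approach relies on balanced quotes: with a stray quote, as in 'a, "b, c', the first comma is followed by one quote and gets removed even though it is outside any quoted string.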




Answer 2:


You can use re.sub with the simple regex r'"[^"]*"' and pass a callable as the replacement argument. re.sub calls it with each match object, so you can manipulate the matched text further:

import re
text = '12, 1425073747, "test", "1, 2, 3, ... "'
print( re.sub(r'"[^"]*"', lambda x: x.group().replace(",", ""), text) )

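If the lambda is hard to read, the same idea can be spelled out with a named function. This is only a sketch; drop_commas is an illustrative name, and the re.findall call is there just to show which substrings the callable receives.

```python
import re

def drop_commas(match):
    # match.group() is the whole quoted substring, quotes included.
    return match.group().replace(",", "")

text = '12, 1425073747, "test", "1, 2, 3, ... "'

# The substrings that re.sub will hand to drop_commas, one match at a time.
print(re.findall(r'"[^"]*"', text))  # ['"test"', '"1, 2, 3, ... "']

result = re.sub(r'"[^"]*"', drop_commas, text)
print(result)  # 12, 1425073747, "test", "1 2 3 ... "
```

Text outside the quotes is never passed to the callable, so the commas separating the fields are left untouched.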

If the string between quotes may contain escaped quotes, use:

re.sub(r'(?s)"[^"\\]*(?:\\.[^"\\]*)*"', lambda x: x.group().replace(",", ""), text)

Here, (?s) is the inline version of the re.S / re.DOTALL flag, which lets the . in \\. match an escaped newline as well, and the rest is the usual pattern for matching a double-quoted string literal with backslash escapes.
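As a rough illustration of the difference, the sketch below writes the same pattern with re.VERBOSE so each part can be commented, and compares it with the simple pattern on a made-up input containing escaped quotes. The sample string and the name QUOTED are assumptions for the example.

```python
import re

# Same pattern as above, spread out with re.VERBOSE for readability.
QUOTED = re.compile(r'''
    "                # opening quote
    [^"\\]*          # run of characters that are neither quote nor backslash
    (?:\\.[^"\\]*)*  # a backslash escape, then another plain run, repeated
    "                # closing quote
''', re.VERBOSE | re.DOTALL)

def drop_commas(match):
    return match.group().replace(",", "")

# Hypothetical input: the quoted field contains \" escapes.
text = r'7, "say \"hi, there\", ok", 8'

escaped_aware = QUOTED.sub(drop_commas, text)
simple = re.sub(r'"[^"]*"', drop_commas, text)

print(escaped_aware)  # 7, "say \"hi there\" ok", 8
print(simple)         # 7, "say \"hi, there\" ok", 8
```

The simple pattern treats every quote as a delimiter, so it splits the field at the escaped quotes and misses the comma in hi, there. The escape-aware pattern sees the field as one match.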

Bonus

  • Remove all whitespace between double quotes: re.sub(r'"[^"]*"', lambda x: ''.join(x.group().split()), text)
  • Remove all non-digit chars inside double quotes (this also drops the quotes themselves, since they are part of the match): re.sub(r'"[^"]*"', lambda x: ''.join(c for c in x.group() if c.isdigit()), text)
  • Remove all digit chars inside double quotes: re.sub(r'"[^"]*"', lambda x: ''.join(c for c in x.group() if not c.isdigit()), text)
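For reference, here is what the three bonus snippets produce on the sample string from the question. The last function, digits_keep_quotes, is a hypothetical variant (not from the original answer) that filters only the text between the quotes and puts the quotes back.

```python
import re

text = '12, 1425073747, "test", "1, 2, 3, ... "'
quoted = r'"[^"]*"'

no_space = re.sub(quoted, lambda x: ''.join(x.group().split()), text)
only_digits = re.sub(quoted, lambda x: ''.join(c for c in x.group() if c.isdigit()), text)
no_digits = re.sub(quoted, lambda x: ''.join(c for c in x.group() if not c.isdigit()), text)

# Hypothetical variant: filter only the text between the quotes,
# then restore the surrounding quotes.
def digits_keep_quotes(m):
    inner = m.group()[1:-1]
    return '"' + ''.join(c for c in inner if c.isdigit()) + '"'

kept = re.sub(quoted, digits_keep_quotes, text)

print(no_space)     # 12, 1425073747, "test", "1,2,3,..."
print(only_digits)  # 12, 1425073747, , 123
print(no_digits)    # 12, 1425073747, "test", ", , , ... "
print(kept)         # 12, 1425073747, "", "123"
```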


Source: https://stackoverflow.com/questions/28775048/regex-remove-all-commas-between-a-quote-separated-string-python
