Splitting digits into groups of threes, from right to left using regular expressions

Submitted by 懵懂的女人 on 2019-12-02 15:03:51

Question


I have a string '1234567890' that I want to split into groups of three, starting from the right, with the leftmost group holding one to three digits (depending on how many digits are left over).

Essentially, it's the same procedure as adding commas to a long number, except that I also want to extract the last three digits.

I tried using look-arounds but couldn't figure out a way to get the last three digits.

import re

string = '1234567890'
pattern = re.compile(r'\d{1,3}(?=(?:\d{3})+$)')
re.findall(pattern, string)

['1', '234', '567']

Expected output is (I don't need commas):

 ['1', '234', '567', '890']

Answer 1:


Notice that if we add commas from right to left after each complete group of three digits, the task reduces to a regex replacement: replace every three digits with those three digits followed by a comma. In the code snippet below, I reverse the string, insert the commas, reverse it again, and then split on the commas to arrive at the output we want.

import re

string = '1234567890'
string = re.sub(r'(?=\d{4})(\d{3})', r'\1,', string[::-1])[::-1]
print(string.split(','))
string = '123456789'
string = re.sub(r'(?=\d{4})(\d{3})', r'\1,', string[::-1])[::-1]
print(string.split(','))

Output:

['1', '234', '567', '890']
['123', '456', '789']

One part of the regex used for replacement might warrant further explanation. I added a positive lookahead (?=\d{4}) to the start of the pattern. This is there to ensure that we don't add a comma after a final group of three digits, should that occur.
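
To see what the lookahead guards against, here is a small sketch that runs the substitution with and without it. The helper names with_guard and without_guard are made up for this illustration and are not part of the answer; the input is assumed to contain only digits.

```python
import re

def with_guard(digits):
    # Reverse, add a comma after each group of three that is followed by
    # at least one more digit, reverse back, then split on the commas.
    return re.sub(r'(?=\d{4})(\d{3})', r'\1,', digits[::-1])[::-1].split(',')

def without_guard(digits):
    # Same steps, but with the (?=\d{4}) lookahead removed.
    return re.sub(r'(\d{3})', r'\1,', digits[::-1])[::-1].split(',')

print(with_guard('123456789'))     # ['123', '456', '789']
print(without_guard('123456789'))  # ['', '123', '456', '789']
```

Without the lookahead, an input whose length is a multiple of three gets a comma after its last reversed group, which shows up as a stray empty string at the front of the list.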

Answer 2:


It is actually easier to operate on a reversed string, which lets you track groups of three digits that still have more digits to follow (using the positive lookahead (?=\d)):

import re

for s in ('123','1234','123456789','1234567890'):
    print(re.sub(r'(\d\d\d)(?=\d)',r'\1,',s[::-1])[::-1])

Or a negative lookahead version:

for s in ('123','1234','123456789','1234567890'):
    print(re.sub(r'(\d\d\d)(?!$)',r'\1,',s[::-1])[::-1])

Either prints:

123
1,234
123,456,789
1,234,567,890

Applying a reversed regex on a reversed string is called a sexeger in Perl ;-)

You can also do a lookahead version that does not require reversing the string:

for s in ('123','1234','123456789','1234567890'):
    print(re.sub(r'(\d)(?=(\d{3})+$)',r'\1,',s))
# same output

To get a list of groups instead (as requested in the comment), insert a delimiter that cannot occur in the input and then .split on it:

>>> for s in ('123','1234','123456789','1234567890'):
...     re.sub(r'(\d)(?=(\d{3})+$)',r'\1\t',s).split('\t')
... 
['123']
['1', '234']
['123', '456', '789']
['1', '234', '567', '890']
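
The delimiter round-trip can also be avoided. As a sketch that is not part of the original answer: changing the + in the question's lookahead to * allows zero trailing groups of three, so re.findall picks up the final three digits as well. The function name split_threes is hypothetical, and the input is assumed to contain only digits.

```python
import re

def split_threes(digits):
    # Match 1-3 digits that are followed by zero or more complete
    # groups of three digits running up to the end of the string.
    return re.findall(r'\d{1,3}(?=(?:\d{3})*$)', digits)

for s in ('123', '1234', '123456789', '1234567890'):
    print(split_threes(s))
```

This prints the same four lists as above. The greedy \d{1,3} backtracks at the first position until the remaining length is a multiple of three, which is what makes the leftmost group the short one.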

Or, skip the regex and just do it with plain Python slicing:

for s in ('123','1234','123456789','1234567890'):
    s=s[::-1]
    n=3
    print([s[i:i+n][::-1] for i in range(0,len(s),n)][::-1])
# same output
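
The same grouping can also be computed without reversing anything, by working out the length of the leftmost group first. This is only a sketch of a variation on the slicing approach; the function name group_from_right and its parameter n are hypothetical.

```python
def group_from_right(s, n=3):
    # The leftmost group takes whatever is left over after filling
    # complete groups of n from the right (possibly nothing).
    head = len(s) % n
    groups = [s[:head]] if head else []
    # The rest of the string splits evenly into chunks of n.
    groups += [s[i:i + n] for i in range(head, len(s), n)]
    return groups

for s in ('123', '1234', '123456789', '1234567890'):
    print(group_from_right(s))
```

Because it never inspects the characters, this version works for any string, not just digits, and returns an empty list for an empty string.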


Source: https://stackoverflow.com/questions/45950494/splitting-digits-into-groups-of-threes-from-right-to-left-using-regular-express
