Python - extracting split list correctly

Posted by 主宰稳场 on 2019-12-12 19:20:00

Question


As a follow-up to a previous question, I have a list such as ['(', '44', '+(', '3', '+', 'll', '))'], which was created with re.findall('\w+|\W+', item). Two of its elements are wrong, though: '+(' and '))' each contain several operators grouped into one string.

Is there a Pythonic way to split just the operators, so that the list becomes ['(', '44', '+', '(', '3', '+', 'll', ')', ')']?

(Keep the digits/letters together, separate the symbols.)

Thanks


Answer 1:


Short solution using str.join() and re.split() functions:

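import re
l = ['(', '44', '+(', '3', '+', 'll', '))']
# Join the pieces into one string, split on digits, lowercase letters or single
# symbols (the capture group keeps the matches), then drop the empty strings.
new_list = [i for i in re.split(r'(\d+|[a-z]+|[^\w])', ''.join(l)) if i.strip()]

print(new_list)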

The output:

['(', '44', '+', '(', '3', '+', 'll', ')', ')']
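
Since the list is joined back into a single string anyway, the same result can be sketched with one re.findall call. This is only an illustration, not the answerer's code: the helper name tokenize is hypothetical, and unlike the pattern above it treats letters, digits and underscores as one kind of token (so 'a1' would stay together instead of being split into 'a' and '1').

```python
import re

def tokenize(parts):
    """Hypothetical helper: join the pieces, then re-split them into tokens."""
    # \w+ keeps runs of letters/digits/underscores together;
    # a bare \W matches exactly one non-word character at a time.
    # The strip() filter drops tokens that are only whitespace.
    return [tok for tok in re.findall(r'\w+|\W', ''.join(parts)) if tok.strip()]

tokens = tokenize(['(', '44', '+(', '3', '+', 'll', '))'])
print(tokens)
# ['(', '44', '+', '(', '3', '+', 'll', ')', ')']
```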



Answer 2:


An alternative would be to change the regex so that each non-alphanumeric character is kept separate:

import re
lst = ['z+2-44', '4+(55+z)+88']
# Raw string avoids invalid-escape warnings; \W without + matches one symbol at a time.
print([re.findall(r'\w+|\W', s) for s in lst])

# [['z', '+', '2', '-', '44'], ['4', '+', '(', '55', '+', 'z', ')', '+', '88']]



Answer 3:


You want to split each group of consecutive non-alphanumeric characters into individual characters.

I would turn each item into a one-element list if it is alphanumeric, or into a list of its characters if it is a sequence of symbols.

Then I'd flatten the result to get the list you asked for:

import itertools

l = ['(', '44', '+(', '3', '+', 'll', '))']
# [x] keeps an alphanumeric item whole; list(x) explodes a run of symbols into characters.
new_l = list(itertools.chain.from_iterable([x] if x.isalnum() else list(x) for x in l))
print(new_l)

Result:

['(', '44', '+', '(', '3', '+', 'll', ')', ')']
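
One caveat with the isalnum() test: \w also matches the underscore, so a token such as 'a_b' produced by the original regex is not alphanumeric and would be broken into characters. The sketch below is a variant under the assumption that underscores should stay inside identifiers; the helper name split_symbols is hypothetical and not part of the answer.

```python
import itertools
import re

def split_symbols(items):
    """Hypothetical helper: keep word tokens whole, explode symbol runs."""
    return list(itertools.chain.from_iterable(
        # fullmatch(r'\w+') accepts letters, digits and underscores,
        # mirroring the \w+ half of the regex that built the list.
        [x] if re.fullmatch(r'\w+', x) else list(x)
        for x in items
    ))

result = split_symbols(['(', 'a_b', '+(', '3', '))'])
print(result)
# ['(', 'a_b', '+', '(', '3', ')', ')']
```

For input that contains no underscores, this behaves the same as the isalnum() version.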

EDIT: you could actually answer both of your questions at once (adapting the regex answer from the original question) by not grouping the symbols in the regex:

import re
lst = ['z+2-44', '4+55+((z+88))']
print([re.findall(r'\w+|\W', s) for s in lst])

(Note the lack of + after \W.) This gives you directly:

[['z', '+', '2', '-', '44'], ['4', '+', '55', '+', '(', '(', 'z', '+', '88', ')', ')']]
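
To make the effect of that missing + concrete, here is a small side-by-side comparison on one of the strings above. It is only a sketch; the variable names grouped and single are chosen for illustration.

```python
import re

expr = '4+55+((z+88))'

# \W+ is greedy: adjacent symbols are swallowed into a single token.
grouped = re.findall(r'\w+|\W+', expr)

# \W matches exactly one non-word character, so every symbol stands alone.
single = re.findall(r'\w+|\W', expr)

print(grouped)  # ['4', '+', '55', '+((', 'z', '+', '88', '))']
print(single)   # ['4', '+', '55', '+', '(', '(', 'z', '+', '88', ')', ')']
```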



Answer 4:


Try this:

import re
lst = ['z+2-44', '4+(55+z)+88']
print([re.findall(r'\w+|\W', s) for s in lst])

Maybe it helps others.



Source: https://stackoverflow.com/questions/42333875/python-extracting-split-list-correctly
