Question
I would like to split a string only where there are two or more consecutive whitespace characters.
For example:
str = '10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'
print(str.split())
Results:
['10DEUTSCH', 'GGS', 'Neue', 'Heide', '25-27', 'Wahn-Heide', '-1', '-1']
I would like it to look like this:
['10DEUTSCH', 'GGS Neue Heide 25-27', 'Wahn-Heide', '-1', '-1']
Answer 1:
In [4]: import re
In [5]: text = '10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'
In [7]: re.split(r'\s{2,}', text)
Out[7]: ['10DEUTSCH', 'GGS Neue Heide 25-27', 'Wahn-Heide', '-1', '-1']
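As a minimal sketch of how the pattern behaves, the example below assumes the fields are separated by runs of two or more spaces. The helper name split_on_gaps and its spaces_only flag are hypothetical, introduced only for illustration. Note that \s also matches tabs and newlines, whereas a literal space followed by {2,} matches spaces only.

```python
import re

def split_on_gaps(text, spaces_only=False):
    """Split text wherever two or more whitespace characters occur in a row."""
    # r' {2,}' matches literal spaces only; r'\s{2,}' also matches tabs and newlines.
    pattern = r' {2,}' if spaces_only else r'\s{2,}'
    return re.split(pattern, text)

# Assumed sample: fields separated by runs of at least two spaces.
line = '10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'
print(split_on_gaps(line))

# A space followed by a tab counts as a gap for \s{2,} but not for ' {2,}'.
mixed = 'a \tb  c'
print(split_on_gaps(mixed))
print(split_on_gaps(mixed, spaces_only=True))
```

Which variant fits depends on whether tabs should count as separators in the input.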
Answer 2:
As has been pointed out, str is not a good name for your string, so using words instead:
output = [s.strip() for s in words.split('  ') if s]
The .split('  ') call (with two spaces as the separator) gives you a list that includes empty strings as well as items with leading or trailing whitespace. The list comprehension iterates through that list, keeps only the non-empty items (if s), and .strip() removes any leading or trailing whitespace.
Answer 3:
In [30]: strs = '10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'
In [38]: filter(None, strs.split("  "))
Out[38]: ['10DEUTSCH', 'GGS Neue Heide 25-27', ' Wahn-Heide', ' -1', '-1']
In [32]: map(str.strip, filter(None, strs.split("  ")))
Out[32]: ['10DEUTSCH', 'GGS Neue Heide 25-27', 'Wahn-Heide', '-1', '-1']
For Python 3, wrap the results of filter and map with list to force iteration.
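A short sketch of the Python 3 form, assuming the same sample line with runs of spaces between fields (the variable names pieces and fields are illustrative only):

```python
strs = '10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'

# In Python 3, filter() and map() return lazy iterators,
# so list() is needed to materialise the results.
pieces = list(filter(None, strs.split('  ')))   # drop the empty strings
fields = list(map(str.strip, pieces))           # trim leftover single spaces

print(pieces)
print(fields)
```

Runs with an odd number of spaces leave a single leading space on the following item, which is why the strip step is still needed.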
Answer 4:
In the case of:
- mixed tabs and spaces
- blanks at start and/or at end of the string
(originally an answer to "Split string at whitespace longer than a single space and tab characters, Python")
I would split with a regular expression on 2 or more whitespace characters, then filter out the empty strings that re.split yields:
import re
s = '         1. 1. 2.     1 \tNote#EvE\t \t1\t \tE3\t \t  64\t        1. 3. 2. 120 \n'
result = [x for x in re.split(r"\s{2,}", s) if x]
print(result)
prints:
['1. 1. 2.', '1', 'Note#EvE', '1', 'E3', '64', '1. 3. 2. 120']
This isn't going to preserve leading/trailing spaces, but it's close.
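One possible variation, sketched here under the assumption that leading and trailing whitespace on the whole line is never wanted: strip the line first, so re.split produces no empty strings at the edges and the filtering step is not needed. The function name split_fields is hypothetical.

```python
import re

def split_fields(line):
    """Split a line on runs of two or more whitespace characters."""
    # Strip outer whitespace first so re.split yields no empty edge strings.
    stripped = line.strip()
    # An all-whitespace line would otherwise yield [''], so return [] for it.
    return re.split(r'\s{2,}', stripped) if stripped else []

s = '  1. 1. 2.   1 \tNote#EvE\t \t1 \n'
print(split_fields(s))
```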
Source: https://stackoverflow.com/questions/61050576/split-string-at-whitespace-longer-than-a-single-space-and-tab-characters-python