I have a phone number(string), e.g. \"+123-456-7890\", that I want to turn into a list that looks like: [+, 1, 2, 3, -, ...., 0].
Why? So I can go iterate through t
You can use str.translate
, you just have to give it the right arguments:
>>> dels=''.join(chr(x) for x in range(256) if not chr(x).isdigit())
>>> '+1-617-555-1212'.translate(None, dels)
'16175551212'
N.b.: This won't work with unicode strings in Python2, or at all in Python3. For those environments, you can create a custom class to pass to unicode.translate
:
>>> class C:
... def __getitem__(self, i):
... if unichr(i).isdigit():
... return i
...
>>> u'+1-617.555/1212'.translate(C())
u'16175551212'
This works with non-ASCII digits, too:
>>> print u'+\u00b9-\uff1617.555/1212'.translate(C()).encode('utf-8')
¹6175551212
You can use the re module:
import re
re.sub(r'\D', '', '+123-456-7890')
This will replace all non-digits with ''.
Did you try list(x)??
y = '+123-456-7890'
c =list(y)
c
['+', '1', '2', '3', '-', '4', '5', '6', '-', '7', '8', '9', '0']
I know this question has been answered, but just to point out what timeit
has to say about the solutions efficiency. Using these parameters:
size = 30
s = [str(random.randint(0, 9)) for i in range(size)] + (size/3) * ['-']
random.shuffle(s)
s = ''.join(['+'] + s)
timec = 1000
That is the "phone number" has 30 digits, 1 plus sing and 10 '-'. I've tested these approaches:
def justdigits(s):
justdigitsres = ""
for char in s:
if char.isdigit():
justdigitsres += str(char)
return justdigitsres
re_compiled = re.compile(r'\D')
print('Filter: %ss' % timeit.Timer(lambda : ''.join(filter(str.isdigit, s))).timeit(timec))
print('GE: %ss' % timeit.Timer(lambda : ''.join(n for n in s if n.isdigit())).timeit(timec))
print('LC: %ss' % timeit.Timer(lambda : ''.join([n for n in s if n.isdigit()])).timeit(timec))
print('For loop: %ss' % timeit.Timer(lambda : justdigits(s)).timeit(timec))
print('RE: %ss' % timeit.Timer(lambda : re.sub(r'\D', '', s)).timeit(timec))
print('REC: %ss' % timeit.Timer(lambda : re_compiled.sub('', s)).timeit(timec))
print('Translate: %ss' % timeit.Timer(lambda : s.translate(None, '+-')).timeit(timec))
And came out with these results:
Filter: 0.0145790576935s
GE: 0.0185861587524s
LC: 0.0151798725128s
For loop: 0.0242128372192s
RE: 0.0120108127594s
REC: 0.00868797302246s
Translate: 0.00118899345398s
Apparently GEs and LCs are still slower than a regex or a compiled regex. And apparently my CPython 2.6.6 didn't optimize the string addition that much. translate
appears to be the fastest (which is expected as the problem is stated as "ignore these two symbols", rather than "get these numbers" and I believe is quite low-level).
And for size = 100
:
Filter: 0.0357120037079s
GE: 0.0465779304504s
LC: 0.0428011417389s
For loop: 0.0733139514923s
RE: 0.0213229656219s
REC: 0.0103371143341s
Translate: 0.000978946685791s
And for size = 1000
:
Filter: 0.212141036987s
GE: 0.198996067047s
LC: 0.196880102158s
For loop: 0.365696907043s
RE: 0.0880808830261s
REC: 0.086804151535s
Translate: 0.00587010383606s
You mean that you want something like:
''.join(n for n in phone_str if n.isdigit())
This uses the fact that strings are iterable. They yield 1 character at a time when you iterate over them.
Regarding your efforts,
This one actually removes all of the digits from the string leaving you with only non-digits.
x = row.translate(None, string.digits)
This one splits the string on runs of whitespace, not after each character:
list = x.split()
''.join(filter(str.isdigit, "+123-456-7890"))