Convert string (without any separator) to list

挽巷 2020-12-09 17:09

I have a phone number (a string), e.g. "+123-456-7890", that I want to turn into a list that looks like: [+, 1, 2, 3, -, ..., 0].

Why? So I can go iterate through t

9 Answers
  • 2020-12-09 17:16

    You can use str.translate, you just have to give it the right arguments:

    >>> dels=''.join(chr(x) for x in range(256) if not chr(x).isdigit())
    >>> '+1-617-555-1212'.translate(None, dels)
    '16175551212'
    

    N.B.: This won't work with unicode strings in Python 2, or at all in Python 3, where str.translate accepts only a single table argument. For those environments, you can create a custom class to pass to unicode.translate:

    >>> class C:
    ...    def __getitem__(self, i):
    ...       if unichr(i).isdigit():
    ...          return i
    ... 
    >>> u'+1-617.555/1212'.translate(C())
    u'16175551212'
    

    This works with non-ASCII digits, too:

    >>> print u'+\u00b9-\uff1617.555/1212'.translate(C()).encode('utf-8')
    ¹6175551212
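
    For Python 3, where str.translate accepts only a single table argument, a minimal sketch of the same idea might look like the following. The helper name keep_digits is an assumption for illustration, not part of the answer above; it builds a deletion table from the characters actually present in the input.

```python
def keep_digits(text):
    """Return text with every non-digit character removed (Python 3)."""
    # str.translate deletes any code point that maps to None.
    table = {ord(ch): None for ch in set(text) if not ch.isdigit()}
    return text.translate(table)


print(keep_digits('+1-617-555-1212'))   # 16175551212
print(keep_digits('+1-617.555/1212'))   # 16175551212
```

    Because the table is built with str.isdigit, non-ASCII digits are kept as well, in line with the Python 2 class shown above.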
    
  • 2020-12-09 17:19

    You can use the re module:

    import re
    re.sub(r'\D', '', '+123-456-7890')
    

    This will replace all non-digits with ''.
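
    If the goal is the list itself rather than a cleaned string, re.findall can collect the digits directly. A small sketch, using the phone number from the question; the variable names are illustrative only:

```python
import re

phone = '+123-456-7890'

# Remove every non-digit character, as in the answer above.
digits_only = re.sub(r'\D', '', phone)

# Or collect each digit as a separate list element.
digit_list = re.findall(r'\d', phone)

print(digits_only)  # 1234567890
print(digit_list)   # ['1', '2', '3', '4', '5', '6', '7', '8', '9', '0']
```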

  • 2020-12-09 17:19

    Did you try list(x)?

    y = '+123-456-7890'
    c = list(y)
    c
    

    ['+', '1', '2', '3', '-', '4', '5', '6', '-', '7', '8', '9', '0']
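
    If the digits should end up as integers while the separators stay as strings, which is one possible reading of the list shown in the question, a comprehension over the string is enough. This is a sketch under that assumption; str.isdecimal is used so that int() never sees a character it cannot convert.

```python
y = '+123-456-7890'

# Keep separators as strings, convert decimal digits to int.
mixed = [int(ch) if ch.isdecimal() else ch for ch in y]

print(mixed)  # ['+', 1, 2, 3, '-', 4, 5, 6, '-', 7, 8, 9, 0]
```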

  • 2020-12-09 17:21

    I know this question has been answered, but I just want to point out what timeit has to say about the solutions' efficiency. Using these parameters:

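    import random
    import re
    import timeit

    size = 30  # number of digits in the test string
    s = [str(random.randint(0, 9)) for i in range(size)] + (size // 3) * ['-']
    random.shuffle(s)
    s = ''.join(['+'] + s)
    timec = 1000  # number of timeit repetitions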
    

    That is, the "phone number" has 30 digits, 1 plus sign and 10 '-' characters. I tested these approaches:

    def justdigits(s):
        justdigitsres = ""
        for char in s:
            if char.isdigit():
                justdigitsres += str(char)
        return justdigitsres
    
    re_compiled = re.compile(r'\D')
    
    print('Filter: %ss' % timeit.Timer(lambda : ''.join(filter(str.isdigit, s))).timeit(timec))
    print('GE: %ss' % timeit.Timer(lambda : ''.join(n for n in s if n.isdigit())).timeit(timec))
    print('LC: %ss' % timeit.Timer(lambda : ''.join([n for n in s if n.isdigit()])).timeit(timec))
    print('For loop: %ss' % timeit.Timer(lambda : justdigits(s)).timeit(timec))
    print('RE: %ss' % timeit.Timer(lambda : re.sub(r'\D', '', s)).timeit(timec))
    print('REC: %ss' % timeit.Timer(lambda : re_compiled.sub('', s)).timeit(timec))
    print('Translate: %ss' % timeit.Timer(lambda : s.translate(None, '+-')).timeit(timec))
    

    And came out with these results:

    Filter: 0.0145790576935s
    GE: 0.0185861587524s
    LC: 0.0151798725128s
    For loop: 0.0242128372192s
    RE: 0.0120108127594s
    REC: 0.00868797302246s
    Translate: 0.00118899345398s
    

    Apparently generator expressions (GE) and list comprehensions (LC) are still slower than a regex, compiled or not. And apparently my CPython 2.6.6 didn't optimize the string concatenation in the for loop that much. translate appears to be the fastest, which is expected: there the problem is stated as "ignore these two symbols" rather than "get these digits", and I believe the implementation is quite low-level.

    And for size = 100:

    Filter: 0.0357120037079s
    GE: 0.0465779304504s
    LC: 0.0428011417389s
    For loop: 0.0733139514923s
    RE: 0.0213229656219s
    REC: 0.0103371143341s
    Translate: 0.000978946685791s
    

    And for size = 1000:

    Filter: 0.212141036987s
    GE: 0.198996067047s
    LC: 0.196880102158s
    For loop: 0.365696907043s
    RE: 0.0880808830261s
    REC: 0.086804151535s
    Translate: 0.00587010383606s
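
    The benchmark above is written for Python 2. A rough Python 3 sketch of the same comparison might look like this; str.maketrans('', '', '+-') stands in for the two-argument translate call, the dictionary name candidates is an assumption, and the timings will differ from the figures quoted above, so none are shown here.

```python
import random
import re
import timeit

size = 30
chars = [str(random.randint(0, 9)) for _ in range(size)] + (size // 3) * ['-']
random.shuffle(chars)
s = '+' + ''.join(chars)

re_compiled = re.compile(r'\D')
# The third argument of str.maketrans lists the characters to delete.
delete_table = str.maketrans('', '', '+-')

candidates = {
    'Filter': lambda: ''.join(filter(str.isdigit, s)),
    'GE': lambda: ''.join(n for n in s if n.isdigit()),
    'LC': lambda: ''.join([n for n in s if n.isdigit()]),
    'RE': lambda: re.sub(r'\D', '', s),
    'REC': lambda: re_compiled.sub('', s),
    'Translate': lambda: s.translate(delete_table),
}

for name, func in candidates.items():
    seconds = timeit.timeit(func, number=1000)
    print('%s: %.6fs' % (name, seconds))
```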
    
  • 2020-12-09 17:28

    You mean that you want something like:

    ''.join(n for n in phone_str if n.isdigit())
    

    This uses the fact that strings are iterable. They yield 1 character at a time when you iterate over them.
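
    To make the iteration point concrete, here is a small sketch; phone_str is given the value from the question, which is an assumption since the answer does not define it.

```python
phone_str = '+123-456-7890'

# Iterating over a string yields one character at a time.
for ch in phone_str[:4]:
    print(ch)

# The generator expression keeps only the digit characters.
digits = ''.join(n for n in phone_str if n.isdigit())
print(digits)  # 1234567890
```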


    Regarding your efforts,

    This one actually removes all of the digits from the string leaving you with only non-digits.

    x = row.translate(None, string.digits)
    

    This one splits the string on runs of whitespace, not after each character:

    list = x.split()
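
    The difference between these calls can be seen in a short Python 3 sketch. The variable row is assumed to hold the phone number, and str.maketrans replaces the Python 2 two-argument translate form used in the question.

```python
import string

row = '+123-456-7890'

# Deleting string.digits leaves only the non-digit characters.
non_digits = row.translate(str.maketrans('', '', string.digits))
print(non_digits)        # +--

# split() with no arguments breaks on whitespace, so a string
# without spaces comes back as a single-element list.
print(row.split())       # ['+123-456-7890']

# list() breaks the string into individual characters.
print(list(non_digits))  # ['+', '-', '-']
```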
    
  • 2020-12-09 17:28
    ''.join(filter(str.isdigit, "+123-456-7890"))
    