Fastest way to check if a string contains specific characters in any of the items in a list

隐瞒了意图╮ 2021-01-02 04:33

What is the fastest way to check if a string contains some characters from any items of a list?

Currently, I'm using this method:

lestring = "Text1


        
4 Answers
  •  滥情空心
    2021-01-02 05:15

    You can try a list comprehension with a membership check:

    >>> lestring = "Text123"
    >>> lelist = ["Text", "foo", "bar"]
    >>> [e for e in lelist if e in lestring]
    ['Text']
    

    Compared to your implementation, the list comprehension still loops implicitly, but it is faster because the `in` operator avoids the explicit method call that `count` requires.

    Compared to Joe's implementation, yours is much faster, because `filter` has to call two functions on every iteration: the lambda and `count`.

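    >>> import timeit
    >>> def joe(lelist, lestring):
        # Reconstructed from the description above: filter with a lambda that calls count
        return list(filter(lambda x: lestring.count(x), lelist))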
    
    >>> def uz(lelist, lestring):
        for x in lelist:
            if lestring.count(x):
                return 'Yep. "%s" contains characters from "%s" item.' % (lestring, x)
    
    
    >>> def ab(lelist, lestring):
        return [e for e in lelist if e in lestring]
    
    >>> t_ab = timeit.Timer("ab(lelist, lestring)", setup="from __main__ import lelist, lestring, ab")
    >>> t_uz = timeit.Timer("uz(lelist, lestring)", setup="from __main__ import lelist, lestring, uz")
    >>> t_joe = timeit.Timer("joe(lelist, lestring)", setup="from __main__ import lelist, lestring, joe")
    >>> t_ab.timeit(100000)
    0.09391469893125759
    >>> t_uz.timeit(100000)
    0.1528471407273173
    >>> t_joe.timeit(100000)
    1.4272649857800843
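
    The absolute numbers above depend on the machine and the Python version, so it can help to rerun the comparison as a standalone script. The following is a minimal sketch, not the original benchmark: `best_time` is a hypothetical helper, `uz` is simplified to return the matching item instead of a message, and taking the minimum over several `timeit.repeat` batches is just one common way to reduce noise.

```python
import timeit

lestring = "Text123"
lelist = ["Text", "foo", "bar"]

def uz(lelist, lestring):
    # Loop with str.count, returning the first matching item (or None).
    for x in lelist:
        if lestring.count(x):
            return x
    return None

def ab(lelist, lestring):
    # List comprehension with the `in` membership test.
    return [e for e in lelist if e in lestring]

def best_time(func, number=10000, repeat=5):
    # Hypothetical helper: run `repeat` batches of `number` calls each
    # and keep the fastest batch, which is least affected by system noise.
    return min(timeit.repeat(lambda: func(lelist, lestring),
                             number=number, repeat=repeat))

for func in (ab, uz):
    print(f"{func.__name__}: {best_time(func):.4f} s")
```

    The lambda wrapper adds the same small overhead to every candidate, so the ranking should still be comparable even though the absolute times will differ from those in the REPL session.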
    

    The solution Jamie posted in the comments is slower for shorter strings. Here is the test result:

    >>> import itertools
    >>> def jamie(lelist, lestring):
        return next(itertools.chain((e for e in lelist if e in lestring), (None,))) is not None
    
    >>> t_jamie = timeit.Timer("jamie(lelist, lestring)", setup="from __main__ import lelist, lestring, jamie")
    >>> t_jamie.timeit(100000)
    0.22237164127909637
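
    Jamie's expression chains a `None` sentinel onto the generator so that `next` always has something to return. As a sketch of what that achieves, the two-argument form of the built-in `next` gives the same result without `itertools.chain`. The name `jamie_default` is made up for this illustration, and no timing claim is made for it.

```python
import itertools

def jamie(lelist, lestring):
    # Original form: append a None sentinel so next() never raises StopIteration.
    return next(itertools.chain((e for e in lelist if e in lestring), (None,))) is not None

def jamie_default(lelist, lestring):
    # Hypothetical variant: let next() supply the sentinel through its default argument.
    return next((e for e in lelist if e in lestring), None) is not None

print(jamie(["Text", "foo", "bar"], "Text123"))          # match found
print(jamie_default(["Text", "foo", "bar"], "Text123"))  # same answer
print(jamie_default(["foo", "bar"], "Text123"))          # no match
```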
    

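    If you need Boolean values, for shorter strings just modify the list comprehension above. The result holds one `True` per matching item and is an empty (falsy) list when nothing matches: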

    [e in lestring for e in lelist if e in lestring]
    

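    Or, for longer strings, you can do the following, which stops at the first match. Note that `next` raises `StopIteration` when nothing matches unless you pass it a default value: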

    >>> next(e in lestring for e in lelist if e in lestring)
    True
    

    or

    >>> any(e in lestring for e in lelist)
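
    The reason `any` suits longer inputs is that it stops consuming the generator at the first `True`, and it simply returns `False` when nothing matches. The sketch below makes the short-circuit visible; `contains_any`, `traced`, and `checked` are names invented for this illustration.

```python
def contains_any(lelist, lestring):
    # True as soon as one item of lelist occurs in lestring, else False.
    return any(e in lestring for e in lelist)

checked = []

def traced(items):
    # Hypothetical helper: record each item just before it is tested.
    for item in items:
        checked.append(item)
        yield item

found = any(e in "Text123" for e in traced(["Text", "foo", "bar"]))
print(found)    # True
print(checked)  # ['Text'], because "foo" and "bar" were never tested

print(contains_any(["foo", "bar"], "Text123"))  # False, with no exception raised
```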
    
