Counting the number of True Booleans in a Python List

≯℡__Kan透↙ 提交于 2019-11-28 06:39:34

True is equal to 1.

>>> sum([True, True, False, False, False, True])
3
Mark Tolonen

list has a count method:

>>> [True,True,False].count(True)
2

This is actually more efficient than sum, as well as being more explicit about the intent, so there's no reason to use sum:

In [1]: import random

In [2]: x = [random.choice([True, False]) for i in range(100)]

In [3]: %timeit x.count(True)
970 ns ± 41.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [4]: %timeit sum(x)
1.72 µs ± 161 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

If you are only concerned with the constant True, a simple sum is fine. However, keep in mind that in Python other values evaluate as True as well. A more robust solution would be to use the bool builtin:

>>> l = [1, 2, True, False]
>>> sum(bool(x) for x in l)
3

UPDATE: Here's another similarly robust solution that has the advantage of being more transparent:

>>> sum(1 for x in l if x)
3

P.S. Python trivia: True could be true without being 1. Warning: do not try this at work!

>>> True = 2
>>> if True: print('true')
... 
true
>>> l = [True, True, False, True]
>>> sum(l)
6
>>> sum(bool(x) for x in l)
3
>>> sum(1 for x in l if x)
3

Much more evil:

True = False

You can use sum():

>>> sum([True, True, False, False, False, True])
3

Just for completeness' sake (sum is usually preferable), I wanted to mention that we can also use filter to get the truthy values. In the usual case, filter accepts a function as the first argument, but if you pass it None, it will filter for all "truthy" values. This feature is somewhat surprising, but is well documented and works in both Python 2 and 3.

The difference between the versions, is that in Python 2 filter returns a list, so we can use len:

>>> bool_list = [True, True, False, False, False, True]
>>> filter(None, bool_list)
[True, True, True]
>>> len(filter(None, bool_list))
3

But in Python 3, filter returns an iterator, so we can't use len, and if we want to avoid using sum (for any reason) we need to resort to converting the iterator to a list (which makes this much less pretty):

>>> bool_list = [True, True, False, False, False, True]
>>> filter(None, bool_list)
<builtins.filter at 0x7f64feba5710>
>>> list(filter(None, bool_list))
[True, True, True]
>>> len(list(filter(None, bool_list)))
3

It is safer to run through bool first. This is easily done:

>>> sum(map(bool,[True, True, False, False, False, True]))
3

Then you will catch everything that Python considers True or False into the appropriate bucket:

>>> allTrue=[True, not False, True+1,'0', ' ', 1, [0], {0:0}, set([0])]
>>> list(map(bool,allTrue))
[True, True, True, True, True, True, True, True, True]

If you prefer, you can use a comprehension:

>>> allFalse=['',[],{},False,0,set(),(), not True, True-1]
>>> [bool(i) for i in allFalse]
[False, False, False, False, False, False, False, False, False]

I prefer len([b for b in boollist if b is True]) (or the generator-expression equivalent), as it's quite self-explanatory. Less 'magical' than the answer proposed by Ignacio Vazquez-Abrams.

Alternatively, you can do this, which still assumes that bool is convertable to int, but makes no assumptions about the value of True: ntrue = sum(boollist) / int(True)

GMishx

After reading all the answers and comments on this question, I thought to do a small experiment.

I generated 50,000 random booleans and called sum and count on them.

Here are my results:

>>> a = [bool(random.getrandbits(1)) for x in range(50000)]
>>> len(a)
50000
>>> a.count(False)
24884
>>> a.count(True)
25116
>>> def count_it(a):
...   curr = time.time()
...   counting = a.count(True)
...   print("Count it = " + str(time.time() - curr))
...   return counting
... 
>>> def sum_it(a):
...   curr = time.time()
...   counting = sum(a)
...   print("Sum it = " + str(time.time() - curr))
...   return counting
... 
>>> count_it(a)
Count it = 0.00121307373046875
25015
>>> sum_it(a)
Sum it = 0.004102230072021484
25015

Just to be sure, I repeated it several more times:

>>> count_it(a)
Count it = 0.0013530254364013672
25015
>>> count_it(a)
Count it = 0.0014507770538330078
25015
>>> count_it(a)
Count it = 0.0013344287872314453
25015
>>> sum_it(a)
Sum it = 0.003480195999145508
25015
>>> sum_it(a)
Sum it = 0.0035257339477539062
25015
>>> sum_it(a)
Sum it = 0.003350496292114258
25015
>>> sum_it(a)
Sum it = 0.003744363784790039
25015

And as you can see, count is 3 times faster than sum. So I would suggest to use count as I did in count_it.

Python version: 3.6.7
CPU cores: 4
RAM size: 16 GB
OS: Ubuntu 18.04.1 LTS

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!