How to avoid an empty result with `Bag.take(n)` when using dask?
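Question

Context: The Dask documentation states clearly that `Bag.take()` only collects from the first partition. However, after a filter the first partition can be empty while other partitions are not.

Question: Is it possible to use `Bag.take()` so that it collects from enough partitions to return n items (or as many as are available, if fewer than n exist)?

Answer 1:

You could do something like the following:

    from toolz import concat, take

    # First pass: keep at most n items from each partition.
    per_partition = lambda seq: list(take(n, seq))
    # Second pass: flatten the per-partition lists, then keep the first n.
    combine = lambda parts: list(take(n, concat(parts)))
    b.reduction(per_partition, combine)

This takes the first n elements of each partition, gathers them together, and then takes the first n elements of the combined result. The aggregate step receives one list per partition, so it has to flatten them with `concat` before taking n; passing the same function to both steps would return a list of lists.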
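
To make the idea concrete, here is a minimal runnable sketch of the reduction approach. The toy bag, the helper name `take_across_partitions`, and the synchronous scheduler setting are assumptions for illustration only and are not part of the original post. It also shows the `npartitions` argument of `Bag.take`, which in reasonably recent dask versions accepts `-1` to mean "look at every partition"; check the documentation for the version you run.

```python
import dask
import dask.bag as db
from toolz import concat, take

# Assumption: run single-process so the toy example behaves the same everywhere.
dask.config.set(scheduler="synchronous")


def take_across_partitions(bag, n):
    """Hypothetical helper: return up to n items, looking at every partition."""
    # Per partition: keep at most n items.
    per_partition = lambda seq: list(take(n, seq))
    # Aggregate: flatten the per-partition lists, then keep the first n overall.
    combine = lambda parts: list(take(n, concat(parts)))
    return bag.reduction(per_partition, combine).compute()


# Toy data: five partitions of two items each; after the filter,
# the first three partitions are empty.
b = db.from_sequence(range(10), npartitions=5).filter(lambda x: x >= 6)

# Reduction-based approach from the answer.
first_three = take_across_partitions(b, 3)

# Built-in alternative: ask take() to consider all partitions.
also_three = b.take(3, npartitions=-1)

print(first_three, also_three)
```

Both calls return the first three items that survive the filter, even though the first partition is empty. The reduction version touches every partition but only ever moves at most n items per partition, while `take(..., npartitions=-1)` is shorter to write and returns a tuple rather than a list.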