itertools | 易学教程

Cartesian product of large iterators (itertools)

Question: From a previous question I learned something interesting. If Python's itertools.product is fed a series of iterators, these iterators are converted into tuples before the Cartesian product begins. Related questions examine the source code of itertools.product and conclude that, while no intermediate results are stored in memory, tuple versions of the original iterators are created before the product iteration begins. Question: Is there a way to create an iterator to a Cartesian product
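One possible direction, offered only as a sketch and not as the answer from the original thread: if every input can be restarted, the product can be generated without caching any input as a tuple. The function name lazy_product and the convention of passing zero-argument callables ("factories") are assumptions made for this illustration.

```python
def lazy_product(*factories):
    """Yield Cartesian-product tuples without storing any input.

    Each argument is a zero-argument callable (hypothetical convention)
    that returns a fresh iterator, so an input is restarted when needed
    instead of being cached as a tuple.
    """
    if not factories:
        # The product of zero inputs is a single empty tuple.
        yield ()
        return
    first, rest = factories[0], factories[1:]
    for item in first():
        # Inner inputs are re-created once per outer item.
        for tail in lazy_product(*rest):
            yield (item,) + tail


pairs = list(lazy_product(lambda: iter(range(2)), lambda: iter("ab")))
print(pairs)
```

The trade-off is that every input except the first is regenerated many times, so this only helps when restarting an input is cheaper than holding it in memory.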

How do I “multi-process” the itertools product module?

I tried calculating millions and millions of different combinations of the string below, but I was only calculating roughly 1,750 combinations a second, which isn't anywhere near the speed I need. How would I reshape this so that multiple processes calculate different parts, without recalculating parts that have already been calculated, while maintaining fast speeds? The code below is part of what I've been using. Any examples would be appreciated!

    from itertools import product
    for chars in product("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ12234567890!
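A common way to split this kind of search, sketched here under stated assumptions, is to give each worker process one fixed first character and let it enumerate only the remaining positions, so no combination is visited twice. ALPHABET, LENGTH, work_on_prefix and run_parallel are hypothetical names; the original string and the per-combination work are truncated in the excerpt, so a simple counter stands in for the real work.

```python
from itertools import product
from multiprocessing import Pool

ALPHABET = "abc"  # hypothetical small alphabet; the original string is truncated
LENGTH = 3        # hypothetical combination length


def work_on_prefix(prefix):
    """Enumerate every combination that starts with `prefix`."""
    count = 0
    for tail in product(ALPHABET, repeat=LENGTH - 1):
        candidate = prefix + "".join(tail)
        count += 1  # placeholder for the real per-candidate work
    return count


def run_parallel(processes=2):
    """Hand one first character to each task; the slices never overlap."""
    with Pool(processes) as pool:
        return sum(pool.imap_unordered(work_on_prefix, ALPHABET))
```

Call run_parallel() from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. If the alphabet contains a repeated character, the prefixes should be de-duplicated first, otherwise two workers would cover the same slice.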

Identifying consecutive occurrences of a value

Question: I have a df like so:

    Count
    1
    0
    1
    1
    0
    0
    1
    1
    1
    0

and I want to return a 1 in a new column if there are two or more consecutive occurrences of 1 in Count, and a 0 if there are not. So each row in the new column would get a 1 when this criterion is met in the column Count. My desired output would then be:

    Count  New_Value
    1      0
    0      0
    1      1
    1      1
    0      0
    0      0
    1      1
    1      1
    1      1
    0      0

I am thinking I may need to use itertools, but I have been reading about it and haven't come across what I need yet. I would like to
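Since the question mentions itertools, here is a minimal sketch using itertools.groupby on a plain list of the Count values. The helper name flag_runs and its parameters are hypothetical; writing the result back to the DataFrame (for example as a new column) is left out and would be an extra step.

```python
from itertools import groupby


def flag_runs(values, target=1, min_len=2):
    """Return 1 for each item inside a run of `target` at least `min_len` long, else 0."""
    flags = []
    for key, group in groupby(values):
        run = list(group)  # one run of equal consecutive values
        hit = int(key == target and len(run) >= min_len)
        flags.extend([hit] * len(run))
    return flags


count = [1, 0, 1, 1, 0, 0, 1, 1, 1, 0]
new_value = flag_runs(count)
print(new_value)
```

For the sample column this reproduces the desired New_Value column shown in the question.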

MemoryError while creating cartesian product in Numpy

Question: I have 3 NumPy arrays and need to form the Cartesian product between them. The dimensions of the arrays are not fixed, so they can take different values; one example could be A=(10000, 50), B=(40, 50), C=(10000, 50). Then I perform some processing (like a+b-c). Below is the function that I am using for the product.

    def cartesian_2d(arrays, out=None):
        arrays = [np.asarray(x) for x in arrays]
        dtype = arrays[0].dtype
        n = np.prod([x.shape[0] for x in arrays])
        if out is None:
            out = np.empty([n, len
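One way to sidestep the MemoryError, sketched under the assumption that the processing really is a row-wise a + b - c over every (i, j, k) combination, is to never build the full product and instead broadcast one block of rows of the first array at a time. The function name iter_combined and the block parameter are hypothetical and are not part of the original cartesian_2d function.

```python
import numpy as np


def iter_combined(a, b, c, block=1):
    """Yield (start, chunk) where chunk[i, j, k] == a[start + i] + b[j] - c[k].

    Only `block` rows of `a` are combined at a time, so the full
    Cartesian product is never held in memory at once.
    """
    for start in range(0, a.shape[0], block):
        a_blk = a[start:start + block]
        # Broadcast shapes: (blk, 1, 1, d) + (1, nb, 1, d) - (1, 1, nc, d)
        chunk = a_blk[:, None, None, :] + b[None, :, None, :] - c[None, None, :, :]
        yield start, chunk


# Tiny arrays for illustration; each chunk would be reduced or written out.
a = np.arange(6.0).reshape(3, 2)
b = np.arange(4.0).reshape(2, 2) * 10
c = np.arange(8.0).reshape(4, 2) * 100
for start, chunk in iter_combined(a, b, c, block=2):
    print(start, chunk.shape)
```

The block size has to be chosen so that one chunk of shape (block, len(b), len(c), d) fits in memory; with the sizes quoted in the question even a single row of A produces a chunk of 40 × 10000 × 50 values.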

How to turn an itertools “grouper” object into a list

I am trying to learn how to use itertools.groupby in Python, and I wanted to find the size of each group of characters. At first I tried to see if I could find the length of a single group:

    from itertools import groupby
    len(list(list(groupby("cccccaaaaatttttsssssss"))[0][1]))

and I would get 0 every time. I did a little research and found out that other people were doing it this way:

    from itertools import groupby
    for key, grouper in groupby("cccccaaaaatttttsssssss"):
        print(key, len(list(grouper)))

Which works great. What I am confused about is why does the latter code work, but the former does
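The behaviour described above follows from how groupby shares its source iterator: each grouper is only usable until the groupby object is advanced to the next group. The sketch below contrasts the two approaches; the variable names are chosen for illustration.

```python
from itertools import groupby

text = "cccccaaaaatttttsssssss"

# Consuming the whole groupby first advances past every group, so the
# grouper saved for the first key has nothing left to yield.
stale_first_group = list(groupby(text))[0][1]
stale_size = len(list(stale_first_group))

# Consuming each grouper while it is still the current group works.
sizes = [(key, len(list(group))) for key, group in groupby(text)]

print(stale_size)
print(sizes)
```

To keep the groups around for later use, store `list(group)` inside the loop instead of the grouper object itself.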

Combining itertools and multiprocessing?

Question: I have a 256x256x256 NumPy array, in which each element is a matrix. I need to do some calculations on each of these matrices, and I want to use the multiprocessing module to speed things up. The results of these calculations must be stored in a 256x256x256 array like the original one, so that the result of the matrix at element [i,j,k] in the original array must be put in the [i,j,k] element of the new array. To do this, I want to make a list which could be written in a pseudo-ish way as
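A minimal sketch of one way to pair itertools with multiprocessing here: generate the (i, j, k) index tuples lazily with itertools.product, have each worker return its index together with its result, and store the result under that index. N, compute and run are hypothetical names, the arithmetic is a placeholder for the real matrix calculation, and a dict keyed by index stands in for the result array to keep the example dependency-free.

```python
from itertools import product
from multiprocessing import Pool

N = 4  # hypothetical small size; the question uses 256


def compute(index):
    """Return the index together with its result so order does not matter."""
    i, j, k = index
    value = i + j + k  # placeholder for the real per-matrix calculation
    return index, value


def run(processes=2):
    """Map compute over every (i, j, k) and file results under their index."""
    results = {}
    indices = product(range(N), repeat=3)  # lazy stream of index tuples
    with Pool(processes) as pool:
        for index, value in pool.imap_unordered(compute, indices, chunksize=16):
            results[index] = value
    return results
```

Call run() from under an `if __name__ == "__main__":` guard. With a NumPy result array the assignment would be `out[index] = value`, since a tuple indexes all three axes at once. In a real program the worker also needs access to the matrix itself, either passed along with the index or read from data shared with the workers.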

How not to miss the next element after itertools.takewhile()

Question: Say we wish to process an iterator and want to handle it by chunks. The logic per chunk depends on previously calculated chunks, so groupby() does not help. Our friend in this case is itertools.takewhile():

    while True:
        chunk = itertools.takewhile(getNewChunkLogic(), myIterator)
        process(chunk)

The problem is that takewhile() needs to go past the last element that meets the new chunk logic, thus 'eating' the first element for the next chunk. There are various solutions to that, including
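One workaround, given as a sketch and not necessarily among the solutions the truncated question goes on to list, is to collect the chunk manually and push the first non-matching element back onto the front of the iterator with itertools.chain. The helper name take_while_keep is hypothetical.

```python
from itertools import chain


def take_while_keep(predicate, iterable):
    """Return (chunk, rest).

    `chunk` is a list of the leading items that satisfy `predicate`;
    `rest` is an iterator over everything else, including the first
    item that failed the predicate.
    """
    iterator = iter(iterable)
    chunk = []
    for item in iterator:
        if not predicate(item):
            # Put the rejected item back in front of the remaining items.
            return chunk, chain([item], iterator)
        chunk.append(item)
    return chunk, iterator


chunk, rest = take_while_keep(lambda x: x < 3, [1, 2, 3, 4, 1])
print(chunk, list(rest))
```

Unlike takewhile(), this materializes each chunk as a list, which is usually acceptable when chunks are small compared with the whole stream.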

Prevent memory error in itertools.permutation

Firstly, I would like to mention that I have 3 GB of RAM. I am working on an algorithm that is exponential in time in the number of nodes, so for it I have in the code

    perm = list(itertools.permutations(list(graph.Nodes)))  # graph.Nodes is a tuple of the integers 1, 2, ..., n

which generates all the permutations of the vertices in a list, and then I can work on one of the permutations. However, when I run the program for 40 vertices, it gives a memory error. Is there any simpler implementation with which I can generate all the permutations of the vertices and not get this error?

Try to use the iterator
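As a small illustration of using the iterator directly: itertools.permutations is lazy, so the memory error comes from wrapping it in list(). The sketch below pulls only the first few permutations; the nodes tuple is a hypothetical stand-in for graph.Nodes.

```python
from itertools import islice, permutations
from math import factorial

nodes = tuple(range(1, 41))  # hypothetical stand-in for graph.Nodes (40 vertices)

# permutations() builds nothing up front; items are produced on demand.
perms = permutations(nodes)

# Take a handful of permutations instead of materializing all of them.
first_three = list(islice(perms, 3))

# The total count shows why list(...) cannot work for 40 vertices.
total = factorial(len(nodes))
print(len(first_three), total)
```

Iterating lazily removes the memory problem but not the time problem: 40! is roughly 8 × 10^47, so visiting every permutation is not feasible and the algorithm would need pruning or a different search strategy.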

How can I use python itertools.groupby() to group a list of strings by their first character?

Question: I have a list of strings similar to this list:

    tags = ('apples', 'apricots', 'oranges', 'pears', 'peaches')

How should I go about grouping this list by the first character in each string using itertools.groupby()? How should I supply the 'key' argument required by itertools.groupby()?

Answer 1:

    groupby(sorted(tags), key=operator.itemgetter(0))

Answer 2: You might want to create a dict afterwards:

    from itertools import groupby
    d = {k: list(v) for k, v in groupby(tags, key=lambda x: x[0])}

Answer 3:

    >>> for i,
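Putting the pieces from the answers together, a minimal sketch: groupby only merges consecutive items with equal keys, so the input is sorted by the same key before grouping. The variable names first_char and grouped are chosen for illustration.

```python
from itertools import groupby
from operator import itemgetter

tags = ('apples', 'apricots', 'oranges', 'pears', 'peaches')

first_char = itemgetter(0)  # key function: first character of each string

# Sort by the grouping key first, then collect each group into a list.
grouped = {
    key: list(group)
    for key, group in groupby(sorted(tags, key=first_char), key=first_char)
}
print(grouped)
```

Sorting with the same key (a stable sort) keeps the original relative order inside each group. Skipping the sort only works when equal first characters already sit next to each other, as they happen to in this sample.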

implementing argmax in Python

Question: How should argmax be implemented in Python? It should be as efficient as possible, so it should work with iterables. Three ways it could be implemented:

- given an iterable of pairs, return the key corresponding to the greatest value
- given an iterable of values, return the index of the greatest value
- given an iterable of keys and a function f, return the key with the largest f(key)

Answer 1: I modified the best solution I found:

    # given an iterable of pairs return the key corresponding to the greatest
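The three variants listed in the question can each be sketched on top of the built-in max() with a key function. This is an illustration, not the truncated answer's code; the function names argmax_pairs, argmax_index and argmax_f are hypothetical.

```python
from operator import itemgetter


def argmax_pairs(pairs):
    """Given (key, value) pairs, return the key with the greatest value."""
    return max(pairs, key=itemgetter(1))[0]


def argmax_index(values):
    """Given values, return the index of the greatest value."""
    return max(enumerate(values), key=itemgetter(1))[0]


def argmax_f(keys, f):
    """Given keys and a function f, return the key with the largest f(key)."""
    return max(keys, key=f)


print(argmax_pairs([("a", 1), ("b", 3), ("c", 2)]))
print(argmax_index([3, 9, 4]))
print(argmax_f([-5, 2, 3], abs))
```

Each version makes a single pass and accepts any iterable, including generators. On ties max() returns the first maximal item, and an empty iterable raises ValueError.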