itertools

Identifying consecutive occurrences of a value

假装没事ソ submitted on 2019-11-29 01:45:56
I have a df like so:

Count
1
0
1
1
0
0
1
1
1
0

and I want to return a 1 in a new column if there are two or more consecutive occurrences of 1 in Count, and a 0 if there are not. So each row in the new column would get a 1 based on this criterion being met in the column Count. My desired output would then be:

Count  New_Value
1      0
0      0
1      1
1      1
0      0
0      0
1      1
1      1
1      1
0      0

I am thinking I may need to use itertools, but I have been reading about it and haven't come across what I need yet. I would also like to be able to use this method to count any number of consecutive occurrences, not just 2.
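One way to get this without itertools is a run-length trick in pandas: label each run of identical values, measure each run's size, and flag rows belonging to runs of 1 of sufficient length. A sketch, where df and the column name Count come from the question and n is the minimum run length to flag:

```python
import pandas as pd

df = pd.DataFrame({"Count": [1, 0, 1, 1, 0, 0, 1, 1, 1, 0]})

n = 2  # minimum number of consecutive 1s; raise this to require longer runs

# Give each run of identical values its own id (value changes -> new id).
run_id = (df["Count"] != df["Count"].shift()).cumsum()

# Broadcast each run's length back onto its rows.
run_size = df.groupby(run_id)["Count"].transform("size")

# Flag rows that are 1 AND sit inside a run of at least n.
df["New_Value"] = ((df["Count"] == 1) & (run_size >= n)).astype(int)
```

This matches the desired output above, and changing n generalizes to any required run length.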

Combining itertools and multiprocessing?

我们两清 submitted on 2019-11-28 23:37:00
I have a 256x256x256 NumPy array, in which each element is a matrix. I need to do some calculations on each of these matrices, and I want to use the multiprocessing module to speed things up. The results of these calculations must be stored in a 256x256x256 array like the original one, so that the result for the matrix at element [i,j,k] of the original array goes into element [i,j,k] of the new array. To do this, I want to make a list which could be written in a pseudo-ish way as [array[i,j,k], (i, j, k)] and pass it to a function to be "multiprocessed". Assuming that matrices is a
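A sketch of that approach with multiprocessing.Pool. The index travels alongside each matrix so results can be written back to the right position; process_one uses a determinant purely as a placeholder computation (an assumption, not the question's actual calculation):

```python
import numpy as np
from multiprocessing import Pool

def process_one(args):
    # Each task is (matrix, (i, j, k)); returning the index with the
    # result lets the parent place it correctly in the output array.
    mat, idx = args
    return np.linalg.det(mat), idx  # placeholder calculation (assumption)

def process_all(matrices):
    # matrices has shape (N, N, N, m, m): an N x N x N grid of m x m matrices
    ni, nj, nk = matrices.shape[:3]
    tasks = [(matrices[i, j, k], (i, j, k))
             for i in range(ni) for j in range(nj) for k in range(nk)]
    result = np.empty((ni, nj, nk))
    with Pool() as pool:
        for value, (i, j, k) in pool.map(process_one, tasks):
            result[i, j, k] = value
    return result
```

Because pool.map returns each index with its value, the ordering of worker results does not matter.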

How not to miss the next element after itertools.takewhile()

℡╲_俬逩灬. submitted on 2019-11-28 23:13:40
Say we wish to process an iterator and want to handle it in chunks. The logic per chunk depends on previously calculated chunks, so groupby() does not help. Our friend in this case is itertools.takewhile():

while True:
    chunk = itertools.takewhile(getNewChunkLogic(), myIterator)
    process(chunk)

The problem is that takewhile() needs to go past the last element that meets the new chunk logic, thus 'eating' the first element of the next chunk. There are various solutions to that, including wrapping, or an à la C ungetc(), etc. My question is: is there an elegant solution? takewhile() indeed needs
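One ungetc-style fix is to catch the element the predicate rejects and chain it back onto the iterator before starting the next chunk. A sketch; chunked_by and pred_factory are illustrative names, not from the question, and it assumes None never appears in the stream:

```python
import itertools

def chunked_by(pred_factory, iterable):
    # pred_factory() must return a fresh (possibly stateful) predicate
    # for every chunk, mirroring the question's getNewChunkLogic().
    it = iter(iterable)
    while True:
        pred = pred_factory()
        chunk, stolen = [], None
        for x in it:
            if pred(x):
                chunk.append(x)
            else:
                stolen = x  # first element of the NEXT chunk
                break
        if stolen is not None:
            # Push the stolen element back, ungetc-style
            # (assumes the stream never yields None).
            it = itertools.chain([stolen], it)
        if not chunk:
            return  # iterator exhausted (or predicate rejected its first element)
        yield chunk
```

For example, with a predicate factory that groups runs of equal parity, chunked_by(pred_factory, [1, 3, 5, 2, 4, 7]) yields [1, 3, 5], [2, 4], [7] without losing the boundary elements.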

How can I use python itertools.groupby() to group a list of strings by their first character?

淺唱寂寞╮ submitted on 2019-11-28 19:54:26
I have a list of strings similar to this list:

tags = ('apples', 'apricots', 'oranges', 'pears', 'peaches')

How should I go about grouping this list by the first character in each string using itertools.groupby()? How should I supply the 'key' argument required by itertools.groupby()?

Ignacio Vazquez-Abrams:

groupby(sorted(tags), key=operator.itemgetter(0))

Pratik Deoghare: You might want to create a dict afterwards:

from itertools import groupby
d = {k: list(v) for k, v in groupby(tags, key=lambda x: x[0])}

>>> for i, j in itertools.groupby(tags, key=lambda x: x[0]):
...     print(i, list(j))
a ['apples'
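The two answers combined into one runnable sketch. Sorting first matters because groupby() only merges runs of consecutive equal keys:

```python
from itertools import groupby
from operator import itemgetter

tags = ('apples', 'apricots', 'oranges', 'pears', 'peaches')

# Sort first so equal first letters are adjacent, then group and
# materialize each group into a dict of letter -> words.
d = {k: list(g) for k, g in groupby(sorted(tags), key=itemgetter(0))}
```

Here the tags happen to be sorted enough already, but skipping the sort on arbitrary input would split a letter's words across several groups.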

implementing argmax in Python

蓝咒 submitted on 2019-11-28 19:17:26
How should argmax be implemented in Python? It should be as efficient as possible, so it should work with iterables. Three ways it could be implemented:

- given an iterable of pairs, return the key corresponding to the greatest value
- given an iterable of values, return the index of the greatest value
- given an iterable of keys and a function f, return the key with the largest f(key)

I modified the best solution I found:

# given an iterable of pairs return the key corresponding to the greatest value
def argmax(pairs):
    return max(pairs, key=lambda x: x[1])[0]

# given an iterable of values return the
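All three variants can be built on max() and its key argument, each making a single pass over the iterable. A sketch; the function names are mine:

```python
def argmax_pairs(pairs):
    # iterable of (key, value) pairs -> key with the greatest value
    return max(pairs, key=lambda kv: kv[1])[0]

def argmax_index(values):
    # iterable of values -> index of the greatest value
    # (enumerate turns it into the pairs case with indices as keys)
    return max(enumerate(values), key=lambda iv: iv[1])[0]

def argmax_key(keys, f):
    # iterable of keys and a function f -> key with the largest f(key)
    return max(keys, key=f)
```

All three work on any iterable, including generators, and raise ValueError on an empty input, just like max() itself.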

What is the difference between chain and chain.from_iterable in itertools?

荒凉一梦 submitted on 2019-11-28 18:31:51
I could not find any valid example on the internet where I can see the difference between them and why to choose one over the other. The first takes zero or more arguments, each an iterable; the second takes one argument, which is expected to produce the iterables:

from itertools import chain

chain(list1, list2, list3)

iterables = [list1, list2, list3]
chain.from_iterable(iterables)

But iterables can be any iterator that yields the iterables:

def gen_iterables():
    for i in range(10):
        yield range(i)

itertools.chain.from_iterable(gen_iterables())

Using the second form is usually a case of
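A minimal side-by-side sketch. Both forms produce the same flattened sequence; what from_iterable adds is that its single argument is consumed lazily, which is exactly what makes the generator version possible:

```python
from itertools import chain

list1, list2, list3 = [1, 2], [3], [4, 5]

flat_a = list(chain(list1, list2, list3))                  # varargs form
flat_b = list(chain.from_iterable([list1, list2, list3]))  # one iterable of iterables

# from_iterable also accepts a lazy producer of iterables, which the
# varargs form cannot: chain(*gen()) would exhaust the generator up front.
def gen_iterables():
    for i in range(3):
        yield range(i)

flat_c = list(chain.from_iterable(gen_iterables()))  # range(0), range(1), range(2)
```

With chain(*gen_iterables()) the star-unpacking forces every inner iterable to be produced before chaining starts; from_iterable asks for each one only when the previous is exhausted.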

Weirdness of itertools.groupby in Python when realizing the groupby result early [duplicate]

≯℡__Kan透↙ submitted on 2019-11-28 14:41:24
This question already has an answer here: python groupby behaviour? (3 answers) First, apologies for my poor description of the problem; I can't find a better one. I found that applying list to an itertools.groupby result will destroy the result. See code:

import itertools
import operator

log = '''\
hello world
hello there
hi guys
hi girls'''.split('\n')
data = [line.split() for line in log]

grouped = list(itertools.groupby(data, operator.itemgetter(0)))
for key, group in grouped:
    print key, group, list(group)
print '-' * 80

grouped = itertools.groupby(data, operator.itemgetter(0))
for key, group
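The cause is that all the group iterators share the groupby object's single underlying iterator: advancing groupby to the next key consumes the current group's elements. Wrapping the whole thing in list() advances past every group before you ever read one. A Python 3 sketch of the failure and the fix (materialize each group while iterating):

```python
import itertools
import operator

log = 'hello world\nhello there\nhi guys\nhi girls'.split('\n')
data = [line.split() for line in log]

# Broken: the outer list() advances groupby past every group,
# consuming their shared underlying iterator as it goes.
stale = [(key, list(group))
         for key, group in list(itertools.groupby(data, operator.itemgetter(0)))]

# Correct: consume each group before moving on to the next key.
fresh = [(key, list(group))
         for key, group in itertools.groupby(data, operator.itemgetter(0))]
```

In the broken version the early groups come back empty, while the correct version recovers both groups with their lines intact.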

Itertools to generate scrambled combinations

≯℡__Kan透↙ submitted on 2019-11-28 14:33:37
What I want to do is obtain all combinations and all unique permutations of each combination. The combinations_with_replacement function only gets me so far:

from itertools import combinations_with_replacement as cwr

foo = list(cwr('ACGT', n))  ## n is an integer

My intuition on how to move forward is to do something like this:

import numpy as np
from itertools import permutations as perm

bar = []
for x in foo:
    carp = list(perm(x))
    for i in range(len(carp)):
        for j in range(i + 1, len(carp)):
            if carp[i] == carp[j]:
                carp[j] = ''
    carp = carp[list(np.where(np.array(carp) != '')[0])]
    for y in carp:
        bar
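A simpler route than manually blanking duplicates is to deduplicate each combination's permutations with a set. A sketch; scrambled_combinations is my name for it, not from the question:

```python
from itertools import combinations_with_replacement as cwr, permutations

def scrambled_combinations(alphabet, n):
    # Every combination (with replacement), then each of its distinct
    # orderings; set() removes the duplicate permutations that arise
    # when a combination contains repeated letters.
    out = []
    for combo in cwr(alphabet, n):
        out.extend(sorted(''.join(p) for p in set(permutations(combo))))
    return out
```

For example, scrambled_combinations('AB', 2) gives ['AA', 'AB', 'BA', 'BB']; over the full alphabet this enumerates every length-n string, i.e. len(alphabet) ** n results.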

How to use expand in snakemake when some particular combinations of wildcards are not desired?

被刻印的时光 ゝ submitted on 2019-11-28 14:28:22
Let's suppose that I have the following files, on which I want to apply some processing automatically using snakemake:

test_input_C_1.txt
test_input_B_2.txt
test_input_A_2.txt
test_input_A_1.txt

The following snakefile uses expand to determine all the potential final result files:

rule all:
    input:
        expand("test_output_{text}_{num}.txt", text=["A", "B", "C"], num=[1, 2])

rule make_output:
    input:
        "test_input_{text}_{num}.txt"
    output:
        "test_output_{text}_{num}.txt"
    shell:
        """
        md5sum {input} > {output}
        """

Executing the above snakefile results in the following error:

MissingInputException in line 4
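One common fix is to build the target list in plain Python from only the wildcard combinations that actually have inputs, instead of letting expand take the full product. A sketch; the existing pairs are hard-coded here from the question's four input files, though they could equally be discovered with snakemake's glob_wildcards:

```python
from itertools import product

# (text, num) pairs for which a test_input file actually exists
# (hard-coded from the question; in a real snakefile this could come
# from glob_wildcards("test_input_{text}_{num}.txt"))
existing = {("A", 1), ("A", 2), ("B", 2), ("C", 1)}

targets = sorted(
    f"test_output_{text}_{num}.txt"
    for text, num in product(["A", "B", "C"], [1, 2])
    if (text, num) in existing
)
```

Rule all's input would then simply be targets. Alternatively, expand accepts zip as a combinator to pair up pre-matched wildcard lists instead of taking their product.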

list around groupby results in empty groups

拜拜、爱过 submitted on 2019-11-28 11:51:27
I was playing around to get a better feeling for itertools.groupby, so I grouped a list of tuples by the number and tried to get a list of the resulting groups. When I convert the result of groupby to a list, however, I get a strange result: all but the last group are empty. Why is that? I assumed turning an iterator into a list would be less efficient but never change behavior. I guess the lists are empty because the inner iterators are traversed, but when/where does that happen?

import itertools

l = list(zip([1, 2, 2, 3, 3, 3], ['a', 'b', 'c', 'd', 'e', 'f']))
# [(1, 'a'), (2, 'b'), (2, 'c'), (3, 'd'), (3,
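The traversal happens inside list() itself: to fetch the next (key, group) pair, groupby must skip over the current group's remaining items in their shared underlying iterator, so building the outer list empties every group except the one where iteration stopped. Materializing each group while iterating avoids that; a sketch with the question's data:

```python
import itertools

l = list(zip([1, 2, 2, 3, 3, 3], ['a', 'b', 'c', 'd', 'e', 'f']))

# Convert each inner group to a list BEFORE groupby advances to the
# next key; afterwards its contents are gone.
groups = [(key, list(group))
          for key, group in itertools.groupby(l, key=lambda t: t[0])]
```

This keeps every group's tuples, whereas list(itertools.groupby(...)) followed by reading the groups yields mostly empty lists.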