ipython-parallel

Printing to stdout in IPython parallel processes

ε祈祈猫儿з submitted on 2019-12-08 16:27:34
Question: I'm new to IPython and would like to print intermediate results to stdout while running IPython parallel cluster functions. (I'm aware that with multiple processes this might mangle the output, but that's fine; it's just for testing/debugging, and the processes I'd be running are long enough that such a collision is unlikely.) I checked the documentation for IPython but can't find an example where the parallelized function prints. Basically, I'm looking for a way to redirect the print output…
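A minimal sketch of one common approach, assuming a running cluster and the newer ipyparallel import (the older IPython.parallel API is similar): stdout printed on the engines is captured and becomes available on the AsyncResult once the tasks finish, though exact behavior varies by version.

    from ipyparallel import Client

    client = Client()
    view = client[:]

    def work(x):
        print('processing', x)  # captured on the engine, not the client terminal
        return x * x

    ar = view.map_async(work, range(4))
    ar.wait()                # block until all tasks finish
    print(ar.stdout)         # per-engine captured stdout
    ar.display_outputs()     # replay the captured output locally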

How to use IPython.parallel map() with generators as input to function

徘徊边缘 submitted on 2019-12-08 05:15:41
Question: I am trying to use IPython.parallel map. The inputs to the function I wish to parallelize are generators. Because of size/memory it is not possible for me to convert the generators to lists. See the code below:

    from itertools import product
    from IPython.parallel import Client

    c = Client()
    v = c[:]
    c.ids

    def stringcount(longstring, substrings):
        scount = [longstring.count(s) for s in substrings]
        return scount

    substrings = product('abc', repeat=2)
    longstring = product('abc', repeat=3)
    # This is what…
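The excerpt is cut off, but a common workaround for feeding generators to map() (a sketch, not from the original post) is to keep the generator lazy on the client and hand the engines bounded chunks, since map() needs inputs it can serialize and partition:

    from itertools import islice

    def chunks(gen, size):
        # Yield lists of at most `size` items pulled lazily from `gen`.
        while True:
            block = list(islice(gen, size))
            if not block:
                return
            yield block

    # Hypothetical usage with a DirectView `v` and a worker function `f`:
    # async_results = [v.apply_async(f, block) for block in chunks(my_generator, 1000)]
    # results = [ar.get() for ar in async_results]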

How to best share static data between ipyparallel client and remote engines?

我怕爱的太早我们不能终老 submitted on 2019-12-06 17:21:55
Question: I am running the same simulation in a loop with different parameters. Each simulation makes use of a pandas DataFrame (data) which is only read, never modified. Using ipyparallel (IPython parallel), I can put this DataFrame into the global variable space of each engine in my view before the simulations start:

    view['data'] = data

The engines then have access to the DataFrame for all the simulations which get run on them. The process of copying the data (if pickled, data is 40MB) is only a few…
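A minimal sketch of the pattern being described, assuming a running ipyparallel cluster (run_simulation and the toy DataFrame are placeholders):

    from ipyparallel import Client
    import pandas as pd

    client = Client()
    view = client[:]

    data = pd.DataFrame({'a': range(1000)})  # stand-in for the real 40MB frame
    view['data'] = data                      # pushed once into each engine's globals

    def run_simulation(scale):
        # `data` resolves in the engine's global namespace, not the client's
        return data['a'].sum() * scale

    results = view.map_sync(run_simulation, [0.1, 0.2, 0.3])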

ipython notebook: how to parallelize an external script

隐身守侯 submitted on 2019-12-05 01:54:51
Question: I'm trying to use parallel computing from the IPython parallel library, but I have little knowledge of it, and I find the docs difficult to read for someone who knows nothing about parallel computing. Funnily, all the tutorials I found just reuse the example from the docs, with the same explanation, which from my point of view is useless. Basically what I'd like to do is run a few scripts in the background so they execute at the same time. In bash it would be something like:

    for my_file in $(cat…
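One way to express that bash loop with ipyparallel (a sketch under the assumption that each run is independent; script.py and files.txt are placeholder names):

    from ipyparallel import Client

    def run_script(path):
        import subprocess  # imported on the engine where the task runs
        # each engine launches one external process and waits for it
        return subprocess.call(['python', 'script.py', path])

    client = Client()
    view = client.load_balanced_view()  # tasks go to whichever engine is free

    with open('files.txt') as f:
        files = [line.strip() for line in f]

    exit_codes = view.map_sync(run_script, files)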

How to best share static data between ipyparallel client and remote engines?

蓝咒 submitted on 2019-12-04 22:58:12
I am running the same simulation in a loop with different parameters. Each simulation makes use of a pandas DataFrame (data) which is only read, never modified. Using ipyparallel (IPython parallel), I can put this DataFrame into the global variable space of each engine in my view before the simulations start:

    view['data'] = data

The engines then have access to the DataFrame for all the simulations which get run on them. The process of copying the data (if pickled, data is 40MB) is only a few seconds. However, it appears that if the number of simulations grows, memory usage grows very large. I…
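The excerpt cuts off here, but one usual suspect when memory grows with the number of tasks is result caching: the client, the views, and the hub all keep references to completed results. A hedged sketch of periodic cleanup (method names as in ipyparallel; exact semantics vary by version):

    # inside the simulation loop, every N iterations:
    view.results.clear()         # drop the view's cached results
    client.results.clear()       # drop the client's cached results
    client.purge_results('all')  # ask the hub to forget stored results too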

Ipython Notebook: where is jupyter_notebook_config.py on a Mac?

为君一笑 submitted on 2019-12-03 11:23:20
Question: I just started using a Mac, so please forgive me if this sounds too naive. I'm trying to install Interactive Parallel. From https://github.com/ipython/ipyparallel, it says I need to find jupyter_notebook_config.py. I've already installed Python and related packages with Anaconda, and I can use the IPython notebook. But when I search with Spotlight for jupyter_notebook_config.py, I just can't find this file. So, where can I find this file? UPDATE: this is my home folder: there is only…

How to pin threads to cores with predetermined memory pool objects? (80 core Nehalem architecture 2Tb RAM)

两盒软妹~` submitted on 2019-12-03 09:04:47
Question: I've run into a minor HPC problem after running some tests on an 80-core (160 HT) Nehalem architecture with 2TB DRAM: a server with more than 2 sockets starts to stall a lot (delay) as each thread starts to request information about objects on the "wrong" socket, i.e. requests go from a thread that is working on some objects on one socket to pull information that is actually in the DRAM on the other socket. The cores appear 100% utilized, even though I know that they are waiting for the…

Import custom modules on IPython.parallel engines with sync_imports()

一笑奈何 submitted on 2019-12-03 06:46:07
I've been playing around with IPython.parallel and I wanted to use some custom modules of my own, but haven't been able to do it as explained in the cookbook using dview.sync_imports(). The only thing that has worked for me was something like:

    def my_parallel_func(args):
        import sys
        sys.path.append('/path/to/my/module')
        import my_module
        # and all the rest

and then in the main just:

    if __name__ == '__main__':
        # set up dview...
        dview.map(my_parallel_func, my_args)

The correct way to do this would, in my opinion, be something like:

    with dview.sync_imports():
        import sys
        sys.path.append('/path/to/my…
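The snippet is cut off, but the commonly cited fix (a sketch of the usual workaround, not necessarily the accepted answer) is that sync_imports() only replays import statements on the engines; arbitrary code such as sys.path.append inside the block runs only locally. Running the path setup on the engines explicitly first works:

    # run the sys.path change on every engine first...
    dview.execute("import sys; sys.path.append('/path/to/my/module')")

    # ...then the custom module can be imported the usual way
    with dview.sync_imports():
        import my_module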

Ipython Notebook: where is jupyter_notebook_config.py on a Mac?

…衆ロ難τιáo~ submitted on 2019-12-03 01:48:35
I just started using a Mac, so please forgive me if this sounds too naive. I'm trying to install Interactive Parallel. From https://github.com/ipython/ipyparallel, it says I need to find jupyter_notebook_config.py. I've already installed Python and related packages with Anaconda, and I can use the IPython notebook. But when I search with Spotlight for jupyter_notebook_config.py, I just can't find this file. So, where can I find this file? UPDATE: this is my home folder: there is only anaconda. Answer (rkrzr): Look in your home directory for a .jupyter folder. It should contain the file according…
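A quick way to check where Jupyter expects that file (a sketch; jupyter_core ships with Jupyter, and the config file itself is only created once `jupyter notebook --generate-config` has been run):

    from jupyter_core.paths import jupyter_config_dir

    # typically prints /Users/<you>/.jupyter on a Mac
    print(jupyter_config_dir())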

How to pin threads to cores with predetermined memory pool objects? (80 core Nehalem architecture 2Tb RAM)

泄露秘密 submitted on 2019-12-02 23:11:15
I've run into a minor HPC problem after running some tests on an 80-core (160 HT) Nehalem architecture with 2TB DRAM: a server with more than 2 sockets starts to stall a lot (delay) as each thread starts to request information about objects on the "wrong" socket, i.e. requests go from a thread that is working on some objects on one socket to pull information that is actually in the DRAM on the other socket. The cores appear 100% utilized, even though I know that they are waiting for the remote socket to return the request. As most of the code runs asynchronously, it is a lot easier to…
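The excerpt ends before the actual question, but for the thread-pinning part of the title, a minimal sketch on Linux (os.sched_setaffinity is Linux-only, and the core-to-socket numbering below is an assumption about the machine's NUMA layout):

    import os

    def pin_to_core(core_id):
        # Restrict the calling process (pid 0 = ourselves) to a single core,
        # so first-touch allocations land on that core's NUMA node.
        os.sched_setaffinity(0, {core_id})

    # e.g. give each of N workers its own core:
    # pin_to_core(worker_rank % os.cpu_count())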