Is using the -L flag and an addprocs script the more powerful version of -p and --machinefile?

Posted by ε祈祈猫儿з on 2019-12-24 13:59:06

Question


So I have a moderately complex set of requirements for my worker processes. I want to use the master-slave topology and a non-default working directory. I also want to mix local and remote workers.

As far as I can tell from reading the --machine-file section of the documentation, it will not let me do that.

So I am looking at the -L <file> parameter:

>julia -h
...
-L, --load <file>    Load <file> immediately on all processors
...

So if I do not use the -p or --machine-file flags, there is initially only one processor, so "all processors" just means that single processor.

So I tried this out:

start_workers.jl

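# Remote workers over ssh; :auto starts as many workers as each host has cores.
addprocs([
        ("cluster_c4_1", :auto),
        ("cluster_c4_2", :auto)
    ],
    dir="/mnt/",
    topology=:master_slave
)

# Local workers: with no count given, addprocs starts one worker per local core.
addprocs(
    dir="/mnt/",
    topology=:master_slave
)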

test.jl

println("*************")
println(workers())
println("-------------")

Running it:

>julia -L start_workers.jl test.jl
*************
[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]
-------------

So it all looks good: I got my 20 workers. Have I done anything unreasonable? Is this the best way?


Answer 1:


That's exactly how I'm deploying it on an HPC cluster under the Torque scheduler. In fact, I'm in the process of rewriting the cluster manager to support more options when adding processes through the Torque scheduling system in particular, so I've spent quite a bit of time looking into this.

You might also want to be aware that there are various cluster managers in the ClusterManagers package (Pkg.add("ClusterManagers")) that extend addprocs to a variety of environments, such as when you need to request resources from a scheduler. It looks like passwordless ssh is possible for you, so the default cluster manager is sufficient in your case.

I don't believe there is any way of defining the extra topology and directory parameters on the command line, so your approach is correct.
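If you want to confirm that the workers really ended up where you intended, a quick check is to ask each one for its hostname and working directory. The following is only a minimal sketch under a few assumptions of mine: it targets Julia 0.7 or later (where addprocs lives in the Distributed standard library and, as far as I know, the topology is spelled :master_worker rather than the older :master_slave), and it starts two local workers in a temporary directory as a stand-in for the cluster hosts. The `report` name is just illustrative.

```julia
using Distributed  # stdlib on Julia 0.7+; the 0.4-era setup in the question did not need it

# Assumption: two local workers stand in for the cluster. If a start script
# loaded with -L has already added workers, this block is skipped.
if nprocs() == 1
    addprocs(2; dir=tempdir(), topology=:master_worker)
end

# Ask every worker where it runs and what its working directory is.
report = Dict(w => (host = remotecall_fetch(gethostname, w),
                    dir  = remotecall_fetch(pwd, w)) for w in workers())

# Print one line per worker: id, hostname, working directory.
for w in sort(collect(keys(report)))
    println(w, "\t", report[w].host, "\t", report[w].dir)
end
```

Run on the real cluster with the start script loaded through -L, the host column should show a mix of the local machine and the remote hosts, and the directory column should show the dir passed to addprocs.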



Source: https://stackoverflow.com/questions/37250274/is-using-the-l-flag-and-a-addprocs-script-the-more-powerful-version-of-p-and
