Spreading a job over different nodes of a cluster in Sun Grid Engine (SGE)

Question

I'm trying to get Sun Grid Engine (SGE) to run the separate processes of an MPI job over all of the nodes of my cluster.

What is happening is that each node has 12 processors, so SGE is assigning 12 of my 60 processes to each of 5 separate nodes.

I'd like it to assign 2 processes to each of the 30 nodes available, because with 12 processes (DNA sequence alignments) running on each node, the nodes are running out of memory.

So I'm wondering if it's possible to explicitly get SGE to assign the processes to a given node?

Thanks,

Paul.

Answer 1:

Check out "allocation_rule" in the configuration of the parallel environment. Setting it to an integer makes SGE place exactly that many processes on each host, so an allocation_rule of 2, combined with the -pe option to qsub, should do what you ask for above. Note that $pe_slots does the opposite: it puts all of a job's slots on a single host.
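As a rough sketch of that setup, the snippet below writes a parallel environment definition with a fixed allocation of 2 processes per host to a file. The PE name "mpi_2per", the slot total, and the other field values are assumptions for illustration, not taken from the question; the qconf and qsub steps are left as comments because they change cluster state and must be run from an admin or submit host.

```shell
#!/bin/sh
# Hypothetical PE definition: "mpi_2per" and all values below are assumptions.
pe_file="${TMPDIR:-/tmp}/mpi_2per.pe"

cat > "$pe_file" <<'EOF'
pe_name            mpi_2per
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    2
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE
EOF

# Show the line that controls how many processes land on each host.
grep '^allocation_rule' "$pe_file"

# Next steps, to be run by hand:
#   qconf -Ap "$pe_file"          # register the PE from the file
#   qconf -mq queuename           # add mpi_2per to the queue's pe_list
#   qsub -pe mpi_2per 60 job.sh   # 60 slots at 2 per host means 30 hosts
```

With an integer allocation_rule, the slot count passed to -pe has to be a multiple of that integer, which 60 is.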

Answer 2:

You can do it by creating a queue that uses only 2 of the 12 processors on each node.

You can see the configuration of the current queue with the command

 qconf -sq queuename

You will see the following in the queue configuration. This queue is configured to use only 5 execution hosts with 4 slots (processors) each.

....
slots                 1,[master=4],[slave1=4],[slave2=4],[slave3=4],[slave4=4]
....

Use the following command to change the queue configuration

qconf -mq queuename

then change each 4 to 2.
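If editing by hand in the editor is inconvenient, the same change can be sketched as a filter over a saved copy of the queue configuration. The helper name set_host_slots is made up for this example, and it assumes the slots entry sits on a single line (qconf may wrap long entries with a trailing backslash, in which case this filter would miss the continuation lines).

```shell
#!/bin/sh
# Hypothetical helper: rewrite every per-host count on the "slots" line.
# Usage: set_host_slots NEW_COUNT < queue.conf
set_host_slots() {
    # Only touch the line starting with "slots"; turn [host=N] into [host=NEW].
    sed -E "/^slots[[:space:]]/ s/(\[[^]=]+=)[0-9]+\]/\1$1]/g"
}

# Demonstration on the line from the answer above.
printf 'slots                 1,[master=4],[slave1=4],[slave2=4]\n' | set_host_slots 2

# On a real cluster the round trip might look like this (run by hand):
#   qconf -sq queuename | set_host_slots 2 > /tmp/queuename.conf
#   qconf -Mq /tmp/queuename.conf
```

The leading default value (the "1" before the first comma) is left untouched; only the bracketed per-host overrides change.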

Answer 3:

From an admin host, run "qconf -msconf" to edit the scheduler configuration. It will bring up a list of configuration options in an editor. Look for the one called "load_formula". Set the value to "-slots" (without the quotes).

This tells the scheduler that a machine is least loaded when it has the fewest slots in use. If your exec hosts have a similar number of slots each, you will get an even distribution. If some exec hosts have more slots than the others, they will be preferred, but your distribution will still be more even than with the default value for load_formula (which I don't remember, having changed this in my cluster quite some time ago).

You may need to set the slots on each host. I have done this myself because I need to limit the number of jobs on a particular set of boxes to less than their maximum, since they don't have as much memory as some of the other ones. I don't know if it is required for this load_formula configuration, but if it is, you can add a slots consumable to each host. Do this with "qconf -me hostname" and add a value to "complex_values" that looks like "slots=16", where 16 is the number of slots you want that host to use.
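For a non-interactive version of the scheduler change, one possible sketch is to dump the scheduler configuration, rewrite the one line, and load it back. The helper name set_load_formula and the sample default shown in the demonstration are assumptions for illustration; the qconf steps are left as comments to be run from an admin host.

```shell
#!/bin/sh
# Hypothetical helper: replace the load_formula value in a scheduler config dump.
# Usage: set_load_formula NEW_VALUE < sched.conf
set_load_formula() {
    awk -v f="$1" '
        $1 == "load_formula" { printf "%-33s %s\n", $1, f; next }
        { print }   # every other line passes through unchanged
    '
}

# Demonstration on a two-line excerpt (values here are sample input only).
printf 'queue_sort_method                 load\nload_formula                      np_load_avg\n' |
    set_load_formula -slots

# On a real cluster the round trip might look like this (run by hand):
#   qconf -ssconf | set_load_formula -slots > /tmp/sched.conf
#   qconf -Msconf /tmp/sched.conf
```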

Answer 4:

This is what I learned from our sysadmin. Put this SGE resource request in your job script:

#$ -l nodes=30,ppn=2

This requests 2 MPI processes per node (ppn) on 30 nodes. I think there is no guarantee that this 30x2 layout will work on a 30-node cluster if other users are also running lots of jobs, but you can give it a try.
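Since the layout is not guaranteed, it can help to have the job print what it was actually given. The sketch below assumes the job runs under a parallel environment, where SGE sets $PE_HOSTFILE to a file with one line per host (host name first, slot count second). The helper name show_layout is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print "host slots" per line of a PE host file, then a total.
show_layout() {
    awk '{ print $1, $2; total += $2 } END { print "total", total }' "$1"
}

# Inside a parallel job SGE sets PE_HOSTFILE; outside one this is skipped.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    show_layout "$PE_HOSTFILE"
fi
```

If the printed layout shows more than 2 slots on any host, the request was not honored the way the 30x2 plan expects.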

Source: https://stackoverflow.com/questions/3363261/spreading-a-job-over-different-nodes-of-a-cluster-in-sun-grid-engine-sge

Tags

mpi

sungridengine