$SGE_TASK_ID not getting set with qsub array grid job

Posted by 限于喜欢 on 2020-01-04 15:16:31

Question


With a very simple zsh script:

#!/bin/zsh

nums=(1 2 3)
num=$nums[$SGE_TASK_ID]

$SGE_TASK_ID is the Sun Grid Engine task ID. I'm using qsub to submit an array of jobs.

I am following the advice in the qsub man page (http://www.clusterresources.com/torquedocs/commands/qsub.shtml#t) and submitting my array job as:

#script name: job_script.sh
qsub job_script.sh -t 1-3

$SGE_TASK_ID is not being set for this array job. Does anyone have any idea why?

Thanks!


Answer 1:


Try submitting the job like this:

qsub -t 1-3 job_script.sh

and see what happens.

Observe:

qsub -sync y job_script.sh -t 1-3
Your job 74578 ("job_script.sh") has been submitted
Job 74578 exited with exit code 0.

vs.

qsub -sync y -t 1-3 job_script.sh
Your job-array 74579.1-3:1 ("job_script.sh") has been submitted
Job 74579.3 exited with exit code 0.
Job 74579.1 exited with exit code 0.
Job 74579.2 exited with exit code 0.

Note that Torque (the man page referenced in your question) is a little different from SGE. My SGE man page definitely suggests putting all options before the command; anything after the script name is passed to the script as arguments rather than interpreted by qsub. Also, SGE doesn't accept the "%" syntax for limiting the maximum number of simultaneous tasks, but mine at least lets me say -tc NNN to specify the limit (not mentioned in the man page, but listed in qsub -help).
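
A guard at the top of the job script is one way to catch this mistake early. The sketch below assumes standard SGE behavior, where SGE_TASK_ID is set to the string "undefined" for non-array jobs; the function name require_array_task is made up for illustration.

```shell
# Hypothetical guard for the top of a job script.
# Assumes SGE sets SGE_TASK_ID to "undefined" for non-array jobs.
require_array_task() {
    if [ -z "${SGE_TASK_ID:-}" ] || [ "$SGE_TASK_ID" = "undefined" ]; then
        # Options placed after the script name on the qsub line
        # arrive here as ordinary script arguments.
        echo "not an array task; script arguments: $*" >&2
        return 1
    fi
    echo "array task $SGE_TASK_ID"
}
```

Calling require_array_task "$@" || exit 1 at the top of job_script.sh would make the "qsub job_script.sh -t 1-3" form fail loudly, with "-t 1-3" echoed back as script arguments.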




Answer 2:


To access a position in your array, use the braced form: ${the_array[$the_position]}. bash requires the braces; zsh also accepts the unbraced $nums[$SGE_TASK_ID] used in the question.

So in your case,

num=${nums[$SGE_TASK_ID]}

Test:

$ nums=(1 2 3)
$ SGE_TASK_ID=1
$ echo ${nums[$SGE_TASK_ID]}
2

Note that in bash the first position is index 0; zsh arrays start at 1 by default.
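
Since task IDs from -t 1-3 start at 1, the lookup needs an offset in bash but not in zsh. A minimal sketch of a lookup that gives the same result in both shells, assuming zsh runs with its default array indexing (KSH_ARRAYS unset); the helper name pick_num is hypothetical:

```shell
nums=(1 2 3)

# Hypothetical helper: print the element for a 1-based task ID.
pick_num() {
    local task_id="$1"
    if [ -n "${ZSH_VERSION:-}" ]; then
        # zsh arrays are 1-based by default, so the task ID is the index.
        echo "${nums[$task_id]}"
    else
        # bash arrays are 0-based, so shift the task ID down by one.
        echo "${nums[$((task_id - 1))]}"
    fi
}

# Falls back to task 1 when run outside an array job.
pick_num "${SGE_TASK_ID:-1}"
```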




Answer 3:


You need to surround the array variable with curly braces:

SGE_TASK_ID=2
nums=(1 2 3)
num=${nums[$SGE_TASK_ID]}
echo "num: $num"
# prints "num: 3" in bash (0-based indexing); zsh, with 1-based arrays, prints "num: 2"

The Linux Documentation Project has the best shell scripting docs.




Answer 4:


Thanks, everyone, for the answers. I found a solution that works:

Depending on how the cluster is set up, the scheduler behind qsub might use another variable name for array IDs. This was the case for me. I found out which variable to use by doing the following:

// job_script.sh

#!/bin/zsh
env >> ~/job_env
set >> ~/job_env

This appends all environment variables (env) and shell variables (set) visible to the job to a file called job_env. Look through the file for a variable that holds the array ID, that is, one whose value increases by one for each task. It should not be hard to find.

Remember to submit job_script.sh with qsub as follows:
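
To narrow the search, the dump can be filtered for variable names that look like array indices. This is a sketch, not part of the original solution: the function name find_task_vars and the TASK|ARRAY name pattern are assumptions, chosen to match names such as SGE_TASK_ID and PBS_ARRAYID.

```shell
# Hypothetical helper: list environment variables whose NAME
# contains TASK or ARRAY (the pattern is only a guess at likely names).
find_task_vars() {
    env | grep -E '^[A-Za-z0-9_]*(TASK|ARRAY)[A-Za-z0-9_]*='
}
```

Inside the job script, find_task_vars >> ~/job_env would then record only the candidate variables for each task.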

qsub -t 1-3 job_script.sh

In my case, the ID that was set was $PBS_ARRAYID, which is the name Torque uses for array jobs. I don't think that's the default for SGE, so $SGE_TASK_ID should work on standard SGE setups.
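
A job script that has to run on either kind of cluster could fall back from one variable to the other. A minimal sketch, assuming only the two variable names mentioned in this thread; the function name resolve_task_id is made up here:

```shell
# Hypothetical helper: print the array task ID from whichever
# variable the scheduler set, or return non-zero if neither is usable.
resolve_task_id() {
    local id="${SGE_TASK_ID:-}"
    # SGE reports "undefined" for non-array jobs, so treat that as unset.
    if [ -z "$id" ] || [ "$id" = "undefined" ]; then
        id="${PBS_ARRAYID:-}"
    fi
    if [ -z "$id" ]; then
        return 1
    fi
    echo "$id"
}
```

In the job script this might be used as task_id=$(resolve_task_id) || exit 1, followed by the array lookup.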

Cheers!



Source: https://stackoverflow.com/questions/16483977/sge-task-id-not-getting-set-with-qsub-array-grid-job
