Question
Does anyone know which resource manager works well with PVM? Or should I drop PVM and rely on MPI instead (or an implementation of it, such as MPICH2; are there better ones?)? The main reason for using PVM is that the person who started this project before me assumed it. Now that the project is mine (he did no significant work that relies on PVM), this can easily be changed, preferably to something that is easy to install, because installing and setting up PVM was a big hassle.
I'm leaning towards Sun Grid Engine (SGE) because I have dedicated hardware, and another post comparing resource managers for dedicated hardware suggests SGE is the winner. However, I'm unsure how well it performs with PVM. Has anyone had any experience with PVM and SGE?
If you use SGE, what do you use to communicate from computer to computer (or virtual machine to virtual machine)?
Oh, and I will be running Perl applications/lines, if this matters.
Any suggestions or ideas?
Thanks in advance for any comments,
- Tyug
Answer 1:
I run PVM on Linux systems using Torque, SGE and LSF without any problems. Are you asking whether it is possible to use SGE, Torque, etc. to run PVM applications?
If so, check out my example Linux C-shell job scripts below. The scripts are nearly identical; only the header of each script differs (#$ lines for SGE, #PBS lines for Torque), because it has to follow the directive format of the respective resource manager.
SGE job script:
#!/bin/csh
#$ -N LTR-001
#$ -o LTR-001.output
#$ -e LTR-001.error
#$ -pe comp 24
#$ -l h_rt=04:00:00
#$ -A cmit2
#$ -cwd
#$ -V
# Set up environment
setenv LD_LIBRARY_PATH /lfs0/projects/cmit2/opt-intel/overture-noX/lib:${LD_LIBRARY_PATH}
setenv PVM_ARCH LINUX
setenv PVM_ROOT /lfs0/projects/cmit2/opt-intel/pvm3
setenv PVM_BIN ${PVM_ROOT}/bin
setenv PVM_RSH /usr/bin/ssh
setenv MY_HOSTS pvm_hostfile
rm -f ~/.pvmprofile
env | grep PVM_ > ~/.pvmprofile
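# Create file containing _unique_ host names.  Note that there are two possible sources of available hosts
# MACHINE_FILE is assumed to be set by this site's SGE setup; stock SGE exposes the allocation as PE_HOSTFILE
sort -k 1,1 -u ${MACHINE_FILE} >! ${MY_HOSTS}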
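# Start PVM & add nodes: launch the daemon with the hosts in MY_HOSTS,
# print the configuration (conf), then leave the console (quit) with the daemon still running
printf "%s\n%s\n" conf quit | ${PVM_ROOT}/lib/pvm ${MY_HOSTS}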
wait
sleep 2
#
# Run apps requiring PVM.
#
wait
# Exit PVM daemon: kill any remaining PVM tasks (reset), then shut down the daemons (halt)
echo "reset" | $PVM_ROOT/lib/pvm
echo "halt" | $PVM_ROOT/lib/pvm
Torque job script:
#!/bin/csh
#PBS -N LTR-001
#PBS -o LTR-001.output
#PBS -e LTR-001.error
#PBS -l nodes=3:ppn=8
#PBS -l walltime=04:00:00
#PBS -q compute
#PBS -d .
# Set up environment
setenv LD_LIBRARY_PATH /users/ps14/opt-intel/overture/lib:${LD_LIBRARY_PATH}
setenv PVM_ARCH LINUX64
setenv PVM_ROOT /users/ps14/opt-intel/pvm3
setenv PVM_BIN ${PVM_ROOT}/bin
setenv PVM_RSH ${PVM_ROOT}/ssh
setenv MY_HOSTS pvm_hostfile
rm -f ~/.pvmprofile
env | grep PVM_ > ~/.pvmprofile
# Create file containing _unique_ host names.  Note that there are two possible sources of available hosts
sort -k 1,1 -u ${PBS_NODEFILE} >! ${MY_HOSTS}
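# Start PVM & add nodes: launch the daemon with the hosts in MY_HOSTS,
# print the configuration (conf), then leave the console (quit) with the daemon still running
printf "%s\n%s\n" conf quit | ${PVM_ROOT}/lib/pvm ${MY_HOSTS}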
wait
sleep 2
#
# Run apps requiring PVM.
#
wait
# Exit PVM daemon: kill any remaining PVM tasks (reset), then shut down the daemons (halt)
echo "reset" | $PVM_ROOT/lib/pvm
echo "halt" | $PVM_ROOT/lib/pvm
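
For what it's worth, the host-deduplication step used in both scripts can be tried outside a batch job. Below is a minimal sketch, written in POSIX sh rather than csh so it runs in any shell session. The function name make_pvm_hostfile and the node names are made up for illustration and are not part of PVM, SGE or Torque. With a request like nodes=3:ppn=8, Torque's PBS_NODEFILE lists each host once per allocated core, which is why the list is reduced to unique names before it is handed to PVM:

```shell
#!/bin/sh
# Hypothetical helper (not part of PVM or any resource manager):
# reduce a node file that repeats host names to a list of unique hosts,
# using the same sort invocation as the job scripts above.
make_pvm_hostfile() {
    nodefile=$1   # input: one host name per line, possibly repeated
    hostfile=$2   # output: each host name exactly once, sorted
    sort -k 1,1 -u "$nodefile" > "$hostfile"
}

# Example with placeholder host names standing in for a real node file.
tmp_nodes=$(mktemp)
tmp_hosts=$(mktemp)
printf '%s\n' node02 node01 node02 node03 node01 > "$tmp_nodes"

make_pvm_hostfile "$tmp_nodes" "$tmp_hosts"
cat "$tmp_hosts"   # prints node01, node02, node03, one per line

rm -f "$tmp_nodes" "$tmp_hosts"
```

In a real job the first argument would be the file named by the resource manager's node-list variable, and the second the file passed to the pvm console.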
Source: https://stackoverflow.com/questions/2320976/sungridengine-condor-torque-as-resource-managers-for-pvm