[molpro-user] mpi runs across multiple sandy bridge nodes failing

Benj FitzPatrick benjfitz at gmail.com
Fri Dec 13 20:52:29 GMT 2013


Hello,
Of the 3 clusters to which I have access, 2 have dual-socket, quad-core
processors and 1 has dual-socket, eight-core (Sandy Bridge) processors. On
the 8-core nodes I have no problems splitting an open-shell CCSD(T) geometry
optimization (attached) across multiple nodes (up to 9, requesting only 66
cores, as that is the number of gradients). However, when I try to use more
than one of the Sandy Bridge (sb) nodes, the job fails right at the beginning
with the output below. I asked the supercomputer techs, but they did not know
the cause, nor did they know how to run the job using mpirun and molpro.exe
directly (I tried this, but did not get far past adding -L path_to_molpro_lib
because I got many "forrtl: No such file or directory" errors).

I would greatly appreciate any thoughts regarding how to fix my pbs file
(at the bottom), either by calling the molpro wrapper or by using
molpro.exe and mpirun.
Thanks,
Benj



--------------------.out file-----------------------
channel 10: open failed: administratively prohibited: open failed
tmp = /home/blankda/fitzpatr/pdir//soft/molpro/2012.1.0/bin/molpro.exe.p
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=35853
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=44465
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=33990
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=44234
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=59324
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=60822
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=40737
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=49072
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=47593
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=48778
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=34877
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=39966
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=54140
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=52418
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=49038
 Creating: host=node1122, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=59762
 Creating: host=node1123, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=35146
 Creating: host=node1123, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=36352
 Creating: host=node1123, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=41440
 Creating: host=node1123, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=52777
 Creating: host=node1123, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=38916
 Creating: host=node1123, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=37340
 Creating: host=node1123, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=60017
 Creating: host=node1123, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=38961
 Creating: host=node1123, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=49443
 Creating: host=node1123, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=37504
 Creating: host=node1123, user=fitzpatr,
           file=/soft/molpro/2012.1.0/bin/molpro.exe, port=50897
 48: interrupt(1)
 13: interrupt(1)
  4: interrupt(1)
  8: interrupt(1)
 10: interrupt(1)
  0: interrupt(1)
  6: interrupt(1)
 11: interrupt(1)
  1: interrupt(1)
 15: interrupt(1)
 14: interrupt(1)
 12: interrupt(1)
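
One way to summarize the log above is to count the "Creating: host=..." lines per host, which shows how many processes were started on each node before the interrupts. This is only a sketch: the function name count_spawned is hypothetical, and it assumes the log format shown above. Counting the excerpt as posted gives 16 lines for node1122 and 11 for node1123, and no line for a third host, although the PBS file requests 3 nodes.

```shell
# count_spawned FILE
# Hypothetical helper: tally "Creating: host=<name>," lines per host.
# Assumes the log format shown in the .out excerpt above.
count_spawned() {
    awk -F'host=' '/Creating: host=/ {
                       split($2, a, ",")   # a[1] is the host name
                       n[a[1]]++
                   }
                   END { for (h in n) print h, n[h] }' "$1" | sort
}

# Example call (the file name is an assumption):
#   count_spawned ./c6h7-int4-t-ts-int2-opt-ccsdt-avdz-tight-b3.out
```

Each output line has the form "host count", so a host missing from the list never had a process created on it.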


----------------------------.pbs file-----------------------------
#!/bin/bash -l
#
# Job: ./pr14_opt-tight-lmp2-vdz-a
#
#
# To submit this script to the queue type:
#    qsub ./test.pbs
#
#PBS -m n
#PBS -l nodes=3:ppn=16
#PBS -l walltime=1:00:00
#PBS -l pmem=1800mb
#PBS -e ${PBS_JOBID}.e
#PBS -o ${PBS_JOBID}.o
#PBS -q sb128


NPROC=16
JOBNAME="c6h7-int4-t-ts-int2-opt-ccsdt-avdz-tight-b3"


SCRATCH="/lustre/${USER}/${PBS_JOBID}"
SCRATCH2=${SCRATCH}

export TMPDIR="${SCRATCH}"
export TMPDIR4="${SCRATCH2}"

mkdir -p ${SCRATCH2}
mkdir -p ${SCRATCH}
cd ${SCRATCH}
cp ${PBS_O_WORKDIR}/${JOBNAME}.inp ${SCRATCH}


module load intel ompi/intel molpro

molpro --nodefile $PBS_NODEFILE -n 48/${NPROC} --mppx -d ${SCRATCH} \
    < ./${JOBNAME}.inp >& ./${JOBNAME}.out
cp ${SCRATCH}/${JOBNAME}.out ${PBS_O_WORKDIR}
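
The PBS file above hard-codes 48 as the total process count (3 nodes x 16 cores per node). As a sanity check, the same numbers can be derived from the node file inside the job. This is a minimal sketch, assuming a Torque-style $PBS_NODEFILE with one line per allocated core; the function names total_slots and unique_hosts are hypothetical.

```shell
# total_slots FILE: number of non-empty lines, i.e. allocated cores
# (assumes one line per core, as in a Torque-style node file).
total_slots() {
    awk 'NF' "$1" | wc -l | tr -d ' '
}

# unique_hosts FILE: number of distinct host names in the node file.
unique_hosts() {
    awk 'NF' "$1" | sort -u | wc -l | tr -d ' '
}

# Inside a running job, print what the scheduler actually allocated.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "${PBS_NODEFILE}" ]; then
    echo "hosts: $(unique_hosts "${PBS_NODEFILE}")"
    echo "slots: $(total_slots "${PBS_NODEFILE}")"
fi
```

If the printed values differ from the 3 hosts and 48 slots requested, the count passed to molpro does not match the allocation.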
-------------- next part --------------
A non-text attachment was scrubbed...
Name: c6h7-int4-t-ts-int2-opt-ccsdt-avdz-tight-b3.inp
Type: application/octet-stream
Size: 1116 bytes
Desc: not available
URL: <http://www.molpro.net/pipermail/molpro-user/attachments/20131213/b6ebcf32/attachment.obj>

