Parallel Molpro won't run on >1 node!

The Matt thompsma at colorado.edu
Tue Jun 24 22:20:08 BST 2003


Dear Molpro List:

I am trying to get Molpro to run on a cluster of many two-processor nodes.
Right now, if I run molpro with -n2 on a single node, it runs great.

The problem occurs when I try to run on two or more nodes.  Note that
tcgmsg/parallel works fine across all nodes with the test.x program
from the GA distribution, but molpro fails.  For example, using -n4 on
two nodes, the job submitted through qsub dies with this:

:: PROCGRP file /data/procgrp.00030948 ::
:: thompsma keck2 2
/home/other/thompsma/lib/molpro-mpp-Linux-i686-i4-2002.6/molprop_2002_6_tcgmsg.exe /data
:: thompsma keck37 2
/home/other/thompsma/lib/molpro-mpp-Linux-i686-i4-2002.6/molprop_2002_6_tcgmsg.exe /data
cd /data/
long output file: /home/other/thompsma/QChem/normal_dft.log
/home/other/thompsma/lib/molpro-mpp-Linux-i686-i4-2002.6/parallel
/home/other/thompsma/lib/molpro-mpp-Linux-i686-i4-2002.6/molprop_2002_6_tcgmsg.exe
keck37: Connection refused
  4: interrupt(1)
  0: interrupt(1)
  1: interrupt(1)
status=256
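
For anyone wanting to sanity-check a PROCGRP file like the one echoed above, here is a minimal Python sketch. It assumes the usual five-field TCGMSG layout (user, host, number of processes, executable, working directory) with each entry on a single line, and it assumes that lines starting with "#" are comments. The names ProcgrpEntry and parse_procgrp are made up for this illustration and are not part of Molpro or GA.

```python
from typing import List, NamedTuple


class ProcgrpEntry(NamedTuple):
    """One line of a TCGMSG-style PROCGRP file (assumed five-field layout)."""
    user: str
    host: str
    nproc: int
    executable: str
    workdir: str


def parse_procgrp(text: str) -> List[ProcgrpEntry]:
    """Parse PROCGRP text into entries, skipping blank and '#' lines."""
    entries = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):  # assumption: '#' marks a comment
            continue
        fields = line.split()
        if len(fields) != 5:
            raise ValueError(
                f"line {lineno}: expected 5 fields, got {len(fields)}"
            )
        user, host, nproc, executable, workdir = fields
        entries.append(ProcgrpEntry(user, host, int(nproc), executable, workdir))
    return entries


# Sample text mirroring the two entries shown in the job output,
# with each entry joined onto a single line.
EXE = ("/home/other/thompsma/lib/molpro-mpp-Linux-i686-i4-2002.6/"
       "molprop_2002_6_tcgmsg.exe")
SAMPLE = f"thompsma keck2 2 {EXE} /data\nthompsma keck37 2 {EXE} /data\n"

entries = parse_procgrp(SAMPLE)
total_procs = sum(e.nproc for e in entries)
hosts = [e.host for e in entries]

if __name__ == "__main__":
    print(f"{total_procs} processes on hosts: {', '.join(hosts)}")
```

With -n4 on two dual-processor nodes, the total should come out to 4 processes spread over keck2 and keck37, which matches what the job output shows.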

Now, the PROCGRP part is exactly right: it's what I would write as a .p
file for tcgmsg/parallel.  The "keck37: Connection refused" error is the
part I can't explain.  I can ssh/rsh to every node just fine, and all the
needed keys are in .ssh/authorized_keys2.  And, as I said, parallel
test.x works great on all nodes (with an identical PROCGRP, apart from
the application field).
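
A "Connection refused" message generally means a TCP connection attempt reached the host but nothing accepted it on that port. One way to narrow this down is to probe the remote-shell ports directly from the launching node. The sketch below is only an illustration: it assumes the launcher reaches the other node over either ssh (port 22) or rsh (port 514), which the output above does not confirm, and the helper names port_open and probe are hypothetical.

```python
import socket
from typing import Dict, Iterable, Tuple


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-lookup failures.
        return False


def probe(hosts: Iterable[str],
          ports: Iterable[int] = (22, 514)) -> Dict[Tuple[str, int], bool]:
    """Check each (host, port) pair; 22 is ssh, 514 is rsh (assumed relevant)."""
    ports = tuple(ports)
    return {(h, p): port_open(h, p) for h in hosts for p in ports}


if __name__ == "__main__":
    # Host names taken from the PROCGRP file in this post.
    for (host, port), ok in probe(["keck2", "keck37"]).items():
        print(f"{host}:{port} {'open' if ok else 'not reachable'}")
```

If one of the ports is closed on keck37 but open on keck2, that would point at which remote-shell mechanism is being refused. If both are open, the refusal is more likely coming from somewhere else.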

So, why is parallel molpro failing where parallel test.x succeeds?  Any
help will be much appreciated.

Thanks,
Matt Thompson
-- 
"And isn't sanity really just a one-trick pony, anyway?  I mean,
all you get is one trick, rational thinking, but when you're good
and crazy, ooh ooh ooh, the sky's the limit!" -- The Tick
  The Matt -- http://ucsub.colorado.edu/~thompsma/



