Andy May MayAJ1 at cardiff.ac.uk
Tue Apr 7 12:58:01 BST 2009


There are two issues here.

1. The difference between running when directly connected to the machine
and running via PBS is that the GA launcher, 'parallel', detects in the
first case that you are running locally, whereas in the second case it
thinks it must connect to the machine over the network (even if this is
not true). Thus, in the first case you will never see connection
problems.

2. The binary parallel versions of Molpro use GA built with TCGMSG. The
primary reason for this is that it allows us to easily package up the
TCGMSG launcher, 'parallel', which is quite portable compared to
packaging up mpirun etc. This situation will likely change in future
versions of Molpro. The downside of using 'parallel' is that it relies
upon rsh, not ssh, so you must check that users can rsh, without a
password, between the nodes in order for Molpro to work.
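
As a rough way to carry out the rsh check from point 2, something along
these lines could be run from inside a batch job. It is only a sketch:
it assumes Torque has set PBS_NODEFILE for the job and that an 'rsh'
client is on the PATH, and the function names (unique_hosts, rsh_works,
check_hosts) are made up for illustration and are not part of Molpro or
GA.

```python
import os
import subprocess


def unique_hosts(nodefile_text):
    """Return the distinct host names from a PBS node file, in order."""
    seen = []
    for line in nodefile_text.splitlines():
        host = line.strip()
        if host and host not in seen:
            seen.append(host)
    return seen


def rsh_works(host, timeout=10):
    """Return True if 'rsh host true' exits with status 0 in time."""
    try:
        result = subprocess.run(
            ["rsh", host, "true"],
            stdin=subprocess.DEVNULL,   # never wait for interactive input
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # rsh not installed, or the connection hung
        return False
    return result.returncode == 0


def check_hosts(hosts, probe=rsh_works):
    """Map each host to True/False according to the probe."""
    return {host: probe(host) for host in hosts}


if __name__ == "__main__":
    nodefile = os.environ.get("PBS_NODEFILE")  # assumption: set by Torque
    if nodefile is None:
        print("PBS_NODEFILE is not set; run this inside a batch job")
    else:
        with open(nodefile) as handle:
            hosts = unique_hosts(handle.read())
        for host, ok in check_hosts(hosts).items():
            print(host, "rsh OK" if ok else "rsh FAILED")
```

Any host reported as FAILED is one that 'parallel' would presumably be
unable to reach. A single-node job would test rsh from the node to
itself, which is the case described in point 1.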

Paul Hatton wrote:
> Can anyone advise me please? 
> At Molpro2008 I can't even get a single-core job to run *in our batch
> system (Torque/MOAB)* on our large (1500-core) AMD 64-bit Opteron dual
> processor/dual core Scientific Linux 5 cluster. I can run a single-core
> job OK on a logon node and also when I ssh to a worker node (we don't
> let users do this), but when I submit a batch job, either interactively
> (qsub -I) or as a non-interactive batch job I get a 'Connection refused'
> error. See the attached input file, job submission file which is run
> with
> qsub batch-job
> and output file. Does any of this look at all familiar?
> Discussion on the Molpro lists suggests that this may be something to do
> with us not allowing rsh, but I don't understand why it's OK when I ssh
> onto a node then issue the molpro command but not in the batch system.
> What I need to know is exactly what command is leading to the
> 'Connection refused' error but I don't think I can get that (can I?).
> Molpro 2006 was OK, but we built that along with Global Arrays (with
> some difficulty) from source whereas this is running the pre-compiled
> binary Version 2008.1 for architecture Linux/amd64, standard code, mpp
> (Patchlevel 5). I have a feeling I'll have to build this but that
> involves Global Arrays and such like which fill me with foreboding.
> Any help gratefully received.
