[molpro-user] parallel 2006.1 on Opteron/myrinet cluster

Kirk Peterson kipeters at wsu.edu
Thu Jul 20 02:52:11 BST 2006


Dear parallel Molpro aficionados,

I've been working with a colleague here at WSU to get the 2006.1
version of Molpro up and running on their Opteron cluster.  I should
note that the 2002.6 version seems to run just fine.  The nodes are
networked with Myrinet and the program was built with v6.02 of the
PGI compiler.  While all the test jobs run fine on the frontend node
(interactively), when submitting a test job through PBS or running
interactively on a compute node the program dies before it starts, with:
Use of uninitialized value in subroutine entry at /usr/local/mpich-gm/bin/mpirun.ch_gm.pl line 862.

Bad arg length for Socket::inet_ntoa, length is 0, should be 4 at /usr/local/mpich-gm/bin/mpirun.ch_gm.pl line 862.

The Myrinet website notes that this error message implies an old
version of mpirun.ch_gm is being used with a newer GM or MPICH-GM
library, but this is not the case.  As a further teaser, I can
successfully run the standard GA test job (v4.0) using mpirun.ch_gm
on this compute node, either via PBS or interactively.  I can also
reproduce this error message with the 2002.6 version if I neglect to
give the molpro program a machinefile on the command line.  Something
seems not to be configured correctly, but I certainly haven't found it
yet.
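
One check that may help narrow this down: in Perl, Socket::inet_ntoa complains about a length of 0 when it is handed an empty address, which is typically what a failed host-name lookup returns. Whether that is what happens at line 862 here is an assumption, not something confirmed above. The sketch below (Python, standard library only) reads a machinefile and lists any host names that the local resolver cannot look up. The helper names are hypothetical, and the assumed machinefile format is one host per line, optionally followed by ":N" or extra fields.

```python
import socket


def parse_hosts(lines):
    """Return host names from machinefile-style lines.

    Blank lines and '#' comments are skipped. Assumed format: the host
    name comes first, optionally followed by ':N' or further fields.
    """
    hosts = []
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        first_field = line.split()[0]
        hosts.append(first_field.split(":", 1)[0])
    return hosts


def unresolved_hosts(hosts, resolve=socket.gethostbyname):
    """Return the hosts for which the resolver raises an error."""
    bad = []
    for host in hosts:
        try:
            resolve(host)
        except OSError:  # socket.gaierror and socket.herror are subclasses
            bad.append(host)
    return bad


def check_machinefile(path):
    """Return the host names in the file at `path` that do not resolve."""
    with open(path) as fh:
        return unresolved_hosts(parse_hosts(fh))
```

Calling check_machinefile() on the compute node, with the same machinefile that is passed to molpro (or, inside a PBS job, the file named by $PBS_NODEFILE), returns a list of names that failed to resolve. An empty list would suggest the problem lies elsewhere, for example in how the host list reaches mpirun.ch_gm.pl.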

Any hints would be greatly appreciated.

-Kirk

PS - I should mention that 2006.1 works just fine on my Myrinet
cluster, but mine is Athlon-based and uses somewhat older versions of
PGI, GA, GM, and MPICH-GM.


