[molpro-user] parallel 2006.1 on Opteron/myrinet cluster

Kirk Peterson kipeters at wsu.edu
Tue Aug 15 17:48:59 BST 2006


All,

After rebuilding both the myrinet and mpich software on this system,
the problem has changed to:

Last System Error Message from Task 1:: No such file or directory
[1] MPI Abort by user Aborting program !
[1] Aborting program!

Molpro jobs work fine on a single processor, and all the MPI and GA
test programs seem to run OK.  Any ideas on what file or directory
either GA or Molpro cannot find?
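
In case it helps to poke at this directly, here is a minimal sketch
(plain MPI plus POSIX, not Molpro code) that has every rank report
whether it can see a given directory on its node.  The default
/scratch path is only an assumption; pass whatever scratch/working
directory the Molpro job is actually given.  A directory that exists
on the frontend but not on the compute nodes would be one mundane way
to get a "No such file or directory" abort from task 1.

/* scratch_check.c - each MPI rank reports whether an assumed
 * scratch directory is visible on its node. */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    char host[256];
    const char *dir = (argc > 1) ? argv[1] : "/scratch";  /* assumed default */
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    gethostname(host, sizeof(host));

    if (access(dir, R_OK | W_OK | X_OK) == 0)
        printf("rank %d on %s: %s is accessible\n", rank, host, dir);
    else
        printf("rank %d on %s: cannot access %s\n", rank, host, dir);

    MPI_Finalize();
    return 0;
}

Compiled with the same mpicc that built Molpro and launched through
the same PBS/mpirun.ch_gm route, it at least shows whether the
failing task can reach the directories Molpro is being pointed at.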

regards,

Kirk


On Jul 19, 2006, at 11:50 PM, Sigismondo Boschi wrote:

> Dear Kirk,
>
> Actually it seems to be an mpich/gm configuration problem. Are you sure
> you compiled against the correct libraries/includes in /usr/local/mpich-gm
> and not against something else?
>
> How did you build GM and Molpro? We are doing exactly the same thing
> here, so if it is a distribution problem, we will run into it very
> soon.
>
> We are using LSF here in place of PBS, but it relies on
> mpirun.ch_gm in the same way.
>
> Regards,
>    Sigismondo Boschi
>
>
> Kirk Peterson wrote:
>> Dear parallel Molpro aficionados,
>> I've been working with a colleague here at WSU to get the 2006.1
>> version of Molpro up and running on their Opteron cluster.  I
>> should note that the 2002.6 version seems to run just fine.  The
>> nodes are networked with myrinet and the program was built with
>> v6.02 of the PGI compiler.  While all the test jobs run fine on
>> the frontend node (interactively), when submitting a test job
>> through PBS or running interactively on a compute node the program
>> dies before it starts with:
>>
>> Use of uninitialized value in subroutine entry at
>> /usr/local/mpich-gm/bin/mpirun.ch_gm.pl line 862.
>> Bad arg length for Socket::inet_ntoa, length is 0, should be 4 at
>> /usr/local/mpich-gm/bin/mpirun.ch_gm.pl line 862.
>>
>> The myrinet website notes that this error message implies an old
>> version of mpirun.ch_gm is being used with a newer gm or mpich-gm
>> library, but this is not the case.  As a further teaser, I can
>> successfully run the standard ga test job (v.4.0) using
>> mpirun.ch_gm on this compute node either via PBS or
>> interactively.  I can also reproduce this error message with the
>> 2002.6 version if I neglect to give the molpro program a
>> machinefile on the command line.  Something seems not to be
>> configured correctly, but I certainly haven't found it yet.
>>
>> Any hints would be greatly appreciated.
>>
>> -Kirk
>>
>> PS - I should mention that 2006.1 works just fine on my myrinet
>> cluster, but mine is athlon-based and uses somewhat older versions
>> of pgi, ga, gm, and mpich-gm.
>
>
> -- 
> -------------------------------------------------------------------------------
> User Guide: http://www.cineca.it/sap/files/user_guide_cineca.pdf
> SUPPORT REQUESTS: superc at cineca.it
> PERSONAL ADDRESS: s.boschi at cineca.it
> -------------------------------------------------------------------------------
> Sigismondo Boschi, Ph.D.               tel: +39 051 6171559
> CINECA (High Performance Systems)      fax: +39 051 6132198
> via Magnanelli, 6/3                    http://instm.cineca.it
> 40033 Casalecchio di Reno (BO)-ITALY   http://www.cineca.it
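
As an aside on the inet_ntoa failure quoted above: that Perl error
usually means one of the host names handed to mpirun.ch_gm did not
resolve to an IPv4 address, so the script's inet_ntoa call receives an
empty value, hence "length is 0, should be 4".  That would also fit
seeing the same message from 2002.6 when no machinefile is supplied.
Below is a small sketch of the same kind of check in C (the default
machinefile name is only a placeholder): it looks up every host listed
with gethostbyname() and flags any that do not resolve.

/* check_machinefile.c - report machinefile entries that do not
 * resolve to an IPv4 address (the situation that makes
 * inet_ntoa-style code fail). */
#include <stdio.h>
#include <string.h>
#include <netdb.h>
#include <arpa/inet.h>

int main(int argc, char **argv)
{
    const char *file = (argc > 1) ? argv[1] : "machinefile";  /* placeholder */
    char line[256];
    FILE *fp = fopen(file, "r");

    if (!fp) {
        perror(file);
        return 1;
    }
    while (fgets(line, sizeof(line), fp)) {
        struct hostent *h;
        struct in_addr addr;

        line[strcspn(line, " \t\r\n")] = '\0';   /* keep the hostname only */
        if (line[0] == '\0')
            continue;
        h = gethostbyname(line);
        if (h != NULL && h->h_length == (int) sizeof(addr)) {
            memcpy(&addr, h->h_addr_list[0], sizeof(addr));
            printf("%-20s -> %s\n", line, inet_ntoa(addr));
        } else {
            printf("%-20s -> does NOT resolve\n", line);
        }
    }
    fclose(fp);
    return 0;
}

Running it against the same machinefile that PBS or LSF hands to
mpirun.ch_gm should show quickly whether an unresolvable node name,
rather than anything in Molpro or GA, is behind that particular
message.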
