[molpro-user] MPI_ERR_ARG error with openmpi version of molpro2012

Michael Mckee mckeeml at auburn.edu
Fri Feb 6 14:40:24 GMT 2015


When I run Molpro on a large CCSD(T) calculation using 30 GB per CPU,
the program runs to completion when all CPUs are on the same node (4
CPUs and 120 GB of memory).  It also runs correctly when the CPUs are on
different nodes but a shared file system is used (i.e. all files are on
the same file system).  However, if local storage is used and the CPUs
are on different nodes, I always get an MPI error (with openmpi-1.8.4,
and also with openmpi-1.8.3) right after the output shown below.

Has anyone seen this behavior?

Mike McKee
Auburn University

------------------- start of output -----------------------
 Primary working directories    : /tmp/scratch-local
 Secondary working directories  : /tmp/scratch-local
 Wavefunction directory         : /home/aubmlm/wfu/
 Main file repository           : /tmp/scratch-local/

 SHA1      : bcd629ae4e472137acd19c88fd5379f7579943a3
 NAME      : 2012.1.20
 ARCHNAME  : Linux/x86_64
 FC        : /apps/dmc/apps/intel_2015.0.090/composer_xe_2015.0.090/bin/intel64/ifort
 FCVERSION : 15.0.0
-lmkl_intel_ilp64 -lmkl_sequential -lmkl_core
 id        : auburnkp
 Nodes     nprocs
 dmc32        2
 dmc35        2
 dmc36        1
 Number of processes for MPI-2 version of Molpro:   nprocs(total)=    6   nprocs(compute)=    5   nprocs(helper)=    1
 ga_uses_ma=false, calling ma_init with nominal heap.
 GA-space will be limited to   8.0 MW (determined by -G option)

 Using customized tuning parameters: mindgm=1; mindgv=20; mindgc=4; mindgr=1; noblas=0; minvec=7
 default implementation of scratch files=ga  


--------------- output lines missing here ----------------------

1PROGRAM * CCSD (Restricted open-shell coupled cluster)     Authors: C. Hampel, H.-J. Werner, 1991, M. Deegan, P.J. Knowles, 1992

 Convergence thresholds:  THRVAR = 1.00D-08  THRDEN = 4.02D-06

 CCSD(T)     terms to be evaluated (factor= 1.000)

 Number of core orbitals:          40 (  21  19 )
 Number of closed-shell orbitals:  60 (  34  26 )
 Number of active  orbitals:        1 (   1   0 )
 Number of external orbitals:     459 ( 230 229 )

 For full I/O caching in triples, increase memory by 1554.23 Mwords to 5554.51 Mwords.

 Number of N-1 electron functions:             121
 Number of N-2 electron functions:            7260
 Number of singly external CSFs:             27847
 Number of doubly external CSFs:         579350790
 Total number of CSFs:                   579378637

 Molecular orbitals read from record     2100.2  Type=RHF/CANONICAL (state 1.1)
-------------------- end of output ---------------------------

------------------- mpi error message -------------------------
[dmc37:116946] *** An error occurred in MPI_Alloc_mem
[dmc37:116946] *** reported by process [2587623425,140733193388037]
[dmc37:116946] *** on communicator MPI_COMM_WORLD
[dmc37:116946] *** MPI_ERR_ARG: invalid argument of some other kind
[dmc37:116946] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[dmc37:116946] ***    and potentially your MPI job)
------------------- end of mpi error message ---------------------
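
The memory figures quoted in the output above can be cross-checked with
a short calculation.  The sketch below is only illustrative: it assumes
that one Molpro word is 8 bytes, and the helper names (mwords_to_bytes,
mwords_to_gib) are hypothetical, not part of Molpro or Open MPI.  The
input numbers are taken from the triples caching message and from the
GA-space line.

```python
# Assumption: one Molpro "word" is 8 bytes (one double-precision number).
BYTES_PER_WORD = 8


def mwords_to_bytes(mwords):
    """Convert megawords (10**6 words) to bytes."""
    return int(round(mwords * 1_000_000 * BYTES_PER_WORD))


def mwords_to_gib(mwords):
    """Convert megawords to GiB (2**30 bytes)."""
    return mwords_to_bytes(mwords) / 2**30


# The triples message says: increase memory by 1554.23 Mwords to 5554.51 Mwords,
# so the current allocation is the difference of the two figures.
current_mwords = 5554.51 - 1554.23

print(f"current allocation: {current_mwords:.2f} Mwords "
      f"= {mwords_to_gib(current_mwords):.1f} GiB")
print(f"GA space (-G)     : 8.0 Mwords "
      f"= {mwords_to_bytes(8.0) / 1e6:.0f} MB")
```

Under that assumption the current allocation is about 4000 Mwords, or
roughly 29.8 GiB, which is consistent with the 30GB/cpu quoted above,
while the GA space set by the -G option is only 64 MB.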
