[molpro-user] Parallel Molpro compilation on AMD/Opteron - ARMCI problem with executable

Łukasz Rajchel lrajchel1981 at gmail.com
Tue Apr 26 08:30:42 BST 2011


Dear Molpro users,

I've recently compiled the parallel 2010.1 version of Molpro on an Opteron
cluster using ga-5-0-2 and openmpi-1.4.3. The compilation proceeds
without problems; however, the resulting executable won't run at all.
The details are as follows:

The machine is an AMD/Opteron cluster of 4-core nodes connected with
InfiniBand:

------------------------------------------------
uname -a
Linux ls-b03 2.6.30.10 #1 SMP Wed Oct 13 20:29:38 CEST 2010 x86_64  
x86_64 x86_64 GNU/Linux
------------------------------------------------

I first compiled Global Arrays with the following options:

------------------------------------------------
mkdir bld_openmpi && cd bld_openmpi && \
../configure MPIF77=mpif77 MPICC=mpicc MPICXX=mpicxx \
  LIBS="-pthread -I/opt/openmpi/1.4.3/lib -L/opt/openmpi/1.4.3/lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl" \
  --prefix=$HOME/ga/ga-5-0-2-install --enable-cxx --enable-shared \
  --with-blas="-L/opt/acml/gfortran64_mp/lib -lacml_mp" --with-openib
------------------------------------------------

and then Molpro with

------------------------------------------------
./configure -i4 -blas -lapack -mpp -mppbase $HOME/ga/ga-5-0-2/bld_openmpi -openmpi \
  -var CFLAGS=-I/opt/openmpi/1.4.3/include -var LD_ENV=/opt/openmpi/1.4.3/lib
------------------------------------------------

For BLAS/LAPACK and MPILIB I've put:

------------------------------------------------
-L/opt/acml/gfortran64_mp/lib -lacml_mp
-pthread -I/opt/openmpi/1.4.3/lib -L/opt/openmpi/1.4.3/lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
------------------------------------------------

[MPILIB taken as $(mpif90 --showme:link)]
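
In effect (whether given at the configure prompts or on the command line), the configuration is equivalent to the following; this is my reconstruction from the CONFIGURE_OPTIONS line in the CONFIG below, with helper shell variables introduced only for readability:

------------------------------------------------
ACML="-L/opt/acml/gfortran64_mp/lib -lacml_mp"
MPILIB="$(mpif90 --showme:link)"
./configure -i4 -blas -lapack -mpp -mppbase $HOME/ga/ga-5-0-2/bld_openmpi -openmpi \
  -var CFLAGS=-I/opt/openmpi/1.4.3/include -var LD_ENV=/opt/openmpi/1.4.3/lib \
  -var BLASLIB="$ACML" -var LAPACKLIB="$ACML" -var MPILIB="$MPILIB"
------------------------------------------------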

The resulting CONFIG is

------------------------------------------------
# MOLPRO CONFIG generated at Mon Apr 25 07:46:48 CEST 2011, for host ls-b01

CONFIGURE_OPTIONS="-i4" "-blas" "-lapack" "-mpp" "-mppbase" "/home/staff/lukarajc/ga/ga-5-0-2/bld_openmpi" "-openmpi" "-var" "CFLAGS=-I/opt/openmpi/1.4.3/include" "-var" "LD_ENV=/opt/openmpi/1.4.3/lib" "-var" "BLASLIB=-L/opt/acml/gfortran64_mp/lib -lacml_mp" "-var" "LAPACKLIB=-L/opt/acml/gfortran64_mp/lib -lacml_mp" "-var" "MPILIB=-pthread -I/opt/openmpi/1.4.3/lib -L/opt/openmpi/1.4.3/lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl" "-var" "INSTBIN=exec" "-var" "INSTLIB=aux" "-var" "INSTHTML=manual"

AR=ar
ARCHNAME=Linux/x86_64
ARFLAGS=-rS
AWK=awk
BIBTEX=
BLASLIB=-L/opt/acml/gfortran64_mp/lib -lacml_mp
BUILD=p
CAT=cat
CC=/usr/bin/gcc
CCVERSION=4.4.4
CC_FRONT=
CDEBUG=-g $(addprefix $(CDEFINE),_DEBUG)
CDEFINE=-D
CFLAGS=-I/opt/openmpi/1.4.3/include -m64 -Waddress -Wcast-align -Wchar-subscripts -Wcomment -Wformat -Wimplicit -Wimplicit-int -Wimplicit-function-declaration -Wmain -Wmissing-braces -Wmultichar -Wnested-externs -Wparentheses -Wpointer-arith -Wpointer-sign -Wreturn-type -Wsequence-point -Wsign-compare -Wstrict-aliasing -Wstrict-overflow=1 -Wswitch -Wtrigraphs -Wuninitialized -Wunknown-pragmas -Wunused-function -Wunused-label -Wunused-value -Wunused-variable -Wvolatile-register-var -DZLIB
CLEARSPEEDLIB=
CMPPINCLUDE=/home/staff/lukarajc/ga/ga-5-0-2-install/include
COPT=-O3
COPT0=-O0
COPT1=-O1
COPT2=-O2
COPT3=-O3
CP=cp -p
CPROFILE=-p
CSCN=cscn
CSFLAGS=-O3 -I. --dynamic
CUDACC=
CUDACCVERSION=
CUDACDEBUG=-g $(addprefix $(CUDACDEFINE),_DEBUG)
CUDACDEFINE=-D
CUDACFLAGS=
CUDACOPT=
CUDACOPT0=-O0
CUDACOPT1=-O1
CUDACOPT2=-O2
CUDACOPT3=-O3
CUDACPROFILE=-p
CXX=/usr/bin/g++
CXXFLAGS=$(filter-out -Wimplicit-function-declaration -Wimplicit-int -Wnested-externs -Wpointer-sign,$(CFLAGS))
DOXYGEN=
ECHO=echo
EXPORT=export
F90FLAGS=
FC=/usr/bin/gfortran
FCVERSION=4.4.4
FDEBUG=-g $(addprefix $(FDEFINE),_DEBUG)
FDEFINE=-D
FFLAGS=-fdefault-real-8 -Waggregate-return -Waliasing -Wampersand -Wcharacter-truncation -Wintrinsics-std -Wline-truncation -Wno-tabs -Wsurprising -Wunderflow
FOPT=-O3
FOPT0=-O0
FOPT1=-O1
FOPT2=-O2
FOPT3=-O3
FPP=-x f77-cpp-input
FPROFILE=-p
FSTATIC=
FTCFLAGS=molpro unix unix-i4 Linux fortran_mem mpp blas lapack
HDF5INCLDE=
HDF5LIB=-L/usr/lib64 -lhdf5 -lhdf5_hl -lhdf5_fortran
HOSTFILE_FORMAT=%N
INSTBIN=exec
INSTHTML=manual
INSTLIB=aux
INST_PL=0
INTEGER=4
LAPACKLIB=-L/opt/acml/gfortran64_mp/lib -lacml_mp
LATEX2HTML=
LAUNCHER=/opt/openmpi/1.4.3/bin/mpirun --mca mpi_warn_on_fork 0 -machinefile %h -np %n %x
LD_ENV=/opt/openmpi/1.4.3/lib
LD_ENVNAME=LD_LIBRARY_PATH
LIBRARY_SUFFIX=a
LIBS=-L/readonly/usr/bin/../lib/gcc/x86_64-redhat-linux/4.4.4 -lstdc++ -lz
LIBS_FRONT=
LINKOPT=
LINKOPT_FRONT=
LN=ln -s
MACROS=MOLPRO MOLPRO_gfortran MOLPRO_f2003 GA_TOOLS GA_MPI GA_VERSION_GE_5 BLAS_INT=4 LAPACK_INT=4 MOLPRO_FORCE_VECTOR MOLPRO_NEXTSCALAR MOLPRO_NO_RECURRENCE MOLPRO_NOVECTOR MOLPRO_SHORTLOOP _MOLCAS_MPP_
MAKEDEPEND_OPTIONS=
MAKEINDEX=
MAPLE=
MKDIR=mkdir -p
MODULE_FLAG=-I
MODULE_SUFFIX=mod
MPILIB=-pthread -I/opt/openmpi/1.4.3/lib -L/opt/openmpi/1.4.3/lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
MPPLIB=-L/home/staff/lukarajc/ga/ga-5-0-2-install/lib -lga -larmci
OBJECT_SUFFIX=o
OPT0=kraft1a.F parse.f
OPT1=getvar.f
OPT2=
OPT3=
PAPER=a4paper
PARSE=parse-Linux-x86_64-i4.o
PDFLATEX=
PNAME=molprop_2010_1_Linux_x86_64_i4
PTSIZE=11
RANLIB=ranlib
RM=rm -rf
SHELL=/bin/sh
STRIP=strip
SUFFIXES=f F f90 F90 c cpp
TAR=tar -cf
UNTAR=tar -xf
VERBOSE=@
VERSION=2010.1
XSD=/usr/bin/xmllint --noout --schema
XSLT=/usr/bin/xsltproc
YACC=bison -b y

.SUFFIXES:
MAKEFLAGS+=-r
ifneq ($(LD_ENVNAME),)
$(LD_ENVNAME):=$(LD_ENV):$($(LD_ENVNAME))
endif
------------------------------------------------

Now, when I try, for example, to run Molpro on a simple job (RHF for the He
atom) on 8 cores (2 nodes), I get:

------------------------------------------------
  # PARALLEL mode
  nodelist=8
  first   =8
  second  =
  third   =
  HOSTFILE_FORMAT: $hostname

ls-b03
ls-b03
ls-b03
ls-b03
ls-b01
ls-b01
ls-b01
ls-b01

  export LD_LIBRARY_PATH='/opt/openmpi/1.4.3/lib::/home/staff/lukarajc/ga/ga-5-0-2-install/lib:/opt/acml/gfortran64_mp/lib'
  export AIXTHREAD_SCOPE='s'
         INSTLIB=''
  export MP_NODES='0'
  export MP_PROCS='8'
         MP_TASKS_PER_NODE=''
  export MOLPRO_NOARG='1'
  export MOLPRO_OPTIONS=' -v He.inp'
  export MOLPRO_OPTIONS_FILE='/tmp/molpro_options.24516'
         MPI_MAX_CLUSTER_SIZE=''
  export PROCGRP='/tmp/procgrp.24516'
  export RT_GRQ='ON'
         TCGRSH=''
         TMPDIR=''
  export XLSMPOPTS='parthds=1'
/opt/openmpi/1.4.3/bin/mpirun --mca mpi_warn_on_fork 0 -machinefile /tmp/procgrp.24516 -np 8 /home/staff/lukarajc/molpro-dev/molpro2010.1-gfortran-ga/bin/molpro.exe -v He.inp
ARMCI configured for 2 cluster nodes. Network protocol is 'OpenIB Verbs API'.
  input from /home/staff/lukarajc/molpro-jobs/He.inp
  output to /home/staff/lukarajc/molpro-jobs/He.out
3:Segmentation Violation error, status=: 11
1:Segmentation Violation error, status=: 11
(rank:1 hostname:ls-b03 pid:24667):ARMCI DASSERT fail. ../../armci/src/signaltrap.c:SigSegvHandler():312 cond:0
2:Segmentation Violation error, status=: 11
(rank:2 hostname:ls-b03 pid:24668):ARMCI DASSERT fail. ../../armci/src/signaltrap.c:SigSegvHandler():312 cond:0
(rank:3 hostname:ls-b03 pid:24669):ARMCI DASSERT fail. ../../armci/src/signaltrap.c:SigSegvHandler():312 cond:0
6:Segmentation Violation error, status=: 11
(rank:6 hostname:ls-b01 pid:2027):ARMCI DASSERT fail. ../../armci/src/signaltrap.c:SigSegvHandler():312 cond:0
7:Segmentation Violation error, status=: 11
4:Segmentation Violation error, status=: 11
(rank:4 hostname:ls-b01 pid:2025):ARMCI DASSERT fail. ../../armci/src/signaltrap.c:SigSegvHandler():312 cond:0
0:Segmentation Violation error, status=: 11
(rank:0 hostname:ls-b03 pid:24666):ARMCI DASSERT fail. ../../armci/src/signaltrap.c:SigSegvHandler():312 cond:0
(rank:7 hostname:ls-b01 pid:2028):ARMCI DASSERT fail. ../../armci/src/signaltrap.c:SigSegvHandler():312 cond:0
------------------------------------------------
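
For reference, He.inp is just a minimal single-point RHF job, roughly along these lines (the basis set shown here is only illustrative and may differ from the one in my actual file):

------------------------------------------------
***,He RHF test
geometry={He}          ! single helium atom
basis=cc-pVDZ          ! illustrative small basis
hf                     ! closed-shell RHF
------------------------------------------------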

The things I've figured out so far (looking at http://www.molpro.net/pipermail/molpro-user/2009-December/003421.html):

* the launcher works fine with the test.x program from Global Arrays, i.e.

------------------------------------------------
/opt/openmpi/1.4.3/bin/mpirun --mca mpi_warn_on_fork 0 -np 8 test.x
------------------------------------------------
yields no errors.

* the OpenMPI-only installation of Molpro works fine (but it uses one
processor per node as a helper and is thus slower than the GA-based one).

* /tmp is writable, so the machine file is not the problem.

* the same error occurs regardless of the number of CPUs given with the
-n switch (see the example right after this list).
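
For example (the specific core counts below are just illustrative), invocations along these lines all die with the same ARMCI segmentation violations:

------------------------------------------------
molpro -n 2 He.inp
molpro -n 4 He.inp
molpro -n 8 He.inp
------------------------------------------------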

I know I could use the -auto-ga-openmpi option (in fact I did use it, and
a strange thing happens during make tuning: the tuning.out file keeps
growing to a monstrous size, but that's another story), but I'd first
like to use the version of Global Arrays I've compiled myself.

I'd be greatly obliged for your ideas on how to overcome this issue.

best regards,

Łukasz Rajchel
Quantum Chemistry Laboratory
University of Warsaw
http://tiger.chem.uw.edu.pl/staff/lrajchel/






