[molpro-user] Molpro2009.1 parallel mode problem

Radoslaw Stachowski radoslaw.stachowski at pwr.wroc.pl
Mon Jul 12 14:20:17 BST 2010


Hello

I cannot run Molpro 2009.1 in parallel mode.

Compilation succeeds and the build logs show nothing suspicious.

My CONFIG file:
CONFIGURE_OPTIONS="-x86_64" "-icc" "-ifort" "-mpp"
"-mppbase" "/usr/mpi/intel/mvapich2-1.0.3/include/"
"-var" "INSTBIN=/home/stach008/molpro2009"
"-var" "INSTLIB=/home/stach008/molpro2009"
"-var" "INSTHTML=/home/stach008/molpro2009"


BLASLIB=-L/opt/intel/Compiler/11.1/064/mkl/lib/em64t -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core
CC=/opt/intel/Compiler/11.1/064/bin/intel64/icc
CCVERSION=11.1
CC_FRONT=
CDEBUG=-g
CDEFINE=-D
CFLAGS=-ftz -fPIC -vec-report0 -DMOLPROC_PAR -DINT64 -DZLIB
CLEARSPEEDLIB=
CMPPINCLUDE=/usr/mpi/intel/mvapich2-1.0.3/include
CSFLAGS=-O3 -I. --dynamic
FC=/opt/intel/Compiler/11.1/064/bin/intel64/ifort
FCVERSION=11.1
FTCFLAGS=molpro unix unix-i8 Linux lapack sf eaf mpi-io mpi2 mpp blas1 blas2 blas3
F_OPT0=
F_OPT1=nevpt2_optrpc.f explicit_util.f artwo2.f drv2el_l3ext_lmp2g.f drv2el_l3ext_cen.f rmp2_f12_drv2.f90 ri_lmp2g.f df_llmp2.f
F_OPT2=integrals.f90 RL4gen1.f basis_integral_shells.f
LAUNCHER=/usr/mpi/intel/mvapich2-1.0.3/bin/mpiexec -machinefile %h -np %n %x
LD_ENV=/opt/intel/Compiler/11.1/064/lib/intel64:/opt/intel/Compiler/11.1/064/mkl/lib/em64t
LD_ENVNAME=LD_LIBRARY_PATH
MPILIB=-i-dynamic -I/usr/mpi/intel/mvapich2-1.0.3/include -L/usr/lib64 -L/usr/mpi/intel/mvapich2-1.0.3/lib -Wl,-rpath -Wl,/usr/mpi/intel/mvapich2-1.0.3/lib -lmpich -L/usr/lib64 -lrdmacm -libverbs -libumad -lpthread -lrt
MPPLIB=
OBJECT_SUFFIX=o
OWNERPOS=3
PAPER=a4paper
PARSE=parse-Linux-x86_64-i8.o
PATCHER=patcher.exe
PDFLATEX=
PNAME=molprop_2009_1_Linux_x86_64_i8
PTSIZE=11
RANLIB=ranlib
RM=rm -rf
SHAREDLIBFLAGS=-shared --whole-archive ../lib/libmolpro.a --no-whole-archive -rpath-link . -L/opt/intel/Compiler/11.1/064/lib/intel64 -lsvml
SHAREDLIBSUFFIX=so
SHELL=/bin/sh

.SUFFIXES:
MAKEFLAGS+=-r
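
For readers following the LAUNCHER line in the CONFIG above, here is a minimal sketch of how such a template could expand into the command that is actually run. It assumes that %h stands for the host file, %n for the number of processes and %x for the executable with its arguments; that reading is an assumption here and should be checked against the Molpro installation manual. The function name expand_launcher and the argument values are hypothetical.

```sh
#!/bin/sh
# Sketch only: expand a Molpro-style LAUNCHER template.
# Assumed placeholder meanings (not confirmed by the post):
#   %h = host file, %n = process count, %x = executable plus arguments.
expand_launcher() {
    # $1 template, $2 host file, $3 process count, $4 executable and arguments
    printf '%s\n' "$1" | sed -e "s|%h|$2|g" -e "s|%n|$3|g" -e "s|%x|$4|g"
}

# Hypothetical values: host file "mpd_host", 4 processes.
expand_launcher \
    '/usr/mpi/intel/mvapich2-1.0.3/bin/mpiexec -machinefile %h -np %n %x' \
    mpd_host 4 './molprop_2009_1_Linux_x86_64_i8.exe 1.inp'
```

Printing the expanded command this way makes it easier to see that the launcher is the MVAPICH2 mpiexec, which in this MPI version expects an mpd ring to be running already.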
                                                                                                                                                                                                                                


For testing I use PBS in interactive mode:
$ qsub -I -l mem=7200mb -l nodes=2L:ppn=2
After typing ./molprop_2009_1_Linux_x86_64_i8 ~/1.inp I get:

"mpiexec_nova: cannot connect to local mpd (/tmp/mpd2.console_stach008); possible causes:
  1. no mpd is running on this host
  2. an mpd is running but was started without a "console" (-n option)
In case 1, you can start an mpd on this host with:
    mpd &
and you will be able to run jobs just on this host.
For more details on starting mpds on a set of hosts, see
the MPICH2 Installation Guide."

So I start the mpd daemons manually (I requested 2 nodes with 2 processors per node from PBS):
/usr/mpi/intel/mvapich2-1.0.3/bin/mpdboot -n 2 -f mpd_host
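
The manual mpd start-up can be sketched as a small script. This is only an illustration under stated assumptions: it assumes the job runs under PBS/Torque, where $PBS_NODEFILE lists each host once per requested processor, and that mpdboot and mpdtrace live in the MVAPICH2 bin directory named in the CONFIG. The helper name make_mpd_hosts is hypothetical.

```sh
#!/bin/sh
# Sketch only: build an mpd host file from the PBS node list and boot one mpd per node.
MPI_BIN=/usr/mpi/intel/mvapich2-1.0.3/bin   # path taken from the CONFIG file

# Write each allocated host once to file $2 and print the number of distinct hosts.
make_mpd_hosts() {
    sort -u "$1" > "$2"
    wc -l < "$2" | tr -d ' '
}

# Only act inside a PBS job and when the mpd tools are installed.
if [ -n "${PBS_NODEFILE:-}" ] && [ -x "$MPI_BIN/mpdboot" ]; then
    nhosts=$(make_mpd_hosts "$PBS_NODEFILE" mpd_host)
    "$MPI_BIN/mpdboot" -n "$nhosts" -f mpd_host   # one mpd per distinct host
    "$MPI_BIN/mpdtrace"                           # list the hosts now in the ring
fi
```

With 2 nodes and 2 processors per node the node file has four lines but only two distinct hosts, which matches the -n 2 used above.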

I then run ./molprop_2009_1_Linux_x86_64_i8 ~/1.inp again (1.inp is just a test input file) and get this error:
molprop_2009_1_Linux_x86_64_i8.exe: rdma_iba_1sc.c:472: MPIDI_CH3I_RDMA_win_create: Assertion `MPIDI_CH3I_RDMA_Process.current_win_num <= (16)' failed.
molprop_2009_1_Linux_x86_64_i8.exe: rdma_iba_1sc.c:472: MPIDI_CH3I_RDMA_win_create: Assertion `MPIDI_CH3I_RDMA_Process.current_win_num <= (16)' failed.
molprop_2009_1_Linux_x86_64_i8.exe: rdma_iba_1sc.c:472: MPIDI_CH3I_RDMA_win_create: Assertion `MPIDI_CH3I_RDMA_Process.current_win_num <= (16)' failed.
molprop_2009_1_Linux_x86_64_i8.exe: rdma_iba_1sc.c:472: MPIDI_CH3I_RDMA_win_create: Assertion `MPIDI_CH3I_RDMA_Process.current_win_num <= (16)' failed.
forrtl: error (76): Abort trap signal
Image              PC                Routine            Line        Source
libc.so.6          000000349542E21D  Unknown               Unknown  Unknown
libc.so.6          000000349542FA1E  Unknown               Unknown  Unknown
libc.so.6          0000003495427AE1  Unknown               Unknown  Unknown
libmpich.so        00002AE1422B60B7  Unknown               Unknown  Unknown
libmpich.so        00002AE14224AB83  Unknown               Unknown  Unknown
libmpich.so        00002AE142299DD8  Unknown               Unknown  Unknown
libmpich.so        00002AE1422F1B92  Unknown               Unknown  Unknown
molprop_2009_1_Li  00000000021444B6  Unknown               Unknown  Unknown
molprop_2009_1_Li  0000000002143DE2  Unknown               Unknown  Unknown
molprop_2009_1_Li  0000000002143BB6  Unknown               Unknown  Unknown
molprop_2009_1_Li  0000000001258EBB  Unknown               Unknown  Unknown
molprop_2009_1_Li  00000000012548BF  Unknown               Unknown  Unknown
molprop_2009_1_Li  00000000004C3219  Unknown               Unknown  Unknown
molprop_2009_1_Li  00000000004BDF65  Unknown               Unknown  Unknown
molprop_2009_1_Li  000000000043D56C  Unknown               Unknown  Unknown
libc.so.6          000000349541C3FB  Unknown               Unknown  Unknown
molprop_2009_1_Li  000000000043D49A  Unknown               Unknown  Unknown
forrtl: error (76): Abort trap signal
Image              PC                Routine            Line        Source
libc.so.6          000000349542E21D  Unknown               Unknown  Unknown
libc.so.6          000000349542FA1E  Unknown               Unknown  Unknown
libc.so.6          0000003495427AE1  Unknown               Unknown  Unknown
libmpich.so        00002AF52EBE00B7  Unknown               Unknown  Unknown
libmpich.so        00002AF52EB74B83  Unknown               Unknown  Unknown
libmpich.so        00002AF52EBC3DD8  Unknown               Unknown  Unknown
libmpich.so        00002AF52EC1BB92  Unknown               Unknown  Unknown
molprop_2009_1_Li  00000000021444B6  Unknown               Unknown  Unknown
molprop_2009_1_Li  0000000002143DE2  Unknown               Unknown  Unknown
molprop_2009_1_Li  0000000002143BB6  Unknown               Unknown  Unknown
molprop_2009_1_Li  0000000001258EBB  Unknown               Unknown  Unknown
molprop_2009_1_Li  00000000012548BF  Unknown               Unknown  Unknown
molprop_2009_1_Li  00000000004C3219  Unknown               Unknown  Unknown
molprop_2009_1_Li  00000000004BDF65  Unknown               Unknown  Unknown
molprop_2009_1_Li  000000000043D56C  Unknown               Unknown  Unknown
libc.so.6          000000349541C3FB  Unknown               Unknown  Unknown
molprop_2009_1_Li  000000000043D49A  Unknown               Unknown  Unknown
forrtl: error (76): Abort trap signal
Image              PC                Routine            Line        Source
libc.so.6          000000349542E21D  Unknown               Unknown  Unknown
libc.so.6          000000349542FA1E  Unknown               Unknown  Unknown
libc.so.6          0000003495427AE1  Unknown               Unknown  Unknown
libmpich.so        00002B5261EBA0B7  Unknown               Unknown  Unknown
libmpich.so        00002B5261E4EB83  Unknown               Unknown  Unknown
libmpich.so        00002B5261E9DDD8  Unknown               Unknown  Unknown
libmpich.so        00002B5261EF5B92  Unknown               Unknown  Unknown
molprop_2009_1_Li  0000000002144499  Unknown               Unknown  Unknown
molprop_2009_1_Li  0000000002143DE2  Unknown               Unknown  Unknown
molprop_2009_1_Li  0000000002143BB6  Unknown               Unknown  Unknown
molprop_2009_1_Li  0000000001258EBB  Unknown               Unknown  Unknown
molprop_2009_1_Li  00000000012548BF  Unknown               Unknown  Unknown
 
Any ideas?
I would appreciate any help.

Cheers,

Radosław Stachowski


