[molpro-user] Molpro2009.1 parallel mode problem

Manhui Wang wangm9 at cardiff.ac.uk
Mon Jul 12 15:18:44 BST 2010


Hi Radosław,

I noticed you are using a very old MVAPICH2 library (1.0.3), which contains
some bugs related to MPI windows and memory leaks. Please install the latest
MVAPICH2 1.5 and rebuild Molpro with it.
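
As a rough way to confirm which MVAPICH2 a build is tied to, the version can be read from the MPI paths in the CONFIG file (CMPPINCLUDE, LAUNCHER, MPILIB). The following is only a minimal sketch: the function names are hypothetical, and it assumes the version number is encoded in the directory name, as it is in the paths quoted below.

```python
import re


def mvapich2_version(path):
    """Return the version embedded in a path such as
    /usr/mpi/intel/mvapich2-1.0.3/include as a tuple of ints, or None.

    Assumption: the install directory is named mvapich2-<version>.
    """
    match = re.search(r"mvapich2-(\d+(?:\.\d+)*)", path)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_older_than(path, minimum=(1, 5)):
    """True if the MVAPICH2 version found in `path` is below `minimum`."""
    version = mvapich2_version(path)
    if version is None:
        raise ValueError("no mvapich2-<version> component in: " + path)
    # Tuples compare element by element, so (1, 0, 3) < (1, 5).
    return version < minimum


if __name__ == "__main__":
    launcher = "/usr/mpi/intel/mvapich2-1.0.3/bin/mpiexec"
    print(mvapich2_version(launcher), is_older_than(launcher))
    # prints: (1, 0, 3) True
```

If the check reports an older version after the rebuild, the CONFIG file is probably still pointing at the old installation.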

Best wishes,
Manhui

Radoslaw Stachowski wrote:
> Hello
> 
> The problem is I cannot run molpro in parallel mode.
> 
> Compilation is ok, there are no suspicious logs.
> 
> my CONFIG file:
> CONFIGURE_OPTIONS="-x86_64" "-icc" "-ifort" "-mpp"
> "-mppbase" "/usr/mpi/intel/mvapich2-1.0.3/include/"
> "-var" "INSTBIN=/home/stach008/molpro2009"
> "-var" "INSTLIB=/home/stach008/molpro2009"
> "-var" "INSTHTML=/home/stach008/molpro2009"
> 
> 
> BLASLIB=-L/opt/intel/Compiler/11.1/064/mkl/lib/em64t -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core
> CC=/opt/intel/Compiler/11.1/064/bin/intel64/icc
> CCVERSION=11.1
> CC_FRONT=
> CDEBUG=-g
> CDEFINE=-D
> CFLAGS=-ftz -fPIC -vec-report0 -DMOLPROC_PAR -DINT64 -DZLIB
> CLEARSPEEDLIB=
> CMPPINCLUDE=/usr/mpi/intel/mvapich2-1.0.3/include
> CSFLAGS=-O3 -I. --dynamic
> FC=/opt/intel/Compiler/11.1/064/bin/intel64/ifort
> FCVERSION=11.1
> FTCFLAGS=molpro unix unix-i8 Linux lapack sf eaf mpi-io mpi2 mpp blas1 blas2 blas3
> F_OPT0=
> F_OPT1=nevpt2_optrpc.f explicit_util.f artwo2.f drv2el_l3ext_lmp2g.f drv2el_l3ext_cen.f rmp2_f12_drv2.f90 ri_lmp2g.f df_llmp2.f
> F_OPT2=integrals.f90 RL4gen1.f basis_integral_shells.f
> LAUNCHER=/usr/mpi/intel/mvapich2-1.0.3/bin/mpiexec -machinefile %h -np %n %x
> LD_ENV=/opt/intel/Compiler/11.1/064/lib/intel64:/opt/intel/Compiler/11.1/064/mkl/lib/em64t
> LD_ENVNAME=LD_LIBRARY_PATH
> MPILIB=-i-dynamic -I/usr/mpi/intel/mvapich2-1.0.3/include -L/usr/lib64 -L/usr/mpi/intel/mvapich2-1.0.3/lib -Wl,-rpath -Wl,/usr/mpi/intel/mvapich2-1.0.3/lib -lmpich -L/usr/lib64 -lrdmacm -libverbs -libumad -lpthread -lrt
> MPPLIB=
> OBJECT_SUFFIX=o
> OWNERPOS=3
> PAPER=a4paper
> PARSE=parse-Linux-x86_64-i8.o
> PATCHER=patcher.exe
> PDFLATEX=
> PNAME=molprop_2009_1_Linux_x86_64_i8
> PTSIZE=11
> RANLIB=ranlib
> RM=rm -rf
> SHAREDLIBFLAGS=-shared --whole-archive ../lib/libmolpro.a --no-whole-archive -rpath-link . -L/opt/intel/Compiler/11.1/064/lib/intel64 -lsvml
> SHAREDLIBSUFFIX=so
> SHELL=/bin/sh
> 
> .SUFFIXES:
> MAKEFLAGS+=-r
> 
> I use PBS in interactive mode (for testing) [$ qsub -I -l mem=7200mb -l nodes=2:ppn=2].
> After typing ./molprop_2009_1_Linux_x86_64_i8 ~/1.inp I get:
> 
> "mpiexec_nova: cannot connect to local mpd (/tmp/mpd2.console_stach008); possible causes:
>   1. no mpd is running on this host
>   2. an mpd is running but was started without a "console" (-n option)
> In case 1, you can start an mpd on this host with:
>     mpd &
> and you will be able to run jobs just on this host.
> For more details on starting mpds on a set of hosts, see
> the MPICH2 Installation Guide."
> 
> So I start mpd daemons manually.
> /usr/mpi/intel/mvapich2-1.0.3/bin/mpdboot -n 2 -f mpd_host (I've requested 2 nodes and 2 processes per node from PBS)
> 
> and run ./molprop_2009_1_Linux_x86_64_i8 ~/1.inp (1.inp - just a test job input file)
> 
> and I get this error :
> molprop_2009_1_Linux_x86_64_i8.exe: rdma_iba_1sc.c:472: MPIDI_CH3I_RDMA_win_create: Assertion `MPIDI_CH3I_RDMA_Process.current_win_num <= (16)' failed.
> molprop_2009_1_Linux_x86_64_i8.exe: rdma_iba_1sc.c:472: MPIDI_CH3I_RDMA_win_create: Assertion `MPIDI_CH3I_RDMA_Process.current_win_num <= (16)' failed.
> molprop_2009_1_Linux_x86_64_i8.exe: rdma_iba_1sc.c:472: MPIDI_CH3I_RDMA_win_create: Assertion `MPIDI_CH3I_RDMA_Process.current_win_num <= (16)' failed.
> molprop_2009_1_Linux_x86_64_i8.exe: rdma_iba_1sc.c:472: MPIDI_CH3I_RDMA_win_create: Assertion `MPIDI_CH3I_RDMA_Process.current_win_num <= (16)' failed.
> forrtl: error (76): Abort trap signal
> Image              PC                Routine            Line        Source
> libc.so.6          000000349542E21D  Unknown               Unknown  Unknown
> libc.so.6          000000349542FA1E  Unknown               Unknown  Unknown
> libc.so.6          0000003495427AE1  Unknown               Unknown  Unknown
> libmpich.so        00002AE1422B60B7  Unknown               Unknown  Unknown
> libmpich.so        00002AE14224AB83  Unknown               Unknown  Unknown
> libmpich.so        00002AE142299DD8  Unknown               Unknown  Unknown
> libmpich.so        00002AE1422F1B92  Unknown               Unknown  Unknown
> molprop_2009_1_Li  00000000021444B6  Unknown               Unknown  Unknown
> molprop_2009_1_Li  0000000002143DE2  Unknown               Unknown  Unknown
> molprop_2009_1_Li  0000000002143BB6  Unknown               Unknown  Unknown
> molprop_2009_1_Li  0000000001258EBB  Unknown               Unknown  Unknown
> molprop_2009_1_Li  00000000012548BF  Unknown               Unkforrtl: error (76): Abort trap signal
> Image              PC                Routine            Line        Source
> libc.so.6          000000349542E21D  Unknown               Unknown  Unknown
> libc.so.6          000000349542FA1E  Unknown               Unknown  Unknown
> libc.so.6          0000003495427AE1  Unknown               Unknown  Unknown
> libmpich.so        00002AF52EBE00B7  Unknown               Unknown  Unknown
> libmpich.so        00002AF52EB74B83  Unknown               Unknown  Unknown
> libmpich.so        00002AF52EBC3DD8  Unknown               Unknown  Unknown
> libmpich.so        00002AF52EC1BB92  Unknown               Unknown  Unknown
> molprop_2009_1_Li  00000000021444B6  Unknown               Unknown  Unknown
> molprop_2009_1_Li  0000000002143DE2  Unknown               Unknown  Unknown
> molprop_2009_1_Li  0000000002143BB6  Unknown               Unknown  Unknown
> molprop_2009_1_Li  0000000001258EBB  Unknown               Unknown  Unknown
> molprop_2009_1_Li  00000000012548BF  Unknown               Unknown  Unknown
> molprop_2009_1_Li  00000000004C3219  Unknown               Unknown  Unknown
> molprop_2009_1_Li  00000000004BDF65  Unknown               Unknown  Unknown
> molprop_2009_1_Li  000000000043D56C  Unknown               Unknown  Unknown
> libc.so.6          000000349541C3FB  Unknown               Unknown  Unknown
> molprop_2009_1_Li  000000000043D49A  Unknown               Unknown  Unknown
> nown  Unknown
> molprop_2009_1_Li  00000000004C3219  Unknown               Unknown  Unknown
> molprop_2009_1_Li  00000000004BDF65  Unknown               Unknown  Unknown
> molprop_2009_1_Li  000000000043D56C  Unknown               Unknown  Unknown
> libc.so.6          000000349541C3FB  Unknown               Unknown  Unknown
> molprop_2009_1_Li  000000000043D49A  Unknown               Unknown  Unknown
> forrtl: error (76): Abort trap signal
> Image              PC                Routine            Line        Source
> libc.so.6          000000349542E21D  Unknown               Unknown  Unknown
> libc.so.6          000000349542FA1E  Unknown               Unknown  Unknown
> libc.so.6          0000003495427AE1  Unknown               Unknown  Unknown
> libmpich.so        00002B5261EBA0B7  Unknown               Unknown  Unknown
> libmpich.so        00002B5261E4EB83  Unknown               Unknown  Unknown
> libmpich.so        00002B5261E9DDD8  Unknown               Unknown  Unknown
> libmpich.so        00002B5261EF5B92  Unknown               Unknown  Unknown
> molprop_2009_1_Li  0000000002144499  Unknown               Unknown  Unknown
> molprop_2009_1_Li  0000000002143DE2  Unknown               Unknown  Unknown
> molprop_2009_1_Li  0000000002143BB6  Unknown               Unknown  Unknown
> molprop_2009_1_Li  0000000001258EBB  Unknown               Unknown  Unknown
> molprop_2009_1_Li  00000000012548BF  Unknown               Unknown  Unknown
>  
> Any ideas?
> I would appreciate any help.
> 
> Cheers,
> 
> Radosław Stachowski
> _______________________________________________
> Molpro-user mailing list
> Molpro-user at molpro.net
> http://www.molpro.net/mailman/listinfo/molpro-user

-- 
-----------
Manhui  Wang
School of Chemistry, Cardiff University,
Main Building, Park Place,
Cardiff CF10 3AT, UK
Telephone: +44 (0)29208 76637


