[molpro-user] Fwd: Problems with parallel build on AMD Opteron(tm) Processor 6376

Reuti reuti at staff.uni-marburg.de
Thu Jan 29 11:12:13 GMT 2015


Hi,

It looks like you are being hit by Open MPI's automatic core binding, which was introduced with the 1.7.x series.

Either disable Open MPI's automatic core binding or provide a disjoint list of cores for each job:

$ mpiexec --bind-to none ...
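
The same can also be set without touching the command line, via the corresponding MCA parameter of the hwloc framework (a sketch for the 1.7/1.8 series; the exact parameter name can be checked with `ompi_info --param hwloc all`):

$ mpiexec --mca hwloc_base_binding_policy none ...
$ export OMPI_MCA_hwloc_base_binding_policy=none    # same effect via the environment

If you prefer to keep binding but separate the two jobs, mpirun's --cpu-set option can restrict each job to its own cores (the core IDs below are only an example):

$ mpiexec --cpu-set 0-7  ...    # first 8-process job
$ mpiexec --cpu-set 8-15 ...    # second 8-process job on the same node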

This will not change any binding that was already set up by your queuing system, though. Note that the behavior differs in one respect: a queuing system will most often bind a set of processes to a set of cores, i.e. the OS scheduler can still shift the processes around within this set, whereas Open MPI binds each process to one unique core.

Also note that, starting with Open MPI 1.8.2, a more intensive network scan for possible routes between the machines is performed, which might (depending on your network setup) lead to a delay of around two minutes after `mpiexec` is issued before the application finally starts.
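
Whether it helps with this delay depends on your setup, but one knob often used in such situations is to tell Open MPI explicitly which interface to use instead of letting it probe all of them (eth0 below is only a placeholder for whatever interface your nodes actually use):

$ mpiexec --mca oob_tcp_if_include eth0 --mca btl_tcp_if_include eth0 ...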

It would be best to have a closer look at these issues outside of Molpro first.
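
A quick way to do this is to launch a dummy job and let Open MPI report where each rank lands; starting two such runs side by side on one node shows directly whether they pile onto the same cores (a minimal sketch, assuming --report-bindings is available in your 1.8.x installation):

$ mpiexec -np 8 --report-bindings sleep 30
$ mpiexec -np 8 --report-bindings --bind-to none sleep 30

Once the desired option is found, it could presumably also be appended to the mpirun command in the LAUNCHER line of your CONFIG (quoted below), so that Molpro passes it automatically.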

-- Reuti

PS: Open MPI 1.6.5 might also be worth testing, as these issues are not present there.
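
When testing different versions, it is also worth verifying which installation a job actually picks up, since several Open MPI builds on one cluster are easily mixed up:

$ which mpirun
$ mpirun --version
$ ompi_info | grep 'Open MPI:'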


> On 29.01.2015 at 11:26, molpro-user <molpro-user at molpro.net> wrote:
> 
> 
> 
> == == == == == == Forwarded message == == == == == == 
> From : Robert Polly<polly at kit.edu>
> To : <molpro-user at molpro.net>
> Date : Fri, 23 Jan 2015 12:50:51 +0000
> Subject : Problems with parallel build on AMD Opteron(tm) Processor 6376
> == == == == == == Forwarded message == == == == == == 
> 
> 
> Dear MOLPRO community, 
> we installed MOLPRO on our new Opteron cluster (64 CPUs per node) 
> 
> CPU: AMD Opteron(tm) Processor 6376 
> Compiler: ifort/icc 
> Openmpi: openmpi-1.8.3 
> Molpro: 2012.1.18 
> 
> CONFIG file: 
> 
> # MOLPRO CONFIG generated at Fri Jan 9 10:31:38 MET 2015, for host 
> master.hpc1.ine, SHA1=50d6e5f7071a51146f1443020887856fd3d38933 
> 
> CONFIGURE_OPTIONS="-icc" "-ifort" "-mpp" "-openmpi" "-mppbase" 
> "/home/polly/molpro/TEST/Molpro.2012.1.18.par/src/openmpi-install/include" 
> 
> AR=ar 
> ARCHNAME=Linux/x86_64 
> ARFLAGS=-rS 
> AWK=awk 
> BIBTEX= 
> BLASLIB=-L/pub/hpc/module/compilers/intel/xe2015/composer_xe_2015.0.090/mkl/lib/intel64 
> -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core 
> BUILD=p 
> CAT=cat 
> CC=/pub/hpc/module/compilers/intel/xe2015/composer_xe_2015.0.090/bin/intel64/icc 
> CCVERSION=15.0.0 
> CC_FRONT= 
> CDEBUG=-g $(addprefix $(CDEFINE),_DEBUG) 
> CDEFINE=-D 
> CFLAGS=-ftz 
> -I/home/polly/molpro/TEST/Molpro.2012.1.18.par/src/openmpi-install/include 
> CLDFLAGS= 
> CLEAN=echo 'target clean only available with git cloned versions, please 
> unpack the tarball again' 
> CMPPINCLUDE=/home/polly/molpro/TEST/Molpro.2012.1.18.par/src/openmpi-install/include 
> COPT=-O2 
> COPT0=-O0 
> COPT1=-O1 
> COPT2=-O2 
> COPT3=-O3 
> CP=cp -p 
> CPROFILE=-p 
> CUDACC= 
> CUDACCVERSION= 
> CUDACDEBUG=-g $(addprefix $(CUDACDEFINE),_DEBUG) 
> CUDACDEFINE=-D 
> CUDACFLAGS= 
> CUDACOPT= 
> CUDACOPT0=-O0 
> CUDACOPT1=-O1 
> CUDACOPT2=-O2 
> CUDACOPT3=-O3 
> CUDACPROFILE=-p 
> CXX=/pub/hpc/module/compilers/intel/xe2015/composer_xe_2015.0.090/bin/intel64/icpc 
> CXXFLAGS=$(CFLAGS) 
> DOXYGEN=/bin/doxygen 
> ECHO=/bin/echo 
> EXPORT=export 
> F90FLAGS=-stand f03 
> FC=/pub/hpc/module/compilers/intel/xe2015/composer_xe_2015.0.090/bin/intel64/ifort 
> FCVERSION=15.0.0 
> FDEBUG=-g $(addprefix $(FDEFINE),_DEBUG) 
> FDEFINE=-D 
> FFLAGS=-i8 -pc64 -auto -warn nousage -align array32byte -cxxlib 
> FLDFLAGS= 
> FOPT=-O3 
> FOPT0=-O0 
> FOPT1=-O1 
> FOPT2=-O2 
> FOPT3=-O3 
> FPROFILE=-p 
> FSTATIC= 
> HOSTFILE_FORMAT=%N 
> INSTALL_FILES_EXTRA=src/openmpi-install/bin/mpirun 
> src/openmpi-install/bin/orterun 
> INSTBIN= 
> INST_PL=0 
> INTEGER=8 
> LAPACKLIB= 
> LATEX2HTML= 
> LAUNCHER=/home/polly/molpro/TEST/Molpro.2012.1.18.par/src/openmpi-install/bin/mpirun 
> --mca mpi_warn_on_fork 0 -machinefile %h -np %n %x 
> LD_ENV=/pub/hpc/module/compilers/intel/xe2015/composer_xe_2015.0.090/compiler/lib/intel64:/pub/hpc/module/compilers/intel/xe2015/composer_xe_2015.0.090/mkl/lib/intel64 
> LD_ENVNAME=LD_LIBRARY_PATH 
> LIBRARY_SUFFIX=a 
> LIBS=-lpthread 
> /home/polly/molpro/TEST/Molpro.2012.1.18.par/src/boost-install/lib/libboost_system.a 
> /home/polly/molpro/TEST/Molpro.2012.1.18.par/src/boost-install/lib/libboost_thread.a 
> -lrt 
> LN=ln -s 
> MACROS=MOLPRO NDEBUG MOLPRO_f2003 MOLPRO_bug3990 MPI2 HAVE_BOOST_THREADS 
> HAVE_SSE2 _I8_ MOLPRO_INT=8 BLAS_INT=8 LAPACK_INT=8 MOLPRO_AIMS 
> MOLPRO_NECI _MOLCAS_MPP_ MOLPRO_BLAS MOLPRO_LAPACK 
> MAKEDEPEND_OPTIONS= 
> MAKEINDEX= 
> MAPLE= 
> MAX_INCREMENT_LIBRARY=0 
> MKDIR=mkdir -p 
> MODULE_FLAG=-I 
> MODULE_SUFFIX=mod 
> MPILIB=-I/home/polly/molpro/TEST/Molpro.2012.1.18.par/src/openmpi-install/lib 
> -Wl,-rpath 
> -Wl,/home/polly/molpro/TEST/Molpro.2012.1.18.par/src/openmpi-install/lib 
> -Wl,--enable-new-dtags -L/home/polly/molpro/TEST\ 
> /Molpro.2012.1.18.par/src/openmpi-install/lib -lmpi_usempif08 
> -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lm 
> -lpciaccess -ldl -lrt -losmcomp -libverbs -lrdmacm -lutil -lpsm_infinipath 
> MPPLIB= 
> OBJECT_SUFFIX=o 
> OPT0=B88.F copyc6.F 
> OPT1=explicit_util.F avcc.F koopro4.F dlaed4.F frequencies.F optg.F 
> OPT2=tstfnc.F dftgrid.F mrf12_singles.F90 basis_integral_shells.F 
> integrals.F90 geminal.F surface.F gcc.F90 
> OPT3= 
> PAPER=a4paper 
> PARSE=parse-Linux-x86_64-i8.o 
> PDFLATEX= 
> PNAME=molprop_2012_1_Linux_x86_64_i8 
> PREFIX=/usr/local/molpro/molprop_2012_1_Linux_x86_64_i8 
> PTSIZE=11 
> PYTHON=/bin/python 
> RANLIB=ranlib 
> RM=rm -rf 
> SHELL=/bin/sh 
> STRIP=strip 
> SUFFIXES=F F90 c cpp 
> TAR=tar -cf 
> UNTAR=tar -xf 
> VERBOSE=@ 
> VERSION=2012.1 
> XSD=/bin/xmllint --noout --schema 
> XSLT=/bin/xsltproc 
> YACC=bison -b y 
> 
> .SUFFIXES: 
> MAKEFLAGS+=-r 
> ifneq ($(LD_ENVNAME),) 
> $(LD_ENVNAME):=$(LD_ENV):$($(LD_ENVNAME)) 
> endif 
> 
> 
> We encounter the problem that when two MOLPRO jobs with 8 processes each run on one node, 
> the second job runs on the same 8 processors already allocated by 
> the first job, although there are still 56 free processors on the machine. 
> 
> Any suggestions on how to solve this problem? 
> 
> Best regards, 
> Robert 
> 
> -- 
> 
> ********************************************************************* 
> 
> Karlsruher Institut für Technologie (KIT) 
> Institut fuer Nukleare Entsorgung 
> 
> Dr. Robert Polly 
> 
> Quantenchemie 
> 
> Institut fuer Nukleare Entsorgung (INE), Campus Nord, Gebaeude 712, 
> Postfach 3640, 76021 Karlsruhe, Germany 
> 
> 0049-(0)721-608-24396 
> 
> email: polly at kit.edu 
> www: http://www.fzk.de/ine 
> 
> KIT - Universität des Landes Baden-Württemberg und 
> nationales Großforschungszentrum in der Helmholtz-Gemeinschaft 
> 
> ********************************************************************* 
> 
> 
> 
> 
> <CONFIG>
> _______________________________________________
> Molpro-user mailing list
> Molpro-user at molpro.net
> http://www.molpro.net/mailman/listinfo/molpro-user



