[molpro-user] MPI Error

Quandt, Robert rwquand@ilstu.edu
Wed Jul 31 18:23:02 BST 2013


Molpro users,
I recently upgraded from the 2012.1.0 binaries to the 2012.1.3 binaries and now get the error message shown below when I try to run a multi-node job (it works fine on one node). The job below was running fine before I killed it to do the upgrade, so it isn't an input problem. Has anyone run into this problem with 2012.1.3? Any ideas on how to fix it? Any help would be greatly appreciated.
Thanks in advance,
Bob

gimli /home/qg/calcs>molpro -n 12 -N qg:gimli:4,qg:legolas:4,qg:aragorn:4 dzpro.inp &
[2] 31782
gimli /home/qg/calcs>Fatal error in PMPI_Comm_dup: A process has failed, error stack:
PMPI_Comm_dup(175)...................: MPI_Comm_dup(MPI_COMM_WORLD, new_comm=0x1928e4c0) failed
PMPI_Comm_dup(160)...................:
MPIR_Comm_dup_impl(55)...............:
MPIR_Comm_copy(1552).................:
MPIR_Get_contextid(799)..............:
MPIR_Get_contextid_sparse_group(1064):
MPIR_Allreduce_impl(719).............:
MPIR_Allreduce_intra(201)............:
allreduce_intra_or_coll_fn(110)......:
MPIR_Allreduce_intra(539)............:
MPIDI_CH3U_Recvq_FDU_or_AEP(667).....: Communication error with rank 4
MPIR_Allreduce_intra(212)............:
MPIR_Bcast_impl(1369)................:
MPIR_Bcast_intra(1199)...............:
MPIR_Bcast_binomial(220).............: Failure during collective
Fatal error in PMPI_Comm_dup: A process has failed, error stack:
PMPI_Comm_dup(175)...................: MPI_Comm_dup(MPI_COMM_WORLD, new_comm=0x1928e4c0) failed
PMPI_Comm_dup(160)...................:
MPIR_Comm_dup_impl(55)...............:
MPIR_Comm_copy(1552).................:
MPIR_Get_contextid(799)..............:
MPIR_Get_contextid_sparse_group(1064):
MPIR_Allreduce_impl(719).............:
MPIR_Allreduce_intra(201)............:
allreduce_intra_or_coll_fn(110)......:
MPIR_Allreduce_intra(362)............:
dequeue_and_set_error(888)...........: Communication error with rank 4
MPIR_Allreduce_intra(212)............:
MPIR_Bcast_impl(1369)................:
MPIR_Bcast_intra(1199)...............:
MPIR_Bcast_binomial(220).............: Failure during collective

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 1
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
[proxy:0:0@gimli] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:886): assert (!closed) failed
[proxy:0:0@gimli] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:0@gimli] main (./pm/pmiserv/pmip.c:206): demux engine error waiting for event
[mpiexec@gimli] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:76): one of the processes terminated badly; aborting
[mpiexec@gimli] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
[mpiexec@gimli] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:217): launcher returned error waiting for completion
[mpiexec@gimli] main (./ui/mpich/mpiexec.c:331): process manager error waiting for completion

[2]    Exit 255                      molpro -n 12 -N qg:gimli:4,qg:legolas:4,qg:aragorn:4 dzpro.inp
gimli /home/qg/calcs>
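
In case it helps with diagnosis: the failing call in the trace is MPI_Comm_dup on MPI_COMM_WORLD, so a minimal MPI program run over the same three hosts should show whether the MPI layer itself is broken, independent of Molpro. A sketch follows; the file name and the -hosts invocation are just illustrative, and it should be built and launched with the same MPICH that the Molpro binaries use:

/* comm_dup_test.c (illustrative name): exercise the call that fails above.
 *
 * Build:  mpicc comm_dup_test.c -o comm_dup_test
 * Run:    mpiexec -n 12 -hosts gimli,legolas,aragorn ./comm_dup_test
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, namelen;
    char name[MPI_MAX_PROCESSOR_NAME];
    MPI_Comm dup;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &namelen);

    /* This is the collective that dies in the Molpro trace above. */
    MPI_Comm_dup(MPI_COMM_WORLD, &dup);

    printf("rank %d of %d on %s: MPI_Comm_dup OK\n", rank, size, name);

    MPI_Comm_free(&dup);
    MPI_Finalize();
    return 0;
}

If this test also aborts with a communication error, the problem presumably lies in the MPI/network setup between the nodes (firewall, hostname resolution, mismatched MPICH installations) rather than in the Molpro input or binaries.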

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Dr. Bob Quandt                             Office: 324 SLB
Department of Chemistry                    Phone: (309) 438-8576
Illinois State University                  Fax: (309) 438-5538
Normal, IL 61761-4160                      email: quandt@ilstu.edu

Nothing can be said so absurd that it has not been said by some philosopher.
- Marcus Tullius Cicero (106-43 BC)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
