[molpro-user] VMP2, VCI calculations terminated with 'Segmentation Violation error', using downloaded MOLPRO 2010.1 binary

Andy May MayAJ1 at cardiff.ac.uk
Mon Aug 8 13:36:51 BST 2011


Gang,

I've managed to reproduce the problem you mention and am testing a fix 
for this.

Best wishes,

Andy

On 05/08/11 13:30, Gang Li wrote:
> Dear Molpro Users,
>
> I recently downloaded the new 2010.1 binary version. However, quick tests
> using the VMP2 sample input in the Molpro 2010.1 Manual crash every time
> with the following error in the terminal.  A similar error message always
> shows up with the VCI sample input and the CBS frequency input in the manual.
> The OS is Ubuntu 10.04 LTS, and the 2008 version I was using worked
> smoothly.  Does anyone know the reason?  Thanks!
>
>
> Creating: host=virgo, user=root,
>             file=/usr/local/bin/molprop_2010_1_Linux_x86_64_i8.exe,
> port=53336
> 3:Segmentation Violation error, status=: 11
> (rank:3 hostname:virgo pid:21797):ARMCI DASSERT fail.
> src/signaltrap.c:SigSegvHandler():312 cond:0
> 1:Segmentation Violation error, status=: 11
> (rank:1 hostname:virgo pid:21795):ARMCI DASSERT fail.
> src/signaltrap.c:SigSegvHandler():312 cond:0
> Last System Error Message from Task 3:: Bad file descriptor
> Last System Error Message from Task 1:: Bad file descriptor
>    3: ARMCI aborting 11 (0xb).
>    3: ARMCI aborting 11 (0xb).
>    1: ARMCI aborting 11 (0xb).
>    1: ARMCI aborting 11 (0xb).
> system error message: Bad file descriptor
> 0:Segmentation Violation error, status=: 11
> 2:Segmentation Violation error, status=: 11
> system error message: Bad file descriptor
> (rank:2 hostname:virgo pid:21796):ARMCI DASSERT fail.
> src/signaltrap.c:SigSegvHandler():312 cond:0
> Last System Error Message from Task 2:: Bad file descriptor
> (rank:0 hostname:virgo pid:21794):ARMCI DASSERT fail.
> src/signaltrap.c:SigSegvHandler():312 cond:0
>    2: ARMCI aborting 11 (0xb).
>    2: ARMCI aborting 11 (0xb).
> system error message: Bad file descriptor
> Last System Error Message from Task 0:: Inappropriate ioctl for device
>    0: ARMCI aborting 11 (0xb).
>    0: ARMCI aborting 11 (0xb).
> system error message: Inappropriate ioctl for device
>    4: interrupt(1)
> WaitAll: Child (21794) finished, status=0x100 (exited with code 1).
> WaitAll: Child (21797) finished, status=0x100 (exited with code 1).
> WaitAll: Child (21795) finished, status=0x100 (exited with code 1).
> WaitAll: No children or error in wait?
>
>
>
> For the VMP2 input, the calculation reached the SURF step and then
> stopped responding after the 2D coupling potential:
>
>
> PROGRAM * SURF (Surface generation)   Authors: G. Rauhut 2004, T. Hrenar 2005
> ...
>   Harmonic and 1D-anharmonic vibrational frequencies
>
>     Mode     Harmonic     Diagonal     Intens
>
>     3 B2      3970.77      4063.06      33.06
>     2 A1      3851.51      3765.54       5.35
>     1 A1      1677.46      1663.36      59.75
>
>   Calculating 2D coupling potential
>
>   2D:   3   2  Points:  18 Conv: 0.04825417 0.04870899
>   2D:   3   1  Points:  18 Conv: 0.00686833 0.00693136
>   2D:   2   1  Points:  36 Conv: 0.00654850 0.00663013
>
>   Calculating 3D coupling potential
>   [the output stops here]
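>
> For reference, the input I am running is essentially the water VMP2 example
> from the manual, roughly of the following form (a sketch from memory, not the
> literal manual input; directive and option names such as start1D should be
> checked against the 2010.1 manual):
>
>    ***,h2o vmp2 test (sketch)
>    memory,20,m
>    orient,mass               ! mass-centred orientation for the normal coordinates
>    geometry={
>      o
>      h1,o,roh
>      h2,o,roh,h1,theta
>    }
>    roh=0.96 ang
>    theta=104.0 degree
>    basis=vdz
>    hf                        ! Hartree-Fock reference
>    mp2                       ! electronic MP2
>    optg                      ! optimise the geometry
>    {frequencies,symm=auto}   ! harmonic frequencies / normal coordinates
>    label1                    ! label referenced by start1D below
>    {surf,start1D=label1}     ! potential energy surface; the run dies in this step
>    vscf                      ! vibrational SCF
>    vmp2                      ! vibrational MP2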
>
>
> Best wishes,
> Gang
>
>
>
>
>
> -----Original Message-----
> From: molpro-user-bounces at molpro.net [mailto:molpro-user-bounces at molpro.net]
> On Behalf Of molpro-user-request at molpro.net
> Sent: 05 August 2011 12:00
> To: molpro-user at molpro.net
> Subject: Molpro-user Digest, Vol 37, Issue 2
>
>
> Message: 1
> Date: Thu, 04 Aug 2011 21:21:11 +0100
> From: Andy May<MayAJ1 at cardiff.ac.uk>
> To: Gregory Magoon<gmagoon at MIT.EDU>
> Cc: molpro-user at molpro.net
> Subject: Re: [molpro-user] Permanent installation of dependencies
> Message-ID:<4E3AFF37.8050603 at cardiff.ac.uk>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Greg,
>
> Thanks for the info; perhaps it was the auto-build with openmpi which could
> not be relocated in this way. We have a bug open about the problem and
> I'm working on a solution.
>
> mpptune.com can be run with, for example:
>
> ./bin/molpro -n 2 mpptune.com
>
> and should add parameters to lib/tuning.rc when complete, but will not
> add parameters if only 1 process is used.
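>
> A quick way to verify that the run updated the file (paths relative to the
> installation directory; purely illustrative):
>
>    ./bin/molpro -n 2 mpptune.com     # at least two processes, as noted above
>    grep -c . lib/tuning.rc           # non-zero once parameters have been added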
>
> I've attempted to make modifications to close the libmol.index file when
> not in use. The change is available in the nightly build, which you can
> download from the website. Please let us know if this makes a
> difference.
>
> Best wishes,
>
> Andy
>
> On 25/07/11 16:13, Gregory Magoon wrote:
>> I was able to get it to work (I think) for -auto-ga-tcgmsg-mpich2 by
>> copying
>> /src/mpich2-install to a permanent location and editing the LAUNCHER
>> variable
>> in the script file.
>>
>> A couple more quick questions:
>> -Is use of mpptune.com recommended or is this deprecated? I think I was
>> able to figure out how to run it, but I wasn't sure whether this is
>> helpful or not, performance-wise.
>> -I noticed that a text file called libmol.index is kept open by molpro
>> during execution (and possibly read from (?))... I'm thinking this may be
>> adversely affecting performance when the file happens to be on an NFS
>> file system. If my suspicions are correct, is there any way the file can
>> be cached by reading into memory?
>>
>> Thanks,
>> Greg
>>
>> Quoting Andy May<MayAJ1 at cardiff.ac.uk>:
>>
>>> Greg,
>>>
>>> Glad to hear you got Molpro running. At the moment the auto-build
>>> options are marked as experimental; however, the only known issue is
>>> the installation problem you mention. I'm not sure one can simply move
>>> mpich2/openmpi installation directories, but we will try to implement
>>> a solution.
>>>
>>> Best wishes,
>>>
>>> Andy
>>>
>>> On 24/07/11 04:12, Gregory Magoon wrote:
>>>> After some work I was finally able to trace this to some sort of
>>>> issue between
>>>> NFSv4 and MPICH2; I can get this to work properly when I mount the
>>>> NFS drives
>>>> as NFSv3 (as opposed to NFSv4), so the issue is now more-or-less
>>>> resolved.
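>>>>
>>>> For example, forcing an NFSv3 mount on the compute nodes (host name and
>>>> paths are illustrative):
>>>>
>>>>    mount -t nfs -o vers=3 headnode:/usr/local /usr/local
>>>>    # equivalent /etc/fstab entry:
>>>>    # headnode:/usr/local  /usr/local  nfs  ro,vers=3  0 0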
>>>>
>>>> A quick follow-up question: Is there a recommended approach for
>>>> permanent
>>>> installation of the mpich2 dependency (and maybe also GA?) when using
>>>> the auto
>>>> build approach? By default, it seems that the installation scripts
>>>> leave the
>>>> mpiexec in the compile directory. I saw that the
>>>> makefile/installation scripts
>>>> mention an option called REDIST which seemed like it might allow this,
>>>> but they
>>>> don't seem to make use of this option (REDIST=NO).
>>>>
>>>> Thanks,
>>>> Greg
>>>>
>>>> Quoting Gregory Magoon<gmagoon at MIT.EDU>:
>>>>
>>>>> Hi,
>>>>> I have successfully compiled molpro (with Global Arrays/TCGMSG;
>>>>> mpich2 from
>>>>> Ubuntu package) on one of our compute nodes for our new server, and
>>>>> installed
>>>>> it in an NFS directory on our head node. The initial tests on the
>>>>> compute node
>>>>> ran fine but since the installation, I've had issues with running
>>>>> molpro on the
>>>>> compute nodes (it seems to work fine on the head node). Sometimes
>>>>> (sorry I can't
>>>>> be more precise, but it does not seem to be reproducible), when
>>>>> running on the
>>>>> compute node, the job will get stuck in the early stages, producing a
>>>>> lot (~14+
>>>>> Mbps outbound to headnode and 7Mbps inbound from headnode) of NFS
>>>>> traffic and
>>>>> causing fairly high nfsd process CPU% usage on the head node. Molpro
>>>>> processes
>>>>> in the stuck state are shown in "top" command display at the bottom
>>>>> of the
>>>>> e-mail. I have also attached example verbose output for a case that
>>>>> works and a
>>>>> case that gets stuck.
>>>>>
>>>>> Some notes:
>>>>> -/usr/local is mounted as NFS read-only file system; /home is mounted
>>>>> as NFS rw
>>>>> file system
>>>>> -It seems like runs with fewer processors (e.g. 6) are more likely
>>>>> to run
>>>>> successfully
>>>>>
>>>>> I've tried several approaches for addressing the issue, including 1.
>>>>> Mounting
>>>>> /usr/local as rw file system, and 2. Changing the rsize and wsize
>>>>> parameters
>>>>> for the NFS filesystem, but none seem to work. We also tried piping
>>>>> < /dev/null
>>>>> when calling the process, which seemed like it was helping at first,
>>>>> but later
>>>>> tests suggested that this wasn't actually helping.
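>>>>>
>>>>> For concreteness, those attempts were along these lines (mount options,
>>>>> paths and process counts are illustrative):
>>>>>
>>>>>    # 1. remount /usr/local read-write on the compute node
>>>>>    mount -o remount,rw /usr/local
>>>>>    # 2. larger NFS read/write buffer sizes on the compute-node mount
>>>>>    mount -t nfs -o rw,rsize=32768,wsize=32768 headnode:/usr/local /usr/local
>>>>>    # stdin redirected from /dev/null when launching the job
>>>>>    molpro -n 12 input.com < /dev/null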
>>>>>
>>>>> If anyone has any tips or ideas to help diagnose the issue here, it
>>>>> would be
>>>>> greatly appreciated. If there are any additional details I can
>>>>> provide to help
>>>>> describe the problem, I'd be happy to provide them.
>>>>>
>>>>> Thanks very much,
>>>>> Greg
>>>>>
>>>>> Top processes in "top" output in stuck state:
>>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>>>> 10 root 20 0 0 0 0 S 10 0.0 0:16.50 kworker/0:1
>>>>> 2 root 20 0 0 0 0 S 6 0.0 0:10.86 kthreadd
>>>>> 1496 root 20 0 0 0 0 S 1 0.0 0:04.73 kworker/0:2
>>>>> 3 root 20 0 0 0 0 S 1 0.0 0:00.93 ksoftirqd/0
>>>>>
>>>>> Processes in "top" output for user in stuck state:
>>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>>>> 29961 user 20 0 19452 1508 1072 R 0 0.0 0:00.05 top
>>>>> 1176 user 20 0 91708 1824 868 S 0 0.0 0:00.01 sshd
>>>>> 1177 user 20 0 24980 7620 1660 S 0 0.0 0:00.41 bash
>>>>> 1289 user 20 0 91708 1824 868 S 0 0.0 0:00.00 sshd
>>>>> 1290 user 20 0 24980 7600 1640 S 0 0.0 0:00.32 bash
>>>>> 1386 user 20 0 4220 664 524 S 0 0.0 0:00.01 molpro
>>>>> 1481 user 20 0 18764 1196 900 S 0 0.0 0:00.00 mpiexec
>>>>> 1482 user 20 0 18828 1092 820 S 0 0.0 0:00.00 hydra_pmi_proxy
>>>>> 1483 user 20 0 18860 488 212 D 0 0.0 0:00.00 hydra_pmi_proxy
>>>>> 1484 user 20 0 18860 488 212 D 0 0.0 0:00.00 hydra_pmi_proxy
>>>>> 1485 user 20 0 18860 488 212 D 0 0.0 0:00.00 hydra_pmi_proxy
>>>>> 1486 user 20 0 18860 488 212 D 0 0.0 0:00.00 hydra_pmi_proxy
>>>>> 1487 user 20 0 18860 488 212 D 0 0.0 0:00.00 hydra_pmi_proxy
>>>>> 1488 user 20 0 18860 488 212 D 0 0.0 0:00.00 hydra_pmi_proxy
>>>>> 1489 user 20 0 18860 488 212 D 0 0.0 0:00.00 hydra_pmi_proxy
>>>>> 1490 user 20 0 18860 488 208 D 0 0.0 0:00.00 hydra_pmi_proxy
>>>>> 1491 user 20 0 18860 488 208 D 0 0.0 0:00.00 hydra_pmi_proxy
>>>>> 1492 user 20 0 18860 488 208 D 0 0.0 0:00.00 hydra_pmi_proxy
>>>>> 1493 user 20 0 18860 488 208 D 0 0.0 0:00.00 hydra_pmi_proxy
>>>>> 1494 user 20 0 18860 492 212 D 0 0.0 0:00.00 hydra_pmi_proxy
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>
>


