[molpro-user] gmainv1 failure to allocate - what to do?

Grigory Shamov Grigory.Shamov at umanitoba.ca
Fri Jan 31 19:08:52 GMT 2014


Dear Manhui,

Thank you for the reply. We have 48GB or 96GB RAM nodes, always with a 47GB swap partition. Memory is allocated by Torque according to the user's request. If a user takes more memory in MolPro than was requested from the batch queuing system, the job gets terminated automatically (that would show up as a different kind of error, since the job receives SIGTERM and the event is logged for us).

The particular job ran on four 48GB nodes with four processes each (3 compute and 1 data server, as it was the SF version) and requested 98 GB of RAM. That is not enough, but the job never reached the point of being killed for using too much; it simply failed to allocate, right?

SHMMAX is a per-node value for SystemV shared memory. Some old versions of GA used to depend on it. I would have thought that it is irrelevant when using SF, unless MolPro also uses SystemV shared memory internally.

On the other hand, we do not allow virtual memory overcommit. Could MolPro be trying to allocate a large amount of virtual memory, much larger than its resident memory?
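One way to check this on a node is to compare the size of the failed allocation with the kernel's commit headroom. The sketch below is only an illustration, assuming a Linux node in strict overcommit mode (vm.overcommit_memory = 2, which this thread does not confirm). CommitLimit and Committed_AS are standard /proc/meminfo fields; the sample values and the function names are hypothetical. The request size is the 4600230002-word figure from the error message, at 8 bytes per word.

```python
WORD_BYTES = 8  # one Molpro word is 8 bytes

# Hypothetical excerpt of /proc/meminfo; on a real node, read the file instead.
SAMPLE_MEMINFO = """\
MemTotal:       49000000 kB
SwapTotal:      47000000 kB
CommitLimit:    70000000 kB
Committed_AS:    2000000 kB
"""

def parse_meminfo(text):
    """Return a dict mapping each kB-valued /proc/meminfo field to bytes."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if len(parts) == 2 and parts[1] == "kB":
            fields[name.strip()] = int(parts[0]) * 1024
    return fields

def commit_headroom(meminfo):
    """Bytes that can still be committed before allocations start to fail."""
    return meminfo["CommitLimit"] - meminfo["Committed_AS"]

def fits(request_bytes, meminfo):
    """True if a request of this size stays within the commit headroom."""
    return request_bytes <= commit_headroom(meminfo)

per_process = 4600230002 * WORD_BYTES  # size from the gmainv1 message
info = parse_meminfo(SAMPLE_MEMINFO)
print("one process fits:", fits(per_process, info))
print("three compute processes fit:", fits(3 * per_process, info))
```

On an actual node, `parse_meminfo(open("/proc/meminfo").read())` would replace the sample text. If strict overcommit is in effect, an allocation can fail outright even though the job never touches that much resident memory.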

--
Grigory Shamov
HPC Analyst, Westgrid/Compute Canada
E2-588 EITC Building, University of Manitoba
(204) 474-9625



From: Manhui Wang <wangm9 at cardiff.ac.uk>
Date: Friday, 31 January, 2014 12:03 PM
To: Grigory Shamov <Grigory.Shamov at umanitoba.ca>
Cc: molpro-user at molpro.net
Subject: Re: [molpro-user] gmainv1 failure to allocate - what to do?

Dear Grigory,

On 31/01/14 15:07, Grigory Shamov wrote:

Dear MolPro users,


I have compiled MolPro with Intel 12 and auto-built MVAPICH2. (I had to
substitute MVAPICH2-2.0b for the unavailable tarball of MVAPICH2-1.9 in
the process, and also to decrease optimization to -O2. I attach my
CONFIG; I understand it picked I8 array indexes, right?)

When our user ran it on a large task, with -m 1000m he got the following
error:

Multipassing necessary in transformation. To avoid, increase memory by
3561.33 Mwords.
 ? Error
 ? 2-ext paging plus 3-ext ints not yet working (kintb)!
 ? The problem occurs in cckint

Then, I told him to increase the memory, but with -m 4600m it fails in a
different way:


gmainv1 failure to allocate 4600230002
gmainv1 failure to allocate 4600230002
gmainv1 failure to allocate 4600230002
gmainv1 failure to allocate 4600230002

How much memory does the machine have? If you run a Molpro job with
-m 4600m on 4 processes, you are actually requesting 4600*1000000 words (1 word = 8 bytes) per process.
The total requested memory is about 137 GiB (4600*1000000 * 8 * 4 bytes, i.e. 147.2 GB in decimal units).
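The arithmetic above can be reproduced with a short sketch. It assumes only what is stated here (the "m" suffix means 10^6 words, and one word is 8 bytes); the function name is hypothetical.

```python
WORD_BYTES = 8  # 1 word = 8 bytes, as stated above

def molpro_memory_bytes(mwords, processes):
    """Total bytes requested when each process is given `mwords` megawords."""
    return mwords * 1_000_000 * WORD_BYTES * processes

total = molpro_memory_bytes(4600, 4)
print(f"{total / 1e9:.1f} GB (decimal), {total / 2**30:.1f} GiB (binary)")
```

The same request comes out as 147.2 in decimal gigabytes and roughly 137 in binary gibibytes, so it matters which unit the batch system's limit is expressed in.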

I've googled this problem, and have seen some answers related to the SHMMAX
value. On our cluster we have it increased, though:

cat /proc/sys/kernel/shmmax
68719476736

This is around 64GB.

Could you please check if the requested memory exceeds the hard memory limit on the machine?


Best wishes,
Manhui


Could you please suggest how one can avoid this problem? Also, I used
the default limits when compiling (number of atoms etc.); if that is
the reason, which limits should be increased? Thank you very much!






_______________________________________________
Molpro-user mailing list
Molpro-user at molpro.net
http://www.molpro.net/mailman/listinfo/molpro-user


--
-----------
Manhui  Wang
School of Chemistry, Cardiff University,
Main Building, Park Place,
Cardiff CF10 3AT, UK
