[molpro-user] Benchmark timing on linux raid
knizia at theochem.uni-stuttgart.de
Wed Nov 24 08:16:26 GMT 2010
On Wednesday 24 November 2010 00:49, Jacek Klos wrote:
> I have noticed some strange behavior on our linux Dell machine.
> Machine is dual-6core 3.33GHz 24GB RAM, and RAID disk composed
> of 3 SAS 15k RPM drives making logical volume of about 1TB.
> When I run small_normal_ccsd benchmarks with 12 cpus using parallel
> molpro (GA4.2 Open-MPI) and I repeat it let's say 20 times after the fresh
> reboot of the machine the fastest elapsed time is 76 seconds.
> The range of elapsed times is roughly +/- 5 sec.
> But when I run again the same series of 12cpus jobs after machine is
> on for a day or so I get elapsed timings like 30 seconds longer:
This may be related to the 'system cache', i.e., the
amount of free physical RAM not committed to any other running processes.
Operating systems tend to use all of that memory for disk caching, and since
these CCSD jobs are still rather small and you have lots of memory, maybe
after a fresh reboot the OS never actually writes the integral data to disk
in the first place (or never reads it back). AOINT, AOSORT, HF and TRANSFORM
usually all depend heavily on IO performance, so if for some reason there
is lots of spare RAM for disk caching, these steps would become much faster.
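As a quick way to see this effect yourself (file path and size here are only illustrative, not related to your actual Molpro scratch files), you could time two consecutive reads of a scratch file; the second read is normally served from the page cache and is much faster:

```shell
# Create a 256 MB scratch file, then read it twice with timing.
# The first read may hit the disk; the second is usually served
# entirely from the page cache (given enough free RAM).
dd if=/dev/zero of=/tmp/cache_test bs=1M count=256
time dd if=/tmp/cache_test of=/dev/null bs=1M   # possibly cold read
time dd if=/tmp/cache_test of=/dev/null bs=1M   # warm read, from cache
rm /tmp/cache_test
```

If the two timings are nearly identical and fast, the file was cached even on the first read, which is what I suspect happens with your integral files right after a reboot.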
Why this stops being the case after the system has been running for a while is
another question. Maybe in the meantime more files have been opened and the
OS deemed those more cache-worthy than the integral files.
There may be tools in Linux to tell you about the system cache usage, but I'm
not an expert and could not tell which. One thing you could try is to test
whether this difference in execution speed also occurs if (a) you run smaller
jobs, which always fit completely into memory, or (b) you run larger jobs
(with integral files >> 24 GB), which never fit completely into memory.
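For what it's worth, the standard tools do report cache usage, and the page cache can even be dropped on demand. A sketch (the drop-caches step needs root, and is harmless except that subsequent jobs will do real disk IO again):

```shell
# How much RAM is the kernel currently using as page cache?
free -m
grep -E '^(MemFree|Buffers|Cached):' /proc/meminfo

# To emulate a fresh reboot without actually rebooting, the page
# cache can be dropped (run as root):
#   sync && echo 3 > /proc/sys/vm/drop_caches
```

If dropping the caches before a benchmark run restores the slow timings on a freshly rebooted machine, or running the benchmark twice in a row restores the fast ones, that would confirm the page-cache explanation.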