[molpro-user] Benchmark timing on linux raid
knizia at theochem.uni-stuttgart.de
Wed Nov 24 08:16:26 GMT 2010
On Wednesday 24 November 2010 00:49, Jacek Klos wrote:
> I have noticed some strange behavior on our linux Dell machine.
> Machine is dual-6core 3.33GHz 24GB RAM, and RAID disk composed
> of 3 SAS 15k RPM drives making logical volume of about 1TB.
> When I run small_normal_ccsd benchmarks with 12 cpus using parallel
> molpro (GA4.2 Open-MPI) and I repeat it let's say 20 times after the fresh
> reboot of the machine the fastest elapsed time is 76 seconds.
> The range of elapsed times is roughly +/- 5 sec.
> But when I run again the same series of 12cpus jobs after machine is
> on for a day or so I get elapsed timings like 30 seconds longer:
This may be related to the 'system cache', i.e., the
amount of free physical RAM not committed to any other running processes.
Operating systems tend to use all of that memory for disk caching, and since
these CCSD jobs are still rather small and you have lots of memory, maybe
after a fresh reboot the OS never actually writes the integral data to disk
in the first place (or never reads it back). AOINT, AOSORT, HF and TRANSFORM
usually all depend heavily on IO performance, so if for some reason there
is lots of spare RAM for disk caching, these steps would become much faster.
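As a quick way to see this effect yourself (file path and size here are only illustrative, not related to your actual Molpro scratch files), you could time two consecutive reads of a scratch file; the second read is normally served from the page cache and is much faster:

```shell
# Create a 256 MB scratch file, then read it twice with timing.
# The first read may hit the disk; the second is usually served
# entirely from the page cache (given enough free RAM).
dd if=/dev/zero of=/tmp/cache_test bs=1M count=256
time dd if=/tmp/cache_test of=/dev/null bs=1M   # possibly cold read
time dd if=/tmp/cache_test of=/dev/null bs=1M   # warm read, from cache
rm /tmp/cache_test
```

If the two timings are nearly identical and fast, the file was cached even on the first read, which is what I suspect happens with your integral files right after a reboot.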
Why this stops being the case after the system has been running for a while is
another question. Maybe in the meantime more files have been opened and the
OS deemed those more cache-worthy than the integral files.
There may be tools in Linux to tell you about the system cache usage, but I'm
not an expert and could not tell which. One thing you could try is to test
whether this difference in execution speed also occurs if (a) you run smaller
jobs, which always fit completely into memory, or (b) you run larger jobs
(with integral files >> 24 GB), which never fit completely into memory.
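For what it's worth, the standard tools do report cache usage, and the page cache can even be dropped on demand. A sketch (the drop-caches step needs root, and is harmless except that subsequent jobs will do real disk IO again):

```shell
# How much RAM is the kernel currently using as page cache?
free -m
grep -E '^(MemFree|Buffers|Cached):' /proc/meminfo

# To emulate a fresh reboot without actually rebooting, the page
# cache can be dropped (run as root):
#   sync && echo 3 > /proc/sys/vm/drop_caches
```

If dropping the caches before a benchmark run restores the slow timings on a freshly rebooted machine, or running the benchmark twice in a row restores the fast ones, that would confirm the page-cache explanation.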