Differences

running_molpro_on_parallel_computers [2021/06/02 08:43]
qianli [Running Molpro on parallel computers]
running_molpro_on_parallel_computers [2021/06/02 08:55] (current)
qianli [Memory specifications]
Line 7: Line 7:
 There are different GA implementation options (runtimes), and there are advantages and disadvantages for using one or the other implementation (see [[GA Installation]]).
 
-Since Molpro 2021.2 the [[#disk option]] is used by default in single node calculation, in which case large data structures are simply kept in MPI files.
+**Since Molpro 2021.2 the [[#disk option]] is used by default** in single node calculation, in which case large data structures are simply kept in MPI files.
 The behavior of previous versions can be recovered by the ''--ga-impl ga'' command line option.
 However, ''--ga-impl ga'' requires pre-allocation of GA memory in many calculations if the ''socket'' GA runtime is used, and failing to preallocate sufficient amount of GA memory may lead to crashes or incorrect results.
Line 34: Line 34:
   - If the [[#disk option]] is disabled (the default) and one of the older GA runtimes (''sockets'', ''openib'', etc.) is used (**including when using the Molpro binary release**): Sufficient amount of GA memory must be specified by the ''-G'' or ''-M'' option (see below) and pre-allocated by Molpro in the beginning of a calculation, otherwise the calculation may crash or yields incorrect results.
   - If the disk option is disabled and one of the comex-based GA runtimes (e.g. ''mpi-pr'') is used, or if the disk option is enabled but the scratch is in a tmpfs: the ''-G'' or ''-M'' option is not mandatory, but sufficient physical memory shall be left for the global data structure.
-  - If the [[#disk option]] is enabled and the scratch directory is located on a physical disk: the GA usage should be negligible and the ''-G'' or ''-M'' options should not be given.
+  - If the [[#disk option]] is enabled and the scratch directory is located on a physical disk: the GA usage should be negligible and the **''-G'' or ''-M'' options should not be given**. However, the performance of the calculation might be better if some memory is left for the system to buffer the I/O.
+
+Note that we have made the disk option the default in single node calculations since Molpro 2021.2. If this causes performance problems, the previous behavior of storing large data structure in GlobalArrays can be enabled by setting the environment variable ''MOLPRO_GA_IMPL'' to ''GA'', or by passing the ''%%--ga-impl ga%%'' command-line option.
  
 Both the ''-m'' and ''-G'' options are by default given in megawords (m) but unit gigaword (g) can also be used (e.g. ''-m1000'' is equivalent to ''-m1000m'' and to ''-m1g'').
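
The unit convention in the last context line of this hunk can be illustrated with a short Python sketch. The helper names ''parse_words'' and ''words_to_gb'' are hypothetical and not part of Molpro. The sketch assumes decimal prefixes (1g = 1000m, as implied by the stated equivalence of ''-m1000'' and ''-m1g'') and 8-byte words.

```python
# Hypothetical helpers, not part of Molpro: they only mirror the unit rule
# quoted above (a bare number means megawords, 'm' = megawords, 'g' = gigawords).
UNITS = {"m": 10**6, "g": 10**9}  # assumption: decimal prefixes
BYTES_PER_WORD = 8                # assumption: one word is 8 bytes


def parse_words(value: str) -> int:
    """Convert a value such as '1000', '1000m' or '1g' to a number of words."""
    text = value.strip().lower()
    if not text:
        raise ValueError("empty memory value")
    if text[-1] in UNITS:
        number, factor = text[:-1], UNITS[text[-1]]
    else:
        number, factor = text, UNITS["m"]  # no suffix: megawords by default
    return int(float(number) * factor)


def words_to_gb(words: int) -> float:
    """Convert a word count to gigabytes (10**9 bytes)."""
    return words * BYTES_PER_WORD / 1e9


if __name__ == "__main__":
    for spec in ("1000", "1000m", "1g"):
        print(spec, parse_words(spec), words_to_gb(parse_words(spec)))
```

Under these assumptions, all three spellings give the same number of words, and one gigaword corresponds to 8 GB.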
Line 53: Line 55:
 With this, the total memory allocatable by Molpro can be specified, and the memory is split 50-50 for stack and GA in DF/PNO calculations, and 80-20 in other calculations.
 Thus, unless specified otherwise, in DF/PNO calculations the stack memory per process is $m=M\cdot N/(2\cdot n)$ and the total GA memory is $G=N\cdot M/2$.
-It is recommended to provide a default ''-M'' value in .molprorc (**except for disk-based calculation that does not use GA, see [[#disk option]]**), e.g. ''-M=25g'' for a dedicated machine with 256 GB of memory and 20 cores (.molprorc can be in the home directory and/or in the submission directory, the latter having preference). Then each Molpro run would be able to use the whole memory of the machine with reasonable splitting between stack and GA. The default can be overwritten or modified by molpro command line options ''-m'' and/or ''-G'', or by input options (cf. section [[general program structure#memory allocation|memory allocation]]), the latter having preference over command line options.
+If the use of GA in storing large data structure is desired, it is recommended to provide a default ''-M'' value in .molprorc (**do not do so for disk-based calculation, see [[#disk option]]**), e.g. ''-M=25g'' for a dedicated machine with 256 GB of memory and 20 cores (.molprorc can be in the home directory and/or in the submission directory, the latter having preference). Then each Molpro run would be able to use the whole memory of the machine with reasonable splitting between stack and GA. The default can be overwritten or modified by molpro command line options ''-m'' and/or ''-G'', or by input options (cf. section [[general program structure#memory allocation|memory allocation]]), the latter having preference over command line options.
  
 If the ''-G'' or ''-M'' options are given, some programs check at early stages if the GA space is sufficient. If not, an error exit occurs and the estimated amount of required GA is printed. In this case the calculation should be repeated, specifying (at least) the printed amount of GA space with the ''-G'' option. If crashes without such message occur, the calculation should also be repeated with more GA space or with the disk option, but care should be taken that the total memory per node does not get too large.
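
The default ''-M'' split quoted in this hunk can be worked through with a small Python sketch. This excerpt does not define the symbols, so the sketch assumes that $M$ is the ''-M'' value per node, $N$ the number of nodes and $n$ the total number of processes. It also reads "80-20" as 80% stack and 20% GA. The function name ''split_memory'' is hypothetical and this is not Molpro's own code.

```python
# Hypothetical helper, not Molpro code. Assumptions: M is the -M value per
# node, `nodes` is N, `processes` is n (total over all nodes), and the
# "80-20" split means 80% stack / 20% GA.
def split_memory(M: float, nodes: int, processes: int, df_or_pno: bool = True):
    """Return (stack memory per process, total GA memory) in the units of M."""
    stack_fraction = 0.5 if df_or_pno else 0.8
    total = M * nodes                                 # memory over all nodes
    stack_per_process = total * stack_fraction / processes
    ga_total = total * (1.0 - stack_fraction)
    return stack_per_process, ga_total


if __name__ == "__main__":
    # The -M=25g example from the text, assuming one node and one process
    # per core (20 processes); both are assumptions for illustration.
    m, G = split_memory(25.0, nodes=1, processes=20)
    print(f"stack per process: {m} gigawords, total GA: {G} gigawords")
```

With these assumptions the DF/PNO case reproduces $m=M\cdot N/(2\cdot n)$ and $G=N\cdot M/2$: 0.625 gigawords of stack per process and 12.5 gigawords of GA. If a word is 8 bytes, 25 gigawords is about 200 GB, which fits the 256 GB machine in the example.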
Line 74: Line 76:
  
 Since version 2021.1, Molpro can use MPI files instead of GlobalArrays to store large global data. This option can be enabled globally by setting the environment variable ''MOLPRO_GA_IMPL'' to ''DISK'', or by passing the ''%%--ga-impl disk%%'' command-line option.
+Since version 2021.2 the disk option is made the default in single-node calculations.
 Some programs in Molpro including DF-HF, DF-KS, (DF-)MULTI, DF-TDDFT, and PNO-LCCSD also support an input option ''implementation=disk'' to enable the disk option for the particular job step.
 The file system for these MPI files must be accessible by all processors.