28.7 FCIQMC error analysis

Since the FCIQMC method is a stochastic simulation, care must be taken to obtain reliable energy averages and their associated errorbars. This is difficult to automate reliably, since accurate estimates require both a knowledge of the equilibration time and the removal of serial correlation between the instantaneous estimates. The latter is traditionally done using a blocking analysis of the contributions to the energy. An attempt is made to estimate these quantities automatically and an error analysis is performed at the end of each FCIQMC calculation, with the result stored in the internal variable FCIQMC_ERR. This may often be good enough; however, for production results it is recommended to provide some information manually to increase the accuracy of, and confidence in, these errorbars. This can be done with the REBLOCKSHIFT and REBLOCKPROJE options.
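To illustrate the idea of a blocking analysis, a minimal Python sketch is given here. It is not the routine used internally by MOLPRO; the function name blocking_analysis is hypothetical, and the error on the error, err/sqrt(2(n-1)), is the usual estimate under the assumption of Gaussian-distributed block means. At each level, adjacent pairs of values are averaged, which halves the number of blocks, and the standard error of the mean is re-estimated.

```python
import math


def blocking_analysis(data):
    """Return a list of (n_blocks, std_err, err_on_std_err) tuples.

    The first entry treats every data point as its own block. Each later
    entry is obtained by averaging adjacent pairs of the previous level,
    so the number of blocks is (roughly) halved every time.
    """
    x = list(data)
    results = []
    while len(x) >= 2:
        n = len(x)
        mean = sum(x) / n
        # Unbiased sample variance of the current block averages.
        var = sum((v - mean) ** 2 for v in x) / (n - 1)
        # Standard error of the mean, assuming the blocks are independent.
        err = math.sqrt(var / n)
        # Approximate uncertainty of that error (Gaussian assumption).
        err_err = err / math.sqrt(2.0 * (n - 1))
        results.append((n, err, err_err))
        # Average adjacent pairs; a trailing unpaired value is dropped.
        x = [0.5 * (x[i] + x[i + 1]) for i in range(0, n - 1, 2)]
    return results
```

For serially correlated data the estimated error grows with block length until the blocks are longer than the correlation time, after which it should level off.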

If these options are present, the FCIQMC calculation itself is skipped, and the FCIQMCStats file from the previous calculation is read back in to perform a blocking analysis. The values given to these options are the equilibration times, in iterations, that are discarded from the shift and projected energy estimates respectively before blocking to obtain the averaged energies and errors. These equilibration times can be estimated by plotting iteration against shift (columns 1:2 of the FCIQMCStats file) for the REBLOCKSHIFT value, and iteration against the numerator and denominator of the projected energy (columns 1:24 and 1:25) for the REBLOCKPROJE value. For the projected energy, the equilibration of both the numerator and the denominator should be considered and the larger of the two equilibration times given, since the covariance between the two quantities in the ratio is needed for an accurate error estimate.
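The role of the covariance in the projected energy error can be seen in a short Python sketch. It is only an illustration under stated assumptions, not MOLPRO's implementation: the helper names discard_equilibration and ratio_with_error are hypothetical, the samples passed to ratio_with_error are assumed to be already decorrelated (for example, block averages), and the error uses first-order error propagation for a ratio of means.

```python
import math


def discard_equilibration(iterations, values, n_equil):
    """Keep only the values recorded after iteration n_equil."""
    return [v for it, v in zip(iterations, values) if it > n_equil]


def ratio_with_error(num, den):
    """Return (<num>/<den>, standard error), including the covariance term.

    First-order propagation for a ratio R = N/D of two means:
        (err_R / R)^2 = [var_N/N^2 + var_D/D^2 - 2 cov_ND/(N D)] / n
    where n is the number of (assumed independent) samples.
    """
    n = len(num)
    mean_n = sum(num) / n
    mean_d = sum(den) / n
    var_n = sum((a - mean_n) ** 2 for a in num) / (n - 1)
    var_d = sum((b - mean_d) ** 2 for b in den) / (n - 1)
    cov = sum((a - mean_n) * (b - mean_d) for a, b in zip(num, den)) / (n - 1)
    ratio = mean_n / mean_d
    rel_var = var_n / mean_n**2 + var_d / mean_d**2 - 2.0 * cov / (mean_n * mean_d)
    # Guard against a tiny negative value caused by rounding.
    err = abs(ratio) * math.sqrt(max(rel_var, 0.0) / n)
    return ratio, err
```

Because numerator and denominator must be paired sample by sample to form the covariance, both series are cut at the same iteration, which is why the larger of the two equilibration times is the one to use.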

Once these equilibration times have been provided and MOLPRO has been rerun in the same working directory with these options, a more accurate estimate of the averaged energies and errors will be provided. In addition, files called Blocks_num, Blocks_denom, Blocks_proje and Blocks_shift will be created. Plotting these files shows whether the random errors are reliable, by checking whether the central limit theorem holds when the data is divided into "blocks" of different length. If column 1 (the number of resummed blocks) is plotted against column 3 (the estimated error between these blocks), with errorbars on the estimated error given in column 4, then a plateau should be observable as the number of blocks decreases. If the error increases continually as the number of blocks decreases, this is a sign that the serial correlation in the data may not yet have been removed, and continued running is advised. A log scale on the x-axis is recommended to observe the full range of block lengths. An example of one of these blocking graphs is shown in Fig. 3.

If an incorrect number of blocks is automatically assumed for the plateau height, a more accurate final error can be read off for the correct number of blocks in the relevant blocking file (Blocks_proje or Blocks_shift). For the projected energy, the minimum number of blocks should be determined from the Blocks_num and Blocks_denom files, before the final error is read from the corresponding number of blocks in the Blocks_proje file. Note that if the reference configuration changed during the initial run, this is not taken into account when the blocking is performed manually, and so the correlation energy obtained from manual reblocking must be added to the changed reference energy from the previous run.
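As a complement to visual inspection, the plateau check described above can be sketched in Python. This is a simple heuristic and not the criterion used by MOLPRO: the function names parse_blocks and find_plateau are hypothetical, lines that are empty or begin with # are assumed to be comments, and only columns 1, 3 and 4 of each row are used, as described in the text.

```python
def parse_blocks(lines):
    """Parse rows of a blocking file into (n_blocks, error, error_on_error).

    Columns 1, 3 and 4 are read; other columns are ignored. Rows are
    returned sorted from the largest number of blocks to the smallest.
    """
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumed comment or blank line
        fields = line.split()
        rows.append((int(float(fields[0])), float(fields[2]), float(fields[3])))
    rows.sort(key=lambda r: -r[0])
    return rows


def find_plateau(rows):
    """Return the first row whose error agrees with the next coarser level.

    Agreement means the change in the estimated error is no larger than the
    errorbar on the coarser estimate. Returns None if the error keeps
    rising, which suggests serial correlation has not yet been removed.
    """
    for current, coarser in zip(rows, rows[1:]):
        if abs(coarser[1] - current[1]) <= coarser[2]:
            return current
    return None
```

A call such as find_plateau(parse_blocks(open("Blocks_shift"))) would then return the row at which the plateau begins, or None. A single chance agreement between two levels can trigger the test, so the result should always be confirmed against the plotted graph.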

Figure 3: Example of well-converged blocking analysis, with plateau present. This would indicate an error of 0.002 Hartrees in the shift.

molpro@molpro.net 2020-04-18