A database is an XML file conforming to the molpro-database
schema, and consists of one or more occurrences of each of the following
two principal elements.
Normally, the molecule nodes will be in separate
self-contained files that are then referenced in the main database
file through the syntax of
XInclude. There are three reasons for
this. Firstly, these files can be produced directly by a MOLPRO calculation, with the rest of the database being constructed by
hand. Secondly, the molecule files can be replaced in the future, for
example by running all the molecule calculations again with a different
method; in that case, the rest of the database, i.e. the reaction
specifications, does not need to change. This makes it possible to have
several databases that share the same structure (the specification of
reactions) but hold different numerical data, and that can therefore be
compared numerically.
Thirdly, several databases can coexist in the same directory, and
share some of the same molecule files. An example of this is a
supplementary database that consists of a subset of the reactions
contained in the main database.
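As a rough illustration of the XInclude mechanism described above, the following sketch expands includes using only Python's standard library. The element names `database` and `molecule`, the file names, and the helpers `load_with_includes` and `build_demo` are assumptions made for the example; they are not taken from the molpro-database schema or from the scripts shipped with MOLPRO.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude


def load_with_includes(path):
    """Parse an XML file and expand its xi:include elements in place."""
    base_dir = os.path.dirname(os.path.abspath(path))

    def loader(href, parse, encoding=None):
        # Resolve each href relative to the directory of the main file.
        full_path = os.path.join(base_dir, href)
        if parse == "xml":
            return ET.parse(full_path).getroot()
        with open(full_path, encoding=encoding or "utf-8") as handle:
            return handle.read()

    root = ET.parse(path).getroot()
    ElementInclude.include(root, loader=loader)
    return root


def build_demo(directory):
    """Write a hypothetical molecule file and a main file that includes it."""
    with open(os.path.join(directory, "h2.xml"), "w", encoding="utf-8") as f:
        f.write('<molecule id="h2"/>')
    main_path = os.path.join(directory, "main.xml")
    with open(main_path, "w", encoding="utf-8") as f:
        f.write(
            '<database xmlns:xi="http://www.w3.org/2001/XInclude">'
            '<xi:include href="h2.xml"/>'
            "</database>"
        )
    return main_path


with tempfile.TemporaryDirectory() as tmp:
    expanded = load_with_includes(build_demo(tmp))
    # The xi:include element has been replaced by the molecule element.
    included_ids = [child.get("id") for child in expanded.findall("molecule")]
    print(included_ids)
```

Because the main file only holds references, swapping `h2.xml` for a file produced with a different method would change the expanded tree without touching the main file.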
- Information about a single molecular species in the
molpro-output XML format. This will usually be the result of
PUT,XML in a MOLPRO calculation, but can also be
constructed directly from an external data source. The important
quantities that are used are the geometry and energy, together with
metadata such as the method and basis set, and other quantities such
as spin and symmetry that might be useful for constructing a new
MOLPRO job for the molecule.
- A list of species specifications that point uniquely to one
of the molecule nodes, together with information on how the
species appears stoichiometrically in the reaction, and whether it is
a special point such as a transition state. species
specifications can also be given without either of these tags,
allowing additional geometries, for example along a reaction
coordinate or potential surface cut, to be included.
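To illustrate how stoichiometric information and molecule energies might be combined, here is a minimal sketch that computes a reaction energy. The dictionary layout, the sign convention (negative counts for reactants, positive for products), the function name `reaction_energy`, and the energy values are all assumptions chosen for the example; they are neither the database's actual data model nor real results.

```python
def reaction_energy(stoichiometry, energies):
    """Return the sum of count * energy over the species of one reaction.

    `stoichiometry` maps a species key to its signed count; the sign
    convention (reactants negative, products positive) is an assumption
    of this sketch.
    """
    return sum(count * energies[key] for key, count in stoichiometry.items())


# Made-up energies in arbitrary units, for illustration only.
energies = {"A": -1.0, "B": -2.0, "AB": -3.5}

# Hypothetical reaction A + B -> AB.
association = {"A": -1, "B": -1, "AB": 1}

print(reaction_energy(association, energies))  # -0.5
```

Keeping the stoichiometry separate from the energies mirrors the separation between reaction specifications and molecule files: the same `association` mapping can be evaluated against a different `energies` table.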
The following is an example of a complete database of four reactions
involving the species
Note that the association between the species and the
molpro-output:molecule nodes is achieved through the use of
tags, which PUT,XML will produce provided that
is installed on the system.
An alternative is through syntax such as
and the use of
<species index="73"> in the database file.
Note that sometimes different species have the same InChI, and so the
use of index is necessary to resolve ambiguities.
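The ambiguity described above can be sketched in a few lines of Python. The record layout, the function `resolve_species`, the labels, and the choice of two electronic states of O2 as the species sharing an InChI are hypothetical; only the idea of matching by InChI and using `index` to disambiguate comes from the text.

```python
def resolve_species(molecules, inchi=None, index=None):
    """Return the single molecule record matching the given reference.

    An index takes precedence over an InChI. A LookupError is raised
    unless exactly one record matches.
    """
    if index is not None:
        matches = [m for m in molecules if m["index"] == index]
    elif inchi is not None:
        matches = [m for m in molecules if m["inchi"] == inchi]
    else:
        raise ValueError("give either an index or an InChI")
    if len(matches) != 1:
        raise LookupError(f"expected exactly one match, found {len(matches)}")
    return matches[0]


# Hypothetical records: two species share one InChI.
molecules = [
    {"index": "73", "inchi": "InChI=1S/O2/c1-2", "label": "O2 triplet"},
    {"index": "74", "inchi": "InChI=1S/O2/c1-2", "label": "O2 singlet"},
    {"index": "75", "inchi": "InChI=1S/H2/h1H", "label": "H2"},
]

print(resolve_species(molecules, inchi="InChI=1S/H2/h1H")["label"])  # H2
print(resolve_species(molecules, index="73")["label"])  # O2 triplet
```

Looking up `InChI=1S/O2/c1-2` alone raises `LookupError` here, which is the situation in which an explicit index is needed.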
For full specification of the possible structure of a database, see the schema file
The directory database/utilities contains several Python
scripts that manipulate databases. For convenience, they can be run
through the script
molpro --database script-name arguments ...
so long as Python (version 3 preferred) is installed on the
system. You need the lxml and requests
packages included in your Python installation:
pip install lxml requests
(or pip3 if you are using Python 3).
The script validate checks whether a database conforms to the
schema, for example
cd Molpro # assuming below that we are in Molpro source tree, but works from anywhere
bin/molpro --database validate \