# Molecular geometry

## Geometry specifications

The geometry may be given in standard Z-matrix form, or XYZ form, either in the input or in a separate file (see section geometry files). The geometry specifications are given in the form

[`SYMMETRY, options `

]

[`ORIENT, options `

]

[`ANGSTROM`

]

[`BOHR`

]

`GEOMETRY={ atom specifications }`

`GEOMETRY`

must come after the other commands that modify the way the geometry is constructed. The following are permitted as `SYMMETRY`

options:

Any valid combination of symmetry generators, as described in section symmetry specification below.

Disable use of symmetry. Instead of `SYMMETRY,NOSYM`

also just `NOSYM`

can be used.

Enable use of symmetry.

The following are permitted as `ORIENT`

options:

Orient molecule such that origin is centre of nuclear charge, and axes are eigenvectors of quadrupole moment.`CHARGE`

Orient molecule such that origin is centre of mass, and axes are eigenvectors of inertia tensor (default for Z-matrix input). Alternatively, the symmetry centre can be specified as`MASS`

`CENTRE=MASS|CHARGE`

.Disable re-orientation of molecule (default for XYZ-input).`NOORIENT`

Force first non-zero $x$-coordinate to be positive or negative, respectively. Similarly,`SIGNX=`

$\pm1$`SIGNY`

,`SIGNZ`

can be set for the $y$- and $z$-coordinates, respectively. This can be useful to fix the orientation of the molecule across different calculations and geometries. Alternatively, the system variables`ZSIGNX`

,`ZSIGNY`

,`ZSIGNZ`

can be set to positive or negative values to achieve the same effect.For the $C_{2v}$ and $D_{2h}$ point groups, force the primary plane to be $xz$ instead of the default $yz$. The geometry builder attempts by swapping coordinate axes to place as many atoms as possible in the primary plane, so for the particular case of a planar molecule, this means that all the atoms will lie in the primary plane. The default implements recommendation $5a$ and the first part of recommendation $5b$ specified in J. Chem. Phys.`PLANEXZ`

**23**, 1997 (1955).`PLANEYZ`

and`PLANEXY`

may also be specified, but note that the latter presently generates an error for $C_{2v}$.

`ANGSTROM`

Forces bond lengths that are specified by numbers, or variables without associated units, to use the values as a number of Ångstrom, rather than Bohr.

`BOHR`

Forces bond lengths that are specified by numbers, or variables without associated units, to use the values as a number of Bohr. This is useful in the case of xyz-input with coordinates given in Bohr.

### Z-matrix input

The general form of an atom specification line is

`[`

*group [,]]atom*, $p_1$, $r$, $p_2$, $\alpha$, $p_3$, $\beta$, $J$

or, alternatively,

`[`

*group [,]]atom*, $p_1$, $x$, $y$, $z$

where

atomic group number (optional). Can be used if different basis sets are used for different atoms of the same kind. The basis set is then referred to by this group number and not by the atomic symbol.*group*chemical symbol of the new atom placed at position $p_0$. This may optionally be appended (without blank) by an integer, which can act as sequence number, e.g.,*atom*`C1`

,`H2`

, etc. Dummy centres with no charge and basis functions are denoted either`Q`

or`X`

, optionally appended by a number, e.g,`Q1`

; note that the first atom in the z-matrix must not be called`X`

, since this may be confused with a symmetry specification (use`Q`

instead).**$p_1$**atom to which the present atom is connected. This may be either a number*n*, where $n$ refers to the $n$’th line of the Z-matrix, or an alphanumeric string as specified in the*atom*field of a previous card, e.g.,`C1`

,`H2`

etc. The latter form works only if the atoms are numbered in a unique way.**$r$**Distance of new atom from $p_1$. This value is given in bohr, unless`ANG`

has been specified directly before or after the symmetry specification.**$p_2$**A second atom needed to define the angle $\alpha(p_0,p_1,p_2)$. The same rules hold for the specification as for $p_1$.**$\alpha$**Internuclear angle $\alpha(p_0,p_1,p_2)$. This angle is given in degrees and must be in the range $0 \lt \alpha \lt 180^{0}$.**$p_3$**A third atom needed to define the dihedral angle $\beta(p_0,p_1,p_2,p_3)$. Only applies if $J=0$, see below.**$\beta$**Dihedral angle $\beta(p_0,p_1,p_2,p_3)$ in degree. This angle is defined as the angle between the planes defined by $(p_0,p_1,p_2)$ and $(p_1,p_2,p_3)$ ($-180^{0} \le \beta \le 180^{o}$). Only applies if $J=0$, see below.**$J$**If this is specified and nonzero, the new position is specified by two bond angles rather than a bond angle and a dihedral angle. If $J=\pm 1$, $\beta$ is the angle $\beta(p_0,p_1,p_3)$. If $J=1$, the triple vector product $({\bf p}_1-{\bf p}_0) \cdot [({\bf p}_1-{\bf p}_2) \times ({\bf p}_1-{\bf p}_3)]$ is positive, while this quantity is negative if $J=-1$.Cartesian coordinates of the new atom. This form is assumed if $p_1\le0$; if $p_1\lt 0$, the coordinates are frozen in geometry optimizations.*x,y,z*

All atoms, including those related by symmetry transformations, should be specified in the Z-matrix. Note that for the first atom, no coordinates need be given, for the second atom only $p_1,r$ are needed, whilst for the third atom $p_3,\beta,J$ may be omitted. The 6 missing coordinates are obtained automatically by the program, which translates and re-orients the molecule such that the origin is at the centre of mass, and the axes correspond to the eigenvectors of the inertia tensor (see also `CHARGE`

option above).

Variable names, and in general expressions that are linear in all dependent variables, may be used as well as fixed numerical values for the parameters $r$, $\alpha$ and $\beta$. These expressions are evaluated as late as possible, so that it is possible, for example, to set up loops in which these parameters are changed; the geometry optimizer also understands this construction, and will optimize the energy with respect to the value of the variables. Non-linear expressions should not be used, because the geometry optimization module is unable to differentiate them.

Once the reorientation has been done, the program then looks for symmetry ($D_{2h}$ and subgroups), unless the `NOSYM`

option has been given. It is possible to request that reduced symmetry be used by using appropriate combinations of the options `X,Y,Z,XY,XZ,YZ,XYZ`

. These specify symmetry operations, the symbol defining which coordinate axes change sign under the operation. The point group is constructed by taking all combinations of specified elements. If symmetry is explicitly specified in this way, the program checks to see that the group requested can be used, swapping the coordinate axes if necessary. This provides a mechanism for ensuring that the same point group is used, for example, at all points in the complete generation of a potential energy surface, allowing the safe re-utilization of neighbouring geometry molecular orbitals as starting guesses, etc..

Note that symmetry is not implemented in density fitting methods, and in these cases the NOSYM option is implied automatically.

Also note that by default the automatic orientation of the molecule only takes place if the geometry is defined by internal (Z-matrix) coordinates. In case of `XYZ`

Input the orientation is unchanged, unless the `MASS`

option is specified in the geometry block.

### XYZ input

Simple cartesian coordinates in Ångstrom units can be read as an alternative to a Z matrix, either directly from the input stream, or from a file (see section geometry files).). This facility is triggered by setting the Molpro variable `GEOMTYP`

to the value `XYZ`

before the geometry specification is given, but usually this does not need to be done, as a geometry specification where the first line is a single integer will be recognized as XYZ format, as will the case of the first line consisting of a chemical symbol followed by three cartesian coordinates. The geometry block should then contain the cartesian coordinates in XYZ format (Minnesota Supercomputer Center, Inc.). Variable names, and in general expressions that are linear in all dependent variables, may be used as well as fixed numerical values. Non-linear expressions should not be used, because the geometry optimization module is unable to differentiate them.

The XYZ file format consists of two header lines, the first of which contains the number of atoms, and the second of which is a title. The remaining lines each specify the coordinates of one atom, with the chemical symbol in the first field, and the $x$, $y$, $z$ coordinates following. A sequence number may be appended to the chemical symbol; it is then interpreted as the atomic group number, which can be used when different basis sets are wanted for different atoms of the same kind. The basis set is then specified for this group number rather than the atomic symbol. As a further extension, the first two header lines can be omitted.

Note that for XYZ input the default is not to reorient the molecule. Orientation can be forced, however, by the `MASS`

or `CHARGE`

options on the `ORIENT`

directive.

- examples/h2o_xyzinput.inp
geomtyp=xyz geometry={ 3 ! number of atoms This is an example of geometry input for water with an XYZ file O ,0.0000000000,0.0000000000,-0.1302052882 H ,1.4891244004,0.0000000000, 1.0332262019 H,-1.4891244004,0.0000000000, 1.0332262019 } hf

The XYZ format had been specified within the documentation distributed with the Minnesota Supercomputer Center’s XMol package. Note that Molpro has the facility to write XYZ files with the `PUT`

command (see section writing files for postprocessing(PUT)).

## Geometry Files

Using the format

`GEOMETRY`

=*file*

the geometry definitions are read from *file*, instead of inline.
For the name of the *file* and the characters allowed, similar recommendations hold as for the Molpro input file, see
https://www.molpro.net/develop/manual/doku.php?id=quickstart#how_to_run_molpro .
The geometry file together with a path to it may however be used.

For series of similar calculations for different molecules the filename can be provided as a variable via the molpro script. The general format is

`geometry=string1$var1$var2'string2'$var3.extension`

where `$var1, $var2, $var3`

etc are either integer numbers (provided via the `-V`

molpro option) or string variables (provided via the `--name`

molpro option). Variable names end before the next dollar ($), quote ('), full stop (.) or blank (end of line). Any number of variables is possible, although usually one is sufficient. `.extension`

is appended as given. Strings in quotes between variables are also included as given. The (optional) first string (string1 above) must not be in quotes.

Examples:

`geometry=$mol.xyz`

with `molpro --name mol=h2o …`

. Alternatively

`geometry=$mol`

with `molpro --name mol=h2o.xyz …`

. Both forms evaluate to `geometry=h2o.xyz`

.

Another useful possibility is

`geometry=product$n`

.xyz

with `molpro -V n=3 …`

, which evaluates to `geometry=product3.xyz`

.

Note that the variables defining the geometry filename cannot be defined in the input file, they can only be provided via the molpro script.

## Symmetry specification

If standard Z-matrix input is used, `MOLPRO`

determines the symmetry automatically by default. However, sometimes it is necessary to use a lower symmetry or a different orientation than obtained by the default, and this can be achieved by explicit specification of the symmetry elements to be used, as described below.

Generating symmetry elements, which uniquely specify the point group, can be specified on the `SYMMETRY`

directive. This must be given *before* the geometry block. Each symmetry directive only affects the subsequent geometry block; after a geometry block has been processed, the defaults are restored. Note that the specification of symmetry elements inside the geometry block is no longer allowed.

The dimension of the point group is 2**(number of fields given). Each field consists of one or more of `X, Y`

, or `Z`

(with no intervening spaces) which specify which coordinate axes change sign under the corresponding generating symmetry operation. It is usually wise to choose $z$ to be the unique axis where appropriate (essential for $C_2$ and $C_{2h}$). In that case, the possibilities are:

**(null card)**$C_1$ (i.e., no point group symmetry)$C_s$`Z`

$C_2$`XY`

$C_i$`XYZ`

$C_{2v}$`X,Y`

$C_{2h}$`XY,Z`

$D_2$`XZ,YZ`

$D_{2h}$`X,Y,Z`

Note that Abelian point group symmetry only is available, so for molecules with degenerate symmetry, an Abelian subgroup must be used — e.g, $C_{2v}$ or $D_{2h}$ for linear molecules.

See section symmetry for more details of symmetry groups and ordering of the irreducible representations. Also see section Z-matrix input for more information about automatic generation of symmetry planes.

Note that by default the automatic orientation of the molecule only takes place if the geometry is defined by internal (Z-matrix) coordinates. In case of `XYZ`

Input the orientation is unchanged, unless the `MASS`

option is specified in the geometry block.

## Writing files for postprocessing(PUT)

The `PUT`

command may be used at any point in the input to print, or write to a file, the current geometry. The syntax is

`PUT`

,*style,file,status,info*

If *style* is `GAUSSIAN`

, a complete Gaussian input file will be written; in that case, `info`

will be used for the first (route) data line, and defaults to ‘`# SP`

’.

If *style* is `XYZ`

, an XYZ file will be written (see also section XYZ input). If *style* is `XYZGRAD`

, an XYZ file including energy gradients will be written (columns are: x,y,z,charge,gx,gy,gz; gradient is given in -1*eV/A) If *style* is `CRD`

, the coordinates will be written in CHARMm CRD format.

If *style* is `MOLDEN`

, an interface file for the MOLDEN visualization program is created; further details and examples are given below.

If *style* is `IRSPEC`

, a gnuplot input file will be written, which can be used to plot an IR spectrum. The gnuplot file contains Lorentz functions with the results of the previous vibrational calculation (`VSCF`

the VSCF program (VSCF), `VCI`

the VCI program (VCI)). Input examples can be found in these chapters.

If *style* is `XML`

, a dump of the current state is written, including variables, geometry, basis set, latest orbitals. The file format is well-formed XML, according to the

https://www.molpro.net/schema/molpro-output.html schema. In this case, additional options can be given:

Insert an`INDEX`

,*n*`index=“n”`

attribute into the`<molecule>`

node that is produced.Skip output of normal modes`NOFREQUENCIES`

Skip output of molecular orbitals`NOORBITALS`

Skip output of virtual molecular orbitals`SKIPVIRT`

Skip output of basis set specification`NOBASIS`

Skip output of Molpro variables`NOVARIABLES`

Express orbitals in terms of spherical-harmonic functions rather than converting to cartesian`KEEPSPH`

If *style* is of the form `BABEL/`

*format*, then, where possible, the external shell command `obabel`

will be used to convert the geometry to any format supported by the OpenBabel package. For the possible values of *format*, see OpenBabel’s documentation. If `obabel`

is not installed on the system, then nothing happens.

If *style* is omitted, the Z-matrix, current geometry, and, where applicable, gradient are written.

*file* specifies a file name to which the data is written; if blank, the data is written to the output stream. If *status* is omitted or set to `NEW`

, any old contents of the file are destroyed; otherwise the file is appended.

A subcommand `ORBITAL`

can be given, using the syntax given in section selecting orbitals and density matrices (ORBITAL, DENSITY), to specify a set of orbitals other than the default, latest set.

### Visualization of results using Molden

Geometry, molecular orbital, and normal mode information, when available, is dumped by `PUT,MOLDEN`

in the format that is usable by MOLDEN.

The example below generates all the information required to plot the molecular orbitals of water, and to visualize the normal modes of vibration:

- examples/h2o_put_molden.inp
***,H2O geometry={angstrom;o;h,o,roh;h,o,roh,h,theta}; roh=1.0 theta=104.0 rhf; optg; {frequencies; print,low,img;} put,molden,h2o.molden;

The example below does a difference density by presenting its natural orbitals to MOLDEN. Note that although MOLDEN has internal features for difference density plots, the approach shown here is more general in that it bypasses the restriction to STO-3G, 3-21G, 4-31G and 6-31G basis sets.

- examples/h2o_diffden_molden.inp
gprint,orbitals symmetry,y,planexz geometry={O;H1,O,r;h2,O,r,h1,alpha} r=1.8 alpha=104 int; {hf;wf,10,1;orbital,2100.2} {multi;wf,10,1;orbital,2140.2} {matrop load,dscf,density,2100.2 !load scf density load,dmcscf,density,2140.2 !load mcscf density add,ddiff,dmcscf,-1,dscf !compute dmcscf-dscf natorb,neworb1,dscf natorb,neworb2,dmcscf natorb,neworbs,ddiff save,neworbs,2110.2 save,ddiff,2110.2} put,molden,h2o_ddens.molden;orb,2110.2

## Lattice of point charges

`LATTICE`

,[`INFILE`

=*input_file*,] [`OUTFILE`

= *output_file*,] [`VARGRAD`

,] [`NUCONLY`

,] [`REMOVE`

]

Includes a lattice of point charges, for use in QM/MM calculations for example (see section QM/MM interfaces). An external file (*input_file*) should be given as input, with the following format:

*Comment line*

*number of point charges N*

$x1,y1,z1,q1,$*flag*$1$

$\vdots$

$xN,yN,zN,qN,$*flag*$N$

The $x,y$ and $z$ fields stand for the point charge coordinates (in Å), $q$ for its charge and *flag=1* indicates that gradients should be computed for this lattice point (0 means no gradient). $x,y$, $z$, $q$ are floating point numbers, *flag* is an integer, e.g.:

0.0 , 0.0 , 5.0 , 1.0 , 1

or (commas are optional here):

0.0 0.0 5.0 1.0 1

*outfile* specifies a file name to which the lattice gradient is written; if blank, it will be written to the output stream.

(logical) Stores the lattice gradient in variable`VARGRAD`

`VARGRAD`

.(logical) Disables gradient evaluation with respect to the lattice, independent of`NUCONLY`

*flag*in the lattice file.(logical) Removes the lattice.`REMOVE`

Symmetry is not supported for lattice gradients.

Note that the nuclear repulsion energy (and therefore also the total energy) is computed without including the Coulomb interaction between the point charges.

## Redefining and printing atomic masses

The current masses of all atoms can be printed using

`MASS`

, `PRINT`

The atomic masses can be redefined using

`MASS`

, [*type*,] [*symbol*=*mass*, …]

The optional keyword *type* can take either the value `AVER[AGE]`

for using average isotope masses, or `ISO[TOPE]`

for using the masses of the most abundant isotopes. This affects only the rotational constants and vibrational frequencies. As in most quantum chemistry packages, the default for *type* is `AVERAGE`

. If `INIT`

is given, all previous mass definitions are deleted and the defaults are reset.

Individual masses can be changed by the following entries, where *symbol* is the chemical symbol of the atom and *mass* is the associated mass. Several entries can be given on one `MASS`

card, and/or several `MASS`

cards can follow each other. The last given mass is used.

Note that specifying different isotope masses for symmetry related atoms lowers the symmetry of the system if the molecular centre of mass is taken as the origin. This effect can be avoided by using the charge centre as origin, i.e., specifying `CHARGE`

as the option to an `ORIENT`

directive before the `GEOMETRY`

input:

`ORIENT,CHARGE;`

`GEOMETRY={ ...}`

### Example: two different masses for hydrogen in HD

In analogy to defining two different basis sets for the same element (see Examples for basis input), also two different masses can be defined.

- examples/hd.inp
geometry={ 2 H1 0 0 0 H2 0 0 1.0 } mass,H1=1.008 mass,H2=2.014 basis={ default=cc-pVDZ; } hf optg freq

For performing counterpoise corrected geometry optimizations see section optimizing counterpoise corrected energies.

## Dummy centres

`DUMMY`

,[`SKIP`

],*atom1,atom2,*$\ldots$

Sets nuclear charges on atoms 1,2 etc. to zero, for doing counterpoise calculations, for example. *atom1, atom2,*$\dots$ can be Z-matrix row numbers or tag names. Rangles of consecutive atoms can be given as *atom1,-atom2*, which means from *atom1* to *atom2* (here *atom1* and *atom2* are z-matrix or xyz row numbers)

If the option `SKIP`

is given, basis functions at the specified atoms are not included. This option is not possible with zmatrix geometry inputs, xyz inputs have to be used.

Note that the current setting of dummies is remembered by the program across restarts via the `MOLPRO`

variable `DUMMYATOMS`

. Dummies can be reset to their original charges using a `DUMMY`

card with no entries. Dummy centres are also reset to their original charges if (i) an `INT`

command is encountered, or (ii) a new geometry input is encountered.

The program does not recognize automatically if the symmetry is reduced by defining dummy atoms. Therefore, for a given dummy atom, either all symmetry equivalent atoms must also be dummies, or the symmetry must be reduced manually as required. An error will result if the symmetry is not consistent with the dummy centre definitions.

### Counterpoise calculations

Counterpoise corrections are easily performed using dummy cards. One first computes the energy of the total system, and then for the subsystems using dummy cards. The process can, however, be automated using the command

`COUNTERPOISE`

[, *key1=value, key2=value,……*]

Without any options, this will calculate the ghost-orbital corrections to dimer interaction energies using the last energy calculation to define the dimer and the methods to be used. First of all, the molecule is automatically partitioned into fragments. This is done by assigning all interatomic distances as either intra- or intermolecular, comparing the distance with the scaled sum of Bragg radii of the two atoms. The default scale factor is 1.4, but it can be changed with `SCALE=`

*scale*; for the S66 database of intermolecular interactions, correct partitioning is obtained with a scale factor in the range 1.2 to 1.95. If all of the distances are intramolecular, then it will be assumed that the system is not a non-bonded complex, and the counterpoise calculation will not be done at all. This gives support for running the same Molpro input for both dimers and monomers. At present, partitioning into more than two fragments is discovered, but there is no supporting implementation of the counterpoise correction.

The counterpoise procedure performs four calculations. For each of the identified monomers, it calculates the energy at the dimer geometry using first the dimer basis set, and secondly just the basis functions tied to the monomer atoms. The difference between these, the counterpoise correction, is reported, and added to the previously-calculated dimer energy.

Further options that specify the calculation to be run can be specified exactly as for geometry optimisations; see section options to select the wavefunction and energy to be optimized for further details.

### Example: interaction energy of OH-Ar

- examples/ohar_bsse.inp
***,OH(2Sig+)-Ar linear geometry={q1; !dummy center in center of mass o,q1,ro;h,q1,rh,o,180; !geometry of OH ar,q1,rar,o,theta,h,0} !geometry of Ar roh=1.8 !OH bond-length rar=7.5 !distance of Ar from center of mass theta=0 !angle OH-Ar rh=roh*16/17 !distance of O from center of mass ro=roh*1/17 !distance of H from center of mass basis=avdz !basis set text,calculation for complex {rhf;occ,8,3,3;wf,27,1,1} !RHF for total system rccsd(t) !CCSD(T) for total system e_ohar=energy !save energy in variable e_ohar text,cp calculation for OH dummy,ar !make Ar a dummy center {rhf;occ,3,1,1;wf,9,1,1} !RHF for OH rccsd(t) !CCSD(T) for OH e_oh=energy !save energy in variable e_oh text,cp calculation for Ar dummy,o,h !make OH dummy hf !scf for Ar ccsd(t) !CCSD(T) for Ar e_ar=energy !save energy in variable e_ar text,separate calculation for OH geometry={O;H,O,roh} !geometry for OH alone {rhf;occ,3,1,1;wf,9,1,1} !RHF for OH rccsd(t) !CCSD(T) for OH e_oh_inf=energy !save energy in variable e_oh_inf text,separate calculation for Ar geometry={AR} !geometry for OH alone hf !scf for Ar ccsd(t) !CCSD(T) for Ar e_ar_inf=energy !save energy in variable e_ar_inf de=(e_ohar-e_oh_inf-e_ar_inf)*tocm !compute uncorrected interaction energy de_cp=(e_ohar-e_oh-e_ar)*tocm !compute counter-poise corrected interaction energy bsse_oh=(e_oh-e_oh_inf)*tocm !BSSE for OH bsse_ar=(e_ar-e_ar_inf)*tocm !BSSE for Ar bsse_tot=bsse_oh+bsse_ar !total BSSE

For performing counterpoise corrected geometry optimizations see section optimizing counterpoise corrected energies.