Geometry Input

Supported Geometry Formats

Currently the following input formats are supported by xtb.

Format Basename Suffix molecular periodic
Turbomole coord coord, tmol x 3D
xyz file   xyz x  
mol file   mol x  
Structure-Data file   sdf x  
Protein Database file   pdb x  
Vasp’s POSCAR/CONTCAR poscar, contcar vasp, poscar, contcar   3D
genFormat   gen x 3D
Gaussian external   ein x  

Note

The default format is always Turbomole and neither the suffix nor the base name is case-sensitive for file type detection.

Turbomole Coordinate Input

The default input format is the Turbomole coordinate file as a $coord data group.

h2o.coord
$coord
 0.00000000000000      0.00000000000000     -0.73578586109551      o
 1.44183152868459      0.00000000000000      0.36789293054775      h
-1.44183152868459      0.00000000000000      0.36789293054775      h
$end

Like in Turbomole the coordinates can be given in Bohr (default) or Ångström ($coord angs) or fractional coordinates ($coord frac). Optional data groups like the systems periodicity ($periodic), lattice parameters ($lattice [bohr|angs]) and cell parameters ($cell [bohr|angs]) can be provided as well.

The following input for a MgO crystal utilize this data groups:

mgo.coord
$coord frac
    0.00      0.00      0.00      mg
    0.50      0.50      0.50      o
$periodic 3
$cell
 5.798338236 5.798338236 5.798338236 60. 60. 60.
$end

Note

crest only supporting a plain coord data group without modifiers.

xyz Input

The well-known xyz or xmol format is supported by xtb as well. The atom is identified by its element symbol and followed by its cartesian coordinates in Ångström.

An example for caffeine is given with:

caffeine.xyz
24

C          1.07317        0.04885       -0.07573
N          2.51365        0.01256       -0.07580
C          3.35199        1.09592       -0.07533
N          4.61898        0.73028       -0.07549
C          4.57907       -0.63144       -0.07531
C          3.30131       -1.10256       -0.07524
C          2.98068       -2.48687       -0.07377
O          1.82530       -2.90038       -0.07577
N          4.11440       -3.30433       -0.06936
C          5.45174       -2.85618       -0.07235
O          6.38934       -3.65965       -0.07232
N          5.66240       -1.47682       -0.07487
C          7.00947       -0.93648       -0.07524
C          3.92063       -4.74093       -0.06158
H          0.73398        1.08786       -0.07503
H          0.71239       -0.45698        0.82335
H          0.71240       -0.45580       -0.97549
H          2.99301        2.11762       -0.07478
H          7.76531       -1.72634       -0.07591
H          7.14864       -0.32182        0.81969
H          7.14802       -0.32076       -0.96953
H          2.86501       -5.02316       -0.05833
H          4.40233       -5.15920        0.82837
H          4.40017       -5.16929       -0.94780

mol and sdf Input

The mol and sdf format of the ct-file formats are partly supported in xtb. Some limitations apply to those input formats are applied to those formats to make them play nicely together with QM nature of the xTB methods. This means not any mol or sdf input will be accepted as geometry input.

A valid sdf input is given for the water molecule with

h2o.sdf
Water
  xtb     11041909383D
Comment line
  3  2  0     0  0            999 V2000
   -0.2191   -0.3098    0.0000  O  0  0  0  0  0  0  0  0  0  0  0  0
    0.7400   -0.2909   -0.0000  H  0  0  0  0  0  0  0  0  0  0  0  0
   -0.5210    0.6007    0.0000  H  0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  1  3  1  0  0  0  0
M  END
> <Formula>
H2 O

> <Mw>
18.01528

> <SMILES>
O([H])[H]

> <CSID>
937

$$$$

The input reader is strict in differentiating mol and sdf input, mol input with the sdf extension will be rejected by the reader. The topology and the sdf key-value pairs will be preserved and printed again in the final optimized geometry.

Multiple entries in an sdf input will be ignored by the reader.

Protein Database Input

The input reader supports parts of the pdb-format for reading single PDB file. Minimal sanity checks on the PDB input will be performed, i.e. the reader will outright reject any geometry without hydrogen atoms.

An valid example input (with hydrogen atoms and partial occupied sides removed) is given here:

4qxx.pdb
ATOM      1  N   GLY Z   1      -0.821  -2.072  16.609  1.00  9.93           N
ATOM      2  CA  GLY Z   1      -1.705  -2.345  15.487  1.00  7.38           C
ATOM      3  C   GLY Z   1      -0.968  -3.008  14.344  1.00  4.89           C
ATOM      4  O   GLY Z   1       0.258  -2.982  14.292  1.00  5.05           O
ATOM      5  HA2 GLY Z   1      -2.130  -1.405  15.135  1.00  0.00           H
ATOM      6  HA3 GLY Z   1      -2.511  -2.999  15.819  1.00  0.00           H
ATOM      7  H1  GLY Z   1      -1.364  -1.742  17.394  1.00  0.00           H
ATOM      8  H2  GLY Z   1      -0.150  -1.365  16.344  1.00  0.00           H
ATOM      9  H3  GLY Z   1      -0.334  -2.918  16.868  1.00  0.00           H
ATOM     10  N   ASN Z   2      -1.721  -3.603  13.425  1.00  3.53           N
ATOM     11  CA  ASN Z   2      -1.141  -4.323  12.291  1.00  1.85           C
ATOM     12  C   ASN Z   2      -1.748  -3.900  10.968  1.00  3.00           C
ATOM     13  O   ASN Z   2      -2.955  -3.683  10.873  1.00  3.99           O
ATOM     14  CB  ASN Z   2      -1.353  -5.827  12.446  1.00  5.03           C
ATOM     15  CG  ASN Z   2      -0.679  -6.391  13.683  1.00  5.08           C
ATOM     16  OD1 ASN Z   2       0.519  -6.202  13.896  1.00  6.10           O
ATOM     17  ND2 ASN Z   2      -1.448  -7.087  14.506  1.00  8.41           N
ATOM     18  H   ASN Z   2      -2.726  -3.557  13.512  1.00  0.00           H
ATOM     19  HA  ASN Z   2      -0.070  -4.123  12.263  1.00  0.00           H
ATOM     20  HB2 ASN Z   2      -0.945  -6.328  11.568  1.00  0.00           H
ATOM     21  HB3 ASN Z   2      -2.423  -6.029  12.503  1.00  0.00           H
ATOM     22 HD21 ASN Z   2      -2.427  -7.218  14.293  1.00  0.00           H
ATOM     23 HD22 ASN Z   2      -1.056  -7.487  15.346  1.00  0.00           H
ATOM     24  N   LEU Z   3      -0.907  -3.803   9.944  1.00  3.47           N
ATOM     25  CA  LEU Z   3      -1.388  -3.576   8.586  1.00  3.48           C
ATOM     26  C   LEU Z   3      -0.783  -4.660   7.709  1.00  3.29           C
ATOM     27  O   LEU Z   3       0.437  -4.788   7.643  1.00  3.80           O
ATOM     28  CB  LEU Z   3      -0.977  -2.185   8.081  1.00  3.88           C
ATOM     29  CG  LEU Z   3      -1.524  -1.669   6.736  1.00  8.66           C
ATOM     30  CD1 LEU Z   3      -1.225  -0.191   6.570  1.00  9.89           C
ATOM     31  CD2 LEU Z   3      -0.962  -2.409   5.541  1.00 13.56           C
ATOM     32  H   LEU Z   3       0.086  -3.888  10.109  1.00  0.00           H
ATOM     33  HA  LEU Z   3      -2.475  -3.661   8.568  1.00  0.00           H
ATOM     34  HB2 LEU Z   3      -1.284  -1.469   8.843  1.00  0.00           H
ATOM     35  HB3 LEU Z   3       0.111  -2.162   8.026  1.00  0.00           H
ATOM     36  HG  LEU Z   3      -2.606  -1.798   6.737  1.00  0.00           H
ATOM     37 HD11 LEU Z   3      -1.623   0.359   7.423  1.00  0.00           H
ATOM     38 HD12 LEU Z   3      -1.691   0.173   5.654  1.00  0.00           H
ATOM     39 HD13 LEU Z   3      -0.147  -0.043   6.513  1.00  0.00           H
ATOM     40 HD21 LEU Z   3      -1.168  -3.475   5.643  1.00  0.00           H
ATOM     41 HD22 LEU Z   3      -1.429  -2.035   4.630  1.00  0.00           H
ATOM     42 HD23 LEU Z   3       0.115  -2.250   5.489  1.00  0.00           H
ATOM     43  N   VAL Z   4      -1.635  -5.424   7.029  1.00  3.17           N
ATOM     44  CA  VAL Z   4      -1.165  -6.460   6.119  1.00  3.61           C
ATOM     45  C   VAL Z   4      -1.791  -6.230   4.755  1.00  5.31           C
ATOM     46  O   VAL Z   4      -3.014  -6.209   4.620  1.00  7.31           O
ATOM     47  CB  VAL Z   4      -1.567  -7.872   6.593  1.00  5.31           C
ATOM     48  CG1 VAL Z   4      -1.012  -8.934   5.633  1.00  6.73           C
ATOM     49  CG2 VAL Z   4      -1.083  -8.120   8.018  1.00  5.48           C
ATOM     50  H   VAL Z   4      -2.628  -5.282   7.146  1.00  0.00           H
ATOM     51  HA  VAL Z   4      -0.080  -6.402   6.034  1.00  0.00           H
ATOM     52  HB  VAL Z   4      -2.655  -7.939   6.585  1.00  0.00           H
ATOM     53 HG11 VAL Z   4      -1.303  -9.926   5.980  1.00  0.00           H
ATOM     54 HG12 VAL Z   4      -1.414  -8.766   4.634  1.00  0.00           H
ATOM     55 HG13 VAL Z   4       0.075  -8.864   5.603  1.00  0.00           H
ATOM     56 HG21 VAL Z   4      -1.377  -9.121   8.333  1.00  0.00           H
ATOM     57 HG22 VAL Z   4       0.003  -8.032   8.053  1.00  0.00           H
ATOM     58 HG23 VAL Z   4      -1.529  -7.383   8.686  1.00  0.00           H
ATOM     59  N   SER Z   5      -0.966  -6.052   3.736  1.00  7.53           N
ATOM     60  CA  SER Z   5      -1.526  -5.888   2.407  1.00 11.48           C
ATOM     61  C   SER Z   5      -1.207  -7.085   1.529  1.00 16.35           C
ATOM     62  O   SER Z   5      -0.437  -7.976   1.902  1.00 14.00           O
ATOM     63  CB  SER Z   5      -1.031  -4.596   1.767  1.00 13.36           C
ATOM     64  OG  SER Z   5       0.361  -4.652   1.540  1.00 15.80           O
ATOM     65  OXT SER Z   5      -1.737  -7.178   0.429  1.00 17.09           O
ATOM     66  H   SER Z   5       0.033  -6.031   3.880  1.00  0.00           H
ATOM     67  HA  SER Z   5      -2.610  -5.822   2.504  1.00  0.00           H
ATOM     68  HB2 SER Z   5      -1.543  -4.449   0.816  1.00  0.00           H
ATOM     69  HB3 SER Z   5      -1.254  -3.759   2.428  1.00  0.00           H
ATOM     70  HG  SER Z   5       0.653  -3.831   1.137  1.00  0.00           H
TER      71      SER Z   5
HETATM   72  O   HOH Z 101       0.935  -5.175  16.502  1.00 18.83           O
HETATM   73  H1  HOH Z 101       0.794  -5.522  15.621  1.00  0.00           H
HETATM   74  H2  HOH Z 101       1.669  -4.561  16.489  1.00  0.00           H
HETATM   75  O   HOH Z 102       0.691  -8.408  17.879  1.00 56.55           O
HETATM   76  H1  HOH Z 102       1.392  -8.125  18.466  1.00  0.00           H
HETATM   77  H2  HOH Z 102       0.993  -8.356  16.972  1.00  0.00           H
CONECT   73   72
CONECT   74   72
CONECT   72   73   74
CONECT   76   75
CONECT   77   75
CONECT   75   76   77
END

Vasp’s POSCAR/CONTCAR Input

For periodic input Vasp’s POSCAR / CONTCAR input files are supported, for more information on the format visit the vasp-wiki.

For a molecular crystal of ammonia the input would look like:

ammonia.poscar
 H  N
 1.0000000000000000
     5.0133599999999996    0.0000000000000000    0.0000000000000000
     0.0000000000000000    5.0133599999999996    0.0000000000000000
     0.0000000000000000    0.0000000000000000    5.0133599999999996
  12   4
Cartesian
  2.1985588943999996  1.7639005823999998  0.8801454815999999
  1.7639005823999998  0.8801454815999999  2.1985588943999996
  0.8801454815999999  2.1985588943999996  1.7639005823999998
  4.8411510839999998  1.6194155471999998  4.9398140088000000
  4.3563090384000001  2.4998116967999997  3.6324801215999996
  3.5195792543999995  1.1535741359999998  4.0840334567999994
  4.0840334567999994  3.5195792543999995  1.1535741359999998
  4.9398140088000000  4.8411510839999998  1.6194155471999998
  3.6324801215999996  4.3563090384000001  2.4998116967999997
  2.4998116967999997  3.6324801215999996  4.3563090384000001
  1.1535741359999998  4.0840334567999994  3.5195792543999995
  1.6194155471999998  4.9398140088000000  4.8411510839999998
  1.3746131783999997  1.3746131783999997  1.3746131783999997
  3.9981545999999994  1.9910559239999999  4.4636450759999997
  4.4636450759999997  3.9981545999999994  1.9910559239999999
  1.9910559239999999  4.4636450759999997  3.9981545999999994

genFormat Input

The DFTB+ genFormat is supported for molecular and 3D periodic systems.

A valid input file for a molecular system is given here:

1,4-bromomethaneformaldehyde.gen
9 C
C Br H O
     1   1  -8.9147060000E-02  -6.6786080000E-02  -1.0432907000E-01
     2   2   1.7639746700E+00   2.6771621000E-01   4.2178865000E-01
     3   3  -2.6325805000E-01  -1.1300550700E+00  -1.3052621000E-01
     4   3  -7.4963702000E-01   3.9302570000E-01   6.1238499000E-01
     5   3  -2.6130022000E-01   3.5462634000E-01  -1.0812232600E+00
     6   4   4.7684499800E+00   7.6734388000E-01   1.2078966200E+00
     7   1   5.5165496700E+00   2.5437564000E-01   4.3331738000E-01
     8   3   6.6378745000E+00   3.1585526000E-01   5.3760272000E-01
     9   3   5.1708208600E+00  -3.3263252000E-01  -4.6451965000E-01

For a periodic system the input would look like

GaAs.gen
2  F
Ga As
1 1 0.00 0.00 0.00
2 2 0.25 0.25 0.25
0.0000000E+00 0.0000000E+00 0.0000000E+00
0.2713546E+01 0.2713546E+01 0.0000000E+00
0.0000000E+00 0.2713546E+01 0.2713546E+01
0.2713546E+01 0.0000000E+00 0.2713546E+01

Gaussian External Input

The Gaussian external format is supported to use xtb with the Gaussian program. A thin wrapper around the xtb binary is required to convert the external call to a valid xtb program call.

An example input file is given here:

nh3.EIn
         4         1         0         1
         7      0.000000000000      0.000000000000     -0.114091591161      0.000000000000
         1     -1.817280998039      0.000000000000      0.528409372569      0.000000000000
         1      0.908640499019     -1.573811509290      0.528409372569      0.000000000000
         1      0.908640499019      1.573811509290      0.528409372569      0.000000000000

Custom Annotations in Geometries

The element type is detected by the element symbol, xtb filters the input string in the respective format for letters and uses them to figure out the element type in a case-insensitive way. While reading the geometry input the actual element symbol will be preserved and not normalized, a buffer of four characters is available to hold the symbol which will be used when referring to the element and printing the final geometry.

Note

Prior to version 6.3 only a two character buffer was available.

This allows to add annotations to the geometry input which will not affect the calculation, but show up in the output, log files and the final geometry printout. Consider the following xyz example:

24

13C        1.07317        0.04885       -0.07573
N          2.51365        0.01256       -0.07580
C*         3.35199        1.09592       -0.07533
N          4.61898        0.73028       -0.07549
C          4.57907       -0.63144       -0.07531
C          3.30131       -1.10256       -0.07524
C          2.98068       -2.48687       -0.07377
O(1)       1.82530       -2.90038       -0.07577
N          4.11440       -3.30433       -0.06936
C          5.45174       -2.85618       -0.07235
O          6.38934       -3.65965       -0.07232
N          5.66240       -1.47682       -0.07487
C          7.00947       -0.93648       -0.07524
C          3.92063       -4.74093       -0.06158
H          0.73398        1.08786       -0.07503
D          0.71239       -0.45698        0.82335
D          0.71240       -0.45580       -0.97549
D          2.99301        2.11762       -0.07478
H          7.76531       -1.72634       -0.07591
2H         7.14864       -0.32182        0.81969
3H         7.14802       -0.32076       -0.96953
H          2.86501       -5.02316       -0.05833
H          4.40233       -5.15920        0.82837
H          4.40017       -5.16929       -0.94780

Which is a valid input for xtb. Note that D and T can be used as synonyms for hydrogen (H).

Note

Mangled names are not supported with crest and must be normalized.