Usage examples

CENSO can be used for several applications / target quantities. Some are listed below:

Hint

CENSO has sorting “parts” which can be turned on and off. The parts are run in sequence, and the sorting results depend on which parts are employed. If, for example, the optimization part (part2) is not performed, then all subsequent parts will calculate free energies or properties on the input SQM/FF geometries (not DFT-optimized geometries)! Each part contains thresholds (see Threshold), and choosing too low (free) energy windows in the early sorting parts will affect your final ensemble / averaged free energy / property.

Note

For demonstration purposes, it is assumed that all parts are turned off in the user's global configuration file!

Calculate fast DFT(B97-D3(0)/def2-SV(P)+gcp) single-point energies on GFNn-xTB input geometries

Hint

Useful for large structure ensembles (SE). The electronic energy description is improved quickly and efficiently compared to the initial SQM/FF energies, and high-lying conformers are sorted out early.
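
The effect of an energy window on an ensemble can be illustrated with a minimal Python sketch. It is not CENSO's implementation; the conformer names, the energies, and the 4.0 kcal/mol window are made-up values chosen only for illustration.

```python
HARTREE_TO_KCAL = 627.509474  # kcal/mol per Hartree


def within_window(energies_eh, window_kcal):
    """Return relative energies (kcal/mol) of conformers inside the window.

    energies_eh maps a conformer name to its energy in Hartree.
    """
    e_min = min(energies_eh.values())
    relative = {
        name: (energy - e_min) * HARTREE_TO_KCAL
        for name, energy in energies_eh.items()
    }
    return {name: de for name, de in relative.items() if de <= window_kcal}


# Hypothetical single-point energies in Hartree (illustration only)
energies = {"CONF1": -310.512340, "CONF2": -310.510100, "CONF3": -310.501000}
kept = within_window(energies, window_kcal=4.0)
print(sorted(kept))
```

A conformer that is discarded here never reaches the later parts, which is why a window that is too narrow in an early part changes the final ensemble.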

Calculate free energies in solution phase (CHCl3) on GFNn-xTB geometries

Note

There are two options available:

  • using part1 (prescreening), or

  • using only part3 and skipping part2 (optimization)

The difference between the two approaches is that Part3 applies tighter thresholds in the SCF.

Calculate free energies on populated, DFT optimized conformers

Note

  • Sorting parts part0 and part1 are used to reduce the number of computationally costly DFT geometry optimizations.

  • For the cheap prescreening in part0, the default level of theory is B97-D/def2-SV(P). Using a higher level of theory, e.g., a hybrid DFA such as B3LYP-D3 with a large basis set, leads to unnecessarily high computational costs in this part.

  • Example settings are read from the .censorc file.

Calculation of NMR spectra

cis-4-hexen-1-ol

Example of calculating the 1H-NMR spectrum of cis-4-hexen-1-ol in CHCl3 at 400 MHz using TURBOMOLE

$ cat coord
$coord
     -4.5787202885           -0.8553740252            1.3686940377   C
     -3.4023148672           -0.1930996279           -1.1104654057   C
     -1.3502307141            1.1896725766           -1.5219716436   C
      0.3006494795            2.4620980359            0.3955352829   C
      0.2362191223            4.5001215909            0.0503890823   H
      3.0642161355            1.6025835718            0.2096773622   C
      3.4178595291           -1.1774351958            0.9222138508   C
      2.5415191057           -2.8708599609           -0.9480135970   O
      0.8099880164           -2.4078504756           -1.3346662145   H
      5.4277591776           -1.6058314070            1.1410607567   H
      2.4658368243           -1.5681688895            2.7281288136   H
      4.2123967195            2.7454371194            1.4856132673   H
      3.7614131313            1.8836292778           -1.7127605110   H
     -0.3830548458            2.1258051759            2.3091187341   H
     -0.7518930894            1.4941446048           -3.4579714157   H
     -4.3910561682           -0.9675515200           -2.7291675973   H
     -3.5318795176           -0.1003881407            2.9630921609   H
     -6.5027613245           -0.1227486813            1.4413316443   H
     -4.6902514935           -2.9039535570            1.5605291206   H
$end
Figure: Input structure of cis-4-hexen-1-ol.

Start with your (ideally already optimized) input structure and create the conformers and rotamers for the CENSO and ANMR calculations.

$ crest coord -gfn2 -g chcl3 -T 4 -nmr > crest.out

In our case CREST found 86 conformers within an energy window of 6 kcal/mol. We then create a new folder for the CENSO reranking and copy the necessary files:

$ mkdir censo
$ cp crest_conformers.xyz coord anmr_nucinfo anmr_rotamer censo/
$ cd censo/

CENSO itself requires only the file crest_conformers.xyz; the other three files (coord, anmr_nucinfo, anmr_rotamer) are needed by ANMR. Here we want to calculate everything with TURBOMOLE, using r2SCAN-3c and DCOSMO-RS for the geometry optimization and PBE0/def2-TZVP for the NMR part, with all settings passed on the command line. To save computational costs, a Boltzmann population threshold of 95 % is used in part 2, which is in most cases sufficient to reproduce the experimental spectrum. In our case, this reduces the number of conformers in part 4 from 74 to 58.
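
What a Boltzmann population threshold does can be sketched in a few lines of Python. This is an illustration under simplifying assumptions (no degeneracies, relative free energies already in kcal/mol), not the code CENSO runs; the four free energies are made up.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)


def boltzmann_weights(rel_g_kcal, temperature=298.15):
    """Normalized Boltzmann weights from relative free energies (kcal/mol)."""
    factors = [math.exp(-g / (R_KCAL * temperature)) for g in rel_g_kcal]
    total = sum(factors)
    return [f / total for f in factors]


def select_by_population(rel_g_kcal, threshold=0.95, temperature=298.15):
    """Pick the most populated conformers until their summed weight
    reaches the threshold. Returns (selected indices, summed weight)."""
    weights = boltzmann_weights(rel_g_kcal, temperature)
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    selected, cumulative = [], 0.0
    for index in order:
        selected.append(index)
        cumulative += weights[index]
        if cumulative >= threshold:
            break
    return selected, cumulative


# Hypothetical relative free energies in kcal/mol (illustration only)
rel_g = [0.0, 0.5, 1.0, 3.0]
selected, covered = select_by_population(rel_g, threshold=0.95)
print(selected, round(covered, 3))
```

In this toy ensemble the conformer at 3.0 kcal/mol carries well under 1 % of the population, so dropping it barely changes any Boltzmann-averaged property while saving one expensive calculation.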

Note

  • Settings are read from the global .censorc file (default settings in CENSO).

Now all information is present and ANMR can be called to calculate the full NMR spectrum.

After ANMR has finished, the file anmr.dat is written. It contains the spectrum (intensity vs. shift), which can be plotted:

$ nmrplot.py -i anmr.dat exp.dat -start 0 -end 6.5 -o 1Hspectrum -orientation 1 -1

1H NMR spectrum of cis-4-hexen-1-ol in chloroform at 400 MHz, comparing calculated and experimental spectra. Experimental spectrum taken from [SDBSWeb: https://sdbs.db.aist.go.jp (National Institute of Advanced Industrial Science and Technology, 16-10-2019) (SDBSNo. 11748)].
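
For users who want to post-process the spectrum themselves, the following Python sketch reads a spectrum file, crops it to a shift range, and scales the highest peak to 1. It assumes that anmr.dat is a plain-text file with two whitespace-separated columns (shift in ppm, then intensity); this layout is an assumption and should be checked against your own file. The function names and the demo data are hypothetical.

```python
import os
import tempfile


def read_spectrum(path):
    """Read (shift, intensity) pairs, skipping lines that are not two numbers."""
    points = []
    with open(path) as handle:
        for line in handle:
            parts = line.split()
            if len(parts) < 2:
                continue
            try:
                points.append((float(parts[0]), float(parts[1])))
            except ValueError:
                continue  # header or comment line
    return points


def crop_and_normalize(points, start, end):
    """Keep points with start <= shift <= end and scale the highest peak to 1."""
    window = [(x, y) for x, y in points if start <= x <= end]
    peak = max((y for _, y in window), default=0.0)
    if peak <= 0.0:
        return window
    return [(x, y / peak) for x, y in window]


# Demo with made-up data written to a temporary file (illustration only)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "anmr.dat")
    with open(demo_path, "w") as handle:
        handle.write("0.50 0.0\n1.00 2.0\n5.40 4.0\n7.20 9.0\n")
    spectrum = crop_and_normalize(read_spectrum(demo_path), start=0.0, end=6.5)

print(spectrum)
```

Normalizing after cropping means that a strong signal outside the plotted range, such as a solvent peak, does not compress the region of interest.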

2-methyl-1-pentene

Example of calculating the 1H-NMR spectrum of 2-methyl-1-pentene in CHCl3 at 400 MHz using ORCA

$ cat coord
$coord
       -5.1134989926            0.0445408597            0.0007215195   C
       -2.3988260553            0.1202192416            0.9598504570   C
       -2.0426150350            1.9467776447            1.8509773297   H
       -0.4955528936           -0.3025973506           -1.1852527430   C
        2.1853738985           -0.2583887206           -0.2367582425   C
        3.4286190716           -2.3618737092            0.3005853656   C
        2.5901373582           -4.2004809628            0.0485727882   H
        5.3374488734           -2.3390838060            1.0097369787   H
        3.3398174602            2.3079102171            0.0825121447   C
        5.2234930962            2.1788391495            0.8913733279   H
        2.1708137054            3.4751040066            1.3098219746   H
        3.4682822356            3.2543034689           -1.7427970988   H
       -0.7536049204            1.1708293724           -2.6081830586   H
       -0.8901990516           -2.1258718566           -2.0673390015   H
       -2.1284937554           -1.3401645088            2.3937954454   H
       -6.4334377217            0.3509962700            1.5473865797   H
       -5.4204111085            1.5054637513           -1.4143993805   H
       -5.5276306722           -1.7796998127           -0.8540259276   H
$end
Figure: Input structure of 2-methyl-1-pentene.

Start with your (ideally already optimized) input structure and create the conformers and rotamers for the CENSO and ANMR calculations.

$ crest coord -gfn2 -alpb chcl3 -T 4 -nmr > crest.out

In our case CREST found 9 conformers within an energy window of 6 kcal/mol. We then create a new folder for the CENSO reranking and copy the necessary files:

$ mkdir censo
$ cp crest_conformers.xyz coord anmr_nucinfo anmr_rotamer censo/
$ cd censo/

CENSO itself requires only the file crest_conformers.xyz; the other three files (coord, anmr_nucinfo, anmr_rotamer) are needed by ANMR. In the new folder, we create the file flags.dat and adapt it to our needs. Here we want to calculate everything with ORCA, using r2SCAN-3c and SMD for the geometry optimization and PBE0/def2-TZVP for the shielding calculation. B97-D/def2-SV(P) (keyword “b97-d3” in ORCA) is used for the prescreening in part0 to save computation time.

In our case, CENSO printed an error message stating that the reference absolute shielding constant for hydrogen is missing at the chosen level of theory.

ERROR:       The reference absolute shielding constant for element h could not be found!
             You have to edit the file .anmrrc by hand!

To obtain it, the same workflow as for 2-methyl-1-pentene has to be carried out for TMS in a new directory:

$ mkdir tms
$ cd tms
$ cat coord
$coord
 2.05833045453195     -2.05833045453195      2.05833045453195  c
 3.27901073396930     -3.27901073396930      0.93023223253204  h
 3.27901073396930     -0.93023223253204      3.27901073396930  h
 0.93023223253204     -3.27901073396930      3.27901073396930  h
-0.00000000000000      0.00000000000000      0.00000000000000  si
-2.05833045453195      2.05833045453195      2.05833045453195  c
-3.27901073396930      3.27901073396930      0.93023223253204  h
-0.93023223253204      3.27901073396930      3.27901073396930  h
-3.27901073396930      0.93023223253204      3.27901073396930  h
 2.05833045453195      2.05833045453195     -2.05833045453195  c
 0.93023223253204      3.27901073396930     -3.27901073396930  h
 3.27901073396930      0.93023223253204     -3.27901073396930  h
 3.27901073396930      3.27901073396930     -0.93023223253204  h
-2.05833045453195     -2.05833045453195     -2.05833045453195  c
-3.27901073396930     -3.27901073396930     -0.93023223253204  h
-3.27901073396930     -0.93023223253204     -3.27901073396930  h
-0.93023223253204     -3.27901073396930     -3.27901073396930  h
$end
$ crest coord -gfn2 -alpb chcl3 -T 4 -nmr > crest.out
$ mkdir censo
$ cp crest_conformers.xyz coord anmr_nucinfo anmr_rotamer censo/
$ cd censo/

The calculated reference shielding constant must now be inserted into the .anmrrc file of the NMR calculation for the respective molecule:

$ cat .anmrrc

7 8 XH acid atoms
ENSO qm= ORCA mf= 300.0 lw= 1.0  J= on S= on T= 298.15
TMS[chcl3] pbe0[SMD]/def2-TZVP//r2scan-3c[SMD]/def2-mTZVPP
1  31.59    0.0     1
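
Why the reference value is needed can be shown with a short Python sketch. It assumes the common convention that a chemical shift is the reference shielding minus the shielding of the nucleus of interest, plus the shift assigned to the reference. The value 31.59 is the hydrogen entry from the listing above; the shielding of 26.24 ppm is a hypothetical placeholder, and the function name is ours.

```python
def chemical_shift(sigma, sigma_ref, shift_ref=0.0):
    """Convert an absolute shielding (ppm) into a chemical shift (ppm).

    sigma      absolute shielding of the nucleus of interest
    sigma_ref  absolute shielding of the reference (e.g. 1H in TMS),
               computed at the same level of theory
    shift_ref  shift assigned to the reference (0.0 for TMS)
    """
    return sigma_ref - sigma + shift_ref


SIGMA_REF_H = 31.59  # 1H reference shielding taken from the .anmrrc above

# Hypothetical shielding of an olefinic proton (illustration only)
delta = chemical_shift(26.24, SIGMA_REF_H)
print(round(delta, 2))
```

Because the shift is a difference of two computed shieldings, the reference has to be calculated with the same functional, basis set, solvation model, and geometry level as the molecule itself; otherwise systematic errors no longer cancel.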

Now all information is present and ANMR can be called to calculate the full NMR spectrum at 400 MHz:

1H NMR spectrum of 2-methyl-1-pentene in chloroform at 400 MHz, comparing calculated and experimental spectra. Experimental spectrum taken from [SDBSWeb: https://sdbs.db.aist.go.jp (National Institute of Advanced Industrial Science and Technology, 16-10-2019) (SDBSNo. 225)].

Calculation of optical rotation

Example of calculating the optical rotation (OR) of \({\alpha}\)-D-glucopyranose and \({\beta}\)-D-glucopyranose

Figure: Input structures of \({\alpha}\)-D-glucopyranose (left) and \({\beta}\)-D-glucopyranose (right).

Start with an input structure (this one is taken from PubChem) and create the conformers for the CENSO calculation. For the computation of the OR, it is important to obtain an (ideally) complete conformer ensemble. Therefore, the crest_combi bash script is used. It automatically starts several CREST runs at the GFN-FF and GFN2-xTB levels of theory, complemented by searches on artificial PESs (scaled dispersion and charge), to compensate for possible method deficiencies and to efficiently scan a large part of the PES of the respective molecule. To use it, download the script from the release page https://github.com/grimme-lab/CRENSO/releases and make it executable (chmod u+x crest_combi).

Hint

To save computational costs, the crest_combi script reduces the generated structures to the most representative ones via the clustering algorithm of CREST. The -or flag sets the maximum number of generated structures to 1000 instead of the default value of 500, to prevent relevant conformers from being sorted out in this step.

$ crest_combi coord -solvent h2o -T 14 -or > crest.out

In our case CREST found 54 conformers within an energy window of 8 kcal/mol (default value of crest_combi). The final ensemble is optimized at GFN2-xTB level with the ALPB implicit solvation model. We then create a new folder for the CENSO reranking and copy the necessary files:

$ mkdir censo
$ cp crest_combi.xyz censo/
$ cd censo/

To prevent sorting out relevant conformers, we increase the energy thresholds of part 0 and part 1 to 5.0 and 3.0 kcal/mol, respectively. By default, the PBE kernel is employed on the r2SCAN orbitals for the calculation of the optical rotation.

Warning

Since optical rotation is currently not implemented in the ORCA code, OR calculations are only possible with TURBOMOLE.

Hint

Since the structures generated by crest_combi, and hence the final ensemble for the OR calculation, are non-deterministic, the computed OR values of different runs can differ. Therefore, it is recommended to take the average of at least three independent runs of crest_combi + CENSO.

The mean value is 102.5 °[dm(g/cm³)]⁻¹, in good agreement with the experimental value of 112.2 °[dm(g/cm³)]⁻¹.

The same procedure is performed for \({\beta}\)-D-glucopyranose:

The mean value is 6.2 °[dm(g/cm³)]⁻¹, in good agreement with the experimental value of 17.5 °[dm(g/cm³)]⁻¹. The \({\alpha}\) and \({\beta}\) forms of D-glucopyranose can thus be clearly distinguished.
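
The averaging over independent runs recommended above can be sketched in Python as follows. Each run is reduced to a Boltzmann-weighted OR value, and the runs are then combined into a mean and a standard deviation. All numbers are hypothetical placeholders and are not the values behind the results reported here; the helper name is ours.

```python
import statistics


def ensemble_average(values, weights):
    """Weighted average of a property over the conformers of one run."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical per-conformer OR values and Boltzmann weights for three
# independent crest_combi + CENSO runs (illustration only)
runs = [
    {"or_values": [120.0, 80.0, 60.0], "weights": [0.6, 0.3, 0.1]},
    {"or_values": [110.0, 90.0], "weights": [0.5, 0.5]},
    {"or_values": [130.0, 70.0], "weights": [0.75, 0.25]},
]

per_run = [ensemble_average(r["or_values"], r["weights"]) for r in runs]
mean_or = statistics.mean(per_run)
spread = statistics.stdev(per_run)  # sample standard deviation across runs
print([round(v, 1) for v in per_run], round(mean_or, 1), round(spread, 1))
```

Reporting the spread alongside the mean gives a rough idea of how strongly the result depends on the non-deterministic conformer search.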

Hint

By default, CENSO uses the gauge-invariant velocity representation for the calculation of the OR. This option is currently not available for hybrid DFAs in TURBOMOLE, so the length representation is used instead when a hybrid DFA is chosen for the OR calculation.

Restarting calculations

CENSO keeps track of all performed calculations and records whether they succeeded, failed, or are still pending. This data is stored in enso.json and enables restarting of CENSO runs. Restarting is useful in several situations:

  • You considered only a few conformers in the first run and want to increase the number of investigated conformers (e.g., start from 30 and go to 100 conformers).

  • For difficult molecules (e.g., transition metal compounds), large differences are found between the low-cost methods in the sorting parts and the high-level method in part 2, so the default thresholds need to be increased.

  • After a previous calculation of an ensemble free energy, a property (e.g., OR) is to be calculated. This can be done by first calculating the free energy, checking the results and the final ensemble, and then restarting to calculate the property.

Some restart limitations apply, though: you cannot change anything that concerns the geometry optimization, since all results have to be created with the same method combination.

When restarting CENSO, files with information of the previous calculation (enso.json, enso_ensemble_partx.xyz, …) are automatically backed up.

To restart a calculation, call censo again and pass the new settings on the command line:

# increase the number of considered conformers to 200
$ censo -restart -nc 200 > censo-restart.out &

# increase the thresholds for parts 0 and 1
# if relevant conformers have been sorted out
$ censo --restart -thrpart0 5.0 -thrpart1 3.0  > censo_restart.out

# calculate OR after previous free energy calculation
$ censo -restart -OR on > censo-restart-OR.out &