A Quick Guide To Running DFTB (With Available Parameters) In GAMESS-US

This post comes out of my great appreciation for just how well Yoshio Nishimoto’s DFTB (density functional-based tight binding method) implementation in GAMESS-US runs, both as an additional functionality in an already considerable program and in comparison to a few other programs I’ve worked with to do the same.

Also, the use of unmodified Slater-Koster files in the GAMESS-US implementation is a nice touch.

This all begins at the dftb.org website with the downloading of available Slater-Koster parameter files for available sets of elements. Note that your favorite elements might not yet have parameters – or parameters within any given parameter set – for immediate use. Until others publish new parameter sets and post them somewhere – and if you’re an academic – you might give Stefan Grimme’s XTB serious consideration.

Additionally! A recent find while looking for new parameter sets was the KIST Integrated Force Field Platform (kiff.vfab.org), providing the complete set of DFTB parameters and up-to-date ReaxFF parameters as well.

Basic Input Format

A few key points (highlighted in the generic pointer input file) below:

! $scf soscf=.t. fdiff=.f. shift=.f. extrap=.f. damp=.t. diis=.f. $end
 $system modio=31 $end
 $basis gbasis=dftb $end
! $dftb ndftb=2 dampxh=.t. dampex=4.0 itypmx=0 etemp=300 $end
 $dftb ndftb=3 dampxh=.t. dampex=4.0 disp=skhp etemp=300
disppr(1)=0.31,0.386,0.386,0.000,0.000,0.000,0.000,3.5,3.5,3.5,
3.5,3.5,3.5,0.80,0.69,1.382,1.382,1.382,1.064,1.064,1.064,3.8,3.8,
3.8,3.8,3.8,3.8,2.50 $end
 $dftb hubder(1)=-0.1857,-0.1492 $end
 $dftbsk
C C "/path/to/skfiles/3ob-3-1/C-C.skf"
C Si "/path/to/skfiles/3ob-3-1/C-Si.skf"
C H "/path/to/skfiles/3ob-3-1/C-H.skf"
Si C "/path/to/skfiles/3ob-3-1/Si-C.skf"
Si Si "/path/to/skfiles/3ob-3-1/Si-Si.skf"
Si H "/path/to/skfiles/3ob-3-1/Si-H.skf"
H C "/path/to/skfiles/3ob-3-1/H-C.skf"
H Si "/path/to/skfiles/3ob-3-1/H-Si.skf"
H H "/path/to/skfiles/3ob-3-1/H-H.skf"
 $end
 $data

C1
C 6.0 -2.775607 0.000000 0.000000
...
H 1.0 -2.809590 0.000420 50.594100
$end

1. There have been many improvements to the DFTB code. The myweb.liu.edu/~nmatsuna/gamess manual (among others hosted *not* on the official GAMESS-US website) is among the first results that *always* come up when searching for GAMESS-US keywords – and it’s several years out of date (esp. for DFTB keywords). The 2019.1 manual has a significantly expanded $DFTB section, including the option to call dispersion corrections from the original $DFT block.

2. No $SCF – as the manual states, converger specifications are pre-defined in the ITYPMX keyword, with additional keywords in the $DFTB block (esp. ETEMP) available to aid in problematic convergence.

3. One big $DFTBSK – as addressed in Jan Jensen’s initial post about running DFTB in GAMESS-US, GAMESS-US will only read the atom pairs needed for the atoms listed in the $DATA section. You are fine to simply make an all-encompassing $DFTBSK block for all pairs in a parameter folder and write that to the input file.

4. DISPPR for SKHP – a mild manual issue. For the “Slater–Kirkwood + bond number polarizability dependence” dispersion correction, the current manual states that “For DISP=SKHP, a set for a species has 14 parameters. The first six are the polarizabilities depending on the number of bonds, and the next six are cutoff length, and the last is atomic charge.” You’ll note that 6 + 6 + 1 adds up to 13. The missing number comes first: the 14 numbers needed for each element in this keyword begin with the covalent atomic radius for the element, followed by the six polarizabilities, the six cutoff lengths, and the atomic charge. In the absence of a list for those radii, Table S1.1 (freely available) from “DFTB Parameters for the Periodic Table: Part 1, Electronic Structure” is an excellent resource. I did not bother to address the “different covalent radii values for different hybridizations” issue and have seen little on the topic related to DFTB calculations. For the other 13 numbers, the DFTB+ manual contains a set in Appendix E, reproduced below, with notes to consider other publications for more and different versions of the same elements (and note the NOTES column in the Appendix E table for P and S). Local copy of the manual – 2019Aug24_DFTBplus_manual.pdf

Element   Polarisability (6 #'s) [Å³]            Cutoff (6 #'s) [Å]        Chrg

O         0.560 0.560 0.000 0.000 0.000 0.000    3.8 3.8 3.8 3.8 3.8 3.8   3.15
N         1.030 1.030 1.090 1.090 1.090 1.090    3.8 3.8 3.8 3.8 3.8 3.8   2.82
C         1.382 1.382 1.382 1.064 1.064 1.064    3.8 3.8 3.8 3.8 3.8 3.8   2.50
H         0.386 0.386 0.000 0.000 0.000 0.000    3.5 3.5 3.5 3.5 3.5 3.5   0.80
P         1.600 1.600 1.600 1.600 1.600 1.600    4.7 4.7 4.7 4.7 4.7 4.7   4.50
S         3.000 3.000 3.000 3.000 3.000 3.000    4.7 4.7 4.7 4.7 4.7 4.7   4.80

5. HUBDER – If you don’t specify Hubbard Derivatives in your input file, GAMESS-US will include them from internal values. Values appear to be the set from the 3ob-3-1 parameter set and are available in the set’s README file (reproduced from that file below for available elements).

List of all atomic Hubbard derivatives (atomic units):

Br = -0.0573
 C = -0.1492
Ca = -0.0340
Cl = -0.0697
 F = -0.1623
 H = -0.1857
 I = -0.0433
 K = -0.0339
Mg = -0.02
 N = -0.1535
Na = -0.0454
 O = -0.1575
 P = -0.14
 S = -0.11
Zn = -0.03

6. HUBDER order – the order for these numbers is as per the FIRST appearance of the element in the $DATA block. This is easy to see if you don’t specify the HUBDER values in the input file and simply let GAMESS-US write out the “HUBBARD DERIVATIVES” block in your DFTB3 run.

7. $DFT Dispersion – according to the manual, you can now use the more traditional dispersion corrections in $DFTB calculations. In the $DFTB block, use DISP=DFT, then modify your $DFT section as you otherwise would – and include DC=.T. in your $DFT block.

8. DFTB Block Summary in the output file – below for a DFTB3 run of a CH-containing molecule using the 3ob-3-1 SK parameters:

 **********************************************************
 **  DENSITY-FUNCTIONAL TIGHT-BINDING (DFTB) CALCULATION **
 **********************************************************
       WRITTEN BY YOSHIO NISHIMOTO (NAGOYA UNIVERSITY)

  NUMBER OF SPECIES:  2
         SPECIES 1 :  H       WITH S     ORBITAL
         SPECIES 2 :  C       WITH S+P   ORBITALS
 
      $DFTB OPTIONS
      -------------
  NDFTB  =       3     SCC    =       T     DFTB3  =       T
  SRSCC  =       F     DAMPXH =       T     DAMPEX =    4.00
  DISP   =NONE         ITYPMX =       0     ETEMP  =    0.00
  MODESD =       0     MODGAM =       8     PRTORB =       F

 --- SCC CALCULATION ---
      INCLUDE 3RD ORDER CORRECTION

 SETTING HUBBARD DERIVATIVES FOR DFTB3
  H       :     -0.14920 (USER DEFINED)
  C       :     -0.18570 (USER DEFINED)
      USE X-H DAMPING:  4.00000

 BROYDEN'S CHARGE MIXING

 4 SLATER-KOSTER FILES WILL BE READ

 START READING SLATER-KOSTER FILES

 1  1 (H        - H       ) = /home/ec2-user/3ob-3-1/H-H.skf
 1  2 (H        - C       ) = /home/ec2-user/3ob-3-1/H-C.skf
 2  1 (C        - H       ) = /home/ec2-user/3ob-3-1/C-H.skf
 2  2 (C        - C       ) = /home/ec2-user/3ob-3-1/C-C.skf
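
Points 5 and 6 are easy to get backwards by hand, so here is a minimal Python sketch of the bookkeeping. It is not part of GAMESS-US; the helper names (species_in_order, hubder_keyword) are my own hypothetical ones, and the only data in it is the 3ob-3-1 list reproduced above. It assumes each atom line in $DATA reads "name charge x y z".

```python
# Hubbard derivatives (atomic units), copied from the 3ob-3-1 README list above.
HUBBARD_3OB = {
    "Br": -0.0573, "C": -0.1492, "Ca": -0.0340, "Cl": -0.0697,
    "F": -0.1623, "H": -0.1857, "I": -0.0433, "K": -0.0339,
    "Mg": -0.02, "N": -0.1535, "Na": -0.0454, "O": -0.1575,
    "P": -0.14, "S": -0.11, "Zn": -0.03,
}


def species_in_order(atom_lines):
    """Return atom names in order of FIRST appearance (hypothetical helper)."""
    species = []
    for line in atom_lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # skip title, symmetry, "..." and blank lines
        try:
            [float(value) for value in fields[1:]]
        except ValueError:
            continue  # not a "name charge x y z" line
        if fields[0] not in species:
            species.append(fields[0])
    return species


def hubder_keyword(atom_lines, table=HUBBARD_3OB):
    """Build the hubder(1)=... text, one value per species, in species order."""
    values = [f"{table[name]:.4f}" for name in species_in_order(atom_lines)]
    return "hubder(1)=" + ",".join(values)


# Atom lines in the style used in this post (H appears first, then C).
atoms = [
    "H   1.0        -2.775607   0.000000    0.000000",
    "C   6.0        -3.775607   0.000000    0.000000",
    "H   1.0        -2.809590   0.000420    50.594100",
]
print(hubder_keyword(atoms))
```

If your $DATA block starts with C instead of H, the two numbers swap places, which is the whole point of item 6.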

DFTB2

Starting with the less keyword-involved run, below is a completely generic DFTB2 input file with no funny business in the $DFTB block to aid in convergence (my first test would be to change ETEMP to something higher, which has often done the trick when a run won’t complete its first SCF cycle in 200 steps).

 $contrl runtyp=optimize icharg=0 maxit=200 $end
 $system mwords=100 $end
 $system modio=31 $end
 $basis gbasis=dftb $end
 $dftb ndftb=2 dampxh=.t. dampex=4.0 itypmx=0 etemp=0 $end
 $data

C1
C   6.0        -2.775607   0.000000    0.000000
…
H   1.0        -2.809590   0.000420    50.594100
 $end
  $dftbsk
 C C "/path/to/dftb/params/pbc-0-3/C-C.skf"
 C H "/path/to/dftb/params/pbc-0-3/C-H.skf"
 H C "/path/to/dftb/params/pbc-0-3/H-C.skf"
 H H "/path/to/dftb/params/pbc-0-3/H-H.skf"
  $end
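
Since GAMESS-US only reads the pairs it needs, the all-encompassing $DFTBSK block can be generated rather than typed. Below is a minimal Python sketch, assuming the files in your parameter folder follow the X-Y.skf naming seen in the paths above. The function name dftbsk_block and the 150-character check are my own additions (the limit itself comes from the PARAM notes in the manual); nothing here is a GAMESS-US tool.

```python
from itertools import product

MAX_PATH_LENGTH = 150  # per the manual: "Each parameter file name has a limit of 150 characters."


def dftbsk_block(elements, param_dir):
    """Return a $DFTBSK group listing every ordered pair of the given elements.

    Hypothetical helper: assumes files are named <A>-<B>.skf inside param_dir.
    """
    lines = [" $dftbsk"]
    for first, second in product(elements, repeat=2):
        path = f"{param_dir}/{first}-{second}.skf"
        if len(path) > MAX_PATH_LENGTH:
            raise ValueError(f"path longer than {MAX_PATH_LENGTH} characters: {path}")
        lines.append(f' {first} {second} "{path}"')
    lines.append(" $end")
    return "\n".join(lines)


print(dftbsk_block(["C", "H"], "/path/to/dftb/params/pbc-0-3"))
```

Passing ["C", "Si", "H"] instead gives the nine-pair block used in the DFTB3 example, Si lines included.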

DFTB3

A more involved DFTB3 input file, including (1) the SKHP dispersion-correction and carriage returns to show that all 14 numbers are there – in order of appearance – for the atoms in $DATA, (2) the $SCF block as a comment that is ignored by DFTB calculations, (3) a reordering of all keywords above the $DATA block (because why not?), (4) the HUBDER values in order of appearance in the $DATA block, and (5) the Si-containing pairs in the $DFTBSK block just to show that their presence isn’t an issue for the run.

! $scf soscf=.t. fdiff=.f. shift=.f. extrap=.f. damp=.t. diis=.f. $end
 $system modio=31 $end
 $basis gbasis=dftb $end
 $dftb ndftb=3 dampxh=.t. dampex=4.0 disp=skhp etemp=300 
disppr(1)=
0.31,0.386,0.386,0.000,0.000,0.000,0.000,3.5,3.5,3.5,3.5,3.5,3.5,0.80,
0.69,1.382,1.382,1.382,1.064,1.064,1.064,3.8,3.8,3.8,3.8,3.8,3.8,2.50
$end
 $dftb hubder(1)=-0.1857,-0.1492 $end
 $dftbsk
C C "/path/to/skfiles/3ob-3-1/C-C.skf"
C Si "/path/to/skfiles/3ob-3-1/C-Si.skf"
C H "/path/to/skfiles/3ob-3-1/C-H.skf"
Si C "/path/to/skfiles/3ob-3-1/Si-C.skf"
Si Si "/path/to/skfiles/3ob-3-1/Si-Si.skf"
Si H "/path/to/skfiles/3ob-3-1/Si-H.skf"
H C "/path/to/skfiles/3ob-3-1/H-C.skf"
H Si "/path/to/skfiles/3ob-3-1/H-Si.skf"
H H "/path/to/skfiles/3ob-3-1/H-H.skf"
 $end 
 $data

C1
H   1.0        -2.775607   0.000000    0.000000
C   6.0        -3.775607   0.000000    0.000000
....
H   1.0        -2.809590   0.000420    50.594100
 $end
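
The 14-numbers-per-species layout for DISP=SKHP is the part I miscounted most often, so here is a minimal Python sketch that assembles it. The radii (0.31 for H, 0.69 for C) are simply the leading numbers from the input above, the other 13 values per element are from the DFTB+ table reproduced earlier, and the function name disppr_skhp is hypothetical. Only H and C are filled in; add other elements at your own risk and with your own radii.

```python
# Covalent radii as used in the DISPPR lines of the input above (H and C only).
COVALENT_RADIUS = {"H": 0.31, "C": 0.69}

# (six polarizabilities, six cutoffs, charge) from the DFTB+ table reproduced earlier.
SKHP_PARAMETERS = {
    "H": ([0.386, 0.386, 0.000, 0.000, 0.000, 0.000], [3.5] * 6, 0.80),
    "C": ([1.382, 1.382, 1.382, 1.064, 1.064, 1.064], [3.8] * 6, 2.50),
}


def disppr_skhp(species):
    """Build the disppr(1)= text: 1 radius + 6 + 6 + 1 = 14 numbers per species.

    `species` must be in order of first appearance in $DATA (hypothetical helper).
    """
    rows = []
    for name in species:
        polarizabilities, cutoffs, charge = SKHP_PARAMETERS[name]
        values = [COVALENT_RADIUS[name], *polarizabilities, *cutoffs, charge]
        if len(values) != 14:
            raise ValueError(f"{name}: expected 14 values, got {len(values)}")
        rows.append(",".join(f"{value:.3f}" for value in values))
    # One species per line, mirroring the carriage returns in the input above.
    return "disppr(1)=\n" + ",\n".join(rows)


print(disppr_skhp(["H", "C"]))
```

The output is meant to be pasted inside the $DFTB group in place of the hand-typed DISPPR lines; the numbers are written with three decimals, which changes nothing numerically.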

And, just to keep you from bouncing between tabs, the $DFTB section from the 2019.1 version of the manual (obviously check there at some point for changes and improvements).

$DFTB group                  (relevant for GBASIS=DFTB)

Density-functional tight-binding (DFTB) is turned on by
selecting GBASIS=DFTB in $BASIS.  $DFTB controls optional
parameters for a DFTB calculation.  DFTB is formulated in a
two-center approximation utilizing implicitly a minimal
pseudoatomic orbital basis set with corresponding,
pretabulated one- and two-center integrals.   Because of
this, many properties (for instances, multipoles higher
than dipoles) and many options are ignored or not available
in the current implementations of DFTB.  DFTB also uses an
independent SCF driver (SCF in DFTB is also called SCC, see
below), so most SCF options are not available for DFTB.

Only SCFTYP=RHF and UHF are implemented. SCFTYP=ROHF is
available, only when all SPNCST values are zero. DFTB does
not explicitly use symmetry (C1 throughout) since integrals
are never computed during the calculations.  Slater-Koster
tables are only defined for spherical functions (5d) so
DFTB sets ISPHER=1.  Most $GUESS options do not work for
DFTB (DFTB does not use initial orbitals in the usual
sense).  Other than the default (METHOD=HUCKEL, which is
ignored), only METHOD=MOREAD works (note that SCC-DFTB can
use initial charges on atoms, derived from the orbitals).

RUNTYP=OPTIMIZE, HESSIAN and RAMAN are available for full
(non-FMO) DFTB and FMO-DFTB. Excited state calculations for
full DFTB may be performed through the standard (linear-
response) time-dependent formalism (only closed shell). PCM
can be used for both ground and excited state calculations,
and energy and gradient can be evaluated.

In DFTB calculation, the atom type is determined by its
name, not its nuclear charge as elsewhere in GAMESS. The
nuclear charge (the second column in $DATA) is used only in
population analysis, but not in SCF.  DFTB uses a notion of
"species", which means an atomic type.  The species are
numbered according to the order in which atoms appear in
$DATA. For instances, in water there are two species, O and
H.  An atomic type of each species needs MAXANG, which for
most but not all atoms is set automatically.


NDFTB  order of the Taylor expansion of the total energy
       around a reference density in the DFTB model.
       = 1 NCC-DFTB, also called DFTB1.
           NCC stands for non-charge-consistent, i.e., no
           explicit charge-charge interaction term is
           included in the energy calculation.
       = 2 SCC-DFTB, also called DFTB2.
           SCC means a self-charge-consistent approach,
           and SCC implies that SCF iterations are carried
           out that converge monopolar charges towards
           self-consistency.
       = 3 DFTB3, including 3rd order correction using
           Hubbard derivatives (HUBDER).
           In order to reproduce the published DFTB3
           approach, it is necessary to also specify
           DAMPXH=.TRUE. to add other terms.
           Gaus, M. et al. J. Chem. Theory Comput. 2011,
           7, 931-948 is referred to as Gaus2011 below.
           Default: 2.

DAMPXH =  a flag to include the damping function for X-H
          atomic pair in DFTB3. See also DAMPEX, and eq 21
          in Gaus2011.
          The damping function is used when at least one
          atom in a pair is "H". "HYDROGEN" and any other
          name will turn off the damping.
          Default: .FALSE.

DAMPEX =  an exponent used in the damping function for X-H
          atomic pairs.  The default value is 4.0 (taken
          from the 3OB parameter set).

SRSCC  =  a flag to perform shell-resolved SCC calculation.
          If set to .FALSE., the code uses the Hubbard
          value for an s orbital for p and d orbitals,
          ignoring their Hubbard values defined in Slater-
          Koster tables.
          Using .TRUE. enables the use of proper Hubbard
          values for p and d orbitals, implemented only
          for DFTB1 and DFTB2.
          Default: .FALSE.

ITYPMX    Convergence method of SCC calculations.
       = -1 Use standard GAMESS convergence methods.
            SOSCF and DIIS are supported, but DEM is not.
       = 0  Broyden's method.
            Interpolation is applied for atomic
            (or shell-resolved when SRSCC=.TRUE.)
            charges, but not Hamiltonian matrix.
       = 1  (reserved)
       = 2  DIIS for charges.
            Default: 0.

ETEMP  = electronic temperature in Kelvin. Non-zero values
         of ETEMP help SCF convergence of nearly-degenerate
         systems by smearing occupation numbers around the
         Fermi-level. Only the Fermi-Dirac distribution
         function is available as a smearing function.  The
         default value is 0 Kelvin, meaning the smearing
         function is not used.
         ETEMP is implemented only for SCFTYP=RHF and when
         FMO is not used.

DISP     dispersion model for DFTB.
       = NONE no Dispersion correction.
       = UFF  UFF-type dispersion correction.
              Parameters for atomic numbers up to 54 are
              available internally or can be supplied in
              DISPPR for any atom.
              Built-in parameters are taken from Rappe
              et al. J. Am. Chem. Soc. 1992, 114, 10024.
       = SK   The Slater-Kirkwood type dispersion
              correction omitting the change polarizability
              depending on the number of bonds.
              No default values of DISPPR are available.
              Some are listed in the manual of the DFTB+
              program.
       = SKHP The Slater--Kirkwood type dispersion with
              the dependence of polarizabilities on the
              number of bonds.
       = DFT  Use so-called DFT-D. See $DFT for further
              details. DISP=GRIMME is a synonym.

DISPPR   an array of parameters used for dispersion
         correction, listed in sets for each species.
         For DISP=UFF, DISPPR(1) and DISPPR(2) define the
         non-bonded distance (Angs.) and energy (kcal/mol)
         for the first species, respectively, and so on.
         For DISP=SK, a set for a species has 3 parameters,
         the polarizability (Angstrom^3), cutoff length
         (Angstrom), and atomic charge.
         For DISP=SKHP, a set for a species has 14
         parameters. The first six are the polarizabilities
         depending on the number of bonds, and the next six
         are cutoff length, and the last is atomic charge.
         Default: see DISP.

HUBDER   an array of Hubbard derivatives for each species
         (1 per species) used only for DFTB3 calculations.
         Default values are set for the elements included
         in the 3OB parameters (Br, C, Ca, Cl, F, H, I,
         K, Mg, N, Na, O, P, S, Zn).

MAXANG   array of maximum angular momentum of each species,
         which determines the number of basis functions.
         DFTB uses only valence orbitals and electrons!
         Most elements have proper default values, but for
         some atomic types (i.e., species) you need to
         manually define the values.

QREF     array of the number of reference electrons of each
         species.  QREF is usually automatically taken from
         Slater-Koster parameters, so this option is seldom
         used.

SPNCST   an array of spin constants used in unrestricted
         (UHF) DFTB calculation. Provide 6 spin constants,
         W_{ss}, W_{sp}, W_{pp}, W_{sd}, W_{pd}, & W_{dd},
         for each species in a continuous array. Constants
         for some elements can be found in the manual of
         the DFTB+ program.

PARAM    specifies the directory from which DFTB parameters
         are taken. If you wish to mix parameters from
         different directories, this option cannot be used.
         Specifying PARAM means no $DFTBSK; otherwise,
         $DFTBSK is read.
         Nota bene-bene: the actual path for parameters includes
         $DFTBPAR, defined in rungms. All directory names
         used in PARAM should be ** UPPER CASE **, as 3OB-3-1 in
         ~/gamess/dftb/param/3OB-3-1 where
         $DFTBPAR=~/gamess/auxdata/DFTB
         PARAM=3OB-3-1
         The length of PARAM is maximum 8 characters!
         Each parameter file name has a limit of 150 characters.
         GAMESS includes 3OB-3-1 and MATSCI03 (properly called
         matsci-0-3), which you may specify in PARAM.
         3ob-3-1 should be used with DFTB3 (biochemistry+water).
         matsci-0-3 should be used with DFTB2 (inorganics).
         You can find more parameter sets at dftb.org.
         Before using DFTB parameters, ~/gamess/dftb/README.dftb
         should be consulted regarding license and citations.
         Default: "", meaning that $DFTBSK is read.

ISPDMP   An array of integer specifying species X to which
         the X--H damping function (DAMPXH) is applied. By
         default, with DAMPXH=.TRUE., ISPDMP for all
         elements is 1 (apply). Setting 1 for H does not
         do anything.

                        * * *


The following options are FMO-DFTB specific (Nishimoto, Y.
et al. J. Chem. Theory Comput. 2014, 10, 4801-4812.).

FMO-DFTB has many limitations and some FMO options are not
supported (for instance, multilayer FMO etc).  Only single
layer, restricted closed-shell FMO2/3-DFTB1/2/3
are implemented at present. SRSCC, ETEMP etc are not
available. The analytic gradient is available for FMO-DFTB,
requiring solving SCZV as in other FMO methods.

MODESD = controls the behavior of ES-DIM (electrostatic
         dimer) approximation (bit additive).
         1 Calculate interfragment repulsive energy for ES
           dimers (almost never used).
         2 Add up all ES-DIM energies. This means that
           individual ES dimer energies are not calculated,
           but only their total lump sum, computed with the
           dynamic load balancing.
         4 Lump ES-DIM routine with static load balancing.
           The bits of 2 or 4 are mutually exclusive.
           Default: 0 (i.e., individual ES dimer energies).

MODGAM = controls the calculation of gamma values
         (interatomic 1/R-like function) in FMO-DFTB2 and
         FMO-DFTB3 (bit additive).
         0 Calculate gamma values on the fly.
         1 Calculate once and prestore gamma values in
           triangular matrix.
         2 Calculate once and prestore gamma values in
           square matrix.
         4 With the bits of 1 or 2, the calculation of
           gamma values is parallelized with GDDI.
           The bits of 1 or 2 are mutually exclusive. These
           options are faster but takes more memory.
         8 Using this option omits computing ESP in dimer
           and trimer calculations by accumulating
           contributions of each fragment and subtracting
           double-counting contributions.
           Default: 8

                        * * *


The following options are relevant to second- and
third-order derivative calculations (RUNTYP=HESSIAN and
RAMAN).

CPCONV = Convergence criterion during coupled-perturbed DFTB
         iterations, similar to CONV in $SCF. In DFTB,
         the program uses Mulliken charges for testing the
         convergence, but not density matrix itself.
         By default, CPCONV=1.0D-06.

MXCPIT = Maximum number of coupled-perturbed DFTB iterations.
         By default, MXCPIT=50.

DEGTHR = An array of two degeneracy thresholds. If the
         difference of two eigenvalues are less than the
         threshold, two orbitals are seen as degenerated.
         The first threshold is employed in solving
         coupled-perturbed equations, while the second
         threshold is in computing third-order derivatives
         analytically. By default, these are set to 1.0D-12
         and 1.0D-08, which are usually reasonable.

ARAMAN = A flag to compute third-order derivatives (static
         hyperpolarizability and polarizability derivative)
         analytically, in addition to Hessian. If this
         option is activated, users do not have to give
         $HESSIAN and $DIPDR in the input, and
         non-resonance Raman spectra can be simulated by
         a single run. This option requires that RUNTYP must
         be RAMAN. By default, ARAMAN=.FALSE.

                        * * *


The following options are relevant to long-range corrected
DFTB. The formulation is based on Lutsker, V. et al. J. Chem.
Phys. 2015, 143, 184107. With LC-DFTB, using ITYPMX=-1 is
highly recommended.

LCDFTB = A flag to activate long-range corrections.

EMU    = A parameter for long-range corrections. The meaning
         is very similar to MU in $DFT. By default, EMU=0.0,
         and this corresponds to regular DFTB.

ICUT   = A parameter applied in the screening using the
         Schwarz inequality. The meaning is similar to ICUT
         in $CONTRL. By default, the screening is not
         employed, but this is usually as fast as ICUT=9,
         depending on the performance of the math library.

Compiling And Running GAMESS-US (1 May 2013(R1)) On 64-bit Ubuntu 12.X/13.X In SMP Mode

Author’s Note 1: It is my standard policy to put too much info into guides so that those who are searching for specific problems they come across will find the offending text in their searches. With luck, your “build error” search sent you here.

Author’s Note 2: It’s not as bad as it looks (I’ve included lots of output and error messages for easy searching)!

Author’s Note 3: I won’t be much help for you in diagnosing your errors, but am happy to tweak the text below if something is unclear.

Conventions: I include both the commands you type in your Terminal and some of the output from these commands, the output being where most of the errors appear that I work on in the discussion.

Input is formatted as below:

username – your username (check your prompt)
machinename – your hostname (type hostname or check your prompt)

Text you put in at the prompt (the prompt is also shown, so you see the directory structure; copy + paste should be fine)

Text you get out (for checking results and reproducing errors)

Having just recently downloaded the newest version of GAMESS-US (R1 2013), my first few passes at using it under Linux (specifically, Ubuntu 12.04) ran into a few walls that required some straightforward modifications and a little bit of system prep planning. As my first few passes before successful execution are likely the same exact problems you might have run into in your attempts to get GAMESS-US to run (after a successful compilation and linking), I’m posting my problems and solutions here.

Qualifier 1 – My concern at the moment has been to get GAMESS-US to run under 64-bit Ubuntu 12.04 on a multi-core board (ye olde symmetric multiprocessing (which I always called single multi-processor, or SMP)). While some answers may follow in what’s below, this post doesn’t cover MPI-specific builds (nothing through a router, that is). SMP is the only concern (which is to say, I likely won’t have good answers if you send along an MPI-specific question). Also, although I’m VERY interested in trying it, I’ve not yet attempted to build a GPU-capable version (but plan to in the near future).

Qualifier 2 – It is my standard policy to install apps into /opt, and my steps below will reflect that (specifically because there’s a permission issue that needs to be addressed when you first try to build components). You can default to whatever you like, but keep in mind my tweaks when you try to build your local copy.

So, with the qualifiers in mind…

1. Prepping The System (apt-get)

There are few things better than being able to apt-get everything you need to prep your machine for an install, and I’m pleased to report that the (current) process for putting the important files onto Ubuntu 12.X/13.X is easy. Assuming you’re not going the Intel / PGI / MKL route, you can do everything by installing gfortran (compiler, presently installing 4.4) and the blas and atlas math libraries.

username@machinename:~$ sudo apt-get install gfortran libblas-dev libatlas-base-dev

Note: your atlas libraries will be installed in /usr/lib64/atlas/ – this will matter when you run config.

After these finish, run the following to determine your installed gfortran version (will be asked for by the new GAMESS config)

username@machinename:~$ gfortran -dumpversion

GNU Fortran (Ubuntu 4.4.3-4ubuntu5.1) 4.4.3
Copyright (C) 2010 Free Software Foundation, Inc.

GNU Fortran comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of GNU Fortran
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING

The version to report is 4.4. And you’re ready for GAMESS.
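
If you’d rather not read the number off the banner by eye, here is a minimal Python sketch that pulls the "4.4"-style answer out of whatever gfortran prints. The function name gfortran_major_minor is hypothetical, and the logic is an assumption on my part: it drops anything in parentheses on the first line and takes the first x.y it finds. Newer compilers that print only a major version will return None.

```python
import re


def gfortran_major_minor(version_text):
    """Return 'major.minor' from gfortran version output, or None if absent."""
    stripped = version_text.strip()
    if not stripped:
        return None
    first_line = stripped.splitlines()[0]
    # Drop packaging notes such as "(Ubuntu 4.4.3-4ubuntu5.1)" before searching.
    first_line = re.sub(r"\([^)]*\)", "", first_line)
    match = re.search(r"(\d+)\.(\d+)", first_line)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


banner = """GNU Fortran (Ubuntu 4.4.3-4ubuntu5.1) 4.4.3
Copyright (C) 2010 Free Software Foundation, Inc."""
print(gfortran_major_minor(banner))
```

The same function accepts the short forms config mentions (4.1.2, 4.6.1, 4.4.2-12), so you can feed it the output of either command.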

2. Downloading GAMESS-US, Placing Into /opt, And Changing Permissions

First, obviously, get the GAMESS source (click on the red text).

After downloading, copy/move gamess-current.tar.gz into /opt

username@machinename:~$ cd ~/Downloads
username@machinename:~/Downloads$ sudo cp gamess-current.tar.gz /opt
username@machinename:~/Downloads$ cd /opt
username@machinename:/opt$ sudo gunzip gamess-current.tar.gz
username@machinename:/opt$ sudo tar xvf gamess-current.tar

gamess/
gamess/gms-files.csh
gamess/tools/

gamess/misc/count.code
gamess/misc/vbdum.src
gamess/Makefile.in

At this point, if you go through the config process and get to the point of building ddikick.x, you will get an error when you first try to run ./compddi

username@machinename:/opt/gamess/ddi$ sudo ./compddi >& compddi.log &

[1] 4622
-bash: compddi.log: Permission denied

The problem is with the permission of the entire gamess folder:

drwxr-xr-x  4 root        root              4096 2014-04-04 21:43 .
drwxr-xr-x 22 root        root              4096 2013-12-27 16:17 ..
drwxr-xr-x 14 1300 504              4096 2014-04-04 21:43 gamess
-rw-r--r-- 1 root        root         198481920 2014-04-04 21:42 gamess-current.tar

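Which you remedy before running into this error by changing the ownership of the folder (the redirect to compddi.log is done by your own shell, not by sudo, so your user needs write access):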

username@machinename:/opt$ sudo chown -R username gamess
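
To confirm the chown took before you start building, a quick check along these lines can help. This is a minimal Python sketch with a hypothetical function name (unwritable_dirs); it only reports directories the current user cannot write to and changes nothing.

```python
import os


def unwritable_dirs(root):
    """Return directories at or below root that the current user cannot write to."""
    blocked = [] if os.access(root, os.W_OK) else [root]
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.W_OK):
                blocked.append(path)
    return blocked


if __name__ == "__main__":
    # /opt/gamess is the install location used in this guide.
    for path in unwritable_dirs("/opt/gamess"):
        print("not writable:", path)
```

No output means the redirect to compddi.log (and everything else the build writes) should no longer hit "Permission denied".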

The next step is recommended when you run config, so I’m performing the step here to get it out of the way. With the atlas libraries installed, generate two symbolic links.

username@machinename:/opt$ cd /usr/lib64/atlas
username@machinename:/usr/lib64/atlas$ sudo ln -s libf77blas.so.3.0 libf77blas.so
username@machinename:/usr/lib64/atlas$ sudo ln -s libatlas.so.3.0 libatlas.so

And, at this point, you’re ready to run the new (well, new to me) config script that preps your system install.

3. Building GAMESS-US

Back to the GAMESS-US folder.

username@machinename:/usr/lib64/atlas$ cd /opt/gamess
username@machinename:/opt/gamess$ sudo ./config

This script asks a few questions, depending on your computer system,
to set up compiler names, libraries, message passing libraries,
and so forth.

You can quit at any time by pressing control-C, and then <return>.

Please open a second window by logging into your target machine,
in case this script asks you to ‘type’ a command to learn something
about your system software situation. All such extra questions will
use the word ‘type’ to indicate it is a command for the other window.

After the new window is open, please hit <return> to go on.

You can open that second window or blindly assume that what I include below is all you need.

[enter]

GAMESS can compile on the following 32 bit or 64 bit machines:
axp64 – Alpha chip, native compiler, running Tru64 or Linux
cray-xt – Cray’s massively parallel system, running CNL
hpux32 – HP PA-RISC chips (old models only), running HP-UX
hpux64 – HP Intel or PA-RISC chips, running HP-UX
ibm32 – IBM (old models only), running AIX
ibm64 – IBM, Power3 chip or newer, running AIX or Linux
ibm64-sp – IBM SP parallel system, running AIX
ibm-bg – IBM Blue Gene (P or L model), these are 32 bit systems
linux32 – Linux (any 32 bit distribution), for x86 (old systems only)
linux64 – Linux (any 64 bit distribution), for x86_64 or ia64 chips
AMD/Intel chip Linux machines are sold by many companies
mac32 – Apple Mac, any chip, running OS X 10.4 or older
mac64 – Apple Mac, any chip, running OS X 10.5 or newer
sgi32 – Silicon Graphics Inc., MIPS chip only, running Irix
sgi64 – Silicon Graphics Inc., MIPS chip only, running Irix
sun32 – Sun ultraSPARC chips (old models only), running Solaris
sun64 – Sun ultraSPARC or Opteron chips, running Solaris
win32 – Windows 32-bit (Windows XP, Vista, 7, Compute Cluster, HPC Edition)
win64 – Windows 64-bit (Windows XP, Vista, 7, Compute Cluster, HPC Edition)
winazure – Windows Azure Cloud Platform running Windows 64-bit
type ‘uname -a’ to partially clarify your computer’s flavor.
please enter your target machine name:

We’re doing a linux64 build, so type the following at the prompt:

linux64

Where is the GAMESS software on your system?
A typical response might be /u1/mike/gamess,
most probably the correct answer is /opt/gamess

GAMESS directory? [/opt/gamess]

Who is this mike and where is my folder u1? We’ll get to that in rungms. For now, I’m installing in /opt, so the default directory is fine:

[enter]

Setting up GAMESS compile and link for GMS_TARGET=linux64
GAMESS software is located at GMS_PATH=/opt/gamess

Please provide the name of the build locaation.
This may be the same location as the GAMESS directory.

GAMESS build directory? [/opt/gamess]

Fine as selected.

[enter]

Please provide a version number for the GAMESS executable.
This will be used as the middle part of the binary’s name,
for example: gamess.00.x

Version? [00]

Is this important? Maybe, if you plan on building multiple versions of GAMESS-US (you might want a GPU-friendly version, one with a different compiler, one with MPI, etc.). Number as you wish and remember the number when it comes to rungms. That said, the actual linking step seems to really want to produce a 01 version (we’ll get to that). Meantime, default value is fine.

[enter]

Linux offers many choices for FORTRAN compilers, including the GNU
compiler set (‘g77’ in old versions of Linux, or ‘gfortran’ in
current versions), which are included for free in Unix distributions.

There are also commercial compilers, namely Intel’s ‘ifort’,
Portland Group’s ‘pgfortran’, and Pathscale’s ‘pathf90’. The last
two are not common, and aren’t as well tested as the others.

type ‘rpm -aq | grep gcc’ to check on all GNU compilers, including gcc
type ‘which gfortran’ to look for GNU’s gfortran (a very good choice),
type ‘which g77’ to look for GNU’s g77,
type ‘which ifort’ to look for Intel’s compiler,
type ‘which pgfortran’ to look for Portland Group’s compiler,
type ‘which pathf90’ to look for Pathscale’s compiler.
Please enter your choice of FORTRAN:

We’re using gfortran (currently 4.4.3):

gfortran

gfortran is very robust, so this is a wise choice.

Please type ‘gfortran -dumpversion’ or else ‘gfortran -v’ to
detect the version number of your gfortran.
This reply should be a string with at least two decimal points,
such as 4.1.2 or 4.6.1, or maybe even 4.4.2-12.
The reply may be labeled as a ‘gcc’ version,
but it is really your gfortran version.
Please enter only the first decimal place, such as 4.1 or 4.6:

4.4

Alas, your version of gfortran does not support REAL*16,
so relativistic integrals cannot use quadruple precision.
Other than this, everything will work properly.
hit <return> to continue to the math library setup.

If this was my biggest concern I’d be a happy quantum chemist. Obviously you can try to install other flavors of gfortran and, possibly, by the time you need the procedure I’m following, a newer version of gfortran will be apt-gotten.

[enter]
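The trimming that config asks for (full version string down to the first decimal place) can be sketched in bash. This is only an illustration: short_version is a hypothetical helper name, not something shipped with GAMESS, and newer gfortran releases may report only a single number from -dumpversion, which the sketch passes through unchanged.

```shell
#!/usr/bin/env bash
# Reduce a compiler version string such as "4.4.3" or "4.4.2-12" to the
# "major.minor" form that config asks for.
# short_version is a hypothetical helper name, not part of GAMESS.
short_version() {
    local full="$1"
    # A string with no dot at all (e.g. "9") is returned as-is.
    if [[ "$full" != *.* ]]; then
        printf '%s\n' "$full"
        return
    fi
    local major="${full%%.*}"       # text before the first dot
    local rest="${full#*.}"         # text after the first dot
    local minor="${rest%%[!0-9]*}"  # leading digits of the remainder
    printf '%s.%s\n' "$major" "$minor"
}

# Example (requires gfortran on the PATH):
#   short_version "$(gfortran -dumpversion)"
short_version "4.4.3"   # prints 4.4
```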

Linux distributions do not include a standard math library.

There are several reasonable add-on library choices,
MKL from Intel for 32 or 64 bit Linux (very fast)
ACML from AMD for 32 or 64 bit Linux (free)
ATLAS from www.rpmfind.net for 32 or 64 bit Linux (free)
and one very unreasonable option, namely ‘none’, which will use
some slow FORTRAN routines supplied with GAMESS. Choosing ‘none’
will run MP2 jobs 2x slower, or CCSD(T) jobs 5x slower.

Some typical places (but not the only ones) to find math libraries are
Type ‘ls /opt/intel/mkl’ to look for MKL
Type ‘ls /opt/intel/Compiler/mkl’ to look for MKL
Type ‘ls /opt/intel/composerxe/mkl’ to look for MKL
Type ‘ls -d /opt/acml*’ to look for ACML
Type ‘ls -d /usr/local/acml*’ to look for ACML
Type ‘ls /usr/lib64/atlas’ to look for Atlas

Enter your choice of ‘mkl’ or ‘atlas’ or ‘acml’ or ‘none’:

atlas

Where is your Atlas math library installed? A likely place is
/usr/lib64/atlas
Please enter the Atlas subdirectory on your system:

Our location is, in fact, /usr/lib64/atlas, so we type it in accordingly.

NOTE: If you don’t type anything but [enter] below, the script closes. /usr/lib64/atlas is listed as the expected location, but it is not defaulted by the script, so you need to type it in.

/usr/lib64/atlas
 

The linking step in GAMESS assumes that a softlink exists
within the system’s /usr/lib64/atlas
from libatlas.so to a specific file like libatlas.so.3.0
from libf77blas.so to a specific file like libf77blas.so.3.0
config can carry on for the moment, but the ‘root’ user should
chdir /usr/lib64/atlas
ln -s libf77blas.so.3.0 libf77blas.so
ln -s libatlas.so.3.0 libatlas.so
prior to the linking of GAMESS to a binary executable.

Math library ‘atlas’ will be taken from /usr/lib64/atlas

please hit <return> to compile the GAMESS source code activator

The symbolic linking was performed before the GAMESS steps.

[enter]
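Since config carries on whether or not the softlinks exist, it can be worth checking them before the linking step. The following is a minimal sketch, assuming the two link names from the config message; missing_atlas_links is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Report which of the softlinks named in the config message are missing.
# missing_atlas_links is a hypothetical helper, not part of GAMESS.
missing_atlas_links() {
    local dir="$1" lib
    for lib in libatlas.so libf77blas.so; do
        # -e follows symlinks, so a dangling link also counts as missing
        [ -e "$dir/$lib" ] || printf '%s\n' "$lib"
    done
}

# Example: prints nothing when both links resolve.
#   missing_atlas_links /usr/lib64/atlas
```

Anything the function prints still needs the ln -s treatment described in the config message.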

gfortran -o /home/username/gamess/tools/actvte.x actvte.f
unset echo
Source code activator was successfully compiled.

please hit <return> to set up your network for Linux clusters.

[enter]

If you have a slow network, like Gigabit Ethernet (GE), or
if you have so few nodes you won’t run extensively in parallel, or
if you have no MPI library installed, or
if you want a fail-safe compile/link and easy execution,
choose ‘sockets’
to use good old reliable standard TCP/IP networking.

If you have an expensive but fast network like Infiniband (IB), and
if you have an MPI library correctly installed,
choose ‘mpi’.

communication library (‘sockets’ or ‘mpi’)?

Again, I’m not building an mpi-friendly version, so am using sockets.

sockets

64 bit Linux builds can attach a special LIBCCHEM code for fast
MP2 and CCSD(T) runs. The LIBCCHEM code can utilize nVIDIA GPUs,
through the CUDA libraries, if GPUs are available.
Usage of LIBCCHEM requires installation of HDF5 I/O software as well.
GAMESS+LIBCCHEM binaries are unable to run most of GAMESS computations,
and are a bit harder to create due to the additional CUDA/HDF5 software.
Therefore, the first time you run ‘config’, the best answer is ‘no’!
If you decide to try LIBCCHEM later, just run this ‘config’ again.

Do you want to try LIBCCHEM? (yes/no):

no

Your configuration for GAMESS compilation is now in
/home/username/gamess/install.info
Now, please follow the directions in
/home/username/gamess/machines/readme.unix
username@machinename:~/gamess$

At this stage, you’re ready to build ddikick.x and continue with the compiling.

4. Build ddikick.x

username@machinename:/opt/gamess$ cd ddi
username@machinename:/opt/gamess/ddi$ sudo ./compddi >& compddi.log &

Will dump output into compddi.log (which will now work with the correct permissions).

username@machinename:/opt/gamess/ddi$ sudo mv ddikick.x ..
username@machinename:/opt/gamess/ddi$ cd ..
username@machinename:/opt/gamess$ sudo ./compall >& compall.log &

Feel free to follow along as compall.log dumps results. You’re also welcome to follow the readme.unix advice:

This takes a while, so go for coffee, or check the SF Giants web page.

Upon completion, the last step is to link the executable.

Now, it used to be the case that you specified the version number in the lked step. So, if you wanted to stick with the 00 version from the config file, you’d type

username@machinename:/opt/gamess$ sudo ./lked gamess 00 >& lked.log &

When you do that at present, you get

[1] 7626
username@machinename:/opt/gamess$

[1]+ Stopped sudo ./lked gamess 00 &>lked.log

This then leads you to use the lked call from the readme.unix file.

username@machinename:/opt/gamess$ sudo ./lked gamess 01 >& lked.log &

Which then produces lked.log and gamess.01.x.

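Now, if you run with 00 again, you get a successful linking of gamess.00.x. I’m not sure why this happens (a backgrounded sudo that needs to prompt for a password shows up as Stopped, which may be all that went wrong the first time), but the version number isn’t important so long as you specify the right one when you use rungms, so I’ve not diagnosed it further.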

At this point, you have a gamess.00.x and/or gamess.01.x executable in your /opt/gamess folder:

30828747 2014-04-04 22:41 gamess.01.x

I’m going to ignore the 00 issue out of the config file and use the gamess.01.x executable.
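Because the version number has to match what rungms is later told to run, a quick listing of which binaries actually exist can save a failed run. A minimal sketch, assuming the gamess.NN.x naming shown above; list_gamess_versions is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# List the version numbers of the gamess.NN.x binaries found in a directory.
# list_gamess_versions is a hypothetical helper, not part of GAMESS.
list_gamess_versions() {
    local dir="$1" exe name
    for exe in "$dir"/gamess.*.x; do
        [ -f "$exe" ] || continue     # skip the unexpanded glob if no match
        name="${exe##*/}"             # strip the directory
        name="${name#gamess.}"        # strip the prefix
        printf '%s\n' "${name%.x}"    # strip the suffix
    done
}

# Example:
#   list_gamess_versions /opt/gamess
```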

We’re ready to run calculations and work through the next set of errors you’ll receive if you don’t properly modify files.

5. PATH Setting

First, we copy rungms to our home folder, then add /opt/gamess to the PATH:

username@machinename:/opt/gamess$ cp rungms ~/
username@machinename:/opt/gamess$ cd ~/
username@machinename:~$ nano .bashrc

Add the following to the bottom of .bashrc (or extend your PATH)

PATH=$PATH:/opt/gamess

Quit nano and source.

username@machinename:~$ source .bashrc
[OPTIONAL] username@machinename:~$ echo $PATH

/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:…/opt/gamess:
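If .bashrc gets sourced more than once, the plain PATH=$PATH:/opt/gamess line appends a duplicate entry each time. A small guard avoids that. This is a sketch only; path_append is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Append a directory to a PATH-style string only if it is not already there.
# path_append is a hypothetical helper name.
path_append() {
    local current="$1" dir="$2"
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;                  # already present
        *)          printf '%s\n' "${current:+$current:}$dir" ;; # append
    esac
}

# In .bashrc this could be used as:
#   PATH="$(path_append "$PATH" /opt/gamess)"; export PATH
```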

6. rungms (Probably Why You’re Here)

If you just go blindly into a run, you’ll get the following error:

username@machinename:~$ ./rungms test.inp

----- GAMESS execution script ‘rungms’ -----
This job is running on host machinename
under operating system Linux at Fri Apr 4 22:47:55 EDT 2014
Available scratch disk space (Kbyte units) at beginning of the job is
df: `/scr/username': No such file or directory
df: no file systems processed
GAMESS temporary binary files will be written to /scr/username
GAMESS supplementary output files will be written to /home/username/scr
Copying input file test.inp to your run’s scratch directory…
cp test.inp /scr/username/test.F05
cp: cannot create regular file `/scr/username/test.F05': No such file or directory
unset echo
/u1/mike/gamess/gms-files.csh: No such file or directory.

As is obvious, rungms needs some modifying.

username@machinename:~$ nano rungms

Scroll down until you see the following:

set TARGET=sockets
set SCR=/scr/$USER
set USERSCR=~$USER/scr
set GMSPATH=/u1/mike/gamess

Given that it’s just me on the machine, I tend to simplify this by making SCR and USERSCR the same directory, and I make them both /tmp. If you intend to keep all of the files, you’ll need a rungms specific to each run case. My only concerns are the .dat and .log files, so dumping into /tmp is fine. Furthermore, we must change GMSPATH from how the ever-helpful Mike Schmidt has it set up at Iowa to how we want it on our own machines (in my case, /opt/gamess). He got me through some early issues when I started my GAMESS-US adventure 15ish years ago, so I won’t complain about his continued defaulted presence in the scripts.

set TARGET=sockets
set SCR=/tmp
set USERSCR=/tmp
set GMSPATH=/opt/gamess
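The same three edits can be made without opening an editor. The sketch below is one way to do it with sed, assuming each setting sits on its own line beginning with "set NAME=" as in the excerpt above; patch_rungms is a hypothetical helper name. It rewrites every matching line in the file, so it is worth diffing the result against the original before replacing anything.

```shell
#!/usr/bin/env bash
# Rewrite the SCR, USERSCR and GMSPATH settings in a copy of rungms.
# patch_rungms is a hypothetical helper; it assumes each setting sits on its
# own line beginning with "set NAME=" and that the new paths contain no
# "|" or "&" characters (both are special to this sed expression).
patch_rungms() {
    local file="$1" scr="$2" gmspath="$3"
    sed -e "s|^\([[:space:]]*set SCR=\).*|\1$scr|" \
        -e "s|^\([[:space:]]*set USERSCR=\).*|\1$scr|" \
        -e "s|^\([[:space:]]*set GMSPATH=\).*|\1$gmspath|" \
        "$file"
}

# Example: write the result to a new file and compare before replacing.
#   patch_rungms rungms /tmp /opt/gamess > rungms.new
#   diff rungms rungms.new
```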

With these modifications, your next run will be a bit more successful:

username@machinename:~$ ./rungms test.inp

----- GAMESS execution script ‘rungms’ -----
This job is running on host machinename
under operating system Linux at Fri Apr 4 22:51:35 EDT 2014
Available scratch disk space (Kbyte units) at beginning of the job is
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda2 1905222596 249225412 1559217460 14% /
GAMESS temporary binary files will be written to /tmp
GAMESS supplementary output files will be written to /tmp
Copying input file test.inp to your run’s scratch directory…
cp test.inp /tmp/test.F05
unset echo
/opt/gamess/ddikick.x /opt/gamess/gamess.00.x test -ddi 1 1 machinename -scr /tmp

Distributed Data Interface kickoff program.
Initiating 1 compute processes on 1 nodes to run the following command:
/opt/gamess/gamess.00.x test

******************************************************
* GAMESS VERSION = 1 MAY 2013 (R1) *
* FROM IOWA STATE UNIVERSITY *
* M.W.SCHMIDT, K.K.BALDRIDGE, J.A.BOATZ, S.T.ELBERT, *
* M.S.GORDON, J.H.JENSEN, S.KOSEKI, N.MATSUNAGA, *
* K.A.NGUYEN, S.J.SU, T.L.WINDUS, *
* TOGETHER WITH M.DUPUIS, J.A.MONTGOMERY *
* J.COMPUT.CHEM. 14, 1347-1363(1993) *
**************** 64 BIT LINUX VERSION ****************

INPUT CARD>
DDI Process 0: shmget returned an error.
Error EINVAL: Attempting to create 160525768 bytes of shared memory.
Check system limits on the size of SysV shared memory segments.

The file ~/gamess/ddi/readme.ddi contains information on how to display
the current SystemV memory settings, and how to increase their sizes.
Increasing the setting requires the root password, and usually a sytem reboot.

DDI Process 0: error code 911
ddikick.x: application process 0 quit unexpectedly.
ddikick.x: Fatal error detected.
The error is most likely to be in the application, so check for
input errors, disk space, memory needs, application bugs, etc.
ddikick.x will now clean up all processes, and exit…
ddikick.x: Sending kill signal to DDI processes.
ddikick.x: Execution terminated due to error(s).
unset echo
----- accounting info -----
Files used on the master node machinename were:
-rw-r--r-- 1 username username 0 2014-04-04 22:51 /tmp/test.dat
-rw-r--r-- 1 username username 1341 2014-04-04 22:51 /tmp/test.F05
ls: No match.
ls: No match.
ls: No match.
Fri Apr 4 22:51:36 EDT 2014
0.0u 0.0s 0:01.08 9.2% 0+0k 0+8io 0pf+0w

Things worked, but with a memory error. This issue is discussed at the Baldridge Group wiki: ocikbapps.uzh.ch/kbwiki/gamess_troubleshooting.html

From the wiki:

If you are sure you are not asking for too much memory in the input file, check that your kernel parameters are not allowing enough memory to be requested. You might have to increase the SHMALL & SHMAX kernel memory values to allow GAMESS to run. (See http://www.pythian.com/news/245/the-mysterious-world-of-shmmax-and-shmall/ for a better explanation.)
For example, on a machine with 4GB of memory, you might add these to /etc/sysctl.conf:
# cat /etc/sysctl.conf | grep shm
kernel.shmmax = 3064372224
kernel.shmall = 748137
Then set the new settings like so:
# sysctl -p
Since they are in /etc/sysctl.conf, they will automatically be set each time the system is booted.

In our case, we modify sysctl.conf with the recommendations from the wiki:

username@machinename:~$ sudo nano /etc/sysctl.conf

Add the following to the bottom of the file:

kernel.shmmax = 3064372224
kernel.shmall = 748137

Save and exit.

username@machinename:~$ sudo sysctl -p

net.ipv4.ip_forward = 1
kernel.shmmax = 3064372224
kernel.shmall = 748137

These memory values will change depending on your system.
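As a rough guide for picking values on a different machine: kernel.shmmax is given in bytes while kernel.shmall is given in pages, and the wiki’s pair of numbers is consistent with dividing one by a 4096-byte page size. A minimal sketch of that arithmetic follows; shmall_pages is a hypothetical helper name, and the page size on your system should be checked with getconf rather than assumed.

```shell
#!/usr/bin/env bash
# kernel.shmmax is given in bytes and kernel.shmall in pages, so a matching
# shmall can be derived from shmmax and the page size.
# shmall_pages is a hypothetical helper name.
shmall_pages() {
    local shmmax_bytes="$1" page_size="$2"
    echo $(( shmmax_bytes / page_size ))   # integer division rounds down
}

# Example, using the page size of the running kernel:
#   shmall_pages 3064372224 "$(getconf PAGE_SIZE)"
shmall_pages 3064372224 4096   # prints 748137
```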

Now we clear the leftover files out of /tmp and rerun.

username@machinename:~$ rm /tmp/*
username@machinename:~$ ./rungms test.inp

If your input file is worth its salt, you’ll have successfully run your file on a single processor (single core, that is). If you run into additional memory errors, increase kernel.shmmax and kernel.shmall.

Now, onto the SMP part. My first attempt to run GAMESS in parallel (on 4 cores using version 00) produced the following error:

username@machinename:~$ rm /tmp/*
username@machinename:~$ ./rungms test.inp 00 4

----- GAMESS execution script ‘rungms’ -----
This job is running on host machinename
under operating system Linux at Fri Apr 4 22:52:52 EDT 2014
Available scratch disk space (Kbyte units) at beginning of the job is
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda2 1905222596 249225416 1559217456 14% /
GAMESS temporary binary files will be written to /tmp
GAMESS supplementary output files will be written to /tmp
Copying input file test.inp to your run’s scratch directory…
cp test.inp /tmp/test.F05
unset echo
I do not know how to run this node in parallel.

I tried a number of stupid things to get the run to work, finally settling on modifying the rungms file properly. To make gamess know how to run the node in parallel, we need only make the following changes to our rungms file.

username@machinename:~$ nano rungms

Scroll down until you find the section below:

# 2. This is an example of how to run on a multi-core SMP enclosure,
# where all CPUs (aka COREs) are inside a -single- NODE.
# At other locations, you may wish to consider some of the examples
# that follow below, after commenting out this ISU specific part.
if ($NCPUS > 1) then
switch (`hostname`)
case se.msg.chem.iastate.edu:
case sb.msg.chem.iastate.edu:
if ($NCPUS > 2) set NCPUS=2
set NNODES=1

The change is simple. In the $NCPUS > 1 block, we comment out the two ISU hostnames and add the hostname of our Linux box to the case list (if you don’t know it or it’s not in your prompt, simply type hostname at the prompt first).

# 2. This is an example of how to run on a multi-core SMP enclosure,
# where all CPUs (aka COREs) are inside a -single- NODE.
# At other locations, you may wish to consider some of the examples
# that follow below, after commenting out this ISU specific part.
if ($NCPUS > 1) then
switch (`hostname`)
case machinename:
# case se.msg.chem.iastate.edu:
# case sb.msg.chem.iastate.edu:
if ($NCPUS > 2) set NCPUS=2
set NNODES=1

This gives you parallel functionality, but it’s still not using the machine resources (cores) correctly when I ask for anything more than 2 cores (always using only 2 cores).

[minor complaint]
Admittedly, I don’t immediately get the logic of this section as currently coded. As far as I can see, the cap in the first case block (if ($NCPUS > 2) set NCPUS=2, presumably written for two-core ISU machines) means a hostname listed there never gets more than 2 cores. I will assume I am the one missing something, but rather than ask about it I changed the rungms text to the following. You can check the core usage yourself by running top in another window. This is the simplest modification, and it assumes you want the same cap on the number of cores each time. Clearly, you can make my modification more elegant than it is. Meantime, I want to run 4 cores on this machine, so I changed the section to reflect a 4-core board and commented out much of the rest.
[/complaint]

# 2. This is an example of how to run on a multi-core SMP enclosure,
# where all CPUs (aka COREs) are inside a -single- NODE.
# At other locations, you may wish to consider some of the examples
# that follow below, after commenting out this ISU specific part.
if ($NCPUS > 1) then
switch (`hostname`)
case machinename:
# case se.msg.chem.iastate.edu:
# case sb.msg.chem.iastate.edu:
# if ($NCPUS > 2) set NCPUS=2
# set NNODES=1
# set HOSTLIST=(`hostname`:cpus=$NCPUS)
# breaksw
# case machinename
# case br.msg.chem.iastate.edu:
if ($NCPUS >= 4) set NCPUS=4
set NNODES=1
set HOSTLIST=(`hostname`:cpus=$NCPUS)
breaksw
# case machinename:
# case cd.msg.chem.iastate.edu:
# case zn.msg.chem.iastate.edu:
# case ni.msg.chem.iastate.edu:
# case co.msg.chem.iastate.edu:
# case pb.msg.chem.iastate.edu:
# case bi.msg.chem.iastate.edu:
# case po.msg.chem.iastate.edu:
# case at.msg.chem.iastate.edu:
# case sc.msg.chem.iastate.edu:
# if ($NCPUS > 4) set NCPUS=4
# set NNODES=1
# set HOSTLIST=(`hostname`:cpus=$NCPUS)
# breaksw
# case ga.msg.chem.iastate.edu:
# case ge.msg.chem.iastate.edu:
# case gd.msg.chem.iastate.edu:
# if ($NCPUS > 6) set NCPUS=6
# set NNODES=1
# set HOSTLIST=(`hostname`:cpus=$NCPUS)
# breaksw
default:
echo I do not know how to run this node in parallel.
exit 20
endsw
endif
#

And, with this set of changes, I’m using all 4 cores on the board (but have some significant memory issues when running MP2 calcs. But that’s for another post).
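For anyone wanting something less hard-wired than a fixed cap of 4, the idea behind the if line is simply "never ask for more cores than the box has". rungms itself is a csh script, so the bash sketch below only illustrates that logic and is not a drop-in replacement; clamp_ncpus is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Cap a requested core count at the number of cores available.
# clamp_ncpus is a hypothetical helper; rungms itself is a csh script, so
# this only illustrates the logic in bash.
clamp_ncpus() {
    local requested="$1" available="$2"
    if [ "$requested" -gt "$available" ]; then
        echo "$available"
    else
        echo "$requested"
    fi
}

# Example, taking the core count from nproc (GNU coreutils):
#   clamp_ncpus 8 "$(nproc)"
```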

The typical user will never be able to do what the GAMESS group has done in making an excellent program that also happens to be free. That said, modifying rungms would be much simpler if there were a separate rungms script for each system type instead of a monolithic file that is mostly useless text to anyone not using one of those systems. If I streamline rungms for my specific system, I may post a new file accordingly.

Bash Script For Generating 3D Potential Energy Surface Scan Input Files

INPUTFILETEMPLATE.gjf
1_PESscan.bash
2_bcf.bash
3_gaussian_bcf_top4lines.txt

The above scripts were made to overcome some of the potential energy surface (PES) scan limitations (well, design issues I’d rather not design around) in GAMESS and Gaussian. The intent of the scripts is to:

a) take a molecular input file from some quantum chemical code

b) align the molecule such that the surface you want to perform the PES scan across is aligned in the XY, YZ, or XZ planes

c) define the atom/molecule you want to scan with (such as a cation) in variable form in the input file template

d) define the grid size you want to perform the scan with (how fine a mesh)

e) generate the input files

and finally,

f) write out a command line script in a format that leaves little post-processing for both the calculation and the analysis.

The demo case will be the (nominally, depending on the level of theory) C2v molecular anion (-1) diphenylmethanide in Gaussian, which will also hit a couple of technical points to speed up the PES scan. The intent was to take a cation (K+, Rb+) and determine the binding energies of the cation/anion pair over the (here, YZ) plane of the molecule to look at preferred binding positions and, more interestingly, the range in binding energies at different positions (see figure below).

[Figure: PES grid]

A few setup points for the script (general):

1. The molecule should be aligned along a plane (assuming it is planar. If not, you’ll have to tweak the code a little to taste, which isn’t too bad) because the script is going to sweep along Y and Z with the assumption that X is a constant vertical value above the plane of the molecule. The goal is to make a 2D map of a surface at one height, then change the height and redo the scan to look at the energy differences in sheets.

2. An optimization in GAMESS or Gaussian should, provided symmetry is employed, orient the molecule such that a molecular plane is aligned along a Cartesian plane. In Gaussian, using the keyword symm(loose,follow) will both nudge a molecule into a higher symmetry and maintain that higher symmetry in the optimization. For the sake of the surface scan, this restriction of the molecule to a higher point group saves significant time, makes the analysis of the script results easier, and will not significantly alter the energies of a molecule whose minimum lies close to a higher symmetry form anyway (the difference in energy between the C2v and Cs diphenylmethanide structures, for instance, is on the order of a few kJ/mol by most levels of theory when the level of theory does NOT predict the C2v form to be the minimum).

3. Negative values are OK in the script, although some shells may not like having the “-” sign in the name of the files. If this is a problem on your machine, take the oriented molecule from Step 1 and add some constant value to all of the X, Y, Z coordinates to make the corner of the scan you start the X,Y,Z analysis from be (X,0,0). I use “X” here because the molecule may best be aligned to the X = 0 plane, which is easiest for some of the later post-processing.

4. If your molecule has two mirror planes (such as diphenylmethanide), set up the scan such that you’re only looking at HALF the structure. That should make perfect sense.

The scripts are provided in the links below. Their uses and results are as follows:

Input File 0. INPUTFILETEMPLATE – this file contains the parameters and coordinates for the PES scan. Make sure that this isn’t just a cut-and-paste of the optimization file (change the keywords so you run only SINGLE POINT CALCULATIONS!). More on this file below.

Input File 1. 1_PESscan.bash – the bash script that takes a template input file and generates (a) all of the single-point energy input files for the scan and (b) a batch file for submitting all of the files.

Input File 2. 2_bcf.bash – this is Gaussian-specific for the standard Gaussian PC interface, where the script for running multiple jobs is defined in the batch control file (bcf). This script gets the formatting right. Not needed for GAMESS, etc., calculations.

Input File 3. 3_gaussian_bcf_top4lines.txt – this file could be embedded into the 2_bcf.bash file if you like. Part of the 2_bcf.bash script concatenates (with cat) the contents of this file and the list of input file names in the right bcf format. This will make perfect sense after the first run.

Output File 4. 4_batchscript.bat – this file (with the numerous input files) is generated from the 1_PESscan.bash script and contains all of the input files and execution parameters for the prompt. Will be Windows- or UNIX-friendly if you did it right.

Output File 5. 5_bcf_for_gaussian.bcf – this is the batch control file for Windows-based Gaussian calculations generated from 2_bcf.bash.

Specifics of each file are provided below:

Input File 0. INPUTFILETEMPLATE

The input template file should look like the following…

%mem=500MB
%nproc=2
# scf=tight b3lyp/6-31+g(d,p) integral(grid=ultrafine)

diphenylmethanide surface scan

0 1
K             REPLACEX    REPLACEY    REPLACEZ
C             0.000000    0.000000    0.940353
C             0.000000    1.308620    0.380671
...

Note that the molecule is aligned along X=0 and that I replaced opt with scf from the molecular optimization. You don’t want to disappear for a weekend only to find that you’ve done 4 structure optimizations when you could have done 100 single-point calculations. The cation we’re sliding with is defined with its X,Y,Z coordinates as REPLACEX,REPLACEY,REPLACEZ. These variables will be replaced by values along the PES grid by running 1_PESscan.bash.
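The placeholder replacement for a single grid point can be sketched with sed. This is not the code in 1_PESscan.bash, only an illustration of the idea; fill_template is a hypothetical helper name, and it assumes each placeholder appears once per line as in the template above.

```shell
#!/usr/bin/env bash
# Substitute one set of coordinates for the placeholders in a template.
# fill_template is a hypothetical helper, not the code in 1_PESscan.bash.
fill_template() {
    local template="$1" x="$2" y="$3" z="$4"
    sed -e "s/REPLACEX/$x/" -e "s/REPLACEY/$y/" -e "s/REPLACEZ/$z/" "$template"
}

# Example:
#   fill_template INPUTFILETEMPLATE.gjf 2.000000 0.000000 0.500000 > point.gjf
```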

Input File 1. 1_PESscan.bash

The usage of this bash script is:

./1_PESscan.bash $1 $2 $3 $4 $5 $6

or, for instance,

./1_PESscan.bash Kdiphenylmethanide gjf g03 100 25 25

Here are what the variables are.

$1 = name of the input file template (no extension)
$2 = name of the file extension (gjf for Gaussian, inp for GAMESS, etc.)
$3 = command line executable (g03 for Gaussian, gms… for GAMESS)
$4 = decimal increments for the X axis (100 = unit, no decimals)
$5 = decimal increments for the Y axis (100 = unit, no decimals)
$6 = decimal increments for the Z axis (100 = unit, no decimals)

Bash only addresses $1 through $9 directly (anything beyond that needs braces, as in ${10}), so to keep the command line short there’s one modification you need to make to the 1_PESscan.bash file itself. In the first section, MINX through MAXZ need to be defined. These are the integer steps taken by the script to generate the surfaces. These will depend on how you oriented your molecule. The decimal increments will not (which is why they’re called in the command line).
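To make the relationship between the integer steps and the decimal increments concrete, here is a minimal sketch of a grid generator. The conversion rule (coordinate = step × increment / 100) is my reading of the "100 = unit" convention and should be treated as an assumption, as should the helper names grid_value and grid_points. The MIN/MAX values are passed as arguments here only to keep the sketch self-contained.

```shell
#!/usr/bin/env bash
# Convert an integer step and an increment given in hundredths into a
# coordinate with six decimal places (assumed rule: step * increment / 100).
grid_value() {
    LC_ALL=C awk -v step="$1" -v inc="$2" \
        'BEGIN { printf "%.6f\n", step * inc / 100 }'
}

# Print one "x y z" line per grid point. The MIN/MAX arguments stand in for
# the MINX..MAXZ values that are set inside 1_PESscan.bash.
grid_points() {
    local minx="$1" maxx="$2" miny="$3" maxy="$4" minz="$5" maxz="$6"
    local incx="$7" incy="$8" incz="$9"
    local i j k
    for (( i = minx; i <= maxx; i++ )); do
        for (( j = miny; j <= maxy; j++ )); do
            for (( k = minz; k <= maxz; k++ )); do
                echo "$(grid_value "$i" "$incx") $(grid_value "$j" "$incy") $(grid_value "$k" "$incz")"
            done
        done
    done
}

# Example: one sheet at X = 2 with a 0.5 step along Z.
#   grid_points 2 2 0 0 0 3 100 50 50
```

Starting with a coarse increment (100 or 50) keeps the number of generated points small while you check that the sweep covers the region you intended.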

Input File 2. 2_bcf.bash

There’s nothing to edit here. It will take all of the Gaussian input (.gjf) files in a directory and make the corresponding .bcf file. Major time-saver.

Input File 3. 3_gaussian_bcf_top4lines.txt

This file contains the following…

!
!user created batch file list
!start=1
!

Which is just the typical 4 top lines in a bcf file (start referring to which molecule in the series gets run first).
And that’s the worst of it. Definitely practice the script(s) a few times before beginning a calculation, expect some fuss (depending on how you’ve pathed your executables) initially, and start with very COARSE grid sizes before wasting a lot of time on generating and checking files (decimal increments of 100 or 50, for instance).

A brief word on the input/output files. The format for the file names generated from 1_PESscan.bash is as follows.

INPUTFILETEMPLATE_X_2.000000_Y_0.000000_Z_0.000000.gjf
INPUTFILETEMPLATE_X_2.000000_Y_0.000000_Z_0.500000.gjf
INPUTFILETEMPLATE_X_2.000000_Y_0.000000_Z_1.000000.gjf
INPUTFILETEMPLATE_X_2.000000_Y_0.000000_Z_1.500000.gjf
...

The format (with six decimal places) is used here so that a user can simply use “ls *.gjf > file.txt” to generate a text file with all of the scanned coordinates, which can then be opened in Excel or something similar to make the position columns. Further, a grep for the final energies in the corresponding output files will make another file with the coordinates and energies together, which is then easily tweaked and plotted.
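The file-name bookkeeping can also be done directly in bash instead of Excel. The sketch below pulls the coordinates out of a name in the format shown above and, for Gaussian, reads the last energy from a log. Both helper names are hypothetical, and the energy part assumes the usual "SCF Done:" line of a Gaussian output file with the energy as the fifth whitespace-separated field.

```shell
#!/usr/bin/env bash
# Pull the X, Y and Z values out of a file name of the form
# NAME_X_<x>_Y_<y>_Z_<z>.<ext>. coords_from_name is a hypothetical helper.
coords_from_name() {
    local name="${1##*/}"          # drop any directory part
    name="${name%.*}"              # drop the extension
    local z="${name##*_Z_}"
    name="${name%_Z_*}"
    local y="${name##*_Y_}"
    name="${name%_Y_*}"
    local x="${name##*_X_}"
    echo "$x $y $z"
}

# Print the last SCF energy in a Gaussian log, assuming a line of the form
# "SCF Done:  E(...) =  <energy>  A.U. after ..." (energy is field 5).
last_scf_energy() {
    awk '/SCF Done:/ { e = $5 } END { if (e != "") print e }' "$1"
}

# Example: one "x y z energy" line per output file (extension is a guess).
#   for f in *.out; do
#       echo "$(coords_from_name "$f") $(last_scf_energy "$f")"
#   done
```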

That’s the worst of it, short of plotting a few surfaces. Any questions or better ideas, drop a line.

www.msg.ameslab.gov/GAMESS/
www.gaussian.com