Bash Script For Generating 3D Potential Energy Surface Scan Input Files

INPUTFILETEMPLATE.gjf
1_PESscan.bash
2_bcf.bash
3_gaussian_bcf_top4lines.txt

The above scripts were made to overcome some of the potential energy surface (PES) scan limitations (well, design issues I'd rather not design around) in GAMESS and Gaussian. The intent of the scripts are to:

a) take a molecular input file from some quantum chemical code

b) align the molecule such that the surface you want to perform the PES scan across is aligned in the XY, YZ, or XZ planes

c) define the atom/molecule you want to scan with (such as a cation) in variable form in the input file template

d) define the grid size you want to perform the scan with (how fine a mesh)

e) generate the input files

and finally,

f) write out a command line script in a format that leaves for little post-processing for both the calculation and the analysis.

The demo case will be the (nominally, depending on the level of theory) C2v molecular anion (-1) diphenylmethanide in Gaussian, which will also hit a couple of technical points to speed up the PES scan. The intent was to take a cation (K+, Rb+) and determine the binding energies of the cation/anion pair over the (here, YZ) plane of the molecule to look at preferred binding positions and, more interestingly, the range in binding energies at different positions (see figure below).

pes grid

A few setup point for the script (general):

1. The molecule should be aligned along a plane (assuming it is planar. If not, you'll have to tweak the code a little to taste, which isn't too bad) because the script is going to sweep along Y and Z with the assumption that X is a constant vertical value above the plane of the molecule. The goal is to make a 2D map of a surface at one height, then change the height and redo the scan to look at the energy differences in sheets.

2. A molecule optimized in GAMESS and Gaussian should, provided symmetry is employed, orient the molecule such that a molecular plane is aligned along a Cartesian plane. In Gaussian, using the keyword symm(loose,follow) will both nudge a molecule into a higher symmetry and maintain that higher symmetry in the optimization. For the sake of the surface scan, this restriction of the molecule to a higher point group saves significant time, makes the analysis of the script results easier, and will not significantly alter the energies of a molecule whose minimum lies close to a higher symmetry form anyway (the differences in energy between the C2v and Cs diphenylmethanide structures, for instance, is on the order of a few kJ/mol by most levels of theory when the level of theory does NOT predict the C2v form to be the minimum).

3. Negative values are OK in the script, although some shells may not like having the "-" sign in the name of the files. If this is a problem on your machine, take the oriented molecule from Step 1 and add some constant value to all of the X, Y, Z coordinates to make the corner of the scan you start the X,Y,Z analysis from be (X,0,0). I use "X" here because the molecule may best be aligned to the X = 0 plane, which is easiest for some of the later post-processing.

4. If your molecule has two mirror planes (such as diphenylmethanide), set up the scan such that you're only looking at HALF the structure. That should make perfect sense.

The scripts are provided in the links below. Their uses and results are as follows:

Input File 0. INPUTFILETEMPLATE – this file contains the parameters and coordinates for the PES scan. Make sure that this isn't just a cut-and-paste of the optimization file (change the keywords so you run only SINGLE POINT CALCULATIONS!). More on this file below.

Input File 1. 1_PESscan.bash – the bash script that takes a template input file and generates (a) all of the single-point energy input files for the scan and (b) a batch file for submitting all of the files.

Input File 2. 2_bcf.bash – this is Gaussian-specific for the standard Gaussian PC interface, where the script for running multiple jobs is defined in the batch control file (bcf). This script gets the formatting right. Not needed for GAMESS, etc., calculations.

Input File 3. 3_gaussian_bcf_top4lines.txt – this file could be embedded into the 2_bcf.bash file if you like. Part of the 2_bcf.bash script catenates (with cat) the contents of this file and the contents of name files in the right bcf format. This will make perfect sense after the first run.

Output File 4. 4_batchscript.bat – this file (with the numerous input files) is generated from the 1_PESscan.bash script and contains all of the input files and execution parameters for the prompt. Will be Windows- or UNIX-friendly if you did it right.

Output File 5. 5_bcf_for_gaussian.bcf – this is the batch control file for Windows-based Gaussian calculations generated from 2_bcf.bash.

Specifics of each file are provided below:

Input File 0. INPUTFILETEMPLATE

The input template file should look like the following…

%mem=500MB
%nproc=2
# scf=tight b3lyp/6-31+g(d,p) integral(grid=ultrafine)

diphenylmethanide surface scan

0 1
K             REPLACEX    REPLACEY    REPLACEZ
C             0.000000    0.000000    0.940353
C             0.000000    1.308620    0.380671
...

Note that the molecule is aligned along X=0 and that I replaced opt with scf from the molecular optimization. You don't want to disappear for a weekend only to find that you've done 4 structure optimizations when you could have done 100 single-point calculations. The cation we're sliding with is defined with its X,Y,Z coordinates as REPLACEX,REPLACEY,REPLACEZ. These variables will be replaced by values along the PES grid by running 1_PESscan.bash.

Input File 1. 1_PESscan.bash

The usage of this bash script is:

./1_PESscan.bash $1 $2 $3 $4 $5 $6

or, for instance,

./1_PESscan.bash Kdiphenylmethanide gjf g03 100 25 25

Here are what the variables are.

$1 = name of the input file template (no extension)
$2 = name of the file extension (gjf for Gaussian, inp for GAMESS, etc.)
$3 = command line executable (g03 for Gaussian, gms… for GAMESS
$4 = decimal increments for the X axis (100 = unit, no decimals)
$5 = decimal increments for the Y axis (100 = unit, no decimals)
$6 = decimal increments for the Z axis (100 = unit, no decimals)

You're limited to nine variables in the command line, so there's one modification you need to make to the 1_PESscan.bash file itself. In the first section, MINX through MAXZ need to be defined. These are the integer steps taken by the script to generate the surfaces. These will depend on how you oriented your molecule. The decimal increments will not (which is why they're called in the command line).

Input File 2. 2_bcf.bash

There's nothing to edit here. It will take all of the Gaussian input (.gjf) files in a directory and make the corresponding .bcf file. Major time-saver.

Input File 3. 3_gaussian_bcf_top4lines.txt

This file contains the following…

!
!user created batch file list
!start=1
!

Which is just the typical 4 top lines in a bcf file (start referring to which molecule in the series gets run first).
And that's the worst of it. Definitely practice the script(s) a few times before beginning a calculation, expect some fuss (depending on how you've pathed your executables) initially, and start with very COARSE grid sizes before wasting a lot of time on generating and checking files (decimal increments of 100 or 50, for instance).

A brief word on the input/output files. The format for the file names generated from 1_PESscan.bash are as follows.

INPUTFILETEMPLATE_X_2.000000_Y_0.000000_Z_0.000000.gjf
INPUTFILETEMPLATE_X_2.000000_Y_0.000000_Z_0.500000.gjf
INPUTFILETEMPLATE_X_2.000000_Y_0.000000_Z_1.000000.gjf
INPUTFILETEMPLATE_X_2.000000_Y_0.000000_Z_1.500000.gjf
...

The format (with six decimal places) is used here so that a user can simply use "ls *.gjf > file.txt" to generate a text file with all of the scanned coordinates, which can then be opened in Excel or something to make the position columns. Further, of course, is that a grep for the final energies in these files will make another file with the coordinates and energies together, which is then easily tweaked and plotted.

That's the worst of it, short of plotting a few surfaces. Any questions or better ideas, drop a line.

www.msg.ameslab.gov/GAMESS/
www.gaussian.com

Synthetic, Structural And Theoretical Investigations Of Alkali Metal Germanium Hydrides – Contact Molecules And Separated Ions

In press, available from Chemistry – A European Journal. This is a paper a year or so in the making that, had I started it a year from now, would have taken a very different route. Much of the work I've done in neutron and terahertz spectroscopy has demonstrated that the inclusion of the crystal environment in quantum chemical treatments of solid-state systems is the key to interpreting the data (makes sense). This paper examines the unusual orientation of the [GeH3] anion in two crown ether complexes with potassium (K+) and rubidium (Rb+) cations. The crystal cells of these two complexes are far larger than computational resources would handle now (and definitely when the project started), but they'd be easily handled on better equipment (such as an 8-processor box with a terabyte or so of scratch space). The isolated molecule calculations (Dr. Alex Granovsky's PC-GAMESS version with basis sets from EMSL) demonstrate that the potential energy surfaces corresponding to anion orientations in the vicinity of only the solvated cations is shallow (at best) and that any moderate collection of electrostatic interactions (such as those in the crystal cell) may be enough to stabilize the unexpected anion orientation. It is also of interest to note that the [GeH3] anion prefers to bind to the K/Rb cations by the hydrogens (which we refer to as "inverted") and NOT the germanium anionic lone pair (the traditional, van't Hoff arrangement, all of those calculations being performed at B3LYP/6-311G(d,p) and MP2/6-311G(d,p) levels of theory with LANL2DZ ECPs for the K and Rb). This "oddity" was considered previously by the great Paul v. R. Schleyer and coworkers for a similar Na-SiH3 system some time back (Angew. Chem. 1994, 106, 221-223). This project will hopefully be revisited with solid-state density functional theory to see just how the crystal interactions combine to impose the non-traditional [GeH3] binding orientation.

karin geh3

W. Teng, D. G. Allis and K. Ruhlandt-Senge

Abstract: The preparation of a series of crown ether-ligated alkali metal (M = K, Rb, Cs) germyl derivatives M(crown ether)GeH3 via hydrolysis of the respective tris(trimethylsilyl)germanides is reported. Depending on the alkali metal and the crown ether diameter, the hydrides display either contact molecules or separated ions in the solid state, providing a unique structural insight into the geometry of the obscure GeH3 anion.

Germyl derivatives displaying M-Ge bonds in the solid state are of the general formula M([18]crown-6)(thf)GeH3 with M = K, 1; M = Rb, 4. Interestingly, the lone pair at germanium is not pointed towards the alkali metal, rather two of the three hydrides are approaching the alkali metal center to display M-H interactions.

Separated ions display alkali metal cations bound to two crown ethers in a sandwich-type arrangement and non-coordinated GeH3 anions to afford complexes of the type [M(crown ether)2][GeH3] with M = K, crown ether = [15]crown-5, 2; M = K, crown ether = [12]crown-4, 3 and M = Cs, crown ether = [18]crown-6, 5.

The highly reactive germyl derivatives were characterized using X-ray crystallography, 1H and 13C NMR, and IR spectroscopy. Density functional theory (DFT) and Second-Order Moeller-Plesset perturbation theory (MP2) calculations were performed to analyze the geometry of the GeH3 anion in the contact molecules 1 and 4.

www3.interscience.wiley.com/cgi-bin/jhome/26293?CRETRY=1&SRETRY=0
www3.interscience.wiley.com/cgi-bin/abstract/112234636/ABSTRACT
en.wikipedia.org/wiki/Jacobus_Henricus_van_'t_Hoff
www.emsl.pnl.gov/forms/basisform.html
chemistry.syr.edu/faculty/ruhlandt.html
classic.chem.msu.su/gran/gamess
www.chem.uga.edu/schleyer