Archive for the '(pc-)gamess-us' Category

Bash Script For Generating 3D Potential Energy Surface Scan Input Files

Monday, October 9th, 2006

INPUTFILETEMPLATE.gjf
1_PESscan.bash
2_bcf.bash
3_gaussian_bcf_top4lines.txt

The above scripts were made to overcome some of the potential energy surface (PES) scan limitations (well, design issues I’d rather not design around) in GAMESS and Gaussian. The intent of the scripts is to:

a) take a molecular input file from some quantum chemical code

b) align the molecule such that the surface you want to perform the PES scan across is aligned in the XY, YZ, or XZ planes

c) define the atom/molecule you want to scan with (such as a cation) in variable form in the input file template

d) define the grid size you want to perform the scan with (how fine a mesh)

e) generate the input files

and finally,

f) write out a command line script in a format that leaves little post-processing for both the calculation and the analysis.

The demo case will be the (nominally, depending on the level of theory) C2v molecular anion (-1) diphenylmethanide in Gaussian, which will also hit a couple of technical points to speed up the PES scan. The intent was to take a cation (K+, Rb+) and determine the binding energies of the cation/anion pair over the (here, YZ) plane of the molecule to look at preferred binding positions and, more interestingly, the range in binding energies at different positions (see figure below).

[Figure: PES grid]

A few setup points for the script (general):

1. The molecule should be aligned along a plane (assuming it is planar. If not, you’ll have to tweak the code a little to taste, which isn’t too bad) because the script is going to sweep along Y and Z with the assumption that X is a constant vertical value above the plane of the molecule. The goal is to make a 2D map of a surface at one height, then change the height and redo the scan to look at the energy differences in sheets.

2. An optimization in GAMESS or Gaussian should, provided symmetry is employed, orient the molecule such that a molecular plane is aligned along a Cartesian plane. In Gaussian, using the keyword symm(loose,follow) will both nudge a molecule into a higher symmetry and maintain that higher symmetry in the optimization. For the sake of the surface scan, this restriction of the molecule to a higher point group saves significant time, makes the analysis of the script results easier, and will not significantly alter the energies of a molecule whose minimum lies close to a higher symmetry form anyway (the difference in energy between the C2v and Cs diphenylmethanide structures, for instance, is on the order of a few kJ/mol by most levels of theory when the level of theory does NOT predict the C2v form to be the minimum).

3. Negative values are OK in the script, although some shells may not like having the “-” sign in the name of the files. If this is a problem on your machine, take the oriented molecule from Step 1 and add some constant value to all of the X, Y, Z coordinates to make the corner of the scan you start the X,Y,Z analysis from be (X,0,0). I use “X” here because the molecule may best be aligned to the X = 0 plane, which is easiest for some of the later post-processing.

4. If your molecule has two mirror planes (such as diphenylmethanide), set up the scan such that you’re only looking at HALF the structure. That should make perfect sense.
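
The coordinate shift in point 3 can be done in one pass with awk. This is a minimal sketch and not one of the scripts linked here: the function name shift_coords and the plain “Symbol X Y Z” line layout are assumptions for illustration, and any line that doesn’t look like that (keywords, title, the REPLACEX line) is passed through untouched.

```bash
#!/usr/bin/env bash
# Sketch only: shift_coords is a hypothetical helper, not part of the posted scripts.
# Usage: shift_coords DX DY DZ < coordinates.txt
shift_coords() {
    awk -v dx="$1" -v dy="$2" -v dz="$3" '
        BEGIN { num = "^-?[0-9]+\\.?[0-9]*$" }   # plain decimal numbers only
        # A coordinate line is assumed to be: Symbol X Y Z (all three numeric).
        NF == 4 && $2 ~ num && $3 ~ num && $4 ~ num {
            printf "%-2s %12.6f %12.6f %12.6f\n", $1, $2 + dx, $3 + dy, $4 + dz
            next
        }
        { print }   # everything else passes through unchanged
    '
}
```

Something like “shift_coords 0 5 5 < oriented.txt” would then push the Y and Z coordinates up by 5 while leaving the molecule on the X = 0 plane (oriented.txt being whatever file holds your oriented coordinates).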

The scripts are provided in the links below. Their uses and results are as follows:

Input File 0. INPUTFILETEMPLATE – this file contains the parameters and coordinates for the PES scan. Make sure that this isn’t just a cut-and-paste of the optimization file (change the keywords so you run only SINGLE POINT CALCULATIONS!). More on this file below.

Input File 1. 1_PESscan.bash – the bash script that takes a template input file and generates (a) all of the single-point energy input files for the scan and (b) a batch file for submitting all of the files.

Input File 2. 2_bcf.bash – this is Gaussian-specific for the standard Gaussian PC interface, where the script for running multiple jobs is defined in the batch control file (bcf). This script gets the formatting right. Not needed for GAMESS, etc., calculations.

Input File 3. 3_gaussian_bcf_top4lines.txt – this file could be embedded into the 2_bcf.bash file if you like. Part of the 2_bcf.bash script catenates (with cat) the contents of this file and the contents of name files in the right bcf format. This will make perfect sense after the first run.

Output File 4. 4_batchscript.bat – this file (with the numerous input files) is generated from the 1_PESscan.bash script and contains all of the input files and execution parameters for the prompt. Will be Windows- or UNIX-friendly if you did it right.

Output File 5. 5_bcf_for_gaussian.bcf – this is the batch control file for Windows-based Gaussian calculations generated from 2_bcf.bash.

Specifics of each file are provided below:

Input File 0. INPUTFILETEMPLATE

The input template file should look like the following…

%mem=500MB
%nproc=2
# scf=tight b3lyp/6-31+g(d,p) integral(grid=ultrafine)

diphenylmethanide surface scan

0 1
K             REPLACEX    REPLACEY    REPLACEZ
C             0.000000    0.000000    0.940353
C             0.000000    1.308620    0.380671
...

Note that the molecule is aligned along X=0 and that I replaced opt with scf from the molecular optimization. You don’t want to disappear for a weekend only to find that you’ve done 4 structure optimizations when you could have done 100 single-point calculations. The cation we’re sliding with is defined with its X,Y,Z coordinates as REPLACEX,REPLACEY,REPLACEZ. These variables will be replaced by values along the PES grid by running 1_PESscan.bash.

Input File 1. 1_PESscan.bash

The usage of this bash script is:

./1_PESscan.bash $1 $2 $3 $4 $5 $6

or, for instance,

./1_PESscan.bash Kdiphenylmethanide gjf g03 100 25 25

Here are what the variables are.

$1 = name of the input file template (no extension)
$2 = name of the file extension (gjf for Gaussian, inp for GAMESS, etc.)
$3 = command line executable (g03 for Gaussian, gms… for GAMESS)
$4 = decimal increments for the X axis (100 = unit, no decimals)
$5 = decimal increments for the Y axis (100 = unit, no decimals)
$6 = decimal increments for the Z axis (100 = unit, no decimals)

You’re limited to nine variables in the command line, so there’s one modification you need to make to the 1_PESscan.bash file itself. In the first section, MINX through MAXZ need to be defined. These are the integer steps taken by the script to generate the surfaces. These will depend on how you oriented your molecule. The decimal increments will not (which is why they’re called in the command line).
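
For anyone who wants the gist without opening the download, the core loop can be sketched as below. To be clear, this is NOT the contents of 1_PESscan.bash, just a hedged reconstruction of the idea. The assumptions: a coordinate is the integer step times the decimal increment divided by 100 (so 100 gives whole units and 50 gives steps of 0.5), the MIN/MAX values shown are placeholders, the helper names coord and pes_scan are made up, and each batch line is simply “executable inputfile”.

```bash
#!/usr/bin/env bash
# Sketch only: NOT the original 1_PESscan.bash.

# Integer step ranges, edited by hand for your molecule (placeholder values).
MINX=2; MAXX=2
MINY=0; MAXY=4
MINZ=0; MAXZ=8

# coord STEP INCREMENT -> STEP * INCREMENT / 100, printed with six decimals.
# ASSUMPTION: an increment of 100 means whole units, 50 means steps of 0.5.
coord() {
    awk -v step="$1" -v inc="$2" 'BEGIN { printf "%.6f", step * inc / 100 }'
}

# pes_scan NAME EXT EXE INCX INCY INCZ
# Reads NAME.EXT, writes one filled-in copy per grid point plus 4_batchscript.bat.
pes_scan() {
    local name=$1 ext=$2 exe=$3 incx=$4 incy=$5 incz=$6
    local ix iy iz x y z out
    : > 4_batchscript.bat
    for ((ix = MINX; ix <= MAXX; ix++)); do
        x=$(coord "$ix" "$incx")
        for ((iy = MINY; iy <= MAXY; iy++)); do
            y=$(coord "$iy" "$incy")
            for ((iz = MINZ; iz <= MAXZ; iz++)); do
                z=$(coord "$iz" "$incz")
                out="${name}_X_${x}_Y_${y}_Z_${z}.${ext}"
                # Swap the three placeholders for this grid point.
                sed -e "s/REPLACEX/$x/" -e "s/REPLACEY/$y/" -e "s/REPLACEZ/$z/" \
                    "${name}.${ext}" > "$out"
                # ASSUMPTION: one "executable inputfile" line per job.
                echo "$exe $out" >> 4_batchscript.bat
            done
        done
    done
}
```

With these pieces, “pes_scan Kdiphenylmethanide gjf g03 100 25 25” mirrors the example call above. The real script may well differ in how it formats the batch lines, so check the generated files before launching anything.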

Input File 2. 2_bcf.bash

There’s nothing to edit here. It will take all of the Gaussian input (.gjf) files in a directory and make the corresponding .bcf file. Major time-saver.
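
The idea behind it is small enough to sketch here. This is not the posted 2_bcf.bash: the function name make_bcf is made up, and the “input , output” shape of each job line is my assumption about what the Gaussian batch list wants, so compare against a .bcf saved from the Gaussian interface before trusting it.

```bash
#!/usr/bin/env bash
# Sketch only: NOT the original 2_bcf.bash.
# make_bcf HEADERFILE BCFFILE
# Copies the header lines, then appends one line per .gjf file in this directory.
make_bcf() {
    local header=$1 bcf=$2 f
    cat "$header" > "$bcf"
    for f in *.gjf; do
        [ -e "$f" ] || continue   # no .gjf files present: skip the literal glob
        # ASSUMPTION: each job is listed as "input , output" on its own line.
        printf '%s , %s\n' "$f" "${f%.gjf}.out" >> "$bcf"
    done
}
```

A call such as “make_bcf 3_gaussian_bcf_top4lines.txt 5_bcf_for_gaussian.bcf” would build the list. Note that a template file with a .gjf extension sitting in the same directory would be picked up as a job too, so move it out first.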

Input File 3. 3_gaussian_bcf_top4lines.txt

This file contains the following…

!
!user created batch file list
!start=1
!

Which is just the typical 4 top lines in a bcf file (start referring to which molecule in the series gets run first).

And that’s the worst of it. Definitely practice the script(s) a few times before beginning a calculation, expect some fuss (depending on how you’ve pathed your executables) initially, and start with very COARSE grid sizes before wasting a lot of time on generating and checking files (decimal increments of 100 or 50, for instance).

A brief word on the input/output files. The format for the file names generated from 1_PESscan.bash is as follows.

INPUTFILETEMPLATE_X_2.000000_Y_0.000000_Z_0.000000.gjf
INPUTFILETEMPLATE_X_2.000000_Y_0.000000_Z_0.500000.gjf
INPUTFILETEMPLATE_X_2.000000_Y_0.000000_Z_1.000000.gjf
INPUTFILETEMPLATE_X_2.000000_Y_0.000000_Z_1.500000.gjf
...

The format (with six decimal places) is used here so that a user can simply use “ls *.gjf > file.txt” to generate a text file with all of the scanned coordinates, which can then be opened in Excel or something to make the position columns. Likewise, a grep for the final energies in the corresponding output files will make another file with the coordinates and energies together, which is then easily tweaked and plotted.
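
As a rough sketch of that last step for Gaussian: the function below pulls X, Y, and Z out of each file name and the energy from the last “SCF Done” line of the file. The function name collect_energies and the .out extension are assumptions (your outputs may be .log), and a GAMESS run would need a different search string.

```bash
#!/usr/bin/env bash
# Sketch only: collect_energies is a hypothetical helper, not one of the posted scripts.
# Prints "X Y Z energy" for every matching output file in the current directory.
collect_energies() {
    local f base x y z e
    for f in *_X_*_Y_*_Z_*.out; do
        [ -e "$f" ] || continue   # nothing matched: skip the literal glob
        # Peel the coordinates off the end of the file name, right to left.
        base=${f%.out}
        z=${base##*_Z_}
        base=${base%_Z_*}
        y=${base##*_Y_}
        base=${base%_Y_*}
        x=${base##*_X_}
        # Gaussian prints "SCF Done:  E(...) =  <energy>  A.U. after N cycles";
        # the energy is the fifth whitespace-separated field.
        e=$(grep 'SCF Done' "$f" | tail -n 1 | awk '{print $5}')
        printf '%s %s %s %s\n' "$x" "$y" "$z" "$e"
    done
}
```

Running “collect_energies > surface.txt” gives a four-column file that goes straight into a spreadsheet or plotting program. A job that died before the SCF finished shows up as a row with an empty energy column, which is a handy check in itself.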

That’s the worst of it, short of plotting a few surfaces. Any questions or better ideas, drop a line.

www.msg.ameslab.gov/GAMESS/
www.gaussian.com

Synthetic, Structural And Theoretical Investigations Of Alkali Metal Germanium Hydrides – Contact Molecules And Separated Ions

Thursday, August 31st, 2006

In press, available from Chemistry – A European Journal. This is a paper a year or so in the making that, had I started it a year from now, would have taken a very different route. Much of the work I’ve done in neutron and terahertz spectroscopy has demonstrated that the inclusion of the crystal environment in quantum chemical treatments of solid-state systems is the key to interpreting the data (makes sense). This paper examines the unusual orientation of the [GeH3]- anion in two crown ether complexes with potassium (K+) and rubidium (Rb+) cations. The crystal cells of these two complexes are far larger than computational resources would handle now (and definitely when the project started), but they’d be easily handled on better equipment (such as an 8-processor box with a terabyte or so of scratch space). The isolated molecule calculations (Dr. Alex Granovsky’s PC-GAMESS version with basis sets from EMSL) demonstrate that the potential energy surfaces corresponding to anion orientations in the vicinity of only the solvated cations are shallow (at best) and that any moderate collection of electrostatic interactions (such as those in the crystal cell) may be enough to stabilize the unexpected anion orientation. It is also of interest to note that the [GeH3]- anion prefers to bind to the K/Rb cations by the hydrogens (which we refer to as “inverted”) and NOT the germanium anionic lone pair (the traditional, van’t Hoff arrangement). All of those calculations were performed at the B3LYP/6-311G(d,p) and MP2/6-311G(d,p) levels of theory with LANL2DZ ECPs for the K and Rb. This “oddity” was considered previously by the great Paul v. R. Schleyer and coworkers for a similar Na-SiH3 system some time back (Angew. Chem. 1994, 106, 221-223). This project will hopefully be revisited with solid-state density functional theory to see just how the crystal interactions combine to impose the non-traditional [GeH3]- binding orientation.

[Figure: karin geh3]

W. Teng, D. G. Allis and K. Ruhlandt-Senge

Abstract: The preparation of a series of crown ether-ligated alkali metal (M = K, Rb, Cs) germyl derivatives M(crown ether)GeH3 via hydrolysis of the respective tris(trimethylsilyl)germanides is reported. Depending on the alkali metal and the crown ether diameter, the hydrides display either contact molecules or separated ions in the solid state, providing a unique structural insight into the geometry of the obscure GeH3- anion.

Germyl derivatives displaying M-Ge bonds in the solid state are of the general formula M([18]crown-6)(thf)GeH3 with M = K, 1; M = Rb, 4. Interestingly, the lone pair at germanium is not pointed towards the alkali metal, rather two of the three hydrides are approaching the alkali metal center to display M-H interactions.

Separated ions display alkali metal cations bound to two crown ethers in a sandwich-type arrangement and non-coordinated GeH3- anions to afford complexes of the type [M(crown ether)2][GeH3] with M = K, crown ether = [15]crown-5, 2; M = K, crown ether = [12]crown-4, 3 and M = Cs, crown ether = [18]crown-6, 5.

The highly reactive germyl derivatives were characterized using X-ray crystallography, 1H and 13C NMR, and IR spectroscopy. Density functional theory (DFT) and Second-Order Moeller-Plesset perturbation theory (MP2) calculations were performed to analyze the geometry of the GeH3- anion in the contact molecules 1 and 4.

www3.interscience.wiley.com/cgi-bin/jhome/26293?CRETRY=1&SRETRY=0
www3.interscience.wiley.com/cgi-bin/abstract/112234636/ABSTRACT
en.wikipedia.org/wiki/Jacobus_Henricus_van_’t_Hoff
www.emsl.pnl.gov/forms/basisform.html
chemistry.syr.edu/faculty/ruhlandt.html
classic.chem.msu.su/gran/gamess
www.chem.uga.edu/schleyer

Extension Of The Single Amino Acid Chelate Concept (SAAC) To Bifunctional Biotin Analogues For Complexation Of The M(CO)3+1 Core (M = Tc And Re): Syntheses, Characterization, Biotinidase Stability And Avidin Binding

Thursday, March 30th, 2006

In press, available from the journal Bioconjugate Chemistry. The modeling studies for the avidin-biotin structure and the biotin derivatives were completed with the molecular dynamics program NAMD on a Dual G4/450 loaned to me from Apple for development work, for which I am grateful (I’ve performed molecular dynamics simulations with the Walrus). I did manage to smoke the motherboard during this experience, for which I apologize. Given the state of the machine after the autopsy, I’m hoping no one (especially Eric Zelman!) asks for it back, even when I’m 64.

I made mention of the reasons for some of this work in an interview I did for nanotech.biz, completely unrelated to the other content, in case anyone wants some background.

Shelly James, Kevin P. Maresca, Damian G. Allis, John F. Valliant, William Eckelman, John W. Babich, and Jon Zubieta

Abstract: Biotin and avidin form one of the most stable complexes known (KD = 10^-15 M^-1) making this pairing attractive for a variety of biomedical applications including targeted radiotherapy. In this application one of the pair is attached to a targeting molecule while the other is subsequently used to deliver a radionuclide for imaging and/or therapeutic applications. Recently we reported a new single amino acid chelate (SAAC) capable of forming robust complexes with Tc(CO)3 or Re(CO)3 cores. We describe here the application of SAAC analogs for the development of a series of novel radiolabeled biotin derivatives capable of forming robust complexes with both Tc and Re. Compounds were prepared through varying modification of the free carboxylic acid group of biotin. Each 99mTc complex of SAAC-biotin was studied for their ability to bind avidin, susceptibility to biotinidase and specificity for avidin in an in vivo avidin-containing tumor model. The radiochemical stability of the 99mTc(CO)3 complexes was also investigated by challenging each 99mTc-complex with large molar excesses of cysteine and histidine at elevated temperature. All compounds were radiochemically stable for greater than 24 hours at elevated temperature in the presence of histidine and cysteine. Both [99mTc(CO)3(L6)]+1 [TcL6; L6 = biotinyl- amido- propyl- N,N- (dipicolyl)- amine] and [99mTc(CO)3(L12a)]+1 (TcL12; L12 = N,N-(dipicolyl)- biotin- amido- Boc- lysine; TcL12a; L12a = N,N- (dipicolyl)- biotin- amide- lysine) readily bound to avidin whereas [99mTc(CO)3(L9)]+1 [TcL9; L9 = N,N- (dipicolyl)- biotin- amine] demonstrated minimal specific binding. TcL6 and TcL9 were resistant to biotinidase cleavage while TcL12a, which contains a lysine linkage, was rapidly cleaved. The highest uptake in an in vivo avidin tumor model was exhibited by TcL6, followed by TcL9 and TcL12a, respectively. This is likely the result of both intact binding to avidin and resistance to circulating biotinidase.
Ligand L6 is the first SAAC analogue of biotin to demonstrate potential as a radiolabeled targeting vector of biotin capable of forming robust radiochemical complexes with both 99mTc and rhenium radionuclides. Computational simulations were performed to assess biotin-derivative accommodation within the binding site of the avidin. These calculations demonstrate that deformation of the surface domain of the binding pocket can occur to accommodate the transition metal-biotin derivatives with negligible changes to the inner-β-barrel, the region most responsible for binding and retaining biotin and its derivatives.

P.S. This publication is also of some use for explaining the series of images on the current departmental brochure for the Syracuse U. Chemistry Department. Steric interactions affect the local geometries of protein binding pockets. And a good thing, too.



www.apple.com
www.applecorps.com
chemistry.syr.edu/faculty/zubieta.html
www.molecularinsight.com
www.chemistry.mcmaster.ca/people/faculty/valliant/index.html
pubs.acs.org/journals/bcches/index.html
www.ks.uiuc.edu/Research/namd/

Everything You Need To Set Up PC-GAMESS On A(n) SMP System

Wednesday, January 4th, 2006

The following is a step-by-step guide to getting PC-GAMESS 6.4 (perhaps prior. I’m not inclined to check, but would appreciate a confirmation) running on a(n) SMP (that’s Symmetric MultiProcessing (useless trivia), a single motherboard with multiple processors) system. All of the websites describing what’s below are less… obvious than I would like, especially concerning the infamous .pg file for SMP machines. What’s below is specifically geared for a dual-core/dual-board (four processor) system, but is easily changed to other SMP cases.

1. Obtain the most recent (6.4 as of this posting) version of PC-GAMESS from the good Dr. Alex Granovsky.

2. After extracting the contents of the downloaded .exe file, you’ll notice the wmpi1_3.exe file. Double click to install this program, blindly saying yes to everything it asks.

3. In your c:\ directory (dig into My Computer), create the sub-directories pg1, pg2, pg3, and pg4 (note: the names are completely arbitrary). When that’s done, your folders should look like the following (PEXECUTE.BAT and PRUNGAMESS.BAT are separate batch files you can download below):

4. Into c:\pg1, place all of the extracted PC-GAMESS files (whatever came out of the .exe run). The folder should look like this (with three input files added by myself for testing):

5. Into the folders c:\pg2, c:\pg3, and c:\pg4, copy and paste (from c:\pg1) the files FASTDIAG.DLL, PCGP2P.DLL, and PGAMESS.EXE. These directories will look like the following:

6. Back in c:\pg1, create or place pgamess.pg, the wmpi config file you can download HERE (if you download, remove the .txt extension before use). The file is, simply,

local 0
localhost 1 c:\pg2\pgamess.exe
localhost 1 c:\pg3\pgamess.exe
localhost 1 c:\pg4\pgamess.exe

No surprises, but I’ve never found the SMP .pg file listed anywhere. This file is just for SMP runs. local 0 refers to the “base” copy of PGAMESS (in c:\pg1, the PC-GAMESS home directory). The localhost 1 lines call each of the other three directories and PGAMESS.EXE for the three remaining processors (in the quad box). For a dual core or dual CPU box, you’d remove c:\pg3 and c:\pg4 and delete the third and fourth lines in pgamess.pg.
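
Since the file follows such a regular pattern, it can be generated for any processor count. The snippet below is only a sketch: the function name make_pg is made up, it assumes the c:\pg1, c:\pg2, … directory naming used in this guide, and it runs in any bash shell (Cygwin, for instance) rather than the Windows command prompt.

```bash
#!/usr/bin/env bash
# Sketch only: make_pg is a hypothetical helper, not part of PC-GAMESS or WMPI.
# make_pg N prints a pgamess.pg for an N-processor SMP box, assuming the
# c:\pg1 ... c:\pgN directory layout described in this guide.
make_pg() {
    local n=$1 i
    echo 'local 0'                      # the base copy in c:\pg1
    for ((i = 2; i <= n; i++)); do
        # One line per additional processor, each pointing at its own copy.
        printf 'localhost 1 c:\\pg%d\\pgamess.exe\n' "$i"
    done
}
```

“make_pg 4 > pgamess.pg” reproduces the four-line file above, and “make_pg 2 > pgamess.pg” gives the two-line version for a dual core or dual CPU box.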

7. The command line will look like the following (note: the input file must be named “input”):

c:\pg1> PGAMESS.EXE c:\pg1 c:\pg2 c:\pg3 c:\pg4 -p4local > filename.out

All should, in theory, work without a hitch. For a batch-type system (you can’t add new files, but you can run existing jobs in the same directory automatically and sequentially), download the files PEXECUTE.BAT and PRUNGAMESS.BAT and place them into c:\pg1 (make sure to remove the .txt extension before use if you download these). To run this script, simply double-click on PRUNGAMESS.BAT.

NOTE1: Of course, what’s shown is not the most efficient way to run PC-GAMESS. For maximum speed-up, you’ll want a single hard drive dedicated to EACH processor (so each temp file is being written to a different disk). Your folders c:\pg1, c:\pg2, c:\pg3, and c:\pg4 would then be c:\pg1, d:\pg2, e:\pg3, and f:\pg4.

NOTE2: A number of my calculations fail randomly with MPI memory allocation/write errors on dual core/dual cpu AMD Opteron machines running Windows XP. These errors are actually hard drive write problems and not RAM problems. You can get around this problem (errors and OK buttons will pop up) by running the calculations in-RAM (DIRSCF=.TRUE.) and ramping up the amount of used RAM (with the MEMORY= keyword).

Using External Basis Sets In GAMESS-US, Running Mixed Basis Set Calculations With Z-Matrix Inputs

Wednesday, January 4th, 2006

The parameterization of the new nanoENGINEER-1 simulator is being performed for molecules containing H through Cl at the B3LYP/6-31+G(d,p) level of theory. Unfortunately, the 6-31+G(d,p) basis set is not available for the 4th row elements (Ga,Ge,As,Se,Br), meaning an alternate basis set is required (for the 4th row, that is). In GAMESS-US, this is simple if the input files are in Cartesian format. This same approach cannot be used in Z-matrix input formats. A (maybe THE only) way to call mixed basis sets for Z-matrices in GAMESS-US is provided here. The procedure involves making an external basis set file with the required basis sets, changing the $BASIS control to read the external file, and modifying rungms to read the external basis set file.

Below is a sample input file. The only noteworthy differences between it and any other input file are (a) the $BASIS line, which tells GAMESS-US to use the external file (EXTFIL=.TRUE.) and (b) the call to use external basis sets named STO2GBAS. In this external file, you can have multiple groups of elements and basis sets, but not multiple basis set types for the SAME ELEMENT in the same group (so far as I know at the moment). This means that the external basis set example file I have available here has Hydrogen (H) and Fluorine (F) STO-2G (STO2GBAS) and 3-21G (321GBASI) basis set groups, but that you CANNOT call the STO2GBAS Hydrogen basis set and the 321GBASI Fluorine basis set in the same calculation.

$CONTRL SCFTYP=RHF RUNTYP=OPTIMIZE COORD=ZMT $END
$BASIS EXTFIL=.TRUE. GBASIS=STO2GBAS $END
$CONTRL NZVAR=1 $END
$ZMAT IZMAT(1)=1,1,2 $END
$DATA
COMMENT: HF Z-matrix with mixed basis sets
C1
H
F 1 HFDist

HFDist=1.00
$END

The edit to the rungms file is as follows. You will need to change the EXTBAS control from its default (/dev/null) to the location of the external basis set file (which I have named EXTFILE.txt). If the basis set file is in the same directory as your GAMESS-US executable, then change the EXTBAS to ./EXTFILE.txt.

set echo
setenv ERICFMT ./ericfmt.dat
setenv IRCDATA ./$JOB.irc
setenv INPUT $SCR/$JOB.F05
setenv PUNCH ./$JOB.dat
setenv EXTBAS ./EXTFILE.txt
setenv AOINTS $SCR/$JOB.F08

The external basis sets can, of course, be downloaded from www.emsl.pnl.gov. My sample file, EXTFILE.txt, is below. Note the STO2GBAS and 321GBASI grouping. These labels MUST have 8 characters. Change the basis set choice by changing GBASIS in $BASIS in the input file.

H STO2GBAS
S 2
1 1.3097563770     0.4301284980
2 0.2331359740     0.6789135310

F STO2GBAS
S 2
1 63.73520200     0.4301280000
2 11.34483400     0.6789140000
L 2
1 2.498548000     0.4947200000E-01    0.5115410000
2 0.633698000     0.9637820000        0.6128200000

H 321GBASI
S 2
1 5.4471780000     0.1562850000
2 0.8245470000     0.9046910000
S 1
1 0.1831920000     1.0000000000

F 321GBASI
S 3
1 413.8010000     0.5854830000E-01
2 62.24460000     0.3493080000
3 13.43400000     0.7096320000
L 2
1 9.777590000     -0.4073270000     0.2466800000
2 2.086170000      1.223140000      0.8523210000
L 1
1 0.4823830000     1.000000000      1.000000000

You can download all of the program text and some brief comments in a single file located HERE.

In order to use the 6-31+G(d,p) and 6-311G(d,p) basis sets for a single molecule of mixed 1st/2nd/3rd row and 4th row elements, define an 8-character sequence for the group (like MIXEDBAS) and simply add all of the required basis sets for whatever elements into that group. Again, it is not obvious how to have two Hydrogen atoms in a single molecule have two different basis sets in a Z-matrix input by this (or any other) method in GAMESS-US, but the above goes a long way towards removing a major obstacle to running many molecules.
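
Building such a group by hand is just copy, paste, and relabel, which also makes it easy to script. The sketch below is an illustration rather than anything GAMESS-US ships: the function name mix_basis and its ELEMENT=LABEL argument style are made up, and it assumes the external file is laid out like EXTFILE.txt above, with an “Element LABEL” header line per block and blank lines between blocks.

```bash
#!/usr/bin/env bash
# Sketch only: mix_basis is a hypothetical helper, not part of GAMESS-US.
# Usage: mix_basis EXTFILE.txt MIXEDBAS H=STO2GBAS F=321GBASI
# Prints the chosen element blocks relabelled into one new 8-character group.
mix_basis() {
    local file=$1 group=$2 pair
    shift 2
    if [ "${#group}" -ne 8 ]; then
        echo "group label must be exactly 8 characters: $group" >&2
        return 1
    fi
    for pair in "$@"; do
        awk -v el="${pair%%=*}" -v src="${pair#*=}" -v dst="$group" '
            BEGIN { RS = ""; FS = "\n" }   # paragraph mode: one basis block per record
            {
                split($1, head, " ")       # header line: "Element LABEL"
                if (head[1] == el && head[2] == src) {
                    print el " " dst       # same element, new group label
                    for (i = 2; i <= NF; i++) print $i
                    print ""               # blank line between blocks
                }
            }
        ' "$file"
    done
}
```

Appending the output to the external file (for example “mix_basis EXTFILE.txt MIXEDBAS H=STO2GBAS F=321GBASI >> EXTFILE.txt”) and setting GBASIS=MIXEDBAS in $BASIS follows the procedure described above. It still gives one basis set per element, so it does nothing for the two-different-Hydrogens problem.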
