sed-Based Script For Converting NAMOT And NAMOT2 DNA Output To ffAMBER Format For GROMACS Topology Generation v1

In continuing efforts to streamline the simulation of atomistic DNA structures in GROMACS using the ffAMBER force field (the port of AMBER for GROMACS), the following script takes the .pdb output of NAMOT or NAMOT2 and does all of the atom label and atom label position conversions, correct 3' and 5' terminal H atom assignments, and random changes throughout the .pdb file to provide something that should flow seamlessly into GROMACS.

"Did you need to post the entire script and not just provide the downloadable text file as a link?" Of course, as I suspect no small number of people looking for how to convert a NAMOT pdb file into ffAMBER-speak will begin by searching based on GROMACS errors, which occur one missing residue label at a time. Hopefully, having the entire script readable by google and yahoo will cause it to pop up high in the search ranking.

Less searching, more simulating.

Now, there is one problem. NAMOT and NAMOT2 do not include the methyl group hydrogen atoms on the thymine residue, defaulting to a single C5M. In most of the GROMACS force fields, this is just fine, as the hydrogen atoms are subsumed into the methyl carbon (all non-polar C-H bonds are treated this way). For AMBER (ffAMBER, that is), all hydrogens are included. This fix is performed on the ffAMBER/GROMACS side in a modification to the .hdb and .rtp files that I will describe in an upcoming post.

How to use:

As a series of sed operations, you obviously need sed, which is available for all platforms and "pre-installed" with any self-respecting Linux/UNIX distro (which, of course, means OSX (the OS under which the script was generated).

To run this script, have the script and your NAMOT/NAMOT2-generated .pdb in the same directory and type:

./NAMOT_to_ffAMBER_in_GROMACS.script FILENAME.pdb NN

Where:

NAMOT_to_ffAMBER_in_GROMACS.script is the name of the script

FILENAME.pdb is the .pdb file (include the .pdb)

NN is the number of bases in each strand. This number is required in order to correctly change the atom types on the 3' end of each strand.

This script is downloadable form the following link: NAMOT_to_ffAMBER_in_GROMACS.script

I also include a 35-base C-G double helix NAMOT .pdb file at C_G_NAMOT.pdb. To test the script on your machine, type the following in a Terminal window:

./NAMOT_to_ffAMBER_in_GROMACS.script C_G_NAMOT.pdb 35

As usual, if you have problems, comments, questions, concerns, etc. please either make an account and post a comment for this post or send me an email and I'll keep the running tally.

C_Gpdb_QuteMolX_image_may2008

Also, this same scripting procedure works just fine for the GROMOS96 force fields (ffG53a5, ffG53a6, etc.) and I'll be posting the one I use for those calculations in short order (they are, in fact, easier to work with for GROMACS, as they also neglect the methyl group hydrogen atoms on the Thymine. In fact, they neglect ALL of the non-polar C-H bonds, so you end up deleting atoms from the NAMOT/NAMOT2 .pdb files).

################################################################################
#
# Questions?  Problems?  Complaints?  Better Ideas?
# Damian Allis, damian@somewhereville.com, www.somewhereville.com
#
# This script takes the double helix output from NAMOT and NAMOT2 (a and b
# strands) and converts them into a format that the current ffAMBER
# implementation for GROMACS can use in the generation of the GROMACS .top file.
#
################################################################################
#
# Generally, the following list of GROMACS runs should get you through an
# energy minimization without problem.  Note only 10 cations are added
# to your structure.  Change accordingly (or don't.  It doesn't matter for
# the test).
#
# Run these in order:
#
# pdb2gmx -nomerge -f DNA.pdb -o DNA_pdb2gmx.gro -p DNA_pdb2gmx.top
# editconf -f DNA_pdb2gmx.gro -o DNA_editconf.gro -d 1.0 -bt triclinic
# genbox -cp DNA_editconf.gro -cs -o DNA_genbox.gro -p DNA_pdb2gmx.top
# grompp -f em -c DNA_genion.gro -p DNA_pdb2gmx.top -o DNA_grompp2em.tpr
# genion -np 10 -norandom -pname Na -o DNA_genion.gro -s DNA_gromppem.tpr
#   -p DNA_pdb2gmx.top (this .top goes in the same line as the genion)
# grompp -f em -c DNA_genion.gro -p DNA_pdb2gmx.top -o DNA_grompp2em.tpr2
# mdrun -s DNA_grompp2em.tpr -o DNA_md_em.trr -c DNA_md_em.pdb -v
#
################################################################################
#
# In case you don't have one handy, here's the contents of an em.mpd file
# for use in the energy minimization test.
#
# Copy this content below, remove the "#", save as a text filed named
# -> em.mpd
#
# cpp                 =  /usr/bin/cpp
# define              =  -DFLEXIBLE
# integrator          =  steep
# nsteps              =  5000
# emtol               =  10.0
# emstep              =  0.01
# nstcgsteep          =  100
# coulombtype         = PME
# rvdw                = 1.0
# rlist               = 1.1
# rcoulomb            = 1.1
# pme_order           = 4
# ewald_rtol          = 1e-5
# vdwtype             = shift
# ns_type             = grid
# nstlist             = 10
#
################################################################################
#
# Here's the command line:
#
# ./NAMOT_to_ffAMBER_in_GROMACS.sed $1 $2
#
# $1 = file name (including the .pdb, as I often forget to not include it)
# $2 = number of the 3' base for conversion into Dn3 (n = A,T,G,C)
# the number in $2 will automatically do the 3' and 5' conversion (keep the
# terminal hydrogens on the PO4- groups)
#
################################################################################
################################################################################
#
# The magic happens below.
#
################################################################################
################################################################################
#
# First thing first, make a backup of the original pdb file in case you goof.
#
cp $1 $1_original
#
################################################################################
#
# This section converts all of the "*" with "z" so that you're not using the
# asterisk during the editing.  Replacing with the ffAMBER-requisite
# "single-quote" (') makes the sed script more complicated than it needs to be.
#
sed 's/*/z/' $1 > $1_temp
rm $1
mv $1_temp $1
#
################################################################################
#
# This section changes the nitrogen hydrogen (NH2) labels to those expected by
# ffMABER.  Hn2/1, where n is the atom number in the formal labeling scheme.
#
# THYIME is its own problem, as the methyl carbon needs modification.
# This is addressed by a separate ffAMBER modification.  Check
# https://www.somewhereville.com/?cat=74 (my AMBER category) for details.
#
#
# HN2A/B occurs in GUANINE.
#
sed 's/HN2A/ H21/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/HN2B/ H22/' $1 > $1_temp
rm $1
mv $1_temp $1
#
#
# HN4A/B occurs in CYTOSINE.
#
sed 's/HN4A/ H41/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/HN4B/ H42/' $1 > $1_temp
rm $1
mv $1_temp $1
#
#
# HN6A/B occurs in ADENINE.
#
sed 's/HN6A/ H61/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/HN6B/ H62/' $1 > $1_temp
rm $1
mv $1_temp $1
#
################################################################################
#
# This section converts ADE, CYT, GUA, THY in the NAMOT output to DA, DC, DG, DT
# in accord with the topology labels used by ffAMBER in GROMACS for the nucleic
# acids.
#
#
sed 's/ADE / DA /' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/CYT / DC /' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/GUA / DG /' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/THY / DT /' $1 > $1_temp
rm $1
mv $1_temp $1
#
################################################################################
#
# This is the bulk of the conversion, moving atoms around and formatting.
# Mostly, just moving atom labels over one column to the left.  This doesn't
# necessarily have to be done, but conforms to pdb format better and some
# program may need the atom labels in the columns as defined below.
#
# This section changes all of the hydrogen atom labels.
#
sed 's/1H5z/H5z1/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/2H5z/H5z2/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/H2Az/H2z1/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/H2Bz/H2z2/' $1 > $1_temp
rm $1
mv $1_temp $1
#
# These are global changes in column position and fix all nucleic acids.
#
sed 's/  P  D/P    D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ N9  D/N9   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ N7  D/N7   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ N6  D/N6   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ N3  D/N3   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ N1  D/N1   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ C8  D/C8   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ C6  D/C6   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ C5  D/C5   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ C4  D/C4   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ C2  D/C2   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ H8  D/H8   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ H2  D/H2   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ H3  D/H3   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ H6  D/H6   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ O3  D/O3   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ O4  D/O4   D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ O2  D/O    D/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/C5M  D/C7   D/' $1 > $1_temp
rm $1
mv $1_temp $1
#
################################################################################
#
# This section converts the 5' end of both chains into ffAMBER format.  Always
# begins with "1".  If you deleted some of the double strand at the 5' end and
# the first base number is NOT 1, this script will still run but give you
# a final structure that will require additional modification before running
# the pdb2gmx topology generator.
#
################################################################################
#
# chain a ADENINE adjustment
#
sed 's/  DA a   1/ DA5 a   1/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ HB DA5/H5T DA5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O5T DA5/O5z DA5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ HE  DA/H3T DA3/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O3T  DA/O3z DA3/' $1 > $1_temp
rm $1
mv $1_temp $1
#
# chain b ADENINE adjustment
#
sed 's/  DA b   1/ DA5 b   1/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ HB DA5/H5T DA5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O5T DA5/O5z DA5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ HE  DA/H3T DA3/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O3T  DA/O3z DA3/' $1 > $1_temp
rm $1
mv $1_temp $1
#
################################################################################
#
# chain a THYMINE adjustment
#
sed 's/  DT a   1/ DT5 a   1/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ HB DT5/H5T DT5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O5T DT5/O5z DT5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ HE  DT/H3T DT3/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O3T  DT/O3z DT3/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/  DG a   1/ DG5 a   1/' $1 > $1_temp
rm $1
mv $1_temp $1
#
# chain b THYMINE adjustment
#
sed 's/  DT b   1/ DT5 b   1/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ HB DT5/H5T DT5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O5T DT5/O5z DT5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ HE  DT/H3T DT3/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O3T  DT/O3z DT3/' $1 > $1_temp
rm $1
mv $1_temp $1
#
################################################################################
#
# chain a GUANINE adjustment
#
sed 's/ HB DG5/H5T DG5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O5T DG5/O5z DG5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ HE  DG/H3T DG3/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O3T  DG/O3z DG3/' $1 > $1_temp
rm $1
mv $1_temp $1
#
# chain b GUANINE adjustment
#
sed 's/  DG b   1/ DG5 b   1/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ HB DG5/H5T DG5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O5T DG5/O5z DG5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ HE DG3/H3T DG3/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O3T DG3/O3z DG3/' $1 > $1_temp
rm $1
mv $1_temp $1
#
################################################################################
#
# chain a CYTOSINE adjustment
#
sed 's/  DC a   1/ DC5 a   1/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ HB DC5/H5T DC5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O5T DC5/O5z DC5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ HE  DC/H3T DC3/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O3T  DC/O3z DC3/' $1 > $1_temp
rm $1
mv $1_temp $1
#
# chain b CYTOSINE adjustment
#
sed 's/  DC b   1/ DC5 b   1/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ HB DC5/H5T DC5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O5T DC5/O5z DC5/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ HE DC3/H3T DC3/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/O3T DC3/O3z DC3/' $1 > $1_temp
rm $1
mv $1_temp $1
#
################################################################################
#
# This section changes the last base in each chain (a and b) from the default
# "Dn" to "Dn3" so that the topology generation gets the 3' end correct.
# Goes by units, tens, hun, thou and searches specifically for the pattern
# in question (taking care to follow the standard  format for base number.
#
# NOTE: We do the junction, crossover, etc. generation outside of NAMOT.
# Therefore, each file output by NAMOT only has chain "a" and chain "b".
#
################################################################################
#
# changes the 3' strand if the length is from 1 to 9 (units)
# strand 1/a
#
sed 's/ DA a   '$2'/DA3 a   '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DC a   '$2'/DC3 a   '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DG a   '$2'/DG3 a   '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DT a   '$2'/DT3 a   '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
#
# changes the 3' strand if the length is from 1 to 9 (units)
# strand 2/b
#
sed 's/ DA b   '$2'/DA3 b  '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DC b   '$2'/DC3 b   '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DG b   '$2'/DG3 b   '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DT b   '$2'/DT3 b   '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
#
################################################################################
#
# changes the 3' strand if the length is from 10 to 99 (tens)
# strand 1/a
#
sed 's/ DA a  '$2'/DA3 a  '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DC a  '$2'/DC3 a  '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DG a  '$2'/DG3 a  '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DT a  '$2'/DT3 a  '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
#
# changes the 3' strand if the length is from 10 to 99 (tens)
# strand 2/b
#
sed 's/ DA b  '$2'/DA3 b  '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DC b  '$2'/DC3 b  '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DG b  '$2'/DG3 b  '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DT b  '$2'/DT3 b  '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
#
################################################################################
#
# changes the 3' strand if the length is from 100 to 999 (hund)
# strand 1/a
#
sed 's/ DA a '$2'/DA3 a '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DC a '$2'/DC3 a '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DG a '$2'/DG3 a '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DT a '$2'/DT3 a '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
#
# changes the 3' strand if the length is from 100 to 999 (hund)
# strand 2/b
#
sed 's/ DA b '$2'/DA3 b '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DC b '$2'/DC3 b '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DG b '$2'/DG3 b '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DT b '$2'/DT3 b '$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
#
################################################################################
#
# changes the 3' strand if the length is from 1000 to 9999 (thou)
# strand 1/a
#
sed 's/ DA a'$2'/DA3 a'$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DC a'$2'/DC3 a'$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DG a'$2'/DG3 a'$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DT a'$2'/DT3 a'$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
#
# changes the 3' strand if the length is from 1000 to 9999 (thou)
# strand 2/b
#
sed 's/ DA b'$2'/DA3 b'$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DC b'$2'/DC3 b'$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DG b'$2'/DG3 b'$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
sed 's/ DT b'$2'/DT3 b'$2'/' $1 > $1_temp
rm $1
mv $1_temp $1
#
################################################################################
#
# Home stretch.  Changes all of the "z" atoms in the pdb file to ' (single-
# quotes) for ffMABER.
#
sed s/\z/\'/g $1 > $1_temp
rm $1
mv $1_temp $1_proper_pdb
#
#
################################################################################
#
# Questions?  Problems?  Complaints?  Better Ideas?
# Damian Allis, damian@somewhereville.com, www.somewhereville.com
#
################################################################################

www.somewhereville.com/?p=114
en.wikipedia.org/wiki/DNA
www.gromacs.org
chemistry.csulb.edu/ffamber
amber.scripps.edu
namot.lanl.gov
en.wikipedia.org/wiki/Thymine
en.wikipedia.org/wiki/Sed
en.wikipedia.org/wiki/Linux
en.wikipedia.org/wiki/Unix
www.apple.com/macosx

DNA-Specific (But Generally Applicable) AMBER With GROMACS 3.3.x: Installation And Notes

The following is the full procedure for installing the AMBER force field port for GROMACS (AMBER-in-GROMACS, AMBER-with-GROMACS, AMBER-on-GROMACS, whatever you want to call it) developed by Eric Sorin at California State University, Long Beach, providing a bit more depth in the installation process (specifically for GROMACS 3.3.x) and a few modified GROMACS files.

As brief background, AMBER (Assisted Model Building and Energy Refinement) is one of THE dominant molecular mechanics/molecular dynamics (MM/MD) force fields used today in biochemical simulations. The motivation for this page (my installing AMBER for use in GROMACS) stems from the current Nanorex focus on Structural DNA Nanotechnology (SDN) modeling, for which we're working on a reduced model force field for large-structure energy minimizations and, importantly, integrating the GROMACS MM/MD package for use via our CAD interface. You can read more about this in the poster presented at FNANO08 this past April. As a force field validated for DNA simulations, AMBER meets our needs of performing atomistic simulations on DNA nanostructures. While NAMD is also a possibility for DNA simulations, GROMACS meets Nanorex's open source needs.

Needless to say, finding that Sorin and co-workers had ported AMBER to GROMACS was a wonderful discovery (and that this same port appears to be driving the Folding@Home project). The installation directions on the AMBER-on-GROMACS website are good, but they don't mention a few important steps that I spent no small amount of time trying to figure out (and I'm definitely not complaining. 99% of the work was the porting, which has been graciously handed to the GROMACS community on a silver platter). Below is the complete set of steps required to get, as the first example, the Dickerson.pdb sample file provided with the porting files to energy minimize and MD correctly. Issues related to DNA-based simulations not using the Dickerson.pdb file will be covered in another post.

Dickerson DNA

QuteMol rendering of Dickerson.pdb (from Drew, H. R., Wing, R. M., Takano, T., Broka, C., Tanaka, S., Itakura, K. & Dickerson, R. E. (1981). Structure of a B-DNA dodecamer: conformation and dynamics. Proc. Natl. Acad. Sci. USA 78, 2179-2183. Protein Data Bank structure 1BNA).

Install/compile a custom installation of GROMACS

The AMBER port to GROMACS does not require modification of the GROMACS code but does require a few changes to force field text files. In the interest of remembering what was changed and what wasn't, I recommend a custom compilation of GROMACS or, at the very least, the installation of another copy of GROMACS you can modify the top directory of without risking changes to an already working GROMACS version.

If you don't know how to compile your own version of GROMACS, I recommend taking a look at Compiling Single-Precision And Double-Precision GROMACS 3.3.3 With OpenMPI 1.2.6 Under OSX 10.5 (Leopard).

My installation directory is /usr/local/gromacs333-amber (which I will refer to here as /ULG-A/ and all directory calls will be to this. Obviously, change all directory calls to match yours.

Download GROMACS-Ported AMBER force field files

I'm running a double-precision MPI version of GROMACS 3.3.3, meaning I'm using the GROMACS-ported AMBER force field files for GROMACS 3.3.1 (May 2006). And get the version "with pdfs." If you're going to be doing these calculations, you should know where the math come from!

Copy vdwradii.dat and aminoacids-NA.dat into /usr/local/gromacs-amber/share/gromacs/top

You will need to be logged in as root to do this (sudo cp FILENAME /ULG-A/top). The AMBER port file vdwradii.dat differs from the vdwradii.dat file in the /ULG-A/top directory of the GROMACS install by the addition of 5 lines:

???  P     0.15
???  LP1   0
???  LP2   0
LYSH MNZ1  0
LYSH MNZ2  0

With no removals or number changes.

The aminoacids.dat file is a little different. In order to use pdb2gmx to generate nucleic acid topology files, GROMACS requires the list of residue codes that reside in aminoacids-NA.dat. So far as I can tell, using aminoacids-NA.dat instead of aminoacids.dat does not change the handling of amino acids (why two files exist for a reason other than amino acid-centric organization is beyond my pay scale), so we'll be using aminoacids-NA.dat exclusively.

Move/delete/rename aminoacids.dat and rename aminoacids-NA.dat to aminoacids.dat

The aminoacids-NA.dat file includes 32 additional residue codes, accounting for the 3' (includes terminal H atom), 5' (included terminal H atom), in-strand (nucleic acid repeat unit) and molecule (individual hydrogen-terminated sugar and nucleic acid) topology information codes for DNA and RNA found in the .rtp files

Copy the ffamber files of interest into the /ULG-A/top directory

If you're doing this in Linux or OSX, you might as well move all of the files you want installed into a single directory and su cp * /ULG-A/top. Note that the .itp and .gro files in your ffamber_v3.3.1 (for the 3.3.x GROMACS) are the same for all of the AMBER force field flavors you can install. Further, all of the ffamber* files have unique names (94, 99, 03, etc.), so you can install all the force field files into /ULG-A/top and use whichever.

Modify /ULG-A/top/FF.dat to reflect the new force field files

The FF.dat file as installed by GROMACS looks like the following:

9
ffG43a1  GROMOS96 43a1 force field
ffG43b1  GROMOS96 43b1 vacuum force field
ffG43a2  GROMOS96 43a2 force field (improved alkane dihedrals)
ffG45a3  GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)
ffG53a5  GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)
ffG53a6  GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)
ffoplsaa OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)
ffencadv Encad all-atom force field, using scaled-down vacuum charges
ffencads Encad all-atom force field, using full solvent charges    

To include the AMBER force field files (so that these separate files can be called during pdb2gmx), we modify the FF.dat file by (1) incrementing the total number of force fields and (2) adding the force field code and text description to this file. The new FF.dat file will look like this if you added all of the ffamber force field files:

17
ffG43a1  GROMOS96 43a1 force field
ffG43b1  GROMOS96 43b1 vacuum force field
ffG43a2  GROMOS96 43a2 force field (improved alkane dihedrals)
ffG45a3  GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)
ffG53a5  GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)
ffG53a6  GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)
ffoplsaa OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)
ffencadv Encad all-atom force field, using scaled-down vacuum charges
ffencads Encad all-atom force field, using full solvent charges
ffamber94 AMBER94 Cornell et al. (1995), JACS 117, 5179-5197
ffamber96 AMBER96 Kollman (1996), Acc. Chem. Res. 29, 461-469
ffamberGS AMBER99GS Garcia & Sanbonmatsu (2002), PNAS 99, 2782-2787
ffamberGSs AMBER99GSs Nymeyer & Garcia (2003) PNAS 100, 13934-13939
ffamber99 AMBER99 Wang et al. (2000), J. Comp. Chem. 21, 1049-1074
ffamber99p AMBER99p Sorin & Pande (2005). Biophys. J. 88(4), 2472-2493
ffamber99SB AMBER99sb Hornak et. al (2006). Proteins 65, 712-725
ffamber03 AMBER03 Duan et al. (2003), J. Comp. Chem. 24, 1999-2012

Add and increment numbers as appropriate for the force field you want to use. You can download the complete FF.dat at the following link and comment:

Download amber_FF.txt, delete "amber_", change ".txt" to ".dat", place in /ULG-A/top

source /UGL-A/bin/GMXRC

The GMXRC file contains path-dependent settings for your shell.

In theory, you're all done. At least, this is the end of the installation process according to the official AMBER-on-GROMACS list. The IMPORTANT NOTES section contains relevant information but does not complete the installation. The additional steps are provided below.

Modify the /ULG-A/top/spc.itp file

If you're running through a complete energy minimization calculation with the website installation followed, your first source of error (hopefully) comes in the form of…

Program grompp_amber, VERSION 3.3.3
Source code file: toppush.c, line: 1193

Fatal error:
[ file "/usr/local/gromacs333_amber/share/gromacs/top/spc.itp", line 32 ]:
Atom index (1) in bonds out of bounds (1-0).
This probably means that you have inserted topology section "bonds"
in a part belonging to a different molecule than you intended to.
In that case move the "bonds" section to the right molecule.

The origin of this error is the lack of amberXX water parameters in the spc.itp file (nothing directly to do with the reported GROMACS error). The fix for this is straightforward, simply modifying the spc.itp file in the /ULG-A/top directory as follows below. The original spc.itp file below…

[ moleculetype ]
; molname       nrexcl
SOL             2

[ atoms ]
;   nr   type  resnr residue  atom   cgnr     charge       mass
#ifdef _FF_GROMACS
1     OW      1    SOL     OW      1      -0.82
2     HW      1    SOL    HW1      1       0.41
3     HW      1    SOL    HW2      1       0.41
#endif
#ifdef _FF_GROMOS96
#ifdef HEAVY_H
1     OW      1    SOL     OW      1      -0.82    9.95140
2      H      1    SOL    HW1      1       0.41    4.03200
3      H      1    SOL    HW2      1       0.41    4.03200
#else
1     OW      1    SOL     OW      1      -0.82   15.99940
2      H      1    SOL    HW1      1       0.41    1.00800
3      H      1    SOL    HW2      1       0.41    1.00800
#endif
#endif
#ifdef _FF_OPLS
1  opls_116   1    SOL     OW      1      -0.82
2  opls_117   1    SOL    HW1      1       0.41
3  opls_117   1    SOL    HW2      1       0.41
#endif

#ifdef FLEXIBLE
[ bonds ]
; i     j       funct   length  force.c.
1       2       1       0.1     345000  0.1     345000
1       3       1       0.1     345000  0.1     345000

[angles ]
; i     j       k       funct   angle   force.c.
2       1       3       1       109.47  383     109.47  383
#else
[ settles ]
; OW    funct   doh     dhh
1       1       0.1     0.16330

[ exclusions ]
1       2       3
2       1       3
3       1       2
#endif

is modified to this…

[ moleculetype ]
; molname       nrexcl
SOL             2

[ atoms ]
;   nr   type  resnr residue  atom   cgnr     charge       mass
#ifdef _FF_GROMACS
1     OW      1    SOL     OW      1      -0.82
2     HW      1    SOL    HW1      1       0.41
3     HW      1    SOL    HW2      1       0.41
#endif
#ifdef _FF_GROMOS96
#ifdef HEAVY_H
1     OW      1    SOL     OW      1      -0.82    9.95140
2      H      1    SOL    HW1      1       0.41    4.03200
3      H      1    SOL    HW2      1       0.41    4.03200
#else
1     OW      1    SOL     OW      1      -0.82   15.99940
2      H      1    SOL    HW1      1       0.41    1.00800
3      H      1    SOL    HW2      1       0.41    1.00800
#endif
#endif
#ifdef _FF_OPLS
1  opls_116   1    SOL     OW      1      -0.82
2  opls_117   1    SOL    HW1      1       0.41
3  opls_117   1    SOL    HW2      1       0.41
#endif

#ifdef _FF_AMBER94
; also applies to FF_AMBER96, FF_AMBERGS, and FF_AMBERGSs
1  amber94_42   1  SOL     OW      1      -0.82  15.99940
2  amber94_27   1  SOL    HW1      1       0.41   1.00800
3  amber94_27   1  SOL    HW2      1       0.41   1.00800
#endif

#ifdef _FF_AMBER99
; also applies to FF_AMBER99P, FF_AMBER99SB, FF_AMBER03
1  amber99_54   1  SOL     OW      1      -0.82  15.99940
2  amber99_55   1  SOL    HW1      1       0.41   1.00800
3  amber99_55   1  SOL    HW2      1       0.41   1.00800
#endif



#ifdef FLEXIBLE
[ bonds ]
; i     j       funct   length  force.c.
1       2       1       0.1     345000  0.1     345000
1       3       1       0.1     345000  0.1     345000

[ angles ]
; i     j       k       funct   angle   force.c.
2       1       3       1       109.47  383     109.47  383
#else
[ settles ]
; OW    funct   doh     dhh
1       1       0.1     0.16330

[ exclusions ]
1       2       3
2       1       3
3       1       2
#endif

This FFAMBER-included spc.itp file is downloadable in the following form and comment:

Download amber_spc.txt, delete "amber_" change ".txt" to ".itp", place in /ULG-A/top

At this point, pdb2gmx, editconf, genbox, grompp, and genion should all work without error. The next problem comes with the post-genion grompp, which can't find ion information…

Modify ions.itp

Similar to the spc.itp error, the error that comes up with the inclusion of counterions confounds the mind because everything seems to be in place.

Program grompp_amber, VERSION 3.3.3
Source code file: toppush.c, line: 1396

Fatal error:
No such moleculetype Na

With Na being whatever counterion you're trying. The AMBER-on-GROMACS website reports the following for AMBER ion inclusion:

Our AMBER ports include common ion definitions, which are listed in the ffamber*.rtp files (just below the TIP water models). This allows the AMBER ports to be used without modification or use of the GROMACS ions.itp file. At the moment these include Cl- , IB+, Na+, K+ , Rb+, Cs+, Li+ , Ca2+, Mg2+, Zn2+, Sr2+, and Ba2+. If your pdb file has ions present and pdb2gmx does not properly convert those ions, please check the atom and residue name of your ions and rename them if necessary to agree with the .rtp file. If you are using ion-related GROMACS tools, such as genion, you will need to enter the AMBER ion definition to the ions.itp file in the "top" directory of the GROMACS distribution.

The bases for the ion use are in the .rtp files, but the implementation, at least for GROMACS 3.3.3 and possibly earlier versions, is not as found in these files.

The proper format for both the FFAMBER_94 and FFAMBER_99 ion types in ions.itp is as follows:


#ifdef _FF_AMBER94

[ moleculetype ]
; molname       nrexcl
Na              1
[ atoms ]

; id    at type         res nr  residu name     at name  cgnr   charge    mass
1       amber94_31      1       Na              Na       1      1         22.99000
#endif

#ifdef _FF_AMBER99

[ moleculetype ]
; molname       nrexcl
Na              1
[ atoms ]

; id    at type         res nr  residu name     at name  cgnr   charge    mass
1       amber99_31      1       Na              Na       1      1         22.9900
#endif

This is the format used for the FF_GROMOS96 ions with one change. The atom type for the FF_AMBERXX parameters is NOT the atom label, but instead the AMBER force field label. In the case of Na, this is amber94_31 (for AMBER94) and amber99_31 (for AMBER99). Note that the mass must be specified (you have to scroll down the ions.itp file to see FF_GROMOS96. FF_GROMACS does not require mass specification).

For quick reference, here are the following [ atoms ] specifications for FF_AMBER94 and FF_AMBER_99.

#ifdef _FF_AMBER94
; id    at type         res nr  residu name     at name  cgnr   charge    mass
1       amber94_32      1       IB              IB       1      1         131.00000
1       amber94_56      1       Li              Li       1      1         6.94000
1       amber94_31      1       Na              Na       1      1         22.99000
1       amber94_51      1       K               K        1      1         39.10000
1       amber94_52      1       Rb              Rb       1      1         85.47000
1       amber94_53      1       Cs              Cs       1      1         132.90000
1       amber94_33      1       Mg              Mg       1      2         24.30500
1       amber94_15      1       Ca              Ca       1      2         40.08000
1       amber94_57      1       Zn              Zn       1      2         65.40000
1       amber94_58      1       Sr              Sr       1      2         87.62000
1       amber94_59      1       Ba              Ba       1      2         137.33000
1       amber94_30      1       Cl              Cl       1      -1        35.45000
#endif

#ifdef _FF_AMBER99
; id    at type         res nr  residu name     at name  cgnr   charge    mass
1       amber99_32      1       IB              IB       1      1         131.00000
1       amber99_56      1       Li              Li       1      1         6.94000
1       amber99_31      1       Na              Na       1      1         22.99000
1       amber99_51      1       K               K        1      1         39.10000
1       amber99_52      1       Rb              Rb       1      1         85.47000
1       amber99_53      1       Cs              Cs       1      1         132.90000
1       amber99_33      1       Mg              Mg       1      2         24.30500
1       amber99_15      1       Ca              Ca       1      2         40.08000
1       amber99_57      1       Zn              Zn       1      2         65.40000
1       amber99_58      1       Sr              Sr       1      2         87.62000
1       amber99_59      1       Ba              Ba       1      2         137.33000
1       amber99_30      1       Cl              Cl       1      -1        35.45000
#endif

To save yourself plenty of trouble, download the ions.itp file with all of the ion parameters in the form of the following link:

Download amber_ions.txt, delete "amber_", change ".txt" to ".itp", place in /ULG-A/top

With the ions.itp file complete, your GROMACS energy minimizations and dynamics simulations of amino acid and nucleic acid structures should go without hitch.

My thanks to Alan Wilter S. da Silva, D.Sc. – CCPN Research Associate, Department of Biochemistry, University of Cambridge, for catching a formatting error in the first version of the amber_ions.itp file.

If you find problems, questions, concerns, incompatibilities, etc., please let me know by setting up a user account and posting a comment in this post or email me so I can post whatever you might find.

And don't forget to cite ffAMBER if you use it!

Sorin & Pande (2005). Biophys. J. 88(4), 2472-2493

chemistry.csulb.edu/ffamber
www.gromacs.org
chemistry.csulb.edu/esorin
chemistry.csulb.edu
amber.scripps.edu
www.nanorex.com
en.wikipedia.org/wiki/DNA_nanotechnology
www.somewhereville.com/rescv/fnano08_2008_poster.jpg
www.cs.duke.edu/~reif/FNANO
www.ks.uiuc.edu/Research/namd
en.wikipedia.org/wiki/Open_source
folding.stanford.edu
qutemol.sourceforge.net
www.pdb.org
www.pdb.org/pdb/explore/explore.do?structureId=1BNA
www.linux.org
www.apple.com/macosx
www.gromacs.org/documentation/reference/online/pdb2gmx.html