Stupid-Simple (*nix-Specific) Sed Scripts To Get (All Current) Gaussian09 Output Files Working With aClimax

The following three snippets of Gaussian output are for an optimization and normal mode analysis of simple olde methane (CH4).

...
 ******************************************
 Gaussian 03:  EM64L-G03RevE.01 11-Sep-2007
                31-Aug-2014 
 ******************************************
...
 incident light, reduced masses (AMU), force constants (mDyne/A),
 and normal coordinates:
                     1                      2                      3
                     T                      T                      T
 Frequencies --  1356.0070              1356.0070              1356.0070
 Red. masses --     1.1789                 1.1789                 1.1789
 Frc consts  --     1.2771                 1.2771                 1.2771
 IR Inten    --    14.1122                14.1122                14.1122
 Atom AN      X      Y      Z        X      Y      Z        X      Y      Z
   1   1     0.02  -0.42   0.43    -0.34  -0.13  -0.08    -0.36  -0.23  -0.23
   2   6     0.00   0.08  -0.09     0.00   0.09   0.08     0.12   0.00   0.00
...
 -------------------
 - Thermochemistry -
 -------------------
 Temperature   298.150 Kelvin.  Pressure   1.00000 Atm.
 Atom  1 has atomic number  1 and mass   1.00783
...
...
 ******************************************
 Gaussian 09:  EM64L-G09RevA.02 11-Jun-2009
                31-Aug-2014 
 ******************************************
...
 incident light, reduced masses (AMU), force constants (mDyne/A),
 and normal coordinates:
                     1                      2                      3
                     T                      T                      T
 Frequencies --  1356.0058              1356.0058              1356.0058
 Red. masses --     1.1789                 1.1789                 1.1789
 Frc consts  --     1.2771                 1.2771                 1.2771
 IR Inten    --    14.1123                14.1123                14.1123
  Atom  AN      X      Y      Z        X      Y      Z        X      Y      Z
     1   1    -0.03   0.42   0.43    -0.34  -0.14   0.07    -0.36  -0.23   0.23
     2   6     0.00  -0.08  -0.10     0.01   0.10  -0.08     0.12   0.00   0.00
...
-------------------
 - Thermochemistry -
 -------------------
 Temperature   298.150 Kelvin.  Pressure   1.00000 Atm.
 Atom     1 has atomic number  1 and mass   1.00783
...
...
 ******************************************
 Gaussian 09:  EM64L-G09RevD.01 24-Apr-2013
                31-Aug-2014 
 ******************************************
...
 incident light, reduced masses (AMU), force constants (mDyne/A),
 and normal coordinates:
                      1                      2                      3
                     ?A                     ?A                     ?A
 Frequencies --   1356.0132              1356.0132              1356.0132
 Red. masses --      1.1789                 1.1789                 1.1789
 Frc consts  --      1.2771                 1.2771                 1.2771
 IR Inten    --     14.1119                14.1119                14.1119
  Atom  AN      X      Y      Z        X      Y      Z        X      Y      Z
     1   1     0.02   0.42   0.43     0.34  -0.14   0.08    -0.36   0.23  -0.23
     2   6     0.00  -0.08  -0.09    -0.01   0.09  -0.08     0.12   0.00   0.00
...
 -------------------
 - Thermochemistry -
 -------------------
 Temperature   298.150 Kelvin.  Pressure   1.00000 Atm.
 Atom     1 has atomic number  1 and mass   1.00783
...

Two of these things are not like the other. The data are nearly identical, thank heavens. (Unfortunately, Gaussian09 D.01 didn’t see the fully-optimized methane as belonging to the Td point group, despite all three versions being run with the same exact input file, but a rigorous re-symmetrization would have taken care of that.) There are, however, some subtle formatting differences among all three versions (including differences between the two Gaussian09 versions) that cause the venerable, all-encompassing aClimax program (developed by Timmy, the venerable, all-encompassing A. J. Ramirez-Cuesta) to throw the following errors for all three cases when you use *.log files from a *nix (UNIX, Linux) machine.

Serious Error: A-CLIMAX has encountered an unhanded error. Please Save your data and contact support
aClimax: Quote Error Number 9
Error Loading File: Error reading data. Please check and try again.
aClimax: WARNING loaded file containing no frequencies

Problem number 1 is the existence of *nix newlines (a bare line feed, without the carriage return that DOS puts in front of it) in the *.log files coming off a *nix machine. After performing a conversion from *nix to DOS (for myself, using LineBreak in OSX, but tofrodos works just as well), the Gaussian03 file opens just fine in aClimax:
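
If you are not sure which kind of line endings a given file has, counting the lines that end in a carriage return will tell you. What follows is a minimal sketch under my own assumptions, not anything provided by aClimax or Gaussian: the function name line_endings is made up for illustration, and a file with mixed endings is simply reported as "dos".

```shell
#!/bin/sh
# Hypothetical helper: report "dos" if any line in the file ends in a
# carriage return (CR), and "unix" otherwise.
line_endings() {
    cr=$(printf '\r')                       # a literal carriage return
    # grep -c prints the number of lines ending in CR; it exits non-zero
    # when that number is 0, hence the "|| true"
    n=$(grep -c "${cr}\$" "$1" || true)
    if [ "$n" -gt 0 ]; then
        echo "dos"
    else
        echo "unix"
    fi
}

# Usage (methane.log is a placeholder file name):
#   line_endings methane.log
```

A file that reports "unix" here is the kind that needs the newline conversion before aClimax will read it.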

File Loaded: Data Loaded Succesfully [sic].

This, unfortunately, does not improve the matter with the Gaussian09 files, which produce the following error:

Error: One of the numbers you have entered is of the wrong type.Please recheck and try again
Error Loading File: Error reading data. Please check and try again.

Given how little of the .log file aClimax actually needs to produce simulated inelastic neutron scattering (INS) spectra, I ran the methane normal mode analyses in three different Gaussian versions to determine what, in G09, was changed to make it just un-G03 enough to fail to load. With those changes figured out, I had a Perl script drafted up that would have converted everything back to the original G03 format. It was awesome. That said, after a small amount of testing to see where aClimax’s sensitivities lay, I discovered that very little of the .log file contents needed to be changed out, meaning that simple sed scripts would work just as well for those of us using our Windows boxes (or VirtualBox emulations) only for that “one stupid program” that keeps us having to log in (and, by that, I mean that we have sed already on our computers).

So, the problems between G09 and aClimax not related to carriage returns lie in two places.

1. The spacing of “Atom AN” – at the top of the eigenvector lists are the column labels, beginning with “Atom AN” – or something very close to “Atom AN” (the “|” in the boxes below mark the left edge of the output):

G03 E01 | Atom AN
G09 A02 |   Atom  AN
G09 D01 |  Atom  AN

Yes, the addition of a space or two results in a read error by aClimax. I would call this an… aggressive stringency in aClimax. That said, what did the original spacing in the G03 versions fail to do that the new spacing does in G09?

2. The spacing of “Atom N” – In the “Thermochemistry” section below the eigenvectors, atomic masses are listed as “Atom N” – or something very close to “Atom N” (again, the “|” in the boxes below mark the left edge of the output):

G03 E01 |  Atom  1
G09 A02 |    Atom     1
G09 D01 |   Atom     1

This change in spacing is also enough to cause aClimax to error out.
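
Because the two differences above are nothing but whitespace, they are easy to miss by eye. One way to check your own files is to print the two label lines with every space turned into an underscore. This is a sketch only; the function name show_atom_labels and the grep patterns are my assumptions about what the lines look like, based on the three snippets at the top of this post.

```shell
#!/bin/sh
# Hypothetical helper: print the first eigenvector column-label line and the
# first Thermochemistry mass line, with each space shown as an underscore.
show_atom_labels() {
    {
        # first "Atom AN" column-label line, whatever its spacing
        grep -E '^ *Atom +AN ' "$1" | head -n 1
        # first "Atom N has atomic number" line, whatever its spacing
        grep -E '^ *Atom +[0-9]+ +has atomic number' "$1" | head -n 1
    } | tr ' ' '_'
}

# Usage (methane.log is a placeholder file name):
#   show_atom_labels methane.log
```

Counting underscores in front of and after "Atom" shows at a glance whether a file carries the G03 or the G09 spacing.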

The Solution

A small sed script performs the necessary conversions on your *nix box (including OSX) for all .log files in a directory without issue:

#!/bin/sh

# This section converts all .log files to aClimax-friendly G03-ish format
# (file names are quoted so that paths containing spaces survive)
find . -type f -name '*.log' -print | while IFS= read -r i
do
# Step 1: squeeze the G09 "Atom AN" column label back to the G03 spacing
sed 's|  Atom  AN| Atom AN |g' "$i" > "$i.aclimaxconversion_step1"
# Step 2: trim the extra spaces after "Atom" in the Thermochemistry mass list
sed 's| Atom   | Atom|g' "$i.aclimaxconversion_step1" > "$i.aClimaxable.log"
rm "$i.aclimaxconversion_step1"
done

# This section converts all .out files to aClimax-friendly G03-ish format
find . -type f -name '*.out' -print | while IFS= read -r i
do
sed 's|  Atom  AN| Atom AN |g' "$i" > "$i.aclimaxconversion_step1"
sed 's| Atom   | Atom|g' "$i.aclimaxconversion_step1" > "$i.aClimaxable.out"
rm "$i.aclimaxconversion_step1"
done
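
The two substitutions can also be chained in a single sed call, which removes the need for the intermediate step1 file. This is a sketch of that variant rather than the script as tested above; the function name g09_to_g03 is hypothetical, and it reads from standard input and writes to standard output.

```shell
#!/bin/sh
# Hypothetical one-pass variant: apply both spacing fixes to each line in turn.
g09_to_g03() {
    sed -e 's|  Atom  AN| Atom AN |g' \
        -e 's| Atom   | Atom|g'
}

# Usage (file names are placeholders):
#   g09_to_g03 < methane.log > methane.aClimaxable.log
```

Since sed works line by line, running the second expression right after the first on each line should give the same result as the two separate passes.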

But Wait! Running G0* Jobs Under *nix? Convert To DOS Carriage Returns

The final problem halting your aClimax spectrum generation is the missing DOS carriage return (^M). For those running DOS-based Gaussian calculations (likely with a .out suffix), your conversion with the short script above (under *nix) likely (hopefully) worked just fine. For those running under *nix, you performed the conversion and still received the following aClimax error:

Serious Error: A-CLIMAX has encountered an unhanded error. Please Save your data and contact support
aClimax: Quote Error Number 9
Error Loading File: Error reading data. Please check and try again.
aClimax: WARNING loaded file containing no frequencies

The solution is an additional line in the sed script that appends a carriage return to the end of every line, turning each *nix newline into a proper DOS carriage return plus line feed. The .out section remains the same.

#!/bin/sh

# define the Carriage Return once (printf is used because echo "\0015"
# prints the literal characters under some shells)
CR=$(printf '\r')

# This section converts all .log files to aClimax-friendly G03-ish format
find . -type f -name '*.log' -print | while IFS= read -r i
do
sed 's|  Atom  AN| Atom AN |g' "$i" > "$i.aclimaxconversion_step1"
sed 's| Atom   | Atom|g' "$i.aclimaxconversion_step1" > "$i.aclimaxconversion_step2"
# This section converts your *nix newlines into DOS carriage returns
sed -e "s/$/${CR}/" "$i.aclimaxconversion_step2" > "$i.aClimaxable.log"
# this cleans up the temp files (inside the loop, so subfolders are covered too)
rm "$i.aclimaxconversion_step1" "$i.aclimaxconversion_step2"
done

# This section converts all .out files to aClimax-friendly G03-ish format
find . -type f -name '*.out' -print | while IFS= read -r i
do
sed 's|  Atom  AN| Atom AN |g' "$i" > "$i.aclimaxconversion_step1"
sed 's| Atom   | Atom|g' "$i.aclimaxconversion_step1" > "$i.aClimaxable.out"
rm "$i.aclimaxconversion_step1"
done

Q. But what if I run the *nix-to-DOS version of the script on an already DOS-output file?

A1. The simple answer is that you’ll make your text file double-spaced (which is bad enough). aClimax will then provide the following error when you try to open it:

Error Reading File: Unexpected File End. File May be incorrect or corrupt.
Error Loading File: Error reading data. Please check and try again.

A2. I will assume that your problem is that you’re running the script in DOS to try to get your G09 to read more like G03. In this case (assuming you’re generating .out files), you’ll want to use a text editor to make the replacements described above (which is to say, that Perl script might make its way to this page eventually. If you write a DOS .bat file or similar script for all OS’s, I’d be happy to link to it).
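
As for the doubled-up carriage returns in A1, one way to sidestep them is to strip any carriage return already sitting at the end of a line before adding one, so the conversion is safe to run on either kind of file. This is a sketch under that assumption, not part of the script above; the function name to_dos is hypothetical.

```shell
#!/bin/sh
# Hypothetical re-runnable newline conversion: every line ends up with exactly
# one carriage return, whether or not it already had one.
to_dos() {
    cr=$(printf '\r')              # a literal carriage return
    sed -e "s/${cr}\$//" \
        -e "s/\$/${cr}/"           # drop an existing CR, then append one
}

# Usage (file names are placeholders):
#   to_dos < methane.log > methane.aClimaxable.log
```

Run on a file that is already in DOS format, this should hand back the same bytes instead of producing the double-spaced file that aClimax rejects.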

L-Alanine Alaninium Nitrate (LAAN) Shout-Out At spectroscopyNOW.com (And Better Raman Image Here)

It doesn’t happen often.  Simply marking for interested parties that David Bradley wrote a piece about the recent L-Alanine Alaninium Nitrate article published in Physical Chemistry Chemical Physics (Phys. Chem. Chem. Phys., 2009, 11, 9474 – 9483, DOI: 10.1039/b905070a) with a specific focus on the organic ferroelectric behavior of this system as argued from the results of the inelastic neutron scattering (INS) and temperature-dependent Raman spectroscopic studies.  Also, of course, the entire discussion and analysis revolves around the results of the density functional theory (DFT) studies performed on the solid-state system with DMol3.

I find it mildly amusing that a paper that went through several rather exhaustive crystallography-focused review cycles (fighting with crystallography-specific reviewers about the use of the vibrational spectroscopy to provide the more realistic view of this organic salt in the solid-state) makes headlines (well, you know) only for the vibrational spectroscopy.  I certainly won’t point fingers (only browsers), but I’ve yet to see someone say the same of vibrational spectroscopists.


Examination of Phencyclidine Hydrochloride via Cryogenic Terahertz Spectroscopy, Solid-State Density Functional Theory, and X-Ray Diffraction

“I’m high on life… and PCP.” – Mitch Hedberg

In press, in the Journal of Physical Chemistry A. If the current rosters of pending manuscripts and calculations are any indication, this PCP paper will mark the near end of my use of DMol3 for the prediction (and experimental assignment) of terahertz (THz) spectra (that said, it is still an excellent tool for neutron scattering spectroscopy and is part of several upcoming papers).

While the DMol3 vibrational energy (frequency) predictions are generally in good agreement with experiment (among several density functionals, including the BLYP, BOP, VWN-BP, and BP generalized gradient approximation density functionals), the use of the difference-dipole method for the calculation of infrared intensities has shown itself to be of questionable applicability when the systems being simulated are charged (either molecular salts (such as PCP.HCl) or zwitterions (such as the many amino acid crystal structures)). The previously posted ephedrine paper (in ChemPhysChem) is most interesting from a methodological perspective for the phenomenal agreement in both mode energies AND predicted intensities obtained using Crystal06, another solid-state density functional theory program (that has implemented hybrid density functionals, Gaussian-type basis sets, cell parameter optimization and, of course, a more theoretically sound prediction of infrared intensities by way of Born charges). The Crystal06 calculations take, on average, an order of magnitude longer to run than the comparable DMol3 calculations, but the slight additional gain in accuracy for good density functionals, the much greater uniformity of mode energy predictions across multiple density functionals (when multiple density functionals are tested), and the proper calculation of infrared intensities all lead to Crystal06 being the new standard for THz simulations.

After a discussion with a crystallographer about what theoreticians trust and what they don’t in a diffraction experiment, the topic of interatomic separation agreement between theory and experiment came up in the PCP.HCl analysis performed here (wasn’t Wayne). As the positions of hydrogen atoms in an X-ray diffraction experiment are categorically among those pieces of information solid-state theoreticians do NOT trust when presented with a cif file, I reproduce a snippet from the paper considering this difference below (and, generally, one will not find comparisons of crystallographically-determined hydrogen positions and calculated hydrogen positions in any of the THz or inelastic neutron scattering spectroscopy papers found on this blog).

The average calculated distance between the proton and the Cl ion is 2.0148 Angstroms, an underestimation of nearly 0.13 Angstroms when compared to the experimental data. This deviation is likely strongly tied to the uncertainty in the proton position as determined by the X-ray diffraction experiment and is, therefore, not used as a proper metric of agreement between theory and experiment. The distance from the nitrogen atom to the Cl ion has been determined to be an average of 3.0795 Angstroms, which is within 0.002 Angstroms of the experimentally determined bond length. This proper comparison of heavy atom positions between theory and experiment indicates that this interatomic separation has been very well predicted by the calculations.

Patrick M. Hakey, Matthew R. Hudson, Damian G. Allis, Wayne Ouellette, and Timothy M. Korter

Department of Chemistry, Syracuse University, Syracuse, NY 13244-4100

The terahertz (THz) spectrum of phencyclidine hydrochloride from 7.0 – 100.0 cm-1 has been measured at cryogenic (78 K) temperature. The complete structural analysis and vibrational assignment of the compound have been performed employing solid-state density functional theory utilizing eight generalized gradient approximation density functionals and both solid-state and isolated-molecule methods. The structural results and the simulated spectra display the substantial improvement obtained by using solid-state simulations to accurately assign and interpret solid-state THz spectra. A complete assignment of the spectral features in the measured THz spectrum has been completed at a VWN-BP/DNP level of theory, with the VWN-BP density functional providing the best-fit solid-state simulation of the experimentally observed spectrum. The cryogenic THz spectrum contains eight spectral features that, at the VWN-BP/DNP level, consist of fifteen infrared-active vibrational modes. Of the calculated modes, external crystal vibrations are predicted to account for 42% of the total spectral intensity.
