Stupid-Simple (*nix-Specific) Sed Scripts To Get (All Current) Gaussian09 Output Files Working With aClimax

The following three snippets of Gaussian output are for an optimization and normal mode analysis of simple olde methane (CH4).

...
 ******************************************
 Gaussian 03:  EM64L-G03RevE.01 11-Sep-2007
                31-Aug-2014 
 ******************************************
...
 incident light, reduced masses (AMU), force constants (mDyne/A),
 and normal coordinates:
                     1                      2                      3
                     T                      T                      T
 Frequencies --  1356.0070              1356.0070              1356.0070
 Red. masses --     1.1789                 1.1789                 1.1789
 Frc consts  --     1.2771                 1.2771                 1.2771
 IR Inten    --    14.1122                14.1122                14.1122
 Atom AN      X      Y      Z        X      Y      Z        X      Y      Z
   1   1     0.02  -0.42   0.43    -0.34  -0.13  -0.08    -0.36  -0.23  -0.23
   2   6     0.00   0.08  -0.09     0.00   0.09   0.08     0.12   0.00   0.00
...
 -------------------
 - Thermochemistry -
 -------------------
 Temperature   298.150 Kelvin.  Pressure   1.00000 Atm.
 Atom  1 has atomic number  1 and mass   1.00783
...
...
 ******************************************
 Gaussian 09:  EM64L-G09RevA.02 11-Jun-2009
                31-Aug-2014 
 ******************************************
...
 incident light, reduced masses (AMU), force constants (mDyne/A),
 and normal coordinates:
                     1                      2                      3
                     T                      T                      T
 Frequencies --  1356.0058              1356.0058              1356.0058
 Red. masses --     1.1789                 1.1789                 1.1789
 Frc consts  --     1.2771                 1.2771                 1.2771
 IR Inten    --    14.1123                14.1123                14.1123
  Atom  AN      X      Y      Z        X      Y      Z        X      Y      Z
     1   1    -0.03   0.42   0.43    -0.34  -0.14   0.07    -0.36  -0.23   0.23
     2   6     0.00  -0.08  -0.10     0.01   0.10  -0.08     0.12   0.00   0.00
...
-------------------
 - Thermochemistry -
 -------------------
 Temperature   298.150 Kelvin.  Pressure   1.00000 Atm.
 Atom     1 has atomic number  1 and mass   1.00783
...
...
 ******************************************
 Gaussian 09:  EM64L-G09RevD.01 24-Apr-2013
                31-Aug-2014 
 ******************************************
...
 incident light, reduced masses (AMU), force constants (mDyne/A),
 and normal coordinates:
                      1                      2                      3
                     ?A                     ?A                     ?A
 Frequencies --   1356.0132              1356.0132              1356.0132
 Red. masses --      1.1789                 1.1789                 1.1789
 Frc consts  --      1.2771                 1.2771                 1.2771
 IR Inten    --     14.1119                14.1119                14.1119
  Atom  AN      X      Y      Z        X      Y      Z        X      Y      Z
     1   1     0.02   0.42   0.43     0.34  -0.14   0.08    -0.36   0.23  -0.23
     2   6     0.00  -0.08  -0.09    -0.01   0.09  -0.08     0.12   0.00   0.00
...
 -------------------
 - Thermochemistry -
 -------------------
 Temperature   298.150 Kelvin.  Pressure   1.00000 Atm.
 Atom     1 has atomic number  1 and mass   1.00783
...

Two of these things are not like the other. The data's nearly identical (and thank heavens. Unfortunately, Gaussian09 D.01 didn't see the fully-optimized methane as belonging to the Td point group – despite all three versions being run with the same exact input file – but a rigorous re-symmetrization would have taken care of that), but there are some subtle formatting differences between all three versions (including differences between both Gaussian09 versions) that cause the venerable, all-encompassing aClimax program (developed by Timmy, the venerable, all-encompassing A. J. Ramirez-Cuesta) to throw out the following errors for all three cases when you use *.log files from a *nix (UNIX, Linux) machine.

Serious Error: A-CLIMAX has encountered an unhanded error. Please Save your data and contact support
aClimax: Quote Error Number 9
Error Loading File: Error reading data. Please check and try again.
aClimax: WARNING loaded file containing no frequencies

Problem number 1 is the existence of *nix newlines (carriage returns) in the *.log files coming off a *nix machine. Performing a conversion from *nix to DOS (for myself, using LineBreak in OSX, but tofrodos works just as well), the Gaussian03 file now opens just fine in aClimax:

File Loaded: Data Loaded Succesfully [sic].

This, unfortunately, does not improve the matter with the Gaussian09 files, which produce the following error:

Error: One of the numbers you have entered is of the wrong type.Please recheck and try again
Error Loading File: Error reading data. Please check and try again.

Given how little of the .log file aClimax actually needs to produce simulated inelastic neutron scattering (INS) spectra, I ran the methane normal mode analyses in three different Gaussian versions to determine what, in G09, was changed to make it just un-G03 enough to fail to load. With those changes figured out, I had a Perl script drafted up that would have converted everything back to the original G03 format. It was awesome. That said, after a small amount of testing to see where aClimax's sensitivities lay, I discovered that very little of the .log file contents needed to be changed out, meaning that simple sed scripts would work just as well for those of us using our Windows boxes (or VirtualBox emulations) only for that "one stupid program" that keeps us having to log in (and, by that, I mean that we have sed already on our computers).

So, the problems between G09 and aClimax not related to carriage returns lie in two places.

1. The spacing of "Atom AN" – at the top of the eigenvector lists are the column labels, beginning with "Atom AN" – or something very close to "Atom AN" (the "|" in the boxes below mark the left edge of the output):

G03 E01 | Atom AN
G09 A02 |   Atom  AN
G09 D01 |  Atom  AN

Yes, the addition of a space or two results in a read error by aClimax. I would call this an… aggressive stringency in aClimax. That said, what did the original space in G03 versions not do that they do do in G09?

2. The spacing of "Atom N" – In the "Thermochemistry" section below the eigenvectors, atomic masses are listed as "Atom N" – or something very close to "Atom N" (again, the "|" in the boxes below mark the left edge of the output):

G03 E01 |  Atom  1
G09 A02 |    Atom     1
G09 D01 |   Atom     1

This change in spacing is also enough to cause aClimax to error out.

The Solution

A small sed script performs the necessary conversions on your *nix box (including OSX) for all .log files in a directory without issue:

#!/bin/sh

# This section converts all .log files to aClimax-friendly G03-ish format
find . -type f -name '*.log' -print | while read i
do
sed 's|  Atom  AN| Atom AN |g' $i > $i.aclimaxconversion_step1
sed 's| Atom   | Atom|g' $i.aclimaxconversion_step1 > $i.aClimaxable.log
rm $i.aclimaxconversion_step1
done

# This section converts all .out files to aClimax-friendly G03-ish format
find . -type f -name '*.out' -print | while read i
do
sed 's|  Atom  AN| Atom AN |g' $i > $i.aclimaxconversion_step1
sed 's| Atom   | Atom|g' $i.aclimaxconversion_step1 > $i.aClimaxable.out
rm $i.aclimaxconversion_step1
done

But Wait! Running G0* Jobs Under *nix? Convert To DOS Carriage Returns

The final problem halting your aClimax spectrum generation is the DOS carriage return (^M). For those running DOS-based Gaussian calculations (likely with a .out suffix), your conversion with the short script above (under *nix) likely (hopefully) worked just fine. For those running under *nix, you performed the conversion and still received the following aClimax error:

Serious Error: A-CLIMAX has encountered an unhanded error. Please Save your data and contact support
aClimax: Quote Error Number 9
Error Loading File: Error reading data. Please check and try again.
aClimax: WARNING loaded file containing no frequencies

The solution is an additional line in the sed script that will globally replace all *nix newlines with proper DOS carriage returns. The .out section remains the same.

#!/bin/sh

# This section converts all .log files to aClimax-friendly G03-ish format
find . -type f -name '*.log' -print | while read i
do
sed 's|  Atom  AN| Atom AN |g' $i > $i.aclimaxconversion_step1
sed 's| Atom   | Atom|g' $i.aclimaxconversion_step1 > $i.aclimaxconversion_step2
# This section converts your *nix newlines into DOS carriage returns
CR=`echo "\0015"`  # define the Carriage Return
sed -e "s/$/${CR}/g" $i.aclimaxconversion_step2 > $i.aClimaxable.log
done
# this cleans up your folder of temp files
rm *.aclimaxconversion_step1
rm *.aclimaxconversion_step2

# This section converts all .out files to aClimax-friendly G03-ish format
find . -type f -name '*.out' -print | while read i
do
sed 's|  Atom  AN| Atom AN |g' $i > $i.aclimaxconversion_step1
sed 's| Atom   | Atom|g' $i.aclimaxconversion_step1 > $i.aClimaxable.out
rm $i.aclimaxconversion_step1
done

Q. But what if I run the *nix-to-DOS version of the script on an already DOS-output file?

A1. The simple answer is that you'll make your text file double-spaced (which is bad enough). aClimax will then provide the following error when you try to open it:

Error Reading File: Unexpected File End. File May be incorrect or corrupt.
Error Loading File: Error reading data. Please check and try again.

A2. I will assume that your problem is that you're running the script in DOS to try to get your G09 to read more like G03. In this case (assuming you're generating .out files), you'll want to use a text editor to make the replacements described above (which is to say, that Perl script might makes it way to this page eventually. If you write a DOS .bat file or similar script for all OS's, I'd be happy to link to it).

Generating Molecular Orbitals (And Visualizing Assorted Properties) With The Gaussian09 cubegen Utility

To begin, this post owes its existence to the efforts of Dr. Douglas Fox at Gaussian, Inc., who provided me with an alternative explanation of how the cubegen utility works. After much wailing and gnashing of teeth, I intend on taking Dr. Fox's advice and asking Gaussian Support for assistance earlier in my endeavors. What follows below, I hope, will save you some significant frustration (and, given how little there is online that really describes the extra workings of cubegen in a clear and example'ed way, it is my expectation that this page appeared early in your search list).

What I wanted out of cubegen that I couldn't figure out how to get:

The situation was simple. I wanted my molecule centered and bound within an arbitrarily-sized box (X,Z,Y) for making images and doing additional post-processing. Specifically, I wanted to be able to take many different molecules (from hydrogen gas to big biomolecules) defined within the same-sized box for layering and presentation (different boxes for each, but all the same size).

I am assuming for this that you're using cubegen from a terminal (not within GaussView or the like) to produce .cub/.cube files for use in some kind of rendering-capable program (like VESTA or VMD) and that cubegen and formchk are in your PATH (either properly placed or by running the Gaussian install script). I'll be demonstrating usage with benzene (C6H6) and the benzene cation (C6H6+).

1. The Checkpoint File

To extract any kind of data for making .cub/.cube files, you need a checkpoint file (.chk) from your run. This is performed by adding a %chk=FILENAME.chk line to the top of the input file (which, if you're a Gaussian user, you likely already know). If you want additional properties cube'd, check the Gaussian Tech Document, specifically looking at the Pop keyword for most of the properties you'd want visualized (this data gets placed into the .chk file for .cub/.cube generation after the run). For the standard molecular orbitals, they're already saved in the .chk file (or their coefficients, anyway).

For benzene.gjf:

%chk=benzene.chk
# b3lyp/6-31G(d,p)

Benzene

0 1
 C                  1.20809735    0.69749533   -0.00000000
 C                  0.00000000    1.39499067   -0.00000000
 C                 -1.20809735    0.69749533   -0.00000000
 C                 -1.20809735   -0.69749533   -0.00000000
 C                  0.00000000   -1.39499067   -0.00000000
 C                  1.20809735   -0.69749533   -0.00000000
 H                  2.16038781    1.24730049   -0.00000000
 H                  0.00000000    2.49460097   -0.00000000
 H                 -2.16038781    1.24730049   -0.00000000
 H                 -2.16038781   -1.24730049   -0.00000000
 H                  0.00000000   -2.49460097   -0.00000000
 H                  2.16038781   -1.24730049   -0.00000000

For benzenecation.gjf:

%chk=benzenecation.chk
# b3lyp/6-31G(d,p)

Benzene cation

1 2
 C                  1.20809735    0.69749533   -0.00000000
 C                  0.00000000    1.39499067   -0.00000000
 C                 -1.20809735    0.69749533   -0.00000000
 C                 -1.20809735   -0.69749533   -0.00000000
 C                  0.00000000   -1.39499067   -0.00000000
 C                  1.20809735   -0.69749533   -0.00000000
 H                  2.16038781    1.24730049   -0.00000000
 H                  0.00000000    2.49460097   -0.00000000
 H                 -2.16038781    1.24730049   -0.00000000
 H                 -2.16038781   -1.24730049   -0.00000000
 H                  0.00000000   -2.49460097   -0.00000000
 H                  2.16038781   -1.24730049   -0.00000000

2. Convert The .chk To .fchk With formchk

As per the Gaussian Tech Doc:

formchk converts the data in a Gaussian checkpoint file into a formatted form which is suitable for input into a variety of visualization software.

Basically, making the .chk file something that cubegen can manipulate to generate .cub/.cube files of orbitals, densities, electrostatic potentials, etc. This run is simple for most users (for the rest, see formchk).

formchk benzene.chk benzene.fchk
formchk benzenecation.chk benzenecation.fchk

3. Using cubegen

And now the fun begins. A typical cubegen run looks like the following:

cubegen 0 MO=HOMO benzene.fchk benzene_HOMO.cub 0 h

cubegen – run cubegen
0 – an old memory flag (must be there, but not important)
MO=HOMO – generate the highest occupied molecular orbital
benzene.fchk – the .fchk file
benzene_HOMO.cub – the generated .cub file
0 – use the default grid point specification (80*80*80 points total in the whole cube file)
h – write out the .cub file with headers

The output you find summarized in VESTA is below for this case.

DEFAULT:
OpenGL version: 2.1 INTEL-8.26.34
Video configuration: Intel HD Graphics 4000 OpenGL Engine
Maximum supported width and height of the viewport: 16384 x 16384
OpenGL depth buffer bit: 16

/Users/damianallis/benzene_HOMO_default_0.cub
====================================================================================
Title Benzene MO=HOMO
Dimensions 87 91 65

Lattice parameters

a b c alpha beta gamma
9.39704 9.82909 7.02078 90.0000 90.0000 90.0000

Unit-cell volume = 648.469273 Ã…^3

Total number of polygons and unique vertices on slices;
(1 0 0): 0 ( 0), 0 ( 0)
(0 1 0): 0 ( 0), 0 ( 0)
(0 0 1): 0 ( 0), 0 ( 0)
====================================================================================

====================================================================================
Title Benzene

Lattice type P
Space group name P 1
Space group number 1
Setting number 1

Lattice parameters

a b c alpha beta gamma
1.00000 1.00000 1.00000 90.0000 90.0000 90.0000

Unit-cell volume = 1.000000 Ã…^3

Structure parameters

x y z Occ. B Site Sym.
1 C C1 4.65450 6.23638 3.44640 1.000 1.000 1 –
2 C C2 5.86259 5.53889 3.44640 1.000 1.000 1 –
3 C C3 5.86259 4.14389 3.44640 1.000 1.000 1 –
4 C C4 4.65450 3.44640 3.44640 1.000 1.000 1 –
5 C C5 3.44640 4.14389 3.44640 1.000 1.000 1 –
6 C C6 3.44640 5.53889 3.44640 1.000 1.000 1 –
7 H H1 4.65450 7.33599 3.44640 1.000 1.000 1 –
8 H H2 6.81488 6.08869 3.44640 1.000 1.000 1 –
9 H H3 6.81488 3.59409 3.44640 1.000 1.000 1 –
10 H H4 4.65450 2.34679 3.44640 1.000 1.000 1 –
11 H H5 2.49411 3.59409 3.44640 1.000 1.000 1 –
12 H H6 2.49411 6.08869 3.44640 1.000 1.000 1 –
====================================================================================

Number of polygons and unique vertices on isosurface = 16904 (8460)
12 atoms, 12 bonds, 0 polyhedra; CPU time = 39 ms

For the coarse grid (-2) case:

cubegen 0 MO=HOMO benzene.fchk benzene_HOMO_default_m2.cub -2 h

The output you find summarized in VESTA is below for this case.

/Users/damianallis/benzene_HOMO_default_m2.cub
====================================================================================
Title Benzene MO=HOMO
Dimensions 54 56 40

Lattice parameters

a b c alpha beta gamma
9.52518 9.87796 7.05569 90.0000 90.0000 90.0000

Unit-cell volume = 663.865482 Ã…^3

Total number of polygons and unique vertices on slices;
(1 0 0): 0 ( 0), 0 ( 0)
(0 1 0): 0 ( 0), 0 ( 0)
(0 0 1): 0 ( 0), 0 ( 0)
====================================================================================

====================================================================================
Title Benzene

Lattice type P
Space group name P 1
Space group number 1
Setting number 1

Lattice parameters

a b c alpha beta gamma
1.00000 1.00000 1.00000 90.0000 90.0000 90.0000

Unit-cell volume = 1.000000 Ã…^3

Structure parameters

x y z Occ. B Site Sym.
1 C C1 4.65450 6.23638 3.44640 1.000 1.000 1 –
2 C C2 5.86259 5.53889 3.44640 1.000 1.000 1 –
3 C C3 5.86259 4.14389 3.44640 1.000 1.000 1 –
4 C C4 4.65450 3.44640 3.44640 1.000 1.000 1 –
5 C C5 3.44640 4.14389 3.44640 1.000 1.000 1 –
6 C C6 3.44640 5.53889 3.44640 1.000 1.000 1 –
7 H H1 4.65450 7.33599 3.44640 1.000 1.000 1 –
8 H H2 6.81488 6.08869 3.44640 1.000 1.000 1 –
9 H H3 6.81488 3.59409 3.44640 1.000 1.000 1 –
10 H H4 4.65450 2.34679 3.44640 1.000 1.000 1 –
11 H H5 2.49411 3.59409 3.44640 1.000 1.000 1 –
12 H H6 2.49411 6.08869 3.44640 1.000 1.000 1 –
====================================================================================

Number of polygons and unique vertices on isosurface = 6516 (3266)
12 atoms, 12 bonds, 0 polyhedra; CPU time = 10 ms

For the medium grid (-3) case:

cubegen 0 MO=HOMO benzene.fchk benzene_HOMO_default_m3.cub -3 h

The output you find summarized in VESTA is below for this case.

/Users/damianallis/benzene_HOMO_default_m3.cub
====================================================================================
Title Benzene MO=HOMO
Dimensions 107 111 79

Lattice parameters

a b c alpha beta gamma
9.43701 9.78980 6.96751 90.0000 90.0000 90.0000

Unit-cell volume = 643.703858 Ã…^3

Total number of polygons and unique vertices on slices;
(1 0 0): 0 ( 0), 0 ( 0)
(0 1 0): 0 ( 0), 0 ( 0)
(0 0 1): 0 ( 0), 0 ( 0)
====================================================================================

====================================================================================
Title Benzene

Lattice type P
Space group name P 1
Space group number 1
Setting number 1

Lattice parameters

a b c alpha beta gamma
1.00000 1.00000 1.00000 90.0000 90.0000 90.0000

Unit-cell volume = 1.000000 Ã…^3

Structure parameters

x y z Occ. B Site Sym.
1 C C1 4.65450 6.23638 3.44640 1.000 1.000 1 –
2 C C2 5.86259 5.53889 3.44640 1.000 1.000 1 –
3 C C3 5.86259 4.14389 3.44640 1.000 1.000 1 –
4 C C4 4.65450 3.44640 3.44640 1.000 1.000 1 –
5 C C5 3.44640 4.14389 3.44640 1.000 1.000 1 –
6 C C6 3.44640 5.53889 3.44640 1.000 1.000 1 –
7 H H1 4.65450 7.33599 3.44640 1.000 1.000 1 –
8 H H2 6.81488 6.08869 3.44640 1.000 1.000 1 –
9 H H3 6.81488 3.59409 3.44640 1.000 1.000 1 –
10 H H4 4.65450 2.34679 3.44640 1.000 1.000 1 –
11 H H5 2.49411 3.59409 3.44640 1.000 1.000 1 –
12 H H6 2.49411 6.08869 3.44640 1.000 1.000 1 –
====================================================================================

Number of polygons and unique vertices on isosurface = 25532 (12774)
12 atoms, 12 bonds, 0 polyhedra; CPU time = 51 ms

For the fine grid (-4) case:

cubegen 0 MO=HOMO benzene.fchk benzene_HOMO_default_m4.cub -4 h

The output you find summarized in VESTA is below for this case.

/Users/damianallis/benzene_HOMO_default_m4.cub
====================================================================================
Title Benzene MO=HOMO
Dimensions 212 221 157

Lattice parameters

a b c alpha beta gamma
9.34876 9.74564 6.92337 90.0000 90.0000 90.0000

Unit-cell volume = 630.786281 Ã…^3

Total number of polygons and unique vertices on slices;
(1 0 0): 0 ( 0), 0 ( 0)
(0 1 0): 0 ( 0), 0 ( 0)
(0 0 1): 0 ( 0), 0 ( 0)
====================================================================================

====================================================================================
Title Benzene

Lattice type P
Space group name P 1
Space group number 1
Setting number 1

Lattice parameters

a b c alpha beta gamma
1.00000 1.00000 1.00000 90.0000 90.0000 90.0000

Unit-cell volume = 1.000000 Ã…^3

Structure parameters

x y z Occ. B Site Sym.
1 C C1 4.65450 6.23638 3.44640 1.000 1.000 1 –
2 C C2 5.86259 5.53889 3.44640 1.000 1.000 1 –
3 C C3 5.86259 4.14389 3.44640 1.000 1.000 1 –
4 C C4 4.65450 3.44640 3.44640 1.000 1.000 1 –
5 C C5 3.44640 4.14389 3.44640 1.000 1.000 1 –
6 C C6 3.44640 5.53889 3.44640 1.000 1.000 1 –
7 H H1 4.65450 7.33599 3.44640 1.000 1.000 1 –
8 H H2 6.81488 6.08869 3.44640 1.000 1.000 1 –
9 H H3 6.81488 3.59409 3.44640 1.000 1.000 1 –
10 H H4 4.65450 2.34679 3.44640 1.000 1.000 1 –
11 H H5 2.49411 3.59409 3.44640 1.000 1.000 1 –
12 H H6 2.49411 6.08869 3.44640 1.000 1.000 1 –
====================================================================================

Number of polygons and unique vertices on isosurface = 100680 (50348)
12 atoms, 12 bonds, 0 polyhedra; CPU time = 155 ms

These all generate a file containing the highest occupied molecular orbital (or one of the degenerate HOMO's in this case. Do I have to qualify that this doesn't mean what 99.5% of the people coming to this page thinks this means?). The box is generated by something in cubegen to be 9.3ish x 9.7ish x 6.9ish Angstroms on a side and containing X points per Angstrom (and you can change the fineness of the grid points). The image below shows the four cases for the benzene HOMO. Click to see larger versions if you want to see the influence of grid fineness on the final image.

benzene_homo_gaussian_defaults_small

Click for a larger view.

Now, then, while the boxes are almost all identical, the same molecule and input gives four slightly different results. Fine for individual images, but not ideal for the obsessive-compulsive image maker. Also, you see how a box simply bounds the molecule, meaning no standardization of size if you needed that standardization for some reason.

   a        b        c       alpha    beta     gamma
 9.39704  9.82909  7.02078  90.0000  90.0000  90.0000 < - default (0)
 9.52518  9.87796  7.05569  90.0000  90.0000  90.0000 <- coarse (-2)
 9.43701  9.78980  6.96751  90.0000  90.0000  90.0000 <- medium (-3)
 9.34876  9.74564  6.92337  90.0000  90.0000  90.0000 <- fine (-4)

So, for a specific case - suppose I wanted this orbital in a box exactly 15 x 20 x 25 Angstroms on a side with the molecule offset from the center by -1.0 Angstrom in each direction.

I was pleased to finally discover that cubegen allows for that, although you have to ask Gaussian Support to find out how (until now, that is) and you need to do a little bit of math to get the placement right (or use the excel file I've linked in a .zip file found at 2014june7_cubegen_excel_file.xlsx).

You begin with the following:

cubegen 0 MO=HOMO benzene.fchk benzene_HOMO.cub -1 h

But for -1, Where do the numbers go?

From the Gaussian Tech Doc:

A value of -1 says to read the cube specification from the input stream, according to the following format:

IFlag, X0, Y0, Z0 Output unit number and initial point.
N1, X1, Y1, Z1 Number of points and step-size in the X-direction.
N2, X2, Y2, Z2 Number of points and step-size in the Y-direction.
N3, X3, Y3, Z3 Number of points and step-size in the Z-direction.

IFlag is the output unit number. If IFlag is less than 0, then a formatted file will be produced; otherwise, an unformatted file will be written.

Admittedly, "input stream" made no sense to me upon first and second read. I just knew that the program didn't do anything when I ran it. Now obvious, this means you input the cube specifications by typing in (or, better, pasting in) the 16 numbers it asks for.

Continuing…

The -1 tells cubegen to "expect more input." In this case, without explanation first, my input would look as below:

-12  -6.50000  -9.00000 -11.50000
 60   0.25000   0.00000   0.00000
 80   0.00000   0.25000   0.00000
100   0.00000   0.00000   0.25000

Which you just paste into your terminal at the new line (having pressed ENTER after typing out the cubegen line above).

How this works (and note the use of minus signs!):

-[# Atoms] -[Start Point For Box In X] -[Start Point For Box In Y] -[Start Point For Box In Z]
[Number of Points In X]   [Grid Fineness In X]   [Grid Fineness In Y]   [Grid Fineness In Z]
[Number of Points In Y]   [Grid Fineness In X]   [Grid Fineness In Y]   [Grid Fineness In Z]
[Number of Points In Y]   [Grid Fineness In X]   [Grid Fineness In Y]   [Grid Fineness In Z]

Assuming orthogonality in your box, the off-diagonals for the grid fineness matrix are zero.

-[# Atoms] -[Start Point For Box In X] -[Start Point For Box In Y] -[Start Point For Box In Z]
[Number of Points In X]  [Grid Fineness In X]   0.000   0.000
[Number of Points In Y]  0.000   [Grid Fineness In Y]   0.000
[Number of Points In Y]  0.000   0.000   [Grid Fineness In Z]

4. -6.5, -9, -11.5?

You build the box around your molecule in cubegen, which means you combine (1) where you want the molecule positions with (2) the number of grid points along each direction and (3) the fineness of the grid to generate the box. Here, I'm starting my hypothetical box at -6.5 in X, -9 in Y, and -11.5 in Z, then building out my molecule 121*.25 points in X, 161*.25 in Y, 201*.25 in Z. This will produce the intended box size with the molecule technically centered at the origin in the box (0,0,0), but the generation of all 121, 161, and 201 points in X, Y, and Z will result in the box going from -6.5 to 8.5, -9 to 11, and -11.5 to 13.5 (and there's your asymmetry in the box). Alternatively, you could think of it as generating a box 15 x 20 x 25, then placing the center of the molecule at 6.5, 9, 11.5 (but you don't specify the box size directly, instead relying on the relative position of the molecule and the fineness of the grid to determine the position (from which you could work back to get the number of points you needed in each direction if you knew the size of the box you wanted. Yes, you might have to re-read that a few times).

I demonstrate this below for a benzene orbital "walk" along X using direct output from VESTA. The rest of the numbers in my matrix above are the same except for the "-[Start Point For The Box In X]" value.

benzene_homo_walk

The benzene walk (numbers show the spacing based on the cubegen input above).

5. Formula For Boxes And Grid Points

You can, in fact, work from the box size you want and relative position of the molecule in that box with some simple math. That looks like the table below:

-(# Atoms)           -(X Position)  -(Y Position)  -(Z Position)
(Box Size / X Mesh)    X Mesh         0.00000        0.00000
(Box Size / Y Mesh)    0.00000        Y Mesh         0.00000
(Box Size / Z Mesh)    0.00000        0.00000        Z Mesh

You specify # Atoms, X Position, Y Position, Z Position, X Mesh, Y Mesh and Z Mesh, then decide on how big your box is going to be. Also, note that X Position, Y Position, Z Position all need to be 1/2 the size of your box if you want the molecule centered. A way to help force this is to force the molecule to have its center of mass shifted to the origin using Symm=COM in your input file.

As mentioned above, a simple excel file for performing this task is provided for download at 2014june7_cubegen_excel_file.xlsx.

6. Lastly, A Procedure For Scripting The Generation Of Many Orbitals

That first stone passed, everything about making custom .cub/.cube files finally made sense. But it lead to another problem. Suppose I want to generate many molecular orbitals. Does one have to paste in the IFlag…Z3 block each time?

Thankfully, this process can be scripted to automation as well, although it's not just a matter of pasting IFlag…Z3 below each run of cubegen. Doing that produces the following…

Example:

This isn't a cubegen problem, it's a Linux issue with the interpretation of stdin. The cubegen script needs to be fed in the matrix in a file (say cubegen.dat if you always want the same .cub/.cube file generated) or via the use of an EOF call.

Cubegen.dat:

cubegen 0 MO=1 benzene.fchk benzene_MO1.cub -1 h < cubegen.dat
cubegen 0 MO=2 benzene.fchk benzene_MO2.cub -1 h < cubegen.dat
cubegen 0 MO=3 benzene.fchk benzene_MO3.cub -1 h < cubegen.dat
...

EOF

cubegen 0 MO=1 benzene.fchk benzene_MO1.cub -1 h << EOF
-12  -6.50000  -9.00000 -11.50000
 60   0.25000   0.00000   0.00000
 80   0.00000   0.25000   0.00000
100   0.00000   0.00000   0.25000
EOF
cubegen 0 MO=2 benzene.fchk benzene_MO2.cub -1 h << EOF
-12  -6.50000  -9.00000 -11.50000
 60   0.25000   0.00000   0.00000
 80   0.00000   0.25000   0.00000
100   0.00000   0.00000   0.25000
EOF
cubegen 0 MO=3 benzene.fchk benzene_MO3.cub -1 h << EOF
-12  -6.50000  -9.00000 -11.50000
 60   0.25000   0.00000   0.00000
 80   0.00000   0.25000   0.00000
100   0.00000   0.00000   0.25000
EOF
...

7. What's The Deal With The Benzene Cation?

Nothing, except I saw a question in my perusing of cubegen problems and found one related to UHF wavefunctions. How do you render alpha spin orbitals and beta spin orbitals? The answer is you dig into the .log file for the orbital energies and count (to the best of my knowledge).

Benzene (21 alpha/beta-occupied)

 Alpha  occ. eigenvalues -- -10.18955 -10.18928 -10.18928 -10.18872 -10.18872
 Alpha  occ. eigenvalues -- -10.18845  -0.84761  -0.73971  -0.73971  -0.59595
 Alpha  occ. eigenvalues --  -0.59595  -0.51588  -0.45423  -0.43943  -0.41518
 Alpha  occ. eigenvalues --  -0.41518  -0.36090  -0.33862  -0.33862  -0.24750
 Alpha  occ. eigenvalues --  -0.24750
 Alpha virt. eigenvalues --   0.00266   0.00266   0.08636   0.14126   0.14126
 Alpha virt. eigenvalues --   0.16238   0.17957   0.17957   0.18681   0.29989
 Alpha virt. eigenvalues --   0.29989   0.31908   0.31908   0.46637   0.52628
 Alpha virt. eigenvalues --   0.54782   0.55099   0.56222   0.59294   0.60077
 Alpha virt. eigenvalues --   0.60077   0.60084   0.60084   0.62384   0.62384
 Alpha virt. eigenvalues --   0.66653   0.66653   0.74180   0.81178   0.81178
 Alpha virt. eigenvalues --   0.82134   0.83694   0.83694   0.91676   0.93745
 Alpha virt. eigenvalues --   0.93745   0.95812   1.08054   1.08054   1.12992
 Alpha virt. eigenvalues --   1.12992   1.20098   1.26111   1.30051   1.40786
 Alpha virt. eigenvalues --   1.40786   1.42585   1.42585   1.42914   1.42914
 Alpha virt. eigenvalues --   1.74102   1.76078   1.80542   1.87583   1.90680
 Alpha virt. eigenvalues --   1.90680   1.97195   1.97195   1.97924   1.97924
 Alpha virt. eigenvalues --   2.02762   2.07664   2.07664   2.29609   2.29609
 Alpha virt. eigenvalues --   2.34429   2.34429   2.35491   2.39944   2.40328
 Alpha virt. eigenvalues --   2.40328   2.44636   2.44636   2.48731   2.48731
 Alpha virt. eigenvalues --   2.50802   2.58538   2.58538   2.60300   2.65987
 Alpha virt. eigenvalues --   2.75521   2.80103   2.80103   3.03123   3.03123
 Alpha virt. eigenvalues --   3.18490   3.20485   3.21867   3.21867   3.37166
 Alpha virt. eigenvalues --   3.48298   3.48298   3.93339   4.13215   4.16289
 Alpha virt. eigenvalues --   4.16289   4.43754   4.43754   4.82384

RHF wave functions are easy as the alpha and beta spin orbitals are identical (so you just call one).

Benzene Cation (21 alpha occ, 20 beta occ)

 Alpha  occ. eigenvalues --  -10.44746 -10.44745 -10.44690 -10.44689 -10.41307
 Alpha  occ. eigenvalues --  -10.41306  -1.09893  -0.99649  -0.97270  -0.83278
 Alpha  occ. eigenvalues --   -0.83268  -0.74423  -0.68358  -0.67574  -0.65278
 Alpha  occ. eigenvalues --   -0.63494  -0.61047  -0.56837  -0.56618  -0.51141
 Alpha  occ. eigenvalues --   -0.47878
 Alpha virt. eigenvalues --   -0.25225  -0.22671  -0.10624  -0.07758  -0.05310
 Alpha virt. eigenvalues --   -0.04280  -0.01821  -0.00871   0.00401   0.08260
 Alpha virt. eigenvalues --    0.08579   0.09642   0.10056   0.25206   0.29439
 Alpha virt. eigenvalues --    0.31399   0.31852   0.34121   0.36475   0.36906
 Alpha virt. eigenvalues --    0.37451   0.38343   0.38500   0.39459   0.40284
 Alpha virt. eigenvalues --    0.43576   0.45334   0.52549   0.60260   0.60770
 Alpha virt. eigenvalues --    0.61287   0.62929   0.64337   0.70989   0.71650
 Alpha virt. eigenvalues --    0.71731   0.74333   0.85713   0.86949   0.90112
 Alpha virt. eigenvalues --    0.90952   0.98816   1.00856   1.05831   1.15646
 Alpha virt. eigenvalues --    1.17792   1.17972   1.18789   1.20601   1.20854
 Alpha virt. eigenvalues --    1.49713   1.52475   1.57000   1.65756   1.66784
 Alpha virt. eigenvalues --    1.68337   1.73545   1.74011   1.74167   1.74723
 Alpha virt. eigenvalues --    1.80258   1.82880   1.84586   2.04024   2.06015
 Alpha virt. eigenvalues --    2.12117   2.12667   2.14025   2.17682   2.18940
 Alpha virt. eigenvalues --    2.19096   2.22084   2.22451   2.24748   2.25480
 Alpha virt. eigenvalues --    2.28544   2.35165   2.36888   2.39005   2.41062
 Alpha virt. eigenvalues --    2.52629   2.57091   2.57724   2.79730   2.80863
 Alpha virt. eigenvalues --    2.95189   2.99029   2.99731   3.01110   3.14403
 Alpha virt. eigenvalues --    3.25310   3.26537   3.70063   3.88553   3.90763
 Alpha virt. eigenvalues --    3.92953   4.18629   4.20462   4.58339
  Beta  occ. eigenvalues --  -10.44304 -10.44303 -10.44252 -10.44250 -10.41463
  Beta  occ. eigenvalues --  -10.41462  -1.08758  -0.97673  -0.97028  -0.82708
  Beta  occ. eigenvalues --   -0.82377  -0.74165  -0.67883  -0.67164  -0.64793
  Beta  occ. eigenvalues --   -0.63478  -0.57727  -0.56637  -0.56323  -0.47270
  Beta virt. eigenvalues --   -0.41639  -0.21435  -0.21139  -0.10438  -0.05496
  Beta virt. eigenvalues --   -0.05056  -0.04232  -0.01054  -0.00739   0.00754
  Beta virt. eigenvalues --    0.08748   0.08784   0.10027   0.10356   0.25410
  Beta virt. eigenvalues --    0.30875   0.31655   0.33033   0.34430   0.37599
  Beta virt. eigenvalues --    0.38243   0.38423   0.38827   0.38857   0.40471
  Beta virt. eigenvalues --    0.40510   0.45633   0.45687   0.53548   0.60543
  Beta virt. eigenvalues --    0.61003   0.61366   0.63303   0.64325   0.71163
  Beta virt. eigenvalues --    0.71910   0.72371   0.74501   0.86611   0.87153
  Beta virt. eigenvalues --    0.90721   0.90982   0.99163   1.02443   1.07028
  Beta virt. eigenvalues --    1.17547   1.18130   1.19642   1.19672   1.20955
  Beta virt. eigenvalues --    1.21374   1.51458   1.52709   1.57335   1.66396
  Beta virt. eigenvalues --    1.67580   1.68460   1.73895   1.74747   1.75260
  Beta virt. eigenvalues --    1.75568   1.80924   1.84865   1.84936   2.06229
  Beta virt. eigenvalues --    2.06582   2.12479   2.12665   2.14334   2.18350
  Beta virt. eigenvalues --    2.18883   2.19283   2.22289   2.22978   2.25783
  Beta virt. eigenvalues --    2.25938   2.29233   2.36212   2.37068   2.39062
  Beta virt. eigenvalues --    2.42549   2.53376   2.57824   2.57840   2.79980
  Beta virt. eigenvalues --    2.80952   2.95964   2.99101   2.99875   3.01115
  Beta virt. eigenvalues --    3.14561   3.25632   3.26592   3.70353   3.89317
  Beta virt. eigenvalues --    3.92008   3.93146   4.19813   4.20623   4.58989

In the case of UHF wave functions, you specify alpha or beta using AMO= or BMO= when you run cubegen.