Archive for the 'structural dna nanotechnology' Category

NAMOT Pre-Release 2.2.0-pre4 In OSX 10.8 (Maybe Older Versions)

Sunday, January 13th, 2013

A recent visit to the College of Nanoscale Science and Engineering (CNSE) at SUNY Albany inspired a few new DNA ideas that I decided would be greatly simplified by having NAMOT available again for design. Having failed at the base install of the NAMOT 2 version and, unfortunately, not having NAMOT available in Fink for a simple installation, the solution became to build the pre-release from scratch. Ignoring the many errors one encounters while walking through an OSX/Xcode/Fink/X11 bootstrap, the final procedure worked well and without major problem. As usual, the error messages at varied steps are provided below because, I assume, those messages are what you’re searching for when you find your way here.

0. Required Installations

You’ll need the following installed for this particular build. I believe XCode is the only thing that you’ll have to pay for, if you don’t already have it (I seem to remember paying $5 through the App Store).

1. XCode

The OSX Developer Suite – developer.apple.com/xcode

2. XQuartz

An OSX (X.Org) X Window System – xquartz.macosforge.org/landing/

3. Fink

An OSX port program for a host of Unix codes and libraries – www.finkproject.org

3a. GSL

The GNU Scientific (C and C++) Libraries – www.gnu.org/software/gsl. This will be installed with Fink.

3b. LessTif

An OSF/Motif clone (made available for OSX through Fink) – lesstif.sourceforge.net. This will be installed with Fink.

4. NAMOT2.2.0-pre4

The -pre4 is currently available (from 2003) from sourceforge.net/projects/namot/files/. I did not try -pre3 and had no luck with the official 2 release.

And, with that…

1. XCode

Blindly follow the install procedure. Several steps below deal with working around the default install locations (specifically, /sw).

2. XQuartz

If you don’t have XQuartz installed, your configure step…

cd Downloads
cd namot-2.2.0-pre4
./configure 

will produce the following error…

checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
...
creating libtool
checking for X... no
checking for main in -lX11... no
NAMOT requires Xwindows

Blindly follow the XQuartz install process. After the installation, you’ll still receive the same error as above. Adding --x-libraries= and --x-includes= to the configure call, as below, directs the script to the proper libraries and includes.

./configure --x-libraries=/usr/X11/lib/ --x-includes=/usr/X11/include/

Hopefully, you’ll find yourself past the first install problem and on to the second.

checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
...
creating libtool
checking for X... libraries /usr/X11/lib/, headers /usr/X11/include/
checking for gethostbyname... yes
checking for connect... yes
checking for remove... yes
checking for shmat... yes
checking for IceConnectionNumber in -lICE... yes
checking for main in -lX11... yes
checking for main in -lgslcblas... no
NAMOT requires GNU Scientific Library
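
Before re-running configure, it can help to confirm that the directories handed to --x-libraries and --x-includes actually exist, since a mistyped path is easy to miss in the configure output. The following is a minimal Python sketch of such a check, not part of the original procedure; the function name missing_dirs is hypothetical, and the two paths are simply the ones used in this walkthrough.

```python
import os


def missing_dirs(paths):
    """Return the entries of `paths` that are not existing directories."""
    return [p for p in paths if not os.path.isdir(p)]


if __name__ == "__main__":
    # The two X11 locations passed to configure in this walkthrough.
    x11_dirs = ["/usr/X11/lib/", "/usr/X11/include/"]
    for path in missing_dirs(x11_dirs):
        print("not found:", path)
```

If nothing is printed, both directories were found and the flags should at least point somewhere real.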

3. Fink

The next two codes that need to be installed are the GNU Scientific Library and LessTif, both of which are much easier to install using Fink. Fink is generally useful for many other codes as well, so it is a good program for any computational chemist to have on hand. The install should be unproblematic despite having to build Fink from source in 10.6 – 10.8 (as of January 2013). If you build with all the default settings, you’ll have no trouble after.

cd Downloads
cd fink-0.34.4
./bootstrap 

I chose the default settings throughout.

Fink must be installed and run with superuser (root) privileges. Fink can automatically try to become root when it's run from a user account. Since you're currently running this script as a normal user, the method you choose will also be used immediately for this script. Available methods:

(1)	Use sudo
(2)	Use su
(3)	None, fink must be run as root

Choose a method: [1] 

...

You should now have a working Fink installation in '/sw'. You still need package descriptions if you want to compile packages yourself. You can get them by running either of the commands: 'fink selfupdate-rsync', to update via rsync (generally preferred); or 'fink selfupdate-cvs', to update via CVS (more likely to work through a firewall).

Run '. /sw/bin/init.sh' to set up this terminal session environment to use Fink. To make the software installed by Fink available in all of your future terminal shells, add '. /sw/bin/init.sh' to the init script '.profile' or '.bash_profile' in your home directory. The program /sw/bin/pathsetup.sh can help with this. Enjoy.

Then you run the final step in Fink below:

/sw/bin/pathsetup.sh

This will produce two pop-ups notifying you of shell modifications.

3a. GSL

With Fink installed, GSL and LessTif come next. If you try to install either immediately after the Fink installation…

fink install gsl

…you’ll receive the following error:

Password:
Scanning package description files..........
Information about 305 packages read in 0 seconds.
no package found for "gsl"
Failed: no package found for specification 'gsl'!

Required after the installation is a fink selfupdate.

fink selfupdate

As usual, follow the default settings…

fink needs you to choose a SelfUpdateMethod.

(1)	cvs
(2)	Stick to point releases
(3)	rsync

Choose an update method [3] 
/usr/bin/find /sw/fink -name CVS -type d -print0 | xargs -0 /bin/rm -rf
fink is setting your default update method to rsync
...
Updating the list of locally available binary packages.
Scanning dists/stable/main/binary-darwin-i386
New package: dists/stable/main/binary-darwin-i386/base/base-files_1.9.13-1_darwin-i386.deb
New package: dists/stable/main/binary-darwin-i386/base/fink-mirrors_0.34.4.1-1_darwin-i386.deb

Which then leads to a successful GSL install.

fink install gsl

Producing the following output…

Information about 12051 packages read in 1 seconds.
The package 'gsl' will be built and installed.
Reading build dependency for gsl-1.15-1...
Reading dependency for gsl-1.15-1...
Reading runtime dependency for gsl-1.15-1...
Reading dependency for gsl-shlibs-1.15-1...
...
Updating the list of locally available binary packages.
Scanning dists/stable/main/binary-darwin-i386
New package: dists/stable/main/binary-darwin-i386/sci/gsl-shlibs_1.15-1_darwin-i386.deb
New package: dists/stable/main/binary-darwin-i386/sci/gsl_1.15-1_darwin-i386.deb

Attempting a fresh build after the GSL step…

./configure --x-libraries=/usr/X11/lib/ --x-includes=/usr/X11/include/

…then still produces the following error:

checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
...
checking for IceConnectionNumber in -lICE... yes
checking for main in -lX11... yes
checking for main in -lgslcblas... no
NAMOT requires GNU Scientific Library

As mentioned above, there are a few redirects that need to be made after the XCode / Fink install to put libraries and includes where, in this case, NAMOT expects them. To perform this task, we’ll be using symbolic links.

sudo ln -s /sw/include/gsl /usr/include/
sudo ln -s /sw/lib/libgsl* /usr/lib
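
Because these links land in system directories, it may be worth previewing what a wildcard such as /sw/lib/libgsl* will expand to before running ln with sudo. The sketch below is one way to do that in Python; it is an illustration only, the helper name plan_symlinks is hypothetical, and it only lists the links rather than creating them.

```python
import glob
import os


def plan_symlinks(pattern, dest_dir):
    """List (source, link) pairs that `ln -s pattern dest_dir` would create.

    Sources whose link name already exists in dest_dir are skipped,
    since ln would refuse to overwrite them.
    """
    plan = []
    for source in sorted(glob.glob(pattern)):
        link = os.path.join(dest_dir, os.path.basename(source))
        if not os.path.lexists(link):
            plan.append((source, link))
    return plan


if __name__ == "__main__":
    # Dry run only: print the links, do not create them.
    for source, link in plan_symlinks("/sw/lib/libgsl*", "/usr/lib"):
        print(link, "->", source)
```

An empty listing means either the pattern matched nothing under /sw or every link name is already taken in the destination.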

Now attempting a build…

./configure --x-libraries=/usr/X11/lib/ --x-includes=/usr/X11/include/

Gets you past the GSL issue.

checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
...
checking for IceConnectionNumber in -lICE... yes
checking for main in -lX11... yes
checking for main in -lgslcblas... yes
checking for main in -lgsl... yes
checking for XShmCreateImage in -lXext... yes
checking for main in -lXt... yes
checking for main in -lXm... no
NAMOT requires Motif...try LessTif(http://www.lesstif.org)

3b. LessTif

The LessTif symbolic links work the same as the GSL symbolic links. This fink install may take a while.

fink install lesstif

Output below…

Information about 12051 packages read in 1 seconds.
The package 'lesstif' will be built and installed.
Reading build dependency for lesstif-0.95.2-4...
Reading dependency for lesstif-0.95.2-4...
Reading runtime dependency for lesstif-0.95.2-4...
...
Setting up lesstif (0.95.2-4) ...
Clearing dependency_libs of .la files being installed

Updating the list of locally available binary packages.
Scanning dists/stable/main/binary-darwin-i386
New package: dists/stable/main/binary-darwin-i386/x11/app-defaults_20010814-12_darwin-i386.deb
New package: dists/stable/main/binary-darwin-i386/x11/lesstif-bin_0.95.2-4_darwin-i386.deb
New package: dists/stable/main/binary-darwin-i386/x11/lesstif-shlibs_0.95.2-4_darwin-i386.deb
New package: dists/stable/main/binary-darwin-i386/x11/lesstif_0.95.2-4_darwin-i386.deb

Re-running configure…

./configure --x-libraries=/usr/X11/lib/ --x-includes=/usr/X11/include/

But, unfortunately, the LessTif libraries are not in the expected locations.

checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
...
checking for IceConnectionNumber in -lICE... yes
checking for main in -lX11... yes
checking for main in -lgslcblas... yes
checking for main in -lgsl... yes
checking for XShmCreateImage in -lXext... yes
checking for main in -lXt... yes
checking for main in -lXm... no
NAMOT requires Motif...try LessTif(http://www.lesstif.org)

So we add the symbolic links…

sudo ln -s /sw/lib/libXm.* /usr/lib
sudo ln -s /sw/include/Xm /usr/include

Which, finally, runs configure…

./configure --x-libraries=/usr/X11/lib/ --x-includes=/usr/X11/include/

…with no errors.

checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
...
config.status: creating docs/demos/curve/Makefile
config.status: creating docs/demos/dit/Makefile
config.status: creating docs/demos/Makefile
config.status: creating config.h
config.status: executing depfiles commands

NOTE: The make step with Python 2.6 produces the error below. I did not diagnose this beyond the failure to build under 10.6. OSX 10.8 comes with Python 2.7, which did not produce this problem (I’m assuming the Python version is the origin of the problem).

make

…will produce the following error at the pngwriter.c step.

/bin/sh ../libtool --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I..    -DLIB_HOME="\"/usr/local/share/namot\"" -DHELP_FILE_DIR="\"/usr/local/share/namot\"" -I/System/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -I/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/config -g -O2 -c -o _pynamot_la-pngwriter.lo `test -f 'pngwriter.c' || echo './'`pngwriter.c
gcc -DHAVE_CONFIG_H -I. -I. -I.. -DLIB_HOME=\"/usr/local/share/namot\" -DHELP_FILE_DIR=\"/usr/local/share/namot\" -I/System/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -I/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/config -g -O2 -c pngwriter.c -MT _pynamot_la-pngwriter.lo -MD -MP -MF .deps/_pynamot_la-pngwriter.TPlo  -fno-common -DPIC -o _pynamot_la-pngwriter.lo
pngwriter.c: In function 'dump_PNG':
pngwriter.c:28: error: 'png_structp' undeclared (first use in this function)
pngwriter.c:28: error: (Each undeclared identifier is reported only once
pngwriter.c:28: error: for each function it appears in.)
pngwriter.c:28: error: expected ';' before 'png_ptr'
pngwriter.c:29: error: 'png_infop' undeclared (first use in this function)
pngwriter.c:29: error: expected ';' before 'info_ptr'
pngwriter.c:30: error: 'png_byte' undeclared (first use in this function)
pngwriter.c:30: error: 'row_pointers' undeclared (first use in this function)
pngwriter.c:30: error: expected expression before ')' token
pngwriter.c:31: error: 'png_text' undeclared (first use in this function)
pngwriter.c:31: error: expected ';' before 'text_ptr'
pngwriter.c:39: warning: incompatible implicit declaration of built-in function 'memset'
pngwriter.c:39: error: 'text_ptr' undeclared (first use in this function)
pngwriter.c:47: error: 'png_ptr' undeclared (first use in this function)
pngwriter.c:47: error: 'PNG_LIBPNG_VER_STRING' undeclared (first use in this function)
pngwriter.c:48: error: 'png_voidp' undeclared (first use in this function)
pngwriter.c:57: error: 'info_ptr' undeclared (first use in this function)
pngwriter.c:60: error: 'png_infopp' undeclared (first use in this function)
pngwriter.c:82: error: 'PNG_COLOR_TYPE_RGB' undeclared (first use in this function)
pngwriter.c:82: error: 'PNG_INTERLACE_ADAM7' undeclared (first use in this function)
pngwriter.c:83: error: 'PNG_COMPRESSION_TYPE_DEFAULT' undeclared (first use in this function)
pngwriter.c:83: error: 'PNG_FILTER_TYPE_DEFAULT' undeclared (first use in this function)
pngwriter.c:85: error: 'PNG_sRGB_INTENT_ABSOLUTE' undeclared (first use in this function)
pngwriter.c:90: error: 'PNG_TEXT_COMPRESSION_NONE' undeclared (first use in this function)
pngwriter.c:93: error: 'PNG_TEXT_COMPRESSION_zTXt' undeclared (first use in this function)
pngwriter.c:104: error: expected expression before ')' token
make[2]: *** [_pynamot_la-pngwriter.lo] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

The build on 10.8 continues as below, with a few warnings about the symbolic link usage that do not seem to affect the program usability (or continued build).

make

Results below…

make  all-recursive
Making all in src
source='namot_wrap.c' object='_pynamot_la-namot_wrap.lo' libtool=yes \
	depfile='.deps/_pynamot_la-namot_wrap.Plo' tmpdepfile='.deps/_pynamot_la-namot_wrap.TPlo' \
	depmode=gcc3 /bin/sh ../depcomp \
	/bin/sh ../libtool --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I..    -DLIB_HOME="\"/usr/local/share/namot\"" -DHELP_FILE_DIR="\"/usr/local/share/namot\"" -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -I/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/config -g -O2 -c -o _pynamot_la-namot_wrap.lo `test -f 'namot_wrap.c' || echo './'`namot_wrap.c

...

*** Warning: linker path does not have real file for library -lXm.
*** I have the capability to make that library automatically link in when
*** you link to this library.  But I can only do this if you have a
*** shared version of the library, which you do not appear to have
*** because I did check the linker path looking for a file starting
*** with libXm and none of the candidates passed a file format test
*** using a file magic. Last file checked: /sw/lib/libXm.la

*** Warning: linker path does not have real file for library -lgsl.
*** I have the capability to make that library automatically link in when
*** you link to this library.  But I can only do this if you have a
*** shared version of the library, which you do not appear to have
*** because I did check the linker path looking for a file starting
*** with libgsl and none of the candidates passed a file format test
*** using a file magic. Last file checked: /sw/lib/libgsl.la

*** Warning: linker path does not have real file for library -lgslcblas.
*** I have the capability to make that library automatically link in when
*** you link to this library.  But I can only do this if you have a
*** shared version of the library, which you do not appear to have
*** because I did check the linker path looking for a file starting
*** with libgslcblas and none of the candidates passed a file format test
*** using a file magic. Last file checked: /sw/lib/libgslcblas.la

*** Warning: libtool could not satisfy all declared inter-library
*** dependencies of module _pynamot.  Therefore, libtool will create
*** a static module, that should work as long as the dlopening
*** application is linked with the -dlopen flag.

...

Making all in libs
make[2]: Nothing to be done for `all'.
Making all in docs
Making all in helpfiles
make[3]: Nothing to be done for `all'.
Making all in demos
Making all in 6way
make[4]: Nothing to be done for `all'.
Making all in bending
make[4]: Nothing to be done for `all'.
Making all in cube
make[4]: Nothing to be done for `all'.
Making all in curve
make[4]: Nothing to be done for `all'.
Making all in dit
make[4]: Nothing to be done for `all'.
make[4]: Nothing to be done for `all-am'.
make[3]: Nothing to be done for `all-am'.
Making all in etc
make[2]: Nothing to be done for `all'.

Finally, the install…

make install

Which produces the following:

Making install in src
/bin/sh ../mkinstalldirs /usr/local/lib
 /bin/sh ../libtool --mode=install /usr/bin/install -c  _pynamot.la /usr/local/lib/_pynamot.la
/usr/bin/install -c .libs/_pynamot.lai /usr/local/lib/_pynamot.la
/usr/bin/install -c .libs/_pynamot.a /usr/local/lib/_pynamot.a
ranlib /usr/local/lib/_pynamot.a
chmod 644 /usr/local/lib/_pynamot.a
----------------------------------------------------------------------
Libraries have been installed in:
   /usr/local/lib

If you ever happen to want to link against installed libraries
in a given directory, LIBDIR, you must either use libtool, and
specify the full pathname of the library, or use the `-LLIBDIR'
flag during linking and do at least one of the following:
   - add LIBDIR to the `DYLD_LIBRARY_PATH' environment variable
     during execution

See any operating system documentation about shared libraries for
more information, such as the ld(1) and ld.so(8) manual pages.
...
/bin/sh ../mkinstalldirs /usr/local/share/namot
 /usr/bin/install -c -m 644 Namot2.512 /usr/local/share/namot/Namot2.512
 /usr/bin/install -c -m 644 Namot2.600 /usr/local/share/namot/Namot2.600
 /usr/bin/install -c -m 644 Namot2.700 /usr/local/share/namot/Namot2.700
 /usr/bin/install -c -m 644 icon1.xv /usr/local/share/namot/icon1.xv
make[2]: Nothing to be done for `install-exec-am'.
make[2]: Nothing to be done for `install-data-am'.

With luck, your launching of NAMOT will open XQuartz and produce a fully operational NAMOT session.

namot

And, for more assistance with producing DNA files for GROMACS, consider the Modifications To The ffG53a6.rtp And ffG53a5.rtp Residue Topology Files Required For Using GROMOS96-NAMOT-GROMACS v1, sed-Based Script For Converting NAMOT And NAMOT2 DNA Output To GROMOS96 Format For GROMACS Topology Generation v1, and sed-Based Script For Converting NAMOT And NAMOT2 DNA Output To ffAMBER Format For GROMACS Topology Generation v1 pages on this blog.

A Most Unlikely Obvious Molecule: DNA And Its Consequences – Slides From The CNY Skeptics Talk

Friday, November 9th, 2012

I’ve been fortunate twice this year to have the Central New York (CNY) Skeptics force me to commit to presentation topics I thought were worth presenting. As a complement to the audio that will appear at some point on the CNY Skeptics site, I’ve posted the non-animated slides as a PDF below. The press photo is from a way-back Excelsior Cornet Band gig where I had too long a wait between playing and marching.


Download: DGAllis_CNY_Skeptics_DNA_Lecture_7_Nov_2012.pdf, 8.3 MB

CNY Skeptics Presents Damian Allis, Ph.D., on “A Most Unlikely Obvious Molecule: DNA And Its Consequences”

Wednesday, November 7, 2012, 7:00 pm
DeWitt Community Library at Shoppingtown Mall
Buckland Community Room

DNA is Nature’s medium of digital information storage and access from which cellular machinery produces life itself. The 60 years of advances in our understanding of DNA have run in parallel with advances in computer technology and information science, and we are now entering an age of whole-genome maps, customized diagnoses, medicines, and dosages from genetic testing, and genetic modification that may eradicate some disorders completely. From super crops to super humans, the genetic information age offers humanity many different possible outcomes. This lecture will cover some of the history, machinery, possibilities, and consequences of DNA life.

Dr. Damian Allis is a research professor in the Department of Chemistry at Syracuse University, research fellow with the Forensic and National Security Sciences Institute, and bioinformaticist for Aptamatrix, Inc. He contains approximately 20 billion miles of DNA.

Sanger (And Illumina 1.3+ (And Solexa)) Phred Score (Q) ASCII Glyph Base Error Conversion Tables

Friday, December 16th, 2011

Given the importance of these scores in both FASTQ and MAQ (for me, specifically, using MAQ alignment quality scores from Illumina sequencing runs to monitor run and sample quality), I was a bit surprised not to find a complete work-up of the meanings, the scores, the glyphs coordinated to the scores, and the encoding interpretations of these scores in one location. The two (three) tables shown here hopefully provide a meaningful summary.

I should qualify that much of the background for this page was taken from four key places. First is the wikipedia entry for FASTQ. Second is the wikipedia entry for Phred quality score. Third is the Rosetta Stone of Phred Score interpretation in the form of the open access article: P. J. A. Cock, C. J. Fields, N. Goto, M. L. Heuer and P. M. Rice, “The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants.” Nucleic Acids Research, 2010, Vol. 38, No. 6, 1767–1771 doi:10.1093/nar/gkp1137. Fourth is seqanswers.com in various forms.

(Sanger) Phred Quality Scores

I refer you to the two Wikipedia articles on FASTQ and Phred quality scores for historical content (and for a brief discussion of the processing of chromatogram data for the production of quality scores). Table 1 lists the Phred scores (Phred Q) alongside the error probabilities they encode (Probability (P) Of Wrong Base). It then adds the ASCII codes (Sanger “Q + 33” Shift) and characters (Sanger “Q + 33” ASCII GLYPH) for the original Phred scores. In the Sanger method, Phred scores 0-to-93 use ASCII characters 33-to-126; the shift of 33 is applied so that every score is stored as a single printable character. The last two columns give the Illumina 1.3+ codes (Illumina 1.3+ “Q + 64” Shift, using ASCII codes 64-to-126 to cover scores from 0-to-62 on the Q scale) and the corresponding ASCII glyphs (Illumina 1.3+ “Q + 64” ASCII GLYPH). This is all likely completely self-explanatory (or hopefully will be by the bottom of the post). For review, the relationship between Phred quality score Q[Sanger] and the base-calling error probability P is

Q[Sanger] = −10 * log10(P)

or, re-written for the logarithmically challenged…

P = 10^[-Q/10]
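
The two relations above translate directly into code. Here is a minimal Python sketch of the conversion in both directions; the function names phred_to_prob and prob_to_phred are hypothetical, chosen only for this illustration.

```python
import math


def phred_to_prob(q):
    """Error probability P for a Phred quality score Q: P = 10^(-Q/10)."""
    return 10 ** (-q / 10.0)


def prob_to_phred(p):
    """Phred quality score Q for an error probability P: Q = -10*log10(P)."""
    return -10.0 * math.log10(p)


if __name__ == "__main__":
    # Print a few (Q, P) pairs in the same 10-decimal format as Table 1.
    for q in (10, 20, 30, 40):
        print(q, "%.10f" % phred_to_prob(q))
```

The printed values can be compared against the "Probability (P) Of Wrong Base" column of Table 1.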

Table 1. Phred Quality Scores (Q), Wrong Base Probabilities, And Sanger And Illumina 1.3+ ASCII Glyphs.
Columns: Phred Q; Probability (P) Of Wrong Base; Sanger “Q + 33” Shift; Sanger “Q + 33” ASCII Glyph; Illumina 1.3+ “Q + 64” Shift; Illumina 1.3+ “Q + 64” ASCII Glyph (Phred scores of 63 and above have no Illumina 1.3+ entries)
00 1.0000000000 033 ! 064 @
01 0.7943282347 034 " 065 A
02 0.6309573445 035 # 066 B
03 0.5011872336 036 $ 067 C
04 0.3981071706 037 % 068 D
05 0.3162277660 038 & 069 E
06 0.2511886432 039 ' 070 F
07 0.1995262315 040 ( 071 G
08 0.1584893192 041 ) 072 H
09 0.1258925412 042 * 073 I
10 0.1000000000 043 + 074 J
11 0.0794328235 044 , 075 K
12 0.0630957344 045 - 076 L
13 0.0501187234 046 . 077 M
14 0.0398107171 047 / 078 N
15 0.0316227766 048 0 079 O
16 0.0251188643 049 1 080 P
17 0.0199526231 050 2 081 Q
18 0.0158489319 051 3 082 R
19 0.0125892541 052 4 083 S
20 0.0100000000 053 5 084 T
21 0.0079432823 054 6 085 U
22 0.0063095734 055 7 086 V
23 0.0050118723 056 8 087 W
24 0.0039810717 057 9 088 X
25 0.0031622777 058 : 089 Y
26 0.0025118864 059 ; 090 Z
27 0.0019952623 060 < 091 [
28 0.0015848932 061 = 092 \
29 0.0012589254 062 > 093 ]
30 0.0010000000 063 ? 094 ^
31 0.0007943282 064 @ 095 _
32 0.0006309573 065 A 096 `
33 0.0005011872 066 B 097 a
34 0.0003981072 067 C 098 b
35 0.0003162278 068 D 099 c
36 0.0002511886 069 E 100 d
37 0.0001995262 070 F 101 e
38 0.0001584893 071 G 102 f
39 0.0001258925 072 H 103 g
40 0.0001000000 073 I 104 h
41 0.0000794328 074 J 105 i
42 0.0000630957 075 K 106 j
43 0.0000501187 076 L 107 k
44 0.0000398107 077 M 108 l
45 0.0000316228 078 N 109 m
46 0.0000251189 079 O 110 n
47 0.0000199526 080 P 111 o
48 0.0000158489 081 Q 112 p
49 0.0000125893 082 R 113 q
50 0.0000100000 083 S 114 r
51 0.0000079433 084 T 115 s
52 0.0000063096 085 U 116 t
53 0.0000050119 086 V 117 u
54 0.0000039811 087 W 118 v
55 0.0000031623 088 X 119 w
56 0.0000025119 089 Y 120 x
57 0.0000019953 090 Z 121 y
58 0.0000015849 091 [ 122 z
59 0.0000012589 092 \ 123 {
60 0.0000010000 093 ] 124 |
61 0.0000007943 094 ^ 125 }
62 0.0000006310 095 _ 126 ~
63 0.0000005012 096 `
64 0.0000003981 097 a
65 0.0000003162 098 b
66 0.0000002512 099 c
67 0.0000001995 100 d
68 0.0000001585 101 e
69 0.0000001259 102 f
70 0.0000001000 103 g
71 0.0000000794 104 h
72 0.0000000631 105 i
73 0.0000000501 106 j
74 0.0000000398 107 k
75 0.0000000316 108 l
76 0.0000000251 109 m
77 0.0000000200 110 n
78 0.0000000158 111 o
79 0.0000000126 112 p
80 0.0000000100 113 q
81 0.0000000079 114 r
82 0.0000000063 115 s
83 0.0000000050 116 t
84 0.0000000040 117 u
85 0.0000000032 118 v
86 0.0000000025 119 w
87 0.0000000020 120 x
88 0.0000000016 121 y
89 0.0000000013 122 z
90 0.0000000010 123 {
91 0.0000000008 124 |
92 0.0000000006 125 }
93 0.0000000005 126 ~
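
The glyph columns of Table 1 are just the score plus a fixed offset, read as an ASCII code. The following Python sketch shows the mapping in both directions under that reading; the names SANGER_OFFSET, ILLUMINA13_OFFSET, q_to_glyph and glyphs_to_q are hypothetical and exist only for this example.

```python
SANGER_OFFSET = 33      # Sanger: Q 0..93 maps to ASCII 33..126
ILLUMINA13_OFFSET = 64  # Illumina 1.3+: Q 0..62 maps to ASCII 64..126


def q_to_glyph(q, offset):
    """Return the single ASCII character that encodes score q."""
    code = q + offset
    if not 33 <= code <= 126:
        raise ValueError("score %d is not printable with offset %d" % (q, offset))
    return chr(code)


def glyphs_to_q(quality_string, offset):
    """Decode a FASTQ quality string into a list of integer scores."""
    return [ord(ch) - offset for ch in quality_string]


if __name__ == "__main__":
    # The same four scores written in each encoding.
    scores = [40, 40, 20, 0]
    print("".join(q_to_glyph(q, SANGER_OFFSET) for q in scores))
    print("".join(q_to_glyph(q, ILLUMINA13_OFFSET) for q in scores))
```

Decoding a quality string with the wrong offset shifts every score by 31, which is one quick way to spot a mislabeled file.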

An assumption going in when I was producing plots from the Q[Sanger] and Q[Solexa] data was that the “P” was the same value and that the Solexa system simply opted to use the odds (P/(1-P)) as its metric. A proper two-second consideration of the shapes of P and P/(1-P) would have led to the immediate conclusion that something was afoot. The three left-hand columns of Table 2 (2A) are the Q[Solexa] values calculated from the Q[Sanger] probabilities. They are here simply to show that the two scales are, in fact, not the same. If you’ve spent any time wondering why you can’t adequately… manipulate Excel’s rounding tools to reproduce the Q[Solexa] integer values, this is why.

The probabilities obtained for Q[Solexa] were, in fact, worked backwards from the integer values of Q[Solexa] (having found no table online that gives a number-by-number summary of the probability or odds). For background, the Q[Solexa] values are obtained from:

Q[Solexa] = −10 * log10[P/(1-P)]
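
Working backwards from an integer Q[Solexa] to a probability means inverting the odds: odds = 10^(-Q/10), then P = odds/(1 + odds). Below is a minimal Python sketch of both directions; the function names prob_to_solexa and solexa_to_prob are hypothetical, and the loop just prints the first few rows in the style of Table 2B.

```python
import math


def prob_to_solexa(p):
    """Q[Solexa] = -10 * log10(P / (1 - P)), valid for 0 < P < 1."""
    return -10.0 * math.log10(p / (1.0 - p))


def solexa_to_prob(q):
    """Invert the Solexa score: odds = 10^(-Q/10), P = odds / (1 + odds)."""
    odds = 10 ** (-q / 10.0)
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Solexa Q, probability, and odds for the lowest Solexa scores.
    for q in range(-5, 1):
        p = solexa_to_prob(q)
        print(q, "%.7f" % p, "%.7f" % (p / (1.0 - p)))
```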

Table 2A: Q[Solexa] from P[Sanger]. Table 2B: Q[Solexa] and associated odds (P/(1-P)).
Columns 1-3 (Table 2A): Probability (P) Of Wrong Base; Associated Sanger Odds [P/(1-P)]; Q[Solexa] Based On Phred Probability
Columns 4-8 (Table 2B): Solexa Q [-5 to 62]; Solexa Probability (P) Of Wrong Base; Solexa Odds [P/(1-P)]; Solexa “Q + 64” Q Shift; Solexa “Q + 64” ASCII Glyph
0.7943282 3.8621161 -5.8682532 -5 0.7597469 3.1622774 59 ;
0.6309573 1.7097139 -2.3292343 -4 0.7152527 2.5118860 60 <
0.5011872 1.0047602 -0.0206244 -3 0.6661394 1.9952619 61 =
0.3981072 0.6614253 1.7951917 -2 0.6131368 1.5848929 62 >
0.3162278 0.4624753 3.3491146 -1 0.5573117 1.2589255 63 ?
0.2511886 0.3354498 4.7437242 0 0.5000000 1.0000000 64 @
0.1995262 0.2492602 6.0334710 1 0.4426884 0.7943284 65 A
0.1584893 0.1883390 7.2505963 2 0.3868632 0.6309575 66 B
0.1258925 0.1440241 8.4156483 3 0.3338606 0.5011873 67 C
0.1000000 0.1111111 9.5424251 4 0.2847473 0.3981072 68 D
0.0794328 0.0862868 10.6405549 5 0.2402531 0.3162278 69 E
0.0630957 0.0673449 11.7169522 6 0.2007600 0.2511887 70 F
0.0501187 0.0527631 12.7766933 7 0.1663376 0.1995263 71 G
0.0398107 0.0414613 13.8235685 8 0.1368069 0.1584893 72 H
0.0316228 0.0326554 14.8604457 9 0.1118158 0.1258926 73 I
0.0251189 0.0257661 15.8895167 10 0.0909091 0.1000000 74 J
0.0199526 0.0203588 16.9124707 11 0.0735876 0.0794328 75 K
0.0158489 0.0161042 17.9306177 12 0.0593509 0.0630957 76 L
0.0125893 0.0127498 18.9449785 13 0.0477267 0.0501187 77 M
0.0100000 0.0101010 19.9563519 14 0.0382865 0.0398107 78 N
0.0079433 0.0080069 20.9653650 15 0.0306534 0.0316228 79 O
0.0063096 0.0063496 21.9725111 16 0.0245034 0.0251189 80 P
0.0050119 0.0050371 22.9781790 17 0.0195623 0.0199526 81 Q
0.0039811 0.0039970 23.9826759 18 0.0156017 0.0158489 82 R
0.0031623 0.0031723 24.9862446 19 0.0124327 0.0125893 83 S
0.0025119 0.0025182 25.9890773 20 0.0099010 0.0100000 84 T
0.0019953 0.0019993 26.9913260 21 0.0078807 0.0079433 85 U
0.0015849 0.0015874 27.9931114 22 0.0062700 0.0063096 86 V
0.0012589 0.0012605 28.9945291 23 0.0049869 0.0050119 87 W
0.0010000 0.0010010 29.9956549 24 0.0039653 0.0039811 88 X
0.0007943 0.0007950 30.9965489 25 0.0031523 0.0031623 89 Y
0.0006310 0.0006314 31.9972589 26 0.0025056 0.0025119 90 Z
0.0005012 0.0005014 32.9978228 27 0.0019913 0.0019953 91 [
0.0003981 0.0003983 33.9982707 28 0.0015824 0.0015849 92 \
0.0003162 0.0003163 34.9986264 29 0.0012573 0.0012589 93 ]
0.0002512 0.0002513 35.9989090 30 0.0009990 0.0010000 94 ^
0.0001995 0.0001996 36.9991334 31 0.0007937 0.0007943 95 _
0.0001585 0.0001585 37.9993116 32 0.0006306 0.0006310 96 `
0.0001259 0.0001259 38.9994532 33 0.0005009 0.0005012 97 a
0.0001000 0.0001000 39.9995657 34 0.0003979 0.0003981 98 b
0.0000794 0.0000794 40.9996550 35 0.0003161 0.0003162 99 c
0.0000631 0.0000631 41.9997260 36 0.0002511 0.0002512 100 d
0.0000501 0.0000501 42.9997823 37 0.0001995 0.0001995 101 e
0.0000398 0.0000398 43.9998271 38 0.0001585 0.0001585 102 f
0.0000316 0.0000316 44.9998627 39 0.0001259 0.0001259 103 g
0.0000251 0.0000251 45.9998909 40 0.0001000 0.0001000 104 h
0.0000200 0.0000200 46.9999133 41 0.0000794 0.0000794 105 i
0.0000158 0.0000158 47.9999312 42 0.0000631 0.0000631 106 j
0.0000126 0.0000126 48.9999453 43 0.0000501 0.0000501 107 k
0.0000100 0.0000100 49.9999566 44 0.0000398 0.0000398 108 l
0.0000079 0.0000079 50.9999655 45 0.0000316 0.0000316 109 m
0.0000063 0.0000063 51.9999726 46 0.0000251 0.0000251 110 n
0.0000050 0.0000050 52.9999782 47 0.0000200 0.0000200 111 o
0.0000040 0.0000040 53.9999827 48 0.0000158 0.0000158 112 p
0.0000032 0.0000032 54.9999863 49 0.0000126 0.0000126 113 q
0.0000025 0.0000025 55.9999891 50 0.0000100 0.0000100 114 r
0.0000020 0.0000020 56.9999913 51 0.0000079 0.0000079 115 s
0.0000016 0.0000016 57.9999931 52 0.0000063 0.0000063 116 t
0.0000013 0.0000013 58.9999945 53 0.0000050 0.0000050 117 u
0.0000010 0.0000010 59.9999957 54 0.0000040 0.0000040 118 v
0.0000008 0.0000008 60.9999966 55 0.0000032 0.0000032 119 w
0.0000006 0.0000006 61.9999973 56 0.0000025 0.0000025 120 x
0.0000005 0.0000005 62.9999978 57 0.0000020 0.0000020 121 y
0.0000004 0.0000004 63.9999983 58 0.0000016 0.0000016 122 z
0.0000003 0.0000003 64.9999986 59 0.0000013 0.0000013 123 {
0.0000003 0.0000003 65.9999989 60 0.0000010 0.0000010 124 |
0.0000002 0.0000002 66.9999991 61 0.0000008 0.0000008 125 }
0.0000002 0.0000002 67.9999993 62 0.0000006 0.0000006 126 ~

With all three data sets, I reproduce below a plot familiar to the FASTQ community, showing the asymptotic behavior of the Q[Solexa] and Q[Sanger] values at high Q (which represents the lowest read errors; the two approach one another because the numbers are simply too damn small on the plot). Also obvious from the plot is that the curves show poor agreement with each other in the range where the error probability is highest (so the entire analysis goes to pot as the data quality goes to pot) [ed. note for the international reader: “pot” refers to the device found in the water-closet]. The grey line is a good plot of the wrong data (that in Table 2A).
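
The convergence at high Q can also be seen numerically. Substituting P = odds/(1 + odds) into the Phred definition gives Q[Phred] = 10 * log10(10^(Q[Solexa]/10) + 1), which is also the conversion given in the Cock et al. article cited above. The Python sketch below evaluates it for a few scores; the function name solexa_to_phred is hypothetical.

```python
import math


def solexa_to_phred(q_solexa):
    """Q[Phred] = 10 * log10(10^(Q[Solexa]/10) + 1)."""
    return 10.0 * math.log10(10 ** (q_solexa / 10.0) + 1.0)


if __name__ == "__main__":
    # Solexa score, equivalent Phred score, and the gap between them.
    for q in (-5, 0, 10, 20, 40):
        phred = solexa_to_phred(q)
        print(q, "%.4f" % phred, "%.4f" % (phred - q))
```

The gap in the last column shrinks toward zero as Q grows, which is the asymptotic agreement described above, and is largest at the low-quality end of the scale.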

The presentation of this data is likely complete overkill, but I have found it useful in discussion. Hopefully, having the tables in front of someone during an explanation will help clarify that explanation.

bfast-0.6.5a, MUMmer-3.22, and Amos-3.0.0 Installs In Ubuntu 10.04 LTS (And Related)

Thursday, July 14th, 2011

Taking care of a DNA/RNA fragment alignment installation triple-threat with this post. These Ubuntu installs are largely problem-free, but one little trick is needed for Amos (this because of my use of “/opt” for my usual installation and compilation attempts and, more so, my not being interested in modifying the root PATH statement despite the constant use of sudo when building in “/opt”).

So, with the downloads of

bfast-0.6.5a (currently: sourceforge.net/apps/mediawiki/bfast/index.php?title=Main_Page)
MUMmer-3.22 (currently: mummer.sourceforge.net)
Amos-3.0.0 (currently: sourceforge.net/apps/mediawiki/amos/index.php?title=AMOS)

taken care of, the following process is performed.

user@machine:~$ sudo aptitude update
user@machine:~$ sudo aptitude upgrade

[POSSIBLE RESTART REQUIRED after this. You don't need-need to update/upgrade, but I do it before all builds regardless.]

user@machine:~$ sudo apt-get install bison build-essential cmake csh doxygen flex fort77 freeglut3-dev g++ g++-multilib gcc gcc-multilib gettext gfortran gnuplot ia32-libs lib32asound2 lib32gcc1 lib32gcc1-dbg lib32gfortran3 lib32gomp1 lib32mudflap0 lib32ncurses5 lib32nss-mdns lib32z1 libavdevice52 libbz2-dev libc6-dev-i386 libc6-i386 libfreeimage-dev libglew1.5-dev libnetcdf-dev libopenal1 libopenexr-dev libopenmpi-dev libpng12-dev libqt4-dev libssl-dev libstdc++6-4.3-dbg libstdc++6-4.3-dev libstdc++6-4.3-doc libxext-dev libxi-dev libxml-simple-perl libxmu-dev libxt-dev mercurial nfs-common nfs-kernel-server openmpi-bin patch portmap python2.6-dev rpm ssh tcsh xorg-dev zlib1g-dev

The large apt-get above is my “default” additional install for a variety of programs, including Amber, Abinit, GAMESS, GROMACS, etc. Many of these packages may not be needed, but hard drives are cheap and figuring out the minimum list is more work than simply installing everything. Do check the list, however, to make sure nothing will conflict with other installs on your machine (if you’re new to this, likely not; if you’ve done builds a few times, you may already know the difference).

user@machine:~$ sudo mv bfast-0.6.5a.tar.gz /opt
user@machine:~$ sudo mv MUMmer3.22.tar.gz /opt
user@machine:~$ sudo mv amos-3.0.0.tar.gz /opt
user@machine:~$ cd /opt

Move the three archives to /opt (or not). Specifically for bfast, one additional apt-get (for two Perl packages) is required.

user@machine:/opt$ sudo apt-get install libstatistics-descriptive-perl libdbd-pg-perl

The build for bfast is straightforward.

user@machine:/opt$ sudo tar xvfz bfast-0.6.5a.tar.gz 
user@machine:/opt$ cd bfast-0.6.5a/
user@machine:/opt/bfast-0.6.5a$ sudo ./configure 
user@machine:/opt/bfast-0.6.5a$ sudo make
user@machine:/opt/bfast-0.6.5a$ sudo make install
user@machine:/opt/bfast-0.6.5a$ cd ..

bfast is officially built and you’ve returned to your “/opt” directory. MUMmer is also straightforward.

user@machine:/opt$ sudo tar xvfz MUMmer3.22.tar.gz 
user@machine:/opt$ cd MUMmer3.22/
user@machine:/opt/MUMmer3.22$ sudo make check
user@machine:/opt/MUMmer3.22$ sudo make install

MUMmer is officially built. If you intend to build Amos, you will need some of what you built in MUMmer. Specifically, nucmer, delta-filter, and show-coords are used by Amos and must be present in your PATH during the Amos build. As I am building in “/opt,” I’m using sudo. As I do not want to deal with setting a new PATH for root, the solution is simply to copy these three programs to a universally accessible place.

user@machine:/opt/MUMmer3.22$ sudo cp nucmer /usr/local/bin/
user@machine:/opt/MUMmer3.22$ sudo cp delta-filter /usr/local/bin/
user@machine:/opt/MUMmer3.22$ sudo cp show-coords /usr/local/bin/
user@machine:/opt/MUMmer3.22$ cd ..
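
Before moving on, it may be worth confirming that the three programs are actually visible. The function below is a small sketch (check_tools is a made-up name, not part of MUMmer or Amos) that reports which of the named programs the current shell can find on its PATH.

```shell
# Report which helper programs are visible on the current PATH.
# Returns non-zero if any of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# The three MUMmer programs that the Amos configure step looks for
check_tools nucmer delta-filter show-coords || echo "fix PATH before running ./configure"
```

Because sudo may reset PATH, the check is only meaningful when run in the same environment as the build itself.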

And, with that, you are ready for the Amos build procedure. You will need two more apt-get installs to complete the Amos build.

user@machine:/opt$ sudo apt-get install libboost-all-dev libqt3-headers

To build Amos with no errors and all of the listed components, note the ./configure settings and run the list below.

user@machine:/opt$ sudo tar xvfz amos-3.0.0.tar.gz 
user@machine:/opt$ cd amos-3.0.0/
user@machine:/opt/amos-3.0.0$ sudo ./configure --with-Qt-dir=/usr/share/qt3 --prefix=/opt/amos-3.0.0
user@machine:/opt/amos-3.0.0$ sudo make
user@machine:/opt/amos-3.0.0$ sudo make check
user@machine:/opt/amos-3.0.0$ sudo make install

If nucmer, delta-filter, show-coords, the Qt3 libraries, and Boost are not present, you’ll see the following warning list after running ./configure:

-- AMOS Assembler 2.0.8 Configuration Results --
  C compiler:          gcc -g -O2
  C++ compiler:        g++ -g -O2
  GCC version:         gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3
  Host System type:    x86_64-unknown-linux-gnu
  Install prefix:      /opt/amos-3.0.0
  Install eprefix:     ${prefix}

  See config.h for further configuration information.
  Email  with questions and bug reports.

WARNING! nucmer was not found but is required to run AMOScmp
   install nucmer if planning on using AMOScmp
WARNING! delta-filter was not found but is required to run AMOScmp-shortReads-alignmentTrimmed
   install delta-filter if planning on using AMOScmp-shortReads-alignmentTrimmed
WARNING! show-coords was not found but is required to run minimus2
   install show-coords if planning on using minimus2
WARNING! Qt3 toolkit was not found but is required to run AMOS GUIs
   install Qt3 or locate Qt3 with configure to build GUIs
   see config.log for more information on what went wrong
WARNING! Boost graph toolkit was not found but is required to run parts of the AMOS Scaffolder (Bambus 2)
   install Boost or locate Boost with configure to build Scaffolder
   see config.log for more information on what went wrong
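
If the configure output has been saved to a file, the missing pieces can be pulled out in one pass. This is a sketch that assumes the “WARNING! <name> was not found” wording shown above; the function name missing_components and the file name configure_output.txt are hypothetical.

```shell
# Print the name of every component that configure reported as not found.
# Reads the named file(s), or standard input when no file is given.
missing_components() {
    sed -n 's/^WARNING! \(.*\) was not found.*/\1/p' "$@"
}

# Typical use (file name is only an example):
#   sudo ./configure --with-Qt-dir=/usr/share/qt3 --prefix=/opt/amos-3.0.0 2>&1 | tee ~/configure_output.txt
#   missing_components ~/configure_output.txt

# Self-contained demonstration on one line of the output above
printf '%s\n' 'WARNING! nucmer was not found but is required to run AMOScmp' | missing_components
```

An empty result means configure found everything it checks for in this list.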

And, finally, add these directories to your PATH.

user@machine:~$ cd
user@machine:~$ pico .profile

Add the following to your PATH statement:

/opt/amos-3.0.0/bin/:/opt/MUMmer3.22/:/opt/bfast-0.6.5a/butil/:
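
As an alternative to editing the PATH statement by hand, the lines below could be placed in .profile. This is only a sketch: path_append is a made-up helper that adds a directory when it exists and is not already listed, so sourcing .profile repeatedly does not grow PATH.

```shell
# Append a directory to PATH only if it exists and is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;   # already on PATH, nothing to do
    esac
    if [ -d "$1" ]; then
        PATH="$PATH:$1"
    fi
}

# The three directories from this post
path_append /opt/amos-3.0.0/bin
path_append /opt/MUMmer3.22
path_append /opt/bfast-0.6.5a/butil
export PATH
```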

Ctrl-X, “Y”, and quit.

UNAFold 3.8, MFold Utilities 4.5/4.6 And Additional Component Installation (Using XCode Tools 3 And Fink 0.29.21) For OSX 10.6.x

Wednesday, June 8th, 2011

NOTE: The version numbers for everything are given specifically because aspects of the installation process may change with different versions. If major version changes alter the procedure below, I will not necessarily know the answer to the problems that follow (and that should clear up the “qualifications” section).

The UNAFold (UNified Nucleic Acid Fold(ing)) nucleic acid folding and hybridization prediction program set (here using version 3.8) can by itself be built with few (and unimportant) errors in OSX with Xcode Tools 3. The actual running of UNAFold.pl produces several errors that do not affect the run but do affect the amount/format of the output. It is my assumption that any OS running a less-than “kitchen sink” installation of Linux/Unix (Ubuntu, gentoo and Damn Small Linux come to mind) will have these errors and will require subsequent installations of the programs/libraries that pieces of UNAFold rely on for processing output into, specifically, images and PDF files. OSX has the same issue, which is easy to handle using Fink (and less easy by trying to install otherwise completely unrelated programs just to make these “dependencies” (programs and libraries) available to UNAFold). Once Fink is installed, it is a few-step process to build UNAFold, move the Mfold Utilities contents to their proper folders (and there is a small trick here as well), and generate a UNAFold-complete install for all your DNA/RNA needs.

1. UNAFold 3.8 Installation

To begin, download (currently at mfold.rna.albany.edu/?q=DINAMelt/software), extract, open a terminal, cd into the unafold_3.8 directory (likely ~/Downloads/unafold_3.8), and run ./configure.

[prompt]$ cd ~/Downloads/unafold_3.8
[prompt]$ ./configure

On my machine (MacBook Pro, 10.6.x OSX + XCode Tools 3), this produces the output found in the local file 2011june_unafold_configure_output.txt.

You will likely note two sets of errors in the ./configure output:

./configure: line 8579: sort: No such file or directory
./configure: line 8576: sed: No such file or directory
./configure: line 10077: sort: No such file or directory
./configure: line 10074: sed: No such file or directory

The 10077 and 10074 errors are a bit odd because there are only 10039 lines in the configure file.

Are these errors important? No, you can build UNAFold just fine. I have run into these two “sort” and “sed” problems with a few other build attempts in OSX but have no good answer as to how to get around them. In case you’re wondering, sort and sed are most certainly installed on the machine. The “sort” error can be removed by specifying the path explicitly in the configure file (in line 8579, change “sort” to “/usr/bin/sort”), but the sed error persisted in the few attempts I made to work around it. It doesn’t appear to be a simple PATH issue. I’m not yet interested enough to find a proper solution but, if you know one, please post a comment or send a message. Is it just a character issue as discussed at itmercenary.livejournal.com/1585.html?
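
For anyone who does want to dig further, a first step is to look at what the reported line numbers actually point to. The helper below is a sketch only (show_lines is a made-up name) and does not fix anything; it just prints a numbered range of lines from a script.

```shell
# Print lines $2 through $3 of file $1, each prefixed with its line number.
show_lines() {
    awk -v a="$2" -v b="$3" 'NR >= a && NR <= b { printf "%d: %s\n", NR, $0 }' "$1"
}

# Possible use from within the unafold_3.8 directory:
#   show_lines configure 8574 8580   # the lines named in the first pair of errors
#   wc -l configure                  # total line count, to compare with 10074 and 10077
#   command -v sort sed              # where the shell itself finds the two programs
```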

Running “make” produces the output found in the local file 2011june_unafold_make_output.txt.

[prompt]$ make

No issues. To install UNAFold, which will default to putting components into /usr/local/bin and /usr/local/share/, run sudo make install, which produces the output found in the local file 2011june_unafold_sudo_make_install_output.txt.

[prompt]$ sudo make install

Again, no issues. You will now have a populated /usr/local/bin folder.

2. MFold Utilities 4.5 (and, currently, the source for 4.6)

The next (optional) step is the inclusion of the mfold_util-4.5-Mac binaries (currently available at mfold.rna.albany.edu/?q=mfold/download-mfold), which I’ve also placed into the /usr/local/bin folder by extracting the contents of this file, then performing a sudo cp * /usr/local/bin from within the MacBin directory.

[prompt]$ cd ~/Downloads/MacBin/
[prompt]$ sudo cp * /usr/local/bin

The processing of the data into plots with these programs requires that a set of *.col files be placed in the folder /usr/local/shared/mfold_util. Furthermore, these *.col files are NOT provided in the mfold_util-4.5-Mac binary package. To get these files, you need only download the mfold_util-4.6.tar.gz file (currently at mfold.rna.albany.edu/?q=mfold/download-mfold), extract it, make the /usr/local/shared/mfold_util folder, cd your way into src, and copy the *.col files to /usr/local/shared/mfold_util.

[prompt]$ sudo mkdir -p /usr/local/shared/mfold_util
[prompt]$ cd ~/Downloads/mfold_util-4.6/src
[prompt]$ sudo cp *.col /usr/local/shared/mfold_util

3. Fink 0.29.21 Install From Scratch

The first indication that other work was required came from trying to run mutplot randomly, which produced the following error:

dyld: Library not loaded: /sw/lib/libpng12.0.dylib
  Referenced from: /usr/local/bin/mutplot
  Reason: image not found
Trace/BPT trap

As digging around for libraries is not as straightforward as it would be for a Linux distro, I chose instead to solve the many problems by installing dependencies through the Fink program (currently fink-0.29.21). As 10.6.x users will find that there is no available Fink binary, you must build it from source, which, with Xcode Tools 3 installed, occurs without error. If you don’t have Xcode Tools 3 installed, a copy of XCode Tools 4 must now be bought through the App Store, a mechanism that is less than ideal (to me, anyway. $4.99?).

Download the fink source (fink 0.29.21), extract, cd into the fink-0.29.21 directory, and run bootstrap. Upon completion, you run pathsetup.sh, source your .profile, and update fink.

[prompt]$ cd ~/Downloads/fink-0.29.21
[prompt]$ ./bootstrap
[prompt]$ . /sw/bin/pathsetup.sh 
[prompt]$ cd ~/
[prompt]$ source .profile
[prompt]$ fink selfupdate-rsync
[prompt]$ fink update-all

The output for my installation can be found in 2011june_fink_install_output.txt. The rsync output can be found in 2011june_fink_selfupdate_rsync_output.txt. NOTE: You will be asked several questions about the installation process. Be prepared to blindly select the default settings with [enter], but don’t just walk away from the screen.

This completes the UNAFold install, MacBin install, and Fink install, meaning now we can walk through the dependencies.

4. Installing UNAFold (well, MFold Utils) Dependencies

The first UNAFold.pl run attempt, made before installing any dependencies, produces the following errors:

[prompt]$ UNAFold.pl seqtest.txt 
Checking for boxplot_ng... dyld: Library not loaded: /sw/lib/libpng12.0.dylib
  Referenced from: /usr/local/bin/boxplot_ng
  Reason: image not found
found, supports Postscript
Checking for hybrid-plot-ng... found, supports Postscript
Checking for sir_graph_ng or sir_graph... dyld: Library not loaded: /sw/lib/libpng12.0.dylib
  Referenced from: /usr/local/bin/sir_graph
  Reason: image not found
found, supports Postscript
Checking for ps2pdfwr... not found
Calculating for seqtest.txt, t = 37
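
When the run output is long, the distinct missing libraries can be listed directly. This is a sketch that assumes the “dyld: Library not loaded:” wording shown above; missing_dylibs and the file name unafold_run.txt are made up for illustration.

```shell
# List each library that dyld reported as not loaded, once.
# Reads the named file(s), or standard input when no file is given.
missing_dylibs() {
    sed -n 's/.*dyld: Library not loaded: //p' "$@" | sort -u
}

# Typical use (file name is only an example):
#   UNAFold.pl seqtest.txt 2>&1 | tee unafold_run.txt
#   missing_dylibs unafold_run.txt

# Self-contained demonstration on one line of the output above
printf '%s\n' 'Checking for boxplot_ng... dyld: Library not loaded: /sw/lib/libpng12.0.dylib' | missing_dylibs
```

For the run above, the only library reported is /sw/lib/libpng12.0.dylib, even though two different programs fail to load it.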

As the UNAFold install page states, you need glut, the GD library, and gnuplot installed (and all of the many libraries therein).

[prompt]$ fink install libjpeg tetex gd2 gnuplot

For gnuplot, you will be required to make a few selections during the build process. Blindly hitting the enter key at these questions will do, but this is not just a “type and go” install process (and it took about two hours on a MBP).

A final working error-free run looks as below, leaving you to process the data with the MFold Utilities as you like:

[prompt]$ UNAFold.pl seqtest.txt 
Checking for boxplot_ng... found, supports Postscript
Checking for hybrid-plot-ng... found, supports Postscript
Checking for sir_graph_ng or sir_graph... found, supports Postscript
Checking for ps2pdfwr... found
Calculating for seqtest.txt, t = 37
