And Happy New Year,
Yet another random Ubuntu-centric bioinformatics aside in the event others run into the same build issues (with errors included below, as you likely googled those first). For those wondering…
The Hierarchical Catalog of Orthologs v8
Orthology (website, download) is the cornerstone of comparative genomics and gene function prediction. OrthoDB aims to classify protein-coding genes from the increasing number of available sequenced genomes into groups of orthologs descended from a single gene of the last common ancestor (LCA) of each clade of species. Applying this concept to the hierarchy of LCAs along the species phylogeny results in multiple levels of orthology: the more closely-related the species, the more finely-resolved the orthologous relations.
The build here was on a fresh installation of 64-bit Ubuntu 14.04 (Trusty Tahr). All of the errors produced come from running on that clean install, meaning you’ll run into dumb errors (like missing build-essential), didn’t-know-we-needed-that errors (boost), and that’s-probably-an-Ubuntu-oddity errors (with a modified Makefile.rules file with explicit boost calls linked to below; I suspect the developers are working on a non-Ubuntu distro).
Continue reading “OrthoDB 1.6 Installation On Ubuntu 14.04 (And Related) – Build Errors And The Simple Fixes”
A recent visit to the College of Nanoscale Science and Engineering (CNSE) at SUNY Albany inspired a few new DNA ideas that I decided would be greatly simplified by having NAMOT available again for design. Having failed at the base install of the NAMOT 2 version and, unfortunately, not having NAMOT available in Fink for a simple installation, the solution became to build the pre-release from scratch. Ignoring the many errors one encounters while walking through an OSX/Xcode/Fink/X11 bootstrap, the final procedure worked well and without major problem. As usual, the error messages at varied steps are provided below because, I assume, those messages are what you’re searching for when you find your way here.
0. Required Installations
You’ll need the following installed for this particular build. I believe Xcode is the only thing that you’ll have to pay for, if you don’t already have it (I seem to remember paying $5 through the App Store).
Continue reading “NAMOT Pre-Release 2.2.0-pre4 In OSX 10.8 (Maybe Older Versions)”
I’ve been fortunate twice this year to have the Central New York (CNY) Skeptics force me to commit to presentation topics I thought were worth presenting. As a complement to the audio that will appear at some point on the CNY Skeptics site, I’ve posted the non-animated slides as a PDF below. And the press photo’s from a way-back Excelsior Cornet Band gig where I had too long a wait between playing and marching.
Download: DGAllis_CNY_Skeptics_DNA_Lecture_7_Nov_2012.pdf, 8.3 MB
Continue reading “A Most Unlikely Obvious Molecule: DNA And Its Consequences – Slides From The CNY Skeptics Talk”
This post is a brief update to a much longer and more involved discussion of Amber 11 and AmberTools installation in Ubuntu 10.04 LTS (Lucid Lynx) (as the changes are minor and the parallelization setup remains largely the same). You can find this more involved discussion at www.somewhereville.com/?p=1422.
Long/Short – the installation under Ubuntu 12.04 LTS (Precise Pangolin) is not much different and goes without hitch provided you keep your locations organized. NOTE 1: I’ve not a copy of Amber12, so cannot speak for any changes to its installation procedure. NOTE 2: This install assumes 32-bit only.
If you tried installing all of the build software from the 10.04 LTS post, you’ll receive errors like the following (as usual, I include error messages for those who are searching against error messages)…
Continue reading “Brief Update: Amber 11 And AmberTools 1.5 In Ubuntu 12.04 LTS”
Given the importance of the use of these scores both in FASTQ and MAQ (for MAQ (for me), specifically using alignment quality scores from Illumina sequencing runs to monitor run and sample quality), I was a bit surprised to not find some complete work-up of the meanings, the scores, the glyphs coordinated to the scores, and the encoding interpretations of these scores in one location. The two (three) tables shown here hopefully provide a meaningful summary.
I should qualify that much of the background for this page was taken from four key places. First is the Wikipedia entry for FASTQ. Second is the Wikipedia entry for Phred quality score. Third is the Rosetta Stone of Phred Score interpretation in the form of the open access article: P. J. A. Cock, C. J. Fields, N. Goto, M. L. Heuer and P. M. Rice, “The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants.” Nucleic Acids Research, 2010, Vol. 38, No. 6, 1767–1771, doi:10.1093/nar/gkp1137. Fourth is seqanswers.com in various forms.
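For those who would rather compute than look up, the glyph-to-score-to-error relationships can be sketched in a few lines of Python. This is a minimal sketch and not the source of the tables: the function and constant names are my own (hypothetical), the ASCII offsets (33 for Sanger, 64 for Illumina 1.3+ and Solexa) and the Solexa-to-Phred formula are the ones given in the Cock et al. article, and Solexa glyphs still need the extra conversion step because Solexa scores are odds-based and not straight Phred scores.

```python
import math

# ASCII offsets as described in Cock et al. (names here are illustrative only).
SANGER_OFFSET = 33      # Sanger FASTQ: glyph '!' is Q = 0
ILLUMINA13_OFFSET = 64  # Illumina 1.3+ (and Solexa): glyph '@' is Q = 0


def glyph_to_score(glyph, offset=SANGER_OFFSET):
    """Convert one ASCII quality glyph to its integer score."""
    return ord(glyph) - offset


def phred_to_error(q):
    """Phred score to base-call error probability: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Error probability to Phred score: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def solexa_to_phred(q_solexa):
    """Solexa (odds-based) score to Phred score."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


if __name__ == "__main__":
    # The same Q = 40 call written in the two encodings.
    for glyph, offset in (("I", SANGER_OFFSET), ("h", ILLUMINA13_OFFSET)):
        q = glyph_to_score(glyph, offset)
        print(glyph, q, phred_to_error(q))
```

The practical warning is that the same glyph means different scores under different offsets, so reading an Illumina 1.3+ file as Sanger inflates every score by 31.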
Continue reading “Sanger (And Illumina 1.3+ (And Solexa)) Phred Score (Q) ASCII Glyph Base Error Conversion Tables”
Having successfully navigated serial and parallel Amber10 installs under Ubuntu 8.10, I am pleased to report that the process for Amber11 with OpenMPI (from apt-get, one doesn’t have to build from scratch) under Ubuntu 10.10 is seemingly much easier (and I have it here so I don’t forget). There is a bit of persnicketiness to the order of the serial and parallel installs that must be kept track of (and I’m building in serial-to-parallel order), but the process is otherwise straightforward.
For organizational purposes, I’m building amber11 in my $HOME directory. This removes some of the PATH issues with sudo-ing aspects of the install (and can be moved into another directory after the build is complete).
1. apt-get Installs
The search for dependent programs and libraries is a long and involved one given how many programs I have installed. Therefore, instead of trying to find all of the amber-dependent installs for successful building, I’m simply providing the list of everything I have on the test machine. As hard drives are cheap and Ubuntu will warn of conflicts, I recommend simply installing the below and accepting the 100 MB hit to NOT have to find the smallest apt-get set (yes, some of these are obviously not needed).
Continue reading “Amber 11 And AmberTools 1.5 In Ubuntu 10.04 LTS (And Related, Including A How-To For EOL 8.10)”
Taking care of a DNA/RNA fragment alignment installation triple-threat with this post. These Ubuntu installs are largely problem-free, but one little trick is needed for Amos (this because of my use of “/opt” for my usual installation and compilation attempts and, more so, my not being interested in modifying the root PATH statement despite the constant use of sudo when building in “/opt”).
So, with the downloads of
bfast-0.6.5a (currently: sourceforge.net/apps/mediawiki/bfast/index.php?title=Main_Page)
MUMmer-3.22 (currently: mummer.sourceforge.net)
Amos-3.0.0 (currently: sourceforge.net/apps/mediawiki/amos/index.php?title=AMOS)
taken care of, the following process is performed.
Continue reading “bfast-0.6.5a, MUMmer-3.22, and Amos-3.0.0 Installs In Ubuntu 10.04 LTS (And Related)”
NOTE: The version numbers for everything are given specifically because aspects of the installation process may change with different versions and, in the event, I will not necessarily know the answer to subsequent problems if major version changes include major changes to the below (and that should clear up the “qualifications” section).
The UNAFold (UNified Nucleic Acid Fold(ing)) nucleic acid folding and hybridization prediction program set (here using version 3.8) can by itself be built with few (and not important) errors in OSX with Xcode Tools 3. The actual running of UNAFold.pl produces several errors that do not affect the run but do affect the amount/format of the output. It is my assumption that any OS running a less-than “kitchen sink” installation of Linux/Unix (Ubuntu, gentoo and Damn Small Linux come to mind) will have these errors and will require subsequent installations of programs/libraries that pieces of UNAFold rely on for processing output into, specifically, images and PDF files. OSX has the same issue that is easy to handle using Fink (and less so trying to install otherwise completely unrelated programs to make these “dependencies” (programs and libraries) available to UNAFold). Once Fink is installed, it is a few-step process to build UNAFold, move the Mfold Utilities contents to their proper folders (and there is a small trick here as well), and generate a UNAFold-complete install for all your DNA/RNA needs.
Continue reading “UNAFold 3.8, MFold Utilities 4.5/4.6 And Additional Component Installation (Using XCode Tools 3 And Fink 0.29.21) For OSX 10.6.x”
So, with the BclConverter installation complete and a small QSEQ-to-FASTQ script available to convert the QSEQ output, the/a next step is the alignment of your lane-worth of sequenced DNA. The Maq program is used by the Cornell Sequencing Center (and was recommended as the workhorse tool for this task) and is available by link from the Illumina third-party tools list. In keeping with my no-interest-in-installing-another-distro run of Ubuntu luck, the procedure below explains the process of building Maq using as much apt-get as possible. In the case of Maq, there is one small busy step in the installation process because we need a copy of libstdc++.so.5 local that is NOT available by some easy package install (although what one has to do isn’t terribly difficult either and I’ve linked local copies of the two .deb files below).
The process begins with apt-get, continues to dpkg, and then is finished with an easy make.
1. apt-get Install List
The official package list, I am quite sure, is below. From a Terminal window:
Continue reading “Maq-0.6.x Or Maq-0.7.x (And Likely Others) Installation In Ubuntu 10.04 LTS (And Likely Others)”
What follows is the procedure for successfully building and running BclConverter-1.7.1 under Ubuntu (specifically 10.04, but this will likely be generic for other versions) using only apt-get to install missing programs and libraries, thereby trying to keep the install process as build-friendly as possible to the general (non-coding) user.
So, What’s BCL And Why Does It Need Converting?
The newest version of the Illumina sequencing software no longer uses the QSEQ format during the sequencing run, relying now on BCL files. This 12 January 2010 post snip from www.politigenomics.com covers the intro nicely.
Gone are the QSEQ files, they are replaced by BCL files which are binary, per image, per cycle files that contain the base call and quality information. Because they are per image, per cycle files, they can be transferred cycle by cycle as they are generated (as opposed to QSEQ files which are read based). The BCL files are also more compact, requiring only 1 byte/base (B/b) as compared to QSEQ files which require about 2.5 B/b. In addition, the intensity files are also not transferred by default, so RTA output goes from 10 B/b to just 1 B/b. Thus, even though you are generating five times more sequence data than a GA, your RTA directory will actually be smaller (about 250 GB).
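To put the bytes-per-base numbers from that snip into disk-space terms, here is a minimal Python sketch of the arithmetic. The per-base figures (1 B/b for BCL, about 2.5 B/b for QSEQ, 10 B/b for the older RTA output with intensities) come from the quoted post; the 100-gigabase run size and the function name are purely hypothetical and chosen only to make the comparison readable.

```python
# Bytes-per-base figures taken from the quoted politigenomics.com post.
BYTES_PER_BASE = {
    "BCL": 1.0,                       # binary base call + quality
    "QSEQ": 2.5,                      # approximate, text-based
    "old RTA (with intensities)": 10.0,
}


def storage_gb(bases, bytes_per_base):
    """Approximate storage in GB (10^9 bytes) for a given number of bases."""
    return bases * bytes_per_base / 1e9


if __name__ == "__main__":
    run_bases = 100e9  # hypothetical 100-gigabase run, for illustration only
    for label, bpb in BYTES_PER_BASE.items():
        print(f"{label}: {storage_gb(run_bases, bpb):.0f} GB")
```

The ratios are the point here and not the absolute numbers: whatever the run size, BCL needs 40% of the QSEQ space and a tenth of the old intensity-inclusive output.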
Continue reading “BclConverter-1.7.1 Installation In Ubuntu 10.04 LTS (And Related)”