Sanger (And Illumina 1.3+ (And Solexa)) Phred Score (Q) ASCII Glyph Base Error Conversion Tables

Given how heavily these scores are used in both FASTQ and MAQ (in my case, specifically using MAQ alignment quality scores from Illumina sequencing runs to monitor run and sample quality), I was a bit surprised not to find a complete work-up of the meanings, the scores, the glyphs assigned to the scores, and the encoding interpretations of these scores in one location. The two (three) tables shown here hopefully provide a meaningful summary.

I should note that much of the background for this page was taken from four key places. First is the Wikipedia entry for FASTQ. Second is the Wikipedia entry for Phred quality score. Third is the Rosetta Stone of Phred score interpretation, the open access article: P. J. A. Cock, C. J. Fields, N. Goto, M. L. Heuer and P. M. Rice, “The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants.” Nucleic Acids Research, 2010, Vol. 38, No. 6, 1767–1771 doi:10.1093/nar/gkp1137. Fourth is seqanswers.com in various forms.

(Sanger) Phred Quality Scores

I refer you to the two Wikipedia articles on FASTQ and Phred quality scores for historical content (and for a brief discussion of how chromatogram data are processed to produce quality scores). Table 1 lists the Phred quality score Q (Phred Q) alongside the corresponding probability P of a wrong base call (Probability (P) Of Wrong Base). It then adds the ASCII codes (Sanger “Q + 33” Shift) and characters (Sanger “Q + 33” ASCII GLYPH) for the original Phred scores: in the Sanger method, Phred scores 0-to-93 map to ASCII characters 33-to-126, which keeps every score a single printable character. The last two columns give the Illumina 1.3+ codes (Illumina 1.3+ “Q + 64” Shift, using ASCII characters 64-to-126 for scores 0-to-62 on the “Q” scale) and the corresponding ASCII glyphs (Illumina 1.3+ “Q + 64” ASCII GLYPH). This is all likely completely self-explanatory (or hopefully will be by the bottom of the post). For review, the relationship between the Phred quality score Q[Sanger] and the base-calling error probability P is

Q[Sanger] = −10 * log10(P)

or, re-written for the logarithmically challenged…

P = 10^[-Q/10]
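For those who would rather let a script do the arithmetic, the two relations above can be sketched in Python as follows. This is a minimal sketch and nothing taken from the Phred or Maq sources; the function names phred_to_prob and prob_to_phred are made up here for illustration.

```python
import math


def phred_to_prob(q):
    """Error probability P for a Phred (Sanger) quality score Q: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def prob_to_phred(p):
    """Phred (Sanger) quality score Q for an error probability P (0 < P <= 1)."""
    return -10 * math.log10(p)


# Print a few rows in the same 10-decimal style used in Table 1.
for q in (0, 10, 20, 30, 40):
    print(f"{q:02d}  {phred_to_prob(q):.10f}")
```

Every step of 10 in Q is one more factor of ten in confidence, so Q = 20 is a 1-in-100 chance of a wrong base and Q = 30 is 1-in-1000.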

Table 1. Phred Quality Scores (Q), Wrong Base Probabilities, And Sanger And Illumina 1.3+ ASCII Glyphs.

Phred  Probability (P)  Sanger    Sanger       Illumina 1.3+  Illumina 1.3+
Q      Of Wrong Base    “Q + 33”  “Q + 33”     “Q + 64”       “Q + 64”
                        Shift     ASCII GLYPH  Shift          ASCII GLYPH
00     1.0000000000     033       !            064            @
01     0.7943282347     034       "            065            A
02     0.6309573445     035       #            066            B
03     0.5011872336     036       $            067            C
04     0.3981071706     037       %            068            D
05     0.3162277660     038       &            069            E
06     0.2511886432     039       '            070            F
07     0.1995262315     040       (            071            G
08     0.1584893192     041       )            072            H
09     0.1258925412     042       *            073            I
10     0.1000000000     043       +            074            J
11     0.0794328235     044       ,            075            K
12     0.0630957344     045       -            076            L
13     0.0501187234     046       .            077            M
14     0.0398107171     047       /            078            N
15     0.0316227766     048       0            079            O
16     0.0251188643     049       1            080            P
17     0.0199526231     050       2            081            Q
18     0.0158489319     051       3            082            R
19     0.0125892541     052       4            083            S
20     0.0100000000     053       5            084            T
21     0.0079432823     054       6            085            U
22     0.0063095734     055       7            086            V
23     0.0050118723     056       8            087            W
24     0.0039810717     057       9            088            X
25     0.0031622777     058       :            089            Y
26     0.0025118864     059       ;            090            Z
27     0.0019952623     060       <            091            [
28     0.0015848932     061       =            092            \
29     0.0012589254     062       >            093            ]
30     0.0010000000     063       ?            094            ^
31     0.0007943282     064       @            095            _
32     0.0006309573     065       A            096            `
33     0.0005011872     066       B            097            a
34     0.0003981072     067       C            098            b
35     0.0003162278     068       D            099            c
36     0.0002511886     069       E            100            d
37     0.0001995262     070       F            101            e
38     0.0001584893     071       G            102            f
39     0.0001258925     072       H            103            g
40     0.0001000000     073       I            104            h
41     0.0000794328     074       J            105            i
42     0.0000630957     075       K            106            j
43     0.0000501187     076       L            107            k
44     0.0000398107     077       M            108            l
45     0.0000316228     078       N            109            m
46     0.0000251189     079       O            110            n
47     0.0000199526     080       P            111            o
48     0.0000158489     081       Q            112            p
49     0.0000125893     082       R            113            q
50     0.0000100000     083       S            114            r
51     0.0000079433     084       T            115            s
52     0.0000063096     085       U            116            t
53     0.0000050119     086       V            117            u
54     0.0000039811     087       W            118            v
55     0.0000031623     088       X            119            w
56     0.0000025119     089       Y            120            x
57     0.0000019953     090       Z            121            y
58     0.0000015849     091       [            122            z
59     0.0000012589     092       \            123            {
60     0.0000010000     093       ]            124            |
61     0.0000007943     094       ^            125            }
62     0.0000006310     095       _            126            ~
63     0.0000005012     096       `
64     0.0000003981     097       a
65     0.0000003162     098       b
66     0.0000002512     099       c
67     0.0000001995     100       d
68     0.0000001585     101       e
69     0.0000001259     102       f
70     0.0000001000     103       g
71     0.0000000794     104       h
72     0.0000000631     105       i
73     0.0000000501     106       j
74     0.0000000398     107       k
75     0.0000000316     108       l
76     0.0000000251     109       m
77     0.0000000200     110       n
78     0.0000000158     111       o
79     0.0000000126     112       p
80     0.0000000100     113       q
81     0.0000000079     114       r
82     0.0000000063     115       s
83     0.0000000050     116       t
84     0.0000000040     117       u
85     0.0000000032     118       v
86     0.0000000025     119       w
87     0.0000000020     120       x
88     0.0000000016     121       y
89     0.0000000013     122       z
90     0.0000000010     123       {
91     0.0000000008     124       |
92     0.0000000006     125       }
93     0.0000000005     126       ~
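
Reading quality strings against Table 1 by hand gets old quickly, so here is a small Python sketch of the same lookup in both directions. It assumes nothing beyond the “Q + 33” and “Q + 64” shifts described above; the names decode_quals and encode_quals are hypothetical helpers, not part of any FASTQ library.

```python
SANGER_OFFSET = 33        # Sanger: ASCII code = Q + 33
ILLUMINA_13_OFFSET = 64   # Illumina 1.3+: ASCII code = Q + 64


def decode_quals(qual_string, offset):
    """Turn a FASTQ quality string into a list of integer Q scores."""
    scores = []
    for ch in qual_string:
        code = ord(ch)
        # FASTQ quality characters are restricted to printable ASCII 33-126.
        if not 33 <= code <= 126:
            raise ValueError(f"not a valid quality character: {ch!r}")
        scores.append(code - offset)
    return scores


def encode_quals(scores, offset):
    """Turn a list of integer Q scores back into a quality string."""
    return "".join(chr(q + offset) for q in scores)


# The same three scores written in each encoding; both lines print [0, 20, 40].
print(decode_quals("!5I", SANGER_OFFSET))
print(decode_quals("@Th", ILLUMINA_13_OFFSET))
```

The point of keeping the offset as an explicit argument is that the string itself does not say which encoding it uses: decode an Illumina 1.3+ string with the Sanger offset and every score comes out 31 too high.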

An assumption I made going in, when producing plots from the Q[Sanger] and Q[Solexa] data, was that “P” was the same value in both and that the Solexa system simply opted to use the odds (P/(1-P)) as its metric. A proper two-second consideration of the shapes of P and P/(1-P) would have led to the immediate conclusion that something was afoot. The table columns to the left of the black bar in Table 2 (2A) are the Q[Solexa] values computed from the Q[Sanger] probabilities. They are here simply to show that the two are, in fact, not the same; if you’ve spent any time wondering why you can’t adequately… manipulate Excel’s rounding tools to reproduce the Q[Solexa] integer values, this is why.

The probabilities obtained for Q[Solexa] were, in fact, worked backwards from the integer values of Q[Solexa] (having found no table online that gives a number-by-number summary of the probability or odds). For background, the Q[Solexa] values are obtained from:

Q[Solexa] = −10 * log10[P/(1-P)]
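
Working backwards from an integer Q[Solexa] to a probability, as described above, amounts to inverting the odds. The Python sketch below shows one way to do it; the function names solexa_to_prob and prob_to_solexa are illustrative only, and the odds used are P/(1-P).

```python
import math


def solexa_to_prob(q_solexa):
    """Error probability P for a Solexa quality score.

    Q = -10 * log10(P / (1 - P)), so odds = 10^(-Q/10) and P = odds / (1 + odds).
    """
    odds = 10 ** (-q_solexa / 10)
    return odds / (1 + odds)


def prob_to_solexa(p):
    """Solexa quality score for an error probability P (0 < P < 1)."""
    return -10 * math.log10(p / (1 - p))


# A few Solexa scores with their probabilities, in 7-decimal style.
for q in (-5, 0, 10, 20):
    print(f"{q:3d}  {solexa_to_prob(q):.7f}")
```

Note that Q[Solexa] = 0 corresponds to even odds (P = 0.5), which is why the Solexa scale can run negative while the Phred scale cannot.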

Table 2A: Q[Solexa] from P[Sanger].
Table 2B: Q[Solexa] and associated odds (P/(1-P)).

Probability  Associated  Q[Solexa]    |  Solexa Q    Solexa         Solexa     Solexa    Solexa
(P) Of       Sanger      Based On     |  [-5 to 62]  Probability    Odds       “Q + 64”  “Q + 64”
Wrong Base   Odds        Phred        |              (P) Of         [P/(1-P)]  Q Shift   ASCII
             [P/(1-P)]   Probability  |              Wrong Base                          GLYPH
0.7943282    3.8621161   -5.8682532   |  -5          0.7597469      3.1622774  59        ;
0.6309573    1.7097139   -2.3292343   |  -4          0.7152527      2.5118860  60        <
0.5011872    1.0047602   -0.0206244   |  -3          0.6661394      1.9952619  61        =
0.3981072    0.6614253   1.7951917    |  -2          0.6131368      1.5848929  62        >
0.3162278    0.4624753   3.3491146    |  -1          0.5573117      1.2589255  63        ?
0.2511886    0.3354498   4.7437242    |  0           0.5000000      1.0000000  64        @
0.1995262    0.2492602   6.0334710    |  1           0.4426884      0.7943284  65        A
0.1584893    0.1883390   7.2505963    |  2           0.3868632      0.6309575  66        B
0.1258925    0.1440241   8.4156483    |  3           0.3338606      0.5011873  67        C
0.1000000    0.1111111   9.5424251    |  4           0.2847473      0.3981072  68        D
0.0794328    0.0862868   10.6405549   |  5           0.2402531      0.3162278  69        E
0.0630957    0.0673449   11.7169522   |  6           0.2007600      0.2511887  70        F
0.0501187    0.0527631   12.7766933   |  7           0.1663376      0.1995263  71        G
0.0398107    0.0414613   13.8235685   |  8           0.1368069      0.1584893  72        H
0.0316228    0.0326554   14.8604457   |  9           0.1118158      0.1258926  73        I
0.0251189    0.0257661   15.8895167   |  10          0.0909091      0.1000000  74        J
0.0199526    0.0203588   16.9124707   |  11          0.0735876      0.0794328  75        K
0.0158489    0.0161042   17.9306177   |  12          0.0593509      0.0630957  76        L
0.0125893    0.0127498   18.9449785   |  13          0.0477267      0.0501187  77        M
0.0100000    0.0101010   19.9563519   |  14          0.0382865      0.0398107  78        N
0.0079433    0.0080069   20.9653650   |  15          0.0306534      0.0316228  79        O
0.0063096    0.0063496   21.9725111   |  16          0.0245034      0.0251189  80        P
0.0050119    0.0050371   22.9781790   |  17          0.0195623      0.0199526  81        Q
0.0039811    0.0039970   23.9826759   |  18          0.0156017      0.0158489  82        R
0.0031623    0.0031723   24.9862446   |  19          0.0124327      0.0125893  83        S
0.0025119    0.0025182   25.9890773   |  20          0.0099010      0.0100000  84        T
0.0019953    0.0019993   26.9913260   |  21          0.0078807      0.0079433  85        U
0.0015849    0.0015874   27.9931114   |  22          0.0062700      0.0063096  86        V
0.0012589    0.0012605   28.9945291   |  23          0.0049869      0.0050119  87        W
0.0010000    0.0010010   29.9956549   |  24          0.0039653      0.0039811  88        X
0.0007943    0.0007950   30.9965489   |  25          0.0031523      0.0031623  89        Y
0.0006310    0.0006314   31.9972589   |  26          0.0025056      0.0025119  90        Z
0.0005012    0.0005014   32.9978228   |  27          0.0019913      0.0019953  91        [
0.0003981    0.0003983   33.9982707   |  28          0.0015824      0.0015849  92        \
0.0003162    0.0003163   34.9986264   |  29          0.0012573      0.0012589  93        ]
0.0002512    0.0002513   35.9989090   |  30          0.0009990      0.0010000  94        ^
0.0001995    0.0001996   36.9991334   |  31          0.0007937      0.0007943  95        _
0.0001585    0.0001585   37.9993116   |  32          0.0006306      0.0006310  96        `
0.0001259    0.0001259   38.9994532   |  33          0.0005009      0.0005012  97        a
0.0001000    0.0001000   39.9995657   |  34          0.0003979      0.0003981  98        b
0.0000794    0.0000794   40.9996550   |  35          0.0003161      0.0003162  99        c
0.0000631    0.0000631   41.9997260   |  36          0.0002511      0.0002512  100       d
0.0000501    0.0000501   42.9997823   |  37          0.0001995      0.0001995  101       e
0.0000398    0.0000398   43.9998271   |  38          0.0001585      0.0001585  102       f
0.0000316    0.0000316   44.9998627   |  39          0.0001259      0.0001259  103       g
0.0000251    0.0000251   45.9998909   |  40          0.0001000      0.0001000  104       h
0.0000200    0.0000200   46.9999133   |  41          0.0000794      0.0000794  105       i
0.0000158    0.0000158   47.9999312   |  42          0.0000631      0.0000631  106       j
0.0000126    0.0000126   48.9999453   |  43          0.0000501      0.0000501  107       k
0.0000100    0.0000100   49.9999566   |  44          0.0000398      0.0000398  108       l
0.0000079    0.0000079   50.9999655   |  45          0.0000316      0.0000316  109       m
0.0000063    0.0000063   51.9999726   |  46          0.0000251      0.0000251  110       n
0.0000050    0.0000050   52.9999782   |  47          0.0000200      0.0000200  111       o
0.0000040    0.0000040   53.9999827   |  48          0.0000158      0.0000158  112       p
0.0000032    0.0000032   54.9999863   |  49          0.0000126      0.0000126  113       q
0.0000025    0.0000025   55.9999891   |  50          0.0000100      0.0000100  114       r
0.0000020    0.0000020   56.9999913   |  51          0.0000079      0.0000079  115       s
0.0000016    0.0000016   57.9999931   |  52          0.0000063      0.0000063  116       t
0.0000013    0.0000013   58.9999945   |  53          0.0000050      0.0000050  117       u
0.0000010    0.0000010   59.9999957   |  54          0.0000040      0.0000040  118       v
0.0000008    0.0000008   60.9999966   |  55          0.0000032      0.0000032  119       w
0.0000006    0.0000006   61.9999973   |  56          0.0000025      0.0000025  120       x
0.0000005    0.0000005   62.9999978   |  57          0.0000020      0.0000020  121       y
0.0000004    0.0000004   63.9999983   |  58          0.0000016      0.0000016  122       z
0.0000003    0.0000003   64.9999986   |  59          0.0000013      0.0000013  123       {
0.0000003    0.0000003   65.9999989   |  60          0.0000010      0.0000010  124       |
0.0000002    0.0000002   66.9999991   |  61          0.0000008      0.0000008  125       }
0.0000002    0.0000002   67.9999993   |  62          0.0000006      0.0000006  126       ~

With all three data sets, I reproduce below a plot familiar to the FASTQ community, showing the asymptotic behavior of the Q[Solexa] and Q[Sanger] values at high Q (which represents the lowest read errors; they approach one another because the numbers are simply too damn small on the plot). Also obvious from the plot is that the two curves agree poorly in the range where the error probability is highest (so the entire analysis goes to pot as the data quality goes to pot [ed. note for the international reader: “pot” refers to the device found in the water-closet]). The grey line is a good plot of the wrong data (that in Table 2A).
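
The agreement at high Q and disagreement at low Q can also be seen numerically. Combining the two definitions (P = odds/(1 + odds) with odds = 10^(-Q[Solexa]/10), then Q[Sanger] = −10 * log10(P)) gives Q[Sanger] = 10 * log10(10^(Q[Solexa]/10) + 1), which matches the conversion given in the Cock et al. paper cited above. The Python below is a sketch of that one formula; the function name solexa_to_phred is made up for illustration.

```python
import math


def solexa_to_phred(q_solexa):
    """Phred (Sanger) score describing the same error probability as a Solexa score."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


# The gap between the two scales shrinks as Q grows.
for q in (-5, 0, 5, 10, 20, 40):
    q_phred = solexa_to_phred(q)
    print(f"Q[Solexa] = {q:3d}   Q[Sanger] = {q_phred:8.4f}   difference = {q_phred - q:7.4f}")
```

At Q[Solexa] = 0 the equivalent Phred score is about 3 (P = 0.5), while by Q = 40 the two differ by less than a thousandth of a unit.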

The presentation of this data is likely complete overkill, but I have found it useful in discussion. Hopefully having the tables in front of someone during an explanation will help clarify that explanation.

Maq-0.6.x Or Maq-0.7.x (And Likely Others) Installation In Ubuntu 10.04 LTS (And Likely Others)

So, with the BclConverter installation complete and a small QSEQ-to-FASTQ script available to convert the QSEQ output, a next step is the alignment of your lane’s worth of sequenced DNA. The Maq program is used by the Cornell Sequencing Center (and was recommended as the workhorse tool for this task) and is available by link from the Illumina third-party tools list. In keeping with my no-interest-in-installing-another-distro run of Ubuntu luck, the procedure below explains the process of building Maq using as much apt-get as possible. In the case of Maq, there is one small busy step in the installation process because we need a local copy of libstdc++.so.5 that is NOT available by some easy package install (although what one has to do isn’t terribly difficult either, and I’ve linked local copies of the two .deb files below).

Installation Procedure

The process begins with apt-get, continues to dpkg, and then is finished with an easy make.

1. apt-get Install List

The official package list, I am quite sure, is below. From a Terminal window:

sudo apt-get install zlib1g-dev libssl-dev build-essential gcc g++ rpm ia32-libs

I say this because I (1) have installed several other packages on the machines I’ve been working on prior to the Maq builds and (2) I’ve no interest in wiping machines to perfect a super-clean install. If there is an error in the Maq-make, it is possible an additional package is missing (although I suspect this will not be the case, as there is little needed for the Maq build). If there is an error, the solution may simply be to blindly add the following additional packages (and, if you installed the BclConverter, you have this all installed anyway).

YOU LIKELY DON’T NEED THE FOLLOWING, BUT JUST IN CASE:

sudo apt-get install build-essential mercurial cmake python2.6-dev python3.1-dev gettext \
  libopenal1 libopenexr-dev libavdevice52 freeglut3-dev libglew1.5-dev libxmu-dev libxi-dev \
  libfreeimage-dev doxygen libqt4-dev bison flex libbz2-dev libpng12-dev libxml-simple-perl \
  ia32-libs lib32asound2 lib32ncurses5 lib32nss-mdns lib32z1 lib32gfortran3 gcc-4.3-multilib \
  gcc-multilib lib32gomp1 libc6-dev-i386 lib32mudflap0 lib32gcc1 lib32gcc1-dbg lib32stdc++6 \
  lib32stdc++6-4.3-dbg libc6-i386 csh g++ g++-4.3 libstdc++6-4.3-dev g++-multilib \
  g++-4.3-multilib gcc-4.3-doc libstdc++6-4.3-dbg libstdc++6-4.3-doc nfs-common \
  nfs-kernel-server portmap ssh gnuplot

2. Adding 32-bit (Needed For Both) And 64-bit (If Running 64-bit) libstdc++.so.5

The following process assumes you know where the two .deb files are sitting and that you have access to this folder (I assume you’ve downloaded to Downloads or Desktop; drive your Terminal window in that direction with cd ~/Downloads or cd ~/Desktop). The two .deb files in question that contain (I believe) the most recent versions of libstdc++.so.5 are linked below (and sitting on my website – you’ll have to unzip them with a double-click or a gunzip *.zip in the download directory):


libstdc++5_3.3.6-18_i386.deb (as libstdc5_3.3.6_18_i386.deb.zip)
libstdc++5_3.3.6-18_amd64.deb (as libstdc5_3.3.6_18_amd64.deb.zip)

These are an “additional runtime library for C++ programs built with the GNU compiler.” The i386 is the 32-bit version. You definitely need this one. The amd64 is needed if you installed the 64-bit Ubuntu distro. You’ll STILL need to install the i386 version.

A. For the 32-bit version, the installation is simple:

sudo dpkg -i libstdc++5_3.3.6-18_i386.deb

B. For the 64-bit version, the installation is also simple:

sudo dpkg -i libstdc++5_3.3.6-18_amd64.deb

Output as below:

Selecting previously deselected package libstdc++5.
(Reading database ... 169294 files and directories currently installed.)
Unpacking libstdc++5 (from libstdc++5_3.3.6-18_amd64.deb) ...
Setting up libstdc++5 (1:3.3.6-18) ...

Processing triggers for libc-bin ...
ldconfig deferred processing now taking place

The second step is only mildly more involved. These five steps (1) extract the contents of libstdc++5_3.3.6-18_i386.deb without installing the library (so no over-writing), (2) enter the usr/lib directory you just extracted, (3) copy libstdc++.so.5.0.7 to /usr/lib32, (4) cd into /usr/lib32, and (5) make a symbolic link for libstdc++.so.5.

dpkg --extract libstdc++5_3.3.6-18_i386.deb ./
cd usr/lib
sudo cp libstdc++.so.5.0.7 /usr/lib32
cd /usr/lib32/
sudo ln -s libstdc++.so.5.0.7 libstdc++.so.5
cd ~/

And that is all.

3. Installing Maq

sudo mv maq-0.7.1.tar.bz2 /opt/
cd /opt
sudo tar xvjf maq-0.7.1.tar.bz2 

Produces…

tar: Record size = 8 blocks
maq-0.7.1/
maq-0.7.1/AUTHORS
maq-0.7.1/COPYING
maq-0.7.1/ChangeLog
maq-0.7.1/FUTURES
maq-0.7.1/INSTALL
maq-0.7.1/Makefile.am
maq-0.7.1/Makefile.generic
maq-0.7.1/Makefile.in
maq-0.7.1/NEWS
maq-0.7.1/PROBLEMS
maq-0.7.1/README
maq-0.7.1/aclocal.m4
maq-0.7.1/algo.hh
maq-0.7.1/altchr.cc
maq-0.7.1/assemble.cc
maq-0.7.1/assemble.h
maq-0.7.1/assopt.c
maq-0.7.1/autogen.sh
maq-0.7.1/aux_utils.c
maq-0.7.1/bfa.c
maq-0.7.1/bfa.h
maq-0.7.1/break_pair.c
maq-0.7.1/cleanup.sh
maq-0.7.1/config.guess
maq-0.7.1/config.h.in
maq-0.7.1/config.sub
maq-0.7.1/configure
maq-0.7.1/configure.ac
maq-0.7.1/const.c
maq-0.7.1/const.h
maq-0.7.1/csmap2ntmap.cc
maq-0.7.1/dword.hh
maq-0.7.1/eland2maq.cc
maq-0.7.1/fasta2bfa.c
maq-0.7.1/fastq2bfq.c
maq-0.7.1/genran.c
maq-0.7.1/genran.h
maq-0.7.1/get_pos.c
maq-0.7.1/glf.h
maq-0.7.1/glfgen.cc
maq-0.7.1/indel_call.cc
maq-0.7.1/indel_pe.cc
maq-0.7.1/indel_soa.cc
maq-0.7.1/install-sh
maq-0.7.1/main.c
maq-0.7.1/main.h
maq-0.7.1/mapcheck.cc
maq-0.7.1/maq.1
maq-0.7.1/maq.pdf
maq-0.7.1/maq.pod
maq-0.7.1/maqmap.c
maq-0.7.1/maqmap.h
maq-0.7.1/maqmap_conv.c
maq-0.7.1/match.cc
maq-0.7.1/match.hh
maq-0.7.1/match_aux.cc
maq-0.7.1/merge.cc
maq-0.7.1/missing
maq-0.7.1/pair_stat.cc
maq-0.7.1/pileup.cc
maq-0.7.1/rbcc.cc
maq-0.7.1/read.cc
maq-0.7.1/read.h
maq-0.7.1/rmdup.cc
maq-0.7.1/scripts/
maq-0.7.1/scripts/asub
maq-0.7.1/scripts/farm-run.pl
maq-0.7.1/scripts/fq_all2std.pl
maq-0.7.1/scripts/maq.pl
maq-0.7.1/scripts/maq_eval.pl
maq-0.7.1/scripts/maq_plot.pl
maq-0.7.1/scripts/maq_post.pl
maq-0.7.1/scripts/maq_sanger.pl
maq-0.7.1/scripts/paf_utils.pl
maq-0.7.1/scripts/solid2fastq.pl
maq-0.7.1/seq.c
maq-0.7.1/seq.h
maq-0.7.1/simulate.c
maq-0.7.1/sort_mapping.cc
maq-0.7.1/stdaln.c
maq-0.7.1/stdaln.h
maq-0.7.1/stdhash.hh
maq-0.7.1/submap.c
maq-0.7.1/subsnp.cc

Then change into the new directory:

cd maq-0.7.1/

README Contents

Mapass2 is a software that builds mapping assemblies from short reads
generated by the next-generation sequencing machines. It is particularly
designed for Illumina-Solexa 1G Genetic Analyzer, which typically
generates reads 25-35bp in length.

Mapass2 first aligns reads to reference sequences and then calls the
consensus. At the mapping stage, maq performs ungapped alignment. For
single-end reads, maq is able to find all hits with up to 2 or 3
mismatches, depending on a command-line option; for paired-end reads, it
always finds all paired hits with one of the two reads containing up to
1 mismatch. At the assembling stage, maq calls the consensus based on a
statistical model. It calls the base which maximizes the posterior
probability and calculates a phred quality at each position along the
consensus. Heterozygotes are also called in this process.

For more information, see also maq website:

http://mapass.sourceforge.net

INSTALL Contents

There are two ways to compile maq. The first way is to use the GNU
building systems. Simply type ‘./configure; make; make install’ to
compile and to install maq. Three executables ‘maq’, ‘maq.pl’ and
‘farm-run.pl’ will be copied to ‘/usr/local/bin’ by default.

Alternatively, one could compile with ‘make -f Makefile.generic’ and
manually copy the three executables to the destination directory.
Modification to ‘Makefile.generic’ is sometimes needed for different
architectures.

As I’m running this from /opt, we’ll compile Maq the first way, prefixing each command with “sudo”.

USERID@MACHINE:/opt/maq-0.7.1$ sudo ./configure 

Produces…

checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables... 
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for g++... g++
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking if gcc accepts -m64... yes
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking zlib.h usability... yes
checking zlib.h presence... yes
checking for zlib.h... yes
configure: creating ./config.status
config.status: creating Makefile
config.status: creating config.h
USERID@MACHINE:/opt/maq-0.7.1$ sudo make

Produces…

cd . && /bin/bash /opt/maq-0.7.1/missing --run autoheader
/opt/maq-0.7.1/missing: line 54: autoheader: command not found
WARNING: `autoheader' is missing on your system.  You should only need it if
         you modified `acconfig.h' or `configure.ac'.  You might want
         to install the `Autoconf' and `GNU m4' packages.  Grab them
         from any GNU archive site.
rm -f stamp-h1
touch config.h.in
cd . && /bin/bash ./config.status config.h
config.status: creating config.h
config.status: config.h is unchanged
make  all-am
make[1]: Entering directory `/opt/maq-0.7.1'
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c main.c
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c const.c
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c seq.c
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c bfa.c
bfa.c: In function ‘nst_load_bfa1’:
bfa.c:31: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result
bfa.c:32: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result
bfa.c:33: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result
bfa.c:35: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result
bfa.c:37: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result
bfa.c: In function ‘nst_bfa_len’:
bfa.c:46: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result
bfa.c:48: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o read.o read.cc
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c fasta2bfa.c
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c fastq2bfq.c
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o merge.o merge.cc
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o match_aux.o
 match_aux.cc
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o match.o match.cc
match.cc: In function ‘int alt_cal_mm(bit64_t)’:
match.cc:58: warning: suggest parentheses around ‘+’ in operand of ‘&’
match.cc:61: warning: suggest parentheses around ‘+’ in operand of ‘&’
match.cc: In function ‘int alt_cal_err(bit64_t, bit64_t)’:
match.cc:67: warning: suggest parentheses around ‘+’ in operand of ‘&’
match.cc:70: warning: suggest parentheses around ‘+’ in operand of ‘&’
match.cc: In function ‘int ma_match(int, char**)’:
match.cc:525: warning: ignoring return value of ‘int fscanf(FILE*, const char*, ...)’, declared 
with attribute 
warn_unused_result
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o sort_mapping.o 
sort_mapping.cc
sort_mapping.cc: In function ‘int ma_make_pair(const match_aux_t*, const match_info_t*, const 
match_info_t*, 
pair_info_t*)’:
sort_mapping.cc:59: warning: suggest parentheses around arithmetic in operand of ‘^’
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o assemble.o 
assemble.cc
assemble.cc: In function ‘base_call_aux_t* assemble_cns_collect(assemble_pos_t*, const 
assemble_aux_t*)’:
assemble.cc:106: warning: suggest parentheses around arithmetic in operand of ‘|’
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o 
pileup.o pileup.cc
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o 
mapcheck.o mapcheck.cc
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c get_pos.c
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c assopt.c
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c aux_utils.c
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o rbcc.o rbcc.cc
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o subsnp.o 
subsnp.cc
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o pair_stat.o 
pair_stat.cc
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o indel_soa.o 
indel_soa.cc
indel_soa.cc: In function ‘void fill_counter(bit32_t*, int, nst_bfa1_t*, void*)’:
indel_soa.cc:42: warning: suggest parentheses around ‘-’ inside ‘<<’
indel_soa.cc:56: warning: suggest parentheses around ‘-’ inside ‘<<’
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c maqmap.c
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c maqmap_conv.c
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o altchr.o 
altchr.cc
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c submap.c
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o rmdup.o 
rmdup.cc
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c simulate.c
In file included from /usr/include/string.h:640,
                 from maqmap.h:23,
                 from simulate.c:11:
In function ‘memset’,
    inlined from ‘simustat_core’ at simulate.c:386:
/usr/include/bits/string3.h:86: warning: call to __builtin___memset_chk will always overflow 
destination buffer
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c genran.c
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o indel_pe.o 
indel_pe.cc
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c stdaln.c
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o indel_call.o 
indel_call.cc
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o 
eland2maq.o eland2maq.cc
eland2maq.cc: In function ‘hash_map_char* read_list(FILE*)’:
eland2maq.cc:33: warning: ignoring return value of ‘int fscanf(FILE*, const char*, ...)’, 
declared with attribute warn_unused_result
eland2maq.cc: In function ‘void eland2maq_core(FILE*, FILE*, void*)’:
eland2maq.cc:88: warning: ignoring return value of ‘int fscanf(FILE*, const char*, ...)’, 
declared with attribute warn_unused_result
eland2maq.cc:96: warning: ignoring return value of ‘int fscanf(FILE*, const char*, ...)’, 
declared with attribute warn_unused_result
eland2maq.cc:99: warning: ignoring return value of ‘int fscanf(FILE*, const char*, ...)’, 
declared with attribute warn_unused_result
eland2maq.cc: In function ‘void novo2maq_core(FILE*, FILE*, void*)’:
eland2maq.cc:323: warning: ignoring return value of ‘char* fgets(char*, int, FILE*)’, declared 
with attribute warn_unused_result
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o 
csmap2ntmap.o csmap2ntmap.cc
gcc -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c break_pair.c
g++ -DHAVE_CONFIG_H -I.     -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2 -c -o 
glfgen.o glfgen.cc
glfgen.cc: In function ‘glf1_t* glfgen1_core(assemble_pos_t*, const assemble_aux_t*, bit8_t)’:
glfgen.cc:43: warning: suggest parentheses around arithmetic in operand of ‘|’
g++  -Wall -m64 -D_FASTMAP -DMAQ_LONGREADS -g -O2   -o maq main.o const.o seq.o bfa.o read.o 
fasta2bfa.o fastq2bfq.o merge.o match_aux.o match.o sort_mapping.o assemble.o pileup.o 
mapcheck.o 
get_pos.o assopt.o aux_utils.o rbcc.o subsnp.o pair_stat.o indel_soa.o maqmap.o maqmap_conv.o 
altchr.o submap.o rmdup.o simulate.o genran.o indel_pe.o stdaln.o indel_call.o eland2maq.o 
csmap2ntmap.o break_pair.o glfgen.o -lm -lz 
make[1]: Leaving directory `/opt/maq-0.7.1'
USERID@MACHINE:/opt/maq-0.7.1$ sudo make install

Produces…

make[1]: Entering directory `/opt/maq-0.7.1'
test -z "/usr/local/bin" || /bin/mkdir -p "/usr/local/bin"
  /usr/bin/install -c 'maq' '/usr/local/bin/maq'
test -z "/usr/local/bin" || /bin/mkdir -p "/usr/local/bin"
 /usr/bin/install -c 'scripts/maq.pl' '/usr/local/bin/maq.pl'
 /usr/bin/install -c 'scripts/farm-run.pl' '/usr/local/bin/farm-run.pl'
 /usr/bin/install -c 'scripts/maq_plot.pl' '/usr/local/bin/maq_plot.pl'
 /usr/bin/install -c 'scripts/maq_eval.pl' '/usr/local/bin/maq_eval.pl'
make[1]: Nothing to be done for `install-data-am'.
make[1]: Leaving directory `/opt/maq-0.7.1'

The Maq package is installed in /usr/local/bin and should be available immediately without any path calls. In the interest of running a brief test, I’ve provided a fastq file for phi-X174 and the “easy” command line run to run an alignment (rendered movie from the virusworld website, including a second half featuring the lovely QuteMol program, is below). While I wouldn’t object to hosting a full lane of phi-X174, the 1.9 GB of fragments = unbearably long server upload. Suffice it to say, if you have a phi-X174 lane and you’ve run the BCL-to-QSEQ-to-FASTQ procedure in the BclConverter post, you have a properly formatted PhiXSequence.fastq for running this example.
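
If you want to confirm that the executables really are reachable without any path calls before hunting for test data, a quick check along the following lines will do. This is only a sketch using Python’s standard shutil.which; the helper name find_tool is made up here, and the list of names simply follows the install output above.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(name)


# Names taken from the 'make install' output; adjust if your install differs.
for tool in ("maq", "maq.pl", "farm-run.pl", "maq_plot.pl", "maq_eval.pl"):
    path = find_tool(tool)
    print(f"{tool}: {path if path else 'not found on PATH'}")
```

Anything reported as not found usually means /usr/local/bin is missing from PATH in the current shell, or the install step did not complete.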

You can download the phi-X174 sequence at phi_X174_sequence.fastq.gz, a local version of the file you can find at the National Center for Biotechnology Information. The sequence is below because, well, I wanted to have a sequence present on the blog post (and it is absolutely fascinating to me that this is the instruction manual for something).

>gi|216019|gb|J02482.1|PX1CG Coliphage phi-X174, complete genome
GAGTTTTATCGCTTCCATGACGCAGAAGTTAACACTTTCGGATATTTCTGATGAGTCGAAAAATTATCTT
GATAAAGCAGGAATTACTACTGCTTGTTTACGAATTAAATCGAAGTGGACTGCTGGCGGAAAATGAGAAA
ATTCGACCTATCCTTGCGCAGCTCGAGAAGCTCTTACTTTGCGACCTTTCGCCATCAACTAACGATTCTG
TCAAAAACTGACGCGTTGGATGAGGAGAAGTGGCTTAATATGCTTGGCACGTTCGTCAAGGACTGGTTTA
GATATGAGTCACATTTTGTTCATGGTAGAGATTCTCTTGTTGACATTTTAAAAGAGCGTGGATTACTATC
TGAGTCCGATGCTGTTCAACCACTAATAGGTAAGAAATCATGAGTCAAGTTACTGAACAATCCGTACGTT
TCCAGACCGCTTTGGCCTCTATTAAGCTCATTCAGGCTTCTGCCGTTTTGGATTTAACCGAAGATGATTT
CGATTTTCTGACGAGTAACAAAGTTTGGATTGCTACTGACCGCTCTCGTGCTCGTCGCTGCGTTGAGGCT
TGCGTTTATGGTACGCTGGACTTTGTGGGATACCCTCGCTTTCCTGCTCCTGTTGAGTTTATTGCTGCCG
TCATTGCTTATTATGTTCATCCCGTCAACATTCAAACGGCCTGTCTCATCATGGAAGGCGCTGAATTTAC
GGAAAACATTATTAATGGCGTCGAGCGTCCGGTTAAAGCCGCTGAATTGTTCGCGTTTACCTTGCGTGTA
CGCGCAGGAAACACTGACGTTCTTACTGACGCAGAAGAAAACGTGCGTCAAAAATTACGTGCGGAAGGAG
TGATGTAATGTCTAAAGGTAAAAAACGTTCTGGCGCTCGCCCTGGTCGTCCGCAGCCGTTGCGAGGTACT
AAAGGCAAGCGTAAAGGCGCTCGTCTTTGGTATGTAGGTGGTCAACAATTTTAATTGCAGGGGCTTCGGC
CCCTTACTTGAGGATAAATTATGTCTAATATTCAAACTGGCGCCGAGCGTATGCCGCATGACCTTTCCCA
TCTTGGCTTCCTTGCTGGTCAGATTGGTCGTCTTATTACCATTTCAACTACTCCGGTTATCGCTGGCGAC
TCCTTCGAGATGGACGCCGTTGGCGCTCTCCGTCTTTCTCCATTGCGTCGTGGCCTTGCTATTGACTCTA
CTGTAGACATTTTTACTTTTTATGTCCCTCATCGTCACGTTTATGGTGAACAGTGGATTAAGTTCATGAA
GGATGGTGTTAATGCCACTCCTCTCCCGACTGTTAACACTACTGGTTATATTGACCATGCCGCTTTTCTT
GGCACGATTAACCCTGATACCAATAAAATCCCTAAGCATTTGTTTCAGGGTTATTTGAATATCTATAACA
ACTATTTTAAAGCGCCGTGGATGCCTGACCGTACCGAGGCTAACCCTAATGAGCTTAATCAAGATGATGC
TCGTTATGGTTTCCGTTGCTGCCATCTCAAAAACATTTGGACTGCTCCGCTTCCTCCTGAGACTGAGCTT
TCTCGCCAAATGACGACTTCTACCACATCTATTGACATTATGGGTCTGCAAGCTGCTTATGCTAATTTGC
ATACTGACCAAGAACGTGATTACTTCATGCAGCGTTACCATGATGTTATTTCTTCATTTGGAGGTAAAAC
CTCTTATGACGCTGACAACCGTCCTTTACTTGTCATGCGCTCTAATCTCTGGGCATCTGGCTATGATGTT
GATGGAACTGACCAAACGTCGTTAGGCCAGTTTTCTGGTCGTGTTCAACAGACCTATAAACATTCTGTGC
CGCGTTTCTTTGTTCCTGAGCATGGCACTATGTTTACTCTTGCGCTTGTTCGTTTTCCGCCTACTGCGAC
TAAAGAGATTCAGTACCTTAACGCTAAAGGTGCTTTGACTTATACCGATATTGCTGGCGACCCTGTTTTG
TATGGCAACTTGCCGCCGCGTGAAATTTCTATGAAGGATGTTTTCCGTTCTGGTGATTCGTCTAAGAAGT
TTAAGATTGCTGAGGGTCAGTGGTATCGTTATGCGCCTTCGTATGTTTCTCCTGCTTATCACCTTCTTGA
AGGCTTCCCATTCATTCAGGAACCGCCTTCTGGTGATTTGCAAGAACGCGTACTTATTCGCCACCATGAT
TATGACCAGTGTTTCCAGTCCGTTCAGTTGTTGCAGTGGAATAGTCAGGTTAAATTTAATGTGACCGTTT
ATCGCAATCTGCCGACCACTCGCGATTCAATCATGACTTCGTGATAAAAGATTGAGTGTGAGGTTATAAC
GCCGAAGCGGTAAAAATTTTAATTTTTGCCGCTGAGGGGTTGACCAAGCGAAGCGCGGTAGGTTTTCTGC
TTAGGAGTTTAATCATGTTTCAGACTTTTATTTCTCGCCATAATTCAAACTTTTTTTCTGATAAGCTGGT
TCTCACTTCTGTTACTCCAGCTTCTTCGGCACCTGTTTTACAGACACCTAAAGCTACATCGTCAACGTTA
TATTTTGATAGTTTGACGGTTAATGCTGGTAATGGTGGTTTTCTTCATTGCATTCAGATGGATACATCTG
TCAACGCCGCTAATCAGGTTGTTTCTGTTGGTGCTGATATTGCTTTTGATGCCGACCCTAAATTTTTTGC
CTGTTTGGTTCGCTTTGAGTCTTCTTCGGTTCCGACTACCCTCCCGACTGCCTATGATGTTTATCCTTTG
AATGGTCGCCATGATGGTGGTTATTATACCGTCAAGGACTGTGTGACTATTGACGTCCTTCCCCGTACGC
CGGGCAATAACGTTTATGTTGGTTTCATGGTTTGGTCTAACTTTACCGCTACTAAATGCCGCGGATTGGT
TTCGCTGAATCAGGTTATTAAAGAGATTATTTGTCTCCAGCCACTTAAGTGAGGTGATTTATGTTTGGTG
CTATTGCTGGCGGTATTGCTTCTGCTCTTGCTGGTGGCGCCATGTCTAAATTGTTTGGAGGCGGTCAAAA
AGCCGCCTCCGGTGGCATTCAAGGTGATGTGCTTGCTACCGATAACAATACTGTAGGCATGGGTGATGCT
GGTATTAAATCTGCCATTCAAGGCTCTAATGTTCCTAACCCTGATGAGGCCGCCCCTAGTTTTGTTTCTG
GTGCTATGGCTAAAGCTGGTAAAGGACTTCTTGAAGGTACGTTGCAGGCTGGCACTTCTGCCGTTTCTGA
TAAGTTGCTTGATTTGGTTGGACTTGGTGGCAAGTCTGCCGCTGATAAAGGAAAGGATACTCGTGATTAT
CTTGCTGCTGCATTTCCTGAGCTTAATGCTTGGGAGCGTGCTGGTGCTGATGCTTCCTCTGCTGGTATGG
TTGACGCCGGATTTGAGAATCAAAAAGAGCTTACTAAAATGCAACTGGACAATCAGAAAGAGATTGCCGA
GATGCAAAATGAGACTCAAAAAGAGATTGCTGGCATTCAGTCGGCGACTTCACGCCAGAATACGAAAGAC
CAGGTATATGCACAAAATGAGATGCTTGCTTATCAACAGAAGGAGTCTACTGCTCGCGTTGCGTCTATTA
TGGAAAACACCAATCTTTCCAAGCAACAGCAGGTTTCCGAGATTATGCGCCAAATGCTTACTCAAGCTCA
AACGGCTGGTCAGTATTTTACCAATGACCAAATCAAAGAAATGACTCGCAAGGTTAGTGCTGAGGTTGAC
TTAGTTCATCAGCAAACGCAGAATCAGCGGTATGGCTCTTCTCATATTGGCGCTACTGCAAAGGATATTT
CTAATGTCGTCACTGATGCTGCTTCTGGTGTGGTTGATATTTTTCATGGTATTGATAAAGCTGTTGCCGA
TACTTGGAACAATTTCTGGAAAGACGGTAAAGCTGATGGTATTGGCTCTAATTTGTCTAGGAAATAACCG
TCAGGATTGACACCCTCCCAATTGTATGTTTTCATGCCTCCAAATCTTGGAGGCTTTTTTATGGTTCGTT
CTTATTACCCTTCTGAATGTCACGCTGATTATTTTGACTTTGAGCGTATCGAGGCTCTTAAACCTGCTAT
TGAGGCTTGTGGCATTTCTACTCTTTCTCAATCCCCAATGCTTGGCTTCCATAAGCAGATGGATAACCGC
ATCAAGCTCTTGGAAGAGATTCTGTCTTTTCGTATGCAGGGCGTTGAGTTCGATAATGGTGATATGTATG
TTGACGGCCATAAGGCTGCTTCTGACGTTCGTGATGAGTTTGTATCTGTTACTGAGAAGTTAATGGATGA
ATTGGCACAATGCTACAATGTGCTCCCCCAACTTGATATTAATAACACTATAGACCACCGCCCCGAAGGG
GACGAAAAATGGTTTTTAGAGAACGAGAAGACGGTTACGCAGTTTTGCCGCAAGCTGGCTGCTGAACGCC
CTCTTAAGGATATTCGCGATGAGTATAATTACCCCAAAAAGAAAGGTATTAAGGATGAGTGTTCAAGATT
GCTGGAGGCCTCCACTATGAAATCGCGTAGAGGCTTTGCTATTCAGCGTTTGATGAATGCAATGCGACAG
GCTCATGCTGATGGTTGGTTTATCGTTTTTGACACTCTCACGTTGGCTGACGACCGATTAGAGGCGTTTT
ATGATAATCCCAATGCTTTGCGTGACTATTTTCGTGATATTGGTCGTATGGTTCTTGCTGCCGAGGGTCG
CAAGGCTAATGATTCACACGCCGACTGCTATCAGTATTTTTGTGTGCCTGAGTATGGTACAGCTAATGGC
CGTCTTCATTTCCATGCGGTGCACTTTATGCGGACACTTCCTACAGGTAGCGTTGACCCTAATTTTGGTC
GTCGGGTACGCAATCGCCGCCAGTTAAATAGCTTGCAAAATACGTGGCCTTATGGTTACAGTATGCCCAT
CGCAGTTCGCTACACGCAGGACGCTTTTTCACGTTCTGGTTGGTTGTGGCCTGTTGATGCTAAAGGTGAG
CCGCTTAAAGCTACCAGTTATATGGCTGTTGGTTTCTATGTGGCTAAATACGTTAACAAAAAGTCAGATA
TGGACCTTGCTGCTAAAGGTCTAGGAGCTAAAGAATGGAACAACTCACTAAAAACCAAGCTGTCGCTACT
TCCCAAGAAGCTGTTCAGAATCAGAATGAGCCGCAACTTCGGGATGAAAATGCTCACAATGACAAATCTG
TCCACGGAGTGCTTAATCCAACTTACCAAGCTGGGTTACGACGCGACGCCGTTCAACCAGATATTGAAGC
AGAACGCAAAAAGAGAGATGAGATTGAGGCTGGGAAAAGTTACTGTAGCCGACGTTTTGGCGGCGCAACC
TGTGACGACAAATCTGCTCAAATTTATGCGCGCTTCGATAAAAATGATTGGCGTATCCAACCTGCA
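
As a quick sanity check on the download, here is a minimal Python sketch (not part of MAQ; the function names are mine, and I assume the file has already been gunzipped into plain text) that reads a record like the one above and reports its length and base composition. Note that the header begins with ">", so the file is in FASTA layout despite the .fastq name.

```python
from collections import Counter


def read_fasta_record(text):
    """Return (header, sequence) for the first FASTA record in text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record: header must start with '>'")
    header = lines[0][1:]
    seq_parts = []
    for ln in lines[1:]:
        if ln.startswith(">"):  # stop at the next record, if any
            break
        seq_parts.append(ln.upper())
    return header, "".join(seq_parts)


def base_composition(sequence):
    """Count how often each base letter occurs in the sequence."""
    return Counter(sequence)


# Tiny made-up record, just to show the call pattern.
header, seq = read_fasta_record(">toy example\nGAGTTTTATC\ngcttcc\n")
print(header, len(seq), dict(base_composition(seq)))
```

Run against the full record above, the length should come to 5,386 bases (76 lines of 70 plus a final line of 66) if your copy is intact.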

With this file downloaded and unpacked (saved here as phi_X174_seq.fastq) and your phi-X174 fragment collection sitting in a file [I will assume is named] phi_X174_seq_fragments.fastq in the same directory, the command line run is simple:

maq.pl easyrun -d phi_X174 phi_X174_seq.fastq phi_X174_seq_fragments.fastq >& phi_X174.log

Once the run finishes, the working directory looks like this:

drwxr-xr-x 2 user user       4096 2010-12-11 17:18 phi_X174
-rw-r--r-- 1 user user       3524 2010-12-11 17:18 phi_X174.log
-rw-r--r-- 1 user user       5529 2010-12-11 16:39 phi_X174_seq.fastq
-rw-r--r-- 1 user user 1921502305 2010-12-11 16:40 phi_X174_seq_fragments.fastq

This will produce the phi_X174.log results file (check for errors. Log contents below)…

-- CMD: /usr/local/bin/maq fasta2bfa /home/user/phi_X174_seq.fastq \ 
phi_X174/ref.bfa 2> /dev/null
-- CMD: /usr/local/bin/maq fastq2bfq -n 2000000 /home/user/phi_X174_seq_fragments.fastq \
phi_X174/read1
-- finish writing file 'phi_X174/read1@1.bfq'
-- finish writing file 'phi_X174/read1@2000001.bfq'
-- finish writing file 'phi_X174/read1@4000001.bfq'
-- finish writing file 'phi_X174/read1@6000001.bfq'
-- finish writing file 'phi_X174/read1@8000001.bfq'
-- finish writing file 'phi_X174/read1@10000001.bfq'
-- finish writing file 'phi_X174/read1@12000001.bfq'
-- finish writing file 'phi_X174/read1@14000001.bfq'
-- finish writing file 'phi_X174/read1@16000001.bfq'
-- 16259703 sequences were loaded.
-- CMD: (cd phi_X174; /usr/local/bin/maq map  -n 2 -e 70 -u unmap1@8000001.txt \
aln1@8000001.map ref.bfa read1@8000001.bfq 2> aln1@8000001.map.log)
-- CMD: (cd phi_X174; /usr/local/bin/maq map  -n 2 -e 70 -u unmap1@2000001.txt \
aln1@2000001.map ref.bfa read1@2000001.bfq 2> aln1@2000001.map.log)
-- CMD: (cd phi_X174; /usr/local/bin/maq map  -n 2 -e 70 -u unmap1@6000001.txt \
aln1@6000001.map ref.bfa read1@6000001.bfq 2> aln1@6000001.map.log)
-- CMD: (cd phi_X174; /usr/local/bin/maq map  -n 2 -e 70 -u unmap1@10000001.txt \
aln1@10000001.map ref.bfa read1@10000001.bfq 2> aln1@10000001.map.log)
-- CMD: (cd phi_X174; /usr/local/bin/maq map  -n 2 -e 70 -u unmap1@16000001.txt \
aln1@16000001.map ref.bfa read1@16000001.bfq 2> aln1@16000001.map.log)
-- CMD: (cd phi_X174; /usr/local/bin/maq map  -n 2 -e 70 -u unmap1@4000001.txt \
aln1@4000001.map ref.bfa read1@4000001.bfq 2> aln1@4000001.map.log)
-- CMD: (cd phi_X174; /usr/local/bin/maq map  -n 2 -e 70 -u unmap1@14000001.txt \
aln1@14000001.map ref.bfa read1@14000001.bfq 2> aln1@14000001.map.log)
-- CMD: (cd phi_X174; /usr/local/bin/maq map  -n 2 -e 70 -u unmap1@1.txt aln1@1.map \
ref.bfa read1@1.bfq 2> aln1@1.map.log)
-- CMD: (cd phi_X174; /usr/local/bin/maq map  -n 2 -e 70 -u unmap1@12000001.txt \
aln1@12000001.map ref.bfa read1@12000001.bfq 2> aln1@12000001.map.log)
-- CMD: (cd phi_X174; /usr/local/bin/maq mapmerge all.map aln1@8000001.map \
aln1@2000001.map aln1@6000001.map aln1@10000001.map aln1@16000001.map \
aln1@4000001.map aln1@14000001.map aln1@1.map aln1@12000001.map)
-- CMD: (cd phi_X174; /usr/local/bin/maq mapcheck ref.bfa all.map > mapcheck.txt)
[ma_mapcheck] processing gi|216019|gb|J02482.1|PX1CG...
-- CMD: (cd phi_X174; /usr/local/bin/maq assemble -N 2 -Q 60 consensus.cns ref.bfa \
all.map 2> assemble.log)
-- CMD: /usr/local/bin/maq cns2fq phi_X174/consensus.cns > phi_X174/cns.fq
-- CMD: /usr/local/bin/maq cns2snp phi_X174/consensus.cns > phi_X174/cns.snp
-- CMD: /usr/local/bin/maq cns2win phi_X174/consensus.cns > phi_X174/cns.win
-- CMD: /usr/local/bin/maq indelsoa phi_X174/ref.bfa phi_X174/all.map > phi_X174/cns.indelse
-- CMD: (cd phi_X174; touch unmap.indel)
-- CMD: /usr/local/bin/maq.pl SNPfilter -q 40 -w 5 -N 2 -f phi_X174/cns.indelse -d 3 \
-D 256 -n 20 phi_X174/cns.snp > phi_X174/cns.final.snp
-- 0 potential soa-indels pass the filter.
-- CMD: (cd phi_X174; ln -s cns.final.snp cns.filter.snp)
-- CMD: /usr/local/bin/maq.pl statmap phi_X174/*.map.log

-- == statmap report ==

-- # single end (SE) reads: 16259703
-- # mapped SE reads: 16011454 (/ 16259703 = 98.47%)
-- # paired end (PE) reads: 0
-- # mapped PE reads: 0 (/ 0 = NA%)
-- # reads that are mapped in pairs: 0 (/ 0 = NA%)
-- # Q>=30 reads that are moved to meet mate-pair requirement: 0 (/ 0 = NA%)
-- # Q<30 reads that are moved to meet mate-pair requirement: 0 (NA%)

…and a “phi_X174” directory (from the “-d”) containing (hopefully) the following file list:

drwxr-xr-x  2 user user      4096 2010-12-11 17:18 .
drwxr-xr-x 19 user user      4096 2010-12-11 17:09 ..
-rw-r--r--  1 user user 337296876 2010-12-11 17:16 all.map
-rw-r--r--  1 user user  41635667 2010-12-11 17:13 aln1@10000001.map
-rw-r--r--  1 user user     18493 2010-12-11 17:13 aln1@10000001.map.log
-rw-r--r--  1 user user  42219864 2010-12-11 17:15 aln1@12000001.map
-rw-r--r--  1 user user     18493 2010-12-11 17:15 aln1@12000001.map.log
-rw-r--r--  1 user user  42885466 2010-12-11 17:14 aln1@14000001.map
-rw-r--r--  1 user user     18493 2010-12-11 17:14 aln1@14000001.map.log
-rw-r--r--  1 user user   5808963 2010-12-11 17:13 aln1@16000001.map
-rw-r--r--  1 user user     18484 2010-12-11 17:13 aln1@16000001.map.log
-rw-r--r--  1 user user  42616782 2010-12-11 17:14 aln1@1.map
-rw-r--r--  1 user user     18493 2010-12-11 17:14 aln1@1.map.log
-rw-r--r--  1 user user  41452684 2010-12-11 17:12 aln1@2000001.map
-rw-r--r--  1 user user     18493 2010-12-11 17:12 aln1@2000001.map.log
-rw-r--r--  1 user user  41223383 2010-12-11 17:14 aln1@4000001.map
-rw-r--r--  1 user user     18493 2010-12-11 17:14 aln1@4000001.map.log
-rw-r--r--  1 user user  41423788 2010-12-11 17:13 aln1@6000001.map
-rw-r--r--  1 user user     18493 2010-12-11 17:13 aln1@6000001.map.log
-rw-r--r--  1 user user  42709777 2010-12-11 17:12 aln1@8000001.map
-rw-r--r--  1 user user     18493 2010-12-11 17:12 aln1@8000001.map.log
-rw-r--r--  1 user user      8704 2010-12-11 17:18 assemble.log
lrwxrwxrwx  1 user user        13 2010-12-11 17:18 cns.filter.snp -> cns.final.snp
-rw-r--r--  1 user user       509 2010-12-11 17:18 cns.final.snp
-rw-r--r--  1 user user     10983 2010-12-11 17:18 cns.fq
-rw-r--r--  1 user user         0 2010-12-11 17:18 cns.indelse
-rw-r--r--  1 user user       571 2010-12-11 17:18 cns.snp
-rw-r--r--  1 user user       452 2010-12-11 17:18 cns.win
-rw-r--r--  1 user user      3555 2010-12-11 17:18 consensus.cns
-rw-r--r--  1 user user      4525 2010-12-11 17:17 mapcheck.txt
-rw-r--r--  1 user user  51219471 2010-12-11 17:11 read1@10000001.bfq
-rw-r--r--  1 user user  51533445 2010-12-11 17:11 read1@12000001.bfq
-rw-r--r--  1 user user  52014489 2010-12-11 17:12 read1@14000001.bfq
-rw-r--r--  1 user user   6777154 2010-12-11 17:12 read1@16000001.bfq
-rw-r--r--  1 user user  51764856 2010-12-11 17:09 read1@1.bfq
-rw-r--r--  1 user user  51064770 2010-12-11 17:10 read1@2000001.bfq
-rw-r--r--  1 user user  50938162 2010-12-11 17:10 read1@4000001.bfq
-rw-r--r--  1 user user  51025084 2010-12-11 17:10 read1@6000001.bfq
-rw-r--r--  1 user user  51803458 2010-12-11 17:11 read1@8000001.bfq
-rw-r--r--  1 user user      2744 2010-12-11 17:09 ref.bfa
-rw-r--r--  1 user user   3579605 2010-12-11 17:13 unmap1@10000001.txt
-rw-r--r--  1 user user   3592541 2010-12-11 17:15 unmap1@12000001.txt
-rw-r--r--  1 user user   3790240 2010-12-11 17:14 unmap1@14000001.txt
-rw-r--r--  1 user user    492518 2010-12-11 17:13 unmap1@16000001.txt
-rw-r--r--  1 user user   3768747 2010-12-11 17:14 unmap1@1.txt
-rw-r--r--  1 user user   3668574 2010-12-11 17:12 unmap1@2000001.txt
-rw-r--r--  1 user user   3608445 2010-12-11 17:13 unmap1@4000001.txt
-rw-r--r--  1 user user   3429470 2010-12-11 17:13 unmap1@6000001.txt
-rw-r--r--  1 user user   3410453 2010-12-11 17:12 unmap1@8000001.txt
-rw-r--r--  1 user user         0 2010-12-11 17:18 unmap.indel

I’ve included the file sizes so you can see the amount of data generated from a 5.5 KB phi_X174_seq.fastq reference file and a 1.9 GB phi_X174_seq_fragments.fastq file.
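
Since my own interest is in monitoring run quality, here is a small Python sketch (my own helper names, not anything shipped with MAQ) that pulls the mapped single-end read count out of the statmap report in the log and works out where the read1@N.bfq chunk names come from, given the "-n 2000000" chunk size shown in the fastq2bfq command. It assumes the log lines look exactly like the ones above.

```python
import re

# Matches e.g. "-- # mapped SE reads: 16011454 (/ 16259703 = 98.47%)"
STATMAP_LINE = re.compile(r"^-- # mapped SE reads: (\d+) \(/ (\d+) = ([\d.]+)%\)")


def mapped_se_fraction(log_text):
    """Return (mapped, total, fraction) from a statmap report, or None if absent."""
    for line in log_text.splitlines():
        match = STATMAP_LINE.match(line.strip())
        if match:
            mapped, total = int(match.group(1)), int(match.group(2))
            fraction = mapped / total if total else float("nan")
            return mapped, total, fraction
    return None


def chunk_starts(n_reads, chunk_size):
    """First read number of each chunk, i.e. the N in read1@N.bfq."""
    return list(range(1, n_reads + 1, chunk_size))


example = (
    "-- # single end (SE) reads: 16259703\n"
    "-- # mapped SE reads: 16011454 (/ 16259703 = 98.47%)\n"
)
mapped, total, fraction = mapped_se_fraction(example)
print(f"{mapped} of {total} reads mapped ({fraction:.2%})")
print(chunk_starts(total, 2000000))
```

With the numbers from this run, the second print gives nine start positions (1, 2000001, and so on up to 16000001), which matches the nine .bfq files in the listing.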

BclConverter-1.7.1 Installation In Ubuntu 10.04 LTS (And Related)

What follows is the procedure for successfully building and running BclConverter-1.7.1 under Ubuntu (specifically 10.04, but this will likely be generic for other versions) using only apt-get to install missing programs and libraries, thereby trying to keep the install process as build-friendly as possible to the general (non-coding) user.

So, What’s BCL And Why Does It Need Converting?

The newest version of the Illumina sequencing software no longer uses the QSEQ format during the sequencing run, relying now on BCL files. This 12 January 2010 post snip from www.politigenomics.com covers the intro nicely.

Gone are the QSEQ files, they are replaced by BCL files which are binary, per image, per cycle files that contain the base call and quality information. Because they are per image, per cycle files, they can be transferred cycle by cycle as they are generated (as opposed to QSEQ files which are read based). The BCL files are also more compact, requiring only 1 byte/base (B/b) as compared to QSEQ files which require about 2.5 B/b. In addition, the intensity files are also not transferred by default, so RTA output goes from 10 B/b to just 1 B/b. Thus, even though you are generating five times more sequence data than a GA, your RTA directory will actually be smaller (about 250 GB).

That is all well and good, but most of the open source programs of relevance (to me, anyway) require FASTQ format as input. As there is no one-stop conversion from BCL to FASTQ in the Illumina downloads, the generation of QSEQ files is still a necessary (although not significantly difficult) evil. QSEQ files are generate-able from the BCL data (with maintenance of the Illumina directory structure!) with the BclConverter code.

Unfortunately, the conversion from BCL format to QSEQ format (which is a file format for which many scripts exist online for conversion into the ever-familiar FASTQ format) requires an additional installation on your network machine, this installation being the BclConverter (v1.7.1) program available from the Illumina iCom website (registration required). This BclConverter program is not a pre-compiled binary, .rpm, .deb, .etc package, meaning the build is done by you from scratch. For many Linux distributions, this is non-problematic, as the build uses fairly standard tools. If you’re running Ubuntu, you will find yourself compiling (and running) with a host of show-stopping (or eye candy-stopping) errors. What lies below takes care of these errors.

Quick Summary

If you walk through the following steps, you’ll have no issue installing BclConverter. The more exhaustive discussion is below.

> sudo aptitude update

> sudo aptitude upgrade

> sudo apt-get install build-essential mercurial cmake python2.6-dev python3.1-dev gettext \
libopenal1 libopenexr-dev libavdevice52 freeglut3-dev libglew1.5-dev libxmu-dev libxi-dev \
libfreeimage-dev doxygen libqt4-dev bison flex libbz2-dev libpng12-dev libxml-simple-perl \
ia32-libs lib32asound2 lib32ncurses5 lib32nss-mdns lib32z1 lib32gfortran3 gcc-4.3-multilib \
gcc-multilib lib32gomp1 libc6-dev-i386 lib32mudflap0 lib32gcc1 lib32gcc1-dbg lib32stdc++6 \
lib32stdc++6-4.3-dbg libc6-i386 csh g++ g++-4.3 libstdc++6-4.3-dev g++-multilib \
g++-4.3-multilib gcc-4.3-doc libstdc++6-4.3-dbg libstdc++6-4.3-doc nfs-common \
nfs-kernel-server portmap ssh gnuplot

> sudo tar xvf BclConverter-1.7.1.tar.gz

> cd BclConverter-1.7.1

> sudo make install

Installation – The Long And Sometimes Error-Filled Version

There’s a lot of error message duplication and step-wise discussion below because I assume that you found this page by searching against errors as they came up in the build process.

NOTE: My first installation attempt failed even though the following additional packages had already been installed during the initial setup of the machine:

sudo apt-get install ia32-libs lib32asound2 lib32ncurses5 lib32nss-mdns lib32z1 lib32gfortran3 \
gcc-4.3-multilib gcc-multilib lib32gomp1 libc6-dev-i386 lib32mudflap0 lib32gcc1 lib32gcc1-dbg \
lib32stdc++6 lib32stdc++6-4.3-dbg libc6-i386 csh g++ g++-4.3 libstdc++6-4.3-dev g++-multilib \
g++-4.3-multilib gcc-4.3-doc libstdc++6-4.3-dbg libstdc++6-4.3-doc nfs-common nfs-kernel-server \
portmap ssh

You may or may not need some of these (especially if you’re running a 32-bit version of Ubuntu), but I can’t say definitively that something above is NOT ALSO required beyond the apt-get list provided below, so just install them anyway (the NFS stuff may be overkill, but if you’re going to mount this machine for sequencer file transfer, you’ll need this and/or SAMBA anyway).

My first build attempt of BclConverter with a mostly fresh Ubuntu install provided the following error:

...failed updating 2 targets...
...skipped 3 targets...
...updated 7846 targets...
boost.sh: build failed: Terminating...
CMake Error at c++/CMakeLists.txt:177 (message):
  Failed to build Boost


-- Configuring incomplete, errors occurred!
make: *** [build/Makefile] Error 1

So, we know that the Boost 1.42 libraries are not installed. Part of the BclConverter build process involves building a copy of these libraries (what failed above). The problem above was not a missing Boost as much as it was missing build tools for the whole program.

If the problem is Boost 1.42, why not just install the Ubuntu package? I’m not entirely sure, but there may or may not be something about the BclConverter build that requires something in Boost 1.42 to be findable by the BclConverter in its local directory (not too likely, but I didn’t diagnose it). Also, the problem may be version-specific (more likely than not), as the Boost build one can apt-get is 2.0-m12-2. Which is to say, installing the Ubuntu package…

sudo apt-get install boost-build

…did not solve the Boost problem. The possible solutions are to (1) build Boost 1.42 yourself or (2) simply let the BclConverter build take care of this (since the Boost 1.42 library is included in the BclConverter package for building). A Boost 1.42 build attempt external to the BclConverter program did not, in fact, solve the Boost problem in a subsequent BclConverter build attempt (I’ll spare you a repeat of the same error), making the successful apt-get-based approach all the easier.

We begin by updating the aptitude database and upgrading the machine (this is a skip-able step, but I prefer keeping everything up-to-date).

sudo aptitude update
sudo aptitude upgrade

The required build programs and libraries for BclConverter-1.7.1 (that are not part of my standard lib32 et al. install-ables listed above) are install-able as below:

sudo apt-get install build-essential mercurial cmake python2.6-dev python3.1-dev gettext \
libopenal1 libopenexr-dev libavdevice52 freeglut3-dev libglew1.5-dev libxmu-dev libxi-dev \
libfreeimage-dev doxygen libqt4-dev bison flex libbz2-dev libpng12-dev libxml-simple-perl \
gnuplot

My routine setup preference is to place installed programs into /opt (purely for organizational purposes. It really doesn’t matter where within reason). With BclConverter-1.7.1 downloaded from iCom, we’ll move the .gz/.zip file to /opt, extract, untar, and install. With a Terminal window open and cd’ed to the BclConverter-1.7.1 download location (likely ~/Downloads, maybe ~/Desktop):

sudo mv BclConverter-1.7.1.tar.gz /opt
cd /opt
sudo tar xvf BclConverter-1.7.1.tar.gz
cd BclConverter-1.7.1
sudo make install

If, for any reason, you wish to see what the install log looks like, you can download mine for this session (in the 2010dec7__bclconverter_1_7_1_logs.zip file, see 2010dec7__bclconverter_1_7_1_build3b__successful__BUILD).

The last piece of the puzzle is to add the /opt/BclConverter-1.7.1/bin directory to your path, which we do in .profile as follows:

cd ~/
pico .profile

In .profile, add the following to the bottom somewhere…

PATH="/opt/BclConverter-1.7.1/bin/:$PATH"

Save and exit.

source .profile
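
If you want to confirm the PATH change took, here is a short Python sketch (the helper name is mine, and it assumes the /opt install location used above). It checks whether a directory appears in PATH, ignoring a trailing slash, and then asks the standard library where setupBclToQseq.py resolves from.

```python
import os
import shutil


def dir_on_path(directory, path_value=None):
    """True if directory is one of the entries in a PATH-style string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    wanted = os.path.normpath(directory)  # drops any trailing slash
    return any(
        os.path.normpath(entry) == wanted
        for entry in path_value.split(os.pathsep)
        if entry
    )


# Assumed install location from this post; adjust if you installed elsewhere.
print(dir_on_path("/opt/BclConverter-1.7.1/bin"))
# Prints the full path of the script if it is findable, otherwise None.
print(shutil.which("setupBclToQseq.py"))
```

If the first line prints False in a fresh Terminal, the .profile edit has not been picked up yet (log out and back in, or source it again).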

Potential Errors Along The Way

This section is the most important part, as it’s likely how you found this post. Below are the few problems (and messages) that might arise and that are solved by the installation of specific packages.

1. Boost Error And Attempted sudo apt-get install boost-build

The error with and without a boost-build install is the same.

...failed updating 2 targets...
...skipped 3 targets...
...updated 7846 targets...
boost.sh: build failed: Terminating...
CMake Error at c++/CMakeLists.txt:177 (message):
  Failed to build Boost


-- Configuring incomplete, errors occurred!
make: *** [build/Makefile] Error 1

The full logs from the build attempts for both cases can be viewed in 2010dec7__bclconverter_1_7_1_logs.zip:

* 2010dec7__bclconverter_1_7_1_build1__boosterror__FAILED.txt – initial error
* 2010dec7__bclconverter_1_7_1_build2__aptgetboost__FAILED.txt – after boost-build install

Running the full apt-get (see the contents of 2010dec7__bclconverter_1_7_1_logs.zip, with the results in 2010dec7__bclconverter_1_7_1_build3a__aptgetlist__RESULTS) produces a successful BclConverter build. The log for my build is available in 2010dec7__bclconverter_1_7_1_build3b__successful__BUILD.txt.

2. XML::Simple-Related Error And libxml-simple-perl

Even without gnuplot or libxml-simple-perl installed, a setupBclToQseq.py run will still successfully generate QSEQ files. The additional tools provide you with some statistical and visual analyses of your results (so are definitely worth installing).

If you don’t have libxml-simple-perl installed, you’ll see the following error after running the BaseCalls “make”:

/opt/BclConverter-1.7.1/bin/plotIntensity_tiles.pl tiles.txt s_8 SignalMeans _all.txt 
_all.png && echo `date` > s_8_all_pngs.txt
Can't locate XML/Simple.pm in @INC (@INC contains: /opt/BclConverter-1.7.1/lib/perl /etc/perl
/usr/local/lib/perl/5.10.1 /usr/local/share/perl/5.10.1 /usr/lib/perl5 /usr/share/perl5
/usr/lib/perl/5.10 /usr/share/perl/5.10 /usr/local/lib/site_perl .) at 
/opt/BclConverter-1.7.1/lib/perl/Gerald/Common.pm line 128.
BEGIN failed--compilation aborted at /opt/BclConverter-1.7.1/lib/perl/Gerald/Common.pm
line 128.
Compilation failed in require at /opt/BclConverter-1.7.1/bin/plotIntensity_tiles.pl line 22.
...
make: *** [s_1_all_pngs.txt] Error 2
make: *** [s_2_all_pngs.txt] Error 2
make: *** [s_3_all_pngs.txt] Error 2
make: *** [s_4_all_pngs.txt] Error 2
make: *** [s_5_all_pngs.txt] Error 2
make: *** [s_6_all_pngs.txt] Error 2
make: *** [s_7_all_pngs.txt] Error 2
make: *** [s_8_all_pngs.txt] Error 2

The fix is trivial (if you’re doing it incrementally and don’t already have XML::Simple installed):

sudo apt-get install libxml-simple-perl

3. gnuplot Errors And Fix

If you don’t have gnuplot already installed (and why wouldn’t you?), you’ll receive the following error during the BCL-to-QSEQ “make” process:

sh: gnuplot: not found
/opt/BclConverter-1.7.1/share/makefiles/bclToQseq/FlowCellTargets.mk:76: [IVC.htm
(s_1_all_pngs.txt s_2_all_pngs.txt s_3_all_pngs.txt s_4_all_pngs.txt s_5_all_pngs.txt
s_6_all_pngs.txt s_7_all_pngs.txt s_8_all_pngs.txt plotIntensity_for_IVC_finished.txt)
(s_1_all_pngs.txt s_2_all_pngs.txt s_3_all_pngs.txt s_4_all_pngs.txt s_5_all_pngs.txt
s_6_all_pngs.txt s_7_all_pngs.txt s_8_all_pngs.txt plotIntensity_for_IVC_finished.txt)]

/opt/BclConverter-1.7.1/bin/create_IVC_thumbnail.pl . > IVC.htm.tmp && mv IVC.htm.tmp
IVC.htm
/opt/BclConverter-1.7.1/share/makefiles/bclToQseq/FlowCellTargets.mk:82: [All.htm
(s_1_all_pngs.txt s_2_all_pngs.txt s_3_all_pngs.txt s_4_all_pngs.txt s_5_all_pngs.txt
s_6_all_pngs.txt s_7_all_pngs.txt s_8_all_pngs.txt plotIntensity_for_IVC_finished.txt)
(s_1_all_pngs.txt s_2_all_pngs.txt s_3_all_pngs.txt s_4_all_pngs.txt s_5_all_pngs.txt
s_6_all_pngs.txt s_7_all_pngs.txt s_8_all_pngs.txt plotIntensity_for_IVC_finished.txt)]

/opt/BclConverter-1.7.1/bin/create_tile_thumbnails.pl all > FullAll.htm && \
	/opt/BclConverter-1.7.1/bin/create_tile_thumbnails.pl all --maxTiles=20 --link='
<a href="FullAll.htm">Full output (Warning: may overload your browser!)</a>' > All.htm.tmp 
&& mv All.htm.tmp All.htm

/opt/BclConverter-1.7.1/share/makefiles/bclToQseq/FlowCellTargets.mk:109:
[BustardSummary.xml (IVC.htm All.htm tiles.txt) (IVC.htm All.htm tiles.txt)]
/opt/BclConverter-1.7.1/bin/produceIntensityStats.pl .
Unable to find file /LOCATION_OF_INTENSITIES_FOLDER/Intensities/BaseCalls/../../
../samples.xml at /opt/BclConverter-1.7.1/lib/perl/Gerald/Jerboa.pm line 387.

...

/opt/BclConverter-1.7.1/share/makefiles/bclToQseq/FlowCellTargets.mk:58: [finished.txt
(Matrix Phasing s_1 s_2 s_3 s_4 s_5 s_6 s_7 s_8 BustardSummary.xml BustardSummary.xsl
IVC.htm All.htm) (Matrix Phasing s_1 s_2 s_3 s_4 s_5 s_6 s_7 s_8 BustardSummary.xml
BustardSummary.xsl IVC.htm All.htm)]
touch finished.txt.tmp && mv finished.txt.tmp finished.txt

With a:

sudo apt-get install gnuplot

All remaining errors in the BCL-to-QSEQ “make” process should be disposed of, leaving you with a Plots directory containing multiple .png files after the QSEQ generation process.
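
Since the first three problems above all come down to matching an error message to a package, here is a rough Python sketch of that lookup. The signature strings are copied from the error messages in this post; the table and function names are my own invention, and the list only covers these three cases.

```python
# (error text seen in the log, fix suggested in this post)
KNOWN_FIXES = [
    ("Failed to build Boost",
     "install the full apt-get package list from the Quick Summary"),
    ("Can't locate XML/Simple.pm",
     "sudo apt-get install libxml-simple-perl"),
    ("gnuplot: not found",
     "sudo apt-get install gnuplot"),
]


def suggest_fixes(log_text):
    """Return the fixes whose error signature appears anywhere in log_text."""
    fixes = []
    for signature, fix in KNOWN_FIXES:
        if signature in log_text and fix not in fixes:
            fixes.append(fix)
    return fixes


sample_log = "sh: gnuplot: not found\nmake: *** [s_1_all_pngs.txt] Error 2\n"
for fix in suggest_fixes(sample_log):
    print(fix)
```

To use it on a real build, read your saved build or “make” log into a string and pass that in instead of sample_log.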

4. Just Running “make” For BCL-to-QSEQ

The successful setupBclToQseq.py run:

setupBclToQseq.py -i /LOCATION_OF_FILES/Intensities/BaseCalls -p /LOCATION_OF_FILES/Intensities \
  -o /LOCATION_OF_FILES/Intensities/BaseCalls --in-place --overwrite

ends with (also in setupBclToQseq.log):

setupBclToQseq.py version 1.7.1

Configuring /opt/BclConverter-1.7.1/share/makefiles/bclToQseq/Makefile to 
/LOCATION_OF_FILES/Intensities/BaseCalls/Makefile

Creating the 'Makefile.config'

Output directory succesfully initialized. Type 'make' in 
/LOCATION_OF_FILES/Intensities/BaseCalls to start the conversion

And if you simply type “make,” you get the following error:

/opt/BclConverter-1.7.1/bin/plotIntensity_tiles.pl tiles.txt s_1 SignalMeans _all.txt
_all.png && echo `date` > s_1_all_pngs.txt
Can't locate XML/Simple.pm in @INC (@INC contains: /opt/BclConverter-1.7.1/lib/perl /etc/perl 
/usr/local/lib/perl/5.10.1 /usr/local/share/perl/5.10.1 /usr/lib/perl5 /usr/share/perl5 
/usr/lib/perl/5.10 /usr/share/perl/5.10 /usr/local/lib/site_perl .) at 
/opt/BclConverter-1.7.1/lib/perl/Gerald/Common.pm line 128.
BEGIN failed--compilation aborted at /opt/BclConverter-1.7.1/lib/perl/Gerald/Common.pm 
line 128.
Compilation failed in require at /opt/BclConverter-1.7.1/bin/plotIntensity_tiles.pl line 22.
BEGIN failed--compilation aborted at /opt/BclConverter-1.7.1/bin/plotIntensity_tiles.pl
line 22.
make: *** [s_1_all_pngs.txt] Error 2

The complete log is available in 2010dec7__bclconverter_1_7_1_build3c__make_error_wo_j8.txt in
2010dec7__bclconverter_1_7_1_logs.zip.

A brief User Guide read will hip you to a proper run command in the BaseCalls directory (obvious, read the User Guide):

make -j 8

For a simple test, we’ll use the example in the BclConverter User Guide (and be sure to download the .PDF). I assume that your network directory structure for the Illumina data is something like /LOCATION/TO/NETWORK/DATA/Data/Intensities and /LOCATION/TO/NETWORK/DATA/Data/Intensities/BaseCalls (which it should be).

This will produce a sizable logfile. You can check out a successful run in 2010dec7__bclconverter_1_7_1_build3i__with_gnuplot.txt in 2010dec7__bclconverter_1_7_1_logs.zip.

5. QSEQ-to-FASTQ Script

Not really an error, just a last little help to convert your QSEQ files into generic FASTQ format.

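#!/bin/bash
# For lanes 1 through 8, convert the read-1 QSEQ files into one FASTQ file per lane.
# "." base calls become "N"; only reads that passed the filter (field 11 > 0) are kept.
for ((x=1; x<=8; x+=1)); do
  cat s_"$x"_1_*_qseq.txt | awk -F '\t' '{gsub(/\./,"N",$9); if ($11 > 0) printf("@%s_%04d:%s:%s:%s:%s#%s/%s\n%s\n+%s_%04d:%s:%s:%s:%s#%s/%s\n%s\n",$1,$2,$3,$4,$5,$6,$7,$8,$9,$1,$2,$3,$4,$5,$6,$7,$8,$10)}' > s_"$x"_sequence.fastq
done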

NOTE: the “cat” command (everything from “cat” through the output redirect) has to be all on one line! If your copy comes through wrapped, paste the script into a text editor and rejoin the lines (or download a copy – 2010dec7__qseq_to_fastq.script).
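
For anyone who finds the awk one-liner hard to read, here is the same per-line logic as a Python sketch. It is my own re-expression rather than anything from Illumina, the function name is made up, and it assumes the 11 tab-separated QSEQ fields that the awk script relies on (sequence in field 9, quality in field 10, filter flag in field 11).

```python
def qseq_line_to_fastq(line):
    """Turn one QSEQ line into a four-line FASTQ record.

    Returns None when the read did not pass the filter (field 11 <= 0).
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("expected 11 tab-separated QSEQ fields")
    machine, run = fields[0], int(fields[1])
    lane, tile, x, y, index, read = fields[2:8]
    sequence = fields[8].replace(".", "N")  # QSEQ writes unknown bases as "."
    quality = fields[9]
    if int(fields[10]) <= 0:
        return None
    name = f"{machine}_{run:04d}:{lane}:{tile}:{x}:{y}#{index}/{read}"
    return f"@{name}\n{sequence}\n+{name}\n{quality}\n"


# Made-up QSEQ line (hypothetical machine name and coordinates).
demo = "MACHINE1\t6\t8\t1\t1023\t2045\t0\t1\tACG.T\thhhhB\t1"
print(qseq_line_to_fastq(demo), end="")
```

Like the awk version, this leaves the quality string untouched, so the scores stay in whatever encoding the QSEQ file used (as far as I know, the Illumina 1.3+ “Q + 64” shift described at the top of this post), and a shift to Sanger “Q + 33” would be a separate step.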