Some Light Science Reading. The Constellations: Orion

As first appeared in the April 2012 edition of the Syracuse Astronomical Society newsletter The Astronomical Chronicle (PDF).

Image generated with Starry Night Pro 6.

Much can be said about the old hunter Orion. To Central New York observers, it had (until very recently) been the case that Orion made his way across the Night Sky during the coldest and least hospitable (to most nighttime observers) months of the year. Conditions would keep observers in hiding from him (some of the best CNY observers I know would risk surgical strikes on the Orion Nebula with their fastest equipment to set up and tear down). The abbreviated winter of 2011/2012 and reasonably early start of the SAS observing season have provided us with excellent opportunities in the past few months to make Orion The Hunter now the hunted. The mid-April observing session will be the last "official" opportunity to observe Orion before he disappears behind the Western horizon until the most nocturnal of us can next see him in our Eastern sky before sunrise in late August. I therefore take this opportunity to discuss Orion, a Constellation many CNY/SAS members may know best by sight but least by observing attention.

One of the topics covered in the 2011 SAS lecturing series was how we observe. Not the discussion of optics or the physics of planetary motion along the ecliptic, but the visual and mental mechanisms we use to translate the photonic triggers in our retina into mental pictures of celestial objects. Orion was the astronomical example I used to describe Pareidolia, how we impose a kind of order on things we see despite that order not being present in the actual collection. When you look at a cloud, you may see a face, an animal, or something your mind triggers as being something it clearly is not. I often placed the infamous "Face On Mars" next to the Constellation Orion to show clearly how we see what we think we see despite all reasonable evidence to the contrary (or the two can be mangled together, as shown below). The clouds may look like an animal, the "Face On Mars" looks unmistakably like a shadowed face, and Orion, as it happens, has looked like a human figure to virtually all peoples for as long as we have record of Constellations, the same way Scorpius has appeared as a scorpion to every civilization for which this little monster was part of the local biosphere.

Pareidolia is not just for cognitive neuroscience! One of the keys to learning the sky I discussed last year was to let your mind wander while staring at the sky and see if certain things jump out at you. The constellations are, for the most part, made up of the most reasonably bright star groupings, but if you see any type of geometry that makes some part of the sky easy to identify, run with it. This same philosophy may be responsible for the rise of the asterism, or "non-Constellation star grouping," as the distillation of mythological complexity into more practical tools for everyday living. For instance, I suspect everyone reading this can find the asterism known as the Big Dipper, but how many know all of the stars of its proper Constellation Ursa Major? Our southern tree line and Cortland obscure some of the grandeur of Sagittarius, which means we at the hill identify the location of its core (and several galactic highlights) by the easy-to-see "teapot." The body of Orion is a similar case of reduction-to-apparent, as the four stars marking his corners (clockwise from upper left)…

Betelgeuse (pronounced "Betelgeuse Betelgeuse Betelgeuse!") – marking his right shoulder; a red supergiant of very orange-ish color even without binoculars

Bellatrix – the left shoulder (so you now know the Constellation is facing us as originally defined) – a blue giant also known as the "Amazon Star"

Rigel – the left foot; a blue supergiant and the star system within which the aliens that make the Rigel Quick Finder reside

Saiph – the right foot; a star dim in the visible but markedly brighter in the ultraviolet. Saiph and Rigel are about the same distance away (Saiph 50 light years closer at 724 light years, a point to consider as you observe them both)

… and the three stars marking his belt (from left)…

Alnitak – A triple-star system 800 light years away with a blue supergiant as its anchor star

Alnilam – the farthest star of the belt at 1359 light years, this young blue supergiant burns as brightly as the other two, making the belt appear equally bright "all across"

Mintaka – 900 light years away, this is an eclipsing binary star system, meaning one star passes between us and the main star in its orbit (about every 5.7 days)

… are obvious to all, while the head and club stars require a longer look to identify.

Sticking to Naked Eye observing for a moment, Orion is not only famous for its historical significance and apparent brightness. Orion is ideally oriented to serve as an order of alignment for several nearby Constellations and is surrounded by enough bright stars and significant Constellations that curiosity alone should have you familiar with this part of the sky in very short order. As an April focus, it is of benefit that all of the Constellations we'll focus on either hit the horizon at the same time as Orion or they rest above him.

I've color-coded the significant stars marking notable Constellations in the image below. If you're standing outside on any clear night, the marked stars should all be quite obvious (we're talking a hands' width or two at arm's length). From right and working our way counterclockwise…

(RED) Following the belt stars to the right will lead you to the orange-ish star Aldebaran, marking the eye of Taurus the Bull. This is a dense part of the sky, as Aldebaran marks both the head of the Bull and also marks the brightest star in the Hyades star cluster (a gravitationally-bound open cluster 150 light years away composed of over 100 stars). Just to the right of this cluster is the "Tiny Dipper" known as the Pleiades (Messier 45), another dense star cluster worth observing at all magnifications. Both of these clusters are simultaneously easier and harder to find at present, as Venus ("1") is resting just above them, providing an easy way to find both clusters but plenty of reflected light to dull the brilliance of the two open clusters.

(ORANGE) Auriga, featuring Capella (the sixth brightest star in the Night Sky), is an oddly-shaped hexagon featuring a small triangle at one corner. Auriga, like Ursa Minor in last month's discussion, is made easy to find by the fact that the five marked stars are in an otherwise nondescript part of the sky (relatively dim generally, but brighter than anything in the vicinity). Venus will dull Hassaleh (Auriga's closest star to Venus and the two open clusters below it) but Elnath and Capella will be easy finds.

(YELLOW) Castor and Pollux, the twins of Gemini, are literally standing on Orion's club. Making an arrow from Mintaka (the right-most star of the belt) and Betelgeuse will lead you to Alhena (Pollux's left foot), after which a slow curve in a horseshoe shape will give you the remaining stars.

(GREEN) Canis Minor is two stars (which is boring), but is significant for containing Procyon, the 8th brightest star in the Night Sky (which means it will be an EASY find). But don't confuse it with Sirius, which is the big shimmering star in…

(BLUE) Canis Major is the larger of Orion's two dogs and contains Sirius ("The Dog Star"), a star so bright (magnitude −1.46) and so close (8.6 light years) that it appears not as a star but as a shimmering light. Some would say an airplane, others would say a hovering UFO. Part of my duties as president involve intermittently explaining that it is not the latter.

And, with respect, Monoceros is an old Constellation but not a particularly brilliant one. Having Canis Minor and Canis Major identified will make your identification of Monoceros quite straightforward.

We now turn to the other "stellar" objects in Orion, composed of three Messiers and one famous IC. M78 is a diffuse nebula almost one belt width above and perpendicular to Alnitak. You will know it when you see it. M43 and M42 (marked as "4" in the image below), on the other hand, are so bright and close that you can see their nebulosity in dark skies without aid of any optics.

M42 – The Orion Nebula is, in the right dark conditions, a Naked Eye sight in itself. For those of us between cities, even low-power binoculars bring out the wispy edges and cloudy core of this nebula. For higher-power observers, the resolving of Trapezium at M42's core serves as one of your best tests of astronomy binoculars (I consider the identification of four stars as THE proper test of a pair of 25×100's. Ideal conditions and a larger aperture will get you six stars total). You could spend all night just exploring the edges and depths of this nebula. You can take a look back at the Astro Bob article in the April 2012 edition of the Astronomical Chronicle (From My Driveway To Orion, Nature Works Wonders) for a more detailed discussion of this part of Orion.

M43 – de Mairan's Nebula is, truth be told, a lucky designation. M43 is, in fact, part of the M42 nebula that is itself a small part of the Orion Molecular Cloud Complex (now THAT'S a label). M43 owes its differentiation to a dark lane of dust that separates M43 from M42, just as the lane of dust in our own Milky Way we know as the "Great Rift" splits what would otherwise be one continuous band of distant stars, the same way a large rock in a stream causes the water to split in two and recombine on the other side.

Finally (and the one you'll work for), IC 434, the Horsehead Nebula, lies just to the lower-left of Alnitak (1). The Horsehead is itself a dark nebula, a region absorbing light to make it pronounced by its difference from the lighter regions around it. To put the whole area into perspective, The Horsehead is itself STILL within the Orion Molecular Cloud Complex. The sheath of Orion's Sword and nearly the entire belt is contained in this Complex, like dust being rattled off with each blow from Orion's club.

I close by taking a look at the perilously ignored club attempting to tear into Taurus. At present, asteroids surround Orion's Club like pieces of debris flying off after a hard impact. All are in the vicinity of 12th magnitude (so require a decent-sized mirror), and all are also moving at a sufficiently fast clip that their paths can be seen to change over several observing sessions (if, by miracle, enough clear days in a row can be had to make these measurements). I have highlighted the five prominent ones in the image below.

Is it an oddity to have Orion so full of asteroids? Certainly not! Orion is placed near the ecliptic, the apparent path of the planets in their motion around the Sun. Orion's club just barely grazes the ecliptic at the Gemini/Taurus border, two of the 12 Constellations of the Zodiac, the collection of Constellations that themselves mark the ecliptic. As nearly all of the objects in the Solar System lie near or within the disc of the Solar System, you expect to find all manner of smaller objects in the vicinity of the Zodiacal Constellations. In effect, Orion's club is kicking up different dust all year long as the asteroids orbit the Sun. You only have a few more weeks to watch the action happen before Orion's return in the very early morning of the very late summer.

– Happy Hunting, Damian

Sanger (And Illumina 1.3+ (And Solexa)) Phred Score (Q) ASCII Glyph Base Error Conversion Tables

Given the importance of these scores in both FASTQ and MAQ (in my case, specifically using alignment quality scores from Illumina sequencing runs to monitor run and sample quality), I was a bit surprised not to find a complete work-up of the meanings, the scores, the glyphs coordinated to the scores, and the encoding interpretations of these scores in one location. The two (three) tables shown here hopefully provide a meaningful summary.

I should qualify that much of the background for this page was taken from four key places. First is the Wikipedia entry for FASTQ. Second is the Wikipedia entry for Phred quality score. Third is the Rosetta Stone of Phred Score interpretation in the form of the open access article: P. J. A. Cock, C. J. Fields, N. Goto, M. L. Heuer and P. M. Rice, "The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants." Nucleic Acids Research, 2010, Vol. 38, No. 6, 1767-1771, doi:10.1093/nar/gkp1137. Fourth is seqanswers.com in various forms.

(Sanger) Phred Quality Scores

I refer you to the two Wikipedia articles on FASTQ and Phred Quality Scores for historical content (and for a brief discussion of the processing of chromatogram data for the production of quality scores). Table 1 lists the Phred score (Phred Q) and the corresponding probability of a wrong base call (Probability (P) Of Wrong Base). It then adds the ASCII codes (Sanger "Q + 33" Shift) and characters (Sanger "Q + 33" ASCII GLYPH) for the original Phred scores: in the Sanger method, Phred scores 0-to-93 use ASCII characters 33-to-126, which keeps every score a single readable character. Finally, it gives the Illumina 1.3+ codes (Illumina 1.3+ "Q + 64" Shift, using ASCII codes 64-to-126 for scores 0-to-62) and the corresponding ASCII glyphs (Illumina 1.3+ "Q + 64" ASCII GLYPH). This is all likely completely self-explanatory (or hopefully will be by the bottom of the post). For review, the relationship between Phred quality score Q[Sanger] and the base-calling error probability P is

Q[Sanger] = -10 * log10(P)

or, re-written for the logarithmically challenged…

P = 10^[-Q/10]
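As a minimal sketch of the two relations above, together with the "Q + 33" and "Q + 64" shifts used in Table 1, the following Python converts between Q, P, and the ASCII glyph. The function and constant names are illustrative assumptions of mine and do not come from any FASTQ library.

```python
import math

SANGER_OFFSET = 33    # Sanger "Q + 33" shift
ILLUMINA_OFFSET = 64  # Illumina 1.3+ "Q + 64" shift


def phred_to_prob(q):
    """P = 10^(-Q/10): probability that the base call is wrong."""
    return 10 ** (-q / 10)


def prob_to_phred(p):
    """Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def phred_to_glyph(q, offset=SANGER_OFFSET):
    """Encode an integer Phred score as its single ASCII glyph."""
    return chr(q + offset)


def glyph_to_phred(glyph, offset=SANGER_OFFSET):
    """Decode a single ASCII glyph back to its integer Phred score."""
    return ord(glyph) - offset


if __name__ == "__main__":
    # Print a few rows in the same column order as Table 1.
    for q in (0, 10, 20, 40):
        print(q, f"{phred_to_prob(q):.10f}",
              q + SANGER_OFFSET, phred_to_glyph(q),
              q + ILLUMINA_OFFSET, phred_to_glyph(q, ILLUMINA_OFFSET))
```

Decoding a whole quality string is then a matter of applying glyph_to_phred to each character with the offset that matches the file's encoding.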

Table 1. Phred Quality Scores (Q), Wrong Base Probabilities, And Sanger And Illumina 1.3+ ASCII Glyphs.

Phred Q  Probability (P)  Sanger "Q + 33"  Sanger "Q + 33"  Illumina 1.3+   Illumina 1.3+
         Of Wrong Base    Shift            ASCII GLYPH      "Q + 64" Shift  "Q + 64" ASCII GLYPH
00       1.0000000000     033              !                064             @
01       0.7943282347     034              "                065             A
02       0.6309573445     035              #                066             B
03       0.5011872336     036              $                067             C
04       0.3981071706     037              %                068             D
05       0.3162277660     038              &                069             E
06       0.2511886432     039              '                070             F
07       0.1995262315     040              (                071             G
08       0.1584893192     041              )                072             H
09       0.1258925412     042              *                073             I
10       0.1000000000     043              +                074             J
11       0.0794328235     044              ,                075             K
12       0.0630957344     045              -                076             L
13       0.0501187234     046              .                077             M
14       0.0398107171     047              /                078             N
15       0.0316227766     048              0                079             O
16       0.0251188643     049              1                080             P
17       0.0199526231     050              2                081             Q
18       0.0158489319     051              3                082             R
19       0.0125892541     052              4                083             S
20       0.0100000000     053              5                084             T
21       0.0079432823     054              6                085             U
22       0.0063095734     055              7                086             V
23       0.0050118723     056              8                087             W
24       0.0039810717     057              9                088             X
25       0.0031622777     058              :                089             Y
26       0.0025118864     059              ;                090             Z
27       0.0019952623     060              <                091             [
28       0.0015848932     061              =                092             \
29       0.0012589254     062              >                093             ]
30       0.0010000000     063              ?                094             ^
31       0.0007943282     064              @                095             _
32       0.0006309573     065              A                096             `
33       0.0005011872     066              B                097             a
34       0.0003981072     067              C                098             b
35       0.0003162278     068              D                099             c
36       0.0002511886     069              E                100             d
37       0.0001995262     070              F                101             e
38       0.0001584893     071              G                102             f
39       0.0001258925     072              H                103             g
40       0.0001000000     073              I                104             h
41       0.0000794328     074              J                105             i
42       0.0000630957     075              K                106             j
43       0.0000501187     076              L                107             k
44       0.0000398107     077              M                108             l
45       0.0000316228     078              N                109             m
46       0.0000251189     079              O                110             n
47       0.0000199526     080              P                111             o
48       0.0000158489     081              Q                112             p
49       0.0000125893     082              R                113             q
50       0.0000100000     083              S                114             r
51       0.0000079433     084              T                115             s
52       0.0000063096     085              U                116             t
53       0.0000050119     086              V                117             u
54       0.0000039811     087              W                118             v
55       0.0000031623     088              X                119             w
56       0.0000025119     089              Y                120             x
57       0.0000019953     090              Z                121             y
58       0.0000015849     091              [                122             z
59       0.0000012589     092              \                123             {
60       0.0000010000     093              ]                124             |
61       0.0000007943     094              ^                125             }
62       0.0000006310     095              _                126             ~
63       0.0000005012     096              `
64       0.0000003981     097              a
65       0.0000003162     098              b
66       0.0000002512     099              c
67       0.0000001995     100              d
68       0.0000001585     101              e
69       0.0000001259     102              f
70       0.0000001000     103              g
71       0.0000000794     104              h
72       0.0000000631     105              i
73       0.0000000501     106              j
74       0.0000000398     107              k
75       0.0000000316     108              l
76       0.0000000251     109              m
77       0.0000000200     110              n
78       0.0000000158     111              o
79       0.0000000126     112              p
80       0.0000000100     113              q
81       0.0000000079     114              r
82       0.0000000063     115              s
83       0.0000000050     116              t
84       0.0000000040     117              u
85       0.0000000032     118              v
86       0.0000000025     119              w
87       0.0000000020     120              x
88       0.0000000016     121              y
89       0.0000000013     122              z
90       0.0000000010     123              {
91       0.0000000008     124              |
92       0.0000000006     125              }
93       0.0000000005     126              ~

An assumption going in when I was producing plots from the Q[Sanger] and Q[Solexa] data was that the "P" was the same value and the Solexa system simply opted to use the Odds (P/(1-P)) as their metric. A proper two-second consideration of the shapes of P and P/(1-P) would have led to the immediate conclusion that something was afoot. The table columns on the left of the black bar in Table 2 (2A) are the Q[Solexa] values based on the use of the Q[Sanger] probabilities. This is here simply to show that they are, in fact, not the same and if you've spent any time wondering why you can't adequately… manipulate Excel's rounding tools to reproduce the Q[Solexa] integer values, this is why.

The probabilities obtained for Q[Solexa] were, in fact, worked backwards from the integer values of Q[Solexa] (having found no table online that gives a number-by-number summary of the probability or odds). For background, the Q[Solexa] values are obtained from:

Q[Solexa] = -10 * log10[P/(1-P)]
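Working backwards from the integer Q[Solexa] values, as described above, amounts to inverting this relation: the odds are 10^(-Q/10) and P = odds / (1 + odds). A minimal Python sketch of that inversion follows; the function names are my own illustrative assumptions.

```python
import math

SOLEXA_OFFSET = 64  # Solexa "Q + 64" shift


def solexa_to_odds(q):
    """Odds P/(1-P) = 10^(-Q/10) for a Solexa score."""
    return 10 ** (-q / 10)


def solexa_to_prob(q):
    """Recover P from the odds: P = odds / (1 + odds)."""
    odds = solexa_to_odds(q)
    return odds / (1 + odds)


def prob_to_solexa(p):
    """Q[Solexa] = -10 * log10(P / (1 - P))."""
    return -10 * math.log10(p / (1 - p))


if __name__ == "__main__":
    # Solexa Q, probability, odds, shifted ASCII code, glyph.
    for q in range(-5, 63):
        print(q, f"{solexa_to_prob(q):.7f}", f"{solexa_to_odds(q):.7f}",
              q + SOLEXA_OFFSET, chr(q + SOLEXA_OFFSET))
```

Because the odds are not bounded by 1, Q[Solexa] can go negative, which is why the Solexa scale starts at -5 instead of 0.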

Table 2A (left of the bar): Q[Solexa] from P[Sanger]. Table 2B (right of the bar): Q[Solexa] and associated odds (P/(1-P)).

Probability  Associated   Q[Solexa]       ||  Solexa Q    Solexa       Solexa     Solexa    Solexa
(P) Of       Sanger Odds  Based On Phred  ||  [-5 to 62]  Probability  Odds       "Q + 64"  "Q + 64"
Wrong Base   [P/(1-P)]    Probability     ||              (P) Of       [P/(1-P)]  Q Shift   ASCII GLYPH
                                          ||              Wrong Base
0.7943282    3.8621161    -5.8682532      ||  -5          0.7597469    3.1622774  59        ;
0.6309573    1.7097139    -2.3292343      ||  -4          0.7152527    2.5118860  60        <
0.5011872    1.0047602    -0.0206244      ||  -3          0.6661394    1.9952619  61        =
0.3981072    0.6614253    1.7951917       ||  -2          0.6131368    1.5848929  62        >
0.3162278    0.4624753    3.3491146       ||  -1          0.5573117    1.2589255  63        ?
0.2511886    0.3354498    4.7437242       ||  0           0.5000000    1.0000000  64        @
0.1995262    0.2492602    6.0334710       ||  1           0.4426884    0.7943284  65        A
0.1584893    0.1883390    7.2505963       ||  2           0.3868632    0.6309575  66        B
0.1258925    0.1440241    8.4156483       ||  3           0.3338606    0.5011873  67        C
0.1000000    0.1111111    9.5424251       ||  4           0.2847473    0.3981072  68        D
0.0794328    0.0862868    10.6405549      ||  5           0.2402531    0.3162278  69        E
0.0630957    0.0673449    11.7169522      ||  6           0.2007600    0.2511887  70        F
0.0501187    0.0527631    12.7766933      ||  7           0.1663376    0.1995263  71        G
0.0398107    0.0414613    13.8235685      ||  8           0.1368069    0.1584893  72        H
0.0316228    0.0326554    14.8604457      ||  9           0.1118158    0.1258926  73        I
0.0251189    0.0257661    15.8895167      ||  10          0.0909091    0.1000000  74        J
0.0199526    0.0203588    16.9124707      ||  11          0.0735876    0.0794328  75        K
0.0158489    0.0161042    17.9306177      ||  12          0.0593509    0.0630957  76        L
0.0125893    0.0127498    18.9449785      ||  13          0.0477267    0.0501187  77        M
0.0100000    0.0101010    19.9563519      ||  14          0.0382865    0.0398107  78        N
0.0079433    0.0080069    20.9653650      ||  15          0.0306534    0.0316228  79        O
0.0063096    0.0063496    21.9725111      ||  16          0.0245034    0.0251189  80        P
0.0050119    0.0050371    22.9781790      ||  17          0.0195623    0.0199526  81        Q
0.0039811    0.0039970    23.9826759      ||  18          0.0156017    0.0158489  82        R
0.0031623    0.0031723    24.9862446      ||  19          0.0124327    0.0125893  83        S
0.0025119    0.0025182    25.9890773      ||  20          0.0099010    0.0100000  84        T
0.0019953    0.0019993    26.9913260      ||  21          0.0078807    0.0079433  85        U
0.0015849    0.0015874    27.9931114      ||  22          0.0062700    0.0063096  86        V
0.0012589    0.0012605    28.9945291      ||  23          0.0049869    0.0050119  87        W
0.0010000    0.0010010    29.9956549      ||  24          0.0039653    0.0039811  88        X
0.0007943    0.0007950    30.9965489      ||  25          0.0031523    0.0031623  89        Y
0.0006310    0.0006314    31.9972589      ||  26          0.0025056    0.0025119  90        Z
0.0005012    0.0005014    32.9978228      ||  27          0.0019913    0.0019953  91        [
0.0003981    0.0003983    33.9982707      ||  28          0.0015824    0.0015849  92        \
0.0003162    0.0003163    34.9986264      ||  29          0.0012573    0.0012589  93        ]
0.0002512    0.0002513    35.9989090      ||  30          0.0009990    0.0010000  94        ^
0.0001995    0.0001996    36.9991334      ||  31          0.0007937    0.0007943  95        _
0.0001585    0.0001585    37.9993116      ||  32          0.0006306    0.0006310  96        `
0.0001259    0.0001259    38.9994532      ||  33          0.0005009    0.0005012  97        a
0.0001000    0.0001000    39.9995657      ||  34          0.0003979    0.0003981  98        b
0.0000794    0.0000794    40.9996550      ||  35          0.0003161    0.0003162  99        c
0.0000631    0.0000631    41.9997260      ||  36          0.0002511    0.0002512  100       d
0.0000501    0.0000501    42.9997823      ||  37          0.0001995    0.0001995  101       e
0.0000398    0.0000398    43.9998271      ||  38          0.0001585    0.0001585  102       f
0.0000316    0.0000316    44.9998627      ||  39          0.0001259    0.0001259  103       g
0.0000251    0.0000251    45.9998909      ||  40          0.0001000    0.0001000  104       h
0.0000200    0.0000200    46.9999133      ||  41          0.0000794    0.0000794  105       i
0.0000158    0.0000158    47.9999312      ||  42          0.0000631    0.0000631  106       j
0.0000126    0.0000126    48.9999453      ||  43          0.0000501    0.0000501  107       k
0.0000100    0.0000100    49.9999566      ||  44          0.0000398    0.0000398  108       l
0.0000079    0.0000079    50.9999655      ||  45          0.0000316    0.0000316  109       m
0.0000063    0.0000063    51.9999726      ||  46          0.0000251    0.0000251  110       n
0.0000050    0.0000050    52.9999782      ||  47          0.0000200    0.0000200  111       o
0.0000040    0.0000040    53.9999827      ||  48          0.0000158    0.0000158  112       p
0.0000032    0.0000032    54.9999863      ||  49          0.0000126    0.0000126  113       q
0.0000025    0.0000025    55.9999891      ||  50          0.0000100    0.0000100  114       r
0.0000020    0.0000020    56.9999913      ||  51          0.0000079    0.0000079  115       s
0.0000016    0.0000016    57.9999931      ||  52          0.0000063    0.0000063  116       t
0.0000013    0.0000013    58.9999945      ||  53          0.0000050    0.0000050  117       u
0.0000010    0.0000010    59.9999957      ||  54          0.0000040    0.0000040  118       v
0.0000008    0.0000008    60.9999966      ||  55          0.0000032    0.0000032  119       w
0.0000006    0.0000006    61.9999973      ||  56          0.0000025    0.0000025  120       x
0.0000005    0.0000005    62.9999978      ||  57          0.0000020    0.0000020  121       y
0.0000004    0.0000004    63.9999983      ||  58          0.0000016    0.0000016  122       z
0.0000003    0.0000003    64.9999986      ||  59          0.0000013    0.0000013  123       {
0.0000003    0.0000003    65.9999989      ||  60          0.0000010    0.0000010  124       |
0.0000002    0.0000002    66.9999991      ||  61          0.0000008    0.0000008  125       }
0.0000002    0.0000002    67.9999993      ||  62          0.0000006    0.0000006  126       ~

With all three data sets, I reproduce a plot familiar to the FASTQ community below, showing the asymptotic behavior of the Q[Solexa] and Q[Sanger] values at high Q (which represents the lowest read errors; the two approach one another because the numbers are simply too damn small on the plot). Also obvious from the plot is that the two curves show poor agreement with each other in the range where the error probability is highest (so the entire analysis goes to pot as the data quality goes to pot [ed. note for the international reader: "pot" refers to the device found in the water-closet]). The grey line is a good plot of the wrong data (that in Table 2A).
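The convergence at high Q can also be read straight off the two definitions. If both are evaluated at a common error probability P, then Q[Sanger] - Q[Solexa] = -10 * log10(1 - P), which shrinks toward zero as P gets small and grows as P approaches 1. The Python sketch below tabulates that gap under this common-P assumption; the function names are illustrative, not from any library.

```python
import math


def q_sanger(p):
    """Q[Sanger] = -10 * log10(P)."""
    return -10 * math.log10(p)


def q_solexa(p):
    """Q[Solexa] = -10 * log10(P / (1 - P))."""
    return -10 * math.log10(p / (1 - p))


def gap(p):
    """Q[Sanger] - Q[Solexa], which equals -10 * log10(1 - P)."""
    return q_sanger(p) - q_solexa(p)


if __name__ == "__main__":
    # The gap is large for poor-quality calls and negligible for good ones.
    for p in (0.5, 0.1, 0.01, 0.001, 0.0001):
        print(f"P={p:<7} Q[Sanger]={q_sanger(p):8.4f} "
              f"Q[Solexa]={q_solexa(p):8.4f} gap={gap(p):.4f}")
```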

The presentation of this data is likely complete overkill, but I have found it useful in discussion. Hopefully having the tables in front of someone during an explanation will help clarify it.