[med-svn] [bowtie2] 01/04: Imported Upstream version 2.2.5

Alex Mestiashvili malex-guest at moszumanska.debian.org
Wed Mar 18 12:46:14 UTC 2015


This is an automated email from the git hooks/post-receive script.

malex-guest pushed a commit to branch master
in repository bowtie2.

commit a6ae526a01bd22dfd361209792dd5ef70aaef0f2
Author: Alexandre Mestiashvili <alex at biotec.tu-dresden.de>
Date:   Wed Mar 18 10:08:18 2015 +0100

    Imported Upstream version 2.2.5
---
 MANUAL             |  10 +--
 MANUAL.markdown    | 130 +++++++++++++--------------------
 Makefile           |  11 ++-
 NEWS               |  13 +++-
 VERSION            |   2 +-
 aligner_result.cpp |  10 +--
 bowtie2            |  32 +++++++--
 bt2_search.cpp     |   2 +-
 doc/manual.html    | 208 +++++++++++++++++++++--------------------------------
 pat.cpp            |   8 ++-
 10 files changed, 193 insertions(+), 233 deletions(-)
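
Note on the aligner_result.cpp hunks in this commit: both RedundantAlns::add and RedundantAlns::overlap now start their row loop at res.trimmedLeft(true) instead of 0. The following is a minimal sketch of why the loop bounds matter, not Bowtie 2 code. It assumes that edit positions are counted from the start of the read including soft-clipped bases, which is what the new bounds suggest but which the diff itself does not state. The helper name editsConsumed and its parameters are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Bowtie 2. It counts how many edit
// positions are consumed by a row loop over [start, start + len), using the
// same "advance while pos == i" pattern as the loops in the diff.
// Assumption: editPos is sorted ascending and uses read offsets that
// include any soft-clipped prefix.
std::size_t editsConsumed(const std::vector<std::size_t>& editPos,
                          std::size_t start, std::size_t len) {
    std::size_t nedidx = 0;  // index of the next edit not yet consumed
    for (std::size_t i = start; i < start + len; i++) {
        // Consume every edit that sits on row i.
        while (nedidx < editPos.size() && editPos[nedidx] == i) {
            nedidx++;
        }
    }
    return nedidx;
}
```

Under these assumptions, with 3 soft-clipped bases, 10 aligned rows and edits at offsets 4 and 11, a loop starting at 0 never reaches offset 11 and consumes only one edit, while a loop starting at 3 consumes both. This is consistent with the NEWS entry about redundancy checks accounting for soft clipping.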

diff --git a/MANUAL b/MANUAL
index 9a01b37..691fa91 100644
--- a/MANUAL
+++ b/MANUAL
@@ -381,8 +381,8 @@ details regarding these fields.
 A pair that aligns with the expected relative mate orientation and with the
 expected range of distances between mates is said to align "concordantly".  If
 both mates have unique alignments, but the alignments do not match paired-end
-expectations (i.e. the mates aren't in the expcted relative orientation, or
-aren't within the expected disatance range, or both), the pair is said to align
+expectations (i.e. the mates aren't in the expected relative orientation, or
+aren't within the expected distance range, or both), the pair is said to align
 "discordantly".  Discordant alignments may be of particular interest, for
 instance, when seeking [structural variants].
 
@@ -519,7 +519,7 @@ because it exceeded a limit placed on search effort (see `-D` and `-R`) or
 because it already knows all it needs to know to report an alignment.
 Information from the best alignments are used to estimate mapping quality (the
 `MAPQ` [SAM] field) and to set SAM optional fields, such as `AS:i` and
-`XS:i`.  Bowtie 2 does not gaurantee that the alignment reported is the best
+`XS:i`.  Bowtie 2 does not garantee that the alignment reported is the best
 possible in terms of alignment score.
 
 See also: `-D`, which puts an upper limit on the number of dynamic programming
@@ -545,7 +545,7 @@ beyond the first has the SAM 'secondary' bit (which equals 256) set in its FLAGS
 field.  See the [SAM specification] for details.
 
 Bowtie 2 does not "find" alignments in any specific order, so for reads that
-have more than N distinct, valid alignments, Bowtie 2 does not gaurantee that
+have more than N distinct, valid alignments, Bowtie 2 does not garantee that
 the N alignments reported are the best possible in terms of alignment score.
 Still, this mode can be effective and fast in situations where the user cares
 more about whether a read aligns (or aligns a certain number of times) than
@@ -571,7 +571,7 @@ very large genomes, this mode is very slow.
 ### Randomness in Bowtie 2
 
 Bowtie 2's search for alignments for a given read is "randomized."  That is,
-when Bowtie 2 encouters a set of equally-good choices, it uses a pseudo-random
+when Bowtie 2 encounters a set of equally-good choices, it uses a pseudo-random
 number to choose.  For example, if Bowtie 2 discovers a set of 3 equally-good
 alignments and wants to decide which to report, it picks a pseudo-random integer
 0, 1 or 2 and reports the corresponding alignment.  Abitrary choices can crop up
diff --git a/MANUAL.markdown b/MANUAL.markdown
index ce02c84..c70abbe 100644
--- a/MANUAL.markdown
+++ b/MANUAL.markdown
@@ -393,8 +393,8 @@ details regarding these fields.
 A pair that aligns with the expected relative mate orientation and with the
 expected range of distances between mates is said to align "concordantly".  If
 both mates have unique alignments, but the alignments do not match paired-end
-expectations (i.e. the mates aren't in the expcted relative orientation, or
-aren't within the expected disatance range, or both), the pair is said to align
+expectations (i.e. the mates aren't in the expected relative orientation, or
+aren't within the expected distance range, or both), the pair is said to align
 "discordantly".  Discordant alignments may be of particular interest, for
 instance, when seeking [structural variants].
 
@@ -532,7 +532,7 @@ because it exceeded a limit placed on search effort (see [`-D`] and [`-R`]) or
 because it already knows all it needs to know to report an alignment.
 Information from the best alignments are used to estimate mapping quality (the
 `MAPQ` [SAM] field) and to set SAM optional fields, such as [`AS:i`] and
-[`XS:i`].  Bowtie 2 does not gaurantee that the alignment reported is the best
+[`XS:i`].  Bowtie 2 does not garantee that the alignment reported is the best
 possible in terms of alignment score.
 
 See also: [`-D`], which puts an upper limit on the number of dynamic programming
@@ -558,7 +558,7 @@ beyond the first has the SAM 'secondary' bit (which equals 256) set in its FLAGS
 field.  See the [SAM specification] for details.
 
 Bowtie 2 does not "find" alignments in any specific order, so for reads that
-have more than N distinct, valid alignments, Bowtie 2 does not gaurantee that
+have more than N distinct, valid alignments, Bowtie 2 does not garantee that
 the N alignments reported are the best possible in terms of alignment score.
 Still, this mode can be effective and fast in situations where the user cares
 more about whether a read aligns (or aligns a certain number of times) than
@@ -584,7 +584,7 @@ very large genomes, this mode is very slow.
 ### Randomness in Bowtie 2
 
 Bowtie 2's search for alignments for a given read is "randomized."  That is,
-when Bowtie 2 encouters a set of equally-good choices, it uses a pseudo-random
+when Bowtie 2 encounters a set of equally-good choices, it uses a pseudo-random
 number to choose.  For example, if Bowtie 2 discovers a set of 3 equally-good
 alignments and wants to decide which to report, it picks a pseudo-random integer
 0, 1 or 2 and reports the corresponding alignment.  Abitrary choices can crop up
@@ -2220,162 +2220,128 @@ scale and the encoding is ASCII-offset by 33 (ASCII char `!`), similarly to a
 of these optional fields for each alignment, depending on the type of the
 alignment:
 
-    <table>
-    <tr><td id="bowtie2-build-opt-fields-as">
-
+<table>
+<tr><td id="bowtie2-build-opt-fields-as">
 [`AS:i`]: #bowtie2-build-opt-fields-as
 
         AS:i:<N>
 
-    </td>
-    <td>
-
+</td>
+<td>
     Alignment score.  Can be negative.  Can be greater than 0 in [`--local`]
     mode (but not in [`--end-to-end`] mode).  Only present if SAM record is for
     an aligned read.
-
-    </td></tr>
-    <tr><td id="bowtie2-build-opt-fields-xs">
-
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-xs">
 [`XS:i`]: #bowtie2-build-opt-fields-xs
 
         XS:i:<N>
 
-    </td>
-    <td>
-
+</td>
+<td>
     Alignment score for the best-scoring alignment found other than the
 	alignment reported.  Can be negative.  Can be greater than 0 in [`--local`]
 	mode (but not in [`--end-to-end`] mode).  Only present if the SAM record is
 	for an aligned read and more than one alignment was found for the read.
 	Note that, when the read is part of a concordantly-aligned pair, this score
 	could be greater than [`AS:i`].
-
-    </td></tr>
-    <tr><td id="bowtie2-build-opt-fields-ys">
-
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-ys">
 [`YS:i`]: #bowtie2-build-opt-fields-ys
 
         YS:i:<N>
 
-    </td>
-    <td>
-
+</td>
+<td>
     Alignment score for opposite mate in the paired-end alignment.  Only present
     if the SAM record is for a read that aligned as part of a paired-end
     alignment.
-
-    </td></tr>
-    <tr><td id="bowtie2-build-opt-fields-xn">
-
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-xn">
 [`XN:i`]: #bowtie2-build-opt-fields-xn
 
         XN:i:<N>
 
-    </td>
-    <td>
-
+</td>
+<td>
     The number of ambiguous bases in the reference covering this alignment. 
     Only present if SAM record is for an aligned read.
-
-    </td></tr>
-    <tr><td id="bowtie2-build-opt-fields-xm">
-
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-xm">
 [`XM:i`]: #bowtie2-build-opt-fields-xm
 
         XM:i:<N>
 
-    </td>
-    <td>
-
+</td>
+<td>
     The number of mismatches in the alignment.  Only present if SAM record is
     for an aligned read.
-
-    </td></tr>
-    <tr><td id="bowtie2-build-opt-fields-xo">
-
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-xo">
 [`XO:i`]: #bowtie2-build-opt-fields-xo
 
         XO:i:<N>
 
-    </td>
-    <td>
-
+</td>
+<td>
     The number of gap opens, for both read and reference gaps, in the alignment.
     Only present if SAM record is for an aligned read.
-
-    </td></tr>
-    <tr><td id="bowtie2-build-opt-fields-xg">
-
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-xg">
 [`XG:i`]: #bowtie2-build-opt-fields-xg
 
         XG:i:<N>
 
-    </td>
-    <td>
-
+</td>
+<td>
     The number of gap extensions, for both read and reference gaps, in the
     alignment. Only present if SAM record is for an aligned read.
-
-    </td></tr>
-    <tr><td id="bowtie2-build-opt-fields-nm">
-
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-nm">
 [`NM:i`]: #bowtie2-build-opt-fields-nm
 
         NM:i:<N>
 
-    </td>
-    <td>
-
+</td>
+<td>
     The edit distance; that is, the minimal number of one-nucleotide edits
     (substitutions, insertions and deletions) needed to transform the read
     string into the reference string.  Only present if SAM record is for an
     aligned read.
-
-    </td></tr>
-    <tr><td id="bowtie2-build-opt-fields-yf">
-
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-yf">
 [`YF:Z`]: #bowtie2-build-opt-fields-yf
 
         YF:Z:<S>
 
-    </td><td>
-
+</td><td>
     String indicating reason why the read was filtered out.  See also:
     [Filtering].  Only appears for reads that were filtered out.
-
-    </td></tr>
-    <tr><td id="bowtie2-build-opt-fields-yt">
-
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-yt">
 [`YT:Z`]: #bowtie2-build-opt-fields-yt
 
         YT:Z:<S>
 
-    </td><td>
-
+</td><td>
     Value of `UU` indicates the read was not part of a pair.  Value of `CP`
     indicates the read was part of a pair and the pair aligned concordantly.
     Value of `DP` indicates the read was part of a pair and the pair aligned
     discordantly.  Value of `UP` indicates the read was part of a pair but the
     pair failed to aligned either concordantly or discordantly.
-
 [Filtering]: #filtering
-
-    </td></tr>
-    <tr><td id="bowtie2-build-opt-fields-md">
-
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-md">
 [`MD:Z`]: #bowtie2-build-opt-fields-md
 
         MD:Z:<S>
 
-    </td><td>
-
+</td><td>
     A string representation of the mismatched reference bases in the alignment. 
     See [SAM] format specification for details.  Only present if SAM record is
     for an aligned read.
-
-    </td></tr>
-    </table>
+</td></tr>
+</table>
 
 [SAM format specification]: http://samtools.sf.net/SAM1.pdf
 [FASTQ]: http://en.wikipedia.org/wiki/FASTQ_format
diff --git a/Makefile b/Makefile
index d74f7c8..a4cdfa7 100644
--- a/Makefile
+++ b/Makefile
@@ -54,12 +54,11 @@ endif
 MACOS = 0
 ifneq (,$(findstring Darwin,$(shell uname)))
 	MACOS = 1
-endif
-
-ifneq (,$(findstring 13,$(shell uname -r)))
-	CPP = clang++
-	CC = clang
-	EXTRA_FLAGS += -stdlib=libstdc++
+	ifneq (,$(findstring 13,$(shell uname -r)))
+		CPP = clang++
+		CC = clang
+		EXTRA_FLAGS += -stdlib=libstdc++
+	endif
 endif
 
 POPCNT_CAPABILITY ?= 1
diff --git a/NEWS b/NEWS
index ee6c178..7f0e810 100644
--- a/NEWS
+++ b/NEWS
@@ -3,7 +3,7 @@ Bowtie 2 NEWS
 
 Bowtie 2 is now available for download from the project website,
 http://bowtie-bio.sf.net/bowtie2.  2.0.0-beta1 is the first version released to
-the public and 2.2.1 is the latest version.  Bowtie 2 is licensed under
+the public and 2.2.5 is the latest version.  Bowtie 2 is licensed under
 the GPLv3 license.  See `LICENSE' file for details.
 
 Reporting Issues
@@ -16,6 +16,17 @@ Please report any issues using the Sourceforge bug tracker:
 Version Release History
 =======================
 
+Version 2.2.5 - Mar 9, 2015
+   * Fixed some situations where incorrectly we could detect a Mavericks platform.
+   * Fixed some manual issues including some HTML bad formating.
+   * Make sure the wrapper correctly identifies the platform under OSX.
+   * Fixed --rg/--rg-id options where included spaces were incorrectly treated.
+   * Various documentation fixes added by contributors.
+   * Fixed the incorrect behavior where parameter file names may contain spaces.
+   * Fixed bugs related with the presence of spaces in the path where bowtie binaries are stored.
+   * Improved exception handling for missformated quality values. 
+   * Improved redundancy checks by correctly account for soft clipping. 
+
 Version 2.2.4 - Oct 22, 2014
    * Fixed a Mavericks OSX specific bug caused by some linkage ambiguities.
    * Added lz4 compression option for the wrapper.
diff --git a/VERSION b/VERSION
index 530cdd9..21bb5e1 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-2.2.4
+2.2.5
diff --git a/aligner_result.cpp b/aligner_result.cpp
index 1072575..5a2b87e 100644
--- a/aligner_result.cpp
+++ b/aligner_result.cpp
@@ -930,6 +930,7 @@ void RedundantAlns::add(const AlnRes& res) {
 	assert(!cells_.empty());
 	TRefOff left = res.refoff(), right;
 	const size_t len = res.readExtentRows();
+        const size_t alignmentStart = res.trimmedLeft(true);
 	if(!res.fw()) {
 		const_cast<AlnRes&>(res).invertEdits();
 	}
@@ -937,7 +938,7 @@ void RedundantAlns::add(const AlnRes& res) {
 	size_t nedidx = 0;
 	assert_leq(len, cells_.size());
 	// For each row...
-	for(size_t i = 0; i < len; i++) {
+	for(size_t i = alignmentStart; i < alignmentStart + len; i++) {
 		size_t diff = 1;  // amount to shift to right for next round
 		right = left + 1;
 		while(nedidx < ned.size() && ned[nedidx].pos == i) {
@@ -947,7 +948,7 @@ void RedundantAlns::add(const AlnRes& res) {
 			}
 			nedidx++;
 		}
-		if(i < len - 1) {
+		if(i < alignmentStart + len - 1) {
 			// See how many inserts there are before the next read
 			// character
 			size_t nedidx_next = nedidx;
@@ -980,6 +981,7 @@ bool RedundantAlns::overlap(const AlnRes& res) {
 	assert(!cells_.empty());
 	TRefOff left = res.refoff(), right;
 	const size_t len = res.readExtentRows();
+        const size_t alignmentStart = res.trimmedLeft(true);
 	if(!res.fw()) {
 		const_cast<AlnRes&>(res).invertEdits();
 	}
@@ -988,7 +990,7 @@ bool RedundantAlns::overlap(const AlnRes& res) {
 	// For each row...
 	bool olap = false;
 	assert_leq(len, cells_.size());
-	for(size_t i = 0; i < len; i++) {
+	for(size_t i = alignmentStart; i < alignmentStart + len; i++) {
 		size_t diff = 1;  // amount to shift to right for next round
 		right = left + 1;
 		while(nedidx < ned.size() && ned[nedidx].pos == i) {
@@ -998,7 +1000,7 @@ bool RedundantAlns::overlap(const AlnRes& res) {
 			}
 			nedidx++;
 		}
-		if(i < len - 1) {
+		if(i < alignmentStart + len - 1) {
 			// See how many inserts there are before the next read
 			// character
 			size_t nedidx_next = nedidx;
diff --git a/bowtie2 b/bowtie2
index 31e2ac3..12e3567 100755
--- a/bowtie2
+++ b/bowtie2
@@ -45,7 +45,7 @@ while (-f $prog && -l $prog){
 
 ($vol,$script_path,$prog) 
                 = File::Spec->splitpath($prog);
-my $os_is_nix   = ($^O eq "linux") || ($^O eq "darwin");
+my $os_is_nix   = $^O ne "MSWin32";
 my $align_bin_s = $os_is_nix ? 'bowtie2-align-s' : 'bowtie2-align-s.exe'; 
 my $build_bin   = $os_is_nix ? 'bowtie2-build' : 'bowtie2-build.exe';               
 my $align_bin_l = $os_is_nix ? 'bowtie2-align-l' : 'bowtie2-align-l.exe'; 
@@ -58,6 +58,25 @@ my $idx_ext       = $idx_ext_s;
 my %signo       = ();
 my @signame     = ();
 
+sub quote_params {
+    my %params_2_quote = ('--rg' => 1, '--rg-id' => 1,
+                          '-S' => 1, '-U' => 1,
+                          '-1' => 1, '-2' => 1
+    );
+    my $param_list = shift;
+    my $quoting = 0;
+    
+    for (my $i=0; $i<scalar(@{$param_list}); $i++){
+        if($quoting){
+            $quoting = 0;
+            $param_list->[$i] = "\"".$param_list->[$i]."\"";
+            next;
+        }
+    	$quoting = 1 if(exists($params_2_quote{$param_list->[$i]}));
+    }
+}
+
+
 {
 	# Get signal info
 	use Config;
@@ -76,7 +95,7 @@ my @signame     = ();
 # 2 args from wrapper args
 sub getBt2Desc($) {
 	my $d = shift;
-	my $cmd = "$align_prog --wrapper basic-0 --arg-desc";
+	my $cmd = "\"$align_prog\" --wrapper basic-0 --arg-desc";
 	open(my $fh, "$cmd |") || Fail("Failed to run command '$cmd'\n");
 	while(readline $fh) {
 		chomp;
@@ -273,13 +292,13 @@ sub cat_file($$) {
 	my ($ifn, $ofh) = @_;
 	my $ifh = undef;
 	if($ifn =~ /\.gz$/) {
-		open($ifh, "gzip -dc $ifn |") ||
+		open($ifh, "gzip -dc \"$ifn\" |") ||
 			 Fail("Could not open gzipped read file: $ifn \n");
 	} elsif($ifn =~ /\.bz2/) {
-		open($ifh, "bzip2 -dc $ifn |") ||
+		open($ifh, "bzip2 -dc \"$ifn\" |") ||
 			Fail("Could not open bzip2ed read file: $ifn \n");
 	} elsif($ifn =~ /\.lz4/) {
-		open($ifh, "lz4 -dc $ifn |") ||
+		open($ifh, "lz4 -dc \"$ifn\" |") ||
 			Fail("Could not open lz4ed read file: $ifn \n");
 	} else {
 		open($ifh, $ifn) || Fail("Could not open read file: $ifn \n");
@@ -446,7 +465,8 @@ else {
 my $debug_str = ($debug ? "-debug" : "");
 
 # Construct command invoking bowtie2-align
-my $cmd = "$align_prog$debug_str --wrapper basic-0 ".join(" ", @bt2_args);
+quote_params(\@bt2_args);
+my $cmd = "\"$align_prog$debug_str\" --wrapper basic-0 ".join(" ", @bt2_args);
 
 # Possibly add read input on an anonymous pipe
 $cmd = "$readpipe $cmd" if defined($readpipe);
diff --git a/bt2_search.cpp b/bt2_search.cpp
index f8a0c80..04f495a 100644
--- a/bt2_search.cpp
+++ b/bt2_search.cpp
@@ -1284,7 +1284,7 @@ static void parseOption(int next_option, const char *arg) {
 				throw 1;
 			}
 			if(len > 32) {
-				cerr << "Error: -L argument must be <= 32; was" << arg << endl;
+				cerr << "Error: -L argument must be <= 32; was " << arg << endl;
 				throw 1;
 			}
 			polstr += ";SEEDLEN="; polstr += arg; break;
diff --git a/doc/manual.html b/doc/manual.html
index 34821f8..63fa76b 100644
--- a/doc/manual.html
+++ b/doc/manual.html
@@ -182,7 +182,7 @@ Alignment:
 <h3 id="paired-sam-output">Paired SAM output</h3>
 <p>When Bowtie 2 prints a SAM alignment for a pair, it prints two records (i.e. two lines of output), one for each mate. The first record describes the alignment for mate 1 and the second record describes the alignment for mate 2. In both records, some of the fields of the SAM record describe various properties of the alignment; for instance, the 7th and 8th fields (<code>RNEXT</code> and <code>PNEXT</code> respectively) indicate the reference name and position where the other mate align [...]
 <h3 id="concordant-pairs-match-pair-expectations-discordant-pairs-dont">Concordant pairs match pair expectations, discordant pairs don't</h3>
-<p>A pair that aligns with the expected relative mate orientation and with the expected range of distances between mates is said to align "concordantly". If both mates have unique alignments, but the alignments do not match paired-end expectations (i.e. the mates aren't in the expcted relative orientation, or aren't within the expected disatance range, or both), the pair is said to align "discordantly". Discordant alignments may be of particular interest, for instance [...]
+<p>A pair that aligns with the expected relative mate orientation and with the expected range of distances between mates is said to align "concordantly". If both mates have unique alignments, but the alignments do not match paired-end expectations (i.e. the mates aren't in the expected relative orientation, or aren't within the expected distance range, or both), the pair is said to align "discordantly". Discordant alignments may be of particular interest, for instance [...]
 <p>The expected relative orientation of the mates is set using the <a href="#bowtie2-options-fr"><code>--ff</code></a>, <a href="#bowtie2-options-fr"><code>--fr</code></a>, or <a href="#bowtie2-options-fr"><code>--rf</code></a> options. The expected range of inter-mates distances (as measured from the furthest extremes of the mates; also called "outer distance") is set with the <a href="#bowtie2-options-I"><code>-I</code></a> and <a href="#bowtie2-options-X"><code>-X</code></a> [...]
 <p>To declare that a pair aligns discordantly, Bowtie 2 requires that both mates align uniquely. This is a conservative threshold, but this is often desirable when seeking structural variants.</p>
 <p>By default, Bowtie 2 searches for both concordant and discordant alignments, though searching for discordant alignments can be disabled with the <a href="#bowtie2-options-no-discordant"><code>--no-discordant</code></a> option.</p>
@@ -220,17 +220,17 @@ Reference: GCAGATTATATGAGTCAGCTACGATATTGTTTGGGGTGACACATTACGCGTCTTTGAC</code></pr
 <p>Two alignments for the same individual read are "distinct" if they map the same read to different places. Specifically, we say that two alignments are distinct if there are no alignment positions where a particular read offset is aligned opposite a particular reference offset in both alignments with the same orientation. E.g. if the first alignment is in the forward orientation and aligns the read character at read offset 10 to the reference character at chromosome 3, offset [...]
 <p>Two alignments for the same pair are distinct if either the mate 1s in the two paired-end alignments are distinct or the mate 2s in the two alignments are distinct or both.</p>
 <h3 id="default-mode-search-for-multiple-alignments-report-the-best-one">Default mode: search for multiple alignments, report the best one</h3>
-<p>By default, Bowtie 2 searches for distinct, valid alignments for each read. When it finds a valid alignment, it generally will continue to look for alignments that are nearly as good or better. It will eventually stop looking, either because it exceeded a limit placed on search effort (see <a href="#bowtie2-options-D"><code>-D</code></a> and <a href="#bowtie2-options-R"><code>-R</code></a>) or because it already knows all it needs to know to report an alignment. Information from the b [...]
+<p>By default, Bowtie 2 searches for distinct, valid alignments for each read. When it finds a valid alignment, it generally will continue to look for alignments that are nearly as good or better. It will eventually stop looking, either because it exceeded a limit placed on search effort (see <a href="#bowtie2-options-D"><code>-D</code></a> and <a href="#bowtie2-options-R"><code>-R</code></a>) or because it already knows all it needs to know to report an alignment. Information from the b [...]
 <p>See also: <a href="#bowtie2-options-D"><code>-D</code></a>, which puts an upper limit on the number of dynamic programming problems (i.e. seed extensions) that can "fail" in a row before Bowtie 2 stops searching. Increasing <a href="#bowtie2-options-D"><code>-D</code></a> makes Bowtie 2 slower, but increases the likelihood that it will report the correct alignment for a read that aligns many places.</p>
 <p>See also: <a href="#bowtie2-options-R"><code>-R</code></a>, which sets the maximum number of times Bowtie 2 will "re-seed" when attempting to align a read with repetitive seeds. Increasing <a href="#bowtie2-options-R"><code>-R</code></a> makes Bowtie 2 slower, but increases the likelihood that it will report the correct alignment for a read that aligns many places.</p>
 <h3 id="k-mode-search-for-one-or-more-alignments-report-each">-k mode: search for one or more alignments, report each</h3>
 <p>In <a href="#bowtie2-options-k"><code>-k</code></a> mode, Bowtie 2 searches for up to N distinct, valid alignments for each read, where N equals the integer specified with the <code>-k</code> parameter. That is, if <code>-k 2</code> is specified, Bowtie 2 will search for at most 2 distinct alignments. It reports all alignments found, in descending order by alignment score. The alignment score for a paired-end alignment equals the sum of the alignment scores of the individual mates. Ea [...]
-<p>Bowtie 2 does not "find" alignments in any specific order, so for reads that have more than N distinct, valid alignments, Bowtie 2 does not gaurantee that the N alignments reported are the best possible in terms of alignment score. Still, this mode can be effective and fast in situations where the user cares more about whether a read aligns (or aligns a certain number of times) than where exactly it originated.</p>
+<p>Bowtie 2 does not "find" alignments in any specific order, so for reads that have more than N distinct, valid alignments, Bowtie 2 does not garantee that the N alignments reported are the best possible in terms of alignment score. Still, this mode can be effective and fast in situations where the user cares more about whether a read aligns (or aligns a certain number of times) than where exactly it originated.</p>
 <h3 id="a-mode-search-for-and-report-all-alignments">-a mode: search for and report all alignments</h3>
 <p><a href="#bowtie2-options-a"><code>-a</code></a> mode is similar to <a href="#bowtie2-options-k"><code>-k</code></a> mode except that there is no upper limit on the number of alignments Bowtie 2 should report. Alignments are reported in descending order by alignment score. The alignment score for a paired-end alignment equals the sum of the alignment scores of the individual mates. Each reported read or pair alignment beyond the first has the SAM 'secondary' bit (which equals 256) set [...]
 <p>Some tools are designed with this reporting mode in mind. Bowtie 2 is not! For very large genomes, this mode is very slow.</p>
 <h3 id="randomness-in-bowtie-2">Randomness in Bowtie 2</h3>
-<p>Bowtie 2's search for alignments for a given read is "randomized." That is, when Bowtie 2 encouters a set of equally-good choices, it uses a pseudo-random number to choose. For example, if Bowtie 2 discovers a set of 3 equally-good alignments and wants to decide which to report, it picks a pseudo-random integer 0, 1 or 2 and reports the corresponding alignment. Abitrary choices can crop up at various points during alignment.</p>
+<p>Bowtie 2's search for alignments for a given read is "randomized." That is, when Bowtie 2 encounters a set of equally-good choices, it uses a pseudo-random number to choose. For example, if Bowtie 2 discovers a set of 3 equally-good alignments and wants to decide which to report, it picks a pseudo-random integer 0, 1 or 2 and reports the corresponding alignment. Abitrary choices can crop up at various points during alignment.</p>
 <p>The pseudo-random number generator is re-initialized for every read, and the seed used to initialize it is a function of the read name, nucleotide string, quality string, and the value specified with <a href="#bowtie2-options-seed"><code>--seed</code></a>. If you run the same version of Bowtie 2 on two reads with identical names, nucleotide strings, and quality strings, and if <a href="#bowtie2-options-seed"><code>--seed</code></a> is set the same for both runs, Bowtie 2 will produce  [...]
 <p>However, when the user specifies the <a href="#bowtie2-options-non-deterministic"><code>--non-deterministic</code></a> option, Bowtie 2 will use the current time to re-initialize the pseudo-random number generator. When this is specified, Bowtie 2 might report different alignments for identical reads. This is counter-intuitive for some users, but might be more appropriate in situations where the input consists of many identical reads.</p>
 <h2 id="multiseed-heuristic">Multiseed heuristic</h2>
@@ -510,7 +510,7 @@ Reference: GCAGATTATATGAGTCAGCTACGATATTGTTTGGGGTGACACATTACGCGTCTTTGAC</code></pr
 <pre><code>-L <int></code></pre>
 </td><td>
 
-<p>Sets the length of the seed substrings to align during <a href="#multiseed-heuristic">multiseed alignment</a>. Smaller values make alignment slower but more senstive. Default: the <a href="#bowtie2-options-sensitive"><code>--sensitive</code></a> preset is used by default, which sets <code>-L</code> to 20 both in <a href="#bowtie2-options-end-to-end"><code>--end-to-end</code></a> mode and in <a href="#bowtie2-options-local"><code>--local</code></a> mode.</p>
+<p>Sets the length of the seed substrings to align during <a href="#multiseed-heuristic">multiseed alignment</a>. Smaller values make alignment slower but more senstive. Default: the <a href="#bowtie2-options-sensitive"><code>--sensitive</code></a> preset is used by default, which sets <code>-L</code> to 22 in <a href="#bowtie2-options-end-to-end"><code>--end-to-end</code></a> mode and to 20 in <a href="#bowtie2-options-local"><code>--local</code></a> mode.</p>
 </td></tr>
 <tr><td id="bowtie2-options-i">
 
@@ -527,7 +527,7 @@ Seed 3 fw:             ACGCTATCAT
 Seed 3 rc:             ATGATAGCGT
 Seed 4 fw:                   TCATGCATAA
 Seed 4 rc:                   TTATGCATGA</code></pre>
-<p>Since it's best to use longer intervals for longer reads, this parameter sets the interval as a function of the read length, rather than a single one-size-fits-all number. For instance, specifying <code>-i S,1,2.5</code> sets the interval function <code>f</code> to <code>f(x) = 1 + 2.5 * sqrt(x)</code>, where x is the read length. See also: <a href="#setting-function-options">setting function options</a>. If the function returns a result less than 1, it is rounded up to 1. Default: th [...]
+<p>Since it's best to use longer intervals for longer reads, this parameter sets the interval as a function of the read length, rather than a single one-size-fits-all number. For instance, specifying <code>-i S,1,2.5</code> sets the interval function <code>f</code> to <code>f(x) = 1 + 2.5 * sqrt(x)</code>, where x is the read length. See also: <a href="#setting-function-options">setting function options</a>. If the function returns a result less than 1, it is rounded up to 1. Default: th [...]
 </td></tr>
 <tr><td id="bowtie2-options-n-ceil">
 
@@ -1005,129 +1005,85 @@ Seed 4 rc:                   TTATGCATGA</code></pre>
 <li><p>Inferred fragment length. Size is negative if the mate's alignment occurs upstream of this alignment. Size is 0 if the mates did not align concordantly. However, size is non-0 if the mates aligned discordantly to the same chromosome.</p></li>
 <li><p>Read sequence (reverse-complemented if aligned to the reverse strand)</p></li>
 <li><p>ASCII-encoded read qualities (reverse-complemented if the read aligned to the reverse strand). The encoded quality values are on the <a href="http://en.wikipedia.org/wiki/Phred_quality_score">Phred quality</a> scale and the encoding is ASCII-offset by 33 (ASCII char <code>!</code>), similarly to a <a href="http://en.wikipedia.org/wiki/FASTQ_format">FASTQ</a> file.</p></li>
-<li><p>Optional fields. Fields are tab-separated. <code>bowtie2</code> outputs zero or more of these optional fields for each alignment, depending on the type of the alignment:</p>
+<li><p>Optional fields. Fields are tab-separated. <code>bowtie2</code> outputs zero or more of these optional fields for each alignment, depending on the type of the alignment:</p></li>
+</ol>
 <table>
 <tr><td id="bowtie2-build-opt-fields-as">
-</li>
-</ol>
-<pre><code>    AS:i:<N>
-
-</td>
-<td>
-
-Alignment score.  Can be negative.  Can be greater than 0 in [`--local`]
-mode (but not in [`--end-to-end`] mode).  Only present if SAM record is for
-an aligned read.
-
-</td></tr>
-<tr><td id="bowtie2-build-opt-fields-xs"></code></pre>
-<pre><code>    XS:i:<N>
-
-</td>
-<td>
-
-Alignment score for the best-scoring alignment found other than the
-alignment reported.  Can be negative.  Can be greater than 0 in [`--local`]
-mode (but not in [`--end-to-end`] mode).  Only present if the SAM record is
-for an aligned read and more than one alignment was found for the read.
-Note that, when the read is part of a concordantly-aligned pair, this score
-could be greater than [`AS:i`].
-
-</td></tr>
-<tr><td id="bowtie2-build-opt-fields-ys"></code></pre>
-<pre><code>    YS:i:<N>
-
-</td>
-<td>
-
-Alignment score for opposite mate in the paired-end alignment.  Only present
-if the SAM record is for a read that aligned as part of a paired-end
-alignment.
-
-</td></tr>
-<tr><td id="bowtie2-build-opt-fields-xn"></code></pre>
-<pre><code>    XN:i:<N>
-
-</td>
-<td>
-
-The number of ambiguous bases in the reference covering this alignment. 
-Only present if SAM record is for an aligned read.
-
-</td></tr>
-<tr><td id="bowtie2-build-opt-fields-xm"></code></pre>
-<pre><code>    XM:i:<N>
-
-</td>
-<td>
-
-The number of mismatches in the alignment.  Only present if SAM record is
-for an aligned read.
-
-</td></tr>
-<tr><td id="bowtie2-build-opt-fields-xo"></code></pre>
-<pre><code>    XO:i:<N>
-
-</td>
-<td>
-
-The number of gap opens, for both read and reference gaps, in the alignment.
-Only present if SAM record is for an aligned read.
-
-</td></tr>
-<tr><td id="bowtie2-build-opt-fields-xg"></code></pre>
-<pre><code>    XG:i:<N>
-
-</td>
-<td>
-
-The number of gap extensions, for both read and reference gaps, in the
-alignment. Only present if SAM record is for an aligned read.
-
-</td></tr>
-<tr><td id="bowtie2-build-opt-fields-nm"></code></pre>
-<pre><code>    NM:i:<N>
-
-</td>
-<td>
-
-The edit distance; that is, the minimal number of one-nucleotide edits
-(substitutions, insertions and deletions) needed to transform the read
-string into the reference string.  Only present if SAM record is for an
-aligned read.
-
-</td></tr>
-<tr><td id="bowtie2-build-opt-fields-yf"></code></pre>
-<pre><code>    YF:Z:<S>
-
-</td><td>
-
-String indicating reason why the read was filtered out.  See also:
-[Filtering].  Only appears for reads that were filtered out.
-
-</td></tr>
-<tr><td id="bowtie2-build-opt-fields-yt"></code></pre>
-<pre><code>    YT:Z:<S>
-
-</td><td>
-
-Value of `UU` indicates the read was not part of a pair.  Value of `CP`
-indicates the read was part of a pair and the pair aligned concordantly.
-Value of `DP` indicates the read was part of a pair and the pair aligned
-discordantly.  Value of `UP` indicates the read was part of a pair but the
-pair failed to aligned either concordantly or discordantly.</code></pre>
-<pre><code></td></tr>
-<tr><td id="bowtie2-build-opt-fields-md"></code></pre>
-<pre><code>    MD:Z:<S>
-
-</td><td>
-
-A string representation of the mismatched reference bases in the alignment. 
-See [SAM] format specification for details.  Only present if SAM record is
-for an aligned read.
+<pre><code>    AS:i:<N></code></pre>
+</td>
+<td>
+    
+Alignment score. Can be negative. Can be greater than 0 in <a href="#bowtie2-options-local"><code>--local</code></a> mode (but not in <a href="#bowtie2-options-end-to-end"><code>--end-to-end</code></a> mode). Only present if SAM record is for an aligned read.
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-xs">
+<pre><code>    XS:i:<N></code></pre>
+</td>
+<td>
+    
+Alignment score for the best-scoring alignment found other than the alignment reported. Can be negative. Can be greater than 0 in <a href="#bowtie2-options-local"><code>--local</code></a> mode (but not in <a href="#bowtie2-options-end-to-end"><code>--end-to-end</code></a> mode). Only present if the SAM record is for an aligned read and more than one alignment was found for the read. Note that, when the read is part of a concordantly-aligned pair, this score could be greater than <a href= [...]
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-ys">
+<pre><code>    YS:i:<N></code></pre>
+</td>
+<td>
+    
+Alignment score for opposite mate in the paired-end alignment. Only present if the SAM record is for a read that aligned as part of a paired-end alignment.
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-xn">
+<pre><code>    XN:i:<N></code></pre>
+</td>
+<td>
+    
+The number of ambiguous bases in the reference covering this alignment. Only present if SAM record is for an aligned read.
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-xm">
+<pre><code>    XM:i:<N></code></pre>
+</td>
+<td>
+    
+The number of mismatches in the alignment. Only present if SAM record is for an aligned read.
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-xo">
+<pre><code>    XO:i:<N></code></pre>
+</td>
+<td>
+    
+The number of gap opens, for both read and reference gaps, in the alignment. Only present if SAM record is for an aligned read.
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-xg">
+<pre><code>    XG:i:<N></code></pre>
+</td>
+<td>
+    
+The number of gap extensions, for both read and reference gaps, in the alignment. Only present if SAM record is for an aligned read.
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-nm">
+<pre><code>    NM:i:<N></code></pre>
+</td>
+<td>
+    
+The edit distance; that is, the minimal number of one-nucleotide edits (substitutions, insertions and deletions) needed to transform the read string into the reference string. Only present if SAM record is for an aligned read.
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-yf">
+<pre><code>    YF:Z:<S></code></pre>
+</td><td>
+    
+String indicating reason why the read was filtered out. See also: <a href="#filtering">Filtering</a>. Only appears for reads that were filtered out.
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-yt">
+<pre><code>    YT:Z:<S></code></pre>
+</td><td>
+    
+Value of <code>UU</code> indicates the read was not part of a pair. Value of <code>CP</code> indicates the read was part of a pair and the pair aligned concordantly. Value of <code>DP</code> indicates the read was part of a pair and the pair aligned discordantly. Value of <code>UP</code> indicates the read was part of a pair but the pair failed to align either concordantly or discordantly.
+</td></tr>
+<tr><td id="bowtie2-build-opt-fields-md">
+<pre><code>    MD:Z:<S></code></pre>
+</td><td>
+    
+A string representation of the mismatched reference bases in the alignment. See <a href="http://samtools.sourceforge.net/SAM1.pdf">SAM</a> format specification for details. Only present if SAM record is for an aligned read.
+</td></tr>
+</table>
 
-</td></tr>
-</table></code></pre>
 <h1 id="the-bowtie2-build-indexer">The <code>bowtie2-build</code> indexer</h1>
 <p><code>bowtie2-build</code> builds a Bowtie index from a set of DNA sequences. <code>bowtie2-build</code> outputs a set of 6 files with suffixes <code>.1.bt2</code>, <code>.2.bt2</code>, <code>.3.bt2</code>, <code>.4.bt2</code>, <code>.rev.1.bt2</code>, and <code>.rev.2.bt2</code>. In the case of a large index these suffixes will have a <code>bt2l</code> termination. These files together constitute the index: they are all that is needed to align reads to that reference. The original se [...]
 <p>Bowtie 2's <code>.bt2</code> index format is different from Bowtie 1's <code>.ebwt</code> format, and they are not compatible with each other.</p>
diff --git a/pat.cpp b/pat.cpp
index f9167e0..7d287c9 100644
--- a/pat.cpp
+++ b/pat.cpp
@@ -1026,7 +1026,13 @@ bool FastqPatternSource::read(
 			}
 			if (c != '\r' && c != '\n') {
 				if (*qualsReadCur >= trim5) {
-					c = charToPhred33(c, solQuals_, phred64Quals_);
+					try {
+						c = charToPhred33(c, solQuals_, phred64Quals_);
+					}
+					catch (...) {
+						cout << "Error encountered at sequence id: " << r.name << endl;
+						throw;
+					}
 					assert_geq(c, 33);
 					qbuf->append(c);
 				}
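
The pat.cpp hunk above wraps the quality conversion in a catch-all handler that prints the read name and then rethrows, so the original exception still reaches the caller. A minimal standalone sketch of that pattern follows. It is not the bowtie2 code: toPhred33 and convertQualities are hypothetical stand-ins, the range check is an assumed simplification of what charToPhred33 does, and the message goes to std::cerr here whereas the patch writes to cout.

```cpp
#include <cassert>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for charToPhred33: accepts only printable
// Phred+33 characters ('!' = 33 through '~' = 126) and throws otherwise.
inline char toPhred33(char c) {
    if (c < 33 || c > 126) {
        throw std::runtime_error("quality character out of range");
    }
    return c;
}

// Convert a quality string character by character. On failure, report
// which read was being processed, then rethrow the same exception.
inline std::string convertQualities(const std::string& readName,
                                    const std::string& quals) {
    std::string out;
    for (char c : quals) {
        try {
            c = toPhred33(c);
        } catch (...) {
            std::cerr << "Error encountered at sequence id: "
                      << readName << std::endl;
            throw;  // bare throw rethrows the in-flight exception unchanged
        }
        assert(c >= 33);  // mirrors assert_geq(c, 33) in the patch
        out.push_back(c);
    }
    return out;
}
```

Because the handler uses a bare throw rather than throwing a new object, the caller sees the original exception type and message; the handler only adds the read name as context.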

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/bowtie2.git


