[med-svn] r18790 - trunk/packages/vsearch/trunk/debian/patches

Andreas Tille tille at moszumanska.debian.org
Wed Feb 18 14:38:05 UTC 2015


Author: tille
Date: 2015-02-18 14:38:04 +0000 (Wed, 18 Feb 2015)
New Revision: 18790

Modified:
   trunk/packages/vsearch/trunk/debian/patches/manpage_syntax.patch
Log:
Hmmm, upstream is actively maintaining the man page, so this patch does not make much sense without upstream support.  Just left the working chunks and dropped the conflicting ones - feel free to enhance.


Modified: trunk/packages/vsearch/trunk/debian/patches/manpage_syntax.patch
===================================================================
--- trunk/packages/vsearch/trunk/debian/patches/manpage_syntax.patch	2015-02-17 07:54:41 UTC (rev 18789)
+++ trunk/packages/vsearch/trunk/debian/patches/manpage_syntax.patch	2015-02-18 14:38:04 UTC (rev 18790)
@@ -4,105 +4,7 @@
 
 --- a/doc/vsearch.1
 +++ b/doc/vsearch.1
-@@ -9,58 +9,58 @@ vsearch \(em chimera detection, clusteri
- .ad l
- Chimera detection:
- .RS
--\fBvsearch\fR --uchime_denovo \fIfastafile\fR (--chimeras |
----nonchimeras | --uchimealns | --uchimeout) \fIoutputfile\fR
-+\fBvsearch\fR \-\-uchime_denovo \fIfastafile\fR (\-\-chimeras |
-+\-\-nonchimeras | \-\-uchimealns | \-\-uchimeout) \fIoutputfile\fR
- [\fIoptions\fR]
- .PP
--\fBvsearch\fR --uchime_ref \fIfastafile\fR (--chimeras | --nonchimeras
--| --uchimealns | --uchimeout) \fIoutputfile\fR --db \fIfastafile\fR
-+\fBvsearch\fR \-\-uchime_ref \fIfastafile\fR (\-\-chimeras | \-\-nonchimeras
-+| \-\-uchimealns | \-\-uchimeout) \fIoutputfile\fR \-\-db \fIfastafile\fR
- [\fIoptions\fR]
- .PP
- .RE
- Clustering:
- .RS
--\fBvsearch\fR (--cluster_fast | --cluster_size | --cluster_smallmem)
--\fIfastafile\fR (--alnout | --blast6out | --centroids | --clusters |
----msaout | --uc | --userout) \fIoutputfile\fR --id \fIreal\fR
-+\fBvsearch\fR (\-\-cluster_fast | \-\-cluster_size | \-\-cluster_smallmem)
-+\fIfastafile\fR (\-\-alnout | \-\-blast6out | \-\-centroids | \-\-clusters |
-+\-\-msaout | \-\-uc | \-\-userout) \fIoutputfile\fR \-\-id \fIreal\fR
- [\fIoptions\fR]
- .PP
- .RE
- Dereplication:
- .RS
--\fBvsearch\fR --derep_fulllength \fIfastafile\fR (--output | --uc)
-+\fBvsearch\fR \-\-derep_fulllength \fIfastafile\fR (\-\-output | \-\-uc)
- \fIoutputfile\fR [\fIoptions\fR]
- .PP
- .RE
- Masking:
- .RS
--\fBvsearch\fR --maskfasta \fIfastafile\fR --output \fIoutputfile\fR
-+\fBvsearch\fR \-\-maskfasta \fIfastafile\fR \-\-output \fIoutputfile\fR
- [\fIoptions\fR]
- .PP
- .RE
- Pairwise alignment:
- .RS
--\fBvsearch\fR --allpairs_global \fIfastafile\fR (--alnout |
----blast6out | --matched | --notmatched | --uc | --userout)
--\fIoutputfile\fR (--acceptall | --id \fIreal\fR) [\fIoptions\fR]
-+\fBvsearch\fR \-\-allpairs_global \fIfastafile\fR (\-\-alnout |
-+\-\-blast6out | \-\-matched | \-\-notmatched | \-\-uc | \-\-userout)
-+\fIoutputfile\fR (\-\-acceptall | \-\-id \fIreal\fR) [\fIoptions\fR]
- .PP
- .RE
- Searching:
- .RS
--\fBvsearch\fR --usearch_global \fIfastafile\fR --db \fIfastafile\fR
--(--alnout | --blast6out | --uc | --userout) \fIoutputfile\fR --id
-+\fBvsearch\fR \-\-usearch_global \fIfastafile\fR \-\-db \fIfastafile\fR
-+(\-\-alnout | \-\-blast6out | \-\-uc | \-\-userout) \fIoutputfile\fR \-\-id
- \fIreal\fR [\fIoptions\fR]
- .PP
- .RE
- Shuffling:
- .RS
--\fBvsearch\fR --shuffle \fIfastafile\fR --output \fIoutputfile\fR
-+\fBvsearch\fR \-\-shuffle \fIfastafile\fR \-\-output \fIoutputfile\fR
- [\fIoptions\fR]
- .PP
- .RE
- Sorting:
- .RS
--\fBvsearch\fR (--sortbylength | --sortbysize) \fIfastafile\fR --output
-+\fBvsearch\fR (\-\-sortbylength | \-\-sortbysize) \fIfastafile\fR \-\-output
- \fIoutputfile\fR [\fIoptions\fR]
- .PP
- .RE
-@@ -107,10 +107,10 @@ present. All other ascii or non-ascii ch
- complained about in a non-blocking warning message.
- .PP
- \fBvsearch\fR operations are case insensitive, except when soft masking is
--activated. For --usearch_global (searching), --cluster_fast and
----cluster_smallmem (clustering), and --maskfasta (masking) commands,
-+activated. For \-\-usearch_global (searching), \-\-cluster_fast and
-+\-\-cluster_smallmem (clustering), and \-\-maskfasta (masking) commands,
- the case is important if soft masking is used. Soft masking is
--specified with the options "--dbmask soft" (for searching) or "--qmask
-+specified with the options "\-\-dbmask soft" (for searching) or "\-\-qmask
- soft" (for searching, clustering and masking). When using soft
- masking, lower case letters indicate masked symbols, while upper case
- letters indicate regular symbols. Masked symbols are never included in
-@@ -121,7 +121,7 @@ in result files.
- When comparing sequences during chimera detection, dereplication,
- searching and clustering, T and U are considered identical, regardless
- of their case. If two symbols are non-identical, their alignment will
--result in the negative mismatch score (default -4), except if one or
-+result in the negative mismatch score (default \-4), except if one or
- both of the symbols are ambiguous (RYSWKMDBHVN) in which case the
- score is zero. Alignment of two identical ambiguous symbols (e.g. R vs
- R) also receives a score of zero.
-@@ -138,27 +138,27 @@ searching). We start with general option
+@@ -137,27 +137,27 @@ searching). We start with general option
  General options:
  .RS
  .TP 9
@@ -136,7 +38,7 @@
  Do not truncate sequence labels at first space, use the full header in
  output files.
  .RE
-@@ -168,7 +168,7 @@ Chimera detection options:
+@@ -167,7 +167,7 @@ Chimera detection options:
  .PP
  .RS
  Chimera detection is based on a scoring function controlled by five
@@ -145,9 +47,9 @@
  sorted by decreasing abundance (if available), and compared on their
  \fIplus\fR strand only (case insensitive).
  .PP
-@@ -176,12 +176,12 @@ In \fIde novo\fR mode, input fasta file
+@@ -175,12 +175,12 @@ In \fIde novo\fR mode, input fasta file
  annotations (pattern [;]size=\fIinteger\fR[;] in the fasta
- header). The input order influences the chimera detection, we
+ header). The input order influences the chimera detection, so we
  recommend to sort sequences by decreasing abundance (default of
 ---derep_fulllength command). If your sequence set needs to be sorted,
 -please see the --sortbysize command in the sorting section.
@@ -160,106 +62,8 @@
 +.BI \-\-abskew \0real
 +When using \-\-uchime_denovo, the abundance skew is used to distinguish
  in a 3-way alignment which sequence is the chimera and which are the
- parents. The assumption is that chimeras appeared later in the PCR
+ parents. The assumption is that chimeras appear later in the PCR
  amplification process and are therefore less abundant than their
-@@ -189,75 +189,75 @@ parents. The default value is 2.0, which
- be at least 2 times more abundant than their chimera. Any positive
- value greater than 1.0 can be used.
- .TP
--.BI --alignwidth\~ "positive integer"
--Width of 3-way alignments in --uchimealns output. The default value is
-+.BI \-\-alignwidth\~ "positive integer"
-+Width of 3-way alignments in \-\-uchimealns output. The default value is
- 80. Set to 0 to eliminate wrapping.
- .TP
--.BI --chimeras \0filename
-+.BI \-\-chimeras \0filename
- Output chimeric sequences to \fIfilename\fR, in fasta format. Output
- order may vary when using multiple threads.
- .TP
--.BI --db \0filename
--When using --uchime_ref, detect chimeras using the fasta-formatted
-+.BI \-\-db \0filename
-+When using \-\-uchime_ref, detect chimeras using the fasta-formatted
- reference sequences contained in \fIfilename\fR. Reference sequences
- are assumed to be chimera-free. Chimeras will not be detected if their
- parents (or sufficiently close relatives) are not present in the
- database.
- .TP
--.BI --dn \0real
-+.BI \-\-dn \0real
- No vote pseudo-count (parameter \fIn\fR in the chimera scoring
- function) (1.4).
- .TP
--.BI --mindiffs\~ "positive integer"
-+.BI \-\-mindiffs\~ "positive integer"
- Minimum number of differences per segment (3).
- .TP
--.BI --mindiv \0real
-+.BI \-\-mindiv \0real
- Minimum divergence from closest parent (0.8).
- .TP
--.BI --minh \0real
-+.BI \-\-minh \0real
- Minimum score (h). Increasing this value tends to reduce the number of
- false positives and to decrease sensitivity. Default value is
- 0.28. (value ranging from 0.0 to 1.0 included).
- .TP
--.BI --nonchimeras \0filename
-+.BI \-\-nonchimeras \0filename
- Output non-chimeric sequences to \fIfilename\fR, in fasta
- format. Output order may vary when using multiple threads.
- .TP
--.B --self
--When using --uchime_ref, ignore a reference sequence when its label
-+.B \-\-self
-+When using \-\-uchime_ref, ignore a reference sequence when its label
- matches the label of the query sequence (useful to estimate
- false-positive rate in reference sequences).
- .TP
--.B --selfid
--When using --uchime_ref, ignore a reference sequence when its
-+.B \-\-selfid
-+When using \-\-uchime_ref, ignore a reference sequence when its
- nucleotide sequence is strictly identical with the query sequence.
- .TP
--.BI --threads\~ "positive integer"
-+.BI \-\-threads\~ "positive integer"
- Number of computation threads to use (1 to 256) with uchime_ref.
- The number of threads
- should be lesser or equal to the number of available CPU cores. The
- default is to launch one thread per available logical core.
- .TP
--.BI --uchime_denovo \0filename
-+.BI \-\-uchime_denovo \0filename
- Detect chimeras present in the fasta-formatted \fIfilename\fR, without
- external references (i.e. \fIde novo\fR). Automatically sort the
- sequences in \fIfilename\fR by decreasing abundance
- beforehand. Multithreading is not supported.
- .TP
--.BI --uchime_ref \0filename
-+.BI \-\-uchime_ref \0filename
- Detect chimeras present in the fasta-formatted \fIfilename\fR by
--comparing them with reference sequences (option --db). Multithreading
-+comparing them with reference sequences (option \-\-db). Multithreading
- is supported.
- .TP
--.BI --uchimealns \0filename
-+.BI \-\-uchimealns \0filename
- Write 3-way global alignments (parentA, parentB, chimera) to
--\fIfilename\fR using a human-readable format. Use --alignwidth to modify
-+\fIfilename\fR using a human-readable format. Use \-\-alignwidth to modify
- alignment length. Output order may vary when using multiple threads.
- .TP
--.BI --uchimeout \0filename
-+.BI \-\-uchimeout \0filename
- Write chimera detection results to \fIfilename\fR using the uchime
- tab-separated format of 18 fields (see the list below). Use
----uchimeout5 to use a format compatible with usearch v5 and earlier
-+\-\-uchimeout5 to use a format compatible with usearch v5 and earlier
- versions. Rows output order may vary when using multiple threads.
- .RS
- .RS
 @@ -272,7 +272,7 @@ A: parent A sequence label.
  B: parent B sequence label.
  .IP \n+[step].
@@ -269,238 +73,8 @@
  .IP \n+[step].
  idQM: percentage of similarity of query (Q) and model (M)
  constructed as a part of parent A and a part of parent B.
-@@ -304,12 +304,12 @@ YN: query is chimeric (Y), or not (N), o
- .RE
- .RE
- .TP
--.B --uchimeout5
--When using --uchimeout, write chimera detection results using a
--tab-separated format of 17 fields (drop the 5th field of --uchimeout),
-+.B \-\-uchimeout5
-+When using \-\-uchimeout, write chimera detection results using a
-+tab-separated format of 17 fields (drop the 5th field of \-\-uchimeout),
- compatible with usearch version 5 and earlier versions.
- .TP
--.BI --xn \0real
-+.BI \-\-xn \0real
- No vote weight (parameter beta) (8.0).
- .RE
+@@ -502,9 +502,9 @@ Masking options:
  .PP
-@@ -320,53 +320,53 @@ Clustering options:
- \fBvsearch\fR implements a single-pass, greedy star-clustering
- algorithm, similar to the algorithms implemented in usearch, DNAclust
- and sumaclust. Important parameters are the global clustering
--threshold (--id) and the pairwise identity definition (--iddef).
-+threshold (\-\-id) and the pairwise identity definition (\-\-iddef).
- .TP 9
--.BI --centroids \0filename
-+.BI \-\-centroids \0filename
- Output cluster centroid sequences to \fIfilename\fR file, in fasta
- format. The centroid is the sequence that seeded the cluster (i.e. the
- first sequence of the cluster).
- .TP
--.BI --cluster_fast \0filename
-+.BI \-\-cluster_fast \0filename
- Clusterize the fasta sequences in \fIfilename\fR, automatically
- perform a sorting by decreasing sequence length beforehand.
- .TP
--.BI --cluster_size \0filename
-+.BI \-\-cluster_size \0filename
- Clusterize the fasta sequences in \fIfilename\fR, automatically
- perform a sorting by decreasing sequence abundance beforehand.
- .TP
--.BI --cluster_smallmem \0filename
-+.BI \-\-cluster_smallmem \0filename
- Clusterize the fasta sequences in \fIfilename\fR without automatically
- modifying their order beforehand. Sequence are expected to be sorted
--by decreasing sequence length, unless --usersort is used.
-+by decreasing sequence length, unless \-\-usersort is used.
- .TP
--.BI --clusters \0string
-+.BI \-\-clusters \0string
- Output each cluster to a separate fasta file using the prefix
- \fIstring\fR and a ticker (0, 1, 2, etc.) to construct the path and filenames.
- .TP
--.BI --consout \0filename
-+.BI \-\-consout \0filename
- Output cluster consensus sequences to \fIfilename\fR. For each
- cluster, a multiple alignment is computed, and a consensus sequence is
- constructed by taking the majority symbol (nucleotide or gap) from
- each column of the alignment. Columns containing a majority of gaps
--are skipped, except for terminal gaps. Use --construncate to take
-+are skipped, except for terminal gaps. Use \-\-construncate to take
- terminal gaps into account (not implemented yet).
- .\" .TP
--.\" .B --construncate
--.\" when using the --consout option to build consensus sequences, do not
-+.\" .B \-\-construncate
-+.\" when using the \-\-consout option to build consensus sequences, do not
- .\" ignore terminal gaps. That option skips terminal columns if they
- .\" contain a majority of gaps, yielding shorter consensus sequences than
--.\" when using --consout alone.
-+.\" when using \-\-consout alone.
- .TP
--.BI --id \0real
-+.BI \-\-id \0real
- Do not add the target to the cluster if the pairwise identity with the
- centroid is lower than \fIreal\fR (value ranging from 0.0 to 1.0
- included). The pairwise identity is defined as the number of (matching
- columns) / (alignment length - terminal gaps). That definition can be
--modified by --iddef.
-+modified by \-\-iddef.
- .TP
--.BI --iddef\~ "0|1|2|3|4"
--Change the pairwise identity definition used in --id. Values accepted
-+.BI \-\-iddef\~ "0|1|2|3|4"
-+Change the pairwise identity definition used in \-\-id. Values accepted
- are:
- .RS
- .RS
-@@ -381,68 +381,68 @@ edit distance excluding terminal gaps (d
- Marine Biological Lab definition counting each extended gap as a
- single difference.
- .IP \n+[step].
--BLAST definition, equivalent to --iddef 2 in a context of global
-+BLAST definition, equivalent to \-\-iddef 2 in a context of global
- pairwise alignment.
- .RE
- .RE
- .TP
--.BI --msaout \0filename
-+.BI \-\-msaout \0filename
- Output a multiple sequence alignment and a consensus sequence for each
- cluster to \fIfilename\fR, in fasta format. The consensus sequence is
- constructed by taking the majority symbol (nucleotide or gap) from
- each column of the alignment. Columns containing a majority of gaps
- are skipped, except for terminal gaps.
- .TP
--.BI --qmask\~ "none|dust|soft"
-+.BI \-\-qmask\~ "none|dust|soft"
- Mask simple repeats and low-complexity regions in sequences using the
- \fIdust\fR or the \fIsoft\fR algorithms, or do not mask
- (\fInone\fR). Warning, when using \fIsoft\fR masking, clustering
- becomes case sensitive. The default is to mask using \fIdust\fR.
- .TP
--.B --sizein
-+.B \-\-sizein
- Take into account the abundance annotations present in the input fasta
- file (search for the pattern "[>;]size=\fIinteger\fR[;]" in sequence
- headers).
- .TP
--.B --sizeout
-+.B \-\-sizeout
- Add abundance annotations to the output fasta files (add the pattern
--";size=\fIinteger\fR;" to sequence headers). If --sizein is specified,
-+";size=\fIinteger\fR;" to sequence headers). If \-\-sizein is specified,
- abundance annotations are reported to output files, and each cluster
- centroid receives a new abundance value corresponding to the total
--abundance of the amplicons included in the cluster (--centroids
--option). If --sizein is not specified, input abundances are set to 1
-+abundance of the amplicons included in the cluster (\-\-centroids
-+option). If \-\-sizein is not specified, input abundances are set to 1
- for amplicons, and to the number of amplicons per cluster for
- centroids.
- .TP
--.BI --strand\~ "plus|both"
-+.BI \-\-strand\~ "plus|both"
- When comparing sequences with the cluster seed, check the \fIplus\fR
- strand only (default) or check \fIboth\fR strands.
- .TP
--.BI --threads\~ "positive integer"
-+.BI \-\-threads\~ "positive integer"
- Number of computation threads to use (1 to 256). The number of threads
- should be less or equal to the number of available CPU cores. The
- default is to launch one thread per available logical core.
- .TP
--.BI --uc \0filename
-+.BI \-\-uc \0filename
- Output clustering results in \fIfilename\fR using a uclust-like
- format. See <http://www.drive5.com/usearch/manual/ucout.html> for a
- description of the format.
- .TP
--.B --usersort
--When using --cluster_smallmem, allow any sequence input order, not
-+.B \-\-usersort
-+When using \-\-cluster_smallmem, allow any sequence input order, not
- just a decreasing length ordering.
- .TP
- Most searching options also apply to clustering:
- .br
----alnout, --blast6out, --userout, --userfields, --fastapairs, --matched,
----notmatched, --maxaccept, --maxreject, score filtering, gap penalties, masking. (see the Searching section).
-+\-\-alnout, \-\-blast6out, \-\-userout, \-\-userfields, \-\-fastapairs, \-\-matched,
-+\-\-notmatched, \-\-maxaccept, \-\-maxreject, score filtering, gap penalties, masking. (see the Searching section).
- .RE
- .PP
- .\" ----------------------------------------------------------------------------
- Dereplication options:
- .RS
- .TP 9
--.BI --derep_fulllength \0filename
-+.BI \-\-derep_fulllength \0filename
- Merge strictly identical sequences contained in
- \fIfilename\fR. Identical sequences are defined as having the same
- length and the same string of nucleotides (case insensitive, T and U
-@@ -450,46 +450,46 @@ are considered the same). As \fBvsearch\
- \fIfilename\fR twice, \fIfilename\fR must be a real file, not a
- stream.
- .TP
--.BI --maxuniquesize\~ "positive integer"
-+.BI \-\-maxuniquesize\~ "positive integer"
- Discard sequences with an abundance value greater than \fIinteger\fR.
- .TP
- .BI --minuniquesize\~ "positive integer"
- Discard sequences with an abundance value smaller than \fIinteger\fR.
- .TP
--.BI --output \0filename
-+.BI \-\-output \0filename
- Write the dereplicated sequences to \fIfilename\fR, in fasta format
- and sorted by decreasing abundance. Identical sequences receive the
--header of the first sequence of their group. If --sizeout is used, the
-+header of the first sequence of their group. If \-\-sizeout is used, the
- number of occurrences (i.e. abundance) of each sequence is indicated
- at the end of their fasta header using the pattern
- ";size=\fIinteger\fR;".
- .TP
--.B --sizein
-+.B \-\-sizein
- Take into account the abundance annotations present in the input fasta
- file (search for the pattern "[>;]size=\fIinteger\fR[;]" in sequence
- headers).
- .TP
--.B --sizeout
-+.B \-\-sizeout
- Add abundance annotations to the output fasta file (add the pattern
--";size=\fIinteger\fR;" to sequence headers).  If --sizein is specified,
-+";size=\fIinteger\fR;" to sequence headers).  If \-\-sizein is specified,
- each unique sequence receives a new abundance value corresponding to
- its total abundance (sum of the abundances of its occurrences). If
----sizein is not specified, input abundances are set to 1, and each
-+\-\-sizein is not specified, input abundances are set to 1, and each
- unique sequence receives a new abundance value corresponding to its
- number of occurrences in the input file.
- .TP
--.BI --strand\~ "plus|both"
-+.BI \-\-strand\~ "plus|both"
- When searching for strictly identical sequences, check the \fIplus\fR
- strand only (default) or check \fIboth\fR strands.
- .TP
--.BI --topn\~ "positive integer"
-+.BI \-\-topn\~ "positive integer"
- Output only the top \fIinteger\fR sequences (i.e. the most abundant).
- .TP
--.BI --uc \0filename
-+.BI \-\-uc \0filename
- Output dereplication results in \fIfilename\fR using a uclust-like
- format. See <http://www.drive5.com/usearch/manual/ucout.html> for a
- description of the format. In the context of dereplication, the option
----uc_allhits has no effect.
-+\-\-uc_allhits has no effect.
- .RE
- .PP
- .\" ----------------------------------------------------------------------------
-@@ -498,9 +498,9 @@ Masking options:
- .PP
  An input sequence can be composed of lower- or uppercase
  nucleotides. Lowercase nucleotides are silently set to uppercase
 -before masking, unless the --qmask soft option is used. Here are the
@@ -512,7 +86,7 @@
  lower and uppercase nucleotides:
  .PP
  .TS
-@@ -518,24 +518,24 @@ soft:on:lowercase symbols masked and cha
+@@ -522,24 +522,24 @@ soft:on:lowercase symbols masked and cha
  .TE
  .PP
  .TP 9
@@ -542,44 +116,8 @@
 +.BI \-\-threads\~ "positive integer"
  Number of computation threads to use (1 to 256). The number of threads
  should be lesser or equal to the number of available CPU cores. The
- default is to launch one thread per available logical core.
-@@ -545,26 +545,26 @@ default is to launch one thread per avai
- Pairwise alignment options:
- .RS
- .TP 9
--.BI --allpairs_global \0filename
-+.BI \-\-allpairs_global \0filename
- Perform optimal global pairwise alignments of all vs. all fasta
- sequences contained in \fIfilename\fR. The results of the n * (n-1) /
--2 alignments are written to the result files specified with --alnout,
----blast6out, --fastapairs --matched, --notmatched, --uc or --userout
--(see Searching section below). Specify either the --acceptall option
-+2 alignments are written to the result files specified with \-\-alnout,
-+\-\-blast6out, \-\-fastapairs \-\-matched, \-\-notmatched, \-\-uc or \-\-userout
-+(see Searching section below). Specify either the \-\-acceptall option
- to output all pairwise alignments, or specify an identity level with
----id to discard weak alignments. Most other accept/reject options (see
-+\-\-id to discard weak alignments. Most other accept/reject options (see
- Searching options below) may also be used. Sequences are aligned on
- their \fIplus\fR strand only. This command is multi-threaded.
- .TP
--.B --acceptall
-+.B \-\-acceptall
- Write the results of all alignments to output files. This option
--overrides all other accept/reject options (e.g. --id).
-+overrides all other accept/reject options (e.g. \-\-id).
- .TP
--.BI --id \0real
-+.BI \-\-id \0real
- Reject the sequence match if the pairwise identity is lower than
- \fIreal\fR (value ranging from 0.0 to 1.0 included).
- .TP
--.BI --threads\~ "positive integer"
-+.BI \-\-threads\~ "positive integer"
- Number of computation threads to use (1 to 256). The number of threads
- should be lesser or equal to the number of available CPU cores. The
- default is to launch one thread per available logical core.
-@@ -574,17 +574,17 @@ default is to launch one thread per avai
+ default is to use all available ressources and to launch one thread
+@@ -582,17 +582,17 @@ per logical core.
  Searching options:
  .RS
  .TP 9
@@ -602,76 +140,10 @@
  query+target+id+alnlen+mism+opens+qlo+qhi+tlo+thi+evalue+bits.
  A complete list and description is available in the section "Userfields"
  of this manual.
-@@ -628,52 +628,52 @@ query. Nucleotide numbering starts from
- there is no alignment.
- .IP \n+[step].
- \fIevalue\fR: expectancy-value (not computed for nucleotide
--alignments). Always set to -1.
-+alignments). Always set to \-1.
- .IP \n+[step].
- \fIbits\fR: bit score (not computed for nucleotide
- alignments). Always set to 0.
- .RE
- .RE
+@@ -714,12 +714,12 @@ default in usearch, all default scores a
+ \fBvsearch\fR have been doubled to maintain equivalent penalties and
+ to produce identical alignments.
  .TP
--.BI --db \0filename
--Compare query sequences (specified with --usearch_global)
-+.BI \-\-db \0filename
-+Compare query sequences (specified with \-\-usearch_global)
- to the fasta-formatted target sequences contained in \fIfilename\fR,
- using global pairwise alignment.
- .TP
--.BI --dbmask\~ "none|dust|soft"
-+.BI \-\-dbmask\~ "none|dust|soft"
- Mask simple repeats and low-complexity regions in target database
- sequences using the \fIdust\fR or the \fIsoft\fR algorithms, or do not
- mask (\fInone\fR). Warning, when using \fIsoft\fR masking search
- commands become case sensitive. The default is to mask using
- \fIdust\fR.
- .TP
--.BI --dbmatched \0filename
-+.BI \-\-dbmatched \0filename
- Write database target sequences matching at least one query sequence
--to \fIfilename\fR, in fasta format. If the option --sizeout is used,
-+to \fIfilename\fR, in fasta format. If the option \-\-sizeout is used,
- the number of queries that matched each target sequence is indicated
- using the pattern ";size=\fIinteger\fR;".
- .TP
--.BI --dbnotmatched \0filename
-+.BI \-\-dbnotmatched \0filename
- Write database target sequences not matching query sequences to
- \fIfilename\fR, in fasta format.
- .TP
--.BI --fastapairs \0filename
-+.BI \-\-fastapairs \0filename
- Write pairwise alignments of query and target sequences to
- \fIfilename\fR, in fasta format.
- .TP
--.B --fulldp
-+.B \-\-fulldp
- Dummy option. To maximize search sensitivity, \fBvsearch\fR uses a
- 8-way 16-bit SIMD vectorized full dynamic programming algorithm
--(Needleman-Wunsch), whether or not --fulldp is specified.
-+(Needleman-Wunsch), whether or not \-\-fulldp is specified.
- .TP
--.BI --gapext \0string
--Set penalties for a gap extension. See --gapopen for a complete
-+.BI \-\-gapext \0string
-+Set penalties for a gap extension. See \-\-gapopen for a complete
- description of the penalty declaration system. The default is to
- initialize the six gap extending penalties using a penalty of 2 for
- extending internal gaps and a penalty of 1 for extending terminal
- gaps, in both query and target sequences (i.e. 2I/1E).
- .TP
--.BI --gapopen \0string
-+.BI \-\-gapopen \0string
- Set penalties for a gap opening. A gap opening can occur in six
- different contexts: in the query (Q) or in the target (T) sequence, at
- the left (L) or right (R) extremity of the sequence, or inside the
-@@ -704,12 +704,12 @@ gap penalties. Because the lowest gap pe
- in usearch, all default scores and gap penalties in \fBvsearch\fR
- have been doubled in order to obtain similar alignments.
- .TP
 -.B --hardmask
 +.B \-\-hardmask
  Mask low-complexity regions by replacing them with Ns instead of
@@ -683,7 +155,7 @@
  Reject the sequence match if the pairwise identity is lower than
  \fIreal\fR (value ranging from 0.0 to 1.0 included). The search
  process sorts target sequences by decreasing number of \fIk\fR-mers
-@@ -719,13 +719,13 @@ also prevent pairwise alignments with we
+@@ -729,13 +729,13 @@ also prevent pairwise alignments with we
  there needs to be at least 6 shared \fIk\fR-mers to start the pairwise
  alignment, and at least one out of every 16 \fIk\fR-mers from the
  query needs to match the target. Consequently, using values lower than
@@ -701,280 +173,7 @@
  are:
  .RS
  .RS
-@@ -735,40 +735,40 @@ CD-HIT definition using shortest sequenc
- .IP \n+[step].
- edit distance.
- .IP \n+[step].
--edit distance excluding terminal gaps (default value of --id).
-+edit distance excluding terminal gaps (default value of \-\-id).
- .IP \n+[step].
- Marine Biological Lab definition counting each extended gap as a
- single difference.
- .IP \n+[step].
--BLAST definition, equivalent to --iddef 2 in a context of global
-+BLAST definition, equivalent to \-\-iddef 2 in a context of global
- pairwise alignment.
- .RE
- .RE
- .PP
--The option --userfields accepts the fields id0 to id4, in addition to
-+The option \-\-userfields accepts the fields id0 to id4, in addition to
- the field id, to report the pairwise identity values corresponding to
- the different definitions.
- .TP
--.BI --idprefix\~ "positive integer"
-+.BI \-\-idprefix\~ "positive integer"
- Reject the target sequence if the first \fIinteger\fR nucleotides do
- not match the query sequence.
- .TP
--.BI --idsuffix\~ "positive integer"
-+.BI \-\-idsuffix\~ "positive integer"
- Reject the target sequence if the last \fIinteger\fR nucleotides do
- not match the query sequence.
- .TP
--.B --leftjust
-+.B \-\-leftjust
- Reject the target sequence if the alignment begins with gaps.
- .TP
--.BI --match\~ "integer"
-+.BI \-\-match\~ "integer"
- Score assigned to a match (i.e. identical nucleotides) in the pairwise
- alignment. The default value is 2.
- .TP
--.BI --matched \0filename
-+.BI \-\-matched \0filename
- Write query sequences matching database target sequences to
- \fIfilename\fR, in fasta format.
- .TP
--.BI --maxaccepts\~ "positive integer"
-+.BI \-\-maxaccepts\~ "positive integer"
- Maximum number of hits to accept before stopping the search. The
- default value is 1. This option works in pair with maxrejects. The
- search process sorts target sequences by decreasing number of
-@@ -779,31 +779,31 @@ and the search process stops for that qu
- higher value, more hits are accepted. If maxaccepts and maxrejects are
- both set to 0, the complete database is searched.
- .TP
--.BI --maxdiffs\~ "positive integer"
-+.BI \-\-maxdiffs\~ "positive integer"
- Reject the target sequence if the alignment contains at least
- \fIinteger\fR substitutions, insertions or deletions.
- .TP
--.BI --maxgaps\~ "positive integer"
-+.BI \-\-maxgaps\~ "positive integer"
- Reject the target sequence if the alignment contains at least
- \fIinteger\fR insertions or deletions.
- .TP
--.BI --maxhits\~ "positive integer"
-+.BI \-\-maxhits\~ "positive integer"
- Maximum number of hits to show once the search is terminated (hits are
- sorted by decreasing identity). Unlimited by default value. \fBIt
- applies to alnout, blast6out, uc, userout, fastapairs\fR.
- .TP
--.BI --maxid \0real
-+.BI \-\-maxid \0real
- Reject the target sequence if its percentage of identity with the
- query is greater than \fIreal\fR.
- .TP
--.BI --maxqsize\~ "positive integer"
-+.BI \-\-maxqsize\~ "positive integer"
- Reject query sequences with an abundance greater than
- \fIinteger\fR.
- .TP
--.BI --maxqt \0real
-+.BI \-\-maxqt \0real
- Reject if the query/target sequence length ratio is greater than \fIreal\fR.
- .TP
--.BI --maxrejects\~ "positive integer"
-+.BI \-\-maxrejects\~ "positive integer"
- Maximum number of non-matching target sequences to consider before
- stopping the search. The default value is 32. This option works in
- pair with maxaccepts. The search process sorts target sequences by
-@@ -815,138 +815,138 @@ hit). If maxrejects is set to a higher v
- are considered. If maxaccepts and maxrejects are both set to 0, the
- complete database is searched.
- .TP
--.BI --maxsizeratio \0real
-+.BI \-\-maxsizeratio \0real
- Reject if the query/target abundance ratio is greater than
- \fIreal\fR.
- .TP
--.BI --maxsl \0real
-+.BI \-\-maxsl \0real
- Reject if the shorter/longer sequence length ratio is
- greater than \fIreal\fR.
- .TP
--.BI --maxsubs\~ "positive integer"
-+.BI \-\-maxsubs\~ "positive integer"
- Reject the target sequence if the alignment contains more than
- \fIinteger\fR substitutions.
- .TP
--.BI --mid \0real
-+.BI \-\-mid \0real
- Reject the alignment if the percentage of identity is lower than
- \fIreal\fR (ignoring all gaps, internal and terminal).
- .TP
--.BI --mincols\~ "positive integer"
-+.BI \-\-mincols\~ "positive integer"
- Reject the target sequence if the alignment length is shorter than
- \fIinteger\fR.
- .TP
--.BI --minqt \0real
-+.BI \-\-minqt \0real
- Reject if the query/target sequence length ratio is lower than
- \fIreal\fR.
- .TP
--.BI --minsizeratio \0real
-+.BI \-\-minsizeratio \0real
- Reject if the query/target abundance ratio is lower than \fIreal\fR.
- .TP
--.BI --minsl \0real
-+.BI \-\-minsl \0real
- Reject if the shorter/longer sequence length ratio is lower than
- \fIreal\fR.
- .TP
--.BI --mintsize\~ "positive integer"
-+.BI \-\-mintsize\~ "positive integer"
- Reject target sequences with an abundance lower than \fIinteger\fR.
- .TP
--.BI --mismatch\~ "integer"
-+.BI \-\-mismatch\~ "integer"
- Score assigned to a mismatch (i.e. different nucleotides) in the
--pairwise alignment. The default value is -4.
-+pairwise alignment. The default value is \-4.
- .TP
--.BI --notmatched \0filename
-+.BI \-\-notmatched \0filename
- Write query sequences not matching database target sequences to
- \fIfilename\fR, in fasta format.
- .TP
--.B --output_no_hits
--Write both matching and non-matching queries to --alnout, --blast6out,
--and --userout output files (--uc and --uc_allhits output files always
-+.B \-\-output_no_hits
-+Write both matching and non-matching queries to \-\-alnout, \-\-blast6out,
-+and \-\-userout output files (\-\-uc and \-\-uc_allhits output files always
- feature non-matching queries). Non-matching queries are labelled "No
--hits" in --alnout files.
-+hits" in \-\-alnout files.
- .TP
--.BI --qmask\~ "none|dust|soft"
-+.BI \-\-qmask\~ "none|dust|soft"
- Mask simple repeats and low-complexity regions in query sequences
- using the \fIdust\fR or the \fIsoft\fR algorithms, or do not mask
- (\fInone\fR). Warning, when using \fIsoft\fR masking search commands
- become case sensitive. The default is to mask using \fIdust\fR.
- .TP
--.BI --query_cov \0real
-+.BI \-\-query_cov \0real
- Reject if the fraction of the query aligned to the target sequence is
- lower than \fIreal\fR. The query coverage is computed as
- (matches + mismatches) / query sequence length. Internal or terminal
- gaps are not taken into account.
- .TP
--.B --rightjust
-+.B \-\-rightjust
- Reject the target sequence if the alignment ends with gaps.
- .TP
--.BI --rowlen\~ "positive integer"
--Width of alignment lines in --alnout output. The default value is
-+.BI \-\-rowlen\~ "positive integer"
-+Width of alignment lines in \-\-alnout output. The default value is
- 64. Set to 0 to eliminate wrapping.
- .TP
--.B --self
-+.B \-\-self
- Reject the alignment if the query and target labels are identical.
- .TP
--.B --selfid
-+.B \-\-selfid
- Reject the alignment if the query and target sequences are strictly
- identical.
- .TP
--.B --sizeout
--Add abundance annotations to the output of the option --dbmatched
-+.B \-\-sizeout
-+Add abundance annotations to the output of the option \-\-dbmatched
- (using the pattern ";size=\fIinteger\fR;").
- .TP
--.BI --strand\~ "plus|both"
-+.BI \-\-strand\~ "plus|both"
- When searching for similar sequences, check the \fIplus\fR strand only
- (default) or check \fIboth\fR strands.
- .TP
--.BI --target_cov \0real
-+.BI \-\-target_cov \0real
- Reject if the fraction of the target sequence aligned to the query
- sequence is lower than \fIreal\fR. The target coverage is computed as
- (matches + mismatches) / target sequence length.
- Internal or terminal gaps are not taken into account.
- .TP
--.BI --threads\~ "positive integer"
-+.BI \-\-threads\~ "positive integer"
- Number of computation threads to use (1 to 256). The number of threads
- should be lesser or equal to the number of available CPU cores. The
- default is to launch one thread per available logical core.
- .TP
--.B --top_hits_only
-+.B \-\-top_hits_only
- Output only the hits with the highest percentage of identity with the
- query.
- .TP
--.BI --uc \0filename
-+.BI \-\-uc \0filename
- Output searching results in \fIfilename\fR using a uclust-like
- format. See <http://www.drive5.com/usearch/manual/ucout.html> for a
- description of the format. Output order may vary when using multiple
- threads.
- .TP
--.B --uc_allhits
--When using the --uc option, show all hits, not just the top hit for
-+.B \-\-uc_allhits
-+When using the \-\-uc option, show all hits, not just the top hit for
- each query.
- .TP
--.BI --usearch_global \0filename
--Compare target sequences (--db) to the fasta-formatted query sequences
-+.BI \-\-usearch_global \0filename
-+Compare target sequences (\-\-db) to the fasta-formatted query sequences
- contained in \fIfilename\fR, using global pairwise alignment.
- .TP
--.BI --userfields \0string
--When using --userout, select and order the fields written to the
-+.BI \-\-userfields \0string
-+When using \-\-userout, select and order the fields written to the
- output file. Fields are separated by "+" (e.g. query+target+id). See
- the "Userfields" section for a complete list of fields.
- .TP
--.BI --userout \0filename
-+.BI \-\-userout \0filename
- Write user-defined tab-separated output to \fIfilename\fR. Select the
--fields with the option --userfields. Output order may vary when using
--multiple threads. If --userfields is empty or not present,
-+fields with the option \-\-userfields. Output order may vary when using
-+multiple threads. If \-\-userfields is empty or not present,
- \fIfilename\fR is empty.
- .TP
--.BI --weak_id \0real
-+.BI \-\-weak_id \0real
- Show hits with percentage of identity of at least \fIreal\fR, without
- terminating the search. A normal search stops as soon as enough hits
--are found (as defined by --maxaccepts, --maxrejects, and --id). As
----weak_id reports weak hits that are not deduced from --maxaccepts,
--high --id values can be used, hence preserving both speed and
-+are found (as defined by \-\-maxaccepts, \-\-maxrejects, and \-\-id). As
-+\-\-weak_id reports weak hits that are not deduced from \-\-maxaccepts,
-+high \-\-id values can be used, hence preserving both speed and
- sensitivity. Logically, \fIreal\fR must be smaller than the value
--indicated by --id.
-+indicated by \-\-id.
- .TP
--.BI --wordlength\~ "positive integer"
-+.BI \-\-wordlength\~ "positive integer"
- Length of words (i.e. \fIk\fR-mers) for database indexing. The range
- of possible values goes from 3 to 15, but values near 8 are generally
- recommended. Longer words may reduce the sensitivity for weak
-@@ -963,75 +963,75 @@ more). The default value is 8.
+@@ -984,75 +984,75 @@ more). The default value is 8.
  Shuffling options:
  .RS
  .TP 9
@@ -1077,7 +276,7 @@
  .RS
  .TP 9
  .B aln
-@@ -1052,7 +1052,7 @@ format (Compact Idiosyncratic Gapped Ali
+@@ -1073,7 +1073,7 @@ format (Compact Idiosyncratic Gapped Ali
  (deletion) and I (insertion). Empty field if there is no alignment.
  .TP
  .B evalue
@@ -1086,7 +285,7 @@
  .TP
  .B exts
  Number of columns containing a gap extension (zero or positive integer
-@@ -1088,7 +1088,7 @@ single difference.
+@@ -1109,7 +1109,7 @@ single difference.
  .TP
  .B id4
  BLAST definition of the percentage of identity (real value ranging
@@ -1095,7 +294,7 @@
  pairwise alignment.
  .TP
  .B ids
-@@ -1129,7 +1129,7 @@ Internal or terminal gaps are not taken
+@@ -1150,7 +1150,7 @@ Internal or terminal gaps are not taken
  field is set to 0.0 if there is no alignment.
  .TP
  .B qframe
@@ -1104,7 +303,7 @@
  is not computed by \fBvsearch\fR. Always set to +0.
  .TP
  .B qhi
-@@ -1189,7 +1189,7 @@ Internal or terminal gaps are not taken
+@@ -1209,7 +1209,7 @@ Internal or terminal gaps are not taken
  The field is set to 0.0 if there is no alignment.
  .TP
  .B tframe
@@ -1113,7 +312,7 @@
  is not computed by \fBvsearch\fR. Always set to +0.
  .TP
  .B thi
-@@ -1240,31 +1240,31 @@ quirks and inconsistencies. We decided n
+@@ -1259,31 +1259,31 @@ quirks and inconsistencies. We decided n
  and for complete transparency, to document here the deliberate changes
  we made.
  .PP
@@ -1152,9 +351,9 @@
 +\fBvsearch\fR extends the \-\-sizein option to dereplication
 +(\-\-derep_fulllength) and clustering (\-\-cluster_fast).
  .PP
- \fBvsearch\fR treats T and U as identical nucleotides for
+ \fBvsearch\fR treats T and U as identical nucleotides during
  dereplication.
-@@ -1296,8 +1296,8 @@ Cluster with a 97% similarity threshold,
+@@ -1333,8 +1333,8 @@ Cluster with a 97% similarity threshold,
  and write cluster descriptions using a uclust-like format:
  .PP
  .RS
@@ -1165,7 +364,7 @@
  .RE
  .PP
  Dereplicate the sequences contained in queries.fas, take into account
-@@ -1306,9 +1306,9 @@ to output with the new abundance informa
+@@ -1343,9 +1343,9 @@ to output with the new abundance informa
  with an abundance of 1:
  .PP
  .RS
@@ -1178,41 +377,7 @@
  .RE
  .PP
  Mask simple repeats and low complexity regions in the input fasta file
-@@ -1316,26 +1316,26 @@ Mask simple repeats and low complexity r
- file:
- .PP
- .RS
--\fBvsearch\fR --maskfasta \fIqueries.fas\fR --output
--\fIqueries_masked.fas\fR --qmask dust
-+\fBvsearch\fR \-\-maskfasta \fIqueries.fas\fR \-\-output
-+\fIqueries_masked.fas\fR \-\-qmask dust
- .RE
- .PP
- Sort by decreasing abundance the sequences contained in queries.fas
- (using the "size=\fIinteger\fR" information), relabel the sequences
--while preserving the abundance information (with --sizeout), keep only
-+while preserving the abundance information (with \-\-sizeout), keep only
- sequences with an abundance equal to or greater than 2:
- .PP
- .RS
--\fBvsearch\fR --sortbysize \fIqueries.fas\fR --output
--\fIqueries_sorted.fas\fR --relabel sampleA_ --sizeout --minsize 2
-+\fBvsearch\fR \-\-sortbysize \fIqueries.fas\fR \-\-output
-+\fIqueries_sorted.fas\fR \-\-relabel sampleA_ \-\-sizeout \-\-minsize 2
- .RE
- .PP
- Align all sequences in a database with each other and output all pairwise
- alignments:
- .PP
- .RS
--\fBvsearch\fR --allpairs_global \fIdatabase.fas\fR
----alnout \fIresults.aln\fR --acceptall
-+\fBvsearch\fR \-\-allpairs_global \fIdatabase.fas\fR
-+\-\-alnout \fIresults.aln\fR \-\-acceptall
- .RE
- .PP
- Search queries in a reference database, with a 80%-similarity
-@@ -1343,8 +1343,8 @@ threshold, take terminal gaps into accou
+@@ -1362,8 +1362,8 @@ threshold, take terminal gaps into accou
  similarities:
  .PP
  .RS
@@ -1223,7 +388,7 @@
  .RE
  .PP
  Search a sequence dataset against itself (ignore self hits), get all
-@@ -1352,9 +1352,9 @@ matches with at least 60% identity, and
+@@ -1371,9 +1371,9 @@ matches with at least 60% identity, and
  blast-like tab-separated format:
  .PP
  .RS
@@ -1236,7 +401,7 @@
  .RE
  .PP
  Shuffle the input fasta file (change the order of sequences) in a
-@@ -1362,8 +1362,8 @@ repeatable fashion (fixed seed), and wri
+@@ -1381,8 +1381,8 @@ repeatable fashion (fixed seed), and wri
  to the output file:
  .PP
  .RS
@@ -1246,8 +411,8 @@
 +\fIqueries_shuffled.fas\fR \-\-seed 13 \-\-fasta_width 0
  .RE
  .PP
- .\" 
-@@ -1440,17 +1440,17 @@ Bug fixes (ssse3/sse41 requirement, memo
+ Sort by decreasing abundance the sequences contained in queries.fas
+@@ -1469,17 +1469,17 @@ Bug fixes (ssse3/sse4.1 requirement, mem
  Bug fix (now writes help to stdout instead of stderr).
  .TP
  .BR v1.0.4\~ "released December 8th, 2014"
@@ -1269,7 +434,7 @@
  .TP
  .BR v1.0.8\~ "released January 22nd, 2015"
  Introduces several changes and bug fixes:
-@@ -1459,7 +1459,7 @@ Introduces several changes and bug fixes
+@@ -1488,7 +1488,7 @@ Introduces several changes and bug fixes
  a new linear memory aligner for alignment of sequences longer than
  5,000 nucleotides,
  .IP -
@@ -1278,23 +443,7 @@
  abundance before clustering,
  .IP -
  meaning of userfields qlo, qhi, tlo, thi changed for compatibility
-@@ -1468,12 +1468,12 @@ with usearch,
- new userfields qilo, qihi, tilo, tihi gives coordinates ignoring
- terminal gaps,
- .IP -
--in --uc output files, a perfect alignment is indicated with a "=" sign,
-+in \-\-uc output files, a perfect alignment is indicated with a "=" sign,
- .IP -
----cluster_fast will now sort sequences by decreasing length, then by
-+\-\-cluster_fast will now sort sequences by decreasing length, then by
-   decreasing abundance and finally by sequence identifier,
- .IP -
--default --maxseqlength value set to 50,000 nucleotides,
-+default \-\-maxseqlength value set to 50,000 nucleotides,
- .IP -
- fix for bug in alignment in rare cases,
- .IP -
-@@ -1481,7 +1481,7 @@ fix for lack of detection of under- or o
+@@ -1511,7 +1511,7 @@ fix for lack of detection of under- or o
  .RE
  .TP
  .BR v1.0.9\~ "released January 22nd, 2015"



