[med-svn] [pbgenomicconsensus] 02/08: Redo and reorganize manpages

Afif Elghraoui afif at moszumanska.debian.org
Fri Jul 8 16:10:57 UTC 2016


This is an automated email from the git hooks/post-receive script.

afif pushed a commit to branch master
in repository pbgenomicconsensus.

commit 5e744427abd554ff0ebf50310f91835a6212d83f
Author: Afif Elghraoui <afif at debian.org>
Date:   Fri Jul 8 08:17:43 2016 -0700

    Redo and reorganize manpages
    
    * Drop rst patches-- copy doc sources and maintain them independently
    * Add helper rules to d/rules
---
 debian/control                     |   2 +-
 debian/man/arrow.1                 |  22 ++
 debian/man/genomicconsensus.7.md   | 396 ++++++++++++++++++++++++++++++++++
 debian/man/pbgff.5.md              | 164 ++++++++++++++
 debian/man/plurality.1             |   7 +-
 debian/man/poa.1                   |  22 ++
 debian/man/quiver.1                |  22 ++
 debian/man/variantCaller.1.md      | 254 ++++++++++++++++++++++
 debian/patches/manpages.patch      | 432 -------------------------------------
 debian/patches/series              |   1 -
 debian/pbgenomicconsensus.manpages |   4 +-
 debian/rules                       |  38 +++-
 12 files changed, 920 insertions(+), 444 deletions(-)

diff --git a/debian/control b/debian/control
index 33f40c0..bcef98c 100644
--- a/debian/control
+++ b/debian/control
@@ -6,7 +6,7 @@ Uploaders: Afif Elghraoui <afif at debian.org>
 Build-Depends:
 	debhelper (>= 9),
 	dh-python,
-	python-docutils,
+	pandoc,
 	python-pkg-resources,
 	python-all,
 	python-setuptools,
diff --git a/debian/man/arrow.1 b/debian/man/arrow.1
new file mode 100644
index 0000000..9ef4244
--- /dev/null
+++ b/debian/man/arrow.1
@@ -0,0 +1,22 @@
+.TH ARROW 1
+
+.SH NAME
+
+arrow \- HMM-based consensus caller for Pacific Biosciences data
+
+.SH DESCRIPTION
+
+See
+.BR genomicconsensus (7)
+for a more detailed description of \f[B]arrow\f[P].
+
+.SH OPTIONS
+
+.TP
+.B \-h, \-\-help
+Show summary of options.
+
+.SH SEE ALSO
+.BR genomicconsensus (7)
+.BR variantCaller (1)
+.BR blasr (1)
diff --git a/debian/man/genomicconsensus.7.md b/debian/man/genomicconsensus.7.md
new file mode 100644
index 0000000..a544112
--- /dev/null
+++ b/debian/man/genomicconsensus.7.md
@@ -0,0 +1,396 @@
+% GENOMICCONSENSUS(7)
+%
+%
+
+
+What are EviCons? GenomicConsensus? Quiver? Plurality?
+======================================================
+
+**GenomicConsensus** is the current PacBio consensus and variant calling
+suite. It contains a main driver program, `variantCaller`, which
+provides two consensus/variant calling algorithms: **Arrow** and
+**Quiver**. These algorithms can be run by calling
+`variantCaller --algorithm=[arrow|quiver|plurality]` or by going
+through the convenience wrapper scripts `quiver` and `arrow`.
+
+**EviCons** was the previous generation PacBio variant caller (removed
+:   in software release v1.3.1).
+
+Separate packages called **ConsensusCore** and **ConsensusCore2** are
+C++ libraries where all the computation behind Quiver and Arrow are
+done, respectively. This is transparent to the user after installation.
+
+What is Plurality?
+==================
+
+**Plurality** is a very simple variant calling algorithm: it stacks up
+the aligned reads (alignment as produced by BLASR, or alternate mapping
+tool), and for each column under a reference base, calls the most
+abundant (i.e., the plurality) read base (or bases, or deletion) as the
+consensus at that reference position.
+
+Why is Plurality a weak algorithm?
+==================================
+
+Plurality does not perform any local realignment. This means it is
+heavily biased by the alignment produced by the mapper (BLASR,
+typically). It also means that it is insensitive at detecting indels.
+Consider this example:
+
+    Reference    AAAA
+                 ----
+      Aligned    A-AA
+        reads    AA-A
+                 -AAA
+                 ----
+    Plurality    AAAA
+    consensus
+
+Note here that every read has a deletion and the correct consensus call
+would be "AAA", but due to the mapper's freedom in gap-placement at the
+single-read level, the plurality sequence is "AAAA"---so the deletion is
+missed. Local realignment, which plurality does not do, but which could
+be considered as implicit in the Quiver algorithm, essentially pushes
+the gaps here to the same column, thus identifying the deletion. While
+plurality could be adjusted to use a simple "gap normalizing"
+realignment, in practice noncognate extras (spurious non-homopolymer
+base calls) in the midst of homopolymer runs pose challenges.
+
+What is Quiver?
+===============
+
+**Quiver** is a more sophisticated algorithm that finds the maximum
+quasi-likelihood template sequence given PacBio reads of the template.
+PacBio reads are modeled using a conditional random field approach that
+scores the quasi-likelihood of a read given a template sequence. In
+addition to the base sequence of each read, Quiver uses several
+additional *QV* covariates that the basecaller provides. Using these
+covariates provides additional information about each read, allowing
+more accurate consensus calls.
+
+Quiver does not use the alignment provided by the mapper (BLASR,
+typically), except for determining how to group reads together at a
+macro level. It implicitly performs its own realignment, so it is highly
+sensitive to all variant types, including indels---for example, it
+resolves the example above with ease.
+
+The name **Quiver** reflects a consensus-calling algorithm that is
+QV-aware.
+
+We use the lowercase "quiver" to denote the quiver *tool* in
+GenomicConsensus, which applies the Quiver algorithm to mapped reads to
+derive sequence consensus and variants.
+
+Quiver is described in detail in the supplementary material to the [HGAP
+paper](http://www.nature.com/nmeth/journal/v10/n6/full/nmeth.2474.html).
+
+What is Arrow?
+==============
+
+Arrow is a newer model intended to supercede Quiver in the near future.
+The key differences from Quiver are that it uses an HMM model instead of
+a CRF, it computes true likelihoods, and it uses a smaller set of
+covariates. We expect a whitepaper on Arrow to be available soon.
+
+We use the lowercase "arrow" to denote the arrow *tool*, which applies
+the Arrow algorithm to mapped reads to derive sequence consensus and
+variants.
+
+How do I run quiver/arrow?
+==========================
+
+For general instructions on installing and running, see the
+[HowTo](./HowTo.rst) document.
+
+What is the output from quiver/arrow?
+=====================================
+
+There are three output files from the GenomicConsensus tools:
+
+1.  A consensus *FASTA* file containing the consensus sequence
+2.  A consensus *FASTQ* file containing the consensus sequence with
+    quality annotations
+3.  A variants *GFF* file containing a filtered, annotated list of
+    variants identified
+
+It is important to note that the variants included in the output
+variants GFF file are *filtered* by coverage and quality, so not all
+variants that are apparent in comparing the reference to the consensus
+FASTA output will correspond to variants in the output variants GFF
+file.
+
+To enable all output files, the following can be run (for example):
+
+    % quiver -j16 aligned_reads.cmp.h5 -r ref.fa \
+     -o consensus.fa                             \
+     -o consensus.fq                             \
+     -o variants.gff
+
+The extension is used to determine the output file format.
+
+What does it mean that quiver consensus is *de novo*?
+=====================================================
+
+Quiver's consensus is *de novo* in the sense that the reference and the
+reference alignment are not used to inform the consensus output. Only
+the reads factor into the determination of the consensus.
+
+The only time the reference sequence is used to make consensus calls
+-when the `--noEvidenceConsensusCall` flag is set to `reference` or
+`lowercasereference` (the default)- is when there is no effective
+coverage in a genomic window, so Quiver has no evidence for computing
+consensus. One can set `--noEvidenceConsensusCall=nocall` to avoid using
+the reference even in zero coverage regions.
+
+What is the expected quiver accuracy?
+=====================================
+
+Quiver's expected accuracy is a function of coverage and chemistry. The
+C2 chemistry (no longer available), P6-C4 and P4-C2 chemistries provide
+the most accuracy. Nominal consensus accuracy levels are as follows:
+
+  --------------------------------------------
+  Coverage +  Expected consensus accuracy
+              ------------------+------------+
+              C2, P4-C2, P6-C4 | P5-C3
+  ----------- --------------------------------
+  10x         \> Q30 | \> Q30
+
+  20x         \> Q40 | \> Q40
+
+  40x         \> Q50 | \> Q45
+
+  60-80x      \~ Q60 | \> Q55
+  --------------------------------------------
+
+The "Q" values referred to are Phred-scaled quality values:
+
+$$q = -10 \log_{10} p_{error}$$
+
+for instance, Q50 corresponds to a p\_error of 0.00001---an accuracy of
+99.999%. These accuracy expectations are based on routine validations
+performed on multiple bacterial genomes before each chemistry release.
+
+What is the expected accuracy from arrow
+========================================
+
+arrow achieves similar accuracy to quiver. Numbers will be published
+soon.
+
+What are the residual errors after applying quiver?
+===================================================
+
+If there are errors remaining applying Quiver, they will almost
+invariably be homopolymer run-length errors (insertions or deletions).
+
+Does quiver/arrow need to know what sequencing chemistry was used?
+==================================================================
+
+At present, the Quiver model is trained per-chemistry, so it is very
+important that Quiver knows the sequencing chemistries used.
+
+If SMRT Analysis software was used to build the cmp.h5 or BAM input
+file, the cmp.h5 will be loaded with information about the sequencing
+chemistry used for each SMRT Cell, and GenomicConsensus will
+automatically identify the right parameters to use.
+
+If custom software was used to build the cmp.h5, or an override of
+Quiver's autodetection is desired, then the chemistry or model must be
+explicity entered. For example:
+
+    % quiver -p P4-C2 ...
+    % quiver -p P4-C2.AllQVsMergingByChannelModel ...
+
+Can a mix of chemistries be used in a cmp.h5 file for quiver/arrow?
+===================================================================
+
+Yes! GenomicConsensus tools automatically see the chemistry *per-SMRT
+Cell*, so it can figure out the right parameters for each read and model
+them appropriately.
+
+What chemistries and chemistry mixes are supported?
+===================================================
+
+For Quiver: all PacBio RS chemistries are supported. Chemistry mixtures
+of P6-C4, P4-C2, P5-C3, and C2 are supported.
+
+For Arrow: the RS chemistry P6-C4, and all PacBio Sequel chemistries are
+supported. Mixes of these chemistries are supported.
+
+What are the QVs that the Quiver model uses?
+============================================
+
+Quiver uses additional QV tracks provided by the basecaller. These QVs
+may be looked at as little breadcrumbs that are left behind by the
+basecaller to help identify positions where it was likely that errors of
+a given type occurred. Formally, the QVs for a given read are vectors of
+the same length as the number of bases called; the QVs used are as
+follows:
+
+> -   DeletionQV
+> -   InsertionQV
+> -   MergeQV
+> -   SubstitutionQV
+> -   DeletionTag
+
+To find out if your cmp.h5 file is loaded with these QV tracks, run the
+command :
+
+    % h5ls -rv aligned_reads.cmp.h5
+
+and look for the QV track names in the output. If your cmp.h5 file is
+lacking some of these tracks, Quiver will still run, though it will
+issue a warning that its performance will be suboptimal.
+
+Why is quiver/arrow making errors in some region?
+=================================================
+
+The most likely cause for *true* errors made by these tools is that the
+coverage in the region was low. If there is 5x coverage over a 1000-base
+region, then 10 errors in that region can be expected.
+
+It is important to understand that the effective coverage available to
+quiver/arrow is not the full coverage apparent in plots---the tools
+filter out ambiguously mapped reads by default. The remaining coverage
+after filtering is called the /effective coverage/. See the next section
+for discussion of MapQV.
+
+If you have verified that there is high effective coverage in the region
+in question, it is highly possible---given the high accuracy quiver and
+arrow can achieve---that the apparent errors actually reflect true
+sequence variants. Inspect the FASTQ output file to ensure that the
+region was called at high confidence; if an erroneous sequence variant
+is being called at high confidence, please report a bug to us.
+
+What does Quiver do for genomic regions with no effective coverage?
+===================================================================
+
+For regions with no effective coverage, no variants are outputted, and
+the FASTQ confidence is 0.
+
+The output in the FASTA and FASTQ consensus sequence tracks is dependent
+on the setting of the `--noEvidenceConsensusCall` flag. Assuming the
+reference in the window is "ACGT", the options are:
+
+  ---------------------------------------------------------
+  `--noEvidenceConsensusCall=...`                Consensus
+                                                 output
+  ---------------------------------------------- ----------
+  `nocall` (default in 1.4)                      NNNN
+
+  `reference`                                    ACGT
+
+  `lowercasereference` (new post 1.4, and the    acgt
+  default)                                       
+  ---------------------------------------------------------
+
+What is MapQV and why is it important?
+======================================
+
+MapQV is a single scalar Phred-scaled QV per aligned read that reflects
+the mapper's degree of certainty that the read aligned to *this* part of
+the reference and not some other. Unambigously mapped reads will have a
+high MapQV (typically 255), while a read that was equally likely to have
+come from two parts of the reference would have a MapQV of 3.
+
+MapQV is pretty important when you want highly accurate variant calls.
+Quiver and Plurality both filter out aligned reads with a MapQV below 20
+(by default), so as not to call a variant using data of uncertain
+genomic origin.
+
+This can be problematic if using quiver/arrow to get a consensus
+sequence. If the genome of interest contains long (relative to the
+library insert size) highly-similar repeats, the effective coverage
+(after MapQV filtering) may be reduced in the repeat regions---this is
+termed these MapQV dropouts. If the coverage is sufficiently reduced in
+these regions, quiver/arrow will not call consensus in these
+regions---see
+What do quiver/arrow do for genomic regions with no effective coverage?\_.
+
+If you want to use ambiguously mapped reads in computing a consensus for
+a denovo assembly, the MapQV filter can be turned off entirely. In this
+case, the consensus for each instance of a genomic repeat will be
+calculated using reads that may actually be from other instances of the
+repeat, so the exact trustworthiness of the consensus in that region may
+be suspect. The next section describes how to disable the MapQV filter.
+
+How can the MapQV filter be turned off and when should it be?
+--------------------------------------------------------------The MapQV
+filter can be disabled using the flag `--mapQvThreshold=0` (shorthand:
+`-m=0`). If running a quiver/arrow job via SMRT Portal, this can be done
+by unchecking the "Use only unambiguously mapped reads" option. Consider
+this in de novo assembly projects, but it is not recommended for variant
+calling applications.
+
+How can variant calls made by quiver/arrow be inspected or validated?
+=====================================================================
+
+When in doubt, it is easiest to inspect the region in a tool like SMRT
+View, which enables you to view the reads aligned to the region.
+Deletions and substitutions should be fairly easy to spot; to view
+insertions, right-click on the reference base and select "View
+Insertions Before...".
+
+What are the filtering parameters that quiver/arrow use?
+========================================================
+
+The available options limit read coverage, filters reads by MapQV, and
+filters variants by quality and coverage.
+
+-   The overall read coverage used to call consensus in every window is
+    100x by default, but can be changed using `-X=value`.
+-   The MapQV filter, by default, removes reads with MapQV \< 20. This
+    is configured using `--mapQvThreshold=value` / `-m=value`
+-   Variants are only called if the read coverage of the site exceeds
+    5x, by default---this is configurable using `-x=value`. Further,
+    they will not be called if the confidence (Phred-scaled) does not
+    exceed 40---configurable using `-q=value`.
+
+What happens when the sample is a mixture, or diploid?
+-----------------------------------------------------At present,
+quiver/arrow assume a haploid sample, and the behavior of on sample
+mixtures or diploid/polyploid samples is *undefined*. The program will
+not crash, but the output results are not guaranteed to accord with any
+one of the haplotypes in the sample, as opposed to a potential
+patchwork.
+
+Why would I want to *iterate* the mapping+(quiver/arrow) process?
+=================================================================
+
+Some customers using quiver for polishing highly repetitive genomes have
+found that if they take the consensus FASTA output of quiver, use it as
+a new reference, and then perform mapping and Quiver again to get a new
+consensus, they get improved results from the second round of quiver.
+
+This can be explained by noting that the output of the first round of
+quiver is more accurate than the initial draft consensus output by the
+assembler, so the second round's mapping to the quiver consensus can be
+more sensitive in mapping reads from repetitive regions. This can then
+result in improved consensus in those repetitive regions, because the
+reads have been assigned more correctly to their true genomic loci.
+However there is also a possibility that the potential shifting of reads
+around from one rounds' mapping to the next might alter borderline (low
+confidence) consensus calls even away from repetitive regions.
+
+We recommend the (mapping+quiver) iteration for customers polishing
+repetitive genomes, and it could also prove useful for resequencing
+applications. However we caution that this is very much an *exploratory*
+procedure and we make no guarantees about its performance. In
+particular, borderline consensus calls can change when the procedure is
+iterated, and the procedure is *not* guaranteed to be convergent.
+
+Is iterating the (mapping+quiver/arrow) process a convergent procedure?
+=======================================================================
+
+We have seen many examples where (mapping+quiver), repeated many times,
+is evidently *not* a convergent procedure. For example, a variant call
+may be present in iteration n, absent in n+1, and then present again in
+n+2. It is possible for subtle changes in mapping to change the set of
+reads examined upon inspecting a genomic window, and therefore result in
+a different consensus sequence there. We expect this to be the case
+primarily for "borderline" (low confidence) base calls.
+
+SEE ALSO
+========
+
+**variantCaller**(1)
diff --git a/debian/man/pbgff.5.md b/debian/man/pbgff.5.md
new file mode 100644
index 0000000..efcb624
--- /dev/null
+++ b/debian/man/pbgff.5.md
@@ -0,0 +1,164 @@
+% PBGFF(5) 2.1 | Bioinformatics formats
+% Pacific Biosciences <devnet at pacificbiosciences.com>
+% August 2014
+
+NAME
+====
+
+**pbgff** - Pacific Biosciences extended GFFv3 file format
+
+DESCRIPTION
+===========
+
+As of this version, `variants.gff` is our primary variant call file
+format. The `variants.gff` file is based on the [GFFv3
+standard](http://www.sequenceontology.org/gff3.shtml). The GFFv3
+standard describes a tab-delimited plain-text file meta-format for
+describing genomic "features." Each gff file consists of some initial
+"header" lines supplying metadata, and then a number of "feature" lines
+providing information about each identified variant.
+
+The GFF Coordinate System
+-------------------------
+
+All coordinates in GFF files are 1-based, and all intervals `start, end`
+are understood as including both endpoints.
+
+Headers
+-------
+
+The `variants.gff` file begins with a block of metadata headers, which
+looks like the following:
+
+    ##gff-version 3
+    ##pacbio-variant-version 2.1
+    ##date Tue Feb 28 17:44:18 2012
+    ##feature-ontology http://song.cvs.sourceforge.net/*checkout*/song/ontology/sofa.obo?revision=1.12
+    ##source GenomicConsensus v0.1.0
+    ##source-commandline callVariants.py --algorithm=plurality aligned_reads.cmp.h5 -r spinach.fasta -o variants.gff
+    ##source-alignment-file /home/popeye/data/aligned_reads.cmp.h5
+    ##source-reference-file /home/popeye/data/spinach.fasta
+    ##sequence-region EGFR_Exon_23 1 189
+    ##sequence-header EGFR_Exon_24 1 200
+
+The `source` and `source-commandline` describe the name and version of
+the software generating the file. `pacbio-variant-version` reflects the
+specification version that the file contents should adhere to.
+
+The `sequence-region` headers describe the names and extents of the
+reference groups (i.e. reference contigs) that will be refered to in the
+file. The names are the same as the full FASTA header.
+
+`source-alignment-file` and `source-reference-file` record absolute
+paths to the primary input files.
+
+Feature lines
+-------------
+
+After the headers, each line in the file describes a genomic *feature*;
+in this file, all the features are potential variants flagged by the
+variant caller. The general format of a variant line is a 9-column
+(tab-delimited) record, where the first 8 columns correspond to fixed,
+predefined entities in the GFF standard, while the 9th column is a
+flexible semicolon-delimited list of mappings `key=value`.
+
+The 8 predefined columns are as follows:
+
+  ------- -------- --------------------------------- -------------------
+  Column  Name     Description                       Example
+  Number                                             
+
+  1       seqId    The full FASTA header for the     `lambda_NEB3011`
+                   reference contig.                 
+
+  2       source   (unused; always populated with    `.`
+                   `.`)                              
+
+  3       type     the type of variant. One of       `substitution`
+                   `insertion`, `deletion`, or       
+                   `substitution`.                   
+
+  4       start    1-based start coordinate for the  200
+                   variant.                          
+
+  5       end      1-based end coordinate for the    215
+                   variant. start\<=end always       
+                   obtains, regardless of strand.    
+
+  6       score    unused; populated with `.`        `.`
+
+  7       strand   unused; populated with `.`        `.`
+
+  8       phase    unused; populated with `.`        `.`
+  ------- -------- --------------------------------- -------------------
+
+The attributes in the 9th (final) column are as follows:
+
++-----------------+-------------------------------+--------------------+
+| Key             | Description                   | Example value      |
++-----------------+-------------------------------+--------------------+
+| `coverage`      | the read coverage of the      | `42`               |
+|                 | variant site (not the variant |                    |
+|                 | itself)                       |                    |
++-----------------+-------------------------------+--------------------+
+| `confidence`    | the phred-scaled probability  | `37`               |
+|                 | that the variant is real,     |                    |
+|                 | rounded to the nearest        |                    |
+|                 | integer and truncated at 93   |                    |
++-----------------+-------------------------------+--------------------+
+| `reference`     | the reference base or bases   | `T`, `.`           |
+|                 | for the variant site. May be  |                    |
+|                 | `.` to represent a            |                    |
+|                 | zero-length substring (for    |                    |
+|                 | insertion events)             |                    |
++-----------------+-------------------------------+--------------------+
+| `variantSeq`    | the read base or bases        | `T`                |
+|                 | corresponding to the variant. | :   (haploid);     |
+|                 | `.` encodes a zer-length      |                    |
+|                 | string, as for a deletion.    | `T/C`, `T/.`       |
+|                 |                               | :   (heterozygous) |
++-----------------+-------------------------------+--------------------+
+| `frequency`     | the read coverage of the      | `13`               |
+|                 | variant itself; for           | :   (haploid)      |
+|                 | heterozygous variants, the    |                    |
+|                 | frequency of both observed    | `15/12`            |
+|                 | alleles. This is an optional  | :   (heterozygous) |
+|                 | field.                        |                    |
++-----------------+-------------------------------+--------------------+
+
+The attributes may be present in any order.
+
+The four types of variant we support are as follows. *(Recall that the
+field separator is a tab, not a space.)*
+
+1.  Insertion. Examples:
+
+        ref00001 . insertion 8 8 . . . reference=.;variantSeq=G;confidence=22;coverage=18;frequency=10
+        ref00001 . insertion 19 19 . . . reference=.;variantSeq=G/.;confidence=22;coverage=18;frequency=7/5
+
+> For insertions, start==end, and the insertion event is understood as
+> taking place *following* the reference position start.
+
+2.  Deletion. Examples:
+
+        ref00001 . deletion 348 349 . . . reference=G;variantSeq=.;confidence=39;coverage=25;frequency=20
+        ref00001 . deletion 441 443 . . . reference=GG;variantSeq=GG/.;confidence=39;coverage=25;frequency=8/8
+
+3.  Substitution. Examples:
+
+        ref000001 . substitution 100 102 . . . reference=GGG;variantSeq=CCC;confidence=50;coverage=20;frequency=16
+        ref000001 . substitution 200 201 . . . reference=G;variantSeq=G/C;confidence=50;coverage=20;frequency=10/6
+
+Compression
+-----------
+
+The gff metaformat is verbose, so for practical purposes we will gzip
+encode `variants.gff` files as `variants.gff.gz`. Consumers of the
+variant file should be able to read it in either form.
+
+SEE ALSO
+========
+
+The VCF and BED standards describe variant-call specific file formats.
+We can currently translate variants.gff files to these formats, but they
+are not the primary output of the variant callers.
diff --git a/debian/man/plurality.1 b/debian/man/plurality.1
index e12ab61..9960f00 100644
--- a/debian/man/plurality.1
+++ b/debian/man/plurality.1
@@ -7,7 +7,7 @@ plurality \- simple variant calling algorithm
 .SH DESCRIPTION
 
 See
-.BR quiver-faq (7)
+.BR genomicconsensus (7)
 for a more detailed description of \f[B]plurality\f[P].
 
 .SH OPTIONS
@@ -17,7 +17,6 @@ for a more detailed description of \f[B]plurality\f[P].
 Show summary of options.
 
 .SH SEE ALSO
-.BR quiver-faq (7)
-.BR quiver (1)
-.BR variantCaller.py (1)
+.BR genomicconsensus (7)
+.BR variantCaller (1)
 .BR blasr (1)
diff --git a/debian/man/poa.1 b/debian/man/poa.1
new file mode 100644
index 0000000..3116a2a
--- /dev/null
+++ b/debian/man/poa.1
@@ -0,0 +1,22 @@
+.TH POA 1
+
+.SH NAME
+
+poa \- consensus caller for Pacific Biosciences data
+
+.SH DESCRIPTION
+
+See
+.BR genomicconsensus (7)
+for a more detailed description of \f[B]poa\f[P].
+
+.SH OPTIONS
+
+.TP
+.B \-h, \-\-help
+Show summary of options.
+
+.SH SEE ALSO
+.BR genomicconsensus (7)
+.BR variantCaller (1)
+.BR blasr (1)
diff --git a/debian/man/quiver.1 b/debian/man/quiver.1
new file mode 100644
index 0000000..00f0f57
--- /dev/null
+++ b/debian/man/quiver.1
@@ -0,0 +1,22 @@
+.TH QUIVER 1
+
+.SH NAME
+
+quiver \- quality-value aware consensus caller for Pacific Biosciences data
+
+.SH DESCRIPTION
+
+See
+.BR genomicconsensus (7)
+for a more detailed description of \f[B]quiver\f[P].
+
+.SH OPTIONS
+
+.TP
+.B \-h, \-\-help
+Show summary of options.
+
+.SH SEE ALSO
+.BR genomicconsensus (7)
+.BR variantCaller (1)
+.BR blasr (1)
diff --git a/debian/man/variantCaller.1.md b/debian/man/variantCaller.1.md
new file mode 100644
index 0000000..c7c4aa2
--- /dev/null
+++ b/debian/man/variantCaller.1.md
@@ -0,0 +1,254 @@
+% variantCaller(1)
+%
+%
+
+NAME
+====
+
+**variantCaller** - variant-calling algorithms for PacBio sequencing data
+
+SYNOPSIS
+========
+
+`variantCaller` is invoked from the command line. For example, a
+simple invocation is:
+
+    variantCaller -j8 --algorithm=quiver \
+                      -r lambdaNEB.fa        \
+                      -o variants.gff        \
+                      aligned_reads.cmp.h5
+
+which requests that variant calling proceed, - using 8 worker processes,
+- employing the **quiver** algorithm, - taking input from the file
+`aligned_reads.cmp.h5`, - using the FASTA file `lambdaNEB.fa` as the
+reference, - and writing output to `variants.gff` (see **pbgff**(5)).
+
+A particularly useful option is `--referenceWindow/-w`: this option
+allows the user to direct the tool to perform variant calling
+exclusively on a *window* of the reference genome, where the
+
+OPTIONS
+=======
+
+    variantCaller --help
+
+will provide a help message explaining all available options.
+
+NOTES
+=====
+
+Input and output
+----------------
+
+`variantCaller` requires two input files:
+
+-   A file of reference-aligned reads in PacBio's standard cmp.h5
+    format;
+-   A FASTA file that has been processed by ReferenceUploader.
+
+The tool's output is formatted in the GFF format, as described in (how
+to link to other file?). External tools can be used to convert the GFF
+file to a VCF or BED file---two other standard interchange formats for
+variant calling.
+
+> **note**
+>
+> **Input cmp.h5 file requirements**
+>
+> `variantCaller` requires its input cmp.h5 file to be be sorted. An
+> unsorted file can be sorting using the tool `cmpH5Sort.py`.
+>
+> The **quiver**(1) algorithm in `variantCaller` requires its input
+> cmp.h5 file to have the following *pulse features*: - `InsQV`, -
+> `SubsQV`, - `DelQV`, - `DelTag`, - `MergeQV`.
+>
+> The **plurality**(1) algorithm can be run on cmp.h5 files that lack
+> these features.
+
+The input file is the main argument to `variantCaller`, while the
+output file is provided as an argument to the `-o` flag. For example,
+
+    variantCaller aligned_reads.cmp.h5 -r lambda.fa  -o variants.gff
+
+will read input from `aligned_reads.cmp.h5`, using the reference
+`lambda.fa`, and send output to the file `variants.gff`. The extension
+of the filename provided to the `-o` flag is meaningful, as it
+determines the output file format. The file formats presently supported,
+by extension, are
+
+`.gff`
+:   GFFv3 format
+
+`.txt`
+:   a simplified human readable format used primarily by the developers
+
+If the `-o` flag is not provided, the default behavior is to output to a
+`variants.gff` in the current directory.
+
+> **note**
+>
+> `variantCaller` does **not** modify its input cmp.h5 file in any
+> way. This is in contrast to previous variant callers in use at PacBio,
+> which would write a *consensus* dataset to the input cmp.h5 file.
+
+Available algorithms
+--------------------
+
+At this time there are two algorithms available for variant calling:
+**plurality** and **quiver**.
+
+**Plurality** is a simple and very fast procedure that merely tallies
+the most frequent read base or bases found in alignment with each
+reference base, and reports deviations from the reference as potential
+variants.
+
+**Quiver** is a more complex procedure based on algorithms originally
+developed for CCS. Quiver leverages the quality values (QVs) provided by
+upstream processing tools, which provide insight into whether
+insertions/deletions/substitutions were deemed likely at a given read
+position. Use of **quiver** requires the `ConsensusCore` and `ConsensusCore2`
+libraries as well as trained parameter set, which will be loaded from a
+standard location (TBD). Arrow and Quiver can be thought of as
+local-realignment procedures (QV-aware in the case of Quiver).
+
+Both algorithms are expected to converge to *zero* errors (miscalled
+variants) as coverage increases; however **quiver** should converge much
+faster (i.e., fewer errors at low coverage), and should provide greater
+variant detection power at a given error level.
+
+### Confidence values
+
+Both *quiver* and *plurality* make a confidence metric available for
+every position of the consensus sequence. The confidence should be
+interpreted as a phred-transformed posterior probability that the
+consensus call is incorrect; i.e.
+
+$$QV = -10 \log_{10}(p_{err})$$
+
+`variantCaller` clips reported QV values at 93---larger values cannot
+be encoded in a standard FASTQ file.
+
+### Chemistry specificity
+
+The Quiver algorithm parameters are trained per-chemistry. SMRTanalysis
+software loads metadata into the cmp.h5 to indicate the chemistry used
+per movie. Quiver sees this table and automatically chooses the
+appropriate parameter set to use. This selection can be overridden by a
+command line flag.
+
+When multiple chemistries are represented in the reads in a cmp.h5,
+Quiver will model each read appropriately using the parameter set for
+its chemistry, thus yielding optimal results.
+
+### Performance Requirements
+
+`variantCaller` performs variant calling in parallel using multiple
+processes. Work splitting and inter-process communication are handled
+using the Python `multiprocessing` module. Work can be split among an
+arbitrary number of processes (using the `-j` command-line flag), but
+for best performance one should use no more worker processes than there
+are CPUs in the host computer.
+
+The running time of the *plurality* algorithm should not exceed the
+runtime of the BLASR process that produced the cmp.h5. The running time
+of the *quiver* algorithm should not exceed 4x the runtime of BLASR.
+
+The amount of core memory (RAM) used among all the python processes
+launched by a `variantCaller` run should not exceed the size of the
+uncompressed input `.cmp.h5` file.
+
+EXAMPLES
+========
+
+Basic running instructions
+--------------------------
+
+Basic usage---using 8 CPUs to compute consensus of mapped reads and
+variants relative to a reference---is as follows:
+
+    % quiver -j8 aligned_reads{.cmp.h5, .bam, .fofn, or .xml} \
+    >     -r reference{.fasta or .xml} -o variants.gff        \
+    >     -o consensus.fasta -o consensus.fastq
+
+`quiver` is a shortcut for `variantCaller --algorithm=quiver`.
+Naturally, to use arrow you could use the `arrow` shortcut or
+`variantCaller --algorithm=arrow`.
+
+in this example we perform haploid consensus and variant calling on the
+mapped reads in the `aligned_reads.bam` which was aligned to
+`reference.fasta`. The `reference.fasta` is only used for designating
+variant calls, not for computing the consensus. The consensus quality
+score for every position can be found in the output FASTQ file.
+
+*Note that 2.3 SMRTanalysis does not support "dataset" input (FOFN or
+XML files); those who need this feature should wait for the forthcoming
+release of SMRTanalysis 3.0 or build from GitHub sources.*
+
+Running a large-scale resequencing/polishing job in SMRTanalysis 2.3
+--------------------------------------------------------------------
+
+To run a large-scale resequencing job (\>50 megabase genome @ 50x
+coverage,nominally), you want to spread the computation load across
+multiple nodes in your computing cluster.
+
+The smrtpipe workflow engine in SMRTanalysis 2.3 provides a convenient
+workflow automating this---it will automatically spread the load for
+both mapping and quiver jobs among your available cluster nodes. This is
+accessible via the SMRTportal UI; the simplest way to set up and run
+thse workflows is via tha UI. Nonetheless, we include command-line
+instructions for completeness.
+
+If you have to run the smrtpipe workflow manually from the command line,
+a recipe is as folows:
+
+1.  Make sure the reference you will align and compare against is
+    present in a SMRTportal "reference repository". Even if you don't
+    want to use SMRTportal, you need to build/import the reference
+    appropriately, and the simplest way to do that is via SMRTportal. If
+    you don't have a SMRTportal instance, you can use the
+    `referenceUploader` command to prepare your reference repository.
+2.  Prepare an "input.fofn" file listing, one-per-line, each "bax.h5"
+    file in your input data set.
+3.  Convert the "input.fofn" to an "input.xml" file that SMRTpipe can
+    understand:
+
+    \$ fofnToSmrtpipeInput.py input.fofn \> input.xml
+
+4.  Prepare your "params.xml" file. Here is a [params.xml
+    template](./params-template.xml) you can use; you should just need
+    to edit the reference path.
+5.  Activate your SMRTanalysis environment, and invoke smrtpipe:
+
+    \$ source \<SMRT Analysis\>/etc/setup.sh \$ smrtpipe.py --distribute
+    --params=params.xml xml:input.xml
+
+6.  After successful execution is complete, the results should be
+    available as data/consensus.fast[aq].gz and data/variants.gff.gz,
+    etc.
+
+Please consult the [SMRTpipe reference
+manual](http://www.pacb.com/wp-content/uploads/2015/09/SMRT-Pipe-Reference-Guide.pdf)
+for further information.
+
+*Note that resequencing (mapping reads against a reference genome and
+then calling consensus and identifying variants) and polishing (mapping
+reads against a draft assembly and then taking the consensus output as
+the final, polished, assembly) are the same algorithmic operation, the
+only effective difference is that the "variants.gff" output is not
+biologically meaningful in the polishing case---it just records the
+edits that were made to the draft to produce the polished assembly.*
+
+Running a large-scale quiver/arrow job in SMRTanalysis 3.0+
+-----------------------------------------------------------
+
+(Forthcoming)
+
+
+SEE ALSO
+========
+
+**quiver**(1)
+**arrow**(1)
+**plurality**(1)
+**pbgff**(5)
+**blasr**(1)
diff --git a/debian/patches/manpages.patch b/debian/patches/manpages.patch
deleted file mode 100644
index 26c441e..0000000
--- a/debian/patches/manpages.patch
+++ /dev/null
@@ -1,432 +0,0 @@
-Description: Turn upstream file format documentation into a manpage
- These patches change the rst file to be suitable input to rst2man.
- I followed the example at
- http://docutils.sourceforge.net/sandbox/manpage-writer/rst2man.txt
-Author: Afif Elghraoui <afif at ghraoui.name>
-Forwarded: not-needed
-Last-Update: 2016-02-11
---- pbgenomicconsensus.orig/doc/VariantsGffSpecification.rst
-+++ pbgenomicconsensus/doc/VariantsGffSpecification.rst
-@@ -1,6 +1,20 @@
-+=====
-+pbgff
-+=====
-+
-+----------------------------------------------
-+Pacific Biosciences extended GFFv3 file format
-+----------------------------------------------
-+
-+:Author: Pacific Biosciences <devnet at pacificbiosciences.com>
-+:Date: August 2014
-+:Version: 2.1
-+:Manual section: 5
-+:Manual group: Bioinformatics formats
- 
--``variants.gff`` File Format (Version 2.1)
--============================================
-+
-+DESCRIPTION
-+===========
- 
- As of this version, ``variants.gff`` is our primary variant call file
- format.  The ``variants.gff`` file is based on the `GFFv3 standard`_.
-@@ -40,7 +54,7 @@
- ``pacbio-variant-version`` reflects the specification version that the
- file contents should adhere to.
- 
--  The ``sequence-region`` headers describe the names and extents of
-+The ``sequence-region`` headers describe the names and extents of
- the reference groups (i.e. reference contigs) that will be refered to
- in the file.  The names are the same as the full FASTA header.
- 
-@@ -162,8 +176,8 @@
- the variant file should be able to read it in either form.
- 
- 
--Other file formats
--------------------
-+SEE ALSO
-+========
- 
- The VCF and BED standards describe variant-call specific file formats.
- We can currently translate `variants.gff` files to these formats, but
---- pbgenomicconsensus.orig/doc/HowToQuiver.rst
-+++ pbgenomicconsensus/doc/HowToQuiver.rst
-@@ -1,51 +1,22 @@
-+======
-+quiver
-+======
- 
--How to install and use Quiver
--=============================
-+--------------------------------------------------------------
-+genomic consensus caller designed for Pacific Biosciences data
-+--------------------------------------------------------------
- 
--Quiver is bundled in SMRTanalysis version 1.4 and later.  The easiest
--way to get Quiver is to install the most recent version of SMRTanalysis.
-+:Date: February 2016
-+:Manual section: 1
- 
--If you want to install Quiver as a standalone package from the latest
--bleeding-edge code, follow the instructions below.
- 
--*Note: please install this software on an isolated machine that does
--not have SMRTanalysis installed.  Older versions of SMRTanalysis
--pollute the ``PYTHONPATH``, which has the undesirable effect of
--overriding ``virtualenv``-installed modules.*
-+DESCRIPTION
-+===========
- 
--Background
------------
- **Quiver** is an algorithm for calling highly accurate consensus from
- multiple PacBio reads, using a pair-HMM exploiting both the basecalls
- and QV metrics to infer the true underlying DNA sequence.
- 
--Quiver is available through the ``quiver`` script from the
--``GenomicConsensus`` package.  To use Quiver, the following PacBio
--software is required.
--
--- ``GenomicConsensus``, containing ``quiver``
--- ``ConsensusCore``, a C++ library containing the core computational
--  routines for Quiver
--- ``pbcore``, a package providing access to PacBio data files
--
--
--Required libraries and tools
------------------------------
--To install the PacBio software, the following are required:
--
--- Boost  >= 1.4.7   (standard C++ libraries)
--- SWIG   >= 2.0.7   (library wrapper generator)
--- Python 2.7.3
--- virtualenv        (builds isolated Python environments)
--
--If you are within PacBio, these requirements are already installed
--within the cluster environment.
--
--Otherwise, you will need to install them yourself.  The automatic
--installation script requires that the ``swig`` executable is in your
--UNIX ``$PATH`` and that your boost installation can be found under
--``/usr/include`` or ``/usr/local``.
--
- 
- Data file requirements
- ----------------------
-@@ -66,45 +37,11 @@
- please upgrade.
- 
- 
--Installation instructions
---------------------------
--
--Step 1: Set up your Python environment
--``````````````````````````````````````
--I recommend using a Python *virtualenv* to isolate your sandbox.
--
--To set up a new virtualenv, do ::
--
--    $ cd; virtualenv -p python2.7 --no-site-packages VE-QUIVER
--
--and activate the virtualenv using ::
--
--    $ source ~/VE-QUIVER/bin/activate
--
--
--Step 2: Install PacBio libraries
--````````````````````````````````
--To install the PacBio software, execute ::
--
--    $ pip install pbcore
--
--    $ git clone https://github.com/PacificBiosciences/ConsensusCore
--    $ cd ConsensusCore; python setup.py install --swig=$SWIG --boost=$BOOST
--    $ pip install git+https://github.com/PacificBiosciences/GenomicConsensus
--
--where you replace ``$SWIG`` with the path to your ``swig`` executable
--and ``$BOOST`` with the path to your boost install (the top level
--directory).  (Note that if SWIG is in your ``$PATH`` and boost is in
--``/usr/local`` or ``/usr/include/``, you do not need to specify these
--flags on the command line---``setup.py`` will find them).
--
--
--Step 3: Run Quiver
--``````````````````
--Those who wish to call consensus on a resequencing job can simply use
--the ``quiver`` script that has been installed in your
--virtualenv (from `GenomicConsensus`).
-+EXAMPLES
-+========
- 
-+Running Quiver
-+--------------
- For example, ::
- 
-     $ quiver -j8 aligned_reads.bam              \
-@@ -120,8 +57,8 @@
- QV metrics required for optimal Quiver accuracy.  The command will still work,
- but it will give a warning that its accuracy will be suboptimal.
- 
--Step 3.5: Run Quiver on Multiple Input Files
--````````````````````````````````````````````
-+Run Quiver on Multiple Input Files
-+----------------------------------
- Multiple alignment files in a FOFN (File of File Names) can be quivered against
- a single reference (`GenomicConsensus` >= 1.1.0).
- 
-@@ -140,22 +77,18 @@
- Quiver can also be used with DataSet XML files. See pbcore for details on
- generating new DataSet XML files for your alignment files.
- 
--Step 4: Highly-accurate assembly consensus
--``````````````````````````````````````````
-+Highly-accurate assembly consensus
-+----------------------------------
- Quiver enables consensus accuracies on genome assemblies at accuracies
- approaching or even exceeding Q60 (one error per million bases).  If
- you use the HGAP assembly protocol in SMRTportal 2.0 or later, Quiver
- runs automatically as the final "assembly polishing" step.
- 
- 
--Resources
-----------
--Here is an `FAQ document`_ to address common issues.
--
--For a technical summary of some of the details of how Quiver works, I
--recommend reading the supplementary material of our 2013 *Nature
--Methods* `HGAP paper`_
--
--
--.. _`FAQ document`: https://github.com/PacificBiosciences/GenomicConsensus/blob/master/doc/QuiverFAQ.rst
--.. _`HGAP paper`: http://www.nature.com/nmeth/journal/v10/n6/full/nmeth.2474.html
-+SEE ALSO
-+========
-+ 
-+**quiver-faq**\ (7)
-+**plurality**\ (1)
-+**variantCaller**\ (1)
-+**pbgff**\ (5)
---- pbgenomicconsensus.orig/doc/QuiverFAQ.rst
-+++ pbgenomicconsensus/doc/QuiverFAQ.rst
-@@ -1,5 +1,17 @@
--Quiver FAQ
- ==========
-+quiver-faq
-+==========
-+
-+------------------------------------------------------------------------
-+frequently asked questions regarding the quiver genomic consensus caller
-+------------------------------------------------------------------------
-+
-+:Date: October 2015
-+:Version: 1.0.0
-+:Manual section: 7
-+
-+QUESTIONS
-+=========
- 
- What are EviCons? GenomicConsensus? Quiver? Plurality?  
- ------------------------------------------------------------
-@@ -67,14 +79,6 @@
- The name **Quiver** reflects a consensus-calling algorithm that is
- `QV-aware`.
- 
--How do I run Quiver?
----------------------
--
--For general instructions on installing and running, see the
--HowToQuiver_ document.
--
--
--
- What does Quiver put in its output files?
- -----------------------------------------
- There are three output files from Quiver:
-@@ -155,7 +159,7 @@
- 
- 
- Does Quiver need to know what sequencing chemistry was used?
------------------------------------------------------------
-+------------------------------------------------------------
- 
- At present, the Quiver model is trained per-chemistry, so it is very
- important that Quiver knows the sequencing chemistries used.
-@@ -175,7 +179,7 @@
- 
- 
- Can a mix of chemistries be used in a cmp.h5 file for Quiver?
-------------------------------------------------------------
-+-------------------------------------------------------------
- 
- Yes!  Quiver automatically sees the chemistry *per-SMRT Cell*, so it
- can figure out the right parameters for each read and model them
-@@ -289,7 +293,7 @@
- 
- 
- How can the `MapQV` filter be turned off and when should it be?
----------------------------------------------------------------
-+---------------------------------------------------------------
- The `MapQV` filter can be disabled using the flag
- ``--mapQvThreshold=0`` (shorthand: ``-m=0``).  If running a
- Quiver job via SMRT Portal, this can be done by unchecking the "Use
-@@ -299,7 +303,7 @@
- 
- 
- How can variant calls made by Quiver be inspected or validated?
----------------------------------------------------------------
-+---------------------------------------------------------------
- When in doubt, it is easiest to inspect the region in a tool like
- SMRT View, which enables you to view the reads aligned to the region.
- Deletions and substitutions should be fairly easy to spot; to view
-@@ -324,7 +328,7 @@
- 
- 
- What happens when the sample is a mixture, or diploid?
-------------------------------------------------------
-+------------------------------------------------------
- At present, Quiver assumes a haploid sample, and the behavior of
- *Quiver* on sample mixtures or diploid/polyploid samples is
- *undefined*.  The program will not crash, but the output results are
-@@ -372,5 +376,10 @@
- calls.
- 
- 
-+SEE ALSO
-+========
- 
--.. _HowToQuiver: https://github.com/PacificBiosciences/GenomicConsensus/blob/master/doc/HowToQuiver.rst
-+**quiver**\ (1)
-+**plurality**\ (1)
-+**variantCaller**\ (1)
-+**pbgff**\ (5)
---- pbgenomicconsensus.orig/doc/VariantCallerFunctionalSpecification.rst
-+++ pbgenomicconsensus/doc/VariantCallerFunctionalSpecification.rst
-@@ -1,35 +1,16 @@
-+=============
-+variantCaller
-+=============
-+
-+-----------------------------------------------------
-+variant-calling algorithms for PacBio sequencing data
-+-----------------------------------------------------
-+
-+:Date: February 2016
-+:Manual section: 1		 
- 
--
--Variant Caller Functional Specification
--=======================================
--
--Version 2.2
--
--
--Introduction
--------------
--
--This document describes the interface, input/output, and performance
--characteristics of ``variantCaller.py``, a variant calling tool
--provided by the ``GenomicConsensus`` package.
--
--
--Software Overview
-------------------
--
--The ``GenomicConsensus`` package provides a command-line tool,
--``variantCaller.py``, which provides several variant-calling algorithms for
--PacBio sequencing data.  ``variantCaller.py`` replaces ``EviCons`` and
--``SmrtBayes``, the previous (haploid, diploid---respectively) variant callers
--at PacBio.
--
--
--
--Functional Requirements
-------------------------
--
--Command-line interface
--``````````````````````
-+SYNOPSIS
-+========
- 
- ``variantCaller.py`` is invoked from the command line.  For example, a simple
- invocation is::
-@@ -44,23 +26,24 @@
- - employing the **quiver** algorithm,
- - taking input from the file ``aligned_reads.cmp.h5``,
- - using the FASTA file ``lambdaNEB.fa`` as the reference,
--- and writing output to ``variants.gff``.
-+- and writing output to ``variants.gff`` (see **pbgff**\ (5)).
- 
- A particularly useful option is ``--referenceWindow/-w``: this option
- allows the user to direct the tool to perform variant calling
- exclusively on a *window* of the reference genome, where the
- 
- 
--Invoking
-+OPTIONS
-+=======
- 
- ::
- 
-     variantCaller.py --help
- 
--will provide a help message explaining all available options; they will be
--documented here shortly.
--
-+will provide a help message explaining all available options.
- 
-+NOTES
-+=====
- 
- Input and output
- ````````````````
-@@ -82,7 +65,7 @@
-         be sorted.  An unsorted file can be sorting using the tool
-         ``cmpH5Sort.py``.
- 
--        The *quiver* algorithm in ``variantCaller.py`` requires its
-+        The **quiver**\ (1) algorithm in ``variantCaller`` requires its
-         input cmp.h5 file to have the following *pulse features*:
-             - ``InsQV``,
-             - ``SubsQV``,
-@@ -90,7 +73,7 @@
-             - ``DelTag``,
-             - ``MergeQV``.
- 
--        The *plurality* algorithm can be run on cmp.h5 files that lack
-+        The **plurality**\ (1) algorithm can be run on cmp.h5 files that lack
-         these features.
- 
- The input file is the main argument to ``variantCaller.py``, while the output
-@@ -147,19 +130,6 @@
- fewer errors at low coverage), and should provide greater variant detection
- power at a given error level.
- 
--
--Software interfaces
--```````````````````
--The ``GenomicConsensus`` module has two essential dependencies:
--
--1. **pbcore**, the PacBio Python bioinformatics library
--2. **ConsensusCore**, a C++ library with SWIG bindings that provides access to
--   the same algorithms used in circular consensus sequencing.
--
--Both of these modules are easily installed using their ``setup.py`` scripts,
--which is the canonical means of installing Python packages.
--
--
- Confidence values
- -----------------
- 
-@@ -209,3 +179,11 @@
- The amount of core memory (RAM) used among all the python processes launched
- by a ``variantCaller.py`` run should not exceed the size of the uncompressed
- input ``.cmp.h5`` file.
-+
-+SEE ALSO
-+========
-+
-+**quiver**\ (1)
-+**plurality**\ (1)
-+**pbgff**\ (5)
-+**blasr**\ (1)
diff --git a/debian/patches/series b/debian/patches/series
index 94817e0..0b14f63 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -1,3 +1,2 @@
 spelling.patch
-manpages.patch
 verbose-testing.patch
diff --git a/debian/pbgenomicconsensus.manpages b/debian/pbgenomicconsensus.manpages
index 19f429f..9914f7b 100644
--- a/debian/pbgenomicconsensus.manpages
+++ b/debian/pbgenomicconsensus.manpages
@@ -1 +1,3 @@
-debian/man/*
+debian/man/*.1
+debian/man/*.5
+debian/man/*.7
diff --git a/debian/rules b/debian/rules
index 5677229..a89ca39 100755
--- a/debian/rules
+++ b/debian/rules
@@ -10,19 +10,47 @@ export PYBUILD_NAME := $(DEB_SOURCE)
 BINDIR=$(CURDIR)/debian/$(DEB_SOURCE)/usr/bin/
 MANDIR=$(CURDIR)/debian/$(DEB_SOURCE)/usr/share/man/
 
+manpages = \
+pbgff.5.md \
+variantCaller.1.md \
+genomicconsensus.7.md \
+
+%: %.md
+	pandoc -s --to man $< -o $@
+
+
 %:
 	dh $@ --with python2 --buildsystem=pybuild
 
 override_dh_auto_test:
 	PYTHONPATH=$(CURDIR) PATH=$(CURDIR)/bin:$$PATH $(MAKE) tests
 
+override_dh_auto_build: docs
+	dh_auto_build
+
+.PHONY: docs
+docs: $(addprefix $(CURDIR)/debian/man/, $(manpages:.md=))
+
+.PHONY: clean-docs
+clean-docs:
+	$(RM) $(addprefix $(CURDIR)/debian/man/, $(manpages:.md=))
+
 override_dh_install:
 	dh_install
 # Place executable programs in the main package
 	mkdir -p $(BINDIR)
 	mv debian/python-$(DEB_SOURCE)/usr/bin/* $(BINDIR)
-	mkdir -p $(MANDIR)/man5 $(MANDIR)/man1 $(MANDIR)/man7
-	rst2man doc/VariantsGffSpecification.rst > $(MANDIR)/man5/pbgff.5
-	rst2man doc/VariantCallerFunctionalSpecification.rst > $(MANDIR)/man1/variantCaller.1
-	rst2man doc/HowToQuiver.rst > $(MANDIR)/man1/quiver.1
-	rst2man doc/QuiverFAQ.rst > $(MANDIR)/man7/quiver-faq.7
+
+override_dh_auto_clean: clean-docs
+	dh_auto_clean
+
+# Helper for setting up the documentation as manpages
+# Usage: VPATH=doc/ debian/rules prepare-doc
+prepare-doc: $(manpages)
+
+pbgff.5.md: VariantsGffSpecification.rst
+variantCaller.1.md: VariantCallerFunctionalSpecification.rst
+genomicconsensus.7.md: FAQ.rst
+
+$(manpages):
+	pandoc -s $< -o $@

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/pbgenomicconsensus.git



More information about the debian-med-commit mailing list