[med-svn] [partitionfinder] 01/01: depend on sklearn, add manpages

Wed Nov 11 12:37:01 UTC 2015

This is an automated email from the git hooks/post-receive script.

daube-guest pushed a commit to branch master
in repository partitionfinder.

commit d14da4cd50e24c6bc9769358e5cd08295b53f327
Author: Kevin Murray <spam at kdmurray.id.au>
Date:   Tue Nov 10 21:24:39 2015 +1100

    depend on sklearn, add manpages
---
 debian/PartitionFinder.py.1        | 150 +++++++++++++++++++++++++++++++++++++
 debian/PartitionFinderProtein.py.1 | 150 +++++++++++++++++++++++++++++++++++++
 debian/control                     |   2 +
 debian/partitionfinder.manpages    |   1 +
 4 files changed, 303 insertions(+)

diff --git a/debian/PartitionFinder.py.1 b/debian/PartitionFinder.py.1
new file mode 100644
index 0000000..6af9344
--- /dev/null
+++ b/debian/PartitionFinder.py.1
@@ -0,0 +1,150 @@
+.TH PARTITIONFINDER.PY "1" "November 2015" "PartitionFinder.py" "User Commands"
+.SH NAME
+PartitionFinder.py \- manual page for PartitionFinder.py
+.SH SYNOPSIS
+.B PartitionFinder.py \/\fR[\fI\,options\/\fR] \fI\,<foldername>\/\fR
+.SH DESCRIPTION
+.IP
+PartitionFinder and PartitionFinderProtein are designed to discover optimal
+partitioning schemes for nucleotide and amino acid sequence alignments.
+They are also useful for finding the best model of sequence evolution for datasets.
+.IP
+The Input: <foldername>: the full path to a folder containing:
+.IP
+\- A configuration file (partition_finder.cfg)
+\- A nucleotide/aa alignment in Phylip format
+.IP
+Take a look at the included 'example' folder for more details.
+.IP
+The Output: A file in the same directory as the .cfg file, named
+\&'analysis'. This file contains information on the best
+partitioning scheme, and the best model for each partition.
+.IP
+Usage Examples:
+.IP
+>python PartitionFinder.py example
+Analyse what is in the 'example' sub\-folder in the current folder.
+.IP
+>python PartitionFinder.py \fB\-v\fR example
+Analyse what is in the 'example' sub\-folder in the current folder, but
+show all the debug output
+.IP
+>python PartitionFinder.py \fB\-c\fR \fI\,~/data/frogs\/\fP
+Check the configuration files in the folder data/frogs in the current
+user's home folder.
+.IP
+>python PartitionFinder.py \fB\-\-force\-restart\fR \fI\,~/data/frogs\/\fP
+Deletes any data produced by the previous runs (which is in
+\fI\,~/data/frogs/output\/\fP) and starts afresh
+.SH OPTIONS
+.TP
+\fB\-h\fR, \fB\-\-help\fR
+show this help message and exit
+.TP
+\fB\-v\fR, \fB\-\-verbose\fR
+show debug logging information (equivalent to \fB\-\-debug\-output\fR=\fI\,all\/\fR)
+.TP
+\fB\-c\fR, \fB\-\-check\-only\fR
+just check the configuration files, don't do any
+processing
+.TP
+\fB\-f\fR, \fB\-\-force\-restart\fR
+delete all previous output and start afresh (!)
+.TP
+\fB\-p\fR N, \fB\-\-processes\fR=\fI\,N\/\fR
+Number of concurrent processes to use. Use \fB\-1\fR to match
+the number of cpus on the machine. The default is to
+use \fB\-1\fR.
+.TP
+\fB\-\-show\-python\-exceptions\fR
+If errors occur, print the python exceptions
+.TP
+\fB\-\-save\-phylofiles\fR
+save all of the phyml or raxml output. This can take a
+lot of space(!)
+.TP
+\fB\-\-dump\-results\fR
+Dump all results to a binary file. This is only of use
+for testing purposes.
+.TP
+\fB\-\-compare\-results\fR
+Compare the results to previously dumped binary
+results. This is only of use for testing purposes.
+.TP
+\fB\-q\fR, \fB\-\-quick\fR
+Avoid anything slow (like writing schemes at each
+step); useful for very large datasets.
+.TP
+\fB\-r\fR, \fB\-\-raxml\fR
+Use RAxML (rather than PhyML) to do the analysis. See
+the manual
+.TP
+\fB\-\-cmdline\-extras\fR=\fI\,N\/\fR
+Add additional commands to the phyml or raxml
+commandlines that PF uses. This can be useful e.g. if
+you want to change the accuracy of lnL calculations
+('\-e' option in raxml), or use multi\-threaded versions
+of raxml that require you to specify the number of
+threads you will let raxml use ('\-T' option in raxml).
+E.g. you might specify this: \fB\-\-cmdline\-extras\fR ' \fB\-e\fR
+2.0 \fB\-T\fR 10 '. N.B. MAKE SURE YOU PUT YOUR EXTRAS IN
+QUOTES, and only use this command if you really know
+what you're doing and are very familiar with raxml and
+PartitionFinder.
+.TP
+\fB\-\-weights\fR=\fI\,N\/\fR
+Mainly for algorithm development. Only use it if you
+know what you're doing. A list of weights to use in the
+clustering algorithms. This list allows you to assign
+different weights to: the overall rate for a subset,
+the base/amino acid frequencies, model parameters, and
+alpha value. This will affect how subsets are
+clustered together. For instance: \fB\-\-weights\fR
+\&'1, 2, 5, 1' would weight the base frequencies 2x
+more than the overall rate, the model parameters 5x
+more, and the alpha parameter the same as the overall
+rate.
+.TP
+\fB\-\-kmeans\fR=\fI\,type\/\fR
+This defines which sitewise values to use: entropy or
+tiger. \fB\-\-kmeans\fR entropy: use entropies for sitewise
+values; \fB\-\-kmeans\fR tiger: use TIGER rates for sitewise
+values.
+.TP
+\fB\-\-rcluster\-percent\fR=\fI\,N\/\fR
+This defines the proportion of possible schemes that
+the relaxed clustering algorithm will consider before
+it stops looking. The default is 10%, e.g. \fB\-\-rcluster\-percent\fR 10.0
+.TP
+\fB\-\-rcluster\-max\fR=\fI\,N\/\fR
+This defines the number of possible schemes that the
+relaxed clustering algorithm will consider before it
+stops looking. The default is to look at just the top
+1000 schemes, e.g. \fB\-\-rcluster\-max\fR 1000
+.TP
+\fB\-\-min\-subset\-size\fR=\fI\,N\/\fR
+This defines the minimum subset size that you will
+accept for the kmeans algorithm. Subsets smaller than
+this will still be created during the algorithm, but
+they will be merged with other subsets at the end of
+the algorithm, e.g. \fB\-\-min\-subset\-size\fR 100
+.TP
+\fB\-\-debug\-output\fR=\fI\,REGION\/\fR,REGION,...
+(advanced option) Provide a list of debug regions to
+output extra information about what the program is
+doing. Possible regions are 'all' or any of {subset,
+subset_ops,neighbour,raxml,parser,model_util,results,
+entropy,alignment,future_stdlib,threadpool,progress,main,
+config,pandas,reporter,kmeans,pandas.io.gbq,pandas.io,
+analysis_m,util,scheme,submodels,database,analysis,
+phyml,raxml_mode,model_load,phyml_mode}.
+.TP
+\fB\-\-all\-states\fR
+In the k\-means algorithm, only produce subsets which
+have all states represented (e.g. ACTG for DNA
+datasets).
+.TP
+\fB\-\-profile\fR
+Output profiling information after running (this will
+slow everything down!)
+.IP
diff --git a/debian/PartitionFinderProtein.py.1 b/debian/PartitionFinderProtein.py.1
new file mode 100644
index 0000000..00d8c09
--- /dev/null
+++ b/debian/PartitionFinderProtein.py.1
@@ -0,0 +1,150 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.2.
+.TH PARTITIONFINDERPROTEIN.PY "1" "November 2015" "PartitionFinderProtein.py" "User Commands"
+.SH NAME
+PartitionFinderProtein.py \- manual page for PartitionFinderProtein.py
+.SH SYNOPSIS
+.B PartitionFinderProtein.py \/\fR[\fI\,options\/\fR] \fI\,<foldername>\/\fR
+.SH DESCRIPTION
+.IP
+PartitionFinder and PartitionFinderProtein are designed to discover optimal
+partitioning schemes for nucleotide and amino acid sequence alignments.
+They are also useful for finding the best model of sequence evolution for datasets.
+.IP
+The Input: <foldername>: the full path to a folder containing:
+.IP
+\- A configuration file (partition_finder.cfg)
+\- A nucleotide/aa alignment in Phylip format
+.IP
+Take a look at the included 'example' folder for more details.
+.IP
+The Output: A file in the same directory as the .cfg file, named
+\&'analysis'. This file contains information on the best
+partitioning scheme, and the best model for each partition.
+.IP
+Usage Examples:
+.IP
+>python PartitionFinderProtein.py example
+Analyse what is in the 'example' sub\-folder in the current folder.
+.IP
+>python PartitionFinderProtein.py \fB\-v\fR example
+Analyse what is in the 'example' sub\-folder in the current folder, but
+show all the debug output
+.IP
+>python PartitionFinderProtein.py \fB\-c\fR \fI\,~/data/frogs\/\fP
+Check the configuration files in the folder data/frogs in the current
+user's home folder.
+.IP
+>python PartitionFinderProtein.py \fB\-\-force\-restart\fR \fI\,~/data/frogs\/\fP
+Deletes any data produced by the previous runs (which is in
+\fI\,~/data/frogs/output\/\fP) and starts afresh
+.SH OPTIONS
+.TP
+\fB\-h\fR, \fB\-\-help\fR
+show this help message and exit
+.TP
+\fB\-v\fR, \fB\-\-verbose\fR
+show debug logging information (equivalent to \fB\-\-debug\-output\fR=\fI\,all\/\fR)
+.TP
+\fB\-c\fR, \fB\-\-check\-only\fR
+just check the configuration files, don't do any
+processing
+.TP
+\fB\-f\fR, \fB\-\-force\-restart\fR
+delete all previous output and start afresh (!)
+.TP
+\fB\-p\fR N, \fB\-\-processes\fR=\fI\,N\/\fR
+Number of concurrent processes to use. Use \fB\-1\fR to match
+the number of cpus on the machine. The default is to
+use \fB\-1\fR.
+.TP
+\fB\-\-show\-python\-exceptions\fR
+If errors occur, print the python exceptions
+.TP
+\fB\-\-save\-phylofiles\fR
+save all of the phyml or raxml output. This can take a
+lot of space(!)
+.TP
+\fB\-\-dump\-results\fR
+Dump all results to a binary file. This is only of use
+for testing purposes.
+.TP
+\fB\-\-compare\-results\fR
+Compare the results to previously dumped binary
+results. This is only of use for testing purposes.
+.TP
+\fB\-q\fR, \fB\-\-quick\fR
+Avoid anything slow (like writing schemes at each
+step); useful for very large datasets.
+.TP
+\fB\-r\fR, \fB\-\-raxml\fR
+Use RAxML (rather than PhyML) to do the analysis. See
+the manual
+.TP
+\fB\-\-cmdline\-extras\fR=\fI\,N\/\fR
+Add additional commands to the phyml or raxml
+commandlines that PF uses. This can be useful e.g. if
+you want to change the accuracy of lnL calculations
+('\-e' option in raxml), or use multi\-threaded versions
+of raxml that require you to specify the number of
+threads you will let raxml use ('\-T' option in raxml).
+E.g. you might specify this: \fB\-\-cmdline\-extras\fR ' \fB\-e\fR
+2.0 \fB\-T\fR 10 '. N.B. MAKE SURE YOU PUT YOUR EXTRAS IN
+QUOTES, and only use this command if you really know
+what you're doing and are very familiar with raxml and
+PartitionFinder.
+.TP
+\fB\-\-weights\fR=\fI\,N\/\fR
+Mainly for algorithm development. Only use it if you
+know what you're doing. A list of weights to use in the
+clustering algorithms. This list allows you to assign
+different weights to: the overall rate for a subset,
+the base/amino acid frequencies, model parameters, and
+alpha value. This will affect how subsets are
+clustered together. For instance: \fB\-\-weights\fR
+\&'1, 2, 5, 1' would weight the base frequencies 2x
+more than the overall rate, the model parameters 5x
+more, and the alpha parameter the same as the overall
+rate.
+.TP
+\fB\-\-kmeans\fR=\fI\,type\/\fR
+This defines which sitewise values to use: entropy or
+tiger. \fB\-\-kmeans\fR entropy: use entropies for sitewise
+values; \fB\-\-kmeans\fR tiger: use TIGER rates for sitewise
+values.
+.TP
+\fB\-\-rcluster\-percent\fR=\fI\,N\/\fR
+This defines the proportion of possible schemes that
+the relaxed clustering algorithm will consider before
+it stops looking. The default is 10%, e.g. \fB\-\-rcluster\-percent\fR 10.0
+.TP
+\fB\-\-rcluster\-max\fR=\fI\,N\/\fR
+This defines the number of possible schemes that the
+relaxed clustering algorithm will consider before it
+stops looking. The default is to look at just the top
+1000 schemes, e.g. \fB\-\-rcluster\-max\fR 1000
+.TP
+\fB\-\-min\-subset\-size\fR=\fI\,N\/\fR
+This defines the minimum subset size that you will
+accept for the kmeans algorithm. Subsets smaller than
+this will still be created during the algorithm, but
+they will be merged with other subsets at the end of
+the algorithm, e.g. \fB\-\-min\-subset\-size\fR 100
+.TP
+\fB\-\-debug\-output\fR=\fI\,REGION\/\fR,REGION,...
+(advanced option) Provide a list of debug regions to
+output extra information about what the program is
+doing. Possible regions are 'all' or any of {subset,
+subset_ops,neighbour,raxml,parser,model_util,results,
+entropy,alignment,future_stdlib,threadpool,progress,main,
+config,pandas,reporter,kmeans,pandas.io.gbq,pandas.io,
+analysis_m,util,scheme,submodels,database,analysis,
+phyml,raxml_mode,model_load,phyml_mode}.
+.TP
+\fB\-\-all\-states\fR
+In the k\-means algorithm, only produce subsets which
+have all states represented (e.g. ACTG for DNA
+datasets).
+.TP
+\fB\-\-profile\fR
+Output profiling information after running (this will
+slow everything down!)
diff --git a/debian/control b/debian/control
index e765515..bae7a5e 100644
--- a/debian/control
+++ b/debian/control
@@ -6,6 +6,7 @@ Build-Depends: debhelper (>= 9),
                dh-python,
                python-all,
                python-numpy,
+               python-sklearn,
 Standards-Version: 3.9.6
 Homepage: https://github.com/brettc/partitionfinder
 #Vcs-Git: git://anonscm.debian.org/collab-maint/partitionfinder.git
@@ -18,6 +19,7 @@ Depends: ${shlibs:Depends},
          ${python:Depends},
          python2.7,
          python-numpy,
+         python-sklearn,
 Description: chooses partitioning schemes and models of molecular evolution for sequence data
  PartitionFinder and PartitionFinderProtein are Python programs for
  simultaneously choosing partitioning schemes and models of molecular evolution
diff --git a/debian/partitionfinder.manpages b/debian/partitionfinder.manpages
new file mode 100644
index 0000000..0f65186
--- /dev/null
+++ b/debian/partitionfinder.manpages
@@ -0,0 +1 @@
+debian/*.1

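Note on the --cmdline-extras option documented in the manpages above: the
extras have to reach PartitionFinder as one quoted argument. A minimal sketch
in Python 3 of assembling such a command line follows. The helper name
build_pf_command and its keyword arguments are hypothetical and not part of
PartitionFinder; only the flags themselves (--raxml, --processes,
--cmdline-extras) and the [options] <foldername> order come from the manpage
text. The sketch builds and prints the command; it does not run it.

```python
import shlex


def build_pf_command(folder, raxml=False, processes=None, cmdline_extras=None):
    """Assemble an argv list for PartitionFinder.py (hypothetical helper)."""
    argv = ["python", "PartitionFinder.py"]
    if raxml:
        argv.append("--raxml")
    if processes is not None:
        argv.append("--processes=%d" % processes)
    if cmdline_extras:
        # Keep the extras in ONE argv element; this is what quoting
        # achieves on a shell command line.
        argv.append("--cmdline-extras=" + cmdline_extras)
    # The synopsis places the folder name after the options.
    argv.append(folder)
    return argv


argv = build_pf_command(
    "example", raxml=True, processes=4, cmdline_extras=" -e 2.0 -T 10 "
)
# Render an equivalent, safely quoted shell command line.
shell_line = " ".join(shlex.quote(arg) for arg in argv)
print(shell_line)
```

Passing argv as a list (for instance to subprocess.run) sidesteps shell
quoting entirely; the printed shell_line shows the quoted form for
interactive use.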
-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/partitionfinder.git