[med-svn] [Git][med-team/python-dendropy][upstream] New upstream version 4.3.0+dfsg
Andreas Tille
gitlab at salsa.debian.org
Mon Feb 19 09:08:26 UTC 2018
Andreas Tille pushed to branch upstream at Debian Med / python-dendropy
Commits:
10d0e32a by Andreas Tille at 2018-02-18T20:16:55+01:00
New upstream version 4.3.0+dfsg
- - - - -
20 changed files:
- CHANGES.rst
- DendroPy.egg-info/PKG-INFO
- DendroPy.egg-info/SOURCES.txt
- PKG-INFO
- + applications/sumlabels/sumlabels.py
- applications/sumtrees/sumtrees.py
- dendropy/__init__.py
- dendropy/dataio/newickreader.py
- dendropy/dataio/nexuswriter.py
- dendropy/datamodel/charmatrixmodel.py
- dendropy/datamodel/taxonmodel.py
- dendropy/datamodel/treemodel.py
- + dendropy/interop/rspr.py
- dendropy/model/birthdeath.py
- dendropy/test/test_birthdeath.py
- dendropy/test/test_dataio_newick_reader_tree.py
- dendropy/test/test_tree_shape_kernel.py
- dendropy/utility/textprocessing.py
- setup.cfg
- setup.py
Changes:
=====================================
CHANGES.rst
=====================================
--- a/CHANGES.rst
+++ b/CHANGES.rst
@@ -1,3 +1,11 @@
+Release 4.3.0
+-------------
+
+- [SumTrees]: Important bugfix in tracking splits on consensus tree from rooted trees: previously it was possible for a split on the consensus tree to be ignored, resulting in a null (0) edge length and no (0.0) support.
+- Added ``sumlabels.py`` application.
+- BD tree (``dendropy.model.birth_death_tree``) now allows for preservation of extinct tips.
+- Improved performance of character subset export
+
Release 4.2.0
-------------
=====================================
DendroPy.egg-info/PKG-INFO
=====================================
--- a/DendroPy.egg-info/PKG-INFO
+++ b/DendroPy.egg-info/PKG-INFO
@@ -1,6 +1,6 @@
Metadata-Version: 1.1
Name: DendroPy
-Version: 4.2.0
+Version: 4.3.0
Summary: A Python library for phylogenetics and phylogenetic computing: reading, writing, simulation, processing and manipulation of phylogenetic trees (phylogenies) and characters.
Home-page: http://packages.python.org/DendroPy/
Author: Jeet Sukumaran and Mark T. Holder
@@ -115,7 +115,7 @@ Description: .. image:: https://raw.githubusercontent.com/jeetsukumaran/DendroPy
Current Release
===============
- The current release of DendroPy is version 4.2.0 (master-5051a46, 2016-12-28 13:25:19).
+ The current release of DendroPy is version 4.3.0 (master-e251bcf, 2017-07-06 21:23:19).
Keywords: phylogenetics phylogeny phylogenies phylogeography evolution evolutionary biology systematics coalescent population genetics phyloinformatics bioinformatics
@@ -131,5 +131,7 @@ Classifier: Programming Language :: Python :: 3.1
Classifier: Programming Language :: Python :: 3.2
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
+Classifier: Programming Language :: Python :: 3.5
+Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
=====================================
DendroPy.egg-info/SOURCES.txt
=====================================
--- a/DendroPy.egg-info/SOURCES.txt
+++ b/DendroPy.egg-info/SOURCES.txt
@@ -11,6 +11,7 @@ DendroPy.egg-info/entry_points.txt
DendroPy.egg-info/requires.txt
DendroPy.egg-info/top_level.txt
DendroPy.egg-info/zip-safe
+applications/sumlabels/sumlabels.py
applications/sumtrees/sumtrees.py
dendropy/__init__.py
dendropy/__main__.py
@@ -63,6 +64,7 @@ dendropy/interop/genbank.py
dendropy/interop/muscle.py
dendropy/interop/paup.py
dendropy/interop/raxml.py
+dendropy/interop/rspr.py
dendropy/interop/rstats.py
dendropy/interop/seqgen.py
dendropy/legacy/__init__.py
=====================================
PKG-INFO
=====================================
--- a/PKG-INFO
+++ b/PKG-INFO
@@ -1,6 +1,6 @@
Metadata-Version: 1.1
Name: DendroPy
-Version: 4.2.0
+Version: 4.3.0
Summary: A Python library for phylogenetics and phylogenetic computing: reading, writing, simulation, processing and manipulation of phylogenetic trees (phylogenies) and characters.
Home-page: http://packages.python.org/DendroPy/
Author: Jeet Sukumaran and Mark T. Holder
@@ -115,7 +115,7 @@ Description: .. image:: https://raw.githubusercontent.com/jeetsukumaran/DendroPy
Current Release
===============
- The current release of DendroPy is version 4.2.0 (master-5051a46, 2016-12-28 13:25:19).
+ The current release of DendroPy is version 4.3.0 (master-e251bcf, 2017-07-06 21:23:19).
Keywords: phylogenetics phylogeny phylogenies phylogeography evolution evolutionary biology systematics coalescent population genetics phyloinformatics bioinformatics
@@ -131,5 +131,7 @@ Classifier: Programming Language :: Python :: 3.1
Classifier: Programming Language :: Python :: 3.2
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
+Classifier: Programming Language :: Python :: 3.5
+Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
=====================================
applications/sumlabels/sumlabels.py
=====================================
--- /dev/null
+++ b/applications/sumlabels/sumlabels.py
@@ -0,0 +1,306 @@
+#! /usr/bin/env python
+
+##############################################################################
+## DendroPy Phylogenetic Computing Library.
+##
+## Copyright 2010 Jeet Sukumaran and Mark T. Holder.
+## All rights reserved.
+##
+## See "LICENSE.txt" for terms and conditions of usage.
+##
+## If you use this work or any portion thereof in published work,
+## please cite it as:
+##
+## Sukumaran, J. and M. T. Holder. 2010. DendroPy: a Python library
+## for phylogenetic computing. Bioinformatics 26: 1569-1571.
+##
+##############################################################################
+
+"""
+Merge labels of splits/branches from different input trees onto corresponding
+splits/branches of a single tree.
+"""
+
+import os
+import sys
+import argparse
+
+import datetime
+import time
+import socket
+try:
+ import getpass
+except:
+ pass
+import platform
+
+import dendropy
+from dendropy.utility.messaging import ConsoleMessenger
+from dendropy.utility.cli import confirm_overwrite, show_splash
+
+_program_name = "SumLabels"
+_program_subtitle = "Phylogenetic Tree Label Concatenation"
+_program_date = "Jan 20 2017"
+_program_version = "Version 2.0.0 (%s)" % _program_date
+_program_author = "Jeet Sukumaran"
+_program_contact = "jeetsukumaran at gmail.com"
+_program_copyright = "Copyright (C) 2017 Jeet Sukumaran.\n" \
+ "License GPLv3+: GNU GPL version 3 or later.\n" \
+ "This is free software: you are free to change\nand redistribute it. " \
+ "There is NO WARRANTY,\nto the extent permitted by law."
+
+def main_cli():
+
+ description = "%s %s %s" % (_program_name, _program_version, _program_subtitle)
+
+ parser = argparse.ArgumentParser(version =_program_version, description=description)
+
+ parser.add_argument(
+ "sources",
+ metavar="TREEFILE",
+ nargs="+")
+ parser.add_argument("-t","--target",
+ dest="target_tree_filepath",
+ default=None,
+ help="path to file with tree (Newick or NEXUS format) "
+ + "to which labels will be written")
+ parser.add_argument("--preserve-target-labels",
+ action="store_true",
+ dest="preserve_target_labels",
+ default=False,
+ help="keep any existing labels on target tree (by default, these will be cleared before writing the new labels)")
+ parser.add_argument("--rooted",
+ action="store_true",
+ dest="rooted_trees",
+ default=None,
+ help="treat trees as rooted")
+ parser.add_argument("--unrooted",
+ action="store_false",
+ dest="rooted_trees",
+ default=None,
+ help="treat trees as unrooted")
+ parser.add_argument("--ignore-missing-source",
+ action="store_true",
+ dest="ignore_missing_source",
+ default=False,
+ help="ignore missing source tree files (at least one must exist!)")
+ parser.add_argument("-o","--output",
+ dest="output_filepath",
+ default=None,
+ help="path to output file (if not given, will print to standard output)")
+ parser.add_argument("-s","--separator",
+ dest="separator",
+ default="/",
+ help="string to use to separate labels from different source trees (default='%(default)s')")
+ parser.add_argument("--no-taxa-block",
+ action="store_false",
+ dest="include_taxa_block",
+ default=True,
+ help="do not include a taxa block in the output treefile (otherwise will create taxa block by default)")
+ parser.add_argument("-c", "--additional-comments",
+ action="store",
+ dest="additional_comments",
+ default=None,
+ help="additional comments to be added to the summary file")
+ parser.add_argument("--to-newick",
+ action="store_true",
+ dest="to_newick_format",
+ default=False,
+ help="save results in NEWICK (PHYLIP) format (default is to save in NEXUS format)")
+ parser.add_argument("--to-phylip",
+ action="store_true",
+ dest="to_newick_format",
+ default=False,
+ help="same as --to-newick")
+ parser.add_argument("-r", "--replace",
+ action="store_true",
+ dest="replace",
+ default=False,
+ help="replace/overwrite output file without asking if it already exists ")
+ parser.add_argument("-q", "--quiet",
+ action="store_true",
+ dest="quiet",
+ default=False,
+ help="suppress ALL logging, progress and feedback messages")
+
+ args = parser.parse_args()
+ if args.quiet:
+ messaging_level = ConsoleMessenger.ERROR_MESSAGING_LEVEL
+ else:
+ messaging_level = ConsoleMessenger.INFO_MESSAGING_LEVEL
+ messenger = ConsoleMessenger(name="SumLabels", messaging_level=messaging_level)
+
+ # splash
+ if not args.quiet:
+ show_splash(prog_name=_program_name,
+ prog_subtitle=_program_subtitle,
+ prog_version=_program_version,
+ prog_author=_program_author,
+ prog_copyright=_program_copyright,
+ dest=sys.stderr,
+ )
+
+ ###################################################
+ # Source file idiot checking
+
+ source_filepaths = []
+ if len(args.sources) > 0:
+ for fpath in args.sources:
+ fpath = os.path.expanduser(os.path.expandvars(fpath))
+ if not os.path.exists(fpath):
+ if args.ignore_missing_source:
+ messenger.send_warning("Source file not found: '%s'" % fpath)
+ else:
+ messenger.error("Terminating due to missing source files. "
+ + "Use the '--ignore-missing-source' option to continue even "
+ + "if some files are missing.")
+ sys.exit(1)
+ else:
+ source_filepaths.append(fpath)
+ if len(source_filepaths) == 0:
+ messenger.error("No valid sources of input trees specified. "
+ + "Please provide the path to at least one (valid and existing) file "
+ + "containing trees")
+ sys.exit(1)
+ else:
+ messenger.info("No sources of input trees specified. "
+ + "Please provide the path to at least one (valid and existing) file "
+ + "containing tree samples to summarize. See '--help' for other options.")
+ sys.exit(1)
+
+ ###################################################
+ # Lots of other idiot-checking ...
+
+ # target tree
+ if args.target_tree_filepath is not None:
+ target_tree_filepath = os.path.expanduser(os.path.expandvars(args.target_tree_filepath))
+ if not os.path.exists(target_tree_filepath):
+ messenger.error("Target tree file not found: '%s'" % target_tree_filepath)
+ sys.exit(1)
+ else:
+ messenger.error("Target tree file not specified: use the '-t' or '--target' option to provide path to target tree")
+ sys.exit(1)
+
+ # output
+ if args.output_filepath is None:
+ output_dest = sys.stdout
+ else:
+ output_fpath = os.path.expanduser(os.path.expandvars(args.output_filepath))
+ if confirm_overwrite(filepath=output_fpath, replace_without_asking=args.replace):
+ output_dest = open(output_fpath, "w")
+ else:
+ sys.exit(1)
+
+ # taxon set to handle target trees
+ master_taxon_namespace = dendropy.TaxonNamespace()
+ is_rooted = args.rooted_trees
+ messenger.info("Reading target tree: '%s'" % target_tree_filepath)
+ target_tree = None
+ if is_rooted:
+ rooting = "force-rooted"
+ else:
+ rooting = None
+ for tree in dendropy.Tree.yield_from_files(
+ [target_tree_filepath, "rU",],
+ schema='nexus/newick',
+ taxon_namespace=master_taxon_namespace,
+ rooting=rooting):
+ target_tree = tree
+ break
+ bipartition_labels = {}
+ for src_fpath in source_filepaths:
+ messenger.info("Reading source tree(s) from: '%s'" % src_fpath)
+ for tree in dendropy.Tree.yield_from_files(
+ [src_fpath,],
+ schema='nexus/newick',
+ taxon_namespace=master_taxon_namespace,
+ rooting=rooting):
+ tree.encode_bipartitions()
+ for bipartition, edge in tree.bipartition_edge_map.items():
+ label = edge.head_node.label
+ if not label:
+ continue
+ try:
+ bipartition_labels[bipartition].append(label)
+ except KeyError:
+ bipartition_labels[bipartition] = [label]
+ messenger.info("Mapping labels")
+ target_tree.encode_bipartitions()
+ for bipartition, edge in target_tree.bipartition_edge_map.items():
+ label = []
+ if args.preserve_target_labels and edge.head_node.label:
+ label.append(edge.head_node.label)
+ elif not args.preserve_target_labels:
+ edge.head_node.label = None
+ if bipartition in bipartition_labels:
+ label.extend(bipartition_labels[bipartition])
+ else:
+ pass
+ # messenger.send_warning("Split on target tree not found in source trees: ignoring")
+ if label:
+ edge.head_node.label = args.separator.join(label)
+ output_dataset = dendropy.DataSet()
+ tree_list = output_dataset.new_tree_list(taxon_namespace=master_taxon_namespace)
+ tree_list.append(target_tree)
+ if args.to_newick_format:
+ output_dataset.write(
+ file=output_dest,
+ schema="newick",
+ suppress_rooting=False,
+ suppress_edge_lengths=False,
+ unquoted_underscores=False,
+ preserve_spaces=False,
+ store_tree_weights=False,
+ suppress_annotations=False,
+ annotations_as_nhx=False,
+ suppress_item_comments=False,
+ suppress_leaf_taxon_labels=False,
+ suppress_leaf_node_labels=True,
+ suppress_internal_taxon_labels=False,
+ suppress_internal_node_labels=False,
+ node_label_element_separator=' ',
+ )
+ else:
+ if args.include_taxa_block:
+ simple = False
+ else:
+ simple = True
+ comment = []
+ try:
+ username = getpass.getuser()
+ except:
+ username = "a user"
+ comment.append("%s %s by %s." % (_program_name, _program_version, _program_author))
+ comment.append("Using DendroPy Version %s by Jeet Sukumaran and Mark T. Holder."
+ % dendropy.__version__)
+ python_version = sys.version.replace("\n", "").replace("[", "(").replace("]",")")
+ comment.append("Running under Python %s on %s." % (python_version, sys.platform))
+ comment.append("Executed on %s by %s@%s." % (platform.node(), username, socket.gethostname()))
+ if args.additional_comments:
+ comment.append("\n")
+ comment.append(args.additional_comments)
+ output_dataset.write(
+ file=output_dest,
+ schema="nexus",
+ simple=simple,
+ file_comments=comment,
+ suppress_rooting=False,
+ unquoted_underscores=False,
+ preserve_spaces=False,
+ store_tree_weights=False,
+ suppress_annotations=False,
+ annotations_as_nhx=False,
+ suppress_item_comments=False,
+ suppress_leaf_taxon_labels=False,
+ suppress_leaf_node_labels=True,
+ suppress_internal_taxon_labels=False,
+ suppress_internal_node_labels=False,
+ node_label_element_separator=' ',
+ )
+ if not args.output_filepath:
+ pass
+ else:
+ messenger.info("Results written to: '%s'." % (output_fpath))
+
+if __name__ == '__main__':
+ main_cli()
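The label-merging step in sumlabels.py amounts to grouping labels by bipartition across the source trees and joining them with the separator on the target tree. A minimal sketch in plain Python, assuming each bipartition can be modeled as a hashable frozenset of leaf names (DendroPy's own bipartition objects are not used here; `merge_labels` and the sample data are hypothetical):

```python
def merge_labels(source_labelings, target_splits, separator="/", existing=None):
    """Collect labels per split across sources and join them per target split.

    source_labelings: iterable of dicts mapping split -> label (one per source tree)
    target_splits: iterable of splits present on the target tree
    existing: optional dict of labels already on the target tree, to be preserved
    """
    collected = {}
    for labeling in source_labelings:
        for split, label in labeling.items():
            if not label:
                continue  # unlabeled branches contribute nothing
            collected.setdefault(split, []).append(label)
    merged = {}
    for split in target_splits:
        parts = []
        if existing and existing.get(split):
            parts.append(existing[split])
        parts.extend(collected.get(split, []))
        if parts:
            merged[split] = separator.join(parts)
    return merged

ab = frozenset(["A", "B"])
cd = frozenset(["C", "D"])
sources = [{ab: "0.95", cd: ""}, {ab: "88"}]
result = merge_labels(sources, [ab, cd])
print(result[ab])  # 0.95/88
```

Splits on the target tree that no source tree labels are simply left without a label, which mirrors the commented-out warning in the script.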
=====================================
applications/sumtrees/sumtrees.py
=====================================
--- a/applications/sumtrees/sumtrees.py
+++ b/applications/sumtrees/sumtrees.py
@@ -1659,6 +1659,7 @@ def main():
else:
raise ValueError(args.summary_target)
_message_and_log(msg, wrap=True)
+ tree.encode_bipartitions()
target_trees.append(tree)
else:
try:
=====================================
dendropy/__init__.py
=====================================
--- a/dendropy/__init__.py
+++ b/dendropy/__init__.py
@@ -104,7 +104,7 @@ import collections
version_info = collections.namedtuple("dendropy_version_info",
["major", "minor", "micro", "releaselevel"])(
major=4,
- minor=2,
+ minor=3,
micro=0,
releaselevel=""
)
=====================================
dendropy/dataio/newickreader.py
=====================================
--- a/dendropy/dataio/newickreader.py
+++ b/dendropy/dataio/newickreader.py
@@ -303,7 +303,8 @@ class NewickReader(ioservice.DataReader):
taxon_symbol_map_fn=taxon_symbol_mapper.require_taxon_for_symbol)
yield tree
if tree is None:
- raise StopIteration
+ # raise StopIteration
+ return
def _read(self,
stream,
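The switch from `raise StopIteration` to `return` matters because, under PEP 479 (default behavior from Python 3.7 on), a StopIteration raised inside a generator body is converted into a RuntimeError. A small sketch of the safe pattern, independent of DendroPy (the function name and sentinel are illustrative only):

```python
def read_trees(tokens):
    """Yield items until a None sentinel is seen, then end the generator cleanly."""
    for tok in tokens:
        if tok is None:
            # A plain return ends iteration; 'raise StopIteration' here
            # would surface as RuntimeError under PEP 479.
            return
        yield tok

trees = list(read_trees(["t1", "t2", None, "t3"]))
print(trees)  # ['t1', 't2']
```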
=====================================
dendropy/dataio/nexuswriter.py
=====================================
--- a/dendropy/dataio/nexuswriter.py
+++ b/dendropy/dataio/nexuswriter.py
@@ -178,7 +178,7 @@ class NexusWriter(ioservice.DataWriter):
# and need to be removed so as not to cause problems with our keyword
# validation scheme
self.simple = kwargs.pop("simple", False)
- self.suppress_taxa_blocks = kwargs.pop("suppress_taxa_block", None)
+ self.suppress_taxa_blocks = kwargs.pop("suppress_taxa_blocks", None)
self.suppress_block_titles = kwargs.pop("suppress_block_titles", None)
self.file_comments = kwargs.pop("file_comments", [])
if self.file_comments is None:
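The one-character fix in this hunk matters because `dict.pop` with a default silently returns that default when the key is misspelled, so a caller's `suppress_taxa_blocks=True` was never picked up under the old spelling. A sketch of the failure mode with a hypothetical stand-in class (not DendroPy's NexusWriter):

```python
class SketchWriter(object):
    """Hypothetical writer that pops one option out of its keyword arguments."""

    def __init__(self, key, **kwargs):
        # Pop the option stored under `key`; fall back to None if it is absent.
        self.suppress_taxa_blocks = kwargs.pop(key, None)
        # Anything not consumed stays behind for later keyword validation.
        self.leftover = kwargs

buggy = SketchWriter("suppress_taxa_block", suppress_taxa_blocks=True)
fixed = SketchWriter("suppress_taxa_blocks", suppress_taxa_blocks=True)
print(buggy.suppress_taxa_blocks, buggy.leftover)  # None {'suppress_taxa_blocks': True}
print(fixed.suppress_taxa_blocks, fixed.leftover)  # True {}
```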
=====================================
dendropy/datamodel/charmatrixmodel.py
=====================================
--- a/dendropy/datamodel/charmatrixmodel.py
+++ b/dendropy/datamodel/charmatrixmodel.py
@@ -1606,7 +1606,7 @@ class CharacterMatrix(
# recalculated, which will require some careful and perhaps arbitrary
# handling of corner cases
clone.character_subsets = container.OrderedCaselessDict()
- # clone.clone_from(self)
+ indices = set(indices)
for vec in clone.values():
for cell_idx in range(len(vec)-1, -1, -1):
if cell_idx not in indices:
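Converting `indices` to a set makes each `cell_idx not in indices` test constant time on average, instead of a linear scan when a list is passed in. A sketch of the same filtering pattern on plain lists (the function name and data are hypothetical):

```python
def keep_columns(rows, indices):
    """Delete, in place, every cell whose column index is not in `indices`."""
    indices = set(indices)  # average O(1) membership tests
    for vec in rows:
        # Walk backwards so deletions do not shift indices still to be visited.
        for cell_idx in range(len(vec) - 1, -1, -1):
            if cell_idx not in indices:
                del vec[cell_idx]
    return rows

matrix = [list("ACGTAC"), list("TTGACA")]
keep_columns(matrix, [0, 2, 5])
print(matrix)  # [['A', 'G', 'C'], ['T', 'G', 'A']]
```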
=====================================
dendropy/datamodel/taxonmodel.py
=====================================
--- a/dendropy/datamodel/taxonmodel.py
+++ b/dendropy/datamodel/taxonmodel.py
@@ -548,6 +548,7 @@ class TaxonNamespace(
raise TypeError("TaxonNamespace() takes at most 1 non-keyword argument ({} given)".format(len(args)))
elif len(args) == 1:
# special case: construct from argument
+ basemodel.DataObject.__init__(self, label=kwargs_set_label)
other = args[0]
for i in other:
if isinstance(i, Taxon):
=====================================
dendropy/datamodel/treemodel.py
=====================================
--- a/dendropy/datamodel/treemodel.py
+++ b/dendropy/datamodel/treemodel.py
@@ -772,6 +772,9 @@ class Edge(
def __eq__(self, other):
return self is other
+ def __lt__(self, other):
+ return id(self) < id(other)
+
###########################################################################
### Basic Structure
@@ -6103,7 +6106,7 @@ class Tree(
output.write(s)
return s
- def as_python_source(self, tree_obj_name=None, tree_args=None, oids=False):
+ def as_python_source(self, tree_obj_name=None, tree_args=None):
"""
Returns string that will rebuild this tree in Python.
"""
@@ -6119,10 +6122,9 @@ class Tree(
tree_args = ""
else:
tree_args = ", " + tree_args
- p.append("%s = dendropy.Tree(label=%s%s%s)" \
+ p.append("%s = dendropy.Tree(label=%s%s)" \
% (tree_obj_name,
label,
- oid_str,
tree_args))
taxon_obj_namer = lambda x: "tax_%s" % id(x)
@@ -6132,11 +6134,11 @@ class Tree(
label = "'" + taxon.label + "'"
else:
label = "None"
- p.append("%s = %s.taxon_namespace.require_taxon(label=%s%s)" \
+ p.append("%s = %s.taxon_namespace.require_taxon(label=%s)" \
% (tobj_name,
tree_obj_name,
label,
- oid_str))
+ ))
node_obj_namer = lambda x: "nd_%s" % id(x)
for node in self.preorder_node_iter():
@@ -6153,13 +6155,13 @@ class Tree(
ct = taxon_obj_namer(child.taxon)
else:
ct = "None"
- p.append("%s = %s.new_child(label=%s, taxon=%s, edge_length=%s%s)" %
+ p.append("%s = %s.new_child(label=%s, taxon=%s, edge_length=%s)" %
(node_obj_namer(child),
nn,
label,
ct,
child.edge.length,
- oid_str))
+ ))
return "\n".join(p)
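The new `Edge.__lt__` gives edges an arbitrary but consistent ordering. Python 3 needs this whenever edges take part in a comparison, for example as tie-breakers inside sorted tuples; whether that is the case that motivated the change is an assumption. A sketch with a hypothetical stand-in class:

```python
class Edge(object):
    """Hypothetical stand-in with identity-based equality and ordering."""

    def __init__(self, length):
        self.length = length

    def __eq__(self, other):
        return self is other

    def __hash__(self):
        return id(self)

    def __lt__(self, other):
        return id(self) < id(other)

e1, e2, e3 = Edge(0.5), Edge(0.5), Edge(0.1)
# Tuples with equal first elements fall back to comparing the edges themselves.
ranked = sorted([(e.length, e) for e in (e1, e2, e3)])
print([length for length, _ in ranked])  # [0.1, 0.5, 0.5]
```

Without `__lt__`, the tie between the two edges of length 0.5 would raise a TypeError on Python 3.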
=====================================
dendropy/interop/rspr.py
=====================================
--- /dev/null
+++ b/dendropy/interop/rspr.py
@@ -0,0 +1,177 @@
+#! /usr/bin/env python
+
+##############################################################################
+## DendroPy Phylogenetic Computing Library.
+##
+## Copyright 2010-2015 Jeet Sukumaran and Mark T. Holder.
+## All rights reserved.
+##
+## See "LICENSE.rst" for terms and conditions of usage.
+##
+## If you use this work or any portion thereof in published work,
+## please cite it as:
+##
+## Sukumaran, J. and M. T. Holder. 2010. DendroPy: a Python library
+## for phylogenetic computing. Bioinformatics 26: 1569-1571.
+##
+##############################################################################
+
+"""
+Wrapper for interacting with RSPR
+"""
+
+import subprocess
+import uuid
+import tempfile
+import socket
+import random
+import os
+import sys
+
+import dendropy
+from dendropy.utility.messaging import get_logger
+from dendropy.utility import processio
+from dendropy.utility import textprocessing
+_LOG = get_logger("interop.rspr")
+
+HOSTNAME = socket.gethostname()
+PID = os.getpid()
+
+class Rspr(object):
+ """
+ This class wraps all attributes and input needed to make a call to RSPR.
+
+ https://github.com/cwhidden/rspr
+
+ RSPR:
+
+ Calculate approximate and exact Subtree Prune and Regraft (rSPR)
+ distances and the associated maximum agreement forests (MAFs) between pairs
+ of rooted binary trees from STDIN in newick format. Supports arbitrary labels.
+ The second tree may be multifurcating.
+
+ Copyright 2009-2014 Chris Whidden
+ whidden at cs.dal.ca
+ http://kiwi.cs.dal.ca/Software/RSPR
+
+ """
+ ###
+ # NOTE ON THE ``--pairwise`` FLAG
+ # -------------------------------
+ # This determines the type of comparisons done.
+ #
+ # Specified without arguments, it does all distinct pairwise comparisons of
+ # the input tree. Output by default is a matrix with only the upper half
+ # filled. So, assuming the source has 4 trees, then:
+ #
+ # $ cat 4trees.tre | rspr -pairwise
+ # 0,23,24,24
+ # ,0,5,7
+ # ,,0,6
+ #
+ # Further (numerical) arguments specify the row/columns to restrict the
+ # comparisons.
+ #
+ # E.g., first row only: first tree to all other trees:
+ #
+ # $ cat 4trees.tre | rspr -pairwise 0 1
+ # 0,23,24,24
+ #
+ # E.g. first two rows only: first two trees to all other trees:
+ #
+ # $ cat 4trees.tre | rspr -pairwise 0 2
+ # 0,23,24,24
+ # ,0,5,7
+ #
+ # E.g. third row only:
+ #
+ # $ cat 4trees.tre | rspr -pairwise 2 3
+ # ,,0,6
+ #
+ # E.g. last column only: last tree to all other trees:
+ #
+ # $ cat 4trees.tre | rspr -pairwise 0 4 3 4
+ # 24
+ # 7
+ # 6
+ # 0
+
+
+ def __init__(self,
+ algorithm="bb",
+ optimizations=None,
+ cc=None,
+ ):
+ """
+
+ Parameters
+ ----------
+ algorithm : str
+ One of "fpt", "bb", "approx".
+ optimizations: list[str] or None
+ Will be passed directly to rspr (with each element prefixed by ``-``).
+ cc: bool
+ Calculate a potentially better approximation with a quadratic time
+ algorithm.
+ """
+ self.algorithm = algorithm
+ self.optimizations = optimizations
+
+ def compare_one_to_many(self,
+ ref_tree,
+ comparison_trees,
+ command_args=None,
+ newick_output_kwargs=None,
+ ):
+ """
+
+ Compare ``ref_tree`` to each tree in ``comparison_trees``.
+
+ Parameters
+ ----------
+ ref_tree : |Tree|
+ A |Tree| object to be compared to every tree in ``comparison_trees``.
+ comparison_trees : iterable of |Tree|
+ An (ordered) iterable of trees to which ``ref_tree`` should be
+ compared.
+ command_args : list or None
+ An iterable of (string) arguments to be passed to the program.
+ newick_output_kwargs : dict or None
+ A collection of keyword arguments to pass to the tree string
+ composition routines (that will generate the tree strings to be
+ used as input to rspr).
+
+ Returns
+ -------
+ scores : list[numeric]
+ A list of the SPR distances from ``ref_tree`` to
+ ``comparison_trees``, in order of the trees given.
+ """
+ if newick_output_kwargs is None:
+ newick_output_kwargs = {}
+ # tf = tempfile.NamedTemporaryFile("w", delete=True)
+ tf = textprocessing.StringIO()
+ ref_tree.write(file=tf, schema="newick", **newick_output_kwargs)
+ for t in comparison_trees:
+ t.write(file=tf, schema="newick", **newick_output_kwargs)
+ command = []
+ command.append("rspr") # TODO: command path as instance attribute
+ command.extend(["-pairwise", "0", "1"])
+ if command_args is not None:
+ command.extend(command_args)
+ p = subprocess.Popen(command,
+ stdin=subprocess.PIPE,
+ stdout=subprocess.PIPE,
+ stderr=subprocess.PIPE,)
+ stdout, stderr = processio.communicate(p, commands=tf.getvalue())
+ result_fields = stdout.strip("\n").split(",")
+ assert len(result_fields) == 1 + len(comparison_trees), "Expecting length {} + 1 for results, but received {}: {}".format(len(comparison_trees), len(result_fields), result_fields)
+ return [int(v) for v in result_fields[1:]]
+
+
+
+
+
+
+
+
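The result handling in `compare_one_to_many` relies on `rspr -pairwise 0 1` printing a single comma-separated row whose first field is the reference tree's distance to itself. A sketch of that parsing step alone, using the sample row from the comments in the class body (the helper name is hypothetical, rspr itself is not invoked, and the output is assumed to be already decoded to `str`):

```python
def parse_first_row(stdout, num_comparison_trees):
    """Parse one row of rspr pairwise output into a list of integer distances."""
    fields = stdout.strip("\n").split(",")
    if len(fields) != 1 + num_comparison_trees:
        raise ValueError(
            "Expecting {} fields, but received {}: {}".format(
                1 + num_comparison_trees, len(fields), fields))
    # Drop the leading self-comparison of the reference tree.
    return [int(v) for v in fields[1:]]

print(parse_first_row("0,23,24,24\n", 3))  # [23, 24, 24]
```

Raising ValueError rather than using `assert` keeps the length check active when Python runs with `-O`; that is a design choice of this sketch, not of the upstream code.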
=====================================
dendropy/model/birthdeath.py
=====================================
--- a/dendropy/model/birthdeath.py
+++ b/dendropy/model/birthdeath.py
@@ -28,6 +28,7 @@ from dendropy.calculate import probability
from dendropy.utility import GLOBAL_RNG
from dendropy.utility.error import TreeSimTotalExtinctionException
from dendropy.utility import constants
+from dendropy.utility import deprecate
import dendropy
@@ -37,57 +38,107 @@ def birth_death_tree(birth_rate, death_rate, birth_rate_sd=0.0, death_rate_sd=0.
death rate specified by ``death_rate``, with edge lengths in continuous (real)
units.
- ``birth_rate_sd`` is the standard deviation of the normally-distributed mutation
- added to the birth rate as it is inherited by daughter nodes; if 0, birth
- rate does not evolve on the tree.
-
- ``death_rate_sd`` is the standard deviation of the normally-distributed mutation
- added to the death rate as it is inherited by daughter nodes; if 0, death
- rate does not evolve on the tree.
-
Tree growth is controlled by one or more of the following arguments, of which
at least one must be specified:
- - If ``ntax`` is given as a keyword argument, tree is grown until the number of
- tips == ntax.
- - If ``taxon_namespace`` is given as a keyword argument, tree is grown until the
- number of tips == len(taxon_namespace), and the taxa are assigned randomly to the
- tips.
+ - If ``num_extant_tips`` is given as a keyword argument, tree is grown until the
+ number of EXTANT tips equals this number.
+ - If ``num_extinct_tips`` is given as a keyword argument, tree is grown until the
+ number of EXTINCT tips equals this number.
+ - If ``num_total_tips`` is given as a keyword argument, tree is grown until the
+ number of EXTANT plus EXTINCT tips equals this number.
- If 'max_time' is given as a keyword argument, tree is grown for
a maximum of ``max_time``.
- If ``gsa_ntax`` is given then the tree will be simulated up to this number of
- tips (or 0 tips), then a tree will be randomly selected from the
- intervals which corresond to times at which the tree had exactly ``ntax``
- leaves (or len(taxon_namespace) tips). This allows for simulations according to
- the "General Sampling Approach" of Hartmann et al. (2010).
-
+ EXTANT tips (or 0 tips), then a tree will be randomly selected from the
+ intervals which correspond to times at which the tree had exactly ``num_extant_tips``
+ leaves. This allows for simulations according to the "General
+ Sampling Approach" of Hartmann et al. (2010). If this option is
+ specified, then ``num_extant_tips`` MUST be specified and
+ ``num_extinct_tips`` and ``num_total_tips`` CANNOT be specified.
If more than one of the above is given, then tree growth will terminate when
- *any* of the termination conditions (i.e., number of tips == ``ntax``, or number
- of tips == len(taxon_namespace) or maximum time = ``max_time``) are met.
-
- Also accepts a Tree object (with valid branch lengths) as an argument passed
- using the keyword ``tree``: if given, then this tree will be used; otherwise
- a new one will be created.
-
- If ``assign_taxa`` is False, then taxa will *not* be assigned to the tips;
- otherwise (default), taxa will be assigned. If ``taxon_namespace`` is given
- (``tree.taxon_namespace``, if ``tree`` is given), and the final number of tips on the
- tree after the termination condition is reached is less then the number of
- taxa in ``taxon_namespace`` (as will be the case, for example, when
- ``ntax`` < len(``taxon_namespace``)), then a random subset of taxa in ``taxon_namespace`` will
- be assigned to the tips of tree. If the number of tips is more than the number
- of taxa in the ``taxon_namespace``, new Taxon objects will be created and added
- to the ``taxon_namespace`` if the keyword argument ``create_required_taxa`` is not given as
- False.
+ *any* one of the termination conditions is met.
- Under some conditions, it is possible for all lineages on a tree to go extinct.
- In this case, if the keyword argument ``repeat_until_success`` is |True| (the
- default), then a new branching process is initiated.
- If |False| (default), then a TreeSimTotalExtinctionException is raised.
+ Parameters
+ ----------
- A Random() object or equivalent can be passed using the ``rng`` keyword;
- otherwise GLOBAL_RNG is used.
+ birth_rate : float
+ The birth rate.
+ death_rate : float
+ The death rate.
+ birth_rate_sd : float
+ The standard deviation of the normally-distributed mutation added to
+ the birth rate as it is inherited by daughter nodes; if 0, birth rate
+ does not evolve on the tree.
+ death_rate_sd : float
+ The standard deviation of the normally-distributed mutation added to
+ the death rate as it is inherited by daughter nodes; if 0, death rate
+ does not evolve on the tree.
+
+ Keyword Arguments
+ -----------------
+
+ num_extant_tips: int
+ If specified, branching process is terminated when number of EXTANT
+ tips equals this number.
+ num_extinct_tips: int
+ If specified, branching process is terminated when number of EXTINCT
+ tips equals this number.
+ num_total_tips: int
+ If specified, branching process is terminated when number of EXTINCT
+ plus EXTANT tips equals this number.
+ max_time: float
+ If specified, branching process is terminated when time reaches or
+ exceeds this value.
+ gsa_ntax: int
+ The General Sampling Approach threshold for number of taxa. See above
+ for details.
+ tree : Tree instance
+ If given, then this tree will be used; otherwise a new one will be created.
+ taxon_namespace : TaxonNamespace instance
+ If given, then this will be assigned to the new tree, and, in addition,
+ taxa assigned to tips will be sourced from or otherwise created with
+ reference to this.
+ is_assign_extant_taxa : bool [default: True]
+ If False, then taxa will not be assigned to extant tips. If True
+ (default), then taxa will be assigned to extant tips. Taxa will be
+ assigned from the specified ``taxon_namespace`` or
+ ``tree.taxon_namespace``. If the number of taxa required exceeds the
+ number of taxa existing in the taxon namespace, new |Taxon| objects
+ will be created as needed and added to the taxon namespace.
+ is_assign_extinct_taxa : bool [default: True]
+ If False, then taxa will not be assigned to extinct tips. If True
+ (default), then taxa will be assigned to extinct tips. Taxa will be
+ assigned from the specified ``taxon_namespace`` or
+ ``tree.taxon_namespace``. If the number of taxa required exceeds the
+ number of taxa existing in the taxon namespace, new |Taxon| objects
+ will be created as needed and added to the taxon namespace. Note that
+ this option only makes sense if extinct tips are retained (specified via
+ 'is_retain_extinct_tips' option), and will otherwise be ignored.
+ is_add_extinct_attr: bool [default: True]
+ If True (default), add a boolean attribute indicating whether or not a
+ node is an extinct tip or not. False will skip this. Name of attribute
+ is set by 'extinct_attr_name' argument, defaulting to 'is_extinct'.
+ Note that this option only makes sense if extinct tips are retained
+ (specified via 'is_retain_extinct_tips' option), and will otherwise be
+ ignored.
+ extinct_attr_name: str [default: 'is_extinct']
+ Name of attribute to add to nodes indicating whether or not tip is extinct.
+ Note that this option only makes sense if extinct tips are retained
+ (specified via 'is_retain_extinct_tips' option), and will otherwise be
+ ignored.
+ is_retain_extinct_tips : bool [default: False]
+ If True, extinct tips will be retained on tree. Defaults to False:
+ extinct lineages removed from tree.
+ repeat_until_success: bool [default: True]
+ Under some conditions, it is possible for all lineages on a tree to go
+ extinct. In this case, if this argument is given as |True| (the
+ default), then a new branching process is initiated. If |False|,
+ then a TreeSimTotalExtinctionException is raised.
+ rng: random.Random() or equivalent instance
+ A Random() object or equivalent can be passed using the ``rng`` keyword;
+ otherwise GLOBAL_RNG is used.
References
----------
@@ -95,46 +146,116 @@ def birth_death_tree(birth_rate, death_rate, birth_rate_sd=0.0, death_rate_sd=0.
Hartmann, Wong, and Stadler "Sampling Trees from Evolutionary Models" Systematic Biology. 2010. 59(4). 465-476
"""
- target_num_taxa = kwargs.get('ntax')
- max_time = kwargs.get('max_time')
- taxon_namespace = kwargs.get('taxon_namespace')
- if (target_num_taxa is None) and (taxon_namespace is not None):
- target_num_taxa = len(taxon_namespace)
- elif taxon_namespace is None:
- taxon_namespace = dendropy.TaxonNamespace()
- gsa_ntax = kwargs.get('gsa_ntax')
+ if "assign_taxa" in kwargs:
+ deprecate.dendropy_deprecation_warning(
+ message="Deprecated: 'assign_taxa' will no longer be supported as an argument to this function. Use 'is_assign_extant_taxa' and/or 'is_assign_extinct_taxa' instead",
+ stacklevel=3)
+ a = kwargs.pop("assign_taxa")
+ kwargs["is_assign_extant_taxa"] = a
+ kwargs["is_assign_extinct_taxa"] = a
+ if "ntax" in kwargs:
+ deprecate.dendropy_deprecation_warning(
+ message="Deprecated: 'ntax' is no longer supported as an argument to this function. Use one or more of the following instead: 'num_extant_tips', 'num_extinct_tips', 'num_total_tips', or 'max_time'",
+ stacklevel=3)
+ kwargs["num_extant_tips"] = kwargs.pop("ntax")
+ if (("num_extant_tips" not in kwargs)
+ and ("num_extinct_tips" not in kwargs)
+ and ("num_total_tips" not in kwargs)
+ and ("max_time" not in kwargs) ):
+ if "taxon_namespace" in kwargs:
+ ### cannot support legacy approach, b/c ``taxon_namespace`` may grow during function, leading to unpredictable behavior
+ # deprecate.dendropy_deprecation_warning(
+ # preamble="Deprecated: The 'taxon_namespace' argument can no longer be used to specify a termination condition as a side-effect. Use one or more of the following instead with the length of the taxon namespace instance as a value: 'num_extant_tips', 'num_extinct_tips', or 'num_total_tips'",
+ # old_construct="tree = birth_death_tree(\n ...\n taxon_namespace=taxon_namespace,\n ...\n)",
+ # new_construct="tree = birth_death_tree(\n ...\n taxon_namespace=taxon_namespace,\n num_extant_tips=len(taxon_namespace),\n ...\n)")
+ # kwargs["num_extant_tips"] = len(kwargs["taxon_namespace"])
+ raise ValueError("The 'taxon_namespace' argument can no longer be used to specify a termination condition as a side-effect. "
+ "Use one or more of the following instead with the length of the taxon namespace instance as a value: "
+ "'num_extant_tips', 'num_extinct_tips', or 'num_total_tips'.\n"
+ "That is, instead of:\n\n"
+ " tree = birth_death_tree(\n ...\n taxon_namespace=taxon_namespace,\n ...\n )\n\n"
+ "Use:\n\n"
+ " ntax = len(taxon_namespace)\n tree = birth_death_tree(\n ...\n taxon_namespace=taxon_namespace,\n num_extant_tips=ntax,\n ...\n )\n"
+ "\nOr (recommended):\n\n"
+ " tree = birth_death_tree(\n ...\n taxon_namespace=taxon_namespace,\n num_extant_tips=100,\n ...\n )\n"
+ "\nNote that the taxon namespace instance size may grow during any particular call of the function depending on taxon assignment/creation settings, so"
+ " for stable and predictable behavior it is important to take a snapshot of the desired taxon namespace size before any call of the function, or, better yet,"
+ " simply pass in a constant value."
+ )
+ else:
+ raise ValueError("One or more of the following must be specified: 'num_extant_tips', 'num_extinct_tips', 'num_total_tips', or 'max_time'")
+ target_num_extant_tips = kwargs.pop("num_extant_tips", None)
+ target_num_extinct_tips = kwargs.pop("num_extinct_tips", None)
+ target_num_total_tips = kwargs.pop("num_total_tips", None)
+ max_time = kwargs.pop('max_time', None)
+ gsa_ntax = kwargs.pop('gsa_ntax', None)
+ is_add_extinct_attr = kwargs.pop('is_add_extinct_attr', True)
+ extinct_attr_name = kwargs.pop('extinct_attr_name', 'is_extinct')
+ is_retain_extinct_tips = kwargs.pop('is_retain_extinct_tips', False)
+ is_assign_extant_taxa = kwargs.pop('is_assign_extant_taxa', True)
+ is_assign_extinct_taxa = kwargs.pop('is_assign_extinct_taxa', True)
+ repeat_until_success = kwargs.pop('repeat_until_success', True)
+
+ tree = kwargs.pop("tree", None)
+ taxon_namespace = kwargs.pop("taxon_namespace", None)
+
+ rng = kwargs.pop('rng', GLOBAL_RNG)
+
+ ignore_unrecognized_keyword_arguments = kwargs.pop('ignore_unrecognized_keyword_arguments', False)
+ if kwargs and not ignore_unrecognized_keyword_arguments:
+ raise ValueError("Unsupported keyword arguments: {}".format(kwargs.keys()))
+
terminate_at_full_tree = False
- if target_num_taxa is None:
- if gsa_ntax is not None:
- raise ValueError("When 'gsa_ntax' is used, either 'ntax' or 'taxon_namespace' must be used")
- if max_time is None:
- raise ValueError("At least one of the following must be specified: 'ntax', 'taxon_namespace', or 'max_time'")
- else:
- if gsa_ntax is None:
- terminate_at_full_tree = True
- gsa_ntax = 1 + target_num_taxa
- elif gsa_ntax < target_num_taxa:
- raise ValueError("gsa_ntax must be greater than target_num_taxa")
- repeat_until_success = kwargs.get('repeat_until_success', True)
- rng = kwargs.get('rng', GLOBAL_RNG)
+
+ if gsa_ntax is None:
+ terminate_at_full_tree = True
+ # gsa_ntax = 1 + target_num_taxa
+ elif target_num_extant_tips is None:
+ raise ValueError("If 'gsa_ntax' is specified, 'num_extant_tips' must be specified")
+ elif target_num_extinct_tips is not None:
+ raise ValueError("If 'gsa_ntax' is specified, 'num_extinct_tips' cannot be specified")
+ elif target_num_total_tips is not None:
+ raise ValueError("If 'gsa_ntax' is specified, 'num_total_tips' cannot be specified")
+ elif gsa_ntax < target_num_extant_tips:
+ raise ValueError("'gsa_ntax' ({}) must be greater than 'num_extant_tips' ({})".format(gsa_ntax, target_num_extant_tips))
# initialize tree
- if "tree" in kwargs:
- tree = kwargs['tree']
- if "taxon_namespace" in kwargs and kwargs['taxon_namespace'] is not tree.taxon_namespace:
- raise ValueError("Cannot specify both ``tree`` and ``taxon_namespace``")
+ if tree is not None:
+ if taxon_namespace is not None:
+ assert tree.taxon_namespace is taxon_namespace
+ else:
+ taxon_namespace = tree.taxon_namespace
+ extant_tips = set()
+ extinct_tips = set()
+ for nd in tree:
+ if not nd._child_nodes:
+ if getattr(nd, extinct_attr_name, False):
+ extinct_tips.add(nd)
+ if is_add_extinct_attr:
+ setattr(nd, extinct_attr_name, True)
+ else:
+ extant_tips.add(nd)
+ if is_add_extinct_attr:
+ setattr(nd, extinct_attr_name, False)
+ elif is_add_extinct_attr:
+ setattr(nd, extinct_attr_name, None)
else:
+ if taxon_namespace is None:
+ taxon_namespace = dendropy.TaxonNamespace()
tree = dendropy.Tree(taxon_namespace=taxon_namespace)
tree.is_rooted = True
tree.seed_node.edge.length = 0.0
tree.seed_node.birth_rate = birth_rate
tree.seed_node.death_rate = death_rate
+ if is_add_extinct_attr:
+ setattr(tree.seed_node, extinct_attr_name, False)
+ extant_tips = set([tree.seed_node])
+ extinct_tips = set()
+ initial_extant_tip_set = set(extant_tips)
+ initial_extinct_tip_set = set(extinct_tips)
- # grow tree
- leaf_nodes = tree.leaf_nodes()
- #_LOG.debug("Will generate a tree with no more than %s leaves to get a tree of %s leaves" % (str(gsa_ntax), str(target_num_taxa)))
- curr_num_leaves = len(leaf_nodes)
total_time = 0
+
# for the GSA simulations targetted_time_slices is a list of tuple
# the first element in the tuple is the duration of the amount
# that the simulation spent at the (targetted) number of taxa
@@ -142,22 +263,26 @@ def birth_death_tree(birth_rate, death_rate, birth_rate_sd=0.0, death_rate_sd=0.
# a list of terminal edges in the tree and the length for that edge
# that marks the beginning of the time slice that corresponds to the
# targetted number of taxa.
-
targetted_time_slices = []
- extinct_tips = []
+
while True:
if gsa_ntax is None:
- assert (max_time is not None)
- if total_time >= max_time:
+ if target_num_extant_tips is not None and len(extant_tips) >= target_num_extant_tips:
+ break
+ if target_num_extinct_tips is not None and len(extinct_tips) >= target_num_extinct_tips:
+ break
+ if target_num_total_tips is not None and (len(extant_tips) + len(extinct_tips)) >= target_num_total_tips:
+ break
+ if max_time is not None and total_time >= max_time:
break
- elif curr_num_leaves >= gsa_ntax:
+ elif len(extant_tips) >= gsa_ntax:
break
# get vector of birth/death probabilities, and
# associate with nodes/events
event_rates = []
event_nodes = []
- for nd in leaf_nodes:
+ for nd in extant_tips:
if not hasattr(nd, 'birth_rate'):
nd.birth_rate = birth_rate
if not hasattr(nd, 'death_rate'):
@@ -171,42 +296,38 @@ def birth_death_tree(birth_rate, death_rate, birth_rate_sd=0.0, death_rate_sd=0.
rate_of_any_event = sum(event_rates)
# waiting time based on above probability
- #_LOG.debug("rate_of_any_event = %f" % (rate_of_any_event))
waiting_time = rng.expovariate(rate_of_any_event)
- #_LOG.debug("Drew waiting time of %f from hazard parameter of %f" % (waiting_time, rate_of_any_event))
- if (gsa_ntax is not None) and (curr_num_leaves == target_num_taxa):
+ if ( (gsa_ntax is not None)
+ and (len(extant_tips) == target_num_extant_tips)
+ ):
edge_and_start_length = []
- for nd in leaf_nodes:
+ for nd in extant_tips:
e = nd.edge
edge_and_start_length.append((e, e.length))
targetted_time_slices.append((waiting_time, edge_and_start_length))
- #_LOG.debug("Recording slice with %d edges" % len(edge_and_start_length))
if terminate_at_full_tree:
break
# add waiting time to nodes
- for nd in leaf_nodes:
+ for nd in extant_tips:
try:
nd.edge.length += waiting_time
except TypeError:
nd.edge.length = waiting_time
- #_LOG.debug("Next waiting_time = %f" % waiting_time)
total_time += waiting_time
# if event occurs within time constraints
if max_time is None or total_time <= max_time:
-
# normalize probability
for i in range(len(event_rates)):
event_rates[i] = event_rates[i]/rate_of_any_event
-
# select node/event and process
nd, birth_event = probability.weighted_choice(event_nodes, event_rates, rng=rng)
- leaf_nodes.remove(nd)
- curr_num_leaves -= 1
+ extant_tips.remove(nd)
if birth_event:
- #_LOG.debug("Speciation")
+ if is_add_extinct_attr:
+ setattr(nd, extinct_attr_name, None)
c1 = nd.new_child()
c2 = nd.new_child()
c1.edge.length = 0
@@ -215,93 +336,118 @@ def birth_death_tree(birth_rate, death_rate, birth_rate_sd=0.0, death_rate_sd=0.
c1.death_rate = nd.death_rate + rng.gauss(0, death_rate_sd)
c2.birth_rate = nd.birth_rate + rng.gauss(0, birth_rate_sd)
c2.death_rate = nd.death_rate + rng.gauss(0, death_rate_sd)
- leaf_nodes.append(c1)
- leaf_nodes.append(c2)
- curr_num_leaves += 2
+ extant_tips.add(c1)
+ extant_tips.add(c2)
else:
- #_LOG.debug("Extinction")
- if curr_num_leaves > 0:
- #_LOG.debug("Will delete " + str(id(nd)) + " with parent = " + str(id(nd.parent_node)))
- extinct_tips.append(nd)
+ if len(extant_tips) > 0:
+ extinct_tips.add(nd)
+ if is_add_extinct_attr:
+ setattr(nd, extinct_attr_name, True)
else:
+ # total extinction
if (gsa_ntax is not None):
if (len(targetted_time_slices) > 0):
break
if not repeat_until_success:
raise TreeSimTotalExtinctionException()
- # We are going to basically restart the simulation because the tree has gone extinct (without reaching the specified ntax)
- leaf_nodes = [tree.seed_node]
- curr_num_leaves = 1
- for nd in tree.seed_node.child_nodes():
- tree.prune_subtree(nd, suppress_unifurcations=False)
- extinct_tips = []
+ # We are going to basically restart the simulation because
+ # the tree has gone extinct (without reaching the specified
+ # ntax)
+ extant_tips = set(initial_extant_tip_set)
+ extinct_tips = set(initial_extinct_tip_set)
+ for nd in extant_tips:
+ if is_add_extinct_attr:
+ setattr(nd, extinct_attr_name, False)
+ nd.clear_child_nodes()
total_time = 0
- assert(curr_num_leaves == len(leaf_nodes))
- #_LOG.debug("Current tree \n%s" % (tree.as_ascii_plot(plot_metric='length', show_internal_node_labels=True)))
- #tree._debug_tree_is_valid()
- #_LOG.debug("Terminated with %d leaves (%d, %d according to len(leaf_nodes))" % (curr_num_leaves, len(leaf_nodes), len(tree.leaf_nodes())))
+
if gsa_ntax is not None:
total_duration_at_target_n_tax = 0.0
for i in targetted_time_slices:
total_duration_at_target_n_tax += i[0]
r = rng.random()*total_duration_at_target_n_tax
- #_LOG.debug("Selected rng = %f out of (0, %f)" % (r, total_duration_at_target_n_tax))
selected_slice = None
for n, i in enumerate(targetted_time_slices):
r -= i[0]
if r < 0.0:
selected_slice = i
assert(selected_slice is not None)
- #_LOG.debug("Selected time slice index %d" % n)
edges_at_slice = selected_slice[1]
last_waiting_time = selected_slice[0]
+
for e, prev_length in edges_at_slice:
daughter_nd = e.head_node
for nd in daughter_nd.child_nodes():
- tree.prune_subtree(nd, suppress_unifurcations=False)
- #_LOG.debug("After pruning %s:\n%s" % (str(id(nd)), tree.as_ascii_plot(plot_metric='length', show_internal_node_labels=True)))
- try:
- extinct_tips.remove(nd)
- except:
- pass
- try:
- extinct_tips.remove(daughter_nd)
- except:
- pass
+ nd._parent_node = None
+ extinct_tips.discard(nd)
+ extant_tips.discard(nd)
+ for desc in nd.preorder_iter():
+ extinct_tips.discard(desc)
+ extant_tips.discard(desc)
+ daughter_nd.clear_child_nodes()
+ extinct_tips.discard(daughter_nd)
+ extant_tips.add(daughter_nd)
+ if is_add_extinct_attr:
+ setattr(daughter_nd, extinct_attr_name, False)
e.length = prev_length + last_waiting_time
-
-
-# tree._debug_tree_is_valid()
-# for nd in extinct_tips:
-# _LOG.debug("Will be deleting " + str(id(nd)))
-
- for nd in extinct_tips:
- bef = len(tree.leaf_nodes())
- while (nd.parent_node is not None) and (len(nd.parent_node.child_nodes()) == 1):
- nd = nd.parent_node
-# _LOG.debug("Deleting " + str(nd.__dict__) + '\n' + str(nd.edge.__dict__))
-# for n, pnd in enumerate(tree.postorder_node_iter()):
-# _LOG.debug("%d %s" % (n, repr(pnd)))
-# _LOG.debug("Before prune of %s:\n%s" % (str(id(nd)), tree.as_ascii_plot(plot_metric='length', show_internal_node_labels=True)))
- if nd.parent_node:
+ if not is_retain_extinct_tips:
+ processed_nodes = set()
+ for nd in list(extinct_tips):
+ if nd in processed_nodes:
+ continue
+ processed_nodes.add(nd)
+ extinct_tips.discard(nd)
+ assert not nd._child_nodes
+ while (nd.parent_node is not None) and (len(nd.parent_node._child_nodes) == 1):
+ nd = nd.parent_node
+ processed_nodes.add(nd)
tree.prune_subtree(nd, suppress_unifurcations=False)
-# _LOG.debug("Deleted " + str(nd.__dict__))
-# for n, pnd in enumerate(tree.postorder_node_iter()):
-# _LOG.debug("%d %s" % (n, repr(pnd)))
-# tree._debug_tree_is_valid()
tree.suppress_unifurcations()
-# tree._debug_tree_is_valid()
-# _LOG.debug("After deg2suppression:\n%s" % (tree.as_ascii_plot(plot_metric='length', show_internal_node_labels=True)))
-
- if kwargs.get("assign_taxa", True):
- tree.randomly_assign_taxa(create_required_taxa=True, rng=rng)
-
- # return
+ if is_assign_extant_taxa or is_assign_extinct_taxa:
+ taxon_pool = [t for t in taxon_namespace]
+ rng.shuffle(taxon_pool)
+ taxon_pool_labels = set([t.label for t in taxon_pool])
+
+ ### ONLY works if in GSA sub-section we remove ALL extant and
+ ### extinct nodes beyond time slice: expensive
+ ### Furthermore, main reason to use this approach is to have different
+ ### label prefixes for extinct vs. extant lineages, but the second time
+ ### this function is called with the same taxon namespace or any time
+ ### this function is called with a populated taxon namespace, that
+ ### aesthetic is lost.
+ # node_pool_labels = ("T", "X")
+ # for node_pool_idx, node_pool in enumerate((extant_tips, extinct_tips)):
+ # for node_idx, nd in enumerate(node_pool):
+ # if taxon_pool:
+ # taxon = taxon_pool.pop()
+ # else:
+ # taxon = taxon_namespace.require_taxon("{}{}".format(node_pool_labels[node_pool_idx], node_idx+1))
+ # nd.taxon = taxon
+ # assert not nd._child_nodes
+
+ tlabel_counter = 0
+ leaf_nodes = tree.leaf_nodes()
+ rng.shuffle(leaf_nodes)
+ for nd_idx, nd in enumerate(leaf_nodes):
+ if not is_assign_extant_taxa and nd in extant_tips:
+ continue
+ if not is_assign_extinct_taxa and nd in extinct_tips:
+ continue
+ if taxon_pool:
+ taxon = taxon_pool.pop()
+ else:
+ while True:
+ tlabel_counter += 1
+ label = "{}{}".format("T", tlabel_counter)
+ if label not in taxon_pool_labels:
+ break
+ taxon = taxon_namespace.require_taxon(label=label)
+ taxon_pool_labels.add(label)
+ nd.taxon = taxon
return tree
-
def discrete_birth_death_tree(birth_rate, death_rate, birth_rate_sd=0.0, death_rate_sd=0.0, **kwargs):
"""
Returns a birth-death tree with birth rate specified by ``birth_rate``, and
=====================================
dendropy/test/test_birthdeath.py
=====================================
--- a/dendropy/test/test_birthdeath.py
+++ b/dendropy/test/test_birthdeath.py
@@ -59,7 +59,7 @@ class BirthDeathTreeTest(unittest.TestCase):
"""test that the birth-death process produces the correct number of tips with GSA."""
_RNG = MockRandom()
for num_leaves in range(2, 15):
- t = birthdeath.birth_death_tree(birth_rate=1.0, death_rate=0.2, ntax=num_leaves, gsa_ntax=3*num_leaves, rng=_RNG)
+ t = birthdeath.birth_death_tree(birth_rate=1.0, death_rate=0.2, num_extant_tips=num_leaves, gsa_ntax=3*num_leaves, rng=_RNG)
self.assertTrue(t._debug_tree_is_valid())
self.assertEqual(num_leaves, len(t.leaf_nodes()))
@@ -67,7 +67,7 @@ class BirthDeathTreeTest(unittest.TestCase):
"""test that the pure-birth process produces the correct number of tips."""
_RNG = MockRandom()
for num_leaves in range(2, 20):
- t = birthdeath.birth_death_tree(birth_rate=1.0, death_rate=0.0, ntax=num_leaves, rng=_RNG)
+ t = birthdeath.birth_death_tree(birth_rate=1.0, death_rate=0.0, num_extant_tips=num_leaves, rng=_RNG)
self.assertTrue(t._debug_tree_is_valid())
self.assertEqual(num_leaves, len(t.leaf_nodes()))
@@ -75,7 +75,7 @@ class BirthDeathTreeTest(unittest.TestCase):
"""test that the pure-birth process produces the correct number of tips with GSA."""
_RNG = MockRandom()
for num_leaves in range(2, 20):
- t = birthdeath.birth_death_tree(birth_rate=1.0, death_rate=0.0, ntax=num_leaves, gsa_ntax=4*num_leaves, rng=_RNG)
+ t = birthdeath.birth_death_tree(birth_rate=1.0, death_rate=0.0, num_extant_tips=num_leaves, gsa_ntax=4*num_leaves, rng=_RNG)
self.assertTrue(t._debug_tree_is_valid())
self.assertEqual(num_leaves, len(t.leaf_nodes()))
@@ -83,7 +83,7 @@ class BirthDeathTreeTest(unittest.TestCase):
"""PureCoalescentTreeTest -- tree generation without checking [TODO: checks]"""
_RNG = MockRandom()
for num_leaves in range(2, 20):
- t = birthdeath.birth_death_tree(birth_rate=1.0, death_rate=0.2, ntax=num_leaves, rng=_RNG)
+ t = birthdeath.birth_death_tree(birth_rate=1.0, death_rate=0.2, num_extant_tips=num_leaves, rng=_RNG)
self.assertTrue(t._debug_tree_is_valid())
self.assertEqual(num_leaves, len(t.leaf_nodes()))
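The test updates above replace the legacy ``ntax`` keyword with ``num_extant_tips``. As a rough, standalone sketch of that kind of keyword migration, the mapping could look like the following. The helper name ``migrate_legacy_kwargs`` is hypothetical and not part of DendroPy, and the standard ``warnings`` module stands in here for DendroPy's own deprecation helper.

```python
import warnings

# Keywords that can terminate the simulation, as listed in the commit.
TERMINATION_KEYS = ("num_extant_tips", "num_extinct_tips", "num_total_tips", "max_time")


def migrate_legacy_kwargs(kwargs):
    """Return a copy of ``kwargs`` with legacy names mapped to the new ones.

    Hypothetical helper for illustration only; not DendroPy API.
    """
    kwargs = dict(kwargs)  # do not mutate the caller's dictionary
    if "assign_taxa" in kwargs:
        warnings.warn(
            "'assign_taxa' is deprecated; use 'is_assign_extant_taxa' "
            "and/or 'is_assign_extinct_taxa' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        value = kwargs.pop("assign_taxa")
        # The single legacy flag maps onto both new flags.
        kwargs["is_assign_extant_taxa"] = value
        kwargs["is_assign_extinct_taxa"] = value
    if "ntax" in kwargs:
        warnings.warn(
            "'ntax' is deprecated; use 'num_extant_tips' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        kwargs["num_extant_tips"] = kwargs.pop("ntax")
    if not any(key in kwargs for key in TERMINATION_KEYS):
        raise ValueError(
            "One or more of the following must be specified: "
            + ", ".join(repr(key) for key in TERMINATION_KEYS)
        )
    return kwargs
```

Copying the dictionary first keeps the caller's arguments untouched, which makes the helper easier to test in isolation.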
=====================================
dendropy/test/test_dataio_newick_reader_tree.py
=====================================
--- a/dendropy/test/test_dataio_newick_reader_tree.py
+++ b/dendropy/test/test_dataio_newick_reader_tree.py
@@ -990,9 +990,9 @@ class NewickInternalLabelAssociationTest(unittest.TestCase):
expected_label = expected_labels[int(nd.bipartition)]
if labels_to_edges:
self.assertIs(nd.label, None)
- self.assertEquals(nd.edge.label, expected_label)
+ self.assertEqual(nd.edge.label, expected_label)
else:
- self.assertEquals(nd.label, expected_label)
+ self.assertEqual(nd.label, expected_label)
self.assertIs(nd.edge.label, None)
if __name__ == "__main__":
=====================================
dendropy/test/test_tree_shape_kernel.py
=====================================
--- a/dendropy/test/test_tree_shape_kernel.py
+++ b/dendropy/test/test_tree_shape_kernel.py
@@ -183,7 +183,8 @@ class AssemblageInducedTreeManagerTestBase(unittest.TestCase):
tree = dendropy.simulate.birth_death_tree(
birth_rate=0.1,
death_rate=0.0,
- taxon_namespace=tns)
+ taxon_namespace=tns,
+ num_extant_tips=len(tns))
tree.assemblage_leaf_sets = []
tree.assemblage_classification_regime_subtrees = []
for group_id in AssemblageInducedTreeManagerTests.GROUP_IDS:
=====================================
dendropy/utility/textprocessing.py
=====================================
--- a/dendropy/utility/textprocessing.py
+++ b/dendropy/utility/textprocessing.py
@@ -37,7 +37,10 @@ except ImportError:
###############################################################################
## Unicode/String Conversions
-ENCODING = locale.getdefaultlocale()[1]
+try:
+ ENCODING = locale.getdefaultlocale()[1]
+except ValueError:
+ ENCODING = None # let default value be assigned below
if ENCODING == None:
ENCODING = 'UTF-8'
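The hunk above guards the locale lookup against ``ValueError`` and falls back to UTF-8. A minimal sketch of the same fallback pattern, written as a function so it can be exercised without depending on the host locale, might look like this. The name ``resolve_encoding`` and the injectable ``lookup`` callable are assumptions made for illustration, not part of DendroPy.

```python
def resolve_encoding(lookup, fallback="UTF-8"):
    """Return the encoding reported by ``lookup()``, or ``fallback``.

    ``lookup`` is any zero-argument callable returning an encoding name
    or None. Hypothetical helper for illustration only.
    """
    try:
        encoding = lookup()
    except ValueError:
        # Mirrors the patched code: treat a failed lookup like "no answer".
        encoding = None
    if encoding is None:
        encoding = fallback
    return encoding
```

To mirror the patched module, the callable would be something like ``lambda: locale.getdefaultlocale()[1]``. Note that ``locale.getdefaultlocale`` is deprecated in recent Python versions, so newer code may prefer a different lookup.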
=====================================
setup.cfg
=====================================
--- a/setup.cfg
+++ b/setup.cfg
@@ -9,5 +9,4 @@ upload-dir = doc/build/html
[egg_info]
tag_build =
tag_date = 0
-tag_svn_revision = 0
=====================================
setup.py
=====================================
--- a/setup.py
+++ b/setup.py
@@ -91,8 +91,8 @@ ENTRY_POINTS = {}
SCRIPT_SUBPATHS = [
['applications', 'sumtrees', 'sumtrees.py'],
+ ['applications', 'sumlabels', 'sumlabels.py'],
# ['scripts', 'sumtrees', 'cattrees.py'],
- # ['scripts', 'sumtrees', 'sumlabels.py'],
# ['scripts', 'calculators', 'strict_consensus_merge.py'],
# ['scripts', 'calculators', 'long_branch_symmdiff.py'],
]
@@ -168,6 +168,8 @@ setup(name='DendroPy',
"Programming Language :: Python :: 3.2",
"Programming Language :: Python :: 3.3",
"Programming Language :: Python :: 3.4",
+ "Programming Language :: Python :: 3.5",
+ "Programming Language :: Python :: 3.6",
"Programming Language :: Python",
"Topic :: Scientific/Engineering :: Bio-Informatics",
],
View it on GitLab: https://salsa.debian.org/med-team/python-dendropy/commit/10d0e32a2a9d9000898cadb2cf0c137a92cc0976
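For readers skimming the ``birthdeath.py`` changes: the new main loop stops as soon as any one of the requested termination conditions is met (extant tips, extinct tips, total tips, or elapsed time). The sketch below illustrates only that stopping logic. It tracks tip counts rather than building a tree, truncates at ``max_time`` instead of skipping the event, and stops on total extinction rather than restarting. The function name ``simulate_tip_counts`` is hypothetical and not part of DendroPy.

```python
import random


def simulate_tip_counts(birth_rate, death_rate, rng,
                        num_extant_tips=None, num_extinct_tips=None,
                        num_total_tips=None, max_time=None):
    """Simulate a constant-rate birth-death process on tip counts only.

    Returns (extant, extinct, total_time). Illustrative sketch; assumes
    birth_rate + death_rate > 0.
    """
    targets = (num_extant_tips, num_extinct_tips, num_total_tips, max_time)
    if all(target is None for target in targets):
        raise ValueError("At least one termination condition must be specified")
    extant, extinct, total_time = 1, 0, 0.0
    while extant > 0:  # simplification: stop (do not restart) on total extinction
        # Any satisfied condition ends the simulation.
        if num_extant_tips is not None and extant >= num_extant_tips:
            break
        if num_extinct_tips is not None and extinct >= num_extinct_tips:
            break
        if num_total_tips is not None and extant + extinct >= num_total_tips:
            break
        if max_time is not None and total_time >= max_time:
            break
        # Waiting time to the next event across all extant lineages.
        rate_of_any_event = extant * (birth_rate + death_rate)
        waiting_time = rng.expovariate(rate_of_any_event)
        if max_time is not None and total_time + waiting_time > max_time:
            total_time = max_time  # simplification: truncate at the time limit
            break
        total_time += waiting_time
        # Choose birth or death in proportion to the rates.
        if rng.random() < birth_rate / (birth_rate + death_rate):
            extant += 1
        else:
            extant -= 1
            extinct += 1
    return extant, extinct, total_time
```

A call such as ``simulate_tip_counts(1.0, 0.2, random.Random(), num_extant_tips=10)`` would stop at ten extant lineages or at total extinction, whichever comes first.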