[med-svn] [gubbins] 01/04: Imported Upstream version 1.4.2

Andreas Tille tille at debian.org
Thu Nov 19 10:31:25 UTC 2015


This is an automated email from the git hooks/post-receive script.

tille pushed a commit to branch master
in repository gubbins.

commit f64633ac5485bd24310499dcad3f5c3a5b9ec207
Author: Andreas Tille <tille at debian.org>
Date:   Thu Nov 19 10:40:38 2015 +0100

    Imported Upstream version 1.4.2
---
 .travis.yml                                        |  23 +
 INSTALL.md                                         | 252 ++++------
 README.md                                          |   5 +-
 VERSION                                            |   2 +-
 configure.ac                                       |   2 +-
 debian/changelog                                   |   6 +
 install-userspace.sh                               |   2 +-
 install_dependencies.sh                            | 111 +++++
 python/Makefile.am                                 |   2 +-
 python/gubbins/Fastml.py                           |  63 +++
 python/gubbins/__init__.py                         |   4 +-
 python/gubbins/common.py                           | 514 +++++++++++----------
 python/gubbins/tests/bin/dummy_custom_fastml2      |  62 +++
 python/gubbins/tests/bin/dummy_fastml2             |  60 +++
 python/gubbins/tests/bin/dummy_fastml3             |  51 ++
 python/gubbins/tests/data/non_bi_tree.tre.expected |   2 +-
 ....tre.filter_out_removed_taxa_from_tree_expected |   2 +-
 ...ance_tree1.tre.reroot_tree_at_midpoint_expected |   2 +-
 .../gubbins/tests/test_alignment_python_methods.py |  15 +-
 .../tests/test_converging_recombinations.py        |   2 +-
 python/gubbins/tests/test_external_dependancies.py |  39 +-
 python/gubbins/tests/test_fastml.py                |  43 ++
 python/gubbins/tests/test_string_construction.py   |   2 +-
 python/gubbins/tests/test_tree_python_methods.py   |  43 +-
 python/gubbins/tests/test_validate_fasta_input.py  |   2 +-
 .../gubbins/tests/test_validate_starting_tree.py   |   2 +-
 python/scripts/gubbins_drawer.py                   |  30 +-
 python/scripts/run_gubbins.py                      |   8 +-
 python/setup.py                                    |  10 +-
 release/manifests/trustyvm.pp                      |   7 +-
 src/Newickform.c                                   |   3 +-
 src/alignment_file.c                               |   4 +
 src/parse_vcf.c                                    |   4 +-
 tests/check_branch_sequences.c                     |   4 +-
 34 files changed, 842 insertions(+), 541 deletions(-)

diff --git a/.travis.yml b/.travis.yml
new file mode 100644
index 0000000..6b746b8
--- /dev/null
+++ b/.travis.yml
@@ -0,0 +1,23 @@
+language: python
+addons:
+  apt:
+    packages:
+    - autoconf
+    - check
+    - g++
+    - libtool
+    - pkg-config
+    - python3-dev
+    - python3-setuptools
+cache:
+  directories:
+  - "build"
+  - "$HOME/.cache/pip"
+python:
+  - "3.4"
+sudo: false
+install:
+  - "source ./install_dependencies.sh"
+  - "autoreconf -i"
+  - "./configure"
+script: "make check"
diff --git a/INSTALL.md b/INSTALL.md
index 28b77a9..274fa61 100644
--- a/INSTALL.md
+++ b/INSTALL.md
@@ -1,193 +1,111 @@
-Quick start
-=======
-Before you do anything, please have a look at the [Gubbins webpage](http://sanger-pathogens.github.io/gubbins/). It contains links to the latest precompiled binaries.
+Before you do anything, please have a look at the [Gubbins webpage](http://sanger-pathogens.github.io/gubbins/).
 
+# Installation
+There are a few ways to install Gubbins and its dependencies. The simplest way is to use HomeBrew (OSX) or LinuxBrew.
 
-Gubbins Install
-=======
+* OSX - Mavericks (10.9) & Yosemite (10.10)
+* OSX - Mountain Lion (10.8)
+* Linux - Ubuntu Trusty (14.04) & Precise (12.04)
+* Linux - CentOS 7
+* Linux - CentOS 6
+* OSX/Linux - from source
+* OSX/Linux/Windows - Virtual Machine
 
-There are a few ways to Install/Use Gubbins, with detailed instructions below:
 
-1. OSX - using homebrew
-2. Ubuntu Trusty - using apt-get
-3. Linux - using precompiled binaries or compiling from source
-4. Windows - using our virtual machine
-
-
-
-## Prep work ##
-As listed in the dependancies section below, you need
- * Python 2.7.
- * Python development headers (the python-dev package on Debian/Ubuntu).
-
-You will also need the following python packages installed, if they are not then they will be downloaded and installed automatically during the Gubbins installation process:
-
- * Biopython ( >=1.59 )
- * DendroPy ( >=3.11.1 )
- * ReportLab ( >= 2.5 )
-
-Gubbins also depends on the following pieces of software.  Depending on your installation choice these may be downloaded and installed automatically during the Gubbins installation process. 
-
-* [FastTree](http://www.microbesonline.org/fasttree/) ( >=2.1.4 )
-* [RAxML](http://sco.h-its.org/exelixis/software.html) ( >=7.2.8, the full version )
-* [FASTML](http://fastml.tau.ac.il/) ( 2.02 )
-
-If you choose to install from source, you should follow the FastTree and RAxML instructions to do so.  In the case of installing FASTML from source, we recommend using our slight modifications to the FASTML build system available at https://github.com/sanger-pathogens/fastml.
-
-## Install ##
-
-There are multiple ways to install gubbins depending on your requirements
-
-1. install system-wide from binaries,
-2. install per-user from binaries,
-3. install system-wied from source, and
-4. install per-user from source.
-
-Each of the system-wide cases assumes you have permissions to _sudo_.  The per-user cases do not make that assumption.  Please note this software does not work on Windows (and never will), only on Linux and OSX (*nix).
-
-### System-wide from binaries ###
-
-We currently only support Ubuntu 14.04 x86_64 as a system-wide binary install.  Other architectures will be added on request.
-
-Install the DendroPy dependancy:
-
-``` bash
-$ wget  http://pypi.python.org/packages/source/D/DendroPy/DendroPy-3.12.0.tar.gz
-$ tar xzvf DendroPy-3.12.0.tar.gz
-$ cd DendroPy-3.12.0
-$ sudo python setup.py install
+## OSX - Mavericks (10.9) & Yosemite (10.10)
+Install [HomeBrew](http://brew.sh/). It requires a minimum of Xcode 5.1.1 (xcodebuild -version). Then run:
 ```
-
-Then install gubbins
-
-``` bash
-$ sudo add-apt-repository ppa:ap13/gubbins
-$ sudo apt-get update
-$ sudo apt-get install fasttree raxml fastml2 gubbins
+brew tap homebrew/science
+brew install gubbins
 ```
 
-If you have your own version of the raxml binary, then you can omit it from the list of packages to install.  Many users have reported vastly increased performance by installing RAxML from source by selecting the most appropriate makefile.
-
+## OSX - Mountain Lion (10.8)
+Install [HomeBrew](http://brew.sh/). It requires a minimum of Xcode 5.1.1 (xcodebuild -version).
 
-### System-wide from binaries for other debian based systems ###
-This might work on other Debian based systems and other versions of Ubuntu, but is untested.
-
-Install the DendroPy dependancy:
-
-``` bash
-$ wget  http://pypi.python.org/packages/source/D/DendroPy/DendroPy-3.12.0.tar.gz
-$ tar xzvf DendroPy-3.12.0.tar.gz
-$ cd DendroPy-3.12.0
-$ sudo python setup.py install
+Manually install [FastML](http://fastml.tau.ac.il/source.php) and include the binary in your PATH. For example:
 ```
-
-Then install gubbins
-
-
-```bash
-echo "deb http://ppa.launchpad.net/ap13/gubbins/ubuntu trusty main" >> /etc/apt/sources.list
-echo "deb-src http://ppa.launchpad.net/ap13/gubbins/ubuntu trusty main" >> /etc/apt/sources.list
-
-sudo apt-get update
-sudo apt-get install fasttree raxml fastml2 gubbins
+wget 'http://fastml.tau.ac.il/source/FastML.v3.1.tgz'
+tar -xzf FastML.v3.1.tgz
+cd FastML.v3.1
+export PATH=${HOME}/FastML.v3.1/bin:$PATH
 ```
-
-### Per-user from binaries  ###
-
-Again, we currently only support Ubuntu 14.04 x86_64 as a binary install option
-
-Check out a version of the repository from GitHub
-
-> $ git clone https://github.com/sanger-pathogens/gubbins
-
-Run the installation script to install it in your home directory, spectifically into ~/.local.  The installation script ensures that the Python dependencies are also installed locally.
-
-> $ ./install-userspace.sh
-
-Source the .bashrc file
-
-> $ source ~/.bashrc
-
-### System-wide from source ###
-
-You must first enssure that the dependencies are installed.
-
-On a Debian/Ubuntu system
-``` bash
-$ sudo apt-get install python-biopython python-setuptools
-$ easy_install -U dendropy
+Then run:
 ```
-
-Alternatively, if you need to install the dependencies from source:
-``` bash
-$ wget https://bootstrap.pypa.io/ez_setup.py -O - | sudo python -
-$ sudo easy_install -U biopython
-$ sudo easy_install -U dendropy
+brew tap homebrew/science
+brew install python3
+brew install gubbins --without-fastml
 ```
 
-Check out a version of the repository from github and
-
-> $ git clone https://github.com/sanger-pathogens/gubbins
+## OSX - If it fails to install
+* Run 'brew doctor' and correct any errors with your Homebrew setup.
+* Make sure Xcode is 5.1.1 or greater (xcodebuild -version).
+* Install Homebrew in /usr/local. It warns 'Pick another prefix at your peril!'.
+* Run 'brew install -vd gubbins' and try to correct any errors.
 
-Run the following commands to compile and install the software:
+## Linux - Ubuntu
+Tested on Ubuntu Trusty (14.04) and Precise (12.04). Install [LinuxBrew](http://brew.sh/linuxbrew/). Then run:
 
-``` bash
-$ autoreconf -i
-$ ./configure
-$ make
-$ sudo make install
 ```
-
-### Per-user from source ###
-
-If you do not have permission to install the software as root and instead want to install it in a local user directory then the following commands can be used instead:
-
-``` bash
-$ wget https://bootstrap.pypa.io/ez_setup.py -O - | python - --user
-$ ~/.local/bin/easy_install -U --user biopython
-$ ~/.local/bin/easy_install -U --user dendropy
-$ ./configure --prefix=~/.local
-$ make
-$ make install
+sudo apt-get install gfortran
+brew tap homebrew/science
+brew install python3
+brew install gubbins
 ```
 
-Ensure that
-
-> LD_LIBRARY_PATH=~/.local/lib
-
-and
-
-> PATH=~/.local/bin:$PATH
-
-This is best achieved by adding them to ~/.bashrc in the usual manner.
-
-## Running Gubbins ##
-
-For bash users ensure you run
-
-> $ source ~/.bashrc
-
-To run the Gubbins application use:
-
-> $ run_gubbins my_alignment.fa
+## Linux - CentOS/RHEL 7
+Enable EPEL and make sure compilers are installed.
+```
+sudo yum install epel-release gcc gcc-c++ automake
+```
+Install [LinuxBrew](http://brew.sh/linuxbrew/).
+```
+brew tap homebrew/science
+brew tap homebrew/dupes	
+ln -s $(which gcc) ~/.linuxbrew/bin/gcc-4.8
+ln -s $(which g++) ~/.linuxbrew/bin/g++-4.8
+ln -s $(which gfortran) ~/.linuxbrew/bin/gfortran-4.8
+brew install ruby gpatch python3
+brew install gubbins
+```
 
-To see full usage of this script run:
+## Linux - CentOS/RHEL 6.6
+Enable EPEL and make sure compilers are installed.
+```
+sudo yum install epel-release gcc gcc-c++ automake ruby-irb
+```
+Install [LinuxBrew](http://brew.sh/linuxbrew/).
+```
+brew tap homebrew/science
+ln -s $(which gcc) ~/.linuxbrew/bin/gcc-4.4
+ln -s $(which g++) ~/.linuxbrew/bin/g++-4.4
+ln -s $(which gfortran) ~/.linuxbrew/bin/gfortran-4.4
+brew install ruby python3
+brew install gubbins
+```
 
-> $ run_gubbins -h
+## OSX/Linux - from source
+This is the most difficult method and is only suitable for someone with advanced computing skills. Please consider using HomeBrew/LinuxBrew instead.
 
+Install the dependencies and include them in your PATH:
+* [FastTree](http://www.microbesonline.org/fasttree/#Install) ( >=2.1.4 )
+* [RAxML](https://github.com/stamatak/standard-RAxML) ( >=8.0 )
+* [FASTML](http://fastml.tau.ac.il/source.php) ( >=2.02 )
+* Python modules: Biopython (> 1.59), DendroPy (>=4.0), Reportlab, nose, pillow
+* Standard build environment tools (e.g. python3, pip3, make, autoconf, libtool, gcc, check, etc...)
 
-### OSX ###
-Install the python dependancies:
+```
+autoreconf -i
+./configure
+make
+sudo make install
+```
 
-> curl https://bootstrap.pypa.io/ez_setup.py  | python - --user
-> ~/bin/easy_install -U --user biopython
-> ~/bin/easy_install -U --user dendropy
-> ~/bin/easy_install -U --user reportlab
+## OSX/Linux/Windows - Virtual Machine
+Gubbins won't run natively on Windows, but we have created a virtual machine which has all of the software set up, along with the test datasets from the paper.
+It is based on [Bio-Linux 8](http://environmentalomics.org/bio-linux/).  You need to first install [VirtualBox](https://www.virtualbox.org/), 
+then load the virtual machine, using the 'File -> Import Appliance' menu option. The root password is 'manager'.
 
-Go to the [homebrew website](http://brew.sh/) and install the homebrew package manager.
+* ftp://ftp.sanger.ac.uk/pub/pathogens/pathogens-vm/pathogens-vm.latest.ova
 
-Install fastml and gubbins from the following recipes.
+More importantly though, if you're trying to do bioinformatics on Windows, you're not going to get very far and you should seriously consider upgrading to Linux.
 
-> brew tap homebrew/science
-> brew install http://sanger-pathogens.github.io/gubbins/fastml.rb
-> brew install http://sanger-pathogens.github.io/gubbins/gubbins.rb
diff --git a/README.md b/README.md
index f1064f0..ddcceb1 100644
--- a/README.md
+++ b/README.md
@@ -2,10 +2,13 @@ Please see our website for more information on [Gubbins](http://sanger-pathogens
 
 [Croucher N. J., Page A. J., Connor T. R., Delaney A. J., Keane J. A., Bentley S. D., Parkhill J., Harris S.R.
 "Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins". doi:10.1093/nar/gku1196, Nucleic Acids Research, 2014.]
-(http://nar.oxfordjournals.org/content/early/2014/11/20/nar.gku1196.abstract)
+(http://nar.oxfordjournals.org/content/43/3/e15)
 
 Gubbins
 =======
+
+[![Build Status](https://travis-ci.org/sanger-pathogens/gubbins.svg?branch=master)](https://travis-ci.org/sanger-pathogens/gubbins)
+
 Since the introduction of high-throughput, second-generation DNA sequencing technologies, there has been an enormous increase in the size of datasets being used for estimating bacterial population phylodynamics. Although many phylogenetic techniques are scalable to hundreds of bacterial genomes, methods which have been used for mitigating the effect of mechanisms of horizontal sequence transfer on phylogenetic reconstructions cannot cope with these new datasets. Gubbins (Genealogies Unbi [...]
 
 Install
diff --git a/VERSION b/VERSION
index d0149fe..9df886c 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-1.3.4
+1.4.2
diff --git a/configure.ac b/configure.ac
index d6e3fd7..eff27a5 100644
--- a/configure.ac
+++ b/configure.ac
@@ -18,7 +18,7 @@ AX_PTHREAD
 PKG_CHECK_MODULES([zlib], [zlib])
 
 # Check for Python
-AM_PATH_PYTHON([2.0],
+AM_PATH_PYTHON([3.0],
                [],
                [AC_MSG_WARN([Python not found. Python is required to build presage python binding. Python can be obtained from http://www.python.org])])
 
diff --git a/debian/changelog b/debian/changelog
index 0de6f98..f2902ce 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+gubbins (1.3.4~trusty1) trusty; urgency=low
+
+  * Identify blocks at end of genome
+
+ -- Andrew Page <ap13 at sanger.ac.uk>  Mon, 18 Apr 2015 14:51:00 +0000
+
 gubbins (1.3.3~trusty1) trusty; urgency=low
 
   * Max window size arg bug
diff --git a/install-userspace.sh b/install-userspace.sh
index 9aa2205..8178bf1 100755
--- a/install-userspace.sh
+++ b/install-userspace.sh
@@ -13,7 +13,7 @@
 #
 
 py_pkgs=( "biopython" "dendropy" )
-deb_urls=( "http://uk.archive.ubuntu.com/ubuntu/pool/universe/r/raxml/raxml_7.2.8-2_amd64.deb" "https://launchpad.net/~ap13/+archive/ubuntu/gubbins/+files/fastml2_2.3~trusty1_amd64.deb" "https://launchpad.net/~ap13/+archive/ubuntu/gubbins/+files/gubbins_1.1.1~trusty1_amd64.deb" )
+deb_urls=( "http://uk.archive.ubuntu.com/ubuntu/pool/universe/r/raxml/raxml_7.2.8-2_amd64.deb" "https://launchpad.net/~ap13/+archive/ubuntu/gubbins/+files/fastml2_2.3~trusty1_amd64.deb" "https://launchpad.net/~ap13/+archive/ubuntu/gubbins/+files/gubbins_1.3.3~trusty1_amd64.deb" )
 
 function check_platform {
     # Ubuntu 14.04
diff --git a/install_dependencies.sh b/install_dependencies.sh
new file mode 100644
index 0000000..e8b834c
--- /dev/null
+++ b/install_dependencies.sh
@@ -0,0 +1,111 @@
+#!/bin/bash
+
+set -x
+set -e
+
+start_dir=$(pwd)
+
+RAXML_VERSION="8.1.21"
+FASTML_VERSION="2.3"
+FASTTREE_VERSION="2.1.8"
+
+RAXML_DOWNLOAD_URL="https://github.com/stamatak/standard-RAxML/archive/v${RAXML_VERSION}.tar.gz"
+FASTML_DOWNLOAD_URL="https://github.com/sanger-pathogens/fastml/archive/v${FASTML_VERSION}.tar.gz"
+FASTTREE_DOWNLOAD_URL="http://www.microbesonline.org/fasttree/FastTree-${FASTTREE_VERSION}.c"
+
+# Make an install location
+if [ ! -d 'build' ]; then
+  mkdir build
+fi
+cd build
+build_dir=$(pwd)
+
+# DOWNLOAD ALL THE THINGS
+download () {
+  url=$1
+  download_location=$2
+
+  if [ -e $download_location ]; then
+    echo "Skipping download of $url, $download_location already exists"
+  else
+    echo "Downloading $url to $download_location"
+    wget $url -O $download_location
+  fi
+}
+
+download $RAXML_DOWNLOAD_URL "raxml-${RAXML_VERSION}.tgz"
+download $FASTML_DOWNLOAD_URL "fastml-${FASTML_VERSION}.tgz"
+download $FASTTREE_DOWNLOAD_URL "fasttree-${FASTTREE_VERSION}.c"
+
+# Update dependencies
+if [ "$TRAVIS" = 'true' ]; then
+  echo "Using Travis's apt plugin"
+else
+  sudo apt-get update -q
+  sudo apt-get install -y -q autoconf \
+                             check \
+                             g++ \
+                             libtool \
+                             pkg-config \
+                             python-dev
+fi
+
+# Build all the things
+cd $build_dir
+
+## RAxML
+raxml_dir=$(pwd)/"standard-RAxML-${RAXML_VERSION}"
+if [ ! -d $raxml_dir ]; then
+  tar xzf raxml-${RAXML_VERSION}.tgz
+fi
+cd $raxml_dir
+if [ -e "${raxml_dir}/raxmlHPC" ]; then
+  echo "Already build RAxML; skipping build"
+else
+  make -f Makefile.gcc
+fi
+
+cd $build_dir
+
+## FASTML
+fastml_dir=$(pwd)/"fastml-${FASTML_VERSION}"
+if [ ! -d $fastml_dir ]; then
+  tar xzf fastml-${FASTML_VERSION}.tgz
+fi
+cd $fastml_dir
+if [ -e "${fastml_dir}/programs/fastml/fastml" ]; then
+  echo "Already build FASTML; skipping build"
+else
+  make
+fi
+
+cd $build_dir
+
+## FastTree
+fasttree_dir=${build_dir}/fasttree-${FASTTREE_VERSION}
+if [ ! -d $fasttree_dir ]; then
+  mkdir $fasttree_dir
+fi
+cd $fasttree_dir
+if [ -e "${fasttree_dir}/FastTree" ]; then
+  echo "Skipping, FastTree already exists"
+else
+  gcc -O3 -finline-functions -funroll-loops -Wall -o FastTree ${build_dir}/fasttree-${FASTTREE_VERSION}.c  -lm
+fi
+
+# Setup environment variables
+update_path () {
+  new_dir=$1
+  if [[ ! "$PATH" =~ (^|:)"${new_dir}"(:|$) ]]; then
+    export PATH=${new_dir}:${PATH}
+  fi
+}
+
+update_path ${raxml_dir}
+update_path ${fastml_dir}/programs/fastml
+update_path ${fasttree_dir}
+
+cd $start_dir
+
+set +x
+set +e
diff --git a/python/Makefile.am b/python/Makefile.am
index 98e1ea8..ecfc9dc 100644
--- a/python/Makefile.am
+++ b/python/Makefile.am
@@ -5,7 +5,7 @@ all-local:
 
 
 install-exec-local:
-	${PYTHON} setup.py install  --root=$(DESTDIR) --install-purelib=$(pythondir) --install-scripts=/usr/bin
+	${PYTHON} setup.py install
 
 uninstall-local:
 	rm -rf $(pythondir)/*gubbins*
diff --git a/python/gubbins/Fastml.py b/python/gubbins/Fastml.py
new file mode 100644
index 0000000..daf604b
--- /dev/null
+++ b/python/gubbins/Fastml.py
@@ -0,0 +1,63 @@
+import os
+import re
+import subprocess
+
+class Fastml(object):
+  def __init__(self, fastml_exec = None):
+      self.fastml_exec = fastml_exec
+      self.fastml_version = None
+      self.fastml_model = None
+      self.fastml_parameters = self.__calculate_parameters__()
+      
+  def __calculate_parameters__(self):
+      if(self.which(self.fastml_exec) == None):
+        return None
+      
+      if re.search('nucgtr', str(self.__run_without_options__())):
+          self.fastml_version = 3
+          self.fastml_model = 'g'
+          print("Using FastML 3 with GTR model\n")
+      else:
+          self.fastml_version = 2
+          
+          if re.search('General time Reversible', str(self.__run_with_fake_file__())):
+              self.fastml_model = 'g'
+              print("Using Gubbins patched FastML 2 with GTR model\n")
+          else:
+              self.fastml_model = 'n'
+              print("Using FastML 2 with Jukes Cantor model\n")
+          
+      return self.fastml_exec + " -qf -b -a 0.00001 -m"+self.fastml_model+" "
+
+
+  def __run_with_fake_file__(self):
+      
+      # Create a minimal FASTA file
+      with open('.seq.aln','w') as out:
+          out.writelines(['>1\n', 'A\n', '>2\n', 'A\n'])  # newlines needed for a valid FASTA record
+      
+      cmd = self.fastml_exec + " -qf -b -a 0.00001 -mg -s .seq.aln -t doesnt_exist.tre"
+      output = subprocess.Popen(cmd, stdout = subprocess.PIPE, shell=True).communicate()[0]
+      os.remove('.seq.aln')
+      return output
+      
+  def __run_without_options__(self):
+      return subprocess.Popen(self.fastml_exec, stdout = subprocess.PIPE, shell=True).communicate()[0]
+      
+  def which(self,program):
+      executable = program.split(" ")
+      program = executable[0]
+      def is_exe(fpath):
+        return os.path.isfile(fpath) and os.access(fpath, os.X_OK)
+      fpath, fname = os.path.split(program)
+      if fpath:
+        if is_exe(program):
+          return program
+      else:
+        for path in os.environ["PATH"].split(os.pathsep):
+          exe_file = os.path.join(path, program)
+          if is_exe(exe_file):
+            return exe_file
+    
+      return None
+        
\ No newline at end of file
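For reference, a minimal sketch of how this new wrapper is used (common.py below sets FASTML_EXEC from Fastml('fastml').fastml_parameters); it assumes the gubbins package is importable and that a fastml executable, or one of the dummy test scripts added under python/gubbins/tests/bin, is on the PATH:

```python
# Probe whichever 'fastml' binary is on PATH and report what was detected.
from gubbins.Fastml import Fastml

fastml = Fastml('fastml')
print(fastml.fastml_version)     # 3, 2, or None if no executable was found
print(fastml.fastml_model)       # 'g' (GTR) or 'n' (Jukes-Cantor)
print(fastml.fastml_parameters)  # e.g. "fastml -qf -b -a 0.00001 -mg "
```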
diff --git a/python/gubbins/__init__.py b/python/gubbins/__init__.py
index d605da8..b1face2 100644
--- a/python/gubbins/__init__.py
+++ b/python/gubbins/__init__.py
@@ -1,4 +1,4 @@
-#! /usr/bin/env python
+#! /usr/bin/env python3
 
 """
 Imports into the `gubbins` namespace all fundamental
@@ -13,7 +13,7 @@ import os
 ## Populate the 'gubbins' namespace
 
 from gubbins import common
-
+from gubbins import Fastml
 
 ###############################################################################
 ## PACKAGE METADATA
diff --git a/python/gubbins/common.py b/python/gubbins/common.py
index df3b334..48f8c06 100644
--- a/python/gubbins/common.py
+++ b/python/gubbins/common.py
@@ -22,10 +22,11 @@ from Bio import Phylo
 from Bio import SeqIO
 from Bio.Align import MultipleSeqAlignment
 from Bio.Seq import Seq
-from cStringIO import StringIO
+from io import StringIO
 from collections import Counter
 import argparse
 import dendropy
+from dendropy.calculate import treecompare
 import math
 import os
 import re
@@ -34,6 +35,7 @@ import subprocess
 import sys
 import tempfile
 import time
+from gubbins.Fastml import Fastml
 
 class GubbinsError(Exception):
   def __init__(self, value,message):
@@ -53,55 +55,55 @@ class GubbinsCommon():
       Phylo.read(starting_tree, 'newick')
       tree  = dendropy.Tree.get_from_path(starting_tree, 'newick', preserve_underscores=True)
     except:
-      print "Error with the input starting tree: Is it a valid Newick file?"
+      print("Error with the input starting tree: Is it a valid Newick file?")
       return 0
     return 1
     
   @staticmethod
   def do_the_names_match_the_fasta_file(starting_tree, alignment_filename):
-    input_handle  = open(alignment_filename, "rU")
-    alignments = AlignIO.parse(input_handle, "fasta")
-    sequence_names = {}
-    for alignment in alignments:
-        for record in alignment:
-            sequence_names[record.name] = 1
-    input_handle.close()
-    
-    tree = dendropy.Tree.get_from_path(starting_tree, 'newick', preserve_underscores=True)
-    
-    leaf_nodes = tree.leaf_nodes()
-    for i,lf in enumerate(leaf_nodes):
-      if not leaf_nodes[i].taxon.label in sequence_names:
-        print "Error: A taxon referenced in the starting tree isnt found in the input fasta file"
-        return 0
-
+    with open(alignment_filename, "r") as input_handle:
+      alignments = AlignIO.parse(input_handle, "fasta")
+      sequence_names = {}
+      for alignment in alignments:
+          for record in alignment:
+              sequence_names[record.name] = 1
+      input_handle.close()
+      
+      tree = dendropy.Tree.get_from_path(starting_tree, 'newick', preserve_underscores=True)
+      
+      leaf_nodes = tree.leaf_nodes()
+      for i,lf in enumerate(leaf_nodes):
+        if not leaf_nodes[i].taxon.label in sequence_names:
+          print("Error: A taxon referenced in the starting tree isnt found in the input fasta file")
+          return 0
+      
     return 1
 
 
   @staticmethod
   def does_fasta_contain_variation(alignment_filename):
-    input_handle  = open(alignment_filename, "rU")
-    alignments = AlignIO.parse(input_handle, "fasta")
-    first_sequence = ""
-
-    for index, alignment in enumerate(alignments):
-      for record_index, record in enumerate(alignment):
-
-        if record_index == 0:
-          first_sequence = record.seq
-
-        if str(record.seq) != str(first_sequence):
-          input_handle.close()
-          return 1
-          
-    input_handle.close()
+    with open(alignment_filename, "r") as input_handle:
+      alignments = AlignIO.parse(input_handle, "fasta")
+      first_sequence = ""
+      
+      for index, alignment in enumerate(alignments):
+        for record_index, record in enumerate(alignment):
+      
+          if record_index == 0:
+            first_sequence = record.seq
+      
+          if str(record.seq) != str(first_sequence):
+            input_handle.close()
+            return 1
+            
+      input_handle.close()
     return 0
 
 
   @staticmethod
   def does_file_exist(alignment_filename, file_type_msg):
     if(not os.path.exists(alignment_filename)):
-      print GubbinsError('','Cannot access the input '+file_type_msg+'. Check its been entered correctly')
+      print(GubbinsError('','Cannot access the input '+file_type_msg+'. Check its been entered correctly'))
       return 0
     return 1
  
@@ -135,7 +137,7 @@ class GubbinsCommon():
     if self.args.threads == 1 and raxml_executable == "":
       self.args.threads = 2
       raxml_executables = ['raxmlHPC-PTHREADS-AVX','raxmlHPC-PTHREADS-SSE3','raxmlHPC-PTHREADS']
-      print "Trying PTHREADS version of raxml because no single threaded version of raxml could be found. Just to warn you, this requires 2 threads.\n"
+      print("Trying PTHREADS version of raxml because no single threaded version of raxml could be found. Just to warn you, this requires 2 threads.\n")
       raxml_executable = GubbinsCommon.choose_executable(raxml_executables)
     
     RAXML_EXEC = raxml_executable+' -f d -p 1 -m GTRGAMMA'
@@ -147,7 +149,8 @@ class GubbinsCommon():
     
     FASTTREE_PARAMS = '-nosupport -gtr -gamma -nt'
     GUBBINS_EXEC = 'gubbins'
-    FASTML_EXEC = 'fastml -mg -qf -b '
+
+    FASTML_EXEC = Fastml('fastml').fastml_parameters
 
     GUBBINS_BUNDLED_EXEC = '../src/gubbins'
 
@@ -183,7 +186,7 @@ class GubbinsCommon():
     if self.args.use_time_stamp > 0:
       current_time = str(int(time.time()))+'.'
       if self.args.verbose > 0:
-        print current_time
+        print(current_time)
 
     # get the base filename
     (base_directory,base_filename) = os.path.split(self.args.alignment_filename)
@@ -202,17 +205,20 @@ class GubbinsCommon():
     (base_directory,base_filename) = os.path.split(self.args.alignment_filename)
     (base_filename_without_ext,extension) = os.path.splitext(base_filename)
     starting_base_filename = base_filename
+    
+    if len(base_filename) > 115:
+        sys.exit("Your filename is too long for RAxML at "+ str(len(base_filename))+ " characters, please shorten it to less than 115 characters")
 
     # find all snp sites
     if self.args.verbose > 0:
-      print GUBBINS_EXEC +" "+ self.args.alignment_filename
+      print(GUBBINS_EXEC +" "+ self.args.alignment_filename)
     try:
       subprocess.check_call([GUBBINS_EXEC, self.args.alignment_filename])
     except:
       sys.exit("Gubbins crashed, please ensure you have enough free memory")
       
     if self.args.verbose > 0:
-      print int(time.time())
+      print(int(time.time()))
 
     GubbinsCommon.reconvert_fasta_file(starting_base_filename+".gaps.snp_sites.aln",starting_base_filename+".start")
 
@@ -281,7 +287,7 @@ class GubbinsCommon():
         gubbins_command       = GubbinsCommon.fasttree_gubbins_command(base_filename,starting_base_filename+".gaps", i,self.args.alignment_filename,GUBBINS_EXEC,self.args.min_snps,self.args.alignment_filename, self.args.min_window_size,self.args.max_window_size)
 
       if self.args.verbose > 0:
-        print tree_building_command
+        print(tree_building_command)
 
 
       if self.args.starting_tree is not None and i == 1:
@@ -293,13 +299,13 @@ class GubbinsCommon():
           sys.exit("Failed while building the tree.")
 
       if self.args.verbose > 0:
-        print int(time.time())
+        print(int(time.time()))
 
       GubbinsCommon.reroot_tree(str(current_tree_name), self.args.outgroup)
 
       fastml_command_suffix = ' > /dev/null 2>&1'
       if self.args.verbose > 0:
-        print fastml_command
+        print(fastml_command)
         fastml_command_suffix = ''
 
 
@@ -317,16 +323,16 @@ class GubbinsCommon():
 
 
       if self.args.verbose > 0:
-        print int(time.time())
+        print(int(time.time()))
 
       if self.args.verbose > 0:
-        print gubbins_command
+        print(gubbins_command)
       try:
         subprocess.check_call(gubbins_command, shell=True)
       except:
         sys.exit("Failed while running Gubbins. Please ensure you have enough free memory")
       if self.args.verbose > 0:
-        print int(time.time())
+        print(int(time.time()))
 
       tree_file_names.append(current_tree_name)
       if i > 2:
@@ -335,12 +341,12 @@ class GubbinsCommon():
           
           if GubbinsCommon.have_recombinations_been_seen_before(current_recomb_file,previous_recomb_files):
             if self.args.verbose > 0:
-              print "Recombinations observed before so stopping: "+ str(current_tree_name)
+              print("Recombinations observed before so stopping: "+ str(current_tree_name))
             break
         else:
           if GubbinsCommon.has_tree_been_seen_before(tree_file_names,self.args.converge_method):
             if self.args.verbose > 0:
-              print "Tree observed before so stopping: "+ str(current_tree_name)
+              print("Tree observed before so stopping: "+ str(current_tree_name))
             break
 
     # cleanup intermediate files
@@ -381,15 +387,21 @@ class GubbinsCommon():
   
   @staticmethod
   def robinson_foulds_distance(input_tree_name,output_tree_name):
-    input_tree  = dendropy.Tree.get_from_path(input_tree_name, 'newick')
-    output_tree = dendropy.Tree.get_from_path(output_tree_name, 'newick')
-    return input_tree.robinson_foulds_distance(output_tree)
+    tns = dendropy.TaxonNamespace()
+    input_tree  = dendropy.Tree.get_from_path(input_tree_name, 'newick',taxon_namespace=tns)
+    output_tree = dendropy.Tree.get_from_path(output_tree_name, 'newick',taxon_namespace=tns)
+    input_tree.encode_bipartitions()
+    output_tree.encode_bipartitions()
+    return dendropy.calculate.treecompare.weighted_robinson_foulds_distance(input_tree, output_tree)
     
   @staticmethod
   def symmetric_difference(input_tree_name,output_tree_name):
-    input_tree  = dendropy.Tree.get_from_path(input_tree_name, 'newick')
-    output_tree = dendropy.Tree.get_from_path(output_tree_name, 'newick')
-    return input_tree.symmetric_difference(output_tree)
+    tns = dendropy.TaxonNamespace()
+    input_tree  = dendropy.Tree.get_from_path(input_tree_name, 'newick',taxon_namespace=tns)
+    output_tree = dendropy.Tree.get_from_path(output_tree_name, 'newick',taxon_namespace=tns)
+    input_tree.encode_bipartitions()
+    output_tree.encode_bipartitions()
+    return dendropy.calculate.treecompare.symmetric_difference(input_tree,output_tree)
     
   @staticmethod
   def has_tree_been_seen_before(tree_file_names,converge_method):
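The hunk above ports robinson_foulds_distance and symmetric_difference to the DendroPy 4 API (a shared TaxonNamespace, explicit bipartition encoding, and the treecompare module). A standalone sketch of that API, using throwaway example trees rather than Gubbins output:

```python
# Compare two toy four-taxon trees with DendroPy 4's treecompare module.
import dendropy
from dendropy.calculate import treecompare

tns = dendropy.TaxonNamespace()  # both trees must share a taxon namespace
t1 = dendropy.Tree.get(data="((A:1,B:1):1,(C:1,D:1):1);", schema="newick", taxon_namespace=tns)
t2 = dendropy.Tree.get(data="((A:1,C:1):1,(B:1,D:1):1);", schema="newick", taxon_namespace=tns)
t1.encode_bipartitions()
t2.encode_bipartitions()

print(treecompare.symmetric_difference(t1, t2))               # unweighted Robinson-Foulds
print(treecompare.weighted_robinson_foulds_distance(t1, t2))  # branch-length weighted
```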
@@ -422,11 +434,11 @@ class GubbinsCommon():
     tree  = dendropy.Tree.get_from_path(tree_name, 'newick',
               preserve_underscores=True)
     tree.deroot()
-    tree.update_splits()
+    tree.update_bipartitions()
     
     for leaf_node in tree.mrca(taxon_labels=outgroups).leaf_nodes():
       if leaf_node.taxon.label not in outgroups:
-        print "Your outgroups do not form a clade.\n  Using the first taxon "+str(outgroups[0])+" as the outgroup.\n  Taxon "+str(leaf_node.taxon.label)+" is in the clade but not in your list of outgroups."
+        print("Your outgroups do not form a clade.\n  Using the first taxon "+str(outgroups[0])+" as the outgroup.\n  Taxon "+str(leaf_node.taxon.label)+" is in the clade but not in your list of outgroups.")
         return [outgroups[0]]
     
     return outgroups
@@ -451,10 +463,9 @@ class GubbinsCommon():
     tree  = dendropy.Tree.get_from_path(tree_name, 'newick',
               preserve_underscores=True)
     tree.deroot()
-    tree.update_splits()
+    tree.update_bipartitions()
     output_tree_string = tree.as_string(
-      'newick',
-      taxon_set=None,
+      schema='newick',
       suppress_leaf_taxon_labels=False,
       suppress_leaf_node_labels=True,
       suppress_internal_taxon_labels=False,
@@ -467,12 +478,11 @@ class GubbinsCommon():
       suppress_annotations=True,
       annotations_as_nhx=False,
       suppress_item_comments=True,
-      node_label_element_separator=' ',
-      node_label_compose_func=None
+      node_label_element_separator=' '
       )
-    output_file = open(tree_name, 'w+')
-    output_file.write(output_tree_string.replace('\'', ''))
-    output_file.closed
+    with open(tree_name, 'w+') as output_file:
+      output_file.write(output_tree_string.replace('\'', ''))
+      output_file.closed
 
   @staticmethod
   def split_all_non_bi_nodes(node):
@@ -504,12 +514,11 @@ class GubbinsCommon():
               preserve_underscores=True)
     GubbinsCommon.split_all_non_bi_nodes(tree.seed_node)
 
-    tree.reroot_at_midpoint(update_splits=True, delete_outdegree_one=False)
+    tree.update_bipartitions()
     tree.deroot()
-    tree.update_splits()
+    tree.update_bipartitions()
     output_tree_string = tree.as_string(
-      'newick',
-      taxon_set=None,
+      schema='newick',
       suppress_leaf_taxon_labels=False,
       suppress_leaf_node_labels=True,
       suppress_internal_taxon_labels=False,
@@ -522,12 +531,11 @@ class GubbinsCommon():
       suppress_annotations=True,
       annotations_as_nhx=False,
       suppress_item_comments=True,
-      node_label_element_separator=' ',
-      node_label_compose_func=None
+      node_label_element_separator=' '
       )
-    output_file = open(tree_name, 'w+')
-    output_file.write(output_tree_string.replace('\'', ''))
-    output_file.closed
+    with open(tree_name, 'w+') as output_file:
+      output_file.write(output_tree_string.replace('\'', ''))
+      output_file.closed
 
   @staticmethod
   def raxml_base_name(base_filename_without_ext,current_time):
@@ -610,8 +618,7 @@ class GubbinsCommon():
     tree  = dendropy.Tree.get_from_path(input_filename, 'newick', preserve_underscores=True)
 
     output_tree_string = tree.as_string(
-      'newick',
-      taxon_set=None,
+      schema='newick',
       suppress_leaf_taxon_labels=False,
       suppress_leaf_node_labels=True,
       suppress_internal_taxon_labels=True,
@@ -624,12 +631,11 @@ class GubbinsCommon():
       suppress_annotations=True,
       annotations_as_nhx=False,
       suppress_item_comments=True,
-      node_label_element_separator=' ',
-      node_label_compose_func=None
+      node_label_element_separator=' '
       )
-    output_file = open(output_filename, 'w+')
-    output_file.write(output_tree_string.replace('\'', ''))
-    output_file.closed
+    with open(output_filename, 'w+') as output_file:
+      output_file.write(output_tree_string.replace('\'', ''))
+      output_file.closed
 
   @staticmethod
   def translation_of_fasttree_filenames_to_final_filenames(starting_base_filename, max_intermediate_iteration, output_prefix):
@@ -724,26 +730,26 @@ class GubbinsCommon():
   @staticmethod
   def get_sequence_names_from_alignment(filename):
     sequence_names = []
-    handle = open(filename, "rU")
-    for record in SeqIO.parse(handle, "fasta") :
-      sequence_names.append(record.id)
-    handle.close()
+    with open(filename, "r") as handle:
+      for record in SeqIO.parse(handle, "fasta") :
+        sequence_names.append(record.id)
+      handle.close()
     return sequence_names
 
   @staticmethod
   def is_input_fasta_file_valid(input_filename):
     try:
       if GubbinsCommon.does_each_sequence_have_the_same_length(input_filename) == 0:
-        print "Each sequence must be the same length"
+        print("Each sequence must be the same length")
         return 0
       if GubbinsCommon.are_sequence_names_unique(input_filename) == 0:
-        print "All sequence names in the fasta file must be unique"
+        print("All sequence names in the fasta file must be unique")
         return 0
       if GubbinsCommon.does_each_sequence_have_a_name_and_genomic_data(input_filename) == 0:
-        print "Each sequence must have a name and some genomic data"
+        print("Each sequence must have a name and some genomic data")
         return 0
       if GubbinsCommon.does_fasta_contain_variation(input_filename) == 0:
-        print "All of the input sequences contain the same data"
+        print("All of the input sequences contain the same data")
         return 0
     except:
       return 0
@@ -765,93 +771,94 @@ class GubbinsCommon():
 
   @staticmethod
   def does_each_sequence_have_a_name_and_genomic_data(input_filename):
-    input_handle  = open(input_filename, "rU")
-    alignments = AlignIO.parse(input_handle, "fasta")
-    number_of_sequences = 0
-    for alignment in alignments:
-        for record in alignment:
-            number_of_sequences +=1
-            if record.name is None or record.name == "":
-              print "Error with the input FASTA file: One of the sequence names is blank"
-              return 0
-            if record.seq is None or record.seq == "":
-              print "Error with the input FASTA file: One of the sequences is empty"
-              return 0
-            if re.search('[^ACGTNacgtn-]', str(record.seq))  != None:
-              print "Error with the input FASTA file: One of the sequences contains odd characters, only ACGTNacgtn- are permitted"
-              return 0
-    if number_of_sequences <= 3:
-      print "Error with input FASTA file: you need more than 3 sequences to build a meaningful tree"
-      return 0
-    input_handle.close()
+    with open(input_filename, "r") as input_handle:
+      alignments = AlignIO.parse(input_handle, "fasta")
+      number_of_sequences = 0
+      for alignment in alignments:
+          for record in alignment:
+              number_of_sequences +=1
+              if record.name is None or record.name == "":
+                print("Error with the input FASTA file: One of the sequence names is blank")
+                return 0
+              if record.seq is None or record.seq == "":
+                print("Error with the input FASTA file: One of the sequences is empty")
+                return 0
+              if re.search('[^ACGTNacgtn-]', str(record.seq))  != None:
+                print("Error with the input FASTA file: One of the sequences contains odd characters, only ACGTNacgtn- are permitted")
+                return 0
+      if number_of_sequences <= 3:
+        print("Error with input FASTA file: you need more than 3 sequences to build a meaningful tree")
+        return 0
+      input_handle.close()
     return 1
     
     
   @staticmethod
   def does_each_sequence_have_the_same_length(input_filename):
     try:
-      input_handle  = open(input_filename, "rU")
-      alignments = AlignIO.parse(input_handle, "fasta")
-      sequence_length = -1
-      for alignment in alignments:
-          for record in alignment:
-             if sequence_length == -1:
-               sequence_length = len(record.seq)
-             elif sequence_length != len(record.seq):
-               print "Error with the input FASTA file: The sequences dont have the same lengths this isnt an alignment: "+record.name
-               return 0
-      input_handle.close()
+      with open(input_filename) as input_handle:
+      
+        alignments = AlignIO.parse(input_handle, "fasta")
+        sequence_length = -1
+        for alignment in alignments:
+            for record in alignment:
+               if sequence_length == -1:
+                 sequence_length = len(record.seq)
+               elif sequence_length != len(record.seq):
+                 print("Error with the input FASTA file: The sequences dont have the same lengths this isnt an alignment: "+record.name)
+                 return 0
+        input_handle.close()
     except:
-      print "Error with the input FASTA file: It is in the wrong format so check its an alignment"
+      print("Error with the input FASTA file: It is in the wrong format so check its an alignment")
       return 0
     return 1
 
   @staticmethod
   def are_sequence_names_unique(input_filename):
-    input_handle  = open(input_filename, "rU")
-    alignments = AlignIO.parse(input_handle, "fasta")
-    sequence_names = []
-    for alignment in alignments:
-        for record in alignment:
-            sequence_names.append(record.name)
-            
-    if [k for k,v in Counter(sequence_names).items() if v>1] != []:
-      return 0
-    input_handle.close()
+    with open(input_filename) as input_handle:
+      alignments = AlignIO.parse(input_handle, "fasta")
+      sequence_names = []
+      for alignment in alignments:
+          for record in alignment:
+              sequence_names.append(record.name)
+              
+      if [k for k,v in list(Counter(sequence_names).items()) if v>1] != []:
+        return 0
+      input_handle.close()
     return 1
 
   @staticmethod
   def filter_out_alignments_with_too_much_missing_data(input_filename, output_filename, filter_percentage,verbose):
-    input_handle  = open(input_filename, "rU")
-    output_handle = open(output_filename, "w+")
-    alignments = AlignIO.parse(input_handle, "fasta")
-    output_alignments = []
-    taxa_removed = []
-    number_of_included_alignments = 0
-    for alignment in alignments:
-        for record in alignment:
-          number_of_gaps = 0
-          number_of_gaps += record.seq.count('n')
-          number_of_gaps += record.seq.count('N')
-          number_of_gaps += record.seq.count('-')
-          sequence_length = len(record.seq)
-
-          if sequence_length == 0:
-            taxa_removed.append(record.id)
-            print "Excluded sequence " + record.id + " because there werent enough bases in it"
-          elif((number_of_gaps*100/sequence_length) <= filter_percentage):
-            output_alignments.append(record)
-            number_of_included_alignments += 1
-          else:
-            taxa_removed.append(record.id)
-            print "Excluded sequence " + record.id + " because it had " + str(number_of_gaps*100/sequence_length) +" percentage gaps while a maximum of "+ str(filter_percentage) +" is allowed"
-
-    if number_of_included_alignments <= 1:
-      sys.exit("Too many sequences have been excluded so theres no data left to work with. Please increase the -f parameter")
-
-    AlignIO.write(MultipleSeqAlignment(output_alignments), output_handle, "fasta")
-    output_handle.close()
-    input_handle.close()
+    with open(input_filename) as input_handle:
+      with open(output_filename, "w+") as output_handle:
+        alignments = AlignIO.parse(input_handle, "fasta")
+        output_alignments = []
+        taxa_removed = []
+        number_of_included_alignments = 0
+        for alignment in alignments:
+            for record in alignment:
+              number_of_gaps = 0
+              number_of_gaps += record.seq.count('n')
+              number_of_gaps += record.seq.count('N')
+              number_of_gaps += record.seq.count('-')
+              sequence_length = len(record.seq)
+        
+              if sequence_length == 0:
+                taxa_removed.append(record.id)
+                print("Excluded sequence " + record.id + " because there werent enough bases in it")
+              elif((number_of_gaps*100/sequence_length) <= filter_percentage):
+                output_alignments.append(record)
+                number_of_included_alignments += 1
+              else:
+                taxa_removed.append(record.id)
+                print("Excluded sequence " + record.id + " because it had " + str(number_of_gaps*100/sequence_length) +" percentage gaps while a maximum of "+ str(filter_percentage) +" is allowed")
+        
+        if number_of_included_alignments <= 1:
+          sys.exit("Too many sequences have been excluded so theres no data left to work with. Please increase the -f parameter")
+        
+        AlignIO.write(MultipleSeqAlignment(output_alignments), output_handle, "fasta")
+        output_handle.close()
+      input_handle.close()
     return taxa_removed
 
   @staticmethod
@@ -865,12 +872,11 @@ class GubbinsCommon():
     tree  = dendropy.Tree.get_from_path(starting_tree, 'newick',
               preserve_underscores=True)
 
-    tree.prune_taxa_with_labels(taxa_removed, update_splits=True, delete_outdegree_one=False)          
-    tree.prune_leaves_without_taxa(update_splits=True, delete_outdegree_one=False)
+    tree.prune_taxa_with_labels(taxa_removed, update_bipartitions=True)          
+    tree.prune_leaves_without_taxa(update_bipartitions=True)
     tree.deroot()
     output_tree_string = tree.as_string(
-      'newick',
-      taxon_set=None,
+      schema='newick',
       suppress_leaf_taxon_labels=False,
       suppress_leaf_node_labels=True,
       suppress_internal_taxon_labels=True,
@@ -883,12 +889,11 @@ class GubbinsCommon():
       suppress_annotations=True,
       annotations_as_nhx=False,
       suppress_item_comments=True,
-      node_label_element_separator=' ',
-      node_label_compose_func=None
+      node_label_element_separator=' '
       )
-    output_file = open(temp_starting_tree, 'w+')
-    output_file.write(output_tree_string.replace('\'', ''))
-    output_file.closed
+    with open(temp_starting_tree, 'w+') as output_file:
+      output_file.write(output_tree_string.replace('\'', ''))
+      output_file.closed
 
     return temp_starting_tree
 
@@ -896,71 +901,71 @@ class GubbinsCommon():
   def reinsert_gaps_into_fasta_file(input_fasta_filename, input_vcf_file, output_fasta_filename):
     # find out where the gaps are located
     # PyVCF removed for performance reasons
-    vcf_file = open(input_vcf_file, 'r')
-
-    sample_names  = []
-    gap_position = []
-    gap_alt_base = []
-
-    for vcf_line in vcf_file:
-      if re.match('^#CHROM', vcf_line)  != None :
-         sample_names = vcf_line.rstrip().split('\t' )[9:]
-      elif re.match('^\d', vcf_line)  != None :
-        # If the alternate is only a gap it wont have a base in this column
-        if  re.match('^([^\t]+\t){3}([ACGTacgt])\t([^ACGTacgt])\t', vcf_line)  != None:
-          m = re.match('^([^\t]+\t){3}([ACGTacgt])\t([^ACGTacgt])\t', vcf_line) 
-          gap_position.append(1)
-          gap_alt_base.append(m.group(2))
-        elif re.match('^([^\t]+\t){3}([^ACGTacgt])\t([ACGTacgt])\t', vcf_line)  != None:
-          # sometimes the ref can be a gap only 
-          m = re.match('^([^\t]+\t){3}([^ACGTacgt])\t([ACGTacgt])\t', vcf_line) 
-          gap_position.append(1)
-          gap_alt_base.append(m.group(3))
-        else:
-          gap_position.append(0)
-          gap_alt_base.append('-')
-
-    gapped_alignments = []
-    # interleave gap only and snp bases
-    input_handle = open(input_fasta_filename, "rU")
-    alignments = AlignIO.parse(input_handle, "fasta")
-    for alignment in alignments:
-      for record in alignment:
-        inserted_gaps = []
-        if record.id in sample_names:
-          # only apply to internal nodes
-          continue
-        gap_index = 0
-        for input_base in record.seq:
-          while gap_index < len(gap_position) and gap_position[gap_index] == 1:
-            inserted_gaps.append(gap_alt_base[gap_index])
-            gap_index+=1
-          if gap_index < len(gap_position):
-            inserted_gaps.append(input_base)
-            gap_index+=1
-
-        while gap_index < len(gap_position):
-          inserted_gaps.append(gap_alt_base[gap_index])
-          gap_index+=1
-
-        record.seq = Seq(''.join(inserted_gaps))
-        gapped_alignments.append(record)
-
-    output_handle = open(output_fasta_filename, "a")
-    AlignIO.write(MultipleSeqAlignment(gapped_alignments), output_handle, "fasta")
-
+    with open(input_vcf_file) as vcf_file:
+     
+      sample_names  = []
+      gap_position = []
+      gap_alt_base = []
+     
+      for vcf_line in vcf_file:
+        if re.match('^#CHROM', vcf_line)  != None :
+           sample_names = vcf_line.rstrip().split('\t' )[9:]
+        elif re.match('^\d', vcf_line)  != None :
+          # If the alternate is only a gap it wont have a base in this column
+          if  re.match('^([^\t]+\t){3}([ACGTacgt])\t([^ACGTacgt])\t', vcf_line)  != None:
+            m = re.match('^([^\t]+\t){3}([ACGTacgt])\t([^ACGTacgt])\t', vcf_line) 
+            gap_position.append(1)
+            gap_alt_base.append(m.group(2))
+          elif re.match('^([^\t]+\t){3}([^ACGTacgt])\t([ACGTacgt])\t', vcf_line)  != None:
+            # sometimes the ref can be a gap only 
+            m = re.match('^([^\t]+\t){3}([^ACGTacgt])\t([ACGTacgt])\t', vcf_line) 
+            gap_position.append(1)
+            gap_alt_base.append(m.group(3))
+          else:
+            gap_position.append(0)
+            gap_alt_base.append('-')
+     
+      gapped_alignments = []
+      # interleave gap only and snp bases
+      with open(input_fasta_filename, "r") as input_handle:
+        alignments = AlignIO.parse(input_handle, "fasta")
+        for alignment in alignments:
+          for record in alignment:
+            inserted_gaps = []
+            if record.id in sample_names:
+              # only apply to internal nodes
+              continue
+            gap_index = 0
+            for input_base in record.seq:
+              while gap_index < len(gap_position) and gap_position[gap_index] == 1:
+                inserted_gaps.append(gap_alt_base[gap_index])
+                gap_index+=1
+              if gap_index < len(gap_position):
+                inserted_gaps.append(input_base)
+                gap_index+=1
+        
+            while gap_index < len(gap_position):
+              inserted_gaps.append(gap_alt_base[gap_index])
+              gap_index+=1
+        
+            record.seq = Seq(''.join(inserted_gaps))
+            gapped_alignments.append(record)
+        
+      with open(output_fasta_filename, "a") as output_handle:
+        AlignIO.write(MultipleSeqAlignment(gapped_alignments), output_handle, "fasta")
+        output_handle.close()
     return
 
 
     # reparsing a fasta file splits the lines which makes fastml work
   @staticmethod
   def reconvert_fasta_file(input_filename, output_filename):
-    input_handle = open(input_filename, "rU")
-    output_handle = open(output_filename, "w+")
-    alignments = AlignIO.parse(input_handle, "fasta")
-    AlignIO.write(alignments, output_handle, "fasta")
-    output_handle.close()
-    input_handle.close()
+    with open(input_filename, "r") as input_handle:
+      with open(output_filename, "w+") as output_handle:
+        alignments = AlignIO.parse(input_handle, "fasta")
+        AlignIO.write(alignments, output_handle, "fasta")
+        output_handle.close()
+      input_handle.close()
     return
 
   @staticmethod
@@ -982,7 +987,8 @@ class GubbinsCommon():
 
   @staticmethod
   def create_pairwise_newick_tree(sequence_names, output_filename):
-    tree = Phylo.read(StringIO('('+sequence_names[0]+','+sequence_names[1]+')'), "newick")
+    stringio = StringIO("".join(('(',sequence_names[0], ',', sequence_names[1],')')))
+    tree = Phylo.read(stringio, "newick")
     Phylo.write(tree, output_filename, 'newick')
 
   @staticmethod
@@ -993,7 +999,7 @@ class GubbinsCommon():
           full_path_of_file_for_deletion = os.path.join(directory_to_search, filename)
           if(re.match(str(deletion_regex), filename) != None and os.path.exists(full_path_of_file_for_deletion)):
             if verbose > 0:
-              print "Deleting file: "+ os.path.join(directory_to_search, filename) + " regex:"+deletion_regex
+              print("Deleting file: "+ os.path.join(directory_to_search, filename) + " regex:"+deletion_regex)
             os.remove(full_path_of_file_for_deletion)
             
   @staticmethod
@@ -1004,7 +1010,7 @@ class GubbinsCommon():
           full_path_of_file_for_find = os.path.join(directory_to_search, filename)
           if(re.match(str(find_regex), filename) != None and os.path.exists(full_path_of_file_for_find)):
             if verbose > 0:
-              print "File exists: "+ os.path.join(directory_to_search, filename) + " regex:"+find_regex
+              print("File exists: "+ os.path.join(directory_to_search, filename) + " regex:"+find_regex)
             return 1
     return 0
 
@@ -1036,31 +1042,31 @@ class GubbinsCommon():
   
   @staticmethod
   def extract_recombinations_from_embl(filename):
-    fh = open(filename, "rU")
-    sequences_to_coords = {}
-    start_coord = -1
-    end_coord = -1
-    for line in fh:
-      searchObj = re.search('misc_feature    ([\d]+)..([\d]+)$', line)
-      if searchObj != None:
-        start_coord = int(searchObj.group(1))
-        end_coord = int(searchObj.group(2))
-        continue
-
-      if start_coord >= 0 and end_coord >= 0:
-        searchTaxa = re.search('taxa\=\"([^"]+)\"', line)
-        if searchTaxa != None:
-          taxa_names = searchTaxa.group(1).strip().split(' ')
-          for taxa_name in taxa_names:
-            if taxa_name in sequences_to_coords:
-              sequences_to_coords[taxa_name].append([start_coord,end_coord])
-            else:
-              sequences_to_coords[taxa_name] = [[start_coord,end_coord]]
-            
-          start_coord = -1
-          end_coord   = -1
-        continue
-    fh.close()
+    with open(filename, "r") as fh:
+      sequences_to_coords = {}
+      start_coord = -1
+      end_coord = -1
+      for line in fh:
+        searchObj = re.search('misc_feature    ([\d]+)..([\d]+)$', line)
+        if searchObj != None:
+          start_coord = int(searchObj.group(1))
+          end_coord = int(searchObj.group(2))
+          continue
+      
+        if start_coord >= 0 and end_coord >= 0:
+          searchTaxa = re.search('taxa\=\"([^"]+)\"', line)
+          if searchTaxa != None:
+            taxa_names = searchTaxa.group(1).strip().split(' ')
+            for taxa_name in taxa_names:
+              if taxa_name in sequences_to_coords:
+                sequences_to_coords[taxa_name].append([start_coord,end_coord])
+              else:
+                sequences_to_coords[taxa_name] = [[start_coord,end_coord]]
+              
+            start_coord = -1
+            end_coord   = -1
+          continue
+      fh.close()
     return sequences_to_coords
     
   @staticmethod
diff --git a/python/gubbins/tests/bin/dummy_custom_fastml2 b/python/gubbins/tests/bin/dummy_custom_fastml2
new file mode 100755
index 0000000..f6a1585
--- /dev/null
+++ b/python/gubbins/tests/bin/dummy_custom_fastml2
@@ -0,0 +1,62 @@
+#!/usr/bin/env bash
+
+
+if test "$#" -eq 0; then
+cat << "EOF"
+START OF LOG FILE
+USAGE:	fastml [-options] 
+ |-------------------------------- HELP: -------------------------------------+
+ | VALUES IN [] ARE DEFAULT VALUES                                            |
+ |-h   help                                                                   |
+ |-s sequence input file (for example use -s D:\mySequences\seq.txt )       |
+ |-t tree input file                                                          |
+ |   (if tree is not given, a neighbor joining tree is computed).             |
+ |-g Assume among site rate variation model (Gamma) [By default the program   |
+ |   will assume an homogenous model. very fast, but less accurate!]          |
+|-m     model name                                                           |
+|-mj    [JTT]                                                                |
+|-mr    mtREV (for mitochondrial genomes)                                    |
+|-md    DAY                                                                  |
+|-mw    WAG                                                                  |
+|-mc    cpREV (for chloroplasts genomes)                                     |
+|-ma    Jukes and Cantor (JC) for amino acids                                |
+|-mn    Jukes and Cantor (JC) for nucleotides                                |
+ +----------------------------------------------------------------------------+
+ |Controling the output options:                                              |
+ |-x   tree file output in Newick format [tree.newick.txt]                    |
+ |-y   tree file output in ANCESTOR format [tree.ancestor.txt]                |
+ |-j   joint sequences output file [seq.joint.txt]                            |
+ |-k   marginal sequences output file [seq.marginal.txt]                      |
+ |-d   joint probabilities output file [prob.joint.txt]                       |
+ |-e   marginal probabilities output file [prob.marginal.txt]                 |
+ |-q   ancestral sequences output format.  -qc = [CLUSTAL], -qf = FASTA       |
+ |     -qm = MOLPHY, -qs = MASE, -qp = PHLIYP, -qn = Nexus                    |
+ +----------------------------------------------------------------------------+
+ |Advances options:                                                           |
+ |-a   Treshold for computing again marginal probabilities [0.9]              |
+ |-b   Do not optimize branch lengths on starting tree                        |
+ |     [by default branches and alpha are ML optimized from the data]         |
+ |-c   number of discrete Gamma categories for the gamma distribution [8]     |
+ |-f   don't compute Joint reconstruction (good if the branch and bound       |
+ |     algorithm takes too much time, and the goal is to compute the          |
+ |     marginal reconstruction with Gamma).                                   |
+ |-z   The bound used. -zs - bound based on sum. -zm based on max. -zb [both] |
+ |-p   user alpha parameter of the gamma distribution [if alpha is not given, |
+ |     alpha and branches will be evaluated from the data (override -b)       |
+ +----------------------------------------------------------------------------+
+EOF
+
+else
+
+cat << "PARAMS"
+START OF LOG FILE
+END OF LOG FILE
+Using homogenous model (no among site rate variation)
+Nucleotide substitution model is General time Reversible
+
+Error - unable to open tree file xxx
+System Error: No such file or directory
+Assertion failed: (0), function reportError, file errorMsg.cpp, line 41.
+Abort trap: 6
+PARAMS
+fi
\ No newline at end of file
diff --git a/python/gubbins/tests/bin/dummy_fastml2 b/python/gubbins/tests/bin/dummy_fastml2
new file mode 100755
index 0000000..eb31211
--- /dev/null
+++ b/python/gubbins/tests/bin/dummy_fastml2
@@ -0,0 +1,60 @@
+#!/usr/bin/env bash
+
+if test "$#" -eq 0; then
+cat << "EOF"
+START OF LOG FILE
+USAGE:	fastml [-options] 
+ |-------------------------------- HELP: -------------------------------------+
+ | VALUES IN [] ARE DEFAULT VALUES                                            |
+ |-h   help                                                                   |
+ |-s sequence input file (for example use -s D:\mySequences\seq.txt )       |
+ |-t tree input file                                                          |
+ |   (if tree is not given, a neighbor joining tree is computed).             |
+ |-g Assume among site rate variation model (Gamma) [By default the program   |
+ |   will assume an homogenous model. very fast, but less accurate!]          |
+|-m     model name                                                           |
+|-mj    [JTT]                                                                |
+|-mr    mtREV (for mitochondrial genomes)                                    |
+|-md    DAY                                                                  |
+|-mw    WAG                                                                  |
+|-mc    cpREV (for chloroplasts genomes)                                     |
+|-ma    Jukes and Cantor (JC) for amino acids                                |
+|-mn    Jukes and Cantor (JC) for nucleotides                                |
+ +----------------------------------------------------------------------------+
+ |Controling the output options:                                              |
+ |-x   tree file output in Newick format [tree.newick.txt]                    |
+ |-y   tree file output in ANCESTOR format [tree.ancestor.txt]                |
+ |-j   joint sequences output file [seq.joint.txt]                            |
+ |-k   marginal sequences output file [seq.marginal.txt]                      |
+ |-d   joint probabilities output file [prob.joint.txt]                       |
+ |-e   marginal probabilities output file [prob.marginal.txt]                 |
+ |-q   ancestral sequences output format.  -qc = [CLUSTAL], -qf = FASTA       |
+ |     -qm = MOLPHY, -qs = MASE, -qp = PHLIYP, -qn = Nexus                    |
+ +----------------------------------------------------------------------------+
+ |Advances options:                                                           |
+ |-a   Treshold for computing again marginal probabilities [0.9]              |
+ |-b   Do not optimize branch lengths on starting tree                        |
+ |     [by default branches and alpha are ML optimized from the data]         |
+ |-c   number of discrete Gamma categories for the gamma distribution [8]     |
+ |-f   don't compute Joint reconstruction (good if the branch and bound       |
+ |     algorithm takes too much time, and the goal is to compute the          |
+ |     marginal reconstruction with Gamma).                                   |
+ |-z   The bound used. -zs - bound based on sum. -zm based on max. -zb [both] |
+ |-p   user alpha parameter of the gamma distribution [if alpha is not given, |
+ |     alpha and branches will be evaluated from the data (override -b)       |
+ +----------------------------------------------------------------------------+
+EOF
+else
+
+cat << "PARAMS"
+START OF LOG FILE
+END OF LOG FILE
+Using homogenous model (no among site rate variation)
+Nucleotide substitution model is JTT
+
+Error - unable to open tree file xxx
+System Error: No such file or directory
+Assertion failed: (0), function reportError, file errorMsg.cpp, line 41.
+Abort trap: 6
+PARAMS
+fi
\ No newline at end of file
diff --git a/python/gubbins/tests/bin/dummy_fastml3 b/python/gubbins/tests/bin/dummy_fastml3
new file mode 100755
index 0000000..1525770
--- /dev/null
+++ b/python/gubbins/tests/bin/dummy_fastml3
@@ -0,0 +1,51 @@
+#!/usr/bin/env bash
+
+cat << "EOF"
+START OF LOG FILE
+USAGE:	/software/pathogen/external/apps/usr/local/FastML.v3.1/programs/fastml/fastml [-options] 
+ |-------------------------------- HELP: -------------------------------------+
+ | VALUES IN [] ARE DEFAULT VALUES                                            |
+ |-h   help                                                                   |
+ |-s sequence input file (for example use -s D:\mySequences\seq.txt )       |
+ |-t tree input file                                                          |
+ |   (if tree is not given, a neighbor joining tree is computed).             |
+ |-g Assume among site rate variation model (Gamma) [By default the program   |
+ |   will assume an homogenous model. very fast, but less accurate!]          |
+|-m     model name                                                           |
+|-mj    [JTT]                                                                |
+|-ml    LG                                                                   |
+|-mr    mtREV (for mitochondrial genomes)                                    |
+|-md    DAY                                                                  |
+|-mw    WAG                                                                  |
+|-mc    cpREV (for chloroplasts genomes)                                     |
+|-ma    Jukes and Cantor (JC) for amino acids                                |
+|-mn    Jukes and Cantor (JC) for nucleotides                                |
+|-mh    HKY Model for nucleotides                                            |
+|-mg    nucgtr Model for nucleotides                                         |
+|-mt    tamura92 Model for nucleotides                                       |
+|-my    yang M5 codons model                                                 |
+|-me    empirical codon matrix                                               |
+ +----------------------------------------------------------------------------+
+ |Controling the output options:                                              |
+ |-x   tree file output in Newick format [tree.newick.txt]                    |
+ |-y   tree file output in ANCESTOR format [tree.ancestor.txt]                |
+ |-j   joint sequences output file [seq.joint.txt]                            |
+ |-k   marginal sequences output file [seq.marginal.txt]                      |
+ |-d   joint probabilities output file [prob.joint.txt]                       |
+ |-e   marginal probabilities output file [prob.marginal.txt]                 |
+ |-q   ancestral sequences output format.  -qc = [CLUSTAL], -qf = FASTA       |
+ |     -qm = MOLPHY, -qs = MASE, -qp = PHLIYP, -qn = Nexus                    |
+ +----------------------------------------------------------------------------+
+ |Advances options:                                                           |
+ |-a   Treshold for computing again marginal probabilities [0.9]              |
+ |-b   Do not optimize branch lengths on starting tree                        |
+ |     [by default branches and alpha are ML optimized from the data]         |
+ |-c   number of discrete Gamma categories for the gamma distribution [8]     |
+ |-f   don't compute Joint reconstruction (good if the branch and bound       |
+ |     algorithm takes too much time, and the goal is to compute the          |
+ |     marginal reconstruction with Gamma).                                   |
+ |-z   The bound used. -zs - bound based on sum. -zm based on max. -zb [both] |
+ |-p   user alpha parameter of the gamma distribution [if alpha is not given, |
+ |     alpha and branches will be evaluated from the data (override -b)       |
+ +----------------------------------------------------------------------------+
+EOF
diff --git a/python/gubbins/tests/data/non_bi_tree.tre.expected b/python/gubbins/tests/data/non_bi_tree.tre.expected
index d5e8fad..2fb3948 100644
--- a/python/gubbins/tests/data/non_bi_tree.tre.expected
+++ b/python/gubbins/tests/data/non_bi_tree.tre.expected
@@ -1 +1 @@
-(sequence_3:8.38574,(sequence_2:0.0002,sequence_4:0.306587)N6:0.00017,(sequence_9:0.000196,((sequence_10:2e-05,(sequence_47:0.300658,(sequence_37:0.300658,(sequence_27:0.300658,(sequence_7:0.300658,sequence_17:0.300658):0):0):0):0)N3:1.1e-05,((sequence_6:0.030557,(sequence_5:0.24819,sequence_8:1.1e-05)N8:9e-06)N7:1e-05,sequence_1:2e-06)N1:1e-05)N2:0.193335)N4:1.23278):0.0;
+((sequence_6:0.030557,(sequence_5:0.24819,sequence_8:1.1e-05)N8:9e-06)N7:1e-05,sequence_1:2e-06,((sequence_10:2e-05,(sequence_47:0.300658,(sequence_37:0.300658,(sequence_27:0.300658,(sequence_7:0.300658,sequence_17:0.300658):0):0):0):0)N3:1.1e-05,(sequence_9:0.000196,(sequence_3:8.38574,(sequence_2:0.0002,sequence_4:0.306587)N6:0.00017)N5:1.23278)N4:0.193335)N2:1e-05)N1:0.0;
diff --git a/python/gubbins/tests/data/robinson_foulds_distance_tree1.tre.filter_out_removed_taxa_from_tree_expected b/python/gubbins/tests/data/robinson_foulds_distance_tree1.tre.filter_out_removed_taxa_from_tree_expected
index c74bf5f..cfbe5f2 100644
--- a/python/gubbins/tests/data/robinson_foulds_distance_tree1.tre.filter_out_removed_taxa_from_tree_expected
+++ b/python/gubbins/tests/data/robinson_foulds_distance_tree1.tre.filter_out_removed_taxa_from_tree_expected
@@ -1 +1 @@
-(((sequence_7:0.300658,sequence_10:2e-05):1.1e-05,sequence_9:0.193531):2e-05,sequence_6:0.030557,sequence_8:2e-05):0.0;
+(((sequence_7:0.300658,sequence_10:2e-05):1.1e-05,sequence_9:0.193531):2e-05,sequence_6:0.030557,sequence_8:1.9999999999999998e-05):0.0;
diff --git a/python/gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_expected b/python/gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_expected
index fd5880a..ce9ff25 100644
--- a/python/gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_expected
+++ b/python/gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_expected
@@ -1 +1 @@
-(sequence_3:8.38574,(sequence_2:0.0002,sequence_4:0.306587)N6:0.00017,(sequence_9:0.000196,((sequence_7:0.300658,sequence_10:2e-05)N3:1.1e-05,((sequence_6:0.030557,(sequence_5:0.24819,sequence_8:1.1e-05)N8:9e-06)N7:1e-05,sequence_1:2e-06)N1:1e-05)N2:0.193335)N4:1.23278):0.0;
+((sequence_6:0.030557,(sequence_5:0.24819,sequence_8:1.1e-05)N8:9e-06)N7:1e-05,sequence_1:2e-06,((sequence_7:0.300658,sequence_10:2e-05)N3:1.1e-05,(sequence_9:0.000196,(sequence_3:8.38574,(sequence_2:0.0002,sequence_4:0.306587)N6:0.00017)N5:1.23278)N4:0.193335)N2:1e-05)N1:0.0;
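
The two expected Newick files above appear to be regenerated rather than
hand-edited: the taxon filtering and midpoint rerooting are presumably now
driven by DendroPy 4 (see the setup.py dependency bump later in this diff),
which rotates nodes and serialises branch lengths slightly differently from
DendroPy 3 (hence 1.9999999999999998e-05 in place of 2e-05). A hedged sketch
of the midpoint rerooting step on its own, assuming the DendroPy 4 API and an
illustrative file name:

    #!/usr/bin/env python3
    # Midpoint rerooting with DendroPy 4; "input.tre" is a placeholder path.
    import dendropy

    tree = dendropy.Tree.get(path="input.tre", schema="newick",
                             preserve_underscores=True)
    tree.reroot_at_midpoint(update_bipartitions=False)
    tree.write(path="input.midpoint.tre", schema="newick")
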
diff --git a/python/gubbins/tests/test_alignment_python_methods.py b/python/gubbins/tests/test_alignment_python_methods.py
index d695348..27b69f3 100644
--- a/python/gubbins/tests/test_alignment_python_methods.py
+++ b/python/gubbins/tests/test_alignment_python_methods.py
@@ -1,4 +1,4 @@
-#! /usr/bin/env python
+#! /usr/bin/env python3
 # encoding: utf-8
 
 """
@@ -7,6 +7,7 @@ Tests alignement manipulation and conversion with no external application depend
 
 import unittest
 import re
+import filecmp
 import os
 from gubbins import common
 
@@ -20,23 +21,17 @@ class TestAlignmentPythonMethods(unittest.TestCase):
 
   def test_filter_out_alignments_with_too_much_missing_data(self):
     common.GubbinsCommon.filter_out_alignments_with_too_much_missing_data('gubbins/tests/data/alignment_with_too_much_missing_data.aln', 'gubbins/tests/data/alignment_with_too_much_missing_data.aln.actual', 5,0)
-    actual_file_content = open('gubbins/tests/data/alignment_with_too_much_missing_data.aln.actual', 'U').readlines()
-    expected_file_content = open('gubbins/tests/data/alignment_with_too_much_missing_data.aln.expected', 'U').readlines()
-    assert actual_file_content == expected_file_content
+    assert filecmp.cmp('gubbins/tests/data/alignment_with_too_much_missing_data.aln.actual','gubbins/tests/data/alignment_with_too_much_missing_data.aln.expected')
     os.remove('gubbins/tests/data/alignment_with_too_much_missing_data.aln.actual')
 
   def test_reinsert_gaps_into_fasta_file(self):
     common.GubbinsCommon.reinsert_gaps_into_fasta_file('gubbins/tests/data/gaps_to_be_reinserted.aln', 'gubbins/tests/data/gaps_to_be_reinserted.vcf', 'gubbins/tests/data/gaps_to_be_reinserted.aln.actual')
-    actual_file_content   = open('gubbins/tests/data/gaps_to_be_reinserted.aln.actual',   'U').readlines()
-    expected_file_content = open('gubbins/tests/data/gaps_to_be_reinserted.aln.expected', 'U').readlines()
-    assert actual_file_content == expected_file_content
+    assert filecmp.cmp('gubbins/tests/data/gaps_to_be_reinserted.aln.actual','gubbins/tests/data/gaps_to_be_reinserted.aln.expected')
     os.remove('gubbins/tests/data/gaps_to_be_reinserted.aln.actual')
 
   def test_reconvert_fasta_file(self):
     common.GubbinsCommon.reconvert_fasta_file('gubbins/tests/data/alignment_with_too_much_missing_data.aln', 'gubbins/tests/data/reconvert_fasta_file.aln.actual')
-    actual_file_content = open('gubbins/tests/data/reconvert_fasta_file.aln.actual', 'U').readlines()
-    expected_file_content = open('gubbins/tests/data/reconvert_fasta_file.aln.expected', 'U').readlines()
-    assert actual_file_content == expected_file_content
+    assert filecmp.cmp('gubbins/tests/data/reconvert_fasta_file.aln.actual','gubbins/tests/data/reconvert_fasta_file.aln.expected')
     os.remove('gubbins/tests/data/reconvert_fasta_file.aln.actual')
 
 if __name__ == "__main__":
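
A note on the test changes in this file and the ones below: the pairwise
readlines() comparisons are replaced with filecmp.cmp, which by default does a
"shallow" comparison that trusts matching os.stat() signatures (file type,
size and modification time) before falling back to reading the files. Passing
shallow=False forces a byte-for-byte comparison; a small illustration with
hypothetical file names:

    import filecmp

    actual   = "output.aln.actual"      # placeholder paths; the tests above
    expected = "output.aln.expected"    # compare files under gubbins/tests/data/

    # shallow=False always compares contents rather than stat() signatures.
    assert filecmp.cmp(actual, expected, shallow=False)
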
diff --git a/python/gubbins/tests/test_converging_recombinations.py b/python/gubbins/tests/test_converging_recombinations.py
index 43e6ca8..9eba900 100644
--- a/python/gubbins/tests/test_converging_recombinations.py
+++ b/python/gubbins/tests/test_converging_recombinations.py
@@ -1,4 +1,4 @@
-#! /usr/bin/env python
+#! /usr/bin/env python3
 # encoding: utf-8
 
 """
diff --git a/python/gubbins/tests/test_external_dependancies.py b/python/gubbins/tests/test_external_dependancies.py
index 8fa03e6..a4382a8 100755
--- a/python/gubbins/tests/test_external_dependancies.py
+++ b/python/gubbins/tests/test_external_dependancies.py
@@ -1,4 +1,4 @@
-#! /usr/bin/env python
+#! /usr/bin/env python3
 # encoding: utf-8
 
 """
@@ -12,6 +12,7 @@ import sys
 import os
 import glob
 import argparse
+import filecmp
 import pkg_resources
 from gubbins import common
 
@@ -353,34 +354,6 @@ class TestExternalDependancies(unittest.TestCase):
     gubbins_runner  = common.GubbinsCommon(parser.parse_args(['gubbins/tests/data/multiple_recombinations.aln']))
     gubbins_runner.parse_and_run()
 
-    actual_file_content2    = open('multiple_recombinations.summary_of_snp_distribution.vcf',   'U').readlines()
-    actual_file_content3    = open('multiple_recombinations.recombination_predictions.embl',   'U').readlines()
-    actual_file_content4    = open('multiple_recombinations.per_branch_statistics.csv',   'U').readlines()
-    actual_file_content5    = open('multiple_recombinations.filtered_polymorphic_sites.fasta',   'U').readlines()
-    actual_file_content6    = open('multiple_recombinations.filtered_polymorphic_sites.phylip',   'U').readlines()
-    actual_file_content8    = open('multiple_recombinations.recombination_predictions.gff',   'U').readlines()
-    actual_file_content9    = open('multiple_recombinations.branch_base_reconstruction.embl',   'U').readlines()
-    actual_file_content10   = open('multiple_recombinations.final_tree.tre',   'U').readlines()
-    
-    expected_file_content2  = open('gubbins/tests/data/expected_RAxML_result.multiple_recombinations.iteration_5.vcf',   'U').readlines()
-    expected_file_content3  = open('gubbins/tests/data/expected_RAxML_result.multiple_recombinations.iteration_5.tab',   'U').readlines()
-    expected_file_content4  = open('gubbins/tests/data/expected_RAxML_result.multiple_recombinations.iteration_5.stats',   'U').readlines()
-    expected_file_content5  = open('gubbins/tests/data/expected_RAxML_result.multiple_recombinations.iteration_5.snp_sites.aln',   'U').readlines()
-    expected_file_content6  = open('gubbins/tests/data/expected_RAxML_result.multiple_recombinations.iteration_5.phylip',   'U').readlines()
-    expected_file_content8  = open('gubbins/tests/data/expected_RAxML_result.multiple_recombinations.iteration_5.gff',   'U').readlines()
-    expected_file_content9  = open('gubbins/tests/data/expected_RAxML_result.multiple_recombinations.iteration_5.branch_snps.tab',   'U').readlines()
-    expected_file_content10 = open('gubbins/tests/data/expected_RAxML_result.multiple_recombinations.iteration_5',   'U').readlines()
-    
-    # Slightly different values of internal nodes if run on fastml on linux and osx
-    #assert actual_file_content2 == expected_file_content2
-    #assert actual_file_content3 == expected_file_content3
-    #assert actual_file_content4 == expected_file_content4
-    #assert actual_file_content5 == expected_file_content5
-    #assert actual_file_content6 == expected_file_content6
-    #assert actual_file_content8 == expected_file_content8
-    #assert actual_file_content9 == expected_file_content9
-    #assert actual_file_content10 == expected_file_content10
-    
     assert not os.path.exists('multiple_recombinations.aln.start')
     assert not os.path.exists('RAxML_result.multiple_recombinations.iteration_5.ancestor.tre')
     assert not os.path.exists('RAxML_result.multiple_recombinations.iteration_5.seq.joint.txt')
@@ -408,14 +381,10 @@ class TestExternalDependancies(unittest.TestCase):
     assert os.path.exists('pairwise.final_tree.tre')
     
     # Check the VCF file is as expected
-    actual_file_content   = open('pairwise.summary_of_snp_distribution.vcf',   'U').readlines()
-    expected_file_content = open('gubbins/tests/data/pairwise.aln.tre.vcf_expected', 'U').readlines()
-    assert actual_file_content == expected_file_content
+    assert filecmp.cmp('pairwise.summary_of_snp_distribution.vcf','gubbins/tests/data/pairwise.aln.tre.vcf_expected')
     
     # Check the reconstruction of internal nodes
-    actual_file_content   = open('pairwise.filtered_polymorphic_sites.fasta',   'U').readlines()
-    expected_file_content = open('gubbins/tests/data/pairwise.aln.snp_sites.aln_expected', 'U').readlines()
-    assert actual_file_content == expected_file_content
+    assert filecmp.cmp('pairwise.filtered_polymorphic_sites.fasta','gubbins/tests/data/pairwise.aln.snp_sites.aln_expected')
     
     
     os.remove('pairwise.summary_of_snp_distribution.vcf')
diff --git a/python/gubbins/tests/test_fastml.py b/python/gubbins/tests/test_fastml.py
new file mode 100755
index 0000000..4439918
--- /dev/null
+++ b/python/gubbins/tests/test_fastml.py
@@ -0,0 +1,43 @@
+#! /usr/bin/env python3
+# encoding: utf-8
+
+"""
+Tests if we can detect which version of fastml is running so we can choose the correct model
+"""
+
+import unittest
+import re
+import os
+import subprocess
+from gubbins.Fastml import Fastml
+
+os.environ["PATH"] += os.pathsep + 'gubbins/tests/bin'
+
+class TestFastml(unittest.TestCase):
+    
+  def test_no_fastml_installed(self):
+      fastml_check = Fastml('exec_doesnt_exist')
+      self.assertEqual(fastml_check.fastml_version, None)
+      self.assertEqual(fastml_check.fastml_model, None)
+      self.assertEqual(fastml_check.fastml_parameters, None)
+      
+  def test_fastml_3_installed(self):
+      fastml_check = Fastml('dummy_fastml3')
+      self.assertEqual(fastml_check.fastml_version, 3)
+      self.assertEqual(fastml_check.fastml_model,'g')
+      self.assertEqual(fastml_check.fastml_parameters, 'dummy_fastml3 -qf -b -a 0.00001 -mg ')
+    
+  def test_fastml_2_installed(self):
+      fastml_check = Fastml('dummy_fastml2')
+      self.assertEqual(fastml_check.fastml_version, 2)
+      self.assertEqual(fastml_check.fastml_model, 'n')
+      self.assertEqual(fastml_check.fastml_parameters, 'dummy_fastml2 -qf -b -a 0.00001 -mn ')
+    
+  def test_custom_fastml_2_installed(self):
+      fastml_check = Fastml('dummy_custom_fastml2')
+      self.assertEqual(fastml_check.fastml_version, 2)
+      self.assertEqual(fastml_check.fastml_model,'g')
+      self.assertEqual(fastml_check.fastml_parameters, 'dummy_custom_fastml2 -qf -b -a 0.00001 -mg ')
+
+if __name__ == "__main__":
+  unittest.main()
\ No newline at end of file
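
These tests pin down what the new gubbins.Fastml wrapper (added in
python/gubbins/Fastml.py, not shown in this hunk) is expected to do: probe the
fastml executable, decide whether it behaves like version 2 or version 3, and
choose the GTR nucleotide model (-mg) where it is advertised, falling back to
Jukes-Cantor (-mn) otherwise. The sketch below only illustrates that probing
idea and is not the shipped implementation:

    #!/usr/bin/env python3
    # Illustrative only: inspect a fastml-like executable's help text to pick
    # a substitution model flag. Not the code in python/gubbins/Fastml.py.
    import subprocess

    def probe_fastml(executable):
        try:
            # fastml prints its usage/help when run without arguments.
            process = subprocess.Popen([executable], stdout=subprocess.PIPE,
                                       stderr=subprocess.STDOUT,
                                       universal_newlines=True)
            help_text, _ = process.communicate()
        except OSError:
            return (None, None)      # executable missing or not runnable
        if '-mg' in help_text:       # GTR advertised: treat as FastML v3
            return (3, 'g')
        if '-mn' in help_text:       # only JC for nucleotides: FastML v2
            return (2, 'n')
        return (None, None)

The dummy_custom_fastml2 case shows the real wrapper evidently goes one step
further and also inspects the output of a trial run for "General time
Reversible", so that patched v2 builds still get the GTR model; this sketch
omits that step.
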
diff --git a/python/gubbins/tests/test_string_construction.py b/python/gubbins/tests/test_string_construction.py
index 20a4015..9fbbcd2 100644
--- a/python/gubbins/tests/test_string_construction.py
+++ b/python/gubbins/tests/test_string_construction.py
@@ -1,4 +1,4 @@
-#! /usr/bin/env python
+#! /usr/bin/env python3
 # encoding: utf-8
 
 """
diff --git a/python/gubbins/tests/test_tree_python_methods.py b/python/gubbins/tests/test_tree_python_methods.py
index 530436e..8673138 100644
--- a/python/gubbins/tests/test_tree_python_methods.py
+++ b/python/gubbins/tests/test_tree_python_methods.py
@@ -1,4 +1,4 @@
-#! /usr/bin/env python
+#! /usr/bin/env python3
 # encoding: utf-8
 
 """
@@ -11,6 +11,7 @@ import shutil
 import os
 import difflib
 import tempfile
+import filecmp
 from gubbins import common
 
 class TestTreePythonMethods(unittest.TestCase):
@@ -30,32 +31,24 @@ class TestTreePythonMethods(unittest.TestCase):
   def test_reroot_tree(self):
     shutil.copyfile('gubbins/tests/data/robinson_foulds_distance_tree1.tre','gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_actual')
     common.GubbinsCommon.reroot_tree('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_actual', 'sequence_4')
-    actual_file_content = open('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_actual', 'U').readlines()
-    expected_file_content = open('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4', 'U').readlines()
-    assert actual_file_content == expected_file_content
+    assert filecmp.cmp('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_actual','gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4')
     os.remove('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_actual')
     
     shutil.copyfile('gubbins/tests/data/robinson_foulds_distance_tree1.tre','gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_actual')
     common.GubbinsCommon.reroot_tree('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_actual','')
-    actual_file_content = open('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_actual', 'U').readlines()
-    expected_file_content = open('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_expected', 'U').readlines()
-    assert actual_file_content == expected_file_content
+    assert filecmp.cmp('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_actual', 'gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_expected')
     os.remove('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_actual')
     
   def test_reroot_tree_with_outgroup(self):
     shutil.copyfile('gubbins/tests/data/robinson_foulds_distance_tree1.tre','gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_actual')
     common.GubbinsCommon.reroot_tree_with_outgroup('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_actual', ['sequence_4'])
-    actual_file_content = open('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_actual', 'U').readlines()
-    expected_file_content = open('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4', 'U').readlines()
-    assert actual_file_content == expected_file_content
+    assert filecmp.cmp('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_actual','gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4')
     os.remove('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_actual')
     
   def test_reroot_tree_with_outgroups(self):
     shutil.copyfile('gubbins/tests/data/robinson_foulds_distance_tree1.tre','gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_actual')
     common.GubbinsCommon.reroot_tree_with_outgroup('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_actual', ['sequence_4','sequence_2'])
-    actual_file_content = open('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_actual', 'U').readlines()
-    expected_file_content = open('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_2', 'U').readlines()
-    assert actual_file_content == expected_file_content
+    assert filecmp.cmp('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_actual','gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_2')
     os.remove('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_at_sequence_4_actual')
     
   def test_reroot_tree_with_outgroups_all_in_one_clade(self):
@@ -87,43 +80,33 @@ class TestTreePythonMethods(unittest.TestCase):
     
     assert expected_monophyletic_outgroup == common.GubbinsCommon.get_monophyletic_outgroup('.tmp.outgroups_input.tre', outgroups)
     common.GubbinsCommon.reroot_tree_with_outgroup('.tmp.outgroups_input.tre', outgroups)
-    actual_file_content = open('.tmp.outgroups_input.tre', 'U').readlines()
-    expected_file_content = open(expected_output_file, 'U').readlines()
-    assert actual_file_content == expected_file_content
+    assert filecmp.cmp('.tmp.outgroups_input.tre',expected_output_file)
     os.remove('.tmp.outgroups_input.tre')
     
   def test_split_all_non_bi_nodes(self):
     # best way to access it is via reroot_tree_at_midpoint because it outputs to a file
     shutil.copyfile('gubbins/tests/data/non_bi_tree.tre','gubbins/tests/data/non_bi_tree.tre.actual')
     common.GubbinsCommon.reroot_tree_at_midpoint('gubbins/tests/data/non_bi_tree.tre.actual')
-    actual_file_content = open('gubbins/tests/data/non_bi_tree.tre.actual', 'U').readlines()
-    expected_file_content = open('gubbins/tests/data/non_bi_tree.tre.expected', 'U').readlines()
-    assert actual_file_content == expected_file_content
+    assert filecmp.cmp('gubbins/tests/data/non_bi_tree.tre.actual','gubbins/tests/data/non_bi_tree.tre.expected')
     os.remove('gubbins/tests/data/non_bi_tree.tre.actual')
     
   def test_reroot_tree_at_midpoint(self):
     shutil.copyfile('gubbins/tests/data/robinson_foulds_distance_tree1.tre','gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_actual')
     common.GubbinsCommon.reroot_tree_at_midpoint('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_actual')
-    actual_file_content = open('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_actual', 'U').readlines()
-    expected_file_content = open('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_expected', 'U').readlines()
-    assert actual_file_content == expected_file_content
+    assert filecmp.cmp('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_actual','gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_expected')
     os.remove('gubbins/tests/data/robinson_foulds_distance_tree1.tre.reroot_tree_at_midpoint_actual')
 
   def test_filter_out_removed_taxa_from_tree_and_return_new_file(self):
     temp_working_dir = tempfile.mkdtemp(dir=os.getcwd())
     common.GubbinsCommon.filter_out_removed_taxa_from_tree_and_return_new_file('gubbins/tests/data/robinson_foulds_distance_tree1.tre', temp_working_dir, ['sequence_1','sequence_2','sequence_3','sequence_4','sequence_5'])    
-    actual_file_content = open(temp_working_dir + '/robinson_foulds_distance_tree1.tre', 'U').readlines()
-    expected_file_content = open('gubbins/tests/data/robinson_foulds_distance_tree1.tre.filter_out_removed_taxa_from_tree_expected', 'U').readlines()
-    assert actual_file_content == expected_file_content
+    assert filecmp.cmp(temp_working_dir + '/robinson_foulds_distance_tree1.tre', 'gubbins/tests/data/robinson_foulds_distance_tree1.tre.filter_out_removed_taxa_from_tree_expected')
     os.remove(temp_working_dir + '/robinson_foulds_distance_tree1.tre')
     os.removedirs(temp_working_dir)
     
   def test_internal_node_taxons_removed_when_used_as_starting_tree(self):
     temp_working_dir = tempfile.mkdtemp(dir=os.getcwd())
     common.GubbinsCommon.filter_out_removed_taxa_from_tree_and_return_new_file('gubbins/tests/data/tree_with_internal_nodes.tre', temp_working_dir, [])    
-    actual_file_content = open(temp_working_dir + '/tree_with_internal_nodes.tre', 'U').readlines()
-    expected_file_content = open('gubbins/tests/data/tree_with_internal_nodes.tre_expected', 'U').readlines()
-    assert actual_file_content == expected_file_content
+    assert filecmp.cmp(temp_working_dir + '/tree_with_internal_nodes.tre','gubbins/tests/data/tree_with_internal_nodes.tre_expected')
     os.remove(temp_working_dir + '/tree_with_internal_nodes.tre')
     os.removedirs(temp_working_dir)
     
@@ -135,9 +118,7 @@ class TestTreePythonMethods(unittest.TestCase):
   def test_remove_internal_node_labels(self):
     common.GubbinsCommon.remove_internal_node_labels_from_tree('gubbins/tests/data/final_tree_with_internal_labels.tre', 'final_tree_with_internal_labels.tre')
     assert os.path.exists('final_tree_with_internal_labels.tre')
-    actual_file_content = open('final_tree_with_internal_labels.tre', 'U').readlines()
-    expected_file_content = open('gubbins/tests/data/expected_final_tree_without_internal_labels.tre', 'U').readlines()
-    assert actual_file_content == expected_file_content
+    assert filecmp.cmp('final_tree_with_internal_labels.tre', 'gubbins/tests/data/expected_final_tree_without_internal_labels.tre')
     os.remove('final_tree_with_internal_labels.tre')
     
 if __name__ == "__main__":
diff --git a/python/gubbins/tests/test_validate_fasta_input.py b/python/gubbins/tests/test_validate_fasta_input.py
index 621cdd9..a440207 100644
--- a/python/gubbins/tests/test_validate_fasta_input.py
+++ b/python/gubbins/tests/test_validate_fasta_input.py
@@ -1,4 +1,4 @@
-#! /usr/bin/env python
+#! /usr/bin/env python3
 # encoding: utf-8
 
 """
diff --git a/python/gubbins/tests/test_validate_starting_tree.py b/python/gubbins/tests/test_validate_starting_tree.py
index d561080..dccf638 100644
--- a/python/gubbins/tests/test_validate_starting_tree.py
+++ b/python/gubbins/tests/test_validate_starting_tree.py
@@ -1,4 +1,4 @@
-#! /usr/bin/env python
+#! /usr/bin/env python3
 # encoding: utf-8
 
 """
diff --git a/python/scripts/gubbins_drawer.py b/python/scripts/gubbins_drawer.py
index 7887874..2cfe38c 100755
--- a/python/scripts/gubbins_drawer.py
+++ b/python/scripts/gubbins_drawer.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/env python3
 
 #################################
 # Import some necessary modules #
@@ -49,17 +49,17 @@ def tab_parser(handle, quiet=False):
 				break
 				raise ValueError("Premature end of line during features table")
 			if line[:object.HEADER_WIDTH].rstrip() in object.SEQUENCE_HEADERS:
-				if object.debug : print "Found start of sequence"
+				if object.debug : print("Found start of sequence")
 				break
 			line = line.rstrip()
 			if line == "//":
 				raise ValueError("Premature end of features table, marker '//' found")
 			if line in object.FEATURE_END_MARKERS:
-				if object.debug : print "Found end of features"
+				if object.debug : print("Found end of features")
 				line = object.handle.readline()
 				break
 			if line[2:object.FEATURE_QUALIFIER_INDENT].strip() == "":
-				print line[2:object.FEATURE_QUALIFIER_INDENT].strip()
+				print(line[2:object.FEATURE_QUALIFIER_INDENT].strip())
 				raise ValueError("Expected a feature qualifier in line '%s'" % line)
 			
 			if skip:
@@ -170,7 +170,7 @@ def add_ordered_embl_to_diagram(record, incfeatures=["CDS", "feature"], emblfile
 	
 	new_tracks={}
 	
-	print len(record.features), "features found for", record.name
+	print(len(record.features), "features found for", record.name)
 	
 	if len(record.seq)>500000:
 		scale_largetick_interval=int(round((len(record.seq)/10),-5))
@@ -184,9 +184,9 @@ def add_ordered_embl_to_diagram(record, incfeatures=["CDS", "feature"], emblfile
 			
 			continue
 		
-		if feature.qualifiers.has_key("colour"):
+		if "colour" in feature.qualifiers:
 			colourline=feature.qualifiers["colour"][0]
-		elif feature.qualifiers.has_key("color"):
+		elif "color" in feature.qualifiers:
 			colourline=feature.qualifiers["color"][0]
 		else:
 			colourline = "5"
@@ -195,8 +195,8 @@ def add_ordered_embl_to_diagram(record, incfeatures=["CDS", "feature"], emblfile
 		elif len(colourline.split())==3:
 			colour=translator.int255_color((int(colourline.split()[0]),int(colourline.split()[1]),int(colourline.split()[2])))
 		else:
-			print "Can't understand colour code!"
-			print colourline
+			print("Can't understand colour code!")
+			print(colourline)
 			sys.exit()
 		
 		locations=[]
@@ -244,7 +244,7 @@ def add_ordered_tab_to_diagram(filename):
 	try:
 		record=tab_parser(open(filename,"r"))
 	except IOError:
-		print "Cannot find file", filename
+		print("Cannot find file", filename)
 		sys.exit()
 	record.name=filename
 	new_tracks=add_ordered_embl_to_diagram(record, incfeatures=["i", "d", "li", "del", "snp", "misc_feature", "core", "cds", "insertion", "deletion", "recombination", "feature", "blastn_hit", "fasta_record", "variation"], emblfile=False)
@@ -349,7 +349,7 @@ def drawtree(treeObject, treeheight, treewidth, xoffset, yoffset, name_offset=5)
 			
 			if treeObject.node(node).data.comment and "name_colour" in treeObject.node(node).data.comment:
 				name_colours=[]
-				for x in xrange(0,len(treeObject.node(node).data.comment["name_colour"])):
+				for x in range(0,len(treeObject.node(node).data.comment["name_colour"])):
 					r,g,b= treeObject.node(node).data.comment["name_colour"][x]
 					name_colours.append(colors.Color(float(r)/255,float(g)/255,float(b)/255))
 			else:
@@ -362,7 +362,7 @@ def drawtree(treeObject, treeheight, treewidth, xoffset, yoffset, name_offset=5)
 			gubbins_length += namewidth
 			colpos=1
 			
-			for x in xrange(colpos,len(name_colours)):
+			for x in range(colpos,len(name_colours)):
 				gubbins_length += block_length
 				if x!=0:
 					gubbins_length += vertical_scaling_factor
@@ -657,7 +657,7 @@ if __name__ == "__main__":
 	height, width = pagesize
   
 	if len(args)==0:
-		print "Found nothing to draw"
+		print("Found nothing to draw")
 		sys.exit()
 	
 	d = Drawing(width, height)
@@ -708,7 +708,7 @@ if __name__ == "__main__":
 	
 	if options.tree!="":
 		if not os.path.isfile(options.tree):
-			print "Cannot find file:", options.tree
+			print("Cannot find file:", options.tree)
 			options.tree=""
 		else:
 			treestring=open(options.tree,"rU").read().strip()
@@ -754,7 +754,7 @@ if __name__ == "__main__":
 	track_number=0
 	
 	for track in output_order:
-		if(not my_tracks.has_key(track)):
+		if(track not in my_tracks):
 			my_tracks = add_empty_track(my_tracks, track)
 		
 		track_height=my_tracks[track].track_height
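
The gubbins_drawer.py hunk above is a mechanical Python 2 to 3 port: print
statements become calls to the print() function, dict.has_key() becomes the in
operator, and xrange() becomes range(). The three idioms side by side, with
hypothetical variable names:

    # Python 3 forms of the idioms ported in gubbins_drawer.py.
    qualifiers = {"colour": ["2"]}

    print(len(qualifiers), "qualifiers found")   # print() call, not a statement

    if "colour" in qualifiers:                   # replaces qualifiers.has_key("colour")
        colour = qualifiers["colour"][0]

    for x in range(0, 3):                        # range() replaces xrange()
        print(x)
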
diff --git a/python/scripts/run_gubbins.py b/python/scripts/run_gubbins.py
index 3840d78..6e454b1 100755
--- a/python/scripts/run_gubbins.py
+++ b/python/scripts/run_gubbins.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/env python3
 # encoding: utf-8
 #
 # Wellcome Trust Sanger Institute
@@ -31,9 +31,9 @@ parser = argparse.ArgumentParser(description='Croucher N. J., Page A. J., Connor
 parser.add_argument('alignment_filename',       help='Multifasta alignment file')
 parser.add_argument('--outgroup',         '-o', help='Outgroup name for rerooting. A list of comma separated names can be used if they form a clade')
 parser.add_argument('--starting_tree',    '-s', help='Starting tree')
-parser.add_argument('--use_time_stamp',   '-u', action='count', help='Use a time stamp in file names')
-parser.add_argument('--verbose',          '-v', action='count', help='Turn on debugging')
-parser.add_argument('--no_cleanup',       '-n', action='count', help='Dont cleanup intermediate files')
+parser.add_argument('--use_time_stamp',   '-u', action='count', help='Use a time stamp in file names', default = 0)
+parser.add_argument('--verbose',          '-v', action='count', help='Turn on debugging', default = 0)
+parser.add_argument('--no_cleanup',       '-n', action='count', help='Dont cleanup intermediate files', default = 0)
 parser.add_argument('--tree_builder',     '-t', help='Application to use for tree building [raxml|fasttree|hybrid], default RAxML', default = "raxml")
 parser.add_argument('--iterations',       '-i', help='Maximum No. of iterations, default is 5', type=int,  default = 5)
 parser.add_argument('--min_snps',         '-m', help='Min SNPs to identify a recombination block, default is 3', type=int,  default = 3)
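
The default=0 added to the three action='count' options above matters under
Python 3: without a default, argparse leaves the attribute as None when the
flag is absent, and comparisons such as "verbose > 0" (see the "File exists"
branch in common.py earlier in this diff) raise TypeError instead of quietly
evaluating to False as they did on Python 2. A minimal reproduction,
mirroring the --verbose option above:

    #!/usr/bin/env python3
    # Why action='count' needs an explicit default on Python 3.
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('--verbose', '-v', action='count', default=0)
    args = parser.parse_args([])      # flag not supplied on the command line

    # With default=0 this is simply False; without it args.verbose would be
    # None and 'None > 0' raises TypeError on Python 3.
    if args.verbose > 0:
        print("debugging output enabled")
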
diff --git a/python/setup.py b/python/setup.py
index 1ae2641..fbe12ce 100755
--- a/python/setup.py
+++ b/python/setup.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/env python3
 
 from setuptools import setup
 import multiprocessing
@@ -25,15 +25,15 @@ setup(
       """,
     classifiers=[
         "License :: OSI Approved :: GNU General Public License (GPL)",
-        "Programming Language :: Python",
+        "Programming Language :: Python :: 3",
         "Development Status :: 4 - Beta",
         "Intended Audience :: Science/Research",
         "Topic :: Scientific/Engineering :: Bio-Informatics",
         ],
     install_requires=[
-        'Biopython >= 1.59',
-        'DendroPy  >= 3.12.0',
-        'Reportlab >= 2.5',
+        'biopython >= 1.59',
+        'dendropy  >= 4.0.2',
+        'reportlab >= 3.0',
         'nose >= 1.3'
     ],
     license='GPL',
diff --git a/release/manifests/trustyvm.pp b/release/manifests/trustyvm.pp
index 0c9b9a0..341453c 100644
--- a/release/manifests/trustyvm.pp
+++ b/release/manifests/trustyvm.pp
@@ -2,7 +2,7 @@ package { "dh-make":
     ensure => "installed"
     }
 
-package { ["gcc", "build-essential", "pkg-config","ntp"]:
+package { ["gcc", "build-essential", "pkg-config","ntp","libtool"]:
     ensure => "installed"
     }
 
@@ -49,6 +49,11 @@ package {"python-nose":
   ensure => "installed"
 }
 
+package {"python3-pip":
+  ensure => "installed"
+}
+
+
 # The Debian/Ubuntu system biopython library has no egg-info associated with it
 # https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=743927
 # so setuptools will pull down an egg and needs Python.h available to make it
diff --git a/src/Newickform.c b/src/Newickform.c
index f5ec8d9..df62489 100644
--- a/src/Newickform.c
+++ b/src/Newickform.c
@@ -36,6 +36,7 @@ newick_node* build_newick_tree(char * filename, FILE *vcf_file_pointer,int * snp
 	char *pcOutputFile;
 	char acStrArray[256];
 	newick_node *root;
+    char *returnchar;
 	
 	FILE *f;
 	
@@ -57,7 +58,7 @@ newick_node* build_newick_tree(char * filename, FILE *vcf_file_pointer,int * snp
 	while (1)
 	{
 		memset(acStrArray, '\0', 256);
-		fgets(acStrArray, 255, f);
+		returnchar = fgets(acStrArray, 255, f);
 		if (acStrArray[0] == '\0' && feof(f))
 		{
 			break;
diff --git a/src/alignment_file.c b/src/alignment_file.c
index 20aba56..b9e1a07 100644
--- a/src/alignment_file.c
+++ b/src/alignment_file.c
@@ -161,6 +161,10 @@ int build_reference_sequence(char reference_sequence[], char filename[])
 			reference_sequence[i]  = '-';
 		}
 	}
+    if(reference_sequence[seq->seq.l] != '\0')
+    {
+      reference_sequence[seq->seq.l]  =   '\0';
+    }
 	
 	kseq_destroy(seq);
 	gzclose(fp);
diff --git a/src/parse_vcf.c b/src/parse_vcf.c
index 169a452..04aa737 100644
--- a/src/parse_vcf.c
+++ b/src/parse_vcf.c
@@ -129,10 +129,10 @@ int get_number_of_snps(FILE * vcf_file_pointer)
 	int i = 0;
 	int length_of_line =0;
 	char szBuffer[2] = {0};  
-	
+    char *returnchar;	
 	do{
 		// check the first character of the line to see if its in the header
-		fgets(szBuffer, sizeof(szBuffer), vcf_file_pointer);
+		returnchar = fgets(szBuffer, sizeof(szBuffer), vcf_file_pointer);
 		if(szBuffer[0] != '#')
 		{
 			i++;
diff --git a/tests/check_branch_sequences.c b/tests/check_branch_sequences.c
index 20badf2..b6f8c90 100644
--- a/tests/check_branch_sequences.c
+++ b/tests/check_branch_sequences.c
@@ -92,7 +92,7 @@ int test_bases_in_recombinations(int block_size)
 	block_coords[1][3] = 15;
 	char * child_sequence      = "AAAAAAAAATAA";
 	int snp_locations[12] = {1,2,3,5,7,10,11,15,20,30,100,110};
-	calculate_number_of_bases_in_recombations_excluding_gaps(block_coords, block_size, child_sequence, snp_locations,12);
+	return calculate_number_of_bases_in_recombations_excluding_gaps(block_coords, block_size, child_sequence, snp_locations,12);
 }
 
 int test_bases_in_recombinations_with_gaps(int block_size)
@@ -111,7 +111,7 @@ int test_bases_in_recombinations_with_gaps(int block_size)
 	block_coords[1][3] = 15;
 	char * child_sequence      =  "--A---AAAAAAAAAAAAAT";
 	int snp_locations[16] = {1,4,5,6,7,8,9,10,11,15,20,30,40,50,100,110};
-	calculate_number_of_bases_in_recombations_excluding_gaps(block_coords, block_size, child_sequence, snp_locations,16);
+	return calculate_number_of_bases_in_recombations_excluding_gaps(block_coords, block_size, child_sequence, snp_locations,16);
 }
 
 

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/gubbins.git


