[med-svn] [aragorn] 01/12: Imported Upstream version 1.2.37

Sascha Steinbiss satta at debian.org
Sun Jul 3 07:55:15 UTC 2016


This is an automated email from the git hooks/post-receive script.

satta pushed a commit to branch master
in repository aragorn.

commit 39fb3f7a4d4ff990d2abaee52c2806e92b8526dd
Author: Sascha Steinbiss <satta at debian.org>
Date:   Sun Jul 3 07:27:49 2016 +0000

    Imported Upstream version 1.2.37
---
 aragorn.1                          |  390 ----
 aragorn1.2.36.c => aragorn1.2.37.c | 4196 +++++++++++++++++++++++-------------
 manpage.1.src                      |  273 +++
 3 files changed, 2997 insertions(+), 1862 deletions(-)
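
Note on the -gc switch documented in the aragorn.1 text removed below:
individual codon changes are appended as ,BBB=<aa> (B = A, C, G or T; <aa> is
a three-letter amino-acid code), for example -gcvert,aga=Trp,agg=Trp. The
following is a minimal C sketch of validating one such item. The function name
parse_gc_mod, its signature, and the choice to upper-case the codon are
assumptions made for illustration; this is not the parser used in
aragorn1.2.37.c.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper (not taken from the ARAGORN source): parse one
   genetic-code modification of the form "BBB=<aa>", e.g. "aga=Trp".
   On success, stores the upper-cased codon and the three-letter
   amino-acid name (each NUL-terminated) and returns 1.
   Returns 0 if the text does not match the documented form. */
int parse_gc_mod(const char *s, char codon[4], char aa[4])
{
    int i;

    assert(codon != NULL && aa != NULL);

    /* Exactly three bases, '=', then three letters. */
    if (s == NULL || strlen(s) != 7 || s[3] != '=')
        return 0;

    for (i = 0; i < 3; i++) {
        int b = toupper((unsigned char)s[i]);
        if (strchr("ACGT", b) == NULL)
            return 0;               /* not one of A, C, G, T */
        codon[i] = (char)b;
    }
    codon[3] = '\0';

    for (i = 0; i < 3; i++) {
        if (!isalpha((unsigned char)s[4 + i]))
            return 0;               /* amino-acid code must be letters */
        aa[i] = s[4 + i];
    }
    aa[3] = '\0';

    return 1;
}
```

Under these assumptions, parse_gc_mod("aga=Trp", codon, aa) returns 1 and
leaves "AGA" in codon and "Trp" in aa, while "axa=Trp" and "aga-Trp" are
rejected. Checking that the name is a known amino acid is left out of the
sketch.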

diff --git a/aragorn.1 b/aragorn.1
deleted file mode 100644
index 617405f..0000000
--- a/aragorn.1
+++ /dev/null
@@ -1,390 +0,0 @@
-'\" t
-.\"     Title: aragorn
-.\"    Author: [see the "AUTHORS" section]
-.\" Generator: DocBook XSL Stylesheets v1.76.1 <http://docbook.sf.net/>
-.\"      Date: 02/24/2013
-.\"    Manual: \ \&
-.\"    Source: \ \&
-.\"  Language: English
-.\"
-.TH "ARAGORN" "1" "02/24/2013" "\ \&" "\ \&"
-.\" -----------------------------------------------------------------
-.\" * Define some portability stuff
-.\" -----------------------------------------------------------------
-.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.\" http://bugs.debian.org/507673
-.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
-.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.ie \n(.g .ds Aq \(aq
-.el       .ds Aq '
-.\" -----------------------------------------------------------------
-.\" * set default formatting
-.\" -----------------------------------------------------------------
-.\" disable hyphenation
-.nh
-.\" disable justification (adjust text to left margin only)
-.ad l
-.\" -----------------------------------------------------------------
-.\" * MAIN CONTENT STARTS HERE *
-.\" -----------------------------------------------------------------
-.SH "NAME"
-aragorn \- detect tRNA genes in nucleotide sequences
-.SH "SYNOPSIS"
-.sp
-\fBaragorn\fR [\fIOPTION\fR]\&... \fIFILE\fR
-.SH "OPTIONS"
-.PP
-\fB\-m\fR
-.RS 4
-Search for tmRNA genes\&.
-.RE
-.PP
-\fB\-t\fR
-.RS 4
-Search for tRNA genes\&. By default, all are detected\&. If one of
-\fB\-m\fR
-or
-\fB\-t\fR
-is specified, then the other is not detected unless specified as well\&.
-.RE
-.PP
-\fB\-mt\fR
-.RS 4
-Search for Metazoan mitochondrial tRNA genes\&. tRNA genes with introns not detected\&.
-\fB\-i\fR,
-\fB\-sr\fR
-switchs ignored\&. Composite Metazoan mitochondrial genetic code used\&.
-.RE
-.PP
-\fB\-mtmam\fR
-.RS 4
-Search for Mammalian mitochondrial tRNA genes\&.
-\fB\-i\fR,
-\fB\-sr\fR
-switchs ignored\&.
-\fB\-tv\fR
-switch set\&. Mammalian mitochondrial genetic code used\&.
-.RE
-.PP
-\fB\-mtx\fR
-.RS 4
-Same as
-\fB\-mt\fR
-but low scoring tRNA genes are not reported\&.
-.RE
-.PP
-\fB\-mtd\fR
-.RS 4
-Overlapping metazoan mitochondrial tRNA genes on opposite strands are reported\&.
-.RE
-.PP
-\fB\-gc\fR[\fInum\fR]
-.RS 4
-Use the GenBank transl_table = [\fInum\fR] genetic code\&. Individual modifications can be appended using
-\fI,BBB\fR=<aa> B = A,C,G, or T\&. <aa> is the three letter code for an amino\-acid\&. More than one modification can be specified\&. eg
-\fB\-gcvert\fR,aga=Trp,agg=Trp uses the Vertebrate Mitochondrial code and the codons AGA and AGG changed to Tryptophan\&.
-.RE
-.PP
-\fB\-gcstd\fR
-.RS 4
-Use standard genetic code\&.
-.RE
-.PP
-\fB\-gcmet\fR
-.RS 4
-Use composite Metazoan mitochondrial genetic code\&.
-.RE
-.PP
-\fB\-gcvert\fR
-.RS 4
-Use Vertebrate mitochondrial genetic code\&.
-.RE
-.PP
-\fB\-gcinvert\fR
-.RS 4
-Use Invertebrate mitochondrial genetic code\&.
-.RE
-.PP
-\fB\-gcyeast\fR
-.RS 4
-Use Yeast mitochondrial genetic code\&.
-.RE
-.PP
-\fB\-gcprot\fR
-.RS 4
-Use Mold/Protozoan/Coelenterate mitochondrial genetic code\&.
-.RE
-.PP
-\fB\-gcciliate\fR
-.RS 4
-Use Ciliate genetic code\&.
-.RE
-.PP
-\fB\-gcflatworm\fR
-.RS 4
-Use Echinoderm/Flatworm mitochondrial genetic code
-.RE
-.PP
-\fB\-gceuplot\fR
-.RS 4
-Use Euplotid genetic code\&.
-.RE
-.PP
-\fB\-gcbact\fR
-.RS 4
-Use Bacterial/Plant Chloroplast genetic code\&.
-.RE
-.PP
-\fB\-gcaltyeast\fR
-.RS 4
-Use alternative Yeast genetic code\&.
-.RE
-.PP
-\fB\-gcascid\fR
-.RS 4
-Use Ascidian Mitochondrial genetic code\&.
-.RE
-.PP
-\fB\-gcaltflat\fR
-.RS 4
-Use alternative Flatworm Mitochondrial genetic code\&.
-.RE
-.PP
-\fB\-gcblep\fR
-.RS 4
-Use Blepharisma genetic code\&.
-.RE
-.PP
-\fB\-gcchloroph\fR
-.RS 4
-Use Chlorophycean Mitochondrial genetic code\&.
-.RE
-.PP
-\fB\-gctrem\fR
-.RS 4
-Use Trematode Mitochondrial genetic code\&.
-.RE
-.PP
-\fB\-gcscen\fR
-.RS 4
-Use Scenedesmus obliquus Mitochondrial genetic code\&.
-.RE
-.PP
-\fB\-gcthraust\fR
-.RS 4
-Use Thraustochytrium Mitochondrial genetic code\&.
-.RE
-.PP
-\fB\-tv\fR
-.RS 4
-Do not search for mitochondrial TV replacement loop tRNA genes\&. Only relevant if
-\fB\-mt\fR
-used\&.
-.RE
-.PP
-\fB\-c7\fR
-.RS 4
-Search for tRNA genes with 7 base C\-loops only\&.
-.RE
-.PP
-\fB\-i\fR
-.RS 4
-Search for tRNA genes with introns in anticodon loop with maximum length 3000 bases\&. Minimum intron length is 0 bases\&. Ignored if
-\fB\-m\fR
-is specified\&.
-.RE
-.PP
-\fB\-i\fR[\fImax\fR]
-.RS 4
-Search for tRNA genes with introns in anticodon loop with maximum length [\fImax\fR] bases\&. Minimum intron length is 0 bases\&. Ignored if
-\fB\-m\fR
-is specified\&.
-.RE
-.PP
-\fB\-i\fR[\fImin\fR],[\fImax\fR]
-.RS 4
-Search for tRNA genes with introns in anticodon loop with maximum length [\fImax\fR] bases, and minimum length [\fImin\fR] bases\&. Ignored if
-\fB\-m\fR
-is specified\&.
-.RE
-.PP
-\fB\-io\fR
-.RS 4
-Same as
-\fB\-i\fR, but allow tRNA genes with long introns to overlap shorter tRNA genes\&.
-.RE
-.PP
-\fB\-if\fR
-.RS 4
-Same as
-\fB\-i\fR, but fix intron between positions 37 and 38 on C\-loop (one base after anticodon)\&.
-.RE
-.PP
-\fB\-ifo\fR
-.RS 4
-Same as
-\fB\-if\fR
-and
-\fB\-io\fR
-combined\&.
-.RE
-.PP
-\fB\-ir\fR
-.RS 4
-Same as
-\fB\-i\fR, but report tRNA genes with minimum length [\fImin\fR] bases rather than search for tRNA genes with minimum length [\fImin\fR] bases\&. With this switch, [\fImin\fR] acts as an output filter, minimum intron length for searching is still 0 bases\&.
-.RE
-.PP
-\fB\-c\fR
-.RS 4
-Assume that each sequence has a circular topology\&. Search wraps around each end\&. Default setting\&.
-.RE
-.PP
-\fB\-l\fR
-.RS 4
-Assume that each sequence has a linear topology\&. Search does not wrap\&.
-.RE
-.PP
-\fB\-d\fR
-.RS 4
-Double\&. Search both strands of each sequence\&. Default setting\&.
-.RE
-.PP
-\fB\-s\fR or \fB\-s+\fR
-.RS 4
-Single\&. Do not search the complementary (antisense) strand of each sequence\&.
-.RE
-.PP
-\fB\-sc\fR or \fB\-s\-\fR
-.RS 4
-Single complementary\&. Do not search the sense strand of each sequence\&.
-.RE
-.PP
-\fB\-ps\fR
-.RS 4
-Lower scoring thresholds to 95% of default levels\&.
-.RE
-.PP
-\fB\-ps\fR[\fInum\fR]
-.RS 4
-Change scoring thresholds to [\fInum\fR] percent of default levels\&.
-.RE
-.PP
-\fB\-rp\fR
-.RS 4
-Flag possible pseudogenes (score < 100 or tRNA anticodon loop <> 7 bases long)\&. Note that genes with score < 100 will not be detected or flagged if scoring thresholds are not also changed to below 100% (see \-ps switch)\&.
-.RE
-.PP
-\fB\-seq\fR
-.RS 4
-Print out primary sequence\&.
-.RE
-.PP
-\fB\-br\fR
-.RS 4
-Show secondary structure of tRNA gene primary sequence using round brackets\&.
-.RE
-.PP
-\fB\-fasta\fR
-.RS 4
-Print out primary sequence in fasta format\&.
-.RE
-.PP
-\fB\-fo\fR
-.RS 4
-Print out primary sequence in fasta format only (no secondary structure)\&.
-.RE
-.PP
-\fB\-fon\fR
-.RS 4
-Same as
-\fB\-fo\fR, with sequence and gene numbering in header\&.
-.RE
-.PP
-\fB\-fos\fR
-.RS 4
-Same as
-\fB\-fo\fR, with no spaces in header\&.
-.RE
-.PP
-\fB\-fons\fR
-.RS 4
-Same as
-\fB\-fo\fR, with sequence and gene numbering, but no spaces\&.
-.RE
-.PP
-\fB\-w\fR
-.RS 4
-Print out in Batch mode\&.
-.RE
-.PP
-\fB\-ss\fR
-.RS 4
-Use the stricter canonical 1\-2 bp spacer1 and 1 bp spacer2\&. Ignored if
-\fB\-mt\fR
-set\&. Default is to allow 3 bp spacer1 and 0\-2 bp spacer2, which may degrade selectivity\&.
-.RE
-.PP
-\fB\-v\fR
-.RS 4
-Verbose\&. Prints out information during search to STDERR\&.
-.RE
-.PP
-\fB\-a\fR
-.RS 4
-Print out tRNA domain for tmRNA genes\&.
-.RE
-.PP
-\fB\-a7\fR
-.RS 4
-Restrict tRNA astem length to a maximum of 7 bases
-.RE
-.PP
-\fB\-aa\fR
-.RS 4
-Display message if predicted iso\-acceptor species does not match species in sequence name (if present)\&.
-.RE
-.PP
-\fB\-j\fR
-.RS 4
-Display 4\-base sequence on 3\*(Aq end of astem regardless of predicted amino\-acyl acceptor length\&.
-.RE
-.PP
-\fB\-jr\fR
-.RS 4
-Allow some divergence of 3\*(Aq amino\-acyl acceptor sequence from NCCA\&.
-.RE
-.PP
-\fB\-jr4\fR
-.RS 4
-Allow some divergence of 3\*(Aq amino\-acyl acceptor sequence from NCCA, and display 4 bases\&.
-.RE
-.PP
-\fB\-q\fR
-.RS 4
-Dont print configuration line (which switchs and files were used)\&.
-.RE
-.PP
-\fB\-rn\fR
-.RS 4
-Repeat sequence name before summary information\&.
-.RE
-.PP
-\fB\-O\fR [\fIoutfile\fR]
-.RS 4
-Print output to
-\fI\&. If [\*(Aqoutfile\fR] already exists, it is overwritten\&. By default all output goes to stdout\&.
-.RE
-.SH "DESCRIPTION"
-.sp
-aragorn detects tRNA, mtRNA, and tmRNA genes\&. A minimum requirement is at least a 32 bit compiler architecture (variable types int and unsigned int are at least 4 bytes long)\&.
-.sp
-[\fIFILE\fR] is assumed to contain one or more sequences in FASTA format\&. Results of the search are printed to STDOUT\&. All switches are optional and case\-insensitive\&. Unless \-i is specified, tRNA genes containing introns are not detected\&.
-.SH "AUTHORS"
-.sp
-Bjorn Canback <bcanback at acgt\&.se>, Dean Laslett <gaiaquark at gmail\&.com>
-.SH "REFERENCES"
-.sp
-Laslett, D\&. and Canback, B\&. (2004) ARAGORN, a program for the detection of transfer RNA and transfer\-messenger RNA genes in nucleotide sequences Nucleic Acids Research, 32;11\-16
-.sp
-Laslett, D\&. and Canback, B\&. (2008) ARWEN: a program to detect tRNA genes in metazoan mitochondrial nucleotide sequences Bioinformatics, 24(2); 172\-175\&.
diff --git a/aragorn1.2.36.c b/aragorn1.2.37.c
similarity index 69%
rename from aragorn1.2.36.c
rename to aragorn1.2.37.c
index dea1d7c..8db114b 100644
--- a/aragorn1.2.36.c
+++ b/aragorn1.2.37.c
@@ -1,20 +1,24 @@
 
 /* 
 ---------------------------------------------------------------
-ARAGORN v1.2.36 Dean Laslett
+ARAGORN v1.2.37 Dean Laslett
 ---------------------------------------------------------------
 
     ARAGORN (together with ARWEN at last)
     Detects tRNA, mtRNA, and tmRNA genes in nucleotide sequences
-    Copyright (C) 2003-2015 Dean Laslett
+    Copyright (C) 2003-2018 Dean Laslett
 
-    Please, report bugs and suggestions of improvements to the authors
+    A minimum requirement is at least a 32 bit compiler architecture 
+    (variable types int and unsigned int are at least 4 bytes long).
+    Please report bugs and suggestions of improvements to the authors.
 
-    E-mail: Björn Canbäck: bcanback at acgt.se
-            Dean Laslett:  gaiaquark at gmail.com 
+    E-mail: Dean Laslett:  gaiaquark at gmail.com 
+            Björn Canbäck: bcanback at acgt.se
 
-    Version 1.2.36  February 15th, 2013.
-    Thanks to Sascha Steinbiss for fixing more bugs
+    Version 1.2.37  Oct 15th, 2014.
+    Thanks to Francisco Ossandon for finding many bugs and testing 
+    Thanks to Haruo Suzuki for finding bugs
+    Thanks to Sascha Steinbiss for fixing bugs
 
 
     Please reference the following papers if you use this
@@ -321,148 +325,10 @@ DAMAGES.
 
        END OF TERMS AND CONDITIONS
 
-*/
-
-
 
-
-/*
 ---------------------------------------------------------------
-ARAGORN v1.2.36 Dean Laslett
+ARAGORN v1.2.37 Dean Laslett
 ---------------------------------------------------------------
-
-
-aragorn detects tRNA, mtRNA, and tmRNA genes.
-A minimum requirement is at least a 32 bit compiler architecture 
-(variable types int and unsigned int are at least 4 bytes long).
-
-Usage:
-aragorn -v -s -d -c -l -j -a -q -rn -w -ifro<min>,<max> -t -m -mt 
-        -gc -tv -seq -br -fasta -fo -o <outfile> <filename>
-
-<filename> is assumed to contain one or more sequences
-in FASTA format. Results of the search are printed to
-STDOUT. All switches are optional and case-insensitive.
-Unless -i is specified, tRNA genes containing introns
-are not detected.
-
-    -m            Search for tmRNA genes.
-    -t            Search for tRNA genes.
-                  By default, all are detected. If one of
-                  -m or -t is specified, then the other 
-                  is not detected unless specified as well.
-    -mt           Search for Metazoan mitochondrial tRNA genes.
-                  tRNA genes with introns not detected. -i,-sr switchs
-                  ignored. Composite Metazoan mitochondrial
-                  genetic code used.
-    -mtmam        Search for Mammalian mitochondrial tRNA
-                  genes. -i,-sr switchs ignored. -tv switch set.
-                  Mammalian mitochondrial genetic code used.
-    -mtx          Same as -mt but low scoring tRNA genes are 
-                  not reported.
-    -mtd          Overlapping metazoan mitochondrial tRNA genes 
-                  on opposite strands are reported.
-    -gc<num>      Use the GenBank transl_table = <num> genetic code.
-    -gcstd        Use standard genetic code.
-    -gcmet        Use composite Metazoan mitochondrial genetic code.
-    -gcvert       Use Vertebrate mitochondrial genetic code.
-    -gcinvert     Use Invertebrate mitochondrial genetic code.
-    -gcyeast      Use Yeast mitochondrial genetic code.
-    -gcprot       Use Mold/Protozoan/Coelenterate mitochondrial genetic code.
-    -gcciliate    Use Ciliate genetic code.
-    -gcflatworm   Use Echinoderm/Flatworm mitochondrial genetic code
-    -gceuplot     Use Euplotid genetic code.
-    -gcbact       Use Bacterial/Plant Chloroplast genetic code.
-    -gcaltyeast   Use alternative Yeast genetic code.
-    -gcascid      Use Ascidian Mitochondrial genetic code.
-    -gcaltflat    Use alternative Flatworm Mitochondrial genetic code.
-    -gcblep       Use Blepharisma genetic code.
-    -gcchloroph   Use Chlorophycean Mitochondrial genetic code.
-    -gctrem       Use Trematode Mitochondrial genetic code.
-    -gcscen       Use Scenedesmus obliquus Mitochondrial genetic code.
-    -gcthraust    Use Thraustochytrium Mitochondrial genetic code.
-                  Individual modifications can be appended using
-    ,BBB=<aa>     B = A,C,G, or T. <aa> is the three letter
-                  code for an amino-acid. More than one modification
-                  can be specified. eg -gcvert,aga=Trp,agg=Trp uses
-                  the Vertebrate Mitochondrial code and the codons
-                  AGA and AGG changed to Tryptophan.          
-    -tv           Do not search for mitochondrial TV replacement
-                  loop tRNA genes. Only relevant if -mt used.
-    -c7           Search for tRNA genes with 7 base C-loops only.
-    -i            Search for tRNA genes with introns in
-                  anticodon loop with maximum length 3000
-                  bases. Minimum intron length is 0 bases.
-                  Ignored if -m is specified.
-    -i<max>       Search for tRNA genes with introns in
-                  anticodon loop with maximum length <max>
-                  bases. Minimum intron length is 0 bases.
-                  Ignored if -m is specified.
-    -i<min>,<max> Search for tRNA genes with introns in
-                  anticodon loop with maximum length <max>
-                  bases, and minimum length <min> bases.
-                  Ignored if -m is specified.
-    -io           Same as -i, but allow tRNA genes with long
-                  introns to overlap shorter tRNA genes.
-    -if           Same as -i, but fix intron between positions
-                  37 and 38 on C-loop (one base after anticodon).
-    -ifo          Same as -if and -io combined.
-    -ir           Same as -i, but report tRNA genes with minimum
-                  length <min> bases rather than search for 
-                  tRNA genes with minimum length <min> bases.
-                  With this switch, <min> acts as an output filter,
-                  minimum intron length for searching is still 0 bases.
-    -c            Assume that each sequence has a circular
-                  topology. Search wraps around each end.
-                  Default setting.
-    -l            Assume that each sequence has a linear
-                  topology. Search does not wrap.
-    -d            Double. Search both strands of each
-                  sequence. Default setting.
-    -s  or -s+    Single. Do not search the complementary
-                  (antisense) strand of each sequence.
-    -sc or -s-    Single complementary. Do not search the sense
-                  strand of each sequence.
-    -ps           Lower scoring thresholds to 95% of default levels.
-    -ps<num>      Change scoring thresholds to <num> percent of default levels.
-    -rp           Flag possible pseudogenes (score < 100 or tRNA anticodon
-                  loop <> 7 bases long). Note that genes with score < 100
-                  will not be detected or flagged if scoring thresholds are not
-                  also changed to below 100% (see -ps switch).
-    -seq          Print out primary sequence.
-    -br           Show secondary structure of tRNA gene primary sequence
-                  using round brackets.
-    -fasta        Print out primary sequence in fasta format.
-    -fo           Print out primary sequence in fasta format only 
-                  (no secondary structure). 
-    -fon          Same as -fo, with sequence and gene numbering in header. 
-    -fos          Same as -fo, with no spaces in header. 
-    -fons         Same as -fo, with sequence and gene numbering, but no spaces.
-    -w            Print out in Batch mode.
-    -ss           Use the stricter canonical 1-2 bp spacer1 and
-                  1 bp spacer2. Ignored if -mt set. Default is to
-                  allow 3 bp spacer1 and 0-2 bp spacer2, which may 
-                  degrade selectivity.\n");
-    -v            Verbose. Prints out information during
-                  search to STDERR.
-    -a            Print out tRNA domain for tmRNA genes.
-    -a7           Restrict tRNA astem length to a maximum of 7 bases
-    -aa           Display message if predicted iso-acceptor species
-                  does not match species in sequence name (if present).
-    -j            Display 4-base sequence on 3' end of astem
-                  regardless of predicted amino-acyl acceptor length.
-    -jr           Allow some divergence of 3' amino-acyl acceptor
-                  sequence from NCCA.
-    -jr4          Allow some divergence of 3' amino-acyl acceptor
-                  sequence from NCCA, and display 4 bases.
-    -q            Dont print configuration line (which switchs
-                  and files were used).
-    -rn           Repeat sequence name before summary information.
-    -O <outfile>  Print output to <outfile>. If <outfile>
-                  already exists, it is overwritten.  By default
-                  all output goes to stdout.
-
-
 */
 
 
@@ -476,12 +342,14 @@ are not detected.
 #endif
 
 
+#define NOCHAR          '\0'
 #define DLIM            '\n'
 #define STRLEN          4001
 #define STRLENM1        4000
 #define SHORTSTRLEN     51
 #define SHORTSTRLENM1   50
 #define KEYLEN          15
+#define NHELPLINE       173
 #define INACTIVE        2.0e+35
 #define IINACTIVE       2000000001L
 #define ITHRESHOLD      2000000000L
@@ -504,7 +372,7 @@ are not detected.
 
 #define MAXGCMOD   16
 #define MAMMAL_MT  2
-#define NGENECODE  24
+#define NGENECODE  26
 #define METAZOAN_MT      0
 #define STANDARD         1
 #define VERTEBRATE_MT    2
@@ -563,7 +431,7 @@ are not detected.
 #define SLANTDL 8
 #define SLANT   5
 
-#define MATX 42  /* 41 */
+#define MATX 42  
 #define MATY 34
 
 
@@ -582,9 +450,9 @@ are not detected.
 #define MINTRNALEN      (MINCTRNALEN + 1)
 #define MAXTRNALEN      (MAXCTRNALEN + ASTEM2_EXT)
 #define MAXETRNALEN     (MAXTRNALEN + MAXINTRONLEN)
-#define VARMAX          26 /* 25 */
+#define VARMAX          26 
 #define VARMIN          3
-#define VARDIFF         23 /* 22 */               /* VARMAX - VARMIN */
+#define VARDIFF         23                /* VARMAX - VARMIN */
 #define MINTPTSDIST     50
 #define MAXTPTSDIST     321
 #define TPWINDOW        (MAXTPTSDIST - MINTPTSDIST + 1)
@@ -606,6 +474,7 @@ are not detected.
 #define TSWEEP          1000
 #define WRAP            2*MAXETRNALEN
 #define NPTAG           33
+#define MAXAGENELEN     (MAXETRNALEN + MAXTMRNALEN) 
 
 /*
 NOTE: If MAXPPINTRONDIST is increased, then validity of MAXTMRNALEN
@@ -624,21 +493,22 @@ must remain equal to or more than 2*MAXTMRNALEN and TSWEEP.
 #define CLOOP  3
 #define VAR    4
 
-#define NA   MAXINTRONLEN
-#define ND   100
-#define NT   200 
-#define NH   2000
-#define NTH  3000
-#define NC   5000 
-#define NGFT 5000   /* 100 */
-#define NTAG 474    /* 367 */
-#define LSEQ 20000
-#define ATBOND 2.5
-#define mtNA 1500
-#define mtND 150 
-#define mtNTH 3000 /* 750 */
-#define mtNTM  3
-#define mtNCDS 200 /* 500,20 */
+#define NA          MAXINTRONLEN
+#define ND          100
+#define NT          200 
+#define NH          2000
+#define NTH         3000
+#define NC          5000 
+#define NGFT        5000
+#define NTAG        1273
+#define NTAGMAX     1300
+#define LSEQ        20000
+#define ATBOND      2.5
+#define mtNA        1500
+#define mtND        150 
+#define mtNTH       3000 
+#define mtNTM       3
+#define mtNCDS      200
 #define mtNCDSCODON 6000
 #define mtGCBOND    0.0
 #define mtATBOND    -0.5
@@ -671,14 +541,14 @@ must remain equal to or more than 2*MAXTMRNALEN and TSWEEP.
 #define srpDMAXLEN  300
 #define srpDMINLEN  100
 #define srpNH       200
-#define srpNS       500  /* 100 */
+#define srpNS       500
 #define srpMAXHPL   14
 #define srpMAXSP    6
-#define srpMAXSTEM  6500 /* 6000 */
+#define srpMAXSTEM  6500
 #define srpDISPMAX  4*srpMAXLEN
 #define srpMAXSPACER 12 
 #define srpMAXNISTEMS 10
-#define srpNESTMAX  2  /* 3 */
+#define srpNESTMAX  2
 
 #define cdsMAXLEN   3000
 #define NCDS        200 
@@ -691,24 +561,32 @@ typedef struct { long start;
                  long antistart;
                  long antistop;
                  int genetype;
+                 int pseudogene;
+                 int permuted;
+                 int detected;
                  char species[SHORTSTRLEN]; } annotated_gene;
                  
 
 typedef struct { char filename[80];
                  FILE *f;
                  char seqname[STRLEN];
+                 int bugmode;
                  int datatype;
                  double gc;
+                 long filepointer;
                  long ps;
                  long psmax;
                  long seqstart;
+                 long seqstartoff;
                  long nextseq;
-                 int ns; 
+                 long nextseqoff;
+                 int ns,nf;
+                 long aseqlen; 
                  int nagene[NS];
                  annotated_gene gene[NGFT]; } data_set;
 
 
-typedef struct { char name[80];
+typedef struct { char name[100];
                  int seq[MAXTRNALEN+1];
                  int eseq[MAXETRNALEN+1];
                  int *ps;
@@ -736,7 +614,9 @@ typedef struct { char name[80];
                  double energy;
                  int asst;
                  int tps;
-                 int tpe;   } gene;
+                 int tpe;
+                 int annotation;
+                 int annosc;   } gene;
 
 typedef struct { int *pos;
                  int stem;
@@ -818,11 +698,14 @@ typedef struct { int *pos;
                  int win; } cds_codon;
 
                 
+typedef struct { char name[50];
+                 char tag[50]; } tmrna_tag_entry;
 
 
-
-typedef struct { FILE *f;
+typedef struct { char genetypename[NS][10];
+                 FILE *f;
                  int batch;
+                 int batchfullspecies;
                  int repeatsn;
                  int trna;
                  int tmrna;
@@ -876,16 +759,20 @@ typedef struct { FILE *f;
                  int ngene[NS];
                  int nps;
                  int annotated;
+                 int dispmatch;
+                 int updatetmrnatags;
+                 int tagend;
+                 int trnalenmisthresh;
+                 int tmrnalenmisthresh;
                  int nagene[NS];
-                 int natfn;
-                 int natfp;
+                 int nafn[NS];
+                 int nafp[NS];
                  int natfpd;
                  int natfptv;
-                 int nacdsfn;
-                 int nacdsfp;
                  int lacds;
                  int ldcds;
                  long nabase;
+                 double pseudogenethresh;
                  double trnathresh;
                  double ttscanthresh;
                  double ttarmthresh;
@@ -910,6 +797,8 @@ typedef struct { FILE *f;
 
 
 
+
+
 /* Basepair matching matrices */
 
   int lbp[3][6][6] =
@@ -1236,7 +1125,7 @@ typedef struct { FILE *f;
      "Thr","Tyr","Asp","His","Asn",
      "Met","Trp","Glu","Gln","Lys",
      "Stop",
-     "seC",
+     "SeC",
      "Pyl",
      "(Arg|Stop|Ser|Gly)",
      "(Ile|Met)",
@@ -1245,8 +1134,13 @@ typedef struct { FILE *f;
 
   char ambig_aaname[4] = "???";
 
+/*
+aamap based on NCBI genetic code table (downloaded 26-Apr-2014)
+ftp://ftp.ncbi.nih.gov/entrez/misc/data/gc.prt
+*/
+
   int aamap[NGENECODE][64] = {
-   /* composite metazoan mt */
+   /* 0. composite metazoan mt */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1263,7 +1157,7 @@ typedef struct { FILE *f;
      25,Gly,Arg,23,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,26 },
-   /* standard */
+   /* 1. standard */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1280,8 +1174,8 @@ typedef struct { FILE *f;
      SeC,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-  /* vertebrate mt */
-  { Phe,Val,Leu,Ile,
+  /* 2. vertebrate mt */
+  {  Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
      Tyr,Asp,His,Asn,
@@ -1297,7 +1191,7 @@ typedef struct { FILE *f;
      Trp,Gly,Arg,Stop,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* yeast mt */
+   /* 3. yeast mt */
    { Phe,Val,Thr,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1314,7 +1208,7 @@ typedef struct { FILE *f;
      Trp,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* mold, protozoan, and coelenterate mt */
+   /* 4. mold, protozoan, and coelenterate mt */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1331,7 +1225,7 @@ typedef struct { FILE *f;
      Trp,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* invertebrate mt */
+   /* 5. invertebrate mt */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1348,7 +1242,7 @@ typedef struct { FILE *f;
      Trp,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* ciliate */
+   /* 6. ciliate */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1365,7 +1259,7 @@ typedef struct { FILE *f;
      SeC,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Gln,Glu,Gln,Lys },
-   /* deleted -> standard */
+   /* 7. deleted -> standard */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1382,7 +1276,7 @@ typedef struct { FILE *f;
      SeC,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* deleted -> standard */
+   /* 8. deleted -> standard */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1399,7 +1293,7 @@ typedef struct { FILE *f;
      SeC,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* echinoderm and flatworm mt */
+   /* 9. echinoderm and flatworm mt */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1416,7 +1310,7 @@ typedef struct { FILE *f;
      Trp,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Asn },
-   /* euplotid */
+   /* 10. euplotid */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1433,12 +1327,12 @@ typedef struct { FILE *f;
      Cys,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* bacterial and plant chloroplast */
+   /* 11. bacterial and plant chloroplast */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
      Tyr,Asp,His,Asn,
-     Leu,Val,Ser,Met,
+     Leu,Val,Leu,Met,
      Trp,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Pyl,Glu,Gln,Lys,
@@ -1450,7 +1344,7 @@ typedef struct { FILE *f;
      SeC,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* alternate yeast */
+   /* 12. alternate yeast */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1467,7 +1361,7 @@ typedef struct { FILE *f;
      SeC,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* ascidian mt */
+   /* 13. ascidian mt */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1484,7 +1378,7 @@ typedef struct { FILE *f;
      Trp,Gly,Arg,Gly,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* alternate flatworm mt */
+   /* 14. alternate flatworm mt */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1501,7 +1395,7 @@ typedef struct { FILE *f;
      Trp,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
      Tyr,Glu,Gln,Asn },
-   /* blepharisma */
+   /* 15. blepharisma */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1518,7 +1412,7 @@ typedef struct { FILE *f;
      SeC,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* chlorophycean mt */
+   /* 16. chlorophycean mt */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1535,7 +1429,7 @@ typedef struct { FILE *f;
      SeC,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* deleted -> standard */
+   /* 17. deleted -> standard */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1552,7 +1446,7 @@ typedef struct { FILE *f;
      SeC,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* deleted -> standard */
+   /* 18. deleted -> standard */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1569,7 +1463,7 @@ typedef struct { FILE *f;
      SeC,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* deleted -> standard */
+   /* 19. deleted -> standard */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1586,7 +1480,7 @@ typedef struct { FILE *f;
      SeC,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* deleted -> standard */
+   /* 20. deleted -> standard */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1603,7 +1497,7 @@ typedef struct { FILE *f;
      SeC,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* trematode mt */
+   /* 21. trematode mt */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1620,7 +1514,7 @@ typedef struct { FILE *f;
      Trp,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* scenedesmus obliquus mt*/
+   /* 22. scenedesmus obliquus mt*/
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1637,7 +1531,7 @@ typedef struct { FILE *f;
      SeC,Gly,Arg,Arg,
      Stop,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys },
-   /* thraustochytrium mt */
+   /* 23. thraustochytrium mt */
    { Phe,Val,Leu,Ile,
      Cys,Gly,Arg,Ser,
      Ser,Ala,Pro,Thr,
@@ -1653,6 +1547,40 @@ typedef struct { FILE *f;
      Stop,Val,Leu,Ile,
      SeC,Gly,Arg,Arg,
      Ser,Ala,Pro,Thr,
+     Stop,Glu,Gln,Lys },
+   /* 24. Pterobranchia mt */
+   { Phe,Val,Leu,Ile,
+     Cys,Gly,Arg,Ser,
+     Ser,Ala,Pro,Thr,
+     Tyr,Asp,His,Asn,
+     Leu,Val,Leu,Met,
+     Trp,Gly,Arg,Lys,
+     Ser,Ala,Pro,Thr,
+     Pyl,Glu,Gln,Lys,
+     Phe,Val,Leu,Ile,
+     Cys,Gly,Arg,Ser,
+     Ser,Ala,Pro,Thr,
+     Tyr,Asp,His,Asn,
+     Leu,Val,Leu,Ile,
+     Trp,Gly,Arg,Ser,
+     Ser,Ala,Pro,Thr,
+     Stop,Glu,Gln,Lys },
+   /* 25. Gracilibacteria */
+   { Phe,Val,Leu,Ile,
+     Cys,Gly,Arg,Ser,
+     Ser,Ala,Pro,Thr,
+     Tyr,Asp,His,Asn,
+     Leu,Val,Leu,Met,
+     Trp,Gly,Arg,Arg,
+     Ser,Ala,Pro,Thr,
+     Pyl,Glu,Gln,Lys,
+     Phe,Val,Leu,Ile,
+     Cys,Gly,Arg,Ser,
+     Ser,Ala,Pro,Thr,
+     Tyr,Asp,His,Asn,
+     Leu,Val,Leu,Ile,
+     Gly,Gly,Arg,Arg,
+     Ser,Ala,Pro,Thr,
      Stop,Glu,Gln,Lys } };
 
 
@@ -1661,6 +1589,1466 @@ typedef struct { FILE *f;
 /* POINTERS TO DETECTED GENES */
 
   gene *ts;
+
+
+/* HELP MENU */
+
+char helpmenu[NHELPLINE][81] =
+{
+"----------------------------",
+"ARAGORN v1.2.37 Dean Laslett",
+"----------------------------\n",
+"Please reference the following papers if you use this",
+"program as part of any published research.\n",
+"Laslett, D. and Canback, B. (2004) ARAGORN, a",
+"program for the detection of transfer RNA and transfer-messenger",
+"RNA genes in nucleotide sequences",
+"Nucleic Acids Research, 32;11-16\n",
+"Laslett, D. and Canback, B. (2008) ARWEN: a",
+"program to detect tRNA genes in metazoan mitochondrial",
+"nucleotide sequences",
+"Bioinformatics, 24(2); 172-175.\n\n",
+"ARAGORN detects tRNA, mtRNA, and tmRNA genes.\n",
+"Usage:",
+"aragorn -v -e -s -d -c -l -j -a -q -rn -w -ifro<min>,<max> -t -mt -m", 
+"        -rp -ps -gc -tv -seq -br -fasta -fo -o <outfile> <filename>\n",
+"<filename> is assumed to contain one or more sequences",
+"in FASTA or GENBANK format. Results of the search are printed",
+"to STDOUT. All switches are optional and case-insensitive.",
+"Unless -i is specified, tRNA genes containing introns",
+"are not detected.\n",
+"    -m            Search for tmRNA genes.",
+"    -t            Search for tRNA genes.",
+"                  By default, all are detected. If one of",
+"                  -m or -t is specified, then the other", 
+"                  is not detected unless specified as well.",
+"    -mt           Search for Metazoan mitochondrial tRNA genes.",
+"                  tRNA genes with introns not detected. -i,-sr switches",
+"                  ignored. Composite Metazoan mitochondrial",
+"                  genetic code used.",
+"    -mtmam        Search for Mammalian mitochondrial tRNA",
+"                  genes. -i switch ignored. -tv switch set.",
+"                  Mammalian mitochondrial genetic code used.",
+"    -mtx          Same as -mt but low scoring tRNA genes are", 
+"                  not reported.",
+"    -mtd          Overlapping metazoan mitochondrial tRNA genes", 
+"                  on opposite strands are reported.",
+"    -gc<num>      Use the GenBank transl_table = <num> genetic code.",
+"    -gcstd        Use standard genetic code.",
+"    -gcmet        Use composite Metazoan mitochondrial genetic code.",
+"    -gcvert       Use Vertebrate mitochondrial genetic code.",
+"    -gcinvert     Use Invertebrate mitochondrial genetic code.",
+"    -gcyeast      Use Yeast mitochondrial genetic code.",
+"    -gcprot       Use Mold/Protozoan/Coelenterate mitochondrial genetic code.",
+"    -gcciliate    Use Ciliate genetic code.",
+"    -gcflatworm   Use Echinoderm/Flatworm mitochondrial genetic code.",
+"    -gceuplot     Use Euplotid genetic code.",
+"    -gcbact       Use Bacterial/Plant chloroplast genetic code.",
+"    -gcaltyeast   Use alternative Yeast genetic code.",
+"    -gcascid      Use Ascidian mitochondrial genetic code.",
+"    -gcaltflat    Use alternative Flatworm mitochondrial genetic code.",
+"    -gcblep       Use Blepharisma genetic code.",
+"    -gcchloroph   Use Chlorophycean mitochondrial genetic code.",
+"    -gctrem       Use Trematode mitochondrial genetic code.",
+"    -gcscen       Use Scenedesmus obliquus mitochondrial genetic code.",
+"    -gcthraust    Use Thraustochytrium mitochondrial genetic code.",
+"    -gcptero      Use Pterobranchia mitochondrial genetic code.",
+"    -gcgrac       Use Gracilibacteria genetic code.",
+"                  Individual modifications can be appended using",
+"    ,BBB=<aa>     B = A,C,G, or T. <aa> is the three letter",
+"                  code for an amino-acid. More than one modification",
+"                  can be specified. eg -gcvert,aga=Trp,agg=Trp uses",
+"                  the Vertebrate Mitochondrial code and the codons",
+"                  AGA and AGG changed to Tryptophan.",          
+"    -c            Assume that each sequence has a circular",
+"                  topology. Search wraps around each end.",
+"                  Default setting.",
+"    -l            Assume that each sequence has a linear",
+"                  topology. Search does not wrap.",
+"    -d            Double. Search both strands of each",
+"                  sequence. Default setting.",
+"    -s  or -s+    Single. Do not search the complementary",
+"                  (antisense) strand of each sequence.",
+"    -sc or -s-    Single complementary. Do not search the sense",
+"                  strand of each sequence.",
+"    -i            Search for tRNA genes with introns in",
+"                  anticodon loop with maximum length 3000",
+"                  bases. Minimum intron length is 0 bases.",
+"                  Ignored if -m is specified.",
+"    -i<max>       Search for tRNA genes with introns in",
+"                  anticodon loop with maximum length <max>",
+"                  bases. Minimum intron length is 0 bases.",
+"                  Ignored if -m is specified.",
+"    -i<min>,<max> Search for tRNA genes with introns in",
+"                  anticodon loop with maximum length <max>",
+"                  bases, and minimum length <min> bases.",
+"                  Ignored if -m is specified.",
+"    -io           Same as -i, but allow tRNA genes with long",
+"                  introns to overlap shorter tRNA genes.",
+"    -if           Same as -i, but fix intron between positions",
+"                  37 and 38 on C-loop (one base after anticodon).",
+"    -ifo          Same as -if and -io combined.",
+"    -ir           Same as -i, but report tRNA genes with minimum",
+"                  length <min> bases rather than search for", 
+"                  tRNA genes with minimum length <min> bases.",
+"                  With this switch, <min> acts as an output filter,",
+"                  minimum intron length for searching is still 0 bases.",
+"    -tv           Do not search for mitochondrial TV replacement",
+"                  loop tRNA genes. Only relevant if -mt used.",
+"    -c7           Search for tRNA genes with 7 base C-loops only.",
+"    -ss           Use the stricter canonical 1-2 bp spacer1 and",
+"                  1 bp spacer2. Ignored if -mt set. Default is to",
+"                  allow 3 bp spacer1 and 0-2 bp spacer2, which may",
+"                  degrade selectivity.",
+"    -j            Display 4-base sequence on 3' end of astem",
+"                  regardless of predicted amino-acyl acceptor length.",
+"    -jr           Allow some divergence of 3' amino-acyl acceptor",
+"                  sequence from NCCA.",
+"    -jr4          Allow some divergence of 3' amino-acyl acceptor",
+"                  sequence from NCCA, and display 4 bases.",
+"    -e            Print out score for each reported gene.",
+"    -ps           Lower scoring thresholds to 95% of default levels.",
+"    -ps<num>      Change scoring thresholds to <num> percent of default levels.",
+"    -rp           Flag possible pseudogenes (score < 100 or tRNA anticodon",
+"                  loop <> 7 bases long). Note that genes with score < 100",
+"                  will not be detected or flagged if scoring thresholds are not",
+"                  also changed to below 100% (see -ps switch).",
+"    -rp<num>      Flag possible pseudogenes and change score threshold to <num>",
+"                  percent of default levels.",
+"    -seq          Print out primary sequence.",
+"    -br           Show secondary structure of tRNA gene primary sequence",
+"                  using round brackets.",
+"    -fasta        Print out primary sequence in fasta format.",
+"    -fo           Print out primary sequence in fasta format only",
+"                  (no secondary structure).", 
+"    -fon          Same as -fo, with sequence and gene numbering in header.", 
+"    -fos          Same as -fo, with no spaces in header.", 
+"    -fons         Same as -fo, with sequence and gene numbering, but no spaces.",
+"                  as (<species>|<species>) instead of ???",
+"    -v            Verbose. Prints out information during",
+"                  search to STDERR.",
+"    -a            Print out tRNA domain for tmRNA genes.",
+"    -a7           Restrict tRNA astem length to a maximum of 7 bases.",
+"    -aa           Display message if predicted iso-acceptor species",
+"                  does not match species in sequence name (if present).",
+"    -amt<num>     Change annotated tRNA length mismatch reporting threshold to",
+"                  <num> bases when searching GENBANK files. Default is 10 bases.",
+"    -amm<num>     Change annotated tmRNA length mismatch reporting threshold to",
+"                  <num> bases when searching GENBANK files. Default is 30 bases.",
+"    -q            Don't print configuration line (which switches",
+"                  and files were used).",
+"    -rn           Repeat sequence name before summary information.",
+"    -o <outfile>  Print output to <outfile>. If <outfile>",
+"                  already exists, it is overwritten. By default",
+"                  all output goes to stdout.",
+"    -w            Print out in batch mode.",
+"    -wa           Same as -w, but for 6 or 8 base anticodon",
+"                  loops, print possible iso-acceptor species",
+"                  For tRNA genes, batch mode output is in the form:\n",
+"                  Sequence name",
+"                  N genes found",
+"                  1 tRNA-<species> [locus 1] <Apos> (nnn)",
+"                  i(<intron position>,<intron length>)",
+"                            .          ",
+"                            .          ",
+"                  N tRNA-<species> [Locus N] <Apos> (nnn)",
+"                  i(<intron position>,<intron length>)\n",
+"                  N is the number of genes found",
+"                  <species> is the tRNA iso-acceptor species",
+"                  <Apos> is the tRNA anticodon relative position",
+"                  (nnn) is the tRNA anticodon base triplet",
+"                  i means the tRNA gene has a C-loop intron\n",
+"                  For tmRNA genes, output is in the form:\n",
+"                  n tmRNA(p) [Locus n] <tag offset>,<tag end offset>",
+"                  <tag peptide>\n",
+"                  p means the tmRNA gene is permuted",
+"    -wunix        Get around problem with some windows gcc compilers",
+"                  (found so far in Strawberry Perl and Active Perl)",
+"                  when reading Unix files.",
+"                  Execution speed may be slower for large files.",
+"                  Execution speed will be a lot slower for files",
+"                  with many small sequences." 
+};
+
+
+
+/* tmRNA TAG PEPTIDE DATABASE */
+  
+tmrna_tag_entry tagdatabase[NTAGMAX] =
+   { { "Acaryochloris marina","ANNIVSFARQRTATAVA"},
+     { "Accumulibacter phosphatis","ANDERFALAA"},
+     { "Acetobacter pasteurianus","ANDNTEVLAVAA"},
+     { "Acetobacterium woodii","AKTEKSYGLALAA"},
+     { "Acetohalobium arabaticum","ANDNSYALAAA"},
+     { "Achromobacter xylosoxidans","ANDERFALAA"},
+     { "Acidaminococcus fermentans","ADDSYALAA"},
+     { "Acidaminococcus sp. D21","AEDSYALAA"},
+     { "Acidimicrobium ferrooxidans","AEPELALAA"},
+     { "Acidiphilium cryptum","ANDNFEALAVAA"},
+     { "Acidithiobacillus caldus","ANDSNYALAA"},
+     { "Acidithiobacillus ferrivorans","ANDSNYALAA"},
+     { "Acidithiobacillus ferrooxidans","ANDSNYALAA"},
+     { "Acidobacterium capsulatum","ANNNLALAA"},
+     { "Acidobacterium Ellin6076","ANTQFAYAA"},
+     { "Acidothermus cellulolyticus","ANSSRADFALAA"},
+     { "Acidovorax avenae","ANDERFALAA"},
+     { "Acidovorax citrulli","ANDERFALAA"},
+     { "Acidovorax sp. JS42","ANDERFALAA"},
+     { "Acidovorax sp. KKS102","ANDERFALAA"},
+     { "Acinetobacter ADP1","ANDETYALAA"},
+     { "Acinetobacter baumannii","ANDETYALAA"},
+     { "Acinetobacter oleivorans","ANDETYALAA"},
+     { "Acinetobacter sp. ADP1","ANDETYALAA"},
+     { "Acinetobacter sp. SH024","ANDETYALAA"},
+     { "Actinobacillus actinomycetemcomitans","ANDEQYALAA"},
+     { "Actinobacillus pleuropneumoniae","ANDEQYALAA"},
+     { "Actinobacillus succinogenes","ANDEQYALAA"},
+     { "Actinobacillus suis","ANDEQYALAA"},
+     { "Actinomyces naeslundii","ADNTRTDFALAA"},
+     { "Actinoplanes missouriensis","AKDNSRADFALAA"},
+     { "Actinoplanes sp. SE50/110","ANSKFDADQYALAA"},
+     { "Actinosynnema mirum","AKSNDQRAFALAA"},
+     { "Advenella kashmirensis","ANDESYALAA"},
+     { "Aequorivita sublithincola","GENNYALAA"},
+     { "Aerococcus urinae","DKNESQSLAFAA"},
+     { "Aeromonas hydrophila 1","ANDENYALAA"},
+     { "Aeromonas hydrophila 2","ANDENYALAA"},
+     { "Aeromonas salmonicida","ANDENYALAA"},
+     { "Aeromonas veronii","ANDENYALAA"},
+     { "Aggregatibacter actinomycetemcomitans","ANDEQYALAA"},
+     { "Aggregatibacter aphrophilus","ANDEQYALAA"},
+     { "Agrobacterium fabrum","ANDNNAKEYALAA"},
+     { "Agrobacterium radiobacter","ANDNYAEARLAA"},
+     { "Agrobacterium sp. H13-3","ANDNNAKEYALAA"},
+     { "Agrobacterium tumefaciens 1","ANDNNAKEYALAA"},
+     { "Agrobacterium tumefaciens 2","ANDNNAKECALAA"},
+     { "Agrobacterium vitis","ANDNNAQGYAVAA"},
+     { "Akkermansia muciniphila","AESNDLALAA"},
+     { "Alcaligenes faecalis","ANDERFALAA"},
+     { "Alcaligenes viscolactis","ANDERFALAA"},
+     { "Alcanivorax borkumensis","ANDDSYALAA"},
+     { "Alcanivorax dieselolei","ANDDTYALAA"},
+     { "Alicycliphilus denitrificans","ANDERFALAA"},
+     { "Alicyclobacillus acidocaldarius","GKANRFTTQNKLALAA"},
+     { "Aliivibrio salmonicida","ANDENYALAA"},
+     { "Alistipes finegoldii","GNNSYALAA"},
+     { "Alkalilimnicola ehrlichii","ANDENYALAA"},
+     { "Alkaliphilus metalliredigenes","ANDNYSLAAA"},
+     { "Alkaliphilus metalliredigens","ANDNYSLAAA"},
+     { "Alkaliphilus oremlandii","ANDNYALAA"},
+     { "Allochromatium vinosum","ANDDNYALAA"},
+     { "alpha proteobacterium","ANESYALAA"},
+     { "Alphaproteobacteria SAR-1","ANDELALAA"},
+     { "Alteromonas macleodii","ANDETYALAA"},
+     { "Alteromonas sp. SN2","ANDENYALAA"},
+     { "Aminobacterium colombiense","VNNNNYALAA"},
+     { "Ammonifex degensii","ANNERVALAA"},
+     { "Amoebophilus asiaticus","GNNQVALAA"},
+     { "Amphibacillus xylanus","GKTNNYSLAAA"},
+     { "Amycolatopsis mediterranei","ADSSQREFALAA"},
+     { "Amycolicicoccus subflavus","ADNAQRSQSDFALAA"},
+     { "Anabaena variabilis","ANNIVKFARKDALVAA"},
+     { "Anaerobaculum mobile","ANENYALAA"},
+     { "Anaerococcus prevotii","ANNNSEANFALAA"},
+     { "Anaerolinea thermophila","VRKSGCRSGRSRTERKRAFGP"},
+     { "Anaeromyxobacter dehalogenans","ANEPMALAA"},
+     { "Anaeromyxobacter sp. Fw109-5","ANEPMALAA"},
+     { "Anaeromyxobacter sp. K","ANEPMALAA"},
+     { "Anaplasma centrale","ANDDFVAANDNMETAFVAAA"},
+     { "Anaplasma marginale","ANDDFVAANDNMETAFVAAA"},
+     { "Anaplasma phagocytophilum","ANDDFVAANDNVETAFVAAA"},
+     { "Anoxybacillus flavithermus","GKENYALAA"},
+     { "Aquifex aeolicus","APEAELALAA"},
+     { "Arcanobacterium haemolyticum","ANKQKSDFALAA"},
+     { "Arcobacter butzleri","ANNTNYAPAYAKAA"},
+     { "Arcobacter nitrofigilis","ANNTNYAPAYAKVA"},
+     { "Arcobacter sp. L","ANNTNYAPAYAKAA"},
+     { "Aromatoleum aromaticum","ANDERFAVAA"},
+     { "Arthrobacter arilaitensis","AESKRTDFALAA"},
+     { "Arthrobacter aurescens","AESKRTDFALAA"},
+     { "Arthrobacter chlorophenolicus","AESKRTDFALAA"},
+     { "Arthrobacter FB24","AKQTRTDFALAA"},
+     { "Arthrobacter phenanthrenivorans","AESKRTDFALAA"},
+     { "Arthrobacter sp. FB24","AKQTRTDFALAA"},
+     { "Arthrobacter sp. Rue61a","AESKRTDFALAA"},
+     { "Arthromitus sp. SFB-mouse-Japan","DKNYSLQAA"},
+     { "Arthromitus sp. SFB-rat-Yit","DKNYSLQAA"},
+     { "Azoarcus BH72","ANDERFALAA"},
+     { "Azoarcus EbN1","ANDERFAVAA"},
+     { "Azoarcus sp. BH72","ANDERFALAA"},
+     { "Azobacteroides pseudotrichonymphae","GENFYALAA"},
+     { "Azorhizobium caulinodans","ANDNYAPVAVAA"},
+     { "Azospira oryzae","ANDERFAIAA"},
+     { "Azospirillum brasilense","ANDNVAPVAVAA"},
+     { "Azospirillum lipoferum","ANDNVAQARLAA"},
+     { "Azospirillum sp. B510","ANDNVAQARLAA"},
+     { "Azotobacter vinelandii","ANDDNYALAA"},
+     { "Bacillus amyloliquefaciens","GKTKSFNQNLALAA"},
+     { "Bacillus anthracis","GKQNNLSLAA"},
+     { "Bacillus atrophaeus","GKTKSFNQNLALAA"},
+     { "Bacillus cellulosilyticus","GKQEDNFAFAA"},
+     { "Bacillus cereus","GKQNNLSLAA"},
+     { "Bacillus clausii","GKENNNFALAA"},
+     { "Bacillus coagulans","GKSNTKLALAA"},
+     { "Bacillus cytotoxicus","GKQQNNFALAA"},
+     { "Bacillus halodurans","GKENNNFALAA"},
+     { "Bacillus licheniformis","GKSNQNLALAA"},
+     { "Bacillus megaterium","GKSNNNFALAA"},
+     { "Bacillus phage","AKLNITNNELQVA"},
+     { "Bacillus pumilus","GKTKSFNQNLALAA"},
+     { "Bacillus selenitireducens","GKQDNDFALAAA"},
+     { "Bacillus stearothermophilus","GKQNYALAA"},
+     { "Bacillus subtilis","GKTNSFNQNVALAA"},
+     { "Bacillus thuringiensis","GKQNNLSLAA"},
+     { "Bacillus weihenstephanensis","GKQNNLSLAA"},
+     { "Bacillusphage G","AKLNITNNELQVA"},
+     { "Bacteriovorax marinus","AESNFAPAMAA"},
+     { "Bacteroides fragilis","GETNYALAA"},
+     { "Bacteroides helcogenes","GENNYALAA"},
+     { "Bacteroides salanitronis","GNENYALAA"},
+     { "Bacteroides thetaiotaomicron","GETNYALAA"},
+     { "Bacteroides vulgatus","GNENYALAA"},
+     { "Bartonella bacilliformis","ANDNYAEARLAA"},
+     { "Bartonella clarridgeiae","ANDNYAEARLIAA"},
+     { "Bartonella grahamii","ANDNYAEARLAA"},
+     { "Bartonella henselae","ANDNYAEARLAA"},
+     { "Bartonella quintana","ANDNYAEARLAA"},
+     { "Bartonella tribocorum","ANDNYAEARLAA"},
+     { "Baumannia cicadellinicola","ANNSQYESVALAA"},
+     { "Bdellovibrio bacteriovorus","GNDYALAA"},
+     { "Beijerinckia indica","ANDNYAPVAVAA"},
+     { "Belliella baltica","GESNYAMAA"},
+     { "Beutenbergia cavernae","ADSKRTDFALAA"},
+     { "Bifidobacterium adolescentis","AKSNRTEFALAA"},
+     { "Bifidobacterium animalis","AKSNRTEFALAA"},
+     { "Bifidobacterium asteroides","AKSNRTEFALAA"},
+     { "Bifidobacterium bifidum","AKSNRTEFALAA"},
+     { "Bifidobacterium breve","AKSNRTEFALAA"},
+     { "Bifidobacterium dentium","AKSNRTEFALAA"},
+     { "Bifidobacterium longum","AKSNRTEFALAA"},
+     { "Blastococcus saxobsidens","ADSNRADYALAA"},
+     { "Blattabacterium sp. (Blaberus giganteus)","GEKEYAFAA"},
+     { "Blattabacterium sp. (Blattella germanica) Bge","GEQQYAFAA"},
+     { "Blattabacterium sp. (Cryptocercus punctulatus)","GEKQYAFAA"},
+     { "Blattabacterium sp. (Mastotermes darwiniensis)","GEKQYAFAA"},
+     { "Blattabacterium sp. (Periplaneta americana)","GEKQYAFAA"},
+     { "Blochmannia floridanus","AKNKYNEPVALAA"},
+     { "Blochmannia pennsylvanicus","ANNTTYRESVALAA"},
+     { "Blochmannia vafer","ANYNYNESAALAA"},
+     { "Bolidomonas pacifica chloroplast","ANNILAFNRKSLSFA"},
+     { "Bordetella avium","ANDERFALAA"},
+     { "Bordetella bronchiseptica","ANDERFALAA"},
+     { "Bordetella parapertussis","ANDERFALAA"},
+     { "Bordetella pertussis","ANDERFALAA"},
+     { "Bordetella petrii","ANDERFALAA"},
+     { "Borrelia afzelii","AKNNNFTSSNLVMAA"},
+     { "Borrelia bissettii","AKNNNFTSSNLVMAA"},
+     { "Borrelia burgdorferi","AKNNNFTSSNLVMAA"},
+     { "Borrelia crocidurae","AKNNNFTSSDLVMAA"},
+     { "Borrelia duttonii","AKNNNFTSSDLVMAA"},
+     { "Borrelia garinii","AKNNNFTSSNLVMAA"},
+     { "Borrelia hermsii","ARNNNFTSSNLVMAA"},
+     { "Borrelia recurrentis","AKNNNFTSSDLVMAA"},
+     { "Borrelia turicatae","AKNNNFTSSNLVMAA"},
+     { "Brachybacterium faecium","AEPKRTDFALAA"},
+     { "Brachyspira hyodysenteriae","ADEYALAA"},
+     { "Brachyspira intermedia","ADEYALAA"},
+     { "Brachyspira murdochii","ADEYALAA"},
+     { "Brachyspira pilosicoli","ADEYALAA"},
+     { "Bradyrhizobium japonicum","ANDNFAPVAQAA"},
+     { "Bradyrhizobium sp. BTAi1","ANDNFAPVAQAA"},
+     { "Bradyrhizobium sp. ORS 278","ANDNFAPVAQAA"},
+     { "Bradyrhizobium sp. S23321","ANDNFAPVAQAA"},
+     { "Brevibacillus brevis","GNKQLSLAA"},
+     { "Brevibacterium linens","AKSNNRTDFALAA"},
+     { "Brucella abortus","ANDNNAQGYALAA"},
+     { "Brucella canis","ANDNNAQGYALAA"},
+     { "Brucella ceti","ANDNNAQGYALAA"},
+     { "Brucella melitensis","ANDNNAQGYALAA"},
+     { "Brucella ovis","ANDNNAQGYALAA"},
+     { "Brucella suis","ANDNNAQGYALAA"},
+     { "Buchnera aphidicola 1","ANNKQNYALAA"},
+     { "Buchnera aphidicola 2","ANNKQNYALAA"},
+     { "Buchnera aphidicola 3","AKQNQYALAA"},
+     { "Burkholderia ambifaria","ANDDTFALAA"},
+     { "Burkholderia cenocepacia","ANDDTFALAA"},
+     { "Burkholderia cepacia","ANDDTFALAA"},
+     { "Burkholderia fungorum","ANDDTFALAA"},
+     { "Burkholderia gladioli","ANDETFALAA"},
+     { "Burkholderia glumae","ANDDTFALAA"},
+     { "Burkholderia graminis","ANDDTFALAA"},
+     { "Burkholderia mallei","ANDDTFALAA"},
+     { "Burkholderia multivorans","ANDDTFALAA"},
+     { "Burkholderia phenoliruptrix","ANDDTFALAA"},
+     { "Burkholderia phymatum","ANDDTFALAA"},
+     { "Burkholderia phytofirmans","ANDETFALAA"},
+     { "Burkholderia pseudomallei","ANDDTFALAA"},
+     { "Burkholderia rhizoxinica","ANDETYALAA"},
+     { "Burkholderia sp. 383","ANDDTFALAA"},
+     { "Burkholderia sp. CCGE1001","ANDDTFALAA"},
+     { "Burkholderia sp. CCGE1002","ANDDTFALAA"},
+     { "Burkholderia sp. YI23","ANDDTFALAA"},
+     { "Burkholderia thailandensis","ANDDTFALAA"},
+     { "Burkholderia vietnamiensis","ANDDTFALAA"},
+     { "Burkholderia xenovorans","ANDDTFALAA"},
+     { "Butyrivibrio proteoclasticus","ANDNLALAA"},
+     { "Caldicellulosiruptor bescii","ADKAELALAA"},
+     { "Caldicellulosiruptor hydrothermalis","ADRTELALAA"},
+     { "Caldicellulosiruptor kristjanssonii","ADKAELALAA"},
+     { "Caldicellulosiruptor kronotskyensis","ADKAELALAA"},
+     { "Caldicellulosiruptor lactoaceticus","ADKAELALAA"},
+     { "Caldicellulosiruptor obsidiansis","AEKPQLALAA"},
+     { "Caldicellulosiruptor owensensis","AEKPQLALAA"},
+     { "Caldicellulosiruptor saccharolyticus","ADKAELALAA"},
+     { "Caldilinea aerophila","AKNTGKAFAFGTPATSVALAA"},
+     { "Caldisericum exile","ADYSYALAA"},
+     { "Calditerrivibrio nitroreducens","ANDEYALAAA"},
+     { "Campylobacter coli","ANNVKFAPAYAKAA"},
+     { "Campylobacter concisus","ANNVNFAPAYAKAA"},
+     { "Campylobacter curvus","ANNVKFAPAYAKAA"},
+     { "Campylobacter fetus 2","ANNVKFAPAYAKAA"},
+     { "Campylobacter hominis","ANNAKFAPAYAKIA"},
+     { "Campylobacter jejuni","ANNVKFAPAYAKAA"},
+     { "Campylobacter lari","ANNVKFAPAYAKAA"},
+     { "Campylobacter upsaliensis","ANNAKFAPAYAKVA"},
+     { "Candidatus atelocyanobacterium thalassa","ANNIVSFKRVAVAA"},
+     { "Capnocytophaga canimorsus","GENNYALAA"},
+     { "Capnocytophaga ochracea","GENNYALAA"},
+     { "Carboxydothermus hydrogenoformans","ANENYALAA"},
+     { "Cardinium endosymbiont","VINNSRRCKFVALRKEEEEDDELRMAA"},
+     { "Carnobacterium maltaromaticum","AKNNNNSYALAA"},
+     { "Carnobacterium sp. 17-4","DKNNNNSYALAA"},
+     { "Catenulispora acidiphila","ANKTQLKSQTAYGLAA"},
+     { "Catera virion","ATDTDATVTDAEIEAFFAEEAAALV"},
+     { "Caulobacter crescentus","ANDNFAEEFAVAA"},
+     { "Caulobacter segnis","ANDNFAEEFAVAA"},
+     { "Caulobacter sp. K31","ANDNFAEEFAIAA"},
+     { "Cellulomonas fimi","ADNKRTDFALAA"},
+     { "Cellulomonas flavigena","ADSKRTDFALAA"},
+     { "Cellulophaga algicola","GENNYALAA"},
+     { "Cellulophaga lytica","GENNYALAA"},
+     { "Cellvibrio gilvus","ADSKRTDFALAA"},
+     { "Cellvibrio japonicus","ANDDSYALAA"},
+     { "Chelativorans sp. BNC1","ANDNYAEARLAA"},
+     { "Chitinophaga pinensis","GESNYAMAA"},
+     { "Chlamydia muridarum","AEPKAECEIISFADLNDLRVAA"},
+     { "Chlamydia psittaci","AEPKAECEIISFSELSEQRLAA"},
+     { "Chlamydia trachomatis","AEPKAECEIISFADLEDLRVAA"},
+     { "Chlamydophila abortus","AEPKAKCEIISFSELSEQRLAA"},
+     { "Chlamydophila caviae","AEPKAECEIISFSDLTEERLAA"},
+     { "Chlamydophila felis","AEPKAECEIISFSDLTQERLAA"},
+     { "Chlamydophila pecorum","AEPKAECEIISFSDLLVEERVAA"},
+     { "Chlamydophila pneumoniae","AEPKAECEIISLFDSVEERLAA"},
+     { "Chlamydophila psittaci","AEPKAECEIISFSELSEQRLAA"},
+     { "Chloracidobacterium thermophilum","AETQELALAA"},
+     { "Chlorobaculum parvum","ADDYSYAMAA"},
+     { "Chlorobium chlorochromatii","ADDYSYAMAA"},
+     { "Chlorobium limicola","ADDYSYAMAA"},
+     { "Chlorobium luteolum","ADDYSYAMAA"},
+     { "Chlorobium phaeobacteroides","ADDYSYAMAA"},
+     { "Chlorobium phaeovibrioides","ADDYSYAMAA"},
+     { "Chlorobium tepidum","ADDYSYAMAA"},
+     { "Chloroflexus aggregans","ANNNARVQPRLALAA"},
+     { "Chloroflexus aurantiacus","ANTNTRAQARLALAA"},
+     { "Chloroherpeton thalassium","ADDYSYAMAA"},
+     { "Chromobacterium violaceum","ANDETYALAA"},
+     { "Chromohalobacter salexigens","ANDDNYAQGALAA"},
+     { "Chroococcidiopsis PCC6712","ANNIVKFERQAVFA"},
+     { "Citrobacter koseri","ANDENYALAA"},
+     { "Citrobacter rodentium","ANDENYALAA"},
+     { "Clavibacter michiganensis","ANNKQSSFVLAA"},
+     { "Cloacamonas acidaminovorans","ANNNYALAA"},
+     { "Clostridiales genomosp.","ANKNYSYAAA"},
+     { "Clostridium acetobutylicum","DNENNLALAA"},
+     { "Clostridium acidurici","ANDNYALAA"},
+     { "Clostridium beijerinckii","AEDNFALAA"},
+     { "Clostridium botulinum","ANDNFALAA"},
+     { "Clostridium cellulolyticum","AKNDNFALAAA"},
+     { "Clostridium cellulovorans","DENYLLAA"},
+     { "Clostridium clariflavum","AENDNYALAAA"},
+     { "Clostridium difficile","ADDNFAIAA"},
+     { "Clostridium kluyveri","ENDNLALAA"},
+     { "Clostridium lentocellum","AEDNLAIAA"},
+     { "Clostridium ljungdahlii","ENNNENLALAA"},
+     { "Clostridium perfringens","AEDNFALAA"},
+     { "Clostridium phytofermentans","ANDNLAYAA"},
+     { "Clostridium saccharolyticum","ANNNELALAA"},
+     { "Clostridium sp. BNL1100","AKNDNFALAAA"},
+     { "Clostridium sp. SY8519","AKEDNFELAMAA"},
+     { "Clostridium sticklandii","ANENYALAA"},
+     { "Clostridium tetani","ADDNFVLAA"},
+     { "Clostridium thermocellum","ANEDNYALAAA"},
+     { "Collimonas fungivorans","ANDNSYALAA"},
+     { "Colwellia psychrerythraea","ANDDTFALAA"},
+     { "Colwellia sp","ANDDTFALAA"},
+     { "Comamonas testosteroni","ANDERFALAA"},
+     { "Conexibacter woesei","ADSHEYALAA"},
+     { "Coprothermobacter proteolyticus","AEPEFALAA"},
+     { "Coraliomargarita akajimensis","GEEQFALAA"},
+     { "Corallococcus coralloides","ANDNVELALAA"},
+     { "Coriobacterium glomerans","GMAQTKIEPTRNPRARRRAQGNRISTGD"},
+     { "Corynebacterium aurimucosum","AEKNSQRDYALAA"},
+     { "Corynebacterium diphtheriae","AENTQRDYALAA"},
+     { "Corynebacterium efficiens","AEKTQRDYALAA"},
+     { "Corynebacterium glutamicum","AEKSQRDYALAA"},
+     { "Corynebacterium jeikeium","AENTQRDYALAA"},
+     { "Corynebacterium kroppenstedtii","AENTQRDYALAA"},
+     { "Corynebacterium pseudotuberculosis","AEKTQRDYALAA"},
+     { "Corynebacterium resistens","AENTQRDYALAA"},
+     { "Corynebacterium ulcerans","AEKTQRDYALAA"},
+     { "Corynebacterium urealyticum","AENTQRDYALAA"},
+     { "Corynebacterium variabile","AENTQRDYALAA"},
+     { "Coxiella burnetii","ANDSNYLQEAYA"},
+     { "Croceibacter atlanticus","GENNYALAA"},
+     { "Crocosphaera watsonii","ANNIVSFKRVAVAA"},
+     { "Cronobacter sakazakii","ANDENYALAA"},
+     { "Cronobacter turicensis","ANDENYALAA"},
+     { "Cryptobacterium curtum","DNNKSFGRQYALAA"},
+     { "Cupriavidus metallidurans","ANDERYALAA"},
+     { "Cupriavidus necator","ANDERYALAA"},
+     { "Cupriavidus taiwanensis","ANDERYALAA"},
+     { "Cyanidioschyzon merolae Chloroplast","ANQILPFSIPVKHLAV"},
+     { "Cyanidium caldarium chloroplast","ANNIIEISNIRKPALVV"},
+     { "Cyanobium gracile","ANNIVRFSRQAAPVAA"},
+     { "Cyanobium sp. PCC 6904","ANNIVRFSRQAAPVAA"},
+     { "Cyanobium sp. PCC 7009","ANNIVRFSRQAAPVAA"},
+     { "Cyanophora paradoxa chloroplast","ATNIVRFNRKAAFAV"},
+     { "Cyanothece sp. ATCC 51142","ANNIVSFKRVAVAA"},
+     { "Cyanothece sp. PCC 7424","ANNIVPFARKAAPVAA"},
+     { "Cyanothece sp. PCC 7425","ANNIVPFARKAVAVA"},
+     { "Cyanothece sp. PCC 7822","ANNIVPFARKSALVAA"},
+     { "Cyanothece sp. PCC 8801","ANNIVSFKRVAVAA"},
+     { "Cyclobacterium marinum","GESNYAMAA"},
+     { "Cycloclasticus sp. P1","ANDDNYAIAA"},
+     { "Cytophaga hutchinsonii","GEESYAMAA"},
+     { "Dechloromonas agitata","ANDEQFAIAA"},
+     { "Dechloromonas aromatica","ANDEQFAIAA"},
+     { "Dechlorosoma suillum","ANDERFAIAA"},
+     { "Deferribacter desulfuricans","ANDELALAA"},
+     { "Dehalococcoides ethenogenes","GERELVLAG"},
+     { "Dehalococcoides sp. CBDB1","GERELVLAG"},
+     { "Dehalococcoides sp. VS","GERELVLAG"},
+     { "Dehalogenimonas lykanthroporepellens","DAKEISAGLERFRRLKLEGREQKAG"},
+     { "Deinococcus deserti","GNQNYALAA"},
+     { "Deinococcus geothermalis","GNQNYALAA"},
+     { "Deinococcus gobiensis","GNQNYALAA"},
+     { "Deinococcus maricopensis","GNNNSTTFALAA"},
+     { "Deinococcus proteolyticus","GENNYALAA"},
+     { "Deinococcus radiodurans","GNQNYALAA"},
+     { "Delftia acidovorans","ANDERFALAA"},
+     { "Delftia sp. Cs1-4","ANDERFALAA"},
+     { "Denitrovibrio acetiphilus","ANNEHTLAAA"},
+     { "Desulfarculus baarsii","ADDYNYAVAA"},
+     { "Desulfatibacillum alkenivorans","ADDYNYAMAA"},
+     { "Desulfitobacterium hafniense","ANDDNYALAA"},
+     { "Desulfobacca acetoxidans","ADNYGYALAA"},
+     { "Desulfobacterium autotrophicum","ADDYNYAVAA"},
+     { "Desulfobacula toluolica","ADDYNYAVAA"},
+     { "Desulfobulbus propionicus","ADDYNYALAA"},
+     { "Desulfococcus oleovorans","ADDYNYAVAA"},
+     { "Desulfohalobium retbaense","ANDYDYALAA"},
+     { "Desulfomicrobium baculatum","ANDNYDYAMAA"},
+     { "Desulfomonile tiedjei","ANDYEYALAA"},
+     { "Desulforudis audaxviator","AKNETYALAA"},
+     { "Desulfotalea psychrophila","ADDYNYAVAA"},
+     { "Desulfotomaculum acetoxidans","ANNDYALAA"},
+     { "Desulfotomaculum carboxydivorans","ANEEYALAA"},
+     { "Desulfotomaculum kuznetsovii","ANEEYALAA"},
+     { "Desulfotomaculum reducens","ANEEYALAA"},
+     { "Desulfotomaculum ruminis","ANEEYALAA"},
+     { "Desulfovibrio aespoeensis","ANNDYDYAIAA"},
+     { "Desulfovibrio africanus","ANDYNYSLAA"},
+     { "Desulfovibrio alaskensis","ANNDYEYAMAA"},
+     { "Desulfovibrio desulfuricans","ANNDYDYAYAA"},
+     { "Desulfovibrio desulfuricans 2 (G20)","ANNDYEYAMAA"},
+     { "Desulfovibrio magneticus","ANDYDYALAA"},
+     { "Desulfovibrio salexigens","ANDNYDYAMAA"},
+     { "Desulfovibrio vulgaris","ANNYDYALAA"},
+     { "Desulfovibrio yellowstonii","ANNELALAA"},
+     { "Desulfurispirillum indicum","ANDENVLAAA"},
+     { "Desulfurivibrio alkaliphilus","ADDYAYAAAA"},
+     { "Desulfurobacterium thermolithotrophum","ANEELALAA"},
+     { "Desulfuromonas acetoxidans","ADTDVSYALAA"},
+     { "Dichelobacter nodosus","ANDDNYALAA"},
+     { "Dickeya dadantii","ANDENFAPAALAA"},
+     { "Dickeya zeae","ANDENFAPAALAA"},
+     { "Dictyoglomus thermophilum","ANTNLALAA"},
+     { "Dictyoglomus turgidum","ANTNLALAA"},
+     { "Dinoroseobacter shibae","ANDNRAPVAVAA"},
+     { "Dyadobacter fermentans","GESTYAMAA"},
+     { "Edwardsiella tarda","ANDENYALAA"},
+     { "Eggerthella lenta","GKNNTQSAPALAMAA"},
+     { "Eggerthella sp. YY7918","GKNNTQSAPALAMAA"},
+     { "Ehrlichia canis","ANDNFVFANDNNSSVAGLVAA"},
+     { "Ehrlichia chaffeensis","ANDNFVFANDNNSSANLVAA"},
+     { "Ehrlichia ruminantium 1","ANDNFVSANDNNSTANLVAA"},
+     { "Ehrlichia ruminantium 2","ANDNFVSANDNNSTANLVAA"},
+     { "Elusimicrobium minutum","GNQTELNWATA"},
+     { "Emiliania huxleyi chloroplast","ANNILNFNSKLAIA"},
+     { "Emticicia oligotrophica","GNTSYAMAA"},
+     { "Enterobacter aerogenes","ANDENYALAA"},
+     { "Enterobacter cancerogenus","ANDENYALAA"},
+     { "Enterobacter cloacae","ANDENYALAA"},
+     { "Enterobacter lignolyticus","ANDENYALAA"},
+     { "Enterobacter sakazakii","ANDENYALAA"},
+     { "Enterobacter sp. 638","ANDENYALAA"},
+     { "Enterococcus durans","AKNENNSYALAA"},
+     { "Enterococcus faecalis","AKNENNSFALAA"},
+     { "Enterococcus faecium","AKNENNSYALAA"},
+     { "Enterococcus hirae","AKNENNSYALAA"},
+     { "Erwinia amylovora","ANDENFAPAALAA"},
+     { "Erwinia billingiae","ANDENYALAA"},
+     { "Erwinia carotovora","ANDENYALAA"},
+     { "Erwinia chrysanthemi","ANDENFAPAALAA"},
+     { "Erwinia pyrifoliae","AKLKYNESVANDGEYELIAAAA"},
+     { "Erwinia sp. Ejp617","AKLYNNIPVANDGEFITPALAA"},
+     { "Erwinia tasmaniensis","ANDENFAPAALAA"},
+     { "Erysipelothrix rhusiopathiae","GNNSLQFAA"},
+     { "Erythrobacter litoralis","ANDNEALALAA"},
+     { "Escherichia coli","ANDENYALAA"},
+     { "Ethanoligenens harbinense","AKDNVIRVNFGRSEEALAA"},
+     { "Eubacterium eligens","ANDNLAYAA"},
+     { "Eubacterium limosum","AKENRSYGMALAA"},
+     { "Eubacterium rectale","AEDNLAYAA"},
+     { "Exiguobacterium sibiricum","GKTNTQLAAA"},
+     { "Exiguobacterium sp. AT1b","GKTNTQLAAA"},
+     { "Ferrimonas balearica","ANDENYALAA"},
+     { "Fervidobacterium nodosum","ANEYVPLAA"},
+     { "Fervidobacterium pennivorans","ANEYVPLAA"},
+     { "Fibrobacter succinogenes","ADENYALAA"},
+     { "Filifactor alocis","ANENNLLAA"},
+     { "Finegoldia magna","AEDNNFALAA"},
+     { "Flavobacteriaceae bacterium","GDQEFALAA"},
+     { "Flavobacterium columnare","GENNYALAA"},
+     { "Flavobacterium indicum","GENNYALAA"},
+     { "Flavobacterium johnsoniae","GENNYALAA"},
+     { "Flexibacter litoralis","GESNYAMAA"},
+     { "Flexistipes sinusarabici","ANDEFALAAA"},
+     { "Fluviicola taffensis","DNTSYALAA"},
+     { "Francisella cf.","ANDSNFAAVAKAA"},
+     { "Francisella noatunensis","ANDSNFAAVTKAA"},
+     { "Francisella novicida","ANDSNFAAVAKAA"},
+     { "Francisella philomiragia","ANDSNFAAVAKAA"},
+     { "Francisella sp. TX077308","ANDSNFAAVAKAA"},
+     { "Francisella tularensis 1","GNKKANRVAANDSNFAAVAKAA"},
+     { "Francisella tularensis 2","ANDSNFAAVAKAA"},
+     { "Frankia alni","ANKTQPVTPLYALAA"},
+     { "Frankia sp. CcI3","ANKTQPTTPTYALAA"},
+     { "Frankia sp. EAN1pec","ATKTQPASSTFALAA"},
+     { "Frankia sp. EuI1c","ANSEQSATSAYALAA"},
+     { "Frankia symbiont","ANKSQSATPRTFALAA"},
+     { "Frateuria aurantia","ANDDNYALAA"},
+     { "Fremyella diplosiphon","ANNIVKFARKEALVAA"},
+     { "Fusobacterium nucleatum 1","GNKDYALAA"},
+     { "Fusobacterium nucleatum 2","GNKEYALAA"},
+     { "Gallibacterium anatis","ANDENYALAA"},
+     { "Gallionella capsiferriformans","ANDENYALAA"},
+     { "gamma proteobacterium","ANDESYALAA"},
+     { "Gammaproteobacteria SAR-1","ANNYNYSLAA"},
+     { "Gardnerella vaginalis","AKSNRTEFALAA"},
+     { "Gemmata obscuriglobus","AEPQYSLAA"},
+     { "Gemmatimonas aurantiaca","ANNNLALAA"},
+     { "Geobacillus kaustophilus","GKQNYALAA"},
+     { "Geobacillus sp. WCH70","GKENYALAA"},
+     { "Geobacillus sp. Y4.1MC1","GKENYALAA"},
+     { "Geobacillus stearothermophilus","GKQNYALAA"},
+     { "Geobacillus thermodenitrificans","GKENYALAA"},
+     { "Geobacter bemidjiensis","ADNYDYALAA"},
+     { "Geobacter daltonii","ADNYDYALAA"},
+     { "Geobacter lovleyi","ADNYNTQPVALAA"},
+     { "Geobacter metallireducens","ADNYDYAVAA"},
+     { "Geobacter sp. M18","ADNYDYALAA"},
+     { "Geobacter sp. M21","ADNYDYALAA"},
+     { "Geobacter sulfurreducens","ADNYDYAVAA"},
+     { "Geobacter uraniireducens","ADNYNYALAA"},
+     { "Geodermatophilus obscurus","ADSSQREFALAA"},
+     { "Glaciecola nitratireducens","ANDENYALAA"},
+     { "Glaciecola sp. 4H-3-7+YE-5","ANDENYALAA"},
+     { "Gloeobacter violaceus","ATNNVVPFARARATVAA"},
+     { "Gluconacetobacter diazotrophicus","ANDNSEVLAVAA"},
+     { "Gluconacetobacter xylinus","ANDNSEVLAVAA"},
+     { "Gluconobacter oxydans","ANDNSEVLAVAA"},
+     { "Gordonia bronchialis","ADSNQRDYALAA"},
+     { "Gordonia polyisoprenivorans","ADKNQRDYALAA"},
+     { "Gordonia rubripertincta","ADSNQRDYALAA"},
+     { "Gordonia sp. KTR9","ADSNQRDYALAA"},
+     { "Gracilaria tenuistipitata chloroplast","AKNNILTLSRRLIYA"},
+     { "Gramella forsetii","GENNYALAA"},
+     { "Granulibacter bethesdensis","ANDNHEALAVAA"},
+     { "Granulicella mallensis","AEPQFALAA"},
+     { "Granulicella tundricola","AEPQFALAA"},
+     { "Guillardia theta chloroplast","ASNIVSFSSKRLVSFA"},
+     { "Haemophilus ducreyi","ANDEQYALAA"},
+     { "Haemophilus influenzae","ANDEQYALAA"},
+     { "Haemophilus parainfluenzae","ANDEQYALAA"},
+     { "Haemophilus parasuis","ANDEQYALAA"},
+     { "Haemophilus somnus","ANDEQYALAA"},
+     { "Hahella chejuensis","ANDETYALAA"},
+     { "Halanaerobium hydrogeniformans","ANDNSYALAAA"},
+     { "Halanaerobium praevalens","ANDNNYTLAAA"},
+     { "Haliangium ochraceum","ANDNAVALAA"},
+     { "Haliscomenobacter hydrossis","GESNYAMAA"},
+     { "Halobacillus halophilus","GESNDNLAVAA"},
+     { "Halomonas elongata","ANDDNYAQGALAA"},
+     { "Halorhodospira halophila","ANDDNYALAA"},
+     { "Halothermothrix orenii","ADNNNYALAAA"},
+     { "Halothiobacillus neapolitanus","ANDDNYALAA"},
+     { "Hamiltonella defensa","AKINKNRPAANGYMPVAALAA"},
+     { "Helicobacter acinonychis","VNNTDYAPAYAKVA"},
+     { "Helicobacter bizzozeronii","VNNPNYAPNYAKAA"},
+     { "Helicobacter cetorum","VNNTNYAPAYAKVA"},
+     { "Helicobacter cinaedi","ANNTNYAPVYAKVA"},
+     { "Helicobacter felis","VNNPNYAPNYAKAA"},
+     { "Helicobacter hepaticus","ANNANYAPAYAKVA"},
+     { "Helicobacter mustelae","ANNKNYAPAYAKVA"},
+     { "Helicobacter pylori 1","VNNTDYAPAYAKAA"},
+     { "Helicobacter pylori 2","VNNTDYAPAYAKAA"},
+     { "Helicobacter pylori 3","VNNADYAPAYAKAA"},
+     { "Heliobacillus mobilis","AEDNYALAA"},
+     { "Heliobacterium modesticaldum","AEENYALAA"},
+     { "Herbaspirillum seropedicae","ANDESYALAA"},
+     { "Herminiimonas arsenicoxydans","DNSYALAA"},
+     { "Herpetosiphon aurantiacus","GKNTFRAPVALAA"},
+     { "Hippea maritima","ADTEYALAA"},
+     { "Hirschia baltica","ANDNFAEGELLAA"},
+     { "Hydrogenophaga palleronii","ANDERFALAA"},
+     { "Hyphomicrobium denitrificans","ANDNYAEAALAA"},
+     { "Hyphomicrobium sp. MC1","ANDNYAEAALAA"},
+     { "Hyphomonas neptunium","ANDNFAEGELLAA"},
+     { "Idiomarina loihiensis","ANDDNYALAA"},
+     { "Ignavibacterium album","GEYNYALAA"},
+     { "Ilyobacter polytropus","ENNNYALAA"},
+     { "Intrasporangium calvum","ANSKRTDFALAA"},
+     { "Isoptericola variabilis","ADNKRTDFTLAA"},
+     { "Jannaschia sp. CCS1","ANDNRAPAMALAA"},
+     { "Janthinobacterium sp. Marseille","ANDNSYALAA"},
+     { "Jonesia denitrificans","ADTKRTDFALAA"},
+     { "Kangiella koreensis","ANEDNYALAA"},
+     { "Ketogulonicigenium vulgare","ANNNRAPAMALAA"},
+     { "Kineococcus radiotolerans","ADSKRTEFALAA"},
+     { "Kitasatospora setae","ANSKRDSQQFALAA"},
+     { "Klebsiella oxytoca","ANDENYALAA"},
+     { "Klebsiella pneumoniae","ANDENYALAA"},
+     { "Kocuria rhizophila","AKSKRTDFALAA"},
+     { "Koribacter versatilis","ANTQMAYAA"},
+     { "Kosmotoga olearia","ANTEFALAA"},
+     { "Kribbella flavida","ADSKRSSFALAA"},
+     { "Krokinobacter sp. 4H-3-7-5","GENNYALAA"},
+     { "Kyrpidia tusciae","ANKQELALAA"},
+     { "Kytococcus sedentarius","ANSKRTDFALAA"},
+     { "Lacinutrix sp. 5H-3-7-4","GENNYALAA"},
+     { "Lactobacillus acidophilus","ANNKNSYALAA"},
+     { "Lactobacillus amylovorus","ANNKNSYALAA"},
+     { "Lactobacillus brevis","AKNNNNSYALAA"},
+     { "Lactobacillus buchneri","AKNNNNSYALAA"},
+     { "Lactobacillus casei","AKNENSYALAA"},
+     { "Lactobacillus crispatus","ANNKNSYALAA"},
+     { "Lactobacillus delbrueckii 1","AKNENNSYALAA"},
+     { "Lactobacillus delbrueckii 2","ANENSYAVAA"},
+     { "Lactobacillus fermentum","ANNNSQSYAYAA"},
+     { "Lactobacillus gallinarum","ANNKNSYALAA"},
+     { "Lactobacillus gasseri","ANNENSYAVAA"},
+     { "Lactobacillus helveticus","ANNKNSYALAA"},
+     { "Lactobacillus johnsonii","ANNENSYAVAA"},
+     { "Lactobacillus kefiranofaciens","ANNKNSYALAA"},
+     { "Lactobacillus plantarum","AKNNNNSYALAA"},
+     { "Lactobacillus reuteri","ANNNSNSYAYAA"},
+     { "Lactobacillus rhamnosus","AKNENSYALAA"},
+     { "Lactobacillus ruminis","AKNNNYSYALAA"},
+     { "Lactobacillus sakei","ANNNNSYAVAA"},
+     { "Lactobacillus salivarius","AKNNNNSYALAA"},
+     { "Lactobacillus sanfranciscensis","AKNNNNSYALAA"},
+     { "Lactococcus garvieae","AKNNTSYALAA"},
+     { "Lactococcus lactis","AKNNTQTYAMAA"},
+     { "Lactococcus plantarum","AKNTQTYALAA"},
+     { "Lactococcus raffinolactis","AKNTQTYAVAA"},
+     { "Laribacter hongkongensis","ANDDTYALAA"},
+     { "Lawsonia intracellularis","ANNNYDYALAA"},
+     { "Leadbetterella byssophila","GNTSYAMAA"},
+     { "Legionella longbeachae","ANDENFAGGEAIAA"},
+     { "Legionella pneumophila","ANDENFAGGEAIAA"},
+     { "Leifsonia xyli","ANSKSTVSAKADFALAA"},
+     { "Leptolyngbya boryana","ANNIVPFARKTAPVAA"},
+     { "Leptospira biflexa","ANNEFALAA"},
+     { "Leptospira borgpetersenii","ANNELALAA"},
+     { "Leptospira interrogans","ANNELALAA"},
+     { "Leptospirillum ferriphilum","ANEELALAA"},
+     { "Leptospirillum ferrooxidans","ANNEMALAA"},
+     { "Leptospirillum groupII","ANEELALAA"},
+     { "Leptospirillum groupIII","ANEELALAA"},
+     { "Leptospirillum sp. Group II '5-way CG'","ANEELALAA"},
+     { "Leptospirillum sp. Group III","ANEELALAA"},
+     { "Leptothrix cholodnii","ANDSTYALAA"},
+     { "Leptotrichia buccalis","GNDNYALAA"},
+     { "Leuconostoc carnosum","AKNENTFAVAA"},
+     { "Leuconostoc citreum","AKNENSFAIAA"},
+     { "Leuconostoc gasicomitatum","AKNENSFAIAA"},
+     { "Leuconostoc gelidum","AKNENSFAIAA"},
+     { "Leuconostoc lactis","AKNENSFAIAA"},
+     { "Leuconostoc mesenteroides","AKNENSFAIAA"},
+     { "Leuconostoc pseudomesenteroides","AKNENSYAIAA"},
+     { "Leuconostoc sp. C2","AKNENSFAIAA"},
+     { "Liberibacter asiaticus","ANDNSAREVLAA"},
+     { "Liberibacter solanacearum","ANDNFAGETRLAA"},
+     { "Listeria grayi 1","GKEKQNLAFAA"},
+     { "Listeria grayi 2","GKQNNNLAFAA"},
+     { "Listeria innocua","GKEKQNLAFAA"},
+     { "Listeria ivanovii","GKEKQNLAFAA"},
+     { "Listeria monocytogenes","GKEKQNLAFAA"},
+     { "Listeria seeligeri","GKEKQNLAFAA"},
+     { "Listeria welshimeri","GKEKQNLAFAA"},
+     { "Lysinibacillus sphaericus","GKQQNLAFAA"},
+     { "Macrococcus caseolyticus","GKTNNFAVAA"},
+     { "Magnetococcus marinus","ANDEHYAPAFAAA"},
+     { "Magnetococcus sp.","ANDEHYAPAFAAA"},
+     { "Magnetospirillum magneticum","ANDNVELAAAA"},
+     { "Magnetospirillum magnetotacticum 1","ANDNFAPVAVAA"},
+     { "Magnetospirillum magnetotacticum 2","ANDNVELAAAA"},
+     { "Mahella australiensis","ADNNAELALAA"},
+     { "Mannheimia haemolytica","ANDEQYALAA"},
+     { "Mannheimia succiniciproducens","ANDEQYALAA"},
+     { "Maribacter sp. HTCC2170","GDNNYALAA"},
+     { "Maricaulis maris","ANDNFAEEVALAA"},
+     { "Marinithermus hydrothermalis","GNNRYALAA"},
+     { "Marinitoga piezophila","AEENYALAA"},
+     { "Marinobacter adhaerens","ANDENYALAA"},
+     { "Marinobacter aquaeolei","ANDENYALAA"},
+     { "Marinobacter hydrocarbonoclasticus","ANDENYALAA"},
+     { "Marinobacter sp. BSs20148","ANDENYSLAA"},
+     { "Marinomonas mediterranea","ANDENYALAA"},
+     { "Marinomonas posidonica","ANDENYALAA"},
+     { "Marinomonas sp. MWYL1","ANDENYALAA"},
+     { "Marivirga tractuosa","GESNYAMAA"},
+     { "Megasphaera elsdenii","AKENNFALAA"},
+     { "Meiothermus ruber","GNVRSNSYALAA"},
+     { "Meiothermus silvanus","GNTQRSYALAA"},
+     { "Melioribacter roseus","GEYNYALAA"},
+     { "Melissococcus plutonius","AKKQNYSYAVAA"},
+     { "Mesoplasma florum","ANKNEENTNEVPTFMLNAGQANYAFA"},
+     { "Mesorhizobium ciceri","ANDNYAEARLAA"},
+     { "Mesorhizobium loti","ANDNYAEARLAA"},
+     { "Mesorhizobium opportunistum","ANDNYAEARLAA"},
+     { "Mesorhizobium sp.","ANDNYAEARLAA"},
+     { "Mesostigma viride chloroplast","ANNILPFNRKTAVAV"},
+     { "Mesotoga prima","ANNEFALAA"},
+     { "Methylacidiphilum infernorum","ANEELALAA"},
+     { "Methylibium petroleiphilum","ANDERFALAA"},
+     { "Methylobacillus flagellatus","ANDETYALAA"},
+     { "Methylobacillus glycogenes","ANDETYALAA"},
+     { "Methylobacterium extorquens","ANDNFAPVAVAA"},
+     { "Methylobacterium nodulans","ANDNYAPVAVAA"},
+     { "Methylobacterium populi","ANDNFAPVAVAA"},
+     { "Methylobacterium radiotolerans","ANDNFAPVAVAA"},
+     { "Methylobacterium sp. 4-46","ANDNYAPVAVAA"},
+     { "Methylocella silvestris","ANDNYAPVAVAA"},
+     { "Methylococcus capsulatus","ANDDVYALAA"},
+     { "Methylocystis sp. SC2","ANDNYAPVAVAA"},
+     { "Methylomicrobium alcaliphilum","ANDENYSMALAA"},
+     { "Methylomirabilis oxyfera","ANHELALAA"},
+     { "Methylomonas methanica","ANDENYSVALAA"},
+     { "Methylophaga sp. JAM1","ANDNNYALAA"},
+     { "Methylophaga sp. JAM7","ANDNNYALAA"},
+     { "Methylotenera mobilis","ANDETYSLAA"},
+     { "Methylotenera versatilis","ANDETYSLAA"},
+     { "Methylovorus glucosetrophus","ANDETYALAA"},
+     { "Micavibrio aeruginosavorus","ANDNFVVANDNSREAAVAIAA"},
+     { "Microbacterium testaceum","ADAKRTDFALAA"},
+     { "Microbulbifer degradans","ANDDNYGAQLAA"},
+     { "Micrococcus luteus","AESKRTDFALAA"},
+     { "Microcystis aeruginosa","ANNIVPFARKAAPVAA"},
+     { "Microlunatus phosphovorus","AKSEQRTDFALAA"},
+     { "Micromonospora aurantiaca","AKNNRADFALAA"},
+     { "Midichloria mitochondrii","ANNKFVPANSDFVPALQAA"},
+     { "Mobiluncus curtisii","AERNSTESFALAA"},
+     { "Modestobacter marinus","ADSSQRDFALAA"},
+     { "Moorella thermoacetica","ADDNLALAA"},
+     { "Moranella endobia","ANDSQYESVALAA"},
+     { "Moraxella catarrhalis","ANDETYALAA"},
+     { "Muricauda ruestringensis","GENNYALAA"},
+     { "Mycobacteriophage Bxz1 virion","ATDTDATVTDAEIEAFFAEEAAALV"},
+     { "Mycobacterium abscessus","ADSHQRDYALAA"},
+     { "Mycobacterium africanum","ADSHQRDYALAA"},
+     { "Mycobacterium austroafricanum","ADSNQRDYALAA"},
+     { "Mycobacterium avium","ADSHQRDYALAA"},
+     { "Mycobacterium bovis","ADSHQRDYALAA"},
+     { "Mycobacterium chubuense","ADSNQRDYALAA"},
+     { "Mycobacterium gilvum","ADSNQRDYALAA"},
+     { "Mycobacterium indicus","ADSHQRDYALAA"},
+     { "Mycobacterium intracellulare","ADSHQRDYALAA"},
+     { "Mycobacterium leprae","ADSYQRDYALAA"},
+     { "Mycobacterium marinum","ADSHQRDYALAA"},
+     { "Mycobacterium microti","ADSHQRDYALAA"},
+     { "Mycobacterium phage","ATDTDATVTDAEIEAFFAEEAAALV"},
+     { "Mycobacterium rhodesiae","ADSNQRDFALAA"},
+     { "Mycobacterium smegmatis","ADSNQRDYALAA"},
+     { "Mycobacterium sp. MCS","ADTNQRDYALAA"},
+     { "Mycobacterium tuberculosis","ADSHQRDYALAA"},
+     { "Mycoplasma agalactiae","ANDKKSEEVRVELPAFAIANANANLAFA"},
+     { "Mycoplasma arthritidis","GNLETSEDKKLDLQFVMNSQTQQNLLFA"},
+     { "Mycoplasma bovis","ANDKKSEEVRLELPAFAIANANANLAFA"},
+     { "Mycoplasma capricolum","ANKNEETFEMPAFMMNNASAGANFMFA"},
+     { "Mycoplasma conjunctivae","ANKKEDKAVDVNLLASQSFNSNLAFA"},
+     { "Mycoplasma crocodyli","GKSKKAENEFSFSNPAFAGNLNLAFA"},
+     { "Mycoplasma fermentans","AEDKKAEEVNISSLMIAQKMQSQSNLAFA"},
+     { "Mycoplasma gallisepticum","DKTSKELADENFVLNQLASNNYALNF"},
+     { "Mycoplasma genitalium 1","DKENNEVLVEPNLIINQQASVNFAFA"},
+     { "Mycoplasma genitalium 2","DKENNEVLVDPNLIINQQASVNFAFA"},
+     { "Mycoplasma haemofelis","ANKQERESSVVNLLMSQPQDLASLSF"},
+     { "Mycoplasma hominis","AEEKQNKQSFVLNQMMSSNPVFAY"},
+     { "Mycoplasma hyorhinis","GKENKKEDYSLLMNASTQSNLAFAF"},
+     { "Mycoplasma leachii","ANKNEETFEMPAFMMNNASAGANFMFA"},
+     { "Mycoplasma mobile","GKEKQLEVSPLLMSSSQSNLVFA"},
+     { "Mycoplasma mycoides","ADKNEENFEMPAFMINNASAGANYMFA"},
+     { "Mycoplasma penetrans","AKNNKNEAVEVELNDFEINALSQNANLALYA"},
+     { "Mycoplasma pneumoniae","DKNNDEVLVDPMLIANQQASINYAFA"},
+     { "Mycoplasma pulmonis","GTKKQENDYQDLMISQNLNQNLAFASV"},
+     { "Mycoplasma putrefaciens","ANKKTEEFEMPAFMINNASAGANLMFA"},
+     { "Mycoplasma synoviae","GNKQSQVEEVTREFSPSLYTFNSNLAYA"},
+     { "Myxococcus fulvus","ANDNVELALAA"},
+     { "Myxococcus xanthus","ANDNVELALAA"},
+     { "Nakamurella multipartita","ADSKRTEFALAA"},
+     { "Natranaerobius thermophilus","ADEDYALAAA"},
+     { "Nautilia profundicola","AANNTNYSPAVARAAA"},
+     { "Neisseria gonorrhoeae","ANDETYALAA"},
+     { "Neisseria lactamica","ANDETYALAA"},
+     { "Neisseria meningitidis","ANDETYALAA"},
+     { "Nephroselmis olivacea chloroplast","TTYHSCLEGHLS"},
+     { "Niastella koreensis","GNTQFAMAA"},
+     { "Nitratifractor salsuginis","ANNTDYRPAYAHAA"},
+     { "Nitratiruptor sp. SB155-2","ANNTDYRPAYAVAA"},
+     { "Nitrobacter hamburgensis","ANDNYAPVAQAA"},
+     { "Nitrobacter Nb-311A","ANDNYAPVAQAA"},
+     { "Nitrobacter winogradskyi","ANDNYAPVAQAA"},
+     { "Nitrosococcus halophilus","ANDDNYALAA"},
+     { "Nitrosococcus oceani","ANDDNYALAA"},
+     { "Nitrosococcus watsonii","ANDDNYALAA"},
+     { "Nitrosomonas cryotolerans","ANDENYALAA"},
+     { "Nitrosomonas europaea","ANDENYALAA"},
+     { "Nitrosomonas eutropha","ANDENYALAA"},
+     { "Nitrosomonas sp. AL212","ANDENYALAA"},
+     { "Nitrosomonas sp. Is79A3","ANDENYALAA"},
+     { "Nitrosospira multiformis","ANDENYALAA"},
+     { "Nitrospira defluvii","ANQELALAA"},
+     { "Nocardia brasiliensis","ADSNQREYALAA"},
+     { "Nocardia cyriacigeorgica","ADSHQREYALAA"},
+     { "Nocardia farcinica","ADSHQREYALAA"},
+     { "Nocardioides sp. JS614","ANTNRSSFALAA"},
+     { "Nocardiopsis alba","ANSKRTEFALAA"},
+     { "Nocardiopsis dassonvillei","ANSKRTEFALAA"},
+     { "Nostoc azollae","ANNIVKFARREALVAA"},
+     { "Nostoc PCC7120","ANNIVKFARKDALVAA"},
+     { "Nostoc punctiforme","ANNIVNFARKDALVAA"},
+     { "Nostoc sp. PCC 7120","ANNIVKFARKDALVAA"},
+     { "Novosphingobium aromaticivorans","ANDNEALALAA"},
+     { "Novosphingobium sp. PP1Y","ANDNEALALAA"},
+     { "Oceanimonas sp. GK1","ANDENYALAA"},
+     { "Oceanithermus profundus","GNDNYALAA"},
+     { "Oceanobacillus iheyensis","GKETNQPVLAAA"},
+     { "Ochrobactrum anthropi","ANDNKAQGYALAA"},
+     { "Odontella sinensis chloroplast","ANNLISSVFKSLSTKQNSLNLSFAV"},
+     { "Odoribacter splanchnicus","GENNYALAA"},
+     { "Oenococcus oeni","AKNNEPSYALAA"},
+     { "Oligotropha carboxidovorans","ANDNYAPVAQAA"},
+     { "Olsenella uli","DNDSYQGSYALAA"},
+     { "Ornithobacterium rhinotracheale","GNNEYALAA"},
+     { "Oscillatoria 6304","ANNIVPFARKAAPVAA"},
+     { "Oscillatoria acuminata","ANNIVPFARKAAPVAA"},
+     { "Owenweeksia hongkongensis","GENNFALAA"},
+     { "Paenibacillus larvae","GKQQNNYALAA"},
+     { "Paenibacillus mucilaginosus","GNQKQQLAFAA"},
+     { "Paenibacillus polymyxa","GKQQNNYAFAA"},
+     { "Paenibacillus sp. JDR-2","GKQQQTYAFAA"},
+     { "Paenibacillus sp. Y412MC10","GKQQNNYAFAA"},
+     { "Paenibacillus terrae","GKQQNNYAFAA"},
+     { "Paludibacter propionicigenes","GENNYALAA"},
+     { "Pantoea ananatis","ANDENYALAA"},
+     { "Pantoea sp. At-9b","ANDNYYDAPAALAA"},
+     { "Pantoea stewartii","ANDENYALAA"},
+     { "Pantoea vagans","ANDENYALAA"},
+     { "Parabacteroides distasonis","GENNYALAA"},
+     { "Parachlamydia acanthamoebae","ADSVSYAAAA"},
+     { "Parachlamydia UWE25","ANNSNKIAKVDFQEGTFARAA"},
+     { "Paracoccus denitrificans","ANDNRAPVALAA"},
+     { "Parvibaculum lavamentivorans","ANDNYAEARLAA"},
+     { "Parvularcula bermudensis","ANDNSSEGFALAA"},
+     { "Pasteurella multocida","ANDEQYALAA"},
+     { "Pavlova lutheri chloroplast","ANNILSFNRVAVA"},
+     { "Pectobacterium atrosepticum","ANDENYALAA"},
+     { "Pectobacterium carotovora","ANDENYALAA"},
+     { "Pectobacterium carotovorum","ANDENYALAA"},
+     { "Pectobacterium wasabiae","ANDENYALAA"},
+     { "Pediococcus claussenii","AKNNNNSYALAA"},
+     { "Pediococcus pentosaceus","AKNNNNSYALAA"},
+     { "Pedobacter heparinus","GENNYALAA"},
+     { "Pedobacter saltans","ENNYALAA"},
+     { "Pelagibacter sp. IMCC9063","ANESYAIAA"},
+     { "Pelagibacter ubique","ADESYALAA"},
+     { "Pelagibacterium halotolerans","ANDNNKAPVALAA"},
+     { "Pelobacter carbinolicus","ADTDVSYALAA"},
+     { "Pelobacter propionicus","ADNYNTPVALAA"},
+     { "Pelodictyon phaeoclathratiforme","ADDYSYAMAA"},
+     { "Pelotomaculum thermopropionicum","AKENYALAA"},
+     { "Petrotoga mobilis","GGSSLPKFSWNLA"},
+     { "Phaeobacter gallaeciensis","ANDNRAPAMAVAA"},
+     { "Photobacterium phosphoreum","ANDENYALAA"},
+     { "Photobacterium profundum","ANDENFALAA"},
+     { "Photorhabdus asymbiotica","ANDNEYALVA"},
+     { "Photorhabdus luminescens","ANDEKYALAA"},
+     { "Phycisphaera mikurensis","ANDENTIAGRIGFGNDALRLAA"},
+     { "Phytoplasma australiense","GKQTNSASEGDQIYNWVPSQSSQNLQQLAFA"},
+     { "Pirellula sp.","AEENFALAA"},
+     { "Pirellula staleyi","AESNLALAA"},
+     { "Planctomyces brasiliensis","ANKQYAMVA"},
+     { "Planctomyces limnophilus","ANTGNYALAA"},
+     { "Plectonema boryanum","ANNIVPFARKTAPVAA"},
+     { "Polaromonas JS666","ANDERFALAA"},
+     { "Polaromonas naphthalenivorans","ANDERFALAA"},
+     { "Polaromonas sp. JS666","ANDERFALAA"},
+     { "Polymorphum gilvum","ANDNYASDVALAA"},
+     { "Polynucleobacter necessarius","ANDERFALAA"},
+     { "Porphyra purpurea chloroplast","AENNIIAFSRKLAVA"},
+     { "Porphyromonas asaccharolytica","AETRHHPGGRCSEAL"},
+     { "Porphyromonas gingivalis","GENNYALAA"},
+     { "Prevotella denticola","GENNYALAA"},
+     { "Prevotella intermedia","GENNYALAA"},
+     { "Prevotella melaninogenica","GENNYALAA"},
+     { "Prevotella ruminicola","GNNEYALAA"},
+     { "Prochlorococcus marinus 1","ANKIVSFSRQTAPVAA"},
+     { "Prochlorococcus marinus 2","ANNIVRFSRQPALVAA"},
+     { "Prochlorococcus marinus 3","ANKIVSFSRQTAPVAA"},
+     { "Prochlorococcus marinus","ANNIVSFSRQTAPVAA"},
+     { "Propionibacterium acidipropionici","ADNKRTDFALAA"},
+     { "Propionibacterium acnes 1","AENTRTDFALAA"},
+     { "Propionibacterium acnes 2","AENTRTDFALAA"},
+     { "Propionibacterium freudenreichii","ADTNRTDFALAA"},
+     { "Propionibacterium propionicum","ANNSRTDFALAA"},
+     { "Prosthecochloris aestuarii","ADDYSYAMAA"},
+     { "Proteobacteria SAR-1, version 1","GENADYALAA"},
+     { "Proteobacteria SAR-1, version 2","ANNYNYSLAA"},
+     { "Proteobacteria SAR-1, version 3","ADNGYMAAA"},
+     { "Proteus mirabilis","ANDNQYKALAA"},
+     { "Protochlamydia amoebophila","ANNSNKIAKVDFQEGTFARAA"},
+     { "Providencia rettgeri","ANDENYALAA"},
+     { "Providencia stuartii","ANDENYALAA"},
+     { "Pseudoalteromonas atlantica","ANDENYALAA"},
+     { "Pseudoalteromonas haloplanktis","ANDDNYSLAA"},
+     { "Pseudoalteromonas sp. SM9913","ANDDNYSLAA"},
+     { "Pseudogulbenkiania sp. NH8B","ANDETYALAA"},
+     { "Pseudomonas aeruginosa","ANDDNYALAA"},
+     { "Pseudomonas brassicacearum","ANDENYGQEFAIAA"},
+     { "Pseudomonas chlororaphis","ANDETYGEYALAA"},
+     { "Pseudomonas entomophila","ANDENYEGYALAA"},
+     { "Pseudomonas fluorescens 1","ANDDQYGAALAA"},
+     { "Pseudomonas fluorescens 2","ANDENYGQEFALAA"},
+     { "Pseudomonas fluorescens 3 (Pf-5)","ANDETYGDYALAA"},
+     { "Pseudomonas fulva","ANDENYEGYALAA"},
+     { "Pseudomonas mendocina","ANDDNYALAA"},
+     { "Pseudomonas protegens","ANDETYGDYALAA"},
+     { "Pseudomonas putida 1","ANDENYGAEYKLAA"},
+     { "Pseudomonas stutzeri","ANDDNYEGYALAA"},
+     { "Pseudomonas syringae 1","ANDENYGAQLAA"},
+     { "Pseudomonas syringae 2","ANDETYGEYALAA"},
+     { "Pseudomonas syringae 3","ANDENYGAQLAA"},
+     { "Pseudonocardia dioxanivorans","ADKSQRAYALAA"},
+     { "Pseudovibrio sp. JE062","ANDNYAMDNAVAA"},
+     { "Pseudoxanthomonas spadix","ANDDNYGSDFALAA"},
+     { "Pseudoxanthomonas suwonensis","ANDDNYALAA"},
+     { "Psychrobacter 2734","ANDENYALAA"},
+     { "Psychrobacter arcticus","ANDENYALAA"},
+     { "Psychrobacter cryohalolentis","ANDENYALAA"},
+     { "Psychrobacter sp. PRwf-1","ANDETYALAA"},
+     { "Psychroflexus torquis","GEDNYALAA"},
+     { "Psychromonas ingrahamii","ANDSNYSLAA"},
+     { "Pusillimonas sp. T7-7","ANDERFALAA"},
+     { "Rahnella aquatilis","ANDENYALAA"},
+     { "Rahnella sp. Y9602","ANDENYALAA"},
+     { "Ralstonia eutropha","ANDERYALAA"},
+     { "Ralstonia metallidurans","ANDERYALAA"},
+     { "Ralstonia pickettii","ANDERYALAA"},
+     { "Ralstonia solanacearum","ANDNRYQLAA"},
+     { "Ramlibacter tataouinensis","ANDERFALAA"},
+     { "Renibacterium salmoninarum","ANSKRTDFALAA"},
+     { "Rhizobium etli","ANDNYAEARLAA"},
+     { "Rhizobium leguminosarum","ANDNYAEARLAA"},
+     { "Rhodobacter capsulatus","ANDNRAPVALAA"},
+     { "Rhodobacter sphaeroides","ANDNRAPVALAA"},
+     { "Rhodococcus equi","AESTQREYALAA"},
+     { "Rhodococcus erythropolis","ADSNQRDYALAA"},
+     { "Rhodococcus jostii","ADSNQRDYALAA"},
+     { "Rhodococcus opacus","ADSNQRDYALAA"},
+     { "Rhodoferax ferrireducens","ANDERFALAA"},
+     { "Rhodomicrobium vannielii","ANDNYAGARPVAIAA"},
+     { "Rhodomonas salina","ANNIVPFSRKVALV"},
+     { "Rhodopirellula baltica","AEENFALAA"},
+     { "Rhodopseudomonas palustris","ANDNYAPVAQAA"},
+     { "Rhodopseudomonas palustris 4","ANDNVRMNEVRLAA"},
+     { "Rhodospirillum centenum","ANDNTAPALRMAA"},
+     { "Rhodospirillum photometricum","ANDNVELAAAA"},
+     { "Rhodospirillum rubrum","ANDNVELAAAA"},
+     { "Rhodothermus marinus","ANDYSYAMAA"},
+     { "Rickettsia africae","ANDNNRSVGHLALAA"},
+     { "Rickettsia amblyommii","ANDNNRSVGRLALAA"},
+     { "Rickettsia australis","ANDNNRSVDLALAA"},
+     { "Rickettsia bellii","ANDNYRSAGTPALAVA"},
+     { "Rickettsia conorii","ANDNNRSVGHLALAA"},
+     { "Rickettsia heilongjiangensis","ANDNNRSVGRLALAA"},
+     { "Rickettsia massiliae","ANDNNRSVGRLALAA"},
+     { "Rickettsia montanensis","ANDNNRSVGRLALAA"},
+     { "Rickettsia parkeri","ANDNNRSVGHLALAA"},
+     { "Rickettsia peacockii","ANDNNRSVGRLALAA"},
+     { "Rickettsia philipii","ANDNNRSVGRLALAA"},
+     { "Rickettsia prowazekii","ANDNRYVGVPALAAA"},
+     { "Rickettsia rhipicephali","ANDNNRSVGRLALAA"},
+     { "Rickettsia rickettsii","ANDNNRSVGRLALAA"},
+     { "Rickettsia sibirica","ANDNNRSVGHLALAA"},
+     { "Rickettsia slovaca","ANDNNRSVGRLALAA"},
+     { "Rickettsia typhi","ANDNKRYVGVAALAAA"},
+     { "Riemerella anatipestifer","GNEEFALAA"},
+     { "Riesia pediculicola","AKTKNYAYAQAA"},
+     { "Robiginitalea biformata","GDNNYALAA"},
+     { "Roseburia hominis","AEDNLAYAA"},
+     { "Roseiflexus castenholzii","ANNNKVVAFKPAMALAA"},
+     { "Roseiflexus sp. RS-1","ANTNKVVAFKPAMALAA"},
+     { "Roseobacter denitrificans","ANDNRAPVAMAA"},
+     { "Roseobacter litoralis","ANDNRAPVAMAA"},
+     { "Rothia dentocariosa","AKSKRTDFALAA"},
+     { "Rothia mucilaginosa","AESKRTDFALAA"},
+     { "Rubrivivax gelatinosus","ANDERFALAA"},
+     { "Rubrobacter xylanophilus","ANDREMALAA"},
+     { "Ruegeria pomeroyi","ANDNRAPVALAA"},
+     { "Ruegeria sp. TM1040","ANDNRAPVALAA"},
+     { "Ruminococcus albus","GHGYFAKAS"},
+     { "Ruminococcus albus","DNDNFAMAA"},
+     { "Runella slithyformis","GEYSYAMAA"},
+     { "Ruthia magnifica","ANENNYALAA"},
+     { "Saccharomonospora viridis","AKTNSQRDFALAA"},
+     { "Saccharophagus degradans","ANDDNYGAQLAA"},
+     { "Saccharopolyspora erythraea","ADKSQREFALAA"},
+     { "Salinibacter ruber","ADDYSYAMAA"},
+     { "Salinispora arenicola","AKQNRADFALAA"},
+     { "Salinispora tropica","AKQNRADFALAA"},
+     { "Salmonella bongori","ANDENYALAA"},
+     { "Salmonella enterica 1","ANDETYALAA"},
+     { "Salmonella enterica 2","ANDENYALAA"},
+     { "Salmonella enterica 3","ANDETYALAA"},
+     { "Salmonella enterica 5","ANDETYALAA"},
+     { "Salmonella enterica 6","ANDENYALAA"},
+     { "Salmonella paratyphi","ANDENYALAA"},
+     { "Salmonella typhimurium","ANDETYALAA"},
+     { "Salmonella typhi","ANDETYALAA"},
+     { "Sanguibacter keddieii","ADSKRTDFALAA"},
+     { "Saprospira grandis","GNTNYALAA"},
+     { "Sebaldella termitidis","GNDNYALAA"},
+     { "secondary endosymbiont","ANDSQFESKTALAA"},
+     { "Segniliparus rotundus","ADTTQRDYALAA"},
+     { "Selenomonas ruminantium","DEFDYAYAA"},
+     { "Selenomonas sputigena","ANEDYALAA"},
+     { "Serratia marcescens","ANDENYALAA"},
+     { "Serratia plymuthica","ANDSQFESAALAA"},
+     { "Serratia proteamaculans","ANDSQFESAALAA"},
+     { "Serratia symbiotica","ANDENYALAA"},
+     { "Shewanella amazonensis","ANDDNYALAA"},
+     { "Shewanella ANA-3","ANDDNYALAA"},
+     { "Shewanella baltica","ANDSNYSLAA"},
+     { "Shewanella denitrificans","ANDSNYSLAA"},
+     { "Shewanella frigidimarina","ANDSNYSLAA"},
+     { "Shewanella halifaxensis","ANDSNYSLAA"},
+     { "Shewanella loihica","ANDDNYALAA"},
+     { "Shewanella oneidensis","ANDDNYALAA"},
+     { "Shewanella pealeana","ANDSNYSLAA"},
+     { "Shewanella piezotolerans","ANDDNYSLAA"},
+     { "Shewanella putrefaciens","ANDDNYALAA"},
+     { "Shewanella PV-4","ANDDNYALAA"},
+     { "Shewanella SAR-1","ANDDNYALAA"},
+     { "Shewanella SAR-1, version 2","ANNDNYALAA"},
+     { "Shewanella SAR-2, version 2","ADYGYMAAA"},
+     { "Shewanella sediminis","ANDSNYSLAA"},
+     { "Shewanella sp. ANA-3","ANDDNYALAA"},
+     { "Shewanella sp. MR-4","ANDDNYALAA"},
+     { "Shewanella sp. MR-7","ANDDNYALAA"},
+     { "Shewanella sp. W3-18-1","ANDDNYALAA"},
+     { "Shewanella violacea","ANDSNYSLAA"},
+     { "Shewanella woodyi","ANDDNYALAA"},
+     { "Shigella boydii","ANDENYALAA"},
+     { "Shigella dysenteriae 1","ANDENYALAA"},
+     { "Shigella dysenteriae 2","ANDENYALAA"},
+     { "Shigella flexneri","ANDENYALAA"},
+     { "Shigella sonnei","ANDENYALAA"},
+     { "Shimwellia blattae","ANDENYALAA"},
+     { "Sideroxydans lithotrophicus","ANDEKYALAA"},
+     { "Silicibacter pomeroyi","ANDNRAPVALAA"},
+     { "Silicibacter TM1040","ANDNRAPVALAA"},
+     { "Simiduia agarivorans","ANDDNYGAQLAA"},
+     { "Simkania negevensis","VDTTEDFYLEAA"},
+     { "Sinorhizobium fredii","ANDNYAEARLAA"},
+     { "Sinorhizobium medicae","ANDNYAEARLAA"},
+     { "Sinorhizobium meliloti","ANDNYAEARLAA"},
+     { "Slackia heliotrinireducens","GKSYNTGRMALAA"},
+     { "Sodalis glossinidius","ANDSQFESNAALAA"},
+     { "Solibacillus silvestris","GKQQNFAFAA"},
+     { "Solibacter usitatus","ANTQFAYAA"},
+     { "Solitalea canadensis","GENNYALAA"},
+     { "Sorangium cellulosum","ANDNAYAVAA"},
+     { "Sphaerobacter thermophilus","GNESYALAA"},
+     { "Sphaerochaeta coccoides","AKKEDENVSYDAEYAFAA"},
+     { "Sphaerochaeta globosa","AKKEDEVSFNAEYAFAA"},
+     { "Sphaerochaeta pleomorpha","AKKEDEVSFNAEYALAA"},
+     { "Sphingobacterium sp. 21","GENNYALAA"},
+     { "Sphingobium chlorophenolicum","ANDNEALALAA"},
+     { "Sphingobium japonicum","ANDNEALALAA"},
+     { "Sphingobium sp. SYK-6","ANDNEALALAA"},
+     { "Sphingomonas elodea","ANDNEALAIAA"},
+     { "Sphingomonas wittichii","ANDNEALAIAA"},
+     { "Sphingopyxis alaskensis","ANDNEALALAA"},
+     { "Spirochaeta africana","AKNEDNVVEVAFGNDDTMLAAA"},
+     { "Spirochaeta smaragdinae","ANDADYALAA"},
+     { "Spirochaeta thermophila","ANDELALAA"},
+     { "Spiroplasma kunkelii","ASKKQKEDKIEMPAFMMNNQLAVSMLAA"},
+     { "Spirosoma linguale","GEYNYAMAA"},
+     { "Stackebrandtia nassauensis","AKTESRSSFALAA"},
+     { "Staphylococcus aureus","GKSNNNFAVAA"},
+     { "Staphylococcus carnosus","GKTNNNLAVAA"},
+     { "Staphylococcus epidermidis","DKSNNNFAVAA"},
+     { "Staphylococcus haemolyticus","DKSNNNFAVAA"},
+     { "Staphylococcus lugdunensis","GKSNNNFAVAA"},
+     { "Staphylococcus pseudintermedius","GKTNNNFAVAA"},
+     { "Staphylococcus saprophyticus","GKENNNFAVAA"},
+     { "Staphylococcus xylosus","GKENNNFAVAA"},
+     { "Starkeya novella","ANDNYAPVAQAA"},
+     { "Stenotrophomonas maltophilia","ANDDNYALAA"},
+     { "Stigmatella aurantiaca","DGKDTKANDNVELALAA"},
+     { "Streptobacillus moniliformis","GKNNFALAA"},
+     { "Streptococcus agalactiae","AKNTNSYALAA"},
+     { "Streptococcus bovis","AKNTNSYAVAA"},
+     { "Streptococcus constellatus","AKNNNSYALAA"},
+     { "Streptococcus criceti","AKNTNSYAVAA"},
+     { "Streptococcus dysgalactiae","AKNTNSYALAA"},
+     { "Streptococcus equi","AKNNTTYALAA"},
+     { "Streptococcus gallolyticus","AKNTNSYAVAA"},
+     { "Streptococcus gordonii","AKNNTSYALAA"},
+     { "Streptococcus macedonicus","AKNTNSYAVAA"},
+     { "Streptococcus mitis","AKNNTSYALAA"},
+     { "Streptococcus mutans","AKNTNSYAVAA"},
+     { "Streptococcus oralis","AKNNTSYALAA"},
+     { "Streptococcus parasanguinis","AKNNNSYALAA"},
+     { "Streptococcus parauberis","AKNTNTYALAA"},
+     { "Streptococcus pneumoniae","AKNNTSYALAA"},
+     { "Streptococcus pseudopneumoniae","AKNNTSYALAA"},
+     { "Streptococcus pyogenes","AKNTNSYALAA"},
+     { "Streptococcus salivarius","AQLNITAKNTNSYAVAA"},
+     { "Streptococcus sanguinis","AKNNNSYALAA"},
+     { "Streptococcus sobrinus","AKNTNSYAVAA"},
+     { "Streptococcus suis","AKNTNTYALAA"},
+     { "Streptococcus thermophilus","AKNTNSYAVAA"},
+     { "Streptococcus uberis","AKNTNSYALAA"},
+     { "Streptococcus zooepidemicus","AKNNTTYALAA"},
+     { "Streptomyces aureofaciens","ANSKRDSQQFALAA"},
+     { "Streptomyces avermitilis","ANTKSDSQSFALAA"},
+     { "Streptomyces avermitilus","ANTKSDSQSFALAA"},
+     { "Streptomyces bingchenggensis","ANTKRDSFALAA"},
+     { "Streptomyces cattleya","ANNKRDSFALAA"},
+     { "Streptomyces coelicolor","ANTKRDSSQQAFALAA"},
+     { "Streptomyces collinus","ANTKRDSSSFALAA"},
+     { "Streptomyces flavogriseus","ANSKRDSSAFALAA"},
+     { "Streptomyces griseus","ANSKRDSSAFALAA"},
+     { "Streptomyces hygroscopicus","ANTKRDSFALAA"},
+     { "Streptomyces lividans","ANTKRDSSQQAFALAA"},
+     { "Streptomyces scabiei","ANSKSDSPQQQFSLAA"},
+     { "Streptomyces sp. SirexAA-E","ANTKRDSSAFALAA"},
+     { "Streptomyces thermophilus","AKNTNSYAVAA"},
+     { "Streptomyces venezuelae","ANSKSDNSRFALAA"},
+     { "Streptomyces violaceusniger","ANTKRDSFALAA"},
+     { "Streptosporangium roseum","ANKTHSEVSQGNLALAA"},
+     { "Sulcia muelleri","GKKNYALAA"},
+     { "Sulfuricurvum kujiense","ANNTNYRPAYAVA"},
+     { "Sulfurimonas autotrophica","ANNTNYRPALAVA"},
+     { "Sulfurimonas denitrificans","ANNTNYRPAYAVA"},
+     { "Sulfurospirillum barnesii","ANNSNYRPAYAVA"},
+     { "Sulfurospirillum deleyianum","ANNSNYRPAYALAA"},
+     { "Sulfurovum sp. NBC37-1","ANNTDYRPAYAVA"},
+     { "Synechococcus elongatus","ANNIVPFARKAAPVAA"},
+     { "Synechococcus sp. CC9311","ANNIVRFSRQAAPVAA"},
+     { "Synechococcus sp. CC9605","ANNIVRFSRQAAPVAA"},
+     { "Synechococcus sp. CC9902","ANNIVRFSRQAAPVAA"},
+     { "Synechococcus sp. JA-2-3B'a(2-13)","ANNVVPFARKAAALAA"},
+     { "Synechococcus sp. JA-3-3Ab (version 1)","ANNVVPFARKAAALAA"},
+     { "Synechococcus sp. JA-3-3Ab (version 2)","ANNVVPFARKAAALAA"},
+     { "Synechococcus sp. PCC 6301","ANNIVPFARKAAPVAA"},
+     { "Synechococcus sp. PCC 6307","ANNIVRFSRQAAPVAA"},
+     { "Synechocystis sp. PCC 6803","ANNIVSFKRVAIAA"},
+     { "Synechococcus sp. PCC 6904","ANNIVRFSRQAAPVAA"},
+     { "Synechococcus sp. PCC 7002","ANNIVPFARKAAAVA"},
+     { "Synechococcus sp. PCC 7009","ANNIVRFSRQAAPVAA"},
+     { "Synechococcus sp. RCC307","ANNIVRFSRQAAPVAA"},
+     { "Synechococcus sp. WH 7803","ANNIVRFSRQAAPVAA"},
+     { "Synechococcus sp. WH 8102","ANNIVRFSRHAAPVAA"},
+     { "Syntrophobacter fumaroxidans","ADDYAYAVAA"},
+     { "Syntrophomonas wolfei","AEDNFALAA"},
+     { "Syntrophothermus lipocalidus","ANNELALAA"},
+     { "Syntrophus aciditrophicus","ANDYEYALAA"},
+     { "Tannerella forsythensis","GENNYALAA"},
+     { "Tannerella forsythia","GENNYALAA"},
+     { "Taylorella asinigenitalis","ANDDKFALAA"},
+     { "Taylorella equigenitalis","ANDENFALAA"},
+     { "Tepidanaerobacter acetatoxydans","ANNDLAYAA"},
+     { "Teredinibacter turnerae","ANDDNYGAQLAA"},
+     { "Terriglobus roseus","AEPQFALAA"},
+     { "Terriglobus saanensis","AEPQFALAA"},
+     { "Tetragenococcus halophilus","AKNNNNSYALAA"},
+     { "Thalassiosira pseudonana chloroplast","ANNIMPFMFNVVKTNRSLTTLNFAV"},
+     { "Thalassiosira weissflogii chloroplast","ANNIIPFIFKAVKTKKEAMALNFAV"},
+     { "Thauera sp. MZ1T","ANDERFALAA"},
+     { "Thermacetogenium phaeum","ANNEYALAA"},
+     { "Thermaerobacter marianensis","ANEELALAA"},
+     { "Thermanaerovibrio acidaminovorans","ANDNYALAA"},
+     { "Thermincola potens","AEENYALAA"},
+     { "Thermoanaerobacter italicus","ADRELAYAA"},
+     { "Thermoanaerobacter mathranii","ADRELAYAA"},
+     { "Thermoanaerobacter pseudethanolicus","ADRELAYAA"},
+     { "Thermoanaerobacter sp. X514","ADRELAYAA"},
+     { "Thermoanaerobacter tengcongensis","ADRELAYAA"},
+     { "Thermoanaerobacter wiegelii","ADRELAYAA"},
+     { "Thermoanaerobacterium saccharolyticum","ANDNLAYAA"},
+     { "Thermoanaerobacterium thermosaccharolyticum","ANNDNLAYAA"},
+     { "Thermoanaerobacterium xylanolyticum","ANDNLAYAA"},
+     { "Thermobaculum terrenum","ANTEYALAA"},
+     { "Thermobifida fusca","ANSKRTEFALAA"},
+     { "Thermobispora bispora","ANKKHAEVSQASLALAA"},
+     { "Thermodesulfatator indicus","ADEYNYAMAA"},
+     { "Thermodesulfobacterium commune","ANEYAYALAA"},
+     { "Thermodesulfobacterium geofontis","ADEYSYALAA"},
+     { "Thermodesulfobium narugense","ANNNSLALAA"},
+     { "Thermodesulfovibrio yellowstonii","ANNELALAA"},
+     { "Thermomicrobium roseum","GERELALAA"},
+     { "Thermomonospora curvata","ANKKQSEFALAA"},
+     { "Thermosediminibacter oceani","ANEELALAA"},
+     { "Thermosipho africanus","ANEELALAA"},
+     { "Thermosipho melanesiensis","ANEEIALAA"},
+     { "Thermosynechococcus elongatus","ANNIVPFARKAAAVA"},
+     { "Thermotoga lettingae","ANNELALAA"},
+     { "Thermotoga maritima ","ANEPVAVAA"},
+     { "Thermotoga neapolitana","ANEPVAVAA"},
+     { "Thermotoga petrophila","ANEPVAVAA"},
+     { "Thermotoga sp. RQ2","ANEPVAVAA"},
+     { "Thermotoga thermarum","ANEELALAA"},
+     { "Thermovibrio ammonificans","ADETLALAA"},
+     { "Thermovirga lienii","ANENYALAA"},
+     { "Thermus oshimai","ANKPAYALAA"},
+     { "Thermus scotoductus","ANKPAYALAA"},
+     { "Thermus sp. CCB_US3_UF1","ANKPAYALAA"},
+     { "Thermus thermophilus","ANTNYALAA"},
+     { "Thioalkalimicrobium cyclicum","ANDDNYALAA"},
+     { "Thioalkalivibrio sp. K90mix","ANDDNYALAA"},
+     { "Thiobacillus denitrificans","AKSKAARRNPACSAGVMELKA"},
+     { "Thiocystis violascens","ANDDNYALAA"},
+     { "Thiomicrospira crunogena","ANDDNYALAA"},
+     { "Thiomonas intermedia","ANDSSYALAA"},
+     { "Thiomonas sp. 3As","ANDSSYALAA"},
+     { "Tistrella mobilis","ANDNRVALAA"},
+     { "Tolumonas auensis","ANDETYALAA"},
+     { "Tremblaya princeps 1 (Dysmicoccus)","APSNRFTIVANDCIDALVRRAVV"},
+     { "Treponema azotonutricium","ADNDNYNYALAA"},
+     { "Treponema brennaborense","AEDNRQFALAA"},
+     { "Treponema caldaria","ADNDSYALAA"},
+     { "Treponema denticola","AENNDSFDYALAA"},
+     { "Treponema pallidum","ANSDSFDYALAA"},
+     { "Treponema primitia","ANNDSYAFAA"},
+     { "Treponema succinifaciens","AKRREDEQSENEQFALAA"},
+     { "Trichodesmium erythraeum","ANNIVPFARKQVAALA"},
+     { "Tropheryma whipplei","ANLKRTDLSLAA"},
+     { "Truepera radiovictrix","GNSNSYALAA"},
+     { "Tsukamurella paurometabola","ADSNQRDFALAA"},
+     { "Turneriella parva","AENETYALAA"},
+     { "uncultured bacterium","ANDNFAPVAVAA"},
+     { "Uncultured ciona","ANDEFFDARLRA"},
+     { "Uncultured FS1","ANDETYALAA"},
+     { "Uncultured FS2","ANDENYALAA"},
+     { "Uncultured LEM1","ANDETYALAA"},
+     { "Uncultured LEM2","ANDETHALAA"},
+     { "Uncultured marineEBAC20E09","ANNDNYALAA"},
+     { "Uncultured phakopsora","ANDNSYALAA"},
+     { "Uncultured QL1","ANVENYALAA"},
+     { "Uncultured RCA1","ANDENYALAA"},
+     { "Uncultured RCA2","SNDENYALAA"},
+     { "Uncultured RCA4","ANDETYALAA"},
+     { "Uncultured remanei","ANDESYALAA"},
+     { "Uncultured stronglyoides1","ANDERFALAA"},
+     { "Uncultured U01a","ANDSNYALAA"},
+     { "Uncultured U02","ANDEQFALAA"},
+     { "Uncultured U04","ANDETYALAA"},
+     { "Uncultured VLS13","ANDENYALAA"},
+     { "Uncultured VLS1","ANDENYALAA"},
+     { "Uncultured VLS5","ANDETYALAA"},
+     { "Uncultured VLS6","ANDENYALAA"},
+     { "Uncultured VLS7","ANDENYALAA"},
+     { "Uncultured VLS9","ANDENYALAA"},
+     { "Uncultured VLW1","ANDENYALAA"},
+     { "Uncultured VLW2","ANDENYALAA"},
+     { "Uncultured VLW3","ANDENYALAA"},
+     { "Uncultured VLW5","ANDENYALAA"},
+     { "Uncultured WW10","ANDENYALAV"},
+     { "Uncultured WW11","ANDDNYALAA"},
+     { "Uncultured WW1","ANDENYALAA"},
+     { "Uncultured WW2","ANDENYALAA"},
+     { "Uncultured WW4","ANDGNYALAA"},
+     { "Uncultured WW5","ANDENYALAA"},
+     { "Uncultured WW7","ANDENCALAA"},
+     { "Uncultured WW8","ANDENYALAA"},
+     { "Uncultured WW9","ANDENYALAA"},
+     { "Ureaplasma parvum","AENKKSSEVELNPAFMASATNANYAFAY"},
+     { "Ureaplasma urealyticum","AENKKSSEVELNPAFMASATNANYAFAY"},
+     { "Variovorax paradoxus","ANDERFALAA"},
+     { "Veillonella parvula","AEENFALAA"},
+     { "Verminephrobacter eiseniae","ANDERFALAA"},
+     { "Verrucomicrobium spinosum","ANSNELALAA"},
+     { "Verrucosispora maris","AKHNRADFALAA"},
+     { "Vesicomyosocius okutanii","ENENNYALAA"},
+     { "Vibrio anguillarum","ANDENYALAA"},
+     { "Vibrio campbellii","ANDENYALAA"},
+     { "Vibrio cholerae","ANDENYALAA"},
+     { "Vibrio Ex25","ANDENYALAA"},
+     { "Vibrio fischeri","ANDENYALAA"},
+     { "Vibrio furnissii","ANDENYALAA"},
+     { "Vibrio parahaemolyticus","ANDENYALAA"},
+     { "Vibrio parahemolyticus","ANDENYALAA"},
+     { "Vibrio sp. EJY3","ANDENYALAA"},
+     { "Vibrio sp. Ex25","ANDENYALAA"},
+     { "Vibrio splendidus","ANDENYALAA"},
+     { "Vibrio vulnificus","ANDENYALAA"},
+     { "Waddlia chondrophila","ADLDLATAAVAA"},
+     { "Weeksella virosa","GNEEYALAA"},
+     { "Weissella koreensis","AKNSNNLAFAA"},
+     { "Wigglesworthia brevipalpis","AKHKYNEPALLAA"},
+     { "Wigglesworthia glossinidia","AKHKYNEPALLAA"},
+     { "Wolbachi.sp","ANDNFAAEDNVDAIAA"},
+     { "Wolbachia endosymbiont","ANDNFAAEEYRVAA"},
+     { "Wolbachia sp. 2 (Brugi)","ANDNFAAEGDVAVAA"},
+     { "Wolbachia sp. 3 (Culex)","ANDNFAAEDNVALAA"},
+     { "Wolbachia sp. 4 (Dros.)","ANDNFAAEEYRVAA"},
+     { "Wolinella succinogenes","ALSSHPKRGKRLGLPITSALGA"},
+     { "Xanthobacter autotrophicus","ANDNYAPVAQAA"},
+     { "Xanthomonas albilineans","ANDDNYALAA"},
+     { "Xanthomonas axonopodis","ANDDNYGSDFAIAA"},
+     { "Xanthomonas campestris 1","ANDDNYGSDFAIAA"},
+     { "Xanthomonas campestris 2","ANDDNYGSDSAIAA"},
+     { "Xanthomonas oryzae","ANDDNYGSDFAIAA"},
+     { "Xenorhabdus bovienii","ANDENYALAA"},
+     { "Xenorhabdus nematophila","ANDENYALAA"},
+     { "Xylanimonas cellulosilytica","ADNTRNDFALAA"},
+     { "Xylella fastidiosa 1","ANEDNFAVAA"},
+     { "Xylella fastidiosa 2","ANEDNFALAA"},
+     { "Xylella fastidiosa 3","ANEDNFAIAA"},
+     { "Xylella fastidiosa 4","ANEDNFALAA"},
+     { "Yersinia bercovieri","ANDSQYESAALAA"},
+     { "Yersinia enterocolitica","ANDSQYESAALAA"},
+     { "Yersinia frederiksenii","ANDENYALAA"},
+     { "Yersinia intermedia","ANDSQYESAALAA"},
+     { "Yersinia mollaretii","ANDSQYESAALAA"},
+     { "Yersinia pestis","ANDENYALAA"},
+     { "Yersinia pseudotuberculosis","ANDENYALAA"},
+     { "Zobellia galactanivorans","GENNYALAA"},
+     { "Zunongwangia profunda","GENNYALAA"} };
+
   
 
 /* TOOLS */
@@ -1668,12 +3056,18 @@ typedef struct { FILE *f;
 char upcasec(char c)
 { return((c >= 'a')?c-32:c); }
 
-
 int length(char *s)
-{ int i=0;
+{ int i = 0;
   while (*s++) i++;
   return(i); }
 
+char *softmatch(char *s, char *key)
+{ while (upcasec(*key) == upcasec(*s))
+   { if (!*key++) return(s);
+     s++; }
+  if (*key) return(NULL);
+  return(s); }
+
 char *strpos(char *s, char *k)
 { char c,d;
   int i;
@@ -1698,6 +3092,17 @@ char *softstrpos(char *s, char *k)
      s++; }
   return(NULL); }
 
+char *wildstrpos(char *s, char *k)
+{ char c,d;
+  int i;
+  d = upcasec(*k);
+  while (c = *s)
+   { if ((upcasec(c) == d) || (d == '*'))
+       { i = 0;
+         do if (!k[++i]) return(s);
+         while ((upcasec(s[i]) == upcasec(k[i])) || (k[i] == '*')); }
+     s++; }
+  return(NULL); }
 
 char *marginstring(char *s, char *k, int margin)
 { char c,d;
@@ -1726,8 +3131,29 @@ int margindetect(char *line, int margin)
   if (c == '\r') return(0);
   if (c == '\0') return(0);
   return(1); }
-     
 
+
+char *backword(char *line, char *s, int n)
+{
+int spzone;
+if (space(*s))
+ { spzone = 1; }
+else
+ { spzone = 0;
+   n++; }
+while (s > line)
+ { if (space(*s))
+    { if (spzone == 0)
+       { spzone = 1;
+         if (--n <= 0) 
+          return(++s); }}
+   else spzone = 0;
+   s--; }
+if (!space(*s))
+ if (n <= 1) return(s);
+return(NULL);
+}
+     
 char *dconvert(char *s, double *r)
 { static char zero='0',nine='9';
   int shift,expshift,sgn,expsgn,exponent;
@@ -1802,6 +3228,7 @@ char *lconvert(char *s, long *r)
 char *getlong(char *line, long *l)
 { static char zero='0',nine='9';
   char c1,c2,*s;
+  if (!line) return(NULL);
   s = line;
   while (c1 = *s) 
    { if (c1 >= zero)
@@ -1822,6 +3249,21 @@ char *copy(char *from, char *to)
   return(--to);  }
 
 
+char *copy2sp(char *from1, char *from2, char *to, int n)
+{ 
+char *s;
+s = to;
+while (from1 < from2)
+ { *s++ = *from1++;
+   if (--n <= 0)
+    { do if (--s <= to) break;
+      while (!space(*s));
+      break; }}
+*s = '\0'; 
+return(s);
+}
+
+
 char *copy3cr(char *from, char *to, int n)
 { while (*to = *from++)
    { if (*to == DLIM)
@@ -1838,60 +3280,151 @@ char *quotestring(char *line, char *a, int n)
   while (ch = *line++) 
    if (ch == '"') 
     { while (ch = *line++) 
-       if (ch != '"') 
-        { *a++ = ch;
-          if (--n <= 0) break; }
-       else break;
+       { if (ch == '"') break;
+         if (ch == ';') break;
+         if (ch == '\n') break;
+         if (ch == '\r') break;
+         *a++ = ch;
+         if (--n <= 0) break; }
       break; }
   *a = '\0';
   return(a); }
+      
 
 /* LIBRARY */
 
+
+int fseekd(data_set *d, long fpos, long foffset)
+{
+if (d->bugmode)
+ { fpos += foffset;
+   if (fpos < 0L) fpos = 0L;
+   if (fseek(d->f,0L,SEEK_SET)) return(EOF);
+   d->filepointer = -1L;
+   while (++d->filepointer < fpos)
+    if (getc(d->f) == EOF) return(EOF);
+   return(0); }
+if (fseek(d->f,fpos,SEEK_SET)) return(EOF);
+d->filepointer = fpos;
+if (foffset != 0L)
+ { if ((fpos + foffset) < 0L) foffset = -fpos;
+   if (fseek(d->f,foffset,SEEK_CUR)) return(EOF);
+   d->filepointer += foffset; }
+return(0);
+}
+
+
+long ftelld(data_set *d)
+{
+if (d->bugmode) return(d->filepointer);
+else return(ftell(d->f));
+}
+
+
+char fgetcd(data_set *d)
+{
+int ic;
+if ((ic = getc(d->f)) == EOF) return(NOCHAR);
+d->filepointer++;
+return((char)ic);
+}
+
+
+char *fgetsd(data_set *d, char line[], int len)
+{
+int i,ic;
+i = 0;
+while (i < len)
+ { if ((ic = getc(d->f)) == EOF) break;
+   d->filepointer++;
+   if (ic == '\r') continue;
+   if (ic == '\n')
+    { line[i++] = DLIM;
+      break; }
+   line[i++] = (char)ic; }
+if (i < 1) return(NULL);
+line[i] = '\0';
+return(line);
+}
+
+int agene_position_check(data_set *d, int nagene, annotated_gene *agene)
+{
+int a;
+long l,swap;
+if ((agene->stop - agene->start) > MAXAGENELEN) 
+ { swap = agene->stop;
+   agene->stop = agene->start;
+   agene->start = swap;
+   agene->stop += d->aseqlen; }
+if (agene->start > agene->stop) agene->stop += d->aseqlen;
+l = agene->stop - agene->start;
+if ((l < 1) || (l > MAXAGENELEN)) return(0);
+if (agene->stop == d->aseqlen)
+ { for (a = 0; a < nagene; a++)
+    if (d->gene[a].start == agene->start)
+     if (d->gene[a].genetype == agene->genetype)
+      if (softmatch(d->gene[a].species,agene->species))
+       return(0); }
+return(1);
+}
+
+
+
 long process_sequence_heading(data_set *d, csw *sw)
 { int i,ic,nagene;
   long l,realstart;
-  char line[STRLEN],c,*s,*sq;
-  annotated_gene *agene;
-  FILE *f;
-  f = d->f;
+  char line[STRLEN],c,*s,*sq,*sd;
+  annotated_gene *agene,tmpagene;
   d->datatype = FASTA;
-  fseek(f,d->seqstart,SEEK_SET);
-  do { if ((ic = getc(f)) == EOF) return(-1L);
-       c = (char)ic; }
+  fseekd(d,d->seqstart,d->seqstartoff);
+  HEADING:           
+  do if ((c = fgetcd(d)) == NOCHAR) return(-1L);
   while (space(c));
-  if (!fgets(d->seqname,STRLENM1,f)) return(-1L);
+  if (c == '#')
+   { if (!fgetsd(d,line,STRLENM1)) return(-1L);
+     goto HEADING; }
+  if (!fgetsd(d,d->seqname,STRLENM1)) return(-1L);
   if (c != '>')
-   { if (upcasec(c) != 'L') goto FNSN;
-     if (!(s = softstrpos(d->seqname,"OCUS"))) goto FNSN;
+   { s = d->seqname;
+     if (upcasec(c) != 'L')
+      { do if (!(c = *s++)) goto FNSN;
+        while (upcasec(c) != 'L'); } 
+     if (!(s = softmatch(s,"OCUS"))) goto FNSN;
+     if (sd = softstrpos(d->seqname,"BP"))
+      { sd = backword(d->seqname,sd,1);
+        if (sd = getlong(sd,&l)) d->aseqlen = l; }
      s += 4;
      while (space(*s)) s++;
      sq = d->seqname;
      while (!space(*s)) *sq++ = *s++;
-     *sq++ = ' ';
-     if (!fgets(line,STRLENM1,f)) return(-1L);
-     if (!(s = softstrpos(line,"DEFINITION"))) return(-1L);
-     s += 10;
-     while (space(*s)) s++;
-     copy(s,sq);
-     if (!fgets(line,STRLENM1,f)) return(-1L);
+     d->aseqlen = 0L; 
+     if (!fgetsd(d,line,STRLENM1)) return(-2L);
+     if (sd = softstrpos(line,"DEFINITION"))
+      { sd += 10;
+        while (space(*sd)) sd++;
+        *sq++ = ' ';
+        copy(sd,sq);
+        if (!fgetsd(d,line,STRLENM1)) return(-2L); }
+     else copy(s,sq);
      for (i = 0; i < NS; i++) d->nagene[i] = 0;
      nagene = 0;
      while (!marginstring(line,"ORIGIN",10))
       { if (nagene >= NGFT) goto GBNL;
-        if (!(s = marginstring(line,"tRNA",10))) goto CDSEQ;
         agene = &(d->gene[nagene]);
+        agene->comp = 0;
+        agene->start = -1L;
+        agene->stop = -1L;
+        agene->antistart = -1L;
+        agene->antistop = -1L;
+        agene->permuted = 0;
+        agene->pseudogene = 0;
+        if (!(s = marginstring(line,"tRNA",10))) goto TMRNASEQ;
         agene->genetype = tRNA;
 	    if (softstrpos(s,"complement")) agene->comp = 1;
-	    else agene->comp = 0;
-        if (!(s = getlong(s,&l))) l = -1L;
-	    agene->start = l;
-        if (!(s = getlong(s,&l))) l = -1L;
-	    agene->stop = l;
+        if (s = getlong(s,&l)) agene->start = l;
+        if (s = getlong(s,&l)) agene->stop = l;
         copy("tRNA-???",agene->species);
-        agene->antistart = -1L;
-        agene->antistop = -1L;
-        if (!fgets(line,STRLENM1,f)) return(-1L);
+        if (!fgetsd(d,line,STRLENM1)) return(-2L);
         while (!margindetect(line,10))
          { if (s = softstrpos(line,"product="))
             if (s = softstrpos(s,"tRNA-"))
@@ -1904,24 +3437,76 @@ long process_sequence_heading(data_set *d, csw *sw)
               agene->antistart = l;
               if (!(s = getlong(s,&l))) l = -1L;
               agene->antistop = l; }
-           if (!fgets(line,STRLENM1,f)) return(-1L); }
-        d->nagene[tRNA]++;
+           if (softstrpos(line,"/pseudo")) agene->pseudogene = 1;
+           if (!fgetsd(d,line,STRLENM1)) return(-2L); }
+        if (agene_position_check(d,nagene,agene))
+         { d->nagene[tRNA]++;
+           nagene++; }
+	    continue;
+        TMRNASEQ:
+        if (!(s = marginstring(line,"tmRNA",10))) goto CDSEQ;
+        agene->genetype = tmRNA;
+	    if (softstrpos(s,"complement")) agene->comp = 1;
+        if (s = getlong(s,&l)) agene->start = l;
+        if (s = getlong(s,&l)) agene->stop = l;
+        copy("tmRNA",agene->species);
+        if (!agene_position_check(d,nagene,agene)) goto GBNL;
+        d->nagene[tmRNA]++;
         nagene++;
+        if (!fgetsd(d,line,STRLENM1)) return(-2L);
+        while (!margindetect(line,10))
+         { if (softstrpos(line,"acceptor")) agene->permuted = 1;
+           if (softstrpos(line,"/pseudo")) agene->pseudogene = 1;
+           if (!fgetsd(d,line,STRLENM1)) return(-2L); }
+        if (s = marginstring(line,"tmRNA",10))
+         { tmpagene.comp = 0;
+           tmpagene.start = -1L;
+           tmpagene.stop = -1L;
+           tmpagene.antistart = -1L;
+           tmpagene.antistop = -1L;
+           tmpagene.permuted = 0;
+           tmpagene.pseudogene = 0;
+	       if (softstrpos(s,"complement")) tmpagene.comp = 1;
+           if (s = getlong(s,&l)) tmpagene.start = l;
+           if (s = getlong(s,&l)) tmpagene.stop = l;
+           if (!fgetsd(d,line,STRLENM1)) return(-2L);
+           while (!margindetect(line,10))
+            { if (softstrpos(line,"coding")) tmpagene.permuted = 1;
+              if (softstrpos(line,"/pseudo")) tmpagene.pseudogene = 1;
+              if (s = softstrpos(line,"/tag_peptide"))
+               { if (s = getlong(s,&l)) tmpagene.antistart = l;
+                 if (s = getlong(s,&l)) tmpagene.antistop = l; }
+              if (!fgetsd(d,line,STRLENM1)) return(-2L); }
+           if (agene->permuted && tmpagene.permuted)
+            { agene->stop = tmpagene.stop;
+              agene->antistart = tmpagene.antistart;
+              agene->antistop = tmpagene.antistop;
+              copy("tmRNA(Perm)",agene->species); }
+           else
+            { if (nagene >= NGFT) goto GBNL;
+              agene = &(d->gene[nagene]);
+              agene->comp = tmpagene.comp;
+              agene->start = tmpagene.start;
+              agene->stop = tmpagene.stop;
+              agene->antistart = -1L;
+              agene->antistop = -1L;
+              agene->permuted = 0;
+              agene->pseudogene = tmpagene.pseudogene;
+              copy("tmRNA",agene->species);
+              if (agene_position_check(d,nagene,agene))
+               { d->nagene[tmRNA]++;
+                 nagene++; }}}           
 	    continue;
         CDSEQ:
         if (!(s = marginstring(line,"CDS",10)))
          if (!(s = marginstring(line,"mRNA",10))) 
           goto RRNA;
-        agene = &(d->gene[nagene]);
         agene->genetype = CDS;
 	    if (softstrpos(s,"complement")) agene->comp = 1;
-	    else agene->comp = 0;
-        if (!(s = getlong(s,&l))) l = -1L;
-	    agene->start = l;
-        if (!(s = getlong(s,&l))) l = -1L;
-	    agene->stop = l;
+        if (s = getlong(s,&l)) agene->start = l;
+        if (s = getlong(s,&l)) agene->stop = l;
         copy("???",agene->species);
-        if (!fgets(line,STRLENM1,f)) return(-1L);
+        if (!fgetsd(d,line,STRLENM1)) return(-2L);
         while (!margindetect(line,10))
          { if (s = softstrpos(line,"gene="))
             { s += 5;
@@ -1929,22 +3514,20 @@ long process_sequence_heading(data_set *d, csw *sw)
            else if (s = softstrpos(line,"product="))
             { s += 8;
               quotestring(s,agene->species,SHORTSTRLENM1); }
-           if (!fgets(line,STRLENM1,f)) return(-1L); }
-        d->nagene[CDS]++;
-        nagene++;
+           if (softstrpos(line,"/pseudo")) agene->pseudogene = 1;
+           if (!fgetsd(d,line,STRLENM1)) return(-2L); }
+        if (agene_position_check(d,nagene,agene))
+         { d->nagene[CDS]++;
+           nagene++; }
         continue;
         RRNA:
         if (!(s = marginstring(line,"rRNA",10))) goto GBNL;
-        agene = &(d->gene[nagene]);
         agene->genetype = rRNA;
 	    if (softstrpos(s,"complement")) agene->comp = 1;
-	    else agene->comp = 0;
-        if (!(s = getlong(s,&l))) l = -1L;
-	    agene->start = l;
-        if (!(s = getlong(s,&l))) l = -1L;
-	    agene->stop = l;
+        if (s = getlong(s,&l)) agene->start = l;
+        if (s = getlong(s,&l)) agene->stop = l;
         copy("???",agene->species);
-        if (!fgets(line,STRLENM1,f)) return(-1L);
+        if (!fgetsd(d,line,STRLENM1)) return(-2L);
         while (!margindetect(line,10))
          { if (s = softstrpos(line,"gene="))
             { s += 5;
@@ -1952,26 +3535,27 @@ long process_sequence_heading(data_set *d, csw *sw)
            else if (s = softstrpos(line,"product="))
             { s += 8;
               quotestring(s,agene->species,SHORTSTRLENM1); }
-           if (!fgets(line,STRLENM1,f)) return(-1L); }
-        d->nagene[rRNA]++;
-        nagene++;
+           if (softstrpos(line,"/pseudo")) agene->pseudogene = 1;
+           if (!fgetsd(d,line,STRLENM1)) return(-2L); }
+        if (agene_position_check(d,nagene,agene))
+         { d->nagene[rRNA]++;
+           nagene++; }
         continue;
         GBNL:
-        if (!fgets(line,STRLENM1,f)) return(-1L); }
+        if (!fgetsd(d,line,STRLENM1)) return(-2L); }
      d->datatype = GENBANK;
      d->nagene[NS-1] = nagene;
      sw->annotated = 1;
-     realstart = ftell(f); }
+     realstart = ftelld(d); }
   else
    { MH:
-     realstart = ftell(f);
-     do { if ((ic = getc(f)) == EOF) return(-1L);
-       c = (char)ic; }
+     realstart = ftelld(d);
+     do if ((c = fgetcd(d)) == NOCHAR) return(-3L);
      while (space(c));
      if (c == '>')
-      { if (!fgets(line,STRLENM1,f)) return(-1L);
+      { if (!fgetsd(d,line,STRLENM1)) return(-3L);
         goto MH; }
-     fseek(f,realstart,SEEK_SET); }
+     fseekd(d,realstart,0L); }
   s = d->seqname;
   i = 0;
   while ((c = *s) != '\0')
@@ -1982,11 +3566,11 @@ long process_sequence_heading(data_set *d, csw *sw)
   *s = '\0';
   return(realstart);
   FNSN:
-  realstart = d->seqstart;
   s = copy("Unnamed sequence ",d->seqname);
-  fseek(f,realstart,SEEK_SET);
-  if (fgets(line,STRLENM1,f)) copy3cr(line,s,50);
-  fseek(f,realstart,SEEK_SET);
+  fseekd(d,d->seqstart,d->seqstartoff);
+  realstart = ftelld(d);
+  if (fgetsd(d,line,STRLENM1)) copy3cr(line,s,50);
+  fseekd(d,realstart,0L);
   return(realstart); }
 
 
@@ -2012,10 +3596,10 @@ int move_forward(data_set *d)
     -4,-4,-4,-4,-4,-4,-4,-4,-4,-4,-4,-4,-4,-4,-4,-4,-4 };
   if (d->ps >= d->psmax)
    if (d->psmax > 0L)
-    { fseek(d->f,d->seqstart,SEEK_SET);
+    { fseekd(d,d->seqstart,d->seqstartoff);
       d->ps = 0L; }
   NL:
-  if ((ic = getc(d->f)) == EOF) goto FAIL;
+  if ((ic = (int)fgetcd(d)) == NOCHAR) goto FAIL;
   SC:
   ic = map[ic];
   BS:
@@ -2023,42 +3607,63 @@ int move_forward(data_set *d)
    { d->ps++;
      return(ic); }
   if (ic == -2)
-   { d->nextseq = ftell(d->f) - 1L;
+   { d->nextseq = ftelld(d);
+     d->nextseqoff = -1L;
      return(TERM); }
   if (ic == -3)
    if (d->datatype == GENBANK)
-   { if ((ic = getc(d->f)) == EOF) goto FAIL;
+   { if ((ic = (int)fgetcd(d)) == NOCHAR) goto FAIL;
      if ((ic = map[ic]) != -3) goto BS;
-     do if ((ic = getc(d->f)) == EOF) goto FAIL;
+     do if ((ic = (int)fgetcd(d)) == NOCHAR) goto FAIL;
      while (space(ic));
-     d->nextseq = ftell(d->f) - 1L;
+     d->nextseq = ftelld(d);
+     d->nextseqoff = -1L;
      return(TERM); }
   if (ic == -5)
-   { nextbase = ftell(d->f); 
-     if ((ic = getc(d->f)) == EOF) goto FAIL;
+   { nextbase = ftelld(d); 
+     if ((ic = (int)fgetcd(d)) == NOCHAR) goto FAIL;
      if (upcasec(ic) == 'O')
-      { if ((ic = getc(d->f)) == EOF) goto FAIL;
+      { if ((ic = (int)fgetcd(d)) == NOCHAR) goto FAIL;
         if (upcasec(ic) == 'C')
-         { if ((ic = getc(d->f)) == EOF) goto FAIL;
+         { if ((ic = (int)fgetcd(d)) == NOCHAR) goto FAIL;
            if (upcasec(ic) == 'U')
-            { if ((ic = getc(d->f)) == EOF) goto FAIL;
+            { if ((ic = (int)fgetcd(d)) == NOCHAR) goto FAIL;
               if (upcasec(ic) == 'S')
-               { d->nextseq = nextbase - 1L;
+               { d->nextseq = nextbase;
+                 d->nextseqoff = -1L;
                  return(TERM); }}}}
-     fseek(d->f,nextbase,SEEK_SET); }
+     fseekd(d,nextbase,0L); }
   goto NL;
   FAIL:
   d->nextseq = -1L;
+  d->nextseqoff = 0L;
   if (d->psmax > 0L)
    { d->ps = d->psmax;
      return(NOBASE); }
   else return(TERM); }
 
 
+
+char cbase(int c)
+{ static char base[7] = "acgt..";
+  if (c < Adenine) return('#');
+  if (c > NOBASE) return((char)c);
+  return(base[c]); }
+
+
+
+
 int seq_init(data_set *d, csw *sw)
 { long ngc;
   int ic;
-  if ((d->seqstart = process_sequence_heading(d,sw)) < 0L) return(0);
+  d->filepointer = 0;
+  if ((d->seqstart = process_sequence_heading(d,sw)) < 0L) 
+   { if (d->seqstart == -2L)
+      fprintf(stderr,"ERROR - unable to read Genbank sequence %s\n",d->seqname);
+     else if (d->seqstart == -3L)
+      fprintf(stderr,"ERROR - unable to read fasta sequence %s\n",d->seqname);
+     return(0); }
+  d->seqstartoff = 0L;
   d->ps = 0L;
   d->psmax = -1L;
   ngc = 0L;
@@ -2068,21 +3673,13 @@ int seq_init(data_set *d, csw *sw)
      ngc++;
   if ((d->psmax = d->ps) <= 0L) return(0);
   d->gc = (double)ngc/(double)d->psmax;
-  fseek(d->f,d->seqstart,SEEK_SET);
+  fseekd(d,d->seqstart,d->seqstartoff);
   d->ps = 0L;
   return(1); }
 
 
-
-char cbase(int c)
-{ static char base[6] = "acgt..";
-  if (c < Adenine) return('#');
-  if (c > NOBASE) return((char)c);
-  return(base[c]); }
-
-
 char cpbase(int c)
-{ static char base[6] = "ACGT..";
+{ static char base[7] = "ACGT..";
   if (c < Adenine) return('#');
   if (c > NOBASE) return((char)c);
   return(base[c]); }
@@ -2124,6 +3721,24 @@ char ptranslate(int *codon, csw *sw)
   return(aapolarity[aamap[sw->geneticcode][((3-p3)<<4)+((3-p2)<<2)+(3-p1)]]); }
 
 
+int seqlen(gene *t)
+{
+return(t->nbase + t->nintron);
+}
+
+
+int aseqlen(data_set *d, annotated_gene *a)
+{
+int alen;
+long astart,astop;
+astart = a->start;
+astop = a->stop;
+if (astart > astop) astop += d->psmax;
+alen = (int)(astop - astart) + 1;
+return(alen);
+}
+
+
 double gc_content(gene *t)
 { int *s,*se;
   double ngc;
@@ -2168,8 +3783,8 @@ int find_var_hairpin(gene *t)
   e = 0;
   sb = t->seq + t->astem1 + t->spacer1 + 2*t->dstem + t->dloop + 
        t->spacer2 + 2*t->cstem + t->cloop + t->nintron;
-  sc = sb + 3;   /* 4 */
-  se = sb + t->var - 2;  /* 3 */
+  sc = sb + 3;
+  se = sb + t->var - 2;
   sf = se - 2;
   te[0] = A[*se];
   te[1] = C[*se];
@@ -2688,7 +4303,7 @@ int *make_var(int *seq, char matrix[][MATY],
      stem = varbp & 0x1f;
      e = stem + ((varbp >> 5) & 0x1f);
      p = var - e;
-     if (p < 1) goto NBP;  /* 2 */
+     if (p < 1) goto NBP;
      if (p > 4) goto NBP;
      pxf = px + 2*ux[orient] + 3*vx[orient];
      pyf = py + 2*uy[orient] + 3*vy[orient];
@@ -2913,482 +4528,6 @@ void xcopy(char m[][MATY], int x, int y, char *s, int l)
 int identify_tag(char tag[], int len, char (*thit)[50], int nt)
 { int i,n;
   char *s,*st,*sb,*sd;
-  static struct { char name[50]; char tag[50]; } tagdatabase[NTAG] =
-   { { "Cyanidioschyzon merolae Chloroplast","ANQILPFSIPVKHLAV" },
-     { "Mesostigma viride chloroplast","ANNILPFNRKTAVAV" },
-     { "Nephroselmis olivacea chloroplast","TTYHSCLEGHLS" },
-     { "Pirellula sp.","AEENFALAA" },
-     { "Rhodopirellula baltica","AEENFALAA" },
-     { "Desulfotalea psychrophila","ADDYNYAVAA" },
-     { "Desulfuromonas acetoxidans","ADTDVSYALAA" },
-     { "Exiguobacterium sp.","GKTNTQLAAA" },
-     { "Mycoplasma gallisepticum","DKTSKELADENFVLNQLASNNYALNF" },
-     { "Aquifex aeolicus","APEAELALAA" },
-     { "Thermotoga maritima ","ANEPVAVAA" },
-     { "Thermotoga neapolitana","ANEPVAVAA" },
-     { "Chloroflexus aurantiacus","ANTNTRAQARLALAA" },
-     { "Thermus thermophilus","ANTNYALAA" },
-     { "Deinococcus radiodurans","GNQNYALAA" },
-     { "Deinococcus geothermalis","GNQNYALAA" },
-     { "Cytophaga hutchinsonii","GEESYAMAA" },
-     { "Bacteroides fragilis","GETNYALAA" },
-     { "Tannerella forsythensis","GENNYALAA" },
-     { "Porphyromonas gingivalis","GENNYALAA" },
-     { "Prevotella intermedia","GENNYALAA" },
-     { "Chlorobium tepidum","ADDYSYAMAA" },
-     { "Chlorobium chlorochromatii","ADDYSYAMAA" },
-     { "Salinibacter ruber","ADDYSYAMAA" },
-     { "Gemmata obscuriglobus","AEPQYSLAA" },
-     { "Chlammydophila pneumoniae","AEPKAECEIISLFDSVEERLAA" },
-     { "Chlammydophila caviae","AEPKAECEIISFSDLTEERLAA" },
-     { "Chlammydophila abortus","AEPKAKCEIISFSELSEQRLAA" },
-     { "Chlammydia trachomatis","AEPKAECEIISFADLEDLRVAA" },
-     { "Chlammydia muridarum","AEPKAECEIISFADLNDLRVAA" },
-     { "Nostoc PCC7120","ANNIVKFARKDALVAA" },
-     { "Nostoc punctiforme","ANNIVNFARKDALVAA" },
-     { "Fremyella diplosiphon","ANNIVKFARKEALVAA" },
-     { "Plectonema boryanum","ANNIVPFARKTAPVAA" },
-     { "Trichodesmium erythraeum","ANNIVPFARKQVAALA" },
-     { "Oscillatoria 6304","ANNIVPFARKAAPVAA" },
-     { "Chroococcidiopsis PCC6712","ANNIVKFERQAVFA" },
-     { "Synechocystis PCC6803","ANNIVSFKRVAIAA" },
-     { "Thermosynechococcus elongatus","ANNIVPFARKAAAVA" },
-     { "Synechococcus PCC6301","ANNIVPFARKAAPVAA" },
-     { "Synechococcus elongatus","ANNIVPFARKAAPVAA" },
-     { "Synechococcus WH8102","ANNIVRFSRHAAPVAA" },
-     { "Synechococcus PCC6307","ANNIVRFSRQAAPVAA" },
-     { "Synechococcus PCC7002","ANNIVPFARKAAAVA" },
-     { "Synechococcus PCC7009","ANNIVRFSRQAAPVAA" },
-     { "Synechococcus PCC6904","ANNIVRFSRQAAPVAA" },
-     { "Synechococcus CC9311","ANNIVRFSRQAAPVAA" },
-     { "Synechococcus CC9902","ANNIVRFSRQAAPVAA" },
-     { "Synechococcus CC9605","ANNIVRFSRQAAPVAA" },
-     { "Prochlorococcus marinus 1","ANKIVSFSRQTAPVAA" },
-     { "Prochlorococcus marinus 2","ANNIVRFSRQPALVAA" },
-     { "Prochlorococcus marinus 3","ANKIVSFSRQTAPVAA" },
-     { "Cyanophora paradoxa chloroplast","ATNIVRFNRKAAFAV" },
-     { "Thalassiosira weissflogii chloroplast","ANNIIPFIFKAVKTKKEAMALNFAV" },
-     { "Odontella sinensis chloroplast","ANNLISSVFKSLSTKQNSLNLSFAV" },
-     { "Bolidomonas pacifica chloroplast","ANNILAFNRKSLSFA" },
-     { "Pavlova lutheri chloroplast","ANNILSFNRVAVA" },
-     { "Porphyra purpurea chloroplast","AENNIIAFSRKLAVA" },
-     { "Guillardia theta chloroplast","ASNIVSFSSKRLVSFA" },
-     { "Fibrobacter succinogenes","ADENYALAA" },
-     { "Treponema pallidum","ANSDSFDYALAA" },
-     { "Treponema denticola","AENNDSFDYALAA" },
-     { "Leptospira interrogans","ANNELALAA" },
-     { "Borrelia burgdorferi","AKNNNFTSSNLVMAA" },
-     { "Borrelia garinii","AKNNNFTSSNLVMAA" },
-     { "Caulobacter crescentus","ANDNFAEEFAVAA" },
-     { "Rhodobacter sphaeroides","ANDNRAPVALAA" },
-     { "Silicibacter pomeroyi","ANDNRAPVALAA" },
-     { "Silicibacter TM1040","ANDNRAPVALAA" },
-     { "Paracoccus denitrificans","ANDNRAPVALAA" },
-     { "Nitrobacter hamburgensis","ANDNYAPVAQAA" },
-     { "Nitrobacter winogradskyi","ANDNYAPVAQAA" },
-     { "Nitrobacter Nb-311A","ANDNYAPVAQAA" },
-     { "Rhodopseudomonas palustris","ANDNYAPVAQAA" },
-     { "Rhodopseudomonas palustris 4","ANDNVRMNEVRLAA" },
-     { "Bradyrhizobium japonicum","ANDNFAPVAQAA" },
-     { "Agrobacterium tumefaciens 1","ANDNNAKEYALAA" },
-     { "Agrobacterium tumefaciens 2","ANDNNAKECALAA" },
-     { "Rhizobium leguminosarum","ANDNYAEARLAA" },
-     { "Sinorhizobium meliloti","ANDNYAEARLAA" },
-     { "Mesorhizobium loti","ANDNYAEARLAA" },
-     { "Mesorhizobium sp.","ANDNYAEARLAA" },
-     { "Bartonella henselae","ANDNYAEARLAA" },
-     { "Bartonella quintana","ANDNYAEARLAA" },
-     { "Brucella melitensis","ANDNNAQGYALAA" },
-     { "Brucella abortus","ANDNNAQGYALAA" },
-     { "Brucella suis","ANDNNAQGYALAA" },
-     { "Methylobacterium extorquens","ANDNFAPVAVAA" },
-     { "Magnetospirillum magnetotacticum 1","ANDNFAPVAVAA" },
-     { "Magnetospirillum magnetotacticum 2","ANDNVELAAAA" },
-     { "Rhodospirillum rubrum","ANDNVELAAAA" },
-     { "Novosphingobium aromaticivorans","ANDNEALALAA" },
-     { "Sphingopyxis alaskensis","ANDNEALALAA" },
-     { "Erythrobacter litoralis","ANDNEALALAA" },
-     { "Ehrlichia chaffeensis","ANDNFVFANDNNSSANLVAA" },
-     { "Anaplasma phagocytophilum","ANDDFVAANDNVETAFVAAA" },
-     { "Wolbachi.sp","ANDNFAAEDNVDAIAA" },
-     { "Rickettsia conorii","ANDNNRSVGHLALAA" },
-     { "Rickettsia sibirica","ANDNNRSVGHLALAA" },
-     { "Rickettsia typhi","ANDNKRYVGVAALAAA" },
-     { "Rickettsia prowazekii","ANDNRYVGVPALAAA" },
-     { "Neisseria gonorrhoeae","ANDETYALAA" },
-     { "Neisseria meningitidis","ANDETYALAA" },
-     { "Neisseria lactamica","ANDETYALAA" },
-     { "Chromobacterium violaceum","ANDETYALAA" },
-     { "Uncultured U02","ANDEQFALAA" },
-     { "Nitrosomonas europaea","ANDENYALAA" },
-     { "Nitrosomonas cryotolerans","ANDENYALAA" },
-     { "Methylobacillus glycogenes","ANDETYALAA" },
-     { "Methylobacillus flagellatus","ANDETYALAA" },
-     { "Moraxella catarrhalis","ANDETYALAA" },
-     { "Uncultured U04","ANDETYALAA" },
-     { "Ralstonia pickettii","ANDERYALAA" },
-     { "Ralstonia solanacearum","ANDNRYQLAA" },
-     { "Ralstonia eutropha","ANDERYALAA" },
-     { "Ralstonia metallidurans","ANDERYALAA" },
-     { "Alcaligenes faecalis","ANDERFALAA" },
-     { "Comamonas testosteroni","ANDERFALAA" },
-     { "Variovorax paradoxus","ANDERFALAA" },
-     { "Hydrogenophaga palleronii","ANDERFALAA" },
-     { "Burkholderia pseudomallei","ANDDTFALAA" },
-     { "Burkholderia mallei","ANDDTFALAA" },
-     { "Burkholderia fungorum","ANDDTFALAA" },
-     { "Burkholderia cepacia","ANDDTFALAA" },
-     { "Burkholderia cenocepacia","ANDDTFALAA" },
-     { "Burkholderia thailandensis","ANDDTFALAA" },
-     { "Burkholderia vietnamiensis","ANDDTFALAA" },
-     { "Burkholderia sp. 383","ANDDTFALAA" },
-     { "Bordetella avium","ANDERFALAA" },
-     { "Bordetella pertussis","ANDERFALAA" },
-     { "Bordetella parapertussis","ANDERFALAA" },
-     { "Bordetella bronchiseptica","ANDERFALAA" },
-     { "Polaromonas JS666","ANDERFALAA" },
-     { "Rubrivivax gelatinosus","ANDERFALAA" },
-     { "Uncultured stronglyoides1","ANDERFALAA" },
-     { "Azoarcus BH72","ANDERFALAA" },
-     { "Xylella fastidiosa 1","ANEDNFAVAA" },
-     { "Xylella fastidiosa 2","ANEDNFALAA" },
-     { "Xylella fastidiosa 3","ANEDNFAIAA" },
-     { "Xylella fastidiosa 4","ANEDNFALAA" },
-     { "Xanthomonas campestris 1","ANDDNYGSDFAIAA" },
-     { "Xanthomonas campestris 2","ANDDNYGSDSAIAA" },
-     { "Xanthomonas axonopodis","ANDDNYGSDFAIAA" },
-     { "Xanthomonas oryzae","ANDDNYGSDFAIAA" },
-     { "Legionella pneumophila","ANDENFAGGEAIAA" },
-     { "Coxiella burnetii","ANDSNYLQEAYA" },
-     { "Methylococcus capsulatus","ANDDVYALAA" },
-     { "Uncultured U01a","ANDSNYALAA" },
-     { "Dichelobacter nodosus","ANDDNYALAA" },
-     { "Francisella tularensis 1","GNKKANRVAANDSNFAAVAKAA" },
-     { "Francisella tularensis 2","ANDSNFAAVAKAA" },
-     { "Acidithiobacillus ferrooxidans","ANDSNYALAA" },
-     { "Acinetobacter ADP1","ANDETYALAA" },
-     { "Psychrobacter 2734","ANDENYALAA" },
-     { "Psychrobacter cryohalolentis","ANDENYALAA" },
-     { "Psychrobacter arcticus","ANDENYALAA" },
-     { "Azotobacter vinelandii","ANDDNYALAA" },
-     { "Pseudomonas aeruginosa","ANDDNYALAA" },
-     { "Pseudomonas syringae 1","ANDENYGAQLAA" },
-     { "Pseudomonas syringae 2","ANDETYGEYALAA" },
-     { "Pseudomonas syringae 3","ANDENYGAQLAA" },
-     { "Pseudomonas fluorescens 1","ANDDQYGAALAA" },
-     { "Pseudomonas fluorescens 2","ANDENYGQEFALAA" },
-     { "Pseudomonas putida 1","ANDENYGAEYKLAA" },
-     { "Marinobacter hydrocarbonoclasticus","ANDENYALAA" },
-     { "Marinobacter aquaeolei","ANDENYALAA" },
-     { "Pseudoalteromonas haloplanktis","ANDDNYSLAA" },
-     { "Pseudoalteromonas atlantica","ANDENYALAA" },
-     { "Uncultured WW11","ANDDNYALAA" },
-     { "Shewanella oneidensis","ANDDNYALAA" },
-     { "Shewanella putrefaciens","ANDDNYALAA" },
-     { "Shewanella PV-4","ANDDNYALAA" },
-     { "Shewanella amazonensis","ANDDNYALAA" },
-     { "Shewanella SAR-1","ANDDNYALAA" },
-     { "Shewanella ANA-3","ANDDNYALAA" },
-     { "Idiomarina loihiensis","ANDDNYALAA" },
-     { "Photorhabdus asymbiotica","ANDNEYALVA" },
-     { "Microbulbifer degradans","ANDDNYGAQLAA" },
-     { "Saccharophagus degradans","ANDDNYGAQLAA" },
-     { "Colwellia sp","ANDDTFALAA" },
-     { "Colwellia psychrerythraea","ANDDTFALAA" },
-     { "Photobacterium phosphoreum","ANDENYALAA" },
-     { "Vibrio cholerae","ANDENYALAA" },
-     { "Vibrio vulnificus","ANDENYALAA" },
-     { "Vibrio Ex25","ANDENYALAA" },
-     { "Vibrio parahemolyticus","ANDENYALAA" },
-     { "Aeromonas salmonicida","ANDENYALAA" },
-     { "Aeromonas hydrophila 1","ANDENYALAA" },
-     { "Aeromonas hydrophila 2","ANDENYALAA" },
-     { "Uncultured VLW3","ANDENYALAA" },
-     { "Uncultured VLS13","ANDENYALAA" },
-     { "Uncultured WW9","ANDENYALAA" },
-     { "Uncultured WW10","ANDENYALAV" },
-     { "Uncultured VLW5","ANDENYALAA" },
-     { "Uncultured RCA4","ANDETYALAA" },
-     { "Uncultured LEM1","ANDETYALAA" },
-     { "Uncultured LEM2","ANDETHALAA" },
-     { "Wigglesworthia brevipalpis","AKHKYNEPALLAA" },
-     { "Wigglesworthia glossinidia","AKHKYNEPALLAA" },
-     { "Buchnera aphidicola 1","ANNKQNYALAA" },
-     { "Buchnera aphidicola 2","ANNKQNYALAA" },
-     { "Buchnera aphidicola 3","AKQNQYALAA" },
-     { "Shigella dysenteriae 1","ANDENYALAA" },
-     { "Shigella dysenteriae 2","ANDENYALAA" },
-     { "Shigella flexneri","ANDENYALAA" },
-     { "Shigella boydii","ANDENYALAA" },
-     { "Shigella sonnei","ANDENYALAA" },
-     { "Escherichia coli","ANDENYALAA" },
-     { "Providencia rettgeri","ANDENYALAA" },
-     { "Serratia marcescens","ANDENYALAA" },
-     { "Klebsiella pneumoniae","ANDENYALAA" },
-     { "Pectobacterium carotovora","ANDENYALAA" },
-     { "Erwinia chrysanthemi","ANDENFAPAALAA" },
-     { "Erwinia amylovora","ANDENFAPAALAA" },
-     { "Erwinia carotovora","ANDENYALAA" },
-     { "Salmonella bongori","ANDENYALAA" },
-     { "Salmonella typhimurium","ANDETYALAA" },
-     { "Salmonella typhi","ANDETYALAA" },
-     { "Salmonella paratyphi","ANDENYALAA" },
-     { "Salmonella enterica 1","ANDETYALAA" },
-     { "Salmonella enterica 2","ANDENYALAA" },
-     { "Salmonella enterica 3","ANDETYALAA" },
-     { "Salmonella enterica 5","ANDETYALAA" },
-     { "Salmonella enterica 6","ANDENYALAA" },
-     { "Uncultured RCA1","ANDENYALAA" },
-     { "Uncultured VLS1","ANDENYALAA" },
-     { "Uncultured WW1","ANDENYALAA" },
-     { "Uncultured RCA2","SNDENYALAA" },
-     { "Uncultured WW2","ANDENYALAA" },
-     { "Uncultured QL1","ANVENYALAA" },
-     { "Uncultured WW4","ANDGNYALAA" },
-     { "Uncultured VLS5","ANDETYALAA" },
-     { "Uncultured FS1","ANDETYALAA" },
-     { "Uncultured VLS6","ANDENYALAA" },
-     { "Uncultured FS2","ANDENYALAA" },
-     { "Uncultured WW5","ANDENYALAA" },
-     { "Uncultured VLW1","ANDENYALAA" },
-     { "Uncultured VLS7","ANDENYALAA" },
-     { "Uncultured VLS9","ANDENYALAA" },
-     { "Uncultured VLW2","ANDENYALAA" },
-     { "Uncultured WW7","ANDENCALAA" },
-     { "Uncultured WW8","ANDENYALAA" },
-     { "Yersinia enterocolitica","ANDSQYESAALAA" },
-     { "Yersinia intermedia","ANDSQYESAALAA" },
-     { "Yersinia mollaretii","ANDSQYESAALAA" },
-     { "Yersinia bercovieri","ANDSQYESAALAA" },
-     { "Yersinia pestis","ANDENYALAA" },
-     { "Yersinia frederiksenii","ANDENYALAA" },
-     { "Yersinia pseudotuberculosis","ANDENYALAA" },
-     { "Mannheimia haemolytica","ANDEQYALAA" },
-     { "Mannheimia succiniciproducens","ANDEQYALAA" },
-     { "Haemophilus ducreyi","ANDEQYALAA" },
-     { "Haemophilus influenzae","ANDEQYALAA" },
-     { "Haemophilus somnus","ANDEQYALAA" },
-     { "Pasteurella multocida","ANDEQYALAA" },
-     { "Actinobacillus actinomycetemcomitans","ANDEQYALAA" },
-     { "Actinobacillus pleuropneumoniae","ANDEQYALAA" },
-     { "Lawsonia intracellularis","ANNNYDYALAA" },
-     { "Desulfovibrio desulfuricans","ANNDYDYAYAA" },
-     { "Desulfovibrio vulgaris","ANNYDYALAA" },
-     { "Desulfovibrio yellowstonii","ANNELALAA" },
-     { "Geobacter sulfurreducens","ADNYDYAVAA" },
-     { "Geobacter metallireducens","ADNYDYAVAA" },
-     { "Helicobacter pylori 1","VNNTDYAPAYAKAA" },
-     { "Helicobacter pylori 2","VNNTDYAPAYAKAA" },  
-     { "Helicobacter pylori 3","VNNADYAPAYAKAA" },  
-     { "Campylobacter jejuni","ANNVKFAPAYAKAA" },
-     { "Campylobacter lari","ANNVKFAPAYAKAA" },
-     { "Campylobacter fetus 2","ANNVKFAPAYAKAA" },
-     { "Campylobacter coli","ANNVKFAPAYAKAA" },
-     { "Fusobacterium nucleatum 1","GNKDYALAA" },
-     { "Fusobacterium nucleatum 2","GNKEYALAA" },
-     { "Dehalococcoides ethenogenes","GERELVLAG" },
-     { "Mycobacterium leprae","ADSYQRDYALAA" },
-     { "Mycobacterium avium","ADSHQRDYALAA" },
-     { "Mycobacterium bovis","ADSHQRDYALAA" },
-     { "Mycobacterium tuberculosis","ADSHQRDYALAA" },
-     { "Mycobacterium marinum","ADSHQRDYALAA" },
-     { "Mycobacterium microti","ADSHQRDYALAA" },
-     { "Mycobacterium africanum","ADSHQRDYALAA" },
-     { "Mycobacterium smegmatis","ADSNQRDYALAA" },
-     { "Corynebacterium diphtheriae","AENTQRDYALAA" },
-     { "Corynebacterium glutamicum","AEKSQRDYALAA" },
-     { "Thermobifida fusca","ANSKRTEFALAA" },
-     { "Streptomyces coelicolor","ANTKRDSSQQAFALAA" },
-     { "Streptomyces lividans","ANTKRDSSQQAFALAA" },
-     { "Tropheryma whipplei","ANLKRTDLSLAA" },
-     { "Clavibacter michiganensis","ANNKQSSFVLAA" },
-     { "Bifidobacterium longum","AKSNRTEFALAA" },
-     { "Bifidobacterium longum","AKSNRTEFALAA" },
-     { "Bacillus anthracis","GKQNNLSLAA" },
-     { "Bacillus thuringiensis","GKQNNLSLAA" },
-     { "Bacillus cereus","GKQNNLSLAA" },
-     { "Bacillus megaterium","GKSNNNFALAA" },
-     { "Bacillus halodurans","GKENNNFALAA" },
-     { "Bacillus clausii","GKENNNFALAA" },
-     { "Bacillus subtilis","GKTNSFNQNVALAA" },
-     { "Bacillus stearothermophilus","GKQNYALAA" },
-     { "Geobacillus kaustophilus","GKQNYALAA" },
-     { "Staphylococcus aureus","GKSNNNFAVAA" },
-     { "Staphylococcus saprophyticus","GKENNNFAVAA" },
-     { "Staphylococcus xylosus","GKENNNFAVAA" },
-     { "Staphylococcus epidermidis","DKSNNNFAVAA" },
-     { "Oceanobacillus iheyensis","GKETNQPVLAAA" },
-     { "Listeria monocytogenes","GKEKQNLAFAA" },
-     { "Listeria innocua","GKEKQNLAFAA" },
-     { "Listeria welshimeri","GKEKQNLAFAA" },
-     { "Listeria seeligeri","GKEKQNLAFAA" },
-     { "Listeria grayi 1","GKEKQNLAFAA" },
-     { "Listeria grayi 2","GKQNNNLAFAA" },
-     { "Listeria ivanovii","GKEKQNLAFAA" },
-     { "Lactobacillus gasseri","ANNENSYAVAA" },
-     { "Lactobacillus johnsonii","ANNENSYAVAA" },
-     { "Lactobacillus sakei","ANNNNSYAVAA" },
-     { "Lactobacillus helveticus","ANNKNSYALAA" },
-     { "Lactobacillus gallinarum","ANNKNSYALAA" },
-     { "Lactobacillus acidophilus","ANNKNSYALAA" },
-     { "Lactobacillus plantarum","AKNNNNSYALAA" },
-     { "Pediococcus pentosaceus","AKNNNNSYALAA" },
-     { "Leuconostoc mesenteroides","AKNENSFAIAA" },
-     { "Leuconostoc lactis","AKNENSFAIAA" },
-     { "Leuconostoc pseudomesenteroides","AKNENSYAIAA" },
-     { "Enterococcus durans","AKNENNSYALAA" },
-     { "Oenococcus oeni","AKNNEPSYALAA" },
-     { "Enterococcus faecium","AKNENNSYALAA" },
-     { "Enterococcus faecalis","AKNENNSFALAA" },
-     { "Streptococcus equi","AKNNTTYALAA" },
-     { "Streptococcus zooepidemicus","AKNNTTYALAA" },
-     { "Streptococcus suis","AKNTNTYALAA" },
-     { "Streptococcus uberis","AKNTNSYALAA" },
-     { "Streptococcus pyogenes","AKNTNSYALAA" },
-     { "Streptococcus agalactiae","AKNTNSYALAA" },
-     { "Streptococcus mutans","AKNTNSYAVAA" },
-     { "Streptococcus sobrinus","AKNTNSYAVAA" },
-     { "Streptococcus gordonii","AKNNTSYALAA" },
-     { "Streptococcus pneumoniae","AKNNTSYALAA" },
-     { "Streptococcus mitis","AKNNTSYALAA" },
-     { "Streptococcus thermophilus","AKNTNSYAVAA" },
-     { "Lactococcus raffinolactis","AKNTQTYAVAA" },
-     { "Lactococcus plantarum","AKNTQTYALAA" },
-     { "Lactococcus garvieae","AKNNTSYALAA" },
-     { "Lactococcus lactis","AKNNTQTYAMAA" },
-     { "Mycoplasma capricolum","ANKNEETFEMPAFMMNNASAGANFMFA" },
-     { "Mesoplasma florum","ANKNEENTNEVPTFMLNAGQANYAFA" },
-     { "Spiroplasma kunkelii","ASKKQKEDKIEMPAFMMNNQLAVSMLAA" },
-     { "Ureaplasma urealyticum","AENKKSSEVELNPAFMASATNANYAFAY" },
-     { "Ureaplasma parvum","AENKKSSEVELNPAFMASATNANYAFAY" },
-     { "Mycoplasma pulmonis","GTKKQENDYQDLMISQNLNQNLAFASV" },
-     { "Mycoplasma penetrans","AKNNKNEAVEVELNDFEINALSQNANLALYA" },
-     { "Mycoplasma genitalium 1","DKENNEVLVEPNLIINQQASVNFAFA" },
-     { "Mycoplasma genitalium 2","DKENNEVLVDPNLIINQQASVNFAFA" },
-     { "Mycoplasma pneumoniae","DKNNDEVLVDPMLIANQQASINYAFA" },
-     { "Thermoanaerobacter tengcongensis","ADRELAYAA" },
-     { "Heliobacillus mobilis","AEDNYALAA" },
-     { "Desulfitobacterium hafniense","ANDDNYALAA" },
-     { "Nitrosococcus oceani","ANDDNYALAA" },
-     { "Thiomicrospira crunogena","ANDDNYALAA" },
-     { "Stenotrophomonas maltophilia","ANDDNYALAA" },
-     { "Carboxydothermus hydrogenoformans","ANENYALAA" },
-     { "Ruminococcus albus","GHGYFAKAS" },
-     { "Clostridium acetobutylicum","DNENNLALAA" },
-     { "Clostridium perfringens","AEDNFALAA" },
-     { "Clostridium thermocellum","ANEDNYALAAA" },
-     { "Clostridium botulinum","ANDNFALAA" },
-     { "Clostridium tetani","ADDNFVLAA" },
-     { "Clostridium difficile","ADDNFAIAA" },
-     { "Hyphomonas neptunium","ANDNFAEGELLAA" },
-     { "Vibrio fischeri","ANDENYALAA" },
-     { "Corynebacterium efficiens","AEKTQRDYALAA" },
-     { "Streptomyces avermitilus","ANTKSDSQSFALAA" },
-     { "Brevibacterium linens","AKSNNRTDFALAA" },
-     { "Lactobacillus delbrueckii 1","AKNENNSYALAA" },
-     { "Lactobacillus delbrueckii 2","ANENSYAVAA" },
-     { "Lactobacillus casei","AKNENSYALAA" },
-     { "Lactobacillus brevis","AKNNNNSYALAA" },
-     { "Streptomyces thermophilus","AKNTNSYAVAA" },
-     { "Bacillusphage G","AKLNITNNELQVA" },
-     { "Thermodesulfobacterium commune","ANEYAYALAA" },
-     { "Thermomicrobium roseum","GERELALAA" },
-     { "Leptospirillum groupII","ANEELALAA" },
-     { "Leptospirillum groupIII","ANEELALAA" },
-     { "Gloeobacter violaceus","ATNNVVPFARARATVAA" },
-     { "Crocosphaera watsonii","ANNIVSFKRVAVAA" },
-     { "Thalassiosira pseudonana chloroplast","ANNIMPFMFNVVKTNRSLTTLNFAV" },
-     { "Emiliania huxleyi chloroplast","ANNILNFNSKLAIA" },
-     { "Cyanidium caldarium chloroplast","ANNIIEISNIRKPALVV" },
-     { "Gracilaria tenuistipitata chloroplast","AKNNILTLSRRLIYA" },
-     { "Prevotella ruminicola","GNNEYALAA" },
-     { "Jannaschia sp. CCS1","ANDNRAPAMALAA" },
-     { "Agrobacterium vitis","ANDNNAQGYAVAA" },
-     { "Alphaproteobacteria SAR-1","ANDELALAA" },
-     { "Gluconobacter oxydans","ANDNSEVLAVAA" },
-     { "Sphingomonas elodea","ANDNEALAIAA" },
-     { "Ehrlichia ruminantium 1","ANDNFVSANDNNSTANLVAA" },
-     { "Ehrlichia ruminantium 2","ANDNFVSANDNNSTANLVAA" },
-     { "Ehrlichia canis","ANDNFVFANDNNSSVAGLVAA" },
-     { "Anaplasma marginale","ANDDFVAANDNMETAFVAAA" },
-     { "Wolbachia sp. 2 (Brugi)","ANDNFAAEGDVAVAA" },
-     { "Wolbachia sp. 3 (Culex)","ANDNFAAEDNVALAA" },
-     { "Wolbachia sp. 4 (Dros.)","ANDNFAAEEYRVAA" },
-     { "Rickettsia rickettsii","ANDNNRSVGRLALAA" },
-     { "Tremblaya princeps 1 (Dysmicoccus)",
-       "APSNRFTIVANDCIDALVRRAVV" },
-     { "Azoarcus EbN1","ANDERFAVAA" },
-     { "Dechloromonas aromatica","ANDEQFAIAA" },
-     { "Dechloromonas agitata","ANDEQFAIAA" },
-     { "Thiobacillus denitrificans","AKSKAARRNPACSAGVMELKA" },
-     { "Shewanella SAR-2, version 2","ADYGYMAAA" },
-     { "Shewanella SAR-1, version 2","ANNDNYALAA" },
-     { "Uncultured marineEBAC20E09","ANNDNYALAA" },
-     { "Pseudomonas fluorescens 3 (Pf-5)","ANDETYGDYALAA" },
-     { "Uncultured remanei","ANDESYALAA" },
-     { "Chromohalobacter salexigens","ANDDNYAQGALAA" },
-     { "Gammaproteobacteria SAR-1","ANNYNYSLAA" },
-     { "Shewanella denitrificans","ANDSNYSLAA" },
-     { "Shewanella frigidimarina","ANDSNYSLAA" },
-     { "Shewanella baltica","ANDSNYSLAA" },
-     { "Photobacterium profundum","ANDENFALAA" },
-     { "Blochmannia floridanus","AKNKYNEPVALAA" },
-     { "Blochmannia pennsylvanicus","ANNTTYRESVALAA" },
-     { "Photorhabdus luminescens","ANDEKYALAA" },
-     { "Proteus mirabilis","ANDNQYKALAA" },
-     { "Magnetococcus sp.","ANDEHYAPAFAAA" },
-     { "Proteobacteria SAR-1, version 1","GENADYALAA" },
-     { "Proteobacteria SAR-1, version 2","ANNYNYSLAA" },
-     { "Proteobacteria SAR-1, version 3","ADNGYMAAA" },
-     { "Desulfovibrio desulfuricans 2 (G20)","ANNDYEYAMAA" },
-     { "Uncultured ciona","ANDEFFDARLRA" },
-     { "Bacteriovorax marinus","AESNFAPAMAA" },
-     { "Bdellovibrio bacteriovorus","GNDYALAA" },
-     { "Myxococcus xanthus","ANDNVELALAA" },
-     { "Wolinella succinogenes","ALSSHPKRGKRLGLPITSALGA" },
-     { "Campylobacter upsaliensis","ANNAKFAPAYAKVA" },
-     { "Helicobacter mustelae","ANNKNYAPAYAKVA" },
-     { "Helicobacter hepaticus","ANNANYAPAYAKVA" },
-     { "Ruminococcus albus","DNDNFAMAA" },
-     { "Coprothermobacter proteolyticus","AEPEFALAA" },
-     { "Moorella thermoacetica","ADDNLALAA" },
-     { "Mycoplasma mycoides","ADKNEENFEMPAFMINNASAGANYMFA" },
-     { "Mycoplasma mobile","GKEKQLEVSPLLMSSSQSNLVFA" },
-     { "Mycoplasma arthritidis","GNLETSEDKKLDLQFVMNSQTQQNLLFA" },
-     { "Paenibacillus larvae","GKQQNNYALAA" },
-     { "Bacillus licheniformis","GKSNQNLALAA" },
-     { "Actinomyces naeslundii","ADNTRTDFALAA" },
-     { "Arthrobacter FB24","AKQTRTDFALAA" },
-     { "Leifsonia xyli","ANSKSTVSAKADFALAA" },
-     { "Nocardia farcinica","ADSHQREYALAA" },
-     { "Propionibacterium acnes 1","AENTRTDFALAA" },
-     { "Propionibacterium acnes 2","AENTRTDFALAA" },
-     { "Streptomyces collinus","ANTKRDSSSFALAA" },
-     { "Streptomyces aureofaciens","ANSKRDSQQFALAA" },
-     { "Kineococcus radiotolerans","ADSKRTEFALAA" },
-     { "Frankia sp. CcI3","ANKTQPTTPTYALAA" },
-     { "Frankia sp. EAN1pec","ATKTQPASSTFALAA" },
-     { "Rubrobacter xylanophilus","ANDREMALAA" },
-     { "Parachlamydia UWE25","ANNSNKIAKVDFQEGTFARAA" },
-     { "Verrucomicrobium spinosum","ANSNELALAA" },
-     { "Acidobacterium capsulatum","ANNNLALAA" },
-     { "Acidobacterium Ellin6076","ANTQFAYAA" },
-     { "Solibacter usitatus","ANTQFAYAA" },
-     { "Dictyoglomus thermophilum","ANTNLALAA" },
-     { "Mycobacteriophage Bxz1 virion","ATDTDATVTDAEIEAFFAEEAAALV" },
-     { "Catera virion","ATDTDATVTDAEIEAFFAEEAAALV" },
-     { "Cyanobium gracile","ANNIVRFSRQAAPVAA" },
-     { "Anabaena variabilis","ANNIVKFARKDALVAA" },
-     { "Nitrosospira multiformis","ANDENYALAA" },
-     { "Enterobacter sakazakii","ANDENYALAA" },
-     { "Pantoea stewartii","ANDENYALAA" },
-     { "Citrobacter rodentium","ANDENYALAA" },
-     { "Prochlorococcus marinus","ANNIVSFSRQTAPVAA" },
-     { "Azospira oryzae","ANDERFAIAA" },
-     { "Uncultured phakopsora","ANDNSYALAA" },
-     { "Syntrophus aciditrophicus","ANDYEYALAA" },
-     { "Alkaliphilus metalliredigenes","ANDNYSLAAA" },
-     { "Caldicellulosiruptor saccharolyticus","ADKAELALAA" } };
   n = 0;
   st = tag + len;
   while (*--st == '*');
@@ -3416,16 +4555,109 @@ int identify_tag(char tag[], int len, char (*thit)[50], int nt)
   return(-1);  }
 
 
+
+int peptide_tag(char tag[], int maxlen, gene *t, csw *sw)
+{
+int i,lx,*se;
+se = t->eseq + t->tps;
+lx = (t->tpe - t->tps + 1);
+if (ltranslate(se+lx,t,sw) == '*')
+ { lx += 3;
+   if (ltranslate(se+lx,t,sw) == '*') lx += 3; }
+lx /= 3;
+if (lx > maxlen) lx = maxlen;
+for (i = 0; i < lx; i++)
+ { tag[i] = ltranslate(se,t,sw);
+   se += 3; }
+tag[i] = '\0';
+return(lx);
+}
+
+
+void update_tmrna_tag_database(gene ts[], int nt, csw *sw)
+{
+int nn,i,k,c,lx;
+char *sp,*se,*s;
+char species[STRLEN],tag[100];
+gene *t;
+if (sw->tagend >= NTAGMAX) return;
+for (i = 0; i < nt; i++)
+ { t = ts + i;
+   if (t->genetype != tmRNA) continue;
+   s = t->name;
+   se = NULL;
+   while (*s)
+    { if (*s == '|') se = s;
+      s++; }
+   if (!se) continue;
+   while (*++se) if (space(*se)) break;
+   if (!*se) continue;
+   while (*++se) if (!space(*se)) break;
+   if (!*se) continue;
+   if (softstrpos(se," sp. "))
+    { if (!(sp = softstrpos(se,"two-piece")))
+       if (!(sp = softstrpos(se,"tmRNA")))
+        continue;
+      while (space(sp[-1])) sp--;
+      copy2sp(se,sp,species,49); }
+   else
+    { s = species;
+      c = 2;
+      while (*se)
+       { if (space(*se))
+         if (--c <= 0) break;
+         *s++ = *se++; }
+      *s = '\0'; }
+   for (k = 0; k < sw->tagend; k++)
+    if (softstrpos(tagdatabase[k].name,species)) break;
+   if (k < sw->tagend) continue;
+   copy(species,tagdatabase[sw->tagend].name);
+   s = tag;
+   lx = peptide_tag(s,50,t,sw);
+   s += lx;
+   while (s > tag && s[-1] == '*') s--;
+   *s = '\0';
+   copy(tag,tagdatabase[sw->tagend].tag);
+   if (++sw->tagend >= NTAGMAX) break; }
+}
+
+int string_compare(char *s1, char *s2)
+{
+int r;
+char c1,c2;
+r = 0;
+while (c1 = *s1++)
+ { if (!(c2 = *s2++)) break;
+   r = (int)upcasec(c1) - (int)upcasec(c2);
+   if (r != 0) break; }
+return(r);
+}
+
+void report_new_tmrna_tags(csw *sw)
+{
+int k,n,sort[NTAGMAX];
+for (n = 0; n < sw->tagend; n++)
+ { k = n;
+   while (--k >= 0)
+    { if (string_compare(tagdatabase[n].name,tagdatabase[sort[k]].name) >= 0) break;
+      sort[k+1] = sort[k]; }
+   sort[++k] = n; }
+fprintf(sw->f,"\ntmRNA tag database update:\n");
+for (k = 0; k < sw->tagend; k++)
+ { n = sort[k];
+   fprintf(sw->f,"     { \"%s\",\"%s\"},\n",
+         tagdatabase[n].name,tagdatabase[n].tag); }
+fprintf(sw->f,"\n%d tmRNA peptide tags\n",sw->tagend);
+fprintf(sw->f,"%d new tmRNA peptide tags\n\n",sw->tagend - NTAG);
+}
+
+
 void disp_peptide_tag(FILE *f, gene *t, csw *sw)
 { int i,lx,nm,nmh,c1,c2,c3,*s,*se;
-  char tag[50],thit[21][50];
-  fprintf(f,"Tag peptide (at %d)\nTag sequence: ",t->tps+1);
+  char tag[52],thit[21][50];
+  fprintf(f,"Tag peptide at [%d,%d]\nTag sequence: ",t->tps+1,t->tpe+1);
+  lx = peptide_tag(tag,50,t,sw);
   se = t->eseq + t->tps;
-  lx = (t->tpe - t->tps + 1);
-  if (ltranslate(se+lx,t,sw) == '*')
-   { lx += 3;
-     if (ltranslate(se+lx,t,sw) == '*') lx += 3; }
-  lx /= 3;
   s = se;
   for (i = 0; i < lx; i++)
   { if (i > 0) fputc('-',f);
@@ -3441,13 +4673,7 @@ void disp_peptide_tag(FILE *f, gene *t, csw *sw)
   { fprintf(f,"%s",translate(s,sw));
     s += 3;
     if (i < (lx-1)) fputc('-',f); }
-  s = se;
-  fprintf(f,"\nTag peptide:  ");
-  for (i = 0; i < lx; i++)
-  { tag[i] = ltranslate(s,t,sw);
-    fprintf(f,"%c",tag[i]);
-    s += 3; }
-  tag[lx] = '\0';
+  fprintf(f,"\nTag peptide:  %s",tag);
   if (sw->energydisp)
    { s = se;
      fprintf(f,"\nTag Polarity: ");
@@ -3526,7 +4752,6 @@ void disp_location(gene *t, csw *sw, char *m)
   fprintf(sw->f,"%s %s\n",m,position(sp,t,sw)); }
 
 
-
 char *name(gene *t, char *si, int proc, csw *sw)
 { int s[5],*ss,*sin,*sm,*s0,*s1,*s2,*s3,nintron;
   char *sb,*st;
@@ -3539,10 +4764,14 @@ char *name(gene *t, char *si, int proc, csw *sw)
               sprintf(si,"srpRNA");
               break;
      case tmRNA:
-              if (t->asst > 0)
-               sprintf(si,"tmRNA (Permuted)");
+              if (sw->dispmatch)
+               { if (t->asst > 0)
+                  sprintf(si,"tmRNA(Perm)  ");
+                 else sprintf(si,"tmRNA        "); }
               else
-               sprintf(si,"tmRNA");
+               { if (t->asst > 0)
+                  sprintf(si,"tmRNA (Permuted)");
+                 else sprintf(si,"tmRNA"); }
               break;
      case tRNA:
               ss = (proc?t->seq:t->ps);
@@ -3794,8 +5023,11 @@ void disp_tmrna_seq(FILE *f, gene *t, csw *sw)
       { fputc('\n',f);
         i = 0; }}
   if (i > 0) fputc('\n',f);
-  fputc('\n',f);
-  fprintf(f,"Resume consensus sequence (at %d): ",t->tps - 6);
+  fprintf(f,"\n5' tRNA domain at [%d,%d]\n",
+          1,t->intron);
+  fprintf(f,"3' tRNA domain at [%d,%d]\n",
+          t->intron+t->nintron+1,t->nbase+t->nintron);
+  fprintf(f,"Resume consensus sequence at [%d,%d]: ",t->tps - 6,t->tps + 11);
   s = t->eseq + t->tps - 7;
   for (i = 0; i < 18; i++) fputc(cbase(*s++),f);
   fputc('\n',f);
@@ -3858,7 +5090,11 @@ void disp_tmrna_perm_seq(FILE *f, gene *t, csw *sw)
       { fputc('\n',f);
         i = 0; }}
   if (i > 0) fputc('\n',f);
-  fprintf(f,"\nResume consensus sequence (at %d): ",t->tps - 6);
+  fprintf(f,"\n5' tRNA domain at [%d,%d]\n",
+          t->asst+1,t->asst+t->astem1+t->dloop+t->cstem);
+  fprintf(f,"3' tRNA domain at [%d,%d]\n",
+          55,t->intron);
+  fprintf(f,"Resume consensus sequence at [%d,%d]: ",t->tps - 6,t->tps + 11);
   s = t->eseq + t->tps - 7;
   for (i = 0; i < 18; i++) fputc(cbase(*s++),f);
   fputc('\n',f);
@@ -3896,9 +5132,9 @@ void disp_cds(FILE *f, gene *t, csw *sw)
   fputc('\n',f); }
 
 
-int pseudogene(gene *t)
+int pseudogene(gene *t, csw *sw)
 {
-if (t->energy < 100.0) return(1);
+if (t->energy < sw->pseudogenethresh) return(1);
 if (t->genetype == tRNA)
  if (t->cloop != 7)
   return(1);
@@ -3925,7 +5161,7 @@ void disp_gene(gene *t, char m[][MATY], csw *sw)
   sprintf(stat,"%d bases, %%GC = %2.1f",t->nbase,100.0*gc);
   xcopy(m,4,2,stat,length(stat));
   if (sw->reportpseudogenes)
-   if (pseudogene(t))
+   if (pseudogene(t,sw))
     xcopy(m,4,4,"Possible Pseudogene",19);
   if (sw->energydisp)
    { sprintf(stat,"Score = %g\n",t->energy);
@@ -3937,21 +5173,32 @@ void disp_batch_trna(FILE *f, gene *t, csw *sw)
   char pos[50],species[50];
   static char type[2][6] = { "tRNA","mtRNA" };
   static char asterisk[2] = { ' ','*'};
-  anticodon = 1 + t->anticodon;
-  if (t->nintron > 0)
-   if (t->intron <= t->anticodon)
-    anticodon += t->nintron;
   s = t->seq + t->anticodon;
-  ps = sw->reportpseudogenes?(pseudogene(t)?1:0):0;
-  switch(t->cloop)
-   { case 6:
-     case 8:
-	  sprintf(species,"%s-???%c",type[sw->mtrna],asterisk[ps]);
-	  break;
-     case 7:
-	 default:
-          sprintf(species,"%s-%s%c",type[sw->mtrna],aa(s,sw),asterisk[ps]);
-	  break; }
+  ps = sw->reportpseudogenes?(pseudogene(t,sw)?1:0):0;
+  if (sw->batchfullspecies)
+   { switch(t->cloop)
+      { case 6:
+	       sprintf(species,"%s-?(%s|%s)%c",
+                   type[sw->mtrna],aa(s-1,sw),aa(s,sw),asterisk[ps]);
+	       break;
+        case 8:
+	       sprintf(species,"%s-?(%s|%s)%c",
+                   type[sw->mtrna],aa(s,sw),aa(s+1,sw),asterisk[ps]);
+	       break;
+        case 7:
+	    default:
+           sprintf(species,"%s-%s%c",type[sw->mtrna],aa(s,sw),asterisk[ps]);
+	       break; }}
+  else
+   { switch(t->cloop)
+      { case 6:
+        case 8:
+	     sprintf(species,"%s-???%c",type[sw->mtrna],asterisk[ps]);
+	     break;
+        case 7:
+	    default:
+         sprintf(species,"%s-%s%c",type[sw->mtrna],aa(s,sw),asterisk[ps]);
+	     break; }}
   position(pos,t,sw);
   ls = length(species);
   if (ls <= 10) fprintf(f,"%-10s%28s",species,pos);
@@ -3959,6 +5206,10 @@ void disp_batch_trna(FILE *f, gene *t, csw *sw)
   else fprintf(f,"%-25s%13s",species,pos);
   if (sw->energydisp)
    { fprintf(f,"\t%5.1f",t->energy); }
+  anticodon = 1 + t->anticodon;
+  if (t->nintron > 0)
+   if (t->intron <= t->anticodon)
+    anticodon += t->nintron;
   fprintf(f,"\t%-4d",anticodon);
   switch(t->cloop)
    { case 6:
@@ -4247,7 +5498,7 @@ void tmrna_score(FILE *f, gene *t, csw *sw)
 { int r,j,te,*s,*sb,*se,*tpos,tarm;
   double e,er,et,eal,esp,ed,ec,ea,egga,etcca,egg,eta,edgg;
   double ehairpin,euhairpin;
-  static int gtemplate[6] = { 0x00,0x00,0x11,0x00,0x00,0x00 };
+  static int gtem[6] = { 0x00,0x00,0x11,0x00,0x00,0x00 };
   static double tagend_score[4] = { 36.0, 66.0, 62.0, 72.0 };
   static int nps[126] =
    { 0,0,0,0,
@@ -4335,9 +5586,9 @@ void tmrna_score(FILE *f, gene *t, csw *sw)
   s = t->eseq + t->asst + t->astem1;
   sb = s + 3;
   se = s + 7;
-  r = gtemplate[*sb++];
+  r = gtem[*sb++];
   while (sb < se)
-   { r = (r >> 4) + gtemplate[*sb++];
+   { r = (r >> 4) + gtem[*sb++];
      if ((r & 3) == 2)
       { edgg = 14.0;
         break; }}
@@ -4386,7 +5637,7 @@ void tmrna_score(FILE *f, gene *t, csw *sw)
 
 int find_tstems(int *s, int ls, trna_loop hit[], int nh, csw *sw)
 { int i,r,c,tstem,tloop,ithresh1;
-  int *s1,*s2,*se,*ss,*si,*sb,*sc,*sf,*sl,*sx,*template;
+  int *s1,*s2,*se,*ss,*si,*sb,*sc,*sf,*sl,*sx,*tem;
   double ec,energy,penalty,thresh2;
   static double bem[6][6] =
    { { -2.144,-0.428,-2.144, ATBOND, 0.000, 0.000 },
@@ -4399,22 +5650,22 @@ int find_tstems(int *s, int ls, trna_loop hit[], int nh, csw *sw)
   static double C[6] = { 0.0,2.0,0.0,0.0,0.0,0.0 };
   static double G[6] = { 0.0,0.0,2.0,0.0,0.0,0.0 };
   static double T[6] = { 0.0,0.0,0.0,2.0,0.0,0.0 };
-  static int template_trna[6] =
+  static int tem_trna[6] =
    { 0x0100, 0x0002, 0x2000, 0x0220, 0x0000, 0x0000 };
-  static int template_tmrna[6] =
+  static int tem_tmrna[6] =
    { 0x0100, 0x0002, 0x2220, 0x0220, 0x0000, 0x0000 };
   i = 0;
-  template = (sw->tmrna)?template_tmrna:template_trna;
+  tem = (sw->tmrna)?tem_tmrna:tem_trna;
   ithresh1 = (int)sw->ttscanthresh;
   thresh2 = sw->ttarmthresh;
   ss = s + sw->loffset;
   si = ss + 4 - 1;
   sl = s + ls - sw->roffset + 5 + 3;
-  r = template[*si++];
-  r = (r >> 4) + template[*si++];
-  r = (r >> 4) + template[*si++];
+  r = tem[*si++];
+  r = (r >> 4) + tem[*si++];
+  r = (r >> 4) + tem[*si++];
   while (si < sl)
-   { r = (r >> 4) + template[*si++];
+   { r = (r >> 4) + tem[*si++];
      if ((c = (r & 0xF)) < ithresh1) continue;
      sb = si - 7;
      sf = sb + 13;
@@ -4460,7 +5711,7 @@ int find_astem5(int *si, int *sl, int *astem3, int n3,
   int *s1,*s2,*se;
   unsigned int r,tascanthresh;
   double tastemthresh,energy;
-  static unsigned int template[6] = { 0,0,0,0,0,0 };
+  static unsigned int tem[6] = { 0,0,0,0,0,0 };
   static unsigned int A[6] = { 0,0,0,2,0,0 };
   static unsigned int C[6] = { 0,0,2,0,0,0 };
   static unsigned int G[6] = { 0,2,0,1,0,0 };
@@ -4477,20 +5728,20 @@ int find_astem5(int *si, int *sl, int *astem3, int n3,
   i = 0;
   sl += n3;
   se = astem3 + n3 - 1;
-  template[0] = A[*se];
-  template[1] = C[*se];
-  template[2] = G[*se];
-  template[3] = T[*se];
+  tem[0] = A[*se];
+  tem[1] = C[*se];
+  tem[2] = G[*se];
+  tem[3] = T[*se];
   while (--se >= astem3)
-   { template[0] = (template[0] << 4) + A[*se];
-     template[1] = (template[1] << 4) + C[*se];
-     template[2] = (template[2] << 4) + G[*se];
-     template[3] = (template[3] << 4) + T[*se]; }
-  r = template[*si++];
+   { tem[0] = (tem[0] << 4) + A[*se];
+     tem[1] = (tem[1] << 4) + C[*se];
+     tem[2] = (tem[2] << 4) + G[*se];
+     tem[3] = (tem[3] << 4) + T[*se]; }
+  r = tem[*si++];
   k = 1;
-  while (++k < n3) r = (r >> 4) + template[*si++];
+  while (++k < n3) r = (r >> 4) + tem[*si++];
   while (si < sl)
-   { r = (r >> 4) + template[*si++];
+   { r = (r >> 4) + tem[*si++];
      if ((r & 15) >= tascanthresh)
       { s1 = astem3;
         s2 = si;
@@ -4524,7 +5775,6 @@ V = A or C or G
 M = A or C
 H = A or C or T
 K = G or T
-
 */
 
 int find_resume_seq(int *s, int ls, trna_loop hit[], int nh, csw *sw)
@@ -4546,7 +5796,7 @@ int find_resume_seq(int *s, int ls, trna_loop hit[], int nh, csw *sw)
      0,0,0,0, 0,0,0,0,
      0,0,0,0, 0,0,0,0,0 };
   static double score[4] = { 36.0, 66.0, 62.0, 72.0 };
-  static unsigned int template[6] =
+  static unsigned int tem[6] =
    { 0x10310000, 0x01000101, 0x00010030,
      0x02000100, 0x00000000, 0x00000000 };
   static int A[6] = { 0,1,1,1,1,1 };
@@ -4555,16 +5805,16 @@ int find_resume_seq(int *s, int ls, trna_loop hit[], int nh, csw *sw)
   thresh = (unsigned int)sw->tmrthresh;
   i = 0;
   sl = s + ls;
-  r = template[*s++];
-  r = (r >> 4) + template[*s++];
-  r = (r >> 4) + template[*s++];
-  r = (r >> 4) + template[*s++];
-  r = (r >> 4) + template[*s++];
-  r = (r >> 4) + template[*s++];
-  r = (r >> 4) + template[*s++];
+  r = tem[*s++];
+  r = (r >> 4) + tem[*s++];
+  r = (r >> 4) + tem[*s++];
+  r = (r >> 4) + tem[*s++];
+  r = (r >> 4) + tem[*s++];
+  r = (r >> 4) + tem[*s++];
+  r = (r >> 4) + tem[*s++];
   if (sw->tmstrict)
     while (s < sl)
-     { r = (r >> 4) + template[*s++];
+     { r = (r >> 4) + tem[*s++];
        if ((c = (r & 0xF)) < thresh) continue;
        c -= (V[s[1]] + V[s[2]] + M[s[5]] + A[s[8]]);
        if (c < thresh) continue;
@@ -4605,7 +5855,7 @@ int find_resume_seq(int *s, int ls, trna_loop hit[], int nh, csw *sw)
        i++; }
   else
     while (s < sl)
-     { r = (r >> 4) + template[*s++];
+     { r = (r >> 4) + tem[*s++];
        if ((c = (r & 0xF)) < thresh) continue;
        if (i >= nh) goto FL;
        st = s - 2;
@@ -4690,7 +5940,7 @@ gene *nearest_trna_gene(data_set *d, int nt, gene *t, csw *sw)
             { ilength = e - c;
               if ((2*thresh) > (5*ilength)) continue;
               if ((2*ilength) > (5*thresh)) continue; }
-           score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
+           score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?e-c:b-c);
            if (score >= proximity)
             if (ts[i].energy < energy)
               { n = i;
@@ -4708,7 +5958,7 @@ gene *nearest_trna_gene(data_set *d, int nt, gene *t, csw *sw)
          { ilength = e - c;
            if ((2*thresh) > (5*ilength)) continue;
            if ((2*ilength) > (5*thresh)) continue; }
-        score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
+        score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?e-c:b-c);
         if (score >= proximity)
             if (ts[i].energy < energy)
               { n = i;
@@ -4730,7 +5980,7 @@ gene *nearest_trna_gene(data_set *d, int nt, gene *t, csw *sw)
          { ilength = e - c;
            if ((2*thresh) > (5*ilength)) continue;
            if ((2*ilength) > (5*thresh)) continue; }
-        score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
+        score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?e-c:b-c);
         if (score >= proximity)
             if (ts[i].energy < energy)
               { n = i;
@@ -4748,7 +5998,7 @@ gene *nearest_trna_gene(data_set *d, int nt, gene *t, csw *sw)
       { ilength = e - c;
         if ((2*thresh) > (5*ilength)) continue;
         if ((2*ilength) > (5*thresh)) continue; }
-     score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
+     score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?e-c:b-c);
      if (score >= proximity)
       if (ts[i].energy < energy)
        { n = i;
@@ -4779,7 +6029,7 @@ gene *nearest_tmrna_gene(data_set *d, int nt, gene *t)
            if (b < c) goto NXTW;
            if (ts[i].genetype != tmRNA) continue;
            if (ts[i].comp != comp) continue;
-           score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
+           score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?e-c:b-c);
            if (score >= smax)
             if (score > smax)
              { n = i;
@@ -4794,7 +6044,7 @@ gene *nearest_tmrna_gene(data_set *d, int nt, gene *t)
         if (b < c) continue;
         if (ts[i].genetype != tmRNA) continue;
         if (ts[i].comp != comp) continue;
-        score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
+        score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?e-c:b-c);
         if (score >= smax)
          if (score > smax)
           { n = i;
@@ -4813,7 +6063,7 @@ gene *nearest_tmrna_gene(data_set *d, int nt, gene *t)
         if (b < c) goto NXTN;
         if (ts[i].genetype != tmRNA) continue;
         if (ts[i].comp != comp) continue;
-        score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
+        score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?e-c:b-c);
         if (score >= smax)
          if (score > smax)
           { n = i;
@@ -4828,7 +6078,7 @@ gene *nearest_tmrna_gene(data_set *d, int nt, gene *t)
      if (b < c) continue;
      if (ts[i].genetype != tmRNA) continue;
      if (ts[i].comp != comp) continue;
-     score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
+     score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?e-c:b-c);
      if (score >= smax)
       if (score > smax)
        { n = i;
@@ -4949,7 +6199,7 @@ gene *find_slot(data_set *d, gene *t, int *nts, csw *sw)
            ts = tsn;
            init_gene(sw->genespace,newspace);
            sw->genespace = newspace; }
-        copy3cr(d->seqname,t->name,79);
+        copy3cr(d->seqname,t->name,99);
         tn = ts + (*nts);
         *nts = (*nts) + 1;
         if (sw->verbose)
@@ -5039,7 +6289,7 @@ int find_mt_trna(data_set *d, int *seq, int lseq, int nts, csw *sw)
   static int RI[6] = { 1,0,1,0,1,0 };
   static int YI[6] = { 0,1,0,1,1,0 };
   static int WI[6] = { 1,0,0,1,1,0 };
-  static unsigned int template[6] = { 0,0,0,0,0,0 };
+  static unsigned int tem[6] = { 0,0,0,0,0,0 };
   static unsigned int At[6] = { 0,0,0,1,1,0 };
   static unsigned int Ct[6] = { 0,0,1,0,1,0 };
   static unsigned int Gt[6] = { 0,1,0,1,1,0 };
@@ -5389,30 +6639,30 @@ int find_mt_trna(data_set *d, int *seq, int lseq, int nts, csw *sw)
   sg = sc + 16;
   sge = sg + 30;
   slb = sg + 32;
-  template[0] = At[*slm];
-  template[1] = Ct[*slm];
-  template[2] = Gt[*slm];
-  template[3] = Tt[*slm];
+  tem[0] = At[*slm];
+  tem[1] = Ct[*slm];
+  tem[2] = Gt[*slm];
+  tem[3] = Tt[*slm];
   while (--slm > sle)
-   { template[0] = (template[0] << 4) | At[*slm];
-     template[1] = (template[1] << 4) | Ct[*slm];
-     template[2] = (template[2] << 4) | Gt[*slm];
-     template[3] = (template[3] << 4) | Tt[*slm]; }
+   { tem[0] = (tem[0] << 4) | At[*slm];
+     tem[1] = (tem[1] << 4) | Ct[*slm];
+     tem[2] = (tem[2] << 4) | Gt[*slm];
+     tem[3] = (tem[3] << 4) | Tt[*slm]; }
   while (slm >= sb)
-   { template[0] = ((template[0] << 4) | At[*slm]) & 0xfffff;
-     template[1] = ((template[1] << 4) | Ct[*slm]) & 0xfffff;
-     template[2] = ((template[2] << 4) | Gt[*slm]) & 0xfffff;
-     template[3] = ((template[3] << 4) | Tt[*slm]) & 0xfffff;
+   { tem[0] = ((tem[0] << 4) | At[*slm]) & 0xfffff;
+     tem[1] = ((tem[1] << 4) | Ct[*slm]) & 0xfffff;
+     tem[2] = ((tem[2] << 4) | Gt[*slm]) & 0xfffff;
+     tem[3] = ((tem[3] << 4) | Tt[*slm]) & 0xfffff;
      sf = slm + 3;
      if (sf > sge) sf = sge;
      apos2 = slm + 5;
      si = sg;
      s = si + 4;
-     r = template[*si];
-     while (++si < s) r = (r >> 4) + template[*si];
+     r = tem[*si];
+     while (++si < s) r = (r >> 4) + tem[*si];
      while (si <= sf)
       { if (si < slm)
-          r = (r >> 4) + template[*si++];
+          r = (r >> 4) + tem[*si++];
         else
          { si++;
            r = r >> 4; }
@@ -5658,28 +6908,28 @@ int find_mt_trna(data_set *d, int *seq, int lseq, int nts, csw *sw)
   sle = sc - 4;
   slb = sc - 8;
   slm = sc - 1;
-  template[0] = dAt[*slm];
-  template[1] = dCt[*slm];
-  template[2] = dGt[*slm];
-  template[3] = dTt[*slm];
+  tem[0] = dAt[*slm];
+  tem[1] = dCt[*slm];
+  tem[2] = dGt[*slm];
+  tem[3] = dTt[*slm];
   while (--slm > sle)
-   { template[0] = (template[0] << 4) | dAt[*slm];
-     template[1] = (template[1] << 4) | dCt[*slm];
-     template[2] = (template[2] << 4) | dGt[*slm];
-     template[3] = (template[3] << 4) | dTt[*slm]; }
+   { tem[0] = (tem[0] << 4) | dAt[*slm];
+     tem[1] = (tem[1] << 4) | dCt[*slm];
+     tem[2] = (tem[2] << 4) | dGt[*slm];
+     tem[3] = (tem[3] << 4) | dTt[*slm]; }
   slm1 = slm;
   while (slm > slb)
-   { template[0] = ((template[0] << 4) | dAt[*slm]) & 0xffff;
-     template[1] = ((template[1] << 4) | dCt[*slm]) & 0xffff;
-     template[2] = ((template[2] << 4) | dGt[*slm]) & 0xffff;
-     template[3] = ((template[3] << 4) | dTt[*slm]) & 0xffff;
+   { tem[0] = ((tem[0] << 4) | dAt[*slm]) & 0xffff;
+     tem[1] = ((tem[1] << 4) | dCt[*slm]) & 0xffff;
+     tem[2] = ((tem[2] << 4) | dGt[*slm]) & 0xffff;
+     tem[3] = ((tem[3] << 4) | dTt[*slm]) & 0xffff;
      slm--;
      si = slm - 18;
      s = si + 3;
-     r = template[*si];
-     while (++si < s) r = (r >> 4) + template[*si];
+     r = tem[*si];
+     while (++si < s) r = (r >> 4) + tem[*si];
      while (si <= slm1)
-      { if (si < slm) r = (r >> 4) + template[*si++];
+      { if (si < slm) r = (r >> 4) + tem[*si++];
         else
          { r = r >> 4;
 	   si++; }
@@ -5872,32 +7122,32 @@ int find_mt_trna(data_set *d, int *seq, int lseq, int nts, csw *sw)
      sg = sf - 6;
      sb = sc + 17;
      se = s2 + 6;
-     template[0] = aAt[*se];
-     template[1] = aCt[*se];
-     template[2] = aGt[*se];
-     template[3] = aTt[*se];
+     tem[0] = aAt[*se];
+     tem[1] = aCt[*se];
+     tem[2] = aGt[*se];
+     tem[3] = aTt[*se];
      while (--se > s2)
-      { template[0] = (template[0] << 4) | aAt[*se];
-        template[1] = (template[1] << 4) | aCt[*se];
-        template[2] = (template[2] << 4) | aGt[*se];
-        template[3] = (template[3] << 4) | aTt[*se]; }
+      { tem[0] = (tem[0] << 4) | aAt[*se];
+        tem[1] = (tem[1] << 4) | aCt[*se];
+        tem[2] = (tem[2] << 4) | aGt[*se];
+        tem[3] = (tem[3] << 4) | aTt[*se]; }
      ti = (int)(se - sc);
      while (se >= sb)
-      { template[0] = ((template[0] << 4) | aAt[*se]) & 0xfffffff;
-        template[1] = ((template[1] << 4) | aCt[*se]) & 0xfffffff;
-        template[2] = ((template[2] << 4) | aGt[*se]) & 0xfffffff;
-        template[3] = ((template[3] << 4) | aTt[*se]) & 0xfffffff;
+      { tem[0] = ((tem[0] << 4) | aAt[*se]) & 0xfffffff;
+        tem[1] = ((tem[1] << 4) | aCt[*se]) & 0xfffffff;
+        tem[2] = ((tem[2] << 4) | aGt[*se]) & 0xfffffff;
+        tem[3] = ((tem[3] << 4) | aTt[*se]) & 0xfffffff;
 	if (tendmap[ti])
 	 { nti = (tendmap[ti] < 0x2000)?1:0; }
 	else
 	 { if (se > sle) goto ANX;
 	   nti = -1; }
         si = sg;
-        r = template[*si];
-        while (++si < sf) r = (r >> 4) + template[*si];
+        r = tem[*si];
+        while (++si < sf) r = (r >> 4) + tem[*si];
 		di = (int)(sc - si);
         while (si < sa)
-         { r = (r >> 4) + template[*si++];
+         { r = (r >> 4) + tem[*si++];
 	   if (dposmap[--di])
             { if (nti <= 0)
                { if (nti < 0)
@@ -6037,28 +7287,6 @@ int find_mt_trna(data_set *d, int *seq, int lseq, int nts, csw *sw)
              ea -= 2.0;  
              break; }
 
-/*     if (incds) continue;  */
-
-
-
-/*
-        s = apos1 + nbase/2;
-        if (incodon(s-75,s+75) > 30.0) /@ 3.5,3.0,2.5 @/
-         { incds = 1;
-           ea -= 2.0; }
-        else
-         incds = 0;
-*/
-
-/*
-        s = apos1 + nbase/2;
-        if (incodon(s-150,s+150) > 0.0) 
-         { incds = 1;
-           ea -= 2.0; }
-        else
-         incds = 0;
-*/
-
 
   /* cycle through carms that fall between astem */
 
@@ -7050,28 +8278,28 @@ int find_mt_trna(data_set *d, int *seq, int lseq, int nts, csw *sw)
   /* remember fully formed D-loop replacement mttRNA gene */
   /* if threshold reached */
 
-              if (energy < thresh) goto DN;
-	      te.energy = energy;
-              thresh = energy;
-              te.ps = apos1;
-              te.spacer1 = 0;
-              te.spacer2 = 0;
-              te.dstem = 0;
-              te.dloop = dloop;
-              te.cstem = cstem;
-              te.cloop = cloop;
-              te.anticodon = astem + dloop + cstem + 2;
-              te.nintron = 0;
-              te.intron = 0;
-              te.var = var;
-              te.varbp = 0;
-              te.tstem = tstem;
-              te.tloop = tl;
-              te.nbase = astem + dloop + carm + var +
+       if (energy < thresh) goto DN;
+	   te.energy = energy;
+       thresh = energy;
+       te.ps = apos1;
+       te.spacer1 = 0;
+       te.spacer2 = 0;
+       te.dstem = 0;
+       te.dloop = dloop;
+       te.cstem = cstem;
+       te.cloop = cloop;
+       te.anticodon = astem + dloop + cstem + 2;
+       te.nintron = 0;
+       te.intron = 0;
+       te.var = var;
+       te.varbp = 0;
+       te.tstem = tstem;
+       te.tloop = tl;
+       te.nbase = astem + dloop + carm + var +
                   2*tstem + tl;
-	      tastem = astem;
-	      tastem8 = astem8;
-	      tastem8d = astem8d;
+	   tastem = astem;
+	   tastem8 = astem8;
+	   tastem8d = astem8d;
 
   /* build fully formed cloverleaf mttRNA genes */
 
@@ -8106,7 +9334,7 @@ int tmopt(data_set *d,
           int nts,int *seq, csw *sw)
 { int r,na,nr,nrh,ibase,flag,as,aext,nbasefext;
   int *s,*v,*s1,*s2,*sa,*sb,*se,*sf,*ps,*tpos,pseq[MAXETRNALEN+1];
-  static int gtemplate[6] = { 0x00,0x00,0x11,0x00,0x00,0x00 };
+  static int gtem[6] = { 0x00,0x00,0x11,0x00,0x00,0x00 };
   static double A[6] = { 6.0,0.0,0.0,0.0,0.0,0.0 };
   static double Ar[6] = { 10.0,0.0,0.0,0.0,0.0,0.0 };
   static double Cr[6] = { 0.0,10.0,0.0,0.0,0.0,0.0 };
@@ -8170,9 +9398,9 @@ int tmopt(data_set *d,
         if (energy < cathresh) continue;
         sb = sa + 3;
         sf = sa + 7;
-        r = gtemplate[*sb++];
+        r = gtem[*sb++];
         while (sb < sf)
-         { r = (r >> 4) + gtemplate[*sb++];
+         { r = (r >> 4) + gtem[*sb++];
            if ((r & 3) == 2)
             { energy += 14.0;
               break; }}
@@ -8213,7 +9441,7 @@ int tmopt_perm(data_set *d,
           int nts, int *seq, csw *sw)
 { int r,na,nr,nrh,flag,as,aext;
   int *s,*v,*s1,*s2,*sa,*sb,*se,*sf,*ps,*apos,*tpos;
-  static int gtemplate[6] = { 0x00,0x00,0x11,0x00,0x00,0x00 };
+  static int gtem[6] = { 0x00,0x00,0x11,0x00,0x00,0x00 };
   double e,energy,penergy,tenergy,aenergy,athresh,cthresh,cathresh;
   static double A[6] = { 6.0,0.0,0.0,0.0,0.0,0.0 };
   static double Ar[6] = { 10.0,0.0,0.0,0.0,0.0,0.0 };
@@ -8271,9 +9499,9 @@ int tmopt_perm(data_set *d,
      if (energy < cathresh) continue;
      sb = sa + 3;
      sf = sa + 7;
-     r = gtemplate[*sb++];
+     r = gtem[*sb++];
      while (sb < sf)
-      { r = (r >> 4) + gtemplate[*sb++];
+      { r = (r >> 4) + gtem[*sb++];
         if ((r & 3) == 2)
          { energy += 14.0;
            break; }}
@@ -8284,10 +9512,10 @@ int tmopt_perm(data_set *d,
       { ps = rhit[nr].pos;
         t.energy = penergy + rhit[nr].energy;
         if (rhit[nr].stem < 24) t.energy -= 15.0;
- if (t.energy > te.energy)
+        if (t.energy > te.energy)
          { flag = 1;
-    t.tstem = th->stem;
-    t.tloop = th->loop;
+           t.tstem = th->stem;
+           t.tloop = th->loop;
            t.asst = (long)(apos - tpos) + t.var + t.cstem;
            t.ps = tpos - t.var - t.cstem;
            t.tps = (int)(ps - t.ps);
@@ -8609,7 +9837,7 @@ int tmioptimise(data_set *d, int *seq, int lseq, int nts, csw *sw)
            if (*tloopfold == Guanine)
             { sb = dpos + dstem + 2;
               sc = sb;
-              se = sb + t.dloop - 3;
+              se = sb + dhit[nd1].loop - 3;
               r = TT[*sb++];
               while (sb < se)
                { r = (r >> 4) + TT[*sb++];
@@ -8636,6 +9864,8 @@ int tmioptimise(data_set *d, int *seq, int lseq, int nts, csw *sw)
                { denergy = e;
                  dhit[ndx].end = NULL;
                  ndx = nd2; }}}
+        cposmin = 0;
+        cposmax = 0;
         nd1 = ndh;
         while (--nd1 >= 0)
          { if (!dhit[nd1].end) continue;
@@ -8792,11 +10022,11 @@ int tmioptimise(data_set *d, int *seq, int lseq, int nts, csw *sw)
                     energy += 4.0; }
           if (energy < ethresh) continue;
           t.energy = energy;
+          t.dstem = dstem;
           t.astem1 = (t.dstem < 6)?7:((t.tstem < 5)?9:8);
           t.astem2 = t.astem1;
           t.ps = apos + 7 - t.astem1;
           t.nbase = (int)(tend - t.ps) + t.astem2;
-          t.dstem = dstem;
           t.dloop = dhit[ndx].loop;
           t.spacer1 = (int)(dpos - apos) - 7;
           t.spacer2 = (int)(cpos - dhit[ndx].end);
@@ -8816,32 +10046,31 @@ int tmioptimise(data_set *d, int *seq, int lseq, int nts, csw *sw)
 
 void disp_ftable_entry(FILE *f, int n[], int i, int m, csw *sw)
  { if (m > 0)
-            switch(sw->geneticcode)
-              { case METAZOAN_MT:
-                        if (i < 2) fprintf(f," %-4s %-4d",aa(n,sw),m);
-                        else fprintf(f," %-18s %-4d",aa(n,sw),m);
-                        break;
-                case STANDARD:
-                case VERTEBRATE_MT:
-                default:
-                        fprintf(f," %-4s %-5d",aa(n,sw),m);
-                        break; }
-           else
-            switch(sw->geneticcode)
-              { case METAZOAN_MT:
-                        if (i < 2) fprintf(f," %-4s     ",aa(n,sw));
-                        else fprintf(f," %-18s     ",aa(n,sw));
-                        break;
-                case STANDARD:
-                case VERTEBRATE_MT:
-                default:
-                        fprintf(f," %-4s      ",aa(n,sw));
-                        break; }}
+    switch(sw->geneticcode)
+      { case METAZOAN_MT:
+          fprintf(f," %-18s %-4d",aa(n,sw),m);
+          break;
+        case STANDARD:
+        case VERTEBRATE_MT:
+        default:
+          fprintf(f," %-4s %-5d",aa(n,sw),m);
+          break; }
+   else
+    switch(sw->geneticcode)
+     { case METAZOAN_MT:
+         fprintf(f," %-18s     ",aa(n,sw));
+         break;
+      case STANDARD:
+      case VERTEBRATE_MT:
+      default:
+        fprintf(f," %-4s      ",aa(n,sw));
+        break; }}
 
 
 void disp_freq_table(int nt, csw *sw)
-{ int i,j,k,m,ambig,*s,c1,c2,c3,c[3],n[3],table[4][4][4];
+{ int i,j,k,m,ambig,*s,c1,c2,c3,c[3],a[3],table[4][4][4];
   static int cgflip[4] = { 0,2,1,3 };
+  static int codonorder[4] = { 3,1,0,2 };
   FILE *f = sw->f;
   ambig = 0;
   for (i = 0; i < 4; i++)
@@ -8864,40 +10093,43 @@ void disp_freq_table(int nt, csw *sw)
         else ambig++;
        else ambig++; }
      else ambig++;
-  fprintf(f,"tRNA Anticodon Frequency\n");
-  for (j = 0; j < 4; j++)
-   { n[2] = cgflip[j];
-     for (k = 0; k < 4; k++)
-      { n[1] = cgflip[k];
-        for (i = 0; i < 4; i++)
-         { n[0] = cgflip[i];
-           fprintf(f,"%c%c%c",cpbase(n[0]),cpbase(n[1]),cpbase(n[2]));
-           m = table[n[0]][n[1]][n[2]];
-          disp_ftable_entry(f,n,i,m,sw); }
-        fputc('\n',f); }}
+  fprintf(f,"tRNA anticodon frequency\n");
+  for (i = 0; i < 4; i++)
+   { c[0] = codonorder[i];
+     a[2] = 3 - c[0];
+     for (j = 0; j < 4; j++)
+      { c[2] = codonorder[j];
+        a[0] = 3 - c[2];
+        for (k = 0; k < 4; k++)
+         { c[1] = codonorder[k];
+           a[1] = 3 - c[1];
+           fprintf(f,"%c%c%c",cpbase(a[0]),cpbase(a[1]),cpbase(a[2]));
+           m = table[a[0]][a[1]][a[2]];
+           disp_ftable_entry(f,a,k,m,sw); }
+        fputc('\n',f); }
+     if (i < 3) fputc('\n',f); }
   if (ambig > 0) fprintf(f,"Ambiguous: %d\n",ambig);
-  fprintf(f,"\ntRNA Codon Frequency\n");
+  fprintf(f,"\ntRNA codon frequency\n");
   for (i = 0; i < 4; i++)
-   { n[0] = 3 - cgflip[i];
+   { c[0] = codonorder[i];
+     a[2] = 3 - c[0];
      for (j = 0; j < 4; j++)
-      { n[1] = 3 - cgflip[j];
+      { c[2] = codonorder[j];
+        a[0] = 3 - c[2];
         for (k = 0; k < 4; k++)
-         { n[2] = 3 - cgflip[k];
-           fprintf(f,"%c%c%c",cpbase(n[0]),cpbase(n[1]),cpbase(n[2]));
-           c[0] = 3 - n[2];
-           c[1] = 3 - n[1];
-           c[2] = 3 - n[0];
-           m = table[c[0]][c[1]][c[2]];
-           disp_ftable_entry(f,c,k,m,sw); }
-        fputc('\n',f); }}
+         { c[1] = codonorder[k];
+           a[1] = 3 - c[1];
+           fprintf(f,"%c%c%c",cpbase(c[0]),cpbase(c[1]),cpbase(c[2]));
+           m = table[a[0]][a[1]][a[2]];
+           disp_ftable_entry(f,a,k,m,sw); }
+        fputc('\n',f); }
+     if (i < 3) fputc('\n',f); }
   if (ambig > 0) fprintf(f,"Ambiguous: %d\n",ambig);
   fputc('\n',f); }
 
 void disp_energy_stats(data_set *d, int nt, csw *sw)
 { int i,n[NS],genetype,introns,nintron,trna,mtrna,ntv,nd,nps;
   double gc,gcmin[NS],gcmax[NS];
-  static char genetype_name[NS][30] =
-   { "tRNA genes","tmRNA genes","srpRNA genes","rRNA genes","CDS genes","Overall" };
   FILE *f = sw->f;
   mtrna = sw->mtrna;
   trna = sw->trna | mtrna;
@@ -8919,7 +10151,7 @@ void disp_energy_stats(data_set *d, int nt, csw *sw)
     { n[NS-1]++;
       genetype = ts[i].genetype;
       n[genetype]++;
-      if (pseudogene(ts + i)) nps++;
+      if (pseudogene(ts + i,sw)) nps++;
       if (genetype == tRNA)
        { if (mtrna)
           { if (ts[i].tstem == 0) ntv++;
@@ -8944,7 +10176,7 @@ void disp_energy_stats(data_set *d, int nt, csw *sw)
            fprintf(f,"Number of tRNA genes with C-loop introns = %d\n",
                    nintron); }
         else
-         fprintf(f,"Number of %s = %d\n",genetype_name[tRNA],n[tRNA]);
+         fprintf(f,"Number of %s genes = %d\n",sw->genetypename[tRNA],n[tRNA]);
         if (mtrna)
          { if (sw->tvloop)
             fprintf(f,"Number of TV replacement loop tRNA genes = %d\n",
@@ -8957,7 +10189,7 @@ void disp_energy_stats(data_set *d, int nt, csw *sw)
   if (sw->tmrna)
    { sw->ngene[tmRNA] += n[tmRNA];
      if ((n[tmRNA] > 1) || (trna && (n[tRNA] > 0)))
-      fprintf(f,"Number of %s = %d\n",genetype_name[tmRNA],n[tmRNA]); }
+      fprintf(f,"Number of %s genes = %d\n",sw->genetypename[tmRNA],n[tmRNA]); }
   sw->nps += nps;
   if (sw->reportpseudogenes)
    if (nps > 0) 
@@ -8966,6 +10198,7 @@ void disp_energy_stats(data_set *d, int nt, csw *sw)
   fputc('\n',f);
   fputc('\n',f); }
   
+
 void batch_energy_stats(data_set *d, int nt, csw *sw)
 { int i,n[NS],genetype,introns,nintron,trna,mtrna,ntv,nd,nps;
   double gc,gcmin[NS],gcmax[NS];
@@ -9045,23 +10278,81 @@ int gene_sort(data_set *d, int nt, int sort[], csw *sw)
 
 int iamatch(data_set *d, gene *t, csw *sw)
 { char key[5],*k,s[100];
-  if (!(k = softstrpos(d->seqname,"TRNA-"))) return(-1);
-  copy3cr(k+5,key,3);
+  if (k = softstrpos(d->seqname,"TRNA-")) k += 5;
+  else
+   if (k = wildstrpos(d->seqname,"|***|")) k++;
+    else return(-1);
+  copy3cr(k,key,3);
   name(t,s,1,sw);
   if (softstrpos(s,key)) return(1);
   return(0); }
 
 
-int nearest_annotated_gene(data_set *d, gene *t, int matchgenetype)
-{ int n,i,nagene,max;
-  long a,b,c,e,score,thresh,psmax,proximity; 
+
+int gene_mismatch(data_set *d, annotated_gene *agene, gene *t, csw *sw)
+{
+int w,alen,dlen;
+char *s;
+w = 0;
+dlen = seqlen(t);
+alen = aseqlen(d,agene);
+switch(t->genetype)
+ { case tRNA:
+     s = aa(t->seq + t->anticodon,sw);
+     if (!softstrpos(s,agene->species+5))
+      { if (t->cloop == 8)
+         { s = aa(t->seq + t->anticodon + 1,sw);
+           if (!softstrpos(s,agene->species+5)) w += 1; }
+        else if (t->cloop == 6)
+         { s = aa(t->seq + t->anticodon - 1,sw);
+           if (!softstrpos(s,agene->species+5)) w += 1; }
+        else w += 1; }
+     if (agene->comp != t->comp) w += 2;
+     if (alen <= (dlen - sw->trnalenmisthresh)) w += 4;
+     else if (alen >= (dlen + sw->trnalenmisthresh)) w += 4;
+     break;
+  case tmRNA:
+     if (agene->comp != t->comp) w += 2;
+     if (alen <= (dlen - sw->tmrnalenmisthresh)) w += 4;
+     else if (alen >= (dlen + sw->tmrnalenmisthresh)) w += 4;
+     break; }
+return(w);
+}
+
+
+int gene_mismatch_report(data_set *d, annotated_gene *agene, gene *t, char *report, csw *sw)
+{
+int w;
+char *s;
+w = gene_mismatch(d,agene,t,sw);
+s = report;
+if (w & 1) s = copy("amino acceptor",s);
+if (w & 2)
+ { if (w & 1) 
+    if (w & 4) s = copy(", ",s);
+    else s = copy(" and ",s);
+   s = copy("sense",s); }
+if (w & 4) 
+ { if ((w & 3) > 0) s = copy(" and ",s);
+   s = copy("sequence length",s); }
+if (w > 0) s = copy(" mismatch",s);
+*s = '\0';
+return(w);
+}
+
+
+
+int nearest_annotated_gene(data_set *d, gene *t,
+                           int list[], int score[], int nmax,
+                           csw *sw)
+{ int n,i,j,k,q,w,nagene;
+  long a,b,c,e,thresh,psmax;
+  char *s; 
   annotated_gene *ta;
   psmax = d->psmax;
   nagene = d->nagene[NS-1];
   ta = d->gene;
-  n = -1;
-  max = 0;
-  proximity = matchgenetype?40:1;
+  n = 0;
   a = t->start;
   b = t->stop;
   thresh = b-a;
@@ -9075,25 +10366,19 @@ int nearest_annotated_gene(data_set *d, gene *t, int matchgenetype)
          { e += psmax;
            if (a > e) goto NXTW;
            if (b < c) goto NXTW;
-           if (matchgenetype)
-            if (ta[i].genetype != t->genetype) continue;
-           score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
-           if (score >= proximity)
-            if (score > max)
-              { n = i;
-                max = score; }
+           if (n >= nmax) break;
+           list[n] = i;
+           score[n] = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?e-c:b-c);
+           n++;
            NXTW:
            c -= psmax;
            e -= psmax; }
         if (a > e) continue;
         if (b < c) continue;
-        if (matchgenetype)
-         if (ta[i].genetype != t->genetype) continue;
-        score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
-        if (score >= proximity)
-         if (score > max)
-           { n = i;
-             max = score; } }
+        if (n >= nmax) break;
+        list[n] = i;
+        score[n] = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?e-c:b-c);
+        n++; }
      a -= psmax;
      b -= psmax; }
   for (i = 0; i < nagene; i++)
@@ -9103,302 +10388,342 @@ int nearest_annotated_gene(data_set *d, gene *t, int matchgenetype)
       { e += psmax;
         if (a > e) goto NXTN;
         if (b < c) goto NXTN;
-        if (matchgenetype)
-         if (ta[i].genetype != t->genetype) continue;
-        score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
-        if (score >= proximity)
-         if (score > max)
-          { n = i;
-            max = score; }
+        if (n >= nmax) break;
+        list[n] = i;
+        score[n] = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?e-c:b-c);
+        n++;
         NXTN:
         c -= psmax;
         e -= psmax; }
      if (a > e) continue;
      if (b < c) continue;
-     if (matchgenetype)
-      if (ta[i].genetype != t->genetype) continue;
-     score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
-     if (score >= proximity)
-      if (score > max)
-       { n = i;
-         max = score; } }
+     if (n >= nmax) break;
+     list[n] = i;
+     score[n] = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?e-c:b-c);
+     n++; }
+  for (i = 0; i < n; i++)
+   { k = list[i];
+     if (ta[k].genetype == t->genetype) 
+      { score[i] += 5000;
+        w = gene_mismatch(d,ta + k,t,sw);
+        if (w & 1) score[i] -= 2;
+        if (w & 2) score[i] -= 1; }}
+  if (n > 1)
+   { for (i = 0; i < (n-1); i++)
+      for (j = i+1; j < n; j++)
+       if (score[j] > score[i])
+       { k = list[i];
+         list[i] = list[j];
+         list[j] = k;
+         k = score[i];
+         score[i] = score[j];
+         score[j] = k; }}
   return(n); }
 
+    
+
+
+int proximity_compare(data_set *d, int is,
+                      long prox, long dlen, long alen,
+                      annotated_gene *a,
+                      csw *sw)
+{
+int w,score;
+long diff;
+char nm[200];
+gene *t;
+t = ts + is;
+w = gene_mismatch(d,a,t,sw);
+if (prox >= alen)
+ { diff = dlen - alen;
+   if (prox >= (2L*diff)) score = (int)(prox - diff);
+   else score = (int)(prox/2L); }
+else
+ if (prox >= dlen)
+  { diff = alen - dlen;
+    if (prox >= (2L*diff)) score = (int)(prox - diff);
+    else score = (int)(prox/2L); }
+else { score = (int)prox; }
+if (w & 1) score -= 10;
+if (w & 2) score -= 2;
+if (score < 0) score = 0;
+if (t->annotation >= 0)
+ if (t->annosc >= score) return(-1);
+return(score);
+}
+
 
-int nearest_detected_gene(data_set *d, int *sort, int ns, 
-                          int proxtype, int *overlap,
-                               annotated_gene *t)
+
+
+int nearest_detected_gene(data_set *d, int sort[], int nd,
+                          int *scorep,
+                          annotated_gene *ag, csw *sw)
 { int n,i,is;
-  long a,b,c,e,score,thresh,scoremax,psmax;
-  long proximity;
-  double energy;
+  long a,b,c,e,score,alen,scoremax,psmax;
+  long prox,proximity;
   psmax = d->psmax;
   n = -1;
-  energy = -INACTIVE;
   scoremax = -1;
-  a = t->start;
-  b = t->stop;
-  thresh = b-a;
-  proximity = thresh;
-  if (proximity < 0) proximity = -proximity;
-  proximity = 1 + proximity/2;
-  if (proximity > 40) proximity = 40;
+  a = ag->start;
+  b = ag->stop;
+  alen = b - a;
+  if (b < a) alen += psmax;
+  proximity = 1 + alen/2;
+  if (proximity > 30) proximity = 30;
   if (b < a)
    { b += psmax;
-     thresh += psmax;
-     for (i = 0; i < ns; i++)
+     for (i = 0; i < nd; i++)
       { is = sort[i];
+        if (ag->genetype != ts[is].genetype) continue;
         c = ts[is].start;
         e = ts[is].stop;
         if (e < c)
          { e += psmax;
            if (a > e) goto NXTW;
            if (b < c) goto NXTW;
-           if (ts[is].genetype != t->genetype) continue;
-           score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
-           if (score >= proximity)
-            if (proxtype)
-             { if (score > scoremax)
-                { n = i;
-                  scoremax = score; }}
-            else
-             if (ts[is].energy > energy)
-              { n = i;
-                scoremax = score;
-                energy = ts[is].energy; }
+           prox = (a >= c)?((b >= e)?e-a:alen):((b >= e)?e-c:b-c);
+           if (prox >= proximity)
+            if ((score = proximity_compare(d,is,prox,e-c,alen,ag,sw)) > scoremax)
+             { n = i;
+               scoremax = score; }
            NXTW:
            c -= psmax;
            e -= psmax; }
         if (a > e) continue;
         if (b < c) continue;
-        if (ts[is].genetype != t->genetype) continue;
-        score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
-        if (score >= proximity)
-            if (proxtype)
-             { if (score > scoremax)
-                { n = i;
-                  scoremax = score; }}
-            else
-             if (ts[is].energy > energy)
-              { n = i;
-                scoremax = score;
-                energy = ts[is].energy; } }
+        prox = (a >= c)?((b >= e)?e-a:alen):((b >= e)?e-c:b-c);
+        if (prox >= proximity)
+         if ((score = proximity_compare(d,is,prox,e-c,alen,ag,sw)) > scoremax)
+          { n = i;
+            scoremax = score; }}
      a -= psmax;
      b -= psmax; }
-  for (i = 0; i < ns; i++)
+  for (i = 0; i < nd; i++)
    { is = sort[i];
+     if (ag->genetype != ts[is].genetype) continue;
      c = ts[is].start;
      e = ts[is].stop;
      if (e < c)
       { e += psmax;
         if (a > e) goto NXTN;
         if (b < c) goto NXTN;
-        if (ts[is].genetype != t->genetype) continue;
-        score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
-        if (score >= proximity)
-            if (proxtype)
-             { if (score > scoremax)
-                { n = i;
-                  scoremax = score; }}
-            else
-             if (ts[is].energy > energy)
-              { n = i;
-                scoremax = score;
-                energy = ts[is].energy; }
+        prox = (a >= c)?((b >= e)?e-a:alen):((b >= e)?e-c:b-c);
+        if (prox >= proximity)
+         if ((score = proximity_compare(d,is,prox,e-c,alen,ag,sw)) > scoremax)
+          { n = is;
+            scoremax = score; }
         NXTN:
         c -= psmax;
         e -= psmax; }
      if (a > e) continue;
      if (b < c) continue;
-     if (ts[is].genetype != t->genetype) continue;
-     score = (a >= c)?((b >= e)?e-a:thresh):((b >= e)?thresh:b-c);
-     if (score >= proximity)
-      if (proxtype)
-       { if (score > scoremax)
-          { n = i;
-            scoremax = score; }}
-      else
-       if (ts[is].energy > energy)
-        { n = i;
-          scoremax = score;
-          energy = ts[is].energy; } }
-  *overlap = (scoremax + 1);
+     prox = (a >= c)?((b >= e)?e-a:alen):((b >= e)?e-c:b-c);
+     if (prox >= proximity)
+      if ((score = proximity_compare(d,is,prox,e-c,alen,ag,sw)) > scoremax)
+       { n = is;
+         scoremax = score; }}
+  *scorep = scoremax;
   return(n); }
 
 
+
 void disp_match(data_set *d, int *sort, int nd, csw *sw)
-{ int i,ld,fn,fp,fpd,fptv,w,alen,overlap,length,detect[NGFT],n[NS];
-  char nm[100],anm[100],ps[100],*s;
-  FILE *f = sw->f;
-  gene *t;
-  annotated_gene *agene;
-  static char comp[3] = " c";
-  for (i = 0; i < NS; i++) n[i] = 0;
-  for (i = 0; i < nd; i++)
-   { w = sort[i];
-     if (ts[w].energy >= 0.0)
-      { n[NS-1]++;
-        n[ts[w].genetype]++; }}
-  fprintf(f,"\n%s\n",d->seqname);
-  fprintf(f,"%ld nucleotides in sequence\n",d->psmax);
-  fprintf(f,"Mean G+C content = %2.1f%%\n",100.0*d->gc);
-  fprintf(f,"GenBank to Aragorn Comparison\n");
-  if (sw->trna | sw->mtrna)
-   { fn = 0;
-     fp = 0;
-     fpd = 0;
-     fptv = 0;
-     fprintf(f,"\n%d annotated tRNA genes\n",d->nagene[tRNA]);
-     fprintf(f,"%d detected tRNA genes\n\n",n[tRNA]);
-     fprintf(f,"  GenBank\t\t\t\tAragorn\n");
-     ld = 0;
-     for (i = 0; i < d->nagene[NS-1]; i++)
-      { agene = d->gene + i;
-        if (agene->genetype != tRNA) continue;
-        detect[i] = nearest_detected_gene(d,sort,nd,0,&overlap,agene);
-        while (ld < nd)
-         { t = ts + sort[ld];
-           if (detect[i] >= 0)
-            if (ld >= detect[i]) break;
-           if (t->start < t->stop)
-            if (t->start > agene->start) break;
-           fprintf(f,"* Not annotated                 %s ",name(t,nm,1,sw));
-           fprintf(f,"%s",position(ps,t,sw));
-           if (sw->reportpseudogenes)
-            if (pseudogene(t))
-             fprintf(f," PS");
-           fputc('\n',f);
-           fp++;
-           if (t->genetype == tRNA)
-            { if (t->dstem == 0) fpd++;
-              if (t->tstem == 0) fptv++; }
-           ld++;  }
-        if (detect[i] >= 0)
-         { ld = detect[i] + 1;
-           w = 0;
-           t = ts + sort[detect[i]];
-           s = aa(t->seq + t->anticodon,sw);
-           if (!softstrpos(s,agene->species+5)) w += 1;
-           if (agene->comp != t->comp) w += 2;
-           alen = agene->stop - agene->start;
-           if (alen < 0) alen = -alen;
-           if (alen < (t->nbase - 10)) w += 4;
-           else if (alen > (t->nbase + 10)) w += 4;
-           if (w > 0) fputc('*',f);
-           else fputc(' ',f); }
-        else 
-         fputc('*',f);
-        sprintf(anm," %s %c(%ld,%ld)",
-                agene->species,comp[agene->comp],agene->start,agene->stop);
-        fprintf(f,"%-30s ",anm);
-        if (detect[i] >= 0)
-         { fprintf(f,"%s ",name(t,nm,1,sw));
-           fprintf(f,"%s",position(ps,t,sw));
-           if (sw->reportpseudogenes)
-            if (pseudogene(t))
-             fprintf(f," PS");
-           if (w & 1) fprintf(f," AAM");
-           if (w & 2) fprintf(f," SM");
-           if (w & 4) fprintf(f," LM");
-           fputc('\n',f); }
-        else
-         { fprintf(f,"Not detected\n");
-           fn++; }}
-     while (ld < nd)
-      { fprintf(f,"* Not annotated\t\t\t%s ",name(ts + sort[ld],nm,1,sw));
-        fprintf(f,"%s\n",position(ps,ts + sort[ld],sw));
-        fp++;
-        if (t->genetype == tRNA)
-         { if (t->dstem == 0) fpd++;
-           if (t->tstem == 0) fptv++; }
-        ld++; }
-     fprintf(f,"\nNumber of false negative genes = %d\n",fn);
-     fprintf(f,"Number of false positive genes = %d\n",fp);
-     fprintf(f,"Number of false positive D-replacement tRNA genes = %d\n",fpd);
-     fprintf(f,"Number of false positive TV-replacement tRNA genes = %d\n",fptv);
-     fprintf(f,"\n\n");
-     sw->nagene[tRNA] += d->nagene[tRNA];
-     sw->natfn += fn; 
-     sw->natfp += fp;
-     sw->natfpd += fpd;
-     sw->natfptv += fptv; }
-  if (sw->cds)
-   { fn = 0;
-     fp = 0;
-     fprintf(f,"\n%d annotated CDS genes\n",d->nagene[CDS]);
-     fprintf(f,"%d detected CDS genes\n\n",n[CDS]);
-     fprintf(f,"  GenBank\t\t\t\t          Aragorn\n");
-     ld = 0;
-     for (i = 0; i < d->nagene[NS-1]; i++)
-      { agene = d->gene + i;
-        if (agene->genetype != CDS) continue;
-        length = (int)(agene->stop - agene->start) + 1;
-        sw->lacds += length;
-        detect[i] = nearest_detected_gene(d,sort,nd,1,&overlap,agene);
-        while (ld < nd)
-         { t = ts + sort[ld];
-           if (detect[i] >= 0)
-            if (ld >= detect[i]) break;
-           if (t->start < t->stop)
-            if (t->start > agene->start) break;
-           fprintf(f,"* Not annotated                                   ");
-           sprintf(anm,"%s %s",
-                   name(t,nm,1,sw),position(ps,t,sw));
-           fprintf(f,"%-18s",anm);
-           if (sw->energydisp) fprintf(f," %lg",t->energy);
-           if (sw->reportpseudogenes)
-            if (pseudogene(t))
-             fprintf(f," PS");
-           fputc('\n',f);
-           fp++;
-           ld++;  }
-        if (detect[i] >= 0)
-         { ld = detect[i] + 1;
-           t = ts + sort[detect[i]];
-           fputc(' ',f); }
-        else 
-         fputc('*',f);
-        fprintf(f," %-33s",agene->species);
-        sprintf(anm,"%c(%ld,%ld)",comp[agene->comp],agene->start,agene->stop);
-        fprintf(f,"%14s ",anm);
-        if (detect[i] >= 0)
-         { sprintf(anm,"%s %s",name(t,nm,1,sw),position(ps,t,sw));
-           fprintf(f,"%-18s",anm);
-           if (sw->energydisp) fprintf(f," %lg",t->energy);
-           if (sw->reportpseudogenes)
-            if (pseudogene(t))
-             fprintf(f," PS");
-           fputc('\n',f); 
-           length = (int)(t->stop - t->start) + 1; 
-           sw->ldcds += length; }
-        else
-         { fprintf(f,"Not detected\n");
-           fn++; }}
-     while (ld < nd)
-      { t = ts + sort[ld];
-        fprintf(f,"* Not annotated                                   ");
-        sprintf(anm,"%s %s",name(t,nm,1,sw),position(ps,t,sw));
-        fprintf(f,"%-18s",anm);
-        if (sw->energydisp) fprintf(f," %lg",t->energy);
-        if (sw->reportpseudogenes)
-         if (pseudogene(t))
-          fprintf(f," PS");
-        fputc('\n',f);
-        fp++;
-        ld++; }
-     fprintf(f,"\nNumber of false negative CDS genes = %d\n",fn);
-     fprintf(f,"Number of false positive CDS genes = %d\n",fp);
-     fprintf(f,"\n\n");
-     sw->nagene[CDS] += d->nagene[CDS];
-     sw->nacdsfn += fn; 
-     sw->nacdsfp += fp; }
-  sw->nabase += d->psmax; }
+{
+int i,ld,fn[NS],fp[NS],fpd,fptv,w,score,detect,n[NS];
+int prevannoted,nl,k,csort[NGFT],*msort;
+long start;
+char tag[52],nm[100],anm[100],ps[100],mreport[100],*s;
+FILE *f = sw->f;
+gene *t;
+annotated_gene *agene,*a;
+static char gp[2][7] = { "genes","gene" };
+static char comp[3] = " c";
+static char aps[2][5] = { "  ","PS" };
+nl = nd;
+if (sw->trna | sw->mtrna) nl += d->nagene[tRNA];
+if (sw->tmrna) nl += d->nagene[tmRNA];
+if (nl < NGFT) msort = csort;
+else
+ { msort = (int *)malloc(nl*sizeof(int));
+   if (msort == NULL)
+    { fprintf(stderr,"Not enough memory to match genes\n");
+      return; }}
+fprintf(f,"\n%s\n",d->seqname);
+fprintf(f,"%ld nucleotides in sequence\n",d->psmax);
+fprintf(f,"Mean G+C content = %2.1f%%\n",100.0*d->gc);
+fprintf(f,"\nGenBank to Aragorn comparison\n\n");
+sw->dispmatch = 1;
+for (i = 0; i < NS; i++) 
+  { n[i] = 0;
+    fn[i] = 0;
+    fp[i] = 0; }
+for (i = 0; i < nd; i++)
+ { w = sort[i];
+   if (ts[w].energy >= 0.0)
+    { n[NS-1]++;
+     n[ts[w].genetype]++; }
+   ts[w].annotation = -1;
+   ts[w].annosc = -1; }
+if (sw->trna | sw->mtrna | sw->tmrna)
+ { fpd = 0;
+   fptv = 0;
+   if (sw->trna | sw->mtrna)
+    { fprintf(f,"%d annotated tRNA %s\n",d->nagene[tRNA],gp[(d->nagene[tRNA]==1)?1:0]);
+      fprintf(f,"%d detected tRNA %s\n",n[tRNA],gp[(n[tRNA]==1)?1:0]); }
+   if (sw->tmrna)
+    { fprintf(f,"%d annotated tmRNA %s\n",d->nagene[tmRNA],gp[(d->nagene[tmRNA]==1)?1:0]);
+      fprintf(f,"%d detected tmRNA %s\n",n[tmRNA],gp[(n[tmRNA]==1)?1:0]); }
+   fprintf(f,"\n  GenBank                                      Aragorn\n");
+   nl = 0;
+   for (i = 0; i < d->nagene[NS-1]; i++)
+    { agene = d->gene + i;
+      agene->detected = -1;
+      if (agene->genetype != tRNA) 
+       { if (agene->genetype != tmRNA) continue;
+         else if (!sw->tmrna) continue; }
+      else if (!sw->trna) if (!sw->mtrna) continue;
+      a = agene;
+      k = i;
+      while ((a->detected = nearest_detected_gene(d,sort,nd,&score,a,sw)) >= 0)
+       { t = ts + a->detected;
+         prevannoted = t->annotation;
+         t->annotation = k;
+         t->annosc = score;
+         if (prevannoted < 0) break;
+         if (prevannoted == k) break;
+         if (prevannoted == i) break; 
+         a = d->gene + prevannoted;
+         k = prevannoted; }
+      k = nl;
+      while (--k >= 0)
+       { if (agene->start >= d->gene[msort[k]].start) break;
+         msort[k+1] = msort[k]; }
+      msort[++k] = i;
+      nl++; }
+   for (i = 0; i < nd; i++)
+    { t = ts + sort[i];
+      if (t->annotation >= 0) continue;
+      if (t->genetype != tRNA) 
+       { if (t->genetype != tmRNA) continue;
+         else if (!sw->tmrna) continue; }
+      else if (!sw->trna) if (!sw->mtrna) continue;
+      k = nl;
+      while (--k >= 0)
+       { if (msort[k] >= 0) start = d->gene[msort[k]].start;
+         else start = ts[-1-msort[k]].start;
+         if (t->start >= start) break;
+         msort[k+1] = msort[k]; }
+      msort[++k] = -(sort[i] + 1);
+      nl++; }
+   for (i = 0; i < nl; i++)
+    { if (msort[i] >= 0)
+       { agene = d->gene + msort[i];
+         detect = agene->detected;
+         if (detect >= 0)
+          { t = ts + detect;
+            w = gene_mismatch_report(d,agene,t,mreport,sw);
+            if (w > 0) fputc('*',f);
+            else fputc(' ',f); }
+         else fputc('*',f);
+         sprintf(anm," %-11s%c(%ld,%ld) %s",
+                 agene->species,comp[agene->comp],
+                 sq(agene->start),sq(agene->stop),aps[agene->pseudogene]);
+         fprintf(f,"%-45s ",anm);
+         if (detect >= 0)
+          { fprintf(f,"%s ",name(t,nm,1,sw));
+            if (t->comp == 0) fputc(' ',f);
+            fprintf(f,"%s",position(ps,t,sw));
+            if (sw->energydisp) fprintf(f," %7.3lf",t->energy);
+            if (t->genetype == tmRNA)
+             { peptide_tag(tag,50,t,sw);
+               fprintf(f," %s",tag); }
+            if (sw->reportpseudogenes)
+             if (pseudogene(t,sw))
+              fprintf(f," PS");
+            if (w > 0) fprintf(f," %s",mreport);
+            fputc('\n',f); }
+         else
+          { fprintf(f,"Not detected\n");
+            fn[agene->genetype]++; }}
+      else
+       { t = ts - (msort[i] + 1);
+         fprintf(f,"* Not annotated                                %s ",name(t,nm,1,sw));
+         if (t->comp == 0) fputc(' ',f);
+         fprintf(f,"%s",position(ps,t,sw));
+         if (sw->energydisp) fprintf(f," %7.3lf",t->energy);
+         if (t->genetype == tmRNA)
+          { peptide_tag(tag,50,t,sw);
+            fprintf(f," %s",tag); }
+         if (sw->reportpseudogenes)
+          if (pseudogene(t,sw))
+           fprintf(f," PS");
+         fputc('\n',f);
+         fp[t->genetype]++;
+         if (t->genetype == tRNA)
+          { if (t->dstem == 0) fpd++;
+            if (t->tstem == 0) fptv++; }}}
+   fprintf(f,"\n");
+   if (sw->trna | sw->mtrna)
+    { fprintf(f,"Number of annotated tRNA genes not detected = %d\n",fn[tRNA]);
+      fprintf(f,"Number of unannotated tRNA genes detected = %d\n",fp[tRNA]); }
+   if (sw->mtrna)
+    { fprintf(f,"Number of unannotated D-replacement tRNA genes detected = %d\n",fpd);
+      fprintf(f,"Number of unannotated TV-replacement tRNA genes detected = %d\n",fptv); }
+   if (sw->tmrna)
+    { fprintf(f,"Number of annotated tmRNA genes not detected = %d\n",fn[tmRNA]);
+      fprintf(f,"Number of unannotated tmRNA genes detected = %d\n",fp[tmRNA]); }
+   fprintf(f,"\n\n");
+   for (i = tRNA; i <= tmRNA; i++)
+    { sw->nagene[i] += d->nagene[i];
+      sw->nafn[i] += fn[i]; 
+      sw->nafp[i] += fp[i]; }
+   if (sw->mtrna)
+    { sw->natfpd += fpd;
+      sw->natfptv += fptv; }}
+sw->nabase += d->psmax;
+sw->dispmatch = 0;
+if (nl >= NGFT) free((void *)msort);
+}
 
 
+void annotation_overlap_check(data_set *d, gene *t, char *margin, csw *sw)
+{
+int a,m,n,w,list[20],score[20];
+char mreport[100];
+static char comp[3] = " c";
+n = nearest_annotated_gene(d,t,list,score,20,sw);
+if (n < 1) m = -1;
+else
+ { m = 0;
+   a = list[m];
+   if (d->gene[a].genetype != t->genetype) m = -1;
+   else 
+    { w = gene_mismatch_report(d,d->gene+a,t,mreport,sw);
+      if (w & 1)
+       { if ((score[m] - 5000) < (3*seqlen(t)/4)) m = -1; }
+      else
+       { if ((score[m] - 5000) < (seqlen(t)/3)) m = -1; }}}
+if (m < 0)
+ fprintf(sw->f,"%sNot annotated\n",margin);
+else
+ { a = list[m];
+   fprintf(sw->f,"%sMatch with annotated %s %c(%ld,%ld)",
+           margin,d->gene[a].species,comp[d->gene[a].comp],
+           d->gene[a].start,d->gene[a].stop);
+   w = gene_mismatch_report(d,d->gene+a,t,mreport,sw);
+   if (w > 0) fprintf(sw->f," * %s",mreport);
+   fputc('\n',sw->f); }
+while (++m < n)
+ { a = list[m];
+   fprintf(sw->f,"%sOverlap with annotated %s %c(%ld,%ld)\n",
+           margin,d->gene[a].species,comp[d->gene[a].comp],
+           d->gene[a].start,d->gene[a].stop); }
+fputc('\n',sw->f);
+}
+
 void disp_gene_set(data_set *d, int nt, csw *sw)
-{ int i,j,n,a,vsort[NT],*sort;
+{ int i,j,n,vsort[NT],*sort;
   char m[MATX][MATY],s[20];
-  static char comp[3] = " c";
   gene *t;
   FILE *f = sw->f;
   if (nt <= NT)
@@ -9430,13 +10755,7 @@ void disp_gene_set(data_set *d, int nt, csw *sw)
 		              { fprintf(f,"    Iso-acceptor mismatch\n");
 			            sw->iamismatch++; }
                     if (sw->annotated)
-                     if ((a = nearest_annotated_gene(d,t,1)) < 0)
-                      { fprintf(f,"    Annotation false positive\n");
-                        if ((a = nearest_annotated_gene(d,t,0)) >= 0)
-                         fprintf(f,"    Overlap with %s %c(%ld,%ld)\n",
-                                 d->gene[a].species,comp[d->gene[a].comp],
-                                 d->gene[a].start,d->gene[a].stop); 
-                        fputc('\n',f); }
+                     annotation_overlap_check(d,t,"    ",sw);
                     overlap(d,sort,n,i,sw);
                     if (sw->seqdisp) disp_seq(f,t,sw);
                     if (t->nintron > 0) disp_intron(f,t,sw);
@@ -9448,12 +10767,19 @@ void disp_gene_set(data_set *d, int nt, csw *sw)
                        disp_gene(t,m,sw);
                        sprintf(s,"%d.",j);
                        xcopy(m,0,32,s,length(s));
-                       disp_matrix(f,m,MATY); }
+                       disp_matrix(f,m,MATY);
+                       if (sw->annotated)
+                        annotation_overlap_check(d,t,"    ",sw); }
                     else
                      { fprintf(f,"\n%d.\n",j);
                        disp_location(t,sw,"Location");
+                       if (sw->reportpseudogenes)
+                        if (pseudogene(t,sw))
+                         fprintf(f,"Possible Pseudogene\n");
                        if (sw->energydisp)
-                        fprintf(f,"Score = %g\n",t->energy); }
+                        fprintf(f,"Score = %g\n",t->energy);
+                       if (sw->annotated)
+                        annotation_overlap_check(d,t,"",sw); }
                     overlap(d,sort,n,i,sw);
                     if (t->asst == 0) disp_tmrna_seq(f,t,sw);
                     else disp_tmrna_perm_seq(f,t,sw);
@@ -9462,6 +10788,8 @@ void disp_gene_set(data_set *d, int nt, csw *sw)
             case CDS:
                     fprintf(f,"\n%d.\nCDS gene\n",j);
                     disp_location(t,sw,"Location");
+                    if (sw->annotated)
+                     annotation_overlap_check(d,t,"",sw);
                     overlap(d,sort,n,i,sw);
                     disp_cds(f,t,sw);
                     break;
@@ -9611,7 +10939,7 @@ void iopt_fastafile(data_set *d, csw *sw)
   int *s,*sf,*se,*sc,*swrap;
   int seq[2*LSEQ+WRAP+1],cseq[2*LSEQ+WRAP+1],wseq[2*WRAP+1];
   long gap,start,rewind,drewind,psmax,tmaxlen,vstart,vstop;
-  double sensitivity,sel1,sel2;
+  double sens,sel1,sel2;
   char c1,c2,c3;
   static char trnatypename[3][25] =
    { "Metazoan mitochondrial","Cytosolic","Mammalian mitochondrial" };
@@ -9639,7 +10967,9 @@ void iopt_fastafile(data_set *d, csw *sw)
      "deleted -> standard",
      "Trematode Mitochondrial",
      "Scenedesmus obliquus Mitochondrial",
-     "Thraustochytrium Mitochondrial" };
+     "Thraustochytrium Mitochondrial",
+     "Pterobranchia mitochondrial",
+     "Gracilibacteria" };
   FILE *f = sw->f;
   init_tmrna(f,sw);
   aragorn = (sw->trna || sw->tmrna || sw->cds || sw->srprna);
@@ -9718,9 +11048,12 @@ void iopt_fastafile(data_set *d, csw *sw)
   sw->roffset = rewind;
   drewind = 2*rewind;
   d->ns = 0;
+  d->nf = 0;
   d->nextseq = 0L;
+  d->nextseqoff = 0L;
   while (d->nextseq >= 0L)
    { d->seqstart = d->nextseq;
+     d->seqstartoff = d->nextseqoff;
      if (!seq_init(d,sw)) break;
      psmax = d->psmax;
      if (sw->verbose)
@@ -9817,7 +11150,9 @@ void iopt_fastafile(data_set *d, csw *sw)
         while (s < se) *s++ = *sf++;
         start += len - drewind;
         goto NX; }
+     if (nt < 1) d->nf++;
      if (sw->maxintronlen > 0) remove_overlapping_trna(d,nt,sw);
+     if (sw->updatetmrnatags) update_tmrna_tag_database(ts,nt,sw);
      disp_gene_set(d,nt,sw);
      if (sw->verbose) fprintf(stderr,"%s\nSearch Finished\n\n",d->seqname);
      d->ns++; }
@@ -9831,50 +11166,71 @@ void iopt_fastafile(data_set *d, csw *sw)
      if (sw->reportpseudogenes)
       if (sw->nps > 0)
        fprintf(f,"Total number of possible pseudogenes = %d\n",sw->nps);
+     if (d->nf > 0)
+      { sens = 100.0*(d->ns - d->nf)/d->ns; 
+        fprintf(f,"Nothing found in %d sequences (%.2lf%% sensitivity)\n",d->nf,sens); }
      if (sw->annotated)
       { if (sw->trna | sw->mtrna) 
          { fprintf(f,"\nTotal number of annotated tRNA genes = %d\n",
                    sw->nagene[tRNA]);
-           fprintf(f,"Total number of annotated false negatives = %d\n",sw->natfn);
-           fprintf(f,"Total number of annotated false positives = %d\n",sw->natfp);
-           fprintf(f,"Total number of annotated DRL false positives = %d\n",
+           fprintf(f,"Total number of annotated tRNA genes not detected = %d\n",sw->nafn[tRNA]);
+           fprintf(f,"Total number of unannotated tRNA genes detected = %d\n",sw->nafp[tRNA]);
+           fprintf(f,"Total number of unannotated DRL tRNA genes detected = %d\n",
                    sw->natfpd);
-           fprintf(f,"Total number of annotated TVRL false positives = %d\n",
+           fprintf(f,"Total number of unannotated TVRL tRNA genes detected = %d\n",
                    sw->natfptv);
            fprintf(f,"Total annotated sequence length = %ld bases\n",sw->nabase);
-           sensitivity = (sw->nagene[tRNA] > 0)?
-                         100.0*(double)(sw->nagene[tRNA] - sw->natfn)/
+           sens = (sw->nagene[tRNA] > 0)?
+                         100.0*(double)(sw->nagene[tRNA] - sw->nafn[tRNA])/
                          (double)sw->nagene[tRNA]:0.0;
            sel1 = (sw->nagene[tRNA] > 0)?
-                         100.0*(double)(sw->natfp)/
+                         100.0*(double)(sw->nafp[tRNA])/
                          (double)sw->nagene[tRNA]:0.0;
            sel2 = (sw->nabase > 0)?
-                         1000000.0*(double)(sw->natfp)/
+                         1000000.0*(double)(sw->nafp[tRNA])/
                          (double)sw->nabase:0.0;
-           fprintf(f,"Sensitivity = %lg%%\n",sensitivity);
+           fprintf(f,"Sensitivity = %lg%%\n",sens);
+           fprintf(f,"Selectivity = %lg%% or %lg per Megabase\n\n",sel1,sel2); }
+        if (sw->tmrna) 
+         { fprintf(f,"\nTotal number of annotated tmRNA genes = %d\n",
+                   sw->nagene[tmRNA]);
+           fprintf(f,"Total number of annotated tmRNA genes not detected = %d\n",sw->nafn[tmRNA]);
+           fprintf(f,"Total number of unannotated tmRNA genes detected = %d\n",sw->nafp[tmRNA]);
+           fprintf(f,"Total annotated sequence length = %ld bases\n",sw->nabase);
+           sens = (sw->nagene[tmRNA] > 0)?
+                         100.0*(double)(sw->nagene[tmRNA] - sw->nafn[tmRNA])/
+                         (double)sw->nagene[tmRNA]:0.0;
+           sel1 = (sw->nagene[tmRNA] > 0)?
+                         100.0*(double)(sw->nafp[tmRNA])/
+                         (double)sw->nagene[tmRNA]:0.0;
+           sel2 = (sw->nabase > 0)?
+                         1000000.0*(double)(sw->nafp[tmRNA])/
+                         (double)sw->nabase:0.0;
+           fprintf(f,"Sensitivity = %lg%%\n",sens);
            fprintf(f,"Selectivity = %lg%% or %lg per Megabase\n\n",sel1,sel2); }
         if (sw->cds) 
          { fprintf(f,"\nTotal number of annotated CDS genes = %d\n",
                    sw->nagene[CDS]);
-           fprintf(f,"Total number of annotated false negatives = %d\n",sw->nacdsfn);
-           fprintf(f,"Total number of annotated false positives = %d\n",sw->nacdsfp);
+           fprintf(f,"Total number of annotated CDS genes not detected = %d\n",sw->nafn[CDS]);
+           fprintf(f,"Total number of unannotated CDS genes detected = %d\n",sw->nafp[CDS]);
            fprintf(f,"Total annotated sequence length = %ld bases\n",sw->nabase);
-           sensitivity = (sw->nagene[CDS] > 0)?
-                         100.0*(double)(sw->nagene[CDS] - sw->nacdsfn)/
+           sens = (sw->nagene[CDS] > 0)?
+                         100.0*(double)(sw->nagene[CDS] - sw->nafn[CDS])/
                          (double)sw->nagene[CDS]:0.0;
            sel1 = (sw->nagene[CDS] > 0)?
-                         100.0*(double)(sw->nacdsfp)/
+                         100.0*(double)(sw->nafp[CDS])/
                          (double)sw->nagene[CDS]:0.0;
            sel2 = (sw->nabase > 0)?
-                         1000000.0*(double)(sw->nacdsfp)/
+                         1000000.0*(double)(sw->nafp[CDS])/
                          (double)sw->nabase:0.0;
-           fprintf(f,"Sensitivity = %lg%%\n",sensitivity);
+           fprintf(f,"Sensitivity = %lg%%\n",sens);
            fprintf(f,"Selectivity = %lg%% or %lg per Megabase\n",sel1,sel2);
-           sensitivity = (sw->lacds > 0)?
+           sens = (sw->lacds > 0)?
                          100.0*(double)sw->ldcds/(double)sw->lacds:0.0;
-           fprintf(f,"Length sensitivity = %lg%%\n\n",sensitivity); }
+           fprintf(f,"Length sensitivity = %lg%%\n\n",sens); }
       } }
-  }
+  if (sw->updatetmrnatags) report_new_tmrna_tags(sw);
+}
 
 
 void bopt_fastafile(data_set *d, csw *sw)
@@ -9882,6 +11238,7 @@ void bopt_fastafile(data_set *d, csw *sw)
   int *s,*sf,*se,*sc,*swrap;
   int seq[2*LSEQ+WRAP+1],cseq[2*LSEQ+WRAP+1],wseq[2*WRAP+1];
   long gap,start,rewind,drewind,psmax,tmaxlen,vstart,vstop;
+  double sens;
   FILE *f = sw->f;
   rewind = MAXTAGDIST + 20;
   if (sw->trna | sw->mtrna)
@@ -9896,9 +11253,12 @@ void bopt_fastafile(data_set *d, csw *sw)
   sw->roffset = rewind;
   drewind = 2*rewind;
   d->ns = 0;
+  d->nf = 0;
   d->nextseq = 0L;
+  d->nextseqoff = 0L;
   while (d->nextseq >= 0L)
    { d->seqstart = d->nextseq;
+     d->seqstartoff = d->nextseqoff;
      if (!seq_init(d,sw)) break;
      psmax = d->psmax;
      if (sw->verbose)
@@ -9982,7 +11342,9 @@ void bopt_fastafile(data_set *d, csw *sw)
         while (s < se) *s++ = *sf++;
         start += len - drewind;
         goto NX; }
+     if (nt < 1) d->nf++;
      if (sw->maxintronlen > 0) remove_overlapping_trna(d,nt,sw);
+     if (sw->updatetmrnatags) update_tmrna_tag_database(ts,nt,sw);
      batch_gene_set(d,nt,sw);
      if (sw->verbose) fprintf(stderr,"%s\nSearch Finished\n\n",d->seqname);
      d->ns++; }
@@ -9990,168 +11352,19 @@ void bopt_fastafile(data_set *d, csw *sw)
    { fprintf(f,">end \t%d sequences",d->ns);
      if (sw->trna || sw->mtrna) fprintf(f," %d tRNA genes",sw->ngene[tRNA]);
      if (sw->tmrna) fprintf(f," %d tmRNA genes",sw->ngene[tmRNA]);
-     fputc('\n',f); } }
+     if (d->nf > 0)
+      { sens = 100.0*(d->ns - d->nf)/d->ns; 
+        fprintf(f,", nothing found in %d sequences, (%.2lf%% sensitivity)",d->nf,sens); }
+     fputc('\n',f); }
+  if (sw->updatetmrnatags) report_new_tmrna_tags(sw);
+}
 
 
 void aragorn_help_menu()
-{ printf("\n");
-  printf("----------------------------\n");
-  printf("ARAGORN v1.2.36 Dean Laslett\n");
-  printf("----------------------------\n");
-  printf("\n");
-  printf("Please reference the following papers if you use this\n");
-  printf("program as part of any published research.\n\n");
-  printf("Laslett, D. and Canback, B. (2004) ARAGORN, a\n");
-  printf("program for the detection of transfer RNA and transfer-messenger\n");
-  printf("RNA genes in nucleotide sequences\n");
-  printf("Nucleic Acids Research, 32;11-16\n\n");
-  printf("Laslett, D. and Canback, B. (2008) ARWEN: a\n");
-  printf("program to detect tRNA genes in metazoan mitochondrial\n");
-  printf("nucleotide sequences\n");
-  printf("Bioinformatics, 24(2); 172-175.\n\n\n");
-  printf("ARAGORN detects tRNA, mtRNA, and tmRNA genes.\n");
-  printf("\n");
-  printf("Usage:\n");
-  printf("aragorn -v -s -d -c -l -a -w -j -ifro<min>,<max> -t -mt -m");
-  printf(" -tv -gc -seq -br -fasta -fo -o <outfile> <filename>\n\n");
-  printf("<filename> is assumed to contain one or more sequences\n");
-  printf("in FASTA format. Results of the search are printed to\n");
-  printf("STDOUT. All switches are optional and case-insensitive.\n");
-  printf("Unless -i is specified, tRNA genes containing introns\n");
-  printf("are not detected. \n");
-  printf("\n");
-  printf("    -m            Search for tmRNA genes.\n");
-  printf("    -t            Search for tRNA genes.\n");
-  printf("                  By default, both are detected. If one of\n");
-  printf("                  -m or -t is specified, then the other\n");
-  printf("                  is not detected unless specified as well.\n");
-  printf("    -mt           Search for Metazoan mitochondrial tRNA\n");
-  printf("                  genes. -i switch ignored. Composite\n");
-  printf("                  Metazoan mitochondrial genetic code used.\n");
-  printf("    -mtmam        Search for Mammalian mitochondrial tRNA\n");
-  printf("                  genes. -i switch ignored. -tv switch set.\n");
-  printf("                  Mammalian mitochondrial genetic code used.\n");
-  printf("    -mtx          Same as -mt but low scoring tRNA genes are\n"); 
-  printf("                  not reported.\n");
-  printf("    -gc<num>      Use the GenBank transl_table = <num> genetic code.\n");
-  printf("    -gcstd        Use standard genetic code.\n");
-  printf("    -gcmet        Use composite Metazoan mitochondrial genetic code.\n");
-  printf("    -gcvert       Use Vertebrate mitochondrial genetic code.\n");
-  printf("    -gcinvert     Use Invertebrate mitochondrial genetic code.\n");
-  printf("    -gcyeast      Use Yeast mitochondrial genetic code.\n");
-  printf("    -gcprot       Use Mold/Protozoan/Coelenterate");
-  printf(" mitochondrial genetic code.\n");
-  printf("    -gcciliate    Use Ciliate genetic code.\n");
-  printf("    -gcflatworm   Use Echinoderm/Flatworm mitochondrial genetic code.\n");
-  printf("    -gceuplot     Use Euplotid genetic code.\n");
-  printf("    -gcbact       Use Bacterial/Plant Chloroplast genetic code.\n");
-  printf("    -gcaltyeast   Use alternative Yeast genetic code.\n");
-  printf("    -gcascid      Use Ascidian Mitochondrial genetic code.\n");
-  printf("    -gcaltflat    Use alternative Flatworm Mitochondrial genetic code.\n");
-  printf("    -gcblep       Use Blepharisma genetic code.\n");
-  printf("    -gcchloroph   Use Chlorophycean Mitochondrial genetic code.\n");
-  printf("    -gctrem       Use Trematode Mitochondrial genetic code.\n");
-  printf("    -gcscen       Use Scenedesmus obliquus Mitochondrial genetic code.\n");
-  printf("    -gcthraust    Use Thraustochytrium Mitochondrial genetic code.\n");
-  printf("                  Individual modifications can be appended using\n");
-  printf("    ,BBB=<aa>     B = A,C,G, or T. <aa> is the three letter\n");
-  printf("                  code for an amino-acid. More than one modification\n");
-  printf("                  can be specified. eg -gcvert,aga=Trp,agg=Trp uses\n");
-  printf("                  the Vertebrate Mitochondrial code and the codons\n");
-  printf("                  AGA and AGG changed to Tryptophan.\n");            
-  printf("    -tv           Do not search for mitochondrial ");
-  printf("TV replacement\n");
-  printf("                  loop tRNA genes. Only relevant if -mt used. \n");
-  printf("    -i            Search for tRNA genes with introns in\n");
-  printf("                  anticodon loop with maximum length %d\n",
-         MAXINTRONLEN);
-  printf("                  bases. Minimum intron length is 0 bases.\n");
-  printf("                  Ignored if -m is specified.\n");
-  printf("    -i<max>       Search for tRNA genes with introns in\n");
-  printf("                  anticodon loop with maximum length <max>\n");
-  printf("                  bases. Minimum intron length is 0 bases.\n");
-  printf("                  Ignored if -m is specified.\n");
-  printf("    -i<min>,<max> Search for tRNA genes with introns in\n");
-  printf("                  anticodon loop with maximum length <max>\n");
-  printf("                  bases, and minimum length <min> bases.\n");
-  printf("                  Ignored if -m is specified.\n");
-  printf("    -io           Same as -i, but allow tRNA genes with long\n");
-  printf("                  introns to overlap shorter tRNA genes.\n");
-  printf("    -if           Same as -i, but fix intron between positions\n");
-  printf("                  37 and 38 on C-loop (one base after anticodon).\n");
-  printf("    -ifo          Same as -if and -io combined.\n");
-  printf("    -ir           Same as -i, but search for tRNA genes with minimum intron\n");
-  printf("                  length 0 bases, and only report tRNA genes with minimum\n");
-  printf("                  intron length <min> bases.\n");
-  printf("    -c            Assume that each sequence has a circular\n");
-  printf("                  topology. Search wraps around each end.\n");
-  printf("                  Default setting.\n");
-  printf("    -l            Assume that each sequence has a linear\n");
-  printf("                  topology. Search does not wrap.\n");
-  printf("    -d            Double. Search both strands of each\n");
-  printf("                  sequence. Default setting.\n");
-  printf("    -s or -s+     Single. Do not search the complementary\n");
-  printf("                  (antisense) strand of each sequence.\n");
-  printf("    -sc or -s-    Single complementary. Do not search the sense\n");
-  printf("                  strand of each sequence.\n");
-  printf("    -ss           Use the stricter canonical 1-2 bp spacer1 and\n");
-  printf("                  1 bp spacer2. Ignored if -mt set. Default is to\n");
-  printf("                  allow 3 bp spacer1 and 0-2 bp spacer2, which may\n"); 
-  printf("                  degrade selectivity.\n");
-  printf("    -ps           Lower scoring thresholds to 95%% of default levels.\n"); 
-  printf("    -ps<num>      Change scoring thresholds to <num> percent of default levels.\n");
-  printf("    -rp           Flag possible pseudogenes (score < 100 or tRNA anticodon\n");
-  printf("                  loop <> 7 bases long). Note that genes with score < 100\n");
-  printf("                  will not be detected or flagged if scoring thresholds are not\n");
-  printf("                  also changed to below 100%% (see -ps switch).\n");
-  printf("    -seq          Print out primary sequence.\n");
-  printf("    -br           Show secondary structure of tRNA gene primary\n");
-  printf("                  sequence with round brackets.\n");
-  printf("    -fasta        Print out primary sequence in fasta format.\n");
-  printf("    -fo           Print out primary sequence in fasta format only\n");
-  printf("                  (no secondary structure).\n");
-  printf("    -fon          Same as -fo, with sequence and gene numbering in header.\n"); 
-  printf("    -fos          Same as -fo, with no spaces in header.\n"); 
-  printf("    -fons         Same as -fo, with sequence and gene numbering, but no spaces.\n");
-  printf("    -j            Display 4-base sequence on 3' end of astem\n");
-  printf("                  regardless of predicted amino-acyl acceptor\n");
-  printf("                  length.\n");
-  printf("    -jr           Allow some divergence of 3' ");
-  printf("amino-acyl acceptor\n");
-  printf("                  sequence from NCCA.\n");
-  printf("    -jr4          Allow some divergence of 3' ");
-  printf("amino-acyl acceptor\n");
-  printf("                  sequence from NCCA, and display 4 bases.\n");
-  printf("    -v            Verbose. Prints out search progress\n");
-  printf("                  to STDERR.\n");
-  printf("    -a            Print out tRNA domain for tmRNA genes\n");
-  printf("    -o <outfile>  print output into <outfile>. If <outfile>\n");
-  printf("                  exists, it is overwritten.\n");
-  printf("                  By default, output goes to STDOUT.\n");
-  printf("    -w            Print out genes in batch mode.\n");
-  printf("                  For tRNA genes, output is in the form:\n\n");
-  printf("                  Sequence name\n");
-  printf("                  N genes found\n");
-  printf("                  1 tRNA-<species> [locus 1]");
-  printf(" <Apos> (nnn)\n");
-  printf("                  i(<intron position>,<intron length>)\n");
-  printf("                            .          \n");
-  printf("                            .          \n");
-  printf("                  N tRNA-<species> [Locus N]");
-  printf(" <Apos> (nnn)\n");
-  printf("                  i(<intron position>,<intron length>)\n");
-  printf("\n                  N is the number of genes found\n");
-  printf("                  <species> is the tRNA iso-acceptor species\n");
-  printf("                  <Apos> is the tRNA anticodon ");
-  printf("relative position\n");
-  printf("                  (nnn) is the tRNA anticodon base triplet\n");
-  printf("                  i means the tRNA gene has a C-loop intron\n");
-  printf("\n                  For tmRNA genes, output is in the form:\n");
-  printf("\n                  n tmRNA(p) [Locus n] <tag offset>,");
-  printf("<tag end offset>\n");
-  printf("                  <tag peptide>\n\n");
-  printf("                  p means the tmRNA gene is permuted\n");
-  printf("\n\n"); }
+{
+int h;
+for (h = 0; h < NHELPLINE; h++) printf("%s\n",helpmenu[h]);
+}
 
 void error_report(int n, char *s)
 { switch(n)
@@ -10185,7 +11398,7 @@ void process_genecode_switch(char *s, csw *sw)
      "CILIATE","DELETED","DELETED","FLATWORM","EUPLOT",
      "BACT","ALTYEAST","ASCID","ALTFLAT","BLEP",
      "CHLOROPH","DELETED","DELETED","DELETED","DELETED",
-     "TREM","SCEN","THRAUST" };
+     "TREM","SCEN","THRAUST","PTERO","GRAC" };
   sw->geneticcode = STANDARD;
   sw->gcfix = 1;
   c = *s;
@@ -10257,11 +11470,13 @@ int main(int z, char *v[])
   char c1,c2,c3,c4,*s;
   data_set d;
   static csw sw =
-   { NULL,0,0,0,0,0,0,0,1,0,0,
+   { {"tRNA","tmRNA","","","CDS","overall"},
+     NULL,0,0,0,0,0,0,0,0,1,0,0,
      STANDARD,0,{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},0,METAZOAN_MT,
      1,0,5,5,1,0,0,0,2,0,0,0,0,0,0,3,0,2,1,1,0,0,0,0,0,0,0,0,1,
-     0,0,0,0,0,0,0,{0,0,0,0,0},0,0,{0,0,0,0,0},0,0,0,0,0,0,0,0,0L,
-     tRNAthresh,4.0,29.0,26.0,7.5,8.0,
+     0,0,0,0,0,0,0,{0,0,0,0,0,0},0,0,0,0,NTAG,10,30,
+     {0,0,0,0,0,0},{0,0,0,0,0,0},{0,0,0,0,0,0},0,0,0,0,0L,
+     100.0,tRNAthresh,4.0,29.0,26.0,7.5,8.0,
      mtRNAtthresh,mtRNAdthresh,mtRNAdtthresh,-7.9,-6.0,
      tmRNAthresh,14.0,10.0,25.0,9.0,srpRNAthresh,CDSthresh,
      {tRNAthresh,tmRNAthresh,srpRNAthresh,0.0,CDSthresh },
@@ -10269,7 +11484,7 @@ int main(int z, char *v[])
        45, 45, 45, 45, 45, 45, 45, 45, 45, 45,
        45, 45, 45, 45, 45, 45, 45, 45, 45, 45,
        10, 65, 82, 65, 71, 79, 82, 78, 32,
-       118, 49, 46, 50, 46, 51, 54, 32, 32, 32,
+       118, 49, 46, 50, 46, 51, 55, 32, 32, 32,
        68, 101, 97,110, 32, 76, 97, 115, 108,
        101, 116, 116, 10,
        45, 45, 45, 45, 45, 45, 45, 45, 45, 45,
@@ -10277,6 +11492,7 @@ int main(int z, char *v[])
        45, 45, 45, 45, 45, 45, 45, 45, 45, 45,
        10, TERM }};
   sw.f = stdout;
+  d.bugmode = 0;
   filecounter = 0;
   i = 0;
   while (++i < z)
@@ -10294,14 +11510,43 @@ int main(int z, char *v[])
          case  'A': if (c2 == '7') sw.extastem = 0;
                     else
                      if (c2 == 'A') sw.matchacceptor = 1;
-		             else sw.secstructdisp = 1;
+		             else 
+                      if (c2 == 'M')
+                       { l = 1L;
+                         if (c3 == 'T')
+                          { if (lv > 4) 
+                             { s = lconvert(s+3,&l);
+                               if (l < 1L) l = 1L;
+                               sw.trnalenmisthresh = (int)l; }
+                            else sw.trnalenmisthresh = 1; }
+                         else if (c3 == 'M')
+                          { if (lv > 4) 
+                             { s = lconvert(s+3,&l);
+                               if (l < 1L) l = 1L;
+                               sw.tmrnalenmisthresh = (int)l; }
+                            else sw.tmrnalenmisthresh = 1; }
+                         else if (lv > 3)
+                          { s = lconvert(s+2,&l);
+                            if (l < 1L) l = 1L;
+                            sw.trnalenmisthresh = (int)l;
+                            sw.tmrnalenmisthresh = (int)l; }
+                         else
+                          { sw.trnalenmisthresh = 1;
+                            sw.tmrnalenmisthresh = 1; }}
+                      else sw.secstructdisp = 1;
                     break;
          case  'B': if (c2 == 'R') sw.seqdisp = 2;
                     else sw.libflag = 1;
                     break;
          case  'X': sw.libflag = 2;
                     break;
-         case  'W': if (sw.batch < 1) sw.batch = 1;
+         case  'W': if (c2 == 'U')
+                     if (c3 == 'N')
+                      if (c4 == 'I')
+                       { d.bugmode = 1;
+                         break; }
+                    if (sw.batch < 1) sw.batch = 1;
+                    if (c2 == 'A') sw.batchfullspecies = 1;
                     break;
          case  'V': sw.verbose = 1;
                     break;
@@ -10419,8 +11664,12 @@ int main(int z, char *v[])
                           if (*s == ',') dconvert(s+1,&sw.mtdarmthresh); }}
                     else
                      { sw.tmrna = 1;
-                       if (lv > 2)
-                       dconvert(s+1,&sw.tmrnathresh); }
+                       if (c2 == 'U')
+                        if (c3 == 'T')
+                         { sw.updatetmrnatags = 1;
+                           lv -= 2;
+                           s += 2; }
+                       if (lv > 2) dconvert(s+1,&sw.tmrnathresh); }
                     break;
          case  'P': if (c2 == 'S')
                      { if (c3 != '-')
@@ -10438,7 +11687,10 @@ int main(int z, char *v[])
                     break;
          case  'R': if (c2 == 'N') sw.repeatsn = 1;
                     else 
-                     if (c2 == 'P') sw.reportpseudogenes = 1;
+                     if (c2 == 'P') 
+                      { sw.reportpseudogenes = 1;
+                        if (lv > 3)
+                         dconvert(s+2,&sw.pseudogenethresh); }
                      else sw.tmstrict = 0;
                     break;
          case  'Q': sw.showconfig = 0;
diff --git a/manpage.1.src b/manpage.1.src
new file mode 100644
index 0000000..26b9c6d
--- /dev/null
+++ b/manpage.1.src
@@ -0,0 +1,273 @@
+ARAGORN(1)
+==========
+
+NAME
+----
+
+aragorn - detect tRNA genes in nucleotide sequences
+
+
+SYNOPSIS
+--------
+
+*aragorn* ['OPTION']...  'FILE'
+
+
+OPTIONS
+-------
+
+*-m*::
+            Search for tmRNA genes.
+
+*-t*::
+            Search for tRNA genes.
+            By default, all are detected. If one of
+            *-m* or *-t* is specified, then the other
+            is not detected unless specified as well.
+*-mt*::
+            Search for Metazoan mitochondrial tRNA genes.
+            tRNA genes with introns not detected. *-i*, *-sr* switches
+            ignored. Composite Metazoan mitochondrial
+            genetic code used.
+
+*-mtmam*::
+            Search for Mammalian mitochondrial tRNA
+            genes. *-i*, *-sr* switches ignored. *-tv* switch set.
+            Mammalian mitochondrial genetic code used.
+
+*-mtx*::
+            Same as *-mt* but low scoring tRNA genes are
+            not reported.
+
+*-mtd*::
+            Overlapping metazoan mitochondrial tRNA genes
+            on opposite strands are reported.
+
+*-gc*['num']::
+            Use the GenBank transl_table = ['num'] genetic code.
+            Individual modifications can be appended using
+            ',BBB'=<aa>     B = A,C,G, or T. <aa> is the three letter
+            code for an amino-acid. More than one modification
+            can be specified. eg *-gcvert*,aga=Trp,agg=Trp uses
+            the Vertebrate Mitochondrial code and the codons
+            AGA and AGG changed to Tryptophan.
+
+*-gcstd*::
+            Use standard genetic code.
+*-gcmet*::
+            Use composite Metazoan mitochondrial genetic code.
+*-gcvert*::
+            Use Vertebrate mitochondrial genetic code.
+*-gcinvert*::
+            Use Invertebrate mitochondrial genetic code.
+*-gcyeast*::
+            Use Yeast mitochondrial genetic code.
+*-gcprot*::
+            Use Mold/Protozoan/Coelenterate mitochondrial genetic code.
+*-gcciliate*::
+            Use Ciliate genetic code.
+*-gcflatworm*::
+            Use Echinoderm/Flatworm mitochondrial genetic code.
+*-gceuplot*::
+            Use Euplotid genetic code.
+*-gcbact*::
+            Use Bacterial/Plant Chloroplast genetic code.
+*-gcaltyeast*::
+            Use alternative Yeast genetic code.
+*-gcascid*::
+            Use Ascidian Mitochondrial genetic code.
+*-gcaltflat*::
+            Use alternative Flatworm Mitochondrial genetic code.
+*-gcblep*::
+            Use Blepharisma genetic code.
+*-gcchloroph*::
+            Use Chlorophycean Mitochondrial genetic code.
+*-gctrem*::
+            Use Trematode Mitochondrial genetic code.
+*-gcscen*::
+            Use Scenedesmus obliquus Mitochondrial genetic code.
+*-gcthraust*::
+            Use Thraustochytrium Mitochondrial genetic code.
+
+*-tv*::
+            Do not search for mitochondrial TV replacement loop tRNA genes. Only relevant if *-mt* used.
+
+*-c7*::
+            Search for tRNA genes with 7 base C-loops only.
+
+*-i*::
+            Search for tRNA genes with introns in
+            anticodon loop with maximum length 3000
+            bases. Minimum intron length is 0 bases.
+            Ignored if *-m* is specified.
+
+*-i*['max']::
+            Search for tRNA genes with introns in
+            anticodon loop with maximum length ['max']
+            bases. Minimum intron length is 0 bases.
+            Ignored if *-m* is specified.
+
+*-i*['min'],['max']::
+            Search for tRNA genes with introns in
+            anticodon loop with maximum length ['max']
+            bases, and minimum length ['min'] bases.
+            Ignored if *-m* is specified.
+
+*-io*::
+            Same as *-i*, but allow tRNA genes with long
+            introns to overlap shorter tRNA genes.
+
+*-if*::
+            Same as *-i*, but fix intron between positions
+            37 and 38 on C-loop (one base after anticodon).
+
+*-ifo*::
+            Same as *-if* and *-io* combined.
+
+*-ir*::
+            Same as *-i*, but report tRNA genes with minimum
+            length ['min'] bases rather than search for
+            tRNA genes with minimum length ['min'] bases.
+            With this switch, ['min'] acts as an output filter,
+            minimum intron length for searching is still 0 bases.
+
+*-c*::
+            Assume that each sequence has a circular
+            topology. Search wraps around each end.
+            Default setting.
+
+*-l*::
+            Assume that each sequence has a linear
+            topology. Search does not wrap.
+
+*-d*::
+            Double. Search both strands of each
+            sequence. Default setting.
+
+*-s*  or *-s+*::
+            Single. Do not search the complementary
+            (antisense) strand of each sequence.
+
+*-sc* or *-s-*::
+            Single complementary. Do not search the sense
+            strand of each sequence.
+
+*-ps*::
+            Lower scoring thresholds to 95% of default levels.
+
+*-ps*['num']::
+            Change scoring thresholds to ['num'] percent of
+            default levels.
+
+*-rp*::
+            Flag possible pseudogenes (score < 100 or tRNA anticodon
+            loop <> 7 bases long). Note that genes with score < 100
+            will not be detected or flagged if scoring thresholds are not
+            also changed to below 100% (see -ps switch).
+
+*-seq*::
+            Print out primary sequence.
+
+*-br*::
+            Show secondary structure of tRNA gene primary sequence
+            using round brackets.
+
+*-fasta*::
+            Print out primary sequence in fasta format.
+*-fo*::
+            Print out primary sequence in fasta format only
+            (no secondary structure).
+
+*-fon*::
+            Same as *-fo*, with sequence and gene numbering in header.
+
+*-fos*::
+            Same as *-fo*, with no spaces in header.
+
+*-fons*::
+            Same as *-fo*, with sequence and gene numbering, but no
+            spaces.
+
+*-w*::
+            Print out in Batch mode.
+
+*-ss*::
+            Use the stricter canonical 1-2 bp spacer1 and
+            1 bp spacer2. Ignored if *-mt* set. Default is to
+            allow 3 bp spacer1 and 0-2 bp spacer2, which may
+            degrade selectivity.
+
+*-v*::
+            Verbose. Prints out information during
+            search to STDERR.
+
+*-a*::
+            Print out tRNA domain for tmRNA genes.
+
+*-a7*::
+            Restrict tRNA astem length to a maximum of 7 bases.
+
+*-aa*::
+            Display message if predicted iso-acceptor species
+            does not match species in sequence name (if present).
+
+*-j*::
+            Display 4-base sequence on 3' end of astem
+            regardless of predicted amino-acyl acceptor length.
+
+*-jr*::
+            Allow some divergence of 3' amino-acyl acceptor
+            sequence from NCCA.
+
+*-jr4*::
+            Allow some divergence of 3' amino-acyl acceptor
+            sequence from NCCA, and display 4 bases.
+
+*-q*::
+            Don't print configuration line (which switches
+            and files were used).
+*-rn*::
+            Repeat sequence name before summary information.
+
+*-O* ['outfile']::
+            Print output to ['outfile']. If ['outfile']
+            already exists, it is overwritten.  By default
+            all output goes to stdout.
+
+DESCRIPTION
+-----------
+
+aragorn detects tRNA, mtRNA, and tmRNA genes.
+A minimum requirement is at least a 32 bit compiler architecture
+(variable types int and unsigned int are at least 4 bytes long).
+
+['FILE'] is assumed to contain one or more sequences
+in FASTA format. Results of the search are printed to
+STDOUT. All switches are optional and case-insensitive.
+Unless -i is specified, tRNA genes containing introns
+are not detected.
+
+
+AUTHORS
+-------
+
+Bjorn Canback <bcanback at acgt.se>, Dean Laslett <gaiaquark at gmail.com>
+
+
+REFERENCES
+----------
+
+Laslett, D. and Canback, B. (2004) ARAGORN, a
+program for the detection of transfer RNA and transfer-messenger
+RNA genes in nucleotide sequences
+Nucleic Acids Research, 32;11-16
+
+Laslett, D. and Canback, B. (2008) ARWEN: a
+program to detect tRNA genes in metazoan mitochondrial
+nucleotide sequences
+Bioinformatics, 24(2); 172-175.
+
+
+
+
+
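
Editor's note on the *-i* switch documented in the man page added by this commit: it takes three numeric forms (*-i*, *-i*['max'], *-i*['min'],['max']). The sketch below shows one way such an argument could be split into a minimum and maximum intron length. It is an illustration only and is not taken from aragorn1.2.37.c: the names `intron_range`, `parse_intron_range` and `DEFAULT_MAX_INTRON` are hypothetical, the default of 3000 is the figure quoted in the man page text, and the suffixed variants (*-io*, *-if*, *-ifo*, *-ir*) are deliberately rejected rather than handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Default maximum taken from the man page wording ("maximum length 3000");
   the real program uses its own MAXINTRONLEN constant. */
#define DEFAULT_MAX_INTRON 3000

/* Hypothetical container for the parsed limits. */
typedef struct {
    long min_len;
    long max_len;
} intron_range;

/* Parse the text that follows "-i": "", "<max>" or "<min>,<max>".
   Returns 0 on success and -1 if the text is malformed.
   On failure, *out is left holding the defaults. */
int parse_intron_range(const char *arg, intron_range *out)
{
    char *end;
    long first, second;

    assert(arg != NULL && out != NULL);

    /* Plain -i: minimum 0, maximum at the documented default. */
    out->min_len = 0;
    out->max_len = DEFAULT_MAX_INTRON;
    if (*arg == '\0') return 0;

    /* Anything that does not start with a digit (for example the
       "o" of -io) is outside the scope of this sketch. */
    if (!isdigit((unsigned char)*arg)) return -1;
    first = strtol(arg, &end, 10);

    /* -i<max>: a single number sets only the maximum. */
    if (*end == '\0') {
        out->max_len = first;
        return 0;
    }

    /* -i<min>,<max>: two numbers separated by a comma. */
    if (*end != ',' || !isdigit((unsigned char)end[1])) return -1;
    second = strtol(end + 1, &end, 10);
    if (*end != '\0' || second < first) return -1;

    out->min_len = first;
    out->max_len = second;
    return 0;
}
```

For example, passing "10,200" yields a minimum of 10 and a maximum of 200, while an empty string yields 0 and 3000. Range overflow in `strtol` is not checked here.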

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/aragorn.git