[med-svn] r11400 - in trunk/packages/meme/trunk/debian: . glam2_manpages meme_manpages

Thorsten Alteholz alteholz at alioth.debian.org
Wed Jun 20 20:06:47 UTC 2012


Author: alteholz
Date: 2012-06-20 20:06:46 +0000 (Wed, 20 Jun 2012)
New Revision: 11400

Added:
   trunk/packages/meme/trunk/debian/glam2_manpages/glam2html.1
   trunk/packages/meme/trunk/debian/glam2_manpages/glam2psfm.1
   trunk/packages/meme/trunk/debian/glam2_manpages/glam2scan2html.1
   trunk/packages/meme/trunk/debian/meme.manpages
   trunk/packages/meme/trunk/debian/meme_manpages/
   trunk/packages/meme/trunk/debian/meme_manpages/meme-get-motif.1
   trunk/packages/meme/trunk/debian/meme_manpages/meme-xml-html.1
   trunk/packages/meme/trunk/debian/meme_manpages/meme.1
   trunk/packages/meme/trunk/debian/meme_manpages/meme.bin.1
   trunk/packages/meme/trunk/debian/meme_manpages/meme2images.1
Modified:
   trunk/packages/meme/trunk/debian/rules
Log:
some man pages for meme

Added: trunk/packages/meme/trunk/debian/glam2_manpages/glam2html.1
===================================================================
--- trunk/packages/meme/trunk/debian/glam2_manpages/glam2html.1	                        (rev 0)
+++ trunk/packages/meme/trunk/debian/glam2_manpages/glam2html.1	2012-06-20 20:06:46 UTC (rev 11400)
@@ -0,0 +1,33 @@
+.TH "GLAM2HTML" "1" "06/20/2012" "GLAM2 1056" "glam2 Manual"
+.\" disable hyphenation
+.nh
+.\" disable justification (adjust text to left margin only)
+.ad l
+.SH "NAME"
+glam2html \- convert GLAM2 output to html
+.SH "SYNOPSIS"
+.HP 10
+\fBglam2html\fR 
+.SH "DESCRIPTION"
+.PP
+
+\fBglam2html\fR
+reads GLAM2 output from STDIN and writes HTML output to STDOUT.
+.SH "REFERENCE"
+.PP
+If you use GLAM2, please cite: MC Frith, NFW Saunders, B Kobe, TL Bailey (2008) Discovering sequence motifs with arbitrary insertions and deletions, PLoS Computational Biology (in press)\&.
+.SH "AUTHORS"
+.PP
+\fBMartin Frith\fR
+.sp -1n
+.IP "" 4
+Author of GLAM2\&.
+.PP
+\fBTimothy Bailey\fR
+.sp -1n
+.IP "" 4
+Author of GLAM2\&.
+.SH "COPYRIGHT"
+.PP
+The source code and the documentation of GLAM2 are released in the public domain\&.
+.sp

Added: trunk/packages/meme/trunk/debian/glam2_manpages/glam2psfm.1
===================================================================
--- trunk/packages/meme/trunk/debian/glam2_manpages/glam2psfm.1	                        (rev 0)
+++ trunk/packages/meme/trunk/debian/glam2_manpages/glam2psfm.1	2012-06-20 20:06:46 UTC (rev 11400)
@@ -0,0 +1,34 @@
+.TH "GLAM2PSFM" "1" "06/20/2012" "GLAM2 1056" "glam2 Manual"
+.\" disable hyphenation
+.nh
+.\" disable justification (adjust text to left margin only)
+.ad l
+.SH "NAME"
+glam2psfm \- convert GLAM2 output to MEME PSFM format
+.SH "SYNOPSIS"
+.HP 10
+\fBglam2psfm\fR <filename>
+.SH "DESCRIPTION"
+.PP
+
+\fBglam2psfm\fR
+reads GLAM2 output from <filename> and writes it in MEME's PSFM format.
+This can be used as input to TOMTOM.
+.SH "REFERENCE"
+.PP
+If you use GLAM2, please cite: MC Frith, NFW Saunders, B Kobe, TL Bailey (2008) Discovering sequence motifs with arbitrary insertions and deletions, PLoS Computational Biology (in press)\&.
+.SH "AUTHORS"
+.PP
+\fBMartin Frith\fR
+.sp -1n
+.IP "" 4
+Author of GLAM2\&.
+.PP
+\fBTimothy Bailey\fR
+.sp -1n
+.IP "" 4
+Author of GLAM2\&.
+.SH "COPYRIGHT"
+.PP
+The source code and the documentation of GLAM2 are released in the public domain\&.
+.sp

Added: trunk/packages/meme/trunk/debian/glam2_manpages/glam2scan2html.1
===================================================================
--- trunk/packages/meme/trunk/debian/glam2_manpages/glam2scan2html.1	                        (rev 0)
+++ trunk/packages/meme/trunk/debian/glam2_manpages/glam2scan2html.1	2012-06-20 20:06:46 UTC (rev 11400)
@@ -0,0 +1,33 @@
+.TH "GLAM2SCAN2HTML" "1" "06/20/2012" "GLAM2 1056" "glam2 Manual"
+.\" disable hyphenation
+.nh
+.\" disable justification (adjust text to left margin only)
+.ad l
+.SH "NAME"
+glam2scan2html \- convert GLAM2SCAN output to html
+.SH "SYNOPSIS"
+.HP 10
+\fBglam2scan2html\fR 
+.SH "DESCRIPTION"
+.PP
+
+\fBglam2scan2html\fR
+reads GLAM2SCAN output from STDIN and writes HTML output to STDOUT.
+.SH "REFERENCE"
+.PP
+If you use GLAM2, please cite: MC Frith, NFW Saunders, B Kobe, TL Bailey (2008) Discovering sequence motifs with arbitrary insertions and deletions, PLoS Computational Biology (in press)\&.
+.SH "AUTHORS"
+.PP
+\fBMartin Frith\fR
+.sp -1n
+.IP "" 4
+Author of GLAM2\&.
+.PP
+\fBTimothy Bailey\fR
+.sp -1n
+.IP "" 4
+Author of GLAM2\&.
+.SH "COPYRIGHT"
+.PP
+The source code and the documentation of GLAM2 are released in the public domain\&.
+.sp

Added: trunk/packages/meme/trunk/debian/meme.manpages
===================================================================
--- trunk/packages/meme/trunk/debian/meme.manpages	                        (rev 0)
+++ trunk/packages/meme/trunk/debian/meme.manpages	2012-06-20 20:06:46 UTC (rev 11400)
@@ -0,0 +1 @@
+debian/meme_manpages/*

Added: trunk/packages/meme/trunk/debian/meme_manpages/meme-get-motif.1
===================================================================
--- trunk/packages/meme/trunk/debian/meme_manpages/meme-get-motif.1	                        (rev 0)
+++ trunk/packages/meme/trunk/debian/meme_manpages/meme-get-motif.1	2012-06-20 20:06:46 UTC (rev 11400)
@@ -0,0 +1,29 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.40.10.
+.TH MEME-GET-MOTIF "1" "June 2012" "meme-get-motif" "User Commands"
+.SH NAME
+meme\-get\-motif \- extract motifs from a MEME motif database or output file
+.SH DESCRIPTION
+.IP
+USAGE:
+.IP
+meme\-get\-motif [options]
+.TP
+[\-id <motif_id>]+
+id of motif to extract from MEME .txt file
+.TP
+[\-all]
+get all motifs in the MEME .txt file
+.TP
+[\-noll]
+MEME file is missing log\-odds matrices
+.IP
+Extract motifs from a MEME\-formatted motif database
+or from a MEME output file (.txt format).
+.IP
+Reads standard input.
+Writes standard output.
+.IP
+Copyright
+(2006) The University of Queensland
+All Rights Reserved.
+Author: Timothy L. Bailey

Added: trunk/packages/meme/trunk/debian/meme_manpages/meme-xml-html.1
===================================================================
--- trunk/packages/meme/trunk/debian/meme_manpages/meme-xml-html.1	                        (rev 0)
+++ trunk/packages/meme/trunk/debian/meme_manpages/meme-xml-html.1	2012-06-20 20:06:46 UTC (rev 11400)
@@ -0,0 +1,18 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.40.10.
+.TH MEME-XML-HTML "1" "June 2012" "meme-xml-html" "User Commands"
+.SH NAME
+meme\-xml\-html \- convert MEME XML output to HTML
+.SH DESCRIPTION
+.SS "USAGE:"
+.IP
+.nf
+meme\-xml\-html [options]
+[\-xml <xml>]    name of xml file
+[\-xsl <xsl>]    name of xsl file
+[\-html <html>]  name of html file
+.fi
+.PP
+Convert MEME XML to HTML using the given style sheet.
+.PP
+Copyright (2007) The University of Queensland. All Rights Reserved.
+Author: Timothy L. Bailey

Added: trunk/packages/meme/trunk/debian/meme_manpages/meme.1
===================================================================
--- trunk/packages/meme/trunk/debian/meme_manpages/meme.1	                        (rev 0)
+++ trunk/packages/meme/trunk/debian/meme_manpages/meme.1	2012-06-20 20:06:46 UTC (rev 11400)
@@ -0,0 +1,899 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.40.10.
+.TH MEME "1" "June 2012" "Multiple EM for Motif Elicitation" "User Commands"
+.SH NAME
+meme \- Multiple EM for Motif Elicitation
+.SH DESCRIPTION
+.PP
+USAGE:
+.PP
+.IP
+ meme <dataset> [optional arguments]
+.IP
+
+.TP
+<dataset>
+file containing sequences in FASTA format
+.TP
+[\-h]
+print this message
+.TP
+[\-o <output dir>]
+name of directory for output files; will not
+replace existing directory
+.TP
+[\-oc <output dir>]
+name of directory for output files; will
+replace existing directory
+.TP
+[\-text]
+output in text format (default is HTML)
+.TP
+[\-dna]
+sequences use DNA alphabet
+.TP
+[\-protein]
+sequences use protein alphabet
+.TP
+[\-mod oops|zoops|anr]
+distribution of motifs
+.TP
+[\-nmotifs <nmotifs>]
+maximum number of motifs to find
+.TP
+[\-evt <ev>]
+stop if motif E\-value greater than <ev>
+.TP
+[\-nsites <sites>]
+number of sites for each motif
+.TP
+[\-minsites <minsites>]
+minimum number of sites for each motif
+.TP
+[\-maxsites <maxsites>]
+maximum number of sites for each motif
+.TP
+[\-wnsites <wnsites>]
+weight on expected number of sites
+.TP
+[\-w <w>]
+motif width
+.TP
+[\-minw <minw>]
+minimum motif width
+.TP
+[\-maxw <maxw>]
+maximum motif width
+.TP
+[\-nomatrim]
+do not adjust motif width using multiple
+alignments
+.TP
+[\-wg <wg>]
+gap opening cost for multiple alignments
+.TP
+[\-ws <ws>]
+gap extension cost for multiple alignments
+.TP
+[\-noendgaps]
+do not count end gaps in multiple alignments
+.TP
+[\-bfile <bfile>]
+name of background Markov model file
+.TP
+[\-revcomp]
+allow sites on + or \- DNA strands
+.TP
+[\-pal]
+force palindromes (requires \fB\-dna\fR)
+.TP
+[\-maxiter <maxiter>]
+maximum EM iterations to run
+.TP
+[\-distance <distance>]
+EM convergence criterion
+.TP
+[\-psp <pspfile>]
+name of positional priors file
+.IP
+.nf
+[\-prior dirichlet|dmix|mega|megap|addone]
+                        type of prior to use
+[\-b <b>]                strength of the prior
+[\-plib <plib>]          name of Dirichlet prior file
+[\-spfuzz <spfuzz>]      fuzziness of sequence to theta mapping
+[\-spmap uni|pam]        starting point seq to theta mapping type
+[\-cons <cons>]          consensus sequence to start EM from
+[\-heapsize <hs>]        size of heaps for widths where substring search occurs
+.fi
+.TP
+[\-x_branch]
+perform x\-branching
+.TP
+[\-w_branch]
+perform width branching
+.TP
+[\-bfactor <bf>]
+branching factor for branching search
+.TP
+[\-maxsize <maxsize>]
+maximum dataset size in characters
+.TP
+[\-nostatus]
+do not print progress reports to terminal
+.TP
+[\-p <np>]
+use parallel version with <np> processors
+.TP
+[\-time <t>]
+quit before <t> CPU seconds consumed
+.TP
+[\-sf <sf>]
+print <sf> as name of sequence file
+.TP
+[\-V]
+verbose mode
+.IP
+MEME is a tool for discovering motifs in a group of related DNA or protein
+sequences.
+.IP
+A motif is a sequence pattern that occurs repeatedly in a group of related
+protein or DNA sequences. MEME represents motifs as position\-dependent
+letter\-probability matrices which describe the probability of each
+possible letter at each position in the pattern. Individual MEME motifs do
+not contain gaps. Patterns with variable\-length gaps are split by MEME
+into two or more separate motifs.
+.IP
+MEME takes as input a group of DNA or protein sequences (the training set)
+and outputs as many motifs as requested. MEME uses statistical modeling
+techniques to automatically choose the best width, number of occurrences,
+and description for each motif.
+.IP
+MEME outputs its results primarily as a hypertext (HTML) document named
+meme.html. This is placed in a directory named meme_out/. You can choose
+a different name for the directory. MEME also outputs machine\-readable
+(XML) and plain\-text versions of its output, named meme.xml and meme.txt,
+respectively. These are placed in the same output directory as
+meme.html.
+.IP
+The MEME results consist of:
+.RS
+.IP \(bu 2
+The version of MEME and the date it was released.
+.IP \(bu 2
+The reference to cite if you use MEME in your research.
+.IP \(bu 2
+A description of the sequences you submitted (the "training set")
+showing the name, "weight" and length of each sequence.
+.IP \(bu 2
+The command line summary detailing the parameters with which you ran
+MEME.
+.IP \(bu 2
+Information on each of the motifs MEME discovered, including:
+.RS
+.IP 1. 3
+A summary line showing the width, number of occurrences, log
+likelihood ratio and statistical significance of the motif.
+.IP 2. 3
+A sequence "LOGO" illustrating the motif, along with links to
+publication\-ready versions of the LOGO (postscript and PNG
+formats).
+.IP 3. 3
+The occurrences of the motif sorted by p\-value and aligned with
+each other.
+.IP 4. 3
+Block diagrams of the occurrences of the motif within each
+sequence in the training set.
+.IP 5. 3
+The motif in BLOCKS format: buttons for viewing the motif in
+various formats and for submitting it to the BLOCKS multiple
+alignment processor.
+.IP 6. 3
+A position\-specific scoring matrix (PSSM) for use in scanning
+sequence databases and a button for submitting the motif to the
+MAST scanning program.
+.IP 7. 3
+The position specific probability matrix (PSPM) describing
+the motif and a button for comparing the motif to known
+motifs.
+.IP 8. 3
+A regular expression describing the motif.
+.RE
+.IP \(bu 2
+A summary of motifs showing an optimized (non\-overlapping)
+tiling of all of the motifs onto each of the sequences in the
+training set.
+.IP \(bu 2
+The reason why MEME stopped and the name of the CPU on which
+it ran.
+.IP \(bu 2
+This explanation of how to interpret MEME results.
+.RE
+.PP
+REQUIRED ARGUMENTS:
+.IP
+*<dataset> The name of the file containing the training set sequences.
+.IP
+If <dataset> is the word stdin, MEME reads from standard input.
+.IP
+The sequences in the dataset should be in Pearson/FASTA format. For example:
+.IP
+.nf
+>ICYA_MANSE INSECTICYANIN A FORM (BLUE BILIPROTEIN)
+GDIFYPGYCPDVKPVNDFDLSAFAGAWHEIAK
+LPLENENQGKCTIAEYKYDGKKASVYNSFVSNGVKEYMEGDLEIAPDA
+>LACB_BOVIN BETA\-LACTOGLOBULIN PRECURSOR (BETA\-LG)
+MKCLLLALALTCGAQALIVTQTMKGLDI
+QKVAGTWYSLAMAASDISLLDAQSAPLRVYVEELKPTPEGDLEILLQKW
+.fi
+.IP
+Sequences start with a header line followed by sequence lines. A header
+line has the character ">" in position one, followed by a unique name
+without any spaces, followed by (optional) descriptive text. After the
+header line come the actual sequence lines. Spaces and blank lines are
+ignored. Sequences may be in capital or lowercase or both.
+.IP
+MEME uses the first word in the header line of each sequence,
+truncated to 24 characters if necessary, as the name of the sequence.
+This name must be unique. Sequences with duplicate names will be
+ignored. (The first word in the header line is everything following the
+">" up to the first blank.)
+.IP
+Sequence weights may be specified in the dataset file by special header
+lines where the unique name is "WEIGHTS" (all caps) and the descriptive
+text is a list of sequence weights. Sequence weights are numbers in the
+range 0 < w <= 1. All weights are assigned in order to the sequences in
+the file. If there are more sequences than weights, the remainder are
+given weight one. Weights may be specified by more than one "WEIGHTS"
+entry, which may appear anywhere in the file. When weights are used,
+sequences will contribute to motifs in proportion to their weights. Here
+is an example for a file of three sequences where the first two
+sequences are very similar and it is desired to down\-weight them:
+.IP
+.nf
+>WEIGHTS 0.5 .5 1.0
+>seq1
+GDIFYPGYCPDVKPVNDFDLSAFAGAWHEIAK
+>seq2
+GDMFCPGYCPDVKPVGDFDLSAFAGAWHELAK
+>seq3
+QKVAGTWYSLAMAASDISLLDAQSAPLRVYVEELKPTPEGDLEILLQKW
+.fi
+.PP
+OPTIONAL ARGUMENTS:
+.IP
+MEME has a large number of optional inputs that can be used to fine\-tune
+its behavior. To make these easier to understand they are divided into the
+following categories:
+.RS
+.IP \(bu 2
+OUTPUT DESTINATION: control where MEME places its output
+.IP \(bu 2
+ALPHABET: control the alphabet for the motifs (patterns) that MEME
+will search for
+.IP \(bu 2
+DISTRIBUTION: control how MEME assumes the occurrences of the motifs
+are distributed throughout the training set sequences
+.IP \(bu 2
+SEARCH: control how MEME searches for motifs
+.IP \(bu 2
+SYSTEM: the \fB\-p\fR <np> argument causes a version of MEME compiled for a
+parallel CPU architecture to be run. (By placing <np> in quotes you may pass
+installation specific switches to the 'mpirun' command. The number of
+processors to run on must be the first argument following \fB\-p\fR.)
+.RE
+.IP
+In what follows, <n> is an integer, <a> is a decimal number, and <string>
+is a string of characters.
+.IP
+OUTPUT DESTINATION
+.IP
+By default MEME writes its results to a directory named meme_out/, which
+is created if it doesn't exist. The main results file is
+meme.html. You can specify that a different directory be created or used
+for results. You can also specify that MEME create only a text file.
+.RS
+.IP \(bu 2
+\fB\-o\fR <output dir>  Name of directory for output files; will not replace
+existing directory.
+.IP \(bu 2
+\fB\-oc\fR <output dir>  Name of directory for output files; will replace
+existing directory.
+.IP \(bu 2
+\fB\-text\fR  Output in text format only to standard output.
+.RE
+.IP
+ALPHABET
+.IP
+MEME accepts either DNA or protein sequences, but not both in the same
+run. By default, sequences are assumed to be protein. The sequences must
+be in FASTA format.
+.IP
+DNA sequences must contain only the letters "ACGT", plus the ambiguous
+letters "BDHKMNRSUVWY*\-". Protein sequences must contain only the letters
+"ACDEFGHIKLMNPQRSTVWY", plus the ambiguous letters "BUXZ*\-".
+.IP
+MEME converts all ambiguous letters to "X", which is treated as "unknown".
+.IP
+*\-dna  Assume sequences are DNA; default: protein sequences.
+*\-protein  Assume sequences are protein.
+.IP
+DISTRIBUTION
+.IP
+If you know how occurrences of motifs are distributed in the training set
+sequences, you can specify it with the following optional switches. The
+default distribution of motif occurrences is assumed to be zero or one
+occurrence per sequence.
+.IP
+\fB\-mod\fR <string>  The type of distribution to assume.
+.RS
+.TP
+oops
+One Occurrence Per Sequence. MEME assumes that each
+sequence in the dataset contains exactly one occurrence of each
+motif. This option is the fastest and most sensitive but the
+motifs returned by MEME may be "blurry" if any of the sequences
+is missing them.
+.TP
+zoops
+Zero or One Occurrence Per Sequence. MEME assumes that
+each sequence may contain at most one occurrence of each motif.
+This option is useful when you suspect that some motifs may be
+missing from some of the sequences. In that case, the motifs
+found will be more accurate than using the first option. This
+option takes more computer time than the first option (about
+twice as much) and is slightly less sensitive to weak motifs
+present in all of the sequences.
+.TP
+anr
+Any Number of Repetitions. MEME assumes each sequence may
+contain any number of non\-overlapping occurrences of each motif.
+This option is useful when you suspect that motifs repeat
+multiple times within a single sequence. In that case, the motifs
+found will be much more accurate than using one of the other
+options. This option can also be used to discover repeats within
+a single sequence. This option takes much more computer time
+than the first option (about ten times as much) and is somewhat
+less sensitive to weak motifs which do not repeat within a single
+sequence than the other two options.
+.RE
+.IP
+SEARCH
+.IP
+1.OBJECTIVE FUNCTION
+.IP
+MEME uses an objective function on motifs to select the "best" motif.
+The objective function is based on the statistical significance of the
+log likelihood ratio (LLR) of the occurrences of the motif. The
+E\-value of the motif is an estimate of the number of motifs (with the
+same width and number of occurrences) that would have equal or higher
+log likelihood ratio if the training set sequences had been generated
+randomly according to the (0\-order portion of the) background model.
+.IP
+MEME searches for the motif with the smallest E\-value. It searches
+over different motif widths, numbers of occurrences, and positions in
+the training set for the motif occurrences. The user may limit the
+range of motif widths and number of occurrences that MEME tries using
+the switches described below. In addition, MEME trims the motif (using
+a dynamic programming multiple alignment) to eliminate any positions
+where there is a gap in any of the occurrences.
+.IP
+The log likelihood ratio of a motif is llr = log (Pr(sites | motif) /
+Pr(sites | back)) and is a measure of how different the sites are from
+the background model. Pr(sites | motif) is the probability of the
+occurrences given a model consisting of the position\-specific
+probability matrix (PSPM) of the motif. (The PSPM is output by MEME).
+Pr(sites | back) is the probability of the occurrences given the
+background model. The background model is an n\-order Markov model. By
+default, it is a 0\-order model consisting of the frequencies of the
+letters in the training set. A different 0\-order Markov model or
+higher order Markov models can be specified to MEME using the \fB\-bfile\fR
+option described below.
+.IP
+The E\-value reported by MEME is actually an approximation of the
+E\-value of the log likelihood ratio. (An approximation is used because
+it is far more efficient to compute.) The approximation is based on
+the fact that the log likelihood ratio of a motif is the sum of the
+log likelihood ratios of each column of the motif. Instead of
+computing the statistical significance of this sum (its p\-value), MEME
+computes the p\-value of each column and then computes the significance
+of their product. Although not identical to the significance of the
+log likelihood ratio, this easier to compute objective function works
+very similarly in practice.
+.IP
+The motif significance is reported as the E\-value of the motif. The
+statistical significance of a motif is computed based on:
+.IP
+.nf
+1. the log likelihood ratio,
+2. the width of the motif,
+3. the number of occurrences,
+4. the 0\-order portion of the background model,
+5. the size of the training set, and
+6. the type of model (oops, zoops, or anr, which determines the number of
+   possible different motifs of the given width and number of occurrences).
+.fi
+.IP
+MEME searches for motifs by performing Expectation Maximization (EM)
+on a motif model of a fixed width and using an initial estimate of the
+number of sites. It then sorts the possible sites according to their
+probability according to EM. MEME then calculates the E\-values of
+the first n sites in the sorted list for different values of n. This
+procedure (first EM, followed by computing E\-values for different
+numbers of sites) is repeated with different widths and different
+initial estimates of the number of sites. MEME outputs the motif with
+the lowest E\-value.
+.IP
+2.NUMBER OF MOTIFS
+.IP
+*\-nmotifs <n>  The number of *different* motifs to search for.
+.IP
+MEME will search for and output <n> motifs. Default: 1
+.IP
+*\-evt <p>  Quit looking for motifs if E\-value exceeds <p>.
+.IP
+Default: infinite (so by default MEME never quits before \fB\-nmotifs\fR
+<n> have been found.)
+.IP
+3.NUMBER OF MOTIF OCCURRENCES
+.IP
+\fB\-nsites\fR <n>, \fB\-minsites\fR <n>, \fB\-maxsites\fR <n>
+.IP
+The (expected) number of occurrences of each motif. If
+\fB\-nsites\fR is given, only that number of occurrences is
+tried. Otherwise, numbers of occurrences between
+\fB\-minsites\fR and \fB\-maxsites\fR are tried as initial
+guesses for the number of motif occurrences. These switches
+are ignored if mod = oops.
+.IP
+Defaults:
+.nf
+\fB\-minsites\fR : 2
+\fB\-maxsites\fR :
+    zoops : number of sequences
+    anr   : MIN(5*(number of sequences), 50)
+.fi
+.IP
+\fB\-wnsites\fR <n>  The weight on the prior on nsites. This controls
+how strong the bias towards motifs with exactly nsites sites (or
+between minsites and maxsites sites) is. It is a number in the
+range [0..1). The larger it is, the stronger the bias towards
+motifs with exactly nsites occurrences is.
+.IP
+Default: 0.8
+.IP
+4.MOTIF WIDTH
+.IP
+*\-w <n>
+.IP
+\fB\-minw\fR <n>
+\fB\-maxw\fR <n>
+.IP
+The width of the motif(s) to search for. If \fB\-w\fR is given, only
+that width is tried. Otherwise, widths between \fB\-minw\fR and \fB\-maxw\fR
+are tried.
+Default: \fB\-minw\fR 8, \fB\-maxw\fR 50 (defined in user.h)
+.IP
+Note: If <n> is greater than the length of the shortest sequence in
+the dataset, <n> is reset by MEME to that value.
+.IP
+*\-nomatrim
+.IP
+\fB\-wg\fR <a>
+\fB\-ws\fR <a>
+\fB\-noendgaps\fR
+.IP
+These switches control trimming (shortening) of motifs using the
+multiple alignment method. Specifying \fB\-nomatrim\fR causes MEME to
+skip this and causes the other switches to be ignored. MEME finds
+the best motif found and then trims (shortens) it using the
+multiple alignment method (described below). The number of
+occurrences is then adjusted to minimize the motif E\-value, and
+then the motif width is further shortened to optimize the
+E\-value.
+.IP
+The multiple alignment method performs a separate pairwise
+alignment of the site with the highest probability and each other
+possible site. (The alignment includes width/2 positions on
+either side of the sites.) The pairwise alignment is controlled
+by the switches:
+\fB\-wg\fR <a> (gap cost; default: 11),
+\fB\-ws\fR <a> (space cost; default 1), and,
+\fB\-noendgaps\fR (do not penalize endgaps; default: penalize endgaps).
+.IP
+The pairwise alignments are then combined and the method
+determines the widest section of the motif with no insertions or
+deletions. If this alignment is shorter than <minw>, it tries to
+find an alignment allowing up to one insertion/deletion per motif
+column. This continues (allowing up to 2, 3 ...
+insertions/deletions per motif column) until an alignment of
+width at least <minw> is found.
+.IP
+5.BACKGROUND MODEL
+.IP
+\fB\-bfile\fR <bfile>  The name of the file containing the background
+model for sequences. The background model is the model of random
+sequences used by MEME. The background model is used by MEME:
+.RS
+.nf
+1. during EM as the "null model",
+2. for calculating the log likelihood ratio of a motif,
+3. for calculating the significance (E\-value) of a motif, and
+4. for creating the position\-specific scoring matrix (log\-odds matrix).
+.fi
+.RE
+.IP
+By default, the background model is a 0\-order Markov model based on
+the letter frequencies in the training set.
+.IP
+Markov models of any order can be specified in <bfile> by listing
+frequencies of all possible tuples of length up to order+1.
+.IP
+Note that MEME uses only the 0\-order portion (single letter
+frequencies) of the background model for purposes 3) and 4), but uses
+the full\-order model for purposes 1) and 2), above.
+.IP
+Example: To specify a 1\-order Markov background model for DNA, <bfile>
+might contain the following lines. Note that optional comment lines
+are marked by "#" and are ignored by MEME.
+.TP
+# tuple
+frequency_non_coding
+.TP
+a
+0.324
+.TP
+c
+0.176
+.TP
+g
+0.176
+.TP
+t
+0.324
+.TP
+# tuple
+frequency_non_coding
+.TP
+aa
+0.119
+.TP
+ac
+0.052
+.TP
+ag
+0.056
+.TP
+at
+0.097
+.TP
+ca
+0.058
+.TP
+cc
+0.033
+.TP
+cg
+0.028
+.TP
+ct
+0.056
+.TP
+ga
+0.056
+.TP
+gc
+0.035
+.TP
+gg
+0.033
+.TP
+gt
+0.052
+.TP
+ta
+0.091
+.TP
+tc
+0.056
+.TP
+tg
+0.058
+.TP
+tt
+0.119
+.IP
+Sample \fB\-bfile\fR files are given in directory tests: tests/nt.freq (DNA),
+and tests/na.freq (amino acid).
+.IP
+6.POSITION\-SPECIFIC PRIORS
+.IP
+*\-psp <pspfile>  position\-specific prior (PSP)
+.IP
+These priors allow the user to bias the search for motifs. They
+give a position\-specific prior distribution on the location of
+motif sites in sequence(s) in the input dataset. The MEME PSP
+format used in the pspfile includes the name of the sequence to
+which a prior distribution corresponds. Sequences not named in
+the pspfile are given uniform prior distributions on site
+locations by MEME.
+.IP
+A PSP must be created for a specific width of motif, w. This
+width must be specified for each entry in the pspfile, and must
+be the same for all entries. If MEME varies the motif width
+during computation, MEME renormalises the PSP for each sequence.
+.IP
+The pspfile should be in MEME PSP format, which is similar to FASTA format. For example:
+.IP
+.nf
+>ICYA_MANSE 4
+0.075922 0.070764 0.082380 0.030292 0.025101 0.043139 0.032963
+0.086047 0.057445 0.000000 0.000000 0.000000
+>LACB_BOVIN 4
+0.107099 0.099822 0.116208 0.042731 0.035408 0.060854 0.046499
+0.000000 0.000000 0.000000
+.fi
+.IP
+Each entry should start with a header line consisting of a
+sequence name followed by the width, w, of the PSP prior. The
+sequence name must be the name of a sequence in the FASTA file input
+to MEME. Any other text on the header line after the name and w
+is ignored by MEME. The following lines contain one number for
+each position in the identically\-named FASTA sequence,
+where the number gives the prior probability of a motif site at
+that position in the sequence (or in the reverse complement if
+\fB\-revcomp\fR is specified). The last w\-1 numbers for each entry
+should be 0, since a motif of that
+width cannot start in those positions. All numbers for an entry
+must be in the range [0,1], and must sum to a number no greater
+than 1. If they sum to less than 1 and \fB\-mod\fR oops is specified,
+MEME will rescale the numbers so that they sum to 1.
+.IP
+For more detail on generation of PSP data, see documentation of
+our simple tool psp\-gen.
+.IP
+7.DNA PALINDROMES AND STRANDS
+.IP
+\fB\-revcomp\fR  Motif occurrences may be on the given DNA
+strand or on its reverse complement.
+.IP
+Default: look for DNA motifs only on the strand given in
+the training set.
+.IP
+\fB\-pal\fR  Choosing \fB\-pal\fR causes MEME to look for
+palindromes in DNA datasets.
+.IP
+MEME averages the letter frequencies in corresponding
+columns of the motif (PSPM) together. For instance, if
+the width of the motif is 10, columns 1 and 10, 2 and 9,
+3 and 8, etc., are averaged together. The averaging
+combines the frequency of A in one column with T in the
+other, and the frequency of C in one column with G in
+the other. If \fB\-pal\fR is not chosen, MEME does not
+search for DNA palindromes.
+.IP
+8.EM ALGORITHM
+.IP
+\fB\-maxiter\fR <n>  The number of iterations of EM to run from any
+starting point. EM is run for <n> iterations or until convergence
+(see \fB\-distance\fR, below) from each starting point.
+Default: 50
+.IP
+\fB\-distance\fR <a>  The convergence criterion. MEME stops iterating
+EM when the change in the motif frequency matrix is less than <a>.
+(Change is the Euclidean distance between two successive
+frequency matrices.)
+Default: 0.001
+.IP
+\fB\-prior\fR <string>  The prior distribution on the model parameters:
+.RS
+.TP
+dirichlet
+simple Dirichlet prior. This is the default for \fB\-dna\fR. It is based on
+the non\-redundant database letter frequencies.
+.TP
+dmix
+mixture of Dirichlets prior. This is the default for \fB\-protein\fR.
+.TP
+mega
+extremely low variance dmix; variance is scaled
+inversely with the size of the dataset.
+.TP
+megap
+mega for all but last iteration of EM; dmix on last iteration.
+.TP
+addone
+add +1 to each observed count
+.RE
+.IP
+\fB\-b\fR <a>  The strength of the prior on model parameters: <a> = 0
+means use intrinsic strength of prior for prior = dmix.
+.IP
+Defaults:
+.nf
+0.01  if prior = dirichlet
+0     if prior = dmix
+.fi
+.IP
+\fB\-plib\fR <string>  The name of the file containing the Dirichlet
+prior in the format of file prior30.plib.
+.IP
+9.SELECTING STARTS FOR EM
+.IP
+The default is for MEME to search the dataset for good starts for EM.
+How the starting points are derived from the dataset is specified by
+the following switches.
+.IP
+The default type of mapping MEME uses is:
+.nf
+\fB\-spmap\fR uni  for \fB\-dna\fR
+\fB\-spmap\fR pam  for \fB\-protein\fR
+.fi
+.IP
+\fB\-spfuzz\fR <a>  The fuzziness of the mapping. Possible values are
+greater than 0. Meaning depends on \fB\-spmap\fR, see below.
+.IP
+\fB\-spmap\fR <string>  The type of mapping function to use.
+.RS
+.TP
+uni
+Use add\-<a> prior when converting a substring to an estimate of theta.
+Default \fB\-spfuzz\fR <a>: 0.5
+.TP
+pam
+Use columns of PAM <a> matrix when converting a
+substring to an estimate of theta.
+Default \fB\-spfuzz\fR <a>: 120 (PAM 120)
+.RE
+.IP
+Other types of starting points can be specified using the following switches.
+.IP
+\fB\-cons\fR <string>  Override the sampling of starting points and
+just use a starting point derived from <string>. This is useful
+when an actual occurrence of a motif is known and can be used as
+the starting point for finding the motif.
+.IP
+10.BRANCHING SEARCH ON EM STARTS
+.IP
+The search for good EM starting points can be improved by using
+branching search.
+.IP
+Branching search begins with a fixed\-sized heap of best EM starts
+identified during the search of subsequences from the dataset. These
+starts are also called "seeds". The fixed\-sized heap of seeds is used
+as the "branch_heap" during the first iteration of branching search.
+.IP
+For each iteration of branching search, all seeds in the current
+branch_heap are considered. All seeds in the ball within hamming
+distance 1 of a given seed are evaluated and added to a new heap. The
+ball of new seeds is generated by mutating each character of the
+initial seed to each alternative character in the alphabet.
+.IP
+After the ball for every branch_heap seed has been evaluated, the
+seeds in the resulting new heap are added to the heap of best EM
+starts. The new heap is then used as the branch_heap for the next
+iteration of branching search.
+.IP
+By default MEME does not perform x\-branching.
+.IP
+*\-x_branch  Perform x\-branching.
+*\-bfactor <bf>  The number of iterations of branching search. The
+.IP
+default number of branching iterations is three.
+.IP
+*\-heapsize <hs>  The maximum size of the heaps used during
+.IP
+branching search. The default heap size is 64.
+.PP
+EXAMPLES:
+.IP
+The following examples use data files provided in this release of MEME.
+MEME writes its output to standard output, so you will want to redirect it
+to a file for use with MAST.
+.IP
+1.A simple DNA example:
+.IP
+meme crp0.s \fB\-dna\fR \fB\-mod\fR oops \fB\-pal\fR
+.IP
+MEME looks for a single motif in the file crp0.s which contains DNA
+sequences in FASTA format. The OOPS model is used so MEME assumes that
+every sequence contains exactly one occurrence of the motif. The
+palindrome switch is given so the motif model (PSPM) is converted into
+a palindrome by combining corresponding frequency columns. MEME
+automatically chooses the best width for the motif in this example
+since no width was specified.
+.IP
+2.Searching for motifs on both DNA strands:
+.IP
+meme crp0.s \fB\-dna\fR \fB\-mod\fR oops \fB\-revcomp\fR
+.IP
+This is like the previous example except that the \fB\-revcomp\fR switch
+tells MEME to consider both DNA strands, and the \fB\-pal\fR switch is absent
+so the palindrome conversion is omitted. When MEME uses both DNA
+strands, motif occurrences on the two strands may not overlap. That
+is, any position in the sequence given in the training set may be
+contained in an occurrence of a motif on the positive strand or the
+negative strand, but not both.
+.IP
+3.A fast DNA example:
+.IP
+meme crp0.s \fB\-dna\fR \fB\-mod\fR oops \fB\-revcomp\fR \fB\-w\fR 20
+.IP
+This example differs from example 2 in that MEME is told to consider
+only motifs of width 20. This causes MEME to execute about 10
+times faster. The \fB\-w\fR switch can also be used with protein datasets if
+the width of the motifs is known in advance.
+.IP
+4.Using a higher\-order background model:
+.IP
+meme INO_up800.s \fB\-dna\fR \fB\-mod\fR anr \fB\-revcomp\fR \fB\-bfile\fR yeast.nc.6.freq
+.IP
+In this example we use \fB\-mod\fR anr and \fB\-bfile\fR yeast.nc.6.freq. This
+specifies that
+a) the motif may have any number of occurrences in each sequence, and,
+b) the Markov model specified in yeast.nc.6.freq is used as the
+background model. This file contains a fifth\-order Markov model for
+the non\-coding regions in the yeast genome.
+.IP
+Using a higher order background model can often result in more
+sensitive detection of motifs. This is because the background model
+more accurately models non\-motif sequence, allowing MEME to
+discriminate against it and find the true motifs.
+.IP
+5.A simple protein example:
+.IP
+meme lipocalin.s \fB\-mod\fR oops \fB\-maxw\fR 20 \fB\-nmotifs\fR 2
+.IP
+The \fB\-dna\fR switch is absent, so MEME assumes the file lipocalin.s
+contains protein sequences. MEME searches for two motifs each of width
+less than or equal to 20. (Specifying \fB\-maxw\fR 20 makes MEME run faster
+since it does not have to consider motifs longer than 20.) Each motif
+is assumed to occur in each of the sequences because the OOPS model is
+specified.
+.IP
+6.Another simple protein example:
+.IP
+meme farntrans5.s \fB\-mod\fR anr \fB\-maxw\fR 40 \fB\-maxsites\fR 50
+.IP
+MEME searches for a motif of width up to 40 with up to 50 occurrences
+in the entire training set. The ANR sequence model is specified, which
+allows each motif to have any number of occurrences in each sequence.
+This dataset contains motifs that are repeated multiple times in each
+sequence. This example is fairly time consuming because
+the time required to initialize the motif probability tables is
+proportional to <maxw> times <maxsites>. By default, MEME only looks
+for motifs up to 29 letters wide with a maximum total number of
+occurrences equal to twice the number of sequences or 30, whichever is
+less.
+.IP
+7.A much faster protein example:
+.IP
+meme farntrans5.s \fB\-mod\fR anr \fB\-w\fR 10 \fB\-maxsites\fR 30 \fB\-nmotifs\fR 3
+.IP
+This time MEME is constrained to search for three motifs of width
+exactly ten. The effect is to break up the long motif found in the
+previous example. The \fB\-w\fR switch forces motifs to be *exactly* ten
+letters wide. This example is much faster because, since only one
+width is considered, the time to build the motif probability tables is
+only proportional to <maxsites>.
+.IP
+8.Splitting the sites into three:
+.IP
+meme farntrans5.s \fB\-mod\fR anr \fB\-maxw\fR 12 \fB\-nsites\fR 24 \fB\-nmotifs\fR 3
+.IP
+This forces each motif to have exactly 24 occurrences and to be up to
+12 letters wide.
+.IP
+9.A larger protein example with E\-value cutoff:
+.IP
+meme adh.s \fB\-mod\fR zoops \fB\-nmotifs\fR 20 \fB\-evt\fR 0.01
+.IP
+In this example, MEME looks for up to 20 motifs, but stops when a
+motif is found with E\-value greater than 0.01. Motifs with large
+E\-values are likely to be statistical artifacts rather than
+biologically significant.
+.PP
+References
+.IP
+Visible links
+1. file:///home/james/dist/doc/psp\-gen.html
+

Added: trunk/packages/meme/trunk/debian/meme_manpages/meme.bin.1
===================================================================
--- trunk/packages/meme/trunk/debian/meme_manpages/meme.bin.1	                        (rev 0)
+++ trunk/packages/meme/trunk/debian/meme_manpages/meme.bin.1	2012-06-20 20:06:46 UTC (rev 11400)
@@ -0,0 +1,147 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.40.10.
+.TH MEME.BIN "1" "June 2012" "meme.bin" "User Commands"
+.SH NAME
+meme.bin \- modify sequences
+.SH DESCRIPTION
+.PP
+USAGE:
+.TP
+meme.bin <dataset> [optional arguments]
+.TP
+<dataset>
+file containing sequences in FASTA format
+.TP
+[\-h]
+print this message
+.TP
+[\-o <output dir>]
+name of directory for output files
+will not replace existing directory
+.TP
+[\-oc <output dir>]
+name of directory for output files
+will replace existing directory
+.TP
+[\-text]
+output in text format (default is HTML)
+.TP
+[\-dna]
+sequences use DNA alphabet
+.TP
+[\-protein]
+sequences use protein alphabet
+.TP
+[\-mod oops|zoops|anr]
+distribution of motifs
+.TP
+[\-nmotifs <nmotifs>]
+maximum number of motifs to find
+.TP
+[\-evt <ev>]
+stop if motif E\-value greater than <ev>
+.TP
+[\-nsites <sites>]
+number of sites for each motif
+.TP
+[\-minsites <minsites>]
+minimum number of sites for each motif
+.TP
+[\-maxsites <maxsites>]
+maximum number of sites for each motif
+.TP
+[\-wnsites <wnsites>]
+weight on expected number of sites
+.TP
+[\-w <w>]
+motif width
+.TP
+[\-minw <minw>]
+minimum motif width
+.TP
+[\-maxw <maxw>]
+maximum motif width
+.TP
+[\-nomatrim]
+do not adjust motif width using multiple
+alignment
+.TP
+[\-wg <wg>]
+gap opening cost for multiple alignments
+.TP
+[\-ws <ws>]
+gap extension cost for multiple alignments
+.TP
+[\-noendgaps]
+do not count end gaps in multiple alignments
+.TP
+[\-bfile <bfile>]
+name of background Markov model file
+.TP
+[\-revcomp]
+allow sites on + or \- DNA strands
+.TP
+[\-pal]
+force palindromes (requires \fB\-dna\fR)
+.TP
+[\-maxiter <maxiter>]
+maximum EM iterations to run
+.TP
+[\-distance <distance>]
+EM convergence criterion
+.TP
+[\-psp <pspfile>]
+name of positional priors file
+.IP
+[\-prior dirichlet|dmix|mega|megap|addone]
+.IP
+type of prior to use
+.TP
+[\-b <b>]
+strength of the prior
+.TP
+[\-plib <plib>]
+name of Dirichlet prior file
+.TP
+[\-spfuzz <spfuzz>]
+fuzziness of sequence to theta mapping
+.TP
+[\-spmap uni|pam]
+starting point seq to theta mapping type
+.TP
+[\-cons <cons>]
+consensus sequence to start EM from
+.TP
+[\-heapsize <hs>]
+size of heaps for widths where substring
+search occurs
+.TP
+[\-x_branch]
+perform x\-branching
+.TP
+[\-w_branch]
+perform width branching
+.TP
+[\-allw]
+include all motif widths from min to max
+.TP
+[\-bfactor <bf>]
+branching factor for branching search
+.TP
+[\-maxsize <maxsize>]
+maximum dataset size in characters
+.TP
+[\-nostatus]
+do not print progress reports to terminal
+.TP
+[\-p <np>]
+use parallel version with <np> processors
+.TP
+[\-time <t>]
+quit before <t> CPU seconds consumed
+.TP
+[\-sf <sf>]
+print <sf> as name of sequence file
+.TP
+[\-V]
+verbose mode
+

Added: trunk/packages/meme/trunk/debian/meme_manpages/meme2images.1
===================================================================
--- trunk/packages/meme/trunk/debian/meme_manpages/meme2images.1	                        (rev 0)
+++ trunk/packages/meme/trunk/debian/meme_manpages/meme2images.1	2012-06-20 20:06:46 UTC (rev 11400)
@@ -0,0 +1,22 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.40.10.
+.TH MEME2IMAGES "1" "June 2012" "meme2images" "User Commands"
+.SH NAME
+meme2images \- manual page for meme2images
+.SH DESCRIPTION
+.SS "Usage:"
+.IP
+meme2images [options] <motifs file> <output directory>
+.SH OPTIONS
+.TP
+\fB\-eps\fR
+output logos in eps format
+.TP
+\fB\-png\fR
+output logos in png format
+.TP
+\fB\-rc\fR
+output reverse complement logos
+.TP
+\fB\-help\fR
+print this usage message
+

Modified: trunk/packages/meme/trunk/debian/rules
===================================================================
--- trunk/packages/meme/trunk/debian/rules	2012-06-20 19:59:30 UTC (rev 11399)
+++ trunk/packages/meme/trunk/debian/rules	2012-06-20 20:06:46 UTC (rev 11400)
@@ -23,6 +23,7 @@
 	mv debian/tmp/usr/doc/examples debian/tmp || true
 	mkdir -p debian/tmp/usr/share/doc/meme ; mv debian/tmp/usr/doc/* debian/tmp/usr/share/doc/meme
 	mv debian/tmp/etc/meme/meme.doc debian/tmp/usr/share/doc/meme
+	find ./* -print|grep STRGGTCAN.meme|xargs chmod 644
 	dh_install -v --sourcedir=debian/tmp
 
 override_dh_auto_clean:
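
A note on the chmod line added to debian/rules above: the pipeline
"find ./* -print|grep STRGGTCAN.meme|xargs chmod 644" splits paths on
whitespace, and with GNU xargs (when -r is not given) chmod is still run
once with no file arguments if nothing matches, which would likely make
the recipe fail. The following is only a minimal sketch of an alternative
that lets find do the matching itself. It assumes the intent is to clear
the executable bit on regular files whose name contains STRGGTCAN.meme;
the function name fix_example_perms is hypothetical and not part of the
package. Note that -name treats the dot literally, whereas the grep
pattern lets it match any character.

```shell
# Hypothetical helper, not part of the meme packaging.
# Sets mode 644 on every regular file below "$1" whose name contains
# STRGGTCAN.meme. Assumption: only regular files need fixing.
fix_example_perms() {
    # -exec ... {} + batches paths much like xargs, copes with
    # whitespace in names, and does not run chmod when nothing matches.
    find "$1" -type f -name '*STRGGTCAN.meme*' -exec chmod 644 {} +
}
```

In the rules file this would reduce to a single recipe line of the form
find . -type f -name '*STRGGTCAN.meme*' -exec chmod 644 {} +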



