[med-svn] [r-cran-qqman] 02/06: Imported Upstream version 0.1.2

Dylan Aïssi bob.dybian-guest at moszumanska.debian.org
Thu Oct 27 20:51:00 UTC 2016


This is an automated email from the git hooks/post-receive script.

bob.dybian-guest pushed a commit to branch master
in repository r-cran-qqman.

commit d3986601178b6c3c88566fe0e288cee857d6cbc7
Author: Dylan Aïssi <bob.dybian at gmail.com>
Date:   Thu Oct 27 22:10:26 2016 +0200

    Imported Upstream version 0.1.2
---
 DESCRIPTION               |  15 ++
 MD5                       |  21 +++
 NAMESPACE                 |   4 +
 R/gwasResults-data.R      |   5 +
 R/manhattan.R             | 192 ++++++++++++++++++++++++
 R/package.R               |  10 ++
 R/qq.R                    |  53 +++++++
 R/snpsOfInterest-data.R   |   4 +
 R/zzz.R                   |   8 +
 README.md                 |  76 ++++++++++
 build/vignette.rds        | Bin 0 -> 211 bytes
 data/gwasResults.RData    | Bin 0 -> 133708 bytes
 data/snpsOfInterest.RData | Bin 0 -> 284 bytes
 inst/doc/qqman.R          |  76 ++++++++++
 inst/doc/qqman.Rmd        | 133 +++++++++++++++++
 inst/doc/qqman.html       | 363 ++++++++++++++++++++++++++++++++++++++++++++++
 man/gwasResults.Rd        |   9 ++
 man/manhattan.Rd          |  60 ++++++++
 man/qq.Rd                 |  25 ++++
 man/qqman.Rd              |  15 ++
 man/snpsOfInterest.Rd     |   9 ++
 vignettes/qqman.Rmd       | 133 +++++++++++++++++
 22 files changed, 1211 insertions(+)

diff --git a/DESCRIPTION b/DESCRIPTION
new file mode 100644
index 0000000..dbcde6a
--- /dev/null
+++ b/DESCRIPTION
@@ -0,0 +1,15 @@
+Package: qqman
+Title: Q-Q and manhattan plots for GWAS data
+Version: 0.1.2
+Author: Stephen Turner <vustephen at gmail.com>
+Maintainer: Stephen Turner <vustephen at gmail.com>
+Description: Q-Q and manhattan plots for GWAS data
+Depends: R (>= 3.0.0)
+Suggests: knitr
+License: GPL-3
+LazyData: true
+VignetteBuilder: knitr
+Packaged: 2014-09-25 16:54:14 UTC; sdt5z
+NeedsCompilation: no
+Repository: CRAN
+Date/Publication: 2014-09-25 22:29:57
diff --git a/MD5 b/MD5
new file mode 100644
index 0000000..4fb8e9e
--- /dev/null
+++ b/MD5
@@ -0,0 +1,21 @@
+878f0a46bac21b0a8714378777493948 *DESCRIPTION
+e8a3b67a1cfc5de2c87e2c05509ccb86 *NAMESPACE
+4703c59c0422edc311c09f34a79425c2 *R/gwasResults-data.R
+506bb6bd378ce87b44d1b452c47dac6a *R/manhattan.R
+6f2bc34e71e66aef115d0a9dff9c2c78 *R/package.R
+32fa6e2f67e18b0f33654b21de1624a6 *R/qq.R
+1d2a7fe82e8a5143d495b8f2f7800eca *R/snpsOfInterest-data.R
+4e7b578263eee2e20de8b161e4c08af4 *R/zzz.R
+7d28ad0c1131dbad19b657df61645cda *README.md
+86002107ab6d65a20ca6f0bbc7db8d34 *build/vignette.rds
+ff43bc38623980d5b64e423362075c10 *data/gwasResults.RData
+a10481df0b1192bc3f599e14414e0a55 *data/snpsOfInterest.RData
+9cb0830cb6ce29dbd98ebbb76d770cab *inst/doc/qqman.R
+47c06f8e3c96a96fca34f31fad80fd6d *inst/doc/qqman.Rmd
+c608b972e60a7ce5b88f47c7ae9055cb *inst/doc/qqman.html
+d1bb3943d85241abb665c6235c55825a *man/gwasResults.Rd
+516c1f05ddcecd781b1b1e01e16ad68d *man/manhattan.Rd
+7b64ebaab0b22df9b3c146614083689b *man/qq.Rd
+e7729862ac4cb3540ee65a21600538a4 *man/qqman.Rd
+efcb6525c2360e9fbf5acfa005b5cdff *man/snpsOfInterest.Rd
+47c06f8e3c96a96fca34f31fad80fd6d *vignettes/qqman.Rmd
diff --git a/NAMESPACE b/NAMESPACE
new file mode 100644
index 0000000..5d9f9a2
--- /dev/null
+++ b/NAMESPACE
@@ -0,0 +1,4 @@
+# Generated by roxygen2 (4.0.1): do not edit by hand
+
+export(manhattan)
+export(qq)
diff --git a/R/gwasResults-data.R b/R/gwasResults-data.R
new file mode 100644
index 0000000..b4648df
--- /dev/null
+++ b/R/gwasResults-data.R
@@ -0,0 +1,5 @@
+#' @name gwasResults
+#' @title Simulated GWAS results
+#' @description Simulated GWAS results as obtained from \code{plink --assoc}.
+#' @docType data
+NULL
\ No newline at end of file
diff --git a/R/manhattan.R b/R/manhattan.R
new file mode 100644
index 0000000..555e1c7
--- /dev/null
+++ b/R/manhattan.R
@@ -0,0 +1,192 @@
+#' Creates a manhattan plot
+#' 
+#' Creates a manhattan plot from PLINK assoc output (or any data frame with 
+#' chromosome, position, and p-value).
+#' 
+#' @param x A data.frame with columns "BP," "CHR," "P," and optionally, "SNP."
+#' @param chr A string denoting the column name for the chromosome. Defaults to 
+#'   PLINK's "CHR." Said column must be numeric. If you have X, Y, or MT 
+#'   chromosomes, be sure to renumber these 23, 24, 25, etc.
+#' @param bp A string denoting the column name for the chromosomal position. 
+#'   Defaults to PLINK's "BP." Said column must be numeric.
+#' @param p A string denoting the column name for the p-value. Defaults to 
+#'   PLINK's "P." Said column must be numeric.
+#' @param snp A string denoting the column name for the SNP name (rs number). 
+#'   Defaults to PLINK's "SNP." Said column should be a character.
+#' @param col A character vector indicating which colors to alternate.
+#' @param chrlabs A character vector equal to the number of chromosomes
+#'   specifying the chromosome labels (e.g., \code{c(1:22, "X", "Y", "MT")}).
+#' @param suggestiveline Where to draw a "suggestive" line. Default 
+#'   -log10(1e-5). Set to FALSE to disable.
+#' @param genomewideline Where to draw a "genome-wide sigificant" line. Default 
+#'   -log10(5e-8). Set to FALSE to disable.
+#' @param highlight A character vector of SNPs in your dataset to highlight. 
+#'   These SNPs should all be in your dataset.
+#' @param logp If TRUE, the -log10 of the p-value is plotted. It isn't very 
+#'   useful to plot raw p-values, but plotting the raw value could be useful for
+#'   other genome-wide plots, for example, peak heights, bayes factors, test 
+#'   statistics, other "scores," etc.
+#' @param ... Arguments passed on to other plot/points functions
+#'   
+#' @return A manhattan plot.
+#'   
+#' @keywords visualization manhattan
+#'   
+#' @examples
+#' manhattan(gwasResults)
+#'   
+#' @export
+
+manhattan <- function(x, chr="CHR", bp="BP", p="P", snp="SNP", 
+                      col=c("gray10", "gray60"), chrlabs=NULL,
+                      suggestiveline=-log10(1e-5), genomewideline=-log10(5e-8), 
+                      highlight=NULL, logp=TRUE, ...) {
+    
+    # Not sure why, but package check will warn without this.
+    CHR=BP=P=index=NULL
+    
+    # Check for sensible dataset
+    ## Make sure you have chr, bp and p columns.
+    if (!(chr %in% names(x))) stop(paste("Column", chr, "not found!"))
+    if (!(bp %in% names(x))) stop(paste("Column", bp, "not found!"))
+    if (!(p %in% names(x))) stop(paste("Column", p, "not found!"))
+    ## warn if you don't have a snp column
+    if (!(snp %in% names(x))) warning(paste("No SNP column found. OK unless you're trying to highlight."))
+    ## make sure chr, bp, and p columns are numeric.
+    if (!is.numeric(x[[chr]])) stop(paste(chr, "column should be numeric. Do you have 'X', 'Y', 'MT', etc? If so change to numbers and try again."))
+    if (!is.numeric(x[[bp]])) stop(paste(bp, "column should be numeric."))
+    if (!is.numeric(x[[p]])) stop(paste(p, "column should be numeric."))
+    
+    # Create a new data.frame with columns called CHR, BP, and P.
+    d=data.frame(CHR=x[[chr]], BP=x[[bp]], P=x[[p]])
+    
+    # If the input data frame has a SNP column, add it to the new data frame you're creating.
+    if (!is.null(x[[snp]])) d=transform(d, SNP=x[[snp]])
+    
+    # Set positions, ticks, and labels for plotting
+    ## Sort and keep only values where is numeric.
+    #d <- subset(d[order(d$CHR, d$BP), ], (P>0 & P<=1 & is.numeric(P)))
+    d <- subset(d, (is.numeric(CHR) & is.numeric(BP) & is.numeric(P)))
+    d <- d[order(d$CHR, d$BP), ]
+    #d$logp <- ifelse(logp, yes=-log10(d$P), no=d$P)
+    if (logp) {
+        d$logp <- -log10(d$P)
+    } else {
+        d$logp <- d$P
+    }
+    d$pos=NA
+    
+    
+    # Fixes the bug where one chromosome is missing by adding a sequential index column.
+    d$index=NA
+    ind = 0
+    for (i in unique(d$CHR)){
+        ind = ind + 1
+        d[d$CHR==i,]$index = ind
+    }
+    
+    # This section sets up positions and ticks. Ticks should be placed in the
+    # middle of a chromosome. The a new pos column is added that keeps a running
+    # sum of the positions of each successive chromsome. For example:
+    # chr bp pos
+    # 1   1  1
+    # 1   2  2
+    # 2   1  3
+    # 2   2  4
+    # 3   1  5
+    nchr = length(unique(d$CHR))
+    if (nchr==1) { ## For a single chromosome
+        options(scipen=999)
+	d$pos=d$BP/1e6
+        ticks=floor(length(d$pos))/2+1
+        xlabel = paste('Chromosome',unique(d$CHR),'position(Mb)')
+        labs = ticks
+    } else { ## For multiple chromosomes
+        lastbase=0
+        ticks=NULL
+        for (i in unique(d$index)) {
+            if (i==1) {
+                d[d$index==i, ]$pos=d[d$index==i, ]$BP
+            } else {
+                lastbase=lastbase+tail(subset(d,index==i-1)$BP, 1)
+                d[d$index==i, ]$pos=d[d$index==i, ]$BP+lastbase
+            }
+            # Old way: assumes SNPs evenly distributed
+            # ticks=c(ticks, d[d$index==i, ]$pos[floor(length(d[d$index==i, ]$pos)/2)+1])
+            # New way: doesn't make that assumption
+            ticks = c(ticks, (min(d[d$CHR == i,]$pos) + max(d[d$CHR == i,]$pos))/2 + 1)
+        }
+        xlabel = 'Chromosome'
+        #labs = append(unique(d$CHR),'') ## I forgot what this was here for... if seems to work, remove.
+        labs <- unique(d$CHR)
+    }
+    
+    # Initialize plot
+    xmax = ceiling(max(d$pos) * 1.03)
+    xmin = floor(max(d$pos) * -0.03)
+    
+    # The old way to initialize the plot
+    # plot(NULL, xaxt='n', bty='n', xaxs='i', yaxs='i', xlim=c(xmin,xmax), ylim=c(ymin,ymax),
+    #      xlab=xlabel, ylab=expression(-log[10](italic(p))), las=1, pch=20, ...)
+
+    
+    # The new way to initialize the plot.
+    ## See http://stackoverflow.com/q/23922130/654296
+    ## First, define your default arguments
+    def_args <- list(xaxt='n', bty='n', xaxs='i', yaxs='i', las=1, pch=20,
+                     xlim=c(xmin,xmax), ylim=c(0,ceiling(max(d$logp))),
+                     xlab=xlabel, ylab=expression(-log[10](italic(p))))
+    ## Next, get a list of ... arguments
+    #dotargs <- as.list(match.call())[-1L]
+    dotargs <- list(...)
+    ## And call the plot function passing NA, your ... arguments, and the default
+    ## arguments that were not defined in the ... arguments.
+    do.call("plot", c(NA, dotargs, def_args[!names(def_args) %in% names(dotargs)]))
+    
+    # If manually specifying chromosome labels, ensure a character vector and number of labels matches number chrs.
+    if (!is.null(chrlabs)) {
+        if (is.character(chrlabs)) {
+            if (length(chrlabs)==length(labs)) {
+                labs <- chrlabs
+            } else {
+                warning("You're trying to specify chromosome labels but the number of labels != number of chromosomes.")
+            }
+        } else {
+            warning("If you're trying to specify chromosome labels, chrlabs must be a character vector")
+        }
+    }
+    
+    # Add an axis. 
+    if (nchr==1) { #If single chromosome, ticks and labels automatic.
+        axis(1, ...)
+    } else { # if multiple chrs, use the ticks and labels you created above.
+        axis(1, at=ticks, labels=labs, ...)
+    }
+    
+    # Create a vector of alternatiting colors
+    col=rep(col, max(d$CHR))
+
+    # Add points to the plot
+    if (nchr==1) {
+        with(d, points(pos, logp, pch=20, col=col[1], ...))
+    } else {
+        # if multiple chromosomes, need to alternate colors and increase the color index (icol) each chr.
+        icol=1
+        for (i in unique(d$index)) {
+            with(d[d$index==unique(d$index)[i], ], points(pos, logp, col=col[icol], pch=20, ...))
+            icol=icol+1
+        }
+    }
+    
+    # Add suggestive and genomewide lines
+    if (suggestiveline) abline(h=suggestiveline, col="blue")
+    if (genomewideline) abline(h=genomewideline, col="red")
+    
+    # Highlight snps from a character vector
+    if (!is.null(highlight)) {
+        if (any(!(highlight %in% d$SNP))) warning("You're trying to highlight SNPs that don't exist in your results.")
+        d.highlight=d[which(d$SNP %in% highlight), ]
+        with(d.highlight, points(pos, logp, col="green3", pch=20, ...)) 
+    }
+
+}
diff --git a/R/package.R b/R/package.R
new file mode 100644
index 0000000..ea0e5c8
--- /dev/null
+++ b/R/package.R
@@ -0,0 +1,10 @@
+#' Create Q-Q and manhattan plots for GWAS data.
+#' 
+#' A package for creating Q-Q and manhattan plots for GWAS data. See the package
+#' vignette for details: \cr\cr
+#' \code{vignette("qqman")}
+#' 
+#' @name qqman
+#' @docType package
+#' @author Stephen Turner <\url{http://stephenturner.us}>
+NULL
\ No newline at end of file
diff --git a/R/qq.R b/R/qq.R
new file mode 100644
index 0000000..2b2d31f
--- /dev/null
+++ b/R/qq.R
@@ -0,0 +1,53 @@
+#' Creates a Q-Q plot
+#' 
+#' Creates a quantile-quantile plot from p-values from a GWAS study.
+#' 
+#' @param pvector A numeric vector of p-values.
+#' @param ... Other arguments passed to \code{plot()}
+#' 
+#' @return A Q-Q plot.
+#' 
+#' @keywords visualization qq qqplot
+#' 
+#' @examples
+#' qq(gwasResults$P)
+#' 
+#' @export
+
+qq = function(pvector, ...) {
+    
+    # Check for sensible input
+    if (!is.numeric(pvector)) stop("Input must be numeric.")
+    
+    # limit to not missing, not nan, not null, not infinite, between 0 and 1
+    pvector <- pvector[!is.na(pvector) & !is.nan(pvector) & !is.null(pvector) & is.finite(pvector) & pvector<1 & pvector>0]
+    
+    # Observed and expected
+    o = -log10(sort(pvector,decreasing=FALSE))
+    e = -log10( ppoints(length(pvector) ))
+    
+    
+#     # The old way
+#     plot(e, o, pch=20, 
+#          xlab=expression(Expected~~-log[10](italic(p))), 
+#          ylab=expression(Observed~~-log[10](italic(p))), 
+#          ...)
+    
+    # The new way to initialize the plot.
+    ## See http://stackoverflow.com/q/23922130/654296
+    ## First, define your default arguments
+    def_args <- list(pch=20, xlim=c(0, max(e)), ylim=c(0, max(o)), 
+                     xlab=expression(Expected~~-log[10](italic(p))), 
+                     ylab=expression(Observed~~-log[10](italic(p)))
+    )
+    ## Next, get a list of ... arguments
+    #dotargs <- as.list(match.call())[-1L]
+    dotargs <- list(...)
+    ## And call the plot function passing NA, your ... arguments, and the default
+    ## arguments that were not defined in the ... arguments.
+    tryCatch(do.call("plot", c(list(x=e, y=o), def_args[!names(def_args) %in% names(dotargs)], dotargs)), warn=stop)
+
+    # Add diagonal
+    abline(0,1,col="red")
+    
+}
diff --git a/R/snpsOfInterest-data.R b/R/snpsOfInterest-data.R
new file mode 100644
index 0000000..20647a3
--- /dev/null
+++ b/R/snpsOfInterest-data.R
@@ -0,0 +1,4 @@
+#' @name snpsOfInterest
+#' @title snpsOfInterest
+#' @docType data
+NULL
\ No newline at end of file
diff --git a/R/zzz.R b/R/zzz.R
new file mode 100644
index 0000000..37d09ca
--- /dev/null
+++ b/R/zzz.R
@@ -0,0 +1,8 @@
+.onAttach <- function(...){
+    packageStartupMessage()
+    packageStartupMessage("For example usage please run: vignette('qqman')")
+    packageStartupMessage()
+    packageStartupMessage("Citation appreciated but not required:")
+    packageStartupMessage("Turner, S.D. qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots. biorXiv DOI: 10.1101/005165 (2014).")
+    packageStartupMessage()
+}
\ No newline at end of file
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..b32c2bf
--- /dev/null
+++ b/README.md
@@ -0,0 +1,76 @@
+# qqman: An R package for creating Q-Q and manhattan plots from GWAS results.
+
+![qqman.gif](assets/qqman.gif)
+
+## Citation
+
+If you'd like to cite qqman (appreciated but not required), please cite the pre-print below:
+
+Turner, S.D. qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots. [*biorXiv* DOI: 10.1101/005165](http://biorxiv.org/content/early/2014/05/14/005165).
+
+## Installation
+
+Install the stable release from CRAN:
+
+```coffee
+install.packages("qqman")
+```
+
+Or install directly from github using devtools
+
+```coffee
+library(devtools)
+install_github("stephenturner/qqman")
+```
+
+Or install the most recent development release with devtools (note, there be dragons here):
+
+```coffee
+library(devtools)
+install_github("stephenturner/qqman", ref="dev")
+```
+
+Load the package each time you use it:
+
+```coffee
+library(qqman)
+```
+
+## Usage
+
+See the [online package vignette](http://cran.r-project.org/web/packages/qqman/vignettes/qqman.html) for more examples:
+
+```coffee
+vignette("manhattan")
+```
+
+Take a look at the built-in data:
+
+```coffee
+head(gwasResults)
+```
+
+Basic manhattan plot using built-in data:
+
+```coffee
+manhattan(gwasResults)
+```
+
+Basic Q-Q plot using built-in data:
+
+```coffee
+qq(gwasResults$P)
+```
+
+Get help:
+
+```coffee
+?manhattan
+?qq
+```
+
+## Notes
+
+* This release is substantially simplified for the sake of maintainability and creating an R package. The old code that allows confidence intervals on the Q-Q plot and allows more flexible annotation and highlighting is still available at the [version 0.0.0 tag](https://github.com/stephenturner/qqman/tree/v0.0.0).
+* Special thanks to Dan Capurso and Tim Knutsen for useful contributions and bugfixes.
+* Thanks to all the blog commenters for pointing out bugs and other issues.
diff --git a/build/vignette.rds b/build/vignette.rds
new file mode 100644
index 0000000..58afde2
Binary files /dev/null and b/build/vignette.rds differ
diff --git a/data/gwasResults.RData b/data/gwasResults.RData
new file mode 100644
index 0000000..5709f84
Binary files /dev/null and b/data/gwasResults.RData differ
diff --git a/data/snpsOfInterest.RData b/data/snpsOfInterest.RData
new file mode 100644
index 0000000..6843ab5
Binary files /dev/null and b/data/snpsOfInterest.RData differ
diff --git a/inst/doc/qqman.R b/inst/doc/qqman.R
new file mode 100644
index 0000000..162b9b7
--- /dev/null
+++ b/inst/doc/qqman.R
@@ -0,0 +1,76 @@
+## ----, include=FALSE-----------------------------------------------------
+library(qqman)
+library(knitr)
+opts_chunk$set(comment=NA, fig.width=12, fig.height=9, message=FALSE, tidy=TRUE, dpi=75)
+
+## ----generatedata, eval=FALSE, echo=FALSE--------------------------------
+#  # This code used to generate the test data. Runs slow, but does the job.
+#  chrstats <- data.frame(chr=1:22, nsnps=1500)
+#  chrstats$nsnps <- with(chrstats, round(nsnps/chr^(1/3)))
+#  chrstats
+#  
+#  d <- data.frame(SNP=rep(NA, sum(chrstats$nsnps)),
+#                  CHR=rep(NA, sum(chrstats$nsnps)),
+#                  BP=rep(NA, sum(chrstats$nsnps)),
+#                  P=rep(NA, sum(chrstats$nsnps)))
+#  snpi <- 1
+#  set.seed(42)
+#  for (i in chrstats$chr) {
+#      for (j in 1:chrstats[i, 2]) {
+#          d[snpi, ]$SNP=paste0("rs", snpi)
+#          d[snpi, ]$CHR=i
+#          d[snpi, ]$BP=j
+#          d[snpi, ]$P=runif(1)
+#          snpi <- snpi+1
+#      }
+#  }
+#  
+#  divisor <- c(seq(2,50,2), seq(50,2,-2))
+#  divisor <- divisor^4
+#  length(divisor)
+#  d[3026:3075, ]$P <- d[3026:3075, ]$P/divisor
+#  snpsOfInterest <- paste0("rs", 3001:3100)
+#  qq(d$P)
+#  manhattan(d, highlight=snpsOfInterest)
+#  gwasResults <- d
+#  save(gwasResults, file="data/gwasResults.RData")
+
+## ------------------------------------------------------------------------
+str(gwasResults)
+head(gwasResults)
+tail(gwasResults)
+
+## ------------------------------------------------------------------------
+as.data.frame(table(gwasResults$CHR))
+
+## ------------------------------------------------------------------------
+manhattan(gwasResults)
+
+## ------------------------------------------------------------------------
+manhattan(gwasResults, main="Manhattan Plot", ylim=c(0,10), cex=0.6, cex.axis=0.9, col=c("blue4", "orange3"), suggestiveline=F, genomewideline=F, chrlabs=c(1:20, "P", "Q"))
+
+## ------------------------------------------------------------------------
+manhattan(subset(gwasResults, CHR==1))
+
+## ------------------------------------------------------------------------
+str(snpsOfInterest)
+manhattan(gwasResults, highlight=snpsOfInterest)
+
+## ------------------------------------------------------------------------
+manhattan(subset(gwasResults, CHR==3), highlight=snpsOfInterest, xlim=c(200, 500), main="Chr 3")
+
+## ------------------------------------------------------------------------
+# Add test statistics
+gwasResults <- transform(gwasResults, zscore=qnorm(P/2, lower.tail=FALSE))
+head(gwasResults)
+
+# Make the new plot
+manhattan(gwasResults, p="zscore", logp=FALSE, ylab="Z-score", genomewideline=FALSE, suggestiveline=FALSE, main="Manhattan plot of Z-scores")
+
+## ------------------------------------------------------------------------
+qq(gwasResults$P)
+
+## ------------------------------------------------------------------------
+qq(gwasResults$P, main="Q-Q plot of GWAS p-values",
+   xlim=c(0,7), ylim=c(0,12), pch=18, col="blue4", cex=1.5, las=1)
+
diff --git a/inst/doc/qqman.Rmd b/inst/doc/qqman.Rmd
new file mode 100644
index 0000000..c5fe6f8
--- /dev/null
+++ b/inst/doc/qqman.Rmd
@@ -0,0 +1,133 @@
+---
+title: "Intro to the qqman package"
+author: "Stephen D. Turner"
+date: '`r Sys.Date()`'
+output:
+  html_document:
+    toc: yes
+---
+
+<!--
+%\VignetteEngine{knitr::knitr}
+%\VignetteIndexEntry{Intro to the qqman package}
+-->
+
+```{r, include=FALSE}
+library(qqman)
+library(knitr)
+opts_chunk$set(comment=NA, fig.width=12, fig.height=9, message=FALSE, tidy=TRUE, dpi=75)
+```
+
+# Intro to the **qqman** package
+
+```{r generatedata, eval=FALSE, echo=FALSE}
+# This code used to generate the test data. Runs slow, but does the job.
+chrstats <- data.frame(chr=1:22, nsnps=1500)
+chrstats$nsnps <- with(chrstats, round(nsnps/chr^(1/3)))
+chrstats
+
+d <- data.frame(SNP=rep(NA, sum(chrstats$nsnps)), 
+                CHR=rep(NA, sum(chrstats$nsnps)), 
+                BP=rep(NA, sum(chrstats$nsnps)), 
+                P=rep(NA, sum(chrstats$nsnps)))
+snpi <- 1
+set.seed(42)
+for (i in chrstats$chr) {
+    for (j in 1:chrstats[i, 2]) {
+        d[snpi, ]$SNP=paste0("rs", snpi)
+        d[snpi, ]$CHR=i
+        d[snpi, ]$BP=j
+        d[snpi, ]$P=runif(1)
+        snpi <- snpi+1
+    }
+}
+
+divisor <- c(seq(2,50,2), seq(50,2,-2))
+divisor <- divisor^4
+length(divisor)
+d[3026:3075, ]$P <- d[3026:3075, ]$P/divisor
+snpsOfInterest <- paste0("rs", 3001:3100)
+qq(d$P)
+manhattan(d, highlight=snpsOfInterest)
+gwasResults <- d
+save(gwasResults, file="data/gwasResults.RData")
+```
+
+The **qqman** package includes functions for creating manhattan plots and q-q plots from GWAS results. The `gwasResults` data.frame included with the package has simulated results for 16,470 SNPs on 22 chromosomes. Take a look at the data:
+
+```{r}
+str(gwasResults)
+head(gwasResults)
+tail(gwasResults)
+```
+
+How many SNPs on each chromosome?
+
+```{r}
+as.data.frame(table(gwasResults$CHR))
+```
+
+## Creating manhattan plots
+
+Now, let's make a basic manhattan plot. 
+
+```{r}
+manhattan(gwasResults)
+```
+
+We can also pass in other graphical parameters. Let's add a title (`main=`), increase the y-axis limit (`ylim=`), reduce the point size to 60% (`cex=`), and reduce the font size of the axis labels to 90% (`cex.axis=`). While we're at it, let's change the colors (`col=`), remove the suggestive and genome-wide significance lines, and supply our own labels for the chromosomes:
+
+```{r}
+manhattan(gwasResults, main="Manhattan Plot", ylim=c(0,10), cex=0.6, cex.axis=0.9, col=c("blue4", "orange3"), suggestiveline=F, genomewideline=F, chrlabs=c(1:20, "P", "Q"))
+```
+
+Now, let's look at a single chromosome:
+
+```{r}
+manhattan(subset(gwasResults, CHR==1))
+```
+
+Let's highlight some SNPs of interest on chromosome 3. The 100 SNPs we're highlighting here are in a character vector called `snpsOfInterest`. You'll get a warning if you try to highlight SNPs that don't exist.
+
+```{r}
+str(snpsOfInterest)
+manhattan(gwasResults, highlight=snpsOfInterest)
+```
+
+We can combine highlighting and limiting to a single chromosome, and use the `xlim` graphical parameter to zoom in on a region of interest (between position 200-500):
+
+```{r}
+manhattan(subset(gwasResults, CHR==3), highlight=snpsOfInterest, xlim=c(200, 500), main="Chr 3")
+```
+
+Finally, the `manhattan` function can be used to plot any value, not just p-values. Here, we'll simply call the function passing to the `p=` argument the name of the column we want to plot instead of the default "P" column. In this example, let's create a test statistic ("zscore"), plot that instead of p-values, change the y-axis label, and remove the default log transformation. We'll also remove the genomewide and suggestive lines because these are only meaningful if you're plotting -lo [...]
+
+```{r}
+# Add test statistics
+gwasResults <- transform(gwasResults, zscore=qnorm(P/2, lower.tail=FALSE))
+head(gwasResults)
+
+# Make the new plot
+manhattan(gwasResults, p="zscore", logp=FALSE, ylab="Z-score", genomewideline=FALSE, suggestiveline=FALSE, main="Manhattan plot of Z-scores")
+```
+
+A few notes on creating manhattan plots:
+
+* Run `str(gwasResults)`. Notice that the `gwasResults` data.frame has SNP, chromosome, position, and p-value columns named `SNP`, `CHR`, `BP`, and `P`. If you're creating a manhattan plot and your column names are different, you'll have to pass the column names to the `chr=`, `bp=`, `p=`, and `snp=` arguments. See `help(manhattan)` for details.
+* The chromosome column must be numeric. If you have "X," "Y," or "MT" chromosomes, you'll need to rename these 23, 24, 25, etc. You can modify the source code (e.g., `fix(manhattan)`) to change the line designating the axis tick labels (`labs <- unique(d$CHR)`) to set this to whatever you'd like it to be.
+* If you'd like to change the color of the highlight or the suggestive/genomewide lines, you'll need to modify the source code. Search for `col="blue"`, `col="red"`, or `col="green3"` to modify the suggestive line, genomewide line, and highlight colors, respectively.
+
+## Creating Q-Q plots
+
+Creating Q-Q plots is straightforward - simply supply a vector of p-values to the `qq()` function. 
+
+```{r}
+qq(gwasResults$P)
+```
+
+We can otionally supply many other graphical parameters.
+
+```{r}
+qq(gwasResults$P, main="Q-Q plot of GWAS p-values",
+   xlim=c(0,7), ylim=c(0,12), pch=18, col="blue4", cex=1.5, las=1)
+```
\ No newline at end of file
diff --git a/inst/doc/qqman.html b/inst/doc/qqman.html
new file mode 100644
index 0000000..43dbc04
--- /dev/null
+++ b/inst/doc/qqman.html
@@ -0,0 +1,363 @@
+<!DOCTYPE html>
+<html>
+<head>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
+
+<title>Intro to the <strong>qqman</strong> package</title>
+
+<!-- Styles for R syntax highlighter -->
+<style type="text/css">
+   pre .operator,
+   pre .paren {
+     color: rgb(104, 118, 135)
+   }
+
+   pre .literal {
+     color: #990073
+   }
+
+   pre .number {
+     color: #099;
+   }
+
+   pre .comment {
+     color: #998;
+     font-style: italic
+   }
+
+   pre .keyword {
+     color: #900;
+     font-weight: bold
+   }
+
+   pre .identifier {
+     color: rgb(0, 0, 0);
+   }
+
+   pre .string {
+     color: #d14;
+   }
+</style>
+
+<!-- R syntax highlighter -->
+<script type="text/javascript">
+var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/</gm,"<")}function f(r,q,p){return RegExp(q,"m"+(r.cI?"i":"")+(p?"g":""))}function b(r){for(var p=0;p<r.childNodes.length;p++){var q=r.childNodes[p];if(q.nodeName=="CODE"){return q}if(!(q.nodeType==3&&q.nodeValue.match(/\s+/))){break}}}function h(t,s){var p="";for(var r=0;r<t.childNodes.length;r++){if(t.childNodes[r].nodeType==3){var q=t.childNodes[r].nodeValue;if(s){q=q.replace(/\n/g,"")}p+=q}else{if(t.chi [...]
+hljs.initHighlightingOnLoad();
+</script>
+
+<!-- MathJax scripts -->
+<script type="text/javascript" src="https://c328740.ssl.cf1.rackcdn.com/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
+</script>
+
+
+<style type="text/css">
+body, td {
+   font-family: sans-serif;
+   background-color: white;
+   font-size: 13px;
+}
+
+body {
+  max-width: 800px;
+  margin: auto;
+  padding: 1em;
+  line-height: 20px;
+}
+
+tt, code, pre {
+   font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace;
+}
+
+h1 {
+   font-size:2.2em;
+}
+
+h2 {
+   font-size:1.8em;
+}
+
+h3 {
+   font-size:1.4em;
+}
+
+h4 {
+   font-size:1.0em;
+}
+
+h5 {
+   font-size:0.9em;
+}
+
+h6 {
+   font-size:0.8em;
+}
+
+a:visited {
+   color: rgb(50%, 0%, 50%);
+}
+
+pre, img {
+  max-width: 100%;
+}
+
+pre code {
+   display: block; padding: 0.5em;
+}
+
+code {
+  font-size: 92%;
+  border: 1px solid #ccc;
+}
+
+code[class] {
+  background-color: #F8F8F8;
+}
+
+table, td, th {
+  border: none;
+}
+
+blockquote {
+   color:#666666;
+   margin:0;
+   padding-left: 1em;
+   border-left: 0.5em #EEE solid;
+}
+
+hr {
+   height: 0px;
+   border-bottom: none;
+   border-top-width: thin;
+   border-top-style: dotted;
+   border-top-color: #999999;
+}
+
+ at media print {
+   * {
+      background: transparent !important;
+      color: black !important;
+      filter:none !important;
+      -ms-filter: none !important;
+   }
+
+   body {
+      font-size:12pt;
+      max-width:100%;
+   }
+
+   a, a:visited {
+      text-decoration: underline;
+   }
+
+   hr {
+      visibility: hidden;
+      page-break-before: always;
+   }
+
+   pre, blockquote {
+      padding-right: 1em;
+      page-break-inside: avoid;
+   }
+
+   tr, img {
+      page-break-inside: avoid;
+   }
+
+   img {
+      max-width: 100% !important;
+   }
+
+   @page :left {
+      margin: 15mm 20mm 15mm 10mm;
+   }
+
+   @page :right {
+      margin: 15mm 10mm 15mm 20mm;
+   }
+
+   p, h2, h3 {
+      orphans: 3; widows: 3;
+   }
+
+   h2, h3 {
+      page-break-after: avoid;
+   }
+}
+</style>
+
+
+
+</head>
+
+<body>
+<!--
+%\VignetteEngine{knitr::knitr}
+%\VignetteIndexEntry{Intro to the qqman package}
+-->
+
+<h1>Intro to the <strong>qqman</strong> package</h1>
+
+<p>The <strong>qqman</strong> package includes functions for creating manhattan plots and q-q plots from GWAS results. The <code>gwasResults</code> data.frame included with the package has simulated results for 16,470 SNPs on 22 chromosomes. Take a look at the data:</p>
+
+<pre><code class="r">str(gwasResults)
+</code></pre>
+
+<pre><code>'data.frame':   16470 obs. of  5 variables:
+ $ SNP   : chr  "rs1" "rs2" "rs3" "rs4" ...
+ $ CHR   : int  1 1 1 1 1 1 1 1 1 1 ...
+ $ BP    : int  1 2 3 4 5 6 7 8 9 10 ...
+ $ P     : num  0.915 0.937 0.286 0.83 0.642 ...
+ $ zscore: num  0.107 0.0789 1.0666 0.2141 0.4653 ...
+</code></pre>
+
+<pre><code class="r">head(gwasResults)
+</code></pre>
+
+<pre><code>  SNP CHR BP      P  zscore
+1 rs1   1  1 0.9148 0.10698
+2 rs2   1  2 0.9371 0.07895
+3 rs3   1  3 0.2861 1.06663
+4 rs4   1  4 0.8304 0.21413
+5 rs5   1  5 0.6417 0.46526
+6 rs6   1  6 0.5191 0.64474
+</code></pre>
+
+<pre><code class="r">tail(gwasResults)
+</code></pre>
+
+<pre><code>          SNP CHR  BP      P zscore
+16465 rs16465  22 530 0.5644 0.5764
+16466 rs16466  22 531 0.1383 1.4822
+16467 rs16467  22 532 0.3937 0.8529
+16468 rs16468  22 533 0.1779 1.3473
+16469 rs16469  22 534 0.2393 1.1767
+16470 rs16470  22 535 0.2630 1.1192
+</code></pre>
+
+<p>How many SNPs on each chromosome?</p>
+
+<pre><code class="r">as.data.frame(table(gwasResults$CHR))
+</code></pre>
+
+<pre><code>   Var1 Freq
+1     1 1500
+2     2 1191
+3     3 1040
+4     4  945
+5     5  877
+6     6  825
+7     7  784
+8     8  750
+9     9  721
+10   10  696
+11   11  674
+12   12  655
+13   13  638
+14   14  622
+15   15  608
+16   16  595
+17   17  583
+18   18  572
+19   19  562
+20   20  553
+21   21  544
+22   22  535
+</code></pre>
+
+<h2>Creating manhattan plots</h2>
+
+<p>Now, let's make a basic manhattan plot. </p>
+
+<pre><code class="r">manhattan(gwasResults)
+</code></pre>
+
+<p><img src=" [...]
+
+<p>We can also pass in other graphical parameters. Let's add a title (<code>main=</code>), increase the y-axis limit (<code>ylim=</code>), reduce the point size to 60% (<code>cex=</code>), and reduce the font size of the axis labels to 90% (<code>cex.axis=</code>). While we're at it, let's change the colors (<code>col=</code>), remove the suggestive and genome-wide significance lines, and supply our own labels for the chromosomes:</p>
+
+<pre><code class="r">manhattan(gwasResults, main = "Manhattan Plot", ylim = c(0, 10), cex = 0.6, 
+    cex.axis = 0.9, col = c("blue4", "orange3"), suggestiveline = F, genomewideline = F, 
+    chrlabs = c(1:20, "P", "Q"))
+</code></pre>
+
+<p><img src=" [...]
+
+<p>Now, let's look at a single chromosome:</p>
+
+<pre><code class="r">manhattan(subset(gwasResults, CHR == 1))
+</code></pre>
+
+<p><img src=" [...]
+
+<p>Let's highlight some SNPs of interest on chromosome 3. The 100 SNPs we're highlighting here are in a character vector called <code>snpsOfInterest</code>. You'll get a warning if you try to highlight SNPs that don't exist.</p>
+
+<pre><code class="r">str(snpsOfInterest)
+</code></pre>
+
+<pre><code> chr [1:100] "rs3001" "rs3002" "rs3003" "rs3004" "rs3005" ...
+</code></pre>
+
+<pre><code class="r">manhattan(gwasResults, highlight = snpsOfInterest)
+</code></pre>
+
+<p><img src=" [...]
+
+<p>We can combine highlighting and limiting to a single chromosome, and use the <code>xlim</code> graphical parameter to zoom in on a region of interest (between position 200-500):</p>
+
+<pre><code class="r">manhattan(subset(gwasResults, CHR == 3), highlight = snpsOfInterest, xlim = c(200, 
+    500), main = "Chr 3")
+</code></pre>
+
+<p><img src=" [...]
+
+<p>Finally, the <code>manhattan</code> function can be used to plot any value, not just p-values. Here, we'll simply call the function passing to the <code>p=</code> argument the name of the column we want to plot instead of the default “P” column. In this example, let's create a test statistic (“zscore”), plot that instead of p-values, change the y-axis label, and remove the default log transformation. We'll also remove the genomewide and suggestive l [...]
+
+<pre><code class="r"># Add test statistics
+gwasResults <- transform(gwasResults, zscore = qnorm(P/2, lower.tail = FALSE))
+head(gwasResults)
+</code></pre>
+
+<pre><code>  SNP CHR BP      P  zscore
+1 rs1   1  1 0.9148 0.10698
+2 rs2   1  2 0.9371 0.07895
+3 rs3   1  3 0.2861 1.06663
+4 rs4   1  4 0.8304 0.21413
+5 rs5   1  5 0.6417 0.46526
+6 rs6   1  6 0.5191 0.64474
+</code></pre>
+
+<pre><code class="r"># Make the new plot
+manhattan(gwasResults, p = "zscore", logp = FALSE, ylab = "Z-score", genomewideline = FALSE, 
+    suggestiveline = FALSE, main = "Manhattan plot of Z-scores")
+</code></pre>
+
+<p><img src=" [...]
+
+<p>A few notes on creating manhattan plots:</p>
+
+<ul>
+<li>Run <code>str(gwasResults)</code>. Notice that the <code>gwasResults</code> data.frame has SNP, chromosome, position, and p-value columns named <code>SNP</code>, <code>CHR</code>, <code>BP</code>, and <code>P</code>. If you're creating a manhattan plot and your column names are different, you'll have to pass the column names to the <code>chr=</code>, <code>bp=</code>, <code>p=</code>, and <code>snp=</code> arguments. See <code>help(manhattan)</code> for details.</li>
+<li>The chromosome column must be numeric. If you have “X,” “Y,” or “MT” chromosomes, you'll need to rename these 23, 24, 25, etc. You can modify the source code (e.g., <code>fix(manhattan)</code>) to change the line designating the axis tick labels (<code>labs <- unique(d$CHR)</code>) to set this to whatever you'd like it to be.</li>
+<li>If you'd like to change the color of the highlight or the suggestive/genomewide lines, you'll need to modify the source code. Search for <code>col="blue"</code>, <code>col="red"</code>, or <code>col="green3"</code> to modify the suggestive line, genomewide line, and highlight colors, respectively.</li>
+</ul>
+
+<h2>Creating Q-Q plots</h2>
+
+<p>Creating Q-Q plots is straightforward - simply supply a vector of p-values to the <code>qq()</code> function. </p>
+
+<pre><code class="r">qq(gwasResults$P)
+</code></pre>
+
+<p><img src=" [...]
+
+<p>We can otionally supply many other graphical parameters.</p>
+
+<pre><code class="r">qq(gwasResults$P, main = "Q-Q plot of GWAS p-values", xlim = c(0, 7), ylim = c(0, 
+    12), pch = 18, col = "blue4", cex = 1.5, las = 1)
+</code></pre>
+
+<p><img src=" [...]
+
+</body>
+
+</html>
diff --git a/man/gwasResults.Rd b/man/gwasResults.Rd
new file mode 100644
index 0000000..ee31dd0
--- /dev/null
+++ b/man/gwasResults.Rd
@@ -0,0 +1,9 @@
+% Generated by roxygen2 (4.0.1): do not edit by hand
+\docType{data}
+\name{gwasResults}
+\alias{gwasResults}
+\title{Simulated GWAS results}
+\description{
+Simulated GWAS results as obtained from \code{plink --assoc}.
+}
+
diff --git a/man/manhattan.Rd b/man/manhattan.Rd
new file mode 100644
index 0000000..df48072
--- /dev/null
+++ b/man/manhattan.Rd
@@ -0,0 +1,60 @@
+% Generated by roxygen2 (4.0.1): do not edit by hand
+\name{manhattan}
+\alias{manhattan}
+\title{Creates a manhattan plot}
+\usage{
+manhattan(x, chr = "CHR", bp = "BP", p = "P", snp = "SNP",
+  col = c("gray10", "gray60"), chrlabs = NULL,
+  suggestiveline = -log10(1e-05), genomewideline = -log10(5e-08),
+  highlight = NULL, logp = TRUE, ...)
+}
+\arguments{
+\item{x}{A data.frame with columns "BP," "CHR," "P," and optionally, "SNP."}
+
+\item{chr}{A string denoting the column name for the chromosome. Defaults to
+PLINK's "CHR." Said column must be numeric. If you have X, Y, or MT
+chromosomes, be sure to renumber these 23, 24, 25, etc.}
+
+\item{bp}{A string denoting the column name for the chromosomal position.
+Defaults to PLINK's "BP." Said column must be numeric.}
+
+\item{p}{A string denoting the column name for the p-value. Defaults to
+PLINK's "P." Said column must be numeric.}
+
+\item{snp}{A string denoting the column name for the SNP name (rs number).
+Defaults to PLINK's "SNP." Said column should be a character.}
+
+\item{col}{A character vector indicating which colors to alternate.}
+
+\item{chrlabs}{A character vector equal to the number of chromosomes
+specifying the chromosome labels (e.g., \code{c(1:22, "X", "Y", "MT")}).}
+
+\item{suggestiveline}{Where to draw a "suggestive" line. Default
+-log10(1e-5). Set to FALSE to disable.}
+
+\item{genomewideline}{Where to draw a "genome-wide sigificant" line. Default
+-log10(5e-8). Set to FALSE to disable.}
+
+\item{highlight}{A character vector of SNPs in your dataset to highlight.
+These SNPs should all be in your dataset.}
+
+\item{logp}{If TRUE, the -log10 of the p-value is plotted. It isn't very
+useful to plot raw p-values, but plotting the raw value could be useful for
+other genome-wide plots, for example, peak heights, bayes factors, test
+statistics, other "scores," etc.}
+
+\item{...}{Arguments passed on to other plot/points functions}
+}
+\value{
+A manhattan plot.
+}
+\description{
+Creates a manhattan plot from PLINK assoc output (or any data frame with
+chromosome, position, and p-value).
+}
+\examples{
+manhattan(gwasResults)
+}
+\keyword{manhattan}
+\keyword{visualization}
+
diff --git a/man/qq.Rd b/man/qq.Rd
new file mode 100644
index 0000000..f97730e
--- /dev/null
+++ b/man/qq.Rd
@@ -0,0 +1,25 @@
+% Generated by roxygen2 (4.0.1): do not edit by hand
+\name{qq}
+\alias{qq}
+\title{Creates a Q-Q plot}
+\usage{
+qq(pvector, ...)
+}
+\arguments{
+\item{pvector}{A numeric vector of p-values.}
+
+\item{...}{Other arguments passed to \code{plot()}}
+}
+\value{
+A Q-Q plot.
+}
+\description{
+Creates a quantile-quantile plot from p-values from a GWAS study.
+}
+\examples{
+qq(gwasResults$P)
+}
+\keyword{qq}
+\keyword{qqplot}
+\keyword{visualization}
+
diff --git a/man/qqman.Rd b/man/qqman.Rd
new file mode 100644
index 0000000..30ce912
--- /dev/null
+++ b/man/qqman.Rd
@@ -0,0 +1,15 @@
+% Generated by roxygen2 (4.0.1): do not edit by hand
+\docType{package}
+\name{qqman}
+\alias{qqman}
+\alias{qqman-package}
+\title{Create Q-Q and manhattan plots for GWAS data.}
+\description{
+A package for creating Q-Q and manhattan plots for GWAS data. See the package
+vignette for details: \cr\cr
+\code{vignette("qqman")}
+}
+\author{
+Stephen Turner <\url{http://stephenturner.us}>
+}
+
diff --git a/man/snpsOfInterest.Rd b/man/snpsOfInterest.Rd
new file mode 100644
index 0000000..411b532
--- /dev/null
+++ b/man/snpsOfInterest.Rd
@@ -0,0 +1,9 @@
+% Generated by roxygen2 (4.0.1): do not edit by hand
+\docType{data}
+\name{snpsOfInterest}
+\alias{snpsOfInterest}
+\title{snpsOfInterest}
+\description{
+snpsOfInterest
+}
+
diff --git a/vignettes/qqman.Rmd b/vignettes/qqman.Rmd
new file mode 100644
index 0000000..c5fe6f8
--- /dev/null
+++ b/vignettes/qqman.Rmd
@@ -0,0 +1,133 @@
+---
+title: "Intro to the qqman package"
+author: "Stephen D. Turner"
+date: '`r Sys.Date()`'
+output:
+  html_document:
+    toc: yes
+---
+
+<!--
+%\VignetteEngine{knitr::knitr}
+%\VignetteIndexEntry{Intro to the qqman package}
+-->
+
+```{r, include=FALSE}
+library(qqman)
+library(knitr)
+opts_chunk$set(comment=NA, fig.width=12, fig.height=9, message=FALSE, tidy=TRUE, dpi=75)
+```
+
+# Intro to the **qqman** package
+
+```{r generatedata, eval=FALSE, echo=FALSE}
+# This code used to generate the test data. Runs slow, but does the job.
+chrstats <- data.frame(chr=1:22, nsnps=1500)
+chrstats$nsnps <- with(chrstats, round(nsnps/chr^(1/3)))
+chrstats
+
+d <- data.frame(SNP=rep(NA, sum(chrstats$nsnps)), 
+                CHR=rep(NA, sum(chrstats$nsnps)), 
+                BP=rep(NA, sum(chrstats$nsnps)), 
+                P=rep(NA, sum(chrstats$nsnps)))
+snpi <- 1
+set.seed(42)
+for (i in chrstats$chr) {
+    for (j in 1:chrstats[i, 2]) {
+        d[snpi, ]$SNP=paste0("rs", snpi)
+        d[snpi, ]$CHR=i
+        d[snpi, ]$BP=j
+        d[snpi, ]$P=runif(1)
+        snpi <- snpi+1
+    }
+}
+
+divisor <- c(seq(2,50,2), seq(50,2,-2))
+divisor <- divisor^4
+length(divisor)
+d[3026:3075, ]$P <- d[3026:3075, ]$P/divisor
+snpsOfInterest <- paste0("rs", 3001:3100)
+qq(d$P)
+manhattan(d, highlight=snpsOfInterest)
+gwasResults <- d
+save(gwasResults, file="data/gwasResults.RData")
+```
+
+The **qqman** package includes functions for creating manhattan plots and q-q plots from GWAS results. The `gwasResults` data.frame included with the package has simulated results for 16,470 SNPs on 22 chromosomes. Take a look at the data:
+
+```{r}
+str(gwasResults)
+head(gwasResults)
+tail(gwasResults)
+```
+
+How many SNPs on each chromosome?
+
+```{r}
+as.data.frame(table(gwasResults$CHR))
+```
+
+## Creating manhattan plots
+
+Now, let's make a basic manhattan plot. 
+
+```{r}
+manhattan(gwasResults)
+```
+
+We can also pass in other graphical parameters. Let's add a title (`main=`), increase the y-axis limit (`ylim=`), reduce the point size to 60% (`cex=`), and reduce the font size of the axis labels to 90% (`cex.axis=`). While we're at it, let's change the colors (`col=`), remove the suggestive and genome-wide significance lines, and supply our own labels for the chromosomes:
+
+```{r}
+manhattan(gwasResults, main="Manhattan Plot", ylim=c(0,10), cex=0.6, cex.axis=0.9, col=c("blue4", "orange3"), suggestiveline=F, genomewideline=F, chrlabs=c(1:20, "P", "Q"))
+```
+
+Now, let's look at a single chromosome:
+
+```{r}
+manhattan(subset(gwasResults, CHR==1))
+```
+
+Let's highlight some SNPs of interest on chromosome 3. The 100 SNPs we're highlighting here are in a character vector called `snpsOfInterest`. You'll get a warning if you try to highlight SNPs that don't exist.
+
+```{r}
+str(snpsOfInterest)
+manhattan(gwasResults, highlight=snpsOfInterest)
+```
+
+We can combine highlighting and limiting to a single chromosome, and use the `xlim` graphical parameter to zoom in on a region of interest (between position 200-500):
+
+```{r}
+manhattan(subset(gwasResults, CHR==3), highlight=snpsOfInterest, xlim=c(200, 500), main="Chr 3")
+```
+
+Finally, the `manhattan` function can be used to plot any value, not just p-values. Here, we'll simply call the function passing to the `p=` argument the name of the column we want to plot instead of the default "P" column. In this example, let's create a test statistic ("zscore"), plot that instead of p-values, change the y-axis label, and remove the default log transformation. We'll also remove the genomewide and suggestive lines because these are only meaningful if you're plotting -lo [...]
+
+```{r}
+# Add test statistics
+gwasResults <- transform(gwasResults, zscore=qnorm(P/2, lower.tail=FALSE))
+head(gwasResults)
+
+# Make the new plot
+manhattan(gwasResults, p="zscore", logp=FALSE, ylab="Z-score", genomewideline=FALSE, suggestiveline=FALSE, main="Manhattan plot of Z-scores")
+```
+
+A few notes on creating manhattan plots:
+
+* Run `str(gwasResults)`. Notice that the `gwasResults` data.frame has SNP, chromosome, position, and p-value columns named `SNP`, `CHR`, `BP`, and `P`. If you're creating a manhattan plot and your column names are different, you'll have to pass the column names to the `chr=`, `bp=`, `p=`, and `snp=` arguments. See `help(manhattan)` for details.
+* The chromosome column must be numeric. If you have "X," "Y," or "MT" chromosomes, you'll need to rename these 23, 24, 25, etc. You can modify the source code (e.g., `fix(manhattan)`) to change the line designating the axis tick labels (`labs <- unique(d$CHR)`) to set this to whatever you'd like it to be.
+* If you'd like to change the color of the highlight or the suggestive/genomewide lines, you'll need to modify the source code. Search for `col="blue"`, `col="red"`, or `col="green3"` to modify the suggestive line, genomewide line, and highlight colors, respectively.
+
+## Creating Q-Q plots
+
+Creating Q-Q plots is straightforward - simply supply a vector of p-values to the `qq()` function. 
+
+```{r}
+qq(gwasResults$P)
+```
+
+We can otionally supply many other graphical parameters.
+
+```{r}
+qq(gwasResults$P, main="Q-Q plot of GWAS p-values",
+   xlim=c(0,7), ylim=c(0,12), pch=18, col="blue4", cex=1.5, las=1)
+```
\ No newline at end of file

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/r-cran-qqman.git



More information about the debian-med-commit mailing list