[med-svn] [r-cran-irlba] 03/05: New upstream version 2.1.2

Tue Oct 10 15:15:18 UTC 2017

This is an automated email from the git hooks/post-receive script.

tille pushed a commit to branch master
in repository r-cran-irlba.

commit 2983a2794cdabd92e044b010e21ec196c7d03b2c
Author: Andreas Tille <tille at debian.org>
Date:   Tue Oct 10 17:12:57 2017 +0200

    New upstream version 2.1.2
---
 DESCRIPTION                   |  27 ++
 MD5                           |  22 ++
 NAMESPACE                     |  12 +
 R/eigen.R                     |  73 +++++
 R/irlba.R                     | 640 ++++++++++++++++++++++++++++++++++++++++++
 R/prcomp.R                    |  96 +++++++
 R/utility.R                   |  69 +++++
 README.md                     |  33 +++
 build/vignette.rds            | Bin 0 -> 199 bytes
 debian/README.test            |   8 -
 debian/changelog              |  14 -
 debian/compat                 |   1 -
 debian/control                |  25 --
 debian/copyright              |  29 --
 debian/docs                   |   3 -
 debian/examples               |   1 -
 debian/rules                  |   4 -
 debian/source/format          |   1 -
 debian/tests/control          |   3 -
 debian/tests/run-unit-test    |  22 --
 debian/watch                  |   2 -
 demo/00Index                  |   1 +
 demo/custom_matrix_multiply.R |  44 +++
 inst/doc/irlba.Rnw            | 467 ++++++++++++++++++++++++++++++
 inst/doc/irlba.pdf            | Bin 0 -> 181643 bytes
 man/irlba.Rd                  | 200 +++++++++++++
 man/partial_eigen.Rd          |  65 +++++
 man/prcomp_irlba.Rd           |  76 +++++
 src/Makevars                  |   1 +
 src/irlb.c                    | 546 +++++++++++++++++++++++++++++++++++
 src/irlb.h                    |  65 +++++
 src/utility.c                 |  93 ++++++
 tests/edge.R                  |  66 +++++
 tests/test.R                  | 129 +++++++++
 vignettes/irlba.Rnw           | 467 ++++++++++++++++++++++++++++++
 35 files changed, 3192 insertions(+), 113 deletions(-)

diff --git a/DESCRIPTION b/DESCRIPTION
new file mode 100644
index 0000000..d507e0d
--- /dev/null
+++ b/DESCRIPTION
@@ -0,0 +1,27 @@
+Package: irlba
+Type: Package
+Title: Fast Truncated SVD, PCA and Symmetric Eigendecomposition for
+        Large Dense and Sparse Matrices
+Version: 2.1.2
+Date: 2016-09-20
+Authors at R: c(
+    person("Jim", "Baglama", rol=c("aut", "cph"), email="jbaglama at uri.edu"),
+    person("Lothar", "Reichel", role=c("aut", "cph"), email="reichel at math.kent.edu"),
+    person("B. W.", "Lewis", role=c("aut","cre","cph"), email="blewis at illposed.net"))
+Description: Fast and memory efficient methods for truncated singular and
+    eigenvalue decompositions and principal component analysis of large sparse or
+    dense matrices.
+Depends: Matrix
+LinkingTo: Matrix
+Imports: stats, methods
+License: GPL-3
+BugReports: https://github.com/bwlewis/irlba/issues
+RoxygenNote: 5.0.0
+NeedsCompilation: yes
+Packaged: 2016-09-21 13:57:09 UTC; blewis
+Author: Jim Baglama [aut, cph],
+  Lothar Reichel [aut, cph],
+  B. W. Lewis [aut, cre, cph]
+Maintainer: B. W. Lewis <blewis at illposed.net>
+Repository: CRAN
+Date/Publication: 2016-09-21 19:27:40
diff --git a/MD5 b/MD5
new file mode 100644
index 0000000..f6915e5
--- /dev/null
+++ b/MD5
@@ -0,0 +1,22 @@
+1a4cc9a6d803b51099c7e8019203340d *DESCRIPTION
+a92cc95a84584b03f782a11a0f14ccac *NAMESPACE
+a5caf177747baf3939d8da5a7a65f20d *R/eigen.R
+e2e2bd1fc7d563008881ce2f7c02047e *R/irlba.R
+261b601f66e15a95762cfcb25ff7bc52 *R/prcomp.R
+a3aeb39b2c56f9626fcc500a2efbff3c *R/utility.R
+b370501f642fc4e79215983fd2851f20 *README.md
+e5726491a241e7d7cd30836a46ef2ee9 *build/vignette.rds
+f7a30b7455d7c846dd5e4baf3a7a2963 *demo/00Index
+fa21422f4cbda8c9083f67e47421f99c *demo/custom_matrix_multiply.R
+e9b9777c575e2b56f2c85aa758ba4779 *inst/doc/irlba.Rnw
+4122e6a46391d621cb23bf6635653e3b *inst/doc/irlba.pdf
+4285679ab7678c954d2c0ad0e63417b3 *man/irlba.Rd
+a30fddf6e2d613164a96320d0cbc22f0 *man/partial_eigen.Rd
+3e425d237604f5286d6db3c9d32435ba *man/prcomp_irlba.Rd
+619d13ce6900fca2139e2139408d39ad *src/Makevars
+fd6a9dd38f353ec10da475f55d662314 *src/irlb.c
+27b357884a2232d31d8d9f5390391d08 *src/irlb.h
+aba4af25c20becc25598407f487abac9 *src/utility.c
+98df9161ed239100dcbb95109a0cf0ac *tests/edge.R
+26d6d03bbcfa4c377d475d392f74da30 *tests/test.R
+e9b9777c575e2b56f2c85aa758ba4779 *vignettes/irlba.Rnw
diff --git a/NAMESPACE b/NAMESPACE
new file mode 100644
index 0000000..1842492
--- /dev/null
+++ b/NAMESPACE
@@ -0,0 +1,12 @@
+# Generated by roxygen2: do not edit by hand
+
+export(irlba)
+export(partial_eigen)
+export(prcomp_irlba)
+import(Matrix)
+importFrom(methods,slot)
+importFrom(methods,slotNames)
+importFrom(stats,prcomp)
+importFrom(stats,rnorm)
+importFrom(stats,sd)
+useDynLib(irlba)
diff --git a/R/eigen.R b/R/eigen.R
new file mode 100644
index 0000000..287804f
--- /dev/null
+++ b/R/eigen.R
@@ -0,0 +1,73 @@
+#' Find a few approximate largest eigenvalues and corresponding eigenvectors of a symmetric matrix.
+#'
+#' Use \code{partial_eigen} to estimate a subset of the largest (most positive)
+#' eigenvalues and corresponding eigenvectors of a symmetric dense or sparse
+#' real-valued matrix.
+#'
+#' @param x numeric real-valued dense or sparse matrix.
+#' @param n number of largest eigenvalues and corresponding eigenvectors to compute.
+#' @param symmetric \code{TRUE} indicates \code{x} is a symmetric matrix (the default);
+#'   specify \code{symmetric=FALSE} to compute the largest eigenvalues and corresponding eigenvectors of \code{t(x) \%*\% x} instead.
+#' @param ... optional additional parameters passed to the \code{irlba} function.
+#'
+#' @return
+#' Returns a list with entries:
+#' \itemize{
+#'   \item{values}{ n approximate largest eigenvalues}
+#'   \item{vectors}{ n approximate corresponding eigenvectors}
+#' }
+#'
+#' @note
+#' Specify \code{symmetric=FALSE} to compute the largest \code{n} eigenvalues
+#' and corresponding eigenvectors of the symmetric matrix cross-product
+#' \code{t(x) \%*\% x}.
+#'
+#' This function uses the \code{irlba} function under the hood. See \code{?irlba}
+#' for description of additional options, especially the \code{tol} parameter.
+#'
+#' See the RSpectra package https://cran.r-project.org/package=RSpectra for more comprehensive
+#' partial eigenvalue decomposition.
+#'
+#' @references
+#' Augmented Implicitly Restarted Lanczos Bidiagonalization Methods, J. Baglama and L. Reichel, SIAM J. Sci. Comput. 2005.
+#'
+#' @examples
+#' set.seed(1)
+#' # Construct a symmetric matrix with some positive and negative eigenvalues:
+#' V <- qr.Q(qr(matrix(runif(100),nrow=10)))
+#' x <- V %*% diag(c(10, -9, 8, -7, 6, -5, 4, -3, 2, -1)) %*% t(V)
+#' partial_eigen(x, 3)$values
+#'
+#' # Compare with eigen
+#' eigen(x)$values[1:3]
+#'
+#' # Use symmetric=FALSE to compute the eigenvalues of t(x) %*% x for general
+#' # matrices x:
+#' x <- matrix(rnorm(100), 10)
+#' partial_eigen(x, 3, symmetric=FALSE)$values
+#' eigen(crossprod(x))$values
+#'
+#' @seealso \code{\link{eigen}}, \code{\link{irlba}}
+#' @export
+partial_eigen <- function(x, n=5, symmetric=TRUE, ...)
+{
+  if (n > 0.5 * min(nrow(x), ncol(x)))
+  {
+    warning("You're computing a large percentage of total eigenvalues, the standard eigen function will likely work better!")
+  }
+  if (!symmetric)
+  {
+    L <- irlba(x, n, ...)
+    return(list(vectors=L$v, values=L$d ^ 2))
+  }
+  L <- irlba(x, n, ...)
+  s <- sign(L$u[1, ] * L$v[1, ])
+  if (all(s > 0))
+  {
+    return(list(vectors=L$u, values=L$d))
+  }
+  i <- min(which(s < 0))
+  shift <- L$d[i]
+  L <- irlba(x, n, shift=shift, ...)
+  return(list(vectors=L$u, values=L$d - shift))
+}
diff --git a/R/irlba.R b/R/irlba.R
new file mode 100644
index 0000000..5371c6f
--- /dev/null
+++ b/R/irlba.R
@@ -0,0 +1,640 @@
+#' Find a few approximate largest singular values and corresponding
+#' singular vectors of a matrix.
+#'
+#' The augmented implicitly restarted Lanczos bidiagonalization algorithm
+#' (IRLBA) finds a few approximate largest singular values and corresponding
+#' singular vectors of a sparse or dense matrix using a method of Baglama and
+#' Reichel.  It is a fast and memory-efficient way to compute a partial SVD.
+#'
+#' @param A numeric real- or complex-valued matrix or real-valued sparse matrix.
+#' @param nv number of right singular vectors to estimate.
+#' @param nu number of left singular vectors to estimate (defaults to \code{nv}).
+#' @param maxit maximum number of iterations.
+#' @param work working subspace dimension, larger values can speed convergence at the cost of more memory use.
+#' @param reorth if \code{TRUE}, apply full reorthogonalization to both SVD bases, otherwise
+#'   only apply reorthogonalization to the right SVD basis vectors; the latter case is cheaper per
+#'   iteration but, overall, may require more iterations for convergence. Always set to \code{TRUE}
+#'   when \code{fastpath=TRUE} (see below).
+#' @param tol convergence is determined when \eqn{\|AV - US\| < tol\|A\|}{||AV - US|| < tol*||A||},
+#'   where the spectral norm ||A|| is approximated by the
+#'   largest estimated singular value, and U, V, S are the matrices corresponding
+#'   to the estimated left and right singular vectors, and diagonal matrix of
+#'   estimated singular values, respectively.
+#' @param v optional starting vector or output from a previous run of \code{irlba} used
+#'   to restart the algorithm from where it left off (see the notes).
+#' @param right_only logical value indicating return only the right singular vectors
+#'  (\code{TRUE}) or both sets of vectors (\code{FALSE}). The right_only option can be
+#'  cheaper to compute and use much less memory when \code{nrow(A) >> ncol(A)} but note
+#'  that \code{right_only = TRUE} sets \code{fastpath = FALSE} (only use this option
+#'  for really large problems that run out of memory).
+#' @param verbose logical value that when \code{TRUE} prints status messages during the computation.
+#' @param scale optional column scaling vector whose values divide each column of \code{A};
+#'   must be as long as the number of columns of \code{A} (see notes).
+#' @param center optional column centering vector whose values are subtracted from each
+#'   column of \code{A}; must be as long as the number of columns of \code{A} and may
+#"   not be used together with the deflation options below (see notes).
+#' @param du DEPRECATED optional subspace deflation vector (see notes).
+#' @param ds DEPRECATED optional subspace deflation scalar (see notes).
+#' @param dv DEPRECATED optional subspace deflation vector (see notes).
+#' @param shift optional shift value (square matrices only, see notes).
+#' @param mult optional custom matrix multiplication function (default is \code{\%*\%}, see notes).
+#' @param fastpath try a fast C algorithm implementation if possible; set \code{fastpath=FALSE} to use the reference R implementation. See notes.
+#'
+#' @return
+#' Returns a list with entries:
+#' \describe{
+#'   \item{d:}{ max(nu, nv) approximate singular values}
+#'   \item{u:}{ nu approximate left singular vectors (only when right_only=FALSE)}
+#'   \item{v:}{ nv approximate right singular vectors}
+#'   \item{iter:}{ The number of Lanczos iterations carried out}
+#'   \item{mprod:}{ The total number of matrix vector products carried out}
+#' }
+#'
+#' @note
+#' The syntax of \code{irlba} partially follows \code{svd}, with an important
+#' exception. The usual R \code{svd} function always returns a complete set of
+#' singular values, even if the number of singular vectors \code{nu} or \code{nv}
+#' is set less than the maximum. The \code{irlba} function returns a number of
+#' estimated singular values equal to the maximum of the number of specified
+#' singular vectors \code{nu} and \code{nv}.
+#'
+#' Use the optional \code{scale} parameter to implicitly scale each column of
+#' the matrix \code{A} by the values in the \code{scale} vector, computing the
+#' truncated SVD of the column-scaled \code{sweep(A, 2, scale, FUN=`/`)}, or
+#' equivalently, \code{A \%*\% diag(1 / scale)}, without explicitly forming the
+#' scaled matrix. \code{scale} must be a non-zero vector of length equal
+#' to the number of columns of \code{A}.
+#'
+#' Use the optional \code{center} parameter to implicitly subtract the values
+#' in the \code{center} vector from each column of \code{A}, computing the
+#' truncated SVD of \code{sweep(A, 2, center, FUN=`-`)},
+#' without explicitly forming the centered matrix. This option may not be
+#' used together with the general rank 1 deflation options. \code{center}
+#' must be a vector of length equal to the number of columns of \code{A}.
+#' This option may be used to efficiently compute principal components without
+#' explicitly forming the centered matrix (which can, importantly, preserve
+#' sparsity in the matrix). See the examples.
+#'
+#' The optional deflation parameters are deprecated and will be removed in
+#' a future version. They could be used to compute the rank-one deflated
+#' SVD of \eqn{A - ds \cdot du dv^T}{A - ds*du \%*\% t(dv)}, where
+#' \eqn{du^T A - ds\cdot dv^T = 0}{t(du) \%*\% A - ds * t(dv) == 0}. For
+#' example, the triple \code{ds, du, dv} may be a known singular value
+#' and corresponding singular vectors. Or \code{ds=m} and \code{dv}
+#' and \code{du} represent a vector of column means of \code{A} and of ones,
+#' respectively, where \code{m} is the number of rows of \code{A}.
+#' This functionality can be effectively replaced with custom matrix
+#' product functions.
+#'
+#' Specify an optional alternative matrix multiplication operator in the
+#' \code{mult} parameter. \code{mult} must be a function of two arguments,
+#' and must handle both cases where one argument is a vector and the other
+#' a matrix. See the examples and
+#' \code{demo("custom_matrix_multiply", package="irlba")} for an 
+#' alternative approach.
+#'
+#' Use the \code{v} option to supply a starting vector for the iterative
+#' method. A random vector is used by default (precede with \code{set.seed()} to
+#' for reproducibility). Optionally set \code{v} to
+#' the output of a previous run of \code{irlba} to restart the method, adding
+#' additional singular values/vectors without recomputing the solution
+#' subspace. See the examples.
+#'
+#' The function may generate the following warnings:
+#' \itemize{
+#'   \item{"did not converge--results might be invalid!; try increasing maxit or fastpath=FALSE" means that the algorithm didn't
+#'   converge -- this is potentially a serious problem and the returned results may not be valid. \code{irlba}
+#'   reports a warning here instead of an error so that you can inspect whatever is returned. If this
+#'   happens, carefully heed the warning and inspect the result.}
+#'   \item{"You're computing a large percentage of total singular values, standard svd might work better!"
+#'     \code{irlba} is designed to efficiently compute a few of the largest singular values and associated
+#'      singular vectors of a matrix. The standard \code{svd} function will be more efficient for computing
+#'      large numbers of singular values than \code{irlba}.}
+#'    \item{"convergence criterion below machine epsilon" means that the product of \code{tol} and the
+#'      largest estimated singular value is really small and the normal convergence criterion is only
+#'      met up to round off error.}
+#' }
+#' The function might return an error for several reasons including a situation when the starting
+#' vector \code{v} is near the null space of the matrix. In that case, try a different \code{v}.
+#'
+#' The \code{fastpath=TRUE} option only supports real-valued matrices and sparse matrices
+#' of type \code{dgCMatrix} (for now). Other problems fall back to the reference
+#' R implementation.
+#'
+#' @references
+#' Augmented Implicitly Restarted Lanczos Bidiagonalization Methods, J. Baglama and L. Reichel, SIAM J. Sci. Comput. 2005.
+#'
+#' @examples
+#' set.seed(1)
+#'
+#' A <- matrix(runif(400), nrow=20)
+#' S <- irlba(A, 3)
+#' S$d
+#'
+#' # Compare with svd
+#' svd(A)$d[1:3]
+#'
+#' # Restart the algorithm to compute more singular values
+#' # (starting with an existing solution S)
+#' S1 <- irlba(A, 5, v=S)
+#'
+#' # Principal components (see also prcomp_irlba)
+#' P <- irlba(A, nv=1, center=colMeans(A))
+#'
+#' # Compare with prcomp and prcomp_irlba (might vary up to sign)
+#' cbind(P$v,
+#'       prcomp(A)$rotation[, 1],
+#'       prcomp_irlba(A)$rotation[, 1])
+#'
+#' # A custom matrix multiplication function that scales the columns of A
+#' # (cf the scale option). This function scales the columns of A to unit norm.
+#' col_scale <- sqrt(apply(A, 2, crossprod))
+#' mult <- function(x, y)
+#'         {
+#'           # check if x is a  vector
+#'           if (is.vector(x))
+#'           {
+#'             return((x %*% y) / col_scale)
+#'           }
+#'           # else x is the matrix
+#'           x %*% (y / col_scale)
+#'         }
+#' irlba(A, 3, mult=mult)$d
+#'
+#' # Compare with:
+#' irlba(A, 3, scale=col_scale)$d
+#'
+#' # Compare with:
+#' svd(sweep(A, 2, col_scale, FUN=`/`))$d[1:3]
+#'
+#' @seealso \code{\link{svd}}, \code{\link{prcomp}}, \code{\link{partial_eigen}}
+#' @import Matrix
+#' @importFrom stats rnorm
+#' @importFrom methods slotNames
+#' @useDynLib irlba
+#' @export
+irlba <-
+function (A,                     # data matrix
+          nv=5, nu,              # number of singular vectors to estimate
+          maxit=1000,            # maximum number of iterations
+          work=nv + 7,           # working subspace size
+          reorth=TRUE,           # TRUE=full reorthogonalization
+          tol=1e-5,              # stopping tolerance
+          v=NULL,                # optional starting vector or restart
+          right_only=FALSE,      # TRUE=only return V
+          verbose=FALSE,         # display status messages
+          scale,                 # optional column scaling
+          center,                # optional column centering
+          du, ds, dv,            # optional general rank 1 deflation
+          shift,                 # optional shift for square matrices
+          mult,                  # optional custom matrix multiplication func.
+          fastpath=TRUE)         # use the faster C implementation if possible
+{
+# ---------------------------------------------------------------------
+# Check input parameters
+# ---------------------------------------------------------------------
+  ropts <- options(warn=1) # immediately show warnings
+  on.exit(options(ropts))  # reset on exit
+  eps <- .Machine$double.eps
+  deflate <- missing(du) + missing(ds) + missing(dv)
+  if (deflate == 3)
+  {
+    deflate <- FALSE
+  } else if (deflate == 0)
+  {
+    deflate <- TRUE
+    warning("The deflation options are deprecated and will be removed in a future version.")
+    if (length(ds) > 1) stop("deflation limited to one dimension")
+    if (!is.null(dim(du))) du <- du[, 1]
+    if (!is.null(dim(dv))) dv <- dv[, 1]
+  } else stop("all three du ds dv parameters must be specified for deflation")
+  if (!missing(center))
+  {
+    if (deflate) stop("the center parameter can't be specified together with deflation parameters")
+    if (length(center) != ncol(A)) stop("center must be a vector of length ncol(A)")
+    if (fastpath && ! right_only) du <- NULL
+    else du <- 1
+    ds <- 1
+    dv <- center
+    deflate <- TRUE
+  }
+  iscomplex <- is.complex(A)
+  m <- nrow(A)
+  n <- ncol(A)
+  if (missing(nu)) nu <- nv
+  if (!missing(mult) && deflate) stop("the mult parameter can't be specified together with deflation parameters")
+  missingmult <- FALSE
+  if (missing(mult))
+  {
+    missingmult <- TRUE
+    mult <- `%*%`
+  }
+  k <- max(nu, nv)
+  if (k <= 0)  stop("max(nu, nv) must be positive")
+  if (k > min(m - 1, n - 1)) stop("max(nu, nv) must be strictly less than min(nrow(A), ncol(A))")
+  if (k >= 0.5 * min(m, n))
+  {
+    warning("You're computing a large percentage of total singular values, standard svd might work better!")
+  }
+  if (work <= 1) stop("work must be greater than 1")
+  if (tol < 0) stop("tol must be non-negative")
+  if (maxit <= 0) stop("maxit must be positive")
+  if (work <= k) work <- k + 1 # work must be strictly larger than requested subspace dimension
+  if (work >= min(n, m))
+  {
+    work <- min(n, m)
+    if (work <= k)
+    {
+      k <- work - 1  # the best we can do! Need to reduce output subspace dimension
+      warning("Requested subspace dimension too large! Reduced to ", k)
+    }
+  }
+  k_org <- k
+  w_dim <- work
+  if (right_only)
+  {
+    w_dim <- 1
+    work <- min(min(m, n), work + 20 ) # typically need this to help convergence
+    fastpath <- FALSE
+  }
+
+  if (verbose)
+  {
+    message ("Working dimension size ", work)
+  }
+# Check for tiny problem, use standard SVD in that case. Make definition of 'tiny' larger?
+  if (min(m, n) < 6)
+  {
+    if (verbose) message ("Tiny problem detected, using standard `svd` function.")
+    if (!missing(scale)) A <- A / scale
+    if (!missing(shift)) A <- A + diag(shift)
+    if (deflate)
+    {
+      if (is.null(du)) du <- rep(1, nrow(A))
+      A <- A - (ds * du) %*% t(dv)
+    }
+    s <- svd(A)
+    return(list(d=s$d[1:k], u=s$u[, 1:nu, drop=FALSE],
+              v=s$v[, 1:nv, drop=FALSE], iter=0, mprod=0))
+  }
+
+# Try to use the fast C-language code path
+  if (deflate) fastpath <- fastpath && is.null(du)
+# Only dgCMatrix supported by fastpath for now
+  if ("Matrix" %in% attributes(class(A)) && ! ("dgCMatrix" %in% class(A)))
+  {
+    fastpath <- FALSE
+  }
+# Check for custom class
+  if ("matrix" %in% attributes(A)$.S3Class && ! ("matrix" %in% class(A)))
+  {
+    fastpath <- FALSE
+  }
+  if (fastpath && missingmult && !iscomplex && !right_only)
+  {
+    RESTART <- 0
+    RV <- RW <- RS <- NULL
+    if (is.null(v))
+    {
+      v <- rnorm(n)
+      if (verbose) message("Initializing starting vector v with samples from standard normal distribution.\nUse `set.seed` first for reproducibility.")
+    } else if (is.list(v))  # restarted case
+    {
+      if (is.null(v$v) || is.null(v$d) || is.null(v$u)) stop("restart requires left and right singular vectors")
+      if (max(nu, nv) <= min(ncol(v$u), ncol(v$v))) return(v) # Nothing to do!
+      RESTART <- as.integer(length(v$d))
+      RND <- rnorm(n)
+      RND <- orthog(RND, v$v)
+      RV <- cbind(v$v, RND / norm2(RND))
+      RW <- v$u
+      RS <- v$d
+      v <- NULL
+    }
+
+    SP <- ifelse(is.matrix(A), 0L, 1L)
+    if (verbose) message("irlba: using fast C implementation")
+    SCALE <- NULL
+    SHIFT <- NULL
+    CENTER <- NULL
+    if (!missing(scale))
+    {
+      if (length(scale) != ncol(A)) stop("scale length must mactch number of matrix columns")
+      SCALE <- as.double(scale)
+    }
+    if (!missing(shift))
+    {
+      if (length(shift) != 1) stop("shift length must be 1")
+      SHIFT <- as.double(shift)
+    }
+    if (deflate)
+    {
+      if (length(center) != ncol(A)) stop("the centering vector length must match the number of matrix columns")
+      CENTER <- as.double(center)
+    }
+    ans <- .Call("IRLB", A, as.integer(k), as.double(v), as.integer(work),
+                 as.integer(maxit), as.double(tol), .Machine$double.eps, as.integer(SP),
+                 RESTART, RV, RW, RS, SCALE, SHIFT, CENTER, PACKAGE="irlba")
+    if (ans[[6]] == 0 || ans[[6]] == -2)
+    {
+      names(ans) <- c("d", "u", "v", "iter", "mprod", "err")
+      ans$u <- matrix(head(ans$u, m * nu), nrow=m, ncol=nu)
+      ans$v <- matrix(head(ans$v, n * nv), nrow=n, ncol=nv)
+      if(tol * ans$d[1] < eps) warning("convergence criterion below machine epsilon")
+      if (ans[[6]] == -2) warning("did not converge--results might be invlaid!; try increasing maxit or fastpath=FALSE")
+      return(ans[-6])
+    }
+    errors <- c("invalid dimensions",
+                "didn't converge",
+                "out of memory",
+                "starting vector near the null space",
+                "linear dependency encountered")
+    erridx <- abs(ans[[6]])
+    if (erridx > 1)
+      warning("fast code path error ", errors[erridx], "; re-trying with fastpath=FALSE.", immediate.=TRUE)
+  }
+
+# Allocate memory for W and F:
+  W <- matrix(0.0, m, w_dim)
+  F <- matrix(0.0, n, 1)
+  restart <- FALSE
+  if (is.list(v))
+  {
+    if (is.null(v$v) || is.null(v$d) || is.null(v$u)) stop("restart requires left and right singular vectors")
+    if (max(nu, nv) <= min(ncol(v$u), ncol(v$v))) return(v) # Nothing to do!
+    right_only <- FALSE
+    W[, 1:ncol(v$u)] <- v$u
+    d <- v$d
+    V <- matrix(0.0, n, work)
+    V[, 1:ncol(v$v)] <- v$v
+    restart <- TRUE
+  } else if (is.null(v))
+  {
+# If starting matrix v is not given then set V to be an (n x 1) matrix of
+# normally distributed random numbers.  In any case, allocate V appropriate to
+# problem size:
+    V <- matrix(0.0, n, work)
+    V[, 1] <- rnorm(n)
+  } else
+  {
+# user-supplied starting subspace
+    V <- matrix(0.0, n, work)
+    V[1:length(v)] <- v
+  }
+
+# ---------------------------------------------------------------------
+# Initialize local variables
+# ---------------------------------------------------------------------
+  B <- NULL                  # Bidiagonal matrix
+  Bsz <- NULL                # Size of B
+  eps23 <- eps ^ (2 / 3)     # Used for Smax/avoids using zero
+  eps2 <- 2 * eps
+  iter <- 1                  # Man loop iteration count
+  mprod <- 0                 # Number of matrix-vector products
+  R_F <- NULL                # 2-norm of residual vector F
+  sqrteps <- sqrt(eps)       #
+  Smax <- 1                  # Max value of all computed singular values of
+                             # B est. ||A||_2
+  Smin <- NULL               # Min value of all computed singular values of
+                             # B est. cond(A)
+  SVTol <- tol  # Tolerance for singular vector convergence
+
+# Check for user-supplied restart condition
+  if (restart)
+  {
+    B <- cbind(diag(d), 0)
+    k <- length(d)
+
+    F <- rnorm(n)
+    F <- orthog(F, V[, 1:k])
+    V[, k + 1] <- F / norm2(F)
+  }
+
+# ---------------------------------------------------------------------
+# Main iteration
+# ---------------------------------------------------------------------
+  while (iter <= maxit)
+  {
+# ---------------------------------------------------------------------
+# Compute the Lanczos bidiagonal decomposition:
+# such that AV  = WB
+# and       t(A)W = VB + Ft(E)
+# This routine updates W, V, F, B, mprod
+#
+# Note on scale and center: These options are applied implicitly below
+# for maximum computational efficiency. This complicates their application
+# somewhat, but saves a few flops.
+# ---------------------------------------------------------------------
+    j <- 1
+#   Normalize starting vector:
+    if (iter == 1 && !restart)
+    {
+      V[, 1] <- V[, 1] / norm2(V[, 1])
+    }
+    else j <- k + 1
+#   j_w is used here to support the right_only=TRUE case.
+    j_w <- ifelse(w_dim > 1, j, 1)
+
+#   Compute W=AV
+#   Optionally apply scale
+    VJ <- V[, j]
+    if (!missing(scale))
+    {
+      VJ <- VJ / scale
+    }
+#   Handle sparse products.
+    avj <- mult(A, VJ)
+    if ("Matrix" %in% attributes(class(avj)) && "x" %in% slotNames(avj))
+    {
+      if (length(avj at x) == nrow(W)) avj <- slot(avj, "x")
+      else avj <- as.vector(avj)
+    }
+    W[, j_w] <- avj
+    mprod <- mprod + 1
+
+#   Optionally apply shift
+    if (!missing(shift))
+    {
+      W[, j_w] <- W[, j_w] + shift * VJ
+    }
+
+#   Optionally apply deflation
+    if (deflate)
+    {
+      W[, j_w] <- W[, j_w] - ds * drop(cross(dv, VJ)) * du
+    }
+
+#   Orthogonalize W
+    if (iter != 1 && w_dim > 1 && reorth)
+    {
+      W[, j] <- orthog (W[, j, drop=FALSE], W[, 1:(j - 1), drop=FALSE])
+    }
+
+    S <- norm2(W[, j_w, drop=FALSE])
+#   Check for linearly dependent vectors
+    if (is.na(S) || S < eps2 && j == 1) stop("starting vector near the null space")
+    if (is.na(S) || S < eps2)
+    {
+      W[, j_w] <- rnorm(nrow(W))
+      if (w_dim > 1) W[, j] <- orthog(W[, j], W[, 1:(j - 1)])
+      W[, j_w] <- W[, j_w] / norm2(W[, j_w])
+      S <- 0
+    }
+    else W[, j_w] <- W[, j_w] / S
+
+#   Lanczos process
+    while (j <= work)
+    {
+      j_w <- ifelse(w_dim > 1, j, 1)
+      if (iscomplex)
+      {
+        F <- Conj(t(drop(mult(Conj(drop(W[, j_w])), A))))
+      }
+      else F <- t(drop(mult(drop(W[, j_w]), A)))
+#     Optionally apply shift and scale
+      if (!missing(shift)) F <- F + shift * W[, j_w]
+      if (!missing(scale)) F <- F / scale
+      mprod <- mprod + 1
+      F <- drop(F - S * V[, j])
+#     Orthogonalize
+      F <- orthog(F, V[, 1:j, drop=FALSE])
+      if (j + 1 <= work)
+      {
+        R <- norm2(F)
+#       Check for linear dependence
+        if (R < eps2)
+        {
+          F <- matrix(rnorm(dim(V)[1]), dim(V)[1], 1)
+          F <- orthog(F, V[, 1:j, drop=FALSE])
+          V[, j + 1] <- F / norm2(F)
+          R <- 0
+        }
+        else V[, j + 1] <- F / R
+
+#       Compute block diagonal matrix
+        if (is.null(B)) B <- cbind(S, R)
+        else            B <- rbind(cbind(B, 0), c(rep(0, ncol(B) - 1), S, R))
+
+        jp1_w <- ifelse(w_dim > 1, j + 1, 1)
+        w_old <- W[, j_w]
+
+#       Optionally apply scale
+        VJP1 <- V[, j + 1]
+        if (!missing(scale))
+        {
+          VJP1 <- VJP1 / scale
+        }
+        W[, jp1_w] <- drop(mult(A, drop(VJP1)))
+        mprod <- mprod + 1
+
+#       Optionally apply shift
+        if (!missing(shift))
+        {
+          W[, jp1_w] <- W[, jp1_w] + shift * VJP1
+        }
+
+#       Optionally apply deflation
+        if (deflate)
+        {
+          W[, jp1_w] <- W[, jp1_w] - ds * drop(cross(dv, VJP1)) * du
+        }
+
+#       One step of the classical Gram-Schmidt process
+        W[, jp1_w] <- W[, jp1_w] - R * w_old
+
+#       Full reorthogonalization of W
+        if (reorth && w_dim > 1) W[, j + 1] <- orthog(W[, j + 1], W[, 1:j])
+        S <- norm2(W[, jp1_w])
+#       Check for linear dependence
+        if (S < eps2)
+        {
+          W[, jp1_w] <- rnorm(nrow(W))
+          if (w_dim > 1) W[, j + 1] <- orthog(W[, j + 1], W[, 1:j])
+          W[, jp1_w] <- W[, jp1_w] / norm2(W[, jp1_w])
+          S <- 0
+        }
+        else W[, jp1_w] <- W[, jp1_w] / S
+      }
+      else
+      {
+#       Add a last block to matrix B
+        B <- rbind(B, c(rep(0, j - 1), S))
+      }
+      j <- j + 1
+    }
+    if (verbose)
+    {
+      message("Lanczos iter = ", iter, ", dim = ", j - 1, ", mprod = ", mprod)
+    }
+# ---------------------------------------------------------------------
+# (End of the Lanczos bidiagonalization part)
+# ---------------------------------------------------------------------
+    Bsz <- nrow(B)
+    R_F <- norm2(F)
+    F <- F / R_F
+#   Compute singular triplets of B. Expect svd to return s.v.s in order
+#   from largest to smallest.
+    Bsvd <- svd(B)
+
+#   Estimate ||A|| using the largest singular value over all iterations
+#   and estimate the cond(A) using approximations to the largest and
+#   smallest singular values. If a small singular value is less than sqrteps
+#   require two-sided reorthogonalization.
+    if (iter == 1)
+    {
+      Smax <- Bsvd$d[1]
+      Smin <- Bsvd$d[Bsz]
+    }
+    else
+    {
+      Smax <- max(Smax, Bsvd$d[1])
+      Smin <- min(Smin, Bsvd$d[Bsz])
+    }
+    Smax <- max(eps23, Smax)
+    if (Smin / Smax < sqrteps && !reorth)
+    {
+      warning("The matrix is ill-conditioned. Basis will be reorthogonalized.")
+      reorth <- TRUE
+    }
+
+#   Compute the residuals
+    R <- R_F * Bsvd$u[Bsz, , drop=FALSE]
+#   Check for convergence
+    ct <- convtests(Bsz, tol, k_org, Bsvd$u,
+                    Bsvd$d, Bsvd$v, abs(R), k, SVTol, Smax)
+    k <- ct$k
+
+#   If all desired singular values converged, then exit main loop
+    if (ct$converged) break
+    if (iter >= maxit) break
+
+#   Compute the starting vectors and first block of B[1:k, 1:(k+1), drop=FALSE]
+#   using the Ritz vectors
+      V[, 1:(k + dim(F)[2])] <- cbind(V[, 1:(dim(Bsvd$v)[1]),
+                                     drop=FALSE] %*% Bsvd$v[, 1:k], F)
+      B <- cbind( diag(Bsvd$d[1:k], nrow=k), R[1:k])
+
+#   Update the left approximate singular vectors
+    if (w_dim > 1)
+    {
+      W[, 1:k] <- W[, 1:(dim(Bsvd$u)[1]), drop=FALSE] %*% Bsvd$u[, 1:k]
+    }
+
+    iter <- iter + 1
+  }
+# ---------------------------------------------------------------------
+# End of the main iteration loop
+# Output results
+# ---------------------------------------------------------------------
+  if (iter > maxit) warning("did not converge--results might be invalid!; try increasing maxit")
+  d <- Bsvd$d[1:k_org]
+  if (!right_only)
+  {
+    u <- W[, 1:(dim(Bsvd$u)[1]), drop=FALSE] %*% Bsvd$u[, 1:k_org, drop=FALSE]
+  }
+  v <- V[, 1:(dim(Bsvd$v)[1]), drop=FALSE] %*% Bsvd$v[, 1:k_org, drop=FALSE]
+  if(tol * d[1] < eps) warning("convergence criterion below machine epsilon")
+  if (right_only)
+    return(list(d=d, v=v[, 1:nv, drop=FALSE], iter=iter, mprod=mprod))
+  return(list(d=d, u=u[, 1:nu, drop=FALSE],
+              v=v[, 1:nv, drop=FALSE], iter=iter, mprod=mprod))
+}
diff --git a/R/prcomp.R b/R/prcomp.R
new file mode 100644
index 0000000..33427fa
--- /dev/null
+++ b/R/prcomp.R
@@ -0,0 +1,96 @@
+#' Principal Components Analysis
+#'
+#' Efficient computation of a truncated principal components analysis of a given data matrix
+#' using an implicitly restarted Lanczos method from the \code{\link{irlba}} package.
+#'
+#' @param x a numeric or complex matrix (or data frame) which provides
+#'          the data for the principal components analysis.
+#' @param retx a logical value indicating whether the rotated variables should be returned.
+#' @param center a logical value indicating whether the variables should be
+#'          shifted to be zero centered. Alternately, a centering vector of length
+#'          equal the number of columns of \code{x} can be supplied.
+#' @param scale. a logical value indicating whether the variables should be
+#'          scaled to have unit variance before the analysis takes place.
+#'          The default is \code{FALSE} for consistency with S, but scaling is often advisable.
+#'          Alternatively, a vector of length equal the number of columns of \code{x} can be supplied.
+#' @param n integer number of principal component vectors to return, must be less than
+#' \code{min(dim(x))}.
+#' @param ... additional arguments passed to \code{\link{irlba}}.
+#'
+#' @return
+#' A list with class "prcomp" containing the following components:
+#' \itemize{
+#'    \item{sdev} {the standard deviations of the principal components (i.e.,
+#'          the square roots of the eigenvalues of the
+#'          covariance/correlation matrix, though the calculation is
+#'          actually done with the singular values of the data matrix).}
+#'   \item{rotation} {the matrix of variable loadings (i.e., a matrix whose columns
+#'          contain the eigenvectors).}
+#'   \item {x} {if \code{retx} is \code{TRUE} the value of the rotated data (the centred
+#'          (and scaled if requested) data multiplied by the \code{rotation}
+#'         matrix) is returned.  Hence, \code{cov(x)} is the diagonal matrix
+#'          \code{diag(sdev^2)}.}
+#'   \item{center, scale} {the centering and scaling used, or \code{FALSE}.}
+#' }
+#'
+#' @note
+#' The signs of the columns of the rotation matrix are arbitrary, and
+#' so may differ between different programs for PCA, and even between
+#' different builds of R.
+#'
+#' NOTE DIFFERENCES WITH THE DEFAULT \code{\link{prcomp}} FUNCTION!
+#' The \code{tol} truncation argument found in \code{prcomp} is not supported.
+#' In place of the truncation tolerance in the original function, the
+#' \code{prcomp_irlba}  function has the argument \code{n} explicitly giving the
+#' number of principal components to return. A warning is generated if the
+#' argument \code{tol} is used, which is interpreted differently between
+#' the two functions.
+#'
+#' @examples
+#' set.seed(1)
+#' x  <- matrix(rnorm(200), nrow=20)
+#' p1 <- prcomp_irlba(x, n=3)
+#' summary(p1)
+#'
+#' # Compare with
+#' p2 <- prcomp(x, tol=0.7)
+#' summary(p2)
+#'
+#' @seealso \code{\link{prcomp}}
+#' @import Matrix
+#' @importFrom stats rnorm prcomp sd
+#' @importFrom methods slotNames slot
+#' @export
+prcomp_irlba <- function (x, n = 3, retx = TRUE, center = TRUE, scale. = FALSE, ...)
+{
+  a <- names(as.list(match.call()))
+  if ("tol" %in% a)
+    warning("The `tol` truncation argument from `prcomp` is not supported by
+`prcomp_irlba`. If specified, `tol` is passed to the `irlba` function to
+control that algorithm's convergence tolerance. See `?prcomp_irlba` for help.")
+# Try to convert to a matrix...
+  if (!is.matrix(x)) x <- as.matrix(x)
+  args <- list(A=x, nv=n)
+  if (is.logical(center))
+  {
+    if (center) args$center <- colMeans(x)
+  } else args$center <- center
+  if (is.logical(scale.))
+  {
+    if (scale.) args$scale <- apply(x, 2, sd)
+  } else args$scale <- scale.
+  if (!missing(...)) args <- c(args, list(...))
+
+  s <- do.call(irlba, args=args)
+  ans <- list(sdev=s$d / sqrt(max(1, nrow(x) - 1)), rotation=s$v)
+  colnames(ans$rotation) <- paste("PC", seq(1, ncol(ans$rotation)), sep="")
+  ans$center <- args$center
+  ans$scale <- args$scale
+  if (retx)
+  {
+    ans <- c(ans, list(x = s$d * s$u))
+    colnames(ans$x) <- paste("PC", seq(1, ncol(ans$rotation)), sep="")
+  }
+  class(ans) <- "prcomp"
+  ans
+}
diff --git a/R/utility.R b/R/utility.R
new file mode 100644
index 0000000..e87e05d
--- /dev/null
+++ b/R/utility.R
@@ -0,0 +1,69 @@
+# ---------------------------------------------------------------------
+# Internal supporting functions
+# ---------------------------------------------------------------------
+# General real/complex crossprod
+cross <- function(x, y)
+{
+  if (missing(y))
+  {
+    if (is.complex(x)) return(abs(Conj(t(x)) %*% x))
+    return(crossprod(x))
+  }
+  if (!is.complex(x) && !is.complex(y)) return(crossprod(x, y))
+  Conj(t(x)) %*% y
+}
+
+# Euclidean norm
+norm2 <- function (x)
+{
+  drop(sqrt(cross(x)))
+}
+
+# Orthogonalize vectors Y against vectors X.
+orthog <- function (Y, X)
+ {
+  dx2 <- dim(X)[2]
+  if (is.null(dx2)) dx2 <- 1
+  dy2 <- dim(Y)[2]
+  if (is.null(dy2)) dy2 <- 1
+  if (dx2 < dy2) doty <- cross(X, Y)
+  else doty <- Conj(t(cross(Y, X)))
+  return (Y - X %*% doty)
+ }
+
+# Convergence tests
+# Input parameters
+# Bsz            Number of rows of the bidiagonal matrix B
+# tol
+# k_org
+# U_B            Left singular vectors of small matrix B
+# S_B            Singular values of B
+# V_B            Right singular vectors of B
+# residuals
+# k
+# SVTol
+# Smax
+#
+# Output parameter list
+# converged      TRUE/FALSE
+# U_B            Left singular vectors of small matrix B
+# S_B            Singular values of B
+# V_B            Right singular vectors of B
+# k              Number of singular vectors returned
+convtests <- function (Bsz, tol, k_org, U_B, S_B, V_B,
+                       residuals, k, SVTol, Smax)
+ {
+  len_res <- sum(residuals[1:k_org] < tol * Smax)
+  if (is.na(len_res)) len_res <- 0
+  if (len_res == k_org) {
+    return (list(converged=TRUE, U_B=U_B[, 1:k_org, drop=FALSE],
+                  S_B=S_B[1:k_org, drop=FALSE],
+                  V_B=V_B[, 1:k_org, drop=FALSE], k=k))
+  }
+# Not converged yet...
+# Adjust k to include more vectors as the number of vectors converge.
+  len_res <- sum(residuals[1:k_org] < SVTol * Smax)
+  k <- max(k, k_org + len_res)
+  if (k > Bsz - 3) k <- max(Bsz - 3, 1)
+  return (list(converged=FALSE, U_B=U_B, S_B=S_B, V_B=V_B, k=k) )
+ }
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..340d3da
--- /dev/null
+++ b/README.md
@@ -0,0 +1,33 @@
+# irlba
+
+Implicitly-restarted Lanczos methods for fast truncated singular value decomposition
+of sparse and dense matrices (also referred to as partial SVD).  IRLBA stands
+for Augmented, <b>I</b>mplicitly <b>R</b>estarted <b>L</b>anczos
+<b>B</b>idiagonalization <b>A</b>lgorithm. The package provides the following
+functions (see help on each for details and examples).
+
+* `irlba` main partial SVD function
+* `prcomp_irlba`  principal components function similar to the `prcomp` function in stats package for computing the first few principal components of large matrices
+* `partial_eigen` a very limited partial eigenvalue decomposition for symmetric matrices (see the [RSpectra](https://cran.r-project.org/package=RSpectra) package for more comprehensive truncated eigenvalue decomposition)
+
+Help documentation for each function includes examples. Also see the package
+vignette, `vignette("irlba", package="irlba")`, and demo,
+`demo("custom_matrix_multiply", package="irlba")`.
+
+Version 2.1.2 of the package includes a convenience `prcomp`-like function for
+computing principal components and numerous bug fixes and numerical stability
+improvements in edge cases.
+
+
+## References
+
+* Augmented Implicitly Restarted Lanczos Bidiagonalization Methods, J. Baglama and L. Reichel, SIAM J. Sci. Comput. 2005. (http://www.math.uri.edu/~jbaglama/papers/paper14.pdf)
+
+
+## Status
+<a href="https://travis-ci.org/bwlewis/irlba">
+<img src="https://travis-ci.org/bwlewis/irlba.svg?branch=master" alt="Travis CI status"></img>
+</a>
+[![codecov.io](https://codecov.io/github/bwlewis/irlba/coverage.svg?branch=master)](https://codecov.io/github/bwlewis/irlba?branch=master)
+[![CRAN version](http://www.r-pkg.org/badges/version/irlba)](https://cran.r-project.org/package=irlba)
+![](http://cranlogs.r-pkg.org/badges/irlba)
diff --git a/build/vignette.rds b/build/vignette.rds
new file mode 100644
index 0000000..6bf9c7b
Binary files /dev/null and b/build/vignette.rds differ
diff --git a/debian/README.test b/debian/README.test
deleted file mode 100644
index 55a9142..0000000
--- a/debian/README.test
+++ /dev/null
@@ -1,8 +0,0 @@
-Notes on how this package can be tested.
-────────────────────────────────────────
-
-To run the unit tests provided by the package you can do
-
-   sh  run-unit-test
-
-in this directory.
diff --git a/debian/changelog b/debian/changelog
deleted file mode 100644
index 7b86198..0000000
--- a/debian/changelog
+++ /dev/null
@@ -1,14 +0,0 @@
-r-cran-irlba (2.1.2-1) unstable; urgency=medium
-
-  * New upstream version
-  * Convert to dh-r
-  * Canonical homepage for CRAN
-  * Architecture: any
-
- -- Andreas Tille <tille at debian.org>  Tue, 08 Nov 2016 11:04:54 +0100
-
-r-cran-irlba (2.0.0-1) unstable; urgency=low
-
-  * Initial release (closes: #830217)
-
- -- Andreas Tille <tille at debian.org>  Thu, 07 Jul 2016 14:41:08 +0200
diff --git a/debian/compat b/debian/compat
deleted file mode 100644
index ec63514..0000000
--- a/debian/compat
+++ /dev/null
@@ -1 +0,0 @@
-9
diff --git a/debian/control b/debian/control
deleted file mode 100644
index 36701d6..0000000
--- a/debian/control
+++ /dev/null
@@ -1,25 +0,0 @@
-Source: r-cran-irlba
-Maintainer: Debian Med Packaging Team <debian-med-packaging at lists.alioth.debian.org>
-Uploaders: Andreas Tille <tille at debian.org>
-Section: gnu-r
-Priority: optional
-Build-Depends: debhelper (>= 9),
-               dh-r,
-               r-base-dev,
-               r-cran-matrix
-Standards-Version: 3.9.8
-Vcs-Browser: https://anonscm.debian.org/viewvc/debian-med/trunk/packages/R/r-cran-irlba/trunk/
-Vcs-Svn: svn://anonscm.debian.org/debian-med/trunk/packages/R/r-cran-irlba/trunk/
-Homepage: https://cran.r-project.org/package=irlba
-
-Package: r-cran-irlba
-Architecture: any
-Depends: ${misc:Depends},
-         ${shlibs:Depends},
-         ${R:Depends}
-Recommends: ${R:Recommends}
-Suggests: ${R:Suggests}
-Description: GNU R fast truncated SVD, PCA and symmetric eigendecomposition
- This GNU R package provides Fast and memory efficient methods for
- truncated singular and eigenvalue decompositions and principal component
- analysis of large sparse or dense matrices.
diff --git a/debian/copyright b/debian/copyright
deleted file mode 100644
index 74141ac..0000000
--- a/debian/copyright
+++ /dev/null
@@ -1,29 +0,0 @@
-Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
-Upstream-Name: irlba
-Upstream-Contact: Bryan W. Lewis <blewis at illposed.net>
-Source: https://cran.r-project.org/package=irlba
-
-Files: *
-Copyright: 2010-2016 Bryan W. Lewis <blewis at illposed.net>,
-                     Lothar Reichel
-License: GPL-2+
-Comment: The download page specifies
-    GPL-2 | GPL-3 [expanded from: GPL] and links to a GPL-2+ text
-
-Files: debian/*
-Copyright: 2016 Andreas Tille <tille at debian.org>
-License: GPL-2+
-
-License: GPL-2+
-    This program is free software; you can redistribute it and/or modify
-    it under the terms of the GNU General Public License as published by
-    the Free Software Foundation; either version 2 of the License, or
-    (at your option) any later version.
- .
-    This program is distributed in the hope that it will be useful,
-    but WITHOUT ANY WARRANTY; without even the implied warranty of
-    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-    GNU General Public License for more details.
- .
- On Debian systems, the complete text of the GNU General Public
- License can be found in `/usr/share/common-licenses/GPL-2'.
diff --git a/debian/docs b/debian/docs
deleted file mode 100644
index 3adf0d6..0000000
--- a/debian/docs
+++ /dev/null
@@ -1,3 +0,0 @@
-debian/README.test
-debian/tests/run-unit-test
-tests
diff --git a/debian/examples b/debian/examples
deleted file mode 100644
index 18244c8..0000000
--- a/debian/examples
+++ /dev/null
@@ -1 +0,0 @@
-vignettes
diff --git a/debian/rules b/debian/rules
deleted file mode 100755
index 68d9a36..0000000
--- a/debian/rules
+++ /dev/null
@@ -1,4 +0,0 @@
-#!/usr/bin/make -f
-
-%:
-	dh $@ --buildsystem R
diff --git a/debian/source/format b/debian/source/format
deleted file mode 100644
index 163aaf8..0000000
--- a/debian/source/format
+++ /dev/null
@@ -1 +0,0 @@
-3.0 (quilt)
diff --git a/debian/tests/control b/debian/tests/control
deleted file mode 100644
index d2aa55a..0000000
--- a/debian/tests/control
+++ /dev/null
@@ -1,3 +0,0 @@
-Tests: run-unit-test
-Depends: @
-Restrictions: allow-stderr
diff --git a/debian/tests/run-unit-test b/debian/tests/run-unit-test
deleted file mode 100644
index 9612b89..0000000
--- a/debian/tests/run-unit-test
+++ /dev/null
@@ -1,22 +0,0 @@
-#!/bin/sh -e
-oname=irlba
-pkg=r-cran-`echo $oname | tr '[A-Z]' '[a-z]'`
-
-if [ "$ADTTMP" = "" ] ; then
-  ADTTMP=`mktemp -d /tmp/${pkg}-test.XXXXXX`
-  trap "rm -rf $ADTTMP" 0 INT QUIT ABRT PIPE TERM
-fi
-cd $ADTTMP
-cp /usr/share/doc/$pkg/examples/vignettes/* $ADTTMP
-cp /usr/share/doc/$pkg/tests/* $ADTTMP
-find . -name "*.gz" -exec gunzip \{\} \;
-for rnw in `ls *.[rRS]nw` ; do
-rfile=`echo $rnw | sed 's/\.[rRS]nw/.R/'`
-R --no-save <<EOT
-  Stangle("$rnw")
-  source("$rfile", echo=TRUE)
-EOT
-  echo "$rnw passed"
-done
-
-LC_ALL=C R --no-save < test.R
diff --git a/debian/watch b/debian/watch
deleted file mode 100644
index 377e921..0000000
--- a/debian/watch
+++ /dev/null
@@ -1,2 +0,0 @@
-version=3
-http://cran.r-project.org/src/contrib/irlba_([-\d.]*)\.tar\.gz
diff --git a/demo/00Index b/demo/00Index
new file mode 100644
index 0000000..b968cce
--- /dev/null
+++ b/demo/00Index
@@ -0,0 +1 @@
+custom_matrix_multiply   Examples of custom matrix multiplication with irlba.
diff --git a/demo/custom_matrix_multiply.R b/demo/custom_matrix_multiply.R
new file mode 100644
index 0000000..cee6f97
--- /dev/null
+++ b/demo/custom_matrix_multiply.R
@@ -0,0 +1,44 @@
+library(irlba)
+set.seed(1)
+A <- matrix(runif(400), nrow=20)
+
+# A custom matrix multiplication function that scales the columns of A
+# (cf the scale option). This function scales the columns of A to unit norm.
+# This approach uses the `mult` argument to irlba and is also found in the
+# examples.
+col_scale <- sqrt(apply(A, 2, crossprod))
+mult <- function(x, y)
+        {
+          # check if x is a  vector
+          if (is.vector(x))
+          {
+            return((x %*% y) / col_scale)
+          }
+          # else x is the matrix
+          x %*% (y / col_scale)
+        }
+print(irlba(A, 3, mult=mult)$d)
+     
+# Compare with:
+print(svd(sweep(A, 2, col_scale, FUN=`/`))$d[1:3])
+
+
+# An alternative and maybe better approach simply uses R's native operator
+# overloading. You simply need to define a matrix x vector product method and a
+# vector x matrix product method.
+setClass("mymat", contains="matrix", S3methods=TRUE, slots=c(scale="numeric"))
+setMethod("%*%", signature(x="mymat", y="numeric"), function(x ,y)
+  {
+    x at .Data %*% (y / x at scale)
+  })
+setMethod("%*%", signature(x="numeric", y="mymat"), function(x ,y)
+  {
+    (x %*% y at .Data) / y at scale
+  })
+X <- new("mymat", A, scale=col_scale)
+
+# Compare with other approaches
+print(irlba(X, 3, fastpath=FALSE)$d)
+
+# See https://bwlewis.github.io/1000_genomes_examples/PCA_whole_genome.html
+# for a much more complex example using a custom class.
diff --git a/inst/doc/irlba.Rnw b/inst/doc/irlba.Rnw
new file mode 100644
index 0000000..546237b
--- /dev/null
+++ b/inst/doc/irlba.Rnw
@@ -0,0 +1,467 @@
+% \VignetteIndexEntry{irlba Manual}
+% \VignetteDepends{irlba}
+% \VignettePackage{irlba}
+\documentclass[12pt]{article}
+\usepackage{amsmath}
+\usepackage[pdftex]{graphicx}
+\usepackage{color}
+\usepackage{xspace}
+\usepackage{fancyvrb}
+\usepackage{fancyhdr}
+\usepackage[
+     colorlinks=true,
+     linkcolor=blue,
+     citecolor=blue,
+     urlcolor=blue]
+     {hyperref}
+\usepackage{lscape}
+\usepackage{Sweave}
+\usepackage{tabularx}
+\usepackage{listings}
+\usepackage{mdwlist}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+% define new colors for use
+\definecolor{darkgreen}{rgb}{0,0.6,0}
+\definecolor{darkred}{rgb}{0.6,0.0,0}
+\definecolor{lightbrown}{rgb}{1,0.9,0.8}
+\definecolor{brown}{rgb}{0.6,0.3,0.3}
+\definecolor{darkblue}{rgb}{0,0,0.8}
+\definecolor{darkmagenta}{rgb}{0.5,0,0.5}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+
+\newcommand{\bld}[1]{\mbox{\boldmath $#1$}}
+\newcommand{\shell}[1]{\mbox{$#1$}}
+\renewcommand{\vec}[1]{\mbox{\bf {#1}}}
+\newcommand{\ReallySmallSpacing}{\renewcommand{\baselinestretch}{.6}\Large\normalsize}
+\newcommand{\SmallSpacing}{\renewcommand{\baselinestretch}{1.1}\Large\normalsize}
+\def\tm{\leavevmode\hbox{$\rm {}^{TM}$}}
+
+\newcommand{\R}{{\mathbf R}}
+\newcommand{\brho}{{\color{blue}{\rho}}}
+\newcommand{\Ra}{{\mathcal R}}
+\newcommand{\PP}{{\mathbf P}}
+\newcommand{\N}{{\mathbf N}}
+\newcommand{\K}{{\mathcal K}}
+
+
+
+\setlength{\oddsidemargin}{-.25 truein}
+\setlength{\evensidemargin}{0truein}
+\setlength{\topmargin}{-0.2truein}
+\setlength{\textwidth}{7 truein}
+\setlength{\textheight}{8.5 truein}
+\setlength{\parindent}{0.20truein}
+\setlength{\parskip}{0.10truein}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\pagestyle{fancy}
+\lhead{}
+\chead{The {\tt irlba} Package}
+\rhead{}
+\lfoot{}
+\cfoot{}
+\rfoot{\thepage}
+\renewcommand{\headrulewidth}{1pt}
+\renewcommand{\footrulewidth}{1pt}
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\title{The {\tt irlba} Package}
+\author{Bryan W. Lewis \\ 
+blewis at illposed.net,
+\\[6pt]
+adapted from the work of:\\
+Jim Baglama (University of Rhode Island)\\
+and Lothar Reichel (Kent State University).
+}
+
+\begin{document}
+
+\maketitle
+
+\thispagestyle{empty}
+
+\section{Introduction}
+
+The {\tt irlba} package provides a fast way to compute partial singular value
+decompositions (SVD) of large sparse or dense matrices. Recent additions to the
+package can also compute fast partial symmetric eigenvalue decompositions and
+principal components. The package is an R implementation of the {\it augmented
+implicitly restarted Lanczos bidiagonalization algorithm} of Jim Baglama and
+Lothar Reichel\footnote{Augmented Implicitly Restarted Lanczos
+Bidiagonalization Methods, J. Baglama and L. Reichel, SIAM J. Sci. Comput.
+2005.}.  Source code is maintained at
+\href{https://github.com/bwlewis/irlba}{https://github.com/bwlewis/irlba}.
+
+The {\tt irlba} package works with real- and complex-valued dense R matrices
+and real-valued sparse matrices from the {\tt Matrix} package. It provides
+several easy ways to define custom matrix arithmetic that works with other
+matrix classes including {\tt big.matrix} from the {\tt bigmemory} package and
+others.  The {\tt irlba} is both faster and more memory efficient than the
+usual R {\tt svd} function for computing a few of the largest singular vectors
+and corresponding singular values of a matrix, and advantage of available
+high-performance linear algebra libraries if R is compiled to use them. In
+particular, the package uses the same BLAS and LAPACK libraries that R uses
+(see
+\href{https://cran.r-project.org/doc/manuals/R-admin.html#BLAS}{https://cran.r-project.org/doc/manuals/R-admin.html\#BLAS}),
+or the CHOLMOD library from R's Matrix package for sparse matrix problems.
+
+A whirlwind summary of the algorithm follows, along with a few basic examples.
+A much more detailed description and discussion of the algorithm may be found
+in the cited Baglama-Reichel reference.
+
+
+\section{Partial Singular Value Decomposition}
+
+Let $A\in\R^{\ell\times n}$ and assume $\ell\ge n$. These notes simplify the
+presentation by considering only real-valued matrices and assuming without
+losing generality that there are at least as many rows as columns (the
+method works more generally). A singular
+value decomposition of $A$ can be expressed as:
+
+\[
+A = \sum_{j=1}^n \sigma_j u_j v_j^T,
+\phantom{xxxxxxxx}
+v_j^Tv_k = u_j^Tu_k = 
+\left\{
+\begin{array}{ll}
+1 & \mbox{if}\phantom{x}  j=k,\\
+0 & \mbox{o.w.,}\\
+\end{array}
+\right.
+\]
+where $u_j\in\R^\ell $, $v_j\in\R^n $,
+$j=1,2,\ldots, n$, and
+$ \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n \ge 0 $.
+
+Let $1 \le k<n$. A rank $k$ partial SVD of $A$ is defined as:
+\begin{eqnarray*}
+A_k &:=& \sum_{j=1}^k \sigma_j u_j v_j^T.\\
+\end{eqnarray*}
+
+
+
+The following simple example shows how to use {\tt irlba} to compute the five
+largest singular values and corresponding singular vectors of a
+$5000\times5000$ matrix. We compare to the usual R {\tt svd} function and
+report timings for our test machine, a 4-CPU core, 3.0\, GHz AMD  A10-7850K
+personal computer with 16\,GB RAM, using R version 3.3.1 using the high
+performance AMD ACML core math library BLAS and LAPACK.
+\lstset{columns=flexible, basicstyle={\ttfamily\slshape}}
+\begin{lstlisting}
+> library('irlba')
+> set.seed(1)
+> A <- matrix(rnorm(5000*5000), 5000)
+> t1 <- proc.time()
+> L <- irlba(A, 5)
+> print(proc.time() - t1)
+   user  system elapsed
+ 17.440   0.192   4.417
+> gc()
+           used  (Mb) gc trigger  (Mb) max used  (Mb)
+Ncells  1096734  58.6    1770749  94.6  1442291  77.1
+Vcells 26685618 203.6   62229965 474.8 52110704 397.6
+
+\end{lstlisting}
+Compare with the standard {\tt svd} function:\newpage
+\begin{lstlisting}
+> t1 <- proc.time()
+> S <- svd(A, nu=5, nv=5)
+> print(proc.time() - t1)
+   user  system elapsed
+277.092  11.552  74.425
+> gc()
+           used  (Mb) gc trigger   (Mb)  max used   (Mb)
+Ncells  1097441  58.7    1770749   94.6   1442291   77.1
+Vcells 26741910 204.1  169891972 1296.2 176827295 1349.1
+\end{lstlisting}
+The {\tt irlba} method uses about 1/20 elapsed time as the
+{\tt svd} method in this example and less than one third the peak memory.
+The defalut tolerance value yields the following relative error
+in the estimated singular values:
+\begin{lstlisting}
+> sqrt (crossprod(S$d[1:5]-L$d)/crossprod(S$d[1:5]))
+            [,1]
+[1,] 4.352641e-10
+\end{lstlisting}
+
+\subsection{Convergence tolerance}
+
+IRLBA is an iterative method that estimates a few largest singular values
+and associated singular vectors. A sketch of the algorithm is outlined
+in Section \ref{sketch} below. The R {\tt tol} argument controls when the algorithm converges.
+Convergence occurs when
+\[
+\|AV_k - US_k\| < \mbox{tol} \cdot \|A\|,
+\]
+where $\|\cdot\|$ means spectral matrix norm, $A$ is the matrix, $V_k$ and $U_k$
+are the {\it estimated} right and left $k$ singular vectors computed by the
+algorithm, and $\|A\|$ is the {\it estimated} spectral norm of the matrix defined
+by the largest singular value computed by the algorithm. Using R notation,
+the algorithm stops when
+\begin{lstlisting}
+L <- irlba(A, k, tol)
+svd(A %*% L$v - L$u %*% diag(L$d))$d[1] < tol * L$d[1]
+\end{lstlisting}
+It's possible, but unlikely, to encounter problems that fail to converge before
+the maximum number of algorithm iterations specified by the {\tt maxit}
+argument.
+
+
+\subsection{Differences with {\tt svd}}
+The {\tt irlba} function is designed to compute a {\it partial} singular
+value decomposition. It is largely compatible with the usual R {\tt svd}
+function but there are some differences. In particular:
+\begin{enumerate}
+\item The {\tt irlba} function only computes the number of singular values
+corresponding to the maximum of the desired singular vectors,
+{\tt max(nu, nv)}. For example, if 5
+singular vectors are desired ({\tt nu=nv=5}), then only the five corresponding
+singular values are computed. The standard R {\tt svd} function always
+returns the {\it total} set of singular values for the matrix, regardless of how
+many singular vectors are specified.
+\item The {\tt irlba} function is an iterative method that continues until
+either a tolerance or maximum number of iterations is reached.
+Problems with difficult convergence properties are not likely to be
+encountered, but the method will fail with an error after the iteration limit
+is reached in those cases.
+\end{enumerate}
+Watch out especially for the first difference noted above!
+
+
+\subsection{Principal Components}
+
+Version 2.1.0 of the package introduces optional arguments and {\tt prcomp}-like
+function syntax for efficiently computing partial SVDs of matrices after
+centering and scaling their columns and other adjustments.
+Use the following arguments to the {\tt irlba} function, or the new
+{\tt irlba\_prcomp} function for PCA:
+\begin{itemize}
+\item {\tt center}: if {\tt center} is a numeric vector with length equal to
+     the number of columns of the matrix, then each column of the matrix has the
+     corresponding value from {\tt center} subtracted from it.
+\item {\tt scale}: if 'scale' is a numeric vector with length
+     equal to the number of columns of the matrix, then each column is
+     divided by the corresponding value from {\tt scale}.
+\end{itemize}
+Both centering and scaling options are performed implicitly in the algorithm
+and, for instance, do not affect sparsity of the input matrix or increase
+storage requirements.
+The following
+example compares the output of the usual {\tt prcomp} function with
+output from {\tt irlba}.
+Note that in general, singular vectors and principal component vectors
+are only unique up to sign!
+
+\begin{lstlisting}
+>      set.seed(1)
+>      x  <- matrix(rnorm(200), nrow=20)
+>      p1 <- prcomp_irlba(x, n=3)
+>      summary(p1)
+Importance of components:
+                         PC1    PC2    PC3
+Standard deviation     1.541 1.2513 1.1916
+Proportion of Variance 0.443 0.2921 0.2649
+Cumulative Proportion  0.443 0.7351 1.0000
+
+>      # Compare with
+>      p2 <- prcomp(x, tol=0.7)
+>      summary(p2)
+Importance of components:
+                         PC1    PC2    PC3
+Standard deviation     1.541 1.2513 1.1916
+Proportion of Variance 0.443 0.2921 0.2649
+Cumulative Proportion  0.443 0.7351 1.0000
+\end{lstlisting}
+Alternatively, you can compute principal components directly using the
+singular value decomposition and the {\tt center} option:
+\begin{lstlisting}
+> p3 <- svd(scale(x, center=colMeans(x), scale=FALSE))
+> p4 <- irlba(x, 3, center=colMeans(x))
+
+> # compare with prcomp
+> sqrt(crossprod(p1$rotation[,1] - p3$v[,1]))
+             [,1]
+[1,] 9.773228e-13
+> sqrt(crossprod(p1$rotation[,1] + p4$v[,1]))
+             [,1]
+[1,] 1.652423e-12
+\end{lstlisting}
+The implementation of the {\tt center} function argument takes advantage of
+computational efficiencies in the IRLB algorithm that result in a modest
+savings of a few vector inner products per iteration compared to a naive
+implementation using a custom matrix vector product to center the matrix.
+
+\subsection{Truncated symmetric eigenvalue decomposition}
+
+Use the {\tt partial\_eigen} function to estimate a subset of the largest (most
+positive) eigenvalues and corresponding eigenvectors of a symmetric dense or
+sparse real-valued matrix. The function is particularly well-suited to
+estimating the largest eigenvalues and corresponding eigenvectors of symmetric
+positive semi-definite matrices of the form $A^T A$.
+
+\subsection{User-Defined Matrix Multiplication}
+
+The {\tt irlba} function includes options for specifying a custom matrix
+multiplication function. Custom matrix multiplication functions can be used,
+for example, with the {\tt big.matrix} class from the {\tt bigmemory}/{\tt
+bigalgebra} packages, or to compute the partial SVD of matrix-free linear
+operators.
+
+User-defined matrix operations may specified using the optional {\tt mult}
+parameter. If defined, it must be a function of two arguments that computes
+matrix vector products. Either argument can be a vector, and the {\tt mult}
+function must deal with that. The following example illustrates a simple
+custom matrix function that scales the columns of the matrix, and then
+compares it with other ways of doing the same thing.
+\begin{lstlisting}
+> set.seed(1)
+> A <- matrix(runif(200), nrow=20)
+> col_scale <- 1:10
+> mult <- function(x,y)
++         {
++           # check if x is a plain vector
++           if(is.vector(x))
++           {
++             return((x %*% y)/col_scale)
++           }
++           # else x is the matrix
++           x %*% (y/col_scale)
++         }
+
+> irlba(A, 3, mult=mult)$d
+[1] 3.1384503 0.9477628 0.4313855
+
+> # Compare with:
+> irlba(A, 3, scale=col_scale)$d
+[1] 3.1384503 0.9477628 0.4313855
+
+> # Compare with:
+> svd(scale(A, scale=col_scale, center=FALSE))$d[1:3]
+[1] 3.1384503 0.9477628 0.4313855
+\end{lstlisting}
+Alternatively, simply use R's operator overloading methods to define
+customized matrix vector products. This is the approach taken in
+the
+\\
+\href{https://bwlewis.github.io/1000_genomes_examples/PCA_whole_genome.html}{http://bwlewis.github.io/1000\_genomes\_examples/PCA\_whole\_genome.html}
+vignette to work with large out of core genomics data.
+I prefer the simple operator overloading approach and may consider
+retiring the {\tt mult} function argument in some future package version.
+
+NOTE! When a user-defined matrix multiplication function is used, either using
+the {\tt mult} argument or through operator overloading, the R reference IRLBA
+implementation is used instead of the faster C code path in newer package
+versions.
+
+
+\section{A Quick Summary of the IRLBA Method}\label{sketch}
+\subsection{Partial Lanczos Bidiagonalization}
+
+Start with a given vector $p_1$. Compute $m$ steps of the Lanczos process:
+
+\begin{eqnarray*}
+A P_m &=& Q_m B_m \\
+A^T Q_m &=& P_m B_m^T + r_m e_m^T,\\
+\end{eqnarray*}
+
+$B_m\in\R^{m\times m}, P_m \in \R^{n\times m}, $ 
+$Q_m \in \R^{\ell \times m},$ 
+
+$P_m^TP_m=Q_m^TQ_m=I_m, $ 
+
+$r_m\in\R^n,  P_m^Tr_m=0,$
+
+$P_m = [p_1, p_2, \ldots, p_m]$.
+
+\subsection{Approximating Partial SVD with A Partial Lanczos bidiagonalization}
+\begin{eqnarray*}
+A^TA P_m &=& A^TQ_m B_m \\
+         &=& P_m {\color{blue}{B_m^TB_m}} + r_m e_m^TB_m,\\
+\end{eqnarray*}
+\begin{eqnarray*}
+AA^T Q_m &=& AP_m B_m^T + Ar_m e_m^T,\\
+&=& Q_m{\color{blue}{B_mB_m^T}} + Ar_me_m^T.
+\end{eqnarray*}
+
+Compute the SVD of $B_m$:
+\[
+B_m = \sum_{j=1}^m\sigma^B_ju^B_j\left(v_j^B\right)^T.
+\] 
+\\[6pt]
+\[
+\left(\mbox{i.e., }  B_mv_j^B = \sigma_j^Bu_j^B,  \mbox{ and }  
+B_m^Tu_j^b = \sigma_j^Bv_j^B.\right)
+\]
+
+Define:
+$
+\tilde{\sigma_j} := \sigma_j^B, \phantom{xxx}
+\tilde{u}_j := Q_m u_j^B, \phantom{xxx}
+\tilde{v}_j := P_m v_j^B.
+$
+
+Then:
+\begin{eqnarray*}
+A\tilde{v}_j &=& AP_mv_j^B \\
+&=& Q_mB_mv_j^B \\
+&=& \sigma^B_jQ_mu_j^B \\
+&=& \tilde{\sigma}_j \tilde{u}_j,
+\end{eqnarray*}
+and
+\begin{eqnarray*}
+A^T\tilde{u}_j &=& A^TQ_mu_j^B \\
+&=& P_mB^T_mu_j^B + r_me_m^Tu_j^B \\
+&=& \sigma^B_jP_mv_j^B + r_me_m^Tu_j^B\\
+&=& \tilde{\sigma}_j \tilde{v}_j + {\color{red} {r_me_m^Tu_j^B}}.
+\end{eqnarray*}
+The part in red above represents the error with respect to the exact SVD.
+The IRLBA strategy is to iteratively reduce the norm of that error term
+by augmenting and restarting.
+
+Here is the overall method:
+\begin{enumerate}
+\item Compute the Lanczos process up to step $m$.
+\item Compute $k<m$ approximate singular vectors.
+\item Orthogonalize against the approximate singular vectors to get a new 
+      starting vector.
+\item Continue the Lanczos process with the new starting vector 
+      for $m$ more steps.
+\item Check for convergence tolerance and exit if met.
+\item GOTO 1.
+\end{enumerate}
+
+
+\subsection{Sketch of the augmented process...}
+\begin{eqnarray*}
+\bar{P}_{k+1} &:=& [\tilde{v}_1, \tilde{v}_2, \ldots, \tilde{v}_k, p_{m+1}],\\
+A\bar{P}_{k+1} &=& [\tilde{\sigma}_1\tilde{u}_1, \tilde{\sigma}_1\tilde{u}_2, \ldots, \tilde{\sigma}_k\tilde{u}_k, Ap_{m+1}]
+\end{eqnarray*}
+
+Orthogonalize $Ap_{m+1}$ against $\{\tilde{u}_j\}_{j=1}^k$:   
+$
+Ap_{m+1} = \sum_{j=1}^k {\color{blue}{\rho_j}}\tilde{u}_j + {\color{blue}{r_k}}.
+$
+\begin{eqnarray*}
+\bar{Q}_{k+1} &:=& [\tilde{u}_1, \tilde{u}_2, \ldots, \tilde{u}_k, 
+{\color{blue}{r_k}}/\|{\color{blue}{r_k}}\|],\\
+\bar{B}_{k+1} &:=&
+\left[
+\begin{array}{ccccc}
+\tilde{\sigma}_1 & & & \brho_1 \\
+& \tilde{\sigma}_2 & & \brho_2 \\
+& & \ddots & \brho_k \\
+& & & \|\color{blue}{r_k}\|
+\end{array}
+\right].
+\end{eqnarray*}
+\[
+A\bar{P}_{k+1} = \bar{Q}_{k+1}\bar{B}_{k+1}.
+\]
+
+
+
+
+\end{document}
diff --git a/inst/doc/irlba.pdf b/inst/doc/irlba.pdf
new file mode 100644
index 0000000..26781a5
Binary files /dev/null and b/inst/doc/irlba.pdf differ
diff --git a/man/irlba.Rd b/man/irlba.Rd
new file mode 100644
index 0000000..0c6096c
--- /dev/null
+++ b/man/irlba.Rd
@@ -0,0 +1,200 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/irlba.R
+\name{irlba}
+\alias{irlba}
+\title{Find a few approximate largest singular values and corresponding
+singular vectors of a matrix.}
+\usage{
+irlba(A, nv = 5, nu, maxit = 1000, work = nv + 7, reorth = TRUE,
+  tol = 1e-05, v = NULL, right_only = FALSE, verbose = FALSE, scale,
+  center, du, ds, dv, shift, mult, fastpath = TRUE)
+}
+\arguments{
+\item{A}{numeric real- or complex-valued matrix or real-valued sparse matrix.}
+
+\item{nv}{number of right singular vectors to estimate.}
+
+\item{nu}{number of left singular vectors to estimate (defaults to \code{nv}).}
+
+\item{maxit}{maximum number of iterations.}
+
+\item{work}{working subspace dimension, larger values can speed convergence at the cost of more memory use.}
+
+\item{reorth}{if \code{TRUE}, apply full reorthogonalization to both SVD bases, otherwise
+only apply reorthogonalization to the right SVD basis vectors; the latter case is cheaper per
+iteration but, overall, may require more iterations for convergence. Always set to \code{TRUE}
+when \code{fastpath=TRUE} (see below).}
+
+\item{tol}{convergence is determined when \eqn{\|AV - US\| < tol\|A\|}{||AV - US|| < tol*||A||},
+where the spectral norm ||A|| is approximated by the
+largest estimated singular value, and U, V, S are the matrices corresponding
+to the estimated left and right singular vectors, and diagonal matrix of
+estimated singular values, respectively.}
+
+\item{v}{optional starting vector or output from a previous run of \code{irlba} used
+to restart the algorithm from where it left off (see the notes).}
+
+\item{right_only}{logical value indicating return only the right singular vectors
+(\code{TRUE}) or both sets of vectors (\code{FALSE}). The right_only option can be
+cheaper to compute and use much less memory when \code{nrow(A) >> ncol(A)} but note
+that \code{right_only = TRUE} sets \code{fastpath = FALSE} (only use this option
+for really large problems that run out of memory).}
+
+\item{verbose}{logical value that when \code{TRUE} prints status messages during the computation.}
+
+\item{scale}{optional column scaling vector whose values divide each column of \code{A};
+must be as long as the number of columns of \code{A} (see notes).}
+
+\item{center}{optional column centering vector whose values are subtracted from each
+column of \code{A}; must be as long as the number of columns of \code{A} and may}
+
+\item{du}{DEPRECATED optional subspace deflation vector (see notes).}
+
+\item{ds}{DEPRECATED optional subspace deflation scalar (see notes).}
+
+\item{dv}{DEPRECATED optional subspace deflation vector (see notes).}
+
+\item{shift}{optional shift value (square matrices only, see notes).}
+
+\item{mult}{optional custom matrix multiplication function (default is \code{\%*\%}, see notes).}
+
+\item{fastpath}{try a fast C algorithm implementation if possible; set \code{fastpath=FALSE} to use the reference R implementation. See notes.}
+}
+\value{
+Returns a list with entries:
+\describe{
+  \item{d:}{ max(nu, nv) approximate singular values}
+  \item{u:}{ nu approximate left singular vectors (only when right_only=FALSE)}
+  \item{v:}{ nv approximate right singular vectors}
+  \item{iter:}{ The number of Lanczos iterations carried out}
+  \item{mprod:}{ The total number of matrix vector products carried out}
+}
+}
+\description{
+The augmented implicitly restarted Lanczos bidiagonalization algorithm
+(IRLBA) finds a few approximate largest singular values and corresponding
+singular vectors of a sparse or dense matrix using a method of Baglama and
+Reichel.  It is a fast and memory-efficient way to compute a partial SVD.
+}
+\note{
+The syntax of \code{irlba} partially follows \code{svd}, with an important
+exception. The usual R \code{svd} function always returns a complete set of
+singular values, even if the number of singular vectors \code{nu} or \code{nv}
+is set less than the maximum. The \code{irlba} function returns a number of
+estimated singular values equal to the maximum of the number of specified
+singular vectors \code{nu} and \code{nv}.
+
+Use the optional \code{scale} parameter to implicitly scale each column of
+the matrix \code{A} by the values in the \code{scale} vector, computing the
+truncated SVD of the column-scaled \code{sweep(A, 2, scale, FUN=`/`)}, or
+equivalently, \code{A \%*\% diag(1 / scale)}, without explicitly forming the
+scaled matrix. \code{scale} must be a non-zero vector of length equal
+to the number of columns of \code{A}.
+
+Use the optional \code{center} parameter to implicitly subtract the values
+in the \code{center} vector from each column of \code{A}, computing the
+truncated SVD of \code{sweep(A, 2, center, FUN=`-`)},
+without explicitly forming the centered matrix. This option may not be
+used together with the general rank 1 deflation options. \code{center}
+must be a vector of length equal to the number of columns of \code{A}.
+This option may be used to efficiently compute principal components without
+explicitly forming the centered matrix (which can, importantly, preserve
+sparsity in the matrix). See the examples.
+
+The optional deflation parameters are deprecated and will be removed in
+a future version. They could be used to compute the rank-one deflated
+SVD of \eqn{A - ds \cdot du dv^T}{A - ds*du \%*\% t(dv)}, where
+\eqn{du^T A - ds\cdot dv^T = 0}{t(du) \%*\% A - ds * t(dv) == 0}. For
+example, the triple \code{ds, du, dv} may be a known singular value
+and corresponding singular vectors. Or \code{ds=m} and \code{dv}
+and \code{du} represent a vector of column means of \code{A} and of ones,
+respectively, where \code{m} is the number of rows of \code{A}.
+This functionality can be effectively replaced with custom matrix
+product functions.
+
+Specify an optional alternative matrix multiplication operator in the
+\code{mult} parameter. \code{mult} must be a function of two arguments,
+and must handle both cases where one argument is a vector and the other
+a matrix. See the examples and
+\code{demo("custom_matrix_multiply", package="irlba")} for an 
+alternative approach.
+
+Use the \code{v} option to supply a starting vector for the iterative
+method. A random vector is used by default (precede with \code{set.seed()} to
+for reproducibility). Optionally set \code{v} to
+the output of a previous run of \code{irlba} to restart the method, adding
+additional singular values/vectors without recomputing the solution
+subspace. See the examples.
+
+The function may generate the following warnings:
+\itemize{
+  \item{"did not converge--results might be invalid!; try increasing maxit or fastpath=FALSE" means that the algorithm didn't
+  converge -- this is potentially a serious problem and the returned results may not be valid. \code{irlba}
+  reports a warning here instead of an error so that you can inspect whatever is returned. If this
+  happens, carefully heed the warning and inspect the result.}
+  \item{"You're computing a large percentage of total singular values, standard svd might work better!"
+    \code{irlba} is designed to efficiently compute a few of the largest singular values and associated
+     singular vectors of a matrix. The standard \code{svd} function will be more efficient for computing
+     large numbers of singular values than \code{irlba}.}
+   \item{"convergence criterion below machine epsilon" means that the product of \code{tol} and the
+     largest estimated singular value is really small and the normal convergence criterion is only
+     met up to round off error.}
+}
+The function might return an error for several reasons including a situation when the starting
+vector \code{v} is near the null space of the matrix. In that case, try a different \code{v}.
+
+The \code{fastpath=TRUE} option only supports real-valued matrices and sparse matrices
+of type \code{dgCMatrix} (for now). Other problems fall back to the reference
+R implementation.
+}
+\examples{
+set.seed(1)
+
+A <- matrix(runif(400), nrow=20)
+S <- irlba(A, 3)
+S$d
+
+# Compare with svd
+svd(A)$d[1:3]
+
+# Restart the algorithm to compute more singular values
+# (starting with an existing solution S)
+S1 <- irlba(A, 5, v=S)
+
+# Principal components (see also prcomp_irlba)
+P <- irlba(A, nv=1, center=colMeans(A))
+
+# Compare with prcomp and prcomp_irlba (might vary up to sign)
+cbind(P$v,
+      prcomp(A)$rotation[, 1],
+      prcomp_irlba(A)$rotation[, 1])
+
+# A custom matrix multiplication function that scales the columns of A
+# (cf the scale option). This function scales the columns of A to unit norm.
+col_scale <- sqrt(apply(A, 2, crossprod))
+mult <- function(x, y)
+        {
+          # check if x is a  vector
+          if (is.vector(x))
+          {
+            return((x \%*\% y) / col_scale)
+          }
+          # else x is the matrix
+          x \%*\% (y / col_scale)
+        }
+irlba(A, 3, mult=mult)$d
+
+# Compare with:
+irlba(A, 3, scale=col_scale)$d
+
+# Compare with:
+svd(sweep(A, 2, col_scale, FUN=`/`))$d[1:3]
+
+}
+\references{
+Augmented Implicitly Restarted Lanczos Bidiagonalization Methods, J. Baglama and L. Reichel, SIAM J. Sci. Comput. 2005.
+}
+\seealso{
+\code{\link{svd}}, \code{\link{prcomp}}, \code{\link{partial_eigen}}
+}
+
diff --git a/man/partial_eigen.Rd b/man/partial_eigen.Rd
new file mode 100644
index 0000000..c21fba2
--- /dev/null
+++ b/man/partial_eigen.Rd
@@ -0,0 +1,65 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/eigen.R
+\name{partial_eigen}
+\alias{partial_eigen}
+\title{Find a few approximate largest eigenvalues and corresponding eigenvectors of a symmetric matrix.}
+\usage{
+partial_eigen(x, n = 5, symmetric = TRUE, ...)
+}
+\arguments{
+\item{x}{numeric real-valued dense or sparse matrix.}
+
+\item{n}{number of largest eigenvalues and corresponding eigenvectors to compute.}
+
+\item{symmetric}{\code{TRUE} indicates \code{x} is a symmetric matrix (the default);
+specify \code{symmetric=FALSE} to compute the largest eigenvalues and corresponding eigenvectors of \code{t(x) \%*\% x} instead.}
+
+\item{...}{optional additional parameters passed to the \code{irlba} function.}
+}
+\value{
+Returns a list with entries:
+\itemize{
+  \item{values}{ n approximate largest eigenvalues}
+  \item{vectors}{ n approximate corresponding eigenvectors}
+}
+}
+\description{
+Use \code{partial_eigen} to estimate a subset of the largest (most positive)
+eigenvalues and corresponding eigenvectors of a symmetric dense or sparse
+real-valued matrix.
+}
+\note{
+Specify \code{symmetric=FALSE} to compute the largest \code{n} eigenvalues
+and corresponding eigenvectors of the symmetric matrix cross-product
+\code{t(x) \%*\% x}.
+
+This function uses the \code{irlba} function under the hood. See \code{?irlba}
+for description of additional options, especially the \code{tol} parameter.
+
+See the RSpectra package https://cran.r-project.org/package=RSpectra for more comprehensive
+partial eigenvalue decomposition.
+}
+\examples{
+set.seed(1)
+# Construct a symmetric matrix with some positive and negative eigenvalues:
+V <- qr.Q(qr(matrix(runif(100),nrow=10)))
+x <- V \%*\% diag(c(10, -9, 8, -7, 6, -5, 4, -3, 2, -1)) \%*\% t(V)
+partial_eigen(x, 3)$values
+
+# Compare with eigen
+eigen(x)$values[1:3]
+
+# Use symmetric=FALSE to compute the eigenvalues of t(x) \%*\% x for general
+# matrices x:
+x <- matrix(rnorm(100), 10)
+partial_eigen(x, 3, symmetric=FALSE)$values
+eigen(crossprod(x))$values
+
+}
+\references{
+Augmented Implicitly Restarted Lanczos Bidiagonalization Methods, J. Baglama and L. Reichel, SIAM J. Sci. Comput. 2005.
+}
+\seealso{
+\code{\link{eigen}}, \code{\link{irlba}}
+}
+
diff --git a/man/prcomp_irlba.Rd b/man/prcomp_irlba.Rd
new file mode 100644
index 0000000..e7f5592
--- /dev/null
+++ b/man/prcomp_irlba.Rd
@@ -0,0 +1,76 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/prcomp.R
+\name{prcomp_irlba}
+\alias{prcomp_irlba}
+\title{Principal Components Analysis}
+\usage{
+prcomp_irlba(x, n = 3, retx = TRUE, center = TRUE, scale. = FALSE, ...)
+}
+\arguments{
+\item{x}{a numeric or complex matrix (or data frame) which provides
+the data for the principal components analysis.}
+
+\item{n}{integer number of principal component vectors to return, must be less than
+\code{min(dim(x))}.}
+
+\item{retx}{a logical value indicating whether the rotated variables should be returned.}
+
+\item{center}{a logical value indicating whether the variables should be
+shifted to be zero centered. Alternately, a centering vector of length
+equal the number of columns of \code{x} can be supplied.}
+
+\item{scale.}{a logical value indicating whether the variables should be
+scaled to have unit variance before the analysis takes place.
+The default is \code{FALSE} for consistency with S, but scaling is often advisable.
+Alternatively, a vector of length equal the number of columns of \code{x} can be supplied.}
+
+\item{...}{additional arguments passed to \code{\link{irlba}}.}
+}
+\value{
+A list with class "prcomp" containing the following components:
+\itemize{
+   \item{sdev} {the standard deviations of the principal components (i.e.,
+         the square roots of the eigenvalues of the
+         covariance/correlation matrix, though the calculation is
+         actually done with the singular values of the data matrix).}
+  \item{rotation} {the matrix of variable loadings (i.e., a matrix whose columns
+         contain the eigenvectors).}
+  \item {x} {if \code{retx} is \code{TRUE} the value of the rotated data (the centred
+         (and scaled if requested) data multiplied by the \code{rotation}
+        matrix) is returned.  Hence, \code{cov(x)} is the diagonal matrix
+         \code{diag(sdev^2)}.}
+  \item{center, scale} {the centering and scaling used, or \code{FALSE}.}
+}
+}
+\description{
+Efficient computation of a truncated principal components analysis of a given data matrix
+using an implicitly restarted Lanczos method from the \code{\link{irlba}} package.
+}
+\note{
+The signs of the columns of the rotation matrix are arbitrary, and
+so may differ between different programs for PCA, and even between
+different builds of R.
+
+NOTE DIFFERENCES WITH THE DEFAULT \code{\link{prcomp}} FUNCTION!
+The \code{tol} truncation argument found in \code{prcomp} is not supported.
+In place of the truncation tolerance in the original function, the
+\code{prcomp_irlba}  function has the argument \code{n} explicitly giving the
+number of principal components to return. A warning is generated if the
+argument \code{tol} is used, which is interpreted differently between
+the two functions.
+}
+\examples{
+set.seed(1)
+x  <- matrix(rnorm(200), nrow=20)
+p1 <- prcomp_irlba(x, n=3)
+summary(p1)
+
+# Compare with
+p2 <- prcomp(x, tol=0.7)
+summary(p2)
+
+}
+\seealso{
+\code{\link{prcomp}}
+}
+
diff --git a/src/Makevars b/src/Makevars
new file mode 100644
index 0000000..fadfd95
--- /dev/null
+++ b/src/Makevars
@@ -0,0 +1 @@
+PKG_LIBS =  $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS)
diff --git a/src/irlb.c b/src/irlb.c
new file mode 100644
index 0000000..0bb265c
--- /dev/null
+++ b/src/irlb.c
@@ -0,0 +1,546 @@
+/*
+ * irlb: Implicitly restarted Lanczos bidiagonalization partial SVD.
+ * Copyright (c) 2016 by Bryan W. Lewis
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <fcntl.h>
+#include <assert.h>
+#include <math.h>
+
+#include <R.h>
+#define USE_RINTERNALS
+#include <Rinternals.h>
+#include <Rdefines.h>
+
+#include "R_ext/BLAS.h"
+#include "R_ext/Lapack.h"
+#include "R_ext/Rdynload.h"
+#include "R_ext/Utils.h"
+#include "R_ext/Parse.h"
+
+#include "Matrix.h"
+#include "Matrix_stubs.c"
+
+#include "irlb.h"
+
+/* helper function for calling rnorm below */
+SEXP
+RNORM (int n)
+{
+  char buf[4096];
+  SEXP cmdSexp, cmdexpr, ans = R_NilValue;
+  ParseStatus status;
+  cmdSexp = PROTECT (allocVector (STRSXP, 1));
+  snprintf (buf, 4095, "rnorm(%d)", n);
+  SET_STRING_ELT (cmdSexp, 0, mkChar (buf));
+  cmdexpr = PROTECT (R_ParseVector (cmdSexp, -1, &status, R_NilValue));
+  if (status != PARSE_OK)
+    {
+      UNPROTECT (2);
+      error ("invalid call");
+    }
+  for (int i = 0; i < length (cmdexpr); i++)
+    ans = eval (VECTOR_ELT (cmdexpr, i), R_GlobalEnv);
+  UNPROTECT (2);
+  return ans;
+}
+
+/* irlb C implementation wrapper
+ * X double precision input matrix
+ * NU integer number of singular values/vectors to compute must be > 3
+ * INIT double precision starting vector length(INIT) must equal ncol(X)
+ * WORK integer working subspace dimension must be > NU
+ * MAXIT integer maximum number of iterations
+ * TOL double tolerance
+ * EPS double machine epsilon
+ * MULT integer 0 X is a dense matrix (dgemm), 1 sparse (cholmod)
+ * RESTART integer 0 no or > 0 indicates restart of dimension n
+ * RV, RW, RS optional restart V W and S values of dimension RESTART
+ *    (only used when RESTART > 0)
+ * SCALE either NULL (no scaling) or a vector of length ncol(X)
+ * SHIFT either NULL (no shift) or a single double-precision number
+ * CENTER either NULL (no centering) or a vector of length ncol(X)
+ *
+ * Returns a list with 6 elements:
+ * 1. vector of estimated singular values
+ * 2. matrix of estimated left singular vectors
+ * 3. matrix of estimated right singular vectors
+ * 4. number of algorithm iterations
+ * 5. number of matrix vector products
+ * 6. irlb C algorithm return error code (see irlb below)
+ */
+SEXP
+IRLB (SEXP X, SEXP NU, SEXP INIT, SEXP WORK, SEXP MAXIT, SEXP TOL, SEXP EPS,
+      SEXP MULT, SEXP RESTART, SEXP RV, SEXP RW, SEXP RS, SEXP SCALE,
+      SEXP SHIFT, SEXP CENTER)
+{
+  SEXP ANS, S, U, V;
+  double *V1, *U1, *W, *F, *B, *BU, *BV, *BS, *BW, *res, *T, *scale, *shift,
+    *center;
+  int i, iter, mprod, ret;
+  R_xlen_t m, n;
+
+  int mult = INTEGER (MULT)[0];
+  void *A;
+  switch (mult)
+    {
+    case 1:
+      A = (void *) AS_CHM_SP (X);
+      int *dims = INTEGER (GET_SLOT (X, install ("Dim")));
+      m = dims[0];
+      n = dims[1];
+      break;
+    default:
+      A = (void *) REAL (X);
+      m = nrows (X);
+      n = ncols (X);
+    }
+  int nu = INTEGER (NU)[0];
+  R_xlen_t work = INTEGER (WORK)[0];
+  int maxit = INTEGER (MAXIT)[0];
+  double tol = REAL (TOL)[0];
+  int lwork = 7 * work * (1 + work);
+  int restart = INTEGER (RESTART)[0];
+  double eps = REAL (EPS)[0];
+
+  PROTECT (ANS = NEW_LIST (6));
+  PROTECT (S = allocVector (REALSXP, nu));
+  PROTECT (U = allocVector (REALSXP, m * work));
+  PROTECT (V = allocVector (REALSXP, n * work));
+  if (restart == 0)
+    for (i = 0; i < n; ++i)
+      (REAL (V))[i] = (REAL (INIT))[i];
+
+  /* set up intermediate working storage */
+  scale = NULL;
+  shift = NULL;
+  center = NULL;
+  if (TYPEOF (SCALE) == REALSXP)
+    {
+      scale = (double *) R_alloc (n * 2, sizeof (double));
+      memcpy (scale, REAL (SCALE), n * sizeof (double));
+    }
+  if (TYPEOF (SHIFT) == REALSXP)
+    {
+      shift = REAL (SHIFT);
+    }
+  if (TYPEOF (CENTER) == REALSXP)
+    {
+      center = REAL (CENTER);
+    }
+  V1 = (double *) R_alloc (n * work, sizeof (double));
+  U1 = (double *) R_alloc (m * work, sizeof (double));
+  W = (double *) R_alloc (m * work, sizeof (double));
+  F = (double *) R_alloc (n, sizeof (double));
+  B = (double *) R_alloc (work * work, sizeof (double));
+  BU = (double *) R_alloc (work * work, sizeof (double));
+  BV = (double *) R_alloc (work * work, sizeof (double));
+  BS = (double *) R_alloc (work, sizeof (double));
+  BW = (double *) R_alloc (lwork, sizeof (double));
+  res = (double *) R_alloc (work, sizeof (double));
+  T = (double *) R_alloc (lwork, sizeof (double));
+  if (restart > 0)
+    {
+      memcpy (REAL (V), REAL (RV), n * (restart + 1) * sizeof (double));
+      memcpy (W, REAL (RW), m * restart * sizeof (double));
+      memset (B, 0, work * work * sizeof (double));
+      for (i = 0; i < restart; ++i)
+        B[i + work * i] = REAL (RS)[i];
+    }
+  ret =
+    irlb (A, mult, m, n, nu, work, maxit, restart, tol, scale, shift, center,
+          REAL (S), REAL (U), REAL (V), &iter, &mprod, eps, lwork, V1, U1, W,
+          F, B, BU, BV, BS, BW, res, T);
+  SET_VECTOR_ELT (ANS, 0, S);
+  SET_VECTOR_ELT (ANS, 1, U);
+  SET_VECTOR_ELT (ANS, 2, V);
+  SET_VECTOR_ELT (ANS, 3, ScalarInteger (iter));
+  SET_VECTOR_ELT (ANS, 4, ScalarInteger (mprod));
+  SET_VECTOR_ELT (ANS, 5, ScalarInteger (ret));
+  UNPROTECT (4);
+  return ANS;
+}
+
+/* irlb: main computation function.
+ * returns:
+ *  0 on success,
+ * -1 invalid dimensions,
+ * -2 not converged
+ * -3 out of memory
+ * -4 starting vector near the null space of A
+ *
+ * all data must be allocated by caller, required sizes listed below
+ */
+int
+irlb (void *A,                  // Input data matrix
+      int mult,                 // 0 -> A is double *, 1 -> A is cholmod
+      int m,                    // data matrix number of rows, must be > 3.
+      int n,                    // data matrix number of columns, must be > 3.
+      int nu,                   // dimension of solution
+      int work,                 // working dimension, must be > 3.
+      int maxit,                // maximum number of main iterations
+      int restart,              // 0->no, n>0 -> restarted algorithm of dimension n
+      double tol,               // convergence tolerance
+      double *scale,            // optional scale (NULL for no scale) size n * 2
+      double *shift,            // optional shift (NULL for no shift)
+      double *center,           // optional center (NULL for no center)
+      // output values
+      double *s,                // output singular vectors at least length nu
+      double *U,                // output left singular vectors  m x work
+      double *V,                // output right singular vectors n x work
+      int *ITER,                // ouput number of Lanczos iterations
+      int *MPROD,               // output number of matrix vector products
+      double eps,               // machine epsilon
+      // working intermediate storage, sizes shown
+      int lwork, double *V1,    // n x work
+      double *U1,               // m x work
+      double *W,                // m x work  input when restart > 0
+      double *F,                // n
+      double *B,                // work x work  input when restart > 0
+      double *BU,               // work x work
+      double *BV,               // work x work
+      double *BS,               // work
+      double *BW,               // lwork
+      double *res,              // work
+      double *T)                // lwork
+{
+  double d, S, R, alpha, beta, R_F, SS;
+  double *x;
+  int jj, kk;
+  int converged;
+  int info, j, k = restart;
+  int inc = 1;
+  int mprod = 0;
+  int iter = 0;
+  double Smax = 0;
+  SEXP FOO;
+
+/* Check for valid input dimensions */
+  if (work < 4 || n < 4 || m < 4)
+    return -1;
+
+  if (restart == 0)
+    memset (B, 0, work * work * sizeof (double));
+/* Main iteration */
+  while (iter < maxit)
+    {
+      j = 0;
+/*  Normalize starting vector */
+      if (iter == 0 && restart == 0)
+        {
+          d = F77_NAME (dnrm2) (&n, V, &inc);
+          if (d < 2 * eps)
+            return -1;
+          d = 1 / d;
+          F77_NAME (dscal) (&n, &d, V, &inc);
+        }
+      else
+        j = k;
+
+/* optionally apply scale */
+      x = V + j * n;
+      if (scale)
+        {
+          x = scale + n;
+          memcpy (scale + n, V + j * n, n * sizeof (double));
+          for (kk = 0; kk < n; ++kk)
+            x[kk] = x[kk] / scale[kk];
+        }
+
+      switch (mult)
+        {
+        case 1:
+          dsdmult ('n', m, n, (CHM_SP) A, x, W + j * m);
+          break;
+        default:
+          alpha = 1;
+          beta = 0;
+          F77_NAME (dgemv) ("n", &m, &n, &alpha, (double *) A, &m, x,
+                            &inc, &beta, W + j * m, &inc);
+        }
+      mprod++;
+      R_CheckUserInterrupt ();
+
+/* optionally apply shift in square cases m = n */
+      if (shift)
+        {
+          jj = j * m;
+          for (kk = 0; kk < m; ++kk)
+            W[jj + kk] = W[jj + kk] + shift[0] * x[kk];
+        }
+/* optionally apply centering */
+      if (center)
+        {
+          jj = j * m;
+          beta = F77_CALL (ddot) (&n, x, &inc, center, &inc);
+          for (kk = 0; kk < m; ++kk)
+            W[jj + kk] = W[jj + kk] - beta;
+        }
+
+      if (iter > 0)
+        orthog (W, W + j * m, T, m, j, 1);
+      S = F77_NAME (dnrm2) (&m, W + j * m, &inc);
+      if (S < 2 * eps && j == 0)
+        return -4;
+      SS = 1.0 / S;
+      F77_NAME (dscal) (&m, &SS, W + j * m, &inc);
+
+/* The Lanczos process */
+      while (j < work)
+        {
+          switch (mult)
+            {
+            case 1:
+              dsdmult ('t', m, n, (CHM_SP) A, W + j * m, F);
+              break;
+            default:
+              alpha = 1.0;
+              beta = 0.0;
+              F77_NAME (dgemv) ("t", &m, &n, &alpha, (double *) A, &m,
+                                W + j * m, &inc, &beta, F, &inc);
+            }
+          mprod++;
+          R_CheckUserInterrupt ();
+/* optionally apply shift and scale */
+          if (shift)
+            {
+              for (kk = 0; kk < m; ++kk)
+                F[kk] = F[kk] + shift[0] * W[j * m + kk];
+            }
+          if (scale)
+            {
+              for (kk = 0; kk < n; ++kk)
+                F[kk] = F[kk] / scale[kk];
+            }
+          SS = -S;
+          F77_NAME (daxpy) (&n, &SS, V + j * n, &inc, F, &inc);
+          orthog (V, F, T, n, j + 1, 1);
+
+          if (j + 1 < work)
+            {
+              R_F = F77_NAME (dnrm2) (&n, F, &inc);
+              R = 1.0 / R_F;
+              if (R_F < 2 * eps)        // near invariant subspace
+                {
+                  FOO = RNORM (n);
+                  for (kk = 0; kk < n; ++kk)
+                    F[kk] = REAL (FOO)[kk];
+                  orthog (V, F, T, n, j + 1, 1);
+                  R_F = F77_NAME (dnrm2) (&n, F, &inc);
+                  R = 1.0 / R_F;
+                  R_F = 0;
+                }
+
+              memmove (V + (j + 1) * n, F, n * sizeof (double));
+              F77_NAME (dscal) (&n, &R, V + (j + 1) * n, &inc);
+              B[j * work + j] = S;
+              B[(j + 1) * work + j] = R_F;
+/* optionally apply scale */
+              x = V + (j + 1) * n;
+              if (scale)
+                {
+                  x = scale + n;
+                  memcpy (x, V + (j + 1) * n, n * sizeof (double));
+                  for (kk = 0; kk < n; ++kk)
+                    x[kk] = x[kk] / scale[kk];
+                }
+              switch (mult)
+                {
+                case 1:
+                  dsdmult ('n', m, n, (CHM_SP) A, x, W + (j + 1) * m);
+                  break;
+                default:
+                  alpha = 1.0;
+                  beta = 0.0;
+                  F77_NAME (dgemv) ("n", &m, &n, &alpha, (double *) A, &m,
+                                    x, &inc, &beta, W + (j + 1) * m, &inc);
+                }
+              mprod++;
+              R_CheckUserInterrupt ();
+/* optionally apply shift */
+              if (shift)
+                {
+                  jj = j + 1;
+                  for (kk = 0; kk < m; ++kk)
+                    W[jj * m + kk] = W[jj * m + kk] + shift[0] * x[kk];
+                }
+/* optionally apply centering */
+              if (center)
+                {
+                  jj = (j + 1) * m;
+                  beta = F77_CALL (ddot) (&n, x, &inc, center, &inc);
+                  for (kk = 0; kk < m; ++kk)
+                    W[jj + kk] = W[jj + kk] - beta;
+                }
+/* One step of classical Gram-Schmidt */
+              R = -R_F;
+              F77_NAME (daxpy) (&m, &R, W + j * m, &inc, W + (j + 1) * m,
+                                &inc);
+/* full re-orthogonalization of W_{j+1} */
+              orthog (W, W + (j + 1) * m, T, m, j + 1, 1);
+              S = F77_NAME (dnrm2) (&m, W + (j + 1) * m, &inc);
+              SS = 1.0 / S;
+              if (S < 2 * eps)
+                {
+                  FOO = RNORM (m);
+                  jj = (j + 1) * m;
+                  for (kk = 0; kk < m; ++kk)
+                    W[jj + kk] = REAL (FOO)[kk];
+                  orthog (W, W + (j + 1) * m, T, m, j + 1, 1);
+                  S = F77_NAME (dnrm2) (&m, W + (j + 1) * m, &inc);
+                  SS = 1.0 / S;
+                  F77_NAME (dscal) (&m, &SS, W + (j + 1) * m, &inc);
+                  S = 0;
+                }
+              else
+                F77_NAME (dscal) (&m, &SS, W + (j + 1) * m, &inc);
+            }
+          else
+            {
+              B[j * work + j] = S;
+            }
+          j++;
+        }
+
+      memmove (BU, B, work * work * sizeof (double));   // Make a working copy of B
+      int *BI = (int *) T;
+      F77_NAME (dgesdd) ("O", &work, &work, BU, &work, BS, BU, &work, BV,
+                         &work, BW, &lwork, BI, &info);
+      R_F = F77_NAME (dnrm2) (&n, F, &inc);
+      R = 1.0 / R_F;
+      F77_NAME (dscal) (&n, &R, F, &inc);
+/* Force termination after encountering linear dependence */
+      if (R_F < 2 * eps)
+        R_F = 0;
+      Smax = 0;
+      for (jj = 0; jj < j; ++jj)
+        if (BS[jj] > Smax)
+          Smax = BS[jj];
+      for (kk = 0; kk < j; ++kk)
+        res[kk] = R_F * BU[kk * work + (j - 1)];
+/* Update k to be the number of converged singular values. */
+      convtests (j, nu, tol, Smax, res, &k, &converged);
+      if (converged == 1)
+        {
+          iter++;
+          break;
+        }
+
+      alpha = 1;
+      beta = 0;
+      F77_NAME (dgemm) ("n", "t", &n, &k, &j, &alpha, V, &n, BV, &work, &beta,
+                        V1, &n);
+      memmove (V, V1, n * k * sizeof (double));
+      memmove (V + n * k, F, n * sizeof (double));
+
+      memset (B, 0, work * work * sizeof (double));
+      for (jj = 0; jj < k; ++jj)
+        {
+          B[jj * work + jj] = BS[jj];
+          B[k * work + jj] = res[jj];
+        }
+
+/*   Update the left approximate singular vectors */
+      alpha = 1;
+      beta = 0;
+      F77_NAME (dgemm) ("n", "n", &m, &k, &j, &alpha, W, &m, BU, &work, &beta,
+                        U1, &m);
+      memmove (W, U1, m * k * sizeof (double));
+      iter++;
+    }
+
+/* Results */
+  memmove (s, BS, nu * sizeof (double));        /* Singular values */
+  alpha = 1;
+  beta = 0;
+  F77_NAME (dgemm) ("n", "n", &m, &nu, &work, &alpha, W, &m, BU, &work, &beta,
+                    U, &m);
+  F77_NAME (dgemm) ("n", "t", &n, &nu, &work, &alpha, V, &n, BV, &work, &beta,
+                    V1, &n);
+  memmove (V, V1, n * nu * sizeof (double));
+
+  *ITER = iter;
+  *MPROD = mprod;
+  return (converged == 1 ? 0 : -2);
+}
+
+
+cholmod_common chol_c;
+/* Need our own CHOLMOD error handler */
+void attribute_hidden
+irlba_R_cholmod_error (int status, const char *file, int line,
+                       const char *message)
+{
+  if (status < 0)
+    error ("Cholmod error '%s' at file:%s, line %d", message, file, line);
+  else
+    warning ("Cholmod warning '%s' at file:%s, line %d", message, file, line);
+}
+
+#ifdef HAVE_VISIBILITY_ATTRIBUTE
+__attribute__ ((visibility ("default")))
+#endif
+     void R_init_irlba (DllInfo * dll)
+{
+  M_R_cholmod_start (&chol_c);
+  chol_c.final_ll = 1;          /* LL' form of simplicial factorization */
+
+  /* need own error handler, that resets  final_ll (after *_defaults()) : */
+  chol_c.error_handler = irlba_R_cholmod_error;
+}
+
+void
+R_unload_irlba (DllInfo * dll)
+{
+  M_cholmod_finish (&chol_c);
+}
+
+
+void
+dsdmult (char transpose, int m, int n, void *a, double *b, double *c)
+{
+  DL_FUNC sdmult = R_GetCCallable ("Matrix", "cholmod_sdmult");
+  int t = transpose == 't' ? 1 : 0;
+  CHM_SP cha = (CHM_SP) a;
+
+  cholmod_dense chb;
+  chb.nrow = transpose == 't' ? m : n;
+  chb.d = chb.nrow;
+  chb.ncol = 1;
+  chb.nzmax = chb.nrow;
+  chb.xtype = cha->xtype;
+  chb.dtype = 0;
+  chb.x = (void *) b;
+  chb.z = (void *) NULL;
+
+  cholmod_dense chc;
+  chc.nrow = transpose == 't' ? n : m;
+  chc.d = chc.nrow;
+  chc.ncol = 1;
+  chc.nzmax = chc.nrow;
+  chc.xtype = cha->xtype;
+  chc.dtype = 0;
+  chc.x = (void *) c;
+  chc.z = (void *) NULL;
+
+  double one[] = { 1, 0 }, zero[] =
+  {
+  0, 0};
+  sdmult (cha, t, one, zero, &chb, &chc, &chol_c);
+}
diff --git a/src/irlb.h b/src/irlb.h
new file mode 100644
index 0000000..6a2a630
--- /dev/null
+++ b/src/irlb.h
@@ -0,0 +1,65 @@
+/* Compute Y = Y - X * t(X) * Y */
+void
+orthog( double *X,  // Input data matrix
+        double *Y,  // Input data matrix
+        double *T,  // work matrix size xn * yn
+        int xm,     // number of columns of X
+        int xn,     // number of columns of X
+        int yn);    // number of columns of Y
+
+void
+convtests (int Bsz,           // Number of rows of bidiagonal matrix B
+           int n,             // requested number of singular values
+           double tol,        // convergence tolerance
+           double Smax,       // largest singular value of B
+           double *residuals, // vector of residual values
+           int *k,            // number of estimated singular values (INPUT)
+                              // adjusted subspace size (OUTPUT)
+           int *converged);   // 0 = FALSE, 1 = TRUE
+
+/*
+ * Simple cholmod double-precision sparse matrix times dense vector multiplication interface
+ * Compute c = op(a) %*% b, c changed on output a and b unchanged.
+ * where, if transpose = 't' then op(a) = t(a) and length(b) = m, length(c) = n
+ *        else,  then op(a) = a and length(b) = n, length(c) = m
+ */
+void
+dsdmult(char transpose, // 't' -> op(a) = t(a), non-transposed a otherwise 
+        int m,          // number of rows of a
+        int n,          // number of columns of a
+        void *a,        // double precision valued sparse matrix
+        double *b,      // double precision dense vector
+        double *c);     // output
+
+/* IRLB method for sparse or dense double-precision valued matrices */
+int
+irlb(void *A,       // Input data matrix
+     int mult,      // 0->A is double *, 1-> A is sparse double *
+     int m,         // data matrix number of rows
+     int n,         // data matrix number of columns
+     int nu,        // dimension of solution
+     int m_b,       // working dimension
+     int maxit,     // maximum number of Lanzcos iterations
+     int restart,   // 0 -> no restart, 1 -> restarted form
+     double tol,    // convergence tolerance
+     double *scale, // optional scale (NULL for no scale) length n * 2 (1st n scale values 2nd n work)
+     double *shift, // optional shift (NULL for no shift) length 1
+     double *center,// optional center (NULL for no center) length n
+     double *s,     // output singular vectors at least length nu
+     double *U,     // output left singular vectors length >= m x m_b
+     double *V,     // output right singular vectors length >= n x m_b
+     int *ITER,     // output number of iterations performed
+     int *MPROD,    // output number of matrix vector products
+     double eps,    // machine epsilon
+     int lwork,     // length for some intermediate values below
+     double *V1,    // working storage  n * work
+     double *U1,    // working storage  m * work
+     double *W,     // working storage  m * work
+     double *F,     // working storage  n
+     double *B,     // working storage  work * work
+     double *BU,    // working storage  work * work
+     double *BV,    // working storage  work * work
+     double *BS,    // working storage  work
+     double *BW,    // working storage  lwork * lwork
+     double *res,   // working storage  work
+     double *T);    // working storage lwork
diff --git a/src/utility.c b/src/utility.c
new file mode 100644
index 0000000..a359c8f
--- /dev/null
+++ b/src/utility.c
@@ -0,0 +1,93 @@
+/*
+ * irlb: Implicitly restarted Lanczos bidiagonalization partial SVD.
+ * Copyright (c) 2016 by Bryan W. Lewis
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <fcntl.h>
+#include <assert.h>
+#include <math.h>
+
+#include "Rinternals.h"
+#include "R_ext/BLAS.h"
+#include "R_ext/Lapack.h"
+
+#include "irlb.h"
+
+
+/* orthog(X,Y,...)
+ * compute Y = Y - X * t(X) * Y
+ * xm,xn: nrow, ncol X
+ * yn: ncol Y (ASSUMED TO BE 1)
+ * On entry, number of rows of Y must be xm to compute t(X) * Y and
+ * T must be allocated of at least size xn * yn.
+ * Modifies contents of Y.
+ */
+void
+orthog (double *X, double *Y, double *T, int xm, int xn, int yn)
+{
+  double a = 1, b = 1;
+  int inc = 1;
+  memset(T, 0, xn * yn * sizeof(double));
+  // T = t(X) * Y
+  F77_NAME (dgemv) ("t", &xm, &xn, &a, X, &xm, Y, &inc, &b, T, &inc);
+  // Y = Y - X * T
+  a = -1.0;
+  b = 1.0;
+  F77_NAME (dgemv) ("n", &xm, &xn, &a, X, &xm, T, &inc, &b, Y, &inc);
+}
+
+
+/*
+ * Convergence tests
+ * Input parameters
+ * Bsz            Number of rows of the bidiagonal matrix B (scalar)
+ * tol            convergence tolerance (scalar)
+ * n              Requested number of singular values
+ * Smax           Largest singular value of B
+ * residuals      vector of residual values
+ * k              number of estimated signular values (scalar)
+ *
+ * Output
+ * converged      0 = FALSE, 1 = TRUE (all converged)
+ * k              Adjusted subspace size.
+ */
+void
+convtests (int Bsz, int n, double tol, double Smax,
+           double *residuals, int *k, int *converged)
+{
+  int j, Len_res = 0;
+  for (j = 0; j < Bsz; ++j)
+    {
+      if (fabs(residuals[j]) < tol * Smax)
+        Len_res++;
+    }
+  if (Len_res >= n)
+    {
+      *converged = 1;
+      return;
+    }
+  if (*k < n + Len_res)
+    *k = n + Len_res;
+  if (*k > Bsz - 3)
+    *k = Bsz - 3;
+  if (*k < 1)
+    *k = 1;
+  *converged = 0;
+  return;
+}
diff --git a/tests/edge.R b/tests/edge.R
new file mode 100644
index 0000000..fc1bcba
--- /dev/null
+++ b/tests/edge.R
@@ -0,0 +1,66 @@
+# Tests for a few edge cases
+
+require("irlba")
+# Dense matrix
+set.seed(1)
+A <- matrix(rnorm(16), 4)
+L <- irlba(A, nu=1, nv=1, tol=1e-9, fastpath=FALSE)
+L1 <- irlba(A, nu=1, nv=1, tol=1e-9, fastpath=TRUE)
+S <- svd(A, nu=1, nv=1)
+if (!isTRUE(all.equal(L$d, S$d[1])))
+{
+  stop("Failed tiny reference example ")
+}
+if (!isTRUE(all.equal(L1$d, S$d[1])))
+{
+  stop("Failed tiny fastpath example")
+}
+
+# Tickle misc. checks
+set.seed(1)
+A <- matrix(rnorm(100), 10)
+L <- tryCatch(irlba(A, nv=3, tol=1e-9, fastpath=FALSE, work=2, v=rep(0, nrow(A))), error=function(e) "NULLSPACE")
+S <- svd(A)
+L <- irlba(A, nv=3, tol=1e-9, fastpath=FALSE, work=2, v=S$v[, 1])
+A <- S$u %*% diag(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1e-12)) %*% t(S$v)
+L <- irlba(A, nv=3, tol=1e-9, fastpath=FALSE, work=2, reorth=FALSE)
+
+# Convergence
+A <- S$u %*% (c(1e-5, rep(1e-9, 9)) * t(S$v))
+for (tol in 10 ^ -(7:12))
+{
+  L <- irlba(A, 3, tol=tol)
+  converged <- svd(A %*% L$v - L$u  %*% diag(L$d))$d[1] < tol * L$d[1]
+  stopifnot(converged)
+}
+
+# Sparse but not dgCMatrix (issue #6)
+A <- Matrix(matrix(rnorm(100), 10))
+L <- irlba(A, nv=1)
+S <- svd(A, nu=1, nv=1)
+if (!isTRUE(all.equal(L$d, S$d[1])))
+{
+  stop("Failed general sparse matrix example ")
+}
+
+A <- Matrix(sample(c(FALSE, TRUE), 100, replace=TRUE), 10, 10)
+L <- irlba(A, nv=1)
+S <- svd(A, nu=1, nv=1)
+if (!isTRUE(all.equal(L$d, S$d[1])))
+{
+  stop("Failed logical sparse matrix example ")
+}
+
+# Test for issue #7, a really dumb bug.
+mx <- matrix(sample(1:10, 10 * 100, replace=TRUE), nrow=10)
+S <- irlba(mx, nv=2, verbose=TRUE, center=colMeans(mx), right_only=TRUE)
+
+# test for issue #9
+set.seed(2)
+s1 <- irlba(diag(c(1,2,3,4,5,0,0,0,0)), 4)
+set.seed(2)
+s2 <- irlba(diag(c(1,2,3,4,5,0,0,0,0)), 4, fastpath=FALSE)
+if (!isTRUE(all.equal(s1$d, s2$d)))
+{
+  stop("Failed fastpath invariant subspace detection")
+}
diff --git a/tests/test.R b/tests/test.R
new file mode 100644
index 0000000..fa932e6
--- /dev/null
+++ b/tests/test.R
@@ -0,0 +1,129 @@
+require("irlba")
+
+for (FAST in c(FALSE, TRUE))
+{
+  # Dense matrix
+  set.seed(1)
+  A <- matrix(rnorm(400), 20)
+  L <- irlba(A, nu=2, nv=2, tol=1e-9, fastpath=FAST)
+  S <- svd(A, nu=2, nv=2)
+  if (!isTRUE(all.equal(L$d, S$d[1:2])))
+  {
+    stop("Failed simple dense singular value test", " fastpath=", FAST)
+  }
+
+  # restart
+  L1 <- irlba(A, nv=3, v=L, fastpath=FAST)
+  if (!isTRUE(all.equal(L1$d, S$d[1:3])))
+  {
+    stop("Failed restart", " fastpath=", FAST)
+  }
+
+  # Scaling and centering, dense
+  s <- sqrt(apply(A, 2, crossprod))
+  m <- colMeans(A)
+  L <- irlba(A, 3, tol=1e-9, center=m, scale=s, fastpath=FAST)
+  S <- svd(scale(A, center=TRUE, scale=s))
+  if (!isTRUE(all.equal(L$d, S$d[1:3])))
+  {
+    stop("Failed scaling/centering test", " fastpath=", FAST)
+  }
+  # Scale only, non-square, dense
+  A <- matrix(rnorm(200), 10)
+  s <- seq(1, ncol(A))
+  m <- colMeans(A)
+  L <- irlba(A, 3, tol=1e-9, scale=s, fastpath=FAST)
+  S <- svd(scale(A, center=FALSE, scale=s))
+  if (!isTRUE(all.equal(L$d, S$d[1:3])))
+  {
+    stop("Failed dense scaling test", " fastpath=", FAST)
+  }
+  # Center only, non-square, dense
+  L <- irlba(A, 3, tol=1e-9, center=m, fastpath=FAST)
+  S <- svd(scale(A, center=TRUE, scale=FALSE))
+  if (!isTRUE(all.equal(L$d, S$d[1:3])))
+  {
+    stop("Failed dense centering test", " fastpath=", FAST)
+  }
+  # Sparse matrix
+  require("Matrix")
+  K <- 400
+  N <- 2000
+  i <- sample(K, size=N, replace=TRUE)
+  j <- sample(K, size=N, replace=TRUE)
+  A <- sparseMatrix(i, j, x=rnorm(N))
+  L <- irlba(A, nu=2, nv=2, tol=1e-9, fastpath=FAST)
+  S <- svd(A, nu=2, nv=2)
+  if (!isTRUE(all.equal(L$d, S$d[1:2])))
+  {
+    stop("Failed simple sparse singular value test", " fastpath=", FAST)
+  }
+  # Center only, sparse
+  m <- colMeans(A)
+  L <- irlba(A, 3, tol=1e-9, center=m, fastpath=FAST)
+  S <- svd(scale(A, center=TRUE, scale=FALSE))
+  if (!isTRUE(all.equal(L$d, S$d[1:3])))
+  {
+    stop("Failed sparse centering test", " fastpath=", FAST)
+  }
+  # scale only, spase
+  s <- seq(1, ncol(A))
+  L <- irlba(A, 3, tol=1e-9, scale=s, fastpath=FAST)
+  S <- svd(scale(A, center=FALSE, scale=s))
+  if (!isTRUE(all.equal(L$d, S$d[1:3])))
+  {
+    stop("Failed sparse scaling test", " fastpath=", FAST)
+  }
+
+  # Symmetric partial eigendecomposition
+  set.seed(1)
+  V <- qr.Q(qr(matrix(runif(100), nrow=10)))
+  x <- V %*% diag(c(10, -9, 8, -7, 6, -5, 4, -3, 2, -1)) %*% t(V)
+  if (!isTRUE(all.equal(partial_eigen(x, 3, fastpath=FAST)$values, c(10, 8, 6))))
+  {
+    stop("Failed partial_eigen test", " fastpath=", FAST)
+  }
+
+  # Test right-only option
+  L <- irlba(A, 2, tol=1e-3, right_only=TRUE, fastpath=FAST)
+  S <- svd(A, nu=2, nv=2)
+  if (!isTRUE(all.equal(L$d, S$d[1:2])))
+  {
+    stop("Failed right_only test", " fastpath=", FAST)
+  }
+
+  # Dense complex-valued matrix
+  A <- matrix(rnorm(400), 20) + 1i * matrix(rnorm(400), 20)
+  L <- irlba(A, nu=2, nv=2, tol=1e-9, fastpath=FAST)
+  S <- svd(A, nu=2, nv=2)
+  if (!isTRUE(all.equal(L$d, S$d[1:2])))
+  {
+    stop("Failed complex-valued dense singular value test", " fastpath=", FAST)
+  }
+
+  # test extra reorthogonalization
+  L <- irlba(A, nu=2, nv=2, tol=1e-9, reorth=TRUE, fastpath=FAST)
+  if (!isTRUE(all.equal(L$d, S$d[1:2])))
+  {
+    stop("Failed reorthogonalization test", " fastpath=", FAST)
+  }
+
+  # prcomp convenience function
+  x  <- matrix(rnorm(200), nrow=20)
+  p1 <- prcomp_irlba(x, n=3, fastpath=FAST)
+  p2 <- prcomp(x, tol=0.7)
+  if (!isTRUE(all.equal(p1$sdev[1:2], p2$sdev[1:2])))
+  {
+    stop("Failed prcomp test", " fastpath=", FAST)
+  }
+
+  # very non-square dense matrices
+  set.seed(1)
+  A <- matrix(rnorm(2000), 20)
+  L1 <- irlba(A, nu=2, nv=2, tol=1e-9, fastpath=FAST)
+  L2 <- irlba(t(A), nu=2, nv=2, tol=1e-9, fastpath=FAST)
+  if (!isTRUE(all.equal(L1$d, L2$d)))
+  {
+    stop("Failed nonsquare test", " fastpath=", FAST)
+  }
+}
diff --git a/vignettes/irlba.Rnw b/vignettes/irlba.Rnw
new file mode 100644
index 0000000..546237b
--- /dev/null
+++ b/vignettes/irlba.Rnw
@@ -0,0 +1,467 @@
+% \VignetteIndexEntry{irlba Manual}
+% \VignetteDepends{irlba}
+% \VignettePackage{irlba}
+\documentclass[12pt]{article}
+\usepackage{amsmath}
+\usepackage[pdftex]{graphicx}
+\usepackage{color}
+\usepackage{xspace}
+\usepackage{fancyvrb}
+\usepackage{fancyhdr}
+\usepackage[
+     colorlinks=true,
+     linkcolor=blue,
+     citecolor=blue,
+     urlcolor=blue]
+     {hyperref}
+\usepackage{lscape}
+\usepackage{Sweave}
+\usepackage{tabularx}
+\usepackage{listings}
+\usepackage{mdwlist}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+% define new colors for use
+\definecolor{darkgreen}{rgb}{0,0.6,0}
+\definecolor{darkred}{rgb}{0.6,0.0,0}
+\definecolor{lightbrown}{rgb}{1,0.9,0.8}
+\definecolor{brown}{rgb}{0.6,0.3,0.3}
+\definecolor{darkblue}{rgb}{0,0,0.8}
+\definecolor{darkmagenta}{rgb}{0.5,0,0.5}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+
+\newcommand{\bld}[1]{\mbox{\boldmath $#1$}}
+\newcommand{\shell}[1]{\mbox{$#1$}}
+\renewcommand{\vec}[1]{\mbox{\bf {#1}}}
+\newcommand{\ReallySmallSpacing}{\renewcommand{\baselinestretch}{.6}\Large\normalsize}
+\newcommand{\SmallSpacing}{\renewcommand{\baselinestretch}{1.1}\Large\normalsize}
+\def\tm{\leavevmode\hbox{$\rm {}^{TM}$}}
+
+\newcommand{\R}{{\mathbf R}}
+\newcommand{\brho}{{\color{blue}{\rho}}}
+\newcommand{\Ra}{{\mathcal R}}
+\newcommand{\PP}{{\mathbf P}}
+\newcommand{\N}{{\mathbf N}}
+\newcommand{\K}{{\mathcal K}}
+
+
+
+\setlength{\oddsidemargin}{-.25 truein}
+\setlength{\evensidemargin}{0truein}
+\setlength{\topmargin}{-0.2truein}
+\setlength{\textwidth}{7 truein}
+\setlength{\textheight}{8.5 truein}
+\setlength{\parindent}{0.20truein}
+\setlength{\parskip}{0.10truein}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\pagestyle{fancy}
+\lhead{}
+\chead{The {\tt irlba} Package}
+\rhead{}
+\lfoot{}
+\cfoot{}
+\rfoot{\thepage}
+\renewcommand{\headrulewidth}{1pt}
+\renewcommand{\footrulewidth}{1pt}
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\title{The {\tt irlba} Package}
+\author{Bryan W. Lewis \\ 
+blewis at illposed.net,
+\\[6pt]
+adapted from the work of:\\
+Jim Baglama (University of Rhode Island)\\
+and Lothar Reichel (Kent State University).
+}
+
+\begin{document}
+
+\maketitle
+
+\thispagestyle{empty}
+
+\section{Introduction}
+
+The {\tt irlba} package provides a fast way to compute partial singular value
+decompositions (SVD) of large sparse or dense matrices. Recent additions to the
+package can also compute fast partial symmetric eigenvalue decompositions and
+principal components. The package is an R implementation of the {\it augmented
+implicitly restarted Lanczos bidiagonalization algorithm} of Jim Baglama and
+Lothar Reichel\footnote{Augmented Implicitly Restarted Lanczos
+Bidiagonalization Methods, J. Baglama and L. Reichel, SIAM J. Sci. Comput.
+2005.}.  Source code is maintained at
+\href{https://github.com/bwlewis/irlba}{https://github.com/bwlewis/irlba}.
+
+The {\tt irlba} package works with real- and complex-valued dense R matrices
+and real-valued sparse matrices from the {\tt Matrix} package. It provides
+several easy ways to define custom matrix arithmetic that works with other
+matrix classes including {\tt big.matrix} from the {\tt bigmemory} package and
+others.  The {\tt irlba} is both faster and more memory efficient than the
+usual R {\tt svd} function for computing a few of the largest singular vectors
+and corresponding singular values of a matrix, and advantage of available
+high-performance linear algebra libraries if R is compiled to use them. In
+particular, the package uses the same BLAS and LAPACK libraries that R uses
+(see
+\href{https://cran.r-project.org/doc/manuals/R-admin.html#BLAS}{https://cran.r-project.org/doc/manuals/R-admin.html\#BLAS}),
+or the CHOLMOD library from R's Matrix package for sparse matrix problems.
+
+A whirlwind summary of the algorithm follows, along with a few basic examples.
+A much more detailed description and discussion of the algorithm may be found
+in the cited Baglama-Reichel reference.
+
+
+\section{Partial Singular Value Decomposition}
+
+Let $A\in\R^{\ell\times n}$ and assume $\ell\ge n$. These notes simplify the
+presentation by considering only real-valued matrices and assuming without
+losing generality that there are at least as many rows as columns (the
+method works more generally). A singular
+value decomposition of $A$ can be expressed as:
+
+\[
+A = \sum_{j=1}^n \sigma_j u_j v_j^T,
+\phantom{xxxxxxxx}
+v_j^Tv_k = u_j^Tu_k = 
+\left\{
+\begin{array}{ll}
+1 & \mbox{if}\phantom{x}  j=k,\\
+0 & \mbox{o.w.,}\\
+\end{array}
+\right.
+\]
+where $u_j\in\R^\ell $, $v_j\in\R^n $,
+$j=1,2,\ldots, n$, and
+$ \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n \ge 0 $.
+
+Let $1 \le k<n$. A rank $k$ partial SVD of $A$ is defined as:
+\begin{eqnarray*}
+A_k &:=& \sum_{j=1}^k \sigma_j u_j v_j^T.\\
+\end{eqnarray*}
+
+
+
+The following simple example shows how to use {\tt irlba} to compute the five
+largest singular values and corresponding singular vectors of a
+$5000\times5000$ matrix. We compare to the usual R {\tt svd} function and
+report timings for our test machine, a 4-CPU core, 3.0\, GHz AMD  A10-7850K
+personal computer with 16\,GB RAM, using R version 3.3.1 using the high
+performance AMD ACML core math library BLAS and LAPACK.
+\lstset{columns=flexible, basicstyle={\ttfamily\slshape}}
+\begin{lstlisting}
+> library('irlba')
+> set.seed(1)
+> A <- matrix(rnorm(5000*5000), 5000)
+> t1 <- proc.time()
+> L <- irlba(A, 5)
+> print(proc.time() - t1)
+   user  system elapsed
+ 17.440   0.192   4.417
+> gc()
+           used  (Mb) gc trigger  (Mb) max used  (Mb)
+Ncells  1096734  58.6    1770749  94.6  1442291  77.1
+Vcells 26685618 203.6   62229965 474.8 52110704 397.6
+
+\end{lstlisting}
+Compare with the standard {\tt svd} function:\newpage
+\begin{lstlisting}
+> t1 <- proc.time()
+> S <- svd(A, nu=5, nv=5)
+> print(proc.time() - t1)
+   user  system elapsed
+277.092  11.552  74.425
+> gc()
+           used  (Mb) gc trigger   (Mb)  max used   (Mb)
+Ncells  1097441  58.7    1770749   94.6   1442291   77.1
+Vcells 26741910 204.1  169891972 1296.2 176827295 1349.1
+\end{lstlisting}
+The {\tt irlba} method uses about 1/20 elapsed time as the
+{\tt svd} method in this example and less than one third the peak memory.
+The defalut tolerance value yields the following relative error
+in the estimated singular values:
+\begin{lstlisting}
+> sqrt (crossprod(S$d[1:5]-L$d)/crossprod(S$d[1:5]))
+            [,1]
+[1,] 4.352641e-10
+\end{lstlisting}
+
+\subsection{Convergence tolerance}
+
+IRLBA is an iterative method that estimates a few largest singular values
+and associated singular vectors. A sketch of the algorithm is outlined
+in Section \ref{sketch} below. The R {\tt tol} argument controls when the algorithm converges.
+Convergence occurs when
+\[
+\|AV_k - US_k\| < \mbox{tol} \cdot \|A\|,
+\]
+where $\|\cdot\|$ means spectral matrix norm, $A$ is the matrix, $V_k$ and $U_k$
+are the {\it estimated} right and left $k$ singular vectors computed by the
+algorithm, and $\|A\|$ is the {\it estimated} spectral norm of the matrix defined
+by the largest singular value computed by the algorithm. Using R notation,
+the algorithm stops when
+\begin{lstlisting}
+L <- irlba(A, k, tol)
+svd(A %*% L$v - L$u %*% diag(L$d))$d[1] < tol * L$d[1]
+\end{lstlisting}
+It's possible, but unlikely, to encounter problems that fail to converge before
+the maximum number of algorithm iterations specified by the {\tt maxit}
+argument.
+
+
+\subsection{Differences with {\tt svd}}
+The {\tt irlba} function is designed to compute a {\it partial} singular
+value decomposition. It is largely compatible with the usual R {\tt svd}
+function but there are some differences. In particular:
+\begin{enumerate}
+\item The {\tt irlba} function only computes the number of singular values
+corresponding to the maximum of the desired singular vectors,
+{\tt max(nu, nv)}. For example, if 5
+singular vectors are desired ({\tt nu=nv=5}), then only the five corresponding
+singular values are computed. The standard R {\tt svd} function always
+returns the {\it total} set of singular values for the matrix, regardless of how
+many singular vectors are specified.
+\item The {\tt irlba} function is an iterative method that continues until
+either a tolerance or maximum number of iterations is reached.
+Problems with difficult convergence properties are not likely to be
+encountered, but the method will fail with an error after the iteration limit
+is reached in those cases.
+\end{enumerate}
+Watch out especially for the first difference noted above!
+
+
+\subsection{Principal Components}
+
+Version 2.1.0 of the package introduces optional arguments and {\tt prcomp}-like
+function syntax for efficiently computing partial SVDs of matrices after
+centering and scaling their columns and other adjustments.
+Use the following arguments to the {\tt irlba} function, or the new
+{\tt irlba\_prcomp} function for PCA:
+\begin{itemize}
+\item {\tt center}: if {\tt center} is a numeric vector with length equal to
+     the number of columns of the matrix, then each column of the matrix has the
+     corresponding value from {\tt center} subtracted from it.
+\item {\tt scale}: if 'scale' is a numeric vector with length
+     equal to the number of columns of the matrix, then each column is
+     divided by the corresponding value from {\tt scale}.
+\end{itemize}
+Both centering and scaling options are performed implicitly in the algorithm
+and, for instance, do not affect sparsity of the input matrix or increase
+storage requirements.
+The following
+example compares the output of the usual {\tt prcomp} function with
+output from {\tt irlba}.
+Note that in general, singular vectors and principal component vectors
+are only unique up to sign!
+
+\begin{lstlisting}
+>      set.seed(1)
+>      x  <- matrix(rnorm(200), nrow=20)
+>      p1 <- prcomp_irlba(x, n=3)
+>      summary(p1)
+Importance of components:
+                         PC1    PC2    PC3
+Standard deviation     1.541 1.2513 1.1916
+Proportion of Variance 0.443 0.2921 0.2649
+Cumulative Proportion  0.443 0.7351 1.0000
+
+>      # Compare with
+>      p2 <- prcomp(x, tol=0.7)
+>      summary(p2)
+Importance of components:
+                         PC1    PC2    PC3
+Standard deviation     1.541 1.2513 1.1916
+Proportion of Variance 0.443 0.2921 0.2649
+Cumulative Proportion  0.443 0.7351 1.0000
+\end{lstlisting}
+Alternatively, you can compute principal components directly using the
+singular value decomposition and the {\tt center} option:
+\begin{lstlisting}
+> p3 <- svd(scale(x, center=colMeans(x), scale=FALSE))
+> p4 <- irlba(x, 3, center=colMeans(x))
+
+> # compare with prcomp
+> sqrt(crossprod(p1$rotation[,1] - p3$v[,1]))
+             [,1]
+[1,] 9.773228e-13
+> sqrt(crossprod(p1$rotation[,1] + p4$v[,1]))
+             [,1]
+[1,] 1.652423e-12
+\end{lstlisting}
+The implementation of the {\tt center} function argument takes advantage of
+computational efficiencies in the IRLB algorithm that result in a modest
+savings of a few vector inner products per iteration compared to a naive
+implementation using a custom matrix vector product to center the matrix.
+
+\subsection{Truncated symmetric eigenvalue decomposition}
+
+Use the {\tt partial\_eigen} function to estimate a subset of the largest (most
+positive) eigenvalues and corresponding eigenvectors of a symmetric dense or
+sparse real-valued matrix. The function is particularly well-suited to
+estimating the largest eigenvalues and corresponding eigenvectors of symmetric
+positive semi-definite matrices of the form $A^T A$.
+
+\subsection{User-Defined Matrix Multiplication}
+
+The {\tt irlba} function includes options for specifying a custom matrix
+multiplication function. Custom matrix multiplication functions can be used,
+for example, with the {\tt big.matrix} class from the {\tt bigmemory}/{\tt
+bigalgebra} packages, or to compute the partial SVD of matrix-free linear
+operators.
+
+User-defined matrix operations may specified using the optional {\tt mult}
+parameter. If defined, it must be a function of two arguments that computes
+matrix vector products. Either argument can be a vector, and the {\tt mult}
+function must deal with that. The following example illustrates a simple
+custom matrix function that scales the columns of the matrix, and then
+compares it with other ways of doing the same thing.
+\begin{lstlisting}
+> set.seed(1)
+> A <- matrix(runif(200), nrow=20)
+> col_scale <- 1:10
+> mult <- function(x,y)
++         {
++           # check if x is a plain vector
++           if(is.vector(x))
++           {
++             return((x %*% y)/col_scale)
++           }
++           # else x is the matrix
++           x %*% (y/col_scale)
++         }
+
+> irlba(A, 3, mult=mult)$d
+[1] 3.1384503 0.9477628 0.4313855
+
+> # Compare with:
+> irlba(A, 3, scale=col_scale)$d
+[1] 3.1384503 0.9477628 0.4313855
+
+> # Compare with:
+> svd(scale(A, scale=col_scale, center=FALSE))$d[1:3]
+[1] 3.1384503 0.9477628 0.4313855
+\end{lstlisting}
+Alternatively, simply use R's operator overloading methods to define
+customized matrix vector products. This is the approach taken in
+the
+\\
+\href{https://bwlewis.github.io/1000_genomes_examples/PCA_whole_genome.html}{http://bwlewis.github.io/1000\_genomes\_examples/PCA\_whole\_genome.html}
+vignette to work with large out of core genomics data.
+I prefer the simple operator overloading approach and may consider
+retiring the {\tt mult} function argument in some future package version.
+
+NOTE! When a user-defined matrix multiplication function is used, either using
+the {\tt mult} argument or through operator overloading, the R reference IRLBA
+implementation is used instead of the faster C code path in newer package
+versions.
+
+
+\section{A Quick Summary of the IRLBA Method}\label{sketch}
+\subsection{Partial Lanczos Bidiagonalization}
+
+Start with a given vector $p_1$. Compute $m$ steps of the Lanczos process:
+
+\begin{eqnarray*}
+A P_m &=& Q_m B_m \\
+A^T Q_m &=& P_m B_m^T + r_m e_m^T,\\
+\end{eqnarray*}
+
+$B_m\in\R^{m\times m}, P_m \in \R^{n\times m}, $ 
+$Q_m \in \R^{\ell \times m},$ 
+
+$P_m^TP_m=Q_m^TQ_m=I_m, $ 
+
+$r_m\in\R^n,  P_m^Tr_m=0,$
+
+$P_m = [p_1, p_2, \ldots, p_m]$.
+
+\subsection{Approximating Partial SVD with A Partial Lanczos bidiagonalization}
+\begin{eqnarray*}
+A^TA P_m &=& A^TQ_m B_m \\
+         &=& P_m {\color{blue}{B_m^TB_m}} + r_m e_m^TB_m,\\
+\end{eqnarray*}
+\begin{eqnarray*}
+AA^T Q_m &=& AP_m B_m^T + Ar_m e_m^T,\\
+&=& Q_m{\color{blue}{B_mB_m^T}} + Ar_me_m^T.
+\end{eqnarray*}
+
+Compute the SVD of $B_m$:
+\[
+B_m = \sum_{j=1}^m\sigma^B_ju^B_j\left(v_j^B\right)^T.
+\] 
+\\[6pt]
+\[
+\left(\mbox{i.e., }  B_mv_j^B = \sigma_j^Bu_j^B,  \mbox{ and }  
+B_m^Tu_j^b = \sigma_j^Bv_j^B.\right)
+\]
+
+Define:
+$
+\tilde{\sigma_j} := \sigma_j^B, \phantom{xxx}
+\tilde{u}_j := Q_m u_j^B, \phantom{xxx}
+\tilde{v}_j := P_m v_j^B.
+$
+
+Then:
+\begin{eqnarray*}
+A\tilde{v}_j &=& AP_mv_j^B \\
+&=& Q_mB_mv_j^B \\
+&=& \sigma^B_jQ_mu_j^B \\
+&=& \tilde{\sigma}_j \tilde{u}_j,
+\end{eqnarray*}
+and
+\begin{eqnarray*}
+A^T\tilde{u}_j &=& A^TQ_mu_j^B \\
+&=& P_mB^T_mu_j^B + r_me_m^Tu_j^B \\
+&=& \sigma^B_jP_mv_j^B + r_me_m^Tu_j^B\\
+&=& \tilde{\sigma}_j \tilde{v}_j + {\color{red} {r_me_m^Tu_j^B}}.
+\end{eqnarray*}
+The part in red above represents the error with respect to the exact SVD.
+The IRLBA strategy is to iteratively reduce the norm of that error term
+by augmenting and restarting.
+
+Here is the overall method:
+\begin{enumerate}
+\item Compute the Lanczos process up to step $m$.
+\item Compute $k<m$ approximate singular vectors.
+\item Orthogonalize against the approximate singular vectors to get a new 
+      starting vector.
+\item Continue the Lanczos process with the new starting vector 
+      for $m$ more steps.
+\item Check for convergence tolerance and exit if met.
+\item GOTO 1.
+\end{enumerate}
+
+
+\subsection{Sketch of the augmented process...}
+\begin{eqnarray*}
+\bar{P}_{k+1} &:=& [\tilde{v}_1, \tilde{v}_2, \ldots, \tilde{v}_k, p_{m+1}],\\
+A\bar{P}_{k+1} &=& [\tilde{\sigma}_1\tilde{u}_1, \tilde{\sigma}_1\tilde{u}_2, \ldots, \tilde{\sigma}_k\tilde{u}_k, Ap_{m+1}]
+\end{eqnarray*}
+
+Orthogonalize $Ap_{m+1}$ against $\{\tilde{u}_j\}_{j=1}^k$:   
+$
+Ap_{m+1} = \sum_{j=1}^k {\color{blue}{\rho_j}}\tilde{u}_j + {\color{blue}{r_k}}.
+$
+\begin{eqnarray*}
+\bar{Q}_{k+1} &:=& [\tilde{u}_1, \tilde{u}_2, \ldots, \tilde{u}_k, 
+{\color{blue}{r_k}}/\|{\color{blue}{r_k}}\|],\\
+\bar{B}_{k+1} &:=&
+\left[
+\begin{array}{ccccc}
+\tilde{\sigma}_1 & & & \brho_1 \\
+& \tilde{\sigma}_2 & & \brho_2 \\
+& & \ddots & \brho_k \\
+& & & \|\color{blue}{r_k}\|
+\end{array}
+\right].
+\end{eqnarray*}
+\[
+A\bar{P}_{k+1} = \bar{Q}_{k+1}\bar{B}_{k+1}.
+\]
+
+
+
+
+\end{document}

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/r-cran-irlba.git