[pkg-bioc] i386 builds of CRAN

Dirk Eddelbuettel edd@debian.org
Wed, 8 Jun 2005 22:29:38 -0500


On 8 June 2005 at 20:49, Dirk Eddelbuettel wrote:
| 
| On 8 June 2005 at 13:28, Steffen Moeller wrote:
| | Dirk Eddelbuettel wrote:
| | 
| | >On 8 June 2005 at 11:26, Steffen Moeller wrote:
| | >| 
| | >| Hi Rafael, this is good news, many thanks!
| | >| 
| | >| I just completed a compilation of BioConductor with cran2deb ... not
| | >| bioc2deb. It is certainly too early to announce things, nevertheless,
| | >
| | >Sounds intriguing, and you did it without using repostools -- i.e. by reading
| | >all the BioC package description from the directory as for CRAN?
| | >  
| | >
| | Yip, I did not find the bioc.Rdata you were opening....
| 
| Uhhh, my memory is getting foggy on that, but I think the bioc.Rdata was just
| a local cache so that repostools didn't have to go out and hit the server for
| every test run I was doing.

Uh-uh, sorry about that. For completeness: I seem to have three files,
'get_bioc.sh', 'update_bioc.sh' and 'update_bioc.R'.  Here is the R one, which
is the most recent and longest of the three:


## for CRAN:
##   CRAN <- getReposEntry(paste(getReposOption()[1] ))
##   download.packages2(repdataframe(CRAN)$Package[1:4], CRAN, destDir="/tmp", type="Source")
## i.e. doesn't need the paste'd Source/

## variant of get_bioc.sh with doDownload=FALSE so that it doesn't blindly download

stopifnot(require(reposTools, quietly=TRUE))

doBioC <- function(URL, doDownload=TRUE, path="bioC", doUpdate=TRUE) {
  ## this returns an S4 object with loads of info
  repoEntries <- getReposEntry(URL)
  ## this function is in reposTools too and does the actual download
  if (doDownload) {
    download.packages2(repdataframe(repoEntries)$Package,
                       repoEntries,
                       destDir=path,    # honour the 'path' argument
                       type="Source")
  }
  ## Doesn't work as confirmed by Robert and Jeff -- S4 buglet
  #write.table(repdataframe(repoEntries), 
  #                         file="bioC/reposStatus.csv", sep=",", 
  #            row.names=FALSE, eol="_EOL_", qmethod="double")

  ## create a dataframe
  bioCdf <- repdataframe(repoEntries)
  
  if (doUpdate) {
    ## check if we have it already
    n <- nrow(bioCdf)
    getFiles <- rep(FALSE, n)
    for (i in seq_len(n)) {           # seq_len() is safe when n is 0
      filename <- paste(bioCdf[i,1], "_", bioCdf[i,2][[1]], ".tar.gz", sep="")
      ## file.access() returns 0 if the file exists and -1 otherwise
      if (file.access(paste(path, filename, sep="/")) != 0) {
        #cat("--> need ", filename, "\n")
        getFiles[i] <- TRUE
      }
    }
    #print(bioCdf[getFiles,1])
    download.packages2(bioCdf[getFiles,1], repoEntries, path, type="Source")
  }
  
  # return the repository object as a dataframe
  invisible(bioCdf)
}

doCRAN <- function(URL, path="contrib/main", doUpdate=TRUE) {

  ## this returns an S4 object with loads of info
  repoEntries <- getReposEntry(URL)
  crandf <- repdataframe(repoEntries) 

  if (doUpdate) {
    ## check if we have it already
    n <- nrow(crandf)
    getFiles <- rep(FALSE, n)
    for (i in seq_len(n)) {           # seq_len() is safe when n is 0
      filename <- paste(crandf[i,1], "_", crandf[i,2][[1]], ".tar.gz", sep="")
      ## file.access() returns 0 if the file exists and -1 otherwise
      if (file.access(paste(path, filename, sep="/")) != 0) {
        #cat("--> need ", filename, "\n")
        getFiles[i] <- TRUE
      }
    }
    #print(crandf[getFiles,1])
    download.packages2(crandf[getFiles,1], repoEntries, path, type="Source")
  }
  
  # return the repository object as a dataframe
  invisible(crandf)
}


## do not download by default
doDownload <- FALSE
doUpdate <- TRUE

## combine the source and data repo infos into one data.frame
bioCdf <- 
    ## position 2 is the current bioC release
    ## and source downloads is what we're after
    rbind(doBioC(paste(getReposOption()[2], "Source", sep="/"), doDownload, "bioC", doUpdate),
          ## position 4 is the current bioC data and metadata
          doBioC(getReposOption()[4], doDownload, "bioC", doUpdate))

## also need CRAN information
#cranURL <- getReposOption()[1]
#cranRepoEntries <- getReposEntry(cranURL)
#crandf <- repdataframe(cranRepoEntries) 

crandf <- doCRAN(getReposOption()[1], "contrib/main", doUpdate)

## cache the metainfo in a file
save(bioCdf, crandf, file="bioC.Rdata", compress=TRUE)






So it really just uses a few lines of R to get all the package info, including
that for CRAN.  As I recall, for an initial download you want doDownload <- TRUE.
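
To read the cached metadata back in later without hitting the server again,
something along these lines should do. This is only a sketch: the toy data
frames, their column names and the temporary file location are made up for
illustration, and it uses nothing beyond base R's save() and load().

```r
## Toy stand-ins for the real data frames (package names, versions and
## column names are invented for this example)
bioCdf <- data.frame(Package=c("pkgA", "pkgB"), Version=c("1.0.0", "2.1.0"),
                     stringsAsFactors=FALSE)
crandf <- data.frame(Package="pkgC", Version="0.3.0", stringsAsFactors=FALSE)

## hypothetical cache location; the script above writes to "bioC.Rdata"
cacheFile <- file.path(tempdir(), "bioC.Rdata")
save(bioCdf, crandf, file=cacheFile, compress=TRUE)

## later on: load into a separate environment so that nothing in the
## workspace is overwritten by accident
cache <- new.env()
loaded <- load(cacheFile, envir=cache)   # returns the names of the objects
print(loaded)
print(nrow(cache$bioCdf))
```

Loading into its own environment is just a precaution; a plain
load("bioC.Rdata") works too if clobbering bioCdf and crandf is fine.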

Afterwards, this can be re-run and, with doUpdate set, it fetches only the
tarballs that are not yet in the local directory, i.e. whatever changed on the
mirror. I think that was the idea, at least.
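
The "what do we still need" check can be tried out on its own, without
reposTools. Below is a minimal sketch under assumptions: missingTarballs() is
a hypothetical helper, the Package/Version column names are assumed, and the
toy packages and the temporary directory are invented. It uses file.exists()
in place of the file.access() test in the script.

```r
## Return the tarball names implied by 'df' that are not present in 'path'.
## 'df' is assumed to have Package and Version columns.
missingTarballs <- function(df, path) {
  filenames <- paste(df$Package, "_", df$Version, ".tar.gz", sep="")
  filenames[!file.exists(file.path(path, filenames))]
}

## fake local mirror directory holding one of the two tarballs
mirror <- file.path(tempdir(), "mirror-demo")
dir.create(mirror, showWarnings=FALSE)
toy <- data.frame(Package=c("pkgA", "pkgB"), Version=c("1.0.0", "2.1.0"),
                  stringsAsFactors=FALSE)
file.create(file.path(mirror, "pkgA_1.0.0.tar.gz"))

need <- missingTarballs(toy, mirror)
print(need)
```

Note that a new upstream version gives a new file name, so it shows up as
missing and gets fetched, while the old tarball simply stays where it is.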

Let me know if this is of interest. We could weave it back into cran2deb, or
we could keep cran2deb as it is...

Dirk


-- 
Statistics: The (futile) attempt to offer certainty about uncertainty.
         -- Roger Koenker, 'Dictionary of Received Ideas of Statistics'