[Pkg-exppsy-pynifti] [Nipy-devel] Example data - a proposal

Gael Varoquaux gael.varoquaux at normalesup.org
Sun Jul 12 07:55:12 UTC 2009


On Sat, Jul 11, 2009 at 11:28:44AM -0700, Matthew Brett wrote:
> As we're working on stuff, the problem of example data keeps coming
> up.  We often need example data for

> 1) tests
> 2) examples

> Of course we're often using images and these can be rather large.

> The options are:

> [...]

I am not going to comment or give my opinion on Matthew's proposal, as
I was next to him when he wrote it, and we discussed it together.

I would like to add a few data points ( :o ) and ask a few more
questions.

* First data point: we are currently downloading data automatically from
  the Internet when nipy is imported (or built, or installed), and sticking
  it in the user's home folder. Refusing to download this data causes the
  import (or the build, or the install) to fail. This data is a 64MB
  download from the net, and it takes up 127MB on the user's disk, half
  of which is the downloaded tarball, which is not deleted after the
  download :). And we are not using most of this data (grep for datapjoin
  in the code base) :).

* Second data point: many examples and tests really need data. In the
  codebase (mostly neurospin) I see code that wants to load files from
  the user's home directory, or from a temporary directory.

So, it seems that there is a strong need for data, and that we need to
rationalize this need.

We would like to slowly build a set of files that pleases most people, and
to include them in a data package. Therefore, the questions are:

* What are your basic needs in terms of data for testing and demos that
  would cover 80% of your use cases?

* What are the unusual datasets that you need to cover the remaining 20%
  (the complete FIAC dataset would, I hope, fall in this second set :->)?

* Finally, which files can you not live without? By this I mean: which 
  files are necessary for your basic algorithms to function?

We would like to hear from everybody, in order to rationalize and decide
what falls in the a, b, c list that Matthew had in his previous e-mail.

Gaël


