[Pkg-exppsy-pynifti] [Nipy-devel] Example data - a proposal

Christopher Burns cburns at berkeley.edu
Mon Jul 13 17:42:56 UTC 2009


On Sat, Jul 11, 2009 at 11:28 AM, Matthew Brett<matthew.brett at gmail.com> wrote:
> As we're working on stuff, the problem of example data keeps coming
> up.  We often need example data for
>
> 1) tests
> 2) examples

I think we need to handle these two cases separately.
1) Data used for tests:  Set of small files, less than 100K, committed
to the source repository.
2) Data used for examples:  normal data sets that can be run through
an entire processing stream.
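
To keep case 1 honest, something like the following could flag test
files that have grown past the limit.  This is only a sketch: the name
oversized_files is made up for illustration, and reading "100K" as
100 * 1024 bytes is my assumption.

```python
import os

# Assumption: "less than 100K" is read here as under 100 * 1024 bytes.
SIZE_LIMIT = 100 * 1024


def oversized_files(directory, limit=SIZE_LIMIT):
    """Return sorted relative paths of files at or above the size limit."""
    too_big = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getsize(path) >= limit:
                too_big.append(os.path.relpath(path, directory))
    return sorted(too_big)
```

An empty list would mean every file under the directory is small enough
to commit.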

#1 was the original intention of the functional and anatomical files
in:  <nipy>/testing/

These should be replaced with a matching set of sub-sampled images.
Jonathan and I hacked those together in a hurry last year at a sprint.

But it's important that the test suite be fast and lean, otherwise
it's a burden to run and as a result gets run less often.

There is also Debian packaging to consider.  There are two problems
with our current test data in regard to Debian packaging.
1) We require an active network connection to download the data.  Not
all of the test machines have active networks.
2) We store the data in $HOME.  Not all test machines have a $HOME.

If the tests ran on a couple of small test files committed to the
repository, both problems would be solved.
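
As a rough sketch of what I mean, the tests could resolve their files
relative to the source tree, with no download and no $HOME.  The helper
name data_path and the "data" directory are hypothetical, not anything
that exists in nipy today.

```python
import os


def data_path(filename, base_dir=None):
    """Return the absolute path of a test file shipped in the source tree.

    Hypothetical layout: small test files live in a "data" directory
    next to this module.  No network access or $HOME is involved.
    """
    if base_dir is None:
        base_dir = os.path.dirname(os.path.abspath(__file__))
    path = os.path.join(base_dir, "data", filename)
    if not os.path.isfile(path):
        raise IOError("missing test data file: %s" % path)
    return path
```

A test would then call data_path with the file name it needs and fail
loudly if the file was never committed.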

Currently, nipy.test() is a memory hog and takes too long.

As a comparison, below is the output from top.  The first entry is
nipy.test(), the second is numpy.test().

  PID USER      VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5918 cburns    539m 114m 9508 S  0.0  0.2   0:57.43 ipython
 6074 cburns    242m  31m 6632 S  0.0  0.1   0:04.76 ipython

And it takes a while to run our tests.  Below are the results of
running the tests on our cluster:

numpy.test()
Ran 2027 tests in 4.739s
OK (KNOWNFAIL=1, SKIP=2)

nipy.test()
Ran 1869 tests in 58.990s
FAILED (SKIP=1, errors=13, failures=3)


The examples can rely on a larger dataset, which can be packaged
independently (option C), but the larger dataset would not be part of
the test suite, and therefore is not required to run the tests.
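
For the examples, a small lookup like the one below would let a script
find the separately packaged data, or report that it is not installed,
without the test suite ever touching it.  Treat this as a sketch only:
the NIPY_EXAMPLE_DATA environment variable and the function name are
invented for illustration.

```python
import os


def find_example_data(environ=None):
    """Return the example data directory, or None if it is not installed.

    Assumption: the location is given by a hypothetical
    NIPY_EXAMPLE_DATA environment variable.
    """
    if environ is None:
        environ = os.environ
    data_dir = environ.get("NIPY_EXAMPLE_DATA", "")
    if data_dir and os.path.isdir(data_dir):
        return data_dir
    return None
```

An example script could call this first and exit with a short message
when it returns None.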

I'm happy to assist in moving to this sort of a split if folks decide
this is the right way to go.

Chris


