[pymvpa] Appending datasets

Roberto Guidotti robbenson18 at gmail.com
Wed Oct 1 09:52:34 UTC 2014


Hi,

have you tried using the vstack function implemented in PyMVPA?
http://www.pymvpa.org/generated/mvpa2.base.dataset.vstack.html#mvpa2.base.dataset.vstack
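The giant flat array you got from np.append is plain numpy behaviour, by the
way: without an axis argument, np.append ravels both inputs into a 1-D array,
so you end up with 2 x 290 x n_features elements (580 x 29242 = 16960360 in
your case), and none of the targets or sample attributes come along. vstack
stacks the samples along the first axis and stacks the sample attributes at
the same time.

A minimal sketch (assuming ds_1.ds and ds_2.ds share the same feature space,
i.e. the same mask, and that they carry the usual chunks sample attribute
with runs labelled 0-3 within each day; adjust the shift to your actual
labels):

from mvpa2.suite import vstack, NFoldPartitioner

# glue the two session datasets together sample-wise; targets and
# the other sample attributes are stacked along with the samples
combined = vstack((ds_1.ds, ds_2.ds))

# make run labels unique across days so the partitioner sees 8
# distinct chunks (shift day-2 runs from 0..3 to 4..7)
combined.sa.chunks[ds_1.ds.nsamples:] += 4

# leave-two-runs-out cross-validation over the combined 8 runs
partitioner = NFoldPartitioner(cvtype=2)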

Roberto

On 1 October 2014 05:15, Shane Hoversten <shanusmagnus at gmail.com> wrote:

> Hi -
>
> I have some code that creates a dataset from a bunch of ornate processing
> (figuring out which volumes to censor based on subject performance, subject
> motion, and other scanning params; creating aggregate event types for
> certain events; etc.)  Dataset creation has, to this point, been done
> per-session: subjects were scanned twice, each session is processed
> separately, and all that ornate processing differs from session to
> session.
>
> Now I want to aggregate these two sessions and do MVPA things after
> throwing all the data into the hopper.  Specifically, instead of using
> NFoldPartitioner on a day's worth of runs (there are 4 runs per day) and
> leaving one run out, I want to run it on the combined runs from both days
> (8 runs) and leave two out.
>
> PyMVPA is awesome and so it would be clear how to do this if all the data
> were aggregated together; but as I mentioned, to get the dataset in the
> right format I have to do a bunch of processing, and it would be a pain to
> combine all the various files to make this into one single aggregate set.
> What I'd rather do is just glue together two separate datasets, which have
> already been processed in the ways they require, such that the new dataset
> just has the samples, targets, and associated attributes from the second
> session's dataset glued onto the first session's dataset.
>
> The ds.samples attribute reports as a numpy.ndarray, so I figured I could
> just stuff them together with array operations, for instance:
>
> combined_ds = np.append(ds_1.samples, ds_2.samples)
>
> and so on for the targets, sample attributes, etc.  But nope, this gets
> screwed up immediately:
>
> In [57]: ds_1 = m.MVPAMaster("tp101", 1, "dc",
>          "new_temporal_tp101_day1.nii")
>
> In [58]: len(ds_1.ds)
> Out[58]: 290
>
> In [59]: ds_2 = m.MVPAMaster("tp101", 2, "dc",
>          "new_temporal_tp101_day2.nii")
>
> In [60]: len(ds_2.ds)
> Out[60]: 290
>
> In [64]: combined = np.append(ds_1.ds.samples, ds_2.ds.samples)
>
> In [65]: len(combined)
> Out[65]: 16960360
>
> I'm thinking this is a mapper unrolling everything behind the scenes,
> maybe?  I could beat my head against this for a while, but I figured first
> I'd ask and see if there's a straightforward method for extending a dataset
> in this fashion?
>
> Thanks,
>
> Shane
>