[pymvpa] advice for constructing a dataset for use in pyMVPA

Scott Gorlin gorlins at MIT.EDU
Thu Feb 12 14:10:31 UTC 2009


Are you doing cross-validation?  If so, using subjects as chunks 
would mean you train the classifier on one set of subjects and use it 
to classify the one(s) left out.  Typically you don't want to do this, 
so I would say keep each subject as a completely separate dataset and 
use runs as chunks, then decide how you want to compare the output 
across subjects (e.g. a t-test, or what have you...).
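The runs-as-chunks scheme boils down to leave-one-run-out cross-validation within each subject. Here is a minimal sketch of that logic with plain numpy and a toy nearest-mean classifier — not pyMVPA's API, and all data shapes are made up for illustration:

```python
import numpy as np

# Toy data for one subject: 12 block samples x 5 voxels,
# two labels and three runs -- all fabricated for illustration.
rng = np.random.RandomState(0)
samples = rng.randn(12, 5)
labels = np.array([0, 1] * 6)
chunks = np.repeat([0, 1, 2], 4)          # run index per sample = the "chunk"

def nearest_mean_predict(train_X, train_y, test_X):
    """Assign each test sample to the class with the closest training mean."""
    means = {c: train_X[train_y == c].mean(axis=0) for c in np.unique(train_y)}
    classes = sorted(means)
    dists = np.array([[np.linalg.norm(x - means[c]) for c in classes]
                      for x in test_X])
    return np.array(classes)[dists.argmin(axis=1)]

# Leave-one-run-out: each run is held out once, training on the others.
accuracies = []
for run in np.unique(chunks):
    train, test = chunks != run, chunks == run
    pred = nearest_mean_predict(samples[train], labels[train], samples[test])
    accuracies.append((pred == labels[test]).mean())

print(np.mean(accuracies))
```

Replacing `chunks` with a subject index instead of a run index is exactly the across-subject generalization test you usually don't want.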

I don't think there's a need to convert everything into a 4D image, 
since you can build up the dataset incrementally in Python just by 
adding two datasets (so, e.g., you can create a single dataset per 
block, using the data and labels for that block).  It may be more 
convenient to build it up in 4D first, but at least for me that takes 
up way too much disk space (x R runs x S sessions x N subjects...)
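The incremental approach looks like this in plain numpy terms (pyMVPA's dataset addition concatenates samples in much the same spirit; the block counts and voxel count below are invented for the sketch):

```python
import numpy as np

# Build the dataset one block at a time instead of first merging
# everything into one huge 4D file on disk.
all_samples, all_labels, all_chunks = [], [], []
for run in range(3):                       # 3 runs for one subject
    for label in ['color', 'conf'] * 2:    # 4 blocks per run (made up)
        data = np.random.randn(1, 100)     # one averaged volume per block
        all_samples.append(data)
        all_labels.append(label)
        all_chunks.append(run)             # run index serves as the chunk

samples = np.vstack(all_samples)           # (n_blocks, n_voxels)
labels = np.array(all_labels)
chunks = np.array(all_chunks)
print(samples.shape)                       # -> (12, 100)
```

Only the per-block images ever need to exist on disk; the concatenated array lives in memory.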

Another trick you can do is just set the labels to incremental integers 
(0..N-1) representing their position in the sequence.  Then at any time 
you can swap them out, e.g. Dataset.labels = 
labelsequence[Dataset.labels], if you want to try different 
classifications, or simply dataset.selectSamplesByLabel() if you have 
several independent binary problems.
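The swap is just numpy fancy indexing, so it can be shown without pyMVPA at all (the label names here are invented):

```python
import numpy as np

# Store the *position in the block sequence* as the label once...
position_labels = np.array([0, 1, 2, 3, 0, 1, 2, 3])   # block order, 2 runs

# ...then remap to any real labeling later via fancy indexing --
# the Dataset.labels = labelsequence[Dataset.labels] trick.
labelsequence = np.array(['color', 'conf', 'color', 'shape'])
labels = labelsequence[position_labels]
print(labels.tolist())
# -> ['color', 'conf', 'color', 'shape', 'color', 'conf', 'color', 'shape']

# Subsetting to one problem (cf. dataset.selectSamplesByLabel) is a
# boolean mask on the remapped labels:
color_positions = position_labels[labels == 'color']
```

The same integer labels can thus be remapped to "color", "conf", or any other attribute file without rebuilding the dataset.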

Typically what I do with anatomical (or functional) ROIs is build up the 
entire array using just the brain mask, load a 3D image with an integer 
value in each voxel designating a unique ROI, threshold it to any one 
ROI id, and use that as a feature mask directly on the full dataset 
(Dataset.selectFeaturesByMask(...), I think...)
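Conceptually, that ROI trick is a boolean mask over the feature (voxel) axis. A sketch with toy volumes, again avoiding any pyMVPA calls (shapes and ROI ids are invented):

```python
import numpy as np

# Whole-brain dataset: rows = samples, columns = voxels inside the brain mask.
brain_mask = np.zeros((4, 4, 4), dtype=bool)
brain_mask[1:3, 1:3, 1:3] = True                   # 8 "brain" voxels
samples = np.random.randn(10, brain_mask.sum())    # 10 samples x 8 voxels

# Atlas image: an integer ROI id at every voxel (0 = outside any ROI).
atlas = np.zeros((4, 4, 4), dtype=int)
atlas[1:3, 1:3, 1] = 1                             # ROI 1
atlas[1:3, 1:3, 2] = 2                             # ROI 2

# Threshold the atlas to one ROI id and apply it as a feature mask on
# the columns of the full dataset -- the selectFeaturesByMask idea.
roi_id = 2
feature_mask = atlas[brain_mask] == roi_id         # length = n_brain_voxels
roi_samples = samples[:, feature_mask]
print(roi_samples.shape)                           # -> (10, 4)
```

Since the full dataset is only built once (against the brain mask), switching ROIs is just switching the boolean mask.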

Apologies if any of the syntax above is incorrect; I'm rushing out the 
door...

-Scott

Jo Etzel wrote:
> I have been working with an fMRI data set, using R for classification 
> analysis. I’d like to try the pyMVPA package on the same data, so I want 
> to use my already-preprocessed (in SPM2) files.
>
> I have 15 subjects, each of whom did 3 runs. It was a block-design 
> experiment, and I’ve preprocessed the data to have one Analyze image per 
> block (for each subject & run). I also have anatomical masks, likewise 
> as Analyze images. For each block I have several text labels ("color", 
> "conf", etc.). I never need to classify on more than one text label at a 
> time (i.e. just "color", not "color" and "conf"), though I do need to 
> subset the data based on these labels prior to classification (i.e. 
> classify "conf" for certain "colors" only).
>
> I am trying to understand how to set up my data for pyMVPA, and would 
> appreciate your feedback on whether this is the correct strategy. Thank 
> you for your patience with this long post.
>
> In this case, I think that the "samples" are my blocks, and I have two 
> levels of "chunks" - runs and subjects. My "labels" are my block types 
> ("color", etc).
>
> Do I want all of my data in *one* NiftiDataset object or separate ones 
> for each subject?
>
> I think that the steps I need to perform to get my data converted for 
> pyMVPA are as follows:
>
> 1 - use fslmerge to convert my (one-for-each-block) analyze files into 
> one large 4D nifti.gz file, containing all the files for all subjects.
>
> 2 - make attributes_literal.txt files, one for each labeling I need (one 
> for "color", one for "conf", etc). These will be used for the labels 
> part of NiftiDataset, read by SampleAttributes. The labels in these 
> files need to be in the same order as my volumes in the nifti.gz files.
>
> 3 - define arrays to label my files by chunks. I think I will need a 2D 
> array: the first column giving the subject number and the second the 
> run, with the rows in the same order as my nifti.gz files.
>
> 4 - write python code to create my NiftiDataset object, using my Analyze 
> image (0 for voxels to exclude, >0 for voxels to include) as a "mask" if 
> I want to restrict my analysis to those voxels.
>
>
> Would you advise this strategy?
>
> Thank you so much for your help!
>
> Jo
>
> _______________________________________________
> Pkg-ExpPsy-PyMVPA mailing list
> Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
> http://lists.alioth.debian.org/mailman/listinfo/pkg-exppsy-pymvpa
>   
