[pymvpa] searchlight for data with different runs with different masks

Sat Jan 16 04:40:42 UTC 2016

BTW another way to handle imbalanced data (and perhaps easier to implement
and test) could be to assign class weights in libsvm. This has to be done
for each partition separately; any ideas on how this can be done?
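
One way to get per-partition weights is to recompute them from the training
labels of each fold. The sketch below is plain NumPy and makes no PyMVPA or
libsvm calls; `inverse_frequency_weights`, `targets`, and `chunks` are
hypothetical names and toy data used only for illustration. libsvm scales C
per class from such weights (its `-wi` option); how to pass them through
PyMVPA's libsvm wrapper is not shown here.

```python
import numpy as np

def inverse_frequency_weights(labels):
    """Return {label: weight}, weights inversely proportional to class counts.

    Scaled so that a perfectly balanced label set gives 1.0 for every class.
    """
    classes, counts = np.unique(labels, return_counts=True)
    weights = len(labels) / (len(classes) * counts.astype(float))
    return dict(zip(classes.tolist(), weights.tolist()))

# Hypothetical toy data: 3 runs (chunks) with imbalanced labels.
targets = np.array(['a', 'a', 'a', 'b',
                    'a', 'a', 'b', 'b',
                    'a', 'a', 'a', 'a'])
chunks = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])

# Leave-one-run-out: weights come from the training part of each fold only.
weights_per_fold = {}
for test_chunk in np.unique(chunks):
    train = chunks != test_chunk
    weights_per_fold[int(test_chunk)] = inverse_frequency_weights(targets[train])
```

With this toy data the rarer class 'b' gets weight 2.0 when run 0 is held
out and 4.0 when run 1 is held out, so the weights do differ between folds.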

Thanks

On Fri, Jan 15, 2016 at 11:28 PM, Kaustubh Patil <kaustubh.patil at gmail.com>
wrote:

> Thanks again Yaroslav.
>
> I agree that the classifier might end up giving 0 or very small balanced
> accuracy (or macro accuracy) values, but I think that's still a better
> measure than overall accuracy (or micro accuracy). There are a couple
> of other measures that can be useful for imbalanced datasets:
>
> 1. A-mean: arithmetic mean of the class-wise accuracies, the same as
> balanced accuracy or macro accuracy
> 2. G-mean: geometric mean instead of arithmetic mean above
> 3. F-measure
> 4. Area under the ROC curve
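
The four measures listed above can be sketched in plain NumPy as follows.
This is an illustration under stated assumptions, not PyMVPA code: the
function names, the binary toy labels, and the scores are all made up for
the example, and the AUC is computed by direct pairwise comparison, which is
only practical for small sample counts.

```python
import numpy as np

def class_recalls(y_true, y_pred):
    """Per-class accuracy: fraction of each class's samples predicted correctly."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.array([np.mean(y_pred[y_true == c] == c)
                     for c in np.unique(y_true)])

def a_mean(y_true, y_pred):
    """Arithmetic mean of the per-class accuracies."""
    return float(np.mean(class_recalls(y_true, y_pred)))

def g_mean(y_true, y_pred):
    """Geometric mean of the per-class accuracies."""
    r = class_recalls(y_true, y_pred)
    return float(np.prod(r) ** (1.0 / len(r)))

def f_measure(y_true, y_pred, positive):
    """F1 score for the class given as `positive`."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    denom = 2 * tp + fp + fn
    return float(2 * tp) / denom if denom else 0.0

def auc(y_true, scores, positive):
    """Probability that a positive sample outscores a negative one (ties = 0.5)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true == positive], scores[y_true != positive]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

# Hypothetical toy example: 2 positive and 8 negative samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.35, 0.8]
```

On this toy example the overall accuracy is 0.8, while the A-mean is 0.6875,
the G-mean about 0.661, the F-measure 0.5, and the AUC 0.9375.
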
>
> Of course a better solution would be using a classifier that can handle
> imbalanced datasets, as you suggested. I have previously used SVMperf that
> can optimize AU-ROC:
> https://www.cs.cornell.edu/people/tj/svm_light/svm_perf.html
>
> Not sure how easy it is to incorporate new classifiers in PyMVPA, but I
> could give it a try with some guidance.
>
> Best regards,
> Kaustubh
>
>
>
>
> On Fri, Jan 15, 2016 at 11:08 PM, Yaroslav Halchenko <
> debian at onerussian.com> wrote:
>
>>
>> On Fri, 15 Jan 2016, Kaustubh Patil wrote:
>>
>> >    Thanks Yaroslav.
>>
>> >    I tried your solution and it seems to work for this particular
>> >    dataset but unfortunately not for other datasets as the labels
>> >    cannot be balanced easily.
>>
>> >    Maybe it's possible to directly calculate balanced measures in the
>> >    CV? I guess I will have to change the code to do that, any
>> >    suggestions where to start?
>>
>> some toolboxes compute the 'mean of within-class accuracies' (not the
>> mean overall accuracy), which allows accounting for imbalance.  I guess
>> we could code it quite easily if you like
>>
>> BUT the problem really would remain:  with a small number of samples the
>> classifier might just take the "majority" label since that would minimize
>> error more than a low-performance decision.  So you would hurt yourself
>> more than help.
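
A small numeric illustration of the majority-label case described above,
using made-up labels ('rest' and 'task' are hypothetical) and a degenerate
predictor that always returns the majority class:

```python
import numpy as np

# Hypothetical imbalanced test set: 18 samples of 'rest', 2 of 'task'.
y_true = np.array(['rest'] * 18 + ['task'] * 2)

# A degenerate classifier that always answers with the majority label.
y_pred = np.array(['rest'] * 20)

# Overall accuracy: fraction of all samples predicted correctly.
overall_accuracy = np.mean(y_pred == y_true)

# Mean of within-class accuracies: average the per-class hit rates.
within_class = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
mean_within_class_accuracy = np.mean(within_class)
```

Here the overall accuracy is 0.9 although the minority class is never
detected, and the mean of within-class accuracies is 0.5, i.e. chance level
for two classes.
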
>>
>> another solution is to try a classifier which provides weighting
>> to the classes, e.g. as GNB with default prior setting does.  you could
>> try it and see how it goes.  It is not the greatest classifier but a
>> start. then you could add similar class weighting to some other
>> classifiers supporting that.
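
To show how a class prior enters a Gaussian naive Bayes decision, here is a
minimal NumPy sketch. It is not PyMVPA's GNB implementation; `gnb_fit`,
`gnb_predict`, and the prior names 'uniform' and 'ratio' as used here are
assumptions for the example, and the one-feature data are made up.

```python
import numpy as np

def gnb_fit(X, y):
    """Estimate per-class feature means, variances, and sample counts."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Small constant keeps the variances strictly positive.
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    counts = np.array([np.sum(y == c) for c in classes], dtype=float)
    return classes, means, variances, counts

def gnb_predict(X, model, prior='ratio'):
    """Predict labels; `prior` is 'uniform' or 'ratio' (class frequencies)."""
    classes, means, variances, counts = model
    if prior == 'uniform':
        log_prior = np.full(len(classes), -np.log(len(classes)))
    else:
        log_prior = np.log(counts / counts.sum())
    # Gaussian log-likelihood summed over features: (n_samples, n_classes).
    diff = X[:, None, :] - means[None, :, :]
    log_lik = -0.5 * np.sum(np.log(2 * np.pi * variances)[None]
                            + diff ** 2 / variances[None], axis=2)
    return classes[np.argmax(log_lik + log_prior[None, :], axis=1)]

# Hypothetical data: 9 samples of class 0 around 0, 3 of class 1 around 2.
X = np.array([[-1.0], [0.0], [1.0]] * 3 + [[1.0], [2.0], [3.0]])
y = np.array([0] * 9 + [1] * 3)
model = gnb_fit(X, y)

# A point slightly closer to the class-1 mean.
x_new = np.array([[1.2]])
label_uniform = gnb_predict(x_new, model, prior='uniform')[0]
label_ratio = gnb_predict(x_new, model, prior='ratio')[0]
```

With a uniform prior the point is assigned to class 1 (the nearer mean);
with the frequency-based prior the larger class 0 wins instead, so the
choice of prior directly changes how imbalance affects the decision.
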
>>
>> >    Best regards
>> >    On Sat, Dec 19, 2015 at 3:51 PM, Yaroslav Halchenko
>> >    <debian at onerussian.com> wrote:
>>
>> >      On Sat, 19 Dec 2015, Kaustubh Patil wrote:
>>
>> >      > Hi,
>>
>> >      > I want to use PyMVPA for whole-brain searchlight analysis on
>> >      > some existing data. The data has already been preprocessed
>> >      > (skull stripping, motion correction etc.). Each subject's data
>> >      > contains 10 runs and each run was processed separately, so
>> >      > there is a separate full brain boolean mask for each run.
>>
>> >      > My question is what is the recommended/correct way to use this
>> >      > data to perform run-wise cross-validation searchlight?
>>
>> >      you have a problem here, since you have done per run
>> >      preprocessing, in particular motion-correction, your volumes are
>> >      misaligned across runs.  (used FSL, didn't you?)
>>
>> >      ideally, you redo preprocessing while motion correcting to the
>> >      same volume across all the runs.  Alternatively, you reslice all
>> >      the runs into the same space (could well be the common space your
>> >      toolkit used for analysis across runs -- common anatomical or
>> >      MNI) and then do analysis there, while again unifying your mask,
>> >      which must be the same across all the runs.
>> >      > As I understand, each run has to be in the same space (same
>> >      > number of voxels) so that training and test can be performed,
>> >      > so the whole brain masks have to be somehow aligned. How would
>> >      > you recommend doing this?
>>
>> >      it is not a mere 'number of voxels' problem but rather that your
>> >      volumes are misaligned across runs.  if it were just the voxel
>> >      number -- choose the intersection of all masks.
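
Taking the intersection of the per-run masks can be sketched with NumPy as
below. The tiny 2x2x2 masks and the random data are hypothetical stand-ins,
and the sketch assumes all runs have already been resliced to one common
grid; as noted above, intersecting masks only fixes the voxel count and does
nothing about misalignment.

```python
import numpy as np

# Hypothetical per-run boolean brain masks on a common 2x2x2 grid.
run_masks = [
    np.array([[[1, 1], [1, 0]], [[1, 1], [0, 0]]], dtype=bool),
    np.array([[[1, 1], [0, 0]], [[1, 1], [1, 0]]], dtype=bool),
    np.array([[[1, 0], [1, 0]], [[1, 1], [0, 0]]], dtype=bool),
]

# A voxel is kept only if it is inside the mask of every run.
common_mask = np.logical_and.reduce(run_masks)

# Hypothetical 4D data per run (time, x, y, z); masking with the common
# mask gives every run the same feature columns.
rng = np.random.RandomState(0)
run_data = [rng.rand(5, 2, 2, 2) for _ in run_masks]
run_features = [data[:, common_mask] for data in run_data]
n_features = [features.shape[1] for features in run_features]
```

Here the individual masks cover 5, 5, and 4 voxels, while the common mask
keeps the 3 voxels present in all of them, so each run ends up with 3
features.
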
>> --
>> Yaroslav O. Halchenko
>> Center for Open Neuroscience     http://centerforopenneuroscience.org
>> Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
>> Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
>> WWW:   http://www.linkedin.com/in/yarik
>>
>
>