[pymvpa] normalization: zscore by example?

Mike E. Klein michaeleklein at gmail.com
Tue Nov 8 01:56:54 UTC 2011


Hi,

Yes, that makes sense now. Thanks for the time and thought you put into it!

Best,
Mike

On Mon, Nov 7, 2011 at 6:20 PM, Yaroslav Halchenko <debian at onerussian.com> wrote:

>
> On Mon, 07 Nov 2011, Mike E. Klein wrote:
> >    - My reason for wanting to do this is the relative paucity of
> >    examples (9 per run, 3 categories, 9 runs), which I plan to reduce
> >    further in number by some averaging. It seems (to me) that the
> >    zscoring would be more accurate when based on thousands of voxels
> >    (per example) as opposed to just a few examples (per run).
>
> oy... a full answer with all the pros/cons would take a while... there
> are published papers in which such "preprocessing" was in place... I
> even debated with one of the authors but never got around to submitting
> some kind of critique... as for Francisco's -- I guess I missed that
> particular piece, or it applied only to classification (which, as I
> said, is ok)
>
>
> really coarse reasoning from me:  the problem with zscoring (or even
> just plain demeaning) across voxels is that it leaks information among
> voxels...  e.g. consider the most obvious simplified example, where you
> have 2 voxels, 1 of which is informative and one is not.
>
> after demeaning (let's stick to plain demeaning to keep it simple) --
> they both become informative with respect to the condition of interest,
> so if you were to judge "which one is informative" -- you would be
> misguided (see the toy sketch below)
>
> in the case of the full brain, where the majority of voxels are not
> informative and only a few are (unless you have a really simple
> contrast/paradigm), zscoring shouldn't be as detrimental, but the fact
> remains -- the mean/std might be stimulus-dependent, so you would leak
> that information into every voxel (possibly even removing effects from
> the informative voxels).  So if distributed effects have a large
> "mass" -- they will show up as diagnostic information in the mean of
> the volumes, and would then be introduced into every voxel in the
> volume.
>
> altogether -- to be on the safe side I would not do anything like that ;)
>
> as for zscoring -- I would just zscore each voxel within each run
> before extracting/averaging any samples you want to use for
> classification...
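>
> with current PyMVPA that would be something like (assuming your
> dataset ds has the runs in a 'chunks' sample attribute):
>
> from mvpa2.suite import zscore
>
> # z-score every voxel separately within each run ('chunks'),
> # in place, before any sample averaging
> zscore(ds, chunks_attr='chunks')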
>
> hope it makes sense...
>
> >    - I guess I don't see why zscoring this way would render a
> >    searchlight invalid. I'm looking at the excerpt from the Pereira
> >    paper that was referenced recently on the listserv: "In the example
> >    study, we normalized each example (row) to have mean 0 and standard
> >    deviation 1. The idea in this case is to reduce the effect of
> >    large, image-wide signal changes. Another possibility would be to
> >    normalize each feature (column) to have mean 0 and standard
> >    deviation 1, either across the entire experiment or within examples
> >    coming from the same run."
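>
> (for concreteness, the two options from that quote in plain numpy --
> shapes made up for illustration, X being examples x voxels:)
>
> import numpy as np
>
> X = np.random.randn(9, 1000)   # 9 examples x 1000 voxels
>
> # per-example (row) normalization, as in the Pereira example study:
> Xrow = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
>
> # per-feature (column) normalization across the whole experiment:
> Xcol = (X - X.mean(axis=0)) / X.std(axis=0)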
> --
> =------------------------------------------------------------------=
> Keep in touch                                     www.onerussian.com
> Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
>
> _______________________________________________
> Pkg-ExpPsy-PyMVPA mailing list
> Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa
>


More information about the Pkg-ExpPsy-PyMVPA mailing list