Thank you so much for your response, Dr. Halchenko.<div><br></div><div>However, I am embarrassed to once again show my inexperience, but I have a small set of questions.</div><div><br></div><div>Let's start with this piece of code:</div>

<div><br></div><div><span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">        NFoldPartitioner(len(ds.sa['superord'].unique),</span><br style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">

<span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">                         attr='subord'),</span><br style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">

<span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">        ## so it should select only those splits where we took 1 from</span><br style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">

<span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">        ## each of the superord categories leaving things in balance</span><br style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">

<span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">        Sifter([('partitions', 2),</span><br style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">

<span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">                ('superord',</span><br style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">

<span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">                 { 'uvalues': ds.sa['superord'].unique,</span><br style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">

<span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">                   'balanced': True})</span><br style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">

<span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">                 ]),</span><br style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">

<span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">                   ], space='partitions')</span></div><div><span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"><br>

</span></div><div><span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">My question here is whether the attributes file must differ in any way for this, or whether simply pointing to the attribute in question (which I'm guessing you just named "subord" in this case) will do the trick. Also, is there any way to do this for more than 2 superordinates? Sorry for this last question, as I acknowledge you wrote:</span></div>

<div><span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"><br></span></div><div><span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">"</span></div>

<div><span style="background-color:rgb(255,255,255);color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px"># And with that NFold + Sifter we achieve desired effect that we would get only</span></div><span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"># those splits where into testing we place 3 different subord categories with 1</span><br style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">

<span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"># of each superord</span><br style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">

<div>"</div><div><div><br></div></div><div>but I'm just a little lost.</div><div><br></div><div>Anyway, I hope I am not being too much of a burden. As soon as I have some feedback to share, I will be sure to do so.</div><div>

<br></div><div>Thanks again! J</div><div><br><div class="gmail_quote">On Thu, Oct 25, 2012 at 3:23 PM, Yaroslav Halchenko <span dir="ltr"><<a href="mailto:debian@onerussian.com" target="_blank">debian@onerussian.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im"><br>

On Thu, 25 Oct 2012, Jacob Itzhacki wrote:<br>

<br>

>    "e.g. some super-ordinate category (e.g. animate-vs-inanimate) you<br>

>    would like to cross-validate not across functional runs BUT across<br>

>    sub-ordinate stimuli categories (e.g. train on<br>

>    humans/reptiles/shoes/scissors to discriminate animacy and<br>

>    cross-validate into bugs/houses, then continue with another pair to take<br>

>    out)."<br>

>    BTW, this is exactly what I would like to do but I still can't figure out how<br>

>    to leave out the test trials from the training trials, so they don't get<br>

>    classified into themselves.<br>

<br>

</div>ok then -- the point is to craft such an interesting partitioner.  And there<br>

are actually 2 approaches to this.  Let's first look into<br>

<br>

<a href="https://github.com/PyMVPA/PyMVPA/blob/HEAD/mvpa2/tests/test_usecases.py#L50" target="_blank">https://github.com/PyMVPA/PyMVPA/blob/HEAD/mvpa2/tests/test_usecases.py#L50</a><br>

<br>

which I am citing here with some additional comments and omitting import<br>

statement(s) -- it is a bit more cumbersome since in it we have 6 subordinate<br>

categories and 3 superord (not 2 which would make explanation easier):<br>

<br>

    # Let's simulate the beast -- 6 categories total grouped into 3<br>

    # super-ordinate, and actually without any 'superordinate' effect<br>

    # since subordinate categories are independent<br>

<br>

# in your case I hope you would have a true superordinate effect like in<br>

# example study I am referring to below<br>

<br>

    ds = normal_feature_dataset(nlabels=6,<br>

                                snr=100,   # pure signal! ;)<br>

                                perlabel=30,<br>

                                nfeatures=6,<br>

                                nonbogus_features=range(6),<br>

                                nchunks=5)<br>

    ds.sa['subord'] = ds.sa.targets.copy()<br>

<br>

# Here I am creating a new 'superord' attribute as the remainder of division by 3<br>

# of the original 6 categories (in 'subord')<br>

<br>

    ds.sa['superord'] = ['super%d' % (int(i[1])%3,)<br>

                         for i in ds.targets]   # 3 superord categories<br>

    # let's override original targets just to be sure that we aren't relying on them<br>

    ds.targets[:] = 0<br>

<br>

    npart = ChainNode([<br>

    ## so we split based on superord<br>

<br>

# So now this NFold partitioner would select 3 subord categories (possibly where we even<br>

# have multiple samples from the same superord category)<br>

<br>

        NFoldPartitioner(len(ds.sa['superord'].unique),<br>

                         attr='subord'),<br>

        ## so it should select only those splits where we took 1 from<br>

        ## each of the superord categories leaving things in balance<br>

        Sifter([('partitions', 2),<br>

                ('superord',<br>

                 { 'uvalues': ds.sa['superord'].unique,<br>

                   'balanced': True})<br>

                 ]),<br>

                   ], space='partitions')<br>

<br>

# And with that NFold + Sifter we achieve desired effect that we would get only<br>

# those splits where into testing we place 3 different subord categories with 1<br>

# of each superord<br>

<br>

    # and then do your normal cross-validation, where clf uses space='superord'<br>

    clf = LinearCSVMC(space='superord')<br>

<br>

    cvte_regular = CrossValidation(clf, NFoldPartitioner(),<br>

                                   errorfx=lambda p,t: np.mean(p==t))<br>

<br>

# below we use our NFold + Sifter partitioner instead of a simple NFold on chunks<br>

<br>

    cvte_super = CrossValidation(clf, npart, errorfx=lambda p,t: np.mean(p==t))<br>

<br>

# apply as usual ;)<br>

<br>

    accs_regular = cvte_regular(ds)<br>

    accs_super = cvte_super(ds)<br>
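<br>
To make the NFold + Sifter selection above concrete, here is a minimal plain-Python sketch. It does not use PyMVPA; the labels 'L0'..'L5' and the helper balanced_test_sets are assumptions made up for illustration. It enumerates every way of placing 3 of the 6 subord categories into testing and keeps only the combinations that contain one category of each superord:<br>
<br>
<pre>
```python
from itertools import combinations

# Hypothetical labels mirroring the simulated dataset: 6 subordinate
# categories, each mapped to one of 3 superordinate categories (index % 3).
subord = ['L0', 'L1', 'L2', 'L3', 'L4', 'L5']
superord_of = {s: 'super%d' % (int(s[1]) % 3) for s in subord}
superords = sorted(set(superord_of.values()))


def balanced_test_sets(subord, superord_of, superords):
    """Return test sets holding exactly one subord category per superord."""
    kept = []
    # Like an N-fold split over 'subord' that leaves len(superords) out
    for test in combinations(subord, len(superords)):
        covered = sorted(superord_of[s] for s in test)
        # Like the sifting step: every superord present exactly once
        if covered == superords:
            kept.append(test)
    return kept


all_splits = list(combinations(subord, len(superords)))
kept_splits = balanced_test_sets(subord, superord_of, superords)
print(len(all_splits), len(kept_splits))
for test in kept_splits:
    train = [s for s in subord if s not in test]
    print('test:', test, 'train:', train)
```
</pre>
Of the 20 possible combinations only 8 survive in this toy setup, and the same enumeration works in this sketch for any number of superord categories.<br>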

<br>

<br>

If you are interested in how that would affect the results -- I would invite<br>

you to look at my recent poster at SfN 2012:<br>

<a href="http://haxbylab.dartmouth.edu/publications/HGG+12_sfn12_famfaces.png" target="_blank">http://haxbylab.dartmouth.edu/publications/HGG+12_sfn12_famfaces.png</a><br>

2nd column, scatter plot "Why across identities?"<br>

<br>

where on x-axis you have z-scores for CV stats across identities while<br>

cross-validating searchlights on classification of personal familiarity to<br>

faces across functional runs, while on y-axis -- across pairs of individuals.<br>

Both results are in high agreement BUT in the "blue areas" -- early visual<br>

cortex, where if we cross-validate across functional runs, classifier might<br>

just learn identity information. Since identity of a face (subordinate<br>

category) here has clear association with familiarity (superordinate), it would<br>

provide significant classification results in those areas where there is strong<br>

identity information on stimuli (in our case in early visual cortex since the<br>

faces were actually different ;) ) but possibly no (strong) superord effects<br>

(let's forget for now about possible attention/engagement etc effects).  By<br>

cross-validating across identities (subord), we can easily get rid of those<br>

subord-specific effects and capture the notion of the superord category<br>

effects more clearly.<br>

<br>

An alternative, even stricter cross-validation scheme would involve<br>

cross-validation across runs BUT also generating, for each such split,<br>

additional folds from all the splits across identities.  For that<br>

we have ExcludeTargetsCombinationsPartitioner, the docs for which are at<br>

<a href="http://www.pymvpa.org/generated/mvpa2.generators.partition.ExcludeTargetsCombinationsPartitioner.html?highlight=excludetargetscombinationspartitioner" target="_blank">http://www.pymvpa.org/generated/mvpa2.generators.partition.ExcludeTargetsCombinationsPartitioner.html?highlight=excludetargetscombinationspartitioner</a><br>


and its unit test at<br>

<a href="https://github.com/PyMVPA/PyMVPA/blob/HEAD/mvpa2/tests/test_generators.py#L266" target="_blank">https://github.com/PyMVPA/PyMVPA/blob/HEAD/mvpa2/tests/test_generators.py#L266</a><br>

<br>

This one was used in the original hyperalignment paper<br>

(<a href="http://haxbylab.dartmouth.edu/publications/HGC+11.pdf" target="_blank">http://haxbylab.dartmouth.edu/publications/HGC+11.pdf</a>) to avoid falling into the<br>

trap of run order effects...<br>
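<br>
As a rough plain-Python sketch of this stricter scheme (an illustration of the idea only, not how ExcludeTargetsCombinationsPartitioner is implemented; the function run_by_target_folds and the toy design are hypothetical), each fold pairs one left-out run with one held-out combination of identities. Training then uses only the other runs and the other identities, while testing uses the left-out run and the held-out identities:<br>
<br>
<pre>
```python
from itertools import combinations, product


def run_by_target_folds(chunks, targets, k):
    """Return (train_idx, test_idx) pairs for every pairing of a
    left-out chunk with a held-out combination of k targets."""
    uchunks = sorted(set(chunks))
    utargets = sorted(set(targets))
    folds = []
    for chunk, held in product(uchunks, combinations(utargets, k)):
        samples = list(enumerate(zip(chunks, targets)))
        # Training: other runs AND identities outside the held-out set
        train = [i for i, (c, t) in samples
                 if c != chunk and t not in held]
        # Testing: the left-out run AND only the held-out identities
        test = [i for i, (c, t) in samples
                if c == chunk and t in held]
        folds.append((train, test))
    return folds


# Toy design (hypothetical): 3 runs, 4 identities, one sample per pairing.
chunks = [c for c in range(3) for _ in range(4)]
targets = ['id%d' % t for _ in range(3) for t in range(4)]
folds = run_by_target_folds(chunks, targets, k=2)
print(len(folds))
```
</pre>
With 3 runs and 6 possible pairs of identities this toy design gives 18 folds, and no sample used for testing shares either its run or its identity with any training sample.<br>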

<br>

I would be glad to see people reporting back comparing these 3 schemes (just<br>

across runs, across subord, across runs+subord) of cross-validation on their<br>

data with hierarchical categories design. Thanks in advance for sharing  -- it<br>

would be great if we get a dialog going instead of my one-way blurbing... doh<br>

-- sharing! ;)<br>

<br>

Cheers,<br>

<br>

>    On Wed, Oct 24, 2012 at 5:15 PM, Jacob Itzhacki <<a href="mailto:jitzhacki@gmail.com">jitzhacki@gmail.com</a>><br>

<div class="im HOEnZb">>    wrote:<br>

<br>

>      Please do!<br>

<br>

>      and thank you for all the responses :D<br>

>      Don't want to come across as lazy but I'm not a master coder at all so<br>

>      sometimes figuring out what one line of code does can be quite the<br>

>      ordeal, in my case.<br>

>      J<br>

>      On Wed, Oct 24, 2012 at 3:54 PM, Yaroslav Halchenko<br>

</div><div class="HOEnZb"><div class="h5">>      <<a href="mailto:debian@onerussian.com">debian@onerussian.com</a>> wrote:<br>

<br>

>        On Wed, 24 Oct 2012, MS Al-Rawi wrote:<br>

>        >    Cross-validation is fine even in this case, you'll just need to<br>

>        rearrange<br>

>        >    your data in a way to leave-a-set-of-stimuli out, instead of<br>

>        >    leave-one-run-out. Perhaps PyMVPA has some functionality to do<br>

>        this.<br>

<br>

>        now it is getting interesting -- I think you got close to what I<br>

>        thought<br>

>        the question was about: to investigate the conceptual/true effect of<br>

>        e.g. some super-ordinate category (e.g. animate-vs-inanimate) you<br>

>        would like to cross-validate not across functional runs BUT across<br>

>        sub-ordinate stimuli categories (e.g. train on<br>

>        humans/reptiles/shoes/scissors to discriminate animacy and<br>

>        cross-validate into bugs/houses, then continue with another pair to<br>

>        take<br>

>        out). And that is what I thought for a moment the question was<br>

>        about ;)<br>

<br>

>        This all can be (was) done with PyMVPA although would require 3-4<br>

>        lines of code instead of 1 to accomplish ATM. If anyone is interested I<br>

>        could provide an example ;)... ?<br>

--<br>

Yaroslav O. Halchenko<br>

Postdoctoral Fellow,   Department of Psychological and Brain Sciences<br>

Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755<br>

Phone: <a href="tel:%2B1%20%28603%29%20646-9834" value="+16036469834">+1 (603) 646-9834</a>                       Fax: <a href="tel:%2B1%20%28603%29%20646-1419" value="+16036461419">+1 (603) 646-1419</a><br>

WWW:   <a href="http://www.linkedin.com/in/yarik" target="_blank">http://www.linkedin.com/in/yarik</a><br>

<br>

_______________________________________________<br>

Pkg-ExpPsy-PyMVPA mailing list<br>

<a href="mailto:Pkg-ExpPsy-PyMVPA@lists.alioth.debian.org">Pkg-ExpPsy-PyMVPA@lists.alioth.debian.org</a><br>

<a href="http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa" target="_blank">http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa</a></div></div></blockquote></div><br></div>