[pymvpa] NSplitter?

Per B. Sederberg persed at princeton.edu
Sat Mar 28 11:58:05 UTC 2009


OK, I'll try again with examples :)

Right now, if someone has each sample loaded with a different chunk
value, which is the default if you do not specify chunks, then they
have two options for splitting their data down the middle and running
a CV on each half:

coarsenChunks(2) followed by NFoldSplitter(1)

or, much easier if you never think about chunks:

HalfSplitter()

It seems to me that if you are an end user who never has to think
about chunks, but you want to run a CV on your data split into 10
chunks, then you would like to define a splitter like:

NSplitter(10)

This would essentially do:

coarsenChunks(10) followed by NFoldSplitter(1)

but in a way that is possibly more intuitive and does not require you
to actually modify your chunks.
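
To make the two-step version concrete, here is a rough sketch in plain
NumPy. It is not PyMVPA code: coarsen_labels and leave_one_group_out are
names I made up for illustration, and the grouping is a simplification
(it just cuts the sorted unique chunk values into near-equal runs), so
treat it as a picture of the idea rather than what coarsenChunks
actually does internally.

```python
import numpy as np


def coarsen_labels(chunks, n_groups):
    """Map unique chunk values onto n_groups contiguous, near-equal groups.

    Hypothetical helper for illustration only; not the PyMVPA function.
    """
    chunks = np.asarray(chunks)
    uniq = np.unique(chunks)
    if n_groups > len(uniq):
        raise ValueError("more groups requested than unique chunks")
    # array_split copes with counts that do not divide evenly
    groups = np.array_split(uniq, n_groups)
    coarse = np.empty(len(chunks), dtype=int)
    for g, members in enumerate(groups):
        coarse[np.isin(chunks, members)] = g
    return coarse


def leave_one_group_out(coarse):
    """Yield (train_idx, test_idx), holding out one coarse group at a time."""
    for g in np.unique(coarse):
        test = np.flatnonzero(coarse == g)
        train = np.flatnonzero(coarse != g)
        yield train, test


# One chunk per sample, i.e. the default when no chunks are given.
chunks = np.arange(20)
coarse = coarsen_labels(chunks, 10)
folds = list(leave_one_group_out(coarse))
```

With 20 samples and 10 groups, each fold tests on 2 samples and trains
on the other 18, and the original chunk values are left untouched.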

If no one else thinks this already exists in PyMVPA, I'll just go
ahead and code it.  I will also make HalfSplitter just call
NSplitter(2) to save code.
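
Roughly, the relationship between the two classes would look like the
sketch below. These are toy stand-ins, not the real PyMVPA classes: the
names match the ones discussed here, but the call signature (taking a
sample count and yielding index arrays) is my own assumption to keep
the example self-contained.

```python
import numpy as np


class NSplitter:
    """Toy stand-in: cut sample indices into n contiguous pieces."""

    def __init__(self, n):
        if n < 2:
            raise ValueError("need at least 2 pieces")
        self.n = n

    def __call__(self, n_samples):
        # Assumed interface: takes a sample count, yields (train, test) indices.
        pieces = np.array_split(np.arange(n_samples), self.n)
        for i, test in enumerate(pieces):
            train = np.concatenate(pieces[:i] + pieces[i + 1:])
            yield train, test


class HalfSplitter(NSplitter):
    """The two-piece case, delegating everything to NSplitter."""

    def __init__(self):
        super().__init__(2)


halves = list(HalfSplitter()(6))
tenths = list(NSplitter(10)(25))
```

The point is only that the two-way case needs no code of its own once
the N-way one exists.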

Best,
Per


On Sat, Mar 28, 2009 at 12:27 AM, Per B. Sederberg <persed at princeton.edu> wrote:
> Howdy Y:
>
> I think coarsen chunks will work, but I'm confused by all the methods
> of doing what I want, which I think is simpler than what you described
> in your email.
>
> HalfSplitter makes two splits (right down the middle).  you could
> imagine ThirdSplitter, which splits your data into thirds, etc...
> this is all that coarsen chunks is doing along with nfoldsplitter, but
> why force two steps?  if I only wanted coarsenchunks(2), I could just
> call HalfSplitter, which already exists?
>
> perhaps I shouldn't send tired emails from my phone in bed.  I'll try
> and clarify tomorrow...
>
> latros,
> p
>
> On 3/27/09, Yaroslav Halchenko <debian at onerussian.com> wrote:
>>
>>> I was helping a colleague perform a multivariate analysis of EEG data
>>> today and we ran into a missing splitter issue.  Given that it's EEG
>>> there's no real need for chunks, like in fMRI runs, and we didn't want
>>> to run an NFoldSplitter-based cross validation (CV) because it would,
>>> quite possibly, take forever.  Instead, we simply wanted to run a CV
>>> by splitting the data into 10 chunks (though we could try different
>>> numbers of splits).
>>
>> just look at our Frontiers paper code ;)
>>
>> I guess you just want to use
>>
>> coarsenChunks(10)  with NFoldSplitter ;)
>>
>> or maybe look at the docstring of Splitter at such parameters as
>> nperlabel, count and strategy. Not sure if it is applicable directly yet
>> to some splitter to generate your use case, but I guess we could alter
>> NoneSplitter to spit out 'the other' part as a testing part
>>
>>
>>> I realize we could have set up custom dataset chunks ourselves or
>>> created a CustomSplitter, but what do folks think of extending the
>>> HalfSplitter into an NSplitter, where you specify how many pieces you
>>> want to split your data into.
>>
>> hm... is that wording wrong or me is slow? are you saying that you want
>> to generate splits which consist of more than 2 parts? like 1 for training,
>> 2nd for testing and parameter selection and 3rd one for resultant
>> cross-validation?
>>
>>>  Obviously, providing N=2 would be
>>> identical to the HalfSplitter.  But this would make it really easy to
>>> split your data into arbitrary numbers of equal-sized chunks.
>>
>> yeah... I guess those parameters described above are what you are looking
>> for... I am just not sure now if we have that "TheOtherSplitter" which
>> tosses what wasn't selected into a testing part... maybe it is already
>> there ;)
>>
>> or did I misunderstand something?
>>
>> P.S. how is pyepl release coming? ;)
>>
>> --
>> Yaroslav Halchenko
>> Research Assistant, Psychology Department, Rutgers-Newark
>> Student  Ph.D. @ CS Dept. NJIT
>> Office: (973) 353-1412 | FWD: 82823 | Fax: (973) 353-1171
>>         101 Warren Str, Smith Hall, Rm 4-105, Newark NJ 07102
>> WWW:     http://www.linkedin.com/in/yarik
>>
>>
>


