<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
On 05/08/2012 11:17 PM, Vadim Axel wrote:
<blockquote
cite="mid:CACMmGHE4tGyJn-a4Mb+BYD5OC_BmaaM0w=ow6gVk3DhV=X75Zg@mail.gmail.com"
type="cite">
<div dir="ltr">Thank you a lot, Emanuele!<br>
<br>
The statement that the permutation test is reliable when "the data
distribution is adequately represented by the sample data" sounds a
little puzzling to me :) As far as I can tell, the authors do not
discuss this point much. How can I know when the representation is
indeed adequate? For example, with a small number of data points the
permutation chance level is frequently high and can even reach 0.6.
The distribution of predictions from reshuffling looks normal, so it
is not something completely pathological. Not surprisingly, for some
subjects it might be difficult if not impossible to reach
significance with such a high chance level. Is the permutation test
appropriate in such a scenario? In other words, can the result be
non-significant with the permutation test while the information is
still present?<br>
<br>
<br>
</div>
</blockquote>
<br>
<br>
As far as I know, the results of resampling approaches
(permutation test, bootstrap, etc.)<br>
provide correct answers only "asymptotically", i.e. when the sample
size grows large. The<br>
theoretical work in this field is concerned with finding new, smarter
estimators that<br>
converge to the true value faster than previous methods. I am not
aware of results<br>
about the reliability of the permutation test on small samples that
give you (even<br>
probabilistic) bounds on how far you are from the asymptotic
regime, apart from<br>
toy examples.<br>
<br>
About your example, I guess you could run a simulation: create a
simple<br>
dataset of two classes (N(0,1) for class 1 and N(delta,1) for class
2), train a<br>
classifier, and then observe the behaviour of the permutation test
while changing the<br>
number of examples and the overlap between the two classes (delta).
Maybe you<br>
would see that with a low sample size and high overlap, i.e. a small
effect size, the<br>
permutation test might not be too reliable. This would not be
surprising: in that<br>
situation the probability that your dataset adequately represents the
underlying<br>
distribution of the data is lower, and any inference drawn
from the data *alone*<br>
would suffer. A quick sketch of such a simulation is below.<br>
<br>
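Here is a minimal sketch of what I mean, just to illustrate the idea. I am
assuming scikit-learn is available; the linear SVM, 5-fold cross-validation,
100 permutations, and the particular sample sizes and deltas are arbitrary
choices of mine, not anything prescribed by the paper:<br>
<pre>
# Minimal sketch: two Gaussian classes, a classifier, and a permutation
# test, repeated while varying the sample size and the class overlap (delta).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, permutation_test_score

rng = np.random.RandomState(0)

def make_dataset(n_per_class, delta, n_features=10):
    """Two classes: N(0,1) vs. N(delta,1) on each feature."""
    X0 = rng.randn(n_per_class, n_features)
    X1 = rng.randn(n_per_class, n_features) + delta
    X = np.vstack([X0, X1])
    y = np.r_[np.zeros(n_per_class), np.ones(n_per_class)]
    return X, y

for n_per_class in [10, 20, 50, 100]:
    for delta in [0.0, 0.2, 0.5, 1.0]:
        X, y = make_dataset(n_per_class, delta)
        cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
        score, perm_scores, pvalue = permutation_test_score(
            SVC(kernel="linear"), X, y, cv=cv,
            n_permutations=100, random_state=0)
        # perm_scores is the empirical null distribution of the accuracy:
        # with few examples its spread (and hence the accuracy needed to
        # reach significance) is much larger.
        print("n/class=%3d  delta=%.1f  acc=%.2f  null 95th pct=%.2f  p=%.3f"
              % (n_per_class, delta, score,
                 np.percentile(perm_scores, 95), pvalue))
</pre>
Of course this only gives a feeling for how the null distribution widens as
the sample size shrinks; it is not the method of the paper mentioned below.<br>
<br>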
About your last question, "can it be that the result is non-significant
with the permutation test, but the<br>
information is still present": surely it could. Your question asks
how to interpret a non-rejected<br>
null hypothesis (of the classifier predicting at random). The
classical null-hypothesis testing<br>
framework, which you are using, is asymmetric: it can only reject
the null hypothesis, never<br>
accept it. For this reason a non-significant test does not provide
insight into the data.<br>
<br>
Best,<br>
<br>
Emanuele<br>
<br>
PS: here is the code implementing the method proposed in the paper I
mentioned in my<br>
last email, in case you want to try it. In Python, of course ;-)<br>
<a class="moz-txt-link-freetext" href="https://github.com/emanuele/Bayes-factor-multi-subject">https://github.com/emanuele/Bayes-factor-multi-subject</a><br>
<br>
<br>
<blockquote
cite="mid:CACMmGHE4tGyJn-a4Mb+BYD5OC_BmaaM0w=ow6gVk3DhV=X75Zg@mail.gmail.com"
type="cite">
<div dir="ltr"><br>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
Emanuele Olivetti, Sriharsha Veeramachaneni, Ewa Nowakowska,
Bayesian hypothesis testing for pattern discrimination in
brain decoding, Pattern Recognition, 45, 2012. <a
moz-do-not-send="true"
href="http://dx.doi.org/10.1016/j.patcog.2011.04.025"
target="_blank">http://dx.doi.org/10.1016/j.patcog.2011.04.025</a><br>
I know self-citations suck, but I haven't found a more
convincing one.<br>
<br>
</blockquote>
</div>
</div>
</blockquote>
<br>
</body>
</html>