<div dir="ltr">Thank you, Francisco and Emanuele, for your encouragement :)<br>1. I use run sessions as the folds in my cross-validation, so I think I should be fine. At least no temporally correlated samples leak from the training set into the test set.<br>
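As a minimal sketch of run-wise folds: every sample of one run is held out together, so no run contributes to both training and test data. The function name and the toy run layout are hypothetical and are not taken from any toolbox.<br>

```python
from collections import defaultdict


def leave_one_run_out(run_labels):
    """Yield (held_out_run, train_idx, test_idx), one fold per unique run."""
    by_run = defaultdict(list)
    for idx, run in enumerate(run_labels):
        by_run[run].append(idx)
    for held_out in sorted(by_run):
        # All samples of the held-out run form the test set.
        test_idx = by_run[held_out]
        # Samples of every other run form the training set.
        train_idx = [i for run in sorted(by_run) if run != held_out
                     for i in by_run[run]]
        yield held_out, train_idx, test_idx


# Hypothetical layout: 3 runs with 4 samples each.
runs = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
folds = list(leave_one_run_out(runs))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```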

2. I understand the concern regarding the binomial test on accuracy. I think it might be more robust to build a null distribution by running the classification ~1000 times on data with scrambled labels (a kind of non-parametric permutation test). Just recently I encountered a class that, for some unknown reason, was classified far above chance even with scrambled labels, so the binomial test would have missed this.<br>
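The scrambled-labels idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the toy 1-D data, the nearest-class-mean classifier and both function names are hypothetical stand-ins for the real classifier and cross-validation, and the labels are shuffled across all samples rather than within runs.<br>

```python
import random


def loo_nearest_mean_accuracy(features, labels):
    """Leave-one-out accuracy of a 1-D nearest-class-mean classifier (toy stand-in)."""
    correct = 0
    for i, x in enumerate(features):
        # Class means are computed without sample i, i.e. retrained per fold.
        means = {}
        for cls in sorted(set(labels)):
            vals = [f for j, f in enumerate(features)
                    if j != i and labels[j] == cls]
            if vals:
                means[cls] = sum(vals) / len(vals)
        predicted = min(means, key=lambda c: abs(x - means[c]))
        correct += predicted == labels[i]
    return correct / len(features)


def permutation_test(features, labels, score_fn, n_permutations=1000, seed=0):
    """Return (observed score, null scores, one-sided p-value)."""
    rng = random.Random(seed)
    observed = score_fn(features, labels)
    shuffled = list(labels)
    null_scores = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # scramble labels, keep features fixed
        null_scores.append(score_fn(features, shuffled))
    # +1 in numerator and denominator counts the observed labelling itself,
    # so the estimated p-value can never be exactly zero.
    n_at_least = sum(s >= observed for s in null_scores)
    p_value = (n_at_least + 1) / (n_permutations + 1)
    return observed, null_scores, p_value


# Hypothetical toy data: two well-separated classes.
features = [0.1, 0.3, 0.2, 0.4, 1.1, 1.3, 1.2, 1.4]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
observed, null_scores, p_value = permutation_test(
    features, labels, loo_nearest_mean_accuracy)
print(observed, p_value)
```

If the mean of the null scores sits well away from the nominal chance level, that is the kind of bias a binomial test against a fixed chance level would not reveal.<br>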

3. Averaging within blocks indeed reduces the temporal correlation. I just ran a test on two different data sets: one had 10 s of fixation between the blocks and the other had no fixation at all. Surprisingly, the temporal correlation between blocks was similar in both (~0.1). Although this is much better than with the raw data, it is still far from what Gaussian noise would give. Anyway, if the only consequence is lower performance, then I am fine with it.<br>
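One possible way to quantify such a correlation is the lag-1 autocorrelation of the block averages of a single time series. This is a sketch under that assumption, not necessarily the measure used above, and the helper names are hypothetical.<br>

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def block_means(series, block_len):
    """Average consecutive, non-overlapping blocks of block_len samples."""
    return [sum(series[i:i + block_len]) / block_len
            for i in range(0, len(series) - block_len + 1, block_len)]


def lag1_autocorrelation(values):
    """Correlation between each value and the one that follows it."""
    return pearson(values[:-1], values[1:])


# Hypothetical series: average it in blocks of 2, then look at neighbouring blocks.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
averaged = block_means(series, 2)
print(averaged, lag1_autocorrelation(averaged))
```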

<br><br><div class="gmail_quote">On Thu, Dec 3, 2009 at 11:17 PM, Emanuele Olivetti <span dir="ltr">&lt;<a href="mailto:emanuele@relativita.com">emanuele@relativita.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Vadim Axel wrote:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Hi,<br>

...<div class="im"><br>

What do you think about this issue? Does ignoring temporal correlation merely decrease the prediction rate, or does it cast doubt on the results in general?<br>

</div></blockquote>

<br>

An SVM will underperform on non-iid data because it does not exploit temporal<br>

dependencies. It underperforms in the sense that a classifier exploiting them could do better.<br>

As far as I remember, some generalization bounds do not hold for SVMs when the data<br>

are not iid. Nevertheless, it is quite common for data not to be iid and for classifiers<br>

that assume iid data to be used on them anyway.<br>

<br>

As far as I know, there are several schemes to minimize the impact of the temporal<br>

dependencies between fMRI volumes. Averaging over blocks is one of them. For<br>

example, in [0] they use beta values for each trial as regressors instead of BOLD.<br>

Many other strategies can be conceived.<br>

<br>

As a basic rule, just make sure that highly temporally correlated samples are not<br>

shared between the train and test sets, which in your case could mean never letting samples from the same<br>

block be split across the train and test sets. PyMVPA has the concept of &quot;chunk&quot; for that.<br>

During cross-validation, samples from the same chunk all go either to the train set or to the<br>

test set. This helps, for example, when you want to test the error rate of your<br>

binary classifier with the binomial test.<br>

<br>

HTH,<br>

<br>

E.<br>

<br>

[0]: <a href="http://www.citeulike.org/user/librain/article/3140982" target="_blank">http://www.citeulike.org/user/librain/article/3140982</a><br>

<br>

</blockquote></div><br></div>