[pymvpa] time complexity of IterativeRelief feature selection

Emanuele Olivetti emanuele at relativita.com
Wed Jan 12 15:33:31 UTC 2011


On 01/12/2011 01:42 PM, Brian Murphy wrote:
>
> I've been running an IterativeRelief feature selection for five days now, and it still 
> hasn't completed. Does anyone have experience to help me get a ball-park estimate of how 
> long it should take? My dataset is ~300 samples by ~25,000 features. I see on the API 
> documentation that the algorithm has complexity "O(T*N^2*I), where T is the number of 
> iterations, N the number of instances, I the number of features". Any idea what the 
> number of iterations should/might be? I don't see a parameter to set this,
>
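
For a rough sense of scale, the complexity quoted above can be turned into a per-iteration operation count. This is only a back-of-envelope sketch: the throughput figure is an assumed placeholder, not a measurement of PyMVPA, and the constant factor hidden by the O() notation is ignored.

```python
# Back-of-envelope cost per iteration from the quoted O(T * N^2 * I) bound.
n_samples = 300       # N, from the message above
n_features = 25000    # I, from the message above

# N^2 * I elementary operations per iteration (constant factors ignored).
ops_per_iteration = n_samples ** 2 * n_features


def estimated_iterations(elapsed_seconds, ops_per_second):
    """Iterations T that fit in the elapsed time at an ASSUMED throughput."""
    return elapsed_seconds * ops_per_second / ops_per_iteration


five_days = 5 * 24 * 3600  # seconds

# 1e7 operations per second is a hypothetical figure; replace it with a
# rate measured on your own machine before drawing any conclusion.
print(ops_per_iteration)
print(estimated_iterations(five_days, 1e7))
```

If the number of iterations implied by a realistic throughput is very large, the stopping criterion is probably never being met, which is more informative than the raw running time.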

Hi Brian,

My guess is that at least one of the following situations applies:
- The initial guess you are starting from is not good for your
problem. Did you normalize the data?
- The threshold you are using is too low, so it takes ages, maybe
without any real gain. Try increasing it. Again, data normalization
should play an important role.
- Your problem is not so friendly towards optimization :-) so a
stochastic gradient strategy like IterativeReliefOnline might help.
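
As an illustration of the normalization point, here is a minimal sketch of per-feature z-scoring in plain NumPy. It is not PyMVPA's own normalization routine; the function name zscore_features and the toy data are made up for the example.

```python
import numpy as np


def zscore_features(X, eps=1e-12):
    """Scale every feature (column) to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # Constant features have std == 0; map them to 0 instead of dividing by 0.
    std = np.where(std < eps, 1.0, std)
    return (X - mean) / std


# Toy data: 300 samples, 3 features on very different scales.
rng = np.random.RandomState(0)
X = rng.normal(loc=5.0, scale=[1.0, 100.0, 0.01], size=(300, 3))
Xz = zscore_features(X)

print(Xz.mean(axis=0))  # close to 0 for every feature
print(Xz.std(axis=0))   # close to 1 for every feature
```

Without a step like this, features with a large numeric range dominate any distance computed between samples, which can make a fixed convergence threshold much harder to reach.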

In any case, I strongly suggest that you enable debug mode
and observe how the convergence statistics evolve. That will
tell you/us more about where the problem is.
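
To show the kind of statistic worth watching, here is a generic sketch of an iterative update that logs the change between successive weight vectors and stops at a threshold or an iteration cap. It is not the I-RELIEF update and does not use PyMVPA's debug facility; iterate_until_converged, toy_update and max_iter are hypothetical names for this example.

```python
import math


def iterate_until_converged(update, w0, threshold, max_iter=1000, log=print):
    """Apply `update` repeatedly until the weights change by less than `threshold`.

    Returns (weights, iterations_used, converged_flag).
    """
    w = list(w0)
    for t in range(1, max_iter + 1):
        w_new = update(w)
        # Euclidean norm of the change: the convergence statistic to watch.
        change = math.sqrt(sum((a - b) ** 2 for a, b in zip(w_new, w)))
        log("iteration %d: change = %.3e" % (t, change))
        w = w_new
        if change < threshold:
            return w, t, True
    return w, max_iter, False


# Toy update that moves halfway towards a fixed target on every step.
target = [0.2, 0.5, 0.3]


def toy_update(w):
    return [0.5 * wi + 0.5 * ti for wi, ti in zip(w, target)]


def quiet(message):
    pass


w0 = [1.0 / 3] * 3
_, n_loose, ok_loose = iterate_until_converged(toy_update, w0, 1e-2, log=quiet)
_, n_tight, ok_tight = iterate_until_converged(toy_update, w0, 1e-8, log=quiet)
print(n_loose, n_tight)  # the looser threshold stops after fewer iterations
```

If the logged change stalls or oscillates instead of shrinking, a stricter threshold will never be reached, and raising it (or capping the iterations) costs little.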

Emanuele



