[pymvpa] time complexity of IterativeRelief feature selection

MS Al-Rawi rawi707 at yahoo.com
Thu Jan 13 12:47:47 UTC 2011


Hi Brian,

Although I'm not familiar with how IterativeRelief  works, but, if the 
complexity has O(T*N^(2*L)) then it might need few hundred years (or maybe much 
more) to finish since the complexity here is ~ O(2^50,000) ... If L is 40, 
you'll need roughly 1099511627776 operations, now, suppose each operation takes 
1E-7 second ... then, for L =40 our processor will take:

1099511627776 *1E-7 /3600 ~= 30.5 hours 

Now the problem is that 2^n is neither linear nor polynomial, in fact, it has 
exponential growth. So my guess is that the complexity of  O(T*N^(2*L)) is the 
for the direct implementation of the algorithm and without any optimization to 
speedup the process.

Regards,
-Rawi




________________________________
From: Emanuele Olivetti <emanuele at relativita.com>
To:Development and support of PyMVPA <pkg-exppsy-pymvpa at lists.alioth.debian.org>
Sent:Wed, January 12, 2011 3:33:31 PM
Subject:Re: [pymvpa] time complexity of IterativeRelief feature selection

On 01/12/2011 01:42 PM, Brian Murphy wrote:
> 
> I've been running an IterativeRelief feature selection for five days now, and 
>it still hasn't completed. Does anyone have experience to help me get a 
>ball-park estimate of how long it should take? My dataset is ~300 samples by 
>~25,000 features. I see on the API documentation that the algorithm has 
>complexity "O(T*N^2*I), where T is the number of iterations, N the number of 
>instances, I the number of features". Any idea what the number of iterations 
>should/might be? I don't see a parameter to set this,
> 

Hi Brian,

My guess is that at least one of the following situations occurs:
- The initial guess you are starting from is not good for your
problem. Did you normalize data?
- The threshold you are using is too low and so it take ages, maybe
without any real gain. Try to increase it. Again data nomalization
should play an important role.
- Your problem is not so friendly towards optimization :-) so a
stochastic gradient strategy like IterativeReliefOnline might help.

In any case I strongly suggest you to enable the debug mode
and observe the evolution of the convergence statistics. It will
tell you/us more on where is the problem.

Emanuele


_______________________________________________
Pkg-ExpPsy-PyMVPA mailing list
Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
http://lists.alioth.debian.org/mailman/listinfo/pkg-exppsy-pymvpa



      
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.alioth.debian.org/pipermail/pkg-exppsy-pymvpa/attachments/20110113/b1482499/attachment.htm>


More information about the Pkg-ExpPsy-PyMVPA mailing list