[pymvpa] GPR woes

Sat Mar 28 15:15:48 UTC 2009

Dear Per and Scott,

You are both right and I feel quite guilty since I did not spend
time on GPR as I promised long time ago (basically lack of time). The
epsilon issue is exactly what Scott says: I discussed the same issue
with Yarick some time ago but then did not work on a principled solution.
I'm fine if you patch the current code in order to increase epsilon till a
certain extent. Any idea about alternative/principled solutions?

Moreover I'm (slowly) after another numerical instability due, I guess, to
underflow in summation which could affect the marginal likelihood
computation (which means it is highly relevant for GPR). This is more
likely to happen when the dataset is large. So be prepared to that ;-)

As far as I remember I forced the kernel matrix to be double precision
for exactly the same reasons you mentioned, but checking more
accurately this fact is highly desirable.

Unfortunately I'm not going to work on GPR very soon, even though
I'd love to. So any patch is warmly welcome, as well as bug reports ;-)

Best,

Emanuele

Per B. Sederberg wrote:
> Hi Scott:
>
> You are correct on all accounts.  epsilon is currently hard-coded and
> either of your proposed solutions would help to make GPR work more
> frequently/consistently.  I'll try out some combination of both and
> report back.
>
> Best,
> Per
>
> On Sat, Mar 28, 2009 at 9:46 AM, Scott Gorlin <gorlins at mit.edu> wrote:
>   
>> Epsilon is the Tikhonov regularization parameter, no?  (Haven't read
>> much about this classifier, but it looks similar to the RLS solution to
>> me...).  As such it should be adjustable to enable punishing overfit
>> solutions, so maybe it should actually be a formal parameter.
>>
>> If the default returns a LinAlg error, you could also stick the cholesky
>> decomposition in a finite loop where epsilon is doubled each time
>> (though not too many iterations...)
>>
>> On a wild tangent, also check that your kernel matrix (here self._C) is
>> double precision, since 1e-20 is close to the precision limit for single
>> floats.
>>
>> Per B. Sederberg wrote:
>>     
>>> Hi Everybody (and Emanuele in particular):
>>>
>>> I've tried GPR on about 5 different datasets and I've never actually
>>> gotten it to run through an entire cross validation process.  I always
>>> get the following error:
>>>
>>> /home/per/lib/python/mvpa/clfs/gpr.pyc in _train(self, data)
>>>     308             except SLAError:
>>>     309                 epsilon = 1.0e-20 * N.eye(self._C.shape[0])
>>> --> 310                 self._L = SLcholesky(self._C + epsilon, lower=True)
>>>     311                 self._LL = (self._L, True)
>>>     312                 pass
>>>
>>> /usr/lib/python2.5/site-packages/scipy/linalg/decomp.pyc in
>>> cholesky(a, lower, overwrite_a)
>>>     552     potrf, = get_lapack_funcs(('potrf',),(a1,))
>>>     553     c,info = potrf(a1,lower=lower,overwrite_a=overwrite_a,clean=1)
>>> --> 554     if info>0: raise LinAlgError, "matrix not positive definite"
>>>     555     if info<0: raise ValueError,\
>>>     556        'illegal value in %-th argument of internal potrf'%(-info)
>>>
>>> LinAlgError: matrix not positive definite
>>>
>>> I think Yarik mentioned changing epsilon or something like that, but
>>> it seems to me that having to change source code to tweak a classifier
>>> to not crash is not ideal.
>>>
>>> Do you have any suggestions for how to fix this?  I've been tantalized
>>> with some very good transfer errors in the folds that GPR was able to
>>> complete before crashing.  Is there a permanent change we could make
>>> to the GPR code to make it more stable across datasets?  Or could we
>>> turn epsilon into a configurable variable when setting up the
>>> classifier?  If so, what should I change epsilon to in order to see if
>>> I can get it to run?
>>>
>>> Thanks,
>>> Per
>>>
>>> _______________________________________________
>>> Pkg-ExpPsy-PyMVPA mailing list
>>> Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
>>> http://lists.alioth.debian.org/mailman/listinfo/pkg-exppsy-pymvpa
>>>
>>>       
>>
>> _______________________________________________
>> Pkg-ExpPsy-PyMVPA mailing list
>> Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
>> http://lists.alioth.debian.org/mailman/listinfo/pkg-exppsy-pymvpa
>>
>>     
>
> _______________________________________________
> Pkg-ExpPsy-PyMVPA mailing list
> Pkg-ExpPsy-PyMVPA at lists.alioth.debian.org
> http://lists.alioth.debian.org/mailman/listinfo/pkg-exppsy-pymvpa
>
>