[Pkg-exppsy-maintainers] New LARS classifier in PyMVPA!!!

Per B. Sederberg persed at princeton.edu
Sun Apr 20 17:24:15 UTC 2008


Hi Folks:

So I was talking with a world-famous mathematician (Ingrid Daubechies)
about SMLR on Friday and she suggested trying out the Least Angle
Regression (LARS) technique instead.  Here's the relevant paper by
some of the most famous folks in machine learning:

Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani,
"Least Angle Regression", Annals of Statistics (with discussion), 2004,
32(2), 407-499.  A new method for variable subset selection, with the
lasso and "epsilon" forward stagewise methods as special cases.

It is kind of like a smart boosted linear regression classifier that
adds features one-by-one in a very intelligent fashion (those folks
are smart.)
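
For intuition, here is a toy NumPy sketch of the "epsilon" forward
stagewise idea (one of the special cases mentioned in the paper).  It
is not the real LARS algorithm, just an illustration of what nudging
one feature at a time toward the residual looks like:
<example>
import numpy as np

def forward_stagewise(X, y, eps=0.01, n_steps=1000):
    """Toy epsilon forward stagewise regression: repeatedly nudge the
    coefficient of the feature most correlated with the residual."""
    beta = np.zeros(X.shape[1])
    residual = y.astype(float).copy()
    for _ in range(n_steps):
        corr = X.T.dot(residual)        # feature/residual correlations
        j = np.argmax(np.abs(corr))     # most correlated feature wins
        step = eps * np.sign(corr[j])
        beta[j] += step                 # tiny step on that feature only
        residual -= step * X[:, j]      # update the residual
    return beta
</example>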

She warned that implementing it may be a bit difficult due to a few
tricks and suggested that I use some existing implementation from a
person we can trust.  Well, it turns out that Trevor Hastie
implemented it very nicely in R, so I figured let's make use of the
PyMVPA framework and wrap it up!!!

The result is a new LARS classifier that makes use of RPy to wrap the
R implementation.  It looks like it works great and we should
eventually make a LARSWeights to go with the SMLRWeights and
LinearSVMWeights.
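
Under the hood there is not much magic: the classifier basically just
hands the NumPy arrays over to R through the same rpy calls shown in
the usage example further down.  A rough, hypothetical sketch of the
wrapping (the class name here is made up; it is not the actual PyMVPA
code):
<example>
import rpy

class LARSSketch(object):
    """Hypothetical outline of wrapping R's lars via RPy."""

    def __init__(self):
        # pull in the R-side lars package
        rpy.r.library('lars')
        self._model = None

    def train(self, samples, labels):
        # fit the R lars model on the training data
        self._model = rpy.r.lars(samples, labels, use_Gram=False)

    def predict(self, samples):
        # run new samples through R's predict.lars
        return rpy.r.predict_lars(self._model, samples)
</example>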

To make it work, you have to install R and RPy and then download the
lars contributed package.  Below are my notes on doing this on Debian:

Howto install and use the R version of lars (on Debian Lenny):
 - First you have to install all the R you need:
<example>
sudo aptitude install python-rpy python-rpy-doc r-base-dev
</example>
 - Then you have to install the lars library (if you do this as root
   you will install it globally):
<example>
R
install.packages()
</example>
   Just pick your mirror, then pick lars from the list of packages.
 - Finally this is how to use it with rpy:
<example>
# from an "ipython -pylab" session:
import rpy
import numpy as N
rpy.r.library('lars')
# training data: the first 5 features are informative for the first 50 samples
x = N.random.randn(100,1000)
x[:50,:5] = x[:50,:5] + 2
# test data with the same structure
x2 = N.random.randn(10,1000)
x2[:5,:5] = x2[:5,:5] + 2
# binary labels for the training samples
y = N.zeros((100,1))
y[:50,0] = 1
# fit the lars model and predict on the new samples
res = rpy.r.lars(x,y,use_Gram=False)
p = rpy.r.predict_lars(res,x2)
</example>


The current implementation passes the test_lars.py tests, but I was
getting a shogun error, so not all of the tests were running on my machine.

We should think about a graceful way for the code to error out if
someone does not have the dependencies loaded correctly.
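
For example, we could check the external dependencies up front and
raise a readable error instead of a cryptic traceback from deep inside
RPy.  A rough sketch (the function name is made up, and the exact RPy
exception class would need checking):
<example>
def _check_lars_dependencies():
    """Fail early with a clear message if the R-side pieces are missing."""
    try:
        import rpy
    except ImportError:
        raise RuntimeError("the LARS classifier needs R and the python-rpy package")
    try:
        rpy.r.library('lars')
    except Exception:  # exact RPy exception class to be checked
        raise RuntimeError("the LARS classifier needs the 'lars' R package "
                           "(install it with install.packages() in R)")
</example>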

Talk to y'all soon,
Per


