[pymvpa] Surface searchlight taking 6 to 8 hours

Christopher J Markiewicz effigies at bu.edu
Thu Jul 23 14:46:46 UTC 2015


On 07/23/2015 06:45 AM, Nick Oosterhof wrote:
> 
>> On 22 Jul 2015, at 20:11, John Baublitz <jbaub at bu.edu> wrote:
>>
>> I have been battling with a surface searchlight that has been taking 6 to 8 hours for a small dataset. It outputs a usable analysis, but the run time is concerning given that our lab is looking to use even higher-resolution fMRI datasets in the future. I profiled the searchlight call, and approximately 90% of that time is spent mapping feature IDs to linear voxel IDs (the function feature_id2linear_voxel_ids).
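
For reference, the kind of profile mentioned above can be produced with
the standard library alone; this is a minimal sketch, and 'sl' and 'ds'
are stand-ins for the searchlight measure and dataset, not names from
the actual script:

import cProfile
import pstats

# Profile a single searchlight run and write the raw stats to disk;
# 'sl' and 'ds' are placeholders for your searchlight measure and dataset.
cProfile.run('result = sl(ds)', 'searchlight.prof')

# Show the 20 most expensive calls by cumulative time.
pstats.Stats('searchlight.prof').sort_stats('cumulative').print_stats(20)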
> 
> From mvpa2.misc.surfing.queryengine, you are using the SurfaceVoxelsQueryEngine, not the SurfaceVerticesQueryEngine? Only the former should be using the feature_id2linear_voxel_ids function. 
> 
> (When instantiating a query engine through disc_surface_queryengine, the Vertices variant is the default; the Voxels variant is used when output_modality='volume'.)
> 
> For the typical surface-based analysis, the output is a surface-based dataset, and the SurfaceVerticesQueryEngine is used for that. When using the SurfaceVoxelsQueryEngine, the output is a volumetric dataset.
> 
>> I looked into the source code, and it appears to use the 'in' keyword on a list, which has to scan every element of the list on each iteration of the list comprehension; that function is then called for every feature. This might account for the slowdown. I'm wondering if there is a way to work around this or speed it up.
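
As a minimal illustration of the cost described above (not PyMVPA code):
'x in some_list' scans the whole list, so one lookup per feature is
quadratic overall, whereas the same test against a set is roughly
constant time per lookup:

import timeit

voxel_ids_list = list(range(200000))
voxel_ids_set = set(voxel_ids_list)
probes = range(0, 200000, 40)

# Membership test against a list: O(len(list)) per lookup.
t_list = timeit.timeit(lambda: [p for p in probes if p in voxel_ids_list],
                       number=1)
# The same lookups against a set: O(1) on average per lookup.
t_set = timeit.timeit(lambda: [p for p in probes if p in voxel_ids_set],
                      number=1)
print('list: %.3fs  set: %.3fs' % (t_list, t_set))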
> 
> When using the SurfaceVoxelsQueryEngine, the Euclidean distance between each node (on the surface) and each voxel (in the volume) is computed. My guess is that this is responsible for the slow-down. It could probably be made faster by dividing the 3D space into blocks, assigning nodes and voxels to each block, and then computing distances between nodes and voxels only within the same block and across neighbouring ones. (A somewhat similar approach is taken in mvpa2.support.nibabel.Surface.map_to_high_resolution_surf.) But that would take some time to implement and test. How important is this feature for you? Is there a particular reason why you would want the output to be a volumetric, not surface-based, dataset?
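
A rough sketch of that blocking idea, for illustration only (the
function name, block size, and radius handling here are assumptions,
not PyMVPA code):

import numpy as np
from collections import defaultdict
from itertools import product

def node_voxel_pairs(node_xyz, voxel_xyz, radius):
    """Yield (node_index, voxel_index, distance) for pairs within
    'radius' mm, comparing only points that fall in the same or
    neighbouring cubic blocks. Both inputs are (N, 3) arrays of mm
    coordinates."""
    size = float(radius)  # block edge >= radius, so 3x3x3 neighbourhoods suffice
    blocks = defaultdict(list)
    for j, xyz in enumerate(voxel_xyz):
        blocks[tuple(np.floor(xyz / size).astype(int))].append(j)
    for i, xyz in enumerate(node_xyz):
        block = np.floor(xyz / size).astype(int)
        for offset in product((-1, 0, 1), repeat=3):
            for j in blocks.get(tuple(block + offset), ()):
                d = np.linalg.norm(xyz - voxel_xyz[j])
                if d <= radius:
                    yield i, j, d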

Nick,

To clarify, are you saying that SurfaceVerticesQueryEngine runs the
classifier (or other measure) on sets of vertices, not sets of voxels?
I'm not familiar enough with AFNI surfaces, but the ratio of vertices to
intersecting voxels in FreeSurfer is about 6:1. If a searchlight is a
set of vertices, how is the implicit resampling accounted for?

Sorry if this is explained in the documentation. I have been using my
own FreeSurfer-based implementation, which uses the surface only to
generate sets of voxels, so I haven't been keeping close tabs on how
PyMVPA's AFNI-based one works.

Also, if mapping vertices to voxel IDs is a serious bottleneck, you can
have a look at my query engine
(https://github.com/effigies/PyMVPA/blob/qnl_surf_searchlight/mvpa2/misc/neighborhood.py#L383).
It uses FreeSurfer vertex map volumes (see: mri_surf2vol --vtxvol),
where each voxel contains the ID of the vertex nearest its center. Maybe
AFNI has something similar?
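
For what it's worth, the core trick is small enough to sketch here; the
file name is a placeholder, and the real query engine linked above does
more bookkeeping around it. Since every voxel in the --vtxvol output
stores the ID of its nearest vertex, both directions of the mapping
reduce to array lookups:

import numpy as np
import nibabel as nib
from collections import defaultdict

# 'lh.vtxvol.nii.gz' stands in for the output of mri_surf2vol --vtxvol.
vtxvol = nib.load('lh.vtxvol.nii.gz')
vertex_ids = np.asarray(vtxvol.dataobj).squeeze().astype(int)

# Voxel -> vertex: a plain array index.
def vertex_for_voxel(ijk):
    return int(vertex_ids[tuple(ijk)])

# Vertex -> voxels: invert the map once; negative values are assumed to
# mark voxels that were not assigned to any vertex.
vertex_to_voxels = defaultdict(list)
for ijk, v in np.ndenumerate(vertex_ids):
    if v >= 0:
        vertex_to_voxels[int(v)].append(ijk)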

-- 
Christopher J Markiewicz
Ph.D. Candidate, Quantitative Neuroscience Laboratory
Boston University
