Bug#944617: python3-h5py import performance severely degraded in 2.10.0 release (due to OpenMPI?)

Drew Parsons dparsons at debian.org
Mon Nov 18 04:29:17 GMT 2019


Source: h5py
Followup-For: Bug #944617

There is additional overhead in h5py 2.10 compared to 2.8.

Comparing 2.10 with and without mpi support shows that the import
is slower with mpi by a factor of only 2-3, rather than ×7.


h5py 2.10.0 with mpi support:

$ multitime -qq -n 10 python3 -c 'import h5py'
===> multitime results
1: -qq python3 -c "import h5py"
            Mean        Std.Dev.    Min         Median      Max
real        0.696       0.123       0.637       0.659       1.065
user        0.608       0.052       0.480       0.617       0.665       
sys         0.313       0.049       0.196       0.315       0.394       



h5py 2.10.0 without mpi support:

$ multitime -qq -n 10 python3 -c 'import h5py'
===> multitime results
1: -qq python3 -c "import h5py"
            Mean        Std.Dev.    Min         Median      Max
real        0.293       0.048       0.260       0.270       0.414       
user        0.549       0.036       0.479       0.552       0.605       
sys         0.269       0.022       0.228       0.264       0.301       




But note that this test only measures the time taken to load the h5py
module itself, so it is not a good measure of performance with mpi
support available. It's not fair to characterise it as ×2.5 slower,
since this is a one-off cost paid at import, i.e. the relevant
quantity here is the additional 0.4 sec of real (wall-clock) time
needed to load the module. It's a bit of a stretch to say that 0.4 sec
is a severe performance penalty, I think.
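
For reference, the two figures can be recomputed from the mean "real"
times in the multitime output above. This is only a small arithmetic
sketch; the variable names are mine and purely illustrative.

```python
# Mean "real" times in seconds, copied from the multitime output above.
real_with_mpi = 0.696
real_without_mpi = 0.293

# Relative slowdown of the import, and the absolute extra time per import.
ratio = real_with_mpi / real_without_mpi
extra = real_with_mpi - real_without_mpi

print(f"slowdown factor: x{ratio:.1f}")
print(f"extra import time: {extra:.1f} s")
```

The ratio looks large only because the baseline is small; the absolute
difference is what a user pays, once per Python process.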


To measure performance, you would need to measure the time taken to
work with actual data files, e.g. to load a large data file (say
4-8 GB of data).  It would be interesting if you could run this kind
of performance test.
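
A test of that kind could look roughly like the sketch below. It is a
minimal example only, and rests on some assumptions: the helper names
(time_call, read_whole_dataset, summarise) are hypothetical, the file
path and dataset name are whatever you pass on the command line, and
the read is a plain serial one that does not use the mpi driver. Repeat
reads will probably be served from the OS page cache, so the first
timing is likely the most representative of a cold load.

```python
import statistics
import sys
import time


def time_call(func, repeats=5):
    """Return wall-clock durations (seconds) of repeated calls to func."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def read_whole_dataset(path, dataset_name):
    """Read one dataset fully into memory and return it."""
    import h5py  # already cached by main(), so import cost is not timed

    with h5py.File(path, "r") as f:
        return f[dataset_name][()]


def summarise(durations):
    """Reduce a list of durations to mean, min and max."""
    return {
        "mean": statistics.mean(durations),
        "min": min(durations),
        "max": max(durations),
    }


def main(argv):
    if len(argv) != 3:
        print("usage: python3 time_read.py FILE.h5 DATASET_NAME")
        return
    path, dataset_name = argv[1], argv[2]

    import h5py  # noqa: F401  (pay the one-off import cost before timing)

    durations = time_call(lambda: read_whole_dataset(path, dataset_name))
    print("per-run seconds:", [f"{d:.3f}" for d in durations])
    print("summary:", summarise(durations))


if __name__ == "__main__":
    main(sys.argv)
```

Running the same script against both builds (with and without mpi
support) on the same file would show whether the data-handling time
differs, separately from the import cost.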


More information about the debian-science-maintainers mailing list