[Pkg-openmpi-maintainers] Bug#848574: openmpi: frequent segfault in linker on mips64el

James Cowgill jcowgill at debian.org
Fri Feb 3 17:02:33 UTC 2017


Hi,

I have been looking at this on and off for some time now, but I have
not been able to make much progress on it.

Whatever the bug is, it corrupts the heap, and that makes the process
segfault before MPI_Init returns. Most runs end in a segfault, but
some print other openmpi errors instead, and some end with glibc
aborting after it detects the heap corruption.

There is probably a data race between threads here as well. If I run
the test program on an idle machine with "taskset 1" (so it only runs
on 1 CPU) then the errors go away. I expect this race is also why the
specific error message is not 100% reproducible for me.
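
To put numbers on how the outcome varies, and on whether pinning to one
CPU really makes the errors go away, the test program can be run many
times and the outcomes counted. The following is only a minimal sketch
under stated assumptions, not something taken from the original report:
the helper names classify and tally_runs are hypothetical, and it
assumes Python 3 and the taskset utility are available on the machine.

```python
import collections
import signal
import subprocess


def classify(returncode):
    """Map a subprocess return code to a short outcome label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # subprocess reports death by signal N as return code -N.
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            return f"signal {-returncode}"
    return f"exit {returncode}"


def tally_runs(cmd, runs=20, pin_to_one_cpu=False):
    """Run cmd repeatedly and count how each run ended."""
    cmd = list(cmd)
    if pin_to_one_cpu:
        # "taskset 1" applies affinity mask 0x1, i.e. CPU 0 only.
        cmd = ["taskset", "1"] + cmd
    counts = collections.Counter()
    for _ in range(runs):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        counts[classify(proc.returncode)] += 1
    return counts
```

For example, tally_runs(["./mpi_test"], runs=100) can be compared with
tally_runs(["./mpi_test"], runs=100, pin_to_one_cpu=True), where
./mpi_test is a placeholder for the test program. A segfault is counted
as SIGSEGV and a glibc abort as SIGABRT; runs that only print an openmpi
error show up under their exit status.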

On Mon, 23 Jan 2017 23:20:23 +0800 YunQiang Su <wzssyqa at gmail.com> wrote:
> It is quite strange that it won't fail if run with gdb.

I guess this is due to gdb slowing down certain threads.

Thanks,
James
