[Pkg-openmpi-maintainers] Bug#592326: Failure of AZTEC test case run.

Rachel Gordon rgordon at techunix.technion.ac.il
Mon Aug 9 09:59:58 UTC 2010


package:  openmpi

dpkg --search openmpi
gromacs-openmpi: /usr/share/doc/gromacs-openmpi/copyright
gromacs-dev: /usr/lib/libmd_mpi_openmpi.la
gromacs-dev: /usr/lib/libgmx_mpi_d_openmpi.la
gromacs-openmpi: /usr/share/lintian/overrides/gromacs-openmpi
gromacs-openmpi: /usr/lib/libmd_mpi_openmpi.so.5
gromacs-openmpi: /usr/lib/libmd_mpi_d_openmpi.so.5.0.0
gromacs-dev: /usr/lib/libmd_mpi_openmpi.so
gromacs-dev: /usr/lib/libgmx_mpi_d_openmpi.so
gromacs-openmpi: /usr/lib/libmd_mpi_openmpi.so.5.0.0
gromacs-openmpi: /usr/bin/mdrun_mpi_d.openmpi
gromacs-openmpi: /usr/lib/libgmx_mpi_d_openmpi.so.5.0.0
gromacs-openmpi: /usr/share/doc/gromacs-openmpi/README.Debian
gromacs-dev: /usr/lib/libgmx_mpi_d_openmpi.a
gromacs-openmpi: /usr/bin/mdrun_mpi.openmpi
gromacs-openmpi: /usr/share/doc/gromacs-openmpi/changelog.Debian.gz
gromacs-dev: /usr/lib/libmd_mpi_d_openmpi.la
gromacs-openmpi: /usr/share/man/man1/mdrun_mpi_d.openmpi.1.gz
gromacs-dev: /usr/lib/libgmx_mpi_openmpi.a
gromacs-openmpi: /usr/lib/libgmx_mpi_openmpi.so.5.0.0
gromacs-dev: /usr/lib/libmd_mpi_d_openmpi.so
gromacs-openmpi: /usr/lib/libmd_mpi_d_openmpi.so.5
gromacs-dev: /usr/lib/libgmx_mpi_openmpi.la
gromacs-openmpi: /usr/share/man/man1/mdrun_mpi.openmpi.1.gz
gromacs-openmpi: /usr/share/doc/gromacs-openmpi
gromacs-dev: /usr/lib/libmd_mpi_openmpi.a
gromacs-dev: /usr/lib/libgmx_mpi_openmpi.so
gromacs-openmpi: /usr/lib/libgmx_mpi_openmpi.so.5
gromacs-openmpi: /usr/lib/libgmx_mpi_d_openmpi.so.5
gromacs-dev: /usr/lib/libmd_mpi_d_openmpi.a


Dear support,
I am trying to run a test case from the AZTEC library named
az_tutorial_with_MPI.f. The example uses gfortran + MPI. The
compilation and linkage stage goes O.K., generating an executable
'sample'. But when I try to run sample (on 1 or more
processors), the run crashes immediately.

The compilation and linkage stage is done as follows:

gfortran -O -I/shared/include -I/shared/include/openmpi/ompi/mpi/cxx \
    -I../lib -DMAX_MEM_SIZE=16731136 \
    -DCOMM_BUFF_SIZE=200000 -DMAX_CHUNK_SIZE=200000 \
    -c -o az_tutorial_with_MPI.o az_tutorial_with_MPI.f
gfortran az_tutorial_with_MPI.o -O -L../lib -laztec -lm -L/shared/lib \
    -lgfortran -lmpi -lmpi_f77 -o sample

The run:
/shared/home/gordon/Aztec_lib.dir/app>mpirun -np 1 sample

[cluster:12046] *** Process received signal ***
[cluster:12046] Signal: Segmentation fault (11)
[cluster:12046] Signal code: Address not mapped (1)
[cluster:12046] Failing at address: 0x100000098
[cluster:12046] [ 0] /lib/libc.so.6 [0x7fd4a2fa8f60]
[cluster:12046] [ 1] /shared/lib/libmpi.so.0(MPI_Comm_size+0x6e) [0x7fd4a376c34e]
[cluster:12046] [ 2] sample [0x4178aa]
[cluster:12046] [ 3] sample [0x402a07]
[cluster:12046] [ 4] sample [0x402175]
[cluster:12046] [ 5] sample [0x401c52]
[cluster:12046] [ 6] sample [0x448edc]
[cluster:12046] [ 7] /lib/libc.so.6(__libc_start_main+0xe6) [0x7fd4a2f951a6]
[cluster:12046] [ 8] sample [0x401a49]
[cluster:12046] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 12046 on node cluster exited 
on signal 11 (Segmentation fault).
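
The frames inside 'sample' in the backtrace above are raw addresses with no
symbol names. As a rough sketch (not part of the original report), the short
Python script below pulls those addresses out of the trace so they can be
passed to binutils' addr2line. The helper name sample_frames is hypothetical,
and source line information will only be available if 'sample' was built
with -g, which the build commands shown do not do.

```python
import re

# Backtrace lines copied from the mpirun output above (wrapped lines joined).
BACKTRACE = """\
[cluster:12046] [ 0] /lib/libc.so.6 [0x7fd4a2fa8f60]
[cluster:12046] [ 1] /shared/lib/libmpi.so.0(MPI_Comm_size+0x6e) [0x7fd4a376c34e]
[cluster:12046] [ 2] sample [0x4178aa]
[cluster:12046] [ 3] sample [0x402a07]
[cluster:12046] [ 4] sample [0x402175]
[cluster:12046] [ 5] sample [0x401c52]
[cluster:12046] [ 6] sample [0x448edc]
[cluster:12046] [ 7] /lib/libc.so.6(__libc_start_main+0xe6) [0x7fd4a2f951a6]
[cluster:12046] [ 8] sample [0x401a49]
"""

# Matches "[ N] sample [0xADDR]" and captures the address.
FRAME = re.compile(r"\[\s*\d+\]\s+sample\s+\[(0x[0-9a-f]+)\]")


def sample_frames(text):
    """Return the addresses of frames located in the 'sample' binary,
    in the order they appear in the backtrace (hypothetical helper)."""
    return FRAME.findall(text)


if __name__ == "__main__":
    # Print an addr2line command line; -f asks for function names,
    # -e names the executable to inspect.
    print("addr2line -f -e sample " + " ".join(sample_frames(BACKTRACE)))
```

Running the printed command in the directory containing 'sample' would show
which routine made the MPI_Comm_size call (frame 2), provided symbols are
present in the binary.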

Here is some information about the machine:

uname -a
Linux cluster 2.6.26-2-amd64 #1 SMP Sun Jun 20 20:16:30 UTC 2010 x86_64 GNU/Linux


lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 5.0.5 (lenny)
Release:        5.0.5
Codename:       lenny

gcc --version
gcc (Debian 4.3.2-1.1) 4.3.2

gfortran --version
GNU Fortran (Debian 4.3.2-1.1) 4.3.2

ldd sample
         linux-vdso.so.1 =>  (0x00007fffffffe000)
         libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0x00007fd29db16000)
         libm.so.6 => /lib/libm.so.6 (0x00007fd29d893000)
         libmpi.so.0 => /shared/lib/libmpi.so.0 (0x00007fd29d5e7000)
         libmpi_f77.so.0 => /shared/lib/libmpi_f77.so.0 (0x00007fd29d3af000)
         libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007fd29d198000)
         libc.so.6 => /lib/libc.so.6 (0x00007fd29ce45000)
         libopen-rte.so.0 => /shared/lib/libopen-rte.so.0 (0x00007fd29cbf8000)
         libopen-pal.so.0 => /shared/lib/libopen-pal.so.0 (0x00007fd29c9a2000)
         libdl.so.2 => /lib/libdl.so.2 (0x00007fd29c79e000)
         libnsl.so.1 => /lib/libnsl.so.1 (0x00007fd29c586000)
         libutil.so.1 => /lib/libutil.so.1 (0x00007fd29c383000)
         libpthread.so.0 => /lib/libpthread.so.0 (0x00007fd29c167000)
         /lib64/ld-linux-x86-64.so.2 (0x00007fd29ddf1000)


Let me just mention that the C+MPI test case of the AZTEC library,
'az_tutorial.c', runs with no problem.
Also, az_tutorial_with_MPI.f runs O.K. on my 32-bit Linux cluster running
gcc, g77 and MPICH, and on my 16-processor SGI
Itanium 64-bit machine.

Thank you for your help,

Sincerely,
Rachel

   Dr.  Rachel Gordon
   Senior Research Fellow   		Phone: +972-4-8293811
   Dept. of Aerospace Eng.		Fax:   +972 - 4 - 8292030
   The Technion, Haifa 32000, Israel     email: rgordon at tx.technion.ac.il






