[Debichem-devel] Bug#683467: aces3: divide-by-zero error in many test cases when running on one core only

Michael Banck mbanck at debian.org
Wed Aug 1 00:18:58 UTC 2012


package: aces3
severity: important
tags: upstream
version: 3.0.6-1

> I now ran the full testsuite, and unfortunately I get a floating point
> exception in tran_rhf_ao_sv1.sio:
> 
> | An instruction timer report will be printed
> |
> | Gather on company_rank succeeded.
> | Static pre-defined array #           2  is first used on line
> |328
> |
> |Program received signal SIGFPE: Floating-point exception - erroneous
> |arithmetic operation.
> |
> |Backtrace for this error:
> |#0  0x2AEA062F8667
> |#1  0x2AEA062F8C34
> |#2  0x2AEA06F104EF
> |#3  0x447C1C in vtdemo_init_ at vtdemo_init.F:814

That line reads

                  iproc_company_rank = mod(next_server-1,niocompany)

> #0  vtdemo_init (optable=..., noptable=245, array_table=..., narray_table=<error reading variable: Cannot access memory at address 0xc9>, index_table=..., nindex_table=32, 
>     segment_table=..., nsegment_table=193, scalar_table=..., nscalar_table=13, block_map_table=..., nblock_map_table=27364, proctab=..., address_table=..., blocksize=1185921, 
>     end_nfps=..., nshells=16, scf_energy=438.55129855222071, totenerg=0, damp_init=0.19999998807907104, cc_conv=9.9999999999999995e-08, scf_conv=9.9999999999999995e-07, 
>     stabvalue=0, excite=0, eom_tol=0, eom_roots=0, io_company_id=2, niocompany=0, need_predef=..., npre_defined=19, dryrun=.TRUE.) at vtdemo_init.F:814

and niocompany is indeed zero.

Apparently this code is relevant, from , line 107:

         niocompany = 0
         do i = 1, nprocs
            if (pst_get_company(i-1) .eq. io_company_id)
     *        niocompany = niocompany + 1
         enddo

We did not add a machine-specific section to tests/Makefile, so they run
with the default MPIRUN of "mpirun ./xaces3 >./job.out", resulting in
one core being used:

>         nprocs = 1

I can't remember whether the above do-loop will run at all if nprocs is
1, and I am not sure what pst_get_company() is supposed to return, but
niocompany must not stay zero, or else the mod() call quoted above will
divide by zero.
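
For illustration, here is a minimal free-form Fortran sketch of the
failure mode and one possible guard. It is not the upstream code or an
upstream fix. The function stub_get_company() is a hypothetical stand-in
for pst_get_company() and simply assumes that the only rank is not a
member of the I/O company; next_server = 1 is likewise an assumed value.
Note that a loop "do i = 1, nprocs" does execute once for nprocs = 1, so
niocompany stays zero only if that single rank is not counted.

```fortran
program niocompany_guard
  implicit none
  integer, parameter :: nprocs = 1          ! single-core run, as in the report
  integer, parameter :: io_company_id = 2   ! value seen in the gdb frame
  integer :: i, niocompany, next_server, iproc_company_rank

  ! Same counting loop as in the report; for nprocs = 1 the body runs once.
  niocompany = 0
  do i = 1, nprocs
     if (stub_get_company(i-1) == io_company_id) niocompany = niocompany + 1
  end do
  print *, 'niocompany =', niocompany

  next_server = 1                           ! assumed value, for illustration only
  if (niocompany > 0) then
     ! Safe: the divisor is known to be non-zero here.
     iproc_company_rank = mod(next_server-1, niocompany)
     print *, 'iproc_company_rank =', iproc_company_rank
  else
     ! Without this guard, mod(..., 0) would be evaluated.
     print *, 'no ranks in the I/O company; skipping mod()'
  end if

contains

  ! Hypothetical stand-in for pst_get_company(): assume every rank
  ! belongs to a company with id 1, never the I/O company.
  integer function stub_get_company(rank)
    integer, intent(in) :: rank
    stub_get_company = 1
  end function stub_get_company

end program niocompany_guard
```

Whether skipping the assignment is the right behaviour for aces3 when no
I/O company exists is a question for upstream; the sketch only shows that
the divisor needs checking before the mod() call.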

Indeed, if I run the testsuite manually with mpirun -np 2, I no longer
get a floating point exception.


