DF and signal handlers

Nikodemus Siivola nikodemus at random-state.net
Wed Mar 5 13:48:05 UTC 2008


On 3/5/08, Aurelien Jarno <aurelien at aurel32.net> wrote:
> Nikodemus Siivola wrote:
>
> > On 3/5/08, Debian Bug Tracking System <owner at bugs.debian.org> wrote:
>  >
>  >> tag 469058 + patch
>  >>  Bug#469058: sbcl doesn't reset direction flag upon exit
>  >>  There were no tags set.
>  >>  Tags added: patch
>  >
>  > Thanks for the patch, but... while I agree that it is good to change
>  > SBCL to reset the direction flag every time it is diddled, instead of
>  > just before calling C, I don't think SBCL is actually at fault here.
>  >
>  >  1. SBCL does actually reset DF before any call to foreign (GCC generated) code.
>  >     See line 236 in src/compiler/x86/c-call.lisp, and line 125 in
>  >     src/runtime/x86-assem.S.
>  >
>  >     (It is possible I'm missing out a call-path here, but even so, read on and
>  >     see if my fears are unfounded or not.)
>  >
>  >  2. If the problem was due to a foreign call, it should be deterministic.
>  >
>  >  3. If the problem was due to _returning_ to main(), it should be deterministic.
>
>
> Looks correct.
>
>
>  > What I suspect is actually going on (especially considering your
>  > statement that compiling signals/ with 4.2 avoided the issue) is that
>  > a signal handler is entered while DF is set.
>
>
> What I am sure of is that sigemptyset() from glibc is called with the
> direction flag set, and that should not happen.

Right.

I'm about to merge a patch to SBCL based on yours, which moves all DF
resets to the immediate vicinity of STDs for easier auditing, and removes
the then-unnecessary CLD instructions from foreign call sequences.
This will fix the symptoms, and be good for SBCL, but I think the
underlying problem is still there in signal handling. :/
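
To illustrate the "reset DF right next to the STD" pattern, here is a
minimal sketch in C with GCC inline assembly. It is not SBCL code: the
helper name copy_backward is hypothetical, and the assembly path assumes
GCC or Clang on x86 or x86-64 (other targets take the plain C loop).
STD and CLD sit in the same asm statement, so compiled code never runs
with DF set.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from SBCL: copy n bytes from src to dst,
   highest address first, so overlapping regions with dst > src are safe. */
static void copy_backward(unsigned char *dst, const unsigned char *src,
                          size_t n)
{
    if (n == 0)
        return;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* Point at the last byte of each region; with DF set, MOVSB walks down. */
    unsigned char *d = dst + n - 1;
    const unsigned char *s = src + n - 1;
    __asm__ volatile (
        "std\n\t"          /* set DF: string instructions decrement */
        "rep movsb\n\t"    /* copy n bytes, high to low */
        "cld"              /* clear DF again before any compiled code runs */
        : "+D" (d), "+S" (s), "+c" (n)
        :
        : "memory", "cc");
#else
    /* Portable fallback with the same high-to-low order. */
    while (n-- > 0)
        dst[n] = src[n];
#endif
}
```

A signal delivered between the STD and the CLD would still enter its
handler with DF set unless the kernel clears it, which is the remaining
problem described above.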

>  > If this is the case, then clearing it right after each REP loop where
>  > SBCL uses it just makes seeing the bug much more unlikely -- but not
>  > impossible in the presence of async signals.
>
>
> Seems correct, though I have made half a dozen builds here, without
> any problem.

That is not too surprising: there are normally no async signals
delivered during the build, but SIGSEGV is a regular occurrence (it is
used by the GC), so SIGSEGV handlers may have been seeing the DF set.

What _is_ strange is that this appears to have been random. (At least
all the reporters seemed to characterize it as semirandom behaviour.)
Multiple builds from the same source with the same host compiler
should have essentially identical GC characteristics.

>  > If so, this may also explain some _very_ hard to reproduce faults we
>  > have seen over the years: using a pre 4.3-GCC compiled libc, a signal
>  > at an inopportune moment in the middle of a REP loop could clear DF!
>  > Yikes!
>  >
>  > I'm not sure what is The Right Thing here, though. Should SBCL (and
>  > _any_ program that ever sets DF!) save, clear, and restore DF in its
>  > signal handlers? Should libc/kernel do that? Should signals be blocked
>
>
> I currently have no idea about that.

I'll see if I can cook up a small test-case using async signals. (One
that doesn't need SBCL so that it can be passed to upstream libc /
kernel people if necessary without too much friction.)

Cheers,

 -- Nikodemus


More information about the pkg-common-lisp-devel mailing list