[Ltrace-devel] [PATCH] Add support for using elfutils as unwinder.

Mark Wielaard mjw at redhat.com
Thu Jan 9 20:10:28 UTC 2014


On Thu, 2014-01-09 at 17:15 +0100, Petr Machata wrote:
> Mark Wielaard <mjw at redhat.com> writes:
> 
> > Both issues fixed in the newly attached patch.
> 
> Now it seems to mostly work, but all my backtraces look like this:
> 
> getenv(0x404325, 0x7fff788de6f8, 0x7fff788de708, 0x403af0)                                       = 0
>  > echo(main+0x20) [4010b0]
>  > libc.so.6(__libc_start_main+0xec) [7eff6b48735c]
> 	/usr/src/debug/glibc-2.14.1/csu/libc-start.c:226
>  > echo(_start+0x28) [4016e4]
> dwfl_getthread_frames tid 21025: no matching address range
> 
> This is x86_64 Fedora 15 (I admit to being fairly conservative with my
> upgrade rate).  There's a comment at dwfl_thread_getframes that says:
> 
> > some systems return error instead of zero on end of the backtrace, for
> > cross-platform compatibility callers should consider error as a zero.
> 
> The error indeed comes from inside dwfl_thread_getframes, from the
> post-unwind return in particular.  Presumably the cause is that crt1.o
> on my system lacks .eh_frame--in Fedora 20 mock, it all works fine.

Yes, old systems didn't mark _start or _clone as end of the call stack.
So the unwinder just keeps going looking for the next frame, cannot find
any information about the current frame and has to give up.

> So, any idea how to get around this?  Ignoring DWARF_E_NO_MATCH would
> work, but extracting dwarf_error from libdwfl doesn't seem to be well
> supported.  We might take the comment at dwfl_thread_getframes
> literally, notice that we successfully unwound at least a single frame,
> and assume that any further errors should be ignored.  Since this is
> fairly obscure use-case, I can prepare a patch to this effect myself.

Distinguishing between failure to unwind anything and failure to
unwind further after at least one frame has been unwound is what I did
in eu-stack too. Actually, eu-stack distinguishes between fatal errors,
warnings and complete success. Not being able to unwind any further
after something has been unwound is regarded as a warning in that case.
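
As a rough illustration of that three-way split (not the actual
eu-stack code), something like the following would do. The enum, the
function name and the parameters are made up for this sketch. It
assumes the caller counts frames in its frame callback and that a
negative result from the unwind call means an error, with zero or a
positive callback value meaning the walk ended without one.

```c
#include <stddef.h>

/* Hypothetical outcome categories, modelled on the distinction
   described above; these names are not part of elfutils or ltrace. */
enum unwind_outcome {
	UNWIND_OK,      /* the unwinder reported no error */
	UNWIND_WARNING, /* error, but only after at least one frame */
	UNWIND_FATAL    /* error before any frame was produced */
};

/* rc:      result of the unwind call (assumed: negative on error).
   nframes: number of frames the frame callback was invoked for. */
static enum unwind_outcome
classify_unwind(int rc, size_t nframes)
{
	if (rc >= 0)
		return UNWIND_OK;
	return nframes > 0 ? UNWIND_WARNING : UNWIND_FATAL;
}
```

With this, the "no matching address range" case after _start would
come out as a warning rather than a fatal error, because three frames
had already been unwound by then.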

In the ltrace case it might just be enough to print > [...] in case of
any error to indicate we don't know whether or not there should be more
frames (beyond the -w <NR> limit).
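
A minimal sketch of that idea, again with made-up names and again
assuming a negative unwind result means an error: after the frames
that were unwound have been printed, pick a trailer line depending on
the result. The exact marker text and indentation are only a guess at
what would fit the existing backtrace output.

```c
#include <stddef.h>

/* Hypothetical helper: returns the trailer line to print after the
   unwound frames, or NULL if the unwinder reported no error.
   rc is the result of the unwind call (assumed: negative on error). */
static const char *
backtrace_trailer(int rc)
{
	/* On any error we cannot tell whether more frames exist, so
	   mark the backtrace as possibly truncated. */
	return rc < 0 ? " > [...]" : NULL;
}
```

The caller would print the returned string, if any, in place of the
current error message.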

> I still wonder what your opinion is though.  It seems as if
> dwfl_thread_getframes should return a different error number on line
> 436, but that ship has sailed.

Yeah, but it is hard to know which one, since various things can get
us into that error state: the stack could be corrupted, .eh_frame could
be missing, the CFI data could be corrupted or bad, etc. And on a
modern, up-to-date system it really should return success.

Cheers,

Mark



