[Ltrace-devel] DWARF prototypes: handling symbol aliases

Dima Kogan lists at dima.secretsauce.net
Wed Jul 9 08:39:43 UTC 2014


Petr Machata <pmachata at redhat.com> writes:

> Sorry this took so long again.  I'm a fresh father and things have been
> hectic the last two weeks or so.

That's ok, course. Thank you for putting in the work you have been. I
was busy, so likewise sorry for the delay.

I addressed your points. The new tree is at

https://github.com/dkogan/ltrace/tree/libsym_aliases_inexportlist

Each point you raised is in a separate patch for easier review. Probably
would be good to squash them before merging. The benchmark tree
(described below) is here:

https://github.com/dkogan/ltrace/tree/libsym_aliases_inexportlist_benchmarking


>>  https://github.com/dkogan/ltrace/tree/libsym_aliases_inexportlist
>
> 99fe555 (protolib_lookup() now uses a dict* instead of a function that
> returns it) changes meaning of the code and breaks a lot of test cases.
> The complexity is in fact necessary.

Ah yes. I rebased and killed that patch.


> Regarding 6db578849 (We now use known prototypes...):
>
> Names that start with underscore (_dtor_string, _clone_vect, _dtor_vect)
> are reserved, we shouldn't use them in ltrace itself.

Done


> Don't use logical operations in integer context (i.e. where "int"
> carries an actual value, not a boolean).

Done

> This:
>
> +	struct vect **paliases = DICT_FIND_REF(&names->addrs,
> +					       &addr, struct vect*);
>
> ... technically maps an address to a pointer to a vector of names.  We
> could actually store vectors in the hash table directly.  If you would
> be willing to make this change, that would be great, but I'm not going
> to push it ;)

The reason I did it this way was to reduce the memory inefficiency of
the hash table. As implemented, sizeof(hash table cell)=sizeof(struct
vect*), which is much smaller than sizeof(struct vect). Your suggestion
has much better malloc() overhead and fragmentation, so maybe it's still
worthwhile. Still want this change?


> This:
>
> +	result = vect_pushback(aliases, &namedup);
> +	if(result != 0)
> +		return result;
>
> ... should free(namedup).

Done


> Iterators in ltrace are generally restartable

Done


> Other than that, there's a number of instances of if not being followed
> by a space.  Same holds for while and for, though I don't recall having
> seen any offenders.

Done


>> 2. The name is still "exported_names".
> I don't really mind the name.

OK. Leaving it


>> 3. The data structure currently is not ideally efficient.
>
> Why don't we turn the logic around?  Go through the list of symbols, and
> for each of them, consult the vector of aliases that we found out
> earlier.

Ah yes. You already know this, but for my own clarity
activate_latent_in() looks like

for(export names in lib1) // hash table iteration
{
  for(symbol names in lib2) // list iteration
  {
    if(names equal && libsym->latent)
    {
      proc_activate_latent_symbol(proc, libsym)
    }
  }
}

I turned it into

for(symbol names in lib2) // list iteration
{
  if(name in lib1 export names && libsym->latent) // hash table lookup
  {
    proc_activate_latent_symbol(proc, libsym)
  }
}

It LOOKS much more efficient, but benchmarks don't show a huge
difference, so as you predicted it doesn't matter a lot. Benchmarks
before the relevant patch (only the slowest libraries shown):

library: libpthread.so.0 _activate_latent_in took 21023 ns
library: tstlib.so _activate_latent_in took 22419 ns
library: tst _activate_latent_in took 35270 ns

After the patch:

library: libpthread.so.0 _activate_latent_in took 13759 ns
library: tstlib.so _activate_latent_in took 20254 ns
library: tst _activate_latent_in took 31708 ns

In any case, the patch seems to work, but I'd be more comfortable if the
test suite told me it was good. I see the same test/suite failures
before and after, some non-deterministic. Do you know which tests
specifically are good to look at for a change like this?



>> 4. I discovered that in the libc on my machine (Debian/sid amd64) some
>> symbols appear at multiple addresses:
>
> ltrace doesn't really handle versions at all, except it knows to
> ignore them (see e.g. populate_this_symtab in ltrace-elf.c). ... But
> right now your approach makes sense.

Leaving it alone for now.

Thanks again for looking.

dima



More information about the Ltrace-devel mailing list