[Reproducible-builds] build paths and debug info for generated C objects

Daniel Kahn Gillmor dkg at fifthhorseman.net
Sat Dec 5 02:38:54 UTC 2015


Hey all--

Niels, anthraxx and i were just commiserating about the fact that we're
punting on reproducibility of the build path.  We think we might have
found a way to make progress on this.

Problem Statement
-----------------

One of the main concerns about the build path is that gcc includes it
in any generated DWARF [0] debugging symbols, specifically in the
DWARF attribute named DW_AT_comp_dir.

Background
----------

gcc already allows the user to tweak this attribute directly:

      -fdebug-prefix-map=old=new
           When compiling files in directory old, record debugging information
           describing them as in new instead.

So, for example, i can do:

   gcc -fdebug-prefix-map=$(pwd)=. -o test test.c

gdb still works for me when debugging code that is built this way.
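
To illustrate what such a mapping does conceptually, here is a minimal
sketch in C of a prefix substitution.  This is not gcc's actual code:
the function name remap_prefix and the paths in the usage note are
hypothetical, and the sketch assumes a plain string-prefix comparison
with no awareness of path components.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not taken from gcc: if PATH starts with
   OLD_PREFIX, write NEW_PREFIX followed by the remainder of PATH into
   OUT; otherwise copy PATH unchanged.  Returns 0 on success and -1 if
   OUT is too small to hold the result.  */
int
remap_prefix (const char *path, const char *old_prefix,
              const char *new_prefix, char *out, size_t out_size)
{
  assert (path && old_prefix && new_prefix && out);

  size_t old_len = strlen (old_prefix);
  int n;

  /* Plain string comparison: an empty OLD_PREFIX never matches.  */
  if (old_len > 0 && strncmp (path, old_prefix, old_len) == 0)
    n = snprintf (out, out_size, "%s%s", new_prefix, path + old_len);
  else
    n = snprintf (out, out_size, "%s", path);

  return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```

Under these assumptions, remapping "/build/proj" with old="/build/proj"
and new="." gives ".", while a path outside the build tree such as
"/usr/include" is left alone.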

However, gcc also records all the switches used during the build in
DW_AT_producer, so doing this just moves the build path to a different
DWARF attribute: it is still encoded in the output.  This doesn't solve
the reproducibility problem on its own, but it does give us a way to
demonstrate that removing the data from DW_AT_comp_dir doesn't cripple
our ability to debug.

We also observed that DW_AT_name already stores the name of the compiled
file relative to the DW_AT_comp_dir -- this poses no reproducibility
problems on its own.

Proposed Solution
-----------------

A minor change to gcc:

 * if the "old" parameter for -fdebug-prefix-map starts with a literal $
   character, make gcc treat the rest of it as the name of an
   environment variable and use that variable's value as the old
   prefix.  So (note the shell escaping):

    export SOURCE_BUILD_DIR=$(pwd)
    gcc -g -fdebug-prefix-map=\$SOURCE_BUILD_DIR=. -o test.o -c test.c

   should do what we need: the gcc flags are static, and the build path
   is stripped.

   - What to do if the chosen env var isn't set?  Probably just skip the
     match entirely, and maybe raise a warning.

   - What about bizarre theoretical build paths that have $ as their
     leading character?  We don't know of any.  We're willing to
     sacrifice them for this feature. :)
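
The lookup described above could be sketched roughly as follows.  This
is only an illustration and not the attached patch: the function name
resolve_old_prefix is hypothetical, and treating an empty variable the
same as an unset one is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not the attached patch: resolve the "old" half
   of a prefix-map argument.  A leading '$' means "look the rest up as
   an environment variable name".  Returns NULL when the variable is
   unset (or empty, which is an assumption of this sketch), so that the
   caller can skip the mapping and perhaps emit a warning.  */
const char *
resolve_old_prefix (const char *old)
{
  assert (old != NULL);

  if (old[0] != '$')
    return old;                 /* literal path: existing behaviour */

  const char *value = getenv (old + 1);
  if (value == NULL || value[0] == '\0')
    return NULL;                /* nothing to match against: skip */

  return value;
}
```

With SOURCE_BUILD_DIR exported as in the example above, passing
"$SOURCE_BUILD_DIR" would yield the build directory, while a literal
path is returned unchanged.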

I've patched GCC to work this way successfully:

-------------- next part --------------
A non-text attachment was scrubbed...
Name: debug-prefix-map-from-env.patch
Type: text/x-diff
Size: 1754 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/reproducible-builds/attachments/20151205/05fd1d61/attachment.patch>
-------------- next part --------------

What do r-b people think about this?  I'm happy to try to push this
patch to the gcc upstream if folks here think this sounds reasonable and
would address a real future r-b issue.


Alternate Solutions
-------------------

We considered and discarded several other possible solutions, which i'm
noting below, along with the downsides that led us to select the one we
chose:


 * ask gcc to not record -fdebug-prefix-map options in DW_AT_producer

    - it's weird that some options wouldn't be recorded and some
      would.
      
    - build systems would need to set dynamic CFLAGS to be able to
      use this approach.  debian can do this in dpkg-buildpackage, but
      apparently it's tougher on Arch (though Arch can more easily
      set dynamic environment variables).

We also came up with three different ideas for new gcc flags.  All of
them share the problem that a new gcc option would make builds that try
to apply the prefix map fail hard when using a non-updated gcc:

 * gcc -fdebug-prefix-map-from-env=NEW

   This evaluates a specific, fixed environment variable like
   SOURCE_BUILD_ROOT as the "old" part of the prefix map.

    - asking gcc to adopt a new magic environment variable seems
      sketchy.

 * gcc -fdebug-prefix-map-from-env=ENVVAR=NEW

   This does the same thing as the main proposal, but it uses a
   distinct flag and doesn't expect the leading $ prefix.  e.g.

       gcc -fdebug-prefix-map-from-env=SOURCE_BUILD_ROOT=.

 * gcc -fdebug-force-path-to=NEW

   This approach just forces the value of all generated DW_AT_comp_dir
   attributes, which might be overkill.

    - this fails to record the paths relative to the build directory in
      the event that a recursive descent build pattern (a tree of
      Makefiles) is used.  That is, if the top level Makefile does both
      "make -C src1" and "make -C src2", then debug info from files
      named foo.c in each directory will be indistinguishable, even
      within the same project.
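
The ambiguity described in that last point can be modelled with a small
C sketch.  Both functions are hypothetical models of the two
approaches, not gcc code, and the directory names in the usage note are
made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of -fdebug-force-path-to=NEW: the real compilation
   directory CWD is ignored and NEW_DIR is recorded instead.  */
int
comp_dir_forced (const char *cwd, const char *new_dir,
                 char *out, size_t out_size)
{
  assert (cwd && new_dir && out);
  (void) cwd;                   /* discarded on purpose */

  int n = snprintf (out, out_size, "%s", new_dir);
  return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}

/* Hypothetical model of a prefix map: only a leading TOP_DIR is
   replaced by NEW_DIR, so any subdirectory below it survives.  */
int
comp_dir_mapped (const char *cwd, const char *top_dir, const char *new_dir,
                 char *out, size_t out_size)
{
  assert (cwd && top_dir && new_dir && out);

  size_t top_len = strlen (top_dir);
  int n;

  if (top_len > 0 && strncmp (cwd, top_dir, top_len) == 0)
    n = snprintf (out, out_size, "%s%s", new_dir, cwd + top_len);
  else
    n = snprintf (out, out_size, "%s", cwd);

  return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```

In this model, builds run in "/build/proj/src1" and "/build/proj/src2"
both end up with "." when the path is forced, whereas mapping
"/build/proj" to "." keeps them apart as "./src1" and "./src2".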

feedback welcome,

     --dkg


[0] http://dwarfstd.org/doc/DWARF4.pdf