[Reproducible-builds] Ideas

Hilko Bengen bengen at debian.org
Mon Feb 3 16:15:22 UTC 2014


Hi,

After returning home from FOSDEM, where I attended Lunar's talk, I have
read and started thinking about the wiki page at
<https://wiki.debian.org/ReproducibleBuilds>.

I realize that getting Debian into a state where all binary packages are
built in a reproducible way is going to be a *lot* of work, so I wonder
whether the work can be structured in a way that some useful sub-goals
can be reached maybe even in time for jessie (or jessie+1).

Here are some ideas:

1. Get the most "interesting" / most "useful" pieces done first:

Can we get reproducible builds for the set of packages needed for a
buildd host? 

What about the "standard" install and various "*-server" tasks as
offered by tasksel? (database, dns, file, mail, print, ssh, web)?

2. Concentrate on the contents of a binary package instead of the
package itself:

In his talk, Lunar mentioned that some patches for dpkg (tar file order,
timestamps) for creating reproducible .deb packages have not been
integrated yet. As far as I understand, setting arbitrary timestamps in
the .deb files seems to be a controversial feature...

However, since binary packages are little more than a vehicle for
transporting files to the machine where they will be installed, I think
that focussing on the contents of the .deb archives might be an
alternative.

Like so:

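dpkg-deb --raw-extract "$pkg" "$unpack_dir"
(
    cd "$unpack_dir" &&
    # NUL-separated names and a fixed locale keep the order stable and
    # make file names containing spaces or quotes safe.
    find . -type f -print0 | LC_ALL=C sort -z |
        xargs -0 -I{} sh -c 'md5sum "$1"; stat -c "%a %n" "$1"' sh {}
)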

One could even take a checksum of the resulting string and use it to
compare different builds of a given package. This approach would also
ignore the compression of control.tar and data.tar within the .deb
package.
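
A rough sketch of how a single checksum over that listing could be used
to compare two builds. It assumes GNU findutils and coreutils; the
function names content_digest and compare_debs are made up for this
example, and sha256sum is swapped in for md5sum on my own initiative.
Like the listing it is based on, it only looks at regular files, so
symlinks and ownership are not covered.

```sh
#!/bin/sh
# content_digest DIR (hypothetical helper): print one SHA-256 digest
# computed over the sorted per-file checksums and permission bits of
# all regular files below DIR.
content_digest() {
    (
        cd "$1" || exit 1
        find . -type f -print0 | LC_ALL=C sort -z |
            xargs -0 -I{} sh -c 'sha256sum "$1"; stat -c "%a %n" "$1"' sh {}
    ) | sha256sum | cut -d' ' -f1
}

# compare_debs A.deb B.deb (hypothetical helper): succeed if the
# unpacked contents of both packages produce the same digest.
compare_debs() {
    tmp=$(mktemp -d) || return 1
    dpkg-deb --raw-extract "$1" "$tmp/a" &&
        dpkg-deb --raw-extract "$2" "$tmp/b" &&
        [ "$(content_digest "$tmp/a")" = "$(content_digest "$tmp/b")" ]
    rc=$?
    rm -rf "$tmp"
    return $rc
}
```

Something like "compare_debs first.deb second.deb && echo same" would
then report whether two builds carry identical file contents and modes,
regardless of how the .deb members were compressed or timestamped.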

Cheers,
-Hilko


