[Reproducible-builds] [GSoC 2016] : Application review

Satyam Zode satyamzode at gmail.com
Mon Mar 21 09:30:11 UTC 2016

Hi, everyone!

 Jérémy Bobbio:
> Thanks for your application. I much appreciated that it's done before
> the deadline. I also like you being clear about your other commitments.
> I think it would have been a fine application for last summer, but we've
> made significant progress on several fronts since, therefore I'm not
> convinced that there's much work left in the tasks you propose. I think
> it would be better to aim for more precise tasks, e.g. toolchain
> software you'll be improving to solve classes of issues, or at least an
> outline of the features you intend to add to strip-nondeterminism or
> diffoscope.
> (If you feel you're part of the reproducible builds team and disagree
> with my comments, please say so!)

Thanks, Lunar, for this valuable feedback. Yes, I agree with you.
After reading the reasons you mentioned, and as a part of the
reproducible builds team, I don't think the work I proposed (which
would not need a whole summer) would help the reproducible builds
effort much either. But I think there are some issues which still need
to be fixed. There are some issues for which not even a single package
has a patch. I will try to look into those and will try to search for
solutions.

So this summer I intend to work on:
1) Improvements to diffoscope:
1.1)  Allow users to ignore arbitrary differences (addition of an
ignore-profiles flag).
1.2)  Perform fuzzy-matching across archives.
1.3)  Finish the parallel processing part.
The above points are mentioned on the GSoC wiki. There are also more
features mentioned in the wishlist;
I will try to cover some of those too.
I guess better/smarter ELF diffing is under development (I have checked
the git logs and diffoscope for this).
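
To illustrate what point 1.2 could mean, here is a minimal sketch that
pairs up member names from two archives when they are not identical
but very similar (for example a library whose version suffix changed).
This is only an assumption of one possible approach, not how diffoscope
does or will do it: the function name fuzzy_match_members and the
cutoff value are hypothetical, and it compares names only, not file
contents.

```python
import difflib


def fuzzy_match_members(old_names, new_names, cutoff=0.6):
    """Pair member names from two archives (hypothetical helper).

    Exact name matches are paired first. Each remaining old name is then
    paired with its most similar unmatched new name, if the similarity
    ratio reported by difflib is at least `cutoff`.
    Returns (pairs, only_in_old, only_in_new).
    """
    unmatched_new = set(new_names)
    pairs = []
    leftovers = []

    # Pass 1: exact matches.
    for name in old_names:
        if name in unmatched_new:
            pairs.append((name, name))
            unmatched_new.discard(name)
        else:
            leftovers.append(name)

    # Pass 2: closest remaining name, based on string similarity only.
    only_in_old = []
    for name in leftovers:
        candidates = difflib.get_close_matches(
            name, sorted(unmatched_new), n=1, cutoff=cutoff
        )
        if candidates:
            pairs.append((name, candidates[0]))
            unmatched_new.discard(candidates[0])
        else:
            only_in_old.append(name)

    return pairs, only_in_old, sorted(unmatched_new)
```

With such a pairing, the two members of each pair could then be
compared with each other instead of being reported as one removed file
and one added file.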

2) Improving reproducibility of Debian packages:
In this section I will be fixing Debian packages and will try to find
solutions to the issues which do not have a solution yet. I am
trying to list such issues.

> If you look at packages identified as leaving timestamps in gzip
> headers, you'll see that most of them already have patches, and the ones
> that don't are affected by other issues
> https://tests.reproducible-builds.org/issues/unstable/timestamps_in_gzip_headers_issue.html
> These other issues probably deter maintainers' motivation to fix the
> problems with gzip timestamps.
> Almost all packages with varying mtimes in data.tar or control.tar have
> patches or have been fixed through toolchain improvements:
> https://tests.reproducible-builds.org/issues/unstable/varying_mtimes_in_data_tar_gz_or_control_tar_gz_issue.html
> It feels quite suboptimal to highlight users and groups in tarballs as
> separate issues as I think all are affected by other tarball related
> issues. They should be fixed at the same time:
> https://tests.reproducible-builds.org/issues/unstable/users_and_groups_in_tarball_issue.html
> Regarding timestamps due to C pre-processor macros, Dhole is waiting
> for GCC patch window to open again—which will be in April, IIRC.
> So unless you intend to work on adding support for SOURCE_DATE_EPOCH in
> clang, I'm not sure there's much work left on this issue. I believe that
> fixing the 400+ packages individually should not be undertaken if
> we can avoid it.
> https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01402.html
> https://wiki.debian.org/ReproducibleBuilds/TimestampsFromCPPMacros
> Emmanuel Bourg has been working on and fixing almost all Java-related
> issues in the course of the past year. I expect he'll probably work on
> fixing this locale-related javadoc issue in the near future. I guess you
> could coordinate with him to write the necessary patches, though.
> https://tests.reproducible-builds.org/issues/unstable/locale_in_documentation_generated_by_javadoc_issue.html
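
For my own understanding of the SOURCE_DATE_EPOCH and gzip header
points quoted above, here is a minimal sketch in Python of how a build
step could take its timestamp from the SOURCE_DATE_EPOCH environment
variable and write it into the gzip header instead of the current time.
The helper names (build_timestamp, gzip_bytes, gzip_header_mtime) are
hypothetical and only for illustration; this is not the GCC patch or
any existing toolchain code.

```python
import gzip
import io
import os
import struct
import time


def build_timestamp(environ=os.environ):
    """Return SOURCE_DATE_EPOCH as an int if set, else the current time."""
    value = environ.get("SOURCE_DATE_EPOCH")
    return int(value) if value is not None else int(time.time())


def gzip_bytes(data, mtime):
    """Compress `data` with a fixed mtime and no file name in the header."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename="", mode="wb", fileobj=buf, mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


def gzip_header_mtime(blob):
    """Read the 32-bit little-endian MTIME field (bytes 4 to 7) of a gzip header."""
    return struct.unpack("<I", blob[4:8])[0]
```

When the variable is set, two runs over the same input give
byte-identical output, which is the property the gzip timestamp issue
is about.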

A big thanks to you, because I really didn't know about many of the
above things. It's good to know that people are already working on this
part. :-)

> These quick evaluations leave me the feeling that your proposed schedule
> is currently not adequate with actual needs of the reproducible builds
> effort.
> This probably means that progress can be made on making more visible
> areas that actually require work…

Please let me know what you think about the work which I have proposed
now. I will frame the timeline accordingly.

PS: I really feel that I am part of reproducible builds and I want to
strengthen the bond by spending my summer working with reproducible
builds ;-)
