From: Ximin Luo <infinity0@pwned.gg>
To: GCC Patches <gcc-patches@gcc.gnu.org>
Cc: Ximin Luo <infinity0@pwned.gg>
Subject: [PING^4][PATCH v2] Generate reproducible output independently of the build-path
Date: Fri, 21 Jul 2017 16:16:00 -0000
Message-ID: <20170721161538.7508-1-infinity0@pwned.gg>

(Please keep me on CC, as I am not subscribed.)


Proposal
========

This patch series adds a new environment variable BUILD_PATH_PREFIX_MAP. When
this is set, GCC will treat this as extra implicit "-fdebug-prefix-map=$value"
command-line arguments that precede any explicit ones. This makes the final
binary output reproducible, and also hides the unreproducible value (the source
path prefixes) from CFLAGS et al., which many build tools (understandably)
embed as-is into their build output.

This environment variable also acts on the __FILE__ macro, mapping it in the
same way that debug-prefix-map works for debug symbols. We have seen that
__FILE__ is also a very large source of unreproducibility, and is represented
quite heavily in the 3k+ figure given earlier.
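
As a minimal illustration of how __FILE__ leaks the build path (the function
names below are hypothetical and not part of the patch): any code that embeds
__FILE__, for example in a logging helper, stores the path exactly as it was
passed to the compiler, so building the same source under a different
directory can change the bytes of the resulting object file.

```c
#include <stdio.h>

/* Returns the path the preprocessor recorded for this translation unit.
   The string is fixed at compile time and stored in the binary. */
static const char *source_path(void)
{
    return __FILE__;
}

/* Hypothetical logging helper: every call site drags the recorded
   source path into the program's read-only data. */
static void log_error(const char *msg)
{
    fprintf(stderr, "%s: %s\n", source_path(), msg);
}
```

If the file is compiled via an absolute path such as /tmp/build-1234/src/foo.c,
that whole string ends up in the output unless it is remapped.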

Finally, we tweak the mapping algorithm so that it applies only to whole path
components when matching prefixes. This is justified in further detail in the
patch header. It is an optional part of the patch series and could be dropped
if the GCC maintainers are not convinced by our arguments there.
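
The difference between plain string-prefix matching and whole-component
matching can be sketched as follows. This is only an illustration under
assumed rules, not the code from the patch: the function names are
hypothetical, '/' is assumed to be the only separator, and the handling of an
empty prefix or a prefix ending in '/' is a guess at reasonable behaviour.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if `prefix` is a prefix of `path` that ends on a path-component
   boundary, so "/build/foo" matches "/build/foo/a.c" but not
   "/build/foobar/a.c". */
static bool prefix_matches_component(const char *path, const char *prefix)
{
    size_t n = strlen(prefix);

    if (n == 0)
        return true;                /* assumed: empty prefix matches all */
    if (strncmp(path, prefix, n) != 0)
        return false;               /* not even a string prefix */

    /* Boundary: end of path, a separator follows, or the prefix itself
       already ends in a separator. */
    return path[n] == '\0' || path[n] == '/' || prefix[n - 1] == '/';
}

/* Writes the remapped path into `out` (truncating if it does not fit)
   and returns `out`. Paths that do not match are copied unchanged. */
static const char *remap_path(const char *path,
                              const char *old_prefix,
                              const char *new_prefix,
                              char *out, size_t out_size)
{
    if (prefix_matches_component(path, old_prefix))
        snprintf(out, out_size, "%s%s", new_prefix,
                 path + strlen(old_prefix));
    else
        snprintf(out, out_size, "%s", path);
    return out;
}
```

With a mapping from "/build/foo" to ".", this sketch rewrites
"/build/foo/a.c" to "./a.c" but leaves "/build/foobar/a.c" alone, whereas
plain string-prefix matching would turn the latter into ".bar/a.c".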


Background
==========

We have prepared a document that describes how this works in detail, so that
projects can be confident that they are interoperable:

https://reproducible-builds.org/specs/build-path-prefix-map/

The specification is currently in DRAFT status, awaiting some final feedback,
including what the GCC maintainers think about it.

We have written up some more detailed discussions on the topic, including a
thorough justification of why we chose the mechanism of environment variables:

https://wiki.debian.org/ReproducibleBuilds/StandardEnvironmentVariables

The previous iteration of the patch series, essentially the same as the current
re-submission, is here:

https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00513.html

An older version, that explains some GCC-specific background, is here:

https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00182.html

The current patch series applies cleanly to GCC-8 snapshot 20170716.


Reproducibility testing
=======================

Over the past 3 months, we have tested this patch backported to Debian GCC-6.
Together with a patched dpkg that sets the environment variable appropriately,
it allows us to reproduce ~1800 extra packages.

This is about 6.8% of ~26400 Debian source packages, and just over 1/2 of the
ones whose irreproducibility is due to build-path issues.

https://tests.reproducible-builds.org/debian/issues/unstable/gcc_captures_build_path_issue.html
https://tests.reproducible-builds.org/debian/unstable/index_suite_amd64_stats.html

The first major increase around 2017-04 is due to us deploying this patch. The
next major increase later in 2017-04 is unrelated, due to us deploying a patch
for R. The dip during the last part of 2017-06 is due to unpatched and patched
packages getting out of sync, partly because of extra admin work around the
Debian stretch release. We believe that the green will soon return to its
previous high once this situation settles.


Unit testing
============

I've tested these patches on a Debian unstable x86_64-linux-gnu schroot running
inside a Debian jessie system, on a full-bootstrap build. The output of
contrib/compare_tests is as follows:

~~~~
gcc-8-20170716$ contrib/compare_tests ../gcc-build-{0,1}
# Comparing directories
## Dir1=../gcc-build-0: 8 sum files
## Dir2=../gcc-build-1: 8 sum files

# Comparing 8 common sum files
## /bin/sh contrib/compare_tests  /tmp/gxx-sum1.13468 /tmp/gxx-sum2.13468
New tests that PASS:

gcc.dg/cpp/build_path_prefix_map-1.c (test for excess errors)
gcc.dg/cpp/build_path_prefix_map-1.c execution test
gcc.dg/cpp/build_path_prefix_map-2.c (test for excess errors)
gcc.dg/cpp/build_path_prefix_map-2.c execution test
gcc.dg/debug/dwarf2/build_path_prefix_map-1.c (test for excess errors)
gcc.dg/debug/dwarf2/build_path_prefix_map-1.c scan-assembler DW_AT_comp_dir: "DWARF2TEST/gcc
gcc.dg/debug/dwarf2/build_path_prefix_map-2.c (test for excess errors)
gcc.dg/debug/dwarf2/build_path_prefix_map-2.c scan-assembler DW_AT_comp_dir: "/

# No differences found in 8 common sum files
~~~~

I can also provide the full logs on request.


Fuzzing
=======

I've also fuzzed the prefix-map code using AFL with ASAN enabled. Due to how
AFL works I did not fuzz this patch directly but a smaller program with just
the parser and remapper, available here:

https://anonscm.debian.org/cgit/reproducible/build-path-prefix-map-spec.git/tree/consume

Over the course of about 4k cycles, no crashes were found.

To reproduce, you could run something like:

~~~~
$ echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
$ make CC=afl-gcc clean reset-fuzz-pecsplit.c fuzz-pecsplit.c
~~~~


Copyright disclaimer
====================

I've signed a copyright disclaimer and the FSF has this on record. (RT #1209764)


