From: Ximin Luo
To: GCC Patches
Cc: Ximin Luo
Subject: [PING^4][PATCH v2] Generate reproducible output independently of the build-path
Date: Fri, 21 Jul 2017 16:16:00 -0000
Message-Id: <20170721161538.7508-1-infinity0@pwned.gg>

(Please keep me on CC, I am not subscribed)

Proposal
========

This patch series adds a new environment variable BUILD_PATH_PREFIX_MAP. When
this is set, GCC will treat this as extra implicit "-fdebug-prefix-map=$value"
command-line arguments that precede any explicit ones.
This makes the final binary output reproducible, and also hides the
unreproducible value (the source path prefixes) from CFLAGS et al., which many
build tools (understandably) embed as-is into their build output.

This environment variable also acts on the __FILE__ macro, mapping it in the
same way that debug-prefix-map works for debug symbols. We have seen that
__FILE__ is also a very large source of unreproducibility, and is represented
quite heavily in the 3k+ figure given earlier.

Finally, we tweak the mapping algorithm so that it applies only to whole path
components when matching prefixes. This is justified in further detail in the
patch header. It is an optional part of the patch series and could be dropped
if the GCC maintainers are not convinced by our arguments there.

Background
==========

We have prepared a document that describes how this works in detail, so that
projects can be confident that they are interoperable:

https://reproducible-builds.org/specs/build-path-prefix-map/

The specification is currently in DRAFT status, awaiting some final feedback,
including what the GCC maintainers think about it.

We have written up some more detailed discussions on the topic, including a
thorough justification of why we chose the mechanism of environment variables:

https://wiki.debian.org/ReproducibleBuilds/StandardEnvironmentVariables

The previous iteration of the patch series, essentially the same as the current
re-submission, is here:

https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00513.html

An older version, which explains some GCC-specific background, is here:

https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00182.html

The current patch series applies cleanly to GCC-8 snapshot 20170716.

Reproducibility testing
=======================

Over the past 3 months, we have tested this patch backported to Debian GCC-6.
Together with a patched dpkg that sets the environment variable appropriately,
it allows us to reproduce ~1800 extra packages.
This is about 6.8% of ~26400 Debian source packages, and just over 1/2 of the
ones whose irreproducibility is due to build-path issues.

https://tests.reproducible-builds.org/debian/issues/unstable/gcc_captures_build_path_issue.html
https://tests.reproducible-builds.org/debian/unstable/index_suite_amd64_stats.html

The first major increase around 2017-04 is due to us deploying this patch. The
next major increase later in 2017-04 is unrelated, due to us deploying a patch
for R. The dip during the last part of 2017-06 is due to unpatched and patched
packages getting out of sync, partly because of extra admin work around the
Debian stretch release. We believe that the green will soon return to its
previous high once this situation settles.

Unit testing
============

I've tested these patches on a Debian unstable x86_64-linux-gnu schroot running
inside a Debian jessie system, on a full-bootstrap build. The output of
contrib/compare_tests is as follows:

~~~~
gcc-8-20170716$ contrib/compare_tests ../gcc-build-{0,1}
# Comparing directories
## Dir1=../gcc-build-0: 8 sum files
## Dir2=../gcc-build-1: 8 sum files

# Comparing 8 common sum files
## /bin/sh contrib/compare_tests /tmp/gxx-sum1.13468 /tmp/gxx-sum2.13468
New tests that PASS:

gcc.dg/cpp/build_path_prefix_map-1.c (test for excess errors)
gcc.dg/cpp/build_path_prefix_map-1.c execution test
gcc.dg/cpp/build_path_prefix_map-2.c (test for excess errors)
gcc.dg/cpp/build_path_prefix_map-2.c execution test
gcc.dg/debug/dwarf2/build_path_prefix_map-1.c (test for excess errors)
gcc.dg/debug/dwarf2/build_path_prefix_map-1.c scan-assembler DW_AT_comp_dir: "DWARF2TEST/gcc
gcc.dg/debug/dwarf2/build_path_prefix_map-2.c (test for excess errors)
gcc.dg/debug/dwarf2/build_path_prefix_map-2.c scan-assembler DW_AT_comp_dir: "/

# No differences found in 8 common sum files
~~~~

I can also provide the full logs on request.

Fuzzing
=======

I've also fuzzed the prefix-map code using AFL with ASAN enabled.
Due to how AFL works, I did not fuzz this patch directly but a smaller program
with just the parser and remapper, available here:

https://anonscm.debian.org/cgit/reproducible/build-path-prefix-map-spec.git/tree/consume

Over the course of about 4k cycles, no crashes were found.

To reproduce, you could run something like:

$ echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
$ make CC=afl-gcc clean reset-fuzz-pecsplit.c fuzz-pecsplit.c

Copyright disclaimer
====================

I've signed a copyright disclaimer and the FSF has this on record. (RT #1209764)
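
For readers who want a concrete picture of the "whole path components" matching
mentioned under Proposal, here is a minimal Python sketch. It is only an
illustration, not the code from the patch: the function name remap_path, the
list of (old, new) pairs, and the first-match-wins ordering are all assumptions
made for this example, and it ignores the encoding and priority rules that the
specification defines.

```python
def remap_path(path, prefix_map):
    """Rewrite `path` using the first matching (old, new) prefix pair.

    Hypothetical helper for illustration only. A prefix matches only when it
    ends on a path-component boundary, i.e. the path equals the prefix or
    continues with "/" right after it. `old` is assumed to have no trailing "/".
    """
    for old, new in prefix_map:
        if path == old:
            return new
        if path.startswith(old + "/"):
            # Keep the remainder (including its leading "/") unchanged.
            return new + path[len(old):]
    return path


# Example mapping (made-up paths): hide the build directory.
mapping = [("/build/foo", "/usr/src/foo")]

print(remap_path("/build/foo/main.c", mapping))     # prefix matches a whole component
print(remap_path("/build/foobar/main.c", mapping))  # "/build/foo" is not a whole component here
```

With plain string-prefix matching, the second path would be rewritten as well,
which is the kind of surprise that the whole-component rule is meant to avoid.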