public inbox for gcc@gcc.gnu.org
From: Arthur Gautier <gcc.gnu.org@superbaloo.net>
To: Tadeus Prastowo <tadeus.prastowo@unitn.it>
Cc: gcc@gcc.gnu.org
Subject: Re: Build reproducibility of gcc @ NixOS
Date: Fri, 2 Apr 2021 15:04:34 +0000
Message-ID: <CAOAHwbU8EfACwygPgGzN5RYvP0ftW8yw5900SVeJPwbRWCC8_A@mail.gmail.com>
In-Reply-To: <CAA1YtmsoEFb2uPK18TA3nbHh8dYKU+UP7iz_adejRnNk=tUNEQ@mail.gmail.com>

Hi Tadeus,

On Fri, Apr 2, 2021 at 9:07 AM Tadeus Prastowo <tadeus.prastowo@unitn.it> wrote:
>
> Hi Arthur,
>
> On Fri, Apr 2, 2021 at 5:56 AM Arthur Gautier
> <gcc.gnu.org@superbaloo.net> wrote:
> >
> > Dear GCC development team,
> >
> > We've been trying to build the minimal NixOS image reproducibly, and
> > gcc was one of the last issues we had.
> > We found that disabling profiled bootstrap compilation of GCC allowed
> > us to get a reproducible build of gcc.
> > Our efforts can be followed here: https://github.com/NixOS/nixpkgs/pull/112928
> >
> > But I measured disabling this optimization to cost around 7-12%
> > depending on the build.
>
> That is expected as mentioned in the manual:
>
> And, I have reproduced it as well as reported here:
> https://gcc.gnu.org/ml/gcc-help/2019-05/msg00118.html, the last
> questions that remain being here:
> https://gcc.gnu.org/legacy-ml/gcc-help/2019-07/msg00053.html
>
> > Because of this performance regression, we're trying to find a middle
> > ground. Ideally we'd like to keep the performance of gcc as untouched
> > as possible (even if that costs us on compilation time of gcc itself).
> >
> > Compiling gcc twice on the same machine gets us the same output, but
> > compiling on a different architecture gets us a different result.
> > Reading the documentation, it would seem that autoprofiledbootstrap
> > would use machine metrics and inject them into the build (and we
> > don't use autoprofiledbootstrap), but I would not expect the
> > stagetrain of profiledbootstrap to do that.
> > I tried disabling concurrency of the stagetrain without luck.
> >
> > It feels like I'm missing something.
> > Would anyone have any idea what could inject the host's behavior here?
>
> Since an optimized build is likely to be machine-dependent regardless
> of any intended injection (e.g., different instructions used in GCC
> binaries depending on /proc/cpuinfo), I don't understand why an
> optimized build should be reproducible on different machines, unless
> of course every channel that GCC uses to find out about the machine
> (e.g., /proc/cpuinfo) is under your total control.  So, do you mean to
> ask a list of all channels that GCC uses to find out about the
> machine?

This is where I'm getting confused. According to the manual,
stagetrain only records branch statistics. And I would expect, given
the same input provided in the same order, two different architectures
to take the same branches, so that no difference is observed. I
understand that with autoprofiled builds, the local architecture
behavior is injected into the build, but I don't use that.
I'm not using any -march in the build either (as far as I can
understand/tell). So I do not expect the build to change its
instruction set either.

Is it normal for two different architectures to produce two
different "execution counts of instruction and branch probabilities"?
Or is there something more?
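
One way to narrow the question down is to compare the two build trees
file by file and see which artifacts differ first (for instance, whether
the profile data written during the training stage already differs, or
only the final binaries). The following is only a minimal sketch: the
function names tree_digests and differing_files are hypothetical, and the
two directory arguments are assumed to be copies of the build output from
each machine.

```python
import hashlib
import os


def tree_digests(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only file contents are compared.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def differing_files(root_a, root_b):
    """Return sorted relative paths whose contents differ or exist on one side only."""
    a = tree_digests(root_a)
    b = tree_digests(root_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

Calling differing_files("build-machine-a", "build-machine-b") (both
directory names are placeholders) lists the paths to inspect further.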

Thank you for your reply!

-- 
Arthur
