Build reproducibility of gcc @ NixOS

public inbox for gcc@gcc.gnu.org

* Build reproducibility of gcc @ NixOS
@ 2021-04-02  3:55 Arthur Gautier
  2021-04-02  9:07 ` Tadeus Prastowo
  0 siblings, 1 reply; 6+ messages in thread
From: Arthur Gautier @ 2021-04-02  3:55 UTC (permalink / raw)
  To: gcc

Dear GCC development team,

We've been trying to build the minimal NixOS image reproducibly, and
gcc was one of the last remaining issues.
We found that disabling profiled bootstrap compilation of GCC allowed
us to get a reproducible build of gcc.
Our efforts can be followed here: https://github.com/NixOS/nixpkgs/pull/112928

However, I measured that disabling this optimization costs around 7-12%
in gcc performance, depending on the build.
Because of this performance regression, we're trying to find a middle
ground. Ideally we'd like to keep the performance of gcc as untouched
as possible (even if that costs us on compilation time of gcc itself).

Compiling gcc twice on the same machine gets us the same output, but
compiling on a different architecture gets us a different result.
Reading the documentation, it seems that an autoprofiledbootstrap
build uses machine metrics and injects them into the build (and we
don't use autoprofiledbootstrap), but I would not expect the stagetrain
step of profiledbootstrap to do that.
I tried disabling concurrency in the stagetrain step, without luck.

It feels like I'm missing something.
Would anyone have any idea what could inject the host's behavior here?
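
To narrow down where two builds diverge, the two outputs can be compared file by file. The following is a minimal sketch, not part of the NixOS tooling: the function names `hash_tree` and `diff_trees` are hypothetical, and it assumes both build outputs are available as local directory trees.

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_trees(root_a, root_b):
    """Return sorted relative paths that are missing on one side or differ."""
    a = hash_tree(root_a)
    b = hash_tree(root_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

Calling `diff_trees` on the output directories from the two machines would list only the files whose contents differ, which gives a starting point for inspecting them further.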

Thank you for your help!

Best,
-- 
Arthur


* Re: Build reproducibility of gcc @ NixOS
  2021-04-02  3:55 Build reproducibility of gcc @ NixOS Arthur Gautier
@ 2021-04-02  9:07 ` Tadeus Prastowo
  2021-04-02 15:04   ` Arthur Gautier
  0 siblings, 1 reply; 6+ messages in thread
From: Tadeus Prastowo @ 2021-04-02  9:07 UTC (permalink / raw)
  To: Arthur Gautier; +Cc: gcc

Hi Arthur,

On Fri, Apr 2, 2021 at 5:56 AM Arthur Gautier
<gcc.gnu.org@superbaloo.net> wrote:
>
> Dear GCC development team,
>
> We've been trying to build reproducibly the minimal NixOS image, and
> gcc was one of the last issues we had.
> We found that disabling profiled bootstrap compilation of GCC allowed
> us to get a reproducible build of gcc.
> Our efforts can be followed here: https://github.com/NixOS/nixpkgs/pull/112928
>
> But I measured disabling this optimization to cost around 7-12%
> depending on the build.

That is expected as mentioned in the manual:
https://gcc.gnu.org/install/build.html#Building-with-profile-feedback
I have also reproduced it, as reported here:
https://gcc.gnu.org/ml/gcc-help/2019-05/msg00118.html. The questions
that remain open are here:
https://gcc.gnu.org/legacy-ml/gcc-help/2019-07/msg00053.html

> Because of this performance regression, we're trying to find a middle
> ground. Ideally we'd like to keep the performance of gcc as untouched
> as possible (even if that costs us on compilation time of gcc itself).
>
> Compiling gcc twice on the same machine gets us the same output, but
> compiling on a different architecture gets us a different result.
> Reading the documentation, it would seem that an autoprofiledbootstrap
> build would use machine metrics and inject them in the build (and
> we don't use autoprofiledbootstrap), but I would not expect the stagetrain
> of profiledbootstrap to do that.
> I tried disabling concurrency of the stagetrain without luck.
>
> It feels like I'm missing something.
> Would anyone have any idea what could inject the host's behavior here?

Since an optimized build is likely to be machine-dependent regardless
of any intended injection (e.g., different instructions used in GCC
binaries depending on /proc/cpuinfo), I don't understand why an
optimized build should be reproducible on different machines, unless
of course every channel that GCC uses to find out about the machine
(e.g., /proc/cpuinfo) is under your total control.  So, do you mean to
ask for a list of all channels that GCC uses to find out about the
machine?

> Thank you for your help!

No worries.

> Best,
> --
> Arthur

--
Best regards,
Tadeus


* Re: Build reproducibility of gcc @ NixOS
  2021-04-02  9:07 ` Tadeus Prastowo
@ 2021-04-02 15:04   ` Arthur Gautier
  2021-04-02 16:32     ` Tadeus Prastowo
  0 siblings, 1 reply; 6+ messages in thread
From: Arthur Gautier @ 2021-04-02 15:04 UTC (permalink / raw)
  To: Tadeus Prastowo; +Cc: gcc

Hi Tadeus,

On Fri, Apr 2, 2021 at 9:07 AM Tadeus Prastowo <tadeus.prastowo@unitn.it> wrote:
>
> Hi Arthur,
>
> On Fri, Apr 2, 2021 at 5:56 AM Arthur Gautier
> <gcc.gnu.org@superbaloo.net> wrote:
> >
> > Dear GCC development team,
> >
> > We've been trying to build reproducibly the minimal NixOS image, and
> > gcc was one of the last issues we had.
> > We found that disabling profiled bootstrap compilation of GCC allowed
> > us to get a reproducible build of gcc.
> > Our efforts can be followed here: https://github.com/NixOS/nixpkgs/pull/112928
> >
> > But I measured disabling this optimization to cost around 7-12%
> > depending on the build.
>
> That is expected as mentioned in the manual:
> https://gcc.gnu.org/install/build.html#Building-with-profile-feedback
> And, I have reproduced it as well as reported here:
> https://gcc.gnu.org/ml/gcc-help/2019-05/msg00118.html, the last
> questions that remain being here:
> https://gcc.gnu.org/legacy-ml/gcc-help/2019-07/msg00053.html
>
> > Because of this performance regression, we're trying to find a middle
> > ground. Ideally we'd like to keep the performance of gcc as untouched
> > as possible (even if that costs us on compilation time of gcc itself).
> >
> > Compiling gcc twice on the same machine gets us the same output, but
> > compiling on a different architecture gets us a different result.
> > Reading the documentation, it would seem that an autoprofiledbootstrap
> > build would use machine metrics and inject them in the build (and
> > we don't use autoprofiledbootstrap), but I would not expect the stagetrain
> > of profiledbootstrap to do that.
> > I tried disabling concurrency of the stagetrain without luck.
> >
> > It feels like I'm missing something.
> > Would anyone have any idea what could inject the host's behavior here?
>
> Since an optimized build is likely to be machine-dependent regardless
> of any intended injection (e.g., different instructions used in GCC
> binaries depending on /proc/cpuinfo), I don't understand why an
> optimized build should be reproducible on different machines, unless
> of course every channel that GCC uses to find out about the machine
> (e.g., /proc/cpuinfo) is under your total control.  So, do you mean to
> ask a list of all channels that GCC uses to find out about the
> machine?

This is where I'm getting confused. According to the manual,
stagetrain only records branch statistics. And I would expect, given
the same input provided in the same order, two different architectures
to take the same branch, and not observe any difference. I understand
that with autoprofiled builds, the local architecture behavior is
injected in the build, but I don't use that.
I'm not using any -march in the build either (as far as I can
understand/tell). So I do not expect the build to change its
instruction set either.

Is it normal that two different architectures would issue two
different "execution counts of instruction and branch probabilities"?
Or is there something more?
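
To make the question concrete, here is a toy sketch, not GCC code. It shows how identical input can still produce different branch counts if control flow depends on a value read from the host. The `threshold` parameter is a hypothetical stand-in for any such value; whether the stagetrain run actually consults anything like it is exactly what is unknown here.

```python
def count_taken(values, threshold):
    """Count how often the branch `v < threshold` is taken."""
    taken = 0
    for v in values:
        if v < threshold:
            taken += 1
    return taken


# The workload is identical on both hosts.
workload = [1, 4, 8, 16, 32]

# Hypothetical host-derived values; only these differ between the two runs.
counts_host_a = count_taken(workload, threshold=8)
counts_host_b = count_taken(workload, threshold=32)

print(counts_host_a, counts_host_b)
```

If the counts differ like this, the recorded profile differs as well, even though the source and the input are the same.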

Thank you for your reply!

-- 
Arthur


* Re: Build reproducibility of gcc @ NixOS
  2021-04-02 15:04   ` Arthur Gautier
@ 2021-04-02 16:32     ` Tadeus Prastowo
  2021-04-02 16:45       ` Arthur Gautier
  0 siblings, 1 reply; 6+ messages in thread
From: Tadeus Prastowo @ 2021-04-02 16:32 UTC (permalink / raw)
  To: Arthur Gautier; +Cc: gcc

Hi Arthur,

On Fri, Apr 2, 2021 at 5:04 PM Arthur Gautier
<gcc.gnu.org@superbaloo.net> wrote:
>
> Hi Tadeus,
>
> On Fri, Apr 2, 2021 at 9:07 AM Tadeus Prastowo <tadeus.prastowo@unitn.it> wrote:

[...]

> > Since an optimized build is likely to be machine-dependent regardless
> > of any intended injection (e.g., different instructions used in GCC
> > binaries depending on /proc/cpuinfo), I don't understand why an
> > optimized build should be reproducible on different machines, unless
> > of course every channel that GCC uses to find out about the machine
> > (e.g., /proc/cpuinfo) is under your total control.  So, do you mean to
> > ask a list of all channels that GCC uses to find out about the
> > machine?
>
> This is where I'm getting confused. According to the manual,
> stagetrain only record branch statistics.

By "the manual", do you refer to
https://gcc.gnu.org/install/build.html#Building-with-profile-feedback
?

Quoting the page:
  When ‘make profiledbootstrap’ is run, it will first build a stage1 compiler.
  This compiler is used to build a stageprofile compiler instrumented
to collect execution counts of instruction and branch probabilities.
  Training run is done by building stagetrain compiler.
  Finally a stagefeedback compiler is built using the information collected.
End quote.

Based on the quote, a reproducible build is to be expected for the
following compilers:
1. The stage1 compiler.
2. The stageprofile compiler, which is built by the stage1 compiler.
3. The stagetrain compiler, which is built by the stageprofile compiler.

Then, a reproducible build is expected for the stagefeedback compiler
on the condition that the same information, which was collected by the
stageprofile compiler when building the stagetrain compiler, is used.

Do you agree with that reasoning?

> And I would expect, given
> the same input provided in the same order, two different architectures
> to take the same branch, and not observe any difference.

In other words, you expect that branch statistics depend only on the
given source code.  Correct?

> I understand
> that with autoprofiled builds, the local architecture behavior is
> injected in the build, but I don't use that.
> I'm not using any -march in the build either (as far as I can
> understand/tell). So I do not expect the build to change its
> instruction set either.
>
> Is that normal that two different architectures would issue two
> different "execution counts of instruction and branch probabilities"?

I guess that it would be the case.

> Or is there something more?

Perhaps you can have the reproducible build that you want by first
isolating the information that is collected by the stageprofile
compiler when building the stagetrain compiler and then reusing the
same information when building every other stagefeedback compiler.
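
A minimal sketch of the isolation step, assuming the training run leaves its counters in `.gcda` files (the format written by GCC's `-fprofile-generate` instrumentation). The function names `profile_manifest` and `archive_profiles` are hypothetical, and how the archived files would be fed back into a later stagefeedback build is not covered here.

```python
import hashlib
import shutil
from pathlib import Path


def profile_manifest(build_dir):
    """Return {relative path: SHA-256} for every .gcda file under build_dir."""
    build_dir = Path(build_dir)
    manifest = {}
    for path in sorted(build_dir.rglob("*.gcda")):
        if path.is_file():
            rel = path.relative_to(build_dir).as_posix()
            manifest[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def archive_profiles(build_dir, dest_dir):
    """Copy every .gcda file into dest_dir, preserving relative paths."""
    build_dir, dest_dir = Path(build_dir), Path(dest_dir)
    copied = []
    for src in sorted(build_dir.rglob("*.gcda")):
        if not src.is_file():
            continue
        rel = src.relative_to(build_dir)
        target = dest_dir / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)
        copied.append(rel.as_posix())
    return copied
```

Comparing the manifests from two machines would also show which profile files differ, not only that the final compiler does.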

> Thank you for your reply!

My pleasure.

> --
> Arthur

-- 
Best regards,
Tadeus


* Re: Build reproducibility of gcc @ NixOS
  2021-04-02 16:32     ` Tadeus Prastowo
@ 2021-04-02 16:45       ` Arthur Gautier
  2021-04-03  0:14         ` Tadeus Prastowo
  0 siblings, 1 reply; 6+ messages in thread
From: Arthur Gautier @ 2021-04-02 16:45 UTC (permalink / raw)
  To: Tadeus Prastowo; +Cc: gcc

On Fri, Apr 2, 2021 at 4:32 PM Tadeus Prastowo <tadeus.prastowo@unitn.it> wrote:
>
> Hi Arthur,
>
> On Fri, Apr 2, 2021 at 5:04 PM Arthur Gautier
> <gcc.gnu.org@superbaloo.net> wrote:
> >
> > Hi Tadeus,
> >
> > On Fri, Apr 2, 2021 at 9:07 AM Tadeus Prastowo <tadeus.prastowo@unitn.it> wrote:
[...]
>
> By "the manual", do you refer to
> https://gcc.gnu.org/install/build.html#Building-with-profile-feedback
> ?
yes
>
> Quoting the page:
>   When ‘make profiledbootstrap’ is run, it will first build a stage1 compiler.
>   This compiler is used to build a stageprofile compiler instrumented
> to collect execution counts of instruction and branch probabilities.
>   Training run is done by building stagetrain compiler.
>   Finally a stagefeedback compiler is built using the information collected.
> End quote.
>
> Based on the quote, a reproducible build is to be expected for the
> following compilers:
> 1. The stage1 compiler.
> 2. The stageprofile compiler, which is built by the stage1 compiler.
> 3. The stagetrain compiler, which is built by the stageprofile compiler.
>
> Then, a reproducible build is expected for the stagefeedback compiler
> on the condition that the same information, which was collected by the
> stageprofile compiler when building the stagetrain compiler, is used.
>
> Do you agree with that reasoning?
Yes, and as far as I can tell, up to the stageprofile I get the same result.
Only the output of the stagetrain compilation changes, which affects
the stagefeedback compilation.

>
> > And I would expect, given
> > the same input provided in the same order, two different architectures
> > to take the same branch, and not observe any difference.
>
> In other words, you expect that branch statistics depends only on the
> given source code.  Correct?
That would be my understanding (although very limited).

What I'm trying to understand is: what "local behavior" is injected in
my build, and see if I could get rid of that, and only keep branch
statistics/execution counts, which I expect to be reproducible.

>
> > I understand
> > that with autoprofiled builds, the local architecture behavior is
> > injected in the build, but I don't use that.
> > I'm not using any -march in the build either (as far as I can
> > understand/tell). So I do not expect the build to change its
> > instruction set either.
> >
> > Is that normal that two different architectures would issue two
> > different "execution counts of instruction and branch probabilities"?
>
> I guess that it would be the case.
>
> > Or is there something more?
>
> Perhaps you can have the reproducible build that you want by first
> isolating the information that is collected by the stageprofile
> compiler when building the stagetrain compiler and then reusing the
> same information when building every other stagefeedback compiler.

Yeah, but said information is not reproducible itself (that would
defeat the purpose of the effort).


* Re: Build reproducibility of gcc @ NixOS
  2021-04-02 16:45       ` Arthur Gautier
@ 2021-04-03  0:14         ` Tadeus Prastowo
  0 siblings, 0 replies; 6+ messages in thread
From: Tadeus Prastowo @ 2021-04-03  0:14 UTC (permalink / raw)
  To: Arthur Gautier; +Cc: gcc

Hi Arthur,

On Fri, Apr 2, 2021 at 6:45 PM Arthur Gautier
<gcc.gnu.org@superbaloo.net> wrote:
>
> On Fri, Apr 2, 2021 at 4:32 PM Tadeus Prastowo <tadeus.prastowo@unitn.it> wrote:
> >
> > Hi Arthur,
> >
> > On Fri, Apr 2, 2021 at 5:04 PM Arthur Gautier
> > <gcc.gnu.org@superbaloo.net> wrote:
> > >
> > > Hi Tadeus,
> > >
> > > On Fri, Apr 2, 2021 at 9:07 AM Tadeus Prastowo <tadeus.prastowo@unitn.it> wrote:
> [...]
> >
> > By "the manual", do you refer to
> > https://gcc.gnu.org/install/build.html#Building-with-profile-feedback
> > ?
> yes
> >
> > Quoting the page:
> >   When ‘make profiledbootstrap’ is run, it will first build a stage1 compiler.
> >   This compiler is used to build a stageprofile compiler instrumented
> > to collect execution counts of instruction and branch probabilities.
> >   Training run is done by building stagetrain compiler.
> >   Finally a stagefeedback compiler is built using the information collected.
> > End quote.
> >
> > Based on the quote, a reproducible build is to be expected for the
> > following compilers:
> > 1. The stage1 compiler.
> > 2. The stageprofile compiler, which is built by the stage1 compiler.
> > 3. The stagetrain compiler, which is built by the stageprofile compiler.
> >
> > Then, a reproducible build is expected for the stagefeedback compiler
> > on the condition that the same information, which was collected by the
> > stageprofile compiler when building the stagetrain compiler, is used.
> >
> > Do you agree with that reasoning?
> Yes, and as far as I can tell, up to the stageprofile I get the same result.
> Only the output of the stagetrain compilation changes, which affects
> the stagefeedback compilation.

By saying "up to the stageprofile I get the same result", do you mean
that you obtain the same stageprofile compilers on two different
architectures?  Or, do you mean that you obtain the same stageprofile
compilers on the same machine by building GCC one after another in
different build directories?

And by saying "the output of the stagetrain compilation changes", do
you mean that the stageprofile compiler __running on the same
machine__ produces different stagetrain compilers?  Or, do you mean
that the stageprofile compiler running on the same machine obtains
different statistics that will later be used to build the
stagefeedback compiler?  Or, something else?

> > > And I would expect, given
> > > the same input provided in the same order, two different architectures
> > > to take the same branch, and not observe any difference.
> >
> > In other words, you expect that branch statistics depends only on the
> > given source code.  Correct?
> That would be my understanding (although very limited).

Could you tell me what you actually mean by the word "architectures",
please?  It is because in my understanding different architectures
mean different instruction sets.

> What I'm trying to understand is: what "local behavior" is injected in
> my build, and see if I could get rid of that, and only keep branch
> statistics/execution counts, which I expect to be reproducible.

I don't know that yet, because I haven't fully understood your
actual situation.

> > > I understand
> > > that with autoprofiled builds, the local architecture behavior is
> > > injected in the build, but I don't use that.
> > > I'm not using any -march in the build either (as far as I can
> > > understand/tell). So I do not expect the build to change its
> > > instruction set either.
> > >
> > > Is that normal that two different architectures would issue two
> > > different "execution counts of instruction and branch probabilities"?
> >
> > I guess that it would be the case.
> >
> > > Or is there something more?
> >
> > Perhaps you can have the reproducible build that you want by first
> > isolating the information that is collected by the stageprofile
> > compiler when building the stagetrain compiler and then reusing the
> > same information when building every other stagefeedback compiler.
>
> Yeah, but said information is not reproducible itself (that would
> defeat the purpose of the effort).

I expect that running the same stageprofile compiler on the same
machine twice, once to build stagetrain compiler A and then once
again to build stagetrain compiler B, should obtain A = B.  While you
imply that it was not the case, I am not sure because you don't say
whether in your case you did the same (i.e., running the same
stageprofile compiler on the same machine twice to obtain A and B).

-- 
Best regards,
Tadeus
