public inbox for binutils@sourceware.org
* [PATCH] Facilitate deterministic pe executables between linker invocations
@ 2013-10-01 17:40 Cory Fields
  2013-10-09 19:08 ` Cory Fields
  2013-10-10 14:11 ` nick clifton
  0 siblings, 2 replies; 10+ messages in thread
From: Cory Fields @ 2013-10-01 17:40 UTC
  To: binutils; +Cc: Cory Fields

I'm not sure if this has been discussed before, or exactly how to propose this
behavioral change, so I'm submitting this patch with the goal of starting a
discussion.

It's currently not possible to create two byte-identical PE executables,
because ld writes the link time into the timestamp field of the PE header.
This does not match ELF behavior, where successive runs can produce the
exact same binary.
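
For readers unfamiliar with the layout, here is a minimal sketch (not part
of the patch) of where that timestamp sits in a linked image. It relies on
the documented PE layout: the DOS header stores the offset of the "PE\0\0"
signature at 0x3c, and the COFF file header that follows the signature
holds Machine, NumberOfSections and then the 32-bit TimeDateStamp. The
function names are hypothetical and do not exist in BFD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value from a byte buffer.  */
static uint32_t
read_le32 (const unsigned char *p)
{
  return (uint32_t) p[0] | ((uint32_t) p[1] << 8)
	 | ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

/* Hypothetical helper: find the file offset of the COFF TimeDateStamp
   field in the PE image held in BUF.  Returns 0 and stores the offset
   in *OFFSET on success, or -1 if BUF does not look like a PE image.  */
int
pe_timestamp_offset (const unsigned char *buf, size_t len, size_t *offset)
{
  uint32_t pe_off;

  assert (buf != NULL && offset != NULL);

  /* The DOS header starts with "MZ" and is at least 0x40 bytes long.  */
  if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
    return -1;

  /* e_lfanew: offset of the PE signature.  */
  pe_off = read_le32 (buf + 0x3c);

  /* Need the 4-byte signature plus the 20-byte COFF file header.  */
  if (pe_off > len || len - pe_off < 24)
    return -1;
  if (memcmp (buf + pe_off, "PE\0\0", 4) != 0)
    return -1;

  /* Skip the signature (4), Machine (2) and NumberOfSections (2).  */
  *offset = (size_t) pe_off + 8;
  return 0;
}

/* Hypothetical helper: read the TimeDateStamp value itself.  */
int
pe_read_timestamp (const unsigned char *buf, size_t len, uint32_t *stamp)
{
  size_t off;

  assert (stamp != NULL);
  if (pe_timestamp_offset (buf, len, &off) != 0)
    return -1;
  *stamp = read_le32 (buf + off);
  return 0;
}
```

These four bytes correspond to the f_timdat member written by the
H_PUT_32 call that the patch below changes.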

Only a tiny change is needed to avoid the non-deterministic result: an
(entirely arbitrary) value of 1 is hard-coded rather than using the current
timestamp.

Is there a historical reason for the non-deterministic behavior? If so, would
it be reasonable to add an option similar to --enable-deterministic-archives
to disable it?

Before:
$ ~/dev/binutils/ld/ld-new -m i386pe -o test.exe <snip>
$ md5sum test.exe
d88f78cff7e0f6cf50f4be546c2b4189  test.exe

$ ~/dev/binutils/ld/ld-new -m i386pe -o test.exe <snip>
$ md5sum test.exe
7287892f03f067940b508db830cf85ac  test.exe

After:
$ ~/dev/binutils/ld/ld-new -m i386pe -o test.exe <snip>
$ md5sum test.exe
fa0bf1a326b332f72f270ae060fa758c  test.exe

$ ~/dev/binutils/ld/ld-new -m i386pe -o test.exe <snip>
$ md5sum test.exe
fa0bf1a326b332f72f270ae060fa758c  test.exe
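
A checksum only says that two outputs differ, not where. As a rough
sketch, assuming both images have already been read into memory and have
the same size, the following counts the bytes that differ outside the
COFF TimeDateStamp field. The helper names are hypothetical. A result of
0 would suggest the timestamp is the only difference; any other field
that varies between links would show up as a non-zero count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the file offset of the COFF TimeDateStamp field, or 0 if BUF
   does not look like a PE image.  (0 can never be a valid offset, since
   the field always follows the DOS header.)  */
static size_t
timestamp_offset (const unsigned char *buf, size_t len)
{
  uint32_t pe_off;

  if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
    return 0;

  /* e_lfanew, stored little-endian at offset 0x3c.  */
  pe_off = (uint32_t) buf[0x3c] | ((uint32_t) buf[0x3d] << 8)
	   | ((uint32_t) buf[0x3e] << 16) | ((uint32_t) buf[0x3f] << 24);

  if (pe_off > len || len - pe_off < 24
      || memcmp (buf + pe_off, "PE\0\0", 4) != 0)
    return 0;

  /* Signature (4) + Machine (2) + NumberOfSections (2).  */
  return (size_t) pe_off + 8;
}

/* Count the bytes that differ between two PE images of LEN bytes each,
   ignoring the four TimeDateStamp bytes.  Returns -1 if either input is
   not a PE image or the timestamp fields are at different offsets.  */
long
pe_diff_ignoring_timestamp (const unsigned char *a, const unsigned char *b,
			    size_t len)
{
  size_t ts, i;
  long count = 0;

  assert (a != NULL && b != NULL);

  ts = timestamp_offset (a, len);
  if (ts == 0 || timestamp_offset (b, len) != ts)
    return -1;

  for (i = 0; i < len; i++)
    {
      if (i >= ts && i < ts + 4)
	continue;		/* Skip the timestamp itself.  */
      if (a[i] != b[i])
	count++;
    }
  return count;
}
```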

binutils/Changelog
2013-10-01  Cory Fields  <cory@coryfields.com>
    * bfd/peXXigen.c (_bfd_XXi_only_swap_filehdr_out): Use a constant rather
      than a real timestamp in the PE header to ensure deterministic link
      results when invoked with identical inputs.
---
 bfd/peXXigen.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/bfd/peXXigen.c b/bfd/peXXigen.c
index d0f7a96..2d9f93c 100644
--- a/bfd/peXXigen.c
+++ b/bfd/peXXigen.c
@@ -793,7 +793,10 @@ _bfd_XXi_only_swap_filehdr_out (bfd * abfd, void * in, void * out)
   H_PUT_16 (abfd, filehdr_in->f_magic, filehdr_out->f_magic);
   H_PUT_16 (abfd, filehdr_in->f_nscns, filehdr_out->f_nscns);
 
-  H_PUT_32 (abfd, time (0), filehdr_out->f_timdat);
+  /* Use a constant for the timestamp to ensure deterministic results with
+     identical inputs.  */
+  H_PUT_32 (abfd, 1, filehdr_out->f_timdat);
+
   PUT_FILEHDR_SYMPTR (abfd, filehdr_in->f_symptr,
 		      filehdr_out->f_symptr);
   H_PUT_32 (abfd, filehdr_in->f_nsyms, filehdr_out->f_nsyms);
-- 
1.8.1.2

* Re: [PATCH] Facilitate deterministic pe executables between linker invocations
  2013-10-01 17:40 [PATCH] Facilitate deterministic pe executables between linker invocations Cory Fields
@ 2013-10-09 19:08 ` Cory Fields
  2013-10-10 14:11 ` nick clifton
  1 sibling, 0 replies; 10+ messages in thread
From: Cory Fields @ 2013-10-09 19:08 UTC
  To: binutils; +Cc: Cory Fields

Ping. Just making sure this wasn't lost.

Regards,
Cory

* Re: [PATCH] Facilitate deterministic pe executables between linker invocations
  2013-10-01 17:40 [PATCH] Facilitate deterministic pe executables between linker invocations Cory Fields
  2013-10-09 19:08 ` Cory Fields
@ 2013-10-10 14:11 ` nick clifton
  2013-10-10 14:40   ` Cory Fields
  1 sibling, 1 reply; 10+ messages in thread
From: nick clifton @ 2013-10-10 14:11 UTC
  To: Cory Fields, binutils

Hi Cory,

[Sorry for the delay in replying]

> Only a tiny change is needed to avoid the random result. An
> (entirely arbitrary) value of 1 is hard-coded rather than using the current
> timestamp.
>
> Is there a historical reason for the non-deterministic behavior?

Yes. :-)

Oh, you want to know the reason ?  I believe that this is because 
non-deterministic behaviour was not considered to be important, and that 
having a timestamped executable was thought to be a helpful feature. 
(For example the timestamp could be used like a build-id to identify a 
specific release of a binary to a customer).


> If so, would
> it be reasonable to add an option similar to enable-deterministic-archives to
> disable it?

Yes it would.

Cheers
   Nick


* Re: [PATCH] Facilitate deterministic pe executables between linker invocations
  2013-10-10 14:11 ` nick clifton
@ 2013-10-10 14:40   ` Cory Fields
  2013-10-10 18:48     ` Cory Fields
  0 siblings, 1 reply; 10+ messages in thread
From: Cory Fields @ 2013-10-10 14:40 UTC
  To: nick clifton; +Cc: binutils

On Thu, Oct 10, 2013 at 10:10 AM, nick clifton <nickc@redhat.com> wrote:
> Hi Cory,
>
> [Sorry for the delay in replying]
>

No problem, thanks for getting to it.

>
>> Only a tiny change is needed to avoid the random result. An
>> (entirely arbitrary) value of 1 is hard-coded rather than using the
>> current
>> timestamp.
>>
>> Is there a historical reason for the non-deterministic behavior?
>
>
> Yes. :-)
>
> Oh, you want to know the reason ?  I believe that this is because
> non-deterministic behaviour was not considered to be important, and that
> having a timestamped executable was thought to be a helpful feature. (For
> example the timestamp could be used like a build-id to identify a specific
> release of a binary to a customer).
>

Yeah, that's about what I expected to hear :\

>
>
>> If so, would
>> it be reasonable to add an option similar to enable-deterministic-archives
>> to
>> disable it?
>
>
> Yes it would.

Great, then I'm happy to do the work. A few quick questions:

- Are there other viable targets you can think of beyond PE executables?
- What do you think about making this a runtime ld option as well?
e.g. -D to match ar's. It'd be a shame to be at the mercy of my distro
for this.
- Is it worth considering a generic "attempt deterministic behavior"
configure option that would bring behavior like this and ar's under
one roof? I'm assuming not since there are presumably cases where
someone might want one but not the other, just throwing it out there.

Regards,
Cory

* Re: [PATCH] Facilitate deterministic pe executables between linker invocations
  2013-10-10 14:40   ` Cory Fields
@ 2013-10-10 18:48     ` Cory Fields
  2013-10-11 11:13       ` nick clifton
  0 siblings, 1 reply; 10+ messages in thread
From: Cory Fields @ 2013-10-10 18:48 UTC
  To: nick clifton; +Cc: binutils

I've taken another look, and it looks like a pe-specific emulation
flag makes the most sense. I'll hook that up and send it along unless
you have a different suggestion.

Regards,
Cory

* Re: [PATCH] Facilitate deterministic pe executables between linker invocations
  2013-10-10 18:48     ` Cory Fields
@ 2013-10-11 11:13       ` nick clifton
  2013-11-19  0:34         ` Cory Fields
  0 siblings, 1 reply; 10+ messages in thread
From: nick clifton @ 2013-10-11 11:13 UTC
  To: cory; +Cc: binutils

Hi Cory,

>> - Are there other viable targets you can think of beyond PE executables?

Nope - I think that this is a PE specific feature/bug.

>> - What do you think about making this a runtime ld option as well?
>> e.g. -D to match ar's. It'd be a shame to be at the mercy of my distro
>> for this.

This is a good idea.

> I've taken another look, and it looks like a pe-specific emulation
> flag makes the most sense.

Go for it.

Cheers
   Nick


* Re: [PATCH] Facilitate deterministic pe executables between linker invocations
  2013-10-11 11:13       ` nick clifton
@ 2013-11-19  0:34         ` Cory Fields
  2013-11-20 17:08           ` Cory Fields
  0 siblings, 1 reply; 10+ messages in thread
From: Cory Fields @ 2013-11-19  0:34 UTC
  To: nick clifton; +Cc: binutils

Hi Nick

Now that windres and ar are fixed up, ld is the only hold-out for
clean, deterministic, mingw binaries.

I've implemented an '--enable-deterministic-ld' configure option, to
match the semantics of ar. But in looking over it, it rather feels
like beating in a nail with a screwdriver.

It seems to me that the non-deterministic behavior of mingw's ld is a
bug that has become accepted behavior, rather than an as-intended
feature. So rather than going the same route as ar, I propose the
following:

mingw's COFF timestamp is set to an arbitrary number, to match the
standard behavior of ld. A --use-real-timestamp option (or similar,
maybe random-seed to match gcc?) is introduced to mimic the previous
behavior if desired.

With the current functionality of mingw's ld, the userspace
application 'faketime' can be used to spoof the inserted timestamp.
This has been used by the build systems of several applications
(including tor and bitcoin) to work around this problem. To my
knowledge, the usage of a phony timestamp has not introduced any
problems. So I don't envision any runtime issues with this change,
only the theoretical distribution issue you mentioned (companies
wishing to have different timestamps for distributed binaries, for
internal reasons), which could easily be mitigated by
--use-real-timestamp.
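
To make the proposed semantics concrete, here is a minimal sketch of the
selection logic, not the actual patch. The function name, the macro and
the use_real_timestamp flag are hypothetical stand-ins for whatever the
option ends up setting; the constant 1 is the arbitrary value from the
original patch.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical: the fixed value written when no real timestamp is
   requested.  1 is the arbitrary constant from the original patch.  */
#define PE_FIXED_TIMESTAMP 1u

/* Hypothetical helper: pick the value to store in the PE header's
   timestamp field.  By default the result is a constant, so identical
   inputs give identical output; a non-zero USE_REAL_TIMESTAMP restores
   the old behavior of recording the link time.  */
uint32_t
pe_choose_timestamp (int use_real_timestamp)
{
  if (use_real_timestamp)
    {
      time_t now = time (NULL);

      /* time () returns (time_t) -1 if the clock is unavailable.  */
      assert (now != (time_t) -1);
      return (uint32_t) now;
    }
  return PE_FIXED_TIMESTAMP;
}
```

With this shape the deterministic path never consults the clock, so
tools like faketime would only matter when the real timestamp is
explicitly requested.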

Thoughts?

Regards,
Cory

* Re: [PATCH] Facilitate deterministic pe executables between linker invocations
  2013-11-19  0:34         ` Cory Fields
@ 2013-11-20 17:08           ` Cory Fields
  2013-11-20 17:56             ` nick clifton
  0 siblings, 1 reply; 10+ messages in thread
From: Cory Fields @ 2013-11-20 17:08 UTC
  To: nick clifton; +Cc: binutils

Hi Nick

Sorry for the hasty re-ping, but Tristan seems ok with pulling in the
determinism fixes for 2.24 if I can get them ack'd before the window
closes.

Would you mind commenting on the approach above?

Regards,
Cory

* Re: [PATCH] Facilitate deterministic pe executables between linker invocations
  2013-11-20 17:08           ` Cory Fields
@ 2013-11-20 17:56             ` nick clifton
  2013-11-20 20:18               ` Cory Fields
  0 siblings, 1 reply; 10+ messages in thread
From: nick clifton @ 2013-11-20 17:56 UTC
  To: cory; +Cc: binutils

Hi Cory,

   Sorry - my bad - I meant to send you a reply yesterday...

>> mingw's coff timestamp is set to an arbitrary number, to match the
>> standard behavior of ld.

OK.  Presumably an arbitrary value of zero is suitable ?

>>  A --use-real-timestamp (or so, maybe
>> random-seed to match gcc?) is introduced to mimic the previous
>> behavior if desired.

I would prefer not to have completely random values, since they would be 
of no use whatsoever.  A real timestamp should be fine.

So, yes, please do go ahead and post the patch for this change.  I do 
not anticipate there being any real problems with it...

Cheers
   Nick


* Re: [PATCH] Facilitate deterministic pe executables between linker invocations
  2013-11-20 17:56             ` nick clifton
@ 2013-11-20 20:18               ` Cory Fields
  0 siblings, 0 replies; 10+ messages in thread
From: Cory Fields @ 2013-11-20 20:18 UTC
  To: nick clifton; +Cc: binutils

On Wed, Nov 20, 2013 at 11:48 AM, nick clifton <nickc@redhat.com> wrote:
> Hi Cory,
>
>   Sorry - my bad - I meant to send you a reply yesterday...
>

No worries. I don't make a habit of quick bumps. Many thanks for the
quick response this time.

>>> mingw's coff timestamp is set to an arbitrary number, to match the
>>> standard behavior of ld.
>
>
> OK.  Presumably an arbitrary value of zero is suitable ?
>

Will verify.

>
>>>  A --use-real-timestamp (or so, maybe
>>> random-seed to match gcc?) is introduced to mimic the previous
>>> behavior if desired.
>
>
> I would prefer not to have completely random values, since they would be of
> no use whatsoever.  A real timestamp should be fine.
>
> So, yes, please do go ahead and post the patch for this change.  I do not
> anticipate there being any real problems with it...

Great, I'll send it along when I can get to it. Should be in the next few hours.

Regards,
Cory
