public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] GNU_AR_DETERMINISTIC env var, --enable-deterministic-archives
@ 2011-12-06 21:28 Roland McGrath
  2011-12-08 18:13 ` Roland McGrath
  2011-12-13 13:30 ` nick clifton
  0 siblings, 2 replies; 8+ messages in thread
From: Roland McGrath @ 2011-12-06 21:28 UTC (permalink / raw)
  To: binutils

ar member headers are one of the last remaining common barriers to
reproducible builds (of development packages).  Hence we have the D option.
But it is devilishly difficult to get the option override--or even AR and
RANLIB settings pointing to a wrapper script--through configure et al into
all corners of e.g. the gcc build.

Instead, this change lets the user set the environment variable
GNU_AR_DETERMINISTIC=1 to enable "deterministic mode" by default.
That's far easier to use in practice.

In most places, the only use of "ar" is on libraries for linking and there
is no benefit whatsoever to the timestamp, UID, or GID fields in these.
So --enable-deterministic-archives lets one choose the GNU_AR_DETERMINISTIC=1
mode at build time.  Users can still set GNU_AR_DETERMINISTIC=0 to reverse
the build-time default.

Since D is incompatible with u and some packages run "ar cru",
when the environment variable or build-time default sets D, we
just silently ignore u.


Ok for trunk?


Thanks,
Roland


2011-12-06  Roland McGrath  <mcgrathr@google.com>

	* configure.in (--enable-deterministic-archives): Grok new
	argument.  Set DEFAULT_AR_DETERMINISTIC to 1 or 0 accordingly.
	* configure: Regenerated.
	* config.in: Regenerated.
	* ar.c (default_deterministic): New function.
	(ranlib_main, main): Call it.
	* doc/binutils.texi (ar cmdline, ranlib): Document that.

diff --git a/binutils/ar.c b/binutils/ar.c
index 676e92c..8706d94 100644
--- a/binutils/ar.c
+++ b/binutils/ar.c
@@ -37,6 +37,7 @@
 #include "filenames.h"
 #include "binemul.h"
 #include "plugin.h"
+#include <ctype.h>
 #include <sys/stat.h>
 
 #ifdef __GO32___
@@ -553,6 +554,34 @@ decode_options (int argc, char **argv)
   return &argv[optind];
 }
 
+/* If -D was not specified explicitly, then default it to on or off
+   depending on the environment variable GNU_AR_DETERMINISTIC, or to
+   the configured default.  */
+static void
+default_deterministic (void)
+{
+  if (!deterministic)
+    {
+      const char *env = getenv ("GNU_AR_DETERMINISTIC");
+      if (env != NULL && env[0] != '\0')
+        {
+          if (isdigit(env[0]))
+            {
+              char *endp = NULL;
+              long int value = strtol (env, &endp, 0);
+              if (endp != NULL)
+                deterministic = value != 0;
+            }
+          else
+            deterministic = 1;
+        }
+      else
+        deterministic = DEFAULT_AR_DETERMINISTIC;
+      if (deterministic)
+        newer_only = 0;
+    }
+}
+
 static void
 ranlib_main (int argc, char **argv)
 {
@@ -590,6 +619,8 @@ ranlib_main (int argc, char **argv)
   if (show_version)
     print_version ("ranlib");
 
+  default_deterministic ();
+
   arg_index = optind;
 
   while (arg_index < argc)
@@ -702,6 +733,8 @@ main (int argc, char **argv)
       if (newer_only && deterministic)
 	fatal (_("`u' is not meaningful with the `D' option."));
 
+      default_deterministic ();
+
       if (postype != pos_default)
 	posname = argv[arg_index++];
 
diff --git a/binutils/configure.in b/binutils/configure.in
index 965d66c..da56b2d 100644
--- a/binutils/configure.in
+++ b/binutils/configure.in
@@ -28,6 +28,18 @@ AC_ARG_ENABLE(targets,
   *)        enable_targets=$enableval ;;
 esac])dnl
 
+AC_ARG_ENABLE(deterministic-archives,
+[AS_HELP_STRING([--enable-deterministic-archives],
+		[ar and ranlib default to -D behavior])], [
+if test "${enableval}" = no; then
+  default_ar_deterministic=0
+else
+  default_ar_deterministic=1
+fi], [default_ar_deterministic=0])
+
+AC_DEFINE_UNQUOTED(DEFAULT_AR_DETERMINISTIC, $default_ar_deterministic,
+		   [Should ar and ranlib use -D behavior by default?])
+
 AM_BINUTILS_WARNINGS
 		   
 AC_CONFIG_HEADERS(config.h:config.in)
diff --git a/binutils/doc/binutils.texi b/binutils/doc/binutils.texi
index 3217a1a..678379b 100644
--- a/binutils/doc/binutils.texi
+++ b/binutils/doc/binutils.texi
@@ -418,6 +418,7 @@ using this modifier.
 
 @item D
 @cindex deterministic archives
+@kindex GNU_AR_DETERMINISTIC
 Operate in @emph{deterministic} mode.  When adding files and the archive
 index use zero for UIDs, GIDs, timestamps, and use consistent file modes
 for all files.  When this option is used, if @command{ar} is used with
@@ -425,6 +426,14 @@ identical options and identical input files, multiple runs will create
 identical output files regardless of the input files' owners, groups,
 file modes, or modification times.
 
+If the environment variable @code{GNU_AR_DETERMINISTIC} is set to a
+nonempty value that is not @code{0}, then this mode is enabled by default.
+In this case, the @code{u} modifier is ignored.  If that environment
+variable is absent or has an empty value and @file{binutils} was configured
+with @option{--enable-deterministic-archives}, then this mode is on by
+default.  If @code{GNU_AR_DETERMINISTIC} is set to @code{0} then this mode
+is never enabled without the @code{D} modifier.
+
 @item f
 Truncate names in the archive.  @sc{gnu} @command{ar} will normally permit file
 names of any length.  This will cause it to create archives which are
@@ -2386,10 +2395,18 @@ Show the version number of @command{ranlib}.
 
 @item -D
 @cindex deterministic archives
+@kindex GNU_AR_DETERMINISTIC
 Operate in @emph{deterministic} mode.  The symbol map archive member's
 header will show zero for the UID, GID, and timestamp.  When this
 option is used, multiple runs will produce identical output files.
 
+If the environment variable @code{GNU_AR_DETERMINISTIC} is set to a nonempty
+value that is not @code{0}, then this mode is enabled by default.  If that
+environment variable is absent or has an empty value and @file{binutils} was
+configured with @option{--enable-deterministic-archives}, then this mode is
+on by default.  If @code{GNU_AR_DETERMINISTIC} is set to @code{0} then this
+mode is never enabled without the @code{D} modifier.
+
 @item -t
 Update the timestamp of the symbol map of an archive.
 @end table

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] GNU_AR_DETERMINISTIC env var, --enable-deterministic-archives
  2011-12-06 21:28 [PATCH] GNU_AR_DETERMINISTIC env var, --enable-deterministic-archives Roland McGrath
@ 2011-12-08 18:13 ` Roland McGrath
  2011-12-13 13:30 ` nick clifton
  1 sibling, 0 replies; 8+ messages in thread
From: Roland McGrath @ 2011-12-08 18:13 UTC (permalink / raw)
  To: binutils

ping?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] GNU_AR_DETERMINISTIC env var, --enable-deterministic-archives
  2011-12-06 21:28 [PATCH] GNU_AR_DETERMINISTIC env var, --enable-deterministic-archives Roland McGrath
  2011-12-08 18:13 ` Roland McGrath
@ 2011-12-13 13:30 ` nick clifton
  2011-12-13 14:52   ` Ian Lance Taylor
  2011-12-13 17:44   ` Roland McGrath
  1 sibling, 2 replies; 8+ messages in thread
From: nick clifton @ 2011-12-13 13:30 UTC (permalink / raw)
  To: Roland McGrath; +Cc: binutils

Hi Roland,

   Sorry for the delay in responding.  I was waiting to see if any of 
the other maintainers had anything to say on this issue.


> ar member headers are one of the last remaining common barriers to
> reproducible builds (of development packages).  Hence we have the D option.
> But it is devilishly difficult to get the option override--or even AR and
> RANLIB settings pointing to a wrapper script--through configure et al into
> all corners of e.g. the gcc build.
>
> Instead, this change lets the user set the environment variable

Eww, I hate using environment variables like this.  It makes debugging 
so much harder when you cannot rely upon the command line to describe 
how a program was invoked.

I appreciate that the main thrust of this change is to make reproducible 
builds possible without having to do a lot of work changing package 
build scripts.  But I think that that is a poor reason to introduce a 
debugging nightmare like this.

In my opinion it is better to take on the work of changing all of the 
packages one at a time, even if this will be a lot of effort.  On the 
plus side if you do this you can also simplify the package build scripts 
at the same time.  Perhaps by providing a library of useful command 
lines that the scripts can access.  Then future changes like this can be 
simply made by just updating the command line in the library.

We have always resisted using environment variables to control program 
behaviour in projects like gcc and the binutils and I see no reason why 
this policy should change.


That said I have some other comments on the patch (see below).  They are 
more along the lines of commentary on the code for possible future 
reference, rather than acceptance of the patch.


> Users can still set GNU_AR_DETERMINISTIC=0 to reverse
> the build-time default.

But they cannot use a command line option to do this, which strikes me 
as wrong.  If we are going to allow deterministic behaviour to be a 
default then we must also provide a command line option to disable it.

Also - there needs to be some way for the user to find out how the 
particular ar/ranlib that they are using was configured.  Ie they need 
to be able to find out if the default behaviour is for deterministic or 
non-deterministic archives.


> Since D is incompatible with u and some packages run "ar cru",
> when the environment variable or build-time default sets D, we
> just silently ignore u.

I am not happy with this either.  Since "u" has been explicitly 
requested by the user (well the package build script) we should not 
silently ignore it.  If "D" and "u" have both been specified on the 
command line then a fatal warning makes sense.  If on the other hand "D" 
is just the default behaviour then a warning message would make more 
sense.  Then if the user really does want the "u" behaviour and no 
warnings they can add the "not-D" command line option to their build script.


> Ok for trunk?

No, sorry.



> +      const char *env = getenv ("GNU_AR_DETERMINISTIC");
> +      if (env != NULL&&  env[0] != '\0')
> +        {
> +          if (isdigit(env[0]))
> +            {
> +              char *endp = NULL;
> +              long int value = strtol (env,&endp, 0);
> +              if (endp != NULL)
> +                deterministic = value != 0;
> +            }
> +          else
> +            deterministic = 1;

So setting "GNU_AR_DETERMINISTIC=no" will enable determinism...  I admit 
that it matches your proposed new documentation, but it still seems 
counter intuitive.


Cheers
   Nick

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] GNU_AR_DETERMINISTIC env var, --enable-deterministic-archives
  2011-12-13 13:30 ` nick clifton
@ 2011-12-13 14:52   ` Ian Lance Taylor
  2011-12-13 17:32     ` Roland McGrath
  2011-12-13 17:44   ` Roland McGrath
  1 sibling, 1 reply; 8+ messages in thread
From: Ian Lance Taylor @ 2011-12-13 14:52 UTC (permalink / raw)
  To: nick clifton; +Cc: Roland McGrath, binutils

nick clifton <nickc@redhat.com> writes:

> I appreciate that the main thrust of this change is to make
> reproducible builds possible without having to do a lot of work
> changing package build scripts.  But I think that that is a poor
> reason to introduce a debugging nightmare like this.

What about putting a shell script version of ar first on PATH?  Are
there many programs that pull ar out of some magic location rather than
simply using PATH?

Ian

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] GNU_AR_DETERMINISTIC env var, --enable-deterministic-archives
  2011-12-13 14:52   ` Ian Lance Taylor
@ 2011-12-13 17:32     ` Roland McGrath
  0 siblings, 0 replies; 8+ messages in thread
From: Roland McGrath @ 2011-12-13 17:32 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: nick clifton, binutils

On Tue, Dec 13, 2011 at 6:51 AM, Ian Lance Taylor <iant@google.com> wrote:
> What about putting a shell script version of ar first on PATH?  Are
> there many programs that pull ar out of some magic location rather than
> simply using PATH?

There are many that use AC_CHECK_TOOL and the like, so you'd need to make
it a ${target_alias}-ar script (and likewise for ranlib).  Then there is
the GCC build, which might do that or might do something far more difficult
to figure out.  This is probably doable--more doable than getting a setting
of AR, or harder yet, of the flags passed to AR through configure/make for
the GCC build.  But it's a hell of a lot less friendly and replicable than
just setting an environment variable or configuring the target binutils
with --enable-deterministic-archives.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] GNU_AR_DETERMINISTIC env var, --enable-deterministic-archives
  2011-12-13 13:30 ` nick clifton
  2011-12-13 14:52   ` Ian Lance Taylor
@ 2011-12-13 17:44   ` Roland McGrath
  2011-12-15 14:41     ` nick clifton
  1 sibling, 1 reply; 8+ messages in thread
From: Roland McGrath @ 2011-12-13 17:44 UTC (permalink / raw)
  To: nick clifton; +Cc: binutils

On Tue, Dec 13, 2011 at 5:29 AM, nick clifton <nickc@redhat.com> wrote:
> Eww, I hate using environment variables like this.  It makes debugging so
> much harder when you cannot rely upon the command line to describe how a
> program was invoked.

I agree with this as a general principle.  (But it's worth pointing out
that GCC is indeed affected by various environment variables--which, at
least, are documented in their own section of the GCC manual.)  However, I
don't think that this is worse than the royal pain in the ass of trying to
modify lots of packages.

> I appreciate that the main thrust of this change is to make reproducible
> builds possible without having to do a lot of work changing package build
> scripts.  But I think that that is a poor reason to introduce a debugging
> nightmare like this.

Frankly, I think this makes a mountain of a mole-hill.  (Anyone ever using
the "u" option is probably a far more difficult and subtle problem today.)

> In my opinion it is better to take on the work of changing all of the
> packages one at a time, even if this will be a lot of effort.  On the plus
> side if you do this you can also simplify the package build scripts at the
> same time.  Perhaps by providing a library of useful command lines that the
> scripts can access.  Then future changes like this can be simply made by
> just updating the command line in the library.

I don't think I have to tell you what a monumental pain in the ass this is
going to be in practice.  I agree it would eventually result in a better
situation, bringing both uniformity and configurability to the use of ar in
many packages.  But for effort to effect ratio, it's just beyond the pale.

> We have always resisted using environment variables to control program
> behaviour in projects like gcc and the binutils and I see no reason why this
> policy should change.

Well, it's just not very true at all about GCC.  But I take your point.
However, you didn't say anything about the build-time configure option.

> That said I have some other comments on the patch (see below).  They are
> more along the lines of commentary on the code for possible future
> reference, rather than acceptance of the patch.

Understood.

>> Users can still set GNU_AR_DETERMINISTIC=0 to reverse
>> the build-time default.
>
> But they cannot use a command line option to do this, which strikes me as
> wrong.  If we are going to allow deterministic behaviour to be a default
> then we must also provide a command line option to disable it.

I'd be happy to add such an option.

> Also - there needs to be some way for the user to find out how the
> particular ar/ranlib that they are using was configured.  Ie they need to be
> able to find out if the default behaviour is for deterministic or
> non-deterministic archives.

What do you have in mind?  It's easy enough to add the information to
--help output, but that is not programmatically friendly.  But I don't
really see why someone really needs to know the default.  If they care
especially about the behavior, having options to explicitly set it one way
or the other seems like the most useful thing.

> I am not happy with this either.  Since "u" has been explicitly requested by
> the user (well the package build script) we should not silently ignore it.
>  If "D" and "u" have both been specified on the command line then a fatal
> warning makes sense.  If on the other hand "D" is just the default behaviour
> then a warning message would make more sense.  Then if the user really does
> want the "u" behaviour and no warnings they can add the "not-D" command line
> option to their build script.

I have no problem with adding the warning.

> So setting "GNU_AR_DETERMINISTIC=no" will enable determinism...  I admit
> that it matches your proposed new documentation, but it still seems counter
> intuitive.

Well, I didn't want to get into a lot of value-parsing.  I wouldn't object
to a more precise requirement on the values, with an unrecognized value
causing a warning or error.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] GNU_AR_DETERMINISTIC env var, --enable-deterministic-archives
  2011-12-13 17:44   ` Roland McGrath
@ 2011-12-15 14:41     ` nick clifton
  2011-12-15 18:13       ` Roland McGrath
  0 siblings, 1 reply; 8+ messages in thread
From: nick clifton @ 2011-12-15 14:41 UTC (permalink / raw)
  To: Roland McGrath; +Cc: binutils

Hi Roland,


> I agree with this as a general principle.  (But it's worth pointing out
> that GCC is indeed affected by various environment variables--which, at
> least, are documented in their own section of the GCC manual.)

True.

> Well, it's just not very true at all about GCC.  But I take your point.
> However, you didn't say anything about the build-time configure option.

Sorry, I should have covered that.

I am OK with the configure option.  In fact I think that it is the best 
way forward.  Allow users (or toolchain providers) to choose what the 
default behaviour will be, and provide a command line option to 
explicitly determine determinism.


>> Also - there needs to be some way for the user to find out how the
>> particular ar/ranlib that they are using was configured.
>
> What do you have in mind?  It's easy enough to add the information to
> --help output,

That was what I had in mind.

> but that is not programmatically friendly.

True.  Another option would be to have "ar --verbose" display the 
configure arguments that were used when it was built.  Ie similar to the 
way that gcc behaves.

> But I don't
> really see why someone really needs to know the default.

I was thinking more about debugging again.  If user reports a strange 
behaviour with ar not storing file timestamps say, then we can ask them 
to run "ar --help" or "ar --verbose" and explain what the output means.


So - if you will accept dropping the environment variable and just using 
a configure time parameter to set the default behaviour then please do 
resubmit your patch.

Cheers
   Nick

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] GNU_AR_DETERMINISTIC env var, --enable-deterministic-archives
  2011-12-15 14:41     ` nick clifton
@ 2011-12-15 18:13       ` Roland McGrath
  0 siblings, 0 replies; 8+ messages in thread
From: Roland McGrath @ 2011-12-15 18:13 UTC (permalink / raw)
  To: nick clifton; +Cc: binutils

On Thu, Dec 15, 2011 at 6:29 AM, nick clifton <nickc@redhat.com> wrote:
> I am OK with the configure option.  In fact I think that it is the best way
> forward.  Allow users (or toolchain providers) to choose what the default
> behaviour will be, and provide a command line option to explicitly determine
> determinism.

Ok.  That meets my minds just fine.

>> What do you have in mind?  It's easy enough to add the information to
>> --help output,
>
> That was what I had in mind.

I can do that.

> True.  Another option would be to have "ar --verbose" display the configure
> arguments that were used when it was built.  Ie similar to the way that gcc
> behaves.

If this were done it seems to me it should be uniformly done for all of
binutils.  That seems like a worthwhile thing, but I don't think I'm going
to bother with it.

> I was thinking more about debugging again.  If user reports a strange
> behaviour with ar not storing file timestamps say, then we can ask them to
> run "ar --help" or "ar --verbose" and explain what the output means.

That makes sense.

> So - if you will accept dropping the environment variable and just using a
> configure time parameter to set the default behaviour then please do
> resubmit your patch.

Will do.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2011-12-15 18:13 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-12-06 21:28 [PATCH] GNU_AR_DETERMINISTIC env var, --enable-deterministic-archives Roland McGrath
2011-12-08 18:13 ` Roland McGrath
2011-12-13 13:30 ` nick clifton
2011-12-13 14:52   ` Ian Lance Taylor
2011-12-13 17:32     ` Roland McGrath
2011-12-13 17:44   ` Roland McGrath
2011-12-15 14:41     ` nick clifton
2011-12-15 18:13       ` Roland McGrath

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).