public inbox for gcc-patches@gcc.gnu.org
* [PATCH 1/3] Use BUILD_PATH_PREFIX_MAP envvar for debug-prefix-map
  2017-04-11 11:35 [PATCH v2] Generate reproducible output independently of the build-path Ximin Luo
  2017-04-11 11:35 ` [PATCH 2/3] Use BUILD_PATH_PREFIX_MAP envvar to transform __FILE__ Ximin Luo
@ 2017-04-11 11:35 ` Ximin Luo
  2017-04-11 11:35 ` [PATCH 3/3] When remapping paths, only match whole path components Ximin Luo
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Ximin Luo @ 2017-04-11 11:35 UTC
  To: GCC Patches; +Cc: Ximin Luo

Define the BUILD_PATH_PREFIX_MAP environment variable, and treat it as implicit
-fdebug-prefix-map CLI options specified before any explicit such options.

Much of the generic code for applying and parsing prefix-maps is implemented in
libiberty instead of the dwarf2 parts of the code, in order to make subsequent
patches unrelated to debuginfo easier.
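
As an illustration of the value format that the parser handles (dst=src pairs
separated by ":", with "%", "=" and ":" inside each part written as "%#", "%+"
and "%."), here is a minimal standalone sketch of the decoding step for one
part. The name unquote_component is hypothetical and this is not the code from
the patch; it only assumes the escaping rules stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: decode one escaped dst or
   src part in place.  Returns 1 on success and 0 if the input is
   malformed.  */
int
unquote_component (char *s)
{
  assert (s != NULL);
  char *out = s;
  for (const char *in = s; *in != '\0'; ++in)
    {
      if (*in == ':' || *in == '=')
        return 0;               /* separators must arrive escaped */
      if (*in != '%')
        {
          *out++ = *in;         /* ordinary character, copy as-is */
          continue;
        }
      switch (*++in)            /* look at the character after '%' */
        {
        case '#': *out++ = '%'; break;
        case '+': *out++ = '='; break;
        case '.': *out++ = ':'; break;
        default:  return 0;     /* unknown or truncated escape */
        }
    }
  *out = '\0';
  return 1;
}
```

Decoding in place is safe here because the decoded form is never longer than
the escaped form.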

Acknowledgements
----------------

Daniel Kahn Gillmor who wrote the patch for r231835, which saved me a lot of
time figuring out what to edit.

HW42 for discussion on the details of the proposal, and for suggesting that we
retain the ability to map the prefix to something other than ".".

Other contributors to the BUILD_PATH_PREFIX_MAP specification, see
https://reproducible-builds.org/specs/build-path-prefix-map/

ChangeLogs
----------

include/ChangeLog:

2017-04-09  Ximin Luo  <infinity0@pwned.gg>

	* prefix-map.h: New file implementing the BUILD_PATH_PREFIX_MAP
	specification; includes code from /gcc/final.c and code adapted from
	examples attached to the specification.

libiberty/ChangeLog:

2017-04-09  Ximin Luo  <infinity0@pwned.gg>

	* prefix-map.c: New file implementing the BUILD_PATH_PREFIX_MAP
	specification; includes code from /gcc/final.c and code adapted from
	examples attached to the specification.
	* Makefile.in: Update for new files.

gcc/ChangeLog:

2017-04-09  Ximin Luo  <infinity0@pwned.gg>

	* debug.h: Declare add_debug_prefix_map_from_envvar.
	* final.c: Define add_debug_prefix_map_from_envvar, and refactor
	prefix-map utilities to use equivalent code from libiberty instead.
	* opts-global.c: (handle_common_deferred_options): Call
	add_debug_prefix_map_from_envvar before processing options.

gcc/testsuite/ChangeLog:

2017-04-09  Ximin Luo  <infinity0@pwned.gg>

	* lib/gcc-dg.exp: Allow dg-set-compiler-env-var to take only one
	argument in which case it unsets the given env var.
	* gcc.dg/debug/dwarf2/build_path_prefix_map-1.c: New test.
	* gcc.dg/debug/dwarf2/build_path_prefix_map-2.c: New test.

Index: gcc-7-20170409/include/prefix-map.h
===================================================================
--- /dev/null
+++ gcc-7-20170409/include/prefix-map.h
@@ -0,0 +1,94 @@
+/* Declarations for manipulating filename prefixes.
+   Written 2017 by Ximin Luo <infinity0@pwned.gg>
+   This code is in the public domain. */
+
+#ifndef _PREFIX_MAP_H
+#define _PREFIX_MAP_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifdef HAVE_STDLIB_H
+#include <stdlib.h>
+#endif
+
+/* Linked-list of mappings from old prefixes to new prefixes.  */
+
+struct prefix_map
+{
+  const char *old_prefix;
+  const char *new_prefix;
+  size_t old_len;
+  size_t new_len;
+  struct prefix_map *next;
+};
+
+
+/* Find a mapping suitable for the given OLD_NAME in the linked list MAP.
+
+   If a mapping is found, writes a pointer to the non-matching suffix part of
+   OLD_NAME in SUFFIX, and its length in SUF_LEN.
+
+   Returns NULL if there was no suitable mapping.  */
+struct prefix_map *
+prefix_map_find (struct prefix_map *map, const char *old_name,
+		 const char **suffix, size_t *suf_len);
+
+/* Prepend a prefix map before a given SUFFIX.
+
+   The remapped name is written to NEW_NAME and returned as a const pointer. No
+   allocations are performed; the caller must ensure it can hold at least
+   MAP->NEW_LEN + SUF_LEN + 1 characters.  */
+const char *
+prefix_map_prepend (struct prefix_map *map, char *new_name,
+		    const char *suffix, size_t suf_len);
+
+/* Remap a filename.
+
+   Returns OLD_NAME unchanged if there was no remapping, otherwise returns a
+   pointer to newly-allocated memory for the remapped filename.  The memory is
+   allocated by the given ALLOC function, which also determines who is
+   responsible for freeing it.  */
+#define prefix_map_remap_alloc_(map_head, old_name, alloc)		       \
+  __extension__								       \
+  ({									       \
+    const char *__suffix;						       \
+    size_t __suf_len;							       \
+    struct prefix_map *__map;						       \
+    (__map = prefix_map_find ((map_head), (old_name), &__suffix, &__suf_len))  \
+      ? prefix_map_prepend (__map,					       \
+			    (char *) alloc (__map->new_len + __suf_len + 1),   \
+			    __suffix, __suf_len)			       \
+      : (old_name);							       \
+  })
+
+/* Remap a filename.
+
+   Returns OLD_NAME unchanged if there was no remapping, otherwise returns a
+   stack-allocated pointer to the newly-remapped filename.  */
+#define prefix_map_remap_alloca(map_head, old_name) \
+  prefix_map_remap_alloc_ (map_head, old_name, alloca)
+
+
+/* Parse prefix-maps according to the BUILD_PATH_PREFIX_MAP standard.
+
+   The input string value is of the form
+
+     dst[0]=src[0]:dst[1]=src[1]...
+
+   Every dst[i] and src[i] has had "%", "=" and ":" characters replaced with
+   "%#", "%+", and "%." respectively; this function reverses this replacement.
+
+   Rightmost entries are stored at the head of the parsed structure.
+
+   Returns 0 on failure and 1 on success.  */
+int
+prefix_map_parse (struct prefix_map **map_head, const char *arg);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _PREFIX_MAP_H */
Index: gcc-7-20170409/libiberty/Makefile.in
===================================================================
--- gcc-7-20170409.orig/libiberty/Makefile.in
+++ gcc-7-20170409/libiberty/Makefile.in
@@ -143,6 +143,7 @@ CFILES = alloca.c argv.c asprintf.c atex
 	 pex-common.c pex-djgpp.c pex-msdos.c pex-one.c			\
 	 pex-unix.c pex-win32.c						\
          physmem.c putenv.c						\
+	prefix-map.c \
 	random.c regex.c rename.c rindex.c				\
 	rust-demangle.c							\
 	safe-ctype.c setenv.c setproctitle.c sha1.c sigsetmask.c        \
@@ -182,6 +183,7 @@ REQUIRED_OFILES =							\
 	./partition.$(objext) ./pexecute.$(objext) ./physmem.$(objext)	\
 	./pex-common.$(objext) ./pex-one.$(objext)			\
 	./@pexecute@.$(objext) ./vprintf-support.$(objext)		\
+	./prefix-map.$(objext) \
 	./rust-demangle.$(objext)					\
 	./safe-ctype.$(objext)						\
 	./simple-object.$(objext) ./simple-object-coff.$(objext)	\
@@ -757,7 +759,7 @@ $(CONFIGURED_OFILES): stamp-picdir stamp
 	$(COMPILE.c) $(srcdir)/fibheap.c $(OUTPUT_OPTION)
 
 ./filename_cmp.$(objext): $(srcdir)/filename_cmp.c config.h $(INCDIR)/ansidecl.h \
-	$(INCDIR)/filenames.h $(INCDIR)/hashtab.h \
+	$(INCDIR)/filenames.h $(INCDIR)/hashtab.h $(INCDIR)/libiberty.h \
 	$(INCDIR)/safe-ctype.h
 	if [ x"$(PICFLAG)" != x ]; then \
 	  $(COMPILE.c) $(PICFLAG) $(srcdir)/filename_cmp.c -o pic/$@; \
@@ -1104,7 +1106,8 @@ $(CONFIGURED_OFILES): stamp-picdir stamp
 	$(COMPILE.c) $(srcdir)/pex-one.c $(OUTPUT_OPTION)
 
 ./pex-unix.$(objext): $(srcdir)/pex-unix.c config.h $(INCDIR)/ansidecl.h \
-	$(INCDIR)/libiberty.h $(srcdir)/pex-common.h
+	$(INCDIR)/environ.h $(INCDIR)/libiberty.h \
+	$(srcdir)/pex-common.h
 	if [ x"$(PICFLAG)" != x ]; then \
 	  $(COMPILE.c) $(PICFLAG) $(srcdir)/pex-unix.c -o pic/$@; \
 	else true; fi
@@ -1143,6 +1146,15 @@ $(CONFIGURED_OFILES): stamp-picdir stamp
 	else true; fi
 	$(COMPILE.c) $(srcdir)/physmem.c $(OUTPUT_OPTION)
 
+./prefix-map.$(objext): $(srcdir)/prefix-map.c config.h $(INCDIR)/prefix-map.h
+	if [ x"$(PICFLAG)" != x ]; then \
+	  $(COMPILE.c) $(PICFLAG) $(srcdir)/prefix-map.c -o pic/$@; \
+	else true; fi
+	if [ x"$(NOASANFLAG)" != x ]; then \
+	  $(COMPILE.c) $(PICFLAG) $(NOASANFLAG) $(srcdir)/prefix-map.c -o noasan/$@; \
+	else true; fi
+	$(COMPILE.c) $(srcdir)/prefix-map.c $(OUTPUT_OPTION)
+
 ./putenv.$(objext): $(srcdir)/putenv.c config.h $(INCDIR)/ansidecl.h
 	if [ x"$(PICFLAG)" != x ]; then \
 	  $(COMPILE.c) $(PICFLAG) $(srcdir)/putenv.c -o pic/$@; \
@@ -1210,7 +1222,8 @@ $(CONFIGURED_OFILES): stamp-picdir stamp
 	else true; fi
 	$(COMPILE.c) $(srcdir)/safe-ctype.c $(OUTPUT_OPTION)
 
-./setenv.$(objext): $(srcdir)/setenv.c config.h $(INCDIR)/ansidecl.h
+./setenv.$(objext): $(srcdir)/setenv.c config.h $(INCDIR)/ansidecl.h \
+	$(INCDIR)/environ.h
 	if [ x"$(PICFLAG)" != x ]; then \
 	  $(COMPILE.c) $(PICFLAG) $(srcdir)/setenv.c -o pic/$@; \
 	else true; fi
@@ -1661,7 +1674,7 @@ $(CONFIGURED_OFILES): stamp-picdir stamp
 	$(COMPILE.c) $(srcdir)/xexit.c $(OUTPUT_OPTION)
 
 ./xmalloc.$(objext): $(srcdir)/xmalloc.c config.h $(INCDIR)/ansidecl.h \
-	$(INCDIR)/libiberty.h
+	$(INCDIR)/environ.h $(INCDIR)/libiberty.h
 	if [ x"$(PICFLAG)" != x ]; then \
 	  $(COMPILE.c) $(PICFLAG) $(srcdir)/xmalloc.c -o pic/$@; \
 	else true; fi
@@ -1719,3 +1732,4 @@ $(CONFIGURED_OFILES): stamp-picdir stamp
 	  $(COMPILE.c) $(PICFLAG) $(NOASANFLAG) $(srcdir)/xvasprintf.c -o noasan/$@; \
 	else true; fi
 	$(COMPILE.c) $(srcdir)/xvasprintf.c $(OUTPUT_OPTION)
+
Index: gcc-7-20170409/libiberty/prefix-map.c
===================================================================
--- /dev/null
+++ gcc-7-20170409/libiberty/prefix-map.c
@@ -0,0 +1,201 @@
+/* Definitions for manipulating filename prefixes.
+   Written 2017 by Ximin Luo <infinity0@pwned.gg>
+   This code is in the public domain. */
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#ifdef HAVE_STRING_H
+#include <string.h>
+#endif
+
+#ifdef HAVE_STDLIB_H
+#include <stdlib.h>
+#endif
+
+#include "filenames.h"
+#include "libiberty.h"
+#include "prefix-map.h"
+
+
+/* Add a new mapping.
+
+   The input strings are duplicated and a new prefix_map struct is allocated.
+   Ownership of the duplicates, as well as the new prefix_map, is the same as
+   the ownership of the old struct.
+
+   Returns 0 on failure and 1 on success.  */
+int
+prefix_map_push (struct prefix_map **map_head,
+		 const char *new_prefix, const char *old_prefix)
+{
+  struct prefix_map *map = XNEW (struct prefix_map);
+  if (!map)
+    goto rewind_0;
+
+  map->old_prefix = xstrdup (old_prefix);
+  if (!map->old_prefix)
+    goto rewind_1;
+  map->old_len = strlen (old_prefix);
+
+  map->new_prefix = xstrdup (new_prefix);
+  if (!map->new_prefix)
+    goto rewind_2;
+  map->new_len = strlen (new_prefix);
+
+  map->next = *map_head;
+  *map_head = map;
+  return 1;
+
+rewind_2:
+  free ((void *) map->old_prefix);
+rewind_1:
+  free (map);
+rewind_0:
+  return 0;
+}
+
+/* Rewind a prefix map.
+
+   Everything up to the given OLD_HEAD is freed.  */
+void
+prefix_map_pop_until (struct prefix_map **map_head, struct prefix_map *old_head)
+{
+  struct prefix_map *map;
+  struct prefix_map *next;
+
+  for (map = *map_head; map != old_head; map = next)
+    {
+      free ((void *) map->old_prefix);
+      free ((void *) map->new_prefix);
+      next = map->next;
+      free (map);
+    }
+
+  *map_head = map;
+}
+
+
+/* Find a mapping suitable for the given OLD_NAME in the linked list MAP.
+
+   If a mapping is found, writes a pointer to the non-matching suffix part of
+   OLD_NAME in SUFFIX, and its length in SUF_LEN.
+
+   Returns NULL if there was no suitable mapping.  */
+struct prefix_map *
+prefix_map_find (struct prefix_map *map, const char *old_name,
+		 const char **suffix, size_t *suf_len)
+{
+  for (; map; map = map->next)
+    if (filename_ncmp (old_name, map->old_prefix, map->old_len) == 0)
+      {
+	*suf_len = strlen (*suffix = old_name + map->old_len);
+	break;
+      }
+
+  return map;
+}
+
+/* Prepend a prefix map before a given SUFFIX.
+
+   The remapped name is written to NEW_NAME and returned as a const pointer. No
+   allocations are performed; the caller must ensure it can hold at least
+   MAP->NEW_LEN + SUF_LEN + 1 characters.  */
+const char *
+prefix_map_prepend (struct prefix_map *map, char *new_name,
+		    const char *suffix, size_t suf_len)
+{
+  memcpy (new_name, map->new_prefix, map->new_len);
+  memcpy (new_name + map->new_len, suffix, suf_len + 1);
+  return new_name;
+}
+
+
+/* Parse a single part of a single prefix-map pair.
+
+   Returns 0 on failure and 1 on success.  */
+int
+prefix_map_parse_unquote (char *src)
+{
+  for (char *dest = src; 0 != (*dest = *src); ++dest, ++src)
+    switch (*src)
+      {
+      case ':':
+      case '=':
+	return 0; // should have been escaped
+      case '%':
+	switch (*(src + 1))
+	  {
+	  case '.':
+	    *dest = ':';
+	    goto unquoted;
+	  case '+':
+	    *dest = '=';
+	  unquoted:
+	  case '#':
+	    ++src;
+	    break;
+	  default:
+	    return 0; // invalid
+	  }
+      }
+  return 1;
+}
+
+/* Parse a single prefix-map.
+
+   Returns 0 on failure and 1 on success.  */
+int
+prefix_map_parse1 (struct prefix_map **map_head, char *arg)
+{
+  char *p;
+  p = strchr (arg, '=');
+  if (!p)
+    return 0;
+  *p = '\0';
+  if (!prefix_map_parse_unquote (arg))
+    return 0;
+  p++;
+  if (!prefix_map_parse_unquote (p))
+    return 0;
+
+  return prefix_map_push (map_head, arg, p);
+}
+
+/* Parse a prefix-map according to the BUILD_PATH_PREFIX_MAP standard.
+
+   The input string value is of the form
+
+     dst[0]=src[0]:dst[1]=src[1]...
+
+   Every dst[i] and src[i] has had "%", "=" and ":" characters replaced with
+   "%#", "%+", and "%." respectively; this function reverses this replacement.
+
+   Rightmost entries are stored at the head of the parsed structure.
+
+   Returns 0 on failure and 1 on success.  */
+int
+prefix_map_parse (struct prefix_map **map_head, const char *arg)
+{
+  struct prefix_map *old_head = *map_head;
+
+  size_t len = strlen (arg);
+  char *copy = (char *) alloca (len + 1);
+  memcpy (copy, arg, len + 1);
+
+  const char *sep = ":";
+  char *end, *tok = strtok_r (copy, sep, &end);
+  while (tok != NULL)
+    {
+      if (!prefix_map_parse1 (map_head, tok))
+	{
+	  prefix_map_pop_until (map_head, old_head);
+	  return 0;
+	}
+
+      tok = strtok_r (NULL, sep, &end);
+    }
+
+  return 1;
+}
Index: gcc-7-20170409/gcc/debug.h
===================================================================
--- gcc-7-20170409.orig/gcc/debug.h
+++ gcc-7-20170409/gcc/debug.h
@@ -236,6 +236,7 @@ extern void dwarf2out_switch_text_sectio
 
 const char *remap_debug_filename (const char *);
 void add_debug_prefix_map (const char *);
+void add_debug_prefix_map_from_envvar ();
 
 /* For -fdump-go-spec.  */
 
Index: gcc-7-20170409/gcc/final.c
===================================================================
--- gcc-7-20170409.orig/gcc/final.c
+++ gcc-7-20170409/gcc/final.c
@@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.
 #define INCLUDE_ALGORITHM /* reverse */
 #include "system.h"
 #include "coretypes.h"
+#include "prefix-map.h"
 #include "backend.h"
 #include "target.h"
 #include "rtl.h"
@@ -1506,22 +1507,9 @@ asm_str_count (const char *templ)
   return count;
 }
 \f
-/* ??? This is probably the wrong place for these.  */
-/* Structure recording the mapping from source file and directory
-   names at compile time to those to be embedded in debug
-   information.  */
-struct debug_prefix_map
-{
-  const char *old_prefix;
-  const char *new_prefix;
-  size_t old_len;
-  size_t new_len;
-  struct debug_prefix_map *next;
-};
-
-/* Linked list of such structures.  */
-static debug_prefix_map *debug_prefix_maps;
 
+/* Linked list of `struct prefix_map'.  */
+static prefix_map *debug_prefix_maps = NULL;
 
 /* Record a debug file prefix mapping.  ARG is the argument to
    -fdebug-prefix-map and must be of the form OLD=NEW.  */
@@ -1529,7 +1517,7 @@ static debug_prefix_map *debug_prefix_ma
 void
 add_debug_prefix_map (const char *arg)
 {
-  debug_prefix_map *map;
+  prefix_map *map;
   const char *p;
 
   p = strchr (arg, '=');
@@ -1538,7 +1526,7 @@ add_debug_prefix_map (const char *arg)
       error ("invalid argument %qs to -fdebug-prefix-map", arg);
       return;
     }
-  map = XNEW (debug_prefix_map);
+  map = XNEW (prefix_map);
   map->old_prefix = xstrndup (arg, p - arg);
   map->old_len = p - arg;
   p++;
@@ -1548,28 +1536,32 @@ add_debug_prefix_map (const char *arg)
   debug_prefix_maps = map;
 }
 
+/* Add debug-prefix-maps from BUILD_PATH_PREFIX_MAP environment variable.  */
+
+void
+add_debug_prefix_map_from_envvar ()
+{
+  const char *arg = getenv ("BUILD_PATH_PREFIX_MAP");
+
+  if (!arg || prefix_map_parse (&debug_prefix_maps, arg))
+    return;
+
+  error ("environment variable BUILD_PATH_PREFIX_MAP is "
+	 "not well formed; see the GCC documentation for more details.");
+}
+
 /* Perform user-specified mapping of debug filename prefixes.  Return
    the new name corresponding to FILENAME.  */
 
 const char *
 remap_debug_filename (const char *filename)
 {
-  debug_prefix_map *map;
-  char *s;
-  const char *name;
-  size_t name_len;
-
-  for (map = debug_prefix_maps; map; map = map->next)
-    if (filename_ncmp (filename, map->old_prefix, map->old_len) == 0)
-      break;
-  if (!map)
+  const char *name = prefix_map_remap_alloca (debug_prefix_maps, filename);
+
+  if (name == filename)
     return filename;
-  name = filename + map->old_len;
-  name_len = strlen (name) + 1;
-  s = (char *) alloca (name_len + map->new_len);
-  memcpy (s, map->new_prefix, map->new_len);
-  memcpy (s + map->new_len, name, name_len);
-  return ggc_strdup (s);
+
+  return ggc_strdup (name);
 }
 \f
 /* Return true if DWARF2 debug info can be emitted for DECL.  */
Index: gcc-7-20170409/gcc/opts-global.c
===================================================================
--- gcc-7-20170409.orig/gcc/opts-global.c
+++ gcc-7-20170409/gcc/opts-global.c
@@ -335,6 +335,8 @@ handle_common_deferred_options (void)
   if (flag_opt_info)
     opt_info_switch_p (NULL);
 
+  add_debug_prefix_map_from_envvar ();
+
   FOR_EACH_VEC_ELT (v, i, opt)
     {
       switch (opt->opt_index)
Index: gcc-7-20170409/gcc/testsuite/lib/gcc-dg.exp
===================================================================
--- gcc-7-20170409.orig/gcc/testsuite/lib/gcc-dg.exp
+++ gcc-7-20170409/gcc/testsuite/lib/gcc-dg.exp
@@ -454,19 +454,24 @@ proc restore-target-env-var { } {
 proc dg-set-compiler-env-var { args } {
     global set_compiler_env_var
     global saved_compiler_env_var
-    if { [llength $args] != 3 } {
-	error "dg-set-compiler-env-var: need two arguments"
+    if { [llength $args] != 3 && [llength $args] != 2 } {
+	error "dg-set-compiler-env-var: need one or two arguments"
 	return
     }
     set var [lindex $args 1]
-    set value [lindex $args 2]
     if [info exists ::env($var)] {
       lappend saved_compiler_env_var [list $var 1 $::env($var)]
     } else {
       lappend saved_compiler_env_var [list $var 0]
     }
-    setenv $var $value
-    lappend set_compiler_env_var [list $var $value]
+    if { [llength $args] == 3 } {
+      set value [lindex $args 2]
+      setenv $var $value
+      lappend set_compiler_env_var [list $var 1 $value]
+    } else {
+      catch { unsetenv $var }
+      lappend set_compiler_env_var [list $var 0]
+    }
 }
 
 proc restore-compiler-env-var { } {
@@ -478,7 +483,7 @@ proc restore-compiler-env-var { } {
 	if [lindex $env_var 1] {
 	    setenv $var [lindex $env_var 2]
 	} else {
-	    unsetenv $var
+	    catch { unsetenv $var }
 	}
     }
 }
Index: gcc-7-20170409/gcc/testsuite/gcc.dg/debug/dwarf2/build_path_prefix_map-1.c
===================================================================
--- /dev/null
+++ gcc-7-20170409/gcc/testsuite/gcc.dg/debug/dwarf2/build_path_prefix_map-1.c
@@ -0,0 +1,9 @@
+/* DW_AT_comp_dir should be relative if BUILD_PATH_PREFIX_MAP is a prefix of it.  */
+/* { dg-do compile } */
+/* { dg-options "-gdwarf -dA" } */
+/* { dg-set-compiler-env-var BUILD_PATH_PREFIX_MAP "DWARF2TEST=[file dirname [pwd]]" } */
+/* { dg-final { scan-assembler "DW_AT_comp_dir: \"DWARF2TEST/gcc" } } */
+
+void func (void)
+{
+}
Index: gcc-7-20170409/gcc/testsuite/gcc.dg/debug/dwarf2/build_path_prefix_map-2.c
===================================================================
--- /dev/null
+++ gcc-7-20170409/gcc/testsuite/gcc.dg/debug/dwarf2/build_path_prefix_map-2.c
@@ -0,0 +1,9 @@
+/* DW_AT_comp_dir should be absolute if BUILD_PATH_PREFIX_MAP is not set.  */
+/* { dg-do compile } */
+/* { dg-options "-gdwarf -dA" } */
+/* { dg-set-compiler-env-var BUILD_PATH_PREFIX_MAP } */
+/* { dg-final { scan-assembler "DW_AT_comp_dir: \"/" } } */
+
+void func (void)
+{
+}


* [PATCH v2] Generate reproducible output independently of the build-path
@ 2017-04-11 11:35 Ximin Luo
  2017-04-11 11:35 ` [PATCH 2/3] Use BUILD_PATH_PREFIX_MAP envvar to transform __FILE__ Ximin Luo
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Ximin Luo @ 2017-04-11 11:35 UTC
  To: GCC Patches; +Cc: Ximin Luo

(Please keep me on CC, I am not subscribed)

Background
==========

Previous background is here: https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00182.html

Upon further discussion, we decided to add support for multiple mappings and to
rename the environment variable to BUILD_PATH_PREFIX_MAP. We have also prepared
a document that describes how this works in detail, so that projects can be
confident that they are interoperable:

https://reproducible-builds.org/specs/build-path-prefix-map/

The specification is currently in DRAFT status, awaiting some final feedback,
including what the GCC maintainers think about it.

If one is interested in reading about this topic in the wider context of
reproducible builds, there's some more background here:

https://wiki.debian.org/ReproducibleBuilds/StandardEnvironmentVariables

Proposal
========

This patch series adds a new environment variable BUILD_PATH_PREFIX_MAP. When
this is set, GCC will treat this as extra implicit "-fdebug-prefix-map=$value"
command-line arguments that precede any explicit ones. This makes the final
binary output reproducible, and also hides the unreproducible value (the source
path prefixes) from CFLAGS et al., which many build tools (understandably)
embed as-is into their build output.
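
For a build tool that wants to set the variable, the escaping that the
specification requires for each dst and src part ("%" becomes "%#", "=" becomes
"%+", ":" becomes "%.") could be sketched as follows. The function name
quote_component and its buffer-based interface are assumptions made for this
example, not something defined by the patches or the specification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write the escaped form of SRC into DST, which has
   room for DST_SIZE bytes.  Returns the number of bytes written, not
   counting the terminating NUL, or (size_t) -1 if DST is too small.  */
size_t
quote_component (const char *src, char *dst, size_t dst_size)
{
  assert (src != NULL && dst != NULL);
  size_t n = 0;
  for (; *src != '\0'; ++src)
    {
      char esc = 0;
      switch (*src)
        {
        case '%': esc = '#'; break;
        case '=': esc = '+'; break;
        case ':': esc = '.'; break;
        }
      size_t need = esc ? 2 : 1;
      if (n + need + 1 > dst_size)  /* keep one byte for the NUL */
        return (size_t) -1;
      if (esc)
        {
          dst[n++] = '%';
          dst[n++] = esc;
        }
      else
        dst[n++] = *src;
    }
  if (n + 1 > dst_size)             /* only reachable for empty SRC */
    return (size_t) -1;
  dst[n] = '\0';
  return n;
}
```

A caller would escape the dst and src parts separately and then join them as
dst=src, with ":" between successive pairs.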

This environment variable also acts on the __FILE__ macro, mapping it in the
same way that debug-prefix-map works for debug symbols. We have seen that
__FILE__ is also a very large source of unreproducibility, and is represented
quite heavily in the 3k+ figure given earlier.

Finally, we tweak the mapping algorithm so that it applies only to whole path
components when matching prefixes. This algorithm contains fewer corner cases
and is more predictable, so it is easier for users to figure out how to set the
mapping appropriately, and it is better as a standardised algorithm that other
build tools might like to adopt. (The original idea came from discussions with
some rustc developers about this same topic.) This does technically break
backwards compatibility, but my impression is that this option is not regarded
as critical enough for the change to cause real problems. I am also happy to
justify it in more detail on request.

Nevertheless, for this reason our draft specification currently offers two
algorithms for implementers, but I would reduce this to one if the GCC
maintainers agree to accept this third patch.

Testing
=======

I've tested these patches on a Debian unstable x86_64-linux-gnu schroot running
inside a Debian jessie system, on a full-bootstrap build. The output of
contrib/compare_tests is as follows:

~~~~
gcc-7-20170409$ contrib/compare_tests ../gcc-build-0 ../gcc-build-1
# Comparing directories
## Dir1=../gcc-build-0: 8 sum files
## Dir2=../gcc-build-1: 8 sum files

# Comparing 8 common sum files
## /bin/sh contrib/compare_tests  /tmp/gxx-sum1.24154 /tmp/gxx-sum2.24154
New tests that PASS:

gcc.dg/cpp/build_path_prefix_map-1.c (test for excess errors)
gcc.dg/cpp/build_path_prefix_map-1.c execution test
gcc.dg/cpp/build_path_prefix_map-2.c (test for excess errors)
gcc.dg/cpp/build_path_prefix_map-2.c execution test
gcc.dg/debug/dwarf2/build_path_prefix_map-1.c (test for excess errors)
gcc.dg/debug/dwarf2/build_path_prefix_map-1.c scan-assembler DW_AT_comp_dir: "DWARF2TEST/gcc
gcc.dg/debug/dwarf2/build_path_prefix_map-2.c (test for excess errors)
gcc.dg/debug/dwarf2/build_path_prefix_map-2.c scan-assembler DW_AT_comp_dir: "/

# No differences found in 8 common sum files
~~~~

I can also provide the full logs on request.

--

I've also fuzzed the prefix-map code using AFL with ASAN enabled. Due to how
AFL works I did not fuzz this patch directly but a smaller program with just
the parser and remapper, available here:

https://anonscm.debian.org/cgit/reproducible/build-path-prefix-map-spec.git/tree/consume

Over the course of about 4k cycles, no crashes were found.

To reproduce, you could run something like:

$ echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
$ make CC=afl-gcc clean reset-fuzz-pecsplit.c fuzz-pecsplit.c

--

I will soon test this patch backported to Debian GCC-6 on
tests.reproducible-builds.org and will have results in a few days or weeks.
Some preliminary tests earlier gave good results (about +40 packages
reproducible over ~2 days) but we had to abort due to some mis-scheduling.

Copyright disclaimer
====================

I dedicate these patches to the public domain by waiving all of my rights to
the work worldwide under copyright law, including all related and neighboring
rights, to the extent allowed by law.

See https://creativecommons.org/publicdomain/zero/1.0/legalcode for full text.

Please let me know if the above is insufficient and I will be happy to sign any
relevant forms.

However, I would prefer that prefix-map.{h,c} remain in the public domain, since
their code is also duplicated in our "example code" repo (URL above), which is
meant for other projects to copy and paste.


* [PATCH 3/3] When remapping paths, only match whole path components
  2017-04-11 11:35 [PATCH v2] Generate reproducible output independently of the build-path Ximin Luo
  2017-04-11 11:35 ` [PATCH 2/3] Use BUILD_PATH_PREFIX_MAP envvar to transform __FILE__ Ximin Luo
  2017-04-11 11:35 ` [PATCH 1/3] Use BUILD_PATH_PREFIX_MAP envvar for debug-prefix-map Ximin Luo
@ 2017-04-11 11:35 ` Ximin Luo
  2017-04-18 14:57 ` [PATCH v2] Generate reproducible output independently of the build-path Ximin Luo
  2017-04-21 18:28 ` Joseph Myers
  4 siblings, 0 replies; 10+ messages in thread
From: Ximin Luo @ 2017-04-11 11:35 UTC
  To: GCC Patches; +Cc: Ximin Luo

Change the remapping algorithm so that each old_prefix only matches paths that
have old_prefix as a whole path component prefix.  (A whole path component is a
part of a path that begins and ends at a directory separator or at either end
of the path string.)

This remapping algorithm is more predictable than the old algorithm, because
there is no chance of mappings for one directory interfering with mappings for
other directories.  It contains fewer corner cases and is therefore nicer for
clients to use.  For these reasons, in our BUILD_PATH_PREFIX_MAP specification
we recommend this algorithm, and it would be good for GCC to follow suit.

This does technically break backwards compatibility but I don't think anyone
would be reasonably depending on the corner cases of the previous algorithm,
which are surprising and counterintuitive.
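
To make the difference concrete, here is a minimal standalone sketch of the
whole-component check. It is a simplification under stated assumptions: it
uses plain strncmp and treats only "/" as a directory separator, whereas the
patch uses filename_ncmp and IS_DIR_SEPARATOR from libiberty; the name
match_whole_component is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if OLD_PREFIX matches NAME at a path component
   boundary, return a pointer to the part of NAME after the match;
   otherwise return NULL.  */
const char *
match_whole_component (const char *name, const char *old_prefix)
{
  assert (name != NULL && old_prefix != NULL);
  size_t len = strlen (old_prefix);

  /* Ignore trailing separators at the end of the prefix.  */
  while (len > 0 && old_prefix[len - 1] == '/')
    len--;

  if (strncmp (name, old_prefix, len) != 0)
    return NULL;

  /* The match must end at a separator or at the end of NAME.  */
  if (name[len] != '/' && name[len] != '\0')
    return NULL;

  return name + len;
}
```

With this check, a prefix of "/foo" matches "/foo/bar.c" (leaving the suffix
"/bar.c") but no longer matches "/foobar/x.c", which the plain prefix
comparison of the old algorithm would have accepted.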

Acknowledgements
----------------

Discussions with Michael Woerister and other members of the Rust compiler team
on Github, and discussions with Daniel Shahaf on the rb-general@ mailing list
on lists.reproducible-builds.org.

ChangeLogs
----------

gcc/ChangeLog:

2017-04-09  Ximin Luo  <infinity0@pwned.gg>

	* doc/invoke.texi (Environment Variables): Document form and behaviour
	of BUILD_PATH_PREFIX_MAP.

libiberty/ChangeLog:

2017-04-09  Ximin Luo  <infinity0@pwned.gg>

	* prefix-map.c: When remapping paths, only match whole path components.

Index: gcc-7-20170409/gcc/doc/invoke.texi
===================================================================
--- gcc-7-20170409.orig/gcc/doc/invoke.texi
+++ gcc-7-20170409/gcc/doc/invoke.texi
@@ -26637,6 +26637,26 @@ Recognize EUCJP characters.
 If @env{LANG} is not defined, or if it has some other value, then the
 compiler uses @code{mblen} and @code{mbtowc} as defined by the default locale to
 recognize and translate multibyte characters.
+
+@item BUILD_PATH_PREFIX_MAP
+@findex BUILD_PATH_PREFIX_MAP
+If this variable is set, it specifies an ordered map used to transform
+filepaths output in debugging symbols and expansions of the @code{__FILE__}
+macro.  This may be used to achieve fully reproducible output.  In the context
+of running GCC within a higher-level build tool, it is typically more reliable
+than setting command line arguments such as @option{-fdebug-prefix-map} or
+common environment variables such as @env{CFLAGS}, since the build tool may
+save these latter values into other output outside of GCC's control.
+
+The value is of the form
+@samp{@var{dst@r{[0]}}=@var{src@r{[0]}}:@var{dst@r{[1]}}=@var{src@r{[1]}}@r{@dots{}}}.
+If any @var{dst@r{[}i@r{]}} or @var{src@r{[}i@r{]}} contains @code{%}, @code{=}
+or @code{:} characters, they must be replaced with @code{%#}, @code{%+}, and
+@code{%.} respectively.
+
+Whenever GCC emits a filepath that starts with a whole path component matching
+@var{src@r{[}i@r{]}} for some @var{i}, with rightmost @var{i} taking priority,
+the matching part is replaced with @var{dst@r{[}i@r{]}} in the final output.
 @end table
 
 @noindent
Index: gcc-7-20170409/libiberty/prefix-map.c
===================================================================
--- gcc-7-20170409.orig/libiberty/prefix-map.c
+++ gcc-7-20170409/libiberty/prefix-map.c
@@ -87,12 +87,22 @@ struct prefix_map *
 prefix_map_find (struct prefix_map *map, const char *old_name,
 		 const char **suffix, size_t *suf_len)
 {
+  size_t len;
+
   for (; map; map = map->next)
-    if (filename_ncmp (old_name, map->old_prefix, map->old_len) == 0)
-      {
-	*suf_len = strlen (*suffix = old_name + map->old_len);
-	break;
-      }
+    {
+      len = map->old_len;
+      /* Ignore trailing path separators at the end of old_prefix */
+      while (len > 0 && IS_DIR_SEPARATOR (map->old_prefix[len-1])) len--;
+      /* Check if old_name matches old_prefix at a path component boundary */
+      if (! filename_ncmp (old_name, map->old_prefix, len)
+	  && (IS_DIR_SEPARATOR (old_name[len])
+	      || old_name[len] == '\0'))
+	{
+	  *suf_len = strlen (*suffix = old_name + len);
+	  break;
+	}
+    }
 
   return map;
 }


* [PATCH 2/3] Use BUILD_PATH_PREFIX_MAP envvar to transform __FILE__
  2017-04-11 11:35 [PATCH v2] Generate reproducible output independently of the build-path Ximin Luo
@ 2017-04-11 11:35 ` Ximin Luo
  2017-04-11 11:35 ` [PATCH 1/3] Use BUILD_PATH_PREFIX_MAP envvar for debug-prefix-map Ximin Luo
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Ximin Luo @ 2017-04-11 11:35 UTC
  To: GCC Patches; +Cc: Ximin Luo

Use the BUILD_PATH_PREFIX_MAP environment variable when expanding the __FILE__
macro, in the same way that debug-prefix-map works for debugging symbol paths.

This patch follows similar lines to the earlier patch for SOURCE_DATE_EPOCH.
Specifically, we read the environment variable not in libcpp but via a hook
which has an implementation defined in gcc/c-family.  However, achieving this
is more complex than in the earlier patch: we need to share the prefix_map data
structure and associated functions between libcpp and c-family.  Therefore, we
need to move these to libiberty.  (For comparison, the SOURCE_DATE_EPOCH patch
did not need this because time_t et al. are in the standard C library.)

Acknowledgements
----------------

Dhole <dhole@openmailbox.org>, who wrote the earlier patch for
SOURCE_DATE_EPOCH, which saved me a lot of time figuring out what to edit.

ChangeLogs
----------

gcc/c-family/ChangeLog:

2017-03-27  Ximin Luo  <infinity0@pwned.gg>

	* c-common.c (cb_get_build_path_prefix_map): Define new call target.
	* c-common.h (cb_get_build_path_prefix_map): Declare call target.
	* c-lex.c (init_c_lex): Set the get_build_path_prefix_map callback.

libcpp/ChangeLog:

2017-03-27  Ximin Luo  <infinity0@pwned.gg>

	* include/cpplib.h (cpp_callbacks): Add get_build_path_prefix_map
	callback.
	* init.c (cpp_create_reader): Initialise build_path_prefix_map field.
	* internal.h (cpp_reader): Add new field build_path_prefix_map.
	* macro.c (_cpp_builtin_macro_text): Set the build_path_prefix_map
	field if unset and apply it when expanding __FILE__ macros.

gcc/testsuite/ChangeLog:

2017-03-27  Ximin Luo  <infinity0@pwned.gg>

	* gcc.dg/cpp/build_path_prefix_map-1.c: New test.
	* gcc.dg/cpp/build_path_prefix_map-2.c: New test.

Index: gcc-7-20170402/gcc/c-family/c-common.c
===================================================================
--- gcc-7-20170402.orig/gcc/c-family/c-common.c
+++ gcc-7-20170402/gcc/c-family/c-common.c
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.
 
 #include "config.h"
 #include "system.h"
+#include "prefix-map.h"
 #include "coretypes.h"
 #include "target.h"
 #include "function.h"
@@ -8012,6 +8013,25 @@ cb_get_source_date_epoch (cpp_reader *pf
   return (time_t) epoch;
 }
 
+/* Read BUILD_PATH_PREFIX_MAP from environment to have deterministic relative
+   paths to replace embedded absolute paths to get reproducible results.
+   Returns NULL if BUILD_PATH_PREFIX_MAP is badly formed.  */
+
+prefix_map **
+cb_get_build_path_prefix_map (cpp_reader *pfile ATTRIBUTE_UNUSED)
+{
+  prefix_map **map = XCNEW (prefix_map *);
+
+  const char *arg = getenv ("BUILD_PATH_PREFIX_MAP");
+  if (!arg || prefix_map_parse (map, arg))
+    return map;
+
+  free (map);
+  error_at (input_location, "environment variable BUILD_PATH_PREFIX_MAP is "
+	    "not well formed; see the GCC documentation for more details.");
+  return NULL;
+}
+
 /* Callback for libcpp for offering spelling suggestions for misspelled
    directives.  GOAL is an unrecognized string; CANDIDATES is a
    NULL-terminated array of candidate strings.  Return the closest
Index: gcc-7-20170402/gcc/c-family/c-common.h
===================================================================
--- gcc-7-20170402.orig/gcc/c-family/c-common.h
+++ gcc-7-20170402/gcc/c-family/c-common.h
@@ -1085,6 +1085,11 @@ extern time_t cb_get_source_date_epoch (
    __TIME__ can store.  */
 #define MAX_SOURCE_DATE_EPOCH HOST_WIDE_INT_C (253402300799)
 
+/* Read BUILD_PATH_PREFIX_MAP from environment to have deterministic relative
+   paths to replace embedded absolute paths to get reproducible results.
+   Returns NULL if BUILD_PATH_PREFIX_MAP is badly formed.  */
+extern prefix_map **cb_get_build_path_prefix_map (cpp_reader *pfile);
+
 /* Callback for libcpp for offering spelling suggestions for misspelled
    directives.  */
 extern const char *cb_get_suggestion (cpp_reader *, const char *,
Index: gcc-7-20170402/gcc/c-family/c-lex.c
===================================================================
--- gcc-7-20170402.orig/gcc/c-family/c-lex.c
+++ gcc-7-20170402/gcc/c-family/c-lex.c
@@ -81,6 +81,7 @@ init_c_lex (void)
   cb->read_pch = c_common_read_pch;
   cb->has_attribute = c_common_has_attribute;
   cb->get_source_date_epoch = cb_get_source_date_epoch;
+  cb->get_build_path_prefix_map = cb_get_build_path_prefix_map;
   cb->get_suggestion = cb_get_suggestion;
 
   /* Set the debug callbacks if we can use them.  */
Index: gcc-7-20170402/libcpp/include/cpplib.h
===================================================================
--- gcc-7-20170402.orig/libcpp/include/cpplib.h
+++ gcc-7-20170402/libcpp/include/cpplib.h
@@ -607,6 +607,9 @@ struct cpp_callbacks
   /* Callback to parse SOURCE_DATE_EPOCH from environment.  */
   time_t (*get_source_date_epoch) (cpp_reader *);
 
+  /* Callback to parse BUILD_PATH_PREFIX_MAP from environment.  */
+  struct prefix_map **(*get_build_path_prefix_map) (cpp_reader *);
+
   /* Callback for providing suggestions for misspelled directives.  */
   const char *(*get_suggestion) (cpp_reader *, const char *, const char *const *);
 };
Index: gcc-7-20170402/libcpp/init.c
===================================================================
--- gcc-7-20170402.orig/libcpp/init.c
+++ gcc-7-20170402/libcpp/init.c
@@ -261,6 +261,9 @@ cpp_create_reader (enum c_lang lang, cpp
   /* Initialize source_date_epoch to -2 (not yet set).  */
   pfile->source_date_epoch = (time_t) -2;
 
+  /* Initialize build_path_prefix_map to NULL (not yet set).  */
+  pfile->build_path_prefix_map = NULL;
+
   /* The expression parser stack.  */
   _cpp_expand_op_stack (pfile);
 
Index: gcc-7-20170402/libcpp/internal.h
===================================================================
--- gcc-7-20170402.orig/libcpp/internal.h
+++ gcc-7-20170402/libcpp/internal.h
@@ -507,6 +507,11 @@ struct cpp_reader
      set to -1 to disable it or to a non-negative value to enable it.  */
   time_t source_date_epoch;
 
+  /* Externally set prefix-map to transform absolute paths, useful for
+     reproducibility.  It should be initialized to NULL (not yet set or
+     disabled) or to a `struct prefix_map` double pointer to enable it.  */
+  struct prefix_map **build_path_prefix_map;
+
   /* EOF token, and a token forcing paste avoidance.  */
   cpp_token avoid_paste;
   cpp_token eof;
Index: gcc-7-20170402/libcpp/macro.c
===================================================================
--- gcc-7-20170402.orig/libcpp/macro.c
+++ gcc-7-20170402/libcpp/macro.c
@@ -26,6 +26,7 @@ along with this program; see the file CO
 #include "system.h"
 #include "cpplib.h"
 #include "internal.h"
+#include "prefix-map.h"
 
 typedef struct macro_arg macro_arg;
 /* This structure represents the tokens of a macro argument.  These
@@ -291,7 +292,17 @@ _cpp_builtin_macro_text (cpp_reader *pfi
 	unsigned int len;
 	const char *name;
 	uchar *buf;
+	prefix_map **map = pfile->build_path_prefix_map;
 	
+	/* Set a prefix-map for __FILE__ if BUILD_PATH_PREFIX_MAP is defined.  */
+	if (map == NULL && pfile->cb.get_build_path_prefix_map != NULL)
+	  {
+	    map = pfile->cb.get_build_path_prefix_map (pfile);
+	    if (map == NULL)
+	      abort ();
+	    pfile->build_path_prefix_map = map;
+	  }
+
 	if (node->value.builtin == BT_FILE)
 	  name = linemap_get_expansion_filename (pfile->line_table,
 						 pfile->line_table->highest_line);
@@ -301,6 +312,11 @@ _cpp_builtin_macro_text (cpp_reader *pfi
 	    if (!name)
 	      abort ();
 	  }
+
+	/* Apply the prefix-map for deterministic path output.  */
+	if (map != NULL)
+	  name = prefix_map_remap_alloca (*map, name);
+
 	len = strlen (name);
 	buf = _cpp_unaligned_alloc (pfile, len * 2 + 3);
 	result = buf;
Index: gcc-7-20170402/gcc/testsuite/gcc.dg/cpp/build_path_prefix_map-1.c
===================================================================
--- /dev/null
+++ gcc-7-20170402/gcc/testsuite/gcc.dg/cpp/build_path_prefix_map-1.c
@@ -0,0 +1,11 @@
+/* __FILE__ should strip BUILD_PATH_PREFIX_MAP if the latter is a prefix. */
+/* { dg-do run } */
+/* { dg-set-compiler-env-var BUILD_PATH_PREFIX_MAP "MACROTEST=$srcdir" } */
+
+int
+main ()
+{
+  if (__builtin_strcmp (__FILE__, "MACROTEST/gcc.dg/cpp/build_path_prefix_map-1.c") != 0)
+    __builtin_abort ();
+  return 0;
+}
Index: gcc-7-20170402/gcc/testsuite/gcc.dg/cpp/build_path_prefix_map-2.c
===================================================================
--- /dev/null
+++ gcc-7-20170402/gcc/testsuite/gcc.dg/cpp/build_path_prefix_map-2.c
@@ -0,0 +1,12 @@
+/* __FILE__ should not be relative if BUILD_PATH_PREFIX_MAP is not set, and gcc is
+   asked to compile an absolute filename as is the case with this test.  */
+/* { dg-do run } */
+/* { dg-set-compiler-env-var BUILD_PATH_PREFIX_MAP } */
+
+int
+main ()
+{
+  if (__builtin_strcmp (__FILE__, "./gcc.dg/cpp/build_path_prefix_map-2.c") == 0)
+    __builtin_abort ();
+  return 0;
+}

* Re: [PATCH v2] Generate reproducible output independently of the build-path
  2017-04-11 11:35 [PATCH v2] Generate reproducible output independently of the build-path Ximin Luo
                   ` (2 preceding siblings ...)
  2017-04-11 11:35 ` [PATCH 3/3] When remapping paths, only match whole path components Ximin Luo
@ 2017-04-18 14:57 ` Ximin Luo
  2017-04-21 18:28 ` Joseph Myers
  4 siblings, 0 replies; 10+ messages in thread
From: Ximin Luo @ 2017-04-18 14:57 UTC (permalink / raw)
  To: GCC Patches

Ximin Luo:
> [..]
> 
> I will soon test this patch backported to Debian GCC-6 on
> tests.reproducible-builds.org and will have results in a few days or weeks.
> Some preliminary tests earlier gave good results (about +40 packages
> reproducible over ~2 days) but we had to abort due to some misscheduling.
> 
> [..]

This has been completed and we reproduced ~1700 extra packages when building with a GCC-6 with this patch, as well as a patched dpkg that sets the environment variable appropriately.

This is about 6.5% of ~26100 Debian source packages, and about 1/2 of the ones whose irreproducibility is due to build-path issues. Most of the rest are not related to GCC, such as things built by R, OCaml, Erlang, LLVM, PDF IDs, etc, etc.

https://tests.reproducible-builds.org/debian/unstable/index_suite_amd64_stats.html

The dip afterwards is due to reverting back to an unpatched GCC-6, but I'll be rebasing the patch continually over the next few weeks so the graph should stay up.

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git

* Re: [PATCH v2] Generate reproducible output independently of the build-path
  2017-04-11 11:35 [PATCH v2] Generate reproducible output independently of the build-path Ximin Luo
                   ` (3 preceding siblings ...)
  2017-04-18 14:57 ` [PATCH v2] Generate reproducible output independently of the build-path Ximin Luo
@ 2017-04-21 18:28 ` Joseph Myers
  2017-05-03 15:53   ` Ximin Luo
  4 siblings, 1 reply; 10+ messages in thread
From: Joseph Myers @ 2017-04-21 18:28 UTC (permalink / raw)
  To: Ximin Luo; +Cc: GCC Patches

On Tue, 11 Apr 2017, Ximin Luo wrote:

> Copyright disclaimer
> ====================
> 
> I dedicate these patches to the public domain by waiving all of my rights to
> the work worldwide under copyright law, including all related and neighboring
> rights, to the extent allowed by law.
> 
> See https://creativecommons.org/publicdomain/zero/1.0/legalcode for full text.
> 
> Please let me know if the above is insufficient and I will be happy to sign any
> relevant forms.

I believe the FSF wants its own disclaimer forms signed as evidence code 
is in the public domain.  The process for getting disclaimer forms is to 
complete 
https://git.savannah.gnu.org/cgit/gnulib.git/plain/doc/Copyright/request-disclaim.changes 
and then you should be sent a disclaimer form for disclaiming the 
particular set of changes you have completed (if you then make further 
significant changes afterwards, the disclaimer form would then need 
completing for them as well).

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [PATCH v2] Generate reproducible output independently of the build-path
  2017-04-21 18:28 ` Joseph Myers
@ 2017-05-03 15:53   ` Ximin Luo
  2017-06-07 16:15     ` Ximin Luo
  0 siblings, 1 reply; 10+ messages in thread
From: Ximin Luo @ 2017-05-03 15:53 UTC (permalink / raw)
  To: Joseph Myers; +Cc: GCC Patches

Joseph Myers:
> On Tue, 11 Apr 2017, Ximin Luo wrote:
> 
>> Copyright disclaimer
>> ====================
>>
>> I dedicate these patches to the public domain by waiving all of my rights to
>> the work worldwide under copyright law, including all related and neighboring
>> rights, to the extent allowed by law.
>>
>> See https://creativecommons.org/publicdomain/zero/1.0/legalcode for full text.
>>
>> Please let me know if the above is insufficient and I will be happy to sign any
>> relevant forms.
> 
> I believe the FSF wants its own disclaimer forms signed as evidence code 
> is in the public domain.  The process for getting disclaimer forms is to 
> complete 
> https://git.savannah.gnu.org/cgit/gnulib.git/plain/doc/Copyright/request-disclaim.changes 
> and then you should be sent a disclaimer form for disclaiming the 
> particular set of changes you have completed (if you then make further 
> significant changes afterwards, the disclaimer form would then need 
> completing for them as well).
> 

I've now done this, and the copyright clerk at the FSF has told me that this is complete on their side as well.

Did any of you get a chance to look at the patch yet?

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git

* Re: [PATCH v2] Generate reproducible output independently of the build-path
  2017-05-03 15:53   ` Ximin Luo
@ 2017-06-07 16:15     ` Ximin Luo
  2017-06-29 10:46       ` [Ping ^3][PATCH " Ximin Luo
  0 siblings, 1 reply; 10+ messages in thread
From: Ximin Luo @ 2017-06-07 16:15 UTC (permalink / raw)
  To: GCC Patches

Ximin Luo:
> Joseph Myers:
>> On Tue, 11 Apr 2017, Ximin Luo wrote:
>>
>>> Copyright disclaimer
>>> ====================
>>>
>>> I dedicate these patches to the public domain by waiving all of my rights to
>>> the work worldwide under copyright law, including all related and neighboring
>>> rights, to the extent allowed by law.
>>>
>>> See https://creativecommons.org/publicdomain/zero/1.0/legalcode for full text.
>>>
>>> Please let me know if the above is insufficient and I will be happy to sign any
>>> relevant forms.
>>
>> I believe the FSF wants its own disclaimer forms signed as evidence code 
>> is in the public domain.  The process for getting disclaimer forms is to 
>> complete 
>> https://git.savannah.gnu.org/cgit/gnulib.git/plain/doc/Copyright/request-disclaim.changes 
>> and then you should be sent a disclaimer form for disclaiming the 
>> particular set of changes you have completed (if you then make further 
>> significant changes afterwards, the disclaimer form would then need 
>> completing for them as well).
>>
> 
> I've now done this, and the copyright clerk at the FSF has told me that this is complete on their side as well.
> 
> Did any of you get a chance to look at the patch yet?
> 

Hi GCC patches list,

Any progress or feedback on this patch series?

Ximin

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git

* Re: [Ping ^3][PATCH v2] Generate reproducible output independently of the build-path
  2017-06-07 16:15     ` Ximin Luo
@ 2017-06-29 10:46       ` Ximin Luo
  0 siblings, 0 replies; 10+ messages in thread
From: Ximin Luo @ 2017-06-29 10:46 UTC (permalink / raw)
  To: GCC Patches
  Cc: Richard Earnshaw, Richard Biener, Richard Henderson,
	Jakub Jelinek, Richard Kenner, Jeff Law, Michael Meissner,
	Jason Merrill, David S. Miller, Joseph Myers, Bernd Schmidt,
	Ian Lance Taylor, Jim Wilson

Dear GCC Global Reviewers,

Could any of you please review my patch series? It's about being able to reproducibly build things, even when the build machines are executing the build under different paths.

Overview:
https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00513.html

Full thread, including individual patches:
https://gcc.gnu.org/ml/gcc-patches/2017-04/threads.html#00513

Follow-up report:
https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00781.html

In summary, this patch helps ~1800/26000 packages in Debian to become reproducible even when the build-path is varied across builds.

I've signed a copyright disclaimer and the FSF has this on record.

X

Ximin Luo:
> Ximin Luo:
>> Joseph Myers:
>>> On Tue, 11 Apr 2017, Ximin Luo wrote:
>>>
>>>> Copyright disclaimer
>>>> ====================
>>>>
>>>> I dedicate these patches to the public domain by waiving all of my rights to
>>>> the work worldwide under copyright law, including all related and neighboring
>>>> rights, to the extent allowed by law.
>>>>
>>>> See https://creativecommons.org/publicdomain/zero/1.0/legalcode for full text.
>>>>
>>>> Please let me know if the above is insufficient and I will be happy to sign any
>>>> relevant forms.
>>>
>>> I believe the FSF wants its own disclaimer forms signed as evidence code 
>>> is in the public domain.  The process for getting disclaimer forms is to 
>>> complete 
>>> https://git.savannah.gnu.org/cgit/gnulib.git/plain/doc/Copyright/request-disclaim.changes 
>>> and then you should be sent a disclaimer form for disclaiming the 
>>> particular set of changes you have completed (if you then make further 
>>> significant changes afterwards, the disclaimer form would then need 
>>> completing for them as well).
>>>
>>
>> I've now done this, and the copyright clerk at the FSF has told me that this is complete on their side as well.
>>
>> Did any of you get a chance to look at the patch yet?
>>
> 
> Hi GCC patches list,
> 
> Any progress or feedback on this patch series?
> 
> Ximin
> 

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git

* [PATCH 2/3] Use BUILD_PATH_PREFIX_MAP envvar to transform __FILE__
  2017-07-21 16:16 [PING^4][PATCH " Ximin Luo
@ 2017-07-21 16:16 ` Ximin Luo
  0 siblings, 0 replies; 10+ messages in thread
From: Ximin Luo @ 2017-07-21 16:16 UTC (permalink / raw)
  To: GCC Patches; +Cc: Ximin Luo

Use the BUILD_PATH_PREFIX_MAP environment variable when expanding the __FILE__
macro, in the same way that debug-prefix-map works for debugging symbol paths.

This patch follows similar lines to the earlier patch for SOURCE_DATE_EPOCH.
Specifically, we read the environment variable not in libcpp but via a hook
whose implementation is defined in gcc/c-family.  However, this is more complex
to achieve than in the earlier patch: we need to share the prefix_map data
structure and associated functions between libcpp and c-family, so we need to
move them to libiberty.  (For comparison, the SOURCE_DATE_EPOCH patch did not
need this because time_t et al. are in the standard C library.)

Acknowledgements
----------------

Dhole <dhole@openmailbox.org>, who wrote the earlier patch for
SOURCE_DATE_EPOCH, which saved me a lot of time figuring out what to edit.

ChangeLogs
----------

gcc/ChangeLog:

2017-07-21  Ximin Luo  <infinity0@pwned.gg>

	* doc/invoke.texi (Environment Variables): Document form and behaviour
	of BUILD_PATH_PREFIX_MAP.

gcc/c-family/ChangeLog:

2017-07-21  Ximin Luo  <infinity0@pwned.gg>

	* c-common.c (cb_get_build_path_prefix_map): Define new call target.
	* c-common.h (cb_get_build_path_prefix_map): Declare call target.
	* c-lex.c (init_c_lex): Set the get_build_path_prefix_map callback.

libcpp/ChangeLog:

2017-07-21  Ximin Luo  <infinity0@pwned.gg>

	* include/cpplib.h (cpp_callbacks): Add get_build_path_prefix_map
	callback.
	* init.c (cpp_create_reader): Initialise build_path_prefix_map field.
	* internal.h (cpp_reader): Add new field build_path_prefix_map.
	* macro.c (_cpp_builtin_macro_text): Set the build_path_prefix_map
	field if unset and apply it when expanding __FILE__ macros.

gcc/testsuite/ChangeLog:

2017-07-21  Ximin Luo  <infinity0@pwned.gg>

	* gcc.dg/cpp/build_path_prefix_map-1.c: New test.
	* gcc.dg/cpp/build_path_prefix_map-2.c: New test.

Index: gcc-8-20170716/gcc/doc/invoke.texi
===================================================================
--- gcc-8-20170716.orig/gcc/doc/invoke.texi
+++ gcc-8-20170716/gcc/doc/invoke.texi
@@ -27197,6 +27197,26 @@ Recognize EUCJP characters.
 If @env{LANG} is not defined, or if it has some other value, then the
 compiler uses @code{mblen} and @code{mbtowc} as defined by the default locale to
 recognize and translate multibyte characters.
+
+@item BUILD_PATH_PREFIX_MAP
+@findex BUILD_PATH_PREFIX_MAP
+If this variable is set, it specifies an ordered map used to transform
+filepaths output in debugging symbols and expansions of the @code{__FILE__}
+macro.  This may be used to achieve fully reproducible output.  In the context
+of running GCC within a higher-level build tool, it is typically more reliable
+than setting command line arguments such as @option{-fdebug-prefix-map} or
+common environment variables such as @env{CFLAGS}, since the build tool may
+save these latter values into other output outside of GCC's control.
+
+The value is of the form
+@samp{@var{dst@r{[0]}}=@var{src@r{[0]}}:@var{dst@r{[1]}}=@var{src@r{[1]}}@r{@dots{}}}.
+If any @var{dst@r{[}i@r{]}} or @var{src@r{[}i@r{]}} contains @code{%}, @code{=}
+or @code{:} characters, they must be replaced with @code{%#}, @code{%+}, and
+@code{%.} respectively.
+
+Whenever GCC emits a filepath that starts with a whole path component matching
+@var{src@r{[}i@r{]}} for some @var{i}, with rightmost @var{i} taking priority,
+the matching part is replaced with @var{dst@r{[}i@r{]}} in the final output.
 @end table
 
 @noindent
Index: gcc-8-20170716/gcc/c-family/c-common.c
===================================================================
--- gcc-8-20170716.orig/gcc/c-family/c-common.c
+++ gcc-8-20170716/gcc/c-family/c-common.c
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.
 
 #include "config.h"
 #include "system.h"
+#include "prefix-map.h"
 #include "coretypes.h"
 #include "target.h"
 #include "function.h"
@@ -7905,6 +7906,25 @@ cb_get_source_date_epoch (cpp_reader *pf
   return (time_t) epoch;
 }
 
+/* Read BUILD_PATH_PREFIX_MAP from environment to have deterministic relative
+   paths to replace embedded absolute paths to get reproducible results.
+   Returns NULL if BUILD_PATH_PREFIX_MAP is badly formed.  */
+
+prefix_map **
+cb_get_build_path_prefix_map (cpp_reader *pfile ATTRIBUTE_UNUSED)
+{
+  prefix_map **map = XCNEW (prefix_map *);
+
+  const char *arg = getenv ("BUILD_PATH_PREFIX_MAP");
+  if (!arg || prefix_map_parse (map, arg))
+    return map;
+
+  free (map);
+  error_at (input_location, "environment variable BUILD_PATH_PREFIX_MAP is "
+	    "not well formed; see the GCC documentation for more details.");
+  return NULL;
+}
+
 /* Callback for libcpp for offering spelling suggestions for misspelled
    directives.  GOAL is an unrecognized string; CANDIDATES is a
    NULL-terminated array of candidate strings.  Return the closest
Index: gcc-8-20170716/gcc/c-family/c-common.h
===================================================================
--- gcc-8-20170716.orig/gcc/c-family/c-common.h
+++ gcc-8-20170716/gcc/c-family/c-common.h
@@ -1084,6 +1084,11 @@ extern time_t cb_get_source_date_epoch (
    __TIME__ can store.  */
 #define MAX_SOURCE_DATE_EPOCH HOST_WIDE_INT_C (253402300799)
 
+/* Read BUILD_PATH_PREFIX_MAP from environment to have deterministic relative
+   paths to replace embedded absolute paths to get reproducible results.
+   Returns NULL if BUILD_PATH_PREFIX_MAP is badly formed.  */
+extern prefix_map **cb_get_build_path_prefix_map (cpp_reader *pfile);
+
 /* Callback for libcpp for offering spelling suggestions for misspelled
    directives.  */
 extern const char *cb_get_suggestion (cpp_reader *, const char *,
Index: gcc-8-20170716/gcc/c-family/c-lex.c
===================================================================
--- gcc-8-20170716.orig/gcc/c-family/c-lex.c
+++ gcc-8-20170716/gcc/c-family/c-lex.c
@@ -81,6 +81,7 @@ init_c_lex (void)
   cb->read_pch = c_common_read_pch;
   cb->has_attribute = c_common_has_attribute;
   cb->get_source_date_epoch = cb_get_source_date_epoch;
+  cb->get_build_path_prefix_map = cb_get_build_path_prefix_map;
   cb->get_suggestion = cb_get_suggestion;
 
   /* Set the debug callbacks if we can use them.  */
Index: gcc-8-20170716/libcpp/include/cpplib.h
===================================================================
--- gcc-8-20170716.orig/libcpp/include/cpplib.h
+++ gcc-8-20170716/libcpp/include/cpplib.h
@@ -607,6 +607,9 @@ struct cpp_callbacks
   /* Callback to parse SOURCE_DATE_EPOCH from environment.  */
   time_t (*get_source_date_epoch) (cpp_reader *);
 
+  /* Callback to parse BUILD_PATH_PREFIX_MAP from environment.  */
+  struct prefix_map **(*get_build_path_prefix_map) (cpp_reader *);
+
   /* Callback for providing suggestions for misspelled directives.  */
   const char *(*get_suggestion) (cpp_reader *, const char *, const char *const *);
 
Index: gcc-8-20170716/libcpp/init.c
===================================================================
--- gcc-8-20170716.orig/libcpp/init.c
+++ gcc-8-20170716/libcpp/init.c
@@ -261,6 +261,9 @@ cpp_create_reader (enum c_lang lang, cpp
   /* Initialize source_date_epoch to -2 (not yet set).  */
   pfile->source_date_epoch = (time_t) -2;
 
+  /* Initialize build_path_prefix_map to NULL (not yet set).  */
+  pfile->build_path_prefix_map = NULL;
+
   /* The expression parser stack.  */
   _cpp_expand_op_stack (pfile);
 
Index: gcc-8-20170716/libcpp/internal.h
===================================================================
--- gcc-8-20170716.orig/libcpp/internal.h
+++ gcc-8-20170716/libcpp/internal.h
@@ -507,6 +507,11 @@ struct cpp_reader
      set to -1 to disable it or to a non-negative value to enable it.  */
   time_t source_date_epoch;
 
+  /* Externally set prefix-map to transform absolute paths, useful for
+     reproducibility.  It should be initialized to NULL (not yet set or
+     disabled) or to a `struct prefix_map` double pointer to enable it.  */
+  struct prefix_map **build_path_prefix_map;
+
   /* EOF token, and a token forcing paste avoidance.  */
   cpp_token avoid_paste;
   cpp_token eof;
Index: gcc-8-20170716/libcpp/macro.c
===================================================================
--- gcc-8-20170716.orig/libcpp/macro.c
+++ gcc-8-20170716/libcpp/macro.c
@@ -26,6 +26,7 @@ along with this program; see the file CO
 #include "system.h"
 #include "cpplib.h"
 #include "internal.h"
+#include "prefix-map.h"
 
 typedef struct macro_arg macro_arg;
 /* This structure represents the tokens of a macro argument.  These
@@ -291,7 +292,17 @@ _cpp_builtin_macro_text (cpp_reader *pfi
 	unsigned int len;
 	const char *name;
 	uchar *buf;
+	prefix_map **map = pfile->build_path_prefix_map;
 	
+	/* Set a prefix-map for __FILE__ if BUILD_PATH_PREFIX_MAP is defined.  */
+	if (map == NULL && pfile->cb.get_build_path_prefix_map != NULL)
+	  {
+	    map = pfile->cb.get_build_path_prefix_map (pfile);
+	    if (map == NULL)
+	      abort ();
+	    pfile->build_path_prefix_map = map;
+	  }
+
 	if (node->value.builtin == BT_FILE)
 	  name = linemap_get_expansion_filename (pfile->line_table,
 						 pfile->line_table->highest_line);
@@ -301,6 +312,11 @@ _cpp_builtin_macro_text (cpp_reader *pfi
 	    if (!name)
 	      abort ();
 	  }
+
+	/* Apply the prefix-map for deterministic path output.  */
+	if (map != NULL)
+	  name = prefix_map_remap_alloca (*map, name);
+
 	len = strlen (name);
 	buf = _cpp_unaligned_alloc (pfile, len * 2 + 3);
 	result = buf;
Index: gcc-8-20170716/gcc/testsuite/gcc.dg/cpp/build_path_prefix_map-1.c
===================================================================
--- /dev/null
+++ gcc-8-20170716/gcc/testsuite/gcc.dg/cpp/build_path_prefix_map-1.c
@@ -0,0 +1,11 @@
+/* __FILE__ should strip BUILD_PATH_PREFIX_MAP if the latter is a prefix. */
+/* { dg-do run } */
+/* { dg-set-compiler-env-var BUILD_PATH_PREFIX_MAP "MACROTEST=$srcdir" } */
+
+int
+main ()
+{
+  if (__builtin_strcmp (__FILE__, "MACROTEST/gcc.dg/cpp/build_path_prefix_map-1.c") != 0)
+    __builtin_abort ();
+  return 0;
+}
Index: gcc-8-20170716/gcc/testsuite/gcc.dg/cpp/build_path_prefix_map-2.c
===================================================================
--- /dev/null
+++ gcc-8-20170716/gcc/testsuite/gcc.dg/cpp/build_path_prefix_map-2.c
@@ -0,0 +1,12 @@
+/* __FILE__ should not be relative if BUILD_PATH_PREFIX_MAP is not set, and gcc is
+   asked to compile an absolute filename as is the case with this test.  */
+/* { dg-do run } */
+/* { dg-set-compiler-env-var BUILD_PATH_PREFIX_MAP } */
+
+int
+main ()
+{
+  if (__builtin_strcmp (__FILE__, "./gcc.dg/cpp/build_path_prefix_map-2.c") == 0)
+    __builtin_abort ();
+  return 0;
+}
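
As a companion to the invoke.texi text in this patch, the escaping rule for a single `dst` or `src` element can be sketched as below. This follows only the documented encoding (`%#` for `%`, `%+` for `=`, `%.` for `:`); the function name `decode_element` is hypothetical and this is not the `prefix_map_parse` implementation from libiberty, which is not shown in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Decode one map element in place.  Returns 1 on success, or 0 if S
   contains a '%' that is not followed by '#', '+' or '.', in which case the
   contents of S are unspecified.  Splitting the variable on ':' and '=' is
   assumed to have been done by the caller.  */
static int
decode_element (char *s)
{
  char *out = s;

  assert (s != NULL);
  for (; *s != '\0'; s++)
    {
      if (*s != '%')
	{
	  *out++ = *s;
	  continue;
	}
      /* Look at the character after '%'; a trailing '%' hits the default
	 case because the next character is the terminating NUL.  */
      switch (*++s)
	{
	case '#': *out++ = '%'; break;
	case '+': *out++ = '='; break;
	case '.': *out++ = ':'; break;
	default: return 0;
	}
    }
  *out = '\0';
  return 1;
}
```

For example, the encoded element `/a%+b%.c%#d` decodes to `/a=b:c%d`, while `a%xb` and a trailing `%` are rejected as malformed.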

end of thread, other threads:[~2017-07-21 16:16 UTC | newest]

Thread overview: 10+ messages
2017-04-11 11:35 [PATCH v2] Generate reproducible output independently of the build-path Ximin Luo
2017-04-11 11:35 ` [PATCH 2/3] Use BUILD_PATH_PREFIX_MAP envvar to transform __FILE__ Ximin Luo
2017-04-11 11:35 ` [PATCH 1/3] Use BUILD_PATH_PREFIX_MAP envvar for debug-prefix-map Ximin Luo
2017-04-11 11:35 ` [PATCH 3/3] When remapping paths, only match whole path components Ximin Luo
2017-04-18 14:57 ` [PATCH v2] Generate reproducible output independently of the build-path Ximin Luo
2017-04-21 18:28 ` Joseph Myers
2017-05-03 15:53   ` Ximin Luo
2017-06-07 16:15     ` Ximin Luo
2017-06-29 10:46       ` [Ping ^3][PATCH " Ximin Luo
2017-07-21 16:16 [PING^4][PATCH " Ximin Luo
2017-07-21 16:16 ` [PATCH 2/3] Use BUILD_PATH_PREFIX_MAP envvar to transform __FILE__ Ximin Luo
