public inbox for systemtap@sourceware.org
* [PATCH] dtrace: Use deterministic temp file creation for all temp files
@ 2023-02-27 12:13 Gioele Barabucci
  2023-02-27 15:49 ` Florian Weimer
  0 siblings, 1 reply; 8+ messages in thread
From: Gioele Barabucci @ 2023-02-27 12:13 UTC
  To: systemtap

`dtrace -G -C` creates temporary files with random filenames. A name
derived from these random filenames gets embedded in the ELF `.symtab`
of the final object file, so every run produces a slightly different
object file.

This behavior makes all packages that use `dtrace`-produced object files
inherently non-reproducible.

To reproduce this issue:

```
$ git clone https://salsa.debian.org/sssd-team/sssd.git
$ cd sssd
$ mkdir -p build && cd build/

$ dtrace -C -G -s ../src/systemtap/sssd_probes.d -o stap_generated_probes.o
$ readelf --wide --symbols stap_generated_probes.o > sym1.txt

$ dtrace -C -G -s ../src/systemtap/sssd_probes.d -o stap_generated_probes.o
$ readelf --wide --symbols stap_generated_probes.o > sym2.txt

$ diff -u sym1.txt sym2.txt
--- sym1.txt    2023-02-27 08:38:48.955299234 +0100
+++ sym2.txt    2023-02-27 08:38:49.103303352 +0100
@@ -2,7 +2,7 @@
  Symbol table '.symtab' contains 59 entries:
  Num:    Value  Size Type    Bind   Vis      Ndx Name
    0: 0000000000   0 NOTYPE  LOCAL  DEFAULT  UND
-  1: 0000000000   0 FILE    LOCAL  DEFAULT  ABS .dtrace-temp.4f0bbdda.c
+  1: 0000000000   0 FILE    LOCAL  DEFAULT  ABS .dtrace-temp.d20e76c7.c
    2: 0000000000   0 SECTION LOCAL  DEFAULT    1 .text
    3: 0000000000   7 FUNC    LOCAL  DEFAULT    1 __dtrace
    4: 0000000000   0 SECTION LOCAL  DEFAULT    5 .debug_info
```

The root cause of this issue is that, although the name of the temporary
".c" file is created in a deterministic way (from the SHA256 of the source
and output file names), the name of the source file is overwritten with a
random name when the `-C` option (`use_cpp`) is used:

```
if s_filename != "" and use_cpp:
     (ignore, fname) = mkstemp(suffix=".d")
     cpp = os.environ.get("CPP", "cpp")
     retcode = call(split(cpp) + [...] + [s_filename, '-o', fname])
     if retcode != 0:
         print("\"cpp includes s_filename\" failed")
         usage()
         return 1
     s_filename = fname

[...]

sha = hashlib.sha256()
sha.update(s_filename.encode('utf-8'))
sha.update(filename.encode('utf-8'))
fname = ".dtrace-temp." + sha.hexdigest()[:8] + ".c"
```
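
The effect can be illustrated in isolation with a small sketch. It is
not code from `dtrace.in`: the helper names `derived_c_name` and
`random_intermediate` and the paths "probes.d" and "out" are
hypothetical, chosen only to mirror the excerpt above.

```python
import hashlib
import os
import tempfile


def derived_c_name(s_filename, filename):
    # Same derivation as the excerpt above: hash both names and keep
    # the first 8 hex digits of the digest.
    sha = hashlib.sha256()
    sha.update(s_filename.encode('utf-8'))
    sha.update(filename.encode('utf-8'))
    return ".dtrace-temp." + sha.hexdigest()[:8] + ".c"


def random_intermediate():
    # Stand-in for the mkstemp(suffix=".d") call made when -C is used.
    # Only the random path matters here, so the file is removed again.
    fd, path = tempfile.mkstemp(suffix=".d")
    os.close(fd)
    os.remove(path)
    return path


# Without -C: the input name is stable, so the derived name is stable.
stable_a = derived_c_name("probes.d", "out")
stable_b = derived_c_name("probes.d", "out")

# With -C: s_filename has already been replaced by a random path.
random_a = derived_c_name(random_intermediate(), "out")
random_b = derived_c_name(random_intermediate(), "out")

print(stable_a == stable_b)
print(random_a == random_b)
```

The first comparison always holds. The second one should almost never
hold, because the random intermediate name is fed into the hash.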

To fix this issue, all temporary files are now created using
the same deterministic procedure currently used only for the
temporary ".c" files.
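
As a rough, Python 3 only sketch of that procedure (without the retry
loop of the real code): the helper names `deterministic_name` and
`create_exclusive`, the scratch directory and the file name "probes.d"
are assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def deterministic_name(sources, suffix):
    # Hash every input string in order; identical inputs always give
    # the same file name.
    sha = hashlib.sha256()
    for source in sources:
        sha.update(source.encode("utf-8"))
    return ".dtrace-temp." + sha.hexdigest()[:8] + suffix


def create_exclusive(path):
    # Mode 'x' raises FileExistsError instead of overwriting, so two
    # concurrent jobs cannot silently share one temporary file.
    return open(path, mode="x")


workdir = tempfile.mkdtemp()  # scratch directory for this demo only
name = deterministic_name(["use_cpp", "probes.d"], ".d")
path = os.path.join(workdir, name)

with create_exclusive(path) as f:
    f.write("/* preprocessed probes */\n")

# A second attempt with the same inputs targets the same path and fails.
try:
    create_exclusive(path).close()
    collided = False
except FileExistsError:
    collided = True

os.remove(path)
os.rmdir(workdir)
print(name, collided)
```

Because the name is predictable, the exclusive open is what protects
against a leftover file or a concurrent identical job; a caller then has
to decide whether to wait and retry or to give up.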

Fixes: https://bugs.debian.org/1032055
Fixes: https://bugs.debian.org/1032056
Signed-off-by: Gioele Barabucci <gioele@svario.it>
---
  dtrace.in | 50 +++++++++++++++++++++++++++-----------------------
  1 file changed, 27 insertions(+), 23 deletions(-)

diff --git a/dtrace.in b/dtrace.in
index adad99bdb..22c1a9d03 100644
--- a/dtrace.in
+++ b/dtrace.in
@@ -27,7 +27,6 @@ import time
  import atexit
  from shlex import split
  from subprocess import call
-from tempfile import mkstemp
  try:
      from pyparsing import alphas, cStyleComment, delimitedList, Group, \
          Keyword, lineno, Literal, nestedExpr, nums, oneOf, OneOrMore, \
@@ -278,6 +277,28 @@ class _ReProvider(_HeaderCreator):
          hdr.close()


+def mktemp_determ(sources, suffix):
+    # for reproducible-builds purposes, use a predictable tmpfile path
+    sha = hashlib.sha256()
+    for source in sources:
+        sha.update(source.encode('utf-8'))
+    fname = ".dtrace-temp." + sha.hexdigest()[:8] + suffix
+    tries = 0
+    while True:
+        tries += 1
+        if tries > 100: # if file exists due to previous crash or whatever
+            raise Exception("cannot create temporary file \""+fname+"\"")
+        try:
+            wxmode = 'x' if sys.version_info > (3,0) else 'wx'
+            fdesc = open(fname, mode=wxmode)
+            break
+        except FileExistsError:
+            time.sleep(0.1) # vague estimate of elapsed time for concurrent identical gcc job
+            pass # Try again
+
+    return fdesc, fname
+
+
  def usage():
      print("Usage " + sys.argv[0] + " [--help] [-h | -G] [-C [-I<Path>]] -s File.d [-o <File>]")

@@ -360,7 +381,7 @@ def main():
          return 1

      if s_filename != "" and use_cpp:
-        (ignore, fname) = mkstemp(suffix=".d")
+        (ignore, fname) = mktemp_determ(["use_cpp", s_filename], suffix=".d")
          cpp = os.environ.get("CPP", "cpp")
          retcode = call(split(cpp) + includes + defines + [s_filename, '-o', fname])
          if retcode != 0:
@@ -399,7 +420,7 @@ def main():
              providers = _PypProvider()
          else:
              providers = _ReProvider()
-        (ignore, fname) = mkstemp(suffix=".h")
+        (fdesc, fname) = mktemp_determ(["build_source", s_filename], suffix=".h")
          while True:
              try:
                  providers.probe_write(s_filename, fname)
@@ -413,26 +434,9 @@ def main():
          else:
              print("header: " + fname)

-        # for reproducible-builds purposes, use a predictable tmpfile path
-        sha = hashlib.sha256()
-        sha.update(s_filename.encode('utf-8'))
-        sha.update(filename.encode('utf-8'))
-        fname = ".dtrace-temp." + sha.hexdigest()[:8] + ".c"
-        tries = 0
-        while True:
-            tries += 1
-            if tries > 100: # if file exists due to previous crash or whatever
-                print("cannot create temporary file \""+fname+"\"")
-                return 1
-            try:
-                wxmode = 'x' if sys.version_info > (3,0) else 'wx'
-                fdesc = open(fname, mode=wxmode)
-                if not keep_temps:
-                   atexit.register(os.remove, fname) # delete generated source at exit, even if error
-                break
-            except:
-                time.sleep(0.1) # vague estimate of elapsed time for concurrent identical gcc job
-                pass # Try again
+        (fdesc, fname) = mktemp_determ(["build_source", s_filename, filename], suffix=".c")
+        if not keep_temps:
+            atexit.register(os.remove, fname) # delete generated source at exit, even if error
          providers.semaphore_write(fdesc)
          fdesc.close()
          cc1 = os.environ.get("CC", "gcc")
-- 
2.39.2


end of thread, other threads:[~2023-02-28 10:12 UTC | newest]

Thread overview: 8+ messages
2023-02-27 12:13 [PATCH] dtrace: Use deterministic temp file creation for all temp files Gioele Barabucci
2023-02-27 15:49 ` Florian Weimer
2023-02-27 15:59   ` Gioele Barabucci
2023-02-27 16:47     ` Florian Weimer
2023-02-27 17:15       ` Gioele Barabucci
2023-02-27 17:34         ` Florian Weimer
2023-02-28  3:46           ` Gioele Barabucci
2023-02-28 10:12             ` Florian Weimer
