public inbox for systemtap@sourceware.org
* [PATCH] dtrace: Use hash-based scheme for predictable file generation
From: dann frazier @ 2022-02-28 19:42 UTC
  To: systemtap; +Cc: Bernhard M . Wiedemann, Frank Ch . Eigler

Commit c245153 ("dtrace: Allow for reproducible .o file builds.")
introduced a race between two dtrace processes that generate the same
file. Because both processes now use the same temporary file name, one
may delete the temporary .c file while the other is still processing it:

  --------------------------------------------------------------------
  user@host:~/foo$ make -j2
  dtrace -o foo.out -G -s /dev/null
  dtrace -o foo.out -G -s /dev/null
  Traceback (most recent call last):
    File "/usr/bin/dtrace", line 455, in <module>
      sys.exit(main())
    File "/usr/bin/dtrace", line 440, in main
      os.remove(fname)
  FileNotFoundError: [Errno 2] No such file or directory: 'foo.out.dtrace-temp.c'
  make: *** [Makefile:4: ../foo/foo.out] Error 1
  --------------------------------------------------------------------
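
For illustration only, here is a minimal sketch of the failure mode. It
replays the interleaving sequentially in a single process rather than
spawning real dtrace processes, and the helper name
simulate_shared_tempfile_race is hypothetical, not part of dtrace:

```python
import os
import tempfile


def simulate_shared_tempfile_race(workdir):
    """Replay two 'processes' sharing one temp file name, in sequence."""
    # Both processes derive the same name from the same output file.
    fname = os.path.join(workdir, "foo.out.dtrace-temp.c")

    # Process A and process B each write the temporary .c file.
    for _ in range(2):
        with open(fname, mode="w") as fdesc:
            fdesc.write("/* semaphore definitions would go here */\n")

    # Process A finishes first and cleans up.
    os.remove(fname)

    # Process B then tries to clean up the same, already deleted, file.
    try:
        os.remove(fname)
    except FileNotFoundError as err:
        return err
    return None


with tempfile.TemporaryDirectory() as workdir:
    error = simulate_shared_tempfile_race(workdir)
    print(type(error).__name__)
```

In the real case the ordering depends on scheduling, so the failure is
intermittent rather than guaranteed as it is in this replay.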

This can happen when a Makefile applies a pattern rule to two different
targets that resolve to the same file but are addressed by different
relative paths. I discovered this in a real-world case involving libvirt,
but here's a contrived reproducer:

  --------------------------------------------------------------------
  all: foo.out ../$(basename $(CURDIR))/foo.out

  %.out:
  	dtrace -o foo.out -G -s /dev/null

  clean:
  	rm -f foo.out
  --------------------------------------------------------------------

Ideally we would inject a null .file directive. We could then use a
mkstemp() file and keep the build reproducible by not recording the
source file path in the binary at all, but I can't find a
straightforward way of passing a .file through to the assembler. So,
instead, let's create a reproducible filename by building a hash of the
input and output paths. Note: this still leaves open a race when two
dtrace processes have identical input/output paths. But, at least in my
testing, GNU Make is smart enough to detect this case and not create
duplicate jobs.
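
As a rough sketch of the naming scheme, the snippet below mirrors the
hashing done in the patch. The wrapper function temp_c_name and the
example paths are hypothetical and only serve to show the behaviour:

```python
import hashlib


def temp_c_name(s_filename, filename):
    """Build a predictable temp file name from the input and output paths."""
    sha = hashlib.sha256()
    sha.update(s_filename.encode('utf-8'))  # input (-s) path
    sha.update(filename.encode('utf-8'))    # output (-o) path
    # Only the first 8 hex digits are kept, as in the patch.
    return ".dtrace-temp." + sha.hexdigest()[:8] + ".c"


# Same input/output paths always give the same name (reproducible).
first = temp_c_name("/dev/null", "foo.out")
again = temp_c_name("/dev/null", "foo.out")

# A different spelling of the output path gives a different name.
other = temp_c_name("/dev/null", "../foo/foo.out")

print(first == again, first != other)
```

Because the name depends only on the two path strings, repeated builds
see the same file name, while invocations that spell their paths
differently no longer share a temporary file. Invocations with identical
paths still map to the same name, which is the remaining race noted
above.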

https://sourceware.org/bugzilla/show_bug.cgi?id=28923

Fixes: c245153 ("dtrace: Allow for reproducible .o file builds.")
Signed-off-by: dann frazier <dann.frazier@canonical.com>
---
 dtrace.in | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/dtrace.in b/dtrace.in
index 7cfe19c60..2e4cc4566 100644
--- a/dtrace.in
+++ b/dtrace.in
@@ -20,6 +20,7 @@
 # pylint: disable=R0201
 # pylint: disable=R0904
 
+import hashlib
 import os
 import sys
 from shlex import split
@@ -410,12 +411,12 @@ def main():
         else:
             print("header: " + fname)
 
-        try: # for reproducible-builds purposes, prefer a fixed path name pattern
-            fname = filename + ".dtrace-temp.c"
-            fdesc = open(fname, mode='w')
-        except: # but that doesn't work for  -o /dev/null - see rhbz1504009
-            (ignore,fname) = mkstemp(suffix=".c")
-            fdesc = open(fname, mode='w')
+        # for reproducible-builds purposes, use a predictable tmpfile path
+        sha = hashlib.sha256()
+        sha.update(s_filename.encode('utf-8'))
+        sha.update(filename.encode('utf-8'))
+        fname = ".dtrace-temp." + sha.hexdigest()[:8] + ".c"
+        fdesc = open(fname, mode='w')
         providers.semaphore_write(fdesc)
         fdesc.close()
         cc1 = os.environ.get("CC", "gcc")
-- 
2.35.1

