public inbox for dwz@sourceware.org
* [PATCH] Clean up temporary file in hardlink mode
@ 2021-03-02 10:10 Tom de Vries
  2021-03-03 22:57 ` Mark Wielaard
  0 siblings, 1 reply; 5+ messages in thread
From: Tom de Vries @ 2021-03-02 10:10 UTC
  To: dwz, jakub, mark

Hi,

Consider an executable file with hardlinks a.out and b.out.

When running dwz once:
...
$ dwz -h a.out b.out
...
a.out and b.out are updated, and remain hardlinks to the same file.

But when running dwz once more, a.out and b.out remain unchanged, and a
temporary file b.out.#dwz#.XXXXXX is left.

This is caused by the fact that the code in function dwz that is intended to
handle unchanged hardlinks is never triggered.  It is guarded by a
"resa[n].res == 1" condition, but res->res is set to 0 at the end of function
dwz, irrespective of whether the file changed or not.

Fix this by only setting res->res to 0 if the file changed.

This makes test-case twice-multifile.sh fail; we'll deal with that in a
separate patch.

Any comments?

Thanks,
- Tom

Clean up temporary file in hardlink mode

2021-03-02  Tom de Vries  <tdevries@suse.de>

	PR dwz/24275
	* dwz.c (dwz): Only set res->res to 0 if the file changed.
	* testsuite/dwz.tests/twice-hardlink.sh: Remove PR24275 workaround.

---
 dwz.c                                 | 11 ++++++++---
 testsuite/dwz.tests/twice-hardlink.sh |  5 -----
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/dwz.c b/dwz.c
index 1d7d815..96f292d 100644
--- a/dwz.c
+++ b/dwz.c
@@ -15255,6 +15255,10 @@ remove_empty_pus (void)
 /* Helper structure for hardlink discovery.  */
 struct file_result
 {
+  /* -2: Already processed under different name.
+     -1: Ignore.
+      0: Processed, changed.
+      1: Processed, unchanged.  */
   int res;
   dev_t dev;
   ino_t ino;
@@ -15315,7 +15319,7 @@ dwz (const char *file, const char *outfile, struct file_result *res,
 			 file);
 	      close (fd);
 	      res->res = -2;
-	      return 1;
+	      return 0;
 	    }
 	  /* If it changed, try to hardlink it again.  */
 	  if (resa[n].res == 0)
@@ -15572,6 +15576,9 @@ dwz (const char *file, const char *outfile, struct file_result *res,
 
 	  if (write_dso (dso, outfile, &st, save_to_temp))
 	    ret = 1;
+	  else
+	    res->res = 0;
+
 	  if (unlikely (progress_p))
 	    report_progress ();
 	}
@@ -15595,8 +15602,6 @@ dwz (const char *file, const char *outfile, struct file_result *res,
   close (fd);
 
   free (dso);
-  if (ret == 0)
-    res->res = 0;
   if (ret == 3)
     {
       ret = (outfile != NULL) ? 1 : 0;
diff --git a/testsuite/dwz.tests/twice-hardlink.sh b/testsuite/dwz.tests/twice-hardlink.sh
index 6ce5ee1..6bc0794 100644
--- a/testsuite/dwz.tests/twice-hardlink.sh
+++ b/testsuite/dwz.tests/twice-hardlink.sh
@@ -29,8 +29,3 @@ fi
 cmp 1 1.saved
 
 rm -f 1 1.saved 2 2.saved dwz.err
-
-if [ -f 2.#dwz#.* ]; then
-    echo "PR24275 workaround used" > dwz.info
-    rm -f 2.#dwz#.*
-fi


* Re: [PATCH] Clean up temporary file in hardlink mode
  2021-03-02 10:10 [PATCH] Clean up temporary file in hardlink mode Tom de Vries
@ 2021-03-03 22:57 ` Mark Wielaard
  2021-03-04  8:21   ` Tom de Vries
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Wielaard @ 2021-03-03 22:57 UTC
  To: Tom de Vries; +Cc: dwz, jakub

Hi Tom,

On Tue, Mar 02, 2021 at 11:10:27AM +0100, Tom de Vries wrote:
> Consider an executable file with hardlinks a.out and b.out.
> 
> When running dwz once:
> ...
> $ dwz -h a.out b.out
> ...
> a.out and b.out are updated, and remain hardlinks to the same file.
> 
> But when running dwz once more, a.out and b.out remain unchanged, and a
> temporary file b.out.#dwz#.XXXXXX is left.
> 
> This is caused by the fact that the code in function dwz that is intended to
> handle unchanged hardlinks is never triggered.  It is guarded by a
> "resa[n].res == 1" condition, but res->res is set to 0 at the end of function
> dwz, irrespective of whether the file changed or not.
> 
> Fix this by only setting res->res to 0 if the file changed.

This looks correct. This works because at the start of the dwz
function res->res is set to -1 (ignore) or 1 (unchanged) and then only
set to -2 for the hardlink case or 0 if write_dso succeeds.

> This makes test-case twice-multifile.sh fail; we'll deal with that in a
> separate patch.

I haven't looked at that patch yet, but if at all possible I would like
to get them in together, or otherwise make the testcase skip. Having
failing tests in between commits makes things like bisecting a bit
messy (and will trigger warnings from the buildbots).

Thanks,

Mark


* Re: [PATCH] Clean up temporary file in hardlink mode
  2021-03-03 22:57 ` Mark Wielaard
@ 2021-03-04  8:21   ` Tom de Vries
  0 siblings, 0 replies; 5+ messages in thread
From: Tom de Vries @ 2021-03-04  8:21 UTC
  To: Mark Wielaard; +Cc: dwz, jakub

On 3/3/21 11:57 PM, Mark Wielaard wrote:
> Hi Tom,
> 
> On Tue, Mar 02, 2021 at 11:10:27AM +0100, Tom de Vries wrote:
>> Consider an executable file with hardlinks a.out and b.out.
>>
>> When running dwz once:
>> ...
>> $ dwz -h a.out b.out
>> ...
>> a.out and b.out are updated, and remain hardlinks to the same file.
>>
>> But when running dwz once more, a.out and b.out remain unchanged, and a
>> temporary file b.out.#dwz#.XXXXXX is left.
>>
>> This is caused by the fact that the code in function dwz that is intended to
>> handle unchanged hardlinks is never triggered.  It is guarded by a
>> "resa[n].res == 1" condition, but res->res is set to 0 at the end of function
>> dwz, irrespective of whether the file changed or not.
>>
>> Fix this by only setting res->res to 0 if the file changed.
> 
> This looks correct. This works because at the start of the dwz
> function res->res is set to -1 (ignore) or 1 (unchanged) and then only
> set to -2 for the hardlink case or 0 if write_dso succeeds.
> 
>> This makes test-case twice-multifile.sh fail; we'll deal with that in a
>> separate patch.
> 
> I haven't looked at that patch yet, but if at all possible I would like
> to get them in together, or otherwise make the testcase skip. Having
> failing tests in between commits makes things like bisecting a bit
> messy (and will trigger warnings from the buildbots).

Committed them together.

Thanks,
- Tom


* [PATCH] Clean up temporary file in hardlink mode
@ 2019-01-01  0:00 Tom de Vries
  2019-01-01  0:00 ` Tom de Vries
  0 siblings, 1 reply; 5+ messages in thread
From: Tom de Vries @ 2019-01-01  0:00 UTC
  To: dwz, jakub

Hi,

Consider an executable file with hardlinks a.out and b.out.

When running dwz once:
...
$ dwz -h a.out b.out
...
a.out and b.out are updated, and remain hardlinks to the same file.

But when running dwz once more, a.out and b.out remain unchanged, and a
temporary file b.out.#dwz#.XXXXXX is left.

This is caused by the fact that the code in function dwz intended to handle
unchanged hardlinks is never triggered.  It is guarded by a "resa[n].res == 1"
condition, but res->res is set to 0 at the end of function dwz, irrespective
of whether the file changed or not.

Fix this by only setting res->res to 0 if the file changed.

OK for trunk?

[ Applies on top of trunk + "[PATCH] Don't process low-mem files in multifile
mode". ]

Thanks,
- Tom

Clean up temporary file in hardlink mode

2019-03-02  Tom de Vries  <tdevries@suse.de>

	PR dwz/24275
	* dwz.c (struct file_result): Document res field.
	(dwz): Only set res->res to 0 if changed.
	* testsuite/dwz.tests/hardlink-no-change.sh: New test.

---
 dwz.c                                     |  9 +++++++--
 testsuite/dwz.tests/hardlink-no-change.sh | 23 +++++++++++++++++++++++
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/dwz.c b/dwz.c
index ffa8e08..3673bbf 100644
--- a/dwz.c
+++ b/dwz.c
@@ -10950,6 +10950,11 @@ remove_empty_pus (void)
 /* Helper structure for hardlink discovery.  */
 struct file_result
 {
+  /* -2: Already processed under different name.
+     -1: Ignore.
+      0: Processed, changed.
+      1: Processed, unchanged.
+  */
   int res;
   dev_t dev;
   ino_t ino;
@@ -11206,6 +11211,8 @@ dwz (const char *file, const char *outfile, struct file_result *res,
 
 	  if (write_dso (dso, outfile, &st))
 	    ret = 1;
+	  else
+	    res->res = 0;
 	}
     }
 
@@ -11227,8 +11234,6 @@ dwz (const char *file, const char *outfile, struct file_result *res,
   close (fd);
 
   free (dso);
-  if (ret == 0)
-    res->res = 0;
   return ret;
 }
 
diff --git a/testsuite/dwz.tests/hardlink-no-change.sh b/testsuite/dwz.tests/hardlink-no-change.sh
new file mode 100755
index 0000000..945d220
--- /dev/null
+++ b/testsuite/dwz.tests/hardlink-no-change.sh
@@ -0,0 +1,23 @@
+#!/bin/sh
+
+set -e
+
+cp ../hello 1
+ln 1 2
+
+dwz -h 1 2
+
+dwz -h 1 2 2>/dev/null
+
+smaller-than.sh 1 ../hello
+smaller-than.sh 2 ../hello
+
+hl="$(find -samefile 1)"
+hl="$(echo $hl)"
+[ "$hl" = "./1 ./2" ]
+
+ls=$(ls)
+ls=$(echo $ls)
+[ "$ls" = "1 2" ]
+
+rm -f 1 2


* Re: [PATCH] Clean up temporary file in hardlink mode
  2019-01-01  0:00 Tom de Vries
@ 2019-01-01  0:00 ` Tom de Vries
  0 siblings, 0 replies; 5+ messages in thread
From: Tom de Vries @ 2019-01-01  0:00 UTC
  To: dwz, jakub

On 02-03-19 15:48, Tom de Vries wrote:
> Hi,
> 
> Consider an executable file with hardlinks a.out and b.out.
> 
> When running dwz once:
> ...
> $ dwz -h a.out b.out
> ...
> a.out and b.out are updated, and remain hardlinks to the same file.
> 
> But when running dwz once more, a.out and b.out remain unchanged, and a
> temporary file b.out.#dwz#.XXXXXX is left.
> 
> This is caused by the fact that the code in function dwz intended to handle
> unchanged hardlinks is never triggered.  It is guarded by a "resa[n].res == 1"
> condition, but res->res is set to 0 at the end of function dwz, irrespective
> of whether the file changed or not.
> 
> Fix this by only setting res->res to 0 if the file changed.
> 

To put it in terms of the --trace switch I just posted, the effect of
the patch is:
...
 $ dwz -t -h a.out b.out
 Compressing a.out
 Updating hardlink b.out to changed file
 $ dwz -t -h a.out b.out
 Compressing a.out
 dwz: a.out: compression not beneficial - old size 3444 new size 3444
-Updating hardlink b.out to changed file
+Skipping hardlink b.out to unchanged file
...

Thanks,
- Tom

> OK for trunk?
> 
> [ Applies on top of trunk + "[PATCH] Don't process low-mem files in multifile
> mode". ]
> 
> Thanks,
> - Tom
> 
> Clean up temporary file in hardlink mode
> 
> 2019-03-02  Tom de Vries  <tdevries@suse.de>
> 
> 	PR dwz/24275
> 	* dwz.c (struct file_result): Document res field.
> 	(dwz): Only set res->res to 0 if changed.
> 	* testsuite/dwz.tests/hardlink-no-change.sh: New test.
> 
> ---
>  dwz.c                                     |  9 +++++++--
>  testsuite/dwz.tests/hardlink-no-change.sh | 23 +++++++++++++++++++++++
>  2 files changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/dwz.c b/dwz.c
> index ffa8e08..3673bbf 100644
> --- a/dwz.c
> +++ b/dwz.c
> @@ -10950,6 +10950,11 @@ remove_empty_pus (void)
>  /* Helper structure for hardlink discovery.  */
>  struct file_result
>  {
> +  /* -2: Already processed under different name.
> +     -1: Ignore.
> +      0: Processed, changed.
> +      1: Processed, unchanged.
> +  */
>    int res;
>    dev_t dev;
>    ino_t ino;
> @@ -11206,6 +11211,8 @@ dwz (const char *file, const char *outfile, struct file_result *res,
>  
>  	  if (write_dso (dso, outfile, &st))
>  	    ret = 1;
> +	  else
> +	    res->res = 0;
>  	}
>      }
>  
> @@ -11227,8 +11234,6 @@ dwz (const char *file, const char *outfile, struct file_result *res,
>    close (fd);
>  
>    free (dso);
> -  if (ret == 0)
> -    res->res = 0;
>    return ret;
>  }
>  
> diff --git a/testsuite/dwz.tests/hardlink-no-change.sh b/testsuite/dwz.tests/hardlink-no-change.sh
> new file mode 100755
> index 0000000..945d220
> --- /dev/null
> +++ b/testsuite/dwz.tests/hardlink-no-change.sh
> @@ -0,0 +1,23 @@
> +#!/bin/sh
> +
> +set -e
> +
> +cp ../hello 1
> +ln 1 2
> +
> +dwz -h 1 2
> +
> +dwz -h 1 2 2>/dev/null
> +
> +smaller-than.sh 1 ../hello
> +smaller-than.sh 2 ../hello
> +
> +hl="$(find -samefile 1)"
> +hl="$(echo $hl)"
> +[ "$hl" = "./1 ./2" ]
> +
> +ls=$(ls)
> +ls=$(echo $ls)
> +[ "$ls" = "1 2" ]
> +
> +rm -f 1 2
> 

