public inbox for debugedit@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] find-debuginfo.sh: Exit with real exit status in parallel jobs
@ 2024-08-16 18:54 Keith Seitz
  2024-08-17 21:20 ` Mark Wielaard
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Seitz @ 2024-08-16 18:54 UTC (permalink / raw)
  To: debugedit

Currently, when the script is executed in parallel (-jN), the
resulting exit status will always be 0.

The script execs an appropriate number of clones of itself, calling
run_job to run the actual workload. This then calls do_file(), saving
the exit status into "res.$jobid".

In do_file(), though, if an error occurs, exit is called. This causes
the entire exec'd shell to exit with status 0 (since there are almost
always echo calls as the last executed statement). The real exit
status is therefor never written to the "res.$jobid" files by run_job().

The simple solution is to use 'return' instead of 'exit'. A number
of minor adjustments are also made to propagate this properly so that
it is reported as the correct exit status.

While at it, I've incorporated a patch for find-debuginfo/30505.
Using this patch and another patch to the RPM package (submitted as
github issue #3215), failures of gdb-add-index.sh will now properly fail
the build instead of being swallowed. It should be much easier for
developers to figure out why their builds have failed should gdb crash.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30505
Signed-off-by: Keith Seitz <keiths@redhat.com>
---
 scripts/find-debuginfo.in | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/scripts/find-debuginfo.in b/scripts/find-debuginfo.in
index ae0818f..f40e566 100755
--- a/scripts/find-debuginfo.in
+++ b/scripts/find-debuginfo.in
@@ -477,16 +477,20 @@ do_file()
 			      -l "$SOURCEFILE" "$f") || exit
   if [ -z "$id" ]; then
     echo >&2 "*** ${strict_error}: No build ID note found in $f"
-    $strict && exit 2
+    $strict && return 2
   fi
 
   # Add .gdb_index if requested.
   if $include_gdb_index; then
     if type gdb-add-index >/dev/null 2>&1; then
-      gdb-add-index "$f"
+      gdb-add-index "$f" || {
+        status=$?
+        echo >&2 "*** ERROR:: GDB exited with exit status $status during index generation"
+        return 2
+      }
     else
       echo >&2 "*** ERROR: GDB index requested, but no gdb-add-index installed"
-      exit 2
+      return 2
     fi
   fi
 
@@ -547,6 +551,7 @@ run_job()
 {
   local jobid=$1 filenum
   local SOURCEFILE=$temp/debugsources.$jobid ELFBINSFILE=$temp/elfbins.$jobid
+  local res=0
 
   >"$SOURCEFILE"
   >"$ELFBINSFILE"
@@ -558,8 +563,12 @@ run_job()
       break
     fi
     do_file $(sed -n "$(( 0x$filenum )) p" "$temp/primary")
+    res=$?
+    if [ $res != 0 ]; then
+	break
+    fi
   done
-  echo 0 >"$temp/res.$jobid"
+  echo $res >"$temp/res.$jobid"
 }
 
 n_files=$(wc -l <"$temp/primary")
@@ -587,7 +596,7 @@ else
     test -f "$f" || continue
     res=$(< "$f")
     if [ "$res" !=  "0" ]; then
-      exit 1
+      exit $res
     fi
   done
   # List of sources may have lots of duplicates. A kernel build was seen

base-commit: 630644cd7ce3ee068f5c105ad42adaa5a6f21287
-- 
2.45.2


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] find-debuginfo.sh: Exit with real exit status in parallel jobs
  2024-08-16 18:54 [PATCH] find-debuginfo.sh: Exit with real exit status in parallel jobs Keith Seitz
@ 2024-08-17 21:20 ` Mark Wielaard
  2024-08-19 19:51   ` Keith Seitz
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Wielaard @ 2024-08-17 21:20 UTC (permalink / raw)
  To: Keith Seitz; +Cc: debugedit

Hi Keith,

On Fri, Aug 16, 2024 at 11:54:20AM -0700, Keith Seitz wrote:
> Currently, when the script is executed in parallel (-jN), the
> resulting exit status will always be 0.
> 
> The script execs an appropriate number of clones of itself, calling
> run_job to run the actual workload. This then calls do_file(), saving
> the exit status into "res.$jobid".
> 
> In do_file(), though, if an error occurs, exit is called. This causes
> the entire exec'd shell to exit with status 0 (since there are almost
> always echo calls as the last executed statement). The real exit
> status is therefor never written to the "res.$jobid" files by run_job().
> 
> The simple solution is to use 'return' instead of 'exit'. A number
> of minor adjustments are also made to propagate this properly so that
> it is reported as the correct exit status.

Good find. Code looks correct. Except don't we now also need to check
the result of do_file () in the non-parallel case? Something like:

diff --git a/scripts/find-debuginfo.in b/scripts/find-debuginfo.in
index f40e566f86c3..5998b9d1fa6c 100755
--- a/scripts/find-debuginfo.in
+++ b/scripts/find-debuginfo.in
@@ -579,6 +579,10 @@ fi
 if [ $n_jobs -le 1 ]; then
   while read nlinks inum f; do
     do_file "$nlinks" "$inum" "$f"
+    res=$?
+    if [ "$res" != "0" ]; then
+      exit $res
+    fi
   done <"$temp/primary"
 else
   for ((i = 1; i <= n_files; i++)); do

> While at it, I've incorporated a patch for find-debuginfo/30505.
> Using this patch and another patch to the RPM package (submitted as
> github issue #3215), failures of gdb-add-index.sh will now properly fail
> the build instead of being swallowed. It should be much easier for
> developers to figure out why their builds have failed should gdb crash.

Thanks, that has been a long standing issue.  Smart to introduce a new
_find_debuginfo_exit_on_error for rpm-config. The reason errors
weren't reported in the past was because people would then disable
creating debuginfo packages completely. But these days we really
should report (and fix) any errors.

There are a couple of other places, the main debugedit invocation,
strip_to_debug, add_minidebug, which really should report errors
too. But that can wait.

Thanks,

Mark

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] find-debuginfo.sh: Exit with real exit status in parallel jobs
  2024-08-17 21:20 ` Mark Wielaard
@ 2024-08-19 19:51   ` Keith Seitz
  2024-08-20 14:51     ` Mark Wielaard
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Seitz @ 2024-08-19 19:51 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: debugedit

On 8/17/24 2:20 PM, Mark Wielaard wrote:
> On Fri, Aug 16, 2024 at 11:54:20AM -0700, Keith Seitz wrote:
>> Currently, when the script is executed in parallel (-jN), the
>> resulting exit status will always be 0.
>>
>> The script execs an appropriate number of clones of itself, calling
>> run_job to run the actual workload. This then calls do_file(), saving
>> the exit status into "res.$jobid".
>>
>> In do_file(), though, if an error occurs, exit is called. This causes
>> the entire exec'd shell to exit with status 0 (since there are almost
>> always echo calls as the last executed statement). The real exit
>> status is therefor never written to the "res.$jobid" files by run_job().
>>
>> The simple solution is to use 'return' instead of 'exit'. A number
>> of minor adjustments are also made to propagate this properly so that
>> it is reported as the correct exit status.
> 
> Good find. Code looks correct. Except don't we now also need to check
> the result of do_file () in the non-parallel case? Something like:
> 
> diff --git a/scripts/find-debuginfo.in b/scripts/find-debuginfo.in
> index f40e566f86c3..5998b9d1fa6c 100755
> --- a/scripts/find-debuginfo.in
> +++ b/scripts/find-debuginfo.in
> @@ -579,6 +579,10 @@ fi
>   if [ $n_jobs -le 1 ]; then
>     while read nlinks inum f; do
>       do_file "$nlinks" "$inum" "$f"
> +    res=$?
> +    if [ "$res" != "0" ]; then
> +      exit $res
> +    fi
>     done <"$temp/primary"
>   else
>     for ((i = 1; i <= n_files; i++)); do

Yes, that does look appropriate. My apologies for not following through
with the non-parallel invocation. I guess I could have (further) 
modified my custom rpm package to specifically test this case, too,
and submit a more complete patch.

>> While at it, I've incorporated a patch for find-debuginfo/30505.
>> Using this patch and another patch to the RPM package (submitted as
>> github issue #3215), failures of gdb-add-index.sh will now properly fail
>> the build instead of being swallowed. It should be much easier for
>> developers to figure out why their builds have failed should gdb crash.
> 
> Thanks, that has been a long standing issue.  Smart to introduce a new
> _find_debuginfo_exit_on_error for rpm-config. The reason errors
> weren't reported in the past was because people would then disable
> creating debuginfo packages completely. But these days we really
> should report (and fix) any errors.
> 
> There are a couple of other places, the main debugedit invocation,
> strip_to_debug, add_minidebug, which really should report errors
> too. But that can wait.

Indeed. I'm selfish, so I am attempting to improve testing for
areas for which I am directly responsible, but perhaps I can add
that to a TODO and give it a try with my mass prebuilder some time
to verify.

In any case, a step in the right direction. Thank you for the review!

Keith


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] find-debuginfo.sh: Exit with real exit status in parallel jobs
  2024-08-19 19:51   ` Keith Seitz
@ 2024-08-20 14:51     ` Mark Wielaard
  2024-08-20 20:26       ` Keith Seitz
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Wielaard @ 2024-08-20 14:51 UTC (permalink / raw)
  To: Keith Seitz; +Cc: debugedit

Hi Keith,

On Mon, 2024-08-19 at 12:51 -0700, Keith Seitz wrote:
> On 8/17/24 2:20 PM, Mark Wielaard wrote:
> > Good find. Code looks correct. Except don't we now also need to check
> > the result of do_file () in the non-parallel case? Something like:
> > 
> > diff --git a/scripts/find-debuginfo.in b/scripts/find-debuginfo.in
> > index f40e566f86c3..5998b9d1fa6c 100755
> > --- a/scripts/find-debuginfo.in
> > +++ b/scripts/find-debuginfo.in
> > @@ -579,6 +579,10 @@ fi
> >   if [ $n_jobs -le 1 ]; then
> >     while read nlinks inum f; do
> >       do_file "$nlinks" "$inum" "$f"
> > +    res=$?
> > +    if [ "$res" != "0" ]; then
> > +      exit $res
> > +    fi
> >     done <"$temp/primary"
> >   else
> >     for ((i = 1; i <= n_files; i++)); do
> 
> Yes, that does look appropriate. My apologies for not following through
> with the non-parallel invocation. I guess I could have (further) 
> modified my custom rpm package to specifically test this case, too,
> and submit a more complete patch.

I pushed you patch with the above change for the single-thread case.

It would be appreciated if you could incorporate it into your testing.
Sadly we don't have any script tests, so we do rely on external
testing.

> > > While at it, I've incorporated a patch for find-debuginfo/30505.
> > > Using this patch and another patch to the RPM package (submitted as
> > > github issue #3215), failures of gdb-add-index.sh will now properly fail
> > > the build instead of being swallowed. It should be much easier for
> > > developers to figure out why their builds have failed should gdb crash.
> > 
> > Thanks, that has been a long standing issue.  Smart to introduce a new
> > _find_debuginfo_exit_on_error for rpm-config. The reason errors
> > weren't reported in the past was because people would then disable
> > creating debuginfo packages completely. But these days we really
> > should report (and fix) any errors.
> > 
> > There are a couple of other places, the main debugedit invocation,
> > strip_to_debug, add_minidebug, which really should report errors
> > too. But that can wait.
> 
> Indeed. I'm selfish, so I am attempting to improve testing for
> areas for which I am directly responsible, but perhaps I can add
> that to a TODO and give it a try with my mass prebuilder some time
> to verify.

Totally understood. Lets fix them in stages, so they can be tested
separately.

Thanks,

Mark

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] find-debuginfo.sh: Exit with real exit status in parallel jobs
  2024-08-20 14:51     ` Mark Wielaard
@ 2024-08-20 20:26       ` Keith Seitz
  0 siblings, 0 replies; 5+ messages in thread
From: Keith Seitz @ 2024-08-20 20:26 UTC (permalink / raw)
  To: mark; +Cc: debugedit, keiths

Mark wrote:
> I pushed you patch with the above change for the single-thread case.
>
> It would be appreciated if you could incorporate it into your testing.
> Sadly we don't have any script tests, so we do rely on external
> testing.

No problemo. I've hacked my mass prebuilder setup to ignore the -j option
and re-tested the whole thing with a gdb NVR bump and one where I introduced
an abort() into gdb's index generation.

Eveyrthing behaves as I would expect.

Keith


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-08-20 20:26 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-08-16 18:54 [PATCH] find-debuginfo.sh: Exit with real exit status in parallel jobs Keith Seitz
2024-08-17 21:20 ` Mark Wielaard
2024-08-19 19:51   ` Keith Seitz
2024-08-20 14:51     ` Mark Wielaard
2024-08-20 20:26       ` Keith Seitz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).