Re: [Patch 1/2] PowerPC, Fix-test-gdb.base-store.exp

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

From: Carl Love <cel@linux.ibm.com>
To: Andrew Burgess <aburgess@redhat.com>,
	Carl Love <cel@linux.ibm.com>,
	gdb-patches@sourceware.org,
	UlrichWeigand <Ulrich.Weigand@de.ibm.com>
Subject: Re: [Patch 1/2]  PowerPC, Fix-test-gdb.base-store.exp
Date: Tue, 24 Oct 2023 09:05:17 -0700	[thread overview]
Message-ID: <959a0bed35329b8327e16eb3dddd31c27361e29a.camel@linux.ibm.com> (raw)
In-Reply-To: <87y1fs2yld.fsf@redhat.com>

Andrew:

On Tue, 2023-10-24 at 09:50 +0100, Andrew Burgess wrote:
> Carl Love <cel@us.ibm.com> writes:
> 
> > Andrew:
> > 
> > Thanks for the review and comments.  See responses below.
> 
> Carl,
> 
> Sorry for my slow reply.
> 
> > On Mon, 2023-10-16 at 15:31 +0100, Andrew Burgess wrote:
> > > Carl Love <cel@us.ibm.com> writes:
> > > 
> > > > GDB maintainers:
> > > > 
> > > > This is the first patch in the series which fixes the DWWARF
> > > > register
> > > > mapping and signal handling issues on PowerPC.
> > > > 
> > > >                   Carl
> > > > 
> > > > -----------------------------------------------
> > > > 
> > > > rs6000, Fix Linux DWARF register mapping
> > > > 
> > > > The PowerPC DWARF register mapping is the same for the
> > > > .eh_frame
> > > > and
> > > > .debug_frame on Linux.  PowerPC uses different mapping for
> > > > .eh_frame and
> > > > .debug_frame on other operating systems.  The current GDB
> > > > support
> > > > for
> > > > mapping the DWARF registers in
> > > > rs6000_linux_dwarf2_reg_to_regnum
> > > > and
> > > > rs6000_adjust_frame_regnum file gdb/rs6000-tdep.c is not
> > > > correct
> > > > for Linux.
> > > > The files have some legacy mappings for spe_acc, spefscr, EV
> > > > which
> > > > was
> > > > removed from GCC in 2017.
> > > > 
> > > > This patch adds a two new functions
> > > > rs6000_linux_dwarf2_reg_to_regnum,
> > > > and rs6000_linux_adjust_frame_regnum in file gdb/ppc-linux-
> > > > tdep.c
> > > > to handle
> > > > the DWARF register mappings on Linux.  Function
> > > > rs6000_linux_dwarf2_reg_to_regnum is installed for both
> > > > gdb_dwarf_to_regnum
> > > > and gdbarch_stab_reg_to_regnum since the mappings are the same.
> > > > 
> > > > The ppc_linux_init_abi function in gdb/ppc-linux-tdep.c is
> > > > updated
> > > > to
> > > > call set_gdbarch_dwarf2_reg_to_regnum map the new function
> > > > rs6000_linux_dwarf2_reg_to_regnum for the
> > > > architecture.  Similarly,
> > > > dwarf2_frame_set_adjust_regnum is called to map
> > > > rs6000_linux_adjust_frame_regnum into the architecture.
> > > > 
> > > > The second issue fixed by this patch is the check for
> > > > subroutine
> > > > process_event_stop_test.  Need to make sure the frame is not
> > > > the
> > > > SIGTRAMP_FRAME.  The sequence of events on most platforms is:
> > > 
> > > Usually for GDB we avoid bundling unrelated changes into a single
> > > commit.  Each commit should address one self contained issue (as
> > > far
> > > as
> > > possible).
> > > 
> > > I really struggling to see any connection between the two fixes
> > > you
> > > have
> > > here.
> > 
> > As mentioned in the commit log, the first patch which fixes the
> > DWARF
> > register mapping fixes two of the 5 failures seen in store.exp. 
> > Specifically the patch fixes the issue of GDB not stopping in the
> > handler which results in the step.exp test failing.  The second
> > patch
> > then fixes the rest of the issues with the step.exp test.
> 
> I'm now even more confused.
> 
> The commit message for the first patch suggests that it contains two
> separate fixes, but the commit message, even your updated commit
> message
> for 'ver2' only mentions store.exp.
> 
> To be more specific, in patch 1/2, the changes in ppc-linux-tdep.c
> seem
> unrelated to the changes in infrun.c.  And I'm confused about how the
> changes in infrun.c can fix anything in the store.exp test script --
> the
> comments relating to that part of the patch specifically talk about
> 'nexti' and signal handlers, but I don't see either of these being
> used
> in store.exp (we do use 'next' though, so maybe that's what you
> mean?)
> But I'm not clear if a signal is invoked in that test at all.
> 
> In the text above you do mention step.exp, so maybe that's what the
> infrun.c changes are fixing?  I'll grab a PPC machine and have a
> play,
> but your 'ver 2' commit message has no mention of step.exp, so if
> that
> test is important you might need to update the text.
> 

The 128-bit floating point issues in store.exp will need a Power10
system which supports the 128-bit floating point types.  A Power 9
system is not sufficient.  

OK, I think I see what is missing in the explanation. So initially, I
fixed the DWARF register mapping for the store.exp test.  This fixed a
couple of the issues with the store.exp test.  Then in the second patch
we fixed the DWARF register mapping for the 128-bit floating point
values.  Specifically, DWARF used the same register numbers for 64-bit
and 128-bit floating point values.  However, the 128-bit floating point
values are stored in the VSR register file while the 64-bit values are
stored in the floating point registers.  The function
ieee_128_float_regnum_adjust takes care of that fix.  So at this point,
the store.exp test works.  

But now I have a regression failure on gdb.base/sigstep.exp.  This test
is specifically dealing with signal handlers.  Fixing the register
mapping, in patch 1, exposed an underlying bug in the signal handler
code.  The fixes for the nexti, stepi in the signal handling fixes the
regression error.  The signal handling fix is needed as part of the
DWARF register mapping fix so we don't have any regression failures and
so these fixes got rolled into the first patch but it was not really
explained in the original patch description.  So yes, there are really
two things getting fixed in the first patch. But you don't see the
signal handler issue until after the register mapping is fixed for
Linux.

The signal handler issues should be exposed on Power 9 if we split out
the DWARF register mapping from the signal handler fixes.  I didn't
play with these individually on Power 9 as part of the patch work.  All
of the patch work was done on Power 10.

So maybe it will be best if I redo the two patch sequence into three
patches.  Patch 1, fix the mapping; Patch 2 fix the signal handling
issue exposed once the register mapping is fixed; Patch 3 fix the
handling of the 128-bit float DWARF register mapping.  This should then
make it clearer what is going on in each patch.  Thoughts?

                      Carl 

> Usually in GDB when we talk about each patch containing a single fix
> we're talking about changes to GDB functionality, not fixing a test
> script.  So it's not: 'Patch 1/1 fixes all of store.exp' but rather,
> 'Patch 1/2 fixes PPC DWARF register mapping' and 'Patch 2/2 fixes
> nexti
> blah blah signal frames'.
> 
> > > >   1) Some code is running when a signal arrives.
> > > >   2) The kernel handles the signal and dispatches to the
> > > > handler.
> > > >     ...
> > > > 
> > > > However on PowerPC the sequence of events is:
> > > > 
> > > >   1) Some code is running when a signal arrives.
> > > >   2) The kernel handles the signal and dispatches to the
> > > > trampoline.
> > > >   3) The trampoline performs a normal function call to the
> > > > handler.
> > > >       ...
> > > > 
> > > > We want "nexti" to step into, not over, signal handlers invoked
> > > > by the kernel.  This is the case most platforms as the kernel
> > > > puts
> > > > a
> > > > signal trampoline frame onto the stack to handle proper return
> > > > after the
> > > > handler.  However, on some platforms such as PowerPC, the
> > > > kernel
> > > > actually
> > > > uses a trampoline to handle *invocation* of the handler.
> > > > 
> > > > The issue is fixed in function process_event_stop_test by
> > > > adding a
> > > > check
> > > > that the frame is not a SIGTRAMP_FRAME to the if statement to
> > > > stop
> > > > in
> > > > a subroutine call.  This prevents GDB from erroneously
> > > > detecting
> > > > the
> > > > trampoline invocation as a subroutine call.
> > > > 
> > > > This patch fixes two regression test failures in
> > > > gdb.base/store.exp.  It
> > > > also fixes two regression failures in gdb.python/py-thread-
> > > > exited.exp.
> > > 
> > > On the one random PPC box I tried this patch on, I'm not seeing
> > > any
> > > failures in gdb.python/py-thread-exited.exp either before, or
> > > after
> > > this
> > > commit.
> > > 
> > > Which tests in gdb.python/py-thread-exited.exp are you seeing as
> > > broken?
> > > And which of the two fixes in this commit fix the problems you're
> > > seeing?
> > 
> > The comment about the py-thread-exited.exp should have been
> > removed.  I
> > found that sometimes that test fails and sometime passes.  I have
> > yet
> > to dig into it and figure out why the test is inconsistent.  The
> > inconsistent behavior exists with and without these patches.
> > 
> > I will remove the comment about the gdb.python/py-thread-exited.exp
> > fixes as that is erroneous.
> 
> Thanks.
> 
> > 
> > > > Patch has been tested on Power 8 LE/BE, Power 9 LE/BE, Power 10
> > > > with no
> > > > new regressions.
> > > > ---
> > > >  gdb/infrun.c         | 13 ++++++++++
> > > >  gdb/ppc-linux-tdep.c | 56
> > > > ++++++++++++++++++++++++++++++++++++++++++++
> > > >  2 files changed, 69 insertions(+)
> > > > 
> > > > diff --git a/gdb/infrun.c b/gdb/infrun.c
> > > > index 4730d290442..922d8a6913d 100644
> > > > --- a/gdb/infrun.c
> > > > +++ b/gdb/infrun.c
> > > > @@ -7334,8 +7334,21 @@ process_event_stop_test (struct
> > > > execution_control_state *ecs)
> > > >       initial outermost frame, before sp was valid, would
> > > >       have code_addr == &_start.  See the comment in
> > > > frame_id::operator==
> > > >       for more.  */
> > > > +
> > > > +  /* We want "nexti" to step into, not over, signal handlers
> > > > invoked
> > > > +     by the kernel, therefore this subroutine check should not
> > > > trigger
> > > > +     for a signal handler invocation.  On most platforms, this
> > > > is
> > > > already
> > > > +     not the case, as the kernel puts a signal trampoline
> > > > frame
> > > > onto the
> > > > +     stack to handle proper return after the handler, and
> > > > therefore at this
> > > > +     point, the current frame is a grandchild of the step
> > > > frame,
> > > > not a
> > > > +     child.  However, on some platforms, the kernel actually
> > > > uses
> > > > a
> > > > +     trampoline to handle *invocation* of the handler.  In
> > > > that
> > > > case,
> > > > +     when executing the first instruction of the trampoline,
> > > > this
> > > > check
> > > > +     would erroneously detect the trampoline invocation as a
> > > > subroutine
> > > > +     call.  Fix this by checking for SIGTRAMP_FRAME.  */
> > > >    if ((get_stack_frame_id (frame)
> > > >         != ecs->event_thread->control.step_stack_frame_id)
> > > > +      && get_frame_type (frame) != SIGTRAMP_FRAME
> > > >        && ((frame_unwind_caller_id (get_current_frame ())
> > > >  	   == ecs->event_thread->control.step_stack_frame_id)
> > > >  	  && ((ecs->event_thread->control.step_stack_frame_id
> > > > diff --git a/gdb/ppc-linux-tdep.c b/gdb/ppc-linux-tdep.c
> > > > index 784dafa59db..7fb90799dff 100644
> > > > --- a/gdb/ppc-linux-tdep.c
> > > > +++ b/gdb/ppc-linux-tdep.c
> > > > @@ -83,6 +83,7 @@
> > > >  #include "features/rs6000/powerpc-isa207-vsx64l.c"
> > > >  #include "features/rs6000/powerpc-isa207-htm-vsx64l.c"
> > > >  #include "features/rs6000/powerpc-e500l.c"
> > > > +#include "dwarf2/frame.h"
> > > >  
> > > >  /* Shared library operations for PowerPC-Linux.  */
> > > >  static struct target_so_ops powerpc_so_ops;
> > > > @@ -2088,6 +2089,52 @@
> > > > ppc_linux_displaced_step_prepare  (gdbarch
> > > > *arch, thread_info *thread,
> > > >    return per_inferior->disp_step_buf->prepare (thread,
> > > > displaced_pc);
> > > >  }
> > > >  
> > > > +/* Convert a Dwarf 2 register number to a GDB register number
> > > > for
> > > > Linux.  */
> > > > +static int
> > > > +rs6000_linux_dwarf2_reg_to_regnum (struct gdbarch *gdbarch,
> > > > int
> > > > num)
> > > > +{
> > > > +  ppc_gdbarch_tdep *tdep =
> > > > gdbarch_tdep<ppc_gdbarch_tdep>(gdbarch);
> > > > +
> > > > +  if (0 <= num && num <= 31)
> > > > +    return tdep->ppc_gp0_regnum + num;
> > > > +  else if (32 <= num && num <= 63)
> > > > +    /* FIXME: jimb/2004-05-05: What should we do when the
> > > > debug
> > > > info
> > > > +       specifies registers the architecture doesn't have?  Our
> > > > +       callers don't check the value we return.  */
> > > 
> > > I see this comment was just copied from else where, but isn't the
> > > answer
> > > just: return -1 ?
> > > 
> > > The comment about the 'return -1' at the trail of this function
> > > seems
> > > to
> > > suggest that would be the correct thing to do.
> > > 
> > > I guess I'm asking: do we need to add another copy of this (I
> > > think
> > > out
> > > of date) fixme?
> > 
> > Yes, the function is based on rs6000_dwarf2_reg_to_regnum with the
> > specific changes for the Linux DWARF mappings.  The comment looked
> > out
> > of date to me, but I left it for consistency with the other two
> > functions that have the same comment. It would probably be best to
> > investigate the comment further and update it in all places in a
> > separate patch. 
> > 
> > Would it be acceptable to leave the comment as is for now, for
> > consistency sake, and I will work on a separate cleanup patch to
> > address removing the comment in the original two places and this
> > additional place?  I will want to look into it a bit more but I
> > think
> > we can just remove the comment.  But I do want to verify that the
> > return value is never used first.
> 
> I took a quick look.  There are three callers: 1 in dwarf2/loc.c, and
> 2
> in jit.c.  One of the callers in jit.c does fail to check the return
> value, but this looks like a bug to me.  The other callers do check
> for
> a -1 return value.  The comment in gdbarch-gen.h is also explicit
> that
> the right thing to do is return -1.
> 
> Coupled with the fact that the comment, in its current location just
> makes no sense I think the best idea would be to drop it from your
> new
> function.  Cleaning up the older code in the future would be nice,
> but
> is not required.  But I don't think we need to introduce more
> incorrect
> comments just for consistency.
> 
> Thanks,
> Andrew
>

next prev parent reply	other threads:[~2023-10-24 16:09 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-12 14:51 Carl Love
2023-10-12 14:58 ` [Patch 1/2] " Carl Love
2023-10-13 20:34   ` Keith Seitz
2023-10-13 21:00     ` Carl Love
2023-10-16 11:12       ` Ulrich Weigand
2023-10-16 14:31   ` Andrew Burgess
2023-10-16 15:51     ` Carl Love
2023-10-19 15:54       ` Carl Love
2023-10-24  8:50       ` Andrew Burgess
2023-10-24 16:05         ` Carl Love [this message]
2023-10-20 18:08     ` [PATCH 1/2, ver2] " Carl Love
2023-10-24  9:30       ` Andrew Burgess
2023-10-25 13:24         ` Ulrich Weigand
2023-10-30  9:45           ` Andrew Burgess
2023-10-30 16:44             ` Ulrich Weigand
2023-10-30 17:16               ` Carl Love
2023-10-30 17:25               ` [PATCH 1/2, ver3] " Carl Love
2023-11-06 18:24                 ` Carl Love
2023-11-08 10:54                 ` Andrew Burgess
2023-10-12 15:00 ` [PATCH 2/2] " Carl Love
2023-10-13 20:35   ` Keith Seitz
2023-10-13 21:00     ` Carl Love
2023-10-16 11:13       ` Ulrich Weigand
2023-10-16 14:36   ` Andrew Burgess
2023-10-16 15:51     ` Carl Love
2023-10-20 18:08     ` Carl Love
2023-10-24  8:53       ` Andrew Burgess

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=959a0bed35329b8327e16eb3dddd31c27361e29a.camel@linux.ibm.com \
    --to=cel@linux.ibm.com \
    --cc=Ulrich.Weigand@de.ibm.com \
    --cc=aburgess@redhat.com \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).