public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [RFC] Fix variable lifetime related problem in gdb.base/store.exp
@ 2011-01-19 23:56 Kevin Buettner
  2011-01-20 12:05 ` Doug Evans
  2011-01-21 21:52 ` Ulrich Weigand
  0 siblings, 2 replies; 5+ messages in thread
From: Kevin Buettner @ 2011-01-19 23:56 UTC (permalink / raw)
  To: gdb-patches

I'm seeing some interesting failures in gdb.base/store.exp for
mips64vrel-elf with --target_board set to vr4300-sim.  (The exact
target_board setting isn't all that critical; vr4300-sim is but one
example where things go awry.)

Here is relevant log file output showing the problem:

tbreak add_float
Temporary breakpoint 15 at 0xa0100414: file /ironwood1/sourceware-clean/mips64vrel-elf/../src/gdb/testsuite/gdb.base/store.c, line 47.
(gdb) PASS: gdb.base/store.exp: tbreak add_float
continue
Continuing.

Temporary breakpoint 15, add_float (u=-1, v=-2) at /ironwood1/sourceware-clean/mips64vrel-elf/../src/gdb/testsuite/gdb.base/store.c:47
47	  return u + v;
(gdb) PASS: gdb.base/store.exp: continue to add_float
up
#1  0xa010072c in wack_float (u=-1, v=-2) at /ironwood1/sourceware-clean/mips64vrel-elf/../src/gdb/testsuite/gdb.base/store.c:110
110	  l = add_float (l, r);
(gdb) PASS: gdb.base/store.exp: upvar float l; up
print l
$48 = -1.21996474e-19
(gdb) FAIL: gdb.base/store.exp: upvar float l; print old l, expecting -1
print r
$49 = -2
(gdb) PASS: gdb.base/store.exp: upvar float l; print old r, expecting -2
set variable l = 4
(gdb) PASS: gdb.base/store.exp: upvar float l; set l to 4
print l
No symbol "l" in current context.
(gdb) FAIL: gdb.base/store.exp: upvar float l; print new l, expecting 4
tbreak add_double
Temporary breakpoint 16 at 0xa0100454: file /ironwood1/sourceware-clean/mips64vrel-elf/../src/gdb/testsuite/gdb.base/store.c, line 53.
(gdb) PASS: gdb.base/store.exp: tbreak add_double
continue
Continuing.
mips-core: 4 byte read to unmapped address 0x40800000 at 0x40800000

Program received signal SIGBUS, Bus error.
0x40800000 in ?? ()
(gdb) FAIL: gdb.base/store.exp: continue to add_double

These failures occur due to `l' in wack_float() being assigned (for
this particular MIPS toolchain) to the `ra' (return address) register.

The function wack_float() is defined as follows:

float
wack_float (register float u, register float v)
{
  register float l = u, r = v;
  l = add_float (l, r);
  return l + r;
}

The test case places a breakpoint in add_float(), runs to that
breakpoint, and then goes up a frame to examine and modify certain
variables.

The lifetime of `l' with value obtained from `u' ends with invocation
of add_float().  (More precisely, it ends as soon as l's value is
copied to the argument register for the call to add_float().)  The
result of the add_float() call is placed in `l' when the call is
finished.

Thus it's perfectly acceptable (though a bit perverse) to put l's value
in the ra register.  (However, given that this is occurring at -O0, it is
also very annoying that the user is unable to inspect l's pre-call
value.)

In any case, due to the fact that the lifetime of l-with-the-value-of-u
ends once the call to add_float() is set up, the compiler is then free
to use ra for other purposes which, of course, during a call, is used
to temporarily hold the return address.

This is why the test fails to print the old value of 'l'; the register
to which it was assigned is now being used for another purpose, i.e,
holding the return address.

And, moreover, it explains the BUS error after the assignment: 
add_float() is a leaf function and, therefore, does not need to save
ra on the stack.  So it simply uses it as is not expecting it to be
changed upon return.

That BUS error is reponsible for many cascade failures later on.

My fix, below, is to assign (via GDB's "set variable" command) to `r'
instead of `l' since the lifetime of r's value set at the beginning of
the function extends beyond the call to add_float().  The patch also
causes 'r', instead of 'l', to be examined immediately after the
assignment.

It won't fix the initial failure with `l' apparently having the wrong
value.  I think this test ought to be preserved to remind us - should
it fail - that although GCC's generated code might be correct, it is
not as useful as it should be for debugging purposes.

This fix does prevent the BUS error and the resulting cascade
failures.  Stopping the cascade failures is important because we can
then determine if there are any real failures amidst the cascade.

Comments?

	* gdb.base/store.exp (proc up_set): Set, and check, the variable
	`r' rather than `l' due to the fact that the lifetime for the
	pre-call value for `l' need not extend beyond the call of
	add_<type>.

Index: testsuite/gdb.base/store.exp
===================================================================
RCS file: /cvs/src/src/gdb/testsuite/gdb.base/store.exp,v
retrieving revision 1.17
diff -u -p -r1.17 store.exp
--- testsuite/gdb.base/store.exp	1 Jan 2011 15:33:43 -0000	1.17
+++ testsuite/gdb.base/store.exp	19 Jan 2011 22:22:38 -0000
@@ -95,10 +95,10 @@ proc up_set { t l r new } {
 	"${prefix}; print old l, expecting ${l}"
     gdb_test "print r" " = ${r}" \
 	"${prefix}; print old r, expecting ${r}"
-    gdb_test_no_output "set variable l = 4" \
-	"${prefix}; set l to 4"
-    gdb_test "print l" " = ${new}" \
-	"${prefix}; print new l, expecting ${new}"
+    gdb_test_no_output "set variable r = 4" \
+	"${prefix}; set r to 4"
+    gdb_test "print r" " = ${new}" \
+	"${prefix}; print new r, expecting ${new}"
 }
 
 up_set "charest" "-1 .*" "-2 .*" "4 ..004."


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC] Fix variable lifetime related problem in gdb.base/store.exp
  2011-01-19 23:56 [RFC] Fix variable lifetime related problem in gdb.base/store.exp Kevin Buettner
@ 2011-01-20 12:05 ` Doug Evans
  2011-01-21 20:42   ` Kevin Buettner
  2011-01-21 21:52 ` Ulrich Weigand
  1 sibling, 1 reply; 5+ messages in thread
From: Doug Evans @ 2011-01-20 12:05 UTC (permalink / raw)
  To: Kevin Buettner; +Cc: gdb-patches

On Wed, Jan 19, 2011 at 3:26 PM, Kevin Buettner <kevinb@redhat.com> wrote:
> I'm seeing some interesting failures in gdb.base/store.exp for
> mips64vrel-elf with --target_board set to vr4300-sim.  (The exact
> target_board setting isn't all that critical; vr4300-sim is but one
> example where things go awry.)
>
> Here is relevant log file output showing the problem:
>
> tbreak add_float
> Temporary breakpoint 15 at 0xa0100414: file /ironwood1/sourceware-clean/mips64vrel-elf/../src/gdb/testsuite/gdb.base/store.c, line 47.
> (gdb) PASS: gdb.base/store.exp: tbreak add_float
> continue
> Continuing.
>
> Temporary breakpoint 15, add_float (u=-1, v=-2) at /ironwood1/sourceware-clean/mips64vrel-elf/../src/gdb/testsuite/gdb.base/store.c:47
> 47        return u + v;
> (gdb) PASS: gdb.base/store.exp: continue to add_float
> up
> #1  0xa010072c in wack_float (u=-1, v=-2) at /ironwood1/sourceware-clean/mips64vrel-elf/../src/gdb/testsuite/gdb.base/store.c:110
> 110       l = add_float (l, r);
> (gdb) PASS: gdb.base/store.exp: upvar float l; up
> print l
> $48 = -1.21996474e-19
> (gdb) FAIL: gdb.base/store.exp: upvar float l; print old l, expecting -1
> print r
> $49 = -2
> (gdb) PASS: gdb.base/store.exp: upvar float l; print old r, expecting -2
> set variable l = 4
> (gdb) PASS: gdb.base/store.exp: upvar float l; set l to 4
> print l
> No symbol "l" in current context.
> (gdb) FAIL: gdb.base/store.exp: upvar float l; print new l, expecting 4
> tbreak add_double
> Temporary breakpoint 16 at 0xa0100454: file /ironwood1/sourceware-clean/mips64vrel-elf/../src/gdb/testsuite/gdb.base/store.c, line 53.
> (gdb) PASS: gdb.base/store.exp: tbreak add_double
> continue
> Continuing.
> mips-core: 4 byte read to unmapped address 0x40800000 at 0x40800000
>
> Program received signal SIGBUS, Bus error.
> 0x40800000 in ?? ()
> (gdb) FAIL: gdb.base/store.exp: continue to add_double
>
> These failures occur due to `l' in wack_float() being assigned (for
> this particular MIPS toolchain) to the `ra' (return address) register.
>
> The function wack_float() is defined as follows:
>
> float
> wack_float (register float u, register float v)
> {
>  register float l = u, r = v;
>  l = add_float (l, r);
>  return l + r;
> }
>
> The test case places a breakpoint in add_float(), runs to that
> breakpoint, and then goes up a frame to examine and modify certain
> variables.
>
> The lifetime of `l' with value obtained from `u' ends with invocation
> of add_float().  (More precisely, it ends as soon as l's value is
> copied to the argument register for the call to add_float().)  The
> result of the add_float() call is placed in `l' when the call is
> finished.
>
> Thus it's perfectly acceptable (though a bit perverse) to put l's value
> in the ra register.  (However, given that this is occurring at -O0, it is
> also very annoying that the user is unable to inspect l's pre-call
> value.)
>
> In any case, due to the fact that the lifetime of l-with-the-value-of-u
> ends once the call to add_float() is set up, the compiler is then free
> to use ra for other purposes which, of course, during a call, is used
> to temporarily hold the return address.
>
> This is why the test fails to print the old value of 'l'; the register
> to which it was assigned is now being used for another purpose, i.e,
> holding the return address.
>
> And, moreover, it explains the BUS error after the assignment:
> add_float() is a leaf function and, therefore, does not need to save
> ra on the stack.  So it simply uses it as is not expecting it to be
> changed upon return.
>
> That BUS error is reponsible for many cascade failures later on.
>
> My fix, below, is to assign (via GDB's "set variable" command) to `r'
> instead of `l' since the lifetime of r's value set at the beginning of
> the function extends beyond the call to add_float().  The patch also
> causes 'r', instead of 'l', to be examined immediately after the
> assignment.
>
> It won't fix the initial failure with `l' apparently having the wrong
> value.  I think this test ought to be preserved to remind us - should
> it fail - that although GCC's generated code might be correct, it is
> not as useful as it should be for debugging purposes.
>
> This fix does prevent the BUS error and the resulting cascade
> failures.  Stopping the cascade failures is important because we can
> then determine if there are any real failures amidst the cascade.
>
> Comments?
>
>        * gdb.base/store.exp (proc up_set): Set, and check, the variable
>        `r' rather than `l' due to the fact that the lifetime for the
>        pre-call value for `l' need not extend beyond the call of
>        add_<type>.
>
> Index: testsuite/gdb.base/store.exp
> ===================================================================
> RCS file: /cvs/src/src/gdb/testsuite/gdb.base/store.exp,v
> retrieving revision 1.17
> diff -u -p -r1.17 store.exp
> --- testsuite/gdb.base/store.exp        1 Jan 2011 15:33:43 -0000       1.17
> +++ testsuite/gdb.base/store.exp        19 Jan 2011 22:22:38 -0000
> @@ -95,10 +95,10 @@ proc up_set { t l r new } {
>        "${prefix}; print old l, expecting ${l}"
>     gdb_test "print r" " = ${r}" \
>        "${prefix}; print old r, expecting ${r}"
> -    gdb_test_no_output "set variable l = 4" \
> -       "${prefix}; set l to 4"
> -    gdb_test "print l" " = ${new}" \
> -       "${prefix}; print new l, expecting ${new}"
> +    gdb_test_no_output "set variable r = 4" \
> +       "${prefix}; set r to 4"
> +    gdb_test "print r" " = ${new}" \
> +       "${prefix}; print new r, expecting ${new}"
>  }
>
>  up_set "charest" "-1 .*" "-2 .*" "4 ..004."
>

Nasty.

The worry of course is that by changing the test to use r instead of l
are we breaking the intent of the test, at least on *some* target.
[since r has to survive the call gcc may be less inclined to put it in
a register]
OTOH, does up_set really provide any more coverage, as far as the
intent of the test (setting values that are in registers), than
check_set.  I don't know.

Since the high order bit here is, AIUI, to prevent the SIGBUS, an
alternative is to restore l afterwards.
I don't know if that's a better way to go.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC] Fix variable lifetime related problem in gdb.base/store.exp
  2011-01-20 12:05 ` Doug Evans
@ 2011-01-21 20:42   ` Kevin Buettner
  0 siblings, 0 replies; 5+ messages in thread
From: Kevin Buettner @ 2011-01-21 20:42 UTC (permalink / raw)
  To: gdb-patches; +Cc: Doug Evans

On Wed, 19 Jan 2011 23:17:35 -0800
Doug Evans <dje@google.com> wrote:

> The worry of course is that by changing the test to use r instead of l
> are we breaking the intent of the test, at least on *some* target.
> [since r has to survive the call gcc may be less inclined to put it in
> a register]

This is a valid concern.  I have no ready answer.

> OTOH, does up_set really provide any more coverage, as far as the
> intent of the test (setting values that are in registers), than
> check_set.  I don't know.

I think, perhaps, that the idea might have been to see if registers
up a frame, and now potentially saved on the stack, can be modified.
If this was the idea, the problem with its implementation is that
the various add_<type> functions are too simple to adequately test
this behavior on all architectures.  On some architectures (and for
some types), code generated for add_<type> will be cause one or more
registers to be saved on the stack.  On others it won't.  Also, even
when registers are saved on the stack, we have no idea whether the
registers of interest (i.e.  those assigned to `l' and/or `r') have
been saved.

Even different multilibs using the same architecture may show
different behavior.  Here's an example...

The disassembly for add_float() as generated for the default
vr4300-sim multilib is as follows:

   0xa01003fc <add_float>:      addiu   sp,sp,-16
   0xa0100400 <add_float+4>:    sd      ra,8(sp)
   0xa0100404 <add_float+8>:    sd      s8,0(sp)
   0xa0100408 <add_float+12>:   move    s8,sp
   0xa010040c <add_float+16>:   move    v1,a0
   0xa0100410 <add_float+20>:   move    v0,a1
   0xa0100414 <add_float+24>:   move    a0,v1
   0xa0100418 <add_float+28>:   move    a1,v0
   0xa010041c <add_float+32>:   jal     0xa01014a0 <__addsf3>
   0xa0100420 <add_float+36>:   nop
   0xa0100424 <add_float+40>:   move    sp,s8
   0xa0100428 <add_float+44>:   ld      ra,8(sp)
   0xa010042c <add_float+48>:   ld      s8,0(sp)
   0xa0100430 <add_float+52>:   addiu   sp,sp,16
   0xa0100434 <add_float+56>:   jr      ra
   0xa0100438 <add_float+60>:   nop

Note that it calls the helper function __addsf3 and therefore cannot
be a leaf function.  The ra register is saved on the stack and when
the test case modifes 'l' up a frame in wack_float(), it is actually
modifying the stack location at which l/ra has been saved.

When I look at the locations of l and r in wack_float(), I see:

    (gdb) info address r
    Symbol "r" is a variable in $s0.
    (gdb) info address l
    Symbol "l" is a variable in $ra.

So, in this instance, changing the test to modify 'r' instead of 'l'
does change what is being tested since s0 has not been saved on the
stack, but ra has.  The original test was modifying memory.  My patch
causes a register to be changed.

But, on the other hand, if I look at the disassembly for add_float
as generated for vr4300-sim/-march=vr5500, I see:

   0xa0100434 <add_float>:      addiu   sp,sp,-8
   0xa0100438 <add_float+4>:    sd      s8,0(sp)
   0xa010043c <add_float+8>:    move    s8,sp
   0xa0100440 <add_float+12>:   mov.s   $f1,$f12
   0xa0100444 <add_float+16>:   mov.s   $f0,$f13
   0xa0100448 <add_float+20>:   add.s   $f0,$f1,$f0
   0xa010044c <add_float+24>:   move    sp,s8
   0xa0100450 <add_float+28>:   ld      s8,0(sp)
   0xa0100454 <add_float+32>:   addiu   sp,sp,8
   0xa0100458 <add_float+36>:   jr      ra
   0xa010045c <add_float+40>:   nop

In this instance, the ra register is not saved on the stack; having
the test case modify `r' instead of `l' is similar in that a register
is being modified in both cases.  (I should note that I don't see
the same failures for this multilib since r and l are allocated to
floating point registers.)

> Since the high order bit here is, AIUI, to prevent the SIGBUS, an
> alternative is to restore l afterwards.
> I don't know if that's a better way to go.

That seemed like a reasonable alternative to me, so I gave it a try. 
But, sadly, it doesn't work.  Changing local variable `l' changes the
ra register which in turn changes GDB's notion of the frame that it's
in:

    Breakpoint 1, add_float (u=-1, v=-2)
	at testsuite/gdb.base/store.c:47
    47        return u + v;
    (gdb) up
    #1  0xa010072c in wack_float (u=-1, v=-2)
	at testsuite/gdb.base/store.c:110
    110       l = add_float (l, r);
    (gdb) info frame
    Stack level 1, frame at 0x807fffb0:
     pc = 0xa010072c in wack_float
	(testsuite/gdb.base/store.c:110); saved pc 0xa010111c
     called by frame at 0x807fffc0, caller of frame at 0x807fff98
     source language c.
     Arglist at 0x807fffb0, args: u=-1, v=-2
     Locals at 0x807fffb0, Previous frame's sp is 0x807fffb0
     Saved registers:
      s0 at 0x807fff98, s8 at 0x807fffa0, ra at 0x807fffa8, pc at 0x807fffa8
    (gdb) set variable l = 4
    (gdb) info frame
    Stack level 0, frame at 0x807fff98:
     pc = 0xa0100414 in add_float
	(testsuite/gdb.base/store.c:47); saved pc 0x40800000
     called by frame at 0x807fff98
     source language c.
     Arglist at 0x807fff98, args: u=-1, v=-2
     Locals at 0x807fff98, Previous frame's sp is 0x807fff98
     Saved registers:
      s8 at 0x807fff88, ra at 0x807fff90, pc at 0x807fff90

That means that GDB can no longer find local variable l in order to
restore it.  Even printing it fails:

    (gdb) print l
    No symbol "l" in current context.

Even saving the entire register state and restoring it afterwards won't
always work.  As shown earlier, `l' can be allocated to ra and then
saved on the stack.  In that case, it's that stack location that would
need to be restored.  But the act of changing `l' (and therefore ra)
prevents us from being able to correctly restore it later on.

I haven't been able to think of a better fix than the patch I've
already posted.

Kevin

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC] Fix variable lifetime related problem in gdb.base/store.exp
  2011-01-19 23:56 [RFC] Fix variable lifetime related problem in gdb.base/store.exp Kevin Buettner
  2011-01-20 12:05 ` Doug Evans
@ 2011-01-21 21:52 ` Ulrich Weigand
  2011-01-25 15:14   ` Kevin Buettner
  1 sibling, 1 reply; 5+ messages in thread
From: Ulrich Weigand @ 2011-01-21 21:52 UTC (permalink / raw)
  To: Kevin Buettner; +Cc: gdb-patches

Kevin Buettner wrote:

> The test case places a breakpoint in add_float(), runs to that
> breakpoint, and then goes up a frame to examine and modify certain
> variables.
> 
> The lifetime of `l' with value obtained from `u' ends with invocation
> of add_float().  (More precisely, it ends as soon as l's value is
> copied to the argument register for the call to add_float().)  The
> result of the add_float() call is placed in `l' when the call is
> finished.
> 
> Thus it's perfectly acceptable (though a bit perverse) to put l's value
> in the ra register.  (However, given that this is occurring at -O0, it is
> also very annoying that the user is unable to inspect l's pre-call
> value.)
> 
> In any case, due to the fact that the lifetime of l-with-the-value-of-u
> ends once the call to add_float() is set up, the compiler is then free
> to use ra for other purposes which, of course, during a call, is used
> to temporarily hold the return address.

Isn't this then simply a matter of the compiler generating incorrect
debug information?  At the PC corresponding to the call site, it is
simply not true that register ra holds the variable l, so the debug
info shouldn't say that.  Instead, the debug info should either point
to whatever place actually holds the variable at this point, or else
mark variable l as "optimized out" there.

There's in the end not much GDB can do if debug info is just wrong.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC] Fix variable lifetime related problem in gdb.base/store.exp
  2011-01-21 21:52 ` Ulrich Weigand
@ 2011-01-25 15:14   ` Kevin Buettner
  0 siblings, 0 replies; 5+ messages in thread
From: Kevin Buettner @ 2011-01-25 15:14 UTC (permalink / raw)
  To: gdb-patches; +Cc: Ulrich Weigand

On Fri, 21 Jan 2011 21:41:29 +0100 (CET)
"Ulrich Weigand" <uweigand@de.ibm.com> wrote:

> Isn't this then simply a matter of the compiler generating incorrect
> debug information?  At the PC corresponding to the call site, it is
> simply not true that register ra holds the variable l, so the debug
> info shouldn't say that.  Instead, the debug info should either point
> to whatever place actually holds the variable at this point, or else
> mark variable l as "optimized out" there.
> 
> There's in the end not much GDB can do if debug info is just wrong.

I agree with your analysis.

Let's assume for the moment that we change the compiler to mark `l' as
"optimized out".  If we do this, we should still see failures when we
attempt to either examine or set `l'.  After all, we can't look at
or change values that have been optimized out.

I can think of four not entirely unreasonable courses of action:

1) Leave the test case as is with the justification that optimizing
   away variables in unoptimized code is not debugger-friendly and
   therefore *should* be considered a bug.

2) Avoid the problem by changing the test to use some other variable
   that's not so easy to optimize out.  (This is what my patch does.)

3) Change the test case to recognize the "optimized out" condition,
   causing the affected tests to be XFAILed instead.

4) Do (3), but also augment the test case to modify/examine some other
   variable that won't likely be optimized out.

As I ponder these options, I'm not really liking option 2 since it
doesn't allow us to learn of a potential deficiency in the compiler. 
That being the case, I withdraw my patch.

Kevin

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2011-01-25 14:22 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-19 23:56 [RFC] Fix variable lifetime related problem in gdb.base/store.exp Kevin Buettner
2011-01-20 12:05 ` Doug Evans
2011-01-21 20:42   ` Kevin Buettner
2011-01-21 21:52 ` Ulrich Weigand
2011-01-25 15:14   ` Kevin Buettner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).