public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/66614] New: LRA might fail to eliminate dead code
@ 2015-06-20 23:35 bernd.edlinger at hotmail dot de
  2015-06-21  9:32 ` [Bug rtl-optimization/66614] " ebotcazou at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: bernd.edlinger at hotmail dot de @ 2015-06-20 23:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66614

            Bug ID: 66614
           Summary: LRA might fail to eliminate dead code
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bernd.edlinger at hotmail dot de
  Target Milestone: ---

Hi,

I have GCC compiled with this local patch in rtx_addr_can_trap_p_1
that should print a warning, when invalid frame offsets are encounterd:

Index: rtlanal.c
===================================================================
--- rtlanal.c   (revision 224708)
+++ rtlanal.c   (working copy)
@@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "emit-rtl.h"  /* FIXME: Can go away once crtl is moved to rtl.h.  */
 #include "addresses.h"
 #include "rtl-iter.h"
+#include "tree-pass.h"

 /* Forward declarations */
 static void set_of_1 (rtx, const_rtx, void *);
@@ -430,7 +431,20 @@ rtx_addr_can_trap_p_1 (const_rtx x, HOST_WIDE_INT
          if (FRAME_GROWS_DOWNWARD)
            {
              if (adj_offset < frame_offset || adj_offset + size - 1 >= 0)
-               return 1;
+               {
+                 HOST_WIDE_INT high_bound = STARTING_FRAME_OFFSET;
+                 HOST_WIDE_INT low_bound = high_bound - get_frame_size ();
+                 fprintf (stderr, "*** %s can trap: function=%s, pass=%s"
+                                  ", offset=%" PRId64 ", size=%" PRId64
+                                  ", low_bound=%" PRId64 ", high_bound=%"
PRId64 "\n",
+                          x == frame_pointer_rtx ? "frame"
+                          : x == hard_frame_pointer_rtx ? "fp"
+                          : x == stack_pointer_rtx ? "sp" : "argp",
+                          IDENTIFIER_POINTER (DECL_NAME
(current_function_decl)),
+                          current_pass->name,
+                          offset, size, low_bound, high_bound);
+                 return 1;
+               }
            }
          else
            {



That should not happen in ordinary code, like GCC itself.
BUT it does happen really often, when I boot-strap GCC it happens
almost 3.000 times alone in pass 2.

All these events happen in the reload pass.

As an example I debugged at builtins.c which emits 4 suspicious intructions:

*** frame can trap: function=fold_builtin_cbrt, pass=reload, offset=56, size=8,
low_bound=-64, high_bound=0
*** frame can trap: function=fold_builtin_cbrt, pass=reload, offset=48, size=8,
low_bound=-64, high_bound=0
*** frame can trap: function=fold_builtin_cbrt, pass=reload, offset=40, size=8,
low_bound=-64, high_bound=0
*** frame can trap: function=fold_builtin_cbrt, pass=reload, offset=32, size=8,
low_bound=-64, high_bound=0

I get this call stack:

#0  rtx_addr_can_trap_p_1 (x=0x7ffff6ecb348, offset=56, size=8, mode=DImode,
unaligned_mems=false) at ../../gcc-trunk/gcc/rtlanal.c:446
#1  0x0000000000fc3fff in rtx_addr_can_trap_p_1 (x=0x7ffff40d9bb8, offset=0,
size=8, mode=DImode, unaligned_mems=false)
    at ../../gcc-trunk/gcc/rtlanal.c:481
#2  0x0000000000fc845b in may_trap_p_1 (x=0x7ffff40d9bd0, flags=0) at
../../gcc-trunk/gcc/rtlanal.c:2542
#3  0x0000000000fc86b0 in may_trap_p_1 (x=0x7ffff2d07a08, flags=0) at
../../gcc-trunk/gcc/rtlanal.c:2620
#4  0x0000000000fc8763 in may_trap_p (x=0x7ffff2d07a08) at
../../gcc-trunk/gcc/rtlanal.c:2639
#5  0x0000000000e9e851 in process_bb_lives (bb=0x7ffff2c66478,
curr_point=@0x7fffffffd904: 123, dead_insn_p=true) at
../../gcc-trunk/gcc/lra-lives.c:697
#6  0x0000000000ea0740 in lra_create_live_ranges_1 (all_p=true,
dead_insn_p=true) at ../../gcc-trunk/gcc/lra-lives.c:1261
#7  0x0000000000ea0a22 in lra_create_live_ranges (all_p=true, dead_insn_p=true)
at ../../gcc-trunk/gcc/lra-lives.c:1326
#8  0x0000000000e7d7f9 in lra (f=0x0) at ../../gcc-trunk/gcc/lra.c:2308
#9  0x0000000000e272cb in do_reload () at ../../gcc-trunk/gcc/ira.c:5400
#10 0x0000000000e27679 in (anonymous namespace)::pass_reload::execute
(this=0x2685070) at ../../gcc-trunk/gcc/ira.c:5571
#11 0x0000000000f3c397 in execute_one_pass (pass=0x2685070) at
../../gcc-trunk/gcc/passes.c:2356
#12 0x0000000000f3c5e1 in execute_pass_list_1 (pass=0x2685070) at
../../gcc-trunk/gcc/passes.c:2409
#13 0x0000000000f3c612 in execute_pass_list_1 (pass=0x2683ff0) at
../../gcc-trunk/gcc/passes.c:2410
#14 0x0000000000f3c64f in execute_pass_list (fn=0x7ffff389ae70, pass=0x2680db0)
at ../../gcc-trunk/gcc/passes.c:2420
#15 0x0000000000b10878 in cgraph_node::expand (this=0x7ffff3898310) at
../../gcc-trunk/gcc/cgraphunit.c:1934
#16 0x0000000000b10ea9 in expand_all_functions () at
../../gcc-trunk/gcc/cgraphunit.c:2070
#17 0x0000000000b119c0 in symbol_table::compile (this=0x7ffff6ecf0a8) at
../../gcc-trunk/gcc/cgraphunit.c:2423
#18 0x0000000000b11bd4 in symbol_table::finalize_compilation_unit
(this=0x7ffff6ecf0a8) at ../../gcc-trunk/gcc/cgraphunit.c:2510
#19 0x000000000103c552 in compile_file () at ../../gcc-trunk/gcc/toplev.c:577
#20 0x000000000103ea92 in do_compile () at ../../gcc-trunk/gcc/toplev.c:2067
#21 0x000000000103ecde in toplev::main (this=0x7fffffffdc70, argc=33,
argv=0x7fffffffdd78) at ../../gcc-trunk/gcc/toplev.c:2168
#22 0x00000000018e5a83 in main (argc=33, argv=0x7fffffffdd78) at
../../gcc-trunk/gcc/main.c:39


at frame #5, curr_insn->u2.insn_uid = 374

at builtins.c.235r.ira this insn looks like this:

(insn 374 373 375 56 (set (reg:DI 233)
        (mem:DI (plus:DI (reg/f:DI 20 frame)
                (const_int -40 [0xffffffffffffffd8])) [1  S8 A64]))
../../gcc-trunk/gcc/builtins.c:7898 85 {*movdi_internal}
     (expr_list:REG_EQUIV (mem:DI (plus:DI (reg/f:DI 20 frame)
                (const_int -40 [0xffffffffffffffd8])) [1  S8 A64])
        (nil)))


at builtins.c.236r.reload this insn looks like this:

(insn 374 373 375 56 (set (reg:DI 4 si [233])
        (mem:DI (plus:DI (reg/f:DI 7 sp)
                (const_int 56 [0x38])) [1  S8 A64]))
../../gcc-trunk/gcc/builtins.c:7898 85 {*movdi_internal}
     (expr_list:REG_EQUIV (mem:DI (plus:DI (reg/f:DI 20 frame)
                (const_int -40 [0xffffffffffffffd8])) [1  S8 A64])
        (nil)))


So this looks exactly like rtx_addr_can_trap_p_1 is called up on a bogus RTX
that is a mixture between the old and new RTX, i.e. a frame_pointer_rtx
plus a STACK_POINTER relative offset.

If may_trap_p (PATTERN (curr_inst)) returns 1 here, when this instruction
is actually perfectly safe, that may result in lra being unable to eliminate
a dead instruction, just because it uses a bogus offset.


This happens by design, as lra uses for some reason two steps to eliminate
the frame registers, first the offset alone is changed from frame-relative
to sp-relative, and then only the base register is changed from frame to sp,
not changing the offset at this time.  These intermediate values are never
seen in any rtl dumps.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug rtl-optimization/66614] LRA might fail to eliminate dead code
  2015-06-20 23:35 [Bug rtl-optimization/66614] New: LRA might fail to eliminate dead code bernd.edlinger at hotmail dot de
@ 2015-06-21  9:32 ` ebotcazou at gcc dot gnu.org
  2015-06-21 13:31 ` bernd.edlinger at hotmail dot de
  2015-06-28  7:18 ` bernd.edlinger at hotmail dot de
  2 siblings, 0 replies; 4+ messages in thread
From: ebotcazou at gcc dot gnu.org @ 2015-06-21  9:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66614

Eric Botcazou <ebotcazou at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-06-21
                 CC|                            |ebotcazou at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Eric Botcazou <ebotcazou at gcc dot gnu.org> ---
> This happens by design, as lra uses for some reason two steps to eliminate
> the frame registers, first the offset alone is changed from frame-relative
> to sp-relative, and then only the base register is changed from frame to sp,
> not changing the offset at this time.  These intermediate values are never
> seen in any rtl dumps.

Nice discovery work.  How come may_trap_p is invoked on this semi-broken RTL?


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug rtl-optimization/66614] LRA might fail to eliminate dead code
  2015-06-20 23:35 [Bug rtl-optimization/66614] New: LRA might fail to eliminate dead code bernd.edlinger at hotmail dot de
  2015-06-21  9:32 ` [Bug rtl-optimization/66614] " ebotcazou at gcc dot gnu.org
@ 2015-06-21 13:31 ` bernd.edlinger at hotmail dot de
  2015-06-28  7:18 ` bernd.edlinger at hotmail dot de
  2 siblings, 0 replies; 4+ messages in thread
From: bernd.edlinger at hotmail dot de @ 2015-06-21 13:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66614

--- Comment #2 from Bernd Edlinger <bernd.edlinger at hotmail dot de> ---
this is what I understand from lra in my own words:

lra () consists of a sequence of


  for (;;) {
     ...
          lra_eliminate (false, false);
          /* Do inheritance only for regular algorithms.  */
          if (! lra_simple_p)
            {
              if (flag_ipa_ra)
                {
                  if (live_p)
                    lra_clear_live_ranges ();
                  /* As a side-effect of lra_create_live_ranges, we calculate
                     actual_call_used_reg_set,  which is needed during
                     lra_inheritance.  */
                  lra_create_live_ranges (true, true);
                  live_p = true;
                }
              lra_inheritance ();
            }
          if (live_p)
            lra_clear_live_ranges ();
     ...
    }

and it is finally fixated  by

lra_eliminate (true, false);

which actually replaces the base register, but keeps the offsets as they are

the RTL transformations are done by lra_eliminate_regs_1 ():

first it is called with subst_p=false, update_p=false, full_p=true,
this replaces the offset relative to from_rtx to to_rtx, but keeps the
previous base register.

then it is repeatedly called with subst_p=false, update_p=true, full_p=false,
which adds some minor corrections to the offset, and again keeps the to be
eliminated base register.

finally it is called with subst_p=true, update_p=false, full_p=false,
this time only the base register is replaced but the offsets are already
relative to the target base register.

lra_create_live_ranges calls process_bb_lives repeatedly
which uses this as a pre-condition of a dead insn which and can be eliminated:

      if (dead_insn_p && set != NULL_RTX
          && REG_P (SET_DEST (set)) && REGNO (SET_DEST (set)) >=
FIRST_PSEUDO_REGISTER
          && find_reg_note (curr_insn, REG_EH_REGION, NULL_RTX) == NULL_RTX
          && ! may_trap_p (PATTERN (curr_insn))
          /* Don't do premature remove of pic offset pseudo as we can
             start to use it after some reload generation.  */
          && (pic_offset_table_rtx == NULL_RTX
              || pic_offset_table_rtx != SET_DEST (set)))

now may_trap_p sees the offsets which are not consistent with the base.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug rtl-optimization/66614] LRA might fail to eliminate dead code
  2015-06-20 23:35 [Bug rtl-optimization/66614] New: LRA might fail to eliminate dead code bernd.edlinger at hotmail dot de
  2015-06-21  9:32 ` [Bug rtl-optimization/66614] " ebotcazou at gcc dot gnu.org
  2015-06-21 13:31 ` bernd.edlinger at hotmail dot de
@ 2015-06-28  7:18 ` bernd.edlinger at hotmail dot de
  2 siblings, 0 replies; 4+ messages in thread
From: bernd.edlinger at hotmail dot de @ 2015-06-28  7:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66614

--- Comment #3 from Bernd Edlinger <bernd.edlinger at hotmail dot de> ---
I checked a few apps, and have not yet found any impact of the return
value of rtx_addr_can_trap_p_1 on the final code so far.

That means that this is probably only a low prio bug...

Regarding the LRA, I think there are two different "may trap":

One is an instruction that may trap in a predictable way, like integer-div:

Example:

  try {

     int unused = a/b;

   } catch (...)

here it would not be OK to remove the division, although the result will not
be used directly.  (may_trap_p returns 1)

However there are other instructions that may trap in an unpredictable way,
but they invoke undefined behaviour first, like accessing an array
out-of-bounds:

Example:

   try {
     int a[1];
     int unused = a[1234];
   }  catch (...)

here it would be OK to remove the array access, because it invokes undefined
behavour.  (rtx_addr_can_trap_p_1 returns 1)

It is impossible to predict, if the catch block will be executed anyway, so
it is no big deal to remove this instruction.


My conclusion would be that LRA might need something like
"may_trap_without_undefined_behaviour_p()"
which does most things like may_trap_p but does not care about
array bounds for instance.


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-06-28  7:18 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-20 23:35 [Bug rtl-optimization/66614] New: LRA might fail to eliminate dead code bernd.edlinger at hotmail dot de
2015-06-21  9:32 ` [Bug rtl-optimization/66614] " ebotcazou at gcc dot gnu.org
2015-06-21 13:31 ` bernd.edlinger at hotmail dot de
2015-06-28  7:18 ` bernd.edlinger at hotmail dot de

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).