public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* Need help with fixing this two year old backend bug in mainline.
@ 2003-09-19 15:28 Carlo Wood
  2003-09-23  4:09 ` Jim Wilson
  0 siblings, 1 reply; 2+ messages in thread
From: Carlo Wood @ 2003-09-19 15:28 UTC (permalink / raw)
  To: gcc-bugs, gcc

I am trying to make a start with PR 10003,
which 4431 and 12319 marked as 'duplicate'.

PR 10003 was opened at 2003-03-08
but PR 4431 dates back to 2001-09-30 !

What this bug basically comes down to
is that the block labels (.LBB and .LBE)
that are put around an inlined function
are invaded by the prologue and epilogue
of the function that is calling the inlined
function.

Example:

    f:
    .LBB1:
	    <prologue of f()>
	    <inlined function>
	    <epilogue of f()>
    .LBE1:

*Should* have been:

    f:
	    <prologue of f()>
    .LBB1:
	    <inlined function>
    .LBE1:
	    <epilogue of f()>


After making an easy-to-reproduce testcase and
discussing this with friends online, it was brought
to my attention that gcc 3.3.x did NOT put the
prologue inside the inlined block.  Versions
3.2 and before as well as 3.4 (mainline) DO include
the prologue inside this block though.

I used a binary search on CVS dates to find that
the patch that "fixed" this (accidently apparently)
was made on 2002-06-02.  The ChangeLog entry was
rather big, but I was able to reduce the patch
further down till two lines:

+   scope_to_insns_initialize ();
 and
+       scope_to_insns_finalize ();

in two different files.

Unfortunately, these functions are gone in 3.4.

I think however that when I'd *understand* why this is
causing a partial fix for 3.3, I'd also be able to fix
mainline.  But, I cannot understand it because I am
totally clueless when it comes to understanding the
gcc internals :(.


Therefore my plead for some help; Please look at the
diff below that I produced and that fixes half of the
problem (the prologue) for 3.4, and tell me WHY you
think this fixes it - and what would be a REAL fix.

Assembly output before applying this patch:

f:
.LFB4:
        .file 1 "troep.c"
        .loc 1 5 0
.LBB2:
.LBB3:
        pushl   %ebp
.LCFI0:
        movl    %esp, %ebp
.LCFI1:
        .loc 1 3 0
        incl    a
        .loc 1 7 0
        leave
        ret
.LBE3:
.LBE2:
.LFE4:
        .size   f, .-f


Assembly output after applying this patch:

f:
.LFB4:
        .file 1 "troep.c"
        .loc 1 5 0
        pushl   %ebp
.LCFI0:
        movl    %esp, %ebp
.LCFI1:
.LBB2:
.LBB3:
        .loc 1 3 0
        incl    a
        .loc 1 7 0
        leave
        ret
.LBE3:
.LBE2:
.LFE4:
        .size   f, .-f


And the diff of that:

***************
*** 13,24 ****
  .LFB4:
        .file 1 "troep.c"
        .loc 1 5 0
- .LBB2:
- .LBB3:
        pushl   %ebp
  .LCFI0:
        movl    %esp, %ebp
  .LCFI1:
        .loc 1 3 0
        incl    a
        .loc 1 7 0
--- 13,24 ----
  .LFB4:
        .file 1 "troep.c"
        .loc 1 5 0
        pushl   %ebp
  .LCFI0:
        movl    %esp, %ebp
  .LCFI1:
+ .LBB2:
+ .LBB3:
        .loc 1 3 0
        incl    a
        .loc 1 7 0


And finally the patch that is causing this difference, that
is a step in the right direction (patch won't APPLY, I removed
the ^L from it because I was afraid it would interfere with
the mail):

Note that I simply/stupidly added scope_to_insns_initialize()
and scope_to_insns_finalize() from the 3.3 branch to get it
to compile.


Index: toplev.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.818
diff -u -d -p -r1.818 toplev.c
--- toplev.c	14 Aug 2003 01:47:30 -0000	1.818
+++ toplev.c	19 Sep 2003 13:42:05 -0000
@@ -2978,6 +2978,46 @@ rest_of_handle_loop2 (tree decl, rtx ins
   ggc_collect ();
 }
 
+/* Build a varray mapping INSN_UID to lexical block.  Return it.  */
+
+static void scope_to_insns_initialize (void);
+extern varray_type insn_scopes;
+
+static void
+scope_to_insns_initialize ()
+{
+  tree block = NULL;
+  rtx insn, next;
+
+  VARRAY_TREE_INIT (insn_scopes, get_max_uid (), "insn scopes");
+
+  for (insn = get_insns (); insn; insn = next)
+    {
+      next = NEXT_INSN (insn);
+
+      if (active_insn_p (insn)
+          && GET_CODE (PATTERN (insn)) != ADDR_VEC
+          && GET_CODE (PATTERN (insn)) != ADDR_DIFF_VEC)
+        VARRAY_TREE (insn_scopes, INSN_UID (insn)) = block;
+      else if (GET_CODE (insn) == NOTE)
+        {
+          switch (NOTE_LINE_NUMBER (insn))
+            {
+            case NOTE_INSN_BLOCK_BEG:
+              block = NOTE_BLOCK (insn);
+              delete_insn (insn);
+              break;
+            case NOTE_INSN_BLOCK_END:
+              block = BLOCK_SUPERCONTEXT (block);
+              delete_insn (insn);
+              break;
+            default:
+              break;
+            }
+        }
+    }
+}
+
 /* This is called from finish_function (within langhooks.parse_file)
    after each top-level definition is parsed.
    It is supposed to compile that function or variable
@@ -3091,6 +3131,7 @@ rest_of_compilation (tree decl)
 
   timevar_pop (TV_JUMP);
 
+  scope_to_insns_initialize ();
   insn_locators_initialize ();
   /* Complete generation of exception handling code.  */
   if (doing_eh (0))
Index: final.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/final.c,v
retrieving revision 1.289
diff -u -d -p -r1.289 final.c
--- final.c	30 Jul 2003 17:27:06 -0000	1.289
+++ final.c	19 Sep 2003 13:42:09 -0000
@@ -1318,6 +1318,110 @@ asm_insn_count (rtx body)
 }
 #endif

+static void set_block_levels (tree, int);
+static void scope_to_insns_finalize (void);
+static void change_scope (rtx, tree, tree);
+
+static void
+set_block_levels (block, level)
+     tree block;
+     int level;
+{
+  while (block)
+    {
+      BLOCK_NUMBER (block) = level;
+      set_block_levels (BLOCK_SUBBLOCKS (block), level + 1);
+      block = BLOCK_CHAIN (block);
+    }
+}
+
+varray_type insn_scopes;
+
+static void
+change_scope (orig_insn, s1, s2)
+     rtx orig_insn;
+     tree s1, s2;
+{
+  rtx insn = orig_insn;
+  tree com = NULL_TREE;
+  tree ts1 = s1, ts2 = s2;
+  tree s;
+
+  while (ts1 != ts2)
+    {
+      if (ts1 == NULL || ts2 == NULL)
+        abort ();
+      if (BLOCK_NUMBER (ts1) > BLOCK_NUMBER (ts2))
+        ts1 = BLOCK_SUPERCONTEXT (ts1);
+      else if (BLOCK_NUMBER (ts1) < BLOCK_NUMBER (ts2))
+        ts2 = BLOCK_SUPERCONTEXT (ts2);
+      else
+        {
+          ts1 = BLOCK_SUPERCONTEXT (ts1);
+          ts2 = BLOCK_SUPERCONTEXT (ts2);
+        }
+    }
+  com = ts1;
+
+  /* Close scopes.  */
+  s = s1;
+  while (s != com)
+    {
+      rtx note = emit_note_before (NOTE_INSN_BLOCK_END, insn);
+      NOTE_BLOCK (note) = s;
+      s = BLOCK_SUPERCONTEXT (s);
+    }
+
+  /* Open scopes.  */
+  s = s2;
+  while (s != com)
+    {
+      insn = emit_note_before (NOTE_INSN_BLOCK_BEG, insn);
+      NOTE_BLOCK (insn) = s;
+      s = BLOCK_SUPERCONTEXT (s);
+    }
+}
+
+/* Rebuild all the NOTE_INSN_BLOCK_BEG and NOTE_INSN_BLOCK_END notes based
+   on the scope tree and the newly reordered instructions.  */
+
+static void
+scope_to_insns_finalize ()
+{
+  tree cur_block = DECL_INITIAL (cfun->decl);
+  rtx insn, note;
+
+  /* Tag the blocks with a depth number so that change_scope can find
+     the common parent easily.  */
+  set_block_levels (cur_block, 0);
+
+  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
+    {
+      tree this_block;
+
+      if ((size_t) INSN_UID (insn) >= insn_scopes->num_elements)
+        continue;
+      this_block = VARRAY_TREE (insn_scopes, INSN_UID (insn));
+      if (! this_block)
+        continue;
+
+      if (this_block != cur_block)
+        {
+          change_scope (insn, cur_block, this_block);
+          cur_block = this_block;
+        }
+    }
+
+  VARRAY_FREE (insn_scopes);
+
+  /* change_scope emits before the insn, not after.  */
+  note = emit_note (NOTE_INSN_DELETED);
+  change_scope (note, cur_block, DECL_INITIAL (cfun->decl));
+  delete_insn (note);
+
+  reorder_blocks ();
+}
+ 
 /* Output assembler code for the start of a function,
    and initialize some of the variables in this file
    for the new function.  The label for the function and associated
@@ -1383,6 +1487,7 @@ final_start_function (rtx first ATTRIBUT
   if (write_symbols)
     {
       remove_unnecessary_notes ();
+      scope_to_insns_finalize ();
       reemit_insn_block_notes ();
       number_blocks (current_function_decl);
       /* We never actually put out begin/end notes for the top-level

===========================================================================
In order to make it easier for you to comment on this, here are also
some more context of the functions to which the
scope_to_insns_initialize()/scope_to_insns_finalize() were added:
(from 3.4 thus)

gcc/toplev.c:

void
rest_of_compilation (tree decl)
{
  rtx insns;
  int rebuild_label_notes_after_reload;

  timevar_push (TV_REST_OF_COMPILATION);

  /* Register rtl specific functions for cfg.  */
  rtl_register_cfg_hooks ();

[...]

  timevar_push (TV_JUMP);
  open_dump_file (DFI_sibling, decl);
  insns = get_insns ();
  rebuild_jump_labels (insns);
  find_exception_handler_labels ();
  find_basic_blocks (insns, max_reg_num (), rtl_dump_file);

  delete_unreachable_blocks ();

  /* We have to issue these warnings now already, because CFG cleanups
     further down may destroy the required information.  */
  check_function_return_warnings ();

  /* Turn NOTE_INSN_PREDICTIONs into branch predictions.  */
  if (flag_guess_branch_prob)
    {
      timevar_push (TV_BRANCH_PROB);
      note_prediction_to_br_prob ();
      timevar_pop (TV_BRANCH_PROB);
    }

  if (flag_optimize_sibling_calls)
    rest_of_handle_sibling_calls (insns);

  timevar_pop (TV_JUMP);

  scope_to_insns_initialize ();		// ****** THIS WAS ADDED ******

  insn_locators_initialize ();
  /* Complete generation of exception handling code.  */
  if (doing_eh (0))
    {
      timevar_push (TV_JUMP);
      open_dump_file (DFI_eh, decl);

      finish_eh_generation ();

      close_dump_file (DFI_eh, print_rtl, get_insns ());
      timevar_pop (TV_JUMP);
    }

  /* Delay emitting hard_reg_initial_value sets until after EH landing pad
     generation, which might create new sets.  */
  emit_initial_value_sets ();

[...]
}

gcc/final.c:

void
final_start_function (rtx first ATTRIBUTE_UNUSED, FILE *file,
                      int optimize ATTRIBUTE_UNUSED)
{
  block_depth = 0;

  this_is_asm_operands = 0;

#ifdef NON_SAVING_SETJMP
  /* A function that calls setjmp should save and restore all the
     call-saved registers on a system where longjmp clobbers them.  */
  if (NON_SAVING_SETJMP && current_function_calls_setjmp)
    {
      int i;

      for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
        if (!call_used_regs[i])
          regs_ever_live[i] = 1;
    }
#endif

  last_filename = locator_file (prologue_locator);
  last_linenum = locator_line (prologue_locator);

  high_block_linenum = high_function_linenum = last_linenum;

  (*debug_hooks->begin_prologue) (last_linenum, last_filename);

#if defined (DWARF2_UNWIND_INFO) || defined (IA64_UNWIND_INFO)
  if (write_symbols != DWARF2_DEBUG && write_symbols != VMS_AND_DWARF2_DEBUG)
    dwarf2out_begin_prologue (0, NULL);
#endif

#ifdef LEAF_REG_REMAP
  if (current_function_uses_only_leaf_regs)
    leaf_renumber_regs (first);
#endif

  /* The Sun386i and perhaps other machines don't work right
     if the profiling code comes after the prologue.  */
#ifdef PROFILE_BEFORE_PROLOGUE
  if (current_function_profile)
    profile_function (file);
#endif /* PROFILE_BEFORE_PROLOGUE */

#if defined (DWARF2_UNWIND_INFO) && defined (HAVE_prologue)
  if (dwarf2out_do_frame ())
    dwarf2out_frame_debug (NULL_RTX);
#endif

  /* If debugging, assign block numbers to all of the blocks in this
     function.  */
  if (write_symbols)
    {
      remove_unnecessary_notes ();
      scope_to_insns_finalize ();		// ****** THIS WAS ADDED ******
      reemit_insn_block_notes ();
      number_blocks (current_function_decl);
      /* We never actually put out begin/end notes for the top-level
         block in the function.  But, conceptually, that block is
         always needed.  */
      TREE_ASM_WRITTEN (DECL_INITIAL (current_function_decl)) = 1;
    }

  /* First output the function prologue: code to set up the stack frame.  */
  (*targetm.asm_out.function_prologue) (file, get_frame_size ());

  /* If the machine represents the prologue as RTL, the profiling code must
     be emitted when NOTE_INSN_PROLOGUE_END is scanned.  */
#ifdef HAVE_prologue
  if (! HAVE_prologue)
#endif
    profile_after_prologue (file);
}


The question is thus: Why does this have the wanted effect?
What would be a real fix for this prologue w/ inlining problem?

Thanks for time and help.

-- 
Carlo Wood <carlo@alinoe.com>


^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: Need help with fixing this two year old backend bug in mainline.
  2003-09-19 15:28 Need help with fixing this two year old backend bug in mainline Carlo Wood
@ 2003-09-23  4:09 ` Jim Wilson
  0 siblings, 0 replies; 2+ messages in thread
From: Jim Wilson @ 2003-09-23  4:09 UTC (permalink / raw)
  To: Carlo Wood; +Cc: gcc, gcc-bugs

Carlo Wood wrote:
> I am trying to make a start with PR 10003,
> which 4431 and 12319 marked as 'duplicate'.

> +       scope_to_insns_finalize ();
> Unfortunately, these functions are gone in 3.4.

I think these are gone because block info is now inside the insn rtx. 
Thus there is no longer any need to put block info in an external array 
and then sort it to match the insn stream.  The reason for the sort is 
to account for effects of things like instruction scheduling, but now 
that the block info gets scheduled along with the instructions, there is 
no longer any need for an extra sort.

This probably fixes the problem as a side-effect by moving (or 
creating?) a block note for the inlined function after the prologue.

Note that the prologue does not get emitted until after reload.  We 
can't emit it until then because we don't know which registers to save, 
and the size of the frame until then.

I am guessing that the problem is one or both of two problems:
1) When the prologue is inserted by thread_prologue_and_epilogue_insns, 
and the first block of the function is empty, it gets put in the wrong 
place.  It is inserted in the first block of the inlined function 
instead of before it, into the empty first block of the function.
2) We are too eagerly optimizing away empty blocks before the prologue 
gets inserted.  We should preserve an empty block to hold the prologue 
until after reload, to make sure that thread_prologue_and_epilogue_insns 
has a place to put the instructions it emits.

Exactly how to fix this depends on how the cfg routines are supposed to 
work.

I don't think this is the same problem as 4435, as that is apparently a 
scheduler/debug interaction problem, and the cfg stuff didn't exist 3 
years ago.
-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2003-09-23  0:12 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-09-19 15:28 Need help with fixing this two year old backend bug in mainline Carlo Wood
2003-09-23  4:09 ` Jim Wilson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).