public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* Need help with fixing this two year old backend bug in mainline.
@ 2003-09-19 15:28 Carlo Wood
2003-09-23 4:09 ` Jim Wilson
0 siblings, 1 reply; 2+ messages in thread
From: Carlo Wood @ 2003-09-19 15:28 UTC (permalink / raw)
To: gcc-bugs, gcc
I am trying to make a start with PR 10003,
which 4431 and 12319 marked as 'duplicate'.
PR 10003 was opened at 2003-03-08
but PR 4431 dates back to 2001-09-30 !
What this bug basically comes down to
is that the block labels (.LBB and .LBE)
that are put around an inlined function
are invaded by the prologue and epilogue
of the function that is calling the inlined
function.
Example:
f:
.LBB1:
<prologue of f()>
<inlined function>
<epilogue of f()>
.LBE1:
*Should* have been:
f:
<prologue of f()>
.LBB1:
<inlined function>
.LBE1:
<epilogue of f()>
After making an easy-to-reproduce testcase and
discussing this with friends online, it was brought
to my attention that gcc 3.3.x did NOT put the
prologue inside the inlined block. Versions
3.2 and before as well as 3.4 (mainline) DO include
the prologue inside this block though.
I used a binary search on CVS dates to find that
the patch that "fixed" this (accidently apparently)
was made on 2002-06-02. The ChangeLog entry was
rather big, but I was able to reduce the patch
further down till two lines:
+ scope_to_insns_initialize ();
and
+ scope_to_insns_finalize ();
in two different files.
Unfortunately, these functions are gone in 3.4.
I think however that when I'd *understand* why this is
causing a partial fix for 3.3, I'd also be able to fix
mainline. But, I cannot understand it because I am
totally clueless when it comes to understanding the
gcc internals :(.
Therefore my plead for some help; Please look at the
diff below that I produced and that fixes half of the
problem (the prologue) for 3.4, and tell me WHY you
think this fixes it - and what would be a REAL fix.
Assembly output before applying this patch:
f:
.LFB4:
.file 1 "troep.c"
.loc 1 5 0
.LBB2:
.LBB3:
pushl %ebp
.LCFI0:
movl %esp, %ebp
.LCFI1:
.loc 1 3 0
incl a
.loc 1 7 0
leave
ret
.LBE3:
.LBE2:
.LFE4:
.size f, .-f
Assembly output after applying this patch:
f:
.LFB4:
.file 1 "troep.c"
.loc 1 5 0
pushl %ebp
.LCFI0:
movl %esp, %ebp
.LCFI1:
.LBB2:
.LBB3:
.loc 1 3 0
incl a
.loc 1 7 0
leave
ret
.LBE3:
.LBE2:
.LFE4:
.size f, .-f
And the diff of that:
***************
*** 13,24 ****
.LFB4:
.file 1 "troep.c"
.loc 1 5 0
- .LBB2:
- .LBB3:
pushl %ebp
.LCFI0:
movl %esp, %ebp
.LCFI1:
.loc 1 3 0
incl a
.loc 1 7 0
--- 13,24 ----
.LFB4:
.file 1 "troep.c"
.loc 1 5 0
pushl %ebp
.LCFI0:
movl %esp, %ebp
.LCFI1:
+ .LBB2:
+ .LBB3:
.loc 1 3 0
incl a
.loc 1 7 0
And finally the patch that is causing this difference, that
is a step in the right direction (patch won't APPLY, I removed
the ^L from it because I was afraid it would interfere with
the mail):
Note that I simply/stupidly added scope_to_insns_initialize()
and scope_to_insns_finalize() from the 3.3 branch to get it
to compile.
Index: toplev.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.818
diff -u -d -p -r1.818 toplev.c
--- toplev.c 14 Aug 2003 01:47:30 -0000 1.818
+++ toplev.c 19 Sep 2003 13:42:05 -0000
@@ -2978,6 +2978,46 @@ rest_of_handle_loop2 (tree decl, rtx ins
ggc_collect ();
}
+/* Build a varray mapping INSN_UID to lexical block. Return it. */
+
+static void scope_to_insns_initialize (void);
+extern varray_type insn_scopes;
+
+static void
+scope_to_insns_initialize ()
+{
+ tree block = NULL;
+ rtx insn, next;
+
+ VARRAY_TREE_INIT (insn_scopes, get_max_uid (), "insn scopes");
+
+ for (insn = get_insns (); insn; insn = next)
+ {
+ next = NEXT_INSN (insn);
+
+ if (active_insn_p (insn)
+ && GET_CODE (PATTERN (insn)) != ADDR_VEC
+ && GET_CODE (PATTERN (insn)) != ADDR_DIFF_VEC)
+ VARRAY_TREE (insn_scopes, INSN_UID (insn)) = block;
+ else if (GET_CODE (insn) == NOTE)
+ {
+ switch (NOTE_LINE_NUMBER (insn))
+ {
+ case NOTE_INSN_BLOCK_BEG:
+ block = NOTE_BLOCK (insn);
+ delete_insn (insn);
+ break;
+ case NOTE_INSN_BLOCK_END:
+ block = BLOCK_SUPERCONTEXT (block);
+ delete_insn (insn);
+ break;
+ default:
+ break;
+ }
+ }
+ }
+}
+
/* This is called from finish_function (within langhooks.parse_file)
after each top-level definition is parsed.
It is supposed to compile that function or variable
@@ -3091,6 +3131,7 @@ rest_of_compilation (tree decl)
timevar_pop (TV_JUMP);
+ scope_to_insns_initialize ();
insn_locators_initialize ();
/* Complete generation of exception handling code. */
if (doing_eh (0))
Index: final.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/final.c,v
retrieving revision 1.289
diff -u -d -p -r1.289 final.c
--- final.c 30 Jul 2003 17:27:06 -0000 1.289
+++ final.c 19 Sep 2003 13:42:09 -0000
@@ -1318,6 +1318,110 @@ asm_insn_count (rtx body)
}
#endif
+static void set_block_levels (tree, int);
+static void scope_to_insns_finalize (void);
+static void change_scope (rtx, tree, tree);
+
+static void
+set_block_levels (block, level)
+ tree block;
+ int level;
+{
+ while (block)
+ {
+ BLOCK_NUMBER (block) = level;
+ set_block_levels (BLOCK_SUBBLOCKS (block), level + 1);
+ block = BLOCK_CHAIN (block);
+ }
+}
+
+varray_type insn_scopes;
+
+static void
+change_scope (orig_insn, s1, s2)
+ rtx orig_insn;
+ tree s1, s2;
+{
+ rtx insn = orig_insn;
+ tree com = NULL_TREE;
+ tree ts1 = s1, ts2 = s2;
+ tree s;
+
+ while (ts1 != ts2)
+ {
+ if (ts1 == NULL || ts2 == NULL)
+ abort ();
+ if (BLOCK_NUMBER (ts1) > BLOCK_NUMBER (ts2))
+ ts1 = BLOCK_SUPERCONTEXT (ts1);
+ else if (BLOCK_NUMBER (ts1) < BLOCK_NUMBER (ts2))
+ ts2 = BLOCK_SUPERCONTEXT (ts2);
+ else
+ {
+ ts1 = BLOCK_SUPERCONTEXT (ts1);
+ ts2 = BLOCK_SUPERCONTEXT (ts2);
+ }
+ }
+ com = ts1;
+
+ /* Close scopes. */
+ s = s1;
+ while (s != com)
+ {
+ rtx note = emit_note_before (NOTE_INSN_BLOCK_END, insn);
+ NOTE_BLOCK (note) = s;
+ s = BLOCK_SUPERCONTEXT (s);
+ }
+
+ /* Open scopes. */
+ s = s2;
+ while (s != com)
+ {
+ insn = emit_note_before (NOTE_INSN_BLOCK_BEG, insn);
+ NOTE_BLOCK (insn) = s;
+ s = BLOCK_SUPERCONTEXT (s);
+ }
+}
+
+/* Rebuild all the NOTE_INSN_BLOCK_BEG and NOTE_INSN_BLOCK_END notes based
+ on the scope tree and the newly reordered instructions. */
+
+static void
+scope_to_insns_finalize ()
+{
+ tree cur_block = DECL_INITIAL (cfun->decl);
+ rtx insn, note;
+
+ /* Tag the blocks with a depth number so that change_scope can find
+ the common parent easily. */
+ set_block_levels (cur_block, 0);
+
+ for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
+ {
+ tree this_block;
+
+ if ((size_t) INSN_UID (insn) >= insn_scopes->num_elements)
+ continue;
+ this_block = VARRAY_TREE (insn_scopes, INSN_UID (insn));
+ if (! this_block)
+ continue;
+
+ if (this_block != cur_block)
+ {
+ change_scope (insn, cur_block, this_block);
+ cur_block = this_block;
+ }
+ }
+
+ VARRAY_FREE (insn_scopes);
+
+ /* change_scope emits before the insn, not after. */
+ note = emit_note (NOTE_INSN_DELETED);
+ change_scope (note, cur_block, DECL_INITIAL (cfun->decl));
+ delete_insn (note);
+
+ reorder_blocks ();
+}
+
/* Output assembler code for the start of a function,
and initialize some of the variables in this file
for the new function. The label for the function and associated
@@ -1383,6 +1487,7 @@ final_start_function (rtx first ATTRIBUT
if (write_symbols)
{
remove_unnecessary_notes ();
+ scope_to_insns_finalize ();
reemit_insn_block_notes ();
number_blocks (current_function_decl);
/* We never actually put out begin/end notes for the top-level
===========================================================================
In order to make it easier for you to comment on this, here are also
some more context of the functions to which the
scope_to_insns_initialize()/scope_to_insns_finalize() were added:
(from 3.4 thus)
gcc/toplev.c:
void
rest_of_compilation (tree decl)
{
rtx insns;
int rebuild_label_notes_after_reload;
timevar_push (TV_REST_OF_COMPILATION);
/* Register rtl specific functions for cfg. */
rtl_register_cfg_hooks ();
[...]
timevar_push (TV_JUMP);
open_dump_file (DFI_sibling, decl);
insns = get_insns ();
rebuild_jump_labels (insns);
find_exception_handler_labels ();
find_basic_blocks (insns, max_reg_num (), rtl_dump_file);
delete_unreachable_blocks ();
/* We have to issue these warnings now already, because CFG cleanups
further down may destroy the required information. */
check_function_return_warnings ();
/* Turn NOTE_INSN_PREDICTIONs into branch predictions. */
if (flag_guess_branch_prob)
{
timevar_push (TV_BRANCH_PROB);
note_prediction_to_br_prob ();
timevar_pop (TV_BRANCH_PROB);
}
if (flag_optimize_sibling_calls)
rest_of_handle_sibling_calls (insns);
timevar_pop (TV_JUMP);
scope_to_insns_initialize (); // ****** THIS WAS ADDED ******
insn_locators_initialize ();
/* Complete generation of exception handling code. */
if (doing_eh (0))
{
timevar_push (TV_JUMP);
open_dump_file (DFI_eh, decl);
finish_eh_generation ();
close_dump_file (DFI_eh, print_rtl, get_insns ());
timevar_pop (TV_JUMP);
}
/* Delay emitting hard_reg_initial_value sets until after EH landing pad
generation, which might create new sets. */
emit_initial_value_sets ();
[...]
}
gcc/final.c:
void
final_start_function (rtx first ATTRIBUTE_UNUSED, FILE *file,
int optimize ATTRIBUTE_UNUSED)
{
block_depth = 0;
this_is_asm_operands = 0;
#ifdef NON_SAVING_SETJMP
/* A function that calls setjmp should save and restore all the
call-saved registers on a system where longjmp clobbers them. */
if (NON_SAVING_SETJMP && current_function_calls_setjmp)
{
int i;
for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
if (!call_used_regs[i])
regs_ever_live[i] = 1;
}
#endif
last_filename = locator_file (prologue_locator);
last_linenum = locator_line (prologue_locator);
high_block_linenum = high_function_linenum = last_linenum;
(*debug_hooks->begin_prologue) (last_linenum, last_filename);
#if defined (DWARF2_UNWIND_INFO) || defined (IA64_UNWIND_INFO)
if (write_symbols != DWARF2_DEBUG && write_symbols != VMS_AND_DWARF2_DEBUG)
dwarf2out_begin_prologue (0, NULL);
#endif
#ifdef LEAF_REG_REMAP
if (current_function_uses_only_leaf_regs)
leaf_renumber_regs (first);
#endif
/* The Sun386i and perhaps other machines don't work right
if the profiling code comes after the prologue. */
#ifdef PROFILE_BEFORE_PROLOGUE
if (current_function_profile)
profile_function (file);
#endif /* PROFILE_BEFORE_PROLOGUE */
#if defined (DWARF2_UNWIND_INFO) && defined (HAVE_prologue)
if (dwarf2out_do_frame ())
dwarf2out_frame_debug (NULL_RTX);
#endif
/* If debugging, assign block numbers to all of the blocks in this
function. */
if (write_symbols)
{
remove_unnecessary_notes ();
scope_to_insns_finalize (); // ****** THIS WAS ADDED ******
reemit_insn_block_notes ();
number_blocks (current_function_decl);
/* We never actually put out begin/end notes for the top-level
block in the function. But, conceptually, that block is
always needed. */
TREE_ASM_WRITTEN (DECL_INITIAL (current_function_decl)) = 1;
}
/* First output the function prologue: code to set up the stack frame. */
(*targetm.asm_out.function_prologue) (file, get_frame_size ());
/* If the machine represents the prologue as RTL, the profiling code must
be emitted when NOTE_INSN_PROLOGUE_END is scanned. */
#ifdef HAVE_prologue
if (! HAVE_prologue)
#endif
profile_after_prologue (file);
}
The question is thus: Why does this have the wanted effect?
What would be a real fix for this prologue w/ inlining problem?
Thanks for time and help.
--
Carlo Wood <carlo@alinoe.com>
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: Need help with fixing this two year old backend bug in mainline.
2003-09-19 15:28 Need help with fixing this two year old backend bug in mainline Carlo Wood
@ 2003-09-23 4:09 ` Jim Wilson
0 siblings, 0 replies; 2+ messages in thread
From: Jim Wilson @ 2003-09-23 4:09 UTC (permalink / raw)
To: Carlo Wood; +Cc: gcc, gcc-bugs
Carlo Wood wrote:
> I am trying to make a start with PR 10003,
> which 4431 and 12319 marked as 'duplicate'.
> + scope_to_insns_finalize ();
> Unfortunately, these functions are gone in 3.4.
I think these are gone because block info is now inside the insn rtx.
Thus there is no longer any need to put block info in an external array
and then sort it to match the insn stream. The reason for the sort is
to account for effects of things like instruction scheduling, but now
that the block info gets scheduled along with the instructions, there is
no longer any need for an extra sort.
This probably fixes the problem as a side-effect by moving (or
creating?) a block note for the inlined function after the prologue.
Note that the prologue does not get emitted until after reload. We
can't emit it until then because we don't know which registers to save,
and the size of the frame until then.
I am guessing that the problem is one or both of two problems:
1) When the prologue is inserted by thread_prologue_and_epilogue_insns,
and the first block of the function is empty, it gets put in the wrong
place. It is inserted in the first block of the inlined function
instead of before it, into the empty first block of the function.
2) We are too eagerly optimizing away empty blocks before the prologue
gets inserted. We should preserve an empty block to hold the prologue
until after reload, to make sure that thread_prologue_and_epilogue_insns
has a place to put the instructions it emits.
Exactly how to fix this depends on how the cfg routines are supposed to
work.
I don't think this is the same problem as 4435, as that is apparently a
scheduler/debug interaction problem, and the cfg stuff didn't exist 3
years ago.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2003-09-23 0:12 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-09-19 15:28 Need help with fixing this two year old backend bug in mainline Carlo Wood
2003-09-23 4:09 ` Jim Wilson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).