* postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
@ 2005-03-31 17:10 Joern RENNECKE
2005-04-03 14:46 ` Mostafa Hagog
0 siblings, 1 reply; 10+ messages in thread
From: Joern RENNECKE @ 2005-03-31 17:10 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 1212 bytes --]
The redundancy elimination used by postreload-gcse can necessitate to
split an edge,
which is not possible when we can't emit new jumps. I have therefore
added the
same check as we already had in bb-reorder.c:reorder_basic_blocks as an
obvious fix.
It appears that more and more passes are added that simply can't work
for SH5 SHmedia
code because of this inability to generate jumps after reload. A
possible solution is to
reserve a call-clobbered target register during register allocation and
the first branch target
optimization phase, so that we can generate (atrociously scheduled)
pta/blink pairs after
reload to expand a jump. The would then have to be cleaned up by the
second branch
target optimization pass. However, it is not possible to get a useful
measure of the target
register pressure without the first branch target registyer optimization
pass, so we'd have
to run them both, which also means that we need separate dump files.
Also some new
infrastructure is needed to reserve a target register. All in all, this
is a bit too much right
to add to my pile of pending patches, so I'm tabling this till I've made
some serious progress
with the merge of the existing patches.
[-- Attachment #2: tmp2 --]
[-- Type: text/plain, Size: 2518 bytes --]
2005-03-31 J"orn Rennecke <joern.rennecke@st.com>
* postreload-gcse.c: Include target.h.
(gcse_after_reload_main): Return early if we cannot modify jumps.
* Makefile.in (postreload-gcse.o): Depend on $(TARGET_H).
Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Makefile.in,v
retrieving revision 1.1455
diff -p -r1.1455 Makefile.in
*** Makefile.in 21 Mar 2005 17:58:06 -0000 1.1455
--- Makefile.in 31 Mar 2005 15:43:21 -0000
*************** postreload.o : postreload.c $(CONFIG_H)
*** 2115,2121 ****
postreload-gcse.o : postreload-gcse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
$(RTL_H) $(REGS_H) hard-reg-set.h $(FLAGS_H) real.h insn-config.h $(GGC_H) \
$(RECOG_H) $(EXPR_H) $(BASIC_BLOCK_H) function.h output.h toplev.h $(TM_P_H) \
! except.h $(TREE_H)
caller-save.o : caller-save.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
$(FLAGS_H) $(REGS_H) hard-reg-set.h insn-config.h $(BASIC_BLOCK_H) function.h \
$(RECOG_H) reload.h $(EXPR_H) toplev.h $(TM_P_H)
--- 2115,2121 ----
postreload-gcse.o : postreload-gcse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
$(RTL_H) $(REGS_H) hard-reg-set.h $(FLAGS_H) real.h insn-config.h $(GGC_H) \
$(RECOG_H) $(EXPR_H) $(BASIC_BLOCK_H) function.h output.h toplev.h $(TM_P_H) \
! except.h $(TREE_H) $(TARGET_H)
caller-save.o : caller-save.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
$(FLAGS_H) $(REGS_H) hard-reg-set.h insn-config.h $(BASIC_BLOCK_H) function.h \
$(RECOG_H) reload.h $(EXPR_H) toplev.h $(TM_P_H)
Index: postreload-gcse.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/postreload-gcse.c,v
retrieving revision 2.9
diff -p -r2.9 postreload-gcse.c
*** postreload-gcse.c 18 Jan 2005 11:36:16 -0000 2.9
--- postreload-gcse.c 31 Mar 2005 15:43:21 -0000
*************** Software Foundation, 59 Temple Place - S
*** 43,48 ****
--- 43,49 ----
#include "obstack.h"
#include "hashtab.h"
#include "params.h"
+ #include "target.h"
/* The following code implements gcse after reload, the purpose of this
pass is to cleanup redundant loads generated by reload and other
*************** delete_redundant_insns (void)
*** 1283,1288 ****
--- 1284,1293 ----
void
gcse_after_reload_main (rtx f ATTRIBUTE_UNUSED)
{
+
+ if (targetm.cannot_modify_jumps_p ())
+ return;
+
memset (&stats, 0, sizeof (stats));
/* Allocate ememory for this pass.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
2005-03-31 17:10 postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't Joern RENNECKE
@ 2005-04-03 14:46 ` Mostafa Hagog
2005-04-04 15:19 ` Joern RENNECKE
0 siblings, 1 reply; 10+ messages in thread
From: Mostafa Hagog @ 2005-04-03 14:46 UTC (permalink / raw)
To: Joern RENNECKE; +Cc: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 1111 bytes --]
gcc-patches-owner@gcc.gnu.org wrote on 31/03/2005 20:00:36:
> The redundancy elimination used by postreload-gcse can necessitate to
> split an edge,
> which is not possible when we can't emit new jumps. I have therefore
> added the
> same check as we already had in bb-reorder.c:reorder_basic_blocks as an
> obvious fix.
A more appropriate fix is to disable the gcse only for those cases that you
need an edge split; add the targetm.cannot_modify_jumps_p () condition to
eliminate_partially_redundant_load. I had a patch to do a similar thing
when we don't have profiling information; in such a case we cannot be sure
that the split worth it (even when we remove redundant loads). I have
update the patch to catch also cases where the machine doesn't support
adding jumps after reload. here is the modified patch, didn't test it yet
let me know what do you think ?
Mostafa.
2005-03-31 Mostafa Hagog <mustafa@il.ibm.com>
* postreload-gcse.c (eliminate_partially_redundant_load): Don't split
edge when it not
profitable or not possible.
(See attached file: postreload_gcse_nopdf4.patch)
[-- Attachment #2: postreload_gcse_nopdf4.patch --]
[-- Type: application/octet-stream, Size: 3767 bytes --]
Index: postreload-gcse.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/postreload-gcse.c,v
retrieving revision 2.11
diff -c -p -r2.11 postreload-gcse.c
*** postreload-gcse.c 1 Apr 2005 03:42:44 -0000 2.11
--- postreload-gcse.c 3 Apr 2005 14:35:11 -0000
*************** eliminate_partially_redundant_load (basi
*** 1017,1022 ****
--- 1017,1023 ----
gcov_type ok_count = 0; /* Redundant load execution count. */
gcov_type critical_count = 0; /* Execution count of critical edges. */
edge_iterator ei;
+ bool critical_edge_split = false;
/* The execution count of the loads to be added to make the
load fully redundant. */
*************** eliminate_partially_redundant_load (basi
*** 1037,1042 ****
--- 1038,1044 ----
rtx next_pred_bb_end;
avail_insn = NULL_RTX;
+ avail_reg = NULL_RTX;
pred_bb = pred->src;
next_pred_bb_end = NEXT_INSN (BB_END (pred_bb));
for (a_occr = get_bb_avail_insn (pred_bb, expr->avail_occr); a_occr;
*************** eliminate_partially_redundant_load (basi
*** 1072,1077 ****
--- 1074,1088 ----
{
npred_ok++;
ok_count += pred->count;
+ if (! set_noop_p (PATTERN (gen_move_insn (copy_rtx (dest),
+ copy_rtx (avail_reg)))))
+ {
+ /* Check if there is going to be a split. */
+ if (EDGE_CRITICAL_P (pred))
+ critical_edge_split = true;
+ }
+ else /* Its a dead move no need to generate. */
+ continue;
occr = (struct unoccr *) obstack_alloc (&unoccr_obstack,
sizeof (struct occr));
occr->insn = avail_insn;
*************** eliminate_partially_redundant_load (basi
*** 1083,1088 ****
--- 1094,1102 ----
}
else
{
+ /* Adding a load on a critical edge will cuase a split. */
+ if (EDGE_CRITICAL_P (pred))
+ critical_edge_split = true;
not_ok_count += pred->count;
unoccr = (struct unoccr *) obstack_alloc (&unoccr_obstack,
sizeof (struct unoccr));
*************** eliminate_partially_redundant_load (basi
*** 1098,1104 ****
if (/* No load can be replaced by copy. */
npred_ok == 0
/* Prevent exploding the code. */
! || (optimize_size && npred_ok > 1))
goto cleanup;
/* Check if it's worth applying the partial redundancy elimination. */
--- 1112,1123 ----
if (/* No load can be replaced by copy. */
npred_ok == 0
/* Prevent exploding the code. */
! || (optimize_size && npred_ok > 1)
! /* If we don't have profile information we cannot tell if splitting
! a critical edge is profitable or not so don't do it. */
! || ((! profile_info || ! flag_branch_probabilities
! || targetm.cannot_modify_jumps_p ())
! && critical_edge_split))
goto cleanup;
/* Check if it's worth applying the partial redundancy elimination. */
*************** eliminate_partially_redundant_load (basi
*** 1159,1165 ****
a_occr = get_bb_avail_insn (bb, a_occr->next));
if (!a_occr)
! delete_insn (insn);
else
a_occr->deleted_p = 1;
--- 1178,1194 ----
a_occr = get_bb_avail_insn (bb, a_occr->next));
if (!a_occr)
! {
! stats.insns_deleted++;
!
! if (dump_file)
! {
! fprintf (dump_file, "deleting insn:\n");
! print_rtl_single (dump_file, insn);
! fprintf (dump_file, "\n");
! }
! delete_insn (insn);
! }
else
a_occr->deleted_p = 1;
*************** void
*** 1285,1293 ****
gcse_after_reload_main (rtx f ATTRIBUTE_UNUSED)
{
- if (targetm.cannot_modify_jumps_p ())
- return;
-
memset (&stats, 0, sizeof (stats));
/* Allocate ememory for this pass.
--- 1314,1319 ----
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
2005-04-03 14:46 ` Mostafa Hagog
@ 2005-04-04 15:19 ` Joern RENNECKE
2005-04-04 21:08 ` Joern RENNECKE
0 siblings, 1 reply; 10+ messages in thread
From: Joern RENNECKE @ 2005-04-04 15:19 UTC (permalink / raw)
To: Mostafa Hagog; +Cc: gcc-patches, Ulrich.Weigand
Mostafa Hagog wrote:
>
>
>A more appropriate fix is to disable the gcse only for those cases that you
>need an edge split; add the targetm.cannot_modify_jumps_p () condition to
>eliminate_partially_redundant_load. I had a patch to do a similar thing
>when we don't have profiling information; in such a case we cannot be sure
>that the split worth it (even when we remove redundant loads). I have
>update the patch to catch also cases where the machine doesn't support
>adding jumps after reload. here is the modified patch, didn't test it yet
>let me know what do you think ?
>
If this indeed coverts all the cases where edge splits might be caused,
then this is a preferable
solution to the one I put in earlier. A test build of sh64-elf suceeded
with your patch, but
it will maybe another hour for the regression check. Strangely enough,
one of the
execution failures seems to have been 'cured'.
For the profiling information problem, it certainly would be better if
we could generate some
edge counts that are consistent with the assumed branch probabilities.
For a single function,
you have basically a system of linear equations, and you can assign an
arbitrary large number
for the entry edge to get a set of values. This is algorithmically
simple, although it can become
computationally quite expensive for complex flowgraphs. If you consider
multiple functions,
however, it gets worse, because the edge counts should be consistent
across functions so that
inlining decisions make sense. To address function call/return and
mutual recursion of direct
calls, you could mash the entire translation unit into a big linear
equation system, but this
looks like it's pretty much guaranteed to be too expensive to solve for
most translation units.
And it doesn't even begin to address vtables or indirect function calls
in general.
So, ironically, here we have a situation where global functions can be
optimized more efficiently
than static (i.e. ! TREE_PUBLIC) ones. For a global function. if we
don't have profiling
information, we may assume that most of the calls come from outside of
the module, and
assign an arbitrary large execution count to the entry edge. For most
real-world flowgraphs,
the linear equation sytstem can probably be solved in reasonable time.
We should probably
skip the cases where a cheap solution cannot be found with whatever
algorithms we build
into the compiler.
Thus, while it appears desirable and feasible to have estimated edge
execution count information
for some functions, it doesn't appear feasible to have it for all
functions. Hence, we need a
mechanism to deal with the case of not having execution count information.
Therefore, I think your patch makes sense. If/when we get estimated
edge execution count
information for some functions, we can refine the ! profile_info test to
take this into account.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
2005-04-04 15:19 ` Joern RENNECKE
@ 2005-04-04 21:08 ` Joern RENNECKE
2005-04-05 8:25 ` Mostafa Hagog
0 siblings, 1 reply; 10+ messages in thread
From: Joern RENNECKE @ 2005-04-04 21:08 UTC (permalink / raw)
To: Mostafa Hagog; +Cc: gcc-patches
In the sources from CVS 2005.04.01.17.00.00, there is one regression for
sh64/-m5-compact/-ml:
< PASS: tmpdir-gcc.dg-struct-layout-1/t011
c_compat_x_tst.o-c_compat_y_tst.o execute
---
> FAIL: tmpdir-gcc.dg-struct-layout-1/t011
c_compat_x_tst.o-c_compat_y_tst.o execute
In the sources from CVS 2005.03.23.12.00.00, with my merged patches,
one regression is fixed for sh64:
-FAIL: tmpdir-gcc.dg-struct-layout-1/t021
c_compat_x_tst.o-c_compat_y_tst.o execute
+PASS: tmpdir-gcc.dg-struct-layout-1/t021
c_compat_x_tst.o-c_compat_y_tst.o execute
But one new one appears for sh64/-m5-compact/-ml:
-PASS: tmpdir-gcc.dg-struct-layout-1/t024
c_compat_x_tst.o-c_compat_y_tst.o execute
+FAIL: tmpdir-gcc.dg-struct-layout-1/t024
c_compat_x_tst.o-c_compat_y_tst.o execute
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
2005-04-04 21:08 ` Joern RENNECKE
@ 2005-04-05 8:25 ` Mostafa Hagog
2005-04-05 11:53 ` Joern RENNECKE
0 siblings, 1 reply; 10+ messages in thread
From: Mostafa Hagog @ 2005-04-05 8:25 UTC (permalink / raw)
To: Joern RENNECKE; +Cc: gcc-patches
I am testing the patch on powerpc and x86 also, I will see what we get
there.
Joern RENNECKE <joern.rennecke@st.com> wrote on 05/04/2005 00:08:37:
> In the sources from CVS 2005.04.01.17.00.00, there is one regression for
> sh64/-m5-compact/-ml:
> < PASS: tmpdir-gcc.dg-struct-layout-1/t011
> c_compat_x_tst.o-c_compat_y_tst.o execute
> ---
> > FAIL: tmpdir-gcc.dg-struct-layout-1/t011
> c_compat_x_tst.o-c_compat_y_tst.o execute
Is this regression happen after you apply the patches ?
>
> In the sources from CVS 2005.03.23.12.00.00, with my merged patches,
> one regression is fixed for sh64:
> -FAIL: tmpdir-gcc.dg-struct-layout-1/t021
> c_compat_x_tst.o-c_compat_y_tst.o execute
> +PASS: tmpdir-gcc.dg-struct-layout-1/t021
> c_compat_x_tst.o-c_compat_y_tst.o execute
>
> But one new one appears for sh64/-m5-compact/-ml:
> -PASS: tmpdir-gcc.dg-struct-layout-1/t024
> c_compat_x_tst.o-c_compat_y_tst.o execute
> +FAIL: tmpdir-gcc.dg-struct-layout-1/t024
> c_compat_x_tst.o-c_compat_y_tst.o execute
To be able to debug this I must at least look at dump. Since I don't have
the sh64 system; can you please send me those dumps
Mostafa.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
2005-04-05 8:25 ` Mostafa Hagog
@ 2005-04-05 11:53 ` Joern RENNECKE
2005-04-05 12:55 ` Joern RENNECKE
2005-04-05 14:21 ` Mostafa Hagog
0 siblings, 2 replies; 10+ messages in thread
From: Joern RENNECKE @ 2005-04-05 11:53 UTC (permalink / raw)
To: Mostafa Hagog; +Cc: gcc-patches
Mostafa Hagog wrote:
>
>
>To be able to debug this I must at least look at dump. Since I don't have
>the sh64 system; can you please send me those dumps
>
I'm running these tests on a simulator. Sources together with pre-built
Red Hat
Linux binaries for gdb with integrated simulator and a stand-alone simulator
are available at:
ftp://sources.redhat.com/pub/gdb/contrib/superh/gdb+dejagnu-20020310-sh64-20040702.tgz
or your friendly neighbourhood mirror.
I'm not sure what dumps you want. Gcc dumps? Core dumps? At any rate,
debugging
under gdb is generally more useful.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
2005-04-05 11:53 ` Joern RENNECKE
@ 2005-04-05 12:55 ` Joern RENNECKE
2005-04-05 14:21 ` Mostafa Hagog
1 sibling, 0 replies; 10+ messages in thread
From: Joern RENNECKE @ 2005-04-05 12:55 UTC (permalink / raw)
To: Mostafa Hagog; +Cc: gcc-patches
I've had a look at the tmpdir-gcc.dg-struct-layout-1/t021 failure, and
the problem appears to be that the struct-layout
tests are way too large - about half a megabyte. Adding a byte or two
can cause a bss / stack collision.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
2005-04-05 11:53 ` Joern RENNECKE
2005-04-05 12:55 ` Joern RENNECKE
@ 2005-04-05 14:21 ` Mostafa Hagog
2005-04-05 15:44 ` Joern RENNECKE
2005-04-12 14:38 ` Joern RENNECKE
1 sibling, 2 replies; 10+ messages in thread
From: Mostafa Hagog @ 2005-04-05 14:21 UTC (permalink / raw)
To: Joern RENNECKE; +Cc: gcc-patches
Joern RENNECKE <joern.rennecke@st.com> wrote on 05/04/2005 14:53:24:
> Mostafa Hagog wrote:
>
> >
> >
> >To be able to debug this I must at least look at dump. Since I don't
have
> >the sh64 system; can you please send me those dumps
> >
> I'm running these tests on a simulator. Sources together with pre-built
> Red Hat
> Linux binaries for gdb with integrated simulator and a stand-alone
simulator
> are available at:
> ftp://sources.redhat.
> com/pub/gdb/contrib/superh/gdb+dejagnu-20020310-sh64-20040702.tgz
> or your friendly neighbourhood mirror.
>
> I'm not sure what dumps you want. Gcc dumps? Core dumps? At any rate,
> debugging
> under gdb is generally more useful.
>
RTL dumps of GCC can help understand what is going in postreload-gcse;
since already have GCC for sh64, can you produce the RTL dumps for this
test case ?
Thanks,
Mostafa.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
2005-04-05 14:21 ` Mostafa Hagog
@ 2005-04-05 15:44 ` Joern RENNECKE
2005-04-12 14:38 ` Joern RENNECKE
1 sibling, 0 replies; 10+ messages in thread
From: Joern RENNECKE @ 2005-04-05 15:44 UTC (permalink / raw)
To: Mostafa Hagog; +Cc: gcc-patches
Mostafa Hagog wrote:
>
>
>RTL dumps of GCC can help understand what is going in postreload-gcse;
>since already have GCC for sh64, can you produce the RTL dumps for this
>test case ?
>
As I said already in a previous email, the struct-layout testcases are
simply too large.
Since postreload-gcse is expected to increase code size a bit now and
then, I don't
think your patch is to blame for any of the regressions.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
2005-04-05 14:21 ` Mostafa Hagog
2005-04-05 15:44 ` Joern RENNECKE
@ 2005-04-12 14:38 ` Joern RENNECKE
1 sibling, 0 replies; 10+ messages in thread
From: Joern RENNECKE @ 2005-04-12 14:38 UTC (permalink / raw)
To: Mostafa Hagog; +Cc: gcc-patches, kkojima
P.S.: A number of regresions are also attributable to PR middle-end/20371.
The structure layout algorithm uses uninitialized data, so a number of the
struct-layout tests return random results.
I have the patch to the sh-elf-4_1-branch banch now because this bug
makes it impossible to make a straightforward regression test.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2005-04-12 14:38 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-03-31 17:10 postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't Joern RENNECKE
2005-04-03 14:46 ` Mostafa Hagog
2005-04-04 15:19 ` Joern RENNECKE
2005-04-04 21:08 ` Joern RENNECKE
2005-04-05 8:25 ` Mostafa Hagog
2005-04-05 11:53 ` Joern RENNECKE
2005-04-05 12:55 ` Joern RENNECKE
2005-04-05 14:21 ` Mostafa Hagog
2005-04-05 15:44 ` Joern RENNECKE
2005-04-12 14:38 ` Joern RENNECKE
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).