public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
@ 2005-03-31 17:10 Joern RENNECKE
  2005-04-03 14:46 ` Mostafa Hagog
  0 siblings, 1 reply; 10+ messages in thread
From: Joern RENNECKE @ 2005-03-31 17:10 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1212 bytes --]

The redundancy elimination used by postreload-gcse can necessitate to 
split an edge,
which is not possible when we can't emit new jumps.  I have therefore 
added the
same check as we already had in  bb-reorder.c:reorder_basic_blocks as an
obvious fix.

It appears that more and more passes are added that simply can't work 
for SH5 SHmedia
code because of this inability to generate jumps after reload.  A 
possible solution is to
reserve a call-clobbered target register during register allocation and 
the first branch target
optimization phase, so that we can generate (atrociously scheduled) 
pta/blink pairs after
reload to expand a jump.  The would then have to be cleaned up by the 
second branch
target optimization pass.  However, it is not possible to get a useful 
measure of the target
register pressure without the first branch target registyer optimization 
pass, so we'd have
to run them both, which also means that we need separate dump files.   
Also some new
infrastructure is needed to reserve a target register.  All in all, this 
is a bit too much right
to add to my pile of pending patches, so I'm tabling this till I've made 
some serious progress
with the merge of the existing patches.

[-- Attachment #2: tmp2 --]
[-- Type: text/plain, Size: 2518 bytes --]

2005-03-31  J"orn Rennecke <joern.rennecke@st.com>

	* postreload-gcse.c: Include target.h.
	(gcse_after_reload_main): Return early if we cannot modify jumps.
	* Makefile.in (postreload-gcse.o): Depend on $(TARGET_H).

Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Makefile.in,v
retrieving revision 1.1455
diff -p -r1.1455 Makefile.in
*** Makefile.in	21 Mar 2005 17:58:06 -0000	1.1455
--- Makefile.in	31 Mar 2005 15:43:21 -0000
*************** postreload.o : postreload.c $(CONFIG_H) 
*** 2115,2121 ****
  postreload-gcse.o : postreload-gcse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
     $(RTL_H) $(REGS_H) hard-reg-set.h $(FLAGS_H) real.h insn-config.h $(GGC_H) \
     $(RECOG_H) $(EXPR_H) $(BASIC_BLOCK_H) function.h output.h toplev.h $(TM_P_H) \
!    except.h $(TREE_H)
  caller-save.o : caller-save.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
     $(FLAGS_H) $(REGS_H) hard-reg-set.h insn-config.h $(BASIC_BLOCK_H) function.h \
     $(RECOG_H) reload.h $(EXPR_H) toplev.h $(TM_P_H)
--- 2115,2121 ----
  postreload-gcse.o : postreload-gcse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
     $(RTL_H) $(REGS_H) hard-reg-set.h $(FLAGS_H) real.h insn-config.h $(GGC_H) \
     $(RECOG_H) $(EXPR_H) $(BASIC_BLOCK_H) function.h output.h toplev.h $(TM_P_H) \
!    except.h $(TREE_H) $(TARGET_H)
  caller-save.o : caller-save.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
     $(FLAGS_H) $(REGS_H) hard-reg-set.h insn-config.h $(BASIC_BLOCK_H) function.h \
     $(RECOG_H) reload.h $(EXPR_H) toplev.h $(TM_P_H)
Index: postreload-gcse.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/postreload-gcse.c,v
retrieving revision 2.9
diff -p -r2.9 postreload-gcse.c
*** postreload-gcse.c	18 Jan 2005 11:36:16 -0000	2.9
--- postreload-gcse.c	31 Mar 2005 15:43:21 -0000
*************** Software Foundation, 59 Temple Place - S
*** 43,48 ****
--- 43,49 ----
  #include "obstack.h"
  #include "hashtab.h"
  #include "params.h"
+ #include "target.h"
  
  /* The following code implements gcse after reload, the purpose of this
     pass is to cleanup redundant loads generated by reload and other
*************** delete_redundant_insns (void)
*** 1283,1288 ****
--- 1284,1293 ----
  void
  gcse_after_reload_main (rtx f ATTRIBUTE_UNUSED)
  {
+ 
+   if (targetm.cannot_modify_jumps_p ())
+     return;
+ 
    memset (&stats, 0, sizeof (stats));
  
    /* Allocate ememory for this pass.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
  2005-03-31 17:10 postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't Joern RENNECKE
@ 2005-04-03 14:46 ` Mostafa Hagog
  2005-04-04 15:19   ` Joern RENNECKE
  0 siblings, 1 reply; 10+ messages in thread
From: Mostafa Hagog @ 2005-04-03 14:46 UTC (permalink / raw)
  To: Joern RENNECKE; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1111 bytes --]





gcc-patches-owner@gcc.gnu.org wrote on 31/03/2005 20:00:36:

> The redundancy elimination used by postreload-gcse can necessitate to
> split an edge,
> which is not possible when we can't emit new jumps.  I have therefore
> added the
> same check as we already had in  bb-reorder.c:reorder_basic_blocks as an
> obvious fix.
A more appropriate fix is to disable the gcse only for those cases that you
need an edge split; add the targetm.cannot_modify_jumps_p () condition to
eliminate_partially_redundant_load.  I had a patch to do a similar thing
when we don't have profiling information; in such a case we cannot be sure
that the split worth it (even when we remove redundant loads).  I have
update the patch to catch also cases where the machine doesn't support
adding jumps after reload.  here is the modified patch, didn't test it yet
let me know what do you think ?

Mostafa.

2005-03-31 Mostafa Hagog <mustafa@il.ibm.com>

      * postreload-gcse.c (eliminate_partially_redundant_load): Don't split
edge when it not
      profitable or not possible.


(See attached file: postreload_gcse_nopdf4.patch)

[-- Attachment #2: postreload_gcse_nopdf4.patch --]
[-- Type: application/octet-stream, Size: 3767 bytes --]

Index: postreload-gcse.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/postreload-gcse.c,v
retrieving revision 2.11
diff -c -p -r2.11 postreload-gcse.c
*** postreload-gcse.c	1 Apr 2005 03:42:44 -0000	2.11
--- postreload-gcse.c	3 Apr 2005 14:35:11 -0000
*************** eliminate_partially_redundant_load (basi
*** 1017,1022 ****
--- 1017,1023 ----
    gcov_type ok_count = 0; /* Redundant load execution count.  */
    gcov_type critical_count = 0; /* Execution count of critical edges.  */
    edge_iterator ei;
+   bool critical_edge_split = false;
  
    /* The execution count of the loads to be added to make the
       load fully redundant.  */
*************** eliminate_partially_redundant_load (basi
*** 1037,1042 ****
--- 1038,1044 ----
        rtx next_pred_bb_end;
  
        avail_insn = NULL_RTX;
+       avail_reg = NULL_RTX;
        pred_bb = pred->src;
        next_pred_bb_end = NEXT_INSN (BB_END (pred_bb));
        for (a_occr = get_bb_avail_insn (pred_bb, expr->avail_occr); a_occr;
*************** eliminate_partially_redundant_load (basi
*** 1072,1077 ****
--- 1074,1088 ----
  	{
  	  npred_ok++;
  	  ok_count += pred->count;
+ 	  if (! set_noop_p (PATTERN (gen_move_insn (copy_rtx (dest),
+ 						    copy_rtx (avail_reg)))))
+ 	    {
+ 	      /* Check if there is going to be a split.  */
+ 	      if (EDGE_CRITICAL_P (pred))
+ 		critical_edge_split = true;
+ 	    }
+ 	  else /* Its a dead move no need to generate.  */
+ 	    continue;
  	  occr = (struct unoccr *) obstack_alloc (&unoccr_obstack,
  						  sizeof (struct occr));
  	  occr->insn = avail_insn;
*************** eliminate_partially_redundant_load (basi
*** 1083,1088 ****
--- 1094,1102 ----
  	}
        else
  	{
+ 	  /* Adding a load on a critical edge will cuase a split.  */
+ 	  if (EDGE_CRITICAL_P (pred))
+ 	    critical_edge_split = true;
  	  not_ok_count += pred->count;
  	  unoccr = (struct unoccr *) obstack_alloc (&unoccr_obstack,
  						    sizeof (struct unoccr));
*************** eliminate_partially_redundant_load (basi
*** 1098,1104 ****
    if (/* No load can be replaced by copy.  */
        npred_ok == 0
        /* Prevent exploding the code.  */ 
!       || (optimize_size && npred_ok > 1))
      goto cleanup;
  
    /* Check if it's worth applying the partial redundancy elimination.  */
--- 1112,1123 ----
    if (/* No load can be replaced by copy.  */
        npred_ok == 0
        /* Prevent exploding the code.  */ 
!       || (optimize_size && npred_ok > 1)
!       /* If we don't have profile information we cannot tell if splitting 
!          a critical edge is profitable or not so don't do it.  */
!       || ((! profile_info || ! flag_branch_probabilities
! 	   || targetm.cannot_modify_jumps_p ())
! 	  && critical_edge_split))
      goto cleanup;
  
    /* Check if it's worth applying the partial redundancy elimination.  */
*************** eliminate_partially_redundant_load (basi
*** 1159,1165 ****
         a_occr = get_bb_avail_insn (bb, a_occr->next));
  
    if (!a_occr)
!     delete_insn (insn);
    else
      a_occr->deleted_p = 1;
  
--- 1178,1194 ----
         a_occr = get_bb_avail_insn (bb, a_occr->next));
  
    if (!a_occr)
!     {
!       stats.insns_deleted++;
! 
!       if (dump_file)
! 	{
! 	  fprintf (dump_file, "deleting insn:\n");
!           print_rtl_single (dump_file, insn);
!           fprintf (dump_file, "\n");
! 	}
!       delete_insn (insn);
!     }
    else
      a_occr->deleted_p = 1;
  
*************** void
*** 1285,1293 ****
  gcse_after_reload_main (rtx f ATTRIBUTE_UNUSED)
  {
  
-   if (targetm.cannot_modify_jumps_p ())
-     return;
- 
    memset (&stats, 0, sizeof (stats));
  
    /* Allocate ememory for this pass.
--- 1314,1319 ----

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
  2005-04-03 14:46 ` Mostafa Hagog
@ 2005-04-04 15:19   ` Joern RENNECKE
  2005-04-04 21:08     ` Joern RENNECKE
  0 siblings, 1 reply; 10+ messages in thread
From: Joern RENNECKE @ 2005-04-04 15:19 UTC (permalink / raw)
  To: Mostafa Hagog; +Cc: gcc-patches, Ulrich.Weigand

Mostafa Hagog wrote:

>    
>
>A more appropriate fix is to disable the gcse only for those cases that you
>need an edge split; add the targetm.cannot_modify_jumps_p () condition to
>eliminate_partially_redundant_load.  I had a patch to do a similar thing
>when we don't have profiling information; in such a case we cannot be sure
>that the split worth it (even when we remove redundant loads).  I have
>update the patch to catch also cases where the machine doesn't support
>adding jumps after reload.  here is the modified patch, didn't test it yet
>let me know what do you think ?
>
If this indeed coverts all the cases where edge splits might be caused, 
then this is a preferable
solution to the one I put in earlier.  A test build of sh64-elf suceeded 
with your patch, but
it will maybe another hour for the regression check.  Strangely enough, 
one of the
execution failures seems to have been 'cured'.

For the profiling information problem, it certainly would be better if 
we could generate some
edge counts that are consistent with the assumed branch probabilities.  
For a single function,
you have basically a system of linear equations, and you can assign an 
arbitrary large number
for the entry edge to get a set of values.  This is algorithmically 
simple, although it can become
computationally quite expensive for complex flowgraphs.  If you consider 
multiple functions,
however, it gets worse, because the edge counts should be consistent 
across functions so that
inlining decisions make sense.  To address function call/return and 
mutual recursion of direct
calls, you could mash the entire translation unit into a big linear 
equation system, but this
looks like it's pretty much guaranteed to be too expensive to solve for 
most translation units.
And it doesn't even begin to address vtables or indirect function calls 
in general.
So, ironically, here we have a situation where global functions can be 
optimized more efficiently
than static (i.e. ! TREE_PUBLIC) ones.  For a global function. if we 
don't have profiling
information, we may assume that most of the calls come from outside of 
the module, and
assign an arbitrary large execution count to the entry edge.  For  most 
real-world flowgraphs,
the linear equation sytstem can probably be solved in reasonable time.  
We should probably
skip the cases where a cheap solution cannot be found with whatever 
algorithms we build
into the compiler.

Thus, while it appears desirable and feasible to have estimated edge 
execution count information
for some functions, it doesn't appear feasible to have it for all 
functions.  Hence, we need a
mechanism to deal with the case of not having execution count information.
Therefore,  I think your patch makes sense.  If/when we get estimated 
edge execution count
information for some functions, we can refine the ! profile_info test to 
take this into account.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
  2005-04-04 15:19   ` Joern RENNECKE
@ 2005-04-04 21:08     ` Joern RENNECKE
  2005-04-05  8:25       ` Mostafa Hagog
  0 siblings, 1 reply; 10+ messages in thread
From: Joern RENNECKE @ 2005-04-04 21:08 UTC (permalink / raw)
  To: Mostafa Hagog; +Cc: gcc-patches

In the sources from CVS 2005.04.01.17.00.00, there is one regression for
sh64/-m5-compact/-ml:
< PASS: tmpdir-gcc.dg-struct-layout-1/t011 
c_compat_x_tst.o-c_compat_y_tst.o execute
---
 > FAIL: tmpdir-gcc.dg-struct-layout-1/t011 
c_compat_x_tst.o-c_compat_y_tst.o execute

In the sources from CVS 2005.03.23.12.00.00, with my merged patches,
one regression is fixed for sh64:
-FAIL: tmpdir-gcc.dg-struct-layout-1/t021 
c_compat_x_tst.o-c_compat_y_tst.o execute
+PASS: tmpdir-gcc.dg-struct-layout-1/t021 
c_compat_x_tst.o-c_compat_y_tst.o execute

But one new one appears for sh64/-m5-compact/-ml:
-PASS: tmpdir-gcc.dg-struct-layout-1/t024 
c_compat_x_tst.o-c_compat_y_tst.o execute
+FAIL: tmpdir-gcc.dg-struct-layout-1/t024 
c_compat_x_tst.o-c_compat_y_tst.o execute



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
  2005-04-04 21:08     ` Joern RENNECKE
@ 2005-04-05  8:25       ` Mostafa Hagog
  2005-04-05 11:53         ` Joern RENNECKE
  0 siblings, 1 reply; 10+ messages in thread
From: Mostafa Hagog @ 2005-04-05  8:25 UTC (permalink / raw)
  To: Joern RENNECKE; +Cc: gcc-patches





I am testing the patch on powerpc and x86 also, I will see what we get
there.

Joern RENNECKE <joern.rennecke@st.com> wrote on 05/04/2005 00:08:37:

> In the sources from CVS 2005.04.01.17.00.00, there is one regression for
> sh64/-m5-compact/-ml:
> < PASS: tmpdir-gcc.dg-struct-layout-1/t011
> c_compat_x_tst.o-c_compat_y_tst.o execute
> ---
>  > FAIL: tmpdir-gcc.dg-struct-layout-1/t011
> c_compat_x_tst.o-c_compat_y_tst.o execute
Is this regression happen after you apply the patches ?

>
> In the sources from CVS 2005.03.23.12.00.00, with my merged patches,
> one regression is fixed for sh64:
> -FAIL: tmpdir-gcc.dg-struct-layout-1/t021
> c_compat_x_tst.o-c_compat_y_tst.o execute
> +PASS: tmpdir-gcc.dg-struct-layout-1/t021
> c_compat_x_tst.o-c_compat_y_tst.o execute
>
> But one new one appears for sh64/-m5-compact/-ml:
> -PASS: tmpdir-gcc.dg-struct-layout-1/t024
> c_compat_x_tst.o-c_compat_y_tst.o execute
> +FAIL: tmpdir-gcc.dg-struct-layout-1/t024
> c_compat_x_tst.o-c_compat_y_tst.o execute

To be able to debug this I must at least look at dump.  Since I don't have
the sh64 system; can you please send me those dumps

Mostafa.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
  2005-04-05  8:25       ` Mostafa Hagog
@ 2005-04-05 11:53         ` Joern RENNECKE
  2005-04-05 12:55           ` Joern RENNECKE
  2005-04-05 14:21           ` Mostafa Hagog
  0 siblings, 2 replies; 10+ messages in thread
From: Joern RENNECKE @ 2005-04-05 11:53 UTC (permalink / raw)
  To: Mostafa Hagog; +Cc: gcc-patches

Mostafa Hagog wrote:

>    
>
>To be able to debug this I must at least look at dump.  Since I don't have
>the sh64 system; can you please send me those dumps
>
I'm running these tests on a simulator.  Sources together with pre-built 
Red Hat
Linux binaries for gdb with integrated simulator and a stand-alone simulator
are available at:
ftp://sources.redhat.com/pub/gdb/contrib/superh/gdb+dejagnu-20020310-sh64-20040702.tgz
or your friendly neighbourhood mirror.

I'm not sure what dumps you want.  Gcc dumps?  Core dumps?  At any rate, 
debugging
under gdb is generally more useful.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
  2005-04-05 11:53         ` Joern RENNECKE
@ 2005-04-05 12:55           ` Joern RENNECKE
  2005-04-05 14:21           ` Mostafa Hagog
  1 sibling, 0 replies; 10+ messages in thread
From: Joern RENNECKE @ 2005-04-05 12:55 UTC (permalink / raw)
  To: Mostafa Hagog; +Cc: gcc-patches

I've had a look at the tmpdir-gcc.dg-struct-layout-1/t021 failure, and 
the problem appears to be that the struct-layout
tests are way too large - about half a megabyte.  Adding a byte or two 
can cause a bss / stack collision.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
  2005-04-05 11:53         ` Joern RENNECKE
  2005-04-05 12:55           ` Joern RENNECKE
@ 2005-04-05 14:21           ` Mostafa Hagog
  2005-04-05 15:44             ` Joern RENNECKE
  2005-04-12 14:38             ` Joern RENNECKE
  1 sibling, 2 replies; 10+ messages in thread
From: Mostafa Hagog @ 2005-04-05 14:21 UTC (permalink / raw)
  To: Joern RENNECKE; +Cc: gcc-patches






Joern RENNECKE <joern.rennecke@st.com> wrote on 05/04/2005 14:53:24:

> Mostafa Hagog wrote:
>
> >
> >
> >To be able to debug this I must at least look at dump.  Since I don't
have
> >the sh64 system; can you please send me those dumps
> >
> I'm running these tests on a simulator.  Sources together with pre-built
> Red Hat
> Linux binaries for gdb with integrated simulator and a stand-alone
simulator
> are available at:
> ftp://sources.redhat.
> com/pub/gdb/contrib/superh/gdb+dejagnu-20020310-sh64-20040702.tgz
> or your friendly neighbourhood mirror.
>
> I'm not sure what dumps you want.  Gcc dumps?  Core dumps?  At any rate,
> debugging
> under gdb is generally more useful.
>

RTL dumps of GCC can help understand what is going in postreload-gcse;
since already have GCC for sh64, can you produce the RTL dumps for this
test case ?

Thanks,
Mostafa.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
  2005-04-05 14:21           ` Mostafa Hagog
@ 2005-04-05 15:44             ` Joern RENNECKE
  2005-04-12 14:38             ` Joern RENNECKE
  1 sibling, 0 replies; 10+ messages in thread
From: Joern RENNECKE @ 2005-04-05 15:44 UTC (permalink / raw)
  To: Mostafa Hagog; +Cc: gcc-patches

Mostafa Hagog wrote:

>    
>
>RTL dumps of GCC can help understand what is going in postreload-gcse;
>since already have GCC for sh64, can you produce the RTL dumps for this
>test case ?
>

As I said already in a previous email, the struct-layout testcases are 
simply too large.
Since postreload-gcse is expected to increase code size a bit now and 
then, I don't
think your patch is to blame for any of the regressions.
   

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't
  2005-04-05 14:21           ` Mostafa Hagog
  2005-04-05 15:44             ` Joern RENNECKE
@ 2005-04-12 14:38             ` Joern RENNECKE
  1 sibling, 0 replies; 10+ messages in thread
From: Joern RENNECKE @ 2005-04-12 14:38 UTC (permalink / raw)
  To: Mostafa Hagog; +Cc: gcc-patches, kkojima

P.S.: A number of regresions are also attributable to PR middle-end/20371.
The structure layout algorithm uses uninitialized data, so a number of the
struct-layout tests return random results.
I have the patch to the sh-elf-4_1-branch banch now because this bug
makes it impossible to make a straightforward regression test.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2005-04-12 14:38 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-03-31 17:10 postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't Joern RENNECKE
2005-04-03 14:46 ` Mostafa Hagog
2005-04-04 15:19   ` Joern RENNECKE
2005-04-04 21:08     ` Joern RENNECKE
2005-04-05  8:25       ` Mostafa Hagog
2005-04-05 11:53         ` Joern RENNECKE
2005-04-05 12:55           ` Joern RENNECKE
2005-04-05 14:21           ` Mostafa Hagog
2005-04-05 15:44             ` Joern RENNECKE
2005-04-12 14:38             ` Joern RENNECKE

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).