* Re: [patch] Re: CSE improvement
1998-03-09 21:39 ` [patch] Re: CSE improvement Bruno Haible
@ 1998-03-09 15:19 ` John Carr
1998-03-10 5:51 ` Joern Rennecke
0 siblings, 1 reply; 4+ messages in thread
From: John Carr @ 1998-03-09 15:19 UTC (permalink / raw)
To: Bruno Haible; +Cc: egcs
> Only on Sparc, not on i386. It depends on strict alignment of p1 and p2.
> (If p1 and p2 differ by half a word, nothing can be said about *p1 after
> the store.)
C code which stores to partially overlapping locations is non-conforming.
^ permalink raw reply [flat|nested] 4+ messages in thread
* [patch] Re: CSE improvement
[not found] <199803071452.JAA00648@jfc.>
@ 1998-03-09 21:39 ` Bruno Haible
1998-03-09 15:19 ` John Carr
0 siblings, 1 reply; 4+ messages in thread
From: Bruno Haible @ 1998-03-09 21:39 UTC (permalink / raw)
To: John Carr; +Cc: egcs
John Carr writes:
>
> For example, after this code
>
> a = *p1;
> *p2 = a;
>
> it is known that a == *p1.
Only on Sparc, not on i386. It depends on strict alignment of p1 and p2.
(If p1 and p2 differ by half a word, nothing can be said about *p1 after
the store.)
> He described this as an alias deficiency but I think the right solution
> is in CSE. Please comment on the following patch, in particular whether
> the optimization benefit is worth the added complexity. This is mostly
> untested.
Interestingly, the patch I came up with last week-end is very similar
to yours: Introduce a third argument to `invalidate'. But I can get
away without the function `any_equiv_value'.
This patch is tested: causes no regression on sparc-sun-solaris2
(thanks Alexandre for the reference data!), and has no effect on i386.
Bruno
===============================================================================
Here is a patch which avoids unnecessary load instructions when the value
to be loaded is already in a register and has been written to memory. For
example, in the following code, the load insn marked as "superfluous" is
eliminated.
========================== foo.c ===============================
typedef struct mylist {
struct bfd * abfd;
struct mylist * next;
struct mylist ** prev;
} mylist;
void mylist_insert (struct mylist ** p, struct mylist * n)
{
n->next = *p;
if (*p)
(*p)->prev = &n->next;
n->prev = p;
*p = n;
}
========================= gcc -O3 -S foo.c =====================
gcc2_compiled.:
___gnu_compiled_c:
.text
.align 4
.global _mylist_insert
.proc 020
_mylist_insert:
!#PROLOGUE# 0
!#PROLOGUE# 1
ld [%o0],%g2 ; fetch *p
st %g2,[%o1+4] ; store it in memory
ld [%o0],%g3 ; <-------- fetch *p again, superfluous
cmp %g3,0
be L2
add %o1,4,%g2
st %g2,[%g3+8]
L2:
st %o0,[%o1+8]
retl
st %o1,[%o0]
================================================================
Sat Mar 7 01:20:33 1998 Bruno Haible <bruno@linuix.mathematik.uni-karlsruhe.de>
* cse.c (invalidate): Add `src' parameter. Don't invalidate
expressions equivalent to `src'. All callers changed.
*** egcs-980302/gcc/cse.c.bak Wed Mar 4 02:46:34 1998
--- egcs-980302/gcc/cse.c Mon Mar 9 03:34:47 1998
***************
*** 612,618 ****
enum machine_mode));
static void merge_equiv_classes PROTO((struct table_elt *,
struct table_elt *));
! static void invalidate PROTO((rtx, enum machine_mode));
static int cse_rtx_varies_p PROTO((rtx));
static void remove_invalid_refs PROTO((int));
static void rehash_using_reg PROTO((rtx));
--- 612,618 ----
enum machine_mode));
static void merge_equiv_classes PROTO((struct table_elt *,
struct table_elt *));
! static void invalidate PROTO((rtx, enum machine_mode, rtx));
static int cse_rtx_varies_p PROTO((rtx));
static void remove_invalid_refs PROTO((int));
static void rehash_using_reg PROTO((rtx));
***************
*** 1507,1524 ****
FULL_MODE, if not VOIDmode, indicates that this much should be invalidated
instead of just the amount indicated by the mode of X. This is only used
for bitfield stores into memory.
A nonvarying address may be just a register or just
a symbol reference, or it may be either of those plus
a numeric offset. */
static void
! invalidate (x, full_mode)
rtx x;
enum machine_mode full_mode;
{
register int i;
register struct table_elt *p;
/* If X is a register, dependencies on its contents
are recorded through the qty number mechanism.
--- 1507,1528 ----
FULL_MODE, if not VOIDmode, indicates that this much should be invalidated
instead of just the amount indicated by the mode of X. This is only used
for bitfield stores into memory.
+ SRC, if not NULL_RTX, is the value which is being stored into X. This is
+ only used if X is a memory reference, to avoid invalidating SRC itself.
A nonvarying address may be just a register or just
a symbol reference, or it may be either of those plus
a numeric offset. */
static void
! invalidate (x, full_mode, src)
rtx x;
enum machine_mode full_mode;
+ rtx src;
{
register int i;
register struct table_elt *p;
+ register struct table_elt *src_first_same_value;
/* If X is a register, dependencies on its contents
are recorded through the qty number mechanism.
***************
*** 1596,1602 ****
{
if (GET_CODE (SUBREG_REG (x)) != REG)
abort ();
! invalidate (SUBREG_REG (x), VOIDmode);
return;
}
--- 1600,1606 ----
{
if (GET_CODE (SUBREG_REG (x)) != REG)
abort ();
! invalidate (SUBREG_REG (x), VOIDmode, NULL_RTX);
return;
}
***************
*** 1610,1615 ****
--- 1614,1637 ----
if (full_mode == VOIDmode)
full_mode = GET_MODE (x);
+ /* If we are dealing with (SET X SRC), and SRC is the same qty as a
+ memory reference P_EXP, and the mode's alignment is equal to its size,
+ then we won't need to remove P_EXP from the table, because:
+ Either the memory locations of X and P_EXP are disjoint, then P_EXP
+ is not modified.
+ Or the memory locations of X and P_EXP are identical, then P_EXP is
+ overwritten by itself. */
+ src_first_same_value = NULL;
+ if (src
+ && STRICT_ALIGNMENT
+ && (GET_MODE_ALIGNMENT (full_mode) == GET_MODE_BITSIZE (full_mode)))
+ {
+ struct table_elt *src_elt
+ = lookup (src, safe_hash (src, full_mode) % NBUCKETS, full_mode);
+ if (src_elt)
+ src_first_same_value = src_elt->first_same_value;
+ }
+
for (i = 0; i < NBUCKETS; i++)
{
register struct table_elt *next;
***************
*** 1620,1626 ****
than checking all the aliases). */
if (p->in_memory
&& (GET_CODE (p->exp) != MEM
! || true_dependence (x, full_mode, p->exp, cse_rtx_varies_p)))
remove_from_table (p, i);
}
}
--- 1642,1649 ----
than checking all the aliases). */
if (p->in_memory
&& (GET_CODE (p->exp) != MEM
! || true_dependence (x, full_mode, p->exp, cse_rtx_varies_p))
! && p->first_same_value != src_first_same_value)
remove_from_table (p, i);
}
}
***************
*** 6139,6145 ****
{
for (tem = CALL_INSN_FUNCTION_USAGE (insn); tem; tem = XEXP (tem, 1))
if (GET_CODE (XEXP (tem, 0)) == CLOBBER)
! invalidate (SET_DEST (XEXP (tem, 0)), VOIDmode);
}
if (GET_CODE (x) == SET)
--- 6162,6168 ----
{
for (tem = CALL_INSN_FUNCTION_USAGE (insn); tem; tem = XEXP (tem, 1))
if (GET_CODE (XEXP (tem, 0)) == CLOBBER)
! invalidate (SET_DEST (XEXP (tem, 0)), VOIDmode, NULL_RTX);
}
if (GET_CODE (x) == SET)
***************
*** 6170,6176 ****
canon_reg (SET_SRC (x), insn);
apply_change_group ();
fold_rtx (SET_SRC (x), insn);
! invalidate (SET_DEST (x), VOIDmode);
}
else
n_sets = 1;
--- 6193,6199 ----
canon_reg (SET_SRC (x), insn);
apply_change_group ();
fold_rtx (SET_SRC (x), insn);
! invalidate (SET_DEST (x), VOIDmode, NULL_RTX);
}
else
n_sets = 1;
***************
*** 6201,6210 ****
if (GET_CODE (clobbered) == REG
|| GET_CODE (clobbered) == SUBREG)
! invalidate (clobbered, VOIDmode);
else if (GET_CODE (clobbered) == STRICT_LOW_PART
|| GET_CODE (clobbered) == ZERO_EXTRACT)
! invalidate (XEXP (clobbered, 0), GET_MODE (clobbered));
}
}
--- 6224,6233 ----
if (GET_CODE (clobbered) == REG
|| GET_CODE (clobbered) == SUBREG)
! invalidate (clobbered, VOIDmode, NULL_RTX);
else if (GET_CODE (clobbered) == STRICT_LOW_PART
|| GET_CODE (clobbered) == ZERO_EXTRACT)
! invalidate (XEXP (clobbered, 0), GET_MODE (clobbered), NULL_RTX);
}
}
***************
*** 6220,6226 ****
canon_reg (SET_SRC (y), insn);
apply_change_group ();
fold_rtx (SET_SRC (y), insn);
! invalidate (SET_DEST (y), VOIDmode);
}
else if (SET_DEST (y) == pc_rtx
&& GET_CODE (SET_SRC (y)) == LABEL_REF)
--- 6243,6249 ----
canon_reg (SET_SRC (y), insn);
apply_change_group ();
fold_rtx (SET_SRC (y), insn);
! invalidate (SET_DEST (y), VOIDmode, NULL_RTX);
}
else if (SET_DEST (y) == pc_rtx
&& GET_CODE (SET_SRC (y)) == LABEL_REF)
***************
*** 7042,7048 ****
if ((GET_CODE (addr) == PRE_DEC || GET_CODE (addr) == PRE_INC
|| GET_CODE (addr) == POST_DEC || GET_CODE (addr) == POST_INC)
&& XEXP (addr, 0) == stack_pointer_rtx)
! invalidate (stack_pointer_rtx, Pmode);
#endif
dest = fold_rtx (dest, insn);
}
--- 7065,7071 ----
if ((GET_CODE (addr) == PRE_DEC || GET_CODE (addr) == PRE_INC
|| GET_CODE (addr) == POST_DEC || GET_CODE (addr) == POST_INC)
&& XEXP (addr, 0) == stack_pointer_rtx)
! invalidate (stack_pointer_rtx, Pmode, NULL_RTX);
#endif
dest = fold_rtx (dest, insn);
}
***************
*** 7168,7177 ****
{
if (GET_CODE (dest) == REG || GET_CODE (dest) == SUBREG
|| GET_CODE (dest) == MEM)
! invalidate (dest, VOIDmode);
else if (GET_CODE (dest) == STRICT_LOW_PART
|| GET_CODE (dest) == ZERO_EXTRACT)
! invalidate (XEXP (dest, 0), GET_MODE (dest));
sets[i].rtl = 0;
}
--- 7191,7200 ----
{
if (GET_CODE (dest) == REG || GET_CODE (dest) == SUBREG
|| GET_CODE (dest) == MEM)
! invalidate (dest, VOIDmode, NULL_RTX);
else if (GET_CODE (dest) == STRICT_LOW_PART
|| GET_CODE (dest) == ZERO_EXTRACT)
! invalidate (XEXP (dest, 0), GET_MODE (dest), NULL_RTX);
sets[i].rtl = 0;
}
***************
*** 7320,7329 ****
we have just done an invalidate_memory that covers even those. */
if (GET_CODE (dest) == REG || GET_CODE (dest) == SUBREG
|| GET_CODE (dest) == MEM)
! invalidate (dest, VOIDmode);
else if (GET_CODE (dest) == STRICT_LOW_PART
|| GET_CODE (dest) == ZERO_EXTRACT)
! invalidate (XEXP (dest, 0), GET_MODE (dest));
}
/* Make sure registers mentioned in destinations
--- 7343,7352 ----
we have just done an invalidate_memory that covers even those. */
if (GET_CODE (dest) == REG || GET_CODE (dest) == SUBREG
|| GET_CODE (dest) == MEM)
! invalidate (dest, VOIDmode, SET_SRC (sets[i].rtl));
else if (GET_CODE (dest) == STRICT_LOW_PART
|| GET_CODE (dest) == ZERO_EXTRACT)
! invalidate (XEXP (dest, 0), GET_MODE (dest), NULL_RTX);
}
/* Make sure registers mentioned in destinations
***************
*** 7629,7635 ****
/* This should be *very* rare. */
if (TEST_HARD_REG_BIT (hard_regs_in_table, STACK_POINTER_REGNUM))
! invalidate (stack_pointer_rtx, VOIDmode);
return 1;
}
return 0;
--- 7652,7658 ----
/* This should be *very* rare. */
if (TEST_HARD_REG_BIT (hard_regs_in_table, STACK_POINTER_REGNUM))
! invalidate (stack_pointer_rtx, VOIDmode, NULL_RTX);
return 1;
}
return 0;
***************
*** 7653,7662 ****
{
if (GET_CODE (ref) == REG || GET_CODE (ref) == SUBREG
|| GET_CODE (ref) == MEM)
! invalidate (ref, VOIDmode);
else if (GET_CODE (ref) == STRICT_LOW_PART
|| GET_CODE (ref) == ZERO_EXTRACT)
! invalidate (XEXP (ref, 0), GET_MODE (ref));
}
}
else if (GET_CODE (x) == PARALLEL)
--- 7676,7685 ----
{
if (GET_CODE (ref) == REG || GET_CODE (ref) == SUBREG
|| GET_CODE (ref) == MEM)
! invalidate (ref, VOIDmode, NULL_RTX);
else if (GET_CODE (ref) == STRICT_LOW_PART
|| GET_CODE (ref) == ZERO_EXTRACT)
! invalidate (XEXP (ref, 0), GET_MODE (ref), NULL_RTX);
}
}
else if (GET_CODE (x) == PARALLEL)
***************
*** 7670,7679 ****
rtx ref = XEXP (y, 0);
if (GET_CODE (ref) == REG || GET_CODE (ref) == SUBREG
|| GET_CODE (ref) == MEM)
! invalidate (ref, VOIDmode);
else if (GET_CODE (ref) == STRICT_LOW_PART
|| GET_CODE (ref) == ZERO_EXTRACT)
! invalidate (XEXP (ref, 0), GET_MODE (ref));
}
}
}
--- 7693,7702 ----
rtx ref = XEXP (y, 0);
if (GET_CODE (ref) == REG || GET_CODE (ref) == SUBREG
|| GET_CODE (ref) == MEM)
! invalidate (ref, VOIDmode, NULL_RTX);
else if (GET_CODE (ref) == STRICT_LOW_PART
|| GET_CODE (ref) == ZERO_EXTRACT)
! invalidate (XEXP (ref, 0), GET_MODE (ref), NULL_RTX);
}
}
}
***************
*** 7807,7816 ****
if (GET_CODE (p->exp) == MEM || GET_CODE (p->exp) == REG
|| (GET_CODE (p->exp) == SUBREG
&& GET_CODE (SUBREG_REG (p->exp)) == REG))
! invalidate (p->exp, VOIDmode);
else if (GET_CODE (p->exp) == STRICT_LOW_PART
|| GET_CODE (p->exp) == ZERO_EXTRACT)
! invalidate (XEXP (p->exp, 0), GET_MODE (p->exp));
}
/* Process insns starting after LOOP_START until we hit a CALL_INSN or
--- 7830,7839 ----
if (GET_CODE (p->exp) == MEM || GET_CODE (p->exp) == REG
|| (GET_CODE (p->exp) == SUBREG
&& GET_CODE (SUBREG_REG (p->exp)) == REG))
! invalidate (p->exp, VOIDmode, NULL_RTX);
else if (GET_CODE (p->exp) == STRICT_LOW_PART
|| GET_CODE (p->exp) == ZERO_EXTRACT)
! invalidate (XEXP (p->exp, 0), GET_MODE (p->exp), NULL_RTX);
}
/* Process insns starting after LOOP_START until we hit a CALL_INSN or
***************
*** 7878,7886 ****
return;
if (code == STRICT_LOW_PART || code == ZERO_EXTRACT)
! invalidate (XEXP (dest, 0), GET_MODE (dest));
else if (code == REG || code == SUBREG || code == MEM)
! invalidate (dest, VOIDmode);
}
/* Invalidate all insns from START up to the end of the function or the
--- 7901,7909 ----
return;
if (code == STRICT_LOW_PART || code == ZERO_EXTRACT)
! invalidate (XEXP (dest, 0), GET_MODE (dest), NULL_RTX);
else if (code == REG || code == SUBREG || code == MEM)
! invalidate (dest, VOIDmode, NULL_RTX);
}
/* Invalidate all insns from START up to the end of the function or the
***************
*** 8020,8029 ****
/* See comment on similar code in cse_insn for explanation of these tests. */
if (GET_CODE (SET_DEST (x)) == REG || GET_CODE (SET_DEST (x)) == SUBREG
|| GET_CODE (SET_DEST (x)) == MEM)
! invalidate (SET_DEST (x), VOIDmode);
else if (GET_CODE (SET_DEST (x)) == STRICT_LOW_PART
|| GET_CODE (SET_DEST (x)) == ZERO_EXTRACT)
! invalidate (XEXP (SET_DEST (x), 0), GET_MODE (SET_DEST (x)));
}
\f
/* Find the end of INSN's basic block and return its range,
--- 8043,8052 ----
/* See comment on similar code in cse_insn for explanation of these tests. */
if (GET_CODE (SET_DEST (x)) == REG || GET_CODE (SET_DEST (x)) == SUBREG
|| GET_CODE (SET_DEST (x)) == MEM)
! invalidate (SET_DEST (x), VOIDmode, NULL_RTX);
else if (GET_CODE (SET_DEST (x)) == STRICT_LOW_PART
|| GET_CODE (SET_DEST (x)) == ZERO_EXTRACT)
! invalidate (XEXP (SET_DEST (x), 0), GET_MODE (SET_DEST (x)), NULL_RTX);
}
\f
/* Find the end of INSN's basic block and return its range,
***************
*** 8493,8499 ****
next = p->next_same_hash;
if (GET_CODE (p->exp) == REG)
! invalidate (p->exp, p->mode);
else
remove_from_table (p, i);
}
--- 8516,8522 ----
next = p->next_same_hash;
if (GET_CODE (p->exp) == REG)
! invalidate (p->exp, p->mode, NULL_RTX);
else
remove_from_table (p, i);
}
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [patch] Re: CSE improvement
1998-03-09 15:19 ` John Carr
@ 1998-03-10 5:51 ` Joern Rennecke
1998-03-10 10:13 ` John Carr
0 siblings, 1 reply; 4+ messages in thread
From: Joern Rennecke @ 1998-03-10 5:51 UTC (permalink / raw)
To: John Carr; +Cc: haible, egcs
>
> > Only on Sparc, not on i386. It depends on strict alignment of p1 and p2.
> > (If p1 and p2 differ by half a word, nothing can be said about *p1 after
> > the store.)
>
> C code which stores to partially overlapping locations is non-conforming.
What about complex? IIRC, it is supposed to work like an array of
float/double. Thus, if both p1 and p2 are pointers to a complex type and
point into a larger array of the base type, would it be legal for them to
overlap ?
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [patch] Re: CSE improvement
1998-03-10 5:51 ` Joern Rennecke
@ 1998-03-10 10:13 ` John Carr
0 siblings, 0 replies; 4+ messages in thread
From: John Carr @ 1998-03-10 10:13 UTC (permalink / raw)
To: Joern Rennecke; +Cc: haible, egcs
> What about complex? IIRC, it is supposed to work like an array of
> float/double. Thus, if both p1 and p2 are pointers to a complex type and
> point into a larger array of the base type, would it be legal for them to
> overlap ?
C9X complex is stored in memory as if it were an array ("same
representation and alignment requirements") but I don't think that
makes it an array for the section 6.3 alias rules.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~1998-03-10 10:13 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <199803071452.JAA00648@jfc.>
1998-03-09 21:39 ` [patch] Re: CSE improvement Bruno Haible
1998-03-09 15:19 ` John Carr
1998-03-10 5:51 ` Joern Rennecke
1998-03-10 10:13 ` John Carr
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).