* Alias analysis - does base_alias_check still work ?
From: Toon Moene @ 2002-07-16 14:10 UTC
To: gcc
L.S.,
f/com.c contains the following note, preceding the definition of
#define LANG_HOOKS_GET_ALIAS_SET hook_get_alias_set_0
/* We do not wish to use alias-set based aliasing at all. Used in the
extreme (every object with its own set, with equivalences recorded) it
might be helpful, but there are problems when it comes to inlining. We
get on ok with flag_argument_noalias, and alias-set aliasing does
currently limit how stack slots can be reused, which is a lose. */
I do not know if all the facts mentioned here still actually hold, but I
do have strong doubts that base_alias_check in alias.c still does its
duty.
Consider the following Fortran source:
SUBROUTINE SIMPLE(A, B)
B = 3.0
A = 2.0
B = A*B
END
one would assume that alias analysis at least once should check that the
assignment to A in line 3 doesn't change the value of B set in line 2,
which, with
flag_argument_noalias > 1
[arguments don't alias] in effect, would be the case.
However, according to my experiments with setting breakpoints in
base_alias_check, it never passes that point.
Before I go on a wholesale check to see if base_alias_check *ever*
returns anything other than `1` (x and y might alias), does someone have
a good idea to narrow the search ?
Thanks in advance,
--
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
Join GNU Fortran 95: http://g95.sourceforge.net/ (under construction)
* Re: Alias analysis - does base_alias_check still work ?
From: Toon Moene @ 2002-07-19 11:00 UTC
To: gcc
I wrote:
> f/com.c contains the following note, preceding the definition of
>
> #define LANG_HOOKS_GET_ALIAS_SET hook_get_alias_set_0
>
> /* We do not wish to use alias-set based aliasing at all. Used in the
> extreme (every object with its own set, with equivalences recorded) it
> might be helpful, but there are problems when it comes to inlining. We
> get on ok with flag_argument_noalias, and alias-set aliasing does
> currently limit how stack slots can be reused, which is a lose. */
>
> I do not know if all the facts mentioned here still actually hold, but I
> do have strong doubts that base_alias_check in alias.c still does its
> duty.
>
> Consider the following Fortran source:
>
> SUBROUTINE SIMPLE(A, B)
> B = 3.0
> A = 2.0
> B = A*B
> END
>
> one would assume that alias analysis at least once should check that the
> assignment to A in line 3 doesn't change the value of B set in line 2,
> which, with
>
> flag_argument_noalias > 1
>
> [arguments don't alias] in effect, would be the case.
>
> However, according to my experiments with setting breakpoints in
> base_alias_check, it never passes that point.
Sigh, that's just because it doesn't need to. The code generated looks
like this:
...
movl $0x40400000, (%edx) ! B=3.0
movl $0x40000000, (%eax) ! A=2.0
flds (%edx) ! put B on stack
...
which, of course, gets neatly around the problem of whether the store into
A would change B.
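For readers more at home in C, a rough analogue of SIMPLE might look like the sketch below. It assumes that C99 `restrict` is an acceptable stand-in for the Fortran rule that dummy arguments do not alias; the name `simple` and the assertion are illustrative only and do not come from g77.

```c
#include <assert.h>

/* C99 sketch of SUBROUTINE SIMPLE(A, B).  `restrict` approximates the
   Fortran argument rule: the caller promises a and b do not overlap.  */
void
simple (float *restrict a, float *restrict b)
{
  assert (a != b);   /* the no-alias promise, checked here only for illustration */
  *b = 3.0f;
  *a = 2.0f;         /* cannot change *b if the promise holds */
  *b = *a * *b;      /* a compiler could reuse 3.0 here, or simply reload *b */
}
```

Reloading `*b` before the multiply, as the generated code above does, is correct whether or not the two arguments alias, which is why no alias query is needed for it.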
Now for the real issue. To have register renaming be really effective,
alias analysis has to work well.
Take the following example:
subroutine saxpy(n,sa,sx,sy)
real sx(n),sy(n),sa
integer i,n
do i = 1,n
sy(i) = sy(i) + sa*sx(i)
enddo
end
If we compile this with -O2 -march=pentium4 -mfpmath=sse -funroll-loops
-frename-registers, we get for the unrolled loop:
.L6:
movaps %xmm1, %xmm7
movaps %xmm1, %xmm6
movaps %xmm1, %xmm5
mulss (%edx), %xmm7
movaps %xmm1, %xmm4
addss (%eax), %xmm7
movss %xmm7, (%eax)
mulss 4(%edx), %xmm6
addss 4(%eax), %xmm6
movss %xmm6, 4(%eax)
mulss 8(%edx), %xmm5
addss 8(%eax), %xmm5
movss %xmm5, 8(%eax)
mulss 12(%edx), %xmm4
addl $16, %edx
addss 12(%eax), %xmm4
movss %xmm4, 12(%eax)
addl $16, %eax
subl $4, %ecx
jns .L6
Obviously, register renaming has done its work, but the (re-)scheduling of
instructions leaves something to be desired. After much gdb'ing in
sched-deps.c and alias.c I believe I have found the cause: the
rescheduling of this loop after register renaming is run after register
allocation (hey, no surprise :-). However, alias analysis is really
careful about assumptions on the contents of these hard registers, so
almost no instruction gets moved around.
OK, but what if we allow instruction scheduling before register allocation
(that would only be beneficial if the floating point (pseudo) registers
have different "names" already, but fortunately, they do), using
-fschedule-insns instead of -frename-registers:
.L6:
movaps %xmm4, %xmm0
movaps %xmm4, %xmm1
movaps %xmm4, %xmm2
movaps %xmm4, %xmm3
mulss (%edx), %xmm0
mulss 4(%edx), %xmm1
mulss 8(%edx), %xmm2
mulss 12(%edx), %xmm3
addss (%eax), %xmm0
addss 4(%eax), %xmm1
addss 8(%eax), %xmm2
addss 12(%eax), %xmm3
movss %xmm0, (%eax)
movss %xmm1, 4(%eax)
movss %xmm2, 8(%eax)
movss %xmm3, 12(%eax)
addl $16, %edx
addl $16, %eax
subl $4, %ecx
jns .L6
Bingo ! That's a lot better !
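The ordering in that second listing can be written out by hand in C. This is only a sketch under the assumption that sx and sy do not overlap (expressed with C99 `restrict`); the name `saxpy4` is made up for illustration.

```c
/* Hand-unrolled saxpy in the order of the second listing: four
   multiplies, then four adds, then four stores.  Moving the loads of
   sx[i+1..i+3] and sy[i+1..i+3] above the store to sy[i] is only valid
   if those locations are known not to overlap that store.  */
void
saxpy4 (int n, float sa, const float *restrict sx, float *restrict sy)
{
  int i;

  for (i = 0; i + 4 <= n; i += 4)
    {
      float t0 = sa * sx[i];
      float t1 = sa * sx[i + 1];
      float t2 = sa * sx[i + 2];
      float t3 = sa * sx[i + 3];

      t0 += sy[i];
      t1 += sy[i + 1];
      t2 += sy[i + 2];
      t3 += sy[i + 3];

      sy[i] = t0;
      sy[i + 1] = t1;
      sy[i + 2] = t2;
      sy[i + 3] = t3;
    }

  for (; i < n; i++)   /* leftover iterations when n is not a multiple of 4 */
    sy[i] += sa * sx[i];
}
```

In the first listing each store to sy is followed immediately by the next load, which is the order a scheduler has to keep when it cannot prove the accesses are independent.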
Which begs the question: Is there a reason -fschedule-insns isn't on by
default when using -O2 ?
Cheers,
* Re: Alias analysis - does base_alias_check still work ?
From: Daniel Berlin @ 2002-07-19 11:02 UTC
To: Toon Moene; +Cc: gcc
On Fri, 19 Jul 2002, Toon Moene wrote:
> Which begs the question: Is there a reason -fschedule-insns isn't on by
> default when using -O2 ?
Err, it is.
if (optimize >= 2)
  {
    flag_optimize_sibling_calls = 1;
    flag_cse_follow_jumps = 1;
    flag_cse_skip_blocks = 1;
    flag_gcse = 1;
    flag_expensive_optimizations = 1;
    flag_strength_reduce = 1;
    flag_rerun_cse_after_loop = 1;
    flag_rerun_loop_opt = 1;
    flag_caller_saves = 1;
    flag_force_mem = 1;
    flag_peephole2 = 1;
#ifdef INSN_SCHEDULING
    flag_schedule_insns = 1;
    flag_schedule_insns_after_reload = 1;
#endif
    flag_regmove = 1;
    flag_strict_aliasing = 1;
    flag_delete_null_pointer_checks = 1;
    flag_reorder_blocks = 1;
    flag_reorder_functions = 1;
  }
* Re: Alias analysis - does base_alias_check still work ?
From: David Edelsohn @ 2002-07-19 11:03 UTC
To: Daniel Berlin, Toon Moene; +Cc: gcc
>>>>> Daniel Berlin writes:
>> Which begs the question: Is there a reason -fschedule-insns isn't on by
>> default when using -O2 ?
Daniel> Err, it is.
Daniel> if (optimize >= 2)
Daniel> flag_peephole2 = 1;
Daniel> #ifdef INSN_SCHEDULING
Daniel> flag_schedule_insns = 1;
Daniel> flag_schedule_insns_after_reload = 1;
Daniel> #endif
Except i386.c turns it off:
void
optimization_options (level, size)
     int level;
     int size ATTRIBUTE_UNUSED;
{
  /* For -O2 and beyond, turn off -fschedule-insns by default. It tends to
     make the problem with not enough registers even worse. */
#ifdef INSN_SCHEDULING
  if (level > 1)
    flag_schedule_insns = 0;
#endif
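
How the two snippets combine can be modelled with a toy example. This is not the GCC code: the struct and function names below are invented, and it assumes the target hook runs after the generic -O2 defaults have been set, which is what the behaviour described in this thread suggests.

```c
/* Toy model of the flag setup; names are hypothetical.  */
struct opt_flags
{
  int schedule_insns;
  int schedule_insns_after_reload;
};

/* Generic defaults: both scheduling passes on at -O2 and above.  */
static void
generic_defaults (struct opt_flags *f, int level)
{
  f->schedule_insns = level >= 2;
  f->schedule_insns_after_reload = level >= 2;
}

/* i386-style override, assumed to be applied after the defaults.  */
static void
i386_override (struct opt_flags *f, int level)
{
  if (level > 1)
    f->schedule_insns = 0;
}

struct opt_flags
effective_flags (int level, int is_i386)
{
  struct opt_flags f;

  generic_defaults (&f, level);
  if (is_i386)
    i386_override (&f, level);
  return f;
}
```

In this model only the first scheduling pass is switched off on i386; the pass after reload stays on, and an explicit -fschedule-insns, as used earlier in the thread, is what re-enables the first pass.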
David
* Re: Alias analysis - does base_alias_check still work ?
From: Richard Henderson @ 2002-07-19 13:31 UTC
To: Toon Moene; +Cc: gcc
On Fri, Jul 19, 2002 at 04:40:26PM +0200, Toon Moene wrote:
> Which begs the question: Is there a reason -fschedule-insns isn't on by
> default when using -O2 ?
Yes. The fact that the scheduler doesn't understand register
pressure means that pre-register-allocation scheduling generally
sucks eggs on x86.
r~
* Re: Alias analysis - does base_alias_check still work ?
From: Toon Moene @ 2002-07-20 2:13 UTC
To: Richard Henderson; +Cc: gcc
Richard Henderson wrote:
> On Fri, Jul 19, 2002 at 04:40:26PM +0200, Toon Moene wrote:
> > Which begs the question: Is there a reason -fschedule-insns isn't on by
> > default when using -O2 ?
> Yes. The fact that the scheduler doesn't understand register
> pressure means that pre-register-allocation scheduling generally
> sucks eggs on x86.
Thanks. Independently of your message and David Edelsohn's (showing the
comment that gives the rationale for turning this option off on ix86) I
came to the same conclusion while having dinner (nothing helps more than
giving the grey cells some rest :-)
Now the 64K question is: How can it be that I went through all this
trouble because I saw the same "mis-optimization" on the Alpha, where
- given what Daniel showed us - -fschedule-insns is on ?
I'm switching machines as we speak to test this ...
* Re: Alias analysis - does base_alias_check still work ?
From: Toon Moene @ 2002-07-20 11:42 UTC
To: Richard Henderson, gcc
I wrote:
> Richard Henderson wrote:
> > Yes. The fact that the scheduler doesn't understand register
> > pressure means that pre-register-allocation scheduling generally
> > sucks eggs on x86.
>
> Thanks. Independently of your message and David Edelsohn's (showing the
> comment that gives the rationale for turning this option off on ix86) I
> came to the same conclusion while having dinner (nothing helps more than
> giving the grey cells some rest :-)
>
> Now the 64K question is: How can it be that I went through all this
> trouble because I saw the same "mis-optimization" on the Alpha, where
> - given what Daniel showed us - -fschedule-insns is on ?
>
> I'm switching machines as we speak to test this ...
Well, it obviously doesn't work on the Alpha. First of all I have to
specify -fno-rerun-loop-opt to get any loop unrolling at all, and then the
unrolled loop looks like this:
$L6:
lds $f12,0($17)
lds $f10,0($18)
lda $1,-3($5)
lda $5,-4($5)
lds $f11,-4($3)
addl $1,$31,$4
muls $f12,$f10,$f10
adds $f11,$f10,$f11
sts $f11,-4($2)
lds $f10,4($18)
lds $f11,0($3)
muls $f12,$f10,$f10
adds $f11,$f10,$f11
sts $f11,0($2)
lds $f10,8($18)
lds $f11,4($3)
muls $f12,$f10,$f10
adds $f11,$f10,$f11
sts $f11,4($2)
lds $f13,12($18)
lds $f10,8($3)
lda $18,16($18)
lda $3,16($3)
muls $f12,$f13,$f12
adds $f10,$f12,$f10
sts $f10,8($2)
lda $2,16($2)
bge $4,$L6
It's obvious that no scheduling has taken place, as all four operations are
still in sequence (lds, muls, adds, sts).
* Re: Alias analysis - does base_alias_check still work ?
From: Richard Henderson @ 2002-07-20 12:05 UTC
To: Toon Moene; +Cc: gcc
On Fri, Jul 19, 2002 at 09:55:00PM +0200, Toon Moene wrote:
> Well, it obviously doesn't work on the Alpha. First of all I have to
> specify -fno-rerun-loop-opt to get any loop unrolling at all, and then the
> unrolled loop looks like this:
Haven't looked at what exactly goes wrong with rerun-loop-opt,
except to notice that loop loses track of the register that
contains the iteration count.
As for the aliasing, the problem is that Alpha doesn't have
indexed addressing, so we wind up with the base addresses being
strength reduced, and presumably that confuses the alias
analysis code enough that it considers the memory references
to be "variable", and thus may alias anything.
One solution is to make use of the new MEM_EXPR field and
record this information such that it can't (or shouldn't)
get lost ever.
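To make the idea concrete, here is a deliberately simplified sketch of what "recording this information" buys. It is not the MEM_EXPR machinery: `struct mem_ref` and `refs_may_alias` are hypothetical names, and the only thing recorded is which parameter an address was derived from.

```c
#include <stddef.h>

/* Hypothetical stand-in for an annotation that survives strength
   reduction: the parameter the address came from, or NULL if unknown.  */
struct mem_ref
{
  const void *param;
};

/* Return 1 if x and y may alias, 0 if they provably do not.  */
int
refs_may_alias (const struct mem_ref *x, const struct mem_ref *y,
                int argument_noalias)
{
  if (x->param == NULL || y->param == NULL)
    return 1;                /* "variable" address: may alias anything */
  if (x->param == y->param)
    return 1;                /* same object: offsets would have to decide */
  return !argument_noalias;  /* distinct arguments: disjoint only if promised */
}
```

Without the annotation every reference falls into the first case, which matches the "may alias anything" behaviour described above.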
Try the following. For me it cleans up the example a bit:
$L6:
lds $f10,0($18)
lds $f12,-4($3)
lda $1,-3($5)
lda $5,-4($5)
lds $f11,4($18)
lds $f13,8($18)
addl $1,$31,$4
lds $f14,12($18)
lda $18,16($18)
muls $f15,$f10,$f10
muls $f15,$f11,$f11
muls $f15,$f13,$f13
muls $f15,$f14,$f14
adds $f12,$f10,$f12
sts $f12,-4($2)
lds $f10,0($3)
adds $f10,$f11,$f10
sts $f10,0($2)
lds $f11,4($3)
adds $f11,$f13,$f11
sts $f11,4($2)
lds $f10,8($3)
lda $3,16($3)
adds $f10,$f14,$f10
sts $f10,8($2)
lda $2,16($2)
bge $4,$L6
The remaining sts/lds pairs are writes then reads from SY.
We've lost track of the fact that the write is to index I
and the read from index I+1, and so cannot overlap.
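The missing piece for that last case is a plain range test on offsets from a common base. A minimal sketch, with the hypothetical name `ranges_overlap` and byte offsets and sizes as assumed inputs:

```c
/* Two accesses off the same base overlap exactly when their byte
   ranges [off, off + size) intersect.  */
int
ranges_overlap (long off_x, long size_x, long off_y, long size_y)
{
  return off_x < off_y + size_y && off_y < off_x + size_x;
}
```

For 4-byte reals, the write to SY(I) covers bytes 0..3 relative to its own address and the read of SY(I+1) covers bytes 4..7, so the test reports no overlap, but only if both addresses are still known to share a base.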
r~
Index: alias.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/alias.c,v
retrieving revision 1.177
diff -c -p -d -r1.177 alias.c
*** alias.c 20 Jun 2002 07:29:59 -0000 1.177
--- alias.c 19 Jul 2002 22:23:34 -0000
*************** nonoverlapping_memrefs_p (x, y)
*** 1957,1962 ****
--- 1957,1970 ----
moffsetx = adjust_offset_for_component_ref (exprx, moffsetx);
exprx = t;
}
+ else if (TREE_CODE (exprx) == INDIRECT_REF)
+ {
+ exprx = TREE_OPERAND (exprx, 0);
+ if (flag_argument_noalias < 2
+ || TREE_CODE (exprx) != PARM_DECL)
+ return 0;
+ }
+
moffsety = MEM_OFFSET (y);
if (TREE_CODE (expry) == COMPONENT_REF)
{
*************** nonoverlapping_memrefs_p (x, y)
*** 1965,1970 ****
--- 1973,1985 ----
return 0;
moffsety = adjust_offset_for_component_ref (expry, moffsety);
expry = t;
+ }
+ else if (TREE_CODE (expry) == INDIRECT_REF)
+ {
+ expry = TREE_OPERAND (expry, 0);
+ if (flag_argument_noalias < 2
+ || TREE_CODE (expry) != PARM_DECL)
+ return 0;
}
if (! DECL_P (exprx) || ! DECL_P (expry))
Index: emit-rtl.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/emit-rtl.c,v
retrieving revision 1.284
diff -c -p -d -r1.284 emit-rtl.c
*** emit-rtl.c 11 Jul 2002 10:32:54 -0000 1.284
--- emit-rtl.c 19 Jul 2002 22:23:34 -0000
*************** set_mem_attributes (ref, t, objectp)
*** 1805,1811 ****
}
while (TREE_CODE (t) == ARRAY_REF);
! if (TREE_CODE (t) == COMPONENT_REF)
{
expr = component_ref_for_mem_expr (t);
if (host_integerp (off_tree, 1))
--- 1805,1821 ----
}
while (TREE_CODE (t) == ARRAY_REF);
! if (DECL_P (t))
! {
! expr = t;
! if (host_integerp (off_tree, 1))
! offset = GEN_INT (tree_low_cst (off_tree, 1));
! size = (DECL_SIZE_UNIT (t)
! && host_integerp (DECL_SIZE_UNIT (t), 1)
! ? GEN_INT (tree_low_cst (DECL_SIZE_UNIT (t), 1)) : 0);
! align = DECL_ALIGN (t);
! }
! else if (TREE_CODE (t) == COMPONENT_REF)
{
expr = component_ref_for_mem_expr (t);
if (host_integerp (off_tree, 1))
*************** set_mem_attributes (ref, t, objectp)
*** 1813,1818 ****
--- 1823,1845 ----
/* ??? Any reason the field size would be different than
the size we got from the type? */
}
+ else if (flag_argument_noalias > 1
+ && TREE_CODE (t) == INDIRECT_REF
+ && TREE_CODE (TREE_OPERAND (t, 0)) == PARM_DECL)
+ {
+ expr = t;
+ offset = NULL;
+ }
+ }
+
+ /* If this is a Fortran indirect argument reference, record the
+ parameter decl. */
+ else if (flag_argument_noalias > 1
+ && TREE_CODE (t) == INDIRECT_REF
+ && TREE_CODE (TREE_OPERAND (t, 0)) == PARM_DECL)
+ {
+ expr = t;
+ offset = NULL;
}
}
Index: print-rtl.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/print-rtl.c,v
retrieving revision 1.84
diff -c -p -d -r1.84 print-rtl.c
*** print-rtl.c 18 Jun 2002 20:12:13 -0000 1.84
--- print-rtl.c 19 Jul 2002 22:23:34 -0000
*************** print_mem_expr (outfile, expr)
*** 92,97 ****
--- 92,103 ----
fprintf (outfile, ".%s",
IDENTIFIER_POINTER (DECL_NAME (TREE_OPERAND (expr, 1))));
}
+ else if (TREE_CODE (expr) == INDIRECT_REF)
+ {
+ fputs (" (*", outfile);
+ print_mem_expr (outfile, TREE_OPERAND (expr, 0));
+ fputs (")", outfile);
+ }
else if (DECL_NAME (expr))
fprintf (outfile, " %s", IDENTIFIER_POINTER (DECL_NAME (expr)));
else if (TREE_CODE (expr) == RESULT_DECL)
* Re: Alias analysis - does base_alias_check still work ?
From: Richard Henderson @ 2002-07-20 12:12 UTC
To: Toon Moene, gcc; +Cc: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 769 bytes --]
On Fri, Jul 19, 2002 at 03:35:14PM -0700, Richard Henderson wrote:
> The remaining sts/lds pairs are writes then reads from SY.
> We've lost track of the fact that the write is to index I
> and the read from index I+1, and so cannot overlap.
This appears to be the unroller doing stupid things. The attached
patch1 should cure this. If this patch can be shown to be a win,
we can axe this section of code properly rather than goto out of it.
I also tried running the unroller during the first loop pass so
that the second loop pass could clean up the giv lossage. This
didn't work for this case, but I'd be interested in knowing what
effect this has generically. I.e. a three-way benchmark comparison:
with -fno-rerun-loop-opt
without patch2
with patch2
r~
[-- Attachment #2: patch1 --]
[-- Type: text/plain, Size: 1379 bytes --]
Index: unroll.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/unroll.c,v
retrieving revision 1.169
diff -c -p -d -r1.169 unroll.c
*** unroll.c 30 Jun 2002 05:06:01 -0000 1.169
--- unroll.c 19 Jul 2002 23:51:44 -0000
*************** find_splittable_givs (loop, bl, unroll_t
*** 2867,2875 ****
register to hold the split value of the DEST_ADDR giv.
Emit insn to initialize its value before loop start. */
! rtx tem = gen_reg_rtx (v->mode);
! struct induction *same = v->same;
! rtx new_reg = v->new_reg;
record_base_value (REGNO (tem), v->add_val, 0);
/* If the address giv has a constant in its new_reg value,
--- 2867,2885 ----
register to hold the split value of the DEST_ADDR giv.
Emit insn to initialize its value before loop start. */
! rtx tem;
! struct induction *same;
! rtx new_reg;
!
! /* ??? This appears to be entirely crap. All it appears
! to do is scrog giv combination and confuse alias
! analysis such that it forgets that two DEST_ADDR
! givs have the same base register. */
! continue;
!
! tem = gen_reg_rtx (v->mode);
! same = v->same;
! new_reg = v->new_reg;
record_base_value (REGNO (tem), v->add_val, 0);
/* If the address giv has a constant in its new_reg value,
[-- Attachment #3: patch2 --]
[-- Type: text/plain, Size: 4243 bytes --]
Index: loop.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/loop.c,v
retrieving revision 1.412
diff -c -p -d -r1.412 loop.c
*** loop.c 19 Jul 2002 16:31:40 -0000 1.412
--- loop.c 19 Jul 2002 23:51:43 -0000
*************** strength_reduce (loop, flags)
*** 5320,5326 ****
collected. Always unroll loops that would be as small or smaller
unrolled than when rolled. */
if ((flags & LOOP_UNROLL)
! || (!(flags & LOOP_FIRST_PASS)
&& loop_info->n_iterations > 0
&& unrolled_insn_copies <= insn_count))
unroll_loop (loop, insn_count, 1);
--- 5320,5326 ----
collected. Always unroll loops that would be as small or smaller
unrolled than when rolled. */
if ((flags & LOOP_UNROLL)
! || ((flags & LOOP_AUTO_UNROLL)
&& loop_info->n_iterations > 0
&& unrolled_insn_copies <= insn_count))
unroll_loop (loop, insn_count, 1);
Index: loop.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/loop.h,v
retrieving revision 1.61
diff -c -p -d -r1.61 loop.h
*** loop.h 30 May 2002 20:55:11 -0000 1.61
--- loop.h 19 Jul 2002 23:51:43 -0000
*************** Software Foundation, 59 Temple Place - S
*** 28,34 ****
#define LOOP_UNROLL 1
#define LOOP_BCT 2
#define LOOP_PREFETCH 4
! #define LOOP_FIRST_PASS 8
/* Get the loop info pointer of a loop. */
#define LOOP_INFO(LOOP) ((struct loop_info *) (LOOP)->aux)
--- 28,34 ----
#define LOOP_UNROLL 1
#define LOOP_BCT 2
#define LOOP_PREFETCH 4
! #define LOOP_AUTO_UNROLL 8
/* Get the loop info pointer of a loop. */
#define LOOP_INFO(LOOP) ((struct loop_info *) (LOOP)->aux)
Index: toplev.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.658
diff -c -p -d -r1.658 toplev.c
*** toplev.c 17 Jul 2002 03:03:40 -0000 1.658
--- toplev.c 19 Jul 2002 23:51:43 -0000
***************
*** 1,4 ****
-
/* Top level of GNU C compiler
Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
1999, 2000, 2001, 2002 Free Software Foundation, Inc.
--- 1,3 ----
*************** rest_of_compilation (decl)
*** 2878,2883 ****
--- 2877,2884 ----
if (optimize > 0 && flag_loop_optimize)
{
+ int do_unroll, do_prefetch;
+
timevar_push (TV_LOOP);
delete_dead_jumptables ();
cleanup_cfg (CLEANUP_EXPENSIVE | CLEANUP_PRE_LOOP);
*************** rest_of_compilation (decl)
*** 2885,2896 ****
/* CFG is no longer maintained up-to-date. */
free_bb_for_insn ();
if (flag_rerun_loop_opt)
{
cleanup_barriers ();
/* We only want to perform unrolling once. */
! loop_optimize (insns, rtl_dump_file, LOOP_FIRST_PASS);
/* The first call to loop_optimize makes some instructions
trivially dead. We delete those instructions now in the
--- 2886,2900 ----
/* CFG is no longer maintained up-to-date. */
free_bb_for_insn ();
+ do_unroll = flag_unroll_loops ? LOOP_UNROLL : LOOP_AUTO_UNROLL;
+ do_prefetch = flag_prefetch_loop_arrays ? LOOP_PREFETCH : 0;
if (flag_rerun_loop_opt)
{
cleanup_barriers ();
/* We only want to perform unrolling once. */
! loop_optimize (insns, rtl_dump_file, do_unroll);
! do_unroll = 0;
/* The first call to loop_optimize makes some instructions
trivially dead. We delete those instructions now in the
*************** rest_of_compilation (decl)
*** 2903,2911 ****
reg_scan (insns, max_reg_num (), 1);
}
cleanup_barriers ();
! loop_optimize (insns, rtl_dump_file,
! (flag_unroll_loops ? LOOP_UNROLL : 0) | LOOP_BCT
! | (flag_prefetch_loop_arrays ? LOOP_PREFETCH : 0));
/* Loop can create trivially dead instructions. */
delete_trivially_dead_insns (insns, max_reg_num ());
--- 2907,2913 ----
reg_scan (insns, max_reg_num (), 1);
}
cleanup_barriers ();
! loop_optimize (insns, rtl_dump_file, do_unroll | LOOP_BCT | do_prefetch);
/* Loop can create trivially dead instructions. */
delete_trivially_dead_insns (insns, max_reg_num ());
* Re: Alias analysis - does base_alias_check still work ?
From: Toon Moene @ 2002-07-21 10:01 UTC
To: Richard Henderson; +Cc: gcc, gcc-patches
Richard Henderson wrote:
> On Fri, Jul 19, 2002 at 03:35:14PM -0700, Richard Henderson wrote:
> > The remaining sts/lds pairs are writes then reads from SY.
> > We've lost track of the fact that the write is to index I
> > and the read from index I+1, and so cannot overlap.
>
> This appears to be the unroller doing stupid things. The attached
> patch1 should cure this. If this patch can be shown to be a win,
> we can axe this section of code properly rather than goto out of it.
I combined your MEM_EXPR patch and patch1, but now I get (-O2
-funroll-loops -fno-rerun-loop-opt):
$L6:
lds $f10,0($18)
lds $f14,-4($5)
lda $2,4($5)
lda $3,8($5)
lda $18,4($18)
lda $4,12($5)
lda $1,-3($7)
lda $7,-4($7)
lds $f11,0($18)
addl $1,$31,$6
lda $18,4($18)
lds $f12,0($18)
lda $18,4($18)
muls $f15,$f10,$f10
lds $f13,0($18)
lda $18,4($18)
muls $f15,$f11,$f11
muls $f15,$f12,$f12
adds $f14,$f10,$f14
muls $f15,$f13,$f13
sts $f14,-4($5)
lda $5,16($5)
lds $f10,-4($2)
adds $f10,$f11,$f10
sts $f10,-4($2)
lds $f11,-4($3)
adds $f11,$f12,$f11
sts $f11,-4($3)
lds $f10,-4($4)
adds $f10,$f13,$f10
sts $f10,-4($4)
bge $6,$L6
which is worse than what you showed for the MEM_EXPR patch alone (32 insns vs
27).
* Re: Alias analysis - does base_alias_check still work ?
From: Richard Henderson @ 2002-07-21 14:23 UTC
To: Toon Moene; +Cc: gcc, gcc-patches
On Sun, Jul 21, 2002 at 11:24:00AM +0200, Toon Moene wrote:
> I combined your MEM_EXPR patch and patch1, but now I get (-O2
> -funroll-loops -fno-rerun-loop-opt):
[...]
> which is worse than you showed for the MEM_EXPR patch alone
> (32 insns vs 27).
So is the claim that patch1 is dependent on patch2?
r~
* Re: Alias analysis - does base_alias_check still work ?
From: Toon Moene @ 2002-07-21 15:14 UTC
To: Richard Henderson; +Cc: gcc, gcc-patches
Richard Henderson wrote:
> On Sun, Jul 21, 2002 at 11:24:00AM +0200, Toon Moene wrote:
> > I combined your MEM_EXPR patch and patch1, but now I get (-O2
> > -funroll-loops -fno-rerun-loop-opt):
> [...]
> > which is worse than you showed for the MEM_EXPR patch alone
> > (32 insns vs 27).
>
> So is the claim that patch1 is dependent on patch2?
Ah, sorry, didn't test because I thought that this would invalidate any
useful testing of patch2.
Will try asap.
* Re: Alias analysis - does base_alias_check still work ?
From: Toon Moene @ 2002-07-21 22:41 UTC
To: Richard Henderson; +Cc: gcc, gcc-patches
Richard Henderson wrote:
> On Sun, Jul 21, 2002 at 11:24:00AM +0200, Toon Moene wrote:
> > I combined your MEM_EXPR patch and patch1, but now I get (-O2
> > -funroll-loops -fno-rerun-loop-opt):
> [...]
> > which is worse than you showed for the MEM_EXPR patch alone
> > (32 insns vs 27).
> So is the claim that patch1 is dependent on patch2?
[alphaev6-linux-gnu]
OK, you won.
Original (trunk checked out of CVS at 10 UTC this morning), with our NWP
software compiled -O2 -ffast-math -funroll-loops -fno-rerun-loop-opt:
ETAETA TOOK 2.4101572 SECONDS
ETAETA TOOK 3.5712893 SECONDS
0SUPOBS TOOK : 0.33203029633
0DATACH TOOK : 442.5986328125
0ANAEVA TOOK : 105.2470703125
0GRPEVA TOOK : 635.9697265625
0HUMSUP TOOK : 0.02246093750
0DATACH TOOK : 28.8212890625
0HUMEVA TOOK : 7.5175781250
0GRPEVA TOOK : 14.9267578125
ANALYSIS TOOK: 1257.66016 SEC.
PREPARATIONS TOOK 4.39550686 SECONDS
FORECAST TOOK 131.1475 SECONDS
PREPARATIONS TOOK 6.03613281 SECONDS
FORECAST TOOK 354.8984 SECONDS
PREPARATIONS TOOK 13.0136719 SECONDS
FORECAST TOOK 989.5225 SECONDS
ETAETA TOOK 3.62402296 SECONDS
0SUPOBS TOOK : 0.18945407867
0DATACH TOOK : 830.9023437500
0ANAEVA TOOK : 117.0732421875
0GRPEVA TOOK : 768.3984375000
0HUMSUP TOOK : 0.03613281250
0DATACH TOOK : 17.2988281250
0HUMEVA TOOK : 5.5039062500
0GRPEVA TOOK : 11.3378906250
ANALYSIS TOOK: 1777.08496 SEC.
PREPARATIONS TOOK 6.4160161 SECONDS
FORECAST TOOK 127.3857 SECONDS
PREPARATIONS TOOK 7.72753906 SECONDS
FORECAST TOOK 361.6230 SECONDS
PREPARATIONS TOOK 12.6972656 SECONDS
FORECAST TOOK 974.3926 SECONDS
In addition, with your patches 0 (MEM_EXPR), 1 and 2 applied, and our NWP
software compiled -O2 -ffast-math -funroll-loops:
ETAETA TOOK 4.66308594 SECONDS
ETAETA TOOK 6.23046875 SECONDS
0SUPOBS TOOK : 0.74706935883
0DATACH TOOK : 354.9746093750
0ANAEVA TOOK : 120.2656250000
0GRPEVA TOOK : 607.8193359375
0HUMSUP TOOK : 0.01953125000
0DATACH TOOK : 31.6162109375
0HUMEVA TOOK : 8.9648437500
0GRPEVA TOOK : 14.9980468750
ANALYSIS TOOK: 1162.40234 SEC.
PREPARATIONS TOOK 5.09668016 SECONDS
FORECAST TOOK 118.2822 SECONDS
PREPARATIONS TOOK 6.49609375 SECONDS
FORECAST TOOK 339.0889 SECONDS
PREPARATIONS TOOK 13.4384766 SECONDS
FORECAST TOOK 910.8564 SECONDS
ETAETA TOOK 2.88281202 SECONDS
0SUPOBS TOOK : 0.75683593750
0DATACH TOOK : 680.7177734375
0ANAEVA TOOK : 117.9658203125
0GRPEVA TOOK : 743.5371093750
0HUMSUP TOOK : 0.01562500000
0DATACH TOOK : 16.0937500000
0HUMEVA TOOK : 5.9082031250
0GRPEVA TOOK : 11.3769531250
ANALYSIS TOOK: 1620.9668 SEC.
PREPARATIONS TOOK 5.05175829 SECONDS
FORECAST TOOK 124.7197 SECONDS
PREPARATIONS TOOK 5.18066406 SECONDS
FORECAST TOOK 331.8086 SECONDS
PREPARATIONS TOOK 11.7978516 SECONDS
FORECAST TOOK 922.9990 SECONDS
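[Editorial sketch, not part of the original message.] To make the
comparison between the two listings easier to read, the relative change
in the ANALYSIS totals and in the summed FORECAST times can be worked
out as below. The numbers are copied from the listings above; the helper
name pct_change, the TIMINGS table and the "first set"/"second set"
labels are assumptions introduced here for illustration. The sketch
looks only at the totals, so it says nothing about individual phases
such as ETAETA, and the two builds also differ in
-fno-rerun-loop-opt, so the change is not attributable to the patches
alone.

```python
# Relative change between the original build and the patched build,
# using only the ANALYSIS totals and the summed FORECAST times quoted above.

def pct_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` in percent (negative = faster)."""
    return (after - before) / before * 100.0


# (original build, patched build) in seconds; labels are illustrative only.
TIMINGS = {
    "ANALYSIS, first set": (1257.66016, 1162.40234),
    "ANALYSIS, second set": (1777.08496, 1620.9668),
    "FORECAST sum, first set": (
        131.1475 + 354.8984 + 989.5225,
        118.2822 + 339.0889 + 910.8564,
    ),
    "FORECAST sum, second set": (
        127.3857 + 361.6230 + 974.3926,
        124.7197 + 331.8086 + 922.9990,
    ),
}


def summarize(timings: dict) -> dict:
    """Map each label to its percentage change."""
    return {name: pct_change(before, after) for name, (before, after) in timings.items()}


if __name__ == "__main__":
    for name, change in summarize(TIMINGS).items():
        print(f"{name:26s} {change:+6.2f}%")
```

On these totals every entry comes out negative, that is, the patched
build is faster on each of the four aggregates considered.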
Cheers,
--
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
Join GNU Fortran 95: http://g95.sourceforge.net/ (under construction)
* Re: Alias analysis - does base_alias_check still work ?
2002-07-21 22:41 ` Toon Moene
@ 2002-07-22 0:03 ` Richard Henderson
2002-07-22 16:42 ` Toon Moene
0 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2002-07-22 0:03 UTC (permalink / raw)
To: Toon Moene; +Cc: gcc, gcc-patches
On Mon, Jul 22, 2002 at 12:23:55AM +0200, Toon Moene wrote:
> OK, you won.
Excellent. I've checked things in.
r~
* Re: Alias analysis - does base_alias_check still work ?
2002-07-22 0:03 ` Richard Henderson
@ 2002-07-22 16:42 ` Toon Moene
2002-07-23 2:12 ` Andreas Jaeger
0 siblings, 1 reply; 19+ messages in thread
From: Toon Moene @ 2002-07-22 16:42 UTC (permalink / raw)
To: Richard Henderson; +Cc: gcc, gcc-patches
Richard Henderson wrote:
> On Mon, Jul 22, 2002 at 12:23:55AM +0200, Toon Moene wrote:
> > OK, you won.
> Excellent. I've checked things in.
Well, that helped. Look at the SPECfp scores on Andreas' site !
--
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
Join GNU Fortran 95: http://g95.sourceforge.net/ (under construction)
* Re: Alias analysis - does base_alias_check still work ?
2002-07-22 16:42 ` Toon Moene
@ 2002-07-23 2:12 ` Andreas Jaeger
0 siblings, 0 replies; 19+ messages in thread
From: Andreas Jaeger @ 2002-07-23 2:12 UTC (permalink / raw)
To: Toon Moene; +Cc: Richard Henderson, gcc, gcc-patches
Toon Moene <toon@moene.indiv.nluug.nl> writes:
> Richard Henderson wrote:
>
>> On Mon, Jul 22, 2002 at 12:23:55AM +0200, Toon Moene wrote:
>> > OK, you won.
>
>> Excellent. I've checked things in.
>
> Well, that helped. Look at the SPECfp scores on Andreas' site !
Wow! That's really impressive for wupwise and swim!
Andreas
--
Andreas Jaeger
SuSE Labs aj@suse.de
private aj@arthur.inka.de
http://www.suse.de/~aj
* Re: Alias analysis - does base_alias_check still work ?
2002-07-19 13:31 ` Richard Henderson
2002-07-20 2:13 ` Toon Moene
@ 2002-08-12 7:49 ` Jeff Law
2002-08-12 7:53 ` Jan Hubicka
1 sibling, 1 reply; 19+ messages in thread
From: Jeff Law @ 2002-08-12 7:49 UTC (permalink / raw)
To: Richard Henderson; +Cc: Toon Moene, gcc
In message <20020719095446.A15598@redhat.com>, Richard Henderson writes:
>On Fri, Jul 19, 2002 at 04:40:26PM +0200, Toon Moene wrote:
>> Which begs the question: Is there a reason -fschedule-insns isn't on by
>> default when using -O2 ?
>
>Yes. The fact that the scheduler doesn't understand register
>pressure means that pre-register-allocation scheduling generally
>sucks eggs on x86.
True. But the real reason -fschedule-insns isn't on by default for ia32
is the return register problem -- which I believe you actually fixed
a while back, but we haven't gone back to see if it's safe/profitable to
enable the first scheduling pass for ia32.
Jeff
* Re: Alias analysis - does base_alias_check still work ?
2002-08-12 7:49 ` Jeff Law
@ 2002-08-12 7:53 ` Jan Hubicka
2002-08-12 9:56 ` Richard Henderson
0 siblings, 1 reply; 19+ messages in thread
From: Jan Hubicka @ 2002-08-12 7:53 UTC (permalink / raw)
To: law; +Cc: Richard Henderson, Toon Moene, gcc
> In message <20020719095446.A15598@redhat.com>, Richard Henderson writes:
> >On Fri, Jul 19, 2002 at 04:40:26PM +0200, Toon Moene wrote:
> >> Which begs the question: Is there a reason -fschedule-insns isn't on by
> >> default when using -O2 ?
> >
> >Yes. The fact that the scheduler doesn't understand register
> >pressure means that pre-register-allocation scheduling generally
> >sucks eggs on x86.
> True. But the real reason -fschedule-insns isn't on by default for ia32
> is the return register problem -- which I believe you actually fixed
> a while back, but we haven't gone back to see if it's safe/profitable to
> enable the first scheduling pass for ia32.
Did you really fix all the problems regarding SMALL_REGISTER_CLASSes?
When using register passing conventions I've seen ia32 compilation
dying all the time.
It may be interesting to set the flag on at least for x86_64 where
register pressure is lower if it worked.
Honza
>
> Jeff
>
* Re: Alias analysis - does base_alias_check still work ?
2002-08-12 7:53 ` Jan Hubicka
@ 2002-08-12 9:56 ` Richard Henderson
0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2002-08-12 9:56 UTC (permalink / raw)
To: Jan Hubicka; +Cc: law, Toon Moene, gcc
On Mon, Aug 12, 2002 at 04:52:57PM +0200, Jan Hubicka wrote:
> Did you really fix all the problems regarding SMALL_REGISTER_CLASSes?
I thought so.
> It may be interesting to set the flag on at least for x86_64 where
> register pressure is lower if it worked.
Indeed.
r~
end of thread, other threads:[~2002-08-12 9:56 UTC | newest]
Thread overview: 19+ messages
2002-07-16 14:10 Alias analysis - does base_alias_check still work ? Toon Moene
2002-07-19 11:00 ` Toon Moene
2002-07-19 11:02 ` Daniel Berlin
2002-07-19 11:03 ` David Edelsohn
2002-07-19 13:31 ` Richard Henderson
2002-07-20 2:13 ` Toon Moene
2002-07-20 11:42 ` Toon Moene
2002-07-20 12:05 ` Richard Henderson
2002-07-20 12:12 ` Richard Henderson
2002-07-21 10:01 ` Toon Moene
2002-07-21 14:23 ` Richard Henderson
2002-07-21 15:14 ` Toon Moene
2002-07-21 22:41 ` Toon Moene
2002-07-22 0:03 ` Richard Henderson
2002-07-22 16:42 ` Toon Moene
2002-07-23 2:12 ` Andreas Jaeger
2002-08-12 7:49 ` Jeff Law
2002-08-12 7:53 ` Jan Hubicka
2002-08-12 9:56 ` Richard Henderson