* Alias analysis - does base_alias_check still work ?
From: Toon Moene @ 2002-07-16 14:10 UTC
To: gcc

L.S.,

f/com.c contains the following note, preceding the definition of

#define LANG_HOOKS_GET_ALIAS_SET hook_get_alias_set_0

/* We do not wish to use alias-set based aliasing at all.  Used in the
   extreme (every object with its own set, with equivalences recorded) it
   might be helpful, but there are problems when it comes to inlining.  We
   get on ok with flag_argument_noalias, and alias-set aliasing does
   currently limit how stack slots can be reused, which is a lose.  */

I do not know if all the facts mentioned here still actually hold, but I
do have strong doubts that base_alias_check in alias.c still does its
duty.

Consider the following Fortran source:

      SUBROUTINE SIMPLE(A, B)
      B = 3.0
      A = 2.0
      B = A*B
      END

one would assume that alias analysis at least once should check that the
assignment to A in line 3 doesn't change the value of B set in line 2,
which, with

      flag_argument_noalias > 1

[arguments don't alias] in effect, would be the case.

However, according to my experiments with setting breakpoints in
base_alias_check, it never passes that point.

Before I go on a wholesale check to see if base_alias_check *ever* returns
anything other than `1` (x and y might alias), does someone have a good
idea to narrow the search ?

Thanks in advance,

-- 
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
Join GNU Fortran 95: http://g95.sourceforge.net/ (under construction)

* Re: Alias analysis - does base_alias_check still work ?
From: Toon Moene @ 2002-07-19 11:00 UTC
To: gcc

I wrote:

> f/com.c contains the following note, preceding the definition of
>
> #define LANG_HOOKS_GET_ALIAS_SET hook_get_alias_set_0
>
> /* We do not wish to use alias-set based aliasing at all.  Used in the
>    extreme (every object with its own set, with equivalences recorded) it
>    might be helpful, but there are problems when it comes to inlining.  We
>    get on ok with flag_argument_noalias, and alias-set aliasing does
>    currently limit how stack slots can be reused, which is a lose.  */
>
> I do not know if all the facts mentioned here still actually hold, but I
> do have strong doubts that base_alias_check in alias.c still does its
> duty.
>
> Consider the following Fortran source:
>
>       SUBROUTINE SIMPLE(A, B)
>       B = 3.0
>       A = 2.0
>       B = A*B
>       END
>
> one would assume that alias analysis at least once should check that the
> assignment to A in line 3 doesn't change the value of B set in line 2,
> which, with
>
>       flag_argument_noalias > 1
>
> [arguments don't alias] in effect, would be the case.
>
> However, according to my experiments with setting breakpoints in
> base_alias_check, it never passes that point.

Sigh, that's just because it doesn't need to.  The code generated looks
like this:

        ...
        movl    $0x40400000, (%edx)     ! B=3.0
        movl    $0x40000000, (%eax)     ! A=2.0
        flds    (%edx)                  ! put B on stack
        ...

which, of course, gets neatly around the problem of whether the store into
A would change B.

Now for the real issue.  To have register renaming be really effective,
alias analysis has to work well.

Take the following example:

      subroutine saxpy(n,sa,sx,sy)
      real sx(n),sy(n),sa
      integer i,n
      do i = 1,n
         sy(i) = sy(i) + sa*sx(i)
      enddo
      end

If we compile this with -O2 -march=pentium4 -mfpmath=sse -funroll-loops
-frename-registers, we get for the unrolled loop:

.L6:
        movaps  %xmm1, %xmm7
        movaps  %xmm1, %xmm6
        movaps  %xmm1, %xmm5
        mulss   (%edx), %xmm7
        movaps  %xmm1, %xmm4
        addss   (%eax), %xmm7
        movss   %xmm7, (%eax)
        mulss   4(%edx), %xmm6
        addss   4(%eax), %xmm6
        movss   %xmm6, 4(%eax)
        mulss   8(%edx), %xmm5
        addss   8(%eax), %xmm5
        movss   %xmm5, 8(%eax)
        mulss   12(%edx), %xmm4
        addl    $16, %edx
        addss   12(%eax), %xmm4
        movss   %xmm4, 12(%eax)
        addl    $16, %eax
        subl    $4, %ecx
        jns     .L6

Obviously, register renaming has done its work, but the (re-)scheduling of
instructions leaves something to be desired.  After much gdb'ing in
sched-deps.c and alias.c I believe I have found the cause: the
rescheduling of this loop after register renaming is run after register
allocation (hey, no surprise :-).  However, alias analysis is really
careful about assumptions on the contents of these hard registers, so
almost no instruction gets moved around.

OK, but what if we allow instruction scheduling before register allocation
(that would only be beneficial if the floating point (pseudo) registers
have different "names" already, but fortunately, they do), using
-fschedule-insns instead of -frename-registers:

.L6:
        movaps  %xmm4, %xmm0
        movaps  %xmm4, %xmm1
        movaps  %xmm4, %xmm2
        movaps  %xmm4, %xmm3
        mulss   (%edx), %xmm0
        mulss   4(%edx), %xmm1
        mulss   8(%edx), %xmm2
        mulss   12(%edx), %xmm3
        addss   (%eax), %xmm0
        addss   4(%eax), %xmm1
        addss   8(%eax), %xmm2
        addss   12(%eax), %xmm3
        movss   %xmm0, (%eax)
        movss   %xmm1, 4(%eax)
        movss   %xmm2, 8(%eax)
        movss   %xmm3, 12(%eax)
        addl    $16, %edx
        addl    $16, %eax
        subl    $4, %ecx
        jns     .L6

Bingo !  That's a lot better !

Which begs the question:  Is there a reason -fschedule-insns isn't on by
default when using -O2 ?

Cheers,

-- 
Toon Moene
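
For illustration, the two schedules of the saxpy loop above can be mirrored in C. This is only a minimal sketch, not anything from GCC or g77: the name `saxpy_unrolled4` is hypothetical, and the `restrict` qualifiers stand in for the "arguments don't alias" guarantee that Fortran gives for SX and SY. The body is written in the grouped order of the second listing (all multiplies, then all adds, then all stores). Hoisting the loads of `sx[i+1..i+3]` and `sy[i+1..i+3]` above the store to `sy[i]` is valid only when the compiler can prove those accesses do not overlap, which is the job of alias analysis.

```c
#include <stddef.h>

/* Hypothetical helper (not part of GCC or the thread): saxpy unrolled by
   four, in "all multiplies, all adds, all stores" order.  The restrict
   qualifiers are an assumption standing in for Fortran's no-alias rule.  */
void saxpy_unrolled4(size_t n, float sa,
                     const float *restrict sx, float *restrict sy)
{
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        /* Four independent multiplies.  */
        float t0 = sa * sx[i];
        float t1 = sa * sx[i + 1];
        float t2 = sa * sx[i + 2];
        float t3 = sa * sx[i + 3];

        /* Four independent adds; the loads of sy[i+1..i+3] happen
           before any store to sy in this iteration group.  */
        t0 += sy[i];
        t1 += sy[i + 1];
        t2 += sy[i + 2];
        t3 += sy[i + 3];

        /* Four stores at the end.  */
        sy[i]     = t0;
        sy[i + 1] = t1;
        sy[i + 2] = t2;
        sy[i + 3] = t3;
    }

    /* Remainder loop for n not divisible by four.  */
    for (; i < n; i++)
        sy[i] += sa * sx[i];
}
```

If `sx` and `sy` were allowed to overlap, the grouped order could read an element before an earlier store had updated it, so the reordering would no longer be equivalent to the rolled loop.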

* Re: Alias analysis - does base_alias_check still work ?
From: Daniel Berlin @ 2002-07-19 11:02 UTC
To: Toon Moene; +Cc: gcc

On Fri, 19 Jul 2002, Toon Moene wrote:

> [...]
>
> Which begs the question:  Is there a reason -fschedule-insns isn't on by
> default when using -O2 ?

Err, it is.

  if (optimize >= 2)
    {
      flag_optimize_sibling_calls = 1;
      flag_cse_follow_jumps = 1;
      flag_cse_skip_blocks = 1;
      flag_gcse = 1;
      flag_expensive_optimizations = 1;
      flag_strength_reduce = 1;
      flag_rerun_cse_after_loop = 1;
      flag_rerun_loop_opt = 1;
      flag_caller_saves = 1;
      flag_force_mem = 1;
      flag_peephole2 = 1;
#ifdef INSN_SCHEDULING
      flag_schedule_insns = 1;
      flag_schedule_insns_after_reload = 1;
#endif
      flag_regmove = 1;
      flag_strict_aliasing = 1;
      flag_delete_null_pointer_checks = 1;
      flag_reorder_blocks = 1;
      flag_reorder_functions = 1;
    }

>
> Cheers,
>

* Re: Alias analysis - does base_alias_check still work ?
From: David Edelsohn @ 2002-07-19 11:03 UTC
To: Daniel Berlin, Toon Moene; +Cc: gcc

>>>>> Daniel Berlin writes:

>> Which begs the question:  Is there a reason -fschedule-insns isn't on by
>> default when using -O2 ?

Daniel> Err, it is.

Daniel> if (optimize >= 2)
Daniel> flag_peephole2 = 1;
Daniel> #ifdef INSN_SCHEDULING
Daniel> flag_schedule_insns = 1;
Daniel> flag_schedule_insns_after_reload = 1;
Daniel> #endif

        Except i386.c turns it off:

void
optimization_options (level, size)
     int level;
     int size ATTRIBUTE_UNUSED;
{
  /* For -O2 and beyond, turn off -fschedule-insns by default.  It tends to
     make the problem with not enough registers even worse.  */
#ifdef INSN_SCHEDULING
  if (level > 1)
    flag_schedule_insns = 0;
#endif

David
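
The interplay between the generic -O2 defaults and the i386 override can be sketched as below. This is a simplified model, not GCC's code: `struct sched_flags`, `generic_defaults`, `i386_override` and `effective_flags` are hypothetical names, and the sketch assumes that the target hook runs after the generic defaults have been applied, which is what the two quoted fragments suggest.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two global flags discussed above.  */
struct sched_flags {
    bool schedule_insns;              /* scheduling before register allocation */
    bool schedule_insns_after_reload; /* scheduling after register allocation  */
};

/* Generic part: what the "optimize >= 2" block does for these flags.  */
static void generic_defaults(struct sched_flags *f, int level)
{
    if (level >= 2) {
        f->schedule_insns = true;
        f->schedule_insns_after_reload = true;
    }
}

/* Target part: mirrors the quoted i386 hook, which switches the
   pre-allocation scheduler back off at -O2 and above.  */
static void i386_override(struct sched_flags *f, int level)
{
    if (level > 1)
        f->schedule_insns = false;
}

/* Assumed ordering: generic defaults first, then the target override.  */
struct sched_flags effective_flags(int level, bool is_i386)
{
    struct sched_flags f = { false, false };

    generic_defaults(&f, level);
    if (is_i386)
        i386_override(&f, level);
    return f;
}
```

Under this model, `effective_flags(2, true)` leaves only the post-allocation scheduler enabled, which matches what was observed on x86 at plain -O2.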

* Re: Alias analysis - does base_alias_check still work ?
From: Richard Henderson @ 2002-07-19 13:31 UTC
To: Toon Moene; +Cc: gcc

On Fri, Jul 19, 2002 at 04:40:26PM +0200, Toon Moene wrote:
> Which begs the question:  Is there a reason -fschedule-insns isn't on by
> default when using -O2 ?

Yes.  The fact that the scheduler doesn't understand register
pressure means that pre-register-allocation scheduling generally
sucks eggs on x86.

r~

* Re: Alias analysis - does base_alias_check still work ?
From: Toon Moene @ 2002-07-20 2:13 UTC
To: Richard Henderson; +Cc: gcc

Richard Henderson wrote:

> On Fri, Jul 19, 2002 at 04:40:26PM +0200, Toon Moene wrote:

> > Which begs the question:  Is there a reason -fschedule-insns isn't on by
> > default when using -O2 ?

> Yes.  The fact that the scheduler doesn't understand register
> pressure means that pre-register-allocation scheduling generally
> sucks eggs on x86.

Thanks.  Independently of your message and David Edelsohn's (showing the
comment that gives the rationale for turning this option off on ix86) I
came to the same conclusion while having dinner (nothing helps better than
giving the grey cells some rest :-)

Now the 64K question is: How can it be that I went through all this
trouble because I saw the same "mis-optimization" on the Alpha, where
- given what Daniel showed us - -fschedule-insns is on ?

I'm switching machines as we speak to test this ...

-- 
Toon Moene

* Re: Alias analysis - does base_alias_check still work ?
From: Toon Moene @ 2002-07-20 11:42 UTC
To: Richard Henderson, gcc

I wrote:

> Richard Henderson wrote:

> > Yes.  The fact that the scheduler doesn't understand register
> > pressure means that pre-register-allocation scheduling generally
> > sucks eggs on x86.
>
> Thanks.  Independently from your message and David Edelsohn's (showing the
> comment that gives the rationale for turning this option off on ix86) I
> came to the same conclusion while having dinner (nothing helps better than
> give the grey cells some rest :-)
>
> Now the 64K question is: How can it be that I came around going through all
> this trouble because I saw the same "mis-optimization" on the Alpha, where
> - given what Daniel showed us - -fschedule-insns is on ?
>
> I'm switching machines as we speak to test this ...

Well, it obviously doesn't work on the Alpha.  First of all I have to
specify -fno-rerun-loop-opt to get any loop unrolling at all, and then the
unrolled loop looks like this:

$L6:
        lds $f12,0($17)
        lds $f10,0($18)
        lda $1,-3($5)
        lda $5,-4($5)
        lds $f11,-4($3)
        addl $1,$31,$4
        muls $f12,$f10,$f10
        adds $f11,$f10,$f11
        sts $f11,-4($2)
        lds $f10,4($18)
        lds $f11,0($3)
        muls $f12,$f10,$f10
        adds $f11,$f10,$f11
        sts $f11,0($2)
        lds $f10,8($18)
        lds $f11,4($3)
        muls $f12,$f10,$f10
        adds $f11,$f10,$f11
        sts $f11,4($2)
        lds $f13,12($18)
        lds $f10,8($3)
        lda $18,16($18)
        lda $3,16($3)
        muls $f12,$f13,$f12
        adds $f10,$f12,$f10
        sts $f10,8($2)
        lda $2,16($2)
        bge $4,$L6

It's obvious that no scheduling has taken place, as all four operations
are still in sequence (lds, muls, adds, sts).

-- 
Toon Moene

* Re: Alias analysis - does base_alias_check still work ?
From: Richard Henderson @ 2002-07-20 12:05 UTC
To: Toon Moene; +Cc: gcc

On Fri, Jul 19, 2002 at 09:55:00PM +0200, Toon Moene wrote:
> Well, it obviously doesn't work on the Alpha.  First of all I have to
> specify -fno-rerun-loop-opt to get any loop unrolling at all, and then the
> unrolled loop looks like this:

Haven't looked at what exactly goes wrong with rerun-loop-opt,
except to notice that loop loses track of the register that
contains the iteration count.

As for the aliasing, the problem is that Alpha doesn't have
indexed addressing, so we wind up with the base addresses being
strength reduced, and presumably that confuses the alias analysis
code enough that it considers the memory references to be
"variable", and thus may alias anything.

One solution is to make use of the new MEM_EXPR field and record
this information such that it can't (or shouldn't) get lost ever.

Try the following.  For me it cleans up the example a bit:

$L6:
        lds $f10,0($18)
        lds $f12,-4($3)
        lda $1,-3($5)
        lda $5,-4($5)
        lds $f11,4($18)
        lds $f13,8($18)
        addl $1,$31,$4
        lds $f14,12($18)
        lda $18,16($18)
        muls $f15,$f10,$f10
        muls $f15,$f11,$f11
        muls $f15,$f13,$f13
        muls $f15,$f14,$f14
        adds $f12,$f10,$f12
        sts $f12,-4($2)
        lds $f10,0($3)
        adds $f10,$f11,$f10
        sts $f10,0($2)
        lds $f11,4($3)
        adds $f11,$f13,$f11
        sts $f11,4($2)
        lds $f10,8($3)
        lda $3,16($3)
        adds $f10,$f14,$f10
        sts $f10,8($2)
        lda $2,16($2)
        bge $4,$L6

The remaining sts/lds pairs are writes then reads from SY.
We've lost track of the fact that the write is to index I
and the read from index I+1, and so cannot overlap.

r~

Index: alias.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/alias.c,v
retrieving revision 1.177
diff -c -p -d -r1.177 alias.c
*** alias.c	20 Jun 2002 07:29:59 -0000	1.177
--- alias.c	19 Jul 2002 22:23:34 -0000
*************** nonoverlapping_memrefs_p (x, y)
*** 1957,1962 ****
--- 1957,1970 ----
        moffsetx = adjust_offset_for_component_ref (exprx, moffsetx);
        exprx = t;
      }
+   else if (TREE_CODE (exprx) == INDIRECT_REF)
+     {
+       exprx = TREE_OPERAND (exprx, 0);
+       if (flag_argument_noalias < 2
+           || TREE_CODE (exprx) != PARM_DECL)
+         return 0;
+     }
+ 
    moffsety = MEM_OFFSET (y);
    if (TREE_CODE (expry) == COMPONENT_REF)
      {
*************** nonoverlapping_memrefs_p (x, y)
*** 1965,1970 ****
--- 1973,1985 ----
          return 0;
        moffsety = adjust_offset_for_component_ref (expry, moffsety);
        expry = t;
+     }
+   else if (TREE_CODE (expry) == INDIRECT_REF)
+     {
+       expry = TREE_OPERAND (expry, 0);
+       if (flag_argument_noalias < 2
+           || TREE_CODE (expry) != PARM_DECL)
+         return 0;
      }

    if (! DECL_P (exprx) || ! DECL_P (expry))
Index: emit-rtl.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/emit-rtl.c,v
retrieving revision 1.284
diff -c -p -d -r1.284 emit-rtl.c
*** emit-rtl.c	11 Jul 2002 10:32:54 -0000	1.284
--- emit-rtl.c	19 Jul 2002 22:23:34 -0000
*************** set_mem_attributes (ref, t, objectp)
*** 1805,1811 ****
              }
            while (TREE_CODE (t) == ARRAY_REF);

!           if (TREE_CODE (t) == COMPONENT_REF)
              {
                expr = component_ref_for_mem_expr (t);
                if (host_integerp (off_tree, 1))
--- 1805,1821 ----
              }
            while (TREE_CODE (t) == ARRAY_REF);

!           if (DECL_P (t))
!             {
!               expr = t;
!               if (host_integerp (off_tree, 1))
!                 offset = GEN_INT (tree_low_cst (off_tree, 1));
!               size = (DECL_SIZE_UNIT (t)
!                       && host_integerp (DECL_SIZE_UNIT (t), 1)
!                       ? GEN_INT (tree_low_cst (DECL_SIZE_UNIT (t), 1)) : 0);
!               align = DECL_ALIGN (t);
!             }
!           else if (TREE_CODE (t) == COMPONENT_REF)
              {
                expr = component_ref_for_mem_expr (t);
                if (host_integerp (off_tree, 1))
*************** set_mem_attributes (ref, t, objectp)
*** 1813,1818 ****
--- 1823,1845 ----
                /* ??? Any reason the field size would be different than
                   the size we got from the type?  */
              }
+           else if (flag_argument_noalias > 1
+                    && TREE_CODE (t) == INDIRECT_REF
+                    && TREE_CODE (TREE_OPERAND (t, 0)) == PARM_DECL)
+             {
+               expr = t;
+               offset = NULL;
+             }
+         }
+ 
+       /* If this is a Fortran indirect argument reference, record the
+          parameter decl.  */
+       else if (flag_argument_noalias > 1
+                && TREE_CODE (t) == INDIRECT_REF
+                && TREE_CODE (TREE_OPERAND (t, 0)) == PARM_DECL)
+         {
+           expr = t;
+           offset = NULL;
          }
      }

Index: print-rtl.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/print-rtl.c,v
retrieving revision 1.84
diff -c -p -d -r1.84 print-rtl.c
*** print-rtl.c	18 Jun 2002 20:12:13 -0000	1.84
--- print-rtl.c	19 Jul 2002 22:23:34 -0000
*************** print_mem_expr (outfile, expr)
*** 92,97 ****
--- 92,103 ----
        fprintf (outfile, ".%s",
                 IDENTIFIER_POINTER (DECL_NAME (TREE_OPERAND (expr, 1))));
      }
+   else if (TREE_CODE (expr) == INDIRECT_REF)
+     {
+       fputs (" (*", outfile);
+       print_mem_expr (outfile, TREE_OPERAND (expr, 0));
+       fputs (")", outfile);
+     }
    else if (DECL_NAME (expr))
      fprintf (outfile, " %s", IDENTIFIER_POINTER (DECL_NAME (expr)));
    else if (TREE_CODE (expr) == RESULT_DECL)
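
The reasoning behind the alias.c change can be modelled in a few lines of C. This is a much simplified sketch and not the real nonoverlapping_memrefs_p: `struct memref` and `nonoverlapping` are hypothetical, and the model only knows two facts. Accesses based on two different dummy arguments cannot overlap when arguments are declared not to alias (the `flag_argument_noalias >= 2` case in the patch), and accesses off the same base cannot overlap when their byte ranges are disjoint (the "index I versus index I+1" case). For everything else it answers conservatively.

```c
#include <stdbool.h>

/* Hypothetical, simplified description of one memory access.  */
struct memref {
    int  base_id;      /* which declaration the address is based on      */
    bool base_is_parm; /* true if that declaration is a dummy argument   */
    long offset;       /* byte offset from the base                      */
    long size;         /* number of bytes accessed                       */
};

/* Return true only if the two accesses provably do not overlap.
   ARGUMENT_NOALIAS plays the role of flag_argument_noalias.  */
bool nonoverlapping(const struct memref *x, const struct memref *y,
                    int argument_noalias)
{
    if (x->base_id != y->base_id) {
        /* Different bases: in this sketch, only two distinct arguments
           under "arguments don't alias" are known to be disjoint.  */
        return argument_noalias >= 2
               && x->base_is_parm && y->base_is_parm;
    }

    /* Same base: disjoint exactly when the byte ranges do not meet.  */
    return x->offset + x->size <= y->offset
           || y->offset + y->size <= x->offset;
}
```

With 4-byte reals, a store to SY(I) at offset 0 and a load of SY(I+1) at offset 4 are reported as non-overlapping, so a scheduler using this answer would be free to move the load above the store.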

* Re: Alias analysis - does base_alias_check still work ?
From: Richard Henderson @ 2002-07-20 12:12 UTC
To: Toon Moene, gcc; +Cc: gcc-patches

On Fri, Jul 19, 2002 at 03:35:14PM -0700, Richard Henderson wrote:
> The remaining sts/lds pairs are writes then reads from SY.
> We've lost track of the fact that the write is to index I
> and the read from index I+1, and so cannot overlap.

This appears to be the unroller doing stupid things.  The attached
patch1 should cure this.  If this patch can be shown to be a win,
we can axe this section of code properly rather than goto out of it.

I also tried running the unroller during the first loop pass so
that the second loop pass could clean up the giv lossage.  This
didn't work for this case, but I'd be interested in knowing what
effect this has generically.  I.e. a three-way benchmark comparison:

        with -fno-rerun-loop-opt
        without patch2
        with patch2

r~

[-- Attachment #2: patch1 --]

Index: unroll.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/unroll.c,v
retrieving revision 1.169
diff -c -p -d -r1.169 unroll.c
*** unroll.c	30 Jun 2002 05:06:01 -0000	1.169
--- unroll.c	19 Jul 2002 23:51:44 -0000
*************** find_splittable_givs (loop, bl, unroll_t
*** 2867,2875 ****
                       register to hold the split value of the DEST_ADDR giv.
                       Emit insn to initialize its value before loop start.  */

!                   rtx tem = gen_reg_rtx (v->mode);
!                   struct induction *same = v->same;
!                   rtx new_reg = v->new_reg;
                    record_base_value (REGNO (tem), v->add_val, 0);

                    /* If the address giv has a constant in its new_reg value,
--- 2867,2885 ----
                       register to hold the split value of the DEST_ADDR giv.
                       Emit insn to initialize its value before loop start.  */

!                   rtx tem;
!                   struct induction *same;
!                   rtx new_reg;
! 
!                   /* ??? This appears to be entirely crap.  All it appears
!                      to do is scrog giv combination and confuse alias
!                      analysis such that it forgets that two DEST_ADDR
!                      givs have the same base register.  */
!                   continue;
! 
!                   tem = gen_reg_rtx (v->mode);
!                   same = v->same;
!                   new_reg = v->new_reg;
                    record_base_value (REGNO (tem), v->add_val, 0);

                    /* If the address giv has a constant in its new_reg value,

[-- Attachment #3: patch2 --]

Index: loop.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/loop.c,v
retrieving revision 1.412
diff -c -p -d -r1.412 loop.c
*** loop.c	19 Jul 2002 16:31:40 -0000	1.412
--- loop.c	19 Jul 2002 23:51:43 -0000
*************** strength_reduce (loop, flags)
*** 5320,5326 ****
       collected.  Always unroll loops that would be as small or smaller
       unrolled than when rolled.  */
    if ((flags & LOOP_UNROLL)
!       || (!(flags & LOOP_FIRST_PASS)
            && loop_info->n_iterations > 0
            && unrolled_insn_copies <= insn_count))
      unroll_loop (loop, insn_count, 1);
--- 5320,5326 ----
       collected.  Always unroll loops that would be as small or smaller
       unrolled than when rolled.  */
    if ((flags & LOOP_UNROLL)
!       || ((flags & LOOP_AUTO_UNROLL)
            && loop_info->n_iterations > 0
            && unrolled_insn_copies <= insn_count))
      unroll_loop (loop, insn_count, 1);
Index: loop.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/loop.h,v
retrieving revision 1.61
diff -c -p -d -r1.61 loop.h
*** loop.h	30 May 2002 20:55:11 -0000	1.61
--- loop.h	19 Jul 2002 23:51:43 -0000
*************** Software Foundation, 59 Temple Place - S
*** 28,34 ****
  #define LOOP_UNROLL 1
  #define LOOP_BCT 2
  #define LOOP_PREFETCH 4
! #define LOOP_FIRST_PASS 8

  /* Get the loop info pointer of a loop.  */
  #define LOOP_INFO(LOOP) ((struct loop_info *) (LOOP)->aux)
--- 28,34 ----
  #define LOOP_UNROLL 1
  #define LOOP_BCT 2
  #define LOOP_PREFETCH 4
! #define LOOP_AUTO_UNROLL 8

  /* Get the loop info pointer of a loop.  */
  #define LOOP_INFO(LOOP) ((struct loop_info *) (LOOP)->aux)
Index: toplev.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.658
diff -c -p -d -r1.658 toplev.c
*** toplev.c	17 Jul 2002 03:03:40 -0000	1.658
--- toplev.c	19 Jul 2002 23:51:43 -0000
***************
*** 1,4 ****
- 
  /* Top level of GNU C compiler
     Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
     1999, 2000, 2001, 2002 Free Software Foundation, Inc.
--- 1,3 ----
*************** rest_of_compilation (decl)
*** 2878,2883 ****
--- 2877,2884 ----

    if (optimize > 0 && flag_loop_optimize)
      {
+       int do_unroll, do_prefetch;
+ 
        timevar_push (TV_LOOP);
        delete_dead_jumptables ();
        cleanup_cfg (CLEANUP_EXPENSIVE | CLEANUP_PRE_LOOP);
*************** rest_of_compilation (decl)
*** 2885,2896 ****
        /* CFG is no longer maintained up-to-date.  */
        free_bb_for_insn ();

        if (flag_rerun_loop_opt)
          {
            cleanup_barriers ();

            /* We only want to perform unrolling once.  */
!           loop_optimize (insns, rtl_dump_file, LOOP_FIRST_PASS);

            /* The first call to loop_optimize makes some instructions
               trivially dead.  We delete those instructions now in the
--- 2886,2900 ----
        /* CFG is no longer maintained up-to-date.  */
        free_bb_for_insn ();

+       do_unroll = flag_unroll_loops ? LOOP_UNROLL : LOOP_AUTO_UNROLL;
+       do_prefetch = flag_prefetch_loop_arrays ? LOOP_PREFETCH : 0;
        if (flag_rerun_loop_opt)
          {
            cleanup_barriers ();

            /* We only want to perform unrolling once.  */
!           loop_optimize (insns, rtl_dump_file, do_unroll);
!           do_unroll = 0;

            /* The first call to loop_optimize makes some instructions
               trivially dead.  We delete those instructions now in the
*************** rest_of_compilation (decl)
*** 2903,2911 ****
            reg_scan (insns, max_reg_num (), 1);
          }
        cleanup_barriers ();
!       loop_optimize (insns, rtl_dump_file,
!                      (flag_unroll_loops ? LOOP_UNROLL : 0) | LOOP_BCT
!                      | (flag_prefetch_loop_arrays ? LOOP_PREFETCH : 0));

        /* Loop can create trivially dead instructions.  */
        delete_trivially_dead_insns (insns, max_reg_num ());
--- 2907,2913 ----
            reg_scan (insns, max_reg_num (), 1);
          }
        cleanup_barriers ();
!       loop_optimize (insns, rtl_dump_file, do_unroll | LOOP_BCT | do_prefetch);

        /* Loop can create trivially dead instructions.  */
        delete_trivially_dead_insns (insns, max_reg_num ());
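
The effect of the toplev.c part of patch2 on the flags handed to the two loop passes can be traced with a small stand-alone model. This is a sketch only: `plan_loop_passes` and `struct loop_pass_flags` are hypothetical, the flag values are the ones from the loop.h hunk, and the three parameters stand in for flag_unroll_loops, flag_prefetch_loop_arrays and flag_rerun_loop_opt.

```c
/* Flag values as defined in the loop.h hunk of patch2.  */
enum {
    LOOP_UNROLL      = 1,
    LOOP_BCT         = 2,
    LOOP_PREFETCH    = 4,
    LOOP_AUTO_UNROLL = 8
};

/* Hypothetical result type: the flags each loop_optimize call would
   receive.  FIRST is -1 when the first pass does not run at all.  */
struct loop_pass_flags {
    int first;
    int second;
};

struct loop_pass_flags plan_loop_passes(int unroll_loops,
                                        int prefetch_loop_arrays,
                                        int rerun_loop_opt)
{
    struct loop_pass_flags p = { -1, 0 };
    int do_unroll = unroll_loops ? LOOP_UNROLL : LOOP_AUTO_UNROLL;
    int do_prefetch = prefetch_loop_arrays ? LOOP_PREFETCH : 0;

    if (rerun_loop_opt) {
        /* Unrolling is requested in the first pass and then cleared,
           so it is performed only once.  */
        p.first = do_unroll;
        do_unroll = 0;
    }

    p.second = do_unroll | LOOP_BCT | do_prefetch;
    return p;
}
```

For example, with -funroll-loops and loop optimization rerun, the first pass gets LOOP_UNROLL and the second only LOOP_BCT (plus LOOP_PREFETCH if requested). Without the rerun, the single pass gets the unroll flag together with LOOP_BCT.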

* Re: Alias analysis - does base_alias_check still work ?
From: Toon Moene @ 2002-07-21 10:01 UTC
To: Richard Henderson; +Cc: gcc, gcc-patches

Richard Henderson wrote:

> On Fri, Jul 19, 2002 at 03:35:14PM -0700, Richard Henderson wrote:
> > The remaining sts/lds pairs are writes then reads from SY.
> > We've lost track of the fact that the write is to index I
> > and the read from index I+1, and so cannot overlap.
>
> This appears to be the unroller doing stupid things.  The attached
> patch1 should cure this.  If this patch can be shown to be a win,
> we can axe this section of code properly rather than goto out of it.

I combined your MEM_EXPR patch and patch1, but now I get (-O2
-funroll-loops -fno-rerun-loop-opt):

$L6:
        lds $f10,0($18)
        lds $f14,-4($5)
        lda $2,4($5)
        lda $3,8($5)
        lda $18,4($18)
        lda $4,12($5)
        lda $1,-3($7)
        lda $7,-4($7)
        lds $f11,0($18)
        addl $1,$31,$6
        lda $18,4($18)
        lds $f12,0($18)
        lda $18,4($18)
        muls $f15,$f10,$f10
        lds $f13,0($18)
        lda $18,4($18)
        muls $f15,$f11,$f11
        muls $f15,$f12,$f12
        adds $f14,$f10,$f14
        muls $f15,$f13,$f13
        sts $f14,-4($5)
        lda $5,16($5)
        lds $f10,-4($2)
        adds $f10,$f11,$f10
        sts $f10,-4($2)
        lds $f11,-4($3)
        adds $f11,$f12,$f11
        sts $f11,-4($3)
        lds $f10,-4($4)
        adds $f10,$f13,$f10
        sts $f10,-4($4)
        bge $6,$L6

which is worse than you showed for the MEM_EXPR patch alone
(32 insns vs 27).

-- 
Toon Moene

* Re: Alias analysis - does base_alias_check still work ?
From: Richard Henderson @ 2002-07-21 14:23 UTC
To: Toon Moene; +Cc: gcc, gcc-patches

On Sun, Jul 21, 2002 at 11:24:00AM +0200, Toon Moene wrote:
> I combined your MEM_EXPR patch and patch1, but now I get (-O2
> -funroll-loops -fno-rerun-loop-opt):
[...]
> which is worse than you showed for the MEM_EXPR patch alone
> 32 insns vs 27).

So is the claim that patch1 is dependent on patch2?

r~

* Re: Alias analysis - does base_alias_check still work ?
From: Toon Moene @ 2002-07-21 15:14 UTC
To: Richard Henderson; +Cc: gcc, gcc-patches

Richard Henderson wrote:

> On Sun, Jul 21, 2002 at 11:24:00AM +0200, Toon Moene wrote:

> > I combined your MEM_EXPR patch and patch1, but now I get (-O2
> > -funroll-loops -fno-rerun-loop-opt):
> [...]
> > which is worse than you showed for the MEM_EXPR patch alone
> > 32 insns vs 27).
>
> So is the claim that patch1 is dependent on patch2?

Ah, sorry, I didn't test that because I thought it would invalidate any
useful testing of patch2.

Will try asap.

-- 
Toon Moene
* Re: Alias analysis - does base_alias_check still work ? 2002-07-21 14:23 ` Richard Henderson 2002-07-21 15:14 ` Toon Moene @ 2002-07-21 22:41 ` Toon Moene 2002-07-22 0:03 ` Richard Henderson 1 sibling, 1 reply; 19+ messages in thread
From: Toon Moene @ 2002-07-21 22:41 UTC
To: Richard Henderson; +Cc: gcc, gcc-patches

Richard Henderson wrote:

> On Sun, Jul 21, 2002 at 11:24:00AM +0200, Toon Moene wrote:
> > I combined your MEM_EXPR patch and patch1, but now I get (-O2
> > -funroll-loops -fno-rerun-loop-opt):
> [...]
> > which is worse than you showed for the MEM_EXPR patch alone
> > 32 insns vs 27).

> So is the claim that patch1 is dependent on patch2?

[alphaev6-linux-gnu]

OK, you won.

Original (trunk CVS'd 10 UTC this morning) compiled our NWP software -O2 -ffast-math -funroll-loops -fno-rerun-loop-opt:

 ETAETA TOOK 2.4101572 SECONDS
 ETAETA TOOK 3.5712893 SECONDS
0SUPOBS TOOK : 0.33203029633
0DATACH TOOK : 442.5986328125
0ANAEVA TOOK : 105.2470703125
0GRPEVA TOOK : 635.9697265625
0HUMSUP TOOK : 0.02246093750
0DATACH TOOK : 28.8212890625
0HUMEVA TOOK : 7.5175781250
0GRPEVA TOOK : 14.9267578125
 ANALYSIS TOOK: 1257.66016 SEC.
 PREPARATIONS TOOK 4.39550686 SECONDS
 FORECAST TOOK 131.1475 SECONDS
 PREPARATIONS TOOK 6.03613281 SECONDS
 FORECAST TOOK 354.8984 SECONDS
 PREPARATIONS TOOK 13.0136719 SECONDS
 FORECAST TOOK 989.5225 SECONDS
 ETAETA TOOK 3.62402296 SECONDS
0SUPOBS TOOK : 0.18945407867
0DATACH TOOK : 830.9023437500
0ANAEVA TOOK : 117.0732421875
0GRPEVA TOOK : 768.3984375000
0HUMSUP TOOK : 0.03613281250
0DATACH TOOK : 17.2988281250
0HUMEVA TOOK : 5.5039062500
0GRPEVA TOOK : 11.3378906250
 ANALYSIS TOOK: 1777.08496 SEC.
 PREPARATIONS TOOK 6.4160161 SECONDS
 FORECAST TOOK 127.3857 SECONDS
 PREPARATIONS TOOK 7.72753906 SECONDS
 FORECAST TOOK 361.6230 SECONDS
 PREPARATIONS TOOK 12.6972656 SECONDS
 FORECAST TOOK 974.3926 SECONDS

In addition with your patch 0 (MEM_EXPR), 1 and 2, compiled our NWP software -O2 -ffast-math -funroll-loops:

 ETAETA TOOK 4.66308594 SECONDS
 ETAETA TOOK 6.23046875 SECONDS
0SUPOBS TOOK : 0.74706935883
0DATACH TOOK : 354.9746093750
0ANAEVA TOOK : 120.2656250000
0GRPEVA TOOK : 607.8193359375
0HUMSUP TOOK : 0.01953125000
0DATACH TOOK : 31.6162109375
0HUMEVA TOOK : 8.9648437500
0GRPEVA TOOK : 14.9980468750
 ANALYSIS TOOK: 1162.40234 SEC.
 PREPARATIONS TOOK 5.09668016 SECONDS
 FORECAST TOOK 118.2822 SECONDS
 PREPARATIONS TOOK 6.49609375 SECONDS
 FORECAST TOOK 339.0889 SECONDS
 PREPARATIONS TOOK 13.4384766 SECONDS
 FORECAST TOOK 910.8564 SECONDS
 ETAETA TOOK 2.88281202 SECONDS
0SUPOBS TOOK : 0.75683593750
0DATACH TOOK : 680.7177734375
0ANAEVA TOOK : 117.9658203125
0GRPEVA TOOK : 743.5371093750
0HUMSUP TOOK : 0.01562500000
0DATACH TOOK : 16.0937500000
0HUMEVA TOOK : 5.9082031250
0GRPEVA TOOK : 11.3769531250
 ANALYSIS TOOK: 1620.9668 SEC.
 PREPARATIONS TOOK 5.05175829 SECONDS
 FORECAST TOOK 124.7197 SECONDS
 PREPARATIONS TOOK 5.18066406 SECONDS
 FORECAST TOOK 331.8086 SECONDS
 PREPARATIONS TOOK 11.7978516 SECONDS
 FORECAST TOOK 922.9990 SECONDS

Cheers,

--
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
Join GNU Fortran 95: http://g95.sourceforge.net/ (under construction)
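[Editorial aside, not part of the original message.] The two sets of timings above are easier to compare as relative changes. The following is a minimal sketch in Python that only redoes the arithmetic on the ANALYSIS and FORECAST figures quoted in the message. The pairing of lines between the two builds (first ANALYSIS with first ANALYSIS, and so on) and the helper name `pct_change` are assumptions made for illustration. Note also that the baseline was built with -fno-rerun-loop-opt and the patched build without it, so the differences may not be due to the patches alone.

```python
# Timings in seconds, copied from the message above.
# Each pair is (baseline trunk build, build with patches 0, 1 and 2).
# Pairing lines by their order of appearance is an assumption.
timings = [
    ("ANALYSIS, first run", 1257.66016, 1162.40234),
    ("FORECAST 1, first run", 131.1475, 118.2822),
    ("FORECAST 2, first run", 354.8984, 339.0889),
    ("FORECAST 3, first run", 989.5225, 910.8564),
    ("ANALYSIS, second run", 1777.08496, 1620.9668),
    ("FORECAST 1, second run", 127.3857, 124.7197),
    ("FORECAST 2, second run", 361.6230, 331.8086),
    ("FORECAST 3, second run", 974.3926, 922.9990),
]


def pct_change(old, new):
    """Relative change from old to new, in percent (negative means faster)."""
    return 100.0 * (new - old) / old


# Compute the relative change for every pair of timings.
changes = [(label, pct_change(old, new)) for label, old, new in timings]

for label, change in changes:
    print(f"{label:24s} {change:+6.2f}%")
```

Every listed ANALYSIS and FORECAST phase comes out faster in the patched build; the short ETAETA and PREPARATIONS phases are left out because they are small and vary in both directions.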
* Re: Alias analysis - does base_alias_check still work ? 2002-07-21 22:41 ` Toon Moene @ 2002-07-22 0:03 ` Richard Henderson 2002-07-22 16:42 ` Toon Moene 0 siblings, 1 reply; 19+ messages in thread
From: Richard Henderson @ 2002-07-22 0:03 UTC
To: Toon Moene; +Cc: gcc, gcc-patches

On Mon, Jul 22, 2002 at 12:23:55AM +0200, Toon Moene wrote:
> OK, you won.

Excellent. I've checked things in.

r~
* Re: Alias analysis - does base_alias_check still work ? 2002-07-22 0:03 ` Richard Henderson @ 2002-07-22 16:42 ` Toon Moene 2002-07-23 2:12 ` Andreas Jaeger 0 siblings, 1 reply; 19+ messages in thread
From: Toon Moene @ 2002-07-22 16:42 UTC
To: Richard Henderson; +Cc: gcc, gcc-patches

Richard Henderson wrote:

> On Mon, Jul 22, 2002 at 12:23:55AM +0200, Toon Moene wrote:
> > OK, you won.

> Excellent. I've checked things in.

Well, that helped. Look at the SPECfp scores on Andreas' site !

--
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
Join GNU Fortran 95: http://g95.sourceforge.net/ (under construction)
* Re: Alias analysis - does base_alias_check still work ? 2002-07-22 16:42 ` Toon Moene @ 2002-07-23 2:12 ` Andreas Jaeger 0 siblings, 0 replies; 19+ messages in thread
From: Andreas Jaeger @ 2002-07-23 2:12 UTC
To: Toon Moene; +Cc: Richard Henderson, gcc, gcc-patches

Toon Moene <toon@moene.indiv.nluug.nl> writes:

> Richard Henderson wrote:
>
>> On Mon, Jul 22, 2002 at 12:23:55AM +0200, Toon Moene wrote:
>> > OK, you won.
>
>> Excellent. I've checked things in.
>
> Well, that helped. Look at the SPECfp scores on Andreas' site !

Wow! That's really impressive for wupwise and swim!

Andreas
--
Andreas Jaeger
SuSE Labs aj@suse.de
private aj@arthur.inka.de
http://www.suse.de/~aj
* Re: Alias analysis - does base_alias_check still work ? 2002-07-19 13:31 ` Richard Henderson 2002-07-20 2:13 ` Toon Moene @ 2002-08-12 7:49 ` Jeff Law 2002-08-12 7:53 ` Jan Hubicka 1 sibling, 1 reply; 19+ messages in thread
From: Jeff Law @ 2002-08-12 7:49 UTC
To: Richard Henderson; +Cc: Toon Moene, gcc

In message <20020719095446.A15598@redhat.com>, Richard Henderson writes:
>On Fri, Jul 19, 2002 at 04:40:26PM +0200, Toon Moene wrote:
>> Which begs the question: Is there a reason -fschedule-insns isn't on by
>> default when using -O2 ?
>
>Yes. The fact that the scheduler doesn't understand register
>pressure means that pre-register-allocation scheduling generally
>sucks eggs on x86.

True. But the real reason -fschedule-insns isn't on by default for ia32 is the return register problem -- which I believe you actually fixed a while back, but we haven't gone back to see if it's safe/profitable to enable the first scheduling pass for ia32.

Jeff
* Re: Alias analysis - does base_alias_check still work ? 2002-08-12 7:49 ` Jeff Law @ 2002-08-12 7:53 ` Jan Hubicka 2002-08-12 9:56 ` Richard Henderson 0 siblings, 1 reply; 19+ messages in thread
From: Jan Hubicka @ 2002-08-12 7:53 UTC
To: law; +Cc: Richard Henderson, Toon Moene, gcc

> In message <20020719095446.A15598@redhat.com>, Richard Henderson writes:
> >On Fri, Jul 19, 2002 at 04:40:26PM +0200, Toon Moene wrote:
> >> Which begs the question: Is there a reason -fschedule-insns isn't on by
> >> default when using -O2 ?
> >
> >Yes. The fact that the scheduler doesn't understand register
> >pressure means that pre-register-allocation scheduling generally
> >sucks eggs on x86.
> True. But the real reason -fschedule-insns isn't on by default for ia32
> is the return register problem -- which I believe you actually fixed
> a while back, but we haven't gone back to see if it's safe/profitable to
> enable the first scheduling pass for ia32.

Did you really fix all the problems regarding SMALL_REGISTER_CLASSes? When using register passing conventions I've seen ia32 compilation dying all the time.

It may be interesting to set the flag on at least for x86_64, where register pressure is lower, if it worked.

Honza
>
> Jeff
>
* Re: Alias analysis - does base_alias_check still work ? 2002-08-12 7:53 ` Jan Hubicka @ 2002-08-12 9:56 ` Richard Henderson 0 siblings, 0 replies; 19+ messages in thread
From: Richard Henderson @ 2002-08-12 9:56 UTC
To: Jan Hubicka; +Cc: law, Toon Moene, gcc

On Mon, Aug 12, 2002 at 04:52:57PM +0200, Jan Hubicka wrote:
> Did you really fix all the problems regarding SMALL_REGISTER_CLASSes?

I thought so.

> It may be interesting to set the flag on at least for x86_64 where
> register pressure is lower if it worked.

Indeed.

r~