public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/44576] New: testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching+peeling
@ 2010-06-18 7:51 borntraeger at de dot ibm dot com
2010-06-18 7:59 ` [Bug middle-end/44576] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling borntraeger at de dot ibm dot com
` (22 more replies)
0 siblings, 23 replies; 24+ messages in thread
From: borntraeger at de dot ibm dot com @ 2010-06-18 7:51 UTC (permalink / raw)
To: gcc-bugs
testsuite/gfortran.dg/zero_sized_1.f90 takes almost 11 minutes to compile on my
notebook when combining aggressive loop prefetching with loop peeling:
$ time gfortran-4.5 -O3 -march=core2 zero_sized_1.f90 -S
-fprefetch-loop-arrays -funroll-loops --param max-completely-peeled-insns=2000
real 10m54.594s
user 10m48.638s
sys 0m0.393s
The compiler spends almost all of its time in compute_miss_rate, which was introduced by
http://gcc.gnu.org/ml/gcc-patches/2009-08/msg00641.html
The problem is triggered by several things:
- loop peeling creates thousands of references (with only a small delta) and
each reference is compared with every other reference
- for each comparison we iterate over all alignments
- for each alignment we iterate over all distinct iterators
As you can see, this causes exploding complexity.
Furthermore, since the cache line size is passed in via an external variable,
the compiler cannot optimize the integer divisions into shifts.
I think the right solution would be to reduce the complexity of
compute_miss_rate, but I have not found a good solution yet.
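To make the blow-up concrete, the three nested levels listed above can be turned into a rough cost model. This is only a sketch: the function name, the choice to count unordered pairs of references, and the sample numbers in the note below are assumptions for illustration and are not taken from tree-ssa-loop-prefetch.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cost model, not GCC code: estimates how many inner-loop
   steps a pairwise miss-rate analysis performs for one loop.  */
uint64_t
model_miss_rate_steps (uint64_t n_refs, uint64_t cache_line_size,
                       uint64_t align_unit, uint64_t distinct_iters)
{
  assert (align_unit > 0);

  /* Level 1: each reference is compared with every other reference
     (assumed here to be unordered pairs).  */
  uint64_t pairs = n_refs * (n_refs - 1) / 2;

  /* Level 2: every alignment of the first reference within a line.  */
  uint64_t alignments = (cache_line_size + align_unit - 1) / align_unit;

  /* Level 3: every distinct iteration for each alignment.  */
  return pairs * alignments * distinct_iters;
}
```

Under these assumptions, 2000 references with a 64-byte line, a 4-byte alignment unit, and 16 distinct iterations give roughly 5.1e8 inner steps, and each step would involve integer divisions by a line size that is not a compile-time constant.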
--
Summary: testsuite/gfortran.dg/zero_sized_1.f90 with huge compile
time on prefetching+peeling
Product: gcc
Version: 4.5.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: borntraeger at de dot ibm dot com
GCC host triplet: several, i486, s390..
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: borntraeger at de dot ibm dot com @ 2010-06-18 7:59 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from borntraeger at de dot ibm dot com 2010-06-18 07:59 -------
4.6 (trunk) is also affected
--
borntraeger at de dot ibm dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|testsuite/gfortran.dg/zero_s|testsuite/gfortran.dg/zero_s
|ized_1.f90 with huge compile|ized_1.f90 with huge compile
|time on prefetching+peeling |time on prefetching +
| |peeling
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: rguenth at gcc dot gnu dot org @ 2010-06-18 10:52 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from rguenth at gcc dot gnu dot org 2010-06-18 10:52 -------
Confirmed.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
Keywords| |compile-time-hog
Last reconfirmed|0000-00-00 00:00:00 |2010-06-18 10:52:06
date| |
Summary|testsuite/gfortran.dg/zero_s|[4.5/4.6 Regression]
|ized_1.f90 with huge compile|testsuite/gfortran.dg/zero_s
|time on prefetching + |ized_1.f90 with huge compile
|peeling |time on prefetching +
| |peeling
Target Milestone|--- |4.5.1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: rguenth at gcc dot gnu dot org @ 2010-06-24 21:44 UTC (permalink / raw)
To: gcc-bugs
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Priority|P3 |P2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: borntraeger at de dot ibm dot com @ 2010-06-25 9:03 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from borntraeger at de dot ibm dot com 2010-06-25 09:02 -------
Created an attachment (id=21001)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21001&action=view)
Potential fix for compile time regression
Here is a potential fix. We just limit prefetching to loops with a "low" number
of memory references and bail out if the number of references is too large.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: changpeng dot fang at amd dot com @ 2010-06-25 17:08 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from changpeng dot fang at amd dot com 2010-06-25 17:08 -------
(In reply to comment #3)
> Created an attachment (id=21001)
> --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21001&action=view)
> Potential fix for compile time regression
>
> Here is a potential fix. We just limit prefetching to loops with a "low" amount
> of memory references and bail out if the amount of references is too large.
>
This should be a good fix for now. But the complexity of computing group reuse
and miss rate is still a concern. I don't think we need to compute the miss
rate "exactly" here.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: rguenth at gcc dot gnu dot org @ 2010-06-26 14:25 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from rguenth at gcc dot gnu dot org 2010-06-26 14:25 -------
I now see that compile time on Polyhedron skyrocketed when prefetching was
enabled at -O3 by default. See
http://gcc.opensuse.org/c++bench/polyhedron/polyhedron-summary.txt-1-0.html
This isn't acceptable - please work on this.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: borntraeger at de dot ibm dot com @ 2010-06-26 20:30 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from borntraeger at de dot ibm dot com 2010-06-26 20:30 -------
Richard,
can you check whether compute_miss_rate is the problem?
Does the attached patch help?
Thanks,
Christian
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: rguenth at gcc dot gnu dot org @ 2010-06-27 9:16 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from rguenth at gcc dot gnu dot org 2010-06-27 09:16 -------
The patch didn't help. Btw, reuse analysis looks worse than quadratic.
Polyhedron test_fpu.f90 is just a single file so you should be able to
check yourself easily. The last jump doesn't reproduce on the 4.5 branch.
SPEC CPU 2006 gamess compile-time also increased by 25%. leslie3d compile-time
doubled.
http://gcc.opensuse.org/SPEC/CFP/sb-barbella.suse.de-head-64-2006/times.html.
A good chunk of time seems to be spent in the RTL loop unroller, triggered
by array prefetching (testing with -O3 -funroll-loops). Otherwise it might
as well be just excessive code growth caused by prefetching.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: rguenth at gcc dot gnu dot org @ 2010-06-27 9:17 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from rguenth at gcc dot gnu dot org 2010-06-27 09:17 -------
Size of the binaries explodes as well...
http://gcc.opensuse.org/SPEC/CFP/sb-barbella.suse.de-head-64-2006/size.html
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: borntraeger at de dot ibm dot com @ 2010-06-27 11:09 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from borntraeger at de dot ibm dot com 2010-06-27 11:09 -------
So there seem to be at least two problems:
1. exploding complexity in compute_miss_rate (the start for this bugzilla)
2. effects due to prefetching seen in other passes
I think the attached patch cures 1. What shall we do about 2? Should we open
another bug report or track those problems here?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: rguenth at gcc dot gnu dot org @ 2010-06-27 11:20 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from rguenth at gcc dot gnu dot org 2010-06-27 11:20 -------
I have opened PR44688
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: changpeng dot fang at amd dot com @ 2010-06-29 0:07 UTC (permalink / raw)
To: gcc-bugs
------- Comment #11 from changpeng dot fang at amd dot com 2010-06-29 00:07 -------
I have a patch that partially fixes the problem:
http://gcc.gnu.org/ml/gcc-patches/2010-06/msg02956.html
Note that for this test case, the compile time doubled even though
I don't compute the miss rate at all.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: changpeng dot fang at amd dot com @ 2010-06-29 0:49 UTC (permalink / raw)
To: gcc-bugs
------- Comment #12 from changpeng dot fang at amd dot com 2010-06-29 00:49 -------
Created an attachment (id=21034)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21034&action=view)
Early return in miss rate computation
The attached patch improves the computation of the miss rate. We can stop computing
as soon as the total number of misses exceeds the given acceptable threshold.
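The early-return idea can be sketched as follows. This is a simplified model rather than the attached patch: the function name, the unsigned parameter types, and passing the acceptable rate as a parameter (in parts per thousand) are assumptions made for illustration.

```c
#include <assert.h>

/* Simplified model of a miss-rate scan with an early exit.  Returns the
   miss rate in parts per thousand, or 1000 as soon as the number of
   misses exceeds the budget implied by ACCEPTABLE_MISS_RATE.  */
int
miss_rate_with_early_exit (unsigned cache_line_size, unsigned align_unit,
                           unsigned step, unsigned delta,
                           unsigned distinct_iters,
                           unsigned acceptable_miss_rate)
{
  assert (cache_line_size > 0 && align_unit > 0 && distinct_iters > 0);

  unsigned long n_alignments
    = (cache_line_size + align_unit - 1) / align_unit;
  unsigned long total_positions = n_alignments * distinct_iters;
  unsigned long max_allowed_misses
    = (unsigned long) acceptable_miss_rate * total_positions / 1000;
  unsigned long miss_positions = 0;

  /* Iterate over all alignments of the first reference.  */
  for (unsigned align = 0; align < cache_line_size; align += align_unit)
    /* Iterate over all distinct iterations.  */
    for (unsigned iter = 0; iter < distinct_iters; iter++)
      {
        unsigned long address1 = align + (unsigned long) step * iter;
        unsigned long address2 = address1 + delta;

        if (address1 / cache_line_size != address2 / cache_line_size)
          {
            miss_positions++;
            /* Over budget already: the exact rate no longer matters.  */
            if (miss_positions > max_allowed_misses)
              return 1000;
          }
      }

  return (int) (1000 * miss_positions / total_positions);
}
```

The exit only saves work for reference pairs that are going to be rejected anyway; pairs that stay under the threshold still pay for the full scan.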
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: changpeng dot fang at amd dot com @ 2010-06-30 0:23 UTC (permalink / raw)
To: gcc-bugs
------- Comment #13 from changpeng dot fang at amd dot com 2010-06-30 00:23 -------
Here is the current status of this work:
patch1: http://gcc.gnu.org/ml/gcc-patches/2010-06/msg02956.html
patch2: http://gcc.gnu.org/ml/gcc-patches/2010-06/msg03049.html
On my system with -O3 zero_sized_1.f90 -fprefetch-loop-arrays
-fno-unroll-loops --param max-completely-peeled-insns=2000:
original timing: 5m30s
with patch1: 1m20s
with patch1 + patch2: 1m03s
without prefetch: 0m30s
The timing with prefetch-loop-arrays is still doubled after the two patches
compared to no-prefetch-loop-arrays. The extra 33s is mostly spent in
dependence computation for loops. For this test case, prefetching is the
only optimization that invokes "compute_all_dependences".
I am not sure whether we should tolerate this timing increase with aggressive
peeling and prefetching, or we should work on the cost reduction of dependence
computation.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: changpeng dot fang at amd dot com @ 2010-06-30 0:36 UTC (permalink / raw)
To: gcc-bugs
------- Comment #14 from changpeng dot fang at amd dot com 2010-06-30 00:36 -------
(In reply to comment #7)
> A good chunk of time seems to be spent in the RTL loop unroller, triggered
> by array prefetching (testing with -O3 -funroll-loops). Otherwise it might
> as well be just excessive code growth caused by prefetching.
Yes, for test_fpu.f90, more than half of the time is spent in the RTL loop
unroller, and if unroll_factor is manually set to 1 (don't unroll), the timing
increase caused by array prefetching is negligible.
With -O3 -funroll-loops, I would not expect array prefetching to trigger a code
size or compilation time increase in the RTL loop unroller.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: changpeng dot fang at amd dot com @ 2010-07-01 0:34 UTC (permalink / raw)
To: gcc-bugs
------- Comment #15 from changpeng dot fang at amd dot com 2010-07-01 00:34 -------
Unrolling of the peeled loops is partly responsible for the test_fpu.f90
compilation time and code size increase. Vectorization peels a few iterations
off the loop, and the prefetching and unrolling passes do not recognize that a
loop is a peeled version and still unroll it.
MODULE kinds
INTEGER, PARAMETER :: RK8 = SELECTED_REAL_KIND(15, 300)
END MODULE kinds
! --------------------------------------------------------------------
PROGRAM TEST_FPU ! A number-crunching benchmark using matrix inversion.
USE kinds ! Implemented by: David Frank Dave_Frank@hotmail.com
IMPLICIT NONE ! Gauss routine by: Tim Prince N8TM@aol.com
! Crout routine by: James Van Buskirk torsop@ix.netcom.com
! Lapack routine by: Jos Bergervoet bergervo@IAEhv.nl
REAL(RK8) :: pool(101, 101,1000), a(101, 101)
INTEGER :: i
DO i = 1,1000
a = pool(:,:,i) ! get next matrix to invert
END DO
END PROGRAM TEST_FPU
In this example, prefetching will unroll three versions of the innermost loop.
If we turn off the vectorizer, it unrolls only the one loop.
In addition, -fprefetch-loop-arrays and -funroll-loops (turned on at the same
time) will both unroll the same loop. This is over-unrolling, and -funroll-loops
should recognize that the loop has already been unrolled by prefetching.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: spop at gcc dot gnu dot org @ 2010-07-02 16:34 UTC (permalink / raw)
To: gcc-bugs
------- Comment #16 from spop at gcc dot gnu dot org 2010-07-02 16:34 -------
Subject: Bug 44576
Author: spop
Date: Fri Jul 2 16:34:29 2010
New Revision: 161727
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=161727
Log:
PR 44576: miss rate computation improvement for prefetching loop arrays.
2010-07-02 Changpeng Fang <changpeng.fang@amd.com>
PR middle-end/44576
* tree-ssa-loop-prefetch.c (compute_miss_rate): Return 1000 (out
of 1000) for miss rate if the address difference is greater than or
equal to the cache line size (the two references will never hit the
same cache line).
Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssa-loop-prefetch.c
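The shortcut in this commit rests on a simple fact: two addresses that are at least one cache line apart can never fall in the same line. A minimal sketch of that check follows, together with a brute-force helper that illustrates why it is safe. Both function names and the -1 "undecided" sentinel are hypothetical, and the delta is assumed to be non-negative here.

```c
#include <assert.h>
#include <stdbool.h>

#define MISS_RATE_ALWAYS 1000      /* 1000 out of 1000 */
#define MISS_RATE_UNDECIDED (-1)   /* hypothetical "run the full scan" value */

/* Fast path: decide the miss rate without scanning when possible.  */
int
miss_rate_fast_path (unsigned long delta, unsigned long cache_line_size)
{
  assert (cache_line_size > 0);
  return delta >= cache_line_size ? MISS_RATE_ALWAYS : MISS_RATE_UNDECIDED;
}

/* Brute-force illustration only: is there any starting offset within a
   line for which OFFSET and OFFSET + DELTA land on the same line?  */
bool
can_share_cache_line (unsigned long delta, unsigned long cache_line_size)
{
  assert (cache_line_size > 0);
  for (unsigned long offset = 0; offset < cache_line_size; offset++)
    if (offset / cache_line_size == (offset + delta) / cache_line_size)
      return true;
  return false;
}
```

For a 64-byte line, a delta of 63 can still share a line (offset 0), while a delta of 64 or more cannot, so the nested alignment and iteration loops can be skipped entirely in that case.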
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: changpeng dot fang at amd dot com @ 2010-07-02 23:58 UTC (permalink / raw)
To: gcc-bugs
------- Comment #17 from changpeng dot fang at amd dot com 2010-07-02 23:58 -------
(In reply to comment #15)
I have opened PR44794 for the unrolling of pre- and post-loop issue.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: spop at gcc dot gnu dot org @ 2010-07-07 18:15 UTC (permalink / raw)
To: gcc-bugs
------- Comment #18 from spop at gcc dot gnu dot org 2010-07-07 18:14 -------
Changpeng, should this PR be closed now?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: changpeng dot fang at amd dot com @ 2010-07-07 19:00 UTC (permalink / raw)
To: gcc-bugs
------- Comment #19 from changpeng dot fang at amd dot com 2010-07-07 19:00 -------
(In reply to comment #18)
> Changpeng, should this PR be closed now?
>
No. I am still looking at the dependence computation cost. I just found that
most of the time is spent in memory allocation and freeing of the data
dependence relation structure.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: changpeng dot fang at amd dot com @ 2010-07-09 1:59 UTC (permalink / raw)
To: gcc-bugs
------- Comment #20 from changpeng dot fang at amd dot com 2010-07-09 01:59 -------
I submitted a patch for review to completely fix the problem. The patch is an
extension to Christian's speedup.patch. It splits the cost analysis into
three small functions and quits further prefetching analysis as soon as we
know prefetching is not going to be beneficial to the loop.
Here is the gcc-patches@ link:
http://gcc.gnu.org/ml/gcc-patches/2010-07/msg00734.html
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: spop at gcc dot gnu dot org @ 2010-07-09 23:09 UTC (permalink / raw)
To: gcc-bugs
------- Comment #21 from spop at gcc dot gnu dot org 2010-07-09 23:09 -------
Subject: Bug 44576
Author: spop
Date: Fri Jul 9 23:08:55 2010
New Revision: 162023
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162023
Log:
pr44576 Avoid un-necessary prefetch analysis by distributing the cost models
2010-07-09 Changpeng Fang <changpeng.fang@amd.com>
PR tree-optimization/44576
* tree-ssa-loop-prefetch.c (trip_count_to_ahead_ratio_too_small_p):
New. Pull out from is_loop_prefetching_profitable to implement
the trip count to ahead ratio heuristic.
(mem_ref_count_reasonable_p): New. Pull out from
is_loop_prefetching_profitable to implement the instruction to
memory reference ratio heuristic. Also consider not reasonable if
the memory reference count is above a threshold (to avoid
explosive compilation time).
(insn_to_prefetch_ratio_too_small_p): New. Pull out from
is_loop_prefetching_profitable to implement the instruction to
prefetch ratio heuristic.
(is_loop_prefetching_profitable): Removed.
(loop_prefetch_arrays): Distribute the cost analysis across the
function to allow early exit of the prefetch analysis.
is_loop_prefetching_profitable is split into three functions,
with each one called as early as possible.
(PREFETCH_MAX_MEM_REFS_PER_LOOP): New. Threshold above which the
number of memory references in a loop is considered too many.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssa-loop-prefetch.c
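The restructuring described in this log entry can be sketched as a staged decision in which each heuristic runs as soon as its inputs are available, so the expensive steps are skipped whenever an earlier check fails. Everything below is a hypothetical model: the struct, the enum, the threshold values, and the analyses_run counter are illustrative assumptions, not the code committed in r162023.

```c
#include <assert.h>

/* Illustrative thresholds; the real values are GCC parameters.  */
#define MIN_TRIP_TO_AHEAD_RATIO 4u
#define MIN_INSN_TO_MEM_RATIO 3u
#define MAX_MEM_REFS_PER_LOOP 200u
#define MIN_INSN_TO_PREFETCH_RATIO 9u

struct loop_summary
{
  unsigned trip_count;      /* estimated iterations */
  unsigned ahead;           /* iterations to prefetch ahead */
  unsigned insn_count;      /* instructions in the loop body */
  unsigned mem_ref_count;   /* memory references found */
  unsigned prefetch_count;  /* prefetches that would be issued */
};

enum prefetch_decision
{
  SKIP_SHORT_TRIP_COUNT,
  SKIP_MEM_REFS,
  SKIP_PREFETCH_RATIO,
  DO_PREFETCH
};

/* ANALYSES_RUN reports how many expensive steps were reached.  */
enum prefetch_decision
decide_prefetch (const struct loop_summary *loop, unsigned *analyses_run)
{
  *analyses_run = 0;

  /* Stage 1: needs only the trip count, so it runs first.  */
  if (loop->trip_count < MIN_TRIP_TO_AHEAD_RATIO * loop->ahead)
    return SKIP_SHORT_TRIP_COUNT;

  /* Expensive step 1: gathering memory references would happen here.  */
  ++*analyses_run;

  /* Stage 2: reject loops with no, too many, or too dense references
     before any pairwise reuse analysis is attempted.  */
  if (loop->mem_ref_count == 0
      || loop->mem_ref_count > MAX_MEM_REFS_PER_LOOP
      || loop->insn_count / loop->mem_ref_count < MIN_INSN_TO_MEM_RATIO)
    return SKIP_MEM_REFS;

  /* Expensive step 2: reuse and miss-rate analysis would happen here.  */
  ++*analyses_run;

  /* Stage 3: needs the prefetch count produced by the analysis.  */
  if (loop->prefetch_count == 0
      || loop->insn_count / loop->prefetch_count < MIN_INSN_TO_PREFETCH_RATIO)
    return SKIP_PREFETCH_RATIO;

  return DO_PREFETCH;
}
```

In this model, a heavily peeled loop with thousands of references is rejected at stage 2, before the pairwise analysis that dominated the compile time in the original report.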
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576
* [Bug middle-end/44576] [4.5/4.6 Regression] testsuite/gfortran.dg/zero_sized_1.f90 with huge compile time on prefetching + peeling
From: spop at gcc dot gnu dot org @ 2010-07-09 23:12 UTC (permalink / raw)
To: gcc-bugs
------- Comment #22 from spop at gcc dot gnu dot org 2010-07-09 23:12 -------
Fixed.
--
spop at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576