public inbox for gcc-bugs@sourceware.org
* [Bug optimization/12771] New: Weak loop optimizer, significant performance regression
@ 2003-10-25 0:54 tm at kloo dot net
2003-10-25 1:03 ` [Bug optimization/12771] " tm at kloo dot net
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: tm at kloo dot net @ 2003-10-25 0:54 UTC (permalink / raw)
To: gcc-bugs
PLEASE REPLY TO gcc-bugzilla@gcc.gnu.org ONLY, *NOT* gcc-bugs@gcc.gnu.org.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12771
Summary: Weak loop optimizer, significant performance regression
Product: gcc
Version: 3.4
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: tm at kloo dot net
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: i386-linux
GCC host triplet: i386-linux
GCC target triplet: i386-linux
This is based on Scott Robert Ladd's lpbench benchmark, which is derived from
linpack. He found a significant performance improvement on linpack when
-freduce-all-givs was used. This is the analysis of his situation using
gcc-3.4-20031024.
The majority of the time in Linpack is spent in the second loop in daxpy(). This
is compiled using "-O2 -S" to the following code:
.L98:
        movl    20(%ebp), %edx          <- memory ref #1
        flds    (%edx,%eax,4)           <- memory ref #2
        movl    12(%ebp), %edx          <- memory ref #3
        fmuls   (%edx,%eax,4)           <- memory ref #4
        incl    %eax
        faddp   %st, %st(1)
Here is the code as compiled with -freduce-all-givs:
.L85:
        flds    (%ecx)                  <- memory ref #1
        addl    $4, %ecx
        fmuls   (%edx)                  <- memory ref #2
        addl    $4, %edx
        decl    %eax
        faddp   %st, %st(1)
        jne     .L85
Basically, by default the loop optimizer chooses to optimize:
    for (i = 0; i < n; i++) {
        dy[i] = dy[i] + da*dx[i];
    }
using a dual-register indexed addressing mode, (%edx,%eax,4). This is bad
because it ties up an extra register for the index, which causes the register
allocator to reload dx and dy on every iteration of the loop. The result is two
extra memory loads in the inner loop.
The -freduce-all-givs version eliminates the biv which frees up a register, and
this removes two memory loads in the inner loop.
The loop optimizer should be able to estimate register pressure and should
eliminate the biv (perform giv reduction) automagically if it will reduce
register pressure.
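The two loop shapes discussed above can be sketched in C. This is only an illustration of the transformation, not the compiler's actual output or algorithm; the function names axpy_indexed and axpy_reduced are hypothetical and do not appear in lpbench.

```c
/* Indexed form: the counter i (the biv) and both base pointers must stay
   live, because every address is computed as base + i*4. */
static void axpy_indexed(int n, float da, const float *dx, float *dy)
{
    for (int i = 0; i < n; i++)
        dy[i] = dy[i] + da * dx[i];
}

/* Hand-reduced form: the addresses &dx[i] and &dy[i] (the givs) are kept
   in running pointers that are bumped each iteration, and the index is
   replaced by a simple down-counter. */
static void axpy_reduced(int n, float da, const float *dx, float *dy)
{
    const float *px = dx;
    float *py = dy;

    for (int count = n; count > 0; count--) {
        *py = *py + da * *px;
        px++;
        py++;
    }
}
```

Both functions compute the same result. The second form mirrors the addl $4 / decl pattern in the -freduce-all-givs listing, where no register is needed for a separate index.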
* [Bug optimization/12771] Weak loop optimizer, significant performance regression
2003-10-25 0:54 [Bug optimization/12771] New: Weak loop optimizer, significant performance regression tm at kloo dot net
@ 2003-10-25 1:03 ` tm at kloo dot net
2003-10-25 1:31 ` pinskia at gcc dot gnu dot org
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: tm at kloo dot net @ 2003-10-25 1:03 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12771
------- Additional Comments From tm at kloo dot net 2003-10-25 00:55 -------
Created an attachment (id=4992)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=4992&action=view)
preprocessed source for linpack
preprocessed source for linpack for this PR
* [Bug optimization/12771] Weak loop optimizer, significant performance regression
2003-10-25 0:54 [Bug optimization/12771] New: Weak loop optimizer, significant performance regression tm at kloo dot net
2003-10-25 1:03 ` [Bug optimization/12771] " tm at kloo dot net
@ 2003-10-25 1:31 ` pinskia at gcc dot gnu dot org
2003-10-27 11:33 ` tm at kloo dot net
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2003-10-25 1:31 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12771
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |pessimizes-code
------- Additional Comments From pinskia at gcc dot gnu dot org 2003-10-25 01:14 -------
What is this a regression from? I get about the same code from 2.95.3, 3.2.3, and 3.3.1.
* [Bug optimization/12771] Weak loop optimizer, significant performance regression
2003-10-25 0:54 [Bug optimization/12771] New: Weak loop optimizer, significant performance regression tm at kloo dot net
2003-10-25 1:03 ` [Bug optimization/12771] " tm at kloo dot net
2003-10-25 1:31 ` pinskia at gcc dot gnu dot org
@ 2003-10-27 11:33 ` tm at kloo dot net
2003-10-28 15:42 ` [Bug optimization/12771] Weak loop optimizer pinskia at gcc dot gnu dot org
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: tm at kloo dot net @ 2003-10-27 11:33 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12771
------- Additional Comments From tm at kloo dot net 2003-10-27 10:01 -------
Subject: Re: Weak loop optimizer, significant performance regression
On 25 Oct 2003, pinskia at gcc dot gnu dot org wrote:
Regression from 2.7.2.
Toshi
> What is this a regression from, I get about the same code from 2.95.3, 3.2.3, and 3.3.1?
* [Bug optimization/12771] Weak loop optimizer
2003-10-25 0:54 [Bug optimization/12771] New: Weak loop optimizer, significant performance regression tm at kloo dot net
` (2 preceding siblings ...)
2003-10-27 11:33 ` tm at kloo dot net
@ 2003-10-28 15:42 ` pinskia at gcc dot gnu dot org
2003-10-28 19:22 ` tm at kloo dot net
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2003-10-28 15:42 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12771
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |enhancement
Status|UNCONFIRMED |NEW
Ever Confirmed| |1
Last reconfirmed|0000-00-00 00:00:00 |2003-10-28 15:38:57
date| |
Summary|Weak loop optimizer, |Weak loop optimizer
|significant performance |
|regression |
------- Additional Comments From pinskia at gcc dot gnu dot org 2003-10-28 15:38 -------
Since this is a regression from 2.7.2, which was released on November 26, 1995 (almost 8 years ago),
we cannot really count it as a regression; too much has changed since then.
* [Bug optimization/12771] Weak loop optimizer
2003-10-25 0:54 [Bug optimization/12771] New: Weak loop optimizer, significant performance regression tm at kloo dot net
` (3 preceding siblings ...)
2003-10-28 15:42 ` [Bug optimization/12771] Weak loop optimizer pinskia at gcc dot gnu dot org
@ 2003-10-28 19:22 ` tm at kloo dot net
2004-03-07 1:34 ` pinskia at gcc dot gnu dot org
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: tm at kloo dot net @ 2003-10-28 19:22 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12771
------- Additional Comments From tm at kloo dot net 2003-10-28 19:21 -------
Subject: Re: Weak loop optimizer
On 28 Oct 2003, pinskia at gcc dot gnu dot org wrote:
It's still an optimization weakness.
Toshi
> Since this is a regression from 2.7.2 which was released November 26, 1995 almost 8 years ago,
> we cannot really count this one as a regression as so much has changed since then.
* [Bug optimization/12771] Weak loop optimizer
2003-10-25 0:54 [Bug optimization/12771] New: Weak loop optimizer, significant performance regression tm at kloo dot net
` (4 preceding siblings ...)
2003-10-28 19:22 ` tm at kloo dot net
@ 2004-03-07 1:34 ` pinskia at gcc dot gnu dot org
2005-03-23 3:07 ` [Bug rtl-optimization/12771] " pinskia at gcc dot gnu dot org
2005-05-07 20:47 ` pinskia at gcc dot gnu dot org
7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-03-07 1:34 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-03-07 01:34 -------
The mainline output has changed, but it still has the same number of memory accesses:
        movl    16(%ebp), %edx
        flds    (%edx,%eax,4)
        fmul    %st(1), %st
        fadds   (%esi,%eax,4)
        fstps   (%esi,%eax,4)
        incl    %eax
With -freduce-all-givs, on the other hand, the number of memory accesses has increased by one:
.L15:
        flds    (%ecx)
        addl    $4, %ecx
        fmul    %st(1), %st
        fadds   (%eax)
        fstps   (%eax)
        addl    $4, %eax
        decl    %edx
        jne     .L15
Here is the reduced source:
void daxpy(int n, float da, float dx[], int incx, float dy[], int incy)
{
    int i, ix, iy, m, mp1;
    mp1 = 0;
    m = 0;
    if (n <= 0) return;
    if (da == 0.0) return;
    if (incx != 1 || incy != 1) {
        ix = 0;
        iy = 0;
        if (incx < 0) ix = (-n+1)*incx;
        if (incy < 0) iy = (-n+1)*incy;
        for (i = 0; i < n; i++) {
            dy[iy] = dy[iy] + da*dx[ix];
            ix = ix + incx;
            iy = iy + incy;
        }
        return;
    }
    for (i = 0; i < n; i++) {
        dy[i] = dy[i] + da*dx[i];
    }
    return;
}
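The strided loop in the reduced source can also be written with running pointers rather than index arithmetic. The following is a minimal sketch of that idea, assuming the same start-offset rule as the source ((-n+1)*inc for a negative increment); the name axpy_strided is hypothetical, and this is a hand transformation, not what any GCC version is claimed to emit.

```c
/* Strided update with running pointers. Each pointer is advanced by its
   stride only when another iteration follows, so no pointer is ever moved
   outside its array. */
static void axpy_strided(int n, float da, const float *dx, int incx,
                         float *dy, int incy)
{
    if (n <= 0 || da == 0.0f)
        return;

    /* Same starting positions as ix and iy in the reduced source. */
    const float *px = dx + (incx < 0 ? (-n + 1) * incx : 0);
    float *py = dy + (incy < 0 ? (-n + 1) * incy : 0);

    for (int count = n; ; ) {
        *py = *py + da * *px;
        if (--count == 0)
            break;
        px += incx;
        py += incy;
    }
}
```

With this shape the loop needs two pointers, two strides, and a counter, but no separate index values for ix and iy.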
--
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed|2003-10-28 15:38:57 |2004-03-07 01:34:09
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12771
* [Bug rtl-optimization/12771] Weak loop optimizer
2003-10-25 0:54 [Bug optimization/12771] New: Weak loop optimizer, significant performance regression tm at kloo dot net
` (5 preceding siblings ...)
2004-03-07 1:34 ` pinskia at gcc dot gnu dot org
@ 2005-03-23 3:07 ` pinskia at gcc dot gnu dot org
2005-05-07 20:47 ` pinskia at gcc dot gnu dot org
7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-03-23 3:07 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-03-23 03:07 -------
We are better on the mainline, but still not as good as the old code with -freduce-all-givs.
Here is what we get with the mainline:
.L14:
        fld     %st(0)
        fmuls   (%ecx,%edx,4)
        incl    %edx
        fadds   (%eax)
        fstps   (%eax)
        addl    $4, %eax
        cmpl    %edx, %esi
        jne     .L14
--
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed|2004-12-22 04:37:34 |2005-03-23 03:07:20
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12771
* [Bug rtl-optimization/12771] Weak loop optimizer
2003-10-25 0:54 [Bug optimization/12771] New: Weak loop optimizer, significant performance regression tm at kloo dot net
` (6 preceding siblings ...)
2005-03-23 3:07 ` [Bug rtl-optimization/12771] " pinskia at gcc dot gnu dot org
@ 2005-05-07 20:47 ` pinskia at gcc dot gnu dot org
7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-05-07 20:47 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-05-07 20:47 -------
On the mainline, we get:
.L13:
        fld     %st(0)
        incl    %ecx
        fmuls   (%edx)
        addl    %ebx, %edx
        fadds   (%eax)
        fstps   (%eax)
        addl    %esi, %eax
        cmpl    %ecx, %edi
        jne     .L13
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12771