public inbox for gcc-prs@sourceware.org
* optimization/10625: -freduce-all-givs has negative effect
@ 2003-05-05  9:36 thome
From: thome @ 2003-05-05  9:36 UTC (permalink / raw)
  To: gcc-gnats


>Number:         10625
>Category:       optimization
>Synopsis:       -freduce-all-givs has negative effect
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    unassigned
>State:          open
>Class:          pessimizes-code
>Submitter-Id:   net
>Arrival-Date:   Mon May 05 09:36:00 UTC 2003
>Closed-Date:
>Last-Modified:
>Originator:     thome@lix.polytechnique.fr
>Release:        gcc-3.2
>Organization:
>Environment:
Red Hat Linux 8.0
>Description:
Currently, on my P3 866MHz, the attached code takes 200ms with:
-mcpu=pentiumpro -O3 -funroll-loops, 
and 232ms with: 
-mcpu=pentiumpro -O3 -funroll-loops -freduce-all-givs.

-freduce-all-givs impacts negatively on the performance by around 15%.

The code also suffers from the inability of unroll-loops to expand the
two nested loops (the inner loop becomes constant when the outer is
expanded).
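
As an illustration of the nested-loop point above, here is a minimal sketch in C of a triangular loop nest of similar shape. The size N, the function names, and the multiply-accumulate body are assumptions made for the example and are not taken from the attached gccbug.c. Once the outer loop is expanded for a fixed N, each inner loop has a known trip count and the whole nest collapses to straight-line code:

```c
#include <assert.h>

#define N 5 /* assumed fixed operand size for this sketch */

/* Rolled form: the inner trip count depends on the outer index j. */
unsigned long tri_rolled(const unsigned long *a, const unsigned long *b)
{
    unsigned long acc = 0;
    int i, j;

    for (j = 1; j <= N - 2; j++)
        for (i = 1; i + j - 1 < N - 2; i++)
            acc += a[i] * b[j];
    return acc;
}

/* Hand-unrolled form for N == 5:
 *   j = 1 -> i = 1, 2
 *   j = 2 -> i = 1
 *   j = 3 -> inner loop runs zero times */
unsigned long tri_unrolled(const unsigned long *a, const unsigned long *b)
{
    return a[1] * b[1] + a[2] * b[1] + a[1] * b[2];
}

/* Self-check: both forms must agree on sample data. */
void tri_self_check(void)
{
    const unsigned long a[N] = { 3, 5, 7, 11, 13 };
    const unsigned long b[N] = { 2, 4, 6, 8, 10 };

    assert(tri_rolled(a, b) == tri_unrolled(a, b));
}
```

Calling tri_self_check() from a small main() is enough to confirm that the two forms compute the same value; comparing the assembly generated for each form shows how much of the loop overhead the compiler removes on its own.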

For the record, a hand-made unrolling of these loops can yield a 25%
speedup (hence 150ms). Further hand-tweaking of the assembly code can
improve the performance by an additional 20% or so.

This code could probably be better optimized by gcc than what happens
now.

The gcc version is given below. I know it is a CVS snapshot with some
Red Hat-specific patches. Tell me off if Red Hat's patches touch the
areas of the code concerned, but I consider this unlikely.

Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/3.2/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --host=i386-redhat-linux --with-system-zlib --enable-__cxa_atexit
Thread model: posix
gcc version 3.2 20020903 (Red Hat Linux 8.0 3.2-7)
>How-To-Repeat:

>Fix:

>Release-Note:
>Audit-Trail:
>Unformatted:
----gnatsweb-attachment----
Content-Type: application/octet-stream; name="gccbug.c"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="gccbug.c"

Ci8qCiAqIEN1cnJlbnRseSwgb24gbXkgUDMgODY2TUh6LCB0aGlzIGNvZGUgdGFrZXMKICogCiAq
IDIwMG1zIHdpdGggLW1jcHU9cGVudGl1bXBybyAtTzMgLWZ1bnJvbGwtbG9vcHMsCiAqIGFuZCAy
MzJtcyB3aXRoIC1tY3B1PXBlbnRpdW1wcm8gLU8zIC1mdW5yb2xsLWxvb3BzIC1mcmVkdWNlLWFs
bC1naXZzLgogKgogKiAtZnJlZHVjZS1hbGwtZ2l2cyBpbXBhY3RzIG5lZ2F0aXZlbHkgb24gdGhl
IHBlcmZvcm1hbmNlIGJ5IGFyb3VuZCAxNSUuCiAqCiAqIFRoZSBjb2RlIGFsc28gc3VmZmVycyBm
cm9tIHRoZSBpbmFiaWxpdHkgb2YgdW5yb2xsLWxvb3BzIHRvIGV4cGFuZCB0aGUKICogdHdvIGlt
YnJpY2F0ZWQgbG9vcHMgKHRoZSBpbm5lciBsb29wIGJlY29tZXMgY29uc3RhbnQgd2hlbiB0aGUg
b3V0ZXIgaXMKICogZXhwYW5kZWQpLgogKgogKiBGb3IgdGhlIHJlY29yZCwgYSBoYW5kLW1hZGUg
dW5yb2xsaW5nIG9mIHRoZXNlIGxvb3BzIGNhbiB5aWVsZCBhIDI1JQogKiBzcGVlZHVwIChoZW5j
ZSAxNTBtcykuIEZ1cnRoZXIgaGFuZC10d2Vha2luZyBvZiB0aGUgYXNzZW1ibHkgY29kZSBjYW4K
ICogaW1wcm92ZSB0aGUgcGVyZm9ybWFjZSBieSBhbiBhZGRpdGlvbmFsIDIwJSBvciBzby4KICoK
ICogVGhpcyBjb2RlIGNvdWxkIHByb2JhYmx5IGJlIGJldHRlciBvcHRpbWl6ZWQgYnkgZ2NjIHRo
YW4gd2hhdCBoYXBwZW5zCiAqIG5vdy4KICoKICogZ2NjIHZlcnNpb24gaXMgaGVyZSA7IEkga25v
dywgdGhhdCdzIGEgY3ZzIHZlcnNpb24gcGF0Y2hlZCB3aXRoIHNvbWUKICogcmggc3BlY2lmaXRp
ZXMuIEJhc2ggbWUgaWYgcmVkaGF0IHNwZXdlZCBpbiBwYXRjaGVzIG1lc3NpbmcgdXAgd2l0aAog
KiB0aGUgYXJlYXMgb2YgdGhlIGNvZGUgY29uY2VybmVkLCBidXQgSSBjb25zaWRlciB0aGlzIHVu
bGlrZWx5LgoKUmVhZGluZyBzcGVjcyBmcm9tIC91c3IvbGliL2djYy1saWIvaTM4Ni1yZWRoYXQt
bGludXgvMy4yL3NwZWNzCkNvbmZpZ3VyZWQgd2l0aDogLi4vY29uZmlndXJlIC0tcHJlZml4PS91
c3IgLS1tYW5kaXI9L3Vzci9zaGFyZS9tYW4gLS1pbmZvZGlyPS91c3Ivc2hhcmUvaW5mbyAtLWVu
YWJsZS1zaGFyZWQgLS1lbmFibGUtdGhyZWFkcz1wb3NpeCAtLWRpc2FibGUtY2hlY2tpbmcgLS1o
b3N0PWkzODYtcmVkaGF0LWxpbnV4IC0td2l0aC1zeXN0ZW0temxpYiAtLWVuYWJsZS1fX2N4YV9h
dGV4aXQKVGhyZWFkIG1vZGVsOiBwb3NpeApnY2MgdmVyc2lvbiAzLjIgMjAwMjA5MDMgKFJlZCBI
YXQgTGludXggOC4wIDMuMi03KQoKICovCiNpbmNsdWRlIDxzdGRsaWIuaD4KCnR5cGVkZWYgdW5z
aWduZWQgbG9uZyBtcF9saW1iX3Q7CgojZGVmaW5lIHVtdWxfcHBtbSh3MSwgdzAsIHUsIHYpIFwK
ICBfX2FzbV9fICgibXVsbCAlMyIJCQkJCQkJXAoJICAgOiAiPWEiICh3MCksICI9ZCIgKHcxKQkJ
CQkJXAoJICAgOiAiJTAiICh1KSwgInJtIiAodikpCiNkZWZpbmUgY2FycnlfdGVybWluYXRlXzIo
dzEsdzAsYykJXAogIGFzbSAoICIJYWRkbCAlMiwgJTBcbiIJCVwKCSIJYWRjbCAkMCwgJTFcbiIJ
CVwKCQkgIDogIityIiAodzApLCAiK3JtIiAodzEpIDogInJtIiAoYykpCiNkZWZpbmUgYWRkMl9z
dHJlYW0oZDEsZDAsczEsczAsaW4pCVwKICBhc20gKAkiCWFkZGwJJTQsJTFcbiIJXAoJIglhZGNs
CSUzLCUwXG4iCVwKCSIJYWRjbAkkMCwlMlxuIglcCgkJOiAiK3IiIChkMSksICIrciIgKGQwKSwg
IityIiAoczEpIDogInJtIiAoczApLCAicm0iIChpbikpCgppbmxpbmUgdm9pZCBNVUw1KG1wX2xp
bWJfdCAqIHIsY29uc3QgbXBfbGltYl90ICogczEsY29uc3QgbXBfbGltYl90ICogczIpCnsKCW1w
X2xpbWJfdCB1LHYsdyx6OwoKCWludCBpLGo7Cgljb25zdCBpbnQgbj01OwoJCgkvKiBGaWxsIGlu
IHRoZSBibGFua3MgZmlyc3QuLi4gKi8KCgl1bXVsX3BwbW0odSxyWzBdLHMxWzBdLHMyWzBdKTsK
CS8qIHNvbWUgc3R1ZmYgcGVuZGluZyBpbiB1ICovCgoJZm9yKGk9MTtpPD1uLTI7aSsrKSB7CgkJ
dW11bF9wcG1tKHcsdixzMVtpXSxzMlswXSk7CgkJY2FycnlfdGVybWluYXRlXzIodyx2LHUpOwoJ
CXJbaV09djsKCQl1PXc7CgkJLyogc29tZSBzdHVmZiBwZW5kaW5nIGluIHUgKi8KCQkvKiByWzAu
LmldIGhhdmUgYmVlbiB3cml0dGVuICovCgl9CgoJcltuLTFdPXUrczFbbi0xXSpzMlswXTsKCgkv
KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKiovCgoJZm9yKGo9MTtqPD1uLTI7
aisrKSB7CgkJdW11bF9wcG1tKHYsdSxzMVswXSxzMltqXSk7CgkJLyogc29tZSBzdHVmZiBwZW5k
aW5nIGluIHYsdSA7IHcseiBhdmFpbGFibGUgKi8KCQlmb3IoaT0xO2krai0xPG4tMjtpKyspIHsK
CQkJdW11bF9wcG1tKHosdyxzMVtpXSxzMltqXSk7CgkJCWFkZDJfc3RyZWFtKHYscltpK2otMV0s
eix3LHUpOwoJCQkvKiBhZGQgdSB0byByW2krai0xXSAod2FzIHBlbmRpbmcpLiBBZGQgdyBhbmQg
dGhlCgkJCSAqIGNhcnJ5IG91dCB0byB2LiBBZGQgdGhlIGNhcnJ5IG91dCB0byB6LgoJCQkgKi8K
CQkJdT12OwoJCQl2PXo7CgkJCS8qIHNvbWUgc3R1ZmYgcGVuZGluZyBpbiB2LHUgOyB3LHogYXZh
aWxhYmxlLgoJCQkgKiBBZGQtdXAgdG8gcltpK2pdIGlzIHBlbmRpbmcgKGluIHUpICovCgoJCQkv
KiBXcml0ZXMgaW4gcltqLi5qK2ktMV0gaGF2ZSBiZWVuIHBlcmZvcm1lZCAqLwoJCX0KCgkJLyoK
CQlBU1NFUlQoaitpLTEgPT0gbi0yKTsKCQlBU1NFUlQoaitpID09IG4tMSk7CgkJKi8KCgkJLyog
d3JpdGUgdG8gcltuLTJdIGlzIHBlbmRpbmcsIGRhdGEgaXMgaW4gdS4gKi8KCQoJCXc9czFbaV0q
czJbal07CgkJY2FycnlfdGVybWluYXRlXzIodyxyW24tMl0sdSk7CgkJcltuLTFdKz13K3Y7Cgl9
CgoJcltuLTFdKz1zMVswXSpzMltuLTFdOwp9CgppbnQgbWFpbigpCnsKCWludCBpOwoKCW1wX2xp
bWJfdCBhWzVdLGJbNV0sY1s1XTsKCglmb3IoaT0wO2k8NTtpKyspIHsKCQlhW2ldPXJhbmQoKTsK
CQliW2ldPXJhbmQoKTsKCX0KCglmb3IoaT0wO2k8MTAwMDAwMDtpKyspIHsKCQlNVUw1KGMsYSxi
KTsKCQlhWzBdKz1jWzRdOwoJfQoJcmV0dXJuIGNbNF07Cn0K



* Re: optimization/10625: -freduce-all-givs has negative effect
@ 2003-05-05 19:16 Zdenek Dvorak
From: Zdenek Dvorak @ 2003-05-05 19:16 UTC (permalink / raw)
  To: nobody; +Cc: gcc-prs

The following reply was made to PR optimization/10625; it has been noted by GNATS.

From: Zdenek Dvorak <rakdver@atrey.karlin.mff.cuni.cz>
To: thome@lix.polytechnique.fr
Cc: gcc-gnats@gcc.gnu.org
Subject: Re: optimization/10625: -freduce-all-givs has negative effect
Date: Mon, 5 May 2003 21:12:39 +0200

 Hello,
 
 > -freduce-all-givs impacts negatively on the performance by around 15%.
 > 
 > The code also suffers from the inability of unroll-loops to expand the
 > two nested loops (the inner loop becomes constant when the outer is
 > expanded).
 > 
 > For the record, a hand-made unrolling of these loops can yield a 25%
 > speedup (hence 150ms). Further hand-tweaking of the assembly code can
 > improve the performance by an additional 20% or so.
 > 
 > This code could probably be better optimized by gcc than what happens
 > now.
 
 -freduce-all-givs is not enabled by default for exactly this reason:
 it orders the loop optimizer to reduce givs even when it cannot tell
 that doing so is profitable, and the increased register pressure
 causes the performance loss.
 
 Zdenek
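
To make the register-pressure point above concrete, here is a source-level sketch in C of what reducing givs (general induction variables) amounts to. GCC performs this on its internal representation, not on source code; the function names and the strides are assumptions made for the example.

```c
#include <stddef.h>

/* Before reduction: each index is recomputed from the basic
 * induction variable i on every iteration (a multiply and an add). */
unsigned long sum_indexed(const unsigned long *a, const unsigned long *b,
                          const unsigned long *c, size_t n)
{
    unsigned long acc = 0;
    size_t i;

    for (i = 0; i < n; i++)
        acc += a[2 * i] + b[3 * i + 1] + c[5 * i + 2];
    return acc;
}

/* After reduction: every giv gets its own running variable that is
 * advanced by a constant step, so the multiplies disappear but three
 * more values stay live across the loop. */
unsigned long sum_reduced(const unsigned long *a, const unsigned long *b,
                          const unsigned long *c, size_t n)
{
    unsigned long acc = 0;
    size_t i, ia = 0, ib = 1, ic = 2;

    for (i = 0; i < n; i++) {
        acc += a[ia] + b[ib] + c[ic];
        ia += 2; /* tracks 2 * i     */
        ib += 3; /* tracks 3 * i + 1 */
        ic += 5; /* tracks 5 * i + 2 */
    }
    return acc;
}
```

In the reduced form the loop keeps i, ia, ib and ic live at once in addition to acc, n and the three base pointers. On 32-bit x86, which has only a handful of general-purpose registers, some of these values may have to be spilled to the stack, and the extra memory traffic can cost more than the removed multiplies saved.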

