public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/19240] New: runtime performance regression in floating point heavy code, x86/SSE
@ 2005-01-03 13:13 tbptbp at gmail dot com
  2005-01-03 13:14 ` [Bug rtl-optimization/19240] " tbptbp at gmail dot com
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: tbptbp at gmail dot com @ 2005-01-03 13:13 UTC (permalink / raw)
  To: gcc-bugs

I'm seeing a significant runtime performance regression (>15%) with snapshots
following gcc-4.0-20041205. As far as I can see, there are some issues when
register pressure builds up: in later versions the x87 FPU gets involved where
the earlier version didn't use it.

The >15% figure comes from a larger application (a raytracer). Branch prediction
also changed there (but I've fixed that), so I'm reasonably sure the problem is
what the attached testcase demonstrates.

Switches: -march=k8 -mfpmath=sse -O3 -ffast-math -fomit-frame-pointer

with gcc-4.0-20041205:
[snip]
  4010f4:       movss  (%ecx,%esi,4),%xmm0
  4010f9:       movss  (%eax,%ebx,4),%xmm5
  4010fe:       movss  (%eax,%esi,4),%xmm7
  401103:       mulss  %xmm5,%xmm1
  401107:       movss  (%ecx,%ebx,4),%xmm4
  40110c:       movss  %xmm0,(%esp)
  401111:       mulss  %xmm4,%xmm2
  401115:       movaps %xmm3,%xmm0
  401118:       subss  (%ecx,%edx,4),%xmm6
  40111d:       addss  (%eax,%edx,4),%xmm1
  401122:       mulss  (%esp),%xmm3
  401127:       mulss  %xmm7,%xmm0
  40112b:       subss  %xmm2,%xmm6
  40112f:       xorps  %xmm2,%xmm2
  401132:       addss  %xmm0,%xmm1
  401136:       subss  %xmm3,%xmm6
  40113a:       divss  %xmm1,%xmm6
  40113e:       mulss  %xmm6,%xmm7
  401142:       comiss 0x0(%ebp),%xmm6
  401146:       mulss  %xmm6,%xmm5
  40114a:       addss  (%esp),%xmm7

with gcc-4.0-20050102:
[snip]
  4010ff:       movss  (%ecx,%esi,4),%xmm0
  401104:       movss  (%eax,%ebx,4),%xmm5
  401109:       movss  (%eax,%esi,4),%xmm7
  40110e:       mulss  %xmm5,%xmm1
  401112:       movss  (%ecx,%ebx,4),%xmm4
  401117:       movss  %xmm0,0x4(%esp)
  40111d:       mulss  %xmm4,%xmm2
  401121:       movaps %xmm3,%xmm0
  401124:       flds   (%ecx,%edx,4)
  401127:       addss  (%eax,%edx,4),%xmm1
  40112c:       mulss  0x4(%esp),%xmm3
  401132:       fsubrs 0xc(%edi)
  401135:       mulss  %xmm7,%xmm0
  401139:       addss  %xmm0,%xmm1
  40113d:       fstps  (%esp)
  401140:       movss  (%esp),%xmm6
  401145:       subss  %xmm2,%xmm6
  401149:       xorps  %xmm2,%xmm2
  40114c:       subss  %xmm3,%xmm6
  401150:       divss  %xmm1,%xmm6
  401154:       mulss  %xmm6,%xmm7
  401158:       comiss 0x0(%ebp),%xmm6
  40115c:       mulss  %xmm6,%xmm5
  401160:       addss  0x4(%esp),%xmm7
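
The difference between the two listings is the flds/fsubrs/fstps sequence, which
routes one subtraction through the x87 stack and back through memory. As a rough
aid for comparing such listings, here is a minimal sketch that flags x87-looking
instructions. It is not part of the attached testcase. It assumes the trimmed
listing layout shown above (address, mnemonic, operands, no raw byte column), and
it relies on the heuristic that x87 mnemonics start with "f" while SSE mnemonics
do not. The function name and the OLD/NEW excerpts are illustrative only.

```python
import re

# Assumed layout, as in the listings above: "  401124:       flds   (%ecx,%edx,4)"
LINE = re.compile(r"^\s*([0-9a-f]+):\s+(\S+)\s*(.*)$")


def x87_instructions(listing):
    """Return (address, mnemonic, operands) for each x87-looking line.

    Heuristic (an assumption, not a decoder): a mnemonic starting with "f"
    is treated as x87. Lines that do not match the layout are skipped.
    """
    hits = []
    for line in listing.splitlines():
        m = LINE.match(line)
        if m and m.group(2).startswith("f"):
            hits.append((m.group(1), m.group(2), m.group(3).strip()))
    return hits


# Excerpt from the gcc-4.0-20041205 listing.
OLD = """\
  401115:       movaps %xmm3,%xmm0
  401118:       subss  (%ecx,%edx,4),%xmm6
  40111d:       addss  (%eax,%edx,4),%xmm1
"""

# Excerpt from the gcc-4.0-20050102 listing.
NEW = """\
  401121:       movaps %xmm3,%xmm0
  401124:       flds   (%ecx,%edx,4)
  401127:       addss  (%eax,%edx,4),%xmm1
  40112c:       mulss  0x4(%esp),%xmm3
  401132:       fsubrs 0xc(%edi)
  40113d:       fstps  (%esp)
  401140:       movss  (%esp),%xmm6
"""

if __name__ == "__main__":
    for name, text in (("20041205", OLD), ("20050102", NEW)):
        print(name, [mnemonic for _, mnemonic, _ in x87_instructions(text)])
```

Run on full listings, an empty result for the older snapshot and a non-empty one
for the newer snapshot would point at the same symptom described here.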

-- 
           Summary: runtime performance regression in floating point heavy
                    code, x86/SSE
           Product: gcc
           Version: 4.0.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: tbptbp at gmail dot com
                CC: gcc-bugs at gcc dot gnu dot org
  GCC host triplet: cygwin


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19240



Thread overview: 7+ messages
2005-01-03 13:13 [Bug rtl-optimization/19240] New: runtime performance regression in floating point heavy code, x86/SSE tbptbp at gmail dot com
2005-01-03 13:14 ` [Bug rtl-optimization/19240] " tbptbp at gmail dot com
2005-01-03 15:06 ` [Bug target/19240] [4.0 Regression] " pinskia at gcc dot gnu dot org
2005-01-03 16:27 ` uros at kss-loka dot si
2005-01-03 20:39 ` rth at gcc dot gnu dot org
2005-01-04 10:41 ` cvs-commit at gcc dot gnu dot org
2005-01-04 15:43 ` pinskia at gcc dot gnu dot org
