public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/41783]  New: r151561 (PRE fix) regresses zeusmp
@ 2009-10-21 14:34 matz at gcc dot gnu dot org
From: matz at gcc dot gnu dot org @ 2009-10-21 14:34 UTC (permalink / raw)
  To: gcc-bugs

zeusmp regressed by about 5% again with the PRE fix for PR41101, which is
r151561.  The problem is that PRE now finds a partial redundancy (where in
reality there isn't any), and the PHI node inserted to compensate for it
prevents vectorization of a loop because its value is used outside that
loop.  Testcase extracted from zeusmp:

% cat hsmoc-1.f
      subroutine hsmoc ( )
      implicit NONE
      integer ijkn
      parameter(ijkn =   128+5)
      real*8 dt, fact, db(ijkn), w1dt(ijkn)
      integer i, is, ie, j, js, je
      common /rootr/ dt
      common /scratch/  w1dt
         do 9 i=is,ie
           do 807 j=js-1,je+1
             db (j  ) = j
 807       continue
           fact = dt * i
           do 808 j=js,je+1
             w1dt(j)= fact * db (j)
 808       continue
 9      continue
       return
       end

(compile with -march=barcelona -O3 -ffast-math -funroll-loops -fpeel-loops)
The problem is the access to 'dt' (rootr.dt), which PRE thinks is partially
redundant in the first loop (!?), hence it creates this code:

pretmp.11_53 = rootr.dt;
Loop-i:
  prephitmp.12_51 = PHI <pretmp.11_53(9), D.1376_20(20)>
...
  Loop-j1
    prephitmp.12_49 = PHI <prephitmp.12_51(11), pretmp.11_52(14)>
    ...
    pretmp.11_52 = rootr.dt;
    goto Loop-j1
  prephitmp.12_23 = PHI <prephitmp.12_51(12), prephitmp.12_49(13)>
  D.1376_20 = prephitmp.12_23;
  ...
  Loop-j2

Notice especially how we now read rootr.dt on the backedge of loop-j1,
which happens much more often than before.  Originally we accessed it about
ie-is times (once per iteration of loop-i); now we access it about
(ie-is)*(je-js) times (once per iteration of loop-j1).
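
To put rough numbers on the load counts above, here is a minimal sketch
in Python.  It is only a model of the dump, not anything GCC computes:
the function names are hypothetical, and it assumes one load of rootr.dt
before loop-i (pretmp.11_53) plus one load on every taken backedge of
loop-j1 (pretmp.11_52).  It uses exact trip counts, so the results differ
slightly from the approximate ie-is and (ie-is)*(je-js) figures.

```python
def trip_count(first, last):
    """Iterations of a Fortran-style `do v = first, last` loop with step 1."""
    return max(last - first + 1, 0)


def loads_before(is_, ie, js, je):
    """Loads of rootr.dt when it is read once per loop-i iteration
    (the `fact = dt * i` statement in the testcase)."""
    return trip_count(is_, ie)


def loads_after(is_, ie, js, je):
    """Loads of rootr.dt in the modeled post-PRE shape.

    Assumption: one load ahead of loop-i, plus one load per taken
    backedge of loop-j1, which runs j = js-1 .. je+1.
    """
    outer = trip_count(is_, ie)
    j1_backedges = max(trip_count(js - 1, je + 1) - 1, 0)
    return 1 + outer * j1_backedges


if __name__ == "__main__":
    # Hypothetical bounds chosen to fit inside ijkn = 128+5 = 133.
    bounds = (1, 100, 3, 130)
    print("before:", loads_before(*bounds))  # before: 100
    print("after: ", loads_after(*bounds))   # after:  12901
```

With these made-up bounds the modeled number of loads grows by roughly
the trip count of loop-j1, which matches the order of magnitude of the
estimate above.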

It's possible that this alone explains the speed regression, and not
necessarily the missed vectorization.  But the missed vectorization was
much easier to detect.


-- 
           Summary: r151561 (PRE fix) regresses zeusmp
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: matz at gcc dot gnu dot org
  GCC host triplet: x86_64-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41783



Thread overview: 17+ messages
2009-10-21 14:34 [Bug tree-optimization/41783] New: r151561 (PRE fix) regresses zeusmp matz at gcc dot gnu dot org
2009-10-21 14:53 ` [Bug tree-optimization/41783] " rguenth at gcc dot gnu dot org
2009-10-21 15:07 ` rguenth at gcc dot gnu dot org
2009-10-21 15:14 ` rguenth at gcc dot gnu dot org
2009-10-21 15:16 ` rguenth at gcc dot gnu dot org
2009-10-21 15:20 ` matz at gcc dot gnu dot org
2009-10-21 15:22 ` matz at gcc dot gnu dot org
2009-10-21 15:26 ` rguenth at gcc dot gnu dot org
2009-10-21 15:35 ` matz at gcc dot gnu dot org
2009-10-21 18:09 ` rguenth at gcc dot gnu dot org
2009-10-21 20:14 ` rguenth at gcc dot gnu dot org
2009-10-22 15:13 ` matz at gcc dot gnu dot org
2009-10-26 13:01 ` matz at gcc dot gnu dot org
2009-10-26 13:04 ` matz at gcc dot gnu dot org
2009-10-30 16:14 ` rguenth at gcc dot gnu dot org
2009-10-30 16:40 ` rguenth at gcc dot gnu dot org
2010-01-19 16:06 ` matz at gcc dot gnu dot org
