public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/115438] New: 503.bwaves_r regressed 5-11% on different x86_64 machines at -Ofast -march=native since r15-1006-gd93353e6423eca
@ 2024-06-11 15:45 jamborm at gcc dot gnu.org
  2024-06-11 15:47 ` [Bug tree-optimization/115438] [15 Regression] " pinskia at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: jamborm at gcc dot gnu.org @ 2024-06-11 15:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115438

            Bug ID: 115438
           Summary: 503.bwaves_r regressed 5-11% on different x86_64
                    machines at -Ofast -march=native since
                    r15-1006-gd93353e6423eca
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jamborm at gcc dot gnu.org
                CC: rguenth at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---

The run-time of 503.bwaves_r from SPEC FPrate 2017 regressed by 5-11%
on different x86_64 machines at -Ofast -march=native (specifically
without LTO) since r15-1006-gd93353e6423eca (Richard Biener: Do
single-lane SLP discovery for reductions).  I have bisected the issue
only on zen3; the other regressions, however, appeared around the same time:

- zen3: 11% https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=471.427.0
- zen2: 7%  https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=295.427.0
- skylake: 5% https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=801.427.0
- zen4: 5%  https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=970.427.0


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)


* [Bug tree-optimization/115438] [15 Regression] 503.bwaves_r regressed 5-11% on different x86_64 machines at -Ofast -march=native since r15-1006-gd93353e6423eca
From: pinskia at gcc dot gnu.org @ 2024-06-11 15:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115438

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Target Milestone|---                         |15.0
             Blocks|                            |53947
            Version|14.0                        |15.0
                 CC|                            |pinskia at gcc dot gnu.org
            Summary|503.bwaves_r regressed      |[15 Regression]
                   |5-11% on different x86_64   |503.bwaves_r regressed
                   |machines at -Ofast          |5-11% on different x86_64
                   |-march=native since         |machines at -Ofast
                   |r15-1006-gd93353e6423eca    |-march=native since
                   |                            |r15-1006-gd93353e6423eca


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations


* [Bug tree-optimization/115438] [15 Regression] 503.bwaves_r regressed 5-11% on different x86_64 machines at -Ofast -march=native since r15-1006-gd93353e6423eca
From: rguenth at gcc dot gnu.org @ 2024-06-12  6:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115438

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2024-06-12
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I will see what happens.  It's somewhat expected but OTOH it's not expected.


* [Bug tree-optimization/115438] [15 Regression] 503.bwaves_r regressed 5-11% on different x86_64 machines at -Ofast -march=native since r15-1006-gd93353e6423eca
From: rguenth at gcc dot gnu.org @ 2024-06-13 11:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115438

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
I reproduced the larger regression with -march=znver3 on a zen4 machine.

   9.27%   94741  bwaves_r_peak.g  bwaves_r_peak.gcc7-m64  [.] bi_cgstab_block_
   5.96%   60744  bwaves_r_base.g  bwaves_r_base.gcc7-m64  [.] bi_cgstab_block_

in block_solver.F is the main regression.  There's no -fopt-info-vec
difference.

I think the difference is that with SLP we do not perform the
STMT_VINFO_FORCE_SINGLE_CYCLE transform.
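
A rough illustration of what a single def-use cycle means for a reduction,
assuming the flag refers to the case where several vector statements per
iteration all feed one accumulator instead of each having its own.  This is
a scalar C model, not GCC code: LANES, the function names and the choice of
two copies per iteration are made up for the sketch.

```c
#include <stddef.h>

#define LANES 4 /* hypothetical number of elements per vector */

/* Separate def-use cycles: each of the two vector copies per iteration
   has its own accumulator; they are combined after the loop.  */
double sumsq_two_cycles(const double *x, size_t n)
{
    double r0[LANES] = {0.0}, r1[LANES] = {0.0};
    size_t i = 0;
    for (; i + 2 * LANES <= n; i += 2 * LANES)
        for (int l = 0; l < LANES; l++) {
            r0[l] += x[i + l] * x[i + l];
            r1[l] += x[i + LANES + l] * x[i + LANES + l];
        }
    double r = 0.0;
    for (int l = 0; l < LANES; l++)
        r += r0[l] + r1[l];
    for (; i < n; i++)          /* scalar tail */
        r += x[i] * x[i];
    return r;
}

/* Single def-use cycle: both copies update the same accumulator, so only
   one vector register is carried around the loop.  */
double sumsq_single_cycle(const double *x, size_t n)
{
    double r[LANES] = {0.0};
    size_t i = 0;
    for (; i + 2 * LANES <= n; i += 2 * LANES)
        for (int l = 0; l < LANES; l++) {
            r[l] += x[i + l] * x[i + l];
            r[l] += x[i + LANES + l] * x[i + LANES + l];
        }
    double s = 0.0;
    for (int l = 0; l < LANES; l++)
        s += r[l];
    for (; i < n; i++)          /* scalar tail */
        s += x[i] * x[i];
    return s;
}
```

The two variants add the terms in a different order, so for general
floating-point input they are only interchangeable when reassociation is
allowed, as it is at -Ofast.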


* [Bug tree-optimization/115438] [15 Regression] 503.bwaves_r regressed 5-11% on different x86_64 machines at -Ofast -march=native since r15-1006-gd93353e6423eca
From: rguenth at gcc dot gnu.org @ 2024-06-13 12:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115438

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixing that doesn't seem to help.


* [Bug tree-optimization/115438] [15 Regression] 503.bwaves_r regressed 5-11% on different x86_64 machines at -Ofast -march=native since r15-1006-gd93353e6423eca
From: rguenth at gcc dot gnu.org @ 2024-06-13 13:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115438

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Another difference is that for

C       Local r*r norm
        r2=0.0
        do k=2,nzl+1
           do j=1,ny
              do i=1,nx
                 do l=1,nb
                    r(l,i,j,k) = b(l,i,j,k-1) - r(l,i,j,k)
                    r2 =r2+r(l,i,j,k)**2
                    rhat(l,i,j,k) = r(l,i,j,k)
                 enddo
              enddo
           enddo
        enddo

we're now ending up with hybrid SLP (SLP for the reduction and non-SLP
for the non-grouped stores).  In the end in .optimized the code looks
the same again though.

That's expected and will resolve itself.
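
For experimenting outside of SPEC, the shape of that loop can be
approximated in C.  The function below is a hypothetical, flattened
analogue and not the benchmark source: the four-deep nest is collapsed into
one loop, the k-1 offset on b is dropped, and local_norm and its parameters
are invented names.  The restrict qualifiers stand in for Fortran's
no-aliasing guarantee.

```c
#include <stddef.h>

/* One sum-of-squares reduction (r2) next to two stores that do not form
   a store group (r[] and rhat[]).  */
double local_norm(double *restrict r, const double *restrict b,
                  double *restrict rhat, size_t n)
{
    double r2 = 0.0;
    for (size_t i = 0; i < n; i++) {
        r[i] = b[i] - r[i];     /* store to r */
        r2 += r[i] * r[i];      /* reduction */
        rhat[i] = r[i];         /* independent store to rhat */
    }
    return r2;
}
```

Compiling something like this with -Ofast -fopt-info-vec before and after
the revision should show whether the vectorizer reports anything different,
though there is no guarantee that the reduced form behaves like the
original nest.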

Another difference is that without SLP we prefer to use a neutral element
as the reduction init, while with SLP we prefer the scalar initial values,
as that's more efficient for SLP reductions and might also reduce the
lifetime of the reg holding the initial value.  I doubt this is the reason
for the slowness, but unlike the hybrid SLP difference it at least persists
in the final code.
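
The two initialization strategies can be modelled in scalar C as follows.
This is only a sketch under assumptions: LANES, hsum and both function
names are invented, and putting the initial value into lane 0 is one
plausible layout for a sum, not necessarily what the vectorizer emits.

```c
#include <stddef.h>

#define LANES 4 /* hypothetical number of elements per vector */

/* Horizontal sum of the accumulator lanes (the reduction epilogue).  */
static double hsum(const double acc[LANES])
{
    double s = 0.0;
    for (int l = 0; l < LANES; l++)
        s += acc[l];
    return s;
}

/* Neutral-element init: the accumulator starts at all zeros and the
   scalar initial value is added in the epilogue, so it has to stay live
   across the whole loop.  */
double sum_neutral_init(const double *x, size_t n, double init)
{
    double acc[LANES] = {0.0, 0.0, 0.0, 0.0};
    size_t i = 0;
    for (; i + LANES <= n; i += LANES)
        for (int l = 0; l < LANES; l++)
            acc[l] += x[i + l];
    double r = init + hsum(acc);
    for (; i < n; i++)          /* scalar tail */
        r += x[i];
    return r;
}

/* Scalar-initial-value init: the initial value is placed into the
   accumulator before the loop and is not needed afterwards.  */
double sum_scalar_init(const double *x, size_t n, double init)
{
    double acc[LANES] = {init, 0.0, 0.0, 0.0};
    size_t i = 0;
    for (; i + LANES <= n; i += LANES)
        for (int l = 0; l < LANES; l++)
            acc[l] += x[i + l];
    double r = hsum(acc);
    for (; i < n; i++)          /* scalar tail */
        r += x[i];
    return r;
}
```

Both variants compute the same sum up to floating-point reassociation; they
differ in how the accumulator is set up and in how long the initial value
must be kept in a register.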
