public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/115438] New: 503.bwaves_r regressed 5-11% on different x86_64 machines at -Ofast -march=native since r15-1006-gd93353e6423eca
From: jamborm at gcc dot gnu.org @ 2024-06-11 15:45 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115438

            Bug ID: 115438
           Summary: 503.bwaves_r regressed 5-11% on different x86_64
                    machines at -Ofast -march=native since
                    r15-1006-gd93353e6423eca
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jamborm at gcc dot gnu.org
                CC: rguenth at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---

The run time of 503.bwaves_r from SPEC FPrate 2017 regressed by 5-11% on
different x86_64 machines at -Ofast -march=native (specifically without
LTO) since r15-1006-gd93353e6423eca (Richard Biener: Do single-lane SLP
discovery for reductions).

I have bisected the issue on zen3 only; the regressions on the other
machines, however, appeared around the same time:

- zen3: 11%
  https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=471.427.0

- zen2: 7%
  https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=295.427.0

- skylake: 5%
  https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=801.427.0

- zen4: 5%
  https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=970.427.0

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
* [Bug tree-optimization/115438] [15 Regression] 503.bwaves_r regressed 5-11% on different x86_64 machines at -Ofast -march=native since r15-1006-gd93353e6423eca
From: pinskia at gcc dot gnu.org @ 2024-06-11 15:47 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115438

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Target Milestone|---                         |15.0
             Blocks|                            |53947
            Version|14.0                        |15.0
                 CC|                            |pinskia at gcc dot gnu.org
            Summary|503.bwaves_r regressed      |[15 Regression]
                   |5-11% on different x86_64   |503.bwaves_r regressed
                   |machines at -Ofast          |5-11% on different x86_64
                   |-march=native since         |machines at -Ofast
                   |r15-1006-gd93353e6423eca    |-march=native since
                   |                            |r15-1006-gd93353e6423eca

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
* [Bug tree-optimization/115438] [15 Regression] 503.bwaves_r regressed 5-11% on different x86_64 machines at -Ofast -march=native since r15-1006-gd93353e6423eca
From: rguenth at gcc dot gnu.org @ 2024-06-12 6:59 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115438

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
     Ever confirmed|0                             |1
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2024-06-12
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I will see what happens.  It's somewhat expected but OTOH it's not expected.
* [Bug tree-optimization/115438] [15 Regression] 503.bwaves_r regressed 5-11% on different x86_64 machines at -Ofast -march=native since r15-1006-gd93353e6423eca
From: rguenth at gcc dot gnu.org @ 2024-06-13 11:18 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115438

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
I reproduced the larger regression with -march=zen3 on a zen4 machine.

   9.27%  94741  bwaves_r_peak.g  bwaves_r_peak.gcc7-m64  [.] bi_cgstab_block_
   5.96%  60744  bwaves_r_base.g  bwaves_r_base.gcc7-m64  [.] bi_cgstab_block_

bi_cgstab_block_ in block_solver.F is the main regression.  There's no
-fopt-info-vec difference.

I think the difference is that with SLP we do not perform the
STMT_VINFO_FORCE_SINGLE_CYCLE transform.
* [Bug tree-optimization/115438] [15 Regression] 503.bwaves_r regressed 5-11% on different x86_64 machines at -Ofast -march=native since r15-1006-gd93353e6423eca
From: rguenth at gcc dot gnu.org @ 2024-06-13 12:42 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115438

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixing that doesn't seem to help.
* [Bug tree-optimization/115438] [15 Regression] 503.bwaves_r regressed 5-11% on different x86_64 machines at -Ofast -march=native since r15-1006-gd93353e6423eca
From: rguenth at gcc dot gnu.org @ 2024-06-13 13:03 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115438

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Another difference is that for

C        Local r*r norm
         r2=0.0
         do k=2,nzl+1
            do j=1,ny
               do i=1,nx
                  do l=1,nb
                     r(l,i,j,k) = b(l,i,j,k-1) - r(l,i,j,k)
                     r2 =r2+r(l,i,j,k)**2
                     rhat(l,i,j,k) = r(l,i,j,k)
                  enddo
               enddo
            enddo
         enddo

we're now ending up with hybrid SLP (SLP for the reduction and non-SLP
for the non-grouped stores).  In the end in .optimized the code looks the
same again though.  That's expected and will resolve itself.

Another difference is that without SLP we prefer to use a neutral element
as reduction init while with SLP we prefer the scalar initial values as
that's more efficient for SLP reductions and it might also reduce lifetime
of the reg holding the initial value.  I doubt this to be the reason for
the slowness, but it at least prevails.
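For readers following along, the two reduction-init schemes mentioned in comment #4 can be sketched in plain C roughly as below. This is only an illustration under assumptions, not GCC's generated code and not taken from the thread: the four-lane width (LANES), the function names, and the flat one-dimensional array are all hypothetical, and the per-lane accumulators merely stand in for a vector register.

```c
#include <stddef.h>

#define LANES 4 /* hypothetical vector width, chosen for illustration */

/* Reduction epilogue: horizontal sum of the per-lane accumulators. */
static double hsum(const double acc[LANES])
{
    double s = 0.0;
    for (int l = 0; l < LANES; ++l)
        s += acc[l];
    return s;
}

/* Scheme A (assumed non-SLP style): start every lane at the neutral
   element 0.0 and add the scalar initial value once in the epilogue,
   so `init` has to stay available until after the loop. */
double sumsq_neutral_init(const double *r, size_t n, double init)
{
    double acc[LANES] = {0.0, 0.0, 0.0, 0.0};
    size_t i = 0;
    for (; i + LANES <= n; i += LANES)
        for (int l = 0; l < LANES; ++l)
            acc[l] += r[i + l] * r[i + l];
    double s = init + hsum(acc);
    for (; i < n; ++i) /* scalar tail */
        s += r[i] * r[i];
    return s;
}

/* Scheme B (assumed SLP style): seed the scalar initial value into
   lane 0 up front; the epilogue is then just the horizontal sum and
   `init` is no longer needed once the loop has started. */
double sumsq_scalar_init(const double *r, size_t n, double init)
{
    double acc[LANES] = {init, 0.0, 0.0, 0.0};
    size_t i = 0;
    for (; i + LANES <= n; i += LANES)
        for (int l = 0; l < LANES; ++l)
            acc[l] += r[i + l] * r[i + l];
    double s = hsum(acc);
    for (; i < n; ++i) /* scalar tail */
        s += r[i] * r[i];
    return s;
}
```

Both functions compute the same sum up to floating-point reassociation, which -Ofast permits. In the bwaves loop quoted above r2 starts at 0.0, so the two schemes would differ only in how the initial vector is materialized.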
end of thread, other threads:[~2024-06-13 13:03 UTC | newest]

Thread overview: 6+ messages
2024-06-11 15:45 [Bug tree-optimization/115438] New: 503.bwaves_r regressed 5-11% on different x86_64 machines at -Ofast -march=native since r15-1006-gd93353e6423eca jamborm at gcc dot gnu.org
2024-06-11 15:47 ` [Bug tree-optimization/115438] [15 Regression] " pinskia at gcc dot gnu.org
2024-06-12  6:59 ` rguenth at gcc dot gnu.org
2024-06-13 11:18 ` rguenth at gcc dot gnu.org
2024-06-13 12:42 ` rguenth at gcc dot gnu.org
2024-06-13 13:03 ` rguenth at gcc dot gnu.org