public inbox for gcc-bugs@sourceware.org
* [Bug target/105275] New: 525.x264_r and 538.imagick_r regressed  on x86_64 at -O2 with PGO after r12-7319-g90d693bdc9d718
@ 2022-04-14 13:02 jamborm at gcc dot gnu.org
  2022-05-20 17:47 ` [Bug target/105275] " jamborm at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: jamborm at gcc dot gnu.org @ 2022-04-14 13:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105275

            Bug ID: 105275
           Summary: 525.x264_r and 538.imagick_r regressed  on x86_64 at
                    -O2 with PGO after r12-7319-g90d693bdc9d718
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jamborm at gcc dot gnu.org
                CC: rguenth at gcc dot gnu.org
  Target Milestone: ---

I can see x86_64 regressions of 525.x264_r and 538.imagick_r when
built with plain -O2 (so generic march/mtune) and profile guided
optimization (PGO), compared to GCC 11.

The performance drop of 525.x264_r is about 11% on znver3 and 10% on
Intel cascadelake.  The performance drop of 538.imagick_r is about
6.4% on znver3.  FWIW, I bisected both to commit
r12-7319-g90d693bdc9d718:

   commit 90d693bdc9d71841f51d68826ffa5bd685d7f0bc
   Author: Richard Biener <rguenther@suse.de>
   Date:   Fri Feb 18 14:32:14 2022 +0100

   target/99881 - x86 vector cost of CTOR from integer regs

   This uses the now passed SLP node to the vectorizer costing hook
   to adjust vector construction costs for the cost of moving an
   integer component from a GPR to a vector register when that's
   required for building a vector from components.  A crucial difference
   here is whether the component is loaded from memory or extracted
   from a vector register as in those cases no intermediate GPR is involved.

   The pr99881.c testcase can be Un-XFAILed with this patch, the
   pr91446.c testcase now produces scalar code which looks superior
   to me so I've adjusted it as well.

   2022-02-18  Richard Biener  <rguenther@suse.de>

           PR tree-optimization/104582
           PR target/99881
           * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
           Cost GPR to vector register moves for integer vector construction.
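
To make the distinction in the commit message concrete, here is a minimal
sketch in C. It is not taken from SPEC or from the GCC testsuite, and the
function names are hypothetical. In the first function the four stored values
are computed in integer registers, so a vectorized store would first have to
move each one from a GPR into a vector register. In the second, the components
come directly from memory, so no intermediate GPR is needed. Whether GCC
actually vectorizes either function depends on the version, the target and the
options used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Four values computed in integer registers and stored to adjacent
   slots.  If the SLP vectorizer merges the stores into one vector
   store, the vector has to be built from GPR values; that is the case
   the commit makes more expensive.  */
void build_from_gprs(uint32_t *out, uint32_t a, uint32_t b)
{
    assert(out != 0);
    out[0] = a + b;
    out[1] = a - b;
    out[2] = a * b;
    out[3] = a ^ b;
}

/* Four values loaded from non-adjacent memory locations.  A vector
   built from them can take each element straight from memory, so no
   GPR-to-vector move is involved.  */
void build_from_memory(uint32_t *out, const uint32_t *in, size_t stride)
{
    assert(out != 0 && in != 0);
    out[0] = in[0 * stride];
    out[1] = in[1 * stride];
    out[2] = in[2 * stride];
    out[3] = in[3 * stride];
}
```

Comparing the vectorizer dumps for the two functions before and after the
commit is one way to see which of them the cost change affects.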

With PGO+LTO, the 538.imagick_r regression on znver3 is small (less
than 3%); the 525.x264_r regressions are smaller than without LTO but
still visible (9.4% and 7.1% on the two machines).


* [Bug target/105275] 525.x264_r and 538.imagick_r regressed  on x86_64 at -O2 with PGO after r12-7319-g90d693bdc9d718
From: jamborm at gcc dot gnu.org @ 2022-05-20 17:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105275

--- Comment #1 from Martin Jambor <jamborm at gcc dot gnu.org> ---
Confirmed with GCC 12.1 numbers.


* [Bug target/105275] 525.x264_r and 538.imagick_r regressed  on x86_64 at -O2 with PGO after r12-7319-g90d693bdc9d718
From: jamborm at gcc dot gnu.org @ 2023-01-18 15:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105275

Martin Jambor <jamborm at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org

--- Comment #2 from Martin Jambor <jamborm at gcc dot gnu.org> ---
I can see this again in my measurements from January 10, 2023.  Trunk and GCC
12.2 are about 10% slower with PGO than GCC 11 with the same options and (this
time also) about 9% slower with both PGO and LTO than GCC 11 with the same
options (well, in the latter case it's only 4% on zen2).


* [Bug target/105275] 525.x264_r and 538.imagick_r regressed  on x86_64 at -O2 with PGO after r12-7319-g90d693bdc9d718
From: jamborm at gcc dot gnu.org @ 2024-01-24 22:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105275

--- Comment #3 from Martin Jambor <jamborm at gcc dot gnu.org> ---
I have re-checked this again this year (using master revision
r14-7200-g95440171d0e615), but this time on a high-frequency Zen3 CPU (EPYC
75F3). Run-time of 525.x264_r built with master with PGO and -O2 improved by
5.49% compared to GCC 13 and so compared to GCC 11 the regression dropped to
4.2%.
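
As a rough consistency check, the two percentages above can be combined to
estimate how GCC 13 compares to GCC 11. The sketch below assumes that
"improved by 5.49%" means the run time with master is 5.49% shorter than with
GCC 13, and that a 4.2% regression means the run time with master is 4.2%
longer than with GCC 11. The function name is hypothetical.

```c
#include <assert.h>

/* Given how much slower NEW is than BASE (in percent) and how much
   faster NEW is than MID (in percent of MID's run time), return how
   much slower MID is than BASE, in percent.  */
double implied_slowdown_pct(double new_vs_base_slowdown_pct,
                            double new_vs_mid_improvement_pct)
{
    assert(new_vs_mid_improvement_pct < 100.0);
    /* Run time of NEW expressed as a multiple of BASE and of MID.  */
    double new_over_base = 1.0 + new_vs_base_slowdown_pct / 100.0;
    double new_over_mid = 1.0 - new_vs_mid_improvement_pct / 100.0;
    /* MID / BASE = (NEW / BASE) / (NEW / MID).  */
    return (new_over_base / new_over_mid - 1.0) * 100.0;
}
```

Under those assumptions, implied_slowdown_pct(4.2, 5.49) comes out at roughly
10.3, in line with the "about 10%" reported in comment #2, although that
measurement was taken on different hardware.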

Run-time of 538.imagick_r compiled with the same options and master is 5.8%
slower on this CPU than when compiling it with GCC 11.

With both PGO and LTO, 525.x264_r is now only 2.8% slower than GCC 11.  In case
of 538.imagick_r the regression is 2.01% on this Zen3 CPU, but it is 7.49% on a
Zen4 machine :-/


* [Bug target/105275] [12/13/14 regression] 525.x264_r and 538.imagick_r regressed  on x86_64 at -O2 with PGO after r12-7319-g90d693bdc9d718
From: rguenth at gcc dot gnu.org @ 2024-01-25  8:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105275

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Since this was a costing change I wonder if we identified the code change
responsible and thus have a testcase?  I realize that for maximum assurance
one would need to have a debug counter for switching the patch on/off to
have it apply more selectively (possibly per SLP attempt rather than
per cost hook invocation which would be even more tricky to do).

Feeding another parameter to the hook via a new flag in the vinfo might
be possible (and set that from a dbg_cnt call) for example.
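
The idea of gating the cost change per SLP attempt can be sketched with a toy
model in C. This is not GCC code: every name below is hypothetical, the
counter only imitates the general behaviour of a debug counter (enabled for
the first N queries, as one would control with -fdbg-cnt), and the cost
formula is a placeholder rather than the one in i386.cc.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a debug counter: true for the first `limit` queries.  */
struct toy_dbg_counter { unsigned count; unsigned limit; };

bool toy_dbg_cnt(struct toy_dbg_counter *c)
{
    assert(c != 0);
    return c->count++ < c->limit;
}

/* Toy stand-in for the per-attempt vectorizer state carrying the flag.  */
struct toy_vinfo { bool charge_gpr_moves; };

/* Query the counter once per SLP attempt, not once per cost-hook call,
   so all costs within one attempt are computed consistently.  */
void toy_begin_slp_attempt(struct toy_vinfo *v, struct toy_dbg_counter *c)
{
    v->charge_gpr_moves = toy_dbg_cnt(c);
}

/* Toy cost hook: add the GPR-to-vector move cost only when the flag
   for the current attempt is set.  */
unsigned toy_ctor_cost(const struct toy_vinfo *v, unsigned base_cost,
                       unsigned n_gpr_components, unsigned gpr_to_vec_cost)
{
    if (!v->charge_gpr_moves)
        return base_cost;
    return base_cost + n_gpr_components * gpr_to_vec_cost;
}
```

With such a switch, bisecting on the counter limit would narrow the regression
down to the first SLP attempt whose costing change alters the generated code,
which could then be reduced to a testcase.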


* [Bug target/105275] [12/13/14 regression] 525.x264_r and 538.imagick_r regressed  on x86_64 at -O2 with PGO after r12-7319-g90d693bdc9d718
From: rguenth at gcc dot gnu.org @ 2024-01-31 14:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105275

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.4


* [Bug target/105275] [12/13/14 regression] 525.x264_r and 538.imagick_r regressed  on x86_64 at -O2 with PGO after r12-7319-g90d693bdc9d718
From: law at gcc dot gnu.org @ 2024-03-22 14:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105275

Jeffrey A. Law <law at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at gcc dot gnu.org
           Priority|P3                          |P2
