[Bug target/113600] New: 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4

public inbox for gcc-bugs@sourceware.org

* [Bug target/113600] New: 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4
@ 2024-01-25 14:29 jamborm at gcc dot gnu.org
  2024-01-25 16:18 ` [Bug target/113600] [14 regression] " pinskia at gcc dot gnu.org
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: jamborm at gcc dot gnu.org @ 2024-01-25 14:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

            Bug ID: 113600
           Summary: 525.x264_r run-time regresses by 8% with PGO -Ofast
                    -march=znver4
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jamborm at gcc dot gnu.org
                CC: liuhongt at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-linux-gnu
            Target: x86_64-linux-gnu

With profile-feedback, -Ofast and -march=native on an AMD Zen 4, there is a
recent 8% regression:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=979.377.0&plot.1=966.377.0&

With both PGO and LTO, the situation is similar (6%):
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=977.377.0&plot.1=958.377.0&

On a Zen 3 machine, there is a 2% bump around the same time:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=900.377.0&plot.1=473.377.0&

I have bisected the (non-LTO) Zen 4 case to commit r14-5603-g2b59e2b4dff421:

2b59e2b4dff42118fe3a505f07b9a6aa4cf53bdf is the first bad commit
commit 2b59e2b4dff42118fe3a505f07b9a6aa4cf53bdf
Author: liuhongt <hongtao.liu@intel.com>
Date:   Thu Nov 16 18:38:39 2023 +0800

    Support reduc_{plus,xor,and,ior}_scal_m for vector integer mode.

    BB vectorizer relies on the backend support of
    .REDUC_{PLUS,IOR,XOR,AND} to vectorize reduction.

    gcc/ChangeLog:

            PR target/112325
            * config/i386/sse.md (reduc_<code>_scal_<mode>): New expander.
            (REDUC_ANY_LOGIC_MODE): New iterator.
            (REDUC_PLUS_MODE): Extend to VxHI/SI/DImode.
            (REDUC_SSE_PLUS_MODE): Ditto.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr112325-1.c: New test.
            * gcc.target/i386/pr112325-2.c: New test.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)


* [Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4
From: pinskia at gcc dot gnu.org @ 2024-01-25 16:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0
           Keywords|                            |missed-optimization


* [Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4
From: liuhongt at gcc dot gnu.org @ 2024-01-26  1:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

--- Comment #1 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
I guess it's the same issue as PR112879?


* [Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4
From: liuhongt at gcc dot gnu.org @ 2024-01-26  2:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

--- Comment #2 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
A patch is posted at
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640276.html

Could you give it a try to see if it fixes the regression?  I don't currently
have a znver4 machine for testing.


* [Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4
From: rguenth at gcc dot gnu.org @ 2024-01-26  7:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
I'll note that two-lane reductions in particular (or two-lane BB vectorization
in general) are hardly profitable on modern x86 uarchs unless the vectorized
code is interleaved with other non-vectorized code that can execute at the same
time.  Vectorizing two lanes only makes them dependent on each other, whereas
modern uarchs have no difficulty executing the unvectorized lanes in parallel
(without the tied dependences).  BB vectorization is only going to be
profitable when there is sufficient benefit: more lanes, approaching the issue
width or the number of available ports for the ops, or the whole SLP consisting
mostly of loads/stores.  Note that the cost model only ever looks at the stmts
participating in the vectorization, not the "surrounding" code, and it would be
difficult to include that since the schedule on GIMPLE isn't even close to what
we get later.  The reduction op is also a serialization point on the scalar
side, of course; whether that means that BB reductions with two lanes are
possibly better candidates than grouped BB stores with two lanes is another
question.

The BB reduction op itself is costed properly.

So the 525.x264_r case might be loop vectorization, OTOH the epilogue
cost is hardly ever a knob that decides whether a vectorization is profitable.

I think we need to figure out what exactly gets slower (and hope it's not
scattered all over the place)
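
To make the two-lane argument above concrete, here is a minimal sketch.  It is
not taken from x264 or from the GCC testsuite: the function names and the use
of the GCC/Clang vector extension are assumptions made for illustration only.
Both functions compute the same value; the difference is that the scalar form
keeps the two lanes independent until the final add, while the vector form
ties them together in one register.

```c
#include <assert.h>
#include <stdint.h>

/* Two-lane vector type via the GCC/Clang vector extension (illustrative). */
typedef int32_t v2si __attribute__((vector_size(8)));

/* Scalar form: lane 0 and lane 1 are independent until the final add,
   so an out-of-order core is free to execute them in parallel. */
int32_t reduce2_scalar(const int32_t *a, const int32_t *b)
{
    int32_t l0 = a[0] * b[0] + 1;
    int32_t l1 = a[1] * b[1] + 1;
    return l0 + l1;
}

/* Hand-vectorized form: both lanes travel in one register, so every
   operation on the pair has to wait for both inputs. */
int32_t reduce2_vector(const int32_t *a, const int32_t *b)
{
    v2si va = { a[0], a[1] };
    v2si vb = { b[0], b[1] };
    v2si one = { 1, 1 };
    v2si r = va * vb + one;
    return r[0] + r[1];
}

/* Both forms agree on the result; only the dependence shape differs. */
void reduce2_check(void)
{
    const int32_t a[2] = { 3, 5 }, b[2] = { 7, 11 };
    assert(reduce2_scalar(a, b) == 78);
    assert(reduce2_vector(a, b) == 78);
}
```

Whether either form is faster on a given machine depends on the surrounding
code and on what the compiler emits, which is exactly the part the cost model
does not see.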


* [Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4
From: jamborm at gcc dot gnu.org @ 2024-01-26 18:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

--- Comment #4 from Martin Jambor <jamborm at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #2)
> A patch is posted at
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640276.html
> 
> Could you give it a try to see if it fixes the regression?  I don't currently
> have a znver4 machine for testing.

Unfortunately it does not.

(In reply to Richard Biener from comment #3)
> I think we need to figure out what exactly gets slower (and hope it's not
> scattered all over the place)

I have collected some profiles:

r14-5602-ge6269bb69c0734

# Samples: 516K of event 'cycles:u'
# Event count (approx.): 468008188417
# Overhead  Samples  Command          Shared Object                          Symbol
# ........  .......  ...............  .....................................  ............................
#
    13.55%    69886  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] mc_chroma
    11.05%    57017  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_satd_16x16
     9.24%    47693  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_satd_8x8
     8.67%    44733  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] get_ref
     4.84%    24984  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] sub16x16_dct
     4.16%    21484  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_me_search_ref
     3.30%    17033  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_hadamard_ac_16x16
     2.28%    11770  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_satd_4x4
     2.10%    10824  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] quant_trellis_cabac
     2.07%    10694  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] hpel_filter
     2.05%    10616  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] sub8x8_dct
     1.86%     9593  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] refine_subpel
     1.70%     8788  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] quant_4x4
     1.57%     8077  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_sad_16x16
     1.16%     6324  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] frame_init_lowres_core
     1.14%     5867  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_sa8d_8x8
     1.11%     5738  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_cabac_encode_decision_c
     1.08%     5736  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_var_16x16



r14-5603-g2b59e2b4dff421

# Samples: 550K of event 'cycles:u'
# Event count (approx.): 498834737657
# Overhead  Samples  Command          Shared Object                          Symbol
# ........  .......  ...............  .....................................  ............................
#
    18.21%   100151  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_satd_16x16
    12.37%    68006  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] mc_chroma
     8.51%    46815  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_satd_8x8
     7.56%    41560  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] get_ref
     4.53%    24901  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] sub16x16_dct
     3.92%    21561  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_me_search_ref
     3.08%    16963  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_hadamard_ac_16x16
     2.41%    13239  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_satd_4x4
     1.99%    10931  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] quant_trellis_cabac
     1.96%    10801  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] hpel_filter
     1.95%    10764  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] sub8x8_dct
     1.56%     8587  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] quant_4x4
     1.49%     8166  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] refine_subpel
     1.48%     8124  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_sad_16x16
     1.09%     6328  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] frame_init_lowres_core
     1.07%     5901  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_sa8d_8x8
     1.04%     5703  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_cabac_encode_decision_c
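
For reference, the deltas between the two profiles can be worked out from the
figures above.  The small helper below is only a sketch (percent_change and
the constant names are made up for this note); it uses the two event counts
and the x264_pixel_satd_16x16 sample counts exactly as reported.  The total
cycle count grows by roughly 6.6%, while the samples attributed to
x264_pixel_satd_16x16 grow by roughly 76% (57017 to 100151).

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent. */
double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}

/* Figures copied from the two perf reports above. */
static const long long events_good = 468008188417LL; /* r14-5602 */
static const long long events_bad  = 498834737657LL; /* r14-5603 */
static const long satd16_samples_good = 57017;
static const long satd16_samples_bad  = 100151;

/* Checks the approximate percentages quoted in the text. */
void profile_delta_check(void)
{
    double total = percent_change((double)events_good, (double)events_bad);
    double satd  = percent_change((double)satd16_samples_good,
                                  (double)satd16_samples_bad);
    assert(total > 6.5 && total < 6.7);   /* whole run: about +6.6% */
    assert(satd > 75.0 && satd < 76.0);   /* satd_16x16: about +76% */
}
```

Most other entries move by a few percent at most, so the slowdown looks
concentrated in x264_pixel_satd_16x16 rather than scattered.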


* [Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4
From: liuhongt at gcc dot gnu.org @ 2024-01-30  9:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

--- Comment #5 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
It looks like x264_pixel_satd_16x16 consumes more time after my commit.  An
extracted test case is below.  Note that the original x264_pixel_satd_8x4 has
no __attribute__((always_inline)); it is added here to force inlining (under
PGO it is hot and will be inlined).

typedef unsigned char uint8_t;
typedef unsigned uint32_t;
typedef unsigned short uint16_t;

static inline uint32_t abs2( uint32_t a )
{
    uint32_t s = ((a>>15)&0x10001)*0xffff;
    return (a+s)^s;
}

int
__attribute__((always_inline))
x264_pixel_satd_8x4( uint8_t *pix1, int i_pix1, uint8_t *pix2, int i_pix2 )
{
  uint32_t tmp[4][4];
  uint32_t a0, a1, a2, a3;
  int sum = 0;
  for( int i = 0; i < 4; i++, pix1 += i_pix1, pix2 += i_pix2 )
    {
      a0 = (pix1[0] - pix2[0]) + ((pix1[4] - pix2[4]) << 16);
      a1 = (pix1[1] - pix2[1]) + ((pix1[5] - pix2[5]) << 16);
      a2 = (pix1[2] - pix2[2]) + ((pix1[6] - pix2[6]) << 16);
      a3 = (pix1[3] - pix2[3]) + ((pix1[7] - pix2[7]) << 16);
      {
        int t0 = a0 + a1; int t1 = a0 - a1;
        int t2 = a2 + a3; int t3 = a2 - a3;
        tmp[i][0] = t0 + t2; tmp[i][2] = t0 - t2;
        tmp[i][1] = t1 + t3; tmp[i][3] = t1 - t3;
      }
    }
  for( int i = 0; i < 4; i++ )
    {
      {
        int t0 = tmp[0][i] + tmp[1][i]; int t1 = tmp[0][i] - tmp[1][i];
        int t2 = tmp[2][i] + tmp[3][i]; int t3 = tmp[2][i] - tmp[3][i];
        a0 = t0 + t2; a2 = t0 - t2;
        a1 = t1 + t3; a3 = t1 - t3;
      }
      sum += abs2(a0) + abs2(a1) + abs2(a2) + abs2(a3);
    }
  return (((uint16_t)sum) + ((uint32_t)sum>>16)) >> 1;
}

int x264_pixel_satd_16x16( uint8_t *pix1, int i_pix1,
                           uint8_t *pix2, int i_pix2 )
{
  int sum = x264_pixel_satd_8x4( pix1, i_pix1, pix2, i_pix2 )
    + x264_pixel_satd_8x4( pix1+4*i_pix1, i_pix1, pix2+4*i_pix2, i_pix2 );
  sum+= x264_pixel_satd_8x4( pix1+8, i_pix1, pix2+8, i_pix2 )
    + x264_pixel_satd_8x4( pix1+8+4*i_pix1, i_pix1, pix2+8+4*i_pix2, i_pix2 );
  sum+= x264_pixel_satd_8x4( pix1+8*i_pix1, i_pix1, pix2+8*i_pix2, i_pix2 )
    + x264_pixel_satd_8x4( pix1+12*i_pix1, i_pix1, pix2+12*i_pix2, i_pix2 );
  sum+= x264_pixel_satd_8x4( pix1+8+8*i_pix1, i_pix1, pix2+8+8*i_pix2, i_pix2 )
    + x264_pixel_satd_8x4( pix1+8+12*i_pix1, i_pix1, pix2+8+12*i_pix2, i_pix2 );
  return sum;
}
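
As a reading aid for the test case, the sketch below shows what abs2 does: it
takes the absolute value of two 16-bit lanes packed into one 32-bit word,
which is why each a0..a3 carries two pixel differences at once.  abs2 is
copied from the test case; pack2 and abs2_check are hypothetical helpers
added only for this illustration, and pack2 does the packing in unsigned
arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Same bit trick as in the test case: absolute value of two 16-bit
   lanes packed into one 32-bit word. */
static inline uint32_t abs2(uint32_t a)
{
    /* Bits 15 and 31 are the lane sign bits; s becomes 0xffff in each
       lane that is negative and 0 in each lane that is not. */
    uint32_t s = ((a >> 15) & 0x10001) * 0xffff;
    return (a + s) ^ s;
}

/* Hypothetical helper: pack two small signed values as lo + (hi << 16),
   relying on unsigned wrap-around for negative inputs. */
static inline uint32_t pack2(int lo, int hi)
{
    return (uint32_t)lo + ((uint32_t)hi << 16);
}

/* Each lane comes back as its absolute value, for every sign combination. */
void abs2_check(void)
{
    assert(abs2(pack2(4, 9))   == pack2(4, 9));
    assert(abs2(pack2(-3, 5))  == pack2(3, 5));
    assert(abs2(pack2(7, -2))  == pack2(7, 2));
    assert(abs2(pack2(-3, -2)) == pack2(3, 2));
}
```

The final return statement of x264_pixel_satd_8x4 then adds the low and high
16-bit halves of the accumulated sum.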


After the commit, SLP fails to split the store group of size 16
(vector(16) int) into smaller 4 + 12 groups and misses vectorization for the
case below.

  vect_t2_2445.784_8503 = VIEW_CONVERT_EXPR<vector(4) int>(_8502);
  vect__2457.786_8505 = vect_t0_2441.783_8501 - vect_t2_2445.784_8503;
  vect__2448.785_8504 = vect_t0_2441.783_8501 + vect_t2_2445.784_8503;
  _8506 = VEC_PERM_EXPR <vect__2448.785_8504, vect__2457.786_8505, { 0, 1, 6, 7 }>;
  vect__2449.787_8507 = VIEW_CONVERT_EXPR<vector(4) unsigned int>(_8506);
  t3_2447 = (int) _2446;
  _2448 = t0_2441 + t2_2445;
  _2449 = (unsigned int) _2448;
  _2451 = t0_2441 - t2_2445;
  _2452 = (unsigned int) _2451;
  _2454 = t1_2443 + t3_2447;
  _2455 = (unsigned int) _2454;
  _2457 = t1_2443 - t3_2447;
  _2458 = (unsigned int) _2457;
  MEM <vector(4) unsigned int> [(unsigned int *)&tmp + 16B] = vect__2449.787_8507;


The vector store is optimized away together with the later vector load, so in
the bad case there is a store-to-load forwarding (STLF) issue.
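
The access shape behind a store-to-load forwarding problem can be sketched as
follows.  This is an illustration under assumptions, not the code GCC
generates for x264: the function names are made up, the vector type uses the
GCC/Clang vector extension, and an optimizing compiler may well simplify both
functions.  The first function writes a buffer with four narrow stores and
reads it back with one wide load, a pattern that store forwarding typically
cannot serve from a single pending store; the second writes and reads with
matching widths.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Four-lane vector type via the GCC/Clang vector extension (illustrative). */
typedef uint32_t v4su __attribute__((vector_size(16)));

/* Four 4-byte stores followed by one 16-byte load of the same bytes. */
uint32_t narrow_stores_wide_load(uint32_t a, uint32_t b,
                                 uint32_t c, uint32_t d)
{
    uint32_t tmp[4];
    tmp[0] = a; tmp[1] = b; tmp[2] = c; tmp[3] = d;
    v4su v;
    memcpy(&v, tmp, sizeof v);      /* wide load from tmp */
    v += v;
    return v[0] + v[1] + v[2] + v[3];
}

/* One 16-byte store followed by one 16-byte load: matching widths. */
uint32_t wide_store_wide_load(uint32_t a, uint32_t b,
                              uint32_t c, uint32_t d)
{
    v4su in = { a, b, c, d };
    uint32_t tmp[4];
    memcpy(tmp, &in, sizeof in);    /* wide store to tmp */
    v4su v;
    memcpy(&v, tmp, sizeof v);      /* wide load from tmp */
    v += v;
    return v[0] + v[1] + v[2] + v[3];
}

/* The two access shapes compute the same value. */
void stlf_shape_check(void)
{
    assert(narrow_stores_wide_load(1, 2, 3, 4) == 20);
    assert(wide_store_wide_load(1, 2, 3, 4) == 20);
}
```

Whether a stall actually occurs depends on the microarchitecture and on the
instructions the compiler ends up emitting.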


* [Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4
From: liuhongt at gcc dot gnu.org @ 2024-01-30  9:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

--- Comment #6 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
I guess the explicit .REDUC_PLUS, in place of the original VEC_PERM_EXPR,
somehow impacts the store split decision.
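
For readers less familiar with the two forms mentioned above, the sketch
below emulates them at the C level.  It is not GCC's internal representation:
the function names are invented, and the permutes are written out by hand
with the GCC/Clang vector extension.  reduce4_direct corresponds to a single
"sum all lanes" operation, which is what .REDUC_PLUS expresses;
reduce4_permute open-codes the same sum as shuffle-and-add steps, in the
spirit of a VEC_PERM_EXPR sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Four-lane vector type via the GCC/Clang vector extension (illustrative). */
typedef int32_t v4si __attribute__((vector_size(16)));

/* Direct horizontal sum: one conceptual "reduce plus" over all lanes. */
int32_t reduce4_direct(const int32_t *p)
{
    return p[0] + p[1] + p[2] + p[3];
}

/* Open-coded reduction: two permute-and-add steps, result in lane 0. */
int32_t reduce4_permute(const int32_t *p)
{
    v4si v = { p[0], p[1], p[2], p[3] };
    v4si swap_halves = { v[2], v[3], v[0], v[1] };    /* permute 2,3,0,1 */
    v4si s1 = v + swap_halves;
    v4si swap_pairs = { s1[1], s1[0], s1[3], s1[2] }; /* permute 1,0,3,2 */
    v4si s2 = s1 + swap_pairs;
    return s2[0];
}

/* Both forms produce the same sum. */
void reduce4_check(void)
{
    const int32_t p[4] = { 1, 2, 3, 4 };
    assert(reduce4_direct(p) == 10);
    assert(reduce4_permute(p) == 10);
}
```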


* [Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4
From: pheeck at gcc dot gnu.org @ 2024-02-13 10:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

Filip Kastl <pheeck at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pheeck at gcc dot gnu.org

--- Comment #7 from Filip Kastl <pheeck at gcc dot gnu.org> ---
*** Bug 112879 has been marked as a duplicate of this bug. ***


* [Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4
From: law at gcc dot gnu.org @ 2024-03-07 20:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

Jeffrey A. Law <law at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
                 CC|                            |law at gcc dot gnu.org


* [Bug target/113600] [14/15 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4
From: rguenth at gcc dot gnu.org @ 2024-05-07  7:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|14.0                        |14.2

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 14.1 is being released, retargeting bugs to GCC 14.2.
