public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/113235] New: SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4
@ 2024-01-04 16:53 aros at gmx dot com
  2024-01-04 17:05 ` [Bug target/113235] " aros at gmx dot com
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: aros at gmx dot com @ 2024-01-04 16:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

            Bug ID: 113235
           Summary: SMHasher SHA3-256 benchmark is almost 40% slower vs.
                    Clang on AMD Zen 4
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: aros at gmx dot com
  Target Milestone: ---

According to the Phoronix Test Suite, SMHasher SHA3-256 is almost 40% slower
when built with GCC 13.2 or a GCC git snapshot than when built with Clang:

https://www.phoronix.com/review/gcc-clang-eoy2023/3

FarmHash32 x86_64 AVX is also a lot slower.


* [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4
  2024-01-04 16:53 [Bug rtl-optimization/113235] New: SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4 aros at gmx dot com
@ 2024-01-04 17:05 ` aros at gmx dot com
  2024-01-04 17:09 ` xry111 at gcc dot gnu.org
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: aros at gmx dot com @ 2024-01-04 17:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

--- Comment #1 from Artem S. Tashkinov <aros at gmx dot com> ---
Also valid for MTL (Intel Meteor Lake):
https://www.phoronix.com/review/intel-meteorlake-gcc-clang/2


* [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4
  2024-01-04 16:53 [Bug rtl-optimization/113235] New: SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4 aros at gmx dot com
  2024-01-04 17:05 ` [Bug target/113235] " aros at gmx dot com
@ 2024-01-04 17:09 ` xry111 at gcc dot gnu.org
  2024-01-04 17:27 ` [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang xry111 at gcc dot gnu.org
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: xry111 at gcc dot gnu.org @ 2024-01-04 17:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

--- Comment #2 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
The test file can be downloaded from
http://phoronix-test-suite.com/benchmark-files/smhasher-20220822.tar.xz.  Just
build it with cmake and run "./SMHasher --test=Speed sha3-256".  The build
system enables -O3 and LTO by default.

With GCC 13 I get about 180 MiB/s, but Clang 17 produces 250 MiB/s.

Part of the difference is caused by the different -fsemantic-interposition
default: if I pass -fno-semantic-interposition, GCC 13 produces about 200 MiB/s.
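
For readers unfamiliar with the flag, here is a minimal illustration. It is not
taken from SMHasher, and the function names are hypothetical. When code like
this is built as position-independent code for a shared library, GCC's default
-fsemantic-interposition means the exported scale() could be replaced at load
time (for example via LD_PRELOAD), so GCC is conservative about inlining it
into scaled_sum(). With -fno-semantic-interposition it may assume the local
definition is the one that gets called.

```c
/* Hypothetical example; compare the generated code for, e.g.:
 *   gcc -O3 -fPIC -shared interpose.c
 *   gcc -O3 -fPIC -shared -fno-semantic-interposition interpose.c
 */

/* Exported with default visibility, so another module may interpose it. */
int scale(int x)
{
    return x * 3;
}

/* Calls scale() twice; inlining these calls is only safe if the compiler
 * may assume that the definition above is the one used at run time. */
int scaled_sum(int a, int b)
{
    return scale(a) + scale(b);
}
```

Declaring such helpers static or giving them hidden visibility sidesteps the
question entirely, since they can then no longer be interposed.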


* [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang
  2024-01-04 16:53 [Bug rtl-optimization/113235] New: SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4 aros at gmx dot com
  2024-01-04 17:05 ` [Bug target/113235] " aros at gmx dot com
  2024-01-04 17:09 ` xry111 at gcc dot gnu.org
@ 2024-01-04 17:27 ` xry111 at gcc dot gnu.org
  2024-01-05 19:54 ` hubicka at gcc dot gnu.org
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: xry111 at gcc dot gnu.org @ 2024-01-04 17:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

Xi Ruoyao <xry111 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
                 CC|                            |xry111 at gcc dot gnu.org
   Last reconfirmed|                            |2024-01-04
            Summary|SMHasher SHA3-256 benchmark |SMHasher SHA3-256 benchmark
                   |is almost 40% slower vs.    |is almost 40% slower vs.
                   |Clang on AMD Zen 4          |Clang
             Status|UNCONFIRMED                 |NEW

--- Comment #3 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
GCC trunk still gets around 200 MiB/s with -fno-semantic-interposition (on a
Tiger Lake, but I have not used -march).

Confirmed. I'm removing "on xxx" from the subject, as the microarchitecture
seems irrelevant.


* [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang
  2024-01-04 16:53 [Bug rtl-optimization/113235] New: SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4 aros at gmx dot com
                   ` (2 preceding siblings ...)
  2024-01-04 17:27 ` [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang xry111 at gcc dot gnu.org
@ 2024-01-05 19:54 ` hubicka at gcc dot gnu.org
  2024-01-05 20:26 ` [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang (not enough complete loop peeling) hubicka at gcc dot gnu.org
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: hubicka at gcc dot gnu.org @ 2024-01-05 19:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org

--- Comment #4 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
I keep mentioning to Larabel that he should use -fno-semantic-interposition,
but he doesn't.

The profile is very simple:

 96.75%  SMHasher                                        [.] keccakf.lto_priv.0

All of the time goes to a simple loop. On Zen 3, GCC 13 with -march=native
-Ofast -flto gives:

  3.85 │330:   mov    %r8,%rdi                                                  
  7.68 │       movslq (%rsi,%r9,1),%rcx                                         
  3.85 │       lea    (%rax,%rcx,8),%r10                                        
  3.86 │       mov    (%rdx,%r9,1),%ecx                                         
  3.83 │       add    $0x4,%r9                                                  
  3.86 │       mov    (%r10),%r8                                                
  7.37 │       rol    %cl,%rdi                                                  
  7.37 │       mov    %rdi,(%r10)                                               
  4.76 │       cmp    $0x60,%r9                                                 
  0.00 │     ↑ jne    330                                                       


Clang seems to unroll it:

  0.25 │ d0:   mov  -0x48(%rsp),%rdx
  0.25 │       xor  %r12,%rcx
  0.25 │       mov  %r13,%r12
  0.25 │       mov  %r13,0x10(%rsp)
  0.25 │       mov  %rax,%r13
  0.26 │       xor  %r15,%r13
  0.23 │       mov  %r11,-0x70(%rsp)
  0.25 │       mov  %r8,0x8(%rsp)
  0.25 │       mov  %r15,-0x40(%rsp)
  0.25 │       mov  %r10,%r15
  0.26 │       mov  %r10,(%rsp)
  0.26 │       mov  %r14,%r10
  0.25 │       xor  %r12,%r10
  0.26 │       xor  %rsi,%r15
  0.24 │       mov  %rbp,-0x80(%rsp)
  0.25 │       xor  %rcx,%r15
  0.26 │       mov  -0x60(%rsp),%rcx
  0.25 │       xor  -0x68(%rsp),%r15
  0.26 │       xor  %rbp,%rdx
  0.25 │       mov  -0x30(%rsp),%rbp
  0.25 │       xor  %rdx,%r13
  0.24 │       mov  -0x10(%rsp),%rdx
  0.25 │       mov  %rcx,%r12
  0.24 │       xor  %rcx,%r13
  0.25 │       mov  $0x1,%ecx
  0.25 │       xor  %r11,%rdx
  0.24 │       mov  %r8,%r11
  0.25 │       mov  -0x28(%rsp),%r8
  0.26 │       xor  -0x58(%rsp),%r8
  0.24 │       xor  %rdx,%r8
  0.26 │       mov  -0x8(%rsp),%rdx
  0.25 │       xor  %rbp,%r8
  0.26 │       xor  %r11,%rdx
  0.25 │       mov  -0x20(%rsp),%r11
  0.25 │       xor  %rdx,%r10
....


* [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang (not enough complete loop peeling)
  2024-01-04 16:53 [Bug rtl-optimization/113235] New: SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4 aros at gmx dot com
                   ` (3 preceding siblings ...)
  2024-01-05 19:54 ` hubicka at gcc dot gnu.org
@ 2024-01-05 20:26 ` hubicka at gcc dot gnu.org
  2024-01-05 21:03 ` hubicka at gcc dot gnu.org
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: hubicka at gcc dot gnu.org @ 2024-01-05 20:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|SMHasher SHA3-256 benchmark |SMHasher SHA3-256 benchmark
                   |is almost 40% slower vs.    |is almost 40% slower vs.
                   |Clang                       |Clang (not enough complete
                   |                            |loop peeling)

--- Comment #5 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
On my Zen 3 machine:
  default build: 180 MB/s
  -O3 -flto -funroll-all-loops: 193 MB/s
  -O3 -flto --param max-completely-peel-times=30: 382 MB/s
The speedup is gone with --param max-completely-peel-times=20; the default is
16.
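
These thresholds line up with the trip count of the hot loop: in the profile
from comment #4 the loop advances its offset by 4 and compares it against 0x60
(96 bytes = 24 four-byte table entries), so it runs 24 iterations, which is
above 16 and 20 but below 30. As a source-level sketch only, GCC's
"#pragma GCC unroll" can request unrolling for a single loop instead of raising
the --param globally. The function below is a hypothetical stand-in, not
SMHasher code, and whether the pragma actually leads to complete peeling of
such a loop in a given GCC release has not been checked in this thread.

```c
#include <stdint.h>

#define TABLE_LEN 24

/* Same rotation amounts as keccakf_rotc in the SHA3 code; all are in 1..62. */
static const unsigned rot_amounts[TABLE_LEN] = {
    1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 2, 14,
    27, 41, 56, 8, 25, 43, 62, 18, 39, 61, 20, 44
};

/* Hypothetical helper: XORs together 24 rotations of x.  The pragma asks
 * GCC to unroll this one loop by a factor of 24. */
uint64_t rotate_chain(uint64_t x)
{
    uint64_t acc = 0;
#pragma GCC unroll 24
    for (int i = 0; i < TABLE_LEN; i++) {
        unsigned r = rot_amounts[i];
        acc ^= (x << r) | (x >> (64 - r));
    }
    return acc;
}
```

Compilers that do not know the pragma should ignore it (possibly with a
warning), so the code stays portable either way.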


* [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang (not enough complete loop peeling)
  2024-01-04 16:53 [Bug rtl-optimization/113235] New: SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4 aros at gmx dot com
                   ` (4 preceding siblings ...)
  2024-01-05 20:26 ` [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang (not enough complete loop peeling) hubicka at gcc dot gnu.org
@ 2024-01-05 21:03 ` hubicka at gcc dot gnu.org
  2024-01-08 14:54 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: hubicka at gcc dot gnu.org @ 2024-01-05 21:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

--- Comment #6 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
The internal loops are:

static const unsigned keccakf_rotc[24] = {
   1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 2, 14, 27, 41, 56, 8, 25, 43, 62, 18,
   39, 61, 20, 44
};

static const unsigned keccakf_piln[24] = {
   10, 7, 11, 17, 18, 3, 5, 16, 8, 21, 24, 4, 15, 23, 19, 13, 12, 2, 20, 14,
   22, 9, 6, 1
};

static void keccakf(ulong64 s[25])
{  
   int i, j, round;
   ulong64 t, bc[5];

   for(round = 0; round < SHA3_KECCAK_ROUNDS; round++) {
      /* Theta */
      for(i = 0; i < 5; i++)
         bc[i] = s[i] ^ s[i + 5] ^ s[i + 10] ^ s[i + 15] ^ s[i + 20];

      for(i = 0; i < 5; i++) { 
         t = bc[(i + 4) % 5] ^ ROL64(bc[(i + 1) % 5], 1);
         for(j = 0; j < 25; j += 5)
            s[j + i] ^= t;
      }
      /* Rho Pi */
      t = s[1];
      for(i = 0; i < 24; i++) {
         j = keccakf_piln[i];
         bc[0] = s[j];
         s[j] = ROL64(t, keccakf_rotc[i]);
         t = bc[0];
      }
      /* Chi */
      for(j = 0; j < 25; j += 5) {
         for(i = 0; i < 5; i++)
            bc[i] = s[j + i];
         for(i = 0; i < 5; i++)
            s[j + i] ^= (~bc[(i + 1) % 5]) & bc[(i + 2) % 5];
      }
      s[0] ^= keccakf_rndc[round];
   }
}

I suppose with complete unrolling this will propagate, partly stay in registers
and fold. I think increasing the default limits, especially at -O3, may make
sense. The value of 16 has been there for a very long time (I think since the
initial implementation).
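
To make "propagate and fold" concrete, here is a self-contained sketch of the
Rho Pi step in both forms. rho_pi_loop() mirrors the loop quoted above;
rho_pi_unrolled() is a hand-written version of what complete peeling could
produce once the table reads are replaced by their constant values. The names
rol64, RHO_PI_STEP, rho_pi_loop, rho_pi_unrolled and rho_pi_forms_agree are
mine, not from SMHasher, and uint64_t stands in for ulong64.

```c
#include <stdint.h>

static const unsigned keccakf_rotc[24] = {
    1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 2, 14,
    27, 41, 56, 8, 25, 43, 62, 18, 39, 61, 20, 44
};
static const unsigned keccakf_piln[24] = {
    10, 7, 11, 17, 18, 3, 5, 16, 8, 21, 24, 4,
    15, 23, 19, 13, 12, 2, 20, 14, 22, 9, 6, 1
};

/* Rotate left; r must be in 1..63, which holds for every table entry. */
static uint64_t rol64(uint64_t x, unsigned r)
{
    return (x << r) | (x >> (64 - r));
}

/* Loop form: 24 iterations, each indexing both constant tables. */
void rho_pi_loop(uint64_t s[25])
{
    uint64_t t = s[1];
    for (int i = 0; i < 24; i++) {
        unsigned j = keccakf_piln[i];
        uint64_t next = s[j];
        s[j] = rol64(t, keccakf_rotc[i]);
        t = next;
    }
}

/* One peeled iteration with the lane index and rotation as literals. */
#define RHO_PI_STEP(j, r) \
    do { uint64_t next = s[j]; s[j] = rol64(t, (r)); t = next; } while (0)

/* Fully unrolled form: no table loads and no loop counter remain. */
void rho_pi_unrolled(uint64_t s[25])
{
    uint64_t t = s[1];
    RHO_PI_STEP(10, 1);  RHO_PI_STEP(7, 3);   RHO_PI_STEP(11, 6);
    RHO_PI_STEP(17, 10); RHO_PI_STEP(18, 15); RHO_PI_STEP(3, 21);
    RHO_PI_STEP(5, 28);  RHO_PI_STEP(16, 36); RHO_PI_STEP(8, 45);
    RHO_PI_STEP(21, 55); RHO_PI_STEP(24, 2);  RHO_PI_STEP(4, 14);
    RHO_PI_STEP(15, 27); RHO_PI_STEP(23, 41); RHO_PI_STEP(19, 56);
    RHO_PI_STEP(13, 8);  RHO_PI_STEP(12, 25); RHO_PI_STEP(2, 43);
    RHO_PI_STEP(20, 62); RHO_PI_STEP(14, 18); RHO_PI_STEP(22, 39);
    RHO_PI_STEP(9, 61);  RHO_PI_STEP(6, 20);  RHO_PI_STEP(1, 44);
}

/* Returns 1 if both forms produce the same state for a sample input. */
int rho_pi_forms_agree(void)
{
    uint64_t a[25], b[25];
    for (int k = 0; k < 25; k++)
        a[k] = b[k] = 0x9E3779B97F4A7C15ULL * (uint64_t)(k + 1);
    rho_pi_loop(a);
    rho_pi_unrolled(b);
    for (int k = 0; k < 25; k++)
        if (a[k] != b[k])
            return 0;
    return 1;
}
```

In the unrolled form every rotate count is an immediate and every lane is a
fixed offset, which is what lets the compiler keep more of the state in
registers across the other steps of the round.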


* [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang (not enough complete loop peeling)
  2024-01-04 16:53 [Bug rtl-optimization/113235] New: SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4 aros at gmx dot com
                   ` (5 preceding siblings ...)
  2024-01-05 21:03 ` hubicka at gcc dot gnu.org
@ 2024-01-08 14:54 ` rguenth at gcc dot gnu.org
  2024-01-08 14:55 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-01-08 14:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
IMO it should be purely growth/unrolled-insns bound; the bound on the actual
unrolled iterations is somewhat artificial (to avoid really large unrolls
when we estimate the unrolled body to be zero, thus never hitting any of the
other limits).  That said, I think we should get better at estimating growth:
I don't think we get that the reads from the constant arrays get elided?
(Though that's not always an optimal thing.)

See the proposal on better estimation I had last year.


* [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang (not enough complete loop peeling)
  2024-01-04 16:53 [Bug rtl-optimization/113235] New: SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4 aros at gmx dot com
                   ` (6 preceding siblings ...)
  2024-01-08 14:54 ` rguenth at gcc dot gnu.org
@ 2024-01-08 14:55 ` rguenth at gcc dot gnu.org
  2024-04-24 16:02 ` hubicka at gcc dot gnu.org
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-01-08 14:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 57006
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57006&action=edit
unroll heuristics

this one


* [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang (not enough complete loop peeling)
  2024-01-04 16:53 [Bug rtl-optimization/113235] New: SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4 aros at gmx dot com
                   ` (7 preceding siblings ...)
  2024-01-08 14:55 ` rguenth at gcc dot gnu.org
@ 2024-04-24 16:02 ` hubicka at gcc dot gnu.org
  2024-04-24 16:41 ` dmalcolm at gcc dot gnu.org
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: hubicka at gcc dot gnu.org @ 2024-04-24 16:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

--- Comment #9 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Phoronix still reports the difference:
https://www.phoronix.com/review/gcc14-clang18-amd-zen4/2


* [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang (not enough complete loop peeling)
  2024-01-04 16:53 [Bug rtl-optimization/113235] New: SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4 aros at gmx dot com
                   ` (8 preceding siblings ...)
  2024-04-24 16:02 ` hubicka at gcc dot gnu.org
@ 2024-04-24 16:41 ` dmalcolm at gcc dot gnu.org
  2024-04-24 16:44 ` pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: dmalcolm at gcc dot gnu.org @ 2024-04-24 16:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

David Malcolm <dmalcolm at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmalcolm at gcc dot gnu.org

--- Comment #10 from David Malcolm <dmalcolm at gcc dot gnu.org> ---
(In reply to Jan Hubicka from comment #4)
> I keep mentioning to Larabel that he should use -fno-semantic-interposition,
> but he doesn't.

Possibly a silly question, but how about changing the default in GCC 15?  What
proportion of users actually make use of -fsemantic-interposition ?


* [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang (not enough complete loop peeling)
  2024-01-04 16:53 [Bug rtl-optimization/113235] New: SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4 aros at gmx dot com
                   ` (9 preceding siblings ...)
  2024-04-24 16:41 ` dmalcolm at gcc dot gnu.org
@ 2024-04-24 16:44 ` pinskia at gcc dot gnu.org
  2024-04-24 16:47 ` pinskia at gcc dot gnu.org
  2024-04-24 16:51 ` xry111 at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-04-24 16:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

--- Comment #11 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to David Malcolm from comment #10)
> (In reply to Jan Hubicka from comment #4)
> > I keep mentioning to Larabel that he should use -fno-semantic-interposition,
> > but he doesn't.
> 
> Possibly a silly question, but how about changing the default in GCC 15? 
> What proportion of users actually make use of -fsemantic-interposition ?

See https://inbox.sourceware.org/gcc-patches/ri6czn5z8mw.fsf@suse.cz/ for
previous discussion on this.


* [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang (not enough complete loop peeling)
  2024-01-04 16:53 [Bug rtl-optimization/113235] New: SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4 aros at gmx dot com
                   ` (10 preceding siblings ...)
  2024-04-24 16:44 ` pinskia at gcc dot gnu.org
@ 2024-04-24 16:47 ` pinskia at gcc dot gnu.org
  2024-04-24 16:51 ` xry111 at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-04-24 16:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

--- Comment #12 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #11)
> (In reply to David Malcolm from comment #10)
> > (In reply to Jan Hubicka from comment #4)
> > > I keep mentioning to Larabel that he should use -fno-semantic-interposition,
> > > but he doesn't.
> > 
> > Possibly a silly question, but how about changing the default in GCC 15? 
> > What proportion of users actually make use of -fsemantic-interposition ?
> 
> See https://inbox.sourceware.org/gcc-patches/ri6czn5z8mw.fsf@suse.cz/ for
> previous discussion on this.

Sorry
https://inbox.sourceware.org/gcc-patches/20210606231215.49899-1-maskray@google.com/


* [Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang (not enough complete loop peeling)
  2024-01-04 16:53 [Bug rtl-optimization/113235] New: SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang on AMD Zen 4 aros at gmx dot com
                   ` (11 preceding siblings ...)
  2024-04-24 16:47 ` pinskia at gcc dot gnu.org
@ 2024-04-24 16:51 ` xry111 at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: xry111 at gcc dot gnu.org @ 2024-04-24 16:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235

--- Comment #13 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
(In reply to David Malcolm from comment #10)
> (In reply to Jan Hubicka from comment #4)
> > I keep mentioning to Larabel that he should use -fno-semantic-interposition,
> > but he doesn't.
> 
> Possibly a silly question, but how about changing the default in GCC 15? 
> What proportion of users actually make use of -fsemantic-interposition ?

At least when building Glibc with -fno-semantic-interposition, several tests
fail.  I have not figured out whether they are test-suite issues or real
issues, though.

