public inbox for gcc-bugs@sourceware.org
* [Bug c/115002] New: wide integer vector performance regression, x86, between gcc-14 and gcc-13 using target clones on skylake platform
@ 2024-05-09  8:54 colin.king at intel dot com
  2024-05-09  8:58 ` [Bug target/115002] [14/15 regression] " colin.king at intel dot com
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: colin.king at intel dot com @ 2024-05-09  8:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115002

            Bug ID: 115002
           Summary: wide integer vector performance regression, x86,
                    between gcc-14 and gcc-13 using target clones on
                    skylake platform
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: colin.king at intel dot com
  Target Milestone: ---

Created attachment 58138
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58138&action=edit
reproducer source code

I'm seeing a ~1.5% performance regression in gcc-14 compared to gcc-13, using
gcc on Ubuntu 24.04:

Versions:
gcc version 13.2.0 (Ubuntu 13.2.0-23ubuntu4) 
gcc version 14.0.1 20240412 (experimental) [master r14-9935-g67e1433a94f]
(Ubuntu 14-20240412-0ubuntu1) 

cking@skylake:~$ CFLAGS="" gcc-13 reproducer-vecwide.c -O2 -Wall
cking@skylake:~$ ./a.out 
7615.58 vint8w2048_t ops per sec, duration = 13.13 secs

cking@skylake:~$ CFLAGS="" gcc-14 reproducer-vecwide.c -O2 -Wall
cking@skylake:~$ ./a.out 
7489.42 vint8w2048_t ops per sec, duration = 13.35 secs

The issue originally appeared when regression testing the stress-ng vecwide
stressor [1]. I've extracted the attached reproducer from the original code.

Salient point to focus on:

1. The issue also depends on the TARGET_CLONES macro being defined as
__attribute__((target_clones("avx,default"))); the avx target clone appears to
be a factor in reproducing this problem.

Attached are the reproducer C source and disassembled object code. 

References: [1]
https://github.com/ColinIanKing/stress-ng/blob/master/stress-vecwide.c


* [Bug target/115002] [14/15 regression] wide integer vector performance regression, x86, between gcc-14 and gcc-13 using target clones on skylake platform
  2024-05-09  8:54 [Bug c/115002] New: wide integer vector performance regression, x86, between gcc-14 and gcc-13 using target clones on skylake platform colin.king at intel dot com
@ 2024-05-09  8:58 ` colin.king at intel dot com
  2024-05-09  8:58 ` colin.king at intel dot com
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: colin.king at intel dot com @ 2024-05-09  8:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115002

--- Comment #1 from Colin Ian King <colin.king at intel dot com> ---
Created attachment 58139
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58139&action=edit
gcc-13 disassembly


* [Bug target/115002] [14/15 regression] wide integer vector performance regression, x86, between gcc-14 and gcc-13 using target clones on skylake platform
  2024-05-09  8:54 [Bug c/115002] New: wide integer vector performance regression, x86, between gcc-14 and gcc-13 using target clones on skylake platform colin.king at intel dot com
  2024-05-09  8:58 ` [Bug target/115002] [14/15 regression] " colin.king at intel dot com
@ 2024-05-09  8:58 ` colin.king at intel dot com
  2024-05-09  9:11 ` colin.king at intel dot com
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: colin.king at intel dot com @ 2024-05-09  8:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115002

--- Comment #2 from Colin Ian King <colin.king at intel dot com> ---
Created attachment 58140
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58140&action=edit
gcc-14 disassembly


* [Bug target/115002] [14/15 regression] wide integer vector performance regression, x86, between gcc-14 and gcc-13 using target clones on skylake platform
  2024-05-09  8:54 [Bug c/115002] New: wide integer vector performance regression, x86, between gcc-14 and gcc-13 using target clones on skylake platform colin.king at intel dot com
  2024-05-09  8:58 ` [Bug target/115002] [14/15 regression] " colin.king at intel dot com
  2024-05-09  8:58 ` colin.king at intel dot com
@ 2024-05-09  9:11 ` colin.king at intel dot com
  2024-05-09  9:12 ` colin.king at intel dot com
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: colin.king at intel dot com @ 2024-05-09  9:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115002

--- Comment #3 from Colin Ian King <colin.king at intel dot com> ---
Created attachment 58141
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58141&action=edit
perf output of stress_vecwide_2048 for gcc-13 compiled code


* [Bug target/115002] [14/15 regression] wide integer vector performance regression, x86, between gcc-14 and gcc-13 using target clones on skylake platform
  2024-05-09  8:54 [Bug c/115002] New: wide integer vector performance regression, x86, between gcc-14 and gcc-13 using target clones on skylake platform colin.king at intel dot com
                   ` (2 preceding siblings ...)
  2024-05-09  9:11 ` colin.king at intel dot com
@ 2024-05-09  9:12 ` colin.king at intel dot com
  2024-05-10 11:16 ` rguenth at gcc dot gnu.org
  2024-05-14  8:59 ` haochen.jiang at intel dot com
  5 siblings, 0 replies; 7+ messages in thread
From: colin.king at intel dot com @ 2024-05-09  9:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115002

--- Comment #4 from Colin Ian King <colin.king at intel dot com> ---
Created attachment 58142
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58142&action=edit
perf output of stress_vecwide_2048 for gcc-14 compiled code


* [Bug target/115002] [14/15 regression] wide integer vector performance regression, x86, between gcc-14 and gcc-13 using target clones on skylake platform
  2024-05-09  8:54 [Bug c/115002] New: wide integer vector performance regression, x86, between gcc-14 and gcc-13 using target clones on skylake platform colin.king at intel dot com
                   ` (3 preceding siblings ...)
  2024-05-09  9:12 ` colin.king at intel dot com
@ 2024-05-10 11:16 ` rguenth at gcc dot gnu.org
  2024-05-14  8:59 ` haochen.jiang at intel dot com
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-05-10 11:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115002

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Target Milestone|---                         |14.2


* [Bug target/115002] [14/15 regression] wide integer vector performance regression, x86, between gcc-14 and gcc-13 using target clones on skylake platform
  2024-05-09  8:54 [Bug c/115002] New: wide integer vector performance regression, x86, between gcc-14 and gcc-13 using target clones on skylake platform colin.king at intel dot com
                   ` (4 preceding siblings ...)
  2024-05-10 11:16 ` rguenth at gcc dot gnu.org
@ 2024-05-14  8:59 ` haochen.jiang at intel dot com
  5 siblings, 0 replies; 7+ messages in thread
From: haochen.jiang at intel dot com @ 2024-05-14  8:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115002

--- Comment #5 from Haochen Jiang <haochen.jiang at intel dot com> ---
This seems to be caused mainly by a code size increase in GCC 14, since the
increase in instructions retired is similar in ratio to the regression.

Also, as with PR114987, I tried GCC 11, and it seems to perform better than
GCC 13.
