public inbox for gcc-bugs@sourceware.org
* [Bug c/114987] New: floating point vector regression, x86, between gcc 14 and gcc-13 using -O3 and target clones on skylake platforms
@ 2024-05-08 14:44 colin.king at intel dot com
  2024-05-08 14:45 ` [Bug c/114987] " colin.king at intel dot com
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: colin.king at intel dot com @ 2024-05-08 14:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114987

            Bug ID: 114987
           Summary: floating point vector regression, x86, between gcc 14
                    and gcc-13 using -O3 and target clones on skylake
                    platforms
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: colin.king at intel dot com
  Target Milestone: ---

Created attachment 58126
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58126&action=edit
reproducer.c source code

I'm seeing a ~10% performance regression in gcc-14 compared to gcc-13, using
gcc on Ubuntu 24.04:

Versions:
gcc version 13.2.0 (Ubuntu 13.2.0-23ubuntu4) 
gcc version 14.0.1 20240412 (experimental) [master r14-9935-g67e1433a94f]
(Ubuntu 14-20240412-0ubuntu1) 

cking@skylake:~$ CFLAGS="" gcc-13 reproducer.c; ./a.out
4.92 secs duration, 2130.379 Mfp-ops/sec
cking@skylake:~$ CFLAGS="" gcc-14 reproducer.c; ./a.out  
5.46 secs duration, 1921.799 Mfp-ops/sec
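
For reference, the size of the regression can be worked out from the two throughput figures above. The following is only a small arithmetic sketch; the helper name percent_drop is hypothetical and is not part of the reproducer:

```c
#include <stdio.h>

/* Hypothetical helper: relative throughput loss, in percent,
   going from 'before' to 'after' (both in Mfp-ops/sec). */
double percent_drop(double before, double after)
{
    return (before - after) / before * 100.0;
}

/* Print the drop for the two figures quoted in this report. */
void report_drop(void)
{
    const double gcc13 = 2130.379;  /* Mfp-ops/sec, gcc-13 run */
    const double gcc14 = 1921.799;  /* Mfp-ops/sec, gcc-14 run */

    printf("throughput drop: %.1f%%\n", percent_drop(gcc13, gcc14));
}
```

With the numbers from the two runs this comes to roughly 9.8%, which is where the ~10% figure comes from.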

The issue originally appeared when regression testing the stress-ng vecfp
stressor [1] using the floating point vector 16 add stressor method. I've
extracted the attached reproducer (reproducer.c) from the original code.

Salient points to focus on:

1. The issue is dependent on the OPTIMIZE3 macro in the reproducer being
defined as __attribute__((optimize("-O3")))
2. The issue is also dependent on the TARGET_CLONES macro being defined as
__attribute__((target_clones("mmx,avx,default"))); the avx target clone
seems to be what is needed to reproduce this problem.
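
For readers without the attachment, the two macros might be combined on a function roughly as follows. This is a minimal sketch under stated assumptions, not the attached reproducer.c: the function name vec_add_16, the VEC_ELEMENTS constant, the plain-array loop, and the preprocessor guard are all illustrative choices made here.

```c
#include <stddef.h>
#include <stdio.h>   /* also pulls in the libc feature macros (__GLIBC__) */

/* Assumption: only apply the attributes with GCC on x86-64 glibc systems,
   where target_clones has the ifunc support it relies on. Elsewhere the
   macros expand to nothing so the sketch still compiles. */
#if defined(__GNUC__) && !defined(__clang__) && \
    defined(__x86_64__) && defined(__GLIBC__)
#define OPTIMIZE3     __attribute__((optimize("-O3")))
#define TARGET_CLONES __attribute__((target_clones("mmx,avx,default")))
#else
#define OPTIMIZE3
#define TARGET_CLONES
#endif

#define VEC_ELEMENTS 16  /* hypothetical name: 16 floats per vector */

/* Hypothetical stand-in for the stressor kernel: repeatedly add one
   16-element float vector to another. With the attributes active, GCC
   emits mmx, avx and default clones and selects one at load time. */
OPTIMIZE3 TARGET_CLONES
void vec_add_16(float *r, const float *add, size_t loops)
{
    for (size_t n = 0; n < loops; n++) {
        for (size_t i = 0; i < VEC_ELEMENTS; i++)
            r[i] += add[i];
    }
}
```

Defining OPTIMIZE3 or TARGET_CLONES as empty is a quick way to check whether each attribute is needed to see the slowdown.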

Attached are the reproducer.c C source and the disassembled object code. The
code generated for stress_vecfp_float_add_16.avx by gcc-13 is significantly
different from the code generated by gcc-14.

References: [1]
https://github.com/ColinIanKing/stress-ng/blob/master/stress-vecfp.c



Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
2024-05-08 14:44 [Bug c/114987] New: floating point vector regression, x86, between gcc 14 and gcc-13 using -O3 and target clones on skylake platforms colin.king at intel dot com
2024-05-08 14:45 ` [Bug c/114987] " colin.king at intel dot com
2024-05-08 14:45 ` colin.king at intel dot com
2024-05-08 15:00 ` colin.king at intel dot com
2024-05-10  7:52 ` [Bug target/114987] [14/15 Regression] " rguenth at gcc dot gnu.org
2024-05-10  8:00 ` haochen.jiang at intel dot com
2024-05-10  8:05 ` liuhongt at gcc dot gnu.org
2024-05-10  8:42 ` haochen.jiang at intel dot com
