From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 2153)
	id 498D0385B509; Fri, 8 Mar 2024 08:30:18 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 498D0385B509
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1709886621;
	bh=pmVo53MnHoPrjQFu2g+/lcht1yDwHzgt9I2VO/TXMz4=;
	h=From:To:Subject:Date:From;
	b=HaN4izVIuLftid8hv39dDHb55njJla36Zxh29MLpt/bBhsCF36tSFv+c6I6Bi5iNn
	 XgJTRgxagHFNFmqoZeY70xgGsJbsmdNLD213OdvmMMCa9k+07fBkDz3H9onLxozG72
	 /wgvSJOQYdojHZru3/koAwY9F5eYQDpWMfN3FTAk=
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
From: Jakub Jelinek
To: gcc-cvs@gcc.gnu.org
Subject: [gcc r14-9387] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-register
X-Act-Checkin: gcc
X-Git-Author: Jakub Jelinek
X-Git-Refname: refs/heads/master
X-Git-Oldrev: eed4e541711ab4ae7783f75dd132e2acca71fdb9
X-Git-Newrev: a307a26e8b392ba65edfdae15489556b7701db81
Message-Id: <20240308083021.498D0385B509@sourceware.org>
Date: Fri, 8 Mar 2024 08:30:18 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:a307a26e8b392ba65edfdae15489556b7701db81

commit r14-9387-ga307a26e8b392ba65edfdae15489556b7701db81
Author: Jakub Jelinek
Date: Fri Mar 8 09:18:19 2024 +0100

    i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]

    The following patch hides the noreturn no_callee_saved_registers (except bp)
    optimization with a not enabled by default option.
    The reason is that most noreturn functions should be called just once
    in a program (unless they are recursive or invoke longjmp or similar,
    for exceptions we already punt), so it isn't that essential to save a few
    instructions in their prologue, but more importantly because it interferes
    with debugging.
    And unlike most other optimizations, doesn't actually make it harder to
    debug the given function, which can be solved by recompiling the given
    function if it is too hard to debug, but makes it harder to debug the
    callers of that noreturn function. Those can be from a different
    translation unit, different binary or even different package, so if e.g.
    glibc abort needs to use all of the callee saved registers
    (%rbx, %rbp, %r12, %r13, %r14, %r15), debugging any programs which abort
    will be harder because any DWARF expressions which use those registers
    will be optimized out, not just in the immediate caller, but in other
    callers as well until some frame restores a particular register from some
    stack slot.

    2024-03-08 Jakub Jelinek

            PR target/38534
            * config/i386/i386.opt (mnoreturn-no-callee-saved-registers): New
            option.
            * config/i386/i386-options.cc (ix86_set_func_type): Don't use
            TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP unless
            ix86_noreturn_no_callee_saved_registers is enabled.
            * doc/invoke.texi (-mnoreturn-no-callee-saved-registers): Document.

            * gcc.target/i386/pr38534-1.c: Add -mnoreturn-no-callee-saved-registers
            to dg-options.
            * gcc.target/i386/pr38534-2.c: Likewise.
            * gcc.target/i386/pr38534-3.c: Likewise.
            * gcc.target/i386/pr38534-4.c: Likewise.
            * gcc.target/i386/pr38534-5.c: Likewise.
            * gcc.target/i386/pr38534-6.c: Likewise.
            * gcc.target/i386/pr114097-1.c: Likewise.
            * gcc.target/i386/stack-check-17.c: Likewise.
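To make the commit message concrete, here is a minimal sketch of the kind of code the guarded optimization applies to. It is only an illustration: the names fatal_error and checked_div are hypothetical and do not come from the patch or the GCC testsuite. Whether the generated prologue actually differs depends on whether the noreturn function ends up using any callee-saved registers, which this small example does not guarantee.

```c
/* Illustrative sketch only; fatal_error and checked_div are
   hypothetical names, not part of the patch.  */
#include <stdio.h>
#include <stdlib.h>

/* A noreturn function that cannot throw (plain C).  When optimizing
   (and not with -Og), and only if -mnoreturn-no-callee-saved-registers
   is given, GCC for x86 may omit saving the callee-saved registers
   this function uses, except for the bp register.  noinline keeps the
   function out of line so its own prologue can be inspected.  */
__attribute__ ((noreturn, noinline))
void
fatal_error (const char *msg, int code)
{
  fprintf (stderr, "fatal: %s (code %d)\n", msg, code);
  exit (code);
}

/* A caller.  If fatal_error clobbers callee-saved registers without
   saving them, a debugger stopped inside fatal_error may be unable to
   show this caller's variables that lived in those registers.  */
int
checked_div (int num, int den)
{
  if (den == 0)
    fatal_error ("division by zero", 2);
  return num / den;
}
```

One way to look for a difference, assuming an x86 GCC that includes this commit, is to compile the file once with `gcc -O2 -S` and once with `gcc -O2 -mnoreturn-no-callee-saved-registers -S`, then compare the prologue of fatal_error in the two assembly outputs.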
Diff:
---
 gcc/config/i386/i386-options.cc                |  6 ++++--
 gcc/config/i386/i386.opt                       |  4 ++++
 gcc/doc/invoke.texi                            | 10 ++++++++++
 gcc/testsuite/gcc.target/i386/pr114097-1.c     |  2 +-
 gcc/testsuite/gcc.target/i386/pr38534-1.c      |  2 +-
 gcc/testsuite/gcc.target/i386/pr38534-2.c      |  2 +-
 gcc/testsuite/gcc.target/i386/pr38534-3.c      |  2 +-
 gcc/testsuite/gcc.target/i386/pr38534-4.c      |  2 +-
 gcc/testsuite/gcc.target/i386/pr38534-5.c      |  2 +-
 gcc/testsuite/gcc.target/i386/pr38534-6.c      |  2 +-
 gcc/testsuite/gcc.target/i386/stack-check-17.c |  2 +-
 11 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
index 2f8c85f66d4..3cc147fa70c 100644
--- a/gcc/config/i386/i386-options.cc
+++ b/gcc/config/i386/i386-options.cc
@@ -3384,7 +3384,8 @@ ix86_set_func_type (tree fndecl)
 {
   /* No need to save and restore callee-saved registers for a noreturn
      function with nothrow or compiled with -fno-exceptions unless when
-     compiling with -O0 or -Og. So that backtrace works for those at least
+     compiling with -O0 or -Og, except that it interferes with debugging
+     of callers. So that backtrace works for those at least
      in most cases, save the bp register if it is used, because it often
      is used in callers to compute CFA.

@@ -3401,7 +3402,8 @@ ix86_set_func_type (tree fndecl)
   if (lookup_attribute ("no_callee_saved_registers",
                         TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
     no_callee_saved_registers = TYPE_NO_CALLEE_SAVED_REGISTERS;
-  else if (TREE_THIS_VOLATILE (fndecl)
+  else if (ix86_noreturn_no_callee_saved_registers
+           && TREE_THIS_VOLATILE (fndecl)
            && optimize
            && !optimize_debug
            && (TREE_NOTHROW (fndecl) || !flag_exceptions)
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 5b4f1bff25f..d5f793a9e8b 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -659,6 +659,10 @@ mstore-max=
 Target RejectNegative Joined Var(ix86_store_max) Enum(prefer_vector_width) Init(PVW_NONE) Save
 Maximum number of bits that can be stored to memory efficiently.

+mnoreturn-no-callee-saved-registers
+Target Var(ix86_noreturn_no_callee_saved_registers)
+Optimize noreturn functions by not saving callee-saved registers used in the function.
+
 ;; ISA support

 m32
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2390d478121..c0d604a2c5c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1450,6 +1450,7 @@ See RS/6000 and PowerPC Options.
 -mvzeroupper -mprefer-avx128 -mprefer-vector-width=@var{opt}
 -mpartial-vector-fp-math
 -mmove-max=@var{bits} -mstore-max=@var{bits}
+-mnoreturn-no-callee-saved-registers
 -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx
 -mavx2 -mavx512f -mavx512pf -mavx512er -mavx512cd -mavx512vl
 -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -msha -maes
@@ -35397,6 +35398,15 @@ Prefer 256-bit vector width for instructions.
 Prefer 512-bit vector width for instructions.
 @end table

+@opindex mnoreturn-no-callee-saved-registers
+@item -mnoreturn-no-callee-saved-registers
+This option optimizes functions with @code{noreturn} attribute or
+@code{_Noreturn} specifier by not saving in the function prologue callee-saved
+registers which are used in the function (except for the @code{BP}
+register). This option can interfere with debugging of the caller of the
+@code{noreturn} function or any function further up in the call stack, so it
+is not enabled by default.
+
 @opindex mcx16
 @item -mcx16
 This option enables GCC to generate @code{CMPXCHG16B} instructions in 64-bit
diff --git a/gcc/testsuite/gcc.target/i386/pr114097-1.c b/gcc/testsuite/gcc.target/i386/pr114097-1.c
index feeb9165570..b8edfaaf25b 100644
--- a/gcc/testsuite/gcc.target/i386/pr114097-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr114097-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer -mnoreturn-no-callee-saved-registers" } */

 #define ARRAY_SIZE 256

diff --git a/gcc/testsuite/gcc.target/i386/pr38534-1.c b/gcc/testsuite/gcc.target/i386/pr38534-1.c
index c73c8d23288..795174f4eb3 100644
--- a/gcc/testsuite/gcc.target/i386/pr38534-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr38534-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer -mnoreturn-no-callee-saved-registers" } */

 #define ARRAY_SIZE 256

diff --git a/gcc/testsuite/gcc.target/i386/pr38534-2.c b/gcc/testsuite/gcc.target/i386/pr38534-2.c
index 0dc8720dc89..2d8a4b6a724 100644
--- a/gcc/testsuite/gcc.target/i386/pr38534-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr38534-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer -mnoreturn-no-callee-saved-registers" } */

 extern void bar (void) __attribute__ ((no_callee_saved_registers));
 extern void fn (void) __attribute__ ((noreturn));
diff --git a/gcc/testsuite/gcc.target/i386/pr38534-3.c b/gcc/testsuite/gcc.target/i386/pr38534-3.c
index 554c594feb7..9ab2f5dca39 100644
--- a/gcc/testsuite/gcc.target/i386/pr38534-3.c
+++ b/gcc/testsuite/gcc.target/i386/pr38534-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer -mnoreturn-no-callee-saved-registers" } */

 typedef void (*fn_t) (void) __attribute__ ((no_callee_saved_registers));
 extern fn_t bar;
diff --git a/gcc/testsuite/gcc.target/i386/pr38534-4.c b/gcc/testsuite/gcc.target/i386/pr38534-4.c
index 8073aac01fc..d4726c36137 100644
--- a/gcc/testsuite/gcc.target/i386/pr38534-4.c
+++ b/gcc/testsuite/gcc.target/i386/pr38534-4.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer -mnoreturn-no-callee-saved-registers" } */

 typedef void (*fn_t) (void) __attribute__ ((no_callee_saved_registers));
 extern void fn (void) __attribute__ ((noreturn));
diff --git a/gcc/testsuite/gcc.target/i386/pr38534-5.c b/gcc/testsuite/gcc.target/i386/pr38534-5.c
index 91c0c0f8c59..44386459895 100644
--- a/gcc/testsuite/gcc.target/i386/pr38534-5.c
+++ b/gcc/testsuite/gcc.target/i386/pr38534-5.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O0 -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */
+/* { dg-options "-O0 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -mnoreturn-no-callee-saved-registers" } */

 #define ARRAY_SIZE 256

diff --git a/gcc/testsuite/gcc.target/i386/pr38534-6.c b/gcc/testsuite/gcc.target/i386/pr38534-6.c
index cf1463a9c66..266c8e7f1de 100644
--- a/gcc/testsuite/gcc.target/i386/pr38534-6.c
+++ b/gcc/testsuite/gcc.target/i386/pr38534-6.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -mnoreturn-no-callee-saved-registers" } */

 #define ARRAY_SIZE 256

diff --git a/gcc/testsuite/gcc.target/i386/stack-check-17.c b/gcc/testsuite/gcc.target/i386/stack-check-17.c
index 648572e5ebd..924a459c4e2 100644
--- a/gcc/testsuite/gcc.target/i386/stack-check-17.c
+++ b/gcc/testsuite/gcc.target/i386/stack-check-17.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fstack-clash-protection -mtune=generic -fomit-frame-pointer" } */
+/* { dg-options "-O2 -fstack-clash-protection -mtune=generic -fomit-frame-pointer -mnoreturn-no-callee-saved-registers" } */
 /* { dg-require-effective-target supports_stack_clash_protection } */
 /* { dg-skip-if "" { *-*-* } { "-fstack-protector*" } { "" } } */