public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] i386: For noreturn functions save at least the bp register if it is used [PR114116]
@ 2024-02-27  8:40 Jakub Jelinek
  2024-02-27  8:54 ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2024-02-27  8:40 UTC (permalink / raw)
  To: Uros Bizjak, hjl.tools; +Cc: gcc-patches

Hi!

As mentioned in the PR, on x86_64 currently a lot of ICEs end up
with crashes in the unwinder like:
during RTL pass: expand
pr114044-2.c: In function ‘foo’:
pr114044-2.c:5:3: internal compiler error: in expand_fn_using_insn, at internal-fn.cc:208
    5 |   __builtin_clzg (a);
      |   ^~~~~~~~~~~~~~~~~~
0x7d9246 expand_fn_using_insn
	../../gcc/internal-fn.cc:208

pr114044-2.c:5:3: internal compiler error: Segmentation fault
0x1554262 crash_signal
	../../gcc/toplev.cc:319
0x2b20320 x86_64_fallback_frame_state
	./md-unwind-support.h:63
0x2b20320 uw_frame_state_for
	../../../libgcc/unwind-dw2.c:1013
0x2b2165d _Unwind_Backtrace
	../../../libgcc/unwind.inc:303
0x2acbd69 backtrace_full
	../../libbacktrace/backtrace.c:127
0x2a32fa6 diagnostic_context::action_after_output(diagnostic_t)
	../../gcc/diagnostic.cc:781
0x2a331bb diagnostic_action_after_output(diagnostic_context*, diagnostic_t)
	../../gcc/diagnostic.h:1002
0x2a331bb diagnostic_context::report_diagnostic(diagnostic_info*)
	../../gcc/diagnostic.cc:1633
0x2a33543 diagnostic_impl
	../../gcc/diagnostic.cc:1767
0x2a33c26 internal_error(char const*, ...)
	../../gcc/diagnostic.cc:2225
0xe232c8 fancy_abort(char const*, int, char const*)
	../../gcc/diagnostic.cc:2336
0x7d9246 expand_fn_using_insn
	../../gcc/internal-fn.cc:208
Segmentation fault (core dumped)

The problem are the PR38534 r14-8470 changes which avoid saving call-saved
registers in noreturn functions.  If such functions ever touch the
bp register but because of the r14-8470 changes don't save it in the
prologue, the caller or any other function in the backtrace uses a frame
pointer and the noreturn function or anything it calls directly or
indirectly calls backtrace, then the unwinder crashes, because bp register
contains some unrelated value, but in the frames which do use frame pointer
CFA is based on the bp register.

In theory this could happen with any other call-saved register, e.g. code
written by hand in assembly with .cfi_* directives could use any other
call-saved register as register into which store the CFA or something
related to that, but in reality at least compiler generated code and usual
assembly probably just making sure bp doesn't contain garbage could be
enough for backtrace purposes.  In the debugger of course it will not be
enough, the values of the arguments etc. can be lost (if DW_CFA_undefined
is emitted) or garbage.

So, I think for noreturn function we should at least save the bp register
if we use it.  If user asks for it using no_callee_saved_registers
attribute, let's honor what is asked for (but then it is up to the user
to make sure e.g. backtrace isn't called from the function or anything it
calls).  As discussed in the PR, whether to save bp or not shouldn't be
based on whether compiling with -g or -g0, because we don't want code
generation changes without/with debugging, it would also break
-fcompare-debug, and users can call backtrace(3), that doesn't use debug
info, just unwind info, even backtrace_symbols{,_fd}(3) don't use debug info
but just looks at dynamic symbol table.

The patch also adds check for no_caller_saved_registers
attribute in the implicit addition of not saving callee saved register
in noreturn functions, because on I think
__attribute__((no_caller_saved_registers, noreturn)) will otherwise
error that no_caller_saved_registers and no_callee_saved_registers
attributes are incompatible (but user didn't specify anything like that).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-02-27  Jakub Jelinek  <jakub@redhat.com>

	PR target/114116
	* config/i386/i386.h (enum call_saved_registers_type): Add
	TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP enumerator.
	* config/i386/i386-options.cc (ix86_set_func_type): Remove
	has_no_callee_saved_registers variable, add no_callee_saved_registers
	instead, initialize it depending on whether it is
	no_callee_saved_registers function or not.  Don't set it if
	no_caller_saved_registers attribute is present.  Adjust users.
	* config/i386/i386.cc (ix86_function_ok_for_sibcall): Handle
	TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP like
	TYPE_NO_CALLEE_SAVED_REGISTERS.
	(ix86_save_reg): Handle TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP.

	* gcc.target/i386/pr38534-1.c: Allow push/pop of bp.
	* gcc.target/i386/pr38534-4.c: Likewise.
	* gcc.target/i386/pr38534-2.c: Likewise.
	* gcc.target/i386/pr38534-3.c: Likewise.
	* gcc.target/i386/pr114097-1.c: Likewise.
	* gcc.target/i386/stack-check-17.c: Expect no pop on ! ia32.

--- gcc/config/i386/i386.h.jj	2024-01-27 22:59:53.121319813 +0100
+++ gcc/config/i386/i386.h	2024-02-26 16:33:14.644745362 +0100
@@ -2730,9 +2730,12 @@ enum call_saved_registers_type
   /* The current function is a function specified with the "interrupt"
      or "no_caller_saved_registers" attribute.  */
   TYPE_NO_CALLER_SAVED_REGISTERS,
+  /* The current function is a function specified with the
+     "no_callee_saved_registers" attribute.  */
+  TYPE_NO_CALLEE_SAVED_REGISTERS,
   /* The current function is a function specified with the "noreturn"
-     or "no_callee_saved_registers" attribute.  */
-  TYPE_NO_CALLEE_SAVED_REGISTERS
+     attribute.  */
+  TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP,
 };
 
 enum queued_insn_type
--- gcc/config/i386/i386-options.cc.jj	2024-02-26 16:29:07.471201524 +0100
+++ gcc/config/i386/i386-options.cc	2024-02-26 16:42:01.730358609 +0100
@@ -3384,7 +3384,9 @@ ix86_set_func_type (tree fndecl)
 {
   /* No need to save and restore callee-saved registers for a noreturn
      function with nothrow or compiled with -fno-exceptions unless when
-     compiling with -O0 or -Og.
+     compiling with -O0 or -Og.  So that backtrace works for those at least
+     in most cases, save the bp register if it is used, because it often
+     is used in callers to compute CFA.
 
      NB: Can't use just TREE_THIS_VOLATILE to check if this is a noreturn
      function.  The local-pure-const pass turns an interrupt function
@@ -3394,15 +3396,20 @@ ix86_set_func_type (tree fndecl)
      function is marked with TREE_THIS_VOLATILE in the IR output, which
      leads to the incompatible attribute error in LTO1.  Ignore the
      interrupt function in this case.  */
-  bool has_no_callee_saved_registers
-    = ((TREE_THIS_VOLATILE (fndecl)
-	&& !lookup_attribute ("interrupt",
-			      TYPE_ATTRIBUTES (TREE_TYPE (fndecl)))
-	&& optimize
-	&& !optimize_debug
-	&& (TREE_NOTHROW (fndecl) || !flag_exceptions))
-       || lookup_attribute ("no_callee_saved_registers",
-			    TYPE_ATTRIBUTES (TREE_TYPE (fndecl))));
+  enum call_saved_registers_type no_callee_saved_registers
+    = TYPE_DEFAULT_CALL_SAVED_REGISTERS;
+  if (lookup_attribute ("no_callee_saved_registers",
+			TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
+    no_callee_saved_registers = TYPE_NO_CALLEE_SAVED_REGISTERS;
+  else if (TREE_THIS_VOLATILE (fndecl)
+	   && optimize
+	   && !optimize_debug
+	   && (TREE_NOTHROW (fndecl) || !flag_exceptions)
+	   && !lookup_attribute ("interrupt",
+				 TYPE_ATTRIBUTES (TREE_TYPE (fndecl)))
+	   && !lookup_attribute ("no_caller_saved_registers",
+				 TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
+    no_callee_saved_registers = TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP;
 
   if (cfun->machine->func_type == TYPE_UNKNOWN)
     {
@@ -3413,7 +3420,7 @@ ix86_set_func_type (tree fndecl)
 	    error_at (DECL_SOURCE_LOCATION (fndecl),
 		      "interrupt and naked attributes are not compatible");
 
-	  if (has_no_callee_saved_registers)
+	  if (no_callee_saved_registers)
 	    error_at (DECL_SOURCE_LOCATION (fndecl),
 		      "%qs and %qs attributes are not compatible",
 		      "interrupt", "no_callee_saved_registers");
@@ -3442,7 +3449,7 @@ ix86_set_func_type (tree fndecl)
 				TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
 	    cfun->machine->call_saved_registers
 	      = TYPE_NO_CALLER_SAVED_REGISTERS;
-	  if (has_no_callee_saved_registers)
+	  if (no_callee_saved_registers)
 	    {
 	      if (cfun->machine->call_saved_registers
 		  == TYPE_NO_CALLER_SAVED_REGISTERS)
@@ -3451,7 +3458,7 @@ ix86_set_func_type (tree fndecl)
 			  "no_caller_saved_registers",
 			  "no_callee_saved_registers");
 	      cfun->machine->call_saved_registers
-		= TYPE_NO_CALLEE_SAVED_REGISTERS;
+		= no_callee_saved_registers;
 	    }
 	}
     }
--- gcc/config/i386/i386.cc.jj	2024-02-26 11:12:36.275453513 +0100
+++ gcc/config/i386/i386.cc	2024-02-26 17:24:50.701602352 +0100
@@ -984,8 +984,9 @@ ix86_function_ok_for_sibcall (tree decl,
 
   /* Sibling call isn't OK if callee has no callee-saved registers
      and the calling function has callee-saved registers.  */
-  if ((cfun->machine->call_saved_registers
-       != TYPE_NO_CALLEE_SAVED_REGISTERS)
+  if (cfun->machine->call_saved_registers != TYPE_NO_CALLEE_SAVED_REGISTERS
+      && (cfun->machine->call_saved_registers
+	  != TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP)
       && lookup_attribute ("no_callee_saved_registers",
 			   TYPE_ATTRIBUTES (type)))
     return false;
@@ -6649,6 +6650,11 @@ ix86_save_reg (unsigned int regno, bool
 
     case TYPE_NO_CALLEE_SAVED_REGISTERS:
       return false;
+
+    case TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP:
+      if (regno != HARD_FRAME_POINTER_REGNUM)
+	return false;
+      break;
     }
 
   if (regno == REAL_PIC_OFFSET_TABLE_REGNUM
--- gcc/testsuite/gcc.target/i386/pr38534-1.c.jj	2024-02-01 16:00:37.443735635 +0100
+++ gcc/testsuite/gcc.target/i386/pr38534-1.c	2024-02-27 08:59:08.421878166 +0100
@@ -22,5 +22,5 @@ no_return_to_caller (void)
   while (1);
 }
 
-/* { dg-final { scan-assembler-not "push" } } */
-/* { dg-final { scan-assembler-not "pop" } } */
+/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
+/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
--- gcc/testsuite/gcc.target/i386/pr38534-4.c.jj	2024-02-01 16:00:37.443735635 +0100
+++ gcc/testsuite/gcc.target/i386/pr38534-4.c	2024-02-27 09:00:09.869021464 +0100
@@ -12,7 +12,7 @@ foo (fn_t bar)
   fn ();
 }
 
-/* { dg-final { scan-assembler-not "push" } } */
-/* { dg-final { scan-assembler-not "pop" } } */
+/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
+/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
 /* { dg-final { scan-assembler-not "jmp" } } */
 /* { dg-final { scan-assembler "call\[\\t \]+" } } */
--- gcc/testsuite/gcc.target/i386/pr38534-2.c.jj	2024-02-01 16:00:37.443735635 +0100
+++ gcc/testsuite/gcc.target/i386/pr38534-2.c	2024-02-27 08:59:35.723497526 +0100
@@ -12,7 +12,7 @@ foo (void)
   fn ();
 }
 
-/* { dg-final { scan-assembler-not "push" } } */
-/* { dg-final { scan-assembler-not "pop" } } */
+/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
+/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
 /* { dg-final { scan-assembler-not "jmp\[\\t \]+_?bar" } } */
 /* { dg-final { scan-assembler "call\[\\t \]+_?bar" } } */
--- gcc/testsuite/gcc.target/i386/pr38534-3.c.jj	2024-02-01 16:00:37.443735635 +0100
+++ gcc/testsuite/gcc.target/i386/pr38534-3.c	2024-02-27 08:59:49.669303090 +0100
@@ -13,7 +13,7 @@ foo (void)
   fn ();
 }
 
-/* { dg-final { scan-assembler-not "push" } } */
-/* { dg-final { scan-assembler-not "pop" } } */
+/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
+/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
 /* { dg-final { scan-assembler-not "jmp" } } */
 /* { dg-final { scan-assembler "call\[\\t \]+" } } */
--- gcc/testsuite/gcc.target/i386/pr114097-1.c.jj	2024-02-26 16:29:07.529200714 +0100
+++ gcc/testsuite/gcc.target/i386/pr114097-1.c	2024-02-27 09:01:37.876794453 +0100
@@ -22,5 +22,5 @@ no_return_to_caller (void)
   while (1);
 }
 
-/* { dg-final { scan-assembler-not "push" } } */
-/* { dg-final { scan-assembler-not "pop" } } */
+/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
+/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
--- gcc/testsuite/gcc.target/i386/stack-check-17.c.jj	2024-01-27 22:59:53.186318915 +0100
+++ gcc/testsuite/gcc.target/i386/stack-check-17.c	2024-02-27 09:06:10.805989259 +0100
@@ -32,5 +32,5 @@ f3 (void)
    register on ia32 for a noreturn function.  */
 /* { dg-final { scan-assembler-times "push\[ql\]" 1 { target { ! ia32 } } } }  */
 /* { dg-final { scan-assembler-times "push\[ql\]" 3 { target ia32 } } }  */
-/* { dg-final { scan-assembler-times "pop" 1 } } */
-
+/* { dg-final { scan-assembler-not "pop" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "pop" 1 { target ia32 } } } */

	Jakub


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: For noreturn functions save at least the bp register if it is used [PR114116]
  2024-02-27  8:40 [PATCH] i386: For noreturn functions save at least the bp register if it is used [PR114116] Jakub Jelinek
@ 2024-02-27  8:54 ` Richard Biener
  2024-02-27  9:04   ` Jakub Jelinek
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2024-02-27  8:54 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Uros Bizjak, hjl.tools, gcc-patches

On Tue, Feb 27, 2024 at 9:42 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> As mentioned in the PR, on x86_64 currently a lot of ICEs end up
> with crashes in the unwinder like:
> during RTL pass: expand
> pr114044-2.c: In function ‘foo’:
> pr114044-2.c:5:3: internal compiler error: in expand_fn_using_insn, at internal-fn.cc:208
>     5 |   __builtin_clzg (a);
>       |   ^~~~~~~~~~~~~~~~~~
> 0x7d9246 expand_fn_using_insn
>         ../../gcc/internal-fn.cc:208
>
> pr114044-2.c:5:3: internal compiler error: Segmentation fault
> 0x1554262 crash_signal
>         ../../gcc/toplev.cc:319
> 0x2b20320 x86_64_fallback_frame_state
>         ./md-unwind-support.h:63
> 0x2b20320 uw_frame_state_for
>         ../../../libgcc/unwind-dw2.c:1013
> 0x2b2165d _Unwind_Backtrace
>         ../../../libgcc/unwind.inc:303
> 0x2acbd69 backtrace_full
>         ../../libbacktrace/backtrace.c:127
> 0x2a32fa6 diagnostic_context::action_after_output(diagnostic_t)
>         ../../gcc/diagnostic.cc:781
> 0x2a331bb diagnostic_action_after_output(diagnostic_context*, diagnostic_t)
>         ../../gcc/diagnostic.h:1002
> 0x2a331bb diagnostic_context::report_diagnostic(diagnostic_info*)
>         ../../gcc/diagnostic.cc:1633
> 0x2a33543 diagnostic_impl
>         ../../gcc/diagnostic.cc:1767
> 0x2a33c26 internal_error(char const*, ...)
>         ../../gcc/diagnostic.cc:2225
> 0xe232c8 fancy_abort(char const*, int, char const*)
>         ../../gcc/diagnostic.cc:2336
> 0x7d9246 expand_fn_using_insn
>         ../../gcc/internal-fn.cc:208
> Segmentation fault (core dumped)
>
> The problem are the PR38534 r14-8470 changes which avoid saving call-saved
> registers in noreturn functions.  If such functions ever touch the
> bp register but because of the r14-8470 changes don't save it in the
> prologue, the caller or any other function in the backtrace uses a frame
> pointer and the noreturn function or anything it calls directly or
> indirectly calls backtrace, then the unwinder crashes, because bp register
> contains some unrelated value, but in the frames which do use frame pointer
> CFA is based on the bp register.
>
> In theory this could happen with any other call-saved register, e.g. code
> written by hand in assembly with .cfi_* directives could use any other
> call-saved register as register into which store the CFA or something
> related to that, but in reality at least compiler generated code and usual
> assembly probably just making sure bp doesn't contain garbage could be
> enough for backtrace purposes.  In the debugger of course it will not be
> enough, the values of the arguments etc. can be lost (if DW_CFA_undefined
> is emitted) or garbage.
>
> So, I think for noreturn function we should at least save the bp register
> if we use it.  If user asks for it using no_callee_saved_registers
> attribute, let's honor what is asked for (but then it is up to the user
> to make sure e.g. backtrace isn't called from the function or anything it
> calls).  As discussed in the PR, whether to save bp or not shouldn't be
> based on whether compiling with -g or -g0, because we don't want code
> generation changes without/with debugging, it would also break
> -fcompare-debug, and users can call backtrace(3), that doesn't use debug
> info, just unwind info, even backtrace_symbols{,_fd}(3) don't use debug info
> but just looks at dynamic symbol table.
>
> The patch also adds check for no_caller_saved_registers
> attribute in the implicit addition of not saving callee saved register
> in noreturn functions, because on I think
> __attribute__((no_caller_saved_registers, noreturn)) will otherwise
> error that no_caller_saved_registers and no_callee_saved_registers
> attributes are incompatible (but user didn't specify anything like that).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Just to add we've in the past opted to avoid tail-calling abort () and friends
exactly to help debugging.  So treating noreturn functions specially with
respect to caller saves looks inconsistent in this regard - it makes debugging
harder since a lot of info in the calling frame is going to be lost?

I hope we at least avoid that at -O0, possibly also with -Og?

> 2024-02-27  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/114116
>         * config/i386/i386.h (enum call_saved_registers_type): Add
>         TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP enumerator.
>         * config/i386/i386-options.cc (ix86_set_func_type): Remove
>         has_no_callee_saved_registers variable, add no_callee_saved_registers
>         instead, initialize it depending on whether it is
>         no_callee_saved_registers function or not.  Don't set it if
>         no_caller_saved_registers attribute is present.  Adjust users.
>         * config/i386/i386.cc (ix86_function_ok_for_sibcall): Handle
>         TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP like
>         TYPE_NO_CALLEE_SAVED_REGISTERS.
>         (ix86_save_reg): Handle TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP.
>
>         * gcc.target/i386/pr38534-1.c: Allow push/pop of bp.
>         * gcc.target/i386/pr38534-4.c: Likewise.
>         * gcc.target/i386/pr38534-2.c: Likewise.
>         * gcc.target/i386/pr38534-3.c: Likewise.
>         * gcc.target/i386/pr114097-1.c: Likewise.
>         * gcc.target/i386/stack-check-17.c: Expect no pop on ! ia32.
>
> --- gcc/config/i386/i386.h.jj   2024-01-27 22:59:53.121319813 +0100
> +++ gcc/config/i386/i386.h      2024-02-26 16:33:14.644745362 +0100
> @@ -2730,9 +2730,12 @@ enum call_saved_registers_type
>    /* The current function is a function specified with the "interrupt"
>       or "no_caller_saved_registers" attribute.  */
>    TYPE_NO_CALLER_SAVED_REGISTERS,
> +  /* The current function is a function specified with the
> +     "no_callee_saved_registers" attribute.  */
> +  TYPE_NO_CALLEE_SAVED_REGISTERS,
>    /* The current function is a function specified with the "noreturn"
> -     or "no_callee_saved_registers" attribute.  */
> -  TYPE_NO_CALLEE_SAVED_REGISTERS
> +     attribute.  */
> +  TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP,
>  };
>
>  enum queued_insn_type
> --- gcc/config/i386/i386-options.cc.jj  2024-02-26 16:29:07.471201524 +0100
> +++ gcc/config/i386/i386-options.cc     2024-02-26 16:42:01.730358609 +0100
> @@ -3384,7 +3384,9 @@ ix86_set_func_type (tree fndecl)
>  {
>    /* No need to save and restore callee-saved registers for a noreturn
>       function with nothrow or compiled with -fno-exceptions unless when
> -     compiling with -O0 or -Og.
> +     compiling with -O0 or -Og.  So that backtrace works for those at least
> +     in most cases, save the bp register if it is used, because it often
> +     is used in callers to compute CFA.
>
>       NB: Can't use just TREE_THIS_VOLATILE to check if this is a noreturn
>       function.  The local-pure-const pass turns an interrupt function
> @@ -3394,15 +3396,20 @@ ix86_set_func_type (tree fndecl)
>       function is marked with TREE_THIS_VOLATILE in the IR output, which
>       leads to the incompatible attribute error in LTO1.  Ignore the
>       interrupt function in this case.  */
> -  bool has_no_callee_saved_registers
> -    = ((TREE_THIS_VOLATILE (fndecl)
> -       && !lookup_attribute ("interrupt",
> -                             TYPE_ATTRIBUTES (TREE_TYPE (fndecl)))
> -       && optimize
> -       && !optimize_debug
> -       && (TREE_NOTHROW (fndecl) || !flag_exceptions))
> -       || lookup_attribute ("no_callee_saved_registers",
> -                           TYPE_ATTRIBUTES (TREE_TYPE (fndecl))));
> +  enum call_saved_registers_type no_callee_saved_registers
> +    = TYPE_DEFAULT_CALL_SAVED_REGISTERS;
> +  if (lookup_attribute ("no_callee_saved_registers",
> +                       TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
> +    no_callee_saved_registers = TYPE_NO_CALLEE_SAVED_REGISTERS;
> +  else if (TREE_THIS_VOLATILE (fndecl)
> +          && optimize
> +          && !optimize_debug
> +          && (TREE_NOTHROW (fndecl) || !flag_exceptions)
> +          && !lookup_attribute ("interrupt",
> +                                TYPE_ATTRIBUTES (TREE_TYPE (fndecl)))
> +          && !lookup_attribute ("no_caller_saved_registers",
> +                                TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
> +    no_callee_saved_registers = TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP;
>
>    if (cfun->machine->func_type == TYPE_UNKNOWN)
>      {
> @@ -3413,7 +3420,7 @@ ix86_set_func_type (tree fndecl)
>             error_at (DECL_SOURCE_LOCATION (fndecl),
>                       "interrupt and naked attributes are not compatible");
>
> -         if (has_no_callee_saved_registers)
> +         if (no_callee_saved_registers)
>             error_at (DECL_SOURCE_LOCATION (fndecl),
>                       "%qs and %qs attributes are not compatible",
>                       "interrupt", "no_callee_saved_registers");
> @@ -3442,7 +3449,7 @@ ix86_set_func_type (tree fndecl)
>                                 TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
>             cfun->machine->call_saved_registers
>               = TYPE_NO_CALLER_SAVED_REGISTERS;
> -         if (has_no_callee_saved_registers)
> +         if (no_callee_saved_registers)
>             {
>               if (cfun->machine->call_saved_registers
>                   == TYPE_NO_CALLER_SAVED_REGISTERS)
> @@ -3451,7 +3458,7 @@ ix86_set_func_type (tree fndecl)
>                           "no_caller_saved_registers",
>                           "no_callee_saved_registers");
>               cfun->machine->call_saved_registers
> -               = TYPE_NO_CALLEE_SAVED_REGISTERS;
> +               = no_callee_saved_registers;
>             }
>         }
>      }
> --- gcc/config/i386/i386.cc.jj  2024-02-26 11:12:36.275453513 +0100
> +++ gcc/config/i386/i386.cc     2024-02-26 17:24:50.701602352 +0100
> @@ -984,8 +984,9 @@ ix86_function_ok_for_sibcall (tree decl,
>
>    /* Sibling call isn't OK if callee has no callee-saved registers
>       and the calling function has callee-saved registers.  */
> -  if ((cfun->machine->call_saved_registers
> -       != TYPE_NO_CALLEE_SAVED_REGISTERS)
> +  if (cfun->machine->call_saved_registers != TYPE_NO_CALLEE_SAVED_REGISTERS
> +      && (cfun->machine->call_saved_registers
> +         != TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP)
>        && lookup_attribute ("no_callee_saved_registers",
>                            TYPE_ATTRIBUTES (type)))
>      return false;
> @@ -6649,6 +6650,11 @@ ix86_save_reg (unsigned int regno, bool
>
>      case TYPE_NO_CALLEE_SAVED_REGISTERS:
>        return false;
> +
> +    case TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP:
> +      if (regno != HARD_FRAME_POINTER_REGNUM)
> +       return false;
> +      break;
>      }
>
>    if (regno == REAL_PIC_OFFSET_TABLE_REGNUM
> --- gcc/testsuite/gcc.target/i386/pr38534-1.c.jj        2024-02-01 16:00:37.443735635 +0100
> +++ gcc/testsuite/gcc.target/i386/pr38534-1.c   2024-02-27 08:59:08.421878166 +0100
> @@ -22,5 +22,5 @@ no_return_to_caller (void)
>    while (1);
>  }
>
> -/* { dg-final { scan-assembler-not "push" } } */
> -/* { dg-final { scan-assembler-not "pop" } } */
> +/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
> +/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
> --- gcc/testsuite/gcc.target/i386/pr38534-4.c.jj        2024-02-01 16:00:37.443735635 +0100
> +++ gcc/testsuite/gcc.target/i386/pr38534-4.c   2024-02-27 09:00:09.869021464 +0100
> @@ -12,7 +12,7 @@ foo (fn_t bar)
>    fn ();
>  }
>
> -/* { dg-final { scan-assembler-not "push" } } */
> -/* { dg-final { scan-assembler-not "pop" } } */
> +/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
> +/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
>  /* { dg-final { scan-assembler-not "jmp" } } */
>  /* { dg-final { scan-assembler "call\[\\t \]+" } } */
> --- gcc/testsuite/gcc.target/i386/pr38534-2.c.jj        2024-02-01 16:00:37.443735635 +0100
> +++ gcc/testsuite/gcc.target/i386/pr38534-2.c   2024-02-27 08:59:35.723497526 +0100
> @@ -12,7 +12,7 @@ foo (void)
>    fn ();
>  }
>
> -/* { dg-final { scan-assembler-not "push" } } */
> -/* { dg-final { scan-assembler-not "pop" } } */
> +/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
> +/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
>  /* { dg-final { scan-assembler-not "jmp\[\\t \]+_?bar" } } */
>  /* { dg-final { scan-assembler "call\[\\t \]+_?bar" } } */
> --- gcc/testsuite/gcc.target/i386/pr38534-3.c.jj        2024-02-01 16:00:37.443735635 +0100
> +++ gcc/testsuite/gcc.target/i386/pr38534-3.c   2024-02-27 08:59:49.669303090 +0100
> @@ -13,7 +13,7 @@ foo (void)
>    fn ();
>  }
>
> -/* { dg-final { scan-assembler-not "push" } } */
> -/* { dg-final { scan-assembler-not "pop" } } */
> +/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
> +/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
>  /* { dg-final { scan-assembler-not "jmp" } } */
>  /* { dg-final { scan-assembler "call\[\\t \]+" } } */
> --- gcc/testsuite/gcc.target/i386/pr114097-1.c.jj       2024-02-26 16:29:07.529200714 +0100
> +++ gcc/testsuite/gcc.target/i386/pr114097-1.c  2024-02-27 09:01:37.876794453 +0100
> @@ -22,5 +22,5 @@ no_return_to_caller (void)
>    while (1);
>  }
>
> -/* { dg-final { scan-assembler-not "push" } } */
> -/* { dg-final { scan-assembler-not "pop" } } */
> +/* { dg-final { scan-assembler-not "push\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
> +/* { dg-final { scan-assembler-not "pop\[^\n\r\]*(?:\[abcd\]x|\[sd\]i|sp|r\[0-9\]|\[xyz\]mm)" } } */
> --- gcc/testsuite/gcc.target/i386/stack-check-17.c.jj   2024-01-27 22:59:53.186318915 +0100
> +++ gcc/testsuite/gcc.target/i386/stack-check-17.c      2024-02-27 09:06:10.805989259 +0100
> @@ -32,5 +32,5 @@ f3 (void)
>     register on ia32 for a noreturn function.  */
>  /* { dg-final { scan-assembler-times "push\[ql\]" 1 { target { ! ia32 } } } }  */
>  /* { dg-final { scan-assembler-times "push\[ql\]" 3 { target ia32 } } }  */
> -/* { dg-final { scan-assembler-times "pop" 1 } } */
> -
> +/* { dg-final { scan-assembler-not "pop" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "pop" 1 { target ia32 } } } */
>
>         Jakub
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: For noreturn functions save at least the bp register if it is used [PR114116]
  2024-02-27  8:54 ` Richard Biener
@ 2024-02-27  9:04   ` Jakub Jelinek
  2024-02-27  9:13     ` Jakub Jelinek
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2024-02-27  9:04 UTC (permalink / raw)
  To: Richard Biener; +Cc: Uros Bizjak, hjl.tools, gcc-patches

On Tue, Feb 27, 2024 at 09:54:45AM +0100, Richard Biener wrote:
> Just to add we've in the past opted to avoid tail-calling abort () and friends
> exactly to help debugging.  So treating noreturn functions specially with
> respect to caller saves looks inconsistent in this regard - it makes debugging
> harder since a lot of info in the calling frame is going to be lost?

I know and am not very happy about the changes either.

> I hope we at least avoid that at -O0, possibly also with -Og?

r14-8495 fixed at least that.

Of course, it can break debugging experience even when the noreturn function
is compiled with -O2 but some or all callers of that are -O0 or -Og.
So, if unlucky and e.g. abort function in glibc gets the
no_callee_saved_registers treatment and actually uses some call saved
registers, backtraces when something aborts will be worse or useless.
And we don't even have any attribute which would tell gcc not to do that.

So sure, another option is just revert the PR38534 changes.

	Jakub


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: For noreturn functions save at least the bp register if it is used [PR114116]
  2024-02-27  9:04   ` Jakub Jelinek
@ 2024-02-27  9:13     ` Jakub Jelinek
  2024-02-27  9:50       ` Richard Biener
                         ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Jakub Jelinek @ 2024-02-27  9:13 UTC (permalink / raw)
  To: Richard Biener, Uros Bizjak, hjl.tools, gcc-patches

On Tue, Feb 27, 2024 at 10:04:06AM +0100, Jakub Jelinek wrote:
> > I hope we at least avoid that at -O0, possibly also with -Og?
> 
> r14-8495 fixed at least that.
> 
> Of course, it can break debugging experience even when the noreturn function
> is compiled with -O2 but some or all callers of that are -O0 or -Og.
> So, if unlucky and e.g. abort function in glibc gets the
> no_callee_saved_registers treatment and actually uses some call saved
> registers, backtraces when something aborts will be worse or useless.
> And we don't even have any attribute which would tell gcc not to do that.
> 
> So sure, another option is just revert the PR38534 changes.

For __libc_start_main, glibc surely just could use no_callee_saved_registers
attribute, because that is typically the outermost frame in backtrace,
there is no need to save those there.
And for kernel if it really wants it and nothing will use the backtraces,
perhaps the patch wouldn't need to be reverted completely but just guarded
the implicit no_callee_saved_registers treatment of noreturn
functions on -mcmodel=kernel or -fno-asynchronous-unwind-tables.

	Jakub


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: For noreturn functions save at least the bp register if it is used [PR114116]
  2024-02-27  9:13     ` Jakub Jelinek
@ 2024-02-27  9:50       ` Richard Biener
  2024-02-27  9:55       ` Jakub Jelinek
  2024-02-27 12:09       ` Jakub Jelinek
  2 siblings, 0 replies; 22+ messages in thread
From: Richard Biener @ 2024-02-27  9:50 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Uros Bizjak, hjl.tools, gcc-patches

On Tue, Feb 27, 2024 at 10:13 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Tue, Feb 27, 2024 at 10:04:06AM +0100, Jakub Jelinek wrote:
> > > I hope we at least avoid that at -O0, possibly also with -Og?
> >
> > r14-8495 fixed at least that.
> >
> > Of course, it can break debugging experience even when the noreturn function
> > is compiled with -O2 but some or all callers of that are -O0 or -Og.
> > So, if unlucky and e.g. abort function in glibc gets the
> > no_callee_saved_registers treatment and actually uses some call saved
> > registers, backtraces when something aborts will be worse or useless.
> > And we don't even have any attribute which would tell gcc not to do that.
> >
> > So sure, another option is just revert the PR38534 changes.
>
> For __libc_start_main, glibc surely just could use no_callee_saved_registers
> attribute, because that is typically the outermost frame in backtrace,
> there is no need to save those there.
> And for kernel if it really wants it and nothing will use the backtraces,
> perhaps the patch wouldn't need to be reverted completely but just guarded
> the implicit no_callee_saved_registers treatment of noreturn
> functions on -mcmodel=kernel or -fno-asynchronous-unwind-tables.

I guess the kernel could also add the attribute to appropriate places?  Or
we have a -mno-noreturn-callee-saved-registers

But yeah, I think this is an overall bad change (to make this the default
behavior for noreturn functions).

Richard.

>         Jakub
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: For noreturn functions save at least the bp register if it is used [PR114116]
  2024-02-27  9:13     ` Jakub Jelinek
  2024-02-27  9:50       ` Richard Biener
@ 2024-02-27  9:55       ` Jakub Jelinek
  2024-02-27 12:09       ` Jakub Jelinek
  2 siblings, 0 replies; 22+ messages in thread
From: Jakub Jelinek @ 2024-02-27  9:55 UTC (permalink / raw)
  To: Richard Biener, Uros Bizjak, hjl.tools, gcc-patches

On Tue, Feb 27, 2024 at 10:13:14AM +0100, Jakub Jelinek wrote:
> On Tue, Feb 27, 2024 at 10:04:06AM +0100, Jakub Jelinek wrote:
> > > I hope we at least avoid that at -O0, possibly also with -Og?
> > 
> > r14-8495 fixed at least that.
> > 
> > Of course, it can break debugging experience even when the noreturn function
> > is compiled with -O2 but some or all callers of that are -O0 or -Og.
> > So, if unlucky and e.g. abort function in glibc gets the
> > no_callee_saved_registers treatment and actually uses some call saved
> > registers, backtraces when something aborts will be worse or useless.
> > And we don't even have any attribute which would tell gcc not to do that.
> > 
> > So sure, another option is just revert the PR38534 changes.
> 
> For __libc_start_main, glibc surely just could use no_callee_saved_registers
> attribute, because that is typically the outermost frame in backtrace,
> there is no need to save those there.
> And for kernel if it really wants it and nothing will use the backtraces,
> perhaps the patch wouldn't need to be reverted completely but just guarded
> the implicit no_callee_saved_registers treatment of noreturn
> functions on -mcmodel=kernel or -fno-asynchronous-unwind-tables.

So, let's look at admittedly artificial testcase which could very well
happen in real world code though, just using some meaningful code creating the
register preassure instead of it being created artificially, noipa attribute
just a replacement for functions in different TUs or too large to be
inlinable, and the noreturn function in a different library where everything
is built with -O2, not -O0 or -Og for debugging.
Note, it doesn't have to be even abort, can be exit too.

I've compiled this with -Og -g with gcc 12, trunk and trunk patched with the
patch I've posted, in each case I run it under debugger, type run and bt
when it aborts.
The gcc 12 case is the expected one:
#0  0x00007ffff7dbd765 in abort () from /lib64/libc.so.6
#1  0x00000000004011ca in bar () at /tmp/1.c:30
#2  0x00000000004011f1 in baz (a=a@entry=42, b=b@entry=43, c=c@entry=44, d=d@entry=45, e=e@entry=46, f=f@entry=47, g=48, h=49) at /tmp/1.c:38
#3  0x00000000004012d8 in qux () at /tmp/1.c:55
#4  0x0000000000401319 in main () at /tmp/1.c:62
The gcc trunk hits the backtrace not possible problem because rbp is
clobbered and needed in upper frame CFA computation:
#0  0x00007ffff7dbd765 in abort () from /lib64/libc.so.6
#1  0x00000000004011b0 in bar () at /tmp/1.c:30
#2  0x00000000004011d1 in baz (a=<error reading variable: Cannot access memory at address 0xdeadbebb>, b=<error reading variable: Cannot access memory at address 0xdeadbeb7>, 
    c=<error reading variable: Cannot access memory at address 0xdeadbeb3>, d=d@entry=-559038737, e=e@entry=-559038737, f=f@entry=-559038737, g=48, h=49) at /tmp/1.c:38
#3  0x00000000004012a9 in qux () at /tmp/1.c:55
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
And in the patched gcc backtrace works but many of the values are bogus:
#0  0x00007ffff7dbd765 in abort () from /lib64/libc.so.6
#1  0x00000000004011b1 in bar () at /tmp/1.c:30
#2  0x00000000004011d2 in baz (a=a@entry=42, b=b@entry=43, c=c@entry=44, d=d@entry=-559038737, e=e@entry=-559038737, f=f@entry=-559038737, g=48, h=49) at /tmp/1.c:38
#3  0x00000000004012aa in qux () at /tmp/1.c:55
#4  0x00000000004012e4 in main () at /tmp/1.c:62

extern void abort (void);
volatile unsigned v = 0xdeadbeefU;
int w;

__attribute__((noipa)) void
corge (char *p)
{
  (void) p;
}

__attribute__((noipa)) int
foo (int x)
{
  return x;
}

__attribute__((noipa, noreturn, optimize (2))) void
bar (void)
{
  unsigned a = v;
  unsigned b = v;
  unsigned c = v;
  unsigned d = v;
  unsigned e = v;
  unsigned f = v;
  unsigned g = v;
  unsigned h = v;
  int i = foo (50);
  v = a + b + c + d + e + f + g + h;
  abort ();
}

__attribute__((noipa)) void
baz (int a, int b, int c, int d, int e, int f, int g, int h)
{
  int i = foo (51);
  if (w)
    bar ();
}

__attribute__((noipa)) void
qux (void)
{
  int a = foo (42);
  int b = foo (43);
  int c = foo (44);
  int d = foo (45);
  int e = foo (46);
  int f = foo (47);
  int g = foo (48);
  int h = foo (49);
  corge (__builtin_alloca (foo (52)));
  baz (a, b, c, d, e, f, g, h);
  w++;
  baz (a, b, c, d, e, f, g, h);
  baz (a, b, c, d, e, f, g, h);
}

int
main ()
{
  qux ();
}


	Jakub


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: For noreturn functions save at least the bp register if it is used [PR114116]
  2024-02-27  9:13     ` Jakub Jelinek
  2024-02-27  9:50       ` Richard Biener
  2024-02-27  9:55       ` Jakub Jelinek
@ 2024-02-27 12:09       ` Jakub Jelinek
  2024-02-27 14:57         ` [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534] Jakub Jelinek
  2024-02-29 16:25         ` [PATCH] i386: For noreturn functions save at least the bp register if it is used [PR114116] Michael Matz
  2 siblings, 2 replies; 22+ messages in thread
From: Jakub Jelinek @ 2024-02-27 12:09 UTC (permalink / raw)
  To: Richard Biener, Uros Bizjak, hjl.tools, gcc-patches

On Tue, Feb 27, 2024 at 10:13:14AM +0100, Jakub Jelinek wrote:
> For __libc_start_main, glibc surely just could use no_callee_saved_registers
> attribute, because that is typically the outermost frame in backtrace,
> there is no need to save those there.
> And for kernel if it really wants it and nothing will use the backtraces,
> perhaps the patch wouldn't need to be reverted completely but just guarded
> the implicit no_callee_saved_registers treatment of noreturn
> functions on -mcmodel=kernel or -fno-asynchronous-unwind-tables.

Guarding on -fno-asynchronous-unwind-tables isn't a good idea,
with just -g we emit in that case unwind info in .debug_frame section
and even that shouldn't break, and we shouldn't generate different code for
-g vs. -g0.
The problem with the changes is that it breaks the unwinding and debugging
experience not just in the functions on which the optimization triggers,
but on all functions in the backtrace as well.

So, IMHO either revert the changes altogether, or guard on -mcmodel=kernel
(but talk to kernel people on linux-toolchains if that is what they actually
want).

	Jakub


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]
  2024-02-27 12:09       ` Jakub Jelinek
@ 2024-02-27 14:57         ` Jakub Jelinek
  2024-02-28  8:00           ` Jakub Jelinek
  2024-02-29 16:25         ` [PATCH] i386: For noreturn functions save at least the bp register if it is used [PR114116] Michael Matz
  1 sibling, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2024-02-27 14:57 UTC (permalink / raw)
  To: Richard Biener, Uros Bizjak, hjl.tools, gcc-patches

On Tue, Feb 27, 2024 at 01:09:09PM +0100, Jakub Jelinek wrote:
> So, IMHO either revert the changes altogether, or guard on -mcmodel=kernel
> (but talk to kernel people on linux-toolchains if that is what they actually
> want).

Here is a patch which guards this by non-default option, so kernel and other
users can choose if they want this or not.  On top of the PR114116 patch.

Only lightly tested so far.

2024-02-27  Jakub Jelinek  <jakub@redhat.com>

	PR target/38534
	* config/i386/i386.opt (mnoreturn-no-callee-saved-registers): New
	option.
	* config/i386/i386-options.cc (ix86_set_func_type): Don't use
	TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP unless
	ix86_noreturn_no_callee_saved_registers is enabled.
	* doc/invoke.texi (-mnoreturn-no-callee-saved-registers): Document.

	* gcc.target/i386/pr38534-1.c: Add -mnoreturn-no-callee-saved-registers
	to dg-options.
	* gcc.target/i386/pr38534-2.c: Likewise.
	* gcc.target/i386/pr38534-3.c: Likewise.
	* gcc.target/i386/pr38534-4.c: Likewise.
	* gcc.target/i386/pr38534-5.c: Likewise.
	* gcc.target/i386/pr38534-6.c: Likewise.
	* gcc.target/i386/pr114097-1.c: Likewise.
	* gcc.target/i386/stack-check-17.c: Likewise.

--- gcc/config/i386/i386.opt.jj	2024-01-10 12:19:07.694681189 +0100
+++ gcc/config/i386/i386.opt	2024-02-27 14:18:34.439240869 +0100
@@ -659,6 +659,10 @@ mstore-max=
 Target RejectNegative Joined Var(ix86_store_max) Enum(prefer_vector_width) Init(PVW_NONE) Save
 Maximum number of bits that can be stored to memory efficiently.
 
+mnoreturn-no-callee-saved-registers
+Target Var(ix86_noreturn_no_callee_saved_registers)
+Optimize noreturn functions by not saving callee-saved registers used in the function.
+
 ;; ISA support
 
 m32
--- gcc/config/i386/i386-options.cc.jj	2024-02-27 14:20:59.972228314 +0100
+++ gcc/config/i386/i386-options.cc	2024-02-27 14:23:26.042208182 +0100
@@ -3384,7 +3384,8 @@ ix86_set_func_type (tree fndecl)
 {
   /* No need to save and restore callee-saved registers for a noreturn
      function with nothrow or compiled with -fno-exceptions unless when
-     compiling with -O0 or -Og.  So that backtrace works for those at least
+     compiling with -O0 or -Og, except that it interferes with debugging
+     of callers.  So that backtrace works for those at least
      in most cases, save the bp register if it is used, because it often
      is used in callers to compute CFA.
 
@@ -3401,7 +3402,8 @@ ix86_set_func_type (tree fndecl)
   if (lookup_attribute ("no_callee_saved_registers",
 			TYPE_ATTRIBUTES (TREE_TYPE (fndecl))))
     no_callee_saved_registers = TYPE_NO_CALLEE_SAVED_REGISTERS;
-  else if (TREE_THIS_VOLATILE (fndecl)
+  else if (ix86_noreturn_no_callee_saved_registers
+	   && TREE_THIS_VOLATILE (fndecl)
 	   && optimize
 	   && !optimize_debug
 	   && (TREE_NOTHROW (fndecl) || !flag_exceptions)
--- gcc/doc/invoke.texi.jj	2024-02-23 11:34:34.278287553 +0100
+++ gcc/doc/invoke.texi	2024-02-27 14:29:18.071339182 +0100
@@ -1450,6 +1450,7 @@ See RS/6000 and PowerPC Options.
 -mvzeroupper  -mprefer-avx128  -mprefer-vector-width=@var{opt}
 -mpartial-vector-fp-math
 -mmove-max=@var{bits} -mstore-max=@var{bits}
+-mnoreturn-no-callee-saved-registers
 -mmmx  -msse  -msse2  -msse3  -mssse3  -msse4.1  -msse4.2  -msse4  -mavx
 -mavx2  -mavx512f  -mavx512pf  -mavx512er  -mavx512cd  -mavx512vl
 -mavx512bw  -mavx512dq  -mavx512ifma  -mavx512vbmi  -msha  -maes
@@ -35376,6 +35377,15 @@ Prefer 256-bit vector width for instruct
 Prefer 512-bit vector width for instructions.
 @end table
 
+@opindex mnoreturn-no-callee-saved-registers
+@item -mnoreturn-no-callee-saved-registers
+This option optimizes functions with @code{noreturn} attribute or
+@code{_Noreturn} specifier by not saving in the function prologue callee-saved
+registers which are used in the function (except for the @code{BP}
+register).  This option can interfere with debugging of the caller of the
+@code{noreturn} function or any function further up in the call stack, so it
+is not enabled by default.
+
 @opindex mcx16
 @item -mcx16
 This option enables GCC to generate @code{CMPXCHG16B} instructions in 64-bit
--- gcc/testsuite/gcc.target/i386/pr38534-1.c.jj	2024-02-27 14:21:00.385222600 +0100
+++ gcc/testsuite/gcc.target/i386/pr38534-1.c	2024-02-27 15:39:44.687716915 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer -mnoreturn-no-callee-saved-registers" } */
 
 #define ARRAY_SIZE 256
 
--- gcc/testsuite/gcc.target/i386/pr38534-2.c.jj	2024-02-27 14:21:00.385222600 +0100
+++ gcc/testsuite/gcc.target/i386/pr38534-2.c	2024-02-27 15:39:51.569621585 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer -mnoreturn-no-callee-saved-registers" } */
 
 extern void bar (void) __attribute__ ((no_callee_saved_registers));
 extern void fn (void) __attribute__ ((noreturn));
--- gcc/testsuite/gcc.target/i386/pr38534-3.c.jj	2024-02-27 14:21:00.385222600 +0100
+++ gcc/testsuite/gcc.target/i386/pr38534-3.c	2024-02-27 15:39:57.420540547 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer -mnoreturn-no-callee-saved-registers" } */
 
 typedef void (*fn_t) (void) __attribute__ ((no_callee_saved_registers));
 extern fn_t bar;
--- gcc/testsuite/gcc.target/i386/pr38534-4.c.jj	2024-02-27 14:21:00.385222600 +0100
+++ gcc/testsuite/gcc.target/i386/pr38534-4.c	2024-02-27 15:40:08.185391436 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer -mnoreturn-no-callee-saved-registers" } */
 
 typedef void (*fn_t) (void) __attribute__ ((no_callee_saved_registers));
 extern void fn (void) __attribute__ ((noreturn));
--- gcc/testsuite/gcc.target/i386/pr38534-5.c.jj	2024-01-30 08:45:06.904842201 +0100
+++ gcc/testsuite/gcc.target/i386/pr38534-5.c	2024-02-27 15:49:17.382784286 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O0 -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */
+/* { dg-options "-O0 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -mnoreturn-no-callee-saved-registers" } */
 
 #define ARRAY_SIZE 256
 
--- gcc/testsuite/gcc.target/i386/pr38534-6.c.jj	2024-01-30 08:45:06.904842201 +0100
+++ gcc/testsuite/gcc.target/i386/pr38534-6.c	2024-02-27 15:49:32.123580145 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -mnoreturn-no-callee-saved-registers" } */
 
 #define ARRAY_SIZE 256
 
--- gcc/testsuite/gcc.target/i386/pr114097-1.c.jj	2024-02-27 14:21:00.386222586 +0100
+++ gcc/testsuite/gcc.target/i386/pr114097-1.c	2024-02-27 15:41:12.758496992 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer -mnoreturn-no-callee-saved-registers" } */
 
 #define ARRAY_SIZE 256
 
--- gcc/testsuite/gcc.target/i386/stack-check-17.c.jj	2024-02-27 14:21:00.386222586 +0100
+++ gcc/testsuite/gcc.target/i386/stack-check-17.c	2024-02-27 15:41:42.269088224 +0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fstack-clash-protection -mtune=generic -fomit-frame-pointer" } */
+/* { dg-options "-O2 -fstack-clash-protection -mtune=generic -fomit-frame-pointer -mnoreturn-no-callee-saved-registers" } */
 /* { dg-require-effective-target supports_stack_clash_protection } */
 /* { dg-skip-if "" { *-*-* } { "-fstack-protector*" } { "" } } */
 


	Jakub


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]
  2024-02-27 14:57         ` [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534] Jakub Jelinek
@ 2024-02-28  8:00           ` Jakub Jelinek
  2024-02-28  8:53             ` Jakub Jelinek
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2024-02-28  8:00 UTC (permalink / raw)
  To: Richard Biener, Uros Bizjak, hjl.tools, gcc-patches

On Tue, Feb 27, 2024 at 03:57:52PM +0100, Jakub Jelinek wrote:
> On Tue, Feb 27, 2024 at 01:09:09PM +0100, Jakub Jelinek wrote:
> > So, IMHO either revert the changes altogether, or guard on -mcmodel=kernel
> > (but talk to kernel people on linux-toolchains if that is what they actually
> > want).
> 
> Here is a patch which guards this by non-default option, so kernel and other
> users can choose if they want this or not.  On top of the PR114116 patch.
> 
> Only lightly tested so far.

Successfully bootstrapped/regtested on x86_64-linux and i686-linux now.
Ok for trunk (if the PR114116 patch is approved too)?

> 2024-02-27  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR target/38534
> 	* config/i386/i386.opt (mnoreturn-no-callee-saved-registers): New
> 	option.
> 	* config/i386/i386-options.cc (ix86_set_func_type): Don't use
> 	TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP unless
> 	ix86_noreturn_no_callee_saved_registers is enabled.
> 	* doc/invoke.texi (-mnoreturn-no-callee-saved-registers): Document.
> 
> 	* gcc.target/i386/pr38534-1.c: Add -mnoreturn-no-callee-saved-registers
> 	to dg-options.
> 	* gcc.target/i386/pr38534-2.c: Likewise.
> 	* gcc.target/i386/pr38534-3.c: Likewise.
> 	* gcc.target/i386/pr38534-4.c: Likewise.
> 	* gcc.target/i386/pr38534-5.c: Likewise.
> 	* gcc.target/i386/pr38534-6.c: Likewise.
> 	* gcc.target/i386/pr114097-1.c: Likewise.
> 	* gcc.target/i386/stack-check-17.c: Likewise.

	Jakub


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]
  2024-02-28  8:00           ` Jakub Jelinek
@ 2024-02-28  8:53             ` Jakub Jelinek
  2024-02-29  6:20               ` Hongtao Liu
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2024-02-28  8:53 UTC (permalink / raw)
  To: Richard Biener, Uros Bizjak, hjl.tools, Hongtao Liu, Jan Hubicka
  Cc: gcc-patches

Hi!

Adding Hongtao and Honza into the loop as the ones who acked the original
patch.

The no_callee_saved_registers by default for noreturn functions change can
break in-process backtrace(3) or backtraces from debugger or other process
(quite often, any time the noreturn function decides to use the bp register
and any of the parent frames uses a frame pointer; the unwinder just crashes
in the libgcc unwinder case, gdb prints stack corrupted message), so I'd
like to save bp register in that case:

https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646591.html

and additionally the no_callee_saved_registers by default for noreturn
functions change can make debugging harder, again not localized to the
noreturn function, but any of its callers.  So, if say glibc abort function
implementation needs a lot of normally callee-saved registers, no matter how
users recompile their apps, they will see garbage or optimized out
vars/parameters in their code unless they rebuild their glibc with -O0.
So, I think we should guard that by a non-default option:

https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646649.html

Plus we need to somehow make sure to emit DW_CFA_undefined for the modified
but not saved normally callee-saved registers, so that we at least don't get
garbage in debug info.  H.J. posted some patches for that, so far I wasn't
happy about the implementation but the actual change is desirable.

Your thoughts on this?

	Jakub


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]
  2024-02-28  8:53             ` Jakub Jelinek
@ 2024-02-29  6:20               ` Hongtao Liu
  2024-02-29 12:26                 ` H.J. Lu
  2024-03-05  4:52                 ` Hongtao Liu
  0 siblings, 2 replies; 22+ messages in thread
From: Hongtao Liu @ 2024-02-29  6:20 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Biener, Uros Bizjak, hjl.tools, Jan Hubicka, gcc-patches

On Wed, Feb 28, 2024 at 4:54 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> Adding Hongtao and Honza into the loop as the ones who acked the original
> patch.
>
> The no_callee_saved_registers by default for noreturn functions change can
> break in-process backtrace(3) or backtraces from debugger or other process
> (quite often, any time the noreturn function decides to use the bp register
> and any of the parent frames uses a frame pointer; the unwinder just crashes
> in the libgcc unwinder case, gdb prints stack corrupted message), so I'd
> like to save bp register in that case:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646591.html
I think this patch makes sense and LGTM, we save and restore frame
pointer for noreturn.
>
> and additionally the no_callee_saved_registers by default for noreturn
> functions change can make debugging harder, again not localized to the
> noreturn function, but any of its callers.  So, if say glibc abort function
> implementation needs a lot of normally callee-saved registers, no matter how
> users recompile their apps, they will see garbage or optimized out
> vars/parameters in their code unless they rebuild their glibc with -O0.
> So, I think we should guard that by a non-default option:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646649.html
So it turns off the optimization for noreturn functions by default,
I'm not sure about this.
Any comments, H.J?
>
> Plus we need to somehow make sure to emit DW_CFA_undefined for the modified
> but not saved normally callee-saved registers, so that we at least don't get
> garbage in debug info.  H.J. posted some patches for that, so far I wasn't
> happy about the implementation but the actual change is desirable.
>
> Your thoughts on this?
>
>         Jakub
>


-- 
BR,
Hongtao

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]
  2024-02-29  6:20               ` Hongtao Liu
@ 2024-02-29 12:26                 ` H.J. Lu
  2024-02-29 12:47                   ` Jakub Jelinek
  2024-03-05  4:52                 ` Hongtao Liu
  1 sibling, 1 reply; 22+ messages in thread
From: H.J. Lu @ 2024-02-29 12:26 UTC (permalink / raw)
  To: Hongtao Liu
  Cc: Jakub Jelinek, Richard Biener, Uros Bizjak, Jan Hubicka, gcc-patches

On Wed, Feb 28, 2024 at 10:20 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Wed, Feb 28, 2024 at 4:54 PM Jakub Jelinek <jakub@redhat.com> wrote:
> >
> > Hi!
> >
> > Adding Hongtao and Honza into the loop as the ones who acked the original
> > patch.
> >
> > The no_callee_saved_registers by default for noreturn functions change can
> > break in-process backtrace(3) or backtraces from debugger or other process
> > (quite often, any time the noreturn function decides to use the bp register
> > and any of the parent frames uses a frame pointer; the unwinder just crashes
> > in the libgcc unwinder case, gdb prints stack corrupted message), so I'd
> > like to save bp register in that case:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646591.html
> I think this patch makes sense and LGTM, we save and restore frame
> pointer for noreturn.
> >
> > and additionally the no_callee_saved_registers by default for noreturn
> > functions change can make debugging harder, again not localized to the
> > noreturn function, but any of its callers.  So, if say glibc abort function
> > implementation needs a lot of normally callee-saved registers, no matter how
> > users recompile their apps, they will see garbage or optimized out
> > vars/parameters in their code unless they rebuild their glibc with -O0.
> > So, I think we should guard that by a non-default option:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646649.html
> So it turns off the optimization for noreturn functions by default,
> I'm not sure about this.
> Any comments, H.J?

We need BP for backtrace.  I don't think we need to save other
registers.  True, GDB may not see function parameters.  But
optimization always has this impact.  When I need to debug a
program, I always use -O0 or -Og.

> >
> > Plus we need to somehow make sure to emit DW_CFA_undefined for the modified
> > but not saved normally callee-saved registers, so that we at least don't get
> > garbage in debug info.  H.J. posted some patches for that, so far I wasn't
> > happy about the implementation but the actual change is desirable.
> >
> > Your thoughts on this?
> >
> >         Jakub
> >
>
>
> --
> BR,
> Hongtao



-- 
H.J.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]
  2024-02-29 12:26                 ` H.J. Lu
@ 2024-02-29 12:47                   ` Jakub Jelinek
  2024-02-29 13:24                     ` Richard Biener
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2024-02-29 12:47 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Hongtao Liu, Richard Biener, Uros Bizjak, Jan Hubicka, gcc-patches

On Thu, Feb 29, 2024 at 04:26:00AM -0800, H.J. Lu wrote:
> > > Adding Hongtao and Honza into the loop as the ones who acked the original
> > > patch.
> > >
> > > The no_callee_saved_registers by default for noreturn functions change can
> > > break in-process backtrace(3) or backtraces from debugger or other process
> > > (quite often, any time the noreturn function decides to use the bp register
> > > and any of the parent frames uses a frame pointer; the unwinder just crashes
> > > in the libgcc unwinder case, gdb prints stack corrupted message), so I'd
> > > like to save bp register in that case:
> > >
> > > https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646591.html
> > I think this patch makes sense and LGTM, we save and restore frame
> > pointer for noreturn.
> > >
> > > and additionally the no_callee_saved_registers by default for noreturn
> > > functions change can make debugging harder, again not localized to the
> > > noreturn function, but any of its callers.  So, if say glibc abort function
> > > implementation needs a lot of normally callee-saved registers, no matter how
> > > users recompile their apps, they will see garbage or optimized out
> > > vars/parameters in their code unless they rebuild their glibc with -O0.
> > > So, I think we should guard that by a non-default option:
> > >
> > > https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646649.html
> > So it turns off the optimization for noreturn functions by default,
> > I'm not sure about this.
> > Any comments, H.J?
> 
> We need BP for backtrace.  I don't think we need to save other
> registers.  True, GDB may not see function parameters.  But
> optimization always has this impact.  When I need to debug a
> program, I always use -O0 or -Og.

The problem is that it doesn't help in this case.
If some optimization makes debugging of some function harder, normally it is
enough to recompile the translation unit that defines it with -O0/-Og, or
add optimize attribute on the function.
While in this case, the optimization interferes with debugging of other
functions, not necessarily from the same translation unit, not necessarily
even from the same library or binary, or even from the same package.
As I tried to explain, supposedly glibc abort is compiled with -O2 and needs
a lot of registers, so say it uses all of %rbx, %rbp, %r12, %r13, %r14,
%r15 and this optimization is applied on it.  That means debugging of any
application (-O2, -Og or even -O0 compiled) to understand what went wrong
and why it aborted will be harder.  Including core file analysis.
Recompiling those apps with -O0/-Og will not help.  The only thing that
would help is to recompile glibc with -O0/-Og.
Doesn't have to be abort, doesn't have to be glibc.  Any library which
exports some noreturn APIs may be affected.
And there is not even a workaround other than to recompile with -O0/-Og the
noreturn functions, no way to disable this optimization.

Given that most users just will not be aware of this, even adding the option
but defaulting to on would mean a problem for a lot of users.  Most of them
will not know the problem is that some noreturn function 10 frames deep in
the call stack was optimized this way.

If people only call the noreturn functions from within the same package,
for some strange reason care about performance of noreturn functions (they
don't return, so unless you longjmp out of them or something similar
which is costly on its own already, they should be entered exactly once)
and are willing to pay the price in worse debugging in that case, let them
use the option.  But if they provide libraries that other packages then
consume, I'd say it wouldn't be a good idea.

	Jakub


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]
  2024-02-29 12:47                   ` Jakub Jelinek
@ 2024-02-29 13:24                     ` Richard Biener
  2024-02-29 13:31                       ` Jan Hubicka
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Biener @ 2024-02-29 13:24 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: H.J. Lu, Hongtao Liu, Uros Bizjak, Jan Hubicka, gcc-patches

On Thu, Feb 29, 2024 at 1:47 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Thu, Feb 29, 2024 at 04:26:00AM -0800, H.J. Lu wrote:
> > > > Adding Hongtao and Honza into the loop as the ones who acked the original
> > > > patch.
> > > >
> > > > The no_callee_saved_registers by default for noreturn functions change can
> > > > break in-process backtrace(3) or backtraces from debugger or other process
> > > > (quite often, any time the noreturn function decides to use the bp register
> > > > and any of the parent frames uses a frame pointer; the unwinder just crashes
> > > > in the libgcc unwinder case, gdb prints stack corrupted message), so I'd
> > > > like to save bp register in that case:
> > > >
> > > > https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646591.html
> > > I think this patch makes sense and LGTM, we save and restore frame
> > > pointer for noreturn.
> > > >
> > > > and additionally the no_callee_saved_registers by default for noreturn
> > > > functions change can make debugging harder, again not localized to the
> > > > noreturn function, but any of its callers.  So, if say glibc abort function
> > > > implementation needs a lot of normally callee-saved registers, no matter how
> > > > users recompile their apps, they will see garbage or optimized out
> > > > vars/parameters in their code unless they rebuild their glibc with -O0.
> > > > So, I think we should guard that by a non-default option:
> > > >
> > > > https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646649.html
> > > So it turns off the optimization for noreturn functions by default,
> > > I'm not sure about this.
> > > Any comments, H.J?
> >
> > We need BP for backtrace.  I don't think we need to save other
> > registers.  True, GDB may not see function parameters.  But
> > optimization always has this impact.  When I need to debug a
> > program, I always use -O0 or -Og.
>
> The problem is that it doesn't help in this case.
> If some optimization makes debugging of some function harder, normally it is
> enough to recompile the translation unit that defines it with -O0/-Og, or
> add optimize attribute on the function.
> While in this case, the optimization interferes with debugging of other
> functions, not necessarily from the same translation unit, not necessarily
> even from the same library or binary, or even from the same package.
> As I tried to explain, supposedly glibc abort is compiled with -O2 and needs
> a lot of registers, so say it uses all of %rbx, %rbp, %r12, %r13, %r14,
> %r15 and this optimization is applied on it.  That means debugging of any
> application (-O2, -Og or even -O0 compiled) to understand what went wrong
> and why it aborted will be harder.  Including core file analysis.
> Recompiling those apps with -O0/-Og will not help.  The only thing that
> would help is to recompile glibc with -O0/-Og.
> Doesn't have to be abort, doesn't have to be glibc.  Any library which
> exports some noreturn APIs may be affected.
> And there is not even a workaround other than to recompile with -O0/-Og the
> noreturn functions, no way to disable this optimization.
>
> Given that most users just will not be aware of this, even adding the option
> but defaulting to on would mean a problem for a lot of users.  Most of them
> will not know the problem is that some noreturn function 10 frames deep in
> the call stack was optimized this way.
>
> If people only call the noreturn functions from within the same package,
> for some strange reason care about performance of noreturn functions (they
> don't return, so unless you longjmp out of them or something similar
> which is costly on its own already, they should be entered exactly once)
> and are willing to pay the price in worse debugging in that case, let them
> use the option.  But if they provide libraries that other packages then
> consume, I'd say it wouldn't be a good idea.

+1

I'll definitely patch this by-default behavior out if we as upstream keep it.
Debugging customer core dumps is more important than optimizing
glibc abort/assert.

I do hope such patch will be at least easy, like flipping the default of an
option.

Richard.

>         Jakub
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]
  2024-02-29 13:24                     ` Richard Biener
@ 2024-02-29 13:31                       ` Jan Hubicka
  2024-02-29 13:56                         ` Jakub Jelinek
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Hubicka @ 2024-02-29 13:31 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jakub Jelinek, H.J. Lu, Hongtao Liu, Uros Bizjak, gcc-patches

> >
> > The problem is that it doesn't help in this case.
> > If some optimization makes debugging of some function harder, normally it is
> > enough to recompile the translation unit that defines it with -O0/-Og, or
> > add optimize attribute on the function.
> > While in this case, the optimization interferes with debugging of other
> > functions, not necessarily from the same translation unit, not necessarily
> > even from the same library or binary, or even from the same package.
> > As I tried to explain, supposedly glibc abort is compiled with -O2 and needs
> > a lot of registers, so say it uses all of %rbx, %rbp, %r12, %r13, %r14,
> > %r15 and this optimization is applied on it.  That means debugging of any
> > application (-O2, -Og or even -O0 compiled) to understand what went wrong
> > and why it aborted will be harder.  Including core file analysis.
> > Recompiling those apps with -O0/-Og will not help.  The only thing that
> > would help is to recompile glibc with -O0/-Og.
> > Doesn't have to be abort, doesn't have to be glibc.  Any library which
> > exports some noreturn APIs may be affected.
> > And there is not even a workaround other than to recompile with -O0/-Og the
> > noreturn functions, no way to disable this optimization.
> >
> > Given that most users just will not be aware of this, even adding the option
> > but defaulting to on would mean a problem for a lot of users.  Most of them
> > will not know the problem is that some noreturn function 10 frames deep in
> > the call stack was optimized this way.
> >
> > If people only call the noreturn functions from within the same package,
> > for some strange reason care about performance of noreturn functions (they
> > don't return, so unless you longjmp out of them or something similar
> > which is costly on its own already, they should be entered exactly once)
> > and are willing to pay the price in worse debugging in that case, let them
> > use the option.  But if they provide libraries that other packages then
> > consume, I'd say it wouldn't be a good idea.
> 
> +1
> 
> I'll definitely patch this by-default behavior out if we as upstream keep it.
> Debugging customer core dumps is more important than optimizing
> glibc abort/assert.
> 
> I do hope such patch will be at least easy, like flipping the default of an
> option.

I agree that debugability of user core dumps is important here.

I guess an ideal solution would be to change codegen of noreturn functions
to callee save all registers. Performance of prologue of noreturn
function is not too important. THen we can stop caller saving registers
and still get reasonable backtraces.

This is essentially an ABI change (though kind of conservative one since
nothing except debugging really depends on it).  Perhaps it would make
sense to make the optimization non-default option now and also implement
the callee save logic. Then we should be able to flip release or two
later. Maybe synchroniation with LLVM would be desirable here if we
decide to go this route.

Honza
> 
> Richard.
> 
> >         Jakub
> >

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]
  2024-02-29 13:31                       ` Jan Hubicka
@ 2024-02-29 13:56                         ` Jakub Jelinek
  2024-02-29 14:15                           ` Jan Hubicka
  0 siblings, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2024-02-29 13:56 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Richard Biener, H.J. Lu, Hongtao Liu, Uros Bizjak, gcc-patches

On Thu, Feb 29, 2024 at 02:31:05PM +0100, Jan Hubicka wrote:
> I agree that debugability of user core dumps is important here.
> 
> I guess an ideal solution would be to change codegen of noreturn functions
> to callee save all registers. Performance of prologue of noreturn
> function is not too important. THen we can stop caller saving registers
> and still get reasonable backtraces.

I don't think that is possible.
While both C and C++ require that if [[noreturn]] attribute is used on
some function declaration, it must be used on the first declaration and
also if some function is [[noreturn]] in one TU, it must be [[noreturn]]
in all other TUs which declare the same function.
But, we have no such requirement for __attribute__((noreturn)), there it
is a pure optimization, it can be declared just on the caller side as an
optimization hint the function will not return, or just on the callee side
where the compiler will actually verify it doesn't return, or both.
And, the attribute is not part of function type, so even in standard C/C++,
one can use
extern void bar ();
[[noreturn]] void foo ()
{
  for (;;) bar ();
}
void (*fn) () = foo;
void baz ()
{
  fn ();
}
As you can call the noreturn function directly or indirectly, changing
calling conventions based on noreturn vs. no-noreturn is IMHO not possible.

	Jakub


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]
  2024-02-29 13:56                         ` Jakub Jelinek
@ 2024-02-29 14:15                           ` Jan Hubicka
  2024-02-29 14:28                             ` H.J. Lu
  2024-02-29 15:10                             ` Jakub Jelinek
  0 siblings, 2 replies; 22+ messages in thread
From: Jan Hubicka @ 2024-02-29 14:15 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Biener, H.J. Lu, Hongtao Liu, Uros Bizjak, gcc-patches

> On Thu, Feb 29, 2024 at 02:31:05PM +0100, Jan Hubicka wrote:
> > I agree that debugability of user core dumps is important here.
> > 
> > I guess an ideal solution would be to change codegen of noreturn functions
> > to callee save all registers. Performance of prologue of noreturn
> > function is not too important. THen we can stop caller saving registers
> > and still get reasonable backtraces.
> 
> I don't think that is possible.
> While both C and C++ require that if [[noreturn]] attribute is used on
> some function declaration, it must be used on the first declaration and
> also if some function is [[noreturn]] in one TU, it must be [[noreturn]]
> in all other TUs which declare the same function.
> But, we have no such requirement for __attribute__((noreturn)), there it
> is a pure optimization, it can be declared just on the caller side as an
> optimization hint the function will not return, or just on the callee side
> where the compiler will actually verify it doesn't return, or both.
> And, the attribute is not part of function type, so even in standard C/C++,
> one can use
> extern void bar ();
> [[noreturn]] void foo ()
> {
>   for (;;) bar ();
> }
> void (*fn) () = foo;
> void baz ()
> {
>   fn ();
> }
> As you can call the noreturn function directly or indirectly, changing
> calling conventions based on noreturn vs. no-noreturn is IMHO not possible.

I am not wed to the idea (just it appeared to me as an option to
disabling this optimization by default). I still think it may make sense.

Making noreturn calles to save caller saved register is compatible with
the default ABI.  If noreturn is missing on caller side, then caller will
save reigsters as usual. Noreturn callee will save them again, which is
pointless, but everything should work as usual and extra cost of saving
should not matter in practice.  This is also the case of indirect call
of noreturn function where you miss annotation on caller side.

If noreturn is missing on callee side, we will lose information on
functions arguments in backtrace, but the code will still work
(especially if we save BP register to make code backtraceable).  This is
scenario that probably can be avoided in practice where it matters (such
as in glibc abort whose implementation is annotated).  

Noreturn already leads to some information loss in backtraces. I tend to
get surprised from time to time to see whrong call to abort due to tail
merging. So it may be acceptable to lose info in a situation where user
does sily thing and only annotates caller.

Since we auto-detect noreturn, we may need to be extra careful about noreturn
comdats. Here auto-detection of prevailing def may have different
outcome than auto-detection of prevailed defs. So we may want to disable
the optimization for auto-detected comdats.

Honza
> 
> 	Jakub
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]
  2024-02-29 14:15                           ` Jan Hubicka
@ 2024-02-29 14:28                             ` H.J. Lu
  2024-02-29 15:10                             ` Jakub Jelinek
  1 sibling, 0 replies; 22+ messages in thread
From: H.J. Lu @ 2024-02-29 14:28 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Jakub Jelinek, Richard Biener, Hongtao Liu, Uros Bizjak, gcc-patches

On Thu, Feb 29, 2024 at 6:15 AM Jan Hubicka <hubicka@ucw.cz> wrote:
>
> > On Thu, Feb 29, 2024 at 02:31:05PM +0100, Jan Hubicka wrote:
> > > I agree that debugability of user core dumps is important here.
> > >
> > > I guess an ideal solution would be to change codegen of noreturn functions
> > > to callee save all registers. Performance of prologue of noreturn
> > > function is not too important. THen we can stop caller saving registers
> > > and still get reasonable backtraces.
> >
> > I don't think that is possible.
> > While both C and C++ require that if [[noreturn]] attribute is used on
> > some function declaration, it must be used on the first declaration and
> > also if some function is [[noreturn]] in one TU, it must be [[noreturn]]
> > in all other TUs which declare the same function.
> > But, we have no such requirement for __attribute__((noreturn)), there it
> > is a pure optimization, it can be declared just on the caller side as an
> > optimization hint the function will not return, or just on the callee side
> > where the compiler will actually verify it doesn't return, or both.
> > And, the attribute is not part of function type, so even in standard C/C++,
> > one can use
> > extern void bar ();
> > [[noreturn]] void foo ()
> > {
> >   for (;;) bar ();
> > }
> > void (*fn) () = foo;
> > void baz ()
> > {
> >   fn ();
> > }
> > As you can call the noreturn function directly or indirectly, changing
> > calling conventions based on noreturn vs. no-noreturn is IMHO not possible.
>
> I am not wed to the idea (just it appeared to me as an option to
> disabling this optimization by default). I still think it may make sense.
>
> Making noreturn calles to save caller saved register is compatible with
> the default ABI.  If noreturn is missing on caller side, then caller will
> save reigsters as usual. Noreturn callee will save them again, which is
> pointless, but everything should work as usual and extra cost of saving
> should not matter in practice.  This is also the case of indirect call
> of noreturn function where you miss annotation on caller side.
>
> If noreturn is missing on callee side, we will lose information on
> functions arguments in backtrace, but the code will still work
> (especially if we save BP register to make code backtraceable).  This is
> scenario that probably can be avoided in practice where it matters (such
> as in glibc abort whose implementation is annotated).
>
> Noreturn already leads to some information loss in backtraces. I tend to
> get surprised from time to time to see whrong call to abort due to tail
> merging. So it may be acceptable to lose info in a situation where user
> does sily thing and only annotates caller.
>
> Since we auto-detect noreturn, we may need to be extra careful about noreturn
> comdats. Here auto-detection of prevailing def may have different
> outcome than auto-detection of prevailed defs. So we may want to disable
> the optimization for auto-detected comdats.
>

There are 2 kinds of noreturns.  One is abort which may require backtrace.
The other is a normal exit from the previous frame.  The latter case doesn't
require backtrace and can be performance critical.  Which one is more
important for users?

-- 
H.J.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]
  2024-02-29 14:15                           ` Jan Hubicka
  2024-02-29 14:28                             ` H.J. Lu
@ 2024-02-29 15:10                             ` Jakub Jelinek
  2024-02-29 15:26                               ` Jan Hubicka
  1 sibling, 1 reply; 22+ messages in thread
From: Jakub Jelinek @ 2024-02-29 15:10 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Richard Biener, H.J. Lu, Hongtao Liu, Uros Bizjak, gcc-patches

On Thu, Feb 29, 2024 at 03:15:30PM +0100, Jan Hubicka wrote:
> I am not wed to the idea (just it appeared to me as an option to
> disabling this optimization by default). I still think it may make sense.

Maybe I misunderstood your idea.
So, you are basically suggesting to go in a completely opposite direction
from H.J.'s changes, instead of saving less in noreturn prologues, save
more, most likely not change anything in code generation on the caller's
side (nothing is kept alive across visible unconditional noreturn calls)
and maybe after some time use normally caller-saved registers in say call
frame debug info for possible use in DW_OP_entry_value (but in that case it
is an ABI change) and improve debuggability of vars which live in normally
caller-saved registers right before a noreturn call in that frame?
What about registers in which function arguments are passed?

If it is done as a change with no changes at all on the caller side and
just on the callee side, it could again be guarded by some option (not sure
what the default would be) where the user could increase debuggability of
the noreturn caller (in that case always necessarily just a single immediate
one; while not doing the callee saved register saves improves debuggability
in perhaps multiple routines in the call stack, depending on which registers
wouldn't be saved and which registers are used in each of the caller frames;
e.g. noreturn function could save/not save all of %rbx, %r1[2345], one
caller somewhere use %r12, another %rbx and %r13, yet another one %r14 and
%r15).

	Jakub


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]
  2024-02-29 15:10                             ` Jakub Jelinek
@ 2024-02-29 15:26                               ` Jan Hubicka
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Hubicka @ 2024-02-29 15:26 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Biener, H.J. Lu, Hongtao Liu, Uros Bizjak, gcc-patches

> On Thu, Feb 29, 2024 at 03:15:30PM +0100, Jan Hubicka wrote:
> > I am not wed to the idea (just it appeared to me as an option to
> > disabling this optimization by default). I still think it may make sense.
> 
> Maybe I misunderstood your idea.
> So, you are basically suggesting to go in a completely opposite direction
> from H.J.'s changes, instead of saving less in noreturn prologues, save
> more, most likely not change anything in code generation on the caller's
> side (nothing is kept alive across visible unconditional noreturn calls)
> and maybe after some time use normally caller-saved registers in say call
> frame debug info for possible use in DW_OP_entry_value (but in that case it
> is an ABI change) and improve debuggability of vars which live in normally
> caller-saved registers right before a noreturn call in that frame?
> What about registers in which function arguments are passed?

Hmm, you are right - I got it backwards. 
> 
> If it is done as a change with no changes at all on the caller side and
> just on the callee side, it could again be guarded by some option (not sure
> what the default would be) where the user could increase debuggability of
> the noreturn caller (in that case always necessarily just a single immediate
> one; while not doing the callee saved register saves improves debuggability
> in perhaps multiple routines in the call stack, depending on which registers
> wouldn't be saved and which registers are used in each of the caller frames;
> e.g. noreturn function could save/not save all of %rbx, %r1[2345], one
> caller somewhere use %r12, another %rbx and %r13, yet another one %r14 and
> %r15).

I am not sure how much practical value this would get, but in any case
it is indpeendent of the discussed patch.

Sorry for the confussion,
Honza
> 
> 	Jakub
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: For noreturn functions save at least the bp register if it is used [PR114116]
  2024-02-27 12:09       ` Jakub Jelinek
  2024-02-27 14:57         ` [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534] Jakub Jelinek
@ 2024-02-29 16:25         ` Michael Matz
  1 sibling, 0 replies; 22+ messages in thread
From: Michael Matz @ 2024-02-29 16:25 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, Uros Bizjak, hjl.tools, gcc-patches

Hello,

On Tue, 27 Feb 2024, Jakub Jelinek wrote:

> On Tue, Feb 27, 2024 at 10:13:14AM +0100, Jakub Jelinek wrote:
> > For __libc_start_main, glibc surely just could use no_callee_saved_registers
> > attribute, because that is typically the outermost frame in backtrace,
> > there is no need to save those there.
> > And for kernel if it really wants it and nothing will use the backtraces,
> > perhaps the patch wouldn't need to be reverted completely but just guarded
> > the implicit no_callee_saved_registers treatment of noreturn
> > functions on -mcmodel=kernel or -fno-asynchronous-unwind-tables.
> 
> Guarding on -fno-asynchronous-unwind-tables isn't a good idea,
> with just -g we emit in that case unwind info in .debug_frame section
> and even that shouldn't break, and we shouldn't generate different code for
> -g vs. -g0.
> The problem with the changes is that it breaks the unwinding and debugging
> experience not just in the functions on which the optimization triggers,
> but on all functions in the backtrace as well.
> 
> So, IMHO either revert the changes altogether, or guard on -mcmodel=kernel
> (but talk to kernel people on linux-toolchains if that is what they actually
> want).

What is the underlying real purpose of the changes anyway?  It's a 
nano-optimization: for functions to be called multiple times 
they must either return or be recursive.  The latter is not very likely, 
so a noreturn function is called only once in the vast majority of cases.  
Any optimizations that diddle with the frame setup code for functions 
called only once seems to be ... well, not so very useful, especially so 
when they impact anything that is actually useful, like debugging.

I definitely think this shouldn't be done by default.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]
  2024-02-29  6:20               ` Hongtao Liu
  2024-02-29 12:26                 ` H.J. Lu
@ 2024-03-05  4:52                 ` Hongtao Liu
  1 sibling, 0 replies; 22+ messages in thread
From: Hongtao Liu @ 2024-03-05  4:52 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Biener, Uros Bizjak, hjl.tools, Jan Hubicka, gcc-patches

On Thu, Feb 29, 2024 at 2:20 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Wed, Feb 28, 2024 at 4:54 PM Jakub Jelinek <jakub@redhat.com> wrote:
> >
> > Hi!
> >
> > Adding Hongtao and Honza into the loop as the ones who acked the original
> > patch.
> >
> > The no_callee_saved_registers by default for noreturn functions change can
> > break in-process backtrace(3) or backtraces from debugger or other process
> > (quite often, any time the noreturn function decides to use the bp register
> > and any of the parent frames uses a frame pointer; the unwinder just crashes
> > in the libgcc unwinder case, gdb prints stack corrupted message), so I'd
> > like to save bp register in that case:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646591.html
> I think this patch makes sense and LGTM, we save and restore frame
> pointer for noreturn.
> >
> > and additionally the no_callee_saved_registers by default for noreturn
> > functions change can make debugging harder, again not localized to the
> > noreturn function, but any of its callers.  So, if say glibc abort function
> > implementation needs a lot of normally callee-saved registers, no matter how
> > users recompile their apps, they will see garbage or optimized out
> > vars/parameters in their code unless they rebuild their glibc with -O0.
> > So, I think we should guard that by a non-default option:
From what has been discussed so far, I am inclined to this proposal.
If there are no additional objections(or concerns) in a few days, ok
for the trunk.
> >
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2024-03-05  4:52 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-27  8:40 [PATCH] i386: For noreturn functions save at least the bp register if it is used [PR114116] Jakub Jelinek
2024-02-27  8:54 ` Richard Biener
2024-02-27  9:04   ` Jakub Jelinek
2024-02-27  9:13     ` Jakub Jelinek
2024-02-27  9:50       ` Richard Biener
2024-02-27  9:55       ` Jakub Jelinek
2024-02-27 12:09       ` Jakub Jelinek
2024-02-27 14:57         ` [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534] Jakub Jelinek
2024-02-28  8:00           ` Jakub Jelinek
2024-02-28  8:53             ` Jakub Jelinek
2024-02-29  6:20               ` Hongtao Liu
2024-02-29 12:26                 ` H.J. Lu
2024-02-29 12:47                   ` Jakub Jelinek
2024-02-29 13:24                     ` Richard Biener
2024-02-29 13:31                       ` Jan Hubicka
2024-02-29 13:56                         ` Jakub Jelinek
2024-02-29 14:15                           ` Jan Hubicka
2024-02-29 14:28                             ` H.J. Lu
2024-02-29 15:10                             ` Jakub Jelinek
2024-02-29 15:26                               ` Jan Hubicka
2024-03-05  4:52                 ` Hongtao Liu
2024-02-29 16:25         ` [PATCH] i386: For noreturn functions save at least the bp register if it is used [PR114116] Michael Matz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).