public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH v5 1/5] Improve must tail in RTL backend
@ 2024-05-05 18:14 Andi Kleen
  2024-05-05 18:14 ` [PATCH v5 2/5] C++: Support clang compatible [[musttail]] (PR83324) Andi Kleen
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Andi Kleen @ 2024-05-05 18:14 UTC (permalink / raw)
  To: gcc-patches; +Cc: nathan, josmyers, richard.sandiford, jason, Andi Kleen

- Give error messages for all causes of non sibling call generation
- Don't override choices of other non sibling call checks with
must tail. This causes ICEs. The must tail attribute now only
overrides flag_optimize_sibling_calls locally.
- Error out when tree-tailcall failed to mark a must-tail call
sibcall. In this case it doesn't know the true reason and only gives
a vague message (this could be improved, but it's already useful without
that) tree-tailcall usually fails without optimization, so must
adjust the existing must-tail plugin test to specify -O2.

	PR83324

gcc/ChangeLog:

	* calls.cc (expand_call): Fix mustcall implementation.

gcc/testsuite/ChangeLog:

	* gcc.dg/plugin/must-tail-call-1.c: Adjust.
---
 gcc/calls.cc                                  | 30 ++++++++++++-------
 .../gcc.dg/plugin/must-tail-call-1.c          |  1 +
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/gcc/calls.cc b/gcc/calls.cc
index 21d78f9779fe..a6b8ee44cc29 100644
--- a/gcc/calls.cc
+++ b/gcc/calls.cc
@@ -2650,7 +2650,9 @@ expand_call (tree exp, rtx target, int ignore)
   /* The type of the function being called.  */
   tree fntype;
   bool try_tail_call = CALL_EXPR_TAILCALL (exp);
-  bool must_tail_call = CALL_EXPR_MUST_TAIL_CALL (exp);
+  /* tree-tailcall decided not to do tail calls. Error for the musttail case.  */
+  if (!try_tail_call)
+      maybe_complain_about_tail_call (exp, "other reasons");
   int pass;
 
   /* Register in which non-BLKmode value will be returned,
@@ -3022,10 +3024,22 @@ expand_call (tree exp, rtx target, int ignore)
      pushed these optimizations into -O2.  Don't try if we're already
      expanding a call, as that means we're an argument.  Don't try if
      there's cleanups, as we know there's code to follow the call.  */
-  if (currently_expanding_call++ != 0
-      || (!flag_optimize_sibling_calls && !CALL_FROM_THUNK_P (exp))
-      || args_size.var
-      || dbg_cnt (tail_call) == false)
+  if (currently_expanding_call++ != 0)
+    {
+      maybe_complain_about_tail_call (exp, "inside another call");
+      try_tail_call = 0;
+    }
+  if (!flag_optimize_sibling_calls
+	&& !CALL_FROM_THUNK_P (exp)
+	&& !CALL_EXPR_MUST_TAIL_CALL (exp))
+    try_tail_call = 0;
+  if (args_size.var)
+    {
+      /* ??? correct message?  */
+      maybe_complain_about_tail_call (exp, "stack space needed");
+      try_tail_call = 0;
+    }
+  if (dbg_cnt (tail_call) == false)
     try_tail_call = 0;
 
   /* Workaround buggy C/C++ wrappers around Fortran routines with
@@ -3046,15 +3060,11 @@ expand_call (tree exp, rtx target, int ignore)
 	    if (MEM_P (*iter))
 	      {
 		try_tail_call = 0;
+		maybe_complain_about_tail_call (exp, "hidden string length argument");
 		break;
 	      }
 	}
 
-  /* If the user has marked the function as requiring tail-call
-     optimization, attempt it.  */
-  if (must_tail_call)
-    try_tail_call = 1;
-
   /*  Rest of purposes for tail call optimizations to fail.  */
   if (try_tail_call)
     try_tail_call = can_implement_as_sibling_call_p (exp,
diff --git a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
index 3a6d4cceaba7..44af361e2925 100644
--- a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
@@ -1,4 +1,5 @@
 /* { dg-do compile { target tail_call } } */
+/* { dg-options "-O2" } */
 /* { dg-options "-fdelayed-branch" { target sparc*-*-* } } */
 
 extern void abort (void);
-- 
2.44.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v5 2/5] C++: Support clang compatible [[musttail]] (PR83324)
  2024-05-05 18:14 [PATCH v5 1/5] Improve must tail in RTL backend Andi Kleen
@ 2024-05-05 18:14 ` Andi Kleen
  2024-05-07  3:02   ` Jason Merrill
  2024-05-05 18:14 ` [PATCH v5 3/5] C: Implement musttail attribute for returns Andi Kleen
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2024-05-05 18:14 UTC (permalink / raw)
  To: gcc-patches; +Cc: nathan, josmyers, richard.sandiford, jason, Andi Kleen

This patch implements a clang compatible [[musttail]] attribute for
returns.

musttail is useful as an alternative to computed goto for interpreters.
With computed goto the interpreter function usually ends up very big
which causes problems with register allocation and other per function
optimizations not scaling. With musttail the interpreter can be instead
written as a sequence of smaller functions that call each other. To
avoid unbounded stack growth this requires forcing a sibling call, which
this attribute does. It guarantees an error if the call cannot be tail
called which allows the programmer to fix it instead of risking a stack
overflow. Unlike computed goto it is also type-safe.

It turns out that David Malcolm had already implemented middle/backend
support for a musttail attribute back in 2016, but it wasn't exposed
to any frontend other than a special plugin.

This patch adds a [[gnu::musttail]] attribute for C++ that can be added
to return statements. The return statement must be a direct call
(it does not follow dependencies), which is similar to what clang
implements. It then uses the existing must tail infrastructure.

For compatibility it also detects clang::musttail

One problem is that tree-tailcall usually fails when optimization
is disabled, which implies the attribute only really works with
optimization on. But that seems to be a reasonable limitation.

Passes bootstrap and full test

	PR83324

gcc/cp/ChangeLog:

	* cp-tree.h (finish_return_stmt): Add musttail_p.
	(check_return_expr): Dito.
	* parser.cc (cp_parser_statement): Handle [[musttail]].
	(cp_parser_std_attribute): Dito.
	(cp_parser_init_statement): Dito.
	(cp_parser_jump_statement): Dito.
	* semantics.cc (finish_return_stmt): Dito.
	* typeck.cc (check_return_expr): Handle musttail_p flag.
---
 gcc/cp/cp-tree.h    |  4 ++--
 gcc/cp/parser.cc    | 30 ++++++++++++++++++++++++------
 gcc/cp/semantics.cc |  6 +++---
 gcc/cp/typeck.cc    | 20 ++++++++++++++++++--
 4 files changed, 47 insertions(+), 13 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 52d6841559ca..ef5f0039ece2 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7782,7 +7782,7 @@ extern void finish_while_stmt			(tree);
 extern tree begin_do_stmt			(void);
 extern void finish_do_body			(tree);
 extern void finish_do_stmt		(tree, tree, bool, tree, bool);
-extern tree finish_return_stmt			(tree);
+extern tree finish_return_stmt			(tree, bool = false);
 extern tree begin_for_scope			(tree *);
 extern tree begin_for_stmt			(tree, tree);
 extern void finish_init_stmt			(tree);
@@ -8294,7 +8294,7 @@ extern tree composite_pointer_type		(const op_location_t &,
 						 tsubst_flags_t);
 extern tree merge_types				(tree, tree);
 extern tree strip_array_domain			(tree);
-extern tree check_return_expr			(tree, bool *, bool *);
+extern tree check_return_expr			(tree, bool *, bool *, bool);
 extern tree spaceship_type			(tree, tsubst_flags_t = tf_warning_or_error);
 extern tree genericize_spaceship		(location_t, tree, tree, tree);
 extern tree cp_build_binary_op                  (const op_location_t &,
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 66ce161252c7..e1bf92628ac3 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -2467,7 +2467,7 @@ static tree cp_parser_perform_range_for_lookup
 static tree cp_parser_range_for_member_function
   (tree, tree);
 static tree cp_parser_jump_statement
-  (cp_parser *);
+  (cp_parser *, bool = false);
 static void cp_parser_declaration_statement
   (cp_parser *);
 
@@ -12734,9 +12734,27 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
 						     NULL_TREE, false);
 	  break;
 
+	case RID_RETURN:
+	  {
+	    bool musttail_p = false;
+	    std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
+	    if (lookup_attribute ("gnu", "musttail", std_attrs))
+	      {
+		musttail_p = true;
+		std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
+	      }
+	    // support this for compatibility
+	    if (lookup_attribute ("clang", "musttail", std_attrs))
+	      {
+		musttail_p = true;
+		std_attrs = remove_attribute ("clang", "musttail", std_attrs);
+	      }
+	    statement = cp_parser_jump_statement (parser, musttail_p);
+	  }
+	  break;
+
 	case RID_BREAK:
 	case RID_CONTINUE:
-	case RID_RETURN:
 	case RID_CO_RETURN:
 	case RID_GOTO:
 	  std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
@@ -14782,7 +14800,7 @@ cp_parser_init_statement (cp_parser *parser, tree *decl)
   return false;
 }
 
-/* Parse a jump-statement.
+/* Parse a jump-statement.  MUSTTAIL_P indicates a musttail attribute.
 
    jump-statement:
      break ;
@@ -14800,7 +14818,7 @@ cp_parser_init_statement (cp_parser *parser, tree *decl)
    Returns the new BREAK_STMT, CONTINUE_STMT, RETURN_EXPR, or GOTO_EXPR.  */
 
 static tree
-cp_parser_jump_statement (cp_parser* parser)
+cp_parser_jump_statement (cp_parser* parser, bool musttail_p)
 {
   tree statement = error_mark_node;
   cp_token *token;
@@ -14884,7 +14902,7 @@ cp_parser_jump_statement (cp_parser* parser)
 	else if (FNDECL_USED_AUTO (current_function_decl) && in_discarded_stmt)
 	  /* Don't deduce from a discarded return statement.  */;
 	else
-	  statement = finish_return_stmt (expr);
+	  statement = finish_return_stmt (expr, musttail_p);
 	/* Look for the final `;'.  */
 	cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
       }
@@ -30189,7 +30207,7 @@ cp_parser_std_attribute (cp_parser *parser, tree attr_ns)
     /* Maybe we don't expect to see any arguments for this attribute.  */
     const attribute_spec *as
       = lookup_attribute_spec (TREE_PURPOSE (attribute));
-    if (as && as->max_length == 0)
+    if ((as && as->max_length == 0) || is_attribute_p ("musttail", attr_id))
       {
 	error_at (token->location, "%qE attribute does not take any arguments",
 		  attr_id);
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index b8c2bf8771fc..cac1a949c7d3 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -1387,16 +1387,16 @@ finish_do_stmt (tree cond, tree do_stmt, bool ivdep, tree unroll,
 }
 
 /* Finish a return-statement.  The EXPRESSION returned, if any, is as
-   indicated.  */
+   indicated.  MUSTTAIL_P indicates a mustcall attribute.  */
 
 tree
-finish_return_stmt (tree expr)
+finish_return_stmt (tree expr, bool musttail_p)
 {
   tree r;
   bool no_warning;
   bool dangling;
 
-  expr = check_return_expr (expr, &no_warning, &dangling);
+  expr = check_return_expr (expr, &no_warning, &dangling, musttail_p);
 
   if (error_operand_p (expr)
       || (flag_openmp && !check_omp_return ()))
diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index a25f8622651d..9a5668796d2e 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -11036,10 +11036,12 @@ maybe_warn_pessimizing_move (tree expr, tree type, bool return_p)
    the DECL_RESULT for the function.  Set *NO_WARNING to true if
    code reaches end of non-void function warning shouldn't be issued
    on this RETURN_EXPR.  Set *DANGLING to true if code returns the
-   address of a local variable.  */
+   address of a local variable.  MUSTTAIL_P indicates a musttail
+   return.  */
 
 tree
-check_return_expr (tree retval, bool *no_warning, bool *dangling)
+check_return_expr (tree retval, bool *no_warning, bool *dangling,
+		   bool musttail_p)
 {
   tree result;
   /* The type actually returned by the function.  */
@@ -11053,6 +11055,20 @@ check_return_expr (tree retval, bool *no_warning, bool *dangling)
   *no_warning = false;
   *dangling = false;
 
+  if (musttail_p)
+    {
+      tree t = retval;
+      if (TREE_CODE (t) == TARGET_EXPR)
+	t = TARGET_EXPR_INITIAL (retval);
+      if (TREE_CODE (t) != CALL_EXPR)
+	{
+	  error_at (loc, "cannot tail-call: return value must be a call");
+	  return error_mark_node;
+	}
+      else
+	CALL_EXPR_MUST_TAIL_CALL (t) = 1;
+    }
+
   /* A `volatile' function is one that isn't supposed to return, ever.
      (This is a G++ extension, used to get better code for functions
      that call the `volatile' function.)  */
-- 
2.44.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v5 3/5] C: Implement musttail attribute for returns
  2024-05-05 18:14 [PATCH v5 1/5] Improve must tail in RTL backend Andi Kleen
  2024-05-05 18:14 ` [PATCH v5 2/5] C++: Support clang compatible [[musttail]] (PR83324) Andi Kleen
@ 2024-05-05 18:14 ` Andi Kleen
  2024-05-05 18:14 ` [PATCH v5 4/5] Add tests for C/C++ musttail attributes Andi Kleen
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2024-05-05 18:14 UTC (permalink / raw)
  To: gcc-patches; +Cc: nathan, josmyers, richard.sandiford, jason, Andi Kleen

Implement a C23 clang compatible musttail attribute similar to the earlier
C++ implementation in the C parser.

	PR83324

gcc/c/ChangeLog:

	* c-parser.cc (struct attr_state): Define with musttail_p.
	(c_parser_statement_after_labels): Handle [[musttail]]
	(c_parser_std_attribute): Dito.
	(c_parser_handle_musttail): Dito.
	(c_parser_compound_statement_nostart): Dito.
	(c_parser_all_labels): Dito.
	(c_parser_statement): Dito.
	* c-tree.h (c_finish_return): Add musttail_p flag.
	* c-typeck.cc (c_finish_return): Handle musttail_p flag.
---
 gcc/c/c-parser.cc | 61 +++++++++++++++++++++++++++++++++++++----------
 gcc/c/c-tree.h    |  2 +-
 gcc/c/c-typeck.cc | 15 ++++++++++--
 3 files changed, 63 insertions(+), 15 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 00f8bf4376e5..9edadb0fee96 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1616,6 +1616,11 @@ struct omp_for_parse_data {
   bool fail : 1;
 };
 
+struct attr_state
+{
+  bool musttail_p; // parsed a musttail for return
+};
+
 static bool c_parser_nth_token_starts_std_attributes (c_parser *,
 						      unsigned int);
 static tree c_parser_std_attribute_specifier_sequence (c_parser *);
@@ -1660,7 +1665,7 @@ static location_t c_parser_compound_statement_nostart (c_parser *);
 static void c_parser_label (c_parser *, tree);
 static void c_parser_statement (c_parser *, bool *, location_t * = NULL);
 static void c_parser_statement_after_labels (c_parser *, bool *,
-					     vec<tree> * = NULL);
+					     vec<tree> * = NULL, attr_state = {});
 static tree c_parser_c99_block_statement (c_parser *, bool *,
 					  location_t * = NULL);
 static void c_parser_if_statement (c_parser *, bool *, vec<tree> *);
@@ -5756,6 +5761,8 @@ c_parser_std_attribute (c_parser *parser, bool for_tm)
 	}
       goto out;
     }
+  else if (is_attribute_p ("musttail", name))
+    error ("%<musttail%> attribute has arguments");
   {
     location_t open_loc = c_parser_peek_token (parser)->location;
     matching_parens parens;
@@ -6941,6 +6948,28 @@ c_parser_handle_directive_omp_attributes (tree &attrs,
     }
 }
 
+/* Check if STD_ATTR contains a musttail attribute and handle it
+   PARSER is the parser and A is the output attr_state.  */
+
+static tree
+c_parser_handle_musttail (c_parser *parser, tree std_attrs, attr_state &a)
+{
+  if (c_parser_next_token_is_keyword (parser, RID_RETURN))
+    {
+      if (lookup_attribute ("gnu", "musttail", std_attrs))
+	{
+	  std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
+	  a.musttail_p = true;
+	}
+      if (lookup_attribute ("clang", "musttail", std_attrs))
+	{
+	  std_attrs = remove_attribute ("clang", "musttail", std_attrs);
+	  a.musttail_p = true;
+	}
+    }
+  return std_attrs;
+}
+
 /* Parse a compound statement except for the opening brace.  This is
    used for parsing both compound statements and statement expressions
    (which follow different paths to handling the opening).  */
@@ -6957,6 +6986,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
   bool in_omp_loop_block
     = omp_for_parse_state ? omp_for_parse_state->want_nested_loop : false;
   tree sl = NULL_TREE;
+  attr_state a = {};
 
   if (c_parser_next_token_is (parser, CPP_CLOSE_BRACE))
     {
@@ -7095,7 +7125,10 @@ c_parser_compound_statement_nostart (c_parser *parser)
 	= c_parser_nth_token_starts_std_attributes (parser, 1);
       tree std_attrs = NULL_TREE;
       if (have_std_attrs)
-	std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+	{
+	  std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+	  std_attrs = c_parser_handle_musttail (parser, std_attrs, a);
+	}
       if (c_parser_next_token_is_keyword (parser, RID_CASE)
 	  || c_parser_next_token_is_keyword (parser, RID_DEFAULT)
 	  || (c_parser_next_token_is (parser, CPP_NAME)
@@ -7243,7 +7276,7 @@ c_parser_compound_statement_nostart (c_parser *parser)
 	  last_stmt = true;
 	  mark_valid_location_for_stdc_pragma (false);
 	  if (!omp_for_parse_state)
-	    c_parser_statement_after_labels (parser, NULL);
+	    c_parser_statement_after_labels (parser, NULL, NULL, a);
 	  else
 	    {
 	      /* In canonical loop nest form, nested loops can only appear
@@ -7285,15 +7318,18 @@ c_parser_compound_statement_nostart (c_parser *parser)
 /* Parse all consecutive labels, possibly preceded by standard
    attributes.  In this context, a statement is required, not a
    declaration, so attributes must be followed by a statement that is
-   not just a semicolon.  */
+   not just a semicolon.  Returns an attr_state.  */
 
-static void
+static attr_state
 c_parser_all_labels (c_parser *parser)
 {
+  attr_state a = {};
   bool have_std_attrs;
   tree std_attrs = NULL;
   if ((have_std_attrs = c_parser_nth_token_starts_std_attributes (parser, 1)))
-    std_attrs = c_parser_std_attribute_specifier_sequence (parser);
+    std_attrs = c_parser_handle_musttail (parser,
+		    c_parser_std_attribute_specifier_sequence (parser), a);
+
   while (c_parser_next_token_is_keyword (parser, RID_CASE)
 	 || c_parser_next_token_is_keyword (parser, RID_DEFAULT)
 	 || (c_parser_next_token_is (parser, CPP_NAME)
@@ -7315,6 +7351,7 @@ c_parser_all_labels (c_parser *parser)
     }
   else if (have_std_attrs && c_parser_next_token_is (parser, CPP_SEMICOLON))
     c_parser_error (parser, "expected statement");
+  return a;
 }
 
 /* Parse a label (C90 6.6.1, C99 6.8.1, C11 6.8.1).
@@ -7558,11 +7595,11 @@ c_parser_label (c_parser *parser, tree std_attrs)
 static void
 c_parser_statement (c_parser *parser, bool *if_p, location_t *loc_after_labels)
 {
-  c_parser_all_labels (parser);
+  attr_state a = c_parser_all_labels (parser);
   if (loc_after_labels)
     *loc_after_labels = c_parser_peek_token (parser)->location;
   parser->omp_attrs_forbidden_p = false;
-  c_parser_statement_after_labels (parser, if_p, NULL);
+  c_parser_statement_after_labels (parser, if_p, NULL, a);
 }
 
 /* Parse a statement, other than a labeled statement.  CHAIN is a vector
@@ -7571,11 +7608,11 @@ c_parser_statement (c_parser *parser, bool *if_p, location_t *loc_after_labels)
 
    IF_P is used to track whether there's a (possibly labeled) if statement
    which is not enclosed in braces and has an else clause.  This is used to
-   implement -Wparentheses.  */
+   implement -Wparentheses. A has an earlier parsed attribute state.  */
 
 static void
 c_parser_statement_after_labels (c_parser *parser, bool *if_p,
-				 vec<tree> *chain)
+				 vec<tree> *chain, attr_state a)
 {
   location_t loc = c_parser_peek_token (parser)->location;
   tree stmt = NULL_TREE;
@@ -7643,7 +7680,7 @@ c_parser_statement_after_labels (c_parser *parser, bool *if_p,
 	  c_parser_consume_token (parser);
 	  if (c_parser_next_token_is (parser, CPP_SEMICOLON))
 	    {
-	      stmt = c_finish_return (loc, NULL_TREE, NULL_TREE);
+	      stmt = c_finish_return (loc, NULL_TREE, NULL_TREE, a.musttail_p);
 	      c_parser_consume_token (parser);
 	    }
 	  else
@@ -7652,7 +7689,7 @@ c_parser_statement_after_labels (c_parser *parser, bool *if_p,
 	      struct c_expr expr = c_parser_expression_conv (parser);
 	      mark_exp_read (expr.value);
 	      stmt = c_finish_return (EXPR_LOC_OR_LOC (expr.value, xloc),
-				      expr.value, expr.original_type);
+				      expr.value, expr.original_type, a.musttail_p);
 	      goto expect_semicolon;
 	    }
 	  break;
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
index 22b0009874b5..002351bb9934 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h
@@ -826,7 +826,7 @@ extern tree c_begin_stmt_expr (void);
 extern tree c_finish_stmt_expr (location_t, tree);
 extern tree c_process_expr_stmt (location_t, tree);
 extern tree c_finish_expr_stmt (location_t, tree);
-extern tree c_finish_return (location_t, tree, tree);
+extern tree c_finish_return (location_t, tree, tree, bool = false);
 extern tree c_finish_bc_stmt (location_t, tree, bool);
 extern tree c_finish_goto_label (location_t, tree);
 extern tree c_finish_goto_ptr (location_t, c_expr val);
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 4567b114734b..174348e96d7f 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -11445,10 +11445,10 @@ c_finish_goto_ptr (location_t loc, c_expr val)
    to return, or a null pointer for `return;' with no value.  LOC is
    the location of the return statement, or the location of the expression,
    if the statement has any.  If ORIGTYPE is not NULL_TREE, it
-   is the original type of RETVAL.  */
+   is the original type of RETVAL.  MUSTTAIL_P indicates a musttail attribute.  */
 
 tree
-c_finish_return (location_t loc, tree retval, tree origtype)
+c_finish_return (location_t loc, tree retval, tree origtype, bool musttail_p)
 {
   tree valtype = TREE_TYPE (TREE_TYPE (current_function_decl)), ret_stmt;
   bool no_warning = false;
@@ -11462,6 +11462,17 @@ c_finish_return (location_t loc, tree retval, tree origtype)
     warning_at (xloc, 0,
 		"function declared %<noreturn%> has a %<return%> statement");
 
+  if (retval && musttail_p)
+    {
+      tree t = retval;
+      if (TREE_CODE (t) == TARGET_EXPR)
+	t = TARGET_EXPR_INITIAL (t);
+      if (TREE_CODE (t) != CALL_EXPR)
+	error_at (xloc, "cannot tail-call: return value must be call");
+      else
+	CALL_EXPR_MUST_TAIL_CALL (t) = 1;
+    }
+
   if (retval)
     {
       tree semantic_type = NULL_TREE;
-- 
2.44.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v5 4/5] Add tests for C/C++ musttail attributes
  2024-05-05 18:14 [PATCH v5 1/5] Improve must tail in RTL backend Andi Kleen
  2024-05-05 18:14 ` [PATCH v5 2/5] C++: Support clang compatible [[musttail]] (PR83324) Andi Kleen
  2024-05-05 18:14 ` [PATCH v5 3/5] C: Implement musttail attribute for returns Andi Kleen
@ 2024-05-05 18:14 ` Andi Kleen
  2024-05-05 18:14 ` [PATCH v5 5/5] Add documentation for musttail attribute Andi Kleen
  2024-05-14 14:15 ` [PATCH v5 1/5] Improve must tail in RTL backend Richard Biener
  4 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2024-05-05 18:14 UTC (permalink / raw)
  To: gcc-patches; +Cc: nathan, josmyers, richard.sandiford, jason, Andi Kleen

Mostly adopted from the existing C musttail plugin tests.

gcc/testsuite/ChangeLog:

	* c-c++-common/musttail1.c: New test.
	* c-c++-common/musttail2.c: New test.
	* c-c++-common/musttail3.c: New test.
	* c-c++-common/musttail4.c: New test.
	* c-c++-common/musttail5.c: New test.
---
 gcc/testsuite/c-c++-common/musttail1.c | 15 ++++++++++++
 gcc/testsuite/c-c++-common/musttail2.c | 34 ++++++++++++++++++++++++++
 gcc/testsuite/c-c++-common/musttail3.c | 29 ++++++++++++++++++++++
 gcc/testsuite/c-c++-common/musttail4.c | 17 +++++++++++++
 gcc/testsuite/c-c++-common/musttail5.c | 28 +++++++++++++++++++++
 5 files changed, 123 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/musttail1.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail2.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail3.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail4.c
 create mode 100644 gcc/testsuite/c-c++-common/musttail5.c

diff --git a/gcc/testsuite/c-c++-common/musttail1.c b/gcc/testsuite/c-c++-common/musttail1.c
new file mode 100644
index 000000000000..ac92f9f74616
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+/* { dg-options "-O2" } */
+/* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
+
+int __attribute__((noinline,noclone,noipa))
+callee (int i)
+{
+  return i * i;
+}
+
+int __attribute__((noinline,noclone,noipa))
+caller (int i)
+{
+  [[gnu::musttail]] return callee (i + 1);
+}
diff --git a/gcc/testsuite/c-c++-common/musttail2.c b/gcc/testsuite/c-c++-common/musttail2.c
new file mode 100644
index 000000000000..058329b69cc2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail2.c
@@ -0,0 +1,34 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+struct box { char field[256]; int i; };
+
+int __attribute__((noinline,noclone,noipa))
+test_2_callee (int i, struct box b)
+{
+  if (b.field[0])
+    return 5;
+  return i * i;
+}
+
+int __attribute__((noinline,noclone,noipa))
+test_2_caller (int i)
+{
+  struct box b;
+  [[gnu::musttail]] return test_2_callee (i + 1, b); /* { dg-error "cannot tail-call: " } */
+}
+
+extern void setjmp (void);
+void
+test_3 (void)
+{
+  [[gnu::musttail]] return setjmp (); /* { dg-error "cannot tail-call: " } */
+}
+
+typedef void (fn_ptr_t) (void);
+volatile fn_ptr_t fn_ptr;
+
+void
+test_5 (void)
+{
+  [[gnu::musttail]] return fn_ptr (); /* { dg-error "cannot tail-call: " } */
+}
diff --git a/gcc/testsuite/c-c++-common/musttail3.c b/gcc/testsuite/c-c++-common/musttail3.c
new file mode 100644
index 000000000000..ea9589c59ef2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail3.c
@@ -0,0 +1,29 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+extern int foo2 (int x, ...);
+
+struct str
+{
+  int a, b;
+};
+
+struct str
+cstruct (int x)
+{
+  if (x < 10)
+    [[clang::musttail]] return cstruct (x + 1);
+  return ((struct str){ x, 0 });
+}
+
+int
+foo (int x)
+{
+  if (x < 10)
+    [[clang::musttail]] return foo2 (x, 29);
+  if (x < 100)
+    {
+      int k = foo (x + 1);
+      [[clang::musttail]] return k;	/* { dg-error "cannot tail-call: " } */
+    }
+  return x;
+}
diff --git a/gcc/testsuite/c-c++-common/musttail4.c b/gcc/testsuite/c-c++-common/musttail4.c
new file mode 100644
index 000000000000..23f4b5e1cd68
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail4.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { tail_call && { c || c++11 } } } } */
+
+struct box { char field[64]; int i; };
+
+struct box __attribute__((noinline,noclone,noipa))
+returns_struct (int i)
+{
+  struct box b;
+  b.i = i * i;
+  return b;
+}
+
+int __attribute__((noinline,noclone))
+test_1 (int i)
+{
+  [[gnu::musttail]] return returns_struct (i * 5).i; /* { dg-error "cannot tail-call: " } */
+}
diff --git a/gcc/testsuite/c-c++-common/musttail5.c b/gcc/testsuite/c-c++-common/musttail5.c
new file mode 100644
index 000000000000..7938e7ff80e4
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/musttail5.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c23" { target c } } */
+/* { dg-options "-std=gnu++11" { target c++ } } */
+
+[[musttail]] int j; /* { dg-warning "attribute" } */
+__attribute__((musttail)) int k; /* { dg-warning "attribute" } */
+
+void foo(void)
+{
+  [[gnu::musttail]] j++; /* { dg-warning "attribute" } */
+  [[gnu::musttail]] if (k > 0) /* { dg-warning "attribute" } */
+    [[gnu::musttail]] k++; /* { dg-warning "attribute" } */
+}
+
+int foo2(int p)
+{
+  [[gnu::musttail(1)]] return foo2(p + 1); /* { dg-error "attribute" } */
+}
+
+int i;
+
+int foo3(void)
+{
+  [[musttail]] i++; /* { dg-warning "attribute" } */
+  [[musttail]] if (i > 10) /* { dg-warning "attribute" } */
+    [[musttail]] return foo2(i); /* { dg-warning "attribute" } */
+  return 0;
+}
-- 
2.44.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v5 5/5] Add documentation for musttail attribute
  2024-05-05 18:14 [PATCH v5 1/5] Improve must tail in RTL backend Andi Kleen
                   ` (2 preceding siblings ...)
  2024-05-05 18:14 ` [PATCH v5 4/5] Add tests for C/C++ musttail attributes Andi Kleen
@ 2024-05-05 18:14 ` Andi Kleen
  2024-05-14 14:21   ` Richard Biener
  2024-05-14 14:15 ` [PATCH v5 1/5] Improve must tail in RTL backend Richard Biener
  4 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2024-05-05 18:14 UTC (permalink / raw)
  To: gcc-patches; +Cc: nathan, josmyers, richard.sandiford, jason, Andi Kleen

gcc/ChangeLog:

	* doc/extend.texi: Document [[musttail]]
---
 gcc/doc/extend.texi | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index e290265d68d3..deb100ad93b6 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -9839,7 +9839,7 @@ same manner as the @code{deprecated} attribute.
 @section Statement Attributes
 @cindex Statement Attributes
 
-GCC allows attributes to be set on null statements.  @xref{Attribute Syntax},
+GCC allows attributes to be set on statements.  @xref{Attribute Syntax},
 for details of the exact syntax for using attributes.  Other attributes are
 available for functions (@pxref{Function Attributes}), variables
 (@pxref{Variable Attributes}), labels (@pxref{Label Attributes}), enumerators
@@ -9896,6 +9896,22 @@ foo (int x, int y)
 @code{y} is not actually incremented and the compiler can but does not
 have to optimize it to just @code{return 42 + 42;}.
 
+@cindex @code{musttail} statement attribute
+@item musttail
+
+The @code{gnu::musttail} or @code{clang::musttail} attribute
+can be applied to a @code{return} statement with a return-value expression
+that is a function call.  It asserts that the call must be a tail call that
+does not allocate extra stack space.
+
+@smallexample
+[[gnu::musttail]] return foo();
+@end smallexample
+
+If the compiler cannot generate a tail call it generates
+an error. Tail calls generally require enabling optimization.
+On some targets they may not be supported.
+
 @end table
 
 @node Attribute Syntax
@@ -10019,7 +10035,9 @@ the constant expression, if present.
 
 @subsubheading Statement Attributes
 In GNU C, an attribute specifier list may appear as part of a null
-statement.  The attribute goes before the semicolon.
+statement. The attribute goes before the semicolon.
+Some attributes in new style syntax are also supported
+on non-null statements.
 
 @subsubheading Type Attributes
 
-- 
2.44.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 2/5] C++: Support clang compatible [[musttail]] (PR83324)
  2024-05-05 18:14 ` [PATCH v5 2/5] C++: Support clang compatible [[musttail]] (PR83324) Andi Kleen
@ 2024-05-07  3:02   ` Jason Merrill
  2024-05-14 17:24     ` Andi Kleen
  0 siblings, 1 reply; 20+ messages in thread
From: Jason Merrill @ 2024-05-07  3:02 UTC (permalink / raw)
  To: Andi Kleen, gcc-patches; +Cc: nathan, josmyers, richard.sandiford

On 5/5/24 14:14, Andi Kleen wrote:
> This patch implements a clang compatible [[musttail]] attribute for
> returns.

Thanks.

> musttail is useful as an alternative to computed goto for interpreters.
> With computed goto the interpreter function usually ends up very big
> which causes problems with register allocation and other per function
> optimizations not scaling. With musttail the interpreter can be instead
> written as a sequence of smaller functions that call each other. To
> avoid unbounded stack growth this requires forcing a sibling call, which
> this attribute does. It guarantees an error if the call cannot be tail
> called which allows the programmer to fix it instead of risking a stack
> overflow. Unlike computed goto it is also type-safe.
> 
> It turns out that David Malcolm had already implemented middle/backend
> support for a musttail attribute back in 2016, but it wasn't exposed
> to any frontend other than a special plugin.
> 
> This patch adds a [[gnu::musttail]] attribute for C++ that can be added
> to return statements. The return statement must be a direct call
> (it does not follow dependencies), which is similar to what clang
> implements. It then uses the existing must tail infrastructure.
> 
> For compatibility it also detects clang::musttail
> 
> One problem is that tree-tailcall usually fails when optimization
> is disabled, which implies the attribute only really works with
> optimization on. But that seems to be a reasonable limitation.
> 
> Passes bootstrap and full test
> 
> 	PR83324
> 
> gcc/cp/ChangeLog:
> 
> 	* cp-tree.h (finish_return_stmt): Add musttail_p.
> 	(check_return_expr): Dito.
> 	* parser.cc (cp_parser_statement): Handle [[musttail]].
> 	(cp_parser_std_attribute): Dito.
> 	(cp_parser_init_statement): Dito.
> 	(cp_parser_jump_statement): Dito.
> 	* semantics.cc (finish_return_stmt): Dito.
> 	* typeck.cc (check_return_expr): Handle musttail_p flag.
> ---
>   gcc/cp/cp-tree.h    |  4 ++--
>   gcc/cp/parser.cc    | 30 ++++++++++++++++++++++++------
>   gcc/cp/semantics.cc |  6 +++---
>   gcc/cp/typeck.cc    | 20 ++++++++++++++++++--
>   4 files changed, 47 insertions(+), 13 deletions(-)
> 
> @@ -12734,9 +12734,27 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
>   						     NULL_TREE, false);
>   	  break;
>   
> +	case RID_RETURN:
> +	  {
> +	    bool musttail_p = false;
> +	    std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
> +	    if (lookup_attribute ("gnu", "musttail", std_attrs))
> +	      {
> +		musttail_p = true;
> +		std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
> +	      }
> +	    // support this for compatibility
> +	    if (lookup_attribute ("clang", "musttail", std_attrs))
> +	      {
> +		musttail_p = true;
> +		std_attrs = remove_attribute ("clang", "musttail", std_attrs);
> +	      }
> +	    statement = cp_parser_jump_statement (parser, musttail_p);

It seems to me that if we were to pass &std_attrs to 
cp_parser_jump_statement, we could handle this entirely in that function 
rather than adding a flag to finish_return_stmt and check_return_stmt.

> @@ -30189,7 +30207,7 @@ cp_parser_std_attribute (cp_parser *parser, tree attr_ns)
>       /* Maybe we don't expect to see any arguments for this attribute.  */
>       const attribute_spec *as
>         = lookup_attribute_spec (TREE_PURPOSE (attribute));
> -    if (as && as->max_length == 0)
> +    if ((as && as->max_length == 0) || is_attribute_p ("musttail", attr_id))

I'd prefer to add an attribute to the table, rather than special-case it 
here; apart from consistency, it seems likely that someone will later 
want to apply it to a function.

You need a template testcase; I expect it doesn't work in templates with 
the current patch.  It's probably enough to copy it in tsubst_expr where 
we currently propagate CALL_EXPR_OPERATOR_SYNTAX.

You also need a testcase where the function returns a class; in that 
case the call will often appear as AGGR_INIT_EXPR rather than CALL_EXPR, 
so you'll need to handle that as well.  And see the places that copy 
flags like CALL_EXPR_OPERATOR_SYNTAX between CALL_EXPR and AGGR_INIT_EXPR.

Jason


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 1/5] Improve must tail in RTL backend
  2024-05-05 18:14 [PATCH v5 1/5] Improve must tail in RTL backend Andi Kleen
                   ` (3 preceding siblings ...)
  2024-05-05 18:14 ` [PATCH v5 5/5] Add documentation for musttail attribute Andi Kleen
@ 2024-05-14 14:15 ` Richard Biener
  2024-05-14 17:19   ` Andi Kleen
  2024-05-20  4:53   ` Andi Kleen
  4 siblings, 2 replies; 20+ messages in thread
From: Richard Biener @ 2024-05-14 14:15 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gcc-patches, nathan, josmyers, richard.sandiford, jason

On Sun, May 5, 2024 at 8:16 PM Andi Kleen <ak@linux.intel.com> wrote:
>
> - Give error messages for all causes of non sibling call generation
> - Don't override choices of other non sibling call checks with
> must tail. This causes ICEs. The must tail attribute now only
> overrides flag_optimize_sibling_calls locally.
> - Error out when tree-tailcall failed to mark a must-tail call
> sibcall. In this case it doesn't know the true reason and only gives
> a vague message (this could be improved, but it's already useful without
> that) tree-tailcall usually fails without optimization, so must
> adjust the existing must-tail plugin test to specify -O2.
>
>         PR83324
>
> gcc/ChangeLog:
>
>         * calls.cc (expand_call): Fix mustcall implementation.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/plugin/must-tail-call-1.c: Adjust.
> ---
>  gcc/calls.cc                                  | 30 ++++++++++++-------
>  .../gcc.dg/plugin/must-tail-call-1.c          |  1 +
>  2 files changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/calls.cc b/gcc/calls.cc
> index 21d78f9779fe..a6b8ee44cc29 100644
> --- a/gcc/calls.cc
> +++ b/gcc/calls.cc
> @@ -2650,7 +2650,9 @@ expand_call (tree exp, rtx target, int ignore)
>    /* The type of the function being called.  */
>    tree fntype;
>    bool try_tail_call = CALL_EXPR_TAILCALL (exp);
> -  bool must_tail_call = CALL_EXPR_MUST_TAIL_CALL (exp);
> +  /* tree-tailcall decided not to do tail calls. Error for the musttail case.  */
> +  if (!try_tail_call)
> +      maybe_complain_about_tail_call (exp, "other reasons");
>    int pass;
>
>    /* Register in which non-BLKmode value will be returned,
> @@ -3022,10 +3024,22 @@ expand_call (tree exp, rtx target, int ignore)
>       pushed these optimizations into -O2.  Don't try if we're already
>       expanding a call, as that means we're an argument.  Don't try if
>       there's cleanups, as we know there's code to follow the call.  */
> -  if (currently_expanding_call++ != 0
> -      || (!flag_optimize_sibling_calls && !CALL_FROM_THUNK_P (exp))
> -      || args_size.var
> -      || dbg_cnt (tail_call) == false)
> +  if (currently_expanding_call++ != 0)
> +    {
> +      maybe_complain_about_tail_call (exp, "inside another call");
> +      try_tail_call = 0;
> +    }
> +  if (!flag_optimize_sibling_calls
> +       && !CALL_FROM_THUNK_P (exp)
> +       && !CALL_EXPR_MUST_TAIL_CALL (exp))
> +    try_tail_call = 0;
> +  if (args_size.var)

If we are both inside another call and run into this we give two errors,
but I guess that's OK ...

> +    {
> +      /* ??? correct message?  */
> +      maybe_complain_about_tail_call (exp, "stack space needed");

args_size.var != NULL_TREE means the argument size is not constant.
I'm quite sure this is an overly conservative check.

> +      try_tail_call = 0;
> +    }
> +  if (dbg_cnt (tail_call) == false)
>      try_tail_call = 0;
>
>    /* Workaround buggy C/C++ wrappers around Fortran routines with
> @@ -3046,15 +3060,11 @@ expand_call (tree exp, rtx target, int ignore)
>             if (MEM_P (*iter))
>               {
>                 try_tail_call = 0;
> +               maybe_complain_about_tail_call (exp, "hidden string length argument");

"hidden string length argument passed on stack"

from what I read the code.

>                 break;
>               }
>         }
>
> -  /* If the user has marked the function as requiring tail-call
> -     optimization, attempt it.  */
> -  if (must_tail_call)
> -    try_tail_call = 1;
> -
>    /*  Rest of purposes for tail call optimizations to fail.  */
>    if (try_tail_call)
>      try_tail_call = can_implement_as_sibling_call_p (exp,
> diff --git a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
> index 3a6d4cceaba7..44af361e2925 100644
> --- a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
> +++ b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
> @@ -1,4 +1,5 @@
>  /* { dg-do compile { target tail_call } } */
> +/* { dg-options "-O2" } */

So I think this is unfortunate - I think when there's a must-tail attribute
we should either run the tailcall pass to check the call even at -O0 or
trust the user with correctness  (hoping no optimization interfered with
the ability to tail-call).

What were the ICEs you ran into?

I would guess it's for example problematic to duplicate must-tail calls?

Thanks,
Richard.

>  /* { dg-options "-fdelayed-branch" { target sparc*-*-* } } */
>
>  extern void abort (void);
> --
> 2.44.0
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 5/5] Add documentation for musttail attribute
  2024-05-05 18:14 ` [PATCH v5 5/5] Add documentation for musttail attribute Andi Kleen
@ 2024-05-14 14:21   ` Richard Biener
  2024-05-14 16:30     ` Andi Kleen
  0 siblings, 1 reply; 20+ messages in thread
From: Richard Biener @ 2024-05-14 14:21 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gcc-patches, nathan, josmyers, richard.sandiford, jason

On Sun, May 5, 2024 at 8:16 PM Andi Kleen <ak@linux.intel.com> wrote:
>
> gcc/ChangeLog:
>
>         * doc/extend.texi: Document [[musttail]]
> ---
>  gcc/doc/extend.texi | 22 ++++++++++++++++++++--
>  1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index e290265d68d3..deb100ad93b6 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -9839,7 +9839,7 @@ same manner as the @code{deprecated} attribute.
>  @section Statement Attributes
>  @cindex Statement Attributes
>
> -GCC allows attributes to be set on null statements.  @xref{Attribute Syntax},
> +GCC allows attributes to be set on statements.  @xref{Attribute Syntax},
>  for details of the exact syntax for using attributes.  Other attributes are
>  available for functions (@pxref{Function Attributes}), variables
>  (@pxref{Variable Attributes}), labels (@pxref{Label Attributes}), enumerators
> @@ -9896,6 +9896,22 @@ foo (int x, int y)
>  @code{y} is not actually incremented and the compiler can but does not
>  have to optimize it to just @code{return 42 + 42;}.
>
> +@cindex @code{musttail} statement attribute
> +@item musttail
> +
> +The @code{gnu::musttail} or @code{clang::musttail} attribute
> +can be applied to a @code{return} statement with a return-value expression
> +that is a function call.  It asserts that the call must be a tail call that
> +does not allocate extra stack space.
> +
> +@smallexample
> +[[gnu::musttail]] return foo();
> +@end smallexample
> +
> +If the compiler cannot generate a tail call it generates
> +an error. Tail calls generally require enabling optimization.
> +On some targets they may not be supported.

Looks generally OK though does this mean people can debug
programs using [[gnu::musttail]] only with optimized builds?  It
seems to me we should try harder to make [[gnu::musttail]] work
at -O0 and generally behave the same at all optimization levels?

> +
>  @end table
>
>  @node Attribute Syntax
> @@ -10019,7 +10035,9 @@ the constant expression, if present.
>
>  @subsubheading Statement Attributes
>  In GNU C, an attribute specifier list may appear as part of a null
> -statement.  The attribute goes before the semicolon.
> +statement. The attribute goes before the semicolon.
> +Some attributes in new style syntax are also supported
> +on non-null statements.
>
>  @subsubheading Type Attributes
>
> --
> 2.44.0
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 5/5] Add documentation for musttail attribute
  2024-05-14 14:21   ` Richard Biener
@ 2024-05-14 16:30     ` Andi Kleen
  2024-05-14 17:08       ` Richard Biener
  0 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2024-05-14 16:30 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, nathan, josmyers, richard.sandiford, jason

> Looks generally OK though does this mean people can debug
> programs using [[gnu::musttail]] only with optimized builds?  It
> seems to me we should try harder to make [[gnu::musttail]] work
> at -O0 and generally behave the same at all optimization levels?

Yes that's a fair point. The problem is tree-tailcall failing,
not the RTL backend. Have to see what it would take to fix. I would
prefer to do this as a followon patch though.

-Andi

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 5/5] Add documentation for musttail attribute
  2024-05-14 16:30     ` Andi Kleen
@ 2024-05-14 17:08       ` Richard Biener
  0 siblings, 0 replies; 20+ messages in thread
From: Richard Biener @ 2024-05-14 17:08 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gcc-patches, nathan, josmyers, richard.sandiford, jason

On Tue, May 14, 2024 at 6:30 PM Andi Kleen <ak@linux.intel.com> wrote:
>
> > Looks generally OK though does this mean people can debug
> > programs using [[gnu::musttail]] only with optimized builds?  It
> > seems to me we should try harder to make [[gnu::musttail]] work
> > at -O0 and generally behave the same at all optimization levels?
>
> Yes that's a fair point. The problem is tree-tailcall failing,
> not the RTL backend. Have to see what it would take to fix. I would
> prefer to do this as a followon patch though.

Btw, -Og also doesn't run the tail-calls pass.  I think we should at least
try to run the pass at -Og and -O0 somewhere before pass_expand
and in find_tail_calls simply only consider the musttail annotated calls
in this case?  It should be reasonably cheap to walk the return stmts
of each function.

I'm not sure why we run pass_tail_calls so "early" and within the
regular post-IPA optimization pipeline rather than in the pre-expand
set of passes common to all optimization levels.  Possibly simply
moving the pass works out already.

Richard.

>
> -Andi

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 1/5] Improve must tail in RTL backend
  2024-05-14 14:15 ` [PATCH v5 1/5] Improve must tail in RTL backend Richard Biener
@ 2024-05-14 17:19   ` Andi Kleen
  2024-05-20  4:53   ` Andi Kleen
  1 sibling, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2024-05-14 17:19 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, nathan, josmyers, richard.sandiford, jason

> > diff --git a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
> > index 3a6d4cceaba7..44af361e2925 100644
> > --- a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
> > +++ b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
> > @@ -1,4 +1,5 @@
> >  /* { dg-do compile { target tail_call } } */
> > +/* { dg-options "-O2" } */
> 
> So I think this is unfortunate - I think when there's a must-tail attribute
> we should either run the tailcall pass to check the call even at -O0 or
> trust the user with correctness  (hoping no optimization interfered with
> the ability to tail-call).
> 
> What were the ICEs you ran into?

I just tried and I don't see ICEs with the current test cases anymore.
Unfortunately I didn't save the earlier ones and I'm not fully sure
what test case i used then. I'll undo that change and watch it.

Maybe together with moving the tree pass that's enough to fix -O0.

-Andi

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 2/5] C++: Support clang compatible [[musttail]] (PR83324)
  2024-05-07  3:02   ` Jason Merrill
@ 2024-05-14 17:24     ` Andi Kleen
  2024-05-14 21:52       ` Jason Merrill
  0 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2024-05-14 17:24 UTC (permalink / raw)
  To: Jason Merrill; +Cc: gcc-patches

Hi Jason,

On Mon, May 06, 2024 at 11:02:20PM -0400, Jason Merrill wrote:
> > @@ -30189,7 +30207,7 @@ cp_parser_std_attribute (cp_parser *parser, tree attr_ns)
> >       /* Maybe we don't expect to see any arguments for this attribute.  */
> >       const attribute_spec *as
> >         = lookup_attribute_spec (TREE_PURPOSE (attribute));
> > -    if (as && as->max_length == 0)
> > +    if ((as && as->max_length == 0) || is_attribute_p ("musttail", attr_id))
> 
> I'd prefer to add an attribute to the table, rather than special-case it
> here; apart from consistency, it seems likely that someone will later want
> to apply it to a function.

Just to clarify. I can add it to the table, but it would be a nop there
for now because the table is not used for statement attributes by
the current parser.

> 
> You need a template testcase; I expect it doesn't work in templates with the
> current patch.  It's probably enough to copy it in tsubst_expr where we
> currently propagate CALL_EXPR_OPERATOR_SYNTAX.

I tried it with the appended test case, everything seems to work without
changes.

Does it cover the cases you were concerned about?


> 
> You also need a testcase where the function returns a class; in that case
> the call will often appear as AGGR_INIT_EXPR rather than CALL_EXPR, so
> you'll need to handle that as well.  And see the places that copy flags like
> CALL_EXPR_OPERATOR_SYNTAX between CALL_EXPR and AGGR_INIT_EXPR.

Dito.

-Andi


/* { dg-do compile { target { tail_call } } } */
/* { dg-options "-O2" } */
/* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */

class Foo {
public:
  int a, b;
  Foo(int a, int b) : a(a), b(b) {}
};

Foo __attribute__((noinline,noclone,noipa))
callee (int i)
{
  return Foo(i, i+1);
}

Foo __attribute__((noinline,noclone,noipa))
caller (int i)
{
  [[gnu::musttail]] return callee (i + 1);
}

template<typename T>
T __attribute__((noinline,noclone,noipa)) foo (T i)
{
  return i + 1;
}

int
caller2 (int k)
{
  [[gnu::musttail]] return foo<int>(1);
}

template<typename T>
T caller3 (T v)
{
  [[gnu::musttail]] return foo<T>(v);
}

int call3(int i)
{
  [[gnu::musttail]] return caller3<int>(i + 1);
}

class Bar {
  int a;
public:
  Bar(int a) : a(a) {}
  Bar operator+(Bar o) { return Bar(a + o.a); } 
};

Bar
caller3 (Bar k)
{
  [[gnu::musttail]] return caller3<Bar>(Bar(99));
}


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 2/5] C++: Support clang compatible [[musttail]] (PR83324)
  2024-05-14 17:24     ` Andi Kleen
@ 2024-05-14 21:52       ` Jason Merrill
  2024-05-14 23:23         ` Andi Kleen
  0 siblings, 1 reply; 20+ messages in thread
From: Jason Merrill @ 2024-05-14 21:52 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gcc-patches

On 5/14/24 13:24, Andi Kleen wrote:
> Hi Jason,
> 
> On Mon, May 06, 2024 at 11:02:20PM -0400, Jason Merrill wrote:
>>> @@ -30189,7 +30207,7 @@ cp_parser_std_attribute (cp_parser *parser, tree attr_ns)
>>>        /* Maybe we don't expect to see any arguments for this attribute.  */
>>>        const attribute_spec *as
>>>          = lookup_attribute_spec (TREE_PURPOSE (attribute));
>>> -    if (as && as->max_length == 0)
>>> +    if ((as && as->max_length == 0) || is_attribute_p ("musttail", attr_id))
>>
>> I'd prefer to add an attribute to the table, rather than special-case it
>> here; apart from consistency, it seems likely that someone will later want
>> to apply it to a function.
> 
> Just to clarify. I can add it to the table, but it would be a nop there
> for now because the table is not used for statement attributes by
> the current parser.

Agreed.

>> You need a template testcase; I expect it doesn't work in templates with the
>> current patch.  It's probably enough to copy it in tsubst_expr where we
>> currently propagate CALL_EXPR_OPERATOR_SYNTAX.
> 
> I tried it with the appended test case, everything seems to work without
> changes.
> 
> Does it cover the cases you were concerned about?

Not fully; this testcase doesn't seem to check for errors if tail-call 
fails, only whether the syntax is accepted.  So it would pass if the 
attribute were simply ignored.

Did you also see this comment?

> It seems to me that if we were to pass &std_attrs to cp_parser_jump_statement, we could handle this entirely in that function rather than adding a flag to finish_return_stmt and check_return_stmt. 

Jason


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 2/5] C++: Support clang compatible [[musttail]] (PR83324)
  2024-05-14 21:52       ` Jason Merrill
@ 2024-05-14 23:23         ` Andi Kleen
  2024-05-21 18:02           ` Jason Merrill
  0 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2024-05-14 23:23 UTC (permalink / raw)
  To: Jason Merrill; +Cc: gcc-patches

> > > You need a template testcase; I expect it doesn't work in templates with the
> > > current patch.  It's probably enough to copy it in tsubst_expr where we
> > > currently propagate CALL_EXPR_OPERATOR_SYNTAX.
> > 
> > I tried it with the appended test case, everything seems to work without
> > changes.
> > 
> > Does it cover the cases you were concerned about?
> 
> Not fully; this testcase doesn't seem to check for errors if tail-call
> fails, only whether the syntax is accepted.  So it would pass if the
> attribute were simply ignored.

Okay I'm not clear how I would do that. Pattern match the assembler 
in a target specific test case? From looking at the assembler output
everything got tail converted.

> 
> Did you also see this comment?
> 
> > It seems to me that if we were to pass &std_attrs to
> > cp_parser_jump_statement, we could handle this entirely in that function
> > rather than adding a flag to finish_return_stmt and check_return_stmt.

Yes. I did that change.

-Andi

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 1/5] Improve must tail in RTL backend
  2024-05-14 14:15 ` [PATCH v5 1/5] Improve must tail in RTL backend Richard Biener
  2024-05-14 17:19   ` Andi Kleen
@ 2024-05-20  4:53   ` Andi Kleen
  2024-05-21  8:31     ` Richard Biener
  1 sibling, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2024-05-20  4:53 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, nathan, josmyers, richard.sandiford, jason

On Tue, May 14, 2024 at 04:15:08PM +0200, Richard Biener wrote:
> On Sun, May 5, 2024 at 8:16 PM Andi Kleen <ak@linux.intel.com> wrote:
> >
> > - Give error messages for all causes of non sibling call generation
> > - Don't override choices of other non sibling call checks with
> > must tail. This causes ICEs. The must tail attribute now only
> > overrides flag_optimize_sibling_calls locally.
> > - Error out when tree-tailcall failed to mark a must-tail call
> > sibcall. In this case it doesn't know the true reason and only gives
> > a vague message (this could be improved, but it's already useful without
> > that) tree-tailcall usually fails without optimization, so must
> > adjust the existing must-tail plugin test to specify -O2.
> >
> >         PR83324
> >
> > gcc/ChangeLog:
> >
> >         * calls.cc (expand_call): Fix mustcall implementation.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.dg/plugin/must-tail-call-1.c: Adjust.
> > ---
> >  gcc/calls.cc                                  | 30 ++++++++++++-------
> >  .../gcc.dg/plugin/must-tail-call-1.c          |  1 +
> >  2 files changed, 21 insertions(+), 10 deletions(-)
> >
> > diff --git a/gcc/calls.cc b/gcc/calls.cc
> > index 21d78f9779fe..a6b8ee44cc29 100644
> > --- a/gcc/calls.cc
> > +++ b/gcc/calls.cc
> > @@ -2650,7 +2650,9 @@ expand_call (tree exp, rtx target, int ignore)
> >    /* The type of the function being called.  */
> >    tree fntype;
> >    bool try_tail_call = CALL_EXPR_TAILCALL (exp);
> > -  bool must_tail_call = CALL_EXPR_MUST_TAIL_CALL (exp);
> > +  /* tree-tailcall decided not to do tail calls. Error for the musttail case.  */
> > +  if (!try_tail_call)
> > +      maybe_complain_about_tail_call (exp, "other reasons");
> >    int pass;
> >
> >    /* Register in which non-BLKmode value will be returned,
> > @@ -3022,10 +3024,22 @@ expand_call (tree exp, rtx target, int ignore)
> >       pushed these optimizations into -O2.  Don't try if we're already
> >       expanding a call, as that means we're an argument.  Don't try if
> >       there's cleanups, as we know there's code to follow the call.  */
> > -  if (currently_expanding_call++ != 0
> > -      || (!flag_optimize_sibling_calls && !CALL_FROM_THUNK_P (exp))
> > -      || args_size.var
> > -      || dbg_cnt (tail_call) == false)
> > +  if (currently_expanding_call++ != 0)
> > +    {
> > +      maybe_complain_about_tail_call (exp, "inside another call");
> > +      try_tail_call = 0;
> > +    }
> > +  if (!flag_optimize_sibling_calls
> > +       && !CALL_FROM_THUNK_P (exp)
> > +       && !CALL_EXPR_MUST_TAIL_CALL (exp))
> > +    try_tail_call = 0;
> > +  if (args_size.var)
> 
> If we are both inside another call and run into this we give two errors,
> but I guess that's OK ...
> 
> > +    {
> > +      /* ??? correct message?  */
> > +      maybe_complain_about_tail_call (exp, "stack space needed");
> 
> args_size.var != NULL_TREE means the argument size is not constant.
> I'm quite sure this is an overly conservative check.
> 
> > +      try_tail_call = 0;
> > +    }
> > +  if (dbg_cnt (tail_call) == false)
> >      try_tail_call = 0;
> >
> >    /* Workaround buggy C/C++ wrappers around Fortran routines with
> > @@ -3046,15 +3060,11 @@ expand_call (tree exp, rtx target, int ignore)
> >             if (MEM_P (*iter))
> >               {
> >                 try_tail_call = 0;
> > +               maybe_complain_about_tail_call (exp, "hidden string length argument");
> 
> "hidden string length argument passed on stack"
> 
> from what I read the code.
> 
> >                 break;
> >               }
> >         }
> >
> > -  /* If the user has marked the function as requiring tail-call
> > -     optimization, attempt it.  */
> > -  if (must_tail_call)
> > -    try_tail_call = 1;
> > -
> >    /*  Rest of purposes for tail call optimizations to fail.  */
> >    if (try_tail_call)
> >      try_tail_call = can_implement_as_sibling_call_p (exp,
> > diff --git a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
> > index 3a6d4cceaba7..44af361e2925 100644
> > --- a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
> > +++ b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
> > @@ -1,4 +1,5 @@
> >  /* { dg-do compile { target tail_call } } */
> > +/* { dg-options "-O2" } */
> 
> So I think this is unfortunate - I think when there's a must-tail attribute
> we should either run the tailcall pass to check the call even at -O0 or
> trust the user with correctness  (hoping no optimization interfered with
> the ability to tail-call).
> 
> What were the ICEs you ran into?
> 
> I would guess it's for example problematic to duplicate must-tail calls?

I experimented more with this, the problem I have currently is that in
-O0 when returning a struct I get something like 

  D.2846 = caller3<Bar> (D.2836); [must tail call]

  <bb 3> :
  D.2836 ={v} {CLOBBER(eos)};
  return D.2846;

And this always fails this check in tree-tailcall:

      /* If the statement references memory or volatile operands, fail.  */
      if (gimple_references_memory_p (stmt)
          || gimple_has_volatile_ops (stmt))
        return;

In a optimized build this passes, but with -O0 it always fails
when the pass is placed before pass_optimizations_g. I assume
it's some problem with mem ssa form.

Any ideas how to fix that? Otherwise I can restrict musttail to non
structs.

-Andi

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 1/5] Improve must tail in RTL backend
  2024-05-20  4:53   ` Andi Kleen
@ 2024-05-21  8:31     ` Richard Biener
  2024-05-21 13:35       ` Andi Kleen
  0 siblings, 1 reply; 20+ messages in thread
From: Richard Biener @ 2024-05-21  8:31 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gcc-patches, nathan, josmyers, richard.sandiford, jason

On Mon, May 20, 2024 at 6:53 AM Andi Kleen <ak@linux.intel.com> wrote:
>
> On Tue, May 14, 2024 at 04:15:08PM +0200, Richard Biener wrote:
> > On Sun, May 5, 2024 at 8:16 PM Andi Kleen <ak@linux.intel.com> wrote:
> > >
> > > - Give error messages for all causes of non sibling call generation
> > > - Don't override choices of other non sibling call checks with
> > > must tail. This causes ICEs. The must tail attribute now only
> > > overrides flag_optimize_sibling_calls locally.
> > > - Error out when tree-tailcall failed to mark a must-tail call
> > > sibcall. In this case it doesn't know the true reason and only gives
> > > a vague message (this could be improved, but it's already useful without
> > > that) tree-tailcall usually fails without optimization, so must
> > > adjust the existing must-tail plugin test to specify -O2.
> > >
> > >         PR83324
> > >
> > > gcc/ChangeLog:
> > >
> > >         * calls.cc (expand_call): Fix mustcall implementation.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         * gcc.dg/plugin/must-tail-call-1.c: Adjust.
> > > ---
> > >  gcc/calls.cc                                  | 30 ++++++++++++-------
> > >  .../gcc.dg/plugin/must-tail-call-1.c          |  1 +
> > >  2 files changed, 21 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/gcc/calls.cc b/gcc/calls.cc
> > > index 21d78f9779fe..a6b8ee44cc29 100644
> > > --- a/gcc/calls.cc
> > > +++ b/gcc/calls.cc
> > > @@ -2650,7 +2650,9 @@ expand_call (tree exp, rtx target, int ignore)
> > >    /* The type of the function being called.  */
> > >    tree fntype;
> > >    bool try_tail_call = CALL_EXPR_TAILCALL (exp);
> > > -  bool must_tail_call = CALL_EXPR_MUST_TAIL_CALL (exp);
> > > +  /* tree-tailcall decided not to do tail calls. Error for the musttail case.  */
> > > +  if (!try_tail_call)
> > > +      maybe_complain_about_tail_call (exp, "other reasons");
> > >    int pass;
> > >
> > >    /* Register in which non-BLKmode value will be returned,
> > > @@ -3022,10 +3024,22 @@ expand_call (tree exp, rtx target, int ignore)
> > >       pushed these optimizations into -O2.  Don't try if we're already
> > >       expanding a call, as that means we're an argument.  Don't try if
> > >       there's cleanups, as we know there's code to follow the call.  */
> > > -  if (currently_expanding_call++ != 0
> > > -      || (!flag_optimize_sibling_calls && !CALL_FROM_THUNK_P (exp))
> > > -      || args_size.var
> > > -      || dbg_cnt (tail_call) == false)
> > > +  if (currently_expanding_call++ != 0)
> > > +    {
> > > +      maybe_complain_about_tail_call (exp, "inside another call");
> > > +      try_tail_call = 0;
> > > +    }
> > > +  if (!flag_optimize_sibling_calls
> > > +       && !CALL_FROM_THUNK_P (exp)
> > > +       && !CALL_EXPR_MUST_TAIL_CALL (exp))
> > > +    try_tail_call = 0;
> > > +  if (args_size.var)
> >
> > If we are both inside another call and run into this we give two errors,
> > but I guess that's OK ...
> >
> > > +    {
> > > +      /* ??? correct message?  */
> > > +      maybe_complain_about_tail_call (exp, "stack space needed");
> >
> > args_size.var != NULL_TREE means the argument size is not constant.
> > I'm quite sure this is an overly conservative check.
> >
> > > +      try_tail_call = 0;
> > > +    }
> > > +  if (dbg_cnt (tail_call) == false)
> > >      try_tail_call = 0;
> > >
> > >    /* Workaround buggy C/C++ wrappers around Fortran routines with
> > > @@ -3046,15 +3060,11 @@ expand_call (tree exp, rtx target, int ignore)
> > >             if (MEM_P (*iter))
> > >               {
> > >                 try_tail_call = 0;
> > > +               maybe_complain_about_tail_call (exp, "hidden string length argument");
> >
> > "hidden string length argument passed on stack"
> >
> > from what I read the code.
> >
> > >                 break;
> > >               }
> > >         }
> > >
> > > -  /* If the user has marked the function as requiring tail-call
> > > -     optimization, attempt it.  */
> > > -  if (must_tail_call)
> > > -    try_tail_call = 1;
> > > -
> > >    /*  Rest of purposes for tail call optimizations to fail.  */
> > >    if (try_tail_call)
> > >      try_tail_call = can_implement_as_sibling_call_p (exp,
> > > diff --git a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
> > > index 3a6d4cceaba7..44af361e2925 100644
> > > --- a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
> > > +++ b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
> > > @@ -1,4 +1,5 @@
> > >  /* { dg-do compile { target tail_call } } */
> > > +/* { dg-options "-O2" } */
> >
> > So I think this is unfortunate - I think when there's a must-tail attribute
> > we should either run the tailcall pass to check the call even at -O0 or
> > trust the user with correctness  (hoping no optimization interfered with
> > the ability to tail-call).
> >
> > What were the ICEs you ran into?
> >
> > I would guess it's for example problematic to duplicate must-tail calls?
>
> I experimented more with this, the problem I have currently is that in
> -O0 when returning a struct I get something like
>
>   D.2846 = caller3<Bar> (D.2836); [must tail call]
>
>   <bb 3> :
>   D.2836 ={v} {CLOBBER(eos)};
>   return D.2846;
>
> And this always fails this check in tree-tailcall:
>
>       /* If the statement references memory or volatile operands, fail.  */
>       if (gimple_references_memory_p (stmt)
>           || gimple_has_volatile_ops (stmt))
>         return;

I can't see how this triggers on the IL above, the loop should have
ignored both the return and the clobber and when recursing to
the predecessor stop before the above check when runnig into the
call?

> In a optimized build this passes, but with -O0 it always fails
> when the pass is placed before pass_optimizations_g. I assume
> it's some problem with mem ssa form.
>
> Any ideas how to fix that? Otherwise I can restrict musttail to non
> structs.

I wonder how this works when optimizing?

>
> -Andi

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 1/5] Improve must tail in RTL backend
  2024-05-21  8:31     ` Richard Biener
@ 2024-05-21 13:35       ` Andi Kleen
  2024-05-21 16:41         ` Richard Biener
  0 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2024-05-21 13:35 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, nathan, josmyers, richard.sandiford, jason

> I can't see how this triggers on the IL above, the loop should have
> ignored both the return and the clobber and when recursing to
> the predecessor stop before the above check when runnig into the
> call?

Yes, I tracked that down later. The problem was that there
were multiple successors to the BB due to exception handling,
which makes the find_tail_calls walker give up.

Putting the new pass after ehcleanup fixed that, but there
are still cases when ehcleanup cannot get rid of them and
then it gives up. musttail checking at expand time still
works, but can only give a vague error message.

> 
> > In a optimized build this passes, but with -O0 it always fails
> > when the pass is placed before pass_optimizations_g. I assume
> > it's some problem with mem ssa form.
> >
> > Any ideas how to fix that? Otherwise I can restrict musttail to non
> > structs.
> 
> I wonder how this works when optimizing?

It just doesn't. You need optimization to do tail calls with
structs. The only alternative would be to detect the situation
and pull in some extra passes.

Also even with optimization it only works for structs that
fit into registers. This could be maybe fixed, but is out of scope
for this patch kit.

-Andi

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 1/5] Improve must tail in RTL backend
  2024-05-21 13:35       ` Andi Kleen
@ 2024-05-21 16:41         ` Richard Biener
  2024-05-21 21:45           ` Andi Kleen
  0 siblings, 1 reply; 20+ messages in thread
From: Richard Biener @ 2024-05-21 16:41 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gcc-patches, nathan, josmyers, richard.sandiford, jason

On Tue, May 21, 2024 at 3:35 PM Andi Kleen <ak@linux.intel.com> wrote:
>
> > I can't see how this triggers on the IL above, the loop should have
> > ignored both the return and the clobber and when recursing to
> > the predecessor stop before the above check when runnig into the
> > call?
>
> Yes, I tracked that down later. The problem was that there
> were multiple successors to the BB due to exception handling,
> which makes the find_tail_calls walker give up.
>
> Putting the new pass after ehcleanup fixed that, but there
> are still cases when ehcleanup cannot get rid of them and
> then it gives up. musttail checking at expand time still
> works, but can only give a vague error message.
>
> >
> > > In a optimized build this passes, but with -O0 it always fails
> > > when the pass is placed before pass_optimizations_g. I assume
> > > it's some problem with mem ssa form.
> > >
> > > Any ideas how to fix that? Otherwise I can restrict musttail to non
> > > structs.
> >
> > I wonder how this works when optimizing?
>
> It just doesn't. You need optimization to do tail calls with
> structs. The only alternative would be to detect the situation
> and pull in some extra passes.
>
> Also even with optimization it only works for structs that
> fit into registers. This could be maybe fixed, but is out of scope
> for this patch kit.

I see.  I do wonder how we should deal with the inherent
dependence on optimization for [[musttail]] to work then?  "Solve"
the problem with good documentation?  Offer a -fignore-musttail
option to allow a -O0 build to at least succeed?  But then [[musttail]]
would rather be [[shouldtail]] and can no longer be for correctness?

How does clang solve this?

Richard.

>
> -Andi

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 2/5] C++: Support clang compatible [[musttail]] (PR83324)
  2024-05-14 23:23         ` Andi Kleen
@ 2024-05-21 18:02           ` Jason Merrill
  0 siblings, 0 replies; 20+ messages in thread
From: Jason Merrill @ 2024-05-21 18:02 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gcc-patches

On 5/14/24 19:23, Andi Kleen wrote:
>>>> You need a template testcase; I expect it doesn't work in templates with the
>>>> current patch.  It's probably enough to copy it in tsubst_expr where we
>>>> currently propagate CALL_EXPR_OPERATOR_SYNTAX.
>>>
>>> I tried it with the appended test case, everything seems to work without
>>> changes.
>>>
>>> Does it cover the cases you were concerned about?
>>
>> Not fully; this testcase doesn't seem to check for errors if tail-call
>> fails, only whether the syntax is accepted.  So it would pass if the
>> attribute were simply ignored.
> 
> Okay I'm not clear how I would do that. Pattern match the assembler
> in a target specific test case? From looking at the assembler output
> everything got tail converted.

Write a testcase where the tail-call optimization can't happen, perhaps 
because the caller and callee disagree on return type:

int f();

double h() { [[gnu::musttail]] return f(); } // error

template <class T>
T g() { [[gnu::musttail]] return f(); }

int main()
{
   g<int>();
   g<double>(); // should error, but doesn't with v6 patch set
}

Jason


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v5 1/5] Improve must tail in RTL backend
  2024-05-21 16:41         ` Richard Biener
@ 2024-05-21 21:45           ` Andi Kleen
  0 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2024-05-21 21:45 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, nathan, josmyers, richard.sandiford, jason

> I see.  I do wonder how we should deal with the inherent
> dependence on optimization for [[musttail]] to work then?  "Solve"
> the problem with good documentation?

For now that's good enough I think. If it's a significant problem
it can always be improved later.

> Offer a -fignore-musttail
> option to allow a -O0 build to at least succeed?  But then [[musttail]]
> would rather be [[shouldtail]] and can no longer be for correctness?

I don't think ignore-musttail is a good idea because an interpreter
relying on this would almost certainly die very quickly after overflowing
its stack. The feature is only useful when it can be relied on.

> How does clang solve this?

clang++ 17 seems to handle both cases (large struct in memory or struct
fitting into registers) without optimization.
I haven't checked the LLVM source on the exact details.

-Andi

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2024-05-21 21:45 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-05 18:14 [PATCH v5 1/5] Improve must tail in RTL backend Andi Kleen
2024-05-05 18:14 ` [PATCH v5 2/5] C++: Support clang compatible [[musttail]] (PR83324) Andi Kleen
2024-05-07  3:02   ` Jason Merrill
2024-05-14 17:24     ` Andi Kleen
2024-05-14 21:52       ` Jason Merrill
2024-05-14 23:23         ` Andi Kleen
2024-05-21 18:02           ` Jason Merrill
2024-05-05 18:14 ` [PATCH v5 3/5] C: Implement musttail attribute for returns Andi Kleen
2024-05-05 18:14 ` [PATCH v5 4/5] Add tests for C/C++ musttail attributes Andi Kleen
2024-05-05 18:14 ` [PATCH v5 5/5] Add documentation for musttail attribute Andi Kleen
2024-05-14 14:21   ` Richard Biener
2024-05-14 16:30     ` Andi Kleen
2024-05-14 17:08       ` Richard Biener
2024-05-14 14:15 ` [PATCH v5 1/5] Improve must tail in RTL backend Richard Biener
2024-05-14 17:19   ` Andi Kleen
2024-05-20  4:53   ` Andi Kleen
2024-05-21  8:31     ` Richard Biener
2024-05-21 13:35       ` Andi Kleen
2024-05-21 16:41         ` Richard Biener
2024-05-21 21:45           ` Andi Kleen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).