public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
@ 2022-02-04 13:49 Richard Biener
  2022-02-04 14:03 ` Jakub Jelinek
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Richard Biener @ 2022-02-04 13:49 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, matz

This adds explicit variable birth CLOBBERs in an attempt to fix
PR90348 and duplicates.  The birth / death CLOBBER pairs are
used to compute liveness and conflicts for stack variable
coalescing where the lack of an explicit birth but instead
use of first mention causes misrepresentation of variable life
ranges when optimization moves the first mention upwards the
original birth point at the variables bind expression start.

Birth CLOBBERs are represented as traditional CLOBBERs with all
the accompaning effect on optimization.  While they do not serve
as a barrier for address mentions they act as barrier for the
actual accesses which is enough for determining conflicts in
the context of stack slot sharing.  The birth CLOBBERs are
distinguished from death CLOBBERs by setting CLOBBER_MARKS_BIRTH
using the private_flag on the CONSTRUCTOR node and amend the
CLOBBER_MARK_EOL marked clobbers introduced earlier.

The patch changes how we handle stack variables that are not marked
with CLOBBERs.  For those the first mention started live which then
lasted upon function exit which means effectively all not marked
variables conflicted with each other variable.  This property is best
represented by an extra flag rather than the full conflict bitmap
which is what the patch introduces with conflicts_represented which
is cleared at start and set to true once we visit the first birth
CLOBBER.  From that on we assume all variable accesses are properly
fenced by birth/death CLOBBERs.

Variables without explicit births will not take part in stack slot
sharing after this change.

Currently birth CLOBBERs are added when gimplification adds
corresponding end-of-life CLOBBERs, during bind and target expression
gimplification.  Generally inserting births at DECL_EXPRs is
more precise so we also do it there (also noting not all such variables
are mentioned in BINDs).  Avoiding redundant births is on the TOOD
list and might also remove the need for the warn_switch_unreachable_r
hunk.

This is the meat of the PR90348 fix, the following 3 patches perform
testcase adjustments and followup fixes to avoid regressions.

Bootstrapped on x86_64-unknown-linux-gnu - re-testing in progress.

Any comments?  I have mixed feelings with proposing this for GCC 12
but like to hear from others as well.  I didn't try to evaluate
the quality of stack slot sharing before/after this change besides
fixing the testsuite fallout (we have a few testcases checking for
specific instances).

Thanks,
Richard.

2022-02-01  Richard Biener  <rguenther@suse.de>

	PR middle-end/90348
	PR middle-end/103006
	* tree-core.h (clobber_kind): Add CLOBBER_BIRTH.
	* gimplify.cc (gimplify_bind_expr): Also add birth CLOBBERs.
	(gimplify_target_expr): Likewise.
	(gimplify_decl_expr): Likewise.
	(warn_switch_unreachable_r): Do not treat birth CLOBBERs as
	real stmt - they get added at BIND starts but that's before
	the case labels.
	* tree-pretty-print.cc (dump_generic_node): Mark birth CLOBBERs.
	* cfgexpand.cc (stack_var::conflicts_represented): New.
	(add_stack_var): Initialize conflicts_represented.
	(add_stack_var_conflict): Assert the conflicts for the
	vars are represented.
	(stack_var_conflict_p): Honor conflicts_represented flag.
	(visit_op): Remove.
	(visit_conflict): Likewise.
	(add_scope_conflicts_1): Simplify by only considering birth
	and death CLOBBERs.
	(add_scope_conflicts): Adjust comment.
	(add_stack_protection_conflicts): Only add conflicts for
	variables that have them represented.

	* gcc.dg/torture/pr103006-1.c: New testcase.
	* gcc.dg/torture/pr103006-2.c: Likewise.
	* gcc.dg/torture/pr90348.c: Likewise.
---
 gcc/cfgexpand.cc                          | 157 +++++++++-------------
 gcc/gimplify.cc                           |  83 +++++++++++-
 gcc/testsuite/gcc.dg/torture/pr103006-1.c |  27 ++++
 gcc/testsuite/gcc.dg/torture/pr103006-2.c |  29 ++++
 gcc/testsuite/gcc.dg/torture/pr90348.c    |  40 ++++++
 gcc/tree-core.h                           |   2 +
 gcc/tree-pretty-print.cc                  |   4 +-
 7 files changed, 244 insertions(+), 98 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr103006-1.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr103006-2.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr90348.c

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index d51af2e3084..02467874996 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -319,6 +319,10 @@ public:
      size, the alignment for this partition.  */
   unsigned int alignb;
 
+  /* Whether this variable has conflicts represented in the CONFLICTS
+     bitmap.  If not, it conflicts with all other stack variables.  */
+  bool conflicts_represented;
+
   /* The partition representative.  */
   size_t representative;
 
@@ -479,7 +483,8 @@ add_stack_var (tree decl, bool really_expand)
   v->representative = stack_vars_num;
   v->next = EOC;
 
-  /* All variables initially conflict with no other.  */
+  /* All variables initially conflict with each other.  */
+  v->conflicts_represented = false;
   v->conflicts = NULL;
 
   /* Ensure that this decl doesn't get put onto the list twice.  */
@@ -497,6 +502,7 @@ add_stack_var_conflict (size_t x, size_t y)
   class stack_var *b = &stack_vars[y];
   if (x == y)
     return;
+  gcc_assert (a->conflicts_represented && b->conflicts_represented);
   if (!a->conflicts)
     a->conflicts = BITMAP_ALLOC (&stack_var_bitmap_obstack);
   if (!b->conflicts)
@@ -520,57 +526,16 @@ stack_var_conflict_p (size_t x, size_t y)
   if (TREE_CODE (a->decl) == SSA_NAME || TREE_CODE (b->decl) == SSA_NAME)
     return true;
 
+  /* If we cannot compute lifeness in a meaningful way for either variable
+     they conflict.  */
+  if (!a->conflicts_represented || !b->conflicts_represented)
+    return true;
+
   if (!a->conflicts || !b->conflicts)
     return false;
   return bitmap_bit_p (a->conflicts, y);
 }
 
-/* Callback for walk_stmt_ops.  If OP is a decl touched by add_stack_var
-   enter its partition number into bitmap DATA.  */
-
-static bool
-visit_op (gimple *, tree op, tree, void *data)
-{
-  bitmap active = (bitmap)data;
-  op = get_base_address (op);
-  if (op
-      && DECL_P (op)
-      && DECL_RTL_IF_SET (op) == pc_rtx)
-    {
-      size_t *v = decl_to_stack_part->get (op);
-      if (v)
-	bitmap_set_bit (active, *v);
-    }
-  return false;
-}
-
-/* Callback for walk_stmt_ops.  If OP is a decl touched by add_stack_var
-   record conflicts between it and all currently active other partitions
-   from bitmap DATA.  */
-
-static bool
-visit_conflict (gimple *, tree op, tree, void *data)
-{
-  bitmap active = (bitmap)data;
-  op = get_base_address (op);
-  if (op
-      && DECL_P (op)
-      && DECL_RTL_IF_SET (op) == pc_rtx)
-    {
-      size_t *v = decl_to_stack_part->get (op);
-      if (v && bitmap_set_bit (active, *v))
-	{
-	  size_t num = *v;
-	  bitmap_iterator bi;
-	  unsigned i;
-	  gcc_assert (num < stack_vars_num);
-	  EXECUTE_IF_SET_IN_BITMAP (active, 0, i, bi)
-	    add_stack_var_conflict (num, i);
-	}
-    }
-  return false;
-}
-
 /* Helper routine for add_scope_conflicts, calculating the active partitions
    at the end of BB, leaving the result in WORK.  We're called to generate
    conflicts when FOR_CONFLICT is true, otherwise we're just tracking
@@ -582,58 +547,63 @@ add_scope_conflicts_1 (basic_block bb, bitmap work, bool for_conflict)
   edge e;
   edge_iterator ei;
   gimple_stmt_iterator gsi;
-  walk_stmt_load_store_addr_fn visit;
+  bitmap_iterator bi;
+  unsigned i;
 
   bitmap_clear (work);
   FOR_EACH_EDGE (e, ei, bb->preds)
     bitmap_ior_into (work, (bitmap)e->src->aux);
 
-  visit = visit_op;
+  /* Since we do not get to see the birth clobber for variables being
+     live only via backedges add conflicts for everything live at
+     the start of the block.  */
+  if (for_conflict)
+    EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi)
+      {
+	class stack_var *a = &stack_vars[i];
+	if (!a->conflicts)
+	  a->conflicts = BITMAP_ALLOC (&stack_var_bitmap_obstack);
+	bitmap_ior_into (a->conflicts, work);
+      }
 
-  for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
-    {
-      gimple *stmt = gsi_stmt (gsi);
-      walk_stmt_load_store_addr_ops (stmt, work, NULL, NULL, visit);
-    }
+  /* Now do the local dataflow.  */
   for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
     {
       gimple *stmt = gsi_stmt (gsi);
+      if (!gimple_clobber_p (stmt))
+	continue;
 
-      if (gimple_clobber_p (stmt))
-	{
-	  tree lhs = gimple_assign_lhs (stmt);
-	  size_t *v;
-	  /* Nested function lowering might introduce LHSs
-	     that are COMPONENT_REFs.  */
-	  if (!VAR_P (lhs))
-	    continue;
-	  if (DECL_RTL_IF_SET (lhs) == pc_rtx
-	      && (v = decl_to_stack_part->get (lhs)))
-	    bitmap_clear_bit (work, *v);
-	}
-      else if (!is_gimple_debug (stmt))
+      tree lhs = gimple_assign_lhs (stmt);
+      size_t *v;
+      /* Nested function lowering might introduce LHSs
+	 that are COMPONENT_REFs.  */
+      if (!VAR_P (lhs))
+	continue;
+      if (DECL_RTL_IF_SET (lhs) == pc_rtx
+	  && (v = decl_to_stack_part->get (lhs)))
 	{
-	  if (for_conflict
-	      && visit == visit_op)
+	  size_t num = *v;
+	  if (CLOBBER_KIND (gimple_assign_rhs1 (stmt)) == CLOBBER_BIRTH)
 	    {
-	      /* If this is the first real instruction in this BB we need
-	         to add conflicts for everything live at this point now.
-		 Unlike classical liveness for named objects we can't
-		 rely on seeing a def/use of the names we're interested in.
-		 There might merely be indirect loads/stores.  We'd not add any
-		 conflicts for such partitions.  */
-	      bitmap_iterator bi;
-	      unsigned i;
-	      EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi)
+	      /* Birth.  */
+	      if (for_conflict)
 		{
-		  class stack_var *a = &stack_vars[i];
-		  if (!a->conflicts)
-		    a->conflicts = BITMAP_ALLOC (&stack_var_bitmap_obstack);
-		  bitmap_ior_into (a->conflicts, work);
+		  /* The births are where we add conflicts.  */
+		  if (bitmap_set_bit (work, num))
+		    EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi)
+		      add_stack_var_conflict (num, i);
+		}
+	      else
+		{
+		  /* We've seen a birth, assume life-ranges are
+		     properly represented by birth and death CLOBBERs.  */
+		  stack_vars[num].conflicts_represented = true;
+		  bitmap_set_bit (work, num);
 		}
-	      visit = visit_conflict;
 	    }
-	  walk_stmt_load_store_addr_ops (stmt, work, visit, visit, visit);
+	  else if (CLOBBER_KIND (gimple_assign_rhs1 (stmt)) == CLOBBER_EOL)
+	    /* Death.  */
+	    bitmap_clear_bit (work, num);
 	}
     }
 }
@@ -650,13 +620,13 @@ add_scope_conflicts (void)
   int *rpo;
   int n_bbs;
 
-  /* We approximate the live range of a stack variable by taking the first
-     mention of its name as starting point(s), and by the end-of-scope
-     death clobber added by gimplify as ending point(s) of the range.
-     This overapproximates in the case we for instance moved an address-taken
-     operation upward, without also moving a dereference to it upwards.
-     But it's conservatively correct as a variable never can hold values
-     before its name is mentioned at least once.
+  /* We approximate the live range of a stack variable by using the
+     birth and death CLOBBERs as added by for example gimplification
+     based on the scopes the variables were declared in.
+     For variables we never see mentioned in a birth CLOBBER we have
+     to assume conflicts with all other stack variables.  We could
+     improve this for the case of stack variables that do not have
+     their address taken as we can precisely track all mentions.
 
      We then do a mostly classical bitmap liveness algorithm.  */
 
@@ -2022,9 +1992,12 @@ add_stack_protection_conflicts (void)
 
   for (i = 0; i < n; ++i)
     {
+      if (!stack_vars[i].conflicts_represented)
+	continue;
       unsigned char ph_i = phase[i];
       for (j = i + 1; j < n; ++j)
-	if (ph_i != phase[j])
+	if (stack_vars[j].conflicts_represented
+	    && ph_i != phase[j])
 	  add_stack_var_conflict (i, j);
     }
 
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 875b115d02d..37c80d609c4 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -1355,6 +1355,7 @@ gimplify_bind_expr (tree *expr_p, gimple_seq *pre_p)
   tree temp = voidify_wrapper_expr (bind_expr, NULL);
 
   /* Mark variables seen in this bind expr.  */
+  body = NULL;
   for (t = BIND_EXPR_VARS (bind_expr); t ; t = DECL_CHAIN (t))
     {
       if (VAR_P (t))
@@ -1412,6 +1413,43 @@ gimplify_bind_expr (tree *expr_p, gimple_seq *pre_p)
 
 	  if (DECL_HARD_REGISTER (t) && !is_global_var (t) && cfun)
 	    cfun->has_local_explicit_reg_vars = true;
+
+	  /* For variables we add clobbers for in the cleanup, add
+	     corresponding births at the start.  */
+	  /* ???  We'd like to not add the birth here if we knew that
+	     we'd see a DECL_EXPR for this variable later.  Incidentially
+	     OMP also seems to have a need to know that (see find_decl_expr).
+	     Adding the birth here for example causes us to have a spurious
+	     one for
+	       switch (..)
+		 {
+		 case 1:
+		   S s;
+		 }
+	     leading to bogus diagnostic we now mitigate in
+	     warn_switch_unreachable_r as the birth will appear before
+	     the case label in that case.
+	     It might be possible to hijack the GENERIC body walk
+	     we do in unshare_body to set a flag on all decls that
+	     we discover having a DECL_EXPR or alternatively wrap
+	     all builders of DECL_EXPR with a function doing that.  */
+	  if (!is_global_var (t)
+	      && DECL_CONTEXT (t) == current_function_decl
+	      && !DECL_HARD_REGISTER (t)
+	      && !TREE_THIS_VOLATILE (t)
+	      && !DECL_HAS_VALUE_EXPR_P (t)
+	      /* Only care for variables that have to be in memory.  Others
+		 will be rewritten into SSA names, hence moved to the
+		 top-level.  */
+	      && !is_gimple_reg (t)
+	      && flag_stack_reuse != SR_NONE)
+	    {
+	      tree clobber = build_clobber (TREE_TYPE (t), CLOBBER_BIRTH);
+	      gimple *clobber_stmt;
+	      clobber_stmt = gimple_build_assign (t, clobber);
+	      gimple_set_location (clobber_stmt, start_locus);
+	      gimplify_seq_add_stmt (&body, clobber_stmt);
+	    }
 	}
     }
 
@@ -1423,7 +1461,6 @@ gimplify_bind_expr (tree *expr_p, gimple_seq *pre_p)
   gimplify_ctxp->save_stack = false;
 
   /* Gimplify the body into the GIMPLE_BIND tuple's body.  */
-  body = NULL;
   gimplify_stmt (&BIND_EXPR_BODY (bind_expr), &body);
   gimple_bind_set_body (bind_stmt, body);
 
@@ -1901,6 +1938,23 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p)
 	  is_vla = true;
 	}
 
+      /* This mostly copies the condition we used for building a CLOBBER
+	 when visiting a BIND_EXPR.  We should rather remember that though,
+	 we approximate it by checking DECL_SEEN_IN_BIND_EXPR_P.  */
+      if (!decl_had_value_expr_p
+	  && !is_global_var (decl)
+	  && !TREE_THIS_VOLATILE (decl)
+	  && !DECL_HARD_REGISTER (decl)
+	  && !is_gimple_reg (decl)
+	  && DECL_SEEN_IN_BIND_EXPR_P (decl)
+	  && flag_stack_reuse != SR_NONE)
+	{
+	  tree clobber = build_clobber (TREE_TYPE (decl), CLOBBER_BIRTH);
+	  gimple *clobber_stmt = gimple_build_assign (decl, clobber);
+	  gimple_set_location (clobber_stmt, EXPR_LOCATION (stmt));
+	  gimplify_seq_add_stmt (seq_p, clobber_stmt);
+	}
+
       if (asan_poisoned_variables
 	  && !is_vla
 	  && TREE_ADDRESSABLE (decl)
@@ -2075,7 +2129,10 @@ warn_switch_unreachable_r (gimple_stmt_iterator *gsi_p, bool *handled_ops_p,
 	}
       /* Fall through.  */
     default:
-      /* Save the first "real" statement (not a decl/lexical scope/...).  */
+      /* Save the first "real" statement (not a decl/lexical scope/...)
+	 but avoid clobbers we add for the birth of variables.  */
+      if (gimple_clobber_p (stmt, CLOBBER_BIRTH))
+	break;
       wi->info = stmt;
       return integer_zero_node;
     }
@@ -6917,7 +6974,7 @@ gimplify_target_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
   enum gimplify_status ret;
 
   bool unpoison_empty_seq = false;
-  gimple_stmt_iterator unpoison_it;
+  gimple_stmt_iterator unpoison_it = gsi_none ();
 
   if (init)
     {
@@ -6935,8 +6992,9 @@ gimplify_target_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
 	}
       else
 	{
-	  /* Save location where we need to place unpoisoning.  It's possible
-	     that a variable will be converted to needs_to_live_in_memory.  */
+	  /* Save location where we need to place unpoisoning or birth.
+	     It's possible that a variable will be converted to
+	     needs_to_live_in_memory.  */
 	  unpoison_it = gsi_last (*pre_p);
 	  unpoison_empty_seq = gsi_end_p (unpoison_it);
 
@@ -6984,6 +7042,21 @@ gimplify_target_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
 	      tree clobber = build_clobber (TREE_TYPE (temp), CLOBBER_EOL);
 	      clobber = build2 (MODIFY_EXPR, TREE_TYPE (temp), temp, clobber);
 	      gimple_push_cleanup (temp, clobber, false, pre_p, true);
+
+	      /* Replace the placed NOP with a birth clobber.  */
+	      clobber = build_clobber (TREE_TYPE (temp), CLOBBER_BIRTH);
+	      clobber = build2 (MODIFY_EXPR, TREE_TYPE (temp), temp, clobber);
+	      gimple_seq stmts = NULL;
+	      gimplify_and_add (clobber, &stmts);
+	      if (unpoison_empty_seq)
+		{
+		  gimple_stmt_iterator si = gsi_start (*pre_p);
+		  gsi_insert_seq_before_without_update (&si, stmts,
+							GSI_SAME_STMT);
+		}
+	      else
+		gsi_insert_seq_after_without_update (&unpoison_it, stmts,
+						     GSI_LAST_NEW_STMT);
 	    }
 	  if (asan_poisoned_variables
 	      && DECL_ALIGN (temp) <= MAX_SUPPORTED_STACK_ALIGNMENT
diff --git a/gcc/testsuite/gcc.dg/torture/pr103006-1.c b/gcc/testsuite/gcc.dg/torture/pr103006-1.c
new file mode 100644
index 00000000000..0053896f19f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr103006-1.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+
+int printf(const char *, ...);
+int a, *b, c, e, f;
+void g() {
+  int *d[7];
+  d[6] = b = (int *)d;
+  printf("0\n");
+}
+int i() {
+  for (c = 0; c < 2; c++) {
+    long h[6][2];
+    for (e = 0; e < 6; e++)
+      for (f = 0; f < 2; f++)
+        h[e][f] = 1;
+    if (c) {
+      g();
+      return h[3][0];
+    }
+  }
+  return 0;
+}
+int main() {
+  if (i() != 1)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr103006-2.c b/gcc/testsuite/gcc.dg/torture/pr103006-2.c
new file mode 100644
index 00000000000..38acf85c4b9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr103006-2.c
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+
+__attribute__((noipa)) void ff(void){}
+int a, *b, c, e, f;
+__attribute__((always_inline))
+static inline void g() {
+  int *d[7];
+  d[6] = b = (int *)d;
+  ff();
+}
+__attribute__((noinline))
+int i() {
+  for (c = 0; c < 2; c++) {
+    long h[6][2];
+    for (e = 0; e < 6; e++)
+      for (f = 0; f < 2; f++)
+        h[e][f] = 1;
+    if (c) {
+      g();
+      return h[3][0];
+    }
+  }
+  return 0;
+}
+int main() {
+  if (i() != 1)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr90348.c b/gcc/testsuite/gcc.dg/torture/pr90348.c
new file mode 100644
index 00000000000..132aff7861f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr90348.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+
+void __attribute__ ((noipa))
+set_one (unsigned char* ptr)
+{
+  *ptr = 1;
+}
+
+int __attribute__ ((noipa))
+check_nonzero (unsigned char const* in, unsigned int len)
+{
+  for (unsigned int i = 0; i < len; ++i)
+    if (in[i] != 0)
+      return 1;
+  return 0;
+}
+
+void
+set_one_on_stack ()
+{
+  unsigned char buf[1];
+  set_one (buf);
+}
+
+int
+main ()
+{
+  for (int i = 0; i < 5; ++i)
+    {
+      unsigned char in[4];
+      for (int j = 0; j < i; ++j)
+	{
+	  in[j] = 0;
+	  set_one_on_stack ();
+	}
+      if (check_nonzero (in, i))
+	__builtin_abort ();
+    }
+}
+
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index bf2efa61330..786bc91571d 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -967,6 +967,8 @@ enum annot_expr_kind {
 enum clobber_kind {
   /* Unspecified, this clobber acts as a store of an undefined value.  */
   CLOBBER_UNDEF,
+  /* This clobber starts the lifetime of the storage.  */
+  CLOBBER_BIRTH,
   /* This clobber ends the lifetime of the storage.  */
   CLOBBER_EOL,
   CLOBBER_LAST
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index 666b7a70ea2..d601f3f980d 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -2502,7 +2502,9 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
 	if (TREE_CLOBBER_P (node))
 	  {
 	    pp_string (pp, "CLOBBER");
-	    if (CLOBBER_KIND (node) == CLOBBER_EOL)
+	    if (CLOBBER_KIND (node) == CLOBBER_BIRTH)
+	      pp_string (pp, "(birth)");
+	    else if (CLOBBER_KIND (node) == CLOBBER_EOL)
 	      pp_string (pp, "(eol)");
 	  }
 	else if (TREE_CODE (TREE_TYPE (node)) == RECORD_TYPE
-- 
2.34.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
  2022-02-04 13:49 [PATCH 1/4][RFC] middle-end/90348 - add explicit birth Richard Biener
@ 2022-02-04 14:03 ` Jakub Jelinek
  2022-02-04 14:11   ` Richard Biener
  2022-02-05 18:21 ` Jeff Law
  2022-02-07 23:46 ` Joseph Myers
  2 siblings, 1 reply; 16+ messages in thread
From: Jakub Jelinek @ 2022-02-04 14:03 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, matz

On Fri, Feb 04, 2022 at 02:49:13PM +0100, Richard Biener wrote:
> Any comments?  I have mixed feelings with proposing this for GCC 12
> but like to hear from others as well.  I didn't try to evaluate
> the quality of stack slot sharing before/after this change besides
> fixing the testsuite fallout (we have a few testcases checking for
> specific instances).

I have mixed feelings too, it is quite risky, on the other side we have
those numerous otherwise unsolvable PRs.

I wonder if tree-ssa-live.cc (compute_live_vars_1) doesn't need similar
changes, after all, it is a variant of the cfgexpand algorithm.

And I'll certainly need to incrementally mark most if not all current
build_clobbers in omp-low.cc as EOLs and emit birth clobbers (stuff that is
added post gimplification, so too late for gimplification's added birth/eol
handling there and so it must be done manually).

	Jakub


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
  2022-02-04 14:03 ` Jakub Jelinek
@ 2022-02-04 14:11   ` Richard Biener
  0 siblings, 0 replies; 16+ messages in thread
From: Richard Biener @ 2022-02-04 14:11 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, matz

On Fri, 4 Feb 2022, Jakub Jelinek wrote:

> On Fri, Feb 04, 2022 at 02:49:13PM +0100, Richard Biener wrote:
> > Any comments?  I have mixed feelings with proposing this for GCC 12
> > but like to hear from others as well.  I didn't try to evaluate
> > the quality of stack slot sharing before/after this change besides
> > fixing the testsuite fallout (we have a few testcases checking for
> > specific instances).
> 
> I have mixed feelings too, it is quite risky, on the other side we have
> those numerous otherwise unsolvable PRs.

Yep - it seems it's always stage3/4 when we get to those.  That said,
I'm happy to wait for stage1 and I'm also happy to revert if issues
pop up.

> I wonder if tree-ssa-live.cc (compute_live_vars_1) doesn't need similar
> changes, after all, it is a variant of the cfgexpand algorithm.

Oh, I wasn't aware of that.  It seems it's only used by tail-recursion
and inlining (there for inserting CLOBBERs).  But yes, I think it
might suffer from the same issue.

> And I'll certainly need to incrementally mark most if not all current
> build_clobbers in omp-low.cc as EOLs and emit birth clobbers (stuff that is
> added post gimplification, so too late for gimplification's added birth/eol
> handling there and so it must be done manually).

Adding testcases for intended stack slot sharing in those cases would
be nice.

Richard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
  2022-02-04 13:49 [PATCH 1/4][RFC] middle-end/90348 - add explicit birth Richard Biener
  2022-02-04 14:03 ` Jakub Jelinek
@ 2022-02-05 18:21 ` Jeff Law
  2022-02-05 19:54   ` Eric Botcazou
  2022-02-07 23:46 ` Joseph Myers
  2 siblings, 1 reply; 16+ messages in thread
From: Jeff Law @ 2022-02-05 18:21 UTC (permalink / raw)
  To: Richard Biener, gcc-patches; +Cc: Jakub Jelinek, matz



On 2/4/2022 6:49 AM, Richard Biener via Gcc-patches wrote:
> This adds explicit variable birth CLOBBERs in an attempt to fix
> PR90348 and duplicates.  The birth / death CLOBBER pairs are
> used to compute liveness and conflicts for stack variable
> coalescing where the lack of an explicit birth but instead
> use of first mention causes misrepresentation of variable life
> ranges when optimization moves the first mention upwards the
> original birth point at the variables bind expression start.
>
> Birth CLOBBERs are represented as traditional CLOBBERs with all
> the accompaning effect on optimization.  While they do not serve
> as a barrier for address mentions they act as barrier for the
> actual accesses which is enough for determining conflicts in
> the context of stack slot sharing.  The birth CLOBBERs are
> distinguished from death CLOBBERs by setting CLOBBER_MARKS_BIRTH
> using the private_flag on the CONSTRUCTOR node and amend the
> CLOBBER_MARK_EOL marked clobbers introduced earlier.
>
> The patch changes how we handle stack variables that are not marked
> with CLOBBERs.  For those the first mention started live which then
> lasted upon function exit which means effectively all not marked
> variables conflicted with each other variable.  This property is best
> represented by an extra flag rather than the full conflict bitmap
> which is what the patch introduces with conflicts_represented which
> is cleared at start and set to true once we visit the first birth
> CLOBBER.  From that on we assume all variable accesses are properly
> fenced by birth/death CLOBBERs.
>
> Variables without explicit births will not take part in stack slot
> sharing after this change.
>
> Currently birth CLOBBERs are added when gimplification adds
> corresponding end-of-life CLOBBERs, during bind and target expression
> gimplification.  Generally inserting births at DECL_EXPRs is
> more precise so we also do it there (also noting not all such variables
> are mentioned in BINDs).  Avoiding redundant births is on the TOOD
> list and might also remove the need for the warn_switch_unreachable_r
> hunk.
>
> This is the meat of the PR90348 fix, the following 3 patches perform
> testcase adjustments and followup fixes to avoid regressions.
>
> Bootstrapped on x86_64-unknown-linux-gnu - re-testing in progress.
>
> Any comments?  I have mixed feelings with proposing this for GCC 12
> but like to hear from others as well.  I didn't try to evaluate
> the quality of stack slot sharing before/after this change besides
> fixing the testsuite fallout (we have a few testcases checking for
> specific instances).
>
> Thanks,
> Richard.
>
> 2022-02-01  Richard Biener  <rguenther@suse.de>
>
> 	PR middle-end/90348
> 	PR middle-end/103006
> 	* tree-core.h (clobber_kind): Add CLOBBER_BIRTH.
> 	* gimplify.cc (gimplify_bind_expr): Also add birth CLOBBERs.
> 	(gimplify_target_expr): Likewise.
> 	(gimplify_decl_expr): Likewise.
> 	(warn_switch_unreachable_r): Do not treat birth CLOBBERs as
> 	real stmt - they get added at BIND starts but that's before
> 	the case labels.
> 	* tree-pretty-print.cc (dump_generic_node): Mark birth CLOBBERs.
> 	* cfgexpand.cc (stack_var::conflicts_represented): New.
> 	(add_stack_var): Initialize conflicts_represented.
> 	(add_stack_var_conflict): Assert the conflicts for the
> 	vars are represented.
> 	(stack_var_conflict_p): Honor conflicts_represented flag.
> 	(visit_op): Remove.
> 	(visit_conflict): Likewise.
> 	(add_scope_conflicts_1): Simplify by only considering birth
> 	and death CLOBBERs.
> 	(add_scope_conflicts): Adjust comment.
> 	(add_stack_protection_conflicts): Only add conflicts for
> 	variables that have them represented.
>
> 	* gcc.dg/torture/pr103006-1.c: New testcase.
> 	* gcc.dg/torture/pr103006-2.c: Likewise.
> 	* gcc.dg/torture/pr90348.c: Likewise.
The lack of explicit birth/death points in gimple has been a long 
standing problem, so definitely happy to see this moving forward. As you 
note, it's not clear if we should go forward now or wait for the next 
stage1 cycle.

In the past stack sharing has been quite important for the linux 
kernel.  So perhaps one of the tests we should do if we wanted to go 
forward in this cycle would be to test kernel builds to see if any start 
tripping over the stack space diagnostics they've put in place over the 
years.

Jeff


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
  2022-02-05 18:21 ` Jeff Law
@ 2022-02-05 19:54   ` Eric Botcazou
  2022-02-07  7:29     ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Botcazou @ 2022-02-05 19:54 UTC (permalink / raw)
  To: Jeff Law; +Cc: Richard Biener, gcc-patches, Jakub Jelinek, matz

> In the past stack sharing has been quite important for the linux
> kernel.  So perhaps one of the tests we should do if we wanted to go
> forward in this cycle would be to test kernel builds to see if any start
> tripping over the stack space diagnostics they've put in place over the
> years.

Stack slot sharing is important generally speaking, not just for the kernel.
As demonstrated by the recent controversy about the 1/X optimization, last- 
minute broad changes made during stage #4 are generally not a good idea, and 
I'm not sure what the incentive for fixing PR middle-end/90348 now is (it's 
labeled a 9/10/11/12 regression).  At a minimum, I think that a switch to 
revert to the pre-12 behavior would be in order if the patch goes in now.

-- 
Eric Botcazou



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
  2022-02-05 19:54   ` Eric Botcazou
@ 2022-02-07  7:29     ` Richard Biener
  2022-02-08 10:31       ` Eric Botcazou
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2022-02-07  7:29 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jeff Law, gcc-patches, Jakub Jelinek, matz

On Sat, 5 Feb 2022, Eric Botcazou wrote:

> > In the past stack sharing has been quite important for the linux
> > kernel.  So perhaps one of the tests we should do if we wanted to go
> > forward in this cycle would be to test kernel builds to see if any start
> > tripping over the stack space diagnostics they've put in place over the
> > years.
> 
> Stack slot sharing is important generally speaking, not just for the kernel.
> As demonstrated by the recent controversy about the 1/X optimization, last- 
> minute broad changes made during stage #4 are generally not a good idea, and 
> I'm not sure what the incentive for fixing PR middle-end/90348 now is (it's 
> labeled a 9/10/11/12 regression).  At a minimum, I think that a switch to 
> revert to the pre-12 behavior would be in order if the patch goes in now.

I don't think an option to go to pre-12 behavior is useful.  I'll
postpone the series to stage1.

Richard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
  2022-02-04 13:49 [PATCH 1/4][RFC] middle-end/90348 - add explicit birth Richard Biener
  2022-02-04 14:03 ` Jakub Jelinek
  2022-02-05 18:21 ` Jeff Law
@ 2022-02-07 23:46 ` Joseph Myers
  2022-02-08  7:20   ` Richard Biener
  2 siblings, 1 reply; 16+ messages in thread
From: Joseph Myers @ 2022-02-07 23:46 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Jakub Jelinek, matz

On Fri, 4 Feb 2022, Richard Biener via Gcc-patches wrote:

> This adds explicit variable birth CLOBBERs in an attempt to fix
> PR90348 and duplicates.  The birth / death CLOBBER pairs are
> used to compute liveness and conflicts for stack variable
> coalescing where the lack of an explicit birth but instead
> use of first mention causes misrepresentation of variable life
> ranges when optimization moves the first mention upwards the
> original birth point at the variables bind expression start.

I'm not clear on exactly where you consider a variable to be born, but 
note that non-VLA C variables have a lifetime that starts at the beginning 
of their block, not at their declaration: it's valid to jump backward from 
after the declaration to before it and then access the variable via a 
pointer, or jump forward again over the declaration and access it by name.  
(Most programs of course don't do that sort of thing.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
  2022-02-07 23:46 ` Joseph Myers
@ 2022-02-08  7:20   ` Richard Biener
  2022-02-08 14:13     ` Michael Matz
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2022-02-08  7:20 UTC (permalink / raw)
  To: Joseph Myers; +Cc: gcc-patches, Jakub Jelinek, matz

On Mon, 7 Feb 2022, Joseph Myers wrote:

> On Fri, 4 Feb 2022, Richard Biener via Gcc-patches wrote:
> 
> > This adds explicit variable birth CLOBBERs in an attempt to fix
> > PR90348 and duplicates.  The birth / death CLOBBER pairs are
> > used to compute liveness and conflicts for stack variable
> > coalescing where the lack of an explicit birth but instead
> > use of first mention causes misrepresentation of variable life
> > ranges when optimization moves the first mention upwards the
> > original birth point at the variables bind expression start.
> 
> I'm not clear on exactly where you consider a variable to be born, but 
> note that non-VLA C variables have a lifetime that starts at the beginning 
> of their block, not at their declaration: it's valid to jump backward from 
> after the declaration to before it and then access the variable via a 
> pointer, or jump forward again over the declaration and access it by name.  
> (Most programs of course don't do that sort of thing.)

Oh, interesting.  So the following is valid

int foo(int always_true_at_runtime)
{
  {
    int *p;
    if (always_true_at_runtime)
      goto after;
lab:
    return *p;
after:
    int i = 0;
    p = &i;
    if (always_true_at_runtime)
      goto lab;
  }
  return 0;
}

For the implementation I considered starting lifetime at a DECL_EXPR
if it exists and otherwise at the start of the BIND_EXPR.  Note the
complication is that control flow has to reach the lifetime-start
clobber before the first access.  But that might also save us here,
since for the example above 'i' would be live via the backedge
from goto lab.

That said, the existing complication is that when the gimplifier
visits the BIND_EXPR it has no way to know whether there will be
a following DECL_EXPR or not (and the gimplifier notes there are
cases a DECL_EXPR variable is not in a BIND_EXPR).  My plan is to
compute this during one of the body walks the gimplifier performs
before gimplifying.

Another complication is that at least the C frontend + gimplifier
combination for

  switch (i)
  {
  case 1:
    int i;
    break;
  }

will end up having the BIND_EXPR mentioning 'i' containing the
case label, so just placing the birth at the BIND will make it
unreachable as

  i = BIRTH;
case_label_for_1:
  ...

conveniently at least the C frontend has a DECL_EXPR for 'i'
which avoids this and I did not find a way (yet) in the gimplifier
to rectify this when gimplifying switches.

So there's still work to do but I think that starting the lifetime
at a DECL_EXPR if it exists is the way to go?

Richard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
  2022-02-07  7:29     ` Richard Biener
@ 2022-02-08 10:31       ` Eric Botcazou
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Botcazou @ 2022-02-08 10:31 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, gcc-patches, Jakub Jelinek, matz

> I don't think an option to go to pre-12 behavior is useful.  I'll
> postpone the series to stage1.

FWIW fine with me.

-- 
Eric Botcazou



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
  2022-02-08  7:20   ` Richard Biener
@ 2022-02-08 14:13     ` Michael Matz
  2022-02-08 14:53       ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Matz @ 2022-02-08 14:13 UTC (permalink / raw)
  To: Richard Biener; +Cc: Joseph Myers, gcc-patches, Jakub Jelinek

Hello,

On Tue, 8 Feb 2022, Richard Biener wrote:

> int foo(int always_true_at_runtime)
> {
>   {
>     int *p;
>     if (always_true_at_runtime)
>       goto after;
> lab:
>     return *p;
> after:
>     int i = 0;
>     p = &i;
>     if (always_true_at_runtime)
>       goto lab;
>   }
>   return 0;
> }
> 
> For the implementation I considered starting lifetime at a DECL_EXPR
> if it exists and otherwise at the start of the BIND_EXPR.  Note the
> complication is that control flow has to reach the lifetime-start
> clobber before the first access.  But that might also save us here,
> since for the example above 'i' would be live via the backedge
> from goto lab.
> 
> That said, the existing complication is that when the gimplifier
> visits the BIND_EXPR it has no way to know whether there will be
> a following DECL_EXPR or not (and the gimplifier notes there are
> cases a DECL_EXPR variable is not in a BIND_EXPR).  My plan is to
> compute this during one of the body walks the gimplifier performs
> before gimplifying.
> 
> Another complication is that at least the C frontend + gimplifier
> combination for
> 
>   switch (i)
>   {
>   case 1:
>     int i;
>     break;
>   }
> 
> will end up having the BIND_EXPR mentioning 'i' containing the
> case label, so just placing the birth at the BIND will make it
> unreachable as
> 
>   i = BIRTH;
> case_label_for_1:
>   ...

I think anything that needs to happen (conceptually) during the jump from 
switch to case-label needs to happen right before the jump, that might 
mean creating a new fake BLOCK that contains just the jump and the 
associated births.

> conveniently at least the C frontend has a DECL_EXPR for 'i'
> which avoids this and I did not find a way (yet) in the gimplifier
> to rectify this when gimplifying switches.

In C a 'case' label is nothing else than a normal label, it doesn't start 
it's own block or the like.  So also for that one the births would need to 
be at the start of their containing blocks.

> So there's still work to do but I think that starting the lifetime at a 
> DECL_EXPR if it exists is the way to go?

Depends on where the DECL_EXPR is exactly placed, but probably it wouldn't 
be okay.  You can't place the birth at the fall-through path, because this 
needs to work (basically your jump example above rewritten as switch):

int state = 2, *p, camefrom1 = 0;
for (;;) switch (state) {
  case 1: 
  case 2: ;
    int i;
    if (state != 1) { p = &i; i = 0; }
    if (state == 1) { (*p)++; return *p; }
    state = 1;
    continue;
}

Note how i is initialized during state 2, and needs to be kept initialized 
during state 1, so there must not be a CLOBBER (birth or other) at the 
point of the declaration of 'i'.  AFAICS in my simple tests a DECL_EXPR 
for 'i' is placed with the statement associated with 'case 2' label, 
putting a CLOBBER there would be the wrong thing.  If the decl had an 
initializer it might be harmless, as it would be overwritten at that 
place, but even so, in this case there is no initializer.  Hmm.

Another complication arises from simple forward jumps:

  goto forw;
  {
    int i;
    printf("not reachable\n");
  forw:
    i = 1;
  }

Despite the jump skiping the unreachable head of the BLOCK, 'i' needs to 
be considered birthed at the label.  (In a way the placement of births 
exactly mirrors the placements of deaths (of course), every path from 
outside a BLOCK to inside needs to birth-clobber all variables (in C), 
like every path leaving needs to kill them.  It's just that we have a 
convenient construct for the latter (try-finally), but not for the former)

Ciao,
Michael.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
  2022-02-08 14:13     ` Michael Matz
@ 2022-02-08 14:53       ` Richard Biener
  2022-02-08 15:19         ` Michael Matz
  2022-02-08 18:38         ` Joseph Myers
  0 siblings, 2 replies; 16+ messages in thread
From: Richard Biener @ 2022-02-08 14:53 UTC (permalink / raw)
  To: Michael Matz; +Cc: Joseph Myers, gcc-patches, Jakub Jelinek

On Tue, 8 Feb 2022, Michael Matz wrote:

> Hello,
> 
> On Tue, 8 Feb 2022, Richard Biener wrote:
> 
> > int foo(int always_true_at_runtime)
> > {
> >   {
> >     int *p;
> >     if (always_true_at_runtime)
> >       goto after;
> > lab:
> >     return *p;
> > after:
> >     int i = 0;
> >     p = &i;
> >     if (always_true_at_runtime)
> >       goto lab;
> >   }
> >   return 0;
> > }
> > 
> > For the implementation I considered starting lifetime at a DECL_EXPR
> > if it exists and otherwise at the start of the BIND_EXPR.  Note the
> > complication is that control flow has to reach the lifetime-start
> > clobber before the first access.  But that might also save us here,
> > since for the example above 'i' would be live via the backedge
> > from goto lab.
> > 
> > That said, the existing complication is that when the gimplifier
> > visits the BIND_EXPR it has no way to know whether there will be
> > a following DECL_EXPR or not (and the gimplifier notes there are
> > cases a DECL_EXPR variable is not in a BIND_EXPR).  My plan is to
> > compute this during one of the body walks the gimplifier performs
> > before gimplifying.
> > 
> > Another complication is that at least the C frontend + gimplifier
> > combination for
> > 
> >   switch (i)
> >   {
> >   case 1:
> >     int i;
> >     break;
> >   }
> > 
> > will end up having the BIND_EXPR mentioning 'i' containing the
> > case label, so just placing the birth at the BIND will make it
> > unreachable as
> > 
> >   i = BIRTH;
> > case_label_for_1:
> >   ...
> 
> I think anything that needs to happen (conceptually) during the jump from 
> switch to case-label needs to happen right before the jump, that might 
> mean creating a new fake BLOCK that contains just the jump and the 
> associated births.
> 
> > conveniently at least the C frontend has a DECL_EXPR for 'i'
> > which avoids this and I did not find a way (yet) in the gimplifier
> > to rectify this when gimplifying switches.
> 
> In C a 'case' label is nothing else than a normal label, it doesn't start 
> it's own block or the like.  So also for that one the births would need to 
> be at the start of their containing blocks.
> 
> > So there's still work to do but I think that starting the lifetime at a 
> > DECL_EXPR if it exists is the way to go?
> 
> Depends on where the DECL_EXPR is exactly placed, but probably it wouldn't 
> be okay.  You can't place the birth at the fall-through path, because this 
> needs to work (basically your jump example above rewritten as switch):
> 
> int state = 2, *p, camefrom1 = 0;
> for (;;) switch (state) {
>   case 1: 
>   case 2: ;
>     int i;
>     if (state != 1) { p = &i; i = 0; }
>     if (state == 1) { (*p)++; return *p; }
>     state = 1;
>     continue;
> }
> 
> Note how i is initialized during state 2, and needs to be kept initialized 
> during state 1, so there must not be a CLOBBER (birth or other) at the 
> point of the declaration of 'i'.  AFAICS in my simple tests a DECL_EXPR 
> for 'i' is placed with the statement associated with 'case 2' label, 
> putting a CLOBBER there would be the wrong thing.  If the decl had an 
> initializer it might be harmless, as it would be overwritten at that 
> place, but even so, in this case there is no initializer.  Hmm.

You get after gimplification:

  state = 2;
  camefrom1 = 0;
  <D.1989>:
  switch (state) <default: <D.1996>, case 1: <D.1983>, case 2: <D.1984>>
  {
    int i;

    try
      {
        i = {CLOBBER(birth)};  /// ignore, should go away
        <D.1983>:
        <D.1984>:
        i = {CLOBBER(birth)};
        if (state != 1) goto <D.1991>; else goto <D.1992>;
        <D.1991>:
        p = &i;
        i = 0;
        <D.1992>:
        if (state == 1) goto <D.1993>; else goto <D.1994>;
        <D.1993>:
        _1 = *p;
        _2 = _1 + 1;
        *p = _2;
        D.1995 = *p;
        // predicted unlikely by early return (on trees) predictor.
        return D.1995;
        <D.1994>:
        state = 1;
        // predicted unlikely by continue predictor.
        goto <D.1987>;
      }
    finally
      {
        i = {CLOBBER(eol)};
      }
  }

which I think is OK?  That is, when the abstract machine
arrives at 'int i;' then the previous content in 'i' goes
away?  Or would

int foo()
{
   goto ick;
use:
   int i, *p;
   return *p;
ick:
   i = 1;
   p = &i;
   goto use;

}

require us to return 1?  With the current patch 'i' is clobbered
before the return.

> Another complication arises from simple forward jumps:
> 
>   goto forw;
>   {
>     int i;
>     printf("not reachable\n");
>   forw:
>     i = 1;
>   }
> 
> Despite the jump skiping the unreachable head of the BLOCK, 'i' needs to 
> be considered birthed at the label.  (In a way the placement of births 
> exactly mirrors the placements of deaths (of course), every path from 
> outside a BLOCK to inside needs to birth-clobber all variables (in C), 
> like every path leaving needs to kill them.  It's just that we have a 
> convenient construct for the latter (try-finally), but not for the former)

The situation with an unreachable birth is handled conservatively
since we consider a variable without a (visible at RTL expansion time)
birth to conflict with all other variables.

I don't see a good way to have a birth magically appear at 'forw'
without trying to argue that the following stmt is the first mentioning
the variable.

Richard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
  2022-02-08 14:53       ` Richard Biener
@ 2022-02-08 15:19         ` Michael Matz
  2022-02-09 14:25           ` Richard Biener
  2022-02-08 18:38         ` Joseph Myers
  1 sibling, 1 reply; 16+ messages in thread
From: Michael Matz @ 2022-02-08 15:19 UTC (permalink / raw)
  To: Richard Biener; +Cc: Joseph Myers, gcc-patches, Jakub Jelinek

Hello,

On Tue, 8 Feb 2022, Richard Biener wrote:

> > int state = 2, *p, camefrom1 = 0;
> > for (;;) switch (state) {
> >   case 1: 
> >   case 2: ;
> >     int i;
> >     if (state != 1) { p = &i; i = 0; }
> >     if (state == 1) { (*p)++; return *p; }
> >     state = 1;
> >     continue;
> > }
> > 
> > Note how i is initialized during state 2, and needs to be kept initialized 
> > during state 1, so there must not be a CLOBBER (birth or other) at the 
> > point of the declaration of 'i'.  AFAICS in my simple tests a DECL_EXPR 
> > for 'i' is placed with the statement associated with 'case 2' label, 
> > putting a CLOBBER there would be the wrong thing.  If the decl had an 
> > initializer it might be harmless, as it would be overwritten at that 
> > place, but even so, in this case there is no initializer.  Hmm.
> 
> You get after gimplification:
> 
>   state = 2;
>   camefrom1 = 0;
>   <D.1989>:
>   switch (state) <default: <D.1996>, case 1: <D.1983>, case 2: <D.1984>>
>   {
>     int i;
> 
>     try
>       {
>         i = {CLOBBER(birth)};  /// ignore, should go away
>         <D.1983>:
>         <D.1984>:
>         i = {CLOBBER(birth)};

I think this clobber here would be the problem, because ...

> which I think is OK?  That is, when the abstract machine
> arrives at 'int i;' then the previous content in 'i' goes
> away?  Or would
> 
> int foo()
> {
>    goto ick;
> use:
>    int i, *p;
>    return *p;
> ick:
>    i = 1;
>    p = &i;
>    goto use;
> 
> }
> 
> require us to return 1?

... I think this is exactly the logical consequence of lifetime of 'i' 
being the whole block.  We need to return 1. (Joseph: correct me on that! 
:) )  That's what I was trying to get at with my switch example as well.

> With the current patch 'i' is clobbered before the return.
> 
> > Another complication arises from simple forward jumps:
> > 
> >   goto forw;
> >   {
> >     int i;
> >     printf("not reachable\n");
> >   forw:
> >     i = 1;
> >   }
> > 
> > Despite the jump skiping the unreachable head of the BLOCK, 'i' needs to 
> > be considered birthed at the label.  (In a way the placement of births 
> > exactly mirrors the placements of deaths (of course), every path from 
> > outside a BLOCK to inside needs to birth-clobber all variables (in C), 
> > like every path leaving needs to kill them.  It's just that we have a 
> > convenient construct for the latter (try-finally), but not for the former)
> 
> The situation with an unreachable birth is handled conservatively
> since we consider a variable without a (visible at RTL expansion time)
> birth to conflict with all other variables.

That breaks down when a birth is there (because it was otherwise 
reachable) but not on the taken path:

  if (nevertrue)
    goto start;
  goto forw;
  start:
  {
    int i;
    printf("not really reachable, but unknowingly so\n");
  forw:
    i = 1;
  }

> I don't see a good way to have a birth magically appear at 'forw' 
> without trying to argue that the following stmt is the first mentioning 
> the variable.

That's what my 'Hmm' aluded to :)  The only correct and precise way I see 
is to implement something similar like try-finally topside-down.  An 
easier but less precise way is to place the births at the (start of) 
innermost block containing the decl _and all jumps into the block_.  Even 
less presice, but perhaps even easier is to place the births for decls in 
blocks with side-entries into the function scope (and for blocks without 
side entries at their start).

Except for switches side-entries are really really seldom, so we might 
usefully get away with that latter solution.  And for switches it might be 
okay to put the births at the block containing the switch (if it itself 
doesn't have side entries, and the switch block doesn't have side 
entries except the case labels).

If the birth is moved to outward blocks it might be best if also the 
dealloc/death clobbers are moved to it, otherwise there might be paths 
containing a birth but no death.

The less precise you get with those births the more non-sharing you'll 
get, but that's the price.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
  2022-02-08 14:53       ` Richard Biener
  2022-02-08 15:19         ` Michael Matz
@ 2022-02-08 18:38         ` Joseph Myers
  2022-02-09 13:37           ` Michael Matz
  1 sibling, 1 reply; 16+ messages in thread
From: Joseph Myers @ 2022-02-08 18:38 UTC (permalink / raw)
  To: Richard Biener; +Cc: Michael Matz, Jakub Jelinek, gcc-patches

On Tue, 8 Feb 2022, Richard Biener via Gcc-patches wrote:

> which I think is OK?  That is, when the abstract machine
> arrives at 'int i;' then the previous content in 'i' goes
> away?  Or would

Yes, that's correct.  "If an initialization is specified for the object, 
it is performed each time the declaration or compound literal is reached 
in the execution of the block; otherwise, the value becomes indeterminate 
each time the declaration is reached.".

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
  2022-02-08 18:38         ` Joseph Myers
@ 2022-02-09 13:37           ` Michael Matz
  0 siblings, 0 replies; 16+ messages in thread
From: Michael Matz @ 2022-02-09 13:37 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Richard Biener, Jakub Jelinek, gcc-patches

Hey,

On Tue, 8 Feb 2022, Joseph Myers wrote:

> On Tue, 8 Feb 2022, Richard Biener via Gcc-patches wrote:
> 
> > which I think is OK?  That is, when the abstract machine
> > arrives at 'int i;' then the previous content in 'i' goes
> > away?  Or would
> 
> Yes, that's correct.  "If an initialization is specified for the object, 
> it is performed each time the declaration or compound literal is reached 
> in the execution of the block; otherwise, the value becomes indeterminate 
> each time the declaration is reached.".

Okay, that makes things easier then.  We can put the birth 
clobbers at their point of declaration, just the storage associated with a 
decl needs to survive for the whole block.  We still need to make sure 
that side entries skipping declarations correctly "allocate" such storage 
(by introducing proper conflicts with other objects), but at least values 
don't need to survive decls.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
  2022-02-08 15:19         ` Michael Matz
@ 2022-02-09 14:25           ` Richard Biener
  2022-02-09 15:55             ` Michael Matz
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2022-02-09 14:25 UTC (permalink / raw)
  To: Michael Matz; +Cc: Joseph Myers, gcc-patches, Jakub Jelinek

On Tue, 8 Feb 2022, Michael Matz wrote:

> Hello,
> 
> On Tue, 8 Feb 2022, Richard Biener wrote:
> 
> > > int state = 2, *p, camefrom1 = 0;
> > > for (;;) switch (state) {
> > >   case 1: 
> > >   case 2: ;
> > >     int i;
> > >     if (state != 1) { p = &i; i = 0; }
> > >     if (state == 1) { (*p)++; return *p; }
> > >     state = 1;
> > >     continue;
> > > }
> > > 
> > > Note how i is initialized during state 2, and needs to be kept initialized 
> > > during state 1, so there must not be a CLOBBER (birth or other) at the 
> > > point of the declaration of 'i'.  AFAICS in my simple tests a DECL_EXPR 
> > > for 'i' is placed with the statement associated with 'case 2' label, 
> > > putting a CLOBBER there would be the wrong thing.  If the decl had an 
> > > initializer it might be harmless, as it would be overwritten at that 
> > > place, but even so, in this case there is no initializer.  Hmm.
> > 
> > You get after gimplification:
> > 
> >   state = 2;
> >   camefrom1 = 0;
> >   <D.1989>:
> >   switch (state) <default: <D.1996>, case 1: <D.1983>, case 2: <D.1984>>
> >   {
> >     int i;
> > 
> >     try
> >       {
> >         i = {CLOBBER(birth)};  /// ignore, should go away
> >         <D.1983>:
> >         <D.1984>:
> >         i = {CLOBBER(birth)};
> 
> I think this clobber here would be the problem, because ...
> 
> > which I think is OK?  That is, when the abstract machine
> > arrives at 'int i;' then the previous content in 'i' goes
> > away?  Or would
> > 
> > int foo()
> > {
> >    goto ick;
> > use:
> >    int i, *p;
> >    return *p;
> > ick:
> >    i = 1;
> >    p = &i;
> >    goto use;
> > 
> > }
> > 
> > require us to return 1?
> 
> ... I think this is exactly the logical consequence of lifetime of 'i' 
> being the whole block.  We need to return 1. (Joseph: correct me on that! 
> :) )  That's what I was trying to get at with my switch example as well.
> 
> > With the current patch 'i' is clobbered before the return.
> > 
> > > Another complication arises from simple forward jumps:
> > > 
> > >   goto forw;
> > >   {
> > >     int i;
> > >     printf("not reachable\n");
> > >   forw:
> > >     i = 1;
> > >   }
> > > 
> > > Despite the jump skiping the unreachable head of the BLOCK, 'i' needs to 
> > > be considered birthed at the label.  (In a way the placement of births 
> > > exactly mirrors the placements of deaths (of course), every path from 
> > > outside a BLOCK to inside needs to birth-clobber all variables (in C), 
> > > like every path leaving needs to kill them.  It's just that we have a 
> > > convenient construct for the latter (try-finally), but not for the former)
> > 
> > The situation with an unreachable birth is handled conservatively
> > since we consider a variable without a (visible at RTL expansion time)
> > birth to conflict with all other variables.
> 
> That breaks down when a birth is there (because it was otherwise 
> reachable) but not on the taken path:
> 
>   if (nevertrue)
>     goto start;
>   goto forw;
>   start:
>   {
>     int i;
>     printf("not really reachable, but unknowingly so\n");
>   forw:
>     i = 1;
>   }

I think to cause breakage you need a use of 'i' on the side-entry
path that is not reachable from the path with the birth.  I guess sth like

   if (nevertrue)
     goto start;
   goto forw;
   start:
   {
     int i = 0;
     printf("not really reachable, but unknowingly so\n");
     goto common;
   forw:
     i = 1;
   common:
     foo (&i);
   }

if we'd have a variable that's live only on the side-entry path
then it could share the stack slot with 'i' this way, breaking
things (now we don't move CLOBBERs so it isn't easy to construct
such case).  The present dataflow would, for the above, indeed
compute 'i' not live in the forw: block.

> > I don't see a good way to have a birth magically appear at 'forw' 
> > without trying to argue that the following stmt is the first mentioning 
> > the variable.
> 
> That's what my 'Hmm' aluded to :)  The only correct and precise way I see 
> is to implement something similar like try-finally topside-down.  An 
> easier but less precise way is to place the births at the (start of) 
> innermost block containing the decl _and all jumps into the block_.  Even 
> less presice, but perhaps even easier is to place the births for decls in 
> blocks with side-entries into the function scope (and for blocks without 
> side entries at their start).
> 
> Except for switches side-entries are really really seldom, so we might 
> usefully get away with that latter solution.  And for switches it might be 
> okay to put the births at the block containing the switch (if it itself 
> doesn't have side entries, and the switch block doesn't have side 
> entries except the case labels).
> 
> If the birth is moved to outward blocks it might be best if also the 
> dealloc/death clobbers are moved to it, otherwise there might be paths 
> containing a birth but no death.
> 
> The less precise you get with those births the more non-sharing you'll 
> get, but that's the price.

Yes, sure.  I don't see a good way to place births during gimplification
then.  The end clobbers rely on our EH lowering machinery.  For
the entries we might be able to compute GIMPLE BIND transitions during
BIND lowering if we associate labels with BINDs.  There should be
a single fallthru into the BIND at this point.  We could put a flag
on the goto destination labels whether they are reached from
an outer BIND.

 goto inner;
 {
   {
    int i;
     {
       int j;
inner:
       goto middle;
     }
    middle:
   }
 }

Since an entry CLOBBER is also a clobber we have to insert birth
clobbers for all decls of all the binds inbetwee the goto source
and the destination.  So for the above case on the edge to
inner: have births for i and j and at the edge to middle we'd
have none.

Requires some kind of back-mapping from label to goto sources
that are possibly problematic.  One issue is that GIMPLE
lowering re-builds the BLOCK tree (for whatever reason...),
so I'm not sure if we need to do it before that (for correctness).

Does that make sense?

Richard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4][RFC] middle-end/90348 - add explicit birth
  2022-02-09 14:25           ` Richard Biener
@ 2022-02-09 15:55             ` Michael Matz
  0 siblings, 0 replies; 16+ messages in thread
From: Michael Matz @ 2022-02-09 15:55 UTC (permalink / raw)
  To: Richard Biener; +Cc: Joseph Myers, gcc-patches, Jakub Jelinek

Hello,

On Wed, 9 Feb 2022, Richard Biener wrote:

> > That breaks down when a birth is there (because it was otherwise 
> > reachable) but not on the taken path:
> > 
> >   if (nevertrue)
> >     goto start;
> >   goto forw;
> >   start:
> >   {
> >     int i;
> >     printf("not really reachable, but unknowingly so\n");
> >   forw:
> >     i = 1;
> >   }
> 
> I think to cause breakage you need a use of 'i' on the side-entry
> path that is not reachable from the path with the birth.  I guess sth like
> 
>    if (nevertrue)
>      goto start;
>    goto forw;
>    start:
>    {
>      int i = 0;
>      printf("not really reachable, but unknowingly so\n");
>      goto common;
>    forw:
>      i = 1;
>    common:
>      foo (&i);
>    }
> 
> if we'd have a variable that's live only on the side-entry path
> then it could share the stack slot with 'i' this way, breaking
> things (now we don't move CLOBBERs so it isn't easy to construct
> such case).  The present dataflow would, for the above, indeed
> compute 'i' not live in the forw: block.

Yes, now that we have established (in the subthread with Joseph) that the 
value becomes indeterminate at decls we only need to care for not sharing 
storage invalidly, so yeah, some changes in the conflict computation still 
are needed.

> > Except for switches side-entries are really really seldom, so we might 
> > usefully get away with that latter solution.  And for switches it might be 
> > okay to put the births at the block containing the switch (if it itself 
> > doesn't have side entries, and the switch block doesn't have side 
> > entries except the case labels).
> > 
> > If the birth is moved to outward blocks it might be best if also the 
> > dealloc/death clobbers are moved to it, otherwise there might be paths 
> > containing a birth but no death.
> > 
> > The less precise you get with those births the more non-sharing you'll 
> > get, but that's the price.
> 
> Yes, sure.  I don't see a good way to place births during gimplification
> then.

Well, for each BIND you can compute if there are side entries at all, 
then, when lowering a BIND you put the births into the containing 
innermost BIND that doesn't have side-entries, instead of into the current 
BIND.

> The end clobbers rely on our EH lowering machinery.  For the entries we 
> might be able to compute GIMPLE BIND transitions during BIND lowering if 
> we associate labels with BINDs.  There should be a single fallthru into 
> the BIND at this point.  We could put a flag on the goto destination 
> labels whether they are reached from an outer BIND.
> 
>  goto inner;
>  {
>    {
>     int i;
>      {
>        int j;
> inner:
>        goto middle;
>      }
>     middle:
>    }
>  }
> 
> Since an entry CLOBBER is also a clobber we have to insert birth
> clobbers for all decls of all the binds inbetwee the goto source
> and the destination.  So for the above case on the edge to
> inner: have births for i and j and at the edge to middle we'd
> have none.

Correct, that's basically the most precise scheme, it's what I called 
try-finally topside-down ("always-before"? :) ).  (You have to care for 
computed goto! i.e. BINDs containing address-taken labels, which make 
things even uglier)  I think the easier way to deal with the above is to 
notice that the inner BIND has a side entry and conceptually move the 
decls outwards to BINDs that don't have such (or bind-crossing gotos), 
i.e. do as if it were:

  int i;  // moved
  int j;  // moved
  goto inner;
  {
    {
      {
  inner:
        goto middle;
      }
  middle:
    }
  }

> Requires some kind of back-mapping from label to goto sources
> that are possibly problematic.  One issue is that GIMPLE
> lowering re-builds the BLOCK tree (for whatever reason...),
> so I'm not sure if we need to do it before that (for correctness).
> 
> Does that make sense?

I honestly can't say for 100% :-)  It sounds like it makes sense, yes.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-02-09 15:55 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-04 13:49 [PATCH 1/4][RFC] middle-end/90348 - add explicit birth Richard Biener
2022-02-04 14:03 ` Jakub Jelinek
2022-02-04 14:11   ` Richard Biener
2022-02-05 18:21 ` Jeff Law
2022-02-05 19:54   ` Eric Botcazou
2022-02-07  7:29     ` Richard Biener
2022-02-08 10:31       ` Eric Botcazou
2022-02-07 23:46 ` Joseph Myers
2022-02-08  7:20   ` Richard Biener
2022-02-08 14:13     ` Michael Matz
2022-02-08 14:53       ` Richard Biener
2022-02-08 15:19         ` Michael Matz
2022-02-09 14:25           ` Richard Biener
2022-02-09 15:55             ` Michael Matz
2022-02-08 18:38         ` Joseph Myers
2022-02-09 13:37           ` Michael Matz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).