public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] store VLA bounds in attribute access as strings (PR 97172)
@ 2020-12-19  0:55 Martin Sebor
  2020-12-19  7:48 ` Richard Biener
  0 siblings, 1 reply; 15+ messages in thread
From: Martin Sebor @ 2020-12-19  0:55 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1069 bytes --]

To keep tree expressions stored by the front end in attribute
access for nontrivial VLA bounds from getting corrupted during
Gimplification and to avoid breaking the preconditions verified
by the LTO streamer that no such trees exist in the IL,
the attached patch replaces those bounds with a string
representation of those expressions (as STRING_CST).  It also
tweaks the pretty-printer to improve the formatting of the VLA
bounds and avoid inserting spurious spaces in some cases.

The strings are only used by the front end to verify that
redeclarations of the same function match in the form and bounds
of their VLA arguments, and they're not needed in the middle end.
I considered removing them just before the front end finishes but
I couldn't find an efficient way to do that.  Is there some data
structure that stores all function declarations in a translation
unit?  If there is, then traversing it and removing the attribute
arguments might also be an option, either in addition to this
change or in lieu of it.

The patch was tested on x86_64-linux.

Martin

[-- Attachment #2: gcc-97172.diff --]
[-- Type: text/x-patch, Size: 19357 bytes --]

Store nontrivial VLA bounds as strings (PR c/97172).

gcc/ChangeLog

	PR c/97172
	* attribs.c (attr_access::array_as_string): Handle VLA bounds
	represented as strings.
	* pretty-print.c (pp_string): Add argument and use it.
	* pretty-print.h (pp_string): Add default argument.
	* tree-pretty-print.c (print_generic_expr_to_string_cst): Define.
	(dump_generic_node): Avoid printing spurious space; parenthesize
	operands to avoid three consecutive minus signs.
	(op_symbol_code):
	* tree-pretty-print.h (print_generic_expr_to_string_cst): New.
	* tree.c (array_bound_from_maxval): Define.
	* tree.h (array_bound_from_maxval): New.

gcc/c-family/ChangeLog:

	PR c/97172
	* c-attribs.c (handle_argspec_attribute): Handle VLA bounds
	represented as strings.
	(build_attr_access_from_parms): Adjust comment.
	* c-pretty-print.c (c_pretty_printer::direct_abstract_declarator):
	Call array_bound_from_maxval.
	* c-warn.c (warn_parm_array_mismatch): Handle VLA bounds represented
	as strings.

gcc/c/ChangeLog:

	PR c/97172
	* c-decl.c (get_parm_array_spec): Store nontrivial VLA bounds as
	strings.

gcc/testsuite/ChangeLog:

	PR c/97172
	* gcc.dg/Wvla-parameter-2.c: Adjust text of expected warning.
	* gcc.dg/Wvla-parameter-9.c: New test.

diff --git a/gcc/attribs.c b/gcc/attribs.c
index a6f6b70e39e..8803e0e7260 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -2257,6 +2257,9 @@ attr_access::array_as_string (tree type) const
   if (type == error_mark_node)
     return std::string ();
 
+  /* The most significant bound of a VLA as a PARM_DECL or VAR_DECL,
+     or STRING_CST for expressions, or null.  */
+  tree bound = NULL_TREE;
   if (this->str)
     {
       /* For array parameters (but not pointers) create a temporary array
@@ -2275,15 +2278,20 @@ attr_access::array_as_string (tree type) const
 	  const char *p = end;
 	  for ( ; p != str && *p-- != ']'; );
 	  if (*p == '$')
-	    index_type = build_index_type (TREE_VALUE (size));
+	    {
+	      bound = TREE_VALUE (size);
+	      if (DECL_P (bound))
+		index_type = build_index_type (bound);
+	    }
 	}
       else if (minsize)
 	index_type = build_index_type (size_int (minsize - 1));
 
       tree arat = NULL_TREE;
-      if (static_p || vla_p)
+      if ((static_p || vla_p))
 	{
-	  tree flag = static_p ? integer_one_node : NULL_TREE;
+	  tree flag = (static_p && (!bound || DECL_P (bound))
+		       ? integer_one_node : NULL_TREE);
 	  /* Hack: there's no language-independent way to encode
 	     the "static" specifier or the "*" notation in an array type.
 	     Add a "fake" attribute to have the pretty-printer add "static"
@@ -2307,6 +2315,21 @@ attr_access::array_as_string (tree type) const
   typstr = pp_formatted_text (pp);
   delete pp;
 
+  if (bound && TREE_CODE (bound) == STRING_CST)
+    {
+      /* For the most significant bound that's not a DECL and that's
+	 stored as a STRING_CST, replace the asterisk in the first [*]
+	 it was formatted as, or the first "static", with its human-
+	 readable representation.  */
+      size_t pos = typstr.find ("[*]", 0);
+      if (static_p)
+	{
+	  typstr.replace (pos + 1, 1, "static  ");
+	  pos += 7;
+	}
+      typstr.replace (pos + 1, 1, TREE_STRING_POINTER (bound));
+    }
+
   return typstr;
 }
 
diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 29e26728300..40d6389102f 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -3539,7 +3539,7 @@ handle_argspec_attribute (tree *, tree, tree args, int, bool *)
   for (tree next = TREE_CHAIN (args); next; next = TREE_CHAIN (next))
     {
       tree val = TREE_VALUE (next);
-      gcc_assert (DECL_P (val) || EXPR_P (val));
+      gcc_assert (DECL_P (val) || TREE_CODE (val) == STRING_CST);
     }
   return NULL_TREE;
 }
@@ -4910,7 +4910,7 @@ tree
 build_attr_access_from_parms (tree parms, bool skip_voidptr)
 {
   /* Maps each named integral argument DECL seen so far to its position
-     in the argument list; used to associate VLA sizes with arguments.  */
+     in the argument list; used to associate VLA bounds with arguments.  */
   hash_map<tree, unsigned> arg2pos;
 
   /* The string representation of the access specification for all
diff --git a/gcc/c-family/c-pretty-print.c b/gcc/c-family/c-pretty-print.c
index 3027703056b..0c182c80b90 100644
--- a/gcc/c-family/c-pretty-print.c
+++ b/gcc/c-family/c-pretty-print.c
@@ -633,22 +633,7 @@ c_pretty_printer::direct_abstract_declarator (tree t)
 		  /* Strip the expressions from around a VLA bound added
 		     internally to make it fit the domain mold, including
 		     any casts.  */
-		  if (TREE_CODE (maxval) == NOP_EXPR)
-		    maxval = TREE_OPERAND (maxval, 0);
-		  if (TREE_CODE (maxval) == PLUS_EXPR
-		      && integer_all_onesp (TREE_OPERAND (maxval, 1)))
-		    {
-		      maxval = TREE_OPERAND (maxval, 0);
-		      if (TREE_CODE (maxval) == NOP_EXPR)
-			maxval = TREE_OPERAND (maxval, 0);
-		    }
-		  if (TREE_CODE (maxval) == SAVE_EXPR)
-		    {
-		      maxval = TREE_OPERAND (maxval, 0);
-		      if (TREE_CODE (maxval) == NOP_EXPR)
-			maxval = TREE_OPERAND (maxval, 0);
-		    }
-
+		  maxval = array_bound_from_maxval (maxval);
 		  expression (maxval);
 		}
 	    }
diff --git a/gcc/c-family/c-warn.c b/gcc/c-family/c-warn.c
index 91abafe49b1..22708af16cb 100644
--- a/gcc/c-family/c-warn.c
+++ b/gcc/c-family/c-warn.c
@@ -3578,17 +3578,47 @@ warn_parm_array_mismatch (location_t origloc, tree fndecl, tree newparms)
 	  tree newbnd = vla_bound_parm_decl (TREE_VALUE (newvbl));
 	  tree curbnd = vla_bound_parm_decl (TREE_VALUE (curvbl));
 
-	  if (newpos == curpos && newbnd == curbnd)
+	  /* The printable representation of the mismatched bound.  */
+	  const char *newbndstr = nullptr, *curbndstr = nullptr;
+	  if (TREE_CODE (newbnd) == STRING_CST
+	      && TREE_CODE (curbnd) == STRING_CST)
+	    {
+	      newbndstr = TREE_STRING_POINTER (newbnd);
+	      curbndstr = TREE_STRING_POINTER (curbnd);	
+	    }
+
+	  if (newpos == curpos)
+	    {
 	    /* In the expected case when both bounds either refer to
 	       the same positional parameter or when neither does,
 	       and both are the same expression they are necessarily
-	       the same.  */
-	    continue;
+	       the same.  */	
+	      if (newbnd == curbnd)
+		continue;
 
-	  const char* const newbndstr =
-	    newbnd ? print_generic_expr_to_str (newbnd) : "*";
-	  const char* const curbndstr =
-	    curbnd ? print_generic_expr_to_str (curbnd) : "*";
+	      if (newbndstr && curbndstr && !strcmp (newbndstr, curbndstr))
+		continue;
+	    }
+
+	  {
+	    /* Convert the bounds to their printable representation.
+	       This is a little convoluted because bounds stored as
+	       strings should not be enclosed in double quotes.  */
+	    auto bnd2str = [](tree bnd) -> const char*
+	      {
+	       if (!bnd)
+		 return "*";
+	       if (TREE_CODE (bnd) == STRING_CST)
+		 return TREE_STRING_POINTER (bnd);
+	       return print_generic_expr_to_str (bnd);
+	      };
+
+	    if (!newbndstr)
+	      newbndstr = bnd2str (newbnd);
+
+	    if (!curbndstr)
+	      curbndstr = bnd2str (curbnd);
+	  }
 
 	  if (!newpos != !curpos
 	      || (newpos && !tree_int_cst_equal (newpos, curpos)))
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 27f77224ea4..c6d81834528 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -5782,6 +5782,12 @@ get_parm_array_spec (const struct c_parm *parm, tree attrs)
 		  /* Each variable VLA bound is represented by the dollar
 		     sign.  */
 		  spec += "$";
+		  if (EXPR_P (nelts))
+		    {
+		      nelts = array_bound_from_maxval (nelts);
+		      if (EXPR_P (nelts))
+			nelts = print_generic_expr_to_string_cst (nelts);
+		    }
 		  tpbnds = tree_cons (NULL_TREE, nelts, tpbnds);
 		}
 	    }
@@ -5836,6 +5842,12 @@ get_parm_array_spec (const struct c_parm *parm, tree attrs)
 
       /* Each variable VLA bound is represented by a dollar sign.  */
       spec += "$";
+      if (EXPR_P (nelts))
+	{
+	  nelts = array_bound_from_maxval (nelts);
+	  if (EXPR_P (nelts))
+	    nelts = print_generic_expr_to_string_cst (nelts);
+	}
       vbchain = tree_cons (NULL_TREE, nelts, vbchain);
     }
 
diff --git a/gcc/pretty-print.c b/gcc/pretty-print.c
index 407f7300dfb..4ecdb03048b 100644
--- a/gcc/pretty-print.c
+++ b/gcc/pretty-print.c
@@ -1819,13 +1819,16 @@ pp_character (pretty_printer *pp, int c)
   ++pp_buffer (pp)->line_length;
 }
 
-/* Append a STRING to the output area of PRETTY-PRINTER; the STRING may
-   be line-wrapped if in appropriate mode.  */
+/* Append the first LEN characters of STRING to the output area of
+   PRETTY-PRINTER; the STRING may be line-wrapped if in appropriate
+   mode.  */
 void
-pp_string (pretty_printer *pp, const char *str)
+pp_string (pretty_printer *pp, const char *str, int len /* = -1 */)
 {
   gcc_checking_assert (str);
-  pp_maybe_wrap_text (pp, str, str + strlen (str));
+  if (len < 0)
+    len = strlen (str);
+  pp_maybe_wrap_text (pp, str, str + len);
 }
 
 /* Append the leading N characters of STRING to the output area of
diff --git a/gcc/pretty-print.h b/gcc/pretty-print.h
index 22892f12ab7..8b125c363ca 100644
--- a/gcc/pretty-print.h
+++ b/gcc/pretty-print.h
@@ -393,7 +393,7 @@ extern void pp_format_verbatim (pretty_printer *, text_info *);
 extern void pp_indent (pretty_printer *);
 extern void pp_newline (pretty_printer *);
 extern void pp_character (pretty_printer *, int);
-extern void pp_string (pretty_printer *, const char *);
+extern void pp_string (pretty_printer *, const char *, int = -1);
 
 extern void pp_write_text_to_stream (pretty_printer *);
 extern void pp_write_text_as_dot_label_to_stream (pretty_printer *, bool);
diff --git a/gcc/testsuite/gcc.dg/Wvla-parameter-2.c b/gcc/testsuite/gcc.dg/Wvla-parameter-2.c
index 01728e7ebb7..d8e6867c777 100644
--- a/gcc/testsuite/gcc.dg/Wvla-parameter-2.c
+++ b/gcc/testsuite/gcc.dg/Wvla-parameter-2.c
@@ -26,7 +26,7 @@ typedef int A7[7];
 void fm_A7_m_5 (int m, A7[m][5]);               // { dg-message "previously declared as 'int\\\[m]\\\[5]\\\[7]' with bound argument 1" "note" }
 void fm_A7_m_5 (int n, A7[n][5]);
 
-void fm_A7_m_5 (int n, A7[n + 1][5]);           // { dg-warning "argument 2 of type 'int\\\[n \\\+ 1]\\\[5]\\\[7]' declared with mismatched bound 'n \\\+ 1'" }
+void fm_A7_m_5 (int n, A7[n + 1][5]);           // { dg-warning "argument 2 of type 'int\\\[n \\\+ 1]\\\[5]\\\[7]' (.aka 'int\\\[]\\\[5]\\\[7]'. )?declared with mismatched bound 'n \\\+ 1'" }
 
 
 int n1, n2, n3, n4, n5, n6, n7, n8, n9;
diff --git a/gcc/testsuite/gcc.dg/Wvla-parameter-9.c b/gcc/testsuite/gcc.dg/Wvla-parameter-9.c
new file mode 100644
index 00000000000..4e9952ee61b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wvla-parameter-9.c
@@ -0,0 +1,39 @@
+/* Verify that VLA bound expressions are formatted in a readable and
+   unambiguous way and without extraneous spaces.
+   { dg-do compile }
+   { dg-options "-Wall" } */
+
+int f (int n, int[n]);
+
+int f (int n, int[+(++n)]);     // { dg-warning "argument 2 of type 'int\\\[\\+\\+n]' declared with mismatched bound '\\+\\+n'" }
+int f (int n, int[+(--n)]);     // { dg-warning "argument 2 of type 'int\\\[--n]' declared with mismatched bound '--n'" }
+
+int f (int n, int[-(++n)]);     // { dg-warning "argument 2 of type 'int\\\[-\\+\\+n]' declared with mismatched bound '-\\+\\+n'" }
+int f (int n, int[-(--n)]);     // { dg-warning "argument 2 of type 'int\\\[-\\(--n\\)]' declared with mismatched bound '-\\(--n\\)'" }
+
+extern int *ip, **ipp;
+
+int f (int n, int[n * *ip]);    // { dg-warning "argument 2 of type 'int\\\[n \\* \\*ip]' declared with mismatched bound 'n \\* \\*ip'" }
+int f (int n, int[n * **ipp]);  // { dg-warning "argument 2 of type 'int\\\[n \\* \\*\\*ipp]' declared with mismatched bound 'n \\* \\*\\*ipp'" }
+int f (int n, int[*ip]);        // { dg-warning "argument 2 of type 'int\\\[\\*ip]' declared with mismatched bound '\\*ip'" }
+
+
+/* The type in the following is butchered as
+   'int[*(ip + (sizetype) ((long unsigned int) n * 4))]'  */
+int f (int n, int[ip[n]]);      // { dg-warning "argument 2 of type 'int\\\[ip\\\[n]]' declared with mismatched bound 'ip\\\[n]'" "pr?????" { xfail *-*-* } }
+                                // { dg-warning "argument 2 of type 'int" "butchered type" { target *-*-* } .-1 }
+
+/* This is even worse and because of the newline it's not even handled
+   by DejaGnu:
+     'int[<<< Unknown tree: c_maybe_const_expr
+     *ip >>> == 0]'
+int f (int n, int[!*ip]);       // { xxxdg-warning "argument 2 of type 'int\\\[!\\*ip]' declared with mismatched bound '!\\*ip'" "pr?????" { xfail *-*-* } }
+                                // { xxxdg-warning "argument 2 of type " broken { target *-*-* } .-1 }
+				*/
+
+int g (int n, int[n + 1][n + 2]); // { dg-message "previously declared as 'int\\\[n \\+ 1]\\\[n \\+ 2]' with bound " }
+int g (int n, int[n + 1][n]);     // { dg-warning "argument 2 of type 'int\\\[n \\+ 1]\\\[n]' declared with mismatched bound argument 1" }
+int g (int n, int[n + 1][n + 1]); // { dg-warning "argument 2 of type 'int\\\[n \\+ 1]\\\[n \\+ 1]' declared with mismatched bound 'n \\+ 1'" }
+int g (int n, int[n][n + 2]);     // { dg-warning "argument 2 of type 'int\\\[n]\\\[n \\+ 2]' declared with mismatched bound argument 1" }
+int g (int n, int[n + 2][n + 2]); // { dg-warning "argument 2 of type 'int\\\[n \\+ 2]\\\[n \\+ 2]' declared with mismatched bound 'n \\+ 2'" }
+
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index ae4b898782a..4d6657fd468 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -179,6 +179,17 @@ print_generic_expr_to_str (tree t)
   return xstrdup (pp_formatted_text (&pp));
 }
 
+/* Return the expression T formatted as STRING_CST.  */
+
+tree
+print_generic_expr_to_string_cst (tree t)
+{
+  pretty_printer pp;
+  dump_generic_node (&pp, t, 0, TDF_VOPS|TDF_MEMSYMS, false);
+  const char *s = pp_formatted_text (&pp);
+  return build_string (strlen (s) + 1, s);
+}
+
 /* Dump NAME, an IDENTIFIER_POINTER, sanitized so that D<num> sequences
    in it are replaced with Dxxxx, as long as they are at the start or
    preceded by $ and at the end or followed by $.  See make_fancy_name
@@ -1635,7 +1646,6 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
   tree op0, op1;
   const char *str;
   bool is_expr;
-  enum tree_code code;
 
   if (node == NULL_TREE)
     return spc;
@@ -1648,7 +1658,7 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
   if ((flags & TDF_LINENO) && EXPR_HAS_LOCATION (node))
     dump_location (pp, EXPR_LOCATION (node));
 
-  code = TREE_CODE (node);
+  tree_code code = TREE_CODE (node);
   switch (code)
     {
     case ERROR_MARK:
@@ -2737,22 +2747,37 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
     case PREDECREMENT_EXPR:
     case PREINCREMENT_EXPR:
     case INDIRECT_REF:
-      if (TREE_CODE (node) == ADDR_EXPR
-	  && (TREE_CODE (TREE_OPERAND (node, 0)) == STRING_CST
-	      || TREE_CODE (TREE_OPERAND (node, 0)) == FUNCTION_DECL))
-	;	/* Do not output '&' for strings and function pointers.  */
-      else
-	pp_string (pp, op_symbol (node));
+      {
+	tree op0 = TREE_OPERAND (node, 0);
+	/* Set to enclose operand in a pair of parentheses.  */
+	bool lparen = op_prio (op0) < op_prio (node);
+	bool rparen = lparen;
+	if (TREE_CODE (node) == ADDR_EXPR
+	    && (TREE_CODE (op0) == STRING_CST
+		|| TREE_CODE (op0) == FUNCTION_DECL))
+	  ;	/* Do not output '&' for strings and function pointers.  */
+	else
+	  {
+	    pp_string (pp, op_symbol (node));
 
-      if (op_prio (TREE_OPERAND (node, 0)) < op_prio (node))
-	{
+	    tree_code op0code = TREE_CODE (op0);
+	    if (((code == NEGATE_EXPR || code == PREDECREMENT_EXPR)
+		 && (op0code == NEGATE_EXPR || op0code == PREDECREMENT_EXPR)))
+	      {
+		/* Enclosing the operand in a pair of parentheses.  */
+		pp_left_paren (pp);
+		lparen = false;
+		rparen = true;
+	      }
+	  }
+
+	if (lparen)
 	  pp_left_paren (pp);
-	  dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+	dump_generic_node (pp, op0, spc, flags, false);
+	if (rparen)
 	  pp_right_paren (pp);
-	}
-      else
-	dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
-      break;
+	break;
+      }
 
     case POSTDECREMENT_EXPR:
     case POSTINCREMENT_EXPR:
@@ -4168,16 +4193,16 @@ op_symbol_code (enum tree_code code)
       return "%[rd]";
 
     case PREDECREMENT_EXPR:
-      return " --";
+      return "--";
 
     case PREINCREMENT_EXPR:
-      return " ++";
+      return "++";
 
     case POSTDECREMENT_EXPR:
-      return "-- ";
+      return "--";
 
     case POSTINCREMENT_EXPR:
-      return "++ ";
+      return "++";
 
     case MAX_EXPR:
       return "max";
diff --git a/gcc/tree-pretty-print.h b/gcc/tree-pretty-print.h
index aca7679b496..b09e43a2183 100644
--- a/gcc/tree-pretty-print.h
+++ b/gcc/tree-pretty-print.h
@@ -39,6 +39,7 @@ extern void print_generic_stmt (FILE *, tree, dump_flags_t = TDF_NONE);
 extern void print_generic_stmt_indented (FILE *, tree, dump_flags_t, int);
 extern void print_generic_expr (FILE *, tree, dump_flags_t = TDF_NONE);
 extern char *print_generic_expr_to_str (tree);
+extern tree print_generic_expr_to_string_cst (tree);
 extern void dump_omp_clauses (pretty_printer *, tree, int, dump_flags_t);
 extern void dump_omp_atomic_memory_order (pretty_printer *,
 					  enum omp_memory_order);
diff --git a/gcc/tree.c b/gcc/tree.c
index 5fd9da3ab96..a4145b13d5e 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -9354,6 +9354,48 @@ variably_modified_type_p (tree type, tree fn)
 #undef RETURN_TRUE_IF_VAR
 }
 
+/* Strip any expressions from around the array bound MAXVAL, including
+   any casts, added internally to VLAs to make them fit the array domain
+   mold.  Return the result.  */
+
+tree
+array_bound_from_maxval (tree maxval)
+{
+  tree bound = maxval;
+
+  if (TREE_CODE (bound) == NOP_EXPR)
+    bound = TREE_OPERAND (bound, 0);
+
+  if (TREE_CODE (bound) == CONVERT_EXPR)
+    {
+      tree op0 = TREE_OPERAND (bound, 0);
+      tree bndtyp = TREE_TYPE (bound);
+      tree op0typ = TREE_TYPE (op0);
+      if (TYPE_PRECISION (bndtyp) == TYPE_PRECISION (op0typ))
+	bound = op0;
+    }
+
+  if (TREE_CODE (bound) == NON_LVALUE_EXPR)
+    bound = TREE_OPERAND (bound, 0);
+
+  if (TREE_CODE (bound) == PLUS_EXPR
+      && integer_all_onesp (TREE_OPERAND (bound, 1)))
+    {
+      bound = TREE_OPERAND (bound, 0);
+      if (TREE_CODE (bound) == NOP_EXPR)
+	bound = TREE_OPERAND (bound, 0);
+    }
+
+  if (TREE_CODE (bound) == SAVE_EXPR)
+    {
+      bound = TREE_OPERAND (bound, 0);
+      if (TREE_CODE (bound) == NOP_EXPR)
+	bound = TREE_OPERAND (bound, 0);
+    }
+
+  return bound;
+}
+
 /* Given a DECL or TYPE, return the scope in which it was declared, or
    NULL_TREE if there is no containing scope.  */
 
diff --git a/gcc/tree.h b/gcc/tree.h
index d366ffd8a51..3668ff4bcc4 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5235,6 +5235,7 @@ extern bool int_fits_type_p (const_tree, const_tree)
 extern void get_type_static_bounds (const_tree, mpz_t, mpz_t);
 #endif
 extern bool variably_modified_type_p (tree, tree);
+extern tree array_bound_from_maxval (tree) ATTRIBUTE_NONNULL (1);
 extern int tree_log2 (const_tree);
 extern int tree_floor_log2 (const_tree);
 extern unsigned int tree_ctz (const_tree);

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)
  2020-12-19  0:55 [PATCH] store VLA bounds in attribute access as strings (PR 97172) Martin Sebor
@ 2020-12-19  7:48 ` Richard Biener
  2020-12-20 17:43   ` Martin Sebor
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Biener @ 2020-12-19  7:48 UTC (permalink / raw)
  To: Martin Sebor, Martin Sebor via Gcc-patches, gcc-patches

On December 19, 2020 1:55:02 AM GMT+01:00, Martin Sebor via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>To keep tree expressions stored by the front end in attribute
>access for nontrivial VLA bounds from getting corrupted during
>Gimplification and to avoid breaking the preconditions verified
>by the LTO streamer that no such trees exist in the IL,
>the attached patch replaces those bounds with a string
>representation of those expressions (as STRING_CST).  It also
>tweaks the pretty-printer to improve the formatting of the VLA
>bounds and avoid inserting spurious spaces in some cases.
>
>The strings are only used by the front end to verify that
>redeclarations of the same function match in the form and bounds
>of their VLA arguments, and they're not needed in the middle end.
>I considered removing them just before the front end finishes but
>I couldn't find an efficient way to do that.  Is there some data
>structure that stores all function declarations in a translation
>unit?  If there is, then traversing it and removing the attribute
>arguments might also be an option, either in addition to this
>change or in lieu of it.

There is the free lang data pass in tree.c which walks all reachable tree nodes. 

>The patch was tested on x86_64-linux.
>
>Martin


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)
  2020-12-19  7:48 ` Richard Biener
@ 2020-12-20 17:43   ` Martin Sebor
  2021-01-04 12:59     ` Richard Biener
  0 siblings, 1 reply; 15+ messages in thread
From: Martin Sebor @ 2020-12-20 17:43 UTC (permalink / raw)
  To: Richard Biener, Martin Sebor, Martin Sebor via Gcc-patches

On 12/19/20 12:48 AM, Richard Biener via Gcc-patches wrote:
> On December 19, 2020 1:55:02 AM GMT+01:00, Martin Sebor via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>> To keep tree expressions stored by the front end in attribute
>> access for nontrivial VLA bounds from getting corrupted during
>> Gimplification and to avoid breaking the preconditions verified
>> by the LTO streamer that no such trees exist in the IL,
>> the attached patch replaces those bounds with a string
>> representation of those expressions (as STRING_CST).  It also
>> tweaks the pretty-printer to improve the formatting of the VLA
>> bounds and avoid inserting spurious spaces in some cases.
>>
>> The strings are only used by the front end to verify that
>> redeclarations of the same function match in the form and bounds
>> of their VLA arguments, and they're not needed in the middle end.
>> I considered removing them just before the front end finishes but
>> I couldn't find an efficient way to do that.  Is there some data
>> structure that stores all function declarations in a translation
>> unit?  If there is, then traversing it and removing the attribute
>> arguments might also be an option, either in addition to this
>> change or in lieu of it.
> 
> There is the free lang data pass in tree.c which walks all reachable tree nodes.

You said in response to Honza's suggestion in pr97172 to do that:

   The frontend needs to make sure no frontend specific tree codes
   leak into GENERIC so GENERICization is the place where the FE
   should clear those.  FLD is too late and doing it there would
   be a hack.

Are you now saying you've had a change of heart and that doing it
there isn't a hack after all?

Martin

>> The patch was tested on x86_64-linux.
>>
>> Martin
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)
  2020-12-20 17:43   ` Martin Sebor
@ 2021-01-04 12:59     ` Richard Biener
  2021-01-04 19:14       ` Jeff Law
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Biener @ 2021-01-04 12:59 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Martin Sebor, Martin Sebor via Gcc-patches

On Sun, Dec 20, 2020 at 6:43 PM Martin Sebor <msebor@gmail.com> wrote:
>
> On 12/19/20 12:48 AM, Richard Biener via Gcc-patches wrote:
> > On December 19, 2020 1:55:02 AM GMT+01:00, Martin Sebor via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> >> To keep tree expressions stored by the front end in attribute
> >> access for nontrivial VLA bounds from getting corrupted during
> >> Gimplification and to avoid breaking the preconditions verified
> >> by the LTO streamer that no such trees exist in the IL,
> >> the attached patch replaces those bounds with a string
> >> representation of those expressions (as STRING_CST).  It also
> >> tweaks the pretty-printer to improve the formatting of the VLA
> >> bounds and avoid inserting spurious spaces in some cases.
> >>
> >> The strings are only used by the front end to verify that
> >> redeclarations of the same function match in the form and bounds
> >> of their VLA arguments, and they're not needed in the middle end.
> >> I considered removing them just before the front end finishes but
> >> I couldn't find an efficient way to do that.  Is there some data
> >> structure that stores all function declarations in a translation
> >> unit?  If there is, then traversing it and removing the attribute
> >> arguments might also be an option, either in addition to this
> >> change or in lieu of it.
> >
> > There is the free lang data pass in tree.c which walks all reachable tree nodes.
>
> You said in response to Honza's suggestion in pr97172 to do that:
>
>    The frontend needs to make sure no frontend specific tree codes
>    leak into GENERIC so GENERICization is the place where the FE
>    should clear those.  FLD is too late and doing it there would
>    be a hack.
>
> Are you now saying you've had a change of heart and that doing it
> there isn't a hack after all?

Well, removing a FE specific attribute [when having non-constant arg] might
be OK there.  But note you asked for a "pass over all tree nodes" thing
and I just pointed out free-lang-data.

Doing the STRING_CST is certainly less fragile since the SSA names
created at gimplification time could even be ggc_freed when no longer
used in the IL.

Richard.

> Martin
>
> >> The patch was tested on x86_64-linux.
> >>
> >> Martin
> >
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)
  2021-01-04 12:59     ` Richard Biener
@ 2021-01-04 19:14       ` Jeff Law
  2021-01-04 19:19         ` Jakub Jelinek
  0 siblings, 1 reply; 15+ messages in thread
From: Jeff Law @ 2021-01-04 19:14 UTC (permalink / raw)
  To: Richard Biener, Martin Sebor; +Cc: Martin Sebor, Martin Sebor via Gcc-patches



On 1/4/21 5:59 AM, Richard Biener via Gcc-patches wrote:
> On Sun, Dec 20, 2020 at 6:43 PM Martin Sebor <msebor@gmail.com> wrote:
>> On 12/19/20 12:48 AM, Richard Biener via Gcc-patches wrote:
>>> On December 19, 2020 1:55:02 AM GMT+01:00, Martin Sebor via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>>> To keep tree expressions stored by the front end in attribute
>>>> access for nontrivial VLA bounds from getting corrupted during
>>>> Gimplification and to avoid breaking the preconditions verified
>>>> by the LTO streamer that no such trees exist in the IL,
>>>> the attached patch replaces those bounds with a string
>>>> representation of those expressions (as STRING_CST).  It also
>>>> tweaks the pretty-printer to improve the formatting of the VLA
>>>> bounds and avoid inserting spurious spaces in some cases.
>>>>
>>>> The strings are only used by the front end to verify that
>>>> redeclarations of the same function match in the form and bounds
>>>> of their VLA arguments, and they're not needed in the middle end.
>>>> I considered removing them just before the front end finishes but
>>>> I couldn't find an efficient way to do that.  Is there some data
>>>> structure that stores all function declarations in a translation
>>>> unit?  If there is, then traversing it and removing the attribute
>>>> arguments might also be an option, either in addition to this
>>>> change or in lieu of it.
>>> There is the free lang data pass in tree.c which walks all reachable tree nodes.
>> You said in response to Honza's suggestion in pr97172 to do that:
>>
>>    The frontend needs to make sure no frontend specific tree codes
>>    leak into GENERIC so GENERICization is the place where the FE
>>    should clear those.  FLD is too late and doing it there would
>>    be a hack.
>>
>> Are you now saying you've had a change of heart and that doing it
>> there isn't a hack after all?
> Well, removing a FE specific attribute [when having non-constant arg] might
> be OK there.  But note you asked for a "pass over all tree nodes" thing
> and I just pointed out free-lang-data.
>
> Doing the STRING_CST is certainly less fragile since the SSA names
> created at gimplification time could even be ggc_freed when no longer
> used in the IL.
Obviously we can't use SSA_NAMEs as they're specific to each function as
they get compiled.  But what's not as clear to me is why we can't use a
SAVE_EXPR of the original expression that indicates the size of the
parameter.


Jeff


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)
  2021-01-04 19:14       ` Jeff Law
@ 2021-01-04 19:19         ` Jakub Jelinek
  2021-01-04 19:23           ` Jeff Law
  0 siblings, 1 reply; 15+ messages in thread
From: Jakub Jelinek @ 2021-01-04 19:19 UTC (permalink / raw)
  To: Jeff Law
  Cc: Richard Biener, Martin Sebor, Martin Sebor, Martin Sebor via Gcc-patches

On Mon, Jan 04, 2021 at 12:14:15PM -0700, Jeff Law via Gcc-patches wrote:
> > Doing the STRING_CST is certainly less fragile since the SSA names
> > created at gimplification time could even be ggc_freed when no longer
> > used in the IL.
> Obviously we can't use SSA_NAMEs as they're specific to each function as
> they get compiled.  But what's not as clear to me is why we can't use a
> SAVE_EXPR of the original expression that indicates the size of the
> parameter.

The gimplifier is destructive, so if the expressions are partly (e.g. in
those SAVE_EXPRs) shared with what is in the actual IL, we lose.
And if they aren't shared and there are side-effects, if we tried to
gimplify them again we'd get the side-effects duplicated.
So it all depends on what the code wants to handle, if e.g. just values of
parameters with simple arithmetics on those and punt on everything else,
then it is doable, but generally it is not.

	Jakub


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)
  2021-01-04 19:19         ` Jakub Jelinek
@ 2021-01-04 19:23           ` Jeff Law
  2021-01-04 20:53             ` Martin Sebor
  0 siblings, 1 reply; 15+ messages in thread
From: Jeff Law @ 2021-01-04 19:23 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Biener, Martin Sebor, Martin Sebor, Martin Sebor via Gcc-patches



On 1/4/21 12:19 PM, Jakub Jelinek wrote:
> On Mon, Jan 04, 2021 at 12:14:15PM -0700, Jeff Law via Gcc-patches wrote:
>>> Doing the STRING_CST is certainly less fragile since the SSA names
>>> created at gimplification time could even be ggc_freed when no longer
>>> used in the IL.
>> Obviously we can't use SSA_NAMEs as they're specific to each function as
>> they get compiled.  But what's not as clear to me is why we can't use a
>> SAVE_EXPR of the original expression that indicates the size of the
>> parameter.
> The gimplifier is destructive, so if the expressions are partly (e.g. in
> those SAVE_EXPRs) shared with what is in the actual IL, we lose.
> And if they aren't shared and there are side-effects, if we tried to
> gimplify them again we'd get the side-effects duplicated.
> So it all depends on what the code wants to handle, if e.g. just values of
> parameters with simple arithmetics on those and punt on everything else,
> then it is doable, but generally it is not.
I would expect the expressions to be values of parameters (or objects in
static storage) and simple arithemetic on them.  If there's other cases,
punting seems appropriate.

Martin -- are there nontrivial expressions we need to be worried about here?


Jeff


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)
  2021-01-04 19:23           ` Jeff Law
@ 2021-01-04 20:53             ` Martin Sebor
  2021-01-04 21:10               ` Jeff Law
  2021-01-05 12:38               ` Richard Biener
  0 siblings, 2 replies; 15+ messages in thread
From: Martin Sebor @ 2021-01-04 20:53 UTC (permalink / raw)
  To: Jeff Law, Jakub Jelinek
  Cc: Richard Biener, Martin Sebor, Martin Sebor via Gcc-patches

On 1/4/21 12:23 PM, Jeff Law wrote:
> 
> 
> On 1/4/21 12:19 PM, Jakub Jelinek wrote:
>> On Mon, Jan 04, 2021 at 12:14:15PM -0700, Jeff Law via Gcc-patches wrote:
>>>> Doing the STRING_CST is certainly less fragile since the SSA names
>>>> created at gimplification time could even be ggc_freed when no longer
>>>> used in the IL.
>>> Obviously we can't use SSA_NAMEs as they're specific to each function as
>>> they get compiled.  But what's not as clear to me is why we can't use a
>>> SAVE_EXPR of the original expression that indicates the size of the
>>> parameter.
>> The gimplifier is destructive, so if the expressions are partly (e.g. in
>> those SAVE_EXPRs) shared with what is in the actual IL, we lose.
>> And if they aren't shared and there are side-effects, if we tried to
>> gimplify them again we'd get the side-effects duplicated.
>> So it all depends on what the code wants to handle, if e.g. just values of
>> parameters with simple arithmetics on those and punt on everything else,
>> then it is doable, but generally it is not.

I explained what the code handles and when in the pipeline in
the discussion of the previous patch:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559770.html

> I would expect the expressions to be values of parameters (or objects in
> static storage) and simple arithemetic on them.  If there's other cases,
> punting seems appropriate.
> 
> Martin -- are there nontrivial expressions we need to be worried about here?

At the moment the middle warnings only consider parameters, like
the N in

   void f (int N, int[N]);

   void g (void)
   {
     int a[3];
     f (sizeof a, a);   // warning
   }

The front end redeclaration warnings consider all expressions,
including

   int f (void);

   void g (int[f () + 1]);
   void g (int[f () + 2]);   // warning

The patch turns these complex bounds into strings that the front
end compares instead.  After the front end is done the strings
don't serve any purpose (and I don't think ever will) and could
be removed.  I looked for a way to do it but couldn't find one
other than the free_lang_data pass in tree.c that Richard had
initially said wasn't the right place.  Sounds like he's
reconsidered but at this point, given that VLA parameters are
used only infraquently, and VLAs with these nontrivial bounds
are exceedingly rare, going to the trouble of removing them
doesn't seem worth the effort.

Martin

> 
> 
> Jeff
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)
  2021-01-04 20:53             ` Martin Sebor
@ 2021-01-04 21:10               ` Jeff Law
  2021-01-04 21:20                 ` Jakub Jelinek
  2021-01-04 23:54                 ` Martin Sebor
  2021-01-05 12:38               ` Richard Biener
  1 sibling, 2 replies; 15+ messages in thread
From: Jeff Law @ 2021-01-04 21:10 UTC (permalink / raw)
  To: Martin Sebor, Jakub Jelinek
  Cc: Richard Biener, Martin Sebor, Martin Sebor via Gcc-patches



On 1/4/21 1:53 PM, Martin Sebor wrote:
> On 1/4/21 12:23 PM, Jeff Law wrote:
>>
>>
>> On 1/4/21 12:19 PM, Jakub Jelinek wrote:
>>> On Mon, Jan 04, 2021 at 12:14:15PM -0700, Jeff Law via Gcc-patches
>>> wrote:
>>>>> Doing the STRING_CST is certainly less fragile since the SSA names
>>>>> created at gimplification time could even be ggc_freed when no longer
>>>>> used in the IL.
>>>> Obviously we can't use SSA_NAMEs as they're specific to each
>>>> function as
>>>> they get compiled.  But what's not as clear to me is why we can't
>>>> use a
>>>> SAVE_EXPR of the original expression that indicates the size of the
>>>> parameter.
>>> The gimplifier is destructive, so if the expressions are partly
>>> (e.g. in
>>> those SAVE_EXPRs) shared with what is in the actual IL, we lose.
>>> And if they aren't shared and there are side-effects, if we tried to
>>> gimplify them again we'd get the side-effects duplicated.
>>> So it all depends on what the code wants to handle, if e.g. just
>>> values of
>>> parameters with simple arithmetics on those and punt on everything
>>> else,
>>> then it is doable, but generally it is not.
>
> I explained what the code handles and when in the pipeline in
> the discussion of the previous patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559770.html
Right, but that message talks about GC.  This is not a GC issue.

This feels like we need a SAVE_EXPR to me to ensure single evaluation
and an unshare_expr to avoid problems with destructive gimplification.



>
>> I would expect the expressions to be values of parameters (or objects in
>> static storage) and simple arithemetic on them.  If there's other cases,
>> punting seems appropriate.
>>
>> Martin -- are there nontrivial expressions we need to be worried
>> about here?
>
> At the moment the middle warnings only consider parameters, like
> the N in
>
>   void f (int N, int[N]);
>
>   void g (void)
>   {
>     int a[3];
>     f (sizeof a, a);   // warning
>   }
>
> The front end redeclaration warnings consider all expressions,
> including
>
>   int f (void);
>
>   void g (int[f () + 1]);
>   void g (int[f () + 2]);   // warning
>
> The patch turns these complex bounds into strings that the front
> end compares instead.

If you can have an arbitrary expression, such as a function call like
that, then ISTM that a SAVE_EPR is mandatory as you can't call the
function more than once.  BUt it also seems to me that for cases that
aren't simple arithmetic of leaf nodes we could just punt.  I doubt
we're going to miss significant real world diagnostics by doing that.

> After the front end is done the strings
> don't serve any purpose (and I don't think ever will) and could
> be removed.  I looked for a way to do it but couldn't find one
> other than the free_lang_data pass in tree.c that Richard had
> initially said wasn't the right place.  Sounds like he's
> reconsidered but at this point, given that VLA parameters are
> used only infraquently, and VLAs with these nontrivial bounds
> are exceedingly rare, going to the trouble of removing them
> doesn't seem worth the effort.
But I'm not sure that inventing a new method for smuggling the data
around is all that wise or necessary here.   I don't see a message from
anyone suggesting that, but I could have missed it.

jeff


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)
  2021-01-04 21:10               ` Jeff Law
@ 2021-01-04 21:20                 ` Jakub Jelinek
  2021-01-05 23:19                   ` Jeff Law
  2021-01-04 23:54                 ` Martin Sebor
  1 sibling, 1 reply; 15+ messages in thread
From: Jakub Jelinek @ 2021-01-04 21:20 UTC (permalink / raw)
  To: Jeff Law
  Cc: Martin Sebor, Richard Biener, Martin Sebor, Martin Sebor via Gcc-patches

On Mon, Jan 04, 2021 at 02:10:39PM -0700, Jeff Law wrote:
> > I explained what the code handles and when in the pipeline in
> > the discussion of the previous patch:
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559770.html
> Right, but that message talks about GC.  This is not a GC issue.
> 
> This feels like we need a SAVE_EXPR to me to ensure single evaluation
> and an unshare_expr to avoid problems with destructive gimplification.

unshare_expr will not duplicate SAVE_EXPRs.
So, one would need to unshare with special handling of SAVE_EXPRs that would
throw them away (for the simple arguments case) rather than handling them
normally.

	Jakub


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)
  2021-01-04 21:10               ` Jeff Law
  2021-01-04 21:20                 ` Jakub Jelinek
@ 2021-01-04 23:54                 ` Martin Sebor
  2021-01-05 23:23                   ` Jeff Law
  1 sibling, 1 reply; 15+ messages in thread
From: Martin Sebor @ 2021-01-04 23:54 UTC (permalink / raw)
  To: Jeff Law, Jakub Jelinek
  Cc: Richard Biener, Martin Sebor, Martin Sebor via Gcc-patches

On 1/4/21 2:10 PM, Jeff Law wrote:
> 
> 
> On 1/4/21 1:53 PM, Martin Sebor wrote:
>> On 1/4/21 12:23 PM, Jeff Law wrote:
>>>
>>>
>>> On 1/4/21 12:19 PM, Jakub Jelinek wrote:
>>>> On Mon, Jan 04, 2021 at 12:14:15PM -0700, Jeff Law via Gcc-patches
>>>> wrote:
>>>>>> Doing the STRING_CST is certainly less fragile since the SSA names
>>>>>> created at gimplification time could even be ggc_freed when no longer
>>>>>> used in the IL.
>>>>> Obviously we can't use SSA_NAMEs as they're specific to each
>>>>> function as
>>>>> they get compiled.  But what's not as clear to me is why we can't
>>>>> use a
>>>>> SAVE_EXPR of the original expression that indicates the size of the
>>>>> parameter.
>>>> The gimplifier is destructive, so if the expressions are partly
>>>> (e.g. in
>>>> those SAVE_EXPRs) shared with what is in the actual IL, we lose.
>>>> And if they aren't shared and there are side-effects, if we tried to
>>>> gimplify them again we'd get the side-effects duplicated.
>>>> So it all depends on what the code wants to handle, if e.g. just
>>>> values of
>>>> parameters with simple arithmetics on those and punt on everything
>>>> else,
>>>> then it is doable, but generally it is not.
>>
>> I explained what the code handles and when in the pipeline in
>> the discussion of the previous patch:
>> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559770.html
> Right, but that message talks about GC.  This is not a GC issue.
> 
> This feels like we need a SAVE_EXPR to me to ensure single evaluation
> and an unshare_expr to avoid problems with destructive gimplification.
> 
> 
> 
>>
>>> I would expect the expressions to be values of parameters (or objects in
>>> static storage) and simple arithemetic on them.  If there's other cases,
>>> punting seems appropriate.
>>>
>>> Martin -- are there nontrivial expressions we need to be worried
>>> about here?
>>
>> At the moment the middle warnings only consider parameters, like
>> the N in
>>
>>    void f (int N, int[N]);
>>
>>    void g (void)
>>    {
>>      int a[3];
>>      f (sizeof a, a);   // warning
>>    }
>>
>> The front end redeclaration warnings consider all expressions,
>> including
>>
>>    int f (void);
>>
>>    void g (int[f () + 1]);
>>    void g (int[f () + 2]);   // warning
>>
>> The patch turns these complex bounds into strings that the front
>> end compares instead.
> 
> If you can have an arbitrary expression, such as a function call like
> that, then ISTM that a SAVE_EPR is mandatory as you can't call the
> function more than once.  BUt it also seems to me that for cases that
> aren't simple arithmetic of leaf nodes we could just punt.  I doubt
> we're going to miss significant real world diagnostics by doing that.

I don't know about that.  Bugs are rare and often in unusual and
hard to read/understand code, so focusing on the simple cases and
doing nothing for the rest would certainly not be an improvement.

My understanding from the discussion at the link above is that
using SAVE_EXPRs is only necessary when the expression is evaluated
(the warning doesn't evaluate them).  I didn't use the SAVE_EXPRs
in this patch even though they're readily available (I had initially
missed it) or unsharing because Jakub preferred to avoid the space
overhead (IIUC).  I didn't remove the expressions because (as
I explained) the only way I could find to do it was one that Richard
was quite emphatic should be avoided.

If you want me to use the SHARE_EXPRs I can easily make that change
(if that counts for anything that would be my preference).  If there's
preference for removing the saved bounds in free_lang_data pass I can
presumably do that, either in addition or instead.

>> After the front end is done the strings
>> don't serve any purpose (and I don't think ever will) and could
>> be removed.  I looked for a way to do it but couldn't find one
>> other than the free_lang_data pass in tree.c that Richard had
>> initially said wasn't the right place.  Sounds like he's
>> reconsidered but at this point, given that VLA parameters are
>> used only infraquently, and VLAs with these nontrivial bounds
>> are exceedingly rare, going to the trouble of removing them
>> doesn't seem worth the effort.
> But I'm not sure that inventing a new method for smuggling the data
> around is all that wise or necessary here.   I don't see a message from
> anyone suggesting that, but I could have missed it.

No one suggested "smuggling" anything around.  It also wasn't
my intent, nor do I think the code code that.  It just stores
the bounds in a form that the middle end can cope with.  There
are other front-end-only attributes that store strings (e.g.,
attribute deprecated) so this is not new.  But as I said, I'm
open to removing either the strings or the expressions.  I'd
just like to know which before I do the work this time.

Martin

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)
  2021-01-04 20:53             ` Martin Sebor
  2021-01-04 21:10               ` Jeff Law
@ 2021-01-05 12:38               ` Richard Biener
  2021-01-05 16:31                 ` Martin Sebor
  1 sibling, 1 reply; 15+ messages in thread
From: Richard Biener @ 2021-01-05 12:38 UTC (permalink / raw)
  To: Martin Sebor
  Cc: Jeff Law, Jakub Jelinek, Martin Sebor, Martin Sebor via Gcc-patches

On Mon, Jan 4, 2021 at 9:53 PM Martin Sebor <msebor@gmail.com> wrote:
>
> On 1/4/21 12:23 PM, Jeff Law wrote:
> >
> >
> > On 1/4/21 12:19 PM, Jakub Jelinek wrote:
> >> On Mon, Jan 04, 2021 at 12:14:15PM -0700, Jeff Law via Gcc-patches wrote:
> >>>> Doing the STRING_CST is certainly less fragile since the SSA names
> >>>> created at gimplification time could even be ggc_freed when no longer
> >>>> used in the IL.
> >>> Obviously we can't use SSA_NAMEs as they're specific to each function as
> >>> they get compiled.  But what's not as clear to me is why we can't use a
> >>> SAVE_EXPR of the original expression that indicates the size of the
> >>> parameter.
> >> The gimplifier is destructive, so if the expressions are partly (e.g. in
> >> those SAVE_EXPRs) shared with what is in the actual IL, we lose.
> >> And if they aren't shared and there are side-effects, if we tried to
> >> gimplify them again we'd get the side-effects duplicated.
> >> So it all depends on what the code wants to handle, if e.g. just values of
> >> parameters with simple arithmetics on those and punt on everything else,
> >> then it is doable, but generally it is not.
>
> I explained what the code handles and when in the pipeline in
> the discussion of the previous patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559770.html
>
> > I would expect the expressions to be values of parameters (or objects in
> > static storage) and simple arithemetic on them.  If there's other cases,
> > punting seems appropriate.
> >
> > Martin -- are there nontrivial expressions we need to be worried about here?
>
> At the moment the middle warnings only consider parameters, like
> the N in
>
>    void f (int N, int[N]);
>
>    void g (void)
>    {
>      int a[3];
>      f (sizeof a, a);   // warning

I wonder how this can work reliably without heavy-weight
"parsing" of the attribute?  That is, how do you relate
the passed 24 constant to the N in int[N]?

>    }
>
> The front end redeclaration warnings consider all expressions,
> including
>
>    int f (void);
>
>    void g (int[f () + 1]);
>    void g (int[f () + 2]);   // warning

For redeclaration warning the attribute isn't needed since you
have both decls and can compare sizes directly?

> The patch turns these complex bounds into strings that the front
> end compares instead.  After the front end is done the strings
> don't serve any purpose (and I don't think ever will) and could
> be removed.  I looked for a way to do it but couldn't find one
> other than the free_lang_data pass in tree.c that Richard had
> initially said wasn't the right place.  Sounds like he's
> reconsidered but at this point, given that VLA parameters are
> used only infraquently, and VLAs with these nontrivial bounds
> are exceedingly rare, going to the trouble of removing them
> doesn't seem worth the effort.
>
> Martin
>
> >
> >
> > Jeff
> >
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)
  2021-01-05 12:38               ` Richard Biener
@ 2021-01-05 16:31                 ` Martin Sebor
  0 siblings, 0 replies; 15+ messages in thread
From: Martin Sebor @ 2021-01-05 16:31 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jeff Law, Jakub Jelinek, Martin Sebor, Martin Sebor via Gcc-patches

On 1/5/21 5:38 AM, Richard Biener wrote:
> On Mon, Jan 4, 2021 at 9:53 PM Martin Sebor <msebor@gmail.com> wrote:
>>
>> On 1/4/21 12:23 PM, Jeff Law wrote:
>>>
>>>
>>> On 1/4/21 12:19 PM, Jakub Jelinek wrote:
>>>> On Mon, Jan 04, 2021 at 12:14:15PM -0700, Jeff Law via Gcc-patches wrote:
>>>>>> Doing the STRING_CST is certainly less fragile since the SSA names
>>>>>> created at gimplification time could even be ggc_freed when no longer
>>>>>> used in the IL.
>>>>> Obviously we can't use SSA_NAMEs as they're specific to each function as
>>>>> they get compiled.  But what's not as clear to me is why we can't use a
>>>>> SAVE_EXPR of the original expression that indicates the size of the
>>>>> parameter.
>>>> The gimplifier is destructive, so if the expressions are partly (e.g. in
>>>> those SAVE_EXPRs) shared with what is in the actual IL, we lose.
>>>> And if they aren't shared and there are side-effects, if we tried to
>>>> gimplify them again we'd get the side-effects duplicated.
>>>> So it all depends on what the code wants to handle, if e.g. just values of
>>>> parameters with simple arithmetics on those and punt on everything else,
>>>> then it is doable, but generally it is not.
>>
>> I explained what the code handles and when in the pipeline in
>> the discussion of the previous patch:
>> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559770.html
>>
>>> I would expect the expressions to be values of parameters (or objects in
>>> static storage) and simple arithemetic on them.  If there's other cases,
>>> punting seems appropriate.
>>>
>>> Martin -- are there nontrivial expressions we need to be worried about here?
>>
>> At the moment the middle warnings only consider parameters, like
>> the N in
>>
>>     void f (int N, int[N]);
>>
>>     void g (void)
>>     {
>>       int a[3];
>>       f (sizeof a, a);   // warning
> 
> I wonder how this can work reliably without heavy-weight
> "parsing" of the attribute?  That is, how do you relate
> the passed 24 constant to the N in int[N]?

There is some parsing involved but it's only slightly more complex
than in attribute fn spec.  Just a string scan followed by constant
time lookup for each pointer argument.

The N is associated with int[N] via N's position in the argument
list and encoded as $N in the string.  The attribute for the decl
above is "1[$],$0".  The 1 is the VLA argument position, each
dollar sign in the brackets is one VLA bound (numbers are constant
bounds), and the $0 is the VLA bound argument.

(There is some redundancy here since all but the most significant
array (or VLA) bound are also encoded in the type of the argument.)

>>
>> The front end redeclaration warnings consider all expressions,
>> including
>>
>>     int f (void);
>>
>>     void g (int[f () + 1]);
>>     void g (int[f () + 2]);   // warning
> 
> For redeclaration warning the attribute isn't needed since you
> have both decls and can compare sizes directly?

The attribute is used here as well.  It's attached to the first
decl irrespective of the form of the VLA, then created for
the second decl and the two are compared.  Mismatches are then
diagnosed and dropped from the second attribute.  The result
is merged with the first and added to the decl.

Martin

> 
>> The patch turns these complex bounds into strings that the front
>> end compares instead.  After the front end is done the strings
>> don't serve any purpose (and I don't think ever will) and could
>> be removed.  I looked for a way to do it but couldn't find one
>> other than the free_lang_data pass in tree.c that Richard had
>> initially said wasn't the right place.  Sounds like he's
>> reconsidered but at this point, given that VLA parameters are
>> used only infraquently, and VLAs with these nontrivial bounds
>> are exceedingly rare, going to the trouble of removing them
>> doesn't seem worth the effort.
>>
>> Martin
>>
>>>
>>>
>>> Jeff
>>>
>>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)
  2021-01-04 21:20                 ` Jakub Jelinek
@ 2021-01-05 23:19                   ` Jeff Law
  0 siblings, 0 replies; 15+ messages in thread
From: Jeff Law @ 2021-01-05 23:19 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Martin Sebor, Richard Biener, Martin Sebor, Martin Sebor via Gcc-patches



On 1/4/21 2:20 PM, Jakub Jelinek wrote:
> On Mon, Jan 04, 2021 at 02:10:39PM -0700, Jeff Law wrote:
>>> I explained what the code handles and when in the pipeline in
>>> the discussion of the previous patch:
>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559770.html
>> Right, but that message talks about GC.  This is not a GC issue.
>>
>> This feels like we need a SAVE_EXPR to me to ensure single evaluation
>> and an unshare_expr to avoid problems with destructive gimplification.
> unshare_expr will not duplicate SAVE_EXPRs.
> So, one would need to unshare with special handling of SAVE_EXPRs that would
> throw them away (for the simple arguments case) rather than handling them
> normally.
My mental model of how this works must be broken then.  I thought we
would need to unshare the expression, then wrap it in the SAVE_EXPR.  It
seems like you're saying that we've already got the SAVE_EXPR and that
unshare_expr won't traverse into it.  That would indeed be problematical.

jeff


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] store VLA bounds in attribute access as strings (PR 97172)
  2021-01-04 23:54                 ` Martin Sebor
@ 2021-01-05 23:23                   ` Jeff Law
  0 siblings, 0 replies; 15+ messages in thread
From: Jeff Law @ 2021-01-05 23:23 UTC (permalink / raw)
  To: Martin Sebor, Jakub Jelinek
  Cc: Richard Biener, Martin Sebor, Martin Sebor via Gcc-patches



On 1/4/21 4:54 PM, Martin Sebor wrote:
> On 1/4/21 2:10 PM, Jeff Law wrote:
>>
>>
>> On 1/4/21 1:53 PM, Martin Sebor wrote:
>>> On 1/4/21 12:23 PM, Jeff Law wrote:
>>>>
>>>>
>>>> On 1/4/21 12:19 PM, Jakub Jelinek wrote:
>>>>> On Mon, Jan 04, 2021 at 12:14:15PM -0700, Jeff Law via Gcc-patches
>>>>> wrote:
>>>>>>> Doing the STRING_CST is certainly less fragile since the SSA names
>>>>>>> created at gimplification time could even be ggc_freed when no
>>>>>>> longer
>>>>>>> used in the IL.
>>>>>> Obviously we can't use SSA_NAMEs as they're specific to each
>>>>>> function as
>>>>>> they get compiled.  But what's not as clear to me is why we can't
>>>>>> use a
>>>>>> SAVE_EXPR of the original expression that indicates the size of the
>>>>>> parameter.
>>>>> The gimplifier is destructive, so if the expressions are partly
>>>>> (e.g. in
>>>>> those SAVE_EXPRs) shared with what is in the actual IL, we lose.
>>>>> And if they aren't shared and there are side-effects, if we tried to
>>>>> gimplify them again we'd get the side-effects duplicated.
>>>>> So it all depends on what the code wants to handle, if e.g. just
>>>>> values of
>>>>> parameters with simple arithmetics on those and punt on everything
>>>>> else,
>>>>> then it is doable, but generally it is not.
>>>
>>> I explained what the code handles and when in the pipeline in
>>> the discussion of the previous patch:
>>> https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559770.html
>> Right, but that message talks about GC.  This is not a GC issue.
>>
>> This feels like we need a SAVE_EXPR to me to ensure single evaluation
>> and an unshare_expr to avoid problems with destructive gimplification.
>>
>>
>>
>>>
>>>> I would expect the expressions to be values of parameters (or
>>>> objects in
>>>> static storage) and simple arithemetic on them.  If there's other
>>>> cases,
>>>> punting seems appropriate.
>>>>
>>>> Martin -- are there nontrivial expressions we need to be worried
>>>> about here?
>>>
>>> At the moment the middle warnings only consider parameters, like
>>> the N in
>>>
>>>    void f (int N, int[N]);
>>>
>>>    void g (void)
>>>    {
>>>      int a[3];
>>>      f (sizeof a, a);   // warning
>>>    }
>>>
>>> The front end redeclaration warnings consider all expressions,
>>> including
>>>
>>>    int f (void);
>>>
>>>    void g (int[f () + 1]);
>>>    void g (int[f () + 2]);   // warning
>>>
>>> The patch turns these complex bounds into strings that the front
>>> end compares instead.
>>
>> If you can have an arbitrary expression, such as a function call like
>> that, then ISTM that a SAVE_EPR is mandatory as you can't call the
>> function more than once.  BUt it also seems to me that for cases that
>> aren't simple arithmetic of leaf nodes we could just punt.  I doubt
>> we're going to miss significant real world diagnostics by doing that.
>
> I don't know about that.  Bugs are rare and often in unusual and
> hard to read/understand code, so focusing on the simple cases and
> doing nothing for the rest would certainly not be an improvement.
I would disagree.  It's an improvement for what is most likely the most
common case.  VLAs aren't used that heavily to begin with and VLAs with
bounds that require function calls to evaluate would seem to be quite rare.



>
> My understanding from the discussion at the link above is that
> using SAVE_EXPRs is only necessary when the expression is evaluated
> (the warning doesn't evaluate them).
Hmm,  so this goes back to Richi's comment/question.  If we're not
evaluating the expression, then we're just doing a lexicographical
comparison?  And yes, in that case we wouldn't need the SAVE_EXPR.



>>> After the front end is done the strings
>>> don't serve any purpose (and I don't think ever will) and could
>>> be removed.  I looked for a way to do it but couldn't find one
>>> other than the free_lang_data pass in tree.c that Richard had
>>> initially said wasn't the right place.  Sounds like he's
>>> reconsidered but at this point, given that VLA parameters are
>>> used only infraquently, and VLAs with these nontrivial bounds
>>> are exceedingly rare, going to the trouble of removing them
>>> doesn't seem worth the effort.
>> But I'm not sure that inventing a new method for smuggling the data
>> around is all that wise or necessary here.   I don't see a message from
>> anyone suggesting that, but I could have missed it.
>
> No one suggested "smuggling" anything around.  It also wasn't
> my intent, nor do I think the code code that.  It just stores
> the bounds in a form that the middle end can cope with.  There
> are other front-end-only attributes that store strings (e.g.,
> attribute deprecated) so this is not new.  But as I said, I'm
> open to removing either the strings or the expressions.  I'd
> just like to know which before I do the work this time.
You're reading way too  much into the word "smuggle".

jeff


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-01-05 23:23 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-19  0:55 [PATCH] store VLA bounds in attribute access as strings (PR 97172) Martin Sebor
2020-12-19  7:48 ` Richard Biener
2020-12-20 17:43   ` Martin Sebor
2021-01-04 12:59     ` Richard Biener
2021-01-04 19:14       ` Jeff Law
2021-01-04 19:19         ` Jakub Jelinek
2021-01-04 19:23           ` Jeff Law
2021-01-04 20:53             ` Martin Sebor
2021-01-04 21:10               ` Jeff Law
2021-01-04 21:20                 ` Jakub Jelinek
2021-01-05 23:19                   ` Jeff Law
2021-01-04 23:54                 ` Martin Sebor
2021-01-05 23:23                   ` Jeff Law
2021-01-05 12:38               ` Richard Biener
2021-01-05 16:31                 ` Martin Sebor

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).