public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] improve handling of aggregates in sprintf [PR 102238, 102919]
@ 2021-10-24 23:43 Martin Sebor
  2021-10-25  2:18 ` Jeff Law
  0 siblings, 1 reply; 2+ messages in thread
From: Martin Sebor @ 2021-10-24 23:43 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 872 bytes --]

The detection of overlapping sprintf calls has a limitation
that leads to both false positives (PR 102919) and negatives
(PR 102238) in corner cases involving members of aggregates.
The false positives result from the overlap logic not using
the size of the member used as an argument to %s to constrain
the length of the directive output.  The false negatives are
due to the logic failing to determine the identity of a member
from the address or reference to the enclosing object and
an offset.

The attached patch improves the detection logic to handle both
sets of cases.  In addition, it moves the utility functions used
to implement the logic from the sprintf pass into pointer-query
where they can be used for other purposes in the future (my
work in progress).

Tested on x86_64-linux and by building Glibc and verifying
it doesn't cause any new warnings,

Martin

[-- Attachment #2: gcc-102238.diff --]
[-- Type: text/x-patch, Size: 28399 bytes --]

PR tree-optimization/102238 - alias_offset in gimple-ssa-sprintf.c is broken
PR tree-optimization/102919 - spurious -Wrestrict warning for sprintf into the same member array as argument plus offset

gcc/ChangeLog:

	PR tree-optimization/102238
	PR tree-optimization/102919
	* gimple-ssa-sprintf.c (get_string_length): Ad an argument.
	(array_elt_at_offset): Move to pointer-query.
	(set_aggregate_size_and_offset): New function.
	(field_at_offset):  Move to pointer-query.
	(get_origin_and_offset): Rename...
	(get_origin_and_offset_r): this.  Add an argument.  Make aggregate
	handling more robust.
	(get_origin_and_offset): New.
	(alias_offset): Add an argument.
	(format_string): Use subobject size determined by get_origin_and_offset.
	* pointer-query.cc (field_at_offset): Move from gimple-ssa-sprintf.c.
	Improve/correct handling of aggregates.
	(array_elt_at_offset): Same.
	* pointer-query.h (field_at_offset): Declare.
	(array_elt_at_offset): Declare.

gcc/testsuite/ChangeLog:

	PR tree-optimization/102238
	PR tree-optimization/102919
	* gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Remove warnings.
	* gcc.dg/Wrestrict-23.c: New test.

diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index 8e90b7cfc43..cc679845137 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -2024,8 +2024,8 @@ format_floating (const directive &dir, tree arg, range_query *)
    Used by the format_string function below.  */
 
 static fmtresult
-get_string_length (tree str, gimple *stmt, unsigned eltsize,
-		   range_query *query)
+get_string_length (tree str, gimple *stmt, unsigned HOST_WIDE_INT max_size,
+		   unsigned eltsize, range_query *query)
 {
   if (!str)
     return fmtresult ();
@@ -2065,6 +2065,20 @@ get_string_length (tree str, gimple *stmt, unsigned eltsize,
       && (!lendata.maxbound || lenmax <= tree_to_uhwi (lendata.maxbound))
       && lenmax <= tree_to_uhwi (lendata.maxlen))
     {
+      if (max_size > 0 && max_size < HOST_WIDE_INT_MAX)
+	{
+	  /* Adjust the conservative unknown/unbounded result if MAX_SIZE
+	     is valid.  Set UNLIKELY to maximum in case MAX_SIZE refers
+	     to a subobject.
+	     TODO: This is overly conservative.  Set UNLIKELY to the size
+	     of the outermost enclosing declared object.  */
+	  fmtresult res (0, max_size - 1);
+	  res.nonstr = lendata.decl;
+	  res.range.likely = res.range.max;
+	  res.range.unlikely = HOST_WIDE_INT_MAX;
+	  return res;
+	}
+
       fmtresult res;
       res.nonstr = lendata.decl;
       return res;
@@ -2203,110 +2217,80 @@ format_character (const directive &dir, tree arg, range_query *query)
   return res.adjust_for_width_or_precision (dir.width);
 }
 
-/* Determine the offset *INDEX of the first byte of an array element of
-   TYPE (possibly recursively) into which the byte offset OFF points.
-   On success set *INDEX to the offset of the first byte and return type.
-   Otherwise, if no such element can be found, return null.  */
+/* If TYPE is an array or struct or union, increment *FLDOFF by the starting
+   offset of the member that *OFF point into and set *FLDSIZE to its size
+   in bytes and decrement *OFF by the same.  Otherwise do nothing.  */
 
-static tree
-array_elt_at_offset (tree type, HOST_WIDE_INT off, HOST_WIDE_INT *index)
+static void
+set_aggregate_size_and_offset (tree type, HOST_WIDE_INT *fldoff,
+			       HOST_WIDE_INT *fldsize, HOST_WIDE_INT *off)
 {
-  gcc_assert (TREE_CODE (type) == ARRAY_TYPE);
-
-  tree eltype = type;
-  while (TREE_CODE (TREE_TYPE (eltype)) == ARRAY_TYPE)
-    eltype = TREE_TYPE (eltype);
-
-  if (TYPE_MODE (TREE_TYPE (eltype)) != TYPE_MODE (char_type_node))
-    eltype = TREE_TYPE (eltype);
-
-  if (eltype == type)
-    {
-      *index = 0;
-      return type;
-    }
-
-  HOST_WIDE_INT typsz = int_size_in_bytes (type);
-  HOST_WIDE_INT eltsz = int_size_in_bytes (eltype);
-  if (off < typsz * eltsz)
+  /* The byte offset of the most basic struct member the byte
+     offset *OFF corresponds to, or for a (multidimensional)
+     array member, the byte offset of the array element.  */
+  if (TREE_CODE (type) == ARRAY_TYPE
+      && TREE_CODE (TREE_TYPE (type)) == ARRAY_TYPE)
     {
-      *index = (off / eltsz) * eltsz;
-      return TREE_CODE (eltype) == ARRAY_TYPE ? TREE_TYPE (eltype) : eltype;
+      HOST_WIDE_INT index = 0, arrsize = 0;
+      if (array_elt_at_offset (type, *off, &index, &arrsize))
+	{
+	  *fldoff += index;
+	  *off -= index;
+	  *fldsize = arrsize;
+	}
     }
-
-  return NULL_TREE;
-}
-
-/* Determine the offset *INDEX of the first byte of a struct member of TYPE
-   (possibly recursively) into which the byte offset OFF points.  On success
-   set *INDEX to the offset of the first byte and return true.  Otherwise,
-   if no such member can be found, return false.  */
-
-static bool
-field_at_offset (tree type, HOST_WIDE_INT off, HOST_WIDE_INT *index)
-{
-  gcc_assert (RECORD_OR_UNION_TYPE_P (type));
-
-  for (tree fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
+  else if (RECORD_OR_UNION_TYPE_P (type))
     {
-      if (TREE_CODE (fld) != FIELD_DECL || DECL_ARTIFICIAL (fld))
-	continue;
-
-      tree fldtype = TREE_TYPE (fld);
-      HOST_WIDE_INT fldoff = int_byte_position (fld);
-
-      /* If the size is not available the field is a flexible array
-	 member.  Treat this case as success.  */
-      tree typesize = TYPE_SIZE_UNIT (fldtype);
-      HOST_WIDE_INT fldsize = (tree_fits_uhwi_p (typesize)
-			       ? tree_to_uhwi (typesize)
-			       : off);
-
-      if (fldoff + fldsize < off)
-	continue;
-
-      if (TREE_CODE (fldtype) == ARRAY_TYPE)
+      HOST_WIDE_INT index = 0;
+      tree sub = field_at_offset (type, NULL_TREE, *off, &index);
+      if (sub)
 	{
-	  HOST_WIDE_INT idx = 0;
-	  if (tree ft = array_elt_at_offset (fldtype, off, &idx))
-	    fldtype = ft;
+	  tree subsize = DECL_SIZE_UNIT (sub);
+	  if (*fldsize < HOST_WIDE_INT_MAX
+	      && subsize
+	      && tree_fits_uhwi_p (subsize))
+	    *fldsize = tree_to_uhwi (subsize);
 	  else
-	    break;
-
-	  *index += idx;
-	  fldoff -= idx;
-	  off -= idx;
-	}
-
-      if (RECORD_OR_UNION_TYPE_P (fldtype))
-	{
-	  *index += fldoff;
-	  return field_at_offset (fldtype, off - fldoff, index);
+	    *fldsize = HOST_WIDE_INT_MAX;
+	  *fldoff += index;
+	  *off -= index;
 	}
-
-      *index += fldoff;
-      return true;
     }
-
-  return false;
 }
 
 /* For an expression X of pointer type, recursively try to find the same
-   origin (object or pointer) as Y it references and return such an X.
-   When X refers to a struct member, set *FLDOFF to the offset of the
-   member from the beginning of the "most derived" object.  */
+   origin (object or pointer) as Y it references and return such a Y.
+   When X refers to an array element or struct member, set *FLDOFF to
+   the offset of the element or member from the beginning of the "most
+   derived" object and *FLDSIZE to its size.  When nonnull, set *OFF to
+   the overall offset from the beginning of the object so that
+   *FLDOFF <= *OFF.  */
 
 static tree
-get_origin_and_offset (tree x, HOST_WIDE_INT *fldoff, HOST_WIDE_INT *off)
+get_origin_and_offset_r (tree x, HOST_WIDE_INT *fldoff, HOST_WIDE_INT *fldsize,
+			 HOST_WIDE_INT *off)
 {
   if (!x)
     return NULL_TREE;
 
+  HOST_WIDE_INT sizebuf = -1;
+  if (!fldsize)
+    fldsize = &sizebuf;
+
+  if (DECL_P (x))
+    {
+      /* Set the size if it hasn't been set yet.  */
+      if (tree size = DECL_SIZE_UNIT (x))
+	if (*fldsize < 0 && tree_fits_shwi_p (size))
+	  *fldsize = tree_to_shwi (size);
+      return x;
+    }
+
   switch (TREE_CODE (x))
     {
     case ADDR_EXPR:
       x = TREE_OPERAND (x, 0);
-      return get_origin_and_offset (x, fldoff, off);
+      return get_origin_and_offset_r (x, fldoff, fldsize, off);
 
     case ARRAY_REF:
       {
@@ -2326,7 +2310,7 @@ get_origin_and_offset (tree x, HOST_WIDE_INT *fldoff, HOST_WIDE_INT *off)
 	  *fldoff = idx;
 
 	x = TREE_OPERAND (x, 0);
-	return get_origin_and_offset (x, fldoff, NULL);
+	return get_origin_and_offset_r (x, fldoff, fldsize, nullptr);
       }
 
     case MEM_REF:
@@ -2345,32 +2329,19 @@ get_origin_and_offset (tree x, HOST_WIDE_INT *fldoff, HOST_WIDE_INT *off)
 	    = (TREE_CODE (x) == ADDR_EXPR
 	       ? TREE_TYPE (TREE_OPERAND (x, 0)) : TREE_TYPE (TREE_TYPE (x)));
 
-	  /* The byte offset of the most basic struct member the byte
-	     offset *OFF corresponds to, or for a (multidimensional)
-	     array member, the byte offset of the array element.  */
-	  HOST_WIDE_INT index = 0;
-
-	  if ((RECORD_OR_UNION_TYPE_P (xtype)
-	       && field_at_offset (xtype, *off, &index))
-	      || (TREE_CODE (xtype) == ARRAY_TYPE
-		  && TREE_CODE (TREE_TYPE (xtype)) == ARRAY_TYPE
-		  && array_elt_at_offset (xtype, *off, &index)))
-	    {
-	      *fldoff += index;
-	      *off -= index;
-	    }
+	  set_aggregate_size_and_offset (xtype, fldoff, fldsize, off);
 	}
 
-      return get_origin_and_offset (x, fldoff, NULL);
+      return get_origin_and_offset_r (x, fldoff, fldsize, nullptr);
 
     case COMPONENT_REF:
       {
 	tree fld = TREE_OPERAND (x, 1);
 	*fldoff += int_byte_position (fld);
 
-	get_origin_and_offset (fld, fldoff, off);
+	get_origin_and_offset_r (fld, fldoff, fldsize, off);
 	x = TREE_OPERAND (x, 0);
-	return get_origin_and_offset (x, fldoff, off);
+	return get_origin_and_offset_r (x, fldoff, nullptr, off);
       }
 
     case SSA_NAME:
@@ -2382,27 +2353,41 @@ get_origin_and_offset (tree x, HOST_WIDE_INT *fldoff, HOST_WIDE_INT *off)
 	    if (code == ADDR_EXPR)
 	      {
 		x = gimple_assign_rhs1 (def);
-		return get_origin_and_offset (x, fldoff, off);
+		return get_origin_and_offset_r (x, fldoff, fldsize, off);
 	      }
 
 	    if (code == POINTER_PLUS_EXPR)
 	      {
 		tree offset = gimple_assign_rhs2 (def);
-		if (off)
-		  *off = (tree_fits_uhwi_p (offset)
-			  ? tree_to_uhwi (offset) : HOST_WIDE_INT_MAX);
+		if (off && tree_fits_uhwi_p (offset))
+		  *off = tree_to_uhwi (offset);
 
 		x = gimple_assign_rhs1 (def);
-		return get_origin_and_offset (x, fldoff, NULL);
+		x = get_origin_and_offset_r (x, fldoff, fldsize, off);
+		if (off && !tree_fits_uhwi_p (offset))
+		  *off = HOST_WIDE_INT_MAX;
+		if (off)
+		  {
+		    tree xtype = TREE_TYPE (x);
+		    set_aggregate_size_and_offset (xtype, fldoff, fldsize, off);
+		  }
+		return x;
 	      }
 	    else if (code == VAR_DECL)
 	      {
 		x = gimple_assign_rhs1 (def);
-		return get_origin_and_offset (x, fldoff, off);
+		return get_origin_and_offset_r (x, fldoff, fldsize, off);
 	      }
 	  }
 	else if (gimple_nop_p (def) && SSA_NAME_VAR (x))
 	  x = SSA_NAME_VAR (x);
+
+	tree xtype = TREE_TYPE (x);
+	if (POINTER_TYPE_P (xtype))
+	  xtype = TREE_TYPE (xtype);
+
+	if (off)
+	  set_aggregate_size_and_offset (xtype, fldoff, fldsize, off);
       }
 
     default:
@@ -2412,13 +2397,41 @@ get_origin_and_offset (tree x, HOST_WIDE_INT *fldoff, HOST_WIDE_INT *off)
   return x;
 }
 
+/* Nonrecursive version of the above.  */
+
+static tree
+get_origin_and_offset (tree x, HOST_WIDE_INT *fldoff, HOST_WIDE_INT *off,
+		       HOST_WIDE_INT *fldsize = nullptr)
+{
+  HOST_WIDE_INT sizebuf;
+  if (!fldsize)
+    fldsize = &sizebuf;
+
+  *fldsize = -1;
+
+  *fldoff = *off = *fldsize = 0;
+  tree orig = get_origin_and_offset_r (x, fldoff, fldsize, off);
+  if (!orig)
+    return NULL_TREE;
+
+  if (!*fldoff && *off == *fldsize)
+    {
+      *fldoff = *off;
+      *off = 0;
+    }
+
+  return orig;
+}
+
 /* If ARG refers to the same (sub)object or array element as described
    by DST and DST_FLD, return the byte offset into the struct member or
-   array element referenced by ARG.  Otherwise return HOST_WIDE_INT_MIN
-   to indicate that ARG and DST do not refer to the same object.  */
+   array element referenced by ARG and set *ARG_SIZE to the size of
+   the (sub)object.  Otherwise return HOST_WIDE_INT_MIN to indicate
+   that ARG and DST do not refer to the same object.  */
 
 static HOST_WIDE_INT
-alias_offset (tree arg, tree dst, HOST_WIDE_INT dst_fld)
+alias_offset (tree arg, HOST_WIDE_INT *arg_size,
+	      tree dst, HOST_WIDE_INT dst_fld)
 {
   /* See if the argument refers to the same base object as the destination
      of the formatted function call, and if so, try to determine if they
@@ -2430,7 +2443,7 @@ alias_offset (tree arg, tree dst, HOST_WIDE_INT dst_fld)
      to a struct member, see if the members are one and the same.  */
   HOST_WIDE_INT arg_off = 0, arg_fld = 0;
 
-  tree arg_orig = get_origin_and_offset (arg, &arg_fld, &arg_off);
+  tree arg_orig = get_origin_and_offset (arg, &arg_fld, &arg_off, arg_size);
 
   if (arg_orig == dst && arg_fld == dst_fld)
     return arg_off;
@@ -2448,6 +2461,10 @@ format_string (const directive &dir, tree arg, range_query *query)
 {
   fmtresult res;
 
+  /* The size of the (sub)object ARG refers to.  Used to adjust
+     the conservative get_string_length() result.  */
+  HOST_WIDE_INT arg_size = 0;
+
   if (warn_restrict)
     {
       /* See if ARG might alias the destination of the call with
@@ -2455,8 +2472,12 @@ format_string (const directive &dir, tree arg, range_query *query)
 	 so that the overlap can be determined for certain later,
 	 when the amount of output of the call (including subsequent
 	 directives) has been computed.  Otherwise, store HWI_MIN.  */
-      res.dst_offset = alias_offset (arg, dir.info->dst_origin,
+      res.dst_offset = alias_offset (arg, &arg_size, dir.info->dst_origin,
 				     dir.info->dst_field);
+      if (res.dst_offset >= 0 && res.dst_offset <= arg_size)
+	arg_size -= res.dst_offset;
+      else
+	arg_size = 0;
     }
 
   /* Compute the range the argument's length can be in.  */
@@ -2473,7 +2494,8 @@ format_string (const directive &dir, tree arg, range_query *query)
       gcc_checking_assert (count_by == 2 || count_by == 4);
     }
 
-  fmtresult slen = get_string_length (arg, dir.info->callstmt, count_by, query);
+  fmtresult slen =
+    get_string_length (arg, dir.info->callstmt, arg_size, count_by, query);
   if (slen.range.min == slen.range.max
       && slen.range.min < HOST_WIDE_INT_MAX)
     {
diff --git a/gcc/pointer-query.cc b/gcc/pointer-query.cc
index 910f452868e..0b1bf75c913 100644
--- a/gcc/pointer-query.cc
+++ b/gcc/pointer-query.cc
@@ -2196,3 +2196,167 @@ compute_objsize (tree ptr, int ostype, tree *pdecl /* = NULL */,
 
   return size;
 }
+
+/* Determine the offset *FLDOFF of the first byte of a struct member
+   of TYPE (possibly recursively) into which the byte offset OFF points,
+   starting after the field START_AFTER if it's non-null.  On success,
+   if nonnull, set *FLDOFF to the offset of the first byte, and return
+   the field decl.  If nonnull, set *NEXTOFF to the offset of the next
+   field (which reflects any padding between the returned field and
+   the next).  Otherwise, if no such member can be found, return null.  */
+
+tree
+field_at_offset (tree type, tree start_after, HOST_WIDE_INT off,
+		 HOST_WIDE_INT *fldoff /* = nullptr */,
+		 HOST_WIDE_INT *nextoff /* = nullptr */)
+{
+  tree first_fld = TYPE_FIELDS (type);
+
+  HOST_WIDE_INT offbuf = 0, nextbuf = 0;
+  if (!fldoff)
+    fldoff = &offbuf;
+  if (!nextoff)
+    nextoff = &nextbuf;
+
+  *nextoff = 0;
+
+  /* The field to return.  */
+  tree last_fld = NULL_TREE;
+  /* The next field to advance to.  */
+  tree next_fld = NULL_TREE;
+
+  /* NEXT_FLD's cached offset.  */
+  HOST_WIDE_INT next_pos = -1;
+
+  for (tree fld = first_fld; fld; fld = next_fld)
+    {
+      next_fld = fld;
+      do
+	/* Advance to the next relevant data member.  */
+	next_fld = TREE_CHAIN (next_fld);
+      while (next_fld
+	     && (TREE_CODE (next_fld) != FIELD_DECL
+		 || DECL_ARTIFICIAL (next_fld)));
+
+      if (TREE_CODE (fld) != FIELD_DECL || DECL_ARTIFICIAL (fld))
+	continue;
+
+      if (fld == start_after)
+	continue;
+
+      tree fldtype = TREE_TYPE (fld);
+      /* The offset of FLD within its immediately enclosing structure.  */
+      HOST_WIDE_INT fldpos = next_pos < 0 ? int_byte_position (fld) : next_pos;
+
+      /* If the size is not available the field is a flexible array
+	 member.  Treat this case as success.  */
+      tree typesize = TYPE_SIZE_UNIT (fldtype);
+      HOST_WIDE_INT fldsize = (tree_fits_uhwi_p (typesize)
+			       ? tree_to_uhwi (typesize)
+			       : off);
+
+      /* If OFF is beyond the end of the current field continue.  */
+      HOST_WIDE_INT fldend = fldpos + fldsize;
+      if (fldend < off)
+	continue;
+
+      if (next_fld)
+	{
+	  /* If OFF is equal to the offset of the next field continue
+	     to it and skip the array/struct business below.  */
+	  next_pos = int_byte_position (next_fld);
+	  *nextoff = *fldoff + next_pos;
+	  if (*nextoff == off && TREE_CODE (type) != UNION_TYPE)
+	    continue;
+	}
+      else
+	*nextoff = HOST_WIDE_INT_MAX;
+
+      /* OFF refers somewhere into the current field or just past its end,
+	 which could mean it refers to the next field.  */
+      if (TREE_CODE (fldtype) == ARRAY_TYPE)
+	{
+	  /* Will be set to the offset of the first byte of the array
+	     element (which may be an array) of FLDTYPE into which
+	     OFF - FLDPOS points (which may be past ELTOFF).  */
+	  HOST_WIDE_INT eltoff = 0;
+	  if (tree ft = array_elt_at_offset (fldtype, off - fldpos, &eltoff))
+	    fldtype = ft;
+	  else
+	    continue;
+
+	  /* Advance the position to include the array element above.
+	     If OFF - FLPOS refers to a member of FLDTYPE, the member
+	     will be determined below.  */
+	  fldpos += eltoff;
+	}
+
+      *fldoff += fldpos;
+
+      if (TREE_CODE (fldtype) == RECORD_TYPE)
+	/* Drill down into the current field if it's a struct.  */
+	fld = field_at_offset (fldtype, start_after, off - fldpos,
+			       fldoff, nextoff);
+
+      last_fld = fld;
+
+      /* Unless the offset is just past the end of the field return it.
+	 Otherwise save it and return it only if the offset of the next
+	 next field is greater (i.e., there is padding between the two)
+	 or if there is no next field.  */
+      if (off < fldend)
+	break;
+    }
+
+  if (*nextoff == HOST_WIDE_INT_MAX && next_fld)
+    *nextoff = next_pos;
+
+  return last_fld;
+}
+
+/* Determine the offset *ELTOFF of the first byte of the array element
+   of array ARTYPE into which the byte offset OFF points.  On success
+   set *ELTOFF to the offset of the first byte and return type.
+   Otherwise, if no such element can be found, return null.  */
+
+tree
+array_elt_at_offset (tree artype, HOST_WIDE_INT off,
+		     HOST_WIDE_INT *eltoff /* = nullptr */,
+		     HOST_WIDE_INT *subar_size /* = nullptr */)
+{
+  gcc_assert (TREE_CODE (artype) == ARRAY_TYPE);
+
+  HOST_WIDE_INT dummy;
+  if (!eltoff)
+    eltoff = &dummy;
+  if (!subar_size)
+    subar_size = &dummy;
+
+  tree eltype = artype;
+  while (TREE_CODE (TREE_TYPE (eltype)) == ARRAY_TYPE)
+    eltype = TREE_TYPE (eltype);
+
+  tree subartype = eltype;
+  if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (eltype))
+      || TYPE_MODE (TREE_TYPE (eltype)) != TYPE_MODE (char_type_node))
+    eltype = TREE_TYPE (eltype);
+
+  *subar_size = int_size_in_bytes (subartype);
+
+  if (eltype == artype)
+    {
+      *eltoff = 0;
+      return artype;
+    }
+
+  HOST_WIDE_INT artype_size = int_size_in_bytes (artype);
+  HOST_WIDE_INT eltype_size = int_size_in_bytes (eltype);
+
+  if (off < artype_size)// * eltype_size)
+    {
+      *eltoff = (off / eltype_size) * eltype_size;
+      return TREE_CODE (eltype) == ARRAY_TYPE ? TREE_TYPE (eltype) : eltype;
+    }
+
+  return NULL_TREE;
+}
diff --git a/gcc/pointer-query.h b/gcc/pointer-query.h
index 3c8172c652d..18743ad5b0a 100644
--- a/gcc/pointer-query.h
+++ b/gcc/pointer-query.h
@@ -255,4 +255,13 @@ extern tree compute_objsize (tree, int, access_ref *, pointer_query *);
 extern tree compute_objsize (tree, int, tree * = NULL, tree * = NULL,
 			     range_query * = NULL);
 
+/* Return the field at the constant offset.  */
+extern tree field_at_offset (tree, tree, HOST_WIDE_INT,
+			     HOST_WIDE_INT * = nullptr,
+			     HOST_WIDE_INT * = nullptr);
+/* Return the array at the constant offset.  */
+extern tree array_elt_at_offset (tree, HOST_WIDE_INT,
+				 HOST_WIDE_INT * = nullptr,
+				 HOST_WIDE_INT * = nullptr);
+
 #endif   // GCC_POINTER_QUERY_H
diff --git a/gcc/testsuite/gcc.dg/Wrestrict-23.c b/gcc/testsuite/gcc.dg/Wrestrict-23.c
new file mode 100644
index 00000000000..c7a828bb8eb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wrestrict-23.c
@@ -0,0 +1,146 @@
+/* PR tree-optimization/102238 - missing -Wrestrict on sprintf formatting
+   a struct member into enclosing object
+   { dg-do compile }
+   { dg-options "-O2 -Wall -Wno-format-overflow" } */
+
+extern int sprintf (char*, const char*, ...);
+
+extern void sink (void*, ...);
+
+struct A
+{
+  char a[4];
+};
+
+struct B
+{
+  struct A a1, a2;
+};
+
+extern struct B eb;
+
+enum { B_a2_a_off = __builtin_offsetof (struct B, a2.a) };
+
+
+void test_warn_src_decl_plus (void)
+{
+  {
+    char *s = (char*)&eb + B_a2_a_off;
+    char *d = eb.a2.a;
+    sprintf (d, "%s", s);     // { dg-warning "overlaps" }
+  }
+
+  {
+    // If strlen (s) > 0 there is overlap with a[1].
+    char *s = (char*)&eb + B_a2_a_off + 1;
+    char *d = eb.a2.a;
+    sprintf (d, "%s", s);     // { dg-warning "may overlap" }
+  }
+
+  {
+    // strlen (s) must be at most 1 so there can be no overlap with a.
+    char *s = (char*)&eb + B_a2_a_off + 2;
+    char *d = eb.a2.a;
+    sprintf (d, "%s", s);     // { dg-bogus "-Wrestrict" }
+  }
+
+  {
+    // strlen (s) must be at most 0 so there can be no overlap with a.
+    char *s = (char*)&eb + B_a2_a_off + 3;
+    char *d = eb.a2.a;
+    sprintf (d, "%s", s);     // { dg-bogus "-Wrestrict" }
+  }
+}
+
+
+void test_warn_src_ptr_plus (struct B *p)
+{
+  {
+    char *s = (char*)p + B_a2_a_off;
+    char *d = p->a2.a;
+    sprintf (d, "%s", s);     // { dg-warning "overlaps" }
+  }
+
+  {
+    // If strlen (s) > 0 there is overlap with a[1].
+    char *s = (char*)p + B_a2_a_off + 1;
+    char *d = p->a2.a;
+    sprintf (d, "%s", s);     // { dg-warning "may overlap" }
+  }
+
+  {
+    // strlen (s) must be at most 1 so there can be no overlap with a.
+    char *s = (char*)p + B_a2_a_off + 2;
+    char *d = p->a2.a;
+    sprintf (d, "%s", s);     // { dg-bogus "-Wrestrict" }
+  }
+
+  {
+    // strlen (s) must be at most 0 so there can be no overlap with a.
+    char *s = (char*)p + B_a2_a_off + 3;
+    char *d = p->a2.a;
+    sprintf (d, "%s", s);     // { dg-bogus "-Wrestrict" }
+  }
+}
+
+
+void test_warn_dst_decl_plus (void)
+{
+  {
+    char *s = eb.a2.a;
+    char *d = (char*)&eb + B_a2_a_off;
+    sprintf (d, "%s", s);     // { dg-warning "overlaps" }
+  }
+
+  {
+    // If strlen (a) > 0 there is overlap with a[1].
+    char *s = eb.a2.a;
+    char *d = (char*)&eb + B_a2_a_off + 1;
+    sprintf (d, "%s", s);     // { dg-warning "may overlap" }
+  }
+
+  {
+    // If strlen (a) > 1 there is overlap with a[2].
+    char *s = eb.a2.a;
+    char *d = (char*)&eb + B_a2_a_off + 2;
+    sprintf (d, "%s", s);     // { dg-warning "may overlap" }
+  }
+
+  {
+    // If strlen (a) > 2 there is overlap with a[3].
+    char *s = eb.a2.a;
+    char *d = (char*)&eb + B_a2_a_off + 3;
+    sprintf (d, "%s", s);     // { dg-warning "may overlap" }
+  }
+}
+
+
+void test_warn_dst_ptr_plus (struct B *p)
+{
+  {
+    char *s = p->a2.a;
+    char *d = (char*)p + B_a2_a_off;
+    sprintf (d, "%s", s);     // { dg-warning "overlaps" }
+  }
+
+  {
+    // If strlen (a) > 0 there is overlap with a[1].
+    char *s = p->a2.a;
+    char *d = (char*)p + B_a2_a_off + 1;
+    sprintf (d, "%s", s);     // { dg-warning "may overlap" }
+  }
+
+  {
+    // If strlen (a) > 1 there is overlap with a[2].
+    char *s = p->a2.a;
+    char *d = (char*)p + B_a2_a_off + 2;
+    sprintf (d, "%s", s);     // { dg-warning "may overlap" }
+  }
+
+  {
+    // If strlen (a) > 2 there is overlap with a[3].
+    char *s = p->a2.a;
+    char *d = (char*)p + B_a2_a_off + 3;
+    sprintf (d, "%s", s);     // { dg-warning "may overlap" }
+  }
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c
index 7fb96514f1d..112b08afc44 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-23.c
@@ -214,12 +214,14 @@ void test_struct_member_array (struct S3 *s3, int i)
   T (d, "%s", d);       /* { dg-warning "overlaps" } */
   T (d, "%s", d + 0);   /* { dg-warning "overlaps" } */
   T (d, "%s", d + 1);   /* { dg-warning "may overlap" } */
-  T (d, "%s", d + 2);   /* { dg-warning "may overlap" } */
+  /* Since d below points to char[4], strlen(d + 2) must be at most 1
+     and so the call cannot overlap. */
+  T (d, "%s", d + 2);
   T (d, "%s", d + i);   /* { dg-warning "may overlap" } */
 
   T (d, "%s", &d[0]);   /* { dg-warning "overlaps" } */
   T (d, "%s", &d[1]);   /* { dg-warning "may overlap" } */
-  T (d, "%s", &d[2]);   /* { dg-warning "may overlap" } */
+  T (d, "%s", &d[2]);
   T (d, "%s", &d[i]);   /* { dg-warning "may overlap" } */
 
   T (d + 0, "%s", d);   /* { dg-warning "overlaps" } */
@@ -236,7 +238,7 @@ void test_struct_member_array (struct S3 *s3, int i)
 
   T (d, "%s", s);       /* { dg-warning "overlaps" } */
   T (d, "%s", s + 1);   /* { dg-warning "may overlap" } */
-  T (d, "%s", s + 2);   /* { dg-warning "may overlap" } */
+  T (d, "%s", s + 2);
   T (d, "%s", s + i);   /* { dg-warning "may overlap" } */
 
   s = s3->s2_1.s_1.b;
@@ -324,7 +326,7 @@ void test_struct_member_array (struct S3 *s3, int i)
 
   T (d, "%s", s);       /* { dg-warning "overlaps" } */
   T (d, "%s", s + 1);   /* { dg-warning "may overlap" } */
-  T (d, "%s", s + 2);   /* { dg-warning "may overlap" } */
+  T (d, "%s", s + 2);
   T (d, "%s", s + i);   /* { dg-warning "may overlap" } */
 
   s = s3->s2_2.s_2.a;
@@ -368,7 +370,7 @@ void test_struct_member_array (struct S3 *s3, int i)
 
   T (d, "%s", s);       /* { dg-warning "overlaps" } */
   T (d, "%s", s + 1);   /* { dg-warning "may overlap" } */
-  T (d, "%s", s + 2);   /* { dg-warning "may overlap" } */
+  T (d, "%s", s + 2);
   T (d, "%s", s + i);   /* { dg-warning "may overlap" } */
 
   s = s3->s2_2.s_2.a;
@@ -394,12 +396,12 @@ void test_struct_member_array_array (struct S3 *s3, int i)
   T (d, "%s", s);       /* { dg-warning "overlaps" } */
   T (d, "%s", s + 0);   /* { dg-warning "overlaps" } */
   T (d, "%s", s + 1);   /* { dg-warning "may overlap" } */
-  T (d, "%s", s + 2);   /* { dg-warning "may overlap" } */
+  T (d, "%s", s + 2);
   T (d, "%s", s + i);   /* { dg-warning "may overlap" } */
 
   T (d, "%s", &s[0]);   /* { dg-warning "overlaps" } */
   T (d, "%s", &s[1]);   /* { dg-warning "may overlap" } */
-  T (d, "%s", &s[2]);   /* { dg-warning "may overlap" } */
+  T (d, "%s", &s[2]);
   T (d, "%s", &s[i]);   /* { dg-warning "may overlap" } */
 
   T (d + 0, "%s", s);   /* { dg-warning "overlaps" } */
@@ -566,12 +568,12 @@ void test_union_member_array (union U *un, int i)
   T (d, "%s", d);       /* { dg-warning "overlaps" } */
   T (d, "%s", d + 0);   /* { dg-warning "overlaps" } */
   T (d, "%s", d + 1);   /* { dg-warning "may overlap" } */
-  T (d, "%s", d + 2);   /* { dg-warning "may overlap" } */
+  T (d, "%s", d + 2);
   T (d, "%s", d + i);   /* { dg-warning "may overlap" } */
 
   T (d, "%s", &d[0]);   /* { dg-warning "overlaps" } */
   T (d, "%s", &d[1]);   /* { dg-warning "may overlap" } */
-  T (d, "%s", &d[2]);   /* { dg-warning "may overlap" } */
+  T (d, "%s", &d[2]);
   T (d, "%s", &d[i]);   /* { dg-warning "may overlap" } */
 
   T (d + 0, "%s", d);   /* { dg-warning "overlaps" } */
@@ -588,7 +590,7 @@ void test_union_member_array (union U *un, int i)
 
   T (d, "%s", s);       /* { dg-warning "overlaps" } */
   T (d, "%s", s + 1);   /* { dg-warning "may overlap" } */
-  T (d, "%s", s + 2);   /* { dg-warning "may overlap" } */
+  T (d, "%s", s + 2);
   T (d, "%s", s + i);   /* { dg-warning "may overlap" } */
 
   s = un->s2_1.s_1.b;
@@ -616,7 +618,7 @@ void test_union_member_array (union U *un, int i)
 
   T (d, "%s", s);       /* { dg-warning "overlaps" } */
   T (d, "%s", s + 1);   /* { dg-warning "may overlap" } */
-  T (d, "%s", s + 2);   /* { dg-warning "may overlap" } */
+  T (d, "%s", s + 2);
   T (d, "%s", s + i);   /* { dg-warning "may overlap" } */
 
   s = un->s2_2.s_1.b;

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [PATCH] improve handling of aggregates in sprintf [PR 102238, 102919]
  2021-10-24 23:43 [PATCH] improve handling of aggregates in sprintf [PR 102238, 102919] Martin Sebor
@ 2021-10-25  2:18 ` Jeff Law
  0 siblings, 0 replies; 2+ messages in thread
From: Jeff Law @ 2021-10-25  2:18 UTC (permalink / raw)
  To: Martin Sebor, gcc-patches



On 10/24/2021 5:43 PM, Martin Sebor via Gcc-patches wrote:
> The detection of overlapping sprintf calls has a limitation
> that leads to both false positives (PR 102919) and negatives
> (PR 102238) in corner cases involving members of aggregates.
> The false positives result from the overlap logic not using
> the size of the member used as an argument to %s to constrain
> the length of the directive output.  The false negatives are
> due to the logic failing to determine the identity of a member
> from the address or reference to the enclosing object and
> an offset.
>
> The attached patch improves the detection logic to handle both
> sets of cases.  In addition, it moves the utility functions used
> to implement the logic from the sprintf pass into pointer-query
> where they can be used for other purposes in the future (my
> work in progress).
>
> Tested on x86_64-linux and by building Glibc and verifying
> it doesn't cause any new warnings,
>
> Martin
>
> gcc-102238.diff
>
> PR tree-optimization/102238 - alias_offset in gimple-ssa-sprintf.c is broken
> PR tree-optimization/102919 - spurious -Wrestrict warning for sprintf into the same member array as argument plus offset
>
> gcc/ChangeLog:
>
> 	PR tree-optimization/102238
> 	PR tree-optimization/102919
> 	* gimple-ssa-sprintf.c (get_string_length): Ad an argument.
> 	(array_elt_at_offset): Move to pointer-query.
> 	(set_aggregate_size_and_offset): New function.
> 	(field_at_offset):  Move to pointer-query.
> 	(get_origin_and_offset): Rename...
> 	(get_origin_and_offset_r): this.  Add an argument.  Make aggregate
> 	handling more robust.
> 	(get_origin_and_offset): New.
> 	(alias_offset): Add an argument.
> 	(format_string): Use subobject size determined by get_origin_and_offset.
> 	* pointer-query.cc (field_at_offset): Move from gimple-ssa-sprintf.c.
> 	Improve/correct handling of aggregates.
> 	(array_elt_at_offset): Same.
> 	* pointer-query.h (field_at_offset): Declare.
> 	(array_elt_at_offset): Declare.
>
> gcc/testsuite/ChangeLog:
>
> 	PR tree-optimization/102238
> 	PR tree-optimization/102919
> 	* gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Remove warnings.
> 	* gcc.dg/Wrestrict-23.c: New test.
Given you know this code better than anyone.  OK.
jeff


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-10-25  2:18 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-24 23:43 [PATCH] improve handling of aggregates in sprintf [PR 102238, 102919] Martin Sebor
2021-10-25  2:18 ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).