public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] handle MEM_REF with void* arguments (PR c++/95768)
@ 2020-06-22 17:25 Martin Sebor
  2020-06-22 18:55 ` Jason Merrill
  0 siblings, 1 reply; 11+ messages in thread
From: Martin Sebor @ 2020-06-22 17:25 UTC (permalink / raw)
  To: gcc-patches, Jason Merrill

[-- Attachment #1: Type: text/plain, Size: 322 bytes --]

The attached fix parallels the one for the equivalent C bug 95580
where the pretty printers don't correctly handle MEM_REF arguments
with type void* or other pointers to an incomplete type.

The incorrect handling was exposed by the recent change to
-Wuninitialized which includes such expressions in diagnostics.

Martin

[-- Attachment #2: gcc-95768.diff --]
[-- Type: text/x-patch, Size: 2863 bytes --]

PR c++/95768 - pretty-printer ICE on -Wuninitialized with allocated storage

gcc/cp/ChangeLog:

	PR c++/95768
	* error.c (dump_expr): Handle sizeless operand types such as void*.

gcc/testsuite/ChangeLog:

	PR c++/95768
	* g++.dg/pr95768.C: New test.

diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 0d6375e5e14..3a7254fdce1 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -2374,32 +2374,37 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
       break;
 
     case MEM_REF:
-      if (TREE_CODE (TREE_OPERAND (t, 0)) == ADDR_EXPR
-	  && integer_zerop (TREE_OPERAND (t, 1)))
-	dump_expr (pp, TREE_OPERAND (TREE_OPERAND (t, 0), 0), flags);
-      else
-	{
-	  pp_cxx_star (pp);
-	  if (!integer_zerop (TREE_OPERAND (t, 1)))
-	    {
-	      pp_cxx_left_paren (pp);
-	      if (!integer_onep (TYPE_SIZE_UNIT
-				 (TREE_TYPE (TREE_TYPE (TREE_OPERAND (t, 0))))))
-		{
-		  pp_cxx_left_paren (pp);
-		  dump_type (pp, ptr_type_node, flags);
-		  pp_cxx_right_paren (pp);
-		}
-	    }
-	  dump_expr (pp, TREE_OPERAND (t, 0), flags);
-	  if (!integer_zerop (TREE_OPERAND (t, 1)))
-	    {
-	      pp_cxx_ws_string (pp, "+");
-	      dump_expr (pp, fold_convert (ssizetype, TREE_OPERAND (t, 1)),
-                         flags);
-	      pp_cxx_right_paren (pp);
-	    }
-	}
+      {
+	tree arg = TREE_OPERAND (t, 0);
+	tree offset = TREE_OPERAND (t, 1);
+
+	if (TREE_CODE (arg) == ADDR_EXPR
+	    && integer_zerop (offset))
+	  dump_expr (pp, TREE_OPERAND (arg, 0), flags);
+	else
+	  {
+	    pp_cxx_star (pp);
+	    if (!integer_zerop (offset))
+	      {
+		pp_cxx_left_paren (pp);
+		tree argtype = TREE_TYPE (arg);
+		if (tree size = TYPE_SIZE_UNIT (TREE_TYPE (argtype)))
+		  if (!integer_onep (size))
+		    {
+		      pp_cxx_left_paren (pp);
+		      dump_type (pp, ptr_type_node, flags);
+		      pp_cxx_right_paren (pp);
+		    }
+	      }
+	    dump_expr (pp, arg, flags);
+	    if (!integer_zerop (offset))
+	      {
+		pp_cxx_ws_string (pp, "+");
+		dump_expr (pp, fold_convert (ssizetype, offset), flags);
+		pp_cxx_right_paren (pp);
+	      }
+	  }
+      }
       break;
 
     case NEGATE_EXPR:
diff --git a/gcc/testsuite/g++.dg/pr95768.C b/gcc/testsuite/g++.dg/pr95768.C
new file mode 100644
index 00000000000..babfcb49bf8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr95768.C
@@ -0,0 +1,32 @@
+/* PR c++/95768 - pretty-printer ICE on -Wuninitialized with allocated storage
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+extern "C" void *malloc (__SIZE_TYPE__);
+
+struct f
+{
+  int i;
+  static int e (int);
+  void operator= (int) { e (i); }
+};
+
+struct m {
+  int i;
+  f length;
+};
+
+struct n {
+  m *o() { return (m *)this; }
+};
+
+struct p {
+  n *header;
+  p () {
+    header = (n *)malloc (0);
+    m b = *header->o();       // { dg-warning "-Wuninitialized" }
+    b.length = 0;
+  }
+};
+
+void detach2() { p(); }

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] handle MEM_REF with void* arguments (PR c++/95768)
  2020-06-22 17:25 [PATCH] handle MEM_REF with void* arguments (PR c++/95768) Martin Sebor
@ 2020-06-22 18:55 ` Jason Merrill
  2020-06-22 22:22   ` Martin Sebor
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Merrill @ 2020-06-22 18:55 UTC (permalink / raw)
  To: Martin Sebor, gcc-patches

On 6/22/20 1:25 PM, Martin Sebor wrote:
> The attached fix parallels the one for the equivalent C bug 95580
> where the pretty printers don't correctly handle MEM_REF arguments
> with type void* or other pointers to an incomplete type.
> 
> The incorrect handling was exposed by the recent change to
> -Wuninitialized which includes such expressions in diagnostics.

> +		if (tree size = TYPE_SIZE_UNIT (TREE_TYPE (argtype)))
> +		  if (!integer_onep (size))
> +		    {
> +		      pp_cxx_left_paren (pp);
> +		      dump_type (pp, ptr_type_node, flags);
> +		      pp_cxx_right_paren (pp);
> +		    }

Don't we want to print the cast if the pointer target type is incomplete?

Jason


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] handle MEM_REF with void* arguments (PR c++/95768)
  2020-06-22 18:55 ` Jason Merrill
@ 2020-06-22 22:22   ` Martin Sebor
  2020-06-23  7:12     ` Richard Biener
  0 siblings, 1 reply; 11+ messages in thread
From: Martin Sebor @ 2020-06-22 22:22 UTC (permalink / raw)
  To: Jason Merrill, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2436 bytes --]

On 6/22/20 12:55 PM, Jason Merrill wrote:
> On 6/22/20 1:25 PM, Martin Sebor wrote:
>> The attached fix parallels the one for the equivalent C bug 95580
>> where the pretty printers don't correctly handle MEM_REF arguments
>> with type void* or other pointers to an incomplete type.
>>
>> The incorrect handling was exposed by the recent change to
>> -Wuninitialized which includes such expressions in diagnostics.
> 
>> +        if (tree size = TYPE_SIZE_UNIT (TREE_TYPE (argtype)))
>> +          if (!integer_onep (size))
>> +            {
>> +              pp_cxx_left_paren (pp);
>> +              dump_type (pp, ptr_type_node, flags);
>> +              pp_cxx_right_paren (pp);
>> +            }
> 
> Don't we want to print the cast if the pointer target type is incomplete?

I suppose, yes, although after some more testing I think what should
be output is the type of the access.  The target pointer type isn't
meaningful (at least not in this case).

Here's what the warning looks like in C for the test case in
gcc.dg/pr95580.c:

   warning: ‘*((void *)(p)+1)’ may be used uninitialized

and like this in C++:

   warning: ‘*(p +1)’ may be used uninitialized

The +1 is a byte offset, which is correct given that incrementing
a void* in GCC is the same as adding 1 to the byte address, but
dereferencing a void* doesn't correspond to what's going on in
the source.

Even for a complete type (with size greater than 1), printing
the type of the argument plus a byte offset is wrong.  It ends
up with this for the C++ test case from 95768:

   warning: ‘*((int*)<unknown> +4)’ is used uninitialized

when the access is actually ‘*((int*)<unknown> +1)’

So it seems to me for MEM_REF, to make the output meaningful,
it's the type of the access (i.e., the MEM_REF type) that should
be printed here, and the offset should either be in elements of
the accessed type, i.e.,

   warning: ‘*((int*)<unknown> +1)’ is used uninitialized

or, if the access is misaligned, the argument should first be
cast to char*, the offset added, and the result then cast to
the access type, like this:

   warning: ‘*(T*)((char*)<unknown> +1)’ is used uninitialized

The attached revised and less than fully tested patch implements
this for C++ only for now.  If we agree on this approach I'll see
about making the corresponding change in C.

Martin

[-- Attachment #2: gcc-95768.diff --]
[-- Type: text/x-patch, Size: 5229 bytes --]

PR c++/95768 - pretty-printer ICE on -Wuninitialized with allocated storage

gcc/cp/ChangeLog:

	PR c++/95768
	* error.c (dump_expr): Handle sizeless operand types such as void*.

gcc/testsuite/ChangeLog:

	PR c++/95768
	* g++.dg/pr95768.C: New test.

diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 0d6375e5e14..ea1f3232e3d 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -2374,32 +2374,63 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
       break;
 
     case MEM_REF:
-      if (TREE_CODE (TREE_OPERAND (t, 0)) == ADDR_EXPR
-	  && integer_zerop (TREE_OPERAND (t, 1)))
-	dump_expr (pp, TREE_OPERAND (TREE_OPERAND (t, 0), 0), flags);
-      else
-	{
-	  pp_cxx_star (pp);
-	  if (!integer_zerop (TREE_OPERAND (t, 1)))
-	    {
-	      pp_cxx_left_paren (pp);
-	      if (!integer_onep (TYPE_SIZE_UNIT
-				 (TREE_TYPE (TREE_TYPE (TREE_OPERAND (t, 0))))))
-		{
-		  pp_cxx_left_paren (pp);
-		  dump_type (pp, ptr_type_node, flags);
-		  pp_cxx_right_paren (pp);
-		}
-	    }
-	  dump_expr (pp, TREE_OPERAND (t, 0), flags);
-	  if (!integer_zerop (TREE_OPERAND (t, 1)))
-	    {
-	      pp_cxx_ws_string (pp, "+");
-	      dump_expr (pp, fold_convert (ssizetype, TREE_OPERAND (t, 1)),
-                         flags);
-	      pp_cxx_right_paren (pp);
-	    }
-	}
+      {
+	tree type = TREE_TYPE (t);
+	tree arg = TREE_OPERAND (t, 0);
+	tree offset = TREE_OPERAND (t, 1);
+	bool zero_offset = integer_zerop (offset);
+
+	if (TREE_CODE (arg) == ADDR_EXPR && zero_offset)
+	  dump_expr (pp, TREE_OPERAND (arg, 0), flags);
+	else
+	  {
+	    pp_cxx_star (pp);
+	    if (!zero_offset)
+	      {
+		pp_cxx_left_paren (pp);
+		pp_cxx_left_paren (pp);
+		dump_type (pp, type, flags);
+		pp_cxx_star (pp);
+		pp_cxx_right_paren (pp);
+
+		bool byte_offset = true;
+
+		if (tree size = TYPE_SIZE_UNIT (type))
+		  {
+		    /* For naturally aligned accesses print the nonzero
+		       offset in units of the access type.  For unaligned
+		       accesses print a byte offset instead.  */
+		    offset_int wsiz = wi::to_offset (size);
+		    offset_int woff = wi::to_offset (offset);
+		    offset_int szlg2 = wi::floor_log2 (wsiz);
+		    offset_int eltoff = woff >> szlg2;
+		    if (eltoff << szlg2 == woff)
+		      {
+			offset = wide_int_to_tree (ssizetype, eltoff);
+			byte_offset = false;
+		      }
+		  }
+
+		if (byte_offset)
+		  {
+		    /* When printing the byte offset include a cast to
+		       a character type first, before printing the cast
+		       to the access type.  */
+		    pp_cxx_left_paren (pp);
+		    dump_type (pp, char_type_node, flags);
+		    pp_cxx_star (pp);
+		    pp_cxx_right_paren (pp);
+		  }
+	      }
+	    dump_expr (pp, arg, flags);
+	    if (!zero_offset)
+	      {
+		pp_cxx_ws_string (pp, "+");
+		dump_expr (pp, fold_convert (ssizetype, offset), flags);
+		pp_cxx_right_paren (pp);
+	      }
+	  }
+      }
       break;
 
     case NEGATE_EXPR:
diff --git a/gcc/testsuite/g++.dg/pr95768.C b/gcc/testsuite/g++.dg/pr95768.C
new file mode 100644
index 00000000000..23e6988b410
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr95768.C
@@ -0,0 +1,32 @@
+/* PR c++/95768 - pretty-printer ICE on -Wuninitialized with allocated storage
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+extern "C" void *malloc (__SIZE_TYPE__);
+
+struct f
+{
+  int i;
+  static int e (int);
+  void operator= (int) { e (i); }
+};
+
+struct m {
+  int i;
+  f length;
+};
+
+struct n {
+  m *o() { return (m *)this; }
+};
+
+struct p {
+  n *header;
+  p () {
+    header = (n *)malloc (0);
+    m b = *header->o();       // { dg-warning "'\\*\\(\\(int\\*\\)<\[a-z\]+> \\+1\\)' is used uninitialized" }
+    b.length = 0;
+  }
+};
+
+void detach2() { p(); }
diff --git a/gcc/testsuite/g++.dg/warn/Wuninitialized-11.C b/gcc/testsuite/g++.dg/warn/Wuninitialized-11.C
new file mode 100644
index 00000000000..6f8a4586adf
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wuninitialized-11.C
@@ -0,0 +1,40 @@
+/* Verify that -Wuninitialized warnings about accesses to objects via
+   pointers and offsets mention valid expressions.
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+typedef __INT16_TYPE__ int16_t;
+typedef __INT32_TYPE__ int32_t;
+
+void sink (int);
+
+/* Verify properly aligned accesses at offsets that are multiples of
+   the access size.  */
+
+void test_aligned (void)
+{
+  char *p1 = (char*)__builtin_malloc (32);
+  p1 += sizeof (int32_t);
+
+  int16_t *p2 = (int16_t*)p1;
+  sink (p2[1]);               // { dg-warning "'\\\*\\\(\\\(int16_t\\\*\\\)p1 \\\+ *3\\\)' is used uninitialized" }
+
+  int32_t *p4 = (int32_t*)p1;
+  sink (p4[1]);               // { dg-warning "'\\\*\\\(\\\(int32_t\\\*\\\)p1 \\\+ *2\\\)' is used uninitialized" }
+}
+
+
+/* Verify misaligned accesses at offsets that aren't multiples of
+   the access size.  */
+
+void test_misaligned (void)
+{
+  char *p1 = (char*)__builtin_malloc (32);
+  p1 += 1;
+
+  int16_t *p2 = (int16_t*)p1;
+  sink (p2[1]);               // { dg-warning "'\\\*\\\(\\\(int16_t\\\*\\\)\\\(char\\\*\\\)p1 \\\+ *3\\\)' is used uninitialized" }
+
+  int32_t *p4 = (int32_t*)p1;
+  sink (p4[1]);               // { dg-warning "'\\\*\\\(\\\(int32_t\\\*\\)\\\(char\\\*\\\)p1 \\\+ *5\\\)' is used uninitialized" }
+}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] handle MEM_REF with void* arguments (PR c++/95768)
  2020-06-22 22:22   ` Martin Sebor
@ 2020-06-23  7:12     ` Richard Biener
  2020-06-28 23:07       ` Martin Sebor
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Biener @ 2020-06-23  7:12 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jason Merrill, gcc-patches

On Tue, Jun 23, 2020 at 12:22 AM Martin Sebor via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On 6/22/20 12:55 PM, Jason Merrill wrote:
> > On 6/22/20 1:25 PM, Martin Sebor wrote:
> >> The attached fix parallels the one for the equivalent C bug 95580
> >> where the pretty printers don't correctly handle MEM_REF arguments
> >> with type void* or other pointers to an incomplete type.
> >>
> >> The incorrect handling was exposed by the recent change to
> >> -Wuninitialized which includes such expressions in diagnostics.
> >
> >> +        if (tree size = TYPE_SIZE_UNIT (TREE_TYPE (argtype)))
> >> +          if (!integer_onep (size))
> >> +            {
> >> +              pp_cxx_left_paren (pp);
> >> +              dump_type (pp, ptr_type_node, flags);
> >> +              pp_cxx_right_paren (pp);
> >> +            }
> >
> > Don't we want to print the cast if the pointer target type is incomplete?
>
> I suppose, yes, although after some more testing I think what should
> be output is the type of the access.  The target pointer type isn't
> meaningful (at least not in this case).
>
> Here's what the warning looks like in C for the test case in
> gcc.dg/pr95580.c:
>
>    warning: ‘*((void *)(p)+1)’ may be used uninitialized
>
> and like this in C++:
>
>    warning: ‘*(p +1)’ may be used uninitialized
>
> The +1 is a byte offset, which is correct given that incrementing
> a void* in GCC is the same as adding 1 to the byte address, but
> dereferencing a void* doesn't correspond to what's going on in
> the source.
>
> Even for a complete type (with size greater than 1), printing
> the type of the argument plus a byte offset is wrong.  It ends
> up with this for the C++ test case from 95768:
>
>    warning: ‘*((int*)<unknown> +4)’ is used uninitialized
>
> when the access is actually ‘*((int*)<unknown> +1)’
>
> So it seems to me for MEM_REF, to make the output meaningful,
> it's the type of the access (i.e., the MEM_REF type) that should
> be printed here, and the offset should either be in elements of
> the accessed type, i.e.,
>
>    warning: ‘*((int*)<unknown> +1)’ is used uninitialized
>
> or, if the access is misaligned, the argument should first be
> cast to char*, the offset added, and the result then cast to
> the access type, like this:
>
>    warning: ‘*(T*)((char*)<unknown> +1)’ is used uninitialized
>
> The attached revised and less than fully tested patch implements
> this for C++ only for now.  If we agree on this approach I'll see
> about making the corresponding change in C.

Note that there is no C/C++ way of fully expressing MEM_REF
semantics.  __MEM <int> ((T *)p + 1) is not actually
*(int *)((char *)p + 1) because that does not reflect that the
effective type of the lvalue when TBAA is concerned is 'T'
rather than 'int'.  Note for MEM_REF the offset is always
a constant byte offset but it indeed does not have to be a
multiple of the MEM_REF type size.

I wonder whether printing the MEM_REF in full provides
any real diagnostic value in the more "obfuscated" cases.

I'd also not print <unknown> but <register>.

Richard.

> Martin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] handle MEM_REF with void* arguments (PR c++/95768)
  2020-06-23  7:12     ` Richard Biener
@ 2020-06-28 23:07       ` Martin Sebor
  2020-06-29  7:19         ` Richard Biener
  0 siblings, 1 reply; 11+ messages in thread
From: Martin Sebor @ 2020-06-28 23:07 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jason Merrill, gcc-patches

On 6/23/20 1:12 AM, Richard Biener wrote:
> On Tue, Jun 23, 2020 at 12:22 AM Martin Sebor via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> On 6/22/20 12:55 PM, Jason Merrill wrote:
>>> On 6/22/20 1:25 PM, Martin Sebor wrote:
>>>> The attached fix parallels the one for the equivalent C bug 95580
>>>> where the pretty printers don't correctly handle MEM_REF arguments
>>>> with type void* or other pointers to an incomplete type.
>>>>
>>>> The incorrect handling was exposed by the recent change to
>>>> -Wuninitialized which includes such expressions in diagnostics.
>>>
>>>> +        if (tree size = TYPE_SIZE_UNIT (TREE_TYPE (argtype)))
>>>> +          if (!integer_onep (size))
>>>> +            {
>>>> +              pp_cxx_left_paren (pp);
>>>> +              dump_type (pp, ptr_type_node, flags);
>>>> +              pp_cxx_right_paren (pp);
>>>> +            }
>>>
>>> Don't we want to print the cast if the pointer target type is incomplete?
>>
>> I suppose, yes, although after some more testing I think what should
>> be output is the type of the access.  The target pointer type isn't
>> meaningful (at least not in this case).
>>
>> Here's what the warning looks like in C for the test case in
>> gcc.dg/pr95580.c:
>>
>>     warning: ‘*((void *)(p)+1)’ may be used uninitialized
>>
>> and like this in C++:
>>
>>     warning: ‘*(p +1)’ may be used uninitialized
>>
>> The +1 is a byte offset, which is correct given that incrementing
>> a void* in GCC is the same as adding 1 to the byte address, but
>> dereferencing a void* doesn't correspond to what's going on in
>> the source.
>>
>> Even for a complete type (with size greater than 1), printing
>> the type of the argument plus a byte offset is wrong.  It ends
>> up with this for the C++ test case from 95768:
>>
>>     warning: ‘*((int*)<unknown> +4)’ is used uninitialized
>>
>> when the access is actually ‘*((int*)<unknown> +1)’
>>
>> So it seems to me for MEM_REF, to make the output meaningful,
>> it's the type of the access (i.e., the MEM_REF type) that should
>> be printed here, and the offset should either be in elements of
>> the accessed type, i.e.,
>>
>>     warning: ‘*((int*)<unknown> +1)’ is used uninitialized
>>
>> or, if the access is misaligned, the argument should first be
>> cast to char*, the offset added, and the result then cast to
>> the access type, like this:
>>
>>     warning: ‘*(T*)((char*)<unknown> +1)’ is used uninitialized
>>
>> The attached revised and less than fully tested patch implements
>> this for C++ only for now.  If we agree on this approach I'll see
>> about making the corresponding change in C.
> 
> Note that there is no C/C++ way of fully expressing MEM_REF
> semantics.  __MEM <int> ((T *)p + 1) is not actually
> *(int *)((char *)p + 1) because that does not reflect that the
> effective type of the lvalue when TBAA is concerned is 'T'
> rather than 'int'.

What form would you say is closest to the C/C++ semantics, or
likely the most useful to users, that GCC could print instead?

> Note for MEM_REF the offset is always
> a constant byte offset but it indeed does not have to be a
> multiple of the MEM_REF type size.
> 
> I wonder whether printing the MEM_REF in full provides
> any real diagnostic value in the more "obfuscated" cases.

I'm not sure what obfuscated cases you're thinking of, or what
you mean by printing it in full.  I instrumented the code to
print every MEM_REF in that comes up in warn_uninitialized_vars
and rebuilt GCC.  There are 17,456 distinct instances so I didn't
review them all but those I did look at all look reasonable.
Probably the least useful are those that mention <unknown> by
itself (i.e., <unknown> or *<unknown>).  Those with an offset
are more informative (e.g., *((access**)<unknown> +1).  In
a few the offset is very large, such as *((unsigned int*)sp
+4611686018427387900), but that doesn't seem like a problem.
I'd be happy to share the result.

> 
> I'd also not print <unknown> but <register>.

I also don't find <unknown> helpful, but I don't see <register>
as an improvement.  I think printing the SSA variable would be
more informative here since its name is usually related to
the variable it was derived from in the source.  But making that
change (or any other like it) feels like too much feature creep
for this fix.  I'd be happy to do it in a follow up if we agree
it's a good idea.

Either way, please let me know if the patch is okay as is or,
if not, what type it should mention.

Martin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] handle MEM_REF with void* arguments (PR c++/95768)
  2020-06-28 23:07       ` Martin Sebor
@ 2020-06-29  7:19         ` Richard Biener
  2020-07-09 15:50           ` Martin Sebor
  0 siblings, 1 reply; 11+ messages in thread
From: Richard Biener @ 2020-06-29  7:19 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Jason Merrill, gcc-patches

On Mon, Jun 29, 2020 at 1:08 AM Martin Sebor <msebor@gmail.com> wrote:
>
> On 6/23/20 1:12 AM, Richard Biener wrote:
> > On Tue, Jun 23, 2020 at 12:22 AM Martin Sebor via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> On 6/22/20 12:55 PM, Jason Merrill wrote:
> >>> On 6/22/20 1:25 PM, Martin Sebor wrote:
> >>>> The attached fix parallels the one for the equivalent C bug 95580
> >>>> where the pretty printers don't correctly handle MEM_REF arguments
> >>>> with type void* or other pointers to an incomplete type.
> >>>>
> >>>> The incorrect handling was exposed by the recent change to
> >>>> -Wuninitialized which includes such expressions in diagnostics.
> >>>
> >>>> +        if (tree size = TYPE_SIZE_UNIT (TREE_TYPE (argtype)))
> >>>> +          if (!integer_onep (size))
> >>>> +            {
> >>>> +              pp_cxx_left_paren (pp);
> >>>> +              dump_type (pp, ptr_type_node, flags);
> >>>> +              pp_cxx_right_paren (pp);
> >>>> +            }
> >>>
> >>> Don't we want to print the cast if the pointer target type is incomplete?
> >>
> >> I suppose, yes, although after some more testing I think what should
> >> be output is the type of the access.  The target pointer type isn't
> >> meaningful (at least not in this case).
> >>
> >> Here's what the warning looks like in C for the test case in
> >> gcc.dg/pr95580.c:
> >>
> >>     warning: ‘*((void *)(p)+1)’ may be used uninitialized
> >>
> >> and like this in C++:
> >>
> >>     warning: ‘*(p +1)’ may be used uninitialized
> >>
> >> The +1 is a byte offset, which is correct given that incrementing
> >> a void* in GCC is the same as adding 1 to the byte address, but
> >> dereferencing a void* doesn't correspond to what's going on in
> >> the source.
> >>
> >> Even for a complete type (with size greater than 1), printing
> >> the type of the argument plus a byte offset is wrong.  It ends
> >> up with this for the C++ test case from 95768:
> >>
> >>     warning: ‘*((int*)<unknown> +4)’ is used uninitialized
> >>
> >> when the access is actually ‘*((int*)<unknown> +1)’
> >>
> >> So it seems to me for MEM_REF, to make the output meaningful,
> >> it's the type of the access (i.e., the MEM_REF type) that should
> >> be printed here, and the offset should either be in elements of
> >> the accessed type, i.e.,
> >>
> >>     warning: ‘*((int*)<unknown> +1)’ is used uninitialized
> >>
> >> or, if the access is misaligned, the argument should first be
> >> cast to char*, the offset added, and the result then cast to
> >> the access type, like this:
> >>
> >>     warning: ‘*(T*)((char*)<unknown> +1)’ is used uninitialized
> >>
> >> The attached revised and less than fully tested patch implements
> >> this for C++ only for now.  If we agree on this approach I'll see
> >> about making the corresponding change in C.
> >
> > Note that there is no C/C++ way of fully expressing MEM_REF
> > semantics.  __MEM <int> ((T *)p + 1) is not actually
> > *(int *)((char *)p + 1) because that does not reflect that the
> > effective type of the lvalue when TBAA is concerned is 'T'
> > rather than 'int'.
>
> What form would you say is closest to the C/C++ semantics, or
> likely the most useful to users, that GCC could print instead?

Hmm, I'd try *(<pointer derived from `p'>) maybe?  Because there's
no C/C++ that can express what GIMPLE can do here.  Of course
pattern matching the exact cases we can handle like your patch
is an improvement (but as said the TBAA issue is still present).

> > Note for MEM_REF the offset is always
> > a constant byte offset but it indeed does not have to be a
> > multiple of the MEM_REF type size.
> >
> > I wonder whether printing the MEM_REF in full provides
> > any real diagnostic value in the more "obfuscated" cases.
>
> I'm not sure what obfuscated cases you're thinking of, or what
> you mean by printing it in full.

I think that printing ‘*(T*)((char*)<unknown> +1)’ is likely going
to confuse users because they cannot match this to a source
location.  If we have a source location we should have caret
diagnostics.

>  I instrumented the code to
> print every MEM_REF in that comes up in warn_uninitialized_vars
> and rebuilt GCC.  There are 17,456 distinct instances so I didn't
> review them all but those I did look at all look reasonable.
> Probably the least useful are those that mention <unknown> by
> itself (i.e., <unknown> or *<unknown>).  Those with an offset
> are more informative (e.g., *((access**)<unknown> +1).  In
> a few the offset is very large, such as *((unsigned int*)sp
> +4611686018427387900), but that doesn't seem like a problem.
> I'd be happy to share the result.

Here +4611686018427387900 should be printed as -4, MEM_REF
offsets are to be interpreted as signed.

> >
> > I'd also not print <unknown> but <register>.
>
> I also don't find <unknown> helpful, but I don't see <register>
> as an improvement.  I think printing the SSA variable would be
> more informative here since its name is usually related to
> the variable it was derived from in the source.  But making that
> change (or any other like it) feels like too much feature creep
> for this fix.  I'd be happy to do it in a follow up if we agree
> it's a good idea.
>
> Either way, please let me know if the patch is okay as is or,
> if not, what type it should mention.

+               if (tree size = TYPE_SIZE_UNIT (TREE_TYPE (argtype)))
+                 if (!integer_onep (size))
+                   {

this should be if (!TYPE_SIZE_UNIT (...) || !integer_onep (...)).

Otherwise the original patch looks OK to me.  As said for your
second patch printing *(int*)p only if p is offsetted is inconsistent
and misleading for TBAA reasons.  So I do not view it as
general improvement.  If the type of the MEM_REF offset
and the access type agree doing that would be fine but then
MEM_REF<int>(p) and MEM_REF<int>(p+4) should be treated
the same.

That said, can we fix the segfault first?  Also see c-pretty-print.c
for another "copy" of this functionality.  It feels we should dispatch
to c-family/ code here.

Richard.



>
> Martin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] handle MEM_REF with void* arguments (PR c++/95768)
  2020-06-29  7:19         ` Richard Biener
@ 2020-07-09 15:50           ` Martin Sebor
  2021-01-02 22:22             ` [PATCH v3] " Martin Sebor
  0 siblings, 1 reply; 11+ messages in thread
From: Martin Sebor @ 2020-07-09 15:50 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jason Merrill, gcc-patches

On 6/29/20 1:19 AM, Richard Biener wrote:
> On Mon, Jun 29, 2020 at 1:08 AM Martin Sebor <msebor@gmail.com> wrote:
>>
>> On 6/23/20 1:12 AM, Richard Biener wrote:
>>> On Tue, Jun 23, 2020 at 12:22 AM Martin Sebor via Gcc-patches
>>> <gcc-patches@gcc.gnu.org> wrote:
>>>>
>>>> On 6/22/20 12:55 PM, Jason Merrill wrote:
>>>>> On 6/22/20 1:25 PM, Martin Sebor wrote:
>>>>>> The attached fix parallels the one for the equivalent C bug 95580
>>>>>> where the pretty printers don't correctly handle MEM_REF arguments
>>>>>> with type void* or other pointers to an incomplete type.
>>>>>>
>>>>>> The incorrect handling was exposed by the recent change to
>>>>>> -Wuninitialized which includes such expressions in diagnostics.
>>>>>
>>>>>> +        if (tree size = TYPE_SIZE_UNIT (TREE_TYPE (argtype)))
>>>>>> +          if (!integer_onep (size))
>>>>>> +            {
>>>>>> +              pp_cxx_left_paren (pp);
>>>>>> +              dump_type (pp, ptr_type_node, flags);
>>>>>> +              pp_cxx_right_paren (pp);
>>>>>> +            }
>>>>>
>>>>> Don't we want to print the cast if the pointer target type is incomplete?
>>>>
>>>> I suppose, yes, although after some more testing I think what should
>>>> be output is the type of the access.  The target pointer type isn't
>>>> meaningful (at least not in this case).
>>>>
>>>> Here's what the warning looks like in C for the test case in
>>>> gcc.dg/pr95580.c:
>>>>
>>>>      warning: ‘*((void *)(p)+1)’ may be used uninitialized
>>>>
>>>> and like this in C++:
>>>>
>>>>      warning: ‘*(p +1)’ may be used uninitialized
>>>>
>>>> The +1 is a byte offset, which is correct given that incrementing
>>>> a void* in GCC is the same as adding 1 to the byte address, but
>>>> dereferencing a void* doesn't correspond to what's going on in
>>>> the source.
>>>>
>>>> Even for a complete type (with size greater than 1), printing
>>>> the type of the argument plus a byte offset is wrong.  It ends
>>>> up with this for the C++ test case from 95768:
>>>>
>>>>      warning: ‘*((int*)<unknown> +4)’ is used uninitialized
>>>>
>>>> when the access is actually ‘*((int*)<unknown> +1)’
>>>>
>>>> So it seems to me for MEM_REF, to make the output meaningful,
>>>> it's the type of the access (i.e., the MEM_REF type) that should
>>>> be printed here, and the offset should either be in elements of
>>>> the accessed type, i.e.,
>>>>
>>>>      warning: ‘*((int*)<unknown> +1)’ is used uninitialized
>>>>
>>>> or, if the access is misaligned, the argument should first be
>>>> cast to char*, the offset added, and the result then cast to
>>>> the access type, like this:
>>>>
>>>>      warning: ‘*(T*)((char*)<unknown> +1)’ is used uninitialized
>>>>
>>>> The attached revised and less than fully tested patch implements
>>>> this for C++ only for now.  If we agree on this approach I'll see
>>>> about making the corresponding change in C.
>>>
>>> Note that there is no C/C++ way of fully expressing MEM_REF
>>> semantics.  __MEM <int> ((T *)p + 1) is not actually
>>> *(int *)((char *)p + 1) because that does not reflect that the
>>> effective type of the lvalue when TBAA is concerned is 'T'
>>> rather than 'int'.
>>
>> What form would you say is closest to the C/C++ semantics, or
>> likely the most useful to users, that GCC could print instead?
> 
> Hmm, I'd try *(<pointer derived from `p'>) maybe?  Because there's
> no C/C++ that can express what GIMPLE can do here.  Of course
> pattern matching the exact cases we can handle like your patch
> is an improvement (but as said the TBAA issue is still present).

"pointer derived from" would be misleading because of C++ class
derivation.  But more important, I think the output needs to
reflect what the warning actually is based on.  Leaving out
salient details like types or offsets from warnings about
out-of-bounds accesses makes analyzing them more difficult,
both for users and for us during initial triage.

>>> Note for MEM_REF the offset is always
>>> a constant byte offset but it indeed does not have to be a
>>> multiple of the MEM_REF type size.
>>>
>>> I wonder whether printing the MEM_REF in full provides
>>> any real diagnostic value in the more "obfuscated" cases.
>>
>> I'm not sure what obfuscated cases you're thinking of, or what
>> you mean by printing it in full.
> 
> I think that printing ‘*(T*)((char*)<unknown> +1)’ is likely going
> to confuse users because they cannot match this to a source
> location.  If we have a source location we should have caret
> diagnostics.
> 
>>   I instrumented the code to
>> print every MEM_REF in that comes up in warn_uninitialized_vars
>> and rebuilt GCC.  There are 17,456 distinct instances so I didn't
>> review them all but those I did look at all look reasonable.
>> Probably the least useful are those that mention <unknown> by
>> itself (i.e., <unknown> or *<unknown>).  Those with an offset
>> are more informative (e.g., *((access**)<unknown> +1).  In
>> a few the offset is very large, such as *((unsigned int*)sp
>> +4611686018427387900), but that doesn't seem like a problem.
>> I'd be happy to share the result.
> 
> Here +4611686018427387900 should be printed as -4, MEM_REF
> offsets are to be interpreted as signed.

Sure, converting the offset to signed sounds like a good idea.

> 
>>>
>>> I'd also not print <unknown> but <register>.
>>
>> I also don't find <unknown> helpful, but I don't see <register>
>> as an improvement.  I think printing the SSA variable would be
>> more informative here since its name is usually related to
>> the variable it was derived from in the source.  But making that
>> change (or any other like it) feels like too much feature creep
>> for this fix.  I'd be happy to do it in a follow up if we agree
>> it's a good idea.
>>
>> Either way, please let me know if the patch is okay as is or,
>> if not, what type it should mention.
> 
> +               if (tree size = TYPE_SIZE_UNIT (TREE_TYPE (argtype)))
> +                 if (!integer_onep (size))
> +                   {
> 
> this should be if (!TYPE_SIZE_UNIT (...) || !integer_onep (...)).
> 
> Otherwise the original patch looks OK to me.  As said for your
> second patch printing *(int*)p only if p is offsetted is inconsistent
> and misleading for TBAA reasons.  So I do not view it as
> general improvement.  If the type of the MEM_REF offset
> and the access type agree doing that would be fine but then
> MEM_REF<int>(p) and MEM_REF<int>(p+4) should be treated
> the same.

Jason pointed out a problem with the original patch: it doesn't
mention any type so it ends up printing things like

    *(p +1)

or worse:

    *(<unknown> +1)

I agree that a type should be mentioned, but not that it should
be the type of the pointer because in the cases where it doesn't
match the type of the access like the one in the report it's
nonsensical.  It ends up looking like what the C front end prints:

    *((void *)(p)+1)

> That said, can we fix the segfault first?

Sure.  I proposed a v2 patch and I'm happy to improve it further,
either before or after committing it.  I don't find the first patch
suitable anymore.

> Also see c-pretty-print.c
> for another "copy" of this functionality.  It feels we should dispatch
> to c-family/ code here.

I agree.  I can clean it up in a follow up, after we agree on what
the first fix should look like.

FWIW, I don't share (or understand) your concern with TBAA or see
a problem with '*(T*)((char*)<unknown> +1)'  In the cases I've seen
it corresponds to the source.  Even if there are cases where it
doesn't faithfully reflect all aliasing subtleties it can't be worse
than printing '*((void *)(p)+1).'  The middle end warnings are mostly
concerned with out-of-bounds accesses where it's the type of the access
and its offset into the destination that matter most.

Martin


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3] handle MEM_REF with void* arguments (PR c++/95768)
  2020-07-09 15:50           ` Martin Sebor
@ 2021-01-02 22:22             ` Martin Sebor
  2021-01-05 23:17               ` Jeff Law
  2021-01-07  8:26               ` Jakub Jelinek
  0 siblings, 2 replies; 11+ messages in thread
From: Martin Sebor @ 2021-01-02 22:22 UTC (permalink / raw)
  To: gcc-patches, Jason Merrill, Richard Biener

[-- Attachment #1: Type: text/plain, Size: 1428 bytes --]

Attached is another revision of a patch I posted last July to keep
the pretty-printer from crashing on MEM_REFs with void* arguments:
   https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549746.html

Besides avoiding the ICE and enhancing the MEM_REF detail and
improving its format, this revision implements the suggestions
in that discussion.  To avoid code duplication it moves
the handling to the C pretty-printer and changes the C++ front
end to delegate to it.  In addition, it includes a cast to
the accessed type if it's different from/incompatible with
(according to GIMPLE) that of the dereferenced pointer, or if
the object is typeless.  Lastly, it replaces the <unknown> in
the output with either VLA names or the RHS of the GIMPLE
expression (this improves the output when for dynamically
allocated objects).

As an aside, In my experience, MEM_REFs in warnings are limited
to -Wuninitialized.  I think other middle end warnings tend to
avoid them.  Those that involve invalid/out-of-bounds accesses
replace them with either the target DECL (e.g., local variable,
or FIELD_DECL), the allocation call (e.g., malloc), or the DECL
of the pointer (e.g., PARM_DECL), followed by a note mentioning
the offset into the object.  I'd like to change -Wuninitialized
at some point to follow the same style.  So I see the value of
the MEM_REF formatting enhancement mainly as a transient solution
until that happens.

Martin

[-- Attachment #2: gcc-95768.diff --]
[-- Type: text/x-patch, Size: 15261 bytes --]

PR c++/95768 - pretty-printer ICE on -Wuninitialized with allocated storage

gcc/c-family/ChangeLog:

	PR c++/95768
	* c-pretty-print.c (c_pretty_printer::primary_expression): For
	SSA_NAMEs print VLA names and GIMPLE defining statements.
	(print_mem_ref): New function.
	(c_pretty_printer::unary_expression): Call it.

gcc/cp/ChangeLog:

	PR c++/95768
	* error.c (dump_expr): Call c_pretty_printer::unary_expression.

gcc/testsuite/ChangeLog:

	PR c++/95768
	* g++.dg/pr95768.C: New test.
	* g++.dg/warn/Wuninitialized-12.C: New test.
	* gcc.dg/uninit-38.c: New test.

diff --git a/gcc/c-family/c-pretty-print.c b/gcc/c-family/c-pretty-print.c
index 3027703056b..3a3f2f7bdcc 100644
--- a/gcc/c-family/c-pretty-print.c
+++ b/gcc/c-family/c-pretty-print.c
@@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "c-pretty-print.h"
+#include "gimple-pretty-print.h"
 #include "diagnostic.h"
 #include "stor-layout.h"
 #include "stringpool.h"
@@ -1334,6 +1335,34 @@ c_pretty_printer::primary_expression (tree e)
       pp_c_right_paren (this);
       break;
 
+    case SSA_NAME:
+      if (SSA_NAME_VAR (e))
+	{
+	  tree var = SSA_NAME_VAR (e);
+	  const char *name = IDENTIFIER_POINTER (SSA_NAME_IDENTIFIER (e));
+	  const char *dot;
+	  if (DECL_ARTIFICIAL (var) && (dot = strchr (name, '.')))
+	    {
+	      /* Print the name without the . suffix (such as in VLAs).
+		 Use pp_c_identifier so that it can be converted into
+		 the appropriate encoding.  */
+	      size_t size = dot - name;
+	      char *ident = XALLOCAVEC (char, size + 1);
+	      memcpy (ident, name, size);
+	      ident[size] = '\0';
+	      pp_c_identifier (this, ident);
+	    }
+	  else
+	    primary_expression (var);
+	}
+      else
+	{
+	  /* Print only the right side of the GIMPLE assignment.  */
+	  gimple *def_stmt = SSA_NAME_DEF_STMT (e);
+	  pp_gimple_stmt_1 (this, def_stmt, 0, TDF_RHS_ONLY);
+	}
+      break;
+
     default:
       /* FIXME:  Make sure we won't get into an infinite loop.  */
       if (location_wrapper_p (e))
@@ -1780,6 +1809,139 @@ pp_c_call_argument_list (c_pretty_printer *pp, tree t)
   pp_c_right_paren (pp);
 }
 
+/* Print the MEM_REF expression REF, including its type and offset.
+   Apply casts as necessary if the type of the access is different
+   from the type of the accessed object.  Produce compact output
+   designed to include both the element index as well as any
+   misalignment by preferring
+     ((int*)((char*)p + 1))[2]
+   over
+     *(int*)((char*)p + 9)
+   The former is more verbose but makes it clearer that the access
+   to the third element of the array is misaligned by one byte.  */
+
+static void
+print_mem_ref (c_pretty_printer *pp, tree e)
+{
+  tree arg = TREE_OPERAND (e, 0);
+
+  /* The byte offset.  Initially equal to the MEM_REF offset, then
+     adjusted to the remainder of the division by the byte size of
+     the access.  */
+  offset_int byte_off = wi::to_offset (TREE_OPERAND (e, 1));
+  /* The result of dividing BYTE_OFF by the size of the access.  */
+  offset_int elt_idx = 0;
+  /* True to include a cast to char* (for a nonzero final BYTE_OFF).  */
+  bool char_cast = false;
+  const bool addr = TREE_CODE (arg) == ADDR_EXPR;
+  if (addr)
+    {
+      arg = TREE_OPERAND (arg, 0);
+      if (byte_off == 0)
+	{
+	  pp->expression (arg);
+	  return;
+	}
+    }
+
+  const tree access_type = TREE_TYPE (e);
+  tree arg_type = TREE_TYPE (TREE_TYPE (arg));
+  if (TREE_CODE (arg_type) == ARRAY_TYPE)
+    arg_type = TREE_TYPE (arg_type);
+
+  if (tree access_size = TYPE_SIZE_UNIT (access_type))
+    {
+      /* For naturally aligned accesses print the nonzero offset
+	 in units of the accessed type, in the form of an index.
+	 For unaligned accesses also print the residual byte offset.  */
+      offset_int asize = wi::to_offset (access_size);
+      offset_int szlg2 = wi::floor_log2 (asize);
+
+      elt_idx = byte_off >> szlg2;
+      byte_off = byte_off - (elt_idx << szlg2);
+    }
+
+  /* True to include a cast to the accessed type.  */
+  const bool access_cast = VOID_TYPE_P (arg_type)
+    || !gimple_canonical_types_compatible_p (access_type, arg_type);
+
+  if (byte_off != 0)
+    {
+      /* When printing the byte offset for a pointer to a type of
+	 a different size than char, include a cast to char* first,
+	 before printing the cast to a pointer to the accessed type.  */
+      tree arg_type = TREE_TYPE (TREE_TYPE (arg));
+      if (TREE_CODE (arg_type) == ARRAY_TYPE)
+	arg_type = TREE_TYPE (arg_type);
+      offset_int arg_size = 0;
+      if (tree size = TYPE_SIZE (arg_type))
+	arg_size = wi::to_offset (size);
+      if (arg_size != BITS_PER_UNIT)
+	char_cast = true;
+    }
+
+  if (elt_idx == 0)
+    {
+      if (!addr)
+	pp_c_star (pp);
+    }
+  else if (access_cast || char_cast)
+    pp_c_left_paren (pp);
+
+  if (access_cast)
+    {
+      /* Include a cast to the accessed type if it isn't compatible
+	 with the type of the referenced object (or if the object
+	 is typeless).  */
+      pp_c_left_paren (pp);
+      pp->type_id (access_type);
+      pp_c_star (pp);
+      pp_c_right_paren (pp);
+    }
+
+  if (byte_off != 0)
+    pp_c_left_paren (pp);
+
+  if (char_cast)
+    {
+      /* Include a cast to char*.  */
+      pp_c_left_paren (pp);
+      pp->type_id (char_type_node);
+      pp_c_star (pp);
+      pp_c_right_paren (pp);
+    }
+
+  pp->unary_expression (arg);
+
+  if (byte_off != 0)
+    {
+      pp_space (pp);
+      pp_plus (pp);
+      pp_space (pp);
+      tree off = wide_int_to_tree (ssizetype, byte_off);
+      pp->constant (off);
+      pp_c_right_paren (pp);
+    }
+  if (elt_idx != 0)
+    {
+      if (byte_off == 0 && char_cast)
+	pp_c_right_paren (pp);
+      pp_c_right_paren (pp);
+      if (addr)
+	{
+	  pp_space (pp);
+	  pp_plus (pp);
+	  pp_space (pp);
+	}
+      else
+	pp_c_left_bracket (pp);
+      tree idx = wide_int_to_tree (ssizetype, elt_idx);
+      pp->constant (idx);
+      if (!addr)
+	pp_c_right_bracket (pp);
+    }
+}
+
 /* unary-expression:
       postfix-expression
       ++ cast-expression
@@ -1837,30 +1999,7 @@ c_pretty_printer::unary_expression (tree e)
       break;
 
     case MEM_REF:
-      if (TREE_CODE (TREE_OPERAND (e, 0)) == ADDR_EXPR
-	  && integer_zerop (TREE_OPERAND (e, 1)))
-	expression (TREE_OPERAND (TREE_OPERAND (e, 0), 0));
-      else
-	{
-	  pp_c_star (this);
-	  if (!integer_zerop (TREE_OPERAND (e, 1)))
-	    {
-	      pp_c_left_paren (this);
-	      tree type = TREE_TYPE (TREE_TYPE (TREE_OPERAND (e, 0)));
-	      if (TYPE_SIZE_UNIT (type) == NULL_TREE
-		  || !integer_onep (TYPE_SIZE_UNIT (type)))
-		pp_c_type_cast (this, ptr_type_node);
-	    }
-	  pp_c_cast_expression (this, TREE_OPERAND (e, 0));
-	  if (!integer_zerop (TREE_OPERAND (e, 1)))
-	    {
-	      pp_plus (this);
-	      pp_c_integer_constant (this,
-				     fold_convert (ssizetype,
-						   TREE_OPERAND (e, 1)));
-	      pp_c_right_paren (this);
-	    }
-	}
+      print_mem_ref (this, e);
       break;
 
     case TARGET_MEM_REF:
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 4572f6e4ae2..4ab27e0768e 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -2417,32 +2417,8 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
       break;
 
     case MEM_REF:
-      if (TREE_CODE (TREE_OPERAND (t, 0)) == ADDR_EXPR
-	  && integer_zerop (TREE_OPERAND (t, 1)))
-	dump_expr (pp, TREE_OPERAND (TREE_OPERAND (t, 0), 0), flags);
-      else
-	{
-	  pp_cxx_star (pp);
-	  if (!integer_zerop (TREE_OPERAND (t, 1)))
-	    {
-	      pp_cxx_left_paren (pp);
-	      if (!integer_onep (TYPE_SIZE_UNIT
-				 (TREE_TYPE (TREE_TYPE (TREE_OPERAND (t, 0))))))
-		{
-		  pp_cxx_left_paren (pp);
-		  dump_type (pp, ptr_type_node, flags);
-		  pp_cxx_right_paren (pp);
-		}
-	    }
-	  dump_expr (pp, TREE_OPERAND (t, 0), flags);
-	  if (!integer_zerop (TREE_OPERAND (t, 1)))
-	    {
-	      pp_cxx_ws_string (pp, "+");
-	      dump_expr (pp, fold_convert (ssizetype, TREE_OPERAND (t, 1)),
-                         flags);
-	      pp_cxx_right_paren (pp);
-	    }
-	}
+      /* Delegate to the base "C" pretty printer.  */
+      pp->c_pretty_printer::unary_expression (t);
       break;
 
     case TARGET_MEM_REF:
diff --git a/gcc/testsuite/g++.dg/pr95768.C b/gcc/testsuite/g++.dg/pr95768.C
new file mode 100644
index 00000000000..5e2c8c44ad0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr95768.C
@@ -0,0 +1,32 @@
+/* PR c++/95768 - pretty-printer ICE on -Wuninitialized with allocated storage
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+extern "C" void *malloc (__SIZE_TYPE__);
+
+struct f
+{
+  int i;
+  static int e (int);
+  void operator= (int) { e (i); }
+};
+
+struct m {
+  int i;
+  f length;
+};
+
+struct n {
+  m *o() { return (m *)this; }
+};
+
+struct p {
+  n *header;
+  p () {
+    header = (n *)malloc (0);
+    m b = *header->o();       // { dg-warning "\\\[-Wuninitialized" }
+    b.length = 0;
+  }
+};
+
+void detach2() { p(); }
diff --git a/gcc/testsuite/g++.dg/warn/Wuninitialized-12.C b/gcc/testsuite/g++.dg/warn/Wuninitialized-12.C
new file mode 100644
index 00000000000..d06aaac71b8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wuninitialized-12.C
@@ -0,0 +1,40 @@
+/* Verify that -Wuninitialized warnings about accesses to objects via
+   pointers and offsets mention valid expressions.
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+typedef __INT16_TYPE__ int16_t;
+typedef __INT32_TYPE__ int32_t;
+
+void sink (int);
+
+/* Verify properly aligned accesses at offsets that are multiples of
+   the access size.  */
+
+void test_aligned (void)
+{
+  char *p1 = (char*)__builtin_malloc (32);
+  p1 += sizeof (int32_t);
+
+  int16_t *p2 = (int16_t*)p1;
+  sink (p2[1]);               // { dg-warning "'\\(\\(int16_t\\*\\)p1\\)\\\[3]' is used uninitialized" }
+
+  int32_t *p4 = (int32_t*)p1;
+  sink (p4[1]);               // { dg-warning "'\\(\\(int32_t\\*\\)p1\\)\\\[2]' is used uninitialized" }
+}
+
+
+/* Verify misaligned accesses at offsets that aren't multiples of
+   the access size.  */
+
+void test_misaligned (void)
+{
+  char *p1 = (char*)__builtin_malloc (32);
+  p1 += 1;
+
+  int16_t *p2 = (int16_t*)p1;
+  sink (p2[1]);               // { dg-warning "'\\(\\(int16_t\\*\\)\\(p1 \\+ 1\\)\\)\\\[1]' is used uninitialized" }
+
+  int32_t *p4 = (int32_t*)p1;
+  sink (p4[1]);               // { dg-warning "'\\(\\(int32_t\\*\\)\\(p1 \\+ 1\\)\\)\\\[1]' is used uninitialized" }
+}
diff --git a/gcc/testsuite/gcc.dg/uninit-38.c b/gcc/testsuite/gcc.dg/uninit-38.c
new file mode 100644
index 00000000000..ebf11174af0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/uninit-38.c
@@ -0,0 +1,87 @@
+/* Verify that dereferencing uninitialized allocated objects and VLAs
+   correctly reflects offsets into the objects.
+   The test's main purpose is to exercise the formatting of MEM_REFs.
+   If -Wuninitialized gets smarter and detects uninitialized accesses
+   before they're turned into MEM_REFs the test will likely need to
+   be adjusted.  Ditto if -Wuninitialized output changes for some
+   other reason.
+   { dg-do compile { target { { lp64 || ilp32 } || llp64 } } }
+   { dg-options "-O2 -Wall -ftrack-macro-expansion=0" } */
+
+#define CONCAT(x, y)   x ## y
+#define CAT(x, y)      CONCAT(x, y)
+#define UNIQ(name)     CAT (name, __LINE__)
+
+typedef __SIZE_TYPE__ size_t;
+
+extern void* malloc (size_t);
+
+void sink (void*, ...);
+
+#undef T
+#define T(Type, idx, off)			\
+  __attribute__ ((noipa))			\
+  void UNIQ (test_)(int n)			\
+  {						\
+    void *p = malloc (n);			\
+    Type *q = (Type*)((char*)p + off);		\
+    sink (p, q[idx]);				\
+  }						\
+  typedef void dummy_type
+
+T (int, 0, 0);      // { dg-warning "'\\*\\(int\\*\\)p' is used uninitialized" }
+T (int, 0, 1);      // { dg-warning "'\\*\\(int\\*\\)\\(\\(char\\*\\)p \\+ 1\\)'" }
+T (int, 0, 2);      // { dg-warning "'\\*\\(int\\*\\)\\(\\(char\\*\\)p \\+ 2\\)'" }
+T (int, 0, 3);      // { dg-warning "'\\*\\(int\\*\\)\\(\\(char\\*\\)p \\+ 3\\)'" }
+T (int, 0, 4);      // { dg-warning "'\\(\\(int\\*\\)p\\)\\\[1]'" }
+T (int, 0, 5);      // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 1\\)\\)\\\[1]'" }
+T (int, 0, 6);      // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 2\\)\\)\\\[1]'" }
+T (int, 0, 7);      // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 3\\)\\)\\\[1]'" }
+T (int, 0, 8);      // { dg-warning "'\\(\\(int\\*\\)p\\)\\\[2]'" }
+T (int, 0, 9);      // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 1\\)\\)\\\[2]'" }
+
+
+T (int, 1, 0);      // { dg-warning "'\\(\\(int\\*\\)p\\)\\\[1]' is used uninitialized" }
+T (int, 1, 1);      // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 1\\)\\)\\\[1]'" }
+T (int, 1, 2);      // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 2\\)\\)\\\[1]'" }
+T (int, 1, 3);      // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 3\\)\\)\\\[1]'" }
+T (int, 1, 4);      // { dg-warning "'\\(\\(int\\*\\)p\\)\\\[2]'" }
+T (int, 1, 5);      // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 1\\)\\)\\\[2]'" }
+T (int, 1, 6);      // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 2\\)\\)\\\[2]'" }
+T (int, 1, 7);      // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 3\\)\\)\\\[2]'" }
+T (int, 1, 8);      // { dg-warning "'\\(\\(int\\*\\)p\\)\\\[3]'" }
+T (int, 1, 9);      // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 1\\)\\)\\\[3]'" }
+
+#undef T
+#define T(Type, idx, off)			\
+  __attribute__ ((noipa))			\
+  void UNIQ (test_)(int n)			\
+  {						\
+    char a[n], *p = a;				\
+    Type *q = (Type*)((char*)p + off);		\
+    sink (p, q[idx]);				\
+  }						\
+  typedef void dummy_type
+
+T (int, 0, 0);      // { dg-warning "'\\*\\(int\\*\\)a' is used uninitialized" }
+T (int, 0, 1);      // { dg-warning "'\\*\\(int\\*\\)\\(a \\+ 1\\)'" }
+T (int, 0, 2);      // { dg-warning "'\\*\\(int\\*\\)\\(a \\+ 2\\)'" }
+T (int, 0, 3);      // { dg-warning "'\\*\\(int\\*\\)\\(a \\+ 3\\)'" }
+T (int, 0, 4);      // { dg-warning "'\\(\\(int\\*\\)a\\)\\\[1]'" }
+T (int, 0, 5);      // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 1\\)\\)\\\[1]'" }
+T (int, 0, 6);      // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 2\\)\\)\\\[1]'" }
+T (int, 0, 7);      // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 3\\)\\)\\\[1]'" }
+T (int, 0, 8);      // { dg-warning "'\\(\\(int\\*\\)a\\)\\\[2]'" }
+T (int, 0, 9);      // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 1\\)\\)\\\[2]'" }
+
+
+T (int, 1, 0);      // { dg-warning "'\\(\\(int\\*\\)a\\)\\\[1]' is used uninitialized" }
+T (int, 1, 1);      // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 1\\)\\)\\\[1]'" }
+T (int, 1, 2);      // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 2\\)\\)\\\[1]'" }
+T (int, 1, 3);      // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 3\\)\\)\\\[1]'" }
+T (int, 1, 4);      // { dg-warning "'\\(\\(int\\*\\)a\\)\\\[2]'" }
+T (int, 1, 5);      // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 1\\)\\)\\\[2]'" }
+T (int, 1, 6);      // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 2\\)\\)\\\[2]'" }
+T (int, 1, 7);      // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 3\\)\\)\\\[2]'" }
+T (int, 1, 8);      // { dg-warning "'\\(\\(int\\*\\)a\\)\\\[3]'" }
+T (int, 1, 9);      // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 1\\)\\)\\\[3]'" }

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] handle MEM_REF with void* arguments (PR c++/95768)
  2021-01-02 22:22             ` [PATCH v3] " Martin Sebor
@ 2021-01-05 23:17               ` Jeff Law
  2021-01-07  8:26               ` Jakub Jelinek
  1 sibling, 0 replies; 11+ messages in thread
From: Jeff Law @ 2021-01-05 23:17 UTC (permalink / raw)
  To: Martin Sebor, gcc-patches, Jason Merrill, Richard Biener



On 1/2/21 3:22 PM, Martin Sebor via Gcc-patches wrote:
> Attached is another revision of a patch I posted last July to keep
> the pretty-printer from crashing on MEM_REFs with void* arguments:
>   https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549746.html
>
> Besides avoiding the ICE and enhancing the MEM_REF detail and
> improving its format, this revision implements the suggestions
> in that discussion.  To avoid code duplication it moves
> the handling to the C pretty-printer and changes the C++ front
> end to delegate to it.  In addition, it includes a cast to
> the accessed type if it's different from/incompatible with
> (according to GIMPLE) that of the dereferenced pointer, or if
> the object is typeless.  Lastly, it replaces the <unknown> in
> the output with either VLA names or the RHS of the GIMPLE
> expression (this improves the output when for dynamically
> allocated objects).
>
> As an aside, In my experience, MEM_REFs in warnings are limited
> to -Wuninitialized.  I think other middle end warnings tend to
> avoid them.  Those that involve invalid/out-of-bounds accesses
> replace them with either the target DECL (e.g., local variable,
> or FIELD_DECL), the allocation call (e.g., malloc), or the DECL
> of the pointer (e.g., PARM_DECL), followed by a note mentioning
> the offset into the object.  I'd like to change -Wuninitialized
> at some point to follow the same style.  So I see the value of
> the MEM_REF formatting enhancement mainly as a transient solution
> until that happens.
>
> Martin
>
> gcc-95768.diff
>
> PR c++/95768 - pretty-printer ICE on -Wuninitialized with allocated storage
>
> gcc/c-family/ChangeLog:
>
> 	PR c++/95768
> 	* c-pretty-print.c (c_pretty_printer::primary_expression): For
> 	SSA_NAMEs print VLA names and GIMPLE defining statements.
> 	(print_mem_ref): New function.
> 	(c_pretty_printer::unary_expression): Call it.
>
> gcc/cp/ChangeLog:
>
> 	PR c++/95768
> 	* error.c (dump_expr): Call c_pretty_printer::unary_expression.
>
> gcc/testsuite/ChangeLog:
>
> 	PR c++/95768
> 	* g++.dg/pr95768.C: New test.
> 	* g++.dg/warn/Wuninitialized-12.C: New test.
> 	* gcc.dg/uninit-38.c: New test.
OK
jeff


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] handle MEM_REF with void* arguments (PR c++/95768)
  2021-01-02 22:22             ` [PATCH v3] " Martin Sebor
  2021-01-05 23:17               ` Jeff Law
@ 2021-01-07  8:26               ` Jakub Jelinek
  2021-01-07 21:36                 ` Martin Sebor
  1 sibling, 1 reply; 11+ messages in thread
From: Jakub Jelinek @ 2021-01-07  8:26 UTC (permalink / raw)
  To: Martin Sebor; +Cc: gcc-patches, Jason Merrill, Richard Biener

On Sat, Jan 02, 2021 at 03:22:25PM -0700, Martin Sebor via Gcc-patches wrote:
> PR c++/95768 - pretty-printer ICE on -Wuninitialized with allocated storage
> 
> gcc/c-family/ChangeLog:
> 
> 	PR c++/95768
> 	* c-pretty-print.c (c_pretty_printer::primary_expression): For
> 	SSA_NAMEs print VLA names and GIMPLE defining statements.
> 	(print_mem_ref): New function.
> 	(c_pretty_printer::unary_expression): Call it.

This broke:
+FAIL: gcc.dg/plugin/gil-1.c -fplugin=./analyzer_gil_plugin.so  (test for warnings, line 16)
+FAIL: gcc.dg/plugin/gil-1.c -fplugin=./analyzer_gil_plugin.so  (test for warnings, line 63)
+FAIL: gcc.dg/plugin/gil-1.c -fplugin=./analyzer_gil_plugin.so (test for excess errors)
and
+FAIL: g++.dg/cpp0x/constexpr-trivial2.C  -std=c++11  (test for errors, line 13)
+FAIL: g++.dg/cpp0x/constexpr-trivial2.C  -std=c++11 (test for excess errors)
The former one is just a different printing of the MEM_REF from what the
test expects, but the latter is an ICE (one needs
make check-c++-all RUNTESTFLAGS=dg.exp=constexpr-trivial2.C
to reproduce or GXX_TESTSUITE_STDS=11 or similar as C++11 is not tested by
default).

	Jakub


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] handle MEM_REF with void* arguments (PR c++/95768)
  2021-01-07  8:26               ` Jakub Jelinek
@ 2021-01-07 21:36                 ` Martin Sebor
  0 siblings, 0 replies; 11+ messages in thread
From: Martin Sebor @ 2021-01-07 21:36 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, Jason Merrill, Richard Biener

On 1/7/21 1:26 AM, Jakub Jelinek wrote:
> On Sat, Jan 02, 2021 at 03:22:25PM -0700, Martin Sebor via Gcc-patches wrote:
>> PR c++/95768 - pretty-printer ICE on -Wuninitialized with allocated storage
>>
>> gcc/c-family/ChangeLog:
>>
>> 	PR c++/95768
>> 	* c-pretty-print.c (c_pretty_printer::primary_expression): For
>> 	SSA_NAMEs print VLA names and GIMPLE defining statements.
>> 	(print_mem_ref): New function.
>> 	(c_pretty_printer::unary_expression): Call it.
> 
> This broke:
> +FAIL: gcc.dg/plugin/gil-1.c -fplugin=./analyzer_gil_plugin.so  (test for warnings, line 16)
> +FAIL: gcc.dg/plugin/gil-1.c -fplugin=./analyzer_gil_plugin.so  (test for warnings, line 63)
> +FAIL: gcc.dg/plugin/gil-1.c -fplugin=./analyzer_gil_plugin.so (test for excess errors)
> and
> +FAIL: g++.dg/cpp0x/constexpr-trivial2.C  -std=c++11  (test for errors, line 13)
> +FAIL: g++.dg/cpp0x/constexpr-trivial2.C  -std=c++11 (test for excess errors)
> The former one is just a different printing of the MEM_REF from what the
> test expects, but the latter is an ICE (one needs
> make check-c++-all RUNTESTFLAGS=dg.exp=constexpr-trivial2.C
> to reproduce or GXX_TESTSUITE_STDS=11 or similar as C++11 is not tested by
> default).

I saw the gcc.dg/plugin/gil-1.c failure but missed the subtle difference
in the output.  Rerunning the test didn't show any failures so I assumed
it was something transient.  But when I tried gain today and looked more
carefully at the output I saw the test doesn't run at all by itself.
This doesn't work:

  $ nice make -C /ssd/test/build/gcc-95768/gcc check-c 
RUNTESTFLAGS="plugin.exp=gil-1.c"

Is there some special magic to get it to run by itself or do these
tests just not do that?

Running all the plugin tests does work but also shows the failures
below that don't show up in the final summary (or on gcc-testresults).
I assume those are unrelated (I think I've seen this test fail in
a similarly mysterious way before).

FAIL: gcc.dg/plugin/diagnostic-test-expressions-1.c 
-fplugin=./diagnostic_plugin_test_tree_expression_range.so  1 blank 
line(s) in output
FAIL: gcc.dg/plugin/diagnostic-test-expressions-1.c 
-fplugin=./diagnostic_plugin_test_tree_expression_range.so  expected 
multiline pattern lines 550-551 not found: " 
__builtin_types_compatible_p \(long, int\) \+ f \(i\)\);.*\n 
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\^~~~~~~\n"
FAIL: gcc.dg/plugin/diagnostic-test-expressions-1.c 
-fplugin=./diagnostic_plugin_test_tree_expression_range.so (test for 
excess errors)

Martin

PS I committed a fix for both the ICE and the gil-1.c failure
in r11-6532.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-01-07 21:36 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-22 17:25 [PATCH] handle MEM_REF with void* arguments (PR c++/95768) Martin Sebor
2020-06-22 18:55 ` Jason Merrill
2020-06-22 22:22   ` Martin Sebor
2020-06-23  7:12     ` Richard Biener
2020-06-28 23:07       ` Martin Sebor
2020-06-29  7:19         ` Richard Biener
2020-07-09 15:50           ` Martin Sebor
2021-01-02 22:22             ` [PATCH v3] " Martin Sebor
2021-01-05 23:17               ` Jeff Law
2021-01-07  8:26               ` Jakub Jelinek
2021-01-07 21:36                 ` Martin Sebor

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).