From: Martin Sebor <msebor@gmail.com>
To: gcc-patches <gcc-patches@gcc.gnu.org>,
Jason Merrill <jason@redhat.com>,
Richard Biener <richard.guenther@gmail.com>
Subject: [PATCH v3] handle MEM_REF with void* arguments (PR c++/95768)
Date: Sat, 2 Jan 2021 15:22:25 -0700 [thread overview]
Message-ID: <887c792e-ba8e-33ee-a607-dc4fbd8a1b8e@gmail.com> (raw)
In-Reply-To: <658215da-753d-8df6-4467-b43db150e5bd@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 1428 bytes --]
Attached is another revision of a patch I posted last July to keep
the pretty-printer from crashing on MEM_REFs with void* arguments:
https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549746.html
Besides avoiding the ICE and enhancing the MEM_REF detail and
improving its format, this revision implements the suggestions
in that discussion. To avoid code duplication it moves
the handling to the C pretty-printer and changes the C++ front
end to delegate to it. In addition, it includes a cast to
the accessed type if it's different from/incompatible with
(according to GIMPLE) that of the dereferenced pointer, or if
the object is typeless. Lastly, it replaces the <unknown> in
the output with either VLA names or the RHS of the GIMPLE
expression (this improves the output when for dynamically
allocated objects).
As an aside, In my experience, MEM_REFs in warnings are limited
to -Wuninitialized. I think other middle end warnings tend to
avoid them. Those that involve invalid/out-of-bounds accesses
replace them with either the target DECL (e.g., local variable,
or FIELD_DECL), the allocation call (e.g., malloc), or the DECL
of the pointer (e.g., PARM_DECL), followed by a note mentioning
the offset into the object. I'd like to change -Wuninitialized
at some point to follow the same style. So I see the value of
the MEM_REF formatting enhancement mainly as a transient solution
until that happens.
Martin
[-- Attachment #2: gcc-95768.diff --]
[-- Type: text/x-patch, Size: 15261 bytes --]
PR c++/95768 - pretty-printer ICE on -Wuninitialized with allocated storage
gcc/c-family/ChangeLog:
PR c++/95768
* c-pretty-print.c (c_pretty_printer::primary_expression): For
SSA_NAMEs print VLA names and GIMPLE defining statements.
(print_mem_ref): New function.
(c_pretty_printer::unary_expression): Call it.
gcc/cp/ChangeLog:
PR c++/95768
* error.c (dump_expr): Call c_pretty_printer::unary_expression.
gcc/testsuite/ChangeLog:
PR c++/95768
* g++.dg/pr95768.C: New test.
* g++.dg/warn/Wuninitialized-12.C: New test.
* gcc.dg/uninit-38.c: New test.
diff --git a/gcc/c-family/c-pretty-print.c b/gcc/c-family/c-pretty-print.c
index 3027703056b..3a3f2f7bdcc 100644
--- a/gcc/c-family/c-pretty-print.c
+++ b/gcc/c-family/c-pretty-print.c
@@ -22,6 +22,7 @@ along with GCC; see the file COPYING3. If not see
#include "system.h"
#include "coretypes.h"
#include "c-pretty-print.h"
+#include "gimple-pretty-print.h"
#include "diagnostic.h"
#include "stor-layout.h"
#include "stringpool.h"
@@ -1334,6 +1335,34 @@ c_pretty_printer::primary_expression (tree e)
pp_c_right_paren (this);
break;
+ case SSA_NAME:
+ if (SSA_NAME_VAR (e))
+ {
+ tree var = SSA_NAME_VAR (e);
+ const char *name = IDENTIFIER_POINTER (SSA_NAME_IDENTIFIER (e));
+ const char *dot;
+ if (DECL_ARTIFICIAL (var) && (dot = strchr (name, '.')))
+ {
+ /* Print the name without the . suffix (such as in VLAs).
+ Use pp_c_identifier so that it can be converted into
+ the appropriate encoding. */
+ size_t size = dot - name;
+ char *ident = XALLOCAVEC (char, size + 1);
+ memcpy (ident, name, size);
+ ident[size] = '\0';
+ pp_c_identifier (this, ident);
+ }
+ else
+ primary_expression (var);
+ }
+ else
+ {
+ /* Print only the right side of the GIMPLE assignment. */
+ gimple *def_stmt = SSA_NAME_DEF_STMT (e);
+ pp_gimple_stmt_1 (this, def_stmt, 0, TDF_RHS_ONLY);
+ }
+ break;
+
default:
/* FIXME: Make sure we won't get into an infinite loop. */
if (location_wrapper_p (e))
@@ -1780,6 +1809,139 @@ pp_c_call_argument_list (c_pretty_printer *pp, tree t)
pp_c_right_paren (pp);
}
+/* Print the MEM_REF expression REF, including its type and offset.
+ Apply casts as necessary if the type of the access is different
+ from the type of the accessed object. Produce compact output
+ designed to include both the element index as well as any
+ misalignment by preferring
+ ((int*)((char*)p + 1))[2]
+ over
+ *(int*)((char*)p + 9)
+ The former is more verbose but makes it clearer that the access
+ to the third element of the array is misaligned by one byte. */
+
+static void
+print_mem_ref (c_pretty_printer *pp, tree e)
+{
+ tree arg = TREE_OPERAND (e, 0);
+
+ /* The byte offset. Initially equal to the MEM_REF offset, then
+ adjusted to the remainder of the division by the byte size of
+ the access. */
+ offset_int byte_off = wi::to_offset (TREE_OPERAND (e, 1));
+ /* The result of dividing BYTE_OFF by the size of the access. */
+ offset_int elt_idx = 0;
+ /* True to include a cast to char* (for a nonzero final BYTE_OFF). */
+ bool char_cast = false;
+ const bool addr = TREE_CODE (arg) == ADDR_EXPR;
+ if (addr)
+ {
+ arg = TREE_OPERAND (arg, 0);
+ if (byte_off == 0)
+ {
+ pp->expression (arg);
+ return;
+ }
+ }
+
+ const tree access_type = TREE_TYPE (e);
+ tree arg_type = TREE_TYPE (TREE_TYPE (arg));
+ if (TREE_CODE (arg_type) == ARRAY_TYPE)
+ arg_type = TREE_TYPE (arg_type);
+
+ if (tree access_size = TYPE_SIZE_UNIT (access_type))
+ {
+ /* For naturally aligned accesses print the nonzero offset
+ in units of the accessed type, in the form of an index.
+ For unaligned accesses also print the residual byte offset. */
+ offset_int asize = wi::to_offset (access_size);
+ offset_int szlg2 = wi::floor_log2 (asize);
+
+ elt_idx = byte_off >> szlg2;
+ byte_off = byte_off - (elt_idx << szlg2);
+ }
+
+ /* True to include a cast to the accessed type. */
+ const bool access_cast = VOID_TYPE_P (arg_type)
+ || !gimple_canonical_types_compatible_p (access_type, arg_type);
+
+ if (byte_off != 0)
+ {
+ /* When printing the byte offset for a pointer to a type of
+ a different size than char, include a cast to char* first,
+ before printing the cast to a pointer to the accessed type. */
+ tree arg_type = TREE_TYPE (TREE_TYPE (arg));
+ if (TREE_CODE (arg_type) == ARRAY_TYPE)
+ arg_type = TREE_TYPE (arg_type);
+ offset_int arg_size = 0;
+ if (tree size = TYPE_SIZE (arg_type))
+ arg_size = wi::to_offset (size);
+ if (arg_size != BITS_PER_UNIT)
+ char_cast = true;
+ }
+
+ if (elt_idx == 0)
+ {
+ if (!addr)
+ pp_c_star (pp);
+ }
+ else if (access_cast || char_cast)
+ pp_c_left_paren (pp);
+
+ if (access_cast)
+ {
+ /* Include a cast to the accessed type if it isn't compatible
+ with the type of the referenced object (or if the object
+ is typeless). */
+ pp_c_left_paren (pp);
+ pp->type_id (access_type);
+ pp_c_star (pp);
+ pp_c_right_paren (pp);
+ }
+
+ if (byte_off != 0)
+ pp_c_left_paren (pp);
+
+ if (char_cast)
+ {
+ /* Include a cast to char*. */
+ pp_c_left_paren (pp);
+ pp->type_id (char_type_node);
+ pp_c_star (pp);
+ pp_c_right_paren (pp);
+ }
+
+ pp->unary_expression (arg);
+
+ if (byte_off != 0)
+ {
+ pp_space (pp);
+ pp_plus (pp);
+ pp_space (pp);
+ tree off = wide_int_to_tree (ssizetype, byte_off);
+ pp->constant (off);
+ pp_c_right_paren (pp);
+ }
+ if (elt_idx != 0)
+ {
+ if (byte_off == 0 && char_cast)
+ pp_c_right_paren (pp);
+ pp_c_right_paren (pp);
+ if (addr)
+ {
+ pp_space (pp);
+ pp_plus (pp);
+ pp_space (pp);
+ }
+ else
+ pp_c_left_bracket (pp);
+ tree idx = wide_int_to_tree (ssizetype, elt_idx);
+ pp->constant (idx);
+ if (!addr)
+ pp_c_right_bracket (pp);
+ }
+}
+
/* unary-expression:
postfix-expression
++ cast-expression
@@ -1837,30 +1999,7 @@ c_pretty_printer::unary_expression (tree e)
break;
case MEM_REF:
- if (TREE_CODE (TREE_OPERAND (e, 0)) == ADDR_EXPR
- && integer_zerop (TREE_OPERAND (e, 1)))
- expression (TREE_OPERAND (TREE_OPERAND (e, 0), 0));
- else
- {
- pp_c_star (this);
- if (!integer_zerop (TREE_OPERAND (e, 1)))
- {
- pp_c_left_paren (this);
- tree type = TREE_TYPE (TREE_TYPE (TREE_OPERAND (e, 0)));
- if (TYPE_SIZE_UNIT (type) == NULL_TREE
- || !integer_onep (TYPE_SIZE_UNIT (type)))
- pp_c_type_cast (this, ptr_type_node);
- }
- pp_c_cast_expression (this, TREE_OPERAND (e, 0));
- if (!integer_zerop (TREE_OPERAND (e, 1)))
- {
- pp_plus (this);
- pp_c_integer_constant (this,
- fold_convert (ssizetype,
- TREE_OPERAND (e, 1)));
- pp_c_right_paren (this);
- }
- }
+ print_mem_ref (this, e);
break;
case TARGET_MEM_REF:
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 4572f6e4ae2..4ab27e0768e 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -2417,32 +2417,8 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
break;
case MEM_REF:
- if (TREE_CODE (TREE_OPERAND (t, 0)) == ADDR_EXPR
- && integer_zerop (TREE_OPERAND (t, 1)))
- dump_expr (pp, TREE_OPERAND (TREE_OPERAND (t, 0), 0), flags);
- else
- {
- pp_cxx_star (pp);
- if (!integer_zerop (TREE_OPERAND (t, 1)))
- {
- pp_cxx_left_paren (pp);
- if (!integer_onep (TYPE_SIZE_UNIT
- (TREE_TYPE (TREE_TYPE (TREE_OPERAND (t, 0))))))
- {
- pp_cxx_left_paren (pp);
- dump_type (pp, ptr_type_node, flags);
- pp_cxx_right_paren (pp);
- }
- }
- dump_expr (pp, TREE_OPERAND (t, 0), flags);
- if (!integer_zerop (TREE_OPERAND (t, 1)))
- {
- pp_cxx_ws_string (pp, "+");
- dump_expr (pp, fold_convert (ssizetype, TREE_OPERAND (t, 1)),
- flags);
- pp_cxx_right_paren (pp);
- }
- }
+ /* Delegate to the base "C" pretty printer. */
+ pp->c_pretty_printer::unary_expression (t);
break;
case TARGET_MEM_REF:
diff --git a/gcc/testsuite/g++.dg/pr95768.C b/gcc/testsuite/g++.dg/pr95768.C
new file mode 100644
index 00000000000..5e2c8c44ad0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr95768.C
@@ -0,0 +1,32 @@
+/* PR c++/95768 - pretty-printer ICE on -Wuninitialized with allocated storage
+ { dg-do compile }
+ { dg-options "-O2 -Wall" } */
+
+extern "C" void *malloc (__SIZE_TYPE__);
+
+struct f
+{
+ int i;
+ static int e (int);
+ void operator= (int) { e (i); }
+};
+
+struct m {
+ int i;
+ f length;
+};
+
+struct n {
+ m *o() { return (m *)this; }
+};
+
+struct p {
+ n *header;
+ p () {
+ header = (n *)malloc (0);
+ m b = *header->o(); // { dg-warning "\\\[-Wuninitialized" }
+ b.length = 0;
+ }
+};
+
+void detach2() { p(); }
diff --git a/gcc/testsuite/g++.dg/warn/Wuninitialized-12.C b/gcc/testsuite/g++.dg/warn/Wuninitialized-12.C
new file mode 100644
index 00000000000..d06aaac71b8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wuninitialized-12.C
@@ -0,0 +1,40 @@
+/* Verify that -Wuninitialized warnings about accesses to objects via
+ pointers and offsets mention valid expressions.
+ { dg-do compile }
+ { dg-options "-O2 -Wall" } */
+
+typedef __INT16_TYPE__ int16_t;
+typedef __INT32_TYPE__ int32_t;
+
+void sink (int);
+
+/* Verify properly aligned accesses at offsets that are multiples of
+ the access size. */
+
+void test_aligned (void)
+{
+ char *p1 = (char*)__builtin_malloc (32);
+ p1 += sizeof (int32_t);
+
+ int16_t *p2 = (int16_t*)p1;
+ sink (p2[1]); // { dg-warning "'\\(\\(int16_t\\*\\)p1\\)\\\[3]' is used uninitialized" }
+
+ int32_t *p4 = (int32_t*)p1;
+ sink (p4[1]); // { dg-warning "'\\(\\(int32_t\\*\\)p1\\)\\\[2]' is used uninitialized" }
+}
+
+
+/* Verify misaligned accesses at offsets that aren't multiples of
+ the access size. */
+
+void test_misaligned (void)
+{
+ char *p1 = (char*)__builtin_malloc (32);
+ p1 += 1;
+
+ int16_t *p2 = (int16_t*)p1;
+ sink (p2[1]); // { dg-warning "'\\(\\(int16_t\\*\\)\\(p1 \\+ 1\\)\\)\\\[1]' is used uninitialized" }
+
+ int32_t *p4 = (int32_t*)p1;
+ sink (p4[1]); // { dg-warning "'\\(\\(int32_t\\*\\)\\(p1 \\+ 1\\)\\)\\\[1]' is used uninitialized" }
+}
diff --git a/gcc/testsuite/gcc.dg/uninit-38.c b/gcc/testsuite/gcc.dg/uninit-38.c
new file mode 100644
index 00000000000..ebf11174af0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/uninit-38.c
@@ -0,0 +1,87 @@
+/* Verify that dereferencing uninitialized allocated objects and VLAs
+ correctly reflects offsets into the objects.
+ The test's main purpose is to exercise the formatting of MEM_REFs.
+ If -Wuninitialized gets smarter and detects uninitialized accesses
+ before they're turned into MEM_REFs the test will likely need to
+ be adjusted. Ditto if -Wuninitialized output changes for some
+ other reason.
+ { dg-do compile { target { { lp64 || ilp32 } || llp64 } } }
+ { dg-options "-O2 -Wall -ftrack-macro-expansion=0" } */
+
+#define CONCAT(x, y) x ## y
+#define CAT(x, y) CONCAT(x, y)
+#define UNIQ(name) CAT (name, __LINE__)
+
+typedef __SIZE_TYPE__ size_t;
+
+extern void* malloc (size_t);
+
+void sink (void*, ...);
+
+#undef T
+#define T(Type, idx, off) \
+ __attribute__ ((noipa)) \
+ void UNIQ (test_)(int n) \
+ { \
+ void *p = malloc (n); \
+ Type *q = (Type*)((char*)p + off); \
+ sink (p, q[idx]); \
+ } \
+ typedef void dummy_type
+
+T (int, 0, 0); // { dg-warning "'\\*\\(int\\*\\)p' is used uninitialized" }
+T (int, 0, 1); // { dg-warning "'\\*\\(int\\*\\)\\(\\(char\\*\\)p \\+ 1\\)'" }
+T (int, 0, 2); // { dg-warning "'\\*\\(int\\*\\)\\(\\(char\\*\\)p \\+ 2\\)'" }
+T (int, 0, 3); // { dg-warning "'\\*\\(int\\*\\)\\(\\(char\\*\\)p \\+ 3\\)'" }
+T (int, 0, 4); // { dg-warning "'\\(\\(int\\*\\)p\\)\\\[1]'" }
+T (int, 0, 5); // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 1\\)\\)\\\[1]'" }
+T (int, 0, 6); // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 2\\)\\)\\\[1]'" }
+T (int, 0, 7); // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 3\\)\\)\\\[1]'" }
+T (int, 0, 8); // { dg-warning "'\\(\\(int\\*\\)p\\)\\\[2]'" }
+T (int, 0, 9); // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 1\\)\\)\\\[2]'" }
+
+
+T (int, 1, 0); // { dg-warning "'\\(\\(int\\*\\)p\\)\\\[1]' is used uninitialized" }
+T (int, 1, 1); // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 1\\)\\)\\\[1]'" }
+T (int, 1, 2); // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 2\\)\\)\\\[1]'" }
+T (int, 1, 3); // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 3\\)\\)\\\[1]'" }
+T (int, 1, 4); // { dg-warning "'\\(\\(int\\*\\)p\\)\\\[2]'" }
+T (int, 1, 5); // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 1\\)\\)\\\[2]'" }
+T (int, 1, 6); // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 2\\)\\)\\\[2]'" }
+T (int, 1, 7); // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 3\\)\\)\\\[2]'" }
+T (int, 1, 8); // { dg-warning "'\\(\\(int\\*\\)p\\)\\\[3]'" }
+T (int, 1, 9); // { dg-warning "'\\(\\(int\\*\\)\\(\\(char\\*\\)p \\+ 1\\)\\)\\\[3]'" }
+
+#undef T
+#define T(Type, idx, off) \
+ __attribute__ ((noipa)) \
+ void UNIQ (test_)(int n) \
+ { \
+ char a[n], *p = a; \
+ Type *q = (Type*)((char*)p + off); \
+ sink (p, q[idx]); \
+ } \
+ typedef void dummy_type
+
+T (int, 0, 0); // { dg-warning "'\\*\\(int\\*\\)a' is used uninitialized" }
+T (int, 0, 1); // { dg-warning "'\\*\\(int\\*\\)\\(a \\+ 1\\)'" }
+T (int, 0, 2); // { dg-warning "'\\*\\(int\\*\\)\\(a \\+ 2\\)'" }
+T (int, 0, 3); // { dg-warning "'\\*\\(int\\*\\)\\(a \\+ 3\\)'" }
+T (int, 0, 4); // { dg-warning "'\\(\\(int\\*\\)a\\)\\\[1]'" }
+T (int, 0, 5); // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 1\\)\\)\\\[1]'" }
+T (int, 0, 6); // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 2\\)\\)\\\[1]'" }
+T (int, 0, 7); // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 3\\)\\)\\\[1]'" }
+T (int, 0, 8); // { dg-warning "'\\(\\(int\\*\\)a\\)\\\[2]'" }
+T (int, 0, 9); // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 1\\)\\)\\\[2]'" }
+
+
+T (int, 1, 0); // { dg-warning "'\\(\\(int\\*\\)a\\)\\\[1]' is used uninitialized" }
+T (int, 1, 1); // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 1\\)\\)\\\[1]'" }
+T (int, 1, 2); // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 2\\)\\)\\\[1]'" }
+T (int, 1, 3); // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 3\\)\\)\\\[1]'" }
+T (int, 1, 4); // { dg-warning "'\\(\\(int\\*\\)a\\)\\\[2]'" }
+T (int, 1, 5); // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 1\\)\\)\\\[2]'" }
+T (int, 1, 6); // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 2\\)\\)\\\[2]'" }
+T (int, 1, 7); // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 3\\)\\)\\\[2]'" }
+T (int, 1, 8); // { dg-warning "'\\(\\(int\\*\\)a\\)\\\[3]'" }
+T (int, 1, 9); // { dg-warning "'\\(\\(int\\*\\)\\(a \\+ 1\\)\\)\\\[3]'" }
next prev parent reply other threads:[~2021-01-02 22:22 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-22 17:25 [PATCH] " Martin Sebor
2020-06-22 18:55 ` Jason Merrill
2020-06-22 22:22 ` Martin Sebor
2020-06-23 7:12 ` Richard Biener
2020-06-28 23:07 ` Martin Sebor
2020-06-29 7:19 ` Richard Biener
2020-07-09 15:50 ` Martin Sebor
2021-01-02 22:22 ` Martin Sebor [this message]
2021-01-05 23:17 ` [PATCH v3] " Jeff Law
2021-01-07 8:26 ` Jakub Jelinek
2021-01-07 21:36 ` Martin Sebor
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=887c792e-ba8e-33ee-a607-dc4fbd8a1b8e@gmail.com \
--to=msebor@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jason@redhat.com \
--cc=richard.guenther@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).