public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH 1/5] substring-locations: add class format_string_diagnostic_t
  2018-09-14 17:31 [PATCH 0/5] RFC: gimple-ssa-sprintf.c: a new approach (PR middle-end/77696) David Malcolm
@ 2018-09-14 17:30 ` David Malcolm
  2018-09-14 17:31 ` [PATCH 5/5] gimple-ssa-sprintf.c: rewrite overflow/truncation diagnostic (PR middle-end/77696) David Malcolm
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: David Malcolm @ 2018-09-14 17:30 UTC (permalink / raw)
  To: msebor; +Cc: gcc-patches, David Malcolm

With the addition of ranges in r263564, format_warning_at_substring_n
has 10 arguments.

Reduce the number of args by bundling the shared ones into a
class format_string_diagnostic_t.

gcc/c-family/ChangeLog:
	* c-format.c (format_warning_at_char): Update for introduction of
	format_string_diagnostic_t.
	(format_type_warning): Likewise.

gcc/ChangeLog:
	* gimple-ssa-sprintf.c (fmtwarn): Update for introduction of
	format_string_diagnostic_t.
	(fmtwarn_n): Likewise.
	* substring-locations.c
	(format_string_diagnostic_t::format_string_diagnostic_t) New ctor.
	(format_warning_n_va): Convert to...
	(format_string_diagnostic_t::emit_warning_n_va): ...this.
	(format_warning_va): Convert to...
	(format_string_diagnostic_t::emit_warning_va): ...this.
	(format_warning_at_substring): Convert to...
	(format_string_diagnostic_t::emit_warning): ...this.
	(format_warning_at_substring_n): Convert to...
	(format_string_diagnostic_t::emit_warning_n): ...this.
	* substring-locations.h (class format_string_diagnostic_t): New
	class.
	(format_warning_va): Convert to
	format_string_diagnostic_t::emit_warning_va.
	(format_warning_n_va): Convert to
	format_string_diagnostic_t::emit_warning_n_va.
	(format_warning_at_substring): Convert to
	format_string_diagnostic_t::emit_warning.
	(format_warning_at_substring_n): Convert to
	format_string_diagnostic_t::emit_warning_n.
---
 gcc/c-family/c-format.c   |  28 ++++++------
 gcc/gimple-ssa-sprintf.c  |  16 ++++---
 gcc/substring-locations.c | 113 +++++++++++++++++++++++-----------------------
 gcc/substring-locations.h |  74 +++++++++++++++---------------
 4 files changed, 115 insertions(+), 116 deletions(-)

diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index 98c49cf..a7edfca 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -100,8 +100,9 @@ format_warning_at_char (location_t fmt_string_loc, tree format_string_cst,
 
   substring_loc fmt_loc (fmt_string_loc, string_type, char_idx, char_idx,
 			 char_idx);
-  bool warned = format_warning_va (fmt_loc, NULL, UNKNOWN_LOCATION, NULL,
-				   NULL, opt, gmsgid, &ap);
+  format_string_diagnostic_t diag (fmt_loc, NULL, UNKNOWN_LOCATION, NULL,
+				   NULL);
+  bool warned = diag.emit_warning_va (opt, gmsgid, &ap);
   va_end (ap);
 
   return warned;
@@ -3694,13 +3695,13 @@ format_type_warning (const substring_loc &whole_fmt_loc,
   char *corrected_substring
     = get_corrected_substring (fmt_loc, type, arg_type, fki,
 			       offset_to_type_start, conversion_char);
-
+  format_string_diagnostic_t diag (fmt_loc, &fmt_label, param_loc, &param_label,
+				   corrected_substring);
   if (wanted_type_name)
     {
       if (arg_type)
-	format_warning_at_substring
-	  (fmt_loc, &fmt_label, param_loc, &param_label,
-	   corrected_substring, OPT_Wformat_,
+	diag.emit_warning
+	  (OPT_Wformat_,
 	   "%s %<%s%.*s%> expects argument of type %<%s%s%>, "
 	   "but argument %d has type %qT",
 	   gettext (kind_descriptions[kind]),
@@ -3708,9 +3709,8 @@ format_type_warning (const substring_loc &whole_fmt_loc,
 	   format_length, format_start,
 	   wanted_type_name, p, arg_num, arg_type);
       else
-	format_warning_at_substring
-	  (fmt_loc, &fmt_label, param_loc, &param_label,
-	   corrected_substring, OPT_Wformat_,
+	diag.emit_warning
+	  (OPT_Wformat_,
 	   "%s %<%s%.*s%> expects a matching %<%s%s%> argument",
 	   gettext (kind_descriptions[kind]),
 	   (kind == CF_KIND_FORMAT ? "%" : ""),
@@ -3719,9 +3719,8 @@ format_type_warning (const substring_loc &whole_fmt_loc,
   else
     {
       if (arg_type)
-	format_warning_at_substring
-	  (fmt_loc, &fmt_label, param_loc, &param_label,
-	   corrected_substring, OPT_Wformat_,
+	diag.emit_warning
+	  (OPT_Wformat_,
 	   "%s %<%s%.*s%> expects argument of type %<%T%s%>, "
 	   "but argument %d has type %qT",
 	   gettext (kind_descriptions[kind]),
@@ -3729,9 +3728,8 @@ format_type_warning (const substring_loc &whole_fmt_loc,
 	   format_length, format_start,
 	   wanted_type, p, arg_num, arg_type);
       else
-	format_warning_at_substring
-	  (fmt_loc, &fmt_label, param_loc, &param_label,
-	   corrected_substring, OPT_Wformat_,
+	diag.emit_warning
+	  (OPT_Wformat_,
 	   "%s %<%s%.*s%> expects a matching %<%T%s%> argument",
 	   gettext (kind_descriptions[kind]),
 	   (kind == CF_KIND_FORMAT ? "%" : ""),
diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index 433e558..65c7f92 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -455,7 +455,8 @@ get_format_string (tree format, location_t *ploc)
 }
 
 /* For convenience and brevity, shorter named entrypoints of
-   format_warning_at_substring and format_warning_at_substring_n.
+   format_string_diagnostic_t::emit_warning_va and
+   format_string_diagnostic_t::emit_warning_n_va.
    These have to be functions with the attribute so that exgettext
    works properly.  */
 
@@ -464,10 +465,11 @@ ATTRIBUTE_GCC_DIAG (5, 6)
 fmtwarn (const substring_loc &fmt_loc, location_t param_loc,
 	 const char *corrected_substring, int opt, const char *gmsgid, ...)
 {
+  format_string_diagnostic_t diag (fmt_loc, NULL, param_loc, NULL,
+				   corrected_substring);
   va_list ap;
   va_start (ap, gmsgid);
-  bool warned = format_warning_va (fmt_loc, NULL, param_loc, NULL,
-				   corrected_substring, opt, gmsgid, &ap);
+  bool warned = diag.emit_warning_va (opt, gmsgid, &ap);
   va_end (ap);
 
   return warned;
@@ -479,12 +481,12 @@ fmtwarn_n (const substring_loc &fmt_loc, location_t param_loc,
 	   const char *corrected_substring, int opt, unsigned HOST_WIDE_INT n,
 	   const char *singular_gmsgid, const char *plural_gmsgid, ...)
 {
+  format_string_diagnostic_t diag (fmt_loc, NULL, param_loc, NULL,
+				   corrected_substring);
   va_list ap;
   va_start (ap, plural_gmsgid);
-  bool warned = format_warning_n_va (fmt_loc, NULL, param_loc, NULL,
-				     corrected_substring,
-				     opt, n, singular_gmsgid, plural_gmsgid,
-				     &ap);
+  bool warned = diag.emit_warning_n_va (opt, n, singular_gmsgid, plural_gmsgid,
+					&ap);
   va_end (ap);
 
   return warned;
diff --git a/gcc/substring-locations.c b/gcc/substring-locations.c
index faf7884..db88f20 100644
--- a/gcc/substring-locations.c
+++ b/gcc/substring-locations.c
@@ -28,12 +28,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "substring-locations.h"
 #include "gcc-rich-location.h"
 
-/* Emit a warning governed by option OPT, using SINGULAR_GMSGID as the
-   format string (or if PLURAL_GMSGID is different from SINGULAR_GMSGID,
-   using SINGULAR_GMSGID, PLURAL_GMSGID and N as arguments to ngettext)
-   and AP as its arguments.
+/* format_string_diagnostic_t's ctor, giving information for use by
+   the emit_warning* member functions, as follows:
 
-   Attempt to obtain precise location information within a string
+   They attempt to obtain precise location information within a string
    literal from FMT_LOC.
 
    Case 1: if substring location is available, and is within the range of
@@ -49,7 +47,7 @@ along with GCC; see the file COPYING3.  If not see
 
    Case 2: if the substring location is available, but is not within
    the range of the format string, the primary location is that of the
-   format string, and an note is emitted showing the substring location.
+   format string, and a note is emitted showing the substring location.
 
    For example:
      test.c:90:10: warning: problem with '%i' here [-Wformat=]
@@ -120,29 +118,47 @@ along with GCC; see the file COPYING3.  If not see
                     ~^
                     %s
 
+*/
+
+format_string_diagnostic_t::
+format_string_diagnostic_t (const substring_loc &fmt_loc,
+			    const range_label *fmt_label,
+			    location_t param_loc,
+			    const range_label *param_label,
+			    const char *corrected_substring)
+: m_fmt_loc (fmt_loc),
+  m_fmt_label (fmt_label),
+  m_param_loc (param_loc),
+  m_param_label (param_label),
+  m_corrected_substring (corrected_substring)
+{
+}
+
+/* Emit a warning governed by option OPT, using SINGULAR_GMSGID as the
+   format string (or if PLURAL_GMSGID is different from SINGULAR_GMSGID,
+   using SINGULAR_GMSGID, PLURAL_GMSGID and N as arguments to ngettext)
+   and AP as its arguments.
+
    Return true if a warning was emitted, false otherwise.  */
 
 bool
-format_warning_n_va (const substring_loc &fmt_loc,
-		     const range_label *fmt_label,
-		     location_t param_loc,
-		     const range_label *param_label,
-		     const char *corrected_substring,
-		     int opt, unsigned HOST_WIDE_INT n,
-		     const char *singular_gmsgid,
-		     const char *plural_gmsgid, va_list *ap)
+format_string_diagnostic_t::emit_warning_n_va (int opt,
+					       unsigned HOST_WIDE_INT n,
+					       const char *singular_gmsgid,
+					       const char *plural_gmsgid,
+					       va_list *ap) const
 {
   bool substring_within_range = false;
   location_t primary_loc;
   location_t fmt_substring_loc = UNKNOWN_LOCATION;
   source_range fmt_loc_range
-    = get_range_from_loc (line_table, fmt_loc.get_fmt_string_loc ());
-  const char *err = fmt_loc.get_location (&fmt_substring_loc);
+    = get_range_from_loc (line_table, m_fmt_loc.get_fmt_string_loc ());
+  const char *err = m_fmt_loc.get_location (&fmt_substring_loc);
   source_range fmt_substring_range
     = get_range_from_loc (line_table, fmt_substring_loc);
   if (err)
     /* Case 3: unable to get substring location.  */
-    primary_loc = fmt_loc.get_fmt_string_loc ();
+    primary_loc = m_fmt_loc.get_fmt_string_loc ();
   else
     {
       if (fmt_substring_range.m_start >= fmt_loc_range.m_start
@@ -158,23 +174,23 @@ format_warning_n_va (const substring_loc &fmt_loc,
 	/* Case 2.  */
 	{
 	  substring_within_range = false;
-	  primary_loc = fmt_loc.get_fmt_string_loc ();
+	  primary_loc = m_fmt_loc.get_fmt_string_loc ();
 	}
     }
 
   /* Only use fmt_label in the initial warning for case 1.  */
   const range_label *primary_label = NULL;
   if (substring_within_range)
-    primary_label = fmt_label;
+    primary_label = m_fmt_label;
 
   auto_diagnostic_group d;
   gcc_rich_location richloc (primary_loc, primary_label);
 
-  if (param_loc != UNKNOWN_LOCATION)
-    richloc.add_range (param_loc, SHOW_RANGE_WITHOUT_CARET, param_label);
+  if (m_param_loc != UNKNOWN_LOCATION)
+    richloc.add_range (m_param_loc, SHOW_RANGE_WITHOUT_CARET, m_param_label);
 
-  if (!err && corrected_substring && substring_within_range)
-    richloc.add_fixit_replace (fmt_substring_range, corrected_substring);
+  if (!err && m_corrected_substring && substring_within_range)
+    richloc.add_fixit_replace (fmt_substring_range, m_corrected_substring);
 
   diagnostic_info diagnostic;
   if (singular_gmsgid != plural_gmsgid)
@@ -205,10 +221,10 @@ format_warning_n_va (const substring_loc &fmt_loc,
       {
 	/* Use fmt_label in the note for case 2.  */
 	rich_location substring_richloc (line_table, fmt_substring_loc,
-					 fmt_label);
-	if (corrected_substring)
+					 m_fmt_label);
+	if (m_corrected_substring)
 	  substring_richloc.add_fixit_replace (fmt_substring_range,
-					       corrected_substring);
+					       m_corrected_substring);
 	inform (&substring_richloc,
 		"format string is defined here");
       }
@@ -219,55 +235,38 @@ format_warning_n_va (const substring_loc &fmt_loc,
 /* Singular-only version of the above.  */
 
 bool
-format_warning_va (const substring_loc &fmt_loc,
-		   const range_label *fmt_label,
-		   location_t param_loc,
-		   const range_label *param_label,
-		   const char *corrected_substring,
-		   int opt, const char *gmsgid, va_list *ap)
+format_string_diagnostic_t::emit_warning_va (int opt, const char *gmsgid,
+					     va_list *ap) const
 {
-  return format_warning_n_va (fmt_loc, fmt_label, param_loc, param_label,
-			      corrected_substring, opt,
-			      0, gmsgid, gmsgid, ap);
+  return emit_warning_n_va (opt, 0, gmsgid, gmsgid, ap);
 }
 
-/* Variadic call to format_warning_va.  */
+/* Variadic version of the above (singular only).  */
 
 bool
-format_warning_at_substring (const substring_loc &fmt_loc,
-			     const range_label *fmt_label,
-			     location_t param_loc,
-			     const range_label *param_label,
-			     const char *corrected_substring,
-			     int opt, const char *gmsgid, ...)
+format_string_diagnostic_t::emit_warning (int opt, const char *gmsgid,
+					  ...) const
 {
   va_list ap;
   va_start (ap, gmsgid);
-  bool warned = format_warning_va (fmt_loc, fmt_label, param_loc, param_label,
-				   corrected_substring, opt, gmsgid, &ap);
+  bool warned = emit_warning_va (opt, gmsgid, &ap);
   va_end (ap);
 
   return warned;
 }
 
-/* Variadic call to format_warning_n_va.  */
+/* Variadic version of the above (singular vs plural).  */
 
 bool
-format_warning_at_substring_n (const substring_loc &fmt_loc,
-			       const range_label *fmt_label,
-			       location_t param_loc,
-			       const range_label *param_label,
-			       const char *corrected_substring,
-			       int opt, unsigned HOST_WIDE_INT n,
-			       const char *singular_gmsgid,
-			       const char *plural_gmsgid, ...)
+format_string_diagnostic_t::emit_warning_n (int opt, unsigned HOST_WIDE_INT n,
+					    const char *singular_gmsgid,
+					    const char *plural_gmsgid,
+					    ...) const
 {
   va_list ap;
   va_start (ap, plural_gmsgid);
-  bool warned = format_warning_n_va (fmt_loc, fmt_label, param_loc, param_label,
-				     corrected_substring,
-				     opt, n, singular_gmsgid, plural_gmsgid,
-				     &ap);
+  bool warned = emit_warning_n_va (opt, n, singular_gmsgid, plural_gmsgid,
+				   &ap);
   va_end (ap);
 
   return warned;
diff --git a/gcc/substring-locations.h b/gcc/substring-locations.h
index 919fdf0..caac537 100644
--- a/gcc/substring-locations.h
+++ b/gcc/substring-locations.h
@@ -74,43 +74,43 @@ class substring_loc
   int m_end_idx;
 };
 
-/* Functions for emitting a warning about a format string.  */
-
-extern bool format_warning_va (const substring_loc &fmt_loc,
-			       const range_label *fmt_label,
-			       location_t param_loc,
-			       const range_label *param_label,
-			       const char *corrected_substring,
-			       int opt, const char *gmsgid, va_list *ap)
-  ATTRIBUTE_GCC_DIAG (7, 0);
-
-extern bool format_warning_n_va (const substring_loc &fmt_loc,
-				 const range_label *fmt_label,
-				 location_t param_loc,
-				 const range_label *param_label,
-				 const char *corrected_substring,
-				 int opt, unsigned HOST_WIDE_INT n,
-				 const char *singular_gmsgid,
-				 const char *plural_gmsgid, va_list *ap)
-  ATTRIBUTE_GCC_DIAG (8, 0) ATTRIBUTE_GCC_DIAG (9, 0);
-
-extern bool format_warning_at_substring (const substring_loc &fmt_loc,
-					 const range_label *fmt_label,
-					 location_t param_loc,
-					 const range_label *param_label,
-					 const char *corrected_substring,
-					 int opt, const char *gmsgid, ...)
-  ATTRIBUTE_GCC_DIAG (7, 8);
-
-extern bool format_warning_at_substring_n (const substring_loc &fmt_loc,
-					   const range_label *fmt_label,
-					   location_t param_loc,
-					   const range_label *param_label,
-					   const char *corrected_substring,
-					   int opt, unsigned HOST_WIDE_INT n,
-					   const char *singular_gmsgid,
-					   const char *plural_gmsgid, ...)
-  ATTRIBUTE_GCC_DIAG (8, 10) ATTRIBUTE_GCC_DIAG (9, 10);
+/* A bundle of state for emitting a diagnostic relating to a format string.  */
+
+class format_string_diagnostic_t
+{
+ public:
+  format_string_diagnostic_t (const substring_loc &fmt_loc,
+			      const range_label *fmt_label,
+			      location_t param_loc,
+			      const range_label *param_label,
+			      const char *corrected_substring);
+
+  /* Functions for emitting a warning about a format string.  */
+
+  bool emit_warning_va (int opt, const char *gmsgid, va_list *ap) const
+    ATTRIBUTE_GCC_DIAG (3, 0);
+
+  bool emit_warning_n_va (int opt, unsigned HOST_WIDE_INT n,
+			  const char *singular_gmsgid,
+			  const char *plural_gmsgid, va_list *ap) const
+  ATTRIBUTE_GCC_DIAG (4, 0) ATTRIBUTE_GCC_DIAG (5, 0);
+
+  bool emit_warning (int opt, const char *gmsgid, ...) const
+    ATTRIBUTE_GCC_DIAG (3, 4);
+
+  bool emit_warning_n (int opt, unsigned HOST_WIDE_INT n,
+		       const char *singular_gmsgid,
+		       const char *plural_gmsgid, ...) const
+  ATTRIBUTE_GCC_DIAG (4, 6) ATTRIBUTE_GCC_DIAG (5, 6);
+
+ private:
+  const substring_loc &m_fmt_loc;
+  const range_label *m_fmt_label;
+  location_t m_param_loc;
+  const range_label *m_param_label;
+  const char *m_corrected_substring;
+};
+
 
 /* Implementation detail, for use when implementing
    LANG_HOOKS_GET_SUBSTRING_LOCATION.  */
-- 
1.8.5.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 0/5] RFC: gimple-ssa-sprintf.c: a new approach (PR middle-end/77696)
@ 2018-09-14 17:31 David Malcolm
  2018-09-14 17:30 ` [PATCH 1/5] substring-locations: add class format_string_diagnostic_t David Malcolm
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: David Malcolm @ 2018-09-14 17:31 UTC (permalink / raw)
  To: msebor; +Cc: gcc-patches, David Malcolm

Martin: on the way back from Cauldron I had a go at implementing new
output formats for the warnings from gimple-ssa-sprintf.c, to try to
better indicate to the end user what the problem is.  My plan was to
implement some of the ASCII art ideas we've been discussing.  However,
to try to understand what the pass is doing, I tried visualizing the
existing data structures, and I ended up liking the output of that so much,
that I think that it is what we ought to show the end-user.

Consider the examples from PR middle-end/77696:

char d[4];

typedef __SIZE_TYPE__ size_t;

extern int sprintf (char*, const char*, ...);
extern int snprintf (char*, size_t, const char*, ...);


void f (int i)
{
  // bounded, definite truncation in a directve
  snprintf (d, sizeof d, "%i", 1235);

  // bounded, definite truncation copying format string
  snprintf (d, sizeof d, "%iAB", 123);

  // unbounded, definite overflow in a directve
  sprintf (d, "%i", 1235);

  // unbounded, definite overflow copying format string
  sprintf (d, "%iAB", 123);

  // bounded, possible truncation a directve
  snprintf (d, sizeof d, "%i", i);

  // bounded, possible overflow copying format string
  snprintf (d, sizeof d, "%iAB", i);

  // unbounded, possible overflow in a directve
  sprintf (d, "%i", i);

  // unbounded, possible overflow copying format string
  sprintf (d, "%iAB", 123);

  // unbounded, possible overflow copying format string
  const char *s = i ? "123" : "1234";
  sprintf (d, "%sAB", s);
}

extern char buf_10[80];
extern char tmpdir[80];
extern char fname[8];

void g (int num)
{
  sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
}

trunk currently emits:

zz.c: In function ‘f’:
zzz.c:12:29: warning: ‘snprintf’ output truncated before the last format character [-Wformat-truncation=]
12 |   snprintf (d, sizeof d, "%i", 1235);
   |                             ^
zzz.c:12:3: note: ‘snprintf’ output 5 bytes into a destination of size 4
12 |   snprintf (d, sizeof d, "%i", 1235);
   |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:15:30: warning: ‘AB’ directive output truncated writing 2 bytes into a region of size 1 [-Wformat-truncation=]
15 |   snprintf (d, sizeof d, "%iAB", 123);
   |                             ~^
zzz.c:15:3: note: ‘snprintf’ output 6 bytes into a destination of size 4
15 |   snprintf (d, sizeof d, "%iAB", 123);
   |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:18:18: warning: ‘sprintf’ writing a terminating nul past the end of the destination [-Wformat-overflow=]
18 |   sprintf (d, "%i", 1235);
   |                  ^
zzz.c:18:3: note: ‘sprintf’ output 5 bytes into a destination of size 4
18 |   sprintf (d, "%i", 1235);
   |   ^~~~~~~~~~~~~~~~~~~~~~~
zzz.c:21:19: warning: ‘AB’ directive writing 2 bytes into a region of size 1 [-Wformat-overflow=]
21 |   sprintf (d, "%iAB", 123);
   |                  ~^
zzz.c:21:3: note: ‘sprintf’ output 6 bytes into a destination of size 4
21 |   sprintf (d, "%iAB", 123);
   |   ^~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:33:19: warning: ‘AB’ directive writing 2 bytes into a region of size 1 [-Wformat-overflow=]
33 |   sprintf (d, "%iAB", 123);
   |                  ~^
zzz.c:33:3: note: ‘sprintf’ output 6 bytes into a destination of size 4
33 |   sprintf (d, "%iAB", 123);
   |   ^~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:37:19: warning: ‘AB’ directive writing 2 bytes into a region of size between 0 and 1 [-Wformat-overflow=]
37 |   sprintf (d, "%sAB", s);
   |                  ~^
zzz.c:37:3: note: ‘sprintf’ output between 6 and 7 bytes into a destination of size 4
37 |   sprintf (d, "%sAB", s);
   |   ^~~~~~~~~~~~~~~~~~~~~~
zzz.c: In function ‘g’:
zzz.c:46:24: warning: ‘/’ directive writing 1 byte into a region of size between 0 and 79 [-Wformat-overflow=]
46 |   sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
   |                        ^
zzz.c:46:3: note: ‘sprintf’ output between 9 and 105 bytes into a destination of size 80
46 |   sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
   |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


With this patch kit, gcc emits:

zzz.c: In function ‘f’:
zzz.c:12:13: warning: ‘snprintf’ will truncate the output (5 bytes) to size 4 [-Wformat-truncation=]
12 |   snprintf (d, sizeof d, "%i", 1235);
   |             ^             ^~^
   |             |             | |
   |             |             | 1 byte (for NUL terminator)
   |             |             4 bytes
   |             capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
  |      ^
zzz.c:15:13: warning: ‘snprintf’ will truncate the output (6 bytes) to size 4 [-Wformat-truncation=]
15 |   snprintf (d, sizeof d, "%iAB", 123);
   |             ^             ^~^~^
   |             |             | | |
   |             |             | | 1 byte (for NUL terminator)
   |             |             | 2 bytes
   |             |             3 bytes
   |             capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
  |      ^
zzz.c:18:12: warning: buffer overflow: ‘sprintf’ will write 5 bytes into a destination of size 4 [-Wformat-overflow=]
18 |   sprintf (d, "%i", 1235);
   |            ^   ^~^
   |            |   | |
   |            |   | 1 byte (for NUL terminator)
   |            |   4 bytes
   |            capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
  |      ^
zzz.c:21:12: warning: buffer overflow: ‘sprintf’ will write 6 bytes into a destination of size 4 [-Wformat-overflow=]
21 |   sprintf (d, "%iAB", 123);
   |            ^   ^~^~^
   |            |   | | |
   |            |   | | 1 byte (for NUL terminator)
   |            |   | 2 bytes
   |            |   3 bytes
   |            capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
  |      ^
zzz.c:33:12: warning: buffer overflow: ‘sprintf’ will write 6 bytes into a destination of size 4 [-Wformat-overflow=]
33 |   sprintf (d, "%iAB", 123);
   |            ^   ^~^~^
   |            |   | | |
   |            |   | | 1 byte (for NUL terminator)
   |            |   | 2 bytes
   |            |   3 bytes
   |            capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
  |      ^
zzz.c:37:12: warning: buffer overflow: ‘sprintf’ will write between 6 and 7 bytes into a destination of size 4 [-Wformat-overflow=]
37 |   sprintf (d, "%sAB", s);
   |            ^   ^~^~^
   |            |   | | |
   |            |   | | 1 byte (for NUL terminator)
   |            |   | 2 bytes
   |            |   3...4 bytes
   |            capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
  |      ^
zzz.c: In function ‘g’:
zzz.c:46:12: warning: buffer overflow: ‘sprintf’ will write between 9 and 105 bytes into a destination of size 80 [-Wformat-overflow=]
46 |   sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
   |            ^~~~~~   ^^~^^~^^~^~~~^
   |            |        || || || |   |
   |            |        || || || |   1 byte (for NUL terminator)
   |            |        || || || 4 bytes
   |            |        || || |1...11 bytes
   |            |        || || 1 byte
   |            |        || |0...7 bytes
   |            |        || 1 byte
   |            |        |0...79 bytes
   |            |        1 byte
   |            capacity: 80 bytes
zzz.c:40:13: note: destination declared here
40 | extern char buf_10[80];
   |             ^~~~~~

The above is missing colorization, which can be seen at:

  https://dmalcolm.fedorapeople.org/gcc/2018-09-14/zzz.html

where the secondary locations within the format string alternate
colors between odd and even secondary locations.

The above output format is rather verbose, but I think that it is
appropriate for this warning.

Rationale: What is the action that an end-user will want to take when
encountering a warning?

I think they will want to:
(a) review it and decide:
  (a.1) "Is this a "true" result?"
  (a.2) "Do I care?"  
(b) if they decide it's genuine: "How do I fix this?"

(i.e. we should aim for our diagnostics to be "actionable")

Consider the sprintf/snprintf warnings.  For the user to easily take the
above actions, the most helpful thing we can do is to provide the user
with clear information on how much data the format directives will write
when the code runs, and to compare that with the size of the destination buffer.

I think the typical fix the user will apply will be to increase the size
of the buffer, or to convert sprintf to snprintf: a buffer overflow is
typically very bad, whereas truncating a string can often be much less
serious.  A less common fix will be to add or tweak width specifiers to
specific format directives.

To help the end-user to make these kinds of decisions, these warnings
need to clearly summarize the sizes of *all* of the various writes,
and compare them to the size of the destination buffer.

As I understand it, the current implementation iterates through the
format directives, tracking the possible ranges of data that could have
been written so far.  It emits a warning at the first directive that can
have an overflow, and then stops analyzing the directives at that callsite.

The existing approach thus seems to focus on where the first overflow
can happen.

My proposed new approach analyzes all of the directives up-front, and if
an overflow can occur, then the warning that's emitted shows all of the
various sizes of writes that will happen.

Hence the reimplementation doesn't bother showing the point where the
overflow happens; instead it summarizes all of what will be written, to
help the user in deciding how to fix their code.
(Doing so reduces some of the combinatorial explosion in possible output
messages).

The patch also puts the words "buffer overflow" at the front of the
message where appropriate (to better grab the user's attention).
It also calls out NUL terminators, since people tend to make mistakes
with them.

There's a lot of FIXMEs in the resulting patch, but I wanted to run the
idea past you before fixing all the FIXMEs.  In particular, I haven't
attempted to bootstrap and regression-test the new implementation.
(I suspect many of the existing tests are broken by this...).  Consider
this a prototype.

[the interesting part of the patch kit is in patch 5/5]

Thoughts?
Dave

David Malcolm (5):
  substring-locations: add class format_string_diagnostic_t
  Add range_idx param to range_label::get_text
  gimple-ssa-sprintf.c: move struct call_info out of the dom_walker subclass
  Add pp_humanized_limit and pp_humanized_range
  gimple-ssa-sprintf.c: rewrite overflow/truncation diagnostic (PR middle-end/77696)

 gcc/c-family/c-format.c                      |  32 +-
 gcc/c/c-objc-common.c                        |   2 +-
 gcc/c/c-typeck.c                             |   4 +-
 gcc/cp/error.c                               |   2 +-
 gcc/diagnostic-show-locus.c                  |  19 +-
 gcc/gcc-rich-location.h                      |   4 +-
 gcc/gimple-ssa-sprintf.c                     | 670 ++++++++++++---------------
 gcc/pretty-print.c                           | 181 ++++++++
 gcc/pretty-print.h                           |   4 +
 gcc/substring-locations.c                    | 113 +++--
 gcc/substring-locations.h                    |  74 +--
 gcc/testsuite/gcc.dg/sprintf-diagnostics-1.c | 252 ++++++++++
 gcc/testsuite/gcc.dg/sprintf-diagnostics-2.c |  57 +++
 libcpp/include/line-map.h                    |   6 +-
 14 files changed, 927 insertions(+), 493 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/sprintf-diagnostics-1.c
 create mode 100644 gcc/testsuite/gcc.dg/sprintf-diagnostics-2.c

-- 
1.8.5.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 3/5] gimple-ssa-sprintf.c: move struct call_info out of the dom_walker subclass
  2018-09-14 17:31 [PATCH 0/5] RFC: gimple-ssa-sprintf.c: a new approach (PR middle-end/77696) David Malcolm
  2018-09-14 17:30 ` [PATCH 1/5] substring-locations: add class format_string_diagnostic_t David Malcolm
  2018-09-14 17:31 ` [PATCH 5/5] gimple-ssa-sprintf.c: rewrite overflow/truncation diagnostic (PR middle-end/77696) David Malcolm
@ 2018-09-14 17:31 ` David Malcolm
  2018-09-14 17:31 ` [PATCH 2/5] Add range_idx param to range_label::get_text David Malcolm
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: David Malcolm @ 2018-09-14 17:31 UTC (permalink / raw)
  To: msebor; +Cc: gcc-patches, David Malcolm

gcc/ChangeLog:
	* gimple-ssa-sprintf.c (struct call_info): New forward decl.
	(class sprintf_dom_walker): Remove forward decl.
	(struct sprintf_dom_walker::call_info): Rename to...
	(struct call_info): ...this.
	(should_warn_p): Update for move of call_info out of
	sprintf_dom_walker.
	(maybe_warn): Likewise.
	(format_directive): Likewise.
	(parse_directive): Likewise.
	(is_call_safe): Likewise.
	(try_substitute_return_value): Likewise.
	(try_simplify_call): Likewise.
---
 gcc/gimple-ssa-sprintf.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index 65c7f92..ab430fe 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -116,6 +116,7 @@ const pass_data pass_data_sprintf_length = {
 static int warn_level;
 
 struct format_result;
+struct call_info;
 
 class sprintf_dom_walker : public dom_walker
 {
@@ -127,7 +128,6 @@ class sprintf_dom_walker : public dom_walker
   void after_dom_children (basic_block) FINAL OVERRIDE;
   bool handle_gimple_call (gimple_stmt_iterator *);
 
-  struct call_info;
   bool compute_format_length (call_info &, format_result *);
   class evrp_range_analyzer evrp_range_analyzer;
 };
@@ -890,7 +890,7 @@ bytes_remaining (unsigned HOST_WIDE_INT navail, const format_result &res)
 
 /* Description of a call to a formatted function.  */
 
-struct sprintf_dom_walker::call_info
+struct call_info
 {
   /* Function call statement.  */
   gimple *callstmt;
@@ -2333,7 +2333,7 @@ format_plain (const directive &dir, tree, vr_values *)
    should be diagnosed given the AVAILable space in the destination.  */
 
 static bool
-should_warn_p (const sprintf_dom_walker::call_info &info,
+should_warn_p (const call_info &info,
 	       const result_range &avail, const result_range &result)
 {
   if (result.max <= avail.min)
@@ -2404,7 +2404,7 @@ should_warn_p (const sprintf_dom_walker::call_info &info,
 
 static bool
 maybe_warn (substring_loc &dirloc, location_t argloc,
-	    const sprintf_dom_walker::call_info &info,
+	    const call_info &info,
 	    const result_range &avail_range, const result_range &res,
 	    const directive &dir)
 {
@@ -2684,7 +2684,7 @@ maybe_warn (substring_loc &dirloc, location_t argloc,
    in *RES.  Return true if the directive has been handled.  */
 
 static bool
-format_directive (const sprintf_dom_walker::call_info &info,
+format_directive (const call_info &info,
 		  format_result *res, const directive &dir,
 		  class vr_values *vr_values)
 {
@@ -2963,7 +2963,7 @@ format_directive (const sprintf_dom_walker::call_info &info,
    the directive.  */
 
 static size_t
-parse_directive (sprintf_dom_walker::call_info &info,
+parse_directive (call_info &info,
 		 directive &dir, format_result *res,
 		 const char *str, unsigned *argno,
 		 vr_values *vr_values)
@@ -3491,7 +3491,7 @@ get_destination_size (tree dest)
    of its return values.  */
 
 static bool
-is_call_safe (const sprintf_dom_walker::call_info &info,
+is_call_safe (const call_info &info,
 	      const format_result &res, bool under4k,
 	      unsigned HOST_WIDE_INT retval[2])
 {
@@ -3550,7 +3550,7 @@ is_call_safe (const sprintf_dom_walker::call_info &info,
 
 static bool
 try_substitute_return_value (gimple_stmt_iterator *gsi,
-			     const sprintf_dom_walker::call_info &info,
+			     const call_info &info,
 			     const format_result &res)
 {
   tree lhs = gimple_get_lhs (info.callstmt);
@@ -3668,7 +3668,7 @@ try_substitute_return_value (gimple_stmt_iterator *gsi,
 
 static bool
 try_simplify_call (gimple_stmt_iterator *gsi,
-		   const sprintf_dom_walker::call_info &info,
+		   const call_info &info,
 		   const format_result &res)
 {
   unsigned HOST_WIDE_INT dummy[2];
-- 
1.8.5.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 2/5] Add range_idx param to range_label::get_text
  2018-09-14 17:31 [PATCH 0/5] RFC: gimple-ssa-sprintf.c: a new approach (PR middle-end/77696) David Malcolm
                   ` (2 preceding siblings ...)
  2018-09-14 17:31 ` [PATCH 3/5] gimple-ssa-sprintf.c: move struct call_info out of the dom_walker subclass David Malcolm
@ 2018-09-14 17:31 ` David Malcolm
  2018-09-14 17:36 ` [PATCH 4/5] Add pp_humanized_limit and pp_humanized_range David Malcolm
  2018-09-17 21:16 ` [PATCH 0/5] RFC: gimple-ssa-sprintf.c: a new approach (PR middle-end/77696) Martin Sebor
  5 siblings, 0 replies; 8+ messages in thread
From: David Malcolm @ 2018-09-14 17:31 UTC (permalink / raw)
  To: msebor; +Cc: gcc-patches, David Malcolm

This patch updates the pure virtual function range_label::get_text
(and its implementations) so that the index of the range is passed
in, allowing for one label instance to be shared by multiple ranges.

gcc/c-family/ChangeLog:
	* c-format.c (range_label_for_format_type_mismatch::get_text):
	Update for new param.

gcc/c/ChangeLog:
	* c-objc-common.c (range_label_for_type_mismatch::get_text):
	Update for new param.
	* c-typeck.c (maybe_range_label_for_tree_type_mismatch::get_text):
	Likewise.

gcc/cp/ChangeLog:
	* error.c (range_label_for_type_mismatch::get_text): Update for
	new param.

gcc/ChangeLog:
	* diagnostic-show-locus.c (class layout_range): Add field
	"m_original_idx".
	(layout_range::layout_range): Add "original_idx" param and use it
	to initialize new field.
	(make_range): Use 0 for original_idx.
	(layout::layout): Pass in index to calls to
	maybe_add_location_range.
	(layout::maybe_add_location_range): Add param "original_idx" and
	pass it on to layout_range.
	(layout::print_any_labels): Pass on range->m_original_idx to
	get_text call.
	(gcc_rich_location::add_location_if_nearby): Use 0 for
	original_idx.
	* gcc-rich-location.h (text_range_label::get_text): Update for new
	param.
	(range_label_for_type_mismatch::get_text): Likewise.

libcpp/ChangeLog:
	* include/line-map.h (range_label::get_text): Add param
	"range_idx".
---
 gcc/c-family/c-format.c     |  4 ++--
 gcc/c/c-objc-common.c       |  2 +-
 gcc/c/c-typeck.c            |  4 ++--
 gcc/cp/error.c              |  2 +-
 gcc/diagnostic-show-locus.c | 19 ++++++++++++++-----
 gcc/gcc-rich-location.h     |  4 ++--
 libcpp/include/line-map.h   |  6 ++++--
 7 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index a7edfca..a1133c7 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -3597,9 +3597,9 @@ class range_label_for_format_type_mismatch
   {
   }
 
-  label_text get_text () const FINAL OVERRIDE
+  label_text get_text (unsigned range_idx) const FINAL OVERRIDE
   {
-    label_text text = range_label_for_type_mismatch::get_text ();
+    label_text text = range_label_for_type_mismatch::get_text (range_idx);
     if (text.m_buffer == NULL)
       return text;
 
diff --git a/gcc/c/c-objc-common.c b/gcc/c/c-objc-common.c
index 12e777a..fee5268 100644
--- a/gcc/c/c-objc-common.c
+++ b/gcc/c/c-objc-common.c
@@ -218,7 +218,7 @@ c_tree_printer (pretty_printer *pp, text_info *text, const char *spec,
    range_label_for_type_mismatch.  */
 
 label_text
-range_label_for_type_mismatch::get_text () const
+range_label_for_type_mismatch::get_text (unsigned /*range_idx*/) const
 {
   if (m_labelled_type == NULL_TREE)
     return label_text (NULL, false);
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 5f8df12..5388ddb 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -11049,7 +11049,7 @@ class maybe_range_label_for_tree_type_mismatch : public range_label
   {
   }
 
-  label_text get_text () const FINAL OVERRIDE
+  label_text get_text (unsigned range_idx) const FINAL OVERRIDE
   {
     if (m_expr == NULL_TREE
 	|| !EXPR_P (m_expr))
@@ -11061,7 +11061,7 @@ class maybe_range_label_for_tree_type_mismatch : public range_label
       other_type = TREE_TYPE (m_other_expr);
 
    range_label_for_type_mismatch inner (expr_type, other_type);
-   return inner.get_text ();
+   return inner.get_text (range_idx);
   }
 
  private:
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 5bab3f3..601f6d2 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -4289,7 +4289,7 @@ qualified_name_lookup_error (tree scope, tree name,
    Compare with print_template_differences above.  */
 
 label_text
-range_label_for_type_mismatch::get_text () const
+range_label_for_type_mismatch::get_text (unsigned /*range_idx*/) const
 {
   if (m_labelled_type == NULL_TREE)
     return label_text (NULL, false);
diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 6ce8a0f..7dfb0a0 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -128,6 +128,7 @@ class layout_range
 		const expanded_location *finish_exploc,
 		enum range_display_kind range_display_kind,
 		const expanded_location *caret_exploc,
+		unsigned original_idx,
 		const range_label *label);
 
   bool contains_point (linenum_type row, int column) const;
@@ -137,6 +138,7 @@ class layout_range
   layout_point m_finish;
   enum range_display_kind m_range_display_kind;
   layout_point m_caret;
+  unsigned m_original_idx;
   const range_label *m_label;
 };
 
@@ -236,6 +238,7 @@ class layout
 	  diagnostic_t diagnostic_kind);
 
   bool maybe_add_location_range (const location_range *loc_range,
+				 unsigned original_idx,
 				 bool restrict_to_current_line_spans);
 
   int get_num_line_spans () const { return m_line_spans.length (); }
@@ -414,11 +417,13 @@ layout_range::layout_range (const expanded_location *start_exploc,
 			    const expanded_location *finish_exploc,
 			    enum range_display_kind range_display_kind,
 			    const expanded_location *caret_exploc,
+			    unsigned original_idx,
 			    const range_label *label)
 : m_start (*start_exploc),
   m_finish (*finish_exploc),
   m_range_display_kind (range_display_kind),
   m_caret (*caret_exploc),
+  m_original_idx (original_idx),
   m_label (label)
 {
 }
@@ -546,7 +551,7 @@ make_range (int start_line, int start_col, int end_line, int end_col)
   const expanded_location finish_exploc
     = {"test.c", end_line, end_col, NULL, false};
   return layout_range (&start_exploc, &finish_exploc, SHOW_RANGE_WITHOUT_CARET,
-		       &start_exploc, NULL);
+		       &start_exploc, 0, NULL);
 }
 
 /* Selftests for layout_range::contains_point and
@@ -899,7 +904,7 @@ layout::layout (diagnostic_context * context,
       /* This diagnostic printer can only cope with "sufficiently sane" ranges.
 	 Ignore any ranges that are awkward to handle.  */
       const location_range *loc_range = richloc->get_range (idx);
-      maybe_add_location_range (loc_range, false);
+      maybe_add_location_range (loc_range, idx, false);
     }
 
   /* Populate m_fixit_hints, filtering to only those that are in the
@@ -953,6 +958,9 @@ layout::layout (diagnostic_context * context,
 /* Attempt to add LOC_RANGE to m_layout_ranges, filtering them to
    those that we can sanely print.
 
+   ORIGINAL_IDX is the index of LOC_RANGE within its rich_location,
+   (for use as extrinsic state by label ranges FIXME).
+
    If RESTRICT_TO_CURRENT_LINE_SPANS is true, then LOC_RANGE is also
    filtered against this layout instance's current line spans: it
    will only be added if the location is fully within the lines
@@ -962,6 +970,7 @@ layout::layout (diagnostic_context * context,
 
 bool
 layout::maybe_add_location_range (const location_range *loc_range,
+				  unsigned original_idx,
 				  bool restrict_to_current_line_spans)
 {
   gcc_assert (loc_range);
@@ -1001,7 +1010,7 @@ layout::maybe_add_location_range (const location_range *loc_range,
   /* Everything is now known to be in the correct source file,
      but it may require further sanitization.  */
   layout_range ri (&start, &finish, loc_range->m_range_display_kind, &caret,
-		   loc_range->m_label);
+		   original_idx, loc_range->m_label);
 
   /* If we have a range that finishes before it starts (perhaps
      from something built via macro expansion), printing the
@@ -1488,7 +1497,7 @@ layout::print_any_labels (linenum_type row)
 	  continue;
 
 	label_text text;
-	text = range->m_label->get_text ();
+	text = range->m_label->get_text (range->m_original_idx);
 
 	/* Allow for labels that return NULL from their get_text
 	   implementation (so e.g. such labels can control their own
@@ -2277,7 +2286,7 @@ gcc_rich_location::add_location_if_nearby (location_t loc)
   location_range loc_range;
   loc_range.m_loc = loc;
   loc_range.m_range_display_kind = SHOW_RANGE_WITHOUT_CARET;
-  if (!layout.maybe_add_location_range (&loc_range, true))
+  if (!layout.maybe_add_location_range (&loc_range, 0, true))
     return false;
 
   add_range (loc);
diff --git a/gcc/gcc-rich-location.h b/gcc/gcc-rich-location.h
index dc11ee8..e55dd76 100644
--- a/gcc/gcc-rich-location.h
+++ b/gcc/gcc-rich-location.h
@@ -109,7 +109,7 @@ class text_range_label : public range_label
  public:
   text_range_label (const char *text) : m_text (text) {}
 
-  label_text get_text () const FINAL OVERRIDE
+  label_text get_text (unsigned /*range_idx*/) const FINAL OVERRIDE
   {
     return label_text (const_cast <char *> (m_text), false);
   }
@@ -155,7 +155,7 @@ class range_label_for_type_mismatch : public range_label
   {
   }
 
-  label_text get_text () const OVERRIDE;
+  label_text get_text (unsigned range_idx) const OVERRIDE;
 
  protected:
   tree m_labelled_type;
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index fd06758..c479dfa 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -1808,8 +1808,10 @@ class range_label
  public:
   virtual ~range_label () {}
 
-  /* Get localized text for the label.  */
-  virtual label_text get_text () const = 0;
+  /* Get localized text for the label.
+     The RANGE_IDX is provided, allowing for range_label instances to be
+     shared by multiple ranges if need be (the "flyweight" design pattern).  */
+  virtual label_text get_text (unsigned range_idx) const = 0;
 };
 
 /* A fix-it hint: a suggested insertion, replacement, or deletion of text.
-- 
1.8.5.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 5/5] gimple-ssa-sprintf.c: rewrite overflow/truncation diagnostic (PR middle-end/77696)
  2018-09-14 17:31 [PATCH 0/5] RFC: gimple-ssa-sprintf.c: a new approach (PR middle-end/77696) David Malcolm
  2018-09-14 17:30 ` [PATCH 1/5] substring-locations: add class format_string_diagnostic_t David Malcolm
@ 2018-09-14 17:31 ` David Malcolm
  2018-09-14 17:31 ` [PATCH 3/5] gimple-ssa-sprintf.c: move struct call_info out of the dom_walker subclass David Malcolm
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: David Malcolm @ 2018-09-14 17:31 UTC (permalink / raw)
  To: msebor; +Cc: gcc-patches, David Malcolm

gcc/ChangeLog:
	PR middle-end/77696
	* gimple-ssa-sprintf.c: Include "gcc-rich-location.h".
	(struct directive_state): New struct.
	(directive_state::get_text): New function.
	(format_result::add_directive): New member function.
	(format_result::should_warn): New field.
	(format_result::m_directives): New field.
	(fmtwarn_n): Delete.
	(maybe_warn): Delete.
	(format_directive): Add param "is_nul_terminator".  Rename locals
	"start" and "length" to "start_idx" and "end_idx" respectively.
	Call res->add_directive.  Eliminate call to maybe_warn in favor of
	calling should_warn_p and updating res->should_warn.
	Eliminate notes.
	(class rich_format_location): New class.
	(rich_format_location::flyweight_range_label::get_text): New
	function.
	(rich_format_location::get_label_for_buffer): New function.
	(rich_format_location::get_label_for_directive): New function.
	(sprintf_dom_walker::compute_format_length): Add param "dst_ptr".
	Retain param "callloc".  Initialize res->should_warn.  Add local
	"is_nul_terminator" and pass to format_directive.  Rather than
	returning at the end of the format string, check for res->should_warn
	and emit warnings using rich_format_location.
	(sprintf_dom_walker::handle_gimple_call): Pass in dst_ptr to
	compute_format_length.

gcc/testsuite/ChangeLog:
	PR middle-end/77696
	* gcc.dg/sprintf-diagnostics-1.c: New test.
	* gcc.dg/sprintf-diagnostics-2.c: New test.
---
 gcc/gimple-ssa-sprintf.c                     | 646 ++++++++++++---------------
 gcc/testsuite/gcc.dg/sprintf-diagnostics-1.c | 252 +++++++++++
 gcc/testsuite/gcc.dg/sprintf-diagnostics-2.c |  57 +++
 3 files changed, 597 insertions(+), 358 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/sprintf-diagnostics-1.c
 create mode 100644 gcc/testsuite/gcc.dg/sprintf-diagnostics-2.c

diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index ab430fe..fb6268c 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -83,6 +83,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "alloc-pool.h"
 #include "vr-values.h"
 #include "gimple-ssa-evrp-analyze.h"
+#include "gcc-rich-location.h"
 
 /* The likely worst case value of MB_LEN_MAX for the target, large enough
    for UTF-8.  Ideally, this would be obtained by a target hook if it were
@@ -128,7 +129,7 @@ class sprintf_dom_walker : public dom_walker
   void after_dom_children (basic_block) FINAL OVERRIDE;
   bool handle_gimple_call (gimple_stmt_iterator *);
 
-  bool compute_format_length (call_info &, format_result *);
+  bool compute_format_length (call_info &, format_result *, tree);
   class evrp_range_analyzer evrp_range_analyzer;
 };
 
@@ -193,10 +194,67 @@ struct result_range
   unsigned HOST_WIDE_INT unlikely;
 };
 
+/* Information about a particular directive (or the NUL terminator)
+   within a format string, for use when emitting warnings.  */
+
+struct directive_state
+{
+  directive_state (size_t fmt_start_idx, size_t fmt_end_idx,
+		   bool is_nul_terminator,
+		   unsigned HOST_WIDE_INT min_size,
+		   unsigned HOST_WIDE_INT max_size)
+  : m_fmt_start_idx (fmt_start_idx), m_fmt_end_idx (fmt_end_idx),
+    m_is_nul_terminator (is_nul_terminator),
+    m_min_size (min_size), m_max_size (max_size)
+  {}
+
+  label_text get_text () const;
+
+  size_t m_fmt_start_idx;
+  size_t m_fmt_end_idx;
+  bool m_is_nul_terminator;
+  unsigned HOST_WIDE_INT m_min_size;
+  unsigned HOST_WIDE_INT m_max_size;
+};
+
+/* Generate a label for the directive within the format string, describing
+   the size consumed by the output, and highlighting the NUL terminator.  */
+
+label_text
+directive_state::get_text () const
+{
+  pretty_printer pp_range;
+  bool singular = pp_humanized_range (&pp_range, m_min_size, m_max_size);
+  pretty_printer pp;
+  // FIXME: i18n for singular/plural
+  if (singular)
+    pp_printf (&pp, G_("%s byte"),
+	       pp_formatted_text (&pp_range));
+  else
+    pp_printf (&pp, G_("%s bytes"),
+	       pp_formatted_text (&pp_range));
+  if (m_is_nul_terminator)
+    pp_string (&pp, G_(" (for NUL terminator)"));
+  return label_text (xstrdup (pp_formatted_text (&pp)), true);
+}
+
 /* The result of a call to a formatted function.  */
 
 struct format_result
 {
+  /* Record a directive from FMT_START_IDX through FMT_END_IDX,
+     and whether it's the nul terminator, consuming between
+     MIN_SIZE and MAX_SIZE bytes when output.  */
+  void add_directive (size_t fmt_start_idx, size_t fmt_end_idx,
+		      bool is_nul_terminator,
+		      unsigned HOST_WIDE_INT min_size,
+		      unsigned HOST_WIDE_INT max_size)
+  {
+    m_directives.safe_push (directive_state (fmt_start_idx, fmt_end_idx,
+					     is_nul_terminator,
+					     min_size, max_size));
+  }
+
   /* Range of characters written by the formatted function.
      Setting the minimum to HOST_WIDE_INT_MAX disables all
      length tracking for the remainder of the format string.  */
@@ -229,6 +287,13 @@ struct format_result
      of a call.  WARNED also disables the return value optimization.  */
   bool warned;
 
+  /* True when we have an overflow or truncation.  */
+  bool should_warn;
+
+  /* The directives we've seen: locations and sizes of output, in case
+     we need to emit a warning.  */
+  auto_vec<directive_state> m_directives;
+
   /* Preincrement the number of output characters by 1.  */
   format_result& operator++ ()
   {
@@ -475,23 +540,6 @@ fmtwarn (const substring_loc &fmt_loc, location_t param_loc,
   return warned;
 }
 
-static bool
-ATTRIBUTE_GCC_DIAG (6, 8) ATTRIBUTE_GCC_DIAG (7, 8)
-fmtwarn_n (const substring_loc &fmt_loc, location_t param_loc,
-	   const char *corrected_substring, int opt, unsigned HOST_WIDE_INT n,
-	   const char *singular_gmsgid, const char *plural_gmsgid, ...)
-{
-  format_string_diagnostic_t diag (fmt_loc, NULL, param_loc, NULL,
-				   corrected_substring);
-  va_list ap;
-  va_start (ap, plural_gmsgid);
-  bool warned = diag.emit_warning_n_va (opt, n, singular_gmsgid, plural_gmsgid,
-					&ap);
-  va_end (ap);
-
-  return warned;
-}
-
 /* Format length modifiers.  */
 
 enum format_lengths
@@ -2394,310 +2442,27 @@ should_warn_p (const call_info &info,
   return true;
 }
 
-/* At format string location describe by DIRLOC in a call described
-   by INFO, issue a warning for a directive DIR whose output may be
-   in excess of the available space AVAIL_RANGE in the destination
-   given the formatting result FMTRES.  This function does nothing
-   except decide whether to issue a warning for a possible write
-   past the end or truncation and, if so, format the warning.
-   Return true if a warning has been issued.  */
-
-static bool
-maybe_warn (substring_loc &dirloc, location_t argloc,
-	    const call_info &info,
-	    const result_range &avail_range, const result_range &res,
-	    const directive &dir)
-{
-  if (!should_warn_p (info, avail_range, res))
-    return false;
-
-  /* A warning will definitely be issued below.  */
-
-  /* The maximum byte count to reference in the warning.  Larger counts
-     imply that the upper bound is unknown (and could be anywhere between
-     RES.MIN + 1 and SIZE_MAX / 2) are printed as "N or more bytes" rather
-     than "between N and X" where X is some huge number.  */
-  unsigned HOST_WIDE_INT maxbytes = target_dir_max ();
-
-  /* True when there is enough room in the destination for the least
-     amount of a directive's output but not enough for its likely or
-     maximum output.  */
-  bool maybe = (res.min <= avail_range.max
-		&& (avail_range.min < res.likely
-		    || (res.max < HOST_WIDE_INT_MAX
-			&& avail_range.min < res.max)));
-
-  /* Buffer for the directive in the host character set (used when
-     the source character set is different).  */
-  char hostdir[32];
-
-  if (avail_range.min == avail_range.max)
-    {
-      /* The size of the destination region is exact.  */
-      unsigned HOST_WIDE_INT navail = avail_range.max;
-
-      if (target_to_host (*dir.beg) != '%')
-	{
-	  /* For plain character directives (i.e., the format string itself)
-	     but not others, point the caret at the first character that's
-	     past the end of the destination.  */
-	  if (navail < dir.len)
-	    dirloc.set_caret_index (dirloc.get_caret_idx () + navail);
-	}
-
-      if (*dir.beg == '\0')
-	{
-	  /* This is the terminating nul.  */
-	  gcc_assert (res.min == 1 && res.min == res.max);
-
-	  return fmtwarn (dirloc, UNKNOWN_LOCATION, NULL, info.warnopt (),
-			  info.bounded
-			  ? (maybe
-			     ? G_("%qE output may be truncated before the "
-				  "last format character")
-			     : G_("%qE output truncated before the last "
-				  "format character"))
-			  : (maybe
-			     ? G_("%qE may write a terminating nul past the "
-				  "end of the destination")
-			     : G_("%qE writing a terminating nul past the "
-				  "end of the destination")),
-			  info.func);
-	}
-
-      if (res.min == res.max)
-	{
-	  const char *d = target_to_host (hostdir, sizeof hostdir, dir.beg);
-	  if (!info.bounded)
-	    return fmtwarn_n (dirloc, argloc, NULL, info.warnopt (), res.min,
-			      "%<%.*s%> directive writing %wu byte into a "
-			      "region of size %wu",
-			      "%<%.*s%> directive writing %wu bytes into a "
-			      "region of size %wu",
-			      (int) dir.len, d, res.min, navail);
-	  else if (maybe)
-	    return fmtwarn_n (dirloc, argloc, NULL, info.warnopt (), res.min,
-			      "%<%.*s%> directive output may be truncated "
-			      "writing %wu byte into a region of size %wu",
-			      "%<%.*s%> directive output may be truncated "
-			      "writing %wu bytes into a region of size %wu",
-			      (int) dir.len, d, res.min, navail);
-	  else
-	    return fmtwarn_n (dirloc, argloc, NULL, info.warnopt (), res.min,
-			      "%<%.*s%> directive output truncated writing "
-			      "%wu byte into a region of size %wu",
-			      "%<%.*s%> directive output truncated writing "
-			      "%wu bytes into a region of size %wu",
-			      (int) dir.len, d, res.min, navail);
-	}
-      if (res.min == 0 && res.max < maxbytes)
-	return fmtwarn (dirloc, argloc, NULL,
-			info.warnopt (),
-			info.bounded
-			? (maybe
-			   ? G_("%<%.*s%> directive output may be truncated "
-				"writing up to %wu bytes into a region of "
-				"size %wu")
-			   : G_("%<%.*s%> directive output truncated writing "
-				"up to %wu bytes into a region of size %wu"))
-			: G_("%<%.*s%> directive writing up to %wu bytes "
-			     "into a region of size %wu"), (int) dir.len,
-			target_to_host (hostdir, sizeof hostdir, dir.beg),
-			res.max, navail);
-
-      if (res.min == 0 && maxbytes <= res.max)
-	/* This is a special case to avoid issuing the potentially
-	   confusing warning:
-	     writing 0 or more bytes into a region of size 0.  */
-	return fmtwarn (dirloc, argloc, NULL, info.warnopt (),
-			info.bounded
-			? (maybe
-			   ? G_("%<%.*s%> directive output may be truncated "
-				"writing likely %wu or more bytes into a "
-				"region of size %wu")
-			   : G_("%<%.*s%> directive output truncated writing "
-				"likely %wu or more bytes into a region of "
-				"size %wu"))
-			: G_("%<%.*s%> directive writing likely %wu or more "
-			     "bytes into a region of size %wu"), (int) dir.len,
-			target_to_host (hostdir, sizeof hostdir, dir.beg),
-			res.likely, navail);
-
-      if (res.max < maxbytes)
-	return fmtwarn (dirloc, argloc, NULL, info.warnopt (),
-			info.bounded
-			? (maybe
-			   ? G_("%<%.*s%> directive output may be truncated "
-				"writing between %wu and %wu bytes into a "
-				"region of size %wu")
-			   : G_("%<%.*s%> directive output truncated "
-				"writing between %wu and %wu bytes into a "
-				"region of size %wu"))
-			: G_("%<%.*s%> directive writing between %wu and "
-			     "%wu bytes into a region of size %wu"),
-			(int) dir.len,
-			target_to_host (hostdir, sizeof hostdir, dir.beg),
-			res.min, res.max, navail);
-
-      return fmtwarn (dirloc, argloc, NULL, info.warnopt (),
-		      info.bounded
-		      ? (maybe
-			 ? G_("%<%.*s%> directive output may be truncated "
-			      "writing %wu or more bytes into a region of "
-			      "size %wu")
-			 : G_("%<%.*s%> directive output truncated writing "
-			      "%wu or more bytes into a region of size %wu"))
-		      : G_("%<%.*s%> directive writing %wu or more bytes "
-			   "into a region of size %wu"), (int) dir.len,
-		      target_to_host (hostdir, sizeof hostdir, dir.beg),
-		      res.min, navail);
-    }
-
-  /* The size of the destination region is a range.  */
-
-  if (target_to_host (*dir.beg) != '%')
-    {
-      unsigned HOST_WIDE_INT navail = avail_range.max;
-
-      /* For plain character directives (i.e., the format string itself)
-	 but not others, point the caret at the first character that's
-	 past the end of the destination.  */
-      if (navail < dir.len)
-	dirloc.set_caret_index (dirloc.get_caret_idx () + navail);
-    }
-
-  if (*dir.beg == '\0')
-    {
-      gcc_assert (res.min == 1 && res.min == res.max);
-
-      return fmtwarn (dirloc, UNKNOWN_LOCATION, NULL, info.warnopt (),
-		      info.bounded
-		      ? (maybe
-			 ? G_("%qE output may be truncated before the last "
-			      "format character")
-			 : G_("%qE output truncated before the last format "
-			      "character"))
-		      : (maybe
-			 ? G_("%qE may write a terminating nul past the end "
-			      "of the destination")
-			 : G_("%qE writing a terminating nul past the end "
-			      "of the destination")), info.func);
-    }
-
-  if (res.min == res.max)
-    {
-      const char *d = target_to_host (hostdir, sizeof hostdir, dir.beg);
-      if (!info.bounded)
-	return fmtwarn_n (dirloc, argloc, NULL, info.warnopt (), res.min,
-			  "%<%.*s%> directive writing %wu byte into a region "
-			  "of size between %wu and %wu",
-			  "%<%.*s%> directive writing %wu bytes into a region "
-			  "of size between %wu and %wu", (int) dir.len, d,
-			  res.min, avail_range.min, avail_range.max);
-      else if (maybe)
-	return fmtwarn_n (dirloc, argloc, NULL, info.warnopt (), res.min,
-			  "%<%.*s%> directive output may be truncated writing "
-			  "%wu byte into a region of size between %wu and %wu",
-			  "%<%.*s%> directive output may be truncated writing "
-			  "%wu bytes into a region of size between %wu and "
-			  "%wu", (int) dir.len, d, res.min, avail_range.min,
-			  avail_range.max);
-      else
-	return fmtwarn_n (dirloc, argloc, NULL, info.warnopt (), res.min,
-			  "%<%.*s%> directive output truncated writing %wu "
-			  "byte into a region of size between %wu and %wu",
-			  "%<%.*s%> directive output truncated writing %wu "
-			  "bytes into a region of size between %wu and %wu",
-			  (int) dir.len, d, res.min, avail_range.min,
-			  avail_range.max);
-    }
-
-  if (res.min == 0 && res.max < maxbytes)
-    return fmtwarn (dirloc, argloc, NULL, info.warnopt (),
-		    info.bounded
-		    ? (maybe
-		       ? G_("%<%.*s%> directive output may be truncated "
-			    "writing up to %wu bytes into a region of size "
-			    "between %wu and %wu")
-		       : G_("%<%.*s%> directive output truncated writing "
-			    "up to %wu bytes into a region of size between "
-			    "%wu and %wu"))
-		    : G_("%<%.*s%> directive writing up to %wu bytes "
-			 "into a region of size between %wu and %wu"),
-		    (int) dir.len,
-		    target_to_host (hostdir, sizeof hostdir, dir.beg),
-		    res.max, avail_range.min, avail_range.max);
-
-  if (res.min == 0 && maxbytes <= res.max)
-    /* This is a special case to avoid issuing the potentially confusing
-       warning:
-	 writing 0 or more bytes into a region of size between 0 and N.  */
-    return fmtwarn (dirloc, argloc, NULL, info.warnopt (),
-		    info.bounded
-		    ? (maybe
-		       ? G_("%<%.*s%> directive output may be truncated "
-			    "writing likely %wu or more bytes into a region "
-			    "of size between %wu and %wu")
-		       : G_("%<%.*s%> directive output truncated writing "
-			    "likely %wu or more bytes into a region of size "
-			    "between %wu and %wu"))
-		    : G_("%<%.*s%> directive writing likely %wu or more bytes "
-			 "into a region of size between %wu and %wu"),
-		    (int) dir.len,
-		    target_to_host (hostdir, sizeof hostdir, dir.beg),
-		    res.likely, avail_range.min, avail_range.max);
-
-  if (res.max < maxbytes)
-    return fmtwarn (dirloc, argloc, NULL, info.warnopt (),
-		    info.bounded
-		    ? (maybe
-		       ? G_("%<%.*s%> directive output may be truncated "
-			    "writing between %wu and %wu bytes into a region "
-			    "of size between %wu and %wu")
-		       : G_("%<%.*s%> directive output truncated writing "
-			    "between %wu and %wu bytes into a region of size "
-			    "between %wu and %wu"))
-		    : G_("%<%.*s%> directive writing between %wu and "
-			 "%wu bytes into a region of size between %wu and "
-			 "%wu"), (int) dir.len,
-		    target_to_host (hostdir, sizeof hostdir, dir.beg),
-		    res.min, res.max, avail_range.min, avail_range.max);
-
-  return fmtwarn (dirloc, argloc, NULL, info.warnopt (),
-		  info.bounded
-		  ? (maybe
-		     ? G_("%<%.*s%> directive output may be truncated writing "
-			  "%wu or more bytes into a region of size between "
-			  "%wu and %wu")
-		     : G_("%<%.*s%> directive output truncated writing "
-			  "%wu or more bytes into a region of size between "
-			  "%wu and %wu"))
-		  : G_("%<%.*s%> directive writing %wu or more bytes "
-		       "into a region of size between %wu and %wu"),
-		  (int) dir.len,
-		  target_to_host (hostdir, sizeof hostdir, dir.beg),
-		  res.min, avail_range.min, avail_range.max);
-}
-
 /* Compute the length of the output resulting from the directive DIR
    in a call described by INFO and update the overall result of the call
-   in *RES.  Return true if the directive has been handled.  */
+   in *RES.  Return true if the directive has been handled.
+   IS_NUL_TERMINATOR is true iff the "directive" is the NUL terminator.  */
 
 static bool
 format_directive (const call_info &info,
 		  format_result *res, const directive &dir,
-		  class vr_values *vr_values)
+		  class vr_values *vr_values,
+		  bool is_nul_terminator)
 {
   /* Offset of the beginning of the directive from the beginning
      of the format string.  */
   size_t offset = dir.beg - info.fmtstr;
-  size_t start = offset;
-  size_t length = offset + dir.len - !!dir.len;
+  size_t start_idx = offset;
+  size_t end_idx = offset + dir.len - !!dir.len;
 
   /* Create a location for the whole directive from the % to the format
      specifier.  */
   substring_loc dirloc (info.fmtloc, TREE_TYPE (info.format),
-			offset, start, length);
+			offset, start_idx, end_idx);
 
   /* Also get the location of the argument if possible.
      This doesn't work for integer literals or function calls.  */
@@ -2713,6 +2478,10 @@ format_directive (const call_info &info,
   /* Compute the range of lengths of the formatted output.  */
   fmtresult fmtres = dir.fmtfunc (dir, dir.arg, vr_values);
 
+  // (FIXME: or maybe record the whole of fmtres?)
+  res->add_directive (start_idx, end_idx, is_nul_terminator,
+		      fmtres.range.min, fmtres.range.max);
+
   /* Record whether the output of all directives is known to be
      bounded by some maximum, implying that their arguments are
      either known exactly or determined to be in a known range
@@ -2776,11 +2545,8 @@ format_directive (const call_info &info,
      NUL that's appended after the format string has been processed.  */
   result_range avail_range = bytes_remaining (info.objsize, *res);
 
-  bool warned = res->warned;
-
-  if (!warned)
-    warned = maybe_warn (dirloc, argloc, info, avail_range,
-			 fmtres.range, dir);
+  if (should_warn_p (info, avail_range, fmtres.range))
+    res->should_warn = true;
 
   /* Bump up the total maximum if it isn't too big.  */
   if (res->range.max < HOST_WIDE_INT_MAX
@@ -2814,6 +2580,8 @@ format_directive (const call_info &info,
   if (fmtres.mayfail)
     res->posunder4k = false;
 
+  bool warned = res->warned;
+
   if (!warned
       /* Only warn at level 2.  */
       && warn_level > 1
@@ -2907,39 +2675,6 @@ format_directive (const call_info &info,
 
   res->warned |= warned;
 
-  if (!dir.beg[0] && res->warned && info.objsize < HOST_WIDE_INT_MAX)
-    {
-      /* If a warning has been issued for buffer overflow or truncation
-	 (but not otherwise) help the user figure out how big a buffer
-	 they need.  */
-
-      location_t callloc = gimple_location (info.callstmt);
-
-      unsigned HOST_WIDE_INT min = res->range.min;
-      unsigned HOST_WIDE_INT max = res->range.max;
-
-      if (min == max)
-	inform (callloc,
-		(min == 1
-		 ? G_("%qE output %wu byte into a destination of size %wu")
-		 : G_("%qE output %wu bytes into a destination of size %wu")),
-		info.func, min, info.objsize);
-      else if (max < HOST_WIDE_INT_MAX)
-	inform (callloc,
-		"%qE output between %wu and %wu bytes into "
-		"a destination of size %wu",
-		info.func, min, max, info.objsize);
-      else if (min < res->range.likely && res->range.likely < max)
-	inform (callloc,
-		"%qE output %wu or more bytes (assuming %wu) into "
-		"a destination of size %wu",
-		info.func, min, res->range.likely, info.objsize);
-      else
-	inform (callloc,
-		"%qE output %wu or more bytes into a destination of size %wu",
-		info.func, min, info.objsize);
-    }
-
   if (dump_file && *dir.beg)
     {
       fprintf (dump_file,
@@ -3396,20 +3131,109 @@ parse_directive (call_info &info,
   return dir.len;
 }
 
+/* Subclass of gcc_rich_location for emitting warnings about sprintf and
+   snprintf.  */
+
+class rich_format_location : public gcc_rich_location
+{
+ public:
+  rich_format_location (location_t dst_loc, const call_info &info,
+			const format_result &res)
+  : gcc_rich_location (dst_loc, &m_label),
+    m_info (info), m_res (res),
+    m_label (*this)
+  {
+    unsigned i;
+    directive_state *dir_state;
+    FOR_EACH_VEC_ELT (m_res.m_directives, i, dir_state)
+      {
+	substring_loc dirloc (m_info.fmtloc, TREE_TYPE (m_info.format),
+			      dir_state->m_fmt_start_idx,
+			      dir_state->m_fmt_start_idx,
+			      dir_state->m_fmt_end_idx);
+	location_t loc = UNKNOWN_LOCATION;
+	dirloc.get_location (&loc);
+	add_range (loc, SHOW_RANGE_WITH_CARET, &m_label);
+      }
+  }
+
+ private:
+  /* Subclass of range_label for labelling all of the various ranges.  */
+  class flyweight_range_label : public range_label
+  {
+  public:
+    flyweight_range_label (const rich_format_location &rich_loc)
+    : m_rich_loc (rich_loc) {}
+
+    label_text get_text (unsigned range_idx) const FINAL OVERRIDE;
+  private:
+    const rich_format_location &m_rich_loc;
+  };
+  friend class flyweight_range_label;
+
+  label_text get_label_for_buffer () const;
+  label_text get_label_for_directive (unsigned dir_idx) const;
+
+  const call_info &m_info;
+  const format_result &m_res;
+  flyweight_range_label m_label;
+};
+
+/* Implementation of range_label::get_text for
+   rich_format_location::flyweight_range_label.
+   Handle all of the various labels with one instance by dispatching based on
+   RANGE_IDX.  */
+
+label_text
+rich_format_location::flyweight_range_label::get_text (unsigned range_idx) const
+{
+  if (range_idx == 0)
+    return m_rich_loc.get_label_for_buffer ();
+  else
+    return m_rich_loc.get_label_for_directive (range_idx - 1);
+}
+
+/* Get a label for the underlined destination buffer, showing the capacity of
+   the buffer.  */
+
+label_text
+rich_format_location::get_label_for_buffer () const
+{
+  pretty_printer pp;
+  // FIXME: i18n for singular/plural
+  if (m_info.objsize == 1)
+    pp_printf (&pp, G_("capacity: %wu byte"), m_info.objsize);
+  else
+    pp_printf (&pp, G_("capacity: %wu bytes"), m_info.objsize);
+  return label_text (xstrdup (pp_formatted_text (&pp)), true);
+}
+
+/* Get a label for the underlined directive, showing the size of the output
+   from that directive.  */
+
+label_text
+rich_format_location::get_label_for_directive (unsigned dir_idx) const
+{
+  const directive_state &dir_state = m_res.m_directives[dir_idx];
+  return dir_state.get_text ();
+}
+
 /* Compute the length of the output resulting from the call to a formatted
    output function described by INFO and store the result of the call in
    *RES.  Issue warnings for detected past the end writes.  Return true
    if the complete format string has been processed and *RES can be relied
    on, false otherwise (e.g., when a unknown or unhandled directive was seen
-   that caused the processing to be terminated early).  */
+   that caused the processing to be terminated early).  DST_PTR is the destination
+   of the output.  */
 
 bool
 sprintf_dom_walker::compute_format_length (call_info &info,
-					   format_result *res)
+					   format_result *res,
+					   tree dst_ptr)
 {
+  location_t callloc = gimple_location (info.callstmt);
   if (dump_file)
     {
-      location_t callloc = gimple_location (info.callstmt);
       fprintf (dump_file, "%s:%i: ",
 	       LOCATION_FILE (callloc), LOCATION_LINE (callloc));
       print_generic_expr (dump_file, info.func, dump_flags);
@@ -3430,6 +3254,7 @@ sprintf_dom_walker::compute_format_length (call_info &info,
   res->posunder4k = true;
   res->floating = false;
   res->warned = false;
+  res->should_warn = false;
 
   /* 1-based directive counter.  */
   unsigned dirno = 1;
@@ -3437,6 +3262,8 @@ sprintf_dom_walker::compute_format_length (call_info &info,
   /* The variadic argument counter.  */
   unsigned argno = info.argidx;
 
+  bool is_nul_terminator = false;
+
   for (const char *pf = info.fmtstr; ; ++dirno)
     {
       directive dir = directive ();
@@ -3445,23 +3272,126 @@ sprintf_dom_walker::compute_format_length (call_info &info,
       size_t n = parse_directive (info, dir, res, pf, &argno,
 				  evrp_range_analyzer.get_vr_values ());
 
+      is_nul_terminator = !n && *pf == '\0';
+
+      fmtresult fmtres;
+
       /* Return failure if the format function fails.  */
       if (!format_directive (info, res, dir,
-			     evrp_range_analyzer.get_vr_values ()))
+			     evrp_range_analyzer.get_vr_values (),
+			     is_nul_terminator))
 	return false;
 
-      /* Return success the directive is zero bytes long and it's
-	 the last think in the format string (i.e., it's the terminating
-	 nul, which isn't really a directive but handling it as one makes
-	 things simpler).  */
+      /* Stop at the end of the format string.  */
       if (!n)
-	return *pf == '\0';
+	break;
 
       pf += n;
     }
 
-  /* The complete format string was processed (with or without warnings).  */
-  return true;
+  if (res->should_warn && info.objsize < HOST_WIDE_INT_MAX)
+    {
+      unsigned HOST_WIDE_INT min = res->range.min;
+      unsigned HOST_WIDE_INT max = res->range.max;
+      bool warned;
+
+      /* Build the rich location for diagnostics.  */
+      location_t dst_loc = UNKNOWN_LOCATION;
+      if (dst_ptr != NULL_TREE && EXPR_HAS_LOCATION (dst_ptr))
+	dst_loc = EXPR_LOCATION (dst_ptr);
+      location_t primary_loc;
+      if (dst_loc != UNKNOWN_LOCATION)
+	primary_loc = dst_loc;
+      else
+	primary_loc = get_start (callloc);
+      rich_format_location rich_loc (primary_loc, info, *res);
+
+      /* Truncation vs buffer overflow.  */
+      if (info.bounded)
+	{
+	  /* Emit a "truncated output" warning.  */
+	  if (min == max)
+	    warned = warning_at
+	      (&rich_loc, info.warnopt (),
+	       (min == 1
+		? G_("%qE will truncate the output (%wu byte) to size %wu")
+		: G_("%qE will truncate the output (%wu bytes) to size %wu")),
+	       info.func, min, info.objsize);
+	  else if (max < HOST_WIDE_INT_MAX)
+	    warned = warning_at
+	      (&rich_loc, info.warnopt (),
+	       "%qE will truncate the output (between %wu and %wu bytes) "
+	       "to size %wu",
+	       info.func, min, max, info.objsize);
+	  else if (min < res->range.likely && res->range.likely < max)
+	    warned = warning_at
+	      (&rich_loc, info.warnopt (),
+	       "%qE will truncate the output (%wu or more bytes, assuming %wu) "
+	       "to size %wu",
+	       info.func, min, res->range.likely, info.objsize);
+	  else
+	    warned = warning_at
+	      (&rich_loc, info.warnopt (),
+	       "FIXME: the other truncation case for %qE) ",
+	       info.func);
+	  // FIXME: what's the other case here?
+
+	  // FIXME: what about wrong size limit?
+	  // FIXME: should we highlight where the size comes for sizeof exprs?
+	}
+      else
+	{
+	  /* Emit a "buffer overflow" warning.  */
+	  auto_diagnostic_group d;
+	  if (min == max)
+	    warned = warning_at
+	      (&rich_loc, info.warnopt (),
+	       (min == 1
+		? G_("buffer overflow: %qE will write %wu byte"
+		     " into a destination of size %wu")
+		: G_("buffer overflow: %qE will write %wu bytes"
+		     " into a destination of size %wu")),
+	       info.func, min, info.objsize);
+	  else if (max < HOST_WIDE_INT_MAX)
+	    warned = warning_at
+	      (&rich_loc, info.warnopt (),
+	       "buffer overflow: %qE will write between %wu and %wu bytes into "
+	       "a destination of size %wu",
+	       info.func, min, max, info.objsize);
+	  else if (min < res->range.likely && res->range.likely < max)
+	    warned = warning_at
+	      (&rich_loc, info.warnopt (),
+	       "buffer overflow: %qE will write %wu or more bytes"
+	       " (assuming %wu) into a destination of size %wu",
+	       info.func, min, res->range.likely, info.objsize);
+	  else
+	    warned = warning_at
+	      (&rich_loc, info.warnopt (),
+	       "buffer overflow: %qE will write %wu or more bytes"
+	       " into a destination of size %wu",
+	       info.func, min, info.objsize);
+	}
+      if (warned)
+	{
+	  /* If possible, emit a note showing the declaration of the
+	     destination buffer.  */
+	  // FIXME: possibly use "if nearby" to add as another location to main warning?
+	  if (TREE_CODE (dst_ptr) == ADDR_EXPR)
+	    {
+	      tree pt_var = TREE_OPERAND (dst_ptr, 0);
+	      if (DECL_P (pt_var))
+		inform (DECL_SOURCE_LOCATION (pt_var),
+			"destination declared here");
+	    }
+	}
+    }
+
+  /* Return success if the final "directive" is zero bytes long and it's
+     the last think in the format string (i.e., it's the terminating
+     nul, which isn't really a directive but handling it as one makes
+     things simpler).
+     This isn't affected by whether we emitted warnings.  */
+  return is_nul_terminator;
 }
 
 /* Return the size of the object referenced by the expression DEST if
@@ -3932,7 +3862,7 @@ sprintf_dom_walker::handle_gimple_call (gimple_stmt_iterator *gsi)
      including the terminating NUL.  */
   format_result res = format_result ();
 
-  bool success = compute_format_length (info, &res);
+  bool success = compute_format_length (info, &res, dstptr);
 
   /* When optimizing and the printf return value optimization is enabled,
      attempt to substitute the computed result for the return value of
diff --git a/gcc/testsuite/gcc.dg/sprintf-diagnostics-1.c b/gcc/testsuite/gcc.dg/sprintf-diagnostics-1.c
new file mode 100644
index 0000000..c807b2d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/sprintf-diagnostics-1.c
@@ -0,0 +1,252 @@
+/* { dg-do compile } */
+/* { dg-options "-Wformat-overflow=2 -Wformat-truncation=2 -O2 -fdiagnostics-show-caret" } */
+
+typedef __SIZE_TYPE__ size_t;
+extern int sprintf (char*, const char*, ...);
+extern int snprintf (char*, size_t, const char*, ...);
+
+/* Bounded, definite truncation in a directive.  */
+
+void test_1 (int i)
+{
+  char buf_1[4]; /* { dg-message "destination declared here" } */
+  snprintf (buf_1, sizeof buf_1, "%i", 1235); /* { dg-warning "'snprintf' will truncate the output \\(5 bytes\\) to size 4" } */
+  /* { dg-begin-multiline-output "" }
+   snprintf (buf_1, sizeof buf_1, "%i", 1235);
+             ^~~~~                 ^~^
+             |                     | |
+             |                     | 1 byte (for NUL terminator)
+             capacity: 4 bytes     4 bytes
+     { dg-end-multiline-output "" } */
+  /* { dg-begin-multiline-output "" }
+   char buf_1[4];
+        ^~~~~
+     { dg-end-multiline-output "" } */
+}
+
+/* Bounded, definite truncation copying format string.  */
+
+void test_2 (int i)
+{
+  char buf_2[4]; /* { dg-message "destination declared here" } */
+  snprintf (buf_2, sizeof buf_2, "%iAB", 123); /* { dg-warning "'snprintf' will truncate the output \\(6 bytes\\) to size 4" } */
+  /* { dg-begin-multiline-output "" }
+   snprintf (buf_2, sizeof buf_2, "%iAB", 123);
+             ^~~~~                 ^~^~^
+             |                     | | |
+             |                     | | 1 byte (for NUL terminator)
+             |                     | 2 bytes
+             capacity: 4 bytes     3 bytes
+     { dg-end-multiline-output "" } */
+  /* { dg-begin-multiline-output "" }
+   char buf_2[4];
+        ^~~~~
+     { dg-end-multiline-output "" } */
+}
+
+/* unbounded, definite overflow in a directive.  */
+
+void test_3 (int i)
+{
+  char buf_3[4]; /* { dg-message "destination declared here" } */
+  sprintf (buf_3, "%i", 1235); /* { dg-warning "buffer overflow: 'sprintf' will write 5 bytes into a destination of size 4" } */
+  /* { dg-begin-multiline-output "" }
+   sprintf (buf_3, "%i", 1235);
+            ^~~~~   ^~^
+            |       | |
+            |       | 1 byte (for NUL terminator)
+            |       4 bytes
+            capacity: 4 bytes
+     { dg-end-multiline-output "" } */
+  /* { dg-begin-multiline-output "" }
+   char buf_3[4];
+        ^~~~~
+     { dg-end-multiline-output "" } */
+}
+
+/* unbounded, definite overflow copying format string.  */
+
+void test_4 (int i)
+{
+  char buf_4[4]; /* { dg-message "destination declared here" } */
+  sprintf (buf_4, "%iAB", 123); /* { dg-warning "buffer overflow: 'sprintf' will write 6 bytes into a destination of size 4" } */
+  /* { dg-begin-multiline-output "" }
+   sprintf (buf_4, "%iAB", 123);
+            ^~~~~   ^~^~^
+            |       | | |
+            |       | | 1 byte (for NUL terminator)
+            |       | 2 bytes
+            |       3 bytes
+            capacity: 4 bytes
+     { dg-end-multiline-output "" } */
+  /* { dg-begin-multiline-output "" }
+   char buf_4[4];
+        ^~~~~
+     { dg-end-multiline-output "" } */
+}
+
+/* bounded, possible truncation a directve.  */
+
+void test_5 (int i)
+{
+  char buf_5[4]; /* { dg-message "destination declared here" } */
+  snprintf (buf_5, sizeof buf_5, "%i", i); /* { dg-warning "'snprintf' will truncate the output \\(between 2 and 12 bytes\\) to size 4" } */
+  /* { dg-begin-multiline-output "" }
+   snprintf (buf_5, sizeof buf_5, "%i", i);
+             ^~~~~                 ^~^
+             |                     | |
+             |                     | 1 byte (for NUL terminator)
+             capacity: 4 bytes     1...11 bytes
+     { dg-end-multiline-output "" } */
+  /* { dg-begin-multiline-output "" }
+   char buf_5[4];
+        ^~~~~
+     { dg-end-multiline-output "" } */
+}
+
+/* bounded, possible overflow copying format string.  */
+
+void test_6 (int i)
+{
+  char buf_6[4]; /* { dg-message "destination declared here" } */
+  snprintf (buf_6, sizeof buf_6, "%iAB", i); /* { dg-warning "'snprintf' will truncate the output \\(between 4 and 14 bytes\\) to size 4" } */
+  /* { dg-begin-multiline-output "" }
+   snprintf (buf_6, sizeof buf_6, "%iAB", i);
+             ^~~~~                 ^~^~^
+             |                     | | |
+             |                     | | 1 byte (for NUL terminator)
+             |                     | 2 bytes
+             capacity: 4 bytes     1...11 bytes
+     { dg-end-multiline-output "" } */
+  /* { dg-begin-multiline-output "" }
+   char buf_6[4];
+        ^~~~~
+     { dg-end-multiline-output "" } */
+}
+
+/* unbounded, possible overflow in a directve.  */
+
+void test_7 (int i)
+{
+  char buf_7[4]; /* { dg-message "destination declared here" } */
+  sprintf (buf_7, "%i", i); /* { dg-warning "buffer overflow: 'sprintf' will write between 2 and 12 bytes into a destination of size 4" } */
+  /* { dg-begin-multiline-output "" }
+   sprintf (buf_7, "%i", i);
+            ^~~~~   ^~^
+            |       | |
+            |       | 1 byte (for NUL terminator)
+            |       1...11 bytes
+            capacity: 4 bytes
+     { dg-end-multiline-output "" } */
+  /* { dg-begin-multiline-output "" }
+   char buf_7[4];
+        ^~~~~
+     { dg-end-multiline-output "" } */
+}
+
+/* unbounded, possible overflow copying format string.  */
+
+void test_8 (int i)
+{
+  char buf_8[4];
+  sprintf (buf_8, "%iAB", 123);
+  /* FIXME: why isn't this generating a warning?  */
+}
+
+/* unbounded, possible overflow copying format string.  */
+
+void test_9 (int i)
+{
+  char buf_9[4]; /* { dg-message "destination declared here" } */
+  const char *s = i ? "123" : "1234";
+  sprintf (buf_9, "%sAB", s); /* { dg-warning "buffer overflow: 'sprintf' will write between 6 and 7 bytes into a destination of size 4" } */
+  /* { dg-begin-multiline-output "" }
+   sprintf (buf_9, "%sAB", s);
+            ^~~~~   ^~^~^
+            |       | | |
+            |       | | 1 byte (for NUL terminator)
+            |       | 2 bytes
+            |       3...4 bytes
+            capacity: 4 bytes
+     { dg-end-multiline-output "" } */
+  /* { dg-begin-multiline-output "" }
+   char buf_9[4];
+        ^~~~~
+     { dg-end-multiline-output "" } */
+}
+
+extern char buf_10[80];
+extern char tmpdir[80];
+extern char fname[8];
+
+void test_10 (int num)
+{
+  sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num); /* { dg-warning "buffer overflow: 'sprintf' will write between 9 and 105 bytes into a destination of size 80" } */
+  /* { dg-begin-multiline-output "" }
+   sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
+            ^~~~~~   ^^~^^~^^~^~~~^
+            |        || || || |   |
+            |        || || || |   1 byte (for NUL terminator)
+            |        || || || 4 bytes
+            |        || || |1...11 bytes
+            |        || || 1 byte
+            |        || |0...7 bytes
+            |        || 1 byte
+            |        |0...79 bytes
+            |        1 byte
+            capacity: 80 bytes
+     { dg-end-multiline-output "" } */
+  /* { dg-begin-multiline-output "" }
+ extern char buf_10[80];
+             ^~~~~~
+     { dg-end-multiline-output "" } */
+}
+
+struct MyStrings {
+  char a[8], b[20];
+};
+
+const struct MyStrings ms[] = {
+  { "foo", "bar" }, { "abcd", "klmno" }
+};
+
+void test_11 (void)
+{
+  char buf_11[6];
+  sprintf (buf_11, "msg: %s\n", ms[1].b); /* { dg-warning "buffer overflow: 'sprintf' will write 12 bytes into a destination of size 6" } */
+  /* { dg-begin-multiline-output "" }
+   sprintf (buf_11, "msg: %s\n", ms[1].b);
+            ^~~~~~   ^~~~~^~^~^
+            |        |    | | |
+            |        |    | | 1 byte (for NUL terminator)
+            |        |    | 1 byte
+            |        |    5 bytes
+            |        5 bytes
+            capacity: 6 bytes
+     { dg-end-multiline-output "" } */
+  /* { dg-begin-multiline-output "" }
+   char buf_11[6];
+        ^~~~~~
+     { dg-end-multiline-output "" } */
+}
+
+void test_12 (int idx)
+{
+  char buf_12[6];
+  sprintf (buf_12, "msg: %s\n", ms[idx].b); /* { dg-warning "buffer overflow: 'sprintf' will write 7 or more bytes \\(assuming 8\\) into a destination of size 6" } */
+  // FIXME: this has a 64-bit assumption
+  /* { dg-begin-multiline-output "" }
+   sprintf (buf_12, "msg: %s\n", ms[idx].b);
+            ^~~~~~   ^~~~~^~^~^
+            |        |    | | |
+            |        |    | | 1 byte (for NUL terminator)
+            |        |    | 1 byte
+            |        |    0...(1<<63)-1 bytes
+            |        5 bytes
+            capacity: 6 bytes
+     { dg-end-multiline-output "" } */
+  /* { dg-begin-multiline-output "" }
+   char buf_12[6];
+        ^~~~~~
+     { dg-end-multiline-output "" } */
+}
diff --git a/gcc/testsuite/gcc.dg/sprintf-diagnostics-2.c b/gcc/testsuite/gcc.dg/sprintf-diagnostics-2.c
new file mode 100644
index 0000000..69482ee
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/sprintf-diagnostics-2.c
@@ -0,0 +1,57 @@
+/* TODO: finish these test cases.  */
+
+/* { dg-do compile } */
+/* { dg-options "-Wformat-overflow=2 -Wformat-truncation=2 -O2 -fdiagnostics-show-caret" } */
+
+typedef __SIZE_TYPE__ size_t;
+extern int sprintf (char*, const char*, ...);
+extern int snprintf (char*, size_t, const char*, ...);
+
+void test_13 (const char *msg)
+{
+  char buf[16];
+
+  const char *fmt = "msg: %s\n";
+  sprintf (buf, fmt, msg);
+  /* { dg-begin-multiline-output "" }
+     { dg-end-multiline-output "" } */
+}
+
+void test_14 (void)
+{
+#define INT_FMT "%i"
+  char buf_14[4]; /* { dg-message "destination declared here" } */
+  sprintf (buf_14, INT_FMT "AB", 123);
+  /* { dg-begin-multiline-output "" }
+     { dg-end-multiline-output "" } */
+}
+
+void test_15 (void)
+{
+#define FMT "%i"
+  char buf[16];
+  sprintf (buf, "i: " FMT " j: " FMT " k: " FMT "\n", 1066, 1215, 1649);
+  /* { dg-begin-multiline-output "" }
+     { dg-end-multiline-output "" } */
+}
+
+void test_non_contiguous_strings (void)
+{
+  char buf[4]; /* { dg-message "destination declared here" } */
+  sprintf(buf, " %" "i ", 65536);
+  /* { dg-begin-multiline-output "" }
+     { dg-end-multiline-output "" } */
+}
+
+void test_non_contiguous_strings_2 (void)
+{
+  char buf[4]; /* { dg-message "destination declared here" } */
+  sprintf(buf, " %"
+	  "i ", 65536);
+  /* FIXME: looks like ICF merges this with the above and gets the wrong location.  */
+  /* { dg-begin-multiline-output "" }
+     { dg-end-multiline-output "" } */
+}
+
+/* FIXME: see gcc.dg/format/diagnostic-ranges.c
+   FIXME: try with C++.  */
-- 
1.8.5.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 4/5] Add pp_humanized_limit and pp_humanized_range
  2018-09-14 17:31 [PATCH 0/5] RFC: gimple-ssa-sprintf.c: a new approach (PR middle-end/77696) David Malcolm
                   ` (3 preceding siblings ...)
  2018-09-14 17:31 ` [PATCH 2/5] Add range_idx param to range_label::get_text David Malcolm
@ 2018-09-14 17:36 ` David Malcolm
  2018-09-18 15:43   ` Martin Sebor
  2018-09-17 21:16 ` [PATCH 0/5] RFC: gimple-ssa-sprintf.c: a new approach (PR middle-end/77696) Martin Sebor
  5 siblings, 1 reply; 8+ messages in thread
From: David Malcolm @ 2018-09-14 17:36 UTC (permalink / raw)
  To: msebor; +Cc: gcc-patches, David Malcolm

gcc/ChangeLog:
	* pretty-print.c (get_power_of_two): New function.
	(struct relative_to_power_of_2): New struct.
	(pp_humanized_limit): New function.
	(pp_humanized_range): New function.
	(selftest::assert_pp_humanized_limit): New function.
	(ASSERT_PP_HUMANIZED_LIMIT): New macro.
	(selftest::assert_pp_humanized_range): New function.
	(ASSERT_PP_HUMANIZED_RANGE): New macro.
	(selftest::test_get_power_of_two): New function.
	(selftest::test_humanized_printing): New function.
	(selftest::pretty_print_c_tests): Call them.
	* pretty-print.h (pp_humanized_limit): New function.
	(pp_humanized_range): New function.
---
 gcc/pretty-print.c | 181 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/pretty-print.h |   4 ++
 2 files changed, 185 insertions(+)

diff --git a/gcc/pretty-print.c b/gcc/pretty-print.c
index 7dd900b..7a2cd30 100644
--- a/gcc/pretty-print.c
+++ b/gcc/pretty-print.c
@@ -1799,6 +1799,107 @@ pp_end_quote (pretty_printer *pp, bool show_color)
   pp_string (pp, close_quote);
 }
 
+/* Get the largest power of two that is less that or equal to VAL.
+
+   FIXME: does this exist in stdlib (e.g. log2l in C99)?  */
+
+static unsigned
+get_power_of_two (unsigned HOST_WIDE_INT val)
+{
+  int power_of_two = 0;
+  while (val >= 2)
+    {
+      val >>= 1;
+      power_of_two++;
+    }
+  return power_of_two;
+}
+
+/* A way to express a number in terms of an offset from a power of two.  */
+
+struct relative_to_power_of_2
+{
+  relative_to_power_of_2 (unsigned HOST_WIDE_INT val)
+  {
+    m_shift = get_power_of_two (val);
+    m_offset = val - (1ul << m_shift);
+
+    if (m_shift > 1)
+      if (m_offset > (1l << (m_shift - 1)))
+	{
+	  m_shift++;
+	  m_offset = val - (1ul << m_shift);
+	}
+  }
+
+  void print (pretty_printer *pp) const
+  {
+    if (m_offset > 0)
+      pp_printf (pp, "(1<<%u)+%wd", m_shift, m_offset);
+    else if (m_offset < 0)
+      pp_printf (pp, "(1<<%u)-%wd", m_shift, -m_offset);
+    else
+      pp_printf (pp, "1<<%u", m_shift);
+  }
+
+  unsigned m_shift;
+  HOST_WIDE_INT m_offset;
+};
+
+/* Print VAL to PP, in a "humanized" way.
+   Small numbers are printed in decimal form (e.g. "42").
+   Large numbers close to a power of two are printed relative to a
+   power of two.  For example, 4294967295 is printed as "(1<<32)-1".  */
+
+void
+pp_humanized_limit (pretty_printer *pp, unsigned HOST_WIDE_INT val)
+{
+  /* For large numbers near a power of two, print them symbolically.
+     "65536" is about the most I can recognize by eye (dmalcolm),
+     1<<17 is where I can't tell at a glance if it is indeed 1<<17.  */
+  const unsigned threshold = 1 << 16;
+  if (val > threshold)
+    {
+      relative_to_power_of_2 rel2 (val);
+
+      /* How near to the large power of two should they be to be
+	 printed in terms of it?  */
+      const unsigned threshold_2 = 100;
+      if (abs (rel2.m_offset) <= threshold_2)
+	{
+	  rel2.print (pp);
+	  return;
+	}
+    }
+
+  /* Otherwise, just print the value as a decimal.  */
+  pp_printf (pp, "%wu", val);
+}
+
+/* Print the range MIN to MAX to PP, in a "humanized" way.
+   For example if MIN == MAX, just one number is printed.
+
+   Return true if both values are 1, false otherwise.
+   FIXME: how do plural forms work for *ranges*?  */
+
+bool
+pp_humanized_range (pretty_printer *pp, unsigned HOST_WIDE_INT min,
+		    unsigned HOST_WIDE_INT max)
+{
+  if (min == max)
+    {
+      pp_humanized_limit (pp, min);
+      return min == 1;
+    }
+  else
+    {
+      pp_humanized_limit (pp, min);
+      pp_string (pp, "...");
+      pp_humanized_limit (pp, max);
+      return false;
+    }
+}
+
 \f
 /* The string starting at P has LEN (at least 1) bytes left; if they
    start with a valid UTF-8 sequence, return the length of that
@@ -2211,6 +2312,84 @@ test_pp_format ()
 		    "problem with %qs at line %i", "bar", 10);
 }
 
+/* Implementation detail of ASSERT_PP_HUMANIZED_LIMIT.  */
+
+static void
+assert_pp_humanized_limit (const location &loc, unsigned HOST_WIDE_INT val,
+			   const char *expected)
+{
+  pretty_printer pp;
+  pp_humanized_limit (&pp, val);
+  ASSERT_STREQ_AT (loc, expected, pp_formatted_text (&pp));
+}
+
+/* Verify that pp_humanized_limit (pp, VAL) is EXPECTED.  */
+
+#define ASSERT_PP_HUMANIZED_LIMIT(VAL, EXPECTED)			\
+  SELFTEST_BEGIN_STMT							\
+    assert_pp_humanized_limit ((SELFTEST_LOCATION), (VAL), (EXPECTED)); \
+  SELFTEST_END_STMT
+
+/* Implementation detail of ASSERT_PP_HUMANIZED_RANGE.  */
+
+static void
+assert_pp_humanized_range (const location &loc, unsigned HOST_WIDE_INT min,
+			   unsigned HOST_WIDE_INT max, const char *expected)
+{
+  pretty_printer pp;
+  pp_humanized_range (&pp, min, max);
+  ASSERT_STREQ_AT (loc, expected, pp_formatted_text (&pp));
+}
+
+/* Verify that pp_humanized_range (pp, MIN, MAX) is EXPECTED.  */
+
+#define ASSERT_PP_HUMANIZED_RANGE(MIN, MAX, EXPECTED)			\
+  SELFTEST_BEGIN_STMT							\
+    assert_pp_humanized_range ((SELFTEST_LOCATION), (MIN), (MAX), (EXPECTED)); \
+  SELFTEST_END_STMT
+
+/* Verify that get_power_of_two works as expected.  */
+
+static void
+test_get_power_of_two ()
+{
+  ASSERT_EQ (get_power_of_two (0), 0);
+  ASSERT_EQ (get_power_of_two (1), 0);
+  ASSERT_EQ (get_power_of_two (2), 1);
+
+  ASSERT_EQ (get_power_of_two (7), 2);
+  ASSERT_EQ (get_power_of_two (8), 3);
+  ASSERT_EQ (get_power_of_two (9), 3);
+
+  ASSERT_EQ (get_power_of_two (65535), 15);
+  ASSERT_EQ (get_power_of_two (65536), 16);
+}
+
+/* Verify that pp_humanized_limit and pp_humanized_range work as expected.  */
+
+static void
+test_humanized_printing ()
+{
+  ASSERT_PP_HUMANIZED_LIMIT (0, "0");
+  ASSERT_PP_HUMANIZED_LIMIT (1, "1");
+  ASSERT_PP_HUMANIZED_LIMIT (2, "2");
+  ASSERT_PP_HUMANIZED_LIMIT (3, "3");
+  ASSERT_PP_HUMANIZED_LIMIT (4, "4");
+  ASSERT_PP_HUMANIZED_LIMIT (42, "42");
+  ASSERT_PP_HUMANIZED_LIMIT (256, "256");
+  ASSERT_PP_HUMANIZED_LIMIT (1<<16, "65536");
+  ASSERT_PP_HUMANIZED_LIMIT ((1<<16) + 1, "(1<<16)+1");
+  ASSERT_PP_HUMANIZED_LIMIT (100000, "100000");
+  ASSERT_PP_HUMANIZED_LIMIT (1<<17, "1<<17");
+  ASSERT_PP_HUMANIZED_LIMIT ((1<<17) - 1, "(1<<17)-1");
+  ASSERT_PP_HUMANIZED_LIMIT ((1<<17) + 1, "(1<<17)+1");
+  ASSERT_PP_HUMANIZED_LIMIT (4294967295, "(1<<32)-1");
+
+  ASSERT_PP_HUMANIZED_RANGE (0, 0, "0");
+  ASSERT_PP_HUMANIZED_RANGE (0, 1, "0...1");
+  ASSERT_PP_HUMANIZED_RANGE ((1<<16), (1<<17), "65536...1<<17");
+}
+
 /* Run all of the selftests within this file.  */
 
 void
@@ -2218,6 +2397,8 @@ pretty_print_c_tests ()
 {
   test_basic_printing ();
   test_pp_format ();
+  test_get_power_of_two ();
+  test_humanized_printing ();
 }
 
 } // namespace selftest
diff --git a/gcc/pretty-print.h b/gcc/pretty-print.h
index 2decc51..e4a8c97 100644
--- a/gcc/pretty-print.h
+++ b/gcc/pretty-print.h
@@ -416,4 +416,8 @@ pp_wide_integer (pretty_printer *pp, HOST_WIDE_INT i)
 template<unsigned int N, typename T>
 void pp_wide_integer (pretty_printer *pp, const poly_int_pod<N, T> &);
 
+extern void pp_humanized_limit (pretty_printer *, unsigned HOST_WIDE_INT);
+extern bool pp_humanized_range (pretty_printer *, unsigned HOST_WIDE_INT,
+				unsigned HOST_WIDE_INT);
+
 #endif /* GCC_PRETTY_PRINT_H */
-- 
1.8.5.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/5] RFC: gimple-ssa-sprintf.c: a new approach (PR middle-end/77696)
  2018-09-14 17:31 [PATCH 0/5] RFC: gimple-ssa-sprintf.c: a new approach (PR middle-end/77696) David Malcolm
                   ` (4 preceding siblings ...)
  2018-09-14 17:36 ` [PATCH 4/5] Add pp_humanized_limit and pp_humanized_range David Malcolm
@ 2018-09-17 21:16 ` Martin Sebor
  5 siblings, 0 replies; 8+ messages in thread
From: Martin Sebor @ 2018-09-17 21:16 UTC (permalink / raw)
  To: David Malcolm, msebor; +Cc: gcc-patches

On 09/14/2018 12:17 PM, David Malcolm wrote:
> Martin: on the way back from Cauldron I had a go at implementing new
> output formats for the warnings from gimple-ssa-sprintf.c, to try to
> better indicate to the end user what the problem is.  My plan was to
> implement some of the ASCII art ideas we've been discussing.  However,
> to try to understand what the pass is doing, I tried visualizing the
> existing data structures, and I ended up liking the output of that so much,
> that I think that it is what we ought to show the end-user.
>
> Consider the examples from PR middle-end/77696:
>
> char d[4];
>
> typedef __SIZE_TYPE__ size_t;
>
> extern int sprintf (char*, const char*, ...);
> extern int snprintf (char*, size_t, const char*, ...);
>
>
> void f (int i)
> {
>   // bounded, definite truncation in a directve
>   snprintf (d, sizeof d, "%i", 1235);
>
>   // bounded, definite truncation copying format string
>   snprintf (d, sizeof d, "%iAB", 123);
>
>   // unbounded, definite overflow in a directve
>   sprintf (d, "%i", 1235);
>
>   // unbounded, definite overflow copying format string
>   sprintf (d, "%iAB", 123);
>
>   // bounded, possible truncation a directve
>   snprintf (d, sizeof d, "%i", i);
>
>   // bounded, possible overflow copying format string
>   snprintf (d, sizeof d, "%iAB", i);
>
>   // unbounded, possible overflow in a directve
>   sprintf (d, "%i", i);
>
>   // unbounded, possible overflow copying format string
>   sprintf (d, "%iAB", 123);
>
>   // unbounded, possible overflow copying format string
>   const char *s = i ? "123" : "1234";
>   sprintf (d, "%sAB", s);
> }
>
> extern char buf_10[80];
> extern char tmpdir[80];
> extern char fname[8];
>
> void g (int num)
> {
>   sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
> }
>
> trunk currently emits:
>
> zz.c: In function ‘f’:
> zzz.c:12:29: warning: ‘snprintf’ output truncated before the last format character [-Wformat-truncation=]
> 12 |   snprintf (d, sizeof d, "%i", 1235);
>    |                             ^
> zzz.c:12:3: note: ‘snprintf’ output 5 bytes into a destination of size 4
> 12 |   snprintf (d, sizeof d, "%i", 1235);
>    |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> zzz.c:15:30: warning: ‘AB’ directive output truncated writing 2 bytes into a region of size 1 [-Wformat-truncation=]
> 15 |   snprintf (d, sizeof d, "%iAB", 123);
>    |                             ~^
> zzz.c:15:3: note: ‘snprintf’ output 6 bytes into a destination of size 4
> 15 |   snprintf (d, sizeof d, "%iAB", 123);
>    |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> zzz.c:18:18: warning: ‘sprintf’ writing a terminating nul past the end of the destination [-Wformat-overflow=]
> 18 |   sprintf (d, "%i", 1235);
>    |                  ^
> zzz.c:18:3: note: ‘sprintf’ output 5 bytes into a destination of size 4
> 18 |   sprintf (d, "%i", 1235);
>    |   ^~~~~~~~~~~~~~~~~~~~~~~
> zzz.c:21:19: warning: ‘AB’ directive writing 2 bytes into a region of size 1 [-Wformat-overflow=]
> 21 |   sprintf (d, "%iAB", 123);
>    |                  ~^
> zzz.c:21:3: note: ‘sprintf’ output 6 bytes into a destination of size 4
> 21 |   sprintf (d, "%iAB", 123);
>    |   ^~~~~~~~~~~~~~~~~~~~~~~~
> zzz.c:33:19: warning: ‘AB’ directive writing 2 bytes into a region of size 1 [-Wformat-overflow=]
> 33 |   sprintf (d, "%iAB", 123);
>    |                  ~^
> zzz.c:33:3: note: ‘sprintf’ output 6 bytes into a destination of size 4
> 33 |   sprintf (d, "%iAB", 123);
>    |   ^~~~~~~~~~~~~~~~~~~~~~~~
> zzz.c:37:19: warning: ‘AB’ directive writing 2 bytes into a region of size between 0 and 1 [-Wformat-overflow=]
> 37 |   sprintf (d, "%sAB", s);
>    |                  ~^
> zzz.c:37:3: note: ‘sprintf’ output between 6 and 7 bytes into a destination of size 4
> 37 |   sprintf (d, "%sAB", s);
>    |   ^~~~~~~~~~~~~~~~~~~~~~
> zzz.c: In function ‘g’:
> zzz.c:46:24: warning: ‘/’ directive writing 1 byte into a region of size between 0 and 79 [-Wformat-overflow=]
> 46 |   sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
>    |                        ^
> zzz.c:46:3: note: ‘sprintf’ output between 9 and 105 bytes into a destination of size 80
> 46 |   sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
>    |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>
> With this patch kit, gcc emits:
>
> zzz.c: In function ‘f’:
> zzz.c:12:13: warning: ‘snprintf’ will truncate the output (5 bytes) to size 4 [-Wformat-truncation=]
> 12 |   snprintf (d, sizeof d, "%i", 1235);
>    |             ^             ^~^
>    |             |             | |
>    |             |             | 1 byte (for NUL terminator)
>    |             |             4 bytes
>    |             capacity: 4 bytes
> zzz.c:1:6: note: destination declared here
> 1 | char d[4];
>   |      ^
> zzz.c:15:13: warning: ‘snprintf’ will truncate the output (6 bytes) to size 4 [-Wformat-truncation=]
> 15 |   snprintf (d, sizeof d, "%iAB", 123);
>    |             ^             ^~^~^
>    |             |             | | |
>    |             |             | | 1 byte (for NUL terminator)
>    |             |             | 2 bytes
>    |             |             3 bytes
>    |             capacity: 4 bytes
> zzz.c:1:6: note: destination declared here
> 1 | char d[4];
>   |      ^
> zzz.c:18:12: warning: buffer overflow: ‘sprintf’ will write 5 bytes into a destination of size 4 [-Wformat-overflow=]
> 18 |   sprintf (d, "%i", 1235);
>    |            ^   ^~^
>    |            |   | |
>    |            |   | 1 byte (for NUL terminator)
>    |            |   4 bytes
>    |            capacity: 4 bytes
> zzz.c:1:6: note: destination declared here
> 1 | char d[4];
>   |      ^
> zzz.c:21:12: warning: buffer overflow: ‘sprintf’ will write 6 bytes into a destination of size 4 [-Wformat-overflow=]
> 21 |   sprintf (d, "%iAB", 123);
>    |            ^   ^~^~^
>    |            |   | | |
>    |            |   | | 1 byte (for NUL terminator)
>    |            |   | 2 bytes
>    |            |   3 bytes
>    |            capacity: 4 bytes
> zzz.c:1:6: note: destination declared here
> 1 | char d[4];
>   |      ^
> zzz.c:33:12: warning: buffer overflow: ‘sprintf’ will write 6 bytes into a destination of size 4 [-Wformat-overflow=]
> 33 |   sprintf (d, "%iAB", 123);
>    |            ^   ^~^~^
>    |            |   | | |
>    |            |   | | 1 byte (for NUL terminator)
>    |            |   | 2 bytes
>    |            |   3 bytes
>    |            capacity: 4 bytes
> zzz.c:1:6: note: destination declared here
> 1 | char d[4];
>   |      ^
> zzz.c:37:12: warning: buffer overflow: ‘sprintf’ will write between 6 and 7 bytes into a destination of size 4 [-Wformat-overflow=]
> 37 |   sprintf (d, "%sAB", s);
>    |            ^   ^~^~^
>    |            |   | | |
>    |            |   | | 1 byte (for NUL terminator)
>    |            |   | 2 bytes
>    |            |   3...4 bytes
>    |            capacity: 4 bytes
> zzz.c:1:6: note: destination declared here
> 1 | char d[4];
>   |      ^
> zzz.c: In function ‘g’:
> zzz.c:46:12: warning: buffer overflow: ‘sprintf’ will write between 9 and 105 bytes into a destination of size 80 [-Wformat-overflow=]
> 46 |   sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
>    |            ^~~~~~   ^^~^^~^^~^~~~^
>    |            |        || || || |   |
>    |            |        || || || |   1 byte (for NUL terminator)
>    |            |        || || || 4 bytes
>    |            |        || || |1...11 bytes
>    |            |        || || 1 byte
>    |            |        || |0...7 bytes
>    |            |        || 1 byte
>    |            |        |0...79 bytes
>    |            |        1 byte
>    |            capacity: 80 bytes
> zzz.c:40:13: note: destination declared here
> 40 | extern char buf_10[80];
>    |             ^~~~~~
>
> The above is missing colorization, which can be seen at:
>
>   https://dmalcolm.fedorapeople.org/gcc/2018-09-14/zzz.html
>
> where the secondary locations within the format string alternate
> colors between odd and even secondary locations.
>
> The above output format is rather verbose, but I think that it is
> appropriate for this warning.
>
> Rationale: What is the action that an end-user will want to take when
> encountering a warning?
>
> I think they will want to:
> (a) review it and decide:
>   (a.1) "Is this a "true" result?"
>   (a.2) "Do I care?"
> (b) if they decide it's genuine: "How do I fix this?"
>
> (i.e. we should aim for our diagnostics to be "actionable")
>
> Consider the sprintf/snprintf warnings.  For the user to easily take the
> above actions, the most helpful thing we can do is to provide the user
> with clear information on how much data the format directives will write
> when the code runs, and to compare that with the size of the destination buffer.
>
> I think the typical fix the user will apply will be to increase the size
> of the buffer, or to convert sprintf to snprintf: a buffer overflow is
> typically very bad, whereas truncating a string can often be much less
> serious.  A less common fix will be to add or tweak width specifiers to
> specific format directives.
>
> To help the end-user to make these kinds of decisions, these warnings
> need to clearly summarize the sizes of *all* of the various writes,
> and compare them to the size of the destination buffer.
>
> As I understand it, the current implementation iterates through the
> format directives, tracking the possible ranges of data that could have
> been written so far.  It emits a warning at the first directive that can
> have an overflow, and then stops analyzing the directives at that callsite.
>
> The existing approach thus seems to focus on where the first overflow
> can happen.
>
> My proposed new approach analyzes all of the directives up-front, and if
> an overflow can occur, then the warning that's emitted shows all of the
> various sizes of writes that will happen.
>
> Hence the reimplementation doesn't bother showing the point where the
> overflow happens; instead it summarizes all of what will be written, to
> help the user in deciding how to fix their code.
> (Doing so reduces some of the combinatorial explosion in possible output
> messages).
>
> The patch also puts the words "buffer overflow" at the front of the
> message where appropriate (to better grab the user's attention).
> It also calls out NUL terminators, since people tend to make mistakes
> with them.
>
> There's a lot of FIXMEs in the resulting patch, but I wanted to run the
> idea past you before fixing all the FIXMEs.  In particular, I haven't
> attempted to bootstrap and regression-test the new implementation.
> (I suspect many of the existing tests are broken by this...).  Consider
> this a prototype.
>
> [the interesting part of the patch kit is in patch 5/5]
>
> Thoughts?

I haven't had a chance to review the patch itself or install
and play with it but going by the example diagnostics you've
provided (thanks, that was helpful) I agree there are some
nice elements there (e.g, referencing the declaration of
the destination object in a note, or showing the size of
the output of each directive).

At the same time, I still have the same concerns as before:
the cost of maintaining the new output and the tests for it
seems considerable.  And while I do like some of
the improvements quite a bit, others seem subjective and
less like a clear win to me.  On balance, I'm not sure
the overall costs and liabilities justify the benefits.

A big concern I have is also about losing some subtle but
important distinctions that the current warnings make and that
the tests rely on to verify the warning works correctly (this
is both the phrasing and the data in the warnings).  If these
are lost or moved into the notes, we'd either lose that aspect
of the tests or will have to replicate it via
the dg-multiline-output directives.  I find those very hard
to work with (and hence my concern about maintenance).

An example is the amount of output printed by the overflowing
directive. Besides testing, I think it's important also for
users in the "may write" kind of warnings -- users need to
see clearly which directive caused the overflow, not just
that the whole call may have overflowed and be expected to
do the math (add up the ranges of output of prior directives)
to determine the problem directive.

Beyond that, when introducing the sprintf warnings I tried
to make them consistent with other similar warnings (e.g.,
-Wstringop-overflow, -Warray-bounds, -Wrestrict, etc), both
in phrasing and in style.  The rewording/reformatting would
make the printf kind entirely different.  E.g., the notation
used to represent ranges ([MIN, MAX] vs MIN...MAX).  I'm not
opposed to tweaking the wording or the style but I think
there is value in consistency and so if we change sprintf
I think we should adopt the same style across the board.

It's definitely useful to be able to tell how much output each
directive produced (that's why it's in the debugging dumps) but
the "ASCII art" feels so different from the style of all other
GCC diagnostics that I'm having trouble convincing myself it's
the right (or even a good) choice.

I understand why you chose different colors to distinguish
adjacent directives -- but so much colorization feels more
distracting than helpful  I suppose it could be mitigated
somewhat by only printing it for the %- kinds of directives
and for plain characters in the format string.

With all that said, in case you haven't seen it, there is
additional detail about the pass in
the -fdump-tree-printf-return-value output, but it isn't
nearly as nicely presented.  Perhaps enhancing the dump format
would be a way to make these improvements accessible to users
without running into the problems above.  It might also be
a good way to experiment with different styles of output until
we have decided on one to converge all these diagnostics on.
Another alternative might be make this output available under
an option, but that would have to be done without excessively
increasing the complexity of the pass.

Martin

PS I'm not sure I see if/where the patch retains
the distinction between certain and possible overflow and
truncation ("writing" vs "may write").

> Dave
>
> David Malcolm (5):
>   substring-locations: add class format_string_diagnostic_t
>   Add range_idx param to range_label::get_text
>   gimple-ssa-sprintf.c: move struct call_info out of the dom_walker subclass
>   Add pp_humanized_limit and pp_humanized_range
>   gimple-ssa-sprintf.c: rewrite overflow/truncation diagnostic (PR middle-end/77696)
>
>  gcc/c-family/c-format.c                      |  32 +-
>  gcc/c/c-objc-common.c                        |   2 +-
>  gcc/c/c-typeck.c                             |   4 +-
>  gcc/cp/error.c                               |   2 +-
>  gcc/diagnostic-show-locus.c                  |  19 +-
>  gcc/gcc-rich-location.h                      |   4 +-
>  gcc/gimple-ssa-sprintf.c                     | 670 ++++++++++++---------------
>  gcc/pretty-print.c                           | 181 ++++++++
>  gcc/pretty-print.h                           |   4 +
>  gcc/substring-locations.c                    | 113 +++--
>  gcc/substring-locations.h                    |  74 +--
>  gcc/testsuite/gcc.dg/sprintf-diagnostics-1.c | 252 ++++++++++
>  gcc/testsuite/gcc.dg/sprintf-diagnostics-2.c |  57 +++
>  libcpp/include/line-map.h                    |   6 +-
>  14 files changed, 927 insertions(+), 493 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/sprintf-diagnostics-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/sprintf-diagnostics-2.c
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 4/5] Add pp_humanized_limit and pp_humanized_range
  2018-09-14 17:36 ` [PATCH 4/5] Add pp_humanized_limit and pp_humanized_range David Malcolm
@ 2018-09-18 15:43   ` Martin Sebor
  0 siblings, 0 replies; 8+ messages in thread
From: Martin Sebor @ 2018-09-18 15:43 UTC (permalink / raw)
  To: David Malcolm, msebor; +Cc: gcc-patches

On 09/14/2018 12:17 PM, David Malcolm wrote:
> gcc/ChangeLog:
> 	* pretty-print.c (get_power_of_two): New function.
> 	(struct relative_to_power_of_2): New struct.
> 	(pp_humanized_limit): New function.
> 	(pp_humanized_range): New function.
> 	(selftest::assert_pp_humanized_limit): New function.
> 	(ASSERT_PP_HUMANIZED_LIMIT): New macro.
> 	(selftest::assert_pp_humanized_range): New function.
> 	(ASSERT_PP_HUMANIZED_RANGE): New macro.
> 	(selftest::test_get_power_of_two): New function.
> 	(selftest::test_humanized_printing): New function.
> 	(selftest::pretty_print_c_tests): Call them.
> 	* pretty-print.h (pp_humanized_limit): New function.
> 	(pp_humanized_range): New function.
...
> +static void
> +test_humanized_printing ()
> +{
> +  ASSERT_PP_HUMANIZED_LIMIT ((1<<16) + 1, "(1<<16)+1");
> +  ASSERT_PP_HUMANIZED_LIMIT (100000, "100000");
> +  ASSERT_PP_HUMANIZED_LIMIT (1<<17, "1<<17");
> +  ASSERT_PP_HUMANIZED_LIMIT ((1<<17) - 1, "(1<<17)-1");
> +  ASSERT_PP_HUMANIZED_LIMIT ((1<<17) + 1, "(1<<17)+1");
> +  ASSERT_PP_HUMANIZED_LIMIT (4294967295, "(1<<32)-1");
> +
> +  ASSERT_PP_HUMANIZED_RANGE (0, 0, "0");
> +  ASSERT_PP_HUMANIZED_RANGE (0, 1, "0...1");
> +  ASSERT_PP_HUMANIZED_RANGE ((1<<16), (1<<17), "65536...1<<17");

I didn't comment on this aspect of the change yesterday: besides
making very large numbers easier for humans to understand, my hope
was to also help avoid data model dependencies in the test suite.

When testing on any given host it's easy to forget that a constant
may have a different value in a different data model and hardcode
a fixed number.  The test then fails for targets where the constant
has a different value.

As it is, the sprintf (and other) tests deal with the problem like
this:

   T (-1, "%*cX", INT_MAX,     '1'); /* { dg-warning "output of \[0-9\]+ 
bytes causes result to exceed .INT_MAX." } */

I.e., by accepting any sequence of digits where a large constant
like INT_MAX is expected.  This works fine but doesn't verify that
the value printed is correct.

One approach to dealing with this is to use manifest constants
in the diagnostic output that are independent of the data model,
like INT_MAX, SIZE_MAX, etc.

Using shift expressions instead doesn't solve this problem,
but even beyond that, I wouldn't consider shift expressions
like "(1 << 16) + 1" to be significantly more readable than
their value.  IMO, it overestimates the capacity of most
programmers, to do shift arithmetic in their heads ;-)

Do you see a problem with using the <limits.h> constants?

Martin

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-09-18 15:38 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-14 17:31 [PATCH 0/5] RFC: gimple-ssa-sprintf.c: a new approach (PR middle-end/77696) David Malcolm
2018-09-14 17:30 ` [PATCH 1/5] substring-locations: add class format_string_diagnostic_t David Malcolm
2018-09-14 17:31 ` [PATCH 5/5] gimple-ssa-sprintf.c: rewrite overflow/truncation diagnostic (PR middle-end/77696) David Malcolm
2018-09-14 17:31 ` [PATCH 3/5] gimple-ssa-sprintf.c: move struct call_info out of the dom_walker subclass David Malcolm
2018-09-14 17:31 ` [PATCH 2/5] Add range_idx param to range_label::get_text David Malcolm
2018-09-14 17:36 ` [PATCH 4/5] Add pp_humanized_limit and pp_humanized_range David Malcolm
2018-09-18 15:43   ` Martin Sebor
2018-09-17 21:16 ` [PATCH 0/5] RFC: gimple-ssa-sprintf.c: a new approach (PR middle-end/77696) Martin Sebor

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).