public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: David Malcolm <dmalcolm@redhat.com>
To: msebor@redhat.com
Cc: gcc-patches@gcc.gnu.org,	David Malcolm <dmalcolm@redhat.com>
Subject: [PATCH 0/5] RFC: gimple-ssa-sprintf.c: a new approach (PR middle-end/77696)
Date: Fri, 14 Sep 2018 17:31:00 -0000	[thread overview]
Message-ID: <1536949038-34114-1-git-send-email-dmalcolm@redhat.com> (raw)

Martin: on the way back from Cauldron I had a go at implementing new
output formats for the warnings from gimple-ssa-sprintf.c, to try to
better indicate to the end user what the problem is.  My plan was to
implement some of the ASCII art ideas we've been discussing.  However,
to try to understand what the pass is doing, I tried visualizing the
existing data structures, and I ended up liking the output of that so much,
that I think that it is what we ought to show the end-user.

Consider the examples from PR middle-end/77696:

char d[4];

typedef __SIZE_TYPE__ size_t;

extern int sprintf (char*, const char*, ...);
extern int snprintf (char*, size_t, const char*, ...);


void f (int i)
{
  // bounded, definite truncation in a directve
  snprintf (d, sizeof d, "%i", 1235);

  // bounded, definite truncation copying format string
  snprintf (d, sizeof d, "%iAB", 123);

  // unbounded, definite overflow in a directve
  sprintf (d, "%i", 1235);

  // unbounded, definite overflow copying format string
  sprintf (d, "%iAB", 123);

  // bounded, possible truncation a directve
  snprintf (d, sizeof d, "%i", i);

  // bounded, possible overflow copying format string
  snprintf (d, sizeof d, "%iAB", i);

  // unbounded, possible overflow in a directve
  sprintf (d, "%i", i);

  // unbounded, possible overflow copying format string
  sprintf (d, "%iAB", 123);

  // unbounded, possible overflow copying format string
  const char *s = i ? "123" : "1234";
  sprintf (d, "%sAB", s);
}

extern char buf_10[80];
extern char tmpdir[80];
extern char fname[8];

void g (int num)
{
  sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
}

trunk currently emits:

zz.c: In function ‘f’:
zzz.c:12:29: warning: ‘snprintf’ output truncated before the last format character [-Wformat-truncation=]
12 |   snprintf (d, sizeof d, "%i", 1235);
   |                             ^
zzz.c:12:3: note: ‘snprintf’ output 5 bytes into a destination of size 4
12 |   snprintf (d, sizeof d, "%i", 1235);
   |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:15:30: warning: ‘AB’ directive output truncated writing 2 bytes into a region of size 1 [-Wformat-truncation=]
15 |   snprintf (d, sizeof d, "%iAB", 123);
   |                             ~^
zzz.c:15:3: note: ‘snprintf’ output 6 bytes into a destination of size 4
15 |   snprintf (d, sizeof d, "%iAB", 123);
   |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:18:18: warning: ‘sprintf’ writing a terminating nul past the end of the destination [-Wformat-overflow=]
18 |   sprintf (d, "%i", 1235);
   |                  ^
zzz.c:18:3: note: ‘sprintf’ output 5 bytes into a destination of size 4
18 |   sprintf (d, "%i", 1235);
   |   ^~~~~~~~~~~~~~~~~~~~~~~
zzz.c:21:19: warning: ‘AB’ directive writing 2 bytes into a region of size 1 [-Wformat-overflow=]
21 |   sprintf (d, "%iAB", 123);
   |                  ~^
zzz.c:21:3: note: ‘sprintf’ output 6 bytes into a destination of size 4
21 |   sprintf (d, "%iAB", 123);
   |   ^~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:33:19: warning: ‘AB’ directive writing 2 bytes into a region of size 1 [-Wformat-overflow=]
33 |   sprintf (d, "%iAB", 123);
   |                  ~^
zzz.c:33:3: note: ‘sprintf’ output 6 bytes into a destination of size 4
33 |   sprintf (d, "%iAB", 123);
   |   ^~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:37:19: warning: ‘AB’ directive writing 2 bytes into a region of size between 0 and 1 [-Wformat-overflow=]
37 |   sprintf (d, "%sAB", s);
   |                  ~^
zzz.c:37:3: note: ‘sprintf’ output between 6 and 7 bytes into a destination of size 4
37 |   sprintf (d, "%sAB", s);
   |   ^~~~~~~~~~~~~~~~~~~~~~
zzz.c: In function ‘g’:
zzz.c:46:24: warning: ‘/’ directive writing 1 byte into a region of size between 0 and 79 [-Wformat-overflow=]
46 |   sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
   |                        ^
zzz.c:46:3: note: ‘sprintf’ output between 9 and 105 bytes into a destination of size 80
46 |   sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
   |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


With this patch kit, gcc emits:

zzz.c: In function ‘f’:
zzz.c:12:13: warning: ‘snprintf’ will truncate the output (5 bytes) to size 4 [-Wformat-truncation=]
12 |   snprintf (d, sizeof d, "%i", 1235);
   |             ^             ^~^
   |             |             | |
   |             |             | 1 byte (for NUL terminator)
   |             |             4 bytes
   |             capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
  |      ^
zzz.c:15:13: warning: ‘snprintf’ will truncate the output (6 bytes) to size 4 [-Wformat-truncation=]
15 |   snprintf (d, sizeof d, "%iAB", 123);
   |             ^             ^~^~^
   |             |             | | |
   |             |             | | 1 byte (for NUL terminator)
   |             |             | 2 bytes
   |             |             3 bytes
   |             capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
  |      ^
zzz.c:18:12: warning: buffer overflow: ‘sprintf’ will write 5 bytes into a destination of size 4 [-Wformat-overflow=]
18 |   sprintf (d, "%i", 1235);
   |            ^   ^~^
   |            |   | |
   |            |   | 1 byte (for NUL terminator)
   |            |   4 bytes
   |            capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
  |      ^
zzz.c:21:12: warning: buffer overflow: ‘sprintf’ will write 6 bytes into a destination of size 4 [-Wformat-overflow=]
21 |   sprintf (d, "%iAB", 123);
   |            ^   ^~^~^
   |            |   | | |
   |            |   | | 1 byte (for NUL terminator)
   |            |   | 2 bytes
   |            |   3 bytes
   |            capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
  |      ^
zzz.c:33:12: warning: buffer overflow: ‘sprintf’ will write 6 bytes into a destination of size 4 [-Wformat-overflow=]
33 |   sprintf (d, "%iAB", 123);
   |            ^   ^~^~^
   |            |   | | |
   |            |   | | 1 byte (for NUL terminator)
   |            |   | 2 bytes
   |            |   3 bytes
   |            capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
  |      ^
zzz.c:37:12: warning: buffer overflow: ‘sprintf’ will write between 6 and 7 bytes into a destination of size 4 [-Wformat-overflow=]
37 |   sprintf (d, "%sAB", s);
   |            ^   ^~^~^
   |            |   | | |
   |            |   | | 1 byte (for NUL terminator)
   |            |   | 2 bytes
   |            |   3...4 bytes
   |            capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
  |      ^
zzz.c: In function ‘g’:
zzz.c:46:12: warning: buffer overflow: ‘sprintf’ will write between 9 and 105 bytes into a destination of size 80 [-Wformat-overflow=]
46 |   sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
   |            ^~~~~~   ^^~^^~^^~^~~~^
   |            |        || || || |   |
   |            |        || || || |   1 byte (for NUL terminator)
   |            |        || || || 4 bytes
   |            |        || || |1...11 bytes
   |            |        || || 1 byte
   |            |        || |0...7 bytes
   |            |        || 1 byte
   |            |        |0...79 bytes
   |            |        1 byte
   |            capacity: 80 bytes
zzz.c:40:13: note: destination declared here
40 | extern char buf_10[80];
   |             ^~~~~~

The above is missing colorization, which can be seen at:

  https://dmalcolm.fedorapeople.org/gcc/2018-09-14/zzz.html

where the secondary locations within the format string alternate
colors between odd and even secondary locations.

The above output format is rather verbose, but I think that it is
appropriate for this warning.

Rationale: What is the action that an end-user will want to take when
encountering a warning?

I think they will want to:
(a) review it and decide:
  (a.1) "Is this a "true" result?"
  (a.2) "Do I care?"  
(b) if they decide it's genuine: "How do I fix this?"

(i.e. we should aim for our diagnostics to be "actionable")

Consider the sprintf/snprintf warnings.  For the user to easily take the
above actions, the most helpful thing we can do is to provide the user
with clear information on how much data the format directives will write
when the code runs, and to compare that with the size of the destination buffer.

I think the typical fix the user will apply will be to increase the size
of the buffer, or to convert sprintf to snprintf: a buffer overflow is
typically very bad, whereas truncating a string can often be much less
serious.  A less common fix will be to add or tweak width specifiers to
specific format directives.

To help the end-user to make these kinds of decisions, these warnings
need to clearly summarize the sizes of *all* of the various writes,
and compare them to the size of the destination buffer.

As I understand it, the current implementation iterates through the
format directives, tracking the possible ranges of data that could have
been written so far.  It emits a warning at the first directive that can
have an overflow, and then stops analyzing the directives at that callsite.

The existing approach thus seems to focus on where the first overflow
can happen.

My proposed new approach analyzes all of the directives up-front, and if
an overflow can occur, then the warning that's emitted shows all of the
various sizes of writes that will happen.

Hence the reimplementation doesn't bother showing the point where the
overflow happens; instead it summarizes all of what will be written, to
help the user in deciding how to fix their code.
(Doing so reduces some of the combinatorial explosion in possible output
messages).

The patch also puts the words "buffer overflow" at the front of the
message where appropriate (to better grab the user's attention).
It also calls out NUL terminators, since people tend to make mistakes
with them.

There's a lot of FIXMEs in the resulting patch, but I wanted to run the
idea past you before fixing all the FIXMEs.  In particular, I haven't
attempted to bootstrap and regression-test the new implementation.
(I suspect many of the existing tests are broken by this...).  Consider
this a prototype.

[the interesting part of the patch kit is in patch 5/5]

Thoughts?
Dave

David Malcolm (5):
  substring-locations: add class format_string_diagnostic_t
  Add range_idx param to range_label::get_text
  gimple-ssa-sprintf.c: move struct call_info out of the dom_walker subclass
  Add pp_humanized_limit and pp_humanized_range
  gimple-ssa-sprintf.c: rewrite overflow/truncation diagnostic (PR middle-end/77696)

 gcc/c-family/c-format.c                      |  32 +-
 gcc/c/c-objc-common.c                        |   2 +-
 gcc/c/c-typeck.c                             |   4 +-
 gcc/cp/error.c                               |   2 +-
 gcc/diagnostic-show-locus.c                  |  19 +-
 gcc/gcc-rich-location.h                      |   4 +-
 gcc/gimple-ssa-sprintf.c                     | 670 ++++++++++++---------------
 gcc/pretty-print.c                           | 181 ++++++++
 gcc/pretty-print.h                           |   4 +
 gcc/substring-locations.c                    | 113 +++--
 gcc/substring-locations.h                    |  74 +--
 gcc/testsuite/gcc.dg/sprintf-diagnostics-1.c | 252 ++++++++++
 gcc/testsuite/gcc.dg/sprintf-diagnostics-2.c |  57 +++
 libcpp/include/line-map.h                    |   6 +-
 14 files changed, 927 insertions(+), 493 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/sprintf-diagnostics-1.c
 create mode 100644 gcc/testsuite/gcc.dg/sprintf-diagnostics-2.c

-- 
1.8.5.3

             reply	other threads:[~2018-09-14 17:30 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-14 17:31 David Malcolm [this message]
2018-09-14 17:30 ` [PATCH 1/5] substring-locations: add class format_string_diagnostic_t David Malcolm
2018-09-14 17:31 ` [PATCH 3/5] gimple-ssa-sprintf.c: move struct call_info out of the dom_walker subclass David Malcolm
2018-09-14 17:31 ` [PATCH 5/5] gimple-ssa-sprintf.c: rewrite overflow/truncation diagnostic (PR middle-end/77696) David Malcolm
2018-09-14 17:31 ` [PATCH 2/5] Add range_idx param to range_label::get_text David Malcolm
2018-09-14 17:36 ` [PATCH 4/5] Add pp_humanized_limit and pp_humanized_range David Malcolm
2018-09-18 15:43   ` Martin Sebor
2018-09-17 21:16 ` [PATCH 0/5] RFC: gimple-ssa-sprintf.c: a new approach (PR middle-end/77696) Martin Sebor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1536949038-34114-1-git-send-email-dmalcolm@redhat.com \
    --to=dmalcolm@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=msebor@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).