From: David Malcolm <dmalcolm@redhat.com>
To: Tim Lange <mail@tim-lange.me>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 2/2] analyzer: out-of-bounds checker [PR106000]
Date: Tue, 09 Aug 2022 18:44:23 -0400
Message-ID: <7a010cff4272202a15bd90b92863f54946f41b9e.camel@redhat.com>
In-Reply-To: <20220809211943.82098-2-mail@tim-lange.me>
On Tue, 2022-08-09 at 23:19 +0200, Tim Lange wrote:
> This patch adds an experimental out-of-bounds checker to the
> analyzer.
>
> The checker was tested on coreutils, curl, httpd and openssh. It is
> mostly accurate but does produce false-positives on yacc-generated
> files and sometimes when the analyzer misses an invariant. These cases
> will be documented in bugzilla.
> (Regrtests still running with the latest changes, will report back
> later.)
Hi Tim, thanks for the patch, and for all the testing you've done on
it.
We've already had several rounds of review of this off-list, and this
patch looks very close to ready.
Some nits below...
> diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt
> index 5021376b6fb..8e73af60ceb 100644
> --- a/gcc/analyzer/analyzer.opt
> +++ b/gcc/analyzer/analyzer.opt
> @@ -158,6 +158,10 @@ Wanalyzer-tainted-size
> Common Var(warn_analyzer_tainted_size) Init(1) Warning
> Warn about code paths in which an unsanitized value is used as a size.
>
> +Wanalyzer-out-of-bounds
> +Common Var(warn_analyzer_out_of_bounds) Init(1) Warning
> +Warn about code paths in which a write or read to a buffer is out-of-bounds.
> +
Please keep the list alphabetized; I think this needs to be between
Wanalyzer-mismatching-deallocation
and
Wanalyzer-possible-null-argument
> Wanalyzer-use-after-free
> Common Var(warn_analyzer_use_after_free) Init(1) Warning
> Warn about code paths in which a freed value is used.
> diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
> index f7df2fca245..2f9382ed96c 100644
> --- a/gcc/analyzer/region-model.cc
> +++ b/gcc/analyzer/region-model.cc
> @@ -1268,6 +1268,402 @@ region_model::on_stmt_pre (const gimple *stmt,
> }
> }
>
> +/* Abstract base class for all out-of-bounds warnings. */
> +
> +class out_of_bounds : public pending_diagnostic_subclass<out_of_bounds>
> +{
> +public:
> + out_of_bounds (const region *reg, tree diag_arg, byte_range range)
> + : m_reg (reg), m_diag_arg (diag_arg), m_range (range)
> + {}
> +
> + const char *get_kind () const final override
> + {
> + return "out_of_bounds_diagnostic";
> + }
> +
> + bool operator== (const out_of_bounds &other) const
> + {
> + return m_reg == other.m_reg
> + && m_range == other.m_range
> + && pending_diagnostic::same_tree_p (m_diag_arg, other.m_diag_arg);
> + }
> +
> + int get_controlling_option () const final override
> + {
> + return OPT_Wanalyzer_out_of_bounds;
> + }
> +
> + void mark_interesting_stuff (interesting_t *interest) final override
> + {
> + interest->add_region_creation (m_reg);
> + }
> +
> +protected:
> + const region *m_reg;
> + tree m_diag_arg;
> + byte_range m_range;
Please add a comment clarifying what the meaning of m_range is here.
Is it
(a) the range of all bytes that are accessed,
(b) the range of bytes that are accessed out-of-bounds,
(c) etc?
From my reading of the patch I think it's (b).
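For illustration only, here is a minimal sketch of what such a comment
could look like under reading (b), together with how the out-of-bounds
portion of an access relates to the full access. The types
byte_range_sketch and out_of_bounds_sketch and the helper bytes_past_end
are hypothetical stand-ins, not code from the patch; the real byte_range
uses GCC's wide-int types rather than int64_t.

```cpp
#include <cstdint>

// Hypothetical stand-in for the analyzer's byte_range (assumption:
// plain 64-bit offsets are enough for the illustration).
struct byte_range_sketch
{
  std::int64_t start;  // offset of the first byte, relative to buffer start
  std::int64_t size;   // number of bytes in the range
};

struct out_of_bounds_sketch
{
  /* The range of bytes that are accessed out-of-bounds, as opposed to
     the full range of bytes touched by the access.  */
  byte_range_sketch m_range;
};

// Return the part of ACCESS that lies past the end of a buffer of
// CAPACITY bytes; the result has size 0 if the access ends in bounds.
inline byte_range_sketch
bytes_past_end (byte_range_sketch access, std::int64_t capacity)
{
  std::int64_t end = access.start + access.size;
  if (end <= capacity)
    return {capacity, 0};
  std::int64_t first_bad = access.start > capacity ? access.start : capacity;
  return {first_bad, end - first_bad};
}
```

With a 12-byte buffer, an 8-byte access starting at offset 8 touches
bytes 8-15, but only bytes 12-15 would be recorded in m_range under
this reading.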
> +};
> +
> +/* Abstract subclass to complain about out-of-bounds
> + past the end of the buffer. */
> +
> +class past_the_end : public out_of_bounds
> +{
> +public:
> + past_the_end (const region *reg, tree diag_arg, byte_range range,
> + tree byte_bound)
> + : out_of_bounds (reg, diag_arg, range), m_byte_bound (byte_bound)
> + {}
> +
> + bool operator== (const past_the_end &other) const
> + {
> + return m_reg == other.m_reg
> + && m_range == other.m_range
> + && pending_diagnostic::same_tree_p (m_diag_arg, other.m_diag_arg)
Is it possible to call
out_of_bounds::operator==
for the first three fields, rather than a copy-and-paste of the logic?
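As a rough sketch of the idea (using simplified stand-in classes with
int fields, not the real pending_diagnostic hierarchy; base_diag and
past_the_end_diag are hypothetical names), the subclass comparison can
delegate to the base class and then compare only its own field:

```cpp
// Simplified stand-ins; the int fields are placeholders for the
// region pointer, byte range and trees used in the actual patch.
struct base_diag
{
  base_diag (int reg, int range, int diag_arg)
  : m_reg (reg), m_range (range), m_diag_arg (diag_arg)
  {}

  bool operator== (const base_diag &other) const
  {
    return m_reg == other.m_reg
	   && m_range == other.m_range
	   && m_diag_arg == other.m_diag_arg;
  }

  int m_reg;
  int m_range;
  int m_diag_arg;
};

struct past_the_end_diag : public base_diag
{
  past_the_end_diag (int reg, int range, int diag_arg, int byte_bound)
  : base_diag (reg, range, diag_arg), m_byte_bound (byte_bound)
  {}

  bool operator== (const past_the_end_diag &other) const
  {
    // Reuse the base-class comparison for the three shared fields,
    // then compare the field that only this subclass has.
    return base_diag::operator== (other)
	   && m_byte_bound == other.m_byte_bound;
  }

  int m_byte_bound;
};
```

Whether this works unchanged with pending_diagnostic_subclass in the
patch is something to check; the sketch only shows the delegation
pattern.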
> + && pending_diagnostic::same_tree_p (m_byte_bound,
> + other.m_byte_bound);
> + }
> +
> + label_text
> + describe_region_creation_event (const evdesc::region_creation &ev) final
> + override
> + {
> + if (m_byte_bound && TREE_CODE (m_byte_bound) == INTEGER_CST)
> + return ev.formatted_print ("capacity is %E bytes", m_byte_bound);
> +
> + return label_text ();
> + }
> +
> +protected:
> + tree m_byte_bound;
> +};
[...snip the concrete subclasses...]
We went through several rounds of review off-list, and I have lots of
ideas for wording tweaks to the patch, but rather than be a "backseat
driver" (or bikeshed), I think that aspect of the patch is good enough
as-is, and I'll make the wording changes myself once the patch is in
trunk.
[...snip...]
> +
> + if (warned)
> + {
> + char num_bytes_past_buf[WIDE_INT_PRINT_BUFFER_SIZE];
> + print_dec (m_range.m_size_in_bytes, num_bytes_past_buf, UNSIGNED);
I think we can use %wu for this, but I can fix this up in a followup.
[...snip...]
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index fa23fbeaaaa..5ab834af780 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -459,6 +459,7 @@ Objective-C and Objective-C++ Dialects}.
> -Wno-analyzer-null-dereference @gol
> -Wno-analyzer-possible-null-argument @gol
> -Wno-analyzer-possible-null-dereference @gol
> +-Wno-analyzer-out-of-bounds @gol
Please move between
-Wno-analyzer-null-dereference @gol
and
-Wno-analyzer-possible-null-argument @gol
for alphabetization.
> -Wno-analyzer-shift-count-negative @gol
> -Wno-analyzer-shift-count-overflow @gol
> -Wno-analyzer-stale-setjmp-buffer @gol
> @@ -9991,6 +9992,17 @@ This warning requires @option{-fanalyzer}, which enables it; use
> This diagnostic warns for paths through the code in which a
> value known to be NULL is dereferenced.
>
> +@item -Wno-analyzer-out-of-bounds
> +@opindex Wanalyzer-out-of-bounds
> +@opindex Wno-analyzer-out-of-bounds
> +This warning requires @option{-fanalyzer} to enable it; use
> +@option{-Wno-analyzer-out-of-bounds} to disable it.
> +
> +This diagnostic warns for paths through the code in which a buffer is
> +accessed or written out-of-bounds.
It would be good to clarify the limitations. As I understand it:
"The diagnostic only applies for cases where the analyzer is able to
determine a constant size for the buffer. It warns when any part of a
read or write is definitely before the start of the buffer, or
definitely after the end."
...or somesuch wording.
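To make the intent of that wording concrete, here is a small
self-contained model of the rule as I've described it. This is only a
sketch of the suggested documentation text, not the checker's actual
logic; verdict and classify_access are hypothetical names, and it
assumes the access offset and size are concrete values.

```cpp
#include <cstdint>

enum class verdict { no_warning, before_start, past_end };

// Model of the suggested wording: the diagnostic only applies when a
// constant capacity is known, and fires when any part of the access is
// definitely before the start or definitely after the end.
// CAPACITY is ignored when CAPACITY_KNOWN is false.
inline verdict
classify_access (std::int64_t offset, std::int64_t num_bytes,
		 bool capacity_known, std::int64_t capacity)
{
  if (!capacity_known)
    return verdict::no_warning;   // no constant size: stay silent
  if (offset < 0)
    return verdict::before_start; // some byte precedes the buffer
  if (offset + num_bytes > capacity)
    return verdict::past_end;     // some byte follows the buffer
  return verdict::no_warning;
}
```

For example, a 4-byte write at offset 12 into a 12-byte buffer would be
classified as past_end, while the same write into a buffer of unknown
size would not be reported under this wording.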
> +
> +See @url{https://cwe.mitre.org/data/definitions/119.html, CWE-119: Improper Restriction of Operations within the Bounds of a Memory Buffer}.
Also, please move the new entry to the correct position to keep things
alphabetized.
> +
> @item -Wno-analyzer-shift-count-negative
> @opindex Wanalyzer-shift-count-negative
> @opindex Wno-analyzer-shift-count-negative
[...snip...]
> diff --git a/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-1.c b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-1.c
> new file mode 100644
> index 00000000000..715c8b7460f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-1.c
> @@ -0,0 +1,119 @@
> +#include <stdlib.h>
> +#include <string.h>
> +#include <stdint.h>
> +#include <stdio.h>
> +
> +/* Wanalyzer-out-of-bounds tests for buffer overflows. */
> +
> +/* Avoid folding of memcpy. */
> +typedef void * (*memcpy_t) (void *dst, const void *src, size_t n);
> +
> +static memcpy_t __attribute__((noinline))
> +get_memcpy (void)
> +{
> + return memcpy;
> +}
> +
> +
> +/* Taken from CWE-787. */
> +void test1 (void)
> +{
> + int id_sequence[3];
> +
> + id_sequence[0] = 123;
> + id_sequence[1] = 234;
> + id_sequence[2] = 345;
> + id_sequence[3] = 456; /* { dg-line test1 } */
> +
> + /* { dg-warning "overflow" "warning" { target *-*-* } test1 } */
> + /* { dg-message "" "note" { target *-*-* } test1 } */
I see that you've left the regexes mostly blank in the various DejaGnu
directives in these new tests. Normally I'd want these to be less
vague, but given that I plan to change the wordings in a followup
anyway, this is OK.
[...snip lots of great testcases...]
With the above nits fixed, the patch is OK for trunk (assuming that
your testing doesn't show any problems).
Thanks again for the patch; this feels like a major new feature.
Dave