public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: Tim Lange <mail@tim-lange.me>
To: David Malcolm <dmalcolm@redhat.com>
Cc: GCC Mailing List <gcc@gcc.gnu.org>
Subject: Re: [RFC] analyzer: allocation size warning
Date: Fri, 17 Jun 2022 22:23:05 +0200	[thread overview]
Message-ID: <20220617202305.kg2otzye6k57fgyo@fedora> (raw)
In-Reply-To: <42afee94ad4da41b87980c6b4a7cd7dcc6cb1e97.camel@redhat.com>

On Fri, Jun 17, 2022 at 01:48:09PM -0400, David Malcolm wrote:
> On Fri, 2022-06-17 at 17:54 +0200, Tim Lange wrote:
> > Hi everyone,
> 
> Hi Tim.
> 
> Thanks for the patch.
> 
> Various comments inline below, throughout...
> 
> > 
> > tracked in PR105900 [0], I'd like to add support for a new warning on
> > dubious allocation sizes. The new checker emits a warning when the 
> > allocation size is not a multiple of the type's size. With the checker,
> > following mistakes are detected:
> >   int *arr = malloc(3); // forgot to multiply by sizeof
> >   arr[0] = ...;
> >   arr[1] = ...;
> > or
> >   int *buf = malloc (n + sizeof(int)); // probably should be * instead 
> > of +
> > Because it is implemented inside the analyzer, it also emits warnings
> > when the buffer is first of type void* and later on casted to something
> > else. Though, this also inherits a limitation. The checker can not 
> > distinguish 2 * sizeof(short) from sizeof(int) because sizeof is 
> > resolved and constants are folded at the point when the analyzer runs. 
> > As a mitigation, I plan to implement a check in the frontend that emits
> > a warning if sizeof(lhs pointee type) is not part of the malloc 
> > argument.
> > 
> > I'm looking for a first feedback on the phrasing of the diagnostics as 
> > well on the preliminary patch [1].
> > 
> > On constant buffer sizes, the warnings looks like this:
> > warning: Allocated buffer size is not a multiple of the pointee's size 
> > [CWE-131] [-Wanalyzer-allocation-size]
> >    22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } */
> >       | ^~~~~~~~~~~~~~~~~~~~~~~~~
> >   ‘test_2’: event 1
> >     |
> >     | 22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 }
> > */
> >     | | ^~~~~~~~~~~~~~~~~~~~~~~~~
> >     | | |
> >     | | (1) Casting a 14 byte buffer to ‘int *’ leaves 2 trailing 
> > bytes; either the allocated size is bogus or the type on the left-hand 
> > side is wrong
> >     |
> 
> Something strange seems to have happened with the indentation in your
> email; the code in the patch seems to me to be strangely indented, and
> looking at the archive here:
>   https://gcc.gnu.org/pipermail/gcc/2022-June/238907.html
> I see the same thing, so I think it's a problem with what the mailing
> list received, rather than just in my mail client.  Maybe something 
> 
> FWIW I normally use "git send-email" to send patches.
> 
> The underlinings in the above look strange; I see this in your email:

I have resent the patch using git send-email as a reply to my original message. 
The new message looks properly formatted in the archive:
    https://gcc.gnu.org/pipermail/gcc/2022-June/238911.html

> 
> warning: Allocated buffer size is not a multiple of the pointee's size 
> [CWE-131] [-Wanalyzer-allocation-size]
>    22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } */
>       | ^~~~~~~~~~~~~~~~~~~~~~~~~
>   ‘test_2’: event 1
>     |
>     | 22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } 
> */
>     | | ^~~~~~~~~~~~~~~~~~~~~~~~~
>     | | |
>     | | (1) Casting a 14 byte buffer to ‘int *’ leaves 2 trailing 
> bytes; either the allocated size is bogus or the type on the left-hand 
> side is wrong
>     |
> 
> Should it have been (omitting the dg-line directives for clarity):
> 
> warning: Allocated buffer size is not a multiple of the pointee's size  [CWE-131] [-Wanalyzer-allocation-size]
>    22 | int *ptr = malloc (10 + sizeof(int));
>       |            ^~~~~~~~~~~~~~~~~~~~~~~~~
>   ‘test_2’: event 1
>     |
>     | 22 | int *ptr = malloc (10 + sizeof(int));
>     |    |            ^~~~~~~~~~~~~~~~~~~~~~~~~
>     |    |            |
>     |    |            (1) Casting a 14 byte buffer to ‘int *’ leaves 2 trailing bytes; either the allocated size is bogus or the type on the left-hand side is wrong
>     |
> 
> ?
> 
> It looks like something somewhere has collapsed repeated whitespace in
> the message down to single spaces, which has broken the ASCII art in
> your examples, and the indentation in your code.
> 
> 
> It would probably be helpful for the message to tell the user what
> sizeof(*ptr) is,  sizeof(int) in this case (much more helpful when it's
> a struct)
> 
> Maybe something alike:
> 
> note: a buffer of 14 bytes is allocated...
> note: ...but sizeof (int) is 4 bytes...
> note: ...leaving 2 trailing bytes for an array of 3 'int's (which would
> occupy 12 bytes)
> 
> or somesuch???
> 
> I'm brainstorming here, my ideas above aren't necessarily good. 
> Sometimes it's good to chop up messages like this, to minimize
> combinatorial explosion for all the different cases.
> 
> 
> 
> On symbolic buffer sizes:
> warning: Allocated buffer size is not a multiple of the pointee's size 
> [CWE-131] [-Wanalyzer-allocation-size]
>    33 | int *ptr = malloc (n + sizeof(int)); /* { dg-line malloc3 } */
>       | ^~~~~~~~~~~~~~~~~~~~~~~~
>   ‘test_3’: event 1
>     |
>     | 33 | int *ptr = malloc (n + sizeof(int)); /* { dg-line malloc3 } 
> */
>     | | ^~~~~~~~~~~~~~~~~~~~~~~~
>     | | |
>     | | (1) Allocation is incompatible with ‘int *’; either the 
> allocated size is bogus or the type on the left-hand side is wrong
>     |
> 
> 
> Is there location information for both the malloc and for the
> assignment, here?

I'm not sure whether I understand your question but the warning is 
emitted at the gcall* with a ssa var lhs and the call_fndecl on the rhs.
I think that is enough to split that up into "(1) n + sizeof(int) 
allocated here" and "(2) Allocation at (1) is incompatible with..."? 

> 
> If so, then maybe two events:
> 
> warning: Allocated buffer size is not a multiple of the pointee's size 
> [CWE-131] [-Wanalyzer-allocation-size]
>    33 | int *ptr = malloc (n + sizeof(int));
>       |            ^~~~~~~~~~~~~~~~~~~~~~~~
>   ‘test_3’: events 1-2 
>     |
>     | 33 | int *ptr = malloc (n + sizeof(int));
>     |    |          ^ ^~~~~~~~~~~~~~~~~~~~~~~~
>     |    |          | |
>     |    |          | (1) buffer allocated here with size 'n + 4'
>     |    |          | 
>     |    |          (2) sizeof(*ptr) is 4
>     |
> 
> or somesuch.
> 
> 
> 
> And this is how a simple flow looks like:
> warning: Allocated buffer size is not a multiple of the pointee's size 
> [CWE-131] [-Wanalyzer-allocation-size]
>    39 | int *iptr = (int *)ptr; /* { dg-line assign } */
>       | ^~~~
>   ‘test_4’: events 1-2
>     |
>     | 38 | void *ptr = malloc (n * sizeof (short)); /* { dg-message }
> */
>     | | ^~~~~~~~~~~~~~~~~~~~~~~~~~~
>     | | |
>     | | (1) allocated here
>     | 39 | int *iptr = (int *)ptr; /* { dg-line assign } */
>     | | ~~~~
>     | | |
>     | | (2) ‘ptr’ is incompatible with ‘int *’; either the 
> allocated size at (1) is bogus or the type on the left-hand side is 
> wrong
>     |
> 
> 
> 
> I think it would make the diagnostic more readable if the "allocated
> here" event's message expresses how big the buffer is e.g.:
> 
> warning: Allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size]
>    39 | int *iptr = (int *)ptr;
>       | ^~~~
>   ‘test_4’: events 1-2
>     |
>     | 38 | void *ptr = malloc (n * sizeof (short));*/
>     |    |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~
>     |    |             |
>     |    |            (1) allocated here with size '(n * 2)'
>     | 39 | int *iptr = (int *)ptr;
>     |    |      ~~~~
>     |    |      |
>     |    |      (2) ‘ptr’ is incompatible with ‘int *’; sizeof(int) is 4
>     |
> 
> 
> There are some things to discuss from my side:
> * The tests with the "toy re-implementation of CPython's object 
> model"[2] fail due to a extra warning emitted. Because the analyzer 
> can't know the calculation actually results in a correct buffer size 
> when viewed as a string_obj later on, it emits a warning, e.g. at line 
> 61 in data-model-5.c. The only mitigation would be to disable the 
> warning for structs entirely. Now, the question is to rather have noise
> on these cases or disable the warning for structs entirely?
> 
> Can you post the full warning please?

/path/to/data-model-5.c: In function ‘alloc_obj’:
/path/to/data-model-5.c:61:31: warning: Allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size]
   61 |   base_obj *obj = (base_obj *)malloc (sz);
      |                               ^~~~~~~~~~~
  ‘new_string_obj’: events 1-2
    |
    |   69 | base_obj *new_string_obj (const char *str)
    |      |           ^~~~~~~~~~~~~~
    |      |           |
    |      |           (1) entry to ‘new_string_obj’
    |......
    |   75 |     = (string_obj *)alloc_obj (&str_type, sizeof (string_obj) + len + 1);
    |      |                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |      |                     |
    |      |                     (2) calling ‘alloc_obj’ from ‘new_string_obj’
    |
    +--> ‘alloc_obj’: events 3-4
           |
           |   59 | base_obj *alloc_obj (type_obj *ob_type, size_t sz)
           |      |           ^~~~~~~~~
           |      |           |
           |      |           (3) entry to ‘alloc_obj’
           |   60 | {
           |   61 |   base_obj *obj = (base_obj *)malloc (sz);
           |      |                               ~~~~~~~~~~~
           |      |                               |
           |      |                               (4) Allocation is incompatible with ‘base_obj *’; either the allocated size is bogus or the type on the left-hand side is wrong
           |

> 
> These testcases exhibit a common way of faking inheritance in C, and I
> think it ought to be possible to support this in the warning.
> 
> I thing what's happening is we have
> 
> struct base
> { 
>   /* fields */
> };
> 
> struct sub
> {
>   struct base m_base;
>   /* extra fields.  */
> };
> 
> struct base *construct_base (size_t sz)
> {
>   struct base *p = (struct base *) malloc (sz);
> 
>   /* set up fields of base in p  */
> 
>   return p;
> }
> 
> Or is this on the interprocedural path as called with a specific sizeof
> for struct sub?

At (4), it does not know that base_obj is later used as a "base struct". 
As it is called with sizeof(struct sub), my checker thinks the buffer is
too large for one but too small for another base_obj.

> 
> Maybe we can special-case these by detecting where struct sub's first
> field is struct base, and hence where we expect this pattern?  (and use
> this to suppress the warning for such cases?)

I already excluded all structs with structs inside with 
struct_or_union_with_inheritance_p inside sm-malloc.cc. This does not help 
in the case size for struct sub is allocated but casted as base. Maybe, we
should do a special case for structs where we only warn when the sizeof is
too small to hold the base struct together with supressing warnings when
the first field is a struct? 

> 
> 
> * I'm unable to emit a warning whenever the cast happens at an 
> assignment with a call as the rhs, e.g. test_1 in allocation-size-4.c. 
> This is because I'm unable to access a region_svalue for the returned
> value. Even in the new_program_state, the svalue of the lhs is still a 
> conjured_svalue. Maybe David can lead me to a place where I can access 
> the return value's region_svalue or do I have to adapt the engine?
> 
> Please can you try reposting the patch?  I tried to read it, but am
> having trouble with the mangled indentation.

See my inline answer above. Both, the test case and from where I want 
to access the region_svalue are commented with // FIXME.

> 
> 
> * attr-malloc-6.c and pr96639.c did both contain structs without an 
> implementation. Something in the analyzer must have triggered another
> warning about the usage of those without them having an implementation.
> I changed those structs to have an empty implementation, such that the 
> additional warning are gone. I think this shouldn't change the test 
> case, so is this change okay?
> 
> What were the new warnings?

/path/to/attr-malloc-6.c:175:15: error: invalid use of undefined type ‘struct FILE’
  175 |     FILE *p = malloc (100);   // { dg-message "allocated here" }
      |               ^~~~~~~~~~~~

All were like the one above. error: invalid use of undefined type 'struct XXX'

> 
> Thanks for the patch; sorry if this seems nitpicky; the patch seems
> promising

Thanks for the fast reply. I'll try out all the suggestions regarding
splitting up the allocation and assignment and see how they look.

- Tim

> 
> Dave
> 
> 
> 
> - Tim
> 
> [0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105900
> [1] While all tests except the cpython ones work, I have yet to test it
> on large C projects
> [2] FAIL: gcc.dg/analyzer/data-model-5.c (test for excess errors)
>     FAIL: gcc.dg/analyzer/data-model-5b.c (test for excess errors)
>     FAIL: gcc.dg/analyzer/data-model-5c.c (test for excess errors)
>     FAIL: gcc.dg/analyzer/data-model-5d.c (test for excess errors)
>     FAIL: gcc.dg/analyzer/first-field-2.c (test for excess errors)
> 
> -------
> 
> Subject: [PATCH] analyzer: add allocation size warning
> 
> This patch adds an allocation size checker to the analyzer.
> The checker warns when the tracked buffer size is not a multiple of the
> left-hand side pointee's type. This resolves PR analyzer/105900.
> 
> The patch is not yet fully tested.
> 
> gcc/analyzer/ChangeLog:
> 
>         * analyzer.opt: Add Wanalyzer-allocation-size.
>         * sm-malloc.cc (class dubious_allocation_size): New 
> pending_diagnostic subclass.
>         (capacity_compatible_with_type): New.
>         (const_operand_in_sval_p): New.
>         (struct_or_union_with_inheritance_p): New.
>         (check_capacity): New.
>         (malloc_state_machine::on_stmt): Add calls to 
> on_pointer_assignment.
>         (malloc_state_machine::on_allocator_call): Add node to 
> parameters and call to on_pointer_assignment.
>         (malloc_state_machine::on_pointer_assignment): New.
> 
> gcc/testsuite/ChangeLog:
> 
>         * gcc.dg/analyzer/attr-malloc-6.c: Disabled 
> Wanalyzer-allocation-size and added default implementation for FILE.
>         * gcc.dg/analyzer/capacity-1.c: Added dg directives.
>         * gcc.dg/analyzer/malloc-4.c: Disabled 
> Wanalyzer-allocation-size.
>         * gcc.dg/analyzer/pr96639.c: Disabled Wanalyzer-allocation-size
> and added default implementation for foo and bar.
>         * gcc.dg/analyzer/allocation-size-1.c: New test.
>         * gcc.dg/analyzer/allocation-size-2.c: New test.
>         * gcc.dg/analyzer/allocation-size-3.c: New test.
>         * gcc.dg/analyzer/allocation-size-4.c: New test.
> 
> Signed-off-by: Tim Lange <mail@tim-lange.me>
> ---
>  gcc/analyzer/analyzer.opt | 4 +
>  gcc/analyzer/sm-malloc.cc | 363 +++++++++++++++++-
>  .../gcc.dg/analyzer/allocation-size-1.c | 54 +++
>  .../gcc.dg/analyzer/allocation-size-2.c | 44 +++
>  .../gcc.dg/analyzer/allocation-size-3.c | 48 +++
>  .../gcc.dg/analyzer/allocation-size-4.c | 39 ++
>  gcc/testsuite/gcc.dg/analyzer/attr-malloc-6.c | 2 +
>  gcc/testsuite/gcc.dg/analyzer/capacity-1.c | 5 +-
>  gcc/testsuite/gcc.dg/analyzer/malloc-4.c | 6 +-
>  gcc/testsuite/gcc.dg/analyzer/pr96639.c | 2 +
>  10 files changed, 559 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-3.c
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-4.c
> 
> diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt
> index 4aea52d3a87..f213989e0bb 100644
> --- a/gcc/analyzer/analyzer.opt
> +++ b/gcc/analyzer/analyzer.opt
> @@ -78,6 +78,10 @@ Wanalyzer-malloc-leak
>  Common Var(warn_analyzer_malloc_leak) Init(1) Warning
>  Warn about code paths in which a heap-allocated pointer leaks.
> 
> +Wanalyzer-allocation-size
> +Common Var(warn_analyzer_allocation_size) Init(1) Warning
> +Warn about code paths in which a buffer is assigned to a incompatible 
> type.
> +
>  Wanalyzer-mismatching-deallocation
>  Common Var(warn_analyzer_mismatching_deallocation) Init(1) Warning
>  Warn about code paths in which the wrong deallocation function is 
> called.
> diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc
> index 3bd40425919..790c9f0e57d 100644
> --- a/gcc/analyzer/sm-malloc.cc
> +++ b/gcc/analyzer/sm-malloc.cc
> @@ -46,6 +46,8 @@ along with GCC; see the file COPYING3. If not see
>  #include "attribs.h"
>  #include "analyzer/function-set.h"
>  #include "analyzer/program-state.h"
> +#include "print-tree.h"
> +#include "gimple-pretty-print.h"
> 
>  #if ENABLE_ANALYZER
> 
> @@ -428,6 +430,7 @@ private:
>    get_or_create_deallocator (tree deallocator_fndecl);
> 
>    void on_allocator_call (sm_context *sm_ctxt,
> + const supernode *node,
>       const gcall *call,
>       const deallocator_set *deallocators,
>       bool returns_nonnull = false) const;
> @@ -444,6 +447,16 @@ private:
>    void on_realloc_call (sm_context *sm_ctxt,
>     const supernode *node,
>     const gcall *call) const;
> + void on_pointer_assignment(sm_context *sm_ctxt,
> + const supernode *node,
> + const gassign *assign_stmt,
> + tree lhs,
> + tree rhs) const;
> + void on_pointer_assignment(sm_context *sm_ctxt,
> + const supernode *node,
> + const gcall *call,
> + tree lhs,
> + tree rhs) const;
>    void on_zero_assignment (sm_context *sm_ctxt,
>        const gimple *stmt,
>        tree lhs) const;
> @@ -1432,6 +1445,117 @@ private:
>    const char *m_funcname;
>  };
> 
> +/* Concrete subclass for casts of pointers that lead to trailing 
> bytes. */
> +
> +class dubious_allocation_size : public malloc_diagnostic
> +{
> +public:
> + dubious_allocation_size (const malloc_state_machine &sm, tree lhs, 
> tree rhs,
> + tree size_tree, unsigned HOST_WIDE_INT size_diff)
> + : malloc_diagnostic(sm, rhs), 
> m_type(dubious_allocation_type::CONSTANT_SIZE),
> + m_lhs(lhs), m_size_tree(size_tree), m_size_diff(size_diff)
> + {}
> +
> + dubious_allocation_size (const malloc_state_machine &sm, tree lhs, 
> tree rhs,
> + tree size_tree)
> + : malloc_diagnostic(sm, rhs), 
> m_type(dubious_allocation_type::MISSING_OPERAND),
> + m_lhs(lhs), m_size_tree(size_tree), m_size_diff(0)
> + {}
> +
> + const char *get_kind () const final override
> + {
> + return "dubious_allocation_size";
> + }
> +
> + int get_controlling_option () const final override
> + {
> + return OPT_Wanalyzer_allocation_size;
> + }
> +
> + bool subclass_equal_p (const pending_diagnostic &base_other) const
> + final override
> + {
> + const dubious_allocation_size &other = (const dubious_allocation_size
> &)base_other;
> + return malloc_diagnostic::subclass_equal_p(other)
> + && m_type == other.m_type
> + && same_tree_p (m_lhs, other.m_lhs)
> + && same_tree_p (m_size_tree, other.m_size_tree)
> + && m_size_diff == other.m_size_diff;
> + }
> +
> + bool emit (rich_location *rich_loc) final override
> + {
> + diagnostic_metadata m;
> + m.add_cwe (131);
> + return warning_meta (rich_loc, m, get_controlling_option (),
> + "Allocated buffer size is not a multiple of the pointee's size");
> + }
> +
> + label_text describe_state_change (const evdesc::state_change &change)
> + override
> + {
> + if (change.m_old_state == m_sm.get_start_state ()
> + && unchecked_p (change.m_new_state))
> + {
> + m_alloc_event = change.m_event_id;
> + if (m_type == dubious_allocation_type::CONSTANT_SIZE)
> + {
> + // TODO: verify that it's the allocation stmt, not a copy
> + return change.formatted_print ("%E bytes allocated here",
> + m_size_tree);
> + }
> + }
> + return malloc_diagnostic::describe_state_change (change);
> + }
> +
> + label_text describe_final_event (const evdesc::final_event &ev) final
> override
> + {
> + if (m_type == dubious_allocation_type::CONSTANT_SIZE)
> + {
> + if (m_alloc_event.known_p ())
> + return ev.formatted_print (
> + "Casting %qE to %qT leaves %wu trailing bytes; either the"
> + " allocated size is bogus or the type on the left-hand side is"
> + " wrong",
> + m_arg, TREE_TYPE (m_lhs), m_size_diff);
> + else
> + return ev.formatted_print (
> + "Casting a %E byte buffer to %qT leaves %wu trailing bytes; either"
> + " the allocated size is bogus or the type on the left-hand side is"
> + " wrong",
> + m_size_tree, TREE_TYPE (m_lhs), m_size_diff);
> + }
> + else if (m_type == dubious_allocation_type::MISSING_OPERAND)
> + {
> + if (m_alloc_event.known_p ())
> + return ev.formatted_print (
> + "%qE is incompatible with %qT; either the allocated size at %@ is"
> + " bogus or the type on the left-hand side is wrong",
> + m_arg, TREE_TYPE (m_lhs), &m_alloc_event);
> + else
> + return ev.formatted_print (
> + "Allocation is incompatible with %qT; either the allocated size is"
> + " bogus or the type on the left-hand side is wrong",
> + TREE_TYPE (m_lhs));
> + }
> +
> + gcc_unreachable ();
> + return label_text ();
> + }
> +
> +private:
> + enum dubious_allocation_type {
> + CONSTANT_SIZE,
> + MISSING_OPERAND
> + };
> +
> + dubious_allocation_type m_type;
> + diagnostic_event_id_t m_alloc_event;
> + tree m_lhs;
> + tree m_size_tree;
> + unsigned HOST_WIDE_INT m_size_diff;
> +};
> +
>  /* struct allocation_state : public state_machine::state. */
> 
>  /* Implementation of state_machine::state::dump_to_pp vfunc
> @@ -1633,6 +1757,160 @@ known_allocator_p (const_tree fndecl, const 
> gcall *call)
>    return false;
>  }
> 
> +/* Returns the trailing bytes on dubious allocation sizes. */
> +
> +static unsigned HOST_WIDE_INT
> +capacity_compatible_with_type (tree cst, tree pointee_size_tree)
> +{
> + unsigned HOST_WIDE_INT pointee_size = TREE_INT_CST_LOW 
> (pointee_size_tree);
> + if (pointee_size == 0)
> + return 0;
> + unsigned HOST_WIDE_INT alloc_size = TREE_INT_CST_LOW (cst);
> +
> + return alloc_size % pointee_size;
> +}
> +
> +/* Returns true if there is a constant tree with
> + the same constant value inside the sval. */
> +
> +static bool
> +const_operand_in_sval_p (const svalue *sval, tree size_cst)
> +{
> + auto_vec<const svalue *> non_mult_expr;
> + auto_vec<const svalue *> worklist;
> + worklist.safe_push(sval);
> + while (!worklist.is_empty())
> + {
> + const svalue *curr = worklist.pop ();
> + curr = curr->unwrap_any_unmergeable ();
> +
> + switch (curr->get_kind())
> + {
> + default:
> + break;
> + case svalue_kind::SK_CONSTANT:
> + {
> + const constant_svalue *cst_sval = curr->dyn_cast_constant_svalue ();
> + unsigned HOST_WIDE_INT sval_int
> + = TREE_INT_CST_LOW (cst_sval->get_constant ());
> + unsigned HOST_WIDE_INT size_cst_int = TREE_INT_CST_LOW (size_cst);
> + if (sval_int % size_cst_int == 0)
> + return true;
> + }
> + break;
> + case svalue_kind::SK_BINOP:
> + {
> + const binop_svalue *b_sval = curr->dyn_cast_binop_svalue ();
> + if (b_sval->get_op () == MULT_EXPR)
> + {
> + worklist.safe_push (b_sval->get_arg0 ());
> + worklist.safe_push (b_sval->get_arg1 ());
> + }
> + else
> + {
> + non_mult_expr.safe_push (b_sval->get_arg0 ());
> + non_mult_expr.safe_push (b_sval->get_arg1 ());
> + }
> + }
> + break;
> + case svalue_kind::SK_UNARYOP:
> + {
> + const unaryop_svalue *un_sval = curr->dyn_cast_unaryop_svalue ();
> + worklist.safe_push (un_sval->get_arg ());
> + }
> + break;
> + case svalue_kind::SK_UNKNOWN:
> + return true;
> + }
> + }
> +
> + /* Each expr should be a multiple of the size.
> + E.g. used to catch n + sizeof(int) errors. */
> + bool reduce = !non_mult_expr.is_empty ();
> + while (!non_mult_expr.is_empty() && reduce)
> + {
> + const svalue *expr_sval = non_mult_expr.pop ();
> + reduce &= const_operand_in_sval_p (expr_sval, size_cst);
> + }
> + return reduce;
> +}
> +
> +/* Returns true iff the type is a struct with another struct inside.
> */
> +
> +static bool
> +struct_or_union_with_inheritance_p (tree type)
> +{
> + if (!RECORD_OR_UNION_TYPE_P (type))
> + return false;
> +
> + for (tree f = TYPE_FIELDS (type); f; f = TREE_CHAIN (f))
> + if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (f)))
> + return true;
> +
> + return false;
> +}
> +
> +static void
> +check_capacity (sm_context *sm_ctxt,
> + const malloc_state_machine &sm,
> + const supernode *node,
> + const gimple *stmt,
> + tree lhs,
> + tree rhs,
> + const svalue *capacity)
> +{
> + tree pointer_type = TREE_TYPE (lhs);
> + gcc_assert (TREE_CODE (pointer_type) == POINTER_TYPE);
> +
> + tree pointee_type = TREE_TYPE (pointer_type);
> + /* void * is always compatible. */
> + if (TREE_CODE (pointee_type) == VOID_TYPE)
> + return;
> +
> + if (struct_or_union_with_inheritance_p (pointee_type))
> + return;
> +
> + tree pointee_size_tree = size_in_bytes(pointee_type);
> + /* The size might be unknown e.g. being a array with n elements
> + or casting to char * never has any trailing bytes. */
> + if (TREE_CODE (pointee_size_tree) != INTEGER_CST
> + || TREE_INT_CST_LOW (pointee_size_tree) == 1)
> + return;
> +
> + switch (capacity->get_kind ())
> + {
> + default:
> + break;
> + case svalue_kind::SK_CONSTANT:
> + {
> + const constant_svalue *cst_sval = capacity->dyn_cast_constant_svalue 
> ();
> + tree cst = cst_sval->get_constant ();
> + unsigned HOST_WIDE_INT size_diff
> + = capacity_compatible_with_type (cst, pointee_size_tree);
> + if (size_diff != 0)
> + {
> + tree diag_arg = sm_ctxt->get_diagnostic_tree (rhs);
> + sm_ctxt->warn (node, stmt, diag_arg,
> + new dubious_allocation_size (sm, lhs, diag_arg,
> + cst, size_diff));
> + }
> + }
> + break;
> + case svalue_kind::SK_BINOP:
> + case svalue_kind::SK_UNARYOP:
> + {
> + if (!const_operand_in_sval_p (capacity, pointee_size_tree))
> + {
> + tree diag_arg = sm_ctxt->get_diagnostic_tree (rhs);
> + sm_ctxt->warn (node, stmt, diag_arg,
> + new dubious_allocation_size (sm, lhs, diag_arg,
> + pointee_size_tree));
> + }
> + }
> + break;
> + }
> +}
> +
>  /* Implementation of state_machine::on_stmt vfunc for 
> malloc_state_machine. */
> 
>  bool
> @@ -1645,14 +1923,14 @@ malloc_state_machine::on_stmt (sm_context 
> *sm_ctxt,
>        {
>   if (known_allocator_p (callee_fndecl, call))
>     {
> - on_allocator_call (sm_ctxt, call, &m_free);
> + on_allocator_call (sm_ctxt, node, call, &m_free);
>       return true;
>     }
> 
>   if (is_named_call_p (callee_fndecl, "operator new", call, 1))
> - on_allocator_call (sm_ctxt, call, &m_scalar_delete);
> + on_allocator_call (sm_ctxt, node, call, &m_scalar_delete);
>   else if (is_named_call_p (callee_fndecl, "operator new []", call, 1))
> - on_allocator_call (sm_ctxt, call, &m_vector_delete);
> + on_allocator_call (sm_ctxt, node, call, &m_vector_delete);
>   else if (is_named_call_p (callee_fndecl, "operator delete", call, 1)
>     || is_named_call_p (callee_fndecl, "operator delete", call, 2))
>     {
> @@ -1707,7 +1985,7 @@ malloc_state_machine::on_stmt (sm_context 
> *sm_ctxt,
>       tree attrs = TYPE_ATTRIBUTES (TREE_TYPE (callee_fndecl));
>       bool returns_nonnull
>         = lookup_attribute ("returns_nonnull", attrs);
> - on_allocator_call (sm_ctxt, call, deallocators, returns_nonnull);
> + on_allocator_call (sm_ctxt, node, call, deallocators, 
> returns_nonnull);
>     }
> 
>   /* Handle "__attribute__((nonnull))". */
> @@ -1763,12 +2041,31 @@ malloc_state_machine::on_stmt (sm_context 
> *sm_ctxt,
>         = mutable_this->get_or_create_deallocator (callee_fndecl);
>       on_deallocator_call (sm_ctxt, node, call, d, dealloc_argno);
>     }
> +
> + /* Handle returns from function calls. */
> + tree lhs = gimple_call_lhs (call);
> + if (lhs && TREE_CODE (TREE_TYPE (lhs)) == POINTER_TYPE
> + && TREE_CODE (gimple_call_return_type (call)) == POINTER_TYPE)
> + on_pointer_assignment (sm_ctxt, node, call, lhs,
> + gimple_call_fn (call));
>        }
> 
>    if (tree lhs = sm_ctxt->is_zero_assignment (stmt))
>      if (any_pointer_p (lhs))
>        on_zero_assignment (sm_ctxt, stmt,lhs);
> 
> + /* Handle pointer assignments/casts for dubious allocation size. */
> + if (const gassign *assign_stmt = dyn_cast <const gassign *> (stmt))
> + {
> + if (gimple_num_ops (stmt) == 2)
> + {
> + tree lhs = gimple_assign_lhs (assign_stmt);
> + tree rhs = gimple_assign_rhs1 (assign_stmt);
> + if (any_pointer_p (lhs) && any_pointer_p (rhs))
> + on_pointer_assignment (sm_ctxt, node, assign_stmt, lhs, rhs);
> + }
> + }
> +
>    /* Handle dereferences. */
>    for (unsigned i = 0; i < gimple_num_ops (stmt); i++)
>      {
> @@ -1818,6 +2115,7 @@ malloc_state_machine::on_stmt (sm_context 
> *sm_ctxt,
> 
>  void
>  malloc_state_machine::on_allocator_call (sm_context *sm_ctxt,
> + const supernode *node,
>        const gcall *call,
>        const deallocator_set *deallocators,
>        bool returns_nonnull) const
> @@ -1830,6 +2128,9 @@ malloc_state_machine::on_allocator_call 
> (sm_context *sm_ctxt,
>       (returns_nonnull
>        ? deallocators->m_nonnull
>        : deallocators->m_unchecked));
> +
> + if (TREE_CODE (TREE_TYPE (lhs)) == POINTER_TYPE)
> + on_pointer_assignment (sm_ctxt, node, call, lhs, gimple_call_fn 
> (call));
>      }
>    else
>      {
> @@ -1968,6 +2269,60 @@ malloc_state_machine::on_realloc_call 
> (sm_context *sm_ctxt,
>      }
>  }
> 
> +/* Handle assignments between two pointers.
> + Check for dubious allocation sizes.
> +*/
> +
> +void
> +malloc_state_machine::on_pointer_assignment (sm_context *sm_ctxt,
> + const supernode *node,
> + const gassign *assign_stmt,
> + tree lhs,
> + tree rhs) const
> +{
> + /* Do not warn if lhs and rhs are of the same type to not emit 
> duplicate
> + warnings on assignments after the cast. */
> + if (pending_diagnostic::same_tree_p (TREE_TYPE (lhs), TREE_TYPE 
> (rhs)))
> + return;
> +
> + const program_state *state = sm_ctxt->get_old_program_state ();
> + const svalue *r_value = state->m_region_model->get_rvalue (rhs,
> NULL);
> + if (const region_svalue *reg = dyn_cast <const region_svalue *> 
> (r_value))
> + {
> + const svalue *capacity = state->m_region_model->get_capacity
> + (reg->get_pointee ());
> + check_capacity(sm_ctxt, *this, node, assign_stmt, lhs, rhs,
> capacity);
> + }
> +}
> +
> +void
> +malloc_state_machine::on_pointer_assignment (sm_context *sm_ctxt,
> + const supernode *node,
> + const gcall *call,
> + tree lhs,
> + tree fn_decl) const
> +{
> + /* Do not warn if lhs and rhs are of the same type to not emit 
> duplicate
> + warnings on assignments after the cast. */
> + if (pending_diagnostic::same_tree_p
> + (TREE_TYPE (lhs), TREE_TYPE (gimple_call_return_type (call))))
> + return;
> +
> + const program_state *state = sm_ctxt->get_new_program_state ();
> + const svalue *r_value = state->m_region_model->get_rvalue (lhs,
> NULL);
> + if (const region_svalue *reg = dyn_cast <const region_svalue *> 
> (r_value))
> + {
> + const svalue *capacity = state->m_region_model->get_capacity
> + (reg->get_pointee ());
> + check_capacity (sm_ctxt, *this, node, call, lhs, fn_decl, capacity);
> + }
> + else if (const conjured_svalue *con
> + = dyn_cast <const conjured_svalue *> (r_value))
> + {
> + // FIXME: How to get a region_svalue?
> + }
> +}
> +
>  /* Implementation of state_machine::on_phi vfunc for 
> malloc_state_machine. */
> 
>  void
> diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c 
> b/gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c
> new file mode 100644
> index 00000000000..5403c5f41f1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c
> @@ -0,0 +1,54 @@
> +#include <stdlib.h>
> +
> +/* Tests with constant buffer sizes */
> +
> +void test_1 (void)
> +{
> + short *ptr = malloc (21 * sizeof(short));
> + free (ptr);
> +}
> +
> +void test_2 (void)
> +{
> + int *ptr = malloc (21 * sizeof (short)); /* { dg-line malloc } */
> + free (ptr);
> +
> + /* { dg-warning "Allocated buffer size is not a multiple of the 
> pointee's size" "" { target *-*-* } malloc } */
> + /* { dg-message "\\(1\\) Casting a 42 byte buffer to 'int \\*' leaves
> 2 trailing bytes" "" { target *-*-* } malloc } */
> +}
> +
> +void test_3 (void)
> +{
> + void *ptr = malloc (21 * sizeof (short));
> + short *sptr = (short *)ptr;
> + free (sptr);
> +}
> +
> +void test_4 (void)
> +{
> + void *ptr = malloc (21 * sizeof (short)); /* { dg-message } */
> + int *iptr = (int *)ptr; /* { dg-line assign } */
> + free (iptr);
> +
> + /* { dg-warning "Allocated buffer size is not a multiple of the 
> pointee's size" "" { target *-*-* } assign } */
> + /* { dg-message "\\(2\\) Casting 'ptr' to 'int \\*' leaves 2 trailing
> bytes" "" { target *-*-* } assign } */
> +}
> +
> +struct s {
> + int i;
> +};
> +
> +void test_5 (void)
> +{
> + struct s *ptr = malloc (5 * sizeof (struct s));
> + free (ptr);
> +}
> +
> +void test_6 (void)
> +{
> + long *ptr = malloc (5 * sizeof (struct s)); /* { dg-line malloc6 } */
> + free (ptr);
> +
> + /* { dg-warning "" "" { target *-*-* } malloc6 } */
> + /* { dg-message "" "" { target *-*-* } malloc6 } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c 
> b/gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c
> new file mode 100644
> index 00000000000..e66d2793f13
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c
> @@ -0,0 +1,44 @@
> +#include <stdlib.h>
> +#include <stdio.h>
> +
> +/* Tests with symbolic buffer sizes */
> +
> +void test_1 (void)
> +{
> + int n;
> + scanf("%i", &n);
> + short *ptr = malloc (n * sizeof(short));
> + free (ptr);
> +}
> +
> +void test_2 (void)
> +{
> + int n;
> + scanf("%i", &n);
> + int *ptr = malloc (n * sizeof (short)); /* { dg-line malloc } */
> + free (ptr);
> +
> + /* { dg-warning "Allocated buffer size is not a multiple of the 
> pointee's size" "" { target *-*-* } malloc } */
> + /* { dg-message "\\(1\\) Allocation is incompatible with 'int \\*'"
> "" { target *-*-* } malloc } */
> +}
> +
> +void test_3 (void)
> +{
> + int n;
> + scanf("%i", &n);
> + void *ptr = malloc (n * sizeof (short));
> + short *sptr = (short *)ptr;
> + free (sptr);
> +}
> +
> +void test_4 (void)
> +{
> + int n;
> + scanf("%i", &n);
> + void *ptr = malloc (n * sizeof (short)); /* { dg-message } */
> + int *iptr = (int *)ptr; /* { dg-line assign } */
> + free (iptr);
> +
> + /* { dg-warning "Allocated buffer size is not a multiple of the 
> pointee's size" "" { target *-*-* } assign } */
> + /* { dg-message "\\(2\\) 'ptr' is incompatible with 'int \\*'; either
> the allocated size at \\(1\\)" "" { target *-*-* } assign } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-3.c 
> b/gcc/testsuite/gcc.dg/analyzer/allocation-size-3.c
> new file mode 100644
> index 00000000000..dafc0e73c63
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-3.c
> @@ -0,0 +1,48 @@
> +#include <stdlib.h>
> +#include <stdio.h>
> +
> +/* CWE-131 example 5 */
> +void test_1(void)
> +{
> + int *id_sequence = (int *) malloc (3); /* { dg-line malloc1 } */
> + if (id_sequence == NULL) exit (1);
> +
> + id_sequence[0] = 13579;
> + id_sequence[1] = 24680;
> + id_sequence[2] = 97531;
> +
> + free (id_sequence);
> +
> + /* { dg-warning "" "" { target *-*-* } malloc1 } */
> + /* { dg-message "" "" { target *-*-* } malloc1 } */
> +}
> +
> +void test_2(void)
> +{
> + int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } */
> + free (ptr);
> +
> + /* { dg-warning "" "" { target *-*-* } malloc2 } */
> + /* { dg-message "" "" { target *-*-* } malloc2 } */
> +}
> +
> +void test_3(void)
> +{
> + int n;
> + scanf("%i", &n);
> + int *ptr = malloc (n + sizeof (int)); /* { dg-line malloc3 } */
> + free (ptr);
> +
> + /* { dg-warning "" "" { target *-*-* } malloc3 } */
> + /* { dg-message "" "" { target *-*-* } malloc3 } */
> +}
> +
> +void test_4(void)
> +{
> + int n;
> + scanf("%i", &n);
> + int m;
> + scanf("%i", &m);
> + int *ptr = malloc ((n + m) * sizeof (int));
> + free (ptr);
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-4.c 
> b/gcc/testsuite/gcc.dg/analyzer/allocation-size-4.c
> new file mode 100644
> index 00000000000..4c2b31d6e0a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-4.c
> @@ -0,0 +1,39 @@
> +#include <stddef.h>
> +#include <stdlib.h>
> +
> +/* Flow warnings */
> +
> +void *create_buffer(int n)
> +{
> + return malloc(n);
> +}
> +
> +void test_1(void)
> +{
> + // FIXME
> + int *buf = create_buffer(42); /* { dg-warning "" "" { xfail *-*-* } }
> */
> + free (buf);
> +}
> +
> +void test_2(void)
> +{
> + void *buf = create_buffer(42); /* { dg-message } */
> + int *ibuf = buf; /* { dg-line assign2 } */
> + free (ibuf);
> +
> + /* { dg-warning "" "" { target *-*-* } assign2 } */
> + /* { dg-message "" "" { target *-*-* } assign2 } */
> +}
> +
> +void test_3(void)
> +{
> + void *buf = malloc(42); /* { dg-message } */
> + if (buf != NULL) /* { dg-message } */
> + {
> + int *ibuf = buf; /* { dg-line assign3 } */
> + free (ibuf);
> + }
> +
> + /* { dg-warning "" "" { target *-*-* } assign3 } */
> + /* { dg-message "" "" { target *-*-* } assign3 } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-malloc-6.c 
> b/gcc/testsuite/gcc.dg/analyzer/attr-malloc-6.c
> index bd28107d0d7..809ee88cf07 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/attr-malloc-6.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/attr-malloc-6.c
> @@ -1,7 +1,9 @@
> +/* { dg-additional-options -Wno-analyzer-allocation-size } */
>  /* Adapted from gcc.dg/Wmismatched-dealloc.c. */
> 
>  #define A(...) __attribute__ ((malloc (__VA_ARGS__)))
> 
> +struct FILE {};
>  typedef struct FILE FILE;
>  typedef __SIZE_TYPE__ size_t;
> 
> diff --git a/gcc/testsuite/gcc.dg/analyzer/capacity-1.c 
> b/gcc/testsuite/gcc.dg/analyzer/capacity-1.c
> index 2d124833296..94f569e390b 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/capacity-1.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/capacity-1.c
> @@ -89,8 +89,11 @@ struct s
>  static struct s * __attribute__((noinline))
>  alloc_s (size_t num)
>  {
> - struct s *p = malloc (sizeof(struct s) + num);
> + struct s *p = malloc (sizeof(struct s) + num); /* { dg-line malloc } 
> */
>    return p;
> +
> + /* { dg-warning "" "" { target *-*-* } malloc } */
> + /* { dg-message "" "" { target *-*-* } malloc } */
>  }
> 
>  struct s *
> diff --git a/gcc/testsuite/gcc.dg/analyzer/malloc-4.c 
> b/gcc/testsuite/gcc.dg/analyzer/malloc-4.c
> index 908bb28ee50..0ca94250ba2 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/malloc-4.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/malloc-4.c
> @@ -1,9 +1,9 @@
> -/* { dg-additional-options "-Wno-incompatible-pointer-types" } */
> +/* { dg-additional-options "-Wno-incompatible-pointer-types 
> -Wno-analyzer-allocation-size" } */
> 
>  #include <stdlib.h>
> 
> -struct foo;
> -struct bar;
> +struct foo {};
> +struct bar {};
>  void *hv (struct foo **tm)
>  {
>    void *p = __builtin_malloc (4);
> diff --git a/gcc/testsuite/gcc.dg/analyzer/pr96639.c 
> b/gcc/testsuite/gcc.dg/analyzer/pr96639.c
> index 02ca3f084a2..6f365c3cb5d 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/pr96639.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/pr96639.c
> @@ -1,3 +1,5 @@
> +/* { dg-additional-options -Wno-analyzer-allocation-size } */
> +
>  void *calloc (__SIZE_TYPE__, __SIZE_TYPE__);
> 
>  int
> 
> 

  reply	other threads:[~2022-06-17 20:23 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-17 15:54 Tim Lange
2022-06-17 17:15 ` Prathamesh Kulkarni
2022-06-17 19:23   ` Tim Lange
2022-06-17 21:39     ` David Malcolm
2022-06-17 17:48 ` David Malcolm
2022-06-17 20:23   ` Tim Lange [this message]
2022-06-17 22:13     ` David Malcolm
2022-06-21 20:00       ` Tim Lange
2022-06-21 23:16         ` David Malcolm
2022-06-22 14:57           ` Tim Lange
2022-06-22 18:23             ` David Malcolm
2022-06-17 18:34 ` [RFC] analyzer: add " Tim Lange
2022-06-29 15:39 ` [PATCH v2] analyzer: add allocation size checker Tim Lange
2022-06-29 17:39   ` David Malcolm
2022-06-30 20:40     ` Tim Lange
2022-06-30 22:11 ` [PATCH v3] analyzer: add allocation size checker [PR105900] Tim Lange
2022-06-30 22:47   ` David Malcolm

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220617202305.kg2otzye6k57fgyo@fedora \
    --to=mail@tim-lange.me \
    --cc=dmalcolm@redhat.com \
    --cc=gcc@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).