From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from www523.your-server.de (www523.your-server.de [159.69.224.22]) by sourceware.org (Postfix) with ESMTPS id 0B93C3857BB9 for ; Fri, 17 Jun 2022 20:23:09 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 0B93C3857BB9 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=tim-lange.me Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=tim-lange.me DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=tim-lange.me; s=default2108; h=In-Reply-To:Content-Transfer-Encoding: Content-Type:MIME-Version:References:Message-ID:Subject:Cc:To:From:Date: Sender:Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID; bh=Cio+oAqoH1vA5dMbHpDY+yOTzp4fTqKGbPbgDY/n/00=; b=IjJz4syONpPkdJStoRNrQUP9B7 AkOEaLuE8bws+FkmOz2uao0GV45L9p4A5jAeb2kFlY4bN0DJwEA60eeZRPCy4SmcQYVE99VQ1zSck CoDeE8D0xr9nqW5mqeWUS51ji3MSmY1eyjIkRIV+crNEto+mCIVr08dMAJob2JBUf8/WBZOMuGWIC fYauwey5Yj2dnPbtx0y7cBl+VZ0Bjc+ZKEdWC0lkL2YItcTksOM8I+uUGRnlIdzELYuKV2mfGSi2y 6oHGNxeQrl1E2XR+n1b6zQAJ7EExaonp1BnTdazICNn2RRhhWxT5vRVM0m0diBhOT7d6z8RNeqvAu TiANNoXg==; Received: from sslproxy05.your-server.de ([78.46.172.2]) by www523.your-server.de with esmtpsa (TLSv1.3:TLS_AES_256_GCM_SHA384:256) (Exim 4.92.3) (envelope-from ) id 1o2IUl-0004Ky-Fj; Fri, 17 Jun 2022 22:23:07 +0200 Received: from [2a02:908:1864:dbe0::6768] (helo=fedora) by sslproxy05.your-server.de with esmtpsa (TLSv1.3:TLS_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1o2IUl-000IZK-A3; Fri, 17 Jun 2022 22:23:07 +0200 Date: Fri, 17 Jun 2022 22:23:05 +0200 From: Tim Lange To: David Malcolm Cc: GCC Mailing List Subject: Re: [RFC] analyzer: allocation size warning Message-ID: <20220617202305.kg2otzye6k57fgyo@fedora> References: <42afee94ad4da41b87980c6b4a7cd7dcc6cb1e97.camel@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <42afee94ad4da41b87980c6b4a7cd7dcc6cb1e97.camel@redhat.com> X-Authenticated-Sender: mail@tim-lange.me X-Virus-Scanned: Clear (ClamAV 0.103.6/26575/Fri Jun 17 10:08:05 2022) X-Spam-Status: No, score=-8.9 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_INFOUSMEBIZ, KAM_SHORT, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 17 Jun 2022 20:23:14 -0000 On Fri, Jun 17, 2022 at 01:48:09PM -0400, David Malcolm wrote: > On Fri, 2022-06-17 at 17:54 +0200, Tim Lange wrote: > > Hi everyone, > > Hi Tim. > > Thanks for the patch. > > Various comments inline below, throughout... > > > > > tracked in PR105900 [0], I'd like to add support for a new warning on > > dubious allocation sizes. The new checker emits a warning when the > > allocation size is not a multiple of the type's size. With the checker, > > following mistakes are detected: > >   int *arr = malloc(3); // forgot to multiply by sizeof > >   arr[0] = ...; > >   arr[1] = ...; > > or > >   int *buf = malloc (n + sizeof(int)); // probably should be * instead > > of + > > Because it is implemented inside the analyzer, it also emits warnings > > when the buffer is first of type void* and later on casted to something > > else. Though, this also inherits a limitation. The checker can not > > distinguish 2 * sizeof(short) from sizeof(int) because sizeof is > > resolved and constants are folded at the point when the analyzer runs. > > As a mitigation, I plan to implement a check in the frontend that emits > > a warning if sizeof(lhs pointee type) is not part of the malloc > > argument. > > > > I'm looking for a first feedback on the phrasing of the diagnostics as > > well on the preliminary patch [1]. > > > > On constant buffer sizes, the warnings looks like this: > > warning: Allocated buffer size is not a multiple of the pointee's size > > [CWE-131] [-Wanalyzer-allocation-size] > >    22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } */ > >       | ^~~~~~~~~~~~~~~~~~~~~~~~~ > >   ‘test_2’: event 1 > >     | > >     | 22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } > > */ > >     | | ^~~~~~~~~~~~~~~~~~~~~~~~~ > >     | | | > >     | | (1) Casting a 14 byte buffer to ‘int *’ leaves 2 trailing > > bytes; either the allocated size is bogus or the type on the left-hand > > side is wrong > >     | > > Something strange seems to have happened with the indentation in your > email; the code in the patch seems to me to be strangely indented, and > looking at the archive here: > https://gcc.gnu.org/pipermail/gcc/2022-June/238907.html > I see the same thing, so I think it's a problem with what the mailing > list received, rather than just in my mail client. Maybe something > > FWIW I normally use "git send-email" to send patches. > > The underlinings in the above look strange; I see this in your email: I have resent the patch using git send-email as a reply to my original message. The new message looks properly formatted in the archive: https://gcc.gnu.org/pipermail/gcc/2022-June/238911.html > > warning: Allocated buffer size is not a multiple of the pointee's size > [CWE-131] [-Wanalyzer-allocation-size] > 22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } */ > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > ‘test_2’: event 1 > | > | 22 | int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } > */ > | | ^~~~~~~~~~~~~~~~~~~~~~~~~ > | | | > | | (1) Casting a 14 byte buffer to ‘int *’ leaves 2 trailing > bytes; either the allocated size is bogus or the type on the left-hand > side is wrong > | > > Should it have been (omitting the dg-line directives for clarity): > > warning: Allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] > 22 | int *ptr = malloc (10 + sizeof(int)); > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > ‘test_2’: event 1 > | > | 22 | int *ptr = malloc (10 + sizeof(int)); > | | ^~~~~~~~~~~~~~~~~~~~~~~~~ > | | | > | | (1) Casting a 14 byte buffer to ‘int *’ leaves 2 trailing bytes; either the allocated size is bogus or the type on the left-hand side is wrong > | > > ? > > It looks like something somewhere has collapsed repeated whitespace in > the message down to single spaces, which has broken the ASCII art in > your examples, and the indentation in your code. > > > It would probably be helpful for the message to tell the user what > sizeof(*ptr) is, sizeof(int) in this case (much more helpful when it's > a struct) > > Maybe something alike: > > note: a buffer of 14 bytes is allocated... > note: ...but sizeof (int) is 4 bytes... > note: ...leaving 2 trailing bytes for an array of 3 'int's (which would > occupy 12 bytes) > > or somesuch??? > > I'm brainstorming here, my ideas above aren't necessarily good. > Sometimes it's good to chop up messages like this, to minimize > combinatorial explosion for all the different cases. > > > > On symbolic buffer sizes: > warning: Allocated buffer size is not a multiple of the pointee's size > [CWE-131] [-Wanalyzer-allocation-size] >    33 | int *ptr = malloc (n + sizeof(int)); /* { dg-line malloc3 } */ >       | ^~~~~~~~~~~~~~~~~~~~~~~~ >   ‘test_3’: event 1 >     | >     | 33 | int *ptr = malloc (n + sizeof(int)); /* { dg-line malloc3 } > */ >     | | ^~~~~~~~~~~~~~~~~~~~~~~~ >     | | | >     | | (1) Allocation is incompatible with ‘int *’; either the > allocated size is bogus or the type on the left-hand side is wrong >     | > > > Is there location information for both the malloc and for the > assignment, here? I'm not sure whether I understand your question but the warning is emitted at the gcall* with a ssa var lhs and the call_fndecl on the rhs. I think that is enough to split that up into "(1) n + sizeof(int) allocated here" and "(2) Allocation at (1) is incompatible with..."? > > If so, then maybe two events: > > warning: Allocated buffer size is not a multiple of the pointee's size > [CWE-131] [-Wanalyzer-allocation-size] > 33 | int *ptr = malloc (n + sizeof(int)); > | ^~~~~~~~~~~~~~~~~~~~~~~~ > ‘test_3’: events 1-2 > | > | 33 | int *ptr = malloc (n + sizeof(int)); > | | ^ ^~~~~~~~~~~~~~~~~~~~~~~~ > | | | | > | | | (1) buffer allocated here with size 'n + 4' > | | | > | | (2) sizeof(*ptr) is 4 > | > > or somesuch. > > > > And this is how a simple flow looks like: > warning: Allocated buffer size is not a multiple of the pointee's size > [CWE-131] [-Wanalyzer-allocation-size] >    39 | int *iptr = (int *)ptr; /* { dg-line assign } */ >       | ^~~~ >   ‘test_4’: events 1-2 >     | >     | 38 | void *ptr = malloc (n * sizeof (short)); /* { dg-message } > */ >     | | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ >     | | | >     | | (1) allocated here >     | 39 | int *iptr = (int *)ptr; /* { dg-line assign } */ >     | | ~~~~ >     | | | >     | | (2) ‘ptr’ is incompatible with ‘int *’; either the > allocated size at (1) is bogus or the type on the left-hand side is > wrong >     | > > > > I think it would make the diagnostic more readable if the "allocated > here" event's message expresses how big the buffer is e.g.: > > warning: Allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] > 39 | int *iptr = (int *)ptr; > | ^~~~ > ‘test_4’: events 1-2 > | > | 38 | void *ptr = malloc (n * sizeof (short));*/ > | | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ > | | | > | | (1) allocated here with size '(n * 2)' > | 39 | int *iptr = (int *)ptr; > | | ~~~~ > | | | > | | (2) ‘ptr’ is incompatible with ‘int *’; sizeof(int) is 4 > | > > > There are some things to discuss from my side: > * The tests with the "toy re-implementation of CPython's object > model"[2] fail due to a extra warning emitted. Because the analyzer > can't know the calculation actually results in a correct buffer size > when viewed as a string_obj later on, it emits a warning, e.g. at line > 61 in data-model-5.c. The only mitigation would be to disable the > warning for structs entirely. Now, the question is to rather have noise > on these cases or disable the warning for structs entirely? > > Can you post the full warning please? /path/to/data-model-5.c: In function ‘alloc_obj’: /path/to/data-model-5.c:61:31: warning: Allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size] 61 | base_obj *obj = (base_obj *)malloc (sz); | ^~~~~~~~~~~ ‘new_string_obj’: events 1-2 | | 69 | base_obj *new_string_obj (const char *str) | | ^~~~~~~~~~~~~~ | | | | | (1) entry to ‘new_string_obj’ |...... | 75 | = (string_obj *)alloc_obj (&str_type, sizeof (string_obj) + len + 1); | | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | (2) calling ‘alloc_obj’ from ‘new_string_obj’ | +--> ‘alloc_obj’: events 3-4 | | 59 | base_obj *alloc_obj (type_obj *ob_type, size_t sz) | | ^~~~~~~~~ | | | | | (3) entry to ‘alloc_obj’ | 60 | { | 61 | base_obj *obj = (base_obj *)malloc (sz); | | ~~~~~~~~~~~ | | | | | (4) Allocation is incompatible with ‘base_obj *’; either the allocated size is bogus or the type on the left-hand side is wrong | > > These testcases exhibit a common way of faking inheritance in C, and I > think it ought to be possible to support this in the warning. > > I thing what's happening is we have > > struct base > {  > /* fields */ > }; > > struct sub > { > struct base m_base; > /* extra fields. */ > }; > > struct base *construct_base (size_t sz) > { > struct base *p = (struct base *) malloc (sz); > > /* set up fields of base in p */ > > return p; > } > > Or is this on the interprocedural path as called with a specific sizeof > for struct sub? At (4), it does not know that base_obj is later used as a "base struct". As it is called with sizeof(struct sub), my checker thinks the buffer is too large for one but too small for another base_obj. > > Maybe we can special-case these by detecting where struct sub's first > field is struct base, and hence where we expect this pattern? (and use > this to suppress the warning for such cases?) I already excluded all structs with structs inside with struct_or_union_with_inheritance_p inside sm-malloc.cc. This does not help in the case size for struct sub is allocated but casted as base. Maybe, we should do a special case for structs where we only warn when the sizeof is too small to hold the base struct together with supressing warnings when the first field is a struct? > > > * I'm unable to emit a warning whenever the cast happens at an > assignment with a call as the rhs, e.g. test_1 in allocation-size-4.c. > This is because I'm unable to access a region_svalue for the returned > value. Even in the new_program_state, the svalue of the lhs is still a > conjured_svalue. Maybe David can lead me to a place where I can access > the return value's region_svalue or do I have to adapt the engine? > > Please can you try reposting the patch? I tried to read it, but am > having trouble with the mangled indentation. See my inline answer above. Both, the test case and from where I want to access the region_svalue are commented with // FIXME. > > > * attr-malloc-6.c and pr96639.c did both contain structs without an > implementation. Something in the analyzer must have triggered another > warning about the usage of those without them having an implementation. > I changed those structs to have an empty implementation, such that the > additional warning are gone. I think this shouldn't change the test > case, so is this change okay? > > What were the new warnings? /path/to/attr-malloc-6.c:175:15: error: invalid use of undefined type ‘struct FILE’ 175 | FILE *p = malloc (100); // { dg-message "allocated here" } | ^~~~~~~~~~~~ All were like the one above. error: invalid use of undefined type 'struct XXX' > > Thanks for the patch; sorry if this seems nitpicky; the patch seems > promising Thanks for the fast reply. I'll try out all the suggestions regarding splitting up the allocation and assignment and see how they look. - Tim > > Dave > > > > - Tim > > [0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105900 > [1] While all tests except the cpython ones work, I have yet to test it > on large C projects > [2] FAIL: gcc.dg/analyzer/data-model-5.c (test for excess errors) >     FAIL: gcc.dg/analyzer/data-model-5b.c (test for excess errors) >     FAIL: gcc.dg/analyzer/data-model-5c.c (test for excess errors) >     FAIL: gcc.dg/analyzer/data-model-5d.c (test for excess errors) >     FAIL: gcc.dg/analyzer/first-field-2.c (test for excess errors) > > ------- > > Subject: [PATCH] analyzer: add allocation size warning > > This patch adds an allocation size checker to the analyzer. > The checker warns when the tracked buffer size is not a multiple of the > left-hand side pointee's type. This resolves PR analyzer/105900. > > The patch is not yet fully tested. > > gcc/analyzer/ChangeLog: > >         * analyzer.opt: Add Wanalyzer-allocation-size. >         * sm-malloc.cc (class dubious_allocation_size): New > pending_diagnostic subclass. >         (capacity_compatible_with_type): New. >         (const_operand_in_sval_p): New. >         (struct_or_union_with_inheritance_p): New. >         (check_capacity): New. >         (malloc_state_machine::on_stmt): Add calls to > on_pointer_assignment. >         (malloc_state_machine::on_allocator_call): Add node to > parameters and call to on_pointer_assignment. >         (malloc_state_machine::on_pointer_assignment): New. > > gcc/testsuite/ChangeLog: > >         * gcc.dg/analyzer/attr-malloc-6.c: Disabled > Wanalyzer-allocation-size and added default implementation for FILE. >         * gcc.dg/analyzer/capacity-1.c: Added dg directives. >         * gcc.dg/analyzer/malloc-4.c: Disabled > Wanalyzer-allocation-size. >         * gcc.dg/analyzer/pr96639.c: Disabled Wanalyzer-allocation-size > and added default implementation for foo and bar. >         * gcc.dg/analyzer/allocation-size-1.c: New test. >         * gcc.dg/analyzer/allocation-size-2.c: New test. >         * gcc.dg/analyzer/allocation-size-3.c: New test. >         * gcc.dg/analyzer/allocation-size-4.c: New test. > > Signed-off-by: Tim Lange > --- >  gcc/analyzer/analyzer.opt | 4 + >  gcc/analyzer/sm-malloc.cc | 363 +++++++++++++++++- >  .../gcc.dg/analyzer/allocation-size-1.c | 54 +++ >  .../gcc.dg/analyzer/allocation-size-2.c | 44 +++ >  .../gcc.dg/analyzer/allocation-size-3.c | 48 +++ >  .../gcc.dg/analyzer/allocation-size-4.c | 39 ++ >  gcc/testsuite/gcc.dg/analyzer/attr-malloc-6.c | 2 + >  gcc/testsuite/gcc.dg/analyzer/capacity-1.c | 5 +- >  gcc/testsuite/gcc.dg/analyzer/malloc-4.c | 6 +- >  gcc/testsuite/gcc.dg/analyzer/pr96639.c | 2 + >  10 files changed, 559 insertions(+), 8 deletions(-) >  create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c >  create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c >  create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-3.c >  create mode 100644 gcc/testsuite/gcc.dg/analyzer/allocation-size-4.c > > diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt > index 4aea52d3a87..f213989e0bb 100644 > --- a/gcc/analyzer/analyzer.opt > +++ b/gcc/analyzer/analyzer.opt > @@ -78,6 +78,10 @@ Wanalyzer-malloc-leak >  Common Var(warn_analyzer_malloc_leak) Init(1) Warning >  Warn about code paths in which a heap-allocated pointer leaks. > > +Wanalyzer-allocation-size > +Common Var(warn_analyzer_allocation_size) Init(1) Warning > +Warn about code paths in which a buffer is assigned to a incompatible > type. > + >  Wanalyzer-mismatching-deallocation >  Common Var(warn_analyzer_mismatching_deallocation) Init(1) Warning >  Warn about code paths in which the wrong deallocation function is > called. > diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc > index 3bd40425919..790c9f0e57d 100644 > --- a/gcc/analyzer/sm-malloc.cc > +++ b/gcc/analyzer/sm-malloc.cc > @@ -46,6 +46,8 @@ along with GCC; see the file COPYING3. If not see >  #include "attribs.h" >  #include "analyzer/function-set.h" >  #include "analyzer/program-state.h" > +#include "print-tree.h" > +#include "gimple-pretty-print.h" > >  #if ENABLE_ANALYZER > > @@ -428,6 +430,7 @@ private: >    get_or_create_deallocator (tree deallocator_fndecl); > >    void on_allocator_call (sm_context *sm_ctxt, > + const supernode *node, >       const gcall *call, >       const deallocator_set *deallocators, >       bool returns_nonnull = false) const; > @@ -444,6 +447,16 @@ private: >    void on_realloc_call (sm_context *sm_ctxt, >     const supernode *node, >     const gcall *call) const; > + void on_pointer_assignment(sm_context *sm_ctxt, > + const supernode *node, > + const gassign *assign_stmt, > + tree lhs, > + tree rhs) const; > + void on_pointer_assignment(sm_context *sm_ctxt, > + const supernode *node, > + const gcall *call, > + tree lhs, > + tree rhs) const; >    void on_zero_assignment (sm_context *sm_ctxt, >        const gimple *stmt, >        tree lhs) const; > @@ -1432,6 +1445,117 @@ private: >    const char *m_funcname; >  }; > > +/* Concrete subclass for casts of pointers that lead to trailing > bytes. */ > + > +class dubious_allocation_size : public malloc_diagnostic > +{ > +public: > + dubious_allocation_size (const malloc_state_machine &sm, tree lhs, > tree rhs, > + tree size_tree, unsigned HOST_WIDE_INT size_diff) > + : malloc_diagnostic(sm, rhs), > m_type(dubious_allocation_type::CONSTANT_SIZE), > + m_lhs(lhs), m_size_tree(size_tree), m_size_diff(size_diff) > + {} > + > + dubious_allocation_size (const malloc_state_machine &sm, tree lhs, > tree rhs, > + tree size_tree) > + : malloc_diagnostic(sm, rhs), > m_type(dubious_allocation_type::MISSING_OPERAND), > + m_lhs(lhs), m_size_tree(size_tree), m_size_diff(0) > + {} > + > + const char *get_kind () const final override > + { > + return "dubious_allocation_size"; > + } > + > + int get_controlling_option () const final override > + { > + return OPT_Wanalyzer_allocation_size; > + } > + > + bool subclass_equal_p (const pending_diagnostic &base_other) const > + final override > + { > + const dubious_allocation_size &other = (const dubious_allocation_size > &)base_other; > + return malloc_diagnostic::subclass_equal_p(other) > + && m_type == other.m_type > + && same_tree_p (m_lhs, other.m_lhs) > + && same_tree_p (m_size_tree, other.m_size_tree) > + && m_size_diff == other.m_size_diff; > + } > + > + bool emit (rich_location *rich_loc) final override > + { > + diagnostic_metadata m; > + m.add_cwe (131); > + return warning_meta (rich_loc, m, get_controlling_option (), > + "Allocated buffer size is not a multiple of the pointee's size"); > + } > + > + label_text describe_state_change (const evdesc::state_change &change) > + override > + { > + if (change.m_old_state == m_sm.get_start_state () > + && unchecked_p (change.m_new_state)) > + { > + m_alloc_event = change.m_event_id; > + if (m_type == dubious_allocation_type::CONSTANT_SIZE) > + { > + // TODO: verify that it's the allocation stmt, not a copy > + return change.formatted_print ("%E bytes allocated here", > + m_size_tree); > + } > + } > + return malloc_diagnostic::describe_state_change (change); > + } > + > + label_text describe_final_event (const evdesc::final_event &ev) final > override > + { > + if (m_type == dubious_allocation_type::CONSTANT_SIZE) > + { > + if (m_alloc_event.known_p ()) > + return ev.formatted_print ( > + "Casting %qE to %qT leaves %wu trailing bytes; either the" > + " allocated size is bogus or the type on the left-hand side is" > + " wrong", > + m_arg, TREE_TYPE (m_lhs), m_size_diff); > + else > + return ev.formatted_print ( > + "Casting a %E byte buffer to %qT leaves %wu trailing bytes; either" > + " the allocated size is bogus or the type on the left-hand side is" > + " wrong", > + m_size_tree, TREE_TYPE (m_lhs), m_size_diff); > + } > + else if (m_type == dubious_allocation_type::MISSING_OPERAND) > + { > + if (m_alloc_event.known_p ()) > + return ev.formatted_print ( > + "%qE is incompatible with %qT; either the allocated size at %@ is" > + " bogus or the type on the left-hand side is wrong", > + m_arg, TREE_TYPE (m_lhs), &m_alloc_event); > + else > + return ev.formatted_print ( > + "Allocation is incompatible with %qT; either the allocated size is" > + " bogus or the type on the left-hand side is wrong", > + TREE_TYPE (m_lhs)); > + } > + > + gcc_unreachable (); > + return label_text (); > + } > + > +private: > + enum dubious_allocation_type { > + CONSTANT_SIZE, > + MISSING_OPERAND > + }; > + > + dubious_allocation_type m_type; > + diagnostic_event_id_t m_alloc_event; > + tree m_lhs; > + tree m_size_tree; > + unsigned HOST_WIDE_INT m_size_diff; > +}; > + >  /* struct allocation_state : public state_machine::state. */ > >  /* Implementation of state_machine::state::dump_to_pp vfunc > @@ -1633,6 +1757,160 @@ known_allocator_p (const_tree fndecl, const > gcall *call) >    return false; >  } > > +/* Returns the trailing bytes on dubious allocation sizes. */ > + > +static unsigned HOST_WIDE_INT > +capacity_compatible_with_type (tree cst, tree pointee_size_tree) > +{ > + unsigned HOST_WIDE_INT pointee_size = TREE_INT_CST_LOW > (pointee_size_tree); > + if (pointee_size == 0) > + return 0; > + unsigned HOST_WIDE_INT alloc_size = TREE_INT_CST_LOW (cst); > + > + return alloc_size % pointee_size; > +} > + > +/* Returns true if there is a constant tree with > + the same constant value inside the sval. */ > + > +static bool > +const_operand_in_sval_p (const svalue *sval, tree size_cst) > +{ > + auto_vec non_mult_expr; > + auto_vec worklist; > + worklist.safe_push(sval); > + while (!worklist.is_empty()) > + { > + const svalue *curr = worklist.pop (); > + curr = curr->unwrap_any_unmergeable (); > + > + switch (curr->get_kind()) > + { > + default: > + break; > + case svalue_kind::SK_CONSTANT: > + { > + const constant_svalue *cst_sval = curr->dyn_cast_constant_svalue (); > + unsigned HOST_WIDE_INT sval_int > + = TREE_INT_CST_LOW (cst_sval->get_constant ()); > + unsigned HOST_WIDE_INT size_cst_int = TREE_INT_CST_LOW (size_cst); > + if (sval_int % size_cst_int == 0) > + return true; > + } > + break; > + case svalue_kind::SK_BINOP: > + { > + const binop_svalue *b_sval = curr->dyn_cast_binop_svalue (); > + if (b_sval->get_op () == MULT_EXPR) > + { > + worklist.safe_push (b_sval->get_arg0 ()); > + worklist.safe_push (b_sval->get_arg1 ()); > + } > + else > + { > + non_mult_expr.safe_push (b_sval->get_arg0 ()); > + non_mult_expr.safe_push (b_sval->get_arg1 ()); > + } > + } > + break; > + case svalue_kind::SK_UNARYOP: > + { > + const unaryop_svalue *un_sval = curr->dyn_cast_unaryop_svalue (); > + worklist.safe_push (un_sval->get_arg ()); > + } > + break; > + case svalue_kind::SK_UNKNOWN: > + return true; > + } > + } > + > + /* Each expr should be a multiple of the size. > + E.g. used to catch n + sizeof(int) errors. */ > + bool reduce = !non_mult_expr.is_empty (); > + while (!non_mult_expr.is_empty() && reduce) > + { > + const svalue *expr_sval = non_mult_expr.pop (); > + reduce &= const_operand_in_sval_p (expr_sval, size_cst); > + } > + return reduce; > +} > + > +/* Returns true iff the type is a struct with another struct inside. > */ > + > +static bool > +struct_or_union_with_inheritance_p (tree type) > +{ > + if (!RECORD_OR_UNION_TYPE_P (type)) > + return false; > + > + for (tree f = TYPE_FIELDS (type); f; f = TREE_CHAIN (f)) > + if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (f))) > + return true; > + > + return false; > +} > + > +static void > +check_capacity (sm_context *sm_ctxt, > + const malloc_state_machine &sm, > + const supernode *node, > + const gimple *stmt, > + tree lhs, > + tree rhs, > + const svalue *capacity) > +{ > + tree pointer_type = TREE_TYPE (lhs); > + gcc_assert (TREE_CODE (pointer_type) == POINTER_TYPE); > + > + tree pointee_type = TREE_TYPE (pointer_type); > + /* void * is always compatible. */ > + if (TREE_CODE (pointee_type) == VOID_TYPE) > + return; > + > + if (struct_or_union_with_inheritance_p (pointee_type)) > + return; > + > + tree pointee_size_tree = size_in_bytes(pointee_type); > + /* The size might be unknown e.g. being a array with n elements > + or casting to char * never has any trailing bytes. */ > + if (TREE_CODE (pointee_size_tree) != INTEGER_CST > + || TREE_INT_CST_LOW (pointee_size_tree) == 1) > + return; > + > + switch (capacity->get_kind ()) > + { > + default: > + break; > + case svalue_kind::SK_CONSTANT: > + { > + const constant_svalue *cst_sval = capacity->dyn_cast_constant_svalue > (); > + tree cst = cst_sval->get_constant (); > + unsigned HOST_WIDE_INT size_diff > + = capacity_compatible_with_type (cst, pointee_size_tree); > + if (size_diff != 0) > + { > + tree diag_arg = sm_ctxt->get_diagnostic_tree (rhs); > + sm_ctxt->warn (node, stmt, diag_arg, > + new dubious_allocation_size (sm, lhs, diag_arg, > + cst, size_diff)); > + } > + } > + break; > + case svalue_kind::SK_BINOP: > + case svalue_kind::SK_UNARYOP: > + { > + if (!const_operand_in_sval_p (capacity, pointee_size_tree)) > + { > + tree diag_arg = sm_ctxt->get_diagnostic_tree (rhs); > + sm_ctxt->warn (node, stmt, diag_arg, > + new dubious_allocation_size (sm, lhs, diag_arg, > + pointee_size_tree)); > + } > + } > + break; > + } > +} > + >  /* Implementation of state_machine::on_stmt vfunc for > malloc_state_machine. */ > >  bool > @@ -1645,14 +1923,14 @@ malloc_state_machine::on_stmt (sm_context > *sm_ctxt, >        { >   if (known_allocator_p (callee_fndecl, call)) >     { > - on_allocator_call (sm_ctxt, call, &m_free); > + on_allocator_call (sm_ctxt, node, call, &m_free); >       return true; >     } > >   if (is_named_call_p (callee_fndecl, "operator new", call, 1)) > - on_allocator_call (sm_ctxt, call, &m_scalar_delete); > + on_allocator_call (sm_ctxt, node, call, &m_scalar_delete); >   else if (is_named_call_p (callee_fndecl, "operator new []", call, 1)) > - on_allocator_call (sm_ctxt, call, &m_vector_delete); > + on_allocator_call (sm_ctxt, node, call, &m_vector_delete); >   else if (is_named_call_p (callee_fndecl, "operator delete", call, 1) >     || is_named_call_p (callee_fndecl, "operator delete", call, 2)) >     { > @@ -1707,7 +1985,7 @@ malloc_state_machine::on_stmt (sm_context > *sm_ctxt, >       tree attrs = TYPE_ATTRIBUTES (TREE_TYPE (callee_fndecl)); >       bool returns_nonnull >         = lookup_attribute ("returns_nonnull", attrs); > - on_allocator_call (sm_ctxt, call, deallocators, returns_nonnull); > + on_allocator_call (sm_ctxt, node, call, deallocators, > returns_nonnull); >     } > >   /* Handle "__attribute__((nonnull))". */ > @@ -1763,12 +2041,31 @@ malloc_state_machine::on_stmt (sm_context > *sm_ctxt, >         = mutable_this->get_or_create_deallocator (callee_fndecl); >       on_deallocator_call (sm_ctxt, node, call, d, dealloc_argno); >     } > + > + /* Handle returns from function calls. */ > + tree lhs = gimple_call_lhs (call); > + if (lhs && TREE_CODE (TREE_TYPE (lhs)) == POINTER_TYPE > + && TREE_CODE (gimple_call_return_type (call)) == POINTER_TYPE) > + on_pointer_assignment (sm_ctxt, node, call, lhs, > + gimple_call_fn (call)); >        } > >    if (tree lhs = sm_ctxt->is_zero_assignment (stmt)) >      if (any_pointer_p (lhs)) >        on_zero_assignment (sm_ctxt, stmt,lhs); > > + /* Handle pointer assignments/casts for dubious allocation size. */ > + if (const gassign *assign_stmt = dyn_cast (stmt)) > + { > + if (gimple_num_ops (stmt) == 2) > + { > + tree lhs = gimple_assign_lhs (assign_stmt); > + tree rhs = gimple_assign_rhs1 (assign_stmt); > + if (any_pointer_p (lhs) && any_pointer_p (rhs)) > + on_pointer_assignment (sm_ctxt, node, assign_stmt, lhs, rhs); > + } > + } > + >    /* Handle dereferences. */ >    for (unsigned i = 0; i < gimple_num_ops (stmt); i++) >      { > @@ -1818,6 +2115,7 @@ malloc_state_machine::on_stmt (sm_context > *sm_ctxt, > >  void >  malloc_state_machine::on_allocator_call (sm_context *sm_ctxt, > + const supernode *node, >        const gcall *call, >        const deallocator_set *deallocators, >        bool returns_nonnull) const > @@ -1830,6 +2128,9 @@ malloc_state_machine::on_allocator_call > (sm_context *sm_ctxt, >       (returns_nonnull >        ? deallocators->m_nonnull >        : deallocators->m_unchecked)); > + > + if (TREE_CODE (TREE_TYPE (lhs)) == POINTER_TYPE) > + on_pointer_assignment (sm_ctxt, node, call, lhs, gimple_call_fn > (call)); >      } >    else >      { > @@ -1968,6 +2269,60 @@ malloc_state_machine::on_realloc_call > (sm_context *sm_ctxt, >      } >  } > > +/* Handle assignments between two pointers. > + Check for dubious allocation sizes. > +*/ > + > +void > +malloc_state_machine::on_pointer_assignment (sm_context *sm_ctxt, > + const supernode *node, > + const gassign *assign_stmt, > + tree lhs, > + tree rhs) const > +{ > + /* Do not warn if lhs and rhs are of the same type to not emit > duplicate > + warnings on assignments after the cast. */ > + if (pending_diagnostic::same_tree_p (TREE_TYPE (lhs), TREE_TYPE > (rhs))) > + return; > + > + const program_state *state = sm_ctxt->get_old_program_state (); > + const svalue *r_value = state->m_region_model->get_rvalue (rhs, > NULL); > + if (const region_svalue *reg = dyn_cast > (r_value)) > + { > + const svalue *capacity = state->m_region_model->get_capacity > + (reg->get_pointee ()); > + check_capacity(sm_ctxt, *this, node, assign_stmt, lhs, rhs, > capacity); > + } > +} > + > +void > +malloc_state_machine::on_pointer_assignment (sm_context *sm_ctxt, > + const supernode *node, > + const gcall *call, > + tree lhs, > + tree fn_decl) const > +{ > + /* Do not warn if lhs and rhs are of the same type to not emit > duplicate > + warnings on assignments after the cast. */ > + if (pending_diagnostic::same_tree_p > + (TREE_TYPE (lhs), TREE_TYPE (gimple_call_return_type (call)))) > + return; > + > + const program_state *state = sm_ctxt->get_new_program_state (); > + const svalue *r_value = state->m_region_model->get_rvalue (lhs, > NULL); > + if (const region_svalue *reg = dyn_cast > (r_value)) > + { > + const svalue *capacity = state->m_region_model->get_capacity > + (reg->get_pointee ()); > + check_capacity (sm_ctxt, *this, node, call, lhs, fn_decl, capacity); > + } > + else if (const conjured_svalue *con > + = dyn_cast (r_value)) > + { > + // FIXME: How to get a region_svalue? > + } > +} > + >  /* Implementation of state_machine::on_phi vfunc for > malloc_state_machine. */ > >  void > diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c > b/gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c > new file mode 100644 > index 00000000000..5403c5f41f1 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c > @@ -0,0 +1,54 @@ > +#include > + > +/* Tests with constant buffer sizes */ > + > +void test_1 (void) > +{ > + short *ptr = malloc (21 * sizeof(short)); > + free (ptr); > +} > + > +void test_2 (void) > +{ > + int *ptr = malloc (21 * sizeof (short)); /* { dg-line malloc } */ > + free (ptr); > + > + /* { dg-warning "Allocated buffer size is not a multiple of the > pointee's size" "" { target *-*-* } malloc } */ > + /* { dg-message "\\(1\\) Casting a 42 byte buffer to 'int \\*' leaves > 2 trailing bytes" "" { target *-*-* } malloc } */ > +} > + > +void test_3 (void) > +{ > + void *ptr = malloc (21 * sizeof (short)); > + short *sptr = (short *)ptr; > + free (sptr); > +} > + > +void test_4 (void) > +{ > + void *ptr = malloc (21 * sizeof (short)); /* { dg-message } */ > + int *iptr = (int *)ptr; /* { dg-line assign } */ > + free (iptr); > + > + /* { dg-warning "Allocated buffer size is not a multiple of the > pointee's size" "" { target *-*-* } assign } */ > + /* { dg-message "\\(2\\) Casting 'ptr' to 'int \\*' leaves 2 trailing > bytes" "" { target *-*-* } assign } */ > +} > + > +struct s { > + int i; > +}; > + > +void test_5 (void) > +{ > + struct s *ptr = malloc (5 * sizeof (struct s)); > + free (ptr); > +} > + > +void test_6 (void) > +{ > + long *ptr = malloc (5 * sizeof (struct s)); /* { dg-line malloc6 } */ > + free (ptr); > + > + /* { dg-warning "" "" { target *-*-* } malloc6 } */ > + /* { dg-message "" "" { target *-*-* } malloc6 } */ > +} > diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c > b/gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c > new file mode 100644 > index 00000000000..e66d2793f13 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c > @@ -0,0 +1,44 @@ > +#include > +#include > + > +/* Tests with symbolic buffer sizes */ > + > +void test_1 (void) > +{ > + int n; > + scanf("%i", &n); > + short *ptr = malloc (n * sizeof(short)); > + free (ptr); > +} > + > +void test_2 (void) > +{ > + int n; > + scanf("%i", &n); > + int *ptr = malloc (n * sizeof (short)); /* { dg-line malloc } */ > + free (ptr); > + > + /* { dg-warning "Allocated buffer size is not a multiple of the > pointee's size" "" { target *-*-* } malloc } */ > + /* { dg-message "\\(1\\) Allocation is incompatible with 'int \\*'" > "" { target *-*-* } malloc } */ > +} > + > +void test_3 (void) > +{ > + int n; > + scanf("%i", &n); > + void *ptr = malloc (n * sizeof (short)); > + short *sptr = (short *)ptr; > + free (sptr); > +} > + > +void test_4 (void) > +{ > + int n; > + scanf("%i", &n); > + void *ptr = malloc (n * sizeof (short)); /* { dg-message } */ > + int *iptr = (int *)ptr; /* { dg-line assign } */ > + free (iptr); > + > + /* { dg-warning "Allocated buffer size is not a multiple of the > pointee's size" "" { target *-*-* } assign } */ > + /* { dg-message "\\(2\\) 'ptr' is incompatible with 'int \\*'; either > the allocated size at \\(1\\)" "" { target *-*-* } assign } */ > +} > diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-3.c > b/gcc/testsuite/gcc.dg/analyzer/allocation-size-3.c > new file mode 100644 > index 00000000000..dafc0e73c63 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-3.c > @@ -0,0 +1,48 @@ > +#include > +#include > + > +/* CWE-131 example 5 */ > +void test_1(void) > +{ > + int *id_sequence = (int *) malloc (3); /* { dg-line malloc1 } */ > + if (id_sequence == NULL) exit (1); > + > + id_sequence[0] = 13579; > + id_sequence[1] = 24680; > + id_sequence[2] = 97531; > + > + free (id_sequence); > + > + /* { dg-warning "" "" { target *-*-* } malloc1 } */ > + /* { dg-message "" "" { target *-*-* } malloc1 } */ > +} > + > +void test_2(void) > +{ > + int *ptr = malloc (10 + sizeof(int)); /* { dg-line malloc2 } */ > + free (ptr); > + > + /* { dg-warning "" "" { target *-*-* } malloc2 } */ > + /* { dg-message "" "" { target *-*-* } malloc2 } */ > +} > + > +void test_3(void) > +{ > + int n; > + scanf("%i", &n); > + int *ptr = malloc (n + sizeof (int)); /* { dg-line malloc3 } */ > + free (ptr); > + > + /* { dg-warning "" "" { target *-*-* } malloc3 } */ > + /* { dg-message "" "" { target *-*-* } malloc3 } */ > +} > + > +void test_4(void) > +{ > + int n; > + scanf("%i", &n); > + int m; > + scanf("%i", &m); > + int *ptr = malloc ((n + m) * sizeof (int)); > + free (ptr); > +} > diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-4.c > b/gcc/testsuite/gcc.dg/analyzer/allocation-size-4.c > new file mode 100644 > index 00000000000..4c2b31d6e0a > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-4.c > @@ -0,0 +1,39 @@ > +#include > +#include > + > +/* Flow warnings */ > + > +void *create_buffer(int n) > +{ > + return malloc(n); > +} > + > +void test_1(void) > +{ > + // FIXME > + int *buf = create_buffer(42); /* { dg-warning "" "" { xfail *-*-* } } > */ > + free (buf); > +} > + > +void test_2(void) > +{ > + void *buf = create_buffer(42); /* { dg-message } */ > + int *ibuf = buf; /* { dg-line assign2 } */ > + free (ibuf); > + > + /* { dg-warning "" "" { target *-*-* } assign2 } */ > + /* { dg-message "" "" { target *-*-* } assign2 } */ > +} > + > +void test_3(void) > +{ > + void *buf = malloc(42); /* { dg-message } */ > + if (buf != NULL) /* { dg-message } */ > + { > + int *ibuf = buf; /* { dg-line assign3 } */ > + free (ibuf); > + } > + > + /* { dg-warning "" "" { target *-*-* } assign3 } */ > + /* { dg-message "" "" { target *-*-* } assign3 } */ > +} > diff --git a/gcc/testsuite/gcc.dg/analyzer/attr-malloc-6.c > b/gcc/testsuite/gcc.dg/analyzer/attr-malloc-6.c > index bd28107d0d7..809ee88cf07 100644 > --- a/gcc/testsuite/gcc.dg/analyzer/attr-malloc-6.c > +++ b/gcc/testsuite/gcc.dg/analyzer/attr-malloc-6.c > @@ -1,7 +1,9 @@ > +/* { dg-additional-options -Wno-analyzer-allocation-size } */ >  /* Adapted from gcc.dg/Wmismatched-dealloc.c. */ > >  #define A(...) __attribute__ ((malloc (__VA_ARGS__))) > > +struct FILE {}; >  typedef struct FILE FILE; >  typedef __SIZE_TYPE__ size_t; > > diff --git a/gcc/testsuite/gcc.dg/analyzer/capacity-1.c > b/gcc/testsuite/gcc.dg/analyzer/capacity-1.c > index 2d124833296..94f569e390b 100644 > --- a/gcc/testsuite/gcc.dg/analyzer/capacity-1.c > +++ b/gcc/testsuite/gcc.dg/analyzer/capacity-1.c > @@ -89,8 +89,11 @@ struct s >  static struct s * __attribute__((noinline)) >  alloc_s (size_t num) >  { > - struct s *p = malloc (sizeof(struct s) + num); > + struct s *p = malloc (sizeof(struct s) + num); /* { dg-line malloc } > */ >    return p; > + > + /* { dg-warning "" "" { target *-*-* } malloc } */ > + /* { dg-message "" "" { target *-*-* } malloc } */ >  } > >  struct s * > diff --git a/gcc/testsuite/gcc.dg/analyzer/malloc-4.c > b/gcc/testsuite/gcc.dg/analyzer/malloc-4.c > index 908bb28ee50..0ca94250ba2 100644 > --- a/gcc/testsuite/gcc.dg/analyzer/malloc-4.c > +++ b/gcc/testsuite/gcc.dg/analyzer/malloc-4.c > @@ -1,9 +1,9 @@ > -/* { dg-additional-options "-Wno-incompatible-pointer-types" } */ > +/* { dg-additional-options "-Wno-incompatible-pointer-types > -Wno-analyzer-allocation-size" } */ > >  #include > > -struct foo; > -struct bar; > +struct foo {}; > +struct bar {}; >  void *hv (struct foo **tm) >  { >    void *p = __builtin_malloc (4); > diff --git a/gcc/testsuite/gcc.dg/analyzer/pr96639.c > b/gcc/testsuite/gcc.dg/analyzer/pr96639.c > index 02ca3f084a2..6f365c3cb5d 100644 > --- a/gcc/testsuite/gcc.dg/analyzer/pr96639.c > +++ b/gcc/testsuite/gcc.dg/analyzer/pr96639.c > @@ -1,3 +1,5 @@ > +/* { dg-additional-options -Wno-analyzer-allocation-size } */ > + >  void *calloc (__SIZE_TYPE__, __SIZE_TYPE__); > >  int > >