From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTP id 1CDB239450D3 for ; Tue, 13 Apr 2021 15:12:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 1CDB239450D3 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-24-ySXMEqYLOUifPCk-sessmQ-1; Tue, 13 Apr 2021 11:12:47 -0400 X-MC-Unique: ySXMEqYLOUifPCk-sessmQ-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 3F928814338; Tue, 13 Apr 2021 15:12:46 +0000 (UTC) Received: from t14s.localdomain (ovpn-112-65.phx2.redhat.com [10.3.112.65]) by smtp.corp.redhat.com (Postfix) with ESMTP id AA8335D768; Tue, 13 Apr 2021 15:12:44 +0000 (UTC) Message-ID: <39405315715a453d6779f5a63c153f3ec36cb657.camel@redhat.com> Subject: Re: [PATCH] analyzer: enabling Modula-2 Storage/ALLOCATE/DEALLOCATE From: David Malcolm To: Gaius Mulley , gcc-patches@gcc.gnu.org Date: Tue, 13 Apr 2021 11:12:43 -0400 In-Reply-To: <878s5mldfc.fsf@j228-gm.comp.glam.ac.uk> References: <878s5mldfc.fsf@j228-gm.comp.glam.ac.uk> User-Agent: Evolution 3.38.4 (3.38.4-1.fc33) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00, BODY_8BITS, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Apr 2021 15:12:59 -0000 On Tue, 2021-04-13 at 10:52 +0100, Gaius Mulley wrote: > > > Hello David and fellow GCC developers, > > [the proposed patches for GCC trunk are at the end of the email] > > I've been having way too much fun with your -fanalyzer code and > here are four patches which give the analyzer the ability to > understand the Modula-2 Storage API. > >   http://www.nongnu.org/gm2/gm2-libs-isostorage.html >   http://www.nongnu.org/gm2/gm2-libsstorage.html >   http://www.nongnu.org/gm2/gm2-libssysstorage.html > > Here it is in action - these short tests have all been added to the > gm2 regression testsuite.  Many are from your C examples - rewritten > in Modula-2.  Ignoring a few poor token position subexpressions from > the front end, which I'm currently improving, it appears to work! Excellent! I'm glad you're enjoying it. Caveat: The last time I did any Modula-2 coding was a MUD I wrote at school, which I realize to my horror is now over 30 years ago; I wonder if the disks are still readable :) Various comments inline below throughout... > I've rebuilt GCC trunk 10/04/2021 with these patches and no new > regressions are introduced.  I ran a build of: c,c++,go,d,fortran,lto > and checked the before and after regression tests.  The new code is by > default disabled - hence the langhook patches. > > While it is true that this is the cart before the horse (the gm2 front > end is still to arrive in the gcc tree).  I thought it might be useful > to post the patches - just in case others might benefit from the code > or point out bugs/flaws in the code! > > Anyway thanks for the analyzer - it is great fun reading the code and > addictive trying to improve the accuracy of the error messages.  The > four patches follow after the examples below. > > Examples of it in use: > > $ cat badlistfree.mod > MODULE badlistfree ; > > FROM Storage IMPORT ALLOCATE, DEALLOCATE ; > > TYPE >    list = POINTER TO RECORD >                         value: CARDINAL ; >                         next : list ; >                      END ; > VAR >    head: list ; > > PROCEDURE badfree (l: list) ; > BEGIN >    DISPOSE (l) ; >    WHILE l^.next # NIL DO >       l := l^.next ; >       DISPOSE (l) >    END > END badfree ; > > > BEGIN >    NEW (head) ; >    badfree (head) ; > END badlistfree. > $ gm2 -g -c -fanalyzer badlistfree.mod > badlistfree.mod: In function ‘badfree’: > badlistfree.mod:16:24: warning: use after ‘DISPOSE’ of ‘l’ [CWE-416] [- > Wanalyzer-use-after-free] >    16 |    WHILE l^.next # NIL DO >       |                        ^~ >   ‘badfree’: events 1-2 >     | >     |   15 |    DISPOSE (l) ; >     |      |    ^~~~~~~~~~ >     |      |    | >     |      |    (1) deallocated here >     |   16 |    WHILE l^.next # NIL DO >     |      |                        ~~ >     |      |                        | >     |      |                        (2) use after ‘DISPOSE’ of ‘l’; > deallocated at (1) >     | Excellent. Do you actually need the NEW(head); to trigger the warning? Given: BEGIN NEW (head) ; (* "head" is now explicitly in state "unchecked" *) badfree (head) ; END badlistfree. whereas, assuming that "head" is being passed in, say, as a param: BEGIN (* "head" is implicitly in state "start" *) badfree (head) ; END badlistfree. so if you're thinking about unit test coverage, it may be an idea to have both cases. Your patch doesn't seem to have any test cases, but presumably that would be part of the gm2 frontend and thus would be merged with that, rather than as this patch. I went poking around in your repo, and see: http://git.savannah.gnu.org/cgit/gm2.git/commit/?id=bd9e38fcebf5c063370bec9a69331037005b15de I don't see any DejaGnu directives on the files; should there be something like (* dg-warning "use after 'DISPOSE' of 'l'" */ on the pertinent line, or do your tests work a different way? > $ cat disposenoalloc.mod > MODULE disposenoalloc ; > > FROM Storage IMPORT ALLOCATE, DEALLOCATE ; > FROM SYSTEM IMPORT ADR ; > > TYPE >    list = POINTER TO RECORD >                         value: CARDINAL ; >                         next : list ; >                      END ; > VAR >    head: list ; > > BEGIN >    head := ADR (head) ; >    DISPOSE (head) > END disposenoalloc. > $ gm2 -g -c -fanalyzer disposenoalloc.mod > disposenoalloc.mod: In function ‘_M2_disposenoalloc_init’: > disposenoalloc.mod:16:4: warning: ‘DISPOSE’ of ‘head’ which points to > memory not on the heap [CWE-590] [-Wanalyzer-free-of-non-heap] >    16 |    DISPOSE (head) >       |    ^~~~~~~~~~~~~ >   ‘_M2_disposenoalloc_init’: events 1-3 >     | >     |   15 |    head := ADR (head) ; >     |      |                       ^ >     |      |                       | >     |      |                       (1) pointer is from here >     |   16 |    DISPOSE (head) >     |      |    ~~~~~~~~~~~~~       >     |      |    |        | >     |      |    |        (2) pointer is from here >     |      |    (3) call to ‘DISPOSE’ here >     | I had to look up ADR; it's the address of a variable, right? Event (2) looks spurious, but I think it's a bug I've seen before. > $ cat testdoubledispose.mod > MODULE testdoubledispose ; > > FROM Storage IMPORT ALLOCATE, DEALLOCATE ; > > TYPE >    list = POINTER TO RECORD >                         value: CARDINAL ; >                         next : list ; >                      END ; > VAR >    head: list ; > BEGIN >    NEW (head) ; >    DISPOSE (head) ; >    DISPOSE (head) ; > END testdoubledispose. > $ gm2 -g -c -fanalyzer testdoubledispose.mod > testdoubledispose.mod: In function ‘_M2_testdoubledispose_init’: > testdoubledispose.mod:15:4: warning: double-‘DISPOSE’ of ‘head’ [CWE- > 415] [-Wanalyzer-double-free] >    15 |    DISPOSE (head) ; >       |    ^~~~~~~~~~~~~ >   ‘_M2_testdoubledispose_init’: events 1-3 >     | >     |   13 |    NEW (head) ; >     |      |    ^~~~~~~~~ >     |      |    | >     |      |    (1) allocated here >     |   14 |    DISPOSE (head) ; >     |      |    ~~~~~~~~~~~~~ >     |      |    | >     |      |    (2) first ‘DISPOSE’ here >     |   15 |    DISPOSE (head) ; >     |      |    ~~~~~~~~~~~~~ >     |      |    | >     |      |    (3) second ‘DISPOSE’ here; first ‘DISPOSE’ was at (2) >     | Great. As noted above, for maximum test coverage it would be good to have a case where "head" is passed in in unknown (start) state as well as this case where it is explicitly NEW-ed. > $ cat testdoublefree.mod > MODULE testdoublefree ; > > FROM libc IMPORT malloc, free ; > FROM SYSTEM IMPORT ADDRESS ; > > VAR >    a: ADDRESS ; > BEGIN >    a := malloc (100) ; >    free (a) ; >    free (a) > END testdoublefree. > $ gm2 -g -c -fanalyzer testdoublefree.mod > testdoublefree.mod: In function ‘_M2_testdoublefree_init’: > testdoublefree.mod:11:4: warning: double-‘free’ of ‘a’ [CWE-415] [- > Wanalyzer-double-free] >    11 |    free (a) >       |    ^~~~ >   ‘_M2_testdoublefree_init’: events 1-3 >     | >     |    9 |    a := malloc (100) ; >     |      |         ^~~~~~ >     |      |         | >     |      |         (1) allocated here >     |   10 |    free (a) ; >     |      |    ~~~~  >     |      |    | >     |      |    (2) first ‘free’ here >     |   11 |    free (a) >     |      |    ~~~~  >     |      |    | >     |      |    (3) second ‘free’ here; first ‘free’ was at (2) >     | Great - double-free detection was my initial motivation for writing the analyzer. Same notes as before about test coverage for 'a' passed in in unknown state vs explicitly malloced. Can you get it to warn if the ADDRESS is dereferenced without checking the result of malloc for NULL? (should be -Wanalyzer-possible-null-dereference) > $ cat useafterdeallocate.mod > MODULE useafterdeallocate ; > > FROM Storage IMPORT ALLOCATE, DEALLOCATE ; > > TYPE >    ptrType = POINTER TO RECORD >                            foo: CARDINAL ; >                         END ; > > VAR >    head: ptrType ; > BEGIN >    NEW (head) ; >    IF head # NIL >    THEN >       head^.foo := 1 ; >       DISPOSE (head) ; >       head^.foo := 2 >    END > END useafterdeallocate. > $ gm2 -g -c -fanalyzer useafterdeallocate.mod > useafterdeallocate.mod: In function ‘_M2_useafterdeallocate_init’: > useafterdeallocate.mod:18:17: warning: use after ‘DISPOSE’ of ‘head’ > [CWE-416] [-Wanalyzer-use-after-free] >    18 |       head^.foo := 2 >       |                 ^~ >   ‘_M2_useafterdeallocate_init’: events 1-6 >     | >     |   13 |    NEW (head) ; >     |      |    ^~~~~~~~~ >     |      |    | >     |      |    (1) allocated here >     |   14 |    IF head # NIL If you don't check for head # NIL here, does the analyzer complain about -Wanalyzer-possible-null-dereference? >     |   15 |    THEN >     |      |    ~~~~ >     |      |    | >     |      |    (2) assuming ‘head’ is non-NULL >     |      |    (3) following ‘false’ branch... >     |   16 |       head^.foo := 1 ; >     |      |            ~ >     |      |            | >     |      |            (4) ...to here >     |   17 |       DISPOSE (head) ; >     |      |       ~~~~~~~~~~~~~ >     |      |       | >     |      |       (5) deallocated here >     |   18 |       head^.foo := 2 >     |      |                 ~~ >     |      |                 | >     |      |                 (6) use after ‘DISPOSE’ of ‘_T30’; > deallocated at (5) >     | The '_T30' in the message for the final event (6) is a bit of a wart; it would be better as 'head'. Interestingly, that's what the "warning" message at the top says. This is reminiscent of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99771, fwiw. What does the underlying gimple look like? You can see it via -fdump- ipa-analyzer=stderr, though I suspect you know that. Other things you might want to try that might already work, if you're still enjoying playing with it: - is it a problem if the user tried to "free" something that should have been DISPOSEd and vice-versa? Looking at your patch, it looks like -Wanalyzer-mismatching-deallocation may issue a diagnostic for such a case - the analyzer can probably complain about dereference of NIL (e.g. interprocedually if one function returns a NIL and the caller then blindly dereferences the result). But does Modula-2 have any protection for that syntactically? (I can't remember) - do interprocedural cases look sane? - do intermodule cases work with -flto? (assuming that m2 already works with -flto, of course. Note that I committed a bug fix for the analyzer's LTO support last night [PR98599], so you'll want to refresh your tree or grab that patch if you want to try LTO). > and the four file patches: > > ChangeLog entries: > ================== [...snip...] > Patches > ======= > > > diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc > index 1d5b8601b1f..9f0f544e32c 100644 > --- a/gcc/analyzer/sm-malloc.cc > +++ b/gcc/analyzer/sm-malloc.cc > @@ -22,6 +22,8 @@ along with GCC; see the file COPYING3.  If not see >  #include "system.h" >  #include "coretypes.h" >  #include "tree.h" > +#include "langhooks-def.h" > +#include "langhooks.h" >  #include "function.h" >  #include "basic-block.h" >  #include "gimple.h" > @@ -44,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see >  #include "analyzer/region-model.h" >  #include "stringpool.h" >  #include "attribs.h" > +#include "print-tree.h" > >  #if ENABLE_ANALYZER > > @@ -387,6 +390,7 @@ public: >    standard_deallocator_set m_free; >    standard_deallocator_set m_scalar_delete; >    standard_deallocator_set m_vector_delete; > +  standard_deallocator_set m_storage_deallocate; This new field could use a comment. Actually, maybe the names should identify the language, say after the "m_" prefix, so e.g.: /* C++ scalar delete and vector delete[]. */ standard_deallocator_set m_cp_scalar_delete; standard_deallocator_set m_cp_vector_delete; /* Modula 2 something something. */ standard_deallocator_set m_m2_storage_deallocate; >    standard_deallocator m_realloc; > > @@ -419,13 +423,18 @@ private: >                             const supernode *node, >                             const gcall *call, >                             const deallocator *d, > -                           unsigned argno) const; > +                            unsigned argno, > +                            bool pass_by_reference = false) const; >    void on_realloc_call (sm_context *sm_ctxt, >                         const supernode *node, >                         const gcall *call) const; >    void on_zero_assignment (sm_context *sm_ctxt, >                            const gimple *stmt, >                            tree lhs) const; > +  void on_allocator_call_pass_by_ref (sm_context *sm_ctxt, > +                                      const gcall *call, > +                                      const deallocator_set > *deallocators, > +                                      bool returns_nonnull = false) > const; I only see one use of this, so is the default value for the final param actually needed? (I prefer to avoid default params when they're not needed) >    /* A map for consolidating deallocators so that they are >       unique per deallocator FUNCTION_DECL.  */ > @@ -1370,6 +1379,7 @@ malloc_state_machine::malloc_state_machine > (logger *logger) >    m_free (this, "free", WORDING_FREED), >    m_scalar_delete (this, "delete", WORDING_DELETED), >    m_vector_delete (this, "delete[]", WORDING_DELETED), > +  m_storage_deallocate (this, "DISPOSE", WORDING_DEALLOCATED), >    m_realloc (this, "realloc", WORDING_REALLOCATED) >  { >    gcc_assert (m_start->get_id () == 0); > @@ -1415,7 +1425,7 @@ get_or_create_custom_deallocator_set (tree > allocator_fndecl) >      return NULL; > >    /* Otherwise, call maybe_create_custom_deallocator_set, > -     memoizing the result.  */ > +     memorizing the result.  */ I think I did mean "memoizing" here - but maybe "caching" is a less fancy way of writing that. Am I misuing the term? I'm not a CompSci professor :) >    if (custom_deallocator_set **slot >        = m_custom_deallocator_set_cache.get (allocator_fndecl)) >      return *slot; > @@ -1526,6 +1536,15 @@ malloc_state_machine::on_stmt (sm_context > *sm_ctxt, >             return true; >           } > > +        if (lang_hooks.new_dispose_storage_substitution) The langhook returns a bool, so I think you meant to call it here: if (lang_hooks.new_dispose_storage_substitution ()) rather than see if the function pointer is non-NULL. That said, I don't love that lang_hook (more generally I'd prefer we had vfuncs to callbacks, but that's what we currently have). I'm not quite sure what the interface between the analyzer and frontends should be. One of my goals for the analyzer is whole-program analysis, with multiple source files (which is why I picked the gimple- SSA representation, to try to piggyback off the existing LTO infrastructure). In particular, what happens if some of a program is written in one source language, and some in another? I have vague ideas of a class representing a source language, with subclasses implementing analyzer policy for each source language, so that the analyzer can instantiate multiple objects and thus somehow do the correct thing for such cases. I'm handwaving here, of course. It's very useful seeing your patch, because it makes these ideas more concrete - thanks. > +          /* m2 allocation.  */ > +          if (is_named_call_p (callee_fndecl, "Storage_ALLOCATE", > call, 2) > +              || is_named_call_p (callee_fndecl, > "SysStorage_ALLOCATE", call, 2)) > +            { > +              on_allocator_call_pass_by_ref (sm_ctxt, call, > &m_storage_deallocate, false); > +              return true; > +            } > + It looks like all the langhook is doing is conditionalizing the recognition of a specific function by name, which presumably is reserved within the Modula 2 ecosystem, but isn't a reserved name in the world of C/C++. Is that the reason for making it conditional? Maybe we can identify which source language a particular gimple stmt was written in, and apply policy based on that. (what about cross- language inlining?!) Also, I recognize that gradually adding more and more string comparisons for special-cases to the analyzer isn't ideal; sorry about that. There's probably a better way to recognize these functions, but I'm not sure what it is yet. >         if (is_named_call_p (callee_fndecl, "operator new", call, 1)) >           on_allocator_call (sm_ctxt, call, &m_scalar_delete); >         else if (is_named_call_p (callee_fndecl, "operator new []", > call, 1)) > @@ -1562,6 +1581,16 @@ malloc_state_machine::on_stmt (sm_context > *sm_ctxt, >             return true; >           } > > +        if (lang_hooks.new_dispose_storage_substitution) > +          /* m2 deallocation.  */ > +          if (is_named_call_p (callee_fndecl, "Storage_DEALLOCATE", > call, 2) > +              || is_named_call_p (callee_fndecl, > "SysStorage_DEALLOCATE", call, 2)) > +            { > +              on_deallocator_call (sm_ctxt, node, call, > +                                   > &m_storage_deallocate.m_deallocator, 0, true); > +              return true; > +            } > + Similar considerations as above. >         if (is_named_call_p (callee_fndecl, "realloc", call, 2) >             || is_named_call_p (callee_fndecl, "__builtin_realloc", > call, 2)) >           { > @@ -1731,16 +1760,72 @@ malloc_state_machine::on_allocator_call > (sm_context *sm_ctxt, >      } >  } > > +/* Skips an ADDR_EXPR if seen.  */ > + > +static > +tree > +skip_reference (tree arg) > +{ > +  if (TREE_CODE (arg) == ADDR_EXPR) > +      return TREE_OPERAND (arg, 0); > +  return arg; > +} > + > + > +/* Handle allocators which return the value through a pass by > reference parameter.  */ > + > +void > +malloc_state_machine::on_allocator_call_pass_by_ref (sm_context > *sm_ctxt, > +                                                     const gcall > *call, > +                                                     const > deallocator_set *deallocators, > +                                                     bool > returns_nonnull) const > +{ > +  if (gimple_call_num_args (call) == 0) > +    return; > +  tree arg = gimple_call_arg (call, 0); > +  if (arg) > +    { > +      /* in Modula-2 the heap allocator API is: ALLOCATE (VAR ADDRESS; > +         CARDINAL).  So we need to skip the reference or pointer in > +         the first parameter.  */ > +      tree diag_arg_lvalue = sm_ctxt->get_diagnostic_tree (arg); I'm not sure get_diagnostic_tree is correct here; that tries to generate a human-readable tree for diagnostics. Why can't you just use 'arg'? What kind of trees are you getting? > +      tree diag_arg_rvalue = skip_reference (diag_arg_lvalue); The conditional in skip_reference seems like a "code smell" to me: is the code meant to be dereferencing the pointer or not? What does that mean for the types of the things involved? It seems like it shouldn't be conditional to me - or maybe there's something about Modula 2 that I'm missing? > +      if (sm_ctxt->get_state (call, diag_arg_rvalue) == m_start) > +       { > +         sm_ctxt->set_next_state (call, diag_arg_rvalue, > +                                  (returns_nonnull > +                                   ? deallocators->m_nonnull > +                                   : deallocators->m_unchecked)); > +       } > +    } > +} > + >  void >  malloc_state_machine::on_deallocator_call (sm_context *sm_ctxt, >                                            const supernode *node, >                                            const gcall *call, >                                            const deallocator *d, > -                                          unsigned argno) const > +                                           unsigned argno, > +                                           bool pass_by_reference) > const >  { >    if (argno >= gimple_call_num_args (call)) >      return; >    tree arg = gimple_call_arg (call, argno); > +  if (pass_by_reference) > +    { > +      tree diag_arg = sm_ctxt->get_diagnostic_tree (arg); > +      /* in Modula-2 the API is: DEALLOCATE (VAR a: ADDRESS; size: > +         CARDINAL).  So we need to skip the pointer or reference in > +         the first parameter.  We also know the pointer is assigned to > +         NULL.  In C this could be described as: > + > +         DEALLOCATE (void **address, unsigned int size) > +         { > +            free (*address); > +            *address = NULL; > +         }.  */ > +      arg = skip_reference (diag_arg); > +    } Similar comments as above about use of get_diagnostic_tree and skip_reference. >    state_t state = sm_ctxt->get_state (call, arg); > > @@ -1783,6 +1868,9 @@ malloc_state_machine::on_deallocator_call > (sm_context *sm_ctxt, >                                            d->m_name)); >        sm_ctxt->set_next_state (call, arg, m_stop); >      } > +  /* in Modula-2 the DEALLOCATE assigns the pointer to NULL.  However > +     we don't do this in the analyzer as it ignores NULL pointers > +     during deallocation.  */ Annoyingly, the analyzer's knowledge of what a function does it split between the sm-*.cc state machine implementations, and region- model*.cc, in particularly region-model-impl-calls.cc. In theory you could "teach" the analyzer that DEALLOCATE writes NULL; have a look at region-model-impl-calls.cc and the various region_model::impl_call_* methods implemented there. Again, my apologies for the rather shoddy way that special-case knowledge of functions is implemented. You might consider implementing Storage_ALLOCATE/DEALLOCATE there, see region_model::impl_call_malloc/free and region_model::impl_call_operator_new/delete. >  } > >  /* Implementation of realloc(3): > diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h > index ae3991c770a..952513c071b 100644 > --- a/gcc/langhooks-def.h > +++ b/gcc/langhooks-def.h > @@ -95,6 +95,7 @@ extern const char *lhd_get_substring_location (const > substring_loc &, >  extern int lhd_decl_dwarf_attribute (const_tree, int); >  extern int lhd_type_dwarf_attribute (const_tree, int); >  extern void lhd_finalize_early_debug (void); > +extern bool lhd_new_dispose_storage_substitution (void); > >  #define LANG_HOOKS_NAME                        "GNU unknown" >  #define LANG_HOOKS_IDENTIFIER_SIZE     sizeof (struct lang_identifier) > @@ -147,6 +148,7 @@ extern void lhd_finalize_early_debug (void); >  #define LANG_HOOKS_RUN_LANG_SELFTESTS   lhd_do_nothing >  #define LANG_HOOKS_GET_SUBSTRING_LOCATION lhd_get_substring_location >  #define LANG_HOOKS_FINALIZE_EARLY_DEBUG lhd_finalize_early_debug > +#define LANG_HOOKS_NEW_DISPOSE_STORAGE_SUBSTITUTION > lhd_new_dispose_storage_substitution > >  /* Attribute hooks.  */ >  #define LANG_HOOKS_ATTRIBUTE_TABLE             NULL > @@ -381,7 +383,8 @@ extern void lhd_end_section (void); >    LANG_HOOKS_EMITS_BEGIN_STMT, \ >    LANG_HOOKS_RUN_LANG_SELFTESTS, \ >    LANG_HOOKS_GET_SUBSTRING_LOCATION, \ > -  LANG_HOOKS_FINALIZE_EARLY_DEBUG \ > +  LANG_HOOKS_FINALIZE_EARLY_DEBUG, \ > +  LANG_HOOKS_NEW_DISPOSE_STORAGE_SUBSTITUTION \ >  } > So the function pointer is non-NULL for the default case, and so the conditional is true for every frontend :) >  #endif /* GCC_LANG_HOOKS_DEF_H */ > diff --git a/gcc/langhooks.c b/gcc/langhooks.c > index 2354386f7b4..6486c484895 100644 > --- a/gcc/langhooks.c > +++ b/gcc/langhooks.c > @@ -896,6 +896,13 @@ lhd_finalize_early_debug (void) >      (*debug_hooks->early_global_decl) (cnode->decl); >  } > > +/* Should the analyzer check for NEW/DISPOSE > Storage_ALLOCATE/Storage_DEALLOCATE?  */ > + > +bool lhd_new_dispose_storage_substitution (void) > +{ > +  return false; > +} > + >  /* Returns true if the current lang_hooks represents the GNU C > frontend.  */ > >  bool > diff --git a/gcc/langhooks.h b/gcc/langhooks.h > index 842e605c439..16b368bfbe1 100644 > --- a/gcc/langhooks.h > +++ b/gcc/langhooks.h > @@ -611,6 +611,9 @@ struct lang_hooks >    /* Invoked before the early_finish debug hook is invoked.  */ >    void (*finalize_early_debug) (void); > > +  /* Does the language substitute NEW into ALLOCATE and DISPOSE into > DEALLOCATE?  */ > +  bool (*new_dispose_storage_substitution) (void); > + >    /* Whenever you add entries here, make sure you adjust langhooks- > def.h >       and langhooks.c accordingly.  */ >  }; > > regards, > Gaius Hope the above is constructive, sorry if it comes across as nitpicky in places. Thanks again for posting the patch, it's very exciting to see the analyzer support other languages! Dave