public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] analyzer: enabling Modula-2 Storage/ALLOCATE/DEALLOCATE
@ 2021-04-13  9:52 Gaius Mulley
  2021-04-13 15:12 ` David Malcolm
  0 siblings, 1 reply; 3+ messages in thread
From: Gaius Mulley @ 2021-04-13  9:52 UTC (permalink / raw)
  To: gcc-patches, dmalcolm



Hello David and fellow GCC developers,

[the proposed patches for GCC trunk are at the end of the email]

I've been having way too much fun with your -fanalyzer code and
here are four patches which give the analyzer the ability to
understand the Modula-2 Storage API.

  http://www.nongnu.org/gm2/gm2-libs-isostorage.html
  http://www.nongnu.org/gm2/gm2-libsstorage.html
  http://www.nongnu.org/gm2/gm2-libssysstorage.html

Here it is in action - these short tests have all been added to the
gm2 regression testsuite.  Many are from your C examples - rewritten
in Modula-2.  Ignoring a few poor token position subexpressions from
the front end, which I'm currently improving, it appears to work!

I've rebuilt GCC trunk 10/04/2021 with these patches and no new
regressions are introduced.  I ran a build of: c,c++,go,d,fortran,lto
and checked the before and after regression tests.  The new code is by
default disabled - hence the langhook patches.

While it is true that this is the cart before the horse (the gm2 front
end is still to arrive in the gcc tree).  I thought it might be useful
to post the patches - just in case others might benefit from the code
or point out bugs/flaws in the code!

Anyway thanks for the analyzer - it is great fun reading the code and
addictive trying to improve the accuracy of the error messages.  The
four patches follow after the examples below.

Examples of it in use:

$ cat badlistfree.mod
MODULE badlistfree ;

FROM Storage IMPORT ALLOCATE, DEALLOCATE ;

TYPE
   list = POINTER TO RECORD
                        value: CARDINAL ;
                        next : list ;
                     END ;
VAR
   head: list ;

PROCEDURE badfree (l: list) ;
BEGIN
   DISPOSE (l) ;
   WHILE l^.next # NIL DO
      l := l^.next ;
      DISPOSE (l)
   END
END badfree ;


BEGIN
   NEW (head) ;
   badfree (head) ;
END badlistfree.
$ gm2 -g -c -fanalyzer badlistfree.mod
badlistfree.mod: In function ‘badfree’:
badlistfree.mod:16:24: warning: use after ‘DISPOSE’ of ‘l’ [CWE-416] [-Wanalyzer-use-after-free]
   16 |    WHILE l^.next # NIL DO
      |                        ^~
  ‘badfree’: events 1-2
    |
    |   15 |    DISPOSE (l) ;
    |      |    ^~~~~~~~~~
    |      |    |
    |      |    (1) deallocated here
    |   16 |    WHILE l^.next # NIL DO
    |      |                        ~~
    |      |                        |
    |      |                        (2) use after ‘DISPOSE’ of ‘l’; deallocated at (1)
    |
$ cat disposenoalloc.mod
MODULE disposenoalloc ;

FROM Storage IMPORT ALLOCATE, DEALLOCATE ;
FROM SYSTEM IMPORT ADR ;

TYPE
   list = POINTER TO RECORD
                        value: CARDINAL ;
                        next : list ;
                     END ;
VAR
   head: list ;

BEGIN
   head := ADR (head) ;
   DISPOSE (head)
END disposenoalloc.
$ gm2 -g -c -fanalyzer disposenoalloc.mod
disposenoalloc.mod: In function ‘_M2_disposenoalloc_init’:
disposenoalloc.mod:16:4: warning: ‘DISPOSE’ of ‘head’ which points to memory not on the heap [CWE-590] [-Wanalyzer-free-of-non-heap]
   16 |    DISPOSE (head)
      |    ^~~~~~~~~~~~~
  ‘_M2_disposenoalloc_init’: events 1-3
    |
    |   15 |    head := ADR (head) ;
    |      |                       ^
    |      |                       |
    |      |                       (1) pointer is from here
    |   16 |    DISPOSE (head)
    |      |    ~~~~~~~~~~~~~       
    |      |    |        |
    |      |    |        (2) pointer is from here
    |      |    (3) call to ‘DISPOSE’ here
    |
$ cat testdoubledispose.mod
MODULE testdoubledispose ;

FROM Storage IMPORT ALLOCATE, DEALLOCATE ;

TYPE
   list = POINTER TO RECORD
                        value: CARDINAL ;
                        next : list ;
                     END ;
VAR
   head: list ;
BEGIN
   NEW (head) ;
   DISPOSE (head) ;
   DISPOSE (head) ;
END testdoubledispose.
$ gm2 -g -c -fanalyzer testdoubledispose.mod
testdoubledispose.mod: In function ‘_M2_testdoubledispose_init’:
testdoubledispose.mod:15:4: warning: double-‘DISPOSE’ of ‘head’ [CWE-415] [-Wanalyzer-double-free]
   15 |    DISPOSE (head) ;
      |    ^~~~~~~~~~~~~
  ‘_M2_testdoubledispose_init’: events 1-3
    |
    |   13 |    NEW (head) ;
    |      |    ^~~~~~~~~
    |      |    |
    |      |    (1) allocated here
    |   14 |    DISPOSE (head) ;
    |      |    ~~~~~~~~~~~~~
    |      |    |
    |      |    (2) first ‘DISPOSE’ here
    |   15 |    DISPOSE (head) ;
    |      |    ~~~~~~~~~~~~~
    |      |    |
    |      |    (3) second ‘DISPOSE’ here; first ‘DISPOSE’ was at (2)
    |
$ cat testdoublefree.mod
MODULE testdoublefree ;

FROM libc IMPORT malloc, free ;
FROM SYSTEM IMPORT ADDRESS ;

VAR
   a: ADDRESS ;
BEGIN
   a := malloc (100) ;
   free (a) ;
   free (a)
END testdoublefree.
$ gm2 -g -c -fanalyzer testdoublefree.mod
testdoublefree.mod: In function ‘_M2_testdoublefree_init’:
testdoublefree.mod:11:4: warning: double-‘free’ of ‘a’ [CWE-415] [-Wanalyzer-double-free]
   11 |    free (a)
      |    ^~~~
  ‘_M2_testdoublefree_init’: events 1-3
    |
    |    9 |    a := malloc (100) ;
    |      |         ^~~~~~
    |      |         |
    |      |         (1) allocated here
    |   10 |    free (a) ;
    |      |    ~~~~  
    |      |    |
    |      |    (2) first ‘free’ here
    |   11 |    free (a)
    |      |    ~~~~  
    |      |    |
    |      |    (3) second ‘free’ here; first ‘free’ was at (2)
    |
$ cat useafterdeallocate.mod
MODULE useafterdeallocate ;

FROM Storage IMPORT ALLOCATE, DEALLOCATE ;

TYPE
   ptrType = POINTER TO RECORD
                           foo: CARDINAL ;
                        END ;

VAR
   head: ptrType ;
BEGIN
   NEW (head) ;
   IF head # NIL
   THEN
      head^.foo := 1 ;
      DISPOSE (head) ;
      head^.foo := 2
   END
END useafterdeallocate.
$ gm2 -g -c -fanalyzer useafterdeallocate.mod
useafterdeallocate.mod: In function ‘_M2_useafterdeallocate_init’:
useafterdeallocate.mod:18:17: warning: use after ‘DISPOSE’ of ‘head’ [CWE-416] [-Wanalyzer-use-after-free]
   18 |       head^.foo := 2
      |                 ^~
  ‘_M2_useafterdeallocate_init’: events 1-6
    |
    |   13 |    NEW (head) ;
    |      |    ^~~~~~~~~
    |      |    |
    |      |    (1) allocated here
    |   14 |    IF head # NIL
    |   15 |    THEN
    |      |    ~~~~
    |      |    |
    |      |    (2) assuming ‘head’ is non-NULL
    |      |    (3) following ‘false’ branch...
    |   16 |       head^.foo := 1 ;
    |      |            ~
    |      |            |
    |      |            (4) ...to here
    |   17 |       DISPOSE (head) ;
    |      |       ~~~~~~~~~~~~~
    |      |       |
    |      |       (5) deallocated here
    |   18 |       head^.foo := 2
    |      |                 ~~
    |      |                 |
    |      |                 (6) use after ‘DISPOSE’ of ‘_T30’; deallocated at (5)
    |

and the four file patches:

ChangeLog entries:
==================

* gcc/analyzer/sm-malloc.cc: include langhooks.
(m_storage_deallocate) new field.
(malloc_state_machine::on_deallocator_call) pass_by_reference
parameter added.  Dereference parameter if necessary.
(skip_reference) New function.
(malloc_state_machine::on_allocator_call_pass_by_ref) New method.
(malloc_state_machine::on_stmt) test
lhd_new_dispose_storage_substitution to see if heap allocators
Storage_ALLOCATE or SysStorage_ALLOCATE should be analyzed.  Also
setup heap deallocators Storage_DEALLOCATE and SysStorage_DEALLOCATE.
* langhooks-def.h: (lhd_new_dispose_storage_substitution) New
prototype.  (LANG_HOOKS_NEW_DISPOSE_STORAGE_SUBSTITUTION) New
define which is added to the lang_hooks initializer.
* langhooks.h: (new_dispose_storage_substitution) New
function indirect field for struct lang_hooks.
* gcc/langhooks.c: (lhd_new_dispose_storage_substitution) New default
stub disabling NEW/DISPOSE Storage analysis.

Patches
=======


diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc
index 1d5b8601b1f..9f0f544e32c 100644
--- a/gcc/analyzer/sm-malloc.cc
+++ b/gcc/analyzer/sm-malloc.cc
@@ -22,6 +22,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "tree.h"
+#include "langhooks-def.h"
+#include "langhooks.h"
 #include "function.h"
 #include "basic-block.h"
 #include "gimple.h"
@@ -44,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "analyzer/region-model.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "print-tree.h"

 #if ENABLE_ANALYZER

@@ -387,6 +390,7 @@ public:
   standard_deallocator_set m_free;
   standard_deallocator_set m_scalar_delete;
   standard_deallocator_set m_vector_delete;
+  standard_deallocator_set m_storage_deallocate;

   standard_deallocator m_realloc;

@@ -419,13 +423,18 @@ private:
 			    const supernode *node,
 			    const gcall *call,
 			    const deallocator *d,
-			    unsigned argno) const;
+			     unsigned argno,
+			     bool pass_by_reference = false) const;
   void on_realloc_call (sm_context *sm_ctxt,
 			const supernode *node,
 			const gcall *call) const;
   void on_zero_assignment (sm_context *sm_ctxt,
 			   const gimple *stmt,
 			   tree lhs) const;
+  void on_allocator_call_pass_by_ref (sm_context *sm_ctxt,
+				       const gcall *call,
+				       const deallocator_set *deallocators,
+				       bool returns_nonnull = false) const;

   /* A map for consolidating deallocators so that they are
      unique per deallocator FUNCTION_DECL.  */
@@ -1370,6 +1379,7 @@ malloc_state_machine::malloc_state_machine (logger *logger)
   m_free (this, "free", WORDING_FREED),
   m_scalar_delete (this, "delete", WORDING_DELETED),
   m_vector_delete (this, "delete[]", WORDING_DELETED),
+  m_storage_deallocate (this, "DISPOSE", WORDING_DEALLOCATED),
   m_realloc (this, "realloc", WORDING_REALLOCATED)
 {
   gcc_assert (m_start->get_id () == 0);
@@ -1415,7 +1425,7 @@ get_or_create_custom_deallocator_set (tree allocator_fndecl)
     return NULL;

   /* Otherwise, call maybe_create_custom_deallocator_set,
-     memoizing the result.  */
+     memorizing the result.  */
   if (custom_deallocator_set **slot
       = m_custom_deallocator_set_cache.get (allocator_fndecl))
     return *slot;
@@ -1526,6 +1536,15 @@ malloc_state_machine::on_stmt (sm_context *sm_ctxt,
 	    return true;
 	  }

+	 if (lang_hooks.new_dispose_storage_substitution)
+	   /* m2 allocation.  */
+	   if (is_named_call_p (callee_fndecl, "Storage_ALLOCATE", call, 2)
+	       || is_named_call_p (callee_fndecl, "SysStorage_ALLOCATE", call, 2))
+	     {
+	       on_allocator_call_pass_by_ref (sm_ctxt, call, &m_storage_deallocate, false);
+	       return true;
+	     }
+
 	if (is_named_call_p (callee_fndecl, "operator new", call, 1))
 	  on_allocator_call (sm_ctxt, call, &m_scalar_delete);
 	else if (is_named_call_p (callee_fndecl, "operator new []", call, 1))
@@ -1562,6 +1581,16 @@ malloc_state_machine::on_stmt (sm_context *sm_ctxt,
 	    return true;
 	  }

+	 if (lang_hooks.new_dispose_storage_substitution)
+	   /* m2 deallocation.	*/
+	   if (is_named_call_p (callee_fndecl, "Storage_DEALLOCATE", call, 2)
+	       || is_named_call_p (callee_fndecl, "SysStorage_DEALLOCATE", call, 2))
+	     {
+	       on_deallocator_call (sm_ctxt, node, call,
+				    &m_storage_deallocate.m_deallocator, 0, true);
+	       return true;
+	     }
+
 	if (is_named_call_p (callee_fndecl, "realloc", call, 2)
 	    || is_named_call_p (callee_fndecl, "__builtin_realloc", call, 2))
 	  {
@@ -1731,16 +1760,72 @@ malloc_state_machine::on_allocator_call (sm_context *sm_ctxt,
     }
 }

+/* Skips an ADDR_EXPR if seen.	 */
+
+static
+tree
+skip_reference (tree arg)
+{
+  if (TREE_CODE (arg) == ADDR_EXPR)
+      return TREE_OPERAND (arg, 0);
+  return arg;
+}
+
+
+/* Handle allocators which return the value through a pass by reference parameter.  */
+
+void
+malloc_state_machine::on_allocator_call_pass_by_ref (sm_context *sm_ctxt,
+						      const gcall *call,
+						      const deallocator_set *deallocators,
+						      bool returns_nonnull) const
+{
+  if (gimple_call_num_args (call) == 0)
+    return;
+  tree arg = gimple_call_arg (call, 0);
+  if (arg)
+    {
+      /* in Modula-2 the heap allocator API is: ALLOCATE (VAR ADDRESS;
+	  CARDINAL).  So we need to skip the reference or pointer in
+	  the first parameter.	*/
+      tree diag_arg_lvalue = sm_ctxt->get_diagnostic_tree (arg);
+      tree diag_arg_rvalue = skip_reference (diag_arg_lvalue);
+      if (sm_ctxt->get_state (call, diag_arg_rvalue) == m_start)
+	{
+	  sm_ctxt->set_next_state (call, diag_arg_rvalue,
+				   (returns_nonnull
+				    ? deallocators->m_nonnull
+				    : deallocators->m_unchecked));
+	}
+    }
+}
+
 void
 malloc_state_machine::on_deallocator_call (sm_context *sm_ctxt,
 					   const supernode *node,
 					   const gcall *call,
 					   const deallocator *d,
-					   unsigned argno) const
+					    unsigned argno,
+					    bool pass_by_reference) const
 {
   if (argno >= gimple_call_num_args (call))
     return;
   tree arg = gimple_call_arg (call, argno);
+  if (pass_by_reference)
+    {
+      tree diag_arg = sm_ctxt->get_diagnostic_tree (arg);
+      /* in Modula-2 the API is: DEALLOCATE (VAR a: ADDRESS; size:
+	  CARDINAL).  So we need to skip the pointer or reference in
+	  the first parameter.	We also know the pointer is assigned to
+	  NULL.	 In C this could be described as:
+
+	  DEALLOCATE (void **address, unsigned int size)
+	  {
+	     free (*address);
+	     *address = NULL;
+	  }.  */
+      arg = skip_reference (diag_arg);
+    }

   state_t state = sm_ctxt->get_state (call, arg);

@@ -1783,6 +1868,9 @@ malloc_state_machine::on_deallocator_call (sm_context *sm_ctxt,
 					   d->m_name));
       sm_ctxt->set_next_state (call, arg, m_stop);
     }
+  /* in Modula-2 the DEALLOCATE assigns the pointer to NULL.  However
+     we don't do this in the analyzer as it ignores NULL pointers
+     during deallocation.  */
 }

 /* Implementation of realloc(3):
diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h
index ae3991c770a..952513c071b 100644
--- a/gcc/langhooks-def.h
+++ b/gcc/langhooks-def.h
@@ -95,6 +95,7 @@ extern const char *lhd_get_substring_location (const substring_loc &,
 extern int lhd_decl_dwarf_attribute (const_tree, int);
 extern int lhd_type_dwarf_attribute (const_tree, int);
 extern void lhd_finalize_early_debug (void);
+extern bool lhd_new_dispose_storage_substitution (void);

 #define LANG_HOOKS_NAME			"GNU unknown"
 #define LANG_HOOKS_IDENTIFIER_SIZE	sizeof (struct lang_identifier)
@@ -147,6 +148,7 @@ extern void lhd_finalize_early_debug (void);
 #define LANG_HOOKS_RUN_LANG_SELFTESTS   lhd_do_nothing
 #define LANG_HOOKS_GET_SUBSTRING_LOCATION lhd_get_substring_location
 #define LANG_HOOKS_FINALIZE_EARLY_DEBUG lhd_finalize_early_debug
+#define LANG_HOOKS_NEW_DISPOSE_STORAGE_SUBSTITUTION lhd_new_dispose_storage_substitution

 /* Attribute hooks.  */
 #define LANG_HOOKS_ATTRIBUTE_TABLE		NULL
@@ -381,7 +383,8 @@ extern void lhd_end_section (void);
   LANG_HOOKS_EMITS_BEGIN_STMT, \
   LANG_HOOKS_RUN_LANG_SELFTESTS, \
   LANG_HOOKS_GET_SUBSTRING_LOCATION, \
-  LANG_HOOKS_FINALIZE_EARLY_DEBUG \
+  LANG_HOOKS_FINALIZE_EARLY_DEBUG, \
+  LANG_HOOKS_NEW_DISPOSE_STORAGE_SUBSTITUTION \
 }

 #endif /* GCC_LANG_HOOKS_DEF_H */
diff --git a/gcc/langhooks.c b/gcc/langhooks.c
index 2354386f7b4..6486c484895 100644
--- a/gcc/langhooks.c
+++ b/gcc/langhooks.c
@@ -896,6 +896,13 @@ lhd_finalize_early_debug (void)
     (*debug_hooks->early_global_decl) (cnode->decl);
 }

+/* Should the analyzer check for NEW/DISPOSE Storage_ALLOCATE/Storage_DEALLOCATE?  */
+
+bool lhd_new_dispose_storage_substitution (void)
+{
+  return false;
+}
+
 /* Returns true if the current lang_hooks represents the GNU C frontend.  */

 bool
diff --git a/gcc/langhooks.h b/gcc/langhooks.h
index 842e605c439..16b368bfbe1 100644
--- a/gcc/langhooks.h
+++ b/gcc/langhooks.h
@@ -611,6 +611,9 @@ struct lang_hooks
   /* Invoked before the early_finish debug hook is invoked.  */
   void (*finalize_early_debug) (void);

+  /* Does the language substitute NEW into ALLOCATE and DISPOSE into DEALLOCATE?  */
+  bool (*new_dispose_storage_substitution) (void);
+
   /* Whenever you add entries here, make sure you adjust langhooks-def.h
      and langhooks.c accordingly.  */
 };

regards,
Gaius

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] analyzer: enabling Modula-2 Storage/ALLOCATE/DEALLOCATE
  2021-04-13  9:52 [PATCH] analyzer: enabling Modula-2 Storage/ALLOCATE/DEALLOCATE Gaius Mulley
@ 2021-04-13 15:12 ` David Malcolm
  2021-04-19 13:13   ` Gaius Mulley
  0 siblings, 1 reply; 3+ messages in thread
From: David Malcolm @ 2021-04-13 15:12 UTC (permalink / raw)
  To: Gaius Mulley, gcc-patches

On Tue, 2021-04-13 at 10:52 +0100, Gaius Mulley wrote:
> 
> 
> Hello David and fellow GCC developers,
> 
> [the proposed patches for GCC trunk are at the end of the email]
> 
> I've been having way too much fun with your -fanalyzer code and
> here are four patches which give the analyzer the ability to
> understand the Modula-2 Storage API.
> 
>   http://www.nongnu.org/gm2/gm2-libs-isostorage.html
>   http://www.nongnu.org/gm2/gm2-libsstorage.html
>   http://www.nongnu.org/gm2/gm2-libssysstorage.html
> 
> Here it is in action - these short tests have all been added to the
> gm2 regression testsuite.  Many are from your C examples - rewritten
> in Modula-2.  Ignoring a few poor token position subexpressions from
> the front end, which I'm currently improving, it appears to work!

Excellent!  I'm glad you're enjoying it.

Caveat: The last time I did any Modula-2 coding was a MUD I wrote at
school, which I realize to my horror is now over 30 years ago; I wonder
if the disks are still readable :)

Various comments inline below throughout...

> I've rebuilt GCC trunk 10/04/2021 with these patches and no new
> regressions are introduced.  I ran a build of: c,c++,go,d,fortran,lto
> and checked the before and after regression tests.  The new code is by
> default disabled - hence the langhook patches.
> 
> While it is true that this is the cart before the horse (the gm2 front
> end is still to arrive in the gcc tree).  I thought it might be useful
> to post the patches - just in case others might benefit from the code
> or point out bugs/flaws in the code!
> 
> Anyway thanks for the analyzer - it is great fun reading the code and
> addictive trying to improve the accuracy of the error messages.  The
> four patches follow after the examples below.
> 
> Examples of it in use:
> 
> $ cat badlistfree.mod
> MODULE badlistfree ;
> 
> FROM Storage IMPORT ALLOCATE, DEALLOCATE ;
> 
> TYPE
>    list = POINTER TO RECORD
>                         value: CARDINAL ;
>                         next : list ;
>                      END ;
> VAR
>    head: list ;
> 
> PROCEDURE badfree (l: list) ;
> BEGIN
>    DISPOSE (l) ;
>    WHILE l^.next # NIL DO
>       l := l^.next ;
>       DISPOSE (l)
>    END
> END badfree ;
> 
> 
> BEGIN
>    NEW (head) ;
>    badfree (head) ;
> END badlistfree.
> $ gm2 -g -c -fanalyzer badlistfree.mod
> badlistfree.mod: In function ‘badfree’:
> badlistfree.mod:16:24: warning: use after ‘DISPOSE’ of ‘l’ [CWE-416] [-
> Wanalyzer-use-after-free]
>    16 |    WHILE l^.next # NIL DO
>       |                        ^~
>   ‘badfree’: events 1-2
>     |
>     |   15 |    DISPOSE (l) ;
>     |      |    ^~~~~~~~~~
>     |      |    |
>     |      |    (1) deallocated here
>     |   16 |    WHILE l^.next # NIL DO
>     |      |                        ~~
>     |      |                        |
>     |      |                        (2) use after ‘DISPOSE’ of ‘l’;
> deallocated at (1)
>     |

Excellent.

Do you actually need the NEW(head); to trigger the warning?

Given:

BEGIN
  NEW (head) ;

  (* "head" is now explicitly in state "unchecked" *)

  badfree (head) ;
END badlistfree.

whereas, assuming that "head" is being passed in, say, as a param:

BEGIN

  (* "head" is implicitly in state "start" *)

  badfree (head) ;  
END badlistfree.

so if you're thinking about unit test coverage, it may be an idea to
have both cases.

Your patch doesn't seem to have any test cases, but presumably that
would be part of the gm2 frontend and thus would be merged with that,
rather than as this patch.  I went poking around in your repo, and see:
http://git.savannah.gnu.org/cgit/gm2.git/commit/?id=bd9e38fcebf5c063370bec9a69331037005b15de

I don't see any DejaGnu directives on the files; should there be
something like

  (* dg-warning "use after 'DISPOSE' of 'l'" */

on the pertinent line, or do your tests work a different way?



> $ cat disposenoalloc.mod
> MODULE disposenoalloc ;
> 
> FROM Storage IMPORT ALLOCATE, DEALLOCATE ;
> FROM SYSTEM IMPORT ADR ;
> 
> TYPE
>    list = POINTER TO RECORD
>                         value: CARDINAL ;
>                         next : list ;
>                      END ;
> VAR
>    head: list ;
> 
> BEGIN
>    head := ADR (head) ;
>    DISPOSE (head)
> END disposenoalloc.
> $ gm2 -g -c -fanalyzer disposenoalloc.mod
> disposenoalloc.mod: In function ‘_M2_disposenoalloc_init’:
> disposenoalloc.mod:16:4: warning: ‘DISPOSE’ of ‘head’ which points to
> memory not on the heap [CWE-590] [-Wanalyzer-free-of-non-heap]
>    16 |    DISPOSE (head)
>       |    ^~~~~~~~~~~~~
>   ‘_M2_disposenoalloc_init’: events 1-3
>     |
>     |   15 |    head := ADR (head) ;
>     |      |                       ^
>     |      |                       |
>     |      |                       (1) pointer is from here
>     |   16 |    DISPOSE (head)
>     |      |    ~~~~~~~~~~~~~       
>     |      |    |        |
>     |      |    |        (2) pointer is from here
>     |      |    (3) call to ‘DISPOSE’ here
>     |

I had to look up ADR; it's the address of a variable, right?

Event (2) looks spurious, but I think it's a bug I've seen before.


> $ cat testdoubledispose.mod
> MODULE testdoubledispose ;
> 
> FROM Storage IMPORT ALLOCATE, DEALLOCATE ;
> 
> TYPE
>    list = POINTER TO RECORD
>                         value: CARDINAL ;
>                         next : list ;
>                      END ;
> VAR
>    head: list ;
> BEGIN
>    NEW (head) ;
>    DISPOSE (head) ;
>    DISPOSE (head) ;
> END testdoubledispose.
> $ gm2 -g -c -fanalyzer testdoubledispose.mod
> testdoubledispose.mod: In function ‘_M2_testdoubledispose_init’:
> testdoubledispose.mod:15:4: warning: double-‘DISPOSE’ of ‘head’ [CWE-
> 415] [-Wanalyzer-double-free]
>    15 |    DISPOSE (head) ;
>       |    ^~~~~~~~~~~~~
>   ‘_M2_testdoubledispose_init’: events 1-3
>     |
>     |   13 |    NEW (head) ;
>     |      |    ^~~~~~~~~
>     |      |    |
>     |      |    (1) allocated here
>     |   14 |    DISPOSE (head) ;
>     |      |    ~~~~~~~~~~~~~
>     |      |    |
>     |      |    (2) first ‘DISPOSE’ here
>     |   15 |    DISPOSE (head) ;
>     |      |    ~~~~~~~~~~~~~
>     |      |    |
>     |      |    (3) second ‘DISPOSE’ here; first ‘DISPOSE’ was at (2)
>     |

Great.  As noted above, for maximum test coverage it would be good to
have a case where "head" is passed in in unknown (start) state as well
as this case where it is explicitly NEW-ed.


> $ cat testdoublefree.mod
> MODULE testdoublefree ;
> 
> FROM libc IMPORT malloc, free ;
> FROM SYSTEM IMPORT ADDRESS ;
> 
> VAR
>    a: ADDRESS ;
> BEGIN
>    a := malloc (100) ;
>    free (a) ;
>    free (a)
> END testdoublefree.
> $ gm2 -g -c -fanalyzer testdoublefree.mod
> testdoublefree.mod: In function ‘_M2_testdoublefree_init’:
> testdoublefree.mod:11:4: warning: double-‘free’ of ‘a’ [CWE-415] [-
> Wanalyzer-double-free]
>    11 |    free (a)
>       |    ^~~~
>   ‘_M2_testdoublefree_init’: events 1-3
>     |
>     |    9 |    a := malloc (100) ;
>     |      |         ^~~~~~
>     |      |         |
>     |      |         (1) allocated here
>     |   10 |    free (a) ;
>     |      |    ~~~~  
>     |      |    |
>     |      |    (2) first ‘free’ here
>     |   11 |    free (a)
>     |      |    ~~~~  
>     |      |    |
>     |      |    (3) second ‘free’ here; first ‘free’ was at (2)
>     |

Great - double-free detection was my initial motivation for writing the
analyzer.

Same notes as before about test coverage for 'a' passed in in unknown
state vs explicitly malloced.

Can you get it to warn if the ADDRESS is dereferenced without checking
the result of malloc for NULL?
(should be -Wanalyzer-possible-null-dereference)


> $ cat useafterdeallocate.mod
> MODULE useafterdeallocate ;
> 
> FROM Storage IMPORT ALLOCATE, DEALLOCATE ;
> 
> TYPE
>    ptrType = POINTER TO RECORD
>                            foo: CARDINAL ;
>                         END ;
> 
> VAR
>    head: ptrType ;
> BEGIN
>    NEW (head) ;
>    IF head # NIL
>    THEN
>       head^.foo := 1 ;
>       DISPOSE (head) ;
>       head^.foo := 2
>    END
> END useafterdeallocate.
> $ gm2 -g -c -fanalyzer useafterdeallocate.mod
> useafterdeallocate.mod: In function ‘_M2_useafterdeallocate_init’:
> useafterdeallocate.mod:18:17: warning: use after ‘DISPOSE’ of ‘head’
> [CWE-416] [-Wanalyzer-use-after-free]
>    18 |       head^.foo := 2
>       |                 ^~
>   ‘_M2_useafterdeallocate_init’: events 1-6
>     |
>     |   13 |    NEW (head) ;
>     |      |    ^~~~~~~~~
>     |      |    |
>     |      |    (1) allocated here
>     |   14 |    IF head # NIL

If you don't check for head # NIL here, does the analyzer complain
about -Wanalyzer-possible-null-dereference?

>     |   15 |    THEN
>     |      |    ~~~~
>     |      |    |
>     |      |    (2) assuming ‘head’ is non-NULL
>     |      |    (3) following ‘false’ branch...
>     |   16 |       head^.foo := 1 ;
>     |      |            ~
>     |      |            |
>     |      |            (4) ...to here
>     |   17 |       DISPOSE (head) ;
>     |      |       ~~~~~~~~~~~~~
>     |      |       |
>     |      |       (5) deallocated here
>     |   18 |       head^.foo := 2
>     |      |                 ~~
>     |      |                 |
>     |      |                 (6) use after ‘DISPOSE’ of ‘_T30’;
> deallocated at (5)
>     |

The '_T30' in the message for the final event (6) is a bit of a wart;
it would be better as 'head'.  Interestingly, that's what the "warning"
message at the top says.  This is reminiscent of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99771, fwiw.

What does the underlying gimple look like?  You can see it via -fdump-
ipa-analyzer=stderr, though I suspect you know that.


Other things you might want to try that might already work, if you're
still enjoying playing with it:
- is it a problem if the user tried to "free" something that should
have been DISPOSEd and vice-versa?  Looking at your patch, it looks
like -Wanalyzer-mismatching-deallocation may issue a diagnostic for
such a case
- the analyzer can probably complain about dereference of NIL (e.g.
interprocedually if one function returns a NIL and the caller then
blindly dereferences the result).  But does Modula-2 have any
protection for that syntactically? (I can't remember)
- do interprocedural cases look sane?
- do intermodule cases work with -flto?  (assuming that m2 already
works with -flto, of course.  Note that I committed a bug fix for the
analyzer's LTO support last night [PR98599], so you'll want to refresh
your tree or grab that patch if you want to try LTO).


> and the four file patches:
> 
> ChangeLog entries:
> ==================

[...snip...]

> Patches
> =======
> 
> 
> diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc
> index 1d5b8601b1f..9f0f544e32c 100644
> --- a/gcc/analyzer/sm-malloc.cc
> +++ b/gcc/analyzer/sm-malloc.cc
> @@ -22,6 +22,8 @@ along with GCC; see the file COPYING3.  If not see
>  #include "system.h"
>  #include "coretypes.h"
>  #include "tree.h"
> +#include "langhooks-def.h"
> +#include "langhooks.h"
>  #include "function.h"
>  #include "basic-block.h"
>  #include "gimple.h"
> @@ -44,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "analyzer/region-model.h"
>  #include "stringpool.h"
>  #include "attribs.h"
> +#include "print-tree.h"
> 
>  #if ENABLE_ANALYZER
> 
> @@ -387,6 +390,7 @@ public:
>    standard_deallocator_set m_free;
>    standard_deallocator_set m_scalar_delete;
>    standard_deallocator_set m_vector_delete;
> +  standard_deallocator_set m_storage_deallocate;

This new field could use a comment.

Actually, maybe the names should identify the language, say after the
"m_" prefix, so e.g.:

    /* C++ scalar delete and vector delete[].  */
    standard_deallocator_set m_cp_scalar_delete;
    standard_deallocator_set m_cp_vector_delete;

    /* Modula 2 something something.  */
    standard_deallocator_set m_m2_storage_deallocate;


>    standard_deallocator m_realloc;
> 
> @@ -419,13 +423,18 @@ private:
>                             const supernode *node,
>                             const gcall *call,
>                             const deallocator *d,
> -                           unsigned argno) const;
> +                            unsigned argno,
> +                            bool pass_by_reference = false) const;
>    void on_realloc_call (sm_context *sm_ctxt,
>                         const supernode *node,
>                         const gcall *call) const;
>    void on_zero_assignment (sm_context *sm_ctxt,
>                            const gimple *stmt,
>                            tree lhs) const;
> +  void on_allocator_call_pass_by_ref (sm_context *sm_ctxt,
> +                                      const gcall *call,
> +                                      const deallocator_set
> *deallocators,
> +                                      bool returns_nonnull = false)
> const;

I only see one use of this, so is the default value for the final param
actually needed? (I prefer to avoid default params when they're not
needed)

>    /* A map for consolidating deallocators so that they are
>       unique per deallocator FUNCTION_DECL.  */
> @@ -1370,6 +1379,7 @@ malloc_state_machine::malloc_state_machine
> (logger *logger)
>    m_free (this, "free", WORDING_FREED),
>    m_scalar_delete (this, "delete", WORDING_DELETED),
>    m_vector_delete (this, "delete[]", WORDING_DELETED),
> +  m_storage_deallocate (this, "DISPOSE", WORDING_DEALLOCATED),
>    m_realloc (this, "realloc", WORDING_REALLOCATED)
>  {
>    gcc_assert (m_start->get_id () == 0);
> @@ -1415,7 +1425,7 @@ get_or_create_custom_deallocator_set (tree
> allocator_fndecl)
>      return NULL;
> 
>    /* Otherwise, call maybe_create_custom_deallocator_set,
> -     memoizing the result.  */
> +     memorizing the result.  */

I think I did mean "memoizing" here - but maybe "caching" is a less
fancy way of writing that.  Am I misuing the term?  I'm not a CompSci
professor :)


>    if (custom_deallocator_set **slot
>        = m_custom_deallocator_set_cache.get (allocator_fndecl))
>      return *slot;
> @@ -1526,6 +1536,15 @@ malloc_state_machine::on_stmt (sm_context
> *sm_ctxt,
>             return true;
>           }
> 
> +        if (lang_hooks.new_dispose_storage_substitution)

The langhook returns a bool, so I think you meant to call it here:

           if (lang_hooks.new_dispose_storage_substitution ())

rather than see if the function pointer is non-NULL.

That said, I don't love that lang_hook (more generally I'd prefer we
had vfuncs to callbacks, but that's what we currently have).

I'm not quite sure what the interface between the analyzer and
frontends should be.  One of my goals for the analyzer is whole-program
analysis, with multiple source files (which is why I picked the gimple-
SSA representation, to try to piggyback off the existing LTO
infrastructure).

In particular, what happens if some of a program is written in one
source language, and some in another?

I have vague ideas of a class representing a source language, with
subclasses implementing analyzer policy for each source language, so
that the analyzer can instantiate multiple objects and thus somehow do
the correct thing for such cases.  I'm handwaving here, of course. 
It's very useful seeing your patch, because it makes these ideas more
concrete - thanks.


> +          /* m2 allocation.  */
> +          if (is_named_call_p (callee_fndecl, "Storage_ALLOCATE",
> call, 2)
> +              || is_named_call_p (callee_fndecl,
> "SysStorage_ALLOCATE", call, 2))
> +            {
> +              on_allocator_call_pass_by_ref (sm_ctxt, call,
> &m_storage_deallocate, false);
> +              return true;
> +            }
> +

It looks like all the langhook is doing is conditionalizing the
recognition of a specific function by name, which presumably is
reserved within the Modula 2 ecosystem, but isn't a reserved name in
the world of C/C++.  Is that the reason for making it conditional? 
Maybe we can identify which source language a particular gimple stmt
was written in, and apply policy based on that.  (what about cross-
language inlining?!)

Also, I recognize that gradually adding more and more string
comparisons for special-cases to the analyzer isn't ideal; sorry about
that.  There's probably a better way to recognize these functions, but
I'm not sure what it is yet.



>         if (is_named_call_p (callee_fndecl, "operator new", call, 1))
>           on_allocator_call (sm_ctxt, call, &m_scalar_delete);
>         else if (is_named_call_p (callee_fndecl, "operator new []",
> call, 1))
> @@ -1562,6 +1581,16 @@ malloc_state_machine::on_stmt (sm_context
> *sm_ctxt,
>             return true;
>           }
> 
> +        if (lang_hooks.new_dispose_storage_substitution)
> +          /* m2 deallocation.  */
> +          if (is_named_call_p (callee_fndecl, "Storage_DEALLOCATE",
> call, 2)
> +              || is_named_call_p (callee_fndecl,
> "SysStorage_DEALLOCATE", call, 2))
> +            {
> +              on_deallocator_call (sm_ctxt, node, call,
> +                                  
> &m_storage_deallocate.m_deallocator, 0, true);
> +              return true;
> +            }
> +

Similar considerations as above.

>         if (is_named_call_p (callee_fndecl, "realloc", call, 2)
>             || is_named_call_p (callee_fndecl, "__builtin_realloc",
> call, 2))
>           {
> @@ -1731,16 +1760,72 @@ malloc_state_machine::on_allocator_call
> (sm_context *sm_ctxt,
>      }
>  }
> 
> +/* Skips an ADDR_EXPR if seen.  */
> +
> +static
> +tree
> +skip_reference (tree arg)
> +{
> +  if (TREE_CODE (arg) == ADDR_EXPR)
> +      return TREE_OPERAND (arg, 0);
> +  return arg;
> +}
> +
> +
> +/* Handle allocators which return the value through a pass by
> reference parameter.  */
> +
> +void
> +malloc_state_machine::on_allocator_call_pass_by_ref (sm_context
> *sm_ctxt,
> +                                                     const gcall
> *call,
> +                                                     const
> deallocator_set *deallocators,
> +                                                     bool
> returns_nonnull) const
> +{
> +  if (gimple_call_num_args (call) == 0)
> +    return;
> +  tree arg = gimple_call_arg (call, 0);
> +  if (arg)
> +    {
> +      /* in Modula-2 the heap allocator API is: ALLOCATE (VAR ADDRESS;
> +         CARDINAL).  So we need to skip the reference or pointer in
> +         the first parameter.  */
> +      tree diag_arg_lvalue = sm_ctxt->get_diagnostic_tree (arg);

I'm not sure get_diagnostic_tree is correct here; that tries to
generate a human-readable tree for diagnostics.
Why can't you just use 'arg'?  What kind of trees are you getting?

> +      tree diag_arg_rvalue = skip_reference (diag_arg_lvalue);

The conditional in skip_reference seems like a "code smell" to me: is
the code meant to be dereferencing the pointer or not?  What does that
mean for the types of the things involved?  It seems like it shouldn't
be conditional to me - or maybe there's something about Modula 2 that
I'm missing?

> +      if (sm_ctxt->get_state (call, diag_arg_rvalue) == m_start)
> +       {
> +         sm_ctxt->set_next_state (call, diag_arg_rvalue,
> +                                  (returns_nonnull
> +                                   ? deallocators->m_nonnull
> +                                   : deallocators->m_unchecked));
> +       }
> +    }
> +}
> +
>  void
>  malloc_state_machine::on_deallocator_call (sm_context *sm_ctxt,
>                                            const supernode *node,
>                                            const gcall *call,
>                                            const deallocator *d,
> -                                          unsigned argno) const
> +                                           unsigned argno,
> +                                           bool pass_by_reference)
> const
>  {
>    if (argno >= gimple_call_num_args (call))
>      return;
>    tree arg = gimple_call_arg (call, argno);
> +  if (pass_by_reference)
> +    {
> +      tree diag_arg = sm_ctxt->get_diagnostic_tree (arg);
> +      /* in Modula-2 the API is: DEALLOCATE (VAR a: ADDRESS; size:
> +         CARDINAL).  So we need to skip the pointer or reference in
> +         the first parameter.  We also know the pointer is assigned to
> +         NULL.  In C this could be described as:
> +
> +         DEALLOCATE (void **address, unsigned int size)
> +         {
> +            free (*address);
> +            *address = NULL;
> +         }.  */
> +      arg = skip_reference (diag_arg);
> +    }

Similar comments as above about use of get_diagnostic_tree and
skip_reference.


>    state_t state = sm_ctxt->get_state (call, arg);
> 
> @@ -1783,6 +1868,9 @@ malloc_state_machine::on_deallocator_call
> (sm_context *sm_ctxt,
>                                            d->m_name));
>        sm_ctxt->set_next_state (call, arg, m_stop);
>      }
> +  /* in Modula-2 the DEALLOCATE assigns the pointer to NULL.  However
> +     we don't do this in the analyzer as it ignores NULL pointers
> +     during deallocation.  */

Annoyingly, the analyzer's knowledge of what a function does it split
between the sm-*.cc state machine implementations, and region-
model*.cc, in particularly region-model-impl-calls.cc.  In theory you
could "teach" the analyzer that DEALLOCATE writes NULL; have a look at
region-model-impl-calls.cc and the various region_model::impl_call_*
methods implemented there.  Again, my apologies for the rather shoddy
way that special-case knowledge of functions is implemented.

You might consider implementing Storage_ALLOCATE/DEALLOCATE there, see
region_model::impl_call_malloc/free and
region_model::impl_call_operator_new/delete.


>  }
> 
>  /* Implementation of realloc(3):
> diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h
> index ae3991c770a..952513c071b 100644
> --- a/gcc/langhooks-def.h
> +++ b/gcc/langhooks-def.h
> @@ -95,6 +95,7 @@ extern const char *lhd_get_substring_location (const
> substring_loc &,
>  extern int lhd_decl_dwarf_attribute (const_tree, int);
>  extern int lhd_type_dwarf_attribute (const_tree, int);
>  extern void lhd_finalize_early_debug (void);
> +extern bool lhd_new_dispose_storage_substitution (void);
> 
>  #define LANG_HOOKS_NAME                        "GNU unknown"
>  #define LANG_HOOKS_IDENTIFIER_SIZE     sizeof (struct lang_identifier)
> @@ -147,6 +148,7 @@ extern void lhd_finalize_early_debug (void);
>  #define LANG_HOOKS_RUN_LANG_SELFTESTS   lhd_do_nothing
>  #define LANG_HOOKS_GET_SUBSTRING_LOCATION lhd_get_substring_location
>  #define LANG_HOOKS_FINALIZE_EARLY_DEBUG lhd_finalize_early_debug
> +#define LANG_HOOKS_NEW_DISPOSE_STORAGE_SUBSTITUTION
> lhd_new_dispose_storage_substitution
> 
>  /* Attribute hooks.  */
>  #define LANG_HOOKS_ATTRIBUTE_TABLE             NULL
> @@ -381,7 +383,8 @@ extern void lhd_end_section (void);
>    LANG_HOOKS_EMITS_BEGIN_STMT, \
>    LANG_HOOKS_RUN_LANG_SELFTESTS, \
>    LANG_HOOKS_GET_SUBSTRING_LOCATION, \
> -  LANG_HOOKS_FINALIZE_EARLY_DEBUG \
> +  LANG_HOOKS_FINALIZE_EARLY_DEBUG, \
> +  LANG_HOOKS_NEW_DISPOSE_STORAGE_SUBSTITUTION \
>  }
> 

So the function pointer is non-NULL for the default case, and so the
conditional is true for every frontend  :)


>  #endif /* GCC_LANG_HOOKS_DEF_H */
> diff --git a/gcc/langhooks.c b/gcc/langhooks.c
> index 2354386f7b4..6486c484895 100644
> --- a/gcc/langhooks.c
> +++ b/gcc/langhooks.c
> @@ -896,6 +896,13 @@ lhd_finalize_early_debug (void)
>      (*debug_hooks->early_global_decl) (cnode->decl);
>  }
> 
> +/* Should the analyzer check for NEW/DISPOSE
> Storage_ALLOCATE/Storage_DEALLOCATE?  */
> +
> +bool lhd_new_dispose_storage_substitution (void)
> +{
> +  return false;
> +}
> +
>  /* Returns true if the current lang_hooks represents the GNU C
> frontend.  */
> 
>  bool
> diff --git a/gcc/langhooks.h b/gcc/langhooks.h
> index 842e605c439..16b368bfbe1 100644
> --- a/gcc/langhooks.h
> +++ b/gcc/langhooks.h
> @@ -611,6 +611,9 @@ struct lang_hooks
>    /* Invoked before the early_finish debug hook is invoked.  */
>    void (*finalize_early_debug) (void);
> 
> +  /* Does the language substitute NEW into ALLOCATE and DISPOSE into
> DEALLOCATE?  */
> +  bool (*new_dispose_storage_substitution) (void);
> +
>    /* Whenever you add entries here, make sure you adjust langhooks-
> def.h
>       and langhooks.c accordingly.  */
>  };
> 
> regards,
> Gaius

Hope the above is constructive, sorry if it comes across as nitpicky in
places.

Thanks again for posting the patch, it's very exciting to see the
analyzer support other languages!

Dave



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] analyzer: enabling Modula-2 Storage/ALLOCATE/DEALLOCATE
  2021-04-13 15:12 ` David Malcolm
@ 2021-04-19 13:13   ` Gaius Mulley
  0 siblings, 0 replies; 3+ messages in thread
From: Gaius Mulley @ 2021-04-19 13:13 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches

David Malcolm <dmalcolm@redhat.com> writes:

> On Tue, 2021-04-13 at 10:52 +0100, Gaius Mulley wrote:
>>
>> Hello David and fellow GCC developers,
>>
>> [the proposed patches for GCC trunk are at the end of the email]
>>

Firstly many thanks for your excellent feedback.  Initially I thought
I'd post the latest version of patches which have taken on board the
issues and problems you identified - however due to the length of this
reply I think it better to post new patches separately.  Many new
tests, use of dg signatures, improved accuracy of token locations and
different heaps tracked in the new patch (to be emailed later).

> Excellent!  I'm glad you're enjoying it.
>
> Caveat: The last time I did any Modula-2 coding was a MUD I wrote at
> school, which I realize to my horror is now over 30 years ago; I wonder
> if the disks are still readable :)

ah yes oddly I did something rather similar post graduation
(multiplayer morloc tower clone).  I think it was partly inspired by
the elegant SYSTEM PROCESS primitives NEWPROCESS, IOTRANSFER and
TRANSFER which allow for a simple implementation of a multitasking
executive.

I've modified all the Modula-2 analyzer tests to use the dg
signatures as you suggested and added many new tests based on your
comments.  Many were the same test without using NEW to allocate
memory.  Most of which pass!  However I do get some failures in
particular:

Running /home/gaius/GM2/graft-combine/gm2-floppsie/gcc/testsuite/gm2/analyzer/fail/gm2-dg.exp ...
FAIL: gm2/analyzer/fail/globalnilderef.mod   -O   (test for warnings, line 17)
FAIL: gm2/analyzer/fail/globalnilderef.mod   -O   (test for errors, line 18)
FAIL: gm2/analyzer/fail/globalnilderef.mod   -O  (test for excess errors)

I've also added the ability to track different heaps (which was a
trivial change) but an improvement as it tracks allocation from
Storage and SysStorage.

$ cat globalnilderef.mod
MODULE globalnilderef ;

(* { dg-options "-fanalyzer" }  *)
(* { dg-do compile }  *)

FROM libc IMPORT printf ;

TYPE
   List = POINTER TO RECORD
                        value: CARDINAL ;
                        next : List ;
                     END ;

VAR
   l: List ;
BEGIN
   l^.value := 1 ;   (* { dg-warning "runtime error will occur, if this pointer value 'l' is ever dereferenced it will cause an exception.*" }  *)
   printf ("value is: %d\n", l^.value)   (* { dg-error "runtime error will occur, if this pointer value 'l' is ever dereferenced it will cause an exception.*" }  *)
END globalnilderef.
$ gm2 -O2 -fanalyzer -c -g globalnilderef.mod
$ gm2 -O2 -fanalyzer -c -g -fdump-ipa-analyzer=stderr globalnilderef.mod

INTEGER _M2_globalnilderef_finish ()
{
  <bb 2> [local count: 1073741824]:
  return;
}


INTEGER _M2_globalnilderef_init ()
{
  struct
{
  CARDINAL value;
} * l.0_1;
  INTEGER _8;

  <bb 2> [local count: 1073741824]:
  l.0_1 = l;
  _T24 = l.0_1;
  l.0_1->value = 1;
  _T26 = l.0_1;
  _T28 = "value is: %d\n";
  _8 = printf ("value is: %d\n", 1);
  _T29 = _8;
  return;
}



>> $ cat badlistfree.mod
>> MODULE badlistfree ;
>>
>> FROM Storage IMPORT ALLOCATE, DEALLOCATE ;
>>
>> TYPE
>>    list = POINTER TO RECORD
>>                         value: CARDINAL ;
>>                         next : list ;
>>                      END ;
>> VAR
>>    head: list ;
>>
>> PROCEDURE badfree (l: list) ;
>> BEGIN
>>    DISPOSE (l) ;
>>    WHILE l^.next # NIL DO
>>       l := l^.next ;
>>       DISPOSE (l)
>>    END
>> END badfree ;
>>
>>
>> BEGIN
>>    NEW (head) ;
>>    badfree (head) ;
>> END badlistfree.
>> $ gm2 -g -c -fanalyzer badlistfree.mod
>> badlistfree.mod: In function ‘badfree’:
>> badlistfree.mod:16:24: warning: use after ‘DISPOSE’ of ‘l’ [CWE-416] [-
...
> Excellent.
>
> Do you actually need the NEW(head); to trigger the warning?

no - you're right it can be triggered without NEW (head).

> so if you're thinking about unit test coverage, it may be an idea to
> have both cases.

thanks for the steer - I've now included both cases for this and
other similar testcases.

> Your patch doesn't seem to have any test cases, but presumably that
> would be part of the gm2 frontend and thus would be merged with that,
> rather than as this patch.  I went poking around in your repo, and see:

> I don't see any DejaGnu directives on the files; should there be
> something like
>
>   (* dg-warning "use after 'DISPOSE' of 'l'" */
>
> on the pertinent line, or do your tests work a different way?

Thanks for the advice - I've now adopted the dg signatures in this
directory.

git.savannah.gnu.org/cgit/gm2.git/tree/gcc-versionno/gcc/testsuite/gm2/analyzer/fail

The fail directories for the remainder of the testsuite simply test
for any compiler exit (non zero) or any error message.  But I'll trawl
though these and migrate to the dg- style.

>> $ cat disposenoalloc.mod
>> MODULE disposenoalloc ;
>>
>> FROM Storage IMPORT ALLOCATE, DEALLOCATE ;
>> FROM SYSTEM IMPORT ADR ;
>>
>> TYPE
>>    list = POINTER TO RECORD
>>                         value: CARDINAL ;
>>                         next : list ;
>>                      END ;
>> VAR
>>    head: list ;
>>
>> BEGIN
>>    head := ADR (head) ;
>>    DISPOSE (head)
>> END disposenoalloc.
>> $ gm2 -g -c -fanalyzer disposenoalloc.mod
>> disposenoalloc.mod: In function ‘_M2_disposenoalloc_init’:
>> disposenoalloc.mod:16:4: warning: ‘DISPOSE’ of ‘head’ which points to
>> memory not on the heap [CWE-590] [-Wanalyzer-free-of-non-heap]
>>    16 |    DISPOSE (head)
>>       |    ^~~~~~~~~~~~~
>>   ‘_M2_disposenoalloc_init’: events 1-3
>>     |
>>     |   15 |    head := ADR (head) ;
>>     |      |                       ^
>>     |      |                       |
>>     |      |                       (1) pointer is from here
>>     |   16 |    DISPOSE (head)
>>     |      |    ~~~~~~~~~~~~~      
>>     |      |    |        |
>>     |      |    |        (2) pointer is from here
>>     |      |    (3) call to ‘DISPOSE’ here
>>     |
>
> I had to look up ADR; it's the address of a variable, right?

yes true.

> Event (2) looks spurious, but I think it's a bug I've seen before.
>
>
>> $ cat testdoubledispose.mod
>> MODULE testdoubledispose ;
>>
>> FROM Storage IMPORT ALLOCATE, DEALLOCATE ;
>>
>> TYPE
>>    list = POINTER TO RECORD
>>                         value: CARDINAL ;
>>                         next : list ;
>>                      END ;
>> VAR
>>    head: list ;
>> BEGIN
>>    NEW (head) ;
>>    DISPOSE (head) ;
>>    DISPOSE (head) ;
>> END testdoubledispose.
>> $ gm2 -g -c -fanalyzer testdoubledispose.mod
>> testdoubledispose.mod: In function ‘_M2_testdoubledispose_init’:
>> testdoubledispose.mod:15:4: warning: double-‘DISPOSE’ of ‘head’ [CWE-
>> 415] [-Wanalyzer-double-free]
>>    15 |    DISPOSE (head) ;
>>       |    ^~~~~~~~~~~~~
>>   ‘_M2_testdoubledispose_init’: events 1-3
>>     |
>>     |   13 |    NEW (head) ;
>>     |      |    ^~~~~~~~~
>>     |      |    |
>>     |      |    (1) allocated here
>>     |   14 |    DISPOSE (head) ;
>>     |      |    ~~~~~~~~~~~~~
>>     |      |    |
>>     |      |    (2) first ‘DISPOSE’ here
>>     |   15 |    DISPOSE (head) ;
>>     |      |    ~~~~~~~~~~~~~
>>     |      |    |
>>     |      |    (3) second ‘DISPOSE’ here; first ‘DISPOSE’ was at (2)
>>     |
>
> Great.  As noted above, for maximum test coverage it would be good to
> have a case where "head" is passed in in unknown (start) state as well
> as this case where it is explicitly NEW-ed.

yes sure - all done now.

>> $ cat testdoublefree.mod
>> MODULE testdoublefree ;
>>
>> FROM libc IMPORT malloc, free ;
>> FROM SYSTEM IMPORT ADDRESS ;
>>
>> VAR
>>    a: ADDRESS ;
>> BEGIN
>>    a := malloc (100) ;
>>    free (a) ;
>>    free (a)
>> END testdoublefree.
>> $ gm2 -g -c -fanalyzer testdoublefree.mod
>> testdoublefree.mod: In function ‘_M2_testdoublefree_init’:
>> testdoublefree.mod:11:4: warning: double-‘free’ of ‘a’ [CWE-415] [-
>> Wanalyzer-double-free]
>>    11 |    free (a)
>>       |    ^~~~
>>   ‘_M2_testdoublefree_init’: events 1-3
>>     |
>>     |    9 |    a := malloc (100) ;
>>     |      |         ^~~~~~
>>     |      |         |
>>     |      |         (1) allocated here
>>     |   10 |    free (a) ;
>>     |      |    ~~~~ 
>>     |      |    |
>>     |      |    (2) first ‘free’ here
>>     |   11 |    free (a)
>>     |      |    ~~~~ 
>>     |      |    |
>>     |      |    (3) second ‘free’ here; first ‘free’ was at (2)
>>     |
>
> Great - double-free detection was my initial motivation for writing the
> analyzer.
>
> Same notes as before about test coverage for 'a' passed in in unknown
> state vs explicitly malloced.
>
> Can you get it to warn if the ADDRESS is dereferenced without checking
> the result of malloc for NULL?
> (should be -Wanalyzer-possible-null-dereference)

yes this also works - and new tests added.

>> $ cat useafterdeallocate.mod
>> MODULE useafterdeallocate ;
>>
>> FROM Storage IMPORT ALLOCATE, DEALLOCATE ;
>>
>> TYPE
>>    ptrType = POINTER TO RECORD
>>                            foo: CARDINAL ;
>>                         END ;
>>
>> VAR
>>    head: ptrType ;
>> BEGIN
>>    NEW (head) ;
>>    IF head # NIL
>>    THEN
>>       head^.foo := 1 ;
>>       DISPOSE (head) ;
>>       head^.foo := 2
>>    END
>> END useafterdeallocate.
>> $ gm2 -g -c -fanalyzer useafterdeallocate.mod
>> useafterdeallocate.mod: In function ‘_M2_useafterdeallocate_init’:
>> useafterdeallocate.mod:18:17: warning: use after ‘DISPOSE’ of ‘head’
>> [CWE-416] [-Wanalyzer-use-after-free]
>>    18 |       head^.foo := 2
>>       |                 ^~
>>   ‘_M2_useafterdeallocate_init’: events 1-6
>>     |
>>     |   13 |    NEW (head) ;
>>     |      |    ^~~~~~~~~
>>     |      |    |
>>     |      |    (1) allocated here
>>     |   14 |    IF head # NIL
>
> If you don't check for head # NIL here, does the analyzer complain
> about -Wanalyzer-possible-null-dereference?

yes it catches the possible null dereference (new test added :-)

>>     |   15 |    THEN
>>     |      |    ~~~~
>>     |      |    |
>>     |      |    (2) assuming ‘head’ is non-NULL
>>     |      |    (3) following ‘false’ branch...
>>     |   16 |       head^.foo := 1 ;
>>     |      |            ~
>>     |      |            |
>>     |      |            (4) ...to here
>>     |   17 |       DISPOSE (head) ;
>>     |      |       ~~~~~~~~~~~~~
>>     |      |       |
>>     |      |       (5) deallocated here
>>     |   18 |       head^.foo := 2
>>     |      |                 ~~
>>     |      |                 |
>>     |      |                 (6) use after ‘DISPOSE’ of ‘_T30’;
>> deallocated at (5)
>>     |
>
> The '_T30' in the message for the final event (6) is a bit of a wart;
> it would be better as 'head'.  Interestingly, that's what the "warning"
> message at the top says.  This is reminiscent of
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99771, fwiw.
>
> What does the underlying gimple look like?  You can see it via -fdump-
> ipa-analyzer=stderr, though I suspect you know that.

thanks - I'd missed that the analyzer could dump its knowledge.

> Other things you might want to try that might already work, if you're
> still enjoying playing with it:
> - is it a problem if the user tried to "free" something that should
> have been DISPOSEd and vice-versa?  Looking at your patch, it looks
> like -Wanalyzer-mismatching-deallocation may issue a diagnostic for
> such a case

the analyzer correctly catches this case (see mismatchedheap2.mod)
https://git.savannah.gnu.org/cgit/gm2.git/tree/gcc-versionno/gcc/testsuite/gm2/analyzer/fail/mismatchedheap2.mod

> - the analyzer can probably complain about dereference of NIL (e.g.
> interprocedually if one function returns a NIL and the caller then
> blindly dereferences the result).  But does Modula-2 have any
> protection for that syntactically? (I can't remember)

yes it protects this via the grammar.  The grammar forbids a
designator from calling a procedure function and only designators can
dereference pointers or access an array or a record field.

> - do interprocedural cases look sane?

yes these look very good :-), I'll post these in the next email.

> - do intermodule cases work with -flto?  (assuming that m2 already
> works with -flto, of course.  Note that I committed a bug fix for the
> analyzer's LTO support last night [PR98599], so you'll want to refresh
> your tree or grab that patch if you want to try LTO).

no at the moment gm2 fails with -flto.  I'll get onto this as well.

>> and the four file patches:
>>
>> ChangeLog entries:
>> ==================
>
> [...snip...]
>
>> Patches
>> =======
>>
>>
>> diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc
>> index 1d5b8601b1f..9f0f544e32c 100644
>> --- a/gcc/analyzer/sm-malloc.cc
>> +++ b/gcc/analyzer/sm-malloc.cc
>> @@ -22,6 +22,8 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "system.h"
>>  #include "coretypes.h"
>>  #include "tree.h"
>> +#include "langhooks-def.h"
>> +#include "langhooks.h"
>>  #include "function.h"
>>  #include "basic-block.h"
>>  #include "gimple.h"
>> @@ -44,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "analyzer/region-model.h"
>>  #include "stringpool.h"
>>  #include "attribs.h"
>> +#include "print-tree.h"
>>
>>  #if ENABLE_ANALYZER
>>
>> @@ -387,6 +390,7 @@ public:
>>    standard_deallocator_set m_free;
>>    standard_deallocator_set m_scalar_delete;
>>    standard_deallocator_set m_vector_delete;
>> +  standard_deallocator_set m_storage_deallocate;
>
> This new field could use a comment.
>
> Actually, maybe the names should identify the language, say after the
> "m_" prefix, so e.g.:
>
>     /* C++ scalar delete and vector delete[].  */
>     standard_deallocator_set m_cp_scalar_delete;
>     standard_deallocator_set m_cp_vector_delete;
>
>     /* Modula 2 something something.  */
>     standard_deallocator_set m_m2_storage_deallocate;
>
>>    standard_deallocator m_realloc;
>>
>> @@ -419,13 +423,18 @@ private:
>>                             const supernode *node,
>>                             const gcall *call,
>>                             const deallocator *d,
>> -                           unsigned argno) const;
>> +                            unsigned argno,
>> +                            bool pass_by_reference = false) const;
>>    void on_realloc_call (sm_context *sm_ctxt,
>>                         const supernode *node,
>>                         const gcall *call) const;
>>    void on_zero_assignment (sm_context *sm_ctxt,
>>                            const gimple *stmt,
>>                            tree lhs) const;
>> +  void on_allocator_call_pass_by_ref (sm_context *sm_ctxt,
>> +                                      const gcall *call,
>> +                                      const deallocator_set
>> *deallocators,
>> +                                      bool returns_nonnull = false)
>> const;
>
> I only see one use of this, so is the default value for the final param
> actually needed? (I prefer to avoid default params when they're not
> needed)

sure ok - I've removed the non needed default parameter.

>>    /* A map for consolidating deallocators so that they are
>>       unique per deallocator FUNCTION_DECL.  */
>> @@ -1370,6 +1379,7 @@ malloc_state_machine::malloc_state_machine
>> (logger *logger)
>>    m_free (this, "free", WORDING_FREED),
>>    m_scalar_delete (this, "delete", WORDING_DELETED),
>>    m_vector_delete (this, "delete[]", WORDING_DELETED),
>> +  m_storage_deallocate (this, "DISPOSE", WORDING_DEALLOCATED),
>>    m_realloc (this, "realloc", WORDING_REALLOCATED)
>>  {
>>    gcc_assert (m_start->get_id () == 0);
>> @@ -1415,7 +1425,7 @@ get_or_create_custom_deallocator_set (tree
>> allocator_fndecl)
>>      return NULL;
>>
>>    /* Otherwise, call maybe_create_custom_deallocator_set,
>> -     memoizing the result.  */
>> +     memorizing the result.  */
>
> I think I did mean "memoizing" here - but maybe "caching" is a less
> fancy way of writing that.  Am I misuing the term?  I'm not a CompSci
> professor :)

ah apologies - my mistake - but the silver lining is that I won't
forget this in a hurry.  Life imitates source :-)

>>    if (custom_deallocator_set **slot
>>        = m_custom_deallocator_set_cache.get (allocator_fndecl))
>>      return *slot;
>> @@ -1526,6 +1536,15 @@ malloc_state_machine::on_stmt (sm_context
>> *sm_ctxt,
>>             return true;
>>           }
>>
>> +        if (lang_hooks.new_dispose_storage_substitution)
>
> The langhook returns a bool, so I think you meant to call it here:
>
>            if (lang_hooks.new_dispose_storage_substitution ())
>
> rather than see if the function pointer is non-NULL.

ah yes thank you for spotting this error!

> That said, I don't love that lang_hook (more generally I'd prefer we
> had vfuncs to callbacks, but that's what we currently have).
>
> I'm not quite sure what the interface between the analyzer and
> frontends should be.  One of my goals for the analyzer is whole-program
> analysis, with multiple source files (which is why I picked the gimple-
> SSA representation, to try to piggyback off the existing LTO
> infrastructure).

yes it did feel a little awkward to use langhooks but the only option
currently.

> In particular, what happens if some of a program is written in one
> source language, and some in another?

indeed - and for gm2 every Modula-2 program is a mixed language
project.  It might be good to have this command line switchable?

> I have vague ideas of a class representing a source language, with
> subclasses implementing analyzer policy for each source language, so
> that the analyzer can instantiate multiple objects and thus somehow do
> the correct thing for such cases.  I'm handwaving here, of course.
> It's very useful seeing your patch, because it makes these ideas more
> concrete - thanks.

yes one of the motivations for posting was as a heads up - which
brings me onto a question.  In Modula-2 the language standard enforces
all kinds of runtime checks (two examples out of 24 are integer
overflow and procedure functions not executing a return statement).
These eventually map onto calls in M2RTS

PROCEDURE NoReturnException (filename: ADDRESS; line, column: CARDINAL; scope, message: ADDRESS) ;
PROCEDURE WholeValueException (filename: ADDRESS; line, column: CARDINAL; scope, message: ADDRESS) ;

I presume that it should be possible to add checks to determine
whether these exception routines are reachable from within the
analyzer?  I'm currently looking at sm-signal.cc and sm-malloc.cc and
region*cc for inspiration.

I wrote a plugin a few years ago which detected whether any of these
calls were reachable from the first basic block of any reachable
function and elevated the warning to an error - fairly limited
usefulness.  So I am keen to explore the usefulness of the analyzer to
highlight any potential m2 runtime exception.

>> +          /* m2 allocation.  */
>> +          if (is_named_call_p (callee_fndecl, "Storage_ALLOCATE",
>> call, 2)
>> +              || is_named_call_p (callee_fndecl,
>> "SysStorage_ALLOCATE", call, 2))
>> +            {
>> +              on_allocator_call_pass_by_ref (sm_ctxt, call,
>> &m_storage_deallocate, false);
>> +              return true;
>> +            }
>> +
>
> It looks like all the langhook is doing is conditionalizing the
> recognition of a specific function by name, which presumably is
> reserved within the Modula 2 ecosystem, but isn't a reserved name in
> the world of C/C++.  Is that the reason for making it conditional?

yes I was wondering if, eventually, a command line switch might be
appropriate to enable this feature from non Modula-2 front ends.

> Maybe we can identify which source language a particular gimple stmt
> was written in, and apply policy based on that.  (what about cross-
> language inlining?!)

hmm yes this needs some thought!

> Also, I recognize that gradually adding more and more string
> comparisons for special-cases to the analyzer isn't ideal; sorry about
> that.  There's probably a better way to recognize these functions, but
> I'm not sure what it is yet.
>
>>         if (is_named_call_p (callee_fndecl, "operator new", call, 1))
>>           on_allocator_call (sm_ctxt, call, &m_scalar_delete);
>>         else if (is_named_call_p (callee_fndecl, "operator new []",
>> call, 1))
>> @@ -1562,6 +1581,16 @@ malloc_state_machine::on_stmt (sm_context
>> *sm_ctxt,
>>             return true;
>>           }
>>
>> +        if (lang_hooks.new_dispose_storage_substitution)
>> +          /* m2 deallocation.  */
>> +          if (is_named_call_p (callee_fndecl, "Storage_DEALLOCATE",
>> call, 2)
>> +              || is_named_call_p (callee_fndecl,
>> "SysStorage_DEALLOCATE", call, 2))
>> +            {
>> +              on_deallocator_call (sm_ctxt, node, call,
>> +
>> &m_storage_deallocate.m_deallocator, 0, true);
>> +              return true;
>> +            }
>> +
>
> Similar considerations as above.
>
>>         if (is_named_call_p (callee_fndecl, "realloc", call, 2)
>>             || is_named_call_p (callee_fndecl, "__builtin_realloc",
>> call, 2))
>>           {
>> @@ -1731,16 +1760,72 @@ malloc_state_machine::on_allocator_call
>> (sm_context *sm_ctxt,
>>      }
>>  }
>>
>> +/* Skips an ADDR_EXPR if seen.  */
>> +
>> +static
>> +tree
>> +skip_reference (tree arg)
>> +{
>> +  if (TREE_CODE (arg) == ADDR_EXPR)
>> +      return TREE_OPERAND (arg, 0);
>> +  return arg;
>> +}
>> +
>> +
>> +/* Handle allocators which return the value through a pass by
>> reference parameter.  */
>> +
>> +void
>> +malloc_state_machine::on_allocator_call_pass_by_ref (sm_context
>> *sm_ctxt,
>> +                                                     const gcall
>> *call,
>> +                                                     const
>> deallocator_set *deallocators,
>> +                                                     bool
>> returns_nonnull) const
>> +{
>> +  if (gimple_call_num_args (call) == 0)
>> +    return;
>> +  tree arg = gimple_call_arg (call, 0);
>> +  if (arg)
>> +    {
>> +      /* in Modula-2 the heap allocator API is: ALLOCATE (VAR ADDRESS;
>> +         CARDINAL).  So we need to skip the reference or pointer in
>> +         the first parameter.  */
>> +      tree diag_arg_lvalue = sm_ctxt->get_diagnostic_tree (arg);
>
> I'm not sure get_diagnostic_tree is correct here; that tries to
> generate a human-readable tree for diagnostics.
> Why can't you just use 'arg'?  What kind of trees are you getting?
>
>> +      tree diag_arg_rvalue = skip_reference (diag_arg_lvalue);
>
> The conditional in skip_reference seems like a "code smell" to me: is
> the code meant to be dereferencing the pointer or not?  What does that
> mean for the types of the things involved?  It seems like it shouldn't
> be conditional to me - or maybe there's something about Modula 2 that
> I'm missing?

no you're right - I've changed this to return NULL if ADDR_EXPR was
not seen and the caller will bail out if NULL is seen.

>> +      if (sm_ctxt->get_state (call, diag_arg_rvalue) == m_start)
>> +       {
>> +         sm_ctxt->set_next_state (call, diag_arg_rvalue,
>> +                                  (returns_nonnull
>> +                                   ? deallocators->m_nonnull
>> +                                   : deallocators->m_unchecked));
>> +       }
>> +    }
>> +}
>> +
>>  void
>>  malloc_state_machine::on_deallocator_call (sm_context *sm_ctxt,
>>                                            const supernode *node,
>>                                            const gcall *call,
>>                                            const deallocator *d,
>> -                                          unsigned argno) const
>> +                                           unsigned argno,
>> +                                           bool pass_by_reference)
>> const
>>  {
>>    if (argno >= gimple_call_num_args (call))
>>      return;
>>    tree arg = gimple_call_arg (call, argno);
>> +  if (pass_by_reference)
>> +    {
>> +      tree diag_arg = sm_ctxt->get_diagnostic_tree (arg);
>> +      /* in Modula-2 the API is: DEALLOCATE (VAR a: ADDRESS; size:
>> +         CARDINAL).  So we need to skip the pointer or reference in
>> +         the first parameter.  We also know the pointer is assigned to
>> +         NULL.  In C this could be described as:
>> +
>> +         DEALLOCATE (void **address, unsigned int size)
>> +         {
>> +            free (*address);
>> +            *address = NULL;
>> +         }.  */
>> +      arg = skip_reference (diag_arg);
>> +    }
>
> Similar comments as above about use of get_diagnostic_tree and
> skip_reference.
>
>
>>    state_t state = sm_ctxt->get_state (call, arg);
>>
>> @@ -1783,6 +1868,9 @@ malloc_state_machine::on_deallocator_call
>> (sm_context *sm_ctxt,
>>                                            d->m_name));
>>        sm_ctxt->set_next_state (call, arg, m_stop);
>>      }
>> +  /* in Modula-2 the DEALLOCATE assigns the pointer to NULL.  However
>> +     we don't do this in the analyzer as it ignores NULL pointers
>> +     during deallocation.  */
>
> Annoyingly, the analyzer's knowledge of what a function does it split
> between the sm-*.cc state machine implementations, and region-
> model*.cc, in particularly region-model-impl-calls.cc.  In theory you
> could "teach" the analyzer that DEALLOCATE writes NULL; have a look at
> region-model-impl-calls.cc and the various region_model::impl_call_*
> methods implemented there.  Again, my apologies for the rather shoddy
> way that special-case knowledge of functions is implemented.
>
> You might consider implementing Storage_ALLOCATE/DEALLOCATE there, see
> region_model::impl_call_malloc/free and
> region_model::impl_call_operator_new/delete.

yes thanks - I'm looking at this now.

>>  }
>>
>>  /* Implementation of realloc(3):
>> diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h
>> index ae3991c770a..952513c071b 100644
>> --- a/gcc/langhooks-def.h
>> +++ b/gcc/langhooks-def.h
>> @@ -95,6 +95,7 @@ extern const char *lhd_get_substring_location (const
>> substring_loc &,
>>  extern int lhd_decl_dwarf_attribute (const_tree, int);
>>  extern int lhd_type_dwarf_attribute (const_tree, int);
>>  extern void lhd_finalize_early_debug (void);
>> +extern bool lhd_new_dispose_storage_substitution (void);
>>
>>  #define LANG_HOOKS_NAME                        "GNU unknown"
>>  #define LANG_HOOKS_IDENTIFIER_SIZE     sizeof (struct lang_identifier)
>> @@ -147,6 +148,7 @@ extern void lhd_finalize_early_debug (void);
>>  #define LANG_HOOKS_RUN_LANG_SELFTESTS   lhd_do_nothing
>>  #define LANG_HOOKS_GET_SUBSTRING_LOCATION lhd_get_substring_location
>>  #define LANG_HOOKS_FINALIZE_EARLY_DEBUG lhd_finalize_early_debug
>> +#define LANG_HOOKS_NEW_DISPOSE_STORAGE_SUBSTITUTION
>> lhd_new_dispose_storage_substitution
>>
>>  /* Attribute hooks.  */
>>  #define LANG_HOOKS_ATTRIBUTE_TABLE             NULL
>> @@ -381,7 +383,8 @@ extern void lhd_end_section (void);
>>    LANG_HOOKS_EMITS_BEGIN_STMT, \
>>    LANG_HOOKS_RUN_LANG_SELFTESTS, \
>>    LANG_HOOKS_GET_SUBSTRING_LOCATION, \
>> -  LANG_HOOKS_FINALIZE_EARLY_DEBUG \
>> +  LANG_HOOKS_FINALIZE_EARLY_DEBUG, \
>> +  LANG_HOOKS_NEW_DISPOSE_STORAGE_SUBSTITUTION \
>>  }
>>
>
> So the function pointer is non-NULL for the default case, and so the
> conditional is true for every frontend  :)
>
>
>>  #endif /* GCC_LANG_HOOKS_DEF_H */
>> diff --git a/gcc/langhooks.c b/gcc/langhooks.c
>> index 2354386f7b4..6486c484895 100644
>> --- a/gcc/langhooks.c
>> +++ b/gcc/langhooks.c
>> @@ -896,6 +896,13 @@ lhd_finalize_early_debug (void)
>>      (*debug_hooks->early_global_decl) (cnode->decl);
>>  }
>>
>> +/* Should the analyzer check for NEW/DISPOSE
>> Storage_ALLOCATE/Storage_DEALLOCATE?  */
>> +
>> +bool lhd_new_dispose_storage_substitution (void)
>> +{
>> +  return false;
>> +}
>> +
>>  /* Returns true if the current lang_hooks represents the GNU C
>> frontend.  */
>>
>>  bool
>> diff --git a/gcc/langhooks.h b/gcc/langhooks.h
>> index 842e605c439..16b368bfbe1 100644
>> --- a/gcc/langhooks.h
>> +++ b/gcc/langhooks.h
>> @@ -611,6 +611,9 @@ struct lang_hooks
>>    /* Invoked before the early_finish debug hook is invoked.  */
>>    void (*finalize_early_debug) (void);
>>
>> +  /* Does the language substitute NEW into ALLOCATE and DISPOSE into
>> DEALLOCATE?  */
>> +  bool (*new_dispose_storage_substitution) (void);
>> +
>>    /* Whenever you add entries here, make sure you adjust langhooks-
>> def.h
>>       and langhooks.c accordingly.  */
>>  };
>>
>> regards,
>> Gaius
>
> Hope the above is constructive, sorry if it comes across as nitpicky in
> places.

thank you for the constructive feedback - all very useful

regards,
Gaius

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-04-19 13:13 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-13  9:52 [PATCH] analyzer: enabling Modula-2 Storage/ALLOCATE/DEALLOCATE Gaius Mulley
2021-04-13 15:12 ` David Malcolm
2021-04-19 13:13   ` Gaius Mulley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).