public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: David Malcolm <dmalcolm@redhat.com>
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Add analyzer plugin support and CPython GIL example
Date: Wed, 14 Oct 2020 21:21:58 -0400	[thread overview]
Message-ID: <20201015012158.777902-1-dmalcolm@redhat.com> (raw)

This patch adds a new GCC plugin event: PLUGIN_ANALYZER_INIT, called
when -fanalyzer is starting, allowing for GCC plugins to register
additional state-machine-based checks within -fanalyzer.  The idea
is that 3rd-party code might want to add domain-specific checks for
its own APIs - with the caveat that the analyzer is itself still
rather experimental.

As an example, the patch adds a proof-of-concept plugin to the testsuite
for checking CPython code: verifying that code that relinquishes
CPython's global interpreter lock doesn't attempt to do anything with
PyObjects in the sections where the lock isn't held.  It also adds a
warning about nested releases of the lock, which is forbidden.
For example:

demo.c: In function 'foo':
demo.c:11:3: warning: use of PyObject '*(obj)' without the GIL
   11 |   Py_INCREF (obj);
      |   ^~~~~~~~~
  'test': events 1-3
    |
    |   15 | void test (PyObject *obj)
    |      |      ^~~~
    |      |      |
    |      |      (1) entry to 'test'
    |   16 | {
    |   17 |   Py_BEGIN_ALLOW_THREADS
    |      |   ~~~~~~~~~~~~~~~~~~~~~~
    |      |   |
    |      |   (2) releasing the GIL here
    |   18 |   foo (obj);
    |      |   ~~~~~~~~~
    |      |   |
    |      |   (3) calling 'foo' from 'test'
    |
    +--> 'foo': events 4-5
           |
           |    9 | foo (PyObject *obj)
           |      | ^~~
           |      | |
           |      | (4) entry to 'foo'
           |   10 | {
           |   11 |   Py_INCREF (obj);
           |      |   ~~~~~~~~~
           |      |   |
           |      |   (5) PyObject '*(obj)' used here without the GIL
           |

Doing so requires adding some logic for ignoring macro expansions in
analyzer diagnostics, since the insides of Py_INCREF and
Py_BEGIN_ALLOW_THREADS are not of interest to the user for these cases.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

Are the non-analyzer parts OK for master?

gcc/analyzer/ChangeLog:
	* analyzer-pass.cc (pass_analyzer::execute): Move sorry call to...
	(sorry_no_analyzer): New.
	* analyzer.h (class state_machine): New forward decl.
	(class logger): New forward decl.
	(class plugin_analyzer_init_iface): New.
	(sorry_no_analyzer): New decl.
	* checker-path.cc (checker_path::fixup_locations): New.
	* checker-path.h (checker_event::set_location): New.
	(checker_path::fixup_locations): New decl.
	* diagnostic-manager.cc
	(diagnostic_manager::emit_saved_diagnostic): Call
	checker_path::fixup_locations, and call fixup_location
	on the primary location.
	* engine.cc: Include "plugin.h".
	(class plugin_analyzer_init_impl): New.
	(impl_run_checkers): Invoke PLUGIN_ANALYZER_INIT callbacks.
	* pending-diagnostic.h (pending_diagnostic::fixup_location): New
	vfunc.

gcc/ChangeLog:
	* doc/plugins.texi (Plugin callbacks): Add PLUGIN_ANALYZER_INIT.
	* plugin.c (register_callback): Likewise.
	(invoke_plugin_callbacks_full): Likewise.
	* plugin.def (PLUGIN_ANALYZER_INIT): New event.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/analyzer_gil_plugin.c: New test.
	* gcc.dg/plugin/gil-1.c: New test.
	* gcc.dg/plugin/gil.h: New header.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the new plugin
	and test.
---
 gcc/analyzer/analyzer-pass.cc                 |  18 +-
 gcc/analyzer/analyzer.h                       |  15 +
 gcc/analyzer/checker-path.cc                  |   9 +
 gcc/analyzer/checker-path.h                   |   4 +
 gcc/analyzer/diagnostic-manager.cc            |   9 +-
 gcc/analyzer/engine.cc                        |  31 ++
 gcc/analyzer/pending-diagnostic.h             |   8 +
 gcc/doc/plugins.texi                          |   4 +
 gcc/plugin.c                                  |   2 +
 gcc/plugin.def                                |   4 +
 .../gcc.dg/plugin/analyzer_gil_plugin.c       | 436 ++++++++++++++++++
 gcc/testsuite/gcc.dg/plugin/gil-1.c           |  90 ++++
 gcc/testsuite/gcc.dg/plugin/gil.h             |  32 ++
 gcc/testsuite/gcc.dg/plugin/plugin.exp        |   2 +
 14 files changed, 660 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/analyzer_gil_plugin.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/gil-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/gil.h

diff --git a/gcc/analyzer/analyzer-pass.cc b/gcc/analyzer/analyzer-pass.cc
index a27421e46d4..1f65bf8b154 100644
--- a/gcc/analyzer/analyzer-pass.cc
+++ b/gcc/analyzer/analyzer-pass.cc
@@ -83,9 +83,7 @@ pass_analyzer::execute (function *)
 #if ENABLE_ANALYZER
   ana::run_checkers ();
 #else
-  sorry ("%qs was not enabled in this build of GCC"
-	 " (missing configure-time option %qs)",
-	 "-fanalyzer", "--enable-analyzer");
+  sorry_no_analyzer ();
 #endif
 
   return 0;
@@ -100,3 +98,17 @@ make_pass_analyzer (gcc::context *ctxt)
 {
   return new pass_analyzer (ctxt);
 }
+
+#if !ENABLE_ANALYZER
+
+/* Issue a "sorry" diagnostic that the analyzer was not enabled.  */
+
+void
+sorry_no_analyzer ()
+{
+  sorry ("%qs was not enabled in this build of GCC"
+	 " (missing configure-time option %qs)",
+	 "-fanalyzer", "--enable-analyzer");
+}
+
+#endif /* #if !ENABLE_ANALYZER */
diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h
index aa43b7f66a9..5d01d022f3c 100644
--- a/gcc/analyzer/analyzer.h
+++ b/gcc/analyzer/analyzer.h
@@ -97,6 +97,8 @@ class state_change;
 class rewind_info_t;
 
 class engine;
+class state_machine;
+class logger;
 
 /* Forward decls of functions.  */
 
@@ -186,6 +188,15 @@ private:
 
 extern location_t get_stmt_location (const gimple *stmt, function *fun);
 
+/* Passed by pointer to PLUGIN_ANALYZER_INIT callbacks.  */
+
+class plugin_analyzer_init_iface
+{
+public:
+  virtual void register_state_machine (state_machine *) = 0;
+  virtual logger *get_logger () const = 0;
+};
+
 } // namespace ana
 
 extern bool is_special_named_call_p (const gcall *call, const char *funcname,
@@ -306,4 +317,8 @@ private:
 #pragma GCC diagnostic ignored "-Wformat-diag"
 #endif
 
+#if !ENABLE_ANALYZER
+extern void sorry_no_analyzer ();
+#endif /* #if !ENABLE_ANALYZER */
+
 #endif /* GCC_ANALYZER_ANALYZER_H */
diff --git a/gcc/analyzer/checker-path.cc b/gcc/analyzer/checker-path.cc
index 1f6d6a86001..062730cc789 100644
--- a/gcc/analyzer/checker-path.cc
+++ b/gcc/analyzer/checker-path.cc
@@ -982,6 +982,15 @@ checker_path::add_final_event (const state_machine *sm,
   add_event (end_of_path);
 }
 
+void
+checker_path::fixup_locations (pending_diagnostic *pd)
+{
+  checker_event *e;
+  int i;
+  FOR_EACH_VEC_ELT (m_events, i, e)
+    e->set_location (pd->fixup_location (e->get_location ()));
+}
+
 } // namespace ana
 
 #endif /* #if ENABLE_ANALYZER */
diff --git a/gcc/analyzer/checker-path.h b/gcc/analyzer/checker-path.h
index 7b86d48e983..53a7a4efea3 100644
--- a/gcc/analyzer/checker-path.h
+++ b/gcc/analyzer/checker-path.h
@@ -99,6 +99,8 @@ public:
 
   void dump (pretty_printer *pp) const;
 
+  void set_location (location_t loc) { m_loc = loc; }
+
  public:
   const enum event_kind m_kind;
  protected:
@@ -498,6 +500,8 @@ public:
       e->prepare_for_emission (this, pd, diagnostic_event_id_t (i));
   }
 
+  void fixup_locations (pending_diagnostic *pd);
+
   void record_setjmp_event (const exploded_node *enode,
 			    diagnostic_event_id_t setjmp_emission_id)
   {
diff --git a/gcc/analyzer/diagnostic-manager.cc b/gcc/analyzer/diagnostic-manager.cc
index cb95a95ff0b..baad14c6c4d 100644
--- a/gcc/analyzer/diagnostic-manager.cc
+++ b/gcc/analyzer/diagnostic-manager.cc
@@ -648,7 +648,14 @@ diagnostic_manager::emit_saved_diagnostic (const exploded_graph &eg,
 
   emission_path.prepare_for_emission (sd.m_d);
 
-  gcc_rich_location rich_loc (get_stmt_location (stmt, sd.m_snode->m_fun));
+  location_t loc = get_stmt_location (stmt, sd.m_snode->m_fun);
+
+  /* Allow the pending_diagnostic to fix up the primary location
+     and any locations for events.  */
+  loc = sd.m_d->fixup_location (loc);
+  emission_path.fixup_locations (sd.m_d);
+
+  gcc_rich_location rich_loc (loc);
   rich_loc.set_path (&emission_path);
 
   auto_diagnostic_group d;
diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 65d7495f26f..de772c77b2a 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "analyzer/state-purge.h"
 #include "analyzer/bar-chart.h"
 #include <zlib.h>
+#include "plugin.h"
 
 /* For an overview, see gcc/doc/analyzer.texi.  */
 
@@ -4510,6 +4511,33 @@ dump_analyzer_json (const supergraph &sg,
   free (filename);
 }
 
+/* Concrete subclass of plugin_analyzer_init_iface, allowing plugins
+   to register new state machines.  */
+
+class plugin_analyzer_init_impl : public plugin_analyzer_init_iface
+{
+public:
+  plugin_analyzer_init_impl (auto_delete_vec <state_machine> *checkers,
+			     logger *logger)
+  : m_checkers (checkers),
+    m_logger (logger)
+  {}
+
+  void register_state_machine (state_machine *sm) FINAL OVERRIDE
+  {
+    m_checkers->safe_push (sm);
+  }
+
+  logger *get_logger () const FINAL OVERRIDE
+  {
+    return m_logger;
+  }
+
+private:
+  auto_delete_vec <state_machine> *m_checkers;
+  logger *m_logger;
+};
+
 /* Run the analysis "engine".  */
 
 void
@@ -4555,6 +4583,9 @@ impl_run_checkers (logger *logger)
   auto_delete_vec <state_machine> checkers;
   make_checkers (checkers, logger);
 
+  plugin_analyzer_init_impl data (&checkers, logger);
+  invoke_plugin_callbacks (PLUGIN_ANALYZER_INIT, &data);
+
   if (logger)
     {
       int i;
diff --git a/gcc/analyzer/pending-diagnostic.h b/gcc/analyzer/pending-diagnostic.h
index b1ff2688bcc..fa6c9781044 100644
--- a/gcc/analyzer/pending-diagnostic.h
+++ b/gcc/analyzer/pending-diagnostic.h
@@ -175,6 +175,14 @@ class pending_diagnostic
      diagnostic deduplication.  */
   static bool same_tree_p (tree t1, tree t2);
 
+  /* A vfunc for fixing up locations (both the primary location for the
+     diagnostic, and for events in their paths), e.g. to avoid unwinding
+     inside specific macros.  */
+  virtual location_t fixup_location (location_t loc) const
+  {
+    return loc;
+  }
+
   /* For greatest precision-of-wording, the various following "describe_*"
      virtual functions give the pending diagnostic a way to describe events
      in a diagnostic_path in terms that make sense for that diagnostic.
diff --git a/gcc/doc/plugins.texi b/gcc/doc/plugins.texi
index 23a8b410ebb..1c211bbacf1 100644
--- a/gcc/doc/plugins.texi
+++ b/gcc/doc/plugins.texi
@@ -218,6 +218,10 @@ enum plugin_event
    as a const char* pointer.  */
   PLUGIN_INCLUDE_FILE,
 
+  /* Called when -fanalyzer starts. The event data is an
+     ana::plugin_analyzer_init_iface *.  */
+  PLUGIN_ANALYZER_INIT,
+
   PLUGIN_EVENT_FIRST_DYNAMIC    /* Dummy event used for indexing callback
                                    array.  */
 @};
diff --git a/gcc/plugin.c b/gcc/plugin.c
index 69b6f5b8b6a..76069e64798 100644
--- a/gcc/plugin.c
+++ b/gcc/plugin.c
@@ -497,6 +497,7 @@ register_callback (const char *plugin_name,
       case PLUGIN_EARLY_GIMPLE_PASSES_END:
       case PLUGIN_NEW_PASS:
       case PLUGIN_INCLUDE_FILE:
+      case PLUGIN_ANALYZER_INIT:
         {
           struct callback_info *new_callback;
           if (!callback)
@@ -577,6 +578,7 @@ invoke_plugin_callbacks_full (int event, void *gcc_data)
       case PLUGIN_EARLY_GIMPLE_PASSES_END:
       case PLUGIN_NEW_PASS:
       case PLUGIN_INCLUDE_FILE:
+      case PLUGIN_ANALYZER_INIT:
         {
           /* Iterate over every callback registered with this event and
              call it.  */
diff --git a/gcc/plugin.def b/gcc/plugin.def
index 3482ed71906..b8110190d3a 100644
--- a/gcc/plugin.def
+++ b/gcc/plugin.def
@@ -99,6 +99,10 @@ DEFEVENT (PLUGIN_NEW_PASS)
    as a const char* pointer.  */
 DEFEVENT (PLUGIN_INCLUDE_FILE)
 
+/* Called when -fanalyzer starts. The event data is an
+   ana::plugin_analyzer_init_iface *.  */
+DEFEVENT (PLUGIN_ANALYZER_INIT)
+
 /* When adding a new hard-coded plugin event, don't forget to edit in
    file plugin.c the functions register_callback and
    invoke_plugin_callbacks_full accordingly!  */
diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_gil_plugin.c b/gcc/testsuite/gcc.dg/plugin/analyzer_gil_plugin.c
new file mode 100644
index 00000000000..05133d5250e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/analyzer_gil_plugin.c
@@ -0,0 +1,436 @@
+/* Proof-of-concept of a -fanalyzer plugin.
+   Detect (some) uses of CPython API outside of the Global Interpreter Lock.
+   https://docs.python.org/3/c-api/init.html#thread-state-and-the-global-interpreter-lock
+*/
+/* { dg-options "-g" } */
+
+#include "gcc-plugin.h"
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "diagnostic.h"
+#include "tree.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-walk.h"
+#include "diagnostic-event-id.h"
+#include "analyzer/analyzer.h"
+#include "analyzer/analyzer-logging.h"
+#include "json.h"
+#include "analyzer/sm.h"
+#include "analyzer/pending-diagnostic.h"
+
+int plugin_is_GPL_compatible;
+
+#if ENABLE_ANALYZER
+
+namespace ana {
+
+static bool
+type_based_on_pyobject_p (tree type)
+{
+  /* Ideally we'd also check for "subclasses" here by iterating up the
+     first field of each struct.  */
+  if (TREE_CODE (type) != RECORD_TYPE)
+    return false;
+  tree name = TYPE_IDENTIFIER (type);
+  if (!name)
+    return false;
+  return id_equal (name, "PyObject");
+}
+
+/* An experimental state machine, for tracking whether the GIL is held,
+   as global state..  */
+
+class gil_state_machine : public state_machine
+{
+public:
+  gil_state_machine (logger *logger);
+
+  bool inherited_state_p () const FINAL OVERRIDE { return false; }
+
+  bool on_stmt (sm_context *sm_ctxt,
+		const supernode *node,
+		const gimple *stmt) const FINAL OVERRIDE;
+
+  void on_condition (sm_context *sm_ctxt,
+		     const supernode *node,
+		     const gimple *stmt,
+		     tree lhs,
+		     enum tree_code op,
+		     tree rhs) const FINAL OVERRIDE;
+
+  bool can_purge_p (state_t s) const FINAL OVERRIDE;
+
+  void check_for_pyobject_usage_without_gil (sm_context *sm_ctxt,
+					     const supernode *node,
+					     const gimple *stmt,
+					     tree op) const;
+
+ private:
+  void check_for_pyobject_in_call (sm_context *sm_ctxt,
+				   const supernode *node,
+				   const gcall *call,
+				   tree callee_fndecl) const;
+
+ public:
+  /* These states are "global", rather than per-expression.  */
+
+  /* State for when we've released the GIL.  */
+  state_t m_released_gil;
+
+  /* Stop state.  */
+  state_t m_stop;
+};
+
+/* Subclass for diagnostics involving the GIL.  */
+
+class gil_diagnostic : public pending_diagnostic
+{
+public:
+  location_t fixup_location (location_t loc) const FINAL OVERRIDE
+  {
+    /* Ideally we'd check for specific macros here, and only
+       resolve certain macros.  */
+    if (linemap_location_from_macro_expansion_p (line_table, loc))
+      loc = linemap_resolve_location (line_table, loc,
+				      LRK_MACRO_EXPANSION_POINT, NULL);
+    return loc;
+  }
+
+  label_text describe_state_change (const evdesc::state_change &change)
+    FINAL OVERRIDE
+  {
+    if (change.is_global_p ()
+	&& change.m_new_state == m_sm.m_released_gil)
+      return change.formatted_print ("releasing the GIL here");
+    if (change.is_global_p ()
+	&& change.m_new_state == m_sm.get_start_state ())
+      return change.formatted_print ("acquiring the GIL here");
+    return label_text ();
+  }
+
+ protected:
+  gil_diagnostic (const gil_state_machine &sm) : m_sm (sm)
+  {
+  }
+
+ private:
+  const gil_state_machine &m_sm;
+};
+
+class double_save_thread : public gil_diagnostic
+{
+ public:
+  double_save_thread (const gil_state_machine &sm, const gcall *call)
+  : gil_diagnostic (sm), m_call (call)
+  {}
+
+  const char *get_kind () const FINAL OVERRIDE
+  {
+    return "double_save_thread";
+  }
+
+  bool subclass_equal_p (const pending_diagnostic &base_other) const OVERRIDE
+  {
+    const double_save_thread &sub_other
+      = (const double_save_thread &)base_other;
+    return m_call == sub_other.m_call;
+  }
+
+  bool emit (rich_location *rich_loc) FINAL OVERRIDE
+  {
+    return warning_at (rich_loc, 0,
+		       "nested usage of %qs", "Py_BEGIN_ALLOW_THREADS");
+  }
+
+  label_text describe_final_event (const evdesc::final_event &ev) FINAL OVERRIDE
+  {
+    return ev.formatted_print ("nested usage of %qs here",
+			       "Py_BEGIN_ALLOW_THREADS");
+  }
+
+ private:
+  const gcall *m_call;
+};
+
+class fncall_without_gil : public gil_diagnostic
+{
+ public:
+  fncall_without_gil (const gil_state_machine &sm, const gcall *call,
+		      tree callee_fndecl, unsigned arg_idx)
+  : gil_diagnostic (sm), m_call (call), m_callee_fndecl (callee_fndecl),
+    m_arg_idx (arg_idx)
+  {}
+
+  const char *get_kind () const FINAL OVERRIDE
+  {
+    return "fncall_without_gil";
+  }
+
+  bool subclass_equal_p (const pending_diagnostic &base_other) const OVERRIDE
+  {
+    const fncall_without_gil &sub_other
+      = (const fncall_without_gil &)base_other;
+    return (m_call == sub_other.m_call
+	    && m_callee_fndecl == sub_other.m_callee_fndecl
+	    && m_arg_idx == sub_other.m_arg_idx);
+  }
+
+  bool emit (rich_location *rich_loc) FINAL OVERRIDE
+  {
+    auto_diagnostic_group d;
+    /* There isn't a warning ID for use to use.  */
+    if (m_callee_fndecl)
+      return warning_at (rich_loc, 0,
+			 "use of PyObject as argument %i of %qE"
+			 " without the GIL",
+			 m_arg_idx + 1, m_callee_fndecl);
+    else
+      return warning_at (rich_loc, 0,
+			 "use of PyObject as argument %i of call"
+			 " without the GIL",
+			 m_arg_idx + 1, m_callee_fndecl);
+  }
+
+  label_text describe_final_event (const evdesc::final_event &ev) FINAL OVERRIDE
+  {
+    if (m_callee_fndecl)
+      return ev.formatted_print ("use of PyObject as argument %i of %qE here"
+				 " without the GIL",
+				 m_arg_idx + 1, m_callee_fndecl);
+    else
+      return ev.formatted_print ("use of PyObject as argument %i of call here"
+				 " without the GIL",
+				 m_arg_idx + 1, m_callee_fndecl);
+  }
+
+ private:
+  const gcall *m_call;
+  tree m_callee_fndecl;
+  unsigned m_arg_idx;
+};
+
+class pyobject_usage_without_gil : public gil_diagnostic
+{
+ public:
+  pyobject_usage_without_gil (const gil_state_machine &sm, tree expr)
+  : gil_diagnostic (sm), m_expr (expr)
+  {}
+
+  const char *get_kind () const FINAL OVERRIDE
+  {
+    return "pyobject_usage_without_gil";
+  }
+
+  bool subclass_equal_p (const pending_diagnostic &base_other) const OVERRIDE
+  {
+    return same_tree_p (m_expr,
+			((const pyobject_usage_without_gil&)base_other).m_expr);
+  }
+
+  bool emit (rich_location *rich_loc) FINAL OVERRIDE
+  {
+    auto_diagnostic_group d;
+    /* There isn't a warning ID for use to use.  */
+    return warning_at (rich_loc, 0,
+		       "use of PyObject %qE without the GIL", m_expr);
+  }
+
+  label_text describe_final_event (const evdesc::final_event &ev) FINAL OVERRIDE
+  {
+    return ev.formatted_print ("PyObject %qE used here without the GIL",
+			       m_expr);
+  }
+
+ private:
+  tree m_expr;
+};
+
+/* gil_state_machine's ctor.  */
+
+gil_state_machine::gil_state_machine (logger *logger)
+: state_machine ("gil", logger)
+{
+  m_released_gil = add_state ("released_gil");
+  m_stop = add_state ("stop");
+}
+
+struct cb_data
+{
+  cb_data (const gil_state_machine &sm, sm_context *sm_ctxt,
+	   const supernode *snode, const gimple *stmt)
+  : m_sm (sm), m_sm_ctxt (sm_ctxt), m_snode (snode), m_stmt (stmt)
+  {
+  }
+
+  const gil_state_machine &m_sm;
+  sm_context *m_sm_ctxt;
+  const supernode *m_snode;
+  const gimple *m_stmt;
+};
+
+static bool
+check_for_pyobject (gimple *, tree op, tree, void *data)
+{
+  cb_data *d = (cb_data *)data;
+  d->m_sm.check_for_pyobject_usage_without_gil (d->m_sm_ctxt, d->m_snode,
+						d->m_stmt, op);
+  return true;
+}
+
+/* Assuming that the GIL has been released, complain about any
+   PyObject * arguments passed to CALL.  */
+
+void
+gil_state_machine::check_for_pyobject_in_call (sm_context *sm_ctxt,
+					       const supernode *node,
+					       const gcall *call,
+					       tree callee_fndecl) const
+{
+  for (unsigned i = 0; i < gimple_call_num_args (call); i++)
+    {
+      tree arg = gimple_call_arg (call, i);
+      if (TREE_CODE (TREE_TYPE (arg)) != POINTER_TYPE)
+	continue;
+      tree type = TREE_TYPE (TREE_TYPE (arg));
+      if (type_based_on_pyobject_p (type))
+	{
+	  sm_ctxt->warn (node, call, NULL_TREE,
+			 new fncall_without_gil (*this, call,
+						 callee_fndecl,
+						 i));
+	  sm_ctxt->set_global_state (m_stop);
+	}
+    }
+}
+
+/* Implementation of state_machine::on_stmt vfunc for gil_state_machine.  */
+
+bool
+gil_state_machine::on_stmt (sm_context *sm_ctxt,
+			    const supernode *node,
+			    const gimple *stmt) const
+{
+  const state_t global_state = sm_ctxt->get_global_state ();
+  if (const gcall *call = dyn_cast <const gcall *> (stmt))
+    {
+      if (tree callee_fndecl = sm_ctxt->get_fndecl_for_call (call))
+	{
+	  if (is_named_call_p (callee_fndecl, "PyEval_SaveThread", call, 0))
+	    {
+	      if (0)
+		inform (input_location, "found call to %qs",
+			"PyEval_SaveThread");
+	      if (global_state == m_released_gil)
+		{
+		  sm_ctxt->warn (node, stmt, NULL_TREE,
+				 new double_save_thread (*this, call));
+		  sm_ctxt->set_global_state (m_stop);
+		}
+	      else
+		sm_ctxt->set_global_state (m_released_gil);
+	      return true;
+	    }
+	  else if (is_named_call_p (callee_fndecl, "PyEval_RestoreThread",
+				    call, 1))
+	    {
+	      if (0)
+		inform (input_location, "found call to %qs",
+			"PyEval_SaveThread");
+	      if (global_state == m_released_gil)
+		sm_ctxt->set_global_state (m_start);
+	      return true;
+	    }
+	  else if (global_state == m_released_gil)
+	    {
+	      /* Find PyObject * args of calls to fns with unknown bodies.  */
+	      if (!fndecl_has_gimple_body_p (callee_fndecl))
+		check_for_pyobject_in_call (sm_ctxt, node, call, callee_fndecl);
+	    }
+	}
+      else if (global_state == m_released_gil)
+	check_for_pyobject_in_call (sm_ctxt, node, call, NULL);
+    }
+  else
+    if (global_state == m_released_gil)
+      {
+	/* Walk the stmt, finding uses of PyObject (or "subclasses").  */
+	cb_data d (*this, sm_ctxt, node, stmt);
+	walk_stmt_load_store_addr_ops (const_cast <gimple *> (stmt), &d,
+				       check_for_pyobject,
+				       check_for_pyobject,
+				       check_for_pyobject);
+    }
+  return false;
+}
+
+/* Implementation of state_machine::on_condition vfunc for
+   gil_state_machine.  */
+
+void
+gil_state_machine::on_condition (sm_context *sm_ctxt ATTRIBUTE_UNUSED,
+				 const supernode *node ATTRIBUTE_UNUSED,
+				 const gimple *stmt ATTRIBUTE_UNUSED,
+				 tree lhs ATTRIBUTE_UNUSED,
+				 enum tree_code op ATTRIBUTE_UNUSED,
+				 tree rhs ATTRIBUTE_UNUSED) const
+{
+  // Empty
+}
+
+bool
+gil_state_machine::can_purge_p (state_t s ATTRIBUTE_UNUSED) const
+{
+  return true;
+}
+
+void
+gil_state_machine::check_for_pyobject_usage_without_gil (sm_context *sm_ctxt,
+							 const supernode *node,
+							 const gimple *stmt,
+							 tree op) const
+{
+  tree type = TREE_TYPE (op);
+  if (type_based_on_pyobject_p (type))
+    {
+      sm_ctxt->warn (node, stmt, NULL_TREE,
+		     new pyobject_usage_without_gil (*this, op));
+      sm_ctxt->set_global_state (m_stop);
+    }
+}
+
+/* Callback handler for the PLUGIN_ANALYZER_INIT event.  */
+
+static void
+gil_analyzer_init_cb (void *gcc_data, void */*user_data*/)
+{
+  ana::plugin_analyzer_init_iface *iface
+    = (ana::plugin_analyzer_init_iface *)gcc_data;
+  LOG_SCOPE (iface->get_logger ());
+  if (0)
+    inform (input_location, "got here: gil_analyzer_init_cb");
+  iface->register_state_machine (new gil_state_machine (iface->get_logger ()));
+}
+
+} // namespace ana
+
+#endif /* #if ENABLE_ANALYZER */
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+	     struct plugin_gcc_version *version)
+{
+#if ENABLE_ANALYZER
+  const char *plugin_name = plugin_info->base_name;
+  if (0)
+    inform (input_location, "got here; %qs", plugin_name);
+  register_callback (plugin_info->base_name,
+		     PLUGIN_ANALYZER_INIT,
+		     ana::gil_analyzer_init_cb,
+		     NULL); /* void *user_data */
+#else
+  sorry_no_analyzer ();
+#endif
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/gil-1.c b/gcc/testsuite/gcc.dg/plugin/gil-1.c
new file mode 100644
index 00000000000..4e8f535ba85
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/gil-1.c
@@ -0,0 +1,90 @@
+/* { dg-do compile } */
+/* { dg-options "-fanalyzer" } */
+
+#include "gil.h"
+
+void test_1 (void)
+{
+  Py_BEGIN_ALLOW_THREADS
+  Py_END_ALLOW_THREADS
+}
+
+void test_2 (PyObject *obj)
+{
+  Py_BEGIN_ALLOW_THREADS /* { dg-message "releasing the GIL here" } */
+
+  Py_INCREF (obj); /* { dg-warning "use of PyObject '\\*\\(obj\\)' without the GIL" } */
+  Py_DECREF (obj);
+
+  Py_END_ALLOW_THREADS
+}
+
+void test_3 (PyObject *obj)
+{
+  Py_BEGIN_ALLOW_THREADS /* { dg-message "releasing the GIL here" } */
+
+  Py_BEGIN_ALLOW_THREADS /* { dg-warning "nested usage of 'Py_BEGIN_ALLOW_THREADS'" } */
+  Py_END_ALLOW_THREADS
+
+  Py_END_ALLOW_THREADS
+}
+
+void test_4 (PyObject *obj)
+{
+  /* These aren't nested, so should be OK.  */
+  Py_BEGIN_ALLOW_THREADS
+  Py_END_ALLOW_THREADS
+
+  Py_BEGIN_ALLOW_THREADS
+  Py_END_ALLOW_THREADS
+}
+
+/* Interprocedural example of erroneously nested usage.  */
+
+static void  __attribute__((noinline))
+called_by_test_5 (void)
+{
+  Py_BEGIN_ALLOW_THREADS /* { dg-warning "nested usage of 'Py_BEGIN_ALLOW_THREADS'" } */
+  Py_END_ALLOW_THREADS
+}
+
+void test_5 (PyObject *obj)
+{
+  Py_BEGIN_ALLOW_THREADS /* { dg-message "releasing the GIL here" } */
+  called_by_test_5 ();
+  Py_END_ALLOW_THREADS
+}
+
+/* Interprocedural example of bogusly using a PyObject outside of GIL.  */
+
+static void  __attribute__((noinline))
+called_by_test_6 (PyObject *obj)
+{
+  Py_INCREF (obj); /* { dg-warning "use of PyObject '\\*\\(obj\\)' without the GIL" } */
+  Py_DECREF (obj);
+}
+
+void test_6 (PyObject *obj)
+{
+  Py_BEGIN_ALLOW_THREADS /* { dg-message "releasing the GIL here" } */
+  called_by_test_6 (obj);
+  Py_END_ALLOW_THREADS
+}
+
+extern void called_by_test_7 (PyObject *obj);
+
+void test_7 (PyObject *obj)
+{
+  Py_BEGIN_ALLOW_THREADS /* { dg-message "releasing the GIL here" } */
+  called_by_test_7 (obj); /* { dg-warning "use of PyObject as argument 1 of 'called_by_test_7' without the GIL" } */
+  Py_END_ALLOW_THREADS
+}
+
+typedef void (*callback_t) (PyObject *);
+
+void test_8 (PyObject *obj, callback_t cb)
+{
+  Py_BEGIN_ALLOW_THREADS /* { dg-message "releasing the GIL here" } */
+  cb (obj); /* { dg-warning "use of PyObject as argument 1 of call without the GIL" } */
+  Py_END_ALLOW_THREADS
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/gil.h b/gcc/testsuite/gcc.dg/plugin/gil.h
new file mode 100644
index 00000000000..b0610cd749e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/gil.h
@@ -0,0 +1,32 @@
+/* Adapted from CPython 3.8's ceval.h.  */
+typedef struct PyThreadState PyThreadState;
+extern PyThreadState * PyEval_SaveThread(void);
+extern void PyEval_RestoreThread(PyThreadState *);
+
+#define Py_BEGIN_ALLOW_THREADS { \
+                        PyThreadState *_save; \
+                        _save = PyEval_SaveThread();
+#define Py_BLOCK_THREADS        PyEval_RestoreThread(_save);
+#define Py_UNBLOCK_THREADS      _save = PyEval_SaveThread();
+#define Py_END_ALLOW_THREADS    PyEval_RestoreThread(_save); \
+                 }
+
+/* Adapted/hacked up from CPython 3.8's object.h.  */
+
+typedef struct _object {
+    int ob_refcnt;
+} PyObject;
+
+#define _PyObject_CAST(op) ((PyObject*)(op))
+
+extern void _Py_Dealloc(PyObject *);
+
+#define _Py_INCREF(OP) do { (OP)->ob_refcnt++; } while (0);
+#define _Py_DECREF(OP) do {	  \
+    if (--(OP)->ob_refcnt == 0) { \
+      _Py_Dealloc(OP);		  \
+    }				  \
+  } while (0)
+
+#define Py_INCREF(op) _Py_INCREF(_PyObject_CAST(op))
+#define Py_DECREF(op) _Py_DECREF(_PyObject_CAST(op))
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index 5dd102ae05c..7f0ffd68f8b 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -118,6 +118,8 @@ set plugin_test_list [list \
     { dump_plugin.c \
 	  dump-1.c \
 	  dump-2.c } \
+    { analyzer_gil_plugin.c \
+	  gil-1.c } \
 ]
 
 foreach plugin_test $plugin_test_list {
-- 
2.26.2


             reply	other threads:[~2020-10-15  1:22 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-15  1:21 David Malcolm [this message]
2020-11-10 21:51 ` PING: " David Malcolm
2020-11-23 21:10   ` Jeff Law
2020-11-30 21:19     ` [PATCH] Unbreak build with --disable-analyzer David Malcolm
2020-11-30 22:39       ` David Malcolm
2020-11-30 22:41         ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201015012158.777902-1-dmalcolm@redhat.com \
    --to=dmalcolm@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).