public inbox for gcc@gcc.gnu.org
* On inform diagnostics in plugins, support scripts for gdb and modeling creation of PyObjects for static analysis
@ 2023-06-07 20:21 Eric Feng
  2023-06-07 21:54 ` David Malcolm
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Feng @ 2023-06-07 20:21 UTC
  To: gcc; +Cc: David Malcolm

Hi everyone,

I am one of the GSoC participants this year — in particular, I am
working on a static analyzer plugin for CPython extension module code.
I'm encountering a few challenges and would appreciate any guidance on
the following issues:

1) Issue with "inform" diagnostics in the plugin:
I am currently unable to see any "inform" messages from my plugin when
compiling test programs with the plugin enabled. As per the structure
of existing analyzer plugins, I have included the following code in
the plugin_init function:

#if ENABLE_ANALYZER
    const char *plugin_name = plugin_info->base_name;
    if (0)
        inform(input_location, "got here; %qs", plugin_name);
    register_callback(plugin_info->base_name,
                      PLUGIN_ANALYZER_INIT,
                      ana::cpython_analyzer_init_cb,
                      NULL);
#else
    sorry_no_analyzer();
#endif
    return 0;

I expected to see the "got here" message (among others in other areas
of the plugin) when compiling test programs but haven't observed any
output. I also did not observe the "sorry" diagnostic. I am compiling
a simple CPython extension module with the plugin loaded like so:

gcc-dev -S -fanalyzer -fplugin=/path/to/cpython_plugin.so
-I/usr/include/python3.9 -lpython3.9 -x c refcount6.c

Additionally, I compiled the plugin following the steps outlined in
the GCC documentation for plugin building
(https://gcc.gnu.org/onlinedocs/gccint/Plugins-building.html):

g++-dev -shared -I/home/flappy/gcc_/gcc/gcc
-I/usr/local/lib/gcc/aarch64-unknown-linux-gnu/14.0.0/plugin/include
-fPIC -fno-rtti -O2 analyzer_cpython_plugin.c -o cpython_plugin.so

Please let me know if I missed any steps or if there is something else
I should consider. I have no trouble seeing inform calls when they are
added to the core GCC.

2) gdb not detecting .gdbinit in build/gcc:
Following Dave's GCC newbies guide, I ran gcc/configure within the gcc
subdirectory of the build directory to generate a .gdbinit file.
Dave's guide suggested that this file would be automatically detected
and run by gdb. However, it appears that GDB is not detecting this
.gdbinit file, even after I added the following line to my ~/.gdbinit
file:

add-auto-load-safe-path /absolute/path/to/build/gcc

3) Modeling creation of a new PyObject:
Many CPython API calls involve the creation of a new PyObject. To
model the creation of a simple PyObject, we can allocate a new heap
region using get_or_create_region_for_heap_alloc. We can then create
field_regions using get_field_region to associate the newly allocated
region to represent fields such as ob_refcnt and ob_type in the
PyObject struct. However, one of the parameters to get_field_region
is a tree representing the field (e.g., ob_refcnt). I'm currently
wondering how we may retrieve this information. My intuition is that
it would be fairly easy if we can first get a tree representation of
the PyObject struct. Since we include the relevant headers when
compiling CPython extension modules (e.g., -I/usr/include/python3.9),
I wonder if there is a way to "look up" the tree representation of
PyObject from the included headers. This information may also be
important for obtaining a svalue representing the size of the PyObject
in get_or_create_region_for_heap_alloc. If there is no way to "look
up" a tree representation of PyObject as described in the included
Python header files, does it make sense for us to just create a tree
representation manually for this task? Please let me know if this
approach makes sense and, if so, where I could look to get the
required information.
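
The field lookup described above can be sketched outside of GCC. This
is only a toy model under stated assumptions: field_decl_sketch and
record_type_sketch are hypothetical stand-ins for GCC's FIELD_DECL and
RECORD_TYPE trees, and find_field_by_name is an illustrative helper,
not an existing analyzer function. In GCC itself a record's fields are
chained via TYPE_FIELDS and DECL_CHAIN; the loop below only mirrors
that shape.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a FIELD_DECL: a name plus a link to the
// next field, loosely mirroring GCC's TYPE_FIELDS / DECL_CHAIN chain.
struct field_decl_sketch
{
    const char *name;
    const field_decl_sketch *next;
};

// Hypothetical stand-in for a RECORD_TYPE: only the record's name and
// the head of its field chain are modeled.
struct record_type_sketch
{
    const char *name;
    const field_decl_sketch *fields;
};

// Return the field called FIELD_NAME in REC, or nullptr if the record
// has no such field.
static const field_decl_sketch *
find_field_by_name(const record_type_sketch &rec, const char *field_name)
{
    assert(field_name != nullptr);
    for (const field_decl_sketch *f = rec.fields; f; f = f->next)
        if (std::strcmp(f->name, field_name) == 0)
            return f;
    return nullptr;
}
```

In the real plugin the returned node would be the tree passed to
get_field_region; a null result would mean the headers in use do not
define the expected field.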

Thanks all.

Best,
Eric


* Re: On inform diagnostics in plugins, support scripts for gdb and modeling creation of PyObjects for static analysis
  2023-06-07 20:21 On inform diagnostics in plugins, support scripts for gdb and modeling creation of PyObjects for static analysis Eric Feng
@ 2023-06-07 21:54 ` David Malcolm
  2023-06-08  2:00   ` Eric Feng
  0 siblings, 1 reply; 3+ messages in thread
From: David Malcolm @ 2023-06-07 21:54 UTC
  To: Eric Feng, gcc

On Wed, 2023-06-07 at 16:21 -0400, Eric Feng wrote:
> Hi everyone,
> 
> I am one of the GSoC participants this year — in particular, I am
> working on a static analyzer plugin for CPython extension module
> code.
> I'm encountering a few challenges and would appreciate any guidance
> on
> the following issues:
> 
> 1) Issue with "inform" diagnostics in the plugin:
> I am currently unable to see any "inform" messages from my plugin
> when
> compiling test programs with the plugin enabled. As per the structure
> of existing analyzer plugins, I have included the following code in
> the plugin_init function:
> 
> #if ENABLE_ANALYZER
>     const char *plugin_name = plugin_info->base_name;
>     if (0)
>         inform(input_location, "got here; %qs", plugin_name);

If that's the code, does it work if you get rid of the "if (0)"
conditional, or change it to "if (1)"?  As written, that guard is
false, so that call to "inform" will never be executed.

>     register_callback(plugin_info->base_name,
>                       PLUGIN_ANALYZER_INIT,
>                       ana::cpython_analyzer_init_cb,
>                       NULL);
> #else
>     sorry_no_analyzer();
> #endif
>     return 0;
> 
> I expected to see the "got here" message (among others in other areas
> of the plugin) when compiling test programs but haven't observed any
> output. I also did not observe the "sorry" diagnostic. I am compiling
> a simple CPython extension module with the plugin loaded like so:
> 
> gcc-dev -S -fanalyzer -fplugin=/path/to/cpython_plugin.so
> -I/usr/include/python3.9 -lpython3.9 -x c refcount6.c

Looks reasonable.

> 
> Additionally, I compiled the plugin following the steps outlined in
> the GCC documentation for plugin building
> (https://gcc.gnu.org/onlinedocs/gccint/Plugins-building.html):
> 
> g++-dev -shared -I/home/flappy/gcc_/gcc/gcc
> -I/usr/local/lib/gcc/aarch64-unknown-linux-gnu/14.0.0/plugin/include
> -fPIC -fno-rtti -O2 analyzer_cpython_plugin.c -o cpython_plugin.so
> 
> Please let me know if I missed any steps or if there is something
> else
> I should consider. I have no trouble seeing inform calls when they
> are
> added to the core GCC.
> 
> 2) gdb not detecting .gdbinit in build/gcc:
> Following Dave's GCC newbies guide, I ran gcc/configure within the
> gcc
> subdirectory of the build directory to generate a .gdbinit file.
> Dave's guide suggested that this file would be automatically detected
> and run by gdb. However, it appears that GDB is not detecting this
> .gdbinit file, even after I added the following line to my ~/.gdbinit
> file:
> 
> add-auto-load-safe-path /absolute/path/to/build/gcc

Are you invoking gcc from an installed copy, or from the build
directory?  I think my instructions assume the latter.

> 
> 3) Modeling creation of a new PyObject:
> Many CPython API calls involve the creation of a new PyObject. To
> model the creation of a simple PyObject, we can allocate a new heap
> region using get_or_create_region_for_heap_alloc. We can then create
> field_regions using get_field_region to associate the newly allocated
> region to represent fields such as ob_refcnt and ob_type in the
> PyObject struct. However, one of the parameters to get_field_region
> is a tree representing the field (e.g., ob_refcnt). I'm currently
> wondering how we may retrieve this information. My intuition is that
> it would be fairly easy if we can first get a tree representation of
> the PyObject struct. Since we include the relevant headers when
> compiling CPython extension modules (e.g., -I/usr/include/python3.9),
> I wonder if there is a way to "look up" the tree representation of
> PyObject from the included headers. This information may also be
> important for obtaining a svalue representing the size of the
> PyObject
> in get_or_create_region_for_heap_alloc. If there is no way to "look
> up" a tree representation of PyObject as described in the included
> Python header files, does it make sense for us to just create a tree
> representation manually for this task? Please let me know if this
> approach makes sense and, if so, where I could look to get the
> required information.

Don't attempt to build the struct by hand; we want to look up the
struct from the user's headers.  There are at least two ABIs for
PyObject, so we want to be sure we're using the correct one.

IIRC, to look things up by name, that's generally a frontend thing,
since every language has its own concept of scopes/namespaces/etc.

It sounds like you want to look for a type in the global scope of the
C/C++ FE with the name "PyObject".

We currently have some hooks in the analyzer for getting constants from
the frontends; see analyzer-language.cc, where the frontend calls
on_finish_translation_unit, where the analyzer queries the FE for the
named constants that will be of interest during analysis.  Maybe we can
extend this so that we have a way to look up named types there, and
stash the tree for later use, and thus your plugin could ask the
frontend which tree is the PyObject RECORD_TYPE before the frontend is
cleaned up (in on_finish_translation_unit).
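
The "look up by name, then stash for later" idea can be sketched as a
toy model. Everything here is an assumption made for illustration:
type_node stands in for GCC's tree, translation_unit_sketch stands in
for the frontend's view of the translation unit, and the method name
lookup_type_by_name is hypothetical (it is the proposed extension, not
an existing hook). The copy into stashed_types stands in for keeping
hold of the tree after the frontend has been cleaned up.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for GCC's "tree": an opaque record for a type.
struct type_node
{
    std::string name;
};

// Hypothetical stand-in for the frontend's translation unit. The name
// lookup_type_by_name is illustrative, not an existing GCC hook.
class translation_unit_sketch
{
public:
    void declare_type(const std::string &name)
    {
        m_types[name] = type_node{name};
    }

    // Return the named global-scope type, or nullptr if not declared.
    const type_node *lookup_type_by_name(const std::string &name) const
    {
        auto it = m_types.find(name);
        return it == m_types.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, type_node> m_types;
};

// Plugin-side cache, filled while the frontend is still available.
struct stashed_types
{
    bool have_pyobject = false;
    type_node pyobject;
};

// Called once per translation unit, before frontend cleanup: ask the
// frontend for "PyObject" and keep a copy for use during analysis.
static void
on_finish_translation_unit_sketch(const translation_unit_sketch &tu,
                                  stashed_types &out)
{
    if (const type_node *t = tu.lookup_type_by_name("PyObject"))
    {
        out.have_pyobject = true;
        out.pyobject = *t;
    }
}
```

If the user's code never includes the Python headers, have_pyobject
stays false and the plugin can skip its PyObject modeling.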

There might be a simpler way to do this, but I can't think of it right
now, sorry.

Hope this is helpful
Dave



> 
> Thanks all.
> 
> Best,
> Eric
> 



* Re: On inform diagnostics in plugins, support scripts for gdb and modeling creation of PyObjects for static analysis
  2023-06-07 21:54 ` David Malcolm
@ 2023-06-08  2:00   ` Eric Feng
  0 siblings, 0 replies; 3+ messages in thread
From: Eric Feng @ 2023-06-08  2:00 UTC
  To: David Malcolm; +Cc: gcc

Hi Dave,

> If that's the code, does it work if you get rid of the "if (0)"
> conditional, or change it to "if (1)"?  As written, that guard is
> false, so that call to "inform" will never be executed.

Woops! Somehow I missed that but yes, it works now. Thanks!
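
For reference, a minimal stand-alone sketch of the corrected guard.
It does not use the GCC plugin headers: inform_stub is a stand-in for
inform(), and PLUGIN_DEBUG_NOTES and plugin_init_sketch are
hypothetical names introduced only for this illustration. The point is
that the note is emitted only when the guard evaluates to true, so a
named switch is harder to overlook than a literal "if (0)".

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Stand-in for GCC's inform(): records the message and echoes it to
// stderr so the effect of the guard can be observed without GCC.
static std::vector<std::string> g_notes;

static void inform_stub(const std::string &msg)
{
    g_notes.push_back(msg);
    std::fprintf(stderr, "note: %s\n", msg.c_str());
}

// Hypothetical switch: 1 while debugging, 0 to silence the notes.
#ifndef PLUGIN_DEBUG_NOTES
#define PLUGIN_DEBUG_NOTES 1
#endif

// Mirrors the shape of plugin_init from the thread, minus GCC types.
static int plugin_init_sketch(const char *base_name)
{
    assert(base_name != nullptr);
    if (PLUGIN_DEBUG_NOTES)
        inform_stub(std::string("got here; '") + base_name + "'");
    // In the real plugin, register_callback(...) would follow here.
    return 0;
}
```

Building with -DPLUGIN_DEBUG_NOTES=0 reproduces the original symptom:
the function still returns 0, but no note is recorded.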

> Are you invoking gcc from an installed copy, or from the build
> directory?  I think my instructions assume the latter.

Ah gotcha, thanks! It loads as expected when invoking gcc from the
build directory.

> Don't attempt to build the struct by hand; we want to look up the
> struct from the user's headers.  There are at least two ABIs for
> PyObject, so we want to be sure we're using the correct one.
>
> IIRC, to look things up by name, that's generally a frontend thing,
> since every language has its own concept of scopes/namespaces/etc.
>
> It sounds like you want to look for a type in the global scope of the
> C/C++ FE with the name "PyObject".
>
> We currently have some hooks in the analyzer for getting constants from
> the frontends; see analyzer-language.cc, where the frontend calls
> on_finish_translation_unit, where the analyzer queries the FE for the
> named constants that will be of interest during analysis.  Maybe we can
> extend this so that we have a way to look up named types there, and
> stash the tree for later use, and thus your plugin could ask the
> frontend which tree is the PyObject RECORD_TYPE before the frontend is
> cleaned up (in on_finish_translation_unit).

Sounds good, I will look into that. Thanks for the suggestion!

Best,
Eric


