public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* analyzer: Weekly update on extending C++ support (2)
@ 2023-08-07 13:04 Benjamin Priour
  2023-08-07 15:48 ` David Malcolm
  2023-08-07 16:06 ` David Malcolm
  0 siblings, 2 replies; 5+ messages in thread
From: Benjamin Priour @ 2023-08-07 13:04 UTC (permalink / raw)
  To: David Malcolm, gcc

[-- Attachment #1: Type: text/plain, Size: 11730 bytes --]

Hi Dave,

I made some more progress on moving tests from gcc.dg/analyzer/ to
c-c++-common/analyzer.
I'll only detail the most noteworthy tests I encountered.

gcc.dg/analyzer/pr103892.c troubled me with an Excess error when compiled
with g++
analysis bailed out early (91 'after-snode' enodes; 314 enodes)
[-Wanalyzer-too-complex]

pr103892.c is compiled with optimization level -O2.
Analysis bails out when the number of "after-SN" enodes explodes, i.e.
exceeds a certain proportion of the total number of SG nodes.
limit(after-SN) = m_sg.num_nodes ()  * param-bb-explosion-factor.
The reason why C and C++ differs is simply due to what '-O2' does to each
of them.
Under -O0 or -O1, there is no failure whatsoever, under -O2 only C++ fails,
whereas with -O3 both gcc and g++ emits -Wanalyzer-too-complex.
With -O2, although GCC produces a greater total number of "after-SN" enodes
than G++, their proportion barely stays under limit(after-SN) as
the total number of SN is also bigger, hence no warning is emitted.

All in all, I don't believe there is a significant difference here between
C and C++, nor is there much we can do about this.
Therefore I'll simply add a { dg-warning "analysis bailed out early" "" {
target c++ } }

An interesting divergence between GCC and G++ was in the handling of
built-ins.
Tests gcc.dg/analyzer/{pr104062.c,sprintf-2.c} are failing when compiling
with G++.
FAIL: c-c++-common/analyzer/pr104062.c  leak of ap7 at line 13 (test for
warnings, line 12)
The reason is neither realloc nor sprintf are considered by G++ as a
built-in, contrary to GCC.
C built-ins are mapped within the analyzer to known_functions using their
'enum combined_fn' code
(See kf.cc:register_known_functions), not their function name.
Therefore, we find a built-in known function by checking
tree.h:fndecl_built_in_p returns true
and calling known_function_manager::get_normal_builtin.
The former returns false for 'realloc' and 'sprintf' when using G++.

To fix that, I've derived builtin_known_function from known_function.

/* Subclass of known_function for builtin functions.  */

class builtin_known_function : public known_function
{
public:
  virtual enum built_in_function builtin_code () const = 0;
  tree builtin_decl () const {
    gcc_assert (builtin_code () < END_BUILTINS);
    return builtin_info[builtin_code ()].decl;
  }

  virtual const builtin_known_function *
  dyn_cast_builtin_kf () const { return this; }
  virtual builtin_known_function *
  dyn_cast_builtin_kf () { return this; }
};

And 'kfm.add (BUILT_IN_REALLOC, make_unique<kf_realloc> ());' becomes
kfm.add ("realloc", make_unique<kf_realloc> ());
kfm.add ("__builtin_realloc", make_unique<kf_realloc> ());

Unfortunately we have to register the built-ins using their function name
which is more apt to bugs,
and the double call to kfm.add is quite messy. Having two instances of
kf_realloc however is not that troubling,
as builtin_known_function objects are lightweight.

That GCC built-ins are not recognized by G++ doesn't only impede their
detection,
but also how we process them, and what information we have of them.
In gcc.dg/analyzer/sprintf-2.c, sprintf signature follows the manual and
cppreference.
Yet we expect sm-malloc to warn about use of NULL when either of the
argument is NULL,
although there is no attribute(nonnull) specified.
In fact, sm-malloc isn't using the signature of sprintf as specified in the
test case, but rather the one provided
by GCC in builtins.def

/* See e.g. https://en.cppreference.com/w/c/io/fprintf
   and https://www.man7.org/linux/man-pages/man3/sprintf.3.html */

  extern int
  sprintf(char* dst, const char* fmt, ...)
    __attribute__((__nothrow__)); // No attribute(nonnull)

int
test_null_dst (void)
{
  return sprintf (NULL, "hello world"); /* { dg-warning "use of NULL where
non-null expected" } */
}

int
test_null_fmt (char *dst)
{
  return sprintf (dst, NULL);  /* { dg-warning "use of NULL where non-null
expected" } */
}

Signature in builtins.def: DEF_LIB_BUILTIN        (BUILT_IN_SPRINTF,
"sprintf", BT_FN_INT_STRING_CONST_STRING_VAR,
ATTR_NOTHROW_NONNULL_1_FORMAT_PRINTF_2_3)

My question is then : Should G++ behave as GCC and ignore the user's
signatures of built-ins, instead using the attributes specified by GCC ?
At the moment I went with a "yes". If the function is a builtin, I'm
operating on its builtin_known_function::builtin_decl () (see above) rather
than the callee_fndecl
deduced from a gimple call, therefore overriding the user's signature.

Doing so led to tests sprintf-2.c and realloc-1.c to pass both in GCC and
G++.


Last test I'd like to discuss is analyzer/pr99193-1.c

/* { dg-additional-options "-Wno-analyzer-too-complex" } */

/* Verify absence of false positive from -Wanalyzer-mismatching-deallocation
   on realloc(3).
   Based on
https://github.com/libguestfs/libguestfs/blob/f19fd566f6387ce7e4d82409528c9dde374d25e0/daemon/command.c#L115
   which is GPLv2 or later.  */

typedef __SIZE_TYPE__ size_t;
typedef __builtin_va_list va_list;

#ifdef __cplusplus
  #if __cplusplus >= 201103L
  #define NULL nullptr
  #else
  #define NULL 0
  #endif
#else
  #define NULL ((void *)0)
#endif

extern void *malloc (size_t __size)
  __attribute__ ((__nothrow__ , __leaf__))
  __attribute__ ((__malloc__))
  __attribute__ ((__alloc_size__ (1)));
extern void perror (const char *__s);
extern void *realloc (void *__ptr, size_t __size)
  __attribute__ ((__nothrow__ , __leaf__))
  __attribute__ ((__warn_unused_result__))
  __attribute__ ((__alloc_size__ (2)));

extern void guestfs_int_cleanup_free (void *ptr);
extern int commandrvf (char **stdoutput, char **stderror, unsigned flags,
                       char const* const *argv);
#define CLEANUP_FREE __attribute__((cleanup(guestfs_int_cleanup_free)))

int
commandrf (char **stdoutput, char **stderror, unsigned flags,
           const char *name, ...)
{
  va_list args;
  CLEANUP_FREE const char **argv = NULL;
  char *s;
  int i, r;

  /* Collect the command line arguments into an array. */
  i = 2;
  argv = (const char **) malloc (sizeof (char *) * i);

 if (argv == NULL) {
    perror ("malloc");
    return -1;
  }
  argv[0] = (char *) name;
  argv[1] = NULL;

  __builtin_va_start (args, name);

  while ((s = __builtin_va_arg (args, char *)) != NULL) {
    const char **p = (const char **) realloc (argv, sizeof (char *) *
(++i)); /* { dg-bogus "'free'" } */
    if (p == NULL) {
      perror ("realloc");
      __builtin_va_end (args); // (*)
      return -1;
    }
    argv = p;
    argv[i-2] = s;
    argv[i-1] = NULL;
  }

  __builtin_va_end (args);

  r = commandrvf (stdoutput, stderror, flags, argv);

  return r;
} // G++ emits Excess error here

Even without the prior changes to the processing of built-ins, va_end and
__builtin_va_end are recognized by G++.
Yet, G++ fails with an Excess error "warning: missing call to 'va_end'
[-Wanalyzer-va-list-leak]".

/home/vultkayn/Desktop/gcc_sources_perso/gcc/gcc/testsuite/c-c++-common/analyzer/pr99193-1.c:75:1:
warning: missing call to ‘va_end’ [-Wanalyzer-va-list-leak]
   75 | }
      | ^
  ‘int commandrf(char**, char**, unsigned int, const char*, ...)’: events
1-3
    |
    |   49 |  if (argv == NULL) {
    |      |  ^~
    |      |  |
    |      |  (1) following ‘false’ branch...
    |......
    |   53 |   argv[0] = (char *) name;
    |      |         ~
    |      |         |
    |      |         (2) ...to here
    |......
    |   56 |   __builtin_va_start (args, name);
    |      |   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |      |                      |
    |      |                      (3) ‘va_start’ called here (state of ‘’:
‘start’ -> ‘started’, NULL origin, meaning: {verb: ‘acquire’, noun:
‘resource’})
    |
  ‘int commandrf(char**, char**, unsigned int, const char*, ...)’: events
4-9
    |
    |   58 |   while ((s = __builtin_va_arg (args, char *)) != NULL) {
    |      |                                                ^
    |      |                                                |
    |      |                                                (4) following
‘true’ branch...
    |   59 |     const char **p = (const char **) realloc (argv, sizeof
(char *) * (++i)); /* { dg-bogus "'free'" } */
    |      |
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |      |                                              |
            |
    |      |                                              |
            (5) ...to here
    |      |                                              (6) when
‘realloc’ fails
    |   60 |     if (p == NULL) {
    |      |     ~~
    |      |     |
    |      |     (7) following ‘true’ branch (when ‘p’ is NULL)...
    |   61 |       perror ("realloc");
    |      |       ~~~~~~~~~~~~~~~~~~
    |      |              |
    |      |              (8) ...to here
    |......
    |   75 | }
    |      | ~
    |      | |
    |      | (9) missing call to ‘va_end’ to match ‘va_start’ at (3) (in
global state ‘started’)
    |

The IPA doesn't differ much between C and C++ at this sec
// gcc -fdump-ipa-analyzer
  <bb 5> :
  i_44 = i_22 + 1;
  _8 = (long unsigned int) i_44;
  _9 = _8 * 8;
  argv.3_10 = argv;
  p_46 = realloc (argv.3_10, _9);
  if (p_46 == 0B)
    goto <bb 6>; [INV]
  else
    goto <bb 7>; [INV]

  <bb 6> :
  perror ("realloc");
  __builtin_va_end (&args);
  _52 = -1;
  // predicted unlikely by early return (on trees) predictor.
  goto <bb 10>; [INV]

// g++ -fdump-analyzer-ipa
  <bb 6> :
  i_48 = i_22 + 1;
  _8 = (long unsigned int) i_48;
  _9 = _8 * 8;
  argv.3_10 = argv;
  p_50 = realloc (argv.3_10, _9);
  if (p_50 == 0B) // if p == NULL
    goto <bb 7>; [INV]
  else
    goto <bb 9>; [INV]

  <bb 7> :
  perror ("realloc");

  <bb 8> :
  __builtin_va_end (&args);
  _56 = -1;
  // predicted unlikely by early return (on trees) predictor.
  goto <bb 13>; [INV]

If I comment out the call to perror("realloc"), then G++ behaves as GCC.
If I replace perror ("realloc") by any other call, to a fully-defined
function or whatever, then the warning reappears.
The above SSA differ by an extra basic block in G++ that splits GCC's <bb
6> in two.
However, that doesn't account for much, as changing
  if (p == NULL) {
      perror ("realloc");
      __builtin_va_end (args); // (*)
      return -1;
    }
to
    if (p == NULL) {
      int x;
      perror("realloc");
      if (x > 7)
        x = 12;
      __builtin_va_end (args);
      return -1;
    }
generates a similar basic block around __builtin_va_end, yet G++ keeps
emitting the false positive whereas GCC doesn't.
// GCC modified SSA
  <bb 6> :
  perror ("realloc");
  if (x_51(D) > 7)
    goto <bb 7>; [INV]
  else
    goto <bb 8>; [INV]

  <bb 7> :
  x_52 = 12;

  <bb 8> :
  __builtin_va_end (&args);
  _54 = -1;
  // predicted unlikely by early return (on trees) predictor.
  goto <bb 12>; [INV]

// G++ modified SSA
  <bb 7> :
  perror ("realloc");

  <bb 8> :
  if (x_55(D) > 7)
    goto <bb 9>; [INV]
  else
    goto <bb 10>; [INV]

  <bb 9> :
  x_56 = 12;

  <bb 10> :
  __builtin_va_end (&args);
  _58 = -1;
  // predicted unlikely by early return (on trees) predictor.
  goto <bb 15>; [INV]

Debugging and glaring at the exploded graph gave me no clue towards fixing
this test.
If you have any hint, I welcome it.

I hope the above isn't too indigest.

Cheers,
Benjamin.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: analyzer: Weekly update on extending C++ support (2)
  2023-08-07 13:04 analyzer: Weekly update on extending C++ support (2) Benjamin Priour
@ 2023-08-07 15:48 ` David Malcolm
  2023-08-07 16:06 ` David Malcolm
  1 sibling, 0 replies; 5+ messages in thread
From: David Malcolm @ 2023-08-07 15:48 UTC (permalink / raw)
  To: Benjamin Priour, gcc

On Mon, 2023-08-07 at 15:04 +0200, Benjamin Priour wrote:
> Hi Dave,

Hi Benjamin.

> 
> I made some more progress on moving tests from gcc.dg/analyzer/ to
> c-c++-common/analyzer.

Thanks.

It sounds like you have a large amount of "pending" work that is
accumulating on your hard drive.  This makes me nervous; if we disagree
on how an aspect of the work should be done, it would be better to sort
that out sooner rather than when you've done all the work.  So in the
spirit of "release early, release often", are you able to post a small
representative subset of the work to gcc-patches?

Also, do you have backups?

Further comments inline below...

> I'll only detail the most noteworthy tests I encountered.
> 
> gcc.dg/analyzer/pr103892.c troubled me with an Excess error when
> compiled
> with g++
> analysis bailed out early (91 'after-snode' enodes; 314 enodes)
> [-Wanalyzer-too-complex]
> 
> pr103892.c is compiled with optimization level -O2.
> Analysis bails out when the number of "after-SN" enodes explodes,
> i.e.
> exceeds a certain proportion of the total number of SG nodes.
> limit(after-SN) = m_sg.num_nodes ()  * param-bb-explosion-factor.
> The reason why C and C++ differs is simply due to what '-O2' does to
> each
> of them.
> Under -O0 or -O1, there is no failure whatsoever, under -O2 only C++
> fails,
> whereas with -O3 both gcc and g++ emits -Wanalyzer-too-complex.
> With -O2, although GCC produces a greater total number of "after-SN"
> enodes
> than G++, their proportion barely stays under limit(after-SN) as
> the total number of SN is also bigger, hence no warning is emitted.

Is the gimple seen by -fanalyzer different between the C and C++
frontends for this case?

> 
> All in all, I don't believe there is a significant difference here
> between
> C and C++, nor is there much we can do about this.
> Therefore I'll simply add a { dg-warning "analysis bailed out early"
> "" {
> target c++ } }

Putting in a dg-warning directive would mean that we expect the warning
at that line, which would be over-specifying the behavior of the test.

It's usually better to add -Wno-analyzer-too-complex to the options. 
Looking at git log, that test was added in
3d41408c5d28105e7a3ea2eb2529431a70b96369 as part of a fix for state
explosions.  So the absense of -Wanalyzer-too-complex is something that
the test is effectively asserting.  Hence the -Wno-analyzer-too-complex
should be conditional on { target c++ }.


> An interesting divergence between GCC and G++ was in the handling of
> built-ins.
> Tests gcc.dg/analyzer/{pr104062.c,sprintf-2.c} are failing when
> compiling
> with G++.
> FAIL: c-c++-common/analyzer/pr104062.c  leak of ap7 at line 13 (test
> for
> warnings, line 12)
> The reason is neither realloc nor sprintf are considered by G++ as a
> built-in, contrary to GCC.

Do you know why this happens?  I see both of those tests have their own
prototypes for the functions in question, rather than using system
headers; maybe those prototypes don't match some property that the C++
FE is checking for?

> C built-ins are mapped within the analyzer to known_functions using
> their
> 'enum combined_fn' code
> (See kf.cc:register_known_functions), not their function name.
> Therefore, we find a built-in known function by checking
> tree.h:fndecl_built_in_p returns true
> and calling known_function_manager::get_normal_builtin.
> The former returns false for 'realloc' and 'sprintf' when using G++.
> 
> To fix that, I've derived builtin_known_function from known_function.
> 
> /* Subclass of known_function for builtin functions.  */
> 
> class builtin_known_function : public known_function
> {
> public:
>   virtual enum built_in_function builtin_code () const = 0;
>   tree builtin_decl () const {
>     gcc_assert (builtin_code () < END_BUILTINS);
>     return builtin_info[builtin_code ()].decl;
>   }
> 
>   virtual const builtin_known_function *
>   dyn_cast_builtin_kf () const { return this; }
>   virtual builtin_known_function *
>   dyn_cast_builtin_kf () { return this; }
> };
> 
> And 'kfm.add (BUILT_IN_REALLOC, make_unique<kf_realloc> ());' becomes
> kfm.add ("realloc", make_unique<kf_realloc> ());
> kfm.add ("__builtin_realloc", make_unique<kf_realloc> ());
> 
> Unfortunately we have to register the built-ins using their function
> name
> which is more apt to bugs,
> and the double call to kfm.add is quite messy. Having two instances
> of
> kf_realloc however is not that troubling,
> as builtin_known_function objects are lightweight.
> 
> That GCC built-ins are not recognized by G++ doesn't only impede
> their
> detection,
> but also how we process them, and what information we have of them.
> In gcc.dg/analyzer/sprintf-2.c, sprintf signature follows the manual
> and
> cppreference.
> Yet we expect sm-malloc to warn about use of NULL when either of the
> argument is NULL,
> although there is no attribute(nonnull) specified.
> In fact, sm-malloc isn't using the signature of sprintf as specified
> in the
> test case, but rather the one provided
> by GCC in builtins.def
> 
> /* See e.g. https://en.cppreference.com/w/c/io/fprintf
>    and https://www.man7.org/linux/man-pages/man3/sprintf.3.html */
> 
>   extern int
>   sprintf(char* dst, const char* fmt, ...)
>     __attribute__((__nothrow__)); // No attribute(nonnull)
> 
> int
> test_null_dst (void)
> {
>   return sprintf (NULL, "hello world"); /* { dg-warning "use of NULL
> where
> non-null expected" } */
> }
> 
> int
> test_null_fmt (char *dst)
> {
>   return sprintf (dst, NULL);  /* { dg-warning "use of NULL where
> non-null
> expected" } */
> }
> 
> Signature in builtins.def: DEF_LIB_BUILTIN        (BUILT_IN_SPRINTF,
> "sprintf", BT_FN_INT_STRING_CONST_STRING_VAR,
> ATTR_NOTHROW_NONNULL_1_FORMAT_PRINTF_2_3)
> 
> My question is then : Should G++ behave as GCC and ignore the user's
> signatures of built-ins, instead using the attributes specified by
> GCC ?
> At the moment I went with a "yes". If the function is a builtin, I'm
> operating on its builtin_known_function::builtin_decl () (see above)
> rather
> than the callee_fndecl
> deduced from a gimple call, therefore overriding the user's
> signature.
> 
> Doing so led to tests sprintf-2.c and realloc-1.c to pass both in GCC
> and
> G++.

I think I agree with "yes" here.

> 
> 
> Last test I'd like to discuss is analyzer/pr99193-1.c

I'll reply to this in a separate email

Thanks; hope the above makes sense
Dave


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: analyzer: Weekly update on extending C++ support (2)
  2023-08-07 13:04 analyzer: Weekly update on extending C++ support (2) Benjamin Priour
  2023-08-07 15:48 ` David Malcolm
@ 2023-08-07 16:06 ` David Malcolm
  2023-08-23 13:55   ` analyzer: How to recognize builtins Benjamin Priour
  1 sibling, 1 reply; 5+ messages in thread
From: David Malcolm @ 2023-08-07 16:06 UTC (permalink / raw)
  To: Benjamin Priour, gcc

On Mon, 2023-08-07 at 15:04 +0200, Benjamin Priour wrote:
> Hi Dave,

[...snip...]

> Last test I'd like to discuss is analyzer/pr99193-1.c
> 

[...snip...]

> 
> If I comment out the call to perror("realloc"), then G++ behaves as
> GCC.
> If I replace perror ("realloc") by any other call, to a fully-defined
> function or whatever, then the warning reappears.
> The above SSA differ by an extra basic block in G++ that splits GCC's
> <bb
> 6> in two.
> However, that doesn't account for much, as changing
>   if (p == NULL) {
>       perror ("realloc");
>       __builtin_va_end (args); // (*)
>       return -1;
>     }
> to
>     if (p == NULL) {
>       int x;
>       perror("realloc");
>       if (x > 7)
>         x = 12;
>       __builtin_va_end (args);
>       return -1;
>     }
> generates a similar basic block around __builtin_va_end, yet G++
> keeps
> emitting the false positive whereas GCC doesn't.
> // GCC modified SSA
>   <bb 6> :
>   perror ("realloc");
>   if (x_51(D) > 7)
>     goto <bb 7>; [INV]
>   else
>     goto <bb 8>; [INV]
> 
>   <bb 7> :
>   x_52 = 12;
> 
>   <bb 8> :
>   __builtin_va_end (&args);
>   _54 = -1;
>   // predicted unlikely by early return (on trees) predictor.
>   goto <bb 12>; [INV]
> 
> // G++ modified SSA
>   <bb 7> :
>   perror ("realloc");
> 
>   <bb 8> :
>   if (x_55(D) > 7)
>     goto <bb 9>; [INV]
>   else
>     goto <bb 10>; [INV]
> 
>   <bb 9> :
>   x_56 = 12;
> 
>   <bb 10> :
>   __builtin_va_end (&args);
>   _58 = -1;
>   // predicted unlikely by early return (on trees) predictor.
>   goto <bb 15>; [INV]
> 
> Debugging and glaring at the exploded graph gave me no clue towards
> fixing
> this test.
> If you have any hint, I welcome it.

There a CFG difference here, so have a look at the .supergraph.dot
files with gcc vs g++, using -fdump-analyzer-supergraph.

Both write out to pr99193-1.c.supergraph.dot, but renaming the results
and then comparing them side-by-side, I see various differences.  In
particular, looking at the BB containing the perror ("realloc"), both
GCC and G++ have an out-edge with flag FALLTHRU going to the BB that
starts with __builtin_va_end, but G++ also has an out-edge with flag EH
going to BB 16, which starts with guestfs_int_cleanup_free (generated
by the __attribute__((cleanup(guestfs_int_cleanup_free))) on argv.

So I think what we're seeing here is the execution path that follows
that EH edge, which is to handle the case where perror raises an
exception.  I don't know if that's actually possible, but perhaps it
is.

Are you adding any exception-handling flags to the G++ cases? (e.g.
disabling exception-handling?)

Dave


^ permalink raw reply	[flat|nested] 5+ messages in thread

* analyzer: How to recognize builtins
  2023-08-07 16:06 ` David Malcolm
@ 2023-08-23 13:55   ` Benjamin Priour
  2023-08-23 23:31     ` David Malcolm
  0 siblings, 1 reply; 5+ messages in thread
From: Benjamin Priour @ 2023-08-23 13:55 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc

[-- Attachment #1: Type: text/plain, Size: 3609 bytes --]

Hi David,

A quick update on transitioning the analyzer tests from gcc.dg to
c-c++-common.
As discussed previously, C builtins are not C++ builtins, therefore I had
to add
recognition of the builtins by their name rather than with the predicate
fndecl_built_in_p.

Do you know why this happens?  I see both of those tests have their own
> prototypes for the functions in question, rather than using system
> headers; maybe those prototypes don't match some property that the C++
> FE is checking for?
>

The C FE relies on c-decl.cc:c_builtin_function, the macro
C_DECL_BUILTIN_PROTOTYPE,
as well as c-decl.cc:match_builtin_function_types to recognize a builtin
function decl
and provide a unified type between what the user wrote, and the known
builtin fndecl.

I'd like however to correct something I've said previously

My question is then : Should G++ behave as GCC and ignore the user's
> signatures of built-ins, instead using the attributes specified by GCC ?
> At the moment I went with a "yes". If the function is a builtin, I'm
> operating on its builtin_known_function::builtin_decl () (see above) rather
> than the callee_fndecl
> deduced from a gimple call, therefore overriding the user's signature.
>

Actually, GCC does not ignore the user's signatures of built-ins, but if
match_builtin_function_types finds out that the user's signature
is close enough to the one described in builtins.def, then it will complete
the user's add with attributes and the likes so that we get an unified type.

A case where the user's signature *isn't* close enough to the built-in's is
gcc.dg/analyzer/pr96841.c's malloc's signature.

__SIZE_TYPE__
malloc (__SIZE_TYPE__);

To simplify, whenever a Wbuiltin-declaration-mismatch is emitted, then the
user's decl isn't unified with the built-ins,
and the extra GCC's attributes defined in builtins.def are not added to the
user's decl.
In such cases, the function decl is not recognized as a builtin by
fndecl_builtin_p, hence the analyzer's (before my patch) known_function
is never found out.
What this mean is that pr96841.c's malloc uses don't go within
kf_malloc:impl_call_pre nor kf_malloc:impl_call_post with the current
analyzer's handling of built-ins - do note however that sm-malloc goes as
expected, as it recognizes malloc calls both by name
and builtin decl (sm-malloc.cc::known_allocator_p).

With my patch, since the recognition is by name and no longer dependent on
the fndecl_builtin_p predicate, pr96841's malloc *is*
recognized as a builtin, and its kf is processed.
This led to pr96841.c no longer passing, as the while loop makes the
analysis too complex. (-Wanalyzer-too-complex)

> > My question is then : Should G++ behave as GCC and ignore the user's
> > signatures of built-ins, instead using the attributes specified by
> > GCC ?
> > At the moment I went with a "yes". If the function is a builtin, I'm
> > operating on its builtin_known_function::builtin_decl () (see above)
> > rather
> > than the callee_fndecl
> > deduced from a gimple call, therefore overriding the user's
> > signature.
> >
> > Doing so led to tests sprintf-2.c and realloc-1.c to pass both in GCC
> > and
> > G++.
>
> I think I agree with "yes" here.

I would like to proceed with what we said here, so that the analyzer
proceeds
with the analysis of malloc or any other builtins even when their fndecl is
not recognized
as such. Then I would update pr96841.c to expect a -Wanalyzer-too-complex.
Besides, sm-malloc already makes the assumption that a call named malloc
should be treated
as a malloc, whether it is matching a builtin or not.

Thanks,
Benjamin.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: analyzer: How to recognize builtins
  2023-08-23 13:55   ` analyzer: How to recognize builtins Benjamin Priour
@ 2023-08-23 23:31     ` David Malcolm
  0 siblings, 0 replies; 5+ messages in thread
From: David Malcolm @ 2023-08-23 23:31 UTC (permalink / raw)
  To: Benjamin Priour; +Cc: gcc

On Wed, 2023-08-23 at 15:55 +0200, Benjamin Priour wrote:
> Hi David,
> 
> A quick update on transitioning the analyzer tests from gcc.dg to
> c-c++-common.
> As discussed previously, C builtins are not C++ builtins, therefore I
> had
> to add
> recognition of the builtins by their name rather than with the
> predicate
> fndecl_built_in_p.
> 
> Do you know why this happens?  I see both of those tests have their
> own
> > prototypes for the functions in question, rather than using system
> > headers; maybe those prototypes don't match some property that the
> > C++
> > FE is checking for?
> > 
> 
> The C FE relies on c-decl.cc:c_builtin_function, the macro
> C_DECL_BUILTIN_PROTOTYPE,
> as well as c-decl.cc:match_builtin_function_types to recognize a
> builtin
> function decl
> and provide a unified type between what the user wrote, and the known
> builtin fndecl.
> 
> I'd like however to correct something I've said previously
> 
> My question is then : Should G++ behave as GCC and ignore the user's
> > signatures of built-ins, instead using the attributes specified by
> > GCC ?
> > At the moment I went with a "yes". If the function is a builtin,
> > I'm
> > operating on its builtin_known_function::builtin_decl () (see
> > above) rather
> > than the callee_fndecl
> > deduced from a gimple call, therefore overriding the user's
> > signature.
> > 
> 
> Actually, GCC does not ignore the user's signatures of built-ins, but
> if
> match_builtin_function_types finds out that the user's signature
> is close enough to the one described in builtins.def, then it will
> complete
> the user's add with attributes and the likes so that we get an
> unified type.
> 
> A case where the user's signature *isn't* close enough to the built-
> in's is
> gcc.dg/analyzer/pr96841.c's malloc's signature.
> 
> __SIZE_TYPE__
> malloc (__SIZE_TYPE__);
> 
> To simplify, whenever a Wbuiltin-declaration-mismatch is emitted,
> then the
> user's decl isn't unified with the built-ins,
> and the extra GCC's attributes defined in builtins.def are not added
> to the
> user's decl.
> In such cases, the function decl is not recognized as a builtin by
> fndecl_builtin_p, hence the analyzer's (before my patch)
> known_function
> is never found out.
> What this mean is that pr96841.c's malloc uses don't go within
> kf_malloc:impl_call_pre nor kf_malloc:impl_call_post with the current
> analyzer's handling of built-ins - do note however that sm-malloc
> goes as
> expected, as it recognizes malloc calls both by name
> and builtin decl (sm-malloc.cc::known_allocator_p).
> 
> With my patch, since the recognition is by name and no longer
> dependent on
> the fndecl_builtin_p predicate, pr96841's malloc *is*
> recognized as a builtin, and its kf is processed.
> This led to pr96841.c no longer passing, as the while loop makes the
> analysis too complex. (-Wanalyzer-too-complex) 

Are you able to post the part of that patch that affects recognition of
builtins?  (even if it's just a work-in-progress with no ChangeLog)

> 
> > > My question is then : Should G++ behave as GCC and ignore the
> > > user's
> > > signatures of built-ins, instead using the attributes specified
> > > by
> > > GCC ?
> > > At the moment I went with a "yes". If the function is a builtin,
> > > I'm
> > > operating on its builtin_known_function::builtin_decl () (see
> > > above)
> > > rather
> > > than the callee_fndecl
> > > deduced from a gimple call, therefore overriding the user's
> > > signature.
> > > 
> > > Doing so led to tests sprintf-2.c and realloc-1.c to pass both in
> > > GCC
> > > and
> > > G++.
> > 
> > I think I agree with "yes" here.
> 
> I would like to proceed with what we said here, so that the analyzer
> proceeds
> with the analysis of malloc or any other builtins even when their
> fndecl is
> not recognized
> as such. Then I would update pr96841.c to expect a -Wanalyzer-too-
> complex.
> Besides, sm-malloc already makes the assumption that a call named
> malloc
> should be treated
> as a malloc, whether it is matching a builtin or not.

Sounds good.

Dave


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-08-23 23:31 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-07 13:04 analyzer: Weekly update on extending C++ support (2) Benjamin Priour
2023-08-07 15:48 ` David Malcolm
2023-08-07 16:06 ` David Malcolm
2023-08-23 13:55   ` analyzer: How to recognize builtins Benjamin Priour
2023-08-23 23:31     ` David Malcolm

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).