public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug analyzer/108028] New: --Wanalyzer-null-dereference false posiative with *q = 1
@ 2022-12-09  7:08 mengli.ming at outlook dot com
  2022-12-13 22:38 ` [Bug analyzer/108028] Misleading -fanalyzer messages at -O2 and above dmalcolm at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: mengli.ming at outlook dot com @ 2022-12-09  7:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108028

            Bug ID: 108028
           Summary: --Wanalyzer-null-dereference false posiative with *q =
                    1
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: analyzer
          Assignee: dmalcolm at gcc dot gnu.org
          Reporter: mengli.ming at outlook dot com
  Target Milestone: ---

I got a false positive error when compiling the following program with
gcc(trunk) -fanalyzer -O2 in https://godbolt.org/z/W37MzrPqd. The
`__analyzer_eval()` statement is added at some suitable places in the code in
order to keep track of the information already available to the analyzer at
some point in the static analysis of the program. After that, I found that
under -O0, for this program (https://godbolt.org/z/Y1GMEMaG9),
`__analyzer_eval(p && (0 == q))`, `__analyzer_eval(p)`, `__analyzer_eval(0 ==
q)` give the same result at the same program point as -O2 without generating
the NPD warning. The following is the result of the analysis obtained using
-O2, please take a look, thank you.

```
#include "stdio.h"
int f(int, int *);

int f(int p, int *q)
{
    __analyzer_eval(p && (0 == q));
    if (p && (0 == q))
    {
        __analyzer_eval(p && (0 == q));
        __analyzer_eval(p);
        __analyzer_eval(0 == p);
        __analyzer_eval(q);
        __analyzer_eval(0 == q);
        *q = 1;
    }
    printf("NPD_FLAG\n");
}

int main()
{
    int a = 0;
    int *b = (void*)0;
    f(a, b);
}

```

```
<source>: In function 'f':
<source>:6:5: warning: FALSE
    6 |     __analyzer_eval(p && (0 == q));
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<source>:6:5: warning: UNKNOWN
<source>:9:9: warning: TRUE
    9 |         __analyzer_eval(p && (0 == q));
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<source>:10:9: warning: TRUE
   10 |         __analyzer_eval(p);
      |         ^~~~~~~~~~~~~~~~~~
<source>:11:9: warning: FALSE
   11 |         __analyzer_eval(0 == p);
      |         ^~~~~~~~~~~~~~~~~~~~~~~
<source>:12:9: warning: UNKNOWN
   12 |         __analyzer_eval(q);
      |         ^~~~~~~~~~~~~~~~~~
<source>:13:9: warning: TRUE
   13 |         __analyzer_eval(0 == q);
      |         ^~~~~~~~~~~~~~~~~~~~~~~
<source>:14:12: warning: dereference of NULL '0' [CWE-476]
[-Wanalyzer-null-dereference]
   14 |         *q = 1;
      |         ~~~^~~
  'f': events 1-3
    |
    |    7 |     if (p && (0 == q))
    |      |        ^
    |      |        |
    |      |        (1) following 'true' branch...
    |    8 |     {
    |    9 |         __analyzer_eval(p && (0 == q));
    |      |         ~~~~~~~~~~~~~~~
    |      |         |
    |      |         (2) ...to here
    |......
    |   14 |         *q = 1;
    |      |         ~~~~~~
    |      |            |
    |      |            (3) dereference of NULL '0'

```

In the analysis under -O2 above, lines 12 and 13 are somewhat contradictory, as
`__analyzer_eval(q)` results in UNKNOWN while `__analyzer_eval(0 == q)` results
in TRUE.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug analyzer/108028] Misleading -fanalyzer messages at -O2 and above
  2022-12-09  7:08 [Bug analyzer/108028] New: --Wanalyzer-null-dereference false posiative with *q = 1 mengli.ming at outlook dot com
@ 2022-12-13 22:38 ` dmalcolm at gcc dot gnu.org
  2022-12-13 22:55 ` dmalcolm at gcc dot gnu.org
  2022-12-14 15:27 ` mengli.ming at outlook dot com
  2 siblings, 0 replies; 4+ messages in thread
From: dmalcolm at gcc dot gnu.org @ 2022-12-13 22:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108028

David Malcolm <dmalcolm at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|--Wanalyzer-null-dereferenc |Misleading -fanalyzer
                   |e false positive with *q =  |messages at -O2 and above
                   |1                           |

--- Comment #1 from David Malcolm <dmalcolm at gcc dot gnu.org> ---
Thanks for filing this bug.

There are several things going on here.

(A): the analyzer is considering the function "f" as called standalone, as well
as the case where it's called from "main", rather than merely considering it
when it's called from "main".  There are a few other bug reports about that;
it's not clear to me what we should do for this case; is it expected that such
functions are only ever called from main?

The situation is clearer if we simply delete "main" from the reproducer.  With
that, we see:

  'f': events 1-3
    |
    |    7 |     if (p && (0 == q))
    |      |        ^
    |      |        |
    |      |        (1) following 'true' branch...
    |    8 |     {
    |    9 |         __analyzer_eval(p && (0 == q));
    |      |         ~~~~~~~~~~~~~~~
    |      |         |
    |      |         (2) ...to here
    |......
    |   14 |         *q = 1;
    |      |         ~~~~~~
    |      |            |
    |      |            (3) dereference of NULL '0'
    |


(B) arguably the CFG event (1) to (2) is poorly worded; at (1) we have
"following 'true' branch...", which suggests it always follows the 'true'
branch, whereas it's merely considering what happens *if* we take the 'true'
branch.

If it read: e.g.:

  'f': events 1-3
    |
    |    7 |     if (p && (0 == q))
    |      |        ^
    |      |        |
    |      |        (1) considering following 'true' branch (implying that 'q'
is NULL)...
    |    8 |     {
    |    9 |         __analyzer_eval(p && (0 == q));
    |      |         ~~~~~~~~~~~~~~~
    |      |         |
    |      |         (2) ...to here
    |......
    |   14 |         *q = 1;
    |      |         ~~~~~~
    |      |            |
    |      |            (3) dereference of NULL '0'
    |

the analyzer would be more obviously correct and useful.

Or even "considering when 'q' is NULL; following 'true' branch..."

I've been experimenting with making the wording here clearer
(i): should make a distinction between when the analyzer chooses one of several
paths to consider, versus when the choice of execution path is already
determined by previous choices
(ii): would be nice to capture that q's nullness is the most interesting
property at the "if" statement with respect to the warning, and express that to
the user.


(C) The analyzer runs relatively late compared to most static analyzers, so the
optimization levels affect things.  In particular, consider the gimple IR seen
by -fanalyzer for the assignment:
     *q = 1;
in the block guarded by (0 == q).

At -O1 we have:
     *q_10(D) = 1;
but at -O2 it converts it to:
     MEM[(int *)0B] = 1;

Arguably it's a bug here that we only warn at -O2 and above; analyzing "f"
standalone, we ought to be complaining about the null dereference without
needing -O2.
(specifically, at -O2 -fanalyzer sees a deref of NULL, whereas at -O1 it merely
sees a deref of INIT_VAL(q), whilst knowing the constraint that INIT_VAL(q) is
NULL.
The __analyzer_eval results are also due to gimple IR differences caused by the
optimizer.

Also, perhaps we should run earlier; I probably ought to file a bug about that,
it's a big can of worms.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug analyzer/108028] Misleading -fanalyzer messages at -O2 and above
  2022-12-09  7:08 [Bug analyzer/108028] New: --Wanalyzer-null-dereference false posiative with *q = 1 mengli.ming at outlook dot com
  2022-12-13 22:38 ` [Bug analyzer/108028] Misleading -fanalyzer messages at -O2 and above dmalcolm at gcc dot gnu.org
@ 2022-12-13 22:55 ` dmalcolm at gcc dot gnu.org
  2022-12-14 15:27 ` mengli.ming at outlook dot com
  2 siblings, 0 replies; 4+ messages in thread
From: dmalcolm at gcc dot gnu.org @ 2022-12-13 22:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108028

--- Comment #2 from David Malcolm <dmalcolm at gcc dot gnu.org> ---
(D)  Also, the 
        (3) dereference of NULL '0'
is poorly worded; ideally we'd say:
        (3) dereference of NULL 'q'

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug analyzer/108028] Misleading -fanalyzer messages at -O2 and above
  2022-12-09  7:08 [Bug analyzer/108028] New: --Wanalyzer-null-dereference false posiative with *q = 1 mengli.ming at outlook dot com
  2022-12-13 22:38 ` [Bug analyzer/108028] Misleading -fanalyzer messages at -O2 and above dmalcolm at gcc dot gnu.org
  2022-12-13 22:55 ` dmalcolm at gcc dot gnu.org
@ 2022-12-14 15:27 ` mengli.ming at outlook dot com
  2 siblings, 0 replies; 4+ messages in thread
From: mengli.ming at outlook dot com @ 2022-12-14 15:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108028

--- Comment #3 from ming mengli <mengli.ming at outlook dot com> ---
(In reply to David Malcolm from comment #1)
> Thanks for filing this bug.
> 
> There are several things going on here.
> 
> (A): the analyzer is considering the function "f" as called standalone, as
> well as the case where it's called from "main", rather than merely
> considering it when it's called from "main".  There are a few other bug
> reports about that; it's not clear to me what we should do for this case; is
> it expected that such functions are only ever called from main?
> 
> The situation is clearer if we simply delete "main" from the reproducer. 
> With that, we see:
> 
>   'f': events 1-3
>     |
>     |    7 |     if (p && (0 == q))
>     |      |        ^
>     |      |        |
>     |      |        (1) following 'true' branch...
>     |    8 |     {
>     |    9 |         __analyzer_eval(p && (0 == q));
>     |      |         ~~~~~~~~~~~~~~~
>     |      |         |
>     |      |         (2) ...to here
>     |......
>     |   14 |         *q = 1;
>     |      |         ~~~~~~
>     |      |            |
>     |      |            (3) dereference of NULL '0'
>     |
> 
> 
> (B) arguably the CFG event (1) to (2) is poorly worded; at (1) we have
> "following 'true' branch...", which suggests it always follows the 'true'
> branch, whereas it's merely considering what happens *if* we take the 'true'
> branch.
> 
> If it read: e.g.:
> 
>   'f': events 1-3
>     |
>     |    7 |     if (p && (0 == q))
>     |      |        ^
>     |      |        |
>     |      |        (1) considering following 'true' branch (implying that
> 'q' is NULL)...
>     |    8 |     {
>     |    9 |         __analyzer_eval(p && (0 == q));
>     |      |         ~~~~~~~~~~~~~~~
>     |      |         |
>     |      |         (2) ...to here
>     |......
>     |   14 |         *q = 1;
>     |      |         ~~~~~~
>     |      |            |
>     |      |            (3) dereference of NULL '0'
>     |
> 
> the analyzer would be more obviously correct and useful.
> 
> Or even "considering when 'q' is NULL; following 'true' branch..."
> 
> I've been experimenting with making the wording here clearer
> (i): should make a distinction between when the analyzer chooses one of
> several paths to consider, versus when the choice of execution path is
> already determined by previous choices
> (ii): would be nice to capture that q's nullness is the most interesting
> property at the "if" statement with respect to the warning, and express that
> to the user.
> 
> 
> (C) The analyzer runs relatively late compared to most static analyzers, so
> the optimization levels affect things.  In particular, consider the gimple
> IR seen by -fanalyzer for the assignment:
>      *q = 1;
> in the block guarded by (0 == q).
> 
> At -O1 we have:
>      *q_10(D) = 1;
> but at -O2 it converts it to:
>      MEM[(int *)0B] = 1;
> 
> Arguably it's a bug here that we only warn at -O2 and above; analyzing "f"
> standalone, we ought to be complaining about the null dereference without
> needing -O2.
> (specifically, at -O2 -fanalyzer sees a deref of NULL, whereas at -O1 it
> merely sees a deref of INIT_VAL(q), whilst knowing the constraint that
> INIT_VAL(q) is NULL.
> The __analyzer_eval results are also due to gimple IR differences caused by
> the optimizer.
> 
> Also, perhaps we should run earlier; I probably ought to file a bug about
> that, it's a big can of worms.

Thanks a lot for your explanation of this bug, it helps a lot. I think it's
perfectly feasible to consider both cases for that such functions to be
analyzed standalone and from the main function, sorry I didn't express myself
clearly on that point. Feel free to let us know if we could do anything to help
with these. We hope gcc static analyzer will get better.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-12-14 15:27 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-09  7:08 [Bug analyzer/108028] New: --Wanalyzer-null-dereference false posiative with *q = 1 mengli.ming at outlook dot com
2022-12-13 22:38 ` [Bug analyzer/108028] Misleading -fanalyzer messages at -O2 and above dmalcolm at gcc dot gnu.org
2022-12-13 22:55 ` dmalcolm at gcc dot gnu.org
2022-12-14 15:27 ` mengli.ming at outlook dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).