From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <mail@tim-lange.me>
Received: from www523.your-server.de (www523.your-server.de [159.69.224.22])
 by sourceware.org (Postfix) with ESMTPS id 817203857B88
 for <gcc@gcc.gnu.org>; Sun, 12 Jun 2022 18:27:10 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 817203857B88
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=tim-lange.me
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=tim-lange.me
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
 d=tim-lange.me; s=default2108; h=Content-Transfer-Encoding:Content-Type:
 MIME-Version:References:In-Reply-To:Message-Id:Cc:To:Subject:From:Date:Sender
 :Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:
 Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID;
 bh=ZAa7qna9g1pO5pVyfWvLmQgrAxVu78x3CQVj+nSMYbA=; b=iuaG0zU1tCNw8GzLQPhfz/Aioo
 uRUTyty37aXAEzRoYlgXE7skidzndg1ZfJbMQXIovxOEf+J6XEluNBj/RxxROf8YktWp7nAPgZtIb
 ZuFUiZIqTBFK82w01C7X/pd8lgnZ7WLWS0vZcowJG8+LzGGB3Vw907vvrM/pgkEo3MAwZDR9hSMTb
 xSPADrhdCv9bklqx9FDJHTD+tREoFj64MwCdRJ9LlnDICuBfNzn1v2OVr04I+Il46Qy3h+ZhexOM9
 VXmAxGdmFIRzNq0nDLyJCspBBRDIST7bK/MeEsbFYGN8a3XOz9/+8kgoIJj+CQGLnF8suTyp7MRKf
 Bs4haAzw==;
Received: from sslproxy05.your-server.de ([78.46.172.2])
 by www523.your-server.de with esmtpsa (TLSv1.3:TLS_AES_256_GCM_SHA384:256)
 (Exim 4.92.3) (envelope-from <mail@tim-lange.me>)
 id 1o0SIm-000Arz-Ur; Sun, 12 Jun 2022 20:27:09 +0200
Received: from [2a02:908:1864:dbe0::73b5] (helo=fedora)
 by sslproxy05.your-server.de with esmtpsa (TLSv1.3:TLS_AES_256_GCM_SHA384:256)
 (Exim 4.92) (envelope-from <mail@tim-lange.me>)
 id 1o0SIm-000Emn-QL; Sun, 12 Jun 2022 20:27:08 +0200
Date: Sun, 12 Jun 2022 20:27:02 +0200
From: Tim Lange <mail@tim-lange.me>
Subject: Re: fanalyzer: debugging zero state machine
To: David Malcolm <dmalcolm@redhat.com>
Cc: GCC Mailing List <gcc@gcc.gnu.org>
Message-Id: <29NDDR.H0A8YV139ST52@tim-lange.me>
In-Reply-To: <6d52e46e0ee72a9185bc9cd969e7f1553086585e.camel@redhat.com>
References: <57T7DR.KTUNZBPTS69U1@tim-lange.me>
 <6d52e46e0ee72a9185bc9cd969e7f1553086585e.camel@redhat.com>
X-Mailer: geary/40.0
MIME-Version: 1.0
Content-Type: text/plain; charset=windows-1251; format=flowed
Content-Transfer-Encoding: quoted-printable
X-Authenticated-Sender: mail@tim-lange.me
X-Virus-Scanned: Clear (ClamAV 0.103.6/26570/Sun Jun 12 10:15:28 2022)
X-Spam-Status: No, score=-1.4 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_INFOUSMEBIZ, SPF_HELO_NONE,
 SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc mailing list <gcc.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc>,
 <mailto:gcc-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc/>
List-Post: <mailto:gcc@gcc.gnu.org>
List-Help: <mailto:gcc-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc>,
 <mailto:gcc-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Sun, 12 Jun 2022 18:27:13 -0000


On Do, Jun 9 2022 at 13:40:06 -0400, David Malcolm=20
<dmalcolm@redhat.com> wrote:
> On Thu, 2022-06-09 at 16:49 +0200, Tim Lange wrote:
>>=20
>>   > On Mi, Jun 8 2022 at 11:12:52 -0400, David Malcolm
>>  <dmalcolm@redhat.com> wrote:
>>   > > On Wed, 2022-06-08 at 01:42 +0200, Tim Lange wrote:
>>   > >
>>   > > Hi Dave,
>=20
> Hi Tim; various responses inline below...
>=20
>>   > >
>>   > > I did spent some time to think about the zero state machine. I
>>  first
>>   > > thought about distinguishing between "assigned zero" and "EQ 0
>>   > > condition on the path" for cases where e.g. unreachable() is
>>  used
>>  to
>>   > > say that some variable will never be zero according to the
>>  programmer.
>>   > > In that case the dev might not want zero warnings from
>>  conditions
>>   > > outside the if body itself for dev build where unreachable
>>  expands
>>  to
>>   > > some panic exit. But as the condition constraint is not pruned
>>  when the
>>   > > state machine is distinguishing the states, I'm not sure how to
>>  know
>>   > > whether the analysis already left the if body?
>>   >
>>   > The analyzer works on the gimple-ssa representation, which uses
>>  basic
>>   > blocks in a CFG, rather than an AST, so the only remants we have
>>  of
>>   > scope is in "clobber" statements (a special kind of assignment
>>  stmt),
>>   > which the gimplify adds as variables go out of scope.
>>  If the constraints only lived until the immediate dominator of `if
>>  (cond)`, I could easily distinguish:
>>  1. if (x =3D=3D 0) && still inside the if =3D> zero
>>  2. if (x =3D=3D 0) && outside if =3D> maybe zero
>>  but as this seems to be not the case, if I want to distinguish 1. &
>>  2.,
>>  I'd have to find another way.
>>   >
>>   > For pruning, the analyzer's state_machine class has a
>>  "can_purge_p"
>>   > virtual function:
>>   >
>>   > /* Return true if it safe to discard the given state (to help
>>   > when simplifying state objects).
>>   > States that need leak detection should return false. */
>>   > virtual bool can_purge_p (state_t s) const =3D 0;
>>   >
>>   > which should return true for a "zeroness" state machine, in that
>>  we
>>   > always consider pruning states for svalues that aren't needed
>>  anymore
>>   > along a path.
>>  Is implemented and returns true.
>>   >
>>   > Is there some other kind of state explosion you're running into?
>>  It's
>>   > hard to figure this out further without seeing code.
>>  No, my code is by far not that mature to be tested. I just had in my
>>  head that I wanted to find out if I can distinguish the two cases.
>>   >
>>   >
>>   > > Also, while trying out different things, it seems simple
>>  assignments on
>>   > > phi like here
>>   > > int x;
>>   > > if (argc =3D=3D 1) {
>>   > > x =3D 1; // x_5
>>   > > } else {
>>   > > x =3D 0; // x_4
>>   > > }
>>   > > printf("%i", 5 / x); // x_2
>>   > > automatically work such that x_2 already inherits the state=20
>> from
>>   > > x_4/x_5 without me doing anything inside my sm's on_phi
>>  function.
>>  Same
>>   > > for the simple y =3D 0; x =3D y; case. Where does this happen=20
>> inside
>>  the
>>   > > code?
>>   >
>>   > With the caveat that I'm seeing your code, what's probably
>>  happening
>>  is
>>   > that we have either:
>>   >
>>   > BB (a):
>>   > x_5 =3D 1;
>>   > goto BB (c);
>>   >
>>   > BB (b):
>>   > x_4 =3D 0;
>>   > goto BB (c);
>>   >
>>   > BB (c):
>>   > x_2 =3D PHI (x_5 from (a), x_4 from (b));
>>  I compiled it with -g, so this one is like the dumped gimple.
>>   >
>>   > or (at higher optimization levels):
>>   >
>>   > BB (a):
>>   > goto BB (c);
>>   >
>>   > BB (b):
>>   > goto BB (c);
>>   >
>>   > BB (c):
>>   > x_2 =3D PHI (1 from (a), 0 from (b));
>>   >
>>   > and I think that at the phi node we have=20
>> region_model::handle_phi,
>>   > which is setting x_2 to either the constant 1 or the constant 0=20
>> in
>>  the
>>   > store, and is calling the on_phi vfunc, leading to on_phi being
>>  called
>>   > for all state machines.
>>  Thanks, that is the case. The set_value inside handle_phi seems to
>>  this
>>  for me.
>>   >
>>   > BTW, are you implementing an override for this vfunc:
>>   > virtual state_machine::state_t get_default_state (const svalue *)
>>   > const;
>>   >
>>   > to capture the inherently known zeroness/nonzeroness of
>>  constant_svalue
>>   > instances? That would make those constants have that state.
>>  Yeah, I saw that on your nullness check. I tried it on a small
>>  example
>>  with and without, but didn't noticed a difference in warnings=20
>> (except
>>  for not having zero(x_4) inside the supergraph.dot). So if I
>>  understood
>>  this right, this is just to have one state less for that
>>  variable/value[0]?
>=20
> The states are stored in sm_state_map using a mapping from svalue to
> state.
>=20
> Given e.g. a parameter, this will be "initial_svalue (parm)", but in
> the above case where x_4 either has value 1 or has value 0, it's not
> storing a state for x_4; it's storing a state for 0 or a state for 1.
> So without implementing the get_default_state vfunc, the sm_state_maps
> will gradually aquire explicit "this is zero" or "this is nonzero" for
> all of the constants that get used, leading to an explosion of states
> for all the different combinations of constants that have been
> encountered along an execution path, which is probably not going to be
> useful.  If you do implement the get_default_state vfunc for=20
> constants,
> then the constants will implicitly have the zero vs nonzero states,=20
> and
> this information won't get stored in the sm_state_map instances.
>=20
> program_state::can_merge_with_p compares the sm_state_map instances
> between the two program_state instances - and so if the
> get_default_state vfunc is implemented for constants, the peer
> sm_state_maps for the "zero" state machine will both be empty, and=20
> thus
> equal, and thus it will merge state.
>=20
> So it might make sense to revamp this sm_state_map comparison so that
> we consider implicit states in sm_state_maps when merging the store,=20
> so
> that we see that "1" and "0" have different sm-states for the "zero"
> state machine, and thus we shouldn't merge states for these.  See
> binding_cluster::can_merge_p, which calls svalue::can_merge_p.  It
> might make sense to have svalue::can_merge_p "know" about=20
> sm-state-maps
> for this.
>=20
> svalue::can_merge_p calls model_merger::mergeable_svalue_p, so that we
> can reject merging values that explicitly have different state-machine
> states, but that code could be reworked so that it considers whether a
> pair of svalues can be merged, using implicit state-machine states=20
> (and
> thus have it reject the merger of a zero with a nonzero constant).
>=20
> Or, whilst prototyping, you could simply hack in a rejection of=20
> merging
> zero vs non-zero integer values into svalue::can_merge_p.
>=20
>>=20
>>  If that is right, is it also favorable to "merge" the stop state and
>>  non_zero state inside the zero state machine because - for now -
>>  there
>>  is no warning planned on non-zero values?
>>=20
>>  [0] less states are favorable because then the analyzer maybe has
>>  less
>>  different enodes to visit and thus less runtime(?)
>>   >
>>   > Thanks
>>   > Dave
>>   >
>>   > > - Tim
>>=20
>>  Also another question unrelated to the ones before. I do have a=20
>> weird
>>  bug in my zero sm[1] but I'm unsure where my sm is flawed. Take for
>>  example, the probably simplest case:
>>    int x =3D 0;h
>>    printf("%d", 42 / x);
>>  If I use inform to emit a notice for my state machine result, it
>>  seems
>>  to be correct and I do get following traversal order (by printing=20
>> the
>>  gimple stmt inside on_stmt):
>>    x_2 =3D 0;
>>    _1 =3D 42 % x_2;
>>    # .MEM_4 =3D VDEF <.MEM_3(D)>
>>    printf ("%d", _1);
>>    _5 =3D 0;
>>    <L0>:
>>    # VUSE <.MEM_4>
>>    return _5;
>>  But if i use my zero_diagnostics and sm_ctxt->warn to emit the
>>  warning,
>>  I get an unexpected traversal order of:
>>    x_2 =3D 0;
>>    _1 =3D 42 / x_2;
>>    # .MEM_4 =3D VDEF <.MEM_3(D)>
>>    printf ("%d", _1);
>>    _5 =3D 0;
>>    <L0>:
>>    # VUSE <.MEM_4>
>>    return _5;
>>    x_2 =3D 0;              <-- why do I get these stmts again but they
>>  are not as
>>  duplicated in the e-supergraph?
>>    _1 =3D 42 / x_2;
>>    _5 =3D 0;
>>    _1 =3D 42 / x_2;
>>    _5 =3D 0;
>=20
> What exactly is your test source file?
It is just the two lines wrapped into main():
  #include <stdio.h>
  int main(int argc, char **argv) {
    int x =3D 0;
    printf("%d", 42 / x);
    return 0;
  }
>=20
> Note that a function being called by another function will get=20
> explored
> in the exploded_graph twice: once being analyzed "standalone", and
> again being analyzed as called from the other function.
>=20
> When you say "I get these stmts again", do you mean the state machine
> code is seeing them again, or the region model code?
In the state_machine::on_stmt function.
>=20
>   sm_ctxt->warn (new foo)
>=20
> will save an instance of a pending_diagnostic into the
> diagnotic_manager, associating it with a particular exploded_node (and
> some other info).
>=20
> After the exploded_graph has been built, the diagnostic_manager does
> some deduplication work, and then tries to find a feasible path to=20
> each
> pending_diagnostic's exploded node.
>=20
> This feasibility analysis reruns much of the region_model code that=20
> was
> run when building the exploded graph.  So that's another way you can
> "see" stmts repeatedly.
Okay, that probably explains it. I had two distinct problems where I=20
thought they might be both related to seeing the statements again:
    |   13 |   int x =3D 0;
    |      |       ^
    |      |       |
    |      |       (1) set to zero here
    |      |       (3) set to zero here
    |   14 |   printf("%d", 42 / x);
    |      |   ~~~~~~~~~~~~~~~~~~~~
    |      |   |
    |      |   (4) division by zero
    |   15 |
    |   16 |   return 0;
    |      |          ~
    |      |          |
    |      |          (2) set to zero here
    |      |          (5) set to zero here
* In line 13, there are two state changes while I'd expect only one.
=3D> I re-looked at the other state machines and they transfer to the=20
stop state after the warning. If I do the same, my problem in line 13=20
is fixed.
* The other problem in line 16 is that return 0 is not relevant toward=20
the division by zero.
=3D> After looking at the diagnostic code, I think those should be=20
filtered out but aren't because x (in gimple x_2) and return 0 (in=20
gimple _5) share the (int)0 state. Inside=20
diagnostic_manager::prune_for_sm_diagnostic there is a check that the=20
state_change->m_sval is equal to the sval of the saved_diagnostic,=20
which is not enough to filter unrelated events in the zero state=20
machine case.
I tried to filter the event by comparing the state of interest but that=20
doesn't work if the unrelated event is after the related event on the=20
path.
Similarly, I tried to add a new field m_var to state_change_event and=20
provide this as a comparison but in most functions that call the=20
constructor of state_change_event, the var is not available.

Is there is any easy way to filter these unrelated events without=20
considerable changes [1]?

By the way, I can trigger the same behavior on the malloc null_deref=20
with gcc version 12.1.1 shipped in Fedora 36. Assume this simple=20
program:
  #include <stddef.h>
  int main (void)
  {
    int *p =3D NULL;
    *p =3D 42;

    int *q =3D NULL;

    return 0;
  }
results in the following warning:
/home/tim/Projects/simple_c/main.c: In function =91main=92:
/home/tim/Projects/simple_c/main.c:12:6: warning: dereference of NULL=20
=91p=92 [CWE-476] [-Wanalyzer-null-dereference]
   12 |   *p =3D 42;
      |   ~~~^~~~
  =91main=92: events 1-4
    |
    |   11 |   int *p =3D NULL;
    |      |        ^
    |      |        |
    |      |        (1) =91p=92 is NULL
    |   12 |   *p =3D 42;
    |      |   ~~~~~~~
    |      |      |
    |      |      (4) dereference of NULL =91p=92
    |   13 |
    |   14 |   int *q =3D NULL;
    |      |        ~
    |      |        |
    |      |        (2) =91p=92 is NULL
    |      |        (3) =91p=92 is NULL

>=20
>=20
>>  I tracked the cause down the call stack to
>>  m_saved_diagnostics.safe_push (sd); inside
>>  diagnostic_manager::add_diagnostic. For whatever reason, the pushing
>>  of
>>  diagnostics bricks the zero sm. It might be a embarrassing error on
>>  my
>>  side, but I'm stuck on this and can't seem to find what I'm doing
>>  wrong.
>=20
> FWIW, assuming you're using the gdb support scripts:
>  =20
> https://gcc-newbies-guide.readthedocs.io/en/latest/debugging.html#support=
-scripts-for-gdb
> there's a handy:
>=20
> (gdb) break-on-saved-diagnostic
>=20
> command which adds a breakpoint on the analyzer's diagnostic_manager
> saving a pending_diagnostic.
>=20
> You might also what to try -fdump-analyzer-stderr, which will dump=20
> lots
> of information on what the analyzer is doing to stderr, which although
> verbose is usually very helpful in figuring out why the analyzer is
> doing something.
>=20
> Hope this is helpful
Definitely helped me understand more of the internals!

- Tim
> Dave
>=20
>=20
>=20
>>=20
>>  - Tim
>>=20
>>  [1]
>>=20
>> =20
>> https://github.com/timll/gcc/blob/castsize/gcc/analyzer/sm-zero.cc#L181
>>=20
>>=20
>=20
>=20