* fanalyzer: debugging zero state machine @ 2022-06-09 14:49 Tim Lange 2022-06-09 17:40 ` David Malcolm 0 siblings, 1 reply; 4+ messages in thread From: Tim Lange @ 2022-06-09 14:49 UTC (permalink / raw) To: dmalcolm, GCC Mailing List > On Mi, Jun 8 2022 at 11:12:52 -0400, David Malcolm <dmalcolm@redhat.com> wrote: > > On Wed, 2022-06-08 at 01:42 +0200, Tim Lange wrote: > > > > Hi Dave, > > > > I did spent some time to think about the zero state machine. I first > > thought about distinguishing between "assigned zero" and "EQ 0 > > condition on the path" for cases where e.g. unreachable() is used to > > say that some variable will never be zero according to the programmer. > > In that case the dev might not want zero warnings from conditions > > outside the if body itself for dev build where unreachable expands to > > some panic exit. But as the condition constraint is not pruned when the > > state machine is distinguishing the states, I'm not sure how to know > > whether the analysis already left the if body? > > The analyzer works on the gimple-ssa representation, which uses basic > blocks in a CFG, rather than an AST, so the only remants we have of > scope is in "clobber" statements (a special kind of assignment stmt), > which the gimplify adds as variables go out of scope. If the constraints only lived until the immediate dominator of `if (cond)`, I could easily distinguish: 1. if (x == 0) && still inside the if => zero 2. if (x == 0) && outside if => maybe zero but as this seems to be not the case, if I want to distinguish 1. & 2., I'd have to find another way. > > For pruning, the analyzer's state_machine class has a "can_purge_p" > virtual function: > > /* Return true if it safe to discard the given state (to help > when simplifying state objects). > States that need leak detection should return false. */ > virtual bool can_purge_p (state_t s) const = 0; > > which should return true for a "zeroness" state machine, in that we > always consider pruning states for svalues that aren't needed anymore > along a path. Is implemented and returns true. > > Is there some other kind of state explosion you're running into? It's > hard to figure this out further without seeing code. No, my code is by far not that mature to be tested. I just had in my head that I wanted to find out if I can distinguish the two cases. > > > > Also, while trying out different things, it seems simple assignments on > > phi like here > > int x; > > if (argc == 1) { > > x = 1; // x_5 > > } else { > > x = 0; // x_4 > > } > > printf("%i", 5 / x); // x_2 > > automatically work such that x_2 already inherits the state from > > x_4/x_5 without me doing anything inside my sm's on_phi function. Same > > for the simple y = 0; x = y; case. Where does this happen inside the > > code? > > With the caveat that I'm seeing your code, what's probably happening is > that we have either: > > BB (a): > x_5 = 1; > goto BB (c); > > BB (b): > x_4 = 0; > goto BB (c); > > BB (c): > x_2 = PHI (x_5 from (a), x_4 from (b)); I compiled it with -g, so this one is like the dumped gimple. > > or (at higher optimization levels): > > BB (a): > goto BB (c); > > BB (b): > goto BB (c); > > BB (c): > x_2 = PHI (1 from (a), 0 from (b)); > > and I think that at the phi node we have region_model::handle_phi, > which is setting x_2 to either the constant 1 or the constant 0 in the > store, and is calling the on_phi vfunc, leading to on_phi being called > for all state machines. Thanks, that is the case. The set_value inside handle_phi seems to this for me. > > BTW, are you implementing an override for this vfunc: > virtual state_machine::state_t get_default_state (const svalue *) > const; > > to capture the inherently known zeroness/nonzeroness of constant_svalue > instances? That would make those constants have that state. Yeah, I saw that on your nullness check. I tried it on a small example with and without, but didn't noticed a difference in warnings (except for not having zero(x_4) inside the supergraph.dot). So if I understood this right, this is just to have one state less for that variable/value[0]? If that is right, is it also favorable to "merge" the stop state and non_zero state inside the zero state machine because - for now - there is no warning planned on non-zero values? [0] less states are favorable because then the analyzer maybe has less different enodes to visit and thus less runtime(?) > > Thanks > Dave > > > - Tim Also another question unrelated to the ones before. I do have a weird bug in my zero sm[1] but I'm unsure where my sm is flawed. Take for example, the probably simplest case: int x = 0;h printf("%d", 42 / x); If I use inform to emit a notice for my state machine result, it seems to be correct and I do get following traversal order (by printing the gimple stmt inside on_stmt): x_2 = 0; _1 = 42 % x_2; # .MEM_4 = VDEF <.MEM_3(D)> printf ("%d", _1); _5 = 0; <L0>: # VUSE <.MEM_4> return _5; But if i use my zero_diagnostics and sm_ctxt->warn to emit the warning, I get an unexpected traversal order of: x_2 = 0; _1 = 42 / x_2; # .MEM_4 = VDEF <.MEM_3(D)> printf ("%d", _1); _5 = 0; <L0>: # VUSE <.MEM_4> return _5; x_2 = 0; <-- why do I get these stmts again but they are not as duplicated in the e-supergraph? _1 = 42 / x_2; _5 = 0; _1 = 42 / x_2; _5 = 0; I tracked the cause down the call stack to m_saved_diagnostics.safe_push (sd); inside diagnostic_manager::add_diagnostic. For whatever reason, the pushing of diagnostics bricks the zero sm. It might be a embarrassing error on my side, but I'm stuck on this and can't seem to find what I'm doing wrong. - Tim [1] https://github.com/timll/gcc/blob/castsize/gcc/analyzer/sm-zero.cc#L181 ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: fanalyzer: debugging zero state machine 2022-06-09 14:49 fanalyzer: debugging zero state machine Tim Lange @ 2022-06-09 17:40 ` David Malcolm 2022-06-12 18:27 ` Tim Lange 0 siblings, 1 reply; 4+ messages in thread From: David Malcolm @ 2022-06-09 17:40 UTC (permalink / raw) To: Tim Lange, GCC Mailing List On Thu, 2022-06-09 at 16:49 +0200, Tim Lange wrote: > > > On Mi, Jun 8 2022 at 11:12:52 -0400, David Malcolm > <dmalcolm@redhat.com> wrote: > > > On Wed, 2022-06-08 at 01:42 +0200, Tim Lange wrote: > > > > > > Hi Dave, Hi Tim; various responses inline below... > > > > > > I did spent some time to think about the zero state machine. I > first > > > thought about distinguishing between "assigned zero" and "EQ 0 > > > condition on the path" for cases where e.g. unreachable() is > used > to > > > say that some variable will never be zero according to the > programmer. > > > In that case the dev might not want zero warnings from > conditions > > > outside the if body itself for dev build where unreachable > expands > to > > > some panic exit. But as the condition constraint is not pruned > when the > > > state machine is distinguishing the states, I'm not sure how to > know > > > whether the analysis already left the if body? > > > > The analyzer works on the gimple-ssa representation, which uses > basic > > blocks in a CFG, rather than an AST, so the only remants we have > of > > scope is in "clobber" statements (a special kind of assignment > stmt), > > which the gimplify adds as variables go out of scope. > If the constraints only lived until the immediate dominator of `if > (cond)`, I could easily distinguish: > 1. if (x == 0) && still inside the if => zero > 2. if (x == 0) && outside if => maybe zero > but as this seems to be not the case, if I want to distinguish 1. & > 2., > I'd have to find another way. > > > > For pruning, the analyzer's state_machine class has a > "can_purge_p" > > virtual function: > > > > /* Return true if it safe to discard the given state (to help > > when simplifying state objects). > > States that need leak detection should return false. */ > > virtual bool can_purge_p (state_t s) const = 0; > > > > which should return true for a "zeroness" state machine, in that > we > > always consider pruning states for svalues that aren't needed > anymore > > along a path. > Is implemented and returns true. > > > > Is there some other kind of state explosion you're running into? > It's > > hard to figure this out further without seeing code. > No, my code is by far not that mature to be tested. I just had in my > head that I wanted to find out if I can distinguish the two cases. > > > > > > > Also, while trying out different things, it seems simple > assignments on > > > phi like here > > > int x; > > > if (argc == 1) { > > > x = 1; // x_5 > > > } else { > > > x = 0; // x_4 > > > } > > > printf("%i", 5 / x); // x_2 > > > automatically work such that x_2 already inherits the state from > > > x_4/x_5 without me doing anything inside my sm's on_phi > function. > Same > > > for the simple y = 0; x = y; case. Where does this happen inside > the > > > code? > > > > With the caveat that I'm seeing your code, what's probably > happening > is > > that we have either: > > > > BB (a): > > x_5 = 1; > > goto BB (c); > > > > BB (b): > > x_4 = 0; > > goto BB (c); > > > > BB (c): > > x_2 = PHI (x_5 from (a), x_4 from (b)); > I compiled it with -g, so this one is like the dumped gimple. > > > > or (at higher optimization levels): > > > > BB (a): > > goto BB (c); > > > > BB (b): > > goto BB (c); > > > > BB (c): > > x_2 = PHI (1 from (a), 0 from (b)); > > > > and I think that at the phi node we have region_model::handle_phi, > > which is setting x_2 to either the constant 1 or the constant 0 in > the > > store, and is calling the on_phi vfunc, leading to on_phi being > called > > for all state machines. > Thanks, that is the case. The set_value inside handle_phi seems to > this > for me. > > > > BTW, are you implementing an override for this vfunc: > > virtual state_machine::state_t get_default_state (const svalue *) > > const; > > > > to capture the inherently known zeroness/nonzeroness of > constant_svalue > > instances? That would make those constants have that state. > Yeah, I saw that on your nullness check. I tried it on a small > example > with and without, but didn't noticed a difference in warnings (except > for not having zero(x_4) inside the supergraph.dot). So if I > understood > this right, this is just to have one state less for that > variable/value[0]? The states are stored in sm_state_map using a mapping from svalue to state. Given e.g. a parameter, this will be "initial_svalue (parm)", but in the above case where x_4 either has value 1 or has value 0, it's not storing a state for x_4; it's storing a state for 0 or a state for 1. So without implementing the get_default_state vfunc, the sm_state_maps will gradually aquire explicit "this is zero" or "this is nonzero" for all of the constants that get used, leading to an explosion of states for all the different combinations of constants that have been encountered along an execution path, which is probably not going to be useful. If you do implement the get_default_state vfunc for constants, then the constants will implicitly have the zero vs nonzero states, and this information won't get stored in the sm_state_map instances. program_state::can_merge_with_p compares the sm_state_map instances between the two program_state instances - and so if the get_default_state vfunc is implemented for constants, the peer sm_state_maps for the "zero" state machine will both be empty, and thus equal, and thus it will merge state. So it might make sense to revamp this sm_state_map comparison so that we consider implicit states in sm_state_maps when merging the store, so that we see that "1" and "0" have different sm-states for the "zero" state machine, and thus we shouldn't merge states for these. See binding_cluster::can_merge_p, which calls svalue::can_merge_p. It might make sense to have svalue::can_merge_p "know" about sm-state-maps for this. svalue::can_merge_p calls model_merger::mergeable_svalue_p, so that we can reject merging values that explicitly have different state-machine states, but that code could be reworked so that it considers whether a pair of svalues can be merged, using implicit state-machine states (and thus have it reject the merger of a zero with a nonzero constant). Or, whilst prototyping, you could simply hack in a rejection of merging zero vs non-zero integer values into svalue::can_merge_p. > > If that is right, is it also favorable to "merge" the stop state and > non_zero state inside the zero state machine because - for now - > there > is no warning planned on non-zero values? > > [0] less states are favorable because then the analyzer maybe has > less > different enodes to visit and thus less runtime(?) > > > > Thanks > > Dave > > > > > - Tim > > Also another question unrelated to the ones before. I do have a weird > bug in my zero sm[1] but I'm unsure where my sm is flawed. Take for > example, the probably simplest case: > int x = 0;h > printf("%d", 42 / x); > If I use inform to emit a notice for my state machine result, it > seems > to be correct and I do get following traversal order (by printing the > gimple stmt inside on_stmt): > x_2 = 0; > _1 = 42 % x_2; > # .MEM_4 = VDEF <.MEM_3(D)> > printf ("%d", _1); > _5 = 0; > <L0>: > # VUSE <.MEM_4> > return _5; > But if i use my zero_diagnostics and sm_ctxt->warn to emit the > warning, > I get an unexpected traversal order of: > x_2 = 0; > _1 = 42 / x_2; > # .MEM_4 = VDEF <.MEM_3(D)> > printf ("%d", _1); > _5 = 0; > <L0>: > # VUSE <.MEM_4> > return _5; > x_2 = 0; <-- why do I get these stmts again but they > are not as > duplicated in the e-supergraph? > _1 = 42 / x_2; > _5 = 0; > _1 = 42 / x_2; > _5 = 0; What exactly is your test source file? Note that a function being called by another function will get explored in the exploded_graph twice: once being analyzed "standalone", and again being analyzed as called from the other function. When you say "I get these stmts again", do you mean the state machine code is seeing them again, or the region model code? sm_ctxt->warn (new foo) will save an instance of a pending_diagnostic into the diagnotic_manager, associating it with a particular exploded_node (and some other info). After the exploded_graph has been built, the diagnostic_manager does some deduplication work, and then tries to find a feasible path to each pending_diagnostic's exploded node. This feasibility analysis reruns much of the region_model code that was run when building the exploded graph. So that's another way you can "see" stmts repeatedly. > I tracked the cause down the call stack to > m_saved_diagnostics.safe_push (sd); inside > diagnostic_manager::add_diagnostic. For whatever reason, the pushing > of > diagnostics bricks the zero sm. It might be a embarrassing error on > my > side, but I'm stuck on this and can't seem to find what I'm doing > wrong. FWIW, assuming you're using the gdb support scripts: https://gcc-newbies-guide.readthedocs.io/en/latest/debugging.html#support-scripts-for-gdb there's a handy: (gdb) break-on-saved-diagnostic command which adds a breakpoint on the analyzer's diagnostic_manager saving a pending_diagnostic. You might also what to try -fdump-analyzer-stderr, which will dump lots of information on what the analyzer is doing to stderr, which although verbose is usually very helpful in figuring out why the analyzer is doing something. Hope this is helpful Dave > > - Tim > > [1] > > https://github.com/timll/gcc/blob/castsize/gcc/analyzer/sm-zero.cc#L181 > > ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: fanalyzer: debugging zero state machine 2022-06-09 17:40 ` David Malcolm @ 2022-06-12 18:27 ` Tim Lange 2022-06-13 19:25 ` David Malcolm 0 siblings, 1 reply; 4+ messages in thread From: Tim Lange @ 2022-06-12 18:27 UTC (permalink / raw) To: David Malcolm; +Cc: GCC Mailing List On Do, Jun 9 2022 at 13:40:06 -0400, David Malcolm <dmalcolm@redhat.com> wrote: > On Thu, 2022-06-09 at 16:49 +0200, Tim Lange wrote: >> >> > On Mi, Jun 8 2022 at 11:12:52 -0400, David Malcolm >> <dmalcolm@redhat.com> wrote: >> > > On Wed, 2022-06-08 at 01:42 +0200, Tim Lange wrote: >> > > >> > > Hi Dave, > > Hi Tim; various responses inline below... > >> > > >> > > I did spent some time to think about the zero state machine. I >> first >> > > thought about distinguishing between "assigned zero" and "EQ 0 >> > > condition on the path" for cases where e.g. unreachable() is >> used >> to >> > > say that some variable will never be zero according to the >> programmer. >> > > In that case the dev might not want zero warnings from >> conditions >> > > outside the if body itself for dev build where unreachable >> expands >> to >> > > some panic exit. But as the condition constraint is not pruned >> when the >> > > state machine is distinguishing the states, I'm not sure how to >> know >> > > whether the analysis already left the if body? >> > >> > The analyzer works on the gimple-ssa representation, which uses >> basic >> > blocks in a CFG, rather than an AST, so the only remants we have >> of >> > scope is in "clobber" statements (a special kind of assignment >> stmt), >> > which the gimplify adds as variables go out of scope. >> If the constraints only lived until the immediate dominator of `if >> (cond)`, I could easily distinguish: >> 1. if (x == 0) && still inside the if => zero >> 2. if (x == 0) && outside if => maybe zero >> but as this seems to be not the case, if I want to distinguish 1. & >> 2., >> I'd have to find another way. >> > >> > For pruning, the analyzer's state_machine class has a >> "can_purge_p" >> > virtual function: >> > >> > /* Return true if it safe to discard the given state (to help >> > when simplifying state objects). >> > States that need leak detection should return false. */ >> > virtual bool can_purge_p (state_t s) const = 0; >> > >> > which should return true for a "zeroness" state machine, in that >> we >> > always consider pruning states for svalues that aren't needed >> anymore >> > along a path. >> Is implemented and returns true. >> > >> > Is there some other kind of state explosion you're running into? >> It's >> > hard to figure this out further without seeing code. >> No, my code is by far not that mature to be tested. I just had in my >> head that I wanted to find out if I can distinguish the two cases. >> > >> > >> > > Also, while trying out different things, it seems simple >> assignments on >> > > phi like here >> > > int x; >> > > if (argc == 1) { >> > > x = 1; // x_5 >> > > } else { >> > > x = 0; // x_4 >> > > } >> > > printf("%i", 5 / x); // x_2 >> > > automatically work such that x_2 already inherits the state >> from >> > > x_4/x_5 without me doing anything inside my sm's on_phi >> function. >> Same >> > > for the simple y = 0; x = y; case. Where does this happen >> inside >> the >> > > code? >> > >> > With the caveat that I'm seeing your code, what's probably >> happening >> is >> > that we have either: >> > >> > BB (a): >> > x_5 = 1; >> > goto BB (c); >> > >> > BB (b): >> > x_4 = 0; >> > goto BB (c); >> > >> > BB (c): >> > x_2 = PHI (x_5 from (a), x_4 from (b)); >> I compiled it with -g, so this one is like the dumped gimple. >> > >> > or (at higher optimization levels): >> > >> > BB (a): >> > goto BB (c); >> > >> > BB (b): >> > goto BB (c); >> > >> > BB (c): >> > x_2 = PHI (1 from (a), 0 from (b)); >> > >> > and I think that at the phi node we have >> region_model::handle_phi, >> > which is setting x_2 to either the constant 1 or the constant 0 >> in >> the >> > store, and is calling the on_phi vfunc, leading to on_phi being >> called >> > for all state machines. >> Thanks, that is the case. The set_value inside handle_phi seems to >> this >> for me. >> > >> > BTW, are you implementing an override for this vfunc: >> > virtual state_machine::state_t get_default_state (const svalue *) >> > const; >> > >> > to capture the inherently known zeroness/nonzeroness of >> constant_svalue >> > instances? That would make those constants have that state. >> Yeah, I saw that on your nullness check. I tried it on a small >> example >> with and without, but didn't noticed a difference in warnings >> (except >> for not having zero(x_4) inside the supergraph.dot). So if I >> understood >> this right, this is just to have one state less for that >> variable/value[0]? > > The states are stored in sm_state_map using a mapping from svalue to > state. > > Given e.g. a parameter, this will be "initial_svalue (parm)", but in > the above case where x_4 either has value 1 or has value 0, it's not > storing a state for x_4; it's storing a state for 0 or a state for 1. > So without implementing the get_default_state vfunc, the sm_state_maps > will gradually aquire explicit "this is zero" or "this is nonzero" for > all of the constants that get used, leading to an explosion of states > for all the different combinations of constants that have been > encountered along an execution path, which is probably not going to be > useful. If you do implement the get_default_state vfunc for > constants, > then the constants will implicitly have the zero vs nonzero states, > and > this information won't get stored in the sm_state_map instances. > > program_state::can_merge_with_p compares the sm_state_map instances > between the two program_state instances - and so if the > get_default_state vfunc is implemented for constants, the peer > sm_state_maps for the "zero" state machine will both be empty, and > thus > equal, and thus it will merge state. > > So it might make sense to revamp this sm_state_map comparison so that > we consider implicit states in sm_state_maps when merging the store, > so > that we see that "1" and "0" have different sm-states for the "zero" > state machine, and thus we shouldn't merge states for these. See > binding_cluster::can_merge_p, which calls svalue::can_merge_p. It > might make sense to have svalue::can_merge_p "know" about > sm-state-maps > for this. > > svalue::can_merge_p calls model_merger::mergeable_svalue_p, so that we > can reject merging values that explicitly have different state-machine > states, but that code could be reworked so that it considers whether a > pair of svalues can be merged, using implicit state-machine states > (and > thus have it reject the merger of a zero with a nonzero constant). > > Or, whilst prototyping, you could simply hack in a rejection of > merging > zero vs non-zero integer values into svalue::can_merge_p. > >> >> If that is right, is it also favorable to "merge" the stop state and >> non_zero state inside the zero state machine because - for now - >> there >> is no warning planned on non-zero values? >> >> [0] less states are favorable because then the analyzer maybe has >> less >> different enodes to visit and thus less runtime(?) >> > >> > Thanks >> > Dave >> > >> > > - Tim >> >> Also another question unrelated to the ones before. I do have a >> weird >> bug in my zero sm[1] but I'm unsure where my sm is flawed. Take for >> example, the probably simplest case: >> int x = 0;h >> printf("%d", 42 / x); >> If I use inform to emit a notice for my state machine result, it >> seems >> to be correct and I do get following traversal order (by printing >> the >> gimple stmt inside on_stmt): >> x_2 = 0; >> _1 = 42 % x_2; >> # .MEM_4 = VDEF <.MEM_3(D)> >> printf ("%d", _1); >> _5 = 0; >> <L0>: >> # VUSE <.MEM_4> >> return _5; >> But if i use my zero_diagnostics and sm_ctxt->warn to emit the >> warning, >> I get an unexpected traversal order of: >> x_2 = 0; >> _1 = 42 / x_2; >> # .MEM_4 = VDEF <.MEM_3(D)> >> printf ("%d", _1); >> _5 = 0; >> <L0>: >> # VUSE <.MEM_4> >> return _5; >> x_2 = 0; <-- why do I get these stmts again but they >> are not as >> duplicated in the e-supergraph? >> _1 = 42 / x_2; >> _5 = 0; >> _1 = 42 / x_2; >> _5 = 0; > > What exactly is your test source file? It is just the two lines wrapped into main(): #include <stdio.h> int main(int argc, char **argv) { int x = 0; printf("%d", 42 / x); return 0; } > > Note that a function being called by another function will get > explored > in the exploded_graph twice: once being analyzed "standalone", and > again being analyzed as called from the other function. > > When you say "I get these stmts again", do you mean the state machine > code is seeing them again, or the region model code? In the state_machine::on_stmt function. > > sm_ctxt->warn (new foo) > > will save an instance of a pending_diagnostic into the > diagnotic_manager, associating it with a particular exploded_node (and > some other info). > > After the exploded_graph has been built, the diagnostic_manager does > some deduplication work, and then tries to find a feasible path to > each > pending_diagnostic's exploded node. > > This feasibility analysis reruns much of the region_model code that > was > run when building the exploded graph. So that's another way you can > "see" stmts repeatedly. Okay, that probably explains it. I had two distinct problems where I thought they might be both related to seeing the statements again: | 13 | int x = 0; | | ^ | | | | | (1) set to zero here | | (3) set to zero here | 14 | printf("%d", 42 / x); | | ~~~~~~~~~~~~~~~~~~~~ | | | | | (4) division by zero | 15 | | 16 | return 0; | | ~ | | | | | (2) set to zero here | | (5) set to zero here * In line 13, there are two state changes while I'd expect only one. => I re-looked at the other state machines and they transfer to the stop state after the warning. If I do the same, my problem in line 13 is fixed. * The other problem in line 16 is that return 0 is not relevant toward the division by zero. => After looking at the diagnostic code, I think those should be filtered out but aren't because x (in gimple x_2) and return 0 (in gimple _5) share the (int)0 state. Inside diagnostic_manager::prune_for_sm_diagnostic there is a check that the state_change->m_sval is equal to the sval of the saved_diagnostic, which is not enough to filter unrelated events in the zero state machine case. I tried to filter the event by comparing the state of interest but that doesn't work if the unrelated event is after the related event on the path. Similarly, I tried to add a new field m_var to state_change_event and provide this as a comparison but in most functions that call the constructor of state_change_event, the var is not available. Is there is any easy way to filter these unrelated events without considerable changes [1]? By the way, I can trigger the same behavior on the malloc null_deref with gcc version 12.1.1 shipped in Fedora 36. Assume this simple program: #include <stddef.h> int main (void) { int *p = NULL; *p = 42; int *q = NULL; return 0; } results in the following warning: /home/tim/Projects/simple_c/main.c: In function ‘main’: /home/tim/Projects/simple_c/main.c:12:6: warning: dereference of NULL ‘p’ [CWE-476] [-Wanalyzer-null-dereference] 12 | *p = 42; | ~~~^~~~ ‘main’: events 1-4 | | 11 | int *p = NULL; | | ^ | | | | | (1) ‘p’ is NULL | 12 | *p = 42; | | ~~~~~~~ | | | | | (4) dereference of NULL ‘p’ | 13 | | 14 | int *q = NULL; | | ~ | | | | | (2) ‘p’ is NULL | | (3) ‘p’ is NULL > > >> I tracked the cause down the call stack to >> m_saved_diagnostics.safe_push (sd); inside >> diagnostic_manager::add_diagnostic. For whatever reason, the pushing >> of >> diagnostics bricks the zero sm. It might be a embarrassing error on >> my >> side, but I'm stuck on this and can't seem to find what I'm doing >> wrong. > > FWIW, assuming you're using the gdb support scripts: > > https://gcc-newbies-guide.readthedocs.io/en/latest/debugging.html#support-scripts-for-gdb > there's a handy: > > (gdb) break-on-saved-diagnostic > > command which adds a breakpoint on the analyzer's diagnostic_manager > saving a pending_diagnostic. > > You might also what to try -fdump-analyzer-stderr, which will dump > lots > of information on what the analyzer is doing to stderr, which although > verbose is usually very helpful in figuring out why the analyzer is > doing something. > > Hope this is helpful Definitely helped me understand more of the internals! - Tim > Dave > > > >> >> - Tim >> >> [1] >> >> >> https://github.com/timll/gcc/blob/castsize/gcc/analyzer/sm-zero.cc#L181 >> >> > > ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: fanalyzer: debugging zero state machine 2022-06-12 18:27 ` Tim Lange @ 2022-06-13 19:25 ` David Malcolm 0 siblings, 0 replies; 4+ messages in thread From: David Malcolm @ 2022-06-13 19:25 UTC (permalink / raw) To: Tim Lange; +Cc: GCC Mailing List On Sun, 2022-06-12 at 20:27 +0200, Tim Lange wrote: > > > On Do, Jun 9 2022 at 13:40:06 -0400, David Malcolm > <dmalcolm@redhat.com> wrote: > > On Thu, 2022-06-09 at 16:49 +0200, Tim Lange wrote: > > > > > > > On Mi, Jun 8 2022 at 11:12:52 -0400, David Malcolm > > > <dmalcolm@redhat.com> wrote: > > > > > On Wed, 2022-06-08 at 01:42 +0200, Tim Lange wrote: > > > > > > > > > > Hi Dave, > > > > Hi Tim; various responses inline below... > > > > > > > > > > > > I did spent some time to think about the zero state machine. > > > I > > > first > > > > > thought about distinguishing between "assigned zero" and "EQ > > > 0 > > > > > condition on the path" for cases where e.g. unreachable() is > > > used > > > to > > > > > say that some variable will never be zero according to the > > > programmer. > > > > > In that case the dev might not want zero warnings from > > > conditions > > > > > outside the if body itself for dev build where unreachable > > > expands > > > to > > > > > some panic exit. But as the condition constraint is not > > > pruned > > > when the > > > > > state machine is distinguishing the states, I'm not sure how > > > to > > > know > > > > > whether the analysis already left the if body? > > > > > > > > The analyzer works on the gimple-ssa representation, which uses > > > basic > > > > blocks in a CFG, rather than an AST, so the only remants we > > > have > > > of > > > > scope is in "clobber" statements (a special kind of assignment > > > stmt), > > > > which the gimplify adds as variables go out of scope. > > > If the constraints only lived until the immediate dominator of `if > > > (cond)`, I could easily distinguish: > > > 1. if (x == 0) && still inside the if => zero > > > 2. if (x == 0) && outside if => maybe zero > > > but as this seems to be not the case, if I want to distinguish 1. > > > & > > > 2., > > > I'd have to find another way. > > > > > > > > For pruning, the analyzer's state_machine class has a > > > "can_purge_p" > > > > virtual function: > > > > > > > > /* Return true if it safe to discard the given state (to help > > > > when simplifying state objects). > > > > States that need leak detection should return false. */ > > > > virtual bool can_purge_p (state_t s) const = 0; > > > > > > > > which should return true for a "zeroness" state machine, in > > > that > > > we > > > > always consider pruning states for svalues that aren't needed > > > anymore > > > > along a path. > > > Is implemented and returns true. > > > > > > > > Is there some other kind of state explosion you're running > > > into? > > > It's > > > > hard to figure this out further without seeing code. > > > No, my code is by far not that mature to be tested. I just had in > > > my > > > head that I wanted to find out if I can distinguish the two cases. > > > > > > > > > > > > > Also, while trying out different things, it seems simple > > > assignments on > > > > > phi like here > > > > > int x; > > > > > if (argc == 1) { > > > > > x = 1; // x_5 > > > > > } else { > > > > > x = 0; // x_4 > > > > > } > > > > > printf("%i", 5 / x); // x_2 > > > > > automatically work such that x_2 already inherits the state > > > from > > > > > x_4/x_5 without me doing anything inside my sm's on_phi > > > function. > > > Same > > > > > for the simple y = 0; x = y; case. Where does this happen > > > inside > > > the > > > > > code? > > > > > > > > With the caveat that I'm seeing your code, what's probably > > > happening > > > is > > > > that we have either: > > > > > > > > BB (a): > > > > x_5 = 1; > > > > goto BB (c); > > > > > > > > BB (b): > > > > x_4 = 0; > > > > goto BB (c); > > > > > > > > BB (c): > > > > x_2 = PHI (x_5 from (a), x_4 from (b)); > > > I compiled it with -g, so this one is like the dumped gimple. > > > > > > > > or (at higher optimization levels): > > > > > > > > BB (a): > > > > goto BB (c); > > > > > > > > BB (b): > > > > goto BB (c); > > > > > > > > BB (c): > > > > x_2 = PHI (1 from (a), 0 from (b)); > > > > > > > > and I think that at the phi node we have > > > region_model::handle_phi, > > > > which is setting x_2 to either the constant 1 or the constant 0 > > > in > > > the > > > > store, and is calling the on_phi vfunc, leading to on_phi being > > > called > > > > for all state machines. > > > Thanks, that is the case. The set_value inside handle_phi seems to > > > this > > > for me. > > > > > > > > BTW, are you implementing an override for this vfunc: > > > > virtual state_machine::state_t get_default_state (const svalue > > > *) > > > > const; > > > > > > > > to capture the inherently known zeroness/nonzeroness of > > > constant_svalue > > > > instances? That would make those constants have that state. > > > Yeah, I saw that on your nullness check. I tried it on a small > > > example > > > with and without, but didn't noticed a difference in warnings > > > (except > > > for not having zero(x_4) inside the supergraph.dot). So if I > > > understood > > > this right, this is just to have one state less for that > > > variable/value[0]? > > > > The states are stored in sm_state_map using a mapping from svalue to > > state. > > > > Given e.g. a parameter, this will be "initial_svalue (parm)", but in > > the above case where x_4 either has value 1 or has value 0, it's not > > storing a state for x_4; it's storing a state for 0 or a state for 1. > > So without implementing the get_default_state vfunc, the > > sm_state_maps > > will gradually aquire explicit "this is zero" or "this is nonzero" > > for > > all of the constants that get used, leading to an explosion of states > > for all the different combinations of constants that have been > > encountered along an execution path, which is probably not going to > > be > > useful. If you do implement the get_default_state vfunc for > > constants, > > then the constants will implicitly have the zero vs nonzero states, > > and > > this information won't get stored in the sm_state_map instances. > > > > program_state::can_merge_with_p compares the sm_state_map instances > > between the two program_state instances - and so if the > > get_default_state vfunc is implemented for constants, the peer > > sm_state_maps for the "zero" state machine will both be empty, and > > thus > > equal, and thus it will merge state. > > > > So it might make sense to revamp this sm_state_map comparison so that > > we consider implicit states in sm_state_maps when merging the store, > > so > > that we see that "1" and "0" have different sm-states for the "zero" > > state machine, and thus we shouldn't merge states for these. See > > binding_cluster::can_merge_p, which calls svalue::can_merge_p. It > > might make sense to have svalue::can_merge_p "know" about > > sm-state-maps > > for this. > > > > svalue::can_merge_p calls model_merger::mergeable_svalue_p, so that > > we > > can reject merging values that explicitly have different state- > > machine > > states, but that code could be reworked so that it considers whether > > a > > pair of svalues can be merged, using implicit state-machine states > > (and > > thus have it reject the merger of a zero with a nonzero constant). > > > > Or, whilst prototyping, you could simply hack in a rejection of > > merging > > zero vs non-zero integer values into svalue::can_merge_p. > > > > > > > > If that is right, is it also favorable to "merge" the stop state > > > and > > > non_zero state inside the zero state machine because - for now - > > > there > > > is no warning planned on non-zero values? > > > > > > [0] less states are favorable because then the analyzer maybe has > > > less > > > different enodes to visit and thus less runtime(?) > > > > > > > > Thanks > > > > Dave > > > > > > > > > - Tim > > > > > > Also another question unrelated to the ones before. I do have a > > > weird > > > bug in my zero sm[1] but I'm unsure where my sm is flawed. Take > > > for > > > example, the probably simplest case: > > > int x = 0;h > > > printf("%d", 42 / x); > > > If I use inform to emit a notice for my state machine result, it > > > seems > > > to be correct and I do get following traversal order (by printing > > > the > > > gimple stmt inside on_stmt): > > > x_2 = 0; > > > _1 = 42 % x_2; > > > # .MEM_4 = VDEF <.MEM_3(D)> > > > printf ("%d", _1); > > > _5 = 0; > > > <L0>: > > > # VUSE <.MEM_4> > > > return _5; > > > But if i use my zero_diagnostics and sm_ctxt->warn to emit the > > > warning, > > > I get an unexpected traversal order of: > > > x_2 = 0; > > > _1 = 42 / x_2; > > > # .MEM_4 = VDEF <.MEM_3(D)> > > > printf ("%d", _1); > > > _5 = 0; > > > <L0>: > > > # VUSE <.MEM_4> > > > return _5; > > > x_2 = 0; <-- why do I get these stmts again but > > > they > > > are not as > > > duplicated in the e-supergraph? > > > _1 = 42 / x_2; > > > _5 = 0; > > > _1 = 42 / x_2; > > > _5 = 0; > > > > What exactly is your test source file? > It is just the two lines wrapped into main(): > #include <stdio.h> > int main(int argc, char **argv) { > int x = 0; > printf("%d", 42 / x); > return 0; > } > > > > Note that a function being called by another function will get > > explored > > in the exploded_graph twice: once being analyzed "standalone", and > > again being analyzed as called from the other function. > > > > When you say "I get these stmts again", do you mean the state machine > > code is seeing them again, or the region model code? > In the state_machine::on_stmt function. > > > > sm_ctxt->warn (new foo) > > > > will save an instance of a pending_diagnostic into the > > diagnotic_manager, associating it with a particular exploded_node > > (and > > some other info). > > > > After the exploded_graph has been built, the diagnostic_manager does > > some deduplication work, and then tries to find a feasible path to > > each > > pending_diagnostic's exploded node. > > > > This feasibility analysis reruns much of the region_model code that > > was > > run when building the exploded graph. So that's another way you can > > "see" stmts repeatedly. > Okay, that probably explains it. I had two distinct problems where I > thought they might be both related to seeing the statements again: > | 13 | int x = 0; > | | ^ > | | | > | | (1) set to zero here > | | (3) set to zero here > | 14 | printf("%d", 42 / x); > | | ~~~~~~~~~~~~~~~~~~~~ > | | | > | | (4) division by zero > | 15 | > | 16 | return 0; > | | ~ > | | | > | | (2) set to zero here > | | (5) set to zero here Aha: when you say "seeing the statements again", you're referring to the diagnostic_path that's emitted. gcc/analyzer/diagnostic-manager.cc finds a suitable path through the exploded graph for the warning, and then walks the path, generating a list of checker_event instances. It then "optimizes" this list of checker_events, and the result is what then gets visualized in ASCII art. Something really weird's happening in the above. The way diagnostic-manager.cc generates events could definitely be improved - it tries to track the state of interest, and thus generate events that relate to it, but it only knows about state machine changes, not region_model changes, and it doesn't do a good job of the former. I've got some ideas for rewriting it which I'm hoping to get to later this month, once I finish my work on SARIF. Basically, my plan is to update the exploded_graph exploration so that I capture a list of state changes on each edge, capturing the sequence of how each source state became each destination state, in a reversible way, so that for any given diagnostic I hope we'll be able to walk backwards along a graph and see what the changes were, and where they came from. Given that, I think it's best if you put your state machine work on hold for now, and focus on the other tests you were interested in. FWIW, I just added some more ideas to the tracker bug here: https://gcc.gnu.org/bugzilla/showdependencytree.cgi?id=105887&hide_resolved=1 at least one of which looks fairly easy (105947). > * In line 13, there are two state changes while I'd expect only one. > => I re-looked at the other state machines and they transfer to the > stop state after the warning. If I do the same, my problem in line 13 > is fixed. > * The other problem in line 16 is that return 0 is not relevant toward > the division by zero. > => After looking at the diagnostic code, I think those should be > filtered out but aren't because x (in gimple x_2) and return 0 (in > gimple _5) share the (int)0 state. Inside > diagnostic_manager::prune_for_sm_diagnostic there is a check that the > state_change->m_sval is equal to the sval of the saved_diagnostic, > which is not enough to filter unrelated events in the zero state > machine case. I don't know that it makes much sense to track the states on constants; it makes more sense for things like the initial_value (x), if x were a parameter. It may be that the current implementation of state machines in the analyzer just isn't a good fit for this test. > I tried to filter the event by comparing the state of interest but that > doesn't work if the unrelated event is after the related event on the > path. > Similarly, I tried to add a new field m_var to state_change_event and > provide this as a comparison but in most functions that call the > constructor of state_change_event, the var is not available. > > Is there is any easy way to filter these unrelated events without > considerable changes [1]? > > By the way, I can trigger the same behavior on the malloc null_deref > with gcc version 12.1.1 shipped in Fedora 36. Assume this simple > program: > #include <stddef.h> > int main (void) > { > int *p = NULL; > *p = 42; > > int *q = NULL; > > return 0; > } > results in the following warning: > /home/tim/Projects/simple_c/main.c: In function ‘main’: > /home/tim/Projects/simple_c/main.c:12:6: warning: dereference of NULL > ‘p’ [CWE-476] [-Wanalyzer-null-dereference] > 12 | *p = 42; > | ~~~^~~~ > ‘main’: events 1-4 > | > | 11 | int *p = NULL; > | | ^ > | | | > | | (1) ‘p’ is NULL > | 12 | *p = 42; > | | ~~~~~~~ > | | | > | | (4) dereference of NULL ‘p’ > | 13 | > | 14 | int *q = NULL; > | | ~ > | | | > | | (2) ‘p’ is NULL > | | (3) ‘p’ is NULL > That definitely looks like a bug in the code I was talking about earlier. I've filed this as: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105958 Thanks Dave > > > > > > > I tracked the cause down the call stack to > > > m_saved_diagnostics.safe_push (sd); inside > > > diagnostic_manager::add_diagnostic. For whatever reason, the > > > pushing > > > of > > > diagnostics bricks the zero sm. It might be a embarrassing error > > > on > > > my > > > side, but I'm stuck on this and can't seem to find what I'm doing > > > wrong. > > > > FWIW, assuming you're using the gdb support scripts: > > > > > > https://gcc-newbies-guide.readthedocs.io/en/latest/debugging.html#support-scripts-for-gdb > > there's a handy: > > > > (gdb) break-on-saved-diagnostic > > > > command which adds a breakpoint on the analyzer's diagnostic_manager > > saving a pending_diagnostic. > > > > You might also what to try -fdump-analyzer-stderr, which will dump > > lots > > of information on what the analyzer is doing to stderr, which > > although > > verbose is usually very helpful in figuring out why the analyzer is > > doing something. > > > > Hope this is helpful > Definitely helped me understand more of the internals! > > - Tim > > Dave > > > > > > > > > > > > - Tim > > > > > > [1] > > > > > > > > > > > > https://github.com/timll/gcc/blob/castsize/gcc/analyzer/sm-zero.cc#L181 > > > > > > > > > > > > ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2022-06-13 19:25 UTC | newest] Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-06-09 14:49 fanalyzer: debugging zero state machine Tim Lange 2022-06-09 17:40 ` David Malcolm 2022-06-12 18:27 ` Tim Lange 2022-06-13 19:25 ` David Malcolm
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).