Hello, I have made some changes based on the review feedback, please have a look. In summary: * Support for -profile=atomic * gcc -fprofile-conditions does not ICE without -fprofile-arcs * Indentation and formatting complies with contrib/GNU_style_check * Some small adjustment to documentation, phrasing As mentioned in the previous message, I think it should only be necessary to do atomic fetch/store on the global accumulators, not the locals (as they are specific to the activation record). If is wrong then I have no problems extending the atomics to the function-local accumulators too, please let me know. I added a simple test case for -profile=update, gcov-20.c. I've added checks for profile_condition_flag everywhere the profile_arc_flag is checked, as they share the same code paths and infrastructure. I also made it so that -fprofile-conditions is sufficient for linking gcov. gcc will no longer crash on -fprofile-conditions -fprofile-generate by explicitly checking coverage_begin_*. I'm not too happy with how it turned out, but the condition coverage needs a lot more analysis before it can write out its gcov structure, which doesn't fit with the flow of branch_prob (). Thanks, Jørgen On 25/03/2022 20:44, Jørgen Kvalsvik wrote: > Hello, and thanks for the review! > >> 1) Do I correctly understand that __conditions_accu_true/false track >> every short-circuit sub-expression of a condition and record >> if a given sub-expr is seen to be true or false? > > Sort-of. It is not really aware of sub-expressions at all, but tracks every > boolean condition/term in the full expression, mapped to a bitvector. I usually > find it easier to understand visually: > > if (a || (b && c) || d) > > terms | a b c d > ------|-------- > true | 0 0 0 0 > false | 0 0 0 0 > > Whenever a = true, true becomes 1000 and false remains 0000. a = true would > short-circuit, (implemented as a normal edge to the "exit" node of the > expression) leaving bcd unevaluated. Coverage is then determined as the number > of unset bits in this vector. The accumulators are zero'd on every function > entry, and |= into the global "counter" on function exit. > >> 2) As mentioned these are bit sets, can you please explain why you (& >> -value) from these sets? > > Yes, excellent question. This should answer 3) too, and is the most surprsing > thing when unfamiliar with MC/DC. > > For modified condition/decision coverage we need to prove that a condition's > value independently affects the outcome/decision. In other words, would > flipping a condition change the outcome? > > Let's take your if (a || b) check, which says that 1/4 conditions are covered > on (0, 1). If a = 1, the expression is always true, and b is never evaluated. > In fact, the value of b does not change the outcome at all, so intuitively only > 1/4 of the conditions are covered. > > On (0, 1) b gets evaluated, so in a way we know that a = 0. However, the value > of a will no longer affect the outcome because (* || 1) is always true. In a > way, you can think of it as reverse short-circuiting, the same thing would > happen on (* && 0). This is what Wahlen, Heimdahl, and De Silva calls masking. > What I do is figure out when masking must happen by analyzing the CFG and zero > out the bits that are masked (i.e. no longer affect the outcome). This is in > practice what happens to the bits when regular short circuiting happen. > > So while (0, 1) "sees" the condition a = false, it cannot prove that it > mattered based on your inputs. In general, you need N+1 test cases to cover N > conditionals with MC/DC. As a consequence, the only way of covering a = false > is (0, 0), which alone would be 2/4 cases covered. > > I hope this made it clearer, if not I can write a more elaborate explanation > with more examples. > >> 4) I noticed various CFG patterns in tree-profile.cc which are handled. Can >> you please explain how the algorithm works even without knowing the original >> paper? > > My paper is not written yet, but I will describe the high-level algorithm (+ > extracts from source comments) at the end of this email, as it is a bit long. > >> 5) I noticed the following ICE: > Yes, this is an unfortunate flaw - I piggyback on the setup done by branch > coverage, which means I silently assume --coverage is used. I was unsure what > to do (and meant to ask, sorry!) as it is a larger decision - should > condition-coverage also imply branch coverage? Should condition coverage work > on its own? Other options? > > I'm happy to implement either strategy - please advise on what you would > prefer. > > >> 6) Please follow the GNU coding style, most notable we replace 8 spaces with >> a tab. And we break line before {, (, ... Remove noexcept specifiers for >> functions and I think most of the newly added functions can be static. And >> each newly added function should have a comment. > > Sure thing, I'll add some tabs. All the helper functions are already static (in > an unnamed namespace), but I can remove the namespace and/or add static. > >> 7) Please consider supporting of -profile-update=atomic where you'll need to >> utilize an atomic builts (see how other instrumentation looks like with >> it). > Will do! I always intended to support the atomics, not having them there is an > oversight. To confirm, I will only need to use atomics for the global counters, > right? The per-call accumulators are local to each activation record and should > not need the atomics, or am I missing something? > > ALGORITHM > > Phase 1: expression isolation from CFG analysis > > Break the CFG into smaller pieces ("sections") by splitting it on dominators. > No expression can cross a dominator, so becomes a natural way of terminating > expression searches. > > For each section, do a BFS from the root node and collect all reachable nodes > following edges that point to "conditional" nodes, i.e. nodes with outgoing > true&false edges. Series of single-parent single-exit nodes are contracted. > This gives a good estimate for an expression, but might include conditions in > the if/else blocks. > > The algorithm then collects the immediate neighborhood, i.e. all nodes adjacent > to the collected set, dropping everything in the set itself. Then, a new BFS is > performed, but now "upwards" (i.e. following predecessors), skipping any > neighbor not dominated by the root (this eliminates loops and other expressions > with the same "exit" block, see BFS-up). The intersection of all the paths > found with this BFS is the isolated expression. > > If the root is not a conditional, it is skipped and it's successor > becomes the new root. If an expression is found it has a pair of "expression > sucessors", and the algorithm recurses into both, evaluating this sub-section > in isolation, i.e. > > if (a || b) { > root: > if (c && d) { > ... > } > ... > sink: > } > > The sections between the root and sink labels are evaulated as if the outer > expression did not exist. > > From the comments: > > 1. Collect all nodes reachable from the root node through (contracted) paths > of true/false edges. > 2. Collect the neighbors of the reachable node (set). > 3. From every node in the neighborhood, walk the up the CFG and mark every > reachable node. Only the nodes reachable from *every* node in the > neighborhood are a part of the first expression. > 4. Record the expression plus the two successors of the last (highest-index) > node in the expression, i.e. the last term. > 5. Repeat using the two successors as new root nodes. > > It is not guaranteed to find nodes in the order of the expression, i.e. it > might find (a || b) && c as [a c b], so the output is sorted by > basic_block->index. > > Steps 2 and 3 are necessary to distinguish chained conditionals from > multi-term conditionals, e.g. to separate > > if (a) > { > if (b) > work (); > } > if (a && b) > work (); > > > On BFS-up: > > Either block of a conditional could have more decisions and loops, so > isolate the first decision by set-intersecting all paths from the > post-dominator to the entry block. > > The function returns the number of blocks from n that make up the leading > expression in prefix order (i.e. the order expected by the instrumenting > code). When this function returns 0 there are no decisions between pre and > post, and this segment of the CFG can safely be skipped. > > The post nodes can have predecessors that do not belong to this subgraph, > which are skipped. This is expected, for example when there is a > conditional in the else-block of a larger expression: > > if (A) > { > if (B) {} > } else { > if (C) {} else {} > } > > A > t / \ f > / \ > B C > /\ / \ > / \ T F > T \ \ / > \ | o > \ | / > \ | / > \|/ > E > > Processing [B, E) which looks like: > > B > /| > / | > T | > \ / > E ----> o // predecessor outside [B, e) > > > Phase 2: Masking vector and instrumentation > > The input to this phase is the basic blocks that make up every expression, plus > its two exit nodes, i.e. the 3-term expression (a || b || c) yields the input > [a b c T F] to phase 2. > > For every expression, allocate two local accumulators (bit-vectors), one for > the true-vector, one for the false-vector. These are zero'd on function entry > and flushed on function exit. > > For each "conditional" block (a b c), unconditionally set the corresponding bit > when an edge in the CFG is taken. Rough pseudo-gimple illustration: > > if (a) { > accu_true |= 1 << 0 > goto done; > } else { > accu_false |= 1 << 0 > if (b) { > accu_true |= 1 << 1; > goto done; > } else { > accu_false |= 1 << 1; > ... > } > } > > global_true[expr_n] |= accu_true > global_false[expr_n] |= accu_false > > This alone would more-or-less be branch coverage, but would not prove that > terms would independently decide the outcome. To find masking, the expression > is walked in search of triangles. A || B as a decision diagram becomes > > A > t|\f > | \ > | B > |t/ \f > |/ \ > T F > > and we search for triangles like ATB, that is nodes with 2 predecessors with > the same boolean value, where the two predecessors are connected with the > opposite value. This marks a candidate for masking, and the terms masked are > determined by chasing predecessors starting at the top of the triangle for as > long as the chain of predecessors have the same truth value as the "exit" > edges. This set is often empty. > > The edge that triggers masking is then instrumented with a accu &= mask which > wipes out anything that does not independently affect the outcome. Note that > nothing is ever wiped from the global accumulator, only the local. The (a || b) > example had a false vector of [1, 0] at some point, but it was wiped by the > masking operation (b = 1 is the only masking value for a || b). > > From the comments: > > Scan the blocks that make up an expression and look for conditions that > would mask other conditions. For a deeper discussion on masking, see > Whalen, Heimdahl, De Silva "Efficient Test Coverage Measurement for MC/DC". > Masking is best illustrated with an example: > > A || B. If B is true then A will does not independently affect the > decision. In a way, this is "reverse" short circuiting, and always work on > the right-hand side of expressions. * || true is always true and * && > false is always false - the left-hand-side does not affect the outcome, and > their values should not contribute to modified condition/decision coverage. > > A || B interpreted as a decision diagram becomes: > > A > t|\f > | \ > | B > |t/ \f > |/ \ > T F > > The algorithm looks for triangles like ATB. Masking right-hand sides happen > when a block has a pair of incoming edges of the same boolean value, and > there is an edge connecting the two predecessors with the *opposite* boolean > value. The masking block is B, and the masked block is A. In this > particular example: > > Masking can affect "outside" its own subexpression; in A || (B && C) if C is > false, B is masked. However, if (B && C) is true, A gets masked. B && C > can be determined from evaluating C since !B would short-circuit, so a true > C would mask A. > > A > t|\f > | \ > | \ > | \ > | B > | t/ \f > | C | > |t/ \f | > |/ \ | > T F > > Notice how BFC forms a triangle. Expressions masked by an edge are > determined by: > * Go to the predecessor with truth value b (if it exists). > * Follow the path of ancestors with taking only !b edges, recording every > node from here on. > > For the case where C can mask A, the path goes C [true]-> B -> [false] A, so > C [true] masks A. > > The mask is output as a bit-set stored in a gcov_type_unsigned. The > bit-sets are output in pairs, one for each decision (the outcome of the > boolean expression, or which arc to take in the CFG). > > Phase 3: reading the report > > This is pretty standard gcov stuff, except the "counters" are interpreted as > bitsets, and the number of terms in the expressions is stored. Coverage then > becomes popcount / expected. > > Final remarks: > > Granted, this is a bit dense, but I think the overall approach is easy to grasp > when supported by a lot of example graphs and stepping. I will try to type up a > bunch with the paper and publish, but for now all I have is the code (which I > hope for the most part is easy to follow). I'll update this with style updates > and atomics support, but I would prefer a maintainer to make the call if this > condition coverage should imply branch coverage, or if it should work without it. > > Thanks, > Jørgen > > On 24/03/2022 17:08, Martin Liška via Gcc-patches wrote: >> On 3/21/22 12:55, Jørgen Kvalsvik via Gcc-patches wrote: >> >> Hello. >> >> Thanks for very interesting extension of the existing GCOV profiling. >> >> I briefly read the patch and investigated what happens and I've got various >> questions/comments about it: >> >> 1) Do I correctly understand that __conditions_accu_true/false track >> every short-circuit sub-expression of a condition and record >> if a given sub-expr is seen to be true or false? >> >> 2) As mentioned these are bit sets, can you please explain why you (& -value) >> from these sets? >> >> 3) I noticed a false report for a simple test-case: >> >> $ cat cond.c >> int x; >> >> void foo (int a, int b) >> { >> if (a || b) >> x = 1; >> else >> x = 2; >> } >> >> int main() >> { >> foo (0, 1); >> return 0; >> } >> >> $ rm *gcda ; gcc --coverage cond.c -fprofile-conditions && ./a.out && gcov -g -t a-cond.c >> -: 0:Source:cond.c >> -: 0:Graph:a-cond.gcno >> -: 0:Data:a-cond.gcda >> -: 0:Runs:1 >> -: 1:int x; >> -: 2: >> 1: 3:void foo (int a, int b) >> -: 4:{ >> 1: 5: if (a || b) >> conditions covered 1/4 >> condition 0 not covered (true) >> condition 0 not covered (false) <-- this was executed if I'm correct >> condition 1 not covered (false) >> 1: 6: x = 1; >> -: 7: else >> #####: 8: x = 2; >> 1: 9:} >> -: 10: >> 1: 11:int main() >> -: 12:{ >> 1: 13: foo (0, 1); >> 1: 14: return 0; >> -: 15:} >> >> 4) I noticed various CFG patterns in tree-profile.cc which are handled. Can you please >> explain how the algorithm works even without knowing the original paper? >> >> 5) I noticed the following ICE: >> >> $ gcc cond.c -fprofile-conditions -fprofile-generate >> during IPA pass: profile >> cond.c: In function ?foo?: >> cond.c:15:1: internal compiler error: Segmentation fault >> 15 | } >> | ^ >> 0xf4d64a crash_signal >> /home/marxin/Programming/gcc/gcc/toplev.cc:322 >> 0x7ffff78b93cf ??? >> /usr/src/debug/glibc-2.35-2.1.x86_64/signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0 >> 0x7ffff78f29ed __GI__IO_ftell >> /usr/src/debug/glibc-2.35-2.1.x86_64/libio/ioftell.c:37 >> 0xa9dfda gcov_position >> /home/marxin/Programming/gcc/gcc/gcov-io.cc:48 >> 0xa9dfda gcov_write_tag(unsigned int) >> /home/marxin/Programming/gcc/gcc/gcov-io.cc:321 >> 0xe9734a branch_prob(bool) >> /home/marxin/Programming/gcc/gcc/profile.cc:1542 >> 0x1032e08 tree_profiling >> /home/marxin/Programming/gcc/gcc/tree-profile.cc:1620 >> 0x1032e08 execute >> /home/marxin/Programming/gcc/gcc/tree-profile.cc:1726 >> Please submit a full bug report, with preprocessed source (by using -freport-bug). >> Please include the complete backtrace with any bug report. >> See for instructions. >> >> 6) Please follow the GNU coding style, most notable we replace 8 spaces with a tab. >> And we break line before {, (, ... Remove noexcept specifiers for functions and I think >> most of the newly added functions can be static. And each newly added function should >> have a comment. >> >> 7) Please consider supporting of -profile-update=atomic where you'll need to utilize >> an atomic builts (see how other instrumentation looks like with it). >> >> That's the very first round of the review. Hope it's helpful. >> >> Cheers, >> Martin >> >>> This patch adds support in gcc+gcov for modified condition/decision >>> coverage (MC/DC) with the -fprofile-conditions flag. MC/DC is a type of >>> test/code coverage, and it is particularly important in the avation and >>> automotive industries for safety-critical applications. In particular, >>> it is required or recommended by: >>> >>> * DO-178C for the most critical software (Level A) in avionics >>> * IEC 61508 for SIL 4 >>> * ISO 26262-6 for ASIL D >>> >>> From the SQLite webpage: >>> >>> Two methods of measuring test coverage were described above: >>> "statement" and "branch" coverage. There are many other test >>> coverage metrics besides these two. Another popular metric is >>> "Modified Condition/Decision Coverage" or MC/DC. Wikipedia defines >>> MC/DC as follows: >>> >>> * Each decision tries every possible outcome. >>> * Each condition in a decision takes on every possible outcome. >>> * Each entry and exit point is invoked. >>> * Each condition in a decision is shown to independently affect >>> the outcome of the decision. >>> >>> In the C programming language where && and || are "short-circuit" >>> operators, MC/DC and branch coverage are very nearly the same thing. >>> The primary difference is in boolean vector tests. One can test for >>> any of several bits in bit-vector and still obtain 100% branch test >>> coverage even though the second element of MC/DC - the requirement >>> that each condition in a decision take on every possible outcome - >>> might not be satisfied. >>> >>> https://sqlite.org/testing.html#mcdc >>> >>> Wahlen, Heimdahl, and De Silva "Efficient Test Coverage Measurement for >>> MC/DC" describes an algorithm for adding instrumentation by carrying >>> over information from the AST, but my algorithm does analysis on the >>> control flow graph. This should make it work for any language gcc >>> supports, although I have only tested it on constructs in C and C++, see >>> testsuite/gcc.misc-tests and testsuite/g++.dg. >>> >>> Like Wahlen et al this implementation uses bitsets to store conditions, >>> which gcov later interprets. This is very fast, but introduces an max >>> limit for the number of terms in a single boolean expression. This limit >>> is the number of bits in a gcov_unsigned_type (which is typedef'd to >>> uint64_t), so for most practical purposes this would be acceptable. >>> limitation can be relaxed with a more sophisticated way of storing and >>> updating bitsets (for example length-encoding). >>> >>> In action it looks pretty similar to the branch coverage. The -g short >>> opt carries no significance, but was chosen because it was an available >>> option with the upper-case free too. >>> >>> gcov --conditions: >>> >>> 3: 17:void fn (int a, int b, int c, int d) { >>> 3: 18: if ((a && (b || c)) && d) >>> conditions covered 5/8 >>> condition 1 not covered (false) >>> condition 2 not covered (true) >>> condition 2 not covered (false) >>> 1: 19: x = 1; >>> -: 20: else >>> 2: 21: x = 2; >>> 3: 22:} >>> >>> gcov --conditions --json-format: >>> >>> "conditions": [ >>> { >>> "not_covered_false": [ >>> 1, >>> 2 >>> ], >>> "count": 8, >>> "covered": 5, >>> "not_covered_true": [ >>> 2 >>> ] >>> } >>> ], >>> >>> C++ destructors will add extra conditionals that are not explicitly >>> present in source code. These are present in the CFG, which means the >>> condition coverage will show them. >>> >>> gcov --conditions >>> >>> -: 5:struct A { >>> 1: 6: explicit A (int x) : v (x) {} >>> 1: 7: operator bool () const { return bool(v); } >>> 1: 8: ~A() {} >>> -: 9: >>> -: 10: int v; >>> -: 11:}; >>> -: 12: >>> 2: 13:void fn (int a, int b) { >>> 2*: 14: x = a && A(b); >>> conditions covered 2/4 >>> condition 0 not covered (true) >>> condition 1 not covered (true) >>> conditions covered 2/2 >>> >>> They are also reported by the branch coverage. >>> >>> gcov -bc >>> >>> 2*: 14: x = a && A(b); >>> branch 0 taken 1 (fallthrough) >>> branch 1 taken 1 >>> call 2 returned 1 >>> call 3 returned 1 >>> branch 4 taken 0 (fallthrough) >>> branch 5 taken 1 >>> branch 6 taken 1 (fallthrough) >>> branch 7 taken 1 >>> >>> The algorithm struggles in a particular case where gcc would generate >>> identical CFGs for different expressions: >>> >>> and.c: >>> >>> if (a && b && c) >>> x = 1; >>> >>> ifs.c: >>> >>> if (a) >>> if (b) >>> if (c) >>> x = 1; >>> >>> if (a && b && c) >>> x = 1; >>> >>> and.c.gcov: >>> >>> #####: 2: if (a && b && c) >>> conditions covered 2/2 >>> conditions covered 2/2 >>> conditions covered 2/2 >>> #####: 3: x = 1; >>> >>> ifs.c.gcov: >>> #####: 2: if (a) >>> conditions covered 2/2 >>> #####: 3: 2 if (b) >>> conditions covered 2/2 >>> #####: 4: 2 if (c) >>> conditions covered 2/2 >>> #####: 5: x = 1; >>> >>> These programs are semantically equivalent, and are interpreted as 3 >>> separate conditional expressions. It does not matter w.r.t. coverage, >>> but is not ideal. Adding an else to and.c fixes the issue: >>> >>> #####: 2: if (a && b && c) >>> conditions covered 6/6 >>> #####: 3: x = 1; >>> #####: 4: else x = x; >>> >>> In the (a && b && c) case the else block must be shared between all >>> three conditionals, and for if (a) if (b) if (c) there would be three >>> *separate* else blocks. >>> >>> Since the algorithm works on CFGs, it cannot detect conditionals (from >>> ternaries) that don't get an entry in the cfg. For example, >>> int x = a ? 0 : 1 in gimple becomes _x = (_a == 0). From source you >>> would expect coverage, but it gets neither branch nor condition >>> coverage. For completeness, it could be achieved by scanning all gimple >>> statements for comparisons, and insert an extra instruction for >>> recording the outcome. >>> >>> The test suite consists of all different CFG kinds and control flow >>> structures I could find, but there is certainly room for many more. >>> >>> Alternative email: Jørgen Kvalsvik >>> >>> -- >>> >>> Some final remarks (not a part of the commit message), this is written in a C++ style that I felt meshed well with what was already there, but if the maintainers would prefer a different approach then I'm happy to to make adjustments. I plan to write a more thorough description of the algorithm, as well as adding the same feature to clang as it is very useful for us at Woven Planet. >>> >>> I tried to pick flag names and options that were comfortable and descriptive, but if you have better names or different ideas for flags then I am happy to change them. The --coverage flag was not changed to imply -fprofile-conditions, but it is possibly something to consider. >>> >>> gcc/ChangeLog: >>> >>> * common.opt: Add new options -fprofile-conditions and >>> -Wcoverage-too-many-conditions. >>> * doc/gcov.texi: Add --conditions documentation. >>> * doc/invoke.texi: Add -fprofile-conditions documentation. >>> * gcov-counter.def (GCOV_COUNTER_CONDS): New. >>> * gcov-io.h (GCOV_TAG_CONDS): New. >>> (GCOV_TAG_CONDS_LENGTH): Likewise. >>> (GCOV_TAG_CONDS_NUM): Likewise. >>> * gcov.cc (class condition_info): New. >>> (condition_info::condition_info): New. >>> (condition_info::popcount): New. >>> (struct coverage_info): New. >>> (add_condition_counts): New. >>> (output_conditions): New. >>> (print_usage): Add -g, --conditions. >>> (process_args): Likewise. >>> (output_intermediate_json_line): Output conditions. >>> (read_graph_file): Read conditions counters. >>> (read_count_file): Read conditions counters. >>> (file_summary): Print conditions. >>> (accumulate_line_info): Accumulate conditions. >>> (output_line_details): Print conditions. >>> * profile.cc (find_conditions): New declaration. >>> (instrument_decisions): New declaration. >>> (branch_prob): Call find_conditions, instrument_decisions. >>> (init_branch_prob): Add total_num_conds. >>> (end_branch_prob): Likewise. >>> * tree-profile.cc (INCLUDE_ALGORITHM): Define. >>> (struct conds_ctx): New. >>> (CONDITIONS_MAX_TERMS): New. >>> (index_of): New. >>> (index_lt): New. >>> (single): New. >>> (single_edge): New. >>> (contract_edge): New. >>> (contract_edge_up): New. >>> (is_conditional_p): New. >>> (collect_neighbors): New. >>> (find_first_conditional): New. >>> (emit_bitwise_op): New. >>> (collect_conditions): New. >>> (find_conditions): New. >>> (instrument_decisions): New. >>> >>> gcc/testsuite/ChangeLog: >>> >>> * lib/gcov.exp: >>> * g++.dg/gcov/gcov-18.C: New test. >>> * gcc.misc-tests/gcov-19.c: New test. >>> --- >>> gcc/common.opt | 8 + >>> gcc/doc/gcov.texi | 36 ++ >>> gcc/doc/invoke.texi | 19 + >>> gcc/gcov-counter.def | 3 + >>> gcc/gcov-io.h | 3 + >>> gcc/gcov.cc | 177 +++++- >>> gcc/profile.cc | 51 +- >>> gcc/testsuite/g++.dg/gcov/gcov-18.C | 209 ++++++ >>> gcc/testsuite/gcc.misc-tests/gcov-19.c | 726 +++++++++++++++++++++ >>> gcc/testsuite/lib/gcov.exp | 187 +++++- >>> gcc/tree-profile.cc | 838 +++++++++++++++++++++++++ >>> 11 files changed, 2248 insertions(+), 9 deletions(-) >>> create mode 100644 gcc/testsuite/g++.dg/gcov/gcov-18.C >>> create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-19.c >