From: David Malcolm <dmalcolm@redhat.com>
To: Richard Biener <rguenther@suse.de>, Qing Zhao <qing.zhao@oracle.com>
Cc: gcc-patches@gcc.gnu.org, pinskia@gmail.com, keescook@chromium.org
Subject: Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
Date: Wed, 15 May 2024 10:00:42 -0400 [thread overview]
Message-ID: <8c3c2c188421e4817249a5bc6d51dcf43eb079db.camel@redhat.com> (raw)
In-Reply-To: <35799652-55op-79os-ro61-56094939s0o2@fhfr.qr>
On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
> On Mon, 13 May 2024, Qing Zhao wrote:
>
> > -Warray-bounds is an important option to enable linux kernal to
> > keep
> > the array out-of-bound errors out of the source tree.
> >
> > However, due to the false positive warnings reported in PR109071
> > (-Warray-bounds false positive warnings due to code duplication
> > from
> > jump threading), -Warray-bounds=1 cannot be added on by default.
> >
> > Although it's impossible to elinimate all the false positive
> > warnings
> > from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
> > documentation says "always out of bounds"), we should minimize the
> > false positive warnings in -Warray-bounds=1.
> >
> > The root reason for the false positive warnings reported in
> > PR109071 is:
> >
> > When the thread jump optimization tries to reduce the # of branches
> > inside the routine, sometimes it needs to duplicate the code and
> > split into two conditional pathes. for example:
> >
> > The original code:
> >
> > void sparx5_set (int * ptr, struct nums * sg, int index)
> > {
> > if (index >= 4)
> > warn ();
> > *ptr = 0;
> > *val = sg->vals[index];
> > if (index >= 4)
> > warn ();
> > *ptr = *val;
> >
> > return;
> > }
> >
> > With the thread jump, the above becomes:
> >
> > void sparx5_set (int * ptr, struct nums * sg, int index)
> > {
> > if (index >= 4)
> > {
> > warn ();
> > *ptr = 0; // Code duplications since "warn" does
> > return;
> > *val = sg->vals[index]; // same this line.
> > // In this path, since it's under
> > the condition
> > // "index >= 4", the compiler knows
> > the value
> > // of "index" is larger then 4,
> > therefore the
> > // out-of-bound warning.
> > warn ();
> > }
> > else
> > {
> > *ptr = 0;
> > *val = sg->vals[index];
> > }
> > *ptr = *val;
> > return;
> > }
> >
> > We can see, after the thread jump optimization, the # of branches
> > inside
> > the routine "sparx5_set" is reduced from 2 to 1, however, due to
> > the
> > code duplication (which is needed for the correctness of the code),
> > we
> > got a false positive out-of-bound warning.
> >
> > In order to eliminate such false positive out-of-bound warning,
> >
> > A. Add one more flag for GIMPLE: is_splitted.
> > B. During the thread jump optimization, when the basic blocks are
> > duplicated, mark all the STMTs inside the original and
> > duplicated
> > basic blocks as "is_splitted";
> > C. Inside the array bound checker, add the following new heuristic:
> >
> > If
> > 1. the stmt is duplicated and splitted into two conditional
> > paths;
> > + 2. the warning level < 2;
> > + 3. the current block is not dominating the exit block
> > Then not report the warning.
> >
> > The false positive warnings are moved from -Warray-bounds=1 to
> > -Warray-bounds=2 now.
> >
> > Bootstrapped and regression tested on both x86 and aarch64.
> > adjusted
> > -Warray-bounds-61.c due to the false positive warnings.
> >
> > Let me know if you have any comments and suggestions.
>
> At the last Cauldron I talked with David Malcolm about these kind of
> issues and thought of instead of suppressing diagnostics to record
> how a block was duplicated. For jump threading my idea was to record
> the condition that was proved true when entering the path and do this
> by recording the corresponding locations so that in the end we can
> use the diagnostic-path infrastructure to say
>
> warning: array index always above array bounds
> events 1:
>
> > 3 | if (index >= 4)
> |
> (1) when index >= 4
>
> it would be possible to record the info as part of the ad-hoc
> location data on each duplicated stmt or, possibly simpler,
> as part of a debug stmt of new kind.
>
> I'm not sure pruning the warnings is a good thing to do. One
> would argue we should instead isolate such path as unreachable
> since it invokes undefined behavior. In particular your
> example is clearly a bug and should be diagnosed.
>
> Note very similar issues happen when unrolling a loop.
>
> Note all late diagnostics are prone to these kind of issues.
To recap our chat at Cauldron: any GCC diagnostic can potentially have
a diagnostic_path associated with it (not just the analyzer). The
current mechanism is:
(a) use a rich_location for the diagnostic, and
(b) create an instance of some diagnostic_path subclass (e.g.
simple_diagnostic_path, or something else), and
(c) call richloc.set_path (&path); to associate the path with the
rich_location
See gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_paths.c for an
example of using simple_diagnostic_path that doesn't use the analyzer.
If we want *every* late diagnostic to potentially have a path, it
sounds like we might want some extra infrastructure (perhaps a hook
that autogenerates the paths from the ad-hoc data???) But probably
best to get it working for just a specific example first before trying
to be fancy and generalize.
Dave
>
> Richard.
>
> > Thanks.
> >
> > Qing
> >
> >
> > PR tree optimization/109071
> >
> > gcc/ChangeLog:
> >
> > * gimple-array-bounds.cc (check_out_of_bounds_and_warn):
> > Add two new
> > arguments for the new heuristic to not issue warnings.
> > (array_bounds_checker::check_array_ref): Call the new
> > prototype of the
> > routine check_out_of_bounds_and_warn.
> > (array_bounds_checker::check_mem_ref): Add one new argument
> > for the
> > new heuristic to not issue warnings.
> > (array_bounds_checker::check_addr_expr): Call the new
> > prototype of the
> > routine check_mem_ref, add new heuristic for not issue
> > warnings.
> > (array_bounds_checker::check_array_bounds): Call the new
> > prototype of
> > the routine check_mem_ref.
> > * gimple-array-bounds.h: New prototype of check_mem_ref.
> > * gimple.h (struct GTY): Add one new flag is_splitted for
> > gimple.
> > (gimple_is_splitted_p): New function.
> > (gimple_set_is_splitted): New function.
> > * tree-ssa-threadupdate.cc (set_stmts_in_bb_is_splitted):
> > New function.
> > (back_jt_path_registry::duplicate_thread_path): Mark all
> > the stmts in
> > both original and copied blocks as IS_SPLITTED.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/Warray-bounds-61.c: Adjust testing case.
> > * gcc.dg/pr109071-1.c: New test.
> > * gcc.dg/pr109071.c: New test.
> > ---
> > gcc/gimple-array-bounds.cc | 46
> > +++++++++++++++++++++----
> > gcc/gimple-array-bounds.h | 2 +-
> > gcc/gimple.h | 21 +++++++++--
> > gcc/testsuite/gcc.dg/Warray-bounds-61.c | 6 ++--
> > gcc/testsuite/gcc.dg/pr109071-1.c | 22 ++++++++++++
> > gcc/testsuite/gcc.dg/pr109071.c | 22 ++++++++++++
> > gcc/tree-ssa-threadupdate.cc | 15 ++++++++
> > 7 files changed, 122 insertions(+), 12 deletions(-)
> > create mode 100644 gcc/testsuite/gcc.dg/pr109071-1.c
> > create mode 100644 gcc/testsuite/gcc.dg/pr109071.c
> >
> > diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-
> > bounds.cc
> > index 008071cd5464..4a2975623bc1 100644
> > --- a/gcc/gimple-array-bounds.cc
> > +++ b/gcc/gimple-array-bounds.cc
> > @@ -264,7 +264,9 @@ check_out_of_bounds_and_warn (location_t
> > location, tree ref,
> > tree up_bound, tree up_bound_p1,
> > const value_range *vr,
> > bool ignore_off_by_one, bool
> > for_array_bound,
> > - bool *out_of_bound)
> > + bool *out_of_bound,
> > + bool is_splitted,
> > + bool is_dominate_exit)
> > {
> > tree min, max;
> > tree low_bound = array_ref_low_bound (ref);
> > @@ -273,6 +275,13 @@ check_out_of_bounds_and_warn (location_t
> > location, tree ref,
> > bool warned = false;
> > *out_of_bound = false;
> >
> > + /* If the stmt is duplicated and splitted, the warning level is
> > not 2,
> > + and the current block is not dominating the exit block, not
> > report
> > + the warning. */
> > + if (is_splitted && warn_array_bounds < 2
> > + && !is_dominate_exit)
> > + return false;
> > +
> > /* Empty array. */
> > if (up_bound && tree_int_cst_equal (low_bound, up_bound_p1))
> > {
> > @@ -386,12 +395,17 @@ array_bounds_checker::check_array_ref
> > (location_t location, tree ref,
> > }
> > }
> >
> > + bool is_dominate_exit = dominated_by_p (CDI_DOMINATORS,
> > + EXIT_BLOCK_PTR_FOR_FN
> > (fun),
> > + gimple_bb (stmt));
> > +
> > warned = check_out_of_bounds_and_warn (location, ref,
> > low_sub_org, low_sub,
> > up_sub,
> > up_bound, up_bound_p1,
> > &vr,
> > ignore_off_by_one,
> > warn_array_bounds,
> > - &out_of_bound);
> > -
> > + &out_of_bound,
> > + gimple_is_splitted_p
> > (stmt),
> > + is_dominate_exit);
> >
> > if (!warned && sam == special_array_member::int_0)
> > warned = warning_at (location, OPT_Wzero_length_bounds,
> > @@ -476,7 +490,7 @@ array_bounds_checker::check_array_ref
> > (location_t location, tree ref,
> >
> > bool
> > array_bounds_checker::check_mem_ref (location_t location, tree
> > ref,
> > - bool ignore_off_by_one)
> > + gimple *stmt, bool
> > ignore_off_by_one)
> > {
> > if (warning_suppressed_p (ref, OPT_Warray_bounds_))
> > return false;
> > @@ -574,6 +588,16 @@ array_bounds_checker::check_mem_ref
> > (location_t location, tree ref,
> > }
> > }
> >
> > + /* If the stmt is duplicated and splitted, the warning level is
> > not 2,
> > + and the current block is not dominating the exit block, not
> > report
> > + the warning. */
> > + bool is_dominate_exit = dominated_by_p (CDI_DOMINATORS,
> > + EXIT_BLOCK_PTR_FOR_FN
> > (fun),
> > + gimple_bb (stmt));
> > + if (gimple_is_splitted_p (stmt) && warn_array_bounds < 2
> > + && !is_dominate_exit)
> > + return false;
> > +
> > bool warned = false;
> > if (lboob)
> > {
> > @@ -654,7 +678,7 @@ array_bounds_checker::check_addr_expr
> > (location_t location, tree t,
> > ignore_off_by_one = false;
> > }
> > else if (TREE_CODE (t) == MEM_REF)
> > - warned = check_mem_ref (location, t, ignore_off_by_one);
> > + warned = check_mem_ref (location, t, stmt,
> > ignore_off_by_one);
> >
> > if (warned)
> > suppress_warning (t, OPT_Warray_bounds_);
> > @@ -690,6 +714,16 @@ array_bounds_checker::check_addr_expr
> > (location_t location, tree t,
> > if (!mem_ref_offset (t).is_constant (&idx))
> > return;
> >
> > + /* If the stmt is duplicated and splitted, the warning level is
> > not 2,
> > + and the current block is not dominating the exit block, not
> > report
> > + the warning. */
> > + bool is_dominate_exit = dominated_by_p (CDI_DOMINATORS,
> > + EXIT_BLOCK_PTR_FOR_FN
> > (fun),
> > + gimple_bb (stmt));
> > + if (gimple_is_splitted_p (stmt) && warn_array_bounds < 2
> > + && !is_dominate_exit)
> > + return;
> > +
> > bool warned = false;
> > idx = wi::sdiv_trunc (idx, wi::to_offset (el_sz));
> > if (idx < 0)
> > @@ -809,7 +843,7 @@ array_bounds_checker::check_array_bounds (tree
> > *tp, int *walk_subtree,
> > warned = checker->check_array_ref (location, t, wi->stmt,
> > false/*ignore_off_by_one*/);
> > else if (TREE_CODE (t) == MEM_REF)
> > - warned = checker->check_mem_ref (location, t,
> > + warned = checker->check_mem_ref (location, t, wi->stmt,
> > false /*ignore_off_by_one*/);
> > else if (TREE_CODE (t) == ADDR_EXPR)
> > {
> > diff --git a/gcc/gimple-array-bounds.h b/gcc/gimple-array-bounds.h
> > index 3e077d0178ff..7c98f02204c9 100644
> > --- a/gcc/gimple-array-bounds.h
> > +++ b/gcc/gimple-array-bounds.h
> > @@ -33,7 +33,7 @@ public:
> > private:
> > static tree check_array_bounds (tree *tp, int *walk_subtree,
> > void *data);
> > bool check_array_ref (location_t, tree, gimple *, bool
> > ignore_off_by_one);
> > - bool check_mem_ref (location_t, tree, bool ignore_off_by_one);
> > + bool check_mem_ref (location_t, tree, gimple *, bool
> > ignore_off_by_one);
> > void check_addr_expr (location_t, tree, gimple *);
> > void get_value_range (irange &r, const_tree op, gimple *);
> >
> > diff --git a/gcc/gimple.h b/gcc/gimple.h
> > index bd315ffc2dd4..08f52d107084 100644
> > --- a/gcc/gimple.h
> > +++ b/gcc/gimple.h
> > @@ -254,8 +254,8 @@ struct GTY((desc ("gimple_statement_structure
> > (&%h)"), tag ("GSS_BASE"),
> > /* Nonzero if this statement contains volatile operands. */
> > unsigned has_volatile_ops : 1;
> >
> > - /* Padding to get subcode to 16 bit alignment. */
> > - unsigned pad : 1;
> > + /* Nonzero if this statement is duplicated and splitted to two
> > pathes. */
> > + unsigned is_splitted : 1;
> >
> > /* The SUBCODE field can be used for tuple-specific flags for
> > tuples
> > that do not require subcodes. Note that SUBCODE should be at
> > @@ -2327,6 +2327,23 @@ gimple_set_has_volatile_ops (gimple *stmt,
> > bool volatilep)
> > stmt->has_volatile_ops = (unsigned) volatilep;
> > }
> >
> > +/* Return true if statement G's is_splitted field has
> > + been set. */
> > +
> > +inline bool
> > +gimple_is_splitted_p (const gimple *g)
> > +{
> > + return (bool) g->is_splitted;
> > +}
> > +
> > +/* Set the IS_SPLITTED flag to IS_SPLITTEDP. */
> > +
> > +inline void
> > +gimple_set_is_splitted (gimple *s, bool is_splittedp)
> > +{
> > + s->is_splitted = (unsigned) is_splittedp;
> > +}
> > +
> > /* Return true if STMT is in a transaction. */
> >
> > inline bool
> > diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-61.c
> > b/gcc/testsuite/gcc.dg/Warray-bounds-61.c
> > index 5b66cdc0aab1..cb3c64a813d7 100644
> > --- a/gcc/testsuite/gcc.dg/Warray-bounds-61.c
> > +++ b/gcc/testsuite/gcc.dg/Warray-bounds-61.c
> > @@ -23,7 +23,7 @@ void test_ua3_ua0_a0 (int i)
> >
> > if (i < __LINE__)
> > i = 5;
> > - ua3_a0.a0[i] = 0; // { dg-warning "\\\[-Warray-bounds"
> > }
> > + ua3_a0.a0[i] = 0; // { dg-bogus "\\\[-Warray-bounds" }
> >
> > if (i > -1)
> > i = -1;
> > @@ -44,7 +44,7 @@ void test_ua3_ua0_a1 (int i)
> >
> > if (i > -1)
> > i = -1;
> > - ua3_a0.a1[i] = 0; // { dg-warning "\\\[-Warray-bounds"
> > }
> > + ua3_a0.a1[i] = 0; // { dg-bogus "\\\[-Warray-bounds" }
> >
> > if (i < 7)
> > i = 7;
> > @@ -60,7 +60,7 @@ void test_ua3_ua0_a2 (int i)
> >
> > if (i < __LINE__)
> > i = __LINE__;
> > - ua3_a0.a2[i] = 0; // { dg-warning "\\\[-Warray-bounds"
> > }
> > + ua3_a0.a2[i] = 0; // { dg-bogus "\\\[-Warray-bounds" }
> >
> > if (i > -1)
> > i = -1;
> > diff --git a/gcc/testsuite/gcc.dg/pr109071-1.c
> > b/gcc/testsuite/gcc.dg/pr109071-1.c
> > new file mode 100644
> > index 000000000000..a405c80bd549
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/pr109071-1.c
> > @@ -0,0 +1,22 @@
> > +/* PR tree-optimization/109071 -Warray-bounds false positive
> > warnings
> > + due to code duplication from jump threading
> > + { dg-do compile }
> > + { dg-options "-O2 -Warray-bounds=2" }
> > + */
> > +
> > +extern void warn(void);
> > +static inline void assign(int val, int *regs, int index)
> > +{
> > + if (index >= 4)
> > + warn();
> > + *regs = val;
> > +}
> > +struct nums {int vals[4];};
> > +
> > +void sparx5_set (int *ptr, struct nums *sg, int index)
> > +{
> > + int *val = &sg->vals[index]; /* { dg-warning "is above array
> > bounds" } */
> > +
> > + assign(0, ptr, index);
> > + assign(*val, ptr, index);
> > +}
> > diff --git a/gcc/testsuite/gcc.dg/pr109071.c
> > b/gcc/testsuite/gcc.dg/pr109071.c
> > new file mode 100644
> > index 000000000000..782dfad84ea2
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/pr109071.c
> > @@ -0,0 +1,22 @@
> > +/* PR tree-optimization/109071 -Warray-bounds false positive
> > warnings
> > + due to code duplication from jump threading
> > + { dg-do compile }
> > + { dg-options "-O2 -Wall" }
> > + */
> > +
> > +extern void warn(void);
> > +static inline void assign(int val, int *regs, int index)
> > +{
> > + if (index >= 4)
> > + warn();
> > + *regs = val;
> > +}
> > +struct nums {int vals[4];};
> > +
> > +void sparx5_set (int *ptr, struct nums *sg, int index)
> > +{
> > + int *val = &sg->vals[index]; /* { dg-bogus "is above array
> > bounds" } */
> > +
> > + assign(0, ptr, index);
> > + assign(*val, ptr, index);
> > +}
> > diff --git a/gcc/tree-ssa-threadupdate.cc b/gcc/tree-ssa-
> > threadupdate.cc
> > index fa61ba9512b7..9f338dd4d54d 100644
> > --- a/gcc/tree-ssa-threadupdate.cc
> > +++ b/gcc/tree-ssa-threadupdate.cc
> > @@ -2371,6 +2371,17 @@
> > back_jt_path_registry::adjust_paths_after_duplication (unsigned
> > curr_path_num)
> > }
> > }
> >
> > +/* Set all the stmts in the basic block BB as IS_SPLITTED. */
> > +
> > +static void
> > +set_stmts_in_bb_is_splitted (basic_block bb)
> > +{
> > + gimple_stmt_iterator gsi;
> > + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> > + gimple_set_is_splitted (gsi_stmt (gsi), true);
> > + return;
> > +}
> > +
> > /* Duplicates a jump-thread path of N_REGION basic blocks.
> > The ENTRY edge is redirected to the duplicate of the region.
> >
> > @@ -2418,6 +2429,10 @@ back_jt_path_registry::duplicate_thread_path
> > (edge entry,
> > basic_block *region_copy = XNEWVEC (basic_block, n_region);
> > copy_bbs (region, n_region, region_copy, &exit, 1, &exit_copy,
> > loop,
> > split_edge_bb_loc (entry), false);
> > + /* Mark all the stmts in both original and copied basic blocks
> > + as IS_SPLITTED. */
> > + set_stmts_in_bb_is_splitted (*region);
> > + set_stmts_in_bb_is_splitted (*region_copy);
> >
> > /* Fix up: copy_bbs redirects all edges pointing to copied
> > blocks. The
> > following code ensures that all the edges exiting the jump-
> > thread path are
> >
>
next prev parent reply other threads:[~2024-05-15 14:00 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-13 19:48 Qing Zhao
2024-05-13 20:46 ` Jeff Law
2024-05-13 21:41 ` Kees Cook
2024-05-13 23:38 ` Andrew Pinski
2024-05-14 0:14 ` Kees Cook
2024-05-14 14:57 ` Qing Zhao
2024-05-14 15:08 ` Jeff Law
2024-05-14 16:03 ` Qing Zhao
2024-05-14 13:08 ` Richard Biener
2024-05-14 14:17 ` Qing Zhao
2024-05-14 14:29 ` Richard Biener
2024-05-14 15:19 ` Qing Zhao
2024-05-14 17:14 ` Richard Biener
2024-05-14 17:21 ` Qing Zhao
2024-05-15 6:09 ` Richard Biener
2024-05-15 13:37 ` Qing Zhao
2024-05-14 19:50 ` Kees Cook
2024-05-15 5:50 ` Richard Biener
2024-05-15 14:00 ` David Malcolm [this message]
2024-05-21 15:13 ` Qing Zhao
2024-05-21 21:36 ` David Malcolm
2024-05-22 7:38 ` Richard Biener
2024-05-22 18:53 ` Qing Zhao
2024-05-23 11:46 ` Richard Biener
2024-05-23 14:03 ` Qing Zhao
2024-05-23 14:13 ` David Malcolm
2024-05-23 14:23 ` Qing Zhao
2024-05-31 21:22 ` Qing Zhao
2024-06-03 6:29 ` Richard Biener
2024-06-03 14:33 ` Qing Zhao
2024-06-03 14:48 ` David Malcolm
2024-06-04 7:43 ` Richard Biener
2024-06-04 20:30 ` Qing Zhao
2024-06-05 7:26 ` Richard Biener
2024-06-05 16:38 ` Qing Zhao
2024-06-05 17:07 ` Richard Biener
2024-06-05 17:58 ` Qing Zhao
2024-06-07 19:13 ` Qing Zhao
2024-06-12 18:15 ` Qing Zhao
2024-05-14 16:43 ` Kees Cook
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8c3c2c188421e4817249a5bc6d51dcf43eb079db.camel@redhat.com \
--to=dmalcolm@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=keescook@chromium.org \
--cc=pinskia@gmail.com \
--cc=qing.zhao@oracle.com \
--cc=rguenther@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).