From: Qing Zhao <qing.zhao@oracle.com>
To: Richard Biener <richard.guenther@gmail.com>,
David Malcolm <dmalcolm@redhat.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
Andrew Pinski <pinskia@gmail.com>,
kees Cook <keescook@chromium.org>
Subject: Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
Date: Wed, 22 May 2024 18:53:14 +0000 [thread overview]
Message-ID: <552015E4-10FF-4E20-BC2F-8D88A6C55C0B@oracle.com> (raw)
In-Reply-To: <CAFiYyc2U9Dn1p=L5+5kpW3jCgHwNb5xc_xoubKBf4KKR-di0Gw@mail.gmail.com>
> On May 22, 2024, at 03:38, Richard Biener <richard.guenther@gmail.com> wrote:
>
> On Tue, May 21, 2024 at 11:36 PM David Malcolm <dmalcolm@redhat.com> wrote:
>>
>> On Tue, 2024-05-21 at 15:13 +0000, Qing Zhao wrote:
>>> Thanks for the comments and suggestions.
>>>
>>>> On May 15, 2024, at 10:00, David Malcolm <dmalcolm@redhat.com>
>>>> wrote:
>>>>
>>>> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
>>>>> On Mon, 13 May 2024, Qing Zhao wrote:
>>>>>
>>>>>> -Warray-bounds is an important option to enable linux kernal to
>>>>>> keep
>>>>>> the array out-of-bound errors out of the source tree.
>>>>>>
>>>>>> However, due to the false positive warnings reported in
>>>>>> PR109071
>>>>>> (-Warray-bounds false positive warnings due to code duplication
>>>>>> from
>>>>>> jump threading), -Warray-bounds=1 cannot be added on by
>>>>>> default.
>>>>>>
>>>>>> Although it's impossible to elinimate all the false positive
>>>>>> warnings
>>>>>> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
>>>>>> documentation says "always out of bounds"), we should minimize
>>>>>> the
>>>>>> false positive warnings in -Warray-bounds=1.
>>>>>>
>>>>>> The root reason for the false positive warnings reported in
>>>>>> PR109071 is:
>>>>>>
>>>>>> When the thread jump optimization tries to reduce the # of
>>>>>> branches
>>>>>> inside the routine, sometimes it needs to duplicate the code
>>>>>> and
>>>>>> split into two conditional pathes. for example:
>>>>>>
>>>>>> The original code:
>>>>>>
>>>>>> void sparx5_set (int * ptr, struct nums * sg, int index)
>>>>>> {
>>>>>> if (index >= 4)
>>>>>> warn ();
>>>>>> *ptr = 0;
>>>>>> *val = sg->vals[index];
>>>>>> if (index >= 4)
>>>>>> warn ();
>>>>>> *ptr = *val;
>>>>>>
>>>>>> return;
>>>>>> }
>>>>>>
>>>>>> With the thread jump, the above becomes:
>>>>>>
>>>>>> void sparx5_set (int * ptr, struct nums * sg, int index)
>>>>>> {
>>>>>> if (index >= 4)
>>>>>> {
>>>>>> warn ();
>>>>>> *ptr = 0; // Code duplications since "warn" does
>>>>>> return;
>>>>>> *val = sg->vals[index]; // same this line.
>>>>>> // In this path, since it's
>>>>>> under
>>>>>> the condition
>>>>>> // "index >= 4", the compiler
>>>>>> knows
>>>>>> the value
>>>>>> // of "index" is larger then 4,
>>>>>> therefore the
>>>>>> // out-of-bound warning.
>>>>>> warn ();
>>>>>> }
>>>>>> else
>>>>>> {
>>>>>> *ptr = 0;
>>>>>> *val = sg->vals[index];
>>>>>> }
>>>>>> *ptr = *val;
>>>>>> return;
>>>>>> }
>>>>>>
>>>>>> We can see, after the thread jump optimization, the # of
>>>>>> branches
>>>>>> inside
>>>>>> the routine "sparx5_set" is reduced from 2 to 1, however, due
>>>>>> to
>>>>>> the
>>>>>> code duplication (which is needed for the correctness of the
>>>>>> code),
>>>>>> we
>>>>>> got a false positive out-of-bound warning.
>>>>>>
>>>>>> In order to eliminate such false positive out-of-bound warning,
>>>>>>
>>>>>> A. Add one more flag for GIMPLE: is_splitted.
>>>>>> B. During the thread jump optimization, when the basic blocks
>>>>>> are
>>>>>> duplicated, mark all the STMTs inside the original and
>>>>>> duplicated
>>>>>> basic blocks as "is_splitted";
>>>>>> C. Inside the array bound checker, add the following new
>>>>>> heuristic:
>>>>>>
>>>>>> If
>>>>>> 1. the stmt is duplicated and splitted into two conditional
>>>>>> paths;
>>>>>> + 2. the warning level < 2;
>>>>>> + 3. the current block is not dominating the exit block
>>>>>> Then not report the warning.
>>>>>>
>>>>>> The false positive warnings are moved from -Warray-bounds=1 to
>>>>>> -Warray-bounds=2 now.
>>>>>>
>>>>>> Bootstrapped and regression tested on both x86 and aarch64.
>>>>>> adjusted
>>>>>> -Warray-bounds-61.c due to the false positive warnings.
>>>>>>
>>>>>> Let me know if you have any comments and suggestions.
>>>>>
>>>>> At the last Cauldron I talked with David Malcolm about these kind
>>>>> of
>>>>> issues and thought of instead of suppressing diagnostics to
>>>>> record
>>>>> how a block was duplicated. For jump threading my idea was to
>>>>> record
>>>>> the condition that was proved true when entering the path and do
>>>>> this
>>>>> by recording the corresponding locations
>>>
>>> Is only recording the location for the TRUE path enough?
>>> We might need to record the corresponding locations for both TRUE and
>>> FALSE paths since the VRP might be more accurate on both paths.
>>> Is only recording the location is enough?
>>> Do we need to record the pointer to the original condition stmt?
>>
>> Just to be clear: I don't plan to work on this myself (I have far too
>> much already to work on...); I'm assuming Richard Biener is
>> hoping/planning to implement this himself.
>
> While I think some of this might be an improvement to those vast
> number of "false positive" diagnostics we have from (too) late diagnostic
> passes I do not have the cycles to work on this.
I can study a little bit more on how to improve the diagnostics for PR 109071 first.
FYI, I just updated PR109071’s description as: Need more context for -Warray-bounds warnings due to code duplication from jump threading.
As we already agreed, this is NOT a false positive. It caught a real bug in linux kernel that need to be patched in the kernel source. (See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109071#c11)
In order to add more context to the diagnostics to help the end user locate the bug accurately in their source code, compiler needs:
1. During jump threading phase, record the following information for each duplicated STMT:
A. A pointer to the COND stmt;
B. True/false for such COND
2. During array out-of-bound checking phase, whenever locate an out-of-bound case,
A. Use a rich_location for the diagnostic;
B. Create an instance of diagnositic_path, add events to this diagnostic_path based on the COND stmt, True/False info recorded in the STMT;
C. Call richloc.set_path() to associate the path with the rich_location;
D. Then issue warning with this rich_location instead of the current regular location.
Any comment and suggestion to the above?
Thanks.
Qing
>
> Richard.
>
>> But feel free to ask me any questions about the diagnostic_path
>> machinery within the diagnostics subsystem.
>>
>> [...snip...]
>>
>> Regards
>> Dave
next prev parent reply other threads:[~2024-05-22 18:53 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-13 19:48 Qing Zhao
2024-05-13 20:46 ` Jeff Law
2024-05-13 21:41 ` Kees Cook
2024-05-13 23:38 ` Andrew Pinski
2024-05-14 0:14 ` Kees Cook
2024-05-14 14:57 ` Qing Zhao
2024-05-14 15:08 ` Jeff Law
2024-05-14 16:03 ` Qing Zhao
2024-05-14 13:08 ` Richard Biener
2024-05-14 14:17 ` Qing Zhao
2024-05-14 14:29 ` Richard Biener
2024-05-14 15:19 ` Qing Zhao
2024-05-14 17:14 ` Richard Biener
2024-05-14 17:21 ` Qing Zhao
2024-05-15 6:09 ` Richard Biener
2024-05-15 13:37 ` Qing Zhao
2024-05-14 19:50 ` Kees Cook
2024-05-15 5:50 ` Richard Biener
2024-05-15 14:00 ` David Malcolm
2024-05-21 15:13 ` Qing Zhao
2024-05-21 21:36 ` David Malcolm
2024-05-22 7:38 ` Richard Biener
2024-05-22 18:53 ` Qing Zhao [this message]
2024-05-23 11:46 ` Richard Biener
2024-05-23 14:03 ` Qing Zhao
2024-05-23 14:13 ` David Malcolm
2024-05-23 14:23 ` Qing Zhao
2024-05-31 21:22 ` Qing Zhao
2024-06-03 6:29 ` Richard Biener
2024-06-03 14:33 ` Qing Zhao
2024-06-03 14:48 ` David Malcolm
2024-06-04 7:43 ` Richard Biener
2024-06-04 20:30 ` Qing Zhao
2024-06-05 7:26 ` Richard Biener
2024-06-05 16:38 ` Qing Zhao
2024-06-05 17:07 ` Richard Biener
2024-06-05 17:58 ` Qing Zhao
2024-06-07 19:13 ` Qing Zhao
2024-06-12 18:15 ` Qing Zhao
2024-05-14 16:43 ` Kees Cook
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=552015E4-10FF-4E20-BC2F-8D88A6C55C0B@oracle.com \
--to=qing.zhao@oracle.com \
--cc=dmalcolm@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=keescook@chromium.org \
--cc=pinskia@gmail.com \
--cc=richard.guenther@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).