public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Martin Sebor <msebor@gmail.com>
To: Aldy Hernandez <aldyh@redhat.com>, Andrew MacLeod <amacleod@redhat.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] generalized range_query class for multiple contexts
Date: Thu, 5 Nov 2020 12:29:38 -0700	[thread overview]
Message-ID: <5332f424-1296-4f2a-7800-99277dbe63ac@gmail.com> (raw)
In-Reply-To: <d9065d38-d515-e0fa-0691-6654892b0aec@gmail.com>

On 10/1/20 11:25 AM, Martin Sebor wrote:
> On 10/1/20 9:34 AM, Aldy Hernandez wrote:
>>
>>
>> On 10/1/20 3:22 PM, Andrew MacLeod wrote:
>>  > On 10/1/20 5:05 AM, Aldy Hernandez via Gcc-patches wrote:
>>  >>> Thanks for doing all this!  There isn't anything I don't understand
>>  >>> in the sprintf changes so no questions from me (well, almost none).
>>  >>> Just some comments:
>>  >> Thanks for your comments on the sprintf/strlen API conversion.
>>  >>
>>  >>> The current call statement is available in all functions that take
>>  >>> a directive argument, as dir->info.callstmt.  There should be no 
>> need
>>  >>> to also add it as a new argument to the functions that now need it.
>>  >> Fixed.
>>  >>
>>  >>> The change adds code along these lines in a bunch of places:
>>  >>>
>>  >>> +         value_range vr;
>>  >>> +         if (!query->range_of_expr (vr, arg, stmt))
>>  >>> +           vr.set_varying (TREE_TYPE (arg));
>>  >>>
>>  >>> I thought under the new Ranger APIs when a range couldn't be
>>  >>> determined it would be automatically set to the maximum for
>>  >>> the type.  I like that and have been moving in that direction
>>  >>> with my code myself (rather than having an API fail, have it
>>  >>> set the max range and succeed).
>>  >> I went through all the above idioms and noticed all are being used on
>>  >> supported types (integers or pointers).  So range_of_expr will always
>>  >> return true.  I've removed the if() and the set_varying.
>>  >>
>>  >>> Since that isn't so in this case, I think it would still be nice
>>  >>> if the added code could be written as if the range were set to
>>  >>> varying in this case and (ideally) reduced to just initialization:
>>  >>>
>>  >>>     value_range vr = some-function (query, stmt, arg);
>>  >>>
>>  >>> some-function could be an inline helper defined just for the sprintf
>>  >>> pass (and maybe also strlen which also seems to use the same 
>> pattern),
>>  >>> or it could be a value_range AKA irange ctor, or it could be a 
>> member
>>  >>> of range_query, whatever is the most appropriate.
>>  >>>
>>  >>> (If assigning/copying a value_range is thought to be too expensive,
>>  >>> declaring it first and then passing it to that helper to set it
>>  >>> would work too).
>>  >>>
>>  >>> In strlen, is the removed comment no longer relevant?  (I.e., does
>>  >>> the ranger solve the problem?)
>>  >>>
>>  >>> -      /* The range below may be "inaccurate" if a constant has been
>>  >>> -        substituted earlier for VAL by this pass that hasn't been
>>  >>> -        propagated through the CFG.  This shoud be fixed by the new
>>  >>> -        on-demand VRP if/when it becomes available (hopefully in
>>  >>> -        GCC 11).  */
>>  >> It should.
>>  >>
>>  >>> I'm wondering about the comment added to get_range_strlen_dynamic
>>  >>> and other places:
>>  >>>
>>  >>> +         // FIXME: Use range_query instead of global ranges.
>>  >>>
>>  >>> Is that something you're planning to do in a followup or should
>>  >>> I remember to do it at some point?
>>  >> I'm not planning on doing it.  It's just a reminder that it would be
>>  >> beneficial to do so.
>>  >>
>>  >>> Otherwise I have no concern with the changes.
>>  >> It's not cleared whether Andrew approved all 3 parts of the patchset
>>  >> or just the valuation part.  I'll wait for his nod before committing
>>  >> this chunk.
>>  >>
>>  >> Aldy
>>  >>
>>  > I have no issue with it, so OK.
>>
>> Pushed all 3 patches.
>>
>>  >
>>  > Just an observation that should be pointed out, I believe Aldy has all
>>  > the code for converting to a ranger, but we have not pursued that any
>>  > further yet since there is a regression due to our lack of equivalence
>>  > processing I think?  That should be resolved in the coming month, 
>> but at
>>  > the moment is a holdback/concern for converting these passes...  iirc.
>>
>> Yes.  Martin, the take away here is that the strlen/sprintf pass has 
>> been converted to the new API, but ranger is still not up and running 
>> on it (even on the branch).
>>
>> With the new API, all you have to do is remove all instances of 
>> evrp_range_analyzer and replace them with a ranger.  That's it.
>> Below is an untested patch that would convert you to a ranger once 
>> it's contributed.
>>
>> IIRC when I enabled the ranger for your pass a while back, there was 
>> one or two regressions due to missing equivalences, and the rest were 
>> because the tests were expecting an actual specific range, and the 
>> ranger returned a slightly different/better one.  You'll need to 
>> adjust your tests.
> 
> Ack.  I'll be on the lookout for the ranger commit (if you hppen
> to remember and CC me on it just in case I might miss it that would
> be great).

I have applied the patch and ran some tests.  There are quite
a few failures (see the list below).  I have only looked at
a couple.  The one in in gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
boils down to the following test case.  There should be no warning
for either sprintf call.  The one in h() is a false positive and
the reason for at least some of the regressions.  Somehow,
the conversions between int and char are causing Ranger to lose
the range.

$ cat t.c && gcc -O2 -S -Wall t.c
char a[2];

extern int x;

signed char f (int min, int max)
{
   signed char i = x;
   return i < min || max < i ? min : i;
}

void ff (signed char i)
{
   __builtin_sprintf (a, "%i", f (0, 9));   // okay
}

signed char g (signed char min, signed char max)
{
   signed char i = x;
   return i < min || max < i ? min : i;
}

void gg (void)
{
   __builtin_sprintf (a, "%i", g (0, 9));   // bogus warning
}

t.c: In function ‘gg’:
t.c:24:26: warning: ‘%i’ directive writing between 1 and 4 bytes into a 
region of size 2 [-Wformat-overflow=]
    24 |   __builtin_sprintf (a, "%i", g (0, 9));   // bogus warning
       |                          ^~
t.c:24:25: note: directive argument in the range [-128, 127]
    24 |   __builtin_sprintf (a, "%i", g (0, 9));   // bogus warning
       |                         ^~~~
t.c:24:3: note: ‘__builtin_sprintf’ output between 2 and 5 bytes into a 
destination of size 2
    24 |   __builtin_sprintf (a, "%i", g (0, 9));   // bogus warning
       |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Another failure (the one in builtin-sprintf-warn-22.c) is this where
the false positive is due to the strlen pass somehow having lost
the upper bound on the sum of the two string lengths.

$ cat t.c && gcc -O2 -S -Wall t.c
typedef __SIZE_TYPE__ size_t;

char a[1025];

void g (char *s1, char *s2)
{
   size_t n = __builtin_strlen (s1), d = __builtin_strlen (s2);
   if (n + d + 1 >= 1025)
     return;

   __builtin_sprintf (a, "%s.%s", s1, s2);     // { dg-bogus 
"\\\[-Wformat-overflow" }
}

t.c: In function ‘g’:
t.c:11:29: warning: ‘%s’ directive writing up to 1023 bytes into a 
region of size between 1 and 1024 [-Wformat-overflow=]
    11 |   __builtin_sprintf (a, "%s.%s", s1, s2);     // { dg-bogus 
"\\\[-Wformat-overflow" }
       |                             ^~
t.c:11:3: note: ‘__builtin_sprintf’ output between 2 and 2048 bytes into 
a destination of size 1025
    11 |   __builtin_sprintf (a, "%s.%s", s1, s2);     // { dg-bogus 
"\\\[-Wformat-overflow" }
       |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


I'll see if I can reduce the others into test cases but I'll have
to hold off on enabling the ranger in the sprintf/strlen pass until
these issues are resolved.

Thanks
Martin

FAIL: gcc.dg/Wstringop-overflow-20.c  (test for warnings, line 28)
FAIL: gcc.dg/Wstringop-overflow-20.c  (test for warnings, line 29)
FAIL: gcc.dg/Wstringop-overflow-20.c  (test for warnings, line 32)
FAIL: gcc.dg/Wstringop-overflow-20.c  (test for warnings, line 38)
FAIL: gcc.dg/Wstringop-overflow-20.c  (test for warnings, line 39)
XPASS: gcc.dg/pr80776-1.c  (test for bogus messages, line 22)
FAIL: gcc.dg/strlenopt-81.c (test for excess errors)
FAIL: gcc.dg/strlenopt-90.c scan-tree-dump-not strlen1 "strlen \\(\\&a"
FAIL: gcc.dg/strlenopt-80.c scan-tree-dump-times optimized 
"failure_on_line \\(" 0
FAIL: gcc.dg/strlenopt-89.c scan-tree-dump-not strlen1 "strlen \\(\\&a"
FAIL: gcc.dg/strlenopt-89.c scan-tree-dump-not strlen1 "a1\\] = 0;"
FAIL: gcc.dg/strlenopt-89.c scan-tree-dump-not strlen1 "a2 \\+ 1B\\] = 0;"
FAIL: gcc.dg/strlenopt-89.c scan-tree-dump-not strlen1 "a3 \\+ 2B\\] = 0;"
FAIL: gcc.dg/strlenopt-89.c scan-tree-dump-not strlen1 "a4 \\+ 3B\\] = 0;"
FAIL: gcc.dg/strlenopt-89.c scan-tree-dump-not strlen1 "a5 \\+ 4B\\] = 0;"
FAIL: gcc.dg/strlenopt-89.c scan-tree-dump-not strlen1 "a6 \\+ 5B\\] = 0;"
FAIL: gcc.dg/strlenopt-89.c scan-tree-dump-not strlen1 "a7 \\+ 6B\\] = 0;"
FAIL: gcc.dg/strlenopt-89.c scan-tree-dump-not strlen1 "a8 \\+ 7B\\] = 0;"
FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-3.c (test for excess errors)
(many failures)
FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-22.c  (test for bogus 
messages, line 21)


> 
> Thanks
> Martin
> 
>>
>> Aldy
>>
>> diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
>> index f4d1c5ca256..9f9e95b7155 100644
>> --- a/gcc/tree-ssa-strlen.c
>> +++ b/gcc/tree-ssa-strlen.c
>> @@ -58,8 +58,8 @@ along with GCC; see the file COPYING3.  If not see
>>   #include "tree-ssa-loop.h"
>>   #include "tree-scalar-evolution.h"
>>   #include "vr-values.h"
>> -#include "gimple-ssa-evrp-analyze.h"
>>   #include "tree-ssa.h"
>> +#include "gimple-range.h"
>>
>>   /* A vector indexed by SSA_NAME_VERSION.  0 means unknown, positive 
>> value
>>      is an index into strinfo vector, negative value stands for
>> @@ -5755,16 +5755,13 @@ class strlen_dom_walker : public dom_walker
>>   public:
>>     strlen_dom_walker (cdi_direction direction)
>>       : dom_walker (direction),
>> -    evrp (false),
>>       m_cleanup_cfg (false)
>>     {}
>>
>>     virtual edge before_dom_children (basic_block);
>>     virtual void after_dom_children (basic_block);
>>
>> -  /* EVRP analyzer used for printf argument range processing, and
>> -     to track strlen results across integer variable assignments.  */
>> -  evrp_range_analyzer evrp;
>> +  gimple_ranger ranger;
>>
>>     /* Flag that will trigger TODO_cleanup_cfg to be returned in strlen
>>        execute function.  */
>> @@ -5777,8 +5774,6 @@ public:
>>   edge
>>   strlen_dom_walker::before_dom_children (basic_block bb)
>>   {
>> -  evrp.enter (bb);
>> -
>>     basic_block dombb = get_immediate_dominator (CDI_DOMINATORS, bb);
>>
>>     if (dombb == NULL)
>> @@ -5853,13 +5848,7 @@ strlen_dom_walker::before_dom_children 
>> (basic_block bb)
>>     /* Attempt to optimize individual statements.  */
>>     for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p 
>> (gsi); )
>>       {
>> -      gimple *stmt = gsi_stmt (gsi);
>> -
>> -      /* First record ranges generated by this statement so they
>> -     can be used by printf argument processing.  */
>> -      evrp.record_ranges_from_stmt (stmt, false);
>> -
>> -      if (check_and_optimize_stmt (&gsi, &cleanup_eh, &evrp))
>> +      if (check_and_optimize_stmt (&gsi, &cleanup_eh, &ranger))
>>       gsi_next (&gsi);
>>       }
>>
>> @@ -5878,8 +5867,6 @@ strlen_dom_walker::before_dom_children 
>> (basic_block bb)
>>   void
>>   strlen_dom_walker::after_dom_children (basic_block bb)
>>   {
>> -  evrp.leave (bb);
>> -
>>     if (bb->aux)
>>       {
>>         stridx_to_strinfo = ((vec<strinfo *, va_heap, vl_embed> *) 
>> bb->aux);
>>
> 


  reply	other threads:[~2020-11-05 19:29 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-18 18:38 Aldy Hernandez
2020-09-18 20:40 ` Aldy Hernandez
2020-09-23 23:53 ` Martin Sebor
2020-09-24  6:46   ` Aldy Hernandez
2020-09-24 17:10     ` Martin Sebor
2020-09-25 17:41   ` Andrew MacLeod
2020-09-28 15:27     ` Martin Sebor
2020-09-28 15:48       ` Andrew MacLeod
2020-10-01  4:17     ` Andrew MacLeod
2020-09-24 21:51 ` Martin Sebor
2020-09-25 13:17   ` Andrew MacLeod
2020-09-28  8:12     ` Aldy Hernandez
2020-10-01  9:05   ` Aldy Hernandez
2020-10-01 13:22     ` Andrew MacLeod
2020-10-01 15:34       ` Aldy Hernandez
2020-10-01 17:25         ` Martin Sebor
2020-11-05 19:29           ` Martin Sebor [this message]
2020-11-05 21:02             ` Martin Sebor
2020-11-06  0:02               ` Andrew MacLeod
2020-11-06  0:50                 ` Martin Sebor
2020-11-06  3:08                   ` Andrew MacLeod
2020-11-06 16:13                     ` Martin Sebor
2020-11-06 21:54                       ` Andrew MacLeod
2020-11-10 14:48                         ` Martin Sebor
2020-11-06  0:05             ` Andrew MacLeod
2020-11-10  1:57             ` Andrew MacLeod

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5332f424-1296-4f2a-7800-99277dbe63ac@gmail.com \
    --to=msebor@gmail.com \
    --cc=aldyh@redhat.com \
    --cc=amacleod@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).