public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Aldy Hernandez <aldyh@redhat.com>
To: Martin Sebor <msebor@gmail.com>
Cc: Jakub Jelinek <jakub@redhat.com>,
	Martin Sebor <msebor@redhat.com>,
	 GCC patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] Convert strlen pass from evrp to ranger.
Date: Sat, 9 Oct 2021 17:04:14 +0200	[thread overview]
Message-ID: <CAGm3qMXt5JWV60G5FS0i1z4AjogcB0UsE+YDwsYOqi21ci9qyQ@mail.gmail.com> (raw)
In-Reply-To: <fa8ccb2e-39fd-48d9-fc29-1760bff311d5@gmail.com>

On Fri, Oct 8, 2021 at 6:52 PM Martin Sebor <msebor@gmail.com> wrote:
>
> On 10/8/21 9:12 AM, Aldy Hernandez via Gcc-patches wrote:
> > The following patch converts the strlen pass from evrp to ranger,
> > leaving DOM as the last remaining user.
>
> Thanks for doing this.  I know I said I'd work on it but I'm still
> bogged down in my stage 1 work that's not going so great :(  I just
> have a few minor comments/questions on the strlen change (inline)
> but I am a bit concerned about the test failure.
>
> >
> > No additional cleanups have been done.  For example, the strlen pass
> > still has uses of VR_ANTI_RANGE, and the sprintf still passes around
> > pairs of integers instead of using a proper range.  Fixing this
> > could further improve these passes.
> >
> > As a further enhancement, if the relevant maintainers deem useful,
> > the domwalk could be removed from strlen.  That is, unless the pass
> > needs it for something else.
> >
> > With ranger we are now able to remove the range calculation from
> > before_dom_children entirely.  Just working with the ranger on-demand
> > catches all the strlen and sprintf testcases with the exception of
> > builtin-sprintf-warn-22.c which is due to a limitation of the sprintf
> > code.  I have XFAILed the test and documented what the problem is.
>
> builtin-sprintf-warn-22.c is a regression test for a false positive
> in Glibc.  If it fails we'll have to deal with the Glibc failure
> again, which I would rather avoid.  Have you checked to see if
> Glibc is affected by the change?
>
> >
> > It looks like the same problem in the sprintf test triggers a false
> > positive in gimple-ssa-warn-access.cc so I have added
> > -Wno-format-overflow until it can be fixed.
> >
> > I can expand on the false positive if necessary, but the gist is that
> > this:
> >
> >      _17 = strlen (_132);
> >      _18 = strlen (_136);
> >      _19 = _18 + _17;
> >      if (_19 > 75)
> >        goto <bb 59>; [0.00%]
> >      else
> >        goto <bb 61>; [100.00%]
> >
> > ...dominates the sprintf in BB61.  This means that ranger can figure
> > out that the _17 and _18 are [0, 75].  On the other hand, evrp
> > returned a range of [0, 9223372036854775805] which presumably the
> > sprintf code was ignoring as a false positive here:
>
> This is a feature designed to avoid false positives when the sprintf
> pass doesn't know anything about the strings (i.e., their lengths
> are unconstrained by either the sizes of the arrays they're stored
> in or any expressions like asserts involving their lengths).
>
> It sounds like the strlen/ranger improvement partially propagates
> constraints from subsequent expressions into the strlen results
> but it doesn't go far enough for them to actually fully satisfy
> the constraint, which is what in turn triggers the warning.
>
> I.e., in the test:
>
> void g (char *s1, char *s2)
> {
>    char b[1025];
>    size_t n = __builtin_strlen (s1), d = __builtin_strlen (s2);
>    if (n + d + 1 >= 1025)
>      return;
>
>    sprintf (b, "%s.%s", s1, s2); // { dg-bogus "\\\[-Wformat-overflow" }
>
> the range of n and d is [0, INF] and so the sprintf call doesn't
> trigger a warning.  With your change, because their range is
> [0, 1023] each (and there's no way to express that their sum
> is less than 1025), the warning triggers because it considers
> the worst case scenario (the upper bounds of both).

I agree with Andrew.  If this doesn't work with more than 1 string we
shouldn't even attempt it, as it's bound to be riddled with false
positives, which you'll get more of with enhanced ranges at this
point.

> > @@ -269,7 +270,7 @@ compare_nonzero_chars (strinfo *si, unsigned HOST_WIDE_INT off,
> >       return -1;
> >
> >     value_range vr;
> > -  if (!rvals->range_of_expr (vr, si->nonzero_chars, si->stmt))
> > +  if (!rvals->range_of_expr (vr, si->nonzero_chars, stmt))
>
> We need stmt rather than si->stmt because the latter may be different
> or null when the former will always refer to the current statement,
> correct?

Yes.

> I think there still are quite a few calls to get_addr_stridx() and
> get_addr() that don't pass in rvals.  I started doing this back in
> the GCC 11 cycle but didn't finish the job.  Eventually, rvals
> should be passed to all their callers (or they made class members
> with a ranger member).  I mention this in case it has an effect on
> the range propagation (since the pass also sets ranges).

Definitely.  You'll get even better ranges, though perhaps more false
positives as discussed above :-/.

Aldy


  parent reply	other threads:[~2021-10-09 15:04 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-08 15:12 Aldy Hernandez
2021-10-08 16:51 ` Martin Sebor
2021-10-08 17:56   ` Andrew MacLeod
2021-10-08 20:27     ` Martin Sebor
2021-10-09 15:04   ` Aldy Hernandez [this message]
2021-10-09 16:19     ` Martin Sebor
2021-10-09 17:59       ` Martin Sebor
2021-10-11  6:54         ` Aldy Hernandez
2021-10-09 18:47 ` Aldy Hernandez
2021-10-14 22:07   ` Martin Sebor
2021-10-14 23:45     ` Jeff Law
2021-10-15  0:47     ` Andrew MacLeod
2021-10-15 10:39       ` Aldy Hernandez
2021-10-17 22:49         ` Jeff Law
2021-10-18  7:43           ` Aldy Hernandez
2021-10-22 11:11         ` Aldy Hernandez
2021-10-29 20:04         ` Aldy Hernandez
2021-11-09  0:09         ` Jeff Law
2021-10-17 22:52 ` Jeff Law
2021-10-18  8:17   ` Aldy Hernandez
2021-10-20 20:58     ` Jeff Law
2021-10-21  7:42       ` Aldy Hernandez
2021-10-21 18:20         ` Jeff Law
2021-10-23 21:32           ` Jeff Law
2021-10-25  1:59             ` Jeff Law
2021-10-21 10:20       ` Richard Biener
2021-10-21 12:56         ` Aldy Hernandez
2021-10-21 13:14           ` Richard Biener
2021-10-21 13:30             ` Aldy Hernandez
2021-10-21 13:46               ` Richard Biener
2021-10-21 14:17                 ` Aldy Hernandez
2021-10-21 13:43           ` Jeff Law
2021-10-21 14:18             ` Aldy Hernandez
2021-10-25  2:15     ` Jeff Law
2021-10-25  4:42       ` Jeff Law
2021-10-25 11:27         ` Aldy Hernandez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGm3qMXt5JWV60G5FS0i1z4AjogcB0UsE+YDwsYOqi21ci9qyQ@mail.gmail.com \
    --to=aldyh@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=msebor@gmail.com \
    --cc=msebor@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).