public inbox for gcc-patches@gcc.gnu.org
From: Richard Biener <richard.guenther@gmail.com>
To: Jeff Law <law@redhat.com>
Cc: Martin Sebor <msebor@gmail.com>, GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] adjust object size computation for union accesses and PHIs (PR 92765)
Date: Tue, 04 Feb 2020 14:35:00 -0000
Message-ID: <CAFiYyc0GeeqviRtj2UQ81GgNDcRr4a48bSpxSDR-04wQw4N8fA@mail.gmail.com>
In-Reply-To: <76407395bfad8f8c9e24660758cd9340ab897ddb.camel@redhat.com>

On Mon, Feb 3, 2020 at 7:45 PM Jeff Law <law@redhat.com> wrote:
>
> On Fri, 2020-01-31 at 12:04 -0700, Martin Sebor wrote:
> > Attached is a reworked patch since the first one didn't go far
> > enough to solve the major problems.  The new solution relies on
> > get_range_strlen_dynamic the same way as the sprintf optimization,
> > and does away with the determine_min_objsize function and calling
> > compute_builtin_object_size.
> >
> > To minimize testsuite fallout I extended get_range_strlen to handle
> > a couple more kinds of expressions(*), but I still had to xfail and
> > disable a few tests that were relying on being able to use the type
> > of the destination object as the upper bound on the string length.
> >
> > Tested on x86_64-linux.
> >
> > Martin
> >
> > [*] With all the issues around MEM_REFs and types this change needs
> > extra scrutiny.  I'm still not sure I fully understand what can and
> > what cannot be safely relied on at this level.
> >
> > On 1/15/20 6:18 AM, Martin Sebor wrote:
> > > The strcmp optimization newly introduced in GCC 10 relies on
> > > the size of the smallest referenced array object to determine
> > > whether the function can return zero.  When the size of
> > > the object is smaller than the length of the other string
> > > argument the optimization folds the equality to false.
> > >
> > > The bug report has identified a couple of problems here:
> > > 1) when the access to the array object is via a pointer to
> > > a (possibly indirect) member of a union, in GIMPLE the pointer
> > > may actually point to a different member than the one in
> > > the original source code.  Thus the size of the array may
> > > appear to be smaller than in the source code which can then
> > > result in the optimization being invalid.
> > > 2) when the pointer in the access may point to two or more
> > > arrays of different size (i.e., it's the result of a PHI),
> > > assuming it points to the smallest of them can also lead
> > > to an incorrect result when the optimization is applied.
> > >
> > > The attached patch adjusts the optimization to 1) avoid making
> > > any assumptions about the sizes of objects accessed via union
> > > > types, and 2) use the size of the largest object in PHI nodes.
> > >
> > > Tested on x86_64-linux.
> > >
> > > Martin
> >
> >
> > PR tree-optimization/92765 - wrong code for strcmp of a union member
> >
> > gcc/ChangeLog:
> >
> >         PR tree-optimization/92765
> >         * gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and PARM_DECL.
> >         * tree-ssa-strlen.c (compute_string_length): Remove.
> >         (determine_min_objsize): Remove.
> >         (get_len_or_size): Add an argument.  Call get_range_strlen_dynamic.
> >         Avoid using type size as the upper bound on string length.
> >         (handle_builtin_string_cmp): Add an argument.  Adjust.
> >         (strlen_check_and_optimize_call): Pass additional argument to
> >         handle_builtin_string_cmp.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         PR tree-optimization/92765
> >         * g++.dg/tree-ssa/strlenopt-1.C: New test.
> >         * g++.dg/tree-ssa/strlenopt-2.C: New test.
> >         * gcc.dg/Warray-bounds-58.c: New test.
> >         * gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow.
> >         * gcc.dg/Wstring-compare.c: Xfail a test.
> >         * gcc.dg/strcmpopt_2.c: Disable tests.
> >         * gcc.dg/strcmpopt_4.c: Adjust tests.
> >         * gcc.dg/strcmpopt_10.c: New test.
> >         * gcc.dg/strlenopt-69.c: Disable tests.
> >         * gcc.dg/strlenopt-92.c: New test.
> >         * gcc.dg/strlenopt-93.c: New test.
> >         * gcc.dg/strlenopt.h: Declare calloc.
> >         * gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved.
> >         * gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).
> >
> > diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> > index ed225922269..d70ac67e1ca 100644
> > --- a/gcc/gimple-fold.c
> > +++ b/gcc/gimple-fold.c
> > @@ -1280,7 +1280,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
> >                        c_strlen_data *pdata, unsigned eltsize)
> >  {
> >    gcc_assert (TREE_CODE (arg) != SSA_NAME);
> > -
> > +
> >    /* The length computed by this invocation of the function.  */
> >    tree val = NULL_TREE;
> >
> > @@ -1422,7 +1422,42 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,
> >              type about the length here.  */
> >           tight_bound = true;
> >         }
> > -      else if (VAR_P (arg))
> > +      else if (TREE_CODE (arg) == MEM_REF
> > +              && TREE_CODE (TREE_TYPE (arg)) == ARRAY_TYPE
> > +              && TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) == INTEGER_TYPE
> > +              && TREE_CODE (TREE_OPERAND (arg, 0)) == ADDR_EXPR)
> > +       {
> > +         /* Handle a MEM_REF into a DECL accessing an array of integers,
> > +            being conservative about references to extern structures with
> > +            flexible array members that can be initialized to arbitrary
> > +            numbers of elements as an extension (static structs are okay).
> > +            FIXME: Make this less conservative -- see
> > +            component_ref_size in tree.c.  */
> I think it's generally been agreed that we can look at sizes of _DECL
> nodes and this code doesn't look like this walks backwards through
> casts or anything like that.  So the worry would be if we forward
> propagated through a cast into the MEM_REF node.
>
> It looks like forwprop only propagates through "compatible" pointer
> conversions.  It makes me a bit nervous.
>
> Jakub/Richi, comments on this hunk?

+         tree ref = TREE_OPERAND (TREE_OPERAND (arg, 0), 0);
+         tree off = TREE_OPERAND (arg, 1);
+         if ((TREE_CODE (ref) == PARM_DECL || VAR_P (ref))
+             && (!DECL_EXTERNAL (ref)
+                 || !array_at_struct_end_p (arg)))
+           {

I think you want decl_binds_to_current_def_p (ref) instead of !DECL_EXTERNAL.
Since 'arg' is originally a pointer, array_at_struct_end_p is meaningless
here: it is about the structure of a reference, while the pointer is just
a value.  So if you're concerned the object's size might not be what it
looks like, you have to rely on decl_binds_to_current_def_p only.  You also
shouldn't use 'off' directly in the code below but use mem_ref_offset to
access the embedded offset, which is to be interpreted as a signed integer
(it's a pointer as you use it).  You compare it against an unsigned size...
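
To illustrate the signedness concern outside of GCC, here is a minimal
standalone sketch.  It does not use GCC's tree API; remaining_naive and
remaining_signed are hypothetical helpers invented for this example, and
the "bytes remaining" computation is only an assumed stand-in for what
the patch does with 'off'.  It shows how a negative offset that is read
as unsigned yields a remaining size larger than the object itself.

```c
#include <stdint.h>

/* Hypothetical stand-in for the pattern under discussion: OFF holds the
   bits of a signed offset but is treated as an unsigned quantity.  */
static uint64_t
remaining_naive (uint64_t off, uint64_t size)
{
  /* Wraps around when OFF encodes a negative value.  */
  return size - off;
}

/* Same computation with the offset interpreted as signed.  Returns -1
   ("unknown") for offsets that fall outside the object.  */
static int64_t
remaining_signed (int64_t off, uint64_t size)
{
  if (off < 0 || (uint64_t) off > size)
    return -1;
  return (int64_t) (size - (uint64_t) off);
}
```

With an 8-byte object and an offset of -4, the naive variant reports 12
bytes remaining, while the signed variant refuses to answer.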

>
>
> > diff --git a/gcc/testsuite/gcc.dg/Wstring-compare.c b/gcc/testsuite/gcc.dg/Wstring-compare.c
> > index 0ca492db0ab..d1534bf7555 100644
> > --- a/gcc/testsuite/gcc.dg/Wstring-compare.c
> > +++ b/gcc/testsuite/gcc.dg/Wstring-compare.c
> In general I have a slight preference for pulling these into new files
> when we need to xfail them.  Why?  Because a test which previously
> passed, but now fails (even an xfail) causes the tester to flag the
> build as failing due to a testsuite regression.
>
> But I don't think that preference is significant enough to ask you to
> redo the work.  Just something to ponder in the future.
>
>
> > diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c b/gcc/testsuite/gcc.dg/strcmpopt_2.c
> > index 57d8f651c28..f31761be173 100644
> > --- a/gcc/testsuite/gcc.dg/strcmpopt_2.c
> > +++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c
> So I'd pulled f1, f3, f5, f7 into a new file.  But disabling them like
> you've done is reasonable as well.
>
>
> > diff --git a/gcc/testsuite/gcc.dg/strcmpopt_4.c b/gcc/testsuite/gcc.dg/strcmpopt_4.c
> > index 4e26522eed1..b07fbb6b7b0 100644
> > --- a/gcc/testsuite/gcc.dg/strcmpopt_4.c
> > +++ b/gcc/testsuite/gcc.dg/strcmpopt_4.c
> Thanks for creating the new test.  I'd done the exact same thing in my
> local tree.  I'm a little surprised that f_param passes.  Can you
> double check that?
>
>
>
> > diff --git a/gcc/testsuite/gcc.dg/strlenopt-69.c b/gcc/testsuite/gcc.dg/strlenopt-69.c
> > index 9ad8e2e8aac..9df6eeccb97 100644
> > --- a/gcc/testsuite/gcc.dg/strlenopt-69.c
> > +++ b/gcc/testsuite/gcc.dg/strlenopt-69.c
> I'd pulled the offending tests into a new file, but your approach is
> fine too.
>
> >
> > diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
> > index ad9e98973b1..b9972c16e18 100644
> > --- a/gcc/tree-ssa-strlen.c
> > +++ b/gcc/tree-ssa-strlen.c
> > -
> > -/* Given strinfo IDX for ARG, set LENRNG[] to the range of lengths
> > -   of  the string(s) referenced by ARG if it can be determined.
> > -   If the length cannot be determined, set *SIZE to the size of
> > +/* Given strinfo IDX for ARG, sets LENRNG[] to the range of lengths
> > +   of the string(s) referenced by ARG if it can be determined.
> > +   If the length cannot be determined, sets *SIZE to the size of
> >     the array the string is stored in, if any.  If no such array is
> > -   known, set *SIZE to -1.  When the strings are nul-terminated set
> > -   *NULTERM to true, otherwise to false.  Return true on success.  */
> > +   known, sets *SIZE to -1.  When the strings are nul-terminated sets
> > +   *NULTERM to true, otherwise to false.  When nonnull uses RVALS to
> > > +   determine range information. Returns true on success.  */
> "When nonnull uses RVALS to determine range information."  That isn't a
> sentence and just doesn't seem to make sense.  Please review for
> comment clarity.
>
> This looks OK to me with the comment fixed.  I like that we drop the
> whole determine_min_objsize and replace it with the standard range bits
> that we're using elsewhere.
>
> Please give Richi and Jakub time to chime in on the gimple-fold.c
> changes.
>
>
> Jeff
>
