public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/55157] New: Missing VRP
@ 2012-10-31 21:51 pinskia at gcc dot gnu.org
  2021-12-25 23:35 ` [Bug tree-optimization/55157] Missed VRP with != 0 and multiply pinskia at gcc dot gnu.org
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2012-10-31 21:51 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157

             Bug #: 55157
           Summary: Missing VRP
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: pinskia@gcc.gnu.org
            Blocks: 55155


Testcase:
void gg(void);
int f(unsigned t)
{
  unsigned g = t*16;
  if (g==0)  return 1;
  gg();
  gg();
  gg();
  gg();
  gg();
  gg();
  if (g<=4)  return 1;
  return 0;
}

In the end there should only be one check for t == 0.
Yes this shows up in real code (well with the autovectorizer); See PR 55155


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/55157] Missed VRP with != 0 and multiply
  2012-10-31 21:51 [Bug tree-optimization/55157] New: Missing VRP pinskia at gcc dot gnu.org
@ 2021-12-25 23:35 ` pinskia at gcc dot gnu.org
  2022-11-04 11:22 ` aldyh at gcc dot gnu.org
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-12-25 23:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2021-06-08 00:00:00         |2021-12-25

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
LLVM is able to remove the check.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/55157] Missed VRP with != 0 and multiply
  2012-10-31 21:51 [Bug tree-optimization/55157] New: Missing VRP pinskia at gcc dot gnu.org
  2021-12-25 23:35 ` [Bug tree-optimization/55157] Missed VRP with != 0 and multiply pinskia at gcc dot gnu.org
@ 2022-11-04 11:22 ` aldyh at gcc dot gnu.org
  2022-11-04 11:39 ` aldyh at gcc dot gnu.org
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-11-04 11:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157

Aldy Hernandez <aldyh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aldyh at gcc dot gnu.org,
                   |                            |amacleod at redhat dot com

--- Comment #4 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
We have a variety of ways of fixing this:
  1. Improve the range-op entry for MULT_EXPR.
  2. Handle nonzero bits in the relational operators.
  3. Reflect the nonzero bits in the actual range for trivial masks.

At evrp time we have:

=========== BB 2 ============
Imports: t_3(D)  
Exports: t_3(D)  g_4  
         g_4 : t_3(D)(I)  
t_3(D)  [irange] unsigned int VARYING
    <bb 2> :
    g_4 = t_3(D) * 16;
    if (g_4 == 0)
      goto <bb 3>; [INV]
    else
      goto <bb 4>; [INV]

2->3  (T) g_4 :         [irange] unsigned int [0, 0] NONZERO 0x0
2->4  (F) g_4 :         [irange] unsigned int [1, +INF]

Since this is unsigned, improving the multiply range-op entry could give a
range for g_4 of [0,0][16,+INF].  This would be enough to fold the other
conditional: if (g_4 <= 4).

The code for mult and div is legacy code inherited from legacy VRP.  I'm
assuming we didn't handle the above originally because legacy ranges couldn't
represent [0,0][16,+INF].  We can now, so we could improve here.

However, even without a multiplication improvement, the nonzero mask could help
here.  Though this mask is missing by evrp time (because mult doesn't handle it
(yet)), CCP2 does provide it and we can see it by vrp1 time:

        g_4: [irange] unsigned int [1, +INF] NONZERO 0xfffffff0

This should be enough to fold the second conditional: if (g_4 <= 4).  But alas,
the LE_EXPR range-op entry does not take into account nonzero bits.

The above mask of 0xfffffff0 is fancy talk for [0, 0][16, +INF] which we could
intersect with [1, +INF].  We could teach the LE_EXPR range-op entry about
nonzero bits.

However, perhaps instead of teaching all the relation operators about nonzero
bits, we could augment the range in irange::set_nonzero_bits().  Basically,
intersecting [0,0][16,+INF] with the [1, +INF] when setting the nonzero mask.

The patch below does this, but it does have a 3% penalty for VRP (though no
penalty to overall compilation).  I'm inclined to pursue this route, since it
makes nonzero mask optimization more pervasive across the board.

What do you think Andrew?

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index a855aaf626c..db6a3b7bcb6 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -2962,6 +2962,20 @@ irange::set_nonzero_bits (const wide_int_ref &bits)
   normalize_kind ();
   if (flag_checking)
     verify_range ();
+
+  // Reflect the mask as a simple range.  For example, a mask of
+  // 0xff00 could be represented as [0,0][0x100, 0xffff].
+  if (TYPE_UNSIGNED (type ()) && prec > 1)
+    {
+      gcc_checking_assert (nz != 0);
+      wide_int lb = wi::lshift (wi::one (prec), wi::ctz (nz));
+      wide_int ub = wi::lrshift (wi::minus_one (prec), wi::clz (nz));
+      int_range<2> r (type (), lb, ub);
+      int_range<2> zero;
+      zero.set_zero (type ());
+      r.union_ (zero);
+      intersect (r);
+    }
 }

 // Return the nonzero bitmask.  This will return the nonzero bits plus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/55157] Missed VRP with != 0 and multiply
  2012-10-31 21:51 [Bug tree-optimization/55157] New: Missing VRP pinskia at gcc dot gnu.org
  2021-12-25 23:35 ` [Bug tree-optimization/55157] Missed VRP with != 0 and multiply pinskia at gcc dot gnu.org
  2022-11-04 11:22 ` aldyh at gcc dot gnu.org
@ 2022-11-04 11:39 ` aldyh at gcc dot gnu.org
  2022-11-04 13:40 ` amacleod at redhat dot com
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-11-04 11:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157

--- Comment #5 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Aldy Hernandez from comment #4)

> +  // Reflect the mask as a simple range.  For example, a mask of
> +  // 0xff00 could be represented as [0,0][0x100, 0xffff].
> +  if (TYPE_UNSIGNED (type ()) && prec > 1)

Bonus points for making it work with signed.  I'm not smart enough to do that
:).

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/55157] Missed VRP with != 0 and multiply
  2012-10-31 21:51 [Bug tree-optimization/55157] New: Missing VRP pinskia at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2022-11-04 11:39 ` aldyh at gcc dot gnu.org
@ 2022-11-04 13:40 ` amacleod at redhat dot com
  2022-11-04 15:44 ` aldyh at gcc dot gnu.org
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: amacleod at redhat dot com @ 2022-11-04 13:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157

--- Comment #6 from Andrew Macleod <amacleod at redhat dot com> ---
(In reply to Aldy Hernandez from comment #4)

> 
> The patch below does this, but it does have a 3% penalty for VRP (though no
> penalty to overall compilation).  I'm inclined to pursue this route, since
> it makes nonzero mask optimization more pervasive across the board.
> 
> What do you think Andrew?
> 

1) Why wouldn't this be done in set_range_from_nonzero_bits()?  That call is
just above that spot in the code. Or is the name misleading and it does
something else?

2) That seems expensive.. we must be doing unnecessary work.  Maybe it would
speed up if we checked if either the ctz or clz would cause it to do anything
first.  Thus avoiding creating a couple of ranges and performing a union and
intersection in cases where neither the leading nor trailing bit is a zero?

3) It also seems to me that you then only need to add the zero/union iff the
trailing bit has zeros. ie, if the are no trailing zeros, then just set the lb
to 0, and calculate the UB based on the clz.

I should think that would speed things up a bit.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/55157] Missed VRP with != 0 and multiply
  2012-10-31 21:51 [Bug tree-optimization/55157] New: Missing VRP pinskia at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2022-11-04 13:40 ` amacleod at redhat dot com
@ 2022-11-04 15:44 ` aldyh at gcc dot gnu.org
  2022-11-04 15:46 ` aldyh at gcc dot gnu.org
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-11-04 15:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157

--- Comment #7 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Andrew Macleod from comment #6)
> (In reply to Aldy Hernandez from comment #4)
> 
> > 
> > The patch below does this, but it does have a 3% penalty for VRP (though no
> > penalty to overall compilation).  I'm inclined to pursue this route, since
> > it makes nonzero mask optimization more pervasive across the board.
> > 
> > What do you think Andrew?
> > 
> 
> 1) Why wouldn't this be done in set_range_from_nonzero_bits()?  That call is

You're right.

> 2) That seems expensive.. we must be doing unnecessary work.  Maybe it would
> speed up if we checked if either the ctz or clz would cause it to do
> anything first.  Thus avoiding creating a couple of ranges and performing a
> union and intersection in cases where neither the leading nor trailing bit
> is a zero?

I had already played with that.  It made a marginal difference.

> 
> 3) It also seems to me that you then only need to add the zero/union iff the
> trailing bit has zeros. ie, if the are no trailing zeros, then just set the
> lb to 0, and calculate the UB based on the clz.

That actually made it slightly worse.

I'll attach what I just tested.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/55157] Missed VRP with != 0 and multiply
  2012-10-31 21:51 [Bug tree-optimization/55157] New: Missing VRP pinskia at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2022-11-04 15:44 ` aldyh at gcc dot gnu.org
@ 2022-11-04 15:46 ` aldyh at gcc dot gnu.org
  2022-11-04 19:45 ` aldyh at gcc dot gnu.org
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-11-04 15:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157

--- Comment #8 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
Created attachment 53826
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53826&action=edit
untested

Here's what I tested and we're still around a 3% degradation for VRP.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/55157] Missed VRP with != 0 and multiply
  2012-10-31 21:51 [Bug tree-optimization/55157] New: Missing VRP pinskia at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2022-11-04 15:46 ` aldyh at gcc dot gnu.org
@ 2022-11-04 19:45 ` aldyh at gcc dot gnu.org
  2022-11-04 19:48 ` aldyh at gcc dot gnu.org
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-11-04 19:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157

--- Comment #9 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Aldy Hernandez from comment #7)
> (In reply to Andrew Macleod from comment #6)
> > (In reply to Aldy Hernandez from comment #4)

> > 3) It also seems to me that you then only need to add the zero/union iff the
> > trailing bit has zeros. ie, if the are no trailing zeros, then just set the
> > lb to 0, and calculate the UB based on the clz.
> 
> That actually made it slightly worse.

I did some more testing.  Your suggestions actually improved the original code
from 3.3% to 2.5%, but I added:

+  if (!wi::neg_p (mask, TYPE_SIGN (type ())) && prec > 1)

instead of:

+  if (TYPE_UNSIGNED (type ()) && prec > 1)

because we can still perform the optimization for signed positive numbers.  And
that brought the slowdown back to 3.6%.  But ISTM that we're gonna do it, might
as well do it for signed numbers as well.

Hmmmm... I think we really should do our best to represent masks as ranges as
it makes other optimizations possible across the ranger ecosystem.  But I'm not
ecstatic about the 3.6% drop (even though overall compilation is unaffected).

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/55157] Missed VRP with != 0 and multiply
  2012-10-31 21:51 [Bug tree-optimization/55157] New: Missing VRP pinskia at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2022-11-04 19:45 ` aldyh at gcc dot gnu.org
@ 2022-11-04 19:48 ` aldyh at gcc dot gnu.org
  2022-11-04 20:03 ` aldyh at gcc dot gnu.org
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-11-04 19:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157

--- Comment #10 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
Original TYPE_UNSIGNED patch with leading / trailing suggestions: -2.52%

As attached: -3.62%

Moving the code out of set_range_from_nonzero_bits back into set_nonzero_bits:
-3.7%

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/55157] Missed VRP with != 0 and multiply
  2012-10-31 21:51 [Bug tree-optimization/55157] New: Missing VRP pinskia at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2022-11-04 19:48 ` aldyh at gcc dot gnu.org
@ 2022-11-04 20:03 ` aldyh at gcc dot gnu.org
  2022-11-05  9:04 ` aldyh at gcc dot gnu.org
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-11-04 20:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157

Aldy Hernandez <aldyh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #53826|0                           |1
        is obsolete|                            |

--- Comment #11 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
Created attachment 53827
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53827&action=edit
revised patch in testing

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/55157] Missed VRP with != 0 and multiply
  2012-10-31 21:51 [Bug tree-optimization/55157] New: Missing VRP pinskia at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2022-11-04 20:03 ` aldyh at gcc dot gnu.org
@ 2022-11-05  9:04 ` aldyh at gcc dot gnu.org
  2022-11-07 19:59 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-11-05  9:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157

--- Comment #12 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
Created attachment 53831
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53831&action=edit
solution improving MULT_EXPR range-ops

Another solution is just improving the MULT_EXPR range-op entry.  This has no
penalty anywhere.

Though, I still feel we should reflect nonzero masks in the range independently
of this :-/.

Either way, we can solve the PR.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/55157] Missed VRP with != 0 and multiply
  2012-10-31 21:51 [Bug tree-optimization/55157] New: Missing VRP pinskia at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2022-11-05  9:04 ` aldyh at gcc dot gnu.org
@ 2022-11-07 19:59 ` cvs-commit at gcc dot gnu.org
  2022-11-07 20:02 ` aldyh at gcc dot gnu.org
  2022-11-08 18:26 ` pinskia at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-11-07 19:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157

--- Comment #13 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Aldy Hernandez <aldyh@gcc.gnu.org>:

https://gcc.gnu.org/g:a239a63f868e29e9276088e7c0fb00804c2903ba

commit r13-3761-ga239a63f868e29e9276088e7c0fb00804c2903ba
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Fri Nov 4 22:24:42 2022 +0100

    Improve multiplication by powers of 2 in range-ops.

    For unsigned numbers, multiplication by X, where X is a power of 2 is
    [0,0][X,+INF].

    This patch causes a regression to g++.dg/pr71488.C where
    -Wstringop-overflow gets the same IL as before, but better ranges
    cause it to issue a bogus warning.  I will create a PR with some
    notes.

    No discernible changes in performance.

    Tested on x86-64 Linux.

            PR tree-optimization/55157

    gcc/ChangeLog:

            * range-op.cc (operator_mult::wi_fold): Optimize multiplications
            by powers of 2.

    gcc/testsuite/ChangeLog:

            * gcc.dg/tree-ssa/pr55157.c: New test.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/55157] Missed VRP with != 0 and multiply
  2012-10-31 21:51 [Bug tree-optimization/55157] New: Missing VRP pinskia at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2022-11-07 19:59 ` cvs-commit at gcc dot gnu.org
@ 2022-11-07 20:02 ` aldyh at gcc dot gnu.org
  2022-11-08 18:26 ` pinskia at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: aldyh at gcc dot gnu.org @ 2022-11-07 20:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157

Aldy Hernandez <aldyh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #14 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Aldy Hernandez from comment #11)
> Created attachment 53827 [details]
> revised patch in testing

(In reply to Aldy Hernandez from comment #4)
> We have a variety of ways of fixing this:
>   1. Improve the range-op entry for MULT_EXPR.
>   2. Handle nonzero bits in the relational operators.
>   3. Reflect the nonzero bits in the actual range for trivial masks.

I'm going to close this PR as fixed, though I'd still like to do #3 as
discussed if we can accept a performance degradation, or come up with a more
efficient way of doing so.  Either way, that is irrelevant to the PR per se.

Fixed in trunk.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/55157] Missed VRP with != 0 and multiply
  2012-10-31 21:51 [Bug tree-optimization/55157] New: Missing VRP pinskia at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2022-11-07 20:02 ` aldyh at gcc dot gnu.org
@ 2022-11-08 18:26 ` pinskia at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-11-08 18:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |13.0

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-11-08 18:26 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-31 21:51 [Bug tree-optimization/55157] New: Missing VRP pinskia at gcc dot gnu.org
2021-12-25 23:35 ` [Bug tree-optimization/55157] Missed VRP with != 0 and multiply pinskia at gcc dot gnu.org
2022-11-04 11:22 ` aldyh at gcc dot gnu.org
2022-11-04 11:39 ` aldyh at gcc dot gnu.org
2022-11-04 13:40 ` amacleod at redhat dot com
2022-11-04 15:44 ` aldyh at gcc dot gnu.org
2022-11-04 15:46 ` aldyh at gcc dot gnu.org
2022-11-04 19:45 ` aldyh at gcc dot gnu.org
2022-11-04 19:48 ` aldyh at gcc dot gnu.org
2022-11-04 20:03 ` aldyh at gcc dot gnu.org
2022-11-05  9:04 ` aldyh at gcc dot gnu.org
2022-11-07 19:59 ` cvs-commit at gcc dot gnu.org
2022-11-07 20:02 ` aldyh at gcc dot gnu.org
2022-11-08 18:26 ` pinskia at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).