[Bug middle-end/98865] New: Missed transform of (a >> 63) * b

* [Bug middle-end/98865] New: Missed transform of (a >> 63) * b
@ 2021-01-28 13:38 rguenth at gcc dot gnu.org
  2021-01-28 13:40 ` [Bug middle-end/98865] " rguenth at gcc dot gnu.org
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-01-28 13:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98865

            Bug ID: 98865
           Summary: Missed transform of (a >> 63) * b
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

unsigned long foo (unsigned long a, unsigned long b)
{
  return (a >> 63) * b;
}

generates

foo:
.LFB0:
        .cfi_startproc
        shrq    $63, %rdi
        movq    %rdi, %rax
        imulq   %rsi, %rax
        ret

but we can do (like llvm):

foo:                                    # @foo
        .cfi_startproc
# %bb.0:
        movq    %rdi, %rax
        sarq    $63, %rax
        andq    %rsi, %rax
        retq
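The equivalence behind the two listings can be checked in plain C. This is only a sketch, not GCC's implementation: mul_form, and_form and check_forms are hypothetical names, a 64-bit type is assumed, and both the unsigned-to-signed conversion and the signed right shift are assumed to behave as GCC documents them (modulo conversion, sign-extending shift); ISO C leaves both implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: the logical shift leaves 0 or 1, then multiply. */
static uint64_t mul_form(uint64_t a, uint64_t b)
{
    return (a >> 63) * b;
}

/* Rewritten form: the arithmetic shift yields 0 or all-ones, then mask. */
static uint64_t and_form(uint64_t a, uint64_t b)
{
    uint64_t mask = (uint64_t)((int64_t)a >> 63);
    return mask & b;
}

/* Compare both forms on a few boundary values. */
static void check_forms(void)
{
    const uint64_t samples[] = { 0, 1, 0x7fffffffffffffffULL,
                                 0x8000000000000000ULL, UINT64_MAX };
    const unsigned n = sizeof samples / sizeof samples[0];
    for (unsigned i = 0; i < n; i++)
        for (unsigned j = 0; j < n; j++)
            assert(mul_form(samples[i], samples[j])
                   == and_form(samples[i], samples[j]));
}
```

Calling check_forms() from a test driver exercises both the "top bit set" and "top bit clear" cases.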

* [Bug middle-end/98865] Missed transform of (a >> 63) * b
  2021-01-28 13:38 [Bug middle-end/98865] New: Missed transform of (a >> 63) * b rguenth at gcc dot gnu.org
@ 2021-01-28 13:40 ` rguenth at gcc dot gnu.org
  2021-01-28 14:26 ` jakub at gcc dot gnu.org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-01-28 13:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98865

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |x86_64-*-*

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Happens in Botan AES-128/XTS (seen in PR98856).  Probably something for RTL
expansion or even match.pd, and not target specific.  The transform is
considerably faster for arithmetic wider than word_mode (only the upper part
needs shifting, and the result can be shared for the bitwise and) - but that's
then really for RTL expansion.
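To illustrate the remark about arithmetic wider than word_mode, here is a rough sketch for a 128-bit operand on a 64-bit target. It assumes the unsigned __int128 extension provided by GCC and Clang, plus GCC's sign-extending signed right shift; the names u128, mul_form128, and_form128 and check_forms128 are made up for this example. Only the upper word is shifted, and the resulting mask is reused for both halves of b.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;   /* GCC/Clang extension, assumed available */

/* Original form: a full 128-bit multiplication by 0 or 1. */
static u128 mul_form128(u128 a, u128 b)
{
    return (a >> 127) * b;
}

/* Word-wise form: one shift of the upper word, two word-sized ANDs. */
static u128 and_form128(u128 a, u128 b)
{
    uint64_t a_hi = (uint64_t)(a >> 64);
    uint64_t mask = (uint64_t)((int64_t)a_hi >> 63);  /* 0 or all-ones */
    uint64_t lo = (uint64_t)b & mask;                 /* mask shared ... */
    uint64_t hi = (uint64_t)(b >> 64) & mask;         /* ... by both halves */
    return ((u128)hi << 64) | lo;
}

/* Compare both forms on a few boundary values. */
static void check_forms128(void)
{
    const u128 top = (u128)1 << 127;
    const u128 samples[] = { 0, 1, top, top | 12345u, ~(u128)0,
                             ((u128)0x7fffffffffffffffULL << 64) | 7u };
    const unsigned n = sizeof samples / sizeof samples[0];
    for (unsigned i = 0; i < n; i++)
        for (unsigned j = 0; j < n; j++)
            assert(mul_form128(samples[i], samples[j])
                   == and_form128(samples[i], samples[j]));
}
```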

* [Bug middle-end/98865] Missed transform of (a >> 63) * b
  2021-01-28 13:38 [Bug middle-end/98865] New: Missed transform of (a >> 63) * b rguenth at gcc dot gnu.org
  2021-01-28 13:40 ` [Bug middle-end/98865] " rguenth at gcc dot gnu.org
@ 2021-01-28 14:26 ` jakub at gcc dot gnu.org
  2021-03-07  2:46 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-01-28 14:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98865

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
        PR middle-end/98865
        * match.pd (a * (b >> (prec-1)) to ((signed)b >> (prec-1)) & a): New
        simplification.

--- gcc/match.pd.jj     2021-01-22 11:50:09.882909120 +0100
+++ gcc/match.pd        2021-01-28 15:20:20.536238614 +0100
@@ -793,6 +793,16 @@ (define_operator_list COND_TERNARY
        && tree_nop_conversion_p (type, TREE_TYPE (@1)))
    (lshift @0 @2)))

+/* Fold (a * (b >> (prec-1))) with logical shift into
+   ((signed)b >> (prec-1)) & a.  */
+(simplify
+ (mult:c @0 (nop_convert? (rshift @1 INTEGER_CST@2)))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
+       && TYPE_UNSIGNED (TREE_TYPE (@1))
+       && wi::to_widest (@2) + 1 == TYPE_PRECISION (TREE_TYPE (@1)))
+   (with { tree stype = signed_type_for (TREE_TYPE (@1)); }
+    (bit_and (convert:type (rshift (convert:stype @1) @2)) @0))))
+
 /* Fold (1 << (C - x)) where C = precision(type) - 1
    into ((1 << C) >> x). */
 (simplify

(completely untested) does that.
It doesn't handle vector types; whether that is a good idea or not depends on
how we deal with the issue of match.pd simplifications after the last veclower
pass.
And, given:
unsigned long long
foo (unsigned long long a, unsigned long long b)
{
  return (a >> 63) * b;
}

long long
bar (long long a, long long b)
{
  return -(a >> 63) * b;
}

long long
baz (long long a, long long b)
{
  long long c = a >> 63;
  long long d = -c;
  return d * b;
}
we optimize foo and bar with it, but not baz; apparently the transformation of
the arithmetic -(a >> 63) into a logical (a >> 63) shift is done only in
GENERIC folding and not later.
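The identity that bar and baz rely on can be written out as follows. This is a sketch under the assumption that a signed right shift is arithmetic (as GCC documents; ISO C leaves it implementation-defined); neg_arith_shift, logical_shift and check_negated_shift are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift gives 0 or -1; negating it gives 0 or 1. */
static int64_t neg_arith_shift(int64_t a)
{
    return -(a >> 63);
}

/* Logical shift of the same bits gives 0 or 1 directly. */
static int64_t logical_shift(int64_t a)
{
    return (int64_t)((uint64_t)a >> 63);
}

/* Both forms should agree for every input; check a few boundary values. */
static void check_negated_shift(void)
{
    const int64_t samples[] = { INT64_MIN, -5, -1, 0, 1, 5, INT64_MAX };
    const unsigned n = sizeof samples / sizeof samples[0];
    for (unsigned i = 0; i < n; i++)
        assert(neg_arith_shift(samples[i]) == logical_shift(samples[i]));
}
```

Once -(a >> 63) is recognised as the logical shift, bar and baz have the same shape as foo, which is why the missing GIMPLE folding matters for baz.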

* [Bug middle-end/98865] Missed transform of (a >> 63) * b
  2021-01-28 13:38 [Bug middle-end/98865] New: Missed transform of (a >> 63) * b rguenth at gcc dot gnu.org
  2021-01-28 13:40 ` [Bug middle-end/98865] " rguenth at gcc dot gnu.org
  2021-01-28 14:26 ` jakub at gcc dot gnu.org
@ 2021-03-07  2:46 ` pinskia at gcc dot gnu.org
  2021-07-20 22:19 ` pinskia at gcc dot gnu.org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-03-07  2:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98865

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-03-07
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.

* [Bug middle-end/98865] Missed transform of (a >> 63) * b
  2021-01-28 13:38 [Bug middle-end/98865] New: Missed transform of (a >> 63) * b rguenth at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2021-03-07  2:46 ` pinskia at gcc dot gnu.org
@ 2021-07-20 22:19 ` pinskia at gcc dot gnu.org
  2021-09-22 18:19 ` cvs-commit at gcc dot gnu.org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-20 22:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98865

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu.org

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Actually I think we should do:
(simplify
 (mult:c truth_valuep@0 @1)
 (and (neg @0) @1))

Instead. What do you think?

This will catch things like:
unsigned long foo (long a, unsigned long b)
{
  unsigned long t =  a & 1;
  return t * b;
}

---- CUT ---
We can even put a ! after neg if we want the transform applied only when the
negation itself gets optimized out.
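For reference, the identity behind this more general pattern, written in C rather than match.pd syntax. This is a sketch only; mul_bit, and_neg_bit and check_bit_forms are hypothetical names, and the negation is done on the unsigned value so that it wraps to all-ones instead of relying on signed behaviour.

```c
#include <assert.h>
#include <limits.h>

/* Multiplication by a value known to be 0 or 1. */
static unsigned long mul_bit(long a, unsigned long b)
{
    unsigned long t = a & 1;
    return t * b;
}

/* Same result via negate-and-mask: -t is 0 or all-ones. */
static unsigned long and_neg_bit(long a, unsigned long b)
{
    unsigned long t = a & 1;
    return -t & b;
}

/* Compare both forms on odd, even, negative and boundary inputs. */
static void check_bit_forms(void)
{
    const long as[] = { LONG_MIN, -3, -2, -1, 0, 1, 2, 3, LONG_MAX };
    const unsigned long bs[] = { 0, 1, 135, ULONG_MAX };
    for (unsigned i = 0; i < sizeof as / sizeof as[0]; i++)
        for (unsigned j = 0; j < sizeof bs / sizeof bs[0]; j++)
            assert(mul_bit(as[i], bs[j]) == and_neg_bit(as[i], bs[j]));
}
```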

* [Bug middle-end/98865] Missed transform of (a >> 63) * b
  2021-01-28 13:38 [Bug middle-end/98865] New: Missed transform of (a >> 63) * b rguenth at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2021-07-20 22:19 ` pinskia at gcc dot gnu.org
@ 2021-09-22 18:19 ` cvs-commit at gcc dot gnu.org
  2022-01-11 11:19 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-09-22 18:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98865

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:8f571e64713cc72561f84241863496e473eae4c6

commit r12-3824-g8f571e64713cc72561f84241863496e473eae4c6
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Wed Sep 22 19:17:49 2021 +0100

    More NEGATE_EXPR folding in match.pd

    As observed by Jakub in comment #2 of PR 98865, the expression -(a>>63)
    is optimized in GENERIC but not in GIMPLE.  Investigating further it
    turns out that this is one of a few transformations performed by
    fold_negate_expr in fold-const.c that aren't yet performed by match.pd.
    This patch moves/duplicates them there, and should be relatively safe
    as these transformations are already performed by the compiler, but
    just in different passes.

    This revised patch adds a Boolean simplify argument to tree-ssa-sccvn.c's
    vn_nary_build_or_lookup_1 to control whether simplification should be
    performed before value numbering, updating the callers, but then
    avoiding simplification when constructing/value-numbering NEGATE_EXPR.
    This avoids the regression of gcc.dg/tree-ssa/ssa-fre-88.c, and enables
    the new test case(s) to pass.

    2021-09-22  Roger Sayle  <roger@nextmovesoftware.com>
                Richard Biener  <rguenther@suse.de>

    gcc/ChangeLog
            * match.pd (negation simplifications): Implement some negation
            folding transformations from fold-const.c's fold_negate_expr.
            * tree-ssa-sccvn.c (vn_nary_build_or_lookup_1): Add a SIMPLIFY
            argument, to control whether the op should be simplified prior
            to looking up/assigning a value number.
            (vn_nary_build_or_lookup): Update call to
            vn_nary_build_or_lookup_1.
            (vn_nary_simplify): Likewise.
            (visit_nary_op): Likewise, but when constructing a NEGATE_EXPR
            now call vn_nary_build_or_lookup_1 disabling simplification.

    gcc/testsuite/ChangeLog
            * gcc.dg/fold-negate-1.c: New test case.

* [Bug middle-end/98865] Missed transform of (a >> 63) * b
  2021-01-28 13:38 [Bug middle-end/98865] New: Missed transform of (a >> 63) * b rguenth at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2021-09-22 18:19 ` cvs-commit at gcc dot gnu.org
@ 2022-01-11 11:19 ` rguenth at gcc dot gnu.org
  2022-05-18 15:24 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-01-11 11:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98865

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|enhancement                 |normal
   Last reconfirmed|2021-03-07 00:00:00         |2022-1-11

* [Bug middle-end/98865] Missed transform of (a >> 63) * b
  2021-01-28 13:38 [Bug middle-end/98865] New: Missed transform of (a >> 63) * b rguenth at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2022-01-11 11:19 ` rguenth at gcc dot gnu.org
@ 2022-05-18 15:24 ` cvs-commit at gcc dot gnu.org
  2022-05-19 16:55 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-18 15:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98865

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:4a9be8d51182076222d707d9d68f6eda78e8ee2c

commit r13-624-g4a9be8d51182076222d707d9d68f6eda78e8ee2c
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Wed May 18 16:23:01 2022 +0100

    Correct ix86_rtx_cost for multi-word multiplication.

    This is the i386 backend specific piece of my revised patch for
    PR middle-end/98865, where Richard Biener has suggested that I perform
    the desired transformation during RTL expansion where the backend can
    control whether it is profitable to convert a multiplication into a
    bit-wise AND and a negation.  This works well for x86_64, but alas
    exposes a latent bug with -m32, where a DImode multiplication incorrectly
    appears to be cheaper than negdi2+anddi3(!?).  The fix to ix86_rtx_costs
    is to report that a DImode (multi-word) multiplication actually requires
    three SImode multiplications and two SImode additions.  This also corrects
    the cost of TImode multiplication on TARGET_64BIT.

    2022-05-18  Roger Sayle  <roger@nextmovesoftware.com>

    gcc/ChangeLog
            * config/i386/i386.cc (ix86_rtx_costs) [MULT]: When mode size
            is wider than word_mode, a multiplication costs three word_mode
            multiplications and two word_mode additions.
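The "three multiplications and two additions" accounting can be seen by writing a double-word multiplication out by hand. The sketch below does this for a 64-bit product built from 32-bit words, as on -m32; it is an illustration of the cost model's reasoning, not the code the i386 backend emits. The names mul64_by_words and check_mul64 are hypothetical, and uint32_t is assumed to be no narrower than int so that the cross products wrap as unsigned values.

```c
#include <assert.h>
#include <stdint.h>

/* Low 64 bits of a * b, computed from 32-bit halves. */
static uint64_t mul64_by_words(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    uint64_t low    = (uint64_t)a_lo * b_lo;  /* multiplication 1 (widening) */
    uint32_t cross1 = a_hi * b_lo;            /* multiplication 2 (low word) */
    uint32_t cross2 = a_lo * b_hi;            /* multiplication 3 (low word) */

    /* Two additions fold the cross products into the upper word. */
    uint32_t high = (uint32_t)(low >> 32) + cross1 + cross2;

    return ((uint64_t)high << 32) | (uint32_t)low;
}

/* Compare against the native 64-bit multiplication. */
static void check_mul64(void)
{
    const uint64_t samples[] = { 0, 1, 135, 0xffffffffULL, 0x100000000ULL,
                                 0x123456789abcdef0ULL, UINT64_MAX };
    const unsigned n = sizeof samples / sizeof samples[0];
    for (unsigned i = 0; i < n; i++)
        for (unsigned j = 0; j < n; j++)
            assert(mul64_by_words(samples[i], samples[j])
                   == samples[i] * samples[j]);
}
```

Against that, a double-word negation plus a double-word AND is only a handful of word-sized operations, which is why the corrected cost makes the and/neg form win.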

* [Bug middle-end/98865] Missed transform of (a >> 63) * b
  2021-01-28 13:38 [Bug middle-end/98865] New: Missed transform of (a >> 63) * b rguenth at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2022-05-18 15:24 ` cvs-commit at gcc dot gnu.org
@ 2022-05-19 16:55 ` cvs-commit at gcc dot gnu.org
  2022-05-27  8:02 ` cvs-commit at gcc dot gnu.org
  2022-05-28  9:18 ` roger at nextmovesoftware dot com
  9 siblings, 0 replies; 11+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-19 16:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98865

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:d863ba23fb16122bb0547b0c678173be0d98f43c

commit r13-673-gd863ba23fb16122bb0547b0c678173be0d98f43c
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Thu May 19 17:54:38 2022 +0100

    PR middle-end/98865: Expand X*Y as X&-Y when Y is [0,1].

    The patch is a revised solution for PR middle-end/98865 incorporating
    the feedback/suggestions from Richard Biener's review here:
    https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593928.html
    Most significantly, this patch now performs the transformation/optimization
    during RTL expansion, where the target's rtx_costs can be used to determine
    whether the original multiplication (that may potentially be implemented by
    a shift or lea) is cheaper than a negation and a bit-wise and.

    Previously the expression (x>>63)*y would be compiled with -O2 as
            shrq    $63, %rdi
            movq    %rdi, %rax
            imulq   %rsi, %rax

    but with this patch now produces:
            sarq    $63, %rdi
            movq    %rdi, %rax
            andq    %rsi, %rax

    Likewise the expression (x>>63)*135 [that appears in a hot-spot of the
    Botan AES-128 benchmark] was previously:

            shrq    $63, %rdi
            leaq    (%rdi,%rdi,8), %rdx
            movq    %rdx, %rax
            salq    $4, %rax
            subq    %rdx, %rax

    now becomes:
            movq    %rdi, %rax
            sarq    $63, %rax
            andl    $135, %eax

    2022-05-19  Roger Sayle  <roger@nextmovesoftware.com>

    gcc/ChangeLog
            PR middle-end/98865
            * expr.cc (expand_expr_real_2) [MULT_EXPR]:  Expand X*Y as X&Y
            when both X and Y are [0, 1], X*Y as X&-Y when Y is [0,1] and
            likewise X*Y as -X&Y when X is [0,1] using tree_nonzero_bits.

    gcc/testsuite/ChangeLog
            PR middle-end/98865
            * gcc.target/i386/pr98865.c: New test case.
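Two of the cases named in this commit, written as source-level identities. This is a sketch only: the actual transformation happens during RTL expansion and is subject to the target's rtx_costs, and the function names below are hypothetical. The signed right shift is again assumed to be arithmetic, as GCC documents.

```c
#include <assert.h>
#include <stdint.h>

/* (x >> 63) * 135: multiplication of a [0,1] value by a constant. */
static uint64_t mul135(uint64_t x)
{
    return (x >> 63) * 135;
}

/* Same value as sign mask AND constant (sarq + andl in the listing). */
static uint64_t and135(uint64_t x)
{
    return (uint64_t)((int64_t)x >> 63) & 135;
}

/* When both operands are [0,1], X*Y is simply X&Y. */
static unsigned both_bits_mul(unsigned x, unsigned y)
{
    return (x & 1) * (y & 1);
}

static unsigned both_bits_and(unsigned x, unsigned y)
{
    return (x & 1) & (y & 1);
}

/* Compare the paired forms on a few inputs. */
static void check_expand_forms(void)
{
    const uint64_t xs[] = { 0, 1, 0x7fffffffffffffffULL,
                            0x8000000000000000ULL, UINT64_MAX };
    for (unsigned i = 0; i < sizeof xs / sizeof xs[0]; i++)
        assert(mul135(xs[i]) == and135(xs[i]));
    for (unsigned x = 0; x < 4; x++)
        for (unsigned y = 0; y < 4; y++)
            assert(both_bits_mul(x, y) == both_bits_and(x, y));
}
```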

* [Bug middle-end/98865] Missed transform of (a >> 63) * b
  2021-01-28 13:38 [Bug middle-end/98865] New: Missed transform of (a >> 63) * b rguenth at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2022-05-19 16:55 ` cvs-commit at gcc dot gnu.org
@ 2022-05-27  8:02 ` cvs-commit at gcc dot gnu.org
  2022-05-28  9:18 ` roger at nextmovesoftware dot com
  9 siblings, 0 replies; 11+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-27  8:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98865

--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:8fb94fc6097c0a934aac0d89c9c5e2038da67655

commit r13-793-g8fb94fc6097c0a934aac0d89c9c5e2038da67655
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Fri May 27 08:57:46 2022 +0100

    Canonicalize X&-Y as X*Y in match.pd when Y is [0,1].

    "For every pessimization, there's an equal and opposite optimization".

    In the review of my original patch for PR middle-end/98865, Richard
    Biener pointed out that match.pd shouldn't be transforming X*Y into
    X&-Y as the former is considered cheaper by tree-ssa's cost model
    (operator count).  A corollary of this is that we should instead be
    transforming X&-Y into the cheaper X*Y as a preferred canonical form
    (especially as RTL expansion now intelligently selects the appropriate
    implementation based on the target's costs).

    With this patch we now generate identical code for:
    int foo(int x, int y) { return -(x&1) & y; }
    int bar(int x, int y) { return (x&1) * y; }

    specifically on x86_64-pc-linux-gnu both use and/neg/and with -O2,
    but both use and/mul with -Os.

    One minor wrinkle/improvement is that this patch includes three
    additional optimizations (that account for the change in canonical
    form) to continue to optimize PR92834 and PR94786.

    2022-05-27  Roger Sayle  <roger@nextmovesoftware.com>

    gcc/ChangeLog
            * match.pd (match_zero_one_valued_p): New predicate.
            (mult @0 @1): Use zero_one_valued_p for optimization to the
            expression "bit_and @0 @1".
            (bit_and (negate zero_one_valued_p@0) @1): Optimize to MULT_EXPR.
            (plus @0 (mult (minus @1 @0) zero_one_valued_p@2)): New transform.
            (minus @0 (mult (minus @0 @1) zero_one_valued_p@2)): Likewise.
            (bit_xor @0 (mult (bit_xor @0 @1) zero_one_valued_p@2)): Likewise.
            Remove three redundant transforms obsoleted by the three above.

    gcc/testsuite/ChangeLog
            * gcc.dg/pr98865.c: New test case.
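The three additional patterns in the ChangeLog all have the shape of a branch-free select: when the third operand is 0 or 1, each expression evaluates to either its first or its second operand. The sketch below checks that arithmetic only; it makes no claim about what match.pd rewrites the patterns into. The function names are hypothetical, and unsigned types are used so that the intermediate differences wrap instead of overflowing.

```c
#include <assert.h>
#include <stdint.h>

/* plus @0 (mult (minus @1 @0) z): a + (b - a) * z */
static uint32_t select_plus(uint32_t a, uint32_t b, uint32_t z)
{
    return a + (b - a) * z;
}

/* minus @0 (mult (minus @0 @1) z): a - (a - b) * z */
static uint32_t select_minus(uint32_t a, uint32_t b, uint32_t z)
{
    return a - (a - b) * z;
}

/* bit_xor @0 (mult (bit_xor @0 @1) z): a ^ ((a ^ b) * z) */
static uint32_t select_xor(uint32_t a, uint32_t b, uint32_t z)
{
    return a ^ ((a ^ b) * z);
}

/* For z in {0, 1}, every form should equal z ? b : a. */
static void check_selects(void)
{
    const uint32_t samples[] = { 0, 1, 135, 0x80000000u, UINT32_MAX };
    const unsigned n = sizeof samples / sizeof samples[0];
    for (unsigned i = 0; i < n; i++)
        for (unsigned j = 0; j < n; j++)
            for (uint32_t z = 0; z <= 1; z++) {
                uint32_t a = samples[i], b = samples[j];
                uint32_t expected = z ? b : a;
                assert(select_plus(a, b, z) == expected);
                assert(select_minus(a, b, z) == expected);
                assert(select_xor(a, b, z) == expected);
            }
}
```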

* [Bug middle-end/98865] Missed transform of (a >> 63) * b
  2021-01-28 13:38 [Bug middle-end/98865] New: Missed transform of (a >> 63) * b rguenth at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2022-05-27  8:02 ` cvs-commit at gcc dot gnu.org
@ 2022-05-28  9:18 ` roger at nextmovesoftware dot com
  9 siblings, 0 replies; 11+ messages in thread
From: roger at nextmovesoftware dot com @ 2022-05-28  9:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98865

Roger Sayle <roger at nextmovesoftware dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |roger at nextmovesoftware dot com
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |13.0

--- Comment #9 from Roger Sayle <roger at nextmovesoftware dot com> ---
This is now fixed/implemented on mainline for GCC 13.
