[Bug tree-optimization/96930] New: Failure to optimize out 64-bit arithmetic when it can't happen with division transformed into right shift

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

* [Bug tree-optimization/96930] New: Failure to optimize out 64-bit arithmetic when it can't happen with division transformed into right shift
@ 2020-09-03 20:47 gabravier at gmail dot com
  2020-09-04  6:41 ` [Bug tree-optimization/96930] Failure to optimize out arithmetic with bigger size when it can't matter " rguenth at gcc dot gnu.org
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: gabravier at gmail dot com @ 2020-09-03 20:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96930

            Bug ID: 96930
           Summary: Failure to optimize out 64-bit arithmetic when it
                    can't happen with division transformed into right
                    shift
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

unsigned f(unsigned a, unsigned b)
{
    return a / (unsigned long long)(1U << b);
}

This can be optimized to `return a >> b;`. This transformation is done by LLVM,
but not by GCC.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96930] Failure to optimize out arithmetic with bigger size when it can't matter with division transformed into right shift
  2020-09-03 20:47 [Bug tree-optimization/96930] New: Failure to optimize out 64-bit arithmetic when it can't happen with division transformed into right shift gabravier at gmail dot com
@ 2020-09-04  6:41 ` rguenth at gcc dot gnu.org
  2021-01-02 10:04 ` jakub at gcc dot gnu.org
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-09-04  6:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96930

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-09-04
             Status|UNCONFIRMED                 |NEW
             Blocks|                            |19987

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Thanks for the report.  Note there's the PR19987 meta-bug all your expression
simplification reports should 'block' (not so much the target specific ones,
maybe case-by-case)


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19987
[Bug 19987] [meta-bug] fold missing optimizations in general

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96930] Failure to optimize out arithmetic with bigger size when it can't matter with division transformed into right shift
  2020-09-03 20:47 [Bug tree-optimization/96930] New: Failure to optimize out 64-bit arithmetic when it can't happen with division transformed into right shift gabravier at gmail dot com
  2020-09-04  6:41 ` [Bug tree-optimization/96930] Failure to optimize out arithmetic with bigger size when it can't matter " rguenth at gcc dot gnu.org
@ 2021-01-02 10:04 ` jakub at gcc dot gnu.org
  2021-01-02 10:53 ` jakub at gcc dot gnu.org
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-01-02 10:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96930

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The testcase seems to be optimized into return a >> b; and already e.g. GCC 4.4
does that.
So it is unclear why this has been reported and what difference you found.

That said, given:
unsigned
foo (unsigned a, unsigned b)
{
  return a / (unsigned long long) (1U << b);
}

unsigned
bar (unsigned a, unsigned b)
{
  return a / (1U << b);
}

unsigned
baz (unsigned a, unsigned b)
{
  unsigned long long c = 1U << b;
  return a / c;
}

I see that while we optimize foo and bar into a >> b, by the:
/* (A / (1 << B)) -> (A >> B).
   Only for unsigned A.  For signed A, this would not preserve rounding
   toward zero.
   For example: (-1 / ( 1 << B)) !=  -1 >> B.
   Also also widening conversions, like:
   (A / (unsigned long long) (1U << B)) -> (A >> B)
   or
   (A / (unsigned long long) (1 << B)) -> (A >> B).
   If the left shift is signed, it can be done only if the upper bits
   of A starting from shift's type sign bit are zero, as
   (unsigned long long) (1 << 31) is -2147483648ULL, not 2147483648ULL,
   so it is valid only if A >> 31 is zero.  */
but for baz we actually perform the shift in the wider mode unnecessarily,
because both operands are zero-extended from 32 bits.

Given:
unsigned
qux (unsigned a, unsigned b)
{
  unsigned long long c = a;
  unsigned long long d = b;
  return c / d;
}

unsigned
corge (unsigned a, unsigned b)
{
  return ((unsigned long long) a) / (unsigned long long) b;
}
we only optimize it in corge to return a / b; and not in qux, so some
fold-const optimization is not done on GIMPLE.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96930] Failure to optimize out arithmetic with bigger size when it can't matter with division transformed into right shift
  2020-09-03 20:47 [Bug tree-optimization/96930] New: Failure to optimize out 64-bit arithmetic when it can't happen with division transformed into right shift gabravier at gmail dot com
  2020-09-04  6:41 ` [Bug tree-optimization/96930] Failure to optimize out arithmetic with bigger size when it can't matter " rguenth at gcc dot gnu.org
  2021-01-02 10:04 ` jakub at gcc dot gnu.org
@ 2021-01-02 10:53 ` jakub at gcc dot gnu.org
  2021-01-03 10:24 ` gabravier at gmail dot com
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-01-02 10:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96930

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I guess it could be either optimized using a match.pd pattern like we have for:
/* Convert (outertype)((innertype0)a+(innertype1)b)
   into ((newtype)a+(newtype)b) where newtype
   is the widest mode from all of these.  */
(for op (plus minus mult rdiv)
 (simplify
   (convert (op:s@0 (convert1?@3 @1) (convert2?@4 @2)))
but for the *_div and trunc_mod/floor_mod the C/C++ FEs optimize using
shorten_binary_op without the outer convert, but instead requiring that at
least one of the operands is actually converted from narrower type and that the
other one fits into the range of that narrower type and for signed div/mod the
second operand is known not to be -1, or do it instead in:
simplify_using_ranges::simplify_div_or_mod_using_ranges
with the same rules.  Any preferences?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96930] Failure to optimize out arithmetic with bigger size when it can't matter with division transformed into right shift
  2020-09-03 20:47 [Bug tree-optimization/96930] New: Failure to optimize out 64-bit arithmetic when it can't happen with division transformed into right shift gabravier at gmail dot com
                   ` (2 preceding siblings ...)
  2021-01-02 10:53 ` jakub at gcc dot gnu.org
@ 2021-01-03 10:24 ` gabravier at gmail dot com
  2021-01-03 10:52 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: gabravier at gmail dot com @ 2021-01-03 10:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96930

--- Comment #4 from Gabriel Ravier <gabravier at gmail dot com> ---
> The testcase seems to be optimized into return a >> b; and already e.g. GCC 4.4 does that.
> So it is unclear why this has been reported and what difference you found.

What I observed is that it is optimized into `return (uint64_t)a >> b;` (where
`unsigned` is 32-bit), not `return a >> b;`.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96930] Failure to optimize out arithmetic with bigger size when it can't matter with division transformed into right shift
  2020-09-03 20:47 [Bug tree-optimization/96930] New: Failure to optimize out 64-bit arithmetic when it can't happen with division transformed into right shift gabravier at gmail dot com
                   ` (3 preceding siblings ...)
  2021-01-03 10:24 ` gabravier at gmail dot com
@ 2021-01-03 10:52 ` jakub at gcc dot gnu.org
  2021-01-03 12:06 ` gabravier at gmail dot com
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-01-03 10:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96930

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I can't reproduce that.
I get:
        movl    %edi, %eax
        movl    %esi, %ecx
        shrl    %cl, %eax
        ret
for that function, and LLVM emits the same code with the first two insns
swapped.
Only in baz above I get:
        movl    %edi, %eax
        movl    %esi, %ecx
        shrq    %cl, %rax
        ret
which is the 64-bit shift instead (but I doubt on x86_64 there is any
difference except the 1 byte in code size).

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96930] Failure to optimize out arithmetic with bigger size when it can't matter with division transformed into right shift
  2020-09-03 20:47 [Bug tree-optimization/96930] New: Failure to optimize out 64-bit arithmetic when it can't happen with division transformed into right shift gabravier at gmail dot com
                   ` (4 preceding siblings ...)
  2021-01-03 10:52 ` jakub at gcc dot gnu.org
@ 2021-01-03 12:06 ` gabravier at gmail dot com
  2021-01-03 14:17 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: gabravier at gmail dot com @ 2021-01-03 12:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96930

--- Comment #6 from Gabriel Ravier <gabravier at gmail dot com> ---
For this exact code :

unsigned f(unsigned a, unsigned b)
{
    return a / (unsigned long long)(1U << b);
}

compiled with a trunk-based GCC built yesterday for x86-64-linux-gnu configured
with: ../gcc-trunk-20210102/configure
--prefix=/opt/compiler-explorer/gcc-build/staging --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu --disable-bootstrap
--enable-multiarch --with-abi=m64 --with-multilib-list=m32,m64,mx32
--enable-multilib --enable-clocale=gnu --enable-languages=c,c++,fortran,ada,d
--enable-ld=yes --enable-gold=yes --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --enable-linker-build-id --enable-lto
--enable-plugins --enable-threads=posix
--with-pkgversion=Compiler-Explorer-Build

with the specific compiler command line of `g++ -g -o
/tmp/compiler-explorer-compiler202103-4524-15mdddx.f4b4g/output.s -masm=intel
-S -fdiagnostics-color=always -O3
/tmp/compiler-explorer-compiler202103-4524-15mdddx.f4b4g/example.cpp`, I get:

f(unsigned int, unsigned int):
  mov eax, edi
  mov ecx, esi
  shr rax, cl
  ret

whereas LLVM uses `shr eax, cl` instead.

See also https://godbolt.org/z/Gqza7v for the exact setup I used (in case I
missed something and you rather avoid having to wait for me to answer).

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96930] Failure to optimize out arithmetic with bigger size when it can't matter with division transformed into right shift
  2020-09-03 20:47 [Bug tree-optimization/96930] New: Failure to optimize out 64-bit arithmetic when it can't happen with division transformed into right shift gabravier at gmail dot com
                   ` (5 preceding siblings ...)
  2021-01-03 12:06 ` gabravier at gmail dot com
@ 2021-01-03 14:17 ` jakub at gcc dot gnu.org
  2021-01-03 22:51 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-01-03 14:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96930

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Oh, C++, I was trying C.  Apparently this optimization is done by the C FE
only.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96930] Failure to optimize out arithmetic with bigger size when it can't matter with division transformed into right shift
  2020-09-03 20:47 [Bug tree-optimization/96930] New: Failure to optimize out 64-bit arithmetic when it can't happen with division transformed into right shift gabravier at gmail dot com
                   ` (6 preceding siblings ...)
  2021-01-03 14:17 ` jakub at gcc dot gnu.org
@ 2021-01-03 22:51 ` jakub at gcc dot gnu.org
  2021-01-04 13:31 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-01-03 22:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96930

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The optimization is there, but just has different conditions:
            /* Although it would be tempting to shorten always here, that
               loses on some targets, since the modulo instruction is
               undefined if the quotient can't be represented in the
               computation mode.  We shorten only if unsigned or if
               dividing by something we know != -1.  */
            shorten = (TYPE_UNSIGNED (TREE_TYPE (orig_op0))
                       || (TREE_CODE (op1) == INTEGER_CST
                           && !integer_all_onesp (op1)));
in C, and
              /* When dividing two signed integers, we have to promote to int.
                 unless we divide by a constant != -1.  Note that default
                 conversion will have been performed on the operands at this
                 point, so we have to dig out the original type to find out if
                 it was unsigned.  */
              tree stripped_op1 = tree_strip_any_location_wrapper (op1);
              shorten = ((TREE_CODE (op0) == NOP_EXPR
                          && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0))))
                         || (TREE_CODE (stripped_op1) == INTEGER_CST
                             && ! integer_all_onesp (stripped_op1)));
So, in C++ we only try to shorten divisions (and modulo) if the first operand
has been extended, rather than the second one.
Anyway, if we optimize this in the middle-end, it won't be needed to change the
FE.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96930] Failure to optimize out arithmetic with bigger size when it can't matter with division transformed into right shift
  2020-09-03 20:47 [Bug tree-optimization/96930] New: Failure to optimize out 64-bit arithmetic when it can't happen with division transformed into right shift gabravier at gmail dot com
                   ` (7 preceding siblings ...)
  2021-01-03 22:51 ` jakub at gcc dot gnu.org
@ 2021-01-04 13:31 ` jakub at gcc dot gnu.org
  2021-01-05 15:34 ` cvs-commit at gcc dot gnu.org
  2023-02-18  2:09 ` gabravier at gmail dot com
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-01-04 13:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96930

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 49873
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49873&action=edit
gcc11-pr96930.patch

Untested fix.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96930] Failure to optimize out arithmetic with bigger size when it can't matter with division transformed into right shift
  2020-09-03 20:47 [Bug tree-optimization/96930] New: Failure to optimize out 64-bit arithmetic when it can't happen with division transformed into right shift gabravier at gmail dot com
                   ` (8 preceding siblings ...)
  2021-01-04 13:31 ` jakub at gcc dot gnu.org
@ 2021-01-05 15:34 ` cvs-commit at gcc dot gnu.org
  2023-02-18  2:09 ` gabravier at gmail dot com
  10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-01-05 15:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96930

--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:5ca2400270e985f9b33d93007f4d831299b9bda7

commit r11-6471-g5ca2400270e985f9b33d93007f4d831299b9bda7
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Jan 5 16:33:29 2021 +0100

    match.pd: Improve (A / (1 << B)) -> (A >> B) optimization [PR96930]

    The following patch improves the A / (1 << B) -> A >> B simplification,
    as seen in the testcase, if there is unnecessary widening for the division,
    we just optimize it into a shift on the widened type, but if the lshift
    is widened too, there is no reason to do that, we can just shift it in the
    original type and convert after.  The tree_nonzero_bits & wi::mask check
    already ensures it is fine even for signed values.

    I've split the vr-values optimization into a separate patch as it causes
    a small regression on two testcases, but this patch fixes what has been
    reported in the PR alone.

    2021-01-05  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/96930
            * match.pd ((A / (1 << B)) -> (A >> B)): If A is extended
            from narrower value which has the same type as 1 << B, perform
            the right shift on the narrower value followed by extension.

            * g++.dg/tree-ssa/pr96930.C: New test.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96930] Failure to optimize out arithmetic with bigger size when it can't matter with division transformed into right shift
  2020-09-03 20:47 [Bug tree-optimization/96930] New: Failure to optimize out 64-bit arithmetic when it can't happen with division transformed into right shift gabravier at gmail dot com
                   ` (9 preceding siblings ...)
  2021-01-05 15:34 ` cvs-commit at gcc dot gnu.org
@ 2023-02-18  2:09 ` gabravier at gmail dot com
  10 siblings, 0 replies; 12+ messages in thread
From: gabravier at gmail dot com @ 2023-02-18  2:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96930

--- Comment #11 from Gabriel Ravier <gabravier at gmail dot com> ---
It appears like this is fixed on trunk, I think ?

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-02-18  2:09 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-03 20:47 [Bug tree-optimization/96930] New: Failure to optimize out 64-bit arithmetic when it can't happen with division transformed into right shift gabravier at gmail dot com
2020-09-04  6:41 ` [Bug tree-optimization/96930] Failure to optimize out arithmetic with bigger size when it can't matter " rguenth at gcc dot gnu.org
2021-01-02 10:04 ` jakub at gcc dot gnu.org
2021-01-02 10:53 ` jakub at gcc dot gnu.org
2021-01-03 10:24 ` gabravier at gmail dot com
2021-01-03 10:52 ` jakub at gcc dot gnu.org
2021-01-03 12:06 ` gabravier at gmail dot com
2021-01-03 14:17 ` jakub at gcc dot gnu.org
2021-01-03 22:51 ` jakub at gcc dot gnu.org
2021-01-04 13:31 ` jakub at gcc dot gnu.org
2021-01-05 15:34 ` cvs-commit at gcc dot gnu.org
2023-02-18  2:09 ` gabravier at gmail dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).