public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/71775] Redundant move instruction for sign extension
From: pinskia at gcc dot gnu.org @ 2021-08-07  4:53 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71775

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
           Severity|normal                      |enhancement
          Component|target                      |rtl-optimization
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2021-08-07

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed:
Trying 11 -> 13:
   11: {r87:DI=ctz(r86:DI);clobber flags:CC;}
      REG_UNUSED flags:CC
   13: r88:DI=sign_extend(r87:DI#0)
      REG_DEAD r87:DI
Failed to match this instruction:
(set (reg:DI 88 [ _1 ])
    (sign_extend:DI (subreg:SI (ctz:DI (reg/v:DI 86 [ x ])) 0)))


Part of the problem is that ctz has an unknown value at 0, but we know x is
non-zero (well, kind of: at the gimple level we do).

We do the right thing on aarch64 because we know the value at 0.
Trying 11 -> 13:
   11: r97:DI=ctz(r96:DI)
   13: r98:DI=sign_extend(r97:DI#0)
      REG_DEAD r97:DI
Successfully matched this instruction:
(set (reg:DI 98 [ _1 ])
    (ctz:DI (reg/v:DI 96 [ x ])))
allowing combination of insns 11 and 13
original costs 8 + 4 = 12
replacement cost 8
deferring deletion of insn with uid = 11.
modifying insn i3    13: r98:DI=ctz(r96:DI)
deferring rescan insn with uid = 13.

So this requires us to bring the range down from gimple to RTL.

Here is the range:
  # RANGE [1, 18446744073709551615]
  # x_12 = PHI <x_10(3), x_6(D)(2)>
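
For illustration only, here is a minimal C sketch of the kind of source that
produces this pattern. It is a hypothetical reduction, not the test case
attached to the PR: the function name and loop body are assumptions. The loop
condition is what gives x the non-zero range at the gimple level, and widening
the int result of the builtin to long is the sign_extend that combine fails to
remove on x86_64.

```c
#include <stdint.h>

/* Hypothetical reduction (not the PR's test case).
   Sums the bit positions of all set bits in x.  */
long sum_bit_positions(uint64_t x)
{
    long total = 0;
    while (x != 0) {
        /* x is known non-zero here, so ctz is well defined.
           __builtin_ctzll is a GCC/Clang builtin returning int.  */
        int n = __builtin_ctzll((unsigned long long)x);
        /* int -> long widening: the sign extension in question.  */
        total += n;
        /* Clear the lowest set bit.  */
        x &= x - 1;
    }
    return total;
}
```

Comparing the x86_64 and aarch64 assembly for a function of this shape at -O2
is one way to look for the extra extension or move.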


* [Bug rtl-optimization/71775] Redundant move instruction for sign extension
From: cvs-commit at gcc dot gnu.org @ 2022-08-03  7:58 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71775

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:c23a9c87cc62bd177fd0d4db6ad34b34e1b9a31f

commit r13-1942-gc23a9c87cc62bd177fd0d4db6ad34b34e1b9a31f
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Wed Aug 3 08:55:35 2022 +0100

    Some additional zero-extension related optimizations in simplify-rtx.

    This patch implements some additional zero-extension and sign-extension
    related optimizations in simplify-rtx.cc.  The original motivation comes
    from PR rtl-optimization/71775, where in comment #2 Andrew Pinski sees:

    Failed to match this instruction:
    (set (reg:DI 88 [ _1 ])
        (sign_extend:DI (subreg:SI (ctz:DI (reg/v:DI 86 [ x ])) 0)))

    On many platforms the result of DImode CTZ is constrained to be a
    small unsigned integer (between 0 and 64), hence the truncation to
    32-bits (using a SUBREG) and the following sign extension back to
    64-bits are effectively a no-op, so the above should ideally (often)
    be simplified to "(set (reg:DI 88) (ctz:DI (reg/v:DI 86 [ x ]))".

    To implement this, and some closely related transformations, we build
    upon the existing val_signbit_known_clear_p predicate.  In the first
    chunk, nonzero_bits knows that FFS and ABS can't leave the sign-bit
    set, so the simplification of ABS (ABS (x)) and ABS (FFS (x))
    can itself be simplified.  The second transformation is that we can
    canonicalize SIGN_EXTEND to ZERO_EXTEND (as in the PR 71775 case above)
    when the operand's sign-bit is known to be clear.  The final two chunks
    are for SIGN_EXTEND of a truncating SUBREG, and ZERO_EXTEND of a
    truncating SUBREG respectively.  The nonzero_bits of a truncating
    SUBREG pessimistically thinks that the upper bits may have an
    arbitrary value (by taking the SUBREG), so we need to look deeper at the
    SUBREG's operand to confirm that the high bits are known to be zero.

    Unfortunately, for PR rtl-optimization/71775, ctz:DI on x86_64 with
    default architecture options is undefined at zero, so we can't be sure
    the upper bits of reg:DI 88 will be sign extended (all zeros or all ones).
    nonzero_bits knows this, so the above transformations don't trigger,
    but the transformations themselves are perfectly valid for other
    operations such as FFS, POPCOUNT and PARITY, and on other targets/-march
    settings where CTZ is defined at zero.

    2022-08-03  Roger Sayle  <roger@nextmovesoftware.com>
                Segher Boessenkool  <segher@kernel.crashing.org>
                Richard Sandiford  <richard.sandiford@arm.com>

    gcc/ChangeLog
            * simplify-rtx.cc (simplify_unary_operation_1) <ABS>: Add
            optimizations for CLRSB, PARITY, POPCOUNT, SS_ABS and LSHIFTRT
            that are all positive to complement the existing FFS and
            idempotent ABS simplifications.
            <SIGN_EXTEND>: Canonicalize SIGN_EXTEND to ZERO_EXTEND when
            val_signbit_known_clear_p is true of the operand.
            Simplify sign extensions of SUBREG truncations of operands
            that are already suitably (zero) extended.
            <ZERO_EXTEND>: Simplify zero extensions of SUBREG truncations
            of operands that are already suitably zero extended.
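
The reasoning in the commit message can be modelled outside GCC with a small
C sketch. This is a toy analogue under stated assumptions, not the code in
simplify-rtx.cc: the helper names below are hypothetical, and the "nonzero"
argument stands in for the mask that nonzero_bits would report (a set bit
means "this bit may be 1"). It shows why the SImode truncation followed by a
DImode sign extension is a no-op for a value known to lie in [0, 64], and why
it is not when every bit may be set (the x86_64 ctz-at-zero case).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy analogue of a "sign bit known clear" check: true when the sign
   bit of a value of the given width (1..64 bits) cannot be set
   according to the may-be-set mask.  Hypothetical helper.  */
static bool signbit_known_clear(uint64_t nonzero, unsigned width)
{
    return (nonzero & (UINT64_C(1) << (width - 1))) == 0;
}

/* Models (sign_extend:DI (subreg:SI x 0)): keep the low 32 bits and
   sign-extend them to 64 bits, using only unsigned arithmetic.  */
static uint64_t sext_trunc32(uint64_t x)
{
    uint64_t lo = x & UINT64_C(0xFFFFFFFF);
    return (lo ^ UINT64_C(0x80000000)) - UINT64_C(0x80000000);
}

/* The truncate-then-sign-extend pair leaves x unchanged exactly when
   bit 31 and every bit above it are known to be zero.  */
static bool sext_trunc32_is_noop(uint64_t nonzero)
{
    return (nonzero >> 31) == 0;
}
```

With a mask of 0x7f (enough for any value from 0 to 64),
sext_trunc32_is_noop returns true and sext_trunc32 returns its argument
unchanged. With an all-ones mask it returns false, which corresponds to the
case where the transformation does not trigger.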
