public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Some additional zero-extension related optimizations in simplify-rtx.
@ 2022-07-27 13:42 Roger Sayle
  2022-07-27 20:23 ` Segher Boessenkool
  2022-08-02  9:38 ` Richard Sandiford
  0 siblings, 2 replies; 9+ messages in thread
From: Roger Sayle @ 2022-07-27 13:42 UTC (permalink / raw)
  To: gcc-patches; +Cc: 'Segher Boessenkool'

[-- Attachment #1: Type: text/plain, Size: 2935 bytes --]


This patch implements some additional zero-extension and sign-extension
related optimizations in simplify-rtx.cc.  The original motivation comes
from PR rtl-optimization/71775, where in comment #2 Andrew Pinski sees:

Failed to match this instruction:
(set (reg:DI 88 [ _1 ])
    (sign_extend:DI (subreg:SI (ctz:DI (reg/v:DI 86 [ x ])) 0)))

On many platforms the result of DImode CTZ is constrained to be a
small unsigned integer (between 0 and 64), hence the truncation to
32-bits (using a SUBREG) and the following sign extension back to
64-bits are effectively a no-op, so the above should ideally (often)
be simplified to "(set (reg:DI 88) (ctz:DI (reg/v:DI 86 [ x ]))".

To implement this, and some closely related transformations, we build
upon the existing val_signbit_known_clear_p predicate.  In the first
chunk, nonzero_bits knows that FFS and ABS can't leave the sign-bit
bit set, so the simplification of of ABS (ABS (x)) and ABS (FFS (x))
can itself be simplified.  The second transformation is that we can
canonicalized SIGN_EXTEND to ZERO_EXTEND (as in the PR 71775 case above)
when the operand's sign-bit is known to be clear.  The final two chunks
are for SIGN_EXTEND of a truncating SUBREG, and ZERO_EXTEND of a
truncating SUBREG respectively.  The nonzero_bits of a truncating
SUBREG pessimistically thinks that the upper bits may have an
arbitrary value (by taking the SUBREG), so we need look deeper at the
SUBREG's operand to confirm that the high bits are known to be zero.

Unfortunately, for PR rtl-optimization/71775, ctz:DI on x86_64 with
default architecture options is undefined at zero, so we can't be sure
the upper bits of reg:DI 88 will be sign extended (all zeros or all ones).
nonzero_bits knows this, so the above transformations don't trigger,
but the transformations themselves are perfectly valid for other
operations such as FFS, POPCOUNT and PARITY, and on other targets/-march
settings where CTZ is defined at zero.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Testing with CSiBE shows these transformations
trigger on several source files (and with -Os reduces the size of the
code).  Ok for mainline?


2022-07-27  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        * simplify_rtx.cc (simplify_unary_operation_1) <ABS>: Simplify
        test as both FFS and ABS result in nonzero_bits returning a
        mask that satisfies val_signbit_known_clear_p.
        <SIGN_EXTEND>: Canonicalize SIGN_EXTEND to ZERO_EXTEND when
        val_signbit_known_clear_p is true of the operand.
        Simplify sign extensions of SUBREG truncations of operands
        that are already suitably (zero) extended.
        <ZERO_EXTEND>: Simplify zero extensions of SUBREG truncations
        of operands that are already suitably zero extended.


Thanks in advance,
Roger
--


[-- Attachment #2: patchxt.txt --]
[-- Type: text/plain, Size: 2701 bytes --]

diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index fa20665..e62bf56 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -1366,9 +1366,8 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
 	break;
 
       /* If operand is something known to be positive, ignore the ABS.  */
-      if (GET_CODE (op) == FFS || GET_CODE (op) == ABS
-	  || val_signbit_known_clear_p (GET_MODE (op),
-					nonzero_bits (op, GET_MODE (op))))
+      if (val_signbit_known_clear_p (GET_MODE (op),
+				     nonzero_bits (op, GET_MODE (op))))
 	return op;
 
       /* If operand is known to be only -1 or 0, convert ABS to NEG.  */
@@ -1615,6 +1614,24 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
 	    }
 	}
 
+      /* We can canonicalize SIGN_EXTEND (op) as ZERO_EXTEND (op) when
+         we know the sign bit of OP must be clear.  */
+      if (val_signbit_known_clear_p (GET_MODE (op),
+				     nonzero_bits (op, GET_MODE (op))))
+	return simplify_gen_unary (ZERO_EXTEND, mode, op, GET_MODE (op));
+
+      /* (sign_extend:DI (subreg:SI (ctz:DI ...))) is (ctz:DI ...).  */
+      if (GET_CODE (op) == SUBREG
+	  && subreg_lowpart_p (op)
+	  && GET_MODE (SUBREG_REG (op)) == mode
+	  && is_a <scalar_int_mode> (mode, &int_mode)
+	  && is_a <scalar_int_mode> (GET_MODE (op), &op_mode)
+	  && GET_MODE_PRECISION (int_mode) <= HOST_BITS_PER_WIDE_INT
+	  && GET_MODE_PRECISION (op_mode) < GET_MODE_PRECISION (int_mode)
+	  && (nonzero_bits (SUBREG_REG (op), mode)
+	      & ~(GET_MODE_MASK (op_mode)>>1)) == 0)
+	return SUBREG_REG (op);
+
 #if defined(POINTERS_EXTEND_UNSIGNED)
       /* As we do not know which address space the pointer is referring to,
 	 we can do this only if the target does not support different pointer
@@ -1765,6 +1782,18 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
 				     op0_mode);
 	}
 
+      /* (zero_extend:DI (subreg:SI (ctz:DI ...))) is (ctz:DI ...).  */
+      if (GET_CODE (op) == SUBREG
+	  && subreg_lowpart_p (op)
+	  && GET_MODE (SUBREG_REG (op)) == mode
+	  && is_a <scalar_int_mode> (mode, &int_mode)
+	  && is_a <scalar_int_mode> (GET_MODE (op), &op_mode)
+	  && GET_MODE_PRECISION (int_mode) <= HOST_BITS_PER_WIDE_INT
+	  && GET_MODE_PRECISION (op_mode) < GET_MODE_PRECISION (int_mode)
+	  && (nonzero_bits (SUBREG_REG (op), mode)
+	      & ~GET_MODE_MASK (op_mode)) == 0)
+	return SUBREG_REG (op);
+
 #if defined(POINTERS_EXTEND_UNSIGNED)
       /* As we do not know which address space the pointer is referring to,
 	 we can do this only if the target does not support different pointer

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-08-02 20:08 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-27 13:42 [PATCH] Some additional zero-extension related optimizations in simplify-rtx Roger Sayle
2022-07-27 20:23 ` Segher Boessenkool
2022-07-29  6:57   ` Roger Sayle
2022-08-02 20:07     ` Segher Boessenkool
2022-07-29  7:28   ` Roger Sayle
2022-08-02  9:38 ` Richard Sandiford
2022-08-02 11:55   ` [PATCH take #2] " Roger Sayle
2022-08-02 13:39     ` Richard Sandiford
2022-08-02 16:46       ` Richard Sandiford

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).