public inbox for gcc-patches@gcc.gnu.org
From: Richard Sandiford <richard.sandiford@arm.com>
To: "Roger Sayle" <roger@nextmovesoftware.com>
Cc: <gcc-patches@gcc.gnu.org>,
	'Segher Boessenkool' <segher@kernel.crashing.org>
Subject: Re: [PATCH] Some additional zero-extension related optimizations in simplify-rtx.
Date: Tue, 02 Aug 2022 10:38:39 +0100
Message-ID: <mpto7x3q928.fsf@arm.com>
In-Reply-To: <009501d8a1be$b6199e20$224cda60$@nextmovesoftware.com> (Roger Sayle's message of "Wed, 27 Jul 2022 14:42:25 +0100")

"Roger Sayle" <roger@nextmovesoftware.com> writes:
> This patch implements some additional zero-extension and sign-extension
> related optimizations in simplify-rtx.cc.  The original motivation comes
> from PR rtl-optimization/71775, where in comment #2 Andrew Pinski sees:
>
> Failed to match this instruction:
> (set (reg:DI 88 [ _1 ])
>     (sign_extend:DI (subreg:SI (ctz:DI (reg/v:DI 86 [ x ])) 0)))
>
> On many platforms the result of DImode CTZ is constrained to be a
> small unsigned integer (between 0 and 64), hence the truncation to
> 32 bits (using a SUBREG) and the following sign extension back to
> 64 bits are effectively a no-op, so the above should ideally (often)
> be simplified to "(set (reg:DI 88) (ctz:DI (reg/v:DI 86 [ x ]))".
>
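The no-op claim in the quoted paragraph can be checked with a small C model. This is only a sketch: the helper names below are hypothetical stand-ins for the RTL operations and are not GCC functions, and the range [0, 64] is the one stated in the quoted text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of (subreg:SI (x:DI) 0): keep the low 32 bits.  */
uint32_t truncate_di_to_si (uint64_t x)
{
  return (uint32_t) x;
}

/* Hypothetical model of (sign_extend:DI (y:SI)): copy bit 31 into
   bits 32..63.  */
uint64_t sign_extend_si_to_di (uint32_t y)
{
  uint64_t wide = y;
  if (y & UINT32_C (0x80000000))
    wide |= UINT64_C (0xffffffff00000000);
  return wide;
}

/* Return 1 if truncating to 32 bits and sign-extending back leaves
   every value in [0, 64] unchanged, 0 otherwise.  */
int roundtrip_is_noop_for_small_values (void)
{
  for (uint64_t v = 0; v <= 64; v++)
    if (sign_extend_si_to_di (truncate_di_to_si (v)) != v)
      return 0;
  return 1;
}
```

The round trip stops being a no-op as soon as bit 31 or any higher bit may be set, which is why the transformation needs a bound on the operand.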
> To implement this, and some closely related transformations, we build
> upon the existing val_signbit_known_clear_p predicate.  In the first
> chunk, nonzero_bits knows that FFS and ABS can't leave the sign bit
> set, so the simplification of ABS (ABS (x)) and ABS (FFS (x))
> can itself be simplified.

I think I misunderstood, but just in case: RTL ABS is well-defined
for the minimum integer (giving back the minimum integer), so we
can't assume that ABS leaves the sign bit clear.
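
A minimal C sketch of that corner case, assuming a 32-bit mode and wrapping (two's-complement) semantics as described above. The function names are hypothetical, and unsigned arithmetic is used because abs (INT_MIN) is undefined behaviour in C itself:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a wrapping 32-bit ABS.  Negation is done in
   unsigned arithmetic so that the minimum integer wraps to itself
   instead of invoking C undefined behaviour.  */
uint32_t abs_si (uint32_t x)
{
  return (x & UINT32_C (0x80000000)) ? (uint32_t) (0u - x) : x;
}

/* Return 1 if bit 31 (the SImode sign bit) is set.  */
int sign_bit_set_si (uint32_t x)
{
  return (int) ((x >> 31) & 1);
}
```

With this model, abs_si (0x80000000) is again 0x80000000, so the sign bit stays set for that one input, while every other input gives a result with the sign bit clear.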

Thanks,
Richard

> The second transformation is that we can
> canonicalize SIGN_EXTEND to ZERO_EXTEND (as in the PR 71775 case above)
> when the operand's sign-bit is known to be clear.  The final two chunks
> are for SIGN_EXTEND of a truncating SUBREG, and ZERO_EXTEND of a
> truncating SUBREG respectively.  The nonzero_bits of a truncating
> SUBREG pessimistically thinks that the upper bits may have an
> arbitrary value (by taking the SUBREG), so we need to look deeper at the
> SUBREG's operand to confirm that the high bits are known to be zero.
>
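The two mask tests used by the final chunks of the patch can be modelled outside GCC as follows. This is a sketch under assumptions: mode_mask is a hypothetical stand-in for GET_MODE_MASK, the nonzero argument stands in for the result of nonzero_bits on the SUBREG's operand, and precisions are limited to 1..64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GET_MODE_MASK: all-ones in the low
   PRECISION bits (PRECISION is assumed to be in 1..64).  */
uint64_t mode_mask (unsigned precision)
{
  return precision >= 64 ? UINT64_MAX : (UINT64_C (1) << precision) - 1;
}

/* ZERO_EXTEND case: the extension of the truncation is a no-op when no
   possibly-nonzero bit lies outside the narrow mode.  */
int zero_extend_of_truncation_is_noop (uint64_t nonzero,
                                       unsigned narrow_precision)
{
  return (nonzero & ~mode_mask (narrow_precision)) == 0;
}

/* SIGN_EXTEND case: the narrow mode's sign bit must also be known
   clear, hence the mask is shifted right by one.  */
int sign_extend_of_truncation_is_noop (uint64_t nonzero,
                                       unsigned narrow_precision)
{
  return (nonzero & ~(mode_mask (narrow_precision) >> 1)) == 0;
}
```

For example, an operand whose possibly-nonzero bits are 0x7f passes both tests for a 32-bit narrow mode, whereas one with bit 31 possibly set passes only the zero-extension test.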
> Unfortunately, for PR rtl-optimization/71775, ctz:DI on x86_64 with
> default architecture options is undefined at zero, so we can't be sure
> the upper bits of reg:DI 88 will be sign extended (all zeros or all ones).
> nonzero_bits knows this, so the above transformations don't trigger,
> but the transformations themselves are perfectly valid for other
> operations such as FFS, POPCOUNT and PARITY, and on other targets/-march
> settings where CTZ is defined at zero.
>
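As a rough illustration of why FFS, POPCOUNT and PARITY qualify, the portable C models below (hypothetical helpers, not GCC code) have a defined result for every 64-bit input, including zero, and that result never exceeds 64, so it fits in the low seven bits.

```c
#include <assert.h>
#include <stdint.h>

/* Model of FFS: one plus the index of the lowest set bit, or 0 when
   the input is zero.  */
unsigned ffs64 (uint64_t x)
{
  unsigned n = 1;
  if (x == 0)
    return 0;
  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Model of POPCOUNT: clear the lowest set bit until none remain.  */
unsigned popcount64 (uint64_t x)
{
  unsigned n = 0;
  for (; x != 0; x &= x - 1)
    n++;
  return n;
}

/* Model of PARITY: the population count modulo 2.  */
unsigned parity64 (uint64_t x)
{
  return popcount64 (x) & 1;
}

/* Return 1 if all three results for X fit in the low seven bits.  */
int results_fit_in_seven_bits (uint64_t x)
{
  return ((ffs64 (x) | popcount64 (x) | parity64 (x)) & ~0x7fu) == 0;
}
```

A CTZ that is undefined at zero offers no such bound for a zero input, which matches the quoted observation that the transformations do not trigger for it on default x86_64.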
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Testing with CSiBE shows these transformations
> trigger on several source files (and with -Os reduces the size of the
> code).  Ok for mainline?
>
>
> 2022-07-27  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * simplify-rtx.cc (simplify_unary_operation_1) <ABS>: Simplify
>         test as both FFS and ABS result in nonzero_bits returning a
>         mask that satisfies val_signbit_known_clear_p.
>         <SIGN_EXTEND>: Canonicalize SIGN_EXTEND to ZERO_EXTEND when
>         val_signbit_known_clear_p is true of the operand.
>         Simplify sign extensions of SUBREG truncations of operands
>         that are already suitably (zero) extended.
>         <ZERO_EXTEND>: Simplify zero extensions of SUBREG truncations
>         of operands that are already suitably zero extended.
>
>
> Thanks in advance,
> Roger
> --
>
> diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
> index fa20665..e62bf56 100644
> --- a/gcc/simplify-rtx.cc
> +++ b/gcc/simplify-rtx.cc
> @@ -1366,9 +1366,8 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
>  	break;
>  
>        /* If operand is something known to be positive, ignore the ABS.  */
> -      if (GET_CODE (op) == FFS || GET_CODE (op) == ABS
> -	  || val_signbit_known_clear_p (GET_MODE (op),
> -					nonzero_bits (op, GET_MODE (op))))
> +      if (val_signbit_known_clear_p (GET_MODE (op),
> +				     nonzero_bits (op, GET_MODE (op))))
>  	return op;
>  
>        /* If operand is known to be only -1 or 0, convert ABS to NEG.  */
> @@ -1615,6 +1614,24 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
>  	    }
>  	}
>  
> +      /* We can canonicalize SIGN_EXTEND (op) as ZERO_EXTEND (op) when
> +         we know the sign bit of OP must be clear.  */
> +      if (val_signbit_known_clear_p (GET_MODE (op),
> +				     nonzero_bits (op, GET_MODE (op))))
> +	return simplify_gen_unary (ZERO_EXTEND, mode, op, GET_MODE (op));
> +
> +      /* (sign_extend:DI (subreg:SI (ctz:DI ...))) is (ctz:DI ...).  */
> +      if (GET_CODE (op) == SUBREG
> +	  && subreg_lowpart_p (op)
> +	  && GET_MODE (SUBREG_REG (op)) == mode
> +	  && is_a <scalar_int_mode> (mode, &int_mode)
> +	  && is_a <scalar_int_mode> (GET_MODE (op), &op_mode)
> +	  && GET_MODE_PRECISION (int_mode) <= HOST_BITS_PER_WIDE_INT
> +	  && GET_MODE_PRECISION (op_mode) < GET_MODE_PRECISION (int_mode)
> +	  && (nonzero_bits (SUBREG_REG (op), mode)
> +	      & ~(GET_MODE_MASK (op_mode)>>1)) == 0)
> +	return SUBREG_REG (op);
> +
>  #if defined(POINTERS_EXTEND_UNSIGNED)
>        /* As we do not know which address space the pointer is referring to,
>  	 we can do this only if the target does not support different pointer
> @@ -1765,6 +1782,18 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
>  				     op0_mode);
>  	}
>  
> +      /* (zero_extend:DI (subreg:SI (ctz:DI ...))) is (ctz:DI ...).  */
> +      if (GET_CODE (op) == SUBREG
> +	  && subreg_lowpart_p (op)
> +	  && GET_MODE (SUBREG_REG (op)) == mode
> +	  && is_a <scalar_int_mode> (mode, &int_mode)
> +	  && is_a <scalar_int_mode> (GET_MODE (op), &op_mode)
> +	  && GET_MODE_PRECISION (int_mode) <= HOST_BITS_PER_WIDE_INT
> +	  && GET_MODE_PRECISION (op_mode) < GET_MODE_PRECISION (int_mode)
> +	  && (nonzero_bits (SUBREG_REG (op), mode)
> +	      & ~GET_MODE_MASK (op_mode)) == 0)
> +	return SUBREG_REG (op);
> +
>  #if defined(POINTERS_EXTEND_UNSIGNED)
>        /* As we do not know which address space the pointer is referring to,
>  	 we can do this only if the target does not support different pointer
