public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.
@ 2023-01-01 15:55 Roger Sayle
  2023-01-01 21:03 ` Segher Boessenkool
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Roger Sayle @ 2023-01-01 15:55 UTC (permalink / raw)
  To: 'GCC Patches'; +Cc: 'Segher Boessenkool'

[-- Attachment #1: Type: text/plain, Size: 1138 bytes --]

In 2011, the rtl.texi documentation was updated to reflect that the
modes of the RTX unary operations FFS, POPCOUNT and PARITY must
match those of their operands.  Unfortunately, some of the transformations
in simplify-rtx.cc predate this tightening of RTL semantics, and have
not (until now) been updated/fixed.  i.e. The POPCOUNT and PARITY
optimizations were "correct" when I originally added them back in 2007.

Segher requested that I split this piece out from a fix for PR 106594 in
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601501.html

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2023-01-01  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* gcc/simplify-rtx.cc (simplify_unary_operation_1) <case FFS>:
	Avoid generating FFS with mismatched operand and result modes, by
	using an explicit SIGN_EXTEND/ZERO_EXTEND instead.
	<case POPCOUNT>: Likewise, for POPCOUNT of ZERO_EXTEND.
	<case PARITY>: Likewise, for PARITY of {ZERO,SIGN}_EXTEND.

Thanks in advance,
Roger
--


[-- Attachment #2: patchaa4.txt --]
[-- Type: text/plain, Size: 2076 bytes --]

diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index fc0d6c3..698ca6e 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -1404,22 +1404,32 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
       break;
 
     case FFS:
-      /* (ffs (*_extend <X>)) = (ffs <X>) */
+      /* (ffs (*_extend <X>)) = (*_extend (ffs <X>)).  */
       if (GET_CODE (op) == SIGN_EXTEND
 	  || GET_CODE (op) == ZERO_EXTEND)
-	return simplify_gen_unary (FFS, mode, XEXP (op, 0),
-				   GET_MODE (XEXP (op, 0)));
+	{
+	  temp = simplify_gen_unary (FFS, GET_MODE (XEXP (op, 0)),
+				     XEXP (op, 0), GET_MODE (XEXP (op, 0)));
+	  return simplify_gen_unary (GET_CODE (op), mode, temp,
+				     GET_MODE (temp));
+	}
       break;
 
     case POPCOUNT:
       switch (GET_CODE (op))
 	{
 	case BSWAP:
-	case ZERO_EXTEND:
-	  /* (popcount (zero_extend <X>)) = (popcount <X>) */
+	  /* (popcount (bswap <X>)) = (popcount <X>).  */
 	  return simplify_gen_unary (POPCOUNT, mode, XEXP (op, 0),
 				     GET_MODE (XEXP (op, 0)));
 
+	case ZERO_EXTEND:
+	  /* (popcount (zero_extend <X>)) = (zero_extend (popcount <X>)).  */
+	  temp = simplify_gen_unary (POPCOUNT, GET_MODE (XEXP (op, 0)),
+				     XEXP (op, 0), GET_MODE (XEXP (op, 0)));
+	  return simplify_gen_unary (ZERO_EXTEND, mode, temp,
+				     GET_MODE (temp));
+
 	case ROTATE:
 	case ROTATERT:
 	  /* Rotations don't affect popcount.  */
@@ -1438,11 +1448,16 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
 	{
 	case NOT:
 	case BSWAP:
-	case ZERO_EXTEND:
-	case SIGN_EXTEND:
 	  return simplify_gen_unary (PARITY, mode, XEXP (op, 0),
 				     GET_MODE (XEXP (op, 0)));
 
+	case ZERO_EXTEND:
+	case SIGN_EXTEND:
+	  temp = simplify_gen_unary (PARITY, GET_MODE (XEXP (op, 0)),
+				     XEXP (op, 0), GET_MODE (XEXP (op, 0)));
+	  return simplify_gen_unary (GET_CODE (op), mode, temp,
+				     GET_MODE (temp));
+
 	case ROTATE:
 	case ROTATERT:
 	  /* Rotations don't affect parity.  */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.
  2023-01-01 15:55 [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY Roger Sayle
@ 2023-01-01 21:03 ` Segher Boessenkool
  2023-01-01 22:59   ` roger
  2023-01-02 15:45 ` Jeff Law
  2023-02-21 15:39 ` Jakub Jelinek
  2 siblings, 1 reply; 14+ messages in thread
From: Segher Boessenkool @ 2023-01-01 21:03 UTC (permalink / raw)
  To: Roger Sayle; +Cc: 'GCC Patches'

Hi!

On Sun, Jan 01, 2023 at 03:55:26PM -0000, Roger Sayle wrote:
> In 2011, the rtl.texi documentation was updated to reflect that the
> modes of the RTX unary operations FFS, POPCOUNT and PARITY must
> match those of their operands.

Is that not a limitation we should try to get rid of?  It does not
really make any sense after all.

> Unfortunately, some of the transformations
> in simplify-rtx.cc predate this tightening of RTL semantics, and have
> not (until now) been updated/fixed.  i.e. The POPCOUNT and PARITY
> optimizations were "correct" when I originally added them back in 2007.

What would be needed to fix this the "other" way?  Just update the
documentation?  Or is there some code that requires this strange
constraint, maybe even some checking thing?

If we *do* want this strange limitation, we really should have some RTL
check that enforces it, makes sure we find it if the rules are broken.

Thanks and happy new year,


Segher

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.
  2023-01-01 21:03 ` Segher Boessenkool
@ 2023-01-01 22:59   ` roger
  0 siblings, 0 replies; 14+ messages in thread
From: roger @ 2023-01-01 22:59 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches


The motivation for the current design (requiring the result and the operand to have
the same mode) was from PR middle-end/50161.  The challenge there is that when
the RTL optimizers can determine that the operand is a constant, simplify_rtx sees
just a CONST_INT with VOIDmode and therefore doesn't know what the original/intended
mode of the operation is/should be.

Using patterns with explicit truncates, sign or zero extensions avoids these problems,
but still allows backend insns where the result mode differs from the operand mode.

> On 1 Jan 2023, at 21:03, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> On Sun, Jan 01, 2023 at 03:55:26PM -0000, Roger Sayle wrote:
>> In 2011, the rtl.texi documentation was updated to reflect that the
>> modes of the RTX unary operations FFS, POPCOUNT and PARITY must
>> match those of their operands.
> 
> Is that not a limitation we should try to get rid of?  It does not
> really make any sense after all.
> 
>> Unfortunately, some of the transformations
>> in simplify-rtx.cc predate this tightening of RTL semantics, and have
>> not (until now) been updated/fixed.  i.e. The POPCOUNT and PARITY
>> optimizations were "correct" when I originally added them back in 2007.
> 
> What would be needed to fix this the "other" way?  Just update the
> documentation?  Or is there some code that requires this strange
> constraint, maybe even some checking thing?
> 
> If we *do* want this strange limitation, we really should have some RTL
> check that enforces it, makes sure we find it if the rules are broken.
> 
> Segher


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.
  2023-01-01 15:55 [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY Roger Sayle
  2023-01-01 21:03 ` Segher Boessenkool
@ 2023-01-02 15:45 ` Jeff Law
  2023-01-02 15:59   ` Jakub Jelinek
  2023-01-02 17:30   ` roger
  2023-02-21 15:39 ` Jakub Jelinek
  2 siblings, 2 replies; 14+ messages in thread
From: Jeff Law @ 2023-01-02 15:45 UTC (permalink / raw)
  To: Roger Sayle, 'GCC Patches'; +Cc: 'Segher Boessenkool'



On 1/1/23 08:55, Roger Sayle wrote:
> In 2011, the rtl.texi documentation was updated to reflect that the
> modes of the RTX unary operations FFS, POPCOUNT and PARITY must
> match those of their operands.  Unfortunately, some of the transformations
> in simplify-rtx.cc predate this tightening of RTL semantics, and have
> not (until now) been updated/fixed.  i.e. The POPCOUNT and PARITY
> optimizations were "correct" when I originally added them back in 2007.
> 
> Segher requested that I split this piece out from a fix for PR 106594 in
> https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601501.html
> 
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?
> 
> 
> 2023-01-01  Roger Sayle  <roger@nextmovesoftware.com>
> 
> gcc/ChangeLog
> 	* gcc/simplify-rtx.cc (simplify_unary_operation_1) <case FFS>:
> 	Avoid generating FFS with mismatched operand and result modes, by
> 	using an explicit SIGN_EXTEND/ZERO_EXTEND instead.
> 	<case POPCOUNT>: Likewise, for POPCOUNT of ZERO_EXTEND.
> 	<case PARITY>: Likewise, for PARITY of {ZERO,SIGN}_EXTEND.
?!?  The docs still seem to indicate to me that the modes of the input 
and output operands can differ.  Let's take PARITY as an example:

> @cindex @code{parity@var{m}2} instruction pattern
> @item @samp{parity@var{m}2}
> Store into operand 0 the parity of operand 1, i.e.@: the number of 1-bits
> in operand 1 modulo 2.
> 
> @var{m} is either a scalar or vector integer mode.  When it is a scalar,
> operand 1 has mode @var{m} but operand 0 can have whatever scalar
> integer mode is suitable for the target.  

The mode of the pattern name has to match the mode of the input operand. 
  The mode of the output operand can differ from the mode of the input 
operand.  we seem to have a disagreement on the documented semantics of 
these opcodes.

In fact Raphael and I were about to submit a patch which takes advantage 
of that capability to improve the code slightly for risc-v.

Jeff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.
  2023-01-02 15:45 ` Jeff Law
@ 2023-01-02 15:59   ` Jakub Jelinek
  2023-01-02 16:20     ` Jeff Law
  2023-01-02 17:30   ` roger
  1 sibling, 1 reply; 14+ messages in thread
From: Jakub Jelinek @ 2023-01-02 15:59 UTC (permalink / raw)
  To: Jeff Law; +Cc: Roger Sayle, 'GCC Patches', 'Segher Boessenkool'

On Mon, Jan 02, 2023 at 08:45:15AM -0700, Jeff Law via Gcc-patches wrote:
> On 1/1/23 08:55, Roger Sayle wrote:
> > In 2011, the rtl.texi documentation was updated to reflect that the
> > modes of the RTX unary operations FFS, POPCOUNT and PARITY must
> > match those of their operands.  Unfortunately, some of the transformations
> > in simplify-rtx.cc predate this tightening of RTL semantics, and have
> > not (until now) been updated/fixed.  i.e. The POPCOUNT and PARITY
> > optimizations were "correct" when I originally added them back in 2007.
> > 
> > Segher requested that I split this piece out from a fix for PR 106594 in
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601501.html
> > 
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32},
> > with no new failures.  Ok for mainline?
> > 
> > 
> > 2023-01-01  Roger Sayle  <roger@nextmovesoftware.com>
> > 
> > gcc/ChangeLog
> > 	* gcc/simplify-rtx.cc (simplify_unary_operation_1) <case FFS>:
> > 	Avoid generating FFS with mismatched operand and result modes, by
> > 	using an explicit SIGN_EXTEND/ZERO_EXTEND instead.
> > 	<case POPCOUNT>: Likewise, for POPCOUNT of ZERO_EXTEND.
> > 	<case PARITY>: Likewise, for PARITY of {ZERO,SIGN}_EXTEND.
> ?!?  The docs still seem to indicate to me that the modes of the input and
> output operands can differ.  Let's take PARITY as an example:

See the PR50161 thread in
https://gcc.gnu.org/legacy-ml/gcc-patches/2011-08/threads.html#01847
The options are to disallow different modes, which is what my patch did
(perhaps not all documentation has been tweaked), or ensure that the operand
of those is never constant.  The latter is much harder and needs to be done
in many places.  While for SUBREG/ZERO_EXTEND/SIGN_EXTEND and to some extend
also FLOAT/UNSIGNED_FLOAT we already try hard not to fold those immediately
(and still find every now and then spots where we don't do that), for the
rarely used unary rtls we certainly don't.

> In fact Raphael and I were about to submit a patch which takes advantage of
> that capability to improve the code slightly for risc-v.

Just use a pattern with zero_extend or sign_extend around it or subreg of
it?

	Jakub


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.
  2023-01-02 15:59   ` Jakub Jelinek
@ 2023-01-02 16:20     ` Jeff Law
  2023-01-02 17:22       ` Jakub Jelinek
  2023-01-03 11:33       ` Segher Boessenkool
  0 siblings, 2 replies; 14+ messages in thread
From: Jeff Law @ 2023-01-02 16:20 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Roger Sayle, 'GCC Patches', 'Segher Boessenkool'



On 1/2/23 08:59, Jakub Jelinek wrote:
> On Mon, Jan 02, 2023 at 08:45:15AM -0700, Jeff Law via Gcc-patches wrote:
>> On 1/1/23 08:55, Roger Sayle wrote:
>>> In 2011, the rtl.texi documentation was updated to reflect that the
>>> modes of the RTX unary operations FFS, POPCOUNT and PARITY must
>>> match those of their operands.  Unfortunately, some of the transformations
>>> in simplify-rtx.cc predate this tightening of RTL semantics, and have
>>> not (until now) been updated/fixed.  i.e. The POPCOUNT and PARITY
>>> optimizations were "correct" when I originally added them back in 2007.
>>>
>>> Segher requested that I split this piece out from a fix for PR 106594 in
>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601501.html
>>>
>>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
>>> and make -k check, both with and without --target_board=unix{-m32},
>>> with no new failures.  Ok for mainline?
>>>
>>>
>>> 2023-01-01  Roger Sayle  <roger@nextmovesoftware.com>
>>>
>>> gcc/ChangeLog
>>> 	* gcc/simplify-rtx.cc (simplify_unary_operation_1) <case FFS>:
>>> 	Avoid generating FFS with mismatched operand and result modes, by
>>> 	using an explicit SIGN_EXTEND/ZERO_EXTEND instead.
>>> 	<case POPCOUNT>: Likewise, for POPCOUNT of ZERO_EXTEND.
>>> 	<case PARITY>: Likewise, for PARITY of {ZERO,SIGN}_EXTEND.
>> ?!?  The docs still seem to indicate to me that the modes of the input and
>> output operands can differ.  Let's take PARITY as an example:
> 
> See the PR50161 thread in
> https://gcc.gnu.org/legacy-ml/gcc-patches/2011-08/threads.html#01847
> The options are to disallow different modes, which is what my patch did
> (perhaps not all documentation has been tweaked), or ensure that the operand
> of those is never constant.  The latter is much harder and needs to be done
> in many places.  While for SUBREG/ZERO_EXTEND/SIGN_EXTEND and to some extend
> also FLOAT/UNSIGNED_FLOAT we already try hard not to fold those immediately
> (and still find every now and then spots where we don't do that), for the
> rarely used unary rtls we certainly don't.
Sigh.  Lack of modes on constants mucking things up elsewhere.  There's 
no good reason other than our poor representation to force the input and 
output modes to match for these instructions.


> 
>> In fact Raphael and I were about to submit a patch which takes advantage of
>> that capability to improve the code slightly for risc-v.
> 
> Just use a pattern with zero_extend or sign_extend around it or subreg of
> it?
If it were only that easy ;(  In the bowels of the simplifications the 
zero_extension turns into either a pair of shifts or an AND with a mask 
(I forget which offhand).  I'm sure we *can* work around this in the 
target, but it'll be ugly.

The documentation definitely needs to be updated.  I looked at the whole 
family a few weeks ago and my recollection was they all need to be fixed 
(ffs, clrsb, clz, ctz, popcount & parity) if the defined semantics are 
that the input and output operand modes must match.

Jeff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.
  2023-01-02 16:20     ` Jeff Law
@ 2023-01-02 17:22       ` Jakub Jelinek
  2023-01-02 17:38         ` Jeff Law
  2023-01-03 11:33       ` Segher Boessenkool
  1 sibling, 1 reply; 14+ messages in thread
From: Jakub Jelinek @ 2023-01-02 17:22 UTC (permalink / raw)
  To: Jeff Law; +Cc: Roger Sayle, 'GCC Patches', 'Segher Boessenkool'

On Mon, Jan 02, 2023 at 09:20:33AM -0700, Jeff Law wrote:
> > > In fact Raphael and I were about to submit a patch which takes advantage of
> > > that capability to improve the code slightly for risc-v.
> > 
> > Just use a pattern with zero_extend or sign_extend around it or subreg of
> > it?
> If it were only that easy ;(  In the bowels of the simplifications the
> zero_extension turns into either a pair of shifts or an AND with a mask (I
> forget which offhand).  I'm sure we *can* work around this in the target,
> but it'll be ugly.
> 
> The documentation definitely needs to be updated.  I looked at the whole
> family a few weeks ago and my recollection was they all need to be fixed
> (ffs, clrsb, clz, ctz, popcount & parity) if the defined semantics are that
> the input and output operand modes must match.

When I look at the documentation of all the above, all of them have
"The mode of @var{x} must be @var{m} or @code{VOIDmode}."

	Jakub


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.
  2023-01-02 15:45 ` Jeff Law
  2023-01-02 15:59   ` Jakub Jelinek
@ 2023-01-02 17:30   ` roger
  2023-01-02 17:47     ` Jeff Law
  1 sibling, 1 reply; 14+ messages in thread
From: roger @ 2023-01-02 17:30 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC Patches


Hi Jeff,

> On 2 Jan 2023, at 15:45, Jeff Law <jeffreyalaw@gmail.com> wrote:
> On 1/1/23 08:55, Roger Sayle wrote:
>> In 2011, the rtl.texi documentation was updated to reflect that the
>> modes of the RTX unary operations FFS, POPCOUNT and PARITY must
>> match those of their operands.  Unfortunately, some of the transformations
>> in simplify-rtx.cc predate this tightening of RTL semantics, and have
>> not (until now) been updated/fixed.  i.e. The POPCOUNT and PARITY
>> optimizations were "correct" when I originally added them back in 2007.
>> Segher requested that I split this piece out from a fix for PR 106594 in
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601501.html
>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
>> and make -k check, both with and without --target_board=unix{-m32},
>> with no new failures.  Ok for mainline?
>> 2023-01-01  Roger Sayle  <roger@nextmovesoftware.com>
>> gcc/ChangeLog
>> 	* gcc/simplify-rtx.cc (simplify_unary_operation_1) <case FFS>:
>> 	Avoid generating FFS with mismatched operand and result modes, by
>> 	using an explicit SIGN_EXTEND/ZERO_EXTEND instead.
>> 	<case POPCOUNT>: Likewise, for POPCOUNT of ZERO_EXTEND.
>> 	<case PARITY>: Likewise, for PARITY of {ZERO,SIGN}_EXTEND.
> ?!?  The docs still seem to indicate to me that the modes of the input and output operands can differ.
> Let's take PARITY as an example:
> 
>> @cindex @code{parity@var{m}2} instruction pattern
>> @item @samp{parity@var{m}2}
>> Store into operand 0 the parity of operand 1, i.e.@: the number of 1-bits
>> in operand 1 modulo 2.
>> @var{m} is either a scalar or vector integer mode.  When it is a scalar,
>> operand 1 has mode @var{m} but operand 0 can have whatever scalar
>> integer mode is suitable for the target.  
> 
> The mode of the pattern name has to match the mode of the input operand.  The mode of the
> output operand can differ from the mode of the input operand.  we seem to have a disagreement
> on the documented semantics of these opcodes.

The documentation that you're looking at is the definition of the parity optab in
md.texi, not the definition of the PARITY rtx in rtl.texi.  The distinction is subtle.
Hence a backend can define paritysiqi2 but in the RTL pattern it expands to the
unary PARITY operator must have the same result type as its operand type,
wrapped in either a truncate or extension if necessary.

I hope this helps.

Cheers,
Roger
--


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.
  2023-01-02 17:22       ` Jakub Jelinek
@ 2023-01-02 17:38         ` Jeff Law
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff Law @ 2023-01-02 17:38 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Roger Sayle, 'GCC Patches', 'Segher Boessenkool'



On 1/2/23 10:22, Jakub Jelinek wrote:
> On Mon, Jan 02, 2023 at 09:20:33AM -0700, Jeff Law wrote:
>>>> In fact Raphael and I were about to submit a patch which takes advantage of
>>>> that capability to improve the code slightly for risc-v.
>>>
>>> Just use a pattern with zero_extend or sign_extend around it or subreg of
>>> it?
>> If it were only that easy ;(  In the bowels of the simplifications the
>> zero_extension turns into either a pair of shifts or an AND with a mask (I
>> forget which offhand).  I'm sure we *can* work around this in the target,
>> but it'll be ugly.
>>
>> The documentation definitely needs to be updated.  I looked at the whole
>> family a few weeks ago and my recollection was they all need to be fixed
>> (ffs, clrsb, clz, ctz, popcount & parity) if the defined semantics are that
>> the input and output operand modes must match.
> 
> When I look at the documentation of all the above, all of them have
> "The mode of @var{x} must be @var{m} or @code{VOIDmode}."
M is the mode on the insn (ie clzsi2).  I don't see a reference to
                                  ^^
@var{x}.  Weird.

jeff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.
  2023-01-02 17:30   ` roger
@ 2023-01-02 17:47     ` Jeff Law
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff Law @ 2023-01-02 17:47 UTC (permalink / raw)
  To: roger; +Cc: GCC Patches



On 1/2/23 10:30, roger@nextmovesoftware.com wrote:
> 
> Hi Jeff,
> 
>> On 2 Jan 2023, at 15:45, Jeff Law <jeffreyalaw@gmail.com> wrote:
>> On 1/1/23 08:55, Roger Sayle wrote:
>>> In 2011, the rtl.texi documentation was updated to reflect that the
>>> modes of the RTX unary operations FFS, POPCOUNT and PARITY must
>>> match those of their operands.  Unfortunately, some of the transformations
>>> in simplify-rtx.cc predate this tightening of RTL semantics, and have
>>> not (until now) been updated/fixed.  i.e. The POPCOUNT and PARITY
>>> optimizations were "correct" when I originally added them back in 2007.
>>> Segher requested that I split this piece out from a fix for PR 106594 in
>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601501.html
>>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
>>> and make -k check, both with and without --target_board=unix{-m32},
>>> with no new failures.  Ok for mainline?
>>> 2023-01-01  Roger Sayle  <roger@nextmovesoftware.com>
>>> gcc/ChangeLog
>>> 	* gcc/simplify-rtx.cc (simplify_unary_operation_1) <case FFS>:
>>> 	Avoid generating FFS with mismatched operand and result modes, by
>>> 	using an explicit SIGN_EXTEND/ZERO_EXTEND instead.
>>> 	<case POPCOUNT>: Likewise, for POPCOUNT of ZERO_EXTEND.
>>> 	<case PARITY>: Likewise, for PARITY of {ZERO,SIGN}_EXTEND.
>> ?!?  The docs still seem to indicate to me that the modes of the input and output operands can differ.
>> Let's take PARITY as an example:
>>
>>> @cindex @code{parity@var{m}2} instruction pattern
>>> @item @samp{parity@var{m}2}
>>> Store into operand 0 the parity of operand 1, i.e.@: the number of 1-bits
>>> in operand 1 modulo 2.
>>> @var{m} is either a scalar or vector integer mode.  When it is a scalar,
>>> operand 1 has mode @var{m} but operand 0 can have whatever scalar
>>> integer mode is suitable for the target.
>>
>> The mode of the pattern name has to match the mode of the input operand.  The mode of the
>> output operand can differ from the mode of the input operand.  we seem to have a disagreement
>> on the documented semantics of these opcodes.
> 
> The documentation that you're looking at is the definition of the parity optab in
> md.texi, not the definition of the PARITY rtx in rtl.texi.  The distinction is subtle.
> Hence a backend can define paritysiqi2 but in the RTL pattern it expands to the
> unary PARITY operator must have the same result type as its operand type,
> wrapped in either a truncate or extension if necessary.
Hmm, yea I was looking at md.texi, not rtl.texi.  I'll take a look at 
the latter after I get some coffee.

Thanks!
jeff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.
  2023-01-02 16:20     ` Jeff Law
  2023-01-02 17:22       ` Jakub Jelinek
@ 2023-01-03 11:33       ` Segher Boessenkool
  1 sibling, 0 replies; 14+ messages in thread
From: Segher Boessenkool @ 2023-01-03 11:33 UTC (permalink / raw)
  To: Jeff Law; +Cc: Jakub Jelinek, Roger Sayle, 'GCC Patches'

On Mon, Jan 02, 2023 at 09:20:33AM -0700, Jeff Law wrote:
> On 1/2/23 08:59, Jakub Jelinek wrote:
> >See the PR50161 thread in
> >https://gcc.gnu.org/legacy-ml/gcc-patches/2011-08/threads.html#01847

Nasty nasty nasty.

> >The options are to disallow different modes, which is what my patch did
> >(perhaps not all documentation has been tweaked),

Which isn't so bad, except that simplifiers (in general, not just
simplify-rtx) will try to get rid of it.  Since there is no RTL check
that tests this property, we lose.\

> > or ensure that the 
> >operand
> >of those is never constant.

This means we cannot express intermediate stages of simplification in
RTL, which goes counter the whole idea of RTL.

> Sigh.  Lack of modes on constants mucking things up elsewhere.  There's 
> no good reason other than our poor representation to force the input and 
> output modes to match for these instructions.

But things like popcount need to know the mode of the input, if it is
a negative constant anyway.  Maybe we could simply disallow that?


Segher

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.
  2023-01-01 15:55 [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY Roger Sayle
  2023-01-01 21:03 ` Segher Boessenkool
  2023-01-02 15:45 ` Jeff Law
@ 2023-02-21 15:39 ` Jakub Jelinek
  2023-02-26 13:10   ` Roger Sayle
  2 siblings, 1 reply; 14+ messages in thread
From: Jakub Jelinek @ 2023-02-21 15:39 UTC (permalink / raw)
  To: Jeff Law, 'Segher Boessenkool', Roger Sayle; +Cc: 'GCC Patches'

On Sun, Jan 01, 2023 at 03:55:26PM -0000, Roger Sayle wrote:
> In 2011, the rtl.texi documentation was updated to reflect that the
> modes of the RTX unary operations FFS, POPCOUNT and PARITY must
> match those of their operands.  Unfortunately, some of the transformations
> in simplify-rtx.cc predate this tightening of RTL semantics, and have
> not (until now) been updated/fixed.  i.e. The POPCOUNT and PARITY
> optimizations were "correct" when I originally added them back in 2007.
> 
> Segher requested that I split this piece out from a fix for PR 106594 in
> https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601501.html
> 
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?
> 
> 
> 2023-01-01  Roger Sayle  <roger@nextmovesoftware.com>
> 
> gcc/ChangeLog
> 	* gcc/simplify-rtx.cc (simplify_unary_operation_1) <case FFS>:

BTW, gcc/ prefix shouldn't be in the ChangeLog entry.

> 	Avoid generating FFS with mismatched operand and result modes, by
> 	using an explicit SIGN_EXTEND/ZERO_EXTEND instead.
> 	<case POPCOUNT>: Likewise, for POPCOUNT of ZERO_EXTEND.
> 	<case PARITY>: Likewise, for PARITY of {ZERO,SIGN}_EXTEND.

Can we make progress on this?
I've looked with
make mddump
grep '(\(ffs\|popcount\|parity\):' tmp-mddump.md | grep -v '(\(ffs\|popcount\|parity\):\([A-Z0-9x]*\) (match_operand:\2'
at various targets.  This prints nothing at all on aarch64, arm,
gcn, powerpc, rl78, visium, on x86 it prints
            (popcount:SI (zero_extend:SI (match_operand:HI 1 ("nonimmediate_operand") ("")))))
                    (popcount:SI (match_dup 1)))
                    (popcount:DI (match_dup 1)))
                    (and:DI (subreg:DI (popcount:SI (match_dup 1)) 0)
                    (zero_extend:DI (popcount:SI (match_dup 1))))
            (popcount:SI (zero_extend:SI (match_operand:HI 1 ("nonimmediate_operand") ("")))))
                    (popcount:SI (match_dup 0)))
where the (popcount:SI (zero_extend:SI cases are *popcounthi2_1 define_insn_and_split
but we have *popcounthi2_2 which has (zero_extend:SI (popcount:HI and so is
ok, and I've manually checked the match_dup cases and the match_operand
corresponding to match_dup has the right mode, on mips it prints
            (popcount:SI (truncate:SI (match_operand:DI 1 ("register_operand") ("d")))))
which is also ok, on sparc
            (truncate:SI (popcount:DI (match_dup 2))))
in popcountsi2 also looks ok, on s390x
                    (popcount:DI (match_dup 2)))
                    (popcount:DI (match_dup 2)))
where I've also manually verified popcountsi2 and popcounthi2 to be ok.
And then riscv, see below.
Now, for the above arches, everything thus matches the rtl.texi
documentation and popcount/parity/ffs match or create only matching
modes between the RTL and its operand and Roger's patch seems to be IMHO
the way to go, because what simplify-rtx.cc does now for those cases
the patch touches is that it will always result in something that won't
match because it is invalid.  With the patch there is at least some chance
it might match something.

Now, riscv is the only target for which I have tmp-mddump.md around and
which doesn't have matching modes in the machine description:
            (popcount:SI (unspec:VNx1BI [
            (popcount:SI (unspec:VNx2BI [
            (popcount:SI (unspec:VNx4BI [
            (popcount:SI (unspec:VNx8BI [
            (popcount:SI (unspec:VNx16BI [
            (popcount:SI (unspec:VNx32BI [
            (popcount:SI (unspec:VNx64BI [
            (popcount:DI (unspec:VNx1BI [
            (popcount:DI (unspec:VNx2BI [
            (popcount:DI (unspec:VNx4BI [
            (popcount:DI (unspec:VNx8BI [
            (popcount:DI (unspec:VNx16BI [
            (popcount:DI (unspec:VNx32BI [
            (popcount:DI (unspec:VNx64BI [
            (plus:SI (ffs:SI (unspec:VNx1BI [
            (plus:SI (ffs:SI (unspec:VNx2BI [
            (plus:SI (ffs:SI (unspec:VNx4BI [
            (plus:SI (ffs:SI (unspec:VNx8BI [
            (plus:SI (ffs:SI (unspec:VNx16BI [
            (plus:SI (ffs:SI (unspec:VNx32BI [
            (plus:SI (ffs:SI (unspec:VNx64BI [
            (plus:DI (ffs:DI (unspec:VNx1BI [
            (plus:DI (ffs:DI (unspec:VNx2BI [
            (plus:DI (ffs:DI (unspec:VNx4BI [
            (plus:DI (ffs:DI (unspec:VNx8BI [
            (plus:DI (ffs:DI (unspec:VNx16BI [
            (plus:DI (ffs:DI (unspec:VNx32BI [
            (plus:DI (ffs:DI (unspec:VNx64BI [
While invalid against current documentation and to me it isn't clear what
they actually do, sum up popcounts of all vector elements and return index
of first non-zero element + ffs in it or something similar, I think Roger's
patch shouldn't change anything for these either because those patterns
use unspecs with vector modes, so there won't be zero_extend/sign_extend
from thos to integral mode that the patch could swap around.

> diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
> index fc0d6c3..698ca6e 100644
> --- a/gcc/simplify-rtx.cc
> +++ b/gcc/simplify-rtx.cc
> @@ -1404,22 +1404,32 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
>        break;
>  
>      case FFS:
> -      /* (ffs (*_extend <X>)) = (ffs <X>) */
> +      /* (ffs (*_extend <X>)) = (*_extend (ffs <X>)).  */
>        if (GET_CODE (op) == SIGN_EXTEND
>  	  || GET_CODE (op) == ZERO_EXTEND)
> -	return simplify_gen_unary (FFS, mode, XEXP (op, 0),
> -				   GET_MODE (XEXP (op, 0)));
> +	{
> +	  temp = simplify_gen_unary (FFS, GET_MODE (XEXP (op, 0)),
> +				     XEXP (op, 0), GET_MODE (XEXP (op, 0)));
> +	  return simplify_gen_unary (GET_CODE (op), mode, temp,
> +				     GET_MODE (temp));
> +	}
>        break;
>  
>      case POPCOUNT:
>        switch (GET_CODE (op))
>  	{
>  	case BSWAP:
> -	case ZERO_EXTEND:
> -	  /* (popcount (zero_extend <X>)) = (popcount <X>) */
> +	  /* (popcount (bswap <X>)) = (popcount <X>).  */
>  	  return simplify_gen_unary (POPCOUNT, mode, XEXP (op, 0),
>  				     GET_MODE (XEXP (op, 0)));
>  
> +	case ZERO_EXTEND:
> +	  /* (popcount (zero_extend <X>)) = (zero_extend (popcount <X>)).  */
> +	  temp = simplify_gen_unary (POPCOUNT, GET_MODE (XEXP (op, 0)),
> +				     XEXP (op, 0), GET_MODE (XEXP (op, 0)));
> +	  return simplify_gen_unary (ZERO_EXTEND, mode, temp,
> +				     GET_MODE (temp));
> +
>  	case ROTATE:
>  	case ROTATERT:
>  	  /* Rotations don't affect popcount.  */
> @@ -1438,11 +1448,16 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
>  	{
>  	case NOT:
>  	case BSWAP:
> -	case ZERO_EXTEND:
> -	case SIGN_EXTEND:
>  	  return simplify_gen_unary (PARITY, mode, XEXP (op, 0),
>  				     GET_MODE (XEXP (op, 0)));
>  
> +	case ZERO_EXTEND:
> +	case SIGN_EXTEND:
> +	  temp = simplify_gen_unary (PARITY, GET_MODE (XEXP (op, 0)),
> +				     XEXP (op, 0), GET_MODE (XEXP (op, 0)));
> +	  return simplify_gen_unary (GET_CODE (op), mode, temp,
> +				     GET_MODE (temp));
> +
>  	case ROTATE:
>  	case ROTATERT:
>  	  /* Rotations don't affect parity.  */


	Jakub


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.
  2023-02-21 15:39 ` Jakub Jelinek
@ 2023-02-26 13:10   ` Roger Sayle
  2023-02-27 16:47     ` Segher Boessenkool
  0 siblings, 1 reply; 14+ messages in thread
From: Roger Sayle @ 2023-02-26 13:10 UTC (permalink / raw)
  To: 'Jakub Jelinek', 'Jeff Law',
	'Segher Boessenkool'
  Cc: 'GCC Patches'

[-- Attachment #1: Type: text/plain, Size: 9566 bytes --]


To assist Jakub's request for progress on this, I've re-tested this patch
(which
still applies cleanly) on x86_64-pc-linux-gnu, again without any
regressions.
This is a middle-end piece of my preferred solution to PR106594, which is a
P1 regression affecting aarch64.

This patch  teaches simplify-rtx.cc to err on the side of caution, by never
creating (new) FFS, POPCOUNT or PARITY rtx with mismatched modes,
matching the documentation.

I've a related (optional) backend patch for nvptx that was posted here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609571.html
but this isn't strictly required, and is effectively a clean-up.

The RISC-V vector popcount instructions mentioned by Jakub are a similar
representation problem.   I'd suggest that the preferred way to represent
these in RTL is as a vector-to-vector popcount, followed by a sum reduction
[RISC-V already has a vector sum reduction insn/UNSPEC].  This would, in
theory, allow constant folding of this instruction for known operands, but
unfortunately, (I believe) GCC doesn't (yet) have an RTX for representing
(sum) reductions (from vectors to scalars)?.

This patch has been retested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2023-02-26  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        * simplify-rtx.cc (simplify_unary_operation_1) <case FFS>:
        Avoid generating FFS with mismatched operand and result modes,
        by using an explicit SIGN_EXTEND/ZERO_EXTEND.
        <case POPCOUNT>: Likewise, for POPCOUNT of ZERO_EXTEND.
        <case PARITY>: Likewise, for PARITY of {ZERO,SIGN}_EXTEND.

Thanks in advance,
Roger

> -----Original Message-----
> From: Jakub Jelinek <jakub@redhat.com>
> Sent: 21 February 2023 15:39
> To: Jeff Law <jeffreyalaw@gmail.com>; 'Segher Boessenkool'
> <segher@kernel.crashing.org>; Roger Sayle <roger@nextmovesoftware.com>
> Cc: 'GCC Patches' <gcc-patches@gcc.gnu.org>
> Subject: Re: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.
> 
> On Sun, Jan 01, 2023 at 03:55:26PM -0000, Roger Sayle wrote:
> > In 2011, the rtl.texi documentation was updated to reflect that the
> > modes of the RTX unary operations FFS, POPCOUNT and PARITY must match
> > those of their operands.  Unfortunately, some of the transformations
> > in simplify-rtx.cc predate this tightening of RTL semantics, and have
> > not (until now) been updated/fixed.  i.e. The POPCOUNT and PARITY
> > optimizations were "correct" when I originally added them back in 2007.
> >
> > Segher requested that I split this piece out from a fix for PR 106594
> > in
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601501.html
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32},
> > with no new failures.  Ok for mainline?
> >
> >
> > 2023-01-01  Roger Sayle  <roger@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> > 	* gcc/simplify-rtx.cc (simplify_unary_operation_1) <case FFS>:
> 
> BTW, gcc/ prefix shouldn't be in the ChangeLog entry.
> 
> > 	Avoid generating FFS with mismatched operand and result modes, by
> > 	using an explicit SIGN_EXTEND/ZERO_EXTEND instead.
> > 	<case POPCOUNT>: Likewise, for POPCOUNT of ZERO_EXTEND.
> > 	<case PARITY>: Likewise, for PARITY of {ZERO,SIGN}_EXTEND.
> 
> Can we make progress on this?
> I've looked with
> make mddump
> grep '(\(ffs\|popcount\|parity\):' tmp-mddump.md | grep -v
> '(\(ffs\|popcount\|parity\):\([A-Z0-9x]*\) (match_operand:\2'
> at various targets.  This prints nothing at all on aarch64, arm, gcn,
powerpc,
> rl78, visium, on x86 it prints
>             (popcount:SI (zero_extend:SI (match_operand:HI 1
> ("nonimmediate_operand") ("")))))
>                     (popcount:SI (match_dup 1)))
>                     (popcount:DI (match_dup 1)))
>                     (and:DI (subreg:DI (popcount:SI (match_dup 1)) 0)
>                     (zero_extend:DI (popcount:SI (match_dup 1))))
>             (popcount:SI (zero_extend:SI (match_operand:HI 1
> ("nonimmediate_operand") ("")))))
>                     (popcount:SI (match_dup 0))) where the (popcount:SI
(zero_extend:SI
> cases are *popcounthi2_1 define_insn_and_split but we have *popcounthi2_2
> which has (zero_extend:SI (popcount:HI and so is ok, and I've manually
checked
> the match_dup cases and the match_operand corresponding to match_dup has
> the right mode, on mips it prints
>             (popcount:SI (truncate:SI (match_operand:DI 1
("register_operand")
> ("d"))))) which is also ok, on sparc
>             (truncate:SI (popcount:DI (match_dup 2)))) in popcountsi2 also
looks ok,
> on s390x
>                     (popcount:DI (match_dup 2)))
>                     (popcount:DI (match_dup 2))) where I've also manually
verified
> popcountsi2 and popcounthi2 to be ok.
> And then riscv, see below.
> Now, for the above arches, everything thus matches the rtl.texi
documentation
> and popcount/parity/ffs match or create only matching modes between the
RTL
> and its operand and Roger's patch seems to be IMHO the way to go, because
> what simplify-rtx.cc does now for those cases the patch touches is that it
will
> always result in something that won't match because it is invalid.  With
the patch
> there is at least some chance it might match something.
> 
> Now, riscv is the only target for which I have tmp-mddump.md around and
> which doesn't have matching modes in the machine description:
>             (popcount:SI (unspec:VNx1BI [
>             (popcount:SI (unspec:VNx2BI [
>             (popcount:SI (unspec:VNx4BI [
>             (popcount:SI (unspec:VNx8BI [
>             (popcount:SI (unspec:VNx16BI [
>             (popcount:SI (unspec:VNx32BI [
>             (popcount:SI (unspec:VNx64BI [
>             (popcount:DI (unspec:VNx1BI [
>             (popcount:DI (unspec:VNx2BI [
>             (popcount:DI (unspec:VNx4BI [
>             (popcount:DI (unspec:VNx8BI [
>             (popcount:DI (unspec:VNx16BI [
>             (popcount:DI (unspec:VNx32BI [
>             (popcount:DI (unspec:VNx64BI [
>             (plus:SI (ffs:SI (unspec:VNx1BI [
>             (plus:SI (ffs:SI (unspec:VNx2BI [
>             (plus:SI (ffs:SI (unspec:VNx4BI [
>             (plus:SI (ffs:SI (unspec:VNx8BI [
>             (plus:SI (ffs:SI (unspec:VNx16BI [
>             (plus:SI (ffs:SI (unspec:VNx32BI [
>             (plus:SI (ffs:SI (unspec:VNx64BI [
>             (plus:DI (ffs:DI (unspec:VNx1BI [
>             (plus:DI (ffs:DI (unspec:VNx2BI [
>             (plus:DI (ffs:DI (unspec:VNx4BI [
>             (plus:DI (ffs:DI (unspec:VNx8BI [
>             (plus:DI (ffs:DI (unspec:VNx16BI [
>             (plus:DI (ffs:DI (unspec:VNx32BI [
>             (plus:DI (ffs:DI (unspec:VNx64BI [ While invalid against
current
> documentation and to me it isn't clear what they actually do, sum up
popcounts
> of all vector elements and return index of first non-zero element + ffs in
it or
> something similar, I think Roger's patch shouldn't change anything for
these
> either because those patterns use unspecs with vector modes, so there
won't be
> zero_extend/sign_extend from thos to integral mode that the patch could
swap
> around.
> 
> > diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc index
> > fc0d6c3..698ca6e 100644
> > --- a/gcc/simplify-rtx.cc
> > +++ b/gcc/simplify-rtx.cc
> > @@ -1404,22 +1404,32 @@ simplify_context::simplify_unary_operation_1
> (rtx_code code, machine_mode mode,
> >        break;
> >
> >      case FFS:
> > -      /* (ffs (*_extend <X>)) = (ffs <X>) */
> > +      /* (ffs (*_extend <X>)) = (*_extend (ffs <X>)).  */
> >        if (GET_CODE (op) == SIGN_EXTEND
> >  	  || GET_CODE (op) == ZERO_EXTEND)
> > -	return simplify_gen_unary (FFS, mode, XEXP (op, 0),
> > -				   GET_MODE (XEXP (op, 0)));
> > +	{
> > +	  temp = simplify_gen_unary (FFS, GET_MODE (XEXP (op, 0)),
> > +				     XEXP (op, 0), GET_MODE (XEXP (op, 0)));
> > +	  return simplify_gen_unary (GET_CODE (op), mode, temp,
> > +				     GET_MODE (temp));
> > +	}
> >        break;
> >
> >      case POPCOUNT:
> >        switch (GET_CODE (op))
> >  	{
> >  	case BSWAP:
> > -	case ZERO_EXTEND:
> > -	  /* (popcount (zero_extend <X>)) = (popcount <X>) */
> > +	  /* (popcount (bswap <X>)) = (popcount <X>).  */
> >  	  return simplify_gen_unary (POPCOUNT, mode, XEXP (op, 0),
> >  				     GET_MODE (XEXP (op, 0)));
> >
> > +	case ZERO_EXTEND:
> > +	  /* (popcount (zero_extend <X>)) = (zero_extend (popcount <X>)).
*/
> > +	  temp = simplify_gen_unary (POPCOUNT, GET_MODE (XEXP (op, 0)),
> > +				     XEXP (op, 0), GET_MODE (XEXP (op, 0)));
> > +	  return simplify_gen_unary (ZERO_EXTEND, mode, temp,
> > +				     GET_MODE (temp));
> > +
> >  	case ROTATE:
> >  	case ROTATERT:
> >  	  /* Rotations don't affect popcount.  */ @@ -1438,11 +1448,16 @@
> > simplify_context::simplify_unary_operation_1 (rtx_code code,
machine_mode
> mode,
> >  	{
> >  	case NOT:
> >  	case BSWAP:
> > -	case ZERO_EXTEND:
> > -	case SIGN_EXTEND:
> >  	  return simplify_gen_unary (PARITY, mode, XEXP (op, 0),
> >  				     GET_MODE (XEXP (op, 0)));
> >
> > +	case ZERO_EXTEND:
> > +	case SIGN_EXTEND:
> > +	  temp = simplify_gen_unary (PARITY, GET_MODE (XEXP (op, 0)),
> > +				     XEXP (op, 0), GET_MODE (XEXP (op, 0)));
> > +	  return simplify_gen_unary (GET_CODE (op), mode, temp,
> > +				     GET_MODE (temp));
> > +
> >  	case ROTATE:
> >  	case ROTATERT:
> >  	  /* Rotations don't affect parity.  */
> 
> 
> 	Jakub


[-- Attachment #2: patchaa4c.txt --]
[-- Type: text/plain, Size: 2076 bytes --]

diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index 3955929..2c82256 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -1404,22 +1404,32 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
       break;
 
     case FFS:
-      /* (ffs (*_extend <X>)) = (ffs <X>) */
+      /* (ffs (*_extend <X>)) = (*_extend (ffs <X>)).  */
       if (GET_CODE (op) == SIGN_EXTEND
 	  || GET_CODE (op) == ZERO_EXTEND)
-	return simplify_gen_unary (FFS, mode, XEXP (op, 0),
-				   GET_MODE (XEXP (op, 0)));
+	{
+	  temp = simplify_gen_unary (FFS, GET_MODE (XEXP (op, 0)),
+				     XEXP (op, 0), GET_MODE (XEXP (op, 0)));
+	  return simplify_gen_unary (GET_CODE (op), mode, temp,
+				     GET_MODE (temp));
+	}
       break;
 
     case POPCOUNT:
       switch (GET_CODE (op))
 	{
 	case BSWAP:
-	case ZERO_EXTEND:
-	  /* (popcount (zero_extend <X>)) = (popcount <X>) */
+	  /* (popcount (bswap <X>)) = (popcount <X>).  */
 	  return simplify_gen_unary (POPCOUNT, mode, XEXP (op, 0),
 				     GET_MODE (XEXP (op, 0)));
 
+	case ZERO_EXTEND:
+	  /* (popcount (zero_extend <X>)) = (zero_extend (popcount <X>)).  */
+	  temp = simplify_gen_unary (POPCOUNT, GET_MODE (XEXP (op, 0)),
+				     XEXP (op, 0), GET_MODE (XEXP (op, 0)));
+	  return simplify_gen_unary (ZERO_EXTEND, mode, temp,
+				     GET_MODE (temp));
+
 	case ROTATE:
 	case ROTATERT:
 	  /* Rotations don't affect popcount.  */
@@ -1438,11 +1448,16 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
 	{
 	case NOT:
 	case BSWAP:
-	case ZERO_EXTEND:
-	case SIGN_EXTEND:
 	  return simplify_gen_unary (PARITY, mode, XEXP (op, 0),
 				     GET_MODE (XEXP (op, 0)));
 
+	case ZERO_EXTEND:
+	case SIGN_EXTEND:
+	  temp = simplify_gen_unary (PARITY, GET_MODE (XEXP (op, 0)),
+				     XEXP (op, 0), GET_MODE (XEXP (op, 0)));
+	  return simplify_gen_unary (GET_CODE (op), mode, temp,
+				     GET_MODE (temp));
+
 	case ROTATE:
 	case ROTATERT:
 	  /* Rotations don't affect parity.  */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.
  2023-02-26 13:10   ` Roger Sayle
@ 2023-02-27 16:47     ` Segher Boessenkool
  0 siblings, 0 replies; 14+ messages in thread
From: Segher Boessenkool @ 2023-02-27 16:47 UTC (permalink / raw)
  To: Roger Sayle
  Cc: 'Jakub Jelinek', 'Jeff Law', 'GCC Patches'

Hi!

On Sun, Feb 26, 2023 at 01:10:41PM -0000, Roger Sayle wrote:
> This patch  teaches simplify-rtx.cc to err on the side of caution, by never
> creating (new) FFS, POPCOUNT or PARITY rtx with mismatched modes,
> matching the documentation.

>         * simplify-rtx.cc (simplify_unary_operation_1) <case FFS>:
>         Avoid generating FFS with mismatched operand and result modes,

Please have at least one word after a colon, so that it doesn't look
like something is missing.  Changelog lines are 80 positions long :-)

The patch is okay for trunk.  Thank you!


Segher

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2023-02-27 16:48 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-01 15:55 [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY Roger Sayle
2023-01-01 21:03 ` Segher Boessenkool
2023-01-01 22:59   ` roger
2023-01-02 15:45 ` Jeff Law
2023-01-02 15:59   ` Jakub Jelinek
2023-01-02 16:20     ` Jeff Law
2023-01-02 17:22       ` Jakub Jelinek
2023-01-02 17:38         ` Jeff Law
2023-01-03 11:33       ` Segher Boessenkool
2023-01-02 17:30   ` roger
2023-01-02 17:47     ` Jeff Law
2023-02-21 15:39 ` Jakub Jelinek
2023-02-26 13:10   ` Roger Sayle
2023-02-27 16:47     ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).