public inbox for gcc-patches@gcc.gnu.org
* [PATCH] middle-end: Call negate_rtx instead of simplify_gen_unary expanding rotate shift [PR113033]
@ 2023-12-18 13:42 Xi Ruoyao
  2023-12-18 15:39 ` Jeff Law
  0 siblings, 1 reply; 7+ messages in thread
From: Xi Ruoyao @ 2023-12-18 13:42 UTC
  To: gcc-patches; +Cc: Jakub Jelinek, chenglulu, i, xuchenghua, c, Xi Ruoyao

With simplify_gen_unary we end up with a not fully expanded RTX like

    (set (reg:SI 90) (and:SI (neg:SI (reg:SI 80)) (const_int 63)))

Then it will cause an ICE with unrecognizable insn.

gcc/ChangeLog:

	PR middle-end/113033
	* expmed.cc (expand_shift_1): When expanding rotate shift, call
	negate_rtx instead of simplify_gen_unary (NEG, ...).
---

Bootstrapped and regtested on x86_64-linux-gnu and
loongarch64-linux-gnu.  Ok for trunk?

 gcc/expmed.cc | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index 05331dd5d82..f9e416b9549 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -2634,9 +2634,7 @@ expand_shift_1 (enum tree_code code, machine_mode mode, rtx shifted,
 		  (mode, GET_MODE_BITSIZE (scalar_mode) - INTVAL (op1));
 	      else
 		{
-		  other_amount
-		    = simplify_gen_unary (NEG, GET_MODE (op1),
-					  op1, GET_MODE (op1));
+		  other_amount = negate_rtx (GET_MODE (op1), op1);
 		  HOST_WIDE_INT mask = GET_MODE_PRECISION (scalar_mode) - 1;
 		  other_amount
 		    = simplify_gen_binary (AND, GET_MODE (op1), other_amount,
-- 
2.43.0



* Re: [PATCH] middle-end: Call negate_rtx instead of simplify_gen_unary expanding rotate shift [PR113033]
  2023-12-18 13:42 [PATCH] middle-end: Call negate_rtx instead of simplify_gen_unary expanding rotate shift [PR113033] Xi Ruoyao
@ 2023-12-18 15:39 ` Jeff Law
  2023-12-18 16:48   ` Xi Ruoyao
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Law @ 2023-12-18 15:39 UTC
  To: Xi Ruoyao, gcc-patches; +Cc: Jakub Jelinek, chenglulu, i, xuchenghua, c



On 12/18/23 06:42, Xi Ruoyao wrote:
> With simplify_gen_unary we end up with a not fully expanded RTX like
> 
>      (set (reg:SI 90) (and:SI (neg:SI (reg:SI 80)) (const_int 63)))
> 
> Then it will cause an ICE with unrecognizable insn.
> 
> gcc/ChangeLog:
> 
> 	PR middle-end/113033
> 	* expmed.cc (expand_shift_1): When expanding rotate shift, call
> 	negate_rtx instead of simplify_gen_unary (NEG, ...).
The key difference being that using negate_rtx will go through the 
expander which knows how to synthesize negation whereas 
simplify_gen_unary will just generate a (neg ...) and assume it matches 
something in the backend, right?

If so, this patch is fine for the trunk.
jeff


* Re: [PATCH] middle-end: Call negate_rtx instead of simplify_gen_unary expanding rotate shift [PR113033]
  2023-12-18 15:39 ` Jeff Law
@ 2023-12-18 16:48   ` Xi Ruoyao
  2023-12-18 17:45     ` Jakub Jelinek
  0 siblings, 1 reply; 7+ messages in thread
From: Xi Ruoyao @ 2023-12-18 16:48 UTC
  To: Jeff Law, gcc-patches; +Cc: Jakub Jelinek, chenglulu, i, xuchenghua, c

On Mon, 2023-12-18 at 08:39 -0700, Jeff Law wrote:
> 
> 
> On 12/18/23 06:42, Xi Ruoyao wrote:
> > With simplify_gen_unary we end up with a not fully expanded RTX like
> > 
> >      (set (reg:SI 90) (and:SI (neg:SI (reg:SI 80)) (const_int 63)))
> > 
> > Then it will cause an ICE with unrecognizable insn.
> > 
> > gcc/ChangeLog:
> > 
> > 	PR middle-end/113033
> > 	* expmed.cc (expand_shift_1): When expanding rotate shift, call
> > 	negate_rtx instead of simplify_gen_unary (NEG, ...).

> The key difference being that using negate_rtx will go through the 
> expander which knows how to synthesize negation whereas 
> simplify_gen_unary will just generate a (neg ...) and assume it matches 
> something in the backend, right?

For PR113033 the key difference (to me) is negate_rtx emits an insn to
set a new pseudo reg to -x.  So the result will be

(set (reg:SI 81) (neg:SI (reg:SI 80)))

then

(and (reg:SI 81) (const_int 31))

instead of a consolidated

(and:SI (neg:SI (reg:SI IN)) (const_int 63))

AFAIK no backend has an instruction that negates an operand and then
ANDs it bitwise.

To me, technically the following operation

other_amount
  = simplify_gen_binary (AND, GET_MODE (op1), other_amount,
                         gen_int_mode (mask, GET_MODE (op1)));

should also go through the expander, as negate_rtx does, or we may still
end up with an ICE on backends unable to match

(set (reg:SI XX) (and (reg:SI 81) (const_int 31)))

But fortunately most backends have an AND-immediate instruction, so it's
unlikely to blow up...
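
The other_amount computation discussed here, a negation followed by an
AND with the mask, is the shift count for the second half of a rotate.
A minimal C++ sketch of that identity for 32-bit values follows.  The
helper name rotl32 is hypothetical and does not appear in GCC; it only
illustrates the arithmetic, not how expand_shift_1 emits RTL.

```cpp
#include <cstdint>

// Rotate x left by n bits, written as two shifts.  The second shift
// count is (-n) & 31, i.e. the value the thread writes as
// (and (neg n) (const_int 31)).  Masking both counts keeps every shift
// amount in the range 0..31, so n == 0 and n >= 32 are well defined.
constexpr std::uint32_t rotl32(std::uint32_t x, std::uint32_t n)
{
    return (x << (n & 31u)) | (x >> ((0u - n) & 31u));
}

// The top bit wraps around to bit 0.
static_assert(rotl32(0x80000001u, 1) == 0x00000003u, "rotate by 1");
```

Unsigned negation is used so that the wrap-around is well defined in
C++; the mask then plays the same role as GET_MODE_PRECISION - 1 in the
patch context.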

> If so, this patch is fine for the trunk.

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University


* Re: [PATCH] middle-end: Call negate_rtx instead of simplify_gen_unary expanding rotate shift [PR113033]
  2023-12-18 16:48   ` Xi Ruoyao
@ 2023-12-18 17:45     ` Jakub Jelinek
  2023-12-18 20:01       ` Xi Ruoyao
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Jelinek @ 2023-12-18 17:45 UTC
  To: Xi Ruoyao; +Cc: Jeff Law, gcc-patches, chenglulu, i, xuchenghua, c

On Tue, Dec 19, 2023 at 12:48:46AM +0800, Xi Ruoyao wrote:
> > > gcc/ChangeLog:
> > > 
> > > 	PR middle-end/113033
> > > 	* expmed.cc (expand_shift_1): When expanding rotate shift, call
> > > 	negate_rtx instead of simplify_gen_unary (NEG, ...).
> 
> > The key difference being that using negate_rtx will go through the 
> > expander which knows how to synthesize negation whereas 
> > simplify_gen_unary will just generate a (neg ...) and assume it matches 
> > something in the backend, right?
> 
> For PR113033 the key difference (to me) is negate_rtx emits an insn to
> set a new pseudo reg to -x.  So the result will be
> 
> (set (reg:SI 81) (neg:SI (reg:SI 80)))
> 
> then
> 
> (and (reg:SI 81) (const_int 31))
> 
> instead of a consolidated
> 
> (and:SI (neg:SI (reg:SI IN)) (const_int 63))
> 
> AFAIK no backend has an instruction that negates an operand and then
> ANDs it bitwise.

Can you explain why it doesn't work as is though?
I mean, expand_shift_1 with that (and (neg (reg ...)) (const_int ...))
should try to legitimize the operand (e.g. in maybe_legitimize_operand
-> force_operand), and force_operand should be able to deal with that:
AND is a binary op, so it recurses on the 2 operands, and NEG is UNARY_P,
so the recursion should deal with that if it is not a general_operand.
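
The recursion described above can be pictured with a toy model.  This
is only a sketch: Expr, Emitter and force are hypothetical names, not
GCC's rtx, emit machinery or force_operand, and the pseudo numbering
starting at 100 is an assumption made for the example.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy expression node; a stand-in for illustration, not GCC's rtx.
struct Expr {
    std::string code;                        // "reg", "const_int", "neg" or "and"
    long value;                              // register number or constant value
    std::vector<std::shared_ptr<Expr>> ops;  // operands (empty for leaves)
};
using ExprP = std::shared_ptr<Expr>;

ExprP leaf(const std::string &code, long v)
{
    return std::make_shared<Expr>(Expr{code, v, {}});
}

ExprP node(const std::string &code, std::vector<ExprP> ops)
{
    return std::make_shared<Expr>(Expr{code, 0, std::move(ops)});
}

// Print a node in an RTL-like notation.
std::string show(const ExprP &e)
{
    if (e->code == "reg" || e->code == "const_int")
        return "(" + e->code + " " + std::to_string(e->value) + ")";
    std::string s = "(" + e->code;
    for (const ExprP &op : e->ops)
        s += " " + show(op);
    return s + ")";
}

struct Emitter {
    long next_reg = 100;             // hypothetical first free pseudo
    std::vector<std::string> insns;  // emitted "insns", kept as text

    // Leaves are returned unchanged.  For a unary or binary operation,
    // every operand is legitimized first, then one flat insn computes
    // the result into a fresh pseudo register.
    ExprP force(const ExprP &e)
    {
        if (e->ops.empty())
            return e;
        std::vector<ExprP> flat;
        for (const ExprP &op : e->ops)
            flat.push_back(force(op));
        ExprP dest = leaf("reg", next_reg++);
        insns.push_back("(set " + show(dest) + " "
                        + show(node(e->code, flat)) + ")");
        return dest;
    }
};
```

Calling force on (and (neg (reg 85)) (const_int 31)) in this model
records a set for the NEG first and then a set for the AND whose first
operand is the new pseudo, so no nested operation is left in any insn.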

	Jakub



* Re: [PATCH] middle-end: Call negate_rtx instead of simplify_gen_unary expanding rotate shift [PR113033]
  2023-12-18 17:45     ` Jakub Jelinek
@ 2023-12-18 20:01       ` Xi Ruoyao
  2023-12-18 20:15         ` Jakub Jelinek
  0 siblings, 1 reply; 7+ messages in thread
From: Xi Ruoyao @ 2023-12-18 20:01 UTC
  To: Jakub Jelinek; +Cc: Jeff Law, gcc-patches, chenglulu, i, xuchenghua, c

On Mon, 2023-12-18 at 18:45 +0100, Jakub Jelinek wrote:
> On Tue, Dec 19, 2023 at 12:48:46AM +0800, Xi Ruoyao wrote:
> > > > gcc/ChangeLog:
> > > > 
> > > > 	PR middle-end/113033
> > > > 	* expmed.cc (expand_shift_1): When expanding rotate shift, call
> > > > 	negate_rtx instead of simplify_gen_unary (NEG, ...).
> > 
> > > The key difference being that using negate_rtx will go through the
> > > expander which knows how to synthesize negation whereas 
> > > simplify_gen_unary will just generate a (neg ...) and assume it matches 
> > > something in the backend, right?
> > 
> > For PR113033 the key difference (to me) is negate_rtx emits an insn to
> > set a new pseudo reg to -x.  So the result will be
> > 
> > (set (reg:SI 81) (neg:SI (reg:SI 80)))
> > 
> > then
> > 
> > (and (reg:SI 81) (const_int 31))
> > 
> > instead of a consolidated
> > 
> > (and:SI (neg:SI (reg:SI IN)) (const_int 63))
> > 
> > AFAIK no backend has an instruction that negates an operand and then
> > ANDs it bitwise.
> 
> Can you explain why it doesn't work as is though?
> I mean, expand_shift_1 with that (and (neg (reg ...)) (const_int ...))
> should try to legitimize the operand (e.g. in maybe_legitimize_operand
> -> force_operand), and force_operand should be able to deal with that:
> AND is a binary op, so it recurses on the 2 operands, and NEG is UNARY_P,
> so the recursion should deal with that if it is not a general_operand.

It happens with vector left rotate:

V test (V a, int x)
{
  int _1; 
  V _4; 

  <bb 2> [local count: 1073741824]:
  _1 = x_2(D) & 31; 
  _4 = a_3(D) r<< _1; 
  return _4; 

}

Here V is in V4SImode.  With other_amount = (and (neg (reg 85))
(const_int 31)), we end up calling

expand_shift_1 (
  code = RSHIFT_EXPR,
  mode = V4SImode,
  shifted = (reg:V4SI 82),
  amount = (and:SI (neg:SI (reg:SI 85)) (const_int 31)),
  target = (reg:V4SI 84),
  unsignedp = true,
  may_fail = false)

It then calls

expand_binop (
  mode = V4SImode,
  lshr_optab,
  op0 = (reg:V4SI 82),
  op1 = (and:SI (neg:SI (reg:SI 85)) (const_int 31)),
  target = (reg:V4SI 84),
  unsignedp=1,
  methods=OPTAB_DIRECT)

In expand_binop:

rtx vop1 = expand_vector_broadcast (mode, op1);

The LoongArch backend doesn't have vec_duplicate (well, broadcasting is
implemented as a special case of vec_init, and maybe this is not so
good...), so finally we get:

  vec = rtvec_alloc (n); 
  for (int i = 0; i < n; ++i) 
    RTVEC_ELT (vec, i) = op;
  rtx ret = gen_reg_rtx (vmode);
  emit_insn (GEN_FCN (icode) (ret, gen_rtx_PARALLEL (vmode, vec)));

here "op" is (and:SI (neg:SI (reg:SI 85)) (const_int 31)), thus it
evaded expansion :(.

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University


* Re: [PATCH] middle-end: Call negate_rtx instead of simplify_gen_unary expanding rotate shift [PR113033]
  2023-12-18 20:01       ` Xi Ruoyao
@ 2023-12-18 20:15         ` Jakub Jelinek
  2023-12-18 20:18           ` Xi Ruoyao
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Jelinek @ 2023-12-18 20:15 UTC
  To: Xi Ruoyao; +Cc: Jeff Law, gcc-patches, chenglulu, i, xuchenghua, c

On Tue, Dec 19, 2023 at 04:01:52AM +0800, Xi Ruoyao wrote:
> On Mon, 2023-12-18 at 18:45 +0100, Jakub Jelinek wrote:
> > On Tue, Dec 19, 2023 at 12:48:46AM +0800, Xi Ruoyao wrote:
> > > > > gcc/ChangeLog:
> > > > > 
> > > > > 	PR middle-end/113033
> > > > > 	* expmed.cc (expand_shift_1): When expanding rotate shift, call
> > > > > 	negate_rtx instead of simplify_gen_unary (NEG, ...).
> > > 
> > > > The key difference being that using negate_rtx will go through the
> > > > expander which knows how to synthesize negation whereas 
> > > > simplify_gen_unary will just generate a (neg ...) and assume it matches 
> > > > something in the backend, right?
> > > 
> > > For PR113033 the key difference (to me) is negate_rtx emits an insn to
> > > set a new pseudo reg to -x.  So the result will be
> > > 
> > > (set (reg:SI 81) (neg:SI (reg:SI 80)))
> > > 
> > > then
> > > 
> > > (and (reg:SI 81) (const_int 31))
> > > 
> > > instead of a consolidated
> > > 
> > > (and:SI (neg:SI (reg:SI IN)) (const_int 63))
> > > 
> > > AFAIK no backend has an instruction that negates an operand and then
> > > ANDs it bitwise.
> > 
> > Can you explain why it doesn't work as is though?
> > I mean, expand_shift_1 with that (and (neg (reg ...)) (const_int ...))
> > should try to legitimize the operand (e.g. in maybe_legitimize_operand
> > -> force_operand), and force_operand should be able to deal with that:
> > AND is a binary op, so it recurses on the 2 operands, and NEG is UNARY_P,
> > so the recursion should deal with that if it is not a general_operand.
> 
> It happens with vector left rotate:
> 
> V test (V a, int x)
> {
>   int _1; 
>   V _4; 
> 
>   <bb 2> [local count: 1073741824]:
>   _1 = x_2(D) & 31; 
>   _4 = a_3(D) r<< _1; 
>   return _4; 
> 
> }
> 
> Here V is in V4SImode.  With other_amount = (and (neg (reg 85))
> (const_int 31)), we end up calling
> 
> expand_shift_1 (
>   code = RSHIFT_EXPR,
>   mode = V4SImode,
>   shifted = (reg:V4SI 82),
>   amount = (and:SI (neg:SI (reg:SI 85)) (const_int 31)),
>   target = (reg:V4SI 84),
>   unsignedp = true,
>   may_fail = false)
> 
> It then calls
> 
> expand_binop (
>   mode = V4SImode,
>   lshr_optab,
>   op0 = (reg:V4SI 82),
>   op1 = (and:SI (neg:SI (reg:SI 85)) (const_int 31)),
>   target = (reg:V4SI 84),
>   unsignedp=1,
>   methods=OPTAB_DIRECT)
> 
> In expand_binop:
> 
> rtx vop1 = expand_vector_broadcast (mode, op1);
> 
> The LoongArch backend doesn't have vec_duplicate (well, broadcasting is
> implemented as a special case of vec_init, and maybe this is not so
> good...), so finally we get:
> 
>   vec = rtvec_alloc (n); 
>   for (int i = 0; i < n; ++i) 
>     RTVEC_ELT (vec, i) = op;
>   rtx ret = gen_reg_rtx (vmode);
>   emit_insn (GEN_FCN (icode) (ret, gen_rtx_PARALLEL (vmode, vec)));
> 
> here "op" is (and:SI (neg:SI (reg:SI 85)) (const_int 31)), thus it
> evaded expansion :(.

Then that seems like a bug in the loongarch vec_init pattern(s).
Those patterns don't have a predicate on the input operand in any of the
backends, so they need to force_reg the operand if it is something they
can't handle.
I've looked e.g. at i386 vec_init and that is exactly what it does,
see the various tests + force_reg calls in ix86_expand_vector_init*.
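
As a rough illustration of that suggestion, here is a toy C++ model of a
broadcast that either copies the scalar into every lane as-is or forces
it into a register first.  Operand, Stream, force_reg and broadcast are
hypothetical stand-ins written for this sketch; they are not the GCC
functions of the same name, and the toy treats the expression as opaque
text rather than modelling how a nested expression is broken up.

```cpp
#include <string>
#include <vector>

// Toy operand: RTL-like text plus a flag saying whether it is already
// a plain register.
struct Operand {
    std::string text;
    bool is_reg;
};

struct Stream {
    int next_reg = 100;              // hypothetical first free pseudo
    std::vector<std::string> insns;  // emitted "insns", kept as text

    // Toy analogue of forcing a value into a register: a non-register
    // operand is copied into a fresh pseudo by a separate insn.
    Operand force_reg(const Operand &op)
    {
        if (op.is_reg)
            return op;
        Operand dest{"(reg " + std::to_string(next_reg++) + ")", true};
        insns.push_back("(set " + dest.text + " " + op.text + ")");
        return dest;
    }

    // Toy analogue of a vec_init expander that fills all n lanes with
    // one scalar.  With legitimize == false the operand lands in every
    // lane untouched, which is the situation described in the thread.
    void broadcast(Operand op, int n, bool legitimize)
    {
        if (legitimize)
            op = force_reg(op);
        std::string lanes;
        for (int i = 0; i < n; ++i)
            lanes += (i ? " " : "") + op.text;
        std::string dest = "(reg " + std::to_string(next_reg++) + ")";
        insns.push_back("(set " + dest + " (parallel [" + lanes + "]))");
    }
};
```

With legitimize set to false the single emitted insn still carries the
nested (and (neg ...)) in each lane; with it set to true the lanes hold
only the new pseudo.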

	Jakub



* Re: [PATCH] middle-end: Call negate_rtx instead of simplify_gen_unary expanding rotate shift [PR113033]
  2023-12-18 20:15         ` Jakub Jelinek
@ 2023-12-18 20:18           ` Xi Ruoyao
  0 siblings, 0 replies; 7+ messages in thread
From: Xi Ruoyao @ 2023-12-18 20:18 UTC
  To: Jakub Jelinek; +Cc: Jeff Law, gcc-patches, chenglulu, i, xuchenghua, c

On Mon, 2023-12-18 at 21:15 +0100, Jakub Jelinek wrote:
> 
> Then that seems like a bug in the loongarch vec_init pattern(s).
> Those patterns don't have a predicate on the input operand in any of the
> backends, so they need to force_reg the operand if it is something they
> can't handle.
> I've looked e.g. at i386 vec_init and that is exactly what it does,
> see the various tests + force_reg calls in ix86_expand_vector_init*.

Ok, I'm abandoning this patch and I'll rework it.

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

