public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern
@ 2023-08-09  3:54 Lehua Ding
  2023-08-10 12:29 ` Lehua Ding
  2023-08-11 15:57 ` Jeff Law
  0 siblings, 2 replies; 11+ messages in thread
From: Lehua Ding @ 2023-08-09  3:54 UTC (permalink / raw)
  To: Jeff Law, gcc-patches; +Cc: juzhe.zhong, rdapp.gcc, kito.cheng, palmer

[-- Attachment #1: Type: text/plain, Size: 13955 bytes --]

Hi Jeff,


> The pattern's operand 0 explicitly allows MEMs as do the constraints.
> So forcing the operand into a register just seems like it's papering
> over the real problem.


The added of force_reg code is address the problem preduced after address the error combine.
The more restrict condtion of the pattern forbidden mem->mem pattern which will
produced in -O0. I think the implementation forgot to do this force_reg operation before
when doing the intrinis expansion The reason this problem isn't exposed before is because
the reload pass will converts mem->mem to mem->reg; reg->mem based on the constraint.


> I wonder if we should just remove the memory destination from this
> pattern.  Ultimately isn't that case just trying to optimize a constant
> store into memory -- perhaps we just need a distinct pattern for that.
> We generally try to avoid that for movXX patterns, but this seems a bit
> different.


The pattern like scalar mov pattern, need to block mem->mem case.
I think mem->reg, reg->mem, reg->reg patterns are defined in the
same insn is more readable, I wonder how you feel about that?
And there's another `*mov<mode&gt;_whole` pattern that needs to be
restricted here as well, I'll try to send a separate patch to address that
like bellow.


(define_insn "*mov<mode&gt;_whole"
&nbsp; [(set (match_operand:V_WHOLE 0 "reg_or_mem_operand" "=vr, m,vr")
	(match_operand:V_WHOLE 1 "reg_or_mem_operand" "&nbsp; m,vr,vr"))]
&nbsp; "TARGET_VECTOR"
&nbsp; ...)


Change to:


(define_insn "*mov<mode&gt;_whole"
&nbsp; [(set (match_operand:V_WHOLE 0 "reg_or_mem_operand" "=vr, m,vr")
	(match_operand:V_WHOLE 1 "reg_or_mem_operand" "&nbsp; m,vr,vr"))]
&nbsp; "TARGET_VECTOR &amp;&amp; (register_operand (operands[0], <MODE&gt;mode)
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;|| register_operand(operands[1], <MODE&gt;mode))"
&nbsp; ...)



&gt; This comment doesn't make sense in conjuction with your earlier details.
&gt; In particular combine doesn't run at -O0, so your earlier comment that
&gt; combine creates the problem seems inconsistent with the comment above.

As the above says, the code addresses the problem which produced
after addressing the combine problem.


&gt; Umm, wow.&nbsp; I haven't thought deeply about this, but the complexity of
&gt; that insn condition is a huge red flag that our operand predicates
&gt; aren't correct for this pattern.


This condition is large because the vsetvl info need (compare to scalar mov or *mov<mode&gt;_whole pattern),
but I think this condition is enough clear to understand. Let me explain briefly.


&nbsp; &nbsp; (register_operand (operands[0], <MODE&gt;mode) &amp;&amp; MEM_P (operands[3]))
&nbsp; &nbsp; || (MEM_P (operands[0]) &amp;&amp; register_operand(operands[3], <MODE&gt;mode))


This two conditons mean allow mem-&gt;reg and reg-&gt;mem pattern.


&nbsp; &nbsp; (register_operand (operands[0], <MODE&gt;mode) &amp;&amp; satisfies_constraint_Wc1 (operands[1]))


This condition mean the mask must be all trues for reg-&gt;reg_or_imm pattern since
reg-&gt;reg insn doen't support mask operand.


Best,
Lehua


------------------&nbsp;Original&nbsp;------------------
From:                                                                                                                        "Jeff Law"                                                                                    <jeffreyalaw@gmail.com&gt;;
Date:&nbsp;Wed, Aug 9, 2023 00:10 AM
To:&nbsp;"Lehua Ding"<lehua.ding@rivai.ai&gt;;"gcc-patches"<gcc-patches@gcc.gnu.org&gt;;
Cc:&nbsp;"juzhe.zhong"<juzhe.zhong@rivai.ai&gt;;"rdapp.gcc"<rdapp.gcc@gmail.com&gt;;"kito.cheng"<kito.cheng@gmail.com&gt;;"palmer"<palmer@rivosinc.com&gt;;
Subject:&nbsp;Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern



On 8/8/23 05:57, Lehua Ding wrote:
&gt; Hi,
&gt; 
&gt; This patch fix PR110943 which will produce some error code. This is because
&gt; the error combine of some pred_mov pattern. Consider this code:
&gt; 
&gt; ```
&gt; #include <riscv_vector.h&gt;
&gt; 
&gt; void foo9 (void *base, void *out, size_t vl)
&gt; {
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; int64_t scalar = *(int64_t*)(base + 100);
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1);
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *(vint64m2_t*)out = v;
&gt; }
&gt; ```
&gt; 
&gt; RTL before combine pass:
&gt; 
&gt; ```
&gt; (insn 11 10 12 2 (set (reg/v:RVVM2DI 134 [ v ])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (if_then_else:RVVM2DI (unspec:RVVMF32BI [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (const_vector:RVVMF32BI repeat [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (const_int 1 [0x1])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (const_int 1 [0x1])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (const_int 2 [0x2]) repeated x2
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (const_int 0 [0])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (reg:SI 66 vl)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (reg:SI 67 vtype)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ] UNSPEC_VPREDICATE)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (const_vector:RVVM2DI repeat [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (const_int 0 [0])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (unspec:RVVM2DI [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (reg:SI 0 zero)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ] UNSPEC_VUNDEF))) "/app/example.c":6:20 1089 {pred_movrvvm2di})
&gt; (insn 14 13 0 2 (set (mem:RVVM2DI (reg/v/f:DI 136 [ out ]) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (reg/v:RVVM2DI 134 [ v ])) "/app/example.c":7:23 717 {*movrvvm2di_whole})
&gt; ```
&gt; 
&gt; RTL after combine pass:
&gt; ```
&gt; (insn 14 13 0 2 (set (mem:RVVM2DI (reg:DI 138) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (if_then_else:RVVM2DI (unspec:RVVMF32BI [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (const_vector:RVVMF32BI repeat [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (const_int 1 [0x1])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (const_int 1 [0x1])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (const_int 2 [0x2]) repeated x2
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (const_int 0 [0])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (reg:SI 66 vl)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (reg:SI 67 vtype)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ] UNSPEC_VPREDICATE)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (const_vector:RVVM2DI repeat [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (const_int 0 [0])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (unspec:RVVM2DI [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (reg:SI 0 zero)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ] UNSPEC_VUNDEF))) "/app/example.c":7:23 1089 {pred_movrvvm2di})
&gt; ```
&gt; 
&gt; This combine change the semantics of insn 14. I refine the conditon of @pred_mov
&gt; pattern to a more restrict. It's Ok for trunk?
&gt; 
&gt; Best,
&gt; Lehua
&gt; 
&gt; 
&gt;PR target/110943
&gt; 
&gt; gcc/ChangeLog:
&gt; 
&gt;* config/riscv/riscv-vector-builtins.cc (function_expander::function_expander):
&gt;&nbsp; force_reg mem operand.
&gt;* config/riscv/vector.md: Refine condition.
&gt; 
&gt; gcc/testsuite/ChangeLog:
&gt; 
&gt;* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Update.
&gt;* gcc.target/riscv/rvv/base/pr110943.c: New test.
So at a high level this doesn't look correct to me.

The pattern's operand 0 explicitly allows MEMs as do the constraints. 
So forcing the operand into a register just seems like it's papering 
over the real problem.

I wonder if we should just remove the memory destination from this 
pattern.&nbsp; Ultimately isn't that case just trying to optimize a constant 
store into memory -- perhaps we just need a distinct pattern for that. 
We generally try to avoid that for movXX patterns, but this seems a bit 
different.


&gt;&nbsp;&nbsp; create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c
&gt; 
&gt; diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
&gt; index 528dca7ae85..cd40fb2060f 100644
&gt; --- a/gcc/config/riscv/riscv-vector-builtins.cc
&gt; +++ b/gcc/config/riscv/riscv-vector-builtins.cc
&gt; @@ -3471,7 +3471,13 @@ function_expander::function_expander (const function_instance &amp;instance,
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; exp (exp_in), target (target_in), opno (0)
&gt;&nbsp;&nbsp; {
&gt;&nbsp;&nbsp;&nbsp;&nbsp; if (!function_returns_void_p ())
&gt; -&nbsp;&nbsp;&nbsp; create_output_operand (&amp;m_ops[opno++], target, TYPE_MODE (TREE_TYPE (exp)));
&gt; +&nbsp;&nbsp;&nbsp; {
&gt; +&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (target != NULL_RTX &amp;&amp; MEM_P (target))
&gt; + /* Use force_reg to prevent illegal mem-to-mem pattern on -O0.&nbsp; */
This comment doesn't make sense in conjuction with your earlier details. 
 In particular combine doesn't run at -O0, so your earlier comment that 
combine creates the problem seems inconsistent with the comment above.


&gt; diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
&gt; index e56a2bf4bed..f0484b1162c 100644
&gt; --- a/gcc/config/riscv/vector.md
&gt; +++ b/gcc/config/riscv/vector.md
&gt; @@ -1509,8 +1509,9 @@
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (match_operand:V_VLS 3 "vector_move_operand"&nbsp;&nbsp; "&nbsp;&nbsp;&nbsp; m,&nbsp;&nbsp;&nbsp;&nbsp; m,&nbsp;&nbsp;&nbsp;&nbsp; m,&nbsp;&nbsp;&nbsp; vr,&nbsp;&nbsp;&nbsp; vr,&nbsp;&nbsp;&nbsp; vr, viWc0, viWc0")
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (match_operand:V_VLS 2 "vector_merge_operand"&nbsp; "&nbsp;&nbsp;&nbsp; 0,&nbsp;&nbsp;&nbsp; vu,&nbsp;&nbsp;&nbsp; vu,&nbsp;&nbsp;&nbsp; vu,&nbsp;&nbsp;&nbsp; vu,&nbsp;&nbsp;&nbsp;&nbsp; 0,&nbsp;&nbsp;&nbsp; vu,&nbsp;&nbsp;&nbsp;&nbsp; 0")))]
&gt; -&nbsp; "TARGET_VECTOR &amp;&amp; (MEM_P (operands[0]) || MEM_P (operands[3])
&gt; -&nbsp;&nbsp; || CONST_VECTOR_P (operands[1]))"
&gt; +&nbsp; "TARGET_VECTOR &amp;&amp; ((register_operand (operands[0], <MODE&gt;mode) &amp;&amp; MEM_P (operands[3])) ||
&gt; +&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (MEM_P (operands[0]) &amp;&amp; register_operand (operands[3], <MODE&gt;mode)) ||
&gt; +&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (register_operand (operands[0], <MODE&gt;mode) &amp;&amp; satisfies_constraint_Wc1 (operands[1])))"
Umm, wow.&nbsp; I haven't thought deeply about this, but the complexity of 
that insn condition is a huge red flag that our operand predicates 
aren't correct for this pattern.

From a formatting standpoint bring the wrapped operator down and 
indent.&nbsp; ie

 (condition 1
 || condition 2
 || (condition 3
 &amp;&amp; other test 4))


Jeff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern
  2023-08-09  3:54 [PATCH] RISC-V: Fix error combine of pred_mov pattern Lehua Ding
@ 2023-08-10 12:29 ` Lehua Ding
  2023-08-11 15:57 ` Jeff Law
  1 sibling, 0 replies; 11+ messages in thread
From: Lehua Ding @ 2023-08-10 12:29 UTC (permalink / raw)
  To: Lehua Ding, Jeff Law, gcc-patches
  Cc: juzhe.zhong, rdapp.gcc, kito.cheng, palmer

[-- Attachment #1: Type: text/plain, Size: 456 bytes --]

Hi Jeff,

After reconsidering I think the split of pattern you mention
makes sense to me. I have split the `@pred_mov<mode&gt;`
into two pattern. One for pure move like mem-&gt;reg, reg-&gt;mem, reg-&gt;reg
One for imm-&gt;reg and be move to&nbsp;pred_broadcast area
since pred_broadcast mean duplicate something to vector register.


The V2 patch as bellow:


https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626981.html


Best,
Lehua

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern
  2023-08-09  3:54 [PATCH] RISC-V: Fix error combine of pred_mov pattern Lehua Ding
  2023-08-10 12:29 ` Lehua Ding
@ 2023-08-11 15:57 ` Jeff Law
  2023-08-11 16:30   ` Lehua Ding
  2023-08-18 10:30   ` Lehua Ding
  1 sibling, 2 replies; 11+ messages in thread
From: Jeff Law @ 2023-08-11 15:57 UTC (permalink / raw)
  To: Lehua Ding, gcc-patches; +Cc: juzhe.zhong, rdapp.gcc, kito.cheng, palmer



On 8/8/23 21:54, Lehua Ding wrote:
> Hi Jeff,
> 
>  > The pattern's operand 0 explicitly allows MEMs as do the constraints.
>  > So forcing the operand into a register just seems like it's papering
>  > over the real problem.
> 
> The added of force_reg code is address the problem preduced after 
> address the error combine.
> The more restrict condtion of the pattern forbidden mem->mem pattern 
> which will
> produced in -O0. I think the implementation forgot to do this force_reg 
> operation before
> when doing the intrinis expansion The reason this problem isn't exposed 
> before is because
> the reload pass will converts mem->mem to mem->reg; reg->mem based on 
> the constraint.
So if the core issue if mem->mem, that is a common thing to avoid.

Basically in the expander you use a force_reg and then have a test like
!(MEM_P (op0) && MEM_P (op1)) in the define_insn's condition.

But the v1 had a much more complex condition.  It looks like that got 
cleaned up in the v2.  So I'll need to look at that one more closely.


> 
>  > This comment doesn't make sense in conjuction with your earlier details.
>  > In particular combine doesn't run at -O0, so your earlier comment that
>  > combine creates the problem seems inconsistent with the comment above.
> 
> As the above says, the code addresses the problem which produced
> after addressing the combine problem.
But combine doesn't run at -O0.  So something is inconsistent.  I 
certainly believe we need to avoid the mem->mem case, but that's 
independent of combine and affects all optimization levels.


> 
>  > Umm, wow.  I haven't thought deeply about this, but the complexity of
>  > that insn condition is a huge red flag that our operand predicates
>  > aren't correct for this pattern.
> 
> This condition is large because the vsetvl info need (compare to scalar 
> mov or *mov<mode>_whole pattern),
> but I think this condition is enough clear to understand. Let me explain 
> briefly.
> 
>      (register_operand (operands[0], <MODE>mode) && MEM_P (operands[3]))
>      || (MEM_P (operands[0]) && register_operand(operands[3], <MODE>mode))
> 
> This two conditons mean allow mem->reg and reg->mem pattern.
I think we can simplify to just

  !(MEM_P (operands[0]) && MEM_P (operands[1])

> 
>      (register_operand (operands[0], <MODE>mode) && 
> satisfies_constraint_Wc1 (operands[1]))
> 
> This condition mean the mask must be all trues for reg->reg_or_imm 
> pattern since> reg->reg insn doen't support mask operand.
I would have expected those to be handled by the constraints rather than 
the pattern's condition.

Jeff


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern
  2023-08-11 15:57 ` Jeff Law
@ 2023-08-11 16:30   ` Lehua Ding
  2023-08-11 16:40     ` Lehua Ding
  2023-08-28 21:34     ` Jeff Law
  2023-08-18 10:30   ` Lehua Ding
  1 sibling, 2 replies; 11+ messages in thread
From: Lehua Ding @ 2023-08-11 16:30 UTC (permalink / raw)
  To: Jeff Law, gcc-patches; +Cc: juzhe.zhong, rdapp.gcc, kito.cheng, palmer

[-- Attachment #1: Type: text/plain, Size: 4465 bytes --]

&gt; But combine doesn't run at -O0.&nbsp; So something is inconsistent.&nbsp; I
&gt; certainly believe we need to avoid the mem-&gt;mem case, but that's
&gt; independent of combine and affects all optimization levels.


This is an new bug when running all tests after fixing the combine bug.
I understand that maybe I should send a separate patch to fix the problem.
Maybe this problem was exposed after I changed the pattern. I will continue to track it.


&gt; I think we can simplify to just
&gt; !(MEM_P (operands[0]) &amp;&amp; MEM_P (operands[1])


&gt; I would have expected those to be handled by the constraints rather than
&gt; the pattern's condition.
Yeh, the condition of the V2 becomes much simpler after split.







------------------&nbsp;Original&nbsp;------------------
From:                                                                                                                        "Jeff Law"                                                                                    <gcc-patches@gcc.gnu.org&gt;;
Date:&nbsp;Fri, Aug 11, 2023 11:57 PM
To:&nbsp;"Lehua Ding"<lehua.ding@rivai.ai&gt;;"gcc-patches"<gcc-patches@gcc.gnu.org&gt;;
Cc:&nbsp;"juzhe.zhong"<juzhe.zhong@rivai.ai&gt;;"rdapp.gcc"<rdapp.gcc@gmail.com&gt;;"kito.cheng"<kito.cheng@gmail.com&gt;;"palmer"<palmer@rivosinc.com&gt;;
Subject:&nbsp;Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern



On 8/8/23 21:54, Lehua Ding wrote:
&gt; Hi Jeff,
&gt; 
&gt;&nbsp; &gt; The pattern's operand 0 explicitly allows MEMs as do the constraints.
&gt;&nbsp; &gt;&nbsp;So forcing the operand into a register just seems like it's papering
&gt;&nbsp; &gt;&nbsp;over the real problem.
&gt; 
&gt; The added of force_reg code is address the problem preduced after 
&gt; address the error combine.
&gt; The more restrict condtion of the pattern forbidden mem-&gt;mem pattern 
&gt; which will
&gt; produced in -O0. I think the implementation forgot to do this force_reg 
&gt; operation before
&gt; when doing the intrinis expansion The reason this problem isn't exposed 
&gt; before is because
&gt; the reload pass will converts mem-&gt;mem to mem-&gt;reg; reg-&gt;mem based on 
&gt; the constraint.
So if the core issue if mem-&gt;mem, that is a common thing to avoid.

Basically in the expander you use a force_reg and then have a test like
!(MEM_P (op0) &amp;&amp; MEM_P (op1)) in the define_insn's condition.

But the v1 had a much more complex condition.&nbsp; It looks like that got 
cleaned up in the v2.&nbsp; So I'll need to look at that one more closely.


&gt; 
&gt;&nbsp; &gt; This comment doesn't make sense in conjuction with your earlier details.
&gt;&nbsp; &gt; In particular combine doesn't run at -O0, so your earlier comment that
&gt;&nbsp; &gt; combine creates the problem seems inconsistent with the comment above.
&gt; 
&gt; As the above says, the code addresses the problem which produced
&gt; after addressing the combine problem.
But combine doesn't run at -O0.&nbsp; So something is inconsistent.&nbsp; I 
certainly believe we need to avoid the mem-&gt;mem case, but that's 
independent of combine and affects all optimization levels.


&gt; 
&gt;&nbsp; &gt; Umm, wow.&nbsp; I haven't thought deeply about this, but the complexity of
&gt;&nbsp; &gt; that insn condition is a huge red flag that our operand predicates
&gt;&nbsp; &gt; aren't correct for this pattern.
&gt; 
&gt; This condition is large because the vsetvl info need (compare to scalar 
&gt; mov or *mov<mode&gt;_whole pattern),
&gt; but I think this condition is enough clear to understand. Let me explain 
&gt; briefly.
&gt; 
&gt;&nbsp; &nbsp; &nbsp; (register_operand (operands[0], <MODE&gt;mode) &amp;&amp; MEM_P (operands[3]))
&gt;&nbsp; &nbsp; &nbsp; || (MEM_P (operands[0]) &amp;&amp; register_operand(operands[3], <MODE&gt;mode))
&gt; 
&gt; This two conditons mean allow mem-&gt;reg and reg-&gt;mem pattern.
I think we can simplify to just

 !(MEM_P (operands[0]) &amp;&amp; MEM_P (operands[1])

&gt; 
&gt;&nbsp; &nbsp; &nbsp; (register_operand (operands[0], <MODE&gt;mode) &amp;&amp; 
&gt; satisfies_constraint_Wc1 (operands[1]))
&gt; 
&gt; This condition mean the mask must be all trues for reg-&gt;reg_or_imm 
&gt; pattern since&gt; reg-&gt;reg insn doen't support mask operand.
I would have expected those to be handled by the constraints rather than 
the pattern's condition.

Jeff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern
  2023-08-11 16:30   ` Lehua Ding
@ 2023-08-11 16:40     ` Lehua Ding
  2023-08-28 21:34     ` Jeff Law
  1 sibling, 0 replies; 11+ messages in thread
From: Lehua Ding @ 2023-08-11 16:40 UTC (permalink / raw)
  To: Jeff Law, gcc-patches; +Cc: juzhe.zhong, rdapp.gcc, kito.cheng, palmer

[-- Attachment #1: Type: text/plain, Size: 7738 bytes --]

&gt;&gt; But combine doesn't run at -O0.&nbsp; So something is inconsistent.&nbsp; I &gt;&gt; certainly believe we need to avoid the mem-&gt;mem case, but that's &gt;&gt; independent of combine and affects all optimization levels.  &gt; This is an new bug when running all tests after fixing the combine bug. &gt; I understand that maybe I should send a separate patch to fix the problem. &gt; Maybe this problem was exposed after I changed the pattern. I will continue to track it.
Just now, I debug and found that the -O0 problem
after repairing error combine was caused by the condition
of pred_mov becoming more strict. Before was
(MEM_P (operands[0]) || MEM_P (operands[3])&nbsp; || CONST_VECTOR_P (operands[1])
That is, mem-&gt;mem is allowed. This faulty condition causes
two problems at once. One is error combine, the other is to hide
the error pattern with -O0.&nbsp;After correcting the condition with this patch,
I fixed the error combine problem, and also exposed the problem under -O0.
So I think force_reg still needs to be put together with this patch.



------------------&nbsp;Original&nbsp;------------------
From:                                                                                                                        "Lehua Ding"                                                                                    <lehua.ding@rivai.ai&gt;;
Date:&nbsp;Sat, Aug 12, 2023 00:30 AM
To:&nbsp;"Jeff Law"<jeffreyalaw@gmail.com&gt;;"gcc-patches"<gcc-patches@gcc.gnu.org&gt;;
Cc:&nbsp;"juzhe.zhong"<juzhe.zhong@rivai.ai&gt;;"rdapp.gcc"<rdapp.gcc@gmail.com&gt;;"kito.cheng"<kito.cheng@gmail.com&gt;;"palmer"<palmer@rivosinc.com&gt;;
Subject:&nbsp;Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern



&amp;gt; But combine doesn't run at -O0.&amp;nbsp; So something is inconsistent.&amp;nbsp; I
&amp;gt; certainly believe we need to avoid the mem-&amp;gt;mem case, but that's
&amp;gt; independent of combine and affects all optimization levels.


This is an new bug when running all tests after fixing the combine bug.
I understand that maybe I should send a separate patch to fix the problem.
Maybe this problem was exposed after I changed the pattern. I will continue to track it.


&amp;gt; I think we can simplify to just
&amp;gt; !(MEM_P (operands[0]) &amp;amp;&amp;amp; MEM_P (operands[1])


&amp;gt; I would have expected those to be handled by the constraints rather than
&amp;gt; the pattern's condition.
Yeh, the condition of the V2 becomes much simpler after split.







------------------&amp;nbsp;Original&amp;nbsp;------------------
From:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; "Jeff Law"&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <gcc-patches@gcc.gnu.org&amp;gt;;
Date:&amp;nbsp;Fri, Aug 11, 2023 11:57 PM
To:&amp;nbsp;"Lehua Ding"<lehua.ding@rivai.ai&amp;gt;;"gcc-patches"<gcc-patches@gcc.gnu.org&amp;gt;;
Cc:&amp;nbsp;"juzhe.zhong"<juzhe.zhong@rivai.ai&amp;gt;;"rdapp.gcc"<rdapp.gcc@gmail.com&amp;gt;;"kito.cheng"<kito.cheng@gmail.com&amp;gt;;"palmer"<palmer@rivosinc.com&amp;gt;;
Subject:&amp;nbsp;Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern



On 8/8/23 21:54, Lehua Ding wrote:
&amp;gt; Hi Jeff,
&amp;gt; 
&amp;gt;&amp;nbsp; &amp;gt; The pattern's operand 0 explicitly allows MEMs as do the constraints.
&amp;gt;&amp;nbsp; &amp;gt;&amp;nbsp;So forcing the operand into a register just seems like it's papering
&amp;gt;&amp;nbsp; &amp;gt;&amp;nbsp;over the real problem.
&amp;gt; 
&amp;gt; The added of force_reg code is address the problem preduced after 
&amp;gt; address the error combine.
&amp;gt; The more restrict condtion of the pattern forbidden mem-&amp;gt;mem pattern 
&amp;gt; which will
&amp;gt; produced in -O0. I think the implementation forgot to do this force_reg 
&amp;gt; operation before
&amp;gt; when doing the intrinis expansion The reason this problem isn't exposed 
&amp;gt; before is because
&amp;gt; the reload pass will converts mem-&amp;gt;mem to mem-&amp;gt;reg; reg-&amp;gt;mem based on 
&amp;gt; the constraint.
So if the core issue if mem-&amp;gt;mem, that is a common thing to avoid.

Basically in the expander you use a force_reg and then have a test like
!(MEM_P (op0) &amp;amp;&amp;amp; MEM_P (op1)) in the define_insn's condition.

But the v1 had a much more complex condition.&amp;nbsp; It looks like that got 
cleaned up in the v2.&amp;nbsp; So I'll need to look at that one more closely.


&amp;gt; 
&amp;gt;&amp;nbsp; &amp;gt; This comment doesn't make sense in conjuction with your earlier details.
&amp;gt;&amp;nbsp; &amp;gt; In particular combine doesn't run at -O0, so your earlier comment that
&amp;gt;&amp;nbsp; &amp;gt; combine creates the problem seems inconsistent with the comment above.
&amp;gt; 
&amp;gt; As the above says, the code addresses the problem which produced
&amp;gt; after addressing the combine problem.
But combine doesn't run at -O0.&amp;nbsp; So something is inconsistent.&amp;nbsp; I 
certainly believe we need to avoid the mem-&amp;gt;mem case, but that's 
independent of combine and affects all optimization levels.


&amp;gt; 
&amp;gt;&amp;nbsp; &amp;gt; Umm, wow.&amp;nbsp; I haven't thought deeply about this, but the complexity of
&amp;gt;&amp;nbsp; &amp;gt; that insn condition is a huge red flag that our operand predicates
&amp;gt;&amp;nbsp; &amp;gt; aren't correct for this pattern.
&amp;gt; 
&amp;gt; This condition is large because the vsetvl info need (compare to scalar 
&amp;gt; mov or *mov<mode&amp;gt;_whole pattern),
&amp;gt; but I think this condition is enough clear to understand. Let me explain 
&amp;gt; briefly.
&amp;gt; 
&amp;gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; (register_operand (operands[0], <MODE&amp;gt;mode) &amp;amp;&amp;amp; MEM_P (operands[3]))
&amp;gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; || (MEM_P (operands[0]) &amp;amp;&amp;amp; register_operand(operands[3], <MODE&amp;gt;mode))
&amp;gt; 
&amp;gt; This two conditons mean allow mem-&amp;gt;reg and reg-&amp;gt;mem pattern.
I think we can simplify to just

!(MEM_P (operands[0]) &amp;amp;&amp;amp; MEM_P (operands[1])

&amp;gt; 
&amp;gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; (register_operand (operands[0], <MODE&amp;gt;mode) &amp;amp;&amp;amp; 
&amp;gt; satisfies_constraint_Wc1 (operands[1]))
&amp;gt; 
&amp;gt; This condition mean the mask must be all trues for reg-&amp;gt;reg_or_imm 
&amp;gt; pattern since&amp;gt; reg-&amp;gt;reg insn doen't support mask operand.
I would have expected those to be handled by the constraints rather than 
the pattern's condition.

Jeff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern
  2023-08-11 15:57 ` Jeff Law
  2023-08-11 16:30   ` Lehua Ding
@ 2023-08-18 10:30   ` Lehua Ding
  1 sibling, 0 replies; 11+ messages in thread
From: Lehua Ding @ 2023-08-18 10:30 UTC (permalink / raw)
  To: Jeff Law, gcc-patches; +Cc: juzhe.zhong, rdapp.gcc, kito.cheng, palmer

On 2023/8/11 23:57, Jeff Law wrote:
> 
> 
> On 8/8/23 21:54, Lehua Ding wrote:
>> Hi Jeff,
>>
>>  > The pattern's operand 0 explicitly allows MEMs as do the constraints.
>>  > So forcing the operand into a register just seems like it's papering
>>  > over the real problem.
>>
>> The added of force_reg code is address the problem preduced after 
>> address the error combine.
>> The more restrict condtion of the pattern forbidden mem->mem pattern 
>> which will
>> produced in -O0. I think the implementation forgot to do this 
>> force_reg operation before
>> when doing the intrinis expansion The reason this problem isn't 
>> exposed before is because
>> the reload pass will converts mem->mem to mem->reg; reg->mem based on 
>> the constraint.
> So if the core issue if mem->mem, that is a common thing to avoid.
> 
> Basically in the expander you use a force_reg and then have a test like
> !(MEM_P (op0) && MEM_P (op1)) in the define_insn's condition.
> 
> But the v1 had a much more complex condition.  It looks like that got 
> cleaned up in the v2.  So I'll need to look at that one more closely.
> 

Gentle ping V2, thanks.

> 
>>
>>  > This comment doesn't make sense in conjuction with your earlier 
>> details.
>>  > In particular combine doesn't run at -O0, so your earlier comment that
>>  > combine creates the problem seems inconsistent with the comment above.
>>
>> As the above says, the code addresses the problem which produced
>> after addressing the combine problem.
> But combine doesn't run at -O0.  So something is inconsistent.  I 
> certainly believe we need to avoid the mem->mem case, but that's 
> independent of combine and affects all optimization levels.
> 

I think it's the comment written here that is the problem. I plan to 
change it to this:
   /* Since there is no intrinsic where target is a mem operand, it must
      be converted to reg if it is a mem operand.  */

-- 
Best,
Lehua



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern
  2023-08-11 16:30   ` Lehua Ding
  2023-08-11 16:40     ` Lehua Ding
@ 2023-08-28 21:34     ` Jeff Law
  1 sibling, 0 replies; 11+ messages in thread
From: Jeff Law @ 2023-08-28 21:34 UTC (permalink / raw)
  To: Lehua Ding, gcc-patches; +Cc: juzhe.zhong, rdapp.gcc, kito.cheng, palmer



On 8/11/23 10:30, Lehua Ding wrote:
>  > But combine doesn't run at -O0.  So something is inconsistent.  I
>  > certainly believe we need to avoid the mem->mem case, but that's
>  > independent of combine and affects all optimization levels.
> 
> This is an new bug when running all tests after fixing the combine bug.
OK.  I must have misunderstood.   Thanks for clarifying.

> 
>  > I think we can simplify to just
>  > !(MEM_P (operands[0]) && MEM_P (operands[1])
> 
>  > I would have expected those to be handled by the constraints rather than
>  > the pattern's condition.
> 
> Yeh, the condition of the V2 becomes much simpler after split.
That was the hope.  It is worth noting that for simple moves eg movsi, 
movdi, movsf, movdf, etc there is a requirement that a single insn 
support all the valid combinations.  But I don't think we've ever had 
that requirement for vector modes and the situations where it's 
important are much less likely to trigger for vector moves.  Even more 
so given how the cond_mov patterns are implemented for RISC-V.

Jeff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern
  2024-02-20  4:21 Alexandre Oliva
@ 2024-02-23  7:39 ` Jeff Law
  0 siblings, 0 replies; 11+ messages in thread
From: Jeff Law @ 2024-02-23  7:39 UTC (permalink / raw)
  To: Alexandre Oliva, gcc-patches
  Cc: Kito Cheng, Palmer Dabbelt, Andrew Waterman, Jim Wilson,
	Lehua Ding, Ju-Zhe Zhong



On 2/19/24 21:21, Alexandre Oliva wrote:
> This backport is the second of two required for the pr111935 testcase,
> already backported to gcc-13, to pass on riscv64-elf and riscv32-elf.
> The V_VLS mode iterator, used in the original patch, is not available in
> gcc-13, and I thought that would be too much to backport (and maybe so
> are these two patches, WDYT?), so I changed it to V, to match the
> preexisting gcc-13 pattern.  Comments also needed manual adjustment.
> Regstrapped on x86_64-linux-gnu, along with other backports, and tested
> manually on riscv64-elf.  Ok to install?
> 
> From: Lehua Ding <lehua.ding@rivai.ai>
> 
> This patch fix PR110943 which will produce some error code. This is because
> the error combine of some pred_mov pattern. Consider this code:
> 
> ```
> 
> void foo9 (void *base, void *out, size_t vl)
> {
>      int64_t scalar = *(int64_t*)(base + 100);
>      vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1);
>      *(vint64m2_t*)out = v;
> }
> ```
> 
> RTL before combine pass:
> 
> ```
> (insn 11 10 12 2 (set (reg/v:RVVM2DI 134 [ v ])
>          (if_then_else:RVVM2DI (unspec:RVVMF32BI [
>                      (const_vector:RVVMF32BI repeat [
>                              (const_int 1 [0x1])
>                          ])
>                      (const_int 1 [0x1])
>                      (const_int 2 [0x2]) repeated x2
>                      (const_int 0 [0])
>                      (reg:SI 66 vl)
>                      (reg:SI 67 vtype)
>                  ] UNSPEC_VPREDICATE)
>              (const_vector:RVVM2DI repeat [
>                      (const_int 0 [0])
>                  ])
>              (unspec:RVVM2DI [
>                      (reg:SI 0 zero)
>                  ] UNSPEC_VUNDEF))) "/app/example.c":6:20 1089 {pred_movrvvm2di})
> (insn 14 13 0 2 (set (mem:RVVM2DI (reg/v/f:DI 136 [ out ]) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
>          (reg/v:RVVM2DI 134 [ v ])) "/app/example.c":7:23 717 {*movrvvm2di_whole})
> ```
> 
> RTL after combine pass:
> ```
> (insn 14 13 0 2 (set (mem:RVVM2DI (reg:DI 138) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
>          (if_then_else:RVVM2DI (unspec:RVVMF32BI [
>                      (const_vector:RVVMF32BI repeat [
>                              (const_int 1 [0x1])
>                          ])
>                      (const_int 1 [0x1])
>                      (const_int 2 [0x2]) repeated x2
>                      (const_int 0 [0])
>                      (reg:SI 66 vl)
>                      (reg:SI 67 vtype)
>                  ] UNSPEC_VPREDICATE)
>              (const_vector:RVVM2DI repeat [
>                      (const_int 0 [0])
>                  ])
>              (unspec:RVVM2DI [
>                      (reg:SI 0 zero)
>                  ] UNSPEC_VUNDEF))) "/app/example.c":7:23 1089 {pred_movrvvm2di})
> ```
> 
> This combine change the semantics of insn 14. I split @pred_mov pattern and
> restrict the conditon of @pred_mov.
> 
> 	PR target/110943
> 
> gcc/ChangeLog:
> 
> 	* config/riscv/predicates.md (vector_const_int_or_double_0_operand):
> 	New predicate.
> 	* config/riscv/riscv-vector-builtins.cc (function_expander::function_expander):
> 	force_reg mem target operand.
> 	* config/riscv/vector.md (@pred_mov<mode>): Wrapper.
> 	(*pred_mov<mode>): Remove imm -> reg pattern.
> 	(*pred_broadcast<mode>_imm): Add imm -> reg pattern.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/rvv/base/pr110943.c: New test.
I'd leave this alone as well.  I just don't see much value in the backports.

jeff


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] RISC-V: Fix error combine of pred_mov pattern
@ 2024-02-20  4:21 Alexandre Oliva
  2024-02-23  7:39 ` Jeff Law
  0 siblings, 1 reply; 11+ messages in thread
From: Alexandre Oliva @ 2024-02-20  4:21 UTC (permalink / raw)
  To: gcc-patches
  Cc: Kito Cheng, Palmer Dabbelt, Andrew Waterman, Jim Wilson,
	Lehua Ding, Ju-Zhe Zhong

This backport is the second of two required for the pr111935 testcase,
already backported to gcc-13, to pass on riscv64-elf and riscv32-elf.
The V_VLS mode iterator, used in the original patch, is not available in
gcc-13, and I thought that would be too much to backport (and maybe so
are these two patches, WDYT?), so I changed it to V, to match the
preexisting gcc-13 pattern.  Comments also needed manual adjustment.
Regstrapped on x86_64-linux-gnu, along with other backports, and tested
manually on riscv64-elf.  Ok to install?

From: Lehua Ding <lehua.ding@rivai.ai>

This patch fix PR110943 which will produce some error code. This is because
the error combine of some pred_mov pattern. Consider this code:

```

void foo9 (void *base, void *out, size_t vl)
{
    int64_t scalar = *(int64_t*)(base + 100);
    vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1);
    *(vint64m2_t*)out = v;
}
```

RTL before combine pass:

```
(insn 11 10 12 2 (set (reg/v:RVVM2DI 134 [ v ])
        (if_then_else:RVVM2DI (unspec:RVVMF32BI [
                    (const_vector:RVVMF32BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (const_int 1 [0x1])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (const_vector:RVVM2DI repeat [
                    (const_int 0 [0])
                ])
            (unspec:RVVM2DI [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.c":6:20 1089 {pred_movrvvm2di})
(insn 14 13 0 2 (set (mem:RVVM2DI (reg/v/f:DI 136 [ out ]) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
        (reg/v:RVVM2DI 134 [ v ])) "/app/example.c":7:23 717 {*movrvvm2di_whole})
```

RTL after combine pass:
```
(insn 14 13 0 2 (set (mem:RVVM2DI (reg:DI 138) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
        (if_then_else:RVVM2DI (unspec:RVVMF32BI [
                    (const_vector:RVVMF32BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (const_int 1 [0x1])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (const_vector:RVVM2DI repeat [
                    (const_int 0 [0])
                ])
            (unspec:RVVM2DI [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.c":7:23 1089 {pred_movrvvm2di})
```

This combine change the semantics of insn 14. I split @pred_mov pattern and
restrict the conditon of @pred_mov.

	PR target/110943

gcc/ChangeLog:

	* config/riscv/predicates.md (vector_const_int_or_double_0_operand):
	New predicate.
	* config/riscv/riscv-vector-builtins.cc (function_expander::function_expander):
	force_reg mem target operand.
	* config/riscv/vector.md (@pred_mov<mode>): Wrapper.
	(*pred_mov<mode>): Remove imm -> reg pattern.
	(*pred_broadcast<mode>_imm): Add imm -> reg pattern.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr110943.c: New test.

(cherry picked from commit 973eb0deb467c79cc21f265a710a81054cfd3e8c)

Dropped from backport:
	* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Adjust.

This backport is a prerequisite for gcc.target/riscv/rvv/base/pr111935.c
that was backported from gcc-14 to gcc-13 upstream, presumably without
realizing that the test didn't pass in gcc-13.
---
 gcc/config/riscv/predicates.md                     |    5 +
 gcc/config/riscv/riscv-vector-builtins.cc          |    9 ++
 gcc/config/riscv/vector.md                         |   98 +++++++++++---------
 gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c |   33 +++++++
 4 files changed, 101 insertions(+), 44 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 1707c80cba256..0600824695ed8 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -280,6 +280,11 @@ (define_predicate "vector_const_0_operand"
   (and (match_code "const_vector")
        (match_test "satisfies_constraint_Wc0 (op)")))
 
+(define_predicate "vector_const_int_or_double_0_operand"
+  (and (match_code "const_vector")
+       (match_test "satisfies_constraint_vi (op)
+                    || satisfies_constraint_Wc0 (op)")))
+
 (define_predicate "vector_move_operand"
   (ior (match_operand 0 "nonimmediate_operand")
        (and (match_code "const_vector")
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index 01cea23d3e687..60ad59814cd5d 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -2935,7 +2935,14 @@ function_expander::function_expander (const function_instance &instance,
     exp (exp_in), target (target_in), opno (0)
 {
   if (!function_returns_void_p ())
-    create_output_operand (&m_ops[opno++], target, TYPE_MODE (TREE_TYPE (exp)));
+    {
+      if (target != NULL_RTX && MEM_P (target))
+	/* Since there is no intrinsic where target is a mem operand, it
+	   should be converted to reg if it is a mem operand.  */
+	target = force_reg (GET_MODE (target), target);
+      create_output_operand (&m_ops[opno++], target,
+			     TYPE_MODE (TREE_TYPE (exp)));
+    }
 }
 
 /* Take argument ARGNO from EXP's argument list and convert it into
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index fb0caab8da360..d84355163408e 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -936,69 +936,61 @@ (define_insn_and_split "@vsetvl<mode>_no_side_effects"
 ;; - 15.1 Vector Mask-Register Logical Instructions
 ;; -------------------------------------------------------------------------------
 
-;; vle.v/vse.v/vmv.v.v/vmv.v.x/vmv.v.i/vfmv.v.f.
-;; For vle.v/vmv.v.v/vmv.v.x/vmv.v.i/vfmv.v.f, we may need merge and mask operand.
+;; vle.v/vse.v/vmv.v.v.
+;; For vle.v/vmv.v.v, we may need merge and mask operand.
 ;; For vse.v, we don't need merge operand, so it should always match "vu".
 ;; constraint alternative 0 ~ 1 match vle.v.
 ;; constraint alternative 2 match vse.v.
 ;; constraint alternative 3 match vmv.v.v.
-;; constraint alternative 4 match vmv.v.i.
-;; For vmv.v.i, we allow 2 following cases:
-;;    1. (const_vector:VNx1QI repeat [
-;;                (const_int:QI N)]), -15 <= N < 16.
-;;    2. (const_vector:VNx1SF repeat [
-;;                (const_double:SF 0.0 [0x0.0p+0])]).
-
-;; We add "MEM_P (operands[0]) || MEM_P (operands[3]) || CONST_VECTOR_P (operands[1])" here to
-;; make sure we don't want CSE to generate the following pattern:
-;; (insn 17 8 19 2 (set (reg:VNx1HI 134 [ _1 ])
-;;       (if_then_else:VNx1HI (unspec:VNx1BI [
-;;                   (reg/v:VNx1BI 137 [ mask ])
-;;                   (reg:DI 151)
-;;                   (const_int 0 [0]) repeated x3
-;;                   (reg:SI 66 vl)
-;;                   (reg:SI 67 vtype)
-;;               ] UNSPEC_VPREDICATE)
-;;           (const_vector:VNx1HI repeat [
-;;                   (const_int 0 [0])
-;;               ])
-;;           (reg/v:VNx1HI 140 [ merge ]))) "rvv.c":8:12 608 {pred_movvnx1hi}
-;;    (expr_list:REG_DEAD (reg:DI 151)
-;;       (expr_list:REG_DEAD (reg/v:VNx1HI 140 [ merge ])
-;;           (expr_list:REG_DEAD (reg/v:VNx1BI 137 [ mask ])
-;;               (nil)))))
-;; Since both vmv.v.v and vmv.v.i doesn't have mask operand.
-(define_insn_and_split "@pred_mov<mode>"
-  [(set (match_operand:V 0 "nonimmediate_operand"      "=vr,    vr,    vd,     m,    vr,    vr,    vr,    vr")
+
+;; If operand 3 is a const_vector, then it is left to pred_braordcast patterns.
+(define_expand "@pred_mov<mode>"
+  [(set (match_operand:V 0 "nonimmediate_operand")
     (if_then_else:V
       (unspec:<VM>
-        [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,   Wc1,    vm, vmWc1,   Wc1,   Wc1,   Wc1,   Wc1")
-         (match_operand 4 "vector_length_operand"    "   rK,    rK,    rK,    rK,    rK,    rK,    rK,    rK")
-         (match_operand 5 "const_int_operand"        "    i,     i,     i,     i,     i,     i,     i,     i")
-         (match_operand 6 "const_int_operand"        "    i,     i,     i,     i,     i,     i,     i,     i")
-         (match_operand 7 "const_int_operand"        "    i,     i,     i,     i,     i,     i,     i,     i")
+        [(match_operand:<VM> 1 "vector_mask_operand")
+         (match_operand 4 "vector_length_operand")
+         (match_operand 5 "const_int_operand")
+         (match_operand 6 "const_int_operand")
+         (match_operand 7 "const_int_operand")
          (reg:SI VL_REGNUM)
          (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
-      (match_operand:V 3 "vector_move_operand"       "    m,     m,     m,    vr,    vr,    vr, viWc0, viWc0")
-      (match_operand:V 2 "vector_merge_operand"      "    0,    vu,    vu,    vu,    vu,     0,    vu,     0")))]
-  "TARGET_VECTOR && (MEM_P (operands[0]) || MEM_P (operands[3])
-   || CONST_VECTOR_P (operands[1]))"
+      (match_operand:V 3 "vector_move_operand")
+      (match_operand:V 2 "vector_merge_operand")))]
+  "TARGET_VECTOR"
+  {})
+
+;; vle.v/vse.v,vmv.v.v
+(define_insn_and_split "*pred_mov<mode>"
+  [(set (match_operand:V 0 "nonimmediate_operand"            "=vr,    vr,    vd,     m,    vr,    vr")
+    (if_then_else:V
+      (unspec:<VM>
+        [(match_operand:<VM> 1 "vector_mask_operand"           "vmWc1,   Wc1,    vm, vmWc1,   Wc1,   Wc1")
+         (match_operand 4 "vector_length_operand"              "   rK,    rK,    rK,    rK,    rK,    rK")
+         (match_operand 5 "const_int_operand"                  "    i,     i,     i,     i,     i,     i")
+         (match_operand 6 "const_int_operand"                  "    i,     i,     i,     i,     i,     i")
+         (match_operand 7 "const_int_operand"                  "    i,     i,     i,     i,     i,     i")
+         (reg:SI VL_REGNUM)
+         (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+      (match_operand:V 3 "reg_or_mem_operand"              "    m,     m,     m,    vr,    vr,    vr")
+      (match_operand:V 2 "vector_merge_operand"            "    0,    vu,    vu,    vu,    vu,     0")))]
+  "(TARGET_VECTOR
+    && (register_operand (operands[0], <MODE>mode)
+        || register_operand (operands[3], <MODE>mode)))"
   "@
    vle<sew>.v\t%0,%3%p1
    vle<sew>.v\t%0,%3
    vle<sew>.v\t%0,%3,%1.t
    vse<sew>.v\t%3,%0%p1
    vmv.v.v\t%0,%3
-   vmv.v.v\t%0,%3
-   vmv.v.i\t%0,%v3
-   vmv.v.i\t%0,%v3"
+   vmv.v.v\t%0,%3"
   "&& register_operand (operands[0], <MODE>mode)
    && register_operand (operands[3], <MODE>mode)
    && satisfies_constraint_vu (operands[2])
    && INTVAL (operands[7]) == riscv_vector::VLMAX"
   [(set (match_dup 0) (match_dup 3))]
   ""
-  [(set_attr "type" "vlde,vlde,vlde,vste,vimov,vimov,vimov,vimov")
+  [(set_attr "type" "vlde,vlde,vlde,vste,vimov,vimov")
    (set_attr "mode" "<MODE>")])
 
 ;; Dedicated pattern for vse.v instruction since we can't reuse pred_mov pattern to include
@@ -1367,6 +1359,26 @@ (define_insn "*pred_broadcast<mode>_zero"
   [(set_attr "type" "vimovxv,vimovxv")
    (set_attr "mode" "<MODE>")])
 
+;; Because (vec_duplicate imm) will be converted to (const_vector imm),
+;; This pattern is used to handle this case.
+(define_insn "*pred_broadcast<mode>_imm"
+  [(set (match_operand:V 0 "register_operand"                     "=vr,    vr")
+    (if_then_else:V
+      (unspec:<VM>
+        [(match_operand:<VM> 1 "vector_all_trues_mask_operand"      "  Wc1,   Wc1")
+         (match_operand 4 "vector_length_operand"                   "   rK,    rK")
+         (match_operand 5 "const_int_operand"                       "    i,     i")
+         (match_operand 6 "const_int_operand"                       "    i,     i")
+         (match_operand 7 "const_int_operand"                       "    i,     i")
+         (reg:SI VL_REGNUM)
+         (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+      (match_operand:V 3 "vector_const_int_or_double_0_operand" "viWc0, viWc0")
+      (match_operand:V 2 "vector_merge_operand"                 "   vu,     0")))]
+  "TARGET_VECTOR"
+  "vmv.v.i\t%0,%v3"
+  [(set_attr "type" "vimov,vimov")
+   (set_attr "mode" "<MODE>")])
+
 ;; -------------------------------------------------------------------------------
 ;; ---- Predicated Strided loads/stores
 ;; -------------------------------------------------------------------------------
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c
new file mode 100644
index 0000000000000..8a6c00fc94d29
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <riscv_vector.h>
+
+/*
+** foo9:
+**   vsetivli\tzero,1,e64,m2,t[au],m[au]
+**   ...
+**   vs2r.v\tv[0-9]+,0\([a-x0-9]+\)
+**   ret
+*/
+void foo9 (void *base, void *out, size_t vl)
+{
+    int64_t scalar = *(int64_t*)(base + 100);
+    vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1);
+    *(vint64m2_t*)out = v;
+}
+
+/*
+** foo10:
+**   vsetivli\tzero,1,e64,m2,t[au],m[au]
+**   ...
+**   vs2r.v\tv[0-9]+,0\([a-x0-9]+\)
+**   ret
+*/
+void foo10 (void *base, void *out, size_t vl)
+{
+    int64_t scalar = *(int64_t*)(base + 100);
+    vint64m2_t v = __riscv_vmv_s_x_i64m2 (0, 1);
+    *(vint64m2_t*)out = v;
+}

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving ""normal"" is *not* inclusive

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern
  2023-08-08 11:57 Lehua Ding
@ 2023-08-08 16:10 ` Jeff Law
  0 siblings, 0 replies; 11+ messages in thread
From: Jeff Law @ 2023-08-08 16:10 UTC (permalink / raw)
  To: Lehua Ding, gcc-patches; +Cc: juzhe.zhong, rdapp.gcc, kito.cheng, palmer



On 8/8/23 05:57, Lehua Ding wrote:
> Hi,
> 
> This patch fix PR110943 which will produce some error code. This is because
> the error combine of some pred_mov pattern. Consider this code:
> 
> ```
> #include <riscv_vector.h>
> 
> void foo9 (void *base, void *out, size_t vl)
> {
>      int64_t scalar = *(int64_t*)(base + 100);
>      vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1);
>      *(vint64m2_t*)out = v;
> }
> ```
> 
> RTL before combine pass:
> 
> ```
> (insn 11 10 12 2 (set (reg/v:RVVM2DI 134 [ v ])
>          (if_then_else:RVVM2DI (unspec:RVVMF32BI [
>                      (const_vector:RVVMF32BI repeat [
>                              (const_int 1 [0x1])
>                          ])
>                      (const_int 1 [0x1])
>                      (const_int 2 [0x2]) repeated x2
>                      (const_int 0 [0])
>                      (reg:SI 66 vl)
>                      (reg:SI 67 vtype)
>                  ] UNSPEC_VPREDICATE)
>              (const_vector:RVVM2DI repeat [
>                      (const_int 0 [0])
>                  ])
>              (unspec:RVVM2DI [
>                      (reg:SI 0 zero)
>                  ] UNSPEC_VUNDEF))) "/app/example.c":6:20 1089 {pred_movrvvm2di})
> (insn 14 13 0 2 (set (mem:RVVM2DI (reg/v/f:DI 136 [ out ]) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
>          (reg/v:RVVM2DI 134 [ v ])) "/app/example.c":7:23 717 {*movrvvm2di_whole})
> ```
> 
> RTL after combine pass:
> ```
> (insn 14 13 0 2 (set (mem:RVVM2DI (reg:DI 138) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
>          (if_then_else:RVVM2DI (unspec:RVVMF32BI [
>                      (const_vector:RVVMF32BI repeat [
>                              (const_int 1 [0x1])
>                          ])
>                      (const_int 1 [0x1])
>                      (const_int 2 [0x2]) repeated x2
>                      (const_int 0 [0])
>                      (reg:SI 66 vl)
>                      (reg:SI 67 vtype)
>                  ] UNSPEC_VPREDICATE)
>              (const_vector:RVVM2DI repeat [
>                      (const_int 0 [0])
>                  ])
>              (unspec:RVVM2DI [
>                      (reg:SI 0 zero)
>                  ] UNSPEC_VUNDEF))) "/app/example.c":7:23 1089 {pred_movrvvm2di})
> ```
> 
> This combine change the semantics of insn 14. I refine the conditon of @pred_mov
> pattern to a more restrict. It's Ok for trunk?
> 
> Best,
> Lehua
> 
> 
> 	PR target/110943
> 
> gcc/ChangeLog:
> 
> 	* config/riscv/riscv-vector-builtins.cc (function_expander::function_expander):
> 	  force_reg mem operand.
> 	* config/riscv/vector.md: Refine condition.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Update.
> 	* gcc.target/riscv/rvv/base/pr110943.c: New test.
So at a high level this doesn't look correct to me.

The pattern's operand 0 explicitly allows MEMs as do the constraints. 
So forcing the operand into a register just seems like it's papering 
over the real problem.

I wonder if we should just remove the memory destination from this 
pattern.  Ultimately isn't that case just trying to optimize a constant 
store into memory -- perhaps we just need a distinct pattern for that. 
We generally try to avoid that for movXX patterns, but this seems a bit 
different.


>   create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c
> 
> diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
> index 528dca7ae85..cd40fb2060f 100644
> --- a/gcc/config/riscv/riscv-vector-builtins.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins.cc
> @@ -3471,7 +3471,13 @@ function_expander::function_expander (const function_instance &instance,
>       exp (exp_in), target (target_in), opno (0)
>   {
>     if (!function_returns_void_p ())
> -    create_output_operand (&m_ops[opno++], target, TYPE_MODE (TREE_TYPE (exp)));
> +    {
> +      if (target != NULL_RTX && MEM_P (target))
> +	/* Use force_reg to prevent illegal mem-to-mem pattern on -O0.  */
This comment doesn't make sense in conjuction with your earlier details. 
  In particular combine doesn't run at -O0, so your earlier comment that 
combine creates the problem seems inconsistent with the comment above.


> diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
> index e56a2bf4bed..f0484b1162c 100644
> --- a/gcc/config/riscv/vector.md
> +++ b/gcc/config/riscv/vector.md
> @@ -1509,8 +1509,9 @@
>            (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
>         (match_operand:V_VLS 3 "vector_move_operand"   "    m,     m,     m,    vr,    vr,    vr, viWc0, viWc0")
>         (match_operand:V_VLS 2 "vector_merge_operand"  "    0,    vu,    vu,    vu,    vu,     0,    vu,     0")))]
> -  "TARGET_VECTOR && (MEM_P (operands[0]) || MEM_P (operands[3])
> -   || CONST_VECTOR_P (operands[1]))"
> +  "TARGET_VECTOR && ((register_operand (operands[0], <MODE>mode) && MEM_P (operands[3])) ||
> +                     (MEM_P (operands[0]) && register_operand (operands[3], <MODE>mode)) ||
> +                     (register_operand (operands[0], <MODE>mode) && satisfies_constraint_Wc1 (operands[1])))"
Umm, wow.  I haven't thought deeply about this, but the complexity of 
that insn condition is a huge red flag that our operand predicates 
aren't correct for this pattern.

 From a formatting standpoint bring the wrapped operator down and 
indent.  ie

   (condition 1
    || condition 2
    || (condition 3
        && other test 4))


Jeff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] RISC-V: Fix error combine of pred_mov pattern
@ 2023-08-08 11:57 Lehua Ding
  2023-08-08 16:10 ` Jeff Law
  0 siblings, 1 reply; 11+ messages in thread
From: Lehua Ding @ 2023-08-08 11:57 UTC (permalink / raw)
  To: gcc-patches; +Cc: juzhe.zhong, rdapp.gcc, kito.cheng, palmer, jeffreyalaw

Hi,

This patch fix PR110943 which will produce some error code. This is because
the error combine of some pred_mov pattern. Consider this code:

```
#include <riscv_vector.h>

void foo9 (void *base, void *out, size_t vl)
{
    int64_t scalar = *(int64_t*)(base + 100);
    vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1);
    *(vint64m2_t*)out = v;
}
```

RTL before combine pass:

```
(insn 11 10 12 2 (set (reg/v:RVVM2DI 134 [ v ])
        (if_then_else:RVVM2DI (unspec:RVVMF32BI [
                    (const_vector:RVVMF32BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (const_int 1 [0x1])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (const_vector:RVVM2DI repeat [
                    (const_int 0 [0])
                ])
            (unspec:RVVM2DI [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.c":6:20 1089 {pred_movrvvm2di})
(insn 14 13 0 2 (set (mem:RVVM2DI (reg/v/f:DI 136 [ out ]) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
        (reg/v:RVVM2DI 134 [ v ])) "/app/example.c":7:23 717 {*movrvvm2di_whole})
```

RTL after combine pass:
```
(insn 14 13 0 2 (set (mem:RVVM2DI (reg:DI 138) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
        (if_then_else:RVVM2DI (unspec:RVVMF32BI [
                    (const_vector:RVVMF32BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (const_int 1 [0x1])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (const_vector:RVVM2DI repeat [
                    (const_int 0 [0])
                ])
            (unspec:RVVM2DI [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "/app/example.c":7:23 1089 {pred_movrvvm2di})
```

This combine change the semantics of insn 14. I refine the conditon of @pred_mov
pattern to a more restrict. It's Ok for trunk?

Best,
Lehua


	PR target/110943

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins.cc (function_expander::function_expander):
	  force_reg mem operand.
	* config/riscv/vector.md: Refine condition.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Update.
	* gcc.target/riscv/rvv/base/pr110943.c: New test.

---
 gcc/config/riscv/riscv-vector-builtins.cc     |  8 ++++-
 gcc/config/riscv/vector.md                    |  5 +--
 .../gcc.target/riscv/rvv/base/pr110943.c      | 33 +++++++++++++++++++
 .../riscv/rvv/base/zvfhmin-intrinsic.c        | 10 +++---
 4 files changed, 48 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index 528dca7ae85..cd40fb2060f 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -3471,7 +3471,13 @@ function_expander::function_expander (const function_instance &instance,
     exp (exp_in), target (target_in), opno (0)
 {
   if (!function_returns_void_p ())
-    create_output_operand (&m_ops[opno++], target, TYPE_MODE (TREE_TYPE (exp)));
+    {
+      if (target != NULL_RTX && MEM_P (target))
+	/* Use force_reg to prevent illegal mem-to-mem pattern on -O0.  */
+	target = force_reg (GET_MODE (target), target);
+      create_output_operand (&m_ops[opno++], target,
+			     TYPE_MODE (TREE_TYPE (exp)));
+    }
 }
 
 /* Take argument ARGNO from EXP's argument list and convert it into
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index e56a2bf4bed..f0484b1162c 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -1509,8 +1509,9 @@
          (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
       (match_operand:V_VLS 3 "vector_move_operand"   "    m,     m,     m,    vr,    vr,    vr, viWc0, viWc0")
       (match_operand:V_VLS 2 "vector_merge_operand"  "    0,    vu,    vu,    vu,    vu,     0,    vu,     0")))]
-  "TARGET_VECTOR && (MEM_P (operands[0]) || MEM_P (operands[3])
-   || CONST_VECTOR_P (operands[1]))"
+  "TARGET_VECTOR && ((register_operand (operands[0], <MODE>mode) && MEM_P (operands[3])) ||
+                     (MEM_P (operands[0]) && register_operand (operands[3], <MODE>mode)) ||
+                     (register_operand (operands[0], <MODE>mode) && satisfies_constraint_Wc1 (operands[1])))"
   "@
    vle<sew>.v\t%0,%3%p1
    vle<sew>.v\t%0,%3
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c
new file mode 100644
index 00000000000..8a6c00fc94d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <riscv_vector.h>
+
+/*
+** foo9:
+**   vsetivli\tzero,1,e64,m2,t[au],m[au]
+**   ...
+**   vs2r.v\tv[0-9]+,0\([a-x0-9]+\)
+**   ret
+*/
+void foo9 (void *base, void *out, size_t vl)
+{
+    int64_t scalar = *(int64_t*)(base + 100);
+    vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1);
+    *(vint64m2_t*)out = v;
+}
+
+/*
+** foo10:
+**   vsetivli\tzero,1,e64,m2,t[au],m[au]
+**   ...
+**   vs2r.v\tv[0-9]+,0\([a-x0-9]+\)
+**   ret
+*/
+void foo10 (void *base, void *out, size_t vl)
+{
+    int64_t scalar = *(int64_t*)(base + 100);
+    vint64m2_t v = __riscv_vmv_s_x_i64m2 (0, 1);
+    *(vint64m2_t*)out = v;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c b/gcc/testsuite/gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c
index fc70c54c7fc..500748b8e79 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c
@@ -194,12 +194,12 @@ vfloat16m4_t test_vget_v_f16m8_f16m4(vfloat16m8_t src, size_t index) {
 /* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*zero,\s*e16,\s*m8,\s*t[au],\s*m[au]} 5 } } */
 /* { dg-final { scan-assembler-times {vfwcvt\.f\.f\.v\s+v[0-9]+,\s*v[0-9]+} 5 } } */
 /* { dg-final { scan-assembler-times {vfncvt\.f\.f\.w\s+v[0-9]+,\s*v[0-9]+} 5 } } */
-/* { dg-final { scan-assembler-times {vle16\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 20 } } */
+/* { dg-final { scan-assembler-times {vle16\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 15 } } */
 /* { dg-final { scan-assembler-times {vse16\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 15 } } */
-/* { dg-final { scan-assembler-times {vl1re16\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 5 } } */
-/* { dg-final { scan-assembler-times {vl2re16\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 4 } } */
-/* { dg-final { scan-assembler-times {vl8re16\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 5 } } */
-/* { dg-final { scan-assembler-times {vl4re16\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 5 } } */
+/* { dg-final { scan-assembler-times {vl1re16\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 7 } } */
+/* { dg-final { scan-assembler-times {vl2re16\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 5 } } */
+/* { dg-final { scan-assembler-times {vl8re16\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 6 } } */
+/* { dg-final { scan-assembler-times {vl4re16\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 6 } } */
 /* { dg-final { scan-assembler-times {vs1r\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 5 } } */
 /* { dg-final { scan-assembler-times {vs2r\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 5 } } */
 /* { dg-final { scan-assembler-times {vs4r\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 5 } } */
-- 
2.36.3


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-02-23  7:39 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-09  3:54 [PATCH] RISC-V: Fix error combine of pred_mov pattern Lehua Ding
2023-08-10 12:29 ` Lehua Ding
2023-08-11 15:57 ` Jeff Law
2023-08-11 16:30   ` Lehua Ding
2023-08-11 16:40     ` Lehua Ding
2023-08-28 21:34     ` Jeff Law
2023-08-18 10:30   ` Lehua Ding
  -- strict thread matches above, loose matches on Subject: below --
2024-02-20  4:21 Alexandre Oliva
2024-02-23  7:39 ` Jeff Law
2023-08-08 11:57 Lehua Ding
2023-08-08 16:10 ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).