From: "juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai>
To: jeffreyalaw <jeffreyalaw@gmail.com>,
gcc-patches <gcc-patches@gcc.gnu.org>
Cc: kito.cheng <kito.cheng@gmail.com>,
Kito.cheng <kito.cheng@sifive.com>,
"Robin Dapp" <rdapp.gcc@gmail.com>
Subject: Re: Re: [PATCH] RISC-V: Expand VLMAX scalar move in reduction
Date: Mon, 5 Feb 2024 14:37:53 +0800 [thread overview]
Message-ID: <5B8117E9C28EBC42+202402051437529412972@rivai.ai> (raw)
In-Reply-To: <778a4509-c387-452f-b682-e0dd9963ac90@gmail.com>
I think it just triggers a latent bug that we hadn't encountered before.
Hi, Robin. Would you mind giving me a preprocessed file to reproduce the issue?
I suspect it triggers a latent bug in the VSETVL pass.
juzhe.zhong@rivai.ai
From: Jeff Law
Date: 2024-02-05 12:36
To: Juzhe-Zhong; gcc-patches
CC: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Expand VLMAX scalar move in reduction
On 2/4/24 20:26, Jeff Law wrote:
>
>
> On 2/1/24 18:56, Juzhe-Zhong wrote:
>> This patch fixes the following:
>>
>> vsetvli a5,a1,e32,m1,tu,ma
>> slli a4,a5,2
>> sub a1,a1,a5
>> vle32.v v2,0(a0)
>> add a0,a0,a4
>> vadd.vv v1,v2,v1
>> bne a1,zero,.L3
>> vsetivli zero,1,e32,m1,ta,ma
>> vmv.s.x v2,zero
>> vsetvli a5,zero,e32,m1,ta,ma ---> Redundant vsetvl.
>> vredsum.vs v1,v1,v2
>> vmv.x.s a0,v1
>> ret
>>
>> The VSETVL pass is able to fuse the avl = 1 of the scalar move with the
>> VLMAX avl of the reduction.
>>
>> However, the following RTL blocks the fusion in the dependence analysis
>> of the VSETVL pass:
>>
>> (insn 49 24 50 5 (set (reg:RVVM1SI 98 v2 [148])
>> (if_then_else:RVVM1SI (unspec:RVVMF32BI [
>> (const_vector:RVVMF32BI [
>> (const_int 1 [0x1])
>> repeat [
>> (const_int 0 [0])
>> ]
>> ])
>> (const_int 1 [0x1])
>> (const_int 2 [0x2]) repeated x2
>> (const_int 0 [0])
>> (reg:SI 66 vl)
>> (reg:SI 67 vtype)
>> ] UNSPEC_VPREDICATE)
>> (const_vector:RVVM1SI repeat [
>> (const_int 0 [0])
>> ])
>> (unspec:RVVM1SI [
>> (reg:DI 0 zero)
>> ] UNSPEC_VUNDEF))) 3813 {*pred_broadcastrvvm1si_zero}
>> (nil))
>> (insn 50 49 51 5 (set (reg:DI 15 a5 [151])
>> ----> It sets a5, which blocks fusing the following VLMAX avl into the scalar move above.
>> (unspec:DI [
>> (const_int 32 [0x20])
>> ] UNSPEC_VLMAX)) 2566 {vlmax_avldi}
>> (expr_list:REG_EQUIV (unspec:DI [
>> (const_int 32 [0x20])
>> ] UNSPEC_VLMAX)
>> (nil)))
>> (insn 51 50 52 5 (set (reg:RVVM1SI 97 v1 [150])
>> (unspec:RVVM1SI [
>> (unspec:RVVMF32BI [
>> (const_vector:RVVMF32BI repeat [
>> (const_int 1 [0x1])
>> ])
>> (reg:DI 15 a5 [151])
>> (const_int 2 [0x2])
>> (const_int 1 [0x1])
>> (reg:SI 66 vl)
>> (reg:SI 67 vtype)
>> ] UNSPEC_VPREDICATE)
>> (unspec:RVVM1SI [
>> (reg:RVVM1SI 97 v1 [orig:134 vect_result_14.6
>> ] [134])
>> (reg:RVVM1SI 98 v2 [148])
>> ] UNSPEC_REDUC_SUM)
>> (unspec:RVVM1SI [
>> (reg:DI 0 zero)
>> ] UNSPEC_VUNDEF)
>> ] UNSPEC_REDUC)) 17541 {pred_redsumrvvm1si}
>> (expr_list:REG_DEAD (reg:RVVM1SI 98 v2 [148])
>> (expr_list:REG_DEAD (reg:SI 66 vl)
>> (expr_list:REG_DEAD (reg:DI 15 a5 [151])
>> (expr_list:REG_DEAD (reg:DI 0 zero)
>> (nil))))))
>>
>> Such a situation can only happen with auto-vectorization; it never
>> happens with intrinsic code.
>> Since the reduction is passed a VLMAX AVL, it is more natural to also
>> pass VLMAX to the scalar move which initializes the value of the
>> reduction.
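(Interleaving a note for context: a minimal sum-reduction loop of the kind the auto-vectorizer turns into the vle32/vadd.vv loop plus vmv.s.x/vredsum.vs epilogue quoted above might look like the following. This is a hypothetical example, not the actual pr113697.c testcase.)

```c
#include <stddef.h>

/* Sum-reduction over an int array. Compiled with the RISC-V vector
   extension enabled (e.g. -O3 -march=rv64gcv), the vectorizer emits a
   strip-mined vle32.v/vadd.vv loop, then a scalar move of 0 into a
   vector register (vmv.s.x) feeding the final vredsum.vs. */
int sum(const int *a, size_t n)
{
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}
```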
>>
>> After this patch:
>>
>> vsetvli a5,a1,e32,m1,tu,ma
>> slli a4,a5,2
>> sub a1,a1,a5
>> vle32.v v2,0(a0)
>> add a0,a0,a4
>> vadd.vv v1,v2,v1
>> bne a1,zero,.L3
>> vsetvli a5,zero,e32,m1,ta,ma
>> vmv.s.x v2,zero
>> vredsum.vs v1,v1,v2
>> vmv.x.s a0,v1
>> ret
>>
>> Tested on both RV32 and RV64 with no regressions.
>>
>> PR target/113697
>>
>> gcc/ChangeLog:
>>
>> * config/riscv/riscv-v.cc (expand_reduction): Pass VLMAX avl to
>> scalar move.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/riscv/rvv/autovec/pr113697.c: New test.
> I suspect this broke 502.gcc in spec2017. Basically it's hanging during
> the build phase. I'm not sure if I'm going to have time this week to
> dive into it.
>
>
> Optimization options used:
>
>> GCC Flags: -Ofast -flto -fsched-pressure -fno-strict-aliasing
>> -fgnu89-inline -fcommon -fno-finite-math-only
>> -fno-unsafe-math-optimizations
>
>
>
> Given this appears to be a minor optimization issue, I wouldn't lose any
> sleep if it were reverted and deferred to gcc-15.
>
> Anyway, good luck. Sorry I can't do more on the debugging/reduction front.
Actually, I'm starting to wonder if this is just the trigger and if the
real issue is something else that went in over the last week or so. I
reverted the patch above, which allows 502.gcc to build. But then I get a
hang on xalancbmk.
Makes me wonder if the vsetvl bits are the culprit given the size of
that change.
jeff
Thread overview (7 messages):
2024-02-02 1:56 Juzhe-Zhong
2024-02-02 9:50 ` Kito Cheng
2024-02-05 3:26 ` Jeff Law
2024-02-05 4:36 ` Jeff Law
2024-02-05 6:37 ` juzhe.zhong [this message]
2024-02-05 15:34 ` Jeff Law
2024-02-05 19:09 ` Jeff Law