Re: Re: [PATCH] RISC-V: Fix PR108279

From: <juzhe.zhong@rivai.ai>
To: "Jeff Law" <jeffreyalaw@gmail.com>,
	 gcc-patches <gcc-patches@gcc.gnu.org>
Cc: kito.cheng <kito.cheng@gmail.com>,  palmer <palmer@dabbelt.com>
Subject: Re: Re: [PATCH] RISC-V: Fix PR108279
Date: Mon, 3 Apr 2023 06:40:42 +0800
Message-ID: <127758663DFFBC5C+2023040306404192617341@rivai.ai>
In-Reply-To: <7117f9e5-2a82-87d5-66e9-633d9f55cc2d@gmail.com>

This point is selected not by LCM but by Phase 3 (backward fusion and propagation of VL/VTYPE demand info), which
I introduced into the VSETVL pass to enhance LCM and improve vsetvl instruction performance.

This patch suppresses Phase 3's overly aggressive backward fusion and propagation to the top of the function
when the AVL has no defining instruction (that is, when the AVL is an immediate in 0 ~ 31, since the vsetivli instruction takes an immediate instead of a register).
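
As a rough illustration (a toy model in plain C, not GCC code), the effect of placing the vsetivli above the loop guards can be sketched as follows. The global `vl` stands in for the VL CSR, and `f_before`, `f_after` and `vl_seen_by_caller` are hypothetical names standing in for the code generated before and after this patch; the values 17 and 100 come from the test case in the quoted patch description.

```c
#include <assert.h>

/* Toy stand-in for the VL CSR; not real hardware state.  */
static unsigned vl;

/* Hypothetical model of the code before the patch: the vsetivli sits at
   function entry, so it runs even when no loop body executes.  */
static void f_before (int l, int n, int m)
{
  vl = 17;                          /* vsetivli zero,17,e8,mf8,ta,ma */
  if (l <= 0 || m <= 0 || n <= 0)
    return;
  /* ... the vle8/vse8 loop nest would run here ... */
}

/* Hypothetical model of the code after the patch: the loop guards come
   first, so an empty loop nest leaves VL untouched.  */
static void f_after (int l, int n, int m)
{
  if (l <= 0 || m <= 0 || n <= 0)   /* ble ...,zero,.L1 */
    return;
  vl = 17;                          /* vsetivli zero,17,e8,mf8,ta,ma */
  /* ... the vle8/vse8 loop nest would run here ... */
}

/* Set VL as the caller would, call f, and return the VL the caller then
   reads back (the csrr a0,vl in the test case).  */
static unsigned vl_seen_by_caller (void (*f) (int, int, int),
                                   unsigned caller_vl, int l, int n, int m)
{
  vl = caller_vl;
  f (l, n, m);
  return vl;
}
```

In this model, calling `vl_seen_by_caller (f_before, 100, 0, 0, 0)` overwrites the caller's VL with 17, while the same call with `f_after` leaves it at 100 because the guards return before the vsetivli is reached.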

You may ask why we need Phase 3 to do the job.
We have many situations that pure LCM fails to optimize. Here is a simple case to demonstrate it:
#include <stddef.h>
#include <riscv_vector.h>

void f (void * restrict in, void * restrict out, int n, int m, int cond)
{
  /* An AVL of 101 does not fit the 0 ~ 31 immediate of vsetivli,
     so it is held in a register.  */
  size_t vl = 101;
  for (size_t j = 0; j < m; j++)
    {
      if (cond)
        {
          /* Inner loop 1: needs e8, mf8.  */
          for (size_t i = 0; i < n; i++)
            {
              vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + j, vl);
              __riscv_vse8_v_i8mf8 (out + i, v, vl);
            }
        }
      else
        {
          /* Inner loop 2: needs e32, mf2.  */
          for (size_t i = 0; i < n; i++)
            {
              vint32mf2_t v = __riscv_vle32_v_i32mf2 (in + i + j, vl);
              v = __riscv_vadd_vv_i32mf2 (v, v, vl);
              __riscv_vse32_v_i32mf2 (out + i, v, vl);
            }
        }
    }
}

You can see:
The first inner loop needs vsetvli e8 mf8 for vle+vse.
The second inner loop needs vsetvli e32 mf2 for vle+vadd+vse.

If we don't have Phase 3 (only LCM, i.e. Phase 4, handles it), we will end up with:

outerloop:
...
vsetvli e8mf8
inner loop 1:
....

vsetvli e32mf2
inner loop 2:
....

However, with Phase 3, the vsetvli e32 mf2 of inner loop 2 is fused into the vsetvli e8 mf8 of inner loop 1, and we end up with this result after Phase 3:

outerloop:
...
inner loop 1:
vsetvli e32mf2
....

inner loop 2:
vsetvli e32mf2
....
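
Why inner loop 1 can run under e32,mf2 is not spelled out here. A likely explanation, based on the RVV specification rather than on the pass source, is that both settings have the same SEW/LMUL ratio (and therefore the same VLMAX), while vle8.v/vse8.v carry their own element width, with EMUL = (EEW / SEW) * LMUL. A minimal C sketch of that arithmetic, where the type and helper names are hypothetical:

```c
#include <assert.h>

/* Hypothetical helper type: a vtype with LMUL kept as a fraction,
   e.g. mf8 is 1/8 and mf2 is 1/2.  */
struct vtype { int sew; int lmul_num; int lmul_den; };

struct frac { int num; int den; };

static int gcd_int (int a, int b)
{
  while (b)
    {
      int t = a % b;
      a = b;
      b = t;
    }
  return a;
}

/* SEW/LMUL ratio.  VLMAX = VLEN / ratio, so equal ratios give equal VLMAX.  */
static int sew_lmul_ratio (struct vtype t)
{
  return t.sew * t.lmul_den / t.lmul_num;
}

/* Effective LMUL of a load/store with element width EEW executed under
   vtype T: EMUL = (EEW / SEW) * LMUL, reduced to lowest terms.  */
static struct frac emul (int eew, struct vtype t)
{
  struct frac r = { eew * t.lmul_num, t.sew * t.lmul_den };
  int g = gcd_int (r.num, r.den);
  r.num /= g;
  r.den /= g;
  return r;
}

static const struct vtype e8mf8 = { 8, 1, 8 };
static const struct vtype e32mf2 = { 32, 1, 2 };
```

Under these assumptions, `sew_lmul_ratio` gives 64 for both e8,mf8 and e32,mf2, and `emul (8, e32mf2)` reduces to 1/8, so an 8-bit load or store executed under e32,mf2 still operates on an mf8 register group.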

This demand information after Phase 3 is then well optimized by Phase 4 (LCM). The result after Phase 4 is:

vsetvli e32mf2
outerloop:
...
inner loop 1:
....

inner loop 2:
....

You can see this is the optimal codegen from the current VSETVL pass (Phase 3: demand backward fusion and propagation + Phase 4: LCM). This is an issue I have known about since I started implementing the VSETVL pass.
I left it to be fixed after I finished all the features targeted for GCC 13, and Kito postponed merging this patch until after GCC 14 opens.

juzhe.zhong@rivai.ai

From: Jeff Law
Date: 2023-04-03 03:41
To: juzhe.zhong; gcc-patches
CC: kito.cheng; palmer
Subject: Re: [PATCH] RISC-V: Fix PR108279

On 3/27/23 00:59, juzhe.zhong@rivai.ai wrote:
> From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
> 
>          PR 108270
> 
> Fix bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108270.
> 
> Consider the following testcase:
> void f (void * restrict in, void * restrict out, int l, int n, int m)
> {
>    for (int i = 0; i < l; i++){
>      for (int j = 0; j < m; j++){
>        for (int k = 0; k < n; k++)
>          {
>            vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + j, 17);
>            __riscv_vse8_v_i8mf8 (out + i + j, v, 17);
>          }
>      }
>    }
> }
> 
> Compile option: -O3
> 
> Before this patch:
>          mv      a7,a2
>          mv      a6,a0
>          mv      t1,a1
>          mv      a2,a3
>          vsetivli        zero,17,e8,mf8,ta,ma
> ...
> 
> After this patch:
>          mv      a7,a2
>          mv      a6,a0
>          mv      t1,a1
>          mv      a2,a3
>          ble     a7,zero,.L1
>          ble     a4,zero,.L1
>          ble     a3,zero,.L1
>          add     a1,a0,a4
>          li      a0,0
>          vsetivli        zero,17,e8,mf8,ta,ma
> ...
> 
> This can produce a bug when:
> 
> int main ()
> {
>    vsetivli zero, 100,.....
>    f (in, out, 0, 0, 0);
>    asm volatile ("csrr a0,vl":::"memory");
> 
>    // Before this patch the a0 is 17. (Wrong).
>    // After this patch the a0 is 100. (Correct).
>    ...
> }
So why was that point selected in the first place?   I would have 
expected LCM to select the loop entry edge as the desired insertion point.

Essentially if LCM selects the point before those branches, then it's 
violating a fundamental principle of LCM, namely that you never put an 
evaluation on a path where it didn't have one before.

So not objecting to the patch but it is raising concerns about the LCM 
results.

jeff
