public inbox for gcc-patches@gcc.gnu.org
From: "juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai>
To: rguenther <rguenther@suse.de>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	 jeffreyalaw <jeffreyalaw@gmail.com>
Subject: Re: Re: [PATCH] LCM: Export 2 helpful functions as global for VSETVL PASS use in RISC-V backend
Date: Mon, 21 Aug 2023 15:46:52 +0800	[thread overview]
Message-ID: <D3CD1545179BDCAB+2023082115465200847223@rivai.ai> (raw)
In-Reply-To: <nycvar.YFH.7.77.849.2308210707100.12935@jbgna.fhfr.qr>

Hi Richi.
I'd like to share more details about what I want to do in the VSETVL PASS.

Consider the following case:

for
  for 
    for
      ...
         for
	     VSETVL demand: RATIO = 32 and TU policy.

For this simple case, 'pre_edge_lcm_av' works perfectly for us: it will hoist "vsetvli e32,tu" to the outer-most loop.
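For reference, the expected result for this first case (just a sketch, in the same notation as above):

vsetvli e32,tu           <- single demand, hoisted by LCM PRE
for
  for
    for
      ...
         for
	     ... no vsetvl insn.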

However, for this case:
  for
  for 
    for
      ...
         for
	   if (...)
	     VSETVL 1 demand: RATIO = 32 and TU policy.
	   else if (...)
	     VSETVL 2 demand: SEW = 16.
	   else
	     VSETVL 3 demand: MU policy.

'pre_edge_lcm_av' is not sufficient to give us optimal codegen, since VSETVL 1, VSETVL 2 and VSETVL 3 are 3 different VSETVL demands
and 'pre_edge_lcm_av' can only hoist one of them. Such cases are easy to produce with RVV intrinsics, and they are already in our RVV testsuite.

To get optimal codegen for this case, we need what I call "demand fusion": fusing all "compatible" VSETVL demands into a single VSETVL demand,
then emitting that single VSETVL so that the redundant VSETVLs can be elided.

In this case, we should be able to fuse VSETVL 1, VSETVL 2 and VSETVL 3 into a single new VSETVL demand: SEW = 16, LMUL = MF2, TU, MU.
Instead of giving 'pre_edge_lcm_av' 3 VSETVL demands (VSETVL 1/2/3), I give 'pre_edge_lcm_av' only this one fused demand.
Then LCM PRE can hoist the fused VSETVL to the outer-most loop, so the program will be transformed into:

VSETVL SEW = 16, LMUL = MF2, TU, MU
  for
  for 
    for
      ...
         for
	   if (...) 
	     .....   no vsetvl insn.
	   else if (...)
	     ....  no vsetvl insn.

	   else
	     ....  no vsetvl insn.
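
To make the fusion rule itself concrete, here is a minimal C++ sketch of the idea (the struct and the function names are hypothetical illustrations, not the actual VSETVL pass code):

  /* Hypothetical representation of a VSETVL demand.  A zero/false field
     means the demand does not care about that setting.  */
  struct vsetvl_demand
  {
    int sew;     /* Demanded SEW, 0 = don't care.  */
    int ratio;   /* Demanded SEW/LMUL ratio, 0 = don't care.  */
    bool tu;     /* Tail-undisturbed policy demanded.  */
    bool mu;     /* Mask-undisturbed policy demanded.  */
  };

  /* Fuse two compatible demands into one demand satisfying both.
     Fusing {ratio = 32, tu} with {sew = 16} and then with {mu} gives
     {sew = 16, ratio = 32, tu, mu}, i.e. SEW = 16, LMUL = MF2, TU, MU.  */
  static vsetvl_demand
  fuse_demands (const vsetvl_demand &a, const vsetvl_demand &b)
  {
    vsetvl_demand r;
    r.sew = a.sew ? a.sew : b.sew;
    r.ratio = a.ratio ? a.ratio : b.ratio;
    r.tu = a.tu || b.tu;
    r.mu = a.mu || b.mu;
    return r;
  }

Compatibility itself (e.g. two different demanded SEWs cannot be fused) has to be checked before calling such a function; the sketch only shows the merge.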

So, how do we do the demand fusion in this case?
Before this patch and the follow-up RISC-V refactor patch, I did it explicitly with my own hand-written placement algorithm,
meaning I calculated myself at which location in the program the VSETVL fusion is correct and optimal.

However, I found that "compute_earliest" can do this job of calculating the program locations at which to perform the VSETVL fusion,
and it turns out to be a much more reliable and reasonable approach than mine.

So that's why I export these 2 functions, to be used in Phase 3 (demand fusion) of the RISC-V backend VSETVL PASS.
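
For reference, a rough sketch of how the exported helpers would be driven from Phase 3 (the per-block local properties antloc/transp/avloc/kill for n_exprs fused demands are assumed to have been computed by the pass already; the variable names are only illustrative):

  struct edge_list *edge_list = create_edge_list ();
  int n_edges = NUM_EDGES (edge_list);
  int n_blocks = last_basic_block_for_fn (cfun);

  sbitmap *antin = sbitmap_vector_alloc (n_blocks, n_exprs);
  sbitmap *antout = sbitmap_vector_alloc (n_blocks, n_exprs);
  sbitmap *avin = sbitmap_vector_alloc (n_blocks, n_exprs);
  sbitmap *avout = sbitmap_vector_alloc (n_blocks, n_exprs);
  sbitmap *earliest = sbitmap_vector_alloc (n_edges, n_exprs);

  /* Where each fused demand is anticipated on every path from a block.  */
  compute_antinout_edge (antloc, transp, antin, antout);
  /* Where each fused demand has already been satisfied
     (compute_available is already exported from lcm.h).  */
  compute_available (avloc, kill, avout, avin);
  /* The earliest edges on which each fused VSETVL can be placed; these
     are the program locations Phase 3 uses for the demand fusion.  */
  compute_earliest (edge_list, n_exprs, antin, antout, avout, kill, earliest);

  /* ... consume EARLIEST, then free the sbitmap vectors and edge_list.  */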

Thanks.


juzhe.zhong@rivai.ai
 
From: Richard Biener
Date: 2023-08-21 15:09
To: Juzhe-Zhong
CC: gcc-patches; jeffreyalaw
Subject: Re: [PATCH] LCM: Export 2 helpful functions as global for VSETVL PASS use in RISC-V backend
On Mon, 21 Aug 2023, Juzhe-Zhong wrote:
 
> This patch exports 'compute_antinout_edge' and 'compute_earliest' to global scope
> so that they can be used in the VSETVL PASS of the RISC-V backend.
> 
> Demand fusion is the fusion of VSETVL information so that we can emit a VSETVL which dominates and pre-configures most
> of the RVV instructions, in order to elide redundant VSETVLs.
> 
> For example:
> 
> for
>  for
>   for
>     if (cond)
>       VSETVL demand 1: SEW/LMUL = 16 and TU policy
>     else
>       VSETVL demand 2: SEW = 32
> 
> The VSETVL pass should be able to fuse demand 1 and demand 2 into a new demand: SEW = 32, LMUL = M2, TU policy.
> Then it emits such a VSETVL at the outer-most for loop to get optimal codegen and run-time execution.
> 
> Currently the VSETVL PASS Phase 3 (demand fusion) is really messy, unreliable and unmaintainable.
> Recently I read the dragon book and Morgan's book again, and I found that "earliest" allows us to do the
> demand fusion in a very reliable and optimal way.
> 
> So, this patch exports these 2 functions, which are very helpful for the VSETVL pass.
 
It would be nice to put these internal functions into a class or a
namespace given their non-LCM names.  I don't see how you are going
to use these intermediate DF functions - they are just necessary
to compute pre_edge_lcm_avs, which I see you already do.  Just to say,
you are possibly going to blow up the compile-time complexity of your
VSETVL dataflow problem?
 
> gcc/ChangeLog:
> 
> * lcm.cc (compute_antinout_edge): Export as global use.
> (compute_earliest): Ditto.
> (compute_rev_insert_delete): Ditto.
> * lcm.h (compute_antinout_edge): Ditto.
> (compute_earliest): Ditto.
> 
> ---
>  gcc/lcm.cc | 7 ++-----
>  gcc/lcm.h  | 3 +++
>  2 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/lcm.cc b/gcc/lcm.cc
> index 94a3ed43aea..03421e490e4 100644
> --- a/gcc/lcm.cc
> +++ b/gcc/lcm.cc
> @@ -56,9 +56,6 @@ along with GCC; see the file COPYING3.  If not see
>  #include "lcm.h"
>  
>  /* Edge based LCM routines.  */
> -static void compute_antinout_edge (sbitmap *, sbitmap *, sbitmap *, sbitmap *);
> -static void compute_earliest (struct edge_list *, int, sbitmap *, sbitmap *,
> -       sbitmap *, sbitmap *, sbitmap *);
>  static void compute_laterin (struct edge_list *, sbitmap *, sbitmap *,
>       sbitmap *, sbitmap *);
>  static void compute_insert_delete (struct edge_list *edge_list, sbitmap *,
> @@ -79,7 +76,7 @@ static void compute_rev_insert_delete (struct edge_list *edge_list, sbitmap *,
>     This is done based on the flow graph, and not on the pred-succ lists.
>     Other than that, its pretty much identical to compute_antinout.  */
>  
> -static void
> +void
>  compute_antinout_edge (sbitmap *antloc, sbitmap *transp, sbitmap *antin,
>         sbitmap *antout)
>  {
> @@ -170,7 +167,7 @@ compute_antinout_edge (sbitmap *antloc, sbitmap *transp, sbitmap *antin,
>  
>  /* Compute the earliest vector for edge based lcm.  */
>  
> -static void
> +void
>  compute_earliest (struct edge_list *edge_list, int n_exprs, sbitmap *antin,
>    sbitmap *antout, sbitmap *avout, sbitmap *kill,
>    sbitmap *earliest)
> diff --git a/gcc/lcm.h b/gcc/lcm.h
> index e08339352e0..7145d6fc46d 100644
> --- a/gcc/lcm.h
> +++ b/gcc/lcm.h
> @@ -31,4 +31,7 @@ extern struct edge_list *pre_edge_rev_lcm (int, sbitmap *,
>     sbitmap *, sbitmap *,
>     sbitmap *, sbitmap **,
>     sbitmap **);
> +extern void compute_antinout_edge (sbitmap *, sbitmap *, sbitmap *, sbitmap *);
> +extern void compute_earliest (struct edge_list *, int, sbitmap *, sbitmap *,
> +       sbitmap *, sbitmap *, sbitmap *);
>  #endif /* GCC_LCM_H */
> 
 
-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
 

