public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <rguenther@suse.de>
To: 钟居哲 <juzhe.zhong@rivai.ai>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	 "richard.sandiford" <richard.sandiford@arm.com>
Subject: Re: Re: [PATCH] VECT: Enhance SLP of MASK_LEN_GATHER_LOAD[PR111721]
Date: Thu, 12 Oct 2023 09:44:14 +0000 (UTC)	[thread overview]
Message-ID: <nycvar.YFH.7.77.849.2310120942550.8126@jbgna.fhfr.qr> (raw)
In-Reply-To: <FDEDCD783B90B8D0+2023101200022498629426@rivai.ai>

On Thu, 12 Oct 2023, ??? wrote:

> Thanks Richi point it out.
> 
> I found this patch can't make conditional gather load succeed on SLP.
> 
> I am considering change MASK_LEN_GATHER_LOAD in pattern recognization:
> 
> If no condition mask, in tree-vect-patterns.cc,  I build MASK_LEN_GATHER_LOAD (ptr, offset, scale, 0) -> 4 arguments same as GATHER_LOAD.
> In this situation, MASK_LEN_GATHER_LOAD can resue the GATHER_LOAD SLP flow naturally.
> 
> If has condition mask, in tree-vect-patterns.cc,  I build MASK_LEN_GATHER_LOAD (ptr, offset, scale, 0, condition) -> 5 arguments same as MASK_GATHER_LOAD.
> In this situation, MASK_LEN_GATHER_LOAD can resue the MASK_GATHER_LOAD SLP flow naturally.
> 
> Is it reasonable ?

What's wrong with handling MASK_LEN_GATHER_LOAD with all arguments
even when the mask is -1?

> 
> juzhe.zhong@rivai.ai
>  
> From: Richard Biener
> Date: 2023-10-11 20:50
> To: Juzhe-Zhong
> CC: gcc-patches; richard.sandiford
> Subject: Re: [PATCH] VECT: Enhance SLP of MASK_LEN_GATHER_LOAD[PR111721]
> On Wed, 11 Oct 2023, Juzhe-Zhong wrote:
>  
> > This patch fixes this following FAILs in RISC-V regression:
> > 
> > FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects  scan-tree-dump vect "Loop contains only SLP stmts"
> > FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP stmts"
> > FAIL: gcc.dg/vect/vect-gather-3.c -flto -ffat-lto-objects  scan-tree-dump vect "Loop contains only SLP stmts"
> > FAIL: gcc.dg/vect/vect-gather-3.c scan-tree-dump vect "Loop contains only SLP stmts"
> > 
> > The root cause of these FAIL is that GCC SLP failed on MASK_LEN_GATHER_LOAD.
> > 
> > Since for RVV, we build MASK_LEN_GATHER_LOAD with dummy mask (-1) in tree-vect-patterns.cc if it is same
> > situation as GATHER_LOAD (no conditional mask).
> > 
> > So we make MASK_LEN_GATHER_LOAD leverage the flow of GATHER_LOAD if mask argument is a dummy mask.
> > 
> > gcc/ChangeLog:
> > 
> > * tree-vect-slp.cc (vect_get_operand_map):
> > (vect_build_slp_tree_1):
> > (vect_build_slp_tree_2):
> > * tree-vect-stmts.cc (vectorizable_load):
> > 
> > ---
> >  gcc/tree-vect-slp.cc   | 18 ++++++++++++++++--
> >  gcc/tree-vect-stmts.cc |  4 ++--
> >  2 files changed, 18 insertions(+), 4 deletions(-)
> > 
> > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> > index fa098f9ff4e..712c04ec278 100644
> > --- a/gcc/tree-vect-slp.cc
> > +++ b/gcc/tree-vect-slp.cc
> > @@ -544,6 +544,17 @@ vect_get_operand_map (const gimple *stmt, unsigned char swap = 0)
> >    case IFN_MASK_GATHER_LOAD:
> >      return arg1_arg4_map;
> >  
> > +   case IFN_MASK_LEN_GATHER_LOAD:
> > +     /* In tree-vect-patterns.cc, we will have these 2 situations:
> > +
> > + - Unconditional gather load transforms
> > +   into MASK_LEN_GATHER_LOAD with dummy mask which is -1.
> > +
> > + - Conditional gather load transforms
> > +   into MASK_LEN_GATHER_LOAD with real conditional mask.*/
> > +     return integer_minus_onep (gimple_call_arg (call, 4)) ? arg1_map
> > +   : nullptr;
> > +
> >    case IFN_MASK_STORE:
> >      return arg3_arg2_map;
> >  
> > @@ -1077,7 +1088,8 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char *swap,
> >  
> >    if (cfn == CFN_MASK_LOAD
> >        || cfn == CFN_GATHER_LOAD
> > -       || cfn == CFN_MASK_GATHER_LOAD)
> > +       || cfn == CFN_MASK_GATHER_LOAD
> > +       || cfn == CFN_MASK_LEN_GATHER_LOAD)
> >      ldst_p = true;
> >    else if (cfn == CFN_MASK_STORE)
> >      {
> > @@ -1337,6 +1349,7 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char *swap,
> >    if (DR_IS_READ (STMT_VINFO_DATA_REF (stmt_info))
> >        && rhs_code != CFN_GATHER_LOAD
> >        && rhs_code != CFN_MASK_GATHER_LOAD
> > +       && rhs_code != CFN_MASK_LEN_GATHER_LOAD
> >        /* Not grouped loads are handled as externals for BB
> >  vectorization.  For loop vectorization we can handle
> >  splats the same we handle single element interleaving.  */
> > @@ -1837,7 +1850,8 @@ vect_build_slp_tree_2 (vec_info *vinfo, slp_tree node,
> >        if (gcall *stmt = dyn_cast <gcall *> (stmt_info->stmt))
> >  gcc_assert (gimple_call_internal_p (stmt, IFN_MASK_LOAD)
> >      || gimple_call_internal_p (stmt, IFN_GATHER_LOAD)
> > -     || gimple_call_internal_p (stmt, IFN_MASK_GATHER_LOAD));
> > +     || gimple_call_internal_p (stmt, IFN_MASK_GATHER_LOAD)
> > +     || gimple_call_internal_p (stmt, IFN_MASK_LEN_GATHER_LOAD));
> >        else
> >  {
> >    *max_nunits = this_max_nunits;
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index cd7c1090d88..263acf5d3cd 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -9575,9 +9575,9 @@ vectorizable_load (vec_info *vinfo,
> >  return false;
> >  
> >        mask_index = internal_fn_mask_index (ifn);
> > -      if (mask_index >= 0 && slp_node)
> > +      if (mask_index >= 0 && slp_node && internal_fn_len_index (ifn) < 0)
> >  mask_index = vect_slp_child_index_for_operand (call, mask_index);
> > -      if (mask_index >= 0
> > +      if (mask_index >= 0 && internal_fn_len_index (ifn) < 0
> >    && !vect_check_scalar_mask (vinfo, stmt_info, slp_node, mask_index,
> >        &mask, NULL, &mask_dt, &mask_vectype))
> >  return false;
>  
> You are ignoring the mask argument and here only handle it when the
> IFN doesn't have a _LEN.  This doesn't seem to be forward looking
> to the point where you want to actually handle masked (aka conditional)
> gather.
>  
> Did you check that SLP is actually used to vectorize this?
>  
> Richard.
>  
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

  reply	other threads:[~2023-10-12  9:44 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-11 12:27 Juzhe-Zhong
2023-10-11 12:50 ` Richard Biener
2023-10-11 16:02   ` 钟居哲
2023-10-12  9:44     ` Richard Biener [this message]
2023-10-12  9:50       ` juzhe.zhong
2023-10-12  9:55         ` Richard Biener
2023-10-12 10:18           ` juzhe.zhong
2023-10-12 11:12             ` Richard Biener
2023-10-12 10:36           ` juzhe.zhong
2023-10-12 11:13             ` Richard Biener
2023-10-12 10:57           ` juzhe.zhong
2023-10-12 11:14             ` Richard Biener
2023-10-12 11:24               ` juzhe.zhong
2023-10-13  4:33               ` juzhe.zhong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.YFH.7.77.849.2310120942550.8126@jbgna.fhfr.qr \
    --to=rguenther@suse.de \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=juzhe.zhong@rivai.ai \
    --cc=richard.sandiford@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).