From: "juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai>
To: rguenther <rguenther@suse.de>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
richard.sandiford <richard.sandiford@arm.com>
Subject: Re: Re: [PATCH] VECT: Enhance SLP of MASK_LEN_GATHER_LOAD[PR111721]
Date: Fri, 13 Oct 2023 12:33:45 +0800 [thread overview]
Message-ID: <EFBA158316151AEB+202310131233452246141@rivai.ai> (raw)
In-Reply-To: <nycvar.YFH.7.77.849.2310121114030.8126@jbgna.fhfr.qr>
[-- Attachment #1: Type: text/plain, Size: 9353 bytes --]
Hi, Richi.
As you suggest, I keep MAK_LEN_GATHER_LOAD (...,-1) format and support SLP for that in V3:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632846.html
Thanks.
juzhe.zhong@rivai.ai
From: Richard Biener
Date: 2023-10-12 19:14
To: juzhe.zhong@rivai.ai
CC: gcc-patches; richard.sandiford
Subject: Re: Re: [PATCH] VECT: Enhance SLP of MASK_LEN_GATHER_LOAD[PR111721]
On Thu, 12 Oct 2023, juzhe.zhong@rivai.ai wrote:
> In tree-vect-stmts.cc
>
> vect_check_scalar_mask
>
> Failed here:
>
> /* If the caller is not prepared for adjusting an external/constant
> SLP mask vector type fail. */
> if (slp_node
> && !mask_node
^^^
where's the mask_node?
> && SLP_TREE_DEF_TYPE (mask_node_1) != vect_internal_def)
> {
> if (dump_enabled_p ())
> dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> "SLP mask argument is not vectorized.\n");
> return false;
> }
>
> If we allow vect_constant_def, we should adjust constant SLP mask ? in the caller "vectorizable_load" ?
>
> But I don't know how to adjust that.
>
>
>
> juzhe.zhong@rivai.ai
>
> From: Richard Biener
> Date: 2023-10-12 17:55
> To: juzhe.zhong@rivai.ai
> CC: gcc-patches; richard.sandiford
> Subject: Re: Re: [PATCH] VECT: Enhance SLP of MASK_LEN_GATHER_LOAD[PR111721]
> On Thu, 12 Oct 2023, juzhe.zhong@rivai.ai wrote:
>
> > I tree-vect-slp.cc:
> > vect_get_and_check_slp_defs
> > 711:
> >
> > tree type = TREE_TYPE (oprnd);
> > dt = dts[i];
> > if ((dt == vect_constant_def
> > || dt == vect_external_def)
> > && !GET_MODE_SIZE (vinfo->vector_mode).is_constant ()
> > && (TREE_CODE (type) == BOOLEAN_TYPE
> > || !can_duplicate_and_interleave_p (vinfo, stmts.length (),
> > type)))
> > {
> > if (dump_enabled_p ())
> > dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > "Build SLP failed: invalid type of def "
> > "for variable-length SLP %T\n", oprnd);
> > return -1;
> > }
> >
> > Here mask = -1 is BOOLEAN type in tree-vect-patterns.cc reaching this condition, then SLP failed:
> > Build SLP failed: invalid type of def
>
> I think this can be restricted to vect_external_def, but some history
> might reveal the cases we put this code in for (we should be able to
> materialize all constants?). At least uniform boolean constants
> should be fine.
> >
> >
> >
> > juzhe.zhong@rivai.ai
> >
> > From: Richard Biener
> > Date: 2023-10-12 17:44
> > To: ???
> > CC: gcc-patches; richard.sandiford
> > Subject: Re: Re: [PATCH] VECT: Enhance SLP of MASK_LEN_GATHER_LOAD[PR111721]
> > On Thu, 12 Oct 2023, ??? wrote:
> >
> > > Thanks Richi point it out.
> > >
> > > I found this patch can't make conditional gather load succeed on SLP.
> > >
> > > I am considering change MASK_LEN_GATHER_LOAD in pattern recognization:
> > >
> > > If no condition mask, in tree-vect-patterns.cc, I build MASK_LEN_GATHER_LOAD (ptr, offset, scale, 0) -> 4 arguments same as GATHER_LOAD.
> > > In this situation, MASK_LEN_GATHER_LOAD can resue the GATHER_LOAD SLP flow naturally.
> > >
> > > If has condition mask, in tree-vect-patterns.cc, I build MASK_LEN_GATHER_LOAD (ptr, offset, scale, 0, condition) -> 5 arguments same as MASK_GATHER_LOAD.
> > > In this situation, MASK_LEN_GATHER_LOAD can resue the MASK_GATHER_LOAD SLP flow naturally.
> > >
> > > Is it reasonable ?
> >
> > What's wrong with handling MASK_LEN_GATHER_LOAD with all arguments
> > even when the mask is -1?
> >
> > >
> > > juzhe.zhong@rivai.ai
> > >
> > > From: Richard Biener
> > > Date: 2023-10-11 20:50
> > > To: Juzhe-Zhong
> > > CC: gcc-patches; richard.sandiford
> > > Subject: Re: [PATCH] VECT: Enhance SLP of MASK_LEN_GATHER_LOAD[PR111721]
> > > On Wed, 11 Oct 2023, Juzhe-Zhong wrote:
> > >
> > > > This patch fixes this following FAILs in RISC-V regression:
> > > >
> > > > FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects scan-tree-dump vect "Loop contains only SLP stmts"
> > > > FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP stmts"
> > > > FAIL: gcc.dg/vect/vect-gather-3.c -flto -ffat-lto-objects scan-tree-dump vect "Loop contains only SLP stmts"
> > > > FAIL: gcc.dg/vect/vect-gather-3.c scan-tree-dump vect "Loop contains only SLP stmts"
> > > >
> > > > The root cause of these FAIL is that GCC SLP failed on MASK_LEN_GATHER_LOAD.
> > > >
> > > > Since for RVV, we build MASK_LEN_GATHER_LOAD with dummy mask (-1) in tree-vect-patterns.cc if it is same
> > > > situation as GATHER_LOAD (no conditional mask).
> > > >
> > > > So we make MASK_LEN_GATHER_LOAD leverage the flow of GATHER_LOAD if mask argument is a dummy mask.
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * tree-vect-slp.cc (vect_get_operand_map):
> > > > (vect_build_slp_tree_1):
> > > > (vect_build_slp_tree_2):
> > > > * tree-vect-stmts.cc (vectorizable_load):
> > > >
> > > > ---
> > > > gcc/tree-vect-slp.cc | 18 ++++++++++++++++--
> > > > gcc/tree-vect-stmts.cc | 4 ++--
> > > > 2 files changed, 18 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> > > > index fa098f9ff4e..712c04ec278 100644
> > > > --- a/gcc/tree-vect-slp.cc
> > > > +++ b/gcc/tree-vect-slp.cc
> > > > @@ -544,6 +544,17 @@ vect_get_operand_map (const gimple *stmt, unsigned char swap = 0)
> > > > case IFN_MASK_GATHER_LOAD:
> > > > return arg1_arg4_map;
> > > >
> > > > + case IFN_MASK_LEN_GATHER_LOAD:
> > > > + /* In tree-vect-patterns.cc, we will have these 2 situations:
> > > > +
> > > > + - Unconditional gather load transforms
> > > > + into MASK_LEN_GATHER_LOAD with dummy mask which is -1.
> > > > +
> > > > + - Conditional gather load transforms
> > > > + into MASK_LEN_GATHER_LOAD with real conditional mask.*/
> > > > + return integer_minus_onep (gimple_call_arg (call, 4)) ? arg1_map
> > > > + : nullptr;
> > > > +
> > > > case IFN_MASK_STORE:
> > > > return arg3_arg2_map;
> > > >
> > > > @@ -1077,7 +1088,8 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char *swap,
> > > >
> > > > if (cfn == CFN_MASK_LOAD
> > > > || cfn == CFN_GATHER_LOAD
> > > > - || cfn == CFN_MASK_GATHER_LOAD)
> > > > + || cfn == CFN_MASK_GATHER_LOAD
> > > > + || cfn == CFN_MASK_LEN_GATHER_LOAD)
> > > > ldst_p = true;
> > > > else if (cfn == CFN_MASK_STORE)
> > > > {
> > > > @@ -1337,6 +1349,7 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char *swap,
> > > > if (DR_IS_READ (STMT_VINFO_DATA_REF (stmt_info))
> > > > && rhs_code != CFN_GATHER_LOAD
> > > > && rhs_code != CFN_MASK_GATHER_LOAD
> > > > + && rhs_code != CFN_MASK_LEN_GATHER_LOAD
> > > > /* Not grouped loads are handled as externals for BB
> > > > vectorization. For loop vectorization we can handle
> > > > splats the same we handle single element interleaving. */
> > > > @@ -1837,7 +1850,8 @@ vect_build_slp_tree_2 (vec_info *vinfo, slp_tree node,
> > > > if (gcall *stmt = dyn_cast <gcall *> (stmt_info->stmt))
> > > > gcc_assert (gimple_call_internal_p (stmt, IFN_MASK_LOAD)
> > > > || gimple_call_internal_p (stmt, IFN_GATHER_LOAD)
> > > > - || gimple_call_internal_p (stmt, IFN_MASK_GATHER_LOAD));
> > > > + || gimple_call_internal_p (stmt, IFN_MASK_GATHER_LOAD)
> > > > + || gimple_call_internal_p (stmt, IFN_MASK_LEN_GATHER_LOAD));
> > > > else
> > > > {
> > > > *max_nunits = this_max_nunits;
> > > > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > > > index cd7c1090d88..263acf5d3cd 100644
> > > > --- a/gcc/tree-vect-stmts.cc
> > > > +++ b/gcc/tree-vect-stmts.cc
> > > > @@ -9575,9 +9575,9 @@ vectorizable_load (vec_info *vinfo,
> > > > return false;
> > > >
> > > > mask_index = internal_fn_mask_index (ifn);
> > > > - if (mask_index >= 0 && slp_node)
> > > > + if (mask_index >= 0 && slp_node && internal_fn_len_index (ifn) < 0)
> > > > mask_index = vect_slp_child_index_for_operand (call, mask_index);
> > > > - if (mask_index >= 0
> > > > + if (mask_index >= 0 && internal_fn_len_index (ifn) < 0
> > > > && !vect_check_scalar_mask (vinfo, stmt_info, slp_node, mask_index,
> > > > &mask, NULL, &mask_dt, &mask_vectype))
> > > > return false;
> > >
> > > You are ignoring the mask argument and here only handle it when the
> > > IFN doesn't have a _LEN. This doesn't seem to be forward looking
> > > to the point where you want to actually handle masked (aka conditional)
> > > gather.
> > >
> > > Did you check that SLP is actually used to vectorize this?
> > >
> > > Richard.
> > >
> > >
> >
> >
>
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
prev parent reply other threads:[~2023-10-13 4:33 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-11 12:27 Juzhe-Zhong
2023-10-11 12:50 ` Richard Biener
2023-10-11 16:02 ` 钟居哲
2023-10-12 9:44 ` Richard Biener
2023-10-12 9:50 ` juzhe.zhong
2023-10-12 9:55 ` Richard Biener
2023-10-12 10:18 ` juzhe.zhong
2023-10-12 11:12 ` Richard Biener
2023-10-12 10:36 ` juzhe.zhong
2023-10-12 11:13 ` Richard Biener
2023-10-12 10:57 ` juzhe.zhong
2023-10-12 11:14 ` Richard Biener
2023-10-12 11:24 ` juzhe.zhong
2023-10-13 4:33 ` juzhe.zhong [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=EFBA158316151AEB+202310131233452246141@rivai.ai \
--to=juzhe.zhong@rivai.ai \
--cc=gcc-patches@gcc.gnu.org \
--cc=rguenther@suse.de \
--cc=richard.sandiford@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).