* [PATCH] VECT: Support CALL vectorization for COND_LEN_* @ 2023-07-24 11:46 juzhe.zhong 2023-07-24 14:33 ` Richard Biener 0 siblings, 1 reply; 9+ messages in thread From: juzhe.zhong @ 2023-07-24 11:46 UTC (permalink / raw) To: gcc-patches; +Cc: richard.sandiford, rguenther, Ju-Zhe Zhong From: Ju-Zhe Zhong <juzhe.zhong@rivai.ai> Hi, Richard and Richi. This patch supports CALL vectorization by COND_LEN_*. Consider this following case: void foo (float * __restrict a, float * __restrict b, int * __restrict cond, int n) { for (int i = 0; i < n; i++) if (cond[i]) a[i] = b[i] + a[i]; } Output of RISC-V (32-bits) gcc (trunk) (Compiler #3) <source>:5:21: missed: couldn't vectorize loop <source>:5:21: missed: not vectorized: control flow in loop. ARM SVE: ... mask__27.10_51 = vect__4.9_49 != { 0, ... }; ... vec_mask_and_55 = loop_mask_49 & mask__27.10_51; ... vect__9.17_62 = .COND_ADD (vec_mask_and_55, vect__6.13_56, vect__8.16_60, vect__6.13_56); For RVV, we want IR as follows: ... _68 = .SELECT_VL (ivtmp_66, POLY_INT_CST [4, 4]); ... mask__27.10_51 = vect__4.9_49 != { 0, ... }; ... vect__9.17_60 = .COND_LEN_ADD (mask__27.10_51, vect__6.13_55, vect__8.16_59, vect__6.13_55, _68, 0); ... Both len and mask of COND_LEN_ADD are real not dummy. gcc/ChangeLog: * tree-if-conv.cc (ifcvt_can_predicate): Enable ifcvt for COND_LEN_*. * tree-vect-stmts.cc (vectorizable_internal_function): Apply COND_LEN_*. (vectorizable_call): Ditto. 
--- gcc/tree-if-conv.cc | 7 +++- gcc/tree-vect-stmts.cc | 84 +++++++++++++++++++++++++++++++++++++++--- 2 files changed, 83 insertions(+), 8 deletions(-) diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc index 799f071965e..2e17658b9ed 100644 --- a/gcc/tree-if-conv.cc +++ b/gcc/tree-if-conv.cc @@ -1010,8 +1010,11 @@ ifcvt_can_predicate (gimple *stmt) if (!types_compatible_p (lhs_type, rhs_type)) return false; internal_fn cond_fn = get_conditional_internal_fn (code); - return (cond_fn != IFN_LAST - && vectorized_internal_fn_supported_p (cond_fn, lhs_type)); + internal_fn cond_len_fn = get_conditional_len_internal_fn (code); + return ((cond_fn != IFN_LAST + && vectorized_internal_fn_supported_p (cond_fn, lhs_type)) + || (cond_len_fn != IFN_LAST + && vectorized_internal_fn_supported_p (cond_len_fn, lhs_type))); } /* Return true when STMT is if-convertible. diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index ed28fbdced3..15401ddb682 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -1543,10 +1543,27 @@ vectorizable_internal_function (combined_fn cfn, tree fndecl, tree vectype_out, tree vectype_in) { internal_fn ifn; + internal_fn len_ifn = IFN_LAST; if (internal_fn_p (cfn)) - ifn = as_internal_fn (cfn); + { + ifn = as_internal_fn (cfn); + tree_code code = conditional_internal_fn_code (ifn); + len_ifn = get_conditional_len_internal_fn (code); + } else ifn = associated_internal_fn (fndecl); + if (len_ifn != IFN_LAST && direct_internal_fn_p (len_ifn)) + { + const direct_internal_fn_info &info = direct_internal_fn (len_ifn); + if (info.vectorizable) + { + tree type0 = (info.type0 < 0 ? vectype_out : vectype_in); + tree type1 = (info.type1 < 0 ? 
vectype_out : vectype_in); + if (direct_internal_fn_supported_p (len_ifn, tree_pair (type0, type1), + OPTIMIZE_FOR_SPEED)) + return len_ifn; + } + } if (ifn != IFN_LAST && direct_internal_fn_p (ifn)) { const direct_internal_fn_info &info = direct_internal_fn (ifn); @@ -3538,8 +3555,10 @@ vectorizable_call (vec_info *vinfo, gcc_assert (ncopies >= 1); int reduc_idx = STMT_VINFO_REDUC_IDX (stmt_info); + int len_index = internal_fn_len_index (ifn); internal_fn cond_fn = get_conditional_internal_fn (ifn); vec_loop_masks *masks = (loop_vinfo ? &LOOP_VINFO_MASKS (loop_vinfo) : NULL); + vec_loop_lens *lens = (loop_vinfo ? &LOOP_VINFO_LENS (loop_vinfo) : NULL); if (!vec_stmt) /* transformation not required. */ { if (slp_node) @@ -3585,8 +3604,12 @@ vectorizable_call (vec_info *vinfo, tree scalar_mask = NULL_TREE; if (mask_opno >= 0) scalar_mask = gimple_call_arg (stmt_info->stmt, mask_opno); - vect_record_loop_mask (loop_vinfo, masks, nvectors, - vectype_out, scalar_mask); + if (len_index >= 0) + vect_record_loop_len (loop_vinfo, lens, nvectors, vectype_out, + 1); + else + vect_record_loop_mask (loop_vinfo, masks, nvectors, vectype_out, + scalar_mask); } } return true; @@ -3602,8 +3625,16 @@ vectorizable_call (vec_info *vinfo, vec_dest = vect_create_destination_var (scalar_dest, vectype_out); bool masked_loop_p = loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo); + bool len_loop_p = loop_vinfo && LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo); unsigned int vect_nargs = nargs; - if (masked_loop_p && reduc_idx >= 0) + if (len_index) + { + /* COND_ADD --> COND_LEN_ADD with 2 more args (LEN + BIAS). + FIXME: We don't support FMAX --> COND_LEN_FMAX yet which + needs 4 more args (MASK + ELSE + LEN + BIAS). 
*/ + vect_nargs += 2; + } + else if (masked_loop_p && reduc_idx >= 0) { ifn = cond_fn; vect_nargs += 2; @@ -3670,7 +3701,29 @@ vectorizable_call (vec_info *vinfo, } else { - if (mask_opno >= 0 && masked_loop_p) + if (len_index) + { + tree len, bias; + if (len_loop_p) + len + = vect_get_loop_len (loop_vinfo, gsi, lens, + ncopies, vectype_out, j, 1); + else + { + /* Dummy LEN. */ + tree iv_type + = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo); + len + = build_int_cst (iv_type, TYPE_VECTOR_SUBPARTS ( + vectype_out)); + } + signed char biasval + = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo); + bias = build_int_cst (intQI_type_node, biasval); + vargs[varg++] = len; + vargs[varg++] = bias; + } + else if (mask_opno >= 0 && masked_loop_p) { unsigned int vec_num = vec_oprnds0.length (); /* Always true for SLP. */ @@ -3718,7 +3771,26 @@ vectorizable_call (vec_info *vinfo, if (masked_loop_p && reduc_idx >= 0) vargs[varg++] = vargs[reduc_idx + 1]; - if (mask_opno >= 0 && masked_loop_p) + if (len_index) + { + tree len, bias; + if (len_loop_p) + len = vect_get_loop_len (loop_vinfo, gsi, lens, ncopies, + vectype_out, j, 1); + else + { + /* Dummy LEN. */ + tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo); + len = build_int_cst (iv_type, + TYPE_VECTOR_SUBPARTS (vectype_out)); + } + signed char biasval + = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo); + bias = build_int_cst (intQI_type_node, biasval); + vargs[varg++] = len; + vargs[varg++] = bias; + } + else if (mask_opno >= 0 && masked_loop_p) { tree mask = vect_get_loop_mask (loop_vinfo, gsi, masks, ncopies, vectype_out, j); -- 2.36.3 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_* 2023-07-24 11:46 [PATCH] VECT: Support CALL vectorization for COND_LEN_* juzhe.zhong @ 2023-07-24 14:33 ` Richard Biener 2023-07-24 23:16 ` 钟居哲 0 siblings, 1 reply; 9+ messages in thread From: Richard Biener @ 2023-07-24 14:33 UTC (permalink / raw) To: Ju-Zhe Zhong; +Cc: gcc-patches, richard.sandiford On Mon, 24 Jul 2023, juzhe.zhong@rivai.ai wrote: > From: Ju-Zhe Zhong <juzhe.zhong@rivai.ai> > > Hi, Richard and Richi. > > This patch supports CALL vectorization by COND_LEN_*. > > Consider this following case: > void foo (float * __restrict a, float * __restrict b, int * __restrict cond, int n) > { > for (int i = 0; i < n; i++) > if (cond[i]) > a[i] = b[i] + a[i]; > } > > > Output of RISC-V (32-bits) gcc (trunk) (Compiler #3) > <source>:5:21: missed: couldn't vectorize loop > <source>:5:21: missed: not vectorized: control flow in loop. > > ARM SVE: > > ... > mask__27.10_51 = vect__4.9_49 != { 0, ... }; > ... > vec_mask_and_55 = loop_mask_49 & mask__27.10_51; > ... > vect__9.17_62 = .COND_ADD (vec_mask_and_55, vect__6.13_56, vect__8.16_60, vect__6.13_56); > > For RVV, we want IR as follows: > > ... > _68 = .SELECT_VL (ivtmp_66, POLY_INT_CST [4, 4]); > ... > mask__27.10_51 = vect__4.9_49 != { 0, ... }; > ... > vect__9.17_60 = .COND_LEN_ADD (mask__27.10_51, vect__6.13_55, vect__8.16_59, vect__6.13_55, _68, 0); > ... > > Both len and mask of COND_LEN_ADD are real not dummy. > > gcc/ChangeLog: > > * tree-if-conv.cc (ifcvt_can_predicate): Enable ifcvt for COND_LEN_*. > * tree-vect-stmts.cc (vectorizable_internal_function): Apply COND_LEN_*. > (vectorizable_call): Ditto. 
> > --- > gcc/tree-if-conv.cc | 7 +++- > gcc/tree-vect-stmts.cc | 84 +++++++++++++++++++++++++++++++++++++++--- > 2 files changed, 83 insertions(+), 8 deletions(-) > > diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc > index 799f071965e..2e17658b9ed 100644 > --- a/gcc/tree-if-conv.cc > +++ b/gcc/tree-if-conv.cc > @@ -1010,8 +1010,11 @@ ifcvt_can_predicate (gimple *stmt) > if (!types_compatible_p (lhs_type, rhs_type)) > return false; > internal_fn cond_fn = get_conditional_internal_fn (code); > - return (cond_fn != IFN_LAST > - && vectorized_internal_fn_supported_p (cond_fn, lhs_type)); > + internal_fn cond_len_fn = get_conditional_len_internal_fn (code); > + return ((cond_fn != IFN_LAST > + && vectorized_internal_fn_supported_p (cond_fn, lhs_type)) > + || (cond_len_fn != IFN_LAST > + && vectorized_internal_fn_supported_p (cond_len_fn, lhs_type))); > } > > /* Return true when STMT is if-convertible. > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > index ed28fbdced3..15401ddb682 100644 > --- a/gcc/tree-vect-stmts.cc > +++ b/gcc/tree-vect-stmts.cc > @@ -1543,10 +1543,27 @@ vectorizable_internal_function (combined_fn cfn, tree fndecl, > tree vectype_out, tree vectype_in) > { > internal_fn ifn; > + internal_fn len_ifn = IFN_LAST; > if (internal_fn_p (cfn)) > - ifn = as_internal_fn (cfn); > + { > + ifn = as_internal_fn (cfn); > + tree_code code = conditional_internal_fn_code (ifn); > + len_ifn = get_conditional_len_internal_fn (code); > + } > else > ifn = associated_internal_fn (fndecl); This function doesn't seem to care about conditional vectorization support, so why are you changing it? > + if (len_ifn != IFN_LAST && direct_internal_fn_p (len_ifn)) > + { > + const direct_internal_fn_info &info = direct_internal_fn (len_ifn); > + if (info.vectorizable) > + { > + tree type0 = (info.type0 < 0 ? vectype_out : vectype_in); > + tree type1 = (info.type1 < 0 ? 
vectype_out : vectype_in); > + if (direct_internal_fn_supported_p (len_ifn, tree_pair (type0, type1), > + OPTIMIZE_FOR_SPEED)) > + return len_ifn; > + } > + } > if (ifn != IFN_LAST && direct_internal_fn_p (ifn)) > { > const direct_internal_fn_info &info = direct_internal_fn (ifn); > @@ -3538,8 +3555,10 @@ vectorizable_call (vec_info *vinfo, > gcc_assert (ncopies >= 1); > > int reduc_idx = STMT_VINFO_REDUC_IDX (stmt_info); > + int len_index = internal_fn_len_index (ifn); Note traditionally non-LEN fns would return -1 here, so ... > internal_fn cond_fn = get_conditional_internal_fn (ifn); > vec_loop_masks *masks = (loop_vinfo ? &LOOP_VINFO_MASKS (loop_vinfo) : NULL); > + vec_loop_lens *lens = (loop_vinfo ? &LOOP_VINFO_LENS (loop_vinfo) : NULL); > if (!vec_stmt) /* transformation not required. */ > { > if (slp_node) > @@ -3585,8 +3604,12 @@ vectorizable_call (vec_info *vinfo, instead there's code here that checks for the .COND_xxx case so you should amend it for the len case as well? > tree scalar_mask = NULL_TREE; > if (mask_opno >= 0) > scalar_mask = gimple_call_arg (stmt_info->stmt, mask_opno); > - vect_record_loop_mask (loop_vinfo, masks, nvectors, > - vectype_out, scalar_mask); > + if (len_index >= 0) > + vect_record_loop_len (loop_vinfo, lens, nvectors, vectype_out, > + 1); > + else > + vect_record_loop_mask (loop_vinfo, masks, nvectors, vectype_out, > + scalar_mask); > } > } > return true; > @@ -3602,8 +3625,16 @@ vectorizable_call (vec_info *vinfo, > vec_dest = vect_create_destination_var (scalar_dest, vectype_out); > > bool masked_loop_p = loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo); > + bool len_loop_p = loop_vinfo && LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo); > unsigned int vect_nargs = nargs; > - if (masked_loop_p && reduc_idx >= 0) > + if (len_index) what about reduc_idx? Why check len_index and not len_loop_p? ... checking len_index is going to be true for -1, you probably would want len_index != -1 then? 
> + { > + /* COND_ADD --> COND_LEN_ADD with 2 more args (LEN + BIAS). > + FIXME: We don't support FMAX --> COND_LEN_FMAX yet which > + needs 4 more args (MASK + ELSE + LEN + BIAS). */ > + vect_nargs += 2; > + } > + else if (masked_loop_p && reduc_idx >= 0) > { > ifn = cond_fn; > vect_nargs += 2; > @@ -3670,7 +3701,29 @@ vectorizable_call (vec_info *vinfo, > } > else > { > - if (mask_opno >= 0 && masked_loop_p) > + if (len_index) see above > + { > + tree len, bias; > + if (len_loop_p) > + len > + = vect_get_loop_len (loop_vinfo, gsi, lens, > + ncopies, vectype_out, j, 1); > + else > + { > + /* Dummy LEN. */ > + tree iv_type > + = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo); > + len > + = build_int_cst (iv_type, TYPE_VECTOR_SUBPARTS ( > + vectype_out)); > + } > + signed char biasval > + = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo); > + bias = build_int_cst (intQI_type_node, biasval); > + vargs[varg++] = len; > + vargs[varg++] = bias; > + } > + else if (mask_opno >= 0 && masked_loop_p) > { > unsigned int vec_num = vec_oprnds0.length (); > /* Always true for SLP. */ > @@ -3718,7 +3771,26 @@ vectorizable_call (vec_info *vinfo, > if (masked_loop_p && reduc_idx >= 0) > vargs[varg++] = vargs[reduc_idx + 1]; > > - if (mask_opno >= 0 && masked_loop_p) > + if (len_index) > + { > + tree len, bias; > + if (len_loop_p) > + len = vect_get_loop_len (loop_vinfo, gsi, lens, ncopies, > + vectype_out, j, 1); > + else > + { > + /* Dummy LEN. */ > + tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo); > + len = build_int_cst (iv_type, > + TYPE_VECTOR_SUBPARTS (vectype_out)); > + } > + signed char biasval > + = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo); > + bias = build_int_cst (intQI_type_node, biasval); > + vargs[varg++] = len; > + vargs[varg++] = bias; You are not actually using len_index to place the arguments. See how mask_opno is used. You are producing patches too quickly, please double-check what you did before posting for review. Thanks, Richard. 
> + } > + else if (mask_opno >= 0 && masked_loop_p) > { > tree mask = vect_get_loop_mask (loop_vinfo, gsi, masks, ncopies, > vectype_out, j); > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_* 2023-07-24 14:33 ` Richard Biener @ 2023-07-24 23:16 ` 钟居哲 2023-07-25 9:41 ` Richard Sandiford 0 siblings, 1 reply; 9+ messages in thread From: 钟居哲 @ 2023-07-24 23:16 UTC (permalink / raw) To: rguenther; +Cc: gcc-patches, richard.sandiford [-- Attachment #1: Type: text/plain, Size: 12384 bytes --] Hi, Richi. Thank you so much for the review. >> This function doesn't seem to care about conditional vectorization >> support, so why are you changing it? I debugged and analyzed the code here: Breakpoint 1, vectorizable_call (vinfo=0x3d358d0, stmt_info=0x3dcc820, gsi=0x0, vec_stmt=0x0, slp_node=0x0, cost_vec=0x7fffffffc668) at ../../../riscv-gcc/gcc/tree-vect-stmts.cc:3485 3485 ifn = vectorizable_internal_function (cfn, callee, vectype_out, (gdb) p cfn $1 = CFN_COND_ADD (gdb) call print_gimple_stmt(stdout,stmt,0,0) _9 = .COND_ADD (_27, _6, _8, _6); When a target like RISC-V doesn't support COND_* (we only support COND_LEN_*, not COND_*), ifn will be IFN_LAST. Also, the following code doesn't modify ifn until "internal_fn cond_fn = get_conditional_internal_fn (ifn);". When ifn == IFN_LAST, cond_fn will be IFN_LAST too. So, I extended "vectorizable_internal_function" to return COND_LEN_*. Am I going in the wrong direction with this debugging and analysis? >> what about reduc_indx? Why check len_indes and not len_loop_p? >> Note traditionally non-LEN fns would return -1 here, so ... >> instead there's code here that checks for the .COND_xxx case so you should >> ament it for the len case as well? >> You are not actually using len_index to place the arguments. See >> how mask_opno is used. I understand this patch's code looks quite strange and unnatural. Let me share the story of why I am doing this: 1. At the beginning, I wanted to write the code naturally, very similar to the current code. So it's natural to write the code like this: if (mask_opno >= 0 && len_loop_p) { tree len = vect_get_loop_len tree bias = .... 
} However, we have the following case: void foo1 (float * __restrict a, float * __restrict b, int * __restrict cond, int n) { for (int i = 0; i < 8; i++) if (cond[i]) a[i] = b[i] + a[i]; } Note that "i < 8" here means niters = 8, compiled with a known VF (fixed-length vectors) to vectorize the code. Then len_loop_p will be false. If I don't handle this case, the code will generate the IR: COND_LEN_ADD (mask, op1, op2, op3).... missing the LEN && BIAS arguments, which causes an ICE. 2. So I use "int len_index = internal_fn_len_index (ifn);": if ifn is supported as COND_LEN_*, len_index >= 0; otherwise, it will be -1. So I prepare the LEN && BIAS arguments according to len_index. So I change the code into: if (len_index >= 0) { tree len = vect_get_loop_len tree bias = .... } 3. Overall, I want to normalize COND_* into COND_LEN_*; when loop_len_p is false, I create a dummy LEN for it. It makes the code strange. 4. I have another approach/prepared patch which makes the code more natural: enable COND_xxxx in the RISC-V port, then keep "tree-if-conv.cc" unchanged and "vectorizable_internal_function" unchanged, and simply write the code like this: if (mask_opno >= 0 && len_loop_p) { tree len = vect_get_loop_len tree bias = .... } and when len_loop_p is true, I change to use "COND_LEN_xxx", otherwise reuse the original flow. Are you suggesting I do this? Meaning, enable COND_xxx in the RISC-V port (the patterns are duplicates, I think, if we have COND_LEN_xxx)? 
I think maybe next time I should ask you first about the solution before I send the patch. Looking forward to your suggestions. Thanks. juzhe.zhong@rivai.ai From: Richard Biener Date: 2023-07-24 22:33 To: Ju-Zhe Zhong CC: gcc-patches; richard.sandiford Subject: Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_* On Mon, 24 Jul 2023, juzhe.zhong@rivai.ai wrote: > From: Ju-Zhe Zhong <juzhe.zhong@rivai.ai> > > Hi, Richard and Richi. > > This patch supports CALL vectorization by COND_LEN_*. > > Consider this following case: > void foo (float * __restrict a, float * __restrict b, int * __restrict cond, int n) > { > for (int i = 0; i < n; i++) > if (cond[i]) > a[i] = b[i] + a[i]; > } > > > Output of RISC-V (32-bits) gcc (trunk) (Compiler #3) > <source>:5:21: missed: couldn't vectorize loop > <source>:5:21: missed: not vectorized: control flow in loop. > > ARM SVE: > > ... > mask__27.10_51 = vect__4.9_49 != { 0, ... }; > ... > vec_mask_and_55 = loop_mask_49 & mask__27.10_51; > ... > vect__9.17_62 = .COND_ADD (vec_mask_and_55, vect__6.13_56, vect__8.16_60, vect__6.13_56); > > For RVV, we want IR as follows: > > ... > _68 = .SELECT_VL (ivtmp_66, POLY_INT_CST [4, 4]); > ... > mask__27.10_51 = vect__4.9_49 != { 0, ... }; > ... > vect__9.17_60 = .COND_LEN_ADD (mask__27.10_51, vect__6.13_55, vect__8.16_59, vect__6.13_55, _68, 0); > ... > > Both len and mask of COND_LEN_ADD are real not dummy. > > gcc/ChangeLog: > > * tree-if-conv.cc (ifcvt_can_predicate): Enable ifcvt for COND_LEN_*. > * tree-vect-stmts.cc (vectorizable_internal_function): Apply COND_LEN_*. > (vectorizable_call): Ditto. 
> > --- > gcc/tree-if-conv.cc | 7 +++- > gcc/tree-vect-stmts.cc | 84 +++++++++++++++++++++++++++++++++++++++--- > 2 files changed, 83 insertions(+), 8 deletions(-) > > diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc > index 799f071965e..2e17658b9ed 100644 > --- a/gcc/tree-if-conv.cc > +++ b/gcc/tree-if-conv.cc > @@ -1010,8 +1010,11 @@ ifcvt_can_predicate (gimple *stmt) > if (!types_compatible_p (lhs_type, rhs_type)) > return false; > internal_fn cond_fn = get_conditional_internal_fn (code); > - return (cond_fn != IFN_LAST > - && vectorized_internal_fn_supported_p (cond_fn, lhs_type)); > + internal_fn cond_len_fn = get_conditional_len_internal_fn (code); > + return ((cond_fn != IFN_LAST > + && vectorized_internal_fn_supported_p (cond_fn, lhs_type)) > + || (cond_len_fn != IFN_LAST > + && vectorized_internal_fn_supported_p (cond_len_fn, lhs_type))); > } > > /* Return true when STMT is if-convertible. > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > index ed28fbdced3..15401ddb682 100644 > --- a/gcc/tree-vect-stmts.cc > +++ b/gcc/tree-vect-stmts.cc > @@ -1543,10 +1543,27 @@ vectorizable_internal_function (combined_fn cfn, tree fndecl, > tree vectype_out, tree vectype_in) > { > internal_fn ifn; > + internal_fn len_ifn = IFN_LAST; > if (internal_fn_p (cfn)) > - ifn = as_internal_fn (cfn); > + { > + ifn = as_internal_fn (cfn); > + tree_code code = conditional_internal_fn_code (ifn); > + len_ifn = get_conditional_len_internal_fn (code); > + } > else > ifn = associated_internal_fn (fndecl); This function doesn't seem to care about conditional vectorization support, so why are you changing it? > + if (len_ifn != IFN_LAST && direct_internal_fn_p (len_ifn)) > + { > + const direct_internal_fn_info &info = direct_internal_fn (len_ifn); > + if (info.vectorizable) > + { > + tree type0 = (info.type0 < 0 ? vectype_out : vectype_in); > + tree type1 = (info.type1 < 0 ? 
vectype_out : vectype_in); > + if (direct_internal_fn_supported_p (len_ifn, tree_pair (type0, type1), > + OPTIMIZE_FOR_SPEED)) > + return len_ifn; > + } > + } > if (ifn != IFN_LAST && direct_internal_fn_p (ifn)) > { > const direct_internal_fn_info &info = direct_internal_fn (ifn); > @@ -3538,8 +3555,10 @@ vectorizable_call (vec_info *vinfo, > gcc_assert (ncopies >= 1); > > int reduc_idx = STMT_VINFO_REDUC_IDX (stmt_info); > + int len_index = internal_fn_len_index (ifn); Note traditionally non-LEN fns would return -1 here, so ... > internal_fn cond_fn = get_conditional_internal_fn (ifn); > vec_loop_masks *masks = (loop_vinfo ? &LOOP_VINFO_MASKS (loop_vinfo) : NULL); > + vec_loop_lens *lens = (loop_vinfo ? &LOOP_VINFO_LENS (loop_vinfo) : NULL); > if (!vec_stmt) /* transformation not required. */ > { > if (slp_node) > @@ -3585,8 +3604,12 @@ vectorizable_call (vec_info *vinfo, instead there's code here that checks for the .COND_xxx case so you should ament it for the len case as well? > tree scalar_mask = NULL_TREE; > if (mask_opno >= 0) > scalar_mask = gimple_call_arg (stmt_info->stmt, mask_opno); > - vect_record_loop_mask (loop_vinfo, masks, nvectors, > - vectype_out, scalar_mask); > + if (len_index >= 0) > + vect_record_loop_len (loop_vinfo, lens, nvectors, vectype_out, > + 1); > + else > + vect_record_loop_mask (loop_vinfo, masks, nvectors, vectype_out, > + scalar_mask); > } > } > return true; > @@ -3602,8 +3625,16 @@ vectorizable_call (vec_info *vinfo, > vec_dest = vect_create_destination_var (scalar_dest, vectype_out); > > bool masked_loop_p = loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo); > + bool len_loop_p = loop_vinfo && LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo); > unsigned int vect_nargs = nargs; > - if (masked_loop_p && reduc_idx >= 0) > + if (len_index) what about reduc_indx? Why check len_indes and not len_loop_p? ... checking len_index is going to be true for -1, you probably would want len_index != -1 then? 
> + { > + /* COND_ADD --> COND_LEN_ADD with 2 more args (LEN + BIAS). > + FIXME: We don't support FMAX --> COND_LEN_FMAX yet which > + needs 4 more args (MASK + ELSE + LEN + BIAS). */ > + vect_nargs += 2; > + } > + else if (masked_loop_p && reduc_idx >= 0) > { > ifn = cond_fn; > vect_nargs += 2; > @@ -3670,7 +3701,29 @@ vectorizable_call (vec_info *vinfo, > } > else > { > - if (mask_opno >= 0 && masked_loop_p) > + if (len_index) see above > + { > + tree len, bias; > + if (len_loop_p) > + len > + = vect_get_loop_len (loop_vinfo, gsi, lens, > + ncopies, vectype_out, j, 1); > + else > + { > + /* Dummy LEN. */ > + tree iv_type > + = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo); > + len > + = build_int_cst (iv_type, TYPE_VECTOR_SUBPARTS ( > + vectype_out)); > + } > + signed char biasval > + = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo); > + bias = build_int_cst (intQI_type_node, biasval); > + vargs[varg++] = len; > + vargs[varg++] = bias; > + } > + else if (mask_opno >= 0 && masked_loop_p) > { > unsigned int vec_num = vec_oprnds0.length (); > /* Always true for SLP. */ > @@ -3718,7 +3771,26 @@ vectorizable_call (vec_info *vinfo, > if (masked_loop_p && reduc_idx >= 0) > vargs[varg++] = vargs[reduc_idx + 1]; > > - if (mask_opno >= 0 && masked_loop_p) > + if (len_index) > + { > + tree len, bias; > + if (len_loop_p) > + len = vect_get_loop_len (loop_vinfo, gsi, lens, ncopies, > + vectype_out, j, 1); > + else > + { > + /* Dummy LEN. */ > + tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo); > + len = build_int_cst (iv_type, > + TYPE_VECTOR_SUBPARTS (vectype_out)); > + } > + signed char biasval > + = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo); > + bias = build_int_cst (intQI_type_node, biasval); > + vargs[varg++] = len; > + vargs[varg++] = bias; You are not actually using len_index to place the arguments. See how mask_opno is used. You are producing patches too quickly, please double-check what you did before posting for review. Thanks, Richard. 
> + } > + else if (mask_opno >= 0 && masked_loop_p) > { > tree mask = vect_get_loop_mask (loop_vinfo, gsi, masks, ncopies, > vectype_out, j); > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_* 2023-07-24 23:16 ` 钟居哲 @ 2023-07-25 9:41 ` Richard Sandiford 2023-07-25 10:17 ` juzhe.zhong 2023-07-31 9:30 ` Richard Biener 0 siblings, 2 replies; 9+ messages in thread From: Richard Sandiford @ 2023-07-25 9:41 UTC (permalink / raw) To: 钟居哲; +Cc: rguenther, gcc-patches 钟居哲 <juzhe.zhong@rivai.ai> writes: > Hi, Richi. Thank you so much for review. > >>> This function doesn't seem to care about conditional vectorization >>> support, so why are you changing it? > > I debug and analyze the code here: > > Breakpoint 1, vectorizable_call (vinfo=0x3d358d0, stmt_info=0x3dcc820, gsi=0x0, vec_stmt=0x0, slp_node=0x0, cost_vec=0x7fffffffc668) at ../../../riscv-gcc/gcc/tree-vect-stmts.cc:3485 > 3485 ifn = vectorizable_internal_function (cfn, callee, vectype_out, > (gdb) p cfn > $1 = CFN_COND_ADD > (gdb) call print_gimple_stmt(stdout,stmt,0,0) > _9 = .COND_ADD (_27, _6, _8, _6); > > When target like RISC-V didnt' support COND_* (we only support COND_LEN_*, no support COND_*), > ifn will be IFN_LAST. > Also, the following code doesn't modify ifn until > "internal_fn cond_fn = get_conditional_internal_fn (ifn);" > When ifn == IFN_LAST, cond_fn will IFN_LAST too. > > So, I extend "vectorizable_internal_function" to return COND_LEN_*. > > Am I going into wrong direction on debug && analysis. The main difference between IFN_COND_ADD and IFN_COND_LEN_ADD is that IFN_COND_ADD makes logical sense for scalars, whereas IFN_COND_LEN_ADD only makes sense for vectors. So I think if-conversion has to create IFN_COND_ADD rather than IFN_COND_LEN_ADD even for targets that only support IFN_COND_LEN_ADD. (Which is what the patch does.) IFN_COND_LEN_ADD is really a generalisation of IFN_COND_ADD. In a sense, every target that supports IFN_COND_LEN_ADD also supports IFN_COND_ADD. All we need is two extra arguments. And the patch does seem to take that approach (by adding dummy length arguments). 
So I think there are at least four ways we could go: (1) RVV implements cond_* optabs as expanders. RVV therefore supports both IFN_COND_ADD and IFN_COND_LEN_ADD. No dummy length arguments are needed at the gimple level. (2) Target-independent code automatically provides cond_* optabs based on cond_len_* optabs. Otherwise the same as (1). (3) A gimple pass lowers IFN_COND_ADD to IFN_COND_LEN_ADD on targets that need it. Could be vec lowering or isel. (4) The vectoriser maintains the distinction between IFN_COND_ADD and IFN_COND_LEN_ADD from the outset. It adds dummy length arguments where necessary. The patch is going for (4). But I think we should at least consider whether any of the other alternatives make sense. If we do go for (4), I think the get_conditional_len_internal_fn is in the wrong place. It should be applied: internal_fn ifn; if (internal_fn_p (cfn)) ifn = as_internal_fn (cfn); else ifn = associated_internal_fn (fndecl); if (ifn != IFN_LAST && direct_internal_fn_p (ifn)) { // After this point I don't think we should assume that every IFN_COND_* has an associated tree code. It should be possible to create IFN_COND_*s from unconditional internal functions too. So rather than use: tree_code code = conditional_internal_fn_code (ifn); len_ifn = get_conditional_len_internal_fn (code); I think we should have an internal-fn helper that returns IFN_COND_LEN_* for a given IFN_COND_*. It could handle IFN_MASK_LOAD -> IFN_MASK_LEN_LOAD etc. too. It would help if we combined the COND_* and COND_LEN_* definitions. For example: DEF_INTERNAL_OPTAB_FN (COND_ADD, ECF_CONST, cond_add, cond_binary) DEF_INTERNAL_OPTAB_FN (COND_LEN_ADD, ECF_CONST, cond_len_add, cond_len_binary) could become: DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary) that expands to the COND_ADD and COND_LEN_ADD definitions. This would help to keep the definitions in sync and would make it easier to do a mapping between COND_* and COND_LEN_*. 
Thanks, Richard ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_* 2023-07-25 9:41 ` Richard Sandiford @ 2023-07-25 10:17 ` juzhe.zhong 2023-07-25 10:21 ` Richard Sandiford 2023-07-31 9:30 ` Richard Biener 1 sibling, 1 reply; 9+ messages in thread From: juzhe.zhong @ 2023-07-25 10:17 UTC (permalink / raw) To: richard.sandiford; +Cc: rguenther, gcc-patches [-- Attachment #1: Type: text/plain, Size: 4396 bytes --] Thanks Richard. Do you suggest we should add a macro like this first: #ifndef DEF_INTERNAL_COND_FN #define DEF_INTERNAL_COND_FN(NAME, FLAGS, OPTAB, TYPE) \ DEF_INTERNAL_OPTAB_FN (COND_##NAME, FLAGS, cond_##optab, cond_##TYPE) DEF_INTERNAL_OPTAB_FN (COND_LEN_##NAME, FLAGS, cond_len_##optab, cond_len_##TYPE) #endif If yes, maybe I should first do this in a single patch first? juzhe.zhong@rivai.ai From: Richard Sandiford Date: 2023-07-25 17:41 To: 钟居哲 CC: rguenther; gcc-patches Subject: Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_* 钟居哲 <juzhe.zhong@rivai.ai> writes: > Hi, Richi. Thank you so much for review. > >>> This function doesn't seem to care about conditional vectorization >>> support, so why are you changing it? > > I debug and analyze the code here: > > Breakpoint 1, vectorizable_call (vinfo=0x3d358d0, stmt_info=0x3dcc820, gsi=0x0, vec_stmt=0x0, slp_node=0x0, cost_vec=0x7fffffffc668) at ../../../riscv-gcc/gcc/tree-vect-stmts.cc:3485 > 3485 ifn = vectorizable_internal_function (cfn, callee, vectype_out, > (gdb) p cfn > $1 = CFN_COND_ADD > (gdb) call print_gimple_stmt(stdout,stmt,0,0) > _9 = .COND_ADD (_27, _6, _8, _6); > > When target like RISC-V didnt' support COND_* (we only support COND_LEN_*, no support COND_*), > ifn will be IFN_LAST. > Also, the following code doesn't modify ifn until > "internal_fn cond_fn = get_conditional_internal_fn (ifn);" > When ifn == IFN_LAST, cond_fn will IFN_LAST too. > > So, I extend "vectorizable_internal_function" to return COND_LEN_*. > > Am I going into wrong direction on debug && analysis. 
The main difference between IFN_COND_ADD and IFN_COND_LEN_ADD is that IFN_COND_ADD makes logical sense for scalars, whereas IFN_COND_LEN_ADD only makes sense for vectors. So I think if-conversion has to create IFN_COND_ADD rather than IFN_COND_LEN_ADD even for targets that only support IFN_COND_LEN_ADD. (Which is what the patch does.) IFN_COND_LEN_ADD is really a generalisation of IFN_COND_ADD. In a sense, every target that supports IFN_COND_LEN_ADD also supports IFN_COND_ADD. All we need is two extra arguments. And the patch does seem to take that approach (by adding dummy length arguments). So I think there are at least four ways we could go: (1) RVV implements cond_* optabs as expanders. RVV therefore supports both IFN_COND_ADD and IFN_COND_LEN_ADD. No dummy length arguments are needed at the gimple level. (2) Target-independent code automatically provides cond_* optabs based on cond_len_* optabs. Otherwise the same as (1). (3) A gimple pass lowers IFN_COND_ADD to IFN_COND_LEN_ADD on targets that need it. Could be vec lowering or isel. (4) The vectoriser maintains the distinction between IFN_COND_ADD and IFN_COND_LEN_ADD from the outset. It adds dummy length arguments where necessary. The patch is going for (4). But I think we should at least consider whether any of the other alternatives make sense. If we do go for (4), I think the get_conditional_len_internal_fn is in the wrong place. It should be applied: internal_fn ifn; if (internal_fn_p (cfn)) ifn = as_internal_fn (cfn); else ifn = associated_internal_fn (fndecl); if (ifn != IFN_LAST && direct_internal_fn_p (ifn)) { // After this point I don't think we should assume that every IFN_COND_* has an associated tree code. It should be possible to create IFN_COND_*s from unconditional internal functions too. 
So rather than use:

  tree_code code = conditional_internal_fn_code (ifn);
  len_ifn = get_conditional_len_internal_fn (code);

I think we should have an internal-fn helper that returns IFN_COND_LEN_*
for a given IFN_COND_*.  It could handle IFN_MASK_LOAD ->
IFN_MASK_LEN_LOAD etc. too.

It would help if we combined the COND_* and COND_LEN_* definitions.
For example:

  DEF_INTERNAL_OPTAB_FN (COND_ADD, ECF_CONST, cond_add, cond_binary)
  DEF_INTERNAL_OPTAB_FN (COND_LEN_ADD, ECF_CONST, cond_len_add, cond_len_binary)

could become:

  DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary)

that expands to the COND_ADD and COND_LEN_ADD definitions.  This would
help to keep the definitions in sync and would make it easier to do a
mapping between COND_* and COND_LEN_*.

Thanks,
Richard
* Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_*
  2023-07-25 10:17 ` juzhe.zhong
@ 2023-07-25 10:21 ` Richard Sandiford
  2023-07-25 11:31   ` juzhe.zhong
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Sandiford @ 2023-07-25 10:21 UTC
To: juzhe.zhong; +Cc: rguenther, gcc-patches

"juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai> writes:
> Thanks Richard.
>
> Do you suggest we should add a macro like this first:
>
> #ifndef DEF_INTERNAL_COND_FN
> #define DEF_INTERNAL_COND_FN(NAME, FLAGS, OPTAB, TYPE) \
>   DEF_INTERNAL_OPTAB_FN (COND_##NAME, FLAGS, cond_##optab, cond_##TYPE)
>   DEF_INTERNAL_OPTAB_FN (COND_LEN_##NAME, FLAGS, cond_len_##optab, cond_len_##TYPE)
> #endif

Yeah.  (Think there's a missing backslash though.)

> If yes, maybe I should do this in a separate patch first?

Yeah, doing it as a separate patch sounds good.

Richard
* Re: Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_*
  2023-07-25 10:21 ` Richard Sandiford
@ 2023-07-25 11:31 ` juzhe.zhong
  2023-07-25 12:18   ` Richard Sandiford
  0 siblings, 1 reply; 9+ messages in thread
From: juzhe.zhong @ 2023-07-25 11:31 UTC
To: richard.sandiford; +Cc: rguenther, gcc-patches

Hi, Richard.

>> I think we should have an internal-fn helper that returns IFN_COND_LEN_*
>> for a given IFN_COND_*.  It could handle IFN_MASK_LOAD -> IFN_MASK_LEN_LOAD
>> etc. too.

Could you suggest a name for this helper function?  Should it be called
"get_conditional_len_internal_fn_for_conditional_fn"?

Thanks.

juzhe.zhong@rivai.ai

From: Richard Sandiford
Date: 2023-07-25 18:21
To: juzhe.zhong@rivai.ai
CC: rguenther; gcc-patches
Subject: Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_*

"juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai> writes:
> Thanks Richard.
>
> Do you suggest we should add a macro like this first:
>
> #ifndef DEF_INTERNAL_COND_FN
> #define DEF_INTERNAL_COND_FN(NAME, FLAGS, OPTAB, TYPE) \
>   DEF_INTERNAL_OPTAB_FN (COND_##NAME, FLAGS, cond_##optab, cond_##TYPE)
>   DEF_INTERNAL_OPTAB_FN (COND_LEN_##NAME, FLAGS, cond_len_##optab, cond_len_##TYPE)
> #endif

Yeah.  (Think there's a missing backslash though.)

> If yes, maybe I should do this in a separate patch first?

Yeah, doing it as a separate patch sounds good.

Richard
* Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_*
  2023-07-25 11:31 ` juzhe.zhong
@ 2023-07-25 12:18 ` Richard Sandiford
  0 siblings, 0 replies; 9+ messages in thread
From: Richard Sandiford @ 2023-07-25 12:18 UTC
To: juzhe.zhong; +Cc: rguenther, gcc-patches

"juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai> writes:
> Hi, Richard.
>>> I think we should have an internal-fn helper that returns IFN_COND_LEN_*
>>> for a given IFN_COND_*.  It could handle IFN_MASK_LOAD -> IFN_MASK_LEN_LOAD
>>> etc. too.
> Could you suggest a name for this helper function?  Should it be called
> "get_conditional_len_internal_fn_for_conditional_fn"?

How about get_len_internal_fn?

/* If there exists an internal function like IFN that operates on vectors,
   but with additional length and bias parameters, return the internal_fn
   for that function, otherwise return IFN_LAST.  */

Thanks,
Richard
* Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_*
  2023-07-25  9:41 ` Richard Sandiford
  2023-07-25 10:17   ` juzhe.zhong
@ 2023-07-31  9:30   ` Richard Biener
  1 sibling, 0 replies; 9+ messages in thread
From: Richard Biener @ 2023-07-31 9:30 UTC
To: Richard Sandiford; +Cc: 钟居哲, gcc-patches

On Tue, 25 Jul 2023, Richard Sandiford wrote:

> 钟居哲 <juzhe.zhong@rivai.ai> writes:
> > Hi, Richi.  Thank you so much for the review.
> >
> >>> This function doesn't seem to care about conditional vectorization
> >>> support, so why are you changing it?
> >
> > I debugged and analyzed the code here:
> >
> > Breakpoint 1, vectorizable_call (vinfo=0x3d358d0, stmt_info=0x3dcc820,
> >     gsi=0x0, vec_stmt=0x0, slp_node=0x0, cost_vec=0x7fffffffc668)
> >     at ../../../riscv-gcc/gcc/tree-vect-stmts.cc:3485
> > 3485        ifn = vectorizable_internal_function (cfn, callee, vectype_out,
> > (gdb) p cfn
> > $1 = CFN_COND_ADD
> > (gdb) call print_gimple_stmt(stdout,stmt,0,0)
> > _9 = .COND_ADD (_27, _6, _8, _6);
> >
> > When a target like RISC-V doesn't support COND_* (we only support
> > COND_LEN_*, not COND_*), ifn will be IFN_LAST.
> > Also, the following code doesn't modify ifn until
> > "internal_fn cond_fn = get_conditional_internal_fn (ifn);"
> > When ifn == IFN_LAST, cond_fn will be IFN_LAST too.
> >
> > So I extended "vectorizable_internal_function" to return COND_LEN_*.
> >
> > Am I going in the wrong direction with my debugging and analysis?
>
> The main difference between IFN_COND_ADD and IFN_COND_LEN_ADD is that
> IFN_COND_ADD makes logical sense for scalars, whereas IFN_COND_LEN_ADD
> only makes sense for vectors.  So I think if-conversion has to create
> IFN_COND_ADD rather than IFN_COND_LEN_ADD even for targets that only
> support IFN_COND_LEN_ADD.  (Which is what the patch does.)
>
> IFN_COND_LEN_ADD is really a generalisation of IFN_COND_ADD.  In a
> sense, every target that supports IFN_COND_LEN_ADD also supports
> IFN_COND_ADD.  All we need is two extra arguments.
> And the patch does seem to take that approach (by adding dummy length
> arguments).
>
> So I think there are at least four ways we could go:
>
> (1) RVV implements cond_* optabs as expanders.  RVV therefore supports
>     both IFN_COND_ADD and IFN_COND_LEN_ADD.  No dummy length arguments
>     are needed at the gimple level.
>
> (2) Target-independent code automatically provides cond_* optabs
>     based on cond_len_* optabs.  Otherwise the same as (1).
>
> (3) A gimple pass lowers IFN_COND_ADD to IFN_COND_LEN_ADD on targets
>     that need it.  Could be vec lowering or isel.
>
> (4) The vectoriser maintains the distinction between IFN_COND_ADD and
>     IFN_COND_LEN_ADD from the outset.  It adds dummy length arguments
>     where necessary.
>
> The patch is going for (4).  But I think we should at least consider
> whether any of the other alternatives make sense.

I think only (1) and (4) make sense.  The vectorizer would need to check
for IFN_COND_LEN_ADD or IFN_COND_ADD support anyway, even when only
emitting IFN_COND_ADD.

Since we've already gone half-way, we might as well do (4).  That also
possibly simplifies N targets as opposed to just the vectorizer.

So my preference would be (4), but I'm also fine with (1).

Richard.

> If we do go for (4), I think the get_conditional_len_internal_fn call
> is in the wrong place.  It should be applied here:
>
>   internal_fn ifn;
>   if (internal_fn_p (cfn))
>     ifn = as_internal_fn (cfn);
>   else
>     ifn = associated_internal_fn (fndecl);
>   if (ifn != IFN_LAST && direct_internal_fn_p (ifn))
>     {
>       // After this point
>
> I don't think we should assume that every IFN_COND_* has an associated
> tree code.  It should be possible to create IFN_COND_*s from
> unconditional internal functions too.  So rather than use:
>
>   tree_code code = conditional_internal_fn_code (ifn);
>   len_ifn = get_conditional_len_internal_fn (code);
>
> I think we should have an internal-fn helper that returns IFN_COND_LEN_*
> for a given IFN_COND_*.  It could handle IFN_MASK_LOAD ->
> IFN_MASK_LEN_LOAD etc.
> too.
>
> It would help if we combined the COND_* and COND_LEN_* definitions.
> For example:
>
>   DEF_INTERNAL_OPTAB_FN (COND_ADD, ECF_CONST, cond_add, cond_binary)
>   DEF_INTERNAL_OPTAB_FN (COND_LEN_ADD, ECF_CONST, cond_len_add, cond_len_binary)
>
> could become:
>
>   DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary)
>
> that expands to the COND_ADD and COND_LEN_ADD definitions.  This would
> help to keep the definitions in sync and would make it easier to do a
> mapping between COND_* and COND_LEN_*.
>
> Thanks,
> Richard

--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461
Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich;
(HRB 36809, AG Nuernberg)
end of thread, other threads: [~2023-07-31 9:30 UTC | newest]

Thread overview: 9+ messages -- links below jump to the message on this page --
2023-07-24 11:46 [PATCH] VECT: Support CALL vectorization for COND_LEN_* juzhe.zhong
2023-07-24 14:33 ` Richard Biener
2023-07-24 23:16 ` 钟居哲
2023-07-25  9:41 ` Richard Sandiford
2023-07-25 10:17 ` juzhe.zhong
2023-07-25 10:21 ` Richard Sandiford
2023-07-25 11:31 ` juzhe.zhong
2023-07-25 12:18 ` Richard Sandiford
2023-07-31  9:30 ` Richard Biener