Subject: [PATCH v4 1/3] IFN: Implement IFN_VEC_SET for ARRAY_REF with VIEW_CONVERT_EXPR
To: Richard Biener, GCC Patches, Segher Boessenkool, David Edelsohn, Bill Schmidt, Jiufu Guo, linkw@gcc.gnu.org, richard.sandiford@arm.com
References: <20200918061741.65490-1-luoxhu@linux.ibm.com>
From: xionghu luo
Date: Fri, 25 Sep 2020 14:51:04 +0800

Hi,

On 2020/9/24 20:39, Richard Sandiford wrote:
> xionghu luo writes:
>> @@ -2658,6 +2659,43 @@ expand_vect_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
>>
>> #define expand_vec_cond_mask_optab_fn expand_vect_cond_mask_optab_fn
>>
>> +/* Expand VEC_SET internal functions. */
>> +
>> +static void
>> +expand_vec_set_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
>> +{
>> +  tree lhs = gimple_call_lhs (stmt);
>> +  tree op0 = gimple_call_arg (stmt, 0);
>> +  tree op1 = gimple_call_arg (stmt, 1);
>> +  tree op2 = gimple_call_arg (stmt, 2);
>> +  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
>> +  rtx src = expand_expr (op0, NULL_RTX, VOIDmode, EXPAND_WRITE);
>
> I'm not sure about the expand_expr here. ISTM that op0 is a normal
> input and so should be expanded by expand_normal rather than
> EXPAND_WRITE. Also:
>
>> +
>> +  machine_mode outermode = TYPE_MODE (TREE_TYPE (op0));
>> +  scalar_mode innermode = GET_MODE_INNER (outermode);
>> +
>> +  rtx value = expand_expr (op1, NULL_RTX, VOIDmode, EXPAND_NORMAL);
>> +  rtx pos = expand_expr (op2, NULL_RTX, VOIDmode, EXPAND_NORMAL);
>> +
>> +  class expand_operand ops[3];
>> +  enum insn_code icode = optab_handler (optab, outermode);
>> +
>> +  if (icode != CODE_FOR_nothing)
>> +    {
>> +      pos = convert_to_mode (E_SImode, pos, 0);
>> +
>> +      create_fixed_operand (&ops[0], src);
>
> ...this would mean that if SRC happens to be a MEM, the pattern
> must also accept a MEM.
>
> ISTM that we're making more work for ourselves by not “fixing” the optab
> to have a natural pure-input + pure-output interface. :-) But if we
> stick with the current optab interface, I think we need to:
>
> - create a temporary register
> - move SRC into the temporary register before the insn
> - use create_fixed_operand with the temporary register for operand 0
> - move the temporary register into TARGET after the insn
>
>> +      create_input_operand (&ops[1], value, innermode);
>> +      create_input_operand (&ops[2], pos, GET_MODE (pos));
>
> For this I think we should use convert_operand_from on the original “pos”,
> so that the target gets to choose what the mode of the operand is.
>

Thanks a lot for the nice suggestions. I have fixed them all and updated
the patch as below.


[PATCH v4 1/3] IFN: Implement IFN_VEC_SET for ARRAY_REF with VIEW_CONVERT_EXPR

This patch enables the transformation from ARRAY_REF(VIEW_CONVERT_EXPR) to
the VEC_SET internal function in the gimple-isel pass when the target
supports vec_set with a variable index, which is checked with
can_vec_set_var_idx_p.

gcc/ChangeLog:

2020-09-25  Xionghu Luo

        * gimple-isel.cc (gimple_expand_vec_set_expr): New function.
        (gimple_expand_vec_cond_exprs): Rename to ...
        (gimple_expand_vec_exprs): ... this and call
        gimple_expand_vec_set_expr.
        * internal-fn.c (vec_set_direct): New define.
        (expand_vec_set_optab_fn): New function.
        (direct_vec_set_optab_supported_p): New define.
        * internal-fn.def (VEC_SET): New DEF_INTERNAL_OPTAB_FN.
        * optabs.c (can_vec_set_var_idx_p): New function.
        * optabs.h (can_vec_set_var_idx_p): New declaration.
---
 gcc/gimple-isel.cc  | 75 +++++++++++++++++++++++++++++++++++++++++++--
 gcc/internal-fn.c   | 41 +++++++++++++++++++++++++
 gcc/internal-fn.def |  2 ++
 gcc/optabs.c        | 21 +++++++++++++
 gcc/optabs.h        |  4 +++
 5 files changed, 141 insertions(+), 2 deletions(-)

diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index b330cf4c20e..02513e04900 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -35,6 +35,74 @@ along with GCC; see the file COPYING3. If not see
 #include "tree-cfg.h"
 #include "bitmap.h"
 #include "tree-ssa-dce.h"
+#include "memmodel.h"
+#include "optabs.h"
+
+/* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls to
+   internal function based on vector type of selected expansion.
+   i.e.:
+     VIEW_CONVERT_EXPR(u)[_1] = i_4(D);
+   =>
+     _7 = u;
+     _8 = .VEC_SET (_7, i_4(D), _1);
+     u = _8; */
+
+static gimple *
+gimple_expand_vec_set_expr (gimple_stmt_iterator *gsi)
+{
+  enum tree_code code;
+  gcall *new_stmt = NULL;
+  gassign *ass_stmt = NULL;
+
+  /* Only consider code == GIMPLE_ASSIGN. */
+  gassign *stmt = dyn_cast<gassign *> (gsi_stmt (*gsi));
+  if (!stmt)
+    return NULL;
+
+  tree lhs = gimple_assign_lhs (stmt);
+  code = TREE_CODE (lhs);
+  if (code != ARRAY_REF)
+    return NULL;
+
+  tree val = gimple_assign_rhs1 (stmt);
+  tree op0 = TREE_OPERAND (lhs, 0);
+  if (TREE_CODE (op0) == VIEW_CONVERT_EXPR && DECL_P (TREE_OPERAND (op0, 0))
+      && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
+      && TYPE_MODE (TREE_TYPE (lhs))
+           == TYPE_MODE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0)))))
+    {
+      tree pos = TREE_OPERAND (lhs, 1);
+      tree view_op0 = TREE_OPERAND (op0, 0);
+      machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
+      if (auto_var_in_fn_p (view_op0, cfun->decl)
+          && !TREE_ADDRESSABLE (view_op0) && can_vec_set_var_idx_p (outermode))
+        {
+          location_t loc = gimple_location (stmt);
+          tree var_src = make_ssa_name (TREE_TYPE (view_op0));
+          tree var_dst = make_ssa_name (TREE_TYPE (view_op0));
+
+          ass_stmt = gimple_build_assign (var_src, view_op0);
+          gimple_set_vuse (ass_stmt, gimple_vuse (stmt));
+          gimple_set_location (ass_stmt, loc);
+          gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
+
+          new_stmt
+            = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos);
+          gimple_call_set_lhs (new_stmt, var_dst);
+          gimple_set_location (new_stmt, loc);
+          gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+
+          ass_stmt = gimple_build_assign (view_op0, var_dst);
+          gimple_set_location (ass_stmt, loc);
+          gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
+
+          gimple_move_vops (ass_stmt, stmt);
+          gsi_remove (gsi, true);
+        }
+    }
+
+  return ass_stmt;
+}
 
 /* Expand all VEC_COND_EXPR gimple assignments into calls to internal
    function based on type of selected expansion. */
@@ -176,7 +244,7 @@ gimple_expand_vec_cond_expr (gimple_stmt_iterator *gsi,
    VEC_COND_EXPR assignments. */
 
 static unsigned int
-gimple_expand_vec_cond_exprs (void)
+gimple_expand_vec_exprs (void)
 {
   gimple_stmt_iterator gsi;
   basic_block bb;
@@ -189,12 +257,15 @@ gimple_expand_vec_cond_exprs (void)
         {
           gimple *g = gimple_expand_vec_cond_expr (&gsi,
                                                    &vec_cond_ssa_name_uses);
+
           if (g != NULL)
             {
               tree lhs = gimple_assign_lhs (gsi_stmt (gsi));
               gimple_set_lhs (g, lhs);
               gsi_replace (&gsi, g, false);
             }
+
+          gimple_expand_vec_set_expr (&gsi);
         }
     }
 
@@ -237,7 +308,7 @@ public:
 
   virtual unsigned int execute (function *)
     {
-      return gimple_expand_vec_cond_exprs ();
+      return gimple_expand_vec_exprs ();
     }
 
 }; // class pass_gimple_isel
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 8efc77d986b..20c1d31fb73 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -115,6 +115,7 @@ init_internal_fns ()
 #define vec_condeq_direct { 0, 0, false }
 #define scatter_store_direct { 3, 1, false }
 #define len_store_direct { 3, 3, false }
+#define vec_set_direct { 3, 3, false }
 #define unary_direct { 0, 0, true }
 #define binary_direct { 0, 0, true }
 #define ternary_direct { 0, 0, true }
@@ -2658,6 +2659,45 @@ expand_vect_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
 
 #define expand_vec_cond_mask_optab_fn expand_vect_cond_mask_optab_fn
 
+/* Expand VEC_SET internal functions. */
+
+static void
+expand_vec_set_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
+{
+  tree lhs = gimple_call_lhs (stmt);
+  tree op0 = gimple_call_arg (stmt, 0);
+  tree op1 = gimple_call_arg (stmt, 1);
+  tree op2 = gimple_call_arg (stmt, 2);
+  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+  rtx src = expand_normal (op0);
+
+  machine_mode outermode = TYPE_MODE (TREE_TYPE (op0));
+  scalar_mode innermode = GET_MODE_INNER (outermode);
+
+  rtx value = expand_expr (op1, NULL_RTX, VOIDmode, EXPAND_NORMAL);
+  rtx pos = expand_expr (op2, NULL_RTX, VOIDmode, EXPAND_NORMAL);
+
+  class expand_operand ops[3];
+  enum insn_code icode = optab_handler (optab, outermode);
+
+  if (icode != CODE_FOR_nothing)
+    {
+      rtx temp = gen_reg_rtx (outermode);
+      emit_move_insn (temp, src);
+
+      create_fixed_operand (&ops[0], temp);
+      create_input_operand (&ops[1], value, innermode);
+      create_convert_operand_from (&ops[2], pos, TYPE_MODE (TREE_TYPE (op2)),
+                                   true);
+      if (maybe_expand_insn (icode, 3, ops))
+        {
+          emit_move_insn (target, temp);
+          return;
+        }
+    }
+  gcc_unreachable ();
+}
+
 static void
 expand_ABNORMAL_DISPATCHER (internal_fn, gcall *)
 {
@@ -3253,6 +3293,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 #define direct_fold_left_optab_supported_p direct_optab_supported_p
 #define direct_mask_fold_left_optab_supported_p direct_optab_supported_p
 #define direct_check_ptrs_optab_supported_p direct_optab_supported_p
+#define direct_vec_set_optab_supported_p direct_optab_supported_p
 
 /* Return the optab used by internal function FN. */
 
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 13e60828fcf..e6cfe1b6159 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -145,6 +145,8 @@ DEF_INTERNAL_OPTAB_FN (VCONDU, 0, vcondu, vec_condu)
 DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_condeq)
 DEF_INTERNAL_OPTAB_FN (VCOND_MASK, 0, vcond_mask, vec_cond_mask)
 
+DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set)
+
 DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store)
 
 DEF_INTERNAL_OPTAB_FN (WHILE_ULT, ECF_CONST | ECF_NOTHROW, while_ult, while)
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 184827fdf4e..8e844028d92 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -3841,6 +3841,27 @@ can_vcond_compare_p (enum rtx_code code, machine_mode value_mode,
          && insn_operand_matches (icode, 3, test);
 }
 
+/* Return whether the backend can emit vector set instructions for inserting
+   element into vector at variable index position. */
+
+bool
+can_vec_set_var_idx_p (machine_mode vec_mode)
+{
+  if (!VECTOR_MODE_P (vec_mode))
+    return false;
+
+  machine_mode inner_mode = GET_MODE_INNER (vec_mode);
+  rtx reg1 = alloca_raw_REG (vec_mode, LAST_VIRTUAL_REGISTER + 1);
+  rtx reg2 = alloca_raw_REG (inner_mode, LAST_VIRTUAL_REGISTER + 2);
+  rtx reg3 = alloca_raw_REG (VOIDmode, LAST_VIRTUAL_REGISTER + 3);
+
+  enum insn_code icode = optab_handler (vec_set_optab, vec_mode);
+
+  return icode != CODE_FOR_nothing && insn_operand_matches (icode, 0, reg1)
+         && insn_operand_matches (icode, 1, reg2)
+         && insn_operand_matches (icode, 2, reg3);
+}
+
 /* This function is called when we are going to emit a compare instruction
    that compares the values found in X and Y, using the rtl operator
    COMPARISON.
diff --git a/gcc/optabs.h b/gcc/optabs.h
index 7c2ec257cb0..0b14700ab3d 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -249,6 +249,10 @@ extern int can_compare_p (enum rtx_code, machine_mode,
    VALUE_MODE. */
 extern bool can_vcond_compare_p (enum rtx_code, machine_mode, machine_mode);
 
+/* Return whether the backend can emit vector set instructions for inserting
+   element into vector at variable index position. */
+extern bool can_vec_set_var_idx_p (machine_mode);
+
 extern rtx prepare_operand (enum insn_code, rtx, int, machine_mode,
                             machine_mode, int);
 /* Emit a pair of rtl insns to compare two rtx's and to jump
-- 
2.27.0.90.geebb51ba8c
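
For reference, here is a minimal source-level sketch of the kind of code the transformation is aimed at, assuming the GNU C vector extension. It stores to a vector element at an index only known at run time, which is expected to reach GIMPLE as an ARRAY_REF of a VIEW_CONVERT_EXPR. The names v4si and set_elem are made up for illustration and are not part of the patch. Whether the store really becomes a .VEC_SET call depends on the target and on what can_vec_set_var_idx_p returns.

```c
#include <assert.h>

/* Hypothetical names for illustration only; not part of the patch.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Return a copy of U with element POS replaced by VALUE.  POS is only
   known at run time, i.e. the variable-index case discussed above.  */
v4si
set_elem (v4si u, int value, unsigned int pos)
{
  /* An out-of-range index would be undefined, so reject it here.  */
  assert (pos < 4);
  u[pos] = value;
  return u;
}
```

Calling set_elem on a vector {0, 1, 2, 3} with value 42 and position 2 should leave the other three elements untouched, whichever instruction sequence the target ends up using.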