Subject: Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]
To: Richard Biener
Cc: GCC Patches, David Edelsohn, Bill Schmidt, linkw@gcc.gnu.org, Segher Boessenkool
References: <20200831090647.152432-1-luoxhu@linux.ibm.com> <20200831170414.GJ28786@gate.crashing.org> <3aec8b6f-8cf5-bd48-eb4e-c7b82e88dcd7@linux.ibm.com> <53017f5c-d81b-6fe2-6c5d-1496249313f1@linux.ibm.com> <599b6eff-42fc-82ef-c822-f3502997798a@linux.ibm.com>
From: luoxhu
Date: Fri, 4 Sep 2020 14:38:48 +0800
List-Id: Gcc-patches mailing list

On 2020/9/4 14:16, luoxhu via Gcc-patches wrote:
> Hi,
>
>
> Yes, I checked and found that neither vec_set nor vec_extract supports a
> variable index on most targets; store_bit_field_1 and extract_bit_field_1
> only consider using optabs when the index is a constant integer.  Anyway, it
> shouldn't be hard to extend depending on target requirements.
>
> Another problem is that v[n&3]=i and vec_insert(v, i, n) generate
> different gimple code:
>
> {
>   _1 = n & 3;
>   VIEW_CONVERT_EXPR(v1)[_1] = i;
> }
>
> vs:
>
> {
>   __vector signed int v1;
>   __vector signed int D.3192;
>   long unsigned int _1;
>   long unsigned int _2;
>   int * _3;
>
>   [local count: 1073741824]:
>   D.3192 = v_4(D);
>   _1 = n_7(D) & 3;
>   _2 = _1 * 4;
>   _3 = &D.3192 + _2;
>   *_3 = i_8(D);
>   v1_10 = D.3192;
>   return v1_10;
> }

Just realized that using convert_vector_to_array_for_subscript would generate
"VIEW_CONVERT_EXPR(v1)[_1] = i;" before producing those instructions.
Your confirmation and comments would be highly appreciated...  Thanks in
advance. :)

Xionghu

>
> If the builtin is not used for "vec_insert(v, i, n)", the pointer is "int*"
> instead of a vector type; will it be difficult for the expander to capture so
> many statements and then call the optabs?  So shall we still keep the builtin
> style for "vec_insert(v, i, n)" and expand "v[n&3]=i" with optabs, or expand
> both with optabs?
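
For readers following along, here is a rough, portable sketch of the two
forms compared above.  It is only an illustration under stated assumptions:
it uses GCC's generic vector extension instead of the AltiVec vec_insert
builtin, and the names v4si, insert_subscript and insert_via_pointer are
made up for this example.  The second function merely models the
copy/offset/store sequence of the second gimple dump; it is not what the
rs6000 backend emits.

```c
#include <string.h>

/* Assumed for illustration: a 4 x int vector via the GCC vector extension.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Form 1: plain subscript with a masked variable index, i.e. v[n&3] = i.  */
static v4si
insert_subscript (v4si v, int i, unsigned long n)
{
  v[n & 3] = i;
  return v;
}

/* Form 2: models the second dump: copy the vector to a temporary,
   compute a byte offset from the masked index, store the scalar there.  */
static v4si
insert_via_pointer (v4si v, int i, unsigned long n)
{
  v4si tmp = v;                            /* D.3192 = v        */
  unsigned long idx = n & 3;               /* _1 = n & 3        */
  unsigned long off = idx * sizeof (int);  /* _2 = _1 * 4       */
  /* memcpy stands in for "*_3 = i" to stay clear of aliasing questions.  */
  memcpy ((char *) &tmp + off, &i, sizeof i);
  return tmp;                              /* v1 = D.3192       */
}
```

Both functions should return the same vector for any n, which is the
equivalence the question above relies on; only the shape of the intermediate
code differs.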
>
> Drafted a quick patch to expand "v[n&3]=i" with optabs as below; sorry that it
> does not use the existing vec_set yet, as I am not quite sure about it.
> Together with the first patch, both cases can be handled as expected:
>
>
> [PATCH] Expander: expand VIEW_CONVERT_EXPR to vec_insert with variable index
>
> v[n%4] = i has the same semantics as vec_insert (i, v, n), but it will be
> optimized to "VIEW_CONVERT_EXPR(v1)[_1] = i;" in gimple.  This
> patch tries to recognize the pattern in the expander and use optabs to
> expand it to fast instructions like vec_insert: lvsl+xxperm+xxsel.
>
> gcc/ChangeLog:
>
>         * config/rs6000/vector.md:
>         * expr.c (expand_assignment):
>         * optabs.def (OPTAB_CD):
> ---
>  gcc/config/rs6000/vector.md | 13 +++++++++++
>  gcc/expr.c                  | 46 +++++++++++++++++++++++++++++++++++++
>  gcc/optabs.def              |  1 +
>  3 files changed, 60 insertions(+)
>
> diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
> index 796345c80d3..46d21271e17 100644
> --- a/gcc/config/rs6000/vector.md
> +++ b/gcc/config/rs6000/vector.md
> @@ -1244,6 +1244,19 @@ (define_expand "vec_extract"
>    DONE;
>  })
>
> +(define_expand "vec_insert"
> +  [(match_operand:VEC_E 0 "vlogical_operand")
> +   (match_operand: 1 "register_operand")
> +   (match_operand 2 "register_operand")]
> +  "VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)"
> +{
> +  rtx target = gen_reg_rtx (V16QImode);
> +  rs6000_expand_vector_insert (target, operands[0], operands[1], operands[2]);
> +  rtx sub_target = simplify_gen_subreg (GET_MODE(operands[0]), target, V16QImode, 0);
> +  emit_insn (gen_rtx_SET (operands[0], sub_target));
> +  DONE;
> +})
> +
>  ;; Convert double word types to single word types
>  (define_expand "vec_pack_trunc_v2df"
>    [(match_operand:V4SF 0 "vfloat_operand")
> diff --git a/gcc/expr.c b/gcc/expr.c
> index dd2200ddea8..ce2890c1a2d 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -5237,6 +5237,52 @@ expand_assignment (tree to, tree from, bool nontemporal)
>
>        to_rtx = expand_expr (tem, NULL_RTX, VOIDmode, EXPAND_WRITE);
>
> +      tree type = TREE_TYPE (to);
> +      if (TREE_CODE (to) == ARRAY_REF && tree_fits_uhwi_p (TYPE_SIZE (type))
> +          && tree_fits_uhwi_p (TYPE_SIZE_UNIT (type))
> +          && tree_to_uhwi (TYPE_SIZE (type))
> +               * tree_to_uhwi (TYPE_SIZE_UNIT (type))
> +             == 128)
> +        {
> +          tree op0 = TREE_OPERAND (to, 0);
> +          tree op1 = TREE_OPERAND (to, 1);
> +          if (TREE_CODE (op0) == VIEW_CONVERT_EXPR)
> +            {
> +              tree view_op0 = TREE_OPERAND (op0, 0);
> +              mode = TYPE_MODE (TREE_TYPE (view_op0));
> +              if (TREE_CODE (TREE_TYPE (view_op0)) == VECTOR_TYPE)
> +                {
> +                  rtx value
> +                    = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL);
> +                  rtx pos
> +                    = expand_expr (op1, NULL_RTX, VOIDmode, EXPAND_NORMAL);
> +                  rtx temp_target = gen_reg_rtx (mode);
> +                  emit_move_insn (temp_target, to_rtx);
> +
> +                  machine_mode outermode = mode;
> +                  scalar_mode innermode = GET_MODE_INNER (outermode);
> +                  class expand_operand ops[3];
> +                  enum insn_code icode
> +                    = convert_optab_handler (vec_insert_optab, innermode,
> +                                             outermode);
> +
> +                  if (icode != CODE_FOR_nothing)
> +                    {
> +                      pos = convert_to_mode (E_SImode, pos, 0);
> +
> +                      create_fixed_operand (&ops[0], temp_target);
> +                      create_input_operand (&ops[1], value, innermode);
> +                      create_input_operand (&ops[2], pos, GET_MODE (pos));
> +                      if (maybe_expand_insn (icode, 3, ops))
> +                        {
> +                          emit_move_insn (to_rtx, temp_target);
> +                          pop_temp_slots ();
> +                          return;
> +                        }
> +                    }
> +                }
> +            }
> +        }
>        /* If the field has a mode, we want to access it in the
>           field's mode, not the computed mode.
>           If a MEM has VOIDmode (external with incomplete type),
> diff --git a/gcc/optabs.def b/gcc/optabs.def
> index 78409aa1453..21b163a969e 100644
> --- a/gcc/optabs.def
> +++ b/gcc/optabs.def
> @@ -96,6 +96,7 @@ OPTAB_CD(mask_gather_load_optab, "mask_gather_load$a$b")
>  OPTAB_CD(scatter_store_optab, "scatter_store$a$b")
>  OPTAB_CD(mask_scatter_store_optab, "mask_scatter_store$a$b")
>  OPTAB_CD(vec_extract_optab, "vec_extract$a$b")
> +OPTAB_CD(vec_insert_optab, "vec_insert$a$b")
>  OPTAB_CD(vec_init_optab, "vec_init$a$b")
>
>  OPTAB_CD (while_ult_optab, "while_ult$a$b")
>
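
To give a feel for why a register-only expansion of a variable-index insert
is possible at all, below is a minimal sketch of the general "splat, build a
lane mask, select" idea in portable C.  This is an assumption-laden
illustration, not the lvsl+xxperm+xxsel sequence mentioned in the patch
description and not what rs6000_expand_vector_insert does: it relies on
GCC's generic vector extension, and v4si and insert_by_select are
hypothetical names used only here.

```c
/* Assumed for illustration: a 4 x int vector via the GCC vector extension.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Insert scalar I at lane (N & 3) of V without going through memory:
   compare a lane-number vector against the splatted index to get a mask,
   then blend the splatted scalar with the original vector.  */
static v4si
insert_by_select (v4si v, int i, unsigned long n)
{
  const v4si lanes = { 0, 1, 2, 3 };
  int idx = (int) (n & 3);
  v4si index = { idx, idx, idx, idx };
  v4si splat = { i, i, i, i };
  /* Vector comparison gives -1 (all ones) in the matching lane, 0 elsewhere.  */
  v4si mask = (lanes == index);
  return (splat & mask) | (v & ~mask);
}
```

The final blend plays the role a select instruction would play on a real
target; how the mask is actually produced there is target-specific and is
not modeled by this sketch.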