Date: Thu, 26 May 2022 13:30:38 +0800
Subject: Re: [PATCH v4, rs6000] Add V1TI into vector comparison expand [PR103316]
From: HAO CHEN GUI
To: "Kewen.Lin"
Cc: Segher Boessenkool, David, Peter Bergner, gcc-patches
References: <5e9b4423-b40f-f5e0-15fd-99776c426c32@linux.ibm.com>
List-Id: Gcc-patches mailing list

Kewen,
  Thanks so much for your advice. I have just one question, about
effective-target. The test cases need both power10_ok and int128
support, and I see that some existing test cases carry both checks as
well. But I wonder whether power10_ok already covers int128 on powerpc
targets. If it does, can we drop one of the checks?

On 26/5/2022 11:22 AM, Kewen.Lin wrote:
> Hi Haochen,
>
> on 2022/5/24 16:45, HAO CHEN GUI wrote:
>> Hi,
>>   This patch adds V1TI mode into a new mode iterator used in vector
>> comparison and rotation expands. Without the patch, the comparisons
>> between two vector __int128 are converted to scalar comparisons. The
>> code is suboptimal. The patch fixes the issue. Now all comparisons
>> between two vector __int128 generate P10 new comparison instructions.
>> Also the related built-ins generate the same instructions after gimple
>> folding. So they're added back to the list.
>>
>>   This patch also merges some vector comparison and rotation expands
>> for V1TI and other vector integer modes as they have the same patterns.
>> The expands for V1TI only are removed.
>>
>>   Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
>> Is this okay for trunk? Any recommendations? Thanks a lot.
>>
>> ChangeLog
>> 2022-05-24 Haochen Gui
>>
>> gcc/
>>     PR target/103316
>>     * config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Enable
>>     gimple folding for RS6000_BIF_VCMPEQUT, RS6000_BIF_VCMPNET,
>>     RS6000_BIF_CMPGE_1TI, RS6000_BIF_CMPGE_U1TI, RS6000_BIF_VCMPGTUT,
>>     RS6000_BIF_VCMPGTST, RS6000_BIF_CMPLE_1TI, RS6000_BIF_CMPLE_U1TI.
>>     * config/rs6000/vector.md (VEC_IC): Define. Add support for new Power10
>>     V1TI instructions.
>
> Nit: Maybe "New mode iterator" is better than "Define".
>
>>     (vec_cmp): Set mode iterator to VEC_IC.
>>     (vec_cmpu): Likewise.
>>     (vector_nlt): Set mode iterator to VEC_IC.
>>     (vector_nltv1ti): Remove.
>>     (vector_gtu): Set mode iterator to VEC_IC.
>>     (vector_gtuv1ti): Remove.
>>     (vector_nltu): Set mode iterator to VEC_IC.
>>     (vector_nltuv1ti): Remove.
>>     (vector_geu): Set mode iterator to VEC_IC.
>>     (vector_ngt): Likewise.
>>     (vector_ngtv1ti): Remove.
>>     (vector_ngtu): Set mode iterator to VEC_IC.
>>     (vector_ngtuv1ti): Remove.
>>     (vector_gtu__p): Set mode iterator to VEC_IC.
>>     (vector_gtu_v1ti_p): Remove.
>>     (vrotl3): Set mode iterator to VEC_IC. Emit insns for V1TI.
>>     (vrotlv1ti3): Remove.
>>     (vashr3): Set mode iterator to VEC_IC. Emit insns for V1TI.
>>     (vashrv1ti3): Remove.
>>
>> gcc/testsuite/
>>     PR target/103316
>>     * gcc.target/powerpc/pr103316.c: New.
>>     * gcc.target/powerpc/fold-vec-cmp-int128.c: New.
>>
>> patch.diff
>> diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
>> index e925ba9fad9..b67f4e066a8 100644
>> --- a/gcc/config/rs6000/rs6000-builtin.cc
>> +++ b/gcc/config/rs6000/rs6000-builtin.cc
>> @@ -2000,16 +2000,14 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>>      case RS6000_BIF_VCMPEQUH:
>>      case RS6000_BIF_VCMPEQUW:
>>      case RS6000_BIF_VCMPEQUD:
>> -    /* We deliberately omit RS6000_BIF_VCMPEQUT for now, because gimple
>> -       folding produces worse code for 128-bit compares. */
>> +    case RS6000_BIF_VCMPEQUT:
>>        fold_compare_helper (gsi, EQ_EXPR, stmt);
>>        return true;
>>
>>      case RS6000_BIF_VCMPNEB:
>>      case RS6000_BIF_VCMPNEH:
>>      case RS6000_BIF_VCMPNEW:
>> -    /* We deliberately omit RS6000_BIF_VCMPNET for now, because gimple
>> -       folding produces worse code for 128-bit compares. */
>> +    case RS6000_BIF_VCMPNET:
>>        fold_compare_helper (gsi, NE_EXPR, stmt);
>>        return true;
>>
>> @@ -2021,9 +2019,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>>      case RS6000_BIF_CMPGE_U4SI:
>>      case RS6000_BIF_CMPGE_2DI:
>>      case RS6000_BIF_CMPGE_U2DI:
>> -    /* We deliberately omit RS6000_BIF_CMPGE_1TI and RS6000_BIF_CMPGE_U1TI
>> -       for now, because gimple folding produces worse code for 128-bit
>> -       compares. */
>> +    case RS6000_BIF_CMPGE_1TI:
>> +    case RS6000_BIF_CMPGE_U1TI:
>>        fold_compare_helper (gsi, GE_EXPR, stmt);
>>        return true;
>>
>> @@ -2035,9 +2032,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>>      case RS6000_BIF_VCMPGTUW:
>>      case RS6000_BIF_VCMPGTUD:
>>      case RS6000_BIF_VCMPGTSD:
>> -    /* We deliberately omit RS6000_BIF_VCMPGTUT and RS6000_BIF_VCMPGTST
>> -       for now, because gimple folding produces worse code for 128-bit
>> -       compares. */
>> +    case RS6000_BIF_VCMPGTUT:
>> +    case RS6000_BIF_VCMPGTST:
>>        fold_compare_helper (gsi, GT_EXPR, stmt);
>>        return true;
>>
>> @@ -2049,9 +2045,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>>      case RS6000_BIF_CMPLE_U4SI:
>>      case RS6000_BIF_CMPLE_2DI:
>>      case RS6000_BIF_CMPLE_U2DI:
>> -    /* We deliberately omit RS6000_BIF_CMPLE_1TI and RS6000_BIF_CMPLE_U1TI
>> -       for now, because gimple folding produces worse code for 128-bit
>> -       compares. */
>> +    case RS6000_BIF_CMPLE_1TI:
>> +    case RS6000_BIF_CMPLE_U1TI:
>>        fold_compare_helper (gsi, LE_EXPR, stmt);
>>        return true;
>>
>> diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
>> index 4d0797c48f8..3b7a272994f 100644
>> --- a/gcc/config/rs6000/vector.md
>> +++ b/gcc/config/rs6000/vector.md
>> @@ -26,6 +26,9 @@
>>  ;; Vector int modes
>>  (define_mode_iterator VEC_I [V16QI V8HI V4SI V2DI])
>>
>> +;; Vector int modes for comparison
>
> Nit: This comment line doesn't perfectly match the usage below since it's also
> used for shift and rotation in this patch. Maybe it's better with:
>
> "Vector int modes for comparison, shift and rotation"
>
>> +(define_mode_iterator VEC_IC [V16QI V8HI V4SI V2DI (V1TI "TARGET_POWER10")])
>> +
>>  ;; 128-bit int modes
>>  (define_mode_iterator VEC_TI [V1TI TI])
>>
>> @@ -533,10 +536,10 @@ (define_expand "vcond_mask_"
>>
>>  ;; For signed integer vectors comparison.
>>  (define_expand "vec_cmp"
>> -  [(set (match_operand:VEC_I 0 "vint_operand")
>> +  [(set (match_operand:VEC_IC 0 "vint_operand")
>>      (match_operator 1 "signed_or_equality_comparison_operator"
>> -      [(match_operand:VEC_I 2 "vint_operand")
>> -       (match_operand:VEC_I 3 "vint_operand")]))]
>> +      [(match_operand:VEC_IC 2 "vint_operand")
>> +       (match_operand:VEC_IC 3 "vint_operand")]))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
>>  {
>>    enum rtx_code code = GET_CODE (operands[1]);
>> @@ -573,10 +576,10 @@ (define_expand "vec_cmp"
>>
>>  ;; For unsigned integer vectors comparison.
>>  (define_expand "vec_cmpu"
>> -  [(set (match_operand:VEC_I 0 "vint_operand")
>> +  [(set (match_operand:VEC_IC 0 "vint_operand")
>>      (match_operator 1 "unsigned_or_equality_comparison_operator"
>> -      [(match_operand:VEC_I 2 "vint_operand")
>> -       (match_operand:VEC_I 3 "vint_operand")]))]
>> +      [(match_operand:VEC_IC 2 "vint_operand")
>> +       (match_operand:VEC_IC 3 "vint_operand")]))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
>>  {
>>    enum rtx_code code = GET_CODE (operands[1]);
>> @@ -690,116 +693,65 @@ (define_expand "vector_gt"
>>
>>  ; >= for integer vectors: swap operands and apply not-greater-than
>>  (define_expand "vector_nlt"
>> -  [(set (match_operand:VEC_I 3 "vlogical_operand")
>> -    (gt:VEC_I (match_operand:VEC_I 2 "vlogical_operand")
>> -              (match_operand:VEC_I 1 "vlogical_operand")))
>> -   (set (match_operand:VEC_I 0 "vlogical_operand")
>> -        (not:VEC_I (match_dup 3)))]
>> +  [(set (match_operand:VEC_IC 3 "vlogical_operand")
>> +    (gt:VEC_IC (match_operand:VEC_IC 2 "vlogical_operand")
>> +               (match_operand:VEC_IC 1 "vlogical_operand")))
>> +   (set (match_operand:VEC_IC 0 "vlogical_operand")
>> +        (not:VEC_IC (match_dup 3)))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
>>  {
>>    operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>>  })
>>
>> -(define_expand "vector_nltv1ti"
>> -  [(set (match_operand:V1TI 3 "vlogical_operand")
>> -    (gt:V1TI (match_operand:V1TI 2 "vlogical_operand")
>> -             (match_operand:V1TI 1 "vlogical_operand")))
>> -   (set (match_operand:V1TI 0 "vlogical_operand")
>> -        (not:V1TI (match_dup 3)))]
>> -  "TARGET_POWER10"
>> -{
>> -  operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>> -})
>> -
>>  (define_expand "vector_gtu"
>> -  [(set (match_operand:VEC_I 0 "vint_operand")
>> -    (gtu:VEC_I (match_operand:VEC_I 1 "vint_operand")
>> -               (match_operand:VEC_I 2 "vint_operand")))]
>> +  [(set (match_operand:VEC_IC 0 "vint_operand")
>> +    (gtu:VEC_IC (match_operand:VEC_IC 1 "vint_operand")
>> +                (match_operand:VEC_IC 2 "vint_operand")))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
>>    "")
>>
>> -(define_expand "vector_gtuv1ti"
>> -  [(set (match_operand:V1TI 0 "altivec_register_operand")
>> -    (gtu:V1TI (match_operand:V1TI 1 "altivec_register_operand")
>> -              (match_operand:V1TI 2 "altivec_register_operand")))]
>> -  "TARGET_POWER10"
>> -  "")
>> -
>>  ; >= for integer vectors: swap operands and apply not-greater-than
>>  (define_expand "vector_nltu"
>> -  [(set (match_operand:VEC_I 3 "vlogical_operand")
>> -    (gtu:VEC_I (match_operand:VEC_I 2 "vlogical_operand")
>> -               (match_operand:VEC_I 1 "vlogical_operand")))
>> -   (set (match_operand:VEC_I 0 "vlogical_operand")
>> -        (not:VEC_I (match_dup 3)))]
>> +  [(set (match_operand:VEC_IC 3 "vlogical_operand")
>> +    (gtu:VEC_IC (match_operand:VEC_IC 2 "vlogical_operand")
>> +                (match_operand:VEC_IC 1 "vlogical_operand")))
>> +   (set (match_operand:VEC_IC 0 "vlogical_operand")
>> +        (not:VEC_IC (match_dup 3)))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
>>  {
>>    operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>>  })
>>
>> -(define_expand "vector_nltuv1ti"
>> -  [(set (match_operand:V1TI 3 "vlogical_operand")
>> -    (gtu:V1TI (match_operand:V1TI 2 "vlogical_operand")
>> -              (match_operand:V1TI 1 "vlogical_operand")))
>> -   (set (match_operand:V1TI 0 "vlogical_operand")
>> -        (not:V1TI (match_dup 3)))]
>> -  "TARGET_POWER10"
>> -{
>> -  operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>> -})
>> -
>>  (define_expand "vector_geu"
>> -  [(set (match_operand:VEC_I 0 "vint_operand")
>> -    (geu:VEC_I (match_operand:VEC_I 1 "vint_operand")
>> -               (match_operand:VEC_I 2 "vint_operand")))]
>> +  [(set (match_operand:VEC_IC 0 "vint_operand")
>> +    (geu:VEC_IC (match_operand:VEC_IC 1 "vint_operand")
>> +                (match_operand:VEC_IC 2 "vint_operand")))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
>>    "")
>>
>>  ; <= for integer vectors: apply not-greater-than
>>  (define_expand "vector_ngt"
>> -  [(set (match_operand:VEC_I 3 "vlogical_operand")
>> -    (gt:VEC_I (match_operand:VEC_I 1 "vlogical_operand")
>> -              (match_operand:VEC_I 2 "vlogical_operand")))
>> -   (set (match_operand:VEC_I 0 "vlogical_operand")
>> -        (not:VEC_I (match_dup 3)))]
>> +  [(set (match_operand:VEC_IC 3 "vlogical_operand")
>> +    (gt:VEC_IC (match_operand:VEC_IC 1 "vlogical_operand")
>> +               (match_operand:VEC_IC 2 "vlogical_operand")))
>> +   (set (match_operand:VEC_IC 0 "vlogical_operand")
>> +        (not:VEC_IC (match_dup 3)))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
>>  {
>>    operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>>  })
>>
>> -(define_expand "vector_ngtv1ti"
>> -  [(set (match_operand:V1TI 3 "vlogical_operand")
>> -    (gt:V1TI (match_operand:V1TI 1 "vlogical_operand")
>> -             (match_operand:V1TI 2 "vlogical_operand")))
>> -   (set (match_operand:V1TI 0 "vlogical_operand")
>> -        (not:V1TI (match_dup 3)))]
>> -  "TARGET_POWER10"
>> -{
>> -  operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>> -})
>> -
>>  (define_expand "vector_ngtu"
>> -  [(set (match_operand:VEC_I 3 "vlogical_operand")
>> -    (gtu:VEC_I (match_operand:VEC_I 1 "vlogical_operand")
>> -               (match_operand:VEC_I 2 "vlogical_operand")))
>> -   (set (match_operand:VEC_I 0 "vlogical_operand")
>> -        (not:VEC_I (match_dup 3)))]
>> +  [(set (match_operand:VEC_IC 3 "vlogical_operand")
>> +    (gtu:VEC_IC (match_operand:VEC_IC 1 "vlogical_operand")
>> +                (match_operand:VEC_IC 2 "vlogical_operand")))
>> +   (set (match_operand:VEC_IC 0 "vlogical_operand")
>> +        (not:VEC_IC (match_dup 3)))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
>>  {
>>    operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>>  })
>>
>> -(define_expand "vector_ngtuv1ti"
>> -  [(set (match_operand:V1TI 3 "vlogical_operand")
>> -    (gtu:V1TI (match_operand:V1TI 1 "vlogical_operand")
>> -              (match_operand:V1TI 2 "vlogical_operand")))
>> -   (set (match_operand:V1TI 0 "vlogical_operand")
>> -        (not:V1TI (match_dup 3)))]
>> -  "TARGET_POWER10"
>> -{
>> -  operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>> -})
>> -
>>  ; There are 14 possible vector FP comparison operators, gt and eq of them have
>>  ; been expanded above, so just support 12 remaining operators here.
>>
>> @@ -1189,27 +1141,15 @@ (define_expand "vector_ge__p"
>>  (define_expand "vector_gtu__p"
>>    [(parallel
>>      [(set (reg:CC CR6_REGNO)
>> -      (unspec:CC [(gtu:CC (match_operand:VEC_I 1 "vint_operand")
>> -                          (match_operand:VEC_I 2 "vint_operand"))]
>> +      (unspec:CC [(gtu:CC (match_operand:VEC_IC 1 "vint_operand")
>> +                          (match_operand:VEC_IC 2 "vint_operand"))]
>>         UNSPEC_PREDICATE))
>> -     (set (match_operand:VEC_I 0 "vlogical_operand")
>> -      (gtu:VEC_I (match_dup 1)
>> -                 (match_dup 2)))])]
>> +     (set (match_operand:VEC_IC 0 "vlogical_operand")
>> +      (gtu:VEC_IC (match_dup 1)
>> +                  (match_dup 2)))])]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
>>    "")
>>
>> -(define_expand "vector_gtu_v1ti_p"
>> -  [(parallel
>> -    [(set (reg:CC CR6_REGNO)
>> -      (unspec:CC [(gtu:CC (match_operand:V1TI 1 "altivec_register_operand")
>> -                          (match_operand:V1TI 2 "altivec_register_operand"))]
>> -       UNSPEC_PREDICATE))
>> -     (set (match_operand:V1TI 0 "altivec_register_operand")
>> -      (gtu:V1TI (match_dup 1)
>> -                (match_dup 2)))])]
>> -  "TARGET_POWER10"
>> -  "")
>> -
>>  ;; AltiVec/VSX predicates.
>>
>>  ;; This expansion is triggered during expansion of predicate built-in
>> @@ -1582,25 +1522,21 @@ (define_expand "vec_shr_"
>>
>>  ;; Expanders for rotate each element in a vector
>>  (define_expand "vrotl3"
>> -  [(set (match_operand:VEC_I 0 "vint_operand")
>> -    (rotate:VEC_I (match_operand:VEC_I 1 "vint_operand")
>> -                  (match_operand:VEC_I 2 "vint_operand")))]
>> +  [(set (match_operand:VEC_IC 0 "vint_operand")
>> +    (rotate:VEC_IC (match_operand:VEC_IC 1 "vint_operand")
>> +                   (match_operand:VEC_IC 2 "vint_operand")))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
>> -  "")
>> -
>> -(define_expand "vrotlv1ti3"
>> -  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
>> -    (rotate:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
>> -                 (match_operand:V1TI 2 "vsx_register_operand" "v")))]
>> -  "TARGET_POWER10"
>>  {
>> -  /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */
>> -  rtx tmp = gen_reg_rtx (V1TImode);
>> +  /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */
>> +  if (mode == V1TImode)
>> +    {
>> +      rtx tmp = gen_reg_rtx (V1TImode);
>>
>> -  emit_insn (gen_xxswapd_v1ti (tmp, operands[2]));
>> -  emit_insn (gen_altivec_vrlq (operands[0], operands[1], tmp));
>> -  DONE;
>> -})
>> +      emit_insn (gen_xxswapd_v1ti (tmp, operands[2]));
>> +      emit_insn (gen_altivec_vrlq (operands[0], operands[1], tmp));
>> +      DONE;
>> +    }
>> + })
>>
>>  ;; Expanders for rotatert to make use of vrotl
>>  (define_expand "vrotr3"
>> @@ -1663,25 +1599,20 @@ (define_expand "vlshr3"
>>
>>  ;; Expanders for arithmetic shift right on each vector element
>>  (define_expand "vashr3"
>> -  [(set (match_operand:VEC_I 0 "vint_operand")
>> -    (ashiftrt:VEC_I (match_operand:VEC_I 1 "vint_operand")
>> -                    (match_operand:VEC_I 2 "vint_operand")))]
>> +  [(set (match_operand:VEC_IC 0 "vint_operand")
>> +    (ashiftrt:VEC_IC (match_operand:VEC_IC 1 "vint_operand")
>> +                     (match_operand:VEC_IC 2 "vint_operand")))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)"
>> -  "")
>> -
>> -;; No immediate version of this 128-bit instruction
>> -(define_expand "vashrv1ti3"
>> -  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
>> -    (ashiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
>> -                   (match_operand:V1TI 2 "vsx_register_operand" "v")))]
>> -  "TARGET_POWER10"
>>  {
>> -  /* Shift amount in needs to be put into bits[57:63] of 128-bit operand2. */
>> -  rtx tmp = gen_reg_rtx (V1TImode);
>> +  /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */
>> +  if (mode == V1TImode)
>> +    {
>> +      rtx tmp = gen_reg_rtx (V1TImode);
>>
>> -  emit_insn (gen_xxswapd_v1ti (tmp, operands[2]));
>> -  emit_insn (gen_altivec_vsraq (operands[0], operands[1], tmp));
>> -  DONE;
>> +      emit_insn (gen_xxswapd_v1ti (tmp, operands[2]));
>> +      emit_insn (gen_altivec_vsraq (operands[0], operands[1], tmp));
>> +      DONE;
>> +    }
>>  })
>>
>>
>> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-cmp-int128.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-cmp-int128.c
>> new file mode 100644
>> index 00000000000..1a4db0f45d4
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-cmp-int128.c
>> @@ -0,0 +1,86 @@
>> +/* Verify that overloaded built-ins for vec_cmp with __int128
>> +   inputs produce the right code. */
>> +
>> +/* { dg-do compile } */
>
> Need /* { dg-require-effective-target int128 } */
>
>> +/* { dg-require-effective-target power10_ok } */
>> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
>> +
>> +#include
>> +
>> +vector bool __int128
>> +test3_eq (vector signed __int128 x, vector signed __int128 y)
>> +{
>> +  return vec_cmpeq (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test6_eq (vector unsigned __int128 x, vector unsigned __int128 y)
>> +{
>> +  return vec_cmpeq (x, y);
>> +}
>> +
>
> Nit: The function names test6 and test3 seem to be copied from other cases
> somewhere. Maybe it's more meaningful with s/3// (or s/3/s/) and s/6/u/.
>
>> +vector bool __int128
>> +test3_ge (vector signed __int128 x, vector signed __int128 y)
>> +{
>> +  return vec_cmpge (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test6_ge (vector unsigned __int128 x, vector unsigned __int128 y)
>> +{
>> +  return vec_cmpge (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test3_gt (vector signed __int128 x, vector signed __int128 y)
>> +{
>> +  return vec_cmpgt (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test6_gt (vector unsigned __int128 x, vector unsigned __int128 y)
>> +{
>> +  return vec_cmpgt (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test3_le (vector signed __int128 x, vector signed __int128 y)
>> +{
>> +  return vec_cmple (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test6_le (vector unsigned __int128 x, vector unsigned __int128 y)
>> +{
>> +  return vec_cmple (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test3_lt (vector signed __int128 x, vector signed __int128 y)
>> +{
>> +  return vec_cmplt (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test6_lt (vector unsigned __int128 x, vector unsigned __int128 y)
>> +{
>> +  return vec_cmplt (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test3_ne (vector signed __int128 x, vector signed __int128 y)
>> +{
>> +  return vec_cmpne (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test6_ne (vector unsigned __int128 x, vector unsigned __int128 y)
>> +{
>> +  return vec_cmpne (x, y);
>> +}
>> +
>> +/* { dg-final { scan-assembler-times "vcmpequq" 4 } } */
>> +/* { dg-final { scan-assembler-times "vcmpgtsq" 4 } } */
>> +/* { dg-final { scan-assembler-times "vcmpgtuq" 4 } } */
>> +/* { dg-final { scan-assembler-times "xxlnor" 6 } } */
>> +
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103316.c b/gcc/testsuite/gcc.target/powerpc/pr103316.c
>> new file mode 100644
>> index 00000000000..02f7dc5ca1b
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr103316.c
>> @@ -0,0 +1,80 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target power10_ok } */
>
> Need /* { dg-require-effective-target int128 } */ too.
>
> BR,
> Kewen
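
For reference, the dg-final counts in fold-vec-cmp-int128.c are consistent with the expander comments quoted above (">= ... swap operands and apply not-greater-than", "<= ... apply not-greater-than"). Below is a rough scalar model in portable C, not the real expansion: the helper names (equ, gts, gtu, nor, model_counts) are made up for illustration, each helper is assumed to stand for exactly one instruction, and "ne" is assumed to be an equality compare followed by a NOR, which is only inferred from the expected xxlnor count.

```c
#include <assert.h>

typedef __int128 s128;
typedef unsigned __int128 u128;

/* One counter per mnemonic scanned by dg-final (hypothetical model).  */
struct insn_counts { int vcmpequq, vcmpgtsq, vcmpgtuq, xxlnor; };
static struct insn_counts counts;

/* A one-element compare yields all ones for true, all zeros for false.  */
static u128 mask (int cond) { return cond ? ~(u128) 0 : (u128) 0; }

/* Each primitive is assumed to map to exactly one instruction.  */
static u128 equ (u128 a, u128 b) { counts.vcmpequq++; return mask (a == b); }
static u128 gts (s128 a, s128 b) { counts.vcmpgtsq++; return mask (a > b); }
static u128 gtu (u128 a, u128 b) { counts.vcmpgtuq++; return mask (a > b); }
static u128 nor (u128 m) { counts.xxlnor++; return ~m; }

/* Signed forms: ge and lt swap the operands, ge/le/ne invert.  */
static u128 s_eq (s128 a, s128 b) { return equ ((u128) a, (u128) b); }
static u128 s_ne (s128 a, s128 b) { return nor (equ ((u128) a, (u128) b)); }
static u128 s_gt (s128 a, s128 b) { return gts (a, b); }
static u128 s_lt (s128 a, s128 b) { return gts (b, a); }
static u128 s_ge (s128 a, s128 b) { return nor (gts (b, a)); }
static u128 s_le (s128 a, s128 b) { return nor (gts (a, b)); }

/* Unsigned forms follow the same shape with the unsigned compare.  */
static u128 u_eq (u128 a, u128 b) { return equ (a, b); }
static u128 u_ne (u128 a, u128 b) { return nor (equ (a, b)); }
static u128 u_gt (u128 a, u128 b) { return gtu (a, b); }
static u128 u_lt (u128 a, u128 b) { return gtu (b, a); }
static u128 u_ge (u128 a, u128 b) { return nor (gtu (b, a)); }
static u128 u_le (u128 a, u128 b) { return nor (gtu (a, b)); }

/* Call each of the twelve modeled test functions once, check the
   results, and return how often each primitive was used.  */
struct insn_counts
model_counts (void)
{
  const struct insn_counts zero = { 0, 0, 0, 0 };
  const u128 all = ~(u128) 0;
  int ok = 1;

  counts = zero;

  /* Signed: -1 is less than 1.  */
  ok &= s_eq (-1, 1) == 0;
  ok &= s_ne (-1, 1) == all;
  ok &= s_gt (-1, 1) == 0;
  ok &= s_lt (-1, 1) == all;
  ok &= s_ge (-1, 1) == 0;
  ok &= s_le (-1, 1) == all;

  /* Unsigned: the all-ones value is the maximum, so it is above 1.  */
  ok &= u_eq (all, 1) == 0;
  ok &= u_ne (all, 1) == all;
  ok &= u_gt (all, 1) == all;
  ok &= u_lt (all, 1) == 0;
  ok &= u_ge (all, 1) == all;
  ok &= u_le (all, 1) == 0;

  assert (ok);
  return counts;
}
```

Calling model_counts () once gives 4 uses each of the three compares and 6 inversions under these assumptions, which lines up with the four scan-assembler-times directives. The real instruction selection is of course decided by the patterns in vector.md, so this is only a sanity check of the arithmetic.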