From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <guihaoc@linux.ibm.com>
Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com
 [148.163.156.1])
 by sourceware.org (Postfix) with ESMTPS id 6BFD9385740D
 for <gcc-patches@gcc.gnu.org>; Thu, 26 May 2022 05:30:51 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6BFD9385740D
Received: from pps.filterd (m0098396.ppops.net [127.0.0.1])
 by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 24Q4N0tk004169;
 Thu, 26 May 2022 05:30:49 GMT
Received: from pps.reinject (localhost [127.0.0.1])
 by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3ga2fkgwt7-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Thu, 26 May 2022 05:30:49 +0000
Received: from m0098396.ppops.net (m0098396.ppops.net [127.0.0.1])
 by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 24Q5DVii022402;
 Thu, 26 May 2022 05:30:49 GMT
Received: from ppma06fra.de.ibm.com (48.49.7a9f.ip4.static.sl-reverse.com
 [159.122.73.72])
 by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3ga2fkgws4-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Thu, 26 May 2022 05:30:48 +0000
Received: from pps.filterd (ppma06fra.de.ibm.com [127.0.0.1])
 by ppma06fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 24Q5LHwq022130;
 Thu, 26 May 2022 05:30:46 GMT
Received: from b06cxnps3075.portsmouth.uk.ibm.com
 (d06relay10.portsmouth.uk.ibm.com [9.149.109.195])
 by ppma06fra.de.ibm.com with ESMTP id 3g948n0f25-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Thu, 26 May 2022 05:30:46 +0000
Received: from d06av24.portsmouth.uk.ibm.com (d06av24.portsmouth.uk.ibm.com
 [9.149.105.60])
 by b06cxnps3075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id
 24Q5Uh8013762950
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
 Thu, 26 May 2022 05:30:43 GMT
Received: from d06av24.portsmouth.uk.ibm.com (unknown [127.0.0.1])
 by IMSVA (Postfix) with ESMTP id 230D442045;
 Thu, 26 May 2022 05:30:43 +0000 (GMT)
Received: from d06av24.portsmouth.uk.ibm.com (unknown [127.0.0.1])
 by IMSVA (Postfix) with ESMTP id 5CC7D42041;
 Thu, 26 May 2022 05:30:40 +0000 (GMT)
Received: from [9.200.42.164] (unknown [9.200.42.164])
 by d06av24.portsmouth.uk.ibm.com (Postfix) with ESMTP;
 Thu, 26 May 2022 05:30:40 +0000 (GMT)
Message-ID: <a7cb3408-59ac-78c1-6bf6-ab80edb62a61@linux.ibm.com>
Date: Thu, 26 May 2022 13:30:38 +0800
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
 Thunderbird/91.9.1
Subject: Re: [PATCH v4, rs6000] Add V1TI into vector comparison expand
 [PR103316]
Content-Language: en-US
To: "Kewen.Lin" <linkw@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>, David
 <dje.gcc@gmail.com>, Peter Bergner <bergner@linux.ibm.com>,
 gcc-patches <gcc-patches@gcc.gnu.org>
References: <5e9b4423-b40f-f5e0-15fd-99776c426c32@linux.ibm.com>
 <ef264a3e-cb4d-66dd-fd6e-93a4c0254780@linux.ibm.com>
From: HAO CHEN GUI <guihaoc@linux.ibm.com>
In-Reply-To: <ef264a3e-cb4d-66dd-fd6e-93a4c0254780@linux.ibm.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-TM-AS-GCONF: 00
X-Proofpoint-GUID: jBdLxLRxBbBxk_VeAWx-caoYW1I4ZI2J
X-Proofpoint-ORIG-GUID: kgMMQkrytt0k3KFzl4a_PidDbqSI3miG
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.205,Aquarius:18.0.874,Hydra:6.0.486,FMLib:17.11.64.514
 definitions=2022-05-26_01,2022-05-25_02,2022-02-23_01
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
 clxscore=1015 spamscore=0
 priorityscore=1501 malwarescore=0 impostorscore=0 suspectscore=0
 phishscore=0 lowpriorityscore=0 mlxlogscore=999 adultscore=0 bulkscore=0
 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1
 engine=8.12.0-2204290000 definitions=main-2205260030
X-Spam-Status: No, score=-14.0 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, NICE_REPLY_A,
 RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Thu, 26 May 2022 05:30:54 -0000

Kewen,
  Thanks so much for your advice. Just one question about effective-target.

  For the test cases, it needs both power10_ok and int128 support. I saw some
existing test cases have these two checks as well. But I wonder if power10_ok
already covers int128 on powerpc targets? Can we save one check then?

On 26/5/2022 上午 11:22, Kewen.Lin wrote:
> Hi Haochen,
> 
> on 2022/5/24 16:45, HAO CHEN GUI wrote:
>> Hi,
>>    This patch adds V1TI mode into a new mode iterator used in vector
>> comparison and rotation expands. Without the patch, the comparisons
>> between two vector __int128 are converted to scalar comparisons. The
>> code is suboptimal. The patch fixes the issue. Now all comparisons
>> between two vector __int128 generates P10 new comparison instructions.
>> Also the relative built-ins generate the same instructions after gimple
>> folding. So they're added back to the list.
>>
>>   This patch also merges some vector comparison and rotation expands
>> for V1T1 and other vector integer modes as they have the same patterns.
>> The expands for V1TI only are removed.
>>
>>    Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
>> Is this okay for trunk? Any recommendations? Thanks a lot.
>>
>> ChangeLog
>> 2022-05-24 Haochen Gui <guihaoc@linux.ibm.com>
>>
>> gcc/
>> 	PR target/103316
>> 	* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Enable
>> 	gimple folding for RS6000_BIF_VCMPEQUT, RS6000_BIF_VCMPNET,
>> 	RS6000_BIF_CMPGE_1TI, RS6000_BIF_CMPGE_U1TI, RS6000_BIF_VCMPGTUT,
>> 	RS6000_BIF_VCMPGTST, RS6000_BIF_CMPLE_1TI, RS6000_BIF_CMPLE_U1TI.
>> 	* config/rs6000/vector.md (VEC_IC): Define.  Add support for new Power10
>> 	V1TI instructions.
> 
> Nit: Maybe "New mode iterator" is better than "Define".
> 
>> 	(vec_cmp<mode><mode>): Set mode iterator to VEC_IC.
>> 	(vec_cmpu<mode><mode>): Likewise.
>> 	(vector_nlt<mode>): Set mode iterator to VEC_IC.
>> 	(vector_nltv1ti): Remove.
>> 	(vector_gtu<mode>): Set mode iterator to VEC_IC.
>> 	(vector_gtuv1ti): Remove.
>> 	(vector_nltu<mode>): Set mode iterator to VEC_IC.
>> 	(vector_nltuv1ti): Remove.
>> 	(vector_geu<mode>): Set mode iterator to VEC_IC.
>> 	(vector_ngt<mode>): Likewise.
>> 	(vector_ngtv1ti): Remove.
>> 	(vector_ngtu<mode>): Set mode iterator to VEC_IC.
>> 	(vector_ngtuv1ti): Remove.
>> 	(vector_gtu_<mode>_p): Set mode iterator to VEC_IC.
>> 	(vector_gtu_v1ti_p): Remove.
>> 	(vrotl<mode>3): Set mode iterator to VEC_IC.  Emit insns for V1TI.
>> 	(vrotlv1ti3): Remove.
>> 	(vashr<mode>3): Set mode iterator to VEC_IC.  Emit insns for V1TI.
>> 	(vashrv1ti3): Remove.
>>
>> gcc/testsuite/
>> 	PR target/103316
>> 	* gcc.target/powerpc/pr103316.c: New.
>> 	* gcc.target/powerpc/fold-vec-cmp-int128.c: New.
>>
>> patch.diff
>> diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
>> index e925ba9fad9..b67f4e066a8 100644
>> --- a/gcc/config/rs6000/rs6000-builtin.cc
>> +++ b/gcc/config/rs6000/rs6000-builtin.cc
>> @@ -2000,16 +2000,14 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>>      case RS6000_BIF_VCMPEQUH:
>>      case RS6000_BIF_VCMPEQUW:
>>      case RS6000_BIF_VCMPEQUD:
>> -    /* We deliberately omit RS6000_BIF_VCMPEQUT for now, because gimple
>> -       folding produces worse code for 128-bit compares.  */
>> +    case RS6000_BIF_VCMPEQUT:
>>        fold_compare_helper (gsi, EQ_EXPR, stmt);
>>        return true;
>>
>>      case RS6000_BIF_VCMPNEB:
>>      case RS6000_BIF_VCMPNEH:
>>      case RS6000_BIF_VCMPNEW:
>> -    /* We deliberately omit RS6000_BIF_VCMPNET for now, because gimple
>> -       folding produces worse code for 128-bit compares.  */
>> +    case RS6000_BIF_VCMPNET:
>>        fold_compare_helper (gsi, NE_EXPR, stmt);
>>        return true;
>>
>> @@ -2021,9 +2019,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>>      case RS6000_BIF_CMPGE_U4SI:
>>      case RS6000_BIF_CMPGE_2DI:
>>      case RS6000_BIF_CMPGE_U2DI:
>> -    /* We deliberately omit RS6000_BIF_CMPGE_1TI and RS6000_BIF_CMPGE_U1TI
>> -       for now, because gimple folding produces worse code for 128-bit
>> -       compares.  */
>> +    case RS6000_BIF_CMPGE_1TI:
>> +    case RS6000_BIF_CMPGE_U1TI:
>>        fold_compare_helper (gsi, GE_EXPR, stmt);
>>        return true;
>>
>> @@ -2035,9 +2032,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>>      case RS6000_BIF_VCMPGTUW:
>>      case RS6000_BIF_VCMPGTUD:
>>      case RS6000_BIF_VCMPGTSD:
>> -    /* We deliberately omit RS6000_BIF_VCMPGTUT and RS6000_BIF_VCMPGTST
>> -       for now, because gimple folding produces worse code for 128-bit
>> -       compares.  */
>> +    case RS6000_BIF_VCMPGTUT:
>> +    case RS6000_BIF_VCMPGTST:
>>        fold_compare_helper (gsi, GT_EXPR, stmt);
>>        return true;
>>
>> @@ -2049,9 +2045,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>>      case RS6000_BIF_CMPLE_U4SI:
>>      case RS6000_BIF_CMPLE_2DI:
>>      case RS6000_BIF_CMPLE_U2DI:
>> -    /* We deliberately omit RS6000_BIF_CMPLE_1TI and RS6000_BIF_CMPLE_U1TI
>> -       for now, because gimple folding produces worse code for 128-bit
>> -       compares.  */
>> +    case RS6000_BIF_CMPLE_1TI:
>> +    case RS6000_BIF_CMPLE_U1TI:
>>        fold_compare_helper (gsi, LE_EXPR, stmt);
>>        return true;
>>
>> diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
>> index 4d0797c48f8..3b7a272994f 100644
>> --- a/gcc/config/rs6000/vector.md
>> +++ b/gcc/config/rs6000/vector.md
>> @@ -26,6 +26,9 @@
>>  ;; Vector int modes
>>  (define_mode_iterator VEC_I [V16QI V8HI V4SI V2DI])
>>
>> +;; Vector int modes for comparison
> 
> Nit: This comment line doesn't perfectly match the usage below since it's also
> used for shift and rotation in this patch.  Maybe it's better with:
> 
> "Vector int modes for comparison, shift and rotation"
> 
>> +(define_mode_iterator VEC_IC [V16QI V8HI V4SI V2DI (V1TI "TARGET_POWER10")])
>> +
>>  ;; 128-bit int modes
>>  (define_mode_iterator VEC_TI [V1TI TI])
>>
>> @@ -533,10 +536,10 @@ (define_expand "vcond_mask_<mode><VEC_int>"
>>
>>  ;; For signed integer vectors comparison.
>>  (define_expand "vec_cmp<mode><mode>"
>> -  [(set (match_operand:VEC_I 0 "vint_operand")
>> +  [(set (match_operand:VEC_IC 0 "vint_operand")
>>  	(match_operator 1 "signed_or_equality_comparison_operator"
>> -	  [(match_operand:VEC_I 2 "vint_operand")
>> -	   (match_operand:VEC_I 3 "vint_operand")]))]
>> +	  [(match_operand:VEC_IC 2 "vint_operand")
>> +	   (match_operand:VEC_IC 3 "vint_operand")]))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>>  {
>>    enum rtx_code code = GET_CODE (operands[1]);
>> @@ -573,10 +576,10 @@ (define_expand "vec_cmp<mode><mode>"
>>
>>  ;; For unsigned integer vectors comparison.
>>  (define_expand "vec_cmpu<mode><mode>"
>> -  [(set (match_operand:VEC_I 0 "vint_operand")
>> +  [(set (match_operand:VEC_IC 0 "vint_operand")
>>  	(match_operator 1 "unsigned_or_equality_comparison_operator"
>> -	  [(match_operand:VEC_I 2 "vint_operand")
>> -	   (match_operand:VEC_I 3 "vint_operand")]))]
>> +	  [(match_operand:VEC_IC 2 "vint_operand")
>> +	   (match_operand:VEC_IC 3 "vint_operand")]))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>>  {
>>    enum rtx_code code = GET_CODE (operands[1]);
>> @@ -690,116 +693,65 @@ (define_expand "vector_gt<mode>"
>>
>>  ; >= for integer vectors: swap operands and apply not-greater-than
>>  (define_expand "vector_nlt<mode>"
>> -  [(set (match_operand:VEC_I 3 "vlogical_operand")
>> -	(gt:VEC_I (match_operand:VEC_I 2 "vlogical_operand")
>> -		  (match_operand:VEC_I 1 "vlogical_operand")))
>> -   (set (match_operand:VEC_I 0 "vlogical_operand")
>> -        (not:VEC_I (match_dup 3)))]
>> +  [(set (match_operand:VEC_IC 3 "vlogical_operand")
>> +	(gt:VEC_IC (match_operand:VEC_IC 2 "vlogical_operand")
>> +		   (match_operand:VEC_IC 1 "vlogical_operand")))
>> +   (set (match_operand:VEC_IC 0 "vlogical_operand")
>> +	(not:VEC_IC (match_dup 3)))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>>  {
>>    operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>>  })
>>
>> -(define_expand "vector_nltv1ti"
>> -  [(set (match_operand:V1TI 3 "vlogical_operand")
>> -	(gt:V1TI (match_operand:V1TI 2 "vlogical_operand")
>> -		 (match_operand:V1TI 1 "vlogical_operand")))
>> -   (set (match_operand:V1TI 0 "vlogical_operand")
>> -        (not:V1TI (match_dup 3)))]
>> -  "TARGET_POWER10"
>> -{
>> -  operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>> -})
>> -
>>  (define_expand "vector_gtu<mode>"
>> -  [(set (match_operand:VEC_I 0 "vint_operand")
>> -	(gtu:VEC_I (match_operand:VEC_I 1 "vint_operand")
>> -		   (match_operand:VEC_I 2 "vint_operand")))]
>> +  [(set (match_operand:VEC_IC 0 "vint_operand")
>> +	(gtu:VEC_IC (match_operand:VEC_IC 1 "vint_operand")
>> +		    (match_operand:VEC_IC 2 "vint_operand")))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>>    "")
>>
>> -(define_expand "vector_gtuv1ti"
>> -  [(set (match_operand:V1TI 0 "altivec_register_operand")
>> -	(gtu:V1TI (match_operand:V1TI 1 "altivec_register_operand")
>> -		  (match_operand:V1TI 2 "altivec_register_operand")))]
>> -  "TARGET_POWER10"
>> -  "")
>> -
>>  ; >= for integer vectors: swap operands and apply not-greater-than
>>  (define_expand "vector_nltu<mode>"
>> -  [(set (match_operand:VEC_I 3 "vlogical_operand")
>> -	(gtu:VEC_I (match_operand:VEC_I 2 "vlogical_operand")
>> -	 	   (match_operand:VEC_I 1 "vlogical_operand")))
>> -   (set (match_operand:VEC_I 0 "vlogical_operand")
>> -        (not:VEC_I (match_dup 3)))]
>> +  [(set (match_operand:VEC_IC 3 "vlogical_operand")
>> +	(gtu:VEC_IC (match_operand:VEC_IC 2 "vlogical_operand")
>> +		    (match_operand:VEC_IC 1 "vlogical_operand")))
>> +   (set (match_operand:VEC_IC 0 "vlogical_operand")
>> +	(not:VEC_IC (match_dup 3)))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>>  {
>>    operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>>  })
>>
>> -(define_expand "vector_nltuv1ti"
>> -  [(set (match_operand:V1TI 3 "vlogical_operand")
>> -	(gtu:V1TI (match_operand:V1TI 2 "vlogical_operand")
>> -		  (match_operand:V1TI 1 "vlogical_operand")))
>> -   (set (match_operand:V1TI 0 "vlogical_operand")
>> -	(not:V1TI (match_dup 3)))]
>> -  "TARGET_POWER10"
>> -{
>> -  operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>> -})
>> -
>>  (define_expand "vector_geu<mode>"
>> -  [(set (match_operand:VEC_I 0 "vint_operand")
>> -	(geu:VEC_I (match_operand:VEC_I 1 "vint_operand")
>> -		   (match_operand:VEC_I 2 "vint_operand")))]
>> +  [(set (match_operand:VEC_IC 0 "vint_operand")
>> +	(geu:VEC_IC (match_operand:VEC_IC 1 "vint_operand")
>> +		    (match_operand:VEC_IC 2 "vint_operand")))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>>    "")
>>
>>  ; <= for integer vectors: apply not-greater-than
>>  (define_expand "vector_ngt<mode>"
>> -  [(set (match_operand:VEC_I 3 "vlogical_operand")
>> -	(gt:VEC_I (match_operand:VEC_I 1 "vlogical_operand")
>> -		  (match_operand:VEC_I 2 "vlogical_operand")))
>> -   (set (match_operand:VEC_I 0 "vlogical_operand")
>> -        (not:VEC_I (match_dup 3)))]
>> +  [(set (match_operand:VEC_IC 3 "vlogical_operand")
>> +	(gt:VEC_IC (match_operand:VEC_IC 1 "vlogical_operand")
>> +		   (match_operand:VEC_IC 2 "vlogical_operand")))
>> +   (set (match_operand:VEC_IC 0 "vlogical_operand")
>> +	(not:VEC_IC (match_dup 3)))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>>  {
>>    operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>>  })
>>
>> -(define_expand "vector_ngtv1ti"
>> -  [(set (match_operand:V1TI 3 "vlogical_operand")
>> -	(gt:V1TI (match_operand:V1TI 1 "vlogical_operand")
>> -		 (match_operand:V1TI 2 "vlogical_operand")))
>> -   (set (match_operand:V1TI 0 "vlogical_operand")
>> -        (not:V1TI (match_dup 3)))]
>> -  "TARGET_POWER10"
>> -{
>> -  operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>> -})
>> -
>>  (define_expand "vector_ngtu<mode>"
>> -  [(set (match_operand:VEC_I 3 "vlogical_operand")
>> -	(gtu:VEC_I (match_operand:VEC_I 1 "vlogical_operand")
>> -	 	   (match_operand:VEC_I 2 "vlogical_operand")))
>> -   (set (match_operand:VEC_I 0 "vlogical_operand")
>> -        (not:VEC_I (match_dup 3)))]
>> +  [(set (match_operand:VEC_IC 3 "vlogical_operand")
>> +	(gtu:VEC_IC (match_operand:VEC_IC 1 "vlogical_operand")
>> +		    (match_operand:VEC_IC 2 "vlogical_operand")))
>> +   (set (match_operand:VEC_IC 0 "vlogical_operand")
>> +	(not:VEC_IC (match_dup 3)))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>>  {
>>    operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>>  })
>>
>> -(define_expand "vector_ngtuv1ti"
>> -  [(set (match_operand:V1TI 3 "vlogical_operand")
>> -	(gtu:V1TI (match_operand:V1TI 1 "vlogical_operand")
>> -		  (match_operand:V1TI 2 "vlogical_operand")))
>> -   (set (match_operand:V1TI 0 "vlogical_operand")
>> -        (not:V1TI (match_dup 3)))]
>> -  "TARGET_POWER10"
>> -{
>> -  operands[3] = gen_reg_rtx_and_attrs (operands[0]);
>> -})
>> -
>>  ; There are 14 possible vector FP comparison operators, gt and eq of them have
>>  ; been expanded above, so just support 12 remaining operators here.
>>
>> @@ -1189,27 +1141,15 @@ (define_expand "vector_ge_<mode>_p"
>>  (define_expand "vector_gtu_<mode>_p"
>>    [(parallel
>>      [(set (reg:CC CR6_REGNO)
>> -	  (unspec:CC [(gtu:CC (match_operand:VEC_I 1 "vint_operand")
>> -			      (match_operand:VEC_I 2 "vint_operand"))]
>> +	  (unspec:CC [(gtu:CC (match_operand:VEC_IC 1 "vint_operand")
>> +			      (match_operand:VEC_IC 2 "vint_operand"))]
>>  		     UNSPEC_PREDICATE))
>> -     (set (match_operand:VEC_I 0 "vlogical_operand")
>> -	  (gtu:VEC_I (match_dup 1)
>> -		     (match_dup 2)))])]
>> +     (set (match_operand:VEC_IC 0 "vlogical_operand")
>> +	  (gtu:VEC_IC (match_dup 1)
>> +		      (match_dup 2)))])]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>>    "")
>>
>> -(define_expand "vector_gtu_v1ti_p"
>> -  [(parallel
>> -    [(set (reg:CC CR6_REGNO)
>> -	  (unspec:CC [(gtu:CC (match_operand:V1TI 1 "altivec_register_operand")
>> -			      (match_operand:V1TI 2 "altivec_register_operand"))]
>> -		     UNSPEC_PREDICATE))
>> -     (set (match_operand:V1TI 0 "altivec_register_operand")
>> -	  (gtu:V1TI (match_dup 1)
>> -		    (match_dup 2)))])]
>> -  "TARGET_POWER10"
>> -  "")
>> -
>>  ;; AltiVec/VSX predicates.
>>
>>  ;; This expansion is triggered during expansion of predicate built-in
>> @@ -1582,25 +1522,21 @@ (define_expand "vec_shr_<mode>"
>>
>>  ;; Expanders for rotate each element in a vector
>>  (define_expand "vrotl<mode>3"
>> -  [(set (match_operand:VEC_I 0 "vint_operand")
>> -	(rotate:VEC_I (match_operand:VEC_I 1 "vint_operand")
>> -		      (match_operand:VEC_I 2 "vint_operand")))]
>> +  [(set (match_operand:VEC_IC 0 "vint_operand")
>> +	(rotate:VEC_IC (match_operand:VEC_IC 1 "vint_operand")
>> +		       (match_operand:VEC_IC 2 "vint_operand")))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>> -  "")
>> -
>> -(define_expand "vrotlv1ti3"
>> -  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
>> -        (rotate:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
>> -                     (match_operand:V1TI 2 "vsx_register_operand" "v")))]
>> -  "TARGET_POWER10"
>>  {
>> -  /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */
>> -  rtx tmp = gen_reg_rtx (V1TImode);
>> +  /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2.  */
>> +  if (<MODE>mode == V1TImode)
>> +    {
>> +      rtx tmp = gen_reg_rtx (V1TImode);
>>
>> -  emit_insn (gen_xxswapd_v1ti (tmp, operands[2]));
>> -  emit_insn (gen_altivec_vrlq (operands[0], operands[1], tmp));
>> -  DONE;
>> -})
>> +      emit_insn (gen_xxswapd_v1ti (tmp, operands[2]));
>> +      emit_insn (gen_altivec_vrlq (operands[0], operands[1], tmp));
>> +      DONE;
>> +    }
>> + })
>>
>>  ;; Expanders for rotatert to make use of vrotl
>>  (define_expand "vrotr<mode>3"
>> @@ -1663,25 +1599,20 @@ (define_expand "vlshr<mode>3"
>>
>>  ;; Expanders for arithmetic shift right on each vector element
>>  (define_expand "vashr<mode>3"
>> -  [(set (match_operand:VEC_I 0 "vint_operand")
>> -	(ashiftrt:VEC_I (match_operand:VEC_I 1 "vint_operand")
>> -			(match_operand:VEC_I 2 "vint_operand")))]
>> +  [(set (match_operand:VEC_IC 0 "vint_operand")
>> +	(ashiftrt:VEC_IC (match_operand:VEC_IC 1 "vint_operand")
>> +			 (match_operand:VEC_IC 2 "vint_operand")))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>> -  "")
>> -
>> -;; No immediate version of this 128-bit instruction
>> -(define_expand "vashrv1ti3"
>> -  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
>> -	(ashiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
>> -		       (match_operand:V1TI 2 "vsx_register_operand" "v")))]
>> -  "TARGET_POWER10"
>>  {
>> -  /* Shift amount in needs to be put into bits[57:63] of 128-bit operand2. */
>> -  rtx tmp = gen_reg_rtx (V1TImode);
>> +  /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2.  */
>> +  if (<MODE>mode == V1TImode)
>> +    {
>> +      rtx tmp = gen_reg_rtx (V1TImode);
>>
>> -  emit_insn (gen_xxswapd_v1ti (tmp, operands[2]));
>> -  emit_insn (gen_altivec_vsraq (operands[0], operands[1], tmp));
>> -  DONE;
>> +      emit_insn (gen_xxswapd_v1ti (tmp, operands[2]));
>> +      emit_insn (gen_altivec_vsraq (operands[0], operands[1], tmp));
>> +      DONE;
>> +    }
>>  })
>>
>>  
>> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-cmp-int128.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-cmp-int128.c
>> new file mode 100644
>> index 00000000000..1a4db0f45d4
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-cmp-int128.c
>> @@ -0,0 +1,86 @@
>> +/* Verify that overloaded built-ins for vec_cmp with __int128
>> +   inputs produce the right code.  */
>> +
>> +/* { dg-do compile } */
> 
> Need /* { dg-require-effective-target int128 } */
> 
>> +/* { dg-require-effective-target power10_ok } */
>> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
>> +
>> +#include <altivec.h>
>> +
>> +vector bool __int128
>> +test3_eq (vector signed __int128 x, vector signed __int128 y)
>> +{
>> +  return vec_cmpeq (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test6_eq (vector unsigned __int128 x, vector unsigned __int128 y)
>> +{
>> +  return vec_cmpeq (x, y);
>> +}
>> +
> 
> Nit: The function names test6 and test3 seems copied from other cases somewhere.
> Maybe it's more meaningful with s/3// (or s/3/s/) and s/6/u/.
> 
>> +vector bool __int128
>> +test3_ge (vector signed __int128 x, vector signed __int128 y)
>> +{
>> +  return vec_cmpge (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test6_ge (vector unsigned __int128 x, vector unsigned __int128 y)
>> +{
>> +  return vec_cmpge (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test3_gt (vector signed __int128 x, vector signed __int128 y)
>> +{
>> +  return vec_cmpgt (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test6_gt (vector unsigned __int128 x, vector unsigned __int128 y)
>> +{
>> +  return vec_cmpgt (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test3_le (vector signed __int128 x, vector signed __int128 y)
>> +{
>> +  return vec_cmple (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test6_le (vector unsigned __int128 x, vector unsigned __int128 y)
>> +{
>> +  return vec_cmple (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test3_lt (vector signed __int128 x, vector signed __int128 y)
>> +{
>> +  return vec_cmplt (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test6_lt (vector unsigned __int128 x, vector unsigned __int128 y)
>> +{
>> +  return vec_cmplt (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test3_ne (vector signed __int128 x, vector signed __int128 y)
>> +{
>> +  return vec_cmpne (x, y);
>> +}
>> +
>> +vector bool __int128
>> +test6_ne (vector unsigned __int128 x, vector unsigned __int128 y)
>> +{
>> +  return vec_cmpne (x, y);
>> +}
>> +
>> +/* { dg-final { scan-assembler-times "vcmpequq" 4 } } */
>> +/* { dg-final { scan-assembler-times "vcmpgtsq" 4 } } */
>> +/* { dg-final { scan-assembler-times "vcmpgtuq" 4 } } */
>> +/* { dg-final { scan-assembler-times "xxlnor" 6 } } */
>> +
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103316.c b/gcc/testsuite/gcc.target/powerpc/pr103316.c
>> new file mode 100644
>> index 00000000000..02f7dc5ca1b
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr103316.c
>> @@ -0,0 +1,80 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target power10_ok } */
> 
> Need /* { dg-require-effective-target int128 } */ too.
> 
> BR,
> Kewen