From: HAO CHEN GUI
Date: Wed, 17 Nov 2021 09:44:15 +0800
Subject: Re: [PATCH, rs6000] Optimization for vec_xl_sext
To: wschmidt@linux.ibm.com, gcc-patches
Cc: Segher Boessenkool, David
References: <9cb86d85-4501-a81d-4f9f-dc70de5d83ad@linux.ibm.com>
In-Reply-To: <9cb86d85-4501-a81d-4f9f-dc70de5d83ad@linux.ibm.com>

Bill,

    Sorry, I mixed up the patches.
There is one vec_reve patch which hasn't gotten approval for a long time. I will re-send it.

Thanks a lot.

On 16/11/2021 9:10 PM, Bill Schmidt wrote:
> Hi Hao Chen,
>
> I don't understand.  This patch was already approved and you committed it. :-)  I know
> because I needed to make corresponding adjustments to the new builtins code.
>
> Thanks,
> Bill
>
> On 11/15/21 8:16 PM, HAO CHEN GUI wrote:
>> Hi,
>>
>>    The patch optimizes the code generation for the vec_xl_sext builtin. Now all the sign extensions are done on VSX registers directly.
>>
>>    Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot.
>>
>> ChangeLog
>>
>> 2021-11-16 Haochen Gui
>>
>> gcc/
>>         * config/rs6000/rs6000-call.c (altivec_expand_lxvr_builtin): Modify
>>         the expansion for sign extension. All extensions are done on VSX
>>         registers.
>>
>> gcc/testsuite/
>>         * gcc.target/powerpc/p10_vec_xl_sext.c: New test.
>>
>> patch.diff
>>
>> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
>> index b4e13af4dc6..587e9fa2a2a 100644
>> --- a/gcc/config/rs6000/rs6000-call.c
>> +++ b/gcc/config/rs6000/rs6000-call.c
>> @@ -9779,7 +9779,7 @@ altivec_expand_lxvr_builtin (enum insn_code icode, tree exp, rtx target, bool bl
>>
>>    if (sign_extend)
>>      {
>> -      rtx discratch = gen_reg_rtx (DImode);
>> +      rtx discratch = gen_reg_rtx (V2DImode);
>>        rtx tiscratch = gen_reg_rtx (TImode);
>>
>>        /* Emit the lxvr*x insn.  */
>> @@ -9788,20 +9788,31 @@ altivec_expand_lxvr_builtin (enum insn_code icode, tree exp, rtx target, bool bl
>>         return 0;
>>        emit_insn (pat);
>>
>> -      /* Emit a sign extension from QI,HI,WI to double (DI).  */
>> -      rtx scratch = gen_lowpart (smode, tiscratch);
>> +      /* Emit a sign extension from V16QI,V8HI,V4SI to V2DI.  */
>> +      rtx temp1, temp2;
>>        if (icode == CODE_FOR_vsx_lxvrbx)
>> -       emit_insn (gen_extendqidi2 (discratch, scratch));
>> +       {
>> +         temp1  = simplify_gen_subreg (V16QImode, tiscratch, TImode, 0);
>> +         emit_insn (gen_vsx_sign_extend_qi_v2di (discratch, temp1));
>> +       }
>>        else if (icode == CODE_FOR_vsx_lxvrhx)
>> -       emit_insn (gen_extendhidi2 (discratch, scratch));
>> +       {
>> +         temp1  = simplify_gen_subreg (V8HImode, tiscratch, TImode, 0);
>> +         emit_insn (gen_vsx_sign_extend_hi_v2di (discratch, temp1));
>> +       }
>>        else if (icode == CODE_FOR_vsx_lxvrwx)
>> -       emit_insn (gen_extendsidi2 (discratch, scratch));
>> -      /*  Assign discratch directly if scratch is already DI.  */
>> -      if (icode == CODE_FOR_vsx_lxvrdx)
>> -       discratch = scratch;
>> +       {
>> +         temp1  = simplify_gen_subreg (V4SImode, tiscratch, TImode, 0);
>> +         emit_insn (gen_vsx_sign_extend_si_v2di (discratch, temp1));
>> +       }
>> +      else if (icode == CODE_FOR_vsx_lxvrdx)
>> +       discratch = simplify_gen_subreg (V2DImode, tiscratch, TImode, 0);
>> +      else
>> +       gcc_unreachable ();
>>
>> -      /* Emit the sign extension from DI (double) to TI (quad).  */
>> -      emit_insn (gen_extendditi2 (target, discratch));
>> +      /* Emit the sign extension from V2DI (double) to TI (quad).  */
>> +      temp2 = simplify_gen_subreg (TImode, discratch, V2DImode, 0);
>> +      emit_insn (gen_extendditi2_vector (target, temp2));
>>
>>        return target;
>>      }
>> diff --git a/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
>> new file mode 100644
>> index 00000000000..78e72ac5425
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
>> @@ -0,0 +1,35 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target int128 } */
>> +/* { dg-require-effective-target power10_ok } */
>> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
>> +
>> +#include <altivec.h>
>> +
>> +vector signed __int128
>> +foo1 (signed long a, signed char *b)
>> +{
>> +  return vec_xl_sext (a, b);
>> +}
>> +
>> +vector signed __int128
>> +foo2 (signed long a, signed short *b)
>> +{
>> +  return vec_xl_sext (a, b);
>> +}
>> +
>> +vector signed __int128
>> +foo3 (signed long a, signed int *b)
>> +{
>> +  return vec_xl_sext (a, b);
>> +}
>> +
>> +vector signed __int128
>> +foo4 (signed long a, signed long *b)
>> +{
>> +  return vec_xl_sext (a, b);
>> +}
>> +
>> +/* { dg-final { scan-assembler-times {\mvextsd2q\M} 4 } } */
>> +/* { dg-final { scan-assembler-times {\mvextsb2d\M} 1 } } */
>> +/* { dg-final { scan-assembler-times {\mvextsh2d\M} 1 } } */
>> +/* { dg-final { scan-assembler-times {\mvextsw2d\M} 1 } } */
>>