From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <9cb86d85-4501-a81d-4f9f-dc70de5d83ad@linux.ibm.com>
Date: Tue, 16 Nov 2021 07:10:59 -0600
Reply-To: wschmidt@linux.ibm.com
Subject: Re: [PATCH, rs6000] Optimization for vec_xl_sext
To: HAO CHEN GUI, gcc-patches
Cc: Segher Boessenkool, David
From: Bill Schmidt
List-Id: Gcc-patches mailing list

Hi Hao Chen,

I don't understand.  This patch was already approved and you committed it. :-)
I know because I needed to make corresponding adjustments to the new builtins code.

Thanks,
Bill

On 11/15/21 8:16 PM, HAO CHEN GUI wrote:
> Hi,
>
>    The patch optimizes the code generation for vec_xl_sext builtin. Now all the sign extensions are done on VSX registers directly.
>
>    Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot.
>
> ChangeLog
>
> 2021-11-16 Haochen Gui
>
> gcc/
>         * config/rs6000/rs6000-call.c (altivec_expand_lxvr_builtin): Modify
>         the expansion for sign extension. All extensions are done on VSX
>         registers.
>
> gcc/testsuite/
>         * gcc.target/powerpc/p10_vec_xl_sext.c: New test.
>
> patch.diff
>
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index b4e13af4dc6..587e9fa2a2a 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -9779,7 +9779,7 @@ altivec_expand_lxvr_builtin (enum insn_code icode, tree exp, rtx target, bool bl
>
>    if (sign_extend)
>      {
> -      rtx discratch = gen_reg_rtx (DImode);
> +      rtx discratch = gen_reg_rtx (V2DImode);
>        rtx tiscratch = gen_reg_rtx (TImode);
>
>        /* Emit the lxvr*x insn.  */
> @@ -9788,20 +9788,31 @@ altivec_expand_lxvr_builtin (enum insn_code icode, tree exp, rtx target, bool bl
>         return 0;
>        emit_insn (pat);
>
> -      /* Emit a sign extension from QI,HI,WI to double (DI).  */
> -      rtx scratch = gen_lowpart (smode, tiscratch);
> +      /* Emit a sign extension from V16QI,V8HI,V4SI to V2DI.  */
> +      rtx temp1, temp2;
>        if (icode == CODE_FOR_vsx_lxvrbx)
> -       emit_insn (gen_extendqidi2 (discratch, scratch));
> +       {
> +         temp1  = simplify_gen_subreg (V16QImode, tiscratch, TImode, 0);
> +         emit_insn (gen_vsx_sign_extend_qi_v2di (discratch, temp1));
> +       }
>        else if (icode == CODE_FOR_vsx_lxvrhx)
> -       emit_insn (gen_extendhidi2 (discratch, scratch));
> +       {
> +         temp1  = simplify_gen_subreg (V8HImode, tiscratch, TImode, 0);
> +         emit_insn (gen_vsx_sign_extend_hi_v2di (discratch, temp1));
> +       }
>        else if (icode == CODE_FOR_vsx_lxvrwx)
> -       emit_insn (gen_extendsidi2 (discratch, scratch));
> -      /*  Assign discratch directly if scratch is already DI.  */
> -      if (icode == CODE_FOR_vsx_lxvrdx)
> -       discratch = scratch;
> +       {
> +         temp1  = simplify_gen_subreg (V4SImode, tiscratch, TImode, 0);
> +         emit_insn (gen_vsx_sign_extend_si_v2di (discratch, temp1));
> +       }
> +      else if (icode == CODE_FOR_vsx_lxvrdx)
> +       discratch = simplify_gen_subreg (V2DImode, tiscratch, TImode, 0);
> +      else
> +       gcc_unreachable ();
>
> -      /* Emit the sign extension from DI (double) to TI (quad).  */
> -      emit_insn (gen_extendditi2 (target, discratch));
> +      /* Emit the sign extension from V2DI (double) to TI (quad).  */
> +      temp2 = simplify_gen_subreg (TImode, discratch, V2DImode, 0);
> +      emit_insn (gen_extendditi2_vector (target, temp2));
>
>        return target;
>      }
> diff --git a/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
> new file mode 100644
> index 00000000000..78e72ac5425
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
> @@ -0,0 +1,35 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> +
> +#include
> +
> +vector signed __int128
> +foo1 (signed long a, signed char *b)
> +{
> +  return vec_xl_sext (a, b);
> +}
> +
> +vector signed __int128
> +foo2 (signed long a, signed short *b)
> +{
> +  return vec_xl_sext (a, b);
> +}
> +
> +vector signed __int128
> +foo3 (signed long a, signed int *b)
> +{
> +  return vec_xl_sext (a, b);
> +}
> +
> +vector signed __int128
> +foo4 (signed long a, signed long *b)
> +{
> +  return vec_xl_sext (a, b);
> +}
> +
> +/* { dg-final { scan-assembler-times {\mvextsd2q\M} 4 } } */
> +/* { dg-final { scan-assembler-times {\mvextsb2d\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvextsh2d\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvextsw2d\M} 1 } } */
>
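
For anyone reading the quoted patch later: the expansion is a two-step sign extension, first from the loaded element to a doubleword and then from the doubleword to a quadword. Below is a rough scalar model of that arithmetic in portable C. It is only a sketch, not the GCC expansion: the names model_i128, model_extend_di_to_ti, model_xl_sext_qi and model_xl_sext_si are made up for illustration, the 128-bit result is modeled as two 64-bit halves, and it assumes the first argument of vec_xl_sext is a byte offset added to the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the 128-bit result: two 64-bit halves.  */
typedef struct {
  uint64_t lo;  /* low doubleword, holds the value bits */
  int64_t hi;   /* high doubleword, all sign bits after extension */
} model_i128;

/* Step 2: doubleword to quadword.  This plays the role the patch's test
   attributes to vextsd2q; the high half is a copy of the sign.  */
model_i128
model_extend_di_to_ti (int64_t d)
{
  model_i128 r;
  r.lo = (uint64_t) d;
  r.hi = d < 0 ? -1 : 0;
  return r;
}

/* Byte element: load, extend to doubleword (step 1), then to quadword.
   ASSUMPTION: offset is in bytes.  */
model_i128
model_xl_sext_qi (long offset, const signed char *base)
{
  int8_t elt;
  assert (base != NULL);
  memcpy (&elt, (const char *) base + offset, sizeof elt);
  return model_extend_di_to_ti ((int64_t) elt);
}

/* Word element: same two steps, starting from a 32-bit value.
   ASSUMPTION: int is 32 bits wide and offset is in bytes.  */
model_i128
model_xl_sext_si (long offset, const int *base)
{
  int32_t elt;
  assert (base != NULL);
  memcpy (&elt, (const char *) base + offset, sizeof elt);
  return model_extend_di_to_ti ((int64_t) elt);
}
```

With signed char b[2] = { 5, -3 }, model_xl_sext_qi (1, b) gives hi == -1 and lo == 0xFFFFFFFFFFFFFFFD, while model_xl_sext_qi (0, b) gives hi == 0 and lo == 5.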