From: HAO CHEN GUI
To: gcc-patches
Cc: Segher Boessenkool, Bill Schmidt
Subject: Ping^1 [PATCH, rs6000] optimization for vec_reve builtin [PR100868]
Date: Mon, 11 Oct 2021 13:32:45 +0800
Message-ID: <1af358d5-906f-bf8f-9a6e-6ea9bbee757c@linux.ibm.com>
In-Reply-To: <9e63418c-7893-b1c0-47a8-c6f7a201b72a@linux.ibm.com>
References: <9e63418c-7893-b1c0-47a8-c6f7a201b72a@linux.ibm.com>

Hi,

     Gentle ping on this:

https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579038.html

Thanks

On 8/9/2021 2:42 PM, HAO CHEN GUI wrote:
> Hi,
>
>   This patch optimizes the vec_reve builtin on rs6000. For V2DI and V2DF, it is implemented by xxswapd on all targets.
> For V16QI, V8HI, V4SI and V4SF, it is implemented by quadword byte reverse plus halfword/word byte reverse when p9_vector is defined.
>
>   Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot.
>
>
> ChangeLog
>
> 2021-09-08 Haochen Gui
>
> gcc/
>         * config/rs6000/altivec.md (altivec_vreve<mode>2 for VEC_K):
>         Use xxbrq for v16qi, xxbrq + xxbrh for v8hi and xxbrq + xxbrw
>         for v4si or v4sf when p9_vector is defined.
>         (altivec_vreve<mode>2 for VEC_64): Defined. Implemented by
>         xxswapd.
>
> gcc/testsuite/
>         * gcc.target/powerpc/vec_reve_1.c: New test.
>         * gcc.target/powerpc/vec_reve_2.c: Likewise.
>
>
> patch.diff
>
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index 1351dafbc41..a1698ce85c0 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -4049,13 +4049,43 @@ (define_expand "altivec_negv4sf2"
>    DONE;
>  })
>
> -;; Vector reverse elements
> +;; Vector reverse elements for V16QI V8HI V4SI V4SF
>  (define_expand "altivec_vreve<mode>2"
> -  [(set (match_operand:VEC_A 0 "register_operand" "=v")
> -       (unspec:VEC_A [(match_operand:VEC_A 1 "register_operand" "v")]
> +  [(set (match_operand:VEC_K 0 "register_operand" "=v")
> +       (unspec:VEC_K [(match_operand:VEC_K 1 "register_operand" "v")]
>                       UNSPEC_VREVEV))]
>    "TARGET_ALTIVEC"
>  {
> +  if (TARGET_P9_VECTOR)
> +    {
> +      if (<MODE>mode == V16QImode)
> +       emit_insn (gen_p9_xxbrq_v16qi (operands[0], operands[1]));
> +      else if (<MODE>mode == V8HImode)
> +       {
> +         rtx subreg1 = simplify_gen_subreg (V1TImode, operands[1],
> +                                            <MODE>mode, 0);
> +         rtx temp = gen_reg_rtx (V1TImode);
> +         emit_insn (gen_p9_xxbrq_v1ti (temp, subreg1));
> +         rtx subreg2 = simplify_gen_subreg (<MODE>mode, temp,
> +                                            V1TImode, 0);
> +         emit_insn (gen_p9_xxbrh_v8hi (operands[0], subreg2));
> +       }
> +      else /* V4SI and V4SF.  */
> +       {
> +         rtx subreg1 = simplify_gen_subreg (V1TImode, operands[1],
> +                                            <MODE>mode, 0);
> +         rtx temp = gen_reg_rtx (V1TImode);
> +         emit_insn (gen_p9_xxbrq_v1ti (temp, subreg1));
> +         rtx subreg2 = simplify_gen_subreg (<MODE>mode, temp,
> +                                            V1TImode, 0);
> +         if (<MODE>mode == V4SImode)
> +           emit_insn (gen_p9_xxbrw_v4si (operands[0], subreg2));
> +         else
> +           emit_insn (gen_p9_xxbrw_v4sf (operands[0], subreg2));
> +       }
> +      DONE;
> +    }
> +
>    int i, j, size, num_elements;
>    rtvec v = rtvec_alloc (16);
>    rtx mask = gen_reg_rtx (V16QImode);
> @@ -4074,6 +4104,17 @@ (define_expand "altivec_vreve<mode>2"
>    DONE;
>  })
>
> +;; Vector reverse elements for V2DI V2DF
> +(define_expand "altivec_vreve<mode>2"
> +  [(set (match_operand:VEC_64 0 "register_operand" "=v")
> +       (unspec:VEC_64 [(match_operand:VEC_64 1 "register_operand" "v")]
> +                     UNSPEC_VREVEV))]
> +  "TARGET_ALTIVEC"
> +{
> +  emit_insn (gen_xxswapd_<mode> (operands[0], operands[1]));
> +  DONE;
> +})
> +
>  ;; Vector SIMD PEM v2.06c defines LVLX, LVLXL, LVRX, LVRXL,
>  ;; STVLX, STVLXL, STVVRX, STVRXL are available only on Cell.
>  (define_insn "altivec_lvlx"
> diff --git a/gcc/testsuite/gcc.target/powerpc/vec_reve_1.c b/gcc/testsuite/gcc.target/powerpc/vec_reve_1.c
> new file mode 100644
> index 00000000000..83a9206758b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vec_reve_1.c
> @@ -0,0 +1,16 @@
> +/* { dg-require-effective-target powerpc_altivec_ok } */
> +/* { dg-options "-O2 -maltivec" } */
> +
> +#include <altivec.h>
> +
> +vector double foo1 (vector double a)
> +{
> +   return vec_reve (a);
> +}
> +
> +vector long long foo2 (vector long long a)
> +{
> +   return vec_reve (a);
> +}
> +
> +/* { dg-final { scan-assembler-times {\mxxpermdi\M} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/vec_reve_2.c b/gcc/testsuite/gcc.target/powerpc/vec_reve_2.c
> new file mode 100644
> index 00000000000..b6dd33d6d79
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vec_reve_2.c
> @@ -0,0 +1,28 @@
> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> +/* { dg-options "-mdejagnu-cpu=power9 -O2 -maltivec" } */
> +
> +#include <altivec.h>
> +
> +vector int foo1 (vector int a)
> +{
> +   return vec_reve (a);
> +}
> +
> +vector float foo2 (vector float a)
> +{
> +   return vec_reve (a);
> +}
> +
> +vector short foo3 (vector short a)
> +{
> +   return vec_reve (a);
> +}
> +
> +vector char foo4 (vector char a)
> +{
> +   return vec_reve (a);
> +}
> +
> +/* { dg-final { scan-assembler-times {\mxxbrq\M} 4 } } */
> +/* { dg-final { scan-assembler-times {\mxxbrw\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mxxbrh\M} 1 } } */
>
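
For readers following the thread, here is a rough portable C sketch of why the two-step sequence in the patch should amount to an element reversal: reversing all 16 bytes (the xxbrq step) and then reversing the bytes inside each element (the xxbrh or xxbrw step) leaves the elements in reverse order with their bytes intact. This is only a model on a plain byte array, not the rs6000 implementation; the names vec128, byte_reverse_chunks, reverse_elements and reverse_via_byte_reverse are made up for illustration, and register byte numbering and endianness are deliberately ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit vector register as 16 bytes.  */
typedef struct { uint8_t b[16]; } vec128;

/* Reverse the bytes inside each chunk of WIDTH bytes (WIDTH divides 16).
   WIDTH 16 models xxbrq, 4 models xxbrw, 2 models xxbrh.  */
static vec128
byte_reverse_chunks (vec128 in, size_t width)
{
  vec128 out;
  for (size_t base = 0; base < 16; base += width)
    for (size_t i = 0; i < width; i++)
      out.b[base + i] = in.b[base + width - 1 - i];
  return out;
}

/* Reference behaviour: reverse the order of the elements of ELEM_SIZE
   bytes while keeping the bytes of each element in their original order.
   With ELEM_SIZE 8 this is the doubleword swap that xxswapd performs.  */
static vec128
reverse_elements (vec128 in, size_t elem_size)
{
  vec128 out;
  size_t n = 16 / elem_size;
  for (size_t e = 0; e < n; e++)
    for (size_t i = 0; i < elem_size; i++)
      out.b[e * elem_size + i] = in.b[(n - 1 - e) * elem_size + i];
  return out;
}

/* Two-step form: full 16-byte reverse, then per-element byte reverse.
   For ELEM_SIZE 1 the second step is the identity, which matches the
   patch emitting only xxbrq for V16QI.  */
static vec128
reverse_via_byte_reverse (vec128 in, size_t elem_size)
{
  vec128 t = byte_reverse_chunks (in, 16);
  return byte_reverse_chunks (t, elem_size);
}
```

With input bytes 0..15 and 2-byte elements, both reverse_elements and reverse_via_byte_reverse give a result starting 14, 15, 12, 13 and ending 0, 1. The same agreement holds for element sizes 1, 4 and 8 in this model.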