Date: Fri, 25 Nov 2022 15:50:40 +0800
Subject: Re: [PATCH-1, rs6000] Generate permute index directly for little endian target [PR100866]
To: HAO CHEN GUI
Cc: Segher Boessenkool, David, Peter Bergner, gcc-patches
From: "Kewen.Lin"

Hi Haochen,

Sorry for the late review.

On 2022/10/11 15:38, HAO CHEN GUI wrote:
> Hi,
>   This patch modifies the help function which generates permute index for
> vector byte reversion and generates permute index directly for little endian
> targets. It saves one "xxlnor" instruction on P8 little endian targets as
> the original process needs an "xxlnor" to calculate complement for the index.
>

Nice.

>   Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
>
> ChangeLog
> 2022-10-11  Haochen Gui
>
> gcc/
> 	PR target/100866
> 	* config/rs6000/rs6000-call.cc (swap_endian_selector_for_mode):
> 	Generate permute index directly for little endian targets.
> 	* config/rs6000/vsx.md (revb_): Call vperm directly with
> 	corresponding permute indexes.
>
> gcc/testsuite/
> 	PR target/100866
> 	* gcc.target/powerpc/pr100866.c: New.
>
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000-call.cc b/gcc/config/rs6000/rs6000-call.cc
> index 551968b0995..bad8e9e0e52 100644
> --- a/gcc/config/rs6000/rs6000-call.cc
> +++ b/gcc/config/rs6000/rs6000-call.cc
> @@ -2839,7 +2839,10 @@ swap_endian_selector_for_mode (machine_mode mode)
>      }
>
>    for (i = 0; i < 16; ++i)
> -    perm[i] = GEN_INT (swaparray[i]);
> +    if (BYTES_BIG_ENDIAN)
> +      perm[i] = GEN_INT (swaparray[i]);
> +    else
> +      perm[i] = GEN_INT (~swaparray[i] & 0x0000001f);

IMHO, it would be good to add a function comment for this function; it's a
pity that we didn't have one before. With this patch, the selector (perm) is
expected to be used with vperm direct, as shown below, so it would be good to
note that explicitly for other potential callers too.

>
>    return force_reg (V16QImode, gen_rtx_CONST_VECTOR (V16QImode,
>  						     gen_rtvec_v (16, perm)));
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index e226a93bbe5..b68eba48d2c 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -6096,8 +6096,8 @@ (define_expand "revb_"
>  	 to the endian mode in use, i.e. in LE mode, put elements
>  	 in BE order.  */
>        rtx sel = swap_endian_selector_for_mode(mode);
> -      emit_insn (gen_altivec_vperm_ (operands[0], operands[1],
> -					   operands[1], sel));
> +      emit_insn (gen_altivec_vperm__direct (operands[0], operands[1],
> +						  operands[1], sel));
>      }
>
>    DONE;
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr100866.c b/gcc/testsuite/gcc.target/powerpc/pr100866.c
> new file mode 100644
> index 00000000000..c708dfd502e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr100866.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
> +/* { dg-final { scan-assembler-not "xxlnor" } } */

Nit: it may be better to use {\mxxlnor\M} here?

The others look good to me. Thanks!

BR,
Kewen

> +
> +#include
> +
> +vector unsigned short revb(vector unsigned short a)
> +{
> +  return vec_revb(a);
> +}
>
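
To illustrate the selector arithmetic discussed for swap_endian_selector_for_mode, here is a minimal standalone sketch. It is not the GCC code: the table swap_hi_example and the helpers selector_index and fill_selector are hypothetical names introduced only for this illustration, and the table contents are an assumed byte-reversal pattern for 16-bit elements. It only mirrors the patched loop body, where the little endian path stores the 5-bit complement of each index so that no separate "xxlnor" is needed to compute it later.

```c
#include <assert.h>

/* Hypothetical byte-reversal table for 16-bit elements (an assumption for
   this sketch; the real tables are not shown in the quoted hunk).  */
static const unsigned int swap_hi_example[16] =
  { 1, 0, 3, 2, 5, 4, 7, 6, 9, 8, 11, 10, 13, 12, 15, 14 };

/* Mirror the patched loop body: on big endian the index is used as is,
   on little endian its complement is kept, masked to 5 bits.  */
static unsigned int
selector_index (unsigned int swap_index, int bytes_big_endian)
{
  /* A vperm selector byte addresses one of 32 bytes (two 16-byte inputs).  */
  assert (swap_index < 32);
  if (bytes_big_endian)
    return swap_index;
  return ~swap_index & 0x1f;
}

/* Fill SEL with the 16 selector values for the example table.  */
static void
fill_selector (unsigned int sel[16], int bytes_big_endian)
{
  for (int i = 0; i < 16; ++i)
    sel[i] = selector_index (swap_hi_example[i], bytes_big_endian);
}
```

For in-range indexes, ~x & 0x1f is the same as 31 - x, so with the assumed table the little endian selector starts 30, 31, 28, 29. A selector built this way is only meaningful when fed to the direct vperm pattern, which is why a function comment saying so would help other callers.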