From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 20B2F3858C53 for ; Mon, 20 Mar 2023 06:35:53 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 20B2F3858C53
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com
Received: from pps.filterd (m0098420.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 32K4TnAT010600; Mon, 20 Mar 2023 06:35:52 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=message-id : date : subject : to : cc : references : from : in-reply-to : content-type : content-transfer-encoding : mime-version; s=pp1; bh=0eXhGgY8SIbFxH/pTABC+vq4Ba5RbLUCDADf0c1k6Rs=; b=rgOUZxgJwEPsoUoeE1bHPgCprfXh6p4DC7JQBxEEsW940iwx8OnMZSK2ULGC/VZDsr9P 8sjCyPXWxPLQnMCnndTjTL4DTRZJtbtttCUXqG7CHCh+U/JxvUMhE5Znhk7pcLqdSZ1k vZOPY6ryP9c0+68swbd3w1kKsTIGw+BtPlHn7o8kT87aF5kQRYr5ryDZnrH9hga348WV JFvK5kBpPXtzqk96CeRyaBy8NoGIpoO9KeIwRLC/W/tgiL3FnUDobzjr2n6nOhnQiav/ fVMijImCaIEplY3LMbyNgEud60/lVyZHPsPH7uNTgqBOgtHhjubSlIbF22eUtLruGwPw Ng==
Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3pdq80yq7a-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 20 Mar 2023 06:35:52 +0000
Received: from m0098420.ppops.net (m0098420.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 32K6D0Y6019625; Mon, 20 Mar 2023 06:35:52 GMT
Received: from ppma01fra.de.ibm.com (46.49.7a9f.ip4.static.sl-reverse.com [159.122.73.70]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3pdq80yq6y-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 20 Mar 2023 06:35:51 +0000
Received: from pps.filterd (ppma01fra.de.ibm.com [127.0.0.1]) by ppma01fra.de.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 32JH1Iv0015419; Mon, 20 Mar 2023 06:35:50 GMT
Received: from smtprelay07.fra02v.mail.ibm.com ([9.218.2.229]) by ppma01fra.de.ibm.com (PPS) with ESMTPS id 3pd4x6aaua-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 20 Mar 2023 06:35:50 +0000
Received: from smtpav06.fra02v.mail.ibm.com (smtpav06.fra02v.mail.ibm.com [10.20.54.105]) by smtprelay07.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 32K6Zl6A28377602 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 20 Mar 2023 06:35:47 GMT
Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 9353C2004B; Mon, 20 Mar 2023 06:35:47 +0000 (GMT)
Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id CB53B20049; Mon, 20 Mar 2023 06:35:45 +0000 (GMT)
Received: from [9.177.11.227] (unknown [9.177.11.227]) by smtpav06.fra02v.mail.ibm.com (Postfix) with ESMTP; Mon, 20 Mar 2023 06:35:45 +0000 (GMT)
Message-ID: <52394650-aa6b-c7a1-8a7f-691870309829@linux.ibm.com>
Date: Mon, 20 Mar 2023 14:35:43 +0800
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.6.1
Subject: PING^1 [PATCH] rs6000: Fix vector_set_var_p9 by considering BE [PR108807]
Content-Language: en-US
To: GCC Patches
Cc: Segher Boessenkool , David Edelsohn , Peter Bergner
References: <737a5392-29f8-763c-8dc7-b48c36edb1a7@linux.ibm.com>
From: "Kewen.Lin"
In-Reply-To: <737a5392-29f8-763c-8dc7-b48c36edb1a7@linux.ibm.com>
Content-Type: text/plain; charset=UTF-8
X-TM-AS-GCONF: 00
X-Proofpoint-GUID: ToVU33r5ZhbgBIJd-ryVUkSziYPkGPSX
X-Proofpoint-ORIG-GUID: w-iVVHPzxLTdqOjDjWjWmuue5S01RrR-
Content-Transfer-Encoding: 7bit
X-Proofpoint-UnRewURL: 0 URL was un-rewritten
MIME-Version: 1.0
X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22
 definitions=2023-03-20_03,2023-03-16_02,2023-02-09_01
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 phishscore=0 priorityscore=1501 malwarescore=0 clxscore=1015 impostorscore=0 mlxscore=0 spamscore=0 bulkscore=0 mlxlogscore=999 lowpriorityscore=0 adultscore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2303150002 definitions=main-2303200053
X-Spam-Status: No, score=-11.5 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id:

Hi,

I'd like to gently ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612213.html

It fixes a regression, so I think it qualifies as stage 4 content.

BR,
Kewen

on 2023/2/17 17:55, Kewen.Lin via Gcc-patches wrote:
> Hi,
>
> As PR108807 exposes, the current handling in function
> rs6000_expand_vector_set_var_p9 doesn't take big endianness
> into account.  Currently the function rotates the target
> vector so that the element to be set moves to element 0,
> sets element 0 to the given val, then rotates back.  To
> get the permutation control vectors for the rotations, it
> makes use of lvsr and lvsl, but the element ordering
> differs between BE and LE (element 0 is the most
> significant one on BE and the least significant one on
> LE).  This patch takes BE into account and makes sure the
> permutation control vectors for the rotations are the
> expected ones.
>
> As tested, it fixes the failures below:
>
> FAIL: gcc.target/powerpc/pr79251-run.p9.c execution test
> FAIL: gcc.target/powerpc/pr89765-mc.c execution test
> FAIL: gcc.target/powerpc/vsx-builtin-10d.c execution test
> FAIL: gcc.target/powerpc/vsx-builtin-11d.c execution test
> FAIL: gcc.target/powerpc/vsx-builtin-14d.c execution test
> FAIL: gcc.target/powerpc/vsx-builtin-16d.c execution test
> FAIL: gcc.target/powerpc/vsx-builtin-18d.c execution test
> FAIL: gcc.target/powerpc/vsx-builtin-9d.c execution test
>
> Bootstrapped and regtested on powerpc64-linux-gnu P{8,9}
> and powerpc64le-linux-gnu P10.
>
> Is it ok for trunk?
>
> BR,
> Kewen
> -----
> 	PR target/108807
>
> gcc/ChangeLog:
>
> 	* config/rs6000/rs6000.cc (rs6000_expand_vector_set_var_p9): Fix gen
> 	function for permutation control vector by considering big endianness.
> ---
>  gcc/config/rs6000/rs6000.cc | 48 +++++++++++++++++++++----------------
>  1 file changed, 28 insertions(+), 20 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 16ca3a31757..774eb2963d9 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -7235,22 +7235,26 @@ rs6000_expand_vector_set_var_p9 (rtx target, rtx val, rtx idx)
>
>    machine_mode shift_mode;
>    rtx (*gen_ashl)(rtx, rtx, rtx);
> -  rtx (*gen_lvsl)(rtx, rtx);
> -  rtx (*gen_lvsr)(rtx, rtx);
> +  rtx (*gen_pcvr1)(rtx, rtx);
> +  rtx (*gen_pcvr2)(rtx, rtx);
>
>    if (TARGET_POWERPC64)
>      {
>        shift_mode = DImode;
>        gen_ashl = gen_ashldi3;
> -      gen_lvsl = gen_altivec_lvsl_reg_di;
> -      gen_lvsr = gen_altivec_lvsr_reg_di;
> +      gen_pcvr1 = BYTES_BIG_ENDIAN ? gen_altivec_lvsl_reg_di
> +                                   : gen_altivec_lvsr_reg_di;
> +      gen_pcvr2 = BYTES_BIG_ENDIAN ? gen_altivec_lvsr_reg_di
> +                                   : gen_altivec_lvsl_reg_di;
>      }
>    else
>      {
>        shift_mode = SImode;
>        gen_ashl = gen_ashlsi3;
> -      gen_lvsl = gen_altivec_lvsl_reg_si;
> -      gen_lvsr = gen_altivec_lvsr_reg_si;
> +      gen_pcvr1 = BYTES_BIG_ENDIAN ? gen_altivec_lvsl_reg_si
> +                                   : gen_altivec_lvsr_reg_si;
> +      gen_pcvr2 = BYTES_BIG_ENDIAN ? gen_altivec_lvsr_reg_si
> +                                   : gen_altivec_lvsl_reg_si;
>      }
>    /* Generate the IDX for permute shift, width is the vector element size.
>       idx = idx * width.  */
> @@ -7259,25 +7263,29 @@ rs6000_expand_vector_set_var_p9 (rtx target, rtx val, rtx idx)
>
>    emit_insn (gen_ashl (tmp, idx, GEN_INT (shift)));
>
> -  /* lvsr    v1,0,idx.  */
> -  rtx pcvr = gen_reg_rtx (V16QImode);
> -  emit_insn (gen_lvsr (pcvr, tmp));
> -
> -  /* lvsl    v2,0,idx.  */
> -  rtx pcvl = gen_reg_rtx (V16QImode);
> -  emit_insn (gen_lvsl (pcvl, tmp));
> +  /* Generate one permutation control vector used for rotating the element
> +     at to-insert position to element zero in target vector.  lvsl is
> +     used for big endianness while lvsr is used for little endianness:
> +     lvs[lr]    v1,0,idx.  */
> +  rtx pcvr1 = gen_reg_rtx (V16QImode);
> +  emit_insn (gen_pcvr1 (pcvr1, tmp));
>
>    rtx sub_target = simplify_gen_subreg (V16QImode, target, mode, 0);
> +  rtx perm1 = gen_altivec_vperm_v8hiv16qi (sub_target, sub_target, sub_target,
> +                                           pcvr1);
> +  emit_insn (perm1);
>
> -  rtx permr
> -    = gen_altivec_vperm_v8hiv16qi (sub_target, sub_target, sub_target, pcvr);
> -  emit_insn (permr);
> -
> +  /* Insert val into element 0 of target vector.  */
>    rs6000_expand_vector_set (target, val, const0_rtx);
>
> -  rtx perml
> -    = gen_altivec_vperm_v8hiv16qi (sub_target, sub_target, sub_target, pcvl);
> -  emit_insn (perml);
> +  /* Rotate back with a reversed permutation control vector generated from:
> +     lvs[rl]   v2,0,idx.  */
> +  rtx pcvr2 = gen_reg_rtx (V16QImode);
> +  emit_insn (gen_pcvr2 (pcvr2, tmp));
> +
> +  rtx perm2 = gen_altivec_vperm_v8hiv16qi (sub_target, sub_target, sub_target,
> +                                           pcvr2);
> +  emit_insn (perm2);
>  }
>
>  /* Insert VAL into IDX of TARGET, VAL size is same of the vector element, IDX
> --
> 2.39.1