Date: Tue, 14 Feb 2023 10:17:34 +0800
Subject: Re: [PATCH 2/2] vect: Make partial trapping ops use predication [PR96373]
To: richard.sandiford@arm.com
Cc: linkw@gcc.gnu.org, rguenther@suse.de, gcc-patches@gcc.gnu.org
From: "Kewen.Lin"

on 2023/2/13 21:57, Richard Sandiford wrote:
> "Kewen.Lin" writes:
>> Hi Richard,
>>
>> on 2023/1/27 19:08, Richard Sandiford via Gcc-patches wrote:
>>> PR96373 points out that a predicated SVE loop currently converts
>>> trapping unconditional ops into unpredicated vector ops. Doing
>>> the operation on inactive lanes can then raise an exception.
>>>
>>> As discussed in the PR trail, we aren't 100% consistent about
>>> whether we preserve traps or not. But the direction of travel
>>> is clearly to improve that rather than live with it. This patch
>>> tries to do that for the SVE case.
>>>
>>> Doing this regresses gcc.target/aarch64/sve/fabd_1.c. I've added
>>> -fno-trapping-math for now and filed PR108571 to track it.
>>> A similar problem applies to fsubr_1.c.
>>>
>>> I think this is likely to regress Power 10, since conditional
>>> operations are only available for masked loops. I think we'll
>>> need to add -fno-trapping-math to any affected testcases,
>>> but I don't have a Power 10 system to test on. Kewen, would you
>>> mind giving this a spin and seeing how bad the fallout is?
>>>
>>
>> Sorry for the late reply, I'm just back from vacation.
>>
>> Thank you for fixing this and caring about Power10!
>>
>> I tested your proposed patch on one Power10 machine (ppc64le),
>> it's bootstrapped but some test failures got exposed as below.
>>
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-1.c scan-assembler-times \\\\mlxvl\\\\M 14
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-1.c scan-assembler-times \\\\mstxvl\\\\M 7
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-2.c scan-assembler-times \\\\mlxvl\\\\M 20
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-2.c scan-assembler-times \\\\mstxvl\\\\M 10
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-3.c scan-assembler-times \\\\mlxvl\\\\M 14
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-3.c scan-assembler-times \\\\mstxvl\\\\M 7
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times \\\\mlxvl\\\\M 70
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times \\\\mlxvx?\\\\M 120
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times \\\\mstxvl\\\\M 70
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times \\\\mstxvx?\\\\M 70
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-5.c scan-assembler-times \\\\mlxvl\\\\M 21
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-5.c scan-assembler-times \\\\mstxvl\\\\M 21
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-5.c scan-assembler-times \\\\mstxvx?\\\\M 21
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-6.c scan-assembler-times \\\\mlxvl\\\\M 10
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-6.c scan-assembler-times \\\\mlxvx?\\\\M 42
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-6.c scan-assembler-times \\\\mstxvl\\\\M 10
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-8.c scan-assembler-times \\\\mlxvl\\\\M 16
>> < FAIL: gcc.target/powerpc/p9-vec-length-epil-8.c scan-assembler-times \\\\mstxvl\\\\M 7
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-not \\\\mlxvx\\\\M
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-not \\\\mstxvx\\\\M
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-times \\\\mlxvl\\\\M 20
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-times \\\\mstxvl\\\\M 10
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-not \\\\mlxvx\\\\M
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-not \\\\mstxvx\\\\M
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-times \\\\mlxvl\\\\M 20
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-times \\\\mstxvl\\\\M 10
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-3.c scan-assembler-times \\\\mlxvl\\\\M 14
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-3.c scan-assembler-times \\\\mstxvl\\\\M 7
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-not \\\\mlxvx\\\\M
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-not \\\\mstxv\\\\M
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-not \\\\mstxvx\\\\M
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-times \\\\mlxvl\\\\M 70
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-times \\\\mstxvl\\\\M 70
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-not \\\\mlxvx\\\\M
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-not \\\\mstxv\\\\M
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-not \\\\mstxvx\\\\M
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-times \\\\mlxvl\\\\M 21
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-times \\\\mstxvl\\\\M 21
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-6.c scan-assembler-times \\\\mlxvl\\\\M 10
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-6.c scan-assembler-times \\\\mstxvl\\\\M 10
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-6.c scan-assembler-times \\\\mstxvx?\\\\M 6
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-8.c scan-assembler-times \\\\mlxvl\\\\M 30
>> < FAIL: gcc.target/powerpc/p9-vec-length-full-8.c scan-assembler-times \\\\mstxvl\\\\M 10
>>
>> By checking several of them, it's due to that we don't vectorize
>> some loop having float type involved with partial vector any more.
>>
>> As you suggested above, I fixed them with an extra option
>> "-fno-trapping-math" and verified all of them can pass again.
>> I also noticed that the original test case in PR96373 fails
>> on Power10 too, so I added one constructed case pr96373.c
>> into sub bucket gcc.target/powerpc for testing coverage
>> on Power.
>>
>> One re-spin with the attached adjustment shows there is no
>> regression failure any more, and the new test case works well
>> on both ppc64 (P8) and ppc64le (P10) Linux.
>
> Thanks for doing this. The patch is OK, if you need approval.
> I'll push mine once it's in.

Thanks for the review! Pushed as r13-5978-g4f5a1198065dc0.

By the way, do we want this to be backported? If yes, I'll
backport it to the gcc-12 and gcc-11 branches soon (gcc-10
doesn't have partial vector support on Power).

BR,
Kewen
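
For readers following the thread, the failure mode described in the quoted patch description can be sketched in plain scalar C. This is only an illustration, not code from the patch or from pr96373.c: `VL`, `masked_load_zeroing` and `count_signalling_lanes` are hypothetical names, and the model assumes that inactive lanes of a partial load read as zero, so an unpredicated division also divides by zero in those lanes.

```c
#include <stddef.h>

enum { VL = 4 };  /* hypothetical vector length, in lanes */

/* Model of a partial (masked or length-controlled) load.  Assumption:
   lanes at or beyond `active` are filled with zero.  */
static void
masked_load_zeroing (double lane[VL], const double *src, size_t active)
{
  for (size_t i = 0; i < VL; i++)
    lane[i] = i < active ? src[i] : 0.0;
}

/* Count the executed lanes whose divisor is zero.  For a finite
   numerator, such a lane would raise FE_DIVBYZERO (x / 0) or
   FE_INVALID (0 / 0).  An unpredicated op executes all VL lanes;
   a predicated op executes only the active ones.  */
int
count_signalling_lanes (const double *divisor, size_t active, int predicated)
{
  double vb[VL];
  size_t executed;
  int signals = 0;

  if (active > VL)
    active = VL;
  masked_load_zeroing (vb, divisor, active);
  executed = predicated ? active : VL;
  for (size_t i = 0; i < executed; i++)
    if (vb[i] == 0.0)
      signals++;
  return signals;
}
```

With three active lanes that all hold the divisor 2.0, the unpredicated variant reports one signalling lane (the zero-filled inactive lane), while the predicated variant reports none. That matches the behaviour the patch aims for when -ftrapping-math is in effect.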