Subject: Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
To: will schmidt, gcc-patches@gcc.gnu.org
Cc: wschmidt@linux.ibm.com, dje.gcc@gmail.com, segher@kernel.crashing.org, linkw@gcc.gnu.org
References: <20210416071007.78812-1-luoxhu@linux.ibm.com> <6d438971-4778-91cf-451c-a493b0cf9bdf@linux.ibm.com> <4ead69cf-daac-31bb-ddb5-d7b41cf298e2@linux.ibm.com>
From: Xionghu Luo
Message-ID: <2260972d-e44d-f084-69bb-2b7c96c00525@linux.ibm.com>
Date: Mon, 12 Jul 2021 09:25:22 +0800

On 2021/7/10 02:40, will schmidt wrote:
> On Wed, 2021-06-30 at 09:44 +0800, Xionghu Luo via Gcc-patches wrote:
>> Gentle ping ^2, thanks.
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html
>>
>>
>> On 2021/5/14 15:13, Xionghu Luo via Gcc-patches wrote:
>>> Test SPEC2017 Ofast P8LE for this patch : 511.povray_r +1.14%,
>>> 526.blender_r +1.72%, no obvious changes to others.
>
> Ok.
>
>>>
>>>
>>> On 2021/5/6 10:36, Xionghu Luo via Gcc-patches wrote:
>>>> Gentle ping, thanks.
>>>>
>>>>
>>>> On 2021/4/16 15:10, Xiong Hu Luo wrote:
>>>>> fmod/fmodf and remainder/remainderf could be expanded instead of library
>>>>> call when fast-math build, which is much faster.
>>>>>
>>>>> fmodf:
>>>>>      fdivs   f0,f1,f2
>>>>>      friz    f0,f0
>>>>>      fnmsubs f1,f2,f0,f1
>>>>>
>>>>> remainderf:
>>>>>      fdivs   f0,f1,f2
>>>>>      frin    f0,f0
>>>>>      fnmsubs f1,f2,f0,f1
>>>>>
>>>>> gcc/ChangeLog:
>>>>>
>>>>> 2021-04-16  Xionghu Luo
>>>>>
>>>>>     PR target/97142
>
> That PR is "Bug 97142 - __builtin_fmod not optimized on POWER"
>
> OK.
>
>
>>>>>     * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
>>>>>     (remainder<mode>3): Likewise.
>
>
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>>
>>>>> 2021-04-16  Xionghu Luo
>>>>>
>>>>>     PR target/97142
>>>>>     * gcc.target/powerpc/pr97142.c: New test.
>
> Ok.
>
>>>>> ---
>>>>>   gcc/config/rs6000/rs6000.md                | 36 ++++++++++++++++++++++
>>>>>   gcc/testsuite/gcc.target/powerpc/pr97142.c | 30 ++++++++++++++++++
>>>>>   2 files changed, 66 insertions(+)
>>>>>   create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>>
>>>>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>>>>> index a1315523fec..7e0e94e6ba4 100644
>>>>> --- a/gcc/config/rs6000/rs6000.md
>>>>> +++ b/gcc/config/rs6000/rs6000.md
>>>>> @@ -4902,6 +4902,42 @@ (define_insn "fre"
>>>>>      [(set_attr "type" "fp")
>>>>>       (set_attr "isa" "*,")])
>>>>> +(define_expand "fmod<mode>3"
>>>>> +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
>>>>> +    (use (match_operand:SFDF 1 "gpc_reg_operand"))
>>>>> +    (use (match_operand:SFDF 2 "gpc_reg_operand"))]
>>>>> +  "TARGET_HARD_FLOAT
>>>>> +   && TARGET_FPRND
>>>>> +   && flag_unsafe_math_optimizations"
>>>>> +{
>>>>> +  rtx div = gen_reg_rtx (<MODE>mode);
>>>>> +  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
>>>>> +
>>>>> +  rtx friz = gen_reg_rtx (<MODE>mode);
>>>>> +  emit_insn (gen_btrunc<mode>2 (friz, div));
>>>>> +
>>>>> +  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz,
>>>>> operands[1]));
>>>>> +  DONE;
>>>>> + })
>>>>> +
>>>>> +(define_expand "remainder<mode>3"
>>>>> +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
>>>>> +    (use (match_operand:SFDF 1 "gpc_reg_operand"))
>>>>> +    (use (match_operand:SFDF 2 "gpc_reg_operand"))]
>>>>> +  "TARGET_HARD_FLOAT
>>>>> +   && TARGET_FPRND
>>>>> +   && flag_unsafe_math_optimizations"
>>>>> +{
>>>>> +  rtx div = gen_reg_rtx (<MODE>mode);
>>>>> +  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
>>>>> +
>>>>> +  rtx frin = gen_reg_rtx (<MODE>mode);
>>>>> +  emit_insn (gen_round<mode>2 (frin, div));
>>>>> +
>>>>> +  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin,
>>>>> operands[1]));
>>>>> +  DONE;
>>>>> + })
>
> I notice the pattern of arguments to the final emit
> is op[0],op[2],fri*,op[1]
> while the description comment suggests the generated instruction
> will be fnmsubs f1,f2,f0,f1 ;
>
> I don't see
> any rearranging in the nfms<mode>4 expansions, but
> presumably this is correct and just a cosmetic nit that catches my eye.

From the ISA, fnmsub FRT,FRA,FRC,FRB

The operation
  FRT ← - ( [(FRA)×(FRC)] - (FRB) )
is performed.

fmodf:
     fdivs   f0,f1,f2
     friz    f0,f0
     fnmsubs f1,f2,f0,f1

So the operand order op[0],op[2],fri*,op[1] maps directly onto
FRT,FRA,FRC,FRB, and no rearranging is needed.  The ASM then means:

f1 = - (f2 * f0 - f1) = - (f2 * friz(f1/f2) - f1) = f1 - f2 * friz(f1/f2)

So f1 holds the fmod result.

>
> Ok.
>
>
>>>>> +
>>>>>    (define_insn "*rsqrt2"
>>>>>      [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,wa")
>>>>>        (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" ",wa")]
>>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>> b/gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>> new file mode 100644
>>>>> index 00000000000..48f25ca5b5b
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>> @@ -0,0 +1,30 @@
>>>>> +/* { dg-do compile } */
>>>>> +/* { dg-options "-Ofast" } */
>>>>> +
>>>>> +#include <math.h>
>>>>> +
>>>>> +float test1 (float x, float y)
>>>>> +{
>>>>> +  return fmodf (x, y);
>>>>> +}
>>>>> +
>>>>> +double test2 (double x, double y)
>>>>> +{
>>>>> +  return fmod (x, y);
>>>>> +}
>>>>> +
>>>>> +float test3 (float x, float y)
>>>>> +{
>>>>> +  return remainderf (x, y);
>>>>> +}
>>>>> +
>>>>> +double test4 (double x, double y)
>>>>> +{
>>>>> +  return remainder (x, y);
>>>>> +}
>>>>> +
>>>>> +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
>>>>> +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
>>>>> +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
>>>>> +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
>
>
> Ok.
> I'd be tempted to add scan-assembler checks for the fdivs,fri*,fnmsubs
> instructions as well.
> I defer to others on that, of course..
> :-)

Thanks, I will add the checks below:

diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
index 48f25ca5b5b..081ab40b4c0 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr97142.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
@@ -27,4 +27,11 @@ double test4 (double x, double y)
 /* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
 /* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
 /* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
+/* { dg-final { scan-assembler-times {\mfdiv\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfdivs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsub\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsubs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfriz\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfrin\M} 2 } } */
+

>
> lgtm,
> thanks
> -Will
>
>
>
>>>>> +
>>>>>
>>
>>
>

--
Thanks,
Xionghu