From: will schmidt
To: Xionghu Luo, gcc-patches@gcc.gnu.org
Cc: wschmidt@linux.ibm.com, dje.gcc@gmail.com, segher@kernel.crashing.org, linkw@gcc.gnu.org
Subject: Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
Date: Fri, 09 Jul 2021 13:40:00 -0500
References: <20210416071007.78812-1-luoxhu@linux.ibm.com> <6d438971-4778-91cf-451c-a493b0cf9bdf@linux.ibm.com> <4ead69cf-daac-31bb-ddb5-d7b41cf298e2@linux.ibm.com>
List-Id: Gcc-patches mailing list

On Wed, 2021-06-30 at 09:44 +0800, Xionghu Luo via Gcc-patches wrote:
> Gentle ping ^2, thanks.
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html
>
>
> On 2021/5/14 15:13, Xionghu Luo via Gcc-patches wrote:
> > Test SPEC2017 Ofast P8LE for this patch : 511.povray_r +1.14%,
> > 526.blender_r +1.72%, no obvious changes to others.

Ok.

> >
> >
> > On 2021/5/6 10:36, Xionghu Luo via Gcc-patches wrote:
> > > Gentle ping, thanks.
> > >
> > >
> > > On 2021/4/16 15:10, Xiong Hu Luo wrote:
> > > > fmod/fmodf and remainder/remainderf could be expanded instead of library
> > > > call when fast-math build, which is much faster.
> > > >
> > > > fmodf:
> > > >      fdivs   f0,f1,f2
> > > >      friz    f0,f0
> > > >      fnmsubs f1,f2,f0,f1
> > > >
> > > > remainderf:
> > > >      fdivs   f0,f1,f2
> > > >      frin    f0,f0
> > > >      fnmsubs f1,f2,f0,f1
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > 2021-04-16  Xionghu Luo
> > > >
> > > >         PR target/97142

That PR is "Bug 97142 - __builtin_fmod not optimized on POWER".
OK.

> > > >         * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
> > > >         (remainder<mode>3): Likewise.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > 2021-04-16  Xionghu Luo
> > > >
> > > >         PR target/97142
> > > >         * gcc.target/powerpc/pr97142.c: New test.

Ok.

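As a reading aid for the two three-instruction sequences quoted above, here is a minimal C sketch of the arithmetic they perform. The function and helper names are hypothetical and do not come from the patch; the integer casts only stand in for friz and frin and assume the quotient fits in a long long.

```c
/* Sketch only: mirrors the fdiv / fri* / fnmsub sequences quoted above.
   All names here are hypothetical, not taken from the patch.  */

/* Stand-in for friz: round toward zero.
   Assumption: |q| is small enough to fit in a long long.  */
static double round_toward_zero (double q)
{
  return (double) (long long) q;
}

/* Stand-in for frin: round to nearest, ties away from zero.
   Same range assumption as above.  */
static double round_nearest_away (double q)
{
  return (double) (long long) (q < 0 ? q - 0.5 : q + 0.5);
}

/* fdiv ; friz ; fnmsub.
   fnmsub FRT,FRA,FRC,FRB computes -(FRA*FRC - FRB), i.e. FRB - FRA*FRC,
   so "fnmsubs f1,f2,f0,f1" yields x - y * q with x in f1, y in f2, q in f0.  */
double fast_fmod (double x, double y)
{
  double q = round_toward_zero (x / y);
  return x - y * q;
}

/* fdiv ; frin ; fnmsub.  Same shape, different rounding of the quotient.  */
double fast_remainder (double x, double y)
{
  double q = round_nearest_away (x / y);
  return x - y * q;
}
```

For example, fast_remainder (5.0, 2.0) gives -1.0 in this sketch because the quotient 2.5 is rounded away from zero, whereas the C library remainder rounds ties to even and returns 1.0. Differences of this kind are presumably why the expansion is gated on flag_unsafe_math_optimizations.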
> > > > ---
> > > >  gcc/config/rs6000/rs6000.md                | 36 ++++++++++++++++++++++
> > > >  gcc/testsuite/gcc.target/powerpc/pr97142.c | 30 ++++++++++++++++++
> > > >  2 files changed, 66 insertions(+)
> > > >  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > >
> > > > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> > > > index a1315523fec..7e0e94e6ba4 100644
> > > > --- a/gcc/config/rs6000/rs6000.md
> > > > +++ b/gcc/config/rs6000/rs6000.md
> > > > @@ -4902,6 +4902,42 @@ (define_insn "fre"
> > > >     [(set_attr "type" "fp")
> > > >      (set_attr "isa" "*,")])
> > > > +(define_expand "fmod<mode>3"
> > > > +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> > > > +   (use (match_operand:SFDF 1 "gpc_reg_operand"))
> > > > +   (use (match_operand:SFDF 2 "gpc_reg_operand"))]
> > > > +  "TARGET_HARD_FLOAT
> > > > +  && TARGET_FPRND
> > > > +  && flag_unsafe_math_optimizations"
> > > > +{
> > > > +  rtx div = gen_reg_rtx (<MODE>mode);
> > > > +  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
> > > > +
> > > > +  rtx friz = gen_reg_rtx (<MODE>mode);
> > > > +  emit_insn (gen_btrunc<mode>2 (friz, div));
> > > > +
> > > > +  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz,
> > > > operands[1]));
> > > > +  DONE;
> > > > + })
> > > > +
> > > > +(define_expand "remainder<mode>3"
> > > > +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> > > > +   (use (match_operand:SFDF 1 "gpc_reg_operand"))
> > > > +   (use (match_operand:SFDF 2 "gpc_reg_operand"))]
> > > > +  "TARGET_HARD_FLOAT
> > > > +  && TARGET_FPRND
> > > > +  && flag_unsafe_math_optimizations"
> > > > +{
> > > > +  rtx div = gen_reg_rtx (<MODE>mode);
> > > > +  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
> > > > +
> > > > +  rtx frin = gen_reg_rtx (<MODE>mode);
> > > > +  emit_insn (gen_round<mode>2 (frin, div));
> > > > +
> > > > +  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin,
> > > > operands[1]));
> > > > +  DONE;
> > > > + })

I notice the pattern of arguments to the final emit is
op[0],op[2],fri*,op[1] while the description comment suggests the
generated instruction will be
   fnmsubs f1,f2,f0,f1
I don't see any rearranging in the nfms<mode>4 expansions, but presumably
this is correct and just a cosmetic nit that catches my eye.
Ok.

> > > > +
> > > >   (define_insn "*rsqrt2"
> > > >     [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,wa")
> > > >     (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" ",wa")]
> > > > diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > > b/gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > > new file mode 100644
> > > > index 00000000000..48f25ca5b5b
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > > @@ -0,0 +1,30 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-Ofast" } */
> > > > +
> > > > +#include <math.h>
> > > > +
> > > > +float test1 (float x, float y)
> > > > +{
> > > > +  return fmodf (x, y);
> > > > +}
> > > > +
> > > > +double test2 (double x, double y)
> > > > +{
> > > > +  return fmod (x, y);
> > > > +}
> > > > +
> > > > +float test3 (float x, float y)
> > > > +{
> > > > +  return remainderf (x, y);
> > > > +}
> > > > +
> > > > +double test4 (double x, double y)
> > > > +{
> > > > +  return remainder (x, y);
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
> > > > +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
> > > > +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
> > > > +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */

Ok.  I'd be tempted to add scan-assembler checks for the
fdivs,fri*,fnmsubs instructions as well.  I defer to others on that, of
course.. :-)

lgtm, thanks
-Will

> > > > +
> > > >
> > >
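To make the scan-assembler suggestion above concrete, the extra checks might look roughly like the following. This is an untested sketch, not part of the patch: the expected counts (one per test function) and the VSX alternatives in each regex are assumptions, since on VSX-capable targets the compiler may select xsdiv*, xsrdpiz/xsrdpi, and xsnmsub* forms instead of the classic FPR mnemonics.

```c
/* Hypothetical additions to pr97142.c; counts and mnemonics are assumptions.
   Each regex accepts either the classic FPR mnemonic or a VSX equivalent.  */

/* One divide per test function (test1..test4).  */
/* { dg-final { scan-assembler-times {\mfdivs?\M|\mxsdiv[sd]p\M} 4 } } */

/* Round toward zero for the two fmod variants.  */
/* { dg-final { scan-assembler-times {\mfriz\M|\mxsrdpiz\M} 2 } } */

/* Round to nearest (ties away) for the two remainder variants.  */
/* { dg-final { scan-assembler-times {\mfrin\M|\mxsrdpi\M} 2 } } */

/* One negative multiply-subtract per test function.  */
/* { dg-final { scan-assembler-times {\mfnmsubs?\M|\mxsnmsub[am][sd]p\M} 4 } } */
```

Whether fixed counts are stable across -mcpu settings would need checking on the actual targets; plain scan-assembler without a count would be the more forgiving choice.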