From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v2 3/4] LoongArch: Add fscaleb.{s,d} instructions as ldexp{sf,df}3
From: Xi Ruoyao
To: Lulu Cheng, gcc-patches@gcc.gnu.org
Cc: Wang Xuerui, Chenghua Xu, Xiaolin Tang
Date: Sat, 12 Nov 2022 12:40:45 +0800
References: <20221109135329.952128-1-xry111@xry111.site> <20221109135329.952128-4-xry111@xry111.site>

On Sat, 2022-11-12 at 11:54 +0800, Lulu Cheng wrote:
>
> On 2022/11/9 at 9:53 PM, Xi Ruoyao wrote:
> > This allows optimizing __builtin_ldexp{,f} and __builtin_scalbn{,f} with
> > -fno-math-errno.
> >
> > IMODE is added because we can't hard code SI for operand 2: the fscaleb.d
> > instruction always takes the high half of both source registers into
> > account.  See my_ldexp_long in the test case.
> >
> > gcc/ChangeLog:
> >
> >         * config/loongarch/loongarch.md (UNSPEC_FSCALEB): New unspec.
> >         (type): Add fscaleb.
> >         (IMODE): New mode attr.
> >         (ldexp3): New instruction template.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/loongarch/fscaleb.c: New test.
> > ---
> >   gcc/config/loongarch/loongarch.md            | 26 ++++++++++-
> >   gcc/testsuite/gcc.target/loongarch/fscaleb.c | 48 ++++++++++++++++++++
> >   2 files changed, 72 insertions(+), 2 deletions(-)
> >   create mode 100644 gcc/testsuite/gcc.target/loongarch/fscaleb.c
> >
> > diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
> > index eb127c346a3..c141c9adde2 100644
> > --- a/gcc/config/loongarch/loongarch.md
> > +++ b/gcc/config/loongarch/loongarch.md
> > @@ -41,6 +41,7 @@ (define_c_enum "unspec" [
> >     UNSPEC_FTINT
> >     UNSPEC_FTINTRM
> >     UNSPEC_FTINTRP
> > +  UNSPEC_FSCALEB
> >
> >     ;; Override return address for exception handling.
> >     UNSPEC_EH_RETURN
> > @@ -220,6 +221,7 @@ (define_attr "qword_mode" "no,yes"
> >   ;; fcmp               floating point compare
> >   ;; fcopysign  floating point copysign
> >   ;; fcvt               floating point convert
> > +;; fscaleb     floating point scale
> >   ;; fsqrt      floating point square root
> >   ;; frsqrt       floating point reciprocal square root
> >   ;; multi      multiword sequence (or user asm statements)
> > @@ -231,8 +233,8 @@ (define_attr "type"
> >     "unknown,branch,jump,call,load,fpload,fpidxload,store,fpstore,fpidxstore,
> >      prefetch,prefetchx,condmove,mgtf,mftg,const,arith,logical,
> >      shift,slt,signext,clz,trap,imul,idiv,move,
> > -   fmove,fadd,fmul,fmadd,fdiv,frdiv,fabs,fneg,fcmp,fcopysign,fcvt,fsqrt,
> > -   frsqrt,accext,accmod,multi,atomic,syncloop,nop,ghost"
> > +   fmove,fadd,fmul,fmadd,fdiv,frdiv,fabs,fneg,fcmp,fcopysign,fcvt,fscaleb,
> > +   fsqrt,frsqrt,accext,accmod,multi,atomic,syncloop,nop,ghost"
> >     (cond [(eq_attr "jirl" "!unset") (const_string "call")
> >          (eq_attr "got" "load") (const_string "load")
> >
> > @@ -418,6 +420,10 @@ (define_mode_attr UNITMODE [(SF "SF") (DF "DF")])
> >   ;; the controlling mode.
> >   (define_mode_attr HALFMODE [(DF "SI") (DI "SI") (TF "DI")])
> >
> > +;; This attribute gives the integer mode that has the same size of a
> > +;; floating-point mode.
> > +(define_mode_attr IMODE [(SF "SI") (DF "DI")])
> > +
> >   ;; This code iterator allows signed and unsigned widening multiplications
> >   ;; to use the same template.
> >   (define_code_iterator any_extend [sign_extend zero_extend])
> > @@ -1014,7 +1020,23 @@ (define_insn "copysign3"
> >     "fcopysign.\t%0,%1,%2"
> >     [(set_attr "type" "fcopysign")
> >      (set_attr "mode" "")])
> > +^L
> > +;;
> > +;;  ....................
> > +;;
> > +;;     FLOATING POINT SCALE
> > +;;
> > +;;  ....................
> >
> > +(define_insn "ldexp3"
> > +  [(set (match_operand:ANYF 0 "register_operand" "=f")
> > +       (unspec:ANYF [(match_operand:ANYF    1 "register_operand" "f")
> > +                     (match_operand: 2 "register_operand" "f")]
> > +                    UNSPEC_FSCALEB))]
> > +  "TARGET_HARD_FLOAT"
> > +  "fscaleb.\t%0,%1,%2"
> > +  [(set_attr "type" "fscaleb")
> > +   (set_attr "mode" "")])
> >   ^L
> >   ;;
> >   ;;  ...................
> > diff --git a/gcc/testsuite/gcc.target/loongarch/fscaleb.c b/gcc/testsuite/gcc.target/loongarch/fscaleb.c
> > new file mode 100644
> > index 00000000000..f18470fbb8f
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/loongarch/fscaleb.c
> > @@ -0,0 +1,48 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -mabi=lp64d -mdouble-float -fno-math-errno" } */
> > +/* { dg-final { scan-assembler-times "fscaleb\\.s" 3 } } */
> > +/* { dg-final { scan-assembler-times "fscaleb\\.d" 4 } } */
> > +/* { dg-final { scan-assembler-times "slli\\.w" 1 } } */
> > +
> > +double
> > +my_scalbln (double a, long b)
> > +{
> > +  return __builtin_scalbln (a, b);
> > +}
> > +
> > +double
> > +my_scalbn (double a, int b)
> > +{
> > +  return __builtin_scalbn (a, b);
> > +}
> > +
> > +
> > +float
> > +my_scalblnf (float a, long b)
> > +{
> > +  return __builtin_scalblnf (a, b);
> > +}
> > +
> > +float
> > +my_scalbnf (float a, int b)
> > +{
> > +  return __builtin_scalbnf (a, b);
> > +}
> > +
> >
> I think the tests for these four builtins (scalbln, scalblnf, scalbn and
> scalbnf) should be controlled by the macro __FLT_RADIX__.
>
> These functions are tested only if the macro __FLT_RADIX__ has a value
> of 2.

LoongArch does not use RESET_FLOAT_FORMAT on SFmode, so __FLT_RADIX__ is
always 2.

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University