From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <xry111@xry111.site>
Received: from xry111.site (xry111.site [IPv6:2001:470:683e::1])
	by sourceware.org (Postfix) with ESMTPS id 70B9D3858C29
	for <gcc-patches@gcc.gnu.org>; Sat, 12 Nov 2022 04:41:05 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 70B9D3858C29
Authentication-Results: sourceware.org; dmarc=pass (p=reject dis=none) header.from=xry111.site
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=xry111.site
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=xry111.site;
	s=default; t=1668228062;
	bh=2ziq3d+Hq/1Cp5o8roFL1Uiuxe70C+VaqbevoAzt+7Y=;
	h=Subject:From:To:Cc:Date:In-Reply-To:References:From;
	b=PeCxMdpaVY7hNI+ysFQCynx+bifj7yxzq52+XslHiFQNnM0knofbpCtFvE0BCh5f+
	 gCLVeX5Wx19uQ7fKHLV//MpDOAQMJeuhQzv8LF4C3SA3pxmfKGMenB/6QxpEkhs/zW
	 JjL0IEjp9RndnEOvic0flBNici/YN5ULIqZo78ow=
Received: from [IPv6:240e:358:1161:e600:dc73:854d:832e:2] (unknown [IPv6:240e:358:1161:e600:dc73:854d:832e:2])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange ECDHE (P-256) server-signature ECDSA (P-384) server-digest SHA384)
	(Client did not present a certificate)
	(Authenticated sender: xry111@xry111.site)
	by xry111.site (Postfix) with ESMTPSA id 88EC2667AD;
	Fri, 11 Nov 2022 23:40:54 -0500 (EST)
Message-ID: <c207817c562185f8f4b403143d921ca898b3f3ab.camel@xry111.site>
Subject: Re: [PATCH v2 3/4] LoongArch: Add fscaleb.{s,d} instructions as
 ldexp{sf,df}3
From: Xi Ruoyao <xry111@xry111.site>
To: Lulu Cheng <chenglulu@loongson.cn>, gcc-patches@gcc.gnu.org
Cc: Wang Xuerui <i@xen0n.name>, Chenghua Xu <xuchenghua@loongson.cn>, 
	Xiaolin Tang <tangxiaolin@loongson.cn>
Date: Sat, 12 Nov 2022 12:40:45 +0800
In-Reply-To: <bc6d8401-087b-dbd3-a231-5ca8a9b6bc31@loongson.cn>
References: <20221109135329.952128-1-xry111@xry111.site>
	 <20221109135329.952128-4-xry111@xry111.site>
	 <bc6d8401-087b-dbd3-a231-5ca8a9b6bc31@loongson.cn>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
User-Agent: Evolution 3.46.0 
MIME-Version: 1.0
X-Spam-Status: No, score=-5.7 required=5.0 tests=BAYES_00,BODY_8BITS,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FROM_SUSPICIOUS_NTLD,GIT_PATCH_0,KAM_NUMSUBJECT,KAM_SHORT,LIKELY_SPAM_FROM,PDS_OTHER_BAD_TLD,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

On Sat, 2022-11-12 at 11:54 +0800, Lulu Cheng wrote:
>
> On 2022/11/9 at 9:53 PM, Xi Ruoyao wrote:
> > This allows optimizing __builtin_ldexp{,f} and __builtin_scalbn{,f} with
> > -fno-math-errno.
> >
> > IMODE is added because we can't hard code SI for operand 2: the fscaleb.d
> > instruction always takes the high half of both source registers into
> > account.  See my_ldexp_long in the test case.
> >
> > gcc/ChangeLog:
> >
> >         * config/loongarch/loongarch.md (UNSPEC_FSCALEB): New unspec.
> >         (type): Add fscaleb.
> >         (IMODE): New mode attr.
> >         (ldexp<mode>3): New instruction template.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/loongarch/fscaleb.c: New test.
> > ---
> >  gcc/config/loongarch/loongarch.md            | 26 ++++++++++-
> >  gcc/testsuite/gcc.target/loongarch/fscaleb.c | 48 ++++++++++++++++++++
> >  2 files changed, 72 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/loongarch/fscaleb.c
> >
> > diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
> > index eb127c346a3..c141c9adde2 100644
> > --- a/gcc/config/loongarch/loongarch.md
> > +++ b/gcc/config/loongarch/loongarch.md
> > @@ -41,6 +41,7 @@ (define_c_enum "unspec" [
> >    UNSPEC_FTINT
> >    UNSPEC_FTINTRM
> >    UNSPEC_FTINTRP
> > +  UNSPEC_FSCALEB
> >
> >    ;; Override return address for exception handling.
> >    UNSPEC_EH_RETURN
> > @@ -220,6 +221,7 @@ (define_attr "qword_mode" "no,yes"
> >  ;; fcmp         floating point compare
> >  ;; fcopysign    floating point copysign
> >  ;; fcvt         floating point convert
> > +;; fscaleb      floating point scale
> >  ;; fsqrt        floating point square root
> >  ;; frsqrt       floating point reciprocal square root
> >  ;; multi        multiword sequence (or user asm statements)
> > @@ -231,8 +233,8 @@ (define_attr "type"
> >    "unknown,branch,jump,call,load,fpload,fpidxload,store,fpstore,fpidxstore,
> >     prefetch,prefetchx,condmove,mgtf,mftg,const,arith,logical,
> >     shift,slt,signext,clz,trap,imul,idiv,move,
> > -   fmove,fadd,fmul,fmadd,fdiv,frdiv,fabs,fneg,fcmp,fcopysign,fcvt,fsqrt,
> > -   frsqrt,accext,accmod,multi,atomic,syncloop,nop,ghost"
> > +   fmove,fadd,fmul,fmadd,fdiv,frdiv,fabs,fneg,fcmp,fcopysign,fcvt,fscaleb,
> > +   fsqrt,frsqrt,accext,accmod,multi,atomic,syncloop,nop,ghost"
> >    (cond [(eq_attr "jirl" "!unset") (const_string "call")
> >           (eq_attr "got" "load") (const_string "load")
> >
> > @@ -418,6 +420,10 @@ (define_mode_attr UNITMODE [(SF "SF") (DF "DF")])
> >  ;; the controlling mode.
> >  (define_mode_attr HALFMODE [(DF "SI") (DI "SI") (TF "DI")])
> >
> > +;; This attribute gives the integer mode that has the same size of a
> > +;; floating-point mode.
> > +(define_mode_attr IMODE [(SF "SI") (DF "DI")])
> > +
> >  ;; This code iterator allows signed and unsigned widening multiplications
> >  ;; to use the same template.
> >  (define_code_iterator any_extend [sign_extend zero_extend])
> > @@ -1014,7 +1020,23 @@ (define_insn "copysign<mode>3"
> >    "fcopysign.<fmt>\t%0,%1,%2"
> >    [(set_attr "type" "fcopysign")
> >     (set_attr "mode" "<UNITMODE>")])
> > +=0C
> > +;;
> > +;;  ....................
> > +;;
> > +;;      FLOATING POINT SCALE
> > +;;
> > +;;  ....................
> >
> > +(define_insn "ldexp<mode>3"
> > +  [(set (match_operand:ANYF 0 "register_operand" "=f")
> > +        (unspec:ANYF [(match_operand:ANYF    1 "register_operand" "f")
> > +                      (match_operand:<IMODE> 2 "register_operand" "f")]
> > +                     UNSPEC_FSCALEB))]
> > +  "TARGET_HARD_FLOAT"
> > +  "fscaleb.<fmt>\t%0,%1,%2"
> > +  [(set_attr "type" "fscaleb")
> > +   (set_attr "mode" "<UNITMODE>")])
> > =C2=A0 =0C
> >  ;;
> >  ;;  ...................
> > diff --git a/gcc/testsuite/gcc.target/loongarch/fscaleb.c b/gcc/testsuite/gcc.target/loongarch/fscaleb.c
> > new file mode 100644
> > index 00000000000..f18470fbb8f
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/loongarch/fscaleb.c
> > @@ -0,0 +1,48 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -mabi=lp64d -mdouble-float -fno-math-errno" } */
> > +/* { dg-final { scan-assembler-times "fscaleb\\.s" 3 } } */
> > +/* { dg-final { scan-assembler-times "fscaleb\\.d" 4 } } */
> > +/* { dg-final { scan-assembler-times "slli\\.w" 1 } } */
> > +
> > +double
> > +my_scalbln (double a, long b)
> > +{
> > +  return __builtin_scalbln (a, b);
> > +}
> > +
> > +double
> > +my_scalbn (double a, int b)
> > +{
> > +  return __builtin_scalbn (a, b);
> > +}
> > +
> > +
> > +float
> > +my_scalblnf (float a, long b)
> > +{
> > +  return __builtin_scalblnf (a, b);
> > +}
> > +
> > +float
> > +my_scalbnf (float a, int b)
> > +{
> > +  return __builtin_scalbnf (a, b);
> > +}
> > +
> >
> I think the test functions for these four builtins (scalbln, scalblnf,
> scalbn and scalbnf) should be controlled by the macro __FLT_RADIX__.
>
> These functions should only be tested if the macro __FLT_RADIX__ has a
> value of 2.

LoongArch does not use RESET_FLOAT_FORMAT on SFmode, so __FLT_RADIX__ is
always 2.
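
To spell out why the radix matters here: scalbn scales by FLT_RADIX**n
while ldexp scales by 2**n, so an ldexp pattern can only stand in for
scalbn when the radix is 2.  A rough reference model in plain C follows;
ref_scalbn and ref_ldexp are hypothetical names used for illustration
only, they are not part of the patch or of GCC, and overflow and rounding
are ignored:

```c
#include <float.h>

/* Hypothetical reference model of scalbn: a * FLT_RADIX**b, computed by
   repeated multiplication or division.  Illustration only.  */
static double
ref_scalbn (double a, int b)
{
  for (; b > 0; b--)
    a *= FLT_RADIX;
  for (; b < 0; b++)
    a /= FLT_RADIX;
  return a;
}

/* Hypothetical reference model of ldexp: a * 2**b, independent of the
   radix of the floating-point format.  */
static double
ref_ldexp (double a, int b)
{
  for (; b > 0; b--)
    a *= 2.0;
  for (; b < 0; b++)
    a /= 2.0;
  return a;
}
```

With FLT_RADIX equal to 2 the two models agree for every exponent, for
example ref_scalbn (1.5, 4) and ref_ldexp (1.5, 4) both give 24.0.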

--=20
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University