Message-ID: <069390c9612943f8b196d9ec10edb907c09aeda9.camel@xry111.site>
Subject: Re: [PATCH v3 1/5] LoongArch: Fix usage of LSX and LASX frint/ftint instructions [PR112578]
From: Xi Ruoyao
To: chenglulu, gcc-patches@gcc.gnu.org, Uros Bizjak, Joseph Myers
Cc: i@xen0n.name, xuchenghua@loongson.cn
Date: Thu, 23 Nov 2023 20:06:13 +0800
In-Reply-To: <0fc6f3d2536b6d2d8a1e86a5e17354f89ba7040a.camel@xry111.site>
References: <20231120004728.205167-1-xry111@xry111.site>
 <20231120004728.205167-2-xry111@xry111.site>
 <2d1c9d59544d15ef7fba07d758431da840cc0bfe.camel@xry111.site>
 <9ce7e0b2-eeeb-a8c5-2cc7-e9b65b1b2a6b@loongson.cn>
 <0fc6f3d2536b6d2d8a1e86a5e17354f89ba7040a.camel@xry111.site>

On Thu, 2023-11-23 at 18:12 +0800, Xi Ruoyao wrote:
> On Thu, 2023-11-23 at 17:12 +0800, chenglulu wrote:
> >
> > On 2023/11/23 at 5:02 PM, Xi Ruoyao wrote:
> > > On Thu, 2023-11-23 at 16:13 +0800, chenglulu wrote:
> > > > The fix_truncv4sfv4si2 template is indeed called when debugging with
> > > > gdb.
> > > >
> > > > So I think we can use define_expand here.
> > > The problem is cases where we want to combine an rint call with
> > > float-to-int conversion:
> > >
> > > float x[4];
> > > int y[4];
> > >
> > > void test()
> > > {
> > >   for (int i = 0; i < 4; i++)
> > >     y[i] = __builtin_rintf(x[i]);
> > > }
> > >
> > > With define_expand we get "vfrint + vftintrz", but with define_insn we
> > > get a single "vftint".
> > >
> > > Arguably the generic code should try to handle this (PR86609), but it's
> > > "not sure if that's a good idea in general" (comment 1 in the PR) so we
> > > can do this in a target-specific way.
> > >
> > I tried to use Ofast to compile, and found that a vftint was generated,
> > and at.006t.gimple appeared.
> >
> > If O2 was compiled, __builtin_rintf would be generated, but Ofast would
> > generate __builtin_irintf
>
> Indeed...  It seems the FE will only generate __builtin_irintf when
> -fno-math-errno -funsafe-math-optimizations.
>
> But I cannot see why this is necessary (at least for us): the rintf
> function does not set errno at all, and to me using vftint.w.s here is
> safe: if the rounded result can be represented as a 32-bit int,
> obviously there is no issue;  otherwise, per C23 section F.4 we should
> raise FE_INVALID and produce unspecified result.  It seems our ftint.w.s
> instruction has the required semantics.
>
> +Uros and Joseph for some comment about the expected behavior of
> (int)rintf(x).

I've spent some time reading the code, and here is what I found.

The -fno-math-errno requirement is there to prevent (int)rintf(x) from
being converted into a call to the *external* function irintf(x).  The
problem is that rintf never sets errno but irintf may; this was PR 61876.
However, this does not prevent us from using ftint.w.s, because the
instruction does not set errno.

For -funsafe-math-optimizations, there seems to be a logic error in
convert_to_integer_1:

      /* Convert e.g. (long)round(d) -> lround(d).  */
      /* If we're converting to char, we may encounter differing behavior
         between converting from double->char vs double->long->char.
         We're in "undefined" territory but we prefer to be conservative,
         so only proceed in "unsafe" math mode.  */
      if (optimize
          && (flag_unsafe_math_optimizations
              || (long_integer_type_node
                  && outprec >= TYPE_PRECISION (long_integer_type_node))))

But shouldn't we compare against integer_type_node here, since this
block handles __builtin_irint etc., whose result is int (not long)?

Anyway, neither constraint applies to our ftint.w.s instruction.  IMO
the second constraint is a target-independent bug which should be
fixed.  The first constraint must stay, but it only exists to prevent
mistakenly calling an external irint (which may set errno); it says
nothing about the ftint.w.s instruction, which does not even know about
errno.

So we should do this in the target-specific way, i.e. with a
define_insn, to ensure the optimization happens even with -fmath-errno.

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
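
The guard quoted from convert_to_integer_1 can be modelled in plain C
to show why a conversion to int is affected.  This is only a minimal
sketch, not GCC code: the function names guard_as_written and
guard_vs_int are hypothetical, precisions are passed in as plain bit
counts, and it assumes "optimize" is on and long_integer_type_node is
non-null.  The second function is merely the alternative raised in the
question above, not a committed fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the quoted condition.
   unsafe_math: stands in for flag_unsafe_math_optimizations
   outprec:     precision in bits of the integer type converted to
   long_prec:   precision in bits of long on the target
   Returns true when the guarded rewrite would be allowed.  */
bool guard_as_written(bool unsafe_math, unsigned outprec, unsigned long_prec)
{
    assert(outprec > 0 && long_prec > 0);
    return unsafe_math || outprec >= long_prec;
}

/* Same shape, but compared against the precision of int instead of
   long.  This is an assumption illustrating the question in the mail,
   not what GCC does today.  */
bool guard_vs_int(bool unsafe_math, unsigned outprec, unsigned int_prec)
{
    assert(outprec > 0 && int_prec > 0);
    return unsafe_math || outprec >= int_prec;
}
```

With a 32-bit int and a 64-bit long (an LP64 target),
guard_as_written(false, 32, 64) is false, so a conversion to int is
only rewritten under -funsafe-math-optimizations, while
guard_vs_int(false, 32, 32) is true.  A conversion to an 8-bit char
stays guarded in both variants, which matches the intent stated in the
quoted comment.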