From: "Roger Sayle"
To: "'Jeff Law'"
Subject: RE: [PATCH] middle-end: Support ABIs that pass FP values as wider integers.
Date: Mon, 14 Mar 2022 17:24:18 -0000
Message-ID: <000a01d837c8$56b6d840$042488c0$@nextmovesoftware.com>

Hi Jeff,

> What I find rather surprising is the location of your changes -- they feel
> incomplete. For example, you fix the callee side of returns in
> expand_value_return, but I don't see analogous code for the caller side.
>
> Similarly while you fix things for arguments in expand_expr_real_1, that's again
> just the callee side. Don't you need to do something on the caller side too?

I've taken the pragmatic approach for this fix to PR target/104489, that this
patch only needs to modify/fix the parts of the middle-end that are broken.
With this patch, gcc can compile the following with -O2 -misa=sm_80 -ffast-math

_Float16 p;
_Float16 q;
_Float16 r;

_Float16 foo(_Float16 x, _Float16 y) { return x * y; }

_Float16 mid(_Float16 x, _Float16 y) { return foo(x,y) + foo(y,x); }

void bar() { p = mid(q,r); }

which I assume covers all of the paths that I/we need to care about.
Technically, the blocker is that without this patch, GCC's build fails in
libgcc (compiling __mulhc3) when/if HFmode is enabled by default.

I'm hoping any remaining issues, not caught by the current testsuite, can be
handled as regular Bugzilla PRs to be fixed/added to the testsuite.
Let me know if there's anything I've missed or need to worry about.

I believe most PC laptops/desktops contain Nvidia graphics cards, so it's
relatively easy for GCC developers to try things out (on real hardware) for
themselves.

Cheers,
Roger
--

> -----Original Message-----
> From: Jeff Law
> Sent: 14 March 2022 15:30
> To: Roger Sayle; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] middle-end: Support ABIs that pass FP values as wider
> integers.
>
>
> On 2/9/2022 1:12 PM, Roger Sayle wrote:
> > This patch adds middle-end support for target ABIs that pass/return
> > floating point values in integer registers with precision wider than
> > the original FP mode. An example is the nvptx backend where 16-bit
> > HFmode registers are passed/returned as (promoted to) SImode registers.
> > Unfortunately, this currently falls foul of the various (recent?)
> > sanity checks that (very sensibly) prevent creating paradoxical
> > SUBREGs of floating point registers. The approach below is to
> > explicitly perform the conversion/promotion in two steps, via an
> > integer mode of same precision as the floating point value.
> > So on nvptx, 16-bit HFmode is initially converted to 16-bit HImode (using
> > SUBREG), then zero-extended to SImode, and likewise when going the
> > other way, parameters are truncated to HImode then converted to HFmode
> > (using SUBREG). These changes are localized to expand_value_return
> > and expanding DECL_RTL to support strange ABIs, rather than inside
> > convert_modes or gen_lowpart, as mismatched precision integer/FP
> > conversions should be explicit in the RTL, and these semantics are not
> > generally visible/implicit in user code.
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check with no new failures, and on nvptx-none, where it is
> > the middle-end portion of a pair of patches to allow the default ISA
> > to be advanced. Ok for mainline?
> >
> > 2022-02-09 Roger Sayle
> >
> > gcc/ChangeLog
> >         * cfgexpand.cc (expand_value_return): Allow backends to promote
> >         a scalar floating point return value to a wider integer mode.
> >         * expr.cc (expand_expr_real_1) [expand_decl_rtl]: Likewise, allow
> >         backends to promote scalar FP PARM_DECLs to wider integer modes.
>
> Buried somewhere in our calling conventions code is the ability to pass around
> BLKmode objects in registers along with the ability to tune left vs right padding
> adjustments. Much of this support grew out of the PA 32 bit SOM ABI.
>
> While I think we could probably make those bits do what we want, I suspect the
> result will actually be uglier than what you've done here and I wouldn't be
> surprised if there was a performance hit as the code to handle those cases was
> pretty dumb in its implementation.
>
> What I find rather surprising is the location of your changes -- they feel
> incomplete. For example, you fix the callee side of returns in
> expand_value_return, but I don't see analogous code for the caller side.
>
> Similarly while you fix things for arguments in expand_expr_real_1, that's again
> just the callee side. Don't you need to do something on the caller side too?
>
> Jeff
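
For readers less familiar with the RTL terminology, the two-step promotion described in the quoted patch submission (FP value to a same-precision integer via SUBREG, then zero-extension to the wider integer mode, and the reverse on the way back) can be sketched at the C level. This is only an analogy and not code from the patch: it assumes an IEEE 754 32-bit float, and uses float/uint32_t/uint64_t as stand-ins for HFmode/HImode/SImode so that it builds without _Float16 support. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Analogy only: float / uint32_t / uint64_t stand in for
   HFmode / HImode / SImode.  Names below are hypothetical. */
static_assert(sizeof(float) == sizeof(uint32_t),
              "this sketch assumes a 32-bit float");

/* Step 1: reinterpret the FP bits as an integer of the same width
   (the role played by the same-precision SUBREG).
   Step 2: zero-extend that integer to the wider register width. */
uint64_t promote_fp_to_wide_int(float value)
{
    uint32_t same_width_bits;
    memcpy(&same_width_bits, &value, sizeof same_width_bits);
    return (uint64_t)same_width_bits;
}

/* Reverse direction: truncate the wide integer to the same-width
   integer, then reinterpret those bits as the FP value. */
float demote_wide_int_to_fp(uint64_t wide)
{
    uint32_t same_width_bits = (uint32_t)wide;
    float value;
    memcpy(&value, &same_width_bits, sizeof value);
    return value;
}
```

The point of going through the same-width integer is that no step ever treats a narrow FP value as if it were directly a wider integer, which is what the paradoxical-SUBREG checks mentioned in the submission reject.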