Hi Jeff,
Sorry for the long delay getting back to this.  After deeper
investigation, it turns out your tingling spider sense was spot on:
the original patch wasn't updating every location that needed it.
Although my nvptx testing showed no problems at -O2, compiling the
same tests at -O0 triggered several additional assertion ICEs
(exactly where you'd predicted they would be).

Here's a revised patch that updates five locations (up from the
previous two).  Tracking down further locations, if there are any,
should be easier once folks are able to test things on their targets.
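
For anyone who hasn't followed the earlier thread, here is a rough,
standalone model of the two-step promotion described in my original
submission (quoted below).  It is not code from the patch and uses no
GCC internals.  As an assumption for illustration, a 32-bit float and
a 64-bit integer stand in for HFmode and SImode, because standard C++
has no portable 16-bit float type; the function names are made up.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Step 1: reinterpret the FP bits as an integer of the SAME width.
// This is the analogue of the same-precision (non-paradoxical) SUBREG.
std::uint32_t fp_bits(float f)
{
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "sketch assumes a 32-bit float");
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

// Step 2: zero-extend the same-width integer to the wider integer
// mode in which the ABI passes or returns the value.
std::uint64_t promote_fp(float f)
{
  return static_cast<std::uint64_t>(fp_bits(f));
}

// Reverse direction: truncate to the same-width integer first, then
// reinterpret those bits as the FP value.
float demote_fp(std::uint64_t wide)
{
  std::uint32_t bits = static_cast<std::uint32_t>(wide);
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// Hypothetical self-check: the round trip preserves the value and the
// upper half of the promoted integer is zero.
void self_check()
{
  assert(demote_fp(promote_fp(1.5f)) == 1.5f);
  assert((promote_fp(-2.0f) >> 32) == 0);
}
```

The point of the model is only that the FP/integer reinterpretation
never changes width; the widening and narrowing happen entirely on
the integer side.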


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures, and on nvptx-none, where it is
the middle-end portion of a pair of patches to allow the default ISA to
be advanced.  Ok for mainline?


2022-06-26  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR target/104489
	* calls.cc (precompute_register_parameters): Allow promotion
	of floating point values to be passed in wider integer modes.
	(expand_call): Allow floating point results to be returned in
	wider integer modes.
	* cfgexpand.cc (expand_value_return): Allow backends to promote
	a scalar floating point return value to a wider integer mode.
	* expr.cc (expand_expr_real_1) <expand_decl_rtl>: Likewise, allow
	backends to promote scalar FP PARM_DECLs to wider integer modes.
	* function.cc (assign_parm_setup_stack): Allow floating point
	values to be passed on the stack as wider integer modes.

Thanks in advance,
Roger
--

> -----Original Message-----
> From: Jeff Law <jeffreyalaw@gmail.com>
> Sent: 14 March 2022 15:30
> To: Roger Sayle <roger@nextmovesoftware.com>; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] middle-end: Support ABIs that pass FP values as wider
> integers.
> 
> 
> 
> On 2/9/2022 1:12 PM, Roger Sayle wrote:
> > This patch adds middle-end support for target ABIs that pass/return
> > floating point values in integer registers with precision wider than
> > the original FP mode.  An example is the nvptx backend, where 16-bit
> > HFmode registers are passed/returned as (promoted to) SImode registers.
> > Unfortunately, this currently falls foul of the various (recent?)
> > sanity checks that (very sensibly) prevent creating paradoxical
> > SUBREGs of floating point registers.  The approach below is to
> > explicitly perform the conversion/promotion in two steps, via an
> > integer mode of same precision as the floating point value.  So on
> > nvptx, 16-bit HFmode is initially converted to 16-bit HImode (using
> > SUBREG), then zero-extended to SImode, and likewise when going the
> > other way, parameters truncated to HImode then converted to HFmode
> > (using SUBREG).  These changes are localized to expand_value_return
> > and expanding DECL_RTL to support strange ABIs, rather than inside
> > convert_modes or gen_lowpart, as mismatched precision integer/FP
> > conversions should be explicit in the RTL, and these semantics not
> > generally visible/implicit in user code.
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check with no new failures, and on nvptx-none, where it is
> > the middle-end portion of a pair of patches to allow the default ISA
> > to be advanced.  Ok for mainline?
> >
> > 2022-02-09  Roger Sayle  <roger@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >         * cfgexpand.cc (expand_value_return): Allow backends to promote
> >         a scalar floating point return value to a wider integer mode.
> >         * expr.cc (expand_expr_real_1) [expand_decl_rtl]: Likewise, allow
> >         backends to promote scalar FP PARM_DECLs to wider integer modes.
> 
> Buried somewhere in our calling conventions code is the ability to pass around
> BLKmode objects in registers along with the ability to tune left vs right padding
> adjustments.  Much of this support grew out of the PA 32 bit SOM ABI.
> 
> While I think we could probably make those bits do what we want, I suspect the
> result will actually be uglier than what you've done here and I wouldn't be
> surprised if there was a performance hit as the code to handle those cases was
> pretty dumb in its implementation.
> 
> What I find rather surprising is the location of your changes -- they feel
> incomplete.  For example, you fix the callee side of returns in
> expand_value_return, but I don't see analogous code for the caller side.
> 
> Similarly while you fix things for arguments in expand_expr_real_1, that's again
> just the callee side.  Don't you need to do something on the caller side too?
> 
> Jeff
>