From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=rFZQ=4V=linux.ibm.com=guojiufu@sourceware.org>
Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5])
	by sourceware.org (Postfix) with ESMTPS id 1AB053858D1E;
	Fri, 23 Dec 2022 12:54:45 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1AB053858D1E
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com
Received: from pps.filterd (m0098417.ppops.net [127.0.0.1])
	by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2BNCDoba017861;
	Fri, 23 Dec 2022 12:54:44 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject
 : references : date : in-reply-to : message-id : content-type :
 mime-version; s=pp1; bh=S2vK/qiJ2KtoClOs8igDr6jOteZPUNzW6V+zweBq4ho=;
 b=SeWpnJEAsLEarYqWpQSZs60kTkDLTCRCCvRpOnK1krr6sHvR359ZiWn7axk+C6Q2N+hf
 w10rNdfCDReHR2H/Y0bWm1RcDb6LwY3n3y9zRTVew8kSZ0ClYA+Lj5Lpi8uLp3U6RDiF
 /k2Uh5OLHKohxX7ufw/Ty/LY7K3ogMpPc8Lg2jYiX4jAo+XSMMVTVtD3Hk8NQsxyNC6t
 fiVkLiG4t8KMcyz0UlFOl1khfrNMv4Io9P9XMBysHF0/7m36PWXA5fnh/Ul+raAIhHO0
 WUbszxe3eQtTeCddyHxvvh1eSpmaRdA44s+EOTDAYE8t0+l0/ltD7PA9dRyw4ytPkgTB aw== 
Received: from pps.reinject (localhost [127.0.0.1])
	by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3mnc5dgvcm-1
	(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
	Fri, 23 Dec 2022 12:54:44 +0000
Received: from m0098417.ppops.net (m0098417.ppops.net [127.0.0.1])
	by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 2BNCFU6T025169;
	Fri, 23 Dec 2022 12:54:43 GMT
Received: from ppma05wdc.us.ibm.com (1b.90.2fa9.ip4.static.sl-reverse.com [169.47.144.27])
	by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3mnc5dgvc3-2
	(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
	Fri, 23 Dec 2022 12:54:43 +0000
Received: from pps.filterd (ppma05wdc.us.ibm.com [127.0.0.1])
	by ppma05wdc.us.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 2BN9bG4V007425;
	Fri, 23 Dec 2022 12:36:41 GMT
Received: from smtprelay02.dal12v.mail.ibm.com ([9.208.130.97])
	by ppma05wdc.us.ibm.com (PPS) with ESMTPS id 3mh6yy7fpg-1
	(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
	Fri, 23 Dec 2022 12:36:40 +0000
Received: from smtpav05.wdc07v.mail.ibm.com (smtpav05.wdc07v.mail.ibm.com [10.39.53.232])
	by smtprelay02.dal12v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2BNCaeE844958380
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
	Fri, 23 Dec 2022 12:36:40 GMT
Received: from smtpav05.wdc07v.mail.ibm.com (unknown [127.0.0.1])
	by IMSVA (Postfix) with ESMTP id D8DB458043;
	Fri, 23 Dec 2022 12:36:39 +0000 (GMT)
Received: from smtpav05.wdc07v.mail.ibm.com (unknown [127.0.0.1])
	by IMSVA (Postfix) with ESMTP id 4A51558059;
	Fri, 23 Dec 2022 12:36:39 +0000 (GMT)
Received: from pike (unknown [9.5.12.127])
	by smtpav05.wdc07v.mail.ibm.com (Postfix) with ESMTPS;
	Fri, 23 Dec 2022 12:36:39 +0000 (GMT)
From: Jiufu Guo <guojiufu@linux.ibm.com>
To: Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Richard Biener <rguenther@suse.de>, segher@kernel.crashing.org,
        dje.gcc@gmail.com, linkw@gcc.gnu.org, jeffreyalaw@gmail.com
Subject: Re: [PATCH] loading float member of parameter stored via int registers
References: <20221221062736.78036-1-guojiufu@linux.ibm.com>
	<nycvar.YFH.7.77.849.2212210718030.5956@jbgna.fhfr.qr>
	<58beeb5dd65a10b7480f73462da904a4@linux.ibm.com>
	<nycvar.YFH.7.77.849.2212220748510.5956@jbgna.fhfr.qr>
	<7ebknvhkun.fsf@pike.rch.stglabs.ibm.com>
Date: Fri, 23 Dec 2022 20:36:36 +0800
In-Reply-To: <7ebknvhkun.fsf@pike.rch.stglabs.ibm.com> (Jiufu Guo via
	Gcc-patches's message of "Thu, 22 Dec 2022 17:02:40 +0800")
Message-ID: <7ev8m2fga3.fsf@pike.rch.stglabs.ibm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)
Content-Type: text/plain
X-TM-AS-GCONF: 00
X-Proofpoint-GUID: ZZYwLfOqij627T9e-kYdAECokY-DHWpW
X-Proofpoint-ORIG-GUID: sIb_o3DTwkZsM6p73gk42InNLRYN3e6H
X-Proofpoint-UnRewURL: 0 URL was un-rewritten
MIME-Version: 1.0
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.205,Aquarius:18.0.923,Hydra:6.0.545,FMLib:17.11.122.1
 definitions=2022-12-23_05,2022-12-22_03,2022-06-22_01
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015
 lowpriorityscore=0 suspectscore=0 adultscore=0 mlxlogscore=999 mlxscore=0
 phishscore=0 spamscore=0 bulkscore=0 impostorscore=0 priorityscore=1501
 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1
 engine=8.12.0-2212070000 definitions=main-2212230107
X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

Hi,

Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org> writes:

> Hi,
>
> Richard Biener <rguenther@suse.de> writes:
>
>> On Thu, 22 Dec 2022, guojiufu wrote:
>>
>>> Hi,
>>> 
>>> On 2022-12-21 15:30, Richard Biener wrote:
>>> > On Wed, 21 Dec 2022, Jiufu Guo wrote:
>>> > 
>>> >> Hi,
>>> >> 
>>> >> This patch fixes an issue with parameter access when the
>>> >> parameter is of struct type and passed through integer registers, and
>>> >> a floating-point member is accessed. For example:
>>> >> 
>>> >> typedef struct DF {double a[4]; long l; } DF;
>>> >> double foo_df (DF arg){return arg.a[3];}
>>> >> 
>>> >> On ppc64le, with trunk gcc, "std 6,-24(1) ; lfd 1,-24(1)" is
>>> >> generated.  While instruction "mtvsrd 1, 6" would be enough for
>>> >> this case.
>>> > 
>>> > So why do we end up spilling for PPC?
>>> 
>>> Good question! According to the GCC source code (in function.cc/expr.cc),
>>> this is common behavior: "word_mode" is used to store the parameter to
>>> the stack, and the field's mode (e.g. float mode) is used to load it back.
>>> But after several attempts, I failed to construct such cases on many
>>> platforms.  So I converted the fix to a target hook and implemented the
>>> rs6000 part first.
>>> 
>>> > 
>>> > struct X { int i; float f; };
>>> > 
>>> > float foo (struct X x)
>>> > {
>>> >   return x.f;
>>> > }
>>> > 
>>> > does pass the structure in $RDI on x86_64 and we manage (with
>>> > optimization, with -O0 we spill) to generate
>>> > 
>>> >         shrq    $32, %rdi
>>> >         movd    %edi, %xmm0
>>> > 
>>> > and RTL expansion generates
>>> > 
>>> > (note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
>>> > (insn 2 4 3 2 (set (reg/v:DI 83 [ x ])
>>> >         (reg:DI 5 di [ x ])) "t.c":4:1 -1
>>> >      (nil))
>>> > (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
>>> > (insn 6 3 7 2 (parallel [
>>> >             (set (reg:DI 85)
>>> >                 (ashiftrt:DI (reg/v:DI 83 [ x ])
>>> >                     (const_int 32 [0x20])))
>>> >             (clobber (reg:CC 17 flags))
>>> >         ]) "t.c":5:11 -1
>>> >      (nil))
>>> > (insn 7 6 8 2 (set (reg:SI 86)
>>> >         (subreg:SI (reg:DI 85) 0)) "t.c":5:11 -1
>>> >      (nil))
>>> > 
>>> > I would imagine that for the ppc case we only see the subreg here
>>> > which should be even easier to optimize.
>>> > 
>>> > So how's this not fixable by providing proper patterns / subreg
>>> > capabilities?  Looking a bit at the RTL we have the issue might
>>> > be that nothing seems to handle CSE of
>>> > 
>>> 
>>> This case is also related to 'parameter on struct', PR89310 is
>>> just for this case. On trunk, it is fixed.
>>> One difference: the parameter is in DImode, and passed via an
>>> integer register for "{int i; float f;}".
>>> But for "{double a[4]; long l;}", the parameter is in BLKmode,
>>> and stored to stack during the argument setup.
>>
>> OK, so this would be another case for "heuristics" to use
>> sth different than word_mode for storing, but of course
>> the arguments are in integer registers and using different
>> modes can for example prohibit store-multiple instruction use.
>>
>> As I said in the related thread an RTL expansion time "SRA"
>> with the incoming argument assignment in mind could make
>> more optimal decisions for these kind of special-cases.
>
> Thanks a lot for your comments!
>
> Yeap! Using SRA-like analysis during expansion for parameters
> and returns (and maybe also some field accesses) would be a
> generic improvement for this kind of issue (PR101926 collected
> a lot of them).
> However, we may still need some work for various ABIs and different
> targets, to analyze where the 'struct field' comes from
> (int/float/vector/.. registers, or stack) and how the struct
> needs to be handled (kept in a pseudo or stored on the stack).
> This may imply a large amount of changes to the param_setup code.
>
> To reduce risk, I'm just drafting straightforward patches for
> special cases currently, like:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608081.html
> and this patch.
>
>>
>>> > (note 8 0 5 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
>>> > (insn 5 8 7 2 (set (mem/c:DI (plus:DI (reg/f:DI 110 sfp)
>>> >                 (const_int 56 [0x38])) [2 arg+24 S8 A64])
>>> >         (reg:DI 6 6)) "t.c":2:23 679 {*movdi_internal64}
>>> >      (expr_list:REG_DEAD (reg:DI 6 6)
>>> >         (nil)))
>>> > (note 7 5 10 2 NOTE_INSN_FUNCTION_BEG)
>>> > (note 10 7 15 2 NOTE_INSN_DELETED)
>>> > (insn 15 10 16 2 (set (reg/i:DF 33 1)
>>> >         (mem/c:DF (plus:DI (reg/f:DI 110 sfp)
>>> >                 (const_int 56 [0x38])) [1 arg.a[3]+0 S8 A64])) "t.c":2:40
>>> > 576 {*movdf_hardfloat64}
>>> >      (nil))
>>> > 
>>> > Possibly because the store and load happen in a different mode?  Can
>>> > you see why CSE doesn't handle this (producing a subreg)?  On
>>> 
>>> Yes, exactly! For "{double a[4]; long l;}", the store and load
>>> use different modes, so CSE does not optimize it.  This
>>> patch makes the store and load use the same mode (DImode), and then
>>> leverages CSE to handle it.
>>
>> So can we instead fix CSE to consider replacing insn 15 above with
>>
>>  (insn 15 (set (reg/i:DF 33 1)
>>                (subreg:DF (reg/f:DI 6 6)))
>>
>
> Thanks for your suggestion! I will check whether I can draft
> a quick fix.

I just hacked up the patch below.
There seem to be some limitations, e.g.: 1. "subreg:DF on a DI register"
may not work well on pseudos, and 2. to convert the high part of a DI to
SF, a shift/rotate is needed, so we would need to emit a shift insn
in cse. I may need to update this patch.
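
To illustrate point 2, here is a minimal C sketch of the bit-level
operations involved. It is not GCC code: the helper names are made up
for illustration, and it only models the value semantics (a DI word
reinterpreted as DF, or one 32-bit half of it reinterpreted as SF),
not how a target keeps SF values in its registers. Which half of the
word holds which array element depends on endianness.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: bit-cast between float and its 32-bit image.
   memcpy is used to avoid strict-aliasing problems.  */
static uint32_t float_bits (float f)
{
  uint32_t b;
  memcpy (&b, &f, sizeof b);
  return b;
}

static float bits_float (uint32_t b)
{
  float f;
  memcpy (&f, &b, sizeof f);
  return f;
}

/* DI -> DF: the whole word is reinterpreted, no shift is needed.  */
static double word_as_double (uint64_t word)
{
  double d;
  memcpy (&d, &word, sizeof d);
  return d;
}

/* DI -> SF: the low half is a plain lowpart; the high half needs a
   right shift by 32 before taking the lowpart.  */
static float word_half_as_float (uint64_t word, int high_part)
{
  if (high_part)
    word >>= 32;
  return bits_float ((uint32_t) word);
}
```

For a word built as ((uint64_t) float_bits (2.0f) << 32) | float_bits (1.0f),
word_half_as_float returns 1.0f for the low half and 2.0f for the high
half; the latter is the case still marked TODO in the patch below.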

Thanks for the comments and suggestions again!
And happy holidays!


BR,
Jeff (Jiufu)

diff --git a/gcc/cse.cc b/gcc/cse.cc
index b13afd4ba72..77bcbf75d8f 100644
--- a/gcc/cse.cc
+++ b/gcc/cse.cc
@@ -5011,6 +5011,32 @@ cse_insn (rtx_insn *insn)
 	  src_related_is_const_anchor = src_related != NULL_RTX;
 	}
 
+      /* Try to optimize "[rm:DI]=rd:DI ; rf:DF=[rm:DI]" to "rf:DF=rd:DI#0" */
+      if (!src_related && MEM_P (src) && (mode == DFmode || mode == SFmode)
+	  && known_le (GET_MODE_SIZE (mode), GET_MODE_SIZE (word_mode)))
+	{
+	  machine_mode src_mode = GET_MODE (src);
+	  PUT_MODE (src, word_mode);
+	  struct table_elt *mem_elt = lookup (src, sets[i].src_hash, word_mode);
+
+	  if (mem_elt)
+	    {
+	      for (mem_elt = mem_elt->first_same_value; mem_elt;
+		   mem_elt = mem_elt->next_same_value)
+		if (REG_P (mem_elt->exp))
+		  {
+		    poly_uint64 low_off
+		      = subreg_lowpart_offset (SImode, word_mode);
+		    if (known_eq (low_off, 0U))
+		      src_related = gen_lowpart (mode, mem_elt->exp);
+		    //TODO: highpart DI->SF
+		    break;
+		  }
+	    }
+
+	  PUT_MODE (src, src_mode);
+	}
+
       /* Try to re-materialize a vec_dup with an existing constant.   */
       rtx src_elt;
       if ((!src_eqv_here || CONSTANT_P (src_eqv_here))
--------


>
> BR,
> Jeff (Jiufu)
>
>> ?  One peculiarity is that this introduces a hardreg use in this
>> special case.
>>
>> Richard.
>>
>>> > the GIMPLE side we'd happily do that (but we don't see the argument
>>> > setup).
>>> 
>>> Thanks for your comments!
>>> 
>>> 
>>> BR,
>>> Jeff (Jiufu)
>>> 
>>> > 
>>> > Thanks,
>>> > Richard.
>>> > 
>>> >> This patch updates the behavior when loading floating members of a
>>> >> parameter: if that floating member is stored via an integer register,
>>> >> then it is loaded in integer mode first and converted to floating
>>> >> mode.
>>> >> 
>>> >> I also thought of another method: before storing the register to the
>>> >> stack, convert it to float mode first. However, there are some cases
>>> >> that may still prefer to keep an integer register store.
>>> >> 
>>> >> Bootstrap and regtest passes on ppc64{,le}.
>>> >> I would ask for help to review for comments and if this patch is
>>> >> acceptable for the trunk.
>>> >> 
>>> >> 
>>> >> BR,
>>> >> Jeff (Jiufu)
>>> >> 
>>> >>  PR target/108073
>>> >> 
>>> >> gcc/ChangeLog:
>>> >> 
>>> >>  * config/rs6000/rs6000.cc (TARGET_LOADING_INT_CONVERT_TO_FLOAT): New
>>> >>  macro definition.
>>> >>  (rs6000_loading_int_convert_to_float): New hook implement.
>>> >>  * doc/tm.texi: Regenerated.
>>> >>  * doc/tm.texi.in (loading_int_convert_to_float): New hook.
>>> >>  * expr.cc (expand_expr_real_1): Updated to use the new hook.
>>> >>  * target.def (loading_int_convert_to_float): New hook.
>>> >> 
>>> >> gcc/testsuite/ChangeLog:
>>> >> 
>>> >>  * g++.target/powerpc/pr102024.C: Update.
>>> >>  * gcc.target/powerpc/pr108073.c: New test.
>>> >> 
>>> >> ---
>>> >> gcc/config/rs6000/rs6000.cc                 | 70 +++++++++++++++++++++
>>> >>  gcc/doc/tm.texi                             |  6 ++
>>> >>  gcc/doc/tm.texi.in                          |  2 +
>>> >>  gcc/expr.cc                                 | 15 +++++
>>> >>  gcc/target.def                              | 11 ++++
>>> >>  gcc/testsuite/g++.target/powerpc/pr102024.C |  2 +-
>>> >>  gcc/testsuite/gcc.target/powerpc/pr108073.c | 24 +++++++
>>> >>  7 files changed, 129 insertions(+), 1 deletion(-)
>>> >>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108073.c
>>> >> 
>>> >> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
>>> >> index b3a609f3aa3..af676eea276 100644
>>> >> --- a/gcc/config/rs6000/rs6000.cc
>>> >> +++ b/gcc/config/rs6000/rs6000.cc
>>> >> @@ -1559,6 +1559,9 @@ static const struct attribute_spec 
>>> >> rs6000_attribute_table[] =
>>> >>  #undef TARGET_INVALID_ARG_FOR_UNPROTOTYPED_FN
>>> >>  #define TARGET_INVALID_ARG_FOR_UNPROTOTYPED_FN 
>>> >> invalid_arg_for_unprototyped_fn
>>> >> 
>>> >> +#undef TARGET_LOADING_INT_CONVERT_TO_FLOAT
>>> >> +#define TARGET_LOADING_INT_CONVERT_TO_FLOAT
>>> >> rs6000_loading_int_convert_to_float
>>> >> +
>>> >>  #undef TARGET_MD_ASM_ADJUST
>>> >>  #define TARGET_MD_ASM_ADJUST rs6000_md_asm_adjust
>>> >> 
>>> >> @@ -24018,6 +24021,73 @@ invalid_arg_for_unprototyped_fn (const_tree 
>>> >> typelist, const_tree funcdecl, const
>>> >>     : NULL;
>>> >>  }
>>> >> 
>>> >> +/* Implement the TARGET_LOADING_INT_CONVERT_TO_FLOAT. */
>>> >> +static rtx
>>> >> +rs6000_loading_int_convert_to_float (machine_mode mode, rtx source, rtx
>>> >> base)
>>> >> +{
>>> >> +  rtx src_base = XEXP (source, 0);
>>> >> +  poly_uint64 offset = MEM_OFFSET (source);
>>> >> +
>>> >> +  if (GET_CODE (src_base) == PLUS && CONSTANT_P (XEXP (src_base, 1)))
>>> >> +    {
>>> >> +      offset += INTVAL (XEXP (src_base, 1));
>>> >> +      src_base = XEXP (src_base, 0);
>>> >> +    }
>>> >> +
>>> >> +  if (!rtx_equal_p (XEXP (base, 0), src_base))
>>> >> +    return NULL_RTX;
>>> >> +
>>> >> +  rtx temp_reg = gen_reg_rtx (word_mode);
>>> >> +  rtx temp_mem = copy_rtx (source);
>>> >> +  PUT_MODE (temp_mem, word_mode);
>>> >> +
>>> >> +  /* DI->DF */
>>> >> +  if (word_mode == DImode && mode == DFmode)
>>> >> +    {
>>> >> +      if (multiple_p (offset, GET_MODE_SIZE (word_mode)))
>>> >> +	{
>>> >> +	  emit_move_insn (temp_reg, temp_mem);
>>> >> +	  rtx float_subreg = simplify_gen_subreg (mode, temp_reg, word_mode,
>>> >> 0);
>>> >> +	  rtx target_reg = gen_reg_rtx (mode);
>>> >> +	  emit_move_insn (target_reg, float_subreg);
>>> >> +	  return target_reg;
>>> >> +	}
>>> >> +      return NULL_RTX;
>>> >> +    }
>>> >> +
>>> >> +  /* Sub DI#->SF */
>>> >> +  if (word_mode == DImode && mode == SFmode)
>>> >> +    {
>>> >> +      poly_uint64 byte_off = 0;
>>> >> +      if (multiple_p (offset, GET_MODE_SIZE (word_mode)))
>>> >> +	byte_off = 0;
>>> >> +      else if (multiple_p (offset - GET_MODE_SIZE (mode),
>>> >> +			   GET_MODE_SIZE (word_mode)))
>>> >> +	byte_off = GET_MODE_SIZE (mode);
>>> >> +      else
>>> >> +	return NULL_RTX;
>>> >> +
>>> >> +      temp_mem = adjust_address (temp_mem, word_mode, -byte_off);
>>> >> +      emit_move_insn (temp_reg, temp_mem);
>>> >> +
>>> >> +      /* little endian only? */
>>> >> +      poly_uint64 high_off = subreg_highpart_offset (SImode, word_mode);
>>> >> +      if (known_eq (byte_off, high_off))
>>> >> +	{
>>> >> +	  temp_reg = expand_shift (RSHIFT_EXPR, word_mode, temp_reg,
>>> >> +				   GET_MODE_PRECISION (SImode), temp_reg, 0);
>>> >> +	}
>>> >> +      rtx subreg_si = gen_reg_rtx (SImode);
>>> >> +      emit_move_insn (subreg_si,  gen_lowpart (SImode, temp_reg));
>>> >> +      rtx float_subreg = simplify_gen_subreg (mode, subreg_si, SImode, 0);
>>> >> +      rtx target_reg = gen_reg_rtx (mode);
>>> >> +      emit_move_insn (target_reg, float_subreg);
>>> >> +      return target_reg;
>>> >> +    }
>>> >> +
>>> >> +  return NULL_RTX;
>>> >> +}
>>> >> +
>>> >>  /* For TARGET_SECURE_PLT 32-bit PIC code we can save PIC register
>>> >>     setup by using __stack_chk_fail_local hidden function instead of
>>> >>     calling __stack_chk_fail directly.  Otherwise it is better to call
>>> >> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
>>> >> index 8fe49c2ba3d..10f94af553d 100644
>>> >> --- a/gcc/doc/tm.texi
>>> >> +++ b/gcc/doc/tm.texi
>>> >> @@ -11933,6 +11933,12 @@ or when the back end is in a 
>>> >> partially-initialized state.
>>> >>  outside of any function scope.
>>> >>  @end deftypefn
>>> >> 
>>> >> +@deftypefn {Target Hook} rtx TARGET_LOADING_INT_CONVERT_TO_FLOAT
>>> >> (machine_mode @var{mode}, rtx @var{source}, rtx @var{base})
>>> >> +If it is profitable for the target to load an integer in word_mode from
>>> >> +@var{source} which is based on @var{base}, then convert to a floating
>>> >> +point value in @var{mode}.
>>> >> +@end deftypefn
>>> >> +
>>> >>  @defmac TARGET_OBJECT_SUFFIX
>>> >>  Define this macro to be a C string representing the suffix for object
>>> >>  files on your target machine.  If you do not define this macro, GCC 
>>> >> will
>>> >> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
>>> >> index 62c49ac46de..1ca6a671d86 100644
>>> >> --- a/gcc/doc/tm.texi.in
>>> >> +++ b/gcc/doc/tm.texi.in
>>> >> @@ -7756,6 +7756,8 @@ to by @var{ce_info}.
>>> >> 
>>> >>  @hook TARGET_SET_CURRENT_FUNCTION
>>> >> 
>>> >> +@hook TARGET_LOADING_INT_CONVERT_TO_FLOAT
>>> >> +
>>> >>  @defmac TARGET_OBJECT_SUFFIX
>>> >>  Define this macro to be a C string representing the suffix for object
>>> >>  files on your target machine.  If you do not define this macro, GCC 
>>> >> will
>>> >> diff --git a/gcc/expr.cc b/gcc/expr.cc
>>> >> index d9407432ea5..466079220e7 100644
>>> >> --- a/gcc/expr.cc
>>> >> +++ b/gcc/expr.cc
>>> >> @@ -11812,6 +11812,21 @@ expand_expr_real_1 (tree exp, rtx target, 
>>> >> machine_mode tmode,
>>> >>       && modifier != EXPAND_WRITE)
>>> >>     op0 = flip_storage_order (mode1, op0);
>>> >> 
>>> >> +	/* Accessing float field of struct parameter which passed via integer
>>> >> +	   registers.  */
>>> >> +	if (targetm.loading_int_convert_to_float && mode == mode1
>>> >> +	    && GET_MODE_CLASS (mode) == MODE_FLOAT
>>> >> +	    && TREE_CODE (tem) == PARM_DECL && DECL_INCOMING_RTL (tem)
>>> >> +	    && REG_P (DECL_INCOMING_RTL (tem))
>>> >> +	    && GET_MODE (DECL_INCOMING_RTL (tem)) == BLKmode && MEM_P (op0)
>>> >> +	    && MEM_OFFSET_KNOWN_P (op0))
>>> >> +	  {
>>> >> +	    rtx res = targetm.loading_int_convert_to_float (mode, op0,
>>> >> +							    DECL_RTL (tem));
>>> >> +	    if (res)
>>> >> +	      op0 = res;
>>> >> +	  }
>>> >> +
>>> >>   if (mode == mode1 || mode1 == BLKmode || mode1 == tmode
>>> >>       || modifier == EXPAND_CONST_ADDRESS
>>> >>       || modifier == EXPAND_INITIALIZER)
>>> >> diff --git a/gcc/target.def b/gcc/target.def
>>> >> index 082a7c62f34..837ce902489 100644
>>> >> --- a/gcc/target.def
>>> >> +++ b/gcc/target.def
>>> >> @@ -4491,6 +4491,17 @@ original and the returned modes should be 
>>> >> @code{MODE_INT}.",
>>> >>   (machine_mode mode),
>>> >>   default_preferred_doloop_mode)
>>> >> 
>>> >> +
>>> >> +/* Loading an integer value from memory first, then convert (bitcast)
>>> >> to\n\n
>>> >> +   floating point value, if target is able to support such behavior.  */
>>> >> +DEFHOOK
>>> >> +(loading_int_convert_to_float,
>>> >> +"If it is profitable for the target to load an integer in word_mode from\n\
>>> >> +@var{source} which is based on @var{base}, then convert to a floating\n\
>>> >> +point value in @var{mode}.",
>>> >> + rtx, (machine_mode mode, rtx source, rtx base),
>>> >> + NULL)
>>> >> +
>>> >>  /* Returns true for a legitimate combined insn.  */
>>> >>  DEFHOOK
>>> >>  (legitimate_combined_insn,
>>> >> diff --git a/gcc/testsuite/g++.target/powerpc/pr102024.C
>>> >> b/gcc/testsuite/g++.target/powerpc/pr102024.C
>>> >> index 769585052b5..c8995cae707 100644
>>> >> --- a/gcc/testsuite/g++.target/powerpc/pr102024.C
>>> >> +++ b/gcc/testsuite/g++.target/powerpc/pr102024.C
>>> >> @@ -5,7 +5,7 @@
>>> >> // Test that a zero-width bit field in an otherwise homogeneous aggregate
>>> >>  // generates a psabi warning and passes arguments in GPRs.
>>> >> 
>>> >> -// { dg-final { scan-assembler-times {\mstd\M} 4 } }
>>> >> +// { dg-final { scan-assembler-times {\mmtvsrd\M} 4 } }
>>> >> 
>>> >>  struct a_thing
>>> >>  {
>>> >> diff --git a/gcc/testsuite/gcc.target/powerpc/pr108073.c
>>> >> b/gcc/testsuite/gcc.target/powerpc/pr108073.c
>>> >> new file mode 100644
>>> >> index 00000000000..aa02de56405
>>> >> --- /dev/null
>>> >> +++ b/gcc/testsuite/gcc.target/powerpc/pr108073.c
>>> >> @@ -0,0 +1,24 @@
>>> >> +/* { dg-do run } */
>>> >> +/* { dg-options "-O2 -save-temps" } */
>>> >> +
>>> >> +typedef struct DF {double a[4]; long l; } DF;
>>> >> +typedef struct SF {float a[4];short l; } SF;
>>> >> +
>>> >> +/* Each of the functions below contains one mtvsrd.  */
>>> >> +/* { dg-final { scan-assembler-times {\mmtvsrd\M} 3 {target {
>>> >> has_arch_ppc64 && has_arch_pwr8 } } } } */
>>> >> +double __attribute__ ((noipa)) foo_df (DF arg){return arg.a[3];}
>>> >> +float  __attribute__ ((noipa)) foo_sf (SF arg){return arg.a[2];}
>>> >> +float  __attribute__ ((noipa)) foo_sf1 (SF arg){return arg.a[1];}
>>> >> +
>>> >> +double gd = 4.0;
>>> >> +float gf1 = 3.0f, gf2 = 2.0f;
>>> >> +DF gdf = {{1.0,2.0,3.0,4.0}, 1L};
>>> >> +SF gsf = {{1.0f,2.0f,3.0f,4.0f}, 1L};
>>> >> +
>>> >> +int main()
>>> >> +{
>>> >> +  if (!(foo_df (gdf) == gd && foo_sf (gsf) == gf1 && foo_sf1 (gsf) ==
>>> >> gf2))
>>> >> +    __builtin_abort ();
>>> >> +  return 0;
>>> >> +}
>>> >> +
>>> >> 
>>> 
>>>