public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jeff Law <jeffreyalaw@gmail.com>
To: Jiufu Guo <guojiufu@linux.ibm.com>
Cc: gcc-patches@gcc.gnu.org, segher@kernel.crashing.org,
	dje.gcc@gmail.com, linkw@gcc.gnu.org, rguenther@suse.de
Subject: Re: [PATCH V2] Use subscalar mode to move struct block for parameter
Date: Mon, 28 Nov 2022 10:00:57 -0700	[thread overview]
Message-ID: <0e432b50-d500-ca2f-0db5-9e9cf099f26c@gmail.com> (raw)
In-Reply-To: <7efseas7f1.fsf@pike.rch.stglabs.ibm.com>


On 11/22/22 19:58, Jiufu Guo wrote:
> Hi Jeff,
>
> Thanks a lot for your comments!
>
> Jeff Law <jeffreyalaw@gmail.com> writes:
>
>> On 11/20/22 20:07, Jiufu Guo wrote:
>>> Jiufu Guo <guojiufu@linux.ibm.com> writes:
>>>
>>>> Hi,
>>>>
>>>> As mentioned in the previous version patch:
>>>> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604646.html
>>>> The suboptimal code is generated for "assigning from parameter" or
>>>> "assigning to return value".
>>>> This patch enhances the assignment from parameters like the below
>>>> cases:
>>>> /////case1.c
>>>> typedef struct SA {double a[3];long l; } A;
>>>> A ret_arg (A a) {return a;}
>>>> void st_arg (A a, A *p) {*p = a;}
>>>>
>>>> ////case2.c
>>>> typedef struct SA {double a[3];} A;
>>>> A ret_arg (A a) {return a;}
>>>> void st_arg (A a, A *p) {*p = a;}
>>>>
>>>> For this patch, bootstrap and regtest pass on ppc64{,le}
>>>> and x86_64.
>>>> * Besides asking for help reviewing this patch, I would like to
>>>> consult comments about enhancing for "assigning to returns".
>>> I updated the patch to fix the issue for returns.  This patch
>>> adds a flag DECL_USEDBY_RETURN_P to indicate if a var is used
>>> by a return stmt.  This patch fix the issue in expand pass only,
>>> so, we would try to update the patch to avoid this flag.
>>>
>>> diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
>>> index dd29ffffc03..09b8ec64cea 100644
>>> --- a/gcc/cfgexpand.cc
>>> +++ b/gcc/cfgexpand.cc
>>> @@ -2158,6 +2158,20 @@ expand_used_vars (bitmap forced_stack_vars)
>>>        frame_phase = off ? align - off : 0;
>>>      }
>>>    +  /* Collect VARs on returns.  */
>>> +  if (DECL_RESULT (current_function_decl))
>>> +    {
>>> +      edge_iterator ei;
>>> +      edge e;
>>> +      FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
>>> +	if (greturn *ret = safe_dyn_cast<greturn *> (last_stmt (e->src)))
>>> +	  {
>>> +	    tree val = gimple_return_retval (ret);
>>> +	    if (val && VAR_P (val))
>>> +	      DECL_USEDBY_RETURN_P (val) = 1;
>>> +	  }
>>> +    }
>>> +
>>>      /* Set TREE_USED on all variables in the local_decls.  */
>>>      FOR_EACH_LOCAL_DECL (cfun, i, var)
>>>        TREE_USED (var) = 1;
>>> diff --git a/gcc/expr.cc b/gcc/expr.cc
>>> index d9407432ea5..20973649963 100644
>>> --- a/gcc/expr.cc
>>> +++ b/gcc/expr.cc
>>> @@ -6045,6 +6045,52 @@ expand_assignment (tree to, tree from, bool nontemporal)
>>>          return;
>>>        }
>>>    +  if ((TREE_CODE (from) == PARM_DECL && DECL_INCOMING_RTL (from)
>>> +       && TYPE_MODE (TREE_TYPE (from)) == BLKmode
>>> +       && (GET_CODE (DECL_INCOMING_RTL (from)) == PARALLEL
>>> +	   || REG_P (DECL_INCOMING_RTL (from))))
>>> +      || (VAR_P (to) && DECL_USEDBY_RETURN_P (to)
>>> +	  && TYPE_MODE (TREE_TYPE (to)) == BLKmode
>>> +	  && GET_CODE (DECL_RTL (DECL_RESULT (current_function_decl)))
>>> +	       == PARALLEL))
>>> +    {
>>> +      push_temp_slots ();
>>> +      rtx par_ret;
>>> +      machine_mode mode;
>>> +      par_ret = TREE_CODE (from) == PARM_DECL
>>> +		  ? DECL_INCOMING_RTL (from)
>>> +		  : DECL_RTL (DECL_RESULT (current_function_decl));
>>> +      mode = GET_CODE (par_ret) == PARALLEL
>>> +	       ? GET_MODE (XEXP (XVECEXP (par_ret, 0, 0), 0))
>>> +	       : word_mode;
>>> +      int mode_size = GET_MODE_SIZE (mode).to_constant ();
>>> +      int size = INTVAL (expr_size (from));
>>> +
>>> +      /* If/How the parameter using submode, it dependes on the size and
>>> +	 position of the parameter.  Here using heurisitic number.  */
>>> +      int hurstc_num = 8;
>> Where did this come from and what does it mean?
> Sorry for does not make this clear. We know that an aggregate arg may be
> on registers partially or totally, as assign_parm_adjust_entry_rtl.
> For an example, if a parameter with 12 words and the target/ABI only
> allow 8 gprs for arguments, then the parameter could use 8 regs at most
> and left part in stack.

Right, but the number of registers is target dependent, so I don't see 
how using "8" or any number of that matter is correct here.


>
>>
>> Note that BLKmode subword values passed in registers can be either
>> right or left justified.  I think you also need to worry about
>> endianness here.
> Since the subword is used to move block(read from source mem and then
> store to destination mem with register mode), and this would keep to use
> the same endianness on reg like move_block_from_reg. So, the patch does
> not check the endianness.

Hmm, I was clear enough here, particularly using the endianness term.  
Don't you need to query the ABI to ensure that you're not changing left 
vs right justification for a partially in register argument.     On the 
PA we have:

/* Specify padding for the last element of a block move between registers
    and memory.

    The 64-bit runtime specifies that objects need to be left justified
    (i.e., the normal justification for a big endian target).  The 32-bit
    runtime specifies right justification for objects smaller than 64 bits.
    We use a DImode register in the parallel for 5 to 7 byte structures
    so that there is only one element.  This allows the object to be
    correctly padded.  */
#define BLOCK_REG_PADDING(MODE, TYPE, FIRST) \
   targetm.calls.function_arg_padding ((MODE), (TYPE))


Jeff





  parent reply	other threads:[~2022-11-28 17:01 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-17  6:15 Jiufu Guo
2022-11-21  3:07 ` Jiufu Guo
2022-11-22 21:57   ` Jeff Law
2022-11-23  2:58     ` Jiufu Guo
2022-11-24  7:31       ` Richard Biener
2022-11-25  5:05         ` Jiufu Guo
2022-11-25 12:29           ` Jiufu Guo
2022-11-28 17:00       ` Jeff Law [this message]
2022-11-29  3:53         ` Jiufu Guo
2022-12-05 16:48           ` Jeff Law
2022-12-06  2:36             ` Jiufu Guo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0e432b50-d500-ca2f-0db5-9e9cf099f26c@gmail.com \
    --to=jeffreyalaw@gmail.com \
    --cc=dje.gcc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=guojiufu@linux.ibm.com \
    --cc=linkw@gcc.gnu.org \
    --cc=rguenther@suse.de \
    --cc=segher@kernel.crashing.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).