public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jiufu Guo <guojiufu@linux.ibm.com>
To: gcc-patches@gcc.gnu.org
Cc: segher@kernel.crashing.org, dje.gcc@gmail.com, linkw@gcc.gnu.org,
	rguenther@suse.de, jeffreyalaw@gmail.com
Subject: Ping: [PATCH V4] Use reg mode to move sub blocks for parameters and returns
Date: Thu, 02 Mar 2023 17:21:54 +0800	[thread overview]
Message-ID: <7nmt4vld8t.fsf@ltcden2-lp1.aus.stglabs.ibm.com> (raw)
In-Reply-To: <20230104124439.191858-1-guojiufu@linux.ibm.com> (Jiufu Guo's message of "Wed, 4 Jan 2023 20:44:39 +0800")

Hi,

Ping this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609394.html

Thanks for any comments and suggestions!


BR,
Jeff (Jiufu)


Jiufu Guo <guojiufu@linux.ibm.com> writes:

> Hi,
>
> When assigning a parameter to a variable, or assigning a variable to
> return value with struct type, "block move" may be used to expand
> the assignment if the parameter/return is passing through registers and
> the parameter/return has BLK mode.
> For this kind of case, when moving the blocks, it would be better to use
> the nature mode of the registers.
> This would raise more opportunities for other optimization passes(cse,
> dse, xprop).
>
> As the example code (like code in PR65421):
>
> typedef struct SA {double a[3];} A;
> A ret_arg_pt (A *a) {return *a;} // on ppc64le, expect only 3 lfd(s)
> A ret_arg (A a) {return a;} // just empty fun body
> void st_arg (A a, A *p) {*p = a;} //only 3 stfd(s)
>
> This patches check the "from" and "to" of an assignment in
> "expand_assignment", if it is about param/ret which may passing via
> register, then use the register nature mode to move sub-blocks for
> the assignning.
>
> This patch may be still useful even if we change the behavior of
> parameter setup or adopt SRA-like code in expender.
>
> Comparing with previous version:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608081.html
> This patch update the code slightly and merged/added test cases.
> And I checked the cases with large struct or non-homogeneous struct
> to confirm it does not degrade the code.
>
> Bootstrap and regtest pass on ppc64{,le} and x86_64.
> Is this ok for trunk?
>
> BR,
> Jeff (Jiufu)
>
> 	PR target/65421
>
> gcc/ChangeLog:
>
> 	* cfgexpand.cc (expand_used_vars): Update to mark DECL_USEDBY_RETURN_P
> 	for returns.
> 	* expr.cc (move_sub_blocks): New function.
> 	(expand_assignment): Update to call move_sub_blocks for returns or
> 	parameters.
> 	* function.cc (assign_parm_setup_block): Update to mark
> 	DECL_REGS_TO_STACK_P for parameter.
> 	* tree-core.h (struct tree_decl_common): Add comment.
> 	* tree.h (DECL_USEDBY_RETURN_P): New define.
> 	(DECL_REGS_TO_STACK_P): New define.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/powerpc/pr65421-1.c: New test.
> 	* gcc.target/powerpc/pr65421.c: New test.
>
> ---
>  gcc/cfgexpand.cc                             | 14 ++++
>  gcc/expr.cc                                  | 77 ++++++++++++++++++++
>  gcc/function.cc                              |  3 +
>  gcc/tree-core.h                              |  4 +-
>  gcc/tree.h                                   |  9 +++
>  gcc/testsuite/gcc.target/powerpc/pr65421-1.c |  6 ++
>  gcc/testsuite/gcc.target/powerpc/pr65421.c   | 33 +++++++++
>  7 files changed, 145 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr65421-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr65421.c
>
> diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
> index dd29ffffc03..09b8ec64cea 100644
> --- a/gcc/cfgexpand.cc
> +++ b/gcc/cfgexpand.cc
> @@ -2158,6 +2158,20 @@ expand_used_vars (bitmap forced_stack_vars)
>      frame_phase = off ? align - off : 0;
>    }
>  
> +  /* Collect VARs on returns.  */
> +  if (DECL_RESULT (current_function_decl))
> +    {
> +      edge_iterator ei;
> +      edge e;
> +      FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
> +	if (greturn *ret = safe_dyn_cast<greturn *> (last_stmt (e->src)))
> +	  {
> +	    tree val = gimple_return_retval (ret);
> +	    if (val && VAR_P (val))
> +	      DECL_USEDBY_RETURN_P (val) = 1;
> +	  }
> +    }
> +
>    /* Set TREE_USED on all variables in the local_decls.  */
>    FOR_EACH_LOCAL_DECL (cfun, i, var)
>      TREE_USED (var) = 1;
> diff --git a/gcc/expr.cc b/gcc/expr.cc
> index d9407432ea5..afcec6f3c10 100644
> --- a/gcc/expr.cc
> +++ b/gcc/expr.cc
> @@ -5559,6 +5559,51 @@ mem_ref_refers_to_non_mem_p (tree ref)
>    return non_mem_decl_p (base);
>  }
>  
> +/* Sub routine of expand_assignment, invoked when assigning from a
> +   parameter or assigning to a return val on struct type which may
> +   be passed through registers.  The mode of register is used to
> +   move the content for the assignment.
> +
> +   This routine generates code for expression FROM which is BLKmode,
> +   and move the generated content to TO_RTX by su-blocks in SUB_MODE.  */
> +
> +static void
> +move_sub_blocks (rtx to_rtx, tree from, machine_mode sub_mode, bool nontemporal)
> +{
> +  gcc_assert (MEM_P (to_rtx));
> +
> +  HOST_WIDE_INT size = MEM_SIZE (to_rtx).to_constant ();
> +  HOST_WIDE_INT sub_size = GET_MODE_SIZE (sub_mode).to_constant ();
> +  HOST_WIDE_INT len = size / sub_size;
> +
> +  /* It would be not profitable to move through sub-modes, if the size does
> +     not meet register mode.  */
> +  if ((size % sub_size) != 0)
> +    {
> +      push_temp_slots ();
> +      rtx result = store_expr (from, to_rtx, 0, nontemporal, false);
> +      preserve_temp_slots (result);
> +      pop_temp_slots ();
> +      return;
> +    }
> +
> +  push_temp_slots ();
> +
> +  rtx from_rtx = expand_expr (from, NULL_RTX, GET_MODE (to_rtx), EXPAND_NORMAL);
> +  for (int i = 0; i < len; i++)
> +    {
> +      rtx temp = gen_reg_rtx (sub_mode);
> +      rtx src = adjust_address (from_rtx, sub_mode, sub_size * i);
> +      rtx dest = adjust_address (to_rtx, sub_mode, sub_size * i);
> +      emit_move_insn (temp, src);
> +      emit_move_insn (dest, temp);
> +    }
> +
> +  preserve_temp_slots (to_rtx);
> +  pop_temp_slots ();
> +  return;
> +}
> +
>  /* Expand an assignment that stores the value of FROM into TO.  If NONTEMPORAL
>     is true, try generating a nontemporal store.  */
>  
> @@ -6045,6 +6090,38 @@ expand_assignment (tree to, tree from, bool nontemporal)
>        return;
>      }
>  
> +  /* If it is assigning from a struct param which may be passed via registers,
> +     It would be better to use the register's mode to move sub-blocks for the
> +     assignment.  */
> +  if (TREE_CODE (from) == PARM_DECL && mode == BLKmode
> +      && DECL_REGS_TO_STACK_P (from))
> +    {
> +      rtx parm = DECL_INCOMING_RTL (from);
> +      gcc_assert (REG_P (parm) || GET_CODE (parm) == PARALLEL);
> +
> +      machine_mode sub_mode;
> +      if (REG_P (parm))
> +	sub_mode = word_mode;
> +      else
> +	sub_mode = GET_MODE (XEXP (XVECEXP (parm, 0, 0), 0));
> +
> +      move_sub_blocks (to_rtx, from, sub_mode, nontemporal);
> +      return;
> +    }
> +
> +  /* If it is assigning to a struct var which will be returned, and the
> +     function is returning via registers, it would be better to use the
> +     register's mode to move sub-blocks for the assignment.  */
> +  if (VAR_P (to) && DECL_USEDBY_RETURN_P (to) && mode == BLKmode
> +      && TREE_CODE (from) != CONSTRUCTOR
> +      && GET_CODE (DECL_RTL (DECL_RESULT (current_function_decl))) == PARALLEL)
> +    {
> +      rtx ret = DECL_RTL (DECL_RESULT (current_function_decl));
> +      machine_mode sub_mode = GET_MODE (XEXP (XVECEXP (ret, 0, 0), 0));
> +      move_sub_blocks (to_rtx, from, sub_mode, nontemporal);
> +      return;
> +    }
> +
>    /* Compute FROM and store the value in the rtx we got.  */
>  
>    push_temp_slots ();
> diff --git a/gcc/function.cc b/gcc/function.cc
> index dc333c27e92..0ff89d89365 100644
> --- a/gcc/function.cc
> +++ b/gcc/function.cc
> @@ -2991,6 +2991,9 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
>  
>        mem = validize_mem (copy_rtx (stack_parm));
>  
> +      if (MEM_P (mem))
> +	DECL_REGS_TO_STACK_P (parm) = 1;
> +
>        /* Handle values in multiple non-contiguous locations.  */
>        if (GET_CODE (entry_parm) == PARALLEL && !MEM_P (mem))
>  	emit_group_store (mem, entry_parm, data->arg.type, size);
> diff --git a/gcc/tree-core.h b/gcc/tree-core.h
> index e146b133dbd..c76d08bd109 100644
> --- a/gcc/tree-core.h
> +++ b/gcc/tree-core.h
> @@ -1808,7 +1808,9 @@ struct GTY(()) tree_decl_common {
>       In VAR_DECL, PARM_DECL and RESULT_DECL, this is
>       DECL_HAS_VALUE_EXPR_P.  */
>    unsigned decl_flag_2 : 1;
> -  /* In FIELD_DECL, this is DECL_PADDING_P.  */
> +  /* In FIELD_DECL, this is DECL_PADDING_P
> +     In VAR_DECL, this is DECL_USEDBY_RETURN_P
> +     In PARM_DECL, this is DECL_REGS_TO_STACK_P.  */
>    unsigned decl_flag_3 : 1;
>    /* Logically, these two would go in a theoretical base shared by var and
>       parm decl. */
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 7e26e726bc5..3f732e3c00c 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -3009,6 +3009,15 @@ extern void decl_value_expr_insert (tree, tree);
>  #define DECL_PADDING_P(NODE) \
>    (FIELD_DECL_CHECK (NODE)->decl_common.decl_flag_3)
>  
> +/* Used in a VAR_DECL to indicate that it is used by a return stmt.  */
> +#define DECL_USEDBY_RETURN_P(NODE) \
> +  (VAR_DECL_CHECK (NODE)->decl_common.decl_flag_3)
> +
> +/* Used in a PARM_DECL to indicate that it is struct parameter passed
> +   by registers totally and stored to stack during setup.  */
> +#define DECL_REGS_TO_STACK_P(NODE) \
> +  (PARM_DECL_CHECK (NODE)->decl_common.decl_flag_3)
> +
>  /* Used in a FIELD_DECL to indicate whether this field is not a flexible
>     array member. This is only valid for the last array type field of a
>     structure.  */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr65421-1.c b/gcc/testsuite/gcc.target/powerpc/pr65421-1.c
> new file mode 100644
> index 00000000000..4e1f87f7939
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr65421-1.c
> @@ -0,0 +1,6 @@
> +/* PR target/65421 */
> +/* { dg-options "-O2" } */
> +
> +typedef struct LARGE {double a[4]; int arr[32];} LARGE;
> +LARGE foo (LARGE a){return a;}
> +/* { dg-final { scan-assembler-times {\mmemcpy\M} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr65421.c b/gcc/testsuite/gcc.target/powerpc/pr65421.c
> new file mode 100644
> index 00000000000..fd5ad542c64
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr65421.c
> @@ -0,0 +1,33 @@
> +/* PR target/65421 */
> +/* { dg-options "-O2" } */
> +/* { dg-require-effective-target powerpc_elfv2 } */
> +/* { dg-require-effective-target has_arch_ppc64 } */
> +
> +typedef struct FLOATS
> +{
> +  double a[3];
> +} FLOATS;
> +
> +/* 3 lfd */
> +FLOATS ret_arg_pt (FLOATS *a){return *a;}
> +/* { dg-final { scan-assembler-times {\mlfd\M} 3 } } */
> +
> +/* 3 stfd */
> +void st_arg (FLOATS a, FLOATS *p) {*p = a;}
> +/* { dg-final { scan-assembler-times {\mstfd\M} 3 } } */
> +
> +/* blr */
> +FLOATS ret_arg (FLOATS a) {return a;}
> +
> +typedef struct MIX
> +{
> +  double a[2];
> +  long l;
> +} MIX;
> +
> +/* std 3 param regs to return slot */
> +MIX ret_arg1 (MIX a) {return a;}
> +/* { dg-final { scan-assembler-times {\mstd\M} 3 } } */
> +
> +/* count insns */
> +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 13 } } */

      reply	other threads:[~2023-03-02  9:22 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-04 12:44 Jiufu Guo
2023-03-02  9:21 ` Jiufu Guo [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7nmt4vld8t.fsf@ltcden2-lp1.aus.stglabs.ibm.com \
    --to=guojiufu@linux.ibm.com \
    --cc=dje.gcc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jeffreyalaw@gmail.com \
    --cc=linkw@gcc.gnu.org \
    --cc=rguenther@suse.de \
    --cc=segher@kernel.crashing.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).