public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <rguenther@suse.de>
To: Jiufu Guo <guojiufu@linux.ibm.com>
Cc: gcc-patches@gcc.gnu.org, segher@kernel.crashing.org,
	dje.gcc@gmail.com,  linkw@gcc.gnu.org, jeffreyalaw@gmail.com
Subject: Re: [PATCH] Using sub-scalars mode to move struct block
Date: Mon, 14 Nov 2022 07:23:08 +0000 (UTC)	[thread overview]
Message-ID: <nycvar.YFH.7.77.849.2211140719340.3995@jbgna.fhfr.qr> (raw)
In-Reply-To: <7ey1sery63.fsf@pike.rch.stglabs.ibm.com>

On Mon, 14 Nov 2022, Jiufu Guo wrote:

> 
> Hi!
> Thanks so much for your review!
> 
> Richard Biener <rguenther@suse.de> writes:
> 
> > On Fri, 11 Nov 2022, Jiufu Guo wrote:
> >
> >> Hi,
> >> 
> >> When assigning a struct parameter to another variable, or loading a
> >> memory block to a struct var (especially for return value),
> >> Now, "block move" would be used during expand the assignment. And
> >> the "block move" may use a type/mode different from the mode which
> >> is accessing the var. e.g. on ppc64le, V2DI would be used to move
> >> the block of 16bytes.
> >> 
> >> And then, this "block move" would prevent optimization passes from
> >> leaping/crossing over the assignment. PR65421 reflects this issue.
> >> 
> >> As the example code in PR65421.
> >> 
> >> typedef struct { double a[4]; } A;
> >> A foo (const A *a) { return *a; }
> >> 
> >> On ppc64le, the below instructions are used for the "block move":
> >>   7: r122:V2DI=[r121:DI]
> >>   8: r124:V2DI=[r121:DI+r123:DI]
> >>   9: [r112:DI]=r122:V2DI
> >>   10: [r112:DI+0x10]=r124:V2DI
> >> 
> >> For this issue, a few comments/suggestions are mentioned via RFC:
> >> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604646.html
> >> I drafted a patch which is updating the behavior of block_move for
> >> struct type. This patch is simple to work with, a few ideas in the
> >> comments are not put into this patch. I would submit this
> >> patch first.
> >> 
> >> The idea is trying to use sub-modes(scalar) for the "block move".
> >> And the sub-modes would align with the access patterns of the
> >> struct members and usages on parameter/return value.
> >> The major benefits of this change would be raising more 
> >> opportunities for other optimization passes(cse/dse/xprop).
> >> 
> >> The suitable mode would be target specified and relates to ABI,
> >> this patch introduces a target hook. And in this patch, the hook
> >> is implemented on rs6000.
> >> 
> >> In this patch, the hook would be just using heuristic modes for all
> >> struct block moving. And the hook would not check if the "block move"
> >> is about parameters or return value or other uses.
> >> 
> >> For the rs6000 implementation of this hook, it is able to use
> >> DF/DI/TD/.. modes for the struct block movement. The sub-modes
> >> would be the same as the mode when the struct type is on parameter or
> >> return value.
> >> 
> >> Bootstrapped and regtested on ppc64/ppc64le. 
> >> Is this ok for trunk?
> >> 
> >> 
> >> BR,
> >> Jeff(Jiufu)
> >> 
> >> 
> >> gcc/ChangeLog:
> >> 
> >> 	* config/rs6000/rs6000.cc (TARGET_BLOCK_MOVE_FOR_STRUCT): Define.
> >> 	(submode_for_struct_block_move): New function.  Called from
> >> 	rs600_block_move_for_struct.
> >> 	(rs600_block_move_for_struct): New function.
> >> 	* doc/tm.texi: Regenerate.
> >> 	* doc/tm.texi.in (TARGET_BLOCK_MOVE_FOR_STRUCT): New.
> >> 	* expr.cc (store_expr): Call block_move_for_struct.
> >> 	* target.def (block_move_for_struct): New hook.
> >> 	* targhooks.cc (default_block_move_for_struct): New function.
> >> 	* targhooks.h (default_block_move_for_struct): New Prototype.
> >> 
> >> ---
> >>  gcc/config/rs6000/rs6000.cc | 44 +++++++++++++++++++++++++++++++++++++
> >>  gcc/doc/tm.texi             |  6 +++++
> >>  gcc/doc/tm.texi.in          |  2 ++
> >>  gcc/expr.cc                 | 14 +++++++++---
> >>  gcc/target.def              | 10 +++++++++
> >>  gcc/targhooks.cc            |  7 ++++++
> >>  gcc/targhooks.h             |  1 +
> >>  7 files changed, 81 insertions(+), 3 deletions(-)
> >> 
> >> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> >> index a85d7630b41..e14cecba0ef 100644
> >> --- a/gcc/config/rs6000/rs6000.cc
> >> +++ b/gcc/config/rs6000/rs6000.cc
> >> @@ -1758,6 +1758,9 @@ static const struct attribute_spec rs6000_attribute_table[] =
> >>  #undef TARGET_NEED_IPA_FN_TARGET_INFO
> >>  #define TARGET_NEED_IPA_FN_TARGET_INFO rs6000_need_ipa_fn_target_info
> >>  
> >> +#undef TARGET_BLOCK_MOVE_FOR_STRUCT
> >> +#define TARGET_BLOCK_MOVE_FOR_STRUCT rs600_block_move_for_struct
> >> +
> >>  #undef TARGET_UPDATE_IPA_FN_TARGET_INFO
> >>  #define TARGET_UPDATE_IPA_FN_TARGET_INFO rs6000_update_ipa_fn_target_info
> >>  \f
> >> @@ -23672,6 +23675,47 @@ rs6000_function_value (const_tree valtype,
> >>    return gen_rtx_REG (mode, regno);
> >>  }
> >>  
> >> +/* Subroutine of rs600_block_move_for_struct, to get the internal mode which
> >> +   would be used to move the struct.  */
> >> +static machine_mode
> >> +submode_for_struct_block_move (tree type)
> >> +{
> >> +  gcc_assert (TREE_CODE (type) == RECORD_TYPE);
> >> +
> >> +  /* The sub mode may not be the field's type of the struct.
> >> +     It would be fine to use the mode as if the type is used as a function
> >> +     parameter or return value.  For example: DF for "{double a[4];}", and
> >> +     DI for "{doubel a[3]; long l;}".
> >> +     Here, using the mode as if it is function return type.  */
> >> +  rtx val = rs6000_function_value (type, NULL, 0);
> >> +  return (GET_CODE (val) == PARALLEL) ? GET_MODE (XEXP (XVECEXP (val, 0, 0), 0))
> >> +				      : word_mode;
> >> +}
> >> +
> >> +/* Implement the TARGET_BLOCK_MOVE_FOR_STRUCT hook.  */
> >> +static void
> >> +rs600_block_move_for_struct (rtx x, rtx y, tree exp, HOST_WIDE_INT method)
> >> +{
> >> +  machine_mode mode = submode_for_struct_block_move (TREE_TYPE (exp));
> >> +  int mode_size = GET_MODE_SIZE (mode);
> >> +  int size = UINTVAL (expr_size (exp));
> >> +  if (size < mode_size || (size % mode_size) != 0 || size > 64)
> >> +    {
> >> +      default_block_move_for_struct (x, y, exp, method);
> >> +      return;
> >> +    }
> >> +
> >> +  int len = size / mode_size;
> >> +  for (int i = 0; i < len; i++)
> >> +    {
> >> +      rtx temp = gen_reg_rtx (mode);
> >> +      rtx src = adjust_address (y, mode, mode_size * i);
> >> +      rtx dest = adjust_address (x, mode, mode_size * i);
> >> +      emit_move_insn (temp, src);
> >> +      emit_move_insn (dest, temp);
> >> +    }
> >> +}
> >> +
> >>  /* Define how to find the value returned by a library function
> >>     assuming the value has mode MODE.  */
> >>  rtx
> >> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> >> index 8572313b308..c8a1c1f30cf 100644
> >> --- a/gcc/doc/tm.texi
> >> +++ b/gcc/doc/tm.texi
> >> @@ -1380,6 +1380,12 @@ retain the field's mode.
> >>  Normally, this is not needed.
> >>  @end deftypefn
> >>  
> >> +@deftypefn {Target Hook} void TARGET_BLOCK_MOVE_FOR_STRUCT (rtx @var{x}, rtx @var{y}, tree @var{exp}, HOST_WIDE_INT @var{method})
> >> +Move from @var{y} to @var{x}, where @var{y} is as @var{exp} in structure
> >> +type. @var{method} is the method if this function invokes block_move.
> >> +The default definition invokes block_move.
> >> +@end deftypefn
> >> +
> >>  @defmac ROUND_TYPE_ALIGN (@var{type}, @var{computed}, @var{specified})
> >>  Define this macro as an expression for the alignment of a type (given
> >>  by @var{type} as a tree node) if the alignment computed in the usual
> >> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> >> index 986e8f0da09..f0f525a2008 100644
> >> --- a/gcc/doc/tm.texi.in
> >> +++ b/gcc/doc/tm.texi.in
> >> @@ -1226,6 +1226,8 @@ to aligning a bit-field within the structure.
> >>  
> >>  @hook TARGET_MEMBER_TYPE_FORCES_BLK
> >>  
> >> +@hook TARGET_BLOCK_MOVE_FOR_STRUCT
> >> +
> >>  @defmac ROUND_TYPE_ALIGN (@var{type}, @var{computed}, @var{specified})
> >>  Define this macro as an expression for the alignment of a type (given
> >>  by @var{type} as a tree node) if the alignment computed in the usual
> >> diff --git a/gcc/expr.cc b/gcc/expr.cc
> >> index c6917fbf7bd..734dc07a76b 100644
> >> --- a/gcc/expr.cc
> >> +++ b/gcc/expr.cc
> >> @@ -6504,9 +6504,17 @@ store_expr (tree exp, rtx target, int call_param_p,
> >>  	emit_group_store (target, temp, TREE_TYPE (exp),
> >>  			  int_size_in_bytes (TREE_TYPE (exp)));
> >>        else if (GET_MODE (temp) == BLKmode)
> >> -	emit_block_move (target, temp, expr_size (exp),
> >> -			 (call_param_p
> >> -			  ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL));
> >> +	{
> >> +	  if (TREE_CODE (TREE_TYPE (exp)) == RECORD_TYPE)
> >> +	    targetm.block_move_for_struct (target, temp, exp,
> >> +					   call_param_p ? BLOCK_OP_CALL_PARM
> >> +							: BLOCK_OP_NORMAL);
> >
> > I think it's only sensible to do something about the call_param_p case,
> > all other cases are GIGO.  And I don't see a good reason to add a new
> > target hook since function parameter passing info is readily
> > available.
> This patch enhances the normal assignments on struct variables (
> expand_assignment-->store_expr, where the call_param_p is 0). Because
> the "struct block move/copy" is generated for this case too.
> 
> For example, the struct block_move is generated.
> # .MEM_3 = VDEF <.MEM_1(D)>
> D.3956 = *a_2(D);
> For this stmt, "D.3956" is a struct variable, and "a_2(D)" points to a
> block of the struct type.
> 
> If we only care about parameter/arg/retval(s), we would not need a
> hook, as you said, since parm/retval contains the information about
> the target/ABI.  And for more as you suggested in the previous emails,
> we could do SRA-like analyze to check if an assignment relates to
> parm/retval.

But the issue with applying a stmt-local heuristic is that it's going
to be as many times wrong as it is right so there inevitably will be
regressions.

Also copies not directly involving parameters or returns can be optimized
on the GIMPLE level just fine (see what SRA does), the exception is
cpymem optab implementations but those you short-circuit as well here.

> The current patch is simple relatively, it uses 'scalar-modes' to move
> blocks for all struct-type assignments.  I feel this would be enough,
> and I'm wondering if we need to check parm/retval.

It's only for the param/retval that RTL expansion has more information
than GIMPLE and in that specific case the code generation is very 
difficult to untie and thus a change like this looks appropriate.

Thanks,
Richard.

> 
> >
> > The question is whether this is best handled up in the call chain where
> > the parameter setup is done or below here (or in emit_block_move).
> As above discussed, this patch touches 'block move' for struct type,
> including copy from 'parameter stack block' to a variable block.
> Oh, assigning from 'parameter regs' to 'stack block' is not changed
> here, it is done through "assign_parms-...->emit_group_xx".
> 
> If any misunderstanding, or any comments/concerns, please point them
> out.  Thanks a lot!
> 
> BR,
> Jeff (Jiufu)
> 
> >
> > Richard.
> >
> >> +	  else
> >> +	    emit_block_move (target, temp, expr_size (exp),
> >> +			     (call_param_p ? BLOCK_OP_CALL_PARM
> >> +					   : BLOCK_OP_NORMAL));
> >> +	}
> >> +
> >>        /* If we emit a nontemporal store, there is nothing else to do.  */
> >>        else if (nontemporal && emit_storent_insn (target, temp))
> >>  	;
> >> diff --git a/gcc/target.def b/gcc/target.def
> >> index 25f94c19fa7..e141f72a8a3 100644
> >> --- a/gcc/target.def
> >> +++ b/gcc/target.def
> >> @@ -5584,6 +5584,16 @@ Normally, this is not needed.",
> >>   bool, (const_tree field, machine_mode mode),
> >>   default_member_type_forces_blk)
> >>  
> >> +
> >> +/* Move block for structure type.  */
> >> +DEFHOOK
> >> +(block_move_for_struct,
> >> + "Move from @var{y} to @var{x}, where @var{y} is as @var{exp} in structure\n\
> >> +type. @var{method} is the method if this function invokes block_move.\n\
> >> +The default definition invokes block_move.",
> >> + void, (rtx x, rtx y, tree exp, HOST_WIDE_INT method),
> >> + default_block_move_for_struct)
> >> +
> >>  /* See tree-ssa-math-opts.cc:divmod_candidate_p for conditions
> >>     that gate the divod transform.  */
> >>  DEFHOOK
> >> diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
> >> index 12a58456b39..2b96c348419 100644
> >> --- a/gcc/targhooks.cc
> >> +++ b/gcc/targhooks.cc
> >> @@ -2330,6 +2330,13 @@ default_member_type_forces_blk (const_tree, machine_mode)
> >>    return false;
> >>  }
> >>  
> >> +/* Default version of block_move_for_struct.  */
> >> +void
> >> +default_block_move_for_struct (rtx x, rtx y, tree exp, HOST_WIDE_INT method)
> >> +{
> >> +  emit_block_move (x, y, expr_size (exp), (enum block_op_methods)method);
> >> +}
> >> +
> >>  /* Default version of canonicalize_comparison.  */
> >>  
> >>  void
> >> diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> >> index a6a423c1abb..c284a35ee28 100644
> >> --- a/gcc/targhooks.h
> >> +++ b/gcc/targhooks.h
> >> @@ -264,6 +264,7 @@ extern void default_asm_output_ident_directive (const char*);
> >>  
> >>  extern scalar_int_mode default_cstore_mode (enum insn_code);
> >>  extern bool default_member_type_forces_blk (const_tree, machine_mode);
> >> +extern void default_block_move_for_struct (rtx, rtx, tree, HOST_WIDE_INT);
> >>  extern void default_atomic_assign_expand_fenv (tree *, tree *, tree *);
> >>  extern tree build_va_arg_indirect_ref (tree);
> >>  extern tree std_gimplify_va_arg_expr (tree, tree, gimple_seq *, gimple_seq *);
> >> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

  reply	other threads:[~2022-11-14  7:23 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-11  8:25 Jiufu Guo
2022-11-11  9:40 ` Richard Biener
2022-11-14  3:51   ` Jiufu Guo
2022-11-14  7:23     ` Richard Biener [this message]
2022-11-14  9:53       ` Jiufu Guo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.YFH.7.77.849.2211140719340.3995@jbgna.fhfr.qr \
    --to=rguenther@suse.de \
    --cc=dje.gcc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=guojiufu@linux.ibm.com \
    --cc=jeffreyalaw@gmail.com \
    --cc=linkw@gcc.gnu.org \
    --cc=segher@kernel.crashing.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).