From: Sandra Loosemore <sandra@codesourcery.com>
To: GCC Patches <gcc-patches@gcc.gnu.org>,
	Nigel Stephens <nigel@mips.com>, Guy Morrogh <guym@mips.com>,
	David Ung <davidu@mips.com>, Thiemo Seufer <ths@mips.com>,
	Mark Mitchell <mark@codesourcery.com>,
	richard@codesourcery.com, jakub@redhat.com
Subject: [committed] Re: PATCH: fine-tuning for can_store_by_pieces
Date: Sat, 25 Aug 2007 05:35:00 -0000
Message-ID: <46CF7332.3000706@codesourcery.com>
In-Reply-To: <87hcmqe2sp.fsf@firetop.home>

[-- Attachment #1: Type: text/plain, Size: 533 bytes --]

Richard Sandiford wrote:
> Thanks for your patience and all the iterations and testing.
> This version looks good to me.

Mark Mitchell wrote:
> The target-independent changes look fine.

OK, I've committed the patch.  I found that it collided with this one 
from Jakub:

http://gcc.gnu.org/ml/gcc-patches/2007-08/msg01641.html

but made the obvious correction, and verified that it still builds and 
that CSiBE results on MIPS do not regress from my previous version. 
I've attached the final version of the patch.

-Sandra

[-- Attachment #2: 31c-frob-by-pieces.log --]
[-- Type: text/x-log, Size: 1409 bytes --]

2007-08-24  Sandra Loosemore  <sandra@codesourcery.com>
            Nigel Stephens <nigel@mips.com>

	PR target/11787

	gcc/

	* doc/tm.texi (SET_RATIO, SET_BY_PIECES_P): Document new macros.
	(STORE_BY_PIECES_P): No longer applies to __builtin_memset.
	* expr.c (SET_BY_PIECES_P): Define.
	(can_store_by_pieces, store_by_pieces): Add MEMSETP argument; use
	it to decide whether to use SET_BY_PIECES_P or STORE_BY_PIECES_P.
	(store_expr): Pass MEMSETP argument to can_store_by_pieces and
	store_by_pieces.
	* expr.h (SET_RATIO): Define.
	(can_store_by_pieces, store_by_pieces): Update prototypes.
	* builtins.c (expand_builtin_memcpy): Pass MEMSETP argument to
	can_store_by_pieces/store_by_pieces.
	(expand_builtin_mempcpy_args): Likewise.
	(expand_builtin_strncpy): Likewise.
	(expand_builtin_memset_args): Likewise.  Also remove special case
	for optimize_size so that can_store_by_pieces/SET_BY_PIECES_P can
	decide what to do instead.
	* value-prof.c (tree_stringops_transform): Pass MEMSETP argument
	to can_store_by_pieces.

	* config/sh/sh.h (SET_BY_PIECES_P): Clone from STORE_BY_PIECES_P.
	* config/s390/s390.h (SET_BY_PIECES_P): Likewise.

	* config/mips/mips.opt (mmemcpy): Change from Var to Mask.
	* config/mips/mips.c (override_options): Make -Os default to -mmemcpy.
	* config/mips/mips.h (MIPS_CALL_RATIO): Define.
	(MOVE_RATIO, CLEAR_RATIO, SET_RATIO): Define.
	(STORE_BY_PIECES_P): Define.
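
An illustrative sketch of the dispatch described in the ChangeLog above,
written as stand-alone C.  Everything prefixed sketch_ or SKETCH_ is a
hypothetical stand-in and not GCC code: the instruction count ignores
alignment, and the two thresholds are arbitrary example values chosen so
that the memset and string-copy cases give different answers.  The real
can_store_by_pieces performs further checks after the predicate test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values for illustration only; real targets define their
   own STORE_MAX_PIECES, MOVE_RATIO and SET_RATIO.  */
#define SKETCH_MAX_PIECE 4u
#define SKETCH_MOVE_RATIO 4u
#define SKETCH_SET_RATIO 6u

/* Simplified model of move_by_pieces_ninsns: count the stores needed
   when each store writes the widest power-of-two piece that still fits.
   Alignment is ignored here.  */
static unsigned int
sketch_ninsns (unsigned long len)
{
  unsigned int n = 0;
  unsigned long piece;

  for (piece = SKETCH_MAX_PIECE; piece >= 1; piece /= 2)
    {
      n += (unsigned int) (len / piece);
      len %= piece;
    }
  return n;
}

/* MEMSETP selects the SET_RATIO-based predicate; otherwise the
   MOVE_RATIO-based one used for constant-string copies applies.  */
static bool
sketch_can_store_by_pieces (unsigned long len, bool memsetp)
{
  if (len == 0)
    return true;
  return sketch_ninsns (len) < (memsetp ? SKETCH_SET_RATIO
                                        : SKETCH_MOVE_RATIO);
}

/* Like the gcc_assert in store_by_pieces, this requires the caller to
   have checked the predicate with the same MEMSETP value.  Returns the
   number of stores the model would emit.  */
static unsigned int
sketch_store_by_pieces (unsigned long len, bool memsetp)
{
  assert (sketch_can_store_by_pieces (len, memsetp));
  return sketch_ninsns (len);
}
```

With these example thresholds, a 16-byte memset is accepted while a
16-byte constant-string copy is rejected, which is the kind of split the
MEMSETP argument makes possible.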

[-- Attachment #3: 31c-frob-by-pieces.patch --]
[-- Type: text/x-patch, Size: 24084 bytes --]

Index: gcc/doc/tm.texi
===================================================================
*** gcc/doc/tm.texi	(revision 127789)
--- gcc/doc/tm.texi	(working copy)
*************** will be used.  Defaults to 1 if @code{mo
*** 5897,5908 ****
  than @code{CLEAR_RATIO}.
  @end defmac
  
  @defmac STORE_BY_PIECES_P (@var{size}, @var{alignment})
  A C expression used to determine whether @code{store_by_pieces} will be
! used to set a chunk of memory to a constant value, or whether some other
! mechanism will be used.  Used by @code{__builtin_memset} when storing
! values other than constant zero and by @code{__builtin_strcpy} when
! when called with a constant source string.
  Defaults to 1 if @code{move_by_pieces_ninsns} returns less
  than @code{MOVE_RATIO}.
  @end defmac
--- 5897,5926 ----
  than @code{CLEAR_RATIO}.
  @end defmac
  
+ @defmac SET_RATIO
+ The threshold of number of scalar move insns, @emph{below} which a sequence
+ of insns should be generated to set memory to a constant value, instead of
+ a block set insn or a library call.  
+ Increasing the value will always make code faster, but
+ eventually incurs high cost in increased code size.
+ 
+ If you don't define this, it defaults to the value of @code{MOVE_RATIO}.
+ @end defmac
+ 
+ @defmac SET_BY_PIECES_P (@var{size}, @var{alignment})
+ A C expression used to determine whether @code{store_by_pieces} will be
+ used to set a chunk of memory to a constant value, or whether some 
+ other mechanism will be used.  Used by @code{__builtin_memset} when 
+ storing values other than constant zero.
+ Defaults to 1 if @code{move_by_pieces_ninsns} returns less
+ than @code{SET_RATIO}.
+ @end defmac
+ 
  @defmac STORE_BY_PIECES_P (@var{size}, @var{alignment})
  A C expression used to determine whether @code{store_by_pieces} will be
! used to set a chunk of memory to a constant string value, or whether some 
! other mechanism will be used.  Used by @code{__builtin_strcpy} when
! called with a constant source string.
  Defaults to 1 if @code{move_by_pieces_ninsns} returns less
  than @code{MOVE_RATIO}.
  @end defmac
Index: gcc/expr.c
===================================================================
*** gcc/expr.c	(revision 127789)
--- gcc/expr.c	(working copy)
*************** static bool float_extend_from_mem[NUM_MA
*** 186,193 ****
  #endif
  
  /* This macro is used to determine whether store_by_pieces should be
!    called to "memset" storage with byte values other than zero, or
!    to "memcpy" storage when the source is a constant string.  */
  #ifndef STORE_BY_PIECES_P
  #define STORE_BY_PIECES_P(SIZE, ALIGN) \
    (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES + 1) \
--- 186,200 ----
  #endif
  
  /* This macro is used to determine whether store_by_pieces should be
!    called to "memset" storage with byte values other than zero.  */
! #ifndef SET_BY_PIECES_P
! #define SET_BY_PIECES_P(SIZE, ALIGN) \
!   (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES + 1) \
!    < (unsigned int) SET_RATIO)
! #endif
! 
! /* This macro is used to determine whether store_by_pieces should be
!    called to "memcpy" storage when the source is a constant string.  */
  #ifndef STORE_BY_PIECES_P
  #define STORE_BY_PIECES_P(SIZE, ALIGN) \
    (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES + 1) \
*************** use_group_regs (rtx *call_fusage, rtx re
*** 2191,2203 ****
  /* Determine whether the LEN bytes generated by CONSTFUN can be
     stored to memory using several move instructions.  CONSTFUNDATA is
     a pointer which will be passed as argument in every CONSTFUN call.
!    ALIGN is maximum alignment we can assume.  Return nonzero if a
!    call to store_by_pieces should succeed.  */
  
  int
  can_store_by_pieces (unsigned HOST_WIDE_INT len,
  		     rtx (*constfun) (void *, HOST_WIDE_INT, enum machine_mode),
! 		     void *constfundata, unsigned int align)
  {
    unsigned HOST_WIDE_INT l;
    unsigned int max_size;
--- 2198,2211 ----
  /* Determine whether the LEN bytes generated by CONSTFUN can be
     stored to memory using several move instructions.  CONSTFUNDATA is
     a pointer which will be passed as argument in every CONSTFUN call.
!    ALIGN is maximum alignment we can assume.  MEMSETP is true if this is
!    a memset operation and false if it's a copy of a constant string.
!    Return nonzero if a call to store_by_pieces should succeed.  */
  
  int
  can_store_by_pieces (unsigned HOST_WIDE_INT len,
  		     rtx (*constfun) (void *, HOST_WIDE_INT, enum machine_mode),
! 		     void *constfundata, unsigned int align, bool memsetp)
  {
    unsigned HOST_WIDE_INT l;
    unsigned int max_size;
*************** can_store_by_pieces (unsigned HOST_WIDE_
*** 2210,2216 ****
    if (len == 0)
      return 1;
  
!   if (! STORE_BY_PIECES_P (len, align))
      return 0;
  
    tmode = mode_for_size (STORE_MAX_PIECES * BITS_PER_UNIT, MODE_INT, 1);
--- 2218,2226 ----
    if (len == 0)
      return 1;
  
!   if (! (memsetp 
! 	 ? SET_BY_PIECES_P (len, align)
! 	 : STORE_BY_PIECES_P (len, align)))
      return 0;
  
    tmode = mode_for_size (STORE_MAX_PIECES * BITS_PER_UNIT, MODE_INT, 1);
*************** can_store_by_pieces (unsigned HOST_WIDE_
*** 2285,2291 ****
  /* Generate several move instructions to store LEN bytes generated by
     CONSTFUN to block TO.  (A MEM rtx with BLKmode).  CONSTFUNDATA is a
     pointer which will be passed as argument in every CONSTFUN call.
!    ALIGN is maximum alignment we can assume.
     If ENDP is 0 return to, if ENDP is 1 return memory at the end ala
     mempcpy, and if ENDP is 2 return memory the end minus one byte ala
     stpcpy.  */
--- 2295,2302 ----
  /* Generate several move instructions to store LEN bytes generated by
     CONSTFUN to block TO.  (A MEM rtx with BLKmode).  CONSTFUNDATA is a
     pointer which will be passed as argument in every CONSTFUN call.
!    ALIGN is maximum alignment we can assume.  MEMSETP is true if this is
!    a memset operation and false if it's a copy of a constant string.
     If ENDP is 0 return to, if ENDP is 1 return memory at the end ala
     mempcpy, and if ENDP is 2 return memory the end minus one byte ala
     stpcpy.  */
*************** can_store_by_pieces (unsigned HOST_WIDE_
*** 2293,2299 ****
  rtx
  store_by_pieces (rtx to, unsigned HOST_WIDE_INT len,
  		 rtx (*constfun) (void *, HOST_WIDE_INT, enum machine_mode),
! 		 void *constfundata, unsigned int align, int endp)
  {
    struct store_by_pieces data;
  
--- 2304,2310 ----
  rtx
  store_by_pieces (rtx to, unsigned HOST_WIDE_INT len,
  		 rtx (*constfun) (void *, HOST_WIDE_INT, enum machine_mode),
! 		 void *constfundata, unsigned int align, bool memsetp, int endp)
  {
    struct store_by_pieces data;
  
*************** store_by_pieces (rtx to, unsigned HOST_W
*** 2303,2309 ****
        return to;
      }
  
!   gcc_assert (STORE_BY_PIECES_P (len, align));
    data.constfun = constfun;
    data.constfundata = constfundata;
    data.len = len;
--- 2314,2322 ----
        return to;
      }
  
!   gcc_assert (memsetp
! 	      ? SET_BY_PIECES_P (len, align)
! 	      : STORE_BY_PIECES_P (len, align));
    data.constfun = constfun;
    data.constfundata = constfundata;
    data.len = len;
*************** store_expr (tree exp, rtx target, int ca
*** 4498,4504 ****
        str_copy_len = MIN (str_copy_len, exp_len);
        if (!can_store_by_pieces (str_copy_len, builtin_strncpy_read_str,
  				(void *) TREE_STRING_POINTER (exp),
! 				MEM_ALIGN (target)))
  	goto normal_expr;
  
        dest_mem = target;
--- 4511,4517 ----
        str_copy_len = MIN (str_copy_len, exp_len);
        if (!can_store_by_pieces (str_copy_len, builtin_strncpy_read_str,
  				(void *) TREE_STRING_POINTER (exp),
! 				MEM_ALIGN (target), false))
  	goto normal_expr;
  
        dest_mem = target;
*************** store_expr (tree exp, rtx target, int ca
*** 4507,4513 ****
  				  str_copy_len, builtin_strncpy_read_str,
  				  (void *) TREE_STRING_POINTER (exp),
  				  MEM_ALIGN (target),
! 				  exp_len > str_copy_len ? 1 : 0);
        if (exp_len > str_copy_len)
  	clear_storage (dest_mem, GEN_INT (exp_len - str_copy_len),
  		       BLOCK_OP_NORMAL);
--- 4520,4527 ----
  				  str_copy_len, builtin_strncpy_read_str,
  				  (void *) TREE_STRING_POINTER (exp),
  				  MEM_ALIGN (target),
! 				  false,
! 				  exp_len > str_copy_len ? 1 : 0);
        if (exp_len > str_copy_len)
  	clear_storage (dest_mem, GEN_INT (exp_len - str_copy_len),
  		       BLOCK_OP_NORMAL);
Index: gcc/expr.h
===================================================================
*** gcc/expr.h	(revision 127789)
--- gcc/expr.h	(working copy)
*************** enum expand_modifier {EXPAND_NORMAL = 0,
*** 84,89 ****
--- 84,96 ----
  #define CLEAR_RATIO (optimize_size ? 3 : 15)
  #endif
  #endif
+ 
+ /* If a memory set (to value other than zero) operation would take
+    SET_RATIO or more simple move-instruction sequences, we will do a movmem
+    or libcall instead.  */
+ #ifndef SET_RATIO
+ #define SET_RATIO MOVE_RATIO
+ #endif
  \f
  enum direction {none, upward, downward};
  
*************** extern int can_move_by_pieces (unsigned 
*** 444,463 ****
     CONSTFUN with several move instructions by store_by_pieces
     function.  CONSTFUNDATA is a pointer which will be passed as argument
     in every CONSTFUN call.
!    ALIGN is maximum alignment we can assume.  */
  extern int can_store_by_pieces (unsigned HOST_WIDE_INT,
  				rtx (*) (void *, HOST_WIDE_INT,
  					 enum machine_mode),
! 				void *, unsigned int);
  
  /* Generate several move instructions to store LEN bytes generated by
     CONSTFUN to block TO.  (A MEM rtx with BLKmode).  CONSTFUNDATA is a
     pointer which will be passed as argument in every CONSTFUN call.
     ALIGN is maximum alignment we can assume.
     Returns TO + LEN.  */
  extern rtx store_by_pieces (rtx, unsigned HOST_WIDE_INT,
  			    rtx (*) (void *, HOST_WIDE_INT, enum machine_mode),
! 			    void *, unsigned int, int);
  
  /* Emit insns to set X from Y.  */
  extern rtx emit_move_insn (rtx, rtx);
--- 451,473 ----
     CONSTFUN with several move instructions by store_by_pieces
     function.  CONSTFUNDATA is a pointer which will be passed as argument
     in every CONSTFUN call.
!    ALIGN is maximum alignment we can assume.
!    MEMSETP is true if this is a real memset/bzero, not a copy
!    of a const string.  */
  extern int can_store_by_pieces (unsigned HOST_WIDE_INT,
  				rtx (*) (void *, HOST_WIDE_INT,
  					 enum machine_mode),
! 				void *, unsigned int, bool);
  
  /* Generate several move instructions to store LEN bytes generated by
     CONSTFUN to block TO.  (A MEM rtx with BLKmode).  CONSTFUNDATA is a
     pointer which will be passed as argument in every CONSTFUN call.
     ALIGN is maximum alignment we can assume.
+    MEMSETP is true if this is a real memset/bzero, not a copy.
     Returns TO + LEN.  */
  extern rtx store_by_pieces (rtx, unsigned HOST_WIDE_INT,
  			    rtx (*) (void *, HOST_WIDE_INT, enum machine_mode),
! 			    void *, unsigned int, bool, int);
  
  /* Emit insns to set X from Y.  */
  extern rtx emit_move_insn (rtx, rtx);
Index: gcc/builtins.c
===================================================================
*** gcc/builtins.c	(revision 127789)
--- gcc/builtins.c	(working copy)
*************** expand_builtin_memcpy (tree exp, rtx tar
*** 3331,3341 ****
  	  && GET_CODE (len_rtx) == CONST_INT
  	  && (unsigned HOST_WIDE_INT) INTVAL (len_rtx) <= strlen (src_str) + 1
  	  && can_store_by_pieces (INTVAL (len_rtx), builtin_memcpy_read_str,
! 				  (void *) src_str, dest_align))
  	{
  	  dest_mem = store_by_pieces (dest_mem, INTVAL (len_rtx),
  				      builtin_memcpy_read_str,
! 				      (void *) src_str, dest_align, 0);
  	  dest_mem = force_operand (XEXP (dest_mem, 0), NULL_RTX);
  	  dest_mem = convert_memory_address (ptr_mode, dest_mem);
  	  return dest_mem;
--- 3331,3341 ----
  	  && GET_CODE (len_rtx) == CONST_INT
  	  && (unsigned HOST_WIDE_INT) INTVAL (len_rtx) <= strlen (src_str) + 1
  	  && can_store_by_pieces (INTVAL (len_rtx), builtin_memcpy_read_str,
! 				  (void *) src_str, dest_align, false))
  	{
  	  dest_mem = store_by_pieces (dest_mem, INTVAL (len_rtx),
  				      builtin_memcpy_read_str,
! 				      (void *) src_str, dest_align, false, 0);
  	  dest_mem = force_operand (XEXP (dest_mem, 0), NULL_RTX);
  	  dest_mem = convert_memory_address (ptr_mode, dest_mem);
  	  return dest_mem;
*************** expand_builtin_mempcpy_args (tree dest, 
*** 3444,3456 ****
  	  && GET_CODE (len_rtx) == CONST_INT
  	  && (unsigned HOST_WIDE_INT) INTVAL (len_rtx) <= strlen (src_str) + 1
  	  && can_store_by_pieces (INTVAL (len_rtx), builtin_memcpy_read_str,
! 				  (void *) src_str, dest_align))
  	{
  	  dest_mem = get_memory_rtx (dest, len);
  	  set_mem_align (dest_mem, dest_align);
  	  dest_mem = store_by_pieces (dest_mem, INTVAL (len_rtx),
  				      builtin_memcpy_read_str,
! 				      (void *) src_str, dest_align, endp);
  	  dest_mem = force_operand (XEXP (dest_mem, 0), NULL_RTX);
  	  dest_mem = convert_memory_address (ptr_mode, dest_mem);
  	  return dest_mem;
--- 3444,3457 ----
  	  && GET_CODE (len_rtx) == CONST_INT
  	  && (unsigned HOST_WIDE_INT) INTVAL (len_rtx) <= strlen (src_str) + 1
  	  && can_store_by_pieces (INTVAL (len_rtx), builtin_memcpy_read_str,
! 				  (void *) src_str, dest_align, false))
  	{
  	  dest_mem = get_memory_rtx (dest, len);
  	  set_mem_align (dest_mem, dest_align);
  	  dest_mem = store_by_pieces (dest_mem, INTVAL (len_rtx),
  				      builtin_memcpy_read_str,
! 				      (void *) src_str, dest_align,
! 				      false, endp);
  	  dest_mem = force_operand (XEXP (dest_mem, 0), NULL_RTX);
  	  dest_mem = convert_memory_address (ptr_mode, dest_mem);
  	  return dest_mem;
*************** expand_builtin_strncpy (tree exp, rtx ta
*** 3792,3804 ****
  	  if (!p || dest_align == 0 || !host_integerp (len, 1)
  	      || !can_store_by_pieces (tree_low_cst (len, 1),
  				       builtin_strncpy_read_str,
! 				       (void *) p, dest_align))
  	    return NULL_RTX;
  
  	  dest_mem = get_memory_rtx (dest, len);
  	  store_by_pieces (dest_mem, tree_low_cst (len, 1),
  			   builtin_strncpy_read_str,
! 			   (void *) p, dest_align, 0);
  	  dest_mem = force_operand (XEXP (dest_mem, 0), NULL_RTX);
  	  dest_mem = convert_memory_address (ptr_mode, dest_mem);
  	  return dest_mem;
--- 3793,3805 ----
  	  if (!p || dest_align == 0 || !host_integerp (len, 1)
  	      || !can_store_by_pieces (tree_low_cst (len, 1),
  				       builtin_strncpy_read_str,
! 				       (void *) p, dest_align, false))
  	    return NULL_RTX;
  
  	  dest_mem = get_memory_rtx (dest, len);
  	  store_by_pieces (dest_mem, tree_low_cst (len, 1),
  			   builtin_strncpy_read_str,
! 			   (void *) p, dest_align, false, 0);
  	  dest_mem = force_operand (XEXP (dest_mem, 0), NULL_RTX);
  	  dest_mem = convert_memory_address (ptr_mode, dest_mem);
  	  return dest_mem;
*************** expand_builtin_memset_args (tree dest, t
*** 3926,3939 ****
         * We can't pass builtin_memset_gen_str as that emits RTL.  */
        c = 1;
        if (host_integerp (len, 1)
- 	  && !(optimize_size && tree_low_cst (len, 1) > 1)
  	  && can_store_by_pieces (tree_low_cst (len, 1),
! 				  builtin_memset_read_str, &c, dest_align))
  	{
  	  val_rtx = force_reg (TYPE_MODE (unsigned_char_type_node),
  			       val_rtx);
  	  store_by_pieces (dest_mem, tree_low_cst (len, 1),
! 			   builtin_memset_gen_str, val_rtx, dest_align, 0);
  	}
        else if (!set_storage_via_setmem (dest_mem, len_rtx, val_rtx,
  					dest_align, expected_align,
--- 3927,3941 ----
         * We can't pass builtin_memset_gen_str as that emits RTL.  */
        c = 1;
        if (host_integerp (len, 1)
  	  && can_store_by_pieces (tree_low_cst (len, 1),
! 				  builtin_memset_read_str, &c, dest_align,
! 				  true))
  	{
  	  val_rtx = force_reg (TYPE_MODE (unsigned_char_type_node),
  			       val_rtx);
  	  store_by_pieces (dest_mem, tree_low_cst (len, 1),
! 			   builtin_memset_gen_str, val_rtx, dest_align,
! 			   true, 0);
  	}
        else if (!set_storage_via_setmem (dest_mem, len_rtx, val_rtx,
  					dest_align, expected_align,
*************** expand_builtin_memset_args (tree dest, t
*** 3951,3961 ****
    if (c)
      {
        if (host_integerp (len, 1)
- 	  && !(optimize_size && tree_low_cst (len, 1) > 1)
  	  && can_store_by_pieces (tree_low_cst (len, 1),
! 				  builtin_memset_read_str, &c, dest_align))
  	store_by_pieces (dest_mem, tree_low_cst (len, 1),
! 			 builtin_memset_read_str, &c, dest_align, 0);
        else if (!set_storage_via_setmem (dest_mem, len_rtx, GEN_INT (c),
  					dest_align, expected_align,
  					expected_size))
--- 3953,3963 ----
    if (c)
      {
        if (host_integerp (len, 1)
  	  && can_store_by_pieces (tree_low_cst (len, 1),
! 				  builtin_memset_read_str, &c, dest_align,
! 				  true))
  	store_by_pieces (dest_mem, tree_low_cst (len, 1),
! 			 builtin_memset_read_str, &c, dest_align, true, 0);
        else if (!set_storage_via_setmem (dest_mem, len_rtx, GEN_INT (c),
  					dest_align, expected_align,
  					expected_size))
Index: gcc/value-prof.c
===================================================================
*** gcc/value-prof.c	(revision 127789)
--- gcc/value-prof.c	(working copy)
*************** tree_stringops_transform (block_stmt_ite
*** 1392,1404 ****
      case BUILT_IN_MEMSET:
        if (!can_store_by_pieces (val, builtin_memset_read_str,
  				CALL_EXPR_ARG (call, 1),
! 				dest_align))
  	return false;
        break;
      case BUILT_IN_BZERO:
        if (!can_store_by_pieces (val, builtin_memset_read_str,
  				integer_zero_node,
! 				dest_align))
  	return false;
        break;
      default:
--- 1392,1404 ----
      case BUILT_IN_MEMSET:
        if (!can_store_by_pieces (val, builtin_memset_read_str,
  				CALL_EXPR_ARG (call, 1),
! 				dest_align, true))
  	return false;
        break;
      case BUILT_IN_BZERO:
        if (!can_store_by_pieces (val, builtin_memset_read_str,
  				integer_zero_node,
! 				dest_align, true))
  	return false;
        break;
      default:
Index: gcc/config/sh/sh.h
===================================================================
*** gcc/config/sh/sh.h	(revision 127789)
--- gcc/config/sh/sh.h	(working copy)
*************** struct sh_args {
*** 2184,2189 ****
--- 2184,2191 ----
    (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES + 1) \
     < (TARGET_SMALLCODE ? 2 : ((ALIGN >= 32) ? 16 : 2)))
  
+ #define SET_BY_PIECES_P(SIZE, ALIGN) STORE_BY_PIECES_P(SIZE, ALIGN)
+ 
  /* Macros to check register numbers against specific register classes.  */
  
  /* These assume that REGNO is a hard or pseudo reg number.
Index: gcc/config/s390/s390.h
===================================================================
*** gcc/config/s390/s390.h	(revision 127789)
--- gcc/config/s390/s390.h	(working copy)
*************** extern struct rtx_def *s390_compare_op0,
*** 803,812 ****
      || (TARGET_64BIT && (SIZE) == 8) )
  
  /* This macro is used to determine whether store_by_pieces should be
!    called to "memset" storage with byte values other than zero, or
!    to "memcpy" storage when the source is a constant string.  */
  #define STORE_BY_PIECES_P(SIZE, ALIGN) MOVE_BY_PIECES_P (SIZE, ALIGN)
  
  /* Don't perform CSE on function addresses.  */
  #define NO_FUNCTION_CSE
  
--- 803,815 ----
      || (TARGET_64BIT && (SIZE) == 8) )
  
  /* This macro is used to determine whether store_by_pieces should be
!    called to "memcpy" storage when the source is a constant string.  */
  #define STORE_BY_PIECES_P(SIZE, ALIGN) MOVE_BY_PIECES_P (SIZE, ALIGN)
  
+ /* Likewise to decide whether to "memset" storage with byte values
+    other than zero.  */
+ #define SET_BY_PIECES_P(SIZE, ALIGN) STORE_BY_PIECES_P (SIZE, ALIGN)
+ 
  /* Don't perform CSE on function addresses.  */
  #define NO_FUNCTION_CSE
  
Index: gcc/config/mips/mips.opt
===================================================================
*** gcc/config/mips/mips.opt	(revision 127789)
--- gcc/config/mips/mips.opt	(working copy)
*************** Target Report RejectNegative Mask(LONG64
*** 173,179 ****
  Use a 64-bit long type
  
  mmemcpy
! Target Report Var(TARGET_MEMCPY)
  Don't optimize block moves
  
  mmips-tfile
--- 173,179 ----
  Use a 64-bit long type
  
  mmemcpy
! Target Report Mask(MEMCPY)
  Don't optimize block moves
  
  mmips-tfile
Index: gcc/config/mips/mips.c
===================================================================
*** gcc/config/mips/mips.c	(revision 127789)
--- gcc/config/mips/mips.c	(working copy)
*************** override_options (void)
*** 5323,5328 ****
--- 5323,5333 ----
        flag_delayed_branch = 0;
      }
  
+   /* Prefer a call to memcpy over inline code when optimizing for size,
+      though see MOVE_RATIO in mips.h.  */
+   if (optimize_size && (target_flags_explicit & MASK_MEMCPY) == 0)
+     target_flags |= MASK_MEMCPY;
+ 
  #ifdef MIPS_TFMODE_FORMAT
    REAL_MODE_FORMAT (TFmode) = &MIPS_TFMODE_FORMAT;
  #endif
Index: gcc/config/mips/mips.h
===================================================================
*** gcc/config/mips/mips.h	(revision 127789)
--- gcc/config/mips/mips.h	(working copy)
*************** while (0)
*** 2785,2790 ****
--- 2785,2841 ----
  
  #undef PTRDIFF_TYPE
  #define PTRDIFF_TYPE (POINTER_SIZE == 64 ? "long int" : "int")
+ 
+ /* The base cost of a memcpy call, for MOVE_RATIO and friends.  These
+    values were determined experimentally by benchmarking with CSiBE.
+    In theory, the call overhead is higher for TARGET_ABICALLS (especially
+    for o32 where we have to restore $gp afterwards as well as make an
+    indirect call), but in practice, bumping this up higher for
+    TARGET_ABICALLS doesn't make much difference to code size.  */
+ 
+ #define MIPS_CALL_RATIO 8
+ 
+ /* Define MOVE_RATIO to encourage use of movmemsi when enabled,
+    since it should always generate code at least as good as
+    move_by_pieces().  But when inline movmemsi pattern is disabled
+    (i.e., with -mips16 or -mmemcpy), instead use a value approximating
+    the length of a memcpy call sequence, so that move_by_pieces will
+    generate inline code if it is shorter than a function call.
+    Since move_by_pieces_ninsns() counts memory-to-memory moves, but
+    we'll have to generate a load/store pair for each, halve the value of 
+    MIPS_CALL_RATIO to take that into account.
+    The default value for MOVE_RATIO when HAVE_movmemsi is true is 2.
+    There is no point to setting it to less than this to try to disable
+    move_by_pieces entirely, because that also disables some desirable 
+    tree-level optimizations, specifically related to optimizing a
+    one-byte string copy into a simple move byte operation.  */
+ 
+ #define MOVE_RATIO \
+   ((TARGET_MIPS16 || TARGET_MEMCPY) ? MIPS_CALL_RATIO / 2 : 2)
+ 
+ /* For CLEAR_RATIO, when optimizing for size, give a better estimate
+    of the length of a memset call, but use the default otherwise.  */
+ 
+ #define CLEAR_RATIO \
+   (optimize_size ? MIPS_CALL_RATIO : 15)
+ 
+ /* This is similar to CLEAR_RATIO, but for a non-zero constant, so when
+    optimizing for size adjust the ratio to account for the overhead of
+    loading the constant and replicating it across the word.  */
+ 
+ #define SET_RATIO \
+   (optimize_size ? MIPS_CALL_RATIO - 2 : 15)
+ 
+ /* STORE_BY_PIECES_P can be used when copying a constant string, but
+    in that case each word takes 3 insns (lui, ori, sw), or more in
+    64-bit mode, instead of 2 (lw, sw).  For now we always fail this
+    and let the move_by_pieces code copy the string from read-only
+    memory.  In the future, this could be tuned further for multi-issue
+    CPUs that can issue stores down one pipe and arithmetic instructions
+    down another; in that case, the lui/ori/sw combination would be a
+    win for long enough strings.  */
+ 
+ #define STORE_BY_PIECES_P(SIZE, ALIGN) 0
  \f
  #ifndef __mips16
  /* Since the bits of the _init and _fini function is spread across
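
For reference, the MIPS ratios defined in mips.h above work out as
follows.  This is a stand-alone sketch, not part of the patch: the struct
and the sketch_ functions are hypothetical names that model the
optimize_size, TARGET_MIPS16 and TARGET_MEMCPY conditions as plain
booleans, and only the value 8 for MIPS_CALL_RATIO and the arithmetic are
taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define MIPS_CALL_RATIO 8

/* Hypothetical model of the relevant option state.  */
struct sketch_opts
{
  bool optimize_size;    /* -Os */
  bool mips16;           /* -mips16 */
  bool use_memcpy;       /* -mmemcpy in effect */
  bool memcpy_explicit;  /* user gave -mmemcpy or -mno-memcpy */
};

/* Models the override_options change: -Os implies -mmemcpy unless the
   user chose explicitly.  */
static void
sketch_override_options (struct sketch_opts *o)
{
  if (o->optimize_size && !o->memcpy_explicit)
    o->use_memcpy = true;
}

/* Half the call ratio when the inline movmemsi pattern is disabled,
   because each counted move is a load/store pair.  */
static unsigned int
sketch_move_ratio (const struct sketch_opts *o)
{
  return (o->mips16 || o->use_memcpy) ? MIPS_CALL_RATIO / 2 : 2;
}

static unsigned int
sketch_clear_ratio (const struct sketch_opts *o)
{
  return o->optimize_size ? MIPS_CALL_RATIO : 15;
}

/* Two insns are reserved for loading and replicating the constant.  */
static unsigned int
sketch_set_ratio (const struct sketch_opts *o)
{
  assert (MIPS_CALL_RATIO > 2);
  return o->optimize_size ? MIPS_CALL_RATIO - 2 : 15;
}
```

Under this model, plain -Os gives MOVE_RATIO 4, CLEAR_RATIO 8 and
SET_RATIO 6, while the non-size-optimized defaults are 2, 15 and 15.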
