From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 9380 invoked by alias); 25 Aug 2007 00:09:55 -0000
Received: (qmail 8455 invoked by uid 22791); 25 Aug 2007 00:09:48 -0000
X-Spam-Check-By: sourceware.org
Received: from mail.codesourcery.com (HELO mail.codesourcery.com) (65.74.133.4) by sourceware.org (qpsmtpd/0.31) with ESMTP; Sat, 25 Aug 2007 00:09:39 +0000
Received: (qmail 26639 invoked from network); 25 Aug 2007 00:09:37 -0000
Received: from unknown (HELO bullfrog.localdomain) (sandra@127.0.0.2) by mail.codesourcery.com with ESMTPA; 25 Aug 2007 00:09:37 -0000
Message-ID: <46CF7332.3000706@codesourcery.com>
Date: Sat, 25 Aug 2007 05:35:00 -0000
From: Sandra Loosemore
User-Agent: Thunderbird 2.0.0.4 (X11/20070604)
MIME-Version: 1.0
To: GCC Patches, Nigel Stephens, Guy Morrogh, David Ung, Thiemo Seufer, Mark Mitchell, richard@codesourcery.com, jakub@redhat.com
Subject: [committed] Re: PATCH: fine-tuning for can_store_by_pieces
References: <46C3343A.5080407@codesourcery.com> <87ps1nop2x.fsf@firetop.home> <46C778D6.5060808@codesourcery.com> <87y7g6r50c.fsf@firetop.home> <46CA222D.2050107@codesourcery.com> <87ps1h5mda.fsf@firetop.home> <46CAEBCE.3050807@codesourcery.com> <87r6lx3r9p.fsf@firetop.home> <46CB4B99.5010501@codesourcery.com> <87zm0k39gj.fsf@firetop.home> <46CD9828.3040305@codesourcery.com> <87hcmqe2sp.fsf@firetop.home>
In-Reply-To: <87hcmqe2sp.fsf@firetop.home>
Content-Type: multipart/mixed; boundary="------------040703060709000908050701"
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2007-08/txt/msg01695.txt.bz2

This is a multi-part message in MIME format.
--------------040703060709000908050701
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Content-length: 533

Richard Sandiford wrote:
> Thanks for your patience and all the iterations and testing.
> This version looks good to me.

Mark Mitchell wrote:
> The target-independent changes look fine.

OK, I've committed the patch. I found that it collided with this one from Jakub:

http://gcc.gnu.org/ml/gcc-patches/2007-08/msg01641.html

but made the obvious correction, and verified that it still builds and that CSiBE results on MIPS do not regress from my previous version. I've attached the final version of the patch.

-Sandra

--------------040703060709000908050701
Content-Type: text/x-log; name="31c-frob-by-pieces.log"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="31c-frob-by-pieces.log"
Content-length: 1409

2007-08-24  Sandra Loosemore
            Nigel Stephens

        PR target/11787

        gcc/
        * doc/tm.texi (SET_RATIO, SET_BY_PIECES_P): Document new macros.
        (STORE_BY_PIECES_P): No longer applies to __builtin_memset.
        * expr.c (SET_BY_PIECES_P): Define.
        (can_store_by_pieces, store_by_pieces): Add MEMSETP argument; use
        it to decide whether to use SET_BY_PIECES_P or STORE_BY_PIECES_P.
        (store_expr): Pass MEMSETP argument to can_store_by_pieces and
        store_by_pieces.
        * expr.h (SET_RATIO): Define.
        (can_store_by_pieces, store_by_pieces): Update prototypes.
        * builtins.c (expand_builtin_memcpy): Pass MEMSETP argument to
        can_store_by_pieces/store_by_pieces.
        (expand_builtin_mempcpy_args): Likewise.
        (expand_builtin_strncpy): Likewise.
        (expand_builtin_memset_args): Likewise. Also remove special case
        for optimize_size so that can_store_by_pieces/SET_BY_PIECES_P can
        decide what to do instead.
        * value-prof.c (tree_stringops_transform): Pass MEMSETP argument
        to can_store_by_pieces.
        * config/sh/sh.h (SET_BY_PIECES_P): Clone from STORE_BY_PIECES_P.
        * config/s390/s390.h (SET_BY_PIECES_P): Likewise.
        * config/mips/mips.opt (mmemcpy): Change from Var to Mask.
        * config/mips/mips.c (override_options): Make -Os default to
        -mmemcpy.
        * config/mips/mips.h (MIPS_CALL_RATIO): Define.
        (MOVE_RATIO, CLEAR_RATIO, SET_RATIO): Define.
        (STORE_BY_PIECES_P): Define.

--------------040703060709000908050701 Content-Type: text/x-patch; name="31c-frob-by-pieces.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="31c-frob-by-pieces.patch" Content-length: 24084 Index: gcc/doc/tm.texi =================================================================== *** gcc/doc/tm.texi (revision 127789) --- gcc/doc/tm.texi (working copy) *************** will be used. Defaults to 1 if @code{mo *** 5897,5908 **** than @code{CLEAR_RATIO}. @end defmac @defmac STORE_BY_PIECES_P (@var{size}, @var{alignment}) A C expression used to determine whether @code{store_by_pieces} will be ! used to set a chunk of memory to a constant value, or whether some other ! mechanism will be used. Used by @code{__builtin_memset} when storing ! values other than constant zero and by @code{__builtin_strcpy} when ! when called with a constant source string. Defaults to 1 if @code{move_by_pieces_ninsns} returns less than @code{MOVE_RATIO}. @end defmac --- 5897,5926 ---- than @code{CLEAR_RATIO}. @end defmac + @defmac SET_RATIO + The threshold of number of scalar move insns, @emph{below} which a sequence + of insns should be generated to set memory to a constant value, instead of + a block set insn or a library call. + Increasing the value will always make code faster, but + eventually incurs high cost in increased code size. + + If you don't define this, it defaults to the value of @code{MOVE_RATIO}. + @end defmac + + @defmac SET_BY_PIECES_P (@var{size}, @var{alignment}) + A C expression used to determine whether @code{store_by_pieces} will be + used to set a chunk of memory to a constant value, or whether some + other mechanism will be used. Used by @code{__builtin_memset} when + storing values other than constant zero. + Defaults to 1 if @code{move_by_pieces_ninsns} returns less + than @code{SET_RATIO}. + @end defmac + @defmac STORE_BY_PIECES_P (@var{size}, @var{alignment}) A C expression used to determine whether @code{store_by_pieces} will be ! 
used to set a chunk of memory to a constant string value, or whether some ! other mechanism will be used. Used by @code{__builtin_strcpy} when ! called with a constant source string. Defaults to 1 if @code{move_by_pieces_ninsns} returns less than @code{MOVE_RATIO}. @end defmac Index: gcc/expr.c =================================================================== *** gcc/expr.c (revision 127789) --- gcc/expr.c (working copy) *************** static bool float_extend_from_mem[NUM_MA *** 186,193 **** #endif /* This macro is used to determine whether store_by_pieces should be ! called to "memset" storage with byte values other than zero, or ! to "memcpy" storage when the source is a constant string. */ #ifndef STORE_BY_PIECES_P #define STORE_BY_PIECES_P(SIZE, ALIGN) \ (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES + 1) \ --- 186,200 ---- #endif /* This macro is used to determine whether store_by_pieces should be ! called to "memset" storage with byte values other than zero. */ ! #ifndef SET_BY_PIECES_P ! #define SET_BY_PIECES_P(SIZE, ALIGN) \ ! (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES + 1) \ ! < (unsigned int) SET_RATIO) ! #endif ! ! /* This macro is used to determine whether store_by_pieces should be ! called to "memcpy" storage when the source is a constant string. */ #ifndef STORE_BY_PIECES_P #define STORE_BY_PIECES_P(SIZE, ALIGN) \ (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES + 1) \ *************** use_group_regs (rtx *call_fusage, rtx re *** 2191,2203 **** /* Determine whether the LEN bytes generated by CONSTFUN can be stored to memory using several move instructions. CONSTFUNDATA is a pointer which will be passed as argument in every CONSTFUN call. ! ALIGN is maximum alignment we can assume. Return nonzero if a ! call to store_by_pieces should succeed. */ int can_store_by_pieces (unsigned HOST_WIDE_INT len, rtx (*constfun) (void *, HOST_WIDE_INT, enum machine_mode), ! 
void *constfundata, unsigned int align) { unsigned HOST_WIDE_INT l; unsigned int max_size; --- 2198,2211 ---- /* Determine whether the LEN bytes generated by CONSTFUN can be stored to memory using several move instructions. CONSTFUNDATA is a pointer which will be passed as argument in every CONSTFUN call. ! ALIGN is maximum alignment we can assume. MEMSETP is true if this is ! a memset operation and false if it's a copy of a constant string. ! Return nonzero if a call to store_by_pieces should succeed. */ int can_store_by_pieces (unsigned HOST_WIDE_INT len, rtx (*constfun) (void *, HOST_WIDE_INT, enum machine_mode), ! void *constfundata, unsigned int align, bool memsetp) { unsigned HOST_WIDE_INT l; unsigned int max_size; *************** can_store_by_pieces (unsigned HOST_WIDE_ *** 2210,2216 **** if (len == 0) return 1; ! if (! STORE_BY_PIECES_P (len, align)) return 0; tmode = mode_for_size (STORE_MAX_PIECES * BITS_PER_UNIT, MODE_INT, 1); --- 2218,2226 ---- if (len == 0) return 1; ! if (! (memsetp ! ? SET_BY_PIECES_P (len, align) ! : STORE_BY_PIECES_P (len, align))) return 0; tmode = mode_for_size (STORE_MAX_PIECES * BITS_PER_UNIT, MODE_INT, 1); *************** can_store_by_pieces (unsigned HOST_WIDE_ *** 2285,2291 **** /* Generate several move instructions to store LEN bytes generated by CONSTFUN to block TO. (A MEM rtx with BLKmode). CONSTFUNDATA is a pointer which will be passed as argument in every CONSTFUN call. ! ALIGN is maximum alignment we can assume. If ENDP is 0 return to, if ENDP is 1 return memory at the end ala mempcpy, and if ENDP is 2 return memory the end minus one byte ala stpcpy. */ --- 2295,2302 ---- /* Generate several move instructions to store LEN bytes generated by CONSTFUN to block TO. (A MEM rtx with BLKmode). CONSTFUNDATA is a pointer which will be passed as argument in every CONSTFUN call. ! ALIGN is maximum alignment we can assume. MEMSETP is true if this is ! a memset operation and false if it's a copy of a constant string. 
If ENDP is 0 return to, if ENDP is 1 return memory at the end ala mempcpy, and if ENDP is 2 return memory the end minus one byte ala stpcpy. */ *************** can_store_by_pieces (unsigned HOST_WIDE_ *** 2293,2299 **** rtx store_by_pieces (rtx to, unsigned HOST_WIDE_INT len, rtx (*constfun) (void *, HOST_WIDE_INT, enum machine_mode), ! void *constfundata, unsigned int align, int endp) { struct store_by_pieces data; --- 2304,2310 ---- rtx store_by_pieces (rtx to, unsigned HOST_WIDE_INT len, rtx (*constfun) (void *, HOST_WIDE_INT, enum machine_mode), ! void *constfundata, unsigned int align, bool memsetp, int endp) { struct store_by_pieces data; *************** store_by_pieces (rtx to, unsigned HOST_W *** 2303,2309 **** return to; } ! gcc_assert (STORE_BY_PIECES_P (len, align)); data.constfun = constfun; data.constfundata = constfundata; data.len = len; --- 2314,2322 ---- return to; } ! gcc_assert (memsetp ! ? SET_BY_PIECES_P (len, align) ! : STORE_BY_PIECES_P (len, align)); data.constfun = constfun; data.constfundata = constfundata; data.len = len; *************** store_expr (tree exp, rtx target, int ca *** 4498,4504 **** str_copy_len = MIN (str_copy_len, exp_len); if (!can_store_by_pieces (str_copy_len, builtin_strncpy_read_str, (void *) TREE_STRING_POINTER (exp), ! MEM_ALIGN (target))) goto normal_expr; dest_mem = target; --- 4511,4517 ---- str_copy_len = MIN (str_copy_len, exp_len); if (!can_store_by_pieces (str_copy_len, builtin_strncpy_read_str, (void *) TREE_STRING_POINTER (exp), ! MEM_ALIGN (target), false)) goto normal_expr; dest_mem = target; *************** store_expr (tree exp, rtx target, int ca *** 4507,4513 **** str_copy_len, builtin_strncpy_read_str, (void *) TREE_STRING_POINTER (exp), MEM_ALIGN (target), ! exp_len > str_copy_len ? 
1 : 0); if (exp_len > str_copy_len) clear_storage (dest_mem, GEN_INT (exp_len - str_copy_len), BLOCK_OP_NORMAL); --- 4520,4527 ---- str_copy_len, builtin_strncpy_read_str, (void *) TREE_STRING_POINTER (exp), MEM_ALIGN (target), ! false, ! exp_len > str_copy_len ? 1 : 0); if (exp_len > str_copy_len) clear_storage (dest_mem, GEN_INT (exp_len - str_copy_len), BLOCK_OP_NORMAL); Index: gcc/expr.h =================================================================== *** gcc/expr.h (revision 127789) --- gcc/expr.h (working copy) *************** enum expand_modifier {EXPAND_NORMAL = 0, *** 84,89 **** --- 84,96 ---- #define CLEAR_RATIO (optimize_size ? 3 : 15) #endif #endif + + /* If a memory set (to value other than zero) operation would take + SET_RATIO or more simple move-instruction sequences, we will do a movmem + or libcall instead. */ + #ifndef SET_RATIO + #define SET_RATIO MOVE_RATIO + #endif enum direction {none, upward, downward}; *************** extern int can_move_by_pieces (unsigned *** 444,463 **** CONSTFUN with several move instructions by store_by_pieces function. CONSTFUNDATA is a pointer which will be passed as argument in every CONSTFUN call. ! ALIGN is maximum alignment we can assume. */ extern int can_store_by_pieces (unsigned HOST_WIDE_INT, rtx (*) (void *, HOST_WIDE_INT, enum machine_mode), ! void *, unsigned int); /* Generate several move instructions to store LEN bytes generated by CONSTFUN to block TO. (A MEM rtx with BLKmode). CONSTFUNDATA is a pointer which will be passed as argument in every CONSTFUN call. ALIGN is maximum alignment we can assume. Returns TO + LEN. */ extern rtx store_by_pieces (rtx, unsigned HOST_WIDE_INT, rtx (*) (void *, HOST_WIDE_INT, enum machine_mode), ! void *, unsigned int, int); /* Emit insns to set X from Y. */ extern rtx emit_move_insn (rtx, rtx); --- 451,473 ---- CONSTFUN with several move instructions by store_by_pieces function. CONSTFUNDATA is a pointer which will be passed as argument in every CONSTFUN call. ! 
ALIGN is maximum alignment we can assume. ! MEMSETP is true if this is a real memset/bzero, not a copy ! of a const string. */ extern int can_store_by_pieces (unsigned HOST_WIDE_INT, rtx (*) (void *, HOST_WIDE_INT, enum machine_mode), ! void *, unsigned int, bool); /* Generate several move instructions to store LEN bytes generated by CONSTFUN to block TO. (A MEM rtx with BLKmode). CONSTFUNDATA is a pointer which will be passed as argument in every CONSTFUN call. ALIGN is maximum alignment we can assume. + MEMSETP is true if this is a real memset/bzero, not a copy. Returns TO + LEN. */ extern rtx store_by_pieces (rtx, unsigned HOST_WIDE_INT, rtx (*) (void *, HOST_WIDE_INT, enum machine_mode), ! void *, unsigned int, bool, int); /* Emit insns to set X from Y. */ extern rtx emit_move_insn (rtx, rtx); Index: gcc/builtins.c =================================================================== *** gcc/builtins.c (revision 127789) --- gcc/builtins.c (working copy) *************** expand_builtin_memcpy (tree exp, rtx tar *** 3331,3341 **** && GET_CODE (len_rtx) == CONST_INT && (unsigned HOST_WIDE_INT) INTVAL (len_rtx) <= strlen (src_str) + 1 && can_store_by_pieces (INTVAL (len_rtx), builtin_memcpy_read_str, ! (void *) src_str, dest_align)) { dest_mem = store_by_pieces (dest_mem, INTVAL (len_rtx), builtin_memcpy_read_str, ! (void *) src_str, dest_align, 0); dest_mem = force_operand (XEXP (dest_mem, 0), NULL_RTX); dest_mem = convert_memory_address (ptr_mode, dest_mem); return dest_mem; --- 3331,3341 ---- && GET_CODE (len_rtx) == CONST_INT && (unsigned HOST_WIDE_INT) INTVAL (len_rtx) <= strlen (src_str) + 1 && can_store_by_pieces (INTVAL (len_rtx), builtin_memcpy_read_str, ! (void *) src_str, dest_align, false)) { dest_mem = store_by_pieces (dest_mem, INTVAL (len_rtx), builtin_memcpy_read_str, ! 
(void *) src_str, dest_align, false, 0); dest_mem = force_operand (XEXP (dest_mem, 0), NULL_RTX); dest_mem = convert_memory_address (ptr_mode, dest_mem); return dest_mem; *************** expand_builtin_mempcpy_args (tree dest, *** 3444,3456 **** && GET_CODE (len_rtx) == CONST_INT && (unsigned HOST_WIDE_INT) INTVAL (len_rtx) <= strlen (src_str) + 1 && can_store_by_pieces (INTVAL (len_rtx), builtin_memcpy_read_str, ! (void *) src_str, dest_align)) { dest_mem = get_memory_rtx (dest, len); set_mem_align (dest_mem, dest_align); dest_mem = store_by_pieces (dest_mem, INTVAL (len_rtx), builtin_memcpy_read_str, ! (void *) src_str, dest_align, endp); dest_mem = force_operand (XEXP (dest_mem, 0), NULL_RTX); dest_mem = convert_memory_address (ptr_mode, dest_mem); return dest_mem; --- 3444,3457 ---- && GET_CODE (len_rtx) == CONST_INT && (unsigned HOST_WIDE_INT) INTVAL (len_rtx) <= strlen (src_str) + 1 && can_store_by_pieces (INTVAL (len_rtx), builtin_memcpy_read_str, ! (void *) src_str, dest_align, false)) { dest_mem = get_memory_rtx (dest, len); set_mem_align (dest_mem, dest_align); dest_mem = store_by_pieces (dest_mem, INTVAL (len_rtx), builtin_memcpy_read_str, ! (void *) src_str, dest_align, ! false, endp); dest_mem = force_operand (XEXP (dest_mem, 0), NULL_RTX); dest_mem = convert_memory_address (ptr_mode, dest_mem); return dest_mem; *************** expand_builtin_strncpy (tree exp, rtx ta *** 3792,3804 **** if (!p || dest_align == 0 || !host_integerp (len, 1) || !can_store_by_pieces (tree_low_cst (len, 1), builtin_strncpy_read_str, ! (void *) p, dest_align)) return NULL_RTX; dest_mem = get_memory_rtx (dest, len); store_by_pieces (dest_mem, tree_low_cst (len, 1), builtin_strncpy_read_str, ! 
(void *) p, dest_align, 0); dest_mem = force_operand (XEXP (dest_mem, 0), NULL_RTX); dest_mem = convert_memory_address (ptr_mode, dest_mem); return dest_mem; --- 3793,3805 ---- if (!p || dest_align == 0 || !host_integerp (len, 1) || !can_store_by_pieces (tree_low_cst (len, 1), builtin_strncpy_read_str, ! (void *) p, dest_align, false)) return NULL_RTX; dest_mem = get_memory_rtx (dest, len); store_by_pieces (dest_mem, tree_low_cst (len, 1), builtin_strncpy_read_str, ! (void *) p, dest_align, false, 0); dest_mem = force_operand (XEXP (dest_mem, 0), NULL_RTX); dest_mem = convert_memory_address (ptr_mode, dest_mem); return dest_mem; *************** expand_builtin_memset_args (tree dest, t *** 3926,3939 **** * We can't pass builtin_memset_gen_str as that emits RTL. */ c = 1; if (host_integerp (len, 1) - && !(optimize_size && tree_low_cst (len, 1) > 1) && can_store_by_pieces (tree_low_cst (len, 1), ! builtin_memset_read_str, &c, dest_align)) { val_rtx = force_reg (TYPE_MODE (unsigned_char_type_node), val_rtx); store_by_pieces (dest_mem, tree_low_cst (len, 1), ! builtin_memset_gen_str, val_rtx, dest_align, 0); } else if (!set_storage_via_setmem (dest_mem, len_rtx, val_rtx, dest_align, expected_align, --- 3927,3941 ---- * We can't pass builtin_memset_gen_str as that emits RTL. */ c = 1; if (host_integerp (len, 1) && can_store_by_pieces (tree_low_cst (len, 1), ! builtin_memset_read_str, &c, dest_align, ! true)) { val_rtx = force_reg (TYPE_MODE (unsigned_char_type_node), val_rtx); store_by_pieces (dest_mem, tree_low_cst (len, 1), ! builtin_memset_gen_str, val_rtx, dest_align, ! true, 0); } else if (!set_storage_via_setmem (dest_mem, len_rtx, val_rtx, dest_align, expected_align, *************** expand_builtin_memset_args (tree dest, t *** 3951,3961 **** if (c) { if (host_integerp (len, 1) - && !(optimize_size && tree_low_cst (len, 1) > 1) && can_store_by_pieces (tree_low_cst (len, 1), ! 
builtin_memset_read_str, &c, dest_align)) store_by_pieces (dest_mem, tree_low_cst (len, 1), ! builtin_memset_read_str, &c, dest_align, 0); else if (!set_storage_via_setmem (dest_mem, len_rtx, GEN_INT (c), dest_align, expected_align, expected_size)) --- 3953,3963 ---- if (c) { if (host_integerp (len, 1) && can_store_by_pieces (tree_low_cst (len, 1), ! builtin_memset_read_str, &c, dest_align, ! true)) store_by_pieces (dest_mem, tree_low_cst (len, 1), ! builtin_memset_read_str, &c, dest_align, true, 0); else if (!set_storage_via_setmem (dest_mem, len_rtx, GEN_INT (c), dest_align, expected_align, expected_size)) Index: gcc/value-prof.c =================================================================== *** gcc/value-prof.c (revision 127789) --- gcc/value-prof.c (working copy) *************** tree_stringops_transform (block_stmt_ite *** 1392,1404 **** case BUILT_IN_MEMSET: if (!can_store_by_pieces (val, builtin_memset_read_str, CALL_EXPR_ARG (call, 1), ! dest_align)) return false; break; case BUILT_IN_BZERO: if (!can_store_by_pieces (val, builtin_memset_read_str, integer_zero_node, ! dest_align)) return false; break; default: --- 1392,1404 ---- case BUILT_IN_MEMSET: if (!can_store_by_pieces (val, builtin_memset_read_str, CALL_EXPR_ARG (call, 1), ! dest_align, true)) return false; break; case BUILT_IN_BZERO: if (!can_store_by_pieces (val, builtin_memset_read_str, integer_zero_node, ! dest_align, true)) return false; break; default: Index: gcc/config/sh/sh.h =================================================================== *** gcc/config/sh/sh.h (revision 127789) --- gcc/config/sh/sh.h (working copy) *************** struct sh_args { *** 2184,2189 **** --- 2184,2191 ---- (move_by_pieces_ninsns (SIZE, ALIGN, STORE_MAX_PIECES + 1) \ < (TARGET_SMALLCODE ? 2 : ((ALIGN >= 32) ? 16 : 2))) + #define SET_BY_PIECES_P(SIZE, ALIGN) STORE_BY_PIECES_P(SIZE, ALIGN) + /* Macros to check register numbers against specific register classes. 
*/ /* These assume that REGNO is a hard or pseudo reg number. Index: gcc/config/s390/s390.h =================================================================== *** gcc/config/s390/s390.h (revision 127789) --- gcc/config/s390/s390.h (working copy) *************** extern struct rtx_def *s390_compare_op0, *** 803,812 **** || (TARGET_64BIT && (SIZE) == 8) ) /* This macro is used to determine whether store_by_pieces should be ! called to "memset" storage with byte values other than zero, or ! to "memcpy" storage when the source is a constant string. */ #define STORE_BY_PIECES_P(SIZE, ALIGN) MOVE_BY_PIECES_P (SIZE, ALIGN) /* Don't perform CSE on function addresses. */ #define NO_FUNCTION_CSE --- 803,815 ---- || (TARGET_64BIT && (SIZE) == 8) ) /* This macro is used to determine whether store_by_pieces should be ! called to "memcpy" storage when the source is a constant string. */ #define STORE_BY_PIECES_P(SIZE, ALIGN) MOVE_BY_PIECES_P (SIZE, ALIGN) + /* Likewise to decide whether to "memset" storage with byte values + other than zero. */ + #define SET_BY_PIECES_P(SIZE, ALIGN) STORE_BY_PIECES_P (SIZE, ALIGN) + /* Don't perform CSE on function addresses. */ #define NO_FUNCTION_CSE Index: gcc/config/mips/mips.opt =================================================================== *** gcc/config/mips/mips.opt (revision 127789) --- gcc/config/mips/mips.opt (working copy) *************** Target Report RejectNegative Mask(LONG64 *** 173,179 **** Use a 64-bit long type mmemcpy ! Target Report Var(TARGET_MEMCPY) Don't optimize block moves mmips-tfile --- 173,179 ---- Use a 64-bit long type mmemcpy ! 
Target Report Mask(MEMCPY) Don't optimize block moves mmips-tfile Index: gcc/config/mips/mips.c =================================================================== *** gcc/config/mips/mips.c (revision 127789) --- gcc/config/mips/mips.c (working copy) *************** override_options (void) *** 5323,5328 **** --- 5323,5333 ---- flag_delayed_branch = 0; } + /* Prefer a call to memcpy over inline code when optimizing for size, + though see MOVE_RATIO in mips.h. */ + if (optimize_size && (target_flags_explicit & MASK_MEMCPY) == 0) + target_flags |= MASK_MEMCPY; + #ifdef MIPS_TFMODE_FORMAT REAL_MODE_FORMAT (TFmode) = &MIPS_TFMODE_FORMAT; #endif Index: gcc/config/mips/mips.h =================================================================== *** gcc/config/mips/mips.h (revision 127789) --- gcc/config/mips/mips.h (working copy) *************** while (0) *** 2785,2790 **** --- 2785,2841 ---- #undef PTRDIFF_TYPE #define PTRDIFF_TYPE (POINTER_SIZE == 64 ? "long int" : "int") + + /* The base cost of a memcpy call, for MOVE_RATIO and friends. These + values were determined experimentally by benchmarking with CSiBE. + In theory, the call overhead is higher for TARGET_ABICALLS (especially + for o32 where we have to restore $gp afterwards as well as make an + indirect call), but in practice, bumping this up higher for + TARGET_ABICALLS doesn't make much difference to code size. */ + + #define MIPS_CALL_RATIO 8 + + /* Define MOVE_RATIO to encourage use of movmemsi when enabled, + since it should always generate code at least as good as + move_by_pieces(). But when inline movmemsi pattern is disabled + (i.e., with -mips16 or -mmemcpy), instead use a value approximating + the length of a memcpy call sequence, so that move_by_pieces will + generate inline code if it is shorter than a function call. + Since move_by_pieces_ninsns() counts memory-to-memory moves, but + we'll have to generate a load/store pair for each, halve the value of + MIPS_CALL_RATIO to take that into account. 
+ The default value for MOVE_RATIO when HAVE_movmemsi is true is 2. + There is no point to setting it to less than this to try to disable + move_by_pieces entirely, because that also disables some desirable + tree-level optimizations, specifically related to optimizing a + one-byte string copy into a simple move byte operation. */ + + #define MOVE_RATIO \ + ((TARGET_MIPS16 || TARGET_MEMCPY) ? MIPS_CALL_RATIO / 2 : 2) + + /* For CLEAR_RATIO, when optimizing for size, give a better estimate + of the length of a memset call, but use the default otherwise. */ + + #define CLEAR_RATIO \ + (optimize_size ? MIPS_CALL_RATIO : 15) + + /* This is similar to CLEAR_RATIO, but for a non-zero constant, so when + optimizing for size adjust the ratio to account for the overhead of + loading the constant and replicating it across the word. */ + + #define SET_RATIO \ + (optimize_size ? MIPS_CALL_RATIO - 2 : 15) + + /* STORE_BY_PIECES_P can be used when copying a constant string, but + in that case each word takes 3 insns (lui, ori, sw), or more in + 64-bit mode, instead of 2 (lw, sw). For now we always fail this + and let the move_by_pieces code copy the string from read-only + memory. In the future, this could be tuned further for multi-issue + CPUs that can issue stores down one pipe and arithmetic instructions + down another; in that case, the lui/ori/sw combination would be a + win for long enough strings. */ + + #define STORE_BY_PIECES_P(SIZE, ALIGN) 0 #ifndef __mips16 /* Since the bits of the _init and _fini function is spread across --------------040703060709000908050701--