From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 47575 invoked by alias); 10 Apr 2018 12:28:06 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 47178 invoked by uid 89); 10 Apr 2018 12:28:05 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_SHORT,SPF_PASS autolearn=ham version=3.3.2 spammy=
X-HELO: mx2.suse.de
Received: from mx2.suse.de (HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 10 Apr 2018 12:28:03 +0000
Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id BC968AC4E; Tue, 10 Apr 2018 12:28:00 +0000 (UTC)
Subject: Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).
To: Jakub Jelinek
Cc: Richard Biener, Uros Bizjak, gcc-patches@gcc.gnu.org, Marc Glisse, "H.J. Lu"
References: <4ca9c192-84f2-95ba-ffd7-1c9aa9be1dfd@suse.cz> <20180321103425.GJ8577@tucnak> <20180328143114.GK8577@tucnak> <20180328163652.GL8577@tucnak> <772b1171-2321-67d9-85e7-358a5cad0efa@suse.cz> <20180329122532.GP8577@tucnak> <17bbc039-e511-4fbe-d534-3d6d21aadc00@suse.cz> <2d812eaf-8ea0-68e8-089b-0c3d89a203d8@suse.cz> <20180410091915.GA8577@tucnak>
From: =?UTF-8?Q?Martin_Li=c5=a1ka?=
Message-ID:
Date: Tue, 10 Apr 2018 12:28:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0
MIME-Version: 1.0
In-Reply-To: <20180410091915.GA8577@tucnak>
Content-Type: multipart/mixed; boundary="------------12E6C572365E91A0C09E6020"
X-IsSubscribed: yes
X-SW-Source: 2018-04/txt/msg00449.txt.bz2

This is a multi-part message in MIME format.
--------------12E6C572365E91A0C09E6020
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Content-length: 2641

On 04/10/2018 11:19 AM, Jakub Jelinek wrote:
> On Mon, Apr 09, 2018 at 02:31:04PM +0200, Martin Liška wrote:
>> gcc/testsuite/ChangeLog:
>>
>> 2018-03-28  Martin Liska
>>
>> 	* gcc.dg/string-opt-1.c:
>
> I guess you really didn't mean to keep the above entry around, just the one
> below, right?

Sure, fixed.

>
>> gcc/testsuite/ChangeLog:
>>
>> 2018-03-14  Martin Liska
>>
>> 	* gcc.dg/string-opt-1.c: Adjust scans for i386 and glibc target
>> 	and others.
>
>> --- a/gcc/config.gcc
>> +++ b/gcc/config.gcc
>> @@ -1607,6 +1607,7 @@ x86_64-*-linux* | x86_64-*-kfreebsd*-gnu)
>>  	x86_64-*-linux*)
>>  		tm_file="${tm_file} linux.h linux-android.h i386/linux-common.h i386/linux64.h"
>>  		extra_options="${extra_options} linux-android.opt"
>> +		extra_objs="${extra_objs} x86-linux.o"
>>  		;;
>
> This should go into the i[34567]86-*-linux*) case too (outside of the
> if test x$enable_targets = xall; then conditional).
> Or maybe better, remove the above and do it in:
> i[34567]86-*-linux* | x86_64-*-linux*)
>         extra_objs="${extra_objs} cet.o"
>         tmake_file="$tmake_file i386/t-linux i386/t-cet"
>         ;;
> spot, just add x86-linux.o next to cet.o.

Done.

>
>> --- a/gcc/config/i386/linux.h
>> +++ b/gcc/config/i386/linux.h
>> @@ -24,3 +24,5 @@ along with GCC; see the file COPYING3.  If not see
>>  
>>  #undef MUSL_DYNAMIC_LINKER
>>  #define MUSL_DYNAMIC_LINKER "/lib/ld-musl-i386.so.1"
>> +
>> +#define SUBTARGET_LIBC_FUNC_SPEED ix86_linux_libc_func_speed
>> diff --git a/gcc/config/i386/linux64.h b/gcc/config/i386/linux64.h
>> index f2d913e30ac..d855f5cc239 100644
>> --- a/gcc/config/i386/linux64.h
>> +++ b/gcc/config/i386/linux64.h
>> @@ -37,3 +37,5 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>>  #define MUSL_DYNAMIC_LINKER64 "/lib/ld-musl-x86_64.so.1"
>>  #undef MUSL_DYNAMIC_LINKERX32
>>  #define MUSL_DYNAMIC_LINKERX32 "/lib/ld-musl-x32.so.1"
>> +
>> +#define SUBTARGET_LIBC_FUNC_SPEED ix86_linux_libc_func_speed
>
> And the above two changes should be replaced by a change in
> gcc/config/i386/linux-common.h.

Likewise.

>
>> +#include "coretypes.h"
>> +#include "cp/cp-tree.h"	/* This is why we're a separate module.  */
>
> Why do you need cp/cp-tree.h?  That is just too weird.
> The function just uses libc_speed (in coretypes.h), built_in_function
> (likewise), OPTION_GLIBC (config/linux.h).

I ended up with a minimal set of includes:

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "tree.h"

I'm retesting the patch.

Martin

>
> 	Jakub
>

--------------12E6C572365E91A0C09E6020
Content-Type: text/x-patch;
 name="0001-Introduce-new-libc_func_speed-target-hook-PR-middle-.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename*0="0001-Introduce-new-libc_func_speed-target-hook-PR-middle-.pa";
 filename*1="tch"
Content-length: 13606

>From bed35715063f9435b697eaf4c9868f81e8556de8 Mon Sep 17 00:00:00 2001
From: marxin
Date: Wed, 14 Mar 2018 09:44:18 +0100
Subject: [PATCH] Introduce new libc_func_speed target hook (PR middle-end/81657).

gcc/ChangeLog:

2018-03-14  Martin Liska

	PR middle-end/81657
	* builtins.c (expand_builtin_memory_copy_args): Handle situation
	when libc library provides a fast mempcpy implementation.
	* config/linux-protos.h (ix86_linux_libc_func_speed): New.
	(TARGET_LIBC_FUNC_SPEED): Likewise.
	* config/i386/linux-common.h (SUBTARGET_LIBC_FUNC_SPEED): Define
	macro.
	* config/i386/t-linux: Add x86-linux.o.
	* config.gcc: Likewise.
	* config/i386/x86-linux.c: New file.
	* coretypes.h (enum libc_speed): Likewise.
	* doc/tm.texi: Document new target hook.
	* doc/tm.texi.in: Likewise.
	* expr.c (emit_block_move_hints): Handle libc bail out argument.
	* expr.h (emit_block_move_hints): Add new parameters.
	* target.def: Add new hook.
	* targhooks.c (enum libc_speed): New enum.
	(default_libc_func_speed): Provide a default hook
	implementation.
	* targhooks.h (default_libc_func_speed): Likewise.

gcc/testsuite/ChangeLog:

2018-03-14  Martin Liska

	* gcc.dg/string-opt-1.c: Adjust scans for i386 and glibc target
	and others.
---
 gcc/builtins.c                      | 15 ++++++++++-
 gcc/config.gcc                      |  2 +-
 gcc/config/i386/i386.c              |  5 ++++
 gcc/config/i386/linux-common.h      |  2 ++
 gcc/config/i386/t-linux             |  6 +++++
 gcc/config/i386/x86-linux.c         | 52 +++++++++++++++++++++++++++++++++++++
 gcc/config/linux-protos.h           |  1 +
 gcc/coretypes.h                     |  7 +++++
 gcc/doc/tm.texi                     |  4 +++
 gcc/doc/tm.texi.in                  |  1 +
 gcc/expr.c                          | 11 +++++++-
 gcc/expr.h                          |  3 ++-
 gcc/target.def                      |  7 +++++
 gcc/targhooks.c                     |  9 +++++++
 gcc/targhooks.h                     |  1 +
 gcc/testsuite/gcc.dg/string-opt-1.c |  5 ++--
 16 files changed, 125 insertions(+), 6 deletions(-)
 create mode 100644 gcc/config/i386/x86-linux.c

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 487d9d58db2..98ee3fb272d 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -3651,13 +3651,26 @@ expand_builtin_memory_copy_args (tree dest, tree src, tree len,
   src_mem = get_memory_rtx (src, len);
   set_mem_align (src_mem, src_align);
 
+  /* emit_block_move_hints can generate a library call to memcpy function.
+     In situations when a libc library provides fast implementation
+     of mempcpy, then it's better to call mempcpy directly.  */
+  bool avoid_libcall
+    = (endp == 1
+       && targetm.libc_func_speed ((int)BUILT_IN_MEMPCPY) == LIBC_FAST_SPEED
+       && target != const0_rtx);
+
   /* Copy word part most expediently.  */
+  bool libcall_avoided = false;
   dest_addr = emit_block_move_hints (dest_mem, src_mem, len_rtx,
                                      CALL_EXPR_TAILCALL (exp)
                                      && (endp == 0 || target == const0_rtx)
                                      ? BLOCK_OP_TAILCALL : BLOCK_OP_NORMAL,
                                      expected_align, expected_size,
-                                     min_size, max_size, probable_max_size);
+                                     min_size, max_size, probable_max_size,
+                                     avoid_libcall ? &libcall_avoided : NULL);
+
+  if (libcall_avoided)
+    return NULL_RTX;
 
   if (dest_addr == 0)
     {
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 1b58c060a92..7fe43856b6a 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4617,7 +4617,7 @@ case ${target} in
 	i[34567]86-*-darwin* | x86_64-*-darwin*)
 		;;
 	i[34567]86-*-linux* | x86_64-*-linux*)
-		extra_objs="${extra_objs} cet.o"
+		extra_objs="${extra_objs} cet.o x86-linux.o"
 		tmake_file="$tmake_file i386/t-linux i386/t-cet"
 		;;
 	i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b4f6aec1434..2471ff7b99a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -52105,6 +52105,11 @@ ix86_run_selftests (void)
 #undef TARGET_WARN_PARAMETER_PASSING_ABI
 #define TARGET_WARN_PARAMETER_PASSING_ABI ix86_warn_parameter_passing_abi
 
+#ifdef SUBTARGET_LIBC_FUNC_SPEED
+#undef TARGET_LIBC_FUNC_SPEED
+#define TARGET_LIBC_FUNC_SPEED SUBTARGET_LIBC_FUNC_SPEED
+#endif
+
 #if CHECKING_P
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::ix86_run_selftests
diff --git a/gcc/config/i386/linux-common.h b/gcc/config/i386/linux-common.h
index d877387021b..1b48c15e5c0 100644
--- a/gcc/config/i386/linux-common.h
+++ b/gcc/config/i386/linux-common.h
@@ -126,3 +126,5 @@ extern void file_end_indicate_exec_stack_and_cet (void);
 
 #undef TARGET_ASM_FILE_END
 #define TARGET_ASM_FILE_END file_end_indicate_exec_stack_and_cet
+
+#define SUBTARGET_LIBC_FUNC_SPEED ix86_linux_libc_func_speed
diff --git a/gcc/config/i386/t-linux b/gcc/config/i386/t-linux
index 155314c08a7..6e3ebe94fe8 100644
--- a/gcc/config/i386/t-linux
+++ b/gcc/config/i386/t-linux
@@ -1 +1,7 @@
 MULTIARCH_DIRNAME = $(call if_multiarch,i386-linux-gnu)
+
+x86-linux.o: $(srcdir)/config/i386/x86-linux.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
+  $(TM_H) $(RTL_H) $(REGS_H) hard-reg-set.h output.h $(TREE_H) flags.h \
+  $(TM_P_H) $(HASHTAB_H) $(GGC_H)
+	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
+	  $(srcdir)/config/i386/x86-linux.c
diff --git a/gcc/config/i386/x86-linux.c b/gcc/config/i386/x86-linux.c
new file mode 100644
index 00000000000..5e4331f635a
--- /dev/null
+++ b/gcc/config/i386/x86-linux.c
@@ -0,0 +1,52 @@
+/* Implementation for linux-specific functions for i386 and x86-64 systems.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   Contributed by Martin Liska .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+#define IN_TARGET_CODE 1
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "tree.h"
+
+/* This hook determines whether a function from libc has a fast implementation
+   FN is present at the runtime.  We override it for i386 and glibc C library
+   as this combination provides fast implementation of mempcpy function.  */
+
+enum libc_speed
+ix86_linux_libc_func_speed (int fn)
+{
+  enum built_in_function f = (built_in_function)fn;
+
+  if (!OPTION_GLIBC)
+    return LIBC_UNKNOWN_SPEED;
+
+  switch (f)
+    {
+    case BUILT_IN_MEMPCPY:
+      return LIBC_FAST_SPEED;
+    default:
+      return LIBC_UNKNOWN_SPEED;
+    }
+}
diff --git a/gcc/config/linux-protos.h b/gcc/config/linux-protos.h
index 9da8dd7ecaa..b7284735366 100644
--- a/gcc/config/linux-protos.h
+++ b/gcc/config/linux-protos.h
@@ -20,3 +20,4 @@ along with GCC; see the file COPYING3.  If not see
 extern bool linux_has_ifunc_p (void);
 
 extern bool linux_libc_has_function (enum function_class fn_class);
+extern enum libc_speed ix86_linux_libc_func_speed (int fn);
diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index 283b4eb33fe..fe618f708f4 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -384,6 +384,13 @@ enum excess_precision_type
   EXCESS_PRECISION_TYPE_FAST
 };
 
+enum libc_speed
+{
+  LIBC_FAST_SPEED,
+  LIBC_SLOW_SPEED,
+  LIBC_UNKNOWN_SPEED
+};
+
 /* Support for user-provided GGC and PCH markers.  The first parameter
    is a pointer to a pointer, the second a cookie.  */
 typedef void (*gt_pointer_operator) (void *, void *);
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index bd8b917ba82..0f7c91a22c4 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -5501,6 +5501,10 @@ macro, a reasonable default is used.
 This hook determines whether a function from a class of functions
 @var{fn_class} is present at the runtime.
 @end deftypefn
+@deftypefn {Target Hook} libc_speed TARGET_LIBC_FUNC_SPEED (int @var{fn})
+This hook determines whether a function from libc has a fast implementation
+@var{fn} is present at the runtime.
+@end deftypefn
 
 @defmac NEXT_OBJC_RUNTIME
 Set this macro to 1 to use the "NeXT" Objective-C message sending conventions
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index b0207146e8c..4bb2998a8a1 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -3933,6 +3933,7 @@ macro, a reasonable default is used.
 @end defmac
 
 @hook TARGET_LIBC_HAS_FUNCTION
+@hook TARGET_LIBC_FUNC_SPEED
 
 @defmac NEXT_OBJC_RUNTIME
 Set this macro to 1 to use the "NeXT" Objective-C message sending conventions
diff --git a/gcc/expr.c b/gcc/expr.c
index 00660293f72..f3bd698bc4d 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -1554,6 +1554,8 @@ compare_by_pieces (rtx arg0, rtx arg1, unsigned HOST_WIDE_INT len,
    MIN_SIZE is the minimal size of block to move
    MAX_SIZE is the maximal size of block to move, if it can not be represented
    in unsigned HOST_WIDE_INT, than it is mask of all ones.
+   If BAIL_OUT_LIBCALL is non-null, do not emit library call and assign
+   true to the pointer when move is not done.
 
    Return the address of the new block, if memcpy is called and returns it,
    0 otherwise.  */
@@ -1563,7 +1565,8 @@ emit_block_move_hints (rtx x, rtx y, rtx size, enum block_op_methods method,
                        unsigned int expected_align, HOST_WIDE_INT expected_size,
                        unsigned HOST_WIDE_INT min_size,
                        unsigned HOST_WIDE_INT max_size,
-                       unsigned HOST_WIDE_INT probable_max_size)
+                       unsigned HOST_WIDE_INT probable_max_size,
+                       bool *bail_out_libcall)
 {
   bool may_use_call;
   rtx retval = 0;
@@ -1625,6 +1628,12 @@ emit_block_move_hints (rtx x, rtx y, rtx size, enum block_op_methods method,
       && ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (x))
       && ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (y)))
     {
+      if (bail_out_libcall)
+        {
+          *bail_out_libcall = true;
+          return retval;
+        }
+
       /* Since x and y are passed to a libcall, mark the corresponding
          tree EXPR as addressable.  */
       tree y_expr = MEM_EXPR (y);
diff --git a/gcc/expr.h b/gcc/expr.h
index b3d523bcb24..c2bf87fd14e 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -110,7 +110,8 @@ extern rtx emit_block_move_hints (rtx, rtx, rtx, enum block_op_methods,
                                   unsigned int, HOST_WIDE_INT,
                                   unsigned HOST_WIDE_INT,
                                   unsigned HOST_WIDE_INT,
-                                  unsigned HOST_WIDE_INT);
+                                  unsigned HOST_WIDE_INT,
+                                  bool *bail_out_libcall = NULL);
 extern rtx emit_block_cmp_hints (rtx, rtx, rtx, tree, rtx, bool,
                                  by_pieces_constfn, void *);
 extern bool emit_storent_insn (rtx to, rtx from);
diff --git a/gcc/target.def b/gcc/target.def
index c5b2a1e7e71..3bbddc82776 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2639,6 +2639,13 @@ DEFHOOK
  bool, (enum function_class fn_class),
  default_libc_has_function)
 
+DEFHOOK
+(libc_func_speed,
+ "This hook determines whether a function from libc has a fast implementation\n\
+@var{fn} is present at the runtime.",
+ libc_speed, (int fn),
+ default_libc_func_speed)
+
 /* True if new jumps cannot be created, to replace existing ones or not,
    at the current point in the compilation.  */
 DEFHOOK
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index fafcc6c5196..6e44f6f79cf 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1642,6 +1642,15 @@ no_c99_libc_has_function (enum function_class fn_class ATTRIBUTE_UNUSED)
   return false;
 }
 
+/* This hook determines whether a function from libc has a fast implementation
+   FN is present at the runtime.  */
+
+enum libc_speed
+default_libc_func_speed (int)
+{
+  return LIBC_UNKNOWN_SPEED;
+}
+
 tree
 default_builtin_tm_load_store (tree ARG_UNUSED (type))
 {
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 8a4393f2ba4..7508673ad0a 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -205,6 +205,7 @@ extern bool default_have_conditional_execution (void);
 extern bool default_libc_has_function (enum function_class);
 extern bool no_c99_libc_has_function (enum function_class);
 extern bool gnu_libc_has_function (enum function_class);
+extern enum libc_speed default_libc_func_speed (int);
 
 extern tree default_builtin_tm_load_store (tree);
 
diff --git a/gcc/testsuite/gcc.dg/string-opt-1.c b/gcc/testsuite/gcc.dg/string-opt-1.c
index 2f060732bf0..7faaadcbb1f 100644
--- a/gcc/testsuite/gcc.dg/string-opt-1.c
+++ b/gcc/testsuite/gcc.dg/string-opt-1.c
@@ -48,5 +48,6 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-assembler-not "\" } } */
-/* { dg-final { scan-assembler "memcpy" } } */
+/* { dg-final { scan-assembler-not "\" { target { ! { i?86-*-gnu* x86_64-*-gnu* i?86-*-linux* x86_64-*-linux* } } } } } */
+/* { dg-final { scan-assembler "memcpy" { target { ! { i?86-*-gnu* x86_64-*-gnu* i?86-*-linux* x86_64-*-linux* } } } } } */
+/* { dg-final { scan-assembler "mempcpy" { target { i?86-*-gnu* x86_64-*-gnu* i?86-*-linux* x86_64-*-linux* } } } } */
-- 
2.16.3


--------------12E6C572365E91A0C09E6020--