From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 127228 invoked by alias); 9 Apr 2018 12:31:12 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 127218 invoked by uid 89); 9 Apr 2018 12:31:11 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_SHORT,SPF_PASS autolearn=ham version=3.3.2 spammy=Document, 1107, H*F:D*cz X-HELO: mx2.suse.de Received: from mx2.suse.de (HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 09 Apr 2018 12:31:08 +0000 Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 2C0C5AEF8; Mon, 9 Apr 2018 12:31:06 +0000 (UTC) Subject: Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657). From: =?UTF-8?Q?Martin_Li=c5=a1ka?= To: Jakub Jelinek Cc: Richard Biener , Uros Bizjak , gcc-patches@gcc.gnu.org, Marc Glisse , "H.J. 
Lu" References: <20180313152359.GB8577@tucnak> <2cc467cb-569a-88be-0f91-b6f389415ffd@suse.cz> <20180314130754.GH8577@tucnak> <4ca9c192-84f2-95ba-ffd7-1c9aa9be1dfd@suse.cz> <20180321103425.GJ8577@tucnak> <20180328143114.GK8577@tucnak> <20180328163652.GL8577@tucnak> <772b1171-2321-67d9-85e7-358a5cad0efa@suse.cz> <20180329122532.GP8577@tucnak> <17bbc039-e511-4fbe-d534-3d6d21aadc00@suse.cz> Message-ID: <2d812eaf-8ea0-68e8-089b-0c3d89a203d8@suse.cz> Date: Mon, 09 Apr 2018 12:31:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: <17bbc039-e511-4fbe-d534-3d6d21aadc00@suse.cz> Content-Type: multipart/mixed; boundary="------------DC2E70E2053CBCC1AD56842C" X-IsSubscribed: yes X-SW-Source: 2018-04/txt/msg00387.txt.bz2 This is a multi-part message in MIME format. --------------DC2E70E2053CBCC1AD56842C Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Content-length: 152 Hi. I'm sending an updated version of the patch. The patch bootstraps on ppc64le-redhat-linux and x86_64-linux and survives regression tests. Martin --------------DC2E70E2053CBCC1AD56842C Content-Type: text/x-patch; name="0001-Introduce-new-libc_func_speed-target-hook-PR-middle-.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename*0="0001-Introduce-new-libc_func_speed-target-hook-PR-middle-.pa"; filename*1="tch" Content-length: 14329 >From 6d19b1bf0c28a683e1458e16943e34bb547d36d6 Mon Sep 17 00:00:00 2001 From: marxin Date: Wed, 14 Mar 2018 09:44:18 +0100 Subject: [PATCH] Introduce new libc_func_speed target hook (PR middle-end/81657). gcc/ChangeLog: 2018-03-14 Martin Liska PR middle-end/81657 * builtins.c (expand_builtin_memory_copy_args): Handle situation when libc library provides a fast mempcpy implementation. * config/linux-protos.h (ix86_linux_libc_func_speed): New. (TARGET_LIBC_FUNC_SPEED): Likewise. * config/i386/linux64.h (SUBTARGET_LIBC_FUNC_SPEED): Define macro. 
* config/i386/linux.h (SUBTARGET_LIBC_FUNC_SPEED): Likewise. * config/i386/t-linux: Add x86-linux.o. * config.gcc: Likewise. * config/i386/x86-linux.c: New file. * coretypes.h (enum libc_speed): Likewise. * doc/tm.texi: Document new target hook. * doc/tm.texi.in: Likewise. * expr.c (emit_block_move_hints): Handle libc bail out argument. * expr.h (emit_block_move_hints): Add new parameters. * target.def: Add new hook. * targhooks.c (enum libc_speed): New enum. (default_libc_func_speed): Provide a default hook implementation. * targhooks.h (default_libc_func_speed): Likewise. gcc/testsuite/ChangeLog: 2018-03-28 Martin Liska * gcc.dg/string-opt-1.c: gcc/testsuite/ChangeLog: 2018-03-14 Martin Liska * gcc.dg/string-opt-1.c: Adjust scans for i386 and glibc target and others. --- gcc/builtins.c | 15 ++++++++++- gcc/config.gcc | 1 + gcc/config/i386/i386.c | 5 ++++ gcc/config/i386/linux.h | 2 ++ gcc/config/i386/linux64.h | 2 ++ gcc/config/i386/t-linux | 6 +++++ gcc/config/i386/x86-linux.c | 54 +++++++++++++++++++++++++++++++++++++ gcc/config/linux-protos.h | 1 + gcc/coretypes.h | 7 +++++ gcc/doc/tm.texi | 4 +++ gcc/doc/tm.texi.in | 1 + gcc/expr.c | 11 +++++++- gcc/expr.h | 3 ++- gcc/target.def | 7 +++++ gcc/targhooks.c | 9 +++++++ gcc/targhooks.h | 1 + gcc/testsuite/gcc.dg/string-opt-1.c | 5 ++-- 17 files changed, 129 insertions(+), 5 deletions(-) create mode 100644 gcc/config/i386/x86-linux.c diff --git a/gcc/builtins.c b/gcc/builtins.c index 487d9d58db2..98ee3fb272d 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -3651,13 +3651,26 @@ expand_builtin_memory_copy_args (tree dest, tree src, tree len, src_mem = get_memory_rtx (src, len); set_mem_align (src_mem, src_align); + /* emit_block_move_hints can generate a library call to memcpy function. + In situations when a libc library provides fast implementation + of mempcpy, then it's better to call mempcpy directly. 
*/ + bool avoid_libcall + = (endp == 1 + && targetm.libc_func_speed ((int)BUILT_IN_MEMPCPY) == LIBC_FAST_SPEED + && target != const0_rtx); + /* Copy word part most expediently. */ + bool libcall_avoided = false; dest_addr = emit_block_move_hints (dest_mem, src_mem, len_rtx, CALL_EXPR_TAILCALL (exp) && (endp == 0 || target == const0_rtx) ? BLOCK_OP_TAILCALL : BLOCK_OP_NORMAL, expected_align, expected_size, - min_size, max_size, probable_max_size); + min_size, max_size, probable_max_size, + avoid_libcall ? &libcall_avoided : NULL); + + if (libcall_avoided) + return NULL_RTX; if (dest_addr == 0) { diff --git a/gcc/config.gcc b/gcc/config.gcc index 1b58c060a92..6445ff569b3 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -1607,6 +1607,7 @@ x86_64-*-linux* | x86_64-*-kfreebsd*-gnu) x86_64-*-linux*) tm_file="${tm_file} linux.h linux-android.h i386/linux-common.h i386/linux64.h" extra_options="${extra_options} linux-android.opt" + extra_objs="${extra_objs} x86-linux.o" ;; x86_64-*-kfreebsd*-gnu) tm_file="${tm_file} kfreebsd-gnu.h i386/kfreebsd-gnu64.h" diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index b4f6aec1434..2471ff7b99a 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -52105,6 +52105,11 @@ ix86_run_selftests (void) #undef TARGET_WARN_PARAMETER_PASSING_ABI #define TARGET_WARN_PARAMETER_PASSING_ABI ix86_warn_parameter_passing_abi +#ifdef SUBTARGET_LIBC_FUNC_SPEED +#undef TARGET_LIBC_FUNC_SPEED +#define TARGET_LIBC_FUNC_SPEED SUBTARGET_LIBC_FUNC_SPEED +#endif + #if CHECKING_P #undef TARGET_RUN_TARGET_SELFTESTS #define TARGET_RUN_TARGET_SELFTESTS selftest::ix86_run_selftests diff --git a/gcc/config/i386/linux.h b/gcc/config/i386/linux.h index 69f97f15b0d..6c59cbd6d62 100644 --- a/gcc/config/i386/linux.h +++ b/gcc/config/i386/linux.h @@ -24,3 +24,5 @@ along with GCC; see the file COPYING3. 
If not see #undef MUSL_DYNAMIC_LINKER #define MUSL_DYNAMIC_LINKER "/lib/ld-musl-i386.so.1" + +#define SUBTARGET_LIBC_FUNC_SPEED ix86_linux_libc_func_speed diff --git a/gcc/config/i386/linux64.h b/gcc/config/i386/linux64.h index f2d913e30ac..d855f5cc239 100644 --- a/gcc/config/i386/linux64.h +++ b/gcc/config/i386/linux64.h @@ -37,3 +37,5 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #define MUSL_DYNAMIC_LINKER64 "/lib/ld-musl-x86_64.so.1" #undef MUSL_DYNAMIC_LINKERX32 #define MUSL_DYNAMIC_LINKERX32 "/lib/ld-musl-x32.so.1" + +#define SUBTARGET_LIBC_FUNC_SPEED ix86_linux_libc_func_speed diff --git a/gcc/config/i386/t-linux b/gcc/config/i386/t-linux index 155314c08a7..6e3ebe94fe8 100644 --- a/gcc/config/i386/t-linux +++ b/gcc/config/i386/t-linux @@ -1 +1,7 @@ MULTIARCH_DIRNAME = $(call if_multiarch,i386-linux-gnu) + +x86-linux.o: $(srcdir)/config/i386/x86-linux.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ + $(TM_H) $(RTL_H) $(REGS_H) hard-reg-set.h output.h $(TREE_H) flags.h \ + $(TM_P_H) $(HASHTAB_H) $(GGC_H) + $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ + $(srcdir)/config/i386/x86-linux.c diff --git a/gcc/config/i386/x86-linux.c b/gcc/config/i386/x86-linux.c new file mode 100644 index 00000000000..71291d18a71 --- /dev/null +++ b/gcc/config/i386/x86-linux.c @@ -0,0 +1,54 @@ +/* Implementation for linux-specific functions for i386 and x86-64 systems. + Copyright (C) 2018 Free Software Foundation, Inc. + Contributed by Martin Liska . + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. 
+ +Under Section 7 of GPL version 3, you are granted additional +permissions described in the GCC Runtime Library Exception, version +3.1, as published by the Free Software Foundation. + +You should have received a copy of the GNU General Public License and +a copy of the GCC Runtime Library Exception along with this program; +see the files COPYING3 and COPYING.RUNTIME respectively. If not, see +. */ + +#define IN_TARGET_CODE 1 + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "cp/cp-tree.h" /* This is why we're a separate module. */ +#include "stringpool.h" +#include "attribs.h" + +/* This hook determines whether a function from libc has a fast implementation + FN is present at the runtime. We override it for i386 and glibc C library + as this combination provides fast implementation of mempcpy function. */ + +enum libc_speed +ix86_linux_libc_func_speed (int fn) +{ + enum built_in_function f = (built_in_function)fn; + + if (!OPTION_GLIBC) + return LIBC_UNKNOWN_SPEED; + + switch (f) + { + case BUILT_IN_MEMPCPY: + return LIBC_FAST_SPEED; + default: + return LIBC_UNKNOWN_SPEED; + } +} diff --git a/gcc/config/linux-protos.h b/gcc/config/linux-protos.h index 9da8dd7ecaa..b7284735366 100644 --- a/gcc/config/linux-protos.h +++ b/gcc/config/linux-protos.h @@ -20,3 +20,4 @@ along with GCC; see the file COPYING3. If not see extern bool linux_has_ifunc_p (void); extern bool linux_libc_has_function (enum function_class fn_class); +extern enum libc_speed ix86_linux_libc_func_speed (int fn); diff --git a/gcc/coretypes.h b/gcc/coretypes.h index 283b4eb33fe..fe618f708f4 100644 --- a/gcc/coretypes.h +++ b/gcc/coretypes.h @@ -384,6 +384,13 @@ enum excess_precision_type EXCESS_PRECISION_TYPE_FAST }; +enum libc_speed +{ + LIBC_FAST_SPEED, + LIBC_SLOW_SPEED, + LIBC_UNKNOWN_SPEED +}; + /* Support for user-provided GGC and PCH markers. The first parameter is a pointer to a pointer, the second a cookie. 
*/ typedef void (*gt_pointer_operator) (void *, void *); diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index bd8b917ba82..0f7c91a22c4 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -5501,6 +5501,10 @@ macro, a reasonable default is used. This hook determines whether a function from a class of functions @var{fn_class} is present at the runtime. @end deftypefn +@deftypefn {Target Hook} libc_speed TARGET_LIBC_FUNC_SPEED (int @var{fn}) +This hook determines whether a function from libc has a fast implementation +@var{fn} is present at the runtime. +@end deftypefn @defmac NEXT_OBJC_RUNTIME Set this macro to 1 to use the "NeXT" Objective-C message sending conventions diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index b0207146e8c..4bb2998a8a1 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -3933,6 +3933,7 @@ macro, a reasonable default is used. @end defmac @hook TARGET_LIBC_HAS_FUNCTION +@hook TARGET_LIBC_FUNC_SPEED @defmac NEXT_OBJC_RUNTIME Set this macro to 1 to use the "NeXT" Objective-C message sending conventions diff --git a/gcc/expr.c b/gcc/expr.c index 00660293f72..f3bd698bc4d 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -1554,6 +1554,8 @@ compare_by_pieces (rtx arg0, rtx arg1, unsigned HOST_WIDE_INT len, MIN_SIZE is the minimal size of block to move MAX_SIZE is the maximal size of block to move, if it can not be represented in unsigned HOST_WIDE_INT, than it is mask of all ones. + If BAIL_OUT_LIBCALL is non-null, do not emit library call and assign + true to the pointer when move is not done. Return the address of the new block, if memcpy is called and returns it, 0 otherwise. 
*/ @@ -1563,7 +1565,8 @@ emit_block_move_hints (rtx x, rtx y, rtx size, enum block_op_methods method, unsigned int expected_align, HOST_WIDE_INT expected_size, unsigned HOST_WIDE_INT min_size, unsigned HOST_WIDE_INT max_size, - unsigned HOST_WIDE_INT probable_max_size) + unsigned HOST_WIDE_INT probable_max_size, + bool *bail_out_libcall) { bool may_use_call; rtx retval = 0; @@ -1625,6 +1628,12 @@ emit_block_move_hints (rtx x, rtx y, rtx size, enum block_op_methods method, && ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (x)) && ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (y))) { + if (bail_out_libcall) + { + *bail_out_libcall = true; + return retval; + } + /* Since x and y are passed to a libcall, mark the corresponding tree EXPR as addressable. */ tree y_expr = MEM_EXPR (y); diff --git a/gcc/expr.h b/gcc/expr.h index b3d523bcb24..c2bf87fd14e 100644 --- a/gcc/expr.h +++ b/gcc/expr.h @@ -110,7 +110,8 @@ extern rtx emit_block_move_hints (rtx, rtx, rtx, enum block_op_methods, unsigned int, HOST_WIDE_INT, unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT, - unsigned HOST_WIDE_INT); + unsigned HOST_WIDE_INT, + bool *bail_out_libcall = NULL); extern rtx emit_block_cmp_hints (rtx, rtx, rtx, tree, rtx, bool, by_pieces_constfn, void *); extern bool emit_storent_insn (rtx to, rtx from); diff --git a/gcc/target.def b/gcc/target.def index c5b2a1e7e71..3bbddc82776 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -2639,6 +2639,13 @@ DEFHOOK bool, (enum function_class fn_class), default_libc_has_function) +DEFHOOK +(libc_func_speed, + "This hook determines whether a function from libc has a fast implementation\n\ +@var{fn} is present at the runtime.", + libc_speed, (int fn), + default_libc_func_speed) + /* True if new jumps cannot be created, to replace existing ones or not, at the current point in the compilation. 
*/ DEFHOOK diff --git a/gcc/targhooks.c b/gcc/targhooks.c index fafcc6c5196..6e44f6f79cf 100644 --- a/gcc/targhooks.c +++ b/gcc/targhooks.c @@ -1642,6 +1642,15 @@ no_c99_libc_has_function (enum function_class fn_class ATTRIBUTE_UNUSED) return false; } +/* This hook determines whether a function from libc has a fast implementation + FN is present at the runtime. */ + +enum libc_speed +default_libc_func_speed (int) +{ + return LIBC_UNKNOWN_SPEED; +} + tree default_builtin_tm_load_store (tree ARG_UNUSED (type)) { diff --git a/gcc/targhooks.h b/gcc/targhooks.h index 8a4393f2ba4..7508673ad0a 100644 --- a/gcc/targhooks.h +++ b/gcc/targhooks.h @@ -205,6 +205,7 @@ extern bool default_have_conditional_execution (void); extern bool default_libc_has_function (enum function_class); extern bool no_c99_libc_has_function (enum function_class); extern bool gnu_libc_has_function (enum function_class); +extern enum libc_speed default_libc_func_speed (int); extern tree default_builtin_tm_load_store (tree); diff --git a/gcc/testsuite/gcc.dg/string-opt-1.c b/gcc/testsuite/gcc.dg/string-opt-1.c index 2f060732bf0..7faaadcbb1f 100644 --- a/gcc/testsuite/gcc.dg/string-opt-1.c +++ b/gcc/testsuite/gcc.dg/string-opt-1.c @@ -48,5 +48,6 @@ main (void) return 0; } -/* { dg-final { scan-assembler-not "\" } } */ -/* { dg-final { scan-assembler "memcpy" } } */ +/* { dg-final { scan-assembler-not "\" { target { ! { i?86-*-gnu* x86_64-*-gnu* i?86-*-linux* x86_64-*-linux* } } } } } */ +/* { dg-final { scan-assembler "memcpy" { target { ! { i?86-*-gnu* x86_64-*-gnu* i?86-*-linux* x86_64-*-linux* } } } } } */ +/* { dg-final { scan-assembler "mempcpy" { target { i?86-*-gnu* x86_64-*-gnu* i?86-*-linux* x86_64-*-linux* } } } } */ -- 2.16.3 --------------DC2E70E2053CBCC1AD56842C--
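For readers of the archive, the decision the patch adds to expand_builtin_memory_copy_args can be sketched outside of GCC roughly as below. This is only an illustration, not code from the patch: every toy_* name is hypothetical, the OPTION_GLIBC check and the builtin code are reduced to plain int flags, and "result_used" stands in for the "target != const0_rtx" test. The last function shows the extra pointer addition that is needed when mempcpy is lowered to a memcpy call, which is the cost the patch tries to avoid when libc is reported to have a fast mempcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same three values the patch adds to coretypes.h.  */
enum libc_speed
{
  LIBC_FAST_SPEED,
  LIBC_SLOW_SPEED,
  LIBC_UNKNOWN_SPEED
};

/* Hypothetical stand-in for the target hook.  The hook in the patch checks
   OPTION_GLIBC and the builtin code; here both are plain flags.  */
static enum libc_speed
toy_libc_func_speed (int is_glibc, int is_mempcpy)
{
  if (is_glibc && is_mempcpy)
    return LIBC_FAST_SPEED;
  return LIBC_UNKNOWN_SPEED;
}

/* Models the avoid_libcall condition from the patch: endp == 1 means the
   caller wants the end pointer (mempcpy semantics), and result_used is 0
   when the return value is ignored.  Returns 1 when the memcpy libcall
   should be skipped, 0 otherwise.  */
static int
toy_avoid_memcpy_libcall (int endp, int result_used, enum libc_speed speed)
{
  return endp == 1 && speed == LIBC_FAST_SPEED && result_used;
}

/* The end pointer computed on top of a memcpy call: memcpy returns DST,
   so the length has to be added separately after the call.  */
static void *
toy_end_pointer_via_memcpy (void *dst, const void *src, size_t n)
{
  assert (dst != NULL && src != NULL);
  return (char *) memcpy (dst, src, n) + n;
}
```

With these definitions, toy_avoid_memcpy_libcall (1, 1, toy_libc_func_speed (1, 1)) is nonzero, which corresponds to the case where the patch returns NULL_RTX from the expander. As far as I understand the builtin expanders, returning NULL_RTX leaves the original mempcpy call to be emitted as an ordinary call. When the result is unused or the hook reports LIBC_UNKNOWN_SPEED, the function returns 0 and the existing memcpy path is kept.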