From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pg1-x52b.google.com (mail-pg1-x52b.google.com [IPv6:2607:f8b0:4864:20::52b]) by sourceware.org (Postfix) with ESMTPS id 3239C3857BBD for ; Fri, 15 Jul 2022 17:07:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 3239C3857BBD Received: by mail-pg1-x52b.google.com with SMTP id 23so4930577pgc.8 for ; Fri, 15 Jul 2022 10:07:17 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=TW66lIMpbTCxWYoNyFWln8z4gM3GWChLx4AmUndHZ9M=; b=dML1CDKa92s8ZaDRzCvN2Ei5dqewCTQwDvvW2ytHzG6s84s1VrEvqnr/6LQhaWGF/Q sxupGj+4k5H8Es3a4Cgh7XdWcCuXfS8y442+nUqjEUFutI1RWsEOvIkk7ecYoAzhjs50 w6zBPN2497T5kHztpJyw9dxTAg1PitIO47HXMk/e/rczVCJYitPDaVRbdDpD590tcofq JDx2LJUKbt+vXfaB7k5MAMJkTifqpH53zVuVYIPfvXdjvn1rRQ+nro90RPrQwlgA47CK BUkvFyVHGVG0+x0cK29/uC2FtrDnncMwXgOaybFGHrZvrum9eEqAwf/wOP4V2Z/LWQCr BNgA== X-Gm-Message-State: AJIora+wqQ+KGJq5y3kGXU1DNQ+Ew1biP6TTFhPUEVDi0xRj89XZk2PR dkycooO9e63dcpXTJ8bncmm4lvvfB+AusDXrNo2uq+jf X-Google-Smtp-Source: AGRyM1scg3cwAceO3u/3GKbR1sFSvPoVfWMMZu2NnMIJ2qryvrkuNDzm/Z2we3qi1ywHZR98dbJmm58kmS7CNUE/Vcs= X-Received: by 2002:a05:6a00:14c7:b0:52a:d3ba:74c9 with SMTP id w7-20020a056a0014c700b0052ad3ba74c9mr15065510pfu.1.1657904836146; Fri, 15 Jul 2022 10:07:16 -0700 (PDT) MIME-Version: 1.0 References: <20220713230945.1866193-1-goldstein.w.n@gmail.com> <20220713233301.1868315-1-goldstein.w.n@gmail.com> <20220713233301.1868315-2-goldstein.w.n@gmail.com> In-Reply-To: <20220713233301.1868315-2-goldstein.w.n@gmail.com> From: "H.J. Lu" Date: Fri, 15 Jul 2022 10:06:40 -0700 Message-ID: Subject: Re: [PATCH v2 2/3] x86: Add support to build wcscpy with explicit ISA level To: Noah Goldstein Cc: GNU C Library , "Carlos O'Donell" Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-3023.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_SHORT, KAM_STOCKGEN, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Jul 2022 17:07:21 -0000 On Wed, Jul 13, 2022 at 4:33 PM Noah Goldstein wrote: > > 1. Add ISA level build guards to different implementations. > - wcscpy-ssse3.S is used as ISA level 2/3/4. > - wcscpy-generic.c is only used at ISA level 1 and will > only build if compiled with ISA level == 1. Otherwise > there is no reason to include it as we will always use > wcscpy-ssse3.S > > 2. Refactor the ifunc selector and ifunc implementation list to use > the ISA level aware wrapper macros that allow functions below the > compiled ISA level (with a guranteed replacement) to be skipped. > > Tested with and without multiarch on x86_64 for ISA levels: > {generic, x86-64-v2, x86-64-v3, x86-64-v4} > > And m32 with and without multiarch. > --- > sysdeps/x86_64/Makefile | 1 + > sysdeps/x86_64/multiarch/Makefile | 1 - > sysdeps/x86_64/multiarch/ifunc-impl-list.c | 12 +++++-- > sysdeps/x86_64/multiarch/wcscpy-generic.c | 9 +++-- > sysdeps/x86_64/multiarch/wcscpy-ssse3.S | 16 +++++++-- > sysdeps/x86_64/multiarch/wcscpy.c | 5 +-- > sysdeps/x86_64/wcscpy-generic.c | 31 +++++++++++++++++ > sysdeps/x86_64/wcscpy.S | 40 ++++++++++++++++++++++ > 8 files changed, 103 insertions(+), 12 deletions(-) > create mode 100644 sysdeps/x86_64/wcscpy-generic.c > create mode 100644 sysdeps/x86_64/wcscpy.S > > diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile > index 341ee69a35..c19bef2dec 100644 > --- a/sysdeps/x86_64/Makefile > +++ b/sysdeps/x86_64/Makefile > @@ -199,6 +199,7 @@ endif > ifeq ($(subdir),wcsmbs) > > sysdep_routines += \ > + wcscpy-generic \ > wcsncmp-generic \ > wcsnlen-generic \ > # sysdep_routines > diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile > index ba29a65716..df4601c294 100644 > --- a/sysdeps/x86_64/multiarch/Makefile > +++ b/sysdeps/x86_64/multiarch/Makefile > @@ -134,7 +134,6 @@ sysdep_routines += \ > wcscmp-avx2-rtm \ > wcscmp-evex \ > wcscmp-sse2 \ > - wcscpy-generic \ > wcscpy-ssse3 \ > wcslen-avx2 \ > wcslen-avx2-rtm \ > diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c > index 3b1df9b73c..9318e98cc8 100644 > --- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c > +++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c > @@ -791,9 +791,15 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, > > /* Support sysdeps/x86_64/multiarch/wcscpy.c. */ > IFUNC_IMPL (i, name, wcscpy, > - IFUNC_IMPL_ADD (array, i, wcscpy, CPU_FEATURE_USABLE (SSSE3), > - __wcscpy_ssse3) > - IFUNC_IMPL_ADD (array, i, wcscpy, 1, __wcscpy_generic)) > + /* ISA V4 wrapper for SSSE3 implementation because > + the SSSE3 implementation is also used at ISA > + level 3/4. */ > + X86_IFUNC_IMPL_ADD_V4 (array, i, wcscpy, > + CPU_FEATURE_USABLE (SSSE3), > + __wcscpy_ssse3) > + X86_IFUNC_IMPL_ADD_V1 (array, i, wcscpy, > + 1, > + __wcscpy_generic)) > > /* Support sysdeps/x86_64/multiarch/wcslen.c. */ > IFUNC_IMPL (i, name, wcslen, > diff --git a/sysdeps/x86_64/multiarch/wcscpy-generic.c b/sysdeps/x86_64/multiarch/wcscpy-generic.c > index 5ea905f33f..93d314aaad 100644 > --- a/sysdeps/x86_64/multiarch/wcscpy-generic.c > +++ b/sysdeps/x86_64/multiarch/wcscpy-generic.c > @@ -17,8 +17,11 @@ > . */ > > > -#if IS_IN (libc) > +#include > + > +#if ISA_SHOULD_BUILD (1) > + > # define WCSCPY __wcscpy_generic > -#endif > +# include > > -#include > +#endif > diff --git a/sysdeps/x86_64/multiarch/wcscpy-ssse3.S b/sysdeps/x86_64/multiarch/wcscpy-ssse3.S > index aa2b9d030f..62741071e6 100644 > --- a/sysdeps/x86_64/multiarch/wcscpy-ssse3.S > +++ b/sysdeps/x86_64/multiarch/wcscpy-ssse3.S > @@ -16,11 +16,21 @@ > License along with the GNU C Library; if not, see > . */ > > -#if IS_IN (libc) > +#include > + > +/* MINIMUM_X86_ISA_LEVEL <= 4 because there are not V3/V4 > + implementations so we need this to build for ISA V3/V4 > + builds. */ > +#if ISA_SHOULD_BUILD (4) > + > +# ifndef WCSCPY > +# define WCSCPY __wcscpy_ssse3 > +# endif > + > # include > > .section .text.ssse3,"ax",@progbits > -ENTRY (__wcscpy_ssse3) > +ENTRY (WCSCPY) > > mov %rsi, %rcx > mov %rdi, %rdx > @@ -547,5 +557,5 @@ L(Exit16): > mov %rdi, %rax > ret > > -END(__wcscpy_ssse3) > +END(WCSCPY) > #endif > diff --git a/sysdeps/x86_64/multiarch/wcscpy.c b/sysdeps/x86_64/multiarch/wcscpy.c > index 53c3228dc2..92c917b6b4 100644 > --- a/sysdeps/x86_64/multiarch/wcscpy.c > +++ b/sysdeps/x86_64/multiarch/wcscpy.c > @@ -26,15 +26,16 @@ > # define SYMBOL_NAME wcscpy > # include > > -extern __typeof (REDIRECT_NAME) OPTIMIZE (generic) attribute_hidden; > extern __typeof (REDIRECT_NAME) OPTIMIZE (ssse3) attribute_hidden; > > +extern __typeof (REDIRECT_NAME) OPTIMIZE (generic) attribute_hidden; > + > static inline void * > IFUNC_SELECTOR (void) > { > const struct cpu_features* cpu_features = __get_cpu_features (); > > - if (CPU_FEATURE_USABLE_P (cpu_features, SSSE3)) > + if (X86_ISA_CPU_FEATURE_USABLE_P (cpu_features, SSSE3)) > return OPTIMIZE (ssse3); > > return OPTIMIZE (generic); > diff --git a/sysdeps/x86_64/wcscpy-generic.c b/sysdeps/x86_64/wcscpy-generic.c > new file mode 100644 > index 0000000000..eef74ae53d > --- /dev/null > +++ b/sysdeps/x86_64/wcscpy-generic.c > @@ -0,0 +1,31 @@ > +/* wcscpy dispatch for RTLD and non-multiarch .c ISA level 1 build. > + Copyright (C) 2022 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + . */ > + > +/* wcscpy non-multiarch build is split into two files, > + wcscpy-generic.c and wcscpy.S. The wcscpy-generic.c build is for > + ISA level <= 1 and just uses wcsmbs/wcscpy.c. This must be split > + into two files because we cannot include C code from assembly or > + vice versa. */ > + > +#include > + > +#if MINIMUM_X86_ISA_LEVEL <= 1 > + > +# include "wcsmbs/wcscpy.c" > + > +#endif > diff --git a/sysdeps/x86_64/wcscpy.S b/sysdeps/x86_64/wcscpy.S > new file mode 100644 > index 0000000000..11d0bb4bab > --- /dev/null > +++ b/sysdeps/x86_64/wcscpy.S > @@ -0,0 +1,40 @@ > +/* wcscpy dispatch for RTLD and non-multiarch .c files > + Copyright (C) 2022 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + . */ > + > +/* wcscpy non-multiarch build is split into two files, > + wcscpy-generic.c and wcscpy.S. The wcscpy.S build is for > + ISA level >= 2 uses the optimized assembly implementations in > + multiarch/wcscpy*.S. This must be split into two files because > + we cannot include C code from assembly or vice versa. */ > + > +#include > + > +#if MINIMUM_X86_ISA_LEVEL >= 2 > + > +# define WCSCPY __wcscpy > + > +# define DEFAULT_IMPL_V2 "multiarch/wcscpy-ssse3.S" > +/* isa-default-impl.h expects DEFAULT_IMPL_V1 to be defined but it > + should never be used from here. */ > +# define DEFAULT_IMPL_V1 "ERROR -- Invalid ISA IMPL" > + > +# include "isa-default-impl.h" > + > +weak_alias (__wcscpy, wcscpy) > +libc_hidden_def (__wcscpy) > +#endif > -- > 2.34.1 > LGTM. Thanks. -- H.J.