From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pj1-x102c.google.com (mail-pj1-x102c.google.com [IPv6:2607:f8b0:4864:20::102c]) by sourceware.org (Postfix) with ESMTPS id 90E1E3858412 for ; Wed, 6 Jul 2022 01:33:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 90E1E3858412 Received: by mail-pj1-x102c.google.com with SMTP id g16-20020a17090a7d1000b001ea9f820449so18760169pjl.5 for ; Tue, 05 Jul 2022 18:33:05 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=MXUCNg3ZRYKI9ac+1aDc7KxFSylPA1j+X1FVGU/z0pk=; b=5lrmwCziyyAvdrVqwSPKUzE8MWYHgc6H/cBHy7ty5wKu80qbWVRt8Nn9w48+Ivo0v/ p9sx8XEQABFz/wt0RGfyGCygtjHXxLEVdlwK7JRZ3x29G30P4c1kc7vtJ1lidOB+U3oR fWxRCFAjvr2HSqGfXJX56+kZABwEhrR9YiEyJ64gdPTJene7JB9CdXyGwV9g1I755Q7B Cy0vHQh4PGYdclmMeJtESBvwAhB/2ZzTk8kHj8TgeGncgTYF5hyA6FzH3hw7Nkjc6nHw ii4oMbLfT4YZ8jyfkLM3b/F0HNhu05FuFcCoohxNnNgtXyqiLdp3SGCOBtxqS0aB9VR3 Fg+Q== X-Gm-Message-State: AJIora9JuudHqS32RFp4qIc9RPETPF5QgnECiT+IkLer1Xfby3GIYIww YD18URbfhpR6tbjABZ/CH2dEi/UcxOABDZ/c3JDzoDS+ X-Google-Smtp-Source: AGRyM1uh0O3hmRkNJ3gi+5Q2PH2NnCrncK6VBgbB6ydI0D2aoSCCWMIFVxP+ZRQv6ZVuqTNd5UQB2EJBHxpJi+iE7FQ= X-Received: by 2002:a17:90b:3942:b0:1ef:9fcd:f8eb with SMTP id oe2-20020a17090b394200b001ef9fcdf8ebmr5762056pjb.10.1657071184337; Tue, 05 Jul 2022 18:33:04 -0700 (PDT) MIME-Version: 1.0 References: <20220706000641.3657347-1-goldstein.w.n@gmail.com> In-Reply-To: <20220706000641.3657347-1-goldstein.w.n@gmail.com> From: "H.J. Lu" Date: Tue, 5 Jul 2022 18:32:28 -0700 Message-ID: Subject: Re: [PATCH v1] x86: Remove generic strncat, strncpy, and stpncpy implementations To: Noah Goldstein , Mayshao-oc Cc: GNU C Library , "Carlos O'Donell" Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-3023.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_SHORT, KAM_STOCKGEN, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2022 01:33:09 -0000 On Tue, Jul 5, 2022 at 5:06 PM Noah Goldstein wrote: > > These functions all have optimized versions: > __strncat_sse2_unaligned, __strncpy_sse2_unaligned, and > stpncpy_sse2_unaligned which are faster than their respective generic > implementations. Since the sse2 versions can run on baseline x86_64, Is this true on all x86-64 processors? > we should use these as the baseline implementation and can remove the > generic implementations. > > Geometric mean of N=20 runs of the entire benchmark suite on: > 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz (Tigerlake) > > __strncat_sse2_unaligned / __strncat_generic: .944 > __strncpy_sse2_unaligned / __strncpy_generic: .726 > __stpncpy_sse2_unaligned / __stpncpy_generic: .650 > > Tested build with and without multiarch and full check with multiarch. > --- > sysdeps/x86_64/multiarch/Makefile | 3 -- > sysdeps/x86_64/multiarch/ifunc-impl-list.c | 9 ++-- > sysdeps/x86_64/multiarch/ifunc-strcpy.h | 8 +--- > sysdeps/x86_64/multiarch/ifunc-strncpy.h | 48 ++++++++++++++++++++++ > sysdeps/x86_64/multiarch/stpncpy-generic.c | 26 ------------ > sysdeps/x86_64/multiarch/stpncpy.c | 3 +- > sysdeps/x86_64/multiarch/strncat-generic.c | 21 ---------- > sysdeps/x86_64/multiarch/strncat.c | 3 +- > sysdeps/x86_64/multiarch/strncpy-generic.c | 24 ----------- > sysdeps/x86_64/multiarch/strncpy.c | 3 +- > 10 files changed, 56 insertions(+), 92 deletions(-) > create mode 100644 sysdeps/x86_64/multiarch/ifunc-strncpy.h > delete mode 100644 sysdeps/x86_64/multiarch/stpncpy-generic.c > delete mode 100644 sysdeps/x86_64/multiarch/strncat-generic.c > delete mode 100644 sysdeps/x86_64/multiarch/strncpy-generic.c > > diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile > index 18cea04423..04e238efa0 100644 > --- a/sysdeps/x86_64/multiarch/Makefile > +++ b/sysdeps/x86_64/multiarch/Makefile > @@ -45,7 +45,6 @@ sysdep_routines += \ > stpcpy-sse2-unaligned \ > stpncpy-avx2 \ > stpncpy-avx2-rtm \ > - stpncpy-generic \ > stpncpy-evex \ > stpncpy-sse2-unaligned \ > strcasecmp_l-avx2 \ > @@ -92,7 +91,6 @@ sysdep_routines += \ > strncase_l-sse4_2 \ > strncat-avx2 \ > strncat-avx2-rtm \ > - strncat-generic \ > strncat-evex \ > strncat-sse2-unaligned \ > strncmp-avx2 \ > @@ -102,7 +100,6 @@ sysdep_routines += \ > strncmp-sse4_2 \ > strncpy-avx2 \ > strncpy-avx2-rtm \ > - strncpy-generic \ > strncpy-evex \ > strncpy-sse2-unaligned \ > strnlen-avx2 \ > diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c > index adf7d4bafd..2c96cb62d2 100644 > --- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c > +++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c > @@ -403,8 +403,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, > && CPU_FEATURE_USABLE (AVX512BW)), > __stpncpy_evex) > IFUNC_IMPL_ADD (array, i, stpncpy, 1, > - __stpncpy_sse2_unaligned) > - IFUNC_IMPL_ADD (array, i, stpncpy, 1, __stpncpy_generic)) > + __stpncpy_sse2_unaligned)) > > /* Support sysdeps/x86_64/multiarch/stpcpy.c. */ > IFUNC_IMPL (i, name, stpcpy, > @@ -618,8 +617,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, > && CPU_FEATURE_USABLE (AVX512BW)), > __strncat_evex) > IFUNC_IMPL_ADD (array, i, strncat, 1, > - __strncat_sse2_unaligned) > - IFUNC_IMPL_ADD (array, i, strncat, 1, __strncat_generic)) > + __strncat_sse2_unaligned)) > > /* Support sysdeps/x86_64/multiarch/strncpy.c. */ > IFUNC_IMPL (i, name, strncpy, > @@ -634,8 +632,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, > && CPU_FEATURE_USABLE (AVX512BW)), > __strncpy_evex) > IFUNC_IMPL_ADD (array, i, strncpy, 1, > - __strncpy_sse2_unaligned) > - IFUNC_IMPL_ADD (array, i, strncpy, 1, __strncpy_generic)) > + __strncpy_sse2_unaligned)) > > /* Support sysdeps/x86_64/multiarch/strpbrk.c. */ > IFUNC_IMPL (i, name, strpbrk, > diff --git a/sysdeps/x86_64/multiarch/ifunc-strcpy.h b/sysdeps/x86_64/multiarch/ifunc-strcpy.h > index 80529458d1..a15afa44e9 100644 > --- a/sysdeps/x86_64/multiarch/ifunc-strcpy.h > +++ b/sysdeps/x86_64/multiarch/ifunc-strcpy.h > @@ -20,11 +20,7 @@ > > #include > > -#ifndef GENERIC > -# define GENERIC sse2 > -#endif > - > -extern __typeof (REDIRECT_NAME) OPTIMIZE (GENERIC) attribute_hidden; > +extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2) attribute_hidden; > extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2_unaligned) > attribute_hidden; > extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2) attribute_hidden; > @@ -53,5 +49,5 @@ IFUNC_SELECTOR (void) > if (CPU_FEATURES_ARCH_P (cpu_features, Fast_Unaligned_Load)) > return OPTIMIZE (sse2_unaligned); > > - return OPTIMIZE (GENERIC); > + return OPTIMIZE (sse2); > } > diff --git a/sysdeps/x86_64/multiarch/ifunc-strncpy.h b/sysdeps/x86_64/multiarch/ifunc-strncpy.h > new file mode 100644 > index 0000000000..323225af4d > --- /dev/null > +++ b/sysdeps/x86_64/multiarch/ifunc-strncpy.h > @@ -0,0 +1,48 @@ > +/* Common definition for ifunc st{r|p}n{cpy|cat} > + All versions must be listed in ifunc-impl-list.c. > + Copyright (C) 2022 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + . */ > + > +#include > + > +extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2_unaligned) > + attribute_hidden; > +extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2) attribute_hidden; > +extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2_rtm) attribute_hidden; > +extern __typeof (REDIRECT_NAME) OPTIMIZE (evex) attribute_hidden; > + > +static inline void * > +IFUNC_SELECTOR (void) > +{ > + const struct cpu_features* cpu_features = __get_cpu_features (); > + > + if (CPU_FEATURE_USABLE_P (cpu_features, AVX2) > + && CPU_FEATURES_ARCH_P (cpu_features, AVX_Fast_Unaligned_Load)) > + { > + if (CPU_FEATURE_USABLE_P (cpu_features, AVX512VL) > + && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)) > + return OPTIMIZE (evex); > + > + if (CPU_FEATURE_USABLE_P (cpu_features, RTM)) > + return OPTIMIZE (avx2_rtm); > + > + if (!CPU_FEATURES_ARCH_P (cpu_features, Prefer_No_VZEROUPPER)) > + return OPTIMIZE (avx2); > + } > + > + return OPTIMIZE (sse2_unaligned); > +} > diff --git a/sysdeps/x86_64/multiarch/stpncpy-generic.c b/sysdeps/x86_64/multiarch/stpncpy-generic.c > deleted file mode 100644 > index 87826845b0..0000000000 > --- a/sysdeps/x86_64/multiarch/stpncpy-generic.c > +++ /dev/null > @@ -1,26 +0,0 @@ > -/* stpncpy. > - Copyright (C) 2022 Free Software Foundation, Inc. > - This file is part of the GNU C Library. > - > - The GNU C Library is free software; you can redistribute it and/or > - modify it under the terms of the GNU Lesser General Public > - License as published by the Free Software Foundation; either > - version 2.1 of the License, or (at your option) any later version. > - > - The GNU C Library is distributed in the hope that it will be useful, > - but WITHOUT ANY WARRANTY; without even the implied warranty of > - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > - Lesser General Public License for more details. > - > - You should have received a copy of the GNU Lesser General Public > - License along with the GNU C Library; if not, see > - . */ > - > - > -#define STPNCPY __stpncpy_generic > -#undef weak_alias > -#define weak_alias(ignored1, ignored2) > -#undef libc_hidden_def > -#define libc_hidden_def(stpncpy) > - > -#include > diff --git a/sysdeps/x86_64/multiarch/stpncpy.c b/sysdeps/x86_64/multiarch/stpncpy.c > index 879bc83f0b..a8d083ff0d 100644 > --- a/sysdeps/x86_64/multiarch/stpncpy.c > +++ b/sysdeps/x86_64/multiarch/stpncpy.c > @@ -25,9 +25,8 @@ > # undef stpncpy > # undef __stpncpy > > -# define GENERIC generic > # define SYMBOL_NAME stpncpy > -# include "ifunc-strcpy.h" > +# include "ifunc-strncpy.h" > > libc_ifunc_redirected (__redirect_stpncpy, __stpncpy, IFUNC_SELECTOR ()); > > diff --git a/sysdeps/x86_64/multiarch/strncat-generic.c b/sysdeps/x86_64/multiarch/strncat-generic.c > deleted file mode 100644 > index 0090669cd1..0000000000 > --- a/sysdeps/x86_64/multiarch/strncat-generic.c > +++ /dev/null > @@ -1,21 +0,0 @@ > -/* strncat. > - Copyright (C) 2022 Free Software Foundation, Inc. > - This file is part of the GNU C Library. > - > - The GNU C Library is free software; you can redistribute it and/or > - modify it under the terms of the GNU Lesser General Public > - License as published by the Free Software Foundation; either > - version 2.1 of the License, or (at your option) any later version. > - > - The GNU C Library is distributed in the hope that it will be useful, > - but WITHOUT ANY WARRANTY; without even the implied warranty of > - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > - Lesser General Public License for more details. > - > - You should have received a copy of the GNU Lesser General Public > - License along with the GNU C Library; if not, see > - . */ > - > - > -#define STRNCAT __strncat_generic > -#include > diff --git a/sysdeps/x86_64/multiarch/strncat.c b/sysdeps/x86_64/multiarch/strncat.c > index 50fba8a41f..a590c25d51 100644 > --- a/sysdeps/x86_64/multiarch/strncat.c > +++ b/sysdeps/x86_64/multiarch/strncat.c > @@ -24,8 +24,7 @@ > # undef strncat > > # define SYMBOL_NAME strncat > -# define GENERIC generic > -# include "ifunc-strcpy.h" > +# include "ifunc-strncpy.h" > > libc_ifunc_redirected (__redirect_strncat, strncat, IFUNC_SELECTOR ()); > strong_alias (strncat, __strncat); > diff --git a/sysdeps/x86_64/multiarch/strncpy-generic.c b/sysdeps/x86_64/multiarch/strncpy-generic.c > deleted file mode 100644 > index 9916153dd5..0000000000 > --- a/sysdeps/x86_64/multiarch/strncpy-generic.c > +++ /dev/null > @@ -1,24 +0,0 @@ > -/* strncpy. > - Copyright (C) 2022 Free Software Foundation, Inc. > - This file is part of the GNU C Library. > - > - The GNU C Library is free software; you can redistribute it and/or > - modify it under the terms of the GNU Lesser General Public > - License as published by the Free Software Foundation; either > - version 2.1 of the License, or (at your option) any later version. > - > - The GNU C Library is distributed in the hope that it will be useful, > - but WITHOUT ANY WARRANTY; without even the implied warranty of > - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > - Lesser General Public License for more details. > - > - You should have received a copy of the GNU Lesser General Public > - License along with the GNU C Library; if not, see > - . */ > - > - > -#define STRNCPY __strncpy_generic > -#undef libc_hidden_builtin_def > -#define libc_hidden_builtin_def(strncpy) > - > -#include > diff --git a/sysdeps/x86_64/multiarch/strncpy.c b/sysdeps/x86_64/multiarch/strncpy.c > index 7fc7d72ec5..c83440f0e3 100644 > --- a/sysdeps/x86_64/multiarch/strncpy.c > +++ b/sysdeps/x86_64/multiarch/strncpy.c > @@ -24,8 +24,7 @@ > # undef strncpy > > # define SYMBOL_NAME strncpy > -# define GENERIC generic > -# include "ifunc-strcpy.h" > +# include "ifunc-strncpy.h" > > libc_ifunc_redirected (__redirect_strncpy, strncpy, IFUNC_SELECTOR ()); > > -- > 2.34.1 > -- H.J.