From: "H.J. Lu"
Date: Wed, 6 Jul 2022 11:08:44 -0700
Subject: Re: [PATCH v1] x86: Remove generic strncat, strncpy, and stpncpy implementations
To: Noah Goldstein
Cc: GNU C Library, "Carlos O'Donell"
In-Reply-To: <20220706000641.3657347-1-goldstein.w.n@gmail.com>
List-Id: Libc-alpha mailing list

On Tue, Jul 5, 2022 at 5:06 PM Noah Goldstein wrote:
>
> These functions all have optimized versions:
> __strncat_sse2_unaligned, __strncpy_sse2_unaligned, and
> __stpncpy_sse2_unaligned which are faster than their respective generic
> implementations. Since the sse2 versions can run on baseline x86_64,
> we should use these as the baseline implementation and can remove the
> generic implementations.
>
> Geometric mean of N=20 runs of the entire benchmark suite on:
> 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz (Tigerlake)
>
> __strncat_sse2_unaligned / __strncat_generic: .944
> __strncpy_sse2_unaligned / __strncpy_generic: .726
> __stpncpy_sse2_unaligned / __stpncpy_generic: .650
>
> Tested build with and without multiarch and full check with multiarch.
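For readers following the thread, the selection order that the new ifunc-strncpy.h in this patch implements can be sketched as a small standalone function. This is only an illustration under stated assumptions: `struct cpu_flags` and `select_variant` are hypothetical names that stand in for glibc's internal cpu_features machinery, and the function returns a variant name rather than a function pointer. The point it shows is that the last return is the SSE2 unaligned variant, which every x86_64 CPU can run, so no generic fallback is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for glibc's cpu_features; the field names are
   illustrative and are not the real glibc data structure.  */
struct cpu_flags
{
  bool avx2;
  bool avx_fast_unaligned_load;
  bool avx512vl;
  bool avx512bw;
  bool rtm;
  bool prefer_no_vzeroupper;
};

/* Return the name of the variant that would be chosen, following the
   same order of checks as the selector in the patch.  */
static const char *
select_variant (const struct cpu_flags *f)
{
  assert (f != NULL);

  if (f->avx2 && f->avx_fast_unaligned_load)
    {
      if (f->avx512vl && f->avx512bw)
        return "evex";

      if (f->rtm)
        return "avx2_rtm";

      if (!f->prefer_no_vzeroupper)
        return "avx2";
    }

  /* Baseline: SSE2 is part of the x86_64 base ISA, so this variant is
     always usable and replaces the old generic fallback.  */
  return "sse2_unaligned";
}
```

With all flags cleared the sketch returns "sse2_unaligned"; setting only `avx2` and `avx_fast_unaligned_load` returns "avx2".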
> ---
>  sysdeps/x86_64/multiarch/Makefile          |  3 --
>  sysdeps/x86_64/multiarch/ifunc-impl-list.c |  9 ++--
>  sysdeps/x86_64/multiarch/ifunc-strcpy.h    |  8 +---
>  sysdeps/x86_64/multiarch/ifunc-strncpy.h   | 48 ++++++++++++++++++++++
>  sysdeps/x86_64/multiarch/stpncpy-generic.c | 26 ------------
>  sysdeps/x86_64/multiarch/stpncpy.c         |  3 +-
>  sysdeps/x86_64/multiarch/strncat-generic.c | 21 ----------
>  sysdeps/x86_64/multiarch/strncat.c         |  3 +-
>  sysdeps/x86_64/multiarch/strncpy-generic.c | 24 -----------
>  sysdeps/x86_64/multiarch/strncpy.c         |  3 +-
>  10 files changed, 56 insertions(+), 92 deletions(-)
>  create mode 100644 sysdeps/x86_64/multiarch/ifunc-strncpy.h
>  delete mode 100644 sysdeps/x86_64/multiarch/stpncpy-generic.c
>  delete mode 100644 sysdeps/x86_64/multiarch/strncat-generic.c
>  delete mode 100644 sysdeps/x86_64/multiarch/strncpy-generic.c
>
> diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile
> index 18cea04423..04e238efa0 100644
> --- a/sysdeps/x86_64/multiarch/Makefile
> +++ b/sysdeps/x86_64/multiarch/Makefile
> @@ -45,7 +45,6 @@ sysdep_routines += \
>    stpcpy-sse2-unaligned \
>    stpncpy-avx2 \
>    stpncpy-avx2-rtm \
> -  stpncpy-generic \
>    stpncpy-evex \
>    stpncpy-sse2-unaligned \
>    strcasecmp_l-avx2 \
> @@ -92,7 +91,6 @@ sysdep_routines += \
>    strncase_l-sse4_2 \
>    strncat-avx2 \
>    strncat-avx2-rtm \
> -  strncat-generic \
>    strncat-evex \
>    strncat-sse2-unaligned \
>    strncmp-avx2 \
> @@ -102,7 +100,6 @@ sysdep_routines += \
>    strncmp-sse4_2 \
>    strncpy-avx2 \
>    strncpy-avx2-rtm \
> -  strncpy-generic \
>    strncpy-evex \
>    strncpy-sse2-unaligned \
>    strnlen-avx2 \
> diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> index adf7d4bafd..2c96cb62d2 100644
> --- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> +++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> @@ -403,8 +403,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>                                && CPU_FEATURE_USABLE (AVX512BW)),
>                               __stpncpy_evex)
>               IFUNC_IMPL_ADD (array, i, stpncpy, 1,
> -                             __stpncpy_sse2_unaligned)
> -             IFUNC_IMPL_ADD (array, i, stpncpy, 1, __stpncpy_generic))
> +                             __stpncpy_sse2_unaligned))
>
>    /* Support sysdeps/x86_64/multiarch/stpcpy.c.  */
>    IFUNC_IMPL (i, name, stpcpy,
> @@ -618,8 +617,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>                                && CPU_FEATURE_USABLE (AVX512BW)),
>                               __strncat_evex)
>               IFUNC_IMPL_ADD (array, i, strncat, 1,
> -                             __strncat_sse2_unaligned)
> -             IFUNC_IMPL_ADD (array, i, strncat, 1, __strncat_generic))
> +                             __strncat_sse2_unaligned))
>
>    /* Support sysdeps/x86_64/multiarch/strncpy.c.  */
>    IFUNC_IMPL (i, name, strncpy,
> @@ -634,8 +632,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>                                && CPU_FEATURE_USABLE (AVX512BW)),
>                               __strncpy_evex)
>               IFUNC_IMPL_ADD (array, i, strncpy, 1,
> -                             __strncpy_sse2_unaligned)
> -             IFUNC_IMPL_ADD (array, i, strncpy, 1, __strncpy_generic))
> +                             __strncpy_sse2_unaligned))
>
>    /* Support sysdeps/x86_64/multiarch/strpbrk.c.  */
>    IFUNC_IMPL (i, name, strpbrk,
> diff --git a/sysdeps/x86_64/multiarch/ifunc-strcpy.h b/sysdeps/x86_64/multiarch/ifunc-strcpy.h
> index 80529458d1..a15afa44e9 100644
> --- a/sysdeps/x86_64/multiarch/ifunc-strcpy.h
> +++ b/sysdeps/x86_64/multiarch/ifunc-strcpy.h
> @@ -20,11 +20,7 @@
>
>  #include
>
> -#ifndef GENERIC
> -# define GENERIC sse2
> -#endif
> -
> -extern __typeof (REDIRECT_NAME) OPTIMIZE (GENERIC) attribute_hidden;
> +extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2) attribute_hidden;
>  extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2_unaligned)
>    attribute_hidden;
>  extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2) attribute_hidden;
> @@ -53,5 +49,5 @@ IFUNC_SELECTOR (void)
>    if (CPU_FEATURES_ARCH_P (cpu_features, Fast_Unaligned_Load))
>      return OPTIMIZE (sse2_unaligned);
>
> -  return OPTIMIZE (GENERIC);
> +  return OPTIMIZE (sse2);
>  }
> diff --git a/sysdeps/x86_64/multiarch/ifunc-strncpy.h b/sysdeps/x86_64/multiarch/ifunc-strncpy.h
> new file mode 100644
> index 0000000000..323225af4d
> --- /dev/null
> +++ b/sysdeps/x86_64/multiarch/ifunc-strncpy.h
> @@ -0,0 +1,48 @@
> +/* Common definition for ifunc st{r|p}n{cpy|cat}
> +   All versions must be listed in ifunc-impl-list.c.
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   .  */
> +
> +#include
> +
> +extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2_unaligned)
> +  attribute_hidden;
> +extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2) attribute_hidden;
> +extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2_rtm) attribute_hidden;
> +extern __typeof (REDIRECT_NAME) OPTIMIZE (evex) attribute_hidden;
> +
> +static inline void *
> +IFUNC_SELECTOR (void)
> +{
> +  const struct cpu_features* cpu_features = __get_cpu_features ();
> +
> +  if (CPU_FEATURE_USABLE_P (cpu_features, AVX2)
> +      && CPU_FEATURES_ARCH_P (cpu_features, AVX_Fast_Unaligned_Load))
> +    {
> +      if (CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
> +          && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW))
> +        return OPTIMIZE (evex);
> +
> +      if (CPU_FEATURE_USABLE_P (cpu_features, RTM))
> +        return OPTIMIZE (avx2_rtm);
> +
> +      if (!CPU_FEATURES_ARCH_P (cpu_features, Prefer_No_VZEROUPPER))
> +        return OPTIMIZE (avx2);
> +    }
> +
> +  return OPTIMIZE (sse2_unaligned);
> +}
> diff --git a/sysdeps/x86_64/multiarch/stpncpy-generic.c b/sysdeps/x86_64/multiarch/stpncpy-generic.c
> deleted file mode 100644
> index 87826845b0..0000000000
> --- a/sysdeps/x86_64/multiarch/stpncpy-generic.c
> +++ /dev/null
> @@ -1,26 +0,0 @@
> -/* stpncpy.
> -   Copyright (C) 2022 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   .  */
> -
> -
> -#define STPNCPY __stpncpy_generic
> -#undef weak_alias
> -#define weak_alias(ignored1, ignored2)
> -#undef libc_hidden_def
> -#define libc_hidden_def(stpncpy)
> -
> -#include
> diff --git a/sysdeps/x86_64/multiarch/stpncpy.c b/sysdeps/x86_64/multiarch/stpncpy.c
> index 879bc83f0b..a8d083ff0d 100644
> --- a/sysdeps/x86_64/multiarch/stpncpy.c
> +++ b/sysdeps/x86_64/multiarch/stpncpy.c
> @@ -25,9 +25,8 @@
>  # undef stpncpy
>  # undef __stpncpy
>
> -# define GENERIC generic
>  # define SYMBOL_NAME stpncpy
> -# include "ifunc-strcpy.h"
> +# include "ifunc-strncpy.h"
>
>  libc_ifunc_redirected (__redirect_stpncpy, __stpncpy, IFUNC_SELECTOR ());
>
> diff --git a/sysdeps/x86_64/multiarch/strncat-generic.c b/sysdeps/x86_64/multiarch/strncat-generic.c
> deleted file mode 100644
> index 0090669cd1..0000000000
> --- a/sysdeps/x86_64/multiarch/strncat-generic.c
> +++ /dev/null
> @@ -1,21 +0,0 @@
> -/* strncat.
> -   Copyright (C) 2022 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   .  */
> -
> -
> -#define STRNCAT __strncat_generic
> -#include
> diff --git a/sysdeps/x86_64/multiarch/strncat.c b/sysdeps/x86_64/multiarch/strncat.c
> index 50fba8a41f..a590c25d51 100644
> --- a/sysdeps/x86_64/multiarch/strncat.c
> +++ b/sysdeps/x86_64/multiarch/strncat.c
> @@ -24,8 +24,7 @@
>  # undef strncat
>
>  # define SYMBOL_NAME strncat
> -# define GENERIC generic
> -# include "ifunc-strcpy.h"
> +# include "ifunc-strncpy.h"
>
>  libc_ifunc_redirected (__redirect_strncat, strncat, IFUNC_SELECTOR ());
>  strong_alias (strncat, __strncat);
> diff --git a/sysdeps/x86_64/multiarch/strncpy-generic.c b/sysdeps/x86_64/multiarch/strncpy-generic.c
> deleted file mode 100644
> index 9916153dd5..0000000000
> --- a/sysdeps/x86_64/multiarch/strncpy-generic.c
> +++ /dev/null
> @@ -1,24 +0,0 @@
> -/* strncpy.
> -   Copyright (C) 2022 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   .  */
> -
> -
> -#define STRNCPY __strncpy_generic
> -#undef libc_hidden_builtin_def
> -#define libc_hidden_builtin_def(strncpy)
> -
> -#include
> diff --git a/sysdeps/x86_64/multiarch/strncpy.c b/sysdeps/x86_64/multiarch/strncpy.c
> index 7fc7d72ec5..c83440f0e3 100644
> --- a/sysdeps/x86_64/multiarch/strncpy.c
> +++ b/sysdeps/x86_64/multiarch/strncpy.c
> @@ -24,8 +24,7 @@
>  # undef strncpy
>
>  # define SYMBOL_NAME strncpy
> -# define GENERIC generic
> -# include "ifunc-strcpy.h"
> +# include "ifunc-strncpy.h"
>
>  libc_ifunc_redirected (__redirect_strncpy, strncpy, IFUNC_SELECTOR ());
>
> --
> 2.34.1
>

LGTM.

Thanks.

--
H.J.