From: "H.J. Lu"
Date: Thu, 17 Feb 2022 11:20:49 -0800
Subject: Re: [PATCH v5] x86: Fallback {str|wcs}cmp RTM in the ncmp overflow case [BZ #28896]
To: Noah Goldstein
Cc: GNU C Library, "Carlos O'Donell"
References: <20220215162751.281955-1-goldstein.w.n@gmail.com> <20220217191524.2961663-1-goldstein.w.n@gmail.com>
In-Reply-To: <20220217191524.2961663-1-goldstein.w.n@gmail.com>

On Thu, Feb 17, 2022 at 11:15 AM Noah Goldstein wrote:
>
> In the overflow fallback, strncmp-avx2-rtm and wcsncmp-avx2-rtm would
> call strcmp-avx2 and wcscmp-avx2 respectively. These have
> no checks around vzeroupper and would trigger spurious
> aborts. This commit fixes that.
>
> test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all pass on
> AVX2 machines with and without RTM.
>
> Co-authored-by: H.J. Lu
> ---
>  sysdeps/x86/Makefile                        |  2 +-
>  sysdeps/x86/tst-strncmp-rtm.c               | 17 ++++++++++++++++-
>  sysdeps/x86_64/multiarch/strcmp-avx2.S      |  8 ++------
>  sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S |  1 +
>  sysdeps/x86_64/multiarch/strncmp-avx2.S     |  1 +
>  sysdeps/x86_64/multiarch/wcsncmp-avx2-rtm.S |  2 +-
>  sysdeps/x86_64/multiarch/wcsncmp-avx2.S     |  2 +-
>  7 files changed, 23 insertions(+), 10 deletions(-)
>
> diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile
> index 6cf708335c..d110f7b7f2 100644
> --- a/sysdeps/x86/Makefile
> +++ b/sysdeps/x86/Makefile
> @@ -109,7 +109,7 @@ CFLAGS-tst-memset-rtm.c += -mrtm
>  CFLAGS-tst-strchr-rtm.c += -mrtm
>  CFLAGS-tst-strcpy-rtm.c += -mrtm
>  CFLAGS-tst-strlen-rtm.c += -mrtm
> -CFLAGS-tst-strncmp-rtm.c += -mrtm
> +CFLAGS-tst-strncmp-rtm.c += -mrtm -Wno-error
>  CFLAGS-tst-strrchr-rtm.c += -mrtm
>  endif
>
> diff --git a/sysdeps/x86/tst-strncmp-rtm.c b/sysdeps/x86/tst-strncmp-rtm.c
> index 09ed6fa0d6..9e20abaacc 100644
> --- a/sysdeps/x86/tst-strncmp-rtm.c
> +++ b/sysdeps/x86/tst-strncmp-rtm.c
> @@ -16,6 +16,7 @@
>     License along with the GNU C Library; if not, see
>     <https://www.gnu.org/licenses/>.  */
>
> +#include <stdint.h>
>  #include <tst-rtm.h>
>
>  #define LOOP 3000
> @@ -45,8 +46,22 @@ function (void)
>      return 1;
>  }
>
> +__attribute__ ((noinline, noclone))
> +static int
> +function_overflow (void)
> +{
> +  if (strncmp (string1, string2, SIZE_MAX) == 0)
> +    return 0;
> +  else
> +    return 1;
> +}
> +
>  static int
>  do_test (void)
>  {
> -  return do_test_1 ("strncmp", LOOP, prepare, function);
> +  int status = do_test_1 ("strncmp", LOOP, prepare, function);
> +  if (status != EXIT_SUCCESS)
> +    return status;
> +  status = do_test_1 ("strncmp", LOOP, prepare, function_overflow);
> +  return status;
>  }
> diff --git a/sysdeps/x86_64/multiarch/strcmp-avx2.S b/sysdeps/x86_64/multiarch/strcmp-avx2.S
> index 07a5a2c889..52ff5ad724 100644
> --- a/sysdeps/x86_64/multiarch/strcmp-avx2.S
> +++ b/sysdeps/x86_64/multiarch/strcmp-avx2.S
> @@ -193,10 +193,10 @@ L(ret_zero):
>      .p2align 4,, 5
>  L(one_or_less):
>      jb L(ret_zero)
> -# ifdef USE_AS_WCSCMP
>      /* 'nbe' covers the case where length is negative (large
>         unsigned).  */
> -    jnbe __wcscmp_avx2
> +    jnbe OVERFLOW_STRCMP
> +# ifdef USE_AS_WCSCMP
>      movl (%rdi), %edx
>      xorl %eax, %eax
>      cmpl (%rsi), %edx
> @@ -205,10 +205,6 @@ L(one_or_less):
>      negl %eax
>      orl $1, %eax
>  # else
> -    /* 'nbe' covers the case where length is negative (large
> -       unsigned).  */
> -
> -    jnbe __strcmp_avx2
>      movzbl (%rdi), %eax
>      movzbl (%rsi), %ecx
>      subl %ecx, %eax
> diff --git a/sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S b/sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S
> index 37d1224bb9..68bad365ba 100644
> --- a/sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S
> +++ b/sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S
> @@ -1,3 +1,4 @@
>  #define STRCMP __strncmp_avx2_rtm
>  #define USE_AS_STRNCMP 1
> +#define OVERFLOW_STRCMP __strcmp_avx2_rtm
>  #include "strcmp-avx2-rtm.S"
> diff --git a/sysdeps/x86_64/multiarch/strncmp-avx2.S b/sysdeps/x86_64/multiarch/strncmp-avx2.S
> index 1678bcc235..f138e9f1fd 100644
> --- a/sysdeps/x86_64/multiarch/strncmp-avx2.S
> +++ b/sysdeps/x86_64/multiarch/strncmp-avx2.S
> @@ -1,3 +1,4 @@
>  #define STRCMP __strncmp_avx2
>  #define USE_AS_STRNCMP 1
> +#define OVERFLOW_STRCMP __strcmp_avx2
>  #include "strcmp-avx2.S"
> diff --git a/sysdeps/x86_64/multiarch/wcsncmp-avx2-rtm.S b/sysdeps/x86_64/multiarch/wcsncmp-avx2-rtm.S
> index 4e88c70cc6..f467582cbe 100644
> --- a/sysdeps/x86_64/multiarch/wcsncmp-avx2-rtm.S
> +++ b/sysdeps/x86_64/multiarch/wcsncmp-avx2-rtm.S
> @@ -1,5 +1,5 @@
>  #define STRCMP __wcsncmp_avx2_rtm
>  #define USE_AS_STRNCMP 1
>  #define USE_AS_WCSCMP 1
> -
> +#define OVERFLOW_STRCMP __wcscmp_avx2_rtm
>  #include "strcmp-avx2-rtm.S"
> diff --git a/sysdeps/x86_64/multiarch/wcsncmp-avx2.S b/sysdeps/x86_64/multiarch/wcsncmp-avx2.S
> index 4fa1de4d3f..e9ede522b8 100644
> --- a/sysdeps/x86_64/multiarch/wcsncmp-avx2.S
> +++ b/sysdeps/x86_64/multiarch/wcsncmp-avx2.S
> @@ -1,5 +1,5 @@
>  #define STRCMP __wcsncmp_avx2
>  #define USE_AS_STRNCMP 1
>  #define USE_AS_WCSCMP 1
> -
> +#define OVERFLOW_STRCMP __wcscmp_avx2
>  #include "strcmp-avx2.S"
> --
> 2.25.1
>

LGTM.

Reviewed-by: H.J. Lu

Thanks.

--
H.J.
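
For anyone reading the thread later: the fallback in the patch relies on strncmp with a huge bound being equivalent to strcmp on NUL-terminated input, which is why jumping to OVERFLOW_STRCMP is safe. Below is a minimal C sketch of that equivalence, not part of the patch. The helper names sign_of, overflow_bound_agrees and self_check are hypothetical, and the volatile bound is only an assumption to keep compilers from diagnosing the SIZE_MAX length at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: reduce a comparison
   result to -1, 0 or 1 so only the sign is compared.  */
static int
sign_of (int v)
{
  return (v > 0) - (v < 0);
}

/* Hypothetical helper, not part of the patch: return 1 when strncmp
   with the largest possible bound agrees in sign with strcmp.
   Both arguments must be NUL-terminated.  */
static int
overflow_bound_agrees (const char *a, const char *b)
{
  /* volatile so the compiler cannot see the constant bound and warn
     about it; the call itself is well defined because comparison
     stops at the first NUL.  */
  volatile size_t n = SIZE_MAX;
  return sign_of (strncmp (a, b, n)) == sign_of (strcmp (a, b));
}

/* Hypothetical self-check covering equal, differing and
   prefix-related strings.  */
static void
self_check (void)
{
  assert (overflow_bound_agrees ("abc", "abc"));
  assert (overflow_bound_agrees ("abc", "abd"));
  assert (overflow_bound_agrees ("abd", "abc"));
  assert (overflow_bound_agrees ("", "a"));
  assert (overflow_bound_agrees ("ab", "abc"));
}
```

This only checks the C-level semantics. It does not exercise the RTM abort behavior, which needs the do_test_1 harness used by function_overflow in the patch.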