From mboxrd@z Thu Jan 1 00:00:00 1970
From: "H.J. Lu"
Date: Tue, 15 Feb 2022 08:59:56 -0800
Subject: Re: [PATCH v1] x86: Fallback {str|wcs}cmp RTM in the ncmp overflow case [BZ #28896]
To: Noah Goldstein
Cc: GNU C Library, "Carlos O'Donell"
References: <20220215162751.281955-1-goldstein.w.n@gmail.com>
List-Id: Libc-alpha mailing list

On Tue, Feb 15, 2022 at 8:51 AM Noah Goldstein wrote:
>
> On Tue, Feb 15, 2022 at 10:30 AM H.J. Lu wrote:
> >
> > On Tue, Feb 15, 2022 at 8:28 AM Noah Goldstein wrote:
> > >
> > > In the overflow fallback strncmp-avx2-rtm and wcsncmp-avx2-rtm would
> > > call strcmp-avx2 and wcscmp-avx2 respectively. These would have
> > > no checks around vzeroupper and would trigger spurious
> > > aborts. This commit fixes that.
> >
> > Include a testcase?
> Added test case in V2. Don't have the hardware to check it though,
> can you?

Yes, I can. Please put V2 on a branch in gitlab.

Thanks.

> > >
> > > test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all
> > > pass. Note: not tested on a machine that supports RTM (none
> > > available).
> > > ---
> > >  sysdeps/x86_64/multiarch/strcmp-avx2.S      | 8 ++------
> > >  sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S | 1 +
> > >  sysdeps/x86_64/multiarch/strncmp-avx2.S     | 1 +
> > >  sysdeps/x86_64/multiarch/wcsncmp-avx2-rtm.S | 2 +-
> > >  sysdeps/x86_64/multiarch/wcsncmp-avx2.S     | 2 +-
> > >  5 files changed, 6 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/sysdeps/x86_64/multiarch/strcmp-avx2.S b/sysdeps/x86_64/multiarch/strcmp-avx2.S
> > > index 07a5a2c889..52ff5ad724 100644
> > > --- a/sysdeps/x86_64/multiarch/strcmp-avx2.S
> > > +++ b/sysdeps/x86_64/multiarch/strcmp-avx2.S
> > > @@ -193,10 +193,10 @@ L(ret_zero):
> > >         .p2align 4,, 5
> > >  L(one_or_less):
> > >         jb      L(ret_zero)
> > > -# ifdef USE_AS_WCSCMP
> > >         /* 'nbe' covers the case where length is negative (large
> > >            unsigned).  */
> > > -       jnbe    __wcscmp_avx2
> > > +       jnbe    OVERFLOW_STRCMP
> > > +# ifdef USE_AS_WCSCMP
> > >         movl    (%rdi), %edx
> > >         xorl    %eax, %eax
> > >         cmpl    (%rsi), %edx
> > > @@ -205,10 +205,6 @@ L(one_or_less):
> > >         negl    %eax
> > >         orl     $1, %eax
> > >  # else
> > > -       /* 'nbe' covers the case where length is negative (large
> > > -          unsigned).  */
> > > -
> > > -       jnbe    __strcmp_avx2
> > >         movzbl  (%rdi), %eax
> > >         movzbl  (%rsi), %ecx
> > >         subl    %ecx, %eax
> > > diff --git a/sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S b/sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S
> > > index 37d1224bb9..68bad365ba 100644
> > > --- a/sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S
> > > +++ b/sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S
> > > @@ -1,3 +1,4 @@
> > >  #define STRCMP __strncmp_avx2_rtm
> > >  #define USE_AS_STRNCMP 1
> > > +#define OVERFLOW_STRCMP __strcmp_avx2_rtm
> > >  #include "strcmp-avx2-rtm.S"
> > > diff --git a/sysdeps/x86_64/multiarch/strncmp-avx2.S b/sysdeps/x86_64/multiarch/strncmp-avx2.S
> > > index 1678bcc235..f138e9f1fd 100644
> > > --- a/sysdeps/x86_64/multiarch/strncmp-avx2.S
> > > +++ b/sysdeps/x86_64/multiarch/strncmp-avx2.S
> > > @@ -1,3 +1,4 @@
> > >  #define STRCMP __strncmp_avx2
> > >  #define USE_AS_STRNCMP 1
> > > +#define OVERFLOW_STRCMP __strcmp_avx2
> > >  #include "strcmp-avx2.S"
> > > diff --git a/sysdeps/x86_64/multiarch/wcsncmp-avx2-rtm.S b/sysdeps/x86_64/multiarch/wcsncmp-avx2-rtm.S
> > > index 4e88c70cc6..f467582cbe 100644
> > > --- a/sysdeps/x86_64/multiarch/wcsncmp-avx2-rtm.S
> > > +++ b/sysdeps/x86_64/multiarch/wcsncmp-avx2-rtm.S
> > > @@ -1,5 +1,5 @@
> > >  #define STRCMP __wcsncmp_avx2_rtm
> > >  #define USE_AS_STRNCMP 1
> > >  #define USE_AS_WCSCMP 1
> > > -
> > > +#define OVERFLOW_STRCMP __wcscmp_avx2_rtm
> > >  #include "strcmp-avx2-rtm.S"
> > > diff --git a/sysdeps/x86_64/multiarch/wcsncmp-avx2.S b/sysdeps/x86_64/multiarch/wcsncmp-avx2.S
> > > index 4fa1de4d3f..e9ede522b8 100644
> > > --- a/sysdeps/x86_64/multiarch/wcsncmp-avx2.S
> > > +++ b/sysdeps/x86_64/multiarch/wcsncmp-avx2.S
> > > @@ -1,5 +1,5 @@
> > >  #define STRCMP __wcsncmp_avx2
> > >  #define USE_AS_STRNCMP 1
> > >  #define USE_AS_WCSCMP 1
> > > -
> > > +#define OVERFLOW_STRCMP __wcscmp_avx2
> > >  #include "strcmp-avx2.S"
> > > --
> > > 2.25.1
> > >
> >
> >
> > --
> > H.J.

--
H.J.
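
For readers following the thread, here is a minimal C sketch of the functional property the overflow fallback has to preserve: when the length passed to strncmp is so large that it looks negative as a signed value, the result should agree in sign with plain strcmp. This is not the testcase from V2. The helper names (sign_of, overflow_len_matches_strcmp) are hypothetical and not part of glibc. It only checks return values; it does not run inside an RTM transaction, so it cannot by itself detect the spurious aborts described above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reduce a comparison result to -1, 0 or 1, since strcmp and strncmp
   only promise the sign of the result, not its magnitude.  */
static int
sign_of (int r)
{
  return (r > 0) - (r < 0);
}

/* Hypothetical helper (an assumption for illustration, not glibc code):
   return 1 if strncmp with a length that is "negative" when viewed as
   signed agrees in sign with strcmp on the same inputs.  */
static int
overflow_len_matches_strcmp (const char *a, const char *b)
{
  /* volatile keeps the compiler from folding the call at build time,
     so the library routine is actually invoked.  */
  volatile size_t n = SIZE_MAX;
  return sign_of (strncmp (a, b, n)) == sign_of (strcmp (a, b));
}
```

Calling overflow_len_matches_strcmp ("abc", "abd") is expected to return 1 on a conforming implementation. Exercising the RTM variants specifically would additionally require RTM-capable hardware and a call made inside a transaction.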