From: Noah Goldstein
Date: Tue, 15 Feb 2022 11:06:44 -0600
Subject: Re: [PATCH v1] x86: Fallback {str|wcs}cmp RTM in the ncmp overflow case [BZ #28896]
To: "H.J. Lu"
Cc: GNU C Library, "Carlos O'Donell"
References: <20220215162751.281955-1-goldstein.w.n@gmail.com>

On Tue, Feb 15, 2022 at 11:00 AM H.J. Lu wrote:
>
> On Tue, Feb 15, 2022 at 8:51 AM Noah Goldstein wrote:
> >
> > On Tue, Feb 15, 2022 at 10:30 AM H.J. Lu wrote:
> > >
> > > On Tue, Feb 15, 2022 at 8:28 AM Noah Goldstein wrote:
> > > >
> > > > In the overflow fallback strncmp-avx2-rtm and wcsncmp-avx2-rtm would
> > > > call strcmp-avx2 and wcscmp-avx2 respectively. These would have
> > > > no checks around vzeroupper and would trigger spurious
> > > > aborts. This commit fixes that.
> > >
> > > Include a testcase?
> > Added test case in V2. Don't have the hardware to check it though,
> > can you?
>
> Yes, I can. Please put V2 on a branch in gitlab.

https://gitlab.com/x86-glibc/glibc/-/commits/users/goldsteinn/strncmp-rtm-test

>
> Thanks.
> > > >
> > > > test-strcmp, test-strncmp, test-wcscmp, and test-wcsncmp all
> > > > pass. Note not tested on a machine that supports RTM (none
> > > > available).
> > > > ---
> > > >  sysdeps/x86_64/multiarch/strcmp-avx2.S      | 8 ++------
> > > >  sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S | 1 +
> > > >  sysdeps/x86_64/multiarch/strncmp-avx2.S     | 1 +
> > > >  sysdeps/x86_64/multiarch/wcsncmp-avx2-rtm.S | 2 +-
> > > >  sysdeps/x86_64/multiarch/wcsncmp-avx2.S     | 2 +-
> > > >  5 files changed, 6 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/sysdeps/x86_64/multiarch/strcmp-avx2.S b/sysdeps/x86_64/multiarch/strcmp-avx2.S
> > > > index 07a5a2c889..52ff5ad724 100644
> > > > --- a/sysdeps/x86_64/multiarch/strcmp-avx2.S
> > > > +++ b/sysdeps/x86_64/multiarch/strcmp-avx2.S
> > > > @@ -193,10 +193,10 @@ L(ret_zero):
> > > >         .p2align 4,, 5
> > > >  L(one_or_less):
> > > >         jb      L(ret_zero)
> > > > -# ifdef USE_AS_WCSCMP
> > > >         /* 'nbe' covers the case where length is negative (large
> > > >            unsigned). */
> > > > -       jnbe    __wcscmp_avx2
> > > > +       jnbe    OVERFLOW_STRCMP
> > > > +# ifdef USE_AS_WCSCMP
> > > >         movl    (%rdi), %edx
> > > >         xorl    %eax, %eax
> > > >         cmpl    (%rsi), %edx
> > > > @@ -205,10 +205,6 @@ L(one_or_less):
> > > >         negl    %eax
> > > >         orl     $1, %eax
> > > >  # else
> > > > -       /* 'nbe' covers the case where length is negative (large
> > > > -          unsigned). */
> > > > -
> > > > -       jnbe    __strcmp_avx2
> > > >         movzbl  (%rdi), %eax
> > > >         movzbl  (%rsi), %ecx
> > > >         subl    %ecx, %eax
> > > > diff --git a/sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S b/sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S
> > > > index 37d1224bb9..68bad365ba 100644
> > > > --- a/sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S
> > > > +++ b/sysdeps/x86_64/multiarch/strncmp-avx2-rtm.S
> > > > @@ -1,3 +1,4 @@
> > > >  #define STRCMP __strncmp_avx2_rtm
> > > >  #define USE_AS_STRNCMP 1
> > > > +#define OVERFLOW_STRCMP __strcmp_avx2_rtm
> > > >  #include "strcmp-avx2-rtm.S"
> > > > diff --git a/sysdeps/x86_64/multiarch/strncmp-avx2.S b/sysdeps/x86_64/multiarch/strncmp-avx2.S
> > > > index 1678bcc235..f138e9f1fd 100644
> > > > --- a/sysdeps/x86_64/multiarch/strncmp-avx2.S
> > > > +++ b/sysdeps/x86_64/multiarch/strncmp-avx2.S
> > > > @@ -1,3 +1,4 @@
> > > >  #define STRCMP __strncmp_avx2
> > > >  #define USE_AS_STRNCMP 1
> > > > +#define OVERFLOW_STRCMP __strcmp_avx2
> > > >  #include "strcmp-avx2.S"
> > > > diff --git a/sysdeps/x86_64/multiarch/wcsncmp-avx2-rtm.S b/sysdeps/x86_64/multiarch/wcsncmp-avx2-rtm.S
> > > > index 4e88c70cc6..f467582cbe 100644
> > > > --- a/sysdeps/x86_64/multiarch/wcsncmp-avx2-rtm.S
> > > > +++ b/sysdeps/x86_64/multiarch/wcsncmp-avx2-rtm.S
> > > > @@ -1,5 +1,5 @@
> > > >  #define STRCMP __wcsncmp_avx2_rtm
> > > >  #define USE_AS_STRNCMP 1
> > > >  #define USE_AS_WCSCMP 1
> > > > -
> > > > +#define OVERFLOW_STRCMP __wcscmp_avx2_rtm
> > > >  #include "strcmp-avx2-rtm.S"
> > > > diff --git a/sysdeps/x86_64/multiarch/wcsncmp-avx2.S b/sysdeps/x86_64/multiarch/wcsncmp-avx2.S
> > > > index 4fa1de4d3f..e9ede522b8 100644
> > > > --- a/sysdeps/x86_64/multiarch/wcsncmp-avx2.S
> > > > +++ b/sysdeps/x86_64/multiarch/wcsncmp-avx2.S
> > > > @@ -1,5 +1,5 @@
> > > >  #define STRCMP __wcsncmp_avx2
> > > >  #define USE_AS_STRNCMP 1
> > > >  #define USE_AS_WCSCMP 1
> > > > -
> > > > +#define OVERFLOW_STRCMP __wcscmp_avx2
> > > >  #include "strcmp-avx2.S"
> > > > --
> > > > 2.25.1
> > > >
> > >
> > >
> > > --
> > > H.J.
> >
>
> --
> H.J.
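
For anyone following the thread, here is a rough sketch of the functional side of the overflow case the patch touches: with a huge length, strncmp/wcsncmp should give the same answer as strcmp/wcscmp. The helper names (sign_of, strncmp_matches_strcmp, wcsncmp_matches_wcscmp, check_huge_bounds) are made up for illustration and are not taken from glibc's test suite or from the V2 testcase. It does not run inside an RTM transaction, so it cannot observe the spurious abort itself, and it does not claim to hit the exact L(one_or_less) path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <wchar.h>

/* Reduce a comparison result to -1, 0, or 1 so only the sign is compared. */
static int sign_of(int x)
{
    return (x > 0) - (x < 0);
}

/* Hypothetical helper: 1 if strncmp with bound n agrees in sign with strcmp.
   Only meaningful when n is larger than both string lengths. */
static int strncmp_matches_strcmp(const char *a, const char *b, size_t n)
{
    return sign_of(strncmp(a, b, n)) == sign_of(strcmp(a, b));
}

/* Hypothetical helper: the same check for the wide-character variants. */
static int wcsncmp_matches_wcscmp(const wchar_t *a, const wchar_t *b, size_t n)
{
    return sign_of(wcsncmp(a, b, n)) == sign_of(wcscmp(a, b));
}

/* Hypothetical driver: asserts on a few huge bounds and returns 1. */
static int check_huge_bounds(void)
{
    /* volatile keeps the compiler from folding the calls at build time,
       so the library implementations are what actually run. */
    volatile size_t bounds[] = {
        SIZE_MAX,                       /* largest possible length */
        SIZE_MAX / 2 + 1,               /* negative if read as signed */
        SIZE_MAX / sizeof(wchar_t) + 1  /* byte count would wrap for wcsncmp */
    };

    for (size_t i = 0; i < sizeof bounds / sizeof bounds[0]; i++) {
        size_t n = bounds[i];

        assert(strncmp_matches_strcmp("abc", "abc", n));
        assert(strncmp_matches_strcmp("abc", "abd", n));
        assert(strncmp_matches_strcmp("ab", "abc", n));

        assert(wcsncmp_matches_wcscmp(L"abc", L"abc", n));
        assert(wcsncmp_matches_wcscmp(L"abc", L"abd", n));
        assert(wcsncmp_matches_wcscmp(L"ab", L"abc", n));
    }
    return 1;
}
```

Calling check_huge_bounds() from main is enough to exercise it. Checking the vzeroupper issue itself would presumably need the calls wrapped in a transaction on RTM-capable hardware, which this sketch leaves out.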