From: Sunil Pandey
Date: Wed, 13 Jul 2022 19:54:14 -0700
Subject: Re: [PATCH v1 3/3] x86: Add sse42 implementation to strcmp's ifunc
To: "H.J. Lu"
Cc: Noah Goldstein, GNU C Library
References: <20220615002533.1741934-1-goldstein.w.n@gmail.com> <20220615002533.1741934-3-goldstein.w.n@gmail.com>
List-Id: Libc-alpha mailing list

On Tue, Jun 14, 2022 at 6:09 PM H.J. Lu via Libc-alpha wrote:
>
> On Tue, Jun 14, 2022 at 5:25 PM Noah Goldstein wrote:
> >
> > This has been missing since the ifuncs were added.
> >
> > The performance of SSE4.2 is preferable to SSE2.
> >
> > Measured on Tigerlake with N = 20 runs.
> > Geometric Mean of all benchmarks SSE4.2 / SSE2: 0.906
> > ---
> >  sysdeps/x86_64/multiarch/strcmp.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/sysdeps/x86_64/multiarch/strcmp.c b/sysdeps/x86_64/multiarch/strcmp.c
> > index a248c2a6e6..9c1677724c 100644
> > --- a/sysdeps/x86_64/multiarch/strcmp.c
> > +++ b/sysdeps/x86_64/multiarch/strcmp.c
> > @@ -28,6 +28,7 @@
> >
> >  extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2) attribute_hidden;
> >  extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2_unaligned) attribute_hidden;
> > +extern __typeof (REDIRECT_NAME) OPTIMIZE (sse42) attribute_hidden;
> >  extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2) attribute_hidden;
> >  extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2_rtm) attribute_hidden;
> >  extern __typeof (REDIRECT_NAME) OPTIMIZE (evex) attribute_hidden;
> > @@ -52,6 +53,10 @@ IFUNC_SELECTOR (void)
> >        return OPTIMIZE (avx2);
> >      }
> >
> > +  if (CPU_FEATURE_USABLE_P (cpu_features, SSE4_2)
> > +      && !CPU_FEATURES_ARCH_P (cpu_features, Slow_SSE4_2))
> > +    return OPTIMIZE (sse42);
> > +
> >    if (CPU_FEATURES_ARCH_P (cpu_features, Fast_Unaligned_Load))
> >      return OPTIMIZE (sse2_unaligned);
> >
> > --
> > 2.34.1
> >
>
> LGTM.
>
> Thanks.
>
> --
> H.J.

I would like to backport this patch to release branches.
Any comments or objections?

--Sunil
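
For anyone reviewing the backport, here is a rough standalone model of the selection order the quoted hunk produces. It is only a sketch, not the glibc code: `struct cpu_flags`, `enum strcmp_impl`, and `select_strcmp_impl` are hypothetical names, the AVX2 condition is collapsed into a single flag because the diff elides it, and the final fallback to the sse2 variant is an assumption since the end of the selector is not shown in the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the cpu_features bits the selector consults. */
struct cpu_flags
{
  bool avx2_path_ok;        /* Simplification: real AVX2/EVEX checks are elided in the diff. */
  bool sse4_2_usable;       /* Models CPU_FEATURE_USABLE_P (cpu_features, SSE4_2). */
  bool slow_sse4_2;         /* Models CPU_FEATURES_ARCH_P (cpu_features, Slow_SSE4_2). */
  bool fast_unaligned_load; /* Models CPU_FEATURES_ARCH_P (cpu_features, Fast_Unaligned_Load). */
};

/* Hypothetical tags for the implementations named in the diff. */
enum strcmp_impl
{
  IMPL_AVX2,
  IMPL_SSE42,
  IMPL_SSE2_UNALIGNED,
  IMPL_SSE2
};

/* Pick an implementation in the same priority order as the patched selector:
   the wider-vector path first, then the new sse42 check, then sse2_unaligned.  */
enum strcmp_impl
select_strcmp_impl (const struct cpu_flags *f)
{
  assert (f != NULL);

  if (f->avx2_path_ok)
    return IMPL_AVX2;

  /* New in the patch: prefer sse42 unless the CPU is flagged Slow_SSE4_2. */
  if (f->sse4_2_usable && !f->slow_sse4_2)
    return IMPL_SSE42;

  if (f->fast_unaligned_load)
    return IMPL_SSE2_UNALIGNED;

  /* Assumed fallback; the tail of the selector is outside the quoted hunk. */
  return IMPL_SSE2;
}
```

The point worth noting for the backport is that the new check sits ahead of the Fast_Unaligned_Load check, so a CPU with usable SSE4.2 that previously got sse2_unaligned now gets sse42 unless Slow_SSE4_2 is set.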