From: "H.J. Lu"
Date: Tue, 5 Jul 2022 08:41:05 -0700
Subject: Re: [PATCH v7 1/2] x86: Add comment explaining no Slow_SSE4_2 check in ifunc-sse4_2
To: Noah Goldstein
Cc: GNU C Library, "Carlos O'Donell"
References: <20220628152717.17838-1-goldstein.w.n@gmail.com> <20220704042807.3863553-1-goldstein.w.n@gmail.com>
In-Reply-To: <20220704042807.3863553-1-goldstein.w.n@gmail.com>

On Sun, Jul 3, 2022 at 9:28 PM Noah Goldstein wrote:
>
> Just for clarities sake and so that if a future implementation is
> added we remember to add the check.
> ---
>  sysdeps/x86_64/multiarch/ifunc-sse4_2.h | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/sysdeps/x86_64/multiarch/ifunc-sse4_2.h b/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
> index ee36525bcf..f8b56936ec 100644
> --- a/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
> +++ b/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
> @@ -27,6 +27,12 @@ IFUNC_SELECTOR (void)
>  {
>    const struct cpu_features* cpu_features = __get_cpu_features ();
>
> +  /* This function uses the `pcmpstri` sse4.2 instruction which can be
> +     slow on some CPUs.  This normally would be guarded by a
> +     Slow_SSE4_2 check, but since there is no other optimized
> +     implementation its best to keep it regardless.  If an optimized
> +     fallback is added add a X86_ISA_CPU_FEATURE_ARCH_P (cpu_features,
> +     Slow_SSE4_2) check.  */
>    if (CPU_FEATURE_USABLE_P (cpu_features, SSE4_2))
>      return OPTIMIZE (sse42);
>
> --
> 2.34.1
>

LGTM.

Thanks.

--
H.J.
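
For readers following the thread, the selection logic the new comment describes can be sketched roughly as below. This is only an illustration under stated assumptions, not the glibc code: `struct demo_cpu_features`, its two fields, both function names, and the returned labels (including "generic" and "fallback") are hypothetical stand-ins for `struct cpu_features`, the `CPU_FEATURE_USABLE_P` / `X86_ISA_CPU_FEATURE_ARCH_P` macros, and the `OPTIMIZE (...)` targets.

```c
#include <stdbool.h>

/* Hypothetical stand-in for glibc's struct cpu_features.  The field
   names are illustrative only and do not match the real layout.  */
struct demo_cpu_features
{
  bool sse4_2_usable;  /* Models CPU_FEATURE_USABLE_P (..., SSE4_2).  */
  bool slow_sse4_2;    /* Models a Slow_SSE4_2 preference bit.  */
};

/* Shape of the selector as patched: SSE4.2 is chosen whenever it is
   usable, even on CPUs where pcmpstri is slow, because the only other
   choice is assumed here to be an unoptimized generic version.  */
const char *
demo_select_current (const struct demo_cpu_features *f)
{
  if (f->sse4_2_usable)
    return "sse42";
  return "generic";
}

/* Shape the comment asks for if an optimized fallback is ever added:
   the slow-SSE4.2 bit then steers slow CPUs to that fallback.  The
   fallback is assumed to need no extra feature check.  */
const char *
demo_select_with_fallback (const struct demo_cpu_features *f)
{
  if (f->sse4_2_usable && !f->slow_sse4_2)
    return "sse42";
  return "fallback";
}
```

The two functions differ only for a CPU that has SSE4.2 but is marked slow: the first still picks the SSE4.2 version, the second would not.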