References: <20221003195944.3274548-1-aurelien@aurel32.net> <20221003195944.3274548-6-aurelien@aurel32.net>
In-Reply-To: <20221003195944.3274548-6-aurelien@aurel32.net>
From: Noah Goldstein
Date: Mon, 3 Oct 2022 14:11:49 -0700
Subject: Re: [PATCH v3 5/8] x86-64: Require BMI2 for AVX2 wcs(n)cmp implementations
To: Aurelien Jarno
Cc: libc-alpha@sourceware.org, "H . J . Lu", Sunil K Pandey

On Mon, Oct 3, 2022 at 12:59 PM Aurelien Jarno wrote:
>
> The AVX2 wcs(n)cmp implementations use the 'bzhi' instruction, which
> belongs to the BMI2 CPU feature.
>
> NB: It also uses the 'tzcnt' BMI1 instruction, but it is executed as BSF
> if the CPU doesn't support TZCNT, and produces the same result
> for non-zero input.
>
> Partially fixes: b77b06e0e296 ("x86: Optimize strcmp-avx2.S")
> Partially resolves: BZ #29611
> ---
>  sysdeps/x86_64/multiarch/ifunc-impl-list.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> index aebef3daaf..fec8790c11 100644
> --- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> +++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
> @@ -810,10 +810,12 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>                                       && CPU_FEATURE_USABLE (BMI2)),
>                                      __wcscmp_evex)
>               X86_IFUNC_IMPL_ADD_V3 (array, i, wcscmp,
> -                                    CPU_FEATURE_USABLE (AVX2),
> +                                    (CPU_FEATURE_USABLE (AVX2)
> +                                     && CPU_FEATURE_USABLE (BMI2)),
>                                      __wcscmp_avx2)
>               X86_IFUNC_IMPL_ADD_V3 (array, i, wcscmp,
>                                      (CPU_FEATURE_USABLE (AVX2)
> +                                     && CPU_FEATURE_USABLE (BMI2)
>                                       && CPU_FEATURE_USABLE (RTM)),
>                                      __wcscmp_avx2_rtm)
>               /* ISA V2 wrapper for SSE2 implementation because the SSE2
> @@ -830,10 +832,12 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>                                       && CPU_FEATURE_USABLE (BMI2)),
>                                      __wcsncmp_evex)
>               X86_IFUNC_IMPL_ADD_V3 (array, i, wcsncmp,
> -                                    CPU_FEATURE_USABLE (AVX2),
> +                                    (CPU_FEATURE_USABLE (AVX2)
> +                                     && CPU_FEATURE_USABLE (BMI2)),
>                                      __wcsncmp_avx2)
>               X86_IFUNC_IMPL_ADD_V3 (array, i, wcsncmp,
>                                      (CPU_FEATURE_USABLE (AVX2)
> +                                     && CPU_FEATURE_USABLE (BMI2)
>                                       && CPU_FEATURE_USABLE (RTM)),
>                                      __wcsncmp_avx2_rtm)
>               /* ISA V2 wrapper for GENERIC implementation because the
> --
> 2.35.1
>

LGTM.

Reviewed-by: Noah Goldstein
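
For readers following the NB in the quoted commit message, here is a rough,
portable C model of the instruction semantics involved. It is only a sketch
of the documented behaviour on 32-bit operands, not code from glibc; the
names model_tzcnt32, model_bsf32, model_bzhi32 and model_self_check are made
up for illustration. It shows why falling back to BSF is harmless for
non-zero input, while 'bzhi' has no such fallback and so needs the BMI2
check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of TZCNT: number of trailing zero bits, and the
   operand size (32) when the input is zero.  */
unsigned int
model_tzcnt32 (uint32_t x)
{
  unsigned int n = 0;
  if (x == 0)
    return 32;
  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Hypothetical model of BSF: index of the lowest set bit.  The real
   instruction leaves the destination undefined for zero input, which
   this model reports by clearing *DEFINED.  */
unsigned int
model_bsf32 (uint32_t x, int *defined)
{
  unsigned int n = 0;
  *defined = (x != 0);
  if (x == 0)
    return 0;
  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Hypothetical model of BZHI: clear the bits of SRC from position
   IDX[7:0] upward; SRC is returned unchanged when that position is
   not below the operand size.  */
uint32_t
model_bzhi32 (uint32_t src, uint32_t idx)
{
  unsigned int n = idx & 0xff;
  if (n >= 32)
    return src;
  return src & ((UINT32_C (1) << n) - 1);
}

/* Check that the two bit-scan models agree on every single-bit and
   low-bits-set non-zero input.  */
void
model_self_check (void)
{
  for (unsigned int bit = 0; bit < 32; bit++)
    {
      int defined;
      uint32_t one_bit = UINT32_C (1) << bit;
      uint32_t upper_bits = UINT32_MAX << bit;
      assert (model_bsf32 (one_bit, &defined) == model_tzcnt32 (one_bit));
      assert (defined);
      assert (model_bsf32 (upper_bits, &defined)
              == model_tzcnt32 (upper_bits));
    }
}
```

The two bit-scan models differ only for a zero input, which matches the
commit message's claim that the BSF fallback is safe as long as the input is
known to be non-zero.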