From: Florian Weimer
To: Florian Weimer
Cc: Noah Goldstein, libc-alpha@sourceware.org, hjl.tools@gmail.com, carlos@systemhalted.org, Sunil Pandey
Subject: Re: x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
References: <20230921143837.2903914-1-goldstein.w.n@gmail.com> <20231004184855.3517478-1-goldstein.w.n@gmail.com>
 <87fs28gug9.fsf@oldenburg.str.redhat.com>
Date: Wed, 01 Nov 2023 22:04:44 +0100
In-Reply-To: <87fs28gug9.fsf@oldenburg.str.redhat.com> (Florian Weimer's message of "Wed, 18 Oct 2023 11:18:14 +0200")
Message-ID: <877cn1tcb7.fsf@mid.deneb.enyo.de>

* Florian Weimer:

> * Noah Goldstein:
>
>> This commit refactors `strrchr-evex` and `strrchr-evex512` to use a
>> common implementation: `strrchr-evex-base.S`.
>>
>> The motivation is `strrchr-evex` needed to be refactored to not use
>> 64-bit masked registers in preparation for AVX10.
>>
>> Once vec-width masked register combining was removed, the EVEX and
>> EVEX512 implementations can easily be implemented in the same file
>> without any major overhead.
>>
>> The net result is performance improvements (measured on TGL) for both
>> `strrchr-evex` and `strrchr-evex512`. Although, note there are some
>> regressions in the test suite and it may be many of the cases that
>> make the total-geomean of improvement/regression across bench-strrchr
>> are cold. The point of the performance measurement is to show there
>> are no major regressions, but the primary motivation is preparation
>> for AVX10.
>>
>> Benchmarks were taken on TGL:
>> https://www.intel.com/content/www/us/en/products/sku/213799/intel-core-i711850h-processor-24m-cache-up-to-4-80-ghz/specifications.html
>>
>> EVEX geometric_mean(N=5) of all benchmarks New / Original : 0.74
>> EVEX512 geometric_mean(N=5) of all benchmarks New / Original: 0.87
>>
>> Full check passes on x86.
>
> I believe this caused some sort of regression because when we upgraded
> glibc in the Fedora rawhide buildroot, a lot of things started failing:
>
>   glibc-2.38.9000-13.fc40 broke rawhide buildroot on x86_64
>
> The list of changes relative to the previous version is rather short:
>
> - stdlib: fix grouping verification with multi-byte thousands separator (bug 30964)
> - build-many-glibcs: Check for required system tools
> - x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
> - aarch64: Optimise vecmath logs
> - aarch64: Cosmetic change in SVE exp routines
> - aarch64: Optimize SVE cos & cosf
> - aarch64: Improve vecmath sin routines
> - nss: Get rid of alloca usage in makedb's write_output.
> - debug: Add regression tests for BZ 30932
> - Fix FORTIFY_SOURCE false positive
> - nss: Rearrange and sort Makefile variables
> - inet: Rearrange and sort Makefile variables
> - Fix off-by-one OOB write in iconv/tst-iconv-mt
>
> And this patch is the most likely one to cause issues.  I will try to
> revert the patch and see if it fixes the observed issues.

We did the revert and the issues were gone.  So I think this commit is
faulty.