From: Noah Goldstein
Date: Wed, 1 Nov 2023 16:11:15 -0500
Subject: Re: x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
To: Florian Weimer
Cc: Florian Weimer, libc-alpha@sourceware.org, hjl.tools@gmail.com, carlos@systemhalted.org, Sunil Pandey
In-Reply-To: <877cn1tcb7.fsf@mid.deneb.enyo.de>

On Wed, Nov 1, 2023 at 4:04 PM Florian Weimer wrote:
>
> * Florian Weimer:
>
> > * Noah Goldstein:
> >
> >> This commit refactors `strrchr-evex` and `strrchr-evex512` to use a
> >> common implementation: `strrchr-evex-base.S`.
> >>
> >> The motivation is that `strrchr-evex` needed to be refactored to not use
> >> 64-bit masked registers in preparation for AVX10.
> >>
> >> Once vec-width masked register combining was removed, the EVEX and
> >> EVEX512 implementations can easily be implemented in the same file
> >> without any major overhead.
> >>
> >> The net result is performance improvements (measured on TGL) for both
> >> `strrchr-evex` and `strrchr-evex512`. Note, however, that there are some
> >> regressions in the test suite, and many of the cases that make up the
> >> total geomean of improvement/regression across bench-strrchr may be
> >> cold. The point of the performance measurement is to show there
> >> are no major regressions, but the primary motivation is preparation
> >> for AVX10.
> >>
> >> Benchmarks were taken on TGL:
> >> https://www.intel.com/content/www/us/en/products/sku/213799/intel-core-i711850h-processor-24m-cache-up-to-4-80-ghz/specifications.html
> >>
> >> EVEX geometric_mean(N=5) of all benchmarks New / Original   : 0.74
> >> EVEX512 geometric_mean(N=5) of all benchmarks New / Original: 0.87
> >>
> >> Full check passes on x86.
> >
> > I believe this caused some sort of regression because when we upgraded
> > glibc in the Fedora rawhide buildroot, a lot of things started failing:
> >
> >   glibc-2.38.9000-13.fc40 broke rawhide buildroot on x86_64
> >
> > The list of changes relative to the previous version is rather short:
> >
> > - stdlib: fix grouping verification with multi-byte thousands separator (bug 30964)
> > - build-many-glibcs: Check for required system tools
> > - x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
> > - aarch64: Optimise vecmath logs
> > - aarch64: Cosmetic change in SVE exp routines
> > - aarch64: Optimize SVE cos & cosf
> > - aarch64: Improve vecmath sin routines
> > - nss: Get rid of alloca usage in makedb's write_output.
> > - debug: Add regression tests for BZ 30932
> > - Fix FORTIFY_SOURCE false positive
> > - nss: Rearrange and sort Makefile variables
> > - inet: Rearrange and sort Makefile variables
> > - Fix off-by-one OOB write in iconv/tst-iconv-mt
> >
> > And this patch is the most likely one to cause issues. I will try to
> > revert the patch and see if it fixes the observed issues.
>
> We did the revert and the issues were gone. So I think this commit is
> faulty.

Bah, didn't see your last email. Thank you for reverting. Will look
into the issue.