From mboxrd@z Thu Jan 1 00:00:00 1970
From: Noah Goldstein
Date: Wed, 1 Nov 2023 16:22:52 -0500
Subject: Re: x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
To: Florian Weimer
Cc: Florian Weimer, libc-alpha@sourceware.org, hjl.tools@gmail.com, carlos@systemhalted.org, Sunil Pandey
References: <20230921143837.2903914-1-goldstein.w.n@gmail.com> <20231004184855.3517478-1-goldstein.w.n@gmail.com> <87fs28gug9.fsf@oldenburg.str.redhat.com> <877cn1tcb7.fsf@mid.deneb.enyo.de>

On Wed, Nov 1, 2023 at 4:11 PM Noah Goldstein wrote:
>
> On Wed, Nov 1, 2023 at 4:04 PM Florian Weimer wrote:
> >
> > * Florian Weimer:
> >
> > > * Noah Goldstein:
> > >
> > >> This commit refactors `strrchr-evex` and `strrchr-evex512` to use a
> > >> common implementation: `strrchr-evex-base.S`.
> > >>
> > >> The motivation is `strrchr-evex` needed to be refactored to not use
> > >> 64-bit masked registers in preparation for AVX10.
> > >>
> > >> Once vec-width masked register combining was removed, the EVEX and
> > >> EVEX512 implementations can easily be implemented in the same file
> > >> without any major overhead.
> > >>
> > >> The net result is performance improvements (measured on TGL) for both
> > >> `strrchr-evex` and `strrchr-evex512`. Although, note there are some
> > >> regressions in the test suite and it may be many of the cases that
> > >> make the total-geomean of improvement/regression across bench-strrchr
> > >> are cold. The point of the performance measurement is to show there
> > >> are no major regressions, but the primary motivation is preparation
> > >> for AVX10.
> > >>
> > >> Benchmarks were taken on TGL:
> > >> https://www.intel.com/content/www/us/en/products/sku/213799/intel-core-i711850h-processor-24m-cache-up-to-4-80-ghz/specifications.html
> > >>
> > >> EVEX geometric_mean(N=5) of all benchmarks New / Original : 0.74
> > >> EVEX512 geometric_mean(N=5) of all benchmarks New / Original: 0.87
> > >>
> > >> Full check passes on x86.
> > >
> > > I believe this caused some sort of regression because when we upgraded
> > > glibc in the Fedora rawhide buildroot, a lot of things started failing:
> > >
> > > glibc-2.38.9000-13.fc40 broke rawhide buildroot on x86_64
> > >
> > >
> > > The list of changes relative to the previous version is rather short:
> > >
> > > - stdlib: fix grouping verification with multi-byte thousands separator (bug 30964)
> > > - build-many-glibcs: Check for required system tools
> > > - x86: Prepare `strrchr-evex` and `strrchr-evex512` for AVX10
> > > - aarch64: Optimise vecmath logs
> > > - aarch64: Cosmetic change in SVE exp routines
> > > - aarch64: Optimize SVE cos & cosf
> > > - aarch64: Improve vecmath sin routines
> > > - nss: Get rid of alloca usage in makedb's write_output.
> > > - debug: Add regression tests for BZ 30932
> > > - Fix FORTIFY_SOURCE false positive
> > > - nss: Rearrange and sort Makefile variables
> > > - inet: Rearrange and sort Makefile variables
> > > - Fix off-by-one OOB write in iconv/tst-iconv-mt
> > >
> > > And this patch is the most likely one to cause issues. I will try to
> > > revert the patch and see if it fixes the observed issues.
> >
> > We did the revert and the issues were gone. So I think this commit is
> > faulty.
>
> Bah, didn't see your last email.
> Thank you for reverting. Will look into the issue.

Okay bug is missing VBMI2 check. But the VBMI2 stuff isn't really needed
so will update and repost w/ fixed ISA.
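
For anyone following along, here is a rough sketch of the kind of check that was missing. It is not the glibc ifunc selector: the feature bits, the enum names, and the exact requirement sets below are all made up for illustration. The only point it models is that an implementation may be selected only when every ISA extension it uses is reported by the CPU, so code that uses VBMI2 must also be gated on VBMI2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits for this sketch.  glibc tracks CPU features
   through its own cpu_features data, not through this enum.  */
enum cpu_feature
{
  FEAT_AVX2        = 1u << 0,
  FEAT_AVX512VL    = 1u << 1,
  FEAT_AVX512BW    = 1u << 2,
  FEAT_BMI2        = 1u << 3,
  FEAT_AVX512VBMI2 = 1u << 4
};

/* Hypothetical implementation tags, ordered from baseline to widest.  */
enum strrchr_impl { IMPL_SSE2, IMPL_AVX2, IMPL_EVEX };

/* True only if every bit in REQUIRED is also set in AVAILABLE.  */
static bool
has_all (uint32_t available, uint32_t required)
{
  return (available & required) == required;
}

/* Pick the widest implementation whose full requirement set is met.
   USES_VBMI2 models whether the EVEX code contains VBMI2 instructions;
   the requirement sets themselves are assumptions for this sketch.  */
static enum strrchr_impl
select_strrchr (uint32_t available, bool uses_vbmi2)
{
  uint32_t evex_required = FEAT_AVX512VL | FEAT_AVX512BW | FEAT_BMI2;
  if (uses_vbmi2)
    evex_required |= FEAT_AVX512VBMI2;

  if (has_all (available, evex_required))
    return IMPL_EVEX;
  if (has_all (available, FEAT_AVX2 | FEAT_BMI2))
    return IMPL_AVX2;
  return IMPL_SSE2;
}
```

With a feature mask that has AVX512VL/BW and BMI2 but no VBMI2, `select_strrchr (mask, true)` falls back to `IMPL_AVX2`, while `select_strrchr (mask, false)` returns `IMPL_EVEX`. That mirrors the two ways out mentioned above: add the check, or drop the VBMI2 dependency so the check is not needed.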