From: Noah Goldstein
Date: Tue, 14 Sep 2021 00:55:07 -0500
Subject: Re: [PATCH 5/5] x86_64: Add evex optimized bcmp implementation in bcmp-evex.S
To: DJ Delorie
Cc: "Carlos O'Donell", GNU C Library
Content-Type: text/plain; charset="UTF-8"
List-Id: Libc-alpha mailing list

On Tue, Sep 14, 2021 at 12:42 AM DJ Delorie wrote:

> Noah Goldstein writes:
> > I'm not 100% sure what you mean? memcmp can correctly implement bcmp
> > but not the vice versa.
>
> glibc does not have a separate implementation of bcmp().  Any calls to
> bcmp() end up calling memcmp() (through that weak alias).  So your patch
> is not *optimizing* bcmp, it is *adding* bcmp.  The new version you are
> adding is no longer using the optimized versions of memcmp, so you'd
> have to either (1) be very careful to not introduce a performance
> regression, or (2) optimize the existing memcmp()s further instead.
>

Ah, got it.

In the first patch of the set:

[PATCH 1/5] x86_64: Add support for bcmp using sse2, sse4_1, avx2, and evex

I have some performance numbers. Seems to be an improvement for avx2/evex.
The sse2/sse4 stuff is a bit more iffy. I don't really have the hardware
to properly test those versions.

Thank you for all the help!
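For anyone following the thread, the asymmetry discussed above (memcmp can stand in for bcmp, but not the other way around) can be sketched in plain C. This is only an illustration, not glibc code and not the code from this patch set. The names sketch_bcmp, bcmp_via_memcmp, and sign_of are hypothetical, and the sketch assumes the usual contracts: memcmp reports the ordering of the first differing byte, while bcmp only has to distinguish zero from nonzero.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not glibc code: an equality-only comparison in
   the spirit of bcmp.  Returns 0 when the buffers match and 1 otherwise.
   It makes no promise about which buffer sorts first.  */
int
sketch_bcmp (const void *a, const void *b, size_t n)
{
  const unsigned char *pa = a;
  const unsigned char *pb = b;
  unsigned char diff = 0;

  /* Accumulate any differing bits; no need to find the first mismatch.  */
  for (size_t i = 0; i < n; i++)
    diff |= pa[i] ^ pb[i];

  return diff != 0;
}

/* Hypothetical wrapper: memcmp already satisfies the bcmp contract,
   because it returns zero exactly when the buffers are equal.  */
int
bcmp_via_memcmp (const void *a, const void *b, size_t n)
{
  return memcmp (a, b, n);
}

/* Hypothetical helper: reduce a memcmp result to -1, 0, or 1, since only
   the sign of the return value is specified.  */
int
sign_of (int v)
{
  return (v > 0) - (v < 0);
}
```

With these definitions, sketch_bcmp ("abc", "abd", 3) and sketch_bcmp ("abd", "abc", 3) return the same nonzero value, so the ordering is lost, whereas sign_of (memcmp (...)) differs between the two calls. That lost ordering is why an equality-only routine has more freedom to optimize but cannot replace memcmp.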