From: Noah Goldstein
Date: Thu, 10 Feb 2022 12:28:48 -0600
Subject: Re: [PATCH v2] x86-64: Optimize bzero
To: Wilco Dijkstra
Cc: Adhemerval Zanella, "H.J. Lu", GNU C Library
References: <20220208224319.40271-1-hjl.tools@gmail.com>
List-Id: Libc-alpha mailing list

On Thu, Feb 10, 2022 at 7:02 AM Wilco Dijkstra wrote:
>
> Hi,
>
> >> The saving is in the lane-cross broadcast which is on the critical
> >> path for memsets in [VEC_SIZE, 2 * VEC_SIZE] (think 32-64).
>
> What is the speedup in e.g. bench-memset? Generally the OoO engine will
> be able to hide a small increase in latency, so I'd be surprised it shows up
> as a significant gain.
>
> If you can show a good speedup in an important application (or benchmark
> like SPEC2017) then it may be worth pursuing. However there are other
> optimization opportunities that may be easier or give a larger benefit.

I very much doubt there is any benefit on SPEC or similar benchmarks unless
the compiler decided to build zeros with long dependency chains instead of xor.

>
> >> Agreed it's not clear if it's worth it to start replacing memset calls with
> >> bzero calls, but at the very least this will improve existing code that
> >> uses bzero.
>
> No code uses bzero, no compiler emits bzero. It died 2 decades ago...
>
> > My point is this is a lot of code and infrastructure for a symbol marked
> > as legacy for POSIX.1-2001 and removed on POSIX.1-2008 for the sake of
> > marginal gains in specific cases.
>
> Indeed, what we really should discuss is how to remove the last traces of
> bcopy and bcmp from GLIBC. Do we need to keep a compatibility symbol
> or could we just get rid of it altogether?
>
> Cheers,
> Wilco