From: Noah Goldstein
Date: Thu, 10 Feb 2022 12:35:05 -0600
Subject: Re: [PATCH v2] x86-64: Optimize bzero
To: Wilco Dijkstra
Cc: Adhemerval Zanella, "H.J. Lu", GNU C Library
References: <20220208224319.40271-1-hjl.tools@gmail.com>
List-Id: Libc-alpha mailing list

On Thu, Feb 10, 2022 at 7:02 AM Wilco Dijkstra wrote:
>
> Hi,
>
> >> The saving is in the lane-cross broadcast which is on the critical
> >> path for memsets in [VEC_SIZE, 2 * VEC_SIZE] (think 32-64).
>
> What is the speedup in eg. bench-memset? Generally the OoO engine will
> be able to hide a small increase in latency, so I'd be surprised it shows up
> as a significant gain.

Well, comparing the previous sse2 bzero against the avx2/evex/avx512
versions, there is obviously a speedup. Comparing memset-${version} vs
bzero-${version}, it's ambiguous whether there is any benefit.

>
> If you can show a good speedup in an important application (or benchmark
> like SPEC2017) then it may be worth pursuing. However there are other
> optimization opportunities that may be easier or give a larger benefit.
>
> >> Agreed it's not clear if it's worth it to start replacing memset calls with
> >> bzero calls, but at the very least this will improve existing code that
> >> uses bzero.
>
> No code uses bzero, no compiler emits bzero. It died 2 decades ago...
>
> > My point is this is a lot of code and infrastructure for a symbol marked
> > as legacy for POSIX.1-2001 and removed on POSIX.1-2008 for the sake of
> > marginal gains in specific cases.
>
> Indeed, what we really should discuss is how to remove the last traces of
> bcopy and bcmp from GLIBC. Do we need to keep a compatibility symbol
> or could we just get rid of it altogether?
>
> Cheers,
> Wilco