public inbox for libc-alpha@sourceware.org
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Patrick McGehearty <patrick.mcgehearty@oracle.com>,
	libc-alpha@sourceware.org
Subject: Re: [PATCH v2] x86-64: Optimize bzero
Date: Fri, 11 Feb 2022 10:01:06 -0300	[thread overview]
Message-ID: <1ea64f9f-6ce8-5409-8b56-02f7481526d9@linaro.org> (raw)
In-Reply-To: <0efdd4fe-4e35-cf1d-5731-13ed1c046cc6@oracle.com>



On 10/02/2022 18:07, Patrick McGehearty via Libc-alpha wrote:
> Just as another point of information, Solaris libc implemented
> bzero as moving arguments around appropriately then jumping to
> memset. No one noticed enough to file a complaint. Of course,
> short fixed-length bzero was handled with inline stores of zero
> by the compiler. For long vector bzeroing, the overhead was
> negligible.
> 
> When certain Sparc hardware implementations provided faster methods
> for zeroing a cache line at a time on cache line boundaries,
> memset added a single test for a zero fill value if and only if the
> length passed to memset was over a threshold that seemed likely to
> make it worthwhile to use the faster method. The principal advantage
> of the fast zeroing operation was that it did not require data
> to move from memory to cache before writing zeros to memory,
> protecting cache locality in the face of large block zeroing.
> I was responsible for much of that optimization effort.
> Whether that optimization was really worth it is open for debate
> for a variety of reasons that I won't go into just now.
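
As a rough illustration, the "jump to memset" approach described above
amounts to a thin forwarding wrapper.  A minimal C sketch of the idea
(illustrative only, not the Solaris or glibc source; an actual libc
does the forwarding at the symbol/assembly level):

#include <string.h>
#include <stddef.h>

/* Hypothetical wrapper: shuffle the arguments and defer to memset.  */
void
my_bzero (void *s, size_t n)
{
  memset (s, 0, n);
}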

Afaik the fast cache-line zeroing is pretty much what optimized memset
implementations do, if the architecture allows it. For instance,
aarch64 uses 'dc zva' for sizes larger than 256 bytes and powerpc uses
dcbz with a similar strategy.
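
To make the strategy concrete, below is a rough aarch64-only sketch of
how a zeroing path can use 'dc zva'.  The names and structure are made
up for illustration and this is not the glibc code; a memset
implementation would only take such a path when the fill value is zero
and the length is above a threshold (e.g. 256 bytes), and a real
implementation would also check the DZP bit of DCZID_EL0 before
relying on DC ZVA:

#include <stddef.h>
#include <stdint.h>

/* Read the DC ZVA block size (in bytes) from DCZID_EL0.  */
static inline size_t
zva_block_size (void)
{
  uint64_t dczid;
  __asm__ ("mrs %0, dczid_el0" : "=r" (dczid));
  return (size_t) 4 << (dczid & 0xf);
}

/* Zero N bytes at P, using DC ZVA for the aligned middle so the data
   does not have to be read into the cache first.  */
static void
zero_with_dc_zva (unsigned char *p, size_t n)
{
  size_t bs = zva_block_size ();

  /* Head: ordinary stores up to the next block boundary.  */
  while (n > 0 && ((uintptr_t) p & (bs - 1)) != 0)
    {
      *p++ = 0;
      n--;
    }

  /* Middle: zero whole blocks without allocating them in the cache.  */
  while (n >= bs)
    {
      __asm__ __volatile__ ("dc zva, %0" : : "r" (p) : "memory");
      p += bs;
      n -= bs;
    }

  /* Tail: ordinary stores.  */
  while (n-- > 0)
    *p++ = 0;
}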

> 
> Apps still used bzero or memset(target,zero,length) according to
> their preferences, but the code was unified under memset.
> 
> I am inclined to agree with keeping bzero in the API for
> compatibility with old code/old binaries/old programmers. :-)

The main driver for removing the internal bzero implementation is that
*currently* gcc just does not generate bzero calls by default
(I couldn't find a single binary that calls bzero on my system).

So to actually see any performance advantage from the optimized
bzero, we would need to revisit the gcc optimization that transforms
it into memset (which would need to be applied on a per-architecture
basis), and it seems highly unlikely the gcc maintainers would accept
that.
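
This is easy to check: compiling something like the snippet below with
a recent gcc at -O2 and inspecting the assembly shows the call being
emitted as memset with a zero fill value rather than as bzero (passing
-fno-builtin-bzero should keep the original call).  The function name
is just for illustration:

#include <stddef.h>
#include <strings.h>

void
clear_buffer (void *p, size_t n)
{
  /* Compiled as memset (p, 0, n) by gcc by default.  */
  bzero (p, n);
}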

Some time ago LLVM tried to do something similar with bcmp, but in the
end it was not a good idea to reuse an already defined symbol and it
ended up with __memcmp_eq instead.

> 
> Using shared memset code for the implementation of bzero
> is worthwhile for reducing future maintenance costs.
> 
> - Patrick McGehearty
> former Sparc/Solaris performance tuning person
> 
> 
> 
> On 2/10/2022 2:42 PM, Adhemerval Zanella via Libc-alpha wrote:
>>
>> On 10/02/2022 17:27, Alejandro Colomar (man-pages) wrote:
>>>> We are discussing different subjects here: what I want is to remove the
>>>> glibc *internal* optimization for bzero, which is essentially an
>>>> implementation detail.  At first glance it would change performance,
>>>> however gcc does a good job of replacing bzero/bcmp/bcopy with their
>>>> mem* counterparts, so it is highly unlikely that newer binaries will
>>>> actually call bzero.
>>> Okay, then yes, go ahead and remove bzero(3) from glibc if GCC will
>>> continue supporting it.  Just remember that some users keep writing and
>>> wanting to write bzero(3) instead of memset(3) in their .c files, so
>>> it's far from being dead in source code.
>> Again, I am not proposing to *remove* bzero, but rather the internal
>> optimizations that currently only add code complexity and maintenance
>> burden.  My patchset [1] will keep the ABI as-is; the difference is
>> that bcopy and bzero will use the default implementation on all
>> architectures.
>>
>> [1] https://patchwork.sourceware.org/project/glibc/list/?series=7243
> 

Thread overview: 39+ messages
2022-02-08 22:43 H.J. Lu
2022-02-08 23:56 ` Noah Goldstein
2022-02-09 11:41 ` Adhemerval Zanella
2022-02-09 22:14   ` Noah Goldstein
2022-02-10 12:35     ` Adhemerval Zanella
2022-02-10 13:01       ` Wilco Dijkstra
2022-02-10 13:10         ` Adhemerval Zanella
2022-02-10 13:16           ` Adhemerval Zanella
2022-02-10 13:17           ` Wilco Dijkstra
2022-02-10 13:22             ` Adhemerval Zanella
2022-02-10 17:50               ` Alejandro Colomar (man-pages)
2022-02-10 19:19                 ` Wilco Dijkstra
2022-02-10 20:27                   ` Alejandro Colomar (man-pages)
2022-02-10 20:42                     ` Adhemerval Zanella
2022-02-10 21:07                       ` Patrick McGehearty
2022-02-11 13:01                         ` Adhemerval Zanella [this message]
2022-02-12 23:46                           ` Noah Goldstein
2022-02-14 12:07                             ` Adhemerval Zanella
2022-02-14 12:41                               ` Noah Goldstein
2022-02-14 14:07                                 ` Adhemerval Zanella
2022-02-14 15:03                                   ` H.J. Lu
2022-05-04  6:35                                     ` Sunil Pandey
2022-05-04 12:52                                       ` Adhemerval Zanella
2022-05-04 14:50                                         ` H.J. Lu
2022-05-04 14:54                                           ` Adhemerval Zanella
2022-02-10 22:00                       ` Alejandro Colomar (man-pages)
2022-02-10 19:42                 ` Adhemerval Zanella
2022-02-10 18:28         ` Noah Goldstein
2022-02-10 18:35         ` Noah Goldstein
2022-02-15 13:38 Wilco Dijkstra
2022-02-23  8:12 ` Noah Goldstein
2022-02-23 12:09   ` Adhemerval Zanella
2022-02-24 13:16   ` Wilco Dijkstra
2022-02-24 15:48     ` H.J. Lu
2022-02-24 22:58     ` Noah Goldstein
2022-02-24 23:21       ` Noah Goldstein
2022-02-25 17:37         ` Noah Goldstein
2022-02-25 13:51       ` Wilco Dijkstra
2022-02-25 17:35         ` Noah Goldstein
