public inbox for libc-alpha@sourceware.org
From: Alejandro Colomar <alx.manpages@gmail.com>
To: Wilco Dijkstra <Wilco.Dijkstra@arm.com>,
	Paul Eggert <eggert@cs.ucla.edu>,
	Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>,
	"linux-man@vger.kernel.org" <linux-man@vger.kernel.org>
Cc: Alejandro Colomar <alx@kernel.org>,
	"libc-alpha@sourceware.org" <libc-alpha@sourceware.org>,
	"G. Branden Robinson" <g.branden.robinson@gmail.com>
Subject: Re: [PATCH] bind.2, mount_setattr.2, openat2.2, perf_event_open.2, pidfd_send_signal.2, recvmmsg.2, seccomp_unotify.2, select_tut.2, sendmmsg.2, set_thread_area.2, sysctl.2, bzero.3, getaddrinfo.3, getaddrinfo_a.3, getutent.3, mbrtowc.3, mbsinit.3, rti...
Date: Fri, 6 Jan 2023 14:49:33 +0100
Message-ID: <4f7b15d2-d7c0-babf-4a14-b4d3311733e1@gmail.com>
In-Reply-To: <PAWPR08MB898218EE6B27DED9CAA12F2C83FB9@PAWPR08MB8982.eurprd08.prod.outlook.com>


Hi Wilco,

On 1/6/23 03:26, Wilco Dijkstra wrote:
> Hi Alex,
> 
>> Many projects redefine those functions themselves, with alternative names, so
>> it's hard to really count how much is the intention of projects to use it,
>> rather than actual use.  Since the standards don't guarantee such functions,
>> projects that care a lot, use a portable name (one that isn't reserved;
>> sometimes they don't even know that there's a GNU extension with that name and
>> use a weird one, such as cpymem() by nginx).
> 
> Yeah portability is a big issue with these non-standard functions. So even if you
> aren't considering the large cost of supporting these functions in C libraries, there
> are also costs in making applications portable, precisely because not all C libraries
> will support it...
> 
>> The thing is that those APIs are better (imagine that they were all standard,
>> and were all equally known by programmers; which ones would you use?).  Some
>> programmers will want to use the better APIs, independently of libc providing it
>> or not.  In some cases, for high performance programs, good APIs are even more
>> relevant.  Not implementing them in libc, will only mean that projects will roll
>> their own.
> 
> No, the use of non-standard functions is the problem here. bzero was deprecated
> more than 20 years ago, do you think C libraries will add support and optimize it
> even if they never supported it before?

Which C libraries never supported bzero(3)?  It was in POSIX once, so I guess
it's supported everywhere in Unix(-like) systems (you can see that I don't care
at all about other systems).  Even if only for backwards compatibility, the
removal from POSIX will not have affected the portability of bzero(3) in source
code (even where libc has removed it, the compiler will provide support).

> If it's non-standard, it's never going to
> happen.

So, I don't think that's a real problem yet.  As far as I can tell, we're not
yet at a point where bzero(3) is non-portable in source code.

> 
> If we continue with the mempcpy vs memcpy example of nginx, I presume
> nginx implements cpymem() similar to this:
> 
> #if HAVE_MEMPCPY_SUPPORT
>    return mempcpy (p, q, n);
> #else
>    return memcpy (p, q, n) + n;
> #endif
> 
> The define would be set by a special configure check.

Even simpler: it is unconditionally defined to memcpy() + len in a macro.

The reason (I guess) is that they didn't even know that mempcpy() exists.
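
For illustration, a minimal sketch of such an unconditional macro could look
like the following.  The name my_cpymem is hypothetical and this is not copied
from the nginx sources; it only shows the memcpy() + len idea.

```c
#include <string.h>

/*
 * Hypothetical cpymem()-style macro built only on standard memcpy(3).
 * It evaluates to a pointer just past the last byte written, which is
 * what mempcpy(3) returns.  The cast is needed because arithmetic on
 * void * is not standard C.  Caveat: n is evaluated twice, so it must
 * not have side effects.
 */
#define my_cpymem(dst, src, n)  ((char *) memcpy((dst), (src), (n)) + (n))
```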

> 
> Now if nginx got say 10% faster from using mempcpy then that would
> be great and it would be worth the trouble. However there is no difference
> since compilers typically generate identical code for these cases.

Actually, gcc optimizes differently.  When you call mempcpy(3), since gcc knows
that the function exists, it either calls it or replaces it by memcpy(3)+len,
depending on optimization flags.  When you call memcpy(3)+len, since gcc doesn't
know if mempcpy(3) is available, it always keeps the memcpy(3) call.

> So what's
> the point of mempcpy exactly?

The point of mempcpy(3) is that it's the simplest libc API to catenate strings 
when you know the length of the source strings, and you also want to know the 
length of the resulting string, and you know there will be no truncation.

Example:

src/nxt_h1proto.c:2287:    p = nxt_cpymem(p, r->method->start, r->method->length);
src/nxt_h1proto.c-2288-    *p++ = ' ';
src/nxt_h1proto.c:2289:    p = nxt_cpymem(p, r->target.start, r->target.length);
src/nxt_h1proto.c:2290:    p = nxt_cpymem(p, " HTTP/1.1\r\n", 11);
src/nxt_h1proto.c:2291:    p = nxt_cpymem(p, "Connection: close\r\n", 19);


Any other function will either be slower (stpcpy(3) will likely be slower), or 
make the code more complex (memcpy(3) will require adding +... everywhere).
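
As a rough sketch of the pattern above, assuming the caller guarantees that the
buffer is large enough: append_mem() and build_request_line() are hypothetical
names, and append_mem() is only a stand-in with mempcpy(3) semantics so that
the snippet builds without GNU extensions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with mempcpy(3) semantics: copy n bytes and
 * return a pointer just past the last byte written. */
static inline char *
append_mem(char *dst, const void *src, size_t n)
{
	return (char *) memcpy(dst, src, n) + n;
}

/* Write "<method> <target> HTTP/1.1\r\n" into buf and return its length.
 * No terminating NUL is written and no truncation check is done. */
static size_t
build_request_line(char *buf, const char *method, size_t method_len,
                   const char *target, size_t target_len)
{
	char *p = buf;

	p = append_mem(p, method, method_len);
	*p++ = ' ';
	p = append_mem(p, target, target_len);
	p = append_mem(p, " HTTP/1.1\r\n", 11);

	/* The running pointer gives the resulting length for free. */
	return (size_t) (p - buf);
}
```

With plain memcpy(3), each call would need an explicit "+ length" on the
destination to keep track of the write position.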

I'm not saying that this will be significantly faster than memcpy(3), but it
will be at least as fast (and perhaps slightly faster if libc optimized
mempcpy(3), though the difference would be negligible).

> 
> By all means, create your own special copy interface function - it's just sugar.
> But deciding that mempcpy is great and then being forced to do extra work
> to make it portable for no gain is what I find insane...

From a source code point of view, mempcpy(3) and bzero(3) let programmers write
better/simpler source code than memcpy(3) or memset(3).  That's sugar... yes.
IMO, it's worth it.

> 
>> Where do you suggest that we put such function?  In or out of libc?
> 
> Well you mentioned that nginx and many other programs already define their
> own memcpy variants. It's perfectly reasonable to do what you proposed and
> create a library of inline string functions using standard calls as primitives.
> If it is a freely usable and portable, any project that likes it could just add it.

Having it in libc rather than an external library has the benefit that it will 
have support from the compiler (better warnings and optimizations).

But yes, for the time being, I'll keep developing such an external library.
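
A minimal sketch of what one function in such a library might look like,
assuming a configure-time macro HAVE_MEMPCPY (hypothetical, as is the name
xmempcpy) that is only defined when the libc declares mempcpy(3):

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical portable wrapper.  If the build system defined
 * HAVE_MEMPCPY (an assumed macro, together with whatever feature-test
 * macro the libc requires), use the libc function; otherwise fall back
 * to standard memcpy(3) plus pointer arithmetic on char *.
 */
static inline void *
xmempcpy(void *dst, const void *src, size_t n)
{
#if defined(HAVE_MEMPCPY)
	return mempcpy(dst, src, n);
#else
	return (char *) memcpy(dst, src, n) + n;
#endif
}
```

Being a static inline function rather than a macro, it evaluates its arguments
only once.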

> 
> Cheers,
> Wilco

-- 
<http://www.alejandro-colomar.es/>


Thread overview: 11+ messages
2023-01-06  2:26 Wilco Dijkstra
2023-01-06 13:49 ` Alejandro Colomar [this message]
  -- strict thread matches above, loose matches on Subject: below --
2023-01-06 15:53 Wilco Dijkstra
2023-01-06 16:20 ` Alejandro Colomar
2023-01-06 17:01   ` Joseph Myers
2023-01-06  0:02 Wilco Dijkstra
2023-01-06  0:22 ` Alejandro Colomar
2023-01-06  0:57   ` Alejandro Colomar
2023-01-05 19:37 [PATCH] bind.2, mount_setattr.2, openat2.2, perf_event_open.2, pidfd_send_signal.2, recvmmsg.2, seccomp_unotify.2, select_tut.2, sendmmsg.2, set_thread_area.2, sysctl.2, bzero.3, getaddrinfo.3, getaddrinfo_a.3, getutent.3, mbrtowc.3, mbsinit.3, rtime.3, rtnetlink.3, strptime.3, NULL.3const, size_t.3type, void.3type, aio.7, netlink.7, unix.7: Prefer bzero(3) over memset(3) Alejandro Colomar
2023-01-05 20:48 ` Adhemerval Zanella Netto
2023-01-05 20:55   ` Paul Eggert
2023-01-05 21:12     ` [PATCH] bind.2, mount_setattr.2, openat2.2, perf_event_open.2, pidfd_send_signal.2, recvmmsg.2, seccomp_unotify.2, select_tut.2, sendmmsg.2, set_thread_area.2, sysctl.2, bzero.3, getaddrinfo.3, getaddrinfo_a.3, getutent.3, mbrtowc.3, mbsinit.3, rti Wilco Dijkstra
2023-01-05 21:33       ` Alejandro Colomar
2023-01-05 23:30       ` Wilco Dijkstra
