public inbox for libc-alpha@sourceware.org
From: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
To: "Alejandro Colomar (man-pages)" <alx.manpages@gmail.com>
Cc: 'GNU C Library' <libc-alpha@sourceware.org>
Subject: [PATCH 1/1] string: Add stpecpy(3)
Date: Fri, 23 Dec 2022 14:59:20 +0000
Message-ID: <PAWPR08MB8982DF2FE0C57C913536218083E99@PAWPR08MB8982.eurprd08.prod.outlook.com>

Hi Alex,

>> Given strlcpy and strlcat are in POSIX next and therefore bar
>> some extraordinary event will be in glibc, I think we should
>> probably wait until those two land, then see if there's still
>> an appetite for stpecpy in glibc.
>
> I disagree for the following reasons.
>
> strlcpy(3)/strlcat(3) are designed to be _slow_, in exchange for added 
> simplicity and safety.  That's what Theo told me about them.  They didn't care 
> about performance.  The two performance issues are:

We'd need actual benchmarks to confirm there is a detectable performance
difference in typical scenarios. So calling strlcpy slow is premature. Looking
at the proposed stpecpy, it seems it has a lot more branches and special cases
compared to a typical strlcpy, and that adds extra overhead. Using memccpy
is risky too since that is often not optimized.

> -  Traverse the entire input string, to make sure it's a string.  stpecpy(3) 
> instead only reads what's necessary for the copy; it stops reading after truncation.

Almost all strings are short and few cases need truncation, so I don't see the issue.
People concerned about performance wouldn't use the standard string functions
anyway.

> -  strlcat(3) finds the terminating null byte; that's something you already know 
> where it is, with functions that return a useful pointer (mempcpy(3), stpcpy(3), 
> and stpecpy(3)).

If you know the end of the destination string then don't use concatenation. Easy!
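As a rough illustration of that point, here is a minimal sketch that keeps track of the end of the destination string and copies there directly with strlen and memcpy, so the destination is never rescanned. The helper name append_at and its truncate-silently behaviour are assumptions made for this example; it is not taken from the patch or from glibc.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of glibc or the proposed patch).
   `end` points at the current terminating null byte of the destination,
   `limit` points one past the last byte of the destination buffer.
   Copies as much of `src` as fits, always null-terminates, and returns
   the new end of the string. Truncation is silent in this sketch. */
static char *append_at(char *end, const char *limit, const char *src)
{
    size_t room, len;

    if (end >= limit)
        return end;                      /* no space, not even for '\0' */

    room = (size_t)(limit - end) - 1;    /* bytes available before '\0' */
    len = strlen(src);
    if (len > room)
        len = room;

    memcpy(end, src, len);
    end[len] = '\0';
    return end + len;
}
```

A caller would start with `p = buf; *p = '\0';` and then chain `p = append_at(p, buf + sizeof buf, s);` for each piece, without ever calling strlen on the destination.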

In fact compilers could inline the 'dest += strlen (dest)' part of strcat and call
strcpy instead. This allows optimization of the strlen in case you know the size
of the destination string. The same is true for strlcpy: a compiler could just
inline it and optimize the strlen (src).
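To make the decomposition concrete, a sketch along these lines shows strcat written as strlen plus strcpy, and strlcpy-like semantics written as strlen plus memcpy. The function names are hypothetical, and this is only an illustration of what an inlined form could look like, not what any compiler or libc actually emits.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name: strcat expressed as 'dest + strlen (dest)' followed
   by strcpy. Once inlined, the strlen can be folded away whenever the
   length of dest is already known at the call site. */
static char *cat_via_strcpy(char *dest, const char *src)
{
    strcpy(dest + strlen(dest), src);
    return dest;
}

/* Hypothetical name: strlcpy-style copy built from strlen and memcpy.
   Returns strlen (src); truncation happened if the result is >= size. */
static size_t lcpy_via_memcpy(char *dest, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size != 0) {
        size_t n = len < size - 1 ? len : size - 1;
        memcpy(dest, src, n);
        dest[n] = '\0';
    }
    return len;
}
```

If the source is a string literal or its length was computed earlier, the strlen in either function is a candidate for constant folding or reuse once the call is inlined.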

> The reason that triggered me wanting to add this function is seeing strncpy(3) 
> used for a patch to some glibc internals themselves.  Using strlcpy(3)/cat(3) in 
> glibc internals would be bad for performance; I would hope that glibc uses the 
> optimal internals, even if it also provides slow functions for users.

Most internal uses are unlikely to be performance critical, and correctness is kind
of important for libraries.

IMHO inventing many slightly different non-standard string functions is what
causes performance and correctness issues. People disagree about the semantics
(strlcpy has been argued over for a decade or so). Even if a library supports them,
you never know which implementations are actually well optimized (obviously
this varies between ISAs and between libcs). So which non-standard string function
is safe to use across all targets and libraries?

In contrast, we can be pretty sure that the standard strlen, memcpy etc. are both
correct and efficient on all targets/libcs.

Cheers,
Wilco
