From: Alejandro Colomar <alx.manpages@gmail.com>
To: Sam James <sam@gentoo.org>
Cc: Zack Weinberg via Libc-alpha <libc-alpha@sourceware.org>,
Alejandro Colomar <alx@kernel.org>,
Florian Weimer <fweimer@redhat.com>,
Paul Eggert <eggert@cs.ucla.edu>
Subject: Re: [PATCH 1/1] string: Add stpecpy(3)
Date: Fri, 23 Dec 2022 13:26:02 +0100 [thread overview]
Message-ID: <674a24d3-f81b-0201-097e-0bbe9b60d1fd@gmail.com> (raw)
In-Reply-To: <505B9E19-7065-4DBE-A980-3FDA096C6449@gentoo.org>
[-- Attachment #1.1: Type: text/plain, Size: 8478 bytes --]
Hi Sam!
On 12/23/22 08:02, Sam James wrote:
>
>
>> On 22 Dec 2022, at 21:42, Alejandro Colomar via Libc-alpha <libc-alpha@sourceware.org> wrote:
>>
>> Glibc didn't provide any function that copies a string with truncation.
>>
>> It only provided strncpy(3) and stpncpy(3), which copy from a string
>> into a null-padded character sequence at the destination fixed-width
>> buffer, with truncation.
>>
>> Those old functions, which don't produce a string, have been misused for
>> a long time as a poor-man's replacement for strlcpy(3), but doing that
>> is a source of bugs, since it's hard to calculate the right size that
>> should be passed to the function, and it's also necessary to explicitly
>> terminate the buffer with a null byte. Detecting truncation is yet
>> another problem.
>>
>> stpecpy(3), described in the new string_copying(7) manual page, is
>> similar to OpenBSD's strlcpy(3)/strlcat(3), but:
>>
>> - It's simpler to implement.
>> - It's faster.
>> - It's simpler to detect truncation.
>
> Given strlcpy and strlcat are in POSIX next and therefore bar
> some extraordinary event will be in glibc, I think we should
> probably wait until those two land, then see if there's still
> an appetite for stpecpy in glibc.
I disagree for the following reasons.
strlcpy(3)/strlcat(3) are designed to be _slow_, in exchange for added
simplicity and safety. That's what Theo told me about them. They didn't care
about performance. The two performance issues are:
- Traverse the entire input string, to make sure it's a string. stpecpy(3)
instead only reads what's necessary for the copy; it stops reading after truncation.
- strlcat(3) finds the terminating null byte; that's something you already know
where it is, with functions that return a useful pointer (mempcpy(3), stpcpy(3),
and stpecpy(3)).
While there are many programs that may be fine with the OpenBSD functions, glibc
should _also_ provide a way to do the operation with an optimal API. And it's
in line with glibc providing stpcpy(3) and mempcpy(3) as extensions (stpcpy(3)
is now in POSIX).
The reason that triggered me wanting to add this function is seeing strncpy(3)
used for a patch to some glibc internals themselves. Using strlcpy(3)/cat(3) in
glibc internals would be bad for performance; I would hope that glibc uses the
optimal internals, even if it also provides slow functions for users.
There are probably more cases within existing code in glibc. Just check the
output of:
$ grep -rn st.ncpy -A1 | grep -B1 " = '\\\\0'"
Moreover, in the Austin discussion for strlcpy(3)/cat(3), it was mentioned that
strlcpy(3) has an interface identical to that of snprintf(3), and that "If we
truly think that this is bad design, should we come up with a new version of
snprintf() that also doesn't do this? I don't think so.".
Well, I do believe snprintf is also misdesigned, for the same reasons that the
strlcpy(3) manual page states that you should use strlcpy(3) for catenating
strings, but rather strlcat(3):
To detect truncation, perhaps while building a pathname, something like
the following might be used:
char *dir, *file, pname[MAXPATHLEN];
...
if (strlcpy(pname, dir, sizeof(pname)) >= sizeof(pname))
goto toolong;
if (strlcat(pname, file, sizeof(pname)) >= sizeof(pname))
goto toolong;
Since it is known how many characters were copied the first time,
things can be sped up a bit by using a copy instead of an append:
char *dir, *file, pname[MAXPATHLEN];
size_t n;
...
n = strlcpy(pname, dir, sizeof(pname));
if (n >= sizeof(pname))
goto toolong;
if (strlcpy(pname + n, file, sizeof(pname) ‐ n) >= sizeof(pname) ‐ n)
goto toolong;
However, one may question the validity of such optimizations, as they
defeat the whole purpose of strlcpy() and strlcat(). As a matter of
fact, the first version of this manual page got it wrong.
Guess what? There's no 'cat' version of snprintf, so users are doomed to write
buggy code when trying to use it to concatenate after some other string. I've
recently been investigating a lot about it, and found invocations of Undefined
Behavior, and some milder cases of benign off-by-one (on the safe side, by luck)
errors, in calls to snprintf(3) in several important projects:
- NGINX Unit:
- Undefined Behavior:
<https://github.com/nginx/unit/issues/795#issuecomment-1345400420>
- Wrong truncation detection:
<https://github.com/nginx/unit/pull/734#discussion_r1043963527>
<https://github.com/nginx/unit/issues/804>
- shadow:
- off-by-one:
<https://github.com/shadow-maint/shadow/pull/607>
- clever code that looks like a bug but it's not:
<https://github.com/shadow-maint/shadow/issues/608>
Rather than adding some catenating variant of snprintf(3), I suggest adding a
single function that has an interface similar to stpcpy(3) and mempcpy(3), and
identical to stpecpy(3): stpeprintf(3):
<http://www.alejandro-colomar.es/src/alx/alx/libstp.git/tree/src/stp/stpe/stpeprintf.c>
I implemented it in terms of vsnprintf(3), so I need to handle a return of -1,
but if implemented from scratch in glibc, in could be written to not be limited
to INT_MAX (although I wonder why anyone would want to copy more than INT_MAX as
a formatted string).
Below is its manual page in libstp.
Cheers,
Alex
---
stpeprintf(3) Library Functions Manual stpeprintf(3)
NAME
stpeprintf, vstpeprintf - create a formatted string with truncation
LIBRARY
Stp string library (libstp, pkgconf ‐‐cflags ‐‐libs libstp)
SYNOPSIS
#include <stp/stpe/stpeprintf.h>
char *_Nullable stpeprintf(char *_Nullable dst, char end[0],
const char *restrict fmt, ...);
char *_Nullable vstpeprintf(char *_Nullable dst, char end[0],
const char *restrict fmt, va_list ap);
DESCRIPTION
These functions are almost identical to snprintf(3) and vsnprintf(3).
The destination buffer is limited by a pointer to its end —one after
its last element— instead of a size.
These functions can be chained with calls to stpeprintf(3) and
vstpeprintf(3).
RETURN VALUE
NULL
• If this function failed (see ERRORS).
• If dst was NULL.
end
• If this call truncated.
• If dst was equal to end (a previous call to these functions
truncated).
dst + strlen(dst)
On success, these functions return a pointer to the terminating
null byte.
ERRORS
These functions may fail for any of the same reasons as vsnprintf(3).
ATTRIBUTES
For an explanation of the terms used in this section, see attrib‐
utes(7).
┌────────────────────────────────────────────┬───────────────┬─────────┐
│Interface │ Attribute │ Value │
├────────────────────────────────────────────┼───────────────┼─────────┤
│stpeprintf(3), vstpeprintf(3) │ Thread safety │ MT‐Safe │
└────────────────────────────────────────────┴───────────────┴─────────┘
STANDARDS
None.
EXAMPLES
See stpecpy(3).
SEE ALSO
stpecpy(3), string_copying(7)
libstp (unreleased) (date) stpeprintf(3)
--
<http://www.alejandro-colomar.es/>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2022-12-23 12:26 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-22 21:42 [PATCH 0/1] " Alejandro Colomar
2022-12-22 21:42 ` [PATCH 1/1] " Alejandro Colomar
2022-12-23 7:02 ` Sam James
2022-12-23 12:26 ` Alejandro Colomar [this message]
2022-12-23 12:29 ` Alejandro Colomar
2022-12-23 17:21 ` Alejandro Colomar
2022-12-31 15:13 ` Sam James
2022-12-31 15:15 ` Alejandro Colomar
2022-12-28 23:17 ` [PATCH v2 0/3] Add stpe*() functions Alejandro Colomar
2022-12-29 0:01 ` Zack Weinberg
2022-12-29 10:13 ` Alejandro Colomar
2022-12-28 23:17 ` [PATCH v2 1/3] string: Add stpecpy() Alejandro Colomar
2022-12-28 23:27 ` Alejandro Colomar
2022-12-28 23:17 ` [PATCH v2 2/3] stdio: Add vstpeprintf() Alejandro Colomar
2022-12-28 23:17 ` [PATCH v2 3/3] stdio: Add stpeprintf() Alejandro Colomar
2022-12-28 23:27 ` Alejandro Colomar
2022-12-23 14:59 [PATCH 1/1] string: Add stpecpy(3) Wilco Dijkstra
2022-12-23 17:03 ` Alejandro Colomar
2022-12-23 17:27 ` Alejandro Colomar
2022-12-23 18:35 Wilco Dijkstra
2022-12-23 22:40 ` Alejandro Colomar
2022-12-23 23:24 Wilco Dijkstra
2022-12-24 0:05 ` Alejandro Colomar
2022-12-24 0:26 ` Alejandro Colomar
2022-12-25 1:52 ` Noah Goldstein
2022-12-25 14:37 ` Alejandro Colomar
2022-12-25 22:31 ` Noah Goldstein
2022-12-26 0:26 ` Alejandro Colomar
2022-12-26 0:32 ` Noah Goldstein
2022-12-26 0:37 ` Alejandro Colomar
2022-12-26 2:43 ` Noah Goldstein
2022-12-26 22:25 ` Alejandro Colomar
2022-12-26 23:24 ` Alejandro Colomar
2022-12-26 23:52 ` Alejandro Colomar
2022-12-27 0:12 ` Alejandro Colomar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=674a24d3-f81b-0201-097e-0bbe9b60d1fd@gmail.com \
--to=alx.manpages@gmail.com \
--cc=alx@kernel.org \
--cc=eggert@cs.ucla.edu \
--cc=fweimer@redhat.com \
--cc=libc-alpha@sourceware.org \
--cc=sam@gentoo.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).