From: Alejandro Colomar <alx.manpages@gmail.com>
To: Andrew Pinski <pinskia@gmail.com>
Cc: gcc@gcc.gnu.org, linux-man <linux-man@vger.kernel.org>,
GNU C Library <libc-alpha@sourceware.org>, groff <groff@gnu.org>,
Martin Sebor <msebor@redhat.com>
Subject: Re: [-Wstringop-overflow=] strncat(3)
Date: Thu, 15 Dec 2022 00:14:07 +0100 [thread overview]
Message-ID: <c306794f-8bfd-6b88-0baa-352d4c8b6871@gmail.com> (raw)
In-Reply-To: <CA+=Sn1nkz5FKFFx3e+A42nhpAF-TLW0Gmdgn4a5NL0KkCpuWNg@mail.gmail.com>
[-- Attachment #1.1: Type: text/plain, Size: 8569 bytes --]
[CC += groff]
Hi Andrew,
On 12/14/22 23:57, Andrew Pinski wrote:
> On Wed, Dec 14, 2022 at 2:46 PM Alejandro Colomar via Libc-alpha
> <libc-alpha@sourceware.org> wrote:
>>
>> Hi,
>>
>> I was rewriting the strncat(3) manual page, and when I tried to compile the
>> example program, I got a surprise from the compiler.
>>
>> Here goes the page:
>>
>>
>> strncat(3) Library Functions Manual strncat(3)
>>
>> NAME
>> strncat - concatenate a null‐padded character sequence into a
>> string
>>
>> LIBRARY
>> Standard C library (libc, -lc)
>>
>> SYNOPSIS
>> #include <string.h>
>>
>> char *strncat(char *restrict dst, const char src[restrict .sz],
>> size_t sz);
>>
>> DESCRIPTION
>> This function catenates the input character sequence contained in
>> a null‐padded fixed‐width buffer, into a string at the buffer
>> pointed to by dst. The programmer is responsible for allocating a
>> buffer large enough, that is, strlen(dst) + strnlen(src, sz) + 1.
>>
>> An implementation of this function might be:
>>
>> char *
>> strncat(char *restrict dst, const char *restrict src, size_t sz)
>> {
>> int len;
>> char *end;
>>
>> len = strnlen(src, sz);
>> end = dst + strlen(dst);
>> end = mempcpy(end, src, len);
>> *end = '\0';
>>
>> return dst;
>> }
>>
>> RETURN VALUE
>> strncat() returns dest.
>>
>> ATTRIBUTES
>> [...]
>>
>> STANDARDS
>> POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
>>
>> CAVEATS
>> The name of this function is confusing. This function has no re‐
>> lation with strncpy(3).
>>
>> If the destination buffer is not large enough, the behavior is un‐
>> defined. See _FORTIFY_SOURCE in feature_test_macros(7).
>>
>> BUGS
>> This function can be very inefficient. Read about Shlemiel
>> the painter ⟨https://www.joelonsoftware.com/2001/12/11/
>> back-to-basics/⟩.
>>
>> EXAMPLES
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>>
>> int
>> main(void)
>> {
>> char buf[BUFSIZ];
>> size_t len;
>>
>> buf[0] = '\0'; // There’s no ’cpy’ function to this ’cat’.
>> strncat(buf, "Hello ", 6);
>> strncat(buf, "world", 42); // Padding null bytes ignored.
>> strncat(buf, "!", 1);
>> len = strlen(buf);
>> printf("[len = %zu]: <%s>\n", len, buf);
>>
>> exit(EXIT_SUCCESS);
>> }
>>
>> SEE ALSO
>> string(3), string_copy(3)
>>
>> Linux man‐pages (unreleased) (date) strncat(3)
>>
>>
>> And when you compile that, you get:
>>
>> $ cc -Wall -Wextra ./strncat.c
>> ./strncat.c: In function ‘main’:
>> ./strncat.c:12:12: warning: ‘strncat’ specified bound 6 equals source length
>> [-Wstringop-overflow=]
>> 12 | strncat(buf, "Hello ", 6);
>> | ^~~~~~~~~~~~~~~~~~~~~~~~~
>> ./strncat.c:14:12: warning: ‘strncat’ specified bound 1 equals source length
>> [-Wstringop-overflow=]
>> 14 | strncat(buf, "!", 1);
>> | ^~~~~~~~~~~~~~~~~~~~
>>
>>
>> So, what? Where's the problem? This function does exactly that: "take an
>> unterminated character sequence and catenate it to an existing string". Clang
>> seems to be fine with the code.
>
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83404 and the
> background of why the warning was added here:
>
> https://www.us-cert.gov/bsi/articles/knowledge/coding-practices/strncpy-and-strncat.
This document is bogus, since it's puting strncpy(3) and strncat(3) in the same
sack, when they're in reality two completely different beasts. I'll quote below
some paragraphs of some new page I'm writing, which will show why.
The rationale behind GCC's warning is also fundamentally wrong. Martin was
wrong when he claimed that the right call for strncat(3) is the remaining space
in the destination.
I admit that I didn't know what strncat(3) was useful for, and believed that it
was simply a broken-by-design function until very recently (this week, more or
less). And to be honest, I still believe it's broken by design; it's just that
it can be repurposed for a reasonable new purpose (which I found while digging
in groff's source code; that's why the CC).
First I'll show an example program that I added to the strncat(3) manual page
last week, which is based on the groff code that used it:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define nitems(arr) (sizeof((arr)) / sizeof((arr)[0]))
int
main(void)
{
char pre[4] = "pre.";
char *post = ".post";
char *src = "some_long_body.post";
char dest[100];
dest[0] = '\0';
strncat(dest, pre, nitems(pre));
strncat(dest, src, strlen(src) - strlen(post));
puts(dest); // "pre.some_long_body"
exit(EXIT_SUCCESS);
}
And now I'll quote some text that I'm writing currently for the function:
Null‐padded character sequences
For historic reasons, some standard APIs, such as utmpx(5), use null‐
padded character sequences in fixed‐width buffers. To interface with
them, specialized functions need to be used.
To copy strings into them, use stpncpy(3).
To copy from an unterminated string within a fixed‐width buffer into a
string, ignoring any trailing null bytes in the source fixed‐width
buffer, you should use strncat(3).
[...]
stpncpy(3)
This function copies the input string into a destination
null‐padded character sequence in a fixed‐width buffer. If
the destination buffer, limited by its size, isn’t large
enough to hold the copy, the resulting character sequence
is truncated. Since it creates a character sequence, it
doesn’t need to write a terminating null byte. It’s impos‐
sible to distinguish truncation after the call, from a
character sequence that just fits the destination buffer;
truncation should be detected from the length of the origi‐
nal string.
strncpy(3)
This function is identical to stpncpy(3) except for the
useless return value.
stpncpy(3) is a simpler alternative to this function.
[...]
strncat(3)
Do not confuse this function with strncpy(3); they are not
related at all.
This function catenates the input character sequence con‐
tained in a null‐padded wixed‐width buffer, into a destina‐
tion string. The programmer is responsible for allocating
a buffer large enough. The return value is useless.
zustr2stp(3) is a faster alternative to this function.
An implementation of this function might be:
char *
strncat(char *restrict dst, const char *restrict src,
size_t sz)
{
int len;
char *end;
len = strnlen(src, sz);
end = dst + strlen(dst);
end = mempcpy(end, src, len);
*end = '\0';
return dst;
}
Cheers,
Alex
>
> Thanks,
> Andrew Pinski
>
>>
>> Cheers,
>>
>> Alex
>>
>>
>> --
>> <http://www.alejandro-colomar.es/>
--
<http://www.alejandro-colomar.es/>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2022-12-14 23:14 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-14 22:45 Alejandro Colomar
2022-12-14 22:51 ` Alejandro Colomar
2022-12-14 22:51 ` Alejandro Colomar
2022-12-14 22:57 ` Andrew Pinski
2022-12-14 23:14 ` Alejandro Colomar [this message]
2022-12-15 20:50 ` Martin Sebor
2022-12-15 22:03 ` Alejandro Colomar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c306794f-8bfd-6b88-0baa-352d4c8b6871@gmail.com \
--to=alx.manpages@gmail.com \
--cc=gcc@gcc.gnu.org \
--cc=groff@gnu.org \
--cc=libc-alpha@sourceware.org \
--cc=linux-man@vger.kernel.org \
--cc=msebor@redhat.com \
--cc=pinskia@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).