public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: "Alejandro Colomar (man-pages)" <alx.manpages@gmail.com>
To: glibc <libc-alpha@sourceware.org>
Cc: tech@openbsd.org, Christoph Hellwig <hch@lst.de>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: [RFC] strcpys(): New function for copying strings safely
Date: Sun, 27 Jun 2021 21:26:36 +0200	[thread overview]
Message-ID: <755875ec-baae-6cab-52a8-3c9530db1ce6@gmail.com> (raw)

Hi,

I recently re-read the discussions about strscpy() and strlcpy().

https://lwn.net/Articles/612244/
https://sourceware.org/legacy-ml/libc-alpha/2000-08/msg00052.html
https://mafford.com/text/the-many-ways-to-copy-a-string-in-c/
https://lkml.org/lkml/2017/7/14/637
https://lwn.net/Articles/659214/

Arguments against strlcpy():

- Inefficiency (compared to memcpy())
- It doesn't report truncation, needing extra checks by the user
- It doesn't prevent overrunning the source string (requires a C-string 
input)

Arguments in favor of strlcpy():

- Avoid code bloat
- Self-documenting code
- Avoid everybody rolling their own strcat() safe variant
- Prevent buffer overflows


I rolled my own strcpy safer functions some time ago, and after reading 
those discussions, I decided to propose them for inclusion in glibc.

I think they address all of the arguments before (well, efficiency will 
never be as good as memcpy(), I guess, but a good compiler, and -flto 
can make it good enough).

It reports two kinds of errors: "hard" errors, where the string is 
invalid, and "soft" errors, where the string is valid but truncated.
The method for reporting the errors is an 'int' return value, which is 0 
for success, -1 for "hard" error, and 1 for "soft" error.

The value of written characters that strcpy() functions typically 
return, is now moved to a pointer parameter (4th parameter).

The order of the parameters has been changed (compared to other strcpy() 
typicall variants), according to a new principle of the C2x standard, 
which says that "the size of an array appears before the array".
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2086.htm>

It doesn't require a C string in the input.  It only reads up to the 
size limit provided by the user.

I added the _np suffix to the function to mark it as non-portable 
(similar to existing practice in glibc with non-standard functions).

I used 'ssize_t' instead of 'size_t', because I consider unsigned types 
to be inherently unsafe.  See 
<https://google.github.io/styleguide/cppguide.html#Integer_Types>.

It is designed so that usage requires the minimum number of lines of 
code for complete usage (including error handling checks):

[[
// When we already checked that 'size' is >= 1
// and truncation is not an issue:

strcpys_np(size, dest, src, NULL);

// When we ddidn't check the value of 'size',
// but truncation is still not an issue:

if (strcpys_np(size, dest, src, NULL) == -1)
	goto handle_hard_error;

// When truncation is an issue:

if (strcpys_np(size, dest, src, NULL))
	goto handle_all_errors;

// When truncation is an error,
// but it requires a different handling than a "hard" error:

status = strcpys_np(size, dest, src, NULL);
if (status == 1)
	goto handle_truncation;
if (status)
	goto handle_hard_error;
]]

After any of those samples of code, the string has been copied, and is a 
valid C-string.


Here goes the code (strcpys_np() is defined in terms of strscpy_np(), 
which similar to the known strscpy(), but with some of the improvements 
mentioned above, such as using array parameters, and ssize_t):


[[

#include <string.h>
#include <sys/types.h>


[[gnu::nonnull]]
ssize_t strscpy_np(ssize_t size,
                    char dest[static restrict size],
                    const char src[static restrict size])
{
	ssize_t len;

	if (size <= 0)
		return -1;

	len = strnlen(src, size - 1);
	memcpy(dest, src, len);
	dest[len] = '\0';

	return len;
}

[[gnu::nonnull(2, 3)]]
int strcpys_np(ssize_t size,
                char dest[static restrict size],
                const char src[static restrict size],
                ssize_t *restrict len)
{
	ssize_t l;

	l = strscpy_np(size, dest, src);
	if (len)
		*len = l;

	if (l == -1)
		return -1;
	if (l >= size)
		return 1;
	return 0;
}

]]

I may have introduced some bugs right now, as I adapted the code a bit 
before sending, but I expect it to be free of any bugs known of the 
existing str*cpy() interfaces.


What do you think about this function?  Would you want to add it to glibc?


Thanks,

Alex



--
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/

             reply	other threads:[~2021-06-27 19:26 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-27 19:26 Alejandro Colomar (man-pages) [this message]
2021-06-27 19:46 ` Alejandro Colomar (man-pages)
2021-06-28  8:15   ` David Laight
2021-06-28 11:32     ` Alejandro Colomar (man-pages)
2021-06-28 12:00       ` Alejandro Colomar (man-pages)
2021-06-28 12:10         ` Alejandro Colomar (man-pages)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=755875ec-baae-6cab-52a8-3c9530db1ce6@gmail.com \
    --to=alx.manpages@gmail.com \
    --cc=hch@lst.de \
    --cc=libc-alpha@sourceware.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tech@openbsd.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).