public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Alejandro Colomar <alx.manpages@gmail.com>
To: Florian Weimer <fweimer@redhat.com>, libc-alpha@sourceware.org
Subject: Re: [PATCH 1/2] Implement strlcpy and strlcat [BZ #178]
Date: Thu, 6 Apr 2023 23:21:51 +0200	[thread overview]
Message-ID: <8513afd6-e276-05d5-bc4c-0722de71e0af@gmail.com> (raw)
In-Reply-To: <f2b8dcd6c84557b79f1398c0ca3e55d0f3b10584.1680693362.git.fweimer@redhat.com>

Hi Florian,

On 4/5/23 13:20, Florian Weimer via Libc-alpha wrote:
> These functions are about to be added to POSIX, under Austin Group
> issue 986.
> 
> The fortified strlcat implementation does not raise SIGABRT if the
> destination buffer does not contain a null terminator, it just
> inheritis the non-failing regular strlcat behavior.  Maybe this
> should be changed to catch secondary overflows because the string
> appears longer than the buffer.
> ---

[...]

> +size_t
> +__strlcat (char *__restrict dest, const char *__restrict src, size_t size)
> +{

strlcat(3) is a slow call that shouldn't be emitted by an optimizing
compiler (similar to how GCC optimizes strcat(3) to stpcpy(3)).

With that in mind, do we really want all this code?  Or maybe we just
want to implement it in the simplest possible way?

I suggest the following alternative:


size_t
strlcat(char *restrict dst, const char *restrict src, size_t dsize)
{
    size_t  dlen = strlen(dst);

    return strlcpy(dst + dlen, src, dsize - dlen) + dlen;
}


In fact, that's the code I would expect that GCC should emit instead
of calls to the actual strlcat(3).

(Disclaimer: I didn't test the code; it may likely have bugs)

> +  size_t src_length = strlen (src);
> +
> +  /* Our implementation strlcat supports dest == NULL if size == 0
> +     (for consistency with snprintf and strlcpy), but strnlen does
> +     not, so we have to cover this case explicitly.  */
> +  if (size == 0)
> +    return src_length;
> +
> +  size_t dest_length = __strnlen (dest, size);

The OpenBSD contract of strlcat(3) includes that _both_ the source
string and the destination strings are NULL-terminated.  I guess
POSIX has kept that contract.  If that's the case, we can just call
strlen(3) here.

> +  if (dest_length != size)

This should always happen.  Else, it is undefined, because it means
that the destination string wasn't terminated.

Relevant quote of BSD's man page:

[
                               ...  Also note  that  strlcpy()  and  strlcat()
       only  operate  on  true “C” strings.  This means that ...
                              ... for strlcat() both src and dst must be  NUL‐
       terminated.
]

So the conditional can be omitted.


> +    {
> +      /* Copy at most the remaining number of characters in the
> +	 destination buffer.  Leave for the NUL terminator.  */
> +      size_t to_copy = size - dest_length - 1;
> +      /* But not more than what is available in the source string.  */
> +      if (to_copy > src_length)
> +	to_copy = src_length;
> +
> +      char *target = dest + dest_length;
> +      memcpy (target, src, to_copy);
> +      target[to_copy] = '\0';
> +    }
> +
> +  /* If the sum wraps around, we have more than SIZE_MAX + 2 bytes in
> +     the two input strings (including both null terminators).  If each
> +     byte in the address space can be assigned a unique size_t value
> +     (which the static_assert checks), then by the pigeonhole
> +     principle, the two input strings must overlap, which is
> +     undefined.  */
> +  _Static_assert (sizeof (uintptr_t) == sizeof (size_t),
> +		  "theoretical maximum object size covers address space");
> +  return dest_length + src_length;
> +}
> +libc_hidden_def (__strlcat)
> +weak_alias (__strlcat, strlcat)

[...]

> +
> +size_t
> +__strlcpy (char *__restrict dest, const char *__restrict src, size_t size)
> +{
> +  size_t src_length = strlen (src);
> +
> +  if (__glibc_unlikely (src_length >= size))
> +    {
> +      if (size > 0)
> +	{
> +	  /* Copy the leading portion of the string.  The last
> +	     character is subsequently overwritten with the NUL
> +	     terminator, but the destination size is usually a
> +	     multiple of a small power of two, so writing it twice
> +	     should be more efficient than copying an odd number of
> +	     bytes.  */
> +	  memcpy (dest, src, size);
> +	  dest[size - 1] = '\0';
> +	}
> +    }
> +  else
> +    /* Copy the string and its terminating NUL character.  */
> +    memcpy (dest, src, src_length + 1);
> +  return src_length;
> +}

L(very)GTM. :)


Cheers,
Alex


  parent reply	other threads:[~2023-04-06 21:21 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-05 11:20 [PATCH 0/2] strlcpy/strlcat/wcslcpy/wcscat implementation Florian Weimer
2023-04-05 11:20 ` [PATCH 1/2] Implement strlcpy and strlcat [BZ #178] Florian Weimer
2023-04-05 13:18   ` Adhemerval Zanella Netto
2023-04-06  9:18     ` Florian Weimer
2023-04-06 14:22   ` Siddhesh Poyarekar
2023-04-06 15:09     ` Florian Weimer
2023-04-06 21:29     ` Alejandro Colomar
2023-04-11 14:28       ` Siddhesh Poyarekar
2023-04-20 10:55     ` Florian Weimer
2023-04-20 11:45       ` Siddhesh Poyarekar
2023-04-21 17:45         ` Florian Weimer
2023-04-06 21:21   ` Alejandro Colomar [this message]
2023-04-06 21:35     ` Florian Weimer
2023-04-06 22:15       ` Alejandro Colomar
2023-04-06 22:19       ` Alejandro Colomar
2023-04-06 22:34     ` Alejandro Colomar
2023-04-08 22:08   ` Paul Eggert
2023-04-09 15:29     ` Paul Eggert
2023-04-13 11:37       ` Florian Weimer
2023-04-13 14:39         ` Paul Eggert
2023-04-13 17:59           ` Paul Eggert
2023-04-20  8:07     ` Florian Weimer
2023-04-21 19:00       ` Paul Eggert
2023-04-28  8:49         ` Florian Weimer
2023-04-05 11:20 ` [PATCH 2/2] Add the wcslcpy, wcslcat functions Florian Weimer
2023-04-08 22:09   ` Paul Eggert
2023-04-05 12:30 ` [PATCH 0/2] strlcpy/strlcat/wcslcpy/wcscat implementation Alejandro Colomar
2023-04-08 22:05 ` Paul Eggert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8513afd6-e276-05d5-bc4c-0722de71e0af@gmail.com \
    --to=alx.manpages@gmail.com \
    --cc=fweimer@redhat.com \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).