From: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
To: Alejandro Colomar <alx.manpages@gmail.com>
Cc: 'GNU C Library' <libc-alpha@sourceware.org>
Subject: Re: [PATCH 1/1] string: Add stpecpy(3)
Date: Fri, 23 Dec 2022 23:24:23 +0000
Message-ID: <PAWPR08MB8982EFB047372F6A54FCEE3783E99@PAWPR08MB8982.eurprd08.prod.outlook.com>

Hi Alex,

> For that, we'd first need to discuss what is a typical scenario.

Like copying/concatenating strings that fit within the buffer.

> And also, it depends a lot on what the compiler can optimize.  If I call 
> strlcat(3) in a loop, I know that stpecpy(3) is going to be orders of magnitude 
> faster.

If you're trying to say that the 'strcat' variant is bad then yes absolutely -
it's better to inline in the compiler or avoid the 'strcat' versions altogether
(that's also why I would strongly suggest never to add more 'cat' variants).
But that doesn't say anything about whether stpecpy is better than strlcpy.

> If I call strlcpy(3) in a loop, doing what an ideal compiler might do, that 
> might be something to benchmark, but we'd also need to discuss what is a good 
> input for the benchmark.

The typical case would be copying or concatenating smallish strings to a buffer.

> In the OpenBSD definition of strlcpy(), I count 4 branches, and one of them is 
> inside a while loop.  So, I'd find it very surprising if strlcpy(3) outperformed 
> stpecpy(3).

If that really is the OpenBSD implementation then this proves my point that
non-standard string functions are often totally unoptimized.

A basic implementation of strlcpy would use strlen and memcpy so it is fast
on every system without requiring any optimization:

size_t
strlcpy (char *dst, const char *src, size_t size)
{
  size_t len = strlen (src);

  if (size == 0)
    return len;
  size = len >= size ? size - 1 : len;
  dst[size] = 0;
  memcpy (dst, src, size);
  return len;
}
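
For anyone who wants to try the strlen + memcpy approach in isolation, a
minimal self-contained sketch follows. The name sketch_strlcpy is
hypothetical, chosen only to avoid clashing with a libc that already ships
strlcpy, and the small check function is an illustration rather than part of
any patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name; same strlen + memcpy logic as the version above.  */
static size_t
sketch_strlcpy (char *dst, const char *src, size_t size)
{
  size_t len = strlen (src);

  if (size == 0)
    return len;                 /* No room, not even for the terminator.  */

  size_t n = len >= size ? size - 1 : len;  /* Bytes that actually fit.  */
  memcpy (dst, src, n);
  dst[n] = '\0';
  return len;                   /* Always the full length of src.  */
}

/* Illustration only: the return value is how the caller detects truncation.  */
static void
sketch_strlcpy_check (void)
{
  char buf[8];

  /* Fits: 5 characters plus the terminator in 8 bytes.  */
  assert (sketch_strlcpy (buf, "hello", sizeof buf) == 5);
  assert (strcmp (buf, "hello") == 0);

  /* Truncated: a return value >= sizeof buf signals it.  */
  assert (sketch_strlcpy (buf, "hello, world", sizeof buf) == 12);
  assert (strcmp (buf, "hello, ") == 0);
}
```

The only branches are the size == 0 check and the length clamp; all the
byte-level work is delegated to strlen and memcpy.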

> Well, with the current memccpy(3) I already suspect it's going to be faster than 
> strlcpy(3).  If you optimize it, it would increase the chances that it's faster :)

I don't see why it would be any faster given memccpy might also not be
optimized.

> I find it _way_ more readable than the strlcpy(3)/cat(3) code.  Oh, and did I 
> say it has less branches? :)

I'm not so sure about that - you've got 3 calls/returns plus at least 4 branches
for each stpecpy (besides whatever memcpy/memchr do). strlcpy has 2 calls/returns
plus one branch. So needing an extra branch when you want to do something
special for the buffer-full case doesn't seem like a major problem.
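
To make the "one extra branch" point concrete, here is a rough sketch of
joining several strings with strlcpy-style calls, tracking the offset by hand
and checking for truncation after each copy. Both sketch_strlcpy and
sketch_join are hypothetical names used for illustration; nothing here is
taken from an existing libc.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strlcpy, built on strlen + memcpy.  */
static size_t
sketch_strlcpy (char *dst, const char *src, size_t size)
{
  size_t len = strlen (src);

  if (size == 0)
    return len;
  size_t n = len >= size ? size - 1 : len;
  memcpy (dst, src, n);
  dst[n] = '\0';
  return len;
}

/* Join n strings into buf (size bytes).  Returns 0 on success and -1 if
   the result was truncated.  buf is NUL-terminated whenever size > 0.  */
static int
sketch_join (char *buf, size_t size, const char *const *parts, size_t n)
{
  size_t off = 0;

  if (size == 0)
    return -1;
  buf[0] = '\0';

  for (size_t i = 0; i < n; i++)
    {
      size_t room = size - off;   /* Always >= 1 here.  */
      size_t len = sketch_strlcpy (buf + off, parts[i], room);

      if (len >= room)
        return -1;                /* The extra branch: buffer full.  */
      off += len;
    }
  return 0;
}
```

Because each call starts at buf + off, no copy rescans what was already
written, so the quadratic behaviour of the 'cat' variants does not arise.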

>> In contrast we can be pretty sure that the standard strlen, memcpy etc are both
>> correct and efficient on all targets/libc's.
>
> Sure, but memcpy(3) is not usable in code that needs to truncate.  We need to 
> compare against stpncpy(3) (ughhh) and strlcpy(3).

The idea is that if we add new string functions, their implementation should use
other string functions that are known to be well optimized for most targets.
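
As an illustration of that idea, a stpecpy-like function could be layered on
memchr and memcpy alone. This is only a sketch under assumptions: the name
sketch_stpecpy is hypothetical, the code is not the one from the patch, and
the semantics (return a pointer to the terminating NUL, or 'end' on
truncation, and pass 'end' through unchanged) are my reading of the proposal
in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch, not the code from the patch.  Copies src into
   [dst, end), always NUL-terminating.  Returns a pointer to the written
   NUL, or 'end' if the string was truncated (assumed semantics).  */
static char *
sketch_stpecpy (char *dst, char *end, const char *src)
{
  if (dst == end)
    return end;                 /* An earlier call already truncated.  */

  size_t room = (size_t) (end - dst);   /* Includes the terminator.  */
  const char *nul = memchr (src, '\0', room);

  if (nul != NULL)
    {
      size_t len = (size_t) (nul - src);

      memcpy (dst, src, len + 1);       /* Copy the NUL as well.  */
      return dst + len;
    }

  /* src does not fit: copy what does and terminate at the last byte.  */
  memcpy (dst, src, room - 1);
  end[-1] = '\0';
  return end;
}
```

Chained calls would then look like p = sketch_stpecpy (p, end, s) repeated
for each string, with a single p == end test after the last call to detect
truncation.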

Cheers,
Wilco
