public inbox for libc-alpha@sourceware.org
* Support for memcpy with equal source and destination
@ 2023-11-23 12:14 Adhemerval Zanella Netto
  2023-11-25  7:48 ` Paul Eggert
  0 siblings, 1 reply; 14+ messages in thread
From: Adhemerval Zanella Netto @ 2023-11-23 12:14 UTC (permalink / raw)
  To: libc-alpha; +Cc: post+sourceware.org

BZ#31055 [1] requests a guarantee that memcpy with equal source and
destination is well defined on glibc, since both gcc and clang already
emit code with this assumption [2] [3].  There is a WIP document [4] to
properly extend this requirement to the C standard, with some extra
requirements (such as allowing NULL inputs).

The GCC bug report [5] has further information on why GCC developers think
this is a reasonable assumption, mostly related to the codegen cost of adding
either the extra compare to correctly call memcpy or just changing the call
to memmove.
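
As a rough sketch of the source pattern in question: a struct assignment
through two pointers is valid C even when both point to the same object,
and a compiler may implement it as a memcpy libcall.  The names below
(struct big, assign, assign_guarded, assign_memmove, check_self_assign) are
made up for illustration; the two alternatives correspond to the extra
compare and the memmove call mentioned above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical type, large enough that a compiler will probably not
   expand the copy inline.  */
struct big { unsigned char bytes[16384]; };

/* Plain assignment.  C allows dst == src here (exact overlap), and the
   compiler may implement the copy as a call to memcpy.  */
void
assign (struct big *dst, const struct big *src)
{
  *dst = *src;
}

/* Alternative 1: an explicit compare, so memcpy never sees equal
   pointers.  */
void
assign_guarded (struct big *dst, const struct big *src)
{
  if (dst != src)
    memcpy (dst, src, sizeof *dst);
}

/* Alternative 2: memmove, which is defined for any overlap.  */
void
assign_memmove (struct big *dst, const struct big *src)
{
  memmove (dst, src, sizeof *dst);
}

/* All three variants must leave the object intact when dst == src.  */
void
check_self_assign (void)
{
  static struct big obj;
  memset (&obj, 0x5a, sizeof obj);
  assign (&obj, &obj);
  assign_guarded (&obj, &obj);
  assign_memmove (&obj, &obj);
  assert (obj.bytes[0] == 0x5a);
  assert (obj.bytes[sizeof obj.bytes - 1] == 0x5a);
}
```

Whether assign really becomes a libcall depends on the target, the
compiler version and the tuning options, so treat this only as an
illustration of the three shapes being compared.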

From the glibc standpoint, as far as I could check there is no implementation
that prevents it (although I am not sure for all of them for all input sizes).
A memcpy with equal source and destination will still trigger compiler
warnings due to the restrict issues, but that should not matter for compiler
auto-generated libcalls.

Another solution would be to provide an alternative symbol, similar to
__memcmpeq, meant to be used by the compiler for libcall optimizations.
However, __memcmpeq has not had any adoption so far [6] [7].
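
For context, a sketch of the kind of call such an alternative symbol is
aimed at.  __memcmpeq takes the same arguments as memcmp but only promises
zero versus nonzero, so a compiler could use it where only equality is
tested.  The helper names same_bytes and check_same_bytes are hypothetical,
and the code calls plain memcmp rather than assuming anything about how
__memcmpeq is declared.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Only equality matters here, not ordering, so this is a call that a
   compiler could in principle route to an equality-only symbol.  */
bool
same_bytes (const void *a, const void *b, size_t n)
{
  return memcmp (a, b, n) == 0;
}

void
check_same_bytes (void)
{
  assert (same_bytes ("abc", "abc", 3));
  assert (!same_bytes ("abc", "abd", 3));
}
```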

If we adopt this constraint, I think it would require adding some tests
besides the manual documentation.  Some arch maintainers might also like
to check their implementations to add an early bailout optimization. I also
presume that the fortify builtins already handle it correctly.

I am not sure about other libc implementations; at least musl seems unlikely
to provide such a guarantee.

Thoughts?

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=31055
[2] https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=7758cb4b53e8a33642709402ce582f769eb9fd18;hp=6ce952188ab39e303e4f63e474b5cba83b5b12fd
[3] https://reviews.llvm.org/D86993
[4] https://docs.google.com/document/d/1guH_HgibKrX7t9JfKGfWX2UCPyZOTLsnRfR6UleD1F8/edit
[5] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667
[6] https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596881.html
[7] https://reviews.llvm.org/D127461


* Re: Support for memcpy with equal source and destination
  2023-11-23 12:14 Support for memcpy with equal source and destination Adhemerval Zanella Netto
@ 2023-11-25  7:48 ` Paul Eggert
  2023-11-25  8:20   ` Ralf Jung
  0 siblings, 1 reply; 14+ messages in thread
From: Paul Eggert @ 2023-11-25  7:48 UTC (permalink / raw)
  To: Adhemerval Zanella Netto, libc-alpha; +Cc: post+sourceware.org

I see several areas of possible confusion, so if we make this change to 
the glibc documentation, the new documentation should make the following 
clear:

* This is a GNU extension, and other C libraries might not guarantee 
this (not surprising). Also, other C compilers might not guarantee this 
even when used with glibc (somewhat more surprising).

* GCC and other compilers might warn about memcpy (X, X, SIZE) even if 
it is supported.

* This is an exception to the usual rule about "restrict", since the 
prototype says "restrict" but it's OK if the two pointers are the same 
(so "restrict" now means that they cannot overlap other than being 
equal, just for this particular function).

* The pointers must point to SIZE bytes even when SOURCE == DESTINATION. 
For example, memcpy (&errno, &errno, SIZE_MAX) is not valid.

* If SIZE is nonzero, the destination must be writable even when SOURCE 
== DESTINATION.

* mempcpy, strcpy, and stpcpy are like memcpy in the above respects. 
(Are there other functions for which we want to make similar guarantees?)

The issue with null-or-invalid pointers and size zero is independent of 
this, as far as I can see, so it can be treated separately.


* Re: Support for memcpy with equal source and destination
  2023-11-25  7:48 ` Paul Eggert
@ 2023-11-25  8:20   ` Ralf Jung
  2023-11-25 17:11     ` Paul Eggert
  0 siblings, 1 reply; 14+ messages in thread
From: Ralf Jung @ 2023-11-25  8:20 UTC (permalink / raw)
  To: Paul Eggert, Adhemerval Zanella Netto, libc-alpha

Hi,

On 25.11.23 08:48, Paul Eggert wrote:
> I see several areas of possible confusion, so if we make this change to the 
> glibc documentation, the new documentation should make the following clear:
> 
> * This is a GNU extension, and other C libraries might not guarantee this (not 
> surprising). Also, other C compilers might not guarantee this even when used 
> with glibc (somewhat more surprising).
> 
> * GCC and other compilers might warn about memcpy (X, X, SIZE) even if it is 
> supported.
> 
> * This is an exception to the usual rule about "restrict", since the prototype says 
> "restrict" but it's OK if the two pointers are the same (so "restrict" now means 
> that they cannot overlap other than being equal, just for this particular 
> function).

Note that "restrict" does not mean "must not be equal". It means "the accesses 
performed through this pointer (and pointers derived from it) must be disjoint 
from the accesses performed through other pointers (excluding memory that is 
only being read)".

So when one sees restrict in a signature, it is impossible to tell what the 
actual constraint is without further documentation: the function needs to say 
which memory is being accessed through which pointer, and *that* is then where 
the disjointness comes from.
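
A small sketch of that point, modeled on an example in the C standard's
section on restrict (the function names add_arrays and check_add_arrays are
made up).  All three pointer parameters are restrict-qualified, yet passing
the same array for both inputs is fine because that array is only read.

```c
#include <assert.h>
#include <stddef.h>

/* out is written; a and b are only read.  */
void
add_arrays (size_t n, int *restrict out,
            const int *restrict a, const int *restrict b)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] + b[i];
}

void
check_add_arrays (void)
{
  int in[3] = { 1, 2, 3 };
  int out[3];

  /* Defined: in is reached through two restrict pointers but is never
     modified.  add_arrays (3, in, in, in) would be undefined, since the
     stores through out would hit memory also read through a and b.  */
  add_arrays (3, out, in, in);
  assert (out[0] == 2 && out[1] == 4 && out[2] == 6);
}
```

So the signature alone does not say whether equal pointers are allowed;
that follows from which parameter is used for stores.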

That said, if the glibc memcpy has "restrict" in its signature, then GCC itself 
will optimize it assuming that the two buffers are truly disjoint. For a memcpy 
that is implemented in C (rather than assembly), I don't think it is possible to 
make this promise (of supporting src==dest) when there is "restrict" in the 
signature. So if glibc wants to make that promise I think it needs to remove 
"restrict" from its memcpy signatures.

Kind regards,
Ralf


* Re: Support for memcpy with equal source and destination
  2023-11-25  8:20   ` Ralf Jung
@ 2023-11-25 17:11     ` Paul Eggert
  2023-11-27 11:15       ` Ralf Jung
  0 siblings, 1 reply; 14+ messages in thread
From: Paul Eggert @ 2023-11-25 17:11 UTC (permalink / raw)
  To: Ralf Jung, Adhemerval Zanella Netto, libc-alpha

On 2023-11-25 00:20, Ralf Jung wrote:
> For a memcpy that is implemented in C (rather than assembly), I don't 
> think it is possible to make this promise (of supporting src==dest) when 
> there is "restrict" in the signature.

Actually it is possible, though it costs a conditional branch compared 
to the naive approach that generally would work anyway.

Here's a sample implementation. When dest == src this function doesn't 
dereference either pointer, and this satisfies the C standard's rules 
for 'restrict'.

   #include <string.h>

   void *
   memcpy (void *restrict dest, void const *restrict src, size_t n)
   {
     if (dest != src)
       {
	char *d = dest;
	char const *s = src;
	for (; n != 0; n--)
	  *d++ = *s++;
       }
     return dest;
   }

When n is zero, this implementation also supports NULL dest or src, 
though that's a separate issue.
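
One possible way to exercise this, as a sketch only: the same body under
the hypothetical name guarded_copy (so that it does not clash with the
library's memcpy), called with equal pointers, with n == 0 and null
pointers, and with an ordinary disjoint copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same logic as the sample above, renamed to avoid replacing memcpy.  */
void *
guarded_copy (void *restrict dest, void const *restrict src, size_t n)
{
  if (dest != src)
    {
      char *d = dest;
      char const *s = src;
      for (; n != 0; n--)
        *d++ = *s++;
    }
  return dest;
}

void
check_guarded_copy (void)
{
  char buf[] = "hello";
  char out[sizeof buf];
  void const *same = buf;
  void *r;

  /* dest == src: neither pointer is dereferenced.  */
  r = guarded_copy (buf, same, sizeof buf);
  assert (r == buf);
  assert (strcmp (buf, "hello") == 0);

  /* n == 0: the loop body never runs, so a null pointer is not
     accessed.  */
  r = guarded_copy (NULL, NULL, 0);
  assert (r == NULL);
  r = guarded_copy (out, NULL, 0);
  assert (r == out);

  /* Ordinary disjoint copy.  */
  guarded_copy (out, buf, sizeof buf);
  assert (strcmp (out, "hello") == 0);
}
```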


> when one sees restrict in a signature, it is impossible to tell what the actual constraint is without further documentation: the function needs to say which memory is being accessed through which pointer, and *that* is then where the disjointness comes from.

True, and one could document the new guarantee along those lines. 
Writing the documentation could be a bit tricky, though, as one needs to 
explain all this stuff clearly without being too prolix.



* Re: Support for memcpy with equal source and destination
  2023-11-25 17:11     ` Paul Eggert
@ 2023-11-27 11:15       ` Ralf Jung
  2023-11-27 11:46         ` Alexander Monakov
  0 siblings, 1 reply; 14+ messages in thread
From: Ralf Jung @ 2023-11-27 11:15 UTC (permalink / raw)
  To: Paul Eggert, Adhemerval Zanella Netto, libc-alpha

Hi all,

> On 2023-11-25 00:20, Ralf Jung wrote:
>> For a memcpy that is implemented in C (rather than assembly), I don't think it 
>> is possible to make this promise (of supporting src==dest) when there is 
>> "restrict" in the signature.
> 
> Actually it is possible, though it costs a conditional branch compared to the 
> naive approach that generally would work anyway.

Ah, true, I forgot about that option.

However, at that point it seems unclear why that branch should live inside 
`memcpy`, rather than being performed by the caller. The entire argument made 
all along by compiler developers (as I understood it) was that the existing 
`memcpy` are already working fine for the src==dest case; if new branches need 
to be added, that's a different discussion.
This probably needs a benchmark to determine on which side the branch is less 
expensive overall? Or a dedicated memcpy variant that allows src==dest, as has 
been brought up elsewhere.

> When n is zero, this implementation also supports NULL dest or src, though 
> that's a separate issue.

Yeah I'd like to see that guarantee as well, if possible. :)

Kind regards,
Ralf

> 
> 
>> when one sees restrict in a signature, it is impossible to tell what the 
>> actual constraint is without further documentation: the function needs to say 
>> which memory is being accessed through which pointer, and *that* is then where 
>> the disjointness comes from.
> 
> True, and one could document the new guarantee along those lines. Writing the 
> documentation could be a bit tricky, though, as one needs to explain all this 
> stuff clearly without being too prolix.
> 


* Re: Support for memcpy with equal source and destination
  2023-11-27 11:15       ` Ralf Jung
@ 2023-11-27 11:46         ` Alexander Monakov
  2023-11-27 12:34           ` Ralf Jung
  0 siblings, 1 reply; 14+ messages in thread
From: Alexander Monakov @ 2023-11-27 11:46 UTC (permalink / raw)
  To: Ralf Jung; +Cc: Paul Eggert, Adhemerval Zanella Netto, libc-alpha


On Mon, 27 Nov 2023, Ralf Jung wrote:

> This probably needs a benchmark to determine on which side the branch is less
> expensive overall? Or a dedicated memcpy variant that allows src==dest, as has
> been brought up elsewhere.

Please note that GCC does not use memcpy for "sufficiently small" structure
copies at -O2, as it's faster to emit the necessary loads+stores inline.

The threshold for "sufficiently small" varies with target and compiler version;
for instance, it is "above 64 bytes" for 32-bit arm and "above 8192 bytes" for
x86_64 with current trunk (it also depends on default -march/-mtune, etc.).

So on x86 at least adding such a branch in memcpy is not a practical choice.

Alexander


* Re: Support for memcpy with equal source and destination
  2023-11-27 11:46         ` Alexander Monakov
@ 2023-11-27 12:34           ` Ralf Jung
  2023-11-27 14:25             ` Alexander Monakov
  0 siblings, 1 reply; 14+ messages in thread
From: Ralf Jung @ 2023-11-27 12:34 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: Paul Eggert, Adhemerval Zanella Netto, libc-alpha

Hi all,

>> This probably needs a benchmark to determine on which side the branch is less
>> expensive overall? Or a dedicated memcpy variant that allows src==dest, as has
>> been brought up elsewhere.
> 
> Please note that GCC does not use memcpy for "sufficiently small" structure
> copies at -O2, as it's faster to emit the necessary loads+stores inline.
> 
> The threshold for "sufficiently small" varies with target and compiler version;
> for instance, it is "above 64 bytes" for 32-bit arm and "above 8192 bytes" for
> x86_64 with current trunk (it also depends on default -march/-mtune, etc.).

Wow, 8 kilobytes?!?

> So on x86 at least adding such a branch in memcpy is not a practical choice.

Sorry, how does that follow from GCC not using memcpy for small copies?

Kind regards,
Ralf

> 
> Alexander


* Re: Support for memcpy with equal source and destination
  2023-11-27 12:34           ` Ralf Jung
@ 2023-11-27 14:25             ` Alexander Monakov
  0 siblings, 0 replies; 14+ messages in thread
From: Alexander Monakov @ 2023-11-27 14:25 UTC (permalink / raw)
  To: Ralf Jung; +Cc: Paul Eggert, Adhemerval Zanella Netto, libc-alpha


On Mon, 27 Nov 2023, Ralf Jung wrote:

> > Please note that GCC does not use memcpy for "sufficiently small" structure
> > copies at -O2, as it's faster to emit the necessary loads+stores inline.
> > 
> > The threshold for "sufficiently small" varies with target and compiler
> > version; for instance, it is "above 64 bytes" for 32-bit arm and "above 8192
> > bytes" for x86_64 with current trunk (it also depends on default
> > -march/-mtune, etc.).
> 
> Wow, 8 kilobytes?!?

It inlines 'rep movsq' for sizes between 32 and 8192 for generic tuning, but
it varies substantially for different CPU families. For example, with
-mtune=znver[234] the thresholds are:

* memcpy is used above 64 bytes, or if size is unknown
* 'rep movsq' above 16 bytes (up to 64 bytes)

> > So on x86 at least adding such a branch in memcpy is not a practical choice.
> 
> Sorry, how does that follow from GCC not using memcpy for small copies?

If you add a branch into memcpy, every single invocation of memcpy will pay a
tiny cost for that branch, even though it matters only for copying huge
structs, which is vanishingly rare compared to all uses of memcpy.

Alexander


* Re: Support for memcpy with equal source and destination
  2023-11-27 19:28 Aaron Peter Bachmann
@ 2023-11-27 19:39 ` Paul Eggert
  0 siblings, 0 replies; 14+ messages in thread
From: Paul Eggert @ 2023-11-27 19:39 UTC (permalink / raw)
  To: Aaron Peter Bachmann, libc-alpha

On 2023-11-27 11:28, Aaron Peter Bachmann wrote:
> a conforming compiler could just remove the test

No, because the standard does not allow the compiler to assume that two 
pointers are unequal merely because both are restricted. It merely 
allows the compiler to assume that the two pointers are unequal if both 
are dereferenced and at least one is used to modify the object. Which is 
not the case when "dest != src" is executed.

The situation is like "if (n != 0) return m/n;". Obviously m/n has 
undefined behavior when n is zero. But the compiler cannot use this 
information to optimize away the "n != 0" test.
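
Spelled out as a complete function (the names div_or_zero and
check_div_or_zero, and the fallback value 0, are arbitrary choices for this
sketch):

```c
#include <assert.h>

/* m / n is undefined when n is zero, but that only constrains executions
   that actually reach the division.  The test runs first, so the compiler
   has to keep it.  */
int
div_or_zero (int m, int n)
{
  if (n != 0)
    return m / n;
  return 0;
}

void
check_div_or_zero (void)
{
  assert (div_or_zero (6, 3) == 2);
  assert (div_or_zero (7, 0) == 0);
}
```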


* Support for memcpy with equal source and destination
@ 2023-11-27 19:28 Aaron Peter Bachmann
  2023-11-27 19:39 ` Paul Eggert
  0 siblings, 1 reply; 14+ messages in thread
From: Aaron Peter Bachmann @ 2023-11-27 19:28 UTC (permalink / raw)
  To: libc-alpha

In response to https://sourceware.org/pipermail/libc-alpha/2023-November/152961.html
(Since I am not subscribed, I cannot answer directly.)
>Here's a sample implementation. When dest == src this function doesn't 
>dereference either pointer, and this satisfies the C standard's rules 
>for 'restrict'.
I interpret the C-std differently.
Unfortunately, I am not entirely sure my interpretation is correct.
>
>   #include <string.h>
>
I think, the function signature is a promise that *dest and *src do not overlap.
This implies dest != src.
>   void *
>   memcpy (void *restrict dest, void const *restrict src, size_t n)
>   {
The comparison (dest != src) does not make dest based on src.
The comparison (dest != src) does not make src based on dest.
Thus, a conforming compiler could just remove the test which seems to be true always according to the function signature.
gcc and clang do not do so presently.
>     if (dest != src)
>       {
>	char *d = dest;
>	char const *s = src;
>	for (; n != 0; n--)
>	  *d++ = *s++;
>       }
>     return dest;
>   }
Regards, Aaron Peter Bachmann



* Re: Support for memcpy with equal source and destination
  2023-11-27 14:53   ` Zack Weinberg
@ 2023-11-27 15:02     ` enh
  0 siblings, 0 replies; 14+ messages in thread
From: enh @ 2023-11-27 15:02 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Ralf Jung, Wilco Dijkstra, GNU libc development

On Mon, Nov 27, 2023 at 6:54 AM Zack Weinberg <zack@owlfolio.org> wrote:
>
> On Mon, Nov 27, 2023, at 9:45 AM, Ralf Jung wrote:
> >> Paul's example was to show that with restrict you could write a conformant
> >> C implementation. I think it is sufficient to cast away restrict without adding an
> >> extra branch (the existing generic C version does this).
> >
> > I don't think it is possible to cast away restrict. The restrict rules apply to
> > all pointers derived from the restrict pointer.
>
> As a practical matter, if we remove restrict annotations from the public memcpy
> declaration in string.h, _someone_ is going to think that's a bug, and also I
> wouldn't be surprised if future compilers started complaining about mismatches
> with their internal prototype.  (Neither GCC 13 nor clang 16 does this _now_.)

fwiw, bionic doesn't use `restrict` at all in headers, and no human or
compiler has noticed yet.

> zw


* Re: Support for memcpy with equal source and destination
  2023-11-27 14:45 ` Ralf Jung
@ 2023-11-27 14:53   ` Zack Weinberg
  2023-11-27 15:02     ` enh
  0 siblings, 1 reply; 14+ messages in thread
From: Zack Weinberg @ 2023-11-27 14:53 UTC (permalink / raw)
  To: Ralf Jung, Wilco Dijkstra; +Cc: GNU libc development

On Mon, Nov 27, 2023, at 9:45 AM, Ralf Jung wrote:
>> Paul's example was to show that with restrict you could write a conformant
>> C implementation. I think it is sufficient to cast away restrict without adding an
>> extra branch (the existing generic C version does this).
>
> I don't think it is possible to cast away restrict. The restrict rules apply to 
> all pointers derived from the restrict pointer.

As a practical matter, if we remove restrict annotations from the public memcpy
declaration in string.h, _someone_ is going to think that's a bug, and also I
wouldn't be surprised if future compilers started complaining about mismatches
with their internal prototype.  (Neither GCC 13 nor clang 16 does this _now_.)

zw


* Re: Support for memcpy with equal source and destination
  2023-11-27 14:38 Wilco Dijkstra
@ 2023-11-27 14:45 ` Ralf Jung
  2023-11-27 14:53   ` Zack Weinberg
  0 siblings, 1 reply; 14+ messages in thread
From: Ralf Jung @ 2023-11-27 14:45 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: 'GNU C Library'

Hi all,

>> However, at that point it seems unclear why that branch should live inside
>> `memcpy`, rather than being performed by the caller. The entire argument made
>> all along by compiler developers (as I understood it) was that the existing
>> `memcpy` are already working fine for the src==dest case; if new branches need
>> to be added, that's a different discussion.
> 
> Existing implementations don't need (or want) any extra branches. We also don't
> want to complicate inline memcpy expansions. They all work fine if src==dst.
> 
> Paul's example was to show that with restrict you could write a conformant
> C implementation. I think it is sufficient to cast away restrict without adding an
> extra branch (the existing generic C version does this).

I don't think it is possible to cast away restrict. The restrict rules apply to 
all pointers derived from the restrict pointer.

Kind regards,
Ralf


* Support for memcpy with equal source and destination
@ 2023-11-27 14:38 Wilco Dijkstra
  2023-11-27 14:45 ` Ralf Jung
  0 siblings, 1 reply; 14+ messages in thread
From: Wilco Dijkstra @ 2023-11-27 14:38 UTC (permalink / raw)
  To: post; +Cc: 'GNU C Library'

Hi Ralf,

> However, at that point it seems unclear why that branch should live inside 
> `memcpy`, rather than being performed by the caller. The entire argument made 
> all along by compiler developers (as I understood it) was that the existing 
> `memcpy` are already working fine for the src==dest case; if new branches need 
> to be added, that's a different discussion.

Existing implementations don't need (or want) any extra branches. We also don't
want to complicate inline memcpy expansions. They all work fine if src==dst.

Paul's example was to show that with restrict you could write a conformant
C implementation. I think it is sufficient to cast away restrict without adding an
extra branch (the existing generic C version does this).

>> When n is zero, this implementation also supports NULL dest or src, though 
>> that's a separate issue.
>
> Yeah I'd like to see that guarantee as well, if possible. :)

That should also work fine on existing implementations.

Cheers,
Wilco

