Hi Jakub,

On 12/12/22 14:56, Jakub Jelinek wrote:
> On Mon, Dec 12, 2022 at 02:44:04PM +0100, Alejandro Colomar via Gcc wrote:
>>> I don't see any problem with the code snippets you provided.
>>
>> Well, then the optimization may be the other way around (although I question
>> why it is implemented that way, and not the other way around, but I'm not a
>> hardware or libc guy, so there may be reasons).
>>
>> If calling memcpy(3) is better, then the code calling mempcpy(3) could be
>> expanded inline to call it (but I doubt it).
>>
>> If calling mempcpy(3) is better, then the hand-made pattern resembling
>> mempcpy(3) should probably be merged as a call to mempcpy(3).
>>
>> But acting different on equivalent calls to both of them seems inconsistent
>> to me, unless you trust the programmer to know better how to optimize, that
>> is...
>
> I think that is the case, plus the question if one can use a non-standard
> function to implement a standard function (and if it would be triggered
> by seeing an expected prototype for the non-standard function).

I guess implementing a standard function by calling a non-standard one is
fine. The implementation is free to do what it pleases, as long as it
provides the expected interface.

>
> Otherwise, whether mempcpy in libc is implemented as memcpy + tweak return
> value or has its own implementation is something that is heavily dependent
> on the target and changes over time, so hardcoding that in gcc is
> problematic.

Might be, although I'm guessing that if GCC collapses mempcpy(3)-like
hand-made patterns to mempcpy(3), the worst that can happen is that glibc
undoes that; not a horrible crime. In the best case, it saves a function
call, or a few assignments.

> For -Os mempcpy call might be very well smaller even if the
> library side is then slower.

Heh, you might be surprised with the following.
Remember that the file ending in 1 is a hand-made pattern around memcpy(3),
while the file ending in 3 calls mempcpy(3) directly; yet GCC emits more
code for mempcpy(3). I don't see any reason for this.

Cheers,

Alex

---

$ diff -u usts2stp[13].s
--- usts2stp1.s	2022-12-12 15:00:34.775119720 +0100
+++ usts2stp3.s	2022-12-12 15:00:34.807119072 +0100
@@ -1,12 +1,13 @@
-	.file	"usts2stp1.c"
+	.file	"usts2stp3.c"
 	.text
 	.globl	usts2stp
 	.type	usts2stp, @function
 usts2stp:
 .LFB0:
 	.cfi_startproc
-	movq	(%rsi), %rcx
+	movq	%rsi, %rax
 	movq	8(%rsi), %rsi
+	movq	(%rax), %rcx
 	rep movsb
 	movb	$0, (%rdi)
 	movq	%rdi, %rax
--
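
For readers without the sources at hand, here is a minimal sketch of what
the two variants being compared might look like. It is a guess, not the
original usts2stp1.c or usts2stp3.c: the struct name and its fields are
hypothetical, chosen only to match the loads from (%rsi) and 8(%rsi) in the
assembly above, and the two functions carry different suffixes so that both
fit in one file (in the diff, both are simply called usts2stp).

```c
#define _GNU_SOURCE   /* mempcpy(3) is a GNU extension */
#include <stddef.h>
#include <string.h>

/* Explicit prototype; harmless if <string.h> has already declared it. */
void *mempcpy(void *dst, const void *src, size_t n);

/* Hypothetical counted-string type (an assumption): length first,
   then the pointer, to match the offsets 0 and 8 seen in the assembly. */
struct ustr {
	size_t       len;
	const char  *s;
};

/* Variant "1": hand-made pattern around memcpy(3).
   Copies src->len bytes, then advances dst past them by hand. */
char *usts2stp_memcpy(char *dst, const struct ustr *src)
{
	memcpy(dst, src->s, src->len);
	dst += src->len;
	*dst = '\0';
	return dst;   /* points at the terminating NUL, like stpcpy(3) */
}

/* Variant "3": direct call to mempcpy(3), which already returns
   the address just past the last byte written. */
char *usts2stp_mempcpy(char *dst, const struct ustr *src)
{
	dst = mempcpy(dst, src->s, src->len);
	*dst = '\0';
	return dst;
}
```

Both functions should behave identically for any valid input, which is the
point of the comparison: only the generated code differs. Something like
`gcc -O2 -S` on each variant (the flags are an assumption; the message does
not state which ones were used) would be one way to repeat the comparison.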