public inbox for libc-alpha@sourceware.org
* Re: [PATCH 4/4] S390: Implement mempcpy with help of memcpy. [BZ #19765]
@ 2016-05-04 17:17 Wilco Dijkstra
  2016-05-04 18:13 ` Adhemerval Zanella
  0 siblings, 1 reply; 26+ messages in thread
From: Wilco Dijkstra @ 2016-05-04 17:17 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: nd, 'GNU C Library'

Adhemerval Zanella wrote:
> Right, but I *think* the compiler would be smart enough to avoid the extra spilling.
> Take this example for instance [1]: using GCC 5.3 for s390x, I see no difference in
> the generated assembly when I compare the strategy I proposed (-DMEMPCPY_TO_MEMCPY)
> with the s390-specific one you are suggesting.  In the end, I am proposing that
> architecture-specific micro-optimizations be avoided in favor of a more generic one,
> especially those that avoid only one or two extra spills by means of quite complex
> macro expansion.  [1] http://pastie.org/10824072

You need to use something like this to show the difference:

return __mempcpy (__mempcpy (__mempcpy (p1, s, len), p2, 1), p3, 16);

GCC doesn't even optimize mempcpy of constant size (PR70140), so if you do have
an optimized mempcpy like s390 here, you *still* need to use memcpy for small immediate
sizes (so they get inlined), and only use mempcpy for unknown or very large sizes.

We end up having to do these header tricks because GCC doesn't implement mempcpy
as a first-class builtin or allow targets to defer to memcpy.
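
For illustration only, here is a minimal sketch of the kind of header trick meant
above: mempcpy expressed through memcpy, so that small constant-size copies can
still be inlined by the compiler. The names my_mempcpy and join3 are hypothetical
and are not the macros or functions glibc actually uses.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not glibc's actual definition): mempcpy written in
   terms of memcpy.  It returns a pointer just past the last byte written.  */
static inline void *
my_mempcpy (void *dst, const void *src, size_t n)
{
  return (char *) memcpy (dst, src, n) + n;
}

/* Chained use in the style of the three-call example above: the result of
   each copy is the destination of the next.  Writes len + 1 + 16 bytes.  */
static char *
join3 (char *out, const char *s, size_t len,
       const char *sep, const char *tail16)
{
  char *p = my_mempcpy (out, s, len);
  p = my_mempcpy (p, sep, 1);
  return my_mempcpy (p, tail16, 16);
}
```

With this form the two constant-size copies (1 and 16 bytes) are plain memcpy
calls, which the compiler is free to expand inline.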

There are similar issues with strchr (s, 0) being used instead of the faster strlen (s) + s.
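
A small sketch of the two equivalent forms, for comparison. The function names
are made up for this example; both return a pointer to the terminating NUL, and
which one is faster depends on the target's strchr and strlen implementations.

```c
#include <string.h>

/* Pointer to the terminating NUL, found with a generic character search.  */
static const char *
end_via_strchr (const char *s)
{
  return strchr (s, 0);
}

/* Same pointer, computed from the string length instead.  */
static const char *
end_via_strlen (const char *s)
{
  return s + strlen (s);
}
```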

Wilco

* [PATCH 1/4] S390: Use mvcle for copies > 1MB on 32bit with default memcpy variant.
@ 2016-04-26 12:08 Stefan Liebler
  2016-04-26 12:08 ` [PATCH 4/4] S390: Implement mempcpy with help of memcpy. [BZ #19765] Stefan Liebler
  0 siblings, 1 reply; 26+ messages in thread
From: Stefan Liebler @ 2016-04-26 12:08 UTC (permalink / raw)
  To: libc-alpha; +Cc: carlos, Wilco.Dijkstra, neleai, Stefan Liebler

If more than 255 bytes should be copied, the algorithm jumps away.
Before this patch, it jumps to the mvc-loop (.L_G5_12).
Now it jumps first to the "> 1MB" check, which jumps away to
__memcpy_mvcle. Otherwise, the mvc-loop (.L_G5_12) copies the bytes.
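
The resulting control flow can be sketched in C as follows. This is only an
illustration of the description above, not a translation of memcpy.S: the enum,
the function name, and the exact boundary comparisons are assumptions, and 1MB
is taken to mean 1024 * 1024 bytes.

```c
#include <stddef.h>

/* Hypothetical labels for the three copy paths described above.  */
enum copy_path { COPY_SINGLE_MVC, COPY_MVC_LOOP, COPY_MVCLE };

/* Illustrative dispatch; the precise thresholds used by the assembly
   are not reproduced here.  */
static enum copy_path
choose_copy_path (size_t len)
{
  if (len <= 255)
    return COPY_SINGLE_MVC;   /* short copy: one executed mvc */
  if (len > 1024 * 1024)
    return COPY_MVCLE;        /* "> 1MB" check: go to __memcpy_mvcle */
  return COPY_MVC_LOOP;       /* otherwise: mvc-loop (.L_G5_12) */
}
```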

ChangeLog:

	* sysdeps/s390/s390-32/memcpy.S (memcpy):
	Jump to 1MB check before executing mvc-loop.
---
 sysdeps/s390/s390-32/memcpy.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sysdeps/s390/s390-32/memcpy.S b/sysdeps/s390/s390-32/memcpy.S
index 62ecbbf..2ac51ab 100644
--- a/sysdeps/s390/s390-32/memcpy.S
+++ b/sysdeps/s390/s390-32/memcpy.S
@@ -42,7 +42,7 @@ ENTRY(memcpy)
 	srl     %r5,8
 	ltr     %r5,%r5
 	lr      %r1,%r2
-	jne     .L_G5_12
+	jne     .L_G5_13
 	ex      %r4,.L_G5_17-.L_G5_16(%r13)
 .L_G5_4:
 	l       %r13,52(%r15)
-- 
2.3.0


end of thread, other threads:[~2016-05-24  8:55 UTC | newest]

Thread overview: 26+ messages
2016-05-04 17:17 [PATCH 4/4] S390: Implement mempcpy with help of memcpy. [BZ #19765] Wilco Dijkstra
2016-05-04 18:13 ` Adhemerval Zanella
2016-05-04 18:21   ` H.J. Lu
2016-05-04 18:23     ` Adhemerval Zanella
2016-05-04 20:51   ` Wilco Dijkstra
2016-05-04 20:58     ` Adhemerval Zanella
2016-05-05 13:24       ` Wilco Dijkstra
2016-05-05 13:37       ` H.J. Lu
2016-05-05 14:16         ` Adhemerval Zanella
2016-05-05 14:45           ` H.J. Lu
2016-05-05 16:34             ` Adhemerval Zanella
2016-05-05 16:36               ` H.J. Lu
2016-05-09 14:39                 ` Stefan Liebler
2016-05-12 14:11                   ` Adhemerval Zanella
2016-05-13 14:43                     ` Stefan Liebler
2016-05-13 14:49                       ` H.J. Lu
2016-05-13 14:59                         ` Stefan Liebler
2016-05-13 15:06                           ` H.J. Lu
2016-05-13 15:31                             ` Stefan Liebler
2016-05-18 15:25                               ` Stefan Liebler
2016-05-24  9:11                                 ` Stefan Liebler
  -- strict thread matches above, loose matches on Subject: below --
2016-04-26 12:08 [PATCH 1/4] S390: Use mvcle for copies > 1MB on 32bit with default memcpy variant Stefan Liebler
2016-04-26 12:08 ` [PATCH 4/4] S390: Implement mempcpy with help of memcpy. [BZ #19765] Stefan Liebler
2016-04-26 13:33   ` Adhemerval Zanella
2016-04-27  8:15     ` Stefan Liebler
2016-05-04 15:42       ` Adhemerval Zanella
2016-05-04 13:20   ` Stefan Liebler
