public inbox for libc-alpha@sourceware.org
From: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
To: Carlos O'Donell <carlos@redhat.com>,
	GNU C Library <libc-alpha@sourceware.org>,
	Ondrej Bilka <neleai@seznam.cz>,
	"Joseph S. Myers" <joseph@codesourcery.com>,
	Jakub Jelinek <jakub@redhat.com>, Jeff Law <law@redhat.com>
Cc: nd <nd@arm.com>
Subject: Re: Review decision to inline mempcpy to memcpy.
Date: Fri, 04 Mar 2016 20:20:00 -0000
Message-ID: <AM3PR08MB008895AC9E7360EE8E831C0183BE0@AM3PR08MB0088.eurprd08.prod.outlook.com>
In-Reply-To: <AM3PR08MB0088D8CBEE224AA54E620F6983BE0@AM3PR08MB0088.eurprd08.prod.outlook.com>

Hi,

(resend to post to GLIBC list too)

> Were the changes in glibc to optimize mempcpy as memcpy
> originally motivated by performance for ARM?

OK, so the goal behind this was to provide the best possible out-of-the-box performance
in GLIBC without requiring all targets to write a lot of assembler code. For less
frequently used functions which are identical to a standard function but with a different
return value, the most obvious and efficient implementation is to inline a call to the most
commonly used version. There are several good reasons to do this for mempcpy:

1. Few targets implement mempcpy.S, so most currently use the slow mempcpy.c veneer.
2. On most targets, merging mempcpy into memcpy looks impossible without
   slowing down memcpy as a result.
3. Adding a separate mempcpy.S implementation increases I-cache pressure, as
   now you need to load mempcpy too even if memcpy is already resident in L1/L2 cache.
4. GCC doesn't optimize/inline mempcpy as well as it does memcpy (see below).
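
For reference, the "inline a call to the most commonly used version" idea can be
sketched in C roughly as follows. This is only an illustration: the names
sketch_mempcpy and sketch_mempcpy_check are made up here, and this is not claimed
to be the exact text of the GLIBC header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name, for illustration only: mempcpy expressed as memcpy
   plus an adjusted return value (one past the last byte written).  */
static inline void *
sketch_mempcpy (void *restrict dest, const void *restrict src, size_t n)
{
  return (char *) memcpy (dest, src, n) + n;
}

/* Small self-check: the data is copied and the result is dest + n.  */
int
sketch_mempcpy_check (void)
{
  char src[8] = "abcdefg";
  char dst[8] = { 0 };
  void *end = sketch_mempcpy (dst, src, sizeof src);

  assert (end == (void *) (dst + sizeof src));
  assert (memcmp (dst, src, sizeof src) == 0);
  return 0;
}
```

Since the wrapper is just memcpy plus an add, the compiler's existing memcpy
expansion is reused and no separate mempcpy code needs to be resident.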

> The crux of the argument is that the compiler may be able
> to do a better job of optimizing if it knows the call was
> a mempcpy as opposed to memcpy + addition.

No, unfortunately even GCC 6 optimizes memcpy better than mempcpy:

return __builtin_memcpy(x, y, 32);

        ldp     x4, x5, [x1]
        stp     x4, x5, [x0]
        ldp     x4, x5, [x1, 16]
        stp     x4, x5, [x0, 16]
        ret

return __builtin_mempcpy(x, y, 32);

        mov     x2, 32
        b       mempcpy

return mempcpy(x, y, 32);  // using the GLIBC 2.23 inline

        mov     x2, x0
        add     x0, x0, 32
        ldp     x4, x5, [x1]
        stp     x4, x5, [x2]
        ldp     x4, x5, [x1, 16]
        stp     x4, x5, [x2, 16]
        ret
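
A small test file along these lines should be enough to reproduce the comparison.
The function names are made up for illustration, the third variant spells out
memcpy plus an add by hand rather than relying on any particular header version,
and the exact output will depend on the compiler version and options (something
like gcc -O2 -S on AArch64 is assumed).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Variant 1: plain memcpy of a constant size.  */
void *
copy_memcpy (void *x, const void *y)
{
  return __builtin_memcpy (x, y, 32);
}

/* Variant 2: the mempcpy builtin; may end up as a call to mempcpy.  */
void *
copy_builtin_mempcpy (void *x, const void *y)
{
  return __builtin_mempcpy (x, y, 32);
}

/* Variant 3: memcpy plus pointer adjustment, which is what an inline
   mempcpy-as-memcpy wrapper boils down to.  */
void *
copy_memcpy_plus_add (void *x, const void *y)
{
  return (char *) __builtin_memcpy (x, y, 32) + 32;
}

/* Sanity check that variants 2 and 3 agree on the return value.  */
int
copy_variants_check (void)
{
  char src[32], a[32], b[32];

  memset (src, 0x5a, sizeof src);
  assert (copy_builtin_mempcpy (a, src) == (void *) (a + 32));
  assert (copy_memcpy_plus_add (b, src) == (void *) (b + 32));
  assert (memcmp (a, b, sizeof a) == 0);
  return 0;
}
```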

So the only case where I can see a clear win for mempcpy is if you can do a good
merged implementation and GCC is fixed to always optimize mempcpy in exactly the
same way as memcpy. In that case, just define _HAVE_STRING_ARCH_mempcpy.
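
The opt-out works as an ordinary preprocessor guard. A rough sketch of the
pattern, using the hypothetical names DEMO_HAVE_ARCH_MEMPCPY and demo_mempcpy as
stand-ins rather than the real header contents:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* DEMO_HAVE_ARCH_MEMPCPY and demo_mempcpy are hypothetical names that stand
   in for the opt-out macro and the library entry point.  */
#ifndef DEMO_HAVE_ARCH_MEMPCPY
/* Default: no tuned mempcpy, so redirect to memcpy and fix up the result.  */
static inline void *
demo_mempcpy (void *restrict dest, const void *restrict src, size_t n)
{
  return (char *) memcpy (dest, src, n) + n;
}
#else
/* Target opted out: leave the call alone so it reaches the real mempcpy.  */
# define demo_mempcpy(dest, src, n) __builtin_mempcpy (dest, src, n)
#endif

/* Either branch must return a pointer just past the copied bytes.  */
int
demo_mempcpy_check (void)
{
  char src[16] = "0123456789abcde";
  char dst[16] = { 0 };

  assert (demo_mempcpy (dst, src, sizeof src) == (void *) (dst + sizeof src));
  assert (memcmp (dst, src, sizeof src) == 0);
  return 0;
}
```

A target with a good merged implementation would define the macro before the
guard is seen, so callers get the real function instead of the redirect.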

If we can get GCC to do the right thing depending on the preference of the target and
library, then things would be perfect.

Cheers,
Wilco



Thread overview: 5+ messages
2016-03-03 15:23 Carlos O'Donell
2016-03-03 20:55 ` H.J. Lu
2016-03-03 21:37   ` Carlos O'Donell
     [not found] ` <AM3PR08MB0088D8CBEE224AA54E620F6983BE0@AM3PR08MB0088.eurprd08.prod.outlook.com>
2016-03-04 16:57   ` Jakub Jelinek
2016-03-04 20:20   ` Wilco Dijkstra [this message]
