public inbox for libc-alpha@sourceware.org
* memcpy performance regressions 2.19 -> 2.24(5)
From: Erich Elsen @ 2017-05-05 17:09 UTC
  To: libc-alpha

Hi everyone,

I've noticed memcpy performance regressions on certain processors and
at certain sizes when moving from 2.19 to 2.24 (and 2.25).

In this spreadsheet
(https://docs.google.com/spreadsheets/d/1Mpu1Kr9CNaa9HQjzKGL0tb2x_Nsx8vtLK3b0QnKesHg/edit?usp=sharing)
the regressions are highlighted in red.  The three benchmarks are:

readwritecache: both read and write locations are cached (if possible)
nocache: neither read nor write locations will be cached
readcache: only the read location will be cached (if possible)
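
To make the "readwritecache" case concrete, here is a minimal sketch of a timing loop in that style. It is not the benchmark behind the spreadsheet: the helper name `copy_throughput`, the use of `clock()` (CPU time), and the buffer setup are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: copy N bytes ITERS times between the same two
   buffers and return the throughput in bytes per second (0.0 if the run
   was too short for clock() to resolve, -1.0 on allocation failure).
   Reusing the same buffers keeps both the read and the write locations
   cached whenever they fit, as in the "readwritecache" case.  */
static double
copy_throughput (size_t n, size_t iters)
{
  assert (n > 0 && iters > 0);
  char *src = malloc (n);
  char *dst = malloc (n);
  if (src == NULL || dst == NULL)
    {
      free (src);
      free (dst);
      return -1.0;
    }
  /* Touch both buffers so the pages are mapped before timing starts.  */
  memset (src, 1, n);
  memset (dst, 0, n);

  clock_t t0 = clock ();
  for (size_t i = 0; i < iters; i++)
    {
      memcpy (dst, src, n);
      /* Compiler barrier (GCC/Clang syntax) so the copies are not
         optimized away.  */
      __asm__ volatile ("" : : "r" (dst) : "memory");
    }
  clock_t t1 = clock ();

  double secs = (double) (t1 - t0) / CLOCKS_PER_SEC;
  free (src);
  free (dst);
  return secs > 0.0 ? (double) n * (double) iters / secs : 0.0;
}
```

A "nocache" style variant would presumably walk through a region much larger than the last-level cache instead of reusing one pair of buffers; that part is not shown here.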

The regressions on IvyBridge are especially concerning and can be
fixed by using __memcpy_avx_unaligned instead of the current default
(__memcpy_sse2_unaligned_erms).

The regressions at large sizes on IvyBridge and SandyBridge seem to be
due to the use of non-temporal stores; avoiding them restores
performance to 2.19 levels.
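
For readers unfamiliar with the term, the sketch below shows what a copy using non-temporal (streaming) stores above a size threshold can look like. It is only an illustration and not glibc's implementation: the function names, the `nt_threshold` parameter, and the alignment restrictions are assumptions, and the code falls back to plain memcpy when SSE2 is not available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined (__SSE2__)
# include <emmintrin.h>
#endif

/* Hypothetical helper: copy N bytes with non-temporal stores, which write
   around the cache instead of filling it.  DST must be 16-byte aligned
   and N a multiple of 16.  */
static void
copy_streaming (void *dst, const void *src, size_t n)
{
  assert (((uintptr_t) dst & 15) == 0 && n % 16 == 0);
#if defined (__SSE2__)
  __m128i *d = (__m128i *) dst;
  const __m128i *s = (const __m128i *) src;
  for (size_t i = 0; i < n / 16; i++)
    _mm_stream_si128 (d + i, _mm_loadu_si128 (s + i));
  /* Order the streaming stores before any later stores.  */
  _mm_sfence ();
#else
  memcpy (dst, src, n);
#endif
}

/* Hypothetical dispatcher: use streaming stores only for copies of at
   least NT_THRESHOLD bytes that meet the alignment restrictions.  */
static void *
copy_with_threshold (void *dst, const void *src, size_t n,
                     size_t nt_threshold)
{
  int streamable = ((uintptr_t) dst & 15) == 0 && n % 16 == 0;
  if (n >= nt_threshold && streamable)
    copy_streaming (dst, src, n);
  else
    memcpy (dst, src, n);
  return dst;
}
```

Passing `SIZE_MAX` as `nt_threshold` disables the streaming path entirely, which corresponds to "avoiding non-temporal stores" in the sense used above.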

The regressions on Haswell can be fixed by using
__memcpy_avx_unaligned instead of __memcpy_avx_unaligned_erms in the
range 32KB <= N <= 4MB.

I had a couple of questions:

1) Are the large regressions at large sizes for IvyBridge and
SandyBridge expected?  Is avoiding non-temporal stores a reasonable
solution?

2) Is it possible to fix the IvyBridge regressions by using model
information to force a specific implementation?  I'm not sure how
other CPUs (AMD) would be affected if the selection logic were
modified based on feature flags.
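
As an illustration of what "using model information" could mean, the sketch below decodes the family and model fields from the EAX value returned by CPUID leaf 1 and matches the IvyBridge models (family 6, models 0x3a and 0x3e). The struct, the function names, and the returned strings are hypothetical; this is not glibc's selection code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct cpu_sig
{
  unsigned int family;
  unsigned int model;
};

/* Decode the display family and model from CPUID leaf 1 EAX.  The
   extended model bits are applied for base family 6 or 15, following
   the Intel convention.  */
static struct cpu_sig
decode_signature (uint32_t eax)
{
  struct cpu_sig sig;
  unsigned int ext_model = (eax >> 16) & 0xf;
  unsigned int ext_family = (eax >> 20) & 0xff;
  sig.family = (eax >> 8) & 0xf;
  sig.model = (eax >> 4) & 0xf;
  if (sig.family == 0x6 || sig.family == 0xf)
    sig.model |= ext_model << 4;
  if (sig.family == 0xf)
    sig.family += ext_family;
  return sig;
}

/* The vendor check matters: model numbers are only meaningful within one
   vendor, so an AMD part must never match on the model alone.  */
static bool
is_ivybridge (bool is_intel, struct cpu_sig sig)
{
  return is_intel && sig.family == 0x6
         && (sig.model == 0x3a || sig.model == 0x3e);
}

/* Hypothetical selector: name the variant to force on IvyBridge and
   leave every other CPU on the existing feature-flag logic.  */
static const char *
preferred_memcpy (bool is_intel, uint32_t eax)
{
  if (is_ivybridge (is_intel, decode_signature (eax)))
    return "__memcpy_avx_unaligned";
  return "default";
}
```

With GCC or Clang on x86, the EAX value can be obtained with `__get_cpuid (1, &eax, &ebx, &ecx, &edx)` from `<cpuid.h>`. Keying on vendor plus model leaves the feature-flag path untouched for AMD and other CPUs.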

Thanks,
Erich


