public inbox for glibc-bugs@sourceware.org
* [Bug libc/16830] New: memset performance regression
@ 2014-04-10 16:04 schwab@linux-m68k.org
  2014-04-28  8:12 ` [Bug libc/16830] " neleai at seznam dot cz
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: schwab@linux-m68k.org @ 2014-04-10 16:04 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=16830

            Bug ID: 16830
           Summary: memset performance regression
           Product: glibc
           Version: 2.18
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
          Assignee: unassigned at sourceware dot org
          Reporter: schwab@linux-m68k.org
                CC: drepper.fsp at gmail dot com, neleai at seznam dot cz
              Host: x86_64-*-*

https://java.net/projects/libmicro

                    usecs/call            usecs/call
                    glibc 2.11            glibc 2.18 (change vs. 2.11)
Intel Harpertown Socket 771
Unit memset_10m     3743.0016 (  0.00%)   5447.0912 (-45.53%)
Unit memsetP2_10m   7541.9904 (  0.00%)  11094.2976 (-47.10%)

Xeon(R) CPU E5504 Gainestown
Unit memset_10m     1147.8016 (  0.00%)   1702.6816 (-48.34%)
Unit memsetP2_10m   2058.1120 (  0.00%)   2864.3072 (-39.17%)

Opteron 270 
Unit memset_10m     2112.1584 (  0.00%)   4910.7742 (-132.50%)
Unit memsetP2_10m   2145.8658 (  0.00%)   4957.2592 (-131.01%)
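
The percentage column appears to be the change in usecs/call relative to
the 2.11 baseline, with negative values meaning slower. A minimal C sketch
of that arithmetic follows; the function name relative_change_pct is
hypothetical and is not part of libmicro.

```c
#include <assert.h>

/* Relative change in percent versus a baseline time.
 * Negative results mean the current measurement is slower.
 * Hypothetical helper for illustration, not a libmicro function. */
double relative_change_pct(double baseline_usecs, double current_usecs)
{
    assert(baseline_usecs > 0.0);
    return (baseline_usecs - current_usecs) / baseline_usecs * 100.0;
}
```

Fed the Harpertown memset_10m row (3743.0016 and 5447.0912), this
reproduces the reported figure of about -45.53%.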

-- 
You are receiving this mail because:
You are on the CC list for the bug.



* [Bug libc/16830] memset performance regression
From: neleai at seznam dot cz @ 2014-04-28  8:12 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=16830

--- Comment #1 from Ondrej Bilka <neleai at seznam dot cz> ---
That's because I have not yet added non-temporal loops for large sizes. I
will send a patch.


* [Bug libc/16830] memset performance regression
From: matz at suse dot de @ 2014-05-21 12:55 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=16830

Michael Matz <matz at suse dot de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at suse dot de

--- Comment #2 from Michael Matz <matz at suse dot de> ---
(In reply to Ondrej Bilka from comment #1)
> That's because I have not yet added non-temporal loops for large sizes. I
> will send a patch.

Any progress yet?


* [Bug libc/16830] memset performance regression
From: matz at suse dot de @ 2014-05-21 14:21 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=16830

--- Comment #3 from Michael Matz <matz at suse dot de> ---
Created attachment 7611
  --> https://sourceware.org/bugzilla/attachment.cgi?id=7611&action=edit
Patch adding non-temporal stores

This patch adds non-temporal stores for large block sizes.  On my machine
(an Opteron) with the libmicro benchmark "memset -s 10m" I get:

glibc 2.17:
             prc thr   usecs/call      samples   errors cnt/samp     size
memset         1   1   2424.64635           97        0       20 10485760

glibc 2.19:
             prc thr   usecs/call      samples   errors cnt/samp     size
memset         1   1   3539.25120           97        0       20 10485760

glibc 2.19 with patch:
             prc thr   usecs/call      samples   errors cnt/samp     size
memset         1   1   2524.34610          102        0       20 10485760

So it is indeed the non-temporal stores that improve performance.  It's still
a bit slower than it once was, but much more reasonable.  The old code
filled 128 bytes per loop iteration, whereas the new code fills only 64,
which might explain the remaining difference.  So the real patch should also
use a 128-byte loop.
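
The idea of switching to non-temporal stores for large sizes can be
sketched in C with SSE2 intrinsics. This is only an illustration and not
the attached patch (which changes glibc's assembly memset): the function
name memset_nt_sketch and the NT_THRESHOLD cutoff are assumptions made for
the example, and the thread does not state which threshold the patch uses.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_set1_epi8, _mm_stream_si128, _mm_sfence */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cutoff; the real value would need tuning per CPU. */
#define NT_THRESHOLD ((size_t)1 << 20)

void *memset_nt_sketch(void *dst, int c, size_t n)
{
    unsigned char *p = dst;

    /* Small fills stay on the ordinary, cache-friendly path. */
    if (n < NT_THRESHOLD)
        return memset(dst, c, n);

    /* Fill the unaligned head so that p becomes 16-byte aligned. */
    size_t head = (size_t)(-(uintptr_t)p & 15);
    memset(p, c, head);
    p += head;
    n -= head;
    assert(((uintptr_t)p & 15) == 0);

    /* Main loop: 64 bytes per iteration, written past the cache. */
    __m128i v = _mm_set1_epi8((char)c);
    while (n >= 64) {
        _mm_stream_si128((__m128i *)(p + 0), v);
        _mm_stream_si128((__m128i *)(p + 16), v);
        _mm_stream_si128((__m128i *)(p + 32), v);
        _mm_stream_si128((__m128i *)(p + 48), v);
        p += 64;
        n -= 64;
    }

    /* Order the streaming stores before any later stores. */
    _mm_sfence();

    /* Remaining tail of fewer than 64 bytes. */
    memset(p, c, n);
    return dst;
}
```

A 128-byte variant would issue eight streaming stores per iteration
instead of four.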


* [Bug libc/16830] memset performance regression
From: fweimer at redhat dot com @ 2014-06-12 19:43 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=16830

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |security-


* [Bug string/16830] memset performance regression
From: jsm28 at gcc dot gnu.org @ 2015-08-27 22:21 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=16830

Joseph Myers <jsm28 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|libc                        |string


* [Bug string/16830] memset performance regression
From: cfester at tmriusa dot com @ 2015-10-16  3:55 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=16830

Chris Fester <cfester at tmriusa dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cfester at tmriusa dot com

--- Comment #4 from Chris Fester <cfester at tmriusa dot com> ---
Created attachment 8725
  --> https://sourceware.org/bugzilla/attachment.cgi?id=8725&action=edit
test illustrating differences in speed between movdqa and movntdq


* [Bug string/16830] memset performance regression
From: cfester at tmriusa dot com @ 2015-10-16  4:39 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=16830

--- Comment #5 from Chris Fester <cfester at tmriusa dot com> ---
We have also seen a performance regression in memset.  Our previous
system software used eglibc 2.15.  When we were evaluating Yocto 1.6, we
started using eglibc 2.19 and saw the problem.  We continued to see it with
glibc 2.21 when using Yocto 1.8.

Our system software frequently uses memset to initialize large areas of
memory.  We break up the memory area into chunks and dedicate all 16 cores
to the job, each memset-ing one chunk.  This likely thrashes the caches
quite a bit, although I'll admit I haven't looked at any performance
counters.
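
A rough C sketch of that chunked, multi-threaded fill is shown here using
POSIX threads. It is an illustration only and not our production code or
the attached test: the names parallel_memset, fill_chunk, struct chunk and
MAX_THREADS are hypothetical, and the even split with the remainder going
to the last thread is an assumption.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX_THREADS 64  /* hypothetical upper bound for the sketch */

struct chunk {
    unsigned char *start;
    size_t len;
    int value;
};

/* Thread entry point: fill one chunk with plain memset. */
static void *fill_chunk(void *arg)
{
    struct chunk *ch = arg;
    memset(ch->start, ch->value, ch->len);
    return NULL;
}

/* Split buf into nthreads chunks and fill them concurrently.
 * Returns 0 on success or the pthread_create error code. */
int parallel_memset(void *buf, int value, size_t n, unsigned nthreads)
{
    pthread_t tid[MAX_THREADS];
    struct chunk chunks[MAX_THREADS];
    unsigned started = 0;
    int rc = 0;

    assert(nthreads >= 1 && nthreads <= MAX_THREADS);
    size_t per = n / nthreads;

    for (unsigned i = 0; i < nthreads; i++) {
        chunks[i].start = (unsigned char *)buf + (size_t)i * per;
        /* The last chunk also takes the remainder of the division. */
        chunks[i].len = (i == nthreads - 1) ? n - (size_t)i * per : per;
        chunks[i].value = value;
        rc = pthread_create(&tid[i], NULL, fill_chunk, &chunks[i]);
        if (rc != 0)
            break;
        started++;
    }

    /* Wait for every thread that was actually started. */
    for (unsigned i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return rc;
}
```

With 16 threads this would correspond to one chunk per core on the
hardware described below, assuming the scheduler spreads the threads
across cores; the sketch does not pin threads.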

For Yocto 1.6 we patched eglibc to revert to the unrolled loop version of
memset.S.  For Yocto 1.8, we actually produced a patch strikingly similar
to Michael Matz's patch to fix the regression.

Some hardware details:
CPU - 2 Sandy Bridge based Xeon CPUs, 16 cores total
Memory - 32 GB on each node

I attached a C source file that does some cycle counting for memset with
multiple threads.  If I have time tomorrow I will attach an ODS spreadsheet
with a graph illustrating the data we collected, as well as the patch we're
currently using to work around the problem.

Please let us know how we can help with this issue.

Thanks,
Chris Fester


* [Bug string/16830] memset performance regression
From: hjl.tools at gmail dot com @ 2024-05-09 19:16 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=16830

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com,
                   |                            |skpgkp2 at gmail dot com
