public inbox for libc-alpha@sourceware.org
* Why does glibc use AVX-512?
@ 2021-03-26  4:38 Andy Lutomirski
From: Andy Lutomirski @ 2021-03-26  4:38 UTC
  To: libc-alpha, H. J. Lu, X86 ML, LKML, Bae, Chang Seok,
	Florian Weimer, Carlos O'Donell, Rich Felker

Hi all-

glibc appears to use AVX512F for memcpy by default.  (Unless
Prefer_ERMS is default-on; I did some searching, but I genuinely
can't tell whether it is.)  The commit adding it refers to a 2016
email saying that it's 30% faster on KNL.  Unfortunately, AVX-512 is now
available in normal hardware, and the overhead from switching between
normal and AVX-512 code appears to vary from bad to genuinely
horrible.  And, once anything has used the high parts of YMM and/or
ZMM, those states tend to get stuck with XINUSE=1.

I'm wondering whether glibc should stop using AVX-512 by default.

Meanwhile, some of you may have noticed a little ABI break we have.
On AVX-512 hardware, the size of a signal frame is unreasonably large,
and this is causing problems even for existing software that doesn't
use AVX-512.  Do any of you have any clever ideas for how to fix it?
We have some kernel patches around to try to fail more cleanly, but we
still fail.

I think we should seriously consider solutions in which, for new
tasks, XCR0 has new giant features (e.g. AMX) and possibly even
AVX-512 cleared, and programs need to explicitly request enablement.
This would allow programs to opt into not saving/restoring across
signals or to save/restore in buffers supplied when the feature is
enabled.  This has all kinds of pros and cons, and I'm not sure it's a
great idea.  But, in the absence of some change to the ABI, the
default outcome is that, on AMX-enabled kernels on AMX-enabled
hardware, the signal frame will be more than 8kB, and this will affect
*every* signal regardless of whether AMX is in use.

--Andy
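
The frame-size concern in the message above can be made concrete with a
rough calculation.  The sketch below is only an illustration: the
offsets and sizes are the standard-format XSAVE values typically
enumerated by CPUID leaf 0xD on Intel hardware, taken here as an
assumption (real values must be read from CPUID), and the names
COMPONENTS and xsave_area_size are hypothetical.  It also ignores
everything in a signal frame other than the XSAVE area, so the real
frame is somewhat larger.

```python
# Assumed standard-format (non-compacted) XSAVE layout: name -> (offset, size).
# These are typical Intel values, not read from the running CPU.
LEGACY_AND_HEADER = 512 + 64  # x87/SSE legacy region plus XSAVE header

COMPONENTS = {
    "avx": (576, 256),         # upper halves of YMM0-15
    "opmask": (1088, 64),      # AVX-512 k0-k7
    "zmm_hi256": (1152, 512),  # upper halves of ZMM0-15
    "hi16_zmm": (1664, 1024),  # ZMM16-31
    "pkru": (2688, 8),
    "tilecfg": (2752, 64),     # AMX tile configuration
    "tiledata": (2816, 8192),  # AMX tile data
}

AVX512 = {"avx", "opmask", "zmm_hi256", "hi16_zmm"}
AMX = {"tilecfg", "tiledata"}


def xsave_area_size(enabled):
    """Size of the standard-format XSAVE area for the enabled components.

    In the standard format the area extends to the end of the
    highest-offset enabled component, regardless of which components
    are actually in use (XINUSE).
    """
    end = LEGACY_AND_HEADER
    for name in enabled:
        offset, size = COMPONENTS[name]
        end = max(end, offset + size)
    return end


if __name__ == "__main__":
    for label, feats in [
        ("SSE only", set()),
        ("AVX", {"avx"}),
        ("AVX-512", AVX512),
        ("AVX-512 + AMX", AVX512 | AMX),
    ]:
        print(f"{label:15s} {xsave_area_size(feats):6d} bytes")
```

Under these assumptions the AVX-512 area alone exceeds the traditional
x86-64 MINSIGSTKSZ of 2048 bytes, and the AMX area exceeds the
traditional SIGSTKSZ of 8192 bytes, consistent with the "more than
8kB" figure in the message.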

Thread overview: 14+ messages
2021-03-26  4:38 Why does glibc use AVX-512? Andy Lutomirski
2021-03-26 10:06 ` Borislav Petkov
2021-03-26 18:17   ` Andy Lutomirski
2021-03-26 12:12 ` Florian Weimer
2021-03-26 18:14   ` Andy Lutomirski
2021-03-26 19:34     ` Florian Weimer
2021-03-26 19:47       ` Andy Lutomirski
2021-03-26 20:06         ` Andrew Cooper
2021-03-26 20:35         ` Florian Weimer
2021-03-26 20:43           ` H.J. Lu
2021-03-26 20:48           ` Andy Lutomirski
2021-03-26 21:11             ` Florian Weimer
2021-03-26 21:21               ` Andy Lutomirski
2021-03-26 13:32 ` David Laight
