public inbox for glibc-bugs@sourceware.org
* [Bug string/28354] New: For x86_64 string/memory functions use of EVEX registers sets HI16_ZMM_state adding context switch overhead
@ 2021-09-20  2:23 goldstein.w.n at gmail dot com
From: goldstein.w.n at gmail dot com @ 2021-09-20  2:23 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=28354

            Bug ID: 28354
           Summary: For x86_64 string/memory functions use of EVEX
                    registers sets HI16_ZMM_state adding context switch
                    overhead
           Product: glibc
           Version: 2.34
            Status: UNCONFIRMED
          Severity: minor
          Priority: P2
         Component: string
          Assignee: unassigned at sourceware dot org
          Reporter: goldstein.w.n at gmail dot com
  Target Milestone: ---

Use of ymm16-ymm31 in the evex string/memory functions in
sysdeps/x86_64/multiarch sets HI16_ZMM_state to true.
See:
https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf#page=321

This defeats the init optimization of various xsave* context switching
instructions. Overall it adds at the very least 1024 bytes to context switches.
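
The 1024-byte figure follows from the register layout: the Hi16_ZMM state
component covers zmm16 through zmm31 at full width, and it is tracked as a
unit even when only the ymm halves are written. A minimal sketch of the
arithmetic, in Python (the names are illustrative and do not come from glibc
or the kernel):

```python
# Sketch: size of the upper-16 ZMM register state, from register widths alone.
ZMM_BYTES = 512 // 8             # one full zmm register is 512 bits wide
HI16_REGISTERS = range(16, 32)   # zmm16..zmm31, reachable only through EVEX


def hi16_zmm_state_bytes() -> int:
    """Bytes needed to save zmm16-zmm31 in full."""
    return len(HI16_REGISTERS) * ZMM_BYTES


print(hi16_zmm_state_bytes())  # 1024
```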

Simple reproduction:

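```
        .global _start
        .text
_start:
    # An EVEX write to ymm16-ymm31 marks HI16_ZMM_state as in use.
    vpxorq  %ymm16, %ymm16, %ymm16
    # vzeroupper only affects registers 0-15, so the state stays set.
    vzeroupper

loop:
    # Spin so that the process keeps being context switched.
    jmp loop


        # Not reached: exit(0).
        movl    $60, %eax
        xorl    %edi, %edi
        syscall
```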

Then check:

cat /proc/${pid}/arch_status


Which will show the state being continuously updated.
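
For repeated checks it can help to read the file programmatically. The
sketch below is one way to do that in Python. It assumes the file consists of
`key: value` lines and that the relevant field is named `AVX512_elapsed_ms`,
which is my understanding of the kernel's arch_status format and is not
stated in this report. The helper names are hypothetical.

```python
from typing import Dict, Optional


def parse_arch_status(text: str) -> Dict[str, str]:
    """Split 'key: value' lines into a dict; other lines are ignored."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def avx512_elapsed_ms(pid: int) -> Optional[int]:
    """Read the assumed AVX512_elapsed_ms field for a pid, if present."""
    with open(f"/proc/{pid}/arch_status") as f:
        value = parse_arch_status(f.read()).get("AVX512_elapsed_ms")
    return int(value) if value is not None else None


# Illustrative input only, not captured from a real run.
SAMPLE = "AVX512_elapsed_ms:\t8\n"
print(parse_arch_status(SAMPLE))
```

Calling `avx512_elapsed_ms(pid)` a few times against the spinning reproducer
and seeing a value that never grows large would be consistent with the state
being refreshed on each context switch.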

State is updated during context switch here:
https://elixir.bootlin.com/linux/v5.15-rc1/source/arch/x86/kernel/fpu/core.c#L108

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug string/28354] For x86_64 string/memory functions use of EVEX registers sets HI16_ZMM_state adding context switch overhead
From: goldstein.w.n at gmail dot com @ 2021-09-20  2:28 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=28354

Noah Goldstein <goldstein.w.n at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |goldstein.w.n at gmail dot com
             Target|                            |x86_64-linux

--- Comment #1 from Noah Goldstein <goldstein.w.n at gmail dot com> ---
My general opinion is that we should move the current evex functions to
evex-rtm and add a new class of evex functions which may use avx512
instructions but stay in the ymm0-ymm15 register range.

Benefits:

1) evex instructions cost more code size (+2 bytes at least)
2) It's impossible to encode certain useful instructions with the evex prefix
(e.g. `vpcmpeq`)
3) We may be adding 1024 bytes to users' context switches.

Costs:

1) vzeroupper is not free (in terms of code size or execution).
2) more total code size consumed by the library (this is limited by the fact
that they will be in their own section and users will generally only stay in
one section for all string/memory functions)


* [Bug string/28354] For x86_64 string/memory functions use of EVEX registers sets HI16_ZMM_state adding context switch overhead
From: hjl.tools at gmail dot com @ 2021-09-20 14:09 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=28354

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com


* [Bug string/28354] For x86_64 string/memory functions use of EVEX registers sets HI16_ZMM_state adding context switch overhead
From: goldstein.w.n at gmail dot com @ 2021-09-22 21:10 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=28354

--- Comment #2 from Noah Goldstein <goldstein.w.n at gmail dot com> ---
Also worth noting that if we stick to the vec0-vec15 range we may be able to
get away with using `zmm` registers, as `vzeroupper` does appear to effectively
clear `ZMM_HI256_state`, so any context switch/frequency burdens would be
contained to the function.
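
To make the trade-off concrete, here is a small Python sketch of the AVX-512
state components and their save-area sizes. The bit positions and sizes are
the architectural ones as I understand them. The example bitmaps are
hypothetical and were not measured. Whether `vzeroupper` really returns
`ZMM_HI256_state` to its init state is the observation above, not something
this sketch verifies.

```python
# XSAVE state-component bits relevant here; bits not listed are ignored.
XSTATE_COMPONENTS = {
    0: "x87",
    1: "SSE",
    2: "AVX",
    5: "opmask",
    6: "ZMM_Hi256",
    7: "Hi16_ZMM",
}

# Save-area bytes for the AVX-512 components.
AVX512_SIZES = {"opmask": 64, "ZMM_Hi256": 512, "Hi16_ZMM": 1024}


def components_in_use(bitmap: int) -> list:
    """Names of the listed components whose bit is set in the bitmap."""
    return [name for bit, name in sorted(XSTATE_COMPONENTS.items())
            if (bitmap >> bit) & 1]


def avx512_bytes(bitmap: int) -> int:
    """Bytes of AVX-512 state that are not in their init state."""
    return sum(AVX512_SIZES.get(name, 0) for name in components_in_use(bitmap))


# Hypothetical bitmaps: after touching ymm16, and after staying in 0-15.
print(components_in_use(0x87), avx512_bytes(0x87))
print(components_in_use(0x07), avx512_bytes(0x07))
```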
