* RFC: Add <sys/platform/x86.h> to glibc 2.29
@ 2018-07-29 16:16 H.J. Lu
2018-07-31 13:32 ` Carlos O'Donell
2018-08-17 14:50 ` Florian Weimer
0 siblings, 2 replies; 12+ messages in thread
From: H.J. Lu @ 2018-07-29 16:16 UTC (permalink / raw)
To: GNU C Library
I am proposing to add <sys/platform/x86.h> to glibc 2.29 to provide
an API to access x86 specific platform features:
enum
{
/* The integer bit array index of architecture feature bits. */
FEATURE_INDEX_1 = 0,
FEATURE_INDEX_2,
/* The current maximum size of the feature integer bit array. */
FEATURE_INDEX_MAX
};
enum
{
COMMON_CPUID_INDEX_1 = 0,
COMMON_CPUID_INDEX_7,
COMMON_CPUID_INDEX_80000001,
COMMON_CPUID_INDEX_D_ECX_1,
COMMON_CPUID_INDEX_80000007,
COMMON_CPUID_INDEX_80000008,
/* Keep the following line at the end. */
COMMON_CPUID_INDEX_MAX
};
enum cpu_features_kind
{
arch_kind_unknown = 0,
arch_kind_intel,
arch_kind_amd,
arch_kind_other
};
struct cpuid_registers
{
unsigned int eax;
unsigned int ebx;
unsigned int ecx;
unsigned int edx;
};
extern enum cpu_features_kind __x86_get_cpu_kind (void)
__attribute__ ((const));
extern const struct cpuid_registers *__x86_get_cpuid_registers
(unsigned int) __attribute__ ((const));
extern unsigned int __x86_get_arch_feature (unsigned int)
__attribute__ ((const));
extern unsigned long long __x86_tsc_to_ns (unsigned long long);
extern unsigned long long __x86_ns_to_tsc (unsigned long long);
/* HAS_* evaluates to true if we may use the feature at runtime. */
#define HAS_CPU_FEATURE(name) \
((__x86_get_cpuid_registers (index_cpu_##name)->reg_##name \
& (bit_cpu_##name)) != 0)
#define HAS_ARCH_FEATURE(name) \
((__x86_get_arch_feature (index_arch_##name) & (bit_arch_##name)) != 0)
Users can use <sys/platform/x86.h> to detect if a CPU feature is
supported at run-time with
HAS_CPU_FEATURE (AVX)
which returns true if glibc detects that AVX is available. Before AVX
can be used, he/she should check
HAS_ARCH_FEATURE (AVX_Usable)
which returns true if glibc detects that AVX is available and usable.
extern unsigned long long __x86_tsc_to_ns (unsigned long long);
extern unsigned long long __x86_ns_to_tsc (unsigned long long);
can be used to convert between TSCs and nanoseconds if
HAS_ARCH_FEATURE (TSC_To_NS_Usable)
returns true.
Backward binary compatibility is provided by:
1. __x86_get_cpu_kind will get a new version if a new CPU kind is added
to cpu_features_kind.
2. __x86_get_cpuid_registers returns a pointer to a cpuid_registers
with all zeros if cpuid index >= COMMON_CPUID_INDEX_MAX.
3. __x86_get_arch_feature returns 0 if architecture index >=
FEATURE_INDEX_MAX.
This means:
1. Applications linked against the new __x86_get_cpu_kind will only
run with the newer libc.so.
2. Applications linked against the new __x86_get_cpuid_registers and
__x86_get_arch_feature will run with the older libc.so. But new
CPU and architecture features won't be detected by the older libc.so.
This program:
https://github.com/hjl-tools/glibc/blob/hjl/platform/master/sysdeps/x86/tst-x86-platform-1.c
may display:
[hjl@gnu-efi-2 build-x86_64-linux]$ ./elf/tst-x86-platform-1
Vendor: Intel
CPU features:
SSE3
PCLMULQDQ
DTES64
MONITOR
DS_CPL
VMX
EST
TM2
SSSE3
SDBG
FMA
CMPXCHG16B
XTPRUPDCTRL
PDCM
PCID
SSE4_1
SSE4_2
X2APIC
MOVBE
POPCOUNT
TSC_DEADLINE
AES
XSAVE
OSXSAVE
AVX
F16C
RDRAND
FPU
VME
DE
PSE
TSC
MSR
PAE
MCE
CX8
APIC
SEP
MTRR
PGE
MCA
CMOV
PAT
PSE_36
DS
ACPI
MMX
FXSR
SSE
SSE2
SS
HTT
TM
PBE
FSGSBASE
TSC_ADJUST
SGX
BMI1
AVX2
BMI2
ERMS
INVPCID
MPX
RDSEED
ADX
SMAP
TRACE
LAHF64_SAHF64
LZCNT
PREFETCHW
SYSCALL_SYSRET
NX
PAGE1GB
RDTSCP
LM
XSAVEOPT
XSAVEC
XGETBV_ECX_1
XSAVES
INVARIANT_TSC
Architecture features:
AVX_Usable
AVX2_Usable
FMA_Usable
XSAVEC_Usable
TSC_To_NS_Usable
[hjl@gnu-efi-2 build-x86_64-linux]$
Any comments?
H.J.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29
2018-07-29 16:16 RFC: Add <sys/platform/x86.h> to glibc 2.29 H.J. Lu
@ 2018-07-31 13:32 ` Carlos O'Donell
2018-07-31 14:49 ` H.J. Lu
2018-08-17 14:50 ` Florian Weimer
1 sibling, 1 reply; 12+ messages in thread
From: Carlos O'Donell @ 2018-07-31 13:32 UTC (permalink / raw)
To: H.J. Lu, GNU C Library
On 07/29/2018 12:16 PM, H.J. Lu wrote:
> I am proposing to add <sys/platform/x86.h> to glibc 2.29 to provide
> an API to access x86 specific platform features:
>
> enum
> {
> /* The integer bit array index of architecture feature bits. */
> FEATURE_INDEX_1 = 0,
> FEATURE_INDEX_2,
> /* The current maximum size of the feature integer bit array. */
> FEATURE_INDEX_MAX
> };
>
> enum
> {
> COMMON_CPUID_INDEX_1 = 0,
> COMMON_CPUID_INDEX_7,
> COMMON_CPUID_INDEX_80000001,
> COMMON_CPUID_INDEX_D_ECX_1,
> COMMON_CPUID_INDEX_80000007,
> COMMON_CPUID_INDEX_80000008,
> /* Keep the following line at the end. */
> COMMON_CPUID_INDEX_MAX
> };
>
> enum cpu_features_kind
> {
> arch_kind_unknown = 0,
> arch_kind_intel,
> arch_kind_amd,
> arch_kind_other
> };
>
> struct cpuid_registers
> {
> unsigned int eax;
> unsigned int ebx;
> unsigned int ecx;
> unsigned int edx;
> };
>
> extern enum cpu_features_kind __x86_get_cpu_kind (void)
> __attribute__ ((const));
>
> extern const struct cpuid_registers *__x86_get_cpuid_registers
> (unsigned int) __attribute__ ((const));
>
> extern unsigned int __x86_get_arch_feature (unsigned int)
> __attribute__ ((const));
>
> extern unsigned long long __x86_tsc_to_ns (unsigned long long);
> extern unsigned long long __x86_ns_to_tsc (unsigned long long);
>
> /* HAS_* evaluates to true if we may use the feature at runtime. */
> #define HAS_CPU_FEATURE(name) \
> ((__x86_get_cpuid_registers (index_cpu_##name)->reg_##name \
> & (bit_cpu_##name)) != 0)
I don't like these as macros.
I would rather an inline function that does all the work within the function
and returns a result of the appropriate type e.g. boolean.
> #define HAS_ARCH_FEATURE(name) \
> ((__x86_get_arch_feature (index_arch_##name) & (bit_arch_##name)) != 0)
>
Likewise.
> Users can use <sys/platform/x86.h> to detect if a CPU feature is
> supported at run-time with
>
> HAS_CPU_FEATURE (AVX)
Likewise.
> which returns true if glibc detects that AVX is available. Before AVX
> can be used, he/she should check
>
> HAS_ARCH_FEATURE (AVX_Usable)
Likewise.
These macros like this are harder to debug than static inline functions.
> which returns true if glibc detects that AVX is available and usable.
>
> extern unsigned long long __x86_tsc_to_ns (unsigned long long);
> extern unsigned long long __x86_ns_to_tsc (unsigned long long);
>
> can be used to convert between TSCs and nanoseconds if
>
> HAS_ARCH_FEATURE (TSC_To_NS_Usable)
>
> returns true.
>
> Backward binary compatibility is provided by:
>
> 1. __x86_get_cpu_kind will get a new version if a new CPU kind is added
> to cpu_features_kind.
This function takes void and returns the kind of CPU as an enum value. OK.
> 2. __x86_get_cpuid_registers returns a pointer to a cpuid_registers
> with all zeros if cpuid index >= COMMON_CPUID_INDEX_MAX.
Could you expand on this in more detail please.
> 3. __x86_get_arch_feature returns 0 if architecture index >=
> FEATURE_INDEX_MAX.
Likewise.
> This means:
>
> 1. Applications linked against the new __x86_get_cpu_kind will only
> run with the newer libc.so.
How? Symbol versioning?
> 2. Applications linked against the new __x86_get_cpuid_registers and
> __x86_get_arch_feature will run with the older libc.so. But new
> CPU and architecture features won't be detected by the older libc.so.
The caller must attempt to copy or inspect all the data, but it's structure
size may be larger, and the old glibc structure may be smaller, how do you
prevent the newer program from reading out of bounds when run in an older
glibc? This question is only relevant for __x86_get_cpuid_registers which
returns a pointer to a fixed size structure. The structure can grow, but
the caller needs to know exactly how big the returned structure is to avoid
reading outside of bounds.
> This program:
>
> https://github.com/hjl-tools/glibc/blob/hjl/platform/master/sysdeps/x86/tst-x86-platform-1.c
>
> may display:
>
> [hjl@gnu-efi-2 build-x86_64-linux]$ ./elf/tst-x86-platform-1
> Vendor: Intel
> CPU features:
> SSE3
> PCLMULQDQ
> DTES64
> MONITOR
> DS_CPL
> VMX
> EST
> TM2
> SSSE3
> SDBG
> FMA
> CMPXCHG16B
> XTPRUPDCTRL
> PDCM
> PCID
> SSE4_1
> SSE4_2
> X2APIC
> MOVBE
> POPCOUNT
> TSC_DEADLINE
> AES
> XSAVE
> OSXSAVE
> AVX
> F16C
> RDRAND
> FPU
> VME
> DE
> PSE
> TSC
> MSR
> PAE
> MCE
> CX8
> APIC
> SEP
> MTRR
> PGE
> MCA
> CMOV
> PAT
> PSE_36
> DS
> ACPI
> MMX
> FXSR
> SSE
> SSE2
> SS
> HTT
> TM
> PBE
> FSGSBASE
> TSC_ADJUST
> SGX
> BMI1
> AVX2
> BMI2
> ERMS
> INVPCID
> MPX
> RDSEED
> ADX
> SMAP
> TRACE
> LAHF64_SAHF64
> LZCNT
> PREFETCHW
> SYSCALL_SYSRET
> NX
> PAGE1GB
> RDTSCP
> LM
> XSAVEOPT
> XSAVEC
> XGETBV_ECX_1
> XSAVES
> INVARIANT_TSC
> Architecture features:
> AVX_Usable
> AVX2_Usable
> FMA_Usable
> XSAVEC_Usable
> TSC_To_NS_Usable
> [hjl@gnu-efi-2 build-x86_64-linux]$
>
> Any comments?
The data you want to share seems reasonable, but I expect we'll need
some iteration on the interface design.
> H.J.
>
--
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29
2018-07-31 13:32 ` Carlos O'Donell
@ 2018-07-31 14:49 ` H.J. Lu
2018-07-31 15:06 ` Carlos O'Donell
0 siblings, 1 reply; 12+ messages in thread
From: H.J. Lu @ 2018-07-31 14:49 UTC (permalink / raw)
To: Carlos O'Donell; +Cc: GNU C Library
On Tue, Jul 31, 2018 at 6:32 AM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 07/29/2018 12:16 PM, H.J. Lu wrote:
>> I am proposing to add <sys/platform/x86.h> to glibc 2.29 to provide
>> an API to access x86 specific platform features:
>>
>> enum
>> {
>> /* The integer bit array index of architecture feature bits. */
>> FEATURE_INDEX_1 = 0,
>> FEATURE_INDEX_2,
>> /* The current maximum size of the feature integer bit array. */
>> FEATURE_INDEX_MAX
>> };
>>
>> enum
>> {
>> COMMON_CPUID_INDEX_1 = 0,
>> COMMON_CPUID_INDEX_7,
>> COMMON_CPUID_INDEX_80000001,
>> COMMON_CPUID_INDEX_D_ECX_1,
>> COMMON_CPUID_INDEX_80000007,
>> COMMON_CPUID_INDEX_80000008,
>> /* Keep the following line at the end. */
>> COMMON_CPUID_INDEX_MAX
>> };
>>
>> enum cpu_features_kind
>> {
>> arch_kind_unknown = 0,
>> arch_kind_intel,
>> arch_kind_amd,
>> arch_kind_other
>> };
>>
>> struct cpuid_registers
>> {
>> unsigned int eax;
>> unsigned int ebx;
>> unsigned int ecx;
>> unsigned int edx;
>> };
>>
>> extern enum cpu_features_kind __x86_get_cpu_kind (void)
>> __attribute__ ((const));
>>
>> extern const struct cpuid_registers *__x86_get_cpuid_registers
>> (unsigned int) __attribute__ ((const));
>>
>> extern unsigned int __x86_get_arch_feature (unsigned int)
>> __attribute__ ((const));
>>
>> extern unsigned long long __x86_tsc_to_ns (unsigned long long);
>> extern unsigned long long __x86_ns_to_tsc (unsigned long long);
>>
>> /* HAS_* evaluates to true if we may use the feature at runtime. */
>> #define HAS_CPU_FEATURE(name) \
>> ((__x86_get_cpuid_registers (index_cpu_##name)->reg_##name \
>> & (bit_cpu_##name)) != 0)
>
> I don't like these as macros.
>
> I would rather an inline function that does all the work within the function
> and returns a result of the appropriate type e.g. boolean.
CPUID instruction takes EAX and optionally ECX as input, It writes to
EAX, EBX, ECD and EDX. Each CPU feature is determined by CPUID
with specific input values and a bit in EAX, EBX, ECX or EDX. Take
AVX2 as example:
#define bit_cpu_AVX2 (1u << 5)
#define index_cpu_AVX2 COMMON_CPUID_INDEX_7
#define reg_AVX2 ebx
It is determined by the bit 5 of EBX from CPUID with EAX == 7.
HAS_CPU_FEATURE (AVX2) is expanded to
((__x86_get_cpuid_registers (index_cpu_AVX2)->reg_AVX2 \
& (bit_cpu_AVX2)) != 0)
which is equivalent to
((__x86_get_cpuid_registers (COMMON_CPUID_INDEX_7)->ebx \
& ((1u << 5))) != 0)
It is perfect for macro, not so convenient for inline function.
>> #define HAS_ARCH_FEATURE(name) \
>> ((__x86_get_arch_feature (index_arch_##name) & (bit_arch_##name)) != 0)
>>
>
> Likewise.
>
>> Users can use <sys/platform/x86.h> to detect if a CPU feature is
>> supported at run-time with
>>
>> HAS_CPU_FEATURE (AVX)
>
> Likewise.
>
>> which returns true if glibc detects that AVX is available. Before AVX
>> can be used, he/she should check
>>
>> HAS_ARCH_FEATURE (AVX_Usable)
>
> Likewise.
>
> These macros like this are harder to debug than static inline functions.
>
>> which returns true if glibc detects that AVX is available and usable.
>>
>> extern unsigned long long __x86_tsc_to_ns (unsigned long long);
>> extern unsigned long long __x86_ns_to_tsc (unsigned long long);
>>
>> can be used to convert between TSCs and nanoseconds if
>>
>> HAS_ARCH_FEATURE (TSC_To_NS_Usable)
>>
>> returns true.
>>
>> Backward binary compatibility is provided by:
>>
>> 1. __x86_get_cpu_kind will get a new version if a new CPU kind is added
>> to cpu_features_kind.
>
> This function takes void and returns the kind of CPU as an enum value. OK.
>
>> 2. __x86_get_cpuid_registers returns a pointer to a cpuid_registers
>> with all zeros if cpuid index >= COMMON_CPUID_INDEX_MAX.
>
> Could you expand on this in more detail please.
Internally, we have
struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX];
_x86_get_cpuid_registers is defined as
const struct cpuid_registers *
__x86_get_cpuid_registers (unsigned int i)
{
const static struct cpuid_registers zero_cpuid_registers = { 0 };
if (i >= COMMON_CPUID_INDEX_MAX)
return &zero_cpuid_registers;
return &GLRO(dl_x86_cpu_features).cpuid[i];
}
>> 3. __x86_get_arch_feature returns 0 if architecture index >=
>> FEATURE_INDEX_MAX.
>
> Likewise.
>
>> This means:
>>
>> 1. Applications linked against the new __x86_get_cpu_kind will only
>> run with the newer libc.so.
>
> How? Symbol versioning?
Yes.
>> 2. Applications linked against the new __x86_get_cpuid_registers and
>> __x86_get_arch_feature will run with the older libc.so. But new
>> CPU and architecture features won't be detected by the older libc.so.
>
> The caller must attempt to copy or inspect all the data, but it's structure
> size may be larger, and the old glibc structure may be smaller, how do you
> prevent the newer program from reading out of bounds when run in an older
> glibc? This question is only relevant for __x86_get_cpuid_registers which
> returns a pointer to a fixed size structure. The structure can grow, but
> the caller needs to know exactly how big the returned structure is to avoid
> reading outside of bounds.
There is no ABI issue here. __x86_get_cpuid_registers returns a pointer to
an element of struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX],
whose size is fixed as
struct cpuid_registers
{
unsigned int eax;
unsigned int ebx;
unsigned int ecx;
unsigned int edx;
};
which is the cached output of CPUID. As shown above if the input index
>= COMMON_CPUID_INDEX_MAX. we return a pointer to
const static struct cpuid_registers zero_cpuid_registers = { 0 };
--
H.J.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29
2018-07-31 14:49 ` H.J. Lu
@ 2018-07-31 15:06 ` Carlos O'Donell
2018-07-31 16:17 ` H.J. Lu
0 siblings, 1 reply; 12+ messages in thread
From: Carlos O'Donell @ 2018-07-31 15:06 UTC (permalink / raw)
To: H.J. Lu; +Cc: GNU C Library
On 07/31/2018 10:48 AM, H.J. Lu wrote:
> On Tue, Jul 31, 2018 at 6:32 AM, Carlos O'Donell <carlos@redhat.com> wrote:
>> On 07/29/2018 12:16 PM, H.J. Lu wrote:
>>> I am proposing to add <sys/platform/x86.h> to glibc 2.29 to provide
>>> an API to access x86 specific platform features:
>>>
>>> enum
>>> {
>>> /* The integer bit array index of architecture feature bits. */
>>> FEATURE_INDEX_1 = 0,
>>> FEATURE_INDEX_2,
>>> /* The current maximum size of the feature integer bit array. */
>>> FEATURE_INDEX_MAX
>>> };
>>>
>>> enum
>>> {
>>> COMMON_CPUID_INDEX_1 = 0,
>>> COMMON_CPUID_INDEX_7,
>>> COMMON_CPUID_INDEX_80000001,
>>> COMMON_CPUID_INDEX_D_ECX_1,
>>> COMMON_CPUID_INDEX_80000007,
>>> COMMON_CPUID_INDEX_80000008,
>>> /* Keep the following line at the end. */
>>> COMMON_CPUID_INDEX_MAX
>>> };
>>>
>>> enum cpu_features_kind
>>> {
>>> arch_kind_unknown = 0,
>>> arch_kind_intel,
>>> arch_kind_amd,
>>> arch_kind_other
>>> };
>>>
>>> struct cpuid_registers
>>> {
>>> unsigned int eax;
>>> unsigned int ebx;
>>> unsigned int ecx;
>>> unsigned int edx;
>>> };
>>>
>>> extern enum cpu_features_kind __x86_get_cpu_kind (void)
>>> __attribute__ ((const));
>>>
>>> extern const struct cpuid_registers *__x86_get_cpuid_registers
>>> (unsigned int) __attribute__ ((const));
>>>
>>> extern unsigned int __x86_get_arch_feature (unsigned int)
>>> __attribute__ ((const));
>>>
>>> extern unsigned long long __x86_tsc_to_ns (unsigned long long);
>>> extern unsigned long long __x86_ns_to_tsc (unsigned long long);
>>>
>>> /* HAS_* evaluates to true if we may use the feature at runtime. */
>>> #define HAS_CPU_FEATURE(name) \
>>> ((__x86_get_cpuid_registers (index_cpu_##name)->reg_##name \
>>> & (bit_cpu_##name)) != 0)
>>
>> I don't like these as macros.
>>
>> I would rather an inline function that does all the work within the function
>> and returns a result of the appropriate type e.g. boolean.
>
> CPUID instruction takes EAX and optionally ECX as input, It writes to
> EAX, EBX, ECD and EDX. Each CPU feature is determined by CPUID
> with specific input values and a bit in EAX, EBX, ECX or EDX. Take
> AVX2 as example:
>
> #define bit_cpu_AVX2 (1u << 5)
> #define index_cpu_AVX2 COMMON_CPUID_INDEX_7
> #define reg_AVX2 ebx
>
> It is determined by the bit 5 of EBX from CPUID with EAX == 7.
> HAS_CPU_FEATURE (AVX2) is expanded to
>
> ((__x86_get_cpuid_registers (index_cpu_AVX2)->reg_AVX2 \
> & (bit_cpu_AVX2)) != 0)
>
> which is equivalent to
>
> ((__x86_get_cpuid_registers (COMMON_CPUID_INDEX_7)->ebx \
> & ((1u << 5))) != 0)
>
> It is perfect for macro, not so convenient for inline function.
It's not perfect.
It's difficult to debug.
You are using the c pre-processor here to evaluate the input argument
into another 2 or 3 constant values, this is the kind of propagation
of values that the compiler can do too.
I'd like to see an attempt at a functional definition of this rather
than a macro, and the best argument against a static inline function
is performance. If the macro generates much better code then that's
a real reason to use it.
>>> #define HAS_ARCH_FEATURE(name) \
>>> ((__x86_get_arch_feature (index_arch_##name) & (bit_arch_##name)) != 0)
>>>
>>
>> Likewise.
>>
>>> Users can use <sys/platform/x86.h> to detect if a CPU feature is
>>> supported at run-time with
>>>
>>> HAS_CPU_FEATURE (AVX)
>>
>> Likewise.
>>
>>> which returns true if glibc detects that AVX is available. Before AVX
>>> can be used, he/she should check
>>>
>>> HAS_ARCH_FEATURE (AVX_Usable)
>>
>> Likewise.
>>
>> These macros like this are harder to debug than static inline functions.
>>
>>> which returns true if glibc detects that AVX is available and usable.
>>>
>>> extern unsigned long long __x86_tsc_to_ns (unsigned long long);
>>> extern unsigned long long __x86_ns_to_tsc (unsigned long long);
>>>
>>> can be used to convert between TSCs and nanoseconds if
>>>
>>> HAS_ARCH_FEATURE (TSC_To_NS_Usable)
>>>
>>> returns true.
>>>
>>> Backward binary compatibility is provided by:
>>>
>>> 1. __x86_get_cpu_kind will get a new version if a new CPU kind is added
>>> to cpu_features_kind.
>>
>> This function takes void and returns the kind of CPU as an enum value. OK.
>>
>>> 2. __x86_get_cpuid_registers returns a pointer to a cpuid_registers
>>> with all zeros if cpuid index >= COMMON_CPUID_INDEX_MAX.
>>
>> Could you expand on this in more detail please.
>
> Internally, we have
>
> struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX];
>
> _x86_get_cpuid_registers is defined as
>
> const struct cpuid_registers *
> __x86_get_cpuid_registers (unsigned int i)
> {
> const static struct cpuid_registers zero_cpuid_registers = { 0 };
> if (i >= COMMON_CPUID_INDEX_MAX)
> return &zero_cpuid_registers;
> return &GLRO(dl_x86_cpu_features).cpuid[i];
> }
>
I don't like returning all zero for an error condition, we should have
a distinct error and return value..
Are we certain that an all zero cpuid register return is an invalid
result for cpuid in all cases or could it be confused with a valid
cpuid result but with the options in question disabled?
>>> 3. __x86_get_arch_feature returns 0 if architecture index >=
>>> FEATURE_INDEX_MAX.
>>
>> Likewise.
>>
>>> This means:
>>>
>>> 1. Applications linked against the new __x86_get_cpu_kind will only
>>> run with the newer libc.so.
>>
>> How? Symbol versioning?
>
> Yes.
OK.
>>> 2. Applications linked against the new __x86_get_cpuid_registers and
>>> __x86_get_arch_feature will run with the older libc.so. But new
>>> CPU and architecture features won't be detected by the older libc.so.
>>
>> The caller must attempt to copy or inspect all the data, but it's structure
>> size may be larger, and the old glibc structure may be smaller, how do you
>> prevent the newer program from reading out of bounds when run in an older
>> glibc? This question is only relevant for __x86_get_cpuid_registers which
>> returns a pointer to a fixed size structure. The structure can grow, but
>> the caller needs to know exactly how big the returned structure is to avoid
>> reading outside of bounds.
>
> There is no ABI issue here. __x86_get_cpuid_registers returns a pointer to
> an element of struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX],
> whose size is fixed as
>
> struct cpuid_registers
> {
> unsigned int eax;
> unsigned int ebx;
> unsigned int ecx;
> unsigned int edx;
> };
Ah! Right, it's the cached cpuid return result so it's following the ISA rules
for the registers it touches. I expect this can't change at this point, and if
we got a *new* cpuid instruction we'd just make a new function to implement that.
> which is the cached output of CPUID. As shown above if the input index
>> = COMMON_CPUID_INDEX_MAX. we return a pointer to
>
> const static struct cpuid_registers zero_cpuid_registers = { 0 };
--
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29
2018-07-31 15:06 ` Carlos O'Donell
@ 2018-07-31 16:17 ` H.J. Lu
0 siblings, 0 replies; 12+ messages in thread
From: H.J. Lu @ 2018-07-31 16:17 UTC (permalink / raw)
To: Carlos O'Donell; +Cc: GNU C Library
On Tue, Jul 31, 2018 at 8:06 AM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 07/31/2018 10:48 AM, H.J. Lu wrote:
>> On Tue, Jul 31, 2018 at 6:32 AM, Carlos O'Donell <carlos@redhat.com> wrote:
>>> On 07/29/2018 12:16 PM, H.J. Lu wrote:
>>>> I am proposing to add <sys/platform/x86.h> to glibc 2.29 to provide
>>>> an API to access x86 specific platform features:
>>>>
>>>> enum
>>>> {
>>>> /* The integer bit array index of architecture feature bits. */
>>>> FEATURE_INDEX_1 = 0,
>>>> FEATURE_INDEX_2,
>>>> /* The current maximum size of the feature integer bit array. */
>>>> FEATURE_INDEX_MAX
>>>> };
>>>>
>>>> enum
>>>> {
>>>> COMMON_CPUID_INDEX_1 = 0,
>>>> COMMON_CPUID_INDEX_7,
>>>> COMMON_CPUID_INDEX_80000001,
>>>> COMMON_CPUID_INDEX_D_ECX_1,
>>>> COMMON_CPUID_INDEX_80000007,
>>>> COMMON_CPUID_INDEX_80000008,
>>>> /* Keep the following line at the end. */
>>>> COMMON_CPUID_INDEX_MAX
>>>> };
>>>>
>>>> enum cpu_features_kind
>>>> {
>>>> arch_kind_unknown = 0,
>>>> arch_kind_intel,
>>>> arch_kind_amd,
>>>> arch_kind_other
>>>> };
>>>>
>>>> struct cpuid_registers
>>>> {
>>>> unsigned int eax;
>>>> unsigned int ebx;
>>>> unsigned int ecx;
>>>> unsigned int edx;
>>>> };
>>>>
>>>> extern enum cpu_features_kind __x86_get_cpu_kind (void)
>>>> __attribute__ ((const));
>>>>
>>>> extern const struct cpuid_registers *__x86_get_cpuid_registers
>>>> (unsigned int) __attribute__ ((const));
>>>>
>>>> extern unsigned int __x86_get_arch_feature (unsigned int)
>>>> __attribute__ ((const));
>>>>
>>>> extern unsigned long long __x86_tsc_to_ns (unsigned long long);
>>>> extern unsigned long long __x86_ns_to_tsc (unsigned long long);
>>>>
>>>> /* HAS_* evaluates to true if we may use the feature at runtime. */
>>>> #define HAS_CPU_FEATURE(name) \
>>>> ((__x86_get_cpuid_registers (index_cpu_##name)->reg_##name \
>>>> & (bit_cpu_##name)) != 0)
>>>
>>> I don't like these as macros.
>>>
>>> I would rather an inline function that does all the work within the function
>>> and returns a result of the appropriate type e.g. boolean.
>>
>> CPUID instruction takes EAX and optionally ECX as input, It writes to
>> EAX, EBX, ECD and EDX. Each CPU feature is determined by CPUID
>> with specific input values and a bit in EAX, EBX, ECX or EDX. Take
>> AVX2 as example:
>>
>> #define bit_cpu_AVX2 (1u << 5)
>> #define index_cpu_AVX2 COMMON_CPUID_INDEX_7
>> #define reg_AVX2 ebx
>>
>> It is determined by the bit 5 of EBX from CPUID with EAX == 7.
>> HAS_CPU_FEATURE (AVX2) is expanded to
>>
>> ((__x86_get_cpuid_registers (index_cpu_AVX2)->reg_AVX2 \
>> & (bit_cpu_AVX2)) != 0)
>>
>> which is equivalent to
>>
>> ((__x86_get_cpuid_registers (COMMON_CPUID_INDEX_7)->ebx \
>> & ((1u << 5))) != 0)
>>
>> It is perfect for macro, not so convenient for inline function.
>
> It's not perfect.
>
> It's difficult to debug.
sysdeps/x86/tst-get-cpu-features.c should have 100% coverage of all
features. Glibc users can just use them.
> You are using the c pre-processor here to evaluate the input argument
> into another 2 or 3 constant values, this is the kind of propagation
> of values that the compiler can do too.
Not easily with inline function in a public header file.
> I'd like to see an attempt at a functional definition of this rather
> than a macro, and the best argument against a static inline function
> is performance. If the macro generates much better code then that's
> a real reason to use it.
In this case, macro generates very good code and very easy to
maintain.
>>>> #define HAS_ARCH_FEATURE(name) \
>>>> ((__x86_get_arch_feature (index_arch_##name) & (bit_arch_##name)) != 0)
>>>>
>>>
>>> Likewise.
>>>
>>>> Users can use <sys/platform/x86.h> to detect if a CPU feature is
>>>> supported at run-time with
>>>>
>>>> HAS_CPU_FEATURE (AVX)
>>>
>>> Likewise.
>>>
>>>> which returns true if glibc detects that AVX is available. Before AVX
>>>> can be used, he/she should check
>>>>
>>>> HAS_ARCH_FEATURE (AVX_Usable)
>>>
>>> Likewise.
>>>
>>> These macros like this are harder to debug than static inline functions.
>>>
>>>> which returns true if glibc detects that AVX is available and usable.
>>>>
>>>> extern unsigned long long __x86_tsc_to_ns (unsigned long long);
>>>> extern unsigned long long __x86_ns_to_tsc (unsigned long long);
>>>>
>>>> can be used to convert between TSCs and nanoseconds if
>>>>
>>>> HAS_ARCH_FEATURE (TSC_To_NS_Usable)
>>>>
>>>> returns true.
>>>>
>>>> Backward binary compatibility is provided by:
>>>>
>>>> 1. __x86_get_cpu_kind will get a new version if a new CPU kind is added
>>>> to cpu_features_kind.
>>>
>>> This function takes void and returns the kind of CPU as an enum value. OK.
>>>
>>>> 2. __x86_get_cpuid_registers returns a pointer to a cpuid_registers
>>>> with all zeros if cpuid index >= COMMON_CPUID_INDEX_MAX.
>>>
>>> Could you expand on this in more detail please.
>>
>> Internally, we have
>>
>> struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX];
>>
>> _x86_get_cpuid_registers is defined as
>>
>> const struct cpuid_registers *
>> __x86_get_cpuid_registers (unsigned int i)
>> {
>> const static struct cpuid_registers zero_cpuid_registers = { 0 };
>> if (i >= COMMON_CPUID_INDEX_MAX)
>> return &zero_cpuid_registers;
>> return &GLRO(dl_x86_cpu_features).cpuid[i];
>> }
>>
>
> I don't like returning all zero for an error condition, we should have
> a distinct error and return value..
It isn't an error. It simply means that this glibc doesn't support
this CPUID input. It is OK to return 0 in EAX, EBX, ECX and EDX
and HAS_CPU_FEATURE (XXX) evaluates to false.
> Are we certain that an all zero cpuid register return is an invalid
> result for cpuid in all cases or could it be confused with a valid
> cpuid result but with the options in question disabled?
HAS_CPU_FEATURE (XXX) evaluates to true if glibc detects that
XXX is supported. It is OK for HAS_CPU_FEATURE (XXX) to be
false if glibc doesn't know XXX on processors which has XXX.
BTW, there are many reserved bits in EAX, EBX, ECX and EDX,
which may be changed to new CPU features in future processors.
It is easy to add these features to x86.h with macro without recompiling
glibc.
>>>> 3. __x86_get_arch_feature returns 0 if architecture index >=
>>>> FEATURE_INDEX_MAX.
>>>
>>> Likewise.
>>>
>>>> This means:
>>>>
>>>> 1. Applications linked against the new __x86_get_cpu_kind will only
>>>> run with the newer libc.so.
>>>
>>> How? Symbol versioning?
>>
>> Yes.
>
> OK.
>
>>>> 2. Applications linked against the new __x86_get_cpuid_registers and
>>>> __x86_get_arch_feature will run with the older libc.so. But new
>>>> CPU and architecture features won't be detected by the older libc.so.
>>>
>>> The caller must attempt to copy or inspect all the data, but it's structure
>>> size may be larger, and the old glibc structure may be smaller, how do you
>>> prevent the newer program from reading out of bounds when run in an older
>>> glibc? This question is only relevant for __x86_get_cpuid_registers which
>>> returns a pointer to a fixed size structure. The structure can grow, but
>>> the caller needs to know exactly how big the returned structure is to avoid
>>> reading outside of bounds.
>>
>> There is no ABI issue here. __x86_get_cpuid_registers returns a pointer to
>> an element of struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX],
>> whose size is fixed as
>>
>> struct cpuid_registers
>> {
>> unsigned int eax;
>> unsigned int ebx;
>> unsigned int ecx;
>> unsigned int edx;
>> };
>
> Ah! Right, it's the cached cpuid return result so it's following the ISA rules
> for the registers it touches. I expect this can't change at this point, and if
> we got a *new* cpuid instruction we'd just make a new function to implement that.
True.
>> which is the cached output of CPUID. As shown above if the input index
>>> = COMMON_CPUID_INDEX_MAX. we return a pointer to
>>
>> const static struct cpuid_registers zero_cpuid_registers = { 0 };
>
>
> --
> Cheers,
> Carlos.
--
H.J.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29
2018-07-29 16:16 RFC: Add <sys/platform/x86.h> to glibc 2.29 H.J. Lu
2018-07-31 13:32 ` Carlos O'Donell
@ 2018-08-17 14:50 ` Florian Weimer
2018-08-17 15:46 ` H.J. Lu
1 sibling, 1 reply; 12+ messages in thread
From: Florian Weimer @ 2018-08-17 14:50 UTC (permalink / raw)
To: H.J. Lu, GNU C Library
On 07/29/2018 06:16 PM, H.J. Lu wrote:
> extern enum cpu_features_kind __x86_get_cpu_kind (void)
> __attribute__ ((const));
>
> extern const struct cpuid_registers *__x86_get_cpuid_registers
> (unsigned int) __attribute__ ((const));
>
> extern unsigned int __x86_get_arch_feature (unsigned int)
> __attribute__ ((const));
Are these functions supposed to be usable in IFUNC resolvers?
Shouldn't these identifiers be in the public (non-internal) namespace?
Thanks,
Florian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29
2018-08-17 14:50 ` Florian Weimer
@ 2018-08-17 15:46 ` H.J. Lu
2018-08-17 16:19 ` Florian Weimer
0 siblings, 1 reply; 12+ messages in thread
From: H.J. Lu @ 2018-08-17 15:46 UTC (permalink / raw)
To: Florian Weimer; +Cc: GNU C Library
On Fri, Aug 17, 2018 at 7:50 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 07/29/2018 06:16 PM, H.J. Lu wrote:
>>
>> extern enum cpu_features_kind __x86_get_cpu_kind (void)
>> __attribute__ ((const));
>>
>> extern const struct cpuid_registers *__x86_get_cpuid_registers
>> (unsigned int) __attribute__ ((const));
>>
>> extern unsigned int __x86_get_arch_feature (unsigned int)
>> __attribute__ ((const));
>
>
> Are these functions supposed to be usable in IFUNC resolvers?
They have the same limitation as
/* Used from outside of glibc to get access to the CPU features
structure. */
extern const struct cpu_features *__get_cpu_features (void)
__attribute__ ((const));
> Shouldn't these identifiers be in the public (non-internal) namespace?
>
Do we have precedents for such arch specific public functions?
--
H.J.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29
2018-08-17 15:46 ` H.J. Lu
@ 2018-08-17 16:19 ` Florian Weimer
2018-08-17 16:44 ` H.J. Lu
0 siblings, 1 reply; 12+ messages in thread
From: Florian Weimer @ 2018-08-17 16:19 UTC (permalink / raw)
To: H.J. Lu; +Cc: GNU C Library
On 08/17/2018 05:45 PM, H.J. Lu wrote:
> On Fri, Aug 17, 2018 at 7:50 AM, Florian Weimer <fweimer@redhat.com> wrote:
>> On 07/29/2018 06:16 PM, H.J. Lu wrote:
>>>
>>> extern enum cpu_features_kind __x86_get_cpu_kind (void)
>>> __attribute__ ((const));
>>>
>>> extern const struct cpuid_registers *__x86_get_cpuid_registers
>>> (unsigned int) __attribute__ ((const));
>>>
>>> extern unsigned int __x86_get_arch_feature (unsigned int)
>>> __attribute__ ((const));
>>
>>
>> Are these functions supposed to be usable in IFUNC resolvers?
>
> They have the same limitation as
>
> /* Used from outside of glibc to get access to the CPU features
> structure. */
> extern const struct cpu_features *__get_cpu_features (void)
> __attribute__ ((const));
Sorry, this does not answer my question.
>> Shouldn't these identifiers be in the public (non-internal) namespace?
>>
>
> Do we have precedents for such arch specific public functions?
There is modify_ldt.
Thanks,
Florian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29
2018-08-17 16:19 ` Florian Weimer
@ 2018-08-17 16:44 ` H.J. Lu
2018-08-17 19:11 ` Florian Weimer
2018-08-20 11:29 ` Florian Weimer
0 siblings, 2 replies; 12+ messages in thread
From: H.J. Lu @ 2018-08-17 16:44 UTC (permalink / raw)
To: Florian Weimer; +Cc: GNU C Library
On Fri, Aug 17, 2018 at 9:19 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 08/17/2018 05:45 PM, H.J. Lu wrote:
>>
>> On Fri, Aug 17, 2018 at 7:50 AM, Florian Weimer <fweimer@redhat.com>
>> wrote:
>>>
>>> On 07/29/2018 06:16 PM, H.J. Lu wrote:
>>>>
>>>>
>>>> extern enum cpu_features_kind __x86_get_cpu_kind (void)
>>>> __attribute__ ((const));
>>>>
>>>> extern const struct cpuid_registers *__x86_get_cpuid_registers
>>>> (unsigned int) __attribute__ ((const));
>>>>
>>>> extern unsigned int __x86_get_arch_feature (unsigned int)
>>>> __attribute__ ((const));
>>>
>>>
>>>
>>> Are these functions supposed to be usable in IFUNC resolvers?
>>
>>
>> They have the same limitation as
>>
>> /* Used from outside of glibc to get access to the CPU features
>> structure. */
>> extern const struct cpu_features *__get_cpu_features (void)
>> __attribute__ ((const));
>
>
> Sorry, this does not answer my question.
Just like __get_cpu_features, they can be used in IFUNC resolvers.
>>> Shouldn't these identifiers be in the public (non-internal) namespace?
>>>
>>
>> Do we have precedents for such arch specific public functions?
>
>
> There is modify_ldt.
>
Which header file has its prototype?
--
H.J.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29
2018-08-17 16:44 ` H.J. Lu
@ 2018-08-17 19:11 ` Florian Weimer
2018-08-20 11:29 ` Florian Weimer
1 sibling, 0 replies; 12+ messages in thread
From: Florian Weimer @ 2018-08-17 19:11 UTC (permalink / raw)
To: H.J. Lu; +Cc: GNU C Library
On 08/17/2018 06:43 PM, H.J. Lu wrote:
>>>> Are these functions supposed to be usable in IFUNC resolvers?
>>>
>>> They have the same limitation as
>>>
>>> /* Used from outside of glibc to get access to the CPU features
>>> structure. */
>>> extern const struct cpu_features *__get_cpu_features (void)
>>> __attribute__ ((const));
>>
>> Sorry, this does not answer my question.
> Just like __get_cpu_features, they can be used in IFUNC resolvers.
Then we need something like this:
<https://sourceware.org/ml/libc-alpha/2018-06/msg00077.html>
Thanks,
Florian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29
2018-08-17 16:44 ` H.J. Lu
2018-08-17 19:11 ` Florian Weimer
@ 2018-08-20 11:29 ` Florian Weimer
2018-08-20 20:55 ` H.J. Lu
1 sibling, 1 reply; 12+ messages in thread
From: Florian Weimer @ 2018-08-20 11:29 UTC (permalink / raw)
To: H.J. Lu; +Cc: GNU C Library
On 08/17/2018 06:43 PM, H.J. Lu wrote:
>>>> Shouldn't these identifiers be in the public (non-internal) namespace?
>>>>
>>> Do we have precedents for such arch specific public functions?
>>
>> There is modify_ldt.
>>
> Which header file has its prototype?
vm86 is an example for an arch-specific function with a prototype in an
installed header file.
Thanks,
Florian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29
2018-08-20 11:29 ` Florian Weimer
@ 2018-08-20 20:55 ` H.J. Lu
0 siblings, 0 replies; 12+ messages in thread
From: H.J. Lu @ 2018-08-20 20:55 UTC (permalink / raw)
To: Florian Weimer; +Cc: GNU C Library
On Mon, Aug 20, 2018 at 4:29 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 08/17/2018 06:43 PM, H.J. Lu wrote:
>>>>>
>>>>> Shouldn't these identifiers be in the public (non-internal) namespace?
>>>>>
>>>> Do we have precedents for such arch specific public functions?
>>>
>>>
>>> There is modify_ldt.
>>>
>> Which header file has its prototype?
>
>
> vm86 is an example for an arch-specific function with a prototype in an
> installed header file.
The updated patch is posted at
https://sourceware.org/ml/libc-alpha/2018-08/msg00435.html
--
H.J.
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2018-08-20 20:55 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-29 16:16 RFC: Add <sys/platform/x86.h> to glibc 2.29 H.J. Lu
2018-07-31 13:32 ` Carlos O'Donell
2018-07-31 14:49 ` H.J. Lu
2018-07-31 15:06 ` Carlos O'Donell
2018-07-31 16:17 ` H.J. Lu
2018-08-17 14:50 ` Florian Weimer
2018-08-17 15:46 ` H.J. Lu
2018-08-17 16:19 ` Florian Weimer
2018-08-17 16:44 ` H.J. Lu
2018-08-17 19:11 ` Florian Weimer
2018-08-20 11:29 ` Florian Weimer
2018-08-20 20:55 ` H.J. Lu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).