* RFC: Add <sys/platform/x86.h> to glibc 2.29 @ 2018-07-29 16:16 H.J. Lu 2018-07-31 13:32 ` Carlos O'Donell 2018-08-17 14:50 ` Florian Weimer 0 siblings, 2 replies; 12+ messages in thread From: H.J. Lu @ 2018-07-29 16:16 UTC (permalink / raw) To: GNU C Library I am proposing to add <sys/platform/x86.h> to glibc 2.29 to provide an API to access x86 specific platform features: enum { /* The integer bit array index of architecture feature bits. */ FEATURE_INDEX_1 = 0, FEATURE_INDEX_2, /* The current maximum size of the feature integer bit array. */ FEATURE_INDEX_MAX }; enum { COMMON_CPUID_INDEX_1 = 0, COMMON_CPUID_INDEX_7, COMMON_CPUID_INDEX_80000001, COMMON_CPUID_INDEX_D_ECX_1, COMMON_CPUID_INDEX_80000007, COMMON_CPUID_INDEX_80000008, /* Keep the following line at the end. */ COMMON_CPUID_INDEX_MAX }; enum cpu_features_kind { arch_kind_unknown = 0, arch_kind_intel, arch_kind_amd, arch_kind_other }; struct cpuid_registers { unsigned int eax; unsigned int ebx; unsigned int ecx; unsigned int edx; }; extern enum cpu_features_kind __x86_get_cpu_kind (void) __attribute__ ((const)); extern const struct cpuid_registers *__x86_get_cpuid_registers (unsigned int) __attribute__ ((const)); extern unsigned int __x86_get_arch_feature (unsigned int) __attribute__ ((const)); extern unsigned long long __x86_tsc_to_ns (unsigned long long); extern unsigned long long __x86_ns_to_tsc (unsigned long long); /* HAS_* evaluates to true if we may use the feature at runtime. */ #define HAS_CPU_FEATURE(name) \ ((__x86_get_cpuid_registers (index_cpu_##name)->reg_##name \ & (bit_cpu_##name)) != 0) #define HAS_ARCH_FEATURE(name) \ ((__x86_get_arch_feature (index_arch_##name) & (bit_arch_##name)) != 0) Users can use <sys/platform/x86.h> to detect if a CPU feature is supported at run-time with HAS_CPU_FEATURE (AVX) which returns true if glibc detects that AVX is available. Before AVX can be used, he/she should check HAS_ARCH_FEATURE (AVX_Usable) which returns true if glibc detects that AVX is available and usable. extern unsigned long long __x86_tsc_to_ns (unsigned long long); extern unsigned long long __x86_ns_to_tsc (unsigned long long); can be used to convert between TSCs and nanoseconds if HAS_ARCH_FEATURE (TSC_To_NS_Usable) returns true. Backward binary compatibility is provided by: 1. __x86_get_cpu_kind will get a new version if a new CPU kind is added to cpu_features_kind. 2. __x86_get_cpuid_registers returns a pointer to a cpuid_registers with all zeros if cpuid index >= COMMON_CPUID_INDEX_MAX. 3. __x86_get_arch_feature returns 0 if architecture index >= FEATURE_INDEX_MAX. This means: 1. Applications linked against the new __x86_get_cpu_kind will only run with the newer libc.so. 2. Applications linked against the new __x86_get_cpuid_registers and __x86_get_arch_feature will run with the older libc.so. But new CPU and architecture features won't be detected by the older libc.so. This program: https://github.com/hjl-tools/glibc/blob/hjl/platform/master/sysdeps/x86/tst-x86-platform-1.c may display: [hjl@gnu-efi-2 build-x86_64-linux]$ ./elf/tst-x86-platform-1 Vendor: Intel CPU features: SSE3 PCLMULQDQ DTES64 MONITOR DS_CPL VMX EST TM2 SSSE3 SDBG FMA CMPXCHG16B XTPRUPDCTRL PDCM PCID SSE4_1 SSE4_2 X2APIC MOVBE POPCOUNT TSC_DEADLINE AES XSAVE OSXSAVE AVX F16C RDRAND FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE_36 DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE FSGSBASE TSC_ADJUST SGX BMI1 AVX2 BMI2 ERMS INVPCID MPX RDSEED ADX SMAP TRACE LAHF64_SAHF64 LZCNT PREFETCHW SYSCALL_SYSRET NX PAGE1GB RDTSCP LM XSAVEOPT XSAVEC XGETBV_ECX_1 XSAVES INVARIANT_TSC Architecture features: AVX_Usable AVX2_Usable FMA_Usable XSAVEC_Usable TSC_To_NS_Usable [hjl@gnu-efi-2 build-x86_64-linux]$ Any comments? H.J. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29 2018-07-29 16:16 RFC: Add <sys/platform/x86.h> to glibc 2.29 H.J. Lu @ 2018-07-31 13:32 ` Carlos O'Donell 2018-07-31 14:49 ` H.J. Lu 2018-08-17 14:50 ` Florian Weimer 1 sibling, 1 reply; 12+ messages in thread From: Carlos O'Donell @ 2018-07-31 13:32 UTC (permalink / raw) To: H.J. Lu, GNU C Library On 07/29/2018 12:16 PM, H.J. Lu wrote: > I am proposing to add <sys/platform/x86.h> to glibc 2.29 to provide > an API to access x86 specific platform features: > > enum > { > /* The integer bit array index of architecture feature bits. */ > FEATURE_INDEX_1 = 0, > FEATURE_INDEX_2, > /* The current maximum size of the feature integer bit array. */ > FEATURE_INDEX_MAX > }; > > enum > { > COMMON_CPUID_INDEX_1 = 0, > COMMON_CPUID_INDEX_7, > COMMON_CPUID_INDEX_80000001, > COMMON_CPUID_INDEX_D_ECX_1, > COMMON_CPUID_INDEX_80000007, > COMMON_CPUID_INDEX_80000008, > /* Keep the following line at the end. */ > COMMON_CPUID_INDEX_MAX > }; > > enum cpu_features_kind > { > arch_kind_unknown = 0, > arch_kind_intel, > arch_kind_amd, > arch_kind_other > }; > > struct cpuid_registers > { > unsigned int eax; > unsigned int ebx; > unsigned int ecx; > unsigned int edx; > }; > > extern enum cpu_features_kind __x86_get_cpu_kind (void) > __attribute__ ((const)); > > extern const struct cpuid_registers *__x86_get_cpuid_registers > (unsigned int) __attribute__ ((const)); > > extern unsigned int __x86_get_arch_feature (unsigned int) > __attribute__ ((const)); > > extern unsigned long long __x86_tsc_to_ns (unsigned long long); > extern unsigned long long __x86_ns_to_tsc (unsigned long long); > > /* HAS_* evaluates to true if we may use the feature at runtime. */ > #define HAS_CPU_FEATURE(name) \ > ((__x86_get_cpuid_registers (index_cpu_##name)->reg_##name \ > & (bit_cpu_##name)) != 0) I don't like these as macros. I would rather an inline function that does all the work within the function and returns a result of the appropriate type e.g. boolean. > #define HAS_ARCH_FEATURE(name) \ > ((__x86_get_arch_feature (index_arch_##name) & (bit_arch_##name)) != 0) > Likewise. > Users can use <sys/platform/x86.h> to detect if a CPU feature is > supported at run-time with > > HAS_CPU_FEATURE (AVX) Likewise. > which returns true if glibc detects that AVX is available. Before AVX > can be used, he/she should check > > HAS_ARCH_FEATURE (AVX_Usable) Likewise. These macros like this are harder to debug than static inline functions. > which returns true if glibc detects that AVX is available and usable. > > extern unsigned long long __x86_tsc_to_ns (unsigned long long); > extern unsigned long long __x86_ns_to_tsc (unsigned long long); > > can be used to convert between TSCs and nanoseconds if > > HAS_ARCH_FEATURE (TSC_To_NS_Usable) > > returns true. > > Backward binary compatibility is provided by: > > 1. __x86_get_cpu_kind will get a new version if a new CPU kind is added > to cpu_features_kind. This function takes void and returns the kind of CPU as an enum value. OK. > 2. __x86_get_cpuid_registers returns a pointer to a cpuid_registers > with all zeros if cpuid index >= COMMON_CPUID_INDEX_MAX. Could you expand on this in more detail please. > 3. __x86_get_arch_feature returns 0 if architecture index >= > FEATURE_INDEX_MAX. Likewise. > This means: > > 1. Applications linked against the new __x86_get_cpu_kind will only > run with the newer libc.so. How? Symbol versioning? > 2. Applications linked against the new __x86_get_cpuid_registers and > __x86_get_arch_feature will run with the older libc.so. But new > CPU and architecture features won't be detected by the older libc.so. The caller must attempt to copy or inspect all the data, but it's structure size may be larger, and the old glibc structure may be smaller, how do you prevent the newer program from reading out of bounds when run in an older glibc? This question is only relevant for __x86_get_cpuid_registers which returns a pointer to a fixed size structure. The structure can grow, but the caller needs to know exactly how big the returned structure is to avoid reading outside of bounds. > This program: > > https://github.com/hjl-tools/glibc/blob/hjl/platform/master/sysdeps/x86/tst-x86-platform-1.c > > may display: > > [hjl@gnu-efi-2 build-x86_64-linux]$ ./elf/tst-x86-platform-1 > Vendor: Intel > CPU features: > SSE3 > PCLMULQDQ > DTES64 > MONITOR > DS_CPL > VMX > EST > TM2 > SSSE3 > SDBG > FMA > CMPXCHG16B > XTPRUPDCTRL > PDCM > PCID > SSE4_1 > SSE4_2 > X2APIC > MOVBE > POPCOUNT > TSC_DEADLINE > AES > XSAVE > OSXSAVE > AVX > F16C > RDRAND > FPU > VME > DE > PSE > TSC > MSR > PAE > MCE > CX8 > APIC > SEP > MTRR > PGE > MCA > CMOV > PAT > PSE_36 > DS > ACPI > MMX > FXSR > SSE > SSE2 > SS > HTT > TM > PBE > FSGSBASE > TSC_ADJUST > SGX > BMI1 > AVX2 > BMI2 > ERMS > INVPCID > MPX > RDSEED > ADX > SMAP > TRACE > LAHF64_SAHF64 > LZCNT > PREFETCHW > SYSCALL_SYSRET > NX > PAGE1GB > RDTSCP > LM > XSAVEOPT > XSAVEC > XGETBV_ECX_1 > XSAVES > INVARIANT_TSC > Architecture features: > AVX_Usable > AVX2_Usable > FMA_Usable > XSAVEC_Usable > TSC_To_NS_Usable > [hjl@gnu-efi-2 build-x86_64-linux]$ > > Any comments? The data you want to share seems reasonable, but I expect we'll need some iteration on the interface design. > H.J. > -- Cheers, Carlos. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29 2018-07-31 13:32 ` Carlos O'Donell @ 2018-07-31 14:49 ` H.J. Lu 2018-07-31 15:06 ` Carlos O'Donell 0 siblings, 1 reply; 12+ messages in thread From: H.J. Lu @ 2018-07-31 14:49 UTC (permalink / raw) To: Carlos O'Donell; +Cc: GNU C Library On Tue, Jul 31, 2018 at 6:32 AM, Carlos O'Donell <carlos@redhat.com> wrote: > On 07/29/2018 12:16 PM, H.J. Lu wrote: >> I am proposing to add <sys/platform/x86.h> to glibc 2.29 to provide >> an API to access x86 specific platform features: >> >> enum >> { >> /* The integer bit array index of architecture feature bits. */ >> FEATURE_INDEX_1 = 0, >> FEATURE_INDEX_2, >> /* The current maximum size of the feature integer bit array. */ >> FEATURE_INDEX_MAX >> }; >> >> enum >> { >> COMMON_CPUID_INDEX_1 = 0, >> COMMON_CPUID_INDEX_7, >> COMMON_CPUID_INDEX_80000001, >> COMMON_CPUID_INDEX_D_ECX_1, >> COMMON_CPUID_INDEX_80000007, >> COMMON_CPUID_INDEX_80000008, >> /* Keep the following line at the end. */ >> COMMON_CPUID_INDEX_MAX >> }; >> >> enum cpu_features_kind >> { >> arch_kind_unknown = 0, >> arch_kind_intel, >> arch_kind_amd, >> arch_kind_other >> }; >> >> struct cpuid_registers >> { >> unsigned int eax; >> unsigned int ebx; >> unsigned int ecx; >> unsigned int edx; >> }; >> >> extern enum cpu_features_kind __x86_get_cpu_kind (void) >> __attribute__ ((const)); >> >> extern const struct cpuid_registers *__x86_get_cpuid_registers >> (unsigned int) __attribute__ ((const)); >> >> extern unsigned int __x86_get_arch_feature (unsigned int) >> __attribute__ ((const)); >> >> extern unsigned long long __x86_tsc_to_ns (unsigned long long); >> extern unsigned long long __x86_ns_to_tsc (unsigned long long); >> >> /* HAS_* evaluates to true if we may use the feature at runtime. */ >> #define HAS_CPU_FEATURE(name) \ >> ((__x86_get_cpuid_registers (index_cpu_##name)->reg_##name \ >> & (bit_cpu_##name)) != 0) > > I don't like these as macros. > > I would rather an inline function that does all the work within the function > and returns a result of the appropriate type e.g. boolean. CPUID instruction takes EAX and optionally ECX as input, It writes to EAX, EBX, ECD and EDX. Each CPU feature is determined by CPUID with specific input values and a bit in EAX, EBX, ECX or EDX. Take AVX2 as example: #define bit_cpu_AVX2 (1u << 5) #define index_cpu_AVX2 COMMON_CPUID_INDEX_7 #define reg_AVX2 ebx It is determined by the bit 5 of EBX from CPUID with EAX == 7. HAS_CPU_FEATURE (AVX2) is expanded to ((__x86_get_cpuid_registers (index_cpu_AVX2)->reg_AVX2 \ & (bit_cpu_AVX2)) != 0) which is equivalent to ((__x86_get_cpuid_registers (COMMON_CPUID_INDEX_7)->ebx \ & ((1u << 5))) != 0) It is perfect for macro, not so convenient for inline function. >> #define HAS_ARCH_FEATURE(name) \ >> ((__x86_get_arch_feature (index_arch_##name) & (bit_arch_##name)) != 0) >> > > Likewise. > >> Users can use <sys/platform/x86.h> to detect if a CPU feature is >> supported at run-time with >> >> HAS_CPU_FEATURE (AVX) > > Likewise. > >> which returns true if glibc detects that AVX is available. Before AVX >> can be used, he/she should check >> >> HAS_ARCH_FEATURE (AVX_Usable) > > Likewise. > > These macros like this are harder to debug than static inline functions. > >> which returns true if glibc detects that AVX is available and usable. >> >> extern unsigned long long __x86_tsc_to_ns (unsigned long long); >> extern unsigned long long __x86_ns_to_tsc (unsigned long long); >> >> can be used to convert between TSCs and nanoseconds if >> >> HAS_ARCH_FEATURE (TSC_To_NS_Usable) >> >> returns true. >> >> Backward binary compatibility is provided by: >> >> 1. __x86_get_cpu_kind will get a new version if a new CPU kind is added >> to cpu_features_kind. > > This function takes void and returns the kind of CPU as an enum value. OK. > >> 2. __x86_get_cpuid_registers returns a pointer to a cpuid_registers >> with all zeros if cpuid index >= COMMON_CPUID_INDEX_MAX. > > Could you expand on this in more detail please. Internally, we have struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX]; _x86_get_cpuid_registers is defined as const struct cpuid_registers * __x86_get_cpuid_registers (unsigned int i) { const static struct cpuid_registers zero_cpuid_registers = { 0 }; if (i >= COMMON_CPUID_INDEX_MAX) return &zero_cpuid_registers; return &GLRO(dl_x86_cpu_features).cpuid[i]; } >> 3. __x86_get_arch_feature returns 0 if architecture index >= >> FEATURE_INDEX_MAX. > > Likewise. > >> This means: >> >> 1. Applications linked against the new __x86_get_cpu_kind will only >> run with the newer libc.so. > > How? Symbol versioning? Yes. >> 2. Applications linked against the new __x86_get_cpuid_registers and >> __x86_get_arch_feature will run with the older libc.so. But new >> CPU and architecture features won't be detected by the older libc.so. > > The caller must attempt to copy or inspect all the data, but it's structure > size may be larger, and the old glibc structure may be smaller, how do you > prevent the newer program from reading out of bounds when run in an older > glibc? This question is only relevant for __x86_get_cpuid_registers which > returns a pointer to a fixed size structure. The structure can grow, but > the caller needs to know exactly how big the returned structure is to avoid > reading outside of bounds. There is no ABI issue here. __x86_get_cpuid_registers returns a pointer to an element of struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX], whose size is fixed as struct cpuid_registers { unsigned int eax; unsigned int ebx; unsigned int ecx; unsigned int edx; }; which is the cached output of CPUID. As shown above if the input index >= COMMON_CPUID_INDEX_MAX. we return a pointer to const static struct cpuid_registers zero_cpuid_registers = { 0 }; -- H.J. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29 2018-07-31 14:49 ` H.J. Lu @ 2018-07-31 15:06 ` Carlos O'Donell 2018-07-31 16:17 ` H.J. Lu 0 siblings, 1 reply; 12+ messages in thread From: Carlos O'Donell @ 2018-07-31 15:06 UTC (permalink / raw) To: H.J. Lu; +Cc: GNU C Library On 07/31/2018 10:48 AM, H.J. Lu wrote: > On Tue, Jul 31, 2018 at 6:32 AM, Carlos O'Donell <carlos@redhat.com> wrote: >> On 07/29/2018 12:16 PM, H.J. Lu wrote: >>> I am proposing to add <sys/platform/x86.h> to glibc 2.29 to provide >>> an API to access x86 specific platform features: >>> >>> enum >>> { >>> /* The integer bit array index of architecture feature bits. */ >>> FEATURE_INDEX_1 = 0, >>> FEATURE_INDEX_2, >>> /* The current maximum size of the feature integer bit array. */ >>> FEATURE_INDEX_MAX >>> }; >>> >>> enum >>> { >>> COMMON_CPUID_INDEX_1 = 0, >>> COMMON_CPUID_INDEX_7, >>> COMMON_CPUID_INDEX_80000001, >>> COMMON_CPUID_INDEX_D_ECX_1, >>> COMMON_CPUID_INDEX_80000007, >>> COMMON_CPUID_INDEX_80000008, >>> /* Keep the following line at the end. */ >>> COMMON_CPUID_INDEX_MAX >>> }; >>> >>> enum cpu_features_kind >>> { >>> arch_kind_unknown = 0, >>> arch_kind_intel, >>> arch_kind_amd, >>> arch_kind_other >>> }; >>> >>> struct cpuid_registers >>> { >>> unsigned int eax; >>> unsigned int ebx; >>> unsigned int ecx; >>> unsigned int edx; >>> }; >>> >>> extern enum cpu_features_kind __x86_get_cpu_kind (void) >>> __attribute__ ((const)); >>> >>> extern const struct cpuid_registers *__x86_get_cpuid_registers >>> (unsigned int) __attribute__ ((const)); >>> >>> extern unsigned int __x86_get_arch_feature (unsigned int) >>> __attribute__ ((const)); >>> >>> extern unsigned long long __x86_tsc_to_ns (unsigned long long); >>> extern unsigned long long __x86_ns_to_tsc (unsigned long long); >>> >>> /* HAS_* evaluates to true if we may use the feature at runtime. */ >>> #define HAS_CPU_FEATURE(name) \ >>> ((__x86_get_cpuid_registers (index_cpu_##name)->reg_##name \ >>> & (bit_cpu_##name)) != 0) >> >> I don't like these as macros. >> >> I would rather an inline function that does all the work within the function >> and returns a result of the appropriate type e.g. boolean. > > CPUID instruction takes EAX and optionally ECX as input, It writes to > EAX, EBX, ECD and EDX. Each CPU feature is determined by CPUID > with specific input values and a bit in EAX, EBX, ECX or EDX. Take > AVX2 as example: > > #define bit_cpu_AVX2 (1u << 5) > #define index_cpu_AVX2 COMMON_CPUID_INDEX_7 > #define reg_AVX2 ebx > > It is determined by the bit 5 of EBX from CPUID with EAX == 7. > HAS_CPU_FEATURE (AVX2) is expanded to > > ((__x86_get_cpuid_registers (index_cpu_AVX2)->reg_AVX2 \ > & (bit_cpu_AVX2)) != 0) > > which is equivalent to > > ((__x86_get_cpuid_registers (COMMON_CPUID_INDEX_7)->ebx \ > & ((1u << 5))) != 0) > > It is perfect for macro, not so convenient for inline function. It's not perfect. It's difficult to debug. You are using the c pre-processor here to evaluate the input argument into another 2 or 3 constant values, this is the kind of propagation of values that the compiler can do too. I'd like to see an attempt at a functional definition of this rather than a macro, and the best argument against a static inline function is performance. If the macro generates much better code then that's a real reason to use it. >>> #define HAS_ARCH_FEATURE(name) \ >>> ((__x86_get_arch_feature (index_arch_##name) & (bit_arch_##name)) != 0) >>> >> >> Likewise. >> >>> Users can use <sys/platform/x86.h> to detect if a CPU feature is >>> supported at run-time with >>> >>> HAS_CPU_FEATURE (AVX) >> >> Likewise. >> >>> which returns true if glibc detects that AVX is available. Before AVX >>> can be used, he/she should check >>> >>> HAS_ARCH_FEATURE (AVX_Usable) >> >> Likewise. >> >> These macros like this are harder to debug than static inline functions. >> >>> which returns true if glibc detects that AVX is available and usable. >>> >>> extern unsigned long long __x86_tsc_to_ns (unsigned long long); >>> extern unsigned long long __x86_ns_to_tsc (unsigned long long); >>> >>> can be used to convert between TSCs and nanoseconds if >>> >>> HAS_ARCH_FEATURE (TSC_To_NS_Usable) >>> >>> returns true. >>> >>> Backward binary compatibility is provided by: >>> >>> 1. __x86_get_cpu_kind will get a new version if a new CPU kind is added >>> to cpu_features_kind. >> >> This function takes void and returns the kind of CPU as an enum value. OK. >> >>> 2. __x86_get_cpuid_registers returns a pointer to a cpuid_registers >>> with all zeros if cpuid index >= COMMON_CPUID_INDEX_MAX. >> >> Could you expand on this in more detail please. > > Internally, we have > > struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX]; > > _x86_get_cpuid_registers is defined as > > const struct cpuid_registers * > __x86_get_cpuid_registers (unsigned int i) > { > const static struct cpuid_registers zero_cpuid_registers = { 0 }; > if (i >= COMMON_CPUID_INDEX_MAX) > return &zero_cpuid_registers; > return &GLRO(dl_x86_cpu_features).cpuid[i]; > } > I don't like returning all zero for an error condition, we should have a distinct error and return value.. Are we certain that an all zero cpuid register return is an invalid result for cpuid in all cases or could it be confused with a valid cpuid result but with the options in question disabled? >>> 3. __x86_get_arch_feature returns 0 if architecture index >= >>> FEATURE_INDEX_MAX. >> >> Likewise. >> >>> This means: >>> >>> 1. Applications linked against the new __x86_get_cpu_kind will only >>> run with the newer libc.so. >> >> How? Symbol versioning? > > Yes. OK. >>> 2. Applications linked against the new __x86_get_cpuid_registers and >>> __x86_get_arch_feature will run with the older libc.so. But new >>> CPU and architecture features won't be detected by the older libc.so. >> >> The caller must attempt to copy or inspect all the data, but it's structure >> size may be larger, and the old glibc structure may be smaller, how do you >> prevent the newer program from reading out of bounds when run in an older >> glibc? This question is only relevant for __x86_get_cpuid_registers which >> returns a pointer to a fixed size structure. The structure can grow, but >> the caller needs to know exactly how big the returned structure is to avoid >> reading outside of bounds. > > There is no ABI issue here. __x86_get_cpuid_registers returns a pointer to > an element of struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX], > whose size is fixed as > > struct cpuid_registers > { > unsigned int eax; > unsigned int ebx; > unsigned int ecx; > unsigned int edx; > }; Ah! Right, it's the cached cpuid return result so it's following the ISA rules for the registers it touches. I expect this can't change at this point, and if we got a *new* cpuid instruction we'd just make a new function to implement that. > which is the cached output of CPUID. As shown above if the input index >> = COMMON_CPUID_INDEX_MAX. we return a pointer to > > const static struct cpuid_registers zero_cpuid_registers = { 0 }; -- Cheers, Carlos. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29 2018-07-31 15:06 ` Carlos O'Donell @ 2018-07-31 16:17 ` H.J. Lu 0 siblings, 0 replies; 12+ messages in thread From: H.J. Lu @ 2018-07-31 16:17 UTC (permalink / raw) To: Carlos O'Donell; +Cc: GNU C Library On Tue, Jul 31, 2018 at 8:06 AM, Carlos O'Donell <carlos@redhat.com> wrote: > On 07/31/2018 10:48 AM, H.J. Lu wrote: >> On Tue, Jul 31, 2018 at 6:32 AM, Carlos O'Donell <carlos@redhat.com> wrote: >>> On 07/29/2018 12:16 PM, H.J. Lu wrote: >>>> I am proposing to add <sys/platform/x86.h> to glibc 2.29 to provide >>>> an API to access x86 specific platform features: >>>> >>>> enum >>>> { >>>> /* The integer bit array index of architecture feature bits. */ >>>> FEATURE_INDEX_1 = 0, >>>> FEATURE_INDEX_2, >>>> /* The current maximum size of the feature integer bit array. */ >>>> FEATURE_INDEX_MAX >>>> }; >>>> >>>> enum >>>> { >>>> COMMON_CPUID_INDEX_1 = 0, >>>> COMMON_CPUID_INDEX_7, >>>> COMMON_CPUID_INDEX_80000001, >>>> COMMON_CPUID_INDEX_D_ECX_1, >>>> COMMON_CPUID_INDEX_80000007, >>>> COMMON_CPUID_INDEX_80000008, >>>> /* Keep the following line at the end. */ >>>> COMMON_CPUID_INDEX_MAX >>>> }; >>>> >>>> enum cpu_features_kind >>>> { >>>> arch_kind_unknown = 0, >>>> arch_kind_intel, >>>> arch_kind_amd, >>>> arch_kind_other >>>> }; >>>> >>>> struct cpuid_registers >>>> { >>>> unsigned int eax; >>>> unsigned int ebx; >>>> unsigned int ecx; >>>> unsigned int edx; >>>> }; >>>> >>>> extern enum cpu_features_kind __x86_get_cpu_kind (void) >>>> __attribute__ ((const)); >>>> >>>> extern const struct cpuid_registers *__x86_get_cpuid_registers >>>> (unsigned int) __attribute__ ((const)); >>>> >>>> extern unsigned int __x86_get_arch_feature (unsigned int) >>>> __attribute__ ((const)); >>>> >>>> extern unsigned long long __x86_tsc_to_ns (unsigned long long); >>>> extern unsigned long long __x86_ns_to_tsc (unsigned long long); >>>> >>>> /* HAS_* evaluates to true if we may use the feature at runtime. */ >>>> #define HAS_CPU_FEATURE(name) \ >>>> ((__x86_get_cpuid_registers (index_cpu_##name)->reg_##name \ >>>> & (bit_cpu_##name)) != 0) >>> >>> I don't like these as macros. >>> >>> I would rather an inline function that does all the work within the function >>> and returns a result of the appropriate type e.g. boolean. >> >> CPUID instruction takes EAX and optionally ECX as input, It writes to >> EAX, EBX, ECD and EDX. Each CPU feature is determined by CPUID >> with specific input values and a bit in EAX, EBX, ECX or EDX. Take >> AVX2 as example: >> >> #define bit_cpu_AVX2 (1u << 5) >> #define index_cpu_AVX2 COMMON_CPUID_INDEX_7 >> #define reg_AVX2 ebx >> >> It is determined by the bit 5 of EBX from CPUID with EAX == 7. >> HAS_CPU_FEATURE (AVX2) is expanded to >> >> ((__x86_get_cpuid_registers (index_cpu_AVX2)->reg_AVX2 \ >> & (bit_cpu_AVX2)) != 0) >> >> which is equivalent to >> >> ((__x86_get_cpuid_registers (COMMON_CPUID_INDEX_7)->ebx \ >> & ((1u << 5))) != 0) >> >> It is perfect for macro, not so convenient for inline function. > > It's not perfect. > > It's difficult to debug. sysdeps/x86/tst-get-cpu-features.c should have 100% coverage of all features. Glibc users can just use them. > You are using the c pre-processor here to evaluate the input argument > into another 2 or 3 constant values, this is the kind of propagation > of values that the compiler can do too. Not easily with inline function in a public header file. > I'd like to see an attempt at a functional definition of this rather > than a macro, and the best argument against a static inline function > is performance. If the macro generates much better code then that's > a real reason to use it. In this case, macro generates very good code and very easy to maintain. >>>> #define HAS_ARCH_FEATURE(name) \ >>>> ((__x86_get_arch_feature (index_arch_##name) & (bit_arch_##name)) != 0) >>>> >>> >>> Likewise. >>> >>>> Users can use <sys/platform/x86.h> to detect if a CPU feature is >>>> supported at run-time with >>>> >>>> HAS_CPU_FEATURE (AVX) >>> >>> Likewise. >>> >>>> which returns true if glibc detects that AVX is available. Before AVX >>>> can be used, he/she should check >>>> >>>> HAS_ARCH_FEATURE (AVX_Usable) >>> >>> Likewise. >>> >>> These macros like this are harder to debug than static inline functions. >>> >>>> which returns true if glibc detects that AVX is available and usable. >>>> >>>> extern unsigned long long __x86_tsc_to_ns (unsigned long long); >>>> extern unsigned long long __x86_ns_to_tsc (unsigned long long); >>>> >>>> can be used to convert between TSCs and nanoseconds if >>>> >>>> HAS_ARCH_FEATURE (TSC_To_NS_Usable) >>>> >>>> returns true. >>>> >>>> Backward binary compatibility is provided by: >>>> >>>> 1. __x86_get_cpu_kind will get a new version if a new CPU kind is added >>>> to cpu_features_kind. >>> >>> This function takes void and returns the kind of CPU as an enum value. OK. >>> >>>> 2. __x86_get_cpuid_registers returns a pointer to a cpuid_registers >>>> with all zeros if cpuid index >= COMMON_CPUID_INDEX_MAX. >>> >>> Could you expand on this in more detail please. >> >> Internally, we have >> >> struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX]; >> >> _x86_get_cpuid_registers is defined as >> >> const struct cpuid_registers * >> __x86_get_cpuid_registers (unsigned int i) >> { >> const static struct cpuid_registers zero_cpuid_registers = { 0 }; >> if (i >= COMMON_CPUID_INDEX_MAX) >> return &zero_cpuid_registers; >> return &GLRO(dl_x86_cpu_features).cpuid[i]; >> } >> > > I don't like returning all zero for an error condition, we should have > a distinct error and return value.. It isn't an error. It simply means that this glibc doesn't support this CPUID input. It is OK to return 0 in EAX, EBX, ECX and EDX and HAS_CPU_FEATURE (XXX) evaluates to false. > Are we certain that an all zero cpuid register return is an invalid > result for cpuid in all cases or could it be confused with a valid > cpuid result but with the options in question disabled? HAS_CPU_FEATURE (XXX) evaluates to true if glibc detects that XXX is supported. It is OK for HAS_CPU_FEATURE (XXX) to be false if glibc doesn't know XXX on processors which has XXX. BTW, there are many reserved bits in EAX, EBX, ECX and EDX, which may be changed to new CPU features in future processors. It is easy to add these features to x86.h with macro without recompiling glibc. >>>> 3. __x86_get_arch_feature returns 0 if architecture index >= >>>> FEATURE_INDEX_MAX. >>> >>> Likewise. >>> >>>> This means: >>>> >>>> 1. Applications linked against the new __x86_get_cpu_kind will only >>>> run with the newer libc.so. >>> >>> How? Symbol versioning? >> >> Yes. > > OK. > >>>> 2. Applications linked against the new __x86_get_cpuid_registers and >>>> __x86_get_arch_feature will run with the older libc.so. But new >>>> CPU and architecture features won't be detected by the older libc.so. >>> >>> The caller must attempt to copy or inspect all the data, but it's structure >>> size may be larger, and the old glibc structure may be smaller, how do you >>> prevent the newer program from reading out of bounds when run in an older >>> glibc? This question is only relevant for __x86_get_cpuid_registers which >>> returns a pointer to a fixed size structure. The structure can grow, but >>> the caller needs to know exactly how big the returned structure is to avoid >>> reading outside of bounds. >> >> There is no ABI issue here. __x86_get_cpuid_registers returns a pointer to >> an element of struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX], >> whose size is fixed as >> >> struct cpuid_registers >> { >> unsigned int eax; >> unsigned int ebx; >> unsigned int ecx; >> unsigned int edx; >> }; > > Ah! Right, it's the cached cpuid return result so it's following the ISA rules > for the registers it touches. I expect this can't change at this point, and if > we got a *new* cpuid instruction we'd just make a new function to implement that. True. >> which is the cached output of CPUID. As shown above if the input index >>> = COMMON_CPUID_INDEX_MAX. we return a pointer to >> >> const static struct cpuid_registers zero_cpuid_registers = { 0 }; > > > -- > Cheers, > Carlos. -- H.J. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29 2018-07-29 16:16 RFC: Add <sys/platform/x86.h> to glibc 2.29 H.J. Lu 2018-07-31 13:32 ` Carlos O'Donell @ 2018-08-17 14:50 ` Florian Weimer 2018-08-17 15:46 ` H.J. Lu 1 sibling, 1 reply; 12+ messages in thread From: Florian Weimer @ 2018-08-17 14:50 UTC (permalink / raw) To: H.J. Lu, GNU C Library On 07/29/2018 06:16 PM, H.J. Lu wrote: > extern enum cpu_features_kind __x86_get_cpu_kind (void) > __attribute__ ((const)); > > extern const struct cpuid_registers *__x86_get_cpuid_registers > (unsigned int) __attribute__ ((const)); > > extern unsigned int __x86_get_arch_feature (unsigned int) > __attribute__ ((const)); Are these functions supposed to be usable in IFUNC resolvers? Shouldn't these identifiers be in the public (non-internal) namespace? Thanks, Florian ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29 2018-08-17 14:50 ` Florian Weimer @ 2018-08-17 15:46 ` H.J. Lu 2018-08-17 16:19 ` Florian Weimer 0 siblings, 1 reply; 12+ messages in thread From: H.J. Lu @ 2018-08-17 15:46 UTC (permalink / raw) To: Florian Weimer; +Cc: GNU C Library On Fri, Aug 17, 2018 at 7:50 AM, Florian Weimer <fweimer@redhat.com> wrote: > On 07/29/2018 06:16 PM, H.J. Lu wrote: >> >> extern enum cpu_features_kind __x86_get_cpu_kind (void) >> __attribute__ ((const)); >> >> extern const struct cpuid_registers *__x86_get_cpuid_registers >> (unsigned int) __attribute__ ((const)); >> >> extern unsigned int __x86_get_arch_feature (unsigned int) >> __attribute__ ((const)); > > > Are these functions supposed to be usable in IFUNC resolvers? They have the same limitation as /* Used from outside of glibc to get access to the CPU features structure. */ extern const struct cpu_features *__get_cpu_features (void) __attribute__ ((const)); > Shouldn't these identifiers be in the public (non-internal) namespace? > Do we have precedents for such arch specific public functions? -- H.J. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29 2018-08-17 15:46 ` H.J. Lu @ 2018-08-17 16:19 ` Florian Weimer 2018-08-17 16:44 ` H.J. Lu 0 siblings, 1 reply; 12+ messages in thread From: Florian Weimer @ 2018-08-17 16:19 UTC (permalink / raw) To: H.J. Lu; +Cc: GNU C Library On 08/17/2018 05:45 PM, H.J. Lu wrote: > On Fri, Aug 17, 2018 at 7:50 AM, Florian Weimer <fweimer@redhat.com> wrote: >> On 07/29/2018 06:16 PM, H.J. Lu wrote: >>> >>> extern enum cpu_features_kind __x86_get_cpu_kind (void) >>> __attribute__ ((const)); >>> >>> extern const struct cpuid_registers *__x86_get_cpuid_registers >>> (unsigned int) __attribute__ ((const)); >>> >>> extern unsigned int __x86_get_arch_feature (unsigned int) >>> __attribute__ ((const)); >> >> >> Are these functions supposed to be usable in IFUNC resolvers? > > They have the same limitation as > > /* Used from outside of glibc to get access to the CPU features > structure. */ > extern const struct cpu_features *__get_cpu_features (void) > __attribute__ ((const)); Sorry, this does not answer my question. >> Shouldn't these identifiers be in the public (non-internal) namespace? >> > > Do we have precedents for such arch specific public functions? There is modify_ldt. Thanks, Florian ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29 2018-08-17 16:19 ` Florian Weimer @ 2018-08-17 16:44 ` H.J. Lu 2018-08-17 19:11 ` Florian Weimer 2018-08-20 11:29 ` Florian Weimer 0 siblings, 2 replies; 12+ messages in thread From: H.J. Lu @ 2018-08-17 16:44 UTC (permalink / raw) To: Florian Weimer; +Cc: GNU C Library On Fri, Aug 17, 2018 at 9:19 AM, Florian Weimer <fweimer@redhat.com> wrote: > On 08/17/2018 05:45 PM, H.J. Lu wrote: >> >> On Fri, Aug 17, 2018 at 7:50 AM, Florian Weimer <fweimer@redhat.com> >> wrote: >>> >>> On 07/29/2018 06:16 PM, H.J. Lu wrote: >>>> >>>> >>>> extern enum cpu_features_kind __x86_get_cpu_kind (void) >>>> __attribute__ ((const)); >>>> >>>> extern const struct cpuid_registers *__x86_get_cpuid_registers >>>> (unsigned int) __attribute__ ((const)); >>>> >>>> extern unsigned int __x86_get_arch_feature (unsigned int) >>>> __attribute__ ((const)); >>> >>> >>> >>> Are these functions supposed to be usable in IFUNC resolvers? >> >> >> They have the same limitation as >> >> /* Used from outside of glibc to get access to the CPU features >> structure. */ >> extern const struct cpu_features *__get_cpu_features (void) >> __attribute__ ((const)); > > > Sorry, this does not answer my question. Just like __get_cpu_features, they can be used in IFUNC resolvers. >>> Shouldn't these identifiers be in the public (non-internal) namespace? >>> >> >> Do we have precedents for such arch specific public functions? > > > There is modify_ldt. > Which header file has its prototype? -- H.J. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29 2018-08-17 16:44 ` H.J. Lu @ 2018-08-17 19:11 ` Florian Weimer 2018-08-20 11:29 ` Florian Weimer 1 sibling, 0 replies; 12+ messages in thread From: Florian Weimer @ 2018-08-17 19:11 UTC (permalink / raw) To: H.J. Lu; +Cc: GNU C Library On 08/17/2018 06:43 PM, H.J. Lu wrote: >>>> Are these functions supposed to be usable in IFUNC resolvers? >>> >>> They have the same limitation as >>> >>> /* Used from outside of glibc to get access to the CPU features >>> structure. */ >>> extern const struct cpu_features *__get_cpu_features (void) >>> __attribute__ ((const)); >> >> Sorry, this does not answer my question. > Just like __get_cpu_features, they can be used in IFUNC resolvers. Then we need something like this: <https://sourceware.org/ml/libc-alpha/2018-06/msg00077.html> Thanks, Florian ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29 2018-08-17 16:44 ` H.J. Lu 2018-08-17 19:11 ` Florian Weimer @ 2018-08-20 11:29 ` Florian Weimer 2018-08-20 20:55 ` H.J. Lu 1 sibling, 1 reply; 12+ messages in thread From: Florian Weimer @ 2018-08-20 11:29 UTC (permalink / raw) To: H.J. Lu; +Cc: GNU C Library On 08/17/2018 06:43 PM, H.J. Lu wrote: >>>> Shouldn't these identifiers be in the public (non-internal) namespace? >>>> >>> Do we have precedents for such arch specific public functions? >> >> There is modify_ldt. >> > Which header file has its prototype? vm86 is an example for an arch-specific function with a prototype in an installed header file. Thanks, Florian ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: RFC: Add <sys/platform/x86.h> to glibc 2.29 2018-08-20 11:29 ` Florian Weimer @ 2018-08-20 20:55 ` H.J. Lu 0 siblings, 0 replies; 12+ messages in thread From: H.J. Lu @ 2018-08-20 20:55 UTC (permalink / raw) To: Florian Weimer; +Cc: GNU C Library On Mon, Aug 20, 2018 at 4:29 AM, Florian Weimer <fweimer@redhat.com> wrote: > On 08/17/2018 06:43 PM, H.J. Lu wrote: >>>>> >>>>> Shouldn't these identifiers be in the public (non-internal) namespace? >>>>> >>>> Do we have precedents for such arch specific public functions? >>> >>> >>> There is modify_ldt. >>> >> Which header file has its prototype? > > > vm86 is an example for an arch-specific function with a prototype in an > installed header file. The updated patch is posted at https://sourceware.org/ml/libc-alpha/2018-08/msg00435.html -- H.J. ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2018-08-20 20:55 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-07-29 16:16 RFC: Add <sys/platform/x86.h> to glibc 2.29 H.J. Lu 2018-07-31 13:32 ` Carlos O'Donell 2018-07-31 14:49 ` H.J. Lu 2018-07-31 15:06 ` Carlos O'Donell 2018-07-31 16:17 ` H.J. Lu 2018-08-17 14:50 ` Florian Weimer 2018-08-17 15:46 ` H.J. Lu 2018-08-17 16:19 ` Florian Weimer 2018-08-17 16:44 ` H.J. Lu 2018-08-17 19:11 ` Florian Weimer 2018-08-20 11:29 ` Florian Weimer 2018-08-20 20:55 ` H.J. Lu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).