public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] Bug 201689: Belt-and-suspenders detection of FMA.
@ 2016-10-14 18:09 Carlos O'Donell
  2016-10-14 19:12 ` H.J. Lu
  0 siblings, 1 reply; 5+ messages in thread
From: Carlos O'Donell @ 2016-10-14 18:09 UTC (permalink / raw)
  To: GNU C Library, H.J. Lu

In the Intel Architecture Instruction Set Extensions Programming reference
the recommended way to test for FMA in section '2.2.1 Detection of FMA'
is:

"Application Software must identify that hardware supports AVX as explained
 in ... after that it must also detect support for FMA..."

We don't do that in glibc. We use osxsave to detect the use of xgetbv, and
after that we check for AVX and FMA orthogonally. It is conceivable that
you could have the AVX bit clear and the FMA bit in an undefined state.

I have never seen a machine with the AVX bit clear and the FMA bit set, but
we should follow the intel specifications and adjust our check as the following
patch works.

OK to checkin?

2016-10-14  Carlos O'Donell  <carlos@redhat.com>

	[BZ #20689]
	* sysdeps/x86/cpu-features.c: Only enable FMA is AVX is present.

diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index 11b9af2..1d52f22 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -83,8 +83,10 @@ get_common_indeces (struct cpu_features *cpu_features,
                      |= bit_arch_AVX512DQ_Usable;
                }
            }
-         /* Determine if FMA is usable.  */
-         if (CPU_FEATURES_CPU_P (cpu_features, FMA))
+         /* Determine if FMA is usable.  The recommended Intel procedure
+            is to check for AVX && FMA to decide if FMA is available.  */
+         if (CPU_FEATURES_CPU_P (cpu_features, AVX)
+             && CPU_FEATURES_CPU_P (cpu_features, FMA))
            cpu_features->feature[index_arch_FMA_Usable]
              |= bit_arch_FMA_Usable;
        }
---


-- 
Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Bug 201689: Belt-and-suspenders detection of FMA.
  2016-10-14 18:09 [PATCH] Bug 201689: Belt-and-suspenders detection of FMA Carlos O'Donell
@ 2016-10-14 19:12 ` H.J. Lu
  2016-10-17 15:44   ` Carlos O'Donell
  0 siblings, 1 reply; 5+ messages in thread
From: H.J. Lu @ 2016-10-14 19:12 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: GNU C Library

On Fri, Oct 14, 2016 at 11:09 AM, Carlos O'Donell <carlos@redhat.com> wrote:
> In the Intel Architecture Instruction Set Extensions Programming reference
> the recommended way to test for FMA in section '2.2.1 Detection of FMA'
> is:
>
> "Application Software must identify that hardware supports AVX as explained
>  in ... after that it must also detect support for FMA..."
>
> We don't do that in glibc. We use osxsave to detect the use of xgetbv, and
> after that we check for AVX and FMA orthogonally. It is conceivable that
> you could have the AVX bit clear and the FMA bit in an undefined state.
>
> I have never seen a machine with the AVX bit clear and the FMA bit set, but
> we should follow the intel specifications and adjust our check as the following
> patch works.

One can't write a program with FMA nor AVX2 without using AVX
instructions.  AVX2/FMA aren't usable if AVX isn't.   We should do

if (CPU_FEATURES_CPU_P (cpu_features, AVX))
   {
      Set AVX available
      if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
         Set AVX2 available
      if (CPU_FEATURES_CPU_P (cpu_features, FMA))
         Set FMA available
   }

> OK to checkin?
>
> 2016-10-14  Carlos O'Donell  <carlos@redhat.com>
>
>         [BZ #20689]
>         * sysdeps/x86/cpu-features.c: Only enable FMA is AVX is present.
>
> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> index 11b9af2..1d52f22 100644
> --- a/sysdeps/x86/cpu-features.c
> +++ b/sysdeps/x86/cpu-features.c
> @@ -83,8 +83,10 @@ get_common_indeces (struct cpu_features *cpu_features,
>                       |= bit_arch_AVX512DQ_Usable;
>                 }
>             }
> -         /* Determine if FMA is usable.  */
> -         if (CPU_FEATURES_CPU_P (cpu_features, FMA))
> +         /* Determine if FMA is usable.  The recommended Intel procedure
> +            is to check for AVX && FMA to decide if FMA is available.  */
> +         if (CPU_FEATURES_CPU_P (cpu_features, AVX)
> +             && CPU_FEATURES_CPU_P (cpu_features, FMA))
>             cpu_features->feature[index_arch_FMA_Usable]
>               |= bit_arch_FMA_Usable;
>         }
> ---
>
>
> --
> Cheers,
> Carlos.



-- 
H.J.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Bug 201689: Belt-and-suspenders detection of FMA.
  2016-10-14 19:12 ` H.J. Lu
@ 2016-10-17 15:44   ` Carlos O'Donell
  2016-10-17 15:57     ` H.J. Lu
  2016-10-17 23:40     ` Carlos O'Donell
  0 siblings, 2 replies; 5+ messages in thread
From: Carlos O'Donell @ 2016-10-17 15:44 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GNU C Library

On 10/14/2016 03:12 PM, H.J. Lu wrote:
> On Fri, Oct 14, 2016 at 11:09 AM, Carlos O'Donell <carlos@redhat.com> wrote:
>> In the Intel Architecture Instruction Set Extensions Programming reference
>> the recommended way to test for FMA in section '2.2.1 Detection of FMA'
>> is:
>>
>> "Application Software must identify that hardware supports AVX as explained
>>  in ... after that it must also detect support for FMA..."
>>
>> We don't do that in glibc. We use osxsave to detect the use of xgetbv, and
>> after that we check for AVX and FMA orthogonally. It is conceivable that
>> you could have the AVX bit clear and the FMA bit in an undefined state.
>>
>> I have never seen a machine with the AVX bit clear and the FMA bit set, but
>> we should follow the intel specifications and adjust our check as the following
>> patch works.
> 
> One can't write a program with FMA nor AVX2 without using AVX
> instructions.  AVX2/FMA aren't usable if AVX isn't.   We should do
> 
> if (CPU_FEATURES_CPU_P (cpu_features, AVX))
>    {
>       Set AVX available
>       if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
>          Set AVX2 available
>       if (CPU_FEATURES_CPU_P (cpu_features, FMA))
>          Set FMA available
>    }
> 

Agreed. I double checked the manual, here is a new patch.

Testing on x86_64 and i686.

OK to commit if the results are clean?

v2
- Move FMA and AVX2 check up into AVX check since they depend upon it.

2016-10-14  Carlos O'Donell  <carlos@redhat.com>

	[BZ #20689]
	* sysdeps/x86/cpu-features.c: Only enable FMA and AVX2 if
	AVX is usable.

diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index 11b9af2..e228a76 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -60,12 +60,20 @@ get_common_indeces (struct cpu_features *cpu_features,
        {
          /* Determine if AVX is usable.  */
          if (CPU_FEATURES_CPU_P (cpu_features, AVX))
-           cpu_features->feature[index_arch_AVX_Usable]
-             |= bit_arch_AVX_Usable;
-         /* Determine if AVX2 is usable.  */
-         if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
-           cpu_features->feature[index_arch_AVX2_Usable]
-             |= bit_arch_AVX2_Usable;
+           {
+             cpu_features->feature[index_arch_AVX_Usable]
+               |= bit_arch_AVX_Usable;
+             /* The following features depend on AVX being usable.  */
+             /* Determine if AVX2 is usable.  */
+             if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
+               cpu_features->feature[index_arch_AVX2_Usable]
+                 |= bit_arch_AVX2_Usable;
+             /* Determine if FMA is usable.  */
+             if (CPU_FEATURES_CPU_P (cpu_features, FMA))
+               cpu_features->feature[index_arch_FMA_Usable]
+                 |= bit_arch_FMA_Usable;
+           }
+
          /* Check if OPMASK state, upper 256-bit of ZMM0-ZMM15 and
             ZMM16-ZMM31 state are enabled.  */
          if ((xcrlow & (bit_Opmask_state | bit_ZMM0_15_state
@@ -83,10 +91,6 @@ get_common_indeces (struct cpu_features *cpu_features,
                      |= bit_arch_AVX512DQ_Usable;
                }
            }
-         /* Determine if FMA is usable.  */
-         if (CPU_FEATURES_CPU_P (cpu_features, FMA))
-           cpu_features->feature[index_arch_FMA_Usable]
-             |= bit_arch_FMA_Usable;
        }
     }
 }
---

-- 
Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Bug 201689: Belt-and-suspenders detection of FMA.
  2016-10-17 15:44   ` Carlos O'Donell
@ 2016-10-17 15:57     ` H.J. Lu
  2016-10-17 23:40     ` Carlos O'Donell
  1 sibling, 0 replies; 5+ messages in thread
From: H.J. Lu @ 2016-10-17 15:57 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: GNU C Library

On Mon, Oct 17, 2016 at 8:44 AM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 10/14/2016 03:12 PM, H.J. Lu wrote:
>> On Fri, Oct 14, 2016 at 11:09 AM, Carlos O'Donell <carlos@redhat.com> wrote:
>>> In the Intel Architecture Instruction Set Extensions Programming reference
>>> the recommended way to test for FMA in section '2.2.1 Detection of FMA'
>>> is:
>>>
>>> "Application Software must identify that hardware supports AVX as explained
>>>  in ... after that it must also detect support for FMA..."
>>>
>>> We don't do that in glibc. We use osxsave to detect the use of xgetbv, and
>>> after that we check for AVX and FMA orthogonally. It is conceivable that
>>> you could have the AVX bit clear and the FMA bit in an undefined state.
>>>
>>> I have never seen a machine with the AVX bit clear and the FMA bit set, but
>>> we should follow the intel specifications and adjust our check as the following
>>> patch works.
>>
>> One can't write a program with FMA nor AVX2 without using AVX
>> instructions.  AVX2/FMA aren't usable if AVX isn't.   We should do
>>
>> if (CPU_FEATURES_CPU_P (cpu_features, AVX))
>>    {
>>       Set AVX available
>>       if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
>>          Set AVX2 available
>>       if (CPU_FEATURES_CPU_P (cpu_features, FMA))
>>          Set FMA available
>>    }
>>
>
> Agreed. I double checked the manual, here is a new patch.
>
> Testing on x86_64 and i686.
>
> OK to commit if the results are clean?
>
> v2
> - Move FMA and AVX2 check up into AVX check since they depend upon it.
>
> 2016-10-14  Carlos O'Donell  <carlos@redhat.com>
>
>         [BZ #20689]
>         * sysdeps/x86/cpu-features.c: Only enable FMA and AVX2 if
>         AVX is usable.
>
> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> index 11b9af2..e228a76 100644
> --- a/sysdeps/x86/cpu-features.c
> +++ b/sysdeps/x86/cpu-features.c
> @@ -60,12 +60,20 @@ get_common_indeces (struct cpu_features *cpu_features,
>         {
>           /* Determine if AVX is usable.  */
>           if (CPU_FEATURES_CPU_P (cpu_features, AVX))
> -           cpu_features->feature[index_arch_AVX_Usable]
> -             |= bit_arch_AVX_Usable;
> -         /* Determine if AVX2 is usable.  */
> -         if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
> -           cpu_features->feature[index_arch_AVX2_Usable]
> -             |= bit_arch_AVX2_Usable;
> +           {
> +             cpu_features->feature[index_arch_AVX_Usable]
> +               |= bit_arch_AVX_Usable;
> +             /* The following features depend on AVX being usable.  */
> +             /* Determine if AVX2 is usable.  */
> +             if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
> +               cpu_features->feature[index_arch_AVX2_Usable]
> +                 |= bit_arch_AVX2_Usable;
> +             /* Determine if FMA is usable.  */
> +             if (CPU_FEATURES_CPU_P (cpu_features, FMA))
> +               cpu_features->feature[index_arch_FMA_Usable]
> +                 |= bit_arch_FMA_Usable;
> +           }
> +
>           /* Check if OPMASK state, upper 256-bit of ZMM0-ZMM15 and
>              ZMM16-ZMM31 state are enabled.  */
>           if ((xcrlow & (bit_Opmask_state | bit_ZMM0_15_state
> @@ -83,10 +91,6 @@ get_common_indeces (struct cpu_features *cpu_features,
>                       |= bit_arch_AVX512DQ_Usable;
>                 }
>             }
> -         /* Determine if FMA is usable.  */
> -         if (CPU_FEATURES_CPU_P (cpu_features, FMA))
> -           cpu_features->feature[index_arch_FMA_Usable]
> -             |= bit_arch_FMA_Usable;
>         }
>      }
>  }
> ---
>

This is OK.

Thanks.


-- 
H.J.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Bug 201689: Belt-and-suspenders detection of FMA.
  2016-10-17 15:44   ` Carlos O'Donell
  2016-10-17 15:57     ` H.J. Lu
@ 2016-10-17 23:40     ` Carlos O'Donell
  1 sibling, 0 replies; 5+ messages in thread
From: Carlos O'Donell @ 2016-10-17 23:40 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GNU C Library

On 10/17/2016 11:44 AM, Carlos O'Donell wrote:
> On 10/14/2016 03:12 PM, H.J. Lu wrote:
>> On Fri, Oct 14, 2016 at 11:09 AM, Carlos O'Donell <carlos@redhat.com> wrote:
>>> In the Intel Architecture Instruction Set Extensions Programming reference
>>> the recommended way to test for FMA in section '2.2.1 Detection of FMA'
>>> is:
>>>
>>> "Application Software must identify that hardware supports AVX as explained
>>>  in ... after that it must also detect support for FMA..."
>>>
>>> We don't do that in glibc. We use osxsave to detect the use of xgetbv, and
>>> after that we check for AVX and FMA orthogonally. It is conceivable that
>>> you could have the AVX bit clear and the FMA bit in an undefined state.
>>>
>>> I have never seen a machine with the AVX bit clear and the FMA bit set, but
>>> we should follow the intel specifications and adjust our check as the following
>>> patch works.
>>
>> One can't write a program with FMA nor AVX2 without using AVX
>> instructions.  AVX2/FMA aren't usable if AVX isn't.   We should do
>>
>> if (CPU_FEATURES_CPU_P (cpu_features, AVX))
>>    {
>>       Set AVX available
>>       if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
>>          Set AVX2 available
>>       if (CPU_FEATURES_CPU_P (cpu_features, FMA))
>>          Set FMA available
>>    }
>>
> 
> Agreed. I double checked the manual, here is a new patch.
> 
> Testing on x86_64 and i686.
> 
> OK to commit if the results are clean?
> 
> v2
> - Move FMA and AVX2 check up into AVX check since they depend upon it.
> 
> 2016-10-14  Carlos O'Donell  <carlos@redhat.com>
> 
> 	[BZ #20689]
> 	* sysdeps/x86/cpu-features.c: Only enable FMA and AVX2 if
> 	AVX is usable.

Committed. Thanks.

-- 
Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-10-17 23:40 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-14 18:09 [PATCH] Bug 201689: Belt-and-suspenders detection of FMA Carlos O'Donell
2016-10-14 19:12 ` H.J. Lu
2016-10-17 15:44   ` Carlos O'Donell
2016-10-17 15:57     ` H.J. Lu
2016-10-17 23:40     ` Carlos O'Donell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).