public inbox for gcc-patches@gcc.gnu.org
* Re: [PATCH, i386, testsuite] FMA intrinsics
@ 2011-08-20 11:23 Uros Bizjak
  2011-08-22 17:32 ` Ilya Tocar
  0 siblings, 1 reply; 27+ messages in thread
From: Uros Bizjak @ 2011-08-20 11:23 UTC
  To: gcc-patches; +Cc: Ilya Tocar

Hello!

> This patch adds intrinsics for FMA instruction set along with tests for them.
> Bootstraps and passes make check (including make check on simulator
> for new runtime tests).

> * config/i386/fmaintrin.h: New.

It is not included in the patch.

> * config.gcc: Add fmaintrin.h.
> * config/i386/i386.c
> * <ix86_builtins> (IX86_BUILTIN_VFMADDSS3): New.
> (IX86_BUILTIN_VFMADDSD3): Likewise.
> (IX86_BUILTIN_VFNMADDSS3): Likewise.
> (IX86_BUILTIN_VFNMADDSD3): Likewise.
> (IX86_BUILTIN_VFMSUBSS3): Likewise.
> (IX86_BUILTIN_VFMSUBSD3): Likewise.
> (IX86_BUILTIN_VFNMSUBSS3): Likewise.
> (IX86_BUILTIN_VFNMSUBSD3): Likewise.
> (IX86_BUILTIN_VFMSUBPS): Likewise.
> (IX86_BUILTIN_VFMSUBPD): Likewise.
> (IX86_BUILTIN_VFMSUBPS256): Likewise.
> (IX86_BUILTIN_VFMSUBPD256): Likewise.
> (IX86_BUILTIN_VFNMADDPS): Likewise.
> (IX86_BUILTIN_VFNMADDPD): Likewise.
> (IX86_BUILTIN_VFNMADDPS256): Likewise.
> (IX86_BUILTIN_VFNMADDPD256): Likewise.
> (IX86_BUILTIN_VFNMSUBPS): Likewise.
> (IX86_BUILTIN_VFNMSUBPD): Likewise.
> (IX86_BUILTIN_VFNMSUBPS256): Likewise.
> (IX86_BUILTIN_VFNMSUBPD256): Likewise.
> (IX86_BUILTIN_VFMSUBADDPS): Likewise.
> (IX86_BUILTIN_VFMSUBADDPD): Likewise.
> (IX86_BUILTIN_VFMSUBADDPS256): Likewise.
> (IX86_BUILTIN_VFMSUBADDPD256): Likewise.

You don't need to add "negated" versions; one FMA builtin per mode is
enough. Just put a unary minus sign on the "negated" operand in the
intrinsics header and let GCC do its job. Please see the existing FMA4
builtin descriptions and the existing FMA4 intrinsics header.
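
A minimal sketch of this idea in plain C, under stated assumptions: the
sketch_* names are hypothetical, and an ordinary (unfused) multiply-add
stands in for the single builtin. The point is only that the three
"negated" forms reduce to negating operands of one fmadd.

```c
/* Hypothetical stand-in for the single FMA builtin.  It computes
   a * b + c with an ordinary multiply and add, not a fused one.  */
static inline double
sketch_fmadd (double a, double b, double c)
{
  return a * b + c;
}

/* a * b - c: negate the addend.  */
static inline double
sketch_fmsub (double a, double b, double c)
{
  return sketch_fmadd (a, b, -c);
}

/* -(a * b) + c: negate one multiplicand.  */
static inline double
sketch_fnmadd (double a, double b, double c)
{
  return sketch_fmadd (-a, b, c);
}

/* -(a * b) - c: negate one multiplicand and the addend.  */
static inline double
sketch_fnmsub (double a, double b, double c)
{
  return sketch_fmadd (-a, b, -c);
}
```

With a real builtin in place of sketch_fmadd, the compiler would be left
to fold the negations into the matching instruction variant.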

> * config/i386/sse.md (fmai_fnmadd_<mode>): New.
> (fmai_fmsub_<mode>): Likewise.
> (fmai_fnmsub_<mode>): Likewise.
> (fmai_fmadd_s_<mode>): Likewise.
> (fmai_vmfmadd_s_<mode>): Likewise.
> (fmai_vmfmsub_s_<mode>): Likewise.
> (fmai_vmfnmadd_s_<mode>): Likewise.
> (fmai_vmfnmsub_s_<mode>): Likewise.
> (*fmai_fmadd_s_<mode>): Likewise.
> (*fmai_fmsub_s_<mode>): Likewise.
> (*fmai_fnmadd_s_<mode>): Likewise.
> (*fmai_fnmsub_s_<mode>): Likewise.
> (fmsubadd_<mode>): Likewise.

The same applies here. All your FMAMODE patterns should be expanded through
the existing "fma4i_fmadd_<mode>" expander (you can rename it to
"fmai_fmadd..." to make its name more generic). This includes the new
"fmsubadd_<mode>" pattern, which should be expanded through the existing
"fmaddsub_<mode>" expander.
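
As a rough lane-wise model of why fmsubadd can go through fmaddsub
(hypothetical sketch_* names, four double lanes, unfused arithmetic, and
the same lane convention as the runtime tests in the patch, where odd
lanes add and even lanes subtract for fmaddsub):

```c
#define SKETCH_LANES 4

/* fmaddsub model: even lanes compute a*b - c, odd lanes a*b + c.  */
static void
sketch_fmaddsub (const double *a, const double *b, const double *c,
		 double *r)
{
  int i;

  for (i = 0; i < SKETCH_LANES; i++)
    r[i] = a[i] * b[i] + ((i % 2 == 1) ? c[i] : -c[i]);
}

/* fmsubadd model: fmaddsub with the third operand negated, so even
   lanes compute a*b + c and odd lanes a*b - c.  */
static void
sketch_fmsubadd (const double *a, const double *b, const double *c,
		 double *r)
{
  double neg_c[SKETCH_LANES];
  int i;

  for (i = 0; i < SKETCH_LANES; i++)
    neg_c[i] = -c[i];
  sketch_fmaddsub (a, b, neg_c, r);
}
```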

The vec_merge scalar versions also need only one expander; again, follow
the existing FMA4 version. Also, there is no need to include "_s_" in the
name. We know that these are scalar versions.
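
For reference, the vec_merge form with mask (const_int 1) used by the
scalar patterns in the patch can be modelled in plain C roughly as below.
The sketch_v4sf type and the function name are hypothetical, and the
multiply-add is unfused; element 0 comes from the fma result and the
remaining elements are kept from the destination.

```c
/* Hypothetical 4-lane float vector used only for this sketch.  */
typedef struct
{
  float lane[4];
} sketch_v4sf;

/* Model of (vec_merge (fma a b c) dest (const_int 1)): bit 0 of the
   mask selects lane 0 from the fma, all other lanes come from DEST.  */
static sketch_v4sf
sketch_vec_merge_fma (sketch_v4sf dest, sketch_v4sf a, sketch_v4sf b,
		      sketch_v4sf c)
{
  sketch_v4sf r = dest;

  r.lane[0] = a.lane[0] * b.lane[0] + c.lane[0];
  return r;
}
```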

> * gcc.target/i386/fma-check.h: New.
> * gcc.target/i386/fma-256-fmaddXX.c: New testcase.
> * gcc.target/i386/fma-256-fmaddsubXX.c: Likewise.
> * gcc.target/i386/fma-256-fmsubXX.c: Likewise.
> * gcc.target/i386/fma-256-fmsubaddXX.c: Likewise.
> * gcc.target/i386/fma-256-fnmaddXX.c: Likewise.
> * gcc.target/i386/fma-256-fnmsubXX.c: Likewise.
> * gcc.target/i386/fma-fmaddXX.c: Likewise.
> * gcc.target/i386/fma-fmaddsubXX.c: Likewise.
> * gcc.target/i386/fma-fmsubXX.c: Likewise.
> * gcc.target/i386/fma-fmsubaddXX.c: Likewise.
> * gcc.target/i386/fma-fnmaddXX.c: Likewise.
> * gcc.target/i386/fma-fnmsubXX.c: Likewise.
> * gcc.target/i386/fma-compile.c: Likewise.
> * gcc.target/i386/i386.exp (check_effective_target_fma): New.

Is there a reason that all runtime tests are compiled with -O0, other
than that some existing FMA tests in the testsuite use -O0?
Usually, these kinds of tests are compiled with -O2, so that optimizations
are also applied to the builtins.

Uros.

* [PATCH, i386, testsuite] FMA intrinsics
@ 2011-08-17 14:08 Ilya Tocar
  0 siblings, 0 replies; 27+ messages in thread
From: Ilya Tocar @ 2011-08-17 14:08 UTC
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3413 bytes --]

Hello everyone,
This patch adds intrinsics for FMA instruction set along with tests for them.
Bootstraps and passes make check (including make check on simulator
for new runtime tests).

Here is the ChangeLog:

2011-08-15  Ilya Tocar  <ilya.tocar@intel.com>

              * config/i386/fmaintrin.h: New.
              * config.gcc: Add fmaintrin.h.
              * config/i386/i386.c
              * <ix86_builtins> (IX86_BUILTIN_VFMADDSS3): New.
              (IX86_BUILTIN_VFMADDSD3): Likewise.
              (IX86_BUILTIN_VFNMADDSS3): Likewise.
              (IX86_BUILTIN_VFNMADDSD3): Likewise.
              (IX86_BUILTIN_VFMSUBSS3): Likewise.
              (IX86_BUILTIN_VFMSUBSD3): Likewise.
              (IX86_BUILTIN_VFNMSUBSS3): Likewise.
              (IX86_BUILTIN_VFNMSUBSD3): Likewise.
              (IX86_BUILTIN_VFMSUBPS): Likewise.
              (IX86_BUILTIN_VFMSUBPD): Likewise.
              (IX86_BUILTIN_VFMSUBPS256): Likewise.
              (IX86_BUILTIN_VFMSUBPD256): Likewise.
              (IX86_BUILTIN_VFNMADDPS): Likewise.
              (IX86_BUILTIN_VFNMADDPD): Likewise.
              (IX86_BUILTIN_VFNMADDPS256): Likewise.
              (IX86_BUILTIN_VFNMADDPD256): Likewise.
              (IX86_BUILTIN_VFNMSUBPS): Likewise.
              (IX86_BUILTIN_VFNMSUBPD): Likewise.
              (IX86_BUILTIN_VFNMSUBPS256): Likewise.
              (IX86_BUILTIN_VFNMSUBPD256): Likewise.
              (IX86_BUILTIN_VFMSUBADDPS): Likewise.
              (IX86_BUILTIN_VFMSUBADDPD): Likewise.
              (IX86_BUILTIN_VFMSUBADDPS256): Likewise.
              (IX86_BUILTIN_VFMSUBADDPD256): Likewise.
              * config/i386/sse.md (fmai_fnmadd_<mode>): New.
              (fmai_fmsub_<mode>): Likewise.
              (fmai_fnmsub_<mode>): Likewise.
              (fmai_fmadd_s_<mode>): Likewise.
              (fmai_vmfmadd_s_<mode>): Likewise.
              (fmai_vmfmsub_s_<mode>): Likewise.
              (fmai_vmfnmadd_s_<mode>): Likewise.
              (fmai_vmfnmsub_s_<mode>): Likewise.
              (*fmai_fmadd_s_<mode>): Likewise.
              (*fmai_fmsub_s_<mode>): Likewise.
              (*fmai_fnmadd_s_<mode>): Likewise.
              (*fmai_fnmsub_s_<mode>): Likewise.
              (fmsubadd_<mode>): Likewise.
              * config/i386/x86intrin.h: Add fmaintrin.h.

And the ChangeLog for the testsuite:

2011-08-15  Ilya Tocar  <ilya.tocar@intel.com>

              * gcc.target/i386/fma-check.h: New.
              * gcc.target/i386/fma-256-fmaddXX.c: New testcase.
              * gcc.target/i386/fma-256-fmaddsubXX.c: Likewise.
              * gcc.target/i386/fma-256-fmsubXX.c: Likewise.
              * gcc.target/i386/fma-256-fmsubaddXX.c: Likewise.
              * gcc.target/i386/fma-256-fnmaddXX.c: Likewise.
              * gcc.target/i386/fma-256-fnmsubXX.c: Likewise.
              * gcc.target/i386/fma-fmaddXX.c: Likewise.
              * gcc.target/i386/fma-fmaddsubXX.c: Likewise.
              * gcc.target/i386/fma-fmsubXX.c: Likewise.
              * gcc.target/i386/fma-fmsubaddXX.c: Likewise.
              * gcc.target/i386/fma-fnmaddXX.c: Likewise.
              * gcc.target/i386/fma-fnmsubXX.c: Likewise.
              * gcc.target/i386/fma-compile.c: Likewise.
              * gcc.target/i386/i386.exp (check_effective_target_fma): New.

Is it OK for trunk?
---
Best regards,
Ilya Tocar

[-- Attachment #2: patch --]
[-- Type: application/octet-stream, Size: 40860 bytes --]

diff --git a/gcc/config.gcc b/gcc/config.gcc
index ec13d93..3879a2a 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -352,7 +352,7 @@ i[34567]86-*-*)
 		       nmmintrin.h bmmintrin.h fma4intrin.h wmmintrin.h
 		       immintrin.h x86intrin.h avxintrin.h xopintrin.h
 		       ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
-		       lzcntintrin.h bmiintrin.h tbmintrin.h"
+		       lzcntintrin.h bmiintrin.h tbmintrin.h fmaintrin.h"
 	;;
 x86_64-*-*)
 	cpu_type=i386
@@ -364,7 +364,7 @@ x86_64-*-*)
 		       nmmintrin.h bmmintrin.h fma4intrin.h wmmintrin.h
 		       immintrin.h x86intrin.h avxintrin.h xopintrin.h
 		       ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
-		       lzcntintrin.h bmiintrin.h tbmintrin.h"
+		       lzcntintrin.h bmiintrin.h tbmintrin.h fmaintrin.h"
 	need_64bit_hwint=yes
 	;;
 ia64-*-*)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index fe6ccbe..9e04cea 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -23888,7 +23888,7 @@ enum ix86_builtins
   IX86_BUILTIN_VEC_PERM_V4DF,
   IX86_BUILTIN_VEC_PERM_V8SF,
 
-  /* FMA4 and XOP instructions.  */
+  /* FMA4 instructions.  */
   IX86_BUILTIN_VFMADDSS,
   IX86_BUILTIN_VFMADDSD,
   IX86_BUILTIN_VFMADDPS,
@@ -23900,6 +23900,33 @@ enum ix86_builtins
   IX86_BUILTIN_VFMADDSUBPS256,
   IX86_BUILTIN_VFMADDSUBPD256,
 
+  /* FMA instructions.  */
+  IX86_BUILTIN_VFMADDSS3,
+  IX86_BUILTIN_VFMADDSD3,
+  IX86_BUILTIN_VFNMADDSS3,
+  IX86_BUILTIN_VFNMADDSD3,
+  IX86_BUILTIN_VFMSUBSS3,
+  IX86_BUILTIN_VFMSUBSD3,
+  IX86_BUILTIN_VFNMSUBSS3,
+  IX86_BUILTIN_VFNMSUBSD3,
+  IX86_BUILTIN_VFMSUBPS,
+  IX86_BUILTIN_VFMSUBPD,
+  IX86_BUILTIN_VFMSUBPS256,
+  IX86_BUILTIN_VFMSUBPD256,
+  IX86_BUILTIN_VFNMADDPS,
+  IX86_BUILTIN_VFNMADDPD,
+  IX86_BUILTIN_VFNMADDPS256,
+  IX86_BUILTIN_VFNMADDPD256,
+  IX86_BUILTIN_VFNMSUBPS,
+  IX86_BUILTIN_VFNMSUBPD,
+  IX86_BUILTIN_VFNMSUBPS256,
+  IX86_BUILTIN_VFNMSUBPD256,
+  IX86_BUILTIN_VFMSUBADDPS,
+  IX86_BUILTIN_VFMSUBADDPD,
+  IX86_BUILTIN_VFMSUBADDPS256,
+  IX86_BUILTIN_VFMSUBADDPD256,
+
+  /* XOP instructions.  */
   IX86_BUILTIN_VPCMOV,
   IX86_BUILTIN_VPCMOV_V2DI,
   IX86_BUILTIN_VPCMOV_V4SI,
@@ -25100,6 +25127,31 @@ static const struct builtin_description bdesc_multi_arg[] =
     "__builtin_ia32_vfmaddsd", IX86_BUILTIN_VFMADDSD,
     UNKNOWN, (int)MULTI_ARG_3_DF },
 
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_vmfmadd_s_v4sf,
+    "__builtin_ia32_vfmaddss3", IX86_BUILTIN_VFMADDSS3,
+    UNKNOWN, (int)MULTI_ARG_3_SF },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_vmfmadd_s_v2df,
+    "__builtin_ia32_vfmaddsd3", IX86_BUILTIN_VFMADDSD3,
+    UNKNOWN, (int)MULTI_ARG_3_DF },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_vmfnmadd_s_v4sf,
+    "__builtin_ia32_vfnmaddss3", IX86_BUILTIN_VFNMADDSS3,
+    UNKNOWN, (int)MULTI_ARG_3_SF },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_vmfnmadd_s_v2df,
+    "__builtin_ia32_vfnmaddsd3", IX86_BUILTIN_VFNMADDSD3,
+    UNKNOWN, (int)MULTI_ARG_3_DF },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_vmfmsub_s_v4sf,
+    "__builtin_ia32_vfmsubss3", IX86_BUILTIN_VFMSUBSS3,
+    UNKNOWN, (int)MULTI_ARG_3_SF },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_vmfmsub_s_v2df,
+    "__builtin_ia32_vfmsubsd3", IX86_BUILTIN_VFMSUBSD3,
+    UNKNOWN, (int)MULTI_ARG_3_DF },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_vmfnmsub_s_v4sf,
+    "__builtin_ia32_vfnmsubss3", IX86_BUILTIN_VFNMSUBSS3,
+    UNKNOWN, (int)MULTI_ARG_3_SF },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_vmfnmsub_s_v2df,
+    "__builtin_ia32_vfnmsubsd3", IX86_BUILTIN_VFNMSUBSD3,
+    UNKNOWN, (int)MULTI_ARG_3_DF },
+
   { OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4, CODE_FOR_fma4i_fmadd_v4sf,
     "__builtin_ia32_vfmaddps", IX86_BUILTIN_VFMADDPS,
     UNKNOWN, (int)MULTI_ARG_3_SF },
@@ -25113,6 +25165,45 @@ static const struct builtin_description bdesc_multi_arg[] =
     "__builtin_ia32_vfmaddpd256", IX86_BUILTIN_VFMADDPD256,
     UNKNOWN, (int)MULTI_ARG_3_DF2 },
 
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_fmsub_v4sf,
+    "__builtin_ia32_vfmsubps", IX86_BUILTIN_VFMSUBPS,
+    UNKNOWN, (int)MULTI_ARG_3_SF },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_fmsub_v2df,
+    "__builtin_ia32_vfmsubpd", IX86_BUILTIN_VFMSUBPD,
+    UNKNOWN, (int)MULTI_ARG_3_DF },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_fmsub_v8sf,
+    "__builtin_ia32_vfmsubps256", IX86_BUILTIN_VFMSUBPS256,
+    UNKNOWN, (int)MULTI_ARG_3_SF2 },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_fmsub_v4df,
+    "__builtin_ia32_vfmsubpd256", IX86_BUILTIN_VFMSUBPD256,
+    UNKNOWN, (int)MULTI_ARG_3_DF2 },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_fnmadd_v4sf,
+    "__builtin_ia32_vfnmaddps", IX86_BUILTIN_VFNMADDPS,
+    UNKNOWN, (int)MULTI_ARG_3_SF },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_fnmadd_v2df,
+    "__builtin_ia32_vfnmaddpd", IX86_BUILTIN_VFNMADDPD,
+    UNKNOWN, (int)MULTI_ARG_3_DF },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_fnmadd_v8sf,
+    "__builtin_ia32_vfnmaddps256", IX86_BUILTIN_VFNMADDPS256,
+    UNKNOWN, (int)MULTI_ARG_3_SF2 },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_fnmadd_v4df,
+    "__builtin_ia32_vfnmaddpd256", IX86_BUILTIN_VFNMADDPD256,
+    UNKNOWN, (int)MULTI_ARG_3_DF2 },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_fnmsub_v4sf,
+    "__builtin_ia32_vfnmsubps", IX86_BUILTIN_VFNMSUBPS,
+    UNKNOWN, (int)MULTI_ARG_3_SF },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_fnmsub_v2df,
+    "__builtin_ia32_vfnmsubpd", IX86_BUILTIN_VFNMSUBPD,
+    UNKNOWN, (int)MULTI_ARG_3_DF },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_fnmsub_v8sf,
+    "__builtin_ia32_vfnmsubps256", IX86_BUILTIN_VFNMSUBPS256,
+    UNKNOWN, (int)MULTI_ARG_3_SF2 },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmai_fnmsub_v4df,
+    "__builtin_ia32_vfnmsubpd256", IX86_BUILTIN_VFNMSUBPD256,
+    UNKNOWN, (int)MULTI_ARG_3_DF2 },
+
+
+
   { OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4, CODE_FOR_fmaddsub_v4sf,
     "__builtin_ia32_vfmaddsubps", IX86_BUILTIN_VFMADDSUBPS,
     UNKNOWN, (int)MULTI_ARG_3_SF },
@@ -25125,6 +25216,18 @@ static const struct builtin_description bdesc_multi_arg[] =
   { OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4, CODE_FOR_fmaddsub_v4df,
     "__builtin_ia32_vfmaddsubpd256", IX86_BUILTIN_VFMADDSUBPD256,
     UNKNOWN, (int)MULTI_ARG_3_DF2 },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmsubadd_v4sf,
+    "__builtin_ia32_vfmsubaddps", IX86_BUILTIN_VFMSUBADDPS,
+    UNKNOWN, (int)MULTI_ARG_3_SF },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmsubadd_v2df,
+    "__builtin_ia32_vfmsubaddpd", IX86_BUILTIN_VFMSUBADDPD,
+    UNKNOWN, (int)MULTI_ARG_3_DF },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmsubadd_v8sf,
+    "__builtin_ia32_vfmsubaddps256", IX86_BUILTIN_VFMSUBADDPS256,
+    UNKNOWN, (int)MULTI_ARG_3_SF2 },
+  { OPTION_MASK_ISA_FMA, CODE_FOR_fmsubadd_v4df,
+    "__builtin_ia32_vfmsubaddpd256", IX86_BUILTIN_VFMSUBADDPD256,
+    UNKNOWN, (int)MULTI_ARG_3_DF2 },
 
   { OPTION_MASK_ISA_XOP, CODE_FOR_xop_pcmov_v2di,        "__builtin_ia32_vpcmov",      IX86_BUILTIN_VPCMOV,	 UNKNOWN,      (int)MULTI_ARG_3_DI },
   { OPTION_MASK_ISA_XOP, CODE_FOR_xop_pcmov_v2di,        "__builtin_ia32_vpcmov_v2di", IX86_BUILTIN_VPCMOV_V2DI, UNKNOWN,      (int)MULTI_ARG_3_DI },
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index e9f6c3d..0f3f982 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1528,6 +1528,44 @@
 	  (match_operand:FMAMODE 3 "nonimmediate_operand")))]
   "TARGET_FMA || TARGET_FMA4")
 
+
+(define_expand "fmai_fnmadd_<mode>"
+  [(set (match_operand:FMAMODE 0 "register_operand")
+	(fma:FMAMODE
+	  (neg:FMAMODE
+            (match_operand:FMAMODE 1 "nonimmediate_operand"))
+	  (match_operand:FMAMODE 2 "nonimmediate_operand")
+	  (match_operand:FMAMODE 3 "nonimmediate_operand")))]
+  "TARGET_FMA")
+
+(define_expand "fmai_fmsub_<mode>"
+  [(set (match_operand:FMAMODE 0 "register_operand")
+	(fma:FMAMODE
+	  (match_operand:FMAMODE 1 "nonimmediate_operand")
+	  (match_operand:FMAMODE 2 "nonimmediate_operand")
+	  (neg:FMAMODE
+            (match_operand:FMAMODE 3 "nonimmediate_operand"))))]
+  "TARGET_FMA")
+
+(define_expand "fmai_fnmsub_<mode>"
+  [(set (match_operand:FMAMODE 0 "register_operand")
+	(fma:FMAMODE
+	  (neg:FMAMODE
+            (match_operand:FMAMODE 1 "nonimmediate_operand"))
+	  (match_operand:FMAMODE 2 "nonimmediate_operand")
+          (neg:FMAMODE
+	    (match_operand:FMAMODE 3 "nonimmediate_operand"))))]
+  "TARGET_FMA")
+
+(define_expand "fmai_fmadd_s_<mode>"
+  [(set (match_operand:VF_128 0 "register_operand")
+	(fma:VF_128
+	  (match_operand:VF_128 1 "nonimmediate_operand")
+	  (match_operand:VF_128 2 "nonimmediate_operand")
+	  (match_operand:VF_128 3 "nonimmediate_operand")))]
+  "TARGET_FMA")
+
+
 (define_insn "*fma4i_fmadd_<mode>"
   [(set (match_operand:FMAMODE 0 "register_operand" "=x,x")
 	(fma:FMAMODE
@@ -1593,6 +1631,126 @@
   operands[4] = CONST0_RTX (<MODE>mode);
 })
 
+(define_expand "fmai_vmfmadd_s_<mode>"
+  [(set (match_operand:VF_128 0 "register_operand")
+	(vec_merge:VF_128
+	  (fma:VF_128
+	    (match_operand:VF_128 1 "nonimmediate_operand")
+	    (match_operand:VF_128 2 "nonimmediate_operand")
+	    (match_operand:VF_128 3 "nonimmediate_operand"))
+	  (match_dup 0)
+	  (const_int 1)))]
+  "TARGET_FMA")
+
+(define_expand "fmai_vmfmsub_s_<mode>"
+  [(set (match_operand:VF_128 0 "register_operand")
+        (vec_merge:VF_128
+	  (fma:VF_128
+	    (match_operand:VF_128   1 "nonimmediate_operand")
+	    (match_operand:VF_128   2 "nonimmediate_operand")
+	    (neg:VF_128
+	      (match_operand:VF_128 3 "nonimmediate_operand")))
+	  (match_dup 0)
+	  (const_int 1)))]
+  "TARGET_FMA")
+
+(define_expand "fmai_vmfnmadd_s_<mode>"
+  [(set (match_operand:VF_128 0 "register_operand")
+        (vec_merge:VF_128
+	  (fma:VF_128
+	    (neg:VF_128
+	      (match_operand:VF_128 1 "nonimmediate_operand"))
+	    (match_operand:VF_128   2 "nonimmediate_operand")
+	    (match_operand:VF_128   3 "nonimmediate_operand"))
+	  (match_dup 0)
+	  (const_int 1)))]
+  "TARGET_FMA")
+
+(define_expand "fmai_vmfnmsub_s_<mode>"
+  [(set (match_operand:VF_128 0 "register_operand")
+        (vec_merge:VF_128
+	  (fma:VF_128
+	    (neg:VF_128
+	      (match_operand:VF_128 1 "nonimmediate_operand"))
+	    (match_operand:VF_128   2 "nonimmediate_operand")
+	    (neg:VF_128
+	      (match_operand:VF_128 3 "nonimmediate_operand")))
+	  (match_dup 0)
+	  (const_int 1)))]
+  "TARGET_FMA")
+
+(define_insn "*fmai_fmadd_s_<mode>"
+  [(set (match_operand:VF_128 0 "register_operand" "=x,x,x")
+        (vec_merge:VF_128
+	  (fma:VF_128
+	    (match_operand:VF_128 1 "nonimmediate_operand" "%0, 0,x")
+	    (match_operand:VF_128 2 "nonimmediate_operand" "xm, x,xm")
+	    (match_operand:VF_128 3 "nonimmediate_operand" " x,xm,0"))
+	  (match_dup 0)
+	  (const_int 1)))]
+  "TARGET_FMA"
+  "@
+   vfmadd132<ssescalarmodesuffix>\t{%2, %3, %0|%0, %3, %2}
+   vfmadd213<ssescalarmodesuffix>\t{%3, %2, %0|%0, %2, %3}
+   vfmadd231<ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "ssemuladd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*fmai_fmsub_s_<mode>"
+  [(set (match_operand:VF_128 0 "register_operand" "=x,x,x")
+        (vec_merge:VF_128
+	  (fma:VF_128
+	    (match_operand:VF_128   1 "nonimmediate_operand" "%0, 0,x")
+	    (match_operand:VF_128   2 "nonimmediate_operand" "xm, x,xm")
+	    (neg:VF_128
+	      (match_operand:VF_128 3 "nonimmediate_operand" " x,xm,0")))
+	  (match_dup 0)
+	  (const_int 1)))]
+  "TARGET_FMA"
+  "@
+   vfmsub132<ssescalarmodesuffix>\t{%2, %3, %0|%0, %3, %2}
+   vfmsub213<ssescalarmodesuffix>\t{%3, %2, %0|%0, %2, %3}
+   vfmsub231<ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "ssemuladd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*fmai_fnmadd_s_<mode>"
+  [(set (match_operand:VF_128 0 "register_operand" "=x,x,x")
+        (vec_merge:VF_128
+	  (fma:VF_128
+	    (neg:VF_128
+	      (match_operand:VF_128 1 "nonimmediate_operand" "%0, 0,x"))
+	    (match_operand:VF_128   2 "nonimmediate_operand" "xm, x,xm")
+	    (match_operand:VF_128   3 "nonimmediate_operand" " x,xm,0"))
+	  (match_dup 0)
+	  (const_int 1)))]
+  "TARGET_FMA"
+  "@
+   vfnmadd132<ssescalarmodesuffix>\t{%2, %3, %0|%0, %3, %2}
+   vfnmadd213<ssescalarmodesuffix>\t{%3, %2, %0|%0, %2, %3}
+   vfnmadd231<ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "ssemuladd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*fmai_fnmsub_s_<mode>"
+  [(set (match_operand:VF_128 0 "register_operand" "=x,x,x")
+        (vec_merge:VF_128
+	  (fma:VF_128
+	    (neg:VF_128
+	      (match_operand:VF_128 1 "nonimmediate_operand" "%0, 0,x"))
+	    (match_operand:VF_128   2 "nonimmediate_operand" "xm, x,xm")
+	    (neg:VF_128
+	      (match_operand:VF_128 3 "nonimmediate_operand" " x,xm,0")))
+	  (match_dup 0)
+	  (const_int 1)))]
+  "TARGET_FMA"
+  "@
+   vfnmsub132<ssescalarmodesuffix>\t{%2, %3, %0|%0, %3, %2}
+   vfnmsub213<ssescalarmodesuffix>\t{%3, %2, %0|%0, %2, %3}
+   vfnmsub231<ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "ssemuladd")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn "*fma4i_vmfmadd_<mode>"
   [(set (match_operand:VF_128 0 "register_operand" "=x,x")
 	(vec_merge:VF_128
@@ -1677,6 +1835,16 @@
 	  UNSPEC_FMADDSUB))]
   "TARGET_FMA || TARGET_FMA4")
 
+(define_expand "fmsubadd_<mode>"
+  [(set (match_operand:VF 0 "register_operand")
+	(unspec:VF
+	  [(match_operand:VF 1 "nonimmediate_operand")
+	   (match_operand:VF 2 "nonimmediate_operand")
+	   (neg:VF
+	     (match_operand:VF 3 "nonimmediate_operand"))]
+	  UNSPEC_FMADDSUB))]
+  "TARGET_FMA")
+
 (define_insn "*fma4_fmaddsub_<mode>"
   [(set (match_operand:VF 0 "register_operand" "=x,x")
 	(unspec:VF
diff --git a/gcc/config/i386/x86intrin.h b/gcc/config/i386/x86intrin.h
index 88456f9..546b8a8 100644
--- a/gcc/config/i386/x86intrin.h
+++ b/gcc/config/i386/x86intrin.h
@@ -93,4 +93,8 @@
 #include <popcntintrin.h>
 #endif
 
+#ifdef __FMA__
+#include <fmaintrin.h>
+#endif
+
 #endif /* _X86INTRIN_H_INCLUDED */
diff --git a/gcc/testsuite/gcc.target/i386/fma-256-fmaddXX.c b/gcc/testsuite/gcc.target/i386/fma-256-fmaddXX.c
new file mode 100644
index 0000000..d87d244
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fma-256-fmaddXX.c
@@ -0,0 +1,61 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fma } */
+/* { dg-options "-O0 -mfma" } */
+
+#include "fma-check.h"
+
+#include <x86intrin.h>
+#include "m256-check.h"
+
+void
+check_mm256_fmadd_pd (__m256d __A, __m256d __B, __m256d __C)
+{
+  union256d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[4];
+  int i;
+  e.x = _mm256_fmadd_pd (__A, __B, __C);
+  for (i=0; i < 4;i++)
+    {
+      d[i] = a.a[i]*b.a[i]+c.a[i];
+    }
+  if (check_union256d (e, d))
+    abort ();
+}
+
+void
+check_mm256_fmadd_ps (__m256 __A, __m256 __B, __m256 __C)
+{
+  union256 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[8];
+  int i;
+  e.x = _mm256_fmadd_ps (__A, __B, __C);
+  for (i=0; i < 8;i++)
+    {
+      d[i] = a.a[i]*b.a[i]+c.a[i];
+    }
+  if (check_union256 (e, d))
+    abort ();
+}
+
+static void
+fma_test (void)
+{
+  union256 c[3];
+  union256d d[3];
+  int i,j;
+  for (i = 0;i < 3;i++)
+    {
+      for (j = 0; j < 8; j++)
+        c[i].a[j] = i*j + 3.5;
+      for (j = 0;j < 4;j++)
+        d[i].a[j] = i*j + 3.5;
+    }
+  check_mm256_fmadd_pd (d[0].x, d[1].x, d[2].x);
+  check_mm256_fmadd_ps (c[0].x, c[1].x, c[2].x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/fma-256-fmaddsubXX.c b/gcc/testsuite/gcc.target/i386/fma-256-fmaddsubXX.c
new file mode 100644
index 0000000..19079d0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fma-256-fmaddsubXX.c
@@ -0,0 +1,61 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fma } */
+/* { dg-options "-O0 -mfma" } */
+
+#include "fma-check.h"
+
+#include <x86intrin.h>
+#include "m256-check.h"
+
+void
+check_mm256_fmaddsub_ps (__m256 __A, __m256 __B, __m256 __C)
+{
+  union256 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[8];
+  int i;
+  e.x = _mm256_fmaddsub_ps (__A, __B, __C);
+  for (i=0; i < 8;i++)
+    {
+      d[i] = a.a[i]*b.a[i] + (i%2 == 1 ? c.a[i] : -c.a[i]);
+    }
+  if (check_union256 (e, d))
+    abort ();
+}
+
+void
+check_mm256_fmaddsub_pd (__m256d __A, __m256d __B, __m256d __C)
+{
+  union256d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[4];
+  int i;
+  e.x = _mm256_fmaddsub_pd (__A, __B, __C);
+  for (i=0; i < 4;i++)
+    {
+      d[i] = a.a[i]*b.a[i] + (i%2 == 1 ? c.a[i] : -c.a[i]);
+    }
+  if (check_union256d (e, d))
+    abort ();
+}
+
+static void
+fma_test (void)
+{
+  union256 c[3];
+  union256d d[3];
+  int i,j;
+  for (i = 0;i < 3;i++)
+    {
+      for (j = 0; j < 8; j++)
+        c[i].a[j] = i*j + 3.5;
+      for (j = 0;j < 4;j++)
+        d[i].a[j] = i*j + 3.5;
+    }
+  check_mm256_fmaddsub_pd (d[0].x, d[1].x, d[2].x);
+  check_mm256_fmaddsub_ps (c[0].x, c[1].x, c[2].x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/fma-256-fmsubXX.c b/gcc/testsuite/gcc.target/i386/fma-256-fmsubXX.c
new file mode 100644
index 0000000..e20f4e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fma-256-fmsubXX.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fma } */
+/* { dg-options "-O0 -mfma" } */
+
+#include "fma-check.h"
+
+#include <x86intrin.h>
+#include "m256-check.h"
+
+
+void
+check_mm256_fmsub_pd (__m256d __A, __m256d __B, __m256d __C)
+{
+  union256d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[4];
+  int i;
+  e.x = _mm256_fmsub_pd (__A, __B, __C);
+  for (i=0; i < 4;i++)
+    {
+      d[i] = a.a[i]*b.a[i]-c.a[i];
+    }
+  if (check_union256d (e, d))
+    abort ();
+}
+
+void
+check_mm256_fmsub_ps (__m256 __A, __m256 __B, __m256 __C)
+{
+  union256 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[8];
+  int i;
+  e.x = _mm256_fmsub_ps (__A, __B, __C);
+  for (i=0; i < 8;i++)
+    {
+      d[i] = a.a[i]*b.a[i]-c.a[i];
+    }
+  if (check_union256 (e, d))
+    abort ();
+}
+
+/* Test driver called from fma-check.h.  */
+static void
+fma_test (void)
+{
+  union256 c[3];
+  union256d d[3];
+  int i,j;
+  for (i = 0;i < 3;i++)
+    {
+      for (j = 0; j < 8; j++)
+        c[i].a[j] = i*j + 3.5;
+      for (j = 0;j < 4;j++)
+        d[i].a[j] = i*j + 3.5;
+    }
+  check_mm256_fmsub_pd (d[0].x, d[1].x, d[2].x);
+  check_mm256_fmsub_ps (c[0].x, c[1].x, c[2].x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/fma-256-fmsubaddXX.c b/gcc/testsuite/gcc.target/i386/fma-256-fmsubaddXX.c
new file mode 100644
index 0000000..a82b506
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fma-256-fmsubaddXX.c
@@ -0,0 +1,61 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fma } */
+/* { dg-options "-O0 -mfma" } */
+
+#include "fma-check.h"
+
+#include <x86intrin.h>
+#include "m256-check.h"
+
+void
+check_mm256_fmsubadd_ps (__m256 __A, __m256 __B, __m256 __C)
+{
+  union256 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[8];
+  int i;
+  e.x = _mm256_fmsubadd_ps (__A, __B, __C);
+  for (i=0; i < 8;i++)
+    {
+      d[i] = a.a[i]*b.a[i] + (i%2 == 1 ? -c.a[i] : c.a[i]);
+    }
+  if (check_union256 (e, d))
+    abort ();
+}
+
+void
+check_mm256_fmsubadd_pd (__m256d __A, __m256d __B, __m256d __C)
+{
+  union256d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[4];
+  int i;
+  e.x = _mm256_fmsubadd_pd (__A, __B, __C);
+  for (i=0; i < 4;i++)
+    {
+      d[i] = a.a[i]*b.a[i] + (i%2 == 1 ? -c.a[i] : c.a[i]);
+    }
+  if (check_union256d (e, d))
+    abort ();
+}
+
+static void
+fma_test (void)
+{
+  union256 c[3];
+  union256d d[3];
+  int i,j;
+  for (i = 0;i < 3;i++)
+    {
+      for (j = 0; j < 8; j++)
+        c[i].a[j] = i*j + 3.5;
+      for (j = 0;j < 4;j++)
+        d[i].a[j] = i*j + 3.5;
+    }
+  check_mm256_fmsubadd_pd (d[0].x, d[1].x, d[2].x);
+  check_mm256_fmsubadd_ps (c[0].x, c[1].x, c[2].x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/fma-256-fnmaddXX.c b/gcc/testsuite/gcc.target/i386/fma-256-fnmaddXX.c
new file mode 100644
index 0000000..ed8dc2e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fma-256-fnmaddXX.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fma } */
+/* { dg-options "-O0 -mfma" } */
+
+#include "fma-check.h"
+
+#include <x86intrin.h>
+#include "m256-check.h"
+
+void
+check_mm256_fnmadd_pd (__m256d __A, __m256d __B, __m256d __C)
+{
+  union256d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[4];
+  int i;
+  e.x = _mm256_fnmadd_pd (__A, __B, __C);
+  for (i=0; i < 4;i++)
+    {
+      d[i] = -a.a[i]*b.a[i]+c.a[i];
+    }
+  if (check_union256d (e, d))
+    abort ();
+}
+
+void
+check_mm256_fnmadd_ps (__m256 __A, __m256 __B, __m256 __C)
+{
+  union256 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[8];
+  int i;
+  e.x = _mm256_fnmadd_ps (__A, __B, __C);
+  for (i=0; i < 8;i++)
+    {
+      d[i] = -a.a[i]*b.a[i]+c.a[i];
+    }
+  if (check_union256 (e, d))
+    abort ();
+}
+
+static void
+fma_test (void)
+{
+  union256 c[3];
+  union256d d[3];
+  int i,j;
+  for (i = 0;i < 3;i++)
+    {
+      for (j = 0; j < 8; j++)
+        c[i].a[j] = i*j + 3.5;
+      for (j = 0;j < 4;j++)
+        d[i].a[j] = i*j + 3.5;
+    }
+  check_mm256_fnmadd_pd (d[0].x, d[1].x, d[2].x);
+  check_mm256_fnmadd_ps (c[0].x, c[1].x, c[2].x);
+}
+
+
diff --git a/gcc/testsuite/gcc.target/i386/fma-256-fnmsubXX.c b/gcc/testsuite/gcc.target/i386/fma-256-fnmsubXX.c
new file mode 100644
index 0000000..e59ff02
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fma-256-fnmsubXX.c
@@ -0,0 +1,62 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fma } */
+/* { dg-options "-O0 -mfma" } */
+
+#include "fma-check.h"
+
+#include <x86intrin.h>
+#include "m256-check.h"
+
+
+void
+check_mm256_fnmsub_pd (__m256d __A, __m256d __B, __m256d __C)
+{
+  union256d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[4];
+  int i;
+  e.x = _mm256_fnmsub_pd (__A, __B, __C);
+  for (i=0; i < 4;i++)
+    {
+      d[i] = -a.a[i]*b.a[i]-c.a[i];
+    }
+  if (check_union256d (e, d))
+    abort ();
+}
+
+void
+check_mm256_fnmsub_ps (__m256 __A, __m256 __B, __m256 __C)
+{
+  union256 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[8];
+  int i;
+  e.x = _mm256_fnmsub_ps (__A, __B, __C);
+  for (i=0; i < 8;i++)
+    {
+      d[i] = -a.a[i]*b.a[i]-c.a[i];
+    }
+  if (check_union256 (e, d))
+    abort ();
+}
+
+static void
+fma_test (void)
+{
+  union256 c[3];
+  union256d d[3];
+  int i,j;
+  for (i = 0;i < 3;i++)
+    {
+      for (j = 0; j < 8; j++)
+        c[i].a[j] = i*j + 3.5;
+      for (j = 0;j < 4;j++)
+        d[i].a[j] = i*j + 3.5;
+    }
+  check_mm256_fnmsub_pd (d[0].x, d[1].x, d[2].x);
+  check_mm256_fnmsub_ps (c[0].x, c[1].x, c[2].x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/fma-check.h b/gcc/testsuite/gcc.target/i386/fma-check.h
new file mode 100644
index 0000000..e4a6836
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fma-check.h
@@ -0,0 +1,27 @@
+#include <stdlib.h>
+
+#include "cpuid.h"
+
+static void fma_test (void);
+
+static void
+__attribute__ ((noinline))
+do_test (void)
+{
+  fma_test ();
+}
+
+int
+main ()
+{
+  unsigned int eax, ebx, ecx, edx;
+
+  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+    return 0;
+
+  /* Run FMA test only if host has FMA support.  */
+  if (ecx & bit_FMA)
+    do_test ();
+
+  exit (0);
+}
diff --git a/gcc/testsuite/gcc.target/i386/fma-compile.c b/gcc/testsuite/gcc.target/i386/fma-compile.c
new file mode 100644
index 0000000..4411772
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fma-compile.c
@@ -0,0 +1,221 @@
+/* Test that the compiler properly generates floating point multiply
+   and add instructions on FMA systems.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -mfma" } */
+
+#include <x86intrin.h>
+
+__m128d
+check_mm_fmadd_pd (__m128d a, __m128d b,__m128d c)
+{
+      return _mm_fmadd_pd (a, b, c);
+}
+
+__m256d
+check_mm256_fmadd_pd (__m256d a, __m256d b,__m256d c)
+{
+      return _mm256_fmadd_pd (a, b, c);
+}
+
+__m128
+check_mm_fmadd_ps (__m128 a, __m128 b,__m128 c)
+{
+      return _mm_fmadd_ps (a, b, c);
+}
+
+__m256
+check_mm256_fmadd_ps (__m256 a, __m256 b,__m256 c)
+{
+      return _mm256_fmadd_ps (a, b, c);
+}
+
+__m128d
+check_mm_fmadd_sd (__m128d a, __m128d b,__m128d c)
+{
+      return _mm_fmadd_sd (a, b, c);
+}
+
+__m128
+check_mm_fmadd_ss (__m128 a, __m128 b,__m128 c)
+{
+      return _mm_fmadd_ss (a, b, c);
+}
+
+__m128d
+check_mm_fmsub_pd (__m128d a, __m128d b,__m128d c)
+{
+      return _mm_fmsub_pd (a, b, c);
+}
+
+__m256d
+check_mm256_fmsub_pd (__m256d a, __m256d b,__m256d c)
+{
+      return _mm256_fmsub_pd (a, b, c);
+}
+
+__m128
+check_mm_fmsub_ps (__m128 a, __m128 b,__m128 c)
+{
+      return _mm_fmsub_ps (a, b, c);
+}
+
+__m256
+check_mm256_fmsub_ps (__m256 a, __m256 b,__m256 c)
+{
+      return _mm256_fmsub_ps (a, b, c);
+}
+
+__m128d
+check_mm_fmsub_sd (__m128d a, __m128d b,__m128d c)
+{
+      return _mm_fmsub_sd (a, b, c);
+}
+
+__m128
+check_mm_fmsub_ss (__m128 a, __m128 b,__m128 c)
+{
+      return _mm_fmsub_ss (a, b, c);
+}
+
+__m128d
+check_mm_fnmadd_pd (__m128d a, __m128d b,__m128d c)
+{
+      return _mm_fnmadd_pd (a, b, c);
+}
+
+__m256d
+check_mm256_fnmadd_pd (__m256d a, __m256d b,__m256d c)
+{
+      return _mm256_fnmadd_pd (a, b, c);
+}
+
+__m128
+check_mm_fnmadd_ps (__m128 a, __m128 b,__m128 c)
+{
+      return _mm_fnmadd_ps (a, b, c);
+}
+
+__m256
+check_mm256_fnmadd_ps (__m256 a, __m256 b,__m256 c)
+{
+      return _mm256_fnmadd_ps (a, b, c);
+}
+
+__m128d
+check_mm_fnmadd_sd (__m128d a, __m128d b,__m128d c)
+{
+      return _mm_fnmadd_sd (a, b, c);
+}
+
+__m128
+check_mm_fnmadd_ss (__m128 a, __m128 b,__m128 c)
+{
+      return _mm_fnmadd_ss (a, b, c);
+}
+
+__m128d
+check_mm_fnmsub_pd (__m128d a, __m128d b,__m128d c)
+{
+      return _mm_fnmsub_pd (a, b, c);
+}
+
+__m256d
+check_mm256_fnmsub_pd (__m256d a, __m256d b,__m256d c)
+{
+      return _mm256_fnmsub_pd (a, b, c);
+}
+
+__m128
+check_mm_fnmsub_ps (__m128 a, __m128 b,__m128 c)
+{
+      return _mm_fnmsub_ps (a, b, c);
+}
+
+__m256
+check_mm256_fnmsub_ps (__m256 a, __m256 b,__m256 c)
+{
+      return _mm256_fnmsub_ps (a, b, c);
+}
+
+__m128d
+check_mm_fnmsub_sd (__m128d a, __m128d b,__m128d c)
+{
+      return _mm_fnmsub_sd (a, b, c);
+}
+
+__m128
+check_mm_fnmsub_ss (__m128 a, __m128 b,__m128 c)
+{
+      return _mm_fnmsub_ss (a, b, c);
+}
+
+__m128d
+check_mm_fmaddsub_pd (__m128d a, __m128d b,__m128d c)
+{
+      return _mm_fmaddsub_pd (a, b, c);
+}
+
+__m256d
+check_mm256_fmaddsub_pd (__m256d a, __m256d b,__m256d c)
+{
+      return _mm256_fmaddsub_pd (a, b, c);
+}
+
+__m128
+check_mm_fmaddsub_ps (__m128 a, __m128 b,__m128 c)
+{
+      return _mm_fmaddsub_ps (a, b, c);
+}
+
+__m256
+check_mm256_fmaddsub_ps (__m256 a, __m256 b,__m256 c)
+{
+      return _mm256_fmaddsub_ps (a, b, c);
+}
+
+__m128d
+check_mm_fmsubadd_pd (__m128d a, __m128d b,__m128d c)
+{
+      return _mm_fmsubadd_pd (a, b, c);
+}
+
+__m256d
+check_mm256_fmsubadd_pd (__m256d a, __m256d b,__m256d c)
+{
+      return _mm256_fmsubadd_pd (a, b, c);
+}
+
+__m128
+check_mm_fmsubadd_ps (__m128 a, __m128 b,__m128 c)
+{
+      return _mm_fmsubadd_ps (a, b, c);
+}
+
+__m256
+check_mm256_fmsubadd_ps (__m256 a, __m256 b,__m256 c)
+{
+      return _mm256_fmsubadd_ps (a, b, c);
+}
+
+
+/* { dg-final { scan-assembler-times "vfmadd[^s]..ps" 2 } } */
+/* { dg-final { scan-assembler-times "vfmsub[^s]..ps" 2 } } */
+/* { dg-final { scan-assembler-times "vfnmadd...ps" 2 } } */
+/* { dg-final { scan-assembler-times "vfnmsub...ps" 2 } } */
+/* { dg-final { scan-assembler-times "vfmaddsub...ps" 2 } } */
+/* { dg-final { scan-assembler-times "vfmsubadd...ps" 2 } } */
+/* { dg-final { scan-assembler-times "vfmadd[^s]..pd" 2 } } */
+/* { dg-final { scan-assembler-times "vfmsub[^s]..pd" 2 } } */
+/* { dg-final { scan-assembler-times "vfnmadd...pd" 2 } } */
+/* { dg-final { scan-assembler-times "vfnmsub...pd" 2 } } */
+/* { dg-final { scan-assembler-times "vfmaddsub...pd" 2 } } */
+/* { dg-final { scan-assembler-times "vfmsubadd...pd" 2 } } */
+/* { dg-final { scan-assembler-times "vfmadd[^s]..ss" 1 } } */
+/* { dg-final { scan-assembler-times "vfmsub[^s]..ss" 1 } } */
+/* { dg-final { scan-assembler-times "vfnmadd...ss" 1 } } */
+/* { dg-final { scan-assembler-times "vfnmsub...ss" 1 } } */
+/* { dg-final { scan-assembler-times "vfmadd[^s]..sd" 1 } } */
+/* { dg-final { scan-assembler-times "vfmsub[^s]..sd" 1 } } */
+/* { dg-final { scan-assembler-times "vfnmadd...sd" 1 } } */
+/* { dg-final { scan-assembler-times "vfnmsub...sd" 1 } } */
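A note on the dg-final patterns above: the `[^s]` after `vfmadd` and `vfmsub` is what keeps the plain forms from also counting the `vfmaddsub`/`vfmsubadd` mnemonics, and the following `..` (or `...` in the other patterns) absorbs the operand-order digits (132, 213, 231). The sketch below checks that reasoning with POSIX regex.h rather than the Tcl engine DejaGnu uses; for patterns this simple the two should agree, but that is an assumption. `pattern_matches` is a hypothetical helper and not part of the patch.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: return 1 if PATTERN
   matches anywhere in INSN, 0 if it does not, and -1 if PATTERN
   fails to compile.  */
static int
pattern_matches (const char *pattern, const char *insn)
{
  regex_t re;
  int rc;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, insn, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With this helper, `pattern_matches ("vfmadd[^s]..ps", "vfmadd132ps")` returns 1, while the same pattern against `"vfmaddsub132ps"` returns 0 because the character after `vfmadd` is an `s`.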
diff --git a/gcc/testsuite/gcc.target/i386/fma-fmaddXX.c b/gcc/testsuite/gcc.target/i386/fma-fmaddXX.c
new file mode 100644
index 0000000..60ebc5d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fma-fmaddXX.c
@@ -0,0 +1,102 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fma } */
+/* { dg-options "-O0 -mfma" } */
+
+#include "fma-check.h"
+
+#include <x86intrin.h>
+#include "m256-check.h"
+
+void
+check_mm_fmadd_pd (__m128d __A, __m128d __B, __m128d __C)
+{
+  union128d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[2];
+  int i;
+  e.x = _mm_fmadd_pd (__A, __B, __C);
+  for (i=0; i < 2;i++)
+    {
+      d[i] = a.a[i]*b.a[i]+c.a[i];
+    }
+
+  if (check_union128d (e, d))
+    abort ();
+}
+
+void
+check_mm_fmadd_ps (__m128 __A, __m128 __B, __m128 __C)
+{
+  union128 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[4];
+  int i;
+  e.x = _mm_fmadd_ps (__A, __B, __C);
+  for (i=0; i < 4;i++)
+    {
+      d[i] = a.a[i]*b.a[i]+c.a[i];
+    }
+  if (check_union128 (e, d))
+    abort ();
+}
+
+void
+check_mm_fmadd_sd (__m128d __A, __m128d __B, __m128d __C)
+{
+  union128d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[2];
+  int i;
+  e.x = _mm_fmadd_sd (__A, __B, __C);
+  for (i=1; i < 2;i++)
+    {
+      d[i] = a.a[i];
+    }
+  d[0] = a.a[0]*b.a[0]+c.a[0];
+  if (check_union128d (e, d))
+    abort ();
+}
+
+void
+check_mm_fmadd_ss (__m128 __A, __m128 __B, __m128 __C)
+{
+  union128 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[4];
+  int i;
+  e.x = _mm_fmadd_ss (__A, __B, __C);
+  for (i=1; i < 4;i++)
+    {
+      d[i] = a.a[i];
+    }
+  d[0] = a.a[0]*b.a[0]+c.a[0];
+  if (check_union128 (e, d))
+    abort ();
+}
+
+static void
+fma_test (void)
+{
+  union128  a[3];
+  union128d b[3];
+  int i,j;
+  for (i = 0;i < 3;i++)
+    {
+      for (j = 0;j < 4;j++)
+        a[i].a[j] = i*j + 3.5;
+      for (j = 0;j < 2;j++)
+        b[i].a[j] = i*j + 3.5;
+    }
+  check_mm_fmadd_pd (b[0].x, b[1].x, b[2].x);
+  check_mm_fmadd_sd (b[0].x, b[1].x, b[2].x);
+  check_mm_fmadd_ps (a[0].x, a[1].x, a[2].x);
+  check_mm_fmadd_ss (a[0].x, a[1].x, a[2].x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/fma-fmaddsubXX.c b/gcc/testsuite/gcc.target/i386/fma-fmaddsubXX.c
new file mode 100644
index 0000000..bee7a7a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fma-fmaddsubXX.c
@@ -0,0 +1,61 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fma } */
+/* { dg-options "-O0 -mfma" } */
+
+#include "fma-check.h"
+
+#include <x86intrin.h>
+#include "m256-check.h"
+
+void
+check_mm_fmaddsub_ps (__m128 __A, __m128 __B, __m128 __C)
+{
+  union128 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[4];
+  int i;
+  e.x = _mm_fmaddsub_ps (__A, __B, __C);
+  for (i=0; i < 4;i++)
+    {
+      d[i] = a.a[i]*b.a[i] + (i%2 == 1 ? c.a[i] : -c.a[i]);
+    }
+  if (check_union128 (e, d))
+    abort ();
+}
+
+void
+check_mm_fmaddsub_pd (__m128d __A, __m128d __B, __m128d __C)
+{
+  union128d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[2];
+  int i;
+  e.x = _mm_fmaddsub_pd (__A, __B, __C);
+  for (i=0; i < 2;i++)
+    {
+      d[i] = a.a[i]*b.a[i] + (i%2 == 1 ? c.a[i] : -c.a[i]);
+    }
+  if (check_union128d (e, d))
+    abort ();
+}
+
+static void
+fma_test (void)
+{
+  union128  a[3];
+  union128d b[3];
+  int i,j;
+  for (i = 0;i < 3;i++)
+    {
+      for (j = 0;j < 4;j++)
+        a[i].a[j] = i*j + 3.5;
+      for (j = 0;j < 2;j++)
+        b[i].a[j] = i*j + 3.5;
+    }
+  check_mm_fmaddsub_pd (b[0].x, b[1].x, b[2].x);
+  check_mm_fmaddsub_ps (a[0].x, a[1].x, a[2].x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/fma-fmsubXX.c b/gcc/testsuite/gcc.target/i386/fma-fmsubXX.c
new file mode 100644
index 0000000..fb424ff
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fma-fmsubXX.c
@@ -0,0 +1,101 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fma } */
+/* { dg-options "-O0 -mfma" } */
+
+#include "fma-check.h"
+
+#include <x86intrin.h>
+#include "m256-check.h"
+
+void
+check_mm_fmsub_pd (__m128d __A, __m128d __B, __m128d __C)
+{
+  union128d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[2];
+  int i;
+  e.x = _mm_fmsub_pd (__A, __B, __C);
+  for (i=0; i < 2;i++)
+    {
+      d[i] = a.a[i]*b.a[i]-c.a[i];
+    }
+  if (check_union128d (e, d))
+    abort ();
+}
+
+void
+check_mm_fmsub_ps (__m128 __A, __m128 __B, __m128 __C)
+{
+  union128 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[4];
+  int i;
+  e.x = _mm_fmsub_ps (__A, __B, __C);
+  for (i=0; i < 4;i++)
+    {
+      d[i] = a.a[i]*b.a[i]-c.a[i];
+    }
+  if (check_union128 (e, d))
+    abort ();
+}
+
+void
+check_mm_fmsub_sd (__m128d __A, __m128d __B, __m128d __C)
+{
+  union128d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[2];
+  int i;
+  e.x = _mm_fmsub_sd (__A, __B, __C);
+  for (i=1; i < 2;i++)
+    {
+      d[i] = a.a[i];
+    }
+  d[0] = a.a[0]*b.a[0]-c.a[0];
+  if (check_union128d (e, d))
+    abort ();
+}
+
+void
+check_mm_fmsub_ss (__m128 __A, __m128 __B, __m128 __C)
+{
+  union128 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[4];
+  int i;
+  e.x = _mm_fmsub_ss (__A, __B, __C);
+  for (i=1; i < 4;i++)
+    {
+      d[i] = a.a[i];
+    }
+  d[0] = a.a[0]*b.a[0]-c.a[0];
+  if (check_union128 (e, d))
+    abort ();
+}
+
+static void
+fma_test (void)
+{
+  union128  a[3];
+  union128d b[3];
+  int i,j;
+  for (i = 0;i < 3;i++)
+    {
+      for (j = 0;j < 4;j++)
+        a[i].a[j] = i*j + 3.5;
+      for (j = 0;j < 2;j++)
+        b[i].a[j] = i*j + 3.5;
+    }
+  check_mm_fmsub_pd (b[0].x, b[1].x, b[2].x);
+  check_mm_fmsub_sd (b[0].x, b[1].x, b[2].x);
+  check_mm_fmsub_ps (a[0].x, a[1].x, a[2].x);
+  check_mm_fmsub_ss (a[0].x, a[1].x, a[2].x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/fma-fmsubaddXX.c b/gcc/testsuite/gcc.target/i386/fma-fmsubaddXX.c
new file mode 100644
index 0000000..a411e97
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fma-fmsubaddXX.c
@@ -0,0 +1,61 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fma } */
+/* { dg-options "-O0 -mfma" } */
+
+#include "fma-check.h"
+
+#include <x86intrin.h>
+#include "m256-check.h"
+
+void
+check_mm_fmsubadd_ps (__m128 __A, __m128 __B, __m128 __C)
+{
+  union128 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[4];
+  int i;
+  e.x = _mm_fmsubadd_ps (__A, __B, __C);
+  for (i=0; i < 4;i++)
+    {
+      d[i] = a.a[i]*b.a[i] + (i%2 == 1 ? -c.a[i] : c.a[i]);
+    }
+  if (check_union128 (e, d))
+    abort ();
+}
+
+void
+check_mm_fmsubadd_pd (__m128d __A, __m128d __B, __m128d __C)
+{
+  union128d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[2];
+  int i;
+  e.x = _mm_fmsubadd_pd (__A, __B, __C);
+  for (i=0; i < 2;i++)
+    {
+      d[i] = a.a[i]*b.a[i] + (i%2 == 1 ? -c.a[i] : c.a[i]);
+    }
+  if (check_union128d (e, d))
+    abort ();
+}
+
+static void
+fma_test (void)
+{
+  union128  a[3];
+  union128d b[3];
+  int i,j;
+  for (i = 0;i < 3;i++)
+    {
+      for (j = 0;j < 4;j++)
+        a[i].a[j] = i*j + 3.5;
+      for (j = 0;j < 2;j++)
+        b[i].a[j] = i*j + 3.5;
+    }
+  check_mm_fmsubadd_pd (b[0].x, b[1].x, b[2].x);
+  check_mm_fmsubadd_ps (a[0].x, a[1].x, a[2].x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/fma-fnmaddXX.c b/gcc/testsuite/gcc.target/i386/fma-fnmaddXX.c
new file mode 100644
index 0000000..d5a53ad
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fma-fnmaddXX.c
@@ -0,0 +1,101 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fma } */
+/* { dg-options "-O0 -mfma" } */
+
+#include "fma-check.h"
+
+#include <x86intrin.h>
+#include "m256-check.h"
+
+void
+check_mm_fnmadd_ps (__m128 __A, __m128 __B, __m128 __C)
+{
+  union128 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[4];
+  int i;
+  e.x = _mm_fnmadd_ps (__A, __B, __C);
+  for (i=0; i < 4;i++)
+    {
+      d[i] = -a.a[i]*b.a[i]+c.a[i];
+    }
+  if (check_union128 (e, d))
+    abort ();
+}
+
+void
+check_mm_fnmadd_pd (__m128d __A, __m128d __B, __m128d __C)
+{
+  union128d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[2];
+  int i;
+  e.x = _mm_fnmadd_pd (__A, __B, __C);
+  for (i=0; i < 2;i++)
+    {
+      d[i] = -a.a[i]*b.a[i]+c.a[i];
+    }
+  if (check_union128d (e, d))
+    abort ();
+}
+
+void
+check_mm_fnmadd_sd (__m128d __A, __m128d __B, __m128d __C)
+{
+  union128d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[2];
+  int i;
+  e.x = _mm_fnmadd_sd (__A, __B, __C);
+  for (i=1; i < 2;i++)
+    {
+      d[i] = a.a[i];
+    }
+  d[0] = -a.a[0]*b.a[0]+c.a[0];
+  if (check_union128d (e, d))
+    abort ();
+}
+
+void
+check_mm_fnmadd_ss (__m128 __A, __m128 __B, __m128 __C)
+{
+  union128 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[4];
+  int i;
+  e.x = _mm_fnmadd_ss (__A, __B, __C);
+  for (i=1; i < 4;i++)
+    {
+      d[i] = a.a[i];
+    }
+  d[0] = -a.a[0]*b.a[0]+c.a[0];
+  if (check_union128 (e, d))
+    abort ();
+}
+
+static void
+fma_test (void)
+{
+  union128  a[3];
+  union128d b[3];
+  int i,j;
+  for (i = 0;i < 3;i++)
+    {
+      for (j = 0;j < 4;j++)
+        a[i].a[j] = i*j + 3.5;
+      for (j = 0;j < 2;j++)
+        b[i].a[j] = i*j + 3.5;
+    }
+  check_mm_fnmadd_pd (b[0].x, b[1].x, b[2].x);
+  check_mm_fnmadd_sd (b[0].x, b[1].x, b[2].x);
+  check_mm_fnmadd_ps (a[0].x, a[1].x, a[2].x);
+  check_mm_fnmadd_ss (a[0].x, a[1].x, a[2].x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/fma-fnmsubXX.c b/gcc/testsuite/gcc.target/i386/fma-fnmsubXX.c
new file mode 100644
index 0000000..7a55386
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fma-fnmsubXX.c
@@ -0,0 +1,102 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fma } */
+/* { dg-options "-O0 -mfma" } */
+
+#include "fma-check.h"
+
+#include <x86intrin.h>
+#include "m256-check.h"
+
+void
+check_mm_fnmsub_sd (__m128d __A, __m128d __B, __m128d __C)
+{
+  union128d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[2];
+  int i;
+  e.x = _mm_fnmsub_sd (__A, __B, __C);
+  for (i=1; i < 2;i++)
+    {
+      d[i] = a.a[i];
+    }
+  d[0] = -a.a[0]*b.a[0]-c.a[0];
+  if (check_union128d (e, d))
+    abort ();
+}
+
+void
+check_mm_fnmsub_ss (__m128 __A, __m128 __B, __m128 __C)
+{
+  union128 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[4];
+  int i;
+  e.x = _mm_fnmsub_ss (__A, __B, __C);
+  for (i=1; i < 4;i++)
+    {
+      d[i] = a.a[i];
+    }
+  d[0] = -a.a[0]*b.a[0]-c.a[0];
+  if (check_union128 (e, d))
+    abort ();
+}
+
+void
+check_mm_fnmsub_ps (__m128 __A, __m128 __B, __m128 __C)
+{
+  union128 a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  float d[4];
+  int i;
+  e.x = _mm_fnmsub_ps (__A, __B, __C);
+  for (i=0; i < 4;i++)
+    {
+      d[i] = -a.a[i]*b.a[i]-c.a[i];
+    }
+  if (check_union128 (e, d))
+    abort ();
+}
+
+void
+check_mm_fnmsub_pd (__m128d __A, __m128d __B, __m128d __C)
+{
+  union128d a,b,c,e;
+  a.x = __A;
+  b.x = __B;
+  c.x = __C;
+  double d[2];
+  int i;
+  e.x = _mm_fnmsub_pd (__A, __B, __C);
+  for (i=0; i < 2;i++)
+    {
+      d[i] = -a.a[i]*b.a[i]-c.a[i];
+    }
+  if (check_union128d (e, d))
+    abort ();
+}
+
+/* Initialize the operands and run each fnmsub check.  */
+static void
+fma_test (void)
+{
+  union128  a[3];
+  union128d b[3];
+  int i,j;
+  for (i = 0;i < 3;i++)
+    {
+      for (j = 0;j < 4;j++)
+        a[i].a[j] = i*j + 3.5;
+      for (j = 0;j < 2;j++)
+        b[i].a[j] = i*j + 3.5;
+    }
+  check_mm_fnmsub_pd (b[0].x, b[1].x, b[2].x);
+  check_mm_fnmsub_sd (b[0].x, b[1].x, b[2].x);
+  check_mm_fnmsub_ps (a[0].x, a[1].x, a[2].x);
+  check_mm_fnmsub_ss (a[0].x, a[1].x, a[2].x);
+}
diff --git a/gcc/testsuite/gcc.target/i386/i386.exp b/gcc/testsuite/gcc.target/i386/i386.exp
index 167b79b..8ebe57d 100644
--- a/gcc/testsuite/gcc.target/i386/i386.exp
+++ b/gcc/testsuite/gcc.target/i386/i386.exp
@@ -172,6 +172,20 @@ proc check_effective_target_fma4 { } {
     } "-O2 -mfma4" ]
 }
 
+# Return 1 if fma instructions can be compiled.
+proc check_effective_target_fma { } {
+    return [check_no_compiler_messages fma object {
+	typedef float __m128 __attribute__ ((__vector_size__ (16)));
+	typedef float __v4sf __attribute__ ((__vector_size__ (16)));
+	__m128 _mm_macc_ps(__m128 __A, __m128 __B, __m128 __C)
+	{
+	    return (__m128) __builtin_ia32_vfmsubps ((__v4sf)__A,
+						     (__v4sf)__B,
+						     (__v4sf)__C);
+	}
+    } "-O2 -mfma" ]
+}
+
 # Return 1 if xop instructions can be compiled.
 proc check_effective_target_xop { } {
     return [check_no_compiler_messages xop object {

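For reference, the expected values computed by the run-time tests in this patch can be summarised as one scalar model per lane. The sketch below is not part of the patch; `enum fma_kind` and `ref_lane` are hypothetical names, and it uses `double` throughout for brevity. It rounds the product and the sum separately, as the tests do, whereas the hardware instruction rounds once. For the small multiples of 0.5 that `fma_test` stores into the operands, every intermediate is exactly representable, so both agree.

```c
/* Hypothetical names for illustration; not part of the patch.  */
enum fma_kind { FMADD, FMSUB, FNMADD, FNMSUB, FMADDSUB, FMSUBADD };

/* Expected result for lane I of the given operation, mirroring the
   expressions used in the check_mm_* functions.  */
static double
ref_lane (enum fma_kind kind, int i, double a, double b, double c)
{
  switch (kind)
    {
    case FMADD:
      return a * b + c;
    case FMSUB:
      return a * b - c;
    case FNMADD:
      return -a * b + c;
    case FNMSUB:
      return -a * b - c;
    case FMADDSUB:
      /* Odd lanes add C, even lanes subtract it.  */
      return a * b + (i % 2 == 1 ? c : -c);
    case FMSUBADD:
      /* Odd lanes subtract C, even lanes add it.  */
      return a * b + (i % 2 == 1 ? -c : c);
    }
  return 0.0;
}
```

For the scalar `_sd`/`_ss` forms only lane 0 follows this model; the tests expect the remaining lanes to be copied from the first operand.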

end of thread, other threads:[~2011-08-30 14:03 UTC | newest]

Thread overview: 27+ messages
2011-08-20 11:23 [PATCH, i386, testsuite] FMA intrinsics Uros Bizjak
2011-08-22 17:32 ` Ilya Tocar
2011-08-22 20:41   ` Uros Bizjak
2011-08-23 14:55     ` Ilya Tocar
2011-08-23 15:33       ` Uros Bizjak
2011-08-24 10:06         ` Ilya Tocar
2011-08-24 10:12           ` Jakub Jelinek
2011-08-24 11:12             ` Ilya Tocar
2011-08-24 13:31               ` Uros Bizjak
2011-08-24 13:52                 ` Ilya Tocar
2011-08-24 22:00                   ` Uros Bizjak
2011-08-25 10:15                     ` Ilya Tocar
2011-08-25 10:46                       ` Uros Bizjak
2011-08-25 11:46                         ` Ilya Tocar
2011-08-25 11:46                           ` Jakub Jelinek
2011-08-25 11:49                             ` Ilya Tocar
2011-08-26 11:03                               ` Ilya Tocar
2011-08-26 14:07                                 ` H.J. Lu
2011-08-26 15:47                                   ` Ilya Tocar
2011-08-26 17:02                                     ` H.J. Lu
2011-08-26 17:06                                       ` H.J. Lu
2011-08-30 12:05                                         ` Ilya Tocar
2011-08-30 14:02                                           ` H.J. Lu
2011-08-30 14:10                                             ` Ilya Tocar
2011-08-30 14:32                                               ` Ilya Tocar
2011-08-30 15:25                                                 ` H.J. Lu
  -- strict thread matches above, loose matches on Subject: below --
2011-08-17 14:08 [PATCH, i386, testsuite]FMA intrinsics Ilya Tocar
