public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Intrinsics for ADCX
@ 2012-07-31 11:51 Michael Zolotukhin
  2012-07-31 13:26 ` Uros Bizjak
  2012-07-31 16:24 ` Richard Henderson
  0 siblings, 2 replies; 14+ messages in thread
From: Michael Zolotukhin @ 2012-07-31 11:51 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Jakub Jelinek, gcc-patches, H.J. Lu, Kirill Yukhin

[-- Attachment #1: Type: text/plain, Size: 2450 bytes --]

Hi guys,
Here is a third part of patch, refactored by Kirill. This one adds
_addcarryx_u[32|64]  intrinsics.

Is it ok?

Changelog entry:
2012-07-31 Michael Zolotukhin <michael.v.zolotukhin@intel.com>

        * common/config/i386/i386-common.c (OPTION_MASK_ISA_ADX_SET): New.
        (OPTION_MASK_ISA_ADX_UNSET): Likewise.
        (ix86_handle_option): Handle madx option.
        * config.gcc (i[34567]86-*-*): Add adxintrin.h.
        (x86_64-*-*): Likewise.
        * config/i386/adxintrin.h: New header.
        * config/i386/driver-i386.c (host_detect_local_cpu): Detect ADCX/ADOX
        support.
        * config/i386/i386-builtin-types.def
        (UCHAR_FTYPE_UCHAR_UINT_UINT_PINT): New function type.
        (UCHAR_FTYPE_UCHAR_ULONGLONG_ULONGLONG_PINT): Likewise.
        * config/i386/i386-c.c: Define __ADX__ if needed.
        * config/i386/i386.c (ix86_target_string): Define -madx option.
        (PTA_ADX): New.
        (ix86_option_override_internal): Handle new option.
        (ix86_valid_target_attribute_inner_p): Add OPT_madx.
        (ix86_builtins): Add IX86_BUILTIN_ADDCARRYX32,
        IX86_BUILTIN_ADDCARRYX64.
        (ix86_init_mmx_sse_builtins): Define corresponding built-ins.
        (ix86_expand_builtin): Handle these built-ins.
        (ix86_expand_args_builtin): Handle new function types.
        * config/i386/i386.h (TARGET_ADX): New.
        * config/i386/i386.md (adcx<mode>): New define_expand.
        (adcx<mode>_carry): New define_insn.
        * config/i386/i386.opt (madx): New.
        * config/i386/x86intrin.h: Include adxintrin.h.

testsuite/Changelog entry:
2012-07-31 Michael Zolotukhin <michael.v.zolotukhin@intel.com>

        * gcc.target/i386/adx-addcarryx32-1.c: New.
        * gcc.target/i386/adx-addcarryx32-2.c: New.
        * gcc.target/i386/adx-addcarryx64-1.c: New.
        * gcc.target/i386/adx-addcarryx64-2.c: New.
        * gcc.target/i386/adx-check.h: New.
        * gcc.target/i386/i386.exp (check_effective_target_adx): New.
        * gcc.target/i386/sse-12.c: Add -madx.
        * gcc.target/i386/sse-13.c: Ditto.
        * gcc.target/i386/sse-14.c: Ditto.
        * gcc.target/i386/sse-22.c: Ditto.
        * gcc.target/i386/sse-23.c: Ditto.
        * g++.dg/other/i386-2.C: Ditto.
        * g++.dg/other/i386-3.C: Ditto.


Bootstrap and new tests are passing, other testing is in progress.


-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.

[-- Attachment #2: bdw-adx-1.gcc.patch --]
[-- Type: application/octet-stream, Size: 24928 bytes --]

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 70dcae0..e05cd56 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -57,6 +57,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_RTM_SET OPTION_MASK_ISA_RTM
 #define OPTION_MASK_ISA_PRFCHW_SET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_SET OPTION_MASK_ISA_RDSEED
+#define OPTION_MASK_ISA_ADX_SET OPTION_MASK_ISA_ADX
 
 /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
    as -msse4.2.  */
@@ -127,6 +128,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_RTM_UNSET OPTION_MASK_ISA_RTM
 #define OPTION_MASK_ISA_PRFCHW_UNSET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_UNSET OPTION_MASK_ISA_RDSEED
+#define OPTION_MASK_ISA_ADX_UNSET OPTION_MASK_ISA_ADX
 
 /* SSE4 includes both SSE4.1 and SSE4.2.  -mno-sse4 should the same
    as -mno-sse4.1. */
@@ -598,6 +600,19 @@ ix86_handle_option (struct gcc_options *opts,
 	}
       return true;
 
+    case OPT_madx:
+      if (value)
+	{
+	  opts->x_ix86_isa_flags |= OPTION_MASK_ISA_ADX_SET;
+	  opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_ADX_SET;
+	}
+      else
+	{
+	  opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_ADX_UNSET;
+	  opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_ADX_UNSET;
+	}
+      return true;
+
   /* Comes from final.c -- no real reason to change it.  */
 #define MAX_CODE_ALIGN 16
 
diff --git a/gcc/config.gcc b/gcc/config.gcc
index dad4c3a..f40ac0e 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -361,7 +361,7 @@ i[34567]86-*-*)
 		       ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
 		       lzcntintrin.h bmiintrin.h bmi2intrin.h tbmintrin.h
 		       avx2intrin.h fmaintrin.h f16cintrin.h rtmintrin.h
-		       xtestintrin.h rdseedintrin.h prfchwintrin.h"
+		       xtestintrin.h rdseedintrin.h prfchwintrin.h adxintrin.h"
 	;;
 x86_64-*-*)
 	cpu_type=i386
@@ -375,7 +375,7 @@ x86_64-*-*)
 		       ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
 		       lzcntintrin.h bmiintrin.h tbmintrin.h bmi2intrin.h
 		       avx2intrin.h fmaintrin.h f16cintrin.h rtmintrin.h
-		       xtestintrin.h rdseedintrin.h prfchwintrin.h"
+		       xtestintrin.h rdseedintrin.h prfchwintrin.h adxintrin.h"
 	need_64bit_hwint=yes
 	;;
 ia64-*-*)
diff --git a/gcc/config/i386/adxintrin.h b/gcc/config/i386/adxintrin.h
new file mode 100644
index 0000000..4f8fb0b
--- /dev/null
+++ b/gcc/config/i386/adxintrin.h
@@ -0,0 +1,51 @@
+/* Copyright (C) 2012 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#if !defined _X86INTRIN_H_INCLUDED && !defined _IMMINTRIN_H_INCLUDED
+# error "Never use <adxintrin.h> directly; include <x86intrin.h> instead."
+#endif
+
+#ifndef __ADX__
+# error "Flag-preserving add-carry instructions not enabled"
+#endif /* __ADX__ */
+
+#ifndef _ADXINTRIN_H_INCLUDED
+#define _ADXINTRIN_H_INCLUDED
+
+extern __inline unsigned char
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_addcarryx_u32 (unsigned char __CF, unsigned int __X,
+		unsigned int __Y, unsigned int *__P)
+{
+    return __builtin_ia32_addcarryx_u32 (__CF, __X, __Y, __P);
+}
+
+extern __inline unsigned char
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_addcarryx_u64 (unsigned char __CF, unsigned long __X,
+		unsigned long __Y, unsigned long long *__P)
+{
+    return __builtin_ia32_addcarryx_u64 (__CF, __X, __Y, __P);
+}
+
+#endif /* _ADXINTRIN_H_INCLUDED */
diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index 4616108..0b56f3f 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -399,7 +399,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
   unsigned int has_bmi = 0, has_bmi2 = 0, has_tbm = 0, has_lzcnt = 0;
   unsigned int has_hle = 0, has_rtm = 0;
   unsigned int has_rdrnd = 0, has_f16c = 0, has_fsgsbase = 0;
-  unsigned int has_rdseed = 0, has_prfchw = 0;
+  unsigned int has_rdseed = 0, has_prfchw = 0, has_adx = 0;
 
   bool arch;
 
@@ -468,6 +468,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
       has_fsgsbase = ebx & bit_FSGSBASE;
       has_rdseed = ebx & bit_RDSEED;
       has_prfchw = ecx & bit_PRFCHW;
+      has_adx = ebx & bit_ADX;
     }
 
   /* Check cpuid level of extended features.  */
@@ -750,11 +751,12 @@ const char *host_detect_local_cpu (int argc, const char **argv)
       const char *fsgsbase = has_fsgsbase ? " -mfsgsbase" : " -mno-fsgsbase";
       const char *rdseed = has_rdseed ? " -mrdseed" : " -mno-rdseed";
       const char *prfchw = has_prfchw ? " -mprfchw" : " -mno-prfchw";
+      const char *adx = has_adx ? " -madx" : " -mno-adx";
 
       options = concat (options, cx16, sahf, movbe, ase, pclmul,
 			popcnt, abm, lwp, fma, fma4, xop, bmi, bmi2,
 			tbm, avx, avx2, sse4_2, sse4_1, lzcnt, rtm,
-			hle, rdrnd, f16c, fsgsbase, rdseed, prfchw, NULL);
+			hle, rdrnd, f16c, fsgsbase, rdseed, prfchw, adx, NULL);
     }
 
 done:
diff --git a/gcc/config/i386/i386-builtin-types.def b/gcc/config/i386/i386-builtin-types.def
index 398bf0a..41141f0 100644
--- a/gcc/config/i386/i386-builtin-types.def
+++ b/gcc/config/i386/i386-builtin-types.def
@@ -446,6 +446,9 @@ DEF_FUNCTION_TYPE (V16QI, V16QI, INT, V16QI, INT, INT)
 
 DEF_FUNCTION_TYPE (V8QI, QI, QI, QI, QI, QI, QI, QI, QI)
 
+DEF_FUNCTION_TYPE (UCHAR, UCHAR, UINT, UINT, PINT)
+DEF_FUNCTION_TYPE (UCHAR, UCHAR, ULONGLONG, ULONGLONG, PINT)
+
 DEF_FUNCTION_TYPE (V2DF, V2DF, PCDOUBLE, V4SI, V2DF, INT)
 DEF_FUNCTION_TYPE (V4DF, V4DF, PCDOUBLE, V4SI, V4DF, INT)
 DEF_FUNCTION_TYPE (V4DF, V4DF, PCDOUBLE, V8SI, V4DF, INT)
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index a4c947a..d00e0ba 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -300,6 +300,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
     def_or_undef (parse_in, "__RDSEED__");
   if (isa_flag & OPTION_MASK_ISA_PRFCHW)
     def_or_undef (parse_in, "__PRFCHW__");
+  if (isa_flag & OPTION_MASK_ISA_ADX)
+    def_or_undef (parse_in, "__ADX__");
   if ((fpmath & FPMATH_SSE) && (isa_flag & OPTION_MASK_ISA_SSE))
     def_or_undef (parse_in, "__SSE_MATH__");
   if ((fpmath & FPMATH_SSE) && (isa_flag & OPTION_MASK_ISA_SSE2))
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index f7a927e..fa755d7 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2769,6 +2769,7 @@ ix86_target_string (HOST_WIDE_INT isa, int flags, const char *arch,
     { "-mhle",		OPTION_MASK_ISA_HLE },
     { "-mrdseed",	OPTION_MASK_ISA_RDSEED },
     { "-mprfchw",	OPTION_MASK_ISA_PRFCHW },
+    { "-madx",		OPTION_MASK_ISA_ADX },
     { "-mtbm",		OPTION_MASK_ISA_TBM },
     { "-mpopcnt",	OPTION_MASK_ISA_POPCNT },
     { "-mmovbe",	OPTION_MASK_ISA_MOVBE },
@@ -3047,6 +3048,7 @@ ix86_option_override_internal (bool main_args_p)
 #define PTA_HLE			(HOST_WIDE_INT_1 << 33)
 #define PTA_PRFCHW		(HOST_WIDE_INT_1 << 34)
 #define PTA_RDSEED		(HOST_WIDE_INT_1 << 35)
+#define PTA_ADX			(HOST_WIDE_INT_1 << 36)
 /* if this reaches 64, need to widen struct pta flags below */
 
   static struct pta
@@ -3538,6 +3540,9 @@ ix86_option_override_internal (bool main_args_p)
 	if (processor_alias_table[i].flags & PTA_RDSEED
 	    && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_RDSEED))
 	  ix86_isa_flags |= OPTION_MASK_ISA_RDSEED;
+	if (processor_alias_table[i].flags & PTA_ADX
+	    && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_ADX))
+	  ix86_isa_flags |= OPTION_MASK_ISA_ADX;
 	if (processor_alias_table[i].flags & (PTA_PREFETCH_SSE | PTA_SSE))
 	  x86_prefetch_sse = true;
 
@@ -4361,6 +4366,7 @@ ix86_valid_target_attribute_inner_p (tree args, char *p_strings[],
     IX86_ATTR_ISA ("hle",	OPT_mhle),
     IX86_ATTR_ISA ("prfchw",	OPT_mprfchw),
     IX86_ATTR_ISA ("rdseed",	OPT_mrdseed),
+    IX86_ATTR_ISA ("adx",	OPT_madx),
 
     /* enum options */
     IX86_ATTR_ENUM ("fpmath=",	OPT_mfpmath_),
@@ -26101,6 +26107,10 @@ enum ix86_builtins
   IX86_BUILTIN_PEXT32,
   IX86_BUILTIN_PEXT64,
 
+  /* ADX instructions.  */
+  IX86_BUILTIN_ADDCARRYX32,
+  IX86_BUILTIN_ADDCARRYX64,
+
   /* FSGSBASE instructions.  */
   IX86_BUILTIN_RDFSBASE32,
   IX86_BUILTIN_RDFSBASE64,
@@ -27951,6 +27961,14 @@ ix86_init_mmx_sse_builtins (void)
 	       "__builtin_ia32_rdseed_di_step",
 	       INT_FTYPE_PULONGLONG, IX86_BUILTIN_RDSEED64_STEP);
 
+  /* ADCX */
+  def_builtin (OPTION_MASK_ISA_ADX, "__builtin_ia32_addcarryx_u32",
+	       UCHAR_FTYPE_UCHAR_UINT_UINT_PINT, IX86_BUILTIN_ADDCARRYX32);
+  def_builtin (OPTION_MASK_ISA_ADX | OPTION_MASK_ISA_64BIT,
+	       "__builtin_ia32_addcarryx_u64",
+	       UCHAR_FTYPE_UCHAR_ULONGLONG_ULONGLONG_PINT,
+	       IX86_BUILTIN_ADDCARRYX64);
+
   /* Add FMA4 multi-arg argument instructions */
   for (i = 0, d = bdesc_multi_arg; i < ARRAY_SIZE (bdesc_multi_arg); i++, d++)
     {
@@ -29466,6 +29484,10 @@ ix86_expand_args_builtin (const struct builtin_description *d,
       nargs = 4;
       nargs_constant = 2;
       break;
+    case UCHAR_FTYPE_UCHAR_UINT_UINT_PINT:
+    case UCHAR_FTYPE_UCHAR_ULONGLONG_ULONGLONG_PINT:
+      nargs = 4;
+      break;
     default:
       gcc_unreachable ();
     }
@@ -30312,7 +30334,36 @@ rdseed_step:
         target = gen_reg_rtx (SImode);
 
       emit_insn (gen_zero_extendqisi2 (target, op2));
+      return target;
+
+    case IX86_BUILTIN_ADDCARRYX32:
+    case IX86_BUILTIN_ADDCARRYX64:
+      mode0 = (fcode == IX86_BUILTIN_ADDCARRYX64) ? DImode : SImode;
+      icode = (fcode == IX86_BUILTIN_ADDCARRYX64)
+		? CODE_FOR_adcxdi
+		: CODE_FOR_adcxsi;
+      arg0 = CALL_EXPR_ARG (exp, 0);
+      arg1 = CALL_EXPR_ARG (exp, 1);
+      arg2 = CALL_EXPR_ARG (exp, 2);
+      arg3 = CALL_EXPR_ARG (exp, 3);
+      if (target == 0)
+	target = gen_reg_rtx (QImode);
+      op1 = expand_normal (arg0);
+      if (!REG_P (op1))
+	op1 = copy_to_mode_reg (QImode, op1);
+      else
+	op1 = gen_rtx_SUBREG (QImode, op1, 0);
+      op2 = expand_normal (arg1);
+      if (!REG_P (op2))
+	op2 = copy_to_mode_reg (mode0, op2);
+      op3 = expand_normal (arg2);
+      if (!REG_P (op3))
+	op3 = copy_to_mode_reg (mode0, op3);
+      op4 = expand_normal (arg3);
 
+      /* Gen ADCX instruction to compute X+Y+CF.  */
+      op0 = gen_reg_rtx (mode0);
+      emit_insn (GEN_FCN (icode) (target, op0, op1, op2, op3, op4));
       return target;
 
     case IX86_BUILTIN_GATHERSIV2DF:
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index a6ce0ce..5869628 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -78,6 +78,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define TARGET_HLE	OPTION_ISA_HLE
 #define TARGET_RDSEED	OPTION_ISA_RDSEED
 #define TARGET_PRFCHW	OPTION_ISA_PRFCHW
+#define TARGET_ADX	OPTION_ISA_ADX
 
 #define TARGET_LP64	OPTION_ABI_64
 #define TARGET_X32	OPTION_ABI_X32
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index ace3b6e..f6c75b1 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -6704,6 +6704,53 @@
 	  (match_operand:MODEF 2 "nonimmediate_operand")))]
   "(TARGET_80387 && X87_ENABLE_ARITH (<MODE>mode))
     || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)")
+
+(define_expand "adcx<mode>"
+  [(parallel
+    [(use (match_operand:SWI48 0 "register_operand"))
+     (set (match_operand:SWI48 1 "nonimmediate_operand")
+	  (plus:SWI48
+	    (match_operand:SWI48 3 "nonimmediate_operand")
+	    (plus:SWI48 (ltu:SWI48 (reg:CC FLAGS_REG) (const_int 0))
+			(match_operand:SWI48 4 "<general_operand>"))))
+     (use (match_operand:SWI48 2 "general_operand"))
+     (use (match_operand:SWI48 5 "general_operand"))
+     (clobber (reg:CC FLAGS_REG))])]
+  "TARGET_ADX"
+{
+  /* Generate CF from input operand.  */
+  emit_insn (gen_addqi3_cc (gen_reg_rtx (QImode), operands[2], constm1_rtx));
+
+  /* Generate the insn.  */
+  emit_insn (gen_adcx<mode>3_carry (operands[1], operands[3], operands[4]));
+
+  /* Store the result to sum.  */
+  if (!address_operand (operands[5], VOIDmode))
+  {
+    operands[5] = convert_memory_address (Pmode, operands[5]);
+    operands[5] = copy_addr_to_reg (operands[5]);
+  }
+  emit_move_insn (gen_rtx_MEM (<MODE>mode, operands[5]), operands[1]);
+
+  /* Return current CF value.  */
+  emit_insn (gen_rtx_SET (QImode, operands[0],
+			  gen_rtx_LTU (QImode, gen_rtx_REG (CCCmode, FLAGS_REG), const0_rtx)));
+
+  DONE;
+})
+
+(define_insn "adcx<mode>3_carry"
+  [(set (match_operand:SWI48 0 "nonimmediate_operand" "=r")
+	(plus:SWI48
+	  (match_operand:SWI48 1 "nonimmediate_operand" "<comm>0")
+	  (plus:SWI48
+	    (reg:CC FLAGS_REG)
+	    (match_operand:SWI48 2 "<general_operand>" "rm"))))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_ADX && ix86_binary_operator_ok (PLUS, <MODE>mode, operands)"
+  "adcx\t{%2, %0|%0, %2}"
+  [(set_attr "use_carry" "1")
+   (set_attr "mode" "<MODE>")])
 \f
 ;; Multiply instructions
 
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index ccada37..e4f78f3 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -540,6 +540,10 @@ mprfchw
 Target Report Mask(ISA_PRFCHW) Var(ix86_isa_flags) Save
 Support PREFETCHW instruction
 
+madx
+Target Report Mask(ISA_ADX) Var(ix86_isa_flags) Save
+Support flag-preserving add-carry instructions
+
 mtbm
 Target Report Mask(ISA_TBM) Var(ix86_isa_flags) Save
 Support TBM built-in functions and code generation
diff --git a/gcc/config/i386/x86intrin.h b/gcc/config/i386/x86intrin.h
index 9dee9ef..dc5c58e 100644
--- a/gcc/config/i386/x86intrin.h
+++ b/gcc/config/i386/x86intrin.h
@@ -105,4 +105,8 @@
 #include <prfchwintrin.h>
 #endif
 
+#ifdef __ADX__
+#include <adxintrin.h>
+#endif
+
 #endif /* _X86INTRIN_H_INCLUDED */
diff --git a/gcc/testsuite/g++.dg/other/i386-2.C b/gcc/testsuite/g++.dg/other/i386-2.C
index 47fda70..197497f 100644
--- a/gcc/testsuite/g++.dg/other/i386-2.C
+++ b/gcc/testsuite/g++.dg/other/i386-2.C
@@ -1,5 +1,5 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw" } */
+/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
diff --git a/gcc/testsuite/g++.dg/other/i386-3.C b/gcc/testsuite/g++.dg/other/i386-3.C
index ad477fa..780731e 100644
--- a/gcc/testsuite/g++.dg/other/i386-3.C
+++ b/gcc/testsuite/g++.dg/other/i386-3.C
@@ -1,5 +1,5 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw" } */
+/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
diff --git a/gcc/testsuite/gcc.target/i386/adx-addcarryx32-1.c b/gcc/testsuite/gcc.target/i386/adx-addcarryx32-1.c
new file mode 100644
index 0000000..daf5779
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-addcarryx32-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-madx -O2" } */
+/* { dg-final { scan-assembler "adcx" } } */
+
+#include <x86intrin.h>
+
+volatile unsigned char c;
+volatile unsigned int x, y;
+unsigned int *sum;
+
+void extern
+adx_test (void)
+{
+    c = _addcarryx_u32 (c, x, y, sum);
+}
diff --git a/gcc/testsuite/gcc.target/i386/adx-addcarryx32-2.c b/gcc/testsuite/gcc.target/i386/adx-addcarryx32-2.c
new file mode 100644
index 0000000..d38d7ee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-addcarryx32-2.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-options "-madx -O2" } */
+/* { dg-require-effective-target adx } */
+
+#include <x86intrin.h>
+#include "adx-check.h"
+
+static void
+adx_test (void)
+{
+  volatile unsigned char c;
+  unsigned int x;
+  volatile unsigned int y, sum_ref;
+
+  c = 0;
+  x = y = 0xFFFFFFFF;
+  sum_ref = 0xFFFFFFFE;
+
+  /* X = 0xFFFFFFFF, Y = 0xFFFFFFFF, C = 0.  */
+  c = _addcarryx_u32 (c, x, y, &x);
+  /* X = 0xFFFFFFFE, Y = 0xFFFFFFFF, C = 1.  */
+  c = _addcarryx_u32 (c, x, y, &x);
+  /* X = 0xFFFFFFFE, Y = 0xFFFFFFFF, C = 1.  */
+
+  if (x != sum_ref)
+    abort ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/adx-addcarryx64-1.c b/gcc/testsuite/gcc.target/i386/adx-addcarryx64-1.c
new file mode 100644
index 0000000..45beca8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-addcarryx64-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-madx -O2" } */
+/* { dg-final { scan-assembler "adcx" } } */
+
+#include <x86intrin.h>
+
+volatile unsigned char c;
+volatile unsigned long long x, y;
+unsigned long long *sum;
+
+void extern
+adx_test (void)
+{
+    c = _addcarryx_u64 (c, x, y, sum);
+}
diff --git a/gcc/testsuite/gcc.target/i386/adx-addcarryx64-2.c b/gcc/testsuite/gcc.target/i386/adx-addcarryx64-2.c
new file mode 100644
index 0000000..6aa2539
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-addcarryx64-2.c
@@ -0,0 +1,27 @@
+/* { dg-do run { target { ! ia32 } } } */
+/* { dg-options "-madx -O2" } */
+/* { dg-require-effective-target adx } */
+
+#include <x86intrin.h>
+#include "adx-check.h"
+
+static void
+adx_test (void)
+{
+  volatile unsigned char c;
+  unsigned long long x;
+  volatile unsigned long long y, sum_ref;
+
+  c = 0;
+  x = y = 0xFFFFFFFFFFFFFFFFLL;
+  sum_ref = 0xFFFFFFFFFFFFFFFELL;
+
+  /* X = 0xFFFFFFFFFFFFFFFF, Y = 0xFFFFFFFFFFFFFFFF, C = 0.  */
+  c = _addcarryx_u64 (c, x, y, &x);
+  /* X = 0xFFFFFFFFFFFFFFFE, Y = 0xFFFFFFFFFFFFFFFF, C = 1.  */
+  c = _addcarryx_u64 (c, x, y, &x);
+  /* X = 0xFFFFFFFFFFFFFFFE, Y = 0xFFFFFFFFFFFFFFFF, C = 1.  */
+
+  if (x != sum_ref)
+    abort ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/adx-check.h b/gcc/testsuite/gcc.target/i386/adx-check.h
new file mode 100644
index 0000000..580cb49
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-check.h
@@ -0,0 +1,40 @@
+#include <stdlib.h>
+#include "cpuid.h"
+
+static void adx_test (void);
+
+static void __attribute__ ((noinline)) do_test (void)
+{
+  adx_test ();
+}
+
+  int
+main ()
+{
+  unsigned int eax, ebx, ecx, edx;
+
+  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+    return 0;
+
+  /* Run ADX test only if host has ADX support.  */
+
+  if (__get_cpuid_max (0, NULL) < 7)
+    return 0;
+
+  __cpuid_count (7, 0, eax, ebx, ecx, edx);
+
+  if ((ebx & bit_ADX) == bit_ADX)
+    {
+      do_test ();
+#ifdef DEBUG
+      printf ("PASSED\n");
+#endif
+      return 0;
+    }
+#ifdef DEBUG
+  printf ("SKIPPED\n");
+#endif
+
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/i386/i386.exp b/gcc/testsuite/gcc.target/i386/i386.exp
index 785a973..37f43a6 100644
--- a/gcc/testsuite/gcc.target/i386/i386.exp
+++ b/gcc/testsuite/gcc.target/i386/i386.exp
@@ -243,6 +243,18 @@ proc check_effective_target_bmi2 { } {
     } "-mbmi2" ]
 }
 
+# Return 1 if ADX instructions can be compiled.
+proc check_effective_target_adx { } {
+    return [check_no_compiler_messages adx object {
+	unsigned char
+	_adxcarry_u32 (unsigned char __CF, unsigned int __X,
+		   unsigned int __Y, unsigned int *__P)
+	{
+	    return __builtin_ia32_addcarryx_u32 (__CF, __X, __Y, __P);
+	}
+    } "-madx" ]
+}
+
 # Return 1 if rtm instructions can be compiled.
 proc check_effective_target_rtm { } {
     return [check_no_compiler_messages rtm object {
diff --git a/gcc/testsuite/gcc.target/i386/sse-12.c b/gcc/testsuite/gcc.target/i386/sse-12.c
index cb3ab18..0d78a0c 100644
--- a/gcc/testsuite/gcc.target/i386/sse-12.c
+++ b/gcc/testsuite/gcc.target/i386/sse-12.c
@@ -3,7 +3,7 @@
    popcntintrin.h and mm_malloc.h are usable
    with -O -std=c89 -pedantic-errors.  */
 /* { dg-do compile } */
-/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw" } */
+/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx" } */
 
 #include <x86intrin.h>
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index fe2bf46..4c575ba 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw" } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx" } */
 
 #include <mm_malloc.h>
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-14.c b/gcc/testsuite/gcc.target/i386/sse-14.c
index 8877e31..c8c13ce 100644
--- a/gcc/testsuite/gcc.target/i386/sse-14.c
+++ b/gcc/testsuite/gcc.target/i386/sse-14.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw" } */
+/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx" } */
 
 #include <mm_malloc.h>
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-22.c b/gcc/testsuite/gcc.target/i386/sse-22.c
index ec5ccb8..ec83255 100644
--- a/gcc/testsuite/gcc.target/i386/sse-22.c
+++ b/gcc/testsuite/gcc.target/i386/sse-22.c
@@ -50,7 +50,7 @@
 
 
 #ifndef DIFFERENT_PRAGMAS
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx")
 #endif
 
 /* Following intrinsics require immediate arguments.  They
@@ -264,7 +264,7 @@ test_2 (_mm_clmulepi64_si128, __m128i, __m128i, __m128i, 1)
 
 /* x86intrin.h (FMA4/XOP/LWP/BMI/BMI2/TBM/LZCNT/FMA). */
 #ifdef DIFFERENT_PRAGMAS
-#pragma GCC target ("fma4,xop,lwp,bmi,bmi2,tbm,lzcnt,fma,rdseed,prfchw")
+#pragma GCC target ("fma4,xop,lwp,bmi,bmi2,tbm,lzcnt,fma,rdseed,prfchw,adx")
 #endif
 #include <x86intrin.h>
 /* xopintrin.h */
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index 3b26d99..f046ef6 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -183,7 +183,7 @@
 /* rtmintrin.h */
 #define __builtin_ia32_xabort(M) __builtin_ia32_xabort(1)
 
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx")
 #include <wmmintrin.h>
 #include <smmintrin.h>
 #include <mm3dnow.h>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Intrinsics for ADCX
  2012-07-31 11:51 [PATCH] Intrinsics for ADCX Michael Zolotukhin
@ 2012-07-31 13:26 ` Uros Bizjak
  2012-07-31 16:24 ` Richard Henderson
  1 sibling, 0 replies; 14+ messages in thread
From: Uros Bizjak @ 2012-07-31 13:26 UTC (permalink / raw)
  To: Michael Zolotukhin; +Cc: Jakub Jelinek, gcc-patches, H.J. Lu, Kirill Yukhin

On Tue, Jul 31, 2012 at 1:33 PM, Michael Zolotukhin
<michael.v.zolotukhin@gmail.com> wrote:
> Hi guys,
> Here is a third part of patch, refactored by Kirill. This one adds
> _addcarryx_u[32|64]  intrinsics.
>
> Is it ok?
>
> Changelog entry:
> 2012-07-31 Michael Zolotukhin <michael.v.zolotukhin@intel.com>
>
>         * common/config/i386/i386-common.c (OPTION_MASK_ISA_ADX_SET): New.
>         (OPTION_MASK_ISA_ADX_UNSET): Likewise.
>         (ix86_handle_option): Handle madx option.
>         * config.gcc (i[34567]86-*-*): Add adxintrin.h.
>         (x86_64-*-*): Likewise.
>         * config/i386/adxintrin.h: New header.
>         * config/i386/driver-i386.c (host_detect_local_cpu): Detect ADCX/ADOX
>         support.
>         * config/i386/i386-builtin-types.def
>         (UCHAR_FTYPE_UCHAR_UINT_UINT_PINT): New function type.
>         (UCHAR_FTYPE_UCHAR_ULONGLONG_ULONGLONG_PINT): Likewise.
>         * config/i386/i386-c.c: Define __ADX__ if needed.
>         * config/i386/i386.c (ix86_target_string): Define -madx option.
>         (PTA_ADX): New.
>         (ix86_option_override_internal): Handle new option.
>         (ix86_valid_target_attribute_inner_p): Add OPT_madx.
>         (ix86_builtins): Add IX86_BUILTIN_ADDCARRYX32,
>         IX86_BUILTIN_ADDCARRYX64.
>         (ix86_init_mmx_sse_builtins): Define corresponding built-ins.
>         (ix86_expand_builtin): Handle these built-ins.
>         (ix86_expand_args_builtin): Handle new function types.
>         * config/i386/i386.h (TARGET_ADX): New.
>         * config/i386/i386.md (adcx<mode>): New define_expand.
>         (adcx<mode>_carry): New define_insn.
>         * config/i386/i386.opt (madx): New.
>         * config/i386/x86intrin.h: Include adxintrin.h.
>
> testsuite/Changelog entry:
> 2012-07-31 Michael Zolotukhin <michael.v.zolotukhin@intel.com>
>
>         * gcc.target/i386/adx-addcarryx32-1.c: New.
>         * gcc.target/i386/adx-addcarryx32-2.c: New.
>         * gcc.target/i386/adx-addcarryx64-1.c: New.
>         * gcc.target/i386/adx-addcarryx64-2.c: New.
>         * gcc.target/i386/adx-check.h: New.
>         * gcc.target/i386/i386.exp (check_effective_target_adx): New.
>         * gcc.target/i386/sse-12.c: Add -madx.
>         * gcc.target/i386/sse-13.c: Ditto.
>         * gcc.target/i386/sse-14.c: Ditto.
>         * gcc.target/i386/sse-22.c: Ditto.
>         * gcc.target/i386/sse-23.c: Ditto.
>         * g++.dg/other/i386-2.C: Ditto.
>         * g++.dg/other/i386-3.C: Ditto.
>
>
> Bootstrap and new tests are passing, other testing is in progress.

Following is the correct definition of new insn:

--cut here--
Index: i386.md
===================================================================
--- i386.md     (revision 190005)
+++ i386.md     (working copy)
@@ -6604,6 +6604,27 @@
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "<MODE>")])

+(define_insn "adcx<mode>3"
+  [(set (reg:CCC FLAGS_REG)
+       (compare
+         (plus:SWI48
+           (match_operand:SWI48 1 "nonimmediate_operand" "%0")
+           (plus:SWI48
+             (match_operator 4 "ix86_carry_flag_operator"
+              [(match_operand 3 "flags_reg_operand") (const_int 0)])
+             (match_operand:SWI48 2 "nonimmediate_operand" "rm")))
+         (const_int 0)))
+   (set (match_operand:SWI48 0 "register_operand" "=r")
+       (plus:SWI48 (match_dup 1)
+                   (plus:SWI48 (match_op_dup 4
+                                [(match_dup 3) (const_int 0)])
+                               (match_dup 2))))]
+  "TARGET_ADX && ix86_binary_operator_ok (PLUS, <MODE>mode, operands)"
+  "adcx\t{%2, %0|%0, %2}"
+  [(set_attr "type" "alu")
+   (set_attr "use_carry" "1")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn "*addsi3_carry_zext"
   [(set (match_operand:DI 0 "register_operand" "=r")
        (zero_extend:DI
--cut here--

You don't need expander to emit insns via emit_insn (gen_<whatever>).
Please put the code from the expander back to i386.c and rewrite the
sequence according to new insn pattern.

+  /* Generate CF from input operand.  */
+  emit_insn (gen_addqi3_cc (gen_reg_rtx (QImode), operands[2], constm1_rtx));

This insn should be in correct mode, you can make the pattern public if needed.

+      if (!REG_P (op1))
+	op1 = copy_to_mode_reg (QImode, op1);
+      else
+	op1 = gen_rtx_SUBREG (QImode, op1, 0);

This is not needed, just pass the register in the correct mode. You
should use something like:

  if (!insn_data[icode].operand[2].predicate (op1, mode1))
    op1 = copy_to_mode_reg (mode1, op1);

Uros.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Intrinsics for ADCX
  2012-07-31 11:51 [PATCH] Intrinsics for ADCX Michael Zolotukhin
  2012-07-31 13:26 ` Uros Bizjak
@ 2012-07-31 16:24 ` Richard Henderson
  2012-08-01 16:37   ` Kirill Yukhin
  1 sibling, 1 reply; 14+ messages in thread
From: Richard Henderson @ 2012-07-31 16:24 UTC (permalink / raw)
  To: Michael Zolotukhin
  Cc: Uros Bizjak, Jakub Jelinek, gcc-patches, H.J. Lu, Kirill Yukhin

On 2012-07-31 04:33, Michael Zolotukhin wrote:
> Here is a third part of patch, refactored by Kirill. This one adds
> _addcarryx_u[32|64]  intrinsics.

Frankly I don't understand the point of these instructions
being added to the ISA at all.  I would have understood an
add-with-carry that did *not* modify the flags at all, but
two separate ones that modify C and O separately is just
downright strange.

But to the point: I don't understand the point of having
this as a builtin.  Is the code generated by this builtin
any better than plain C?

And if you're going to have the builtin, why is this restricted
to adx anyway?  You obviously can produce the same results with
the good old fashioned adc instruction as well.

Which begs the question of why you've got a separate pattern
for the adx anyway.  If the insn is so much better, it ought to
be used in the same pattern we use for adc now.


r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Intrinsics for ADCX
  2012-07-31 16:24 ` Richard Henderson
@ 2012-08-01 16:37   ` Kirill Yukhin
  2012-08-03 13:24     ` Michael Zolotukhin
  2012-08-09 12:22     ` Michael Zolotukhin
  0 siblings, 2 replies; 14+ messages in thread
From: Kirill Yukhin @ 2012-08-01 16:37 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Michael Zolotukhin, Uros Bizjak, Jakub Jelinek, gcc-patches, H.J. Lu

Hi Richard,

> Frankly I don't understand the point of these instructions
> being added to the ISA at all.  I would have understood an
> add-with-carry that did *not* modify the flags at all, but
> two separate ones that modify C and O separately is just
> downright strange.
If there is only one carry in flight, they all are equivalent although
ADOX is a little less useful in loops.
If there are two carries in flight, that’s where the new instructions
show their benefit, since they allow accumulation without destroying
each other (see next comment).
For any number of carries beyond two, you have to start saving
restoring carry bits and it degenerates to the first case for some of
them.

> But to the point: I don't understand the point of having
> this as a builtin.  Is the code generated by this builtin
> any better than plain C?
I think this is just like a practice to introduce new intrinsics for new insns.
I doubt, that we may generate such things automatically:
c1 = 0;
c2 = 0;
c1 = _adcx64( & res[i], src[i], src2[i], c1);
c1 = _adcx64( & res[i+1], src[i+1], src2[i+1], c1);
c2 = _adcx64( & res[i], src[i], src2[i], c2);
c2 = _adcx64( & res[i+1], src[i+1], src2[i+1], c2);

> And if you're going to have the builtin, why is this restricted
> to adx anyway?  You obviously can produce the same results with
> the good old fashioned adc instruction as well.
We have one intrinsic for both ADCX/ADOX. So, we just picked up first
one to use when exanding the built-in

> Which begs the question of why you've got a separate pattern
> for the adx anyway.  If the insn is so much better, it ought to
> be used in the same pattern we use for adc now.
I believe, we may introduce global variant of ADCX, which may be
expanded into either of ADC/ADCX/ADOX on x86 and into analogs
on the other ports.

K

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Intrinsics for ADCX
  2012-08-01 16:37   ` Kirill Yukhin
@ 2012-08-03 13:24     ` Michael Zolotukhin
  2012-08-03 13:52       ` Uros Bizjak
  2012-08-09 12:22     ` Michael Zolotukhin
  1 sibling, 1 reply; 14+ messages in thread
From: Michael Zolotukhin @ 2012-08-03 13:24 UTC (permalink / raw)
  To: Kirill Yukhin
  Cc: Richard Henderson, Uros Bizjak, Jakub Jelinek, gcc-patches, H.J. Lu

[-- Attachment #1: Type: text/plain, Size: 4458 bytes --]

Hi,
I made a new version of the patch, where I tried to take into account
Uros' remarks - is it ok for trunk?

Bootstrap and new tests are passing, testing is in progress.

Changelog entry:
2012-08-03 Michael Zolotukhin <michael.v.zolotukhin@intel.com>

        * common/config/i386/i386-common.c (OPTION_MASK_ISA_ADX_SET): New.
        (OPTION_MASK_ISA_ADX_UNSET): Likewise.
        (ix86_handle_option): Handle madx option.
        * config.gcc (i[34567]86-*-*): Add adxintrin.h.
        (x86_64-*-*): Likewise.
        * config/i386/adxintrin.h: New header.
        * config/i386/driver-i386.c (host_detect_local_cpu): Detect ADCX/ADOX
        support.
        * config/i386/i386-builtin-types.def
        (UCHAR_FTYPE_UCHAR_UINT_UINT_PINT): New function type.
        (UCHAR_FTYPE_UCHAR_ULONGLONG_ULONGLONG_PINT): Likewise.
        * config/i386/i386-c.c: Define __ADX__ if needed.
        * config/i386/i386.c (ix86_target_string): Define -madx option.
        (PTA_ADX): New.
        (ix86_option_override_internal): Handle new option.
        (ix86_valid_target_attribute_inner_p): Add OPT_madx.
        (ix86_builtins): Add IX86_BUILTIN_ADDCARRYX32,
        IX86_BUILTIN_ADDCARRYX64.
        (ix86_init_mmx_sse_builtins): Define corresponding built-ins.
        (ix86_expand_builtin): Handle these built-ins.
        (ix86_expand_args_builtin): Handle new function types.
        * config/i386/i386.h (TARGET_ADX): New.
        * config/i386/i386.md (adcx<mode>3): New define_insn.
        * config/i386/i386.opt (madx): New.
        * config/i386/x86intrin.h: Include adxintrin.h.

testsuite/Changelog entry:
2012-08-03 Michael Zolotukhin <michael.v.zolotukhin@intel.com>

        * gcc.target/i386/adx-addcarryx32-1.c: New.
        * gcc.target/i386/adx-addcarryx32-2.c: New.
        * gcc.target/i386/adx-addcarryx64-1.c: New.
        * gcc.target/i386/adx-addcarryx64-2.c: New.
        * gcc.target/i386/adx-check.h: New.
        * gcc.target/i386/i386.exp (check_effective_target_adx): New.
        * gcc.target/i386/sse-12.c: Add -madx.
        * gcc.target/i386/sse-13.c: Ditto.
        * gcc.target/i386/sse-14.c: Ditto.
        * gcc.target/i386/sse-22.c: Ditto.
        * gcc.target/i386/sse-23.c: Ditto.
        * g++.dg/other/i386-2.C: Ditto.
        * g++.dg/other/i386-3.C: Ditto.


Thanks, Michael

On 1 August 2012 20:37, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:
> Hi Richard,
>
>> Frankly I don't understand the point of these instructions
>> being added to the ISA at all.  I would have understood an
>> add-with-carry that did *not* modify the flags at all, but
>> two separate ones that modify C and O separately is just
>> downright strange.
> If there is only one carry in flight, they all are equivalent although
> ADOX is a little less useful in loops.
> If there are two carries in flight, that’s where the new instructions
> show their benefit, since they allow accumulation without destroying
> each other (see next comment).
> For any number of carries beyond two, you have to start saving
> restoring carry bits and it degenerates to the first case for some of
> them.
>
>> But to the point: I don't understand the point of having
>> this as a builtin.  Is the code generated by this builtin
>> any better than plain C?
> I think this is just like a practice to introduce new intrinsics for new insns.
> I doubt, that we may generate such things automatically:
> c1 = 0;
> c2 = 0;
> c1 = _adcx64( & res[i], src[i], src2[i], c1);
> c1 = _adcx64( & res[i+1], src[i+1], src2[i+1], c1);
> c2 = _adcx64( & res[i], src[i], src2[i], c2);
> c2 = _adcx64( & res[i+1], src[i+1], src2[i+1], c2);
>
>> And if you're going to have the builtin, why is this restricted
>> to adx anyway?  You obviously can produce the same results with
>> the good old fashioned adc instruction as well.
> We have one intrinsic for both ADCX/ADOX. So, we just picked up first
> one to use when exanding the built-in
>
>> Which begs the question of why you've got a separate pattern
>> for the adx anyway.  If the insn is so much better, it ought to
>> be used in the same pattern we use for adc now.
> I believe, we may introduce global variant of ADCX, which may be
> expanded into either of ADC/ADCX/ADOX on x86 and into analogs
> on the other ports.
>
> K

-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.

[-- Attachment #2: bdw-adx-3.gcc.patch --]
[-- Type: application/octet-stream, Size: 24691 bytes --]

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 70dcae0..e05cd56 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -57,6 +57,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_RTM_SET OPTION_MASK_ISA_RTM
 #define OPTION_MASK_ISA_PRFCHW_SET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_SET OPTION_MASK_ISA_RDSEED
+#define OPTION_MASK_ISA_ADX_SET OPTION_MASK_ISA_ADX
 
 /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
    as -msse4.2.  */
@@ -127,6 +128,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_RTM_UNSET OPTION_MASK_ISA_RTM
 #define OPTION_MASK_ISA_PRFCHW_UNSET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_UNSET OPTION_MASK_ISA_RDSEED
+#define OPTION_MASK_ISA_ADX_UNSET OPTION_MASK_ISA_ADX
 
 /* SSE4 includes both SSE4.1 and SSE4.2.  -mno-sse4 should the same
    as -mno-sse4.1. */
@@ -598,6 +600,19 @@ ix86_handle_option (struct gcc_options *opts,
 	}
       return true;
 
+    case OPT_madx:
+      if (value)
+	{
+	  opts->x_ix86_isa_flags |= OPTION_MASK_ISA_ADX_SET;
+	  opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_ADX_SET;
+	}
+      else
+	{
+	  opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_ADX_UNSET;
+	  opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_ADX_UNSET;
+	}
+      return true;
+
   /* Comes from final.c -- no real reason to change it.  */
 #define MAX_CODE_ALIGN 16
 
diff --git a/gcc/config.gcc b/gcc/config.gcc
index dad4c3a..f40ac0e 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -361,7 +361,7 @@ i[34567]86-*-*)
 		       ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
 		       lzcntintrin.h bmiintrin.h bmi2intrin.h tbmintrin.h
 		       avx2intrin.h fmaintrin.h f16cintrin.h rtmintrin.h
-		       xtestintrin.h rdseedintrin.h prfchwintrin.h"
+		       xtestintrin.h rdseedintrin.h prfchwintrin.h adxintrin.h"
 	;;
 x86_64-*-*)
 	cpu_type=i386
@@ -375,7 +375,7 @@ x86_64-*-*)
 		       ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
 		       lzcntintrin.h bmiintrin.h tbmintrin.h bmi2intrin.h
 		       avx2intrin.h fmaintrin.h f16cintrin.h rtmintrin.h
-		       xtestintrin.h rdseedintrin.h prfchwintrin.h"
+		       xtestintrin.h rdseedintrin.h prfchwintrin.h adxintrin.h"
 	need_64bit_hwint=yes
 	;;
 ia64-*-*)
diff --git a/gcc/config/i386/adxintrin.h b/gcc/config/i386/adxintrin.h
new file mode 100644
index 0000000..4f8fb0b
--- /dev/null
+++ b/gcc/config/i386/adxintrin.h
@@ -0,0 +1,51 @@
+/* Copyright (C) 2012 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#if !defined _X86INTRIN_H_INCLUDED && !defined _IMMINTRIN_H_INCLUDED
+# error "Never use <adxintrin.h> directly; include <x86intrin.h> instead."
+#endif
+
+#ifndef __ADX__
+# error "Flag-preserving add-carry instructions not enabled"
+#endif /* __ADX__ */
+
+#ifndef _ADXINTRIN_H_INCLUDED
+#define _ADXINTRIN_H_INCLUDED
+
+extern __inline unsigned char
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_addcarryx_u32 (unsigned char __CF, unsigned int __X,
+		unsigned int __Y, unsigned int *__P)
+{
+    return __builtin_ia32_addcarryx_u32 (__CF, __X, __Y, __P);
+}
+
+extern __inline unsigned char
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_addcarryx_u64 (unsigned char __CF, unsigned long __X,
+		unsigned long __Y, unsigned long long *__P)
+{
+    return __builtin_ia32_addcarryx_u64 (__CF, __X, __Y, __P);
+}
+
+#endif /* _ADXINTRIN_H_INCLUDED */
diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index 4616108..0b56f3f 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -399,7 +399,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
   unsigned int has_bmi = 0, has_bmi2 = 0, has_tbm = 0, has_lzcnt = 0;
   unsigned int has_hle = 0, has_rtm = 0;
   unsigned int has_rdrnd = 0, has_f16c = 0, has_fsgsbase = 0;
-  unsigned int has_rdseed = 0, has_prfchw = 0;
+  unsigned int has_rdseed = 0, has_prfchw = 0, has_adx = 0;
 
   bool arch;
 
@@ -468,6 +468,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
       has_fsgsbase = ebx & bit_FSGSBASE;
       has_rdseed = ebx & bit_RDSEED;
       has_prfchw = ecx & bit_PRFCHW;
+      has_adx = ebx & bit_ADX;
     }
 
   /* Check cpuid level of extended features.  */
@@ -750,11 +751,12 @@ const char *host_detect_local_cpu (int argc, const char **argv)
       const char *fsgsbase = has_fsgsbase ? " -mfsgsbase" : " -mno-fsgsbase";
       const char *rdseed = has_rdseed ? " -mrdseed" : " -mno-rdseed";
       const char *prfchw = has_prfchw ? " -mprfchw" : " -mno-prfchw";
+      const char *adx = has_adx ? " -madx" : " -mno-adx";
 
       options = concat (options, cx16, sahf, movbe, ase, pclmul,
 			popcnt, abm, lwp, fma, fma4, xop, bmi, bmi2,
 			tbm, avx, avx2, sse4_2, sse4_1, lzcnt, rtm,
-			hle, rdrnd, f16c, fsgsbase, rdseed, prfchw, NULL);
+			hle, rdrnd, f16c, fsgsbase, rdseed, prfchw, adx, NULL);
     }
 
 done:
diff --git a/gcc/config/i386/i386-builtin-types.def b/gcc/config/i386/i386-builtin-types.def
index 398bf0a..41141f0 100644
--- a/gcc/config/i386/i386-builtin-types.def
+++ b/gcc/config/i386/i386-builtin-types.def
@@ -446,6 +446,9 @@ DEF_FUNCTION_TYPE (V16QI, V16QI, INT, V16QI, INT, INT)
 
 DEF_FUNCTION_TYPE (V8QI, QI, QI, QI, QI, QI, QI, QI, QI)
 
+DEF_FUNCTION_TYPE (UCHAR, UCHAR, UINT, UINT, PINT)
+DEF_FUNCTION_TYPE (UCHAR, UCHAR, ULONGLONG, ULONGLONG, PINT)
+
 DEF_FUNCTION_TYPE (V2DF, V2DF, PCDOUBLE, V4SI, V2DF, INT)
 DEF_FUNCTION_TYPE (V4DF, V4DF, PCDOUBLE, V4SI, V4DF, INT)
 DEF_FUNCTION_TYPE (V4DF, V4DF, PCDOUBLE, V8SI, V4DF, INT)
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index a4c947a..d00e0ba 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -300,6 +300,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
     def_or_undef (parse_in, "__RDSEED__");
   if (isa_flag & OPTION_MASK_ISA_PRFCHW)
     def_or_undef (parse_in, "__PRFCHW__");
+  if (isa_flag & OPTION_MASK_ISA_ADX)
+    def_or_undef (parse_in, "__ADX__");
   if ((fpmath & FPMATH_SSE) && (isa_flag & OPTION_MASK_ISA_SSE))
     def_or_undef (parse_in, "__SSE_MATH__");
   if ((fpmath & FPMATH_SSE) && (isa_flag & OPTION_MASK_ISA_SSE2))
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1772dc6..0b7172d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2769,6 +2769,7 @@ ix86_target_string (HOST_WIDE_INT isa, int flags, const char *arch,
     { "-mhle",		OPTION_MASK_ISA_HLE },
     { "-mrdseed",	OPTION_MASK_ISA_RDSEED },
     { "-mprfchw",	OPTION_MASK_ISA_PRFCHW },
+    { "-madx",		OPTION_MASK_ISA_ADX },
     { "-mtbm",		OPTION_MASK_ISA_TBM },
     { "-mpopcnt",	OPTION_MASK_ISA_POPCNT },
     { "-mmovbe",	OPTION_MASK_ISA_MOVBE },
@@ -3047,6 +3048,7 @@ ix86_option_override_internal (bool main_args_p)
 #define PTA_HLE			(HOST_WIDE_INT_1 << 33)
 #define PTA_PRFCHW		(HOST_WIDE_INT_1 << 34)
 #define PTA_RDSEED		(HOST_WIDE_INT_1 << 35)
+#define PTA_ADX			(HOST_WIDE_INT_1 << 36)
 /* if this reaches 64, need to widen struct pta flags below */
 
   static struct pta
@@ -3538,6 +3540,9 @@ ix86_option_override_internal (bool main_args_p)
 	if (processor_alias_table[i].flags & PTA_RDSEED
 	    && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_RDSEED))
 	  ix86_isa_flags |= OPTION_MASK_ISA_RDSEED;
+	if (processor_alias_table[i].flags & PTA_ADX
+	    && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_ADX))
+	  ix86_isa_flags |= OPTION_MASK_ISA_ADX;
 	if (processor_alias_table[i].flags & (PTA_PREFETCH_SSE | PTA_SSE))
 	  x86_prefetch_sse = true;
 
@@ -4361,6 +4366,7 @@ ix86_valid_target_attribute_inner_p (tree args, char *p_strings[],
     IX86_ATTR_ISA ("hle",	OPT_mhle),
     IX86_ATTR_ISA ("prfchw",	OPT_mprfchw),
     IX86_ATTR_ISA ("rdseed",	OPT_mrdseed),
+    IX86_ATTR_ISA ("adx",	OPT_madx),
 
     /* enum options */
     IX86_ATTR_ENUM ("fpmath=",	OPT_mfpmath_),
@@ -26107,6 +26113,10 @@ enum ix86_builtins
   IX86_BUILTIN_PEXT32,
   IX86_BUILTIN_PEXT64,
 
+  /* ADX instructions.  */
+  IX86_BUILTIN_ADDCARRYX32,
+  IX86_BUILTIN_ADDCARRYX64,
+
   /* FSGSBASE instructions.  */
   IX86_BUILTIN_RDFSBASE32,
   IX86_BUILTIN_RDFSBASE64,
@@ -27957,6 +27967,14 @@ ix86_init_mmx_sse_builtins (void)
 	       "__builtin_ia32_rdseed_di_step",
 	       INT_FTYPE_PULONGLONG, IX86_BUILTIN_RDSEED64_STEP);
 
+  /* ADCX */
+  def_builtin (OPTION_MASK_ISA_ADX, "__builtin_ia32_addcarryx_u32",
+	       UCHAR_FTYPE_UCHAR_UINT_UINT_PINT, IX86_BUILTIN_ADDCARRYX32);
+  def_builtin (OPTION_MASK_ISA_ADX | OPTION_MASK_ISA_64BIT,
+	       "__builtin_ia32_addcarryx_u64",
+	       UCHAR_FTYPE_UCHAR_ULONGLONG_ULONGLONG_PINT,
+	       IX86_BUILTIN_ADDCARRYX64);
+
   /* Add FMA4 multi-arg argument instructions */
   for (i = 0, d = bdesc_multi_arg; i < ARRAY_SIZE (bdesc_multi_arg); i++, d++)
     {
@@ -29472,6 +29490,10 @@ ix86_expand_args_builtin (const struct builtin_description *d,
       nargs = 4;
       nargs_constant = 2;
       break;
+    case UCHAR_FTYPE_UCHAR_UINT_UINT_PINT:
+    case UCHAR_FTYPE_UCHAR_ULONGLONG_ULONGLONG_PINT:
+      nargs = 4;
+      break;
     default:
       gcc_unreachable ();
     }
@@ -30318,7 +30340,62 @@ rdseed_step:
         target = gen_reg_rtx (SImode);
 
       emit_insn (gen_zero_extendqisi2 (target, op2));
+      return target;
+
+    case IX86_BUILTIN_ADDCARRYX32:
+      icode = CODE_FOR_adcxsi3;
+      mode0 = SImode;
+      goto addcarryx;
+
+    case IX86_BUILTIN_ADDCARRYX64:
+      icode = CODE_FOR_adcxdi3;
+      mode0 = DImode;
+
+addcarryx:
+      arg0 = CALL_EXPR_ARG (exp, 0); /* unsigned char c_in.  */
+      arg1 = CALL_EXPR_ARG (exp, 1); /* unsigned int src1.  */
+      arg2 = CALL_EXPR_ARG (exp, 2); /* unsigned int src2.  */
+      arg3 = CALL_EXPR_ARG (exp, 3); /* unsigned int *sum_out.  */
+
+      op0 = gen_reg_rtx (QImode);
+
+      /* Generate CF from input operand.  */
+      op1 = expand_normal (arg0);
+      if (GET_MODE (op1) != QImode)
+	op1 = convert_to_mode (QImode, op1, 1);
+      op1 = copy_to_mode_reg (QImode, op1);
+      emit_insn (gen_addqi3_cc (op0, op1, constm1_rtx));
+
+      /* Gen ADCX instruction to compute X+Y+CF.  */
+      op2 = expand_normal (arg1);
+      op3 = expand_normal (arg2);
+
+      if (!REG_P (op2))
+	op2 = copy_to_mode_reg (mode0, op2);
+      if (!REG_P (op3))
+	op3 = copy_to_mode_reg (mode0, op3);
+
+      op0 = gen_reg_rtx (mode0);
+
+      op4 = gen_rtx_REG (CCmode, FLAGS_REG);
+      pat = gen_rtx_LTU (VOIDmode, op4, const0_rtx);
+      emit_insn (GEN_FCN (icode) (op0, op2, op3, op4, pat));
+
+      /* Store the result.  */
+      op4 = expand_normal (arg3);
+      if (!address_operand (op4, VOIDmode))
+	{
+	  op4 = convert_memory_address (Pmode, op4);
+	  op4 = copy_addr_to_reg (op4);
+	}
+      emit_move_insn (gen_rtx_MEM (mode0, op4), op0);
+
+      /* Return current CF value.  */
+      if (target == 0)
+        target = gen_reg_rtx (QImode);
 
+      PUT_MODE (pat, QImode);
+      emit_insn (gen_rtx_SET (VOIDmode, target, pat));
       return target;
 
     case IX86_BUILTIN_GATHERSIV2DF:
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index a6ce0ce..5869628 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -78,6 +78,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define TARGET_HLE	OPTION_ISA_HLE
 #define TARGET_RDSEED	OPTION_ISA_RDSEED
 #define TARGET_PRFCHW	OPTION_ISA_PRFCHW
+#define TARGET_ADX	OPTION_ISA_ADX
 
 #define TARGET_LP64	OPTION_ABI_64
 #define TARGET_X32	OPTION_ABI_X32
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index ace3b6e..6774ae2 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -6633,6 +6633,29 @@
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "SI")])
 \f
+;; ADCX instruction
+
+(define_insn "adcx<mode>3"
+  [(set (reg:CCC FLAGS_REG)
+	(compare:CCC
+	  (plus:SWI48
+	    (match_operand:SWI48 1 "nonimmediate_operand" "%0")
+	    (plus:SWI48
+	      (match_operator 4 "ix86_carry_flag_operator"
+	       [(match_operand 3 "flags_reg_operand") (const_int 0)])
+	      (match_operand:SWI48 2 "nonimmediate_operand" "rm")))
+	  (const_int 0)))
+   (set (match_operand:SWI48 0 "register_operand" "=r")
+	(plus:SWI48 (match_dup 1)
+		    (plus:SWI48 (match_op_dup 4
+				 [(match_dup 3) (const_int 0)])
+				(match_dup 2))))]
+  "TARGET_ADX && ix86_binary_operator_ok (PLUS, <MODE>mode, operands)"
+  "adcx\t{%2, %0|%0, %2}"
+  [(set_attr "type" "alu")
+   (set_attr "use_carry" "1")
+   (set_attr "mode" "<MODE>")])
+\f
 ;; Overflow setting add and subtract instructions
 
 (define_insn "*add<mode>3_cconly_overflow"
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index ccada37..e4f78f3 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -540,6 +540,10 @@ mprfchw
 Target Report Mask(ISA_PRFCHW) Var(ix86_isa_flags) Save
 Support PREFETCHW instruction
 
+madx
+Target Report Mask(ISA_ADX) Var(ix86_isa_flags) Save
+Support flag-preserving add-carry instructions
+
 mtbm
 Target Report Mask(ISA_TBM) Var(ix86_isa_flags) Save
 Support TBM built-in functions and code generation
diff --git a/gcc/config/i386/x86intrin.h b/gcc/config/i386/x86intrin.h
index 9dee9ef..dc5c58e 100644
--- a/gcc/config/i386/x86intrin.h
+++ b/gcc/config/i386/x86intrin.h
@@ -105,4 +105,8 @@
 #include <prfchwintrin.h>
 #endif
 
+#ifdef __ADX__
+#include <adxintrin.h>
+#endif
+
 #endif /* _X86INTRIN_H_INCLUDED */
diff --git a/gcc/testsuite/g++.dg/other/i386-2.C b/gcc/testsuite/g++.dg/other/i386-2.C
index 47fda70..197497f 100644
--- a/gcc/testsuite/g++.dg/other/i386-2.C
+++ b/gcc/testsuite/g++.dg/other/i386-2.C
@@ -1,5 +1,5 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw" } */
+/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
diff --git a/gcc/testsuite/g++.dg/other/i386-3.C b/gcc/testsuite/g++.dg/other/i386-3.C
index ad477fa..780731e 100644
--- a/gcc/testsuite/g++.dg/other/i386-3.C
+++ b/gcc/testsuite/g++.dg/other/i386-3.C
@@ -1,5 +1,5 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw" } */
+/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
diff --git a/gcc/testsuite/gcc.target/i386/adx-addcarryx32-1.c b/gcc/testsuite/gcc.target/i386/adx-addcarryx32-1.c
new file mode 100644
index 0000000..daf5779
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-addcarryx32-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-madx -O2" } */
+/* { dg-final { scan-assembler "adcx" } } */
+
+#include <x86intrin.h>
+
+volatile unsigned char c;
+volatile unsigned int x, y;
+unsigned int *sum;
+
+void extern
+adx_test (void)
+{
+    c = _addcarryx_u32 (c, x, y, sum);
+}
diff --git a/gcc/testsuite/gcc.target/i386/adx-addcarryx32-2.c b/gcc/testsuite/gcc.target/i386/adx-addcarryx32-2.c
new file mode 100644
index 0000000..d38d7ee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-addcarryx32-2.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-options "-madx -O2" } */
+/* { dg-require-effective-target adx } */
+
+#include <x86intrin.h>
+#include "adx-check.h"
+
+static void
+adx_test (void)
+{
+  volatile unsigned char c;
+  unsigned int x;
+  volatile unsigned int y, sum_ref;
+
+  c = 0;
+  x = y = 0xFFFFFFFF;
+  sum_ref = 0xFFFFFFFE;
+
+  /* X = 0xFFFFFFFF, Y = 0xFFFFFFFF, C = 0.  */
+  c = _addcarryx_u32 (c, x, y, &x);
+  /* X = 0xFFFFFFFE, Y = 0xFFFFFFFF, C = 1.  */
+  c = _addcarryx_u32 (c, x, y, &x);
+  /* X = 0xFFFFFFFE, Y = 0xFFFFFFFF, C = 1.  */
+
+  if (x != sum_ref)
+    abort ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/adx-addcarryx64-1.c b/gcc/testsuite/gcc.target/i386/adx-addcarryx64-1.c
new file mode 100644
index 0000000..45beca8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-addcarryx64-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-madx -O2" } */
+/* { dg-final { scan-assembler "adcx" } } */
+
+#include <x86intrin.h>
+
+volatile unsigned char c;
+volatile unsigned long long x, y;
+unsigned long long *sum;
+
+void extern
+adx_test (void)
+{
+    c = _addcarryx_u64 (c, x, y, sum);
+}
diff --git a/gcc/testsuite/gcc.target/i386/adx-addcarryx64-2.c b/gcc/testsuite/gcc.target/i386/adx-addcarryx64-2.c
new file mode 100644
index 0000000..6aa2539
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-addcarryx64-2.c
@@ -0,0 +1,27 @@
+/* { dg-do run { target { ! ia32 } } } */
+/* { dg-options "-madx -O2" } */
+/* { dg-require-effective-target adx } */
+
+#include <x86intrin.h>
+#include "adx-check.h"
+
+static void
+adx_test (void)
+{
+  volatile unsigned char c;
+  unsigned long long x;
+  volatile unsigned long long y, sum_ref;
+
+  c = 0;
+  x = y = 0xFFFFFFFFFFFFFFFFLL;
+  sum_ref = 0xFFFFFFFFFFFFFFFELL;
+
+  /* X = 0xFFFFFFFFFFFFFFFF, Y = 0xFFFFFFFFFFFFFFFF, C = 0.  */
+  c = _addcarryx_u64 (c, x, y, &x);
+  /* X = 0xFFFFFFFFFFFFFFFE, Y = 0xFFFFFFFFFFFFFFFF, C = 1.  */
+  c = _addcarryx_u64 (c, x, y, &x);
+  /* X = 0xFFFFFFFFFFFFFFFE, Y = 0xFFFFFFFFFFFFFFFF, C = 1.  */
+
+  if (x != sum_ref)
+    abort ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/adx-check.h b/gcc/testsuite/gcc.target/i386/adx-check.h
new file mode 100644
index 0000000..580cb49
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-check.h
@@ -0,0 +1,40 @@
+#include <stdlib.h>
+#include "cpuid.h"
+
+static void adx_test (void);
+
+static void __attribute__ ((noinline)) do_test (void)
+{
+  adx_test ();
+}
+
+  int
+main ()
+{
+  unsigned int eax, ebx, ecx, edx;
+
+  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+    return 0;
+
+  /* Run ADX test only if host has ADX support.  */
+
+  if (__get_cpuid_max (0, NULL) < 7)
+    return 0;
+
+  __cpuid_count (7, 0, eax, ebx, ecx, edx);
+
+  if ((ebx & bit_ADX) == bit_ADX)
+    {
+      do_test ();
+#ifdef DEBUG
+      printf ("PASSED\n");
+#endif
+      return 0;
+    }
+#ifdef DEBUG
+  printf ("SKIPPED\n");
+#endif
+
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/i386/i386.exp b/gcc/testsuite/gcc.target/i386/i386.exp
index 785a973..37f43a6 100644
--- a/gcc/testsuite/gcc.target/i386/i386.exp
+++ b/gcc/testsuite/gcc.target/i386/i386.exp
@@ -243,6 +243,18 @@ proc check_effective_target_bmi2 { } {
     } "-mbmi2" ]
 }
 
+# Return 1 if ADX instructions can be compiled.
+proc check_effective_target_adx { } {
+    return [check_no_compiler_messages adx object {
+	unsigned char
+	_adxcarry_u32 (unsigned char __CF, unsigned int __X,
+		   unsigned int __Y, unsigned int *__P)
+	{
+	    return __builtin_ia32_addcarryx_u32 (__CF, __X, __Y, __P);
+	}
+    } "-madx" ]
+}
+
 # Return 1 if rtm instructions can be compiled.
 proc check_effective_target_rtm { } {
     return [check_no_compiler_messages rtm object {
diff --git a/gcc/testsuite/gcc.target/i386/sse-12.c b/gcc/testsuite/gcc.target/i386/sse-12.c
index cb3ab18..0d78a0c 100644
--- a/gcc/testsuite/gcc.target/i386/sse-12.c
+++ b/gcc/testsuite/gcc.target/i386/sse-12.c
@@ -3,7 +3,7 @@
    popcntintrin.h and mm_malloc.h are usable
    with -O -std=c89 -pedantic-errors.  */
 /* { dg-do compile } */
-/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw" } */
+/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx" } */
 
 #include <x86intrin.h>
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index fe2bf46..4c575ba 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw" } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx" } */
 
 #include <mm_malloc.h>
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-14.c b/gcc/testsuite/gcc.target/i386/sse-14.c
index 8877e31..c8c13ce 100644
--- a/gcc/testsuite/gcc.target/i386/sse-14.c
+++ b/gcc/testsuite/gcc.target/i386/sse-14.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw" } */
+/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx" } */
 
 #include <mm_malloc.h>
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-22.c b/gcc/testsuite/gcc.target/i386/sse-22.c
index ec5ccb8..ec83255 100644
--- a/gcc/testsuite/gcc.target/i386/sse-22.c
+++ b/gcc/testsuite/gcc.target/i386/sse-22.c
@@ -50,7 +50,7 @@
 
 
 #ifndef DIFFERENT_PRAGMAS
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx")
 #endif
 
 /* Following intrinsics require immediate arguments.  They
@@ -264,7 +264,7 @@ test_2 (_mm_clmulepi64_si128, __m128i, __m128i, __m128i, 1)
 
 /* x86intrin.h (FMA4/XOP/LWP/BMI/BMI2/TBM/LZCNT/FMA). */
 #ifdef DIFFERENT_PRAGMAS
-#pragma GCC target ("fma4,xop,lwp,bmi,bmi2,tbm,lzcnt,fma,rdseed,prfchw")
+#pragma GCC target ("fma4,xop,lwp,bmi,bmi2,tbm,lzcnt,fma,rdseed,prfchw,adx")
 #endif
 #include <x86intrin.h>
 /* xopintrin.h */
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index 3b26d99..f046ef6 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -183,7 +183,7 @@
 /* rtmintrin.h */
 #define __builtin_ia32_xabort(M) __builtin_ia32_xabort(1)
 
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx")
 #include <wmmintrin.h>
 #include <smmintrin.h>
 #include <mm3dnow.h>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Intrinsics for ADCX
  2012-08-03 13:24     ` Michael Zolotukhin
@ 2012-08-03 13:52       ` Uros Bizjak
  2012-08-03 14:17         ` Michael Zolotukhin
  2012-08-08  5:34         ` Michael Zolotukhin
  0 siblings, 2 replies; 14+ messages in thread
From: Uros Bizjak @ 2012-08-03 13:52 UTC (permalink / raw)
  To: Michael Zolotukhin
  Cc: Kirill Yukhin, Richard Henderson, Jakub Jelinek, gcc-patches, H.J. Lu

On Fri, Aug 3, 2012 at 3:24 PM, Michael Zolotukhin
<michael.v.zolotukhin@gmail.com> wrote:
> Hi,
> I made a new version of the patch, where I tried to take into account
> Uros' remarks - is it ok for trunk?
>
> Bootstrap and new tests are passing, testing is in progress.
>
> Changelog entry:
> 2012-08-03 Michael Zolotukhin <michael.v.zolotukhin@intel.com>
>
>         * common/config/i386/i386-common.c (OPTION_MASK_ISA_ADX_SET): New.
>         (OPTION_MASK_ISA_ADX_UNSET): Likewise.
>         (ix86_handle_option): Handle madx option.
>         * config.gcc (i[34567]86-*-*): Add adxintrin.h.
>         (x86_64-*-*): Likewise.
>         * config/i386/adxintrin.h: New header.
>         * config/i386/driver-i386.c (host_detect_local_cpu): Detect ADCX/ADOX
>         support.
>         * config/i386/i386-builtin-types.def
>         (UCHAR_FTYPE_UCHAR_UINT_UINT_PINT): New function type.
>         (UCHAR_FTYPE_UCHAR_ULONGLONG_ULONGLONG_PINT): Likewise.
>         * config/i386/i386-c.c: Define __ADX__ if needed.
>         * config/i386/i386.c (ix86_target_string): Define -madx option.
>         (PTA_ADX): New.
>         (ix86_option_override_internal): Handle new option.
>         (ix86_valid_target_attribute_inner_p): Add OPT_madx.
>         (ix86_builtins): Add IX86_BUILTIN_ADDCARRYX32,
>         IX86_BUILTIN_ADDCARRYX64.
>         (ix86_init_mmx_sse_builtins): Define corresponding built-ins.
>         (ix86_expand_builtin): Handle these built-ins.
>         (ix86_expand_args_builtin): Handle new function types.
>         * config/i386/i386.h (TARGET_ADX): New.
>         * config/i386/i386.md (adcx<mode>3): New define_insn.
>         * config/i386/i386.opt (madx): New.
>         * config/i386/x86intrin.h: Include adxintrin.h.
>
> testsuite/Changelog entry:
> 2012-08-03 Michael Zolotukhin <michael.v.zolotukhin@intel.com>
>
>         * gcc.target/i386/adx-addcarryx32-1.c: New.
>         * gcc.target/i386/adx-addcarryx32-2.c: New.
>         * gcc.target/i386/adx-addcarryx64-1.c: New.
>         * gcc.target/i386/adx-addcarryx64-2.c: New.
>         * gcc.target/i386/adx-check.h: New.
>         * gcc.target/i386/i386.exp (check_effective_target_adx): New.
>         * gcc.target/i386/sse-12.c: Add -madx.
>         * gcc.target/i386/sse-13.c: Ditto.
>         * gcc.target/i386/sse-14.c: Ditto.
>         * gcc.target/i386/sse-22.c: Ditto.
>         * gcc.target/i386/sse-23.c: Ditto.
>         * g++.dg/other/i386-2.C: Ditto.
>         * g++.dg/other/i386-3.C: Ditto.

Just change this line:

+      op4 = gen_rtx_REG (CCmode, FLAGS_REG);

back to CCCmode.

OK with this change.

Thanks,
Uros.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Intrinsics for ADCX
  2012-08-03 13:52       ` Uros Bizjak
@ 2012-08-03 14:17         ` Michael Zolotukhin
  2012-08-08  5:34         ` Michael Zolotukhin
  1 sibling, 0 replies; 14+ messages in thread
From: Michael Zolotukhin @ 2012-08-03 14:17 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: Kirill Yukhin, Richard Henderson, Jakub Jelinek, gcc-patches, H.J. Lu

Thanks!

On 3 August 2012 17:51, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Fri, Aug 3, 2012 at 3:24 PM, Michael Zolotukhin
> <michael.v.zolotukhin@gmail.com> wrote:
>> Hi,
>> I made a new version of the patch, where I tried to take into account
>> Uros' remarks - is it ok for trunk?
>>
>> Bootstrap and new tests are passing, testing is in progress.
>>
>> Changelog entry:
>> 2012-08-03 Michael Zolotukhin <michael.v.zolotukhin@intel.com>
>>
>>         * common/config/i386/i386-common.c (OPTION_MASK_ISA_ADX_SET): New.
>>         (OPTION_MASK_ISA_ADX_UNSET): Likewise.
>>         (ix86_handle_option): Handle madx option.
>>         * config.gcc (i[34567]86-*-*): Add adxintrin.h.
>>         (x86_64-*-*): Likewise.
>>         * config/i386/adxintrin.h: New header.
>>         * config/i386/driver-i386.c (host_detect_local_cpu): Detect ADCX/ADOX
>>         support.
>>         * config/i386/i386-builtin-types.def
>>         (UCHAR_FTYPE_UCHAR_UINT_UINT_PINT): New function type.
>>         (UCHAR_FTYPE_UCHAR_ULONGLONG_ULONGLONG_PINT): Likewise.
>>         * config/i386/i386-c.c: Define __ADX__ if needed.
>>         * config/i386/i386.c (ix86_target_string): Define -madx option.
>>         (PTA_ADX): New.
>>         (ix86_option_override_internal): Handle new option.
>>         (ix86_valid_target_attribute_inner_p): Add OPT_madx.
>>         (ix86_builtins): Add IX86_BUILTIN_ADDCARRYX32,
>>         IX86_BUILTIN_ADDCARRYX64.
>>         (ix86_init_mmx_sse_builtins): Define corresponding built-ins.
>>         (ix86_expand_builtin): Handle these built-ins.
>>         (ix86_expand_args_builtin): Handle new function types.
>>         * config/i386/i386.h (TARGET_ADX): New.
>>         * config/i386/i386.md (adcx<mode>3): New define_insn.
>>         * config/i386/i386.opt (madx): New.
>>         * config/i386/x86intrin.h: Include adxintrin.h.
>>
>> testsuite/Changelog entry:
>> 2012-08-03 Michael Zolotukhin <michael.v.zolotukhin@intel.com>
>>
>>         * gcc.target/i386/adx-addcarryx32-1.c: New.
>>         * gcc.target/i386/adx-addcarryx32-2.c: New.
>>         * gcc.target/i386/adx-addcarryx64-1.c: New.
>>         * gcc.target/i386/adx-addcarryx64-2.c: New.
>>         * gcc.target/i386/adx-check.h: New.
>>         * gcc.target/i386/i386.exp (check_effective_target_adx): New.
>>         * gcc.target/i386/sse-12.c: Add -madx.
>>         * gcc.target/i386/sse-13.c: Ditto.
>>         * gcc.target/i386/sse-14.c: Ditto.
>>         * gcc.target/i386/sse-22.c: Ditto.
>>         * gcc.target/i386/sse-23.c: Ditto.
>>         * g++.dg/other/i386-2.C: Ditto.
>>         * g++.dg/other/i386-3.C: Ditto.
>
> Just change this line:
>
> +      op4 = gen_rtx_REG (CCmode, FLAGS_REG);
>
> back to CCCmode.
>
> OK with this change.
>
> Thanks,
> Uros.



-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Intrinsics for ADCX
  2012-08-03 13:52       ` Uros Bizjak
  2012-08-03 14:17         ` Michael Zolotukhin
@ 2012-08-08  5:34         ` Michael Zolotukhin
  2012-08-08 13:27           ` Kirill Yukhin
  1 sibling, 1 reply; 14+ messages in thread
From: Michael Zolotukhin @ 2012-08-08  5:34 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: Kirill Yukhin, Richard Henderson, Jakub Jelinek, gcc-patches, H.J. Lu

[-- Attachment #1: Type: text/plain, Size: 5089 bytes --]

Hi,
Here is the patch with some obvious fixes. If there are no objections,
could anyone please check it in?

Changelog entry:
2012-08-08 Michael Zolotukhin <michael.v.zolotukhin@intel.com>

        * common/config/i386/i386-common.c (OPTION_MASK_ISA_ADX_SET): New.
        (OPTION_MASK_ISA_ADX_UNSET): Likewise.
        (ix86_handle_option): Handle madx option.
        * config.gcc (i[34567]86-*-*): Add adxintrin.h.
        (x86_64-*-*): Likewise.
        * config/i386/adxintrin.h: New header.
        * config/i386/driver-i386.c (host_detect_local_cpu): Detect ADCX/ADOX
        support.
        * config/i386/i386-builtin-types.def
        (UCHAR_FTYPE_UCHAR_UINT_UINT_PUNSIGNED): New function type.
        (UCHAR_FTYPE_UCHAR_ULONGLONG_ULONGLONG_PULONGLONG): Likewise.
        * config/i386/i386-c.c: Define __ADX__ if needed.
        * config/i386/i386.c (ix86_target_string): Define -madx option.
        (PTA_ADX): New.
        (ix86_option_override_internal): Handle new option.
        (ix86_valid_target_attribute_inner_p): Add OPT_madx.
        (ix86_builtins): Add IX86_BUILTIN_ADDCARRYX32,
        IX86_BUILTIN_ADDCARRYX64.
        (ix86_init_mmx_sse_builtins): Define corresponding built-ins.
        (ix86_expand_builtin): Handle these built-ins.
        (ix86_expand_args_builtin): Handle new function types.
        * config/i386/i386.h (TARGET_ADX): New.
        * config/i386/i386.md (adcx<mode>3): New define_insn.
        * config/i386/i386.opt (madx): New.
        * config/i386/x86intrin.h: Include adxintrin.h.

testsuite/Changelog entry:
2012-08-08 Michael Zolotukhin <michael.v.zolotukhin@intel.com>

        * gcc.target/i386/adx-addcarryx32-1.c: New.
        * gcc.target/i386/adx-addcarryx32-2.c: New.
        * gcc.target/i386/adx-addcarryx64-1.c: New.
        * gcc.target/i386/adx-addcarryx64-2.c: New.
        * gcc.target/i386/adx-check.h: New.
        * gcc.target/i386/i386.exp (check_effective_target_adx): New.
        * gcc.target/i386/sse-12.c: Add -madx.
        * gcc.target/i386/sse-13.c: Ditto.
        * gcc.target/i386/sse-14.c: Ditto.
        * gcc.target/i386/sse-22.c: Ditto.
        * gcc.target/i386/sse-23.c: Ditto.
        * g++.dg/other/i386-2.C: Ditto.
        * g++.dg/other/i386-3.C: Ditto.


On 3 August 2012 17:51, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Fri, Aug 3, 2012 at 3:24 PM, Michael Zolotukhin
> <michael.v.zolotukhin@gmail.com> wrote:
>> Hi,
>> I made a new version of the patch, where I tried to take into account
>> Uros' remarks - is it ok for trunk?
>>
>> Bootstrap and new tests are passing, testing is in progress.
>>
>> Changelog entry:
>> 2012-08-03 Michael Zolotukhin <michael.v.zolotukhin@intel.com>
>>
>>         * common/config/i386/i386-common.c (OPTION_MASK_ISA_ADX_SET): New.
>>         (OPTION_MASK_ISA_ADX_UNSET): Likewise.
>>         (ix86_handle_option): Handle madx option.
>>         * config.gcc (i[34567]86-*-*): Add adxintrin.h.
>>         (x86_64-*-*): Likewise.
>>         * config/i386/adxintrin.h: New header.
>>         * config/i386/driver-i386.c (host_detect_local_cpu): Detect ADCX/ADOX
>>         support.
>>         * config/i386/i386-builtin-types.def
>>         (UCHAR_FTYPE_UCHAR_UINT_UINT_PINT): New function type.
>>         (UCHAR_FTYPE_UCHAR_ULONGLONG_ULONGLONG_PINT): Likewise.
>>         * config/i386/i386-c.c: Define __ADX__ if needed.
>>         * config/i386/i386.c (ix86_target_string): Define -madx option.
>>         (PTA_ADX): New.
>>         (ix86_option_override_internal): Handle new option.
>>         (ix86_valid_target_attribute_inner_p): Add OPT_madx.
>>         (ix86_builtins): Add IX86_BUILTIN_ADDCARRYX32,
>>         IX86_BUILTIN_ADDCARRYX64.
>>         (ix86_init_mmx_sse_builtins): Define corresponding built-ins.
>>         (ix86_expand_builtin): Handle these built-ins.
>>         (ix86_expand_args_builtin): Handle new function types.
>>         * config/i386/i386.h (TARGET_ADX): New.
>>         * config/i386/i386.md (adcx<mode>3): New define_insn.
>>         * config/i386/i386.opt (madx): New.
>>         * config/i386/x86intrin.h: Include adxintrin.h.
>>
>> testsuite/Changelog entry:
>> 2012-08-03 Michael Zolotukhin <michael.v.zolotukhin@intel.com>
>>
>>         * gcc.target/i386/adx-addcarryx32-1.c: New.
>>         * gcc.target/i386/adx-addcarryx32-2.c: New.
>>         * gcc.target/i386/adx-addcarryx64-1.c: New.
>>         * gcc.target/i386/adx-addcarryx64-2.c: New.
>>         * gcc.target/i386/adx-check.h: New.
>>         * gcc.target/i386/i386.exp (check_effective_target_adx): New.
>>         * gcc.target/i386/sse-12.c: Add -madx.
>>         * gcc.target/i386/sse-13.c: Ditto.
>>         * gcc.target/i386/sse-14.c: Ditto.
>>         * gcc.target/i386/sse-22.c: Ditto.
>>         * gcc.target/i386/sse-23.c: Ditto.
>>         * g++.dg/other/i386-2.C: Ditto.
>>         * g++.dg/other/i386-3.C: Ditto.
>
> Just change this line:
>
> +      op4 = gen_rtx_REG (CCmode, FLAGS_REG);
>
> back to CCCmode.
>
> OK with this change.
>
> Thanks,
> Uros.



-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.

[-- Attachment #2: bdw-adx-4.gcc.patch --]
[-- Type: application/octet-stream, Size: 24753 bytes --]

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 70dcae0..e05cd56 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -57,6 +57,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_RTM_SET OPTION_MASK_ISA_RTM
 #define OPTION_MASK_ISA_PRFCHW_SET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_SET OPTION_MASK_ISA_RDSEED
+#define OPTION_MASK_ISA_ADX_SET OPTION_MASK_ISA_ADX
 
 /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
    as -msse4.2.  */
@@ -127,6 +128,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_RTM_UNSET OPTION_MASK_ISA_RTM
 #define OPTION_MASK_ISA_PRFCHW_UNSET OPTION_MASK_ISA_PRFCHW
 #define OPTION_MASK_ISA_RDSEED_UNSET OPTION_MASK_ISA_RDSEED
+#define OPTION_MASK_ISA_ADX_UNSET OPTION_MASK_ISA_ADX
 
 /* SSE4 includes both SSE4.1 and SSE4.2.  -mno-sse4 should the same
    as -mno-sse4.1. */
@@ -598,6 +600,19 @@ ix86_handle_option (struct gcc_options *opts,
 	}
       return true;
 
+    case OPT_madx:
+      if (value)
+	{
+	  opts->x_ix86_isa_flags |= OPTION_MASK_ISA_ADX_SET;
+	  opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_ADX_SET;
+	}
+      else
+	{
+	  opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_ADX_UNSET;
+	  opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_ADX_UNSET;
+	}
+      return true;
+
   /* Comes from final.c -- no real reason to change it.  */
 #define MAX_CODE_ALIGN 16
 
diff --git a/gcc/config.gcc b/gcc/config.gcc
index dad4c3a..f40ac0e 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -361,7 +361,7 @@ i[34567]86-*-*)
 		       ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
 		       lzcntintrin.h bmiintrin.h bmi2intrin.h tbmintrin.h
 		       avx2intrin.h fmaintrin.h f16cintrin.h rtmintrin.h
-		       xtestintrin.h rdseedintrin.h prfchwintrin.h"
+		       xtestintrin.h rdseedintrin.h prfchwintrin.h adxintrin.h"
 	;;
 x86_64-*-*)
 	cpu_type=i386
@@ -375,7 +375,7 @@ x86_64-*-*)
 		       ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
 		       lzcntintrin.h bmiintrin.h tbmintrin.h bmi2intrin.h
 		       avx2intrin.h fmaintrin.h f16cintrin.h rtmintrin.h
-		       xtestintrin.h rdseedintrin.h prfchwintrin.h"
+		       xtestintrin.h rdseedintrin.h prfchwintrin.h adxintrin.h"
 	need_64bit_hwint=yes
 	;;
 ia64-*-*)
diff --git a/gcc/config/i386/adxintrin.h b/gcc/config/i386/adxintrin.h
new file mode 100644
index 0000000..2e2a18b
--- /dev/null
+++ b/gcc/config/i386/adxintrin.h
@@ -0,0 +1,53 @@
+/* Copyright (C) 2012 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#if !defined _X86INTRIN_H_INCLUDED && !defined _IMMINTRIN_H_INCLUDED
+# error "Never use <adxintrin.h> directly; include <x86intrin.h> instead."
+#endif
+
+#ifndef __ADX__
+# error "Flag-preserving add-carry instructions not enabled"
+#endif /* __ADX__ */
+
+#ifndef _ADXINTRIN_H_INCLUDED
+#define _ADXINTRIN_H_INCLUDED
+
+extern __inline unsigned char
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_addcarryx_u32 (unsigned char __CF, unsigned int __X,
+		unsigned int __Y, unsigned int *__P)
+{
+    return __builtin_ia32_addcarryx_u32 (__CF, __X, __Y, __P);
+}
+
+#ifdef __x86_64__
+extern __inline unsigned char
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_addcarryx_u64 (unsigned char __CF, unsigned long __X,
+		unsigned long __Y, unsigned long long *__P)
+{
+    return __builtin_ia32_addcarryx_u64 (__CF, __X, __Y, __P);
+}
+#endif
+
+#endif /* _ADXINTRIN_H_INCLUDED */
diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index 4616108..0b56f3f 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -399,7 +399,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
   unsigned int has_bmi = 0, has_bmi2 = 0, has_tbm = 0, has_lzcnt = 0;
   unsigned int has_hle = 0, has_rtm = 0;
   unsigned int has_rdrnd = 0, has_f16c = 0, has_fsgsbase = 0;
-  unsigned int has_rdseed = 0, has_prfchw = 0;
+  unsigned int has_rdseed = 0, has_prfchw = 0, has_adx = 0;
 
   bool arch;
 
@@ -468,6 +468,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
       has_fsgsbase = ebx & bit_FSGSBASE;
       has_rdseed = ebx & bit_RDSEED;
       has_prfchw = ecx & bit_PRFCHW;
+      has_adx = ebx & bit_ADX;
     }
 
   /* Check cpuid level of extended features.  */
@@ -750,11 +751,12 @@ const char *host_detect_local_cpu (int argc, const char **argv)
       const char *fsgsbase = has_fsgsbase ? " -mfsgsbase" : " -mno-fsgsbase";
       const char *rdseed = has_rdseed ? " -mrdseed" : " -mno-rdseed";
       const char *prfchw = has_prfchw ? " -mprfchw" : " -mno-prfchw";
+      const char *adx = has_adx ? " -madx" : " -mno-adx";
 
       options = concat (options, cx16, sahf, movbe, ase, pclmul,
 			popcnt, abm, lwp, fma, fma4, xop, bmi, bmi2,
 			tbm, avx, avx2, sse4_2, sse4_1, lzcnt, rtm,
-			hle, rdrnd, f16c, fsgsbase, rdseed, prfchw, NULL);
+			hle, rdrnd, f16c, fsgsbase, rdseed, prfchw, adx, NULL);
     }
 
 done:
diff --git a/gcc/config/i386/i386-builtin-types.def b/gcc/config/i386/i386-builtin-types.def
index 398bf0a..8a199c0 100644
--- a/gcc/config/i386/i386-builtin-types.def
+++ b/gcc/config/i386/i386-builtin-types.def
@@ -446,6 +446,9 @@ DEF_FUNCTION_TYPE (V16QI, V16QI, INT, V16QI, INT, INT)
 
 DEF_FUNCTION_TYPE (V8QI, QI, QI, QI, QI, QI, QI, QI, QI)
 
+DEF_FUNCTION_TYPE (UCHAR, UCHAR, UINT, UINT, PUNSIGNED)
+DEF_FUNCTION_TYPE (UCHAR, UCHAR, ULONGLONG, ULONGLONG, PULONGLONG)
+
 DEF_FUNCTION_TYPE (V2DF, V2DF, PCDOUBLE, V4SI, V2DF, INT)
 DEF_FUNCTION_TYPE (V4DF, V4DF, PCDOUBLE, V4SI, V4DF, INT)
 DEF_FUNCTION_TYPE (V4DF, V4DF, PCDOUBLE, V8SI, V4DF, INT)
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index a4c947a..d00e0ba 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -300,6 +300,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
     def_or_undef (parse_in, "__RDSEED__");
   if (isa_flag & OPTION_MASK_ISA_PRFCHW)
     def_or_undef (parse_in, "__PRFCHW__");
+  if (isa_flag & OPTION_MASK_ISA_ADX)
+    def_or_undef (parse_in, "__ADX__");
   if ((fpmath & FPMATH_SSE) && (isa_flag & OPTION_MASK_ISA_SSE))
     def_or_undef (parse_in, "__SSE_MATH__");
   if ((fpmath & FPMATH_SSE) && (isa_flag & OPTION_MASK_ISA_SSE2))
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1772dc6..9eda338 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2769,6 +2769,7 @@ ix86_target_string (HOST_WIDE_INT isa, int flags, const char *arch,
     { "-mhle",		OPTION_MASK_ISA_HLE },
     { "-mrdseed",	OPTION_MASK_ISA_RDSEED },
     { "-mprfchw",	OPTION_MASK_ISA_PRFCHW },
+    { "-madx",		OPTION_MASK_ISA_ADX },
     { "-mtbm",		OPTION_MASK_ISA_TBM },
     { "-mpopcnt",	OPTION_MASK_ISA_POPCNT },
     { "-mmovbe",	OPTION_MASK_ISA_MOVBE },
@@ -3047,6 +3048,7 @@ ix86_option_override_internal (bool main_args_p)
 #define PTA_HLE			(HOST_WIDE_INT_1 << 33)
 #define PTA_PRFCHW		(HOST_WIDE_INT_1 << 34)
 #define PTA_RDSEED		(HOST_WIDE_INT_1 << 35)
+#define PTA_ADX			(HOST_WIDE_INT_1 << 36)
 /* if this reaches 64, need to widen struct pta flags below */
 
   static struct pta
@@ -3538,6 +3540,9 @@ ix86_option_override_internal (bool main_args_p)
 	if (processor_alias_table[i].flags & PTA_RDSEED
 	    && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_RDSEED))
 	  ix86_isa_flags |= OPTION_MASK_ISA_RDSEED;
+	if (processor_alias_table[i].flags & PTA_ADX
+	    && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_ADX))
+	  ix86_isa_flags |= OPTION_MASK_ISA_ADX;
 	if (processor_alias_table[i].flags & (PTA_PREFETCH_SSE | PTA_SSE))
 	  x86_prefetch_sse = true;
 
@@ -4361,6 +4366,7 @@ ix86_valid_target_attribute_inner_p (tree args, char *p_strings[],
     IX86_ATTR_ISA ("hle",	OPT_mhle),
     IX86_ATTR_ISA ("prfchw",	OPT_mprfchw),
     IX86_ATTR_ISA ("rdseed",	OPT_mrdseed),
+    IX86_ATTR_ISA ("adx",	OPT_madx),
 
     /* enum options */
     IX86_ATTR_ENUM ("fpmath=",	OPT_mfpmath_),
@@ -26107,6 +26113,10 @@ enum ix86_builtins
   IX86_BUILTIN_PEXT32,
   IX86_BUILTIN_PEXT64,
 
+  /* ADX instructions.  */
+  IX86_BUILTIN_ADDCARRYX32,
+  IX86_BUILTIN_ADDCARRYX64,
+
   /* FSGSBASE instructions.  */
   IX86_BUILTIN_RDFSBASE32,
   IX86_BUILTIN_RDFSBASE64,
@@ -27957,6 +27967,14 @@ ix86_init_mmx_sse_builtins (void)
 	       "__builtin_ia32_rdseed_di_step",
 	       INT_FTYPE_PULONGLONG, IX86_BUILTIN_RDSEED64_STEP);
 
+  /* ADCX */
+  def_builtin (OPTION_MASK_ISA_ADX, "__builtin_ia32_addcarryx_u32",
+	       UCHAR_FTYPE_UCHAR_UINT_UINT_PUNSIGNED, IX86_BUILTIN_ADDCARRYX32);
+  def_builtin (OPTION_MASK_ISA_ADX && OPTION_MASK_ISA_64BIT,
+	       "__builtin_ia32_addcarryx_u64",
+	       UCHAR_FTYPE_UCHAR_ULONGLONG_ULONGLONG_PULONGLONG,
+	       IX86_BUILTIN_ADDCARRYX64);
+
   /* Add FMA4 multi-arg argument instructions */
   for (i = 0, d = bdesc_multi_arg; i < ARRAY_SIZE (bdesc_multi_arg); i++, d++)
     {
@@ -29472,6 +29490,10 @@ ix86_expand_args_builtin (const struct builtin_description *d,
       nargs = 4;
       nargs_constant = 2;
       break;
+    case UCHAR_FTYPE_UCHAR_UINT_UINT_PUNSIGNED:
+    case UCHAR_FTYPE_UCHAR_ULONGLONG_ULONGLONG_PULONGLONG:
+      nargs = 4;
+      break;
     default:
       gcc_unreachable ();
     }
@@ -30318,7 +30340,62 @@ rdseed_step:
         target = gen_reg_rtx (SImode);
 
       emit_insn (gen_zero_extendqisi2 (target, op2));
+      return target;
+
+    case IX86_BUILTIN_ADDCARRYX32:
+      icode = CODE_FOR_adcxsi3;
+      mode0 = SImode;
+      goto addcarryx;
+
+    case IX86_BUILTIN_ADDCARRYX64:
+      icode = CODE_FOR_adcxdi3;
+      mode0 = DImode;
+
+addcarryx:
+      arg0 = CALL_EXPR_ARG (exp, 0); /* unsigned char c_in.  */
+      arg1 = CALL_EXPR_ARG (exp, 1); /* unsigned int src1.  */
+      arg2 = CALL_EXPR_ARG (exp, 2); /* unsigned int src2.  */
+      arg3 = CALL_EXPR_ARG (exp, 3); /* unsigned int *sum_out.  */
+
+      op0 = gen_reg_rtx (QImode);
+
+      /* Generate CF from input operand.  */
+      op1 = expand_normal (arg0);
+      if (GET_MODE (op1) != QImode)
+	op1 = convert_to_mode (QImode, op1, 1);
+      op1 = copy_to_mode_reg (QImode, op1);
+      emit_insn (gen_addqi3_cc (op0, op1, constm1_rtx));
+
+      /* Gen ADCX instruction to compute X+Y+CF.  */
+      op2 = expand_normal (arg1);
+      op3 = expand_normal (arg2);
+
+      if (!REG_P (op2))
+	op2 = copy_to_mode_reg (mode0, op2);
+      if (!REG_P (op3))
+	op3 = copy_to_mode_reg (mode0, op3);
+
+      op0 = gen_reg_rtx (mode0);
+
+      op4 = gen_rtx_REG (CCCmode, FLAGS_REG);
+      pat = gen_rtx_LTU (VOIDmode, op4, const0_rtx);
+      emit_insn (GEN_FCN (icode) (op0, op2, op3, op4, pat));
+
+      /* Store the result.  */
+      op4 = expand_normal (arg3);
+      if (!address_operand (op4, VOIDmode))
+	{
+	  op4 = convert_memory_address (Pmode, op4);
+	  op4 = copy_addr_to_reg (op4);
+	}
+      emit_move_insn (gen_rtx_MEM (mode0, op4), op0);
+
+      /* Return current CF value.  */
+      if (target == 0)
+        target = gen_reg_rtx (QImode);
 
+      PUT_MODE (pat, QImode);
+      emit_insn (gen_rtx_SET (VOIDmode, target, pat));
       return target;
 
     case IX86_BUILTIN_GATHERSIV2DF:
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index a6ce0ce..5869628 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -78,6 +78,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define TARGET_HLE	OPTION_ISA_HLE
 #define TARGET_RDSEED	OPTION_ISA_RDSEED
 #define TARGET_PRFCHW	OPTION_ISA_PRFCHW
+#define TARGET_ADX	OPTION_ISA_ADX
 
 #define TARGET_LP64	OPTION_ABI_64
 #define TARGET_X32	OPTION_ABI_X32
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index ace3b6e..6774ae2 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -6633,6 +6633,29 @@
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "SI")])
 \f
+;; ADCX instruction
+
+(define_insn "adcx<mode>3"
+  [(set (reg:CCC FLAGS_REG)
+	(compare:CCC
+	  (plus:SWI48
+	    (match_operand:SWI48 1 "nonimmediate_operand" "%0")
+	    (plus:SWI48
+	      (match_operator 4 "ix86_carry_flag_operator"
+	       [(match_operand 3 "flags_reg_operand") (const_int 0)])
+	      (match_operand:SWI48 2 "nonimmediate_operand" "rm")))
+	  (const_int 0)))
+   (set (match_operand:SWI48 0 "register_operand" "=r")
+	(plus:SWI48 (match_dup 1)
+		    (plus:SWI48 (match_op_dup 4
+				 [(match_dup 3) (const_int 0)])
+				(match_dup 2))))]
+  "TARGET_ADX && ix86_binary_operator_ok (PLUS, <MODE>mode, operands)"
+  "adcx\t{%2, %0|%0, %2}"
+  [(set_attr "type" "alu")
+   (set_attr "use_carry" "1")
+   (set_attr "mode" "<MODE>")])
+\f
 ;; Overflow setting add and subtract instructions
 
 (define_insn "*add<mode>3_cconly_overflow"
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index ccada37..e4f78f3 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -540,6 +540,10 @@ mprfchw
 Target Report Mask(ISA_PRFCHW) Var(ix86_isa_flags) Save
 Support PREFETCHW instruction
 
+madx
+Target Report Mask(ISA_ADX) Var(ix86_isa_flags) Save
+Support flag-preserving add-carry instructions
+
 mtbm
 Target Report Mask(ISA_TBM) Var(ix86_isa_flags) Save
 Support TBM built-in functions and code generation
diff --git a/gcc/config/i386/x86intrin.h b/gcc/config/i386/x86intrin.h
index 9dee9ef..dc5c58e 100644
--- a/gcc/config/i386/x86intrin.h
+++ b/gcc/config/i386/x86intrin.h
@@ -105,4 +105,8 @@
 #include <prfchwintrin.h>
 #endif
 
+#ifdef __ADX__
+#include <adxintrin.h>
+#endif
+
 #endif /* _X86INTRIN_H_INCLUDED */
diff --git a/gcc/testsuite/g++.dg/other/i386-2.C b/gcc/testsuite/g++.dg/other/i386-2.C
index 47fda70..197497f 100644
--- a/gcc/testsuite/g++.dg/other/i386-2.C
+++ b/gcc/testsuite/g++.dg/other/i386-2.C
@@ -1,5 +1,5 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw" } */
+/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
diff --git a/gcc/testsuite/g++.dg/other/i386-3.C b/gcc/testsuite/g++.dg/other/i386-3.C
index ad477fa..780731e 100644
--- a/gcc/testsuite/g++.dg/other/i386-3.C
+++ b/gcc/testsuite/g++.dg/other/i386-3.C
@@ -1,5 +1,5 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw" } */
+/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
diff --git a/gcc/testsuite/gcc.target/i386/adx-addcarryx32-1.c b/gcc/testsuite/gcc.target/i386/adx-addcarryx32-1.c
new file mode 100644
index 0000000..daf5779
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-addcarryx32-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-madx -O2" } */
+/* { dg-final { scan-assembler "adcx" } } */
+
+#include <x86intrin.h>
+
+volatile unsigned char c;
+volatile unsigned int x, y;
+unsigned int *sum;
+
+void extern
+adx_test (void)
+{
+    c = _addcarryx_u32 (c, x, y, sum);
+}
diff --git a/gcc/testsuite/gcc.target/i386/adx-addcarryx32-2.c b/gcc/testsuite/gcc.target/i386/adx-addcarryx32-2.c
new file mode 100644
index 0000000..d38d7ee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-addcarryx32-2.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-options "-madx -O2" } */
+/* { dg-require-effective-target adx } */
+
+#include <x86intrin.h>
+#include "adx-check.h"
+
+static void
+adx_test (void)
+{
+  volatile unsigned char c;
+  unsigned int x;
+  volatile unsigned int y, sum_ref;
+
+  c = 0;
+  x = y = 0xFFFFFFFF;
+  sum_ref = 0xFFFFFFFE;
+
+  /* X = 0xFFFFFFFF, Y = 0xFFFFFFFF, C = 0.  */
+  c = _addcarryx_u32 (c, x, y, &x);
+  /* X = 0xFFFFFFFE, Y = 0xFFFFFFFF, C = 1.  */
+  c = _addcarryx_u32 (c, x, y, &x);
+  /* X = 0xFFFFFFFE, Y = 0xFFFFFFFF, C = 1.  */
+
+  if (x != sum_ref)
+    abort ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/adx-addcarryx64-1.c b/gcc/testsuite/gcc.target/i386/adx-addcarryx64-1.c
new file mode 100644
index 0000000..45beca8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-addcarryx64-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-madx -O2" } */
+/* { dg-final { scan-assembler "adcx" } } */
+
+#include <x86intrin.h>
+
+volatile unsigned char c;
+volatile unsigned long long x, y;
+unsigned long long *sum;
+
+void extern
+adx_test (void)
+{
+    c = _addcarryx_u64 (c, x, y, sum);
+}
diff --git a/gcc/testsuite/gcc.target/i386/adx-addcarryx64-2.c b/gcc/testsuite/gcc.target/i386/adx-addcarryx64-2.c
new file mode 100644
index 0000000..6aa2539
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-addcarryx64-2.c
@@ -0,0 +1,27 @@
+/* { dg-do run { target { ! ia32 } } } */
+/* { dg-options "-madx -O2" } */
+/* { dg-require-effective-target adx } */
+
+#include <x86intrin.h>
+#include "adx-check.h"
+
+static void
+adx_test (void)
+{
+  volatile unsigned char c;
+  unsigned long long x;
+  volatile unsigned long long y, sum_ref;
+
+  c = 0;
+  x = y = 0xFFFFFFFFFFFFFFFFLL;
+  sum_ref = 0xFFFFFFFFFFFFFFFELL;
+
+  /* X = 0xFFFFFFFFFFFFFFFF, Y = 0xFFFFFFFFFFFFFFFF, C = 0.  */
+  c = _addcarryx_u64 (c, x, y, &x);
+  /* X = 0xFFFFFFFFFFFFFFFE, Y = 0xFFFFFFFFFFFFFFFF, C = 1.  */
+  c = _addcarryx_u64 (c, x, y, &x);
+  /* X = 0xFFFFFFFFFFFFFFFE, Y = 0xFFFFFFFFFFFFFFFF, C = 1.  */
+
+  if (x != sum_ref)
+    abort ();
+}
diff --git a/gcc/testsuite/gcc.target/i386/adx-check.h b/gcc/testsuite/gcc.target/i386/adx-check.h
new file mode 100644
index 0000000..580cb49
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-check.h
@@ -0,0 +1,40 @@
+#include <stdlib.h>
+#include "cpuid.h"
+
+static void adx_test (void);
+
+static void __attribute__ ((noinline)) do_test (void)
+{
+  adx_test ();
+}
+
+  int
+main ()
+{
+  unsigned int eax, ebx, ecx, edx;
+
+  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+    return 0;
+
+  /* Run ADX test only if host has ADX support.  */
+
+  if (__get_cpuid_max (0, NULL) < 7)
+    return 0;
+
+  __cpuid_count (7, 0, eax, ebx, ecx, edx);
+
+  if ((ebx & bit_ADX) == bit_ADX)
+    {
+      do_test ();
+#ifdef DEBUG
+      printf ("PASSED\n");
+#endif
+      return 0;
+    }
+#ifdef DEBUG
+  printf ("SKIPPED\n");
+#endif
+
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/i386/i386.exp b/gcc/testsuite/gcc.target/i386/i386.exp
index 785a973..37f43a6 100644
--- a/gcc/testsuite/gcc.target/i386/i386.exp
+++ b/gcc/testsuite/gcc.target/i386/i386.exp
@@ -243,6 +243,18 @@ proc check_effective_target_bmi2 { } {
     } "-mbmi2" ]
 }
 
+# Return 1 if ADX instructions can be compiled.
+proc check_effective_target_adx { } {
+    return [check_no_compiler_messages adx object {
+	unsigned char
+	_adxcarry_u32 (unsigned char __CF, unsigned int __X,
+		   unsigned int __Y, unsigned int *__P)
+	{
+	    return __builtin_ia32_addcarryx_u32 (__CF, __X, __Y, __P);
+	}
+    } "-madx" ]
+}
+
 # Return 1 if rtm instructions can be compiled.
 proc check_effective_target_rtm { } {
     return [check_no_compiler_messages rtm object {
diff --git a/gcc/testsuite/gcc.target/i386/sse-12.c b/gcc/testsuite/gcc.target/i386/sse-12.c
index cb3ab18..0d78a0c 100644
--- a/gcc/testsuite/gcc.target/i386/sse-12.c
+++ b/gcc/testsuite/gcc.target/i386/sse-12.c
@@ -3,7 +3,7 @@
    popcntintrin.h and mm_malloc.h are usable
    with -O -std=c89 -pedantic-errors.  */
 /* { dg-do compile } */
-/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw" } */
+/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx" } */
 
 #include <x86intrin.h>
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index fe2bf46..4c575ba 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw" } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx" } */
 
 #include <mm_malloc.h>
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-14.c b/gcc/testsuite/gcc.target/i386/sse-14.c
index 8877e31..c8c13ce 100644
--- a/gcc/testsuite/gcc.target/i386/sse-14.c
+++ b/gcc/testsuite/gcc.target/i386/sse-14.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw" } */
+/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx" } */
 
 #include <mm_malloc.h>
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-22.c b/gcc/testsuite/gcc.target/i386/sse-22.c
index ec5ccb8..ec83255 100644
--- a/gcc/testsuite/gcc.target/i386/sse-22.c
+++ b/gcc/testsuite/gcc.target/i386/sse-22.c
@@ -50,7 +50,7 @@
 
 
 #ifndef DIFFERENT_PRAGMAS
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx")
 #endif
 
 /* Following intrinsics require immediate arguments.  They
@@ -264,7 +264,7 @@ test_2 (_mm_clmulepi64_si128, __m128i, __m128i, __m128i, 1)
 
 /* x86intrin.h (FMA4/XOP/LWP/BMI/BMI2/TBM/LZCNT/FMA). */
 #ifdef DIFFERENT_PRAGMAS
-#pragma GCC target ("fma4,xop,lwp,bmi,bmi2,tbm,lzcnt,fma,rdseed,prfchw")
+#pragma GCC target ("fma4,xop,lwp,bmi,bmi2,tbm,lzcnt,fma,rdseed,prfchw,adx")
 #endif
 #include <x86intrin.h>
 /* xopintrin.h */
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index 3b26d99..f046ef6 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -183,7 +183,7 @@
 /* rtmintrin.h */
 #define __builtin_ia32_xabort(M) __builtin_ia32_xabort(1)
 
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx")
 #include <wmmintrin.h>
 #include <smmintrin.h>
 #include <mm3dnow.h>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Intrinsics for ADCX
  2012-08-08  5:34         ` Michael Zolotukhin
@ 2012-08-08 13:27           ` Kirill Yukhin
  0 siblings, 0 replies; 14+ messages in thread
From: Kirill Yukhin @ 2012-08-08 13:27 UTC (permalink / raw)
  To: Michael Zolotukhin
  Cc: Uros Bizjak, Richard Henderson, Jakub Jelinek, gcc-patches, H.J. Lu

> Here is the patch with some obvious fixes. If there are no objections,
> could anyone please check it in?
Done:
http://gcc.gnu.org/ml/gcc-cvs/2012-08/msg00203.html

Thanks, K

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Intrinsics for ADCX
  2012-08-01 16:37   ` Kirill Yukhin
  2012-08-03 13:24     ` Michael Zolotukhin
@ 2012-08-09 12:22     ` Michael Zolotukhin
  2012-08-09 14:26       ` Richard Henderson
  1 sibling, 1 reply; 14+ messages in thread
From: Michael Zolotukhin @ 2012-08-09 12:22 UTC (permalink / raw)
  To: Kirill Yukhin
  Cc: Richard Henderson, Uros Bizjak, Jakub Jelinek, gcc-patches, H.J. Lu

[-- Attachment #1: Type: text/plain, Size: 3045 bytes --]

Hi guys,
This patch generalizes recently commited addcarryx-intrinsic so that
it could be generated either via ADCX or common ADC instruction.
ADX-* tests are ok, bootstrap is passed.
Is it ok for trunk?

Changelog entry:
2012-08-09  Michael Zolotukhin  <michael.v.zolotukhin@intel.com>

        * config/i386/adxintrin.h: Remove guarding __ADX__ check.
        * config/i386/x86intrin.h: Likewise.
        * config/i386/i386.c (ix86_init_mmx_sse_builtins): Remove
        OPTION_MASK_ISA_ADX from needed options for
        __builtin_ia32_addcarryx_u32 and __builtin_ia32_addcarryx_u64.
        (ix86_expand_builtin): Use add<mode>3_carry in expanding of
        IX86_BUILTIN_ADDCARRYX32 and IX86_BUILTIN_ADDCARRYX64.

testsuite/Changelog entry:
2012-08-09  Michael Zolotukhin  <michael.v.zolotukhin@intel.com>

        * gcc.target/i386/adx-addxcarry32-3.c: New.
        * gcc.target/i386/adx-addxcarry64-3.c: New.


Thanks, Michael

On 1 August 2012 20:37, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:
> Hi Richard,
>
>> Frankly I don't understand the point of these instructions
>> being added to the ISA at all.  I would have understood an
>> add-with-carry that did *not* modify the flags at all, but
>> two separate ones that modify C and O separately is just
>> downright strange.
> If there is only one carry in flight, they all are equivalent although
> ADOX is a little less useful in loops.
> If there are two carries in flight, that’s where the new instructions
> show their benefit, since they allow accumulation without destroying
> each other (see next comment).
> For any number of carries beyond two, you have to start saving
> restoring carry bits and it degenerates to the first case for some of
> them.
>
>> But to the point: I don't understand the point of having
>> this as a builtin.  Is the code generated by this builtin
>> any better than plain C?
> I think this is just like a practice to introduce new intrinsics for new insns.
> I doubt, that we may generate such things automatically:
> c1 = 0;
> c2 = 0;
> c1 = _adcx64( & res[i], src[i], src2[i], c1);
> c1 = _adcx64( & res[i+1], src[i+1], src2[i+1], c1);
> c2 = _adcx64( & res[i], src[i], src2[i], c2);
> c2 = _adcx64( & res[i+1], src[i+1], src2[i+1], c2);
>
>> And if you're going to have the builtin, why is this restricted
>> to adx anyway?  You obviously can produce the same results with
>> the good old fashioned adc instruction as well.
> We have one intrinsic for both ADCX/ADOX. So, we just picked up first
> one to use when exanding the built-in
>
>> Which begs the question of why you've got a separate pattern
>> for the adx anyway.  If the insn is so much better, it ought to
>> be used in the same pattern we use for adc now.
> I believe, we may introduce global variant of ADCX, which may be
> expanded into either of ADC/ADCX/ADOX on x86 and into analogs
> on the other ports.
>
> K


-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.

[-- Attachment #2: bdw-adx-5.gcc.patch --]
[-- Type: application/octet-stream, Size: 3000 bytes --]

diff --git a/gcc/config/i386/adxintrin.h b/gcc/config/i386/adxintrin.h
index 2e2a18b..a68566d 100644
--- a/gcc/config/i386/adxintrin.h
+++ b/gcc/config/i386/adxintrin.h
@@ -25,10 +25,6 @@
 # error "Never use <adxintrin.h> directly; include <x86intrin.h> instead."
 #endif
 
-#ifndef __ADX__
-# error "Flag-preserving add-carry instructions not enabled"
-#endif /* __ADX__ */
-
 #ifndef _ADXINTRIN_H_INCLUDED
 #define _ADXINTRIN_H_INCLUDED
 
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 17d4446..7a9e134 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -27968,9 +27968,9 @@ ix86_init_mmx_sse_builtins (void)
 	       INT_FTYPE_PULONGLONG, IX86_BUILTIN_RDSEED64_STEP);
 
   /* ADCX */
-  def_builtin (OPTION_MASK_ISA_ADX, "__builtin_ia32_addcarryx_u32",
+  def_builtin (0, "__builtin_ia32_addcarryx_u32",
 	       UCHAR_FTYPE_UCHAR_UINT_UINT_PUNSIGNED, IX86_BUILTIN_ADDCARRYX32);
-  def_builtin (OPTION_MASK_ISA_ADX && OPTION_MASK_ISA_64BIT,
+  def_builtin (OPTION_MASK_ISA_64BIT,
 	       "__builtin_ia32_addcarryx_u64",
 	       UCHAR_FTYPE_UCHAR_ULONGLONG_ULONGLONG_PULONGLONG,
 	       IX86_BUILTIN_ADDCARRYX64);
@@ -30343,12 +30343,12 @@ rdseed_step:
       return target;
 
     case IX86_BUILTIN_ADDCARRYX32:
-      icode = CODE_FOR_adcxsi3;
+      icode = TARGET_ADX ? CODE_FOR_adcxsi3 : CODE_FOR_addsi3_carry;
       mode0 = SImode;
       goto addcarryx;
 
     case IX86_BUILTIN_ADDCARRYX64:
-      icode = CODE_FOR_adcxdi3;
+      icode = TARGET_ADX ? CODE_FOR_adcxdi3 : CODE_FOR_adddi3_carry;
       mode0 = DImode;
 
 addcarryx:
diff --git a/gcc/config/i386/x86intrin.h b/gcc/config/i386/x86intrin.h
index dc5c58e..fae6491 100644
--- a/gcc/config/i386/x86intrin.h
+++ b/gcc/config/i386/x86intrin.h
@@ -105,8 +105,6 @@
 #include <prfchwintrin.h>
 #endif
 
-#ifdef __ADX__
 #include <adxintrin.h>
-#endif
 
 #endif /* _X86INTRIN_H_INCLUDED */
diff --git a/gcc/testsuite/gcc.target/i386/adx-addcarryx32-3.c b/gcc/testsuite/gcc.target/i386/adx-addcarryx32-3.c
new file mode 100644
index 0000000..0ed33a9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-addcarryx32-3.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-mno-adx -O2" } */
+/* { dg-final { scan-assembler "adcl" } } */
+
+#include <x86intrin.h>
+
+volatile unsigned char c;
+volatile unsigned int x, y;
+unsigned int *sum;
+
+void extern
+adx_test (void)
+{
+    c = _addcarryx_u32 (c, x, y, sum);
+}
diff --git a/gcc/testsuite/gcc.target/i386/adx-addcarryx64-3.c b/gcc/testsuite/gcc.target/i386/adx-addcarryx64-3.c
new file mode 100644
index 0000000..4bbf74b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/adx-addcarryx64-3.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-mno-adx -O2" } */
+/* { dg-final { scan-assembler "adcq" } } */
+
+#include <x86intrin.h>
+
+volatile unsigned char c;
+volatile unsigned long long x, y;
+unsigned long long *sum;
+
+void extern
+adx_test (void)
+{
+    c = _addcarryx_u64 (c, x, y, sum);
+}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Intrinsics for ADCX
  2012-08-09 12:22     ` Michael Zolotukhin
@ 2012-08-09 14:26       ` Richard Henderson
  2012-08-09 14:36         ` Kirill Yukhin
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Henderson @ 2012-08-09 14:26 UTC (permalink / raw)
  To: Michael Zolotukhin
  Cc: Kirill Yukhin, Uros Bizjak, Jakub Jelinek, gcc-patches, H.J. Lu

On 08/09/2012 05:21 AM, Michael Zolotukhin wrote:
> Changelog entry:
> 2012-08-09  Michael Zolotukhin  <michael.v.zolotukhin@intel.com>
> 
>         * config/i386/adxintrin.h: Remove guarding __ADX__ check.
>         * config/i386/x86intrin.h: Likewise.
>         * config/i386/i386.c (ix86_init_mmx_sse_builtins): Remove
>         OPTION_MASK_ISA_ADX from needed options for
>         __builtin_ia32_addcarryx_u32 and __builtin_ia32_addcarryx_u64.
>         (ix86_expand_builtin): Use add<mode>3_carry in expanding of
>         IX86_BUILTIN_ADDCARRYX32 and IX86_BUILTIN_ADDCARRYX64.
> 
> testsuite/Changelog entry:
> 2012-08-09  Michael Zolotukhin  <michael.v.zolotukhin@intel.com>
> 
>         * gcc.target/i386/adx-addxcarry32-3.c: New.
>         * gcc.target/i386/adx-addxcarry64-3.c: New.

Ok.


r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Intrinsics for ADCX
  2012-08-09 14:26       ` Richard Henderson
@ 2012-08-09 14:36         ` Kirill Yukhin
  2012-08-10  6:19           ` Michael Zolotukhin
  0 siblings, 1 reply; 14+ messages in thread
From: Kirill Yukhin @ 2012-08-09 14:36 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Michael Zolotukhin, Uros Bizjak, Jakub Jelinek, gcc-patches, H.J. Lu

>
> Ok.

Checked in:
http://gcc.gnu.org/ml/gcc-cvs/2012-08/msg00231.html

Thanks, K

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Intrinsics for ADCX
  2012-08-09 14:36         ` Kirill Yukhin
@ 2012-08-10  6:19           ` Michael Zolotukhin
  2012-08-15  8:19             ` Kirill Yukhin
  0 siblings, 1 reply; 14+ messages in thread
From: Michael Zolotukhin @ 2012-08-10  6:19 UTC (permalink / raw)
  To: Kirill Yukhin
  Cc: Richard Henderson, Uros Bizjak, Jakub Jelinek, gcc-patches, H.J. Lu

Thanks!

On 9 August 2012 18:36, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:
>>
>> Ok.
>
> Checked in:
> http://gcc.gnu.org/ml/gcc-cvs/2012-08/msg00231.html
>
> Thanks, K


-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Intrinsics for ADCX
  2012-08-10  6:19           ` Michael Zolotukhin
@ 2012-08-15  8:19             ` Kirill Yukhin
  0 siblings, 0 replies; 14+ messages in thread
From: Kirill Yukhin @ 2012-08-15  8:19 UTC (permalink / raw)
  To: Michael Zolotukhin, Richard Henderson, Uros Bizjak,
	Jakub Jelinek, gcc-patches, H.J. Lu

Hi,
There's white paper [1] available, which explains usage of  MULX/ADCX/ADOX

[1] - http://download.intel.com/embedded/processor/whitepaper/327831.pdf

Thanks, K

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2012-08-15  8:19 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-31 11:51 [PATCH] Intrinsics for ADCX Michael Zolotukhin
2012-07-31 13:26 ` Uros Bizjak
2012-07-31 16:24 ` Richard Henderson
2012-08-01 16:37   ` Kirill Yukhin
2012-08-03 13:24     ` Michael Zolotukhin
2012-08-03 13:52       ` Uros Bizjak
2012-08-03 14:17         ` Michael Zolotukhin
2012-08-08  5:34         ` Michael Zolotukhin
2012-08-08 13:27           ` Kirill Yukhin
2012-08-09 12:22     ` Michael Zolotukhin
2012-08-09 14:26       ` Richard Henderson
2012-08-09 14:36         ` Kirill Yukhin
2012-08-10  6:19           ` Michael Zolotukhin
2012-08-15  8:19             ` Kirill Yukhin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).