public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* PATCH: Add LWP support for upcoming AMD Orochi processor.
@ 2009-10-09  1:13 Harsha Jagasia
  2009-10-09 10:07 ` Jakub Jelinek
  2009-11-05  9:32 ` Jakub Jelinek
  0 siblings, 2 replies; 32+ messages in thread
From: Harsha Jagasia @ 2009-10-09  1:13 UTC (permalink / raw)
  To: gcc-patches, hubicka, rth, dwarak.rajagopal, christophe.harle,
	ubizjak, jakub, Harsha Jagasia
  Cc: Harsha Jagasia

Hello,

> > > I think the easiest would be to use (unspec_volatile ...
> > > UNSPECV_LWPVAL...)
> > > instead.  Otherwise the insn that doesn't set any register may be
> > > eliminated
> > > as unneeded.
> >
> > Then these instructions should be defined as unspec_volatile. OTOH,
> > perhaps you should introduce new fixed register to hold LWP state and
> > change all instructions to correctly depend on this register. Since
> > LWP state won't be hidden from the compiler, you can define "normal"
> > insn patterns using "set". This will also benefit scheduler and will
> > increase general happiness of the compiler ;)
> 
> Well, with modeling LWP as register, one would need to add explicit USE
> pattern to every instruction that differs in behaviour based on LWP
> state.  From quick glance at specs it seems that it is about every
> instruction.
> 
> I guess LWP should act as full scheduling barrier (so we don't get code
> we want to profile moved before profiling starts or after it finish), so
> unspec_volatile is preferred variant.

I have changed the patterns to use unspec_volatile/UNSPECV_LWPVAL...

> > 	* config/i386/sse.md (lwp_llwpcbhi1): New lwp pattern.
> >	...
> 
> There is nothing SSE specific in these patterns, so I think they
> should go in i386.md.

Done.

Thanks to Honza, Uros and Jakub for the input.
I will check in below (after changes and acceptance of XOP patch),
unless there is further review.

Thanks,
Harsha

-----------------
2009-10-8  Harsha Jagasia  <harsha.jagasia@amd.com>

	* doc/invoke.texi (-mlwp): Add documentation.
	* doc/extend.texi (x86 intrinsics): Add LWP intrinsics.

	* config.gcc (i[34567]86-*-*): Include lwpintrin.h.
	(x86_64-*-*): Ditto.
	* config/i386/lwpintrin.h: New file, provide x86 compiler
	intrinisics for LWP.
	* config/i386/cpuid.h (bit_LWP): Define LWP bit.
	* config/i386/x86intrin.h: Add LWP check and lwpintrin.h.
	* config/i386/i386-c.c(ix86_target_macros_internal): Check
	ISA_FLAG for LWP. 
	* config/i386/i386.h(TARGET_LWP): New macro for LWP.
	* config/i386/i386.opt (-mlwp): New switch for LWP support.

	* config/i386/i386.c (OPTION_MASK_ISA_LWP_SET): New.
	(OPTION_MASK_ISA_LWP_UNSET): New.	
	(ix86_handle_option): Handle -mlwp.
	(isa_opts): Handle -mlwp.
	(enum pta_flags): Add PTA_LWP.
	(override_options): Add LWP support.

	(IX86_BUILTIN_LLWPCB16): New for LWP intrinsic.
	(IX86_BUILTIN_LLWPCB32): Ditto
	(IX86_BUILTIN_LLWPCB64): Ditto
	(IX86_BUILTIN_SLWPCB16): Ditto
	(IX86_BUILTIN_SLWPCB32): Ditto
	(IX86_BUILTIN_SLWPCB64): Ditto
	(IX86_BUILTIN_LWPVAL16): Ditto
	(IX86_BUILTIN_LWPVAL32): Ditto
	(IX86_BUILTIN_LWPVAL64): Ditto
	(IX86_BUILTIN_LWPINS16): Ditto
	(IX86_BUILTIN_LWPINS32): Ditto
	(IX86_BUILTIN_LWPINS64): Ditto

	(enum  ix86_special_builtin_type): Add LWP intrinsic support.
	(builtin_description): Ditto.
	(ix86_init_mmx_sse_builtins): Ditto.
	(ix86_expand_special_args_builtin): Ditto.

	* config/i386/i386.md (UNSPEC_LLWP_INTRINSIC):
	(UNSPEC_SLWP_INTRINSIC):
	(UNSPECV_LWPVAL_INTRINSIC):
	(UNSPECV_LWPINS_INTRINSIC): Add new UNSPEC for LWP support.
	(lwp_llwpcbhi1): New lwp pattern.
	(lwp_llwpcbsi1): Ditto.
	(lwp_llwpcbdi1): Ditto.
	(lwp_slwpcbhi1): Ditto.
	(lwp_slwpcbsi1): Ditto.
	(lwp_slwpcbdi1): Ditto.
	(lwp_lwpvalhi3): Ditto.
	(lwp_lwpvalsi3): Ditto.
	(lwp_lwpvaldi3): Ditto.
	(lwp_lwpinshi3): Ditto.
	(lwp_lwpinssi3): Ditto.
	(lwp_lwpinsdi3): Ditto.


diff -upNw gcc-xop-2/gcc/config.gcc gcc-lwp/gcc/config.gcc
--- gcc-xop-2/gcc/config.gcc	2009-09-30 22:37:58.000000000 -0500
+++ gcc-lwp/gcc/config.gcc	2009-09-30 16:33:28.000000000 -0500
@@ -288,7 +288,7 @@ i[34567]86-*-*)
 		       pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h
 		       nmmintrin.h bmmintrin.h fma4intrin.h wmmintrin.h
 		       immintrin.h x86intrin.h avxintrin.h xopintrin.h
-		       ia32intrin.h cross-stdarg.h"
+		       ia32intrin.h cross-stdarg.h lwpintrin.h"
 	;;
 x86_64-*-*)
 	cpu_type=i386
@@ -298,7 +298,7 @@ x86_64-*-*)
 		       pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h
 		       nmmintrin.h bmmintrin.h fma4intrin.h wmmintrin.h
 		       immintrin.h x86intrin.h avxintrin.h xopintrin.h 
-		       ia32intrin.h cross-stdarg.h"
+		       ia32intrin.h cross-stdarg.h lwpintrin.h"
 	need_64bit_hwint=yes
 	;;
 ia64-*-*)
diff -upNw gcc-xop-2/gcc/doc/extend.texi gcc-lwp/gcc/doc/extend.texi
--- gcc-xop-2/gcc/doc/extend.texi	2009-09-30 22:37:58.000000000 -0500
+++ gcc-lwp/gcc/doc/extend.texi	2009-10-03 11:49:19.000000000 -0500
@@ -3178,6 +3178,11 @@ Enable/disable the generation of the FMA
 @cindex @code{target("xop")} attribute
 Enable/disable the generation of the XOP instructions.
 
+@item lwp
+@itemx no-lwp
+@cindex @code{target("lwp")} attribute
+Enable/disable the generation of the LWP instructions.
+
 @item ssse3
 @itemx no-ssse3
 @cindex @code{target("ssse3")} attribute
@@ -9066,5 +9071,22 @@ v8sf __builtin_ia32_fmsubaddps256 (v8sf,
 
 @end smallexample
 
+The following built-in functions are available when @option{-mlwp} is used.
+
+@smallexample
+void __builtin_ia32_llwpcb16 (void *);
+void __builtin_ia32_llwpcb32 (void *);
+void __builtin_ia32_llwpcb64 (void *);
+void * __builtin_ia32_llwpcb16 (void);
+void * __builtin_ia32_llwpcb32 (void);
+void * __builtin_ia32_llwpcb64 (void);
+void __builtin_ia32_lwpval16 (unsigned short, unsigned int, unsigned short)
+void __builtin_ia32_lwpval32 (unsigned int, unsigned int, unsigned int)
+void __builtin_ia32_lwpval64 (unsigned __int64, unsigned int, unsigned int)
+unsigned char __builtin_ia32_lwpins16 (unsigned short, unsigned int, unsigned short)
+unsigned char __builtin_ia32_lwpins32 (unsigned int, unsigned int, unsigned int)
+unsigned char __builtin_ia32_lwpins64 (unsigned __int64, unsigned int, unsigned int)
+@end smallexample
+
 The following built-in functions are available when @option{-m3dnow} is used.
 All of them generate the machine instruction that is part of the name.

diff -upNw gcc-xop-2/gcc/doc/invoke.texi gcc-lwp/gcc/doc/invoke.texi
--- gcc-xop-2/gcc/doc/invoke.texi	2009-09-30 22:37:58.000000000 -0500
+++ gcc-lwp/gcc/doc/invoke.texi	2009-09-30 16:33:28.000000000 -0500
@@ -592,7 +592,7 @@ Objective-C and Objective-C++ Dialects}.
 -mcld -mcx16 -msahf -mmovbe -mcrc32 -mrecip @gol
 -mmmx  -msse  -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
 -maes -mpclmul @gol
--msse4a -m3dnow -mpopcnt -mabm -mfma4 -mxop @gol
+-msse4a -m3dnow -mpopcnt -mabm -mfma4 -mxop -mlwp @gol
 -mthreads  -mno-align-stringops  -minline-all-stringops @gol
 -minline-stringops-dynamically -mstringop-strategy=@var{alg} @gol
 -mpush-args  -maccumulate-outgoing-args  -m128bit-long-double @gol
@@ -11731,6 +11731,8 @@ preferred alignment to @option{-mpreferr
 @itemx -mno-fma4
 @itemx -mxop
 @itemx -mno-xop
+@itemx -mlwp
+@itemx -mno-lwp
 @itemx -m3dnow
 @itemx -mno-3dnow
 @itemx -mpopcnt
@@ -11745,7 +11747,7 @@ preferred alignment to @option{-mpreferr
 @opindex mno-3dnow
 These switches enable or disable the use of instructions in the MMX,
 SSE, SSE2, SSE3, SSSE3, SSE4.1, AVX, AES, PCLMUL, SSE4A, FMA4, XOP,
-ABM or 3DNow!@: extended instruction sets.
+LWP, ABM or 3DNow!@: extended instruction sets.
 These extensions are also available as built-in functions: see
 @ref{X86 Built-in Functions}, for details of the functions enabled and
 disabled by these switches.
diff -upNw gcc-xop-2/gcc/config/i386/cpuid.h gcc-lwp/gcc/config/i386/cpuid.h
--- gcc-xop-2/gcc/config/i386/cpuid.h	2009-09-30 22:37:58.000000000 -0500
+++ gcc-lwp/gcc/config/i386/cpuid.h	2009-09-30 16:33:28.000000000 -0500
@@ -49,6 +49,7 @@
 #define bit_LAHF_LM	(1 << 0)
 #define bit_SSE4a	(1 << 6)
 #define bit_FMA4	(1 << 16)
+#define bit_LWP 	(1 << 15)
 #define bit_XOP         (1 << 11)
 
 /* %edx */
diff -upNw gcc-xop-2/gcc/config/i386/i386.c gcc-lwp/gcc/config/i386/i386.c
--- gcc-xop-2/gcc/config/i386/i386.c	2009-10-07 15:51:22.000000000 -0500
+++ gcc-lwp/gcc/config/i386/i386.c	2009-10-07 14:33:43.000000000 -0500
@@ -1960,6 +1960,8 @@ static int ix86_isa_flags_explicit;
    | OPTION_MASK_ISA_AVX_SET)
 #define OPTION_MASK_ISA_XOP_SET \
   (OPTION_MASK_ISA_XOP | OPTION_MASK_ISA_FMA4_SET)
+#define OPTION_MASK_ISA_LWP_SET \
+  OPTION_MASK_ISA_LWP
 
 /* AES and PCLMUL need SSE2 because they use xmm registers */
 #define OPTION_MASK_ISA_AES_SET \
@@ -2014,6 +2016,7 @@ static int ix86_isa_flags_explicit;
 #define OPTION_MASK_ISA_FMA4_UNSET \
   (OPTION_MASK_ISA_FMA4 | OPTION_MASK_ISA_XOP_UNSET)
 #define OPTION_MASK_ISA_XOP_UNSET OPTION_MASK_ISA_XOP
+#define OPTION_MASK_ISA_LWP_UNSET OPTION_MASK_ISA_LWP
 
 #define OPTION_MASK_ISA_AES_UNSET OPTION_MASK_ISA_AES
 #define OPTION_MASK_ISA_PCLMUL_UNSET OPTION_MASK_ISA_PCLMUL
@@ -2274,6 +2277,19 @@ ix86_handle_option (size_t code, const c
 	}
       return true;
 
+   case OPT_mlwp:
+      if (value)
+	{
+	  ix86_isa_flags |= OPTION_MASK_ISA_LWP_SET;
+	  ix86_isa_flags_explicit |= OPTION_MASK_ISA_LWP_SET;
+	}
+      else
+	{
+	  ix86_isa_flags &= ~OPTION_MASK_ISA_LWP_UNSET;
+	  ix86_isa_flags_explicit |= OPTION_MASK_ISA_LWP_UNSET;
+	}
+      return true;
+
     case OPT_mabm:
       if (value)
 	{
@@ -2403,6 +2419,7 @@ ix86_target_string (int isa, int flags, 
     { "-m64",		OPTION_MASK_ISA_64BIT },
     { "-mfma4",		OPTION_MASK_ISA_FMA4 },
     { "-mxop",		OPTION_MASK_ISA_XOP },
+    { "-mlwp",		OPTION_MASK_ISA_LWP },
     { "-msse4a",	OPTION_MASK_ISA_SSE4A },
     { "-msse4.2",	OPTION_MASK_ISA_SSE4_2 },
     { "-msse4.1",	OPTION_MASK_ISA_SSE4_1 },
@@ -2634,7 +2651,8 @@ override_options (bool main_args_p)
       PTA_FMA = 1 << 19,
       PTA_MOVBE = 1 << 20,
       PTA_FMA4 = 1 << 21,
-      PTA_XOP = 1 << 22
+      PTA_XOP = 1 << 22,
+      PTA_LWP = 1 << 23
     };
 
   static struct pta
@@ -2983,6 +3001,9 @@ override_options (bool main_args_p)
 	if (processor_alias_table[i].flags & PTA_XOP
 	    && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_XOP))
 	  ix86_isa_flags |= OPTION_MASK_ISA_XOP;
+	if (processor_alias_table[i].flags & PTA_LWP
+	    && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_LWP))
+	  ix86_isa_flags |= OPTION_MASK_ISA_LWP;
 	if (processor_alias_table[i].flags & PTA_ABM
 	    && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_ABM))
 	  ix86_isa_flags |= OPTION_MASK_ISA_ABM;
@@ -3668,6 +3689,7 @@ ix86_valid_target_attribute_inner_p (tre
     IX86_ATTR_ISA ("ssse3",	OPT_mssse3),
     IX86_ATTR_ISA ("fma4",	OPT_mfma4),
     IX86_ATTR_ISA ("xop",	OPT_mxop),
+    IX86_ATTR_ISA ("lwp",	OPT_mlwp),
 
     /* string options */
     IX86_ATTR_STR ("arch=",	IX86_FUNCTION_SPECIFIC_ARCH),
@@ -20987,7 +21009,7 @@ enum ix86_builtins
 
   IX86_BUILTIN_CVTUDQ2PS,
 
-  /* FMA4 instructions.  */
+  /* FMA4 and XOP instructions.  */
   IX86_BUILTIN_VFMADDSS,
   IX86_BUILTIN_VFMADDSD,
   IX86_BUILTIN_VFMADDPS,
@@ -21164,6 +21186,20 @@ enum ix86_builtins
   IX86_BUILTIN_VPCOMFALSEQ,
   IX86_BUILTIN_VPCOMTRUEQ,
 
+  /* LWP instructions.  */
+  IX86_BUILTIN_LLWPCB16,
+  IX86_BUILTIN_LLWPCB32,
+  IX86_BUILTIN_LLWPCB64,
+  IX86_BUILTIN_SLWPCB16,
+  IX86_BUILTIN_SLWPCB32,
+  IX86_BUILTIN_SLWPCB64,
+  IX86_BUILTIN_LWPVAL16,
+  IX86_BUILTIN_LWPVAL32,
+  IX86_BUILTIN_LWPVAL64,
+  IX86_BUILTIN_LWPINS16,
+  IX86_BUILTIN_LWPINS32,
+  IX86_BUILTIN_LWPINS64,
+
   IX86_BUILTIN_MAX
 };
 
@@ -21377,7 +21413,13 @@ enum ix86_special_builtin_type
   VOID_FTYPE_PV8SF_V8SF_V8SF,
   VOID_FTYPE_PV4DF_V4DF_V4DF,
   VOID_FTYPE_PV4SF_V4SF_V4SF,
-  VOID_FTYPE_PV2DF_V2DF_V2DF
+  VOID_FTYPE_PV2DF_V2DF_V2DF,
+  VOID_FTYPE_USHORT_UINT_USHORT,
+  VOID_FTYPE_UINT_UINT_UINT,
+  VOID_FTYPE_UINT64_UINT_UINT,
+  UCHAR_FTYPE_USHORT_UINT_USHORT,
+  UCHAR_FTYPE_UINT_UINT_UINT,
+  UCHAR_FTYPE_UINT64_UINT_UINT
 };
 
 /* Builtin types */
@@ -21624,6 +21666,22 @@ static const struct builtin_description 
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_maskstoreps, "__builtin_ia32_maskstoreps", IX86_BUILTIN_MASKSTOREPS, UNKNOWN, (int) VOID_FTYPE_PV4SF_V4SF_V4SF },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_maskstorepd256, "__builtin_ia32_maskstorepd256", IX86_BUILTIN_MASKSTOREPD256, UNKNOWN, (int) VOID_FTYPE_PV4DF_V4DF_V4DF },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_maskstoreps256, "__builtin_ia32_maskstoreps256", IX86_BUILTIN_MASKSTOREPS256, UNKNOWN, (int) VOID_FTYPE_PV8SF_V8SF_V8SF },
+
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_llwpcbhi1,   "__builtin_ia32_llwpcb16",   IX86_BUILTIN_LLWPCB16,    UNKNOWN,     (int) VOID_FTYPE_VOID },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_llwpcbsi1,   "__builtin_ia32_llwpcb32",   IX86_BUILTIN_LLWPCB32,    UNKNOWN,     (int) VOID_FTYPE_VOID },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_llwpcbdi1,   "__builtin_ia32_llwpcb64",   IX86_BUILTIN_LLWPCB64,    UNKNOWN,     (int) VOID_FTYPE_VOID },
+
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_slwpcbhi1,   "__builtin_ia32_slwpcb16",   IX86_BUILTIN_SLWPCB16,    UNKNOWN,     (int) VOID_FTYPE_VOID },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_slwpcbsi1,   "__builtin_ia32_slwpcb32",   IX86_BUILTIN_SLWPCB32,    UNKNOWN,     (int) VOID_FTYPE_VOID },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_slwpcbdi1,   "__builtin_ia32_slwpcb64",   IX86_BUILTIN_SLWPCB64,    UNKNOWN,     (int) VOID_FTYPE_VOID },
+
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpvalhi3,   "__builtin_ia32_lwpval16", IX86_BUILTIN_LWPVAL16,  UNKNOWN,     (int) VOID_FTYPE_USHORT_UINT_USHORT },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpvalsi3,   "__builtin_ia32_lwpval32", IX86_BUILTIN_LWPVAL64,  UNKNOWN,     (int) VOID_FTYPE_UINT_UINT_UINT },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpvaldi3,   "__builtin_ia32_lwpval64", IX86_BUILTIN_LWPVAL64,  UNKNOWN,     (int) VOID_FTYPE_UINT64_UINT_UINT },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpinshi3,   "__builtin_ia32_lwpins16", IX86_BUILTIN_LWPINS16,  UNKNOWN,     (int) UCHAR_FTYPE_USHORT_UINT_USHORT },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpinssi3,   "__builtin_ia32_lwpins32", IX86_BUILTIN_LWPINS64,  UNKNOWN,     (int) UCHAR_FTYPE_UINT_UINT_UINT },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpinsdi3,   "__builtin_ia32_lwpins64", IX86_BUILTIN_LWPINS64,  UNKNOWN,     (int) UCHAR_FTYPE_UINT64_UINT_UINT },
+
 };
 
 /* Builtins with variable number of arguments.  */
@@ -23282,6 +23340,50 @@ ix86_init_mmx_sse_builtins (void)
 				integer_type_node,
 				NULL_TREE);
 
+  /* LWP instructions.  */
+
+  tree void_ftype_ushort_unsigned_ushort
+    = build_function_type_list (void_type_node,
+				short_unsigned_type_node,
+				unsigned_type_node,
+				short_unsigned_type_node,
+				NULL_TREE);
+
+  tree void_ftype_unsigned_unsigned_unsigned
+    = build_function_type_list (void_type_node,
+				unsigned_type_node,
+				unsigned_type_node,
+				unsigned_type_node,
+				NULL_TREE);
+
+  tree void_ftype_uint64_unsigned_unsigned
+    = build_function_type_list (void_type_node,
+				long_long_unsigned_type_node,
+				unsigned_type_node,
+				unsigned_type_node,
+				NULL_TREE);
+
+  tree uchar_ftype_ushort_unsigned_ushort
+    = build_function_type_list (unsigned_char_type_node,
+				short_unsigned_type_node,
+				unsigned_type_node,
+				short_unsigned_type_node,
+				NULL_TREE);
+
+  tree uchar_ftype_unsigned_unsigned_unsigned
+    = build_function_type_list (unsigned_char_type_node,
+				unsigned_type_node,
+				unsigned_type_node,
+				unsigned_type_node,
+				NULL_TREE);
+
+  tree uchar_ftype_uint64_unsigned_unsigned
+    = build_function_type_list (unsigned_char_type_node,
+				long_long_unsigned_type_node,
+				unsigned_type_node,
+				unsigned_type_node,
+				NULL_TREE);
+
   tree ftype;
 
   /* Add all special builtins with variable number of operands.  */
@@ -23395,6 +23497,25 @@ ix86_init_mmx_sse_builtins (void)
 	case VOID_FTYPE_PV2DF_V2DF_V2DF:
 	  type = void_ftype_pv2df_v2df_v2df;
 	  break;
+	case VOID_FTYPE_USHORT_UINT_USHORT:
+	  type = void_ftype_ushort_unsigned_ushort;
+	  break;
+	case VOID_FTYPE_UINT_UINT_UINT:
+	  type = void_ftype_unsigned_unsigned_unsigned;
+	  break;
+	case VOID_FTYPE_UINT64_UINT_UINT:
+	  type = void_ftype_uint64_unsigned_unsigned;
+	  break;
+	case UCHAR_FTYPE_USHORT_UINT_USHORT:
+	  type = uchar_ftype_ushort_unsigned_ushort;
+	  break;
+	case UCHAR_FTYPE_UINT_UINT_UINT:
+	  type = uchar_ftype_unsigned_unsigned_unsigned;
+	  break;
+	case UCHAR_FTYPE_UINT64_UINT_UINT:
+	  type = uchar_ftype_uint64_unsigned_unsigned;
+	  break;
+
 	default:
 	  gcc_unreachable ();
 	}
@@ -25275,6 +25396,16 @@ ix86_expand_special_args_builtin (const 
       /* Reserve memory operand for target.  */
       memory = ARRAY_SIZE (args);
       break;
+    case VOID_FTYPE_USHORT_UINT_USHORT:
+    case VOID_FTYPE_UINT_UINT_UINT:
+    case VOID_FTYPE_UINT64_UINT_UINT:
+    case UCHAR_FTYPE_USHORT_UINT_USHORT:
+    case UCHAR_FTYPE_UINT_UINT_UINT:
+    case UCHAR_FTYPE_UINT64_UINT_UINT:
+      nargs = 3;
+      klass = store;
+      memory = 0;
+      break;
     default:
       gcc_unreachable ();
     }
diff -upNw gcc-xop-2/gcc/config/i386/i386-c.c gcc-lwp/gcc/config/i386/i386-c.c
--- gcc-xop-2/gcc/config/i386/i386-c.c	2009-09-30 22:37:58.000000000 -0500
+++ gcc-lwp/gcc/config/i386/i386-c.c	2009-09-30 16:33:28.000000000 -0500
@@ -234,6 +234,8 @@ ix86_target_macros_internal (int isa_fla
     def_or_undef (parse_in, "__FMA4__");
   if (isa_flag & OPTION_MASK_ISA_XOP)
     def_or_undef (parse_in, "__XOP__");
+  if (isa_flag & OPTION_MASK_ISA_LWP)
+    def_or_undef (parse_in, "__LWP__");
   if ((fpmath & FPMATH_SSE) && (isa_flag & OPTION_MASK_ISA_SSE))
     def_or_undef (parse_in, "__SSE_MATH__");
   if ((fpmath & FPMATH_SSE) && (isa_flag & OPTION_MASK_ISA_SSE2))
diff -upNw gcc-xop-2/gcc/config/i386/i386.h gcc-lwp/gcc/config/i386/i386.h
--- gcc-xop-2/gcc/config/i386/i386.h	2009-09-30 22:37:58.000000000 -0500
+++ gcc-lwp/gcc/config/i386/i386.h	2009-09-30 16:33:28.000000000 -0500
@@ -56,6 +56,7 @@ see the files COPYING3 and COPYING.RUNTI
 #define TARGET_SSE4A	OPTION_ISA_SSE4A
 #define TARGET_FMA4	OPTION_ISA_FMA4
 #define TARGET_XOP	OPTION_ISA_XOP
+#define TARGET_LWP	OPTION_ISA_LWP
 #define TARGET_ROUND	OPTION_ISA_ROUND
 #define TARGET_ABM	OPTION_ISA_ABM
 #define TARGET_POPCNT	OPTION_ISA_POPCNT
diff -upNw gcc-xop-2/gcc/config/i386/i386.md gcc-lwp/gcc/config/i386/i386.md
--- gcc-xop-2/gcc/config/i386/i386.md	2009-10-07 15:48:45.000000000 -0500
+++ gcc-lwp/gcc/config/i386/i386.md	2009-10-07 12:42:09.000000000 -0500
@@ -204,6 +204,10 @@
    (UNSPEC_XOP_TRUEFALSE	152)
    (UNSPEC_XOP_PERMUTE		153)
    (UNSPEC_FRCZ			154)
+   (UNSPEC_LLWP_INTRINSIC	155)
+   (UNSPEC_SLWP_INTRINSIC	156)
+   (UNSPECV_LWPVAL_INTRINSIC	157)
+   (UNSPECV_LWPINS_INTRINSIC	158)
 
    ; For AES support
    (UNSPEC_AESENC		159)
@@ -352,7 +356,7 @@
    fmov,fop,fsgn,fmul,fdiv,fpspc,fcmov,fcmp,fxch,fistp,fisttp,frndint,
    sselog,sselog1,sseiadd,sseiadd1,sseishft,sseimul,
    sse,ssemov,sseadd,ssemul,ssecmp,ssecomi,ssecvt,ssecvt1,sseicvt,ssediv,sseins,
-   ssemuladd,sse4arg,
+   ssemuladd,sse4arg,lwp,
    mmx,mmxmov,mmxadd,mmxmul,mmxcmp,mmxcvt,mmxshft"
   (const_string "other"))
 
@@ -22731,6 +22735,120 @@
   [(set_attr "type" "other")
    (set_attr "length" "3")])
 
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;;
+;; LWP instructions
+;;
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+(define_insn "lwp_llwpcbhi1"
+  [(unspec [(match_operand:HI 0 "register_operand" "r")]
+  	   UNSPEC_LLWP_INTRINSIC)]
+  "TARGET_LWP"
+  "llwpcb\t%0"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "HI")])
+
+(define_insn "lwp_llwpcbsi1"
+  [(unspec [(match_operand:SI 0 "register_operand" "r")]
+  	   UNSPEC_LLWP_INTRINSIC)]
+  "TARGET_LWP"
+  "llwpcb\t%0"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "SI")])
+
+(define_insn "lwp_llwpcbdi1"
+  [(unspec [(match_operand:DI 0 "register_operand" "r")]
+  	   UNSPEC_LLWP_INTRINSIC)]
+  "TARGET_LWP"
+  "llwpcb\t%0"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "DI")])
+
+(define_insn "lwp_slwpcbhi1"
+  [(unspec [(match_operand:HI 0 "register_operand" "r")]
+  	   UNSPEC_SLWP_INTRINSIC)]
+  "TARGET_LWP"
+  "slwpcb\t%0"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "HI")])
+
+(define_insn "lwp_slwpcbsi1"
+  [(unspec [(match_operand:SI 0 "register_operand" "r")]
+  	   UNSPEC_SLWP_INTRINSIC)]
+  "TARGET_LWP"
+  "slwpcb\t%0"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "SI")])
+
+(define_insn "lwp_slwpcbdi1"
+  [(unspec [(match_operand:DI 0 "register_operand" "r")]
+  	   UNSPEC_SLWP_INTRINSIC)]
+  "TARGET_LWP"
+  "slwpcb\t%0"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "DI")])
+
+(define_insn "lwp_lwpvalhi3"
+  [(unspec_volatile [(match_operand:HI 0 "register_operand" "r")
+  	   	     (match_operand:SI 1 "nonimmediate_operand" "rm")
+	   	     (match_operand:HI 2 "const_int_operand" "")]
+  	   	    UNSPECV_LWPVAL_INTRINSIC)]
+  "TARGET_LWP"
+  "lwpval\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "HI")])
+
+(define_insn "lwp_lwpvalsi3"
+  [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")
+    	    	     (match_operand:SI 1 "nonimmediate_operand" "rm")
+	    	     (match_operand:SI 2 "const_int_operand" "")]
+		    UNSPECV_LWPVAL_INTRINSIC)]
+  "TARGET_LWP"
+  "lwpval\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "SI")])
+
+(define_insn "lwp_lwpvaldi3"
+  [(unspec_volatile [(match_operand:DI 0 "register_operand" "r")
+  		     (match_operand:SI 1 "nonimmediate_operand" "rm")
+		     (match_operand:SI 2 "const_int_operand" "")]
+		    UNSPECV_LWPVAL_INTRINSIC)]
+  "TARGET_LWP"
+  "lwpval\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "DI")])
+
+(define_insn "lwp_lwpinshi3"
+  [(unspec_volatile [(match_operand:HI 0 "register_operand" "r")
+  		     (match_operand:SI 1 "nonimmediate_operand" "rm")
+		     (match_operand:HI 2 "const_int_operand" "")]
+		    UNSPECV_LWPINS_INTRINSIC)]
+  "TARGET_LWP"
+  "lwpins\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "HI")])
+
+(define_insn "lwp_lwpinssi3"
+  [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")
+  		     (match_operand:SI 1 "nonimmediate_operand" "rm")
+		     (match_operand:SI 2 "const_int_operand" "")]
+		    UNSPECV_LWPINS_INTRINSIC)]
+  "TARGET_LWP"
+  "lwpins\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "SI")])
+
+(define_insn "lwp_lwpinsdi3"
+  [(unspec_volatile [(match_operand:DI 0 "register_operand" "r")
+  		     (match_operand:SI 1 "nonimmediate_operand" "rm")
+		     (match_operand:SI 2 "const_int_operand" "")]
+		    UNSPECV_LWPINS_INTRINSIC)]
+  "TARGET_LWP"
+  "lwpins\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "DI")])
+
 (include "mmx.md")
 (include "sse.md")
 (include "sync.md")
diff -upNw gcc-xop-2/gcc/config/i386/i386.opt gcc-lwp/gcc/config/i386/i386.opt
--- gcc-xop-2/gcc/config/i386/i386.opt	2009-09-30 22:37:58.000000000 -0500
+++ gcc-lwp/gcc/config/i386/i386.opt	2009-09-30 16:33:28.000000000 -0500
@@ -318,6 +318,10 @@ mxop
 Target Report Mask(ISA_XOP) Var(ix86_isa_flags) VarExists Save
 Support XOP built-in functions and code generation 
 
+mlwp
+Target Report Mask(ISA_LWP) Var(ix86_isa_flags) VarExists Save
+Support LWP built-in functions and code generation 
+
 mabm
 Target Report Mask(ISA_ABM) Var(ix86_isa_flags) VarExists Save
 Support code generation of Advanced Bit Manipulation (ABM) instructions.
diff -upNw gcc-xop-2/gcc/config/i386/lwpintrin.h gcc-lwp/gcc/config/i386/lwpintrin.h
--- gcc-xop-2/gcc/config/i386/lwpintrin.h	1969-12-31 18:00:00.000000000 -0600
+++ gcc-lwp/gcc/config/i386/lwpintrin.h	2009-10-02 18:30:09.000000000 -0500
@@ -0,0 +1,109 @@
+/* Copyright (C) 2007, 2008, 2009 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _X86INTRIN_H_INCLUDED
+# error "Never use <lwpintrin.h> directly; include <x86intrin.h> instead."
+#endif
+
+#ifndef _LWPINTRIN_H_INCLUDED
+#define _LWPINTRIN_H_INCLUDED
+
+#ifndef __LWP__
+# error "LWP instruction set not enabled"
+#else
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__llwpcb16 (void *pcbAddress)
+{
+  __builtin_ia32_llwpcb16 (pcbAddress);
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__llwpcb32 (void *pcbAddress)
+{
+  __builtin_ia32_llwpcb32 (pcbAddress);
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__llwpcb64 (void *pcbAddress)
+{
+  __builtin_ia32_llwpcb64 (pcbAddress);
+}
+
+extern __inline void * __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__slwpcb16 (void)
+{
+  return __builtin_ia32_slwpcb16 ();
+}
+
+extern __inline void * __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__slwpcb32 (void)
+{
+  return __builtin_ia32_slwpcb32 ();
+}
+
+extern __inline void * __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__slwpcb64 (void)
+{
+  return __builtin_ia32_slwpcb64 ();
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__lwpval16 (unsigned short data2, unsigned int data1, unsigned short flags)
+{
+  __builtin_ia32_lwpval16 (data2, data1, flags);
+}
+/*
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__lwpval32 (unsigned int data2, unsigned int data1, unsigned int flags)
+{
+  __builtin_ia32_lwpval32 (data2, data1, flags);
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__lwpval64 (unsigned __int64 data2, unsigned int data1, unsigned int flags)
+{
+  __builtin_ia32_lwpval64 (data2, data1, flags);
+}
+
+extern __inline unsigned char __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__lwpins16 (unsigned short data2, unsigned int data1, unsigned short flags)
+{
+  return __builtin_ia32_lwpins16 (data2, data1, flags);
+}
+
+extern __inline unsigned char __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__lwpins32 (unsigned int data2, unsigned int data1, unsigned int flags)
+{
+  return __builtin_ia32_lwpins32 (data2, data1, flags);
+}
+
+extern __inline unsigned char __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__lwpins64 (unsigned __int64 data2, unsigned int data1, unsigned int flags)
+{
+  return __builtin_ia32_lwpins64 (data2, data1, flags);
+}
+*/
+#endif /* __LWP__ */
+
+#endif /* _LWPINTRIN_H_INCLUDED */
diff -upNw gcc-xop-2/gcc/config/i386/x86intrin.h gcc-lwp/gcc/config/i386/x86intrin.h
--- gcc-xop-2/gcc/config/i386/x86intrin.h	2009-09-30 22:37:58.000000000 -0500
+++ gcc-lwp/gcc/config/i386/x86intrin.h	2009-09-30 16:33:28.000000000 -0500
@@ -62,6 +62,10 @@
 #include <xopintrin.h>
 #endif
 
+#ifdef __LWP__
+#include <lwpintrin.h>
+#endif
+
 #if defined (__AES__) || defined (__PCLMUL__)
 #include <wmmintrin.h>
 #endif

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: PATCH: Add LWP support for upcoming AMD Orochi processor.
@ 2009-10-09  2:12 Ross Ridge
  0 siblings, 0 replies; 32+ messages in thread
From: Ross Ridge @ 2009-10-09  2:12 UTC (permalink / raw)
  To: gcc-patches

Harsha Jagasia writes:
>+(define_insn "lwp_lwpinshi3"
>+  [(unspec_volatile [(match_operand:HI 0 "register_operand" "r")
>+  		     (match_operand:SI 1 "nonimmediate_operand" "rm")
>+		     (match_operand:HI 2 "const_int_operand" "")]
>+		    UNSPECV_LWPINS_INTRINSIC)]
>+  "TARGET_LWP"
>+  "lwpins\t{%2, %1, %0|%0, %1, %2}"
>+  [(set_attr "type" "lwp")
>+   (set_attr "mode" "HI")])
>+

The LWPINS instruction is documented as setting the carry flag (CF),
and I think this value is supposed to be returned to caller, given the
return type of the intrinsics.

					Ross Ridge

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: PATCH: Add LWP support for upcoming AMD Orochi processor.
@ 2009-10-01  7:51 Uros Bizjak
  2009-10-01 10:09 ` Jan Hubicka
  0 siblings, 1 reply; 32+ messages in thread
From: Uros Bizjak @ 2009-10-01  7:51 UTC (permalink / raw)
  To: gcc-patches
  Cc: Harsha Jagasia, hubicka, rth, dwarak.rajagopal, christophe.harle

Hello!

> This patch is for LWP instruction set support for gcc 4.5 for the
> upcoming AMD Orochi processor. Please see the AMD spec for the LWP
> ISA at http://support.amd.com/us/Processor_TechDocs/43724.pdf
>
> One of the issues I am hoping the maintainers can give guidance on:
>
> - Currently the code for the lwpval and lwpins instructions is
> commented out. These instructions are different from typical
> instructions in that they have no destination register
> (please see the spec). I am not sure how to repesent the patterns
> for the same and would appreciate some input.

Then these instructions should be defined as unspec_volatile. OTOH,
perhaps you should introduce new fixed register to hold LWP state and
change all instructions to correctly depend on this register. Since
LWP state won't be hidden from the compiler, you can define "normal"
insn patterns using "set". This will also benefit scheduler and will
increase general happiness of the compiler ;)

You can just look at FPCR and FPSR handling in i386.md and their
definition in i386.h.

> 	* config/i386/sse.md (lwp_llwpcbhi1): New lwp pattern.
>	...

There is nothing SSE specific in these patterns, so I think they
should go in i386.md.

Uros.

^ permalink raw reply	[flat|nested] 32+ messages in thread
* PATCH: Add LWP support for upcoming AMD Orochi processor.
@ 2009-10-01  4:06 Harsha Jagasia
  2009-10-01  6:30 ` Jakub Jelinek
  0 siblings, 1 reply; 32+ messages in thread
From: Harsha Jagasia @ 2009-10-01  4:06 UTC (permalink / raw)
  To: Harsha Jagasia, gcc-patches, hubicka, rth, dwarak.rajagopal,
	christophe.harle
  Cc: Harsha Jagasia

This patch is for LWP instruction set support for gcc 4.5 for the
upcoming AMD Orochi processor. Please see the AMD spec for the LWP
ISA at http://support.amd.com/us/Processor_TechDocs/43724.pdf

We are still in the process of wrapping up the LWP binutils work
and expect it to be checked in during stage 3.

The attached patch is based on the latest trunk and bootstrap
and target tests pass. A full make check is still running.
I will update the list with the results of make check, but I
wanted to send the patch out so that the reviewers can look at it.

One of the issues I am hoping the maintainers can give guidance
on:

- Currently the code for the lwpval and lwpins instructions is
commented out. These instructions are different from typical
instructions in that they have no destination register
(please see the spec). I am not sure how to repesent the patterns
for the same and would appreciate some input.

Thanks in advance.


2009-09-29  Harsha Jagasia  <harsha.jagasia@amd.com>

	* doc/invoke.texi (-mlwp): Add documentation.
	* doc/extend.texi (x86 intrinsics): Add LWP intrinsics.

	* config.gcc (i[34567]86-*-*): Include lwpintrin.h.
	(x86_64-*-*): Ditto.
	* config/i386/lwpintrin.h: New file, provide x86 compiler
	intrinisics for LWP.
	* config/i386/cpuid.h (bit_LWP): Define LWP bit.
	* config/i386/x86intrin.h: Add LWP check and lwpintrin.h.
	* config/i386/i386-c.c(ix86_target_macros_internal): Check
	ISA_FLAG for LWP. 
	* config/i386/i386.h(TARGET_LWP): New macro for LWP.
	* config/i386/i386.opt (-mlwp): New switch for LWP support.

	* config/i386/i386.c (OPTION_MASK_ISA_LWP_SET): New.
	(OPTION_MASK_ISA_LWP_UNSET): New.	
	(ix86_handle_option): Handle -mlwp.
	(isa_opts): Handle -mlwp.
	(enum pta_flags): Add PTA_LWP.
	(override_options): Add LWP support.

	(IX86_BUILTIN_LLWPCB16): New for LWP intrinsic.
	(IX86_BUILTIN_LLWPCB32): Ditto
	(IX86_BUILTIN_LLWPCB64): Ditto
	(IX86_BUILTIN_SLWPCB16): Ditto
	(IX86_BUILTIN_SLWPCB32): Ditto
	(IX86_BUILTIN_SLWPCB64): Ditto
	(IX86_BUILTIN_LWPVAL16): Ditto
	(IX86_BUILTIN_LWPVAL32): Ditto
	(IX86_BUILTIN_LWPVAL64): Ditto
	(IX86_BUILTIN_LWPINS16): Ditto
	(IX86_BUILTIN_LWPINS32): Ditto
	(IX86_BUILTIN_LWPINS64): Ditto

	(enum  ix86_builtin_type): Add LWP intrinsic support.
	(builtin_description): Ditto.
	(ix86_init_mmx_sse_builtins): Ditto.
	(ix86_expand_args_builtin): Ditto.

	* config/i386/i386.md (UNSPEC_LLWP_INTRINSIC):
	(UNSPEC_SLWP_INTRINSIC):
	(UNSPEC_LWPVAL_INTRINSIC):
	(UNSPEC_LWPINS_INTRINSIC): Add new UNSPEC for LWP support.

	* config/i386/sse.md (lwp_llwpcbhi1): New lwp pattern.
	(lwp_llwpcbsi1): Ditto.
	(lwp_llwpcbdi1): Ditto.
	(lwp_slwpcbhi1): Ditto.
	(lwp_slwpcbsi1): Ditto.
	(lwp_slwpcbdi1): Ditto.
	(lwp_lwpvalhi3): Ditto.
	(lwp_lwpvalsi3): Ditto.
	(lwp_lwpvaldi3): Ditto.
	(lwp_lwpinshi3): Ditto.
	(lwp_lwpinssi3): Ditto.
	(lwp_lwpinsdi3): Ditto.



diff -upNw gcc-xop-2/gcc/config.gcc gcc-lwp/gcc/config.gcc
--- gcc-xop-2/gcc/config.gcc	2009-09-30 14:12:36.000000000 -0500
+++ gcc-lwp/gcc/config.gcc	2009-09-30 16:33:28.000000000 -0500
@@ -288,7 +288,7 @@ i[34567]86-*-*)
 		       pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h
 		       nmmintrin.h bmmintrin.h fma4intrin.h wmmintrin.h
 		       immintrin.h x86intrin.h avxintrin.h xopintrin.h
-		       ia32intrin.h cross-stdarg.h"
+		       ia32intrin.h cross-stdarg.h lwpintrin.h"
 	;;
 x86_64-*-*)
 	cpu_type=i386
@@ -298,7 +298,7 @@ x86_64-*-*)
 		       pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h
 		       nmmintrin.h bmmintrin.h fma4intrin.h wmmintrin.h
 		       immintrin.h x86intrin.h avxintrin.h xopintrin.h 
-		       ia32intrin.h cross-stdarg.h"
+		       ia32intrin.h cross-stdarg.h lwpintrin.h"
 	need_64bit_hwint=yes
 	;;
 ia64-*-*)
diff -upNw gcc-xop-2/gcc/doc/extend.texi gcc-lwp/gcc/doc/extend.texi
--- gcc-xop-2/gcc/doc/extend.texi	2009-09-29 19:41:02.000000000 -0500
+++ gcc-lwp/gcc/doc/extend.texi	2009-09-30 16:33:28.000000000 -0500
@@ -3178,6 +3178,11 @@ Enable/disable the generation of the FMA
 @cindex @code{target("xop")} attribute
 Enable/disable the generation of the XOP instructions.
 
+@item lwp
+@itemx no-lwp
+@cindex @code{target("lwp")} attribute
+Enable/disable the generation of the LWP instructions.
+
 @item ssse3
 @itemx no-ssse3
 @cindex @code{target("ssse3")} attribute
@@ -9066,5 +9071,22 @@ v8sf __builtin_ia32_fmsubaddps256 (v8sf,
 
 @end smallexample
 
+The following built-in functions are available when @option{-mlwp} is used.
+
+@smallexample
+void __builtin_ia32_llwpcb16 (void *);
+void __builtin_ia32_llwpcb32 (void *);
+void __builtin_ia32_llwpcb64 (void *);
+void * __builtin_ia32_llwpcb16 (void);
+void * __builtin_ia32_llwpcb32 (void);
+void * __builtin_ia32_llwpcb64 (void);
+@c void __builtin_ia32_lwpval16 (unsigned short, unsigned int, unsigned short)
+@c void __builtin_ia32_lwpval32 (unsigned int, unsigned int, unsigned int)
+@c void __builtin_ia32_lwpval64 (unsigned __int64, unsigned int, unsigned int)
+@c unsigned char __builtin_ia32_lwpins16 (unsigned short, unsigned int, unsigned short)
+@c unsigned char __builtin_ia32_lwpins32 (unsigned int, unsigned int, unsigned int)
+@c unsigned char __builtin_ia32_lwpins64 (unsigned __int64, unsigned int, unsigned int)
+@end smallexample
+
 The following built-in functions are available when @option{-m3dnow} is used.
 All of them generate the machine instruction that is part of the name.

diff -upNw gcc-xop-2/gcc/doc/invoke.texi gcc-lwp/gcc/doc/invoke.texi
--- gcc-xop-2/gcc/doc/invoke.texi	2009-09-29 19:41:02.000000000 -0500
+++ gcc-lwp/gcc/doc/invoke.texi	2009-09-30 16:33:28.000000000 -0500
@@ -592,7 +592,7 @@ Objective-C and Objective-C++ Dialects}.
 -mcld -mcx16 -msahf -mmovbe -mcrc32 -mrecip @gol
 -mmmx  -msse  -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
 -maes -mpclmul @gol
--msse4a -m3dnow -mpopcnt -mabm -mfma4 -mxop @gol
+-msse4a -m3dnow -mpopcnt -mabm -mfma4 -mxop -mlwp @gol
 -mthreads  -mno-align-stringops  -minline-all-stringops @gol
 -minline-stringops-dynamically -mstringop-strategy=@var{alg} @gol
 -mpush-args  -maccumulate-outgoing-args  -m128bit-long-double @gol
@@ -11731,6 +11731,8 @@ preferred alignment to @option{-mpreferr
 @itemx -mno-fma4
 @itemx -mxop
 @itemx -mno-xop
+@itemx -mlwp
+@itemx -mno-lwp
 @itemx -m3dnow
 @itemx -mno-3dnow
 @itemx -mpopcnt
@@ -11745,7 +11747,7 @@ preferred alignment to @option{-mpreferr
 @opindex mno-3dnow
 These switches enable or disable the use of instructions in the MMX,
 SSE, SSE2, SSE3, SSSE3, SSE4.1, AVX, AES, PCLMUL, SSE4A, FMA4, XOP,
-ABM or 3DNow!@: extended instruction sets.
+LWP, ABM or 3DNow!@: extended instruction sets.
 These extensions are also available as built-in functions: see
 @ref{X86 Built-in Functions}, for details of the functions enabled and
 disabled by these switches.
diff -upNw gcc-xop-2/gcc/config/i386/cpuid.h gcc-lwp/gcc/config/i386/cpuid.h
--- gcc-xop-2/gcc/config/i386/cpuid.h	2009-09-29 19:41:02.000000000 -0500
+++ gcc-lwp/gcc/config/i386/cpuid.h	2009-09-30 16:33:28.000000000 -0500
@@ -49,6 +49,7 @@
 #define bit_LAHF_LM	(1 << 0)
 #define bit_SSE4a	(1 << 6)
 #define bit_FMA4	(1 << 16)
+#define bit_LWP 	(1 << 15)
 #define bit_XOP         (1 << 11)
 
 /* %edx */
diff -upNw gcc-xop-2/gcc/config/i386/i386.c gcc-lwp/gcc/config/i386/i386.c
--- gcc-xop-2/gcc/config/i386/i386.c	2009-09-29 19:41:03.000000000 -0500
+++ gcc-lwp/gcc/config/i386/i386.c	2009-09-30 16:33:28.000000000 -0500
@@ -1960,6 +1960,8 @@ static int ix86_isa_flags_explicit;
    | OPTION_MASK_ISA_AVX_SET)
 #define OPTION_MASK_ISA_XOP_SET \
   (OPTION_MASK_ISA_XOP | OPTION_MASK_ISA_FMA4_SET)
+#define OPTION_MASK_ISA_LWP_SET \
+  OPTION_MASK_ISA_LWP
 
 /* AES and PCLMUL need SSE2 because they use xmm registers */
 #define OPTION_MASK_ISA_AES_SET \
@@ -2014,6 +2016,7 @@ static int ix86_isa_flags_explicit;
 #define OPTION_MASK_ISA_FMA4_UNSET \
   (OPTION_MASK_ISA_FMA4 | OPTION_MASK_ISA_XOP_UNSET)
 #define OPTION_MASK_ISA_XOP_UNSET OPTION_MASK_ISA_XOP
+#define OPTION_MASK_ISA_LWP_UNSET OPTION_MASK_ISA_LWP
 
 #define OPTION_MASK_ISA_AES_UNSET OPTION_MASK_ISA_AES
 #define OPTION_MASK_ISA_PCLMUL_UNSET OPTION_MASK_ISA_PCLMUL
@@ -2274,6 +2277,19 @@ ix86_handle_option (size_t code, const c
 	}
       return true;
 
+   case OPT_mlwp:
+      if (value)
+	{
+	  ix86_isa_flags |= OPTION_MASK_ISA_LWP_SET;
+	  ix86_isa_flags_explicit |= OPTION_MASK_ISA_LWP_SET;
+	}
+      else
+	{
+	  ix86_isa_flags &= ~OPTION_MASK_ISA_LWP_UNSET;
+	  ix86_isa_flags_explicit |= OPTION_MASK_ISA_LWP_UNSET;
+	}
+      return true;
+
     case OPT_mabm:
       if (value)
 	{
@@ -2403,6 +2419,7 @@ ix86_target_string (int isa, int flags, 
     { "-m64",		OPTION_MASK_ISA_64BIT },
     { "-mfma4",		OPTION_MASK_ISA_FMA4 },
     { "-mxop",		OPTION_MASK_ISA_XOP },
+    { "-mlwp",		OPTION_MASK_ISA_LWP },
     { "-msse4a",	OPTION_MASK_ISA_SSE4A },
     { "-msse4.2",	OPTION_MASK_ISA_SSE4_2 },
     { "-msse4.1",	OPTION_MASK_ISA_SSE4_1 },
@@ -2634,7 +2651,8 @@ override_options (bool main_args_p)
       PTA_FMA = 1 << 19,
       PTA_MOVBE = 1 << 20,
       PTA_FMA4 = 1 << 21,
-      PTA_XOP = 1 << 22
+      PTA_XOP = 1 << 22,
+      PTA_LWP = 1 << 23
     };
 
   static struct pta
@@ -2983,6 +3001,9 @@ override_options (bool main_args_p)
 	if (processor_alias_table[i].flags & PTA_XOP
 	    && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_XOP))
 	  ix86_isa_flags |= OPTION_MASK_ISA_XOP;
+	if (processor_alias_table[i].flags & PTA_LWP
+	    && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_LWP))
+	  ix86_isa_flags |= OPTION_MASK_ISA_LWP;
 	if (processor_alias_table[i].flags & PTA_ABM
 	    && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_ABM))
 	  ix86_isa_flags |= OPTION_MASK_ISA_ABM;
@@ -3668,6 +3689,7 @@ ix86_valid_target_attribute_inner_p (tre
     IX86_ATTR_ISA ("ssse3",	OPT_mssse3),
     IX86_ATTR_ISA ("fma4",	OPT_mfma4),
     IX86_ATTR_ISA ("xop",	OPT_mxop),
+    IX86_ATTR_ISA ("lwp",	OPT_mlwp),
 
     /* string options */
     IX86_ATTR_STR ("arch=",	IX86_FUNCTION_SPECIFIC_ARCH),
@@ -20987,7 +21009,7 @@ enum ix86_builtins
 
   IX86_BUILTIN_CVTUDQ2PS,
 
-  /* FMA4 instructions.  */
+  /* FMA4 and XOP instructions.  */
   IX86_BUILTIN_VFMADDSS,
   IX86_BUILTIN_VFMADDSD,
   IX86_BUILTIN_VFMADDPS,
@@ -21164,6 +21186,23 @@ enum ix86_builtins
   IX86_BUILTIN_VPCOMFALSEQ,
   IX86_BUILTIN_VPCOMTRUEQ,
 
+  /* LWP instructions.  */
+  IX86_BUILTIN_LLWPCB16,
+  IX86_BUILTIN_LLWPCB32,
+  IX86_BUILTIN_LLWPCB64,
+  IX86_BUILTIN_SLWPCB16,
+  IX86_BUILTIN_SLWPCB32,
+  IX86_BUILTIN_SLWPCB64,
+
+  /*
+  IX86_BUILTIN_LWPVAL16,
+  IX86_BUILTIN_LWPVAL32,
+  IX86_BUILTIN_LWPVAL64,
+  IX86_BUILTIN_LWPINS16,
+  IX86_BUILTIN_LWPINS32,
+  IX86_BUILTIN_LWPINS64,
+  */
+
   IX86_BUILTIN_MAX
 };
 
@@ -21540,7 +21579,13 @@ enum ix86_builtin_type
   V1DI2DI_FTYPE_V1DI_V1DI_INT,
   V2DF_FTYPE_V2DF_V2DF_INT,
   V2DI_FTYPE_V2DI_UINT_UINT,
-  V2DI_FTYPE_V2DI_V2DI_UINT_UINT
+  V2DI_FTYPE_V2DI_V2DI_UINT_UINT,
+  VOID_FTYPE_USHORT_UINT_USHORT,
+  VOID_FTYPE_UINT_UINT_UINT,
+  VOID_FTYPE_UINT64_UINT_UINT,
+  UCHAR_FTYPE_USHORT_UINT_USHORT,
+  UCHAR_FTYPE_UINT_UINT_UINT,
+  UCHAR_FTYPE_UINT64_UINT_UINT
 };
 
 /* Special builtins with variable number of arguments.  */
@@ -22237,7 +22282,7 @@ static const struct builtin_description 
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_movmskps256, "__builtin_ia32_movmskps256", IX86_BUILTIN_MOVMSKPS256, UNKNOWN, (int) INT_FTYPE_V8SF },
 };
 
-/* FMA4 and XOP.  */
+/* FMA4, XOP.  */
 enum multi_arg_type {
   MULTI_ARG_UNKNOWN,
   MULTI_ARG_3_SF,
@@ -22484,6 +22529,23 @@ static const struct builtin_description 
   { OPTION_MASK_ISA_XOP, CODE_FOR_xop_pcom_tfv4si3,      "__builtin_ia32_vpcomtrueud", IX86_BUILTIN_VPCOMTRUEUD, (enum rtx_code) PCOM_TRUE,    (int)MULTI_ARG_2_SI_TF },
   { OPTION_MASK_ISA_XOP, CODE_FOR_xop_pcom_tfv2di3,      "__builtin_ia32_vpcomtrueuq", IX86_BUILTIN_VPCOMTRUEUQ, (enum rtx_code) PCOM_TRUE,    (int)MULTI_ARG_2_DI_TF },
 
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_llwpcbhi1,            "__builtin_ia32_llwpcb16",   IX86_BUILTIN_LLWPCB16,    UNKNOWN,     (int) VOID_FTYPE_VOID },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_llwpcbsi1,            "__builtin_ia32_llwpcb32",   IX86_BUILTIN_LLWPCB32,    UNKNOWN,     (int) VOID_FTYPE_VOID },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_llwpcbdi1,            "__builtin_ia32_llwpcb64",   IX86_BUILTIN_LLWPCB64,    UNKNOWN,     (int) VOID_FTYPE_VOID },
+
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_slwpcbhi1,            "__builtin_ia32_slwpcb16",   IX86_BUILTIN_SLWPCB16,    UNKNOWN,     (int) VOID_FTYPE_VOID },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_slwpcbsi1,            "__builtin_ia32_slwpcb32",   IX86_BUILTIN_SLWPCB32,    UNKNOWN,     (int) VOID_FTYPE_VOID },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_slwpcbdi1,            "__builtin_ia32_slwpcb64",   IX86_BUILTIN_SLWPCB64,    UNKNOWN,     (int) VOID_FTYPE_VOID },
+
+  /*
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpvalhi3,          "__builtin_ia32_lwpval16", IX86_BUILTIN_LWPVAL16,  UNKNOWN,     (int) VOID_FTYPE_USHORT_UINT_USHORT },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpvalsi3,          "__builtin_ia32_lwpval32", IX86_BUILTIN_LWPVAL64,  UNKNOWN,     (int) VOID_FTYPE_UINT_UINT_UINT },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpvaldi3,          "__builtin_ia32_lwpval64", IX86_BUILTIN_LWPVAL64,  UNKNOWN,     (int) VOID_FTYPE_UINT64_UINT_UINT },
+
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpinshi3,          "__builtin_ia32_lwpins16", IX86_BUILTIN_LWPINS16,  UNKNOWN,     (int) UCHAR_FTYPE_USHORT_UINT_USHORT },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpinssi3,          "__builtin_ia32_lwpins32", IX86_BUILTIN_LWPINS64,  UNKNOWN,     (int) UCHAR_FTYPE_UINT_UINT_UINT },
+  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpinsdi3,          "__builtin_ia32_lwpins64", IX86_BUILTIN_LWPINS64,  UNKNOWN,     (int) UCHAR_FTYPE_UINT64_UINT_UINT },
+  */
 };
 
 /* Set up all the MMX/SSE builtins, even builtins for instructions that are not
@@ -23253,6 +23315,50 @@ ix86_init_mmx_sse_builtins (void)
 				float_type_node,
 				NULL_TREE);
 
+  /* LWP instructions.  */
+
+  tree void_ftype_ushort_unsigned_ushort
+    = build_function_type_list (void_type_node,
+				short_unsigned_type_node,
+				unsigned_type_node,
+				short_unsigned_type_node,
+				NULL_TREE);
+
+  tree void_ftype_unsigned_unsigned_unsigned
+    = build_function_type_list (void_type_node,
+				unsigned_type_node,
+				unsigned_type_node,
+				unsigned_type_node,
+				NULL_TREE);
+
+  tree void_ftype_uint64_unsigned_unsigned
+    = build_function_type_list (void_type_node,
+				long_long_unsigned_type_node,
+				unsigned_type_node,
+				unsigned_type_node,
+				NULL_TREE);
+
+  tree uchar_ftype_ushort_unsigned_ushort
+    = build_function_type_list (unsigned_char_type_node,
+				short_unsigned_type_node,
+				unsigned_type_node,
+				short_unsigned_type_node,
+				NULL_TREE);
+
+  tree uchar_ftype_unsigned_unsigned_unsigned
+    = build_function_type_list (unsigned_char_type_node,
+				unsigned_type_node,
+				unsigned_type_node,
+				unsigned_type_node,
+				NULL_TREE);
+
+  tree uchar_ftype_uint64_unsigned_unsigned
+    = build_function_type_list (unsigned_char_type_node,
+				long_long_unsigned_type_node,
+				unsigned_type_node,
+				unsigned_type_node,
+				NULL_TREE);
+
   /* Integer intrinsics.  */
   tree uint64_ftype_void
     = build_function_type (long_long_unsigned_type_node,
@@ -23855,6 +23961,25 @@ ix86_init_mmx_sse_builtins (void)
 	case V1DI2DI_FTYPE_V1DI_V1DI_INT:
 	  type = v1di_ftype_v1di_v1di_int;
 	  break;
+	case VOID_FTYPE_USHORT_UINT_USHORT:
+	  type = void_ftype_ushort_unsigned_ushort;
+	  break;
+	case VOID_FTYPE_UINT_UINT_UINT:
+	  type = void_ftype_unsigned_unsigned_unsigned;
+	  break;
+	case VOID_FTYPE_UINT64_UINT_UINT:
+	  type = void_ftype_uint64_unsigned_unsigned;
+	  break;
+	case UCHAR_FTYPE_USHORT_UINT_USHORT:
+	  type = uchar_ftype_ushort_unsigned_ushort;
+	  break;
+	case UCHAR_FTYPE_UINT_UINT_UINT:
+	  type = uchar_ftype_unsigned_unsigned_unsigned;
+	  break;
+	case UCHAR_FTYPE_UINT64_UINT_UINT:
+	  type = uchar_ftype_uint64_unsigned_unsigned;
+	  break;
+
 	default:
 	  gcc_unreachable ();
 	}
@@ -25034,6 +25159,15 @@ ix86_expand_args_builtin (const struct b
       nargs = 4;
       nargs_constant = 2;
       break;
+    case VOID_FTYPE_USHORT_UINT_USHORT:
+    case VOID_FTYPE_UINT_UINT_UINT:
+    case VOID_FTYPE_UINT64_UINT_UINT:
+    case UCHAR_FTYPE_USHORT_UINT_USHORT:
+    case UCHAR_FTYPE_UINT_UINT_UINT:
+    case UCHAR_FTYPE_UINT64_UINT_UINT:
+      nargs = 3;
+      nargs_constant = 3;
+      break;
     default:
       gcc_unreachable ();
     }
diff -upNw gcc-xop-2/gcc/config/i386/i386-c.c gcc-lwp/gcc/config/i386/i386-c.c
--- gcc-xop-2/gcc/config/i386/i386-c.c	2009-09-29 19:41:03.000000000 -0500
+++ gcc-lwp/gcc/config/i386/i386-c.c	2009-09-30 16:33:28.000000000 -0500
@@ -234,6 +234,8 @@ ix86_target_macros_internal (int isa_fla
     def_or_undef (parse_in, "__FMA4__");
   if (isa_flag & OPTION_MASK_ISA_XOP)
     def_or_undef (parse_in, "__XOP__");
+  if (isa_flag & OPTION_MASK_ISA_LWP)
+    def_or_undef (parse_in, "__LWP__");
   if ((fpmath & FPMATH_SSE) && (isa_flag & OPTION_MASK_ISA_SSE))
     def_or_undef (parse_in, "__SSE_MATH__");
   if ((fpmath & FPMATH_SSE) && (isa_flag & OPTION_MASK_ISA_SSE2))
diff -upNw gcc-xop-2/gcc/config/i386/i386.h gcc-lwp/gcc/config/i386/i386.h
--- gcc-xop-2/gcc/config/i386/i386.h	2009-09-29 19:41:03.000000000 -0500
+++ gcc-lwp/gcc/config/i386/i386.h	2009-09-30 16:33:28.000000000 -0500
@@ -56,6 +56,7 @@ see the files COPYING3 and COPYING.RUNTI
 #define TARGET_SSE4A	OPTION_ISA_SSE4A
 #define TARGET_FMA4	OPTION_ISA_FMA4
 #define TARGET_XOP	OPTION_ISA_XOP
+#define TARGET_LWP	OPTION_ISA_LWP
 #define TARGET_ROUND	OPTION_ISA_ROUND
 #define TARGET_ABM	OPTION_ISA_ABM
 #define TARGET_POPCNT	OPTION_ISA_POPCNT
diff -upNw gcc-xop-2/gcc/config/i386/i386.md gcc-lwp/gcc/config/i386/i386.md
--- gcc-xop-2/gcc/config/i386/i386.md	2009-09-29 19:41:03.000000000 -0500
+++ gcc-lwp/gcc/config/i386/i386.md	2009-09-30 16:33:28.000000000 -0500
@@ -204,6 +204,10 @@
    (UNSPEC_XOP_TRUEFALSE	152)
    (UNSPEC_XOP_PERMUTE		153)
    (UNSPEC_FRCZ			154)
+   (UNSPEC_LLWP_INTRINSIC	155)
+   (UNSPEC_SLWP_INTRINSIC	156)
+   (UNSPEC_LWPVAL_INTRINSIC	157)
+   (UNSPEC_LWPINS_INTRINSIC	158)
 
    ; For AES support
    (UNSPEC_AESENC		159)
@@ -352,7 +356,7 @@
    fmov,fop,fsgn,fmul,fdiv,fpspc,fcmov,fcmp,fxch,fistp,fisttp,frndint,
    sselog,sselog1,sseiadd,sseiadd1,sseishft,sseimul,
    sse,ssemov,sseadd,ssemul,ssecmp,ssecomi,ssecvt,ssecvt1,sseicvt,ssediv,sseins,
-   ssemuladd,sse4arg,
+   ssemuladd,sse4arg,lwp,
    mmx,mmxmov,mmxadd,mmxmul,mmxcmp,mmxcvt,mmxshft"
   (const_string "other"))
 
diff -upNw gcc-xop-2/gcc/config/i386/i386.opt gcc-lwp/gcc/config/i386/i386.opt
--- gcc-xop-2/gcc/config/i386/i386.opt	2009-09-29 19:41:03.000000000 -0500
+++ gcc-lwp/gcc/config/i386/i386.opt	2009-09-30 16:33:28.000000000 -0500
@@ -318,6 +318,10 @@ mxop
 Target Report Mask(ISA_XOP) Var(ix86_isa_flags) VarExists Save
 Support XOP built-in functions and code generation 
 
+mlwp
+Target Report Mask(ISA_LWP) Var(ix86_isa_flags) VarExists Save
+Support LWP built-in functions and code generation 
+
 mabm
 Target Report Mask(ISA_ABM) Var(ix86_isa_flags) VarExists Save
 Support code generation of Advanced Bit Manipulation (ABM) instructions.
diff -upNw gcc-xop-2/gcc/config/i386/lwpintrin.h gcc-lwp/gcc/config/i386/lwpintrin.h
--- gcc-xop-2/gcc/config/i386/lwpintrin.h	1969-12-31 18:00:00.000000000 -0600
+++ gcc-lwp/gcc/config/i386/lwpintrin.h	2009-09-30 16:33:28.000000000 -0500
@@ -0,0 +1,111 @@
+/* Copyright (C) 2007, 2008, 2009 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _X86INTRIN_H_INCLUDED
+# error "Never use <lwpintrin.h> directly; include <x86intrin.h> instead."
+#endif
+
+#ifndef _LWPINTRIN_H_INCLUDED
+#define _LWPINTRIN_H_INCLUDED
+
+#ifndef __LWP__
+# error "LWP instruction set not enabled"
+#else
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__llwpcb16 (void *pcbAddress)
+{
+  __builtin_ia32_llwpcb16 (pcbAddress);
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__llwpcb32 (void *pcbAddress)
+{
+  __builtin_ia32_llwpcb32 (pcbAddress);
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__llwpcb64 (void *pcbAddress)
+{
+  __builtin_ia32_llwpcb64 (pcbAddress);
+}
+
+extern __inline void * __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__slwpcb16 (void)
+{
+  return __builtin_ia32_slwpcb16 ();
+}
+
+extern __inline void * __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__slwpcb32 (void)
+{
+  return __builtin_ia32_slwpcb32 ();
+}
+
+extern __inline void * __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__slwpcb64 (void)
+{
+  return __builtin_ia32_slwpcb64 ();
+}
+
+/*
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__lwpval16 (unsigned short data2, unsigned int data1, unsigned short flags)
+{
+  __builtin_ia32_lwpval16 (data2, data1, flags);
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__lwpval32 (unsigned int data2, unsigned int data1, unsigned int flags)
+{
+  __builtin_ia32_lwpval32 (data2, data1, flags);
+}
+
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__lwpval64 (unsigned __int64 data2, unsigned int data1, unsigned int flags)
+{
+  __builtin_ia32_lwpval64 (data2, data1, flags);
+}
+
+extern __inline unsigned char __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__lwpins16 (unsigned short data2, unsigned int data1, unsigned short flags)
+{
+  return __builtin_ia32_lwpins16 (data2, data1, flags);
+}
+
+extern __inline unsigned char __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__lwpins32 (unsigned int data2, unsigned int data1, unsigned int flags)
+{
+  return __builtin_ia32_lwpins32 (data2, data1, flags);
+}
+
+extern __inline unsigned char __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__lwpins64 (unsigned __int64 data2, unsigned int data1, unsigned int flags)
+{
+  return __builtin_ia32_lwpins64 (data2, data1, flags);
+}
+*/
+
+#endif /* __LWP__ */
+
+#endif /* _LWPINTRIN_H_INCLUDED */
diff -upNw gcc-xop-2/gcc/config/i386/sse.md gcc-lwp/gcc/config/i386/sse.md
--- gcc-xop-2/gcc/config/i386/sse.md	2009-09-29 19:41:03.000000000 -0500
+++ gcc-lwp/gcc/config/i386/sse.md	2009-09-30 16:33:28.000000000 -0500
@@ -12092,6 +12092,121 @@
    (set_attr "length_immediate" "1")
    (set_attr "mode" "TI")])
 
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;;
+;; LWP instructions
+;;
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+(define_insn "lwp_llwpcbhi1"
+  [(unspec [(match_operand:HI 0 "register_operand" "r")]
+  	   UNSPEC_LLWP_INTRINSIC)]
+  "TARGET_LWP"
+  "llwpcb\t%0"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "HI")])
+
+(define_insn "lwp_llwpcbsi1"
+  [(unspec [(match_operand:SI 0 "register_operand" "r")]
+  	   UNSPEC_LLWP_INTRINSIC)]
+  "TARGET_LWP"
+  "llwpcb\t%0"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "SI")])
+
+(define_insn "lwp_llwpcbdi1"
+  [(unspec [(match_operand:DI 0 "register_operand" "r")]
+  	   UNSPEC_LLWP_INTRINSIC)]
+  "TARGET_LWP"
+  "llwpcb\t%0"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "DI")])
+
+(define_insn "lwp_slwpcbhi1"
+  [(unspec [(match_operand:HI 0 "register_operand" "r")]
+  	   UNSPEC_SLWP_INTRINSIC)]
+  "TARGET_LWP"
+  "slwpcb\t%0"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "HI")])
+
+(define_insn "lwp_slwpcbsi1"
+  [(unspec [(match_operand:SI 0 "register_operand" "r")]
+  	   UNSPEC_SLWP_INTRINSIC)]
+  "TARGET_LWP"
+  "slwpcb\t%0"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "SI")])
+
+(define_insn "lwp_slwpcbdi1"
+  [(unspec [(match_operand:DI 0 "register_operand" "r")]
+  	   UNSPEC_SLWP_INTRINSIC)]
+  "TARGET_LWP"
+  "slwpcb\t%0"
+  [(set_attr "type" "lwp")
+   (set_attr "mode" "DI")])
+
+;;(define_insn "lwp_lwpvalhi3"
+;;  [(unspec [(match_operand:HI 0 "register_operand" "r")
+;;  	   (match_operand:SI 1 "nonimmediate_operand" "rm")
+;;	   (match_operand:HI 2 "const_int_operand" "")]
+;;  	   UNSPEC_LWPVAL_INTRINSIC)]
+;;  "TARGET_LWP"
+;;  "lwpval\t{%2, %1, %0|%0, %1, %2}"
+;;  [(set_attr "type" "lwp")
+;;   (set_attr "mode" "HI")])
+
+;;(define_insn "lwp_lwpvalsi3"
+;;  [(unspec [(match_operand:SI 0 "register_operand" "r")]
+;;  	   (match_operand:SI 1 "nonimmediate_operand" "rm")
+;;	   (match_operand:SI 2 "const_int_operand" "")]
+;;  	   UNSPEC_LWPVAL_INTRINSIC)]
+;;  "TARGET_LWP"
+;;  "lwpval\t{%2, %1, %0|%0, %1, %2}"
+;;  [(set_attr "type" "lwp")
+;;   (set_attr "mode" "SI")])
+
+;;(define_insn "lwp_lwpvaldi3"
+;;  [(unspec [(match_operand:DI 0 "register_operand" "r")]
+;;  	   [(match_operand:SI 1 "nonimmediate_operand" "rm")]
+;;	   [(match_operand:SI 2 "const_int_operand" "")]
+;;  	   UNSPEC_LWPVAL_INTRINSIC)]
+;;  "TARGET_LWP"
+;;  "lwpval\t{%2, %1, %0|%0, %1, %2}"
+;;  [(set_attr "type" "lwp")
+;;   (set_attr "mode" "DI")])
+
+;;(define_insn "lwp_lwpinshi3"
+;;  [(unspec [(match_operand:HI 0 "register_operand" "r")]
+;;  	   (match_operand:SI 1 "nonimmediate_operand" "rm")
+;;	   (match_operand:HI 2 "const_int_operand" "")]
+;;  	   UNSPEC_LWPINS_INTRINSIC)]
+;;  "TARGET_LWP"
+;;  "lwpins\t{%2, %1, %0|%0, %1, %2}"
+;;  [(set_attr "type" "lwp")
+;;   (set_attr "mode" "HI")])
+
+;;(define_insn "lwp_lwpinssi3"
+;;  [(unspec [(match_operand:SI 0 "register_operand" "r")
+;;  	   (match_operand:SI 1 "nonimmediate_operand" "rm")
+;;	   (match_operand:SI 2 "const_int_operand" "")]
+;;  	   UNSPEC_LWPINS_INTRINSIC)]
+;;  "TARGET_LWP"
+;;  "lwpins\t{%2, %1, %0|%0, %1, %2}"
+;;  [(set_attr "type" "lwp")
+;;   (set_attr "mode" "SI")])
+
+;;(define_insn "lwp_lwpinsdi3"
+;;  [(unspec [(match_operand:DI 0 "register_operand" "r")]
+;;  	   (match_operand:SI 1 "nonimmediate_operand" "rm")
+;;	   (match_operand:SI 2 "const_int_operand" "")]
+;;  	   UNSPEC_LWPINS_INTRINSIC)]
+;;  "TARGET_LWP"
+;;  "lwpins\t{%2, %1, %0|%0, %1, %2}"
+;;  [(set_attr "type" "lwp")
+;;   (set_attr "mode" "DI")])
+
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 (define_insn "*avx_aesenc"
   [(set (match_operand:V2DI 0 "register_operand" "=x")
diff -upNw gcc-xop-2/gcc/config/i386/x86intrin.h gcc-lwp/gcc/config/i386/x86intrin.h
--- gcc-xop-2/gcc/config/i386/x86intrin.h	2009-09-29 19:41:03.000000000 -0500
+++ gcc-lwp/gcc/config/i386/x86intrin.h	2009-09-30 16:33:28.000000000 -0500
@@ -62,6 +62,10 @@
 #include <xopintrin.h>
 #endif
 
+#ifdef __LWP__
+#include <lwpintrin.h>
+#endif
+
 #if defined (__AES__) || defined (__PCLMUL__)
 #include <wmmintrin.h>
 #endif

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2009-12-14 20:15 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-09  1:13 PATCH: Add LWP support for upcoming AMD Orochi processor Harsha Jagasia
2009-10-09 10:07 ` Jakub Jelinek
2009-10-22 21:07   ` rajagopal, dwarak
2009-11-05  9:32 ` Jakub Jelinek
2009-11-05 16:21   ` Jakub Jelinek
2009-11-05 16:58     ` Sebastian Pop
2009-11-05 17:03       ` Richard Guenther
2009-11-05 17:21       ` Uros Bizjak
2009-11-06 10:15     ` Jakub Jelinek
2009-12-10 19:58       ` Sebastian Pop
2009-12-10 21:01         ` Jakub Jelinek
2009-12-10 21:04           ` Sebastian Pop
2009-12-10 21:52             ` Jakub Jelinek
2009-12-11 14:51               ` Jakub Jelinek
2009-12-11 16:54                 ` Richard Henderson
2009-12-11 21:00                 ` Sebastian Pop
2009-12-11 21:43                   ` Jakub Jelinek
2009-12-11 22:27                     ` Sebastian Pop
2009-12-12  9:27                       ` Sebastian Pop
2009-12-14 16:35                         ` Richard Henderson
2009-12-14 19:15                         ` H.J. Lu
2009-12-14 19:21                           ` Jakub Jelinek
2009-12-14 19:38                             ` Richard Henderson
2009-12-14 20:15                               ` Jakub Jelinek
2009-12-14 20:14                         ` Uros Bizjak
2009-12-14 20:38                           ` Jakub Jelinek
2009-12-14 20:52                             ` Uros Bizjak
  -- strict thread matches above, loose matches on Subject: below --
2009-10-09  2:12 Ross Ridge
2009-10-01  7:51 Uros Bizjak
2009-10-01 10:09 ` Jan Hubicka
2009-10-01  4:06 Harsha Jagasia
2009-10-01  6:30 ` Jakub Jelinek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).