From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 23510 invoked by alias); 1 Oct 2009 04:06:45 -0000 Received: (qmail 23495 invoked by uid 22791); 1 Oct 2009 04:06:42 -0000 X-SWARE-Spam-Status: No, hits=-2.5 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_LOW X-Spam-Check-By: sourceware.org Received: from va3ehsobe001.messaging.microsoft.com (HELO VA3EHSOBE001.bigfish.com) (216.32.180.11) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 01 Oct 2009 04:06:34 +0000 Received: from mail121-va3-R.bigfish.com (10.7.14.235) by VA3EHSOBE001.bigfish.com (10.7.40.21) with Microsoft SMTP Server id 8.1.340.0; Thu, 1 Oct 2009 04:06:32 +0000 Received: from mail121-va3 (localhost.localdomain [127.0.0.1]) by mail121-va3-R.bigfish.com (Postfix) with ESMTP id 66A8315F0185 for ; Thu, 1 Oct 2009 04:06:32 +0000 (UTC) X-SpamScore: -19 X-BigFish: VPS-19(z40baiz936eM1443R655Nc8kzz1202hzz5eeeT92fbm4ee3lz32i6bh64h) X-Spam-TCS-SCL: 3:0 X-FB-SS: 5, Received: by mail121-va3 (MessageSwitch) id 1254369987855947_27121; Thu, 1 Oct 2009 04:06:27 +0000 (UCT) Received: from VA3EHSMHS025.bigfish.com (unknown [10.7.14.247]) by mail121-va3.bigfish.com (Postfix) with ESMTP id CDA721720054 for ; Thu, 1 Oct 2009 04:06:27 +0000 (UTC) Received: from ausb3extmailp02.amd.com (163.181.251.22) by VA3EHSMHS025.bigfish.com (10.7.99.35) with Microsoft SMTP Server (TLS) id 14.0.482.32; Thu, 1 Oct 2009 04:06:27 +0000 Received: from ausb3twp02.amd.com ([163.181.250.38]) by ausb3extmailp02.amd.com (Switch-3.2.7/Switch-3.2.7) with ESMTP id n9146N9v011153 for ; Wed, 30 Sep 2009 23:06:26 -0500 X-M-MSG: Received: from sausexbh2.amd.com (SAUSEXBH2.amd.com [163.181.22.102]) by ausb3twp02.amd.com (Tumbleweed MailGate 3.7.0) with ESMTP id 2F1FDC84E4 for ; Wed, 30 Sep 2009 23:06:20 -0500 (CDT) Received: from SAUSEXMB3.amd.com ([163.181.22.202]) by sausexbh2.amd.com with Microsoft SMTPSVC(6.0.3790.3959); Wed, 30 Sep 2009 23:06:23 -0500 Received: from tilapia-05.site ([10.236.44.188]) by SAUSEXMB3.amd.com with Microsoft SMTPSVC(6.0.3790.3959); Wed, 30 Sep 2009 23:06:23 -0500 From: Harsha Jagasia To: Harsha Jagasia , gcc-patches@gcc.gnu.org, hubicka@ucw.cz, rth@redhat.com, dwarak.rajagopal@amd.com, christophe.harle@amd.com CC: Harsha Jagasia Message-ID: <20091001040134.20668.54066.sendpatchset@tilapia-05.site> Subject: PATCH: Add LWP support for upcoming AMD Orochi processor. Date: Thu, 01 Oct 2009 04:06:00 -0000 MIME-Version: 1.0 Content-Type: text/plain X-Reverse-DNS: ausb3extmailp02.amd.com Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2009-10/txt/msg00011.txt.bz2 This patch is for LWP instruction set support for gcc 4.5 for the upcoming AMD Orochi processor. Please see the AMD spec for the LWP ISA at http://support.amd.com/us/Processor_TechDocs/43724.pdf We are still in the process of wrapping up the LWP binutils work and expect it to be checked in during stage 3. The attached patch is based on the latest trunk and bootstrap and target tests pass. A full make check is still running. I will update the list with the results of make check, but I wanted to send the patch out so that the reviewers can look at it. One of the issues I am hoping the maintainers can give guidance on: - Currently the code for the lwpval and lwpins instructions is commented out. These instructions are different from typical instructions in that they have no destination register (please see the spec). I am not sure how to repesent the patterns for the same and would appreciate some input. Thanks in advance. 2009-09-29 Harsha Jagasia * doc/invoke.texi (-mlwp): Add documentation. * doc/extend.texi (x86 intrinsics): Add LWP intrinsics. * config.gcc (i[34567]86-*-*): Include lwpintrin.h. (x86_64-*-*): Ditto. * config/i386/lwpintrin.h: New file, provide x86 compiler intrinisics for LWP. * config/i386/cpuid.h (bit_LWP): Define LWP bit. * config/i386/x86intrin.h: Add LWP check and lwpintrin.h. * config/i386/i386-c.c(ix86_target_macros_internal): Check ISA_FLAG for LWP. * config/i386/i386.h(TARGET_LWP): New macro for LWP. * config/i386/i386.opt (-mlwp): New switch for LWP support. * config/i386/i386.c (OPTION_MASK_ISA_LWP_SET): New. (OPTION_MASK_ISA_LWP_UNSET): New. (ix86_handle_option): Handle -mlwp. (isa_opts): Handle -mlwp. (enum pta_flags): Add PTA_LWP. (override_options): Add LWP support. (IX86_BUILTIN_LLWPCB16): New for LWP intrinsic. (IX86_BUILTIN_LLWPCB32): Ditto (IX86_BUILTIN_LLWPCB64): Ditto (IX86_BUILTIN_SLWPCB16): Ditto (IX86_BUILTIN_SLWPCB32): Ditto (IX86_BUILTIN_SLWPCB64): Ditto (IX86_BUILTIN_LWPVAL16): Ditto (IX86_BUILTIN_LWPVAL32): Ditto (IX86_BUILTIN_LWPVAL64): Ditto (IX86_BUILTIN_LWPINS16): Ditto (IX86_BUILTIN_LWPINS32): Ditto (IX86_BUILTIN_LWPINS64): Ditto (enum ix86_builtin_type): Add LWP intrinsic support. (builtin_description): Ditto. (ix86_init_mmx_sse_builtins): Ditto. (ix86_expand_args_builtin): Ditto. * config/i386/i386.md (UNSPEC_LLWP_INTRINSIC): (UNSPEC_SLWP_INTRINSIC): (UNSPEC_LWPVAL_INTRINSIC): (UNSPEC_LWPINS_INTRINSIC): Add new UNSPEC for LWP support. * config/i386/sse.md (lwp_llwpcbhi1): New lwp pattern. (lwp_llwpcbsi1): Ditto. (lwp_llwpcbdi1): Ditto. (lwp_slwpcbhi1): Ditto. (lwp_slwpcbsi1): Ditto. (lwp_slwpcbdi1): Ditto. (lwp_lwpvalhi3): Ditto. (lwp_lwpvalsi3): Ditto. (lwp_lwpvaldi3): Ditto. (lwp_lwpinshi3): Ditto. (lwp_lwpinssi3): Ditto. (lwp_lwpinsdi3): Ditto. diff -upNw gcc-xop-2/gcc/config.gcc gcc-lwp/gcc/config.gcc --- gcc-xop-2/gcc/config.gcc 2009-09-30 14:12:36.000000000 -0500 +++ gcc-lwp/gcc/config.gcc 2009-09-30 16:33:28.000000000 -0500 @@ -288,7 +288,7 @@ i[34567]86-*-*) pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h nmmintrin.h bmmintrin.h fma4intrin.h wmmintrin.h immintrin.h x86intrin.h avxintrin.h xopintrin.h - ia32intrin.h cross-stdarg.h" + ia32intrin.h cross-stdarg.h lwpintrin.h" ;; x86_64-*-*) cpu_type=i386 @@ -298,7 +298,7 @@ x86_64-*-*) pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h nmmintrin.h bmmintrin.h fma4intrin.h wmmintrin.h immintrin.h x86intrin.h avxintrin.h xopintrin.h - ia32intrin.h cross-stdarg.h" + ia32intrin.h cross-stdarg.h lwpintrin.h" need_64bit_hwint=yes ;; ia64-*-*) diff -upNw gcc-xop-2/gcc/doc/extend.texi gcc-lwp/gcc/doc/extend.texi --- gcc-xop-2/gcc/doc/extend.texi 2009-09-29 19:41:02.000000000 -0500 +++ gcc-lwp/gcc/doc/extend.texi 2009-09-30 16:33:28.000000000 -0500 @@ -3178,6 +3178,11 @@ Enable/disable the generation of the FMA @cindex @code{target("xop")} attribute Enable/disable the generation of the XOP instructions. +@item lwp +@itemx no-lwp +@cindex @code{target("lwp")} attribute +Enable/disable the generation of the LWP instructions. + @item ssse3 @itemx no-ssse3 @cindex @code{target("ssse3")} attribute @@ -9066,5 +9071,22 @@ v8sf __builtin_ia32_fmsubaddps256 (v8sf, @end smallexample +The following built-in functions are available when @option{-mlwp} is used. + +@smallexample +void __builtin_ia32_llwpcb16 (void *); +void __builtin_ia32_llwpcb32 (void *); +void __builtin_ia32_llwpcb64 (void *); +void * __builtin_ia32_llwpcb16 (void); +void * __builtin_ia32_llwpcb32 (void); +void * __builtin_ia32_llwpcb64 (void); +@c void __builtin_ia32_lwpval16 (unsigned short, unsigned int, unsigned short) +@c void __builtin_ia32_lwpval32 (unsigned int, unsigned int, unsigned int) +@c void __builtin_ia32_lwpval64 (unsigned __int64, unsigned int, unsigned int) +@c unsigned char __builtin_ia32_lwpins16 (unsigned short, unsigned int, unsigned short) +@c unsigned char __builtin_ia32_lwpins32 (unsigned int, unsigned int, unsigned int) +@c unsigned char __builtin_ia32_lwpins64 (unsigned __int64, unsigned int, unsigned int) +@end smallexample + The following built-in functions are available when @option{-m3dnow} is used. All of them generate the machine instruction that is part of the name. diff -upNw gcc-xop-2/gcc/doc/invoke.texi gcc-lwp/gcc/doc/invoke.texi --- gcc-xop-2/gcc/doc/invoke.texi 2009-09-29 19:41:02.000000000 -0500 +++ gcc-lwp/gcc/doc/invoke.texi 2009-09-30 16:33:28.000000000 -0500 @@ -592,7 +592,7 @@ Objective-C and Objective-C++ Dialects}. -mcld -mcx16 -msahf -mmovbe -mcrc32 -mrecip @gol -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol -maes -mpclmul @gol --msse4a -m3dnow -mpopcnt -mabm -mfma4 -mxop @gol +-msse4a -m3dnow -mpopcnt -mabm -mfma4 -mxop -mlwp @gol -mthreads -mno-align-stringops -minline-all-stringops @gol -minline-stringops-dynamically -mstringop-strategy=@var{alg} @gol -mpush-args -maccumulate-outgoing-args -m128bit-long-double @gol @@ -11731,6 +11731,8 @@ preferred alignment to @option{-mpreferr @itemx -mno-fma4 @itemx -mxop @itemx -mno-xop +@itemx -mlwp +@itemx -mno-lwp @itemx -m3dnow @itemx -mno-3dnow @itemx -mpopcnt @@ -11745,7 +11747,7 @@ preferred alignment to @option{-mpreferr @opindex mno-3dnow These switches enable or disable the use of instructions in the MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, AVX, AES, PCLMUL, SSE4A, FMA4, XOP, -ABM or 3DNow!@: extended instruction sets. +LWP, ABM or 3DNow!@: extended instruction sets. These extensions are also available as built-in functions: see @ref{X86 Built-in Functions}, for details of the functions enabled and disabled by these switches. diff -upNw gcc-xop-2/gcc/config/i386/cpuid.h gcc-lwp/gcc/config/i386/cpuid.h --- gcc-xop-2/gcc/config/i386/cpuid.h 2009-09-29 19:41:02.000000000 -0500 +++ gcc-lwp/gcc/config/i386/cpuid.h 2009-09-30 16:33:28.000000000 -0500 @@ -49,6 +49,7 @@ #define bit_LAHF_LM (1 << 0) #define bit_SSE4a (1 << 6) #define bit_FMA4 (1 << 16) +#define bit_LWP (1 << 15) #define bit_XOP (1 << 11) /* %edx */ diff -upNw gcc-xop-2/gcc/config/i386/i386.c gcc-lwp/gcc/config/i386/i386.c --- gcc-xop-2/gcc/config/i386/i386.c 2009-09-29 19:41:03.000000000 -0500 +++ gcc-lwp/gcc/config/i386/i386.c 2009-09-30 16:33:28.000000000 -0500 @@ -1960,6 +1960,8 @@ static int ix86_isa_flags_explicit; | OPTION_MASK_ISA_AVX_SET) #define OPTION_MASK_ISA_XOP_SET \ (OPTION_MASK_ISA_XOP | OPTION_MASK_ISA_FMA4_SET) +#define OPTION_MASK_ISA_LWP_SET \ + OPTION_MASK_ISA_LWP /* AES and PCLMUL need SSE2 because they use xmm registers */ #define OPTION_MASK_ISA_AES_SET \ @@ -2014,6 +2016,7 @@ static int ix86_isa_flags_explicit; #define OPTION_MASK_ISA_FMA4_UNSET \ (OPTION_MASK_ISA_FMA4 | OPTION_MASK_ISA_XOP_UNSET) #define OPTION_MASK_ISA_XOP_UNSET OPTION_MASK_ISA_XOP +#define OPTION_MASK_ISA_LWP_UNSET OPTION_MASK_ISA_LWP #define OPTION_MASK_ISA_AES_UNSET OPTION_MASK_ISA_AES #define OPTION_MASK_ISA_PCLMUL_UNSET OPTION_MASK_ISA_PCLMUL @@ -2274,6 +2277,19 @@ ix86_handle_option (size_t code, const c } return true; + case OPT_mlwp: + if (value) + { + ix86_isa_flags |= OPTION_MASK_ISA_LWP_SET; + ix86_isa_flags_explicit |= OPTION_MASK_ISA_LWP_SET; + } + else + { + ix86_isa_flags &= ~OPTION_MASK_ISA_LWP_UNSET; + ix86_isa_flags_explicit |= OPTION_MASK_ISA_LWP_UNSET; + } + return true; + case OPT_mabm: if (value) { @@ -2403,6 +2419,7 @@ ix86_target_string (int isa, int flags, { "-m64", OPTION_MASK_ISA_64BIT }, { "-mfma4", OPTION_MASK_ISA_FMA4 }, { "-mxop", OPTION_MASK_ISA_XOP }, + { "-mlwp", OPTION_MASK_ISA_LWP }, { "-msse4a", OPTION_MASK_ISA_SSE4A }, { "-msse4.2", OPTION_MASK_ISA_SSE4_2 }, { "-msse4.1", OPTION_MASK_ISA_SSE4_1 }, @@ -2634,7 +2651,8 @@ override_options (bool main_args_p) PTA_FMA = 1 << 19, PTA_MOVBE = 1 << 20, PTA_FMA4 = 1 << 21, - PTA_XOP = 1 << 22 + PTA_XOP = 1 << 22, + PTA_LWP = 1 << 23 }; static struct pta @@ -2983,6 +3001,9 @@ override_options (bool main_args_p) if (processor_alias_table[i].flags & PTA_XOP && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_XOP)) ix86_isa_flags |= OPTION_MASK_ISA_XOP; + if (processor_alias_table[i].flags & PTA_LWP + && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_LWP)) + ix86_isa_flags |= OPTION_MASK_ISA_LWP; if (processor_alias_table[i].flags & PTA_ABM && !(ix86_isa_flags_explicit & OPTION_MASK_ISA_ABM)) ix86_isa_flags |= OPTION_MASK_ISA_ABM; @@ -3668,6 +3689,7 @@ ix86_valid_target_attribute_inner_p (tre IX86_ATTR_ISA ("ssse3", OPT_mssse3), IX86_ATTR_ISA ("fma4", OPT_mfma4), IX86_ATTR_ISA ("xop", OPT_mxop), + IX86_ATTR_ISA ("lwp", OPT_mlwp), /* string options */ IX86_ATTR_STR ("arch=", IX86_FUNCTION_SPECIFIC_ARCH), @@ -20987,7 +21009,7 @@ enum ix86_builtins IX86_BUILTIN_CVTUDQ2PS, - /* FMA4 instructions. */ + /* FMA4 and XOP instructions. */ IX86_BUILTIN_VFMADDSS, IX86_BUILTIN_VFMADDSD, IX86_BUILTIN_VFMADDPS, @@ -21164,6 +21186,23 @@ enum ix86_builtins IX86_BUILTIN_VPCOMFALSEQ, IX86_BUILTIN_VPCOMTRUEQ, + /* LWP instructions. */ + IX86_BUILTIN_LLWPCB16, + IX86_BUILTIN_LLWPCB32, + IX86_BUILTIN_LLWPCB64, + IX86_BUILTIN_SLWPCB16, + IX86_BUILTIN_SLWPCB32, + IX86_BUILTIN_SLWPCB64, + + /* + IX86_BUILTIN_LWPVAL16, + IX86_BUILTIN_LWPVAL32, + IX86_BUILTIN_LWPVAL64, + IX86_BUILTIN_LWPINS16, + IX86_BUILTIN_LWPINS32, + IX86_BUILTIN_LWPINS64, + */ + IX86_BUILTIN_MAX }; @@ -21540,7 +21579,13 @@ enum ix86_builtin_type V1DI2DI_FTYPE_V1DI_V1DI_INT, V2DF_FTYPE_V2DF_V2DF_INT, V2DI_FTYPE_V2DI_UINT_UINT, - V2DI_FTYPE_V2DI_V2DI_UINT_UINT + V2DI_FTYPE_V2DI_V2DI_UINT_UINT, + VOID_FTYPE_USHORT_UINT_USHORT, + VOID_FTYPE_UINT_UINT_UINT, + VOID_FTYPE_UINT64_UINT_UINT, + UCHAR_FTYPE_USHORT_UINT_USHORT, + UCHAR_FTYPE_UINT_UINT_UINT, + UCHAR_FTYPE_UINT64_UINT_UINT }; /* Special builtins with variable number of arguments. */ @@ -22237,7 +22282,7 @@ static const struct builtin_description { OPTION_MASK_ISA_AVX, CODE_FOR_avx_movmskps256, "__builtin_ia32_movmskps256", IX86_BUILTIN_MOVMSKPS256, UNKNOWN, (int) INT_FTYPE_V8SF }, }; -/* FMA4 and XOP. */ +/* FMA4, XOP. */ enum multi_arg_type { MULTI_ARG_UNKNOWN, MULTI_ARG_3_SF, @@ -22484,6 +22529,23 @@ static const struct builtin_description { OPTION_MASK_ISA_XOP, CODE_FOR_xop_pcom_tfv4si3, "__builtin_ia32_vpcomtrueud", IX86_BUILTIN_VPCOMTRUEUD, (enum rtx_code) PCOM_TRUE, (int)MULTI_ARG_2_SI_TF }, { OPTION_MASK_ISA_XOP, CODE_FOR_xop_pcom_tfv2di3, "__builtin_ia32_vpcomtrueuq", IX86_BUILTIN_VPCOMTRUEUQ, (enum rtx_code) PCOM_TRUE, (int)MULTI_ARG_2_DI_TF }, + { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_llwpcbhi1, "__builtin_ia32_llwpcb16", IX86_BUILTIN_LLWPCB16, UNKNOWN, (int) VOID_FTYPE_VOID }, + { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_llwpcbsi1, "__builtin_ia32_llwpcb32", IX86_BUILTIN_LLWPCB32, UNKNOWN, (int) VOID_FTYPE_VOID }, + { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_llwpcbdi1, "__builtin_ia32_llwpcb64", IX86_BUILTIN_LLWPCB64, UNKNOWN, (int) VOID_FTYPE_VOID }, + + { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_slwpcbhi1, "__builtin_ia32_slwpcb16", IX86_BUILTIN_SLWPCB16, UNKNOWN, (int) VOID_FTYPE_VOID }, + { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_slwpcbsi1, "__builtin_ia32_slwpcb32", IX86_BUILTIN_SLWPCB32, UNKNOWN, (int) VOID_FTYPE_VOID }, + { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_slwpcbdi1, "__builtin_ia32_slwpcb64", IX86_BUILTIN_SLWPCB64, UNKNOWN, (int) VOID_FTYPE_VOID }, + + /* + { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpvalhi3, "__builtin_ia32_lwpval16", IX86_BUILTIN_LWPVAL16, UNKNOWN, (int) VOID_FTYPE_USHORT_UINT_USHORT }, + { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpvalsi3, "__builtin_ia32_lwpval32", IX86_BUILTIN_LWPVAL64, UNKNOWN, (int) VOID_FTYPE_UINT_UINT_UINT }, + { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpvaldi3, "__builtin_ia32_lwpval64", IX86_BUILTIN_LWPVAL64, UNKNOWN, (int) VOID_FTYPE_UINT64_UINT_UINT }, + + { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpinshi3, "__builtin_ia32_lwpins16", IX86_BUILTIN_LWPINS16, UNKNOWN, (int) UCHAR_FTYPE_USHORT_UINT_USHORT }, + { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpinssi3, "__builtin_ia32_lwpins32", IX86_BUILTIN_LWPINS64, UNKNOWN, (int) UCHAR_FTYPE_UINT_UINT_UINT }, + { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpinsdi3, "__builtin_ia32_lwpins64", IX86_BUILTIN_LWPINS64, UNKNOWN, (int) UCHAR_FTYPE_UINT64_UINT_UINT }, + */ }; /* Set up all the MMX/SSE builtins, even builtins for instructions that are not @@ -23253,6 +23315,50 @@ ix86_init_mmx_sse_builtins (void) float_type_node, NULL_TREE); + /* LWP instructions. */ + + tree void_ftype_ushort_unsigned_ushort + = build_function_type_list (void_type_node, + short_unsigned_type_node, + unsigned_type_node, + short_unsigned_type_node, + NULL_TREE); + + tree void_ftype_unsigned_unsigned_unsigned + = build_function_type_list (void_type_node, + unsigned_type_node, + unsigned_type_node, + unsigned_type_node, + NULL_TREE); + + tree void_ftype_uint64_unsigned_unsigned + = build_function_type_list (void_type_node, + long_long_unsigned_type_node, + unsigned_type_node, + unsigned_type_node, + NULL_TREE); + + tree uchar_ftype_ushort_unsigned_ushort + = build_function_type_list (unsigned_char_type_node, + short_unsigned_type_node, + unsigned_type_node, + short_unsigned_type_node, + NULL_TREE); + + tree uchar_ftype_unsigned_unsigned_unsigned + = build_function_type_list (unsigned_char_type_node, + unsigned_type_node, + unsigned_type_node, + unsigned_type_node, + NULL_TREE); + + tree uchar_ftype_uint64_unsigned_unsigned + = build_function_type_list (unsigned_char_type_node, + long_long_unsigned_type_node, + unsigned_type_node, + unsigned_type_node, + NULL_TREE); + /* Integer intrinsics. */ tree uint64_ftype_void = build_function_type (long_long_unsigned_type_node, @@ -23855,6 +23961,25 @@ ix86_init_mmx_sse_builtins (void) case V1DI2DI_FTYPE_V1DI_V1DI_INT: type = v1di_ftype_v1di_v1di_int; break; + case VOID_FTYPE_USHORT_UINT_USHORT: + type = void_ftype_ushort_unsigned_ushort; + break; + case VOID_FTYPE_UINT_UINT_UINT: + type = void_ftype_unsigned_unsigned_unsigned; + break; + case VOID_FTYPE_UINT64_UINT_UINT: + type = void_ftype_uint64_unsigned_unsigned; + break; + case UCHAR_FTYPE_USHORT_UINT_USHORT: + type = uchar_ftype_ushort_unsigned_ushort; + break; + case UCHAR_FTYPE_UINT_UINT_UINT: + type = uchar_ftype_unsigned_unsigned_unsigned; + break; + case UCHAR_FTYPE_UINT64_UINT_UINT: + type = uchar_ftype_uint64_unsigned_unsigned; + break; + default: gcc_unreachable (); } @@ -25034,6 +25159,15 @@ ix86_expand_args_builtin (const struct b nargs = 4; nargs_constant = 2; break; + case VOID_FTYPE_USHORT_UINT_USHORT: + case VOID_FTYPE_UINT_UINT_UINT: + case VOID_FTYPE_UINT64_UINT_UINT: + case UCHAR_FTYPE_USHORT_UINT_USHORT: + case UCHAR_FTYPE_UINT_UINT_UINT: + case UCHAR_FTYPE_UINT64_UINT_UINT: + nargs = 3; + nargs_constant = 3; + break; default: gcc_unreachable (); } diff -upNw gcc-xop-2/gcc/config/i386/i386-c.c gcc-lwp/gcc/config/i386/i386-c.c --- gcc-xop-2/gcc/config/i386/i386-c.c 2009-09-29 19:41:03.000000000 -0500 +++ gcc-lwp/gcc/config/i386/i386-c.c 2009-09-30 16:33:28.000000000 -0500 @@ -234,6 +234,8 @@ ix86_target_macros_internal (int isa_fla def_or_undef (parse_in, "__FMA4__"); if (isa_flag & OPTION_MASK_ISA_XOP) def_or_undef (parse_in, "__XOP__"); + if (isa_flag & OPTION_MASK_ISA_LWP) + def_or_undef (parse_in, "__LWP__"); if ((fpmath & FPMATH_SSE) && (isa_flag & OPTION_MASK_ISA_SSE)) def_or_undef (parse_in, "__SSE_MATH__"); if ((fpmath & FPMATH_SSE) && (isa_flag & OPTION_MASK_ISA_SSE2)) diff -upNw gcc-xop-2/gcc/config/i386/i386.h gcc-lwp/gcc/config/i386/i386.h --- gcc-xop-2/gcc/config/i386/i386.h 2009-09-29 19:41:03.000000000 -0500 +++ gcc-lwp/gcc/config/i386/i386.h 2009-09-30 16:33:28.000000000 -0500 @@ -56,6 +56,7 @@ see the files COPYING3 and COPYING.RUNTI #define TARGET_SSE4A OPTION_ISA_SSE4A #define TARGET_FMA4 OPTION_ISA_FMA4 #define TARGET_XOP OPTION_ISA_XOP +#define TARGET_LWP OPTION_ISA_LWP #define TARGET_ROUND OPTION_ISA_ROUND #define TARGET_ABM OPTION_ISA_ABM #define TARGET_POPCNT OPTION_ISA_POPCNT diff -upNw gcc-xop-2/gcc/config/i386/i386.md gcc-lwp/gcc/config/i386/i386.md --- gcc-xop-2/gcc/config/i386/i386.md 2009-09-29 19:41:03.000000000 -0500 +++ gcc-lwp/gcc/config/i386/i386.md 2009-09-30 16:33:28.000000000 -0500 @@ -204,6 +204,10 @@ (UNSPEC_XOP_TRUEFALSE 152) (UNSPEC_XOP_PERMUTE 153) (UNSPEC_FRCZ 154) + (UNSPEC_LLWP_INTRINSIC 155) + (UNSPEC_SLWP_INTRINSIC 156) + (UNSPEC_LWPVAL_INTRINSIC 157) + (UNSPEC_LWPINS_INTRINSIC 158) ; For AES support (UNSPEC_AESENC 159) @@ -352,7 +356,7 @@ fmov,fop,fsgn,fmul,fdiv,fpspc,fcmov,fcmp,fxch,fistp,fisttp,frndint, sselog,sselog1,sseiadd,sseiadd1,sseishft,sseimul, sse,ssemov,sseadd,ssemul,ssecmp,ssecomi,ssecvt,ssecvt1,sseicvt,ssediv,sseins, - ssemuladd,sse4arg, + ssemuladd,sse4arg,lwp, mmx,mmxmov,mmxadd,mmxmul,mmxcmp,mmxcvt,mmxshft" (const_string "other")) diff -upNw gcc-xop-2/gcc/config/i386/i386.opt gcc-lwp/gcc/config/i386/i386.opt --- gcc-xop-2/gcc/config/i386/i386.opt 2009-09-29 19:41:03.000000000 -0500 +++ gcc-lwp/gcc/config/i386/i386.opt 2009-09-30 16:33:28.000000000 -0500 @@ -318,6 +318,10 @@ mxop Target Report Mask(ISA_XOP) Var(ix86_isa_flags) VarExists Save Support XOP built-in functions and code generation +mlwp +Target Report Mask(ISA_LWP) Var(ix86_isa_flags) VarExists Save +Support LWP built-in functions and code generation + mabm Target Report Mask(ISA_ABM) Var(ix86_isa_flags) VarExists Save Support code generation of Advanced Bit Manipulation (ABM) instructions. diff -upNw gcc-xop-2/gcc/config/i386/lwpintrin.h gcc-lwp/gcc/config/i386/lwpintrin.h --- gcc-xop-2/gcc/config/i386/lwpintrin.h 1969-12-31 18:00:00.000000000 -0600 +++ gcc-lwp/gcc/config/i386/lwpintrin.h 2009-09-30 16:33:28.000000000 -0500 @@ -0,0 +1,111 @@ +/* Copyright (C) 2007, 2008, 2009 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . */ + +#ifndef _X86INTRIN_H_INCLUDED +# error "Never use directly; include instead." +#endif + +#ifndef _LWPINTRIN_H_INCLUDED +#define _LWPINTRIN_H_INCLUDED + +#ifndef __LWP__ +# error "LWP instruction set not enabled" +#else + +extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) +__llwpcb16 (void *pcbAddress) +{ + __builtin_ia32_llwpcb16 (pcbAddress); +} + +extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) +__llwpcb32 (void *pcbAddress) +{ + __builtin_ia32_llwpcb32 (pcbAddress); +} + +extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) +__llwpcb64 (void *pcbAddress) +{ + __builtin_ia32_llwpcb64 (pcbAddress); +} + +extern __inline void * __attribute__((__gnu_inline__, __always_inline__, __artificial__)) +__slwpcb16 (void) +{ + return __builtin_ia32_slwpcb16 (); +} + +extern __inline void * __attribute__((__gnu_inline__, __always_inline__, __artificial__)) +__slwpcb32 (void) +{ + return __builtin_ia32_slwpcb32 (); +} + +extern __inline void * __attribute__((__gnu_inline__, __always_inline__, __artificial__)) +__slwpcb64 (void) +{ + return __builtin_ia32_slwpcb64 (); +} + +/* +extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) +__lwpval16 (unsigned short data2, unsigned int data1, unsigned short flags) +{ + __builtin_ia32_lwpval16 (data2, data1, flags); +} + +extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) +__lwpval32 (unsigned int data2, unsigned int data1, unsigned int flags) +{ + __builtin_ia32_lwpval32 (data2, data1, flags); +} + +extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) +__lwpval64 (unsigned __int64 data2, unsigned int data1, unsigned int flags) +{ + __builtin_ia32_lwpval64 (data2, data1, flags); +} + +extern __inline unsigned char __attribute__((__gnu_inline__, __always_inline__, __artificial__)) +__lwpins16 (unsigned short data2, unsigned int data1, unsigned short flags) +{ + return __builtin_ia32_lwpins16 (data2, data1, flags); +} + +extern __inline unsigned char __attribute__((__gnu_inline__, __always_inline__, __artificial__)) +__lwpins32 (unsigned int data2, unsigned int data1, unsigned int flags) +{ + return __builtin_ia32_lwpins32 (data2, data1, flags); +} + +extern __inline unsigned char __attribute__((__gnu_inline__, __always_inline__, __artificial__)) +__lwpins64 (unsigned __int64 data2, unsigned int data1, unsigned int flags) +{ + return __builtin_ia32_lwpins64 (data2, data1, flags); +} +*/ + +#endif /* __LWP__ */ + +#endif /* _LWPINTRIN_H_INCLUDED */ diff -upNw gcc-xop-2/gcc/config/i386/sse.md gcc-lwp/gcc/config/i386/sse.md --- gcc-xop-2/gcc/config/i386/sse.md 2009-09-29 19:41:03.000000000 -0500 +++ gcc-lwp/gcc/config/i386/sse.md 2009-09-30 16:33:28.000000000 -0500 @@ -12092,6 +12092,121 @@ (set_attr "length_immediate" "1") (set_attr "mode" "TI")]) + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; +;; LWP instructions +;; +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +(define_insn "lwp_llwpcbhi1" + [(unspec [(match_operand:HI 0 "register_operand" "r")] + UNSPEC_LLWP_INTRINSIC)] + "TARGET_LWP" + "llwpcb\t%0" + [(set_attr "type" "lwp") + (set_attr "mode" "HI")]) + +(define_insn "lwp_llwpcbsi1" + [(unspec [(match_operand:SI 0 "register_operand" "r")] + UNSPEC_LLWP_INTRINSIC)] + "TARGET_LWP" + "llwpcb\t%0" + [(set_attr "type" "lwp") + (set_attr "mode" "SI")]) + +(define_insn "lwp_llwpcbdi1" + [(unspec [(match_operand:DI 0 "register_operand" "r")] + UNSPEC_LLWP_INTRINSIC)] + "TARGET_LWP" + "llwpcb\t%0" + [(set_attr "type" "lwp") + (set_attr "mode" "DI")]) + +(define_insn "lwp_slwpcbhi1" + [(unspec [(match_operand:HI 0 "register_operand" "r")] + UNSPEC_SLWP_INTRINSIC)] + "TARGET_LWP" + "slwpcb\t%0" + [(set_attr "type" "lwp") + (set_attr "mode" "HI")]) + +(define_insn "lwp_slwpcbsi1" + [(unspec [(match_operand:SI 0 "register_operand" "r")] + UNSPEC_SLWP_INTRINSIC)] + "TARGET_LWP" + "slwpcb\t%0" + [(set_attr "type" "lwp") + (set_attr "mode" "SI")]) + +(define_insn "lwp_slwpcbdi1" + [(unspec [(match_operand:DI 0 "register_operand" "r")] + UNSPEC_SLWP_INTRINSIC)] + "TARGET_LWP" + "slwpcb\t%0" + [(set_attr "type" "lwp") + (set_attr "mode" "DI")]) + +;;(define_insn "lwp_lwpvalhi3" +;; [(unspec [(match_operand:HI 0 "register_operand" "r") +;; (match_operand:SI 1 "nonimmediate_operand" "rm") +;; (match_operand:HI 2 "const_int_operand" "")] +;; UNSPEC_LWPVAL_INTRINSIC)] +;; "TARGET_LWP" +;; "lwpval\t{%2, %1, %0|%0, %1, %2}" +;; [(set_attr "type" "lwp") +;; (set_attr "mode" "HI")]) + +;;(define_insn "lwp_lwpvalsi3" +;; [(unspec [(match_operand:SI 0 "register_operand" "r")] +;; (match_operand:SI 1 "nonimmediate_operand" "rm") +;; (match_operand:SI 2 "const_int_operand" "")] +;; UNSPEC_LWPVAL_INTRINSIC)] +;; "TARGET_LWP" +;; "lwpval\t{%2, %1, %0|%0, %1, %2}" +;; [(set_attr "type" "lwp") +;; (set_attr "mode" "SI")]) + +;;(define_insn "lwp_lwpvaldi3" +;; [(unspec [(match_operand:DI 0 "register_operand" "r")] +;; [(match_operand:SI 1 "nonimmediate_operand" "rm")] +;; [(match_operand:SI 2 "const_int_operand" "")] +;; UNSPEC_LWPVAL_INTRINSIC)] +;; "TARGET_LWP" +;; "lwpval\t{%2, %1, %0|%0, %1, %2}" +;; [(set_attr "type" "lwp") +;; (set_attr "mode" "DI")]) + +;;(define_insn "lwp_lwpinshi3" +;; [(unspec [(match_operand:HI 0 "register_operand" "r")] +;; (match_operand:SI 1 "nonimmediate_operand" "rm") +;; (match_operand:HI 2 "const_int_operand" "")] +;; UNSPEC_LWPINS_INTRINSIC)] +;; "TARGET_LWP" +;; "lwpins\t{%2, %1, %0|%0, %1, %2}" +;; [(set_attr "type" "lwp") +;; (set_attr "mode" "HI")]) + +;;(define_insn "lwp_lwpinssi3" +;; [(unspec [(match_operand:SI 0 "register_operand" "r") +;; (match_operand:SI 1 "nonimmediate_operand" "rm") +;; (match_operand:SI 2 "const_int_operand" "")] +;; UNSPEC_LWPINS_INTRINSIC)] +;; "TARGET_LWP" +;; "lwpins\t{%2, %1, %0|%0, %1, %2}" +;; [(set_attr "type" "lwp") +;; (set_attr "mode" "SI")]) + +;;(define_insn "lwp_lwpinsdi3" +;; [(unspec [(match_operand:DI 0 "register_operand" "r")] +;; (match_operand:SI 1 "nonimmediate_operand" "rm") +;; (match_operand:SI 2 "const_int_operand" "")] +;; UNSPEC_LWPINS_INTRINSIC)] +;; "TARGET_LWP" +;; "lwpins\t{%2, %1, %0|%0, %1, %2}" +;; [(set_attr "type" "lwp") +;; (set_attr "mode" "DI")]) + ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; (define_insn "*avx_aesenc" [(set (match_operand:V2DI 0 "register_operand" "=x") diff -upNw gcc-xop-2/gcc/config/i386/x86intrin.h gcc-lwp/gcc/config/i386/x86intrin.h --- gcc-xop-2/gcc/config/i386/x86intrin.h 2009-09-29 19:41:03.000000000 -0500 +++ gcc-lwp/gcc/config/i386/x86intrin.h 2009-09-30 16:33:28.000000000 -0500 @@ -62,6 +62,10 @@ #include #endif +#ifdef __LWP__ +#include +#endif + #if defined (__AES__) || defined (__PCLMUL__) #include #endif