public inbox for gcc-patches@gcc.gnu.org
* [patch][i386] Goldmont Plus -march/-mtune options
@ 2018-05-16 13:40 Makhotina, Olga
  2018-05-17  6:32 ` Uros Bizjak
  0 siblings, 1 reply; 2+ messages in thread
From: Makhotina, Olga @ 2018-05-16 13:40 UTC
  To: Uros Bizjak, gcc-patches; +Cc: Makhotina, Olga, Kirill Yukhin

[-- Attachment #1: Type: text/plain, Size: 1338 bytes --]

Hi,

This patch adds Goldmont Plus support for -march= and -mtune= (new CPU
type "goldmont-plus").
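
As a quick illustration of what the new option exposes to user code, here
is a minimal sketch. It relies only on the macro names added in the
i386-c.c hunks of the attached patch (__goldmont_plus__ and
__tune_goldmont_plus__). The function names compiled_arch, compiled_tune
and check_compiled_for are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: reports whether this translation unit was built
   with -march=goldmont-plus, based on the arch macro from the patch.  */
const char *
compiled_arch (void)
{
#if defined (__goldmont_plus__)
  return "goldmont-plus";
#else
  return "other";
#endif
}

/* Hypothetical helper: same idea for -mtune=goldmont-plus, based on the
   tune macro from the patch.  */
const char *
compiled_tune (void)
{
#if defined (__tune_goldmont_plus__)
  return "goldmont-plus";
#else
  return "other";
#endif
}

/* Sanity check: each helper returns one of the two known strings.  */
void
check_compiled_for (void)
{
  const char *a = compiled_arch ();
  const char *t = compiled_tune ();
  assert (!strcmp (a, "goldmont-plus") || !strcmp (a, "other"));
  assert (!strcmp (t, "goldmont-plus") || !strcmp (t, "other"));
}
```

With a compiler that lacks the patch, or without the new options on the
command line, both helpers simply report "other".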

2018-05-16  Olga Makhotina  <olga.makhotina@intel.com>

gcc/

	* config.gcc: Support "goldmont-plus".
	* config/i386/driver-i386.c (host_detect_local_cpu): Detect "goldmont-plus".
	* config/i386/i386-c.c (ix86_target_macros_internal): Handle
	PROCESSOR_GOLDMONT_PLUS.
	* config/i386/i386.c (m_GOLDMONT_PLUS): Define.
	(processor_target_table): Add "goldmont-plus".
	(PTA_GOLDMONT_PLUS): Define.
	(ix86_lea_outperforms): Add TARGET_GOLDMONT_PLUS.
	(get_builtin_code_for_version): Handle PROCESSOR_GOLDMONT_PLUS.
	(fold_builtin_cpu): Add M_INTEL_GOLDMONT_PLUS and "goldmont-plus".
	(ix86_add_stmt_cost): Add TARGET_GOLDMONT_PLUS.
	(ix86_option_override_internal): Add "goldmont-plus".
	* config/i386/i386.h (TARGET_GOLDMONT_PLUS): Define.
	(processor_type): Add PROCESSOR_GOLDMONT_PLUS.
	* config/i386/x86-tune.def: Add m_GOLDMONT_PLUS.
	* doc/invoke.texi: Add goldmont-plus as x86 -march=/-mtune= CPU type.

libgcc/

	* config/i386/cpuinfo.h (processor_types): Add INTEL_GOLDMONT_PLUS.
	* config/i386/cpuinfo.c (get_intel_cpu): Detect Goldmont Plus.

gcc/testsuite/

	* gcc.target/i386/builtin_target.c: Test goldmont-plus.
	* gcc.target/i386/funcspec-56.inc: Test arch=goldmont-plus.
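
For readers wondering where the 0x7a in the detection code comes from,
the sketch below shows how a family 6 display model is usually assembled
from the CPUID leaf 1 EAX signature: the 4-bit extended model is placed
above the 4-bit model. This is an illustration under stated assumptions,
not the code from the patch. The names display_model,
cpu_name_for_signature and check_goldmont_plus_mapping are hypothetical,
the signature 0x000706a0 is just a made-up value with family 6, extended
model 7 and model 0xa, and only the 0x7a case is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Combine the CPUID leaf 1 EAX fields into the display model.  For
   family 6 the extended model (bits 19:16) becomes the high nibble.  */
unsigned int
display_model (uint32_t eax)
{
  unsigned int model = (eax >> 4) & 0x0f;
  unsigned int family = (eax >> 8) & 0x0f;
  unsigned int extended_model = (eax >> 12) & 0xf0;

  if (family == 0x06)
    model += extended_model;
  return model;
}

/* Hypothetical helper: map a signature to a -march= name.  Only the
   0x7a case comes from the patch; everything else is "unknown".  */
const char *
cpu_name_for_signature (uint32_t eax)
{
  if (((eax >> 8) & 0x0f) != 0x06)
    return "unknown";

  switch (display_model (eax))
    {
    case 0x7a:
      /* Goldmont Plus.  */
      return "goldmont-plus";
    default:
      return "unknown";
    }
}

/* Self-check using the made-up signature described above.  */
void
check_goldmont_plus_mapping (void)
{
  assert (display_model (0x000706a0u) == 0x7a);
  assert (strcmp (cpu_name_for_signature (0x000706a0u),
		  "goldmont-plus") == 0);
}
```

The same model value is what the new case labels in driver-i386.c,
cpuinfo.c and builtin_target.c switch on.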

Is it Ok?

Thanks.

[-- Attachment #2: 0001-goldmont-plus.patch --]
[-- Type: application/octet-stream, Size: 21940 bytes --]

From 7ccc7fc722ceccde36731a374ca79992d1bd565a Mon Sep 17 00:00:00 2001
From: Olga Makhotina <olga.makhotina@intel.com>
Date: Thu, 26 Apr 2018 16:06:38 +0300
Subject: [PATCH] goldmont-plus

	modified:   gcc/config.gcc
	modified:   gcc/config/i386/driver-i386.c
	modified:   gcc/config/i386/i386-c.c
	modified:   gcc/config/i386/i386.c
	modified:   gcc/config/i386/i386.h
	modified:   gcc/config/i386/x86-tune.def
	modified:   gcc/doc/invoke.texi
	modified:   gcc/testsuite/gcc.target/i386/builtin_target.c
	modified:   gcc/testsuite/gcc.target/i386/funcspec-56.inc
	modified:   libgcc/config/i386/cpuinfo.c
	modified:   libgcc/config/i386/cpuinfo.h
---
 gcc/config.gcc                                 |  2 +-
 gcc/config/i386/driver-i386.c                  |  9 ++++-
 gcc/config/i386/i386-c.c                       |  7 ++++
 gcc/config/i386/i386.c                         | 22 ++++++++---
 gcc/config/i386/i386.h                         |  2 +
 gcc/config/i386/x86-tune.def                   | 55 +++++++++++++++-----------
 gcc/doc/invoke.texi                            |  5 +++
 gcc/testsuite/gcc.target/i386/builtin_target.c |  4 ++
 gcc/testsuite/gcc.target/i386/funcspec-56.inc  |  1 +
 libgcc/config/i386/cpuinfo.c                   |  4 ++
 libgcc/config/i386/cpuinfo.h                   |  1 +
 11 files changed, 81 insertions(+), 31 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 64d96da..61f1fe8 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -638,7 +638,7 @@ bdver3 bdver4 znver1 btver1 btver2 k8 k8-sse3 opteron opteron-sse3 nocona \
 core2 corei7 corei7-avx core-avx-i core-avx2 atom slm nehalem westmere \
 sandybridge ivybridge haswell broadwell bonnell silvermont knl knm \
 skylake-avx512 cannonlake icelake-client icelake-server skylake goldmont \
-x86-64 native"
+goldmont-plus x86-64 native"
 
 # Additional x86 processors supported by --with-cpu=.  Each processor
 # MUST be separated by exactly one space.
diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index 88cf6ea..10ff64b 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -760,6 +760,10 @@ const char *host_detect_local_cpu (int argc, const char **argv)
 	  /* Goldmont.  */
 	  cpu = "goldmont";
 	  break;
+	case 0x7a:
+	  /* Goldmont Plus.  */
+	  cpu = "goldmont-plus";
+	  break;
 	case 0x0f:
 	  /* Merom.  */
 	case 0x17:
@@ -864,7 +868,10 @@ const char *host_detect_local_cpu (int argc, const char **argv)
 		cpu = "sandybridge";
 	      else if (has_sse4_2)
 		{
-		  if (has_xsave)
+		  if (has_sgx)
+		    /* Assume Goldmont Plus.  */
+		    cpu = "goldmont-plus";
+		  else if (has_xsave)
 		    /* Assume Goldmont.  */
 		    cpu = "goldmont";
 		  else if (has_movbe)
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 3df599c..444c1ad 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -178,6 +178,10 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
       def_or_undef (parse_in, "__goldmont");
       def_or_undef (parse_in, "__goldmont__");
       break;
+    case PROCESSOR_GOLDMONT_PLUS:
+      def_or_undef (parse_in, "__goldmont_plus");
+      def_or_undef (parse_in, "__goldmont_plus__");
+      break;
     case PROCESSOR_KNL:
       def_or_undef (parse_in, "__knl");
       def_or_undef (parse_in, "__knl__");
@@ -318,6 +322,9 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
     case PROCESSOR_GOLDMONT:
       def_or_undef (parse_in, "__tune_goldmont__");
       break;
+    case PROCESSOR_GOLDMONT_PLUS:
+      def_or_undef (parse_in, "__tune_goldmont_plus__");
+      break;
     case PROCESSOR_KNL:
       def_or_undef (parse_in, "__tune_knl__");
       break;
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 4581094..27d448a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -150,6 +150,7 @@ const struct processor_costs *ix86_cost = NULL;
 #define m_ICELAKE_CLIENT (HOST_WIDE_INT_1U<<PROCESSOR_ICELAKE_CLIENT)
 #define m_ICELAKE_SERVER (HOST_WIDE_INT_1U<<PROCESSOR_ICELAKE_SERVER)
 #define m_GOLDMONT (HOST_WIDE_INT_1U<<PROCESSOR_GOLDMONT)
+#define m_GOLDMONT_PLUS (HOST_WIDE_INT_1U<<PROCESSOR_GOLDMONT_PLUS)
 #define m_INTEL (HOST_WIDE_INT_1U<<PROCESSOR_INTEL)
 
 #define m_GEODE (HOST_WIDE_INT_1U<<PROCESSOR_GEODE)
@@ -860,6 +861,7 @@ static const struct ptt processor_target_table[PROCESSOR_max] =
   {"bonnell", &atom_cost, 16, 15, 16, 7, 16},
   {"silvermont", &slm_cost, 16, 15, 16, 7, 16},
   {"goldmont", &slm_cost, 16, 15, 16, 7, 16},
+  {"goldmont-plus", &slm_cost, 16, 15, 16, 7, 16},
   {"knl", &slm_cost, 16, 15, 16, 7, 16},
   {"knm", &slm_cost, 16, 15, 16, 7, 16},
   {"skylake", &skylake_cost, 16, 10, 16, 10, 16},
@@ -3489,6 +3491,8 @@ ix86_option_override_internal (bool main_args_p,
   const wide_int_bitmask PTA_GOLDMONT = PTA_SILVERMONT | PTA_SHA | PTA_XSAVE
     | PTA_RDSEED | PTA_XSAVEC | PTA_XSAVES | PTA_CLFLUSHOPT | PTA_XSAVEOPT
     | PTA_FSGSBASE;
+  const wide_int_bitmask PTA_GOLDMONT_PLUS = PTA_GOLDMONT | PTA_RDPID
+    | PTA_SGX;
   const wide_int_bitmask PTA_KNM = PTA_KNL | PTA_AVX5124VNNIW
     | PTA_AVX5124FMAPS | PTA_AVX512VPOPCNTDQ;
 
@@ -3565,6 +3569,7 @@ ix86_option_override_internal (bool main_args_p,
       {"silvermont", PROCESSOR_SILVERMONT, CPU_SLM, PTA_SILVERMONT},
       {"slm", PROCESSOR_SILVERMONT, CPU_SLM, PTA_SILVERMONT},
       {"goldmont", PROCESSOR_GOLDMONT, CPU_GLM, PTA_GOLDMONT},
+      {"goldmont-plus", PROCESSOR_GOLDMONT_PLUS, CPU_GLM, PTA_GOLDMONT_PLUS},
       {"knl", PROCESSOR_KNL, CPU_SLM, PTA_KNL},
       {"knm", PROCESSOR_KNM, CPU_SLM, PTA_KNM},
       {"intel", PROCESSOR_INTEL, CPU_SLM, PTA_NEHALEM},
@@ -21239,7 +21244,8 @@ ix86_lea_outperforms (rtx_insn *insn, unsigned int regno0, unsigned int regno1,
   /* For Silvermont if using a 2-source or 3-source LEA for
      non-destructive destination purposes, or due to wanting
      ability to use SCALE, the use of LEA is justified.  */
-  if (TARGET_SILVERMONT || TARGET_GOLDMONT || TARGET_INTEL)
+  if (TARGET_SILVERMONT || TARGET_GOLDMONT || TARGET_GOLDMONT_PLUS
+      || TARGET_INTEL)
     {
       if (has_scale)
 	return true;
@@ -32398,10 +32404,14 @@ get_builtin_code_for_version (tree decl, tree *predicate_list)
 	      arg_str = "silvermont";
 	      priority = P_PROC_SSE4_2;
 	      break;
-	   case PROCESSOR_GOLDMONT:
+	    case PROCESSOR_GOLDMONT:
 	      arg_str = "goldmont";
 	      priority = P_PROC_SSE4_2;
 	      break;
+	    case PROCESSOR_GOLDMONT_PLUS:
+	      arg_str = "goldmont-plus";
+	      priority = P_PROC_SSE4_2;
+	      break;
 	    case PROCESSOR_AMDFAM10:
 	      arg_str = "amdfam10h";
 	      priority = P_PROC_SSE4_A;
@@ -33107,7 +33117,8 @@ fold_builtin_cpu (tree fndecl, tree *args)
     M_INTEL_COREI7_CANNONLAKE,
     M_INTEL_COREI7_ICELAKE_CLIENT,
     M_INTEL_COREI7_ICELAKE_SERVER,
-    M_INTEL_GOLDMONT
+    M_INTEL_GOLDMONT,
+    M_INTEL_GOLDMONT_PLUS
   };
 
   static struct _arch_names_table
@@ -33137,6 +33148,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
       {"bonnell", M_INTEL_BONNELL},
       {"silvermont", M_INTEL_SILVERMONT},
       {"goldmont", M_INTEL_GOLDMONT},
+      {"goldmont-plus", M_INTEL_GOLDMONT_PLUS},
       {"knl", M_INTEL_KNL},
       {"knm", M_INTEL_KNM},
       {"amdfam10h", M_AMDFAM10H},
@@ -50597,8 +50609,8 @@ ix86_add_stmt_cost (void *data, int count, enum vect_cost_for_stmt kind,
   /* We need to multiply all vector stmt cost by 1.7 (estimated cost)
      for Silvermont as it has out of order integer pipeline and can execute
      2 scalar instruction per tick, but has in order SIMD pipeline.  */
-  if ((TARGET_SILVERMONT || TARGET_GOLDMONT || TARGET_INTEL)
-      && stmt_info && stmt_info->stmt)
+  if ((TARGET_SILVERMONT || TARGET_GOLDMONT || TARGET_GOLDMONT_PLUS
+       || TARGET_INTEL) && stmt_info && stmt_info->stmt)
     {
       tree lhs_op = gimple_get_lhs (stmt_info->stmt);
       if (lhs_op && TREE_CODE (TREE_TYPE (lhs_op)) == INTEGER_TYPE)
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 787bf9f..0927982 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -386,6 +386,7 @@ extern const struct processor_costs ix86_size_cost;
 #define TARGET_BONNELL (ix86_tune == PROCESSOR_BONNELL)
 #define TARGET_SILVERMONT (ix86_tune == PROCESSOR_SILVERMONT)
 #define TARGET_GOLDMONT (ix86_tune == PROCESSOR_GOLDMONT)
+#define TARGET_GOLDMONT_PLUS (ix86_tune == PROCESSOR_GOLDMONT_PLUS)
 #define TARGET_KNL (ix86_tune == PROCESSOR_KNL)
 #define TARGET_KNM (ix86_tune == PROCESSOR_KNM)
 #define TARGET_SKYLAKE (ix86_tune == PROCESSOR_SKYLAKE)
@@ -2281,6 +2282,7 @@ enum processor_type
   PROCESSOR_BONNELL,
   PROCESSOR_SILVERMONT,
   PROCESSOR_GOLDMONT,
+  PROCESSOR_GOLDMONT_PLUS,
   PROCESSOR_KNL,
   PROCESSOR_KNM,
   PROCESSOR_SKYLAKE,
diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index ae9f42c..77d9934 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -42,7 +42,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 DEF_TUNE (X86_TUNE_SCHEDULE, "schedule",
           m_PENT | m_LAKEMONT | m_PPRO | m_CORE_ALL | m_BONNELL | m_SILVERMONT
 	  | m_INTEL | m_KNL | m_KNM | m_K6_GEODE | m_AMD_MULTIPLE | m_GOLDMONT
-	  | m_GENERIC)
+	  | m_GOLDMONT_PLUS | m_GENERIC)
 
 /* X86_TUNE_PARTIAL_REG_DEPENDENCY: Enable more register renaming
    on modern chips.  Preffer stores affecting whole integer register
@@ -50,7 +50,7 @@ DEF_TUNE (X86_TUNE_SCHEDULE, "schedule",
    value over movb.  */
 DEF_TUNE (X86_TUNE_PARTIAL_REG_DEPENDENCY, "partial_reg_dependency",
           m_P4_NOCONA | m_CORE2 | m_NEHALEM  | m_SANDYBRIDGE
-	  | m_BONNELL | m_SILVERMONT | m_GOLDMONT | m_INTEL
+	  | m_BONNELL | m_SILVERMONT | m_GOLDMONT | m_GOLDMONT_PLUS | m_INTEL
 	  | m_KNL | m_KNM | m_AMD_MULTIPLE | m_SKYLAKE_AVX512 | m_GENERIC)
 
 /* X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY: This knob promotes all store
@@ -86,13 +86,15 @@ DEF_TUNE (X86_TUNE_PARTIAL_FLAG_REG_STALL, "partial_flag_reg_stall",
 DEF_TUNE (X86_TUNE_MOVX, "movx",
           m_PPRO | m_P4_NOCONA | m_CORE2 | m_NEHALEM  | m_SANDYBRIDGE
 	  | m_BONNELL | m_SILVERMONT | m_GOLDMONT | m_KNL | m_KNM | m_INTEL
-	  | m_GEODE | m_AMD_MULTIPLE | m_SKYLAKE_AVX512 | m_GENERIC)
+	  | m_GOLDMONT_PLUS | m_GEODE | m_AMD_MULTIPLE | m_SKYLAKE_AVX512
+	  | m_GENERIC)
 
 /* X86_TUNE_MEMORY_MISMATCH_STALL: Avoid partial stores that are followed by
    full sized loads.  */
 DEF_TUNE (X86_TUNE_MEMORY_MISMATCH_STALL, "memory_mismatch_stall",
           m_P4_NOCONA | m_CORE_ALL | m_BONNELL | m_SILVERMONT | m_INTEL
-	  | m_KNL | m_KNM | m_GOLDMONT | m_AMD_MULTIPLE | m_GENERIC)
+	  | m_KNL | m_KNM | m_GOLDMONT | m_GOLDMONT_PLUS | m_AMD_MULTIPLE
+	  | m_GENERIC)
 
 /* X86_TUNE_FUSE_CMP_AND_BRANCH_32: Fuse compare with a subsequent
    conditional jump instruction for 32 bit TARGET.  */
@@ -131,7 +133,7 @@ DEF_TUNE (X86_TUNE_FUSE_ALU_AND_BRANCH, "fuse_alu_and_branch",
 
 DEF_TUNE (X86_TUNE_ACCUMULATE_OUTGOING_ARGS, "accumulate_outgoing_args",
 	  m_PPRO | m_P4_NOCONA | m_BONNELL | m_SILVERMONT | m_KNL | m_KNM | m_INTEL
-	  | m_GOLDMONT | m_ATHLON_K8)
+	  | m_GOLDMONT | m_GOLDMONT_PLUS | m_ATHLON_K8)
 
 /* X86_TUNE_PROLOGUE_USING_MOVE: Do not use push/pop in prologues that are
    considered on critical path.  */
@@ -193,7 +195,7 @@ DEF_TUNE (X86_TUNE_PAD_RETURNS, "pad_returns",
    than 4 branch instructions in the 16 byte window.  */
 DEF_TUNE (X86_TUNE_FOUR_JUMP_LIMIT, "four_jump_limit",
           m_PPRO | m_P4_NOCONA | m_BONNELL | m_SILVERMONT | m_KNL | m_KNM
-	  | m_GOLDMONT | m_INTEL | m_ATHLON_K8 | m_AMDFAM10)
+	  | m_GOLDMONT | m_GOLDMONT_PLUS | m_INTEL | m_ATHLON_K8 | m_AMDFAM10)
 
 /*****************************************************************************/
 /* Integer instruction selection tuning                                      */
@@ -222,23 +224,24 @@ DEF_TUNE (X86_TUNE_READ_MODIFY, "read_modify", ~(m_PENT | m_LAKEMONT | m_PPRO))
 DEF_TUNE (X86_TUNE_USE_INCDEC, "use_incdec",
           ~(m_P4_NOCONA | m_CORE2 | m_NEHALEM  | m_SANDYBRIDGE
 	    | m_BONNELL | m_SILVERMONT | m_INTEL |  m_KNL | m_KNM | m_GOLDMONT
-	    | m_GENERIC))
+	    | m_GOLDMONT_PLUS | m_GENERIC))
 
 /* X86_TUNE_INTEGER_DFMODE_MOVES: Enable if integer moves are preferred
    for DFmode copies */
 DEF_TUNE (X86_TUNE_INTEGER_DFMODE_MOVES, "integer_dfmode_moves",
           ~(m_PPRO | m_P4_NOCONA | m_CORE_ALL | m_BONNELL | m_SILVERMONT
 	    | m_KNL | m_KNM | m_INTEL | m_GEODE | m_AMD_MULTIPLE | m_GOLDMONT
-	    | m_GENERIC))
+	    | m_GOLDMONT_PLUS | m_GENERIC))
 
 /* X86_TUNE_OPT_AGU: Optimize for Address Generation Unit. This flag
    will impact LEA instruction selection. */
 DEF_TUNE (X86_TUNE_OPT_AGU, "opt_agu", m_BONNELL | m_SILVERMONT | m_KNL
-	 | m_KNM | m_GOLDMONT | m_INTEL)
+	 | m_KNM | m_GOLDMONT | m_GOLDMONT_PLUS | m_INTEL)
 
 /* X86_TUNE_AVOID_LEA_FOR_ADDR: Avoid lea for address computation.  */
 DEF_TUNE (X86_TUNE_AVOID_LEA_FOR_ADDR, "avoid_lea_for_addr",
-	  m_BONNELL | m_SILVERMONT | m_GOLDMONT | m_KNL | m_KNM)
+	  m_BONNELL | m_SILVERMONT | m_GOLDMONT | m_GOLDMONT_PLUS | m_KNL
+	  | m_KNM)
 
 /* X86_TUNE_SLOW_IMUL_IMM32_MEM: Imul of 32-bit constant and memory is
    vector path on AMD machines.
@@ -255,7 +258,8 @@ DEF_TUNE (X86_TUNE_SLOW_IMUL_IMM8, "slow_imul_imm8",
 /* X86_TUNE_AVOID_MEM_OPND_FOR_CMOVE: Try to avoid memory operands for
    a conditional move.  */
 DEF_TUNE (X86_TUNE_AVOID_MEM_OPND_FOR_CMOVE, "avoid_mem_opnd_for_cmove",
-	  m_BONNELL | m_SILVERMONT | m_GOLDMONT | m_KNL | m_KNM | m_INTEL)
+	  m_BONNELL | m_SILVERMONT | m_GOLDMONT | m_GOLDMONT_PLUS | m_KNL
+	  | m_KNM | m_INTEL)
 
 /* X86_TUNE_SINGLE_STRINGOP: Enable use of single string operations, such
    as MOVS and STOS (without a REP prefix) to move/set sequences of bytes.  */
@@ -274,17 +278,18 @@ DEF_TUNE (X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES,
 DEF_TUNE (X86_TUNE_USE_SAHF, "use_sahf",
           m_PPRO | m_P4_NOCONA | m_CORE_ALL | m_BONNELL | m_SILVERMONT
 	  | m_KNL | m_KNM | m_INTEL | m_K6_GEODE | m_K8 | m_AMDFAM10 | m_BDVER
-	  | m_BTVER | m_ZNVER1 | m_GOLDMONT | m_GENERIC)
+	  | m_BTVER | m_ZNVER1 | m_GOLDMONT | m_GOLDMONT_PLUS | m_GENERIC)
 
 /* X86_TUNE_USE_CLTD: Controls use of CLTD and CTQO instructions.  */
 DEF_TUNE (X86_TUNE_USE_CLTD, "use_cltd",
 	  ~(m_PENT | m_LAKEMONT | m_BONNELL | m_SILVERMONT | m_KNL | m_KNM | m_INTEL
-	    | m_K6 | m_GOLDMONT))
+	    | m_K6 | m_GOLDMONT | m_GOLDMONT_PLUS))
 
 /* X86_TUNE_USE_BT: Enable use of BT (bit test) instructions.  */
 DEF_TUNE (X86_TUNE_USE_BT, "use_bt",
           m_CORE_ALL | m_BONNELL | m_SILVERMONT | m_KNL | m_KNM | m_INTEL
-	  | m_LAKEMONT | m_AMD_MULTIPLE | m_GOLDMONT | m_GENERIC)
+	  | m_LAKEMONT | m_AMD_MULTIPLE | m_GOLDMONT | m_GOLDMONT_PLUS
+	  | m_GENERIC)
 
 /* X86_TUNE_AVOID_FALSE_DEP_FOR_BMI: Avoid false dependency
    for bit-manipulation instructions.  */
@@ -301,7 +306,7 @@ DEF_TUNE (X86_TUNE_ADJUST_UNROLL, "adjust_unroll_factor", m_BDVER3 | m_BDVER4)
    if-converted sequence to one.  */
 DEF_TUNE (X86_TUNE_ONE_IF_CONV_INSN, "one_if_conv_insn",
 	  m_SILVERMONT | m_KNL | m_KNM | m_INTEL | m_CORE_ALL | m_GOLDMONT
-	  | m_GENERIC)
+	  | m_GOLDMONT_PLUS | m_GENERIC)
 
 /*****************************************************************************/
 /* 387 instruction selection tuning                                          */
@@ -318,7 +323,7 @@ DEF_TUNE (X86_TUNE_USE_HIMODE_FIOP, "use_himode_fiop",
 DEF_TUNE (X86_TUNE_USE_SIMODE_FIOP, "use_simode_fiop",
           ~(m_PENT | m_LAKEMONT | m_PPRO | m_CORE_ALL | m_BONNELL
 	    | m_SILVERMONT | m_KNL | m_KNM | m_INTEL | m_AMD_MULTIPLE
-	    | m_GOLDMONT | m_GENERIC))
+	    | m_GOLDMONT | m_GOLDMONT_PLUS | m_GENERIC))
 
 /* X86_TUNE_USE_FFREEP: Use freep instruction instead of fstp.  */
 DEF_TUNE (X86_TUNE_USE_FFREEP, "use_ffreep", m_AMD_MULTIPLE)
@@ -327,7 +332,7 @@ DEF_TUNE (X86_TUNE_USE_FFREEP, "use_ffreep", m_AMD_MULTIPLE)
 DEF_TUNE (X86_TUNE_EXT_80387_CONSTANTS, "ext_80387_constants",
           m_PPRO | m_P4_NOCONA | m_CORE_ALL | m_BONNELL | m_SILVERMONT
 	  | m_KNL | m_KNM | m_INTEL | m_K6_GEODE | m_ATHLON_K8 | m_GOLDMONT
-	  | m_GENERIC)
+	  | m_GOLDMONT_PLUS | m_GENERIC)
 
 /*****************************************************************************/
 /* SSE instruction selection tuning                                          */
@@ -342,15 +347,15 @@ DEF_TUNE (X86_TUNE_GENERAL_REGS_SSE_SPILL, "general_regs_sse_spill",
    of a sequence loading registers by parts.  */
 DEF_TUNE (X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL, "sse_unaligned_load_optimal",
 	  m_NEHALEM | m_SANDYBRIDGE | m_HASWELL | m_SILVERMONT | m_KNL | m_KNM
-	  | m_INTEL | m_SKYLAKE_AVX512 | m_GOLDMONT | m_AMDFAM10 | m_BDVER
-	  | m_BTVER | m_ZNVER1 | m_GENERIC)
+	  | m_INTEL | m_SKYLAKE_AVX512 | m_GOLDMONT | m_GOLDMONT_PLUS
+	  | m_AMDFAM10 | m_BDVER | m_BTVER | m_ZNVER1 | m_GENERIC)
 
 /* X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL: Use movups for misaligned stores instead
    of a sequence loading registers by parts.  */
 DEF_TUNE (X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL, "sse_unaligned_store_optimal",
 	  m_NEHALEM | m_SANDYBRIDGE | m_HASWELL | m_SILVERMONT | m_KNL | m_KNM
-	  | m_INTEL | m_SKYLAKE_AVX512 | m_GOLDMONT | m_BDVER | m_ZNVER1
-	  | m_GENERIC)
+	  | m_INTEL | m_SKYLAKE_AVX512 | m_GOLDMONT | m_GOLDMONT_PLUS
+	  | m_BDVER | m_ZNVER1 | m_GENERIC)
 
 /* Use packed single precision instructions where posisble.  I.e. movups instead
    of movupd.  */
@@ -387,7 +392,8 @@ DEF_TUNE (X86_TUNE_INTER_UNIT_CONVERSIONS, "inter_unit_conversions",
 /* X86_TUNE_SPLIT_MEM_OPND_FOR_FP_CONVERTS: Try to split memory operand for
    fp converts to destination register.  */
 DEF_TUNE (X86_TUNE_SPLIT_MEM_OPND_FOR_FP_CONVERTS, "split_mem_opnd_for_fp_converts",
-	  m_SILVERMONT | m_KNL | m_KNM | m_GOLDMONT | m_INTEL)
+	  m_SILVERMONT | m_KNL | m_KNM | m_GOLDMONT | m_GOLDMONT_PLUS
+	  | m_INTEL)
 
 /* X86_TUNE_USE_VECTOR_FP_CONVERTS: Prefer vector packed SSE conversion
    from FP to FP.  This form of instructions avoids partial write to the
@@ -401,11 +407,12 @@ DEF_TUNE (X86_TUNE_USE_VECTOR_CONVERTS, "use_vector_converts", m_AMDFAM10)
 
 /* X86_TUNE_SLOW_SHUFB: Indicates tunings with slow pshufb instruction.  */
 DEF_TUNE (X86_TUNE_SLOW_PSHUFB, "slow_pshufb",
-	  m_BONNELL | m_SILVERMONT | m_KNL | m_KNM | m_GOLDMONT | m_INTEL)
+	  m_BONNELL | m_SILVERMONT | m_KNL | m_KNM | m_GOLDMONT
+	  | m_GOLDMONT_PLUS | m_INTEL)
 
 /* X86_TUNE_AVOID_4BYTE_PREFIXES: Avoid instructions requiring 4+ bytes of prefixes.  */
 DEF_TUNE (X86_TUNE_AVOID_4BYTE_PREFIXES, "avoid_4byte_prefixes",
-	  m_SILVERMONT | m_GOLDMONT | m_INTEL)
+	  m_SILVERMONT | m_GOLDMONT | m_GOLDMONT_PLUS | m_INTEL)
 
 /* X86_TUNE_USE_GATHER: Use gather instructions.  */
 DEF_TUNE (X86_TUNE_USE_GATHER, "use_gather",
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9900f20..a751918 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -26571,6 +26571,11 @@ Intel Goldmont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3,
 SSE4.1, SSE4.2, POPCNT, AES, PCLMUL, RDRND, XSAVE, XSAVEOPT and FSGSBASE
 instruction set support.
 
+@item goldmont-plus
+Intel Goldmont Plus CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3,
+SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PCLMUL, RDRND, XSAVE, XSAVEOPT, FSGSBASE,
+PTWRITE, RDPID, SGX and UMIP instruction set support.
+
 @item knl
 Intel Knight's Landing CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3,
 SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA,
diff --git a/gcc/testsuite/gcc.target/i386/builtin_target.c b/gcc/testsuite/gcc.target/i386/builtin_target.c
index 024212c..1a7a9f3 100644
--- a/gcc/testsuite/gcc.target/i386/builtin_target.c
+++ b/gcc/testsuite/gcc.target/i386/builtin_target.c
@@ -43,6 +43,10 @@ check_intel_cpu_model (unsigned int family, unsigned int model,
 	      /* Goldmont.  */
 	      assert (__builtin_cpu_is ("goldmont"));
 	      break;
+	    case 0x7a:
+	      /* Goldmont Plus.  */
+	      assert (__builtin_cpu_is ("goldmont-plus"));
+	      break;
 	    case 0x57:
 	      /* Knights Landing.  */
 	      assert (__builtin_cpu_is ("knl"));
diff --git a/gcc/testsuite/gcc.target/i386/funcspec-56.inc b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
index 72519ba..6a11038 100644
--- a/gcc/testsuite/gcc.target/i386/funcspec-56.inc
+++ b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
@@ -143,6 +143,7 @@ extern void test_arch_corei7_avx (void)		__attribute__((__target__("arch=corei7-
 extern void test_arch_core_avx2 (void)		__attribute__((__target__("arch=core-avx2")));
 extern void test_arch_silvermont (void)		__attribute__((__target__("arch=silvermont")));
 extern void test_arch_goldmont (void)		__attribute__((__target__("arch=goldmont")));
+extern void test_arch_goldmont_plus (void)	__attribute__((__target__("arch=goldmont-plus")));
 extern void test_arch_knl (void)		__attribute__((__target__("arch=knl")));
 extern void test_arch_knm (void)		__attribute__((__target__("arch=knm")));
 extern void test_arch_skylake (void)		__attribute__((__target__("arch=skylake")));
diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c
index 7e6c7a4..8c9878c 100644
--- a/libgcc/config/i386/cpuinfo.c
+++ b/libgcc/config/i386/cpuinfo.c
@@ -145,6 +145,10 @@ get_intel_cpu (unsigned int family, unsigned int model, unsigned int brand_id)
 	      /* Goldmont.  */
 	      __cpu_model.__cpu_type = INTEL_GOLDMONT;
 	      break;
+	    case 0x7a:
+	      /* Goldmont Plus.  */
+	      __cpu_model.__cpu_type = INTEL_GOLDMONT_PLUS;
+	      break;
 	    case 0x57:
 	      /* Knights Landing.  */
 	      __cpu_model.__cpu_type = INTEL_KNL;
diff --git a/libgcc/config/i386/cpuinfo.h b/libgcc/config/i386/cpuinfo.h
index 18db199..ace07df 100644
--- a/libgcc/config/i386/cpuinfo.h
+++ b/libgcc/config/i386/cpuinfo.h
@@ -49,6 +49,7 @@ enum processor_types
   AMDFAM17H,
   INTEL_KNM,
   INTEL_GOLDMONT,
+  INTEL_GOLDMONT_PLUS,
   CPU_TYPE_MAX
 };
 
-- 
2.5.5



* Re: [patch][i386] Goldmont Plus -march/-mtune options
  2018-05-16 13:40 [patch][i386] Goldmont Plus -march/-mtune options Makhotina, Olga
@ 2018-05-17  6:32 ` Uros Bizjak
  0 siblings, 0 replies; 2+ messages in thread
From: Uros Bizjak @ 2018-05-17  6:32 UTC
  To: Makhotina, Olga; +Cc: gcc-patches, Kirill Yukhin

On Wed, May 16, 2018 at 3:37 PM, Makhotina, Olga
<olga.makhotina@intel.com> wrote:
> Hi,
>
> This patch implements Goldmont Plus -march/-mtune.
>
> 2018-05-16  Olga Makhotina  <olga.makhotina@intel.com>
>
> gcc/
>
>         * config.gcc: Support "goldmont-plus".
>         * config/i386/driver-i386.c (host_detect_local_cpu): Detect "goldmont-plus".
>         * config/i386/i386-c.c (ix86_target_macros_internal): Handle
>         PROCESSOR_GOLDMONT_PLUS.
>         * config/i386/i386.c (m_GOLDMONT_PLUS): Define.
>         (processor_target_table): Add "goldmont-plus".
>         (PTA_GOLDMONT_PLUS): Define.
>         (ix86_lea_outperforms): Add TARGET_GOLDMONT_PLUS.
>         (get_builtin_code_for_version): Handle PROCESSOR_GOLDMONT_PLUS.
>         (fold_builtin_cpu): Add M_INTEL_GOLDMONT_PLUS.
>         (fold_builtin_cpu): Add "goldmont-plus".
>         (ix86_add_stmt_cost): Add TARGET_GOLDMONT_PLUS.
>         (ix86_option_override_internal): Add "goldmont-plus".
>         * config/i386/i386.h (processor_costs): Define TARGET_GOLDMONT_PLUS.
>         (processor_type): Add PROCESSOR_GOLDMONT_PLUS.
>         * config/i386/x86-tune.def: Add m_GOLDMONT_PLUS.
>         * doc/invoke.texi: Add goldmont-plus as x86 -march=/-mtune= CPU type.
>
> libgcc/
>
>         * config/i386/cpuinfo.h (processor_types): Add INTEL_GOLDMONT_PLUS.
>         * config/i386/cpuinfo.c (get_intel_cpu): Detect Goldmont Plus.
>
> gcc/testsuite/
>
>         * gcc.target/i386/builtin_target.c: Test goldmont-plus.
>         * gcc.target/i386/funcspec-56.inc: Test arch=goldmont-plus.
>
> Is it Ok?

OK for mainline.

Thanks,
Uros.
