public inbox for gcc-patches@gcc.gnu.org
* Update x86-tune-costs.h for znver2
From: Jan Hubicka @ 2019-07-23  9:34 UTC
  To: gcc-patches

Hi,
this patch updates the znver2 costs to match reality.  In particular, we
re-benchmarked the memcpy strategies, and it looks like glibc now wins even
for relatively small blocks.
Moreover, I updated the move costs to reflect that znver2 has 256-bit vector
paths and faster multiplication.
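
To illustrate what the threshold change means in practice, here is a
rough sketch of how such a table can be read: entries are ordered by
maximum block size, the first entry whose limit covers the expected size
wins, and -1 means "any size".  The type and function names below
(struct algs, pick_alg, ALG_*) are simplified stand-ins made up for this
example, not the real GCC definitions, and the actual selection logic in
the i386 back end takes more into account (alignment, user overrides and
so on).

```c
#include <stddef.h>

/* Illustrative stand-ins for the stringop algorithm kinds.  */
enum alg { ALG_LIBCALL, ALG_LOOP, ALG_UNROLLED_LOOP, ALG_REP4, ALG_REP8 };

/* One threshold entry: use ALG for blocks of at most MAX bytes.  */
struct strategy { int max; enum alg alg; };

struct algs {
  enum alg unknown_size;     /* used when the size is not known */
  struct strategy size[4];   /* ascending limits; max == -1 is "any size" */
};

/* 64-bit memcpy row before the patch.  */
static const struct algs memcpy64_old = {
  ALG_LIBCALL, {{16, ALG_LOOP}, {8192, ALG_REP8}, {-1, ALG_LIBCALL}}
};

/* 64-bit memcpy row after the patch.  */
static const struct algs memcpy64_new = {
  ALG_LIBCALL, {{16, ALG_LOOP}, {64, ALG_REP4}, {-1, ALG_LIBCALL}}
};

/* Return the first strategy whose limit covers EXPECTED_SIZE.
   A negative EXPECTED_SIZE stands for "size unknown".  */
static enum alg pick_alg (const struct algs *a, long expected_size)
{
  if (expected_size < 0)
    return a->unknown_size;
  for (size_t i = 0; i < sizeof a->size / sizeof a->size[0]; i++)
    if (a->size[i].max == -1 || a->size[i].max >= expected_size)
      return a->size[i].alg;
  return a->unknown_size;
}
```

Under this reading, a 1024-byte copy that used to be expanded inline as
rep movsq (ALG_REP8 in the sketch) now goes to the library call, while
copies of up to 16 bytes still use the simple loop.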

Bootstrapped/regtested x86_64-linux, committed.

Honza

	* x86-tune-costs.h (znver2_memcpy): Update.
	(znver2_costs): Update 256 bit SSE costs and multiplication.
Index: config/i386/x86-tune-costs.h
===================================================================
--- config/i386/x86-tune-costs.h	(revision 273727)
+++ config/i386/x86-tune-costs.h	(working copy)
@@ -1279,12 +1279,12 @@ struct processor_costs znver1_cost = {
 static stringop_algs znver2_memcpy[2] = {
   {libcall, {{6, loop, false}, {14, unrolled_loop, false},
 	     {-1, rep_prefix_4_byte, false}}},
-  {libcall, {{16, loop, false}, {8192, rep_prefix_8_byte, false},
+  {libcall, {{16, loop, false}, {64, rep_prefix_4_byte, false},
 	     {-1, libcall, false}}}};
 static stringop_algs znver2_memset[2] = {
   {libcall, {{8, loop, false}, {24, unrolled_loop, false},
 	     {2048, rep_prefix_4_byte, false}, {-1, libcall, false}}},
-  {libcall, {{48, unrolled_loop, false}, {8192, rep_prefix_8_byte, false},
+  {libcall, {{24, rep_prefix_4_byte, false}, {128, rep_prefix_8_byte, false},
 	     {-1, libcall, false}}}};
 
 struct processor_costs znver2_cost = {
@@ -1335,11 +1335,11 @@ struct processor_costs znver2_cost = {
 					   in SImode and DImode.  */
   {8, 8},				/* cost of storing MMX registers
 					   in SImode and DImode.  */
-  2, 3, 6,				/* cost of moving XMM,YMM,ZMM
+  2, 2, 3,				/* cost of moving XMM,YMM,ZMM
 					   register.  */
-  {6, 6, 6, 10, 20},			/* cost of loading SSE registers
+  {6, 6, 6, 6, 12},			/* cost of loading SSE registers
 					   in 32,64,128,256 and 512-bit.  */
-  {6, 6, 6, 10, 20},			/* cost of unaligned loads.  */
+  {6, 6, 6, 6, 12},			/* cost of unaligned loads.  */
   {8, 8, 8, 8, 16},			/* cost of storing SSE registers
 					   in 32,64,128,256 and 512-bit.  */
   {8, 8, 8, 8, 16},			/* cost of unaligned stores.  */
@@ -1372,7 +1372,7 @@ struct processor_costs znver2_cost = {
   COSTS_N_INSNS (1),			/* cost of cheap SSE instruction.  */
   COSTS_N_INSNS (3),			/* cost of ADDSS/SD SUBSS/SD insns.  */
   COSTS_N_INSNS (3),			/* cost of MULSS instruction.  */
-  COSTS_N_INSNS (4),			/* cost of MULSD instruction.  */
+  COSTS_N_INSNS (3),			/* cost of MULSD instruction.  */
   COSTS_N_INSNS (5),			/* cost of FMA SS instruction.  */
   COSTS_N_INSNS (5),			/* cost of FMA SD instruction.  */
   COSTS_N_INSNS (10),			/* cost of DIVSS instruction.  */
