public inbox for gcc-patches@gcc.gnu.org
* RE: [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass.
@ 2021-08-19 16:00 Roger Sayle
  2021-08-20  7:28 ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Roger Sayle @ 2021-08-19 16:00 UTC (permalink / raw)
  To: 'GCC Patches'

[-- Attachment #1: Type: text/plain, Size: 1574 bytes --]


Doh!  ENOPATCH.

-----Original Message-----
From: Roger Sayle <roger@nextmovesoftware.com> 
Sent: 19 August 2021 16:59
To: 'GCC Patches' <gcc-patches@gcc.gnu.org>
Subject: [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass.


Back in June I briefly mentioned in one of my gcc-patches posts that a
change that should always have reduced code size would, mysteriously,
occasionally result in slightly larger code (according to CSiBE):
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573233.html

Investigating further, the cause turns out to be that x86_64's
scalar-to-vector (stv) pass is relying on poor estimates of the size
costs/benefits.  This patch tweaks the backend's compute_convert_gain method
to provide slightly more accurate values when compiling with -Os.
Compilation without -Os should be unaffected.  For completeness, I'll
mention that the stv pass is a net win for code size, so it's much better
to improve its heuristics than to simply gate the pass on
!optimize_for_size.
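
As a rough illustration of the size accounting described above, here is a
minimal standalone sketch in C. It is not the GCC code itself: the enum
const_kind and the helper size_gain_for_const_load are hypothetical names
invented for this example. Only the COSTS_N_BYTES definition and the byte
counts (xor 2 vs. xorps 3, mov 5 vs. movaps 7) are taken from the attached
patch.

```c
/* Same definition that the patch moves into i386.h.  */
#define COSTS_N_BYTES(N) ((N) * 2)

/* Hypothetical classification of the constant being loaded
   (an assumption for this sketch, not a GCC type).  */
enum const_kind { CONST_ZERO, CONST_IMM32, CONST_OTHER };

/* Hypothetical helper: the -Os adjustment to a chain's gain when a
   constant load into a register is converted from scalar to vector.
   A negative result means the vector form is larger.  */
int
size_gain_for_const_load (enum const_kind kind)
{
  int igain = 0;

  switch (kind)
    {
    case CONST_ZERO:
      /* xor (2 bytes) vs. xorps (3 bytes): one byte larger.  */
      igain -= COSTS_N_BYTES (3 - 2);
      break;
    case CONST_IMM32:
      /* mov (5 bytes) vs. movaps (7 bytes): two bytes larger.  */
      igain -= COSTS_N_BYTES (7 - 5);
      break;
    case CONST_OTHER:
      /* The patch makes no -Os adjustment for other constants.  */
      break;
    }
  return igain;
}
```

Under these assumptions a zero load lowers the gain by 2 units and a 32-bit
immediate load by 4, so a chain dominated by constant loads is less likely
to be converted when optimizing for size.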

The net effect of this change is to save 1399 bytes on the CSiBE code size
benchmark when compiling with -Os.

This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
and "make -k check" with no new failures.

Ok for mainline?


2021-08-19  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386-features.c (compute_convert_gain): Provide
	more accurate values for CONST_INT, when optimizing for size.
	* config/i386/i386.c (COSTS_N_BYTES): Move definition from here...
	* config/i386/i386.h (COSTS_N_BYTES): to here.

Roger
--


[-- Attachment #2: patchz.txt --]
[-- Type: text/plain, Size: 2443 bytes --]

diff --git a/gcc/config/i386/i386-features.c b/gcc/config/i386/i386-features.c
index d9c6652..cdae3dc 100644
--- a/gcc/config/i386/i386-features.c
+++ b/gcc/config/i386/i386-features.c
@@ -610,12 +610,31 @@ general_scalar_chain::compute_convert_gain ()
 
 	  case CONST_INT:
 	    if (REG_P (dst))
-	      /* DImode can be immediate for TARGET_64BIT and SImode always.  */
-	      igain += m * COSTS_N_INSNS (1);
+	      {
+		if (optimize_insn_for_size_p ())
+		  {
+		    /* xor (2 bytes) vs. xorps (3 bytes).  */
+		    if (src == const0_rtx)
+		      igain -= COSTS_N_BYTES (1);
+		    /* movdi_internal vs. movv2di_internal.  */
+		    /* => mov (5 bytes) vs. movaps (7 bytes).  */
+		    else if (x86_64_immediate_operand (src, SImode))
+		      igain -= COSTS_N_BYTES (2);
+		  }
+		else
+		  {
+		    /* DImode can be immediate for TARGET_64BIT
+		       and SImode always.  */
+		    igain += m * COSTS_N_INSNS (1);
+		    igain -= vector_const_cost (src);
+		  }
+	      }
 	    else if (MEM_P (dst))
-	      igain += (m * ix86_cost->int_store[2]
-			- ix86_cost->sse_store[sse_cost_idx]);
-	    igain -= vector_const_cost (src);
+	      {
+		igain += (m * ix86_cost->int_store[2]
+			  - ix86_cost->sse_store[sse_cost_idx]);
+		igain -= vector_const_cost (src);
+	      }
 	    break;
 
 	  default:
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 4d4ab6a..5abf2a6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -19982,8 +19982,6 @@ ix86_division_cost (const struct processor_costs *cost,
     return cost->divide[MODE_INDEX (mode)];
 }
 
-#define COSTS_N_BYTES(N) ((N) * 2)
-
 /* Return cost of shift in MODE.
    If CONSTANT_OP1 is true, the op1 value is known and set in OP1_VAL.
    AND_IN_OP1 specify in op1 is result of and and SHIFT_AND_TRUNCATE
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 21fe51b..edbfcaf 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -88,6 +88,11 @@ struct stringop_algs
   } size [MAX_STRINGOP_ALGS];
 };
 
+/* Analog of COSTS_N_INSNS when optimizing for size.  */
+#ifndef COSTS_N_BYTES
+#define COSTS_N_BYTES(N) ((N) * 2)
+#endif
+
 /* Define the specific costs for a given cpu.  NB: hard_register is used
    by TARGET_REGISTER_MOVE_COST and TARGET_MEMORY_MOVE_COST to compute
    hard register move costs by register allocator.  Relative costs of

* [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass.
@ 2021-08-19 15:59 Roger Sayle
  0 siblings, 0 replies; 7+ messages in thread
From: Roger Sayle @ 2021-08-19 15:59 UTC (permalink / raw)
  To: 'GCC Patches'





end of thread, other threads:[~2021-08-24  2:08 UTC | newest]

Thread overview: 7+ messages
2021-08-19 16:00 [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass Roger Sayle
2021-08-20  7:28 ` Richard Biener
2021-08-20 10:20   ` Roger Sayle
2021-08-20 19:55   ` Roger Sayle
2021-08-23 13:47     ` Richard Biener
2021-08-24  2:08       ` Roger Sayle
  -- strict thread matches above, loose matches on Subject: below --
2021-08-19 15:59 Roger Sayle
