public inbox for gcc-patches@gcc.gnu.org
* [PATCH] RISC-V: Synthesize power-of-two constants.
@ 2023-05-30 19:13 Robin Dapp
From: Robin Dapp @ 2023-05-30 19:13 UTC
  To: gcc-patches, Kito Cheng, palmer, juzhe.zhong, jeffreyalaw; +Cc: rdapp.gcc

Hi,

I figured I'd send this patch that I hacked together quickly a few
days back.  It's likely going to be controversial because we don't
have vector costs in place at all yet, and even with costs it's
probably debatable as the emitted sequence is longer :)
I'm willing to defer or ditch it altogether, but as it's small and
localized, why not at least discuss it quickly?

For immediates that are powers of two, instead of loading them into a
GPR and then broadcasting (incurring the scalar-to-vector latency), we
can synthesize them with a vmv.v.i and a vsll.vi.  Depending on actual
costs, we could also add more complicated synthesis patterns in the
future.
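
To make the selection rule concrete, here is a small standalone sketch of
the decision the patch makes for a broadcast constant.  It is not GCC code:
plan_splat, exact_log2_sketch and the splat_* names are hypothetical and
only mirror the condition in expand_const_vector (value in -16..15 uses
vmv.v.i; a power of two above 15 whose shift from 8 is at most 15 uses
vmv.v.i plus vsll.vi; everything else goes through a GPR and vmv.v.x).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for this sketch; none of them exist in GCC.  */
enum splat_kind { SPLAT_IMM, SPLAT_IMM_SHIFT, SPLAT_SCALAR };

struct splat_plan {
  enum splat_kind kind;
  int shift;	/* Shift amount for SPLAT_IMM_SHIFT, otherwise 0.  */
};

/* Return log2 (val) if val is a positive power of two, else -1.  */
static int
exact_log2_sketch (int64_t val)
{
  if (val <= 0 || (val & (val - 1)) != 0)
    return -1;
  int n = 0;
  while ((val >>= 1) != 0)
    n++;
  return n;
}

/* Pick an instruction sequence for broadcasting val, following the
   same thresholds as the patch: load 8 and shift left by log2 (val) - 3,
   as long as that shift amount does not exceed 15.  */
static struct splat_plan
plan_splat (int64_t val)
{
  struct splat_plan p = { SPLAT_SCALAR, 0 };
  if (val >= -16 && val <= 15)
    p.kind = SPLAT_IMM;			/* vmv.v.i  */
  else
    {
      int lg = exact_log2_sketch (val);
      if (lg >= 0 && lg - 3 <= 15)
	{
	  p.kind = SPLAT_IMM_SHIFT;	/* vmv.v.i 8; vsll.vi  */
	  p.shift = lg - 3;
	}
    }
  return p;				/* Default: li + vmv.v.x  */
}
```

With this sketch, plan_splat (16) selects the shift sequence with a shift
of 1, plan_splat (128) a shift of 4, and plan_splat (17) falls back to the
scalar broadcast.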

Regards
 Robin

gcc/ChangeLog:

	* config/riscv/riscv-selftests.cc (run_const_vector_selftests):
	Adjust expectation.
	* config/riscv/riscv-v.cc (expand_const_vector): Synthesize
	power-of-two constants.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vmv-imm-fixed-rv32.c: Adjust test
	expectation.
	* gcc.target/riscv/rvv/autovec/vmv-imm-fixed-rv64.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vmv-imm-rv32.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vmv-imm-rv64.c: Ditto.
---
 gcc/config/riscv/riscv-selftests.cc           |  9 +++++-
 gcc/config/riscv/riscv-v.cc                   | 31 +++++++++++++++++++
 .../riscv/rvv/autovec/vmv-imm-fixed-rv32.c    |  5 +--
 .../riscv/rvv/autovec/vmv-imm-fixed-rv64.c    |  5 +--
 .../riscv/rvv/autovec/vmv-imm-rv32.c          |  5 +--
 .../riscv/rvv/autovec/vmv-imm-rv64.c          |  5 +--
 6 files changed, 51 insertions(+), 9 deletions(-)

diff --git a/gcc/config/riscv/riscv-selftests.cc b/gcc/config/riscv/riscv-selftests.cc
index 1bf1a648fa1..21fa460bb1f 100644
--- a/gcc/config/riscv/riscv-selftests.cc
+++ b/gcc/config/riscv/riscv-selftests.cc
@@ -259,9 +259,16 @@ run_const_vector_selftests (void)
 	      rtx_insn *insn = get_last_insn ();
 	      rtx src = XEXP (SET_SRC (PATTERN (insn)), 1);
 	      /* 1. Should be vmv.v.i for in rang of -16 ~ 15.
-		 2. Should be vmv.v.x for exceed -16 ~ 15.  */
+		 2. For 16 (and appropriate higher powers of two)
+		    expect a shift because we emit a
+		    vmv.v.i v1, 8 and a
+		    vsll.vi v1, v1, 1.
+		 3. Should be vmv.v.x for everything else.  */
 	      if (IN_RANGE (val, -16, 15))
 		ASSERT_TRUE (rtx_equal_p (src, dup));
+	      else if (IN_RANGE (val, 16, 16))
+		ASSERT_TRUE (GET_CODE (src) == ASHIFT
+			     && INTVAL (XEXP (src, 1)) == 1);
 	      else
 		ASSERT_TRUE (
 		  rtx_equal_p (src,
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index b381970140d..b295a48bb9d 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -560,6 +560,7 @@ expand_const_vector (rtx target, rtx src)
   rtx elt;
   if (const_vec_duplicate_p (src, &elt))
     {
+      HOST_WIDE_INT val = CONST_INT_P (elt) ? INTVAL (elt) : 0;
       rtx tmp = register_operand (target, mode) ? target : gen_reg_rtx (mode);
       /* Element in range -16 ~ 15 integer or 0.0 floating-point,
 	 we use vmv.v.i instruction.  */
@@ -568,6 +569,36 @@ expand_const_vector (rtx target, rtx src)
 	  rtx ops[] = {tmp, src};
 	  emit_vlmax_insn (code_for_pred_mov (mode), RVV_UNOP, ops);
 	}
+      /* If we can reach the immediate by loading an immediate and shifting,
+	 assume this is cheaper than loading a scalar.
+	 A power-of-two value > 15 cannot be loaded with vmv.v.i but we can
+	 load 8 into a vector register and shift it.  */
+      else if (val > 15 && wi::popcount (val) == 1
+	       && exact_log2 (val) - 3 /* exact_log2 (8)  */
+	       <= 15)
+	{
+	  /* We could also allow shifting an immediate and adding
+	     another one if VAL is suitable.
+	     This would allow us to synthesize constants like
+	     143 = 128 + 15 via
+	     vmv.v.i v1, 8
+	     vsll.vi v1, v1, 4
+	     vadd.vi v1, v1, 15
+	     TODO: Try more sequences and actually compare costs.  */
+
+	  HOST_WIDE_INT sw = exact_log2 (val);
+	  rtx eight = gen_const_vec_duplicate (mode, GEN_INT (8));
+	  rtx imm = gen_reg_rtx (mode);
+
+	  /* Load '8' as broadcast immediate.  */
+	  rtx ops1[] = {imm, eight};
+	  emit_vlmax_insn (code_for_pred_mov (mode), RVV_UNOP, ops1);
+
+	  /* Shift it.  */
+	  rtx ops2[] = {tmp, imm, GEN_INT (sw - 3)};
+	  emit_vlmax_insn (code_for_pred_scalar (ASHIFT, mode),
+			   RVV_BINOP, ops2);
+	}
       else
 	{
 	  elt = force_reg (elt_mode, elt);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-fixed-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-fixed-rv32.c
index e8d017f7339..5aaf55935a0 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-fixed-rv32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-fixed-rv32.c
@@ -3,5 +3,6 @@
 
 #include "vmv-imm-template.h"
 
-/* { dg-final { scan-assembler-times {vmv.v.i} 32 } } */
-/* { dg-final { scan-assembler-times {vmv.v.x} 8 } } */
+/* { dg-final { scan-assembler-times {vmv.v.i} 34 } } */
+/* { dg-final { scan-assembler-times {vsll.vi} 2 } } */
+/* { dg-final { scan-assembler-times {vmv.v.x} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-fixed-rv64.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-fixed-rv64.c
index f85ad4117d3..0a7effde08a 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-fixed-rv64.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-fixed-rv64.c
@@ -3,5 +3,6 @@
 
 #include "vmv-imm-template.h"
 
-/* { dg-final { scan-assembler-times {vmv.v.i} 32 } } */
-/* { dg-final { scan-assembler-times {vmv.v.x} 8 } } */
+/* { dg-final { scan-assembler-times {vmv.v.i} 34 } } */
+/* { dg-final { scan-assembler-times {vsll.vi} 2 } } */
+/* { dg-final { scan-assembler-times {vmv.v.x} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-rv32.c
index 6843bc6018d..d5e7fa409e8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-rv32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-rv32.c
@@ -3,5 +3,6 @@
 
 #include "vmv-imm-template.h"
 
-/* { dg-final { scan-assembler-times {vmv.v.i} 32 } } */
-/* { dg-final { scan-assembler-times {vmv.v.x} 8 } } */
+/* { dg-final { scan-assembler-times {vmv.v.i} 34 } } */
+/* { dg-final { scan-assembler-times {vsll.vi} 2 } } */
+/* { dg-final { scan-assembler-times {vmv.v.x} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-rv64.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-rv64.c
index 39fb2a6cc7b..adb6a0b869e 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-rv64.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmv-imm-rv64.c
@@ -3,5 +3,6 @@
 
 #include "vmv-imm-template.h"
 
-/* { dg-final { scan-assembler-times {vmv.v.i} 32 } } */
-/* { dg-final { scan-assembler-times {vmv.v.x} 8 } } */
+/* { dg-final { scan-assembler-times {vmv.v.i} 34 } } */
+/* { dg-final { scan-assembler-times {vsll.vi} 2 } } */
+/* { dg-final { scan-assembler-times {vmv.v.x} 6 } } */
-- 
2.40.1

Thread overview: 7+ messages
2023-05-30 19:13 [PATCH] RISC-V: Synthesize power-of-two constants Robin Dapp
2023-05-30 20:18 ` Andrew Waterman
2023-05-30 22:01   ` 钟居哲
2023-05-30 22:09     ` Jeff Law
2023-05-30 22:13       ` 钟居哲
2023-05-30 22:14         ` Jeff Law
2023-05-30 22:17       ` Philipp Tomsich