public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]
@ 2024-05-20 23:32 Vineet Gupta
  2024-05-20 23:32 ` [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion Vineet Gupta
  2024-05-21 17:30 ` [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733] Jeff Law
  0 siblings, 2 replies; 6+ messages in thread
From: Vineet Gupta @ 2024-05-20 23:32 UTC (permalink / raw)
  To: gcc-patches
  Cc: Jeff Law, kito.cheng, Palmer Dabbelt, gnu-toolchain, Vineet Gupta

Changes since v2:
  - Broke out the hunk corresponding to alloca in epilogue expansion in
    a seperate patch.
---

If the constant used for stack offset can be expressed as sum of two S12
values, the constant need not be materialized (in a reg) and instead the
two S12 bits can be added to instructions involved with frame pointer.
This avoids burning a register and more importantly can often get down
to be 2 insn vs. 3.

The prev patches to generally avoid LUI based const materialization didn't
fix this PR and need this directed fix in funcion prologue/epilogue
expansion.

This fix doesn't move the neddle for SPEC, at all, but it is still a
win considering gcc generates one insn fewer than llvm for the test ;-)

   gcc-13.1 release   |      gcc 230823     |                   |
                      |    g6619b3d4c15c    |   This patch      |  clang/llvm
---------------------------------------------------------------------------------
li      t0,-4096     | li    t0,-4096      | addi  sp,sp,-2048 | addi sp,sp,-2048
addi    t0,t0,2016   | addi  t0,t0,2032    | add   sp,sp,-16   | addi sp,sp,-32
li      a4,4096      | add   sp,sp,t0      | add   a5,sp,a0    | add  a1,sp,16
add     sp,sp,t0     | addi  a5,sp,-2032   | sb    zero,0(a5)  | add  a0,a0,a1
li      a5,-4096     | add   a0,a5,a0      | addi  sp,sp,2032  | sb   zero,0(a0)
addi    a4,a4,-2032  | li    t0, 4096      | addi  sp,sp,32    | addi sp,sp,2032
add     a4,a4,a5     | sb    zero,2032(a0) | ret               | addi sp,sp,48
addi    a5,sp,16     | addi  t0,t0,-2032   |                   | ret
add     a5,a4,a5     | add   sp,sp,t0      |
add     a0,a5,a0     | ret                 |
li      t0,4096      |
sd      a5,8(sp)     |
sb      zero,2032(a0)|
addi    t0,t0,-2016  |
add     sp,sp,t0     |
ret                  |

gcc/ChangeLog:
	PR target/105733
	* config/riscv/riscv.h: New macros for with aligned offsets.
	* config/riscv/riscv.cc (riscv_split_sum_of_two_s12): New
	function to split a sum of two s12 values into constituents.
	(riscv_expand_prologue): Handle offset being sum of two S12.
	(riscv_expand_epilogue): Ditto.
	* config/riscv/riscv-protos.h (riscv_split_sum_of_two_s12): New.

gcc/testsuite/ChangeLog:
	* gcc.target/riscv/pr105733.c: New Test.
	* gcc.target/riscv/rvv/autovec/vls/spill-1.c: Adjust to not
	expect LUI 4096.
	* gcc.target/riscv/rvv/autovec/vls/spill-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/spill-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/spill-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/spill-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/spill-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/spill-7.c: Ditto.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
---
 gcc/config/riscv/riscv-protos.h               |  2 +
 gcc/config/riscv/riscv.cc                     | 54 +++++++++++++++++--
 gcc/config/riscv/riscv.h                      |  7 +++
 gcc/testsuite/gcc.target/riscv/pr105733.c     | 15 ++++++
 .../riscv/rvv/autovec/vls/spill-1.c           |  4 +-
 .../riscv/rvv/autovec/vls/spill-2.c           |  4 +-
 .../riscv/rvv/autovec/vls/spill-3.c           |  4 +-
 .../riscv/rvv/autovec/vls/spill-4.c           |  4 +-
 .../riscv/rvv/autovec/vls/spill-5.c           |  4 +-
 .../riscv/rvv/autovec/vls/spill-6.c           |  4 +-
 .../riscv/rvv/autovec/vls/spill-7.c           |  4 +-
 11 files changed, 89 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr105733.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index c64aae18deb9..0704968561bb 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -167,6 +167,8 @@ extern void riscv_subword_address (rtx, rtx *, rtx *, rtx *, rtx *);
 extern void riscv_lshift_subword (machine_mode, rtx, rtx, rtx *);
 extern enum memmodel riscv_union_memmodels (enum memmodel, enum memmodel);
 extern bool riscv_reg_frame_related (rtx);
+extern void riscv_split_sum_of_two_s12 (HOST_WIDE_INT, HOST_WIDE_INT *,
+					HOST_WIDE_INT *);
 
 /* Routines implemented in riscv-c.cc.  */
 void riscv_cpu_cpp_builtins (cpp_reader *);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index d0c22058b8c3..2ecbcf1d0af8 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4075,6 +4075,32 @@ riscv_split_doubleword_move (rtx dest, rtx src)
        riscv_emit_move (riscv_subword (dest, true), riscv_subword (src, true));
      }
 }
+
+/* Constant VAL is known to be sum of two S12 constants.  Break it into
+   comprising BASE and OFF.
+   Numerically S12 is -2048 to 2047, however it uses the more conservative
+   range -2048 to 2032 as offsets pertain to stack related registers.  */
+
+void
+riscv_split_sum_of_two_s12 (HOST_WIDE_INT val, HOST_WIDE_INT *base,
+			    HOST_WIDE_INT *off)
+{
+  if (SUM_OF_TWO_S12_N (val))
+    {
+      *base = -2048;
+      *off = val - (-2048);
+    }
+  else if (SUM_OF_TWO_S12_P_ALGN (val))
+    {
+      *base = 2032;
+      *off = val - 2032;
+    }
+  else
+    {
+      gcc_unreachable ();
+    }
+}
+
 \f
 /* Return the appropriate instructions to move SRC into DEST.  Assume
    that SRC is operand 1 and DEST is operand 0.  */
@@ -7864,6 +7890,17 @@ riscv_expand_prologue (void)
 				GEN_INT (-constant_frame));
 	  RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
 	}
+      else if (SUM_OF_TWO_S12_ALGN (-constant_frame))
+	{
+	  HOST_WIDE_INT one, two;
+	  riscv_split_sum_of_two_s12 (-constant_frame, &one, &two);
+	  insn = gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx,
+				GEN_INT (one));
+	  RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
+	  insn = gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx,
+				GEN_INT (two));
+	  RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
+	}
       else
 	{
 	  riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), GEN_INT (-constant_frame));
@@ -8160,10 +8197,21 @@ riscv_expand_epilogue (int style)
 
       /* Get an rtx for STEP1 that we can add to BASE.
 	 Skip if adjust equal to zero.  */
-      if (step1.to_constant () != 0)
+      HOST_WIDE_INT step1_value = step1.to_constant ();
+      if (step1_value != 0)
 	{
-	  rtx adjust = GEN_INT (step1.to_constant ());
-	  if (!SMALL_OPERAND (step1.to_constant ()))
+	  rtx adjust = GEN_INT (step1_value);
+	  if (SUM_OF_TWO_S12_ALGN (step1_value))
+	    {
+	      HOST_WIDE_INT one, two;
+	      riscv_split_sum_of_two_s12 (step1_value, &one, &two);
+	      insn = emit_insn (gen_add3_insn (stack_pointer_rtx,
+						stack_pointer_rtx,
+						GEN_INT (one)));
+	      RTX_FRAME_RELATED_P (insn) = 1;
+	      adjust = GEN_INT (two);
+	    }
+	  else if (!SMALL_OPERAND (step1_value))
 	    {
 	      riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), adjust);
 	      adjust = RISCV_PROLOGUE_TEMP (Pmode);
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 0d27c0d378df..d6b14c4d6205 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -641,6 +641,13 @@ enum reg_class
 #define SUM_OF_TWO_S12(VALUE)						\
   (SUM_OF_TWO_S12_N (VALUE) || SUM_OF_TWO_S12_P (VALUE))
 
+/* Variant with first value 8 byte aligned if involving stack regs.  */
+#define SUM_OF_TWO_S12_P_ALGN(VALUE)				\
+  (((VALUE) >= (2032 + 1)) && ((VALUE) <= (2032 * 2)))
+
+#define SUM_OF_TWO_S12_ALGN(VALUE)				\
+  (SUM_OF_TWO_S12_N (VALUE) || SUM_OF_TWO_S12_P_ALGN (VALUE))
+
 /* If this is a single bit mask, then we can load it with bseti.  Special
    handling of SImode 0x80000000 on RV64 is done in riscv_build_integer_1. */
 #define SINGLE_BIT_MASK_OPERAND(VALUE)					\
diff --git a/gcc/testsuite/gcc.target/riscv/pr105733.c b/gcc/testsuite/gcc.target/riscv/pr105733.c
new file mode 100644
index 000000000000..6156c36dc7ef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr105733.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options { -march=rv64gcv -mabi=lp64d } } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */
+
+#define BUF_SIZE 2064
+
+void
+foo(unsigned long i)
+{
+    volatile char buf[BUF_SIZE];
+
+    buf[i] = 0;
+}
+
+/* { dg-final { scan-assembler-not {li\t[a-x0-9]+,4096} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-1.c
index b64c73f34f13..6afcf1db593b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-1.c
@@ -129,5 +129,5 @@ spill_12 (int8_t *in, int8_t *out)
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-2.c
index 8fcdca705384..544e8628a27b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-2.c
@@ -120,5 +120,5 @@ spill_11 (int16_t *in, int16_t *out)
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-3.c
index ca296ce02d66..4bfeb07e9aca 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-3.c
@@ -111,5 +111,5 @@ spill_10 (int32_t *in, int32_t *out)
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-4.c
index ef61d9a2c0c3..1faf31ffd8e0 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-4.c
@@ -102,5 +102,5 @@ spill_9 (int64_t *in, int64_t *out)
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-5.c
index 150135a91103..0c8dccc518e3 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-5.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-5.c
@@ -120,5 +120,5 @@ spill_11 (_Float16 *in, _Float16 *out)
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-6.c
index c5d2d0194348..8bf53b84d1cd 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-6.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-6.c
@@ -111,5 +111,5 @@ spill_10 (float *in, float *out)
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-7.c
index 70ca683908db..e3980a295406 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-7.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/spill-7.c
@@ -102,5 +102,5 @@ spill_9 (int64_t *in, int64_t *out)
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-256} 1 } } */
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-512} 1 } } */
 /* { dg-final { scan-assembler-times {addi\tsp,sp,-1024} 1 } } */
-/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 1 } } */
-/* { dg-final { scan-assembler-times {li\t[a-x0-9]+,-4096\s+add\tsp,sp,[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,-2048} 3 } } */
+/* { dg-final { scan-assembler-times {addi\tsp,sp,2032} 1 } } */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion
  2024-05-20 23:32 [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733] Vineet Gupta
@ 2024-05-20 23:32 ` Vineet Gupta
  2024-05-21  3:54   ` Jeff Law
  2024-05-21 17:38   ` Jeff Law
  2024-05-21 17:30 ` [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733] Jeff Law
  1 sibling, 2 replies; 6+ messages in thread
From: Vineet Gupta @ 2024-05-20 23:32 UTC (permalink / raw)
  To: gcc-patches
  Cc: Jeff Law, kito.cheng, Palmer Dabbelt, gnu-toolchain, Vineet Gupta

This is testsuite clean however there's a dwarf quirk which I want to
run by the experts. The test that was tripping CI has following
fragment:

	Before patch		|	After Patch
------------------------------------------------------
   li	t0,-4096		|  addi	sp,s0,-2048
   addi	t0,t0,560		|  .cfi_def_cfa 2, 2048      <- #1
   add	sp,s0,t0		|  addi	sp,sp,-1488
   .cfi_def_cfa 2, 3536		|  .cfi_def_cfa_offset 3536  <- #2
   addi	sp,sp,1504		|  addi	sp,sp,1504
   .cfi_def_cfa_offset 2032	|  .cfi_def_cfa_offset 2032  <- #3

The dwarf insn #1 and #3 seem ok, however #2 seems dubious to me.

---

This is continuing on the prev patch in function epilogue expansion.

gcc/ChangeLog:
	* config/riscv/riscv.cc (riscv_expand_epilogue): Handle offset
	being sum of two S12.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
---
 gcc/config/riscv/riscv.cc | 33 ++++++++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 2ecbcf1d0af8..85df5b7ab498 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8111,7 +8111,10 @@ riscv_expand_epilogue (int style)
       need_barrier_p = false;
 
       poly_int64 adjust_offset = -frame->hard_frame_pointer_offset;
+      rtx dwarf_adj = gen_int_mode (adjust_offset, Pmode);
       rtx adjust = NULL_RTX;
+      bool sum_of_two_s12 = false;
+      HOST_WIDE_INT one, two;
 
       if (!adjust_offset.is_constant ())
 	{
@@ -8123,14 +8126,23 @@ riscv_expand_epilogue (int style)
 	}
       else
 	{
-	  if (!SMALL_OPERAND (adjust_offset.to_constant ()))
+	  HOST_WIDE_INT adj_off_value = adjust_offset.to_constant ();
+	  if (SMALL_OPERAND (adj_off_value))
+	    {
+	      adjust = GEN_INT (adj_off_value);
+	    }
+	  else if (SUM_OF_TWO_S12_ALGN (adj_off_value))
+	    {
+	      riscv_split_sum_of_two_s12 (adj_off_value, &one, &two);
+	      dwarf_adj = adjust = GEN_INT (one);
+	      sum_of_two_s12 = true;
+	    }
+	  else
 	    {
 	      riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode),
-			       GEN_INT (adjust_offset.to_constant ()));
+			       GEN_INT (adj_off_value));
 	      adjust = RISCV_PROLOGUE_TEMP (Pmode);
 	    }
-	  else
-	    adjust = GEN_INT (adjust_offset.to_constant ());
 	}
 
       insn = emit_insn (
@@ -8138,14 +8150,21 @@ riscv_expand_epilogue (int style)
 			      adjust));
 
       rtx dwarf = NULL_RTX;
-      rtx cfa_adjust_value = gen_rtx_PLUS (
-			       Pmode, hard_frame_pointer_rtx,
-			       gen_int_mode (-frame->hard_frame_pointer_offset, Pmode));
+      rtx cfa_adjust_value = gen_rtx_PLUS (Pmode, hard_frame_pointer_rtx,
+					   dwarf_adj);
       rtx cfa_adjust_rtx = gen_rtx_SET (stack_pointer_rtx, cfa_adjust_value);
       dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, cfa_adjust_rtx, dwarf);
+
       RTX_FRAME_RELATED_P (insn) = 1;
 
       REG_NOTES (insn) = dwarf;
+
+      if (sum_of_two_s12)
+	{
+	  insn = emit_insn (gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx,
+			    GEN_INT (two)));
+	  RTX_FRAME_RELATED_P (insn) = 1;
+	}
     }
 
   if (use_restore_libcall || use_multi_pop)
-- 
2.34.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion
  2024-05-20 23:32 ` [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion Vineet Gupta
@ 2024-05-21  3:54   ` Jeff Law
  2024-05-21 15:15     ` Vineet Gupta
  2024-05-21 17:38   ` Jeff Law
  1 sibling, 1 reply; 6+ messages in thread
From: Jeff Law @ 2024-05-21  3:54 UTC (permalink / raw)
  To: Vineet Gupta, gcc-patches; +Cc: kito.cheng, Palmer Dabbelt, gnu-toolchain



On 5/20/24 5:32 PM, Vineet Gupta wrote:
> This is testsuite clean however there's a dwarf quirk which I want to
> run by the experts. The test that was tripping CI has following
> fragment:
> 
> 	Before patch		|	After Patch
> ------------------------------------------------------
>     li	t0,-4096		|  addi	sp,s0,-2048
>     addi	t0,t0,560		|  .cfi_def_cfa 2, 2048      <- #1
>     add	sp,s0,t0		|  addi	sp,sp,-1488
>     .cfi_def_cfa 2, 3536		|  .cfi_def_cfa_offset 3536  <- #2
>     addi	sp,sp,1504		|  addi	sp,sp,1504
>     .cfi_def_cfa_offset 2032	|  .cfi_def_cfa_offset 2032  <- #3
> 
> The dwarf insn #1 and #3 seem ok, however #2 seems dubious to me.
What about it seems dubious?  We need a CFA adjustment on each insn that 
modifies the stack pointer so that we can unwind at any arbitrary point.

The first adjustment says the prior frame is at sp + 2048.  Then it's at 
sp + 3536.  Then after the final insn the prior frame is at sp+2032.

Jeff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion
  2024-05-21  3:54   ` Jeff Law
@ 2024-05-21 15:15     ` Vineet Gupta
  0 siblings, 0 replies; 6+ messages in thread
From: Vineet Gupta @ 2024-05-21 15:15 UTC (permalink / raw)
  To: Jeff Law, gcc-patches; +Cc: kito.cheng, Palmer Dabbelt, gnu-toolchain

On 5/20/24 20:54, Jeff Law wrote:
> On 5/20/24 5:32 PM, Vineet Gupta wrote:
>> This is testsuite clean however there's a dwarf quirk which I want to
>> run by the experts. The test that was tripping CI has following
>> fragment:
>>
>> 	Before patch		|	After Patch
>> ------------------------------------------------------
>>     li	t0,-4096		|  addi	sp,s0,-2048
>>     addi	t0,t0,560		|  .cfi_def_cfa 2, 2048      <- #1
>>     add	sp,s0,t0		|  addi	sp,sp,-1488
>>     .cfi_def_cfa 2, 3536		|  .cfi_def_cfa_offset 3536  <- #2
>>     addi	sp,sp,1504		|  addi	sp,sp,1504
>>     .cfi_def_cfa_offset 2032	|  .cfi_def_cfa_offset 2032  <- #3
>>
>> The dwarf insn #1 and #3 seem ok, however #2 seems dubious to me.
> What about it seems dubious?  

My discomfort at claiming I understand dwarf, despite debugging/fixing
the ARC Linux port's in kernel dwarf unwinder :-)

> We need a CFA adjustment on each insn that 
> modifies the stack pointer so that we can unwind at any arbitrary point.

Of course.

> The first adjustment says the prior frame is at sp + 2048.  Then it's at 
> sp + 3536.  Then after the final insn the prior frame is at sp+2032.

Yeah I got confused with second one since once it gets anchored to SP
from S0, but you are right it is farther from base CFA now.

-Vineet

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]
  2024-05-20 23:32 [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733] Vineet Gupta
  2024-05-20 23:32 ` [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion Vineet Gupta
@ 2024-05-21 17:30 ` Jeff Law
  1 sibling, 0 replies; 6+ messages in thread
From: Jeff Law @ 2024-05-21 17:30 UTC (permalink / raw)
  To: Vineet Gupta, gcc-patches; +Cc: kito.cheng, Palmer Dabbelt, gnu-toolchain



On 5/20/24 5:32 PM, Vineet Gupta wrote:
> Changes since v2:
>    - Broke out the hunk corresponding to alloca in epilogue expansion in
>      a seperate patch.
> ---
> 
> If the constant used for stack offset can be expressed as sum of two S12
> values, the constant need not be materialized (in a reg) and instead the
> two S12 bits can be added to instructions involved with frame pointer.
> This avoids burning a register and more importantly can often get down
> to be 2 insn vs. 3.
> 
> The prev patches to generally avoid LUI based const materialization didn't
> fix this PR and need this directed fix in funcion prologue/epilogue
> expansion.
> 
> This fix doesn't move the neddle for SPEC, at all, but it is still a
> win considering gcc generates one insn fewer than llvm for the test ;-)
> 
>     gcc-13.1 release   |      gcc 230823     |                   |
>                        |    g6619b3d4c15c    |   This patch      |  clang/llvm
> ---------------------------------------------------------------------------------
> li      t0,-4096     | li    t0,-4096      | addi  sp,sp,-2048 | addi sp,sp,-2048
> addi    t0,t0,2016   | addi  t0,t0,2032    | add   sp,sp,-16   | addi sp,sp,-32
> li      a4,4096      | add   sp,sp,t0      | add   a5,sp,a0    | add  a1,sp,16
> add     sp,sp,t0     | addi  a5,sp,-2032   | sb    zero,0(a5)  | add  a0,a0,a1
> li      a5,-4096     | add   a0,a5,a0      | addi  sp,sp,2032  | sb   zero,0(a0)
> addi    a4,a4,-2032  | li    t0, 4096      | addi  sp,sp,32    | addi sp,sp,2032
> add     a4,a4,a5     | sb    zero,2032(a0) | ret               | addi sp,sp,48
> addi    a5,sp,16     | addi  t0,t0,-2032   |                   | ret
> add     a5,a4,a5     | add   sp,sp,t0      |
> add     a0,a5,a0     | ret                 |
> li      t0,4096      |
> sd      a5,8(sp)     |
> sb      zero,2032(a0)|
> addi    t0,t0,-2016  |
> add     sp,sp,t0     |
> ret                  |
> 
> gcc/ChangeLog:
> 	PR target/105733
> 	* config/riscv/riscv.h: New macros for with aligned offsets.
> 	* config/riscv/riscv.cc (riscv_split_sum_of_two_s12): New
> 	function to split a sum of two s12 values into constituents.
> 	(riscv_expand_prologue): Handle offset being sum of two S12.
> 	(riscv_expand_epilogue): Ditto.
> 	* config/riscv/riscv-protos.h (riscv_split_sum_of_two_s12): New.
> 
> gcc/testsuite/ChangeLog:
> 	* gcc.target/riscv/pr105733.c: New Test.
> 	* gcc.target/riscv/rvv/autovec/vls/spill-1.c: Adjust to not
> 	expect LUI 4096.
> 	* gcc.target/riscv/rvv/autovec/vls/spill-2.c: Ditto.
> 	* gcc.target/riscv/rvv/autovec/vls/spill-3.c: Ditto.
> 	* gcc.target/riscv/rvv/autovec/vls/spill-4.c: Ditto.
> 	* gcc.target/riscv/rvv/autovec/vls/spill-5.c: Ditto.
> 	* gcc.target/riscv/rvv/autovec/vls/spill-6.c: Ditto.
> 	* gcc.target/riscv/rvv/autovec/vls/spill-7.c: Ditto.
OK
Jeff


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion
  2024-05-20 23:32 ` [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion Vineet Gupta
  2024-05-21  3:54   ` Jeff Law
@ 2024-05-21 17:38   ` Jeff Law
  1 sibling, 0 replies; 6+ messages in thread
From: Jeff Law @ 2024-05-21 17:38 UTC (permalink / raw)
  To: Vineet Gupta, gcc-patches; +Cc: kito.cheng, Palmer Dabbelt, gnu-toolchain



On 5/20/24 5:32 PM, Vineet Gupta wrote:
> This is testsuite clean however there's a dwarf quirk which I want to
> run by the experts. The test that was tripping CI has following
> fragment:
> 
> 	Before patch		|	After Patch
> ------------------------------------------------------
>     li	t0,-4096		|  addi	sp,s0,-2048
>     addi	t0,t0,560		|  .cfi_def_cfa 2, 2048      <- #1
>     add	sp,s0,t0		|  addi	sp,sp,-1488
>     .cfi_def_cfa 2, 3536		|  .cfi_def_cfa_offset 3536  <- #2
>     addi	sp,sp,1504		|  addi	sp,sp,1504
>     .cfi_def_cfa_offset 2032	|  .cfi_def_cfa_offset 2032  <- #3
> 
> The dwarf insn #1 and #3 seem ok, however #2 seems dubious to me.
> 
> ---
> 
> This is continuing on the prev patch in function epilogue expansion.
> 
> gcc/ChangeLog:
> 	* config/riscv/riscv.cc (riscv_expand_epilogue): Handle offset
> 	being sum of two S12.
OK.
jeff


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-05-21 17:38 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-20 23:32 [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733] Vineet Gupta
2024-05-20 23:32 ` [PATCH v3 2/2] RISC-V: avoid LUI based const mat in alloca epilogue expansion Vineet Gupta
2024-05-21  3:54   ` Jeff Law
2024-05-21 15:15     ` Vineet Gupta
2024-05-21 17:38   ` Jeff Law
2024-05-21 17:30 ` [PATCH v3 1/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733] Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).