public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Kito Cheng <kito.cheng@gmail.com>
To: "Maciej W. Rozycki" <macro@embecosm.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
	Andrew Waterman <andrew@sifive.com>
Subject: Re: [PATCH] RISC-V: Split unordered FP comparisons into individual RTL insns
Date: Fri, 24 Jun 2022 00:44:36 +0800	[thread overview]
Message-ID: <CA+yXCZBLavnLVv=AVYQiSnoa38u5i2OwHSEg1fxJa0kF2e8x6Q@mail.gmail.com> (raw)
In-Reply-To: <alpine.DEB.2.20.2206220143490.10833@tpp.orcam.me.uk>

[-- Attachment #1: Type: text/plain, Size: 3376 bytes --]

Hi Maciej:

Thanks for detail analysis and performance number report, I am concern
about this patch might let compiler schedule the fsflags/frflags with
other floating point instructions, and the major issue is we didn't
model fflags right in GCC as you mentioned in the first email.

So I think we should model this right before we split that, I guess
that would be a bunch of work:
1. Add fflags to the hard register list.
2. Add (clobber (reg fflags)) or (set (reg fflags) (fpop
(operands...))) to those floating point operations which might change
fflags
3. Rewrite riscv_frflags and riscv_fsflags pattern by right RTL
pattern: (set (reg) (reg fflags)) and (set (reg fflags) (reg)).
4. Then split *f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_default and
*f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_snan pattern.

However I am not sure about the code gen impact of 2, especially the
impact to the combine pass, not sure if you are interested to give a
try?

And, I did some hack for part of this approach (1+3+4) got following
result for "__builtin_isless (x, y) + __builtin_isless (x, z)":

fltlt:
       frflags a4      # 8     [c=4 l=4]  riscv_frflags
       flt.d   a5,fa0,fa1      # 14    [c=4 l=4]  *cstoredfdi4
       flt.d   a0,fa0,fa2      # 17    [c=4 l=4]  *cstoredfdi4
       fsflags a4      # 18    [c=4 l=4]  riscv_fsflags
       add     a0,a0,a5        # 30    [c=4 l=4]  adddi3/0
       ret             # 40    [c=0 l=4]  simple_return

Verbose version:
fltlt:
#(insn 8 5 9 (set (reg:SI 14 a4 [88])
#        (reg:SI 66 fflags)) "x.c":5:10 258 {riscv_frflags}
#     (expr_list:REG_DEAD (reg:SI 66 fflags)
#        (nil)))
       frflags a4      # 8     [c=4 l=4]  riscv_frflags
#(insn 14 11 15 (parallel [
#            (set (reg:DI 15 a5 [90])
#                (lt:DI (reg/v:DF 42 fa0 [orig:81 x ] [81])
#                    (reg:DF 43 fa1 [101])))
#            (clobber:SI (reg:SI 66 fflags))
#        ]) "x.c":5:10 197 {*cstoredfdi4}
#     (expr_list:REG_DEAD (reg:DF 43 fa1 [101])
#        (expr_list:REG_UNUSED (reg:SI 66 fflags)
#            (nil))))
       flt.d   a5,fa0,fa1      # 14    [c=4 l=4]  *cstoredfdi4
#(insn 17 15 18 (parallel [
#            (set (reg:DI 10 a0 [94])
#                (lt:DI (reg/v:DF 42 fa0 [orig:81 x ] [81])
#                    (reg:DF 44 fa2 [102])))
#            (clobber:SI (reg:SI 66 fflags))
#        ]) "x.c":5:36 197 {*cstoredfdi4}
#     (expr_list:REG_DEAD (reg:DF 44 fa2 [102])
#        (expr_list:REG_DEAD (reg/v:DF 42 fa0 [orig:81 x ] [81])
#            (expr_list:REG_UNUSED (reg:SI 66 fflags)
#                (nil)))))
       flt.d   a0,fa0,fa2      # 17    [c=4 l=4]  *cstoredfdi4
#(insn 18 17 19 (set (reg:SI 66 fflags)
#        (reg:SI 14 a4 [88])) "x.c":5:36 259 {riscv_fsflags}
#     (expr_list:REG_DEAD (reg:SI 14 a4 [88])
#        (nil)))
       fsflags a4      # 18    [c=4 l=4]  riscv_fsflags
#(insn 30 25 31 (set (reg/i:DI 10 a0)
#        (plus:DI (reg:DI 10 a0 [94])
#            (reg:DI 15 a5 [90]))) "x.c":6:1 4 {adddi3}
#     (expr_list:REG_DEAD (reg:DI 15 a5 [90])
#        (nil)))
       add     a0,a0,a5        # 30    [c=4 l=4]  adddi3/0
#(jump_insn 40 39 41 (simple_return) "x.c":6:1 244 {simple_return}
#     (nil)
# -> simple_return)
       ret             # 40    [c=0 l=4]  simple_return
----

But this hack add an extra use of fflags to prevent FFLAGS getting
CSEed, patch attached.

[-- Attachment #2: 0001-model-fflags.patch --]
[-- Type: text/x-patch, Size: 6664 bytes --]

From 1116422bb5a69d8361f5c5bc334a122fecbaa306 Mon Sep 17 00:00:00 2001
From: Kito Cheng <kito.cheng@sifive.com>
Date: Fri, 24 Jun 2022 00:41:55 +0800
Subject: [PATCH] model fflags

---
 gcc/config/riscv/riscv.h  | 12 +++++------
 gcc/config/riscv/riscv.md | 42 ++++++++++++++++++++-------------------
 gcc/function.cc           |  1 +
 3 files changed, 29 insertions(+), 26 deletions(-)

diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 6f7f4d3fbdc..c4e6efe9885 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -291,7 +291,7 @@ ASM_MISA_SPEC
 	- ARG_POINTER_REGNUM
 	- FRAME_POINTER_REGNUM */
 
-#define FIRST_PSEUDO_REGISTER 66
+#define FIRST_PSEUDO_REGISTER 67
 
 /* x0, sp, gp, and tp are fixed.  */
 
@@ -303,7 +303,7 @@ ASM_MISA_SPEC
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,			\
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,			\
   /* Others.  */							\
-  1, 1									\
+  1, 1, 1								\
 }
 
 /* a0-a7, t0-t6, fa0-fa7, and ft0-ft11 are volatile across calls.
@@ -317,7 +317,7 @@ ASM_MISA_SPEC
   1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1,			\
   1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,			\
   /* Others.  */							\
-  1, 1									\
+  1, 1, 1,								\
 }
 
 /* Select a register mode required for caller save of hard regno REGNO.
@@ -472,7 +472,7 @@ enum reg_class
   { 0xffffffff, 0x00000000, 0x00000000 },	/* GR_REGS */		\
   { 0x00000000, 0xffffffff, 0x00000000 },	/* FP_REGS */		\
   { 0x00000000, 0x00000000, 0x00000003 },	/* FRAME_REGS */	\
-  { 0xffffffff, 0xffffffff, 0x00000003 }	/* ALL_REGS */		\
+  { 0xffffffff, 0xffffffff, 0x00000007 }	/* ALL_REGS */		\
 }
 
 /* A C expression whose value is a register class containing hard
@@ -514,7 +514,7 @@ enum reg_class
   40, 41, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59,			\
   /* None of the remaining classes have defined call-saved		\
      registers.  */							\
-  64, 65								\
+  64, 65, 66								\
 }
 
 /* True if VALUE is a signed 12-bit number.  */
@@ -777,7 +777,7 @@ typedef struct {
   "fs0", "fs1", "fa0", "fa1", "fa2", "fa3", "fa4", "fa5",	\
   "fa6", "fa7", "fs2", "fs3", "fs4", "fs5", "fs6", "fs7",	\
   "fs8", "fs9", "fs10","fs11","ft8", "ft9", "ft10","ft11",	\
-  "arg", "frame", }
+  "arg", "frame", "fflags" }
 
 #define ADDITIONAL_REGISTER_NAMES					\
 {									\
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 4f59bb99cf5..9039cc73e4d 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -47,6 +47,7 @@ (define_c_enum "unspec" [
 
   ;; Stack tie
   UNSPEC_TIE
+  UNSPEC_FSNVSNAN
 ])
 
 (define_c_enum "unspecv" [
@@ -57,7 +58,6 @@ (define_c_enum "unspecv" [
   ;; Floating-point unspecs.
   UNSPECV_FRFLAGS
   UNSPECV_FSFLAGS
-  UNSPECV_FSNVSNAN
 
   ;; Interrupt handler instructions.
   UNSPECV_MRET
@@ -99,6 +99,7 @@ (define_constants
    (S9_REGNUM			25)
    (S10_REGNUM			26)
    (S11_REGNUM			27)
+   (FFLAGS_REGNUM		66)
 
    (NORMAL_RETURN		0)
    (SIBCALL_RETURN		1)
@@ -2300,7 +2301,8 @@ (define_expand "cstore<mode>4"
   [(set (match_operand:SI 0 "register_operand")
 	(match_operator:SI 1 "order_operator"
 	    [(match_operand:GPR 2 "register_operand")
-	     (match_operand:GPR 3 "nonmemory_operand")]))]
+	     (match_operand:GPR 3 "nonmemory_operand")]))
+   (clobber (reg:SI FFLAGS_REGNUM))]
   ""
 {
   riscv_expand_int_scc (operands[0], GET_CODE (operands[1]), operands[2],
@@ -2312,7 +2314,8 @@ (define_expand "cstore<mode>4"
   [(set (match_operand:SI 0 "register_operand")
 	(match_operator:SI 1 "fp_scc_comparison"
 	     [(match_operand:ANYF 2 "register_operand")
-	      (match_operand:ANYF 3 "register_operand")]))]
+	      (match_operand:ANYF 3 "register_operand")]))
+   (clobber (reg:SI FFLAGS_REGNUM))]
   "TARGET_HARD_FLOAT"
 {
   riscv_expand_float_scc (operands[0], GET_CODE (operands[1]), operands[2],
@@ -2324,7 +2327,8 @@ (define_insn "*cstore<ANYF:mode><X:mode>4"
    [(set (match_operand:X         0 "register_operand" "=r")
 	 (match_operator:X 1 "fp_native_comparison"
 	     [(match_operand:ANYF 2 "register_operand" " f")
-	      (match_operand:ANYF 3 "register_operand" " f")]))]
+	      (match_operand:ANYF 3 "register_operand" " f")]))
+   (clobber (reg:SI FFLAGS_REGNUM))]
   "TARGET_HARD_FLOAT"
   "f%C1.<fmt>\t%0,%2,%3"
   [(set_attr "type" "fcmp")
@@ -2342,18 +2346,15 @@ (define_expand "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4"
   rtx op2 = operands[2];
   rtx tmp = gen_reg_rtx (SImode);
   rtx cmp = gen_rtx_<QUIET_PATTERN> (<X:MODE>mode, op1, op2);
-  rtx frflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, const0_rtx),
-					 UNSPECV_FRFLAGS);
-  rtx fsflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, tmp),
-					 UNSPECV_FSFLAGS);
-
-  emit_insn (gen_rtx_SET (tmp, frflags));
-  emit_insn (gen_rtx_SET (op0, cmp));
-  emit_insn (fsflags);
+
+  emit_insn (gen_riscv_frflags (tmp));
+rtx set = gen_rtx_SET (op0, cmp);
+rtx clobber =  gen_rtx_CLOBBER(SImode, gen_rtx_REG(SImode, FFLAGS_REGNUM));
+  rtvec vec = gen_rtvec (2, set, clobber);
+  emit_insn (gen_rtx_PARALLEL(VOIDmode, vec));
+  emit_insn (gen_riscv_fsflags(tmp));
   if (HONOR_SNANS (<ANYF:MODE>mode))
-    emit_insn (gen_rtx_UNSPEC_VOLATILE (<ANYF:MODE>mode,
-					gen_rtvec (2, op1, op2),
-					UNSPECV_FSNVSNAN));
+    emit_insn (gen_riscv_fsnvsnan<ANYF:mode>2 (op1, op2));
   DONE;
 })
 
@@ -2754,19 +2755,20 @@ (define_insn "gpr_restore_return"
 
 (define_insn "riscv_frflags"
   [(set (match_operand:SI 0 "register_operand" "=r")
-	(unspec_volatile [(const_int 0)] UNSPECV_FRFLAGS))]
+	(reg:SI FFLAGS_REGNUM))]
   "TARGET_HARD_FLOAT"
   "frflags\t%0")
 
 (define_insn "riscv_fsflags"
-  [(unspec_volatile [(match_operand:SI 0 "csr_operand" "rK")] UNSPECV_FSFLAGS)]
+  [(set (reg:SI FFLAGS_REGNUM) (match_operand:SI 0 "csr_operand" "rK"))]
   "TARGET_HARD_FLOAT"
   "fsflags\t%0")
 
-(define_insn "*riscv_fsnvsnan<mode>2"
-  [(unspec_volatile [(match_operand:ANYF 0 "register_operand")
+(define_insn "riscv_fsnvsnan<mode>2"
+  [(set (reg:SI FFLAGS_REGNUM)
+        (unspec [(match_operand:ANYF 0 "register_operand")
 		     (match_operand:ANYF 1 "register_operand")]
-		    UNSPECV_FSNVSNAN)]
+		    UNSPEC_FSNVSNAN))]
   "TARGET_HARD_FLOAT"
   "feq.<fmt>\tzero,%0,%1"
   [(set_attr "type" "fcmp")
diff --git a/gcc/function.cc b/gcc/function.cc
index ad0096a43ef..cb0d743558f 100644
--- a/gcc/function.cc
+++ b/gcc/function.cc
@@ -5590,6 +5590,7 @@ expand_function_end (void)
      sh mach_dep_reorg) that still try and compute their own lifetime info
      instead of using the general framework.  */
   use_return_register ();
+  emit_insn (gen_rtx_USE(SImode, gen_rtx_REG(SImode,66)));
 }
 
 rtx
-- 
2.34.0


  reply	other threads:[~2022-06-23 16:44 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-09 13:44 Maciej W. Rozycki
2022-06-23 13:39 ` Maciej W. Rozycki
2022-06-23 16:44   ` Kito Cheng [this message]
2022-07-04 14:12     ` [PATCH v2] " Maciej W. Rozycki
2022-07-18 15:42       ` [PING][PATCH " Maciej W. Rozycki
2022-07-27 10:40         ` Kito Cheng
2022-07-28 13:20           ` Maciej W. Rozycki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+yXCZBLavnLVv=AVYQiSnoa38u5i2OwHSEg1fxJa0kF2e8x6Q@mail.gmail.com' \
    --to=kito.cheng@gmail.com \
    --cc=andrew@sifive.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=macro@embecosm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).