[PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2.

public inbox for gcc-patches@gcc.gnu.org

* [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2.
@ 2023-04-19  9:57 Jin Ma
  2023-05-05 15:03 ` Christoph Müllner
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Jin Ma @ 2023-04-19  9:57 UTC
  To: gcc-patches
  Cc: jeffreyalaw, kito.cheng, kito.cheng, palmer, christoph.muellner,
	ijinma, Jin Ma

This patch adds support for the 'Zfa' extension for RISC-V, based on:
  https://github.com/riscv/riscv-isa-manual/commits/zfb
  https://github.com/riscv/riscv-isa-manual/commit/1f038182810727f5feca311072e630d6baac51da

binutils-gdb support for the 'Zfa' extension:
  https://github.com/a4lg/binutils-gdb/commits/riscv-zfa

The following points need special explanation:
1. The immediate operand of the FLI.H/S/D instructions is written in the assembly as a
  floating-point value, using scientific notation for the entries 2^-16 and 2^-15
  (rs1 = 2 and 3 in the fli_value_print table below) and decimal notation for the other
  numeric values.

  Related llvm link:
    https://reviews.llvm.org/D145645
  Related discussion link:
    https://github.com/riscv/riscv-isa-manual/issues/980

2. According to the RISC-V spec, "The FCVTMOD.W.D instruction was added principally to
  accelerate the processing of JavaScript Numbers.", so it does not seem to need an
  implementation in GCC.

3. The instructions FMINM and FMAXM correspond to the C23 library functions fminimum and
  fmaximum. This patch therefore only adds the patterns fminm<hf\sf\df>3 and
  fmaxm<hf\sf\df>3, in preparation for later use.

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc: Add zfa extension version.
	* config/riscv/constraints.md (Zf): Constrain the floating-point constants that the
	FLI.H/S/D instructions can load.
	((TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS): Enable FMVP.D.X and FMVH.X.D.
	* config/riscv/iterators.md (ceil): New.
	* config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): New.
	* config/riscv/riscv.cc (find_index_in_array): New.
	(riscv_float_const_rtx_index_for_fli): Get the index of the floating-point constant that
	the FLI.H/S/D instructions can load.
	(riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be used, memory is not applicable.
	(riscv_const_insns): The cost of FLI.H/S/D is 3.
	(riscv_legitimize_const_move): Likewise.
	(riscv_split_64bit_move_p): If instruction FLI.H/S/D can be used, no split is required.
	(riscv_output_move): Output the move instructions of the Zfa extension.
	(riscv_print_operand): Output the floating-point value of the FLI.H/S/D immediate in assembly.
	(riscv_secondary_memory_needed): Likewise.
	* config/riscv/riscv.h (GP_REG_RTX_P): New.
	* config/riscv/riscv.md (fminm<mode>3): New.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zfa-fleq-fltq-rv32.c: New test.
	* gcc.target/riscv/zfa-fleq-fltq.c: New test.
	* gcc.target/riscv/zfa-fli-rv32.c: New test.
	* gcc.target/riscv/zfa-fli-zfh-rv32.c: New test.
	* gcc.target/riscv/zfa-fli-zfh.c: New test.
	* gcc.target/riscv/zfa-fli.c: New test.
	* gcc.target/riscv/zfa-fmovh-fmovp-rv32.c: New test.
	* gcc.target/riscv/zfa-fround-rv32.c: New test.
	* gcc.target/riscv/zfa-fround.c: New test.
---
 gcc/common/config/riscv/riscv-common.cc       |   4 +
 gcc/config/riscv/constraints.md               |  11 +-
 gcc/config/riscv/iterators.md                 |   5 +
 gcc/config/riscv/riscv-opts.h                 |   3 +
 gcc/config/riscv/riscv-protos.h               |   1 +
 gcc/config/riscv/riscv.cc                     | 168 +++++++++++++++++-
 gcc/config/riscv/riscv.h                      |   1 +
 gcc/config/riscv/riscv.md                     | 112 +++++++++---
 .../gcc.target/riscv/zfa-fleq-fltq-rv32.c     |  19 ++
 .../gcc.target/riscv/zfa-fleq-fltq.c          |  19 ++
 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c |  79 ++++++++
 .../gcc.target/riscv/zfa-fli-zfh-rv32.c       |  41 +++++
 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c  |  41 +++++
 gcc/testsuite/gcc.target/riscv/zfa-fli.c      |  79 ++++++++
 .../gcc.target/riscv/zfa-fmovh-fmovp-rv32.c   |  10 ++
 .../gcc.target/riscv/zfa-fround-rv32.c        |  42 +++++
 gcc/testsuite/gcc.target/riscv/zfa-fround.c   |  42 +++++
 17 files changed, 652 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround.c

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 309a52def75..f9fce6bcc38 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -217,6 +217,8 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
   {"zfh",       ISA_SPEC_CLASS_NONE, 1, 0},
   {"zfhmin",    ISA_SPEC_CLASS_NONE, 1, 0},
 
+  {"zfa",     ISA_SPEC_CLASS_NONE, 0, 2},
+
   {"zmmul", ISA_SPEC_CLASS_NONE, 1, 0},
 
   {"svinval", ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1260,6 +1262,8 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
   {"zfhmin",    &gcc_options::x_riscv_zf_subext, MASK_ZFHMIN},
   {"zfh",       &gcc_options::x_riscv_zf_subext, MASK_ZFH},
 
+  {"zfa",       &gcc_options::x_riscv_zf_subext, MASK_ZFA},
+
   {"zmmul", &gcc_options::x_riscv_zm_subext, MASK_ZMMUL},
 
   {"svinval", &gcc_options::x_riscv_sv_subext, MASK_SVINVAL},
diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index c448e6b37e9..62d9094f966 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -118,6 +118,13 @@ (define_constraint "T"
   (and (match_operand 0 "move_operand")
        (match_test "CONSTANT_P (op)")))
 
+;; Zfa constraints.
+
+(define_constraint "Zf"
+  "A floating point number that can be loaded using instruction `fli` in zfa."
+  (and (match_code "const_double")
+       (match_test "(riscv_float_const_rtx_index_for_fli (op) != -1)")))
+
 ;; Vector constraints.
 
 (define_register_constraint "vr" "TARGET_VECTOR ? V_REGS : NO_REGS"
@@ -183,8 +190,8 @@ (define_memory_constraint "Wdm"
 
 ;; Vendor ISA extension constraints.
 
-(define_register_constraint "th_f_fmv" "TARGET_XTHEADFMV ? FP_REGS : NO_REGS"
+(define_register_constraint "th_f_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS"
   "A floating-point register for XTheadFmv.")
 
-(define_register_constraint "th_r_fmv" "TARGET_XTHEADFMV ? GR_REGS : NO_REGS"
+(define_register_constraint "th_r_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? GR_REGS : NO_REGS"
   "An integer register for XTheadFmv.")
diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 9b767038452..c81b08e3cc5 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -288,3 +288,8 @@ (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
 (define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET "le")])
 (define_int_attr QUIET_PATTERN [(UNSPEC_FLT_QUIET "LT") (UNSPEC_FLE_QUIET "LE")])
 
+(define_int_iterator ROUND [UNSPEC_ROUND UNSPEC_FLOOR UNSPEC_CEIL UNSPEC_BTRUNC UNSPEC_ROUNDEVEN UNSPEC_NEARBYINT])
+(define_int_attr round_pattern [(UNSPEC_ROUND "round") (UNSPEC_FLOOR "floor") (UNSPEC_CEIL "ceil")
+				(UNSPEC_BTRUNC "btrunc") (UNSPEC_ROUNDEVEN "roundeven") (UNSPEC_NEARBYINT "nearbyint")])
+(define_int_attr round_rm [(UNSPEC_ROUND "rmm") (UNSPEC_FLOOR "rdn") (UNSPEC_CEIL "rup")
+			   (UNSPEC_BTRUNC "rtz") (UNSPEC_ROUNDEVEN "rne") (UNSPEC_NEARBYINT "dyn")])
\ No newline at end of file
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index cf0cd669be4..87b72efd12e 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -172,6 +172,9 @@ enum stack_protector_guard {
 #define TARGET_ZFHMIN ((riscv_zf_subext & MASK_ZFHMIN) != 0)
 #define TARGET_ZFH    ((riscv_zf_subext & MASK_ZFH) != 0)
 
+#define MASK_ZFA   (1 << 0)
+#define TARGET_ZFA    ((riscv_zf_subext & MASK_ZFA) != 0)
+
 #define MASK_ZMMUL      (1 << 0)
 #define TARGET_ZMMUL    ((riscv_zm_subext & MASK_ZMMUL) != 0)
 
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 5244e8dcbf0..e421244a06c 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -38,6 +38,7 @@ enum riscv_symbol_type {
 /* Routines implemented in riscv.cc.  */
 extern enum riscv_symbol_type riscv_classify_symbolic_expression (rtx);
 extern bool riscv_symbolic_constant_p (rtx, enum riscv_symbol_type *);
+extern int riscv_float_const_rtx_index_for_fli (rtx);
 extern int riscv_regno_mode_ok_for_base_p (int, machine_mode, bool);
 extern int riscv_address_insns (rtx, machine_mode, bool);
 extern int riscv_const_insns (rtx);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index cdb47e81e7c..faffedffe97 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -799,6 +799,116 @@ static int riscv_symbol_insns (enum riscv_symbol_type type)
     }
 }
 
+/* Immediate values loaded by the FLI.S instruction in Chapter 25 of the latest RISC-V ISA
+   Manual draft. For details, please see:
+   https://github.com/riscv/riscv-isa-manual/releases/tag/draft-20221217-cb3b9d1 */
+
+unsigned HOST_WIDE_INT fli_value_hf[32] =
+{
+  0xbc00, 0x400, 0x100, 0x200, 0x1c00, 0x2000, 0x2c00, 0x3000,
+  0x3400, 0x3500, 0x3600, 0x3700, 0x3800, 0x3900, 0x3a00, 0x3b00,
+  0x3c00, 0x3d00, 0x3e00, 0x3f00, 0x4000, 0x4100, 0x4200, 0x4400,
+  0x4800, 0x4c00, 0x5800, 0x5c00, 0x7800,
+  /* Only used for filling, ensuring that 29 and 30 of HF are the same. */
+  0x7800,
+  0x7c00, 0x7e00,
+};
+
+unsigned HOST_WIDE_INT fli_value_sf[32] =
+{
+  0xbf800000, 0x00800000, 0x37800000, 0x38000000, 0x3b800000, 0x3c000000, 0x3d800000, 0x3e000000,
+  0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000, 0x3f000000, 0x3f200000, 0x3f400000, 0x3f600000,
+  0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000, 0x40000000, 0x40200000, 0x40400000, 0x40800000,
+  0x41000000, 0x41800000, 0x43000000, 0x43800000, 0x47000000, 0x47800000, 0x7f800000, 0x7fc00000
+};
+
+unsigned HOST_WIDE_INT fli_value_df[32] =
+{
+  0xbff0000000000000, 0x10000000000000, 0x3ef0000000000000, 0x3f00000000000000,
+  0x3f70000000000000, 0x3f80000000000000, 0x3fb0000000000000, 0x3fc0000000000000,
+  0x3fd0000000000000, 0x3fd4000000000000, 0x3fd8000000000000, 0x3fdc000000000000,
+  0x3fe0000000000000, 0x3fe4000000000000, 0x3fe8000000000000, 0x3fec000000000000,
+  0x3ff0000000000000, 0x3ff4000000000000, 0x3ff8000000000000, 0x3ffc000000000000,
+  0x4000000000000000, 0x4004000000000000, 0x4008000000000000, 0x4010000000000000,
+  0x4020000000000000, 0x4030000000000000, 0x4060000000000000, 0x4070000000000000,
+  0x40e0000000000000, 0x40f0000000000000, 0x7ff0000000000000, 0x7ff8000000000000,
+};
+
+const char *fli_value_print[32] =
+{
+  "-1.0", "min", "1.52587890625e-05", "3.0517578125e-05", "0.00390625", "0.0078125", "0.0625", "0.125",
+  "0.25", "0.3125", "0.375", "0.4375", "0.5", "0.625", "0.75", "0.875",
+  "1.0", "1.25", "1.5", "1.75", "2.0", "2.5", "3.0", "4.0",
+  "8.0", "16.0", "128.0", "256.0", "32768.0", "65536.0", "inf", "nan"
+};
+
+/* Find the index of TARGET in ARRAY, and return -1 if not found. */
+
+static int
+find_index_in_array (unsigned HOST_WIDE_INT target, unsigned HOST_WIDE_INT *array, int len)
+{
+  if (array == NULL)
+    return -1;
+
+  for (int i = 0; i < len; i++)
+    {
+      if (target == array[i])
+	return i;
+    }
+  return -1;
+}
+
+/* Return index of the FLI instruction table if rtx X is an immediate constant that
+   can be moved using a single FLI instruction in zfa extension. -1 otherwise. */
+
+int
+riscv_float_const_rtx_index_for_fli (rtx x)
+{
+  machine_mode mode = GET_MODE (x);
+
+  if (!TARGET_ZFA || mode == VOIDmode
+      || !CONST_DOUBLE_P(x)
+      || (mode == HFmode && !TARGET_ZFH)
+      || (mode == SFmode && !TARGET_HARD_FLOAT)
+      || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
+    return -1;
+
+  if (!SCALAR_FLOAT_MODE_P (mode)
+      || GET_MODE_BITSIZE (mode).to_constant () > HOST_BITS_PER_WIDE_INT
+      /* Only support up to DF mode.  */
+      || GET_MODE_BITSIZE (mode).to_constant () > GET_MODE_BITSIZE (DFmode))
+    return -1;
+
+  unsigned HOST_WIDE_INT ival = 0;
+
+  long res[2];
+  real_to_target (res,
+		  CONST_DOUBLE_REAL_VALUE (x),
+		  REAL_MODE_FORMAT (mode));
+
+  if (mode == DFmode)
+    {
+      int order = BYTES_BIG_ENDIAN ? 1 : 0;
+      ival = zext_hwi (res[order], 32);
+      ival |= (zext_hwi (res[1 - order], 32) << 32);
+    }
+  else
+      ival = zext_hwi (res[0], 32);
+
+  switch (mode)
+    {
+      case SFmode:
+	return find_index_in_array (ival, fli_value_sf, 32);
+      case DFmode:
+	return find_index_in_array (ival, fli_value_df, 32);
+      case HFmode:
+	return find_index_in_array (ival, fli_value_hf, 32);
+      default:
+	break;
+    }
+  return -1;
+}
+
 /* Implement TARGET_LEGITIMATE_CONSTANT_P.  */
 
 static bool
@@ -826,6 +936,9 @@ riscv_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
   if (GET_CODE (x) == HIGH)
     return true;
 
+  if (riscv_float_const_rtx_index_for_fli (x) != -1)
+   return true;
+
   split_const (x, &base, &offset);
   if (riscv_symbolic_constant_p (base, &type))
     {
@@ -1213,6 +1326,11 @@ riscv_const_insns (rtx x)
       }
 
     case CONST_DOUBLE:
+      /* See if we can use FLI directly.  */
+      if (riscv_float_const_rtx_index_for_fli (x) != -1)
+	return 3;
+      /* Fall through.  */
+
     case CONST_VECTOR:
       /* We can use x0 to load floating-point zero.  */
       return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
@@ -1749,6 +1867,12 @@ riscv_legitimize_const_move (machine_mode mode, rtx dest, rtx src)
       return;
     }
 
+  if (riscv_float_const_rtx_index_for_fli (src) != -1)
+    {
+      riscv_emit_set (dest, src);
+      return;
+    }
+
   /* Split moves of symbolic constants into high/low pairs.  */
   if (riscv_split_symbol (dest, src, MAX_MACHINE_MODE, &src, FALSE))
     {
@@ -2770,12 +2894,19 @@ riscv_split_64bit_move_p (rtx dest, rtx src)
   if (TARGET_64BIT)
     return false;
 
+  /* There is no need to split if the FLI instruction in the `Zfa` extension can be used. */
+  if (riscv_float_const_rtx_index_for_fli (src) != -1)
+    return false;
+
   /* Allow FPR <-> FPR and FPR <-> MEM moves, and permit the special case
      of zeroing an FPR with FCVT.D.W.  */
   if (TARGET_DOUBLE_FLOAT
       && ((FP_REG_RTX_P (src) && FP_REG_RTX_P (dest))
 	  || (FP_REG_RTX_P (dest) && MEM_P (src))
 	  || (FP_REG_RTX_P (src) && MEM_P (dest))
+	  || (TARGET_ZFA
+	      && ((FP_REG_RTX_P (dest) && GP_REG_RTX_P (src))
+	      || (FP_REG_RTX_P (src) && GP_REG_RTX_P (dest))))
 	  || (FP_REG_RTX_P (dest) && src == CONST0_RTX (GET_MODE (src)))))
     return false;
 
@@ -2857,6 +2988,8 @@ riscv_output_move (rtx dest, rtx src)
 	  case 4:
 	    return "fmv.x.s\t%0,%1";
 	  case 8:
+	    if (!TARGET_64BIT && TARGET_ZFA)
+	      return "fmv.x.w\t%0,%1\n\tfmvh.x.d\t%N0,%1";
 	    return "fmv.x.d\t%0,%1";
 	  }
 
@@ -2916,6 +3049,8 @@ riscv_output_move (rtx dest, rtx src)
 	      case 8:
 		if (TARGET_64BIT)
 		  return "fmv.d.x\t%0,%z1";
+		else if (TARGET_ZFA && src != CONST0_RTX (mode))
+		  return "fmvp.d.x\t%0,%1,%N1";
 		/* in RV32, we can emulate fmv.d.x %0, x0 using fcvt.d.w */
 		gcc_assert (src == CONST0_RTX (mode));
 		return "fcvt.d.w\t%0,x0";
@@ -2968,6 +3103,14 @@ riscv_output_move (rtx dest, rtx src)
 	  case 8:
 	    return "fld\t%0,%1";
 	  }
+
+      if (src_code == CONST_DOUBLE && (riscv_float_const_rtx_index_for_fli (src) != -1))
+	switch (width)
+	  {
+	    case 2: return "fli.h\t%0,%1";
+	    case 4: return "fli.s\t%0,%1";
+	    case 8: return "fli.d\t%0,%1";
+	  }
     }
   if (dest_code == REG && GP_REG_P (REGNO (dest)) && src_code == CONST_POLY_INT)
     {
@@ -4349,6 +4492,7 @@ riscv_memmodel_needs_release_fence (enum memmodel model)
    'S'	Print shift-index of single-bit mask OP.
    'T'	Print shift-index of inverted single-bit mask OP.
    '~'	Print w if TARGET_64BIT is true; otherwise not print anything.
+   'N'  Print next register.
 
    Note please keep this list and the list in riscv.md in sync.  */
 
@@ -4533,6 +4677,9 @@ riscv_print_operand (FILE *file, rtx op, int letter)
 	output_addr_const (file, newop);
 	break;
       }
+    case 'N':
+      fputs (reg_names[REGNO (op) + 1], file);
+      break;
     default:
       switch (code)
 	{
@@ -4549,6 +4696,24 @@ riscv_print_operand (FILE *file, rtx op, int letter)
 	    output_address (mode, XEXP (op, 0));
 	  break;
 
+	case CONST_DOUBLE:
+	  {
+	    if (letter == 'z' && op == CONST0_RTX (GET_MODE (op)))
+	      {
+		fputs (reg_names[GP_REG_FIRST], file);
+		break;
+	      }
+
+	    int fli_index = riscv_float_const_rtx_index_for_fli (op);
+	    if (fli_index == -1 || fli_index > 31)
+	      {
+		output_operand_lossage ("invalid use of '%%%c'", letter);
+		break;
+	      }
+	    asm_fprintf (file, "%s", fli_value_print[fli_index]);
+	    break;
+	  }
+
 	default:
 	  if (letter == 'z' && op == CONST0_RTX (GET_MODE (op)))
 	    fputs (reg_names[GP_REG_FIRST], file);
@@ -5897,7 +6062,8 @@ riscv_secondary_memory_needed (machine_mode mode, reg_class_t class1,
   return (!riscv_v_ext_vector_mode_p (mode)
 	  && GET_MODE_SIZE (mode).to_constant () > UNITS_PER_WORD
 	  && (class1 == FP_REGS) != (class2 == FP_REGS)
-	  && !TARGET_XTHEADFMV);
+	  && !TARGET_XTHEADFMV
+	  && !TARGET_ZFA);
 }
 
 /* Implement TARGET_REGISTER_MOVE_COST.  */
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 66fb07d6652..d438b281142 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -377,6 +377,7 @@ ASM_MISA_SPEC
 #define SIBCALL_REG_P(REGNO)	\
   TEST_HARD_REG_BIT (reg_class_contents[SIBCALL_REGS], REGNO)
 
+#define GP_REG_RTX_P(X) (REG_P (X) && GP_REG_P (REGNO (X)))
 #define FP_REG_RTX_P(X) (REG_P (X) && FP_REG_P (REGNO (X)))
 
 /* Use s0 as the frame pointer if it is so requested.  */
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index bc384d9aedf..f22e71b5a3a 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -59,6 +59,15 @@ (define_c_enum "unspec" [
   UNSPEC_LROUND
   UNSPEC_FMIN
   UNSPEC_FMAX
+  UNSPEC_RINT
+  UNSPEC_ROUND
+  UNSPEC_FLOOR
+  UNSPEC_CEIL
+  UNSPEC_BTRUNC
+  UNSPEC_ROUNDEVEN
+  UNSPEC_NEARBYINT
+  UNSPEC_FMINM
+  UNSPEC_FMAXM
 
   ;; Stack tie
   UNSPEC_TIE
@@ -1232,6 +1241,26 @@ (define_insn "neg<mode>2"
 ;;
 ;;  ....................
 
+(define_insn "fminm<mode>3"
+  [(set (match_operand:ANYF                    0 "register_operand" "=f")
+	(unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
+		      (use (match_operand:ANYF 2 "register_operand" " f"))]
+		     UNSPEC_FMINM))]
+  "TARGET_HARD_FLOAT && TARGET_ZFA"
+  "fminm.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fmove")
+   (set_attr "mode" "<UNITMODE>")])
+
+(define_insn "fmaxm<mode>3"
+  [(set (match_operand:ANYF                    0 "register_operand" "=f")
+	(unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
+		      (use (match_operand:ANYF 2 "register_operand" " f"))]
+		     UNSPEC_FMAXM))]
+  "TARGET_HARD_FLOAT && TARGET_ZFA"
+  "fmaxm.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fmove")
+   (set_attr "mode" "<UNITMODE>")])
+
 (define_insn "fmin<mode>3"
   [(set (match_operand:ANYF                    0 "register_operand" "=f")
 	(unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
@@ -1508,13 +1537,13 @@ (define_expand "movhf"
 })
 
 (define_insn "*movhf_hardfloat"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
-	(match_operand:HF 1 "move_operand"         " f,G,m,f,G,*r,*f,*G*r,*m,*r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r,  *r,*r,*m")
+	(match_operand:HF 1 "move_operand"         " f,Zf,G,m,f,G,*r,*f,*G*r,*m,*r"))]
   "TARGET_ZFHMIN
    && (register_operand (operands[0], HFmode)
        || reg_or_0_operand (operands[1], HFmode))"
   { return riscv_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
+  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
    (set_attr "mode" "HF")])
 
 (define_insn "*movhf_softfloat"
@@ -1580,6 +1609,26 @@ (define_insn "l<rint_pattern><ANYF:mode><GPR:mode>2"
   [(set_attr "type" "fcvt")
    (set_attr "mode" "<ANYF:MODE>")])
 
+(define_insn "<round_pattern><ANYF:mode>2"
+  [(set (match_operand:ANYF     0 "register_operand" "=f")
+	(unspec:ANYF
+	    [(match_operand:ANYF 1 "register_operand" " f")]
+	ROUND))]
+  "TARGET_HARD_FLOAT && TARGET_ZFA"
+  "fround.<ANYF:fmt>\t%0,%1,<round_rm>"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "<ANYF:MODE>")])
+
+(define_insn "rint<ANYF:mode>2"
+  [(set (match_operand:ANYF     0 "register_operand" "=f")
+	(unspec:ANYF
+	    [(match_operand:ANYF 1 "register_operand" " f")]
+	UNSPEC_RINT))]
+  "TARGET_HARD_FLOAT && TARGET_ZFA"
+  "froundnx.<ANYF:fmt>\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "<ANYF:MODE>")])
+
 ;;
 ;;  ....................
 ;;
@@ -1839,13 +1888,13 @@ (define_expand "movsf"
 })
 
 (define_insn "*movsf_hardfloat"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
-	(match_operand:SF 1 "move_operand"         " f,G,m,f,G,*r,*f,*G*r,*m,*r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r,  *r,*r,*m")
+	(match_operand:SF 1 "move_operand"         " f,Zf,G,m,f,G,*r,*f,*G*r,*m,*r"))]
   "TARGET_HARD_FLOAT
    && (register_operand (operands[0], SFmode)
        || reg_or_0_operand (operands[1], SFmode))"
   { return riscv_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
+  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
    (set_attr "mode" "SF")])
 
 (define_insn "*movsf_softfloat"
@@ -1873,23 +1922,23 @@ (define_expand "movdf"
 ;; In RV32, we lack fmv.x.d and fmv.d.x.  Go through memory instead.
 ;; (However, we can still use fcvt.d.w to zero a floating-point register.)
 (define_insn "*movdf_hardfloat_rv32"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,m,*th_f_fmv,*th_r_fmv,  *r,*r,*m")
-	(match_operand:DF 1 "move_operand"         " f,G,m,f,G,*th_r_fmv,*th_f_fmv,*r*G,*m,*r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*th_f_fmv,*th_r_fmv,  *r,*r,*m")
+	(match_operand:DF 1 "move_operand"         " f,Zf,G,m,f,G,*th_r_fmv,*th_f_fmv,*r*G,*m,*r"))]
   "!TARGET_64BIT && TARGET_DOUBLE_FLOAT
    && (register_operand (operands[0], DFmode)
        || reg_or_0_operand (operands[1], DFmode))"
   { return riscv_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
+  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
    (set_attr "mode" "DF")])
 
 (define_insn "*movdf_hardfloat_rv64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
-	(match_operand:DF 1 "move_operand"         " f,G,m,f,G,*r,*f,*r*G,*m,*r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r,  *r,*r,*m")
+	(match_operand:DF 1 "move_operand"         " f,Zf,G,m,f,G,*r,*f,*r*G,*m,*r"))]
   "TARGET_64BIT && TARGET_DOUBLE_FLOAT
    && (register_operand (operands[0], DFmode)
        || reg_or_0_operand (operands[1], DFmode))"
   { return riscv_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
+  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
    (set_attr "mode" "DF")])
 
 (define_insn "*movdf_softfloat"
@@ -2494,16 +2543,23 @@ (define_expand "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4"
   rtx op0 = operands[0];
   rtx op1 = operands[1];
   rtx op2 = operands[2];
-  rtx tmp = gen_reg_rtx (SImode);
-  rtx cmp = gen_rtx_<QUIET_PATTERN> (<X:MODE>mode, op1, op2);
-  rtx frflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, const0_rtx),
-					 UNSPECV_FRFLAGS);
-  rtx fsflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, tmp),
-					 UNSPECV_FSFLAGS);
-
-  emit_insn (gen_rtx_SET (tmp, frflags));
-  emit_insn (gen_rtx_SET (op0, cmp));
-  emit_insn (fsflags);
+
+  if (TARGET_ZFA)
+    emit_insn (gen_f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa(op0, op1, op2));
+  else
+    {
+      rtx tmp = gen_reg_rtx (SImode);
+      rtx cmp = gen_rtx_<QUIET_PATTERN> (<X:MODE>mode, op1, op2);
+      rtx frflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, const0_rtx),
+					     UNSPECV_FRFLAGS);
+      rtx fsflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, tmp),
+					     UNSPECV_FSFLAGS);
+
+      emit_insn (gen_rtx_SET (tmp, frflags));
+      emit_insn (gen_rtx_SET (op0, cmp));
+      emit_insn (fsflags);
+    }
+
   if (HONOR_SNANS (<ANYF:MODE>mode))
     emit_insn (gen_rtx_UNSPEC_VOLATILE (<ANYF:MODE>mode,
 					gen_rtvec (2, op1, op2),
@@ -2511,6 +2567,18 @@ (define_expand "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4"
   DONE;
 })
 
+(define_insn "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa"
+   [(set (match_operand:X      0 "register_operand" "=r")
+	 (unspec:X
+	  [(match_operand:ANYF 1 "register_operand" " f")
+	   (match_operand:ANYF 2 "register_operand" " f")]
+	  QUIET_COMPARISON))]
+  "TARGET_HARD_FLOAT && TARGET_ZFA"
+  "f<quiet_pattern>q.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fcmp")
+   (set_attr "mode" "<UNITMODE>")
+   (set (attr "length") (const_int 16))])
+
 (define_insn "*seq_zero_<X:mode><GPR:mode>"
   [(set (match_operand:GPR       0 "register_operand" "=r")
 	(eq:GPR (match_operand:X 1 "register_operand" " r")
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
new file mode 100644
index 00000000000..26895b76fa4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O2" } */
+
+extern void abort(void);
+extern float a, b;
+extern double c, d;
+
+void 
+foo()
+{
+  if ((__builtin_isless(a, b) ||  __builtin_islessequal(c, d))
+      && (__builtin_islessequal(a, b)|| __builtin_isless(c, d)))
+    abort();
+}
+
+/* { dg-final { scan-assembler-times "fleq.s" 1 } } */
+/* { dg-final { scan-assembler-times "fltq.s" 1 } } */
+/* { dg-final { scan-assembler-times "fleq.d" 1 } } */
+/* { dg-final { scan-assembler-times "fltq.d" 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
new file mode 100644
index 00000000000..4ccd6a7dd78
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O2" } */
+
+extern void abort(void);
+extern float a, b;
+extern double c, d;
+
+void 
+foo()
+{
+  if ((__builtin_isless(a, b) ||  __builtin_islessequal(c, d))
+      && (__builtin_islessequal(a, b)|| __builtin_isless(c, d)))
+    abort();
+}
+
+/* { dg-final { scan-assembler-times "fleq.s" 1 } } */
+/* { dg-final { scan-assembler-times "fltq.s" 1 } } */
+/* { dg-final { scan-assembler-times "fleq.d" 1 } } */
+/* { dg-final { scan-assembler-times "fltq.d" 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
new file mode 100644
index 00000000000..c4da04797aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
@@ -0,0 +1,79 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O0" } */
+
+void foo_float32 ()
+{
+  volatile float a;
+  a = -1.0;
+  a = 1.1754944e-38;
+  a = 1.0/(1 << 16);
+  a = 1.0/(1 << 15);
+  a = 1.0/(1 << 8);
+  a = 1.0/(1 << 7);
+  a = 1.0/(1 << 4);
+  a = 1.0/(1 << 3);
+  a = 1.0/(1 << 2);
+  a = 0.3125;
+  a = 0.375;
+  a = 0.4375;
+  a = 0.5;
+  a = 0.625;
+  a = 0.75;
+  a = 0.875;
+  a = 1.0;
+  a = 1.25;
+  a = 1.5;
+  a = 1.75;
+  a = 2.0;
+  a = 2.5;
+  a = 3.0;
+  a = 1.0*(1 << 2);
+  a = 1.0*(1 << 3);
+  a = 1.0*(1 << 4);
+  a = 1.0*(1 << 7);
+  a = 1.0*(1 << 8);
+  a = 1.0*(1 << 15);
+  a = 1.0*(1 << 16);
+  a = __builtin_inff ();
+  a = __builtin_nanf ("");
+}
+
+void foo_double64 ()
+{
+  volatile double a;
+  a = -1.0;
+  a = 2.2250738585072014E-308;
+  a = 1.0/(1 << 16);
+  a = 1.0/(1 << 15);
+  a = 1.0/(1 << 8);
+  a = 1.0/(1 << 7);
+  a = 1.0/(1 << 4);
+  a = 1.0/(1 << 3);
+  a = 1.0/(1 << 2);
+  a = 0.3125;
+  a = 0.375;
+  a = 0.4375;
+  a = 0.5;
+  a = 0.625;
+  a = 0.75;
+  a = 0.875;
+  a = 1.0;
+  a = 1.25;
+  a = 1.5;
+  a = 1.75;
+  a = 2.0;
+  a = 2.5;
+  a = 3.0;
+  a = 1.0*(1 << 2);
+  a = 1.0*(1 << 3);
+  a = 1.0*(1 << 4);
+  a = 1.0*(1 << 7);
+  a = 1.0*(1 << 8);
+  a = 1.0*(1 << 15);
+  a = 1.0*(1 << 16);
+  a = __builtin_inf ();
+  a = __builtin_nan ("");
+}
+
+/* { dg-final { scan-assembler-times "fli.s" 32 } } */
+/* { dg-final { scan-assembler-times "fli.d" 32 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
new file mode 100644
index 00000000000..bcffe9d2c82
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32imafdc_zfa_zfh -mabi=ilp32d -O0" } */
+
+void foo_float16 ()
+{
+  volatile _Float16 a;
+  a = -1.0;
+  a = 6.104E-5;
+  a = 1.0/(1 << 16);
+  a = 1.0/(1 << 15);
+  a = 1.0/(1 << 8);
+  a = 1.0/(1 << 7);
+  a = 1.0/(1 << 4);
+  a = 1.0/(1 << 3);
+  a = 1.0/(1 << 2);
+  a = 0.3125;
+  a = 0.375;
+  a = 0.4375;
+  a = 0.5;
+  a = 0.625;
+  a = 0.75;
+  a = 0.875;
+  a = 1.0;
+  a = 1.25;
+  a = 1.5;
+  a = 1.75;
+  a = 2.0;
+  a = 2.5;
+  a = 3.0;
+  a = 1.0*(1 << 2);
+  a = 1.0*(1 << 3);
+  a = 1.0*(1 << 4);
+  a = 1.0*(1 << 7);
+  a = 1.0*(1 << 8);
+  a = 1.0*(1 << 15);
+  a = 1.0*(1 << 16);
+  a = __builtin_inff16 ();
+  a = __builtin_nanf16 ("");
+}
+
+/* { dg-final { scan-assembler-times "fli.h" 32 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
new file mode 100644
index 00000000000..13aa7b5f846
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64imafdc_zfa_zfh -mabi=lp64d -O0" } */
+
+void foo_float16 ()
+{
+  volatile _Float16 a;
+  a = -1.0;
+  a = 6.104E-5;
+  a = 1.0/(1 << 16);
+  a = 1.0/(1 << 15);
+  a = 1.0/(1 << 8);
+  a = 1.0/(1 << 7);
+  a = 1.0/(1 << 4);
+  a = 1.0/(1 << 3);
+  a = 1.0/(1 << 2);
+  a = 0.3125;
+  a = 0.375;
+  a = 0.4375;
+  a = 0.5;
+  a = 0.625;
+  a = 0.75;
+  a = 0.875;
+  a = 1.0;
+  a = 1.25;
+  a = 1.5;
+  a = 1.75;
+  a = 2.0;
+  a = 2.5;
+  a = 3.0;
+  a = 1.0*(1 << 2);
+  a = 1.0*(1 << 3);
+  a = 1.0*(1 << 4);
+  a = 1.0*(1 << 7);
+  a = 1.0*(1 << 8);
+  a = 1.0*(1 << 15);
+  a = 1.0*(1 << 16);
+  a = __builtin_inff16 ();
+  a = __builtin_nanf16 ("");
+}
+
+/* { dg-final { scan-assembler-times "fli.h" 32 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli.c b/gcc/testsuite/gcc.target/riscv/zfa-fli.c
new file mode 100644
index 00000000000..b6d41cf460f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fli.c
@@ -0,0 +1,79 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O0" } */
+
+void foo_float32 ()
+{
+  volatile float a;
+  a = -1.0;
+  a = 1.1754944e-38;
+  a = 1.0/(1 << 16);
+  a = 1.0/(1 << 15);
+  a = 1.0/(1 << 8);
+  a = 1.0/(1 << 7);
+  a = 1.0/(1 << 4);
+  a = 1.0/(1 << 3);
+  a = 1.0/(1 << 2);
+  a = 0.3125;
+  a = 0.375;
+  a = 0.4375;
+  a = 0.5;
+  a = 0.625;
+  a = 0.75;
+  a = 0.875;
+  a = 1.0;
+  a = 1.25;
+  a = 1.5;
+  a = 1.75;
+  a = 2.0;
+  a = 2.5;
+  a = 3.0;
+  a = 1.0*(1 << 2);
+  a = 1.0*(1 << 3);
+  a = 1.0*(1 << 4);
+  a = 1.0*(1 << 7);
+  a = 1.0*(1 << 8);
+  a = 1.0*(1 << 15);
+  a = 1.0*(1 << 16);
+  a = __builtin_inff ();
+  a = __builtin_nanf ("");
+}
+
+void foo_double64 ()
+{
+  volatile double a;
+  a = -1.0;
+  a = 2.2250738585072014E-308;
+  a = 1.0/(1 << 16);
+  a = 1.0/(1 << 15);
+  a = 1.0/(1 << 8);
+  a = 1.0/(1 << 7);
+  a = 1.0/(1 << 4);
+  a = 1.0/(1 << 3);
+  a = 1.0/(1 << 2);
+  a = 0.3125;
+  a = 0.375;
+  a = 0.4375;
+  a = 0.5;
+  a = 0.625;
+  a = 0.75;
+  a = 0.875;
+  a = 1.0;
+  a = 1.25;
+  a = 1.5;
+  a = 1.75;
+  a = 2.0;
+  a = 2.5;
+  a = 3.0;
+  a = 1.0*(1 << 2);
+  a = 1.0*(1 << 3);
+  a = 1.0*(1 << 4);
+  a = 1.0*(1 << 7);
+  a = 1.0*(1 << 8);
+  a = 1.0*(1 << 15);
+  a = 1.0*(1 << 16);
+  a = __builtin_inf ();
+  a = __builtin_nan ("");
+}
+
+/* { dg-final { scan-assembler-times "fli.s" 32 } } */
+/* { dg-final { scan-assembler-times "fli.d" 32 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
new file mode 100644
index 00000000000..5a52adce36a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32g_zfa -mabi=ilp32 -O0" } */
+
+double foo(long long a)
+{
+  return (double)(a + 3);
+}
+
+/* { dg-final { scan-assembler-times "fmvp.d.x" 1 } } */
+/* { dg-final { scan-assembler-times "fmvh.x.d" 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
new file mode 100644
index 00000000000..b53601d6e1f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O2" } */
+
+extern float a;
+extern double b;
+
+void foo (float *x, double *y)
+{
+  {
+    *x = __builtin_roundf (a);
+    *y = __builtin_round (b);
+  }
+  {
+    *x = __builtin_floorf (a);
+    *y = __builtin_floor (b);
+  }
+  {
+    *x = __builtin_ceilf (a);
+    *y = __builtin_ceil (b);
+  }
+  {
+    *x = __builtin_truncf (a);
+    *y = __builtin_trunc (b);
+  }
+  {
+    *x = __builtin_roundevenf (a);
+    *y = __builtin_roundeven (b);
+  }
+  {
+    *x = __builtin_nearbyintf (a);
+    *y = __builtin_nearbyint (b);
+  }
+  {
+    *x = __builtin_rintf (a);
+    *y = __builtin_rint (b);
+  }
+}
+
+/* { dg-final { scan-assembler-times "fround.s" 6 } } */
+/* { dg-final { scan-assembler-times "fround.d" 6 } } */
+/* { dg-final { scan-assembler-times "froundnx.s" 1 } } */
+/* { dg-final { scan-assembler-times "froundnx.d" 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fround.c b/gcc/testsuite/gcc.target/riscv/zfa-fround.c
new file mode 100644
index 00000000000..c10de82578e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fround.c
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O2" } */
+
+extern float a;
+extern double b;
+
+void foo (float *x, double *y)
+{
+  {
+    *x = __builtin_roundf (a);
+    *y = __builtin_round (b);
+  }
+  {
+    *x = __builtin_floorf (a);
+    *y = __builtin_floor (b);
+  }
+  {
+    *x = __builtin_ceilf (a);
+    *y = __builtin_ceil (b);
+  }
+  {
+    *x = __builtin_truncf (a);
+    *y = __builtin_trunc (b);
+  }
+  {
+    *x = __builtin_roundevenf (a);
+    *y = __builtin_roundeven (b);
+  }
+  {
+    *x = __builtin_nearbyintf (a);
+    *y = __builtin_nearbyint (b);
+  }
+  {
+    *x = __builtin_rintf (a);
+    *y = __builtin_rint (b);
+  }
+}
+
+/* { dg-final { scan-assembler-times "fround.s" 6 } } */
+/* { dg-final { scan-assembler-times "fround.d" 6 } } */
+/* { dg-final { scan-assembler-times "froundnx.s" 1 } } */
+/* { dg-final { scan-assembler-times "froundnx.d" 1 } } */
-- 
2.17.1



* Re: [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2.
  2023-04-19  9:57 [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2 Jin Ma
@ 2023-05-05 15:03 ` Christoph Müllner
  2023-05-05 15:04   ` Christoph Müllner
  2023-05-05 23:31 ` Jeff Law
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 20+ messages in thread
From: Christoph Müllner @ 2023-05-05 15:03 UTC (permalink / raw)
  To: Jin Ma; +Cc: gcc-patches, jeffreyalaw, kito.cheng, kito.cheng, palmer, ijinma

On Wed, Apr 19, 2023 at 11:58 AM Jin Ma <jinma@linux.alibaba.com> wrote:
>
> This patch adds the 'Zfa' extension for riscv, which is based on:
>   https://github.com/riscv/riscv-isa-manual/commits/zfb
>   https://github.com/riscv/riscv-isa-manual/commit/1f038182810727f5feca311072e630d6baac51da
>
> The binutils-gdb for 'Zfa' extension:
>   https://github.com/a4lg/binutils-gdb/commits/riscv-zfa
>
> What needs special explanation is:
> 1, The immediate of the instructions FLI.H/S/D is represented in the assembly as a
>   floating-point value, in scientific notation when rs1 is 1,2, and as a decimal number for
>   the rest.
>
>   Related llvm link:
>     https://reviews.llvm.org/D145645
>   Related discussion link:
>     https://github.com/riscv/riscv-isa-manual/issues/980
>
> 2, According to riscv-spec, "The FCVTMOD.W.D instruction was added principally to
>   accelerate the processing of JavaScript Numbers.", so it seems that no implementation
>   is required.
>
> 3, The instructions FMINM and FMAXM correspond to the C23 library functions fminimum and fmaximum.
>   Therefore, this patch has simply implemented the pattern of fminm<hf\sf\df>3 and
>   fmaxm<hf\sf\df>3 to prepare for later.
>
> gcc/ChangeLog:
>
>         * common/config/riscv/riscv-common.cc: Add zfa extension version.
>         * config/riscv/constraints.md (Zf): Constrain the floating point number that the
>         instructions FLI.H/S/D can load.
>         ((TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS): enable FMVP.D.X and FMVH.X.D.
>         * config/riscv/iterators.md (ceil): New.
>         * config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): New.
>         * config/riscv/riscv.cc (find_index_in_array): New.
>         (riscv_float_const_rtx_index_for_fli): Get the index of the floating-point number that
>         the instructions FLI.H/S/D can mov.
>         (riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be used, memory is not applicable.
>         (riscv_const_insns): The cost of FLI.H/S/D is 3.
>         (riscv_legitimize_const_move): Likewise.
>         (riscv_split_64bit_move_p): If instruction FLI.H/S/D can be used, no split is required.
>         (riscv_output_move): Output the mov instructions in zfa extension.
>         (riscv_print_operand): Output the floating-point value of the FLI.H/S/D immediate in assembly
>         (riscv_secondary_memory_needed): Likewise.
>         * config/riscv/riscv.h (GP_REG_RTX_P): New.
>         * config/riscv/riscv.md (fminm<mode>3): New.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/zfa-fleq-fltq-rv32.c: New test.
>         * gcc.target/riscv/zfa-fleq-fltq.c: New test.
>         * gcc.target/riscv/zfa-fli-rv32.c: New test.
>         * gcc.target/riscv/zfa-fli-zfh-rv32.c: New test.
>         * gcc.target/riscv/zfa-fli-zfh.c: New test.
>         * gcc.target/riscv/zfa-fli.c: New test.
>         * gcc.target/riscv/zfa-fmovh-fmovp-rv32.c: New test.
>         * gcc.target/riscv/zfa-fround-rv32.c: New test.
>         * gcc.target/riscv/zfa-fround.c: New test.
> ---
>  gcc/common/config/riscv/riscv-common.cc       |   4 +
>  gcc/config/riscv/constraints.md               |  11 +-
>  gcc/config/riscv/iterators.md                 |   5 +
>  gcc/config/riscv/riscv-opts.h                 |   3 +
>  gcc/config/riscv/riscv-protos.h               |   1 +
>  gcc/config/riscv/riscv.cc                     | 168 +++++++++++++++++-
>  gcc/config/riscv/riscv.h                      |   1 +
>  gcc/config/riscv/riscv.md                     | 112 +++++++++---
>  .../gcc.target/riscv/zfa-fleq-fltq-rv32.c     |  19 ++
>  .../gcc.target/riscv/zfa-fleq-fltq.c          |  19 ++
>  gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c |  79 ++++++++
>  .../gcc.target/riscv/zfa-fli-zfh-rv32.c       |  41 +++++
>  gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c  |  41 +++++
>  gcc/testsuite/gcc.target/riscv/zfa-fli.c      |  79 ++++++++
>  .../gcc.target/riscv/zfa-fmovh-fmovp-rv32.c   |  10 ++
>  .../gcc.target/riscv/zfa-fround-rv32.c        |  42 +++++
>  gcc/testsuite/gcc.target/riscv/zfa-fround.c   |  42 +++++
>  17 files changed, 652 insertions(+), 25 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround.c
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
> index 309a52def75..f9fce6bcc38 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -217,6 +217,8 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
>    {"zfh",       ISA_SPEC_CLASS_NONE, 1, 0},
>    {"zfhmin",    ISA_SPEC_CLASS_NONE, 1, 0},
>
> +  {"zfa",     ISA_SPEC_CLASS_NONE, 0, 2},
> +
>    {"zmmul", ISA_SPEC_CLASS_NONE, 1, 0},
>
>    {"svinval", ISA_SPEC_CLASS_NONE, 1, 0},
> @@ -1260,6 +1262,8 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
>    {"zfhmin",    &gcc_options::x_riscv_zf_subext, MASK_ZFHMIN},
>    {"zfh",       &gcc_options::x_riscv_zf_subext, MASK_ZFH},
>
> +  {"zfa",       &gcc_options::x_riscv_zf_subext, MASK_ZFA},
> +
>    {"zmmul", &gcc_options::x_riscv_zm_subext, MASK_ZMMUL},
>
>    {"svinval", &gcc_options::x_riscv_sv_subext, MASK_SVINVAL},
> diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
> index c448e6b37e9..62d9094f966 100644
> --- a/gcc/config/riscv/constraints.md
> +++ b/gcc/config/riscv/constraints.md
> @@ -118,6 +118,13 @@ (define_constraint "T"
>    (and (match_operand 0 "move_operand")
>         (match_test "CONSTANT_P (op)")))
>
> +;; Zfa constraints.
> +
> +(define_constraint "Zf"
> +  "A floating point number that can be loaded using instruction `fli` in zfa."
> +  (and (match_code "const_double")
> +       (match_test "(riscv_float_const_rtx_index_for_fli (op) != -1)")))
> +
>  ;; Vector constraints.
>
>  (define_register_constraint "vr" "TARGET_VECTOR ? V_REGS : NO_REGS"
> @@ -183,8 +190,8 @@ (define_memory_constraint "Wdm"
>
>  ;; Vendor ISA extension constraints.
>
> -(define_register_constraint "th_f_fmv" "TARGET_XTHEADFMV ? FP_REGS : NO_REGS"
> +(define_register_constraint "th_f_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS"
>    "A floating-point register for XTheadFmv.")
>
> -(define_register_constraint "th_r_fmv" "TARGET_XTHEADFMV ? GR_REGS : NO_REGS"
> +(define_register_constraint "th_r_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? GR_REGS : NO_REGS"
>    "An integer register for XTheadFmv.")

These are vendor extension constraints with the prefix "th_".
I would avoid using them in code that targets standard extensions.

I see two options here:
a) Create two new constraints at the top of the file, e.g.:
    - "F" - "A floating-point register (no fall-back for Zfinx)" and
    - "rF" - "An integer register in case FP registers are available".
b) Move these two constraints to the top of the file and rename them
   (and adjust movdf_hardfloat_rv32 accordingly).

I would prefer b), and would even go so far as to do it in a separate
commit that comes before the Zfa support patch.


I've applied the patch on top of today's master (with --3way) and
successfully tested it:
Tested-by: Christoph Müllner <christoph.muellner@vrull.eu>

> diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
> index 9b767038452..c81b08e3cc5 100644
> --- a/gcc/config/riscv/iterators.md
> +++ b/gcc/config/riscv/iterators.md
> @@ -288,3 +288,8 @@ (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
>  (define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET "le")])
>  (define_int_attr QUIET_PATTERN [(UNSPEC_FLT_QUIET "LT") (UNSPEC_FLE_QUIET "LE")])
>
> +(define_int_iterator ROUND [UNSPEC_ROUND UNSPEC_FLOOR UNSPEC_CEIL UNSPEC_BTRUNC UNSPEC_ROUNDEVEN UNSPEC_NEARBYINT])
> +(define_int_attr round_pattern [(UNSPEC_ROUND "round") (UNSPEC_FLOOR "floor") (UNSPEC_CEIL "ceil")
> +                               (UNSPEC_BTRUNC "btrunc") (UNSPEC_ROUNDEVEN "roundeven") (UNSPEC_NEARBYINT "nearbyint")])
> +(define_int_attr round_rm [(UNSPEC_ROUND "rmm") (UNSPEC_FLOOR "rdn") (UNSPEC_CEIL "rup")
> +                          (UNSPEC_BTRUNC "rtz") (UNSPEC_ROUNDEVEN "rne") (UNSPEC_NEARBYINT "dyn")])
> \ No newline at end of file
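
As a reading aid for the round_rm attribute above: the mapping from
pattern name to the rounding-mode operand of fround can be sketched as a
plain C table. The names round_map, round_table and round_rm_for are
hypothetical and only used for illustration; only the name pairs are taken
from the iterators. Note that rint is not part of this iterator; the patch
handles it with a separate froundnx pattern.

```c
#include <stddef.h>
#include <string.h>

/* One entry per member of the ROUND iterator: the pattern name and the
   rounding-mode operand that the fround template prints for it.  */
struct round_map
{
  const char *pattern;
  const char *rm;
};

static const struct round_map round_table[] =
{
  { "round",     "rmm" },  /* round to nearest, ties away from zero */
  { "floor",     "rdn" },  /* round down */
  { "ceil",      "rup" },  /* round up */
  { "btrunc",    "rtz" },  /* round towards zero */
  { "roundeven", "rne" },  /* round to nearest, ties to even */
  { "nearbyint", "dyn" },  /* use the dynamic rounding mode */
};

/* Hypothetical helper: return the rounding-mode operand for PATTERN,
   or NULL if PATTERN is not covered by the table.  */
static const char *
round_rm_for (const char *pattern)
{
  for (size_t i = 0; i < sizeof round_table / sizeof round_table[0]; i++)
    if (strcmp (round_table[i].pattern, pattern) == 0)
      return round_table[i].rm;
  return NULL;
}
```

With this sketch, round_rm_for ("floor") yields "rdn", and
round_rm_for ("rint") yields NULL because rint goes through froundnx.
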
> diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
> index cf0cd669be4..87b72efd12e 100644
> --- a/gcc/config/riscv/riscv-opts.h
> +++ b/gcc/config/riscv/riscv-opts.h
> @@ -172,6 +172,9 @@ enum stack_protector_guard {
>  #define TARGET_ZFHMIN ((riscv_zf_subext & MASK_ZFHMIN) != 0)
>  #define TARGET_ZFH    ((riscv_zf_subext & MASK_ZFH) != 0)
>
> +#define MASK_ZFA   (1 << 0)
> +#define TARGET_ZFA    ((riscv_zf_subext & MASK_ZFA) != 0)
> +
>  #define MASK_ZMMUL      (1 << 0)
>  #define TARGET_ZMMUL    ((riscv_zm_subext & MASK_ZMMUL) != 0)
>
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 5244e8dcbf0..e421244a06c 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -38,6 +38,7 @@ enum riscv_symbol_type {
>  /* Routines implemented in riscv.cc.  */
>  extern enum riscv_symbol_type riscv_classify_symbolic_expression (rtx);
>  extern bool riscv_symbolic_constant_p (rtx, enum riscv_symbol_type *);
> +extern int riscv_float_const_rtx_index_for_fli (rtx);
>  extern int riscv_regno_mode_ok_for_base_p (int, machine_mode, bool);
>  extern int riscv_address_insns (rtx, machine_mode, bool);
>  extern int riscv_const_insns (rtx);
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index cdb47e81e7c..faffedffe97 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -799,6 +799,116 @@ static int riscv_symbol_insns (enum riscv_symbol_type type)
>      }
>  }
>
> +/* Immediate values loaded by the FLI.S instruction in Chapter 25 of the latest RISC-V ISA
> +   Manual draft. For details, please see:
> +   https://github.com/riscv/riscv-isa-manual/releases/tag/draft-20221217-cb3b9d1 */
> +
> +unsigned HOST_WIDE_INT fli_value_hf[32] =
> +{
> +  0xbc00, 0x400, 0x100, 0x200, 0x1c00, 0x2000, 0x2c00, 0x3000,
> +  0x3400, 0x3500, 0x3600, 0x3700, 0x3800, 0x3900, 0x3a00, 0x3b00,
> +  0x3c00, 0x3d00, 0x3e00, 0x3f00, 0x4000, 0x4100, 0x4200, 0x4400,
> +  0x4800, 0x4c00, 0x5800, 0x5c00, 0x7800,
> +  /* Only used for filling, ensuring that 29 and 30 of HF are the same. */
> +  0x7800,
> +  0x7c00, 0x7e00,
> +};
> +
> +unsigned HOST_WIDE_INT fli_value_sf[32] =
> +{
> +  0xbf800000, 0x00800000, 0x37800000, 0x38000000, 0x3b800000, 0x3c000000, 0x3d800000, 0x3e000000,
> +  0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000, 0x3f000000, 0x3f200000, 0x3f400000, 0x3f600000,
> +  0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000, 0x40000000, 0x40200000, 0x40400000, 0x40800000,
> +  0x41000000, 0x41800000, 0x43000000, 0x43800000, 0x47000000, 0x47800000, 0x7f800000, 0x7fc00000
> +};
> +
> +unsigned HOST_WIDE_INT fli_value_df[32] =
> +{
> +  0xbff0000000000000, 0x10000000000000, 0x3ef0000000000000, 0x3f00000000000000,
> +  0x3f70000000000000, 0x3f80000000000000, 0x3fb0000000000000, 0x3fc0000000000000,
> +  0x3fd0000000000000, 0x3fd4000000000000, 0x3fd8000000000000, 0x3fdc000000000000,
> +  0x3fe0000000000000, 0x3fe4000000000000, 0x3fe8000000000000, 0x3fec000000000000,
> +  0x3ff0000000000000, 0x3ff4000000000000, 0x3ff8000000000000, 0x3ffc000000000000,
> +  0x4000000000000000, 0x4004000000000000, 0x4008000000000000, 0x4010000000000000,
> +  0x4020000000000000, 0x4030000000000000, 0x4060000000000000, 0x4070000000000000,
> +  0x40e0000000000000, 0x40f0000000000000, 0x7ff0000000000000, 0x7ff8000000000000,
> +};
> +
> +const char *fli_value_print[32] =
> +{
> +  "-1.0", "min", "1.52587890625e-05", "3.0517578125e-05", "0.00390625", "0.0078125", "0.0625", "0.125",
> +  "0.25", "0.3125", "0.375", "0.4375", "0.5", "0.625", "0.75", "0.875",
> +  "1.0", "1.25", "1.5", "1.75", "2.0", "2.5", "3.0", "4.0",
> +  "8.0", "16.0", "128.0", "256.0", "32768.0", "65536.0", "inf", "nan"
> +};
> +
> +/* Find the index of TARGET in ARRAY, and return -1 if not found. */
> +
> +static int
> +find_index_in_array (unsigned HOST_WIDE_INT target, unsigned HOST_WIDE_INT *array, int len)
> +{
> +  if (array == NULL)
> +    return -1;
> +
> +  for (int i = 0; i < len; i++)
> +    {
> +      if (target == array[i])
> +       return i;
> +    }
> +  return -1;
> +}
> +
> +/* Return index of the FLI instruction table if rtx X is an immediate constant that
> +   can be moved using a single FLI instruction in zfa extension. -1 otherwise. */
> +
> +int
> +riscv_float_const_rtx_index_for_fli (rtx x)
> +{
> +  machine_mode mode = GET_MODE (x);
> +
> +  if (!TARGET_ZFA || mode == VOIDmode
> +      || !CONST_DOUBLE_P(x)
> +      || (mode == HFmode && !TARGET_ZFH)
> +      || (mode == SFmode && !TARGET_HARD_FLOAT)
> +      || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
> +    return -1;
> +
> +  if (!SCALAR_FLOAT_MODE_P (mode)
> +      || GET_MODE_BITSIZE (mode).to_constant () > HOST_BITS_PER_WIDE_INT
> +      /* Only support up to DF mode.  */
> +      || GET_MODE_BITSIZE (mode).to_constant () > GET_MODE_BITSIZE (DFmode))
> +    return -1;
> +
> +  unsigned HOST_WIDE_INT ival = 0;
> +
> +  long res[2];
> +  real_to_target (res,
> +                 CONST_DOUBLE_REAL_VALUE (x),
> +                 REAL_MODE_FORMAT (mode));
> +
> +  if (mode == DFmode)
> +    {
> +      int order = BYTES_BIG_ENDIAN ? 1 : 0;
> +      ival = zext_hwi (res[order], 32);
> +      ival |= (zext_hwi (res[1 - order], 32) << 32);
> +    }
> +  else
> +      ival = zext_hwi (res[0], 32);
> +
> +  switch (mode)
> +    {
> +      case SFmode:
> +       return find_index_in_array (ival, fli_value_sf, 32);
> +      case DFmode:
> +       return find_index_in_array (ival, fli_value_df, 32);
> +      case HFmode:
> +       return find_index_in_array (ival, fli_value_hf, 32);
> +      default:
> +       break;
> +    }
> +  return -1;
> +}
> +
>  /* Implement TARGET_LEGITIMATE_CONSTANT_P.  */
>
>  static bool
> @@ -826,6 +936,9 @@ riscv_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
>    if (GET_CODE (x) == HIGH)
>      return true;
>
> +  if (riscv_float_const_rtx_index_for_fli (x) != -1)
> +   return true;
> +
>    split_const (x, &base, &offset);
>    if (riscv_symbolic_constant_p (base, &type))
>      {
> @@ -1213,6 +1326,11 @@ riscv_const_insns (rtx x)
>        }
>
>      case CONST_DOUBLE:
> +      /* See if we can use FLI directly.  */
> +      if (riscv_float_const_rtx_index_for_fli (x) != -1)
> +       return 3;
> +      /* Fall through.  */
> +
>      case CONST_VECTOR:
>        /* We can use x0 to load floating-point zero.  */
>        return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
> @@ -1749,6 +1867,12 @@ riscv_legitimize_const_move (machine_mode mode, rtx dest, rtx src)
>        return;
>      }
>
> +  if (riscv_float_const_rtx_index_for_fli (src) != -1)
> +    {
> +      riscv_emit_set (dest, src);
> +      return;
> +    }
> +
>    /* Split moves of symbolic constants into high/low pairs.  */
>    if (riscv_split_symbol (dest, src, MAX_MACHINE_MODE, &src, FALSE))
>      {
> @@ -2770,12 +2894,19 @@ riscv_split_64bit_move_p (rtx dest, rtx src)
>    if (TARGET_64BIT)
>      return false;
>
> +  /* There is no need to split if the FLI instruction in the `Zfa` extension can be used. */
> +  if (riscv_float_const_rtx_index_for_fli (src) != -1)
> +    return false;
> +
>    /* Allow FPR <-> FPR and FPR <-> MEM moves, and permit the special case
>       of zeroing an FPR with FCVT.D.W.  */
>    if (TARGET_DOUBLE_FLOAT
>        && ((FP_REG_RTX_P (src) && FP_REG_RTX_P (dest))
>           || (FP_REG_RTX_P (dest) && MEM_P (src))
>           || (FP_REG_RTX_P (src) && MEM_P (dest))
> +         || (TARGET_ZFA
> +             && ((FP_REG_RTX_P (dest) && GP_REG_RTX_P (src))
> +             || (FP_REG_RTX_P (src) && GP_REG_RTX_P (dest))))
>           || (FP_REG_RTX_P (dest) && src == CONST0_RTX (GET_MODE (src)))))
>      return false;
>
> @@ -2857,6 +2988,8 @@ riscv_output_move (rtx dest, rtx src)
>           case 4:
>             return "fmv.x.s\t%0,%1";
>           case 8:
> +           if (!TARGET_64BIT && TARGET_ZFA)
> +             return "fmv.x.w\t%0,%1\n\tfmvh.x.d\t%N0,%1";
>             return "fmv.x.d\t%0,%1";
>           }
>
> @@ -2916,6 +3049,8 @@ riscv_output_move (rtx dest, rtx src)
>               case 8:
>                 if (TARGET_64BIT)
>                   return "fmv.d.x\t%0,%z1";
> +               else if (TARGET_ZFA && src != CONST0_RTX (mode))
> +                 return "fmvp.d.x\t%0,%1,%N1";
>                 /* in RV32, we can emulate fmv.d.x %0, x0 using fcvt.d.w */
>                 gcc_assert (src == CONST0_RTX (mode));
>                 return "fcvt.d.w\t%0,x0";
> @@ -2968,6 +3103,14 @@ riscv_output_move (rtx dest, rtx src)
>           case 8:
>             return "fld\t%0,%1";
>           }
> +
> +      if (src_code == CONST_DOUBLE && (riscv_float_const_rtx_index_for_fli (src) != -1))
> +       switch (width)
> +         {
> +           case 2: return "fli.h\t%0,%1";
> +           case 4: return "fli.s\t%0,%1";
> +           case 8: return "fli.d\t%0,%1";
> +         }
>      }
>    if (dest_code == REG && GP_REG_P (REGNO (dest)) && src_code == CONST_POLY_INT)
>      {
> @@ -4349,6 +4492,7 @@ riscv_memmodel_needs_release_fence (enum memmodel model)
>     'S' Print shift-index of single-bit mask OP.
>     'T' Print shift-index of inverted single-bit mask OP.
>     '~' Print w if TARGET_64BIT is true; otherwise not print anything.
> +   'N'  Print next register.
>
>     Note please keep this list and the list in riscv.md in sync.  */
>
> @@ -4533,6 +4677,9 @@ riscv_print_operand (FILE *file, rtx op, int letter)
>         output_addr_const (file, newop);
>         break;
>        }
> +    case 'N':
> +      fputs (reg_names[REGNO (op) + 1], file);
> +      break;
>      default:
>        switch (code)
>         {
> @@ -4549,6 +4696,24 @@ riscv_print_operand (FILE *file, rtx op, int letter)
>             output_address (mode, XEXP (op, 0));
>           break;
>
> +       case CONST_DOUBLE:
> +         {
> +           if (letter == 'z' && op == CONST0_RTX (GET_MODE (op)))
> +             {
> +               fputs (reg_names[GP_REG_FIRST], file);
> +               break;
> +             }
> +
> +           int fli_index = riscv_float_const_rtx_index_for_fli (op);
> +           if (fli_index == -1 || fli_index > 31)
> +             {
> +               output_operand_lossage ("invalid use of '%%%c'", letter);
> +               break;
> +             }
> +           asm_fprintf (file, "%s", fli_value_print[fli_index]);
> +           break;
> +         }
> +
>         default:
>           if (letter == 'z' && op == CONST0_RTX (GET_MODE (op)))
>             fputs (reg_names[GP_REG_FIRST], file);
> @@ -5897,7 +6062,8 @@ riscv_secondary_memory_needed (machine_mode mode, reg_class_t class1,
>    return (!riscv_v_ext_vector_mode_p (mode)
>           && GET_MODE_SIZE (mode).to_constant () > UNITS_PER_WORD
>           && (class1 == FP_REGS) != (class2 == FP_REGS)
> -         && !TARGET_XTHEADFMV);
> +         && !TARGET_XTHEADFMV
> +         && !TARGET_ZFA);
>  }
>
>  /* Implement TARGET_REGISTER_MOVE_COST.  */
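
For readers following the lookup in riscv_float_const_rtx_index_for_fli,
here is a minimal standalone sketch of the SFmode case. It assumes that the
host's float is IEEE 754 binary32. The table is copied from fli_value_sf in
the patch; the names fli_bits_sf and fli_index_sf are hypothetical and only
used for illustration. This is not the GCC implementation, which obtains
the bits from an rtx via real_to_target.

```c
#include <stdint.h>
#include <string.h>

/* IEEE 754 binary32 bit patterns of the 32 FLI.S immediates, copied from
   fli_value_sf in the patch.  */
static const uint32_t fli_bits_sf[32] =
{
  0xbf800000, 0x00800000, 0x37800000, 0x38000000,
  0x3b800000, 0x3c000000, 0x3d800000, 0x3e000000,
  0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000,
  0x3f000000, 0x3f200000, 0x3f400000, 0x3f600000,
  0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000,
  0x40000000, 0x40200000, 0x40400000, 0x40800000,
  0x41000000, 0x41800000, 0x43000000, 0x43800000,
  0x47000000, 0x47800000, 0x7f800000, 0x7fc00000
};

/* Hypothetical helper: return the FLI.S table index for VALUE, or -1 if
   no table entry has exactly the same bit pattern.  */
static int
fli_index_sf (float value)
{
  uint32_t bits;

  /* Reinterpret the float as its raw 32-bit encoding.  */
  memcpy (&bits, &value, sizeof bits);

  for (int i = 0; i < 32; i++)
    if (fli_bits_sf[i] == bits)
      return i;
  return -1;
}
```

With this sketch, fli_index_sf (1.0f) gives 16 and fli_index_sf (0.1f)
gives -1, since the match is on the exact bit pattern.
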
> diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> index 66fb07d6652..d438b281142 100644
> --- a/gcc/config/riscv/riscv.h
> +++ b/gcc/config/riscv/riscv.h
> @@ -377,6 +377,7 @@ ASM_MISA_SPEC
>  #define SIBCALL_REG_P(REGNO)   \
>    TEST_HARD_REG_BIT (reg_class_contents[SIBCALL_REGS], REGNO)
>
> +#define GP_REG_RTX_P(X) (REG_P (X) && GP_REG_P (REGNO (X)))
>  #define FP_REG_RTX_P(X) (REG_P (X) && FP_REG_P (REGNO (X)))
>
>  /* Use s0 as the frame pointer if it is so requested.  */
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index bc384d9aedf..f22e71b5a3a 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -59,6 +59,15 @@ (define_c_enum "unspec" [
>    UNSPEC_LROUND
>    UNSPEC_FMIN
>    UNSPEC_FMAX
> +  UNSPEC_RINT
> +  UNSPEC_ROUND
> +  UNSPEC_FLOOR
> +  UNSPEC_CEIL
> +  UNSPEC_BTRUNC
> +  UNSPEC_ROUNDEVEN
> +  UNSPEC_NEARBYINT
> +  UNSPEC_FMINM
> +  UNSPEC_FMAXM
>
>    ;; Stack tie
>    UNSPEC_TIE
> @@ -1232,6 +1241,26 @@ (define_insn "neg<mode>2"
>  ;;
>  ;;  ....................
>
> +(define_insn "fminm<mode>3"
> +  [(set (match_operand:ANYF                    0 "register_operand" "=f")
> +       (unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
> +                     (use (match_operand:ANYF 2 "register_operand" " f"))]
> +                    UNSPEC_FMINM))]
> +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> +  "fminm.<fmt>\t%0,%1,%2"
> +  [(set_attr "type" "fmove")
> +   (set_attr "mode" "<UNITMODE>")])
> +
> +(define_insn "fmaxm<mode>3"
> +  [(set (match_operand:ANYF                    0 "register_operand" "=f")
> +       (unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
> +                     (use (match_operand:ANYF 2 "register_operand" " f"))]
> +                    UNSPEC_FMAXM))]
> +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> +  "fmaxm.<fmt>\t%0,%1,%2"
> +  [(set_attr "type" "fmove")
> +   (set_attr "mode" "<UNITMODE>")])
> +
>  (define_insn "fmin<mode>3"
>    [(set (match_operand:ANYF                    0 "register_operand" "=f")
>         (unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
> @@ -1508,13 +1537,13 @@ (define_expand "movhf"
>  })
>
>  (define_insn "*movhf_hardfloat"
> -  [(set (match_operand:HF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
> -       (match_operand:HF 1 "move_operand"         " f,G,m,f,G,*r,*f,*G*r,*m,*r"))]
> +  [(set (match_operand:HF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r,  *r,*r,*m")
> +       (match_operand:HF 1 "move_operand"         " f,Zf,G,m,f,G,*r,*f,*G*r,*m,*r"))]
>    "TARGET_ZFHMIN
>     && (register_operand (operands[0], HFmode)
>         || reg_or_0_operand (operands[1], HFmode))"
>    { return riscv_output_move (operands[0], operands[1]); }
> -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "mode" "HF")])
>
>  (define_insn "*movhf_softfloat"
> @@ -1580,6 +1609,26 @@ (define_insn "l<rint_pattern><ANYF:mode><GPR:mode>2"
>    [(set_attr "type" "fcvt")
>     (set_attr "mode" "<ANYF:MODE>")])
>
> +(define_insn "<round_pattern><ANYF:mode>2"
> +  [(set (match_operand:ANYF     0 "register_operand" "=f")
> +       (unspec:ANYF
> +           [(match_operand:ANYF 1 "register_operand" " f")]
> +       ROUND))]
> +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> +  "fround.<ANYF:fmt>\t%0,%1,<round_rm>"
> +  [(set_attr "type" "fcvt")
> +   (set_attr "mode" "<ANYF:MODE>")])
> +
> +(define_insn "rint<ANYF:mode>2"
> +  [(set (match_operand:ANYF     0 "register_operand" "=f")
> +       (unspec:ANYF
> +           [(match_operand:ANYF 1 "register_operand" " f")]
> +       UNSPEC_RINT))]
> +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> +  "froundnx.<ANYF:fmt>\t%0,%1"
> +  [(set_attr "type" "fcvt")
> +   (set_attr "mode" "<ANYF:MODE>")])
> +
>  ;;
>  ;;  ....................
>  ;;
> @@ -1839,13 +1888,13 @@ (define_expand "movsf"
>  })
>
>  (define_insn "*movsf_hardfloat"
> -  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
> -       (match_operand:SF 1 "move_operand"         " f,G,m,f,G,*r,*f,*G*r,*m,*r"))]
> +  [(set (match_operand:SF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r,  *r,*r,*m")
> +       (match_operand:SF 1 "move_operand"         " f,Zf,G,m,f,G,*r,*f,*G*r,*m,*r"))]
>    "TARGET_HARD_FLOAT
>     && (register_operand (operands[0], SFmode)
>         || reg_or_0_operand (operands[1], SFmode))"
>    { return riscv_output_move (operands[0], operands[1]); }
> -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "mode" "SF")])
>
>  (define_insn "*movsf_softfloat"
> @@ -1873,23 +1922,23 @@ (define_expand "movdf"
>  ;; In RV32, we lack fmv.x.d and fmv.d.x.  Go through memory instead.
>  ;; (However, we can still use fcvt.d.w to zero a floating-point register.)
>  (define_insn "*movdf_hardfloat_rv32"
> -  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,m,*th_f_fmv,*th_r_fmv,  *r,*r,*m")
> -       (match_operand:DF 1 "move_operand"         " f,G,m,f,G,*th_r_fmv,*th_f_fmv,*r*G,*m,*r"))]
> +  [(set (match_operand:DF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*th_f_fmv,*th_r_fmv,  *r,*r,*m")
> +       (match_operand:DF 1 "move_operand"         " f,Zf,G,m,f,G,*th_r_fmv,*th_f_fmv,*r*G,*m,*r"))]
>    "!TARGET_64BIT && TARGET_DOUBLE_FLOAT
>     && (register_operand (operands[0], DFmode)
>         || reg_or_0_operand (operands[1], DFmode))"
>    { return riscv_output_move (operands[0], operands[1]); }
> -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "mode" "DF")])
>
>  (define_insn "*movdf_hardfloat_rv64"
> -  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
> -       (match_operand:DF 1 "move_operand"         " f,G,m,f,G,*r,*f,*r*G,*m,*r"))]
> +  [(set (match_operand:DF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r,  *r,*r,*m")
> +       (match_operand:DF 1 "move_operand"         " f,Zf,G,m,f,G,*r,*f,*r*G,*m,*r"))]
>    "TARGET_64BIT && TARGET_DOUBLE_FLOAT
>     && (register_operand (operands[0], DFmode)
>         || reg_or_0_operand (operands[1], DFmode))"
>    { return riscv_output_move (operands[0], operands[1]); }
> -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "mode" "DF")])
>
>  (define_insn "*movdf_softfloat"
> @@ -2494,16 +2543,23 @@ (define_expand "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4"
>    rtx op0 = operands[0];
>    rtx op1 = operands[1];
>    rtx op2 = operands[2];
> -  rtx tmp = gen_reg_rtx (SImode);
> -  rtx cmp = gen_rtx_<QUIET_PATTERN> (<X:MODE>mode, op1, op2);
> -  rtx frflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, const0_rtx),
> -                                        UNSPECV_FRFLAGS);
> -  rtx fsflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, tmp),
> -                                        UNSPECV_FSFLAGS);
> -
> -  emit_insn (gen_rtx_SET (tmp, frflags));
> -  emit_insn (gen_rtx_SET (op0, cmp));
> -  emit_insn (fsflags);
> +
> +  if (TARGET_ZFA)
> +    emit_insn (gen_f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa(op0, op1, op2));
> +  else
> +    {
> +      rtx tmp = gen_reg_rtx (SImode);
> +      rtx cmp = gen_rtx_<QUIET_PATTERN> (<X:MODE>mode, op1, op2);
> +      rtx frflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, const0_rtx),
> +                                            UNSPECV_FRFLAGS);
> +      rtx fsflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, tmp),
> +                                            UNSPECV_FSFLAGS);
> +
> +      emit_insn (gen_rtx_SET (tmp, frflags));
> +      emit_insn (gen_rtx_SET (op0, cmp));
> +      emit_insn (fsflags);
> +    }
> +
>    if (HONOR_SNANS (<ANYF:MODE>mode))
>      emit_insn (gen_rtx_UNSPEC_VOLATILE (<ANYF:MODE>mode,
>                                         gen_rtvec (2, op1, op2),
> @@ -2511,6 +2567,18 @@ (define_expand "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4"
>    DONE;
>  })
>
> +(define_insn "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa"
> +   [(set (match_operand:X      0 "register_operand" "=r")
> +        (unspec:X
> +         [(match_operand:ANYF 1 "register_operand" " f")
> +          (match_operand:ANYF 2 "register_operand" " f")]
> +         QUIET_COMPARISON))]
> +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> +  "f<quiet_pattern>q.<fmt>\t%0,%1,%2"
> +  [(set_attr "type" "fcmp")
> +   (set_attr "mode" "<UNITMODE>")
> +   (set (attr "length") (const_int 16))])
> +
>  (define_insn "*seq_zero_<X:mode><GPR:mode>"
>    [(set (match_operand:GPR       0 "register_operand" "=r")
>         (eq:GPR (match_operand:X 1 "register_operand" " r")
> diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
> new file mode 100644
> index 00000000000..26895b76fa4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O2" } */
> +
> +extern void abort(void);
> +extern float a, b;
> +extern double c, d;
> +
> +void
> +foo()
> +{
> +  if ((__builtin_isless(a, b) ||  __builtin_islessequal(c, d))
> +      && (__builtin_islessequal(a, b)|| __builtin_isless(c, d)))
> +    abort();
> +}
> +
> +/* { dg-final { scan-assembler-times "fleq.s" 1 } } */
> +/* { dg-final { scan-assembler-times "fltq.s" 1 } } */
> +/* { dg-final { scan-assembler-times "fleq.d" 1 } } */
> +/* { dg-final { scan-assembler-times "fltq.d" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
> new file mode 100644
> index 00000000000..4ccd6a7dd78
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O2" } */
> +
> +extern void abort(void);
> +extern float a, b;
> +extern double c, d;
> +
> +void
> +foo()
> +{
> +  if ((__builtin_isless(a, b) ||  __builtin_islessequal(c, d))
> +      && (__builtin_islessequal(a, b)|| __builtin_isless(c, d)))
> +    abort();
> +}
> +
> +/* { dg-final { scan-assembler-times "fleq.s" 1 } } */
> +/* { dg-final { scan-assembler-times "fltq.s" 1 } } */
> +/* { dg-final { scan-assembler-times "fleq.d" 1 } } */
> +/* { dg-final { scan-assembler-times "fltq.d" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
> new file mode 100644
> index 00000000000..c4da04797aa
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
> @@ -0,0 +1,79 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O0" } */
> +
> +void foo_float32 ()
> +{
> +  volatile float a;
> +  a = -1.0;
> +  a = 1.1754944e-38;
> +  a = 1.0/(1 << 16);
> +  a = 1.0/(1 << 15);
> +  a = 1.0/(1 << 8);
> +  a = 1.0/(1 << 7);
> +  a = 1.0/(1 << 4);
> +  a = 1.0/(1 << 3);
> +  a = 1.0/(1 << 2);
> +  a = 0.3125;
> +  a = 0.375;
> +  a = 0.4375;
> +  a = 0.5;
> +  a = 0.625;
> +  a = 0.75;
> +  a = 0.875;
> +  a = 1.0;
> +  a = 1.25;
> +  a = 1.5;
> +  a = 1.75;
> +  a = 2.0;
> +  a = 2.5;
> +  a = 3.0;
> +  a = 1.0*(1 << 2);
> +  a = 1.0*(1 << 3);
> +  a = 1.0*(1 << 4);
> +  a = 1.0*(1 << 7);
> +  a = 1.0*(1 << 8);
> +  a = 1.0*(1 << 15);
> +  a = 1.0*(1 << 16);
> +  a = __builtin_inff ();
> +  a = __builtin_nanf ("");
> +}
> +
> +void foo_double64 ()
> +{
> +  volatile double a;
> +  a = -1.0;
> +  a = 2.2250738585072014E-308;
> +  a = 1.0/(1 << 16);
> +  a = 1.0/(1 << 15);
> +  a = 1.0/(1 << 8);
> +  a = 1.0/(1 << 7);
> +  a = 1.0/(1 << 4);
> +  a = 1.0/(1 << 3);
> +  a = 1.0/(1 << 2);
> +  a = 0.3125;
> +  a = 0.375;
> +  a = 0.4375;
> +  a = 0.5;
> +  a = 0.625;
> +  a = 0.75;
> +  a = 0.875;
> +  a = 1.0;
> +  a = 1.25;
> +  a = 1.5;
> +  a = 1.75;
> +  a = 2.0;
> +  a = 2.5;
> +  a = 3.0;
> +  a = 1.0*(1 << 2);
> +  a = 1.0*(1 << 3);
> +  a = 1.0*(1 << 4);
> +  a = 1.0*(1 << 7);
> +  a = 1.0*(1 << 8);
> +  a = 1.0*(1 << 15);
> +  a = 1.0*(1 << 16);
> +  a = __builtin_inf ();
> +  a = __builtin_nan ("");
> +}
> +
> +/* { dg-final { scan-assembler-times "fli.s" 32 } } */
> +/* { dg-final { scan-assembler-times "fli.d" 32 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
> new file mode 100644
> index 00000000000..bcffe9d2c82
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
> @@ -0,0 +1,41 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32imafdc_zfa_zfh -mabi=ilp32d -O0" } */
> +
> +void foo_float16 ()
> +{
> +  volatile _Float16 a;
> +  a = -1.0;
> +  a = 6.104E-5;
> +  a = 1.0/(1 << 16);
> +  a = 1.0/(1 << 15);
> +  a = 1.0/(1 << 8);
> +  a = 1.0/(1 << 7);
> +  a = 1.0/(1 << 4);
> +  a = 1.0/(1 << 3);
> +  a = 1.0/(1 << 2);
> +  a = 0.3125;
> +  a = 0.375;
> +  a = 0.4375;
> +  a = 0.5;
> +  a = 0.625;
> +  a = 0.75;
> +  a = 0.875;
> +  a = 1.0;
> +  a = 1.25;
> +  a = 1.5;
> +  a = 1.75;
> +  a = 2.0;
> +  a = 2.5;
> +  a = 3.0;
> +  a = 1.0*(1 << 2);
> +  a = 1.0*(1 << 3);
> +  a = 1.0*(1 << 4);
> +  a = 1.0*(1 << 7);
> +  a = 1.0*(1 << 8);
> +  a = 1.0*(1 << 15);
> +  a = 1.0*(1 << 16);
> +  a = __builtin_inff16 ();
> +  a = __builtin_nanf16 ("");
> +}
> +
> +/* { dg-final { scan-assembler-times "fli.h" 32 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
> new file mode 100644
> index 00000000000..13aa7b5f846
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
> @@ -0,0 +1,41 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64imafdc_zfa_zfh -mabi=lp64d -O0" } */
> +
> +void foo_float16 ()
> +{
> +  volatile _Float16 a;
> +  a = -1.0;
> +  a = 6.104E-5;
> +  a = 1.0/(1 << 16);
> +  a = 1.0/(1 << 15);
> +  a = 1.0/(1 << 8);
> +  a = 1.0/(1 << 7);
> +  a = 1.0/(1 << 4);
> +  a = 1.0/(1 << 3);
> +  a = 1.0/(1 << 2);
> +  a = 0.3125;
> +  a = 0.375;
> +  a = 0.4375;
> +  a = 0.5;
> +  a = 0.625;
> +  a = 0.75;
> +  a = 0.875;
> +  a = 1.0;
> +  a = 1.25;
> +  a = 1.5;
> +  a = 1.75;
> +  a = 2.0;
> +  a = 2.5;
> +  a = 3.0;
> +  a = 1.0*(1 << 2);
> +  a = 1.0*(1 << 3);
> +  a = 1.0*(1 << 4);
> +  a = 1.0*(1 << 7);
> +  a = 1.0*(1 << 8);
> +  a = 1.0*(1 << 15);
> +  a = 1.0*(1 << 16);
> +  a = __builtin_inff16 ();
> +  a = __builtin_nanf16 ("");
> +}
> +
> +/* { dg-final { scan-assembler-times "fli.h" 32 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli.c b/gcc/testsuite/gcc.target/riscv/zfa-fli.c
> new file mode 100644
> index 00000000000..b6d41cf460f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli.c
> @@ -0,0 +1,79 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O0" } */
> +
> +void foo_float32 ()
> +{
> +  volatile float a;
> +  a = -1.0;
> +  a = 1.1754944e-38;
> +  a = 1.0/(1 << 16);
> +  a = 1.0/(1 << 15);
> +  a = 1.0/(1 << 8);
> +  a = 1.0/(1 << 7);
> +  a = 1.0/(1 << 4);
> +  a = 1.0/(1 << 3);
> +  a = 1.0/(1 << 2);
> +  a = 0.3125;
> +  a = 0.375;
> +  a = 0.4375;
> +  a = 0.5;
> +  a = 0.625;
> +  a = 0.75;
> +  a = 0.875;
> +  a = 1.0;
> +  a = 1.25;
> +  a = 1.5;
> +  a = 1.75;
> +  a = 2.0;
> +  a = 2.5;
> +  a = 3.0;
> +  a = 1.0*(1 << 2);
> +  a = 1.0*(1 << 3);
> +  a = 1.0*(1 << 4);
> +  a = 1.0*(1 << 7);
> +  a = 1.0*(1 << 8);
> +  a = 1.0*(1 << 15);
> +  a = 1.0*(1 << 16);
> +  a = __builtin_inff ();
> +  a = __builtin_nanf ("");
> +}
> +
> +void foo_double64 ()
> +{
> +  volatile double a;
> +  a = -1.0;
> +  a = 2.2250738585072014E-308;
> +  a = 1.0/(1 << 16);
> +  a = 1.0/(1 << 15);
> +  a = 1.0/(1 << 8);
> +  a = 1.0/(1 << 7);
> +  a = 1.0/(1 << 4);
> +  a = 1.0/(1 << 3);
> +  a = 1.0/(1 << 2);
> +  a = 0.3125;
> +  a = 0.375;
> +  a = 0.4375;
> +  a = 0.5;
> +  a = 0.625;
> +  a = 0.75;
> +  a = 0.875;
> +  a = 1.0;
> +  a = 1.25;
> +  a = 1.5;
> +  a = 1.75;
> +  a = 2.0;
> +  a = 2.5;
> +  a = 3.0;
> +  a = 1.0*(1 << 2);
> +  a = 1.0*(1 << 3);
> +  a = 1.0*(1 << 4);
> +  a = 1.0*(1 << 7);
> +  a = 1.0*(1 << 8);
> +  a = 1.0*(1 << 15);
> +  a = 1.0*(1 << 16);
> +  a = __builtin_inf ();
> +  a = __builtin_nan ("");
> +}
> +
> +/* { dg-final { scan-assembler-times "fli.s" 32 } } */
> +/* { dg-final { scan-assembler-times "fli.d" 32 } } */
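For readers following the fli tests: each constant assigned above is expected to match one of the 32 FLI table entries by exact bit pattern. Below is a minimal standalone sketch of that lookup for single precision. It is not the patch's code; the table values are copied from fli_value_sf in the patch, while the helper name fli_index_sf and the use of a host float are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Single-precision bit patterns, copied from fli_value_sf in the patch.  */
static const uint32_t fli_value_sf[32] = {
  0xbf800000, 0x00800000, 0x37800000, 0x38000000,
  0x3b800000, 0x3c000000, 0x3d800000, 0x3e000000,
  0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000,
  0x3f000000, 0x3f200000, 0x3f400000, 0x3f600000,
  0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000,
  0x40000000, 0x40200000, 0x40400000, 0x40800000,
  0x41000000, 0x41800000, 0x43000000, 0x43800000,
  0x47000000, 0x47800000, 0x7f800000, 0x7fc00000
};

/* Hypothetical helper (not part of the patch): return the FLI.S table
   index whose bit pattern equals V, or -1 if V is not loadable.  */
static int
fli_index_sf (float v)
{
  uint32_t bits;

  /* The sketch assumes the host float is IEEE 754 binary32.  */
  assert (sizeof (float) == sizeof (uint32_t));
  memcpy (&bits, &v, sizeof bits);

  for (int i = 0; i < 32; i++)
    if (fli_value_sf[i] == bits)
      return i;
  return -1;
}
```

Because the match is on the exact encoding, a value such as 0.1f, which has no table entry, yields -1 and would still need a constant-pool load or an integer move.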
> diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
> new file mode 100644
> index 00000000000..5a52adce36a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32g_zfa -mabi=ilp32 -O0" } */
> +
> +double foo(long long a)
> +{
> +  return (double)(a + 3);
> +}
> +
> +/* { dg-final { scan-assembler-times "fmvp.d.x" 1 } } */
> +/* { dg-final { scan-assembler-times "fmvh.x.d" 1 } } */
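As a rough model of what this RV32 test exercises: with -mabi=ilp32 a double lives in a pair of 32-bit integer registers, so fmvp.d.x packs a pair into one FPR, and fmv.x.w followed by fmvh.x.d unpacks it again. The sketch below mimics that data movement in portable C. The gpr_pair type and both function names are hypothetical and only illustrate which half goes where.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of two adjacent 32-bit integer registers.  */
typedef struct
{
  uint32_t lo;  /* bits 31:0, the first register of the pair */
  uint32_t hi;  /* bits 63:32, the "next" register (%N in the patch) */
} gpr_pair;

/* Models fmv.x.w (low half) followed by fmvh.x.d (high half).  */
static gpr_pair
split_double (double d)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);

  gpr_pair p = { (uint32_t) bits, (uint32_t) (bits >> 32) };
  return p;
}

/* Models fmvp.d.x rd, rs1, rs2: rs1 supplies the low word and rs2
   the high word of the 64-bit value.  */
static double
join_double (gpr_pair p)
{
  uint64_t bits = ((uint64_t) p.hi << 32) | p.lo;
  double d;
  memcpy (&d, &bits, sizeof d);
  return d;
}
```

Splitting 1.0 this way gives a high word of 0x3ff00000 and a low word of 0, and joining the two halves returns the original value.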
> diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
> new file mode 100644
> index 00000000000..b53601d6e1f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
> @@ -0,0 +1,42 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O2" } */
> +
> +extern float a;
> +extern double b;
> +
> +void foo (float *x, double *y)
> +{
> +  {
> +    *x = __builtin_roundf (a);
> +    *y = __builtin_round (b);
> +  }
> +  {
> +    *x = __builtin_floorf (a);
> +    *y = __builtin_floor (b);
> +  }
> +  {
> +    *x = __builtin_ceilf (a);
> +    *y = __builtin_ceil (b);
> +  }
> +  {
> +    *x = __builtin_truncf (a);
> +    *y = __builtin_trunc (b);
> +  }
> +  {
> +    *x = __builtin_roundevenf (a);
> +    *y = __builtin_roundeven (b);
> +  }
> +  {
> +    *x = __builtin_nearbyintf (a);
> +    *y = __builtin_nearbyint (b);
> +  }
> +  {
> +    *x = __builtin_rintf (a);
> +    *y = __builtin_rint (b);
> +  }
> +}
> +
> +/* { dg-final { scan-assembler-times "fround.s" 6 } } */
> +/* { dg-final { scan-assembler-times "fround.d" 6 } } */
> +/* { dg-final { scan-assembler-times "froundnx.s" 1 } } */
> +/* { dg-final { scan-assembler-times "froundnx.d" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fround.c b/gcc/testsuite/gcc.target/riscv/zfa-fround.c
> new file mode 100644
> index 00000000000..c10de82578e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zfa-fround.c
> @@ -0,0 +1,42 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O2" } */
> +
> +extern float a;
> +extern double b;
> +
> +void foo (float *x, double *y)
> +{
> +  {
> +    *x = __builtin_roundf (a);
> +    *y = __builtin_round (b);
> +  }
> +  {
> +    *x = __builtin_floorf (a);
> +    *y = __builtin_floor (b);
> +  }
> +  {
> +    *x = __builtin_ceilf (a);
> +    *y = __builtin_ceil (b);
> +  }
> +  {
> +    *x = __builtin_truncf (a);
> +    *y = __builtin_trunc (b);
> +  }
> +  {
> +    *x = __builtin_roundevenf (a);
> +    *y = __builtin_roundeven (b);
> +  }
> +  {
> +    *x = __builtin_nearbyintf (a);
> +    *y = __builtin_nearbyint (b);
> +  }
> +  {
> +    *x = __builtin_rintf (a);
> +    *y = __builtin_rint (b);
> +  }
> +}
> +
> +/* { dg-final { scan-assembler-times "fround.s" 6 } } */
> +/* { dg-final { scan-assembler-times "fround.d" 6 } } */
> +/* { dg-final { scan-assembler-times "froundnx.s" 1 } } */
> +/* { dg-final { scan-assembler-times "froundnx.d" 1 } } */
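The expected counts in the two fround tests (six fround, one froundnx per format) follow from how the seven builtins map to instructions. The table below is a small sketch of that mapping. The rounding-mode names for the first six rows come from the round_rm attribute in the patch; the rint row is an assumption inferred from the test expectations, since that pattern is not quoted here. All identifiers are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One row per builtin exercised by the fround tests.  */
struct zfa_round_entry
{
  const char *builtin;  /* name without the __builtin_ prefix */
  const char *insn;     /* instruction expected in the assembly */
  const char *rm;       /* rounding-mode operand */
};

static const struct zfa_round_entry zfa_round_table[] = {
  { "round",     "fround",   "rmm" },
  { "floor",     "fround",   "rdn" },
  { "ceil",      "fround",   "rup" },
  { "trunc",     "fround",   "rtz" },
  { "roundeven", "fround",   "rne" },
  { "nearbyint", "fround",   "dyn" },
  /* Assumption: rint is the one froundnx user, with dynamic rounding.  */
  { "rint",      "froundnx", "dyn" },
};

#define ZFA_ROUND_ENTRIES \
  (sizeof zfa_round_table / sizeof zfa_round_table[0])

/* Return the table row for BUILTIN, or NULL if it is not listed.  */
static const struct zfa_round_entry *
zfa_round_lookup (const char *builtin)
{
  for (size_t i = 0; i < ZFA_ROUND_ENTRIES; i++)
    if (strcmp (zfa_round_table[i].builtin, builtin) == 0)
      return &zfa_round_table[i];
  return NULL;
}

/* Count how many builtins are expected to expand to INSN.  */
static size_t
zfa_count_insn (const char *insn)
{
  size_t n = 0;
  for (size_t i = 0; i < ZFA_ROUND_ENTRIES; i++)
    if (strcmp (zfa_round_table[i].insn, insn) == 0)
      n++;
  return n;
}
```

Counting rows per instruction reproduces the 6 and 1 that the scan-assembler-times directives check for each of the .s and .d formats.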
> --
> 2.17.1
>


* Re: [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2.
  2023-05-05 15:03 ` Christoph Müllner
@ 2023-05-05 15:04   ` Christoph Müllner
  2023-05-05 15:12     ` Palmer Dabbelt
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Müllner @ 2023-05-05 15:04 UTC (permalink / raw)
  To: Jin Ma; +Cc: gcc-patches, jeffreyalaw, kito.cheng, kito.cheng, palmer, ijinma

What I forgot to mention:
Zfa is frozen and in public review:
  https://groups.google.com/a/groups.riscv.org/g/isa-dev/c/SED4ntBkabg

On Fri, May 5, 2023 at 5:03 PM Christoph Müllner
<christoph.muellner@vrull.eu> wrote:
>
> On Wed, Apr 19, 2023 at 11:58 AM Jin Ma <jinma@linux.alibaba.com> wrote:
> >
> > This patch adds the 'Zfa' extension for riscv, which is based on:
> >   https://github.com/riscv/riscv-isa-manual/commits/zfb
> >   https://github.com/riscv/riscv-isa-manual/commit/1f038182810727f5feca311072e630d6baac51da
> >
> > The binutils-gdb for 'Zfa' extension:
> >   https://github.com/a4lg/binutils-gdb/commits/riscv-zfa
> >
> > What needs special explanation is:
> > 1, The immediate operand of the FLI.H/S/D instructions is written in assembly as a
> >   floating-point value, in scientific notation when rs1 is 1 or 2, and as a decimal number
> >   otherwise.
> >
> >   Related llvm link:
> >     https://reviews.llvm.org/D145645
> >   Related discussion link:
> >     https://github.com/riscv/riscv-isa-manual/issues/980
> >
> > 2, According to riscv-spec, "The FCVTMOD.W.D instruction was added principally to
> >   accelerate the processing of JavaScript Numbers.", so it seems that no implementation
> >   is required.
> >
> > 3, The instructions FMINM and FMAXM correspond to the C23 library functions fminimum and fmaximum.
> >   Therefore, this patch has simply implemented the pattern of fminm<hf\sf\df>3 and
> >   fmaxm<hf\sf\df>3 to prepare for later.
> >
> > gcc/ChangeLog:
> >
> >         * common/config/riscv/riscv-common.cc: Add zfa extension version.
> >         * config/riscv/constraints.md (Zf): Constrain the floating point number that the
> >         instructions FLI.H/S/D can load.
> >         ((TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS): enable FMVP.D.X and FMVH.X.D.
> >         * config/riscv/iterators.md (ceil): New.
> >         * config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): New.
> >         * config/riscv/riscv.cc (find_index_in_array): New.
> >         (riscv_float_const_rtx_index_for_fli): Get the index of the floating-point number that
> >         the instructions FLI.H/S/D can mov.
> >         (riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be used, memory is not applicable.
> >         (riscv_const_insns): The cost of FLI.H/S/D is 3.
> >         (riscv_legitimize_const_move): Likewise.
> >         (riscv_split_64bit_move_p): If instruction FLI.H/S/D can be used, no split is required.
> >         (riscv_output_move): Output the mov instructions in zfa extension.
> >         (riscv_print_operand): Output the floating-point value of the FLI.H/S/D immediate in assembly
> >         (riscv_secondary_memory_needed): Likewise.
> >         * config/riscv/riscv.h (GP_REG_RTX_P): New.
> >         * config/riscv/riscv.md (fminm<mode>3): New.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/riscv/zfa-fleq-fltq-rv32.c: New test.
> >         * gcc.target/riscv/zfa-fleq-fltq.c: New test.
> >         * gcc.target/riscv/zfa-fli-rv32.c: New test.
> >         * gcc.target/riscv/zfa-fli-zfh-rv32.c: New test.
> >         * gcc.target/riscv/zfa-fli-zfh.c: New test.
> >         * gcc.target/riscv/zfa-fli.c: New test.
> >         * gcc.target/riscv/zfa-fmovh-fmovp-rv32.c: New test.
> >         * gcc.target/riscv/zfa-fround-rv32.c: New test.
> >         * gcc.target/riscv/zfa-fround.c: New test.
> > ---
> >  gcc/common/config/riscv/riscv-common.cc       |   4 +
> >  gcc/config/riscv/constraints.md               |  11 +-
> >  gcc/config/riscv/iterators.md                 |   5 +
> >  gcc/config/riscv/riscv-opts.h                 |   3 +
> >  gcc/config/riscv/riscv-protos.h               |   1 +
> >  gcc/config/riscv/riscv.cc                     | 168 +++++++++++++++++-
> >  gcc/config/riscv/riscv.h                      |   1 +
> >  gcc/config/riscv/riscv.md                     | 112 +++++++++---
> >  .../gcc.target/riscv/zfa-fleq-fltq-rv32.c     |  19 ++
> >  .../gcc.target/riscv/zfa-fleq-fltq.c          |  19 ++
> >  gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c |  79 ++++++++
> >  .../gcc.target/riscv/zfa-fli-zfh-rv32.c       |  41 +++++
> >  gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c  |  41 +++++
> >  gcc/testsuite/gcc.target/riscv/zfa-fli.c      |  79 ++++++++
> >  .../gcc.target/riscv/zfa-fmovh-fmovp-rv32.c   |  10 ++
> >  .../gcc.target/riscv/zfa-fround-rv32.c        |  42 +++++
> >  gcc/testsuite/gcc.target/riscv/zfa-fround.c   |  42 +++++
> >  17 files changed, 652 insertions(+), 25 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround.c
> >
> > diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
> > index 309a52def75..f9fce6bcc38 100644
> > --- a/gcc/common/config/riscv/riscv-common.cc
> > +++ b/gcc/common/config/riscv/riscv-common.cc
> > @@ -217,6 +217,8 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
> >    {"zfh",       ISA_SPEC_CLASS_NONE, 1, 0},
> >    {"zfhmin",    ISA_SPEC_CLASS_NONE, 1, 0},
> >
> > +  {"zfa",     ISA_SPEC_CLASS_NONE, 0, 2},
> > +
> >    {"zmmul", ISA_SPEC_CLASS_NONE, 1, 0},
> >
> >    {"svinval", ISA_SPEC_CLASS_NONE, 1, 0},
> > @@ -1260,6 +1262,8 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
> >    {"zfhmin",    &gcc_options::x_riscv_zf_subext, MASK_ZFHMIN},
> >    {"zfh",       &gcc_options::x_riscv_zf_subext, MASK_ZFH},
> >
> > +  {"zfa",       &gcc_options::x_riscv_zf_subext, MASK_ZFA},
> > +
> >    {"zmmul", &gcc_options::x_riscv_zm_subext, MASK_ZMMUL},
> >
> >    {"svinval", &gcc_options::x_riscv_sv_subext, MASK_SVINVAL},
> > diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
> > index c448e6b37e9..62d9094f966 100644
> > --- a/gcc/config/riscv/constraints.md
> > +++ b/gcc/config/riscv/constraints.md
> > @@ -118,6 +118,13 @@ (define_constraint "T"
> >    (and (match_operand 0 "move_operand")
> >         (match_test "CONSTANT_P (op)")))
> >
> > +;; Zfa constraints.
> > +
> > +(define_constraint "Zf"
> > +  "A floating point number that can be loaded using instruction `fli` in zfa."
> > +  (and (match_code "const_double")
> > +       (match_test "(riscv_float_const_rtx_index_for_fli (op) != -1)")))
> > +
> >  ;; Vector constraints.
> >
> >  (define_register_constraint "vr" "TARGET_VECTOR ? V_REGS : NO_REGS"
> > @@ -183,8 +190,8 @@ (define_memory_constraint "Wdm"
> >
> >  ;; Vendor ISA extension constraints.
> >
> > -(define_register_constraint "th_f_fmv" "TARGET_XTHEADFMV ? FP_REGS : NO_REGS"
> > +(define_register_constraint "th_f_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS"
> >    "A floating-point register for XTheadFmv.")
> >
> > -(define_register_constraint "th_r_fmv" "TARGET_XTHEADFMV ? GR_REGS : NO_REGS"
> > +(define_register_constraint "th_r_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? GR_REGS : NO_REGS"
> >    "An integer register for XTheadFmv.")
>
> These are vendor extension constraints with the prefix "th_".
> I would avoid using them in code that targets standard extensions.
>
> I see two ways here:
> a) Create two new constraints at the top of the file. E.g.:
>     - "F" - "A floating-point register (no fall-back for Zfinx)" and
>     - "rF" - "An integer register in case FP registers are available".
> b) Move to top and rename these two constraints (and adjust
> movdf_hardfloat_rv32 accordingly)
>
> I would prefer b), and would even go so far as to do this in a separate
> commit that comes before the Zfa support patch.
>
>
> I've applied the patch on top of today's master (with --3way) and
> successfully tested it:
> Tested-by: Christoph Müllner <christoph.muellner@vrull.eu>
>
> > diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
> > index 9b767038452..c81b08e3cc5 100644
> > --- a/gcc/config/riscv/iterators.md
> > +++ b/gcc/config/riscv/iterators.md
> > @@ -288,3 +288,8 @@ (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
> >  (define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET "le")])
> >  (define_int_attr QUIET_PATTERN [(UNSPEC_FLT_QUIET "LT") (UNSPEC_FLE_QUIET "LE")])
> >
> > +(define_int_iterator ROUND [UNSPEC_ROUND UNSPEC_FLOOR UNSPEC_CEIL UNSPEC_BTRUNC UNSPEC_ROUNDEVEN UNSPEC_NEARBYINT])
> > +(define_int_attr round_pattern [(UNSPEC_ROUND "round") (UNSPEC_FLOOR "floor") (UNSPEC_CEIL "ceil")
> > +                               (UNSPEC_BTRUNC "btrunc") (UNSPEC_ROUNDEVEN "roundeven") (UNSPEC_NEARBYINT "nearbyint")])
> > +(define_int_attr round_rm [(UNSPEC_ROUND "rmm") (UNSPEC_FLOOR "rdn") (UNSPEC_CEIL "rup")
> > +                          (UNSPEC_BTRUNC "rtz") (UNSPEC_ROUNDEVEN "rne") (UNSPEC_NEARBYINT "dyn")])
> > \ No newline at end of file
> > diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
> > index cf0cd669be4..87b72efd12e 100644
> > --- a/gcc/config/riscv/riscv-opts.h
> > +++ b/gcc/config/riscv/riscv-opts.h
> > @@ -172,6 +172,9 @@ enum stack_protector_guard {
> >  #define TARGET_ZFHMIN ((riscv_zf_subext & MASK_ZFHMIN) != 0)
> >  #define TARGET_ZFH    ((riscv_zf_subext & MASK_ZFH) != 0)
> >
> > +#define MASK_ZFA   (1 << 0)
> > +#define TARGET_ZFA    ((riscv_zf_subext & MASK_ZFA) != 0)
> > +
> >  #define MASK_ZMMUL      (1 << 0)
> >  #define TARGET_ZMMUL    ((riscv_zm_subext & MASK_ZMMUL) != 0)
> >
> > diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> > index 5244e8dcbf0..e421244a06c 100644
> > --- a/gcc/config/riscv/riscv-protos.h
> > +++ b/gcc/config/riscv/riscv-protos.h
> > @@ -38,6 +38,7 @@ enum riscv_symbol_type {
> >  /* Routines implemented in riscv.cc.  */
> >  extern enum riscv_symbol_type riscv_classify_symbolic_expression (rtx);
> >  extern bool riscv_symbolic_constant_p (rtx, enum riscv_symbol_type *);
> > +extern int riscv_float_const_rtx_index_for_fli (rtx);
> >  extern int riscv_regno_mode_ok_for_base_p (int, machine_mode, bool);
> >  extern int riscv_address_insns (rtx, machine_mode, bool);
> >  extern int riscv_const_insns (rtx);
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index cdb47e81e7c..faffedffe97 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> > @@ -799,6 +799,116 @@ static int riscv_symbol_insns (enum riscv_symbol_type type)
> >      }
> >  }
> >
> > +/* Immediate values loaded by the FLI.S instruction in Chapter 25 of the latest RISC-V ISA
> > +   Manual draft. For details, please see:
> > +   https://github.com/riscv/riscv-isa-manual/releases/tag/draft-20221217-cb3b9d1 */
> > +
> > +unsigned HOST_WIDE_INT fli_value_hf[32] =
> > +{
> > +  0xbc00, 0x400, 0x100, 0x200, 0x1c00, 0x2000, 0x2c00, 0x3000,
> > +  0x3400, 0x3500, 0x3600, 0x3700, 0x3800, 0x3900, 0x3a00, 0x3b00,
> > +  0x3c00, 0x3d00, 0x3e00, 0x3f00, 0x4000, 0x4100, 0x4200, 0x4400,
> > +  0x4800, 0x4c00, 0x5800, 0x5c00, 0x7800,
> > +  /* Only used for filling, ensuring that 29 and 30 of HF are the same. */
> > +  0x7800,
> > +  0x7c00, 0x7e00,
> > +};
> > +
> > +unsigned HOST_WIDE_INT fli_value_sf[32] =
> > +{
> > +  0xbf800000, 0x00800000, 0x37800000, 0x38000000, 0x3b800000, 0x3c000000, 0x3d800000, 0x3e000000,
> > +  0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000, 0x3f000000, 0x3f200000, 0x3f400000, 0x3f600000,
> > +  0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000, 0x40000000, 0x40200000, 0x40400000, 0x40800000,
> > +  0x41000000, 0x41800000, 0x43000000, 0x43800000, 0x47000000, 0x47800000, 0x7f800000, 0x7fc00000
> > +};
> > +
> > +unsigned HOST_WIDE_INT fli_value_df[32] =
> > +{
> > +  0xbff0000000000000, 0x10000000000000, 0x3ef0000000000000, 0x3f00000000000000,
> > +  0x3f70000000000000, 0x3f80000000000000, 0x3fb0000000000000, 0x3fc0000000000000,
> > +  0x3fd0000000000000, 0x3fd4000000000000, 0x3fd8000000000000, 0x3fdc000000000000,
> > +  0x3fe0000000000000, 0x3fe4000000000000, 0x3fe8000000000000, 0x3fec000000000000,
> > +  0x3ff0000000000000, 0x3ff4000000000000, 0x3ff8000000000000, 0x3ffc000000000000,
> > +  0x4000000000000000, 0x4004000000000000, 0x4008000000000000, 0x4010000000000000,
> > +  0x4020000000000000, 0x4030000000000000, 0x4060000000000000, 0x4070000000000000,
> > +  0x40e0000000000000, 0x40f0000000000000, 0x7ff0000000000000, 0x7ff8000000000000,
> > +};
> > +
> > +const char *fli_value_print[32] =
> > +{
> > +  "-1.0", "min", "1.52587890625e-05", "3.0517578125e-05", "0.00390625", "0.0078125", "0.0625", "0.125",
> > +  "0.25", "0.3125", "0.375", "0.4375", "0.5", "0.625", "0.75", "0.875",
> > +  "1.0", "1.25", "1.5", "1.75", "2.0", "2.5", "3.0", "4.0",
> > +  "8.0", "16.0", "128.0", "256.0", "32768.0", "65536.0", "inf", "nan"
> > +};
> > +
> > +/* Find the index of TARGET in ARRAY, and return -1 if not found. */
> > +
> > +static int
> > +find_index_in_array (unsigned HOST_WIDE_INT target, unsigned HOST_WIDE_INT *array, int len)
> > +{
> > +  if (array == NULL)
> > +    return -1;
> > +
> > +  for (int i = 0; i < len; i++)
> > +    {
> > +      if (target == array[i])
> > +       return i;
> > +    }
> > +  return -1;
> > +}
> > +
> > +/* Return index of the FLI instruction table if rtx X is an immediate constant that
> > +   can be moved using a single FLI instruction in zfa extension. -1 otherwise. */
> > +
> > +int
> > +riscv_float_const_rtx_index_for_fli (rtx x)
> > +{
> > +  machine_mode mode = GET_MODE (x);
> > +
> > +  if (!TARGET_ZFA || mode == VOIDmode
> > +      || !CONST_DOUBLE_P(x)
> > +      || (mode == HFmode && !TARGET_ZFH)
> > +      || (mode == SFmode && !TARGET_HARD_FLOAT)
> > +      || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
> > +    return -1;
> > +
> > +  if (!SCALAR_FLOAT_MODE_P (mode)
> > +      || GET_MODE_BITSIZE (mode).to_constant () > HOST_BITS_PER_WIDE_INT
> > +      /* Only support up to DF mode.  */
> > +      || GET_MODE_BITSIZE (mode).to_constant () > GET_MODE_BITSIZE (DFmode))
> > +    return -1;
> > +
> > +  unsigned HOST_WIDE_INT ival = 0;
> > +
> > +  long res[2];
> > +  real_to_target (res,
> > +                 CONST_DOUBLE_REAL_VALUE (x),
> > +                 REAL_MODE_FORMAT (mode));
> > +
> > +  if (mode == DFmode)
> > +    {
> > +      int order = BYTES_BIG_ENDIAN ? 1 : 0;
> > +      ival = zext_hwi (res[order], 32);
> > +      ival |= (zext_hwi (res[1 - order], 32) << 32);
> > +    }
> > +  else
> > +      ival = zext_hwi (res[0], 32);
> > +
> > +  switch (mode)
> > +    {
> > +      case SFmode:
> > +       return find_index_in_array (ival, fli_value_sf, 32);
> > +      case DFmode:
> > +       return find_index_in_array (ival, fli_value_df, 32);
> > +      case HFmode:
> > +       return find_index_in_array (ival, fli_value_hf, 32);
> > +      default:
> > +       break;
> > +    }
> > +  return -1;
> > +}
> > +
> >  /* Implement TARGET_LEGITIMATE_CONSTANT_P.  */
> >
> >  static bool
> > @@ -826,6 +936,9 @@ riscv_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
> >    if (GET_CODE (x) == HIGH)
> >      return true;
> >
> > +  if (riscv_float_const_rtx_index_for_fli (x) != -1)
> > +   return true;
> > +
> >    split_const (x, &base, &offset);
> >    if (riscv_symbolic_constant_p (base, &type))
> >      {
> > @@ -1213,6 +1326,11 @@ riscv_const_insns (rtx x)
> >        }
> >
> >      case CONST_DOUBLE:
> > +      /* See if we can use FLI directly.  */
> > +      if (riscv_float_const_rtx_index_for_fli (x) != -1)
> > +       return 3;
> > +      /* Fall through.  */
> > +
> >      case CONST_VECTOR:
> >        /* We can use x0 to load floating-point zero.  */
> >        return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
> > @@ -1749,6 +1867,12 @@ riscv_legitimize_const_move (machine_mode mode, rtx dest, rtx src)
> >        return;
> >      }
> >
> > +  if (riscv_float_const_rtx_index_for_fli (src) != -1)
> > +    {
> > +      riscv_emit_set (dest, src);
> > +      return;
> > +    }
> > +
> >    /* Split moves of symbolic constants into high/low pairs.  */
> >    if (riscv_split_symbol (dest, src, MAX_MACHINE_MODE, &src, FALSE))
> >      {
> > @@ -2770,12 +2894,19 @@ riscv_split_64bit_move_p (rtx dest, rtx src)
> >    if (TARGET_64BIT)
> >      return false;
> >
> > +  /* There is no need to split if the FLI instruction in the `Zfa` extension can be used. */
> > +  if (riscv_float_const_rtx_index_for_fli (src) != -1)
> > +    return false;
> > +
> >    /* Allow FPR <-> FPR and FPR <-> MEM moves, and permit the special case
> >       of zeroing an FPR with FCVT.D.W.  */
> >    if (TARGET_DOUBLE_FLOAT
> >        && ((FP_REG_RTX_P (src) && FP_REG_RTX_P (dest))
> >           || (FP_REG_RTX_P (dest) && MEM_P (src))
> >           || (FP_REG_RTX_P (src) && MEM_P (dest))
> > +         || (TARGET_ZFA
> > +             && ((FP_REG_RTX_P (dest) && GP_REG_RTX_P (src))
> > +             || (FP_REG_RTX_P (src) && GP_REG_RTX_P (dest))))
> >           || (FP_REG_RTX_P (dest) && src == CONST0_RTX (GET_MODE (src)))))
> >      return false;
> >
> > @@ -2857,6 +2988,8 @@ riscv_output_move (rtx dest, rtx src)
> >           case 4:
> >             return "fmv.x.s\t%0,%1";
> >           case 8:
> > +           if (!TARGET_64BIT && TARGET_ZFA)
> > +             return "fmv.x.w\t%0,%1\n\tfmvh.x.d\t%N0,%1";
> >             return "fmv.x.d\t%0,%1";
> >           }
> >
> > @@ -2916,6 +3049,8 @@ riscv_output_move (rtx dest, rtx src)
> >               case 8:
> >                 if (TARGET_64BIT)
> >                   return "fmv.d.x\t%0,%z1";
> > +               else if (TARGET_ZFA && src != CONST0_RTX (mode))
> > +                 return "fmvp.d.x\t%0,%1,%N1";
> >                 /* in RV32, we can emulate fmv.d.x %0, x0 using fcvt.d.w */
> >                 gcc_assert (src == CONST0_RTX (mode));
> >                 return "fcvt.d.w\t%0,x0";
> > @@ -2968,6 +3103,14 @@ riscv_output_move (rtx dest, rtx src)
> >           case 8:
> >             return "fld\t%0,%1";
> >           }
> > +
> > +      if (src_code == CONST_DOUBLE && (riscv_float_const_rtx_index_for_fli (src) != -1))
> > +       switch (width)
> > +         {
> > +           case 2: return "fli.h\t%0,%1";
> > +           case 4: return "fli.s\t%0,%1";
> > +           case 8: return "fli.d\t%0,%1";
> > +         }
> >      }
> >    if (dest_code == REG && GP_REG_P (REGNO (dest)) && src_code == CONST_POLY_INT)
> >      {
> > @@ -4349,6 +4492,7 @@ riscv_memmodel_needs_release_fence (enum memmodel model)
> >     'S' Print shift-index of single-bit mask OP.
> >     'T' Print shift-index of inverted single-bit mask OP.
> >     '~' Print w if TARGET_64BIT is true; otherwise not print anything.
> > +   'N'  Print next register.
> >
> >     Note please keep this list and the list in riscv.md in sync.  */
> >
> > @@ -4533,6 +4677,9 @@ riscv_print_operand (FILE *file, rtx op, int letter)
> >         output_addr_const (file, newop);
> >         break;
> >        }
> > +    case 'N':
> > +      fputs (reg_names[REGNO (op) + 1], file);
> > +      break;
> >      default:
> >        switch (code)
> >         {
> > @@ -4549,6 +4696,24 @@ riscv_print_operand (FILE *file, rtx op, int letter)
> >             output_address (mode, XEXP (op, 0));
> >           break;
> >
> > +       case CONST_DOUBLE:
> > +         {
> > +           if (letter == 'z' && op == CONST0_RTX (GET_MODE (op)))
> > +             {
> > +               fputs (reg_names[GP_REG_FIRST], file);
> > +               break;
> > +             }
> > +
> > +           int fli_index = riscv_float_const_rtx_index_for_fli (op);
> > +           if (fli_index == -1 || fli_index > 31)
> > +             {
> > +               output_operand_lossage ("invalid use of '%%%c'", letter);
> > +               break;
> > +             }
> > +           asm_fprintf (file, "%s", fli_value_print[fli_index]);
> > +           break;
> > +         }
> > +
> >         default:
> >           if (letter == 'z' && op == CONST0_RTX (GET_MODE (op)))
> >             fputs (reg_names[GP_REG_FIRST], file);
> > @@ -5897,7 +6062,8 @@ riscv_secondary_memory_needed (machine_mode mode, reg_class_t class1,
> >    return (!riscv_v_ext_vector_mode_p (mode)
> >           && GET_MODE_SIZE (mode).to_constant () > UNITS_PER_WORD
> >           && (class1 == FP_REGS) != (class2 == FP_REGS)
> > -         && !TARGET_XTHEADFMV);
> > +         && !TARGET_XTHEADFMV
> > +         && !TARGET_ZFA);
> >  }
> >
> >  /* Implement TARGET_REGISTER_MOVE_COST.  */
> > diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> > index 66fb07d6652..d438b281142 100644
> > --- a/gcc/config/riscv/riscv.h
> > +++ b/gcc/config/riscv/riscv.h
> > @@ -377,6 +377,7 @@ ASM_MISA_SPEC
> >  #define SIBCALL_REG_P(REGNO)   \
> >    TEST_HARD_REG_BIT (reg_class_contents[SIBCALL_REGS], REGNO)
> >
> > +#define GP_REG_RTX_P(X) (REG_P (X) && GP_REG_P (REGNO (X)))
> >  #define FP_REG_RTX_P(X) (REG_P (X) && FP_REG_P (REGNO (X)))
> >
> >  /* Use s0 as the frame pointer if it is so requested.  */
> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > index bc384d9aedf..f22e71b5a3a 100644
> > --- a/gcc/config/riscv/riscv.md
> > +++ b/gcc/config/riscv/riscv.md
> > @@ -59,6 +59,15 @@ (define_c_enum "unspec" [
> >    UNSPEC_LROUND
> >    UNSPEC_FMIN
> >    UNSPEC_FMAX
> > +  UNSPEC_RINT
> > +  UNSPEC_ROUND
> > +  UNSPEC_FLOOR
> > +  UNSPEC_CEIL
> > +  UNSPEC_BTRUNC
> > +  UNSPEC_ROUNDEVEN
> > +  UNSPEC_NEARBYINT
> > +  UNSPEC_FMINM
> > +  UNSPEC_FMAXM
> >
> >    ;; Stack tie
> >    UNSPEC_TIE
> > @@ -1232,6 +1241,26 @@ (define_insn "neg<mode>2"
> >  ;;
> >  ;;  ....................
> >
> > +(define_insn "fminm<mode>3"
> > +  [(set (match_operand:ANYF                    0 "register_operand" "=f")
> > +       (unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
> > +                     (use (match_operand:ANYF 2 "register_operand" " f"))]
> > +                    UNSPEC_FMINM))]
> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> > +  "fminm.<fmt>\t%0,%1,%2"
> > +  [(set_attr "type" "fmove")
> > +   (set_attr "mode" "<UNITMODE>")])
> > +
> > +(define_insn "fmaxm<mode>3"
> > +  [(set (match_operand:ANYF                    0 "register_operand" "=f")
> > +       (unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
> > +                     (use (match_operand:ANYF 2 "register_operand" " f"))]
> > +                    UNSPEC_FMAXM))]
> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> > +  "fmaxm.<fmt>\t%0,%1,%2"
> > +  [(set_attr "type" "fmove")
> > +   (set_attr "mode" "<UNITMODE>")])
> > +
> >  (define_insn "fmin<mode>3"
> >    [(set (match_operand:ANYF                    0 "register_operand" "=f")
> >         (unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
> > @@ -1508,13 +1537,13 @@ (define_expand "movhf"
> >  })
> >
> >  (define_insn "*movhf_hardfloat"
> > -  [(set (match_operand:HF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
> > -       (match_operand:HF 1 "move_operand"         " f,G,m,f,G,*r,*f,*G*r,*m,*r"))]
> > +  [(set (match_operand:HF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r,  *r,*r,*m")
> > +       (match_operand:HF 1 "move_operand"         " f,Zf,G,m,f,G,*r,*f,*G*r,*m,*r"))]
> >    "TARGET_ZFHMIN
> >     && (register_operand (operands[0], HFmode)
> >         || reg_or_0_operand (operands[1], HFmode))"
> >    { return riscv_output_move (operands[0], operands[1]); }
> > -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> > +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >     (set_attr "mode" "HF")])
> >
> >  (define_insn "*movhf_softfloat"
> > @@ -1580,6 +1609,26 @@ (define_insn "l<rint_pattern><ANYF:mode><GPR:mode>2"
> >    [(set_attr "type" "fcvt")
> >     (set_attr "mode" "<ANYF:MODE>")])
> >
> > +(define_insn "<round_pattern><ANYF:mode>2"
> > +  [(set (match_operand:ANYF     0 "register_operand" "=f")
> > +       (unspec:ANYF
> > +           [(match_operand:ANYF 1 "register_operand" " f")]
> > +       ROUND))]
> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> > +  "fround.<ANYF:fmt>\t%0,%1,<round_rm>"
> > +  [(set_attr "type" "fcvt")
> > +   (set_attr "mode" "<ANYF:MODE>")])
> > +
> > +(define_insn "rint<ANYF:mode>2"
> > +  [(set (match_operand:ANYF     0 "register_operand" "=f")
> > +       (unspec:ANYF
> > +           [(match_operand:ANYF 1 "register_operand" " f")]
> > +       UNSPEC_RINT))]
> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> > +  "froundnx.<ANYF:fmt>\t%0,%1"
> > +  [(set_attr "type" "fcvt")
> > +   (set_attr "mode" "<ANYF:MODE>")])
> > +
> >  ;;
> >  ;;  ....................
> >  ;;
> > @@ -1839,13 +1888,13 @@ (define_expand "movsf"
> >  })
> >
> >  (define_insn "*movsf_hardfloat"
> > -  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
> > -       (match_operand:SF 1 "move_operand"         " f,G,m,f,G,*r,*f,*G*r,*m,*r"))]
> > +  [(set (match_operand:SF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r,  *r,*r,*m")
> > +       (match_operand:SF 1 "move_operand"         " f,Zf,G,m,f,G,*r,*f,*G*r,*m,*r"))]
> >    "TARGET_HARD_FLOAT
> >     && (register_operand (operands[0], SFmode)
> >         || reg_or_0_operand (operands[1], SFmode))"
> >    { return riscv_output_move (operands[0], operands[1]); }
> > -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> > +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >     (set_attr "mode" "SF")])
> >
> >  (define_insn "*movsf_softfloat"
> > @@ -1873,23 +1922,23 @@ (define_expand "movdf"
> >  ;; In RV32, we lack fmv.x.d and fmv.d.x.  Go through memory instead.
> >  ;; (However, we can still use fcvt.d.w to zero a floating-point register.)
> >  (define_insn "*movdf_hardfloat_rv32"
> > -  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,m,*th_f_fmv,*th_r_fmv,  *r,*r,*m")
> > -       (match_operand:DF 1 "move_operand"         " f,G,m,f,G,*th_r_fmv,*th_f_fmv,*r*G,*m,*r"))]
> > +  [(set (match_operand:DF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*th_f_fmv,*th_r_fmv,  *r,*r,*m")
> > +       (match_operand:DF 1 "move_operand"         " f,Zf,G,m,f,G,*th_r_fmv,*th_f_fmv,*r*G,*m,*r"))]
> >    "!TARGET_64BIT && TARGET_DOUBLE_FLOAT
> >     && (register_operand (operands[0], DFmode)
> >         || reg_or_0_operand (operands[1], DFmode))"
> >    { return riscv_output_move (operands[0], operands[1]); }
> > -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> > +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >     (set_attr "mode" "DF")])
> >
> >  (define_insn "*movdf_hardfloat_rv64"
> > -  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
> > -       (match_operand:DF 1 "move_operand"         " f,G,m,f,G,*r,*f,*r*G,*m,*r"))]
> > +  [(set (match_operand:DF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r,  *r,*r,*m")
> > +       (match_operand:DF 1 "move_operand"         " f,Zf,G,m,f,G,*r,*f,*r*G,*m,*r"))]
> >    "TARGET_64BIT && TARGET_DOUBLE_FLOAT
> >     && (register_operand (operands[0], DFmode)
> >         || reg_or_0_operand (operands[1], DFmode))"
> >    { return riscv_output_move (operands[0], operands[1]); }
> > -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> > +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >     (set_attr "mode" "DF")])
> >
> >  (define_insn "*movdf_softfloat"
> > @@ -2494,16 +2543,23 @@ (define_expand "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4"
> >    rtx op0 = operands[0];
> >    rtx op1 = operands[1];
> >    rtx op2 = operands[2];
> > -  rtx tmp = gen_reg_rtx (SImode);
> > -  rtx cmp = gen_rtx_<QUIET_PATTERN> (<X:MODE>mode, op1, op2);
> > -  rtx frflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, const0_rtx),
> > -                                        UNSPECV_FRFLAGS);
> > -  rtx fsflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, tmp),
> > -                                        UNSPECV_FSFLAGS);
> > -
> > -  emit_insn (gen_rtx_SET (tmp, frflags));
> > -  emit_insn (gen_rtx_SET (op0, cmp));
> > -  emit_insn (fsflags);
> > +
> > +  if (TARGET_ZFA)
> > +    emit_insn (gen_f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa(op0, op1, op2));
> > +  else
> > +    {
> > +      rtx tmp = gen_reg_rtx (SImode);
> > +      rtx cmp = gen_rtx_<QUIET_PATTERN> (<X:MODE>mode, op1, op2);
> > +      rtx frflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, const0_rtx),
> > +                                            UNSPECV_FRFLAGS);
> > +      rtx fsflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, tmp),
> > +                                            UNSPECV_FSFLAGS);
> > +
> > +      emit_insn (gen_rtx_SET (tmp, frflags));
> > +      emit_insn (gen_rtx_SET (op0, cmp));
> > +      emit_insn (fsflags);
> > +    }
> > +
> >    if (HONOR_SNANS (<ANYF:MODE>mode))
> >      emit_insn (gen_rtx_UNSPEC_VOLATILE (<ANYF:MODE>mode,
> >                                         gen_rtvec (2, op1, op2),
> > @@ -2511,6 +2567,18 @@ (define_expand "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4"
> >    DONE;
> >  })
> >
> > +(define_insn "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa"
> > +   [(set (match_operand:X      0 "register_operand" "=r")
> > +        (unspec:X
> > +         [(match_operand:ANYF 1 "register_operand" " f")
> > +          (match_operand:ANYF 2 "register_operand" " f")]
> > +         QUIET_COMPARISON))]
> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> > +  "f<quiet_pattern>q.<fmt>\t%0,%1,%2"
> > +  [(set_attr "type" "fcmp")
> > +   (set_attr "mode" "<UNITMODE>")
> > +   (set (attr "length") (const_int 16))])
> > +
> >  (define_insn "*seq_zero_<X:mode><GPR:mode>"
> >    [(set (match_operand:GPR       0 "register_operand" "=r")
> >         (eq:GPR (match_operand:X 1 "register_operand" " r")
> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
> > new file mode 100644
> > index 00000000000..26895b76fa4
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
> > @@ -0,0 +1,19 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O2" } */
> > +
> > +extern void abort(void);
> > +extern float a, b;
> > +extern double c, d;
> > +
> > +void
> > +foo()
> > +{
> > +  if ((__builtin_isless(a, b) ||  __builtin_islessequal(c, d))
> > +      && (__builtin_islessequal(a, b)|| __builtin_isless(c, d)))
> > +    abort();
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "fleq.s" 1 } } */
> > +/* { dg-final { scan-assembler-times "fltq.s" 1 } } */
> > +/* { dg-final { scan-assembler-times "fleq.d" 1 } } */
> > +/* { dg-final { scan-assembler-times "fltq.d" 1 } } */
> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
> > new file mode 100644
> > index 00000000000..4ccd6a7dd78
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
> > @@ -0,0 +1,19 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O2" } */
> > +
> > +extern void abort(void);
> > +extern float a, b;
> > +extern double c, d;
> > +
> > +void
> > +foo()
> > +{
> > +  if ((__builtin_isless(a, b) ||  __builtin_islessequal(c, d))
> > +      && (__builtin_islessequal(a, b)|| __builtin_isless(c, d)))
> > +    abort();
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "fleq.s" 1 } } */
> > +/* { dg-final { scan-assembler-times "fltq.s" 1 } } */
> > +/* { dg-final { scan-assembler-times "fleq.d" 1 } } */
> > +/* { dg-final { scan-assembler-times "fltq.d" 1 } } */
> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
> > new file mode 100644
> > index 00000000000..c4da04797aa
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
> > @@ -0,0 +1,79 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O0" } */
> > +
> > +void foo_float32 ()
> > +{
> > +  volatile float a;
> > +  a = -1.0;
> > +  a = 1.1754944e-38;
> > +  a = 1.0/(1 << 16);
> > +  a = 1.0/(1 << 15);
> > +  a = 1.0/(1 << 8);
> > +  a = 1.0/(1 << 7);
> > +  a = 1.0/(1 << 4);
> > +  a = 1.0/(1 << 3);
> > +  a = 1.0/(1 << 2);
> > +  a = 0.3125;
> > +  a = 0.375;
> > +  a = 0.4375;
> > +  a = 0.5;
> > +  a = 0.625;
> > +  a = 0.75;
> > +  a = 0.875;
> > +  a = 1.0;
> > +  a = 1.25;
> > +  a = 1.5;
> > +  a = 1.75;
> > +  a = 2.0;
> > +  a = 2.5;
> > +  a = 3.0;
> > +  a = 1.0*(1 << 2);
> > +  a = 1.0*(1 << 3);
> > +  a = 1.0*(1 << 4);
> > +  a = 1.0*(1 << 7);
> > +  a = 1.0*(1 << 8);
> > +  a = 1.0*(1 << 15);
> > +  a = 1.0*(1 << 16);
> > +  a = __builtin_inff ();
> > +  a = __builtin_nanf ("");
> > +}
> > +
> > +void foo_double64 ()
> > +{
> > +  volatile double a;
> > +  a = -1.0;
> > +  a = 2.2250738585072014E-308;
> > +  a = 1.0/(1 << 16);
> > +  a = 1.0/(1 << 15);
> > +  a = 1.0/(1 << 8);
> > +  a = 1.0/(1 << 7);
> > +  a = 1.0/(1 << 4);
> > +  a = 1.0/(1 << 3);
> > +  a = 1.0/(1 << 2);
> > +  a = 0.3125;
> > +  a = 0.375;
> > +  a = 0.4375;
> > +  a = 0.5;
> > +  a = 0.625;
> > +  a = 0.75;
> > +  a = 0.875;
> > +  a = 1.0;
> > +  a = 1.25;
> > +  a = 1.5;
> > +  a = 1.75;
> > +  a = 2.0;
> > +  a = 2.5;
> > +  a = 3.0;
> > +  a = 1.0*(1 << 2);
> > +  a = 1.0*(1 << 3);
> > +  a = 1.0*(1 << 4);
> > +  a = 1.0*(1 << 7);
> > +  a = 1.0*(1 << 8);
> > +  a = 1.0*(1 << 15);
> > +  a = 1.0*(1 << 16);
> > +  a = __builtin_inf ();
> > +  a = __builtin_nan ("");
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "fli.s" 32 } } */
> > +/* { dg-final { scan-assembler-times "fli.d" 32 } } */
> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
> > new file mode 100644
> > index 00000000000..bcffe9d2c82
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
> > @@ -0,0 +1,41 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv32imafdc_zfa_zfh -mabi=ilp32d -O0" } */
> > +
> > +void foo_float16 ()
> > +{
> > +  volatile _Float16 a;
> > +  a = -1.0;
> > +  a = 6.104E-5;
> > +  a = 1.0/(1 << 16);
> > +  a = 1.0/(1 << 15);
> > +  a = 1.0/(1 << 8);
> > +  a = 1.0/(1 << 7);
> > +  a = 1.0/(1 << 4);
> > +  a = 1.0/(1 << 3);
> > +  a = 1.0/(1 << 2);
> > +  a = 0.3125;
> > +  a = 0.375;
> > +  a = 0.4375;
> > +  a = 0.5;
> > +  a = 0.625;
> > +  a = 0.75;
> > +  a = 0.875;
> > +  a = 1.0;
> > +  a = 1.25;
> > +  a = 1.5;
> > +  a = 1.75;
> > +  a = 2.0;
> > +  a = 2.5;
> > +  a = 3.0;
> > +  a = 1.0*(1 << 2);
> > +  a = 1.0*(1 << 3);
> > +  a = 1.0*(1 << 4);
> > +  a = 1.0*(1 << 7);
> > +  a = 1.0*(1 << 8);
> > +  a = 1.0*(1 << 15);
> > +  a = 1.0*(1 << 16);
> > +  a = __builtin_inff16 ();
> > +  a = __builtin_nanf16 ("");
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "fli.h" 32 } } */
> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
> > new file mode 100644
> > index 00000000000..13aa7b5f846
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
> > @@ -0,0 +1,41 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64imafdc_zfa_zfh -mabi=lp64d -O0" } */
> > +
> > +void foo_float16 ()
> > +{
> > +  volatile _Float16 a;
> > +  a = -1.0;
> > +  a = 6.104E-5;
> > +  a = 1.0/(1 << 16);
> > +  a = 1.0/(1 << 15);
> > +  a = 1.0/(1 << 8);
> > +  a = 1.0/(1 << 7);
> > +  a = 1.0/(1 << 4);
> > +  a = 1.0/(1 << 3);
> > +  a = 1.0/(1 << 2);
> > +  a = 0.3125;
> > +  a = 0.375;
> > +  a = 0.4375;
> > +  a = 0.5;
> > +  a = 0.625;
> > +  a = 0.75;
> > +  a = 0.875;
> > +  a = 1.0;
> > +  a = 1.25;
> > +  a = 1.5;
> > +  a = 1.75;
> > +  a = 2.0;
> > +  a = 2.5;
> > +  a = 3.0;
> > +  a = 1.0*(1 << 2);
> > +  a = 1.0*(1 << 3);
> > +  a = 1.0*(1 << 4);
> > +  a = 1.0*(1 << 7);
> > +  a = 1.0*(1 << 8);
> > +  a = 1.0*(1 << 15);
> > +  a = 1.0*(1 << 16);
> > +  a = __builtin_inff16 ();
> > +  a = __builtin_nanf16 ("");
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "fli.h" 32 } } */
> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli.c b/gcc/testsuite/gcc.target/riscv/zfa-fli.c
> > new file mode 100644
> > index 00000000000..b6d41cf460f
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli.c
> > @@ -0,0 +1,79 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O0" } */
> > +
> > +void foo_float32 ()
> > +{
> > +  volatile float a;
> > +  a = -1.0;
> > +  a = 1.1754944e-38;
> > +  a = 1.0/(1 << 16);
> > +  a = 1.0/(1 << 15);
> > +  a = 1.0/(1 << 8);
> > +  a = 1.0/(1 << 7);
> > +  a = 1.0/(1 << 4);
> > +  a = 1.0/(1 << 3);
> > +  a = 1.0/(1 << 2);
> > +  a = 0.3125;
> > +  a = 0.375;
> > +  a = 0.4375;
> > +  a = 0.5;
> > +  a = 0.625;
> > +  a = 0.75;
> > +  a = 0.875;
> > +  a = 1.0;
> > +  a = 1.25;
> > +  a = 1.5;
> > +  a = 1.75;
> > +  a = 2.0;
> > +  a = 2.5;
> > +  a = 3.0;
> > +  a = 1.0*(1 << 2);
> > +  a = 1.0*(1 << 3);
> > +  a = 1.0*(1 << 4);
> > +  a = 1.0*(1 << 7);
> > +  a = 1.0*(1 << 8);
> > +  a = 1.0*(1 << 15);
> > +  a = 1.0*(1 << 16);
> > +  a = __builtin_inff ();
> > +  a = __builtin_nanf ("");
> > +}
> > +
> > +void foo_double64 ()
> > +{
> > +  volatile double a;
> > +  a = -1.0;
> > +  a = 2.2250738585072014E-308;
> > +  a = 1.0/(1 << 16);
> > +  a = 1.0/(1 << 15);
> > +  a = 1.0/(1 << 8);
> > +  a = 1.0/(1 << 7);
> > +  a = 1.0/(1 << 4);
> > +  a = 1.0/(1 << 3);
> > +  a = 1.0/(1 << 2);
> > +  a = 0.3125;
> > +  a = 0.375;
> > +  a = 0.4375;
> > +  a = 0.5;
> > +  a = 0.625;
> > +  a = 0.75;
> > +  a = 0.875;
> > +  a = 1.0;
> > +  a = 1.25;
> > +  a = 1.5;
> > +  a = 1.75;
> > +  a = 2.0;
> > +  a = 2.5;
> > +  a = 3.0;
> > +  a = 1.0*(1 << 2);
> > +  a = 1.0*(1 << 3);
> > +  a = 1.0*(1 << 4);
> > +  a = 1.0*(1 << 7);
> > +  a = 1.0*(1 << 8);
> > +  a = 1.0*(1 << 15);
> > +  a = 1.0*(1 << 16);
> > +  a = __builtin_inf ();
> > +  a = __builtin_nan ("");
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "fli.s" 32 } } */
> > +/* { dg-final { scan-assembler-times "fli.d" 32 } } */
> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
> > new file mode 100644
> > index 00000000000..5a52adce36a
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
> > @@ -0,0 +1,10 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv32g_zfa -mabi=ilp32 -O0" } */
> > +
> > +double foo(long long a)
> > +{
> > +  return (double)(a + 3);
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "fmvp.d.x" 1 } } */
> > +/* { dg-final { scan-assembler-times "fmvh.x.d" 1 } } */
> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
> > new file mode 100644
> > index 00000000000..b53601d6e1f
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
> > @@ -0,0 +1,42 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O2" } */
> > +
> > +extern float a;
> > +extern double b;
> > +
> > +void foo (float *x, double *y)
> > +{
> > +  {
> > +    *x = __builtin_roundf (a);
> > +    *y = __builtin_round (b);
> > +  }
> > +  {
> > +    *x = __builtin_floorf (a);
> > +    *y = __builtin_floor (b);
> > +  }
> > +  {
> > +    *x = __builtin_ceilf (a);
> > +    *y = __builtin_ceil (b);
> > +  }
> > +  {
> > +    *x = __builtin_truncf (a);
> > +    *y = __builtin_trunc (b);
> > +  }
> > +  {
> > +    *x = __builtin_roundevenf (a);
> > +    *y = __builtin_roundeven (b);
> > +  }
> > +  {
> > +    *x = __builtin_nearbyintf (a);
> > +    *y = __builtin_nearbyint (b);
> > +  }
> > +  {
> > +    *x = __builtin_rintf (a);
> > +    *y = __builtin_rint (b);
> > +  }
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "fround.s" 6 } } */
> > +/* { dg-final { scan-assembler-times "fround.d" 6 } } */
> > +/* { dg-final { scan-assembler-times "froundnx.s" 1 } } */
> > +/* { dg-final { scan-assembler-times "froundnx.d" 1 } } */
> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fround.c b/gcc/testsuite/gcc.target/riscv/zfa-fround.c
> > new file mode 100644
> > index 00000000000..c10de82578e
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fround.c
> > @@ -0,0 +1,42 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O2" } */
> > +
> > +extern float a;
> > +extern double b;
> > +
> > +void foo (float *x, double *y)
> > +{
> > +  {
> > +    *x = __builtin_roundf (a);
> > +    *y = __builtin_round (b);
> > +  }
> > +  {
> > +    *x = __builtin_floorf (a);
> > +    *y = __builtin_floor (b);
> > +  }
> > +  {
> > +    *x = __builtin_ceilf (a);
> > +    *y = __builtin_ceil (b);
> > +  }
> > +  {
> > +    *x = __builtin_truncf (a);
> > +    *y = __builtin_trunc (b);
> > +  }
> > +  {
> > +    *x = __builtin_roundevenf (a);
> > +    *y = __builtin_roundeven (b);
> > +  }
> > +  {
> > +    *x = __builtin_nearbyintf (a);
> > +    *y = __builtin_nearbyint (b);
> > +  }
> > +  {
> > +    *x = __builtin_rintf (a);
> > +    *y = __builtin_rint (b);
> > +  }
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "fround.s" 6 } } */
> > +/* { dg-final { scan-assembler-times "fround.d" 6 } } */
> > +/* { dg-final { scan-assembler-times "froundnx.s" 1 } } */
> > +/* { dg-final { scan-assembler-times "froundnx.d" 1 } } */
> > --
> > 2.17.1
> >


* Re: [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2.
  2023-05-05 15:04   ` Christoph Müllner
@ 2023-05-05 15:12     ` Palmer Dabbelt
  2023-05-05 15:43       ` Christoph Müllner
  0 siblings, 1 reply; 20+ messages in thread
From: Palmer Dabbelt @ 2023-05-05 15:12 UTC (permalink / raw)
  To: christoph.muellner
  Cc: jinma, gcc-patches, jeffreyalaw, kito.cheng, Kito Cheng, ijinma

On Fri, 05 May 2023 08:04:53 PDT (-0700), christoph.muellner@vrull.eu wrote:
> What I forgot to mention:
> Zfa is frozen and in public review:
>   https://groups.google.com/a/groups.riscv.org/g/isa-dev/c/SED4ntBkabg

Thanks, I'd also forgotten to send that out ;).

I think the only blocker here on the specification side is the assembly 
format for FLI?  It looks like the feedback on 
<https://github.com/riscv-non-isa/riscv-asm-manual/pull/85> has been 
pretty minor so far.  It'd be nice to have the docs lined up before 
we merge, but we could always just call it a GNU extension -- we've 
already got a lot of that in assembler land, so I don't think it's that 
big of a deal.
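
For reference while the assembly format settles, here is a rough sketch of
the 32 single-precision constants FLI.S is meant to load, listed in the same
order as the assignments in zfa-fli.c from this patch.  The names
fli_s_table and fli_s_index are made up for illustration; this is not the
patch's riscv_float_const_rtx_index_for_fli.  It assumes that index 1 is the
minimum positive normal value (FLT_MIN) and that index 31 is a quiet NaN.

```c
#include <float.h>
#include <math.h>
#include <string.h>

/* Hypothetical table of the 32 FLI.S constants, in the same order as the
   assignments in zfa-fli.c.  Not the table used by the patch itself.  */
static const float fli_s_table[32] = {
  -1.0f, FLT_MIN, 0x1p-16f, 0x1p-15f, 0x1p-8f, 0x1p-7f, 0.0625f, 0.125f,
  0.25f, 0.3125f, 0.375f, 0.4375f, 0.5f, 0.625f, 0.75f, 0.875f,
  1.0f, 1.25f, 1.5f, 1.75f, 2.0f, 2.5f, 3.0f, 4.0f,
  8.0f, 16.0f, 128.0f, 256.0f, 32768.0f, 65536.0f, INFINITY, NAN
};

/* Return the table index of VALUE, or -1 if no entry matches.
   The comparison is bitwise, because NaN never compares equal
   to itself with ==.  */
static int
fli_s_index (float value)
{
  for (int i = 0; i < 32; i++)
    if (memcmp (&value, &fli_s_table[i], sizeof value) == 0)
      return i;
  return -1;
}
```

With this sketch, fli_s_index (1.0f) yields 16 and fli_s_index (0.1f) yields
-1, so only the former would be a candidate for a single fli.s.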

>
> On Fri, May 5, 2023 at 5:03 PM Christoph Müllner
> <christoph.muellner@vrull.eu> wrote:
>>
>> On Wed, Apr 19, 2023 at 11:58 AM Jin Ma <jinma@linux.alibaba.com> wrote:
>> >
>> > This patch adds the 'Zfa' extension for riscv, which is based on:
>> >   https://github.com/riscv/riscv-isa-manual/commits/zfb
>> >   https://github.com/riscv/riscv-isa-manual/commit/1f038182810727f5feca311072e630d6baac51da
>> >
>> > The binutils-gdb for 'Zfa' extension:
>> >   https://github.com/a4lg/binutils-gdb/commits/riscv-zfa
>> >
>> > What needs special explanation is:
>> > 1, The immediate of the FLI.H/S/D instructions is represented in assembly as a
>> >   floating-point value, using scientific notation when rs1 is 1 or 2 and decimal notation for
>> >   the rest.
>> >
>> >   Related llvm link:
>> >     https://reviews.llvm.org/D145645
>> >   Related discussion link:
>> >     https://github.com/riscv/riscv-isa-manual/issues/980
>> >
>> > 2, According to riscv-spec, "The FCVTMOD.W.D instruction was added principally to
>> >   accelerate the processing of JavaScript Numbers.", so it seems that no implementation
>> >   is required.
>> >
>> > 3, The instructions FMINM and FMAXM correspond to the C23 library functions fminimum and fmaximum.
>> >   Therefore, this patch simply implements the patterns fminm<hf\sf\df>3 and
>> >   fmaxm<hf\sf\df>3 to prepare for later use.
>> >
>> > gcc/ChangeLog:
>> >
>> >         * common/config/riscv/riscv-common.cc: Add zfa extension version.
>> >         * config/riscv/constraints.md (Zf): Constrain the floating point number that the
>> >         instructions FLI.H/S/D can load.
>> >         ((TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS): enable FMVP.D.X and FMVH.X.D.
>> >         * config/riscv/iterators.md (ceil): New.
>> >         * config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): New.
>> >         * config/riscv/riscv.cc (find_index_in_array): New.
>> >         (riscv_float_const_rtx_index_for_fli): Get the index of the floating-point number that
>> >         the instructions FLI.H/S/D can mov.
>> >         (riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be used, memory is not applicable.
>> >         (riscv_const_insns): The cost of FLI.H/S/D is 3.
>> >         (riscv_legitimize_const_move): Likewise.
>> >         (riscv_split_64bit_move_p): If instruction FLI.H/S/D can be used, no split is required.
>> >         (riscv_output_move): Output the mov instructions in zfa extension.
>> >         (riscv_print_operand): Output the floating-point value of the FLI.H/S/D immediate in assembly
>> >         (riscv_secondary_memory_needed): Likewise.
>> >         * config/riscv/riscv.h (GP_REG_RTX_P): New.
>> >         * config/riscv/riscv.md (fminm<mode>3): New.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >         * gcc.target/riscv/zfa-fleq-fltq-rv32.c: New test.
>> >         * gcc.target/riscv/zfa-fleq-fltq.c: New test.
>> >         * gcc.target/riscv/zfa-fli-rv32.c: New test.
>> >         * gcc.target/riscv/zfa-fli-zfh-rv32.c: New test.
>> >         * gcc.target/riscv/zfa-fli-zfh.c: New test.
>> >         * gcc.target/riscv/zfa-fli.c: New test.
>> >         * gcc.target/riscv/zfa-fmovh-fmovp-rv32.c: New test.
>> >         * gcc.target/riscv/zfa-fround-rv32.c: New test.
>> >         * gcc.target/riscv/zfa-fround.c: New test.
>> > ---
>> >  gcc/common/config/riscv/riscv-common.cc       |   4 +
>> >  gcc/config/riscv/constraints.md               |  11 +-
>> >  gcc/config/riscv/iterators.md                 |   5 +
>> >  gcc/config/riscv/riscv-opts.h                 |   3 +
>> >  gcc/config/riscv/riscv-protos.h               |   1 +
>> >  gcc/config/riscv/riscv.cc                     | 168 +++++++++++++++++-
>> >  gcc/config/riscv/riscv.h                      |   1 +
>> >  gcc/config/riscv/riscv.md                     | 112 +++++++++---
>> >  .../gcc.target/riscv/zfa-fleq-fltq-rv32.c     |  19 ++
>> >  .../gcc.target/riscv/zfa-fleq-fltq.c          |  19 ++
>> >  gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c |  79 ++++++++
>> >  .../gcc.target/riscv/zfa-fli-zfh-rv32.c       |  41 +++++
>> >  gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c  |  41 +++++
>> >  gcc/testsuite/gcc.target/riscv/zfa-fli.c      |  79 ++++++++
>> >  .../gcc.target/riscv/zfa-fmovh-fmovp-rv32.c   |  10 ++
>> >  .../gcc.target/riscv/zfa-fround-rv32.c        |  42 +++++
>> >  gcc/testsuite/gcc.target/riscv/zfa-fround.c   |  42 +++++
>> >  17 files changed, 652 insertions(+), 25 deletions(-)
>> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
>> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
>> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
>> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
>> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
>> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli.c
>> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
>> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
>> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround.c
>> >
>> > diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
>> > index 309a52def75..f9fce6bcc38 100644
>> > --- a/gcc/common/config/riscv/riscv-common.cc
>> > +++ b/gcc/common/config/riscv/riscv-common.cc
>> > @@ -217,6 +217,8 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
>> >    {"zfh",       ISA_SPEC_CLASS_NONE, 1, 0},
>> >    {"zfhmin",    ISA_SPEC_CLASS_NONE, 1, 0},
>> >
>> > +  {"zfa",     ISA_SPEC_CLASS_NONE, 0, 2},
>> > +
>> >    {"zmmul", ISA_SPEC_CLASS_NONE, 1, 0},
>> >
>> >    {"svinval", ISA_SPEC_CLASS_NONE, 1, 0},
>> > @@ -1260,6 +1262,8 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
>> >    {"zfhmin",    &gcc_options::x_riscv_zf_subext, MASK_ZFHMIN},
>> >    {"zfh",       &gcc_options::x_riscv_zf_subext, MASK_ZFH},
>> >
>> > +  {"zfa",       &gcc_options::x_riscv_zf_subext, MASK_ZFA},
>> > +
>> >    {"zmmul", &gcc_options::x_riscv_zm_subext, MASK_ZMMUL},
>> >
>> >    {"svinval", &gcc_options::x_riscv_sv_subext, MASK_SVINVAL},
>> > diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
>> > index c448e6b37e9..62d9094f966 100644
>> > --- a/gcc/config/riscv/constraints.md
>> > +++ b/gcc/config/riscv/constraints.md
>> > @@ -118,6 +118,13 @@ (define_constraint "T"
>> >    (and (match_operand 0 "move_operand")
>> >         (match_test "CONSTANT_P (op)")))
>> >
>> > +;; Zfa constraints.
>> > +
>> > +(define_constraint "Zf"
>> > +  "A floating point number that can be loaded using instruction `fli` in zfa."
>> > +  (and (match_code "const_double")
>> > +       (match_test "(riscv_float_const_rtx_index_for_fli (op) != -1)")))
>> > +
>> >  ;; Vector constraints.
>> >
>> >  (define_register_constraint "vr" "TARGET_VECTOR ? V_REGS : NO_REGS"
>> > @@ -183,8 +190,8 @@ (define_memory_constraint "Wdm"
>> >
>> >  ;; Vendor ISA extension constraints.
>> >
>> > -(define_register_constraint "th_f_fmv" "TARGET_XTHEADFMV ? FP_REGS : NO_REGS"
>> > +(define_register_constraint "th_f_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS"
>> >    "A floating-point register for XTheadFmv.")
>> >
>> > -(define_register_constraint "th_r_fmv" "TARGET_XTHEADFMV ? GR_REGS : NO_REGS"
>> > +(define_register_constraint "th_r_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? GR_REGS : NO_REGS"
>> >    "An integer register for XTheadFmv.")
>>
>> These are vendor extension constraints with the prefix "th_".
>> I would avoid using them in code that targets standard extensions.
>>
>> I see two ways here:
>> a) Create two new constraints at the top of the file, e.g.:
>>     - "F" - "A floating-point register (no fall-back for Zfinx)" and
>>     - "rF" - "An integer register in case FP registers are available".
>> b) Move these two constraints to the top of the file and rename them
>>    (adjusting movdf_hardfloat_rv32 accordingly).
>>
>> I would prefer b), and would even go so far as to do this in a separate
>> commit that comes before the Zfa support patch.
>>
>>
>> I've applied the patch on top of today's master (with --3way) and
>> successfully tested it:
>> Tested-by: Christoph Müllner <christoph.muellner@vrull.eu>
>>
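
As background for the constraint discussion: on RV32 these constraints gate
DFmode moves between an FP register and a pair of integer registers, which
the patch emits as fmv.x.w plus fmvh.x.d in one direction and fmvp.d.x in
the other.  Below is a small sketch of what those sequences compute, under
the assumption that double is IEEE 754 binary64.  The names gpr_pair,
split_double and join_double are hypothetical and exist only for this
illustration.

```c
#include <stdint.h>
#include <string.h>

/* A 64-bit value held in two consecutive 32-bit integer registers:
   LO models the register printed by %0, HI the one printed by %N0.  */
struct gpr_pair
{
  uint32_t lo;
  uint32_t hi;
};

/* Models "fmv.x.w lo,fs" followed by "fmvh.x.d hi,fs".  */
static struct gpr_pair
split_double (double d)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  struct gpr_pair p = { (uint32_t) bits, (uint32_t) (bits >> 32) };
  return p;
}

/* Models "fmvp.d.x fd,lo,hi".  */
static double
join_double (struct gpr_pair p)
{
  uint64_t bits = ((uint64_t) p.hi << 32) | p.lo;
  double d;
  memcpy (&d, &bits, sizeof d);
  return d;
}
```

Splitting and then joining is the identity on the bit pattern, which is why
no trip through memory is needed once both directions are available.
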
>> > diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
>> > index 9b767038452..c81b08e3cc5 100644
>> > --- a/gcc/config/riscv/iterators.md
>> > +++ b/gcc/config/riscv/iterators.md
>> > @@ -288,3 +288,8 @@ (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
>> >  (define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET "le")])
>> >  (define_int_attr QUIET_PATTERN [(UNSPEC_FLT_QUIET "LT") (UNSPEC_FLE_QUIET "LE")])
>> >
>> > +(define_int_iterator ROUND [UNSPEC_ROUND UNSPEC_FLOOR UNSPEC_CEIL UNSPEC_BTRUNC UNSPEC_ROUNDEVEN UNSPEC_NEARBYINT])
>> > +(define_int_attr round_pattern [(UNSPEC_ROUND "round") (UNSPEC_FLOOR "floor") (UNSPEC_CEIL "ceil")
>> > +                               (UNSPEC_BTRUNC "btrunc") (UNSPEC_ROUNDEVEN "roundeven") (UNSPEC_NEARBYINT "nearbyint")])
>> > +(define_int_attr round_rm [(UNSPEC_ROUND "rmm") (UNSPEC_FLOOR "rdn") (UNSPEC_CEIL "rup")
>> > +                          (UNSPEC_BTRUNC "rtz") (UNSPEC_ROUNDEVEN "rne") (UNSPEC_NEARBYINT "dyn")])
>> > \ No newline at end of file
>> > diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
>> > index cf0cd669be4..87b72efd12e 100644
>> > --- a/gcc/config/riscv/riscv-opts.h
>> > +++ b/gcc/config/riscv/riscv-opts.h
>> > @@ -172,6 +172,9 @@ enum stack_protector_guard {
>> >  #define TARGET_ZFHMIN ((riscv_zf_subext & MASK_ZFHMIN) != 0)
>> >  #define TARGET_ZFH    ((riscv_zf_subext & MASK_ZFH) != 0)
>> >
>> > +#define MASK_ZFA   (1 << 0)
>> > +#define TARGET_ZFA    ((riscv_zf_subext & MASK_ZFA) != 0)
>> > +
>> >  #define MASK_ZMMUL      (1 << 0)
>> >  #define TARGET_ZMMUL    ((riscv_zm_subext & MASK_ZMMUL) != 0)
>> >
>> > diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
>> > index 5244e8dcbf0..e421244a06c 100644
>> > --- a/gcc/config/riscv/riscv-protos.h
>> > +++ b/gcc/config/riscv/riscv-protos.h
>> > @@ -38,6 +38,7 @@ enum riscv_symbol_type {
>> >  /* Routines implemented in riscv.cc.  */
>> >  extern enum riscv_symbol_type riscv_classify_symbolic_expression (rtx);
>> >  extern bool riscv_symbolic_constant_p (rtx, enum riscv_symbol_type *);
>> > +extern int riscv_float_const_rtx_index_for_fli (rtx);
>> >  extern int riscv_regno_mode_ok_for_base_p (int, machine_mode, bool);
>> >  extern int riscv_address_insns (rtx, machine_mode, bool);
>> >  extern int riscv_const_insns (rtx);
>> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>> > index cdb47e81e7c..faffedffe97 100644
>> > --- a/gcc/config/riscv/riscv.cc
>> > +++ b/gcc/config/riscv/riscv.cc
>> > @@ -799,6 +799,116 @@ static int riscv_symbol_insns (enum riscv_symbol_type type)
>> >      }
>> >  }
>> >
>> > +/* Immediate values loaded by the FLI.S instruction in Chapter 25 of the latest RISC-V ISA
>> > +   Manual draft. For details, please see:
>> > +   https://github.com/riscv/riscv-isa-manual/releases/tag/draft-20221217-cb3b9d1 */
>> > +
>> > +unsigned HOST_WIDE_INT fli_value_hf[32] =
>> > +{
>> > +  0xbc00, 0x400, 0x100, 0x200, 0x1c00, 0x2000, 0x2c00, 0x3000,
>> > +  0x3400, 0x3500, 0x3600, 0x3700, 0x3800, 0x3900, 0x3a00, 0x3b00,
>> > +  0x3c00, 0x3d00, 0x3e00, 0x3f00, 0x4000, 0x4100, 0x4200, 0x4400,
>> > +  0x4800, 0x4c00, 0x5800, 0x5c00, 0x7800,
>> > +  /* Only used for filling, ensuring that 29 and 30 of HF are the same. */
>> > +  0x7800,
>> > +  0x7c00, 0x7e00,
>> > +};
>> > +
>> > +unsigned HOST_WIDE_INT fli_value_sf[32] =
>> > +{
>> > +  0xbf800000, 0x00800000, 0x37800000, 0x38000000, 0x3b800000, 0x3c000000, 0x3d800000, 0x3e000000,
>> > +  0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000, 0x3f000000, 0x3f200000, 0x3f400000, 0x3f600000,
>> > +  0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000, 0x40000000, 0x40200000, 0x40400000, 0x40800000,
>> > +  0x41000000, 0x41800000, 0x43000000, 0x43800000, 0x47000000, 0x47800000, 0x7f800000, 0x7fc00000
>> > +};
>> > +
>> > +unsigned HOST_WIDE_INT fli_value_df[32] =
>> > +{
>> > +  0xbff0000000000000, 0x10000000000000, 0x3ef0000000000000, 0x3f00000000000000,
>> > +  0x3f70000000000000, 0x3f80000000000000, 0x3fb0000000000000, 0x3fc0000000000000,
>> > +  0x3fd0000000000000, 0x3fd4000000000000, 0x3fd8000000000000, 0x3fdc000000000000,
>> > +  0x3fe0000000000000, 0x3fe4000000000000, 0x3fe8000000000000, 0x3fec000000000000,
>> > +  0x3ff0000000000000, 0x3ff4000000000000, 0x3ff8000000000000, 0x3ffc000000000000,
>> > +  0x4000000000000000, 0x4004000000000000, 0x4008000000000000, 0x4010000000000000,
>> > +  0x4020000000000000, 0x4030000000000000, 0x4060000000000000, 0x4070000000000000,
>> > +  0x40e0000000000000, 0x40f0000000000000, 0x7ff0000000000000, 0x7ff8000000000000,
>> > +};
>> > +
>> > +const char *fli_value_print[32] =
>> > +{
>> > +  "-1.0", "min", "1.52587890625e-05", "3.0517578125e-05", "0.00390625", "0.0078125", "0.0625", "0.125",
>> > +  "0.25", "0.3125", "0.375", "0.4375", "0.5", "0.625", "0.75", "0.875",
>> > +  "1.0", "1.25", "1.5", "1.75", "2.0", "2.5", "3.0", "4.0",
>> > +  "8.0", "16.0", "128.0", "256.0", "32768.0", "65536.0", "inf", "nan"
>> > +};
>> > +
>> > +/* Find the index of TARGET in ARRAY, and return -1 if not found. */
>> > +
>> > +static int
>> > +find_index_in_array (unsigned HOST_WIDE_INT target, unsigned HOST_WIDE_INT *array, int len)
>> > +{
>> > +  if (array == NULL)
>> > +    return -1;
>> > +
>> > +  for (int i = 0; i < len; i++)
>> > +    {
>> > +      if (target == array[i])
>> > +       return i;
>> > +    }
>> > +  return -1;
>> > +}
>> > +
>> > +/* Return the index into the FLI table if rtx X is a constant that can be
>> > +   loaded with a single FLI instruction of the Zfa extension, or -1 otherwise.  */
>> > +
>> > +int
>> > +riscv_float_const_rtx_index_for_fli (rtx x)
>> > +{
>> > +  machine_mode mode = GET_MODE (x);
>> > +
>> > +  if (!TARGET_ZFA || mode == VOIDmode
>> > +      || !CONST_DOUBLE_P (x)
>> > +      || (mode == HFmode && !TARGET_ZFH)
>> > +      || (mode == SFmode && !TARGET_HARD_FLOAT)
>> > +      || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
>> > +    return -1;
>> > +
>> > +  if (!SCALAR_FLOAT_MODE_P (mode)
>> > +      || GET_MODE_BITSIZE (mode).to_constant () > HOST_BITS_PER_WIDE_INT
>> > +      /* Only support up to DF mode.  */
>> > +      || GET_MODE_BITSIZE (mode).to_constant () > GET_MODE_BITSIZE (DFmode))
>> > +    return -1;
>> > +
>> > +  unsigned HOST_WIDE_INT ival = 0;
>> > +
>> > +  long res[2];
>> > +  real_to_target (res,
>> > +                 CONST_DOUBLE_REAL_VALUE (x),
>> > +                 REAL_MODE_FORMAT (mode));
>> > +
>> > +  if (mode == DFmode)
>> > +    {
>> > +      int order = BYTES_BIG_ENDIAN ? 1 : 0;
>> > +      ival = zext_hwi (res[order], 32);
>> > +      ival |= (zext_hwi (res[1 - order], 32) << 32);
>> > +    }
>> > +  else
>> > +    ival = zext_hwi (res[0], 32);
>> > +
>> > +  switch (mode)
>> > +    {
>> > +      case SFmode:
>> > +       return find_index_in_array (ival, fli_value_sf, 32);
>> > +      case DFmode:
>> > +       return find_index_in_array (ival, fli_value_df, 32);
>> > +      case HFmode:
>> > +       return find_index_in_array (ival, fli_value_hf, 32);
>> > +      default:
>> > +       break;
>> > +    }
>> > +  return -1;
>> > +}
>> > +
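To make the lookup above easier to follow outside of GCC, here is a minimal standalone sketch of the single-precision case. It only assumes that `float` is IEEE 754 binary32 on the host. The two tables are copied from fli_value_sf and fli_value_print in the patch; the names fli_sf_bits, fli_print, fli_index_for_float and fli_operand_for_float are mine and do not exist in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit patterns of the 32 FLI.S constants, copied from fli_value_sf.  */
static const uint32_t fli_sf_bits[32] = {
  0xbf800000, 0x00800000, 0x37800000, 0x38000000,
  0x3b800000, 0x3c000000, 0x3d800000, 0x3e000000,
  0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000,
  0x3f000000, 0x3f200000, 0x3f400000, 0x3f600000,
  0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000,
  0x40000000, 0x40200000, 0x40400000, 0x40800000,
  0x41000000, 0x41800000, 0x43000000, 0x43800000,
  0x47000000, 0x47800000, 0x7f800000, 0x7fc00000
};

/* Assembly spellings of the same constants, copied from fli_value_print.  */
static const char *const fli_print[32] = {
  "-1.0", "min", "1.52587890625e-05", "3.0517578125e-05",
  "0.00390625", "0.0078125", "0.0625", "0.125",
  "0.25", "0.3125", "0.375", "0.4375", "0.5", "0.625", "0.75", "0.875",
  "1.0", "1.25", "1.5", "1.75", "2.0", "2.5", "3.0", "4.0",
  "8.0", "16.0", "128.0", "256.0", "32768.0", "65536.0", "inf", "nan"
};

/* Hypothetical helper: return the table index whose bit pattern equals
   VALUE, or -1 if VALUE is not one of the 32 loadable constants.  */
static int
fli_index_for_float (float value)
{
  uint32_t bits;

  assert (sizeof bits == sizeof value);
  memcpy (&bits, &value, sizeof bits);  /* Reinterpret the bits safely.  */

  for (int i = 0; i < 32; i++)
    if (fli_sf_bits[i] == bits)
      return i;
  return -1;
}

/* Hypothetical helper: return the operand text that would be printed
   after "fli.s rd,", or NULL if VALUE cannot be loaded by FLI.S.  */
static const char *
fli_operand_for_float (float value)
{
  int index = fli_index_for_float (value);
  return index < 0 ? NULL : fli_print[index];
}
```

The match is on the exact bit pattern, so a value such as 1.1f, or 0.0f, is rejected and would still have to be materialized some other way.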
>> >  /* Implement TARGET_LEGITIMATE_CONSTANT_P.  */
>> >
>> >  static bool
>> > @@ -826,6 +936,9 @@ riscv_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
>> >    if (GET_CODE (x) == HIGH)
>> >      return true;
>> >
>> > +  if (riscv_float_const_rtx_index_for_fli (x) != -1)
>> > +    return true;
>> > +
>> >    split_const (x, &base, &offset);
>> >    if (riscv_symbolic_constant_p (base, &type))
>> >      {
>> > @@ -1213,6 +1326,11 @@ riscv_const_insns (rtx x)
>> >        }
>> >
>> >      case CONST_DOUBLE:
>> > +      /* See if we can use FLI directly.  */
>> > +      if (riscv_float_const_rtx_index_for_fli (x) != -1)
>> > +       return 3;
>> > +      /* Fall through.  */
>> > +
>> >      case CONST_VECTOR:
>> >        /* We can use x0 to load floating-point zero.  */
>> >        return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
>> > @@ -1749,6 +1867,12 @@ riscv_legitimize_const_move (machine_mode mode, rtx dest, rtx src)
>> >        return;
>> >      }
>> >
>> > +  if (riscv_float_const_rtx_index_for_fli (src) != -1)
>> > +    {
>> > +      riscv_emit_set (dest, src);
>> > +      return;
>> > +    }
>> > +
>> >    /* Split moves of symbolic constants into high/low pairs.  */
>> >    if (riscv_split_symbol (dest, src, MAX_MACHINE_MODE, &src, FALSE))
>> >      {
>> > @@ -2770,12 +2894,19 @@ riscv_split_64bit_move_p (rtx dest, rtx src)
>> >    if (TARGET_64BIT)
>> >      return false;
>> >
>> > +  /* There is no need to split if the FLI instruction in the `Zfa` extension can be used. */
>> > +  if (riscv_float_const_rtx_index_for_fli (src) != -1)
>> > +    return false;
>> > +
>> >    /* Allow FPR <-> FPR and FPR <-> MEM moves, and permit the special case
>> >       of zeroing an FPR with FCVT.D.W.  */
>> >    if (TARGET_DOUBLE_FLOAT
>> >        && ((FP_REG_RTX_P (src) && FP_REG_RTX_P (dest))
>> >           || (FP_REG_RTX_P (dest) && MEM_P (src))
>> >           || (FP_REG_RTX_P (src) && MEM_P (dest))
>> > +         || (TARGET_ZFA
>> > +             && ((FP_REG_RTX_P (dest) && GP_REG_RTX_P (src))
>> > +             || (FP_REG_RTX_P (src) && GP_REG_RTX_P (dest))))
>> >           || (FP_REG_RTX_P (dest) && src == CONST0_RTX (GET_MODE (src)))))
>> >      return false;
>> >
>> > @@ -2857,6 +2988,8 @@ riscv_output_move (rtx dest, rtx src)
>> >           case 4:
>> >             return "fmv.x.s\t%0,%1";
>> >           case 8:
>> > +           if (!TARGET_64BIT && TARGET_ZFA)
>> > +             return "fmv.x.w\t%0,%1\n\tfmvh.x.d\t%N0,%1";
>> >             return "fmv.x.d\t%0,%1";
>> >           }
>> >
>> > @@ -2916,6 +3049,8 @@ riscv_output_move (rtx dest, rtx src)
>> >               case 8:
>> >                 if (TARGET_64BIT)
>> >                   return "fmv.d.x\t%0,%z1";
>> > +               else if (TARGET_ZFA && src != CONST0_RTX (mode))
>> > +                 return "fmvp.d.x\t%0,%1,%N1";
>> >                 /* in RV32, we can emulate fmv.d.x %0, x0 using fcvt.d.w */
>> >                 gcc_assert (src == CONST0_RTX (mode));
>> >                 return "fcvt.d.w\t%0,x0";
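For readers less familiar with the RV32 side of Zfa: as I read the spec, FMV.X.W plus FMVH.X.D copy the low and high 32-bit halves of a double into an integer register pair, and FMVP.D.X builds a double from such a pair. The sketch below models only that data movement in portable C, assuming the host `double` is IEEE 754 binary64. The function names are mine and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bits 31:0 of VALUE, i.e. what "fmv.x.w %0,%1" leaves in the first
   integer register of the pair in the RV32 sequence above.  */
static uint32_t
double_low_word (double value)
{
  uint64_t bits;

  assert (sizeof bits == sizeof value);
  memcpy (&bits, &value, sizeof bits);
  return (uint32_t) (bits & 0xffffffffu);
}

/* Bits 63:32 of VALUE, i.e. what "fmvh.x.d %N0,%1" leaves in the
   second integer register of the pair.  */
static uint32_t
double_high_word (double value)
{
  uint64_t bits;

  memcpy (&bits, &value, sizeof bits);
  return (uint32_t) (bits >> 32);
}

/* The reverse direction, modelling "fmvp.d.x %0,%1,%N1": LOW supplies
   bits 31:0 and HIGH supplies bits 63:32 of the result.  */
static double
double_from_words (uint32_t low, uint32_t high)
{
  uint64_t bits = ((uint64_t) high << 32) | low;
  double value;

  memcpy (&value, &bits, sizeof value);
  return value;
}
```

For example, 1.0 is 0x3ff0000000000000, so its low word is 0 and its high word is 0x3ff00000; feeding both words back through double_from_words gives 1.0 again.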
>> > @@ -2968,6 +3103,14 @@ riscv_output_move (rtx dest, rtx src)
>> >           case 8:
>> >             return "fld\t%0,%1";
>> >           }
>> > +
>> > +      if (src_code == CONST_DOUBLE && (riscv_float_const_rtx_index_for_fli (src) != -1))
>> > +       switch (width)
>> > +         {
>> > +           case 2: return "fli.h\t%0,%1";
>> > +           case 4: return "fli.s\t%0,%1";
>> > +           case 8: return "fli.d\t%0,%1";
>> > +         }
>> >      }
>> >    if (dest_code == REG && GP_REG_P (REGNO (dest)) && src_code == CONST_POLY_INT)
>> >      {
>> > @@ -4349,6 +4492,7 @@ riscv_memmodel_needs_release_fence (enum memmodel model)
>> >     'S' Print shift-index of single-bit mask OP.
>> >     'T' Print shift-index of inverted single-bit mask OP.
>> >     '~' Print w if TARGET_64BIT is true; otherwise not print anything.
>> > +   'N'  Print next register.
>> >
>> >     Note please keep this list and the list in riscv.md in sync.  */
>> >
>> > @@ -4533,6 +4677,9 @@ riscv_print_operand (FILE *file, rtx op, int letter)
>> >         output_addr_const (file, newop);
>> >         break;
>> >        }
>> > +    case 'N':
>> > +      fputs (reg_names[REGNO (op) + 1], file);
>> > +      break;
>> >      default:
>> >        switch (code)
>> >         {
>> > @@ -4549,6 +4696,24 @@ riscv_print_operand (FILE *file, rtx op, int letter)
>> >             output_address (mode, XEXP (op, 0));
>> >           break;
>> >
>> > +       case CONST_DOUBLE:
>> > +         {
>> > +           if (letter == 'z' && op == CONST0_RTX (GET_MODE (op)))
>> > +             {
>> > +               fputs (reg_names[GP_REG_FIRST], file);
>> > +               break;
>> > +             }
>> > +
>> > +           int fli_index = riscv_float_const_rtx_index_for_fli (op);
>> > +           if (fli_index == -1 || fli_index > 31)
>> > +             {
>> > +               output_operand_lossage ("invalid use of '%%%c'", letter);
>> > +               break;
>> > +             }
>> > +           asm_fprintf (file, "%s", fli_value_print[fli_index]);
>> > +           break;
>> > +         }
>> > +
>> >         default:
>> >           if (letter == 'z' && op == CONST0_RTX (GET_MODE (op)))
>> >             fputs (reg_names[GP_REG_FIRST], file);
>> > @@ -5897,7 +6062,8 @@ riscv_secondary_memory_needed (machine_mode mode, reg_class_t class1,
>> >    return (!riscv_v_ext_vector_mode_p (mode)
>> >           && GET_MODE_SIZE (mode).to_constant () > UNITS_PER_WORD
>> >           && (class1 == FP_REGS) != (class2 == FP_REGS)
>> > -         && !TARGET_XTHEADFMV);
>> > +         && !TARGET_XTHEADFMV
>> > +         && !TARGET_ZFA);
>> >  }
>> >
>> >  /* Implement TARGET_REGISTER_MOVE_COST.  */
>> > diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
>> > index 66fb07d6652..d438b281142 100644
>> > --- a/gcc/config/riscv/riscv.h
>> > +++ b/gcc/config/riscv/riscv.h
>> > @@ -377,6 +377,7 @@ ASM_MISA_SPEC
>> >  #define SIBCALL_REG_P(REGNO)   \
>> >    TEST_HARD_REG_BIT (reg_class_contents[SIBCALL_REGS], REGNO)
>> >
>> > +#define GP_REG_RTX_P(X) (REG_P (X) && GP_REG_P (REGNO (X)))
>> >  #define FP_REG_RTX_P(X) (REG_P (X) && FP_REG_P (REGNO (X)))
>> >
>> >  /* Use s0 as the frame pointer if it is so requested.  */
>> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
>> > index bc384d9aedf..f22e71b5a3a 100644
>> > --- a/gcc/config/riscv/riscv.md
>> > +++ b/gcc/config/riscv/riscv.md
>> > @@ -59,6 +59,15 @@ (define_c_enum "unspec" [
>> >    UNSPEC_LROUND
>> >    UNSPEC_FMIN
>> >    UNSPEC_FMAX
>> > +  UNSPEC_RINT
>> > +  UNSPEC_ROUND
>> > +  UNSPEC_FLOOR
>> > +  UNSPEC_CEIL
>> > +  UNSPEC_BTRUNC
>> > +  UNSPEC_ROUNDEVEN
>> > +  UNSPEC_NEARBYINT
>> > +  UNSPEC_FMINM
>> > +  UNSPEC_FMAXM
>> >
>> >    ;; Stack tie
>> >    UNSPEC_TIE
>> > @@ -1232,6 +1241,26 @@ (define_insn "neg<mode>2"
>> >  ;;
>> >  ;;  ....................
>> >
>> > +(define_insn "fminm<mode>3"
>> > +  [(set (match_operand:ANYF                    0 "register_operand" "=f")
>> > +       (unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
>> > +                     (use (match_operand:ANYF 2 "register_operand" " f"))]
>> > +                    UNSPEC_FMINM))]
>> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
>> > +  "fminm.<fmt>\t%0,%1,%2"
>> > +  [(set_attr "type" "fmove")
>> > +   (set_attr "mode" "<UNITMODE>")])
>> > +
>> > +(define_insn "fmaxm<mode>3"
>> > +  [(set (match_operand:ANYF                    0 "register_operand" "=f")
>> > +       (unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
>> > +                     (use (match_operand:ANYF 2 "register_operand" " f"))]
>> > +                    UNSPEC_FMAXM))]
>> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
>> > +  "fmaxm.<fmt>\t%0,%1,%2"
>> > +  [(set_attr "type" "fmove")
>> > +   (set_attr "mode" "<UNITMODE>")])
>> > +
>> >  (define_insn "fmin<mode>3"
>> >    [(set (match_operand:ANYF                    0 "register_operand" "=f")
>> >         (unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
>> > @@ -1508,13 +1537,13 @@ (define_expand "movhf"
>> >  })
>> >
>> >  (define_insn "*movhf_hardfloat"
>> > -  [(set (match_operand:HF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
>> > -       (match_operand:HF 1 "move_operand"         " f,G,m,f,G,*r,*f,*G*r,*m,*r"))]
>> > +  [(set (match_operand:HF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r,  *r,*r,*m")
>> > +       (match_operand:HF 1 "move_operand"         " f,Zf,G,m,f,G,*r,*f,*G*r,*m,*r"))]
>> >    "TARGET_ZFHMIN
>> >     && (register_operand (operands[0], HFmode)
>> >         || reg_or_0_operand (operands[1], HFmode))"
>> >    { return riscv_output_move (operands[0], operands[1]); }
>> > -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>> > +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>> >     (set_attr "mode" "HF")])
>> >
>> >  (define_insn "*movhf_softfloat"
>> > @@ -1580,6 +1609,26 @@ (define_insn "l<rint_pattern><ANYF:mode><GPR:mode>2"
>> >    [(set_attr "type" "fcvt")
>> >     (set_attr "mode" "<ANYF:MODE>")])
>> >
>> > +(define_insn "<round_pattern><ANYF:mode>2"
>> > +  [(set (match_operand:ANYF     0 "register_operand" "=f")
>> > +       (unspec:ANYF
>> > +           [(match_operand:ANYF 1 "register_operand" " f")]
>> > +       ROUND))]
>> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
>> > +  "fround.<ANYF:fmt>\t%0,%1,<round_rm>"
>> > +  [(set_attr "type" "fcvt")
>> > +   (set_attr "mode" "<ANYF:MODE>")])
>> > +
>> > +(define_insn "rint<ANYF:mode>2"
>> > +  [(set (match_operand:ANYF     0 "register_operand" "=f")
>> > +       (unspec:ANYF
>> > +           [(match_operand:ANYF 1 "register_operand" " f")]
>> > +       UNSPEC_RINT))]
>> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
>> > +  "froundnx.<ANYF:fmt>\t%0,%1"
>> > +  [(set_attr "type" "fcvt")
>> > +   (set_attr "mode" "<ANYF:MODE>")])
>> > +
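A small sketch of how I read the ROUND iterator together with round_pattern and round_rm: each pattern name selects a static rounding-mode operand for fround, while rint goes to froundnx with no such operand. The table is copied from the iterators.md hunk; the struct and both function names are made up for this note, and the register names in the usage line are arbitrary.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One entry per member of the ROUND iterator, copied from the
   round_pattern and round_rm attributes in the patch.  */
struct round_entry
{
  const char *pattern;  /* Standard pattern name, e.g. "floor".  */
  const char *rm;       /* Rounding-mode operand, e.g. "rdn".  */
};

static const struct round_entry round_table[] = {
  { "round", "rmm" },     { "floor", "rdn" },     { "ceil", "rup" },
  { "btrunc", "rtz" },    { "roundeven", "rne" }, { "nearbyint", "dyn" }
};

/* Hypothetical helper: return the rounding-mode operand for PATTERN,
   or NULL if PATTERN is not covered by the ROUND iterator.  */
static const char *
rounding_mode_for (const char *pattern)
{
  size_t count = sizeof round_table / sizeof round_table[0];

  for (size_t i = 0; i < count; i++)
    if (strcmp (round_table[i].pattern, pattern) == 0)
      return round_table[i].rm;
  return NULL;
}

/* Hypothetical helper: write the assembly line that the output template
   "fround.<fmt>\t%0,%1,<round_rm>" would produce into BUF.  FMT is 'h',
   's' or 'd'.  Return 0 on success, -1 for an unknown PATTERN or a
   buffer that is too small.  */
static int
format_fround (char *buf, size_t size, const char *pattern, char fmt,
               const char *rd, const char *rs)
{
  const char *rm = rounding_mode_for (pattern);

  if (rm == NULL)
    return -1;

  int written = snprintf (buf, size, "fround.%c\t%s,%s,%s", fmt, rd, rs, rm);
  return (written < 0 || (size_t) written >= size) ? -1 : 0;
}
```

With these, format_fround (buf, sizeof buf, "floor", 'd', "fa0", "fa0") fills buf with "fround.d", a tab, and "fa0,fa0,rdn"; "rint" is deliberately rejected because it is handled by the separate froundnx pattern.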
>> >  ;;
>> >  ;;  ....................
>> >  ;;
>> > @@ -1839,13 +1888,13 @@ (define_expand "movsf"
>> >  })
>> >
>> >  (define_insn "*movsf_hardfloat"
>> > -  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
>> > -       (match_operand:SF 1 "move_operand"         " f,G,m,f,G,*r,*f,*G*r,*m,*r"))]
>> > +  [(set (match_operand:SF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r,  *r,*r,*m")
>> > +       (match_operand:SF 1 "move_operand"         " f,Zf,G,m,f,G,*r,*f,*G*r,*m,*r"))]
>> >    "TARGET_HARD_FLOAT
>> >     && (register_operand (operands[0], SFmode)
>> >         || reg_or_0_operand (operands[1], SFmode))"
>> >    { return riscv_output_move (operands[0], operands[1]); }
>> > -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>> > +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>> >     (set_attr "mode" "SF")])
>> >
>> >  (define_insn "*movsf_softfloat"
>> > @@ -1873,23 +1922,23 @@ (define_expand "movdf"
>> >  ;; In RV32, we lack fmv.x.d and fmv.d.x.  Go through memory instead.
>> >  ;; (However, we can still use fcvt.d.w to zero a floating-point register.)
>> >  (define_insn "*movdf_hardfloat_rv32"
>> > -  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,m,*th_f_fmv,*th_r_fmv,  *r,*r,*m")
>> > -       (match_operand:DF 1 "move_operand"         " f,G,m,f,G,*th_r_fmv,*th_f_fmv,*r*G,*m,*r"))]
>> > +  [(set (match_operand:DF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*th_f_fmv,*th_r_fmv,  *r,*r,*m")
>> > +       (match_operand:DF 1 "move_operand"         " f,Zf,G,m,f,G,*th_r_fmv,*th_f_fmv,*r*G,*m,*r"))]
>> >    "!TARGET_64BIT && TARGET_DOUBLE_FLOAT
>> >     && (register_operand (operands[0], DFmode)
>> >         || reg_or_0_operand (operands[1], DFmode))"
>> >    { return riscv_output_move (operands[0], operands[1]); }
>> > -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>> > +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>> >     (set_attr "mode" "DF")])
>> >
>> >  (define_insn "*movdf_hardfloat_rv64"
>> > -  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
>> > -       (match_operand:DF 1 "move_operand"         " f,G,m,f,G,*r,*f,*r*G,*m,*r"))]
>> > +  [(set (match_operand:DF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r,  *r,*r,*m")
>> > +       (match_operand:DF 1 "move_operand"         " f,Zf,G,m,f,G,*r,*f,*r*G,*m,*r"))]
>> >    "TARGET_64BIT && TARGET_DOUBLE_FLOAT
>> >     && (register_operand (operands[0], DFmode)
>> >         || reg_or_0_operand (operands[1], DFmode))"
>> >    { return riscv_output_move (operands[0], operands[1]); }
>> > -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>> > +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>> >     (set_attr "mode" "DF")])
>> >
>> >  (define_insn "*movdf_softfloat"
>> > @@ -2494,16 +2543,23 @@ (define_expand "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4"
>> >    rtx op0 = operands[0];
>> >    rtx op1 = operands[1];
>> >    rtx op2 = operands[2];
>> > -  rtx tmp = gen_reg_rtx (SImode);
>> > -  rtx cmp = gen_rtx_<QUIET_PATTERN> (<X:MODE>mode, op1, op2);
>> > -  rtx frflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, const0_rtx),
>> > -                                        UNSPECV_FRFLAGS);
>> > -  rtx fsflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, tmp),
>> > -                                        UNSPECV_FSFLAGS);
>> > -
>> > -  emit_insn (gen_rtx_SET (tmp, frflags));
>> > -  emit_insn (gen_rtx_SET (op0, cmp));
>> > -  emit_insn (fsflags);
>> > +
>> > +  if (TARGET_ZFA)
>> > +    emit_insn (gen_f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa (op0, op1, op2));
>> > +  else
>> > +    {
>> > +      rtx tmp = gen_reg_rtx (SImode);
>> > +      rtx cmp = gen_rtx_<QUIET_PATTERN> (<X:MODE>mode, op1, op2);
>> > +      rtx frflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, const0_rtx),
>> > +                                            UNSPECV_FRFLAGS);
>> > +      rtx fsflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, tmp),
>> > +                                            UNSPECV_FSFLAGS);
>> > +
>> > +      emit_insn (gen_rtx_SET (tmp, frflags));
>> > +      emit_insn (gen_rtx_SET (op0, cmp));
>> > +      emit_insn (fsflags);
>> > +    }
>> > +
>> >    if (HONOR_SNANS (<ANYF:MODE>mode))
>> >      emit_insn (gen_rtx_UNSPEC_VOLATILE (<ANYF:MODE>mode,
>> >                                         gen_rtvec (2, op1, op2),
>> > @@ -2511,6 +2567,18 @@ (define_expand "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4"
>> >    DONE;
>> >  })
>> >
>> > +(define_insn "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa"
>> > +   [(set (match_operand:X      0 "register_operand" "=r")
>> > +        (unspec:X
>> > +         [(match_operand:ANYF 1 "register_operand" " f")
>> > +          (match_operand:ANYF 2 "register_operand" " f")]
>> > +         QUIET_COMPARISON))]
>> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
>> > +  "f<quiet_pattern>q.<fmt>\t%0,%1,%2"
>> > +  [(set_attr "type" "fcmp")
>> > +   (set_attr "mode" "<UNITMODE>")
>> > +   (set (attr "length") (const_int 16))])
>> > +
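As a reminder of the semantics these quiet patterns have to provide, here is a minimal C sketch using the standard isless and islessequal macros, which compare without raising the invalid exception on quiet NaN operands and yield false when either operand is a NaN. My understanding is that with Zfa each of them can map to a single fltq/fleq instruction instead of the frflags/compare/fsflags sequence in the else branch. The wrapper names are mine.

```c
#include <assert.h>
#include <math.h>

/* Quiet "a < b": false if either operand is a NaN.  */
static int
quiet_less (double a, double b)
{
  return isless (a, b) != 0;
}

/* Quiet "a <= b": false if either operand is a NaN.  */
static int
quiet_less_equal (double a, double b)
{
  return islessequal (a, b) != 0;
}
```

So quiet_less (1.0, 2.0) is 1, while quiet_less (NAN, 2.0) and quiet_less_equal (2.0, NAN) are both 0.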
>> >  (define_insn "*seq_zero_<X:mode><GPR:mode>"
>> >    [(set (match_operand:GPR       0 "register_operand" "=r")
>> >         (eq:GPR (match_operand:X 1 "register_operand" " r")
>> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
>> > new file mode 100644
>> > index 00000000000..26895b76fa4
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
>> > @@ -0,0 +1,19 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O2" } */
>> > +
>> > +extern void abort(void);
>> > +extern float a, b;
>> > +extern double c, d;
>> > +
>> > +void
>> > +foo()
>> > +{
>> > +  if ((__builtin_isless(a, b) ||  __builtin_islessequal(c, d))
>> > +      && (__builtin_islessequal(a, b)|| __builtin_isless(c, d)))
>> > +    abort();
>> > +}
>> > +
>> > +/* { dg-final { scan-assembler-times "fleq.s" 1 } } */
>> > +/* { dg-final { scan-assembler-times "fltq.s" 1 } } */
>> > +/* { dg-final { scan-assembler-times "fleq.d" 1 } } */
>> > +/* { dg-final { scan-assembler-times "fltq.d" 1 } } */
>> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
>> > new file mode 100644
>> > index 00000000000..4ccd6a7dd78
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
>> > @@ -0,0 +1,19 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O2" } */
>> > +
>> > +extern void abort(void);
>> > +extern float a, b;
>> > +extern double c, d;
>> > +
>> > +void
>> > +foo()
>> > +{
>> > +  if ((__builtin_isless(a, b) ||  __builtin_islessequal(c, d))
>> > +      && (__builtin_islessequal(a, b)|| __builtin_isless(c, d)))
>> > +    abort();
>> > +}
>> > +
>> > +/* { dg-final { scan-assembler-times "fleq.s" 1 } } */
>> > +/* { dg-final { scan-assembler-times "fltq.s" 1 } } */
>> > +/* { dg-final { scan-assembler-times "fleq.d" 1 } } */
>> > +/* { dg-final { scan-assembler-times "fltq.d" 1 } } */
>> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
>> > new file mode 100644
>> > index 00000000000..c4da04797aa
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
>> > @@ -0,0 +1,79 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O0" } */
>> > +
>> > +void foo_float32 ()
>> > +{
>> > +  volatile float a;
>> > +  a = -1.0;
>> > +  a = 1.1754944e-38;
>> > +  a = 1.0/(1 << 16);
>> > +  a = 1.0/(1 << 15);
>> > +  a = 1.0/(1 << 8);
>> > +  a = 1.0/(1 << 7);
>> > +  a = 1.0/(1 << 4);
>> > +  a = 1.0/(1 << 3);
>> > +  a = 1.0/(1 << 2);
>> > +  a = 0.3125;
>> > +  a = 0.375;
>> > +  a = 0.4375;
>> > +  a = 0.5;
>> > +  a = 0.625;
>> > +  a = 0.75;
>> > +  a = 0.875;
>> > +  a = 1.0;
>> > +  a = 1.25;
>> > +  a = 1.5;
>> > +  a = 1.75;
>> > +  a = 2.0;
>> > +  a = 2.5;
>> > +  a = 3.0;
>> > +  a = 1.0*(1 << 2);
>> > +  a = 1.0*(1 << 3);
>> > +  a = 1.0*(1 << 4);
>> > +  a = 1.0*(1 << 7);
>> > +  a = 1.0*(1 << 8);
>> > +  a = 1.0*(1 << 15);
>> > +  a = 1.0*(1 << 16);
>> > +  a = __builtin_inff ();
>> > +  a = __builtin_nanf ("");
>> > +}
>> > +
>> > +void foo_double64 ()
>> > +{
>> > +  volatile double a;
>> > +  a = -1.0;
>> > +  a = 2.2250738585072014E-308;
>> > +  a = 1.0/(1 << 16);
>> > +  a = 1.0/(1 << 15);
>> > +  a = 1.0/(1 << 8);
>> > +  a = 1.0/(1 << 7);
>> > +  a = 1.0/(1 << 4);
>> > +  a = 1.0/(1 << 3);
>> > +  a = 1.0/(1 << 2);
>> > +  a = 0.3125;
>> > +  a = 0.375;
>> > +  a = 0.4375;
>> > +  a = 0.5;
>> > +  a = 0.625;
>> > +  a = 0.75;
>> > +  a = 0.875;
>> > +  a = 1.0;
>> > +  a = 1.25;
>> > +  a = 1.5;
>> > +  a = 1.75;
>> > +  a = 2.0;
>> > +  a = 2.5;
>> > +  a = 3.0;
>> > +  a = 1.0*(1 << 2);
>> > +  a = 1.0*(1 << 3);
>> > +  a = 1.0*(1 << 4);
>> > +  a = 1.0*(1 << 7);
>> > +  a = 1.0*(1 << 8);
>> > +  a = 1.0*(1 << 15);
>> > +  a = 1.0*(1 << 16);
>> > +  a = __builtin_inf ();
>> > +  a = __builtin_nan ("");
>> > +}
>> > +
>> > +/* { dg-final { scan-assembler-times "fli.s" 32 } } */
>> > +/* { dg-final { scan-assembler-times "fli.d" 32 } } */
>> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
>> > new file mode 100644
>> > index 00000000000..bcffe9d2c82
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
>> > @@ -0,0 +1,41 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-march=rv32imafdc_zfa_zfh -mabi=ilp32d -O0" } */
>> > +
>> > +void foo_float16 ()
>> > +{
>> > +  volatile _Float16 a;
>> > +  a = -1.0;
>> > +  a = 6.104E-5;
>> > +  a = 1.0/(1 << 16);
>> > +  a = 1.0/(1 << 15);
>> > +  a = 1.0/(1 << 8);
>> > +  a = 1.0/(1 << 7);
>> > +  a = 1.0/(1 << 4);
>> > +  a = 1.0/(1 << 3);
>> > +  a = 1.0/(1 << 2);
>> > +  a = 0.3125;
>> > +  a = 0.375;
>> > +  a = 0.4375;
>> > +  a = 0.5;
>> > +  a = 0.625;
>> > +  a = 0.75;
>> > +  a = 0.875;
>> > +  a = 1.0;
>> > +  a = 1.25;
>> > +  a = 1.5;
>> > +  a = 1.75;
>> > +  a = 2.0;
>> > +  a = 2.5;
>> > +  a = 3.0;
>> > +  a = 1.0*(1 << 2);
>> > +  a = 1.0*(1 << 3);
>> > +  a = 1.0*(1 << 4);
>> > +  a = 1.0*(1 << 7);
>> > +  a = 1.0*(1 << 8);
>> > +  a = 1.0*(1 << 15);
>> > +  a = 1.0*(1 << 16);
>> > +  a = __builtin_inff16 ();
>> > +  a = __builtin_nanf16 ("");
>> > +}
>> > +
>> > +/* { dg-final { scan-assembler-times "fli.h" 32 } } */
>> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
>> > new file mode 100644
>> > index 00000000000..13aa7b5f846
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
>> > @@ -0,0 +1,41 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-march=rv64imafdc_zfa_zfh -mabi=lp64d -O0" } */
>> > +
>> > +void foo_float16 ()
>> > +{
>> > +  volatile _Float16 a;
>> > +  a = -1.0;
>> > +  a = 6.104E-5;
>> > +  a = 1.0/(1 << 16);
>> > +  a = 1.0/(1 << 15);
>> > +  a = 1.0/(1 << 8);
>> > +  a = 1.0/(1 << 7);
>> > +  a = 1.0/(1 << 4);
>> > +  a = 1.0/(1 << 3);
>> > +  a = 1.0/(1 << 2);
>> > +  a = 0.3125;
>> > +  a = 0.375;
>> > +  a = 0.4375;
>> > +  a = 0.5;
>> > +  a = 0.625;
>> > +  a = 0.75;
>> > +  a = 0.875;
>> > +  a = 1.0;
>> > +  a = 1.25;
>> > +  a = 1.5;
>> > +  a = 1.75;
>> > +  a = 2.0;
>> > +  a = 2.5;
>> > +  a = 3.0;
>> > +  a = 1.0*(1 << 2);
>> > +  a = 1.0*(1 << 3);
>> > +  a = 1.0*(1 << 4);
>> > +  a = 1.0*(1 << 7);
>> > +  a = 1.0*(1 << 8);
>> > +  a = 1.0*(1 << 15);
>> > +  a = 1.0*(1 << 16);
>> > +  a = __builtin_inff16 ();
>> > +  a = __builtin_nanf16 ("");
>> > +}
>> > +
>> > +/* { dg-final { scan-assembler-times "fli.h" 32 } } */
>> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli.c b/gcc/testsuite/gcc.target/riscv/zfa-fli.c
>> > new file mode 100644
>> > index 00000000000..b6d41cf460f
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli.c
>> > @@ -0,0 +1,79 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O0" } */
>> > +
>> > +void foo_float32 ()
>> > +{
>> > +  volatile float a;
>> > +  a = -1.0;
>> > +  a = 1.1754944e-38;
>> > +  a = 1.0/(1 << 16);
>> > +  a = 1.0/(1 << 15);
>> > +  a = 1.0/(1 << 8);
>> > +  a = 1.0/(1 << 7);
>> > +  a = 1.0/(1 << 4);
>> > +  a = 1.0/(1 << 3);
>> > +  a = 1.0/(1 << 2);
>> > +  a = 0.3125;
>> > +  a = 0.375;
>> > +  a = 0.4375;
>> > +  a = 0.5;
>> > +  a = 0.625;
>> > +  a = 0.75;
>> > +  a = 0.875;
>> > +  a = 1.0;
>> > +  a = 1.25;
>> > +  a = 1.5;
>> > +  a = 1.75;
>> > +  a = 2.0;
>> > +  a = 2.5;
>> > +  a = 3.0;
>> > +  a = 1.0*(1 << 2);
>> > +  a = 1.0*(1 << 3);
>> > +  a = 1.0*(1 << 4);
>> > +  a = 1.0*(1 << 7);
>> > +  a = 1.0*(1 << 8);
>> > +  a = 1.0*(1 << 15);
>> > +  a = 1.0*(1 << 16);
>> > +  a = __builtin_inff ();
>> > +  a = __builtin_nanf ("");
>> > +}
>> > +
>> > +void foo_double64 ()
>> > +{
>> > +  volatile double a;
>> > +  a = -1.0;
>> > +  a = 2.2250738585072014E-308;
>> > +  a = 1.0/(1 << 16);
>> > +  a = 1.0/(1 << 15);
>> > +  a = 1.0/(1 << 8);
>> > +  a = 1.0/(1 << 7);
>> > +  a = 1.0/(1 << 4);
>> > +  a = 1.0/(1 << 3);
>> > +  a = 1.0/(1 << 2);
>> > +  a = 0.3125;
>> > +  a = 0.375;
>> > +  a = 0.4375;
>> > +  a = 0.5;
>> > +  a = 0.625;
>> > +  a = 0.75;
>> > +  a = 0.875;
>> > +  a = 1.0;
>> > +  a = 1.25;
>> > +  a = 1.5;
>> > +  a = 1.75;
>> > +  a = 2.0;
>> > +  a = 2.5;
>> > +  a = 3.0;
>> > +  a = 1.0*(1 << 2);
>> > +  a = 1.0*(1 << 3);
>> > +  a = 1.0*(1 << 4);
>> > +  a = 1.0*(1 << 7);
>> > +  a = 1.0*(1 << 8);
>> > +  a = 1.0*(1 << 15);
>> > +  a = 1.0*(1 << 16);
>> > +  a = __builtin_inf ();
>> > +  a = __builtin_nan ("");
>> > +}
>> > +
>> > +/* { dg-final { scan-assembler-times "fli.s" 32 } } */
>> > +/* { dg-final { scan-assembler-times "fli.d" 32 } } */
>> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
>> > new file mode 100644
>> > index 00000000000..5a52adce36a
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
>> > @@ -0,0 +1,10 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-march=rv32g_zfa -mabi=ilp32 -O0" } */
>> > +
>> > +double foo(long long a)
>> > +{
>> > +  return (double)(a + 3);
>> > +}
>> > +
>> > +/* { dg-final { scan-assembler-times "fmvp.d.x" 1 } } */
>> > +/* { dg-final { scan-assembler-times "fmvh.x.d" 1 } } */
>> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
>> > new file mode 100644
>> > index 00000000000..b53601d6e1f
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
>> > @@ -0,0 +1,42 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O2" } */
>> > +
>> > +extern float a;
>> > +extern double b;
>> > +
>> > +void foo (float *x, double *y)
>> > +{
>> > +  {
>> > +    *x = __builtin_roundf (a);
>> > +    *y = __builtin_round (b);
>> > +  }
>> > +  {
>> > +    *x = __builtin_floorf (a);
>> > +    *y = __builtin_floor (b);
>> > +  }
>> > +  {
>> > +    *x = __builtin_ceilf (a);
>> > +    *y = __builtin_ceil (b);
>> > +  }
>> > +  {
>> > +    *x = __builtin_truncf (a);
>> > +    *y = __builtin_trunc (b);
>> > +  }
>> > +  {
>> > +    *x = __builtin_roundevenf (a);
>> > +    *y = __builtin_roundeven (b);
>> > +  }
>> > +  {
>> > +    *x = __builtin_nearbyintf (a);
>> > +    *y = __builtin_nearbyint (b);
>> > +  }
>> > +  {
>> > +    *x = __builtin_rintf (a);
>> > +    *y = __builtin_rint (b);
>> > +  }
>> > +}
>> > +
>> > +/* { dg-final { scan-assembler-times "fround.s" 6 } } */
>> > +/* { dg-final { scan-assembler-times "fround.d" 6 } } */
>> > +/* { dg-final { scan-assembler-times "froundnx.s" 1 } } */
>> > +/* { dg-final { scan-assembler-times "froundnx.d" 1 } } */
>> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fround.c b/gcc/testsuite/gcc.target/riscv/zfa-fround.c
>> > new file mode 100644
>> > index 00000000000..c10de82578e
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fround.c
>> > @@ -0,0 +1,42 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O2" } */
>> > +
>> > +extern float a;
>> > +extern double b;
>> > +
>> > +void foo (float *x, double *y)
>> > +{
>> > +  {
>> > +    *x = __builtin_roundf (a);
>> > +    *y = __builtin_round (b);
>> > +  }
>> > +  {
>> > +    *x = __builtin_floorf (a);
>> > +    *y = __builtin_floor (b);
>> > +  }
>> > +  {
>> > +    *x = __builtin_ceilf (a);
>> > +    *y = __builtin_ceil (b);
>> > +  }
>> > +  {
>> > +    *x = __builtin_truncf (a);
>> > +    *y = __builtin_trunc (b);
>> > +  }
>> > +  {
>> > +    *x = __builtin_roundevenf (a);
>> > +    *y = __builtin_roundeven (b);
>> > +  }
>> > +  {
>> > +    *x = __builtin_nearbyintf (a);
>> > +    *y = __builtin_nearbyint (b);
>> > +  }
>> > +  {
>> > +    *x = __builtin_rintf (a);
>> > +    *y = __builtin_rint (b);
>> > +  }
>> > +}
>> > +
>> > +/* { dg-final { scan-assembler-times "fround.s" 6 } } */
>> > +/* { dg-final { scan-assembler-times "fround.d" 6 } } */
>> > +/* { dg-final { scan-assembler-times "froundnx.s" 1 } } */
>> > +/* { dg-final { scan-assembler-times "froundnx.d" 1 } } */
>> > --
>> > 2.17.1
>> >


* Re: [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2.
  2023-05-05 15:12     ` Palmer Dabbelt
@ 2023-05-05 15:43       ` Christoph Müllner
  0 siblings, 0 replies; 20+ messages in thread
From: Christoph Müllner @ 2023-05-05 15:43 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: jinma, gcc-patches, jeffreyalaw, kito.cheng, Kito Cheng, ijinma

On Fri, May 5, 2023 at 5:13 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
> On Fri, 05 May 2023 08:04:53 PDT (-0700), christoph.muellner@vrull.eu wrote:
> > What I forgot to mention:
> > Zfa is frozen and in public review:
> >   https://groups.google.com/a/groups.riscv.org/g/isa-dev/c/SED4ntBkabg
>
> Thanks, I'd also forgotten to send that out ;).
>
> I think the only blocker here on the specification side is the assembly
> format for FLI?  It looks like the feedback on
> <https://github.com/riscv-non-isa/riscv-asm-manual/pull/85> has been
> pretty minor so far.  It'd be nice to have the docs lined up before
> we merge, but we could always just call it a GNU extension -- we've
> already got a lot of that in assembler land, so I don't think it's that
> big of a deal.

I also don't think that we need to wait for that PR to land.

Nelson already gave his OK on the Binutils v4 (but for after ratification,
not after the freeze):
  https://sourceware.org/pipermail/binutils/2023-April/127027.html

FWIW, I have meanwhile sent out a v5 for Binutils as well (only a few
changes were requested).
The v5 has also been rebased and retested.

>
> >
> > On Fri, May 5, 2023 at 5:03 PM Christoph Müllner
> > <christoph.muellner@vrull.eu> wrote:
> >>
> >> On Wed, Apr 19, 2023 at 11:58 AM Jin Ma <jinma@linux.alibaba.com> wrote:
> >> >
> >> > This patch adds the 'Zfa' extension for riscv, which is based on:
> >> >   https://github.com/riscv/riscv-isa-manual/commits/zfb
> >> >   https://github.com/riscv/riscv-isa-manual/commit/1f038182810727f5feca311072e630d6baac51da
> >> >
> >> > The binutils-gdb for 'Zfa' extension:
> >> >   https://github.com/a4lg/binutils-gdb/commits/riscv-zfa
> >> >
> >> > What needs special explanation is:
> >> > 1, The immediate of the instructions FLI.H/S/D is represented in the assembly as a
> >> >   floating-point value, using scientific notation when rs1 is 1 or 2, and decimal
> >> >   notation for the rest.
> >> >
> >> >   Related llvm link:
> >> >     https://reviews.llvm.org/D145645
> >> >   Related discussion link:
> >> >     https://github.com/riscv/riscv-isa-manual/issues/980
> >> >
> >> > 2, According to riscv-spec, "The FCVTMOD.W.D instruction was added principally to
> >> >   accelerate the processing of JavaScript Numbers.", so it seems that no implementation
> >> >   is required.
> >> >
> >> > 3, The instructions FMINM and FMAXM correspond to the C23 library functions fminimum and fmaximum.
> >> >   Therefore, this patch has simply implemented the pattern of fminm<hf\sf\df>3 and
> >> >   fmaxm<hf\sf\df>3 to prepare for later.
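
As an aside for anyone reading item 1 without the spec at hand: below is a
minimal sketch, not the patch's code, of the lookup that decides whether a
single-precision constant can be loaded by FLI.S and how it is then spelled
in assembly. The two tables are copied from fli_value_sf and fli_value_print
in the quoted diff; the helper names fli_index_for_float and
fli_operand_for_float are made up for illustration, and the sketch assumes a
host where float is IEEE 754 binary32.

```c
#include <stdint.h>
#include <string.h>

/* Bit patterns of the 32 FLI.S immediates, copied from fli_value_sf
   in the quoted patch.  */
static const uint32_t fli_bits_sf[32] = {
  0xbf800000, 0x00800000, 0x37800000, 0x38000000,
  0x3b800000, 0x3c000000, 0x3d800000, 0x3e000000,
  0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000,
  0x3f000000, 0x3f200000, 0x3f400000, 0x3f600000,
  0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000,
  0x40000000, 0x40200000, 0x40400000, 0x40800000,
  0x41000000, 0x41800000, 0x43000000, 0x43800000,
  0x47000000, 0x47800000, 0x7f800000, 0x7fc00000
};

/* Assembly spellings, copied from fli_value_print in the quoted patch.  */
static const char *const fli_text[32] = {
  "-1.0", "min", "1.52587890625e-05", "3.0517578125e-05",
  "0.00390625", "0.0078125", "0.0625", "0.125",
  "0.25", "0.3125", "0.375", "0.4375",
  "0.5", "0.625", "0.75", "0.875",
  "1.0", "1.25", "1.5", "1.75",
  "2.0", "2.5", "3.0", "4.0",
  "8.0", "16.0", "128.0", "256.0",
  "32768.0", "65536.0", "inf", "nan"
};

/* Hypothetical helper: return the table index for VALUE, or -1 if VALUE
   is not one of the 32 constants.  The comparison is on the raw bit
   pattern, so 0.0f (all bits zero) is not matched.  */
static int
fli_index_for_float (float value)
{
  uint32_t bits;
  memcpy (&bits, &value, sizeof bits);
  for (int i = 0; i < 32; i++)
    if (fli_bits_sf[i] == bits)
      return i;
  return -1;
}

/* Hypothetical helper: return the operand text for VALUE, or NULL if
   VALUE cannot be loaded with a single FLI.S.  */
static const char *
fli_operand_for_float (float value)
{
  int i = fli_index_for_float (value);
  return i < 0 ? NULL : fli_text[i];
}
```

With these tables, 1.0f maps to index 16 and is printed as "1.0", while a
value such as 0.1f is not in the table and has to be materialized some other
way.
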
> >> >
> >> > gcc/ChangeLog:
> >> >
> >> >         * common/config/riscv/riscv-common.cc: Add zfa extension version.
> >> >         * config/riscv/constraints.md (Zf): Constrain the floating point number that the
> >> >         instructions FLI.H/S/D can load.
> >> >         ((TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS): enable FMVP.D.X and FMVH.X.D.
> >> >         * config/riscv/iterators.md (ceil): New.
> >> >         * config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): New.
> >> >         * config/riscv/riscv.cc (find_index_in_array): New.
> >> >         (riscv_float_const_rtx_index_for_fli): Get the index of the floating-point number that
> >> >         the instructions FLI.H/S/D can mov.
> >> >         (riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be used, memory is not applicable.
> >> >         (riscv_const_insns): The cost of FLI.H/S/D is 3.
> >> >         (riscv_legitimize_const_move): Likewise.
> >> >         (riscv_split_64bit_move_p): If instruction FLI.H/S/D can be used, no split is required.
> >> >         (riscv_output_move): Output the mov instructions in zfa extension.
> >> >         (riscv_print_operand): Output the floating-point value of the FLI.H/S/D immediate in assembly
> >> >         (riscv_secondary_memory_needed): Likewise.
> >> >         * config/riscv/riscv.h (GP_REG_RTX_P): New.
> >> >         * config/riscv/riscv.md (fminm<mode>3): New.
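
For context on the FMVP.D.X and FMVH.X.D entries above: on RV32 a double
can cross between integer and FP registers as two 32-bit halves instead of
going through memory. A minimal sketch of that split in portable C follows.
The names word_pair, split_double and join_double are illustrative and not
part of the patch, and the sketch assumes double is IEEE 754 binary64.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative stand-in for an RV32 integer register pair.  */
struct word_pair
{
  uint32_t lo;  /* Low 32 bits of the double.  */
  uint32_t hi;  /* High 32 bits of the double.  */
};

/* Roughly what the "fmv.x.w" + "fmvh.x.d" sequence in the quoted
   riscv_output_move hunk does: copy the low and the high half of an
   FP register into two integer registers.  */
static struct word_pair
split_double (double d)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  struct word_pair p = { (uint32_t) bits, (uint32_t) (bits >> 32) };
  return p;
}

/* Roughly what "fmvp.d.x" does: combine two integer registers into one
   double-precision FP register.  */
static double
join_double (struct word_pair p)
{
  uint64_t bits = ((uint64_t) p.hi << 32) | p.lo;
  double d;
  memcpy (&d, &bits, sizeof d);
  return d;
}
```

For 1.0 the low word is 0 and the high word is 0x3ff00000, and joining the
two halves again gives back the original value bit for bit.
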
> >> >
> >> > gcc/testsuite/ChangeLog:
> >> >
> >> >         * gcc.target/riscv/zfa-fleq-fltq-rv32.c: New test.
> >> >         * gcc.target/riscv/zfa-fleq-fltq.c: New test.
> >> >         * gcc.target/riscv/zfa-fli-rv32.c: New test.
> >> >         * gcc.target/riscv/zfa-fli-zfh-rv32.c: New test.
> >> >         * gcc.target/riscv/zfa-fli-zfh.c: New test.
> >> >         * gcc.target/riscv/zfa-fli.c: New test.
> >> >         * gcc.target/riscv/zfa-fmovh-fmovp-rv32.c: New test.
> >> >         * gcc.target/riscv/zfa-fround-rv32.c: New test.
> >> >         * gcc.target/riscv/zfa-fround.c: New test.
> >> > ---
> >> >  gcc/common/config/riscv/riscv-common.cc       |   4 +
> >> >  gcc/config/riscv/constraints.md               |  11 +-
> >> >  gcc/config/riscv/iterators.md                 |   5 +
> >> >  gcc/config/riscv/riscv-opts.h                 |   3 +
> >> >  gcc/config/riscv/riscv-protos.h               |   1 +
> >> >  gcc/config/riscv/riscv.cc                     | 168 +++++++++++++++++-
> >> >  gcc/config/riscv/riscv.h                      |   1 +
> >> >  gcc/config/riscv/riscv.md                     | 112 +++++++++---
> >> >  .../gcc.target/riscv/zfa-fleq-fltq-rv32.c     |  19 ++
> >> >  .../gcc.target/riscv/zfa-fleq-fltq.c          |  19 ++
> >> >  gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c |  79 ++++++++
> >> >  .../gcc.target/riscv/zfa-fli-zfh-rv32.c       |  41 +++++
> >> >  gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c  |  41 +++++
> >> >  gcc/testsuite/gcc.target/riscv/zfa-fli.c      |  79 ++++++++
> >> >  .../gcc.target/riscv/zfa-fmovh-fmovp-rv32.c   |  10 ++
> >> >  .../gcc.target/riscv/zfa-fround-rv32.c        |  42 +++++
> >> >  gcc/testsuite/gcc.target/riscv/zfa-fround.c   |  42 +++++
> >> >  17 files changed, 652 insertions(+), 25 deletions(-)
> >> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
> >> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
> >> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
> >> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
> >> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
> >> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli.c
> >> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
> >> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
> >> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround.c
> >> >
> >> > diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
> >> > index 309a52def75..f9fce6bcc38 100644
> >> > --- a/gcc/common/config/riscv/riscv-common.cc
> >> > +++ b/gcc/common/config/riscv/riscv-common.cc
> >> > @@ -217,6 +217,8 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
> >> >    {"zfh",       ISA_SPEC_CLASS_NONE, 1, 0},
> >> >    {"zfhmin",    ISA_SPEC_CLASS_NONE, 1, 0},
> >> >
> >> > +  {"zfa",     ISA_SPEC_CLASS_NONE, 0, 2},
> >> > +
> >> >    {"zmmul", ISA_SPEC_CLASS_NONE, 1, 0},
> >> >
> >> >    {"svinval", ISA_SPEC_CLASS_NONE, 1, 0},
> >> > @@ -1260,6 +1262,8 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
> >> >    {"zfhmin",    &gcc_options::x_riscv_zf_subext, MASK_ZFHMIN},
> >> >    {"zfh",       &gcc_options::x_riscv_zf_subext, MASK_ZFH},
> >> >
> >> > +  {"zfa",       &gcc_options::x_riscv_zf_subext, MASK_ZFA},
> >> > +
> >> >    {"zmmul", &gcc_options::x_riscv_zm_subext, MASK_ZMMUL},
> >> >
> >> >    {"svinval", &gcc_options::x_riscv_sv_subext, MASK_SVINVAL},
> >> > diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
> >> > index c448e6b37e9..62d9094f966 100644
> >> > --- a/gcc/config/riscv/constraints.md
> >> > +++ b/gcc/config/riscv/constraints.md
> >> > @@ -118,6 +118,13 @@ (define_constraint "T"
> >> >    (and (match_operand 0 "move_operand")
> >> >         (match_test "CONSTANT_P (op)")))
> >> >
> >> > +;; Zfa constraints.
> >> > +
> >> > +(define_constraint "Zf"
> >> > +  "A floating point number that can be loaded using instruction `fli` in zfa."
> >> > +  (and (match_code "const_double")
> >> > +       (match_test "(riscv_float_const_rtx_index_for_fli (op) != -1)")))
> >> > +
> >> >  ;; Vector constraints.
> >> >
> >> >  (define_register_constraint "vr" "TARGET_VECTOR ? V_REGS : NO_REGS"
> >> > @@ -183,8 +190,8 @@ (define_memory_constraint "Wdm"
> >> >
> >> >  ;; Vendor ISA extension constraints.
> >> >
> >> > -(define_register_constraint "th_f_fmv" "TARGET_XTHEADFMV ? FP_REGS : NO_REGS"
> >> > +(define_register_constraint "th_f_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS"
> >> >    "A floating-point register for XTheadFmv.")
> >> >
> >> > -(define_register_constraint "th_r_fmv" "TARGET_XTHEADFMV ? GR_REGS : NO_REGS"
> >> > +(define_register_constraint "th_r_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? GR_REGS : NO_REGS"
> >> >    "An integer register for XTheadFmv.")
> >>
> >> These are vendor extension constraints with the prefix "th_".
> >> I would avoid using them in code that targets standard extensions.
> >>
> >> I see two ways here:
> >> a) Create two new constraints at the top of the file. E.g.:
> >>     - "F" - "A floating-point register (no fall-back for Zfinx)" and
> >>     - "rF" - "An integer register in case FP registers are available".
> >> b) Move to top and rename these two constraints (and adjust
> >> movdf_hardfloat_rv32 accordingly)
> >>
> >> I would prefer b), and would even go so far as to do this in a
> >> separate commit that comes before the Zfa support patch.
> >>
> >>
> >> I've applied the patch on top of today's master (with --3way) and
> >> successfully tested it:
> >> Tested-by: Christoph Müllner <christoph.muellner@vrull.eu>
> >>
> >> > diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
> >> > index 9b767038452..c81b08e3cc5 100644
> >> > --- a/gcc/config/riscv/iterators.md
> >> > +++ b/gcc/config/riscv/iterators.md
> >> > @@ -288,3 +288,8 @@ (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
> >> >  (define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET "le")])
> >> >  (define_int_attr QUIET_PATTERN [(UNSPEC_FLT_QUIET "LT") (UNSPEC_FLE_QUIET "LE")])
> >> >
> >> > +(define_int_iterator ROUND [UNSPEC_ROUND UNSPEC_FLOOR UNSPEC_CEIL UNSPEC_BTRUNC UNSPEC_ROUNDEVEN UNSPEC_NEARBYINT])
> >> > +(define_int_attr round_pattern [(UNSPEC_ROUND "round") (UNSPEC_FLOOR "floor") (UNSPEC_CEIL "ceil")
> >> > +                               (UNSPEC_BTRUNC "btrunc") (UNSPEC_ROUNDEVEN "roundeven") (UNSPEC_NEARBYINT "nearbyint")])
> >> > +(define_int_attr round_rm [(UNSPEC_ROUND "rmm") (UNSPEC_FLOOR "rdn") (UNSPEC_CEIL "rup")
> >> > +                          (UNSPEC_BTRUNC "rtz") (UNSPEC_ROUNDEVEN "rne") (UNSPEC_NEARBYINT "dyn")])
> >> > \ No newline at end of file
> >> > diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
> >> > index cf0cd669be4..87b72efd12e 100644
> >> > --- a/gcc/config/riscv/riscv-opts.h
> >> > +++ b/gcc/config/riscv/riscv-opts.h
> >> > @@ -172,6 +172,9 @@ enum stack_protector_guard {
> >> >  #define TARGET_ZFHMIN ((riscv_zf_subext & MASK_ZFHMIN) != 0)
> >> >  #define TARGET_ZFH    ((riscv_zf_subext & MASK_ZFH) != 0)
> >> >
> >> > +#define MASK_ZFA   (1 << 0)
> >> > +#define TARGET_ZFA    ((riscv_zf_subext & MASK_ZFA) != 0)
> >> > +
> >> >  #define MASK_ZMMUL      (1 << 0)
> >> >  #define TARGET_ZMMUL    ((riscv_zm_subext & MASK_ZMMUL) != 0)
> >> >
> >> > diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> >> > index 5244e8dcbf0..e421244a06c 100644
> >> > --- a/gcc/config/riscv/riscv-protos.h
> >> > +++ b/gcc/config/riscv/riscv-protos.h
> >> > @@ -38,6 +38,7 @@ enum riscv_symbol_type {
> >> >  /* Routines implemented in riscv.cc.  */
> >> >  extern enum riscv_symbol_type riscv_classify_symbolic_expression (rtx);
> >> >  extern bool riscv_symbolic_constant_p (rtx, enum riscv_symbol_type *);
> >> > +extern int riscv_float_const_rtx_index_for_fli (rtx);
> >> >  extern int riscv_regno_mode_ok_for_base_p (int, machine_mode, bool);
> >> >  extern int riscv_address_insns (rtx, machine_mode, bool);
> >> >  extern int riscv_const_insns (rtx);
> >> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> >> > index cdb47e81e7c..faffedffe97 100644
> >> > --- a/gcc/config/riscv/riscv.cc
> >> > +++ b/gcc/config/riscv/riscv.cc
> >> > @@ -799,6 +799,116 @@ static int riscv_symbol_insns (enum riscv_symbol_type type)
> >> >      }
> >> >  }
> >> >
> >> > +/* Immediate values loaded by the FLI.S instruction in Chapter 25 of the latest RISC-V ISA
> >> > +   Manual draft. For details, please see:
> >> > +   https://github.com/riscv/riscv-isa-manual/releases/tag/draft-20221217-cb3b9d1 */
> >> > +
> >> > +unsigned HOST_WIDE_INT fli_value_hf[32] =
> >> > +{
> >> > +  0xbc00, 0x400, 0x100, 0x200, 0x1c00, 0x2000, 0x2c00, 0x3000,
> >> > +  0x3400, 0x3500, 0x3600, 0x3700, 0x3800, 0x3900, 0x3a00, 0x3b00,
> >> > +  0x3c00, 0x3d00, 0x3e00, 0x3f00, 0x4000, 0x4100, 0x4200, 0x4400,
> >> > +  0x4800, 0x4c00, 0x5800, 0x5c00, 0x7800,
> >> > +  /* Only used for filling, ensuring that 29 and 30 of HF are the same. */
> >> > +  0x7800,
> >> > +  0x7c00, 0x7e00,
> >> > +};
> >> > +
> >> > +unsigned HOST_WIDE_INT fli_value_sf[32] =
> >> > +{
> >> > +  0xbf800000, 0x00800000, 0x37800000, 0x38000000, 0x3b800000, 0x3c000000, 0x3d800000, 0x3e000000,
> >> > +  0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000, 0x3f000000, 0x3f200000, 0x3f400000, 0x3f600000,
> >> > +  0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000, 0x40000000, 0x40200000, 0x40400000, 0x40800000,
> >> > +  0x41000000, 0x41800000, 0x43000000, 0x43800000, 0x47000000, 0x47800000, 0x7f800000, 0x7fc00000
> >> > +};
> >> > +
> >> > +unsigned HOST_WIDE_INT fli_value_df[32] =
> >> > +{
> >> > +  0xbff0000000000000, 0x10000000000000, 0x3ef0000000000000, 0x3f00000000000000,
> >> > +  0x3f70000000000000, 0x3f80000000000000, 0x3fb0000000000000, 0x3fc0000000000000,
> >> > +  0x3fd0000000000000, 0x3fd4000000000000, 0x3fd8000000000000, 0x3fdc000000000000,
> >> > +  0x3fe0000000000000, 0x3fe4000000000000, 0x3fe8000000000000, 0x3fec000000000000,
> >> > +  0x3ff0000000000000, 0x3ff4000000000000, 0x3ff8000000000000, 0x3ffc000000000000,
> >> > +  0x4000000000000000, 0x4004000000000000, 0x4008000000000000, 0x4010000000000000,
> >> > +  0x4020000000000000, 0x4030000000000000, 0x4060000000000000, 0x4070000000000000,
> >> > +  0x40e0000000000000, 0x40f0000000000000, 0x7ff0000000000000, 0x7ff8000000000000,
> >> > +};
> >> > +
> >> > +const char *fli_value_print[32] =
> >> > +{
> >> > +  "-1.0", "min", "1.52587890625e-05", "3.0517578125e-05", "0.00390625", "0.0078125", "0.0625", "0.125",
> >> > +  "0.25", "0.3125", "0.375", "0.4375", "0.5", "0.625", "0.75", "0.875",
> >> > +  "1.0", "1.25", "1.5", "1.75", "2.0", "2.5", "3.0", "4.0",
> >> > +  "8.0", "16.0", "128.0", "256.0", "32768.0", "65536.0", "inf", "nan"
> >> > +};
> >> > +
> >> > +/* Find the index of TARGET in ARRAY, and return -1 if not found. */
> >> > +
> >> > +static int
> >> > +find_index_in_array (unsigned HOST_WIDE_INT target, unsigned HOST_WIDE_INT *array, int len)
> >> > +{
> >> > +  if (array == NULL)
> >> > +    return -1;
> >> > +
> >> > +  for (int i = 0; i < len; i++)
> >> > +    {
> >> > +      if (target == array[i])
> >> > +       return i;
> >> > +    }
> >> > +  return -1;
> >> > +}
> >> > +
> >> > +/* Return index of the FLI instruction table if rtx X is an immediate constant that
> >> > +   can be moved using a single FLI instruction in zfa extension. -1 otherwise. */
> >> > +
> >> > +int
> >> > +riscv_float_const_rtx_index_for_fli (rtx x)
> >> > +{
> >> > +  machine_mode mode = GET_MODE (x);
> >> > +
> >> > +  if (!TARGET_ZFA || mode == VOIDmode
> >> > +      || !CONST_DOUBLE_P(x)
> >> > +      || (mode == HFmode && !TARGET_ZFH)
> >> > +      || (mode == SFmode && !TARGET_HARD_FLOAT)
> >> > +      || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
> >> > +    return -1;
> >> > +
> >> > +  if (!SCALAR_FLOAT_MODE_P (mode)
> >> > +      || GET_MODE_BITSIZE (mode).to_constant () > HOST_BITS_PER_WIDE_INT
> >> > +      /* Only support up to DF mode.  */
> >> > +      || GET_MODE_BITSIZE (mode).to_constant () > GET_MODE_BITSIZE (DFmode))
> >> > +    return -1;
> >> > +
> >> > +  unsigned HOST_WIDE_INT ival = 0;
> >> > +
> >> > +  long res[2];
> >> > +  real_to_target (res,
> >> > +                 CONST_DOUBLE_REAL_VALUE (x),
> >> > +                 REAL_MODE_FORMAT (mode));
> >> > +
> >> > +  if (mode == DFmode)
> >> > +    {
> >> > +      int order = BYTES_BIG_ENDIAN ? 1 : 0;
> >> > +      ival = zext_hwi (res[order], 32);
> >> > +      ival |= (zext_hwi (res[1 - order], 32) << 32);
> >> > +    }
> >> > +  else
> >> > +      ival = zext_hwi (res[0], 32);
> >> > +
> >> > +  switch (mode)
> >> > +    {
> >> > +      case SFmode:
> >> > +       return find_index_in_array (ival, fli_value_sf, 32);
> >> > +      case DFmode:
> >> > +       return find_index_in_array (ival, fli_value_df, 32);
> >> > +      case HFmode:
> >> > +       return find_index_in_array (ival, fli_value_hf, 32);
> >> > +      default:
> >> > +       break;
> >> > +    }
> >> > +  return -1;
> >> > +}
> >> > +
> >> >  /* Implement TARGET_LEGITIMATE_CONSTANT_P.  */
> >> >
> >> >  static bool
> >> > @@ -826,6 +936,9 @@ riscv_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
> >> >    if (GET_CODE (x) == HIGH)
> >> >      return true;
> >> >
> >> > +  if (riscv_float_const_rtx_index_for_fli (x) != -1)
> >> > +   return true;
> >> > +
> >> >    split_const (x, &base, &offset);
> >> >    if (riscv_symbolic_constant_p (base, &type))
> >> >      {
> >> > @@ -1213,6 +1326,11 @@ riscv_const_insns (rtx x)
> >> >        }
> >> >
> >> >      case CONST_DOUBLE:
> >> > +      /* See if we can use FLI directly.  */
> >> > +      if (riscv_float_const_rtx_index_for_fli (x) != -1)
> >> > +       return 3;
> >> > +      /* Fall through.  */
> >> > +
> >> >      case CONST_VECTOR:
> >> >        /* We can use x0 to load floating-point zero.  */
> >> >        return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
> >> > @@ -1749,6 +1867,12 @@ riscv_legitimize_const_move (machine_mode mode, rtx dest, rtx src)
> >> >        return;
> >> >      }
> >> >
> >> > +  if (riscv_float_const_rtx_index_for_fli (src) != -1)
> >> > +    {
> >> > +      riscv_emit_set (dest, src);
> >> > +      return;
> >> > +    }
> >> > +
> >> >    /* Split moves of symbolic constants into high/low pairs.  */
> >> >    if (riscv_split_symbol (dest, src, MAX_MACHINE_MODE, &src, FALSE))
> >> >      {
> >> > @@ -2770,12 +2894,19 @@ riscv_split_64bit_move_p (rtx dest, rtx src)
> >> >    if (TARGET_64BIT)
> >> >      return false;
> >> >
> >> > +  /* There is no need to split if the FLI instruction in the `Zfa` extension can be used. */
> >> > +  if (riscv_float_const_rtx_index_for_fli (src) != -1)
> >> > +    return false;
> >> > +
> >> >    /* Allow FPR <-> FPR and FPR <-> MEM moves, and permit the special case
> >> >       of zeroing an FPR with FCVT.D.W.  */
> >> >    if (TARGET_DOUBLE_FLOAT
> >> >        && ((FP_REG_RTX_P (src) && FP_REG_RTX_P (dest))
> >> >           || (FP_REG_RTX_P (dest) && MEM_P (src))
> >> >           || (FP_REG_RTX_P (src) && MEM_P (dest))
> >> > +         || (TARGET_ZFA
> >> > +             && ((FP_REG_RTX_P (dest) && GP_REG_RTX_P (src))
> >> > +             || (FP_REG_RTX_P (src) && GP_REG_RTX_P (dest))))
> >> >           || (FP_REG_RTX_P (dest) && src == CONST0_RTX (GET_MODE (src)))))
> >> >      return false;
> >> >
> >> > @@ -2857,6 +2988,8 @@ riscv_output_move (rtx dest, rtx src)
> >> >           case 4:
> >> >             return "fmv.x.s\t%0,%1";
> >> >           case 8:
> >> > +           if (!TARGET_64BIT && TARGET_ZFA)
> >> > +             return "fmv.x.w\t%0,%1\n\tfmvh.x.d\t%N0,%1";
> >> >             return "fmv.x.d\t%0,%1";
> >> >           }
> >> >
> >> > @@ -2916,6 +3049,8 @@ riscv_output_move (rtx dest, rtx src)
> >> >               case 8:
> >> >                 if (TARGET_64BIT)
> >> >                   return "fmv.d.x\t%0,%z1";
> >> > +               else if (TARGET_ZFA && src != CONST0_RTX (mode))
> >> > +                 return "fmvp.d.x\t%0,%1,%N1";
> >> >                 /* in RV32, we can emulate fmv.d.x %0, x0 using fcvt.d.w */
> >> >                 gcc_assert (src == CONST0_RTX (mode));
> >> >                 return "fcvt.d.w\t%0,x0";
> >> > @@ -2968,6 +3103,14 @@ riscv_output_move (rtx dest, rtx src)
> >> >           case 8:
> >> >             return "fld\t%0,%1";
> >> >           }
> >> > +
> >> > +      if (src_code == CONST_DOUBLE && (riscv_float_const_rtx_index_for_fli (src) != -1))
> >> > +       switch (width)
> >> > +         {
> >> > +           case 2: return "fli.h\t%0,%1";
> >> > +           case 4: return "fli.s\t%0,%1";
> >> > +           case 8: return "fli.d\t%0,%1";
> >> > +         }
> >> >      }
> >> >    if (dest_code == REG && GP_REG_P (REGNO (dest)) && src_code == CONST_POLY_INT)
> >> >      {
> >> > @@ -4349,6 +4492,7 @@ riscv_memmodel_needs_release_fence (enum memmodel model)
> >> >     'S' Print shift-index of single-bit mask OP.
> >> >     'T' Print shift-index of inverted single-bit mask OP.
> >> >     '~' Print w if TARGET_64BIT is true; otherwise not print anything.
> >> > +   'N'  Print next register.
> >> >
> >> >     Note please keep this list and the list in riscv.md in sync.  */
> >> >
> >> > @@ -4533,6 +4677,9 @@ riscv_print_operand (FILE *file, rtx op, int letter)
> >> >         output_addr_const (file, newop);
> >> >         break;
> >> >        }
> >> > +    case 'N':
> >> > +      fputs (reg_names[REGNO (op) + 1], file);
> >> > +      break;
> >> >      default:
> >> >        switch (code)
> >> >         {
> >> > @@ -4549,6 +4696,24 @@ riscv_print_operand (FILE *file, rtx op, int letter)
> >> >             output_address (mode, XEXP (op, 0));
> >> >           break;
> >> >
> >> > +       case CONST_DOUBLE:
> >> > +         {
> >> > +           if (letter == 'z' && op == CONST0_RTX (GET_MODE (op)))
> >> > +             {
> >> > +               fputs (reg_names[GP_REG_FIRST], file);
> >> > +               break;
> >> > +             }
> >> > +
> >> > +           int fli_index = riscv_float_const_rtx_index_for_fli (op);
> >> > +           if (fli_index == -1 || fli_index > 31)
> >> > +             {
> >> > +               output_operand_lossage ("invalid use of '%%%c'", letter);
> >> > +               break;
> >> > +             }
> >> > +           asm_fprintf (file, "%s", fli_value_print[fli_index]);
> >> > +           break;
> >> > +         }
> >> > +
> >> >         default:
> >> >           if (letter == 'z' && op == CONST0_RTX (GET_MODE (op)))
> >> >             fputs (reg_names[GP_REG_FIRST], file);
> >> > @@ -5897,7 +6062,8 @@ riscv_secondary_memory_needed (machine_mode mode, reg_class_t class1,
> >> >    return (!riscv_v_ext_vector_mode_p (mode)
> >> >           && GET_MODE_SIZE (mode).to_constant () > UNITS_PER_WORD
> >> >           && (class1 == FP_REGS) != (class2 == FP_REGS)
> >> > -         && !TARGET_XTHEADFMV);
> >> > +         && !TARGET_XTHEADFMV
> >> > +         && !TARGET_ZFA);
> >> >  }
> >> >
> >> >  /* Implement TARGET_REGISTER_MOVE_COST.  */
> >> > diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> >> > index 66fb07d6652..d438b281142 100644
> >> > --- a/gcc/config/riscv/riscv.h
> >> > +++ b/gcc/config/riscv/riscv.h
> >> > @@ -377,6 +377,7 @@ ASM_MISA_SPEC
> >> >  #define SIBCALL_REG_P(REGNO)   \
> >> >    TEST_HARD_REG_BIT (reg_class_contents[SIBCALL_REGS], REGNO)
> >> >
> >> > +#define GP_REG_RTX_P(X) (REG_P (X) && GP_REG_P (REGNO (X)))
> >> >  #define FP_REG_RTX_P(X) (REG_P (X) && FP_REG_P (REGNO (X)))
> >> >
> >> >  /* Use s0 as the frame pointer if it is so requested.  */
> >> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> >> > index bc384d9aedf..f22e71b5a3a 100644
> >> > --- a/gcc/config/riscv/riscv.md
> >> > +++ b/gcc/config/riscv/riscv.md
> >> > @@ -59,6 +59,15 @@ (define_c_enum "unspec" [
> >> >    UNSPEC_LROUND
> >> >    UNSPEC_FMIN
> >> >    UNSPEC_FMAX
> >> > +  UNSPEC_RINT
> >> > +  UNSPEC_ROUND
> >> > +  UNSPEC_FLOOR
> >> > +  UNSPEC_CEIL
> >> > +  UNSPEC_BTRUNC
> >> > +  UNSPEC_ROUNDEVEN
> >> > +  UNSPEC_NEARBYINT
> >> > +  UNSPEC_FMINM
> >> > +  UNSPEC_FMAXM
> >> >
> >> >    ;; Stack tie
> >> >    UNSPEC_TIE
> >> > @@ -1232,6 +1241,26 @@ (define_insn "neg<mode>2"
> >> >  ;;
> >> >  ;;  ....................
> >> >
> >> > +(define_insn "fminm<mode>3"
> >> > +  [(set (match_operand:ANYF                    0 "register_operand" "=f")
> >> > +       (unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
> >> > +                     (use (match_operand:ANYF 2 "register_operand" " f"))]
> >> > +                    UNSPEC_FMINM))]
> >> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> >> > +  "fminm.<fmt>\t%0,%1,%2"
> >> > +  [(set_attr "type" "fmove")
> >> > +   (set_attr "mode" "<UNITMODE>")])
> >> > +
> >> > +(define_insn "fmaxm<mode>3"
> >> > +  [(set (match_operand:ANYF                    0 "register_operand" "=f")
> >> > +       (unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
> >> > +                     (use (match_operand:ANYF 2 "register_operand" " f"))]
> >> > +                    UNSPEC_FMAXM))]
> >> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> >> > +  "fmaxm.<fmt>\t%0,%1,%2"
> >> > +  [(set_attr "type" "fmove")
> >> > +   (set_attr "mode" "<UNITMODE>")])
> >> > +
> >> >  (define_insn "fmin<mode>3"
> >> >    [(set (match_operand:ANYF                    0 "register_operand" "=f")
> >> >         (unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
> >> > @@ -1508,13 +1537,13 @@ (define_expand "movhf"
> >> >  })
> >> >
> >> >  (define_insn "*movhf_hardfloat"
> >> > -  [(set (match_operand:HF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
> >> > -       (match_operand:HF 1 "move_operand"         " f,G,m,f,G,*r,*f,*G*r,*m,*r"))]
> >> > +  [(set (match_operand:HF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r,  *r,*r,*m")
> >> > +       (match_operand:HF 1 "move_operand"         " f,Zf,G,m,f,G,*r,*f,*G*r,*m,*r"))]
> >> >    "TARGET_ZFHMIN
> >> >     && (register_operand (operands[0], HFmode)
> >> >         || reg_or_0_operand (operands[1], HFmode))"
> >> >    { return riscv_output_move (operands[0], operands[1]); }
> >> > -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >> > +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >> >     (set_attr "mode" "HF")])
> >> >
> >> >  (define_insn "*movhf_softfloat"
> >> > @@ -1580,6 +1609,26 @@ (define_insn "l<rint_pattern><ANYF:mode><GPR:mode>2"
> >> >    [(set_attr "type" "fcvt")
> >> >     (set_attr "mode" "<ANYF:MODE>")])
> >> >
> >> > +(define_insn "<round_pattern><ANYF:mode>2"
> >> > +  [(set (match_operand:ANYF     0 "register_operand" "=f")
> >> > +       (unspec:ANYF
> >> > +           [(match_operand:ANYF 1 "register_operand" " f")]
> >> > +       ROUND))]
> >> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> >> > +  "fround.<ANYF:fmt>\t%0,%1,<round_rm>"
> >> > +  [(set_attr "type" "fcvt")
> >> > +   (set_attr "mode" "<ANYF:MODE>")])
> >> > +
> >> > +(define_insn "rint<ANYF:mode>2"
> >> > +  [(set (match_operand:ANYF     0 "register_operand" "=f")
> >> > +       (unspec:ANYF
> >> > +           [(match_operand:ANYF 1 "register_operand" " f")]
> >> > +       UNSPEC_RINT))]
> >> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> >> > +  "froundnx.<ANYF:fmt>\t%0,%1"
> >> > +  [(set_attr "type" "fcvt")
> >> > +   (set_attr "mode" "<ANYF:MODE>")])
> >> > +
> >> >  ;;
> >> >  ;;  ....................
> >> >  ;;
> >> > @@ -1839,13 +1888,13 @@ (define_expand "movsf"
> >> >  })
> >> >
> >> >  (define_insn "*movsf_hardfloat"
> >> > -  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
> >> > -       (match_operand:SF 1 "move_operand"         " f,G,m,f,G,*r,*f,*G*r,*m,*r"))]
> >> > +  [(set (match_operand:SF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r,  *r,*r,*m")
> >> > +       (match_operand:SF 1 "move_operand"         " f,Zf,G,m,f,G,*r,*f,*G*r,*m,*r"))]
> >> >    "TARGET_HARD_FLOAT
> >> >     && (register_operand (operands[0], SFmode)
> >> >         || reg_or_0_operand (operands[1], SFmode))"
> >> >    { return riscv_output_move (operands[0], operands[1]); }
> >> > -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >> > +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >> >     (set_attr "mode" "SF")])
> >> >
> >> >  (define_insn "*movsf_softfloat"
> >> > @@ -1873,23 +1922,23 @@ (define_expand "movdf"
> >> >  ;; In RV32, we lack fmv.x.d and fmv.d.x.  Go through memory instead.
> >> >  ;; (However, we can still use fcvt.d.w to zero a floating-point register.)
> >> >  (define_insn "*movdf_hardfloat_rv32"
> >> > -  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,m,*th_f_fmv,*th_r_fmv,  *r,*r,*m")
> >> > -       (match_operand:DF 1 "move_operand"         " f,G,m,f,G,*th_r_fmv,*th_f_fmv,*r*G,*m,*r"))]
> >> > +  [(set (match_operand:DF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*th_f_fmv,*th_r_fmv,  *r,*r,*m")
> >> > +       (match_operand:DF 1 "move_operand"         " f,Zf,G,m,f,G,*th_r_fmv,*th_f_fmv,*r*G,*m,*r"))]
> >> >    "!TARGET_64BIT && TARGET_DOUBLE_FLOAT
> >> >     && (register_operand (operands[0], DFmode)
> >> >         || reg_or_0_operand (operands[1], DFmode))"
> >> >    { return riscv_output_move (operands[0], operands[1]); }
> >> > -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >> > +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >> >     (set_attr "mode" "DF")])
> >> >
> >> >  (define_insn "*movdf_hardfloat_rv64"
> >> > -  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
> >> > -       (match_operand:DF 1 "move_operand"         " f,G,m,f,G,*r,*f,*r*G,*m,*r"))]
> >> > +  [(set (match_operand:DF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r,  *r,*r,*m")
> >> > +       (match_operand:DF 1 "move_operand"         " f,Zf,G,m,f,G,*r,*f,*r*G,*m,*r"))]
> >> >    "TARGET_64BIT && TARGET_DOUBLE_FLOAT
> >> >     && (register_operand (operands[0], DFmode)
> >> >         || reg_or_0_operand (operands[1], DFmode))"
> >> >    { return riscv_output_move (operands[0], operands[1]); }
> >> > -  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >> > +  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >> >     (set_attr "mode" "DF")])
> >> >
> >> >  (define_insn "*movdf_softfloat"
> >> > @@ -2494,16 +2543,23 @@ (define_expand "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4"
> >> >    rtx op0 = operands[0];
> >> >    rtx op1 = operands[1];
> >> >    rtx op2 = operands[2];
> >> > -  rtx tmp = gen_reg_rtx (SImode);
> >> > -  rtx cmp = gen_rtx_<QUIET_PATTERN> (<X:MODE>mode, op1, op2);
> >> > -  rtx frflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, const0_rtx),
> >> > -                                        UNSPECV_FRFLAGS);
> >> > -  rtx fsflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, tmp),
> >> > -                                        UNSPECV_FSFLAGS);
> >> > -
> >> > -  emit_insn (gen_rtx_SET (tmp, frflags));
> >> > -  emit_insn (gen_rtx_SET (op0, cmp));
> >> > -  emit_insn (fsflags);
> >> > +
> >> > +  if (TARGET_ZFA)
> >> > +    emit_insn (gen_f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa(op0, op1, op2));
> >> > +  else
> >> > +    {
> >> > +      rtx tmp = gen_reg_rtx (SImode);
> >> > +      rtx cmp = gen_rtx_<QUIET_PATTERN> (<X:MODE>mode, op1, op2);
> >> > +      rtx frflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, const0_rtx),
> >> > +                                            UNSPECV_FRFLAGS);
> >> > +      rtx fsflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, tmp),
> >> > +                                            UNSPECV_FSFLAGS);
> >> > +
> >> > +      emit_insn (gen_rtx_SET (tmp, frflags));
> >> > +      emit_insn (gen_rtx_SET (op0, cmp));
> >> > +      emit_insn (fsflags);
> >> > +    }
> >> > +
> >> >    if (HONOR_SNANS (<ANYF:MODE>mode))
> >> >      emit_insn (gen_rtx_UNSPEC_VOLATILE (<ANYF:MODE>mode,
> >> >                                         gen_rtvec (2, op1, op2),
> >> > @@ -2511,6 +2567,18 @@ (define_expand "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4"
> >> >    DONE;
> >> >  })
> >> >
> >> > +(define_insn "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa"
> >> > +   [(set (match_operand:X      0 "register_operand" "=r")
> >> > +        (unspec:X
> >> > +         [(match_operand:ANYF 1 "register_operand" " f")
> >> > +          (match_operand:ANYF 2 "register_operand" " f")]
> >> > +         QUIET_COMPARISON))]
> >> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> >> > +  "f<quiet_pattern>q.<fmt>\t%0,%1,%2"
> >> > +  [(set_attr "type" "fcmp")
> >> > +   (set_attr "mode" "<UNITMODE>")
> >> > +   (set (attr "length") (const_int 16))])
> >> > +
> >> >  (define_insn "*seq_zero_<X:mode><GPR:mode>"
> >> >    [(set (match_operand:GPR       0 "register_operand" "=r")
> >> >         (eq:GPR (match_operand:X 1 "register_operand" " r")
> >> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
> >> > new file mode 100644
> >> > index 00000000000..26895b76fa4
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
> >> > @@ -0,0 +1,19 @@
> >> > +/* { dg-do compile } */
> >> > +/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O2" } */
> >> > +
> >> > +extern void abort(void);
> >> > +extern float a, b;
> >> > +extern double c, d;
> >> > +
> >> > +void
> >> > +foo()
> >> > +{
> >> > +  if ((__builtin_isless(a, b) ||  __builtin_islessequal(c, d))
> >> > +      && (__builtin_islessequal(a, b)|| __builtin_isless(c, d)))
> >> > +    abort();
> >> > +}
> >> > +
> >> > +/* { dg-final { scan-assembler-times "fleq.s" 1 } } */
> >> > +/* { dg-final { scan-assembler-times "fltq.s" 1 } } */
> >> > +/* { dg-final { scan-assembler-times "fleq.d" 1 } } */
> >> > +/* { dg-final { scan-assembler-times "fltq.d" 1 } } */
> >> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
> >> > new file mode 100644
> >> > index 00000000000..4ccd6a7dd78
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
> >> > @@ -0,0 +1,19 @@
> >> > +/* { dg-do compile } */
> >> > +/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O2" } */
> >> > +
> >> > +extern void abort(void);
> >> > +extern float a, b;
> >> > +extern double c, d;
> >> > +
> >> > +void
> >> > +foo()
> >> > +{
> >> > +  if ((__builtin_isless(a, b) ||  __builtin_islessequal(c, d))
> >> > +      && (__builtin_islessequal(a, b)|| __builtin_isless(c, d)))
> >> > +    abort();
> >> > +}
> >> > +
> >> > +/* { dg-final { scan-assembler-times "fleq.s" 1 } } */
> >> > +/* { dg-final { scan-assembler-times "fltq.s" 1 } } */
> >> > +/* { dg-final { scan-assembler-times "fleq.d" 1 } } */
> >> > +/* { dg-final { scan-assembler-times "fltq.d" 1 } } */
> >> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
> >> > new file mode 100644
> >> > index 00000000000..c4da04797aa
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
> >> > @@ -0,0 +1,79 @@
> >> > +/* { dg-do compile } */
> >> > +/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O0" } */
> >> > +
> >> > +void foo_float32 ()
> >> > +{
> >> > +  volatile float a;
> >> > +  a = -1.0;
> >> > +  a = 1.1754944e-38;
> >> > +  a = 1.0/(1 << 16);
> >> > +  a = 1.0/(1 << 15);
> >> > +  a = 1.0/(1 << 8);
> >> > +  a = 1.0/(1 << 7);
> >> > +  a = 1.0/(1 << 4);
> >> > +  a = 1.0/(1 << 3);
> >> > +  a = 1.0/(1 << 2);
> >> > +  a = 0.3125;
> >> > +  a = 0.375;
> >> > +  a = 0.4375;
> >> > +  a = 0.5;
> >> > +  a = 0.625;
> >> > +  a = 0.75;
> >> > +  a = 0.875;
> >> > +  a = 1.0;
> >> > +  a = 1.25;
> >> > +  a = 1.5;
> >> > +  a = 1.75;
> >> > +  a = 2.0;
> >> > +  a = 2.5;
> >> > +  a = 3.0;
> >> > +  a = 1.0*(1 << 2);
> >> > +  a = 1.0*(1 << 3);
> >> > +  a = 1.0*(1 << 4);
> >> > +  a = 1.0*(1 << 7);
> >> > +  a = 1.0*(1 << 8);
> >> > +  a = 1.0*(1 << 15);
> >> > +  a = 1.0*(1 << 16);
> >> > +  a = __builtin_inff ();
> >> > +  a = __builtin_nanf ("");
> >> > +}
> >> > +
> >> > +void foo_double64 ()
> >> > +{
> >> > +  volatile double a;
> >> > +  a = -1.0;
> >> > +  a = 2.2250738585072014E-308;
> >> > +  a = 1.0/(1 << 16);
> >> > +  a = 1.0/(1 << 15);
> >> > +  a = 1.0/(1 << 8);
> >> > +  a = 1.0/(1 << 7);
> >> > +  a = 1.0/(1 << 4);
> >> > +  a = 1.0/(1 << 3);
> >> > +  a = 1.0/(1 << 2);
> >> > +  a = 0.3125;
> >> > +  a = 0.375;
> >> > +  a = 0.4375;
> >> > +  a = 0.5;
> >> > +  a = 0.625;
> >> > +  a = 0.75;
> >> > +  a = 0.875;
> >> > +  a = 1.0;
> >> > +  a = 1.25;
> >> > +  a = 1.5;
> >> > +  a = 1.75;
> >> > +  a = 2.0;
> >> > +  a = 2.5;
> >> > +  a = 3.0;
> >> > +  a = 1.0*(1 << 2);
> >> > +  a = 1.0*(1 << 3);
> >> > +  a = 1.0*(1 << 4);
> >> > +  a = 1.0*(1 << 7);
> >> > +  a = 1.0*(1 << 8);
> >> > +  a = 1.0*(1 << 15);
> >> > +  a = 1.0*(1 << 16);
> >> > +  a = __builtin_inf ();
> >> > +  a = __builtin_nan ("");
> >> > +}
> >> > +
> >> > +/* { dg-final { scan-assembler-times "fli.s" 32 } } */
> >> > +/* { dg-final { scan-assembler-times "fli.d" 32 } } */
> >> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
> >> > new file mode 100644
> >> > index 00000000000..bcffe9d2c82
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
> >> > @@ -0,0 +1,41 @@
> >> > +/* { dg-do compile } */
> >> > +/* { dg-options "-march=rv32imafdc_zfa_zfh -mabi=ilp32d -O0" } */
> >> > +
> >> > +void foo_float16 ()
> >> > +{
> >> > +  volatile _Float16 a;
> >> > +  a = -1.0;
> >> > +  a = 6.104E-5;
> >> > +  a = 1.0/(1 << 16);
> >> > +  a = 1.0/(1 << 15);
> >> > +  a = 1.0/(1 << 8);
> >> > +  a = 1.0/(1 << 7);
> >> > +  a = 1.0/(1 << 4);
> >> > +  a = 1.0/(1 << 3);
> >> > +  a = 1.0/(1 << 2);
> >> > +  a = 0.3125;
> >> > +  a = 0.375;
> >> > +  a = 0.4375;
> >> > +  a = 0.5;
> >> > +  a = 0.625;
> >> > +  a = 0.75;
> >> > +  a = 0.875;
> >> > +  a = 1.0;
> >> > +  a = 1.25;
> >> > +  a = 1.5;
> >> > +  a = 1.75;
> >> > +  a = 2.0;
> >> > +  a = 2.5;
> >> > +  a = 3.0;
> >> > +  a = 1.0*(1 << 2);
> >> > +  a = 1.0*(1 << 3);
> >> > +  a = 1.0*(1 << 4);
> >> > +  a = 1.0*(1 << 7);
> >> > +  a = 1.0*(1 << 8);
> >> > +  a = 1.0*(1 << 15);
> >> > +  a = 1.0*(1 << 16);
> >> > +  a = __builtin_inff16 ();
> >> > +  a = __builtin_nanf16 ("");
> >> > +}
> >> > +
> >> > +/* { dg-final { scan-assembler-times "fli.h" 32 } } */
> >> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
> >> > new file mode 100644
> >> > index 00000000000..13aa7b5f846
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
> >> > @@ -0,0 +1,41 @@
> >> > +/* { dg-do compile } */
> >> > +/* { dg-options "-march=rv64imafdc_zfa_zfh -mabi=lp64d -O0" } */
> >> > +
> >> > +void foo_float16 ()
> >> > +{
> >> > +  volatile _Float16 a;
> >> > +  a = -1.0;
> >> > +  a = 6.104E-5;
> >> > +  a = 1.0/(1 << 16);
> >> > +  a = 1.0/(1 << 15);
> >> > +  a = 1.0/(1 << 8);
> >> > +  a = 1.0/(1 << 7);
> >> > +  a = 1.0/(1 << 4);
> >> > +  a = 1.0/(1 << 3);
> >> > +  a = 1.0/(1 << 2);
> >> > +  a = 0.3125;
> >> > +  a = 0.375;
> >> > +  a = 0.4375;
> >> > +  a = 0.5;
> >> > +  a = 0.625;
> >> > +  a = 0.75;
> >> > +  a = 0.875;
> >> > +  a = 1.0;
> >> > +  a = 1.25;
> >> > +  a = 1.5;
> >> > +  a = 1.75;
> >> > +  a = 2.0;
> >> > +  a = 2.5;
> >> > +  a = 3.0;
> >> > +  a = 1.0*(1 << 2);
> >> > +  a = 1.0*(1 << 3);
> >> > +  a = 1.0*(1 << 4);
> >> > +  a = 1.0*(1 << 7);
> >> > +  a = 1.0*(1 << 8);
> >> > +  a = 1.0*(1 << 15);
> >> > +  a = 1.0*(1 << 16);
> >> > +  a = __builtin_inff16 ();
> >> > +  a = __builtin_nanf16 ("");
> >> > +}
> >> > +
> >> > +/* { dg-final { scan-assembler-times "fli.h" 32 } } */
> >> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli.c b/gcc/testsuite/gcc.target/riscv/zfa-fli.c
> >> > new file mode 100644
> >> > index 00000000000..b6d41cf460f
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fli.c
> >> > @@ -0,0 +1,79 @@
> >> > +/* { dg-do compile } */
> >> > +/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O0" } */
> >> > +
> >> > +void foo_float32 ()
> >> > +{
> >> > +  volatile float a;
> >> > +  a = -1.0;
> >> > +  a = 1.1754944e-38;
> >> > +  a = 1.0/(1 << 16);
> >> > +  a = 1.0/(1 << 15);
> >> > +  a = 1.0/(1 << 8);
> >> > +  a = 1.0/(1 << 7);
> >> > +  a = 1.0/(1 << 4);
> >> > +  a = 1.0/(1 << 3);
> >> > +  a = 1.0/(1 << 2);
> >> > +  a = 0.3125;
> >> > +  a = 0.375;
> >> > +  a = 0.4375;
> >> > +  a = 0.5;
> >> > +  a = 0.625;
> >> > +  a = 0.75;
> >> > +  a = 0.875;
> >> > +  a = 1.0;
> >> > +  a = 1.25;
> >> > +  a = 1.5;
> >> > +  a = 1.75;
> >> > +  a = 2.0;
> >> > +  a = 2.5;
> >> > +  a = 3.0;
> >> > +  a = 1.0*(1 << 2);
> >> > +  a = 1.0*(1 << 3);
> >> > +  a = 1.0*(1 << 4);
> >> > +  a = 1.0*(1 << 7);
> >> > +  a = 1.0*(1 << 8);
> >> > +  a = 1.0*(1 << 15);
> >> > +  a = 1.0*(1 << 16);
> >> > +  a = __builtin_inff ();
> >> > +  a = __builtin_nanf ("");
> >> > +}
> >> > +
> >> > +void foo_double64 ()
> >> > +{
> >> > +  volatile double a;
> >> > +  a = -1.0;
> >> > +  a = 2.2250738585072014E-308;
> >> > +  a = 1.0/(1 << 16);
> >> > +  a = 1.0/(1 << 15);
> >> > +  a = 1.0/(1 << 8);
> >> > +  a = 1.0/(1 << 7);
> >> > +  a = 1.0/(1 << 4);
> >> > +  a = 1.0/(1 << 3);
> >> > +  a = 1.0/(1 << 2);
> >> > +  a = 0.3125;
> >> > +  a = 0.375;
> >> > +  a = 0.4375;
> >> > +  a = 0.5;
> >> > +  a = 0.625;
> >> > +  a = 0.75;
> >> > +  a = 0.875;
> >> > +  a = 1.0;
> >> > +  a = 1.25;
> >> > +  a = 1.5;
> >> > +  a = 1.75;
> >> > +  a = 2.0;
> >> > +  a = 2.5;
> >> > +  a = 3.0;
> >> > +  a = 1.0*(1 << 2);
> >> > +  a = 1.0*(1 << 3);
> >> > +  a = 1.0*(1 << 4);
> >> > +  a = 1.0*(1 << 7);
> >> > +  a = 1.0*(1 << 8);
> >> > +  a = 1.0*(1 << 15);
> >> > +  a = 1.0*(1 << 16);
> >> > +  a = __builtin_inf ();
> >> > +  a = __builtin_nan ("");
> >> > +}
> >> > +
> >> > +/* { dg-final { scan-assembler-times "fli.s" 32 } } */
> >> > +/* { dg-final { scan-assembler-times "fli.d" 32 } } */
> >> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
> >> > new file mode 100644
> >> > index 00000000000..5a52adce36a
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
> >> > @@ -0,0 +1,10 @@
> >> > +/* { dg-do compile } */
> >> > +/* { dg-options "-march=rv32g_zfa -mabi=ilp32 -O0" } */
> >> > +
> >> > +double foo(long long a)
> >> > +{
> >> > +  return (double)(a + 3);
> >> > +}
> >> > +
> >> > +/* { dg-final { scan-assembler-times "fmvp.d.x" 1 } } */
> >> > +/* { dg-final { scan-assembler-times "fmvh.x.d" 1 } } */
> >> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
> >> > new file mode 100644
> >> > index 00000000000..b53601d6e1f
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
> >> > @@ -0,0 +1,42 @@
> >> > +/* { dg-do compile } */
> >> > +/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O2" } */
> >> > +
> >> > +extern float a;
> >> > +extern double b;
> >> > +
> >> > +void foo (float *x, double *y)
> >> > +{
> >> > +  {
> >> > +    *x = __builtin_roundf (a);
> >> > +    *y = __builtin_round (b);
> >> > +  }
> >> > +  {
> >> > +    *x = __builtin_floorf (a);
> >> > +    *y = __builtin_floor (b);
> >> > +  }
> >> > +  {
> >> > +    *x = __builtin_ceilf (a);
> >> > +    *y = __builtin_ceil (b);
> >> > +  }
> >> > +  {
> >> > +    *x = __builtin_truncf (a);
> >> > +    *y = __builtin_trunc (b);
> >> > +  }
> >> > +  {
> >> > +    *x = __builtin_roundevenf (a);
> >> > +    *y = __builtin_roundeven (b);
> >> > +  }
> >> > +  {
> >> > +    *x = __builtin_nearbyintf (a);
> >> > +    *y = __builtin_nearbyint (b);
> >> > +  }
> >> > +  {
> >> > +    *x = __builtin_rintf (a);
> >> > +    *y = __builtin_rint (b);
> >> > +  }
> >> > +}
> >> > +
> >> > +/* { dg-final { scan-assembler-times "fround.s" 6 } } */
> >> > +/* { dg-final { scan-assembler-times "fround.d" 6 } } */
> >> > +/* { dg-final { scan-assembler-times "froundnx.s" 1 } } */
> >> > +/* { dg-final { scan-assembler-times "froundnx.d" 1 } } */
> >> > diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fround.c b/gcc/testsuite/gcc.target/riscv/zfa-fround.c
> >> > new file mode 100644
> >> > index 00000000000..c10de82578e
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.target/riscv/zfa-fround.c
> >> > @@ -0,0 +1,42 @@
> >> > +/* { dg-do compile } */
> >> > +/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O2" } */
> >> > +
> >> > +extern float a;
> >> > +extern double b;
> >> > +
> >> > +void foo (float *x, double *y)
> >> > +{
> >> > +  {
> >> > +    *x = __builtin_roundf (a);
> >> > +    *y = __builtin_round (b);
> >> > +  }
> >> > +  {
> >> > +    *x = __builtin_floorf (a);
> >> > +    *y = __builtin_floor (b);
> >> > +  }
> >> > +  {
> >> > +    *x = __builtin_ceilf (a);
> >> > +    *y = __builtin_ceil (b);
> >> > +  }
> >> > +  {
> >> > +    *x = __builtin_truncf (a);
> >> > +    *y = __builtin_trunc (b);
> >> > +  }
> >> > +  {
> >> > +    *x = __builtin_roundevenf (a);
> >> > +    *y = __builtin_roundeven (b);
> >> > +  }
> >> > +  {
> >> > +    *x = __builtin_nearbyintf (a);
> >> > +    *y = __builtin_nearbyint (b);
> >> > +  }
> >> > +  {
> >> > +    *x = __builtin_rintf (a);
> >> > +    *y = __builtin_rint (b);
> >> > +  }
> >> > +}
> >> > +
> >> > +/* { dg-final { scan-assembler-times "fround.s" 6 } } */
> >> > +/* { dg-final { scan-assembler-times "fround.d" 6 } } */
> >> > +/* { dg-final { scan-assembler-times "froundnx.s" 1 } } */
> >> > +/* { dg-final { scan-assembler-times "froundnx.d" 1 } } */
> >> > --
> >> > 2.17.1
> >> >

* Re: [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2.
  2023-04-19  9:57 [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2 Jin Ma
  2023-05-05 15:03 ` Christoph Müllner
@ 2023-05-05 23:31 ` Jeff Law
  2023-05-06  7:54 ` Jin Ma
  2023-05-15 13:16 ` [PATCH v9] " Jin Ma
  3 siblings, 0 replies; 20+ messages in thread
From: Jeff Law @ 2023-05-05 23:31 UTC (permalink / raw)
  To: Jin Ma, gcc-patches
  Cc: kito.cheng, kito.cheng, palmer, christoph.muellner, ijinma



On 4/19/23 03:57, Jin Ma wrote:
> This patch adds the 'Zfa' extension for riscv, which is based on:
>    https://github.com/riscv/riscv-isa-manual/commits/zfb
>    https://github.com/riscv/riscv-isa-manual/commit/1f038182810727f5feca311072e630d6baac51da
> 
> The binutils-gdb for 'Zfa' extension:
>    https://github.com/a4lg/binutils-gdb/commits/riscv-zfa
> 
> What needs special explanation is:
> 1, The immediate number of the instructions FLI.H/S/D is represented in the assembly as a
>    floating-point value, with scientific counting when rs1 is 1,2, and decimal numbers for
>    the rest.
> 
>    Related llvm link:
>      https://reviews.llvm.org/D145645
>    Related discussion link:
>      https://github.com/riscv/riscv-isa-manual/issues/980
Right.  I think the goal right now is to get the bulk of this reviewed.  
Ideally we'll get to the point where the only outstanding issue is 
the interface between the assembler & gcc.

> 
> 2, According to riscv-spec, "The FCVTMO D.W.D instruction was added principally to
>    accelerate the processing of JavaScript Numbers.", so it seems that no implementation
>    is required.
Fair enough.  There seems to be a general desire to wire up builtins 
for many things that aren't directly usable by the compiler.  So 
consider such a change as a follow-up.   I don't think something like 
this should hold up the bulk of Zfa.

> 
> 3, The instructions FMINM and FMAXM correspond to C23 library function fminimum and fmaximum.
>    Therefore, this patch has simply implemented the pattern of fminm<hf\sf\df>3 and
>    fmaxm<hf\sf\df>3 to prepare for later.
Sounds good.


> 
> gcc/ChangeLog:
> 
> 	* common/config/riscv/riscv-common.cc: Add zfa extension version.
> 	* config/riscv/constraints.md (Zf): Constrain the floating point number that the
> 	instructions FLI.H/S/D can load.
> 	((TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS): enable FMVP.D.X and FMVH.X.D.
> 	* config/riscv/iterators.md (ceil): New.
> 	* config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): New.
> 	* config/riscv/riscv.cc (find_index_in_array): New.
> 	(riscv_float_const_rtx_index_for_fli): Get the index of the floating-point number that
> 	the instructions FLI.H/S/D can mov.
> 	(riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be used, memory is not applicable.
> 	(riscv_const_insns): The cost of FLI.H/S/D is 3.
> 	(riscv_legitimize_const_move): Likewise.
> 	(riscv_split_64bit_move_p): If instruction FLI.H/S/D can be used, no split is required.
> 	(riscv_output_move): Output the mov instructions in zfa extension.
> 	(riscv_print_operand): Output the floating-point value of the FLI.H/S/D immediate in assembly
> 	(riscv_secondary_memory_needed): Likewise.
> 	* config/riscv/riscv.h (GP_REG_RTX_P): New.
> 	* config/riscv/riscv.md (fminm<mode>3): New.
> 

> index c448e6b37e9..62d9094f966 100644
> --- a/gcc/config/riscv/constraints.md
> +++ b/gcc/config/riscv/constraints.md
> @@ -118,6 +118,13 @@ (define_constraint "T"
>     (and (match_operand 0 "move_operand")
>          (match_test "CONSTANT_P (op)")))
>   
> +;; Zfa constraints.
> +
> +(define_constraint "Zf"
> +  "A floating point number that can be loaded using instruction `fli` in zfa."
> +  (and (match_code "const_double")
> +       (match_test "(riscv_float_const_rtx_index_for_fli (op) != -1)")))
> +
>   ;; Vector constraints.
>   
>   (define_register_constraint "vr" "TARGET_VECTOR ? V_REGS : NO_REGS"
> @@ -183,8 +190,8 @@ (define_memory_constraint "Wdm"
>   
>   ;; Vendor ISA extension constraints.
>   
> -(define_register_constraint "th_f_fmv" "TARGET_XTHEADFMV ? FP_REGS : NO_REGS"
> +(define_register_constraint "th_f_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS"
>     "A floating-point register for XTheadFmv.")
>   
> -(define_register_constraint "th_r_fmv" "TARGET_XTHEADFMV ? GR_REGS : NO_REGS"
> +(define_register_constraint "th_r_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? GR_REGS : NO_REGS"
>     "An integer register for XTheadFmv.")
I think Christoph had good suggestions on the constraints.  So let's go 
with his suggestions.

You might consider a follow-up patch where you use negation of one of 
the predefined constants for synthesis.  I would not be surprised at all 
if that's as efficient on some cores as loading the negated constants 
out of the constant pool.  But I don't think it has to be a part of this 
patch.
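
To illustrate the idea only, here is a minimal sketch of how such a 
follow-up might decide between "fli" and "fli followed by a negate" for 
SFmode bit patterns.  The table is copied from fli_value_sf in the 
patch; the helper names are hypothetical and are not part of the patch, 
and whether fli + fneg actually beats a constant-pool load would have 
to be measured per core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Binary32 bit patterns of the 32 FLI.S immediates, copied from
   fli_value_sf in the patch.  */
static const uint32_t fli_sf_bits[32] = {
  0xbf800000, 0x00800000, 0x37800000, 0x38000000,
  0x3b800000, 0x3c000000, 0x3d800000, 0x3e000000,
  0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000,
  0x3f000000, 0x3f200000, 0x3f400000, 0x3f600000,
  0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000,
  0x40000000, 0x40200000, 0x40400000, 0x40800000,
  0x41000000, 0x41800000, 0x43000000, 0x43800000,
  0x47000000, 0x47800000, 0x7f800000, 0x7fc00000
};

/* Hypothetical helper: index of BITS in the table, or -1 if absent.  */
static int
fli_sf_index (uint32_t bits)
{
  for (int i = 0; i < 32; i++)
    if (fli_sf_bits[i] == bits)
      return i;
  return -1;
}

/* Hypothetical helper: return the table index to load for BITS and set
   *NEEDS_NEG to 1 when the loaded value must then be negated.  Returns
   -1 when neither BITS nor its negation is in the table.  */
static int
fli_sf_index_or_negated (uint32_t bits, int *needs_neg)
{
  assert (needs_neg != NULL);
  *needs_neg = 0;

  int idx = fli_sf_index (bits);
  if (idx != -1)
    return idx;

  /* Flipping the sign bit negates an IEEE 754 value.  */
  idx = fli_sf_index (bits ^ UINT32_C (0x80000000));
  if (idx != -1)
    *needs_neg = 1;
  return idx;
}
```

With this sketch, -2.0f (0xc0000000) maps to the table entry for 2.0f 
with the negate flag set, while -1.0f is found directly because the 
table already contains it.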




> diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
> index 9b767038452..c81b08e3cc5 100644
> --- a/gcc/config/riscv/iterators.md
> +++ b/gcc/config/riscv/iterators.md
> @@ -288,3 +288,8 @@ (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
>   (define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET "le")])
>   (define_int_attr QUIET_PATTERN [(UNSPEC_FLT_QUIET "LT") (UNSPEC_FLE_QUIET "LE")])
>   
> +(define_int_iterator ROUND [UNSPEC_ROUND UNSPEC_FLOOR UNSPEC_CEIL UNSPEC_BTRUNC UNSPEC_ROUNDEVEN UNSPEC_NEARBYINT])
> +(define_int_attr round_pattern [(UNSPEC_ROUND "round") (UNSPEC_FLOOR "floor") (UNSPEC_CEIL "ceil")
> +				(UNSPEC_BTRUNC "btrunc") (UNSPEC_ROUNDEVEN "roundeven") (UNSPEC_NEARBYINT "nearbyint")])
> +(define_int_attr round_rm [(UNSPEC_ROUND "rmm") (UNSPEC_FLOOR "rdn") (UNSPEC_CEIL "rup")
> +			   (UNSPEC_BTRUNC "rtz") (UNSPEC_ROUNDEVEN "rne") (UNSPEC_NEARBYINT "dyn")])
Do we really need to use unspecs for all these cases?  I would expect 
some to correspond to well-known RTX codes such as trunc, round, ceil, 
nearbyint, etc.

In general, we should try to avoid unspecs when there is a clear 
semantic match between the instruction and GCC's RTX opcodes.  So please 
review the existing RTX code semantics to see if any match the new 
instructions.  If there are matches, use those RTX codes rather than 
UNSPECs.



>   
> +/* Immediate values loaded by the FLI.S instruction in Chapter 25 of the latest RISC-V ISA
> +   Manual draft. For details, please see:
> +   https://github.com/riscv/riscv-isa-manual/releases/tag/draft-20221217-cb3b9d1 */
> +
> +unsigned HOST_WIDE_INT fli_value_hf[32] =
> +{
> +  0xbc00, 0x400, 0x100, 0x200, 0x1c00, 0x2000, 0x2c00, 0x3000,
> +  0x3400, 0x3500, 0x3600, 0x3700, 0x3800, 0x3900, 0x3a00, 0x3b00,
> +  0x3c00, 0x3d00, 0x3e00, 0x3f00, 0x4000, 0x4100, 0x4200, 0x4400,
> +  0x4800, 0x4c00, 0x5800, 0x5c00, 0x7800,
> +  /* Only used for filling, ensuring that 29 and 30 of HF are the same. */
> +  0x7800,
> +  0x7c00, 0x7e00,
> +};
> +
> +unsigned HOST_WIDE_INT fli_value_sf[32] =
> +{
> +  0xbf800000, 0x00800000, 0x37800000, 0x38000000, 0x3b800000, 0x3c000000, 0x3d800000, 0x3e000000,
> +  0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000, 0x3f000000, 0x3f200000, 0x3f400000, 0x3f600000,
> +  0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000, 0x40000000, 0x40200000, 0x40400000, 0x40800000,
> +  0x41000000, 0x41800000, 0x43000000, 0x43800000, 0x47000000, 0x47800000, 0x7f800000, 0x7fc00000
> +};
> +
> +unsigned HOST_WIDE_INT fli_value_df[32] =
> +{
> +  0xbff0000000000000, 0x10000000000000, 0x3ef0000000000000, 0x3f00000000000000,
> +  0x3f70000000000000, 0x3f80000000000000, 0x3fb0000000000000, 0x3fc0000000000000,
> +  0x3fd0000000000000, 0x3fd4000000000000, 0x3fd8000000000000, 0x3fdc000000000000,
> +  0x3fe0000000000000, 0x3fe4000000000000, 0x3fe8000000000000, 0x3fec000000000000,
> +  0x3ff0000000000000, 0x3ff4000000000000, 0x3ff8000000000000, 0x3ffc000000000000,
> +  0x4000000000000000, 0x4004000000000000, 0x4008000000000000, 0x4010000000000000,
> +  0x4020000000000000, 0x4030000000000000, 0x4060000000000000, 0x4070000000000000,
> +  0x40e0000000000000, 0x40f0000000000000, 0x7ff0000000000000, 0x7ff8000000000000,
> +};
Going to assume these are sane.  I think the only concern would be 
endianness, but it looks like you handle that reasonably.
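
For anyone who wants to spot-check the tables rather than assume, a 
minimal host-side sketch follows.  It assumes the host represents float 
and double as IEEE 754 binary32 and binary64; the helper names are 
hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: raw bit pattern of a host float.
   Assumes the host float is IEEE 754 binary32.  */
static uint32_t
float_bits (float f)
{
  uint32_t u;
  assert (sizeof u == sizeof f);
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Hypothetical helper: raw bit pattern of a host double.
   Assumes the host double is IEEE 754 binary64.  */
static uint64_t
double_bits (double d)
{
  uint64_t u;
  assert (sizeof u == sizeof d);
  memcpy (&u, &d, sizeof u);
  return u;
}
```

For example, float_bits (0.3125f) should equal entry 9 of fli_value_sf 
(0x3ea00000), and double_bits (2.5) should equal entry 21 of 
fli_value_df (0x4004000000000000).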
> +
> +/* Find the index of TARGET in ARRAY, and return -1 if not found. */
> +
> +static int
> +find_index_in_array (unsigned HOST_WIDE_INT target, unsigned HOST_WIDE_INT *array, int len)
> +{
> +  if (array == NULL)
> +    return -1;
> +
> +  for (int i = 0; i < len; i++)
> +    {
> +      if (target == array[i])
> +	return i;
> +    }
> +  return -1;
> +}
Given the way constraint and operand matching occurs, I wouldn't be 
surprised if this search turns out to be compile-time expensive.
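
If it does show up in profiles, one cheap option is to reject most 
constants before scanning.  A minimal sketch for the SF table only, 
relying on the observation that every entry of fli_value_sf has its low 
21 bits clear.  The function name is hypothetical, and this is not what 
the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Binary32 bit patterns of the 32 FLI.S immediates, copied from
   fli_value_sf in the patch.  */
static const uint32_t fli_sf_bits[32] = {
  0xbf800000, 0x00800000, 0x37800000, 0x38000000,
  0x3b800000, 0x3c000000, 0x3d800000, 0x3e000000,
  0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000,
  0x3f000000, 0x3f200000, 0x3f400000, 0x3f600000,
  0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000,
  0x40000000, 0x40200000, 0x40400000, 0x40800000,
  0x41000000, 0x41800000, 0x43000000, 0x43800000,
  0x47000000, 0x47800000, 0x7f800000, 0x7fc00000
};

/* Hypothetical helper: index of BITS in the FLI.S table, or -1.  */
static int
fli_sf_index_filtered (uint32_t bits)
{
  const size_t len = sizeof fli_sf_bits / sizeof fli_sf_bits[0];
  assert (len == 32);

  /* Every table entry has the low 21 mantissa bits clear, so any
     constant with one of them set is rejected without a scan.  */
  if ((bits & UINT32_C (0x001fffff)) != 0)
    return -1;

  /* Fall back to the linear scan for the few remaining candidates.  */
  for (size_t i = 0; i < len; i++)
    if (fli_sf_bits[i] == bits)
      return (int) i;
  return -1;
}
```

The filter only removes work for constants that cannot match; a value 
such as 32.0f (0x42000000) passes the filter and is still rejected by 
the scan.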



> +
> +/* Return index of the FLI instruction table if rtx X is an immediate constant that
> +   can be moved using a single FLI instruction in zfa extension. -1 otherwise. */
> +
> +int
> +riscv_float_const_rtx_index_for_fli (rtx x)
> +{
> +  machine_mode mode = GET_MODE (x);
> +
> +  if (!TARGET_ZFA || mode == VOIDmode
> +      || !CONST_DOUBLE_P(x)
> +      || (mode == HFmode && !TARGET_ZFH)
> +      || (mode == SFmode && !TARGET_HARD_FLOAT)
> +      || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
> +    return -1;
Bring the "|| mode == VOIDmode" down to its own line similar to how 
you've done with the !CONST_DOUBLE_P check.



>   
>   static bool
> @@ -826,6 +936,9 @@ riscv_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
>     if (GET_CODE (x) == HIGH)
>       return true;
>   
> +  if (riscv_float_const_rtx_index_for_fli (x) != -1)
> +   return true;
> +
So if you do a follow-up handling negative fli constants, obviously we'd 
need further changes in this code.


> @@ -1213,6 +1326,11 @@ riscv_const_insns (rtx x)
>         }
>   
>       case CONST_DOUBLE:
> +      /* See if we can use FMV directly.  */
> +      if (riscv_float_const_rtx_index_for_fli (x) != -1)
> +	return 3;
That seems fairly high cost-wise.  Where did this value come from?   Or 
is it relative to COSTS_N_INSNS?




>     if (TARGET_DOUBLE_FLOAT
>         && ((FP_REG_RTX_P (src) && FP_REG_RTX_P (dest))
>   	  || (FP_REG_RTX_P (dest) && MEM_P (src))
>   	  || (FP_REG_RTX_P (src) && MEM_P (dest))
> +	  || (TARGET_ZFA
> +	      && ((FP_REG_RTX_P (dest) && GP_REG_RTX_P (src))
> +	      || (FP_REG_RTX_P (src) && GP_REG_RTX_P (dest))))
The formatting of the second FP_REG_RTX_P check looks goofy, but that 
may be a mailer issue.  Double check that the "|| FP_REG" lines up 
under the FP_REG_RTX_P on the line above.




> @@ -2968,6 +3103,14 @@ riscv_output_move (rtx dest, rtx src)
>   	  case 8:
>   	    return "fld\t%0,%1";
>   	  }
> +
> +      if (src_code == CONST_DOUBLE && (riscv_float_const_rtx_index_for_fli (src) != -1))
> +	switch (width)
> +	  {
> +	    case 2: return "fli.h\t%0,%1";
> +	    case 4: return "fli.s\t%0,%1";
> +	    case 8: return "fli.d\t%0,%1";
> +	  }
We generally discourage having code on the same line as a case 
label, so bring those return statements down to a new line.
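
Something along these lines, shown here as a stand-alone sketch so it 
compiles on its own.  The function name and the NULL default are 
hypothetical; in the patch the switch lives inside riscv_output_move.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-alone helper mirroring the switch in
   riscv_output_move: map an operand width in bytes to the FLI
   output template, or NULL for any other width.  */
static const char *
fli_template_for_width (int width)
{
  assert (width > 0);

  switch (width)
    {
    case 2:
      return "fli.h\t%0,%1";
    case 4:
      return "fli.s\t%0,%1";
    case 8:
      return "fli.d\t%0,%1";
    default:
      return NULL;
    }
}
```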





> @@ -1580,6 +1609,26 @@ (define_insn "l<rint_pattern><ANYF:mode><GPR:mode>2"
>     [(set_attr "type" "fcvt")
>      (set_attr "mode" "<ANYF:MODE>")])
>   
> +(define_insn "<round_pattern><ANYF:mode>2"
> +  [(set (match_operand:ANYF     0 "register_operand" "=f")
> +	(unspec:ANYF
> +	    [(match_operand:ANYF 1 "register_operand" " f")]
> +	ROUND))]
> +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> +  "fround.<ANYF:fmt>\t%0,%1,<round_rm>"
> +  [(set_attr "type" "fcvt")
> +   (set_attr "mode" "<ANYF:MODE>")])
> +
> +(define_insn "rint<ANYF:mode>2"
> +  [(set (match_operand:ANYF     0 "register_operand" "=f")
> +	(unspec:ANYF
> +	    [(match_operand:ANYF 1 "register_operand" " f")]
> +	UNSPEC_RINT))]
> +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> +  "froundnx.<ANYF:fmt>\t%0,%1"
> +  [(set_attr "type" "fcvt")
> +   (set_attr "mode" "<ANYF:MODE>")])
Please review the existing RTX codes and their semantics in the 
internals manual and if any of the new instructions match those existing 
primitives, implement them using those RTX codes rather than with an UNSPEC.


Overall it looks pretty good.

jeff

* Re: [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2.
  2023-04-19  9:57 [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2 Jin Ma
  2023-05-05 15:03 ` Christoph Müllner
  2023-05-05 23:31 ` Jeff Law
@ 2023-05-06  7:54 ` Jin Ma
  2023-05-06 12:53   ` jinma
  2023-05-15 13:16 ` [PATCH v9] " Jin Ma
  3 siblings, 1 reply; 20+ messages in thread
From: Jin Ma @ 2023-05-06  7:54 UTC (permalink / raw)
  To: gcc-patches; +Cc: jeffreyalaw, kito.cheng, kito.cheng, palmer, ijinma


> On 4/19/23 03:57, Jin Ma wrote:
> > This patch adds the 'Zfa' extension for riscv, which is based on:
> >    https://github.com/riscv/riscv-isa-manual/commits/zfb
> >    https://github.com/riscv/riscv-isa-manual/commit/1f038182810727f5feca311072e630d6baac51da
> > 
> > The binutils-gdb for 'Zfa' extension:
> >    https://github.com/a4lg/binutils-gdb/commits/riscv-zfa
> > 
> > What needs special explanation is:
> > 1, The immediate number of the instructions FLI.H/S/D is represented in the assembly as a
> >    floating-point value, with scientific counting when rs1 is 1,2, and decimal numbers for
> >    the rest.
> > 
> >    Related llvm link:
> >      https://reviews.llvm.org/D145645
> >    Related discussion link:
> >      https://github.com/riscv/riscv-isa-manual/issues/980
> Right.  I think the goal right now is to get the bulk of this reviewed.  
> Ideally we'll get to the point where the only outstanding issue is 
> the interface between the assembler & gcc.

I will soon send a new version that follows the latest binutils patch (v5):
https://sourceware.org/pipermail/binutils/2023-April/127060.html

> 
> > 
> > 2, According to riscv-spec, "The FCVTMO D.W.D instruction was added principally to
> >    accelerate the processing of JavaScript Numbers.", so it seems that no implementation
> >    is required.
> Fair enough.  There seems to be a general desire to wire up builtins 
> for many things that aren't directly usable by the compiler.  So 
> consider such a change as a follow-up.   I don't think something like 
> this should hold up the bulk of Zfa.
> 
> > 
> > 3, The instructions FMINM and FMAXM correspond to C23 library function fminimum and fmaximum.
> >    Therefore, this patch has simply implemented the pattern of fminm<hf\sf\df>3 and
> >    fmaxm<hf\sf\df>3 to prepare for later.
> Sounds good.
> 
> 
> > 
> > gcc/ChangeLog:
> > 
> > 	* common/config/riscv/riscv-common.cc: Add zfa extension version.
> > 	* config/riscv/constraints.md (Zf): Constrain the floating point number that the
> > 	instructions FLI.H/S/D can load.
> > 	((TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS): enable FMVP.D.X and FMVH.X.D.
> > 	* config/riscv/iterators.md (ceil): New.
> > 	* config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): New.
> > 	* config/riscv/riscv.cc (find_index_in_array): New.
> > 	(riscv_float_const_rtx_index_for_fli): Get the index of the floating-point number that
> > 	the instructions FLI.H/S/D can mov.
> > 	(riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be used, memory is not applicable.
> > 	(riscv_const_insns): The cost of FLI.H/S/D is 3.
> > 	(riscv_legitimize_const_move): Likewise.
> > 	(riscv_split_64bit_move_p): If instruction FLI.H/S/D can be used, no split is required.
> > 	(riscv_output_move): Output the mov instructions in zfa extension.
> > 	(riscv_print_operand): Output the floating-point value of the FLI.H/S/D immediate in assembly
> > 	(riscv_secondary_memory_needed): Likewise.
> > 	* config/riscv/riscv.h (GP_REG_RTX_P): New.
> > 	* config/riscv/riscv.md (fminm<mode>3): New.
> > 
> 
> > index c448e6b37e9..62d9094f966 100644
> > --- a/gcc/config/riscv/constraints.md
> > +++ b/gcc/config/riscv/constraints.md
> > @@ -118,6 +118,13 @@ (define_constraint "T"
> >     (and (match_operand 0 "move_operand")
> >          (match_test "CONSTANT_P (op)")))
> >   
> > +;; Zfa constraints.
> > +
> > +(define_constraint "Zf"
> > +  "A floating point number that can be loaded using instruction `fli` in zfa."
> > +  (and (match_code "const_double")
> > +       (match_test "(riscv_float_const_rtx_index_for_fli (op) != -1)")))
> > +
> >   ;; Vector constraints.
> >   
> >   (define_register_constraint "vr" "TARGET_VECTOR ? V_REGS : NO_REGS"
> > @@ -183,8 +190,8 @@ (define_memory_constraint "Wdm"
> >   
> >   ;; Vendor ISA extension constraints.
> >   
> > -(define_register_constraint "th_f_fmv" "TARGET_XTHEADFMV ? FP_REGS : NO_REGS"
> > +(define_register_constraint "th_f_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS"
> >     "A floating-point register for XTheadFmv.")
> >   
> > -(define_register_constraint "th_r_fmv" "TARGET_XTHEADFMV ? GR_REGS : NO_REGS"
> > +(define_register_constraint "th_r_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? GR_REGS : NO_REGS"
> >     "An integer register for XTheadFmv.")
> I think Christoph had good suggestions on the constraints.  So let's go 
> with his suggestions.
> 
> You might consider a follow-up patch where you use negation of one of 
> the predefined constants for synthesis.  I would not be surprised at all 
> if that's as efficient on some cores as loading the negated constants 
> out of the constant pool.  But I don't think it has to be a part of this 
> patch.
> 

I also think Christoph is right, and I will revise the constraints according to his suggestions.

> 
> 
> 
> > diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
> > index 9b767038452..c81b08e3cc5 100644
> > --- a/gcc/config/riscv/iterators.md
> > +++ b/gcc/config/riscv/iterators.md
> > @@ -288,3 +288,8 @@ (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
> >   (define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET "le")])
> >   (define_int_attr QUIET_PATTERN [(UNSPEC_FLT_QUIET "LT") (UNSPEC_FLE_QUIET "LE")])
> >   
> > +(define_int_iterator ROUND [UNSPEC_ROUND UNSPEC_FLOOR UNSPEC_CEIL UNSPEC_BTRUNC UNSPEC_ROUNDEVEN UNSPEC_NEARBYINT])
> > +(define_int_attr round_pattern [(UNSPEC_ROUND "round") (UNSPEC_FLOOR "floor") (UNSPEC_CEIL "ceil")
> > +				(UNSPEC_BTRUNC "btrunc") (UNSPEC_ROUNDEVEN "roundeven") (UNSPEC_NEARBYINT "nearbyint")])
> > +(define_int_attr round_rm [(UNSPEC_ROUND "rmm") (UNSPEC_FLOOR "rdn") (UNSPEC_CEIL "rup")
> > +			   (UNSPEC_BTRUNC "rtz") (UNSPEC_ROUNDEVEN "rne") (UNSPEC_NEARBYINT "dyn")])
> Do we really need to use unspecs for all these cases?  I would expect 
> some correspond to the trunc, round, ceil, nearbyint, etc well known RTX 
> codes.
> 
> In general, we should try to avoid unspecs when there is a clear 
> semantic match between the instruction and GCC's RTX opcodes.  So please 
> review the existing RTX code semantics to see if any match the new 
> instructions.  If there are matches, use those RTX codes rather than 
> UNSPECs.

I'll try, thanks.

> 
> 
> 
> >   
> > +/* Immediate values loaded by the FLI.S instruction in Chapter 25 of the latest RISC-V ISA
> > +   Manual draft. For details, please see:
> > +   https://github.com/riscv/riscv-isa-manual/releases/tag/draft-20221217-cb3b9d1 */
> > +
> > +unsigned HOST_WIDE_INT fli_value_hf[32] =
> > +{
> > +  0xbc00, 0x400, 0x100, 0x200, 0x1c00, 0x2000, 0x2c00, 0x3000,
> > +  0x3400, 0x3500, 0x3600, 0x3700, 0x3800, 0x3900, 0x3a00, 0x3b00,
> > +  0x3c00, 0x3d00, 0x3e00, 0x3f00, 0x4000, 0x4100, 0x4200, 0x4400,
> > +  0x4800, 0x4c00, 0x5800, 0x5c00, 0x7800,
> > +  /* Only used for filling, ensuring that 29 and 30 of HF are the same. */
> > +  0x7800,
> > +  0x7c00, 0x7e00,
> > +};
> > +
> > +unsigned HOST_WIDE_INT fli_value_sf[32] =
> > +{
> > +  0xbf800000, 0x00800000, 0x37800000, 0x38000000, 0x3b800000, 0x3c000000, 0x3d800000, 0x3e000000,
> > +  0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000, 0x3f000000, 0x3f200000, 0x3f400000, 0x3f600000,
> > +  0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000, 0x40000000, 0x40200000, 0x40400000, 0x40800000,
> > +  0x41000000, 0x41800000, 0x43000000, 0x43800000, 0x47000000, 0x47800000, 0x7f800000, 0x7fc00000
> > +};
> > +
> > +unsigned HOST_WIDE_INT fli_value_df[32] =
> > +{
> > +  0xbff0000000000000, 0x10000000000000, 0x3ef0000000000000, 0x3f00000000000000,
> > +  0x3f70000000000000, 0x3f80000000000000, 0x3fb0000000000000, 0x3fc0000000000000,
> > +  0x3fd0000000000000, 0x3fd4000000000000, 0x3fd8000000000000, 0x3fdc000000000000,
> > +  0x3fe0000000000000, 0x3fe4000000000000, 0x3fe8000000000000, 0x3fec000000000000,
> > +  0x3ff0000000000000, 0x3ff4000000000000, 0x3ff8000000000000, 0x3ffc000000000000,
> > +  0x4000000000000000, 0x4004000000000000, 0x4008000000000000, 0x4010000000000000,
> > +  0x4020000000000000, 0x4030000000000000, 0x4060000000000000, 0x4070000000000000,
> > +  0x40e0000000000000, 0x40f0000000000000, 0x7ff0000000000000, 0x7ff8000000000000,
> > +};
> Going to assume these are sane.  I think the only concern would be 
> endianness, but it looks like you handle that reasonably.

I handle endianness in a simple way in the function riscv_float_const_rtx_index_for_fli(),
which seems to be correct at present.
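
To make the endianness handling concrete, here is a minimal standalone sketch of the idea, not the code in the patch itself. The names combine_words_sketch and may_be_fli_double are hypothetical, and the big_endian argument stands in for the target's BYTES_BIG_ENDIAN. It joins the two 32-bit words of a double into one 64-bit pattern and applies the early reject used for DFmode:

```c
#include <stdint.h>

/* Join the two 32-bit words of a double into one 64-bit bit pattern.
   BIG_ENDIAN is a hypothetical stand-in for BYTES_BIG_ENDIAN: it selects
   which of the two words holds the low half.  */
static uint64_t
combine_words_sketch (const uint32_t words[2], int big_endian)
{
  int low = big_endian ? 1 : 0;
  uint64_t bits = words[low];
  bits |= (uint64_t) words[1 - low] << 32;
  return bits;
}

/* Every double in the FLI.D table has an all-zero low word, so any
   pattern with low bits set can be rejected before searching.  */
static int
may_be_fli_double (uint64_t bits)
{
  return (bits & 0xffffffffu) == 0;
}
```

With either word order, the pair {0, 0x3ff00000} (or its swapped form) yields 0x3ff0000000000000, the bit pattern of 1.0.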

In addition, in the next version I use the newer floating-point literal 
format instead, according to your suggestion.

For example:
unsigned HOST_WIDE_INT fli_value_sf[32] =
{
  0xbf8p20, 0x008p20, 0x378p20, 0x380p20, 0x3b8p20, 0x3c0p20, 0x3d8p20, 0x3e0p20,
  0x3e8p20, 0x3eap20, 0x3ecp20, 0x3eep20, 0x3f0p20, 0x3f2p20, 0x3f4p20, 0x3f6p20,
  0x3f8p20, 0x3fap20, 0x3fcp20, 0x3fep20, 0x400p20, 0x402p20, 0x404p20, 0x408p20,
  0x410p20, 0x418p20, 0x430p20, 0x438p20, 0x470p20, 0x478p20, 0x7f8p20, 0x7fcp20
};

Is that what you meant? I am not sure I understood correctly.
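
For what it is worth, a literal such as 0x3f8p20 is a hexadecimal floating constant of type double with the value 0x3f8 * 2^20, so after conversion to an integer it gives the same bit pattern as 0x3f800000. The small check below is only a sketch; the variable and function names are hypothetical and the explicit casts are my assumption about how the conversion should be spelled:

```c
#include <stdint.h>

/* The same table entry (1.0) spelled both ways.  The hex-float literal
   is a double, so it is cast to the integer type explicitly.  */
static const uint64_t sf_one_plain = 0x3f800000;
static const uint64_t sf_one_hexfloat = (uint64_t) 0x3f8p20;
static const uint64_t df_one_plain = 0x3ff0000000000000;
static const uint64_t df_one_hexfloat = (uint64_t) 0x3ff0p48;

/* Return 1 if both spellings produce identical bit patterns.  */
static int
hexfloat_spelling_matches (void)
{
  return sf_one_plain == sf_one_hexfloat
         && df_one_plain == df_one_hexfloat;
}
```

The two spellings are therefore interchangeable as integers; the hex-float form only shortens the run of trailing zeros.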

> > +
> > +/* Find the index of TARGET in ARRAY, and return -1 if not found. */
> > +
> > +static int
> > +find_index_in_array (unsigned HOST_WIDE_INT target, unsigned HOST_WIDE_INT *array, int len)
> > +{
> > +  if (array == NULL)
> > +    return -1;
> > +
> > +  for (int i = 0; i < len; i++)
> > +    {
> > +      if (target == array[i])
> > +	return i;
> > +    }
> > +  return -1;
> > +}
> Given the way constraint and operand matching occurs, I wouldn't be 
> surprised if this search turns out to be compile-time expensive.

Yes. I tried to find a better way, but the compiler still has to compare against the 32 values
that the fli instruction can load. This may need to be optimized, for example with a binary search.
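
As a rough sketch of that idea (an illustration only; fli_sf_bits and fli_index_sketch are hypothetical names, and the table repeats the single-precision bit patterns quoted above), the lookup could test the out-of-order leading entries directly and binary-search the sorted remainder:

```c
#include <stddef.h>
#include <stdint.h>

/* Single-precision FLI bit patterns.  Entry 0 (-1.0) is out of order;
   entry 1 is tested separately too, because in the half-precision
   table it is larger than entry 2.  Entries 2..31 ascend.  */
static const uint64_t fli_sf_bits[32] = {
  0xbf800000, 0x00800000, 0x37800000, 0x38000000,
  0x3b800000, 0x3c000000, 0x3d800000, 0x3e000000,
  0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000,
  0x3f000000, 0x3f200000, 0x3f400000, 0x3f600000,
  0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000,
  0x40000000, 0x40200000, 0x40400000, 0x40800000,
  0x41000000, 0x41800000, 0x43000000, 0x43800000,
  0x47000000, 0x47800000, 0x7f800000, 0x7fc00000
};

/* Return the table index of BITS, or -1 if it is not an FLI constant.  */
static int
fli_index_sketch (const uint64_t *table, uint64_t bits)
{
  if (table[0] == bits)
    return 0;
  if (table[1] == bits)
    return 1;

  /* Binary search over the half-open range [lo, hi); the bounds are
     unsigned and never step below 2.  */
  size_t lo = 2, hi = 32;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (table[mid] == bits)
        return (int) mid;
      if (table[mid] < bits)
        lo = mid + 1;
      else
        hi = mid;
    }
  return -1;
}
```

This needs at most five comparisons in the loop instead of up to 32, at the price of requiring the tail of each table to stay sorted.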

> 
> 
> 
> > +
> > +/* Return index of the FLI instruction table if rtx X is an immediate constant that
> > +   can be moved using a single FLI instruction in zfa extension. -1 otherwise. */
> > +
> > +int
> > +riscv_float_const_rtx_index_for_fli (rtx x)
> > +{
> > +  machine_mode mode = GET_MODE (x);
> > +
> > +  if (!TARGET_ZFA || mode == VOIDmode
> > +      || !CONST_DOUBLE_P(x)
> > +      || (mode == HFmode && !TARGET_ZFH)
> > +      || (mode == SFmode && !TARGET_HARD_FLOAT)
> > +      || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
> > +    return -1;
> Bring the "|| mode == VOIDmode" down to its own line similar to how 
> you've done with the !CONST_DOUBLE_P check.

I will fix this in the next version.

> 
> >   
> >   static bool
> > @@ -826,6 +936,9 @@ riscv_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
> >     if (GET_CODE (x) == HIGH)
> >       return true;
> >   
> > +  if (riscv_float_const_rtx_index_for_fli (x) != -1)
> > +   return true;
> > +
> So if you do a follow-up handling negative fli constants, obviously we'd 
> need further changes in this code.

This function looks up an index. For a constant that FLI can load, I think it returns only zero
or a positive index (and -1 when there is no match), so there should be no negative index.
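
If the follow-up is about negated constants (values such as -2.0 whose negation is in the table), a sketch of the check might look like the following. This is only an assumption about what such a follow-up could do: the function names are hypothetical and fli_sf_excerpt is a three-entry excerpt of the table, not the full list.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical excerpt of the FLI.S table: 0.5, 1.0 and 2.0.  */
static const uint32_t fli_sf_excerpt[] = {
  0x3f000000, 0x3f800000, 0x40000000
};

/* Return 1 if BITS appears in the excerpt.  */
static int
in_fli_excerpt (uint32_t bits)
{
  for (size_t i = 0; i < sizeof fli_sf_excerpt / sizeof fli_sf_excerpt[0]; i++)
    if (fli_sf_excerpt[i] == bits)
      return 1;
  return 0;
}

/* Return 1 if BITS is not loadable directly but its negation is, so
   that it could be synthesized as an FLI followed by a negate.
   Negation only flips the IEEE sign bit.  */
static int
needs_fli_plus_negate (uint32_t bits)
{
  return !in_fli_excerpt (bits) && in_fli_excerpt (bits ^ 0x80000000u);
}
```

Here 0xc0000000 (-2.0) qualifies, while 0x40000000 (2.0) is loadable directly and 0x3fc00000 (1.5, absent from the excerpt) is rejected.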

> > @@ -1213,6 +1326,11 @@ riscv_const_insns (rtx x)
> >         }
> >   
> >       case CONST_DOUBLE:
> > +      /* See if we can use FMV directly.  */
> > +      if (riscv_float_const_rtx_index_for_fli (x) != -1)
> > +	return 3;
> That seems fairly high cost-wise.  Where did this value come from?   Or 
> is it relative to COSTS_N_INSNS?

I referred to the related aarch64 patch, which returns COSTS_N_INSNS (3) in this case, so I
simply used 3 here. Should I change it to COSTS_N_INSNS (3)?
https://github.com/gcc-mirror/gcc/commit/a217096563e356fa03cc5163665148227613c62f#diff-2ea6a52c675e9f1862287091ef606b129d9e311224999af1cc017317c62c1efeR6942

> 
> >     if (TARGET_DOUBLE_FLOAT
> >         && ((FP_REG_RTX_P (src) && FP_REG_RTX_P (dest))
> >   	  || (FP_REG_RTX_P (dest) && MEM_P (src))
> >   	  || (FP_REG_RTX_P (src) && MEM_P (dest))
> > +	  || (TARGET_ZFA
> > +	      && ((FP_REG_RTX_P (dest) && GP_REG_RTX_P (src))
> > +	      || (FP_REG_RTX_P (src) && GP_REG_RTX_P (dest))))
> The formatting of the second FP_REG_RTX_P check looks goofy, but that 
> may be a mailer issue.  Double check the "|| FP_REG" should line up 
> under the FP_REG_RTX_P.


It will be fixed in the next version.

> 
> 
> > @@ -2968,6 +3103,14 @@ riscv_output_move (rtx dest, rtx src)
> >   	  case 8:
> >   	    return "fld\t%0,%1";
> >   	  }
> > +
> > +      if (src_code == CONST_DOUBLE && (riscv_float_const_rtx_index_for_fli (src) != -1))
> > +	switch (width)
> > +	  {
> > +	    case 2: return "fli.h\t%0,%1";
> > +	    case 4: return "fli.s\t%0,%1";
> > +	    case 8: return "fli.d\t%0,%1";
> > +	  }
> We generally discourage having code on the same line as a case 
> statement, so bring those return statements down to a new line.
> 


It will be fixed in the next version.

> 
> 
> 
> 
> > @@ -1580,6 +1609,26 @@ (define_insn "l<rint_pattern><ANYF:mode><GPR:mode>2"
> >     [(set_attr "type" "fcvt")
> >      (set_attr "mode" "<ANYF:MODE>")])
> >   
> > +(define_insn "<round_pattern><ANYF:mode>2"
> > +  [(set (match_operand:ANYF     0 "register_operand" "=f")
> > +	(unspec:ANYF
> > +	    [(match_operand:ANYF 1 "register_operand" " f")]
> > +	ROUND))]
> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> > +  "fround.<ANYF:fmt>\t%0,%1,<round_rm>"
> > +  [(set_attr "type" "fcvt")
> > +   (set_attr "mode" "<ANYF:MODE>")])
> > +
> > +(define_insn "rint<ANYF:mode>2"
> > +  [(set (match_operand:ANYF     0 "register_operand" "=f")
> > +	(unspec:ANYF
> > +	    [(match_operand:ANYF 1 "register_operand" " f")]
> > +	UNSPEC_RINT))]
> > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> > +  "froundnx.<ANYF:fmt>\t%0,%1"
> > +  [(set_attr "type" "fcvt")
> > +   (set_attr "mode" "<ANYF:MODE>")])
> Please review the existing RTX codes and their semantics in the 
> internals manual and if any of the new instructions match those existing 
> primitives, implement them using those RTX codes rather than with an UNSPEC.
>

I'll try, thanks.

> 
> Overall it looks pretty good.
> 
> jeff

Thank you for your guidance.

Jin Ma


* Re: [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2.
  2023-05-06  7:54 ` Jin Ma
@ 2023-05-06 12:53   ` jinma
  2023-05-16  3:59     ` Jeff Law
  0 siblings, 1 reply; 20+ messages in thread
From: jinma @ 2023-05-06 12:53 UTC (permalink / raw)
  To: gcc-patches; +Cc: jeffreyalaw, kito.cheng, kito.cheng, palmer, ijinma

> > > diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
> > > index 9b767038452..c81b08e3cc5 100644
> > > --- a/gcc/config/riscv/iterators.md
> > > +++ b/gcc/config/riscv/iterators.md
> > > @@ -288,3 +288,8 @@ (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
> > >   (define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET "le")])
> > >   (define_int_attr QUIET_PATTERN [(UNSPEC_FLT_QUIET "LT") (UNSPEC_FLE_QUIET "LE")])
> > >   
> > > +(define_int_iterator ROUND [UNSPEC_ROUND UNSPEC_FLOOR UNSPEC_CEIL UNSPEC_BTRUNC UNSPEC_ROUNDEVEN UNSPEC_NEARBYINT])
> > > +(define_int_attr round_pattern [(UNSPEC_ROUND "round") (UNSPEC_FLOOR "floor") (UNSPEC_CEIL "ceil")
> > > +				(UNSPEC_BTRUNC "btrunc") (UNSPEC_ROUNDEVEN "roundeven") (UNSPEC_NEARBYINT "nearbyint")])
> > > +(define_int_attr round_rm [(UNSPEC_ROUND "rmm") (UNSPEC_FLOOR "rdn") (UNSPEC_CEIL "rup")
> > > +			   (UNSPEC_BTRUNC "rtz") (UNSPEC_ROUNDEVEN "rne") (UNSPEC_NEARBYINT "dyn")])
> > Do we really need to use unspecs for all these cases?  I would expect 
> > some correspond to the trunc, round, ceil, nearbyint, etc well known RTX 
> > codes.
> > 
> > In general, we should try to avoid unspecs when there is a clear 
> > semantic match between the instruction and GCC's RTX opcodes.  So please 
> > review the existing RTX code semantics to see if any match the new 
> > instructions.  If there are matches, use those RTX codes rather than 
> > UNSPECs.
> 
> I'll try, thanks.


I am somewhat confused about this. I checked the GCC documentation and
found no RTX codes that correspond to round, ceil, nearbyint, etc.
Only "(fix:m x)" seems to correspond to trunc, which can be expressed
as rounding towards zero; I have not found equivalents for the others.
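
For reference, the six patterns in question and the rounding-mode operand each one passes to fround, as given by the round_pattern and round_rm attributes quoted above, can be written out as a plain lookup table. The struct and function names below are hypothetical and only serve to illustrate the mapping:

```c
#include <stddef.h>
#include <string.h>

/* One entry per pattern name and its rounding-mode operand, copied
   from the round_pattern / round_rm attributes in the patch.  */
struct round_map_entry
{
  const char *pattern;
  const char *rm;
};

static const struct round_map_entry round_map_sketch[] = {
  {"round", "rmm"},     {"floor", "rdn"},     {"ceil", "rup"},
  {"btrunc", "rtz"},    {"roundeven", "rne"}, {"nearbyint", "dyn"}
};

/* Return the rounding-mode operand for PATTERN, or NULL if PATTERN is
   not one of the six names handled by the ROUND iterator.  */
static const char *
rounding_mode_for (const char *pattern)
{
  size_t n = sizeof round_map_sketch / sizeof round_map_sketch[0];
  for (size_t i = 0; i < n; i++)
    if (strcmp (round_map_sketch[i].pattern, pattern) == 0)
      return round_map_sketch[i].rm;
  return NULL;
}
```

Note that rint is not in this table: the patch maps it to froundnx through a separate pattern.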


In addition, I found that other architectures also seem to use
unspecs for all these cases on the latest master branch.
arm: https://github.com/gcc-mirror/gcc/commit/1dd4fe1fd892458ce29f15f3ca95125a11b2534f#diff-159a39276c509272adfaeef91c2110f54f65c38f7fd1ab2f1e750af0a7f86377R1251
rs6000: https://github.com/gcc-mirror/gcc/commit/7042fe5ef83ff0585eb91144817105f26d566d4c#diff-1a2d4976d867ead4556899cab1dbb39f5069574276e06a2976fb62b771ece2e3R6995
i386: https://github.com/gcc-mirror/gcc/commit/3e8c4b925a9825fdb8c81f47b621f63108894362#diff-f00b14a8846eb6aaeb981077e36ac3668160d7dabb490beeb1f62792afa83281R23332

Can you give me some advice?

> > > @@ -1580,6 +1609,26 @@ (define_insn "l<rint_pattern><ANYF:mode><GPR:mode>2"
> > >     [(set_attr "type" "fcvt")
> > >      (set_attr "mode" "<ANYF:MODE>")])
> > >   
> > > +(define_insn "<round_pattern><ANYF:mode>2"
> > > +  [(set (match_operand:ANYF     0 "register_operand" "=f")
> > > +	(unspec:ANYF
> > > +	    [(match_operand:ANYF 1 "register_operand" " f")]
> > > +	ROUND))]
> > > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> > > +  "fround.<ANYF:fmt>\t%0,%1,<round_rm>"
> > > +  [(set_attr "type" "fcvt")
> > > +   (set_attr "mode" "<ANYF:MODE>")])
> > > +
> > > +(define_insn "rint<ANYF:mode>2"
> > > +  [(set (match_operand:ANYF     0 "register_operand" "=f")
> > > +	(unspec:ANYF
> > > +	    [(match_operand:ANYF 1 "register_operand" " f")]
> > > +	UNSPEC_RINT))]
> > > +  "TARGET_HARD_FLOAT && TARGET_ZFA"
> > > +  "froundnx.<ANYF:fmt>\t%0,%1"
> > > +  [(set_attr "type" "fcvt")
> > > +   (set_attr "mode" "<ANYF:MODE>")])
> > Please review the existing RTX codes and their semantics in the 
> > internals manual and if any of the new instructions match those existing 
> > primitives, implement them using those RTX codes rather than with an UNSPEC.
> >
> 
> I'll try, thanks.
> 

Thanks.

Jin Ma


* [PATCH v9] RISC-V: Add the 'zfa' extension, version 0.2
  2023-04-19  9:57 [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2 Jin Ma
                   ` (2 preceding siblings ...)
  2023-05-06  7:54 ` Jin Ma
@ 2023-05-15 13:16 ` Jin Ma
  2023-05-15 13:30   ` jinma
  2023-05-16  4:16   ` Jeff Law
  3 siblings, 2 replies; 20+ messages in thread
From: Jin Ma @ 2023-05-15 13:16 UTC (permalink / raw)
  To: gcc-patches
  Cc: jeffreyalaw, christoph.muellner, kito.cheng, kito.cheng, palmer,
	ijinma, Jin Ma

This patch adds the 'Zfa' extension for riscv, which is based on:
https://github.com/riscv/riscv-isa-manual/commits/zfb

The binutils-gdb for 'Zfa' extension:
https://sourceware.org/pipermail/binutils/2023-April/127060.html

What needs special explanation is:
1, The immediate of the instructions FLI.H/S/D is written in assembly as a
  floating-point value, using scientific notation when rs1 is 2 or 3 and plain decimal
  notation for the rest.

  Related llvm link:
    https://reviews.llvm.org/D145645
  Related discussion link:
    https://github.com/riscv/riscv-isa-manual/issues/980

2, According to riscv-spec, "The FCVTMOD.W.D instruction was added principally to
  accelerate the processing of JavaScript Numbers.", so it seems that no implementation
  is required.

3, The instructions FMINM and FMAXM correspond to the C23 library functions fminimum and fmaximum.
  Therefore, this patch only adds the patterns fminm<hf\sf\df>3 and
  fmaxm<hf\sf\df>3 in preparation for later use.

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc: Add zfa extension version.
	* config/riscv/constraints.md (zfli): Constrain the floating point number that the
	instructions FLI.H/S/D can load.
	* config/riscv/iterators.md (ceil): New.
	(rup): New.
	* config/riscv/riscv-opts.h (MASK_ZFA): New.
	(TARGET_ZFA): New.
	* config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): New.
	* config/riscv/riscv.cc (riscv_float_const_rtx_index_for_fli): New.
	(riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be used, memory is not applicable.
	(riscv_const_insns): Likewise.
	(riscv_legitimize_const_move): Likewise.
	(riscv_split_64bit_move_p): If instruction FLI.H/S/D can be used, no split is required.
	(riscv_split_doubleword_move): Likewise.
	(riscv_output_move): Output the mov instructions in zfa extension.
	(riscv_print_operand): Output the floating-point value of the FLI.H/S/D immediate in assembly
	(riscv_secondary_memory_needed): Likewise.
	* config/riscv/riscv.md (fminm<mode>3): New.
	(fmaxm<mode>3): New.
	(movsidf2_low_rv32): New.
	(movsidf2_high_rv32): New.
	(movdfsisi3_rv32): New.
	(f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zfa-fleq-fltq-rv32.c: New test.
	* gcc.target/riscv/zfa-fleq-fltq.c: New test.
	* gcc.target/riscv/zfa-fli-rv32.c: New test.
	* gcc.target/riscv/zfa-fli-zfh-rv32.c: New test.
	* gcc.target/riscv/zfa-fli-zfh.c: New test.
	* gcc.target/riscv/zfa-fli.c: New test.
	* gcc.target/riscv/zfa-fmovh-fmovp-rv32.c: New test.
	* gcc.target/riscv/zfa-fround-rv32.c: New test.
	* gcc.target/riscv/zfa-fround.c: New test.
---
 gcc/common/config/riscv/riscv-common.cc       |   4 +
 gcc/config/riscv/constraints.md               |  21 +-
 gcc/config/riscv/iterators.md                 |   5 +
 gcc/config/riscv/riscv-opts.h                 |   3 +
 gcc/config/riscv/riscv-protos.h               |   1 +
 gcc/config/riscv/riscv.cc                     | 204 +++++++++++++++++-
 gcc/config/riscv/riscv.md                     | 145 +++++++++++--
 .../gcc.target/riscv/zfa-fleq-fltq-rv32.c     |  19 ++
 .../gcc.target/riscv/zfa-fleq-fltq.c          |  19 ++
 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c |  79 +++++++
 .../gcc.target/riscv/zfa-fli-zfh-rv32.c       |  41 ++++
 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c  |  41 ++++
 gcc/testsuite/gcc.target/riscv/zfa-fli.c      |  79 +++++++
 .../gcc.target/riscv/zfa-fmovh-fmovp-rv32.c   |  10 +
 .../gcc.target/riscv/zfa-fround-rv32.c        |  42 ++++
 gcc/testsuite/gcc.target/riscv/zfa-fround.c   |  42 ++++
 16 files changed, 719 insertions(+), 36 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround.c

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 3a285dfbff0..550f6796e98 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -217,6 +217,8 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
   {"zfh",       ISA_SPEC_CLASS_NONE, 1, 0},
   {"zfhmin",    ISA_SPEC_CLASS_NONE, 1, 0},
 
+  {"zfa",     ISA_SPEC_CLASS_NONE, 0, 2},
+
   {"zmmul", ISA_SPEC_CLASS_NONE, 1, 0},
 
   {"svinval", ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1260,6 +1262,8 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
   {"zfhmin",    &gcc_options::x_riscv_zf_subext, MASK_ZFHMIN},
   {"zfh",       &gcc_options::x_riscv_zf_subext, MASK_ZFH},
 
+  {"zfa",       &gcc_options::x_riscv_zf_subext, MASK_ZFA},
+
   {"zmmul", &gcc_options::x_riscv_zm_subext, MASK_ZMMUL},
 
   {"svinval", &gcc_options::x_riscv_sv_subext, MASK_SVINVAL},
diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index c448e6b37e9..06d0cd47c3c 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -118,6 +118,19 @@ (define_constraint "T"
   (and (match_operand 0 "move_operand")
        (match_test "CONSTANT_P (op)")))
 
+;; Zfa constraints.
+
+(define_constraint "zfli"
+  "A floating point number that can be loaded using instruction `fli` in zfa."
+  (and (match_code "const_double")
+       (match_test "(riscv_float_const_rtx_index_for_fli (op) != -1)")))
+
+(define_register_constraint "zmvf" "(TARGET_ZFA || TARGET_XTHEADFMV) ? FP_REGS : NO_REGS"
+  "A floating-point register for ZFA or XTheadFmv.")
+
+(define_register_constraint "zmvr" "(TARGET_ZFA || TARGET_XTHEADFMV) ? GR_REGS : NO_REGS"
+  "An integer register for  ZFA or XTheadFmv.")
+
 ;; Vector constraints.
 
 (define_register_constraint "vr" "TARGET_VECTOR ? V_REGS : NO_REGS"
@@ -180,11 +193,3 @@ (define_memory_constraint "Wdm"
   "Vector duplicate memory operand"
   (and (match_code "mem")
        (match_code "reg" "0")))
-
-;; Vendor ISA extension constraints.
-
-(define_register_constraint "th_f_fmv" "TARGET_XTHEADFMV ? FP_REGS : NO_REGS"
-  "A floating-point register for XTheadFmv.")
-
-(define_register_constraint "th_r_fmv" "TARGET_XTHEADFMV ? GR_REGS : NO_REGS"
-  "An integer register for XTheadFmv.")
diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 1d56324df03..1a15999f9e4 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -294,3 +294,8 @@ (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
 (define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET "le")])
 (define_int_attr QUIET_PATTERN [(UNSPEC_FLT_QUIET "LT") (UNSPEC_FLE_QUIET "LE")])
 
+(define_int_iterator ROUND [UNSPEC_ROUND UNSPEC_FLOOR UNSPEC_CEIL UNSPEC_BTRUNC UNSPEC_ROUNDEVEN UNSPEC_NEARBYINT])
+(define_int_attr round_pattern [(UNSPEC_ROUND "round") (UNSPEC_FLOOR "floor") (UNSPEC_CEIL "ceil")
+				(UNSPEC_BTRUNC "btrunc") (UNSPEC_ROUNDEVEN "roundeven") (UNSPEC_NEARBYINT "nearbyint")])
+(define_int_attr round_rm [(UNSPEC_ROUND "rmm") (UNSPEC_FLOOR "rdn") (UNSPEC_CEIL "rup")
+			   (UNSPEC_BTRUNC "rtz") (UNSPEC_ROUNDEVEN "rne") (UNSPEC_NEARBYINT "dyn")])
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 1b2e6de5e1b..7fe02208c58 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -196,6 +196,9 @@ enum riscv_multilib_select_kind {
 #define TARGET_ZFHMIN ((riscv_zf_subext & MASK_ZFHMIN) != 0)
 #define TARGET_ZFH    ((riscv_zf_subext & MASK_ZFH) != 0)
 
+#define MASK_ZFA   (1 << 0)
+#define TARGET_ZFA    ((riscv_zf_subext & MASK_ZFA) != 0)
+
 #define MASK_ZMMUL      (1 << 0)
 #define TARGET_ZMMUL    ((riscv_zm_subext & MASK_ZMMUL) != 0)
 
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index bc71f9cbbba..b62ba9562b0 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -40,6 +40,7 @@ enum riscv_symbol_type {
 /* Routines implemented in riscv.cc.  */
 extern enum riscv_symbol_type riscv_classify_symbolic_expression (rtx);
 extern bool riscv_symbolic_constant_p (rtx, enum riscv_symbol_type *);
+extern int riscv_float_const_rtx_index_for_fli (rtx);
 extern int riscv_regno_mode_ok_for_base_p (int, machine_mode, bool);
 extern int riscv_address_insns (rtx, machine_mode, bool);
 extern int riscv_const_insns (rtx);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a770fdfaa0e..2d5e1bf4c40 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -813,6 +813,137 @@ static int riscv_symbol_insns (enum riscv_symbol_type type)
     }
 }
 
+/* Immediate values loaded by the FLI.S instruction in Chapter 25 of the latest RISC-V ISA
+   Manual draft. For details, please see:
+   https://github.com/riscv/riscv-isa-manual/releases/tag/isa-449cd0c  */
+
+static unsigned HOST_WIDE_INT fli_value_hf[32] =
+{
+  0xbcp8, 0x4p8, 0x1p8, 0x2p8, 0x1cp8, 0x20p8, 0x2cp8, 0x30p8,
+  0x34p8, 0x35p8, 0x36p8, 0x37p8, 0x38p8, 0x39p8, 0x3ap8, 0x3bp8,
+  0x3cp8, 0x3dp8, 0x3ep8, 0x3fp8, 0x40p8, 0x41p8, 0x42p8, 0x44p8,
+  0x48p8, 0x4cp8, 0x58p8, 0x5cp8, 0x78p8,
+  /* Only used for filling, ensuring that 29 and 30 of HF are the same.  */
+  0x78p8,
+  0x7cp8, 0x7ep8
+};
+
+static unsigned HOST_WIDE_INT fli_value_sf[32] =
+{
+  0xbf8p20, 0x008p20, 0x378p20, 0x380p20, 0x3b8p20, 0x3c0p20, 0x3d8p20, 0x3e0p20,
+  0x3e8p20, 0x3eap20, 0x3ecp20, 0x3eep20, 0x3f0p20, 0x3f2p20, 0x3f4p20, 0x3f6p20,
+  0x3f8p20, 0x3fap20, 0x3fcp20, 0x3fep20, 0x400p20, 0x402p20, 0x404p20, 0x408p20,
+  0x410p20, 0x418p20, 0x430p20, 0x438p20, 0x470p20, 0x478p20, 0x7f8p20, 0x7fcp20
+};
+
+static unsigned HOST_WIDE_INT fli_value_df[32] =
+{
+  0xbff0p48, 0x10p48, 0x3ef0p48, 0x3f00p48,
+  0x3f70p48, 0x3f80p48, 0x3fb0p48, 0x3fc0p48,
+  0x3fd0p48, 0x3fd4p48, 0x3fd8p48, 0x3fdcp48,
+  0x3fe0p48, 0x3fe4p48, 0x3fe8p48, 0x3fecp48,
+  0x3ff0p48, 0x3ff4p48, 0x3ff8p48, 0x3ffcp48,
+  0x4000p48, 0x4004p48, 0x4008p48, 0x4010p48,
+  0x4020p48, 0x4030p48, 0x4060p48, 0x4070p48,
+  0x40e0p48, 0x40f0p48, 0x7ff0p48, 0x7ff8p48
+};
+
+/* Display floating-point values at the assembly level, which is consistent
+   with the zfa extension of llvm:   */
+
+const char *fli_value_print[32] =
+{
+  "-1.0", "min", "1.52587890625e-05", "3.0517578125e-05", "0.00390625", "0.0078125", "0.0625", "0.125",
+  "0.25", "0.3125", "0.375", "0.4375", "0.5", "0.625", "0.75", "0.875",
+  "1.0", "1.25", "1.5", "1.75", "2.0", "2.5", "3.0", "4.0",
+  "8.0", "16.0", "128.0", "256.0", "32768.0", "65536.0", "inf", "nan"
+};
+
+/* Return index of the FLI instruction table if rtx X is an immediate constant that can
+   be moved using a single FLI instruction in zfa extension. Return -1 if not found.  */
+
+int
+riscv_float_const_rtx_index_for_fli (rtx x)
+{
+  unsigned HOST_WIDE_INT *fli_value_array;
+
+  machine_mode mode = GET_MODE (x);
+
+  if (!TARGET_ZFA
+      || !CONST_DOUBLE_P(x)
+      || mode == VOIDmode
+      || (mode == HFmode && !TARGET_ZFH)
+      || (mode == SFmode && !TARGET_HARD_FLOAT)
+      || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
+    return -1;
+
+  if (!SCALAR_FLOAT_MODE_P (mode)
+      || GET_MODE_BITSIZE (mode).to_constant () > HOST_BITS_PER_WIDE_INT
+      /* Only support up to DF mode.  */
+      || GET_MODE_BITSIZE (mode).to_constant () > GET_MODE_BITSIZE (DFmode))
+    return -1;
+
+  unsigned HOST_WIDE_INT ival = 0;
+
+  long res[2];
+  real_to_target (res,
+		  CONST_DOUBLE_REAL_VALUE (x),
+		  REAL_MODE_FORMAT (mode));
+
+  if (mode == DFmode)
+    {
+      int order = BYTES_BIG_ENDIAN ? 1 : 0;
+      ival = zext_hwi (res[order], 32);
+      ival |= (zext_hwi (res[1 - order], 32) << 32);
+
+      /* When the lower 32 bits are not all 0, it is impossible to be in the table.  */
+      if (ival & 0xffffffff)
+	return -1;
+    }
+  else
+      ival = zext_hwi (res[0], 32);
+
+  switch (mode)
+    {
+      case E_HFmode:
+	fli_value_array = fli_value_hf;
+	break;
+      case E_SFmode:
+	fli_value_array = fli_value_sf;
+	break;
+      case E_DFmode:
+	fli_value_array = fli_value_df;
+	break;
+      default:
+	return -1;
+    }
+
+  if (fli_value_array[0] == ival)
+    return 0;
+
+  if (fli_value_array[1] == ival)
+    return 1;
+
+  /* Perform a binary search to find target index.  */
+  unsigned l, r, m;
+
+  l = 2;
+  r = 31;
+
+  while (l <= r)
+    {
+      m = (l + r) / 2;
+      if (fli_value_array[m] == ival)
+	return m;
+      else if (fli_value_array[m] < ival)
+	l = m+1;
+      else
+	r = m-1;
+    }
+
+  return -1;
+}
+
 /* Implement TARGET_LEGITIMATE_CONSTANT_P.  */
 
 static bool
@@ -840,6 +971,9 @@ riscv_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
   if (GET_CODE (x) == HIGH)
     return true;
 
+  if (satisfies_constraint_zfli (x))
+   return true;
+
   split_const (x, &base, &offset);
   if (riscv_symbolic_constant_p (base, &type))
     {
@@ -1266,6 +1400,10 @@ riscv_const_insns (rtx x)
       }
 
     case CONST_DOUBLE:
+      /* See if we can use FLI directly.  */
+      if (satisfies_constraint_zfli (x))
+	return 1;
+
       /* We can use x0 to load floating-point zero.  */
       return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
     case CONST_VECTOR:
@@ -1824,6 +1962,12 @@ riscv_legitimize_const_move (machine_mode mode, rtx dest, rtx src)
       return;
     }
 
+  if (satisfies_constraint_zfli (src))
+    {
+      riscv_emit_set (dest, src);
+      return;
+    }
+
   /* Split moves of symbolic constants into high/low pairs.  */
   if (riscv_split_symbol (dest, src, MAX_MACHINE_MODE, &src))
     {
@@ -2847,6 +2991,10 @@ riscv_split_64bit_move_p (rtx dest, rtx src)
   if (TARGET_64BIT)
     return false;
 
+  /* There is no need to split if the FLI instruction in the `Zfa` extension can be used.  */
+  if (satisfies_constraint_zfli (src))
+    return false;
+
   /* Allow FPR <-> FPR and FPR <-> MEM moves, and permit the special case
      of zeroing an FPR with FCVT.D.W.  */
   if (TARGET_DOUBLE_FLOAT
@@ -2866,22 +3014,36 @@ riscv_split_64bit_move_p (rtx dest, rtx src)
 void
 riscv_split_doubleword_move (rtx dest, rtx src)
 {
-  /* XTheadFmv has instructions for accessing the upper bits of a double.  */
-  if (!TARGET_64BIT && TARGET_XTHEADFMV)
+  /* ZFA or XTheadFmv has instructions for accessing the upper bits of a double.  */
+  if (!TARGET_64BIT && (TARGET_ZFA || TARGET_XTHEADFMV))
     {
       if (FP_REG_RTX_P (dest))
 	{
 	  rtx low_src = riscv_subword (src, false);
 	  rtx high_src = riscv_subword (src, true);
-	  emit_insn (gen_th_fmv_hw_w_x (dest, high_src, low_src));
+
+	  if (TARGET_ZFA)
+	    emit_insn (gen_movdfsisi3_rv32 (dest, high_src, low_src));
+	  else
+	    emit_insn (gen_th_fmv_hw_w_x (dest, high_src, low_src));
 	  return;
 	}
       if (FP_REG_RTX_P (src))
 	{
 	  rtx low_dest = riscv_subword (dest, false);
 	  rtx high_dest = riscv_subword (dest, true);
-	  emit_insn (gen_th_fmv_x_w (low_dest, src));
-	  emit_insn (gen_th_fmv_x_hw (high_dest, src));
+
+	  if (TARGET_ZFA)
+	    {
+	      emit_insn (gen_movsidf2_low_rv32 (low_dest, src));
+	      emit_insn (gen_movsidf2_high_rv32 (high_dest, src));
+	      return;
+	    }
+	  else
+	    {
+	      emit_insn (gen_th_fmv_x_w (low_dest, src));
+	      emit_insn (gen_th_fmv_x_hw (high_dest, src));
+	    }
 	  return;
 	}
     }
@@ -3045,6 +3207,17 @@ riscv_output_move (rtx dest, rtx src)
 	  case 8:
 	    return "fld\t%0,%1";
 	  }
+
+      if (src_code == CONST_DOUBLE && satisfies_constraint_zfli (src))
+	switch (width)
+	  {
+	    case 2:
+	      return "fli.h\t%0,%1";
+	    case 4:
+	      return "fli.s\t%0,%1";
+	    case 8:
+	      return "fli.d\t%0,%1";
+	  }
     }
   if (dest_code == REG && GP_REG_P (REGNO (dest)) && src_code == CONST_POLY_INT)
     {
@@ -4671,6 +4844,24 @@ riscv_print_operand (FILE *file, rtx op, int letter)
 	    output_address (mode, XEXP (op, 0));
 	  break;
 
+	case CONST_DOUBLE:
+	  {
+	    if (letter == 'z' && op == CONST0_RTX (GET_MODE (op)))
+	      {
+		fputs (reg_names[GP_REG_FIRST], file);
+		break;
+	      }
+
+	    int fli_index = riscv_float_const_rtx_index_for_fli (op);
+	    if (fli_index == -1 || fli_index > 31)
+	      {
+		output_operand_lossage ("invalid use of '%%%c'", letter);
+		break;
+	      }
+	    asm_fprintf (file, "%s", fli_value_print[fli_index]);
+	    break;
+	  }
+
 	default:
 	  if (letter == 'z' && op == CONST0_RTX (GET_MODE (op)))
 	    fputs (reg_names[GP_REG_FIRST], file);
@@ -6033,7 +6224,8 @@ riscv_secondary_memory_needed (machine_mode mode, reg_class_t class1,
   return (!riscv_v_ext_mode_p (mode)
 	  && GET_MODE_SIZE (mode).to_constant () > UNITS_PER_WORD
 	  && (class1 == FP_REGS) != (class2 == FP_REGS)
-	  && !TARGET_XTHEADFMV);
+	  && !TARGET_XTHEADFMV
+	  && !TARGET_ZFA);
 }
 
 /* Implement TARGET_REGISTER_MOVE_COST.  */
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 7065e68c0b7..6f95a5c1b4a 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -55,10 +55,19 @@ (define_c_enum "unspec" [
   UNSPEC_FLT_QUIET
   UNSPEC_FLE_QUIET
   UNSPEC_COPYSIGN
+  UNSPEC_RINT
+  UNSPEC_ROUND
+  UNSPEC_FLOOR
+  UNSPEC_CEIL
+  UNSPEC_BTRUNC
+  UNSPEC_ROUNDEVEN
+  UNSPEC_NEARBYINT
   UNSPEC_LRINT
   UNSPEC_LROUND
   UNSPEC_FMIN
   UNSPEC_FMAX
+  UNSPEC_FMINM
+  UNSPEC_FMAXM
 
   ;; Stack tie
   UNSPEC_TIE
@@ -1290,6 +1299,26 @@ (define_insn "neg<mode>2"
 ;;
 ;;  ....................
 
+(define_insn "fminm<mode>3"
+  [(set (match_operand:ANYF                    0 "register_operand" "=f")
+	(unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
+		      (use (match_operand:ANYF 2 "register_operand" " f"))]
+		     UNSPEC_FMINM))]
+  "TARGET_HARD_FLOAT && TARGET_ZFA"
+  "fminm.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fmove")
+   (set_attr "mode" "<UNITMODE>")])
+
+(define_insn "fmaxm<mode>3"
+  [(set (match_operand:ANYF                    0 "register_operand" "=f")
+	(unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
+		      (use (match_operand:ANYF 2 "register_operand" " f"))]
+		     UNSPEC_FMAXM))]
+  "TARGET_HARD_FLOAT && TARGET_ZFA"
+  "fmaxm.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fmove")
+   (set_attr "mode" "<UNITMODE>")])
+
 (define_insn "fmin<mode>3"
   [(set (match_operand:ANYF                    0 "register_operand" "=f")
 	(unspec:ANYF [(use (match_operand:ANYF 1 "register_operand" " f"))
@@ -1566,13 +1595,13 @@ (define_expand "movhf"
 })
 
 (define_insn "*movhf_hardfloat"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
-	(match_operand:HF 1 "move_operand"         " f,G,m,f,G,*r,*f,*G*r,*m,*r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=f,   f,f,f,m,m,*f,*r,  *r,*r,*m")
+	(match_operand:HF 1 "move_operand"         " f,zfli,G,m,f,G,*r,*f,*G*r,*m,*r"))]
   "TARGET_ZFHMIN
    && (register_operand (operands[0], HFmode)
        || reg_or_0_operand (operands[1], HFmode))"
   { return riscv_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
+  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
    (set_attr "mode" "HF")])
 
 (define_insn "*movhf_softfloat"
@@ -1638,6 +1667,26 @@ (define_insn "l<rint_pattern><ANYF:mode><GPR:mode>2"
   [(set_attr "type" "fcvt")
    (set_attr "mode" "<ANYF:MODE>")])
 
+(define_insn "<round_pattern><ANYF:mode>2"
+  [(set (match_operand:ANYF     0 "register_operand" "=f")
+	(unspec:ANYF
+	    [(match_operand:ANYF 1 "register_operand" " f")]
+	ROUND))]
+  "TARGET_HARD_FLOAT && TARGET_ZFA"
+  "fround.<ANYF:fmt>\t%0,%1,<round_rm>"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "<ANYF:MODE>")])
+
+(define_insn "rint<ANYF:mode>2"
+  [(set (match_operand:ANYF     0 "register_operand" "=f")
+	(unspec:ANYF
+	    [(match_operand:ANYF 1 "register_operand" " f")]
+	UNSPEC_RINT))]
+  "TARGET_HARD_FLOAT && TARGET_ZFA"
+  "froundnx.<ANYF:fmt>\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "<ANYF:MODE>")])
+
 ;;
 ;;  ....................
 ;;
@@ -1897,13 +1946,13 @@ (define_expand "movsf"
 })
 
 (define_insn "*movsf_hardfloat"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
-	(match_operand:SF 1 "move_operand"         " f,G,m,f,G,*r,*f,*G*r,*m,*r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,   f,f,f,m,m,*f,*r,  *r,*r,*m")
+	(match_operand:SF 1 "move_operand"         " f,zfli,G,m,f,G,*r,*f,*G*r,*m,*r"))]
   "TARGET_HARD_FLOAT
    && (register_operand (operands[0], SFmode)
        || reg_or_0_operand (operands[1], SFmode))"
   { return riscv_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
+  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
    (set_attr "mode" "SF")])
 
 (define_insn "*movsf_softfloat"
@@ -1931,23 +1980,23 @@ (define_expand "movdf"
 ;; In RV32, we lack fmv.x.d and fmv.d.x.  Go through memory instead.
 ;; (However, we can still use fcvt.d.w to zero a floating-point register.)
 (define_insn "*movdf_hardfloat_rv32"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,m,*th_f_fmv,*th_r_fmv,  *r,*r,*m")
-	(match_operand:DF 1 "move_operand"         " f,G,m,f,G,*th_r_fmv,*th_f_fmv,*r*G,*m,*r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,   f,f,f,m,m,*zmvf,*zmvr,  *r,*r,*m")
+	(match_operand:DF 1 "move_operand"         " f,zfli,G,m,f,G,*zmvr,*zmvf,*r*G,*m,*r"))]
   "!TARGET_64BIT && TARGET_DOUBLE_FLOAT
    && (register_operand (operands[0], DFmode)
        || reg_or_0_operand (operands[1], DFmode))"
   { return riscv_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
+  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
    (set_attr "mode" "DF")])
 
 (define_insn "*movdf_hardfloat_rv64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,m,*f,*r,  *r,*r,*m")
-	(match_operand:DF 1 "move_operand"         " f,G,m,f,G,*r,*f,*r*G,*m,*r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,   f,f,f,m,m,*f,*r,  *r,*r,*m")
+	(match_operand:DF 1 "move_operand"         " f,zfli,G,m,f,G,*r,*f,*r*G,*m,*r"))]
   "TARGET_64BIT && TARGET_DOUBLE_FLOAT
    && (register_operand (operands[0], DFmode)
        || reg_or_0_operand (operands[1], DFmode))"
   { return riscv_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
+  [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
    (set_attr "mode" "DF")])
 
 (define_insn "*movdf_softfloat"
@@ -1960,6 +2009,39 @@ (define_insn "*movdf_softfloat"
   [(set_attr "move_type" "move,load,store")
    (set_attr "mode" "DF")])
 
+(define_insn "movsidf2_low_rv32"
+  [(set (match_operand:SI      0 "register_operand" "=  r")
+	(truncate:SI
+	    (match_operand:DF 1 "register_operand"  "zmvf")))]
+  "TARGET_HARD_FLOAT && !TARGET_64BIT && TARGET_ZFA"
+  "fmv.x.w\t%0,%1"
+  [(set_attr "move_type" "fmove")
+   (set_attr "mode" "DF")])
+
+
+(define_insn "movsidf2_high_rv32"
+  [(set (match_operand:SI      0 "register_operand"    "=  r")
+	(truncate:SI
+            (lshiftrt:DF
+                (match_operand:DF 1 "register_operand" "zmvf")
+                (const_int 32))))]
+  "TARGET_HARD_FLOAT && !TARGET_64BIT && TARGET_ZFA"
+  "fmvh.x.d\t%0,%1"
+  [(set_attr "move_type" "fmove")
+   (set_attr "mode" "DF")])
+
+(define_insn "movdfsisi3_rv32"
+  [(set (match_operand:DF      0 "register_operand"    "=  f")
+	(plus:DF
+            (match_operand:SI 2 "register_operand"     "zmvr")
+            (ashift:SI
+                (match_operand:SI 1 "register_operand" "zmvr")
+                (const_int 32))))]
+  "TARGET_HARD_FLOAT && !TARGET_64BIT && TARGET_ZFA"
+  "fmvp.d.x\t%0,%2,%1"
+  [(set_attr "move_type" "fmove")
+   (set_attr "mode" "DF")])
+
 (define_split
   [(set (match_operand:MOVE64 0 "nonimmediate_operand")
 	(match_operand:MOVE64 1 "move_operand"))]
@@ -2552,16 +2634,23 @@ (define_expand "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4"
   rtx op0 = operands[0];
   rtx op1 = operands[1];
   rtx op2 = operands[2];
-  rtx tmp = gen_reg_rtx (SImode);
-  rtx cmp = gen_rtx_<QUIET_PATTERN> (<X:MODE>mode, op1, op2);
-  rtx frflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, const0_rtx),
-					 UNSPECV_FRFLAGS);
-  rtx fsflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, tmp),
-					 UNSPECV_FSFLAGS);
-
-  emit_insn (gen_rtx_SET (tmp, frflags));
-  emit_insn (gen_rtx_SET (op0, cmp));
-  emit_insn (fsflags);
+
+  if (TARGET_ZFA)
+    emit_insn (gen_f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa (op0, op1, op2));
+  else
+    {
+      rtx tmp = gen_reg_rtx (SImode);
+      rtx cmp = gen_rtx_<QUIET_PATTERN> (<X:MODE>mode, op1, op2);
+      rtx frflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, const0_rtx),
+					     UNSPECV_FRFLAGS);
+      rtx fsflags = gen_rtx_UNSPEC_VOLATILE (SImode, gen_rtvec (1, tmp),
+					     UNSPECV_FSFLAGS);
+
+      emit_insn (gen_rtx_SET (tmp, frflags));
+      emit_insn (gen_rtx_SET (op0, cmp));
+      emit_insn (fsflags);
+    }
+
   if (HONOR_SNANS (<ANYF:MODE>mode))
     emit_insn (gen_rtx_UNSPEC_VOLATILE (<ANYF:MODE>mode,
 					gen_rtvec (2, op1, op2),
@@ -2569,6 +2658,18 @@ (define_expand "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4"
   DONE;
 })
 
+(define_insn "f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa"
+   [(set (match_operand:X      0 "register_operand" "=r")
+	 (unspec:X
+	  [(match_operand:ANYF 1 "register_operand" " f")
+	   (match_operand:ANYF 2 "register_operand" " f")]
+	  QUIET_COMPARISON))]
+  "TARGET_HARD_FLOAT && TARGET_ZFA"
+  "f<quiet_pattern>q.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fcmp")
+   (set_attr "mode" "<UNITMODE>")
+   (set (attr "length") (const_int 16))])
+
 (define_insn "*seq_zero_<X:mode><GPR:mode>"
   [(set (match_operand:GPR       0 "register_operand" "=r")
 	(eq:GPR (match_operand:X 1 "register_operand" " r")
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
new file mode 100644
index 00000000000..26895b76fa4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O2" } */
+
+extern void abort(void);
+extern float a, b;
+extern double c, d;
+
+void 
+foo()
+{
+  if ((__builtin_isless(a, b) ||  __builtin_islessequal(c, d))
+      && (__builtin_islessequal(a, b)|| __builtin_isless(c, d)))
+    abort();
+}
+
+/* { dg-final { scan-assembler-times "fleq.s" 1 } } */
+/* { dg-final { scan-assembler-times "fltq.s" 1 } } */
+/* { dg-final { scan-assembler-times "fleq.d" 1 } } */
+/* { dg-final { scan-assembler-times "fltq.d" 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
new file mode 100644
index 00000000000..4ccd6a7dd78
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O2" } */
+
+extern void abort(void);
+extern float a, b;
+extern double c, d;
+
+void 
+foo()
+{
+  if ((__builtin_isless(a, b) ||  __builtin_islessequal(c, d))
+      && (__builtin_islessequal(a, b)|| __builtin_isless(c, d)))
+    abort();
+}
+
+/* { dg-final { scan-assembler-times "fleq.s" 1 } } */
+/* { dg-final { scan-assembler-times "fltq.s" 1 } } */
+/* { dg-final { scan-assembler-times "fleq.d" 1 } } */
+/* { dg-final { scan-assembler-times "fltq.d" 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
new file mode 100644
index 00000000000..c4da04797aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
@@ -0,0 +1,79 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O0" } */
+
+void foo_float32 ()
+{
+  volatile float a;
+  a = -1.0;
+  a = 1.1754944e-38;
+  a = 1.0/(1 << 16);
+  a = 1.0/(1 << 15);
+  a = 1.0/(1 << 8);
+  a = 1.0/(1 << 7);
+  a = 1.0/(1 << 4);
+  a = 1.0/(1 << 3);
+  a = 1.0/(1 << 2);
+  a = 0.3125;
+  a = 0.375;
+  a = 0.4375;
+  a = 0.5;
+  a = 0.625;
+  a = 0.75;
+  a = 0.875;
+  a = 1.0;
+  a = 1.25;
+  a = 1.5;
+  a = 1.75;
+  a = 2.0;
+  a = 2.5;
+  a = 3.0;
+  a = 1.0*(1 << 2);
+  a = 1.0*(1 << 3);
+  a = 1.0*(1 << 4);
+  a = 1.0*(1 << 7);
+  a = 1.0*(1 << 8);
+  a = 1.0*(1 << 15);
+  a = 1.0*(1 << 16);
+  a = __builtin_inff ();
+  a = __builtin_nanf ("");
+}
+
+void foo_double64 ()
+{
+  volatile double a;
+  a = -1.0;
+  a = 2.2250738585072014E-308;
+  a = 1.0/(1 << 16);
+  a = 1.0/(1 << 15);
+  a = 1.0/(1 << 8);
+  a = 1.0/(1 << 7);
+  a = 1.0/(1 << 4);
+  a = 1.0/(1 << 3);
+  a = 1.0/(1 << 2);
+  a = 0.3125;
+  a = 0.375;
+  a = 0.4375;
+  a = 0.5;
+  a = 0.625;
+  a = 0.75;
+  a = 0.875;
+  a = 1.0;
+  a = 1.25;
+  a = 1.5;
+  a = 1.75;
+  a = 2.0;
+  a = 2.5;
+  a = 3.0;
+  a = 1.0*(1 << 2);
+  a = 1.0*(1 << 3);
+  a = 1.0*(1 << 4);
+  a = 1.0*(1 << 7);
+  a = 1.0*(1 << 8);
+  a = 1.0*(1 << 15);
+  a = 1.0*(1 << 16);
+  a = __builtin_inf ();
+  a = __builtin_nan ("");
+}
+
+/* { dg-final { scan-assembler-times "fli.s" 32 } } */
+/* { dg-final { scan-assembler-times "fli.d" 32 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
new file mode 100644
index 00000000000..bcffe9d2c82
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32imafdc_zfa_zfh -mabi=ilp32d -O0" } */
+
+void foo_float16 ()
+{
+  volatile _Float16 a;
+  a = -1.0;
+  a = 6.104E-5;
+  a = 1.0/(1 << 16);
+  a = 1.0/(1 << 15);
+  a = 1.0/(1 << 8);
+  a = 1.0/(1 << 7);
+  a = 1.0/(1 << 4);
+  a = 1.0/(1 << 3);
+  a = 1.0/(1 << 2);
+  a = 0.3125;
+  a = 0.375;
+  a = 0.4375;
+  a = 0.5;
+  a = 0.625;
+  a = 0.75;
+  a = 0.875;
+  a = 1.0;
+  a = 1.25;
+  a = 1.5;
+  a = 1.75;
+  a = 2.0;
+  a = 2.5;
+  a = 3.0;
+  a = 1.0*(1 << 2);
+  a = 1.0*(1 << 3);
+  a = 1.0*(1 << 4);
+  a = 1.0*(1 << 7);
+  a = 1.0*(1 << 8);
+  a = 1.0*(1 << 15);
+  a = 1.0*(1 << 16);
+  a = __builtin_inff16 ();
+  a = __builtin_nanf16 ("");
+}
+
+/* { dg-final { scan-assembler-times "fli.h" 32 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
new file mode 100644
index 00000000000..a493ca95f0c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64imafdc_zfa_zfh -mabi=lp64d -O0" } */
+
+void foo_float16 ()
+{
+  volatile _Float16 a;
+  a = -1.0;
+  a = 6.104e-5;
+  a = 1.0/(1 << 16);
+  a = 1.0/(1 << 15);
+  a = 1.0/(1 << 8);
+  a = 1.0/(1 << 7);
+  a = 1.0/(1 << 4);
+  a = 1.0/(1 << 3);
+  a = 1.0/(1 << 2);
+  a = 0.3125;
+  a = 0.375;
+  a = 0.4375;
+  a = 0.5;
+  a = 0.625;
+  a = 0.75;
+  a = 0.875;
+  a = 1.0;
+  a = 1.25;
+  a = 1.5;
+  a = 1.75;
+  a = 2.0;
+  a = 2.5;
+  a = 3.0;
+  a = 1.0*(1 << 2);
+  a = 1.0*(1 << 3);
+  a = 1.0*(1 << 4);
+  a = 1.0*(1 << 7);
+  a = 1.0*(1 << 8);
+  a = 1.0*(1 << 15);
+  a = 1.0*(1 << 16);
+  a = __builtin_inff16 ();
+  a = __builtin_nanf16 ("");
+}
+
+/* { dg-final { scan-assembler-times "fli.h" 32 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fli.c b/gcc/testsuite/gcc.target/riscv/zfa-fli.c
new file mode 100644
index 00000000000..babb10f21e1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fli.c
@@ -0,0 +1,79 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O0" } */
+
+void foo_float32 ()
+{
+  volatile float a;
+  a = -1.0;
+  a = 1.1754944e-38;
+  a = 1.0/(1 << 16);
+  a = 1.0/(1 << 15);
+  a = 1.0/(1 << 8);
+  a = 1.0/(1 << 7);
+  a = 1.0/(1 << 4);
+  a = 1.0/(1 << 3);
+  a = 1.0/(1 << 2);
+  a = 0.3125;
+  a = 0.375;
+  a = 0.4375;
+  a = 0.5;
+  a = 0.625;
+  a = 0.75;
+  a = 0.875;
+  a = 1.0;
+  a = 1.25;
+  a = 1.5;
+  a = 1.75;
+  a = 2.0;
+  a = 2.5;
+  a = 3.0;
+  a = 1.0*(1 << 2);
+  a = 1.0*(1 << 3);
+  a = 1.0*(1 << 4);
+  a = 1.0*(1 << 7);
+  a = 1.0*(1 << 8);
+  a = 1.0*(1 << 15);
+  a = 1.0*(1 << 16);
+  a = __builtin_inff ();
+  a = __builtin_nanf ("");
+}
+
+void foo_double64 ()
+{
+  volatile double a;
+  a = -1.0;
+  a = 2.2250738585072014e-308;
+  a = 1.0/(1 << 16);
+  a = 1.0/(1 << 15);
+  a = 1.0/(1 << 8);
+  a = 1.0/(1 << 7);
+  a = 1.0/(1 << 4);
+  a = 1.0/(1 << 3);
+  a = 1.0/(1 << 2);
+  a = 0.3125;
+  a = 0.375;
+  a = 0.4375;
+  a = 0.5;
+  a = 0.625;
+  a = 0.75;
+  a = 0.875;
+  a = 1.0;
+  a = 1.25;
+  a = 1.5;
+  a = 1.75;
+  a = 2.0;
+  a = 2.5;
+  a = 3.0;
+  a = 1.0*(1 << 2);
+  a = 1.0*(1 << 3);
+  a = 1.0*(1 << 4);
+  a = 1.0*(1 << 7);
+  a = 1.0*(1 << 8);
+  a = 1.0*(1 << 15);
+  a = 1.0*(1 << 16);
+  a = __builtin_inf ();
+  a = __builtin_nan ("");
+}
+
+/* { dg-final { scan-assembler-times "fli.s" 32 } } */
+/* { dg-final { scan-assembler-times "fli.d" 32 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
new file mode 100644
index 00000000000..5a52adce36a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32g_zfa -mabi=ilp32 -O0" } */
+
+double foo(long long a)
+{
+  return (double)(a + 3);
+}
+
+/* { dg-final { scan-assembler-times "fmvp.d.x" 1 } } */
+/* { dg-final { scan-assembler-times "fmvh.x.d" 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c b/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
new file mode 100644
index 00000000000..b53601d6e1f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32imafdc_zfa -mabi=ilp32d -O2" } */
+
+extern float a;
+extern double b;
+
+void foo (float *x, double *y)
+{
+  {
+    *x = __builtin_roundf (a);
+    *y = __builtin_round (b);
+  }
+  {
+    *x = __builtin_floorf (a);
+    *y = __builtin_floor (b);
+  }
+  {
+    *x = __builtin_ceilf (a);
+    *y = __builtin_ceil (b);
+  }
+  {
+    *x = __builtin_truncf (a);
+    *y = __builtin_trunc (b);
+  }
+  {
+    *x = __builtin_roundevenf (a);
+    *y = __builtin_roundeven (b);
+  }
+  {
+    *x = __builtin_nearbyintf (a);
+    *y = __builtin_nearbyint (b);
+  }
+  {
+    *x = __builtin_rintf (a);
+    *y = __builtin_rint (b);
+  }
+}
+
+/* { dg-final { scan-assembler-times "fround.s" 6 } } */
+/* { dg-final { scan-assembler-times "fround.d" 6 } } */
+/* { dg-final { scan-assembler-times "froundnx.s" 1 } } */
+/* { dg-final { scan-assembler-times "froundnx.d" 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fround.c b/gcc/testsuite/gcc.target/riscv/zfa-fround.c
new file mode 100644
index 00000000000..c10de82578e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfa-fround.c
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64imafdc_zfa -mabi=lp64d -O2" } */
+
+extern float a;
+extern double b;
+
+void foo (float *x, double *y)
+{
+  {
+    *x = __builtin_roundf (a);
+    *y = __builtin_round (b);
+  }
+  {
+    *x = __builtin_floorf (a);
+    *y = __builtin_floor (b);
+  }
+  {
+    *x = __builtin_ceilf (a);
+    *y = __builtin_ceil (b);
+  }
+  {
+    *x = __builtin_truncf (a);
+    *y = __builtin_trunc (b);
+  }
+  {
+    *x = __builtin_roundevenf (a);
+    *y = __builtin_roundeven (b);
+  }
+  {
+    *x = __builtin_nearbyintf (a);
+    *y = __builtin_nearbyint (b);
+  }
+  {
+    *x = __builtin_rintf (a);
+    *y = __builtin_rint (b);
+  }
+}
+
+/* { dg-final { scan-assembler-times "fround.s" 6 } } */
+/* { dg-final { scan-assembler-times "fround.d" 6 } } */
+/* { dg-final { scan-assembler-times "froundnx.s" 1 } } */
+/* { dg-final { scan-assembler-times "froundnx.d" 1 } } */
-- 
2.17.1
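
A note on the RV32 doubleword moves in the patch above: with Zfa,
riscv_split_doubleword_move transfers a double between one FPR and a pair of
32-bit GPRs without going through memory. As a rough illustration only, the
data movement can be modelled in portable C as below. The helper names
split_double and join_double are hypothetical and not part of the patch, and
the sketch assumes that double is an IEEE 754 binary64 value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of the FPR -> GPR pair direction: the low word is what fmv.x.w
   would read, the high word is what fmvh.x.d would read.  */
void split_double (double d, uint32_t *lo, uint32_t *hi)
{
  uint64_t bits;
  assert (sizeof (double) == sizeof (uint64_t));
  memcpy (&bits, &d, sizeof bits);
  *lo = (uint32_t) bits;
  *hi = (uint32_t) (bits >> 32);
}

/* Model of the GPR pair -> FPR direction: fmvp.d.x combines the two
   32-bit halves into one 64-bit floating-point register value.  */
double join_double (uint32_t lo, uint32_t hi)
{
  uint64_t bits = ((uint64_t) hi << 32) | lo;
  double d;
  assert (sizeof (double) == sizeof (uint64_t));
  memcpy (&d, &bits, sizeof d);
  return d;
}
```

For example, splitting 1.0 gives lo = 0 and hi = 0x3FF00000, and joining those
two words gives 1.0 back. The instruction pairing in the comments follows the
movsidf2_low_rv32, movsidf2_high_rv32 and movdfsisi3_rv32 patterns in the diff.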


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v9] RISC-V: Add the 'zfa' extension, version 0.2
  2023-05-15 13:16 ` [PATCH v9] " Jin Ma
@ 2023-05-15 13:30   ` jinma
  2023-05-16  4:00     ` Jeff Law
  2023-05-16  4:16   ` Jeff Law
  1 sibling, 1 reply; 20+ messages in thread
From: jinma @ 2023-05-15 13:30 UTC (permalink / raw)
  To: gcc-patches
  Cc: jeffreyalaw, christoph.muellner, kito.cheng, kito.cheng, palmer, ijinma

According to Jeff's review feedback, the issues regarding UNSPEC's implementation of round, ceil, nearbyint, etc. still need to be determined:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617706.html

source: 
https://github.com/majin2020/gcc-mirror/commit/93d7a2d995cee588d494d1839f56e8151c6cb057

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2.
  2023-05-06 12:53   ` jinma
@ 2023-05-16  3:59     ` Jeff Law
  0 siblings, 0 replies; 20+ messages in thread
From: Jeff Law @ 2023-05-16  3:59 UTC (permalink / raw)
  To: jinma, gcc-patches; +Cc: kito.cheng, kito.cheng, palmer



On 5/6/23 06:53, jinma wrote:
>>>> diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
>>>> index 9b767038452..c81b08e3cc5 100644
>>>> --- a/gcc/config/riscv/iterators.md
>>>> +++ b/gcc/config/riscv/iterators.md
>>>> @@ -288,3 +288,8 @@ (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
>>>>    (define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET "le")])
>>>>    (define_int_attr QUIET_PATTERN [(UNSPEC_FLT_QUIET "LT") (UNSPEC_FLE_QUIET "LE")])
>>>>    
>>>> +(define_int_iterator ROUND [UNSPEC_ROUND UNSPEC_FLOOR UNSPEC_CEIL UNSPEC_BTRUNC UNSPEC_ROUNDEVEN UNSPEC_NEARBYINT])
>>>> +(define_int_attr round_pattern [(UNSPEC_ROUND "round") (UNSPEC_FLOOR "floor") (UNSPEC_CEIL "ceil")
>>>> +				(UNSPEC_BTRUNC "btrunc") (UNSPEC_ROUNDEVEN "roundeven") (UNSPEC_NEARBYINT "nearbyint")])
>>>> +(define_int_attr round_rm [(UNSPEC_ROUND "rmm") (UNSPEC_FLOOR "rdn") (UNSPEC_CEIL "rup")
>>>> +			   (UNSPEC_BTRUNC "rtz") (UNSPEC_ROUNDEVEN "rne") (UNSPEC_NEARBYINT "dyn")])
>>> Do we really need to use unspecs for all these cases?  I would expect
>>> some correspond to the trunc, round, ceil, nearbyint, etc well known RTX
>>> codes.
>>>
>>> In general, we should try to avoid unspecs when there is a clear
>>> semantic match between the instruction and GCC's RTX opcodes.  So please
>>> review the existing RTX code semantics to see if any match the new
>>> instructions.  If there are matches, use those RTX codes rather than
>>> UNSPECs.
>>
>> I'll try, thanks.
> 
> 
> I encountered some confusion about this. I checked gcc's documents and
> found no RTX codes that can correspond to round, ceil, nearbyint, etc.
> Only "(fix:m x)" seems to correspond to trunc, which can be expressed
> as rounding towards zero, while others have not yet been found.
You're largely correct.  My bad.  There are named patterns for round to
integer, nearbyint, etc, but no RTX codes.  So they need to be handled
as unspecs.  Sorry for the confusion.

Jeff
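
For readers following the iterators.md hunk quoted above, the relationship
between the named patterns and the rounding-mode operand of fround can be
written out as a small lookup table. This is only a sketch in plain C: the
struct and function names are made up for illustration, and the C library
function column is my reading of which function each named pattern serves,
not something stated in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row per entry of the ROUND int iterator.  */
struct fround_variant
{
  const char *pattern;     /* named pattern, from round_pattern */
  const char *rm;          /* rounding-mode operand, from round_rm */
  const char *c_function;  /* assumed matching C function (double flavour) */
};

static const struct fround_variant fround_variants[] = {
  { "round",     "rmm", "round"     },
  { "floor",     "rdn", "floor"     },
  { "ceil",      "rup", "ceil"      },
  { "btrunc",    "rtz", "trunc"     },
  { "roundeven", "rne", "roundeven" },
  { "nearbyint", "dyn", "nearbyint" },
};

/* Return the rounding-mode string for PATTERN, or NULL if PATTERN is not
   one of the six fround variants.  */
const char *fround_rm_for_pattern (const char *pattern)
{
  assert (pattern != NULL);
  for (size_t i = 0; i < sizeof fround_variants / sizeof fround_variants[0]; i++)
    if (strcmp (fround_variants[i].pattern, pattern) == 0)
      return fround_variants[i].rm;
  return NULL;
}
```

Note that rint is deliberately absent from the table: the patch gives it its
own pattern, which emits froundnx rather than fround.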

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v9] RISC-V: Add the 'zfa' extension, version 0.2
  2023-05-15 13:30   ` jinma
@ 2023-05-16  4:00     ` Jeff Law
  0 siblings, 0 replies; 20+ messages in thread
From: Jeff Law @ 2023-05-16  4:00 UTC (permalink / raw)
  To: jinma, gcc-patches; +Cc: christoph.muellner, kito.cheng, kito.cheng, palmer



On 5/15/23 07:30, jinma wrote:
> According to Jeff's review feedback, the issues regarding UNSPEC's implementation of round, ceil, nearbyint, etc. still need to be determined:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617706.html
> 
> source:
> https://github.com/majin2020/gcc-mirror/commit/93d7a2d995cee588d494d1839f56e8151c6cb057
After double-checking I was incorrect.  We have named patterns for those
operations, but the RTL for them uses UNSPECs.  So this is a non-issue
for this patch.

jeff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v9] RISC-V: Add the 'zfa' extension, version 0.2
  2023-05-15 13:16 ` [PATCH v9] " Jin Ma
  2023-05-15 13:30   ` jinma
@ 2023-05-16  4:16   ` Jeff Law
  2023-05-16  7:06     ` jinma
  1 sibling, 1 reply; 20+ messages in thread
From: Jeff Law @ 2023-05-16  4:16 UTC (permalink / raw)
  To: Jin Ma, gcc-patches
  Cc: christoph.muellner, kito.cheng, kito.cheng, palmer, ijinma



On 5/15/23 07:16, Jin Ma wrote:
> This patch adds the 'Zfa' extension for riscv, which is based on:
> https://github.com/riscv/riscv-isa-manual/commits/zfb
> 
> The binutils-gdb for 'Zfa' extension:
> https://sourceware.org/pipermail/binutils/2023-April/127060.html
> 
> What needs special explanation is:
> 1, The immediate of the instructions FLI.H/S/D is written in the assembly as a
>    floating-point value, in scientific notation when rs1 is 2 or 3, and in decimal
>    notation for the rest.
> 
>    Related llvm link:
>      https://reviews.llvm.org/D145645
>    Related discussion link:
>      https://github.com/riscv/riscv-isa-manual/issues/980
> 
> 2, According to riscv-spec, "The FCVTMOD.W.D instruction was added principally to
>    accelerate the processing of JavaScript Numbers.", so it seems that no implementation
>    is required.
> 
> 3, The instructions FMINM and FMAXM correspond to C23 library function fminimum and fmaximum.
>    Therefore, this patch has simply implemented the pattern of fminm<hf\sf\df>3 and
>    fmaxm<hf\sf\df>3 to prepare for later.
> 
> gcc/ChangeLog:
> 
> 	* common/config/riscv/riscv-common.cc: Add zfa extension version.
> 	* config/riscv/constraints.md (zfli): Constrain the floating point number that the
> 	instructions FLI.H/S/D can load.
> 	* config/riscv/iterators.md (ceil): New.
> 	(rup): New.
> 	* config/riscv/riscv-opts.h (MASK_ZFA): New.
> 	(TARGET_ZFA): New.
> 	* config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): New.
> 	* config/riscv/riscv.cc (riscv_float_const_rtx_index_for_fli): New.
> 	(riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be used, memory is not applicable.
> 	(riscv_const_insns): Likewise.
> 	(riscv_legitimize_const_move): Likewise.
> 	(riscv_split_64bit_move_p): If instruction FLI.H/S/D can be used, no split is required.
> 	(riscv_split_doubleword_move): Likewise.
> 	(riscv_output_move): Output the mov instructions in zfa extension.
> 	(riscv_print_operand): Output the floating-point value of the FLI.H/S/D immediate in assembly
> 	(riscv_secondary_memory_needed): Likewise.
> 	* config/riscv/riscv.md (fminm<mode>3): New.
> 	(fmaxm<mode>3): New.
> 	(movsidf2_low_rv32): New.
> 	(movsidf2_high_rv32): New.
> 	(movdfsisi3_rv32): New.
> 	(f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/zfa-fleq-fltq-rv32.c: New test.
> 	* gcc.target/riscv/zfa-fleq-fltq.c: New test.
> 	* gcc.target/riscv/zfa-fli-rv32.c: New test.
> 	* gcc.target/riscv/zfa-fli-zfh-rv32.c: New test.
> 	* gcc.target/riscv/zfa-fli-zfh.c: New test.
> 	* gcc.target/riscv/zfa-fli.c: New test.
> 	* gcc.target/riscv/zfa-fmovh-fmovp-rv32.c: New test.
> 	* gcc.target/riscv/zfa-fround-rv32.c: New test.
> 	* gcc.target/riscv/zfa-fround.c: New test.
> ---
>   gcc/common/config/riscv/riscv-common.cc       |   4 +
>   gcc/config/riscv/constraints.md               |  21 +-
>   gcc/config/riscv/iterators.md                 |   5 +
>   gcc/config/riscv/riscv-opts.h                 |   3 +
>   gcc/config/riscv/riscv-protos.h               |   1 +
>   gcc/config/riscv/riscv.cc                     | 204 +++++++++++++++++-
>   gcc/config/riscv/riscv.md                     | 145 +++++++++++--
>   .../gcc.target/riscv/zfa-fleq-fltq-rv32.c     |  19 ++
>   .../gcc.target/riscv/zfa-fleq-fltq.c          |  19 ++
>   gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c |  79 +++++++
>   .../gcc.target/riscv/zfa-fli-zfh-rv32.c       |  41 ++++
>   gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c  |  41 ++++
>   gcc/testsuite/gcc.target/riscv/zfa-fli.c      |  79 +++++++
>   .../gcc.target/riscv/zfa-fmovh-fmovp-rv32.c   |  10 +
>   .../gcc.target/riscv/zfa-fround-rv32.c        |  42 ++++
>   gcc/testsuite/gcc.target/riscv/zfa-fround.c   |  42 ++++
>   16 files changed, 719 insertions(+), 36 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround.c
> 


> +
> +/* Return index of the FLI instruction table if rtx X is an immediate constant that can
> +   be moved using a single FLI instruction in zfa extension. Return -1 if not found.  */
> +
> +int
> +riscv_float_const_rtx_index_for_fli (rtx x)
> +{
> +  unsigned HOST_WIDE_INT *fli_value_array;
> +
> +  machine_mode mode = GET_MODE (x);
> +
> +  if (!TARGET_ZFA
> +      || !CONST_DOUBLE_P(x)
> +      || mode == VOIDmode
> +      || (mode == HFmode && !TARGET_ZFH)
> +      || (mode == SFmode && !TARGET_HARD_FLOAT)
> +      || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
> +    return -1;
Do we also need to check Z[FDH]INX too?

Otherwise it looks pretty good.  We just need to wait for everything to 
freeze and finalization on the assembler interface.

jeff
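
As background for the riscv_float_const_rtx_index_for_fli hunk quoted above,
the idea of the lookup can be sketched in plain C. This is not the patch's
implementation: the function name fli_index_for_double is hypothetical, the
table holds the 32 double-precision values in the order the zfa-fli.c test
lists them (which I assume matches the rs1 encoding), and values are compared
with == rather than by bit pattern, so every NaN is treated like the last
entry.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* The 32 constants that FLI.D can load, in test-file order.  */
static const double fli_values[32] = {
  -1.0, DBL_MIN, 1.0 / 65536.0, 1.0 / 32768.0,
  1.0 / 256.0, 1.0 / 128.0, 0.0625, 0.125,
  0.25, 0.3125, 0.375, 0.4375,
  0.5, 0.625, 0.75, 0.875,
  1.0, 1.25, 1.5, 1.75,
  2.0, 2.5, 3.0, 4.0,
  8.0, 16.0, 128.0, 256.0,
  32768.0, 65536.0, INFINITY, NAN
};

/* Return the table index for X, or -1 if X cannot be loaded by FLI.D.
   Simplification: any NaN maps to index 31.  */
int fli_index_for_double (double x)
{
  assert (sizeof fli_values / sizeof fli_values[0] == 32);
  if (x != x)
    return 31;
  for (int i = 0; i < 31; i++)
    if (fli_values[i] == x)
      return i;
  return -1;
}
```

With this table, 1.0 maps to index 16 and 0.1 maps to -1, so only the former
would be a candidate for a single fli.d instead of a constant-pool load.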

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v9] RISC-V: Add the 'zfa' extension, version 0.2
  2023-05-16  4:16   ` Jeff Law
@ 2023-05-16  7:06     ` jinma
  2023-05-16  7:53       ` Kito Cheng
  2023-08-09 18:11       ` Vineet Gupta
  0 siblings, 2 replies; 20+ messages in thread
From: jinma @ 2023-05-16  7:06 UTC (permalink / raw)
  To: gcc-patches, Jeff Law
  Cc: christoph.muellner, kito.cheng, kito.cheng, palmer, ijinma

On 5/15/23 07:16, Jin Ma wrote:
> > This patch adds the 'Zfa' extension for riscv, which is based on:
> > https://github.com/riscv/riscv-isa-manual/commits/zfb
> > 
> > The binutils-gdb for 'Zfa' extension:
> > https://sourceware.org/pipermail/binutils/2023-April/127060.html
> > 
> > What needs special explanation is:
> > 1, The immediate of the instructions FLI.H/S/D is written in the assembly as a
> >    floating-point value, in scientific notation when rs1 is 2 or 3, and in decimal
> >    notation for the rest.
> > 
> >    Related llvm link:
> >      https://reviews.llvm.org/D145645
> >    Related discussion link:
> >      https://github.com/riscv/riscv-isa-manual/issues/980
> > 
> > 2, According to riscv-spec, "The FCVTMOD.W.D instruction was added principally to
> >    accelerate the processing of JavaScript Numbers.", so it seems that no implementation
> >    is required.
> > 
> > 3, The instructions FMINM and FMAXM correspond to C23 library function fminimum and fmaximum.
> >    Therefore, this patch has simply implemented the pattern of fminm<hf\sf\df>3 and
> >    fmaxm<hf\sf\df>3 to prepare for later.
> > 
> > gcc/ChangeLog:
> > 
> >  * common/config/riscv/riscv-common.cc: Add zfa extension version.
> >  * config/riscv/constraints.md (zfli): Constrain the floating point number that the
> >  instructions FLI.H/S/D can load.
> >  * config/riscv/iterators.md (ceil): New.
> >  (rup): New.
> >  * config/riscv/riscv-opts.h (MASK_ZFA): New.
> >  (TARGET_ZFA): New.
> >  * config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): New.
> >  * config/riscv/riscv.cc (riscv_float_const_rtx_index_for_fli): New.
> >  (riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be used, memory is not applicable.
> >  (riscv_const_insns): Likewise.
> >  (riscv_legitimize_const_move): Likewise.
> >  (riscv_split_64bit_move_p): If instruction FLI.H/S/D can be used, no split is required.
> >  (riscv_split_doubleword_move): Likewise.
> >  (riscv_output_move): Output the mov instructions in zfa extension.
> >  (riscv_print_operand): Output the floating-point value of the FLI.H/S/D immediate in assembly
> >  (riscv_secondary_memory_needed): Likewise.
> >  * config/riscv/riscv.md (fminm<mode>3): New.
> >  (fmaxm<mode>3): New.
> >  (movsidf2_low_rv32): New.
> >  (movsidf2_high_rv32): New.
> >  (movdfsisi3_rv32): New.
> >  (f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa): Likewise.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> >  * gcc.target/riscv/zfa-fleq-fltq-rv32.c: New test.
> >  * gcc.target/riscv/zfa-fleq-fltq.c: New test.
> >  * gcc.target/riscv/zfa-fli-rv32.c: New test.
> >  * gcc.target/riscv/zfa-fli-zfh-rv32.c: New test.
> >  * gcc.target/riscv/zfa-fli-zfh.c: New test.
> >  * gcc.target/riscv/zfa-fli.c: New test.
> >  * gcc.target/riscv/zfa-fmovh-fmovp-rv32.c: New test.
> >  * gcc.target/riscv/zfa-fround-rv32.c: New test.
> >  * gcc.target/riscv/zfa-fround.c: New test.
> > ---
> >   gcc/common/config/riscv/riscv-common.cc       |   4 +
> >   gcc/config/riscv/constraints.md               |  21 +-
> >   gcc/config/riscv/iterators.md                 |   5 +
> >   gcc/config/riscv/riscv-opts.h                 |   3 +
> >   gcc/config/riscv/riscv-protos.h               |   1 +
> >   gcc/config/riscv/riscv.cc                     | 204 +++++++++++++++++-
> >   gcc/config/riscv/riscv.md                     | 145 +++++++++++--
> >   .../gcc.target/riscv/zfa-fleq-fltq-rv32.c     |  19 ++
> >   .../gcc.target/riscv/zfa-fleq-fltq.c          |  19 ++
> >   gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c |  79 +++++++
> >   .../gcc.target/riscv/zfa-fli-zfh-rv32.c       |  41 ++++
> >   gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c  |  41 ++++
> >   gcc/testsuite/gcc.target/riscv/zfa-fli.c      |  79 +++++++
> >   .../gcc.target/riscv/zfa-fmovh-fmovp-rv32.c   |  10 +
> >   .../gcc.target/riscv/zfa-fround-rv32.c        |  42 ++++
> >   gcc/testsuite/gcc.target/riscv/zfa-fround.c   |  42 ++++
> >   16 files changed, 719 insertions(+), 36 deletions(-)
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround.c
> > 
> 
> 
> > +
> > +/* Return index of the FLI instruction table if rtx X is an immediate constant that can
> > +   be moved using a single FLI instruction in zfa extension. Return -1 if not found.  */
> > +
> > +int
> > +riscv_float_const_rtx_index_for_fli (rtx x)
> > +{
> > +  unsigned HOST_WIDE_INT *fli_value_array;
> > +
> > +  machine_mode mode = GET_MODE (x);
> > +
> > +  if (!TARGET_ZFA
> > +      || !CONST_DOUBLE_P(x)
> > +      || mode == VOIDmode
> > +      || (mode == HFmode && !TARGET_ZFH)
> > +      || (mode == SFmode && !TARGET_HARD_FLOAT)
> > +      || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
> > +    return -1;
> Do we also need to check Z[FDH]INX too?
> 
> Otherwise it looks pretty good.  We just need to wait for everything to 
> freeze and finalization on the assembler interface.
> 
> jeff

Yes, you are right, we also need to check Z[FDH]INX. I will send an updated
patch to fix it once others have had a chance to give review comments.

Jin
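
For readers following the review, here is a minimal sketch of the kind of guard
being discussed. It is not the code from the patch. The TargetFlags struct and
the fli_s_index function are hypothetical stand-ins for GCC's target macros and
for riscv_float_const_rtx_index_for_fli, and only the single-precision case is
modelled. The table follows the FLI.S value list in the Zfa specification, with
index 31 (canonical NaN) handled separately.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical stand-ins for GCC's target flags; not the real macros.
struct TargetFlags {
  bool zfa;         // Zfa extension enabled
  bool hard_float;  // F registers available
  bool zfinx;       // floats held in integer registers (no F registers)
};

// Values FLI.S can load for rs1 = 0..30 (rs1 = 31 is the canonical NaN).
static const float fli_s_table[31] = {
  -1.0f, std::numeric_limits<float>::min(),    // 0: -1.0, 1: min positive normal
  1.52587890625e-05f, 3.0517578125e-05f,       // 2: 2^-16, 3: 2^-15
  0.00390625f, 0.0078125f,                     // 4: 2^-8,  5: 2^-7
  0.0625f, 0.125f, 0.25f, 0.3125f, 0.375f, 0.4375f,
  0.5f, 0.625f, 0.75f, 0.875f,
  1.0f, 1.25f, 1.5f, 1.75f, 2.0f, 2.5f, 3.0f, 4.0f,
  8.0f, 16.0f, 128.0f, 256.0f, 32768.0f, 65536.0f,
  std::numeric_limits<float>::infinity()       // 30: +inf
};

// Return the raw IEEE 754 bit pattern of F.
static std::uint32_t bits_of(float f) {
  std::uint32_t b;
  std::memcpy(&b, &f, sizeof b);
  return b;
}

// Return the FLI.S table index for VALUE, or -1 if FLI.S cannot be used.
int fli_s_index(const TargetFlags &t, float value) {
  // Reject when the prerequisites are missing or when Zfinx is in use.
  if (!t.zfa || !t.hard_float || t.zfinx)
    return -1;
  const std::uint32_t want = bits_of(value);
  if (want == 0x7fc00000u)  // canonical single-precision quiet NaN
    return 31;
  // Compare bit patterns so that only exact matches are accepted.
  for (int i = 0; i < 31; ++i)
    if (bits_of(fli_s_table[i]) == want)
      return i;
  return -1;
}
```

With Zfa and F enabled, 1.0f maps to index 16, while a value such as 0.1f, or
any value when zfinx is set, yields -1 and would have to be materialised some
other way.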


* Re: [PATCH v9] RISC-V: Add the 'zfa' extension, version 0.2
  2023-05-16  7:06     ` jinma
@ 2023-05-16  7:53       ` Kito Cheng
  2023-08-09 18:11       ` Vineet Gupta
  1 sibling, 0 replies; 20+ messages in thread
From: Kito Cheng @ 2023-05-16  7:53 UTC (permalink / raw)
  To: jinma; +Cc: gcc-patches, Jeff Law, christoph.muellner, kito.cheng, palmer

zfa requires (depends on) f, which means zfa implies f in the current toolchain
implementation. Could you add that to riscv-common.cc?

That also means zfa is mutually exclusive with Z[FDH]INX.

Ref: https://github.com/riscv/riscv-isa-manual/issues/1020
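
To illustrate the point, here is a simplified, self-contained model of an
"X implies Y" extension table. It is only a sketch: the names implied, expand
and conflicts_with_zfinx are hypothetical and do not reproduce the actual data
structures in riscv-common.cc. It assumes that f and zfinx cannot be enabled
together, which is what makes zfa (through f) exclusive with zfinx.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified "extension -> implied extension" table.
static const std::vector<std::pair<std::string, std::string>> implied = {
  {"d", "f"},
  {"f", "zicsr"},
  {"zfa", "f"},  // the rule requested in this review
};

// Repeatedly apply the table until no new extension is added.
std::set<std::string> expand(std::set<std::string> exts) {
  bool changed = true;
  while (changed) {
    changed = false;
    for (const auto &rule : implied)
      if (exts.count(rule.first) && exts.insert(rule.second).second)
        changed = true;
  }
  return exts;
}

// Assumed conflict rule: F registers and Zfinx are mutually exclusive.
bool conflicts_with_zfinx(const std::set<std::string> &exts) {
  return exts.count("f") != 0 && exts.count("zfinx") != 0;
}
```

Expanding a set that contains only zfa adds f and then zicsr, so a set that
also contains zfinx is reported as conflicting once it has been expanded.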

On Tue, May 16, 2023 at 3:06 PM jinma <jinma@linux.alibaba.com> wrote:
>
> On 5/15/23 07:16, Jin Ma wrote:
> > > This patch adds the 'Zfa' extension for riscv, which is based on:
> > > https://github.com/riscv/riscv-isa-manual/commits/zfb
> > >
> > > The binutils-gdb for 'Zfa' extension:
> > > https://sourceware.org/pipermail/binutils/2023-April/127060.html
> > >
> > > What needs special explanation is:
> > > 1, The immediate number of the instructions FLI.H/S/D is represented in the assembly as a
> > >    floating-point value, with scientific counting when rs1 is 2,3, and decimal numbers for
> > >    the rest.
> > >
> > >    Related llvm link:
> > >      https://reviews.llvm.org/D145645
> > >    Related discussion link:
> > >      https://github.com/riscv/riscv-isa-manual/issues/980
> > >
> > > 2, According to riscv-spec, "The FCVTMO D.W.D instruction was added principally to
> > >    accelerate the processing of JavaScript Numbers.", so it seems that no implementation
> > >    is required.
> > >
> > > 3, The instructions FMINM and FMAXM correspond to C23 library function fminimum and fmaximum.
> > >    Therefore, this patch has simply implemented the pattern of fminm<hf\sf\df>3 and
> > >    fmaxm<hf\sf\df>3 to prepare for later.
> > >
> > > gcc/ChangeLog:
> > >
> > >  * common/config/riscv/riscv-common.cc: Add zfa extension version.
> > >  * config/riscv/constraints.md (zfli): Constrain the floating point number that the
> > >  instructions FLI.H/S/D can load.
> > >  * config/riscv/iterators.md (ceil): New.
> > >  (rup): New.
> > >  * config/riscv/riscv-opts.h (MASK_ZFA): New.
> > >  (TARGET_ZFA): New.
> > >  * config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): New.
> > >  * config/riscv/riscv.cc (riscv_float_const_rtx_index_for_fli): New.
> > >  (riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be used, memory is not applicable.
> > >  (riscv_const_insns): Likewise.
> > >  (riscv_legitimize_const_move): Likewise.
> > >  (riscv_split_64bit_move_p): If instruction FLI.H/S/D can be used, no split is required.
> > >  (riscv_split_doubleword_move): Likewise.
> > >  (riscv_output_move): Output the mov instructions in zfa extension.
> > >  (riscv_print_operand): Output the floating-point value of the FLI.H/S/D immediate in assembly
> > >  (riscv_secondary_memory_needed): Likewise.
> > >  * config/riscv/riscv.md (fminm<mode>3): New.
> > >  (fmaxm<mode>3): New.
> > >  (movsidf2_low_rv32): New.
> > >  (movsidf2_high_rv32): New.
> > >  (movdfsisi3_rv32): New.
> > >  (f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa): Likewise.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >  * gcc.target/riscv/zfa-fleq-fltq-rv32.c: New test.
> > >  * gcc.target/riscv/zfa-fleq-fltq.c: New test.
> > >  * gcc.target/riscv/zfa-fli-rv32.c: New test.
> > >  * gcc.target/riscv/zfa-fli-zfh-rv32.c: New test.
> > >  * gcc.target/riscv/zfa-fli-zfh.c: New test.
> > >  * gcc.target/riscv/zfa-fli.c: New test.
> > >  * gcc.target/riscv/zfa-fmovh-fmovp-rv32.c: New test.
> > >  * gcc.target/riscv/zfa-fround-rv32.c: New test.
> > >  * gcc.target/riscv/zfa-fround.c: New test.
> > > ---
> > >   gcc/common/config/riscv/riscv-common.cc       |   4 +
> > >   gcc/config/riscv/constraints.md               |  21 +-
> > >   gcc/config/riscv/iterators.md                 |   5 +
> > >   gcc/config/riscv/riscv-opts.h                 |   3 +
> > >   gcc/config/riscv/riscv-protos.h               |   1 +
> > >   gcc/config/riscv/riscv.cc                     | 204 +++++++++++++++++-
> > >   gcc/config/riscv/riscv.md                     | 145 +++++++++++--
> > >   .../gcc.target/riscv/zfa-fleq-fltq-rv32.c     |  19 ++
> > >   .../gcc.target/riscv/zfa-fleq-fltq.c          |  19 ++
> > >   gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c |  79 +++++++
> > >   .../gcc.target/riscv/zfa-fli-zfh-rv32.c       |  41 ++++
> > >   gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c  |  41 ++++
> > >   gcc/testsuite/gcc.target/riscv/zfa-fli.c      |  79 +++++++
> > >   .../gcc.target/riscv/zfa-fmovh-fmovp-rv32.c   |  10 +
> > >   .../gcc.target/riscv/zfa-fround-rv32.c        |  42 ++++
> > >   gcc/testsuite/gcc.target/riscv/zfa-fround.c   |  42 ++++
> > >   16 files changed, 719 insertions(+), 36 deletions(-)
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli.c
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround.c
> > >
> >
> >
> > > +
> > > +/* Return index of the FLI instruction table if rtx X is an immediate constant that can
> > > +   be moved using a single FLI instruction in zfa extension. Return -1 if not found.  */
> > > +
> > > +int
> > > +riscv_float_const_rtx_index_for_fli (rtx x)
> > > +{
> > > +  unsigned HOST_WIDE_INT *fli_value_array;
> > > +
> > > +  machine_mode mode = GET_MODE (x);
> > > +
> > > +  if (!TARGET_ZFA
> > > +      || !CONST_DOUBLE_P(x)
> > > +      || mode == VOIDmode
> > > +      || (mode == HFmode && !TARGET_ZFH)
> > > +      || (mode == SFmode && !TARGET_HARD_FLOAT)
> > > +      || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
> > > +    return -1;
> > Do we also need to check Z[FDH]INX too?
> >
> > Otherwise it looks pretty good.  We just need to wait for everything to
> > freeze and finalization on the assembler interface.
> >
> > jeff
>
> Yes, you are right, we also need to check Z[FDH]INX. I will send a patch
> again to fix it after others give some review comments.
>
> Jin


* Re: [PATCH v9] RISC-V: Add the 'zfa' extension, version 0.2
  2023-05-16  7:06     ` jinma
  2023-05-16  7:53       ` Kito Cheng
@ 2023-08-09 18:11       ` Vineet Gupta
  2023-08-11 15:49         ` Jin Ma
  2023-08-14  6:00         ` Jin Ma
  1 sibling, 2 replies; 20+ messages in thread
From: Vineet Gupta @ 2023-08-09 18:11 UTC (permalink / raw)
  To: jinma, gcc-patches, Jeff Law
  Cc: jinma, christoph.muellner, kito.cheng, kito.cheng, palmer

Hi Jin Ma,

On 5/16/23 00:06, jinma via Gcc-patches wrote:
> On 5/15/23 07:16, Jin Ma wrote:
>>
>> Do we also need to check Z[FDH]INX too?
>>
>> Otherwise it looks pretty good.  We just need to wait for everything to
>> freeze and finalization on the assembler interface.
>>
>> jeff
> Yes, you are right, we also need to check Z[FDH]INX. I will send a patch
> again to fix it after others give some review comments.

Can we please revisit this and get it merged upstream?
It seems GCC already supports extensions that are frozen but not yet ratified.

Thx,
-Vineet


* Re: [PATCH v9] RISC-V: Add the 'zfa' extension, version 0.2
  2023-08-09 18:11       ` Vineet Gupta
@ 2023-08-11 15:49         ` Jin Ma
  2023-08-14  6:00         ` Jin Ma
  1 sibling, 0 replies; 20+ messages in thread
From: Jin Ma @ 2023-08-11 15:49 UTC (permalink / raw)
  To: jinma, gcc-patches, Jeff Law, Vineet Gupta
  Cc: christoph.muellner, kito.cheng, kito.cheng, palmer

> Hi Jin Ma,
> 
> On 5/16/23 00:06, jinma via Gcc-patches wrote:
> > On 5/15/23 07:16, Jin Ma wrote:
> >>
> >> Do we also need to check Z[FDH]INX too?
> >>
> >> Otherwise it looks pretty good.  We just need to wait for everything to
> >> freeze and finalization on the assembler interface.
> >>
> >> jeff
> > Yes, you are right, we also need to check Z[FDH]INX. I will send a patch
> > again to fix it after others give some review comments.
> 
> Can we please revisit this and get this merged upstream.
> Seems like gcc is supporting frozen but not ratified extensions.
> 
> Thx,
> -Vineet

OK, I will check and resend a patch about this in a few days.

Thanks,
Jin


* Re: [PATCH v9] RISC-V: Add the 'zfa' extension, version 0.2
  2023-08-09 18:11       ` Vineet Gupta
  2023-08-11 15:49         ` Jin Ma
@ 2023-08-14  6:00         ` Jin Ma
  2023-08-14  6:10           ` Jin Ma
  1 sibling, 1 reply; 20+ messages in thread
From: Jin Ma @ 2023-08-14  6:00 UTC (permalink / raw)
  To: jinma, gcc-patches, Jeff Law, Vineet Gupta
  Cc: christoph.muellner, kito.cheng, kito.cheng, palmer

> > Hi Jin Ma,
> > 
> > On 5/16/23 00:06, jinma via Gcc-patches wrote:
> > > On 5/15/23 07:16, Jin Ma wrote:
> > >>
> > >> Do we also need to check Z[FDH]INX too?
> > >>
> > >> Otherwise it looks pretty good.  We just need to wait for everything to
> > >> freeze and finalization on the assembler interface.
> > >>
> > >> jeff
> > > Yes, you are right, we also need to check Z[FDH]INX. I will send a patch
> > > again to fix it after others give some review comments.
> > 
> > Can we please revisit this and get this merged upstream.
> > Seems like gcc is supporting frozen but not ratified extensions.
> > 
> > Thx,
> > -Vineet
> 
> OK, I will check and resend a patch about this in a few days.
> 
> Thanks,
> Jin

Done, please review again. Compared with the v9 version two months ago,
the previous review comments have been addressed. In addition, the variable
riscv_zfa_subext has been added to riscv.opt to enable the zfa extension.


* Re: [PATCH v9] RISC-V: Add the 'zfa' extension, version 0.2
  2023-08-14  6:00         ` Jin Ma
@ 2023-08-14  6:10           ` Jin Ma
  2023-08-14 22:11             ` Jeff Law
  0 siblings, 1 reply; 20+ messages in thread
From: Jin Ma @ 2023-08-14  6:10 UTC (permalink / raw)
  To: jinma, gcc-patches, Jeff Law, Vineet Gupta
  Cc: christoph.muellner, kito.cheng, kito.cheng, palmer

Additional links:
v10, the patch that needs to be reviewed again:
http://patchwork.ozlabs.org/project/gcc/patch/20230814055033.1995-1-jinma@linux.alibaba.com/

v9 and the previous review comments:
http://patchwork.ozlabs.org/project/gcc/patch/20230515131628.953-1-jinma@linux.alibaba.com/

Zfa patch in master branch of binutils-gdb
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=1f3fc45bddc7147a2e59346a59290094137ef1e1


* Re: [PATCH v9] RISC-V: Add the 'zfa' extension, version 0.2
  2023-08-14  6:10           ` Jin Ma
@ 2023-08-14 22:11             ` Jeff Law
  0 siblings, 0 replies; 20+ messages in thread
From: Jeff Law @ 2023-08-14 22:11 UTC (permalink / raw)
  To: Jin Ma, jinma, gcc-patches, Vineet Gupta
  Cc: christoph.muellner, kito.cheng, kito.cheng, palmer



On 8/14/23 00:10, Jin Ma wrote:
> Additional links:
> v10, the patch that needs to be reviewed again:
> http://patchwork.ozlabs.org/project/gcc/patch/20230814055033.1995-1-jinma@linux.alibaba.com/
> 
> v9 and the previous review comments:
> http://patchwork.ozlabs.org/project/gcc/patch/20230515131628.953-1-jinma@linux.alibaba.com/
> 
> Zfa patch in master branch of binutils-gdb
> https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=1f3fc45bddc7147a2e59346a59290094137ef1e1
Will do.  We'll also have to evaluate against Tsukasa's work.  As we saw 
with Zicond there may be cases that are better handled by one vs the 
other and we may end up taking pieces from both.

jeff


Thread overview: 20+ messages
2023-04-19  9:57 [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2 Jin Ma
2023-05-05 15:03 ` Christoph Müllner
2023-05-05 15:04   ` Christoph Müllner
2023-05-05 15:12     ` Palmer Dabbelt
2023-05-05 15:43       ` Christoph Müllner
2023-05-05 23:31 ` Jeff Law
2023-05-06  7:54 ` Jin Ma
2023-05-06 12:53   ` jinma
2023-05-16  3:59     ` Jeff Law
2023-05-15 13:16 ` [PATCH v9] " Jin Ma
2023-05-15 13:30   ` jinma
2023-05-16  4:00     ` Jeff Law
2023-05-16  4:16   ` Jeff Law
2023-05-16  7:06     ` jinma
2023-05-16  7:53       ` Kito Cheng
2023-08-09 18:11       ` Vineet Gupta
2023-08-11 15:49         ` Jin Ma
2023-08-14  6:00         ` Jin Ma
2023-08-14  6:10           ` Jin Ma
2023-08-14 22:11             ` Jeff Law
