From: Sylvain Noiry <snoiry@kalrayinc.com>
To: gcc-patches@gcc.gnu.org
Cc: Sylvain Noiry <snoiry@kalrayinc.com>
Subject: [PATCH v2 02/11] Native complex ops: Move functions to hooks
Date: Tue, 12 Sep 2023 12:07:04 +0200
Message-ID: <20230912100713.1074-3-snoiry@kalrayinc.com>
In-Reply-To: <20230912100713.1074-1-snoiry@kalrayinc.com>

Summary:
Move read_complex_part and write_complex_part to target hooks. Their
signatures also change because the type of the part argument is now
complex_part_t. Calls to these functions are updated accordingly.
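
As an illustration of the call-site conversion, here is a minimal
standalone sketch. It assumes complex_part_t is an enum with REAL_P = 0,
IMAG_P = 1 and BOTH_P = 2 (the type is not defined in this patch, so the
values are a guess consistent with the XEXP (cplx, part) indexing below).
The helpers part_from_imag_p and push_order are hypothetical and exist
only for this example.

```cpp
#include <cassert>
#include <utility>

// Assumption: the real enum is introduced elsewhere in the series; these
// values are only chosen so that REAL_P/IMAG_P can index a two-element pair.
enum complex_part_t { REAL_P = 0, IMAG_P = 1, BOTH_P = 2 };

// Hypothetical helper: map the old `bool imag_p` argument to the new enum.
constexpr complex_part_t
part_from_imag_p (bool imag_p)
{
  return imag_p ? IMAG_P : REAL_P;
}

// Hypothetical helper mirroring the two reads in emit_move_complex_push:
// the first pushed part follows imag_first, the second is the other part.
std::pair<complex_part_t, complex_part_t>
push_order (bool imag_first)
{
  return {part_from_imag_p (imag_first), part_from_imag_p (!imag_first)};
}

// Old call sites passing `false` become REAL_P, `true` becomes IMAG_P.
static_assert (part_from_imag_p (false) == REAL_P, "false maps to REAL_P");
static_assert (part_from_imag_p (true) == IMAG_P, "true maps to IMAG_P");
```

Spelling the part out at each call site avoids the old ambiguity between
the imag_p and undefined_p booleans of write_complex_part.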

gcc/ChangeLog:

        * target.def: Define hooks for read_complex_part and
        write_complex_part
        * targhooks.cc (default_read_complex_part): New: default
        implementation of read_complex_part
        (default_write_complex_part): New: default implementation
        of write_complex_part
        * targhooks.h: Add default_read_complex_part and
        default_write_complex_part
        * doc/tm.texi: Document the new TARGET_READ_COMPLEX_PART
        and TARGET_WRITE_COMPLEX_PART hooks
        * doc/tm.texi.in: Add TARGET_READ_COMPLEX_PART and
        TARGET_WRITE_COMPLEX_PART
        * expr.cc
        (write_complex_part): Call TARGET_WRITE_COMPLEX_PART hook
        (read_complex_part): Call TARGET_READ_COMPLEX_PART hook
        * expr.h: Update function signatures of read_complex_part
        and write_complex_part
        * builtins.cc (expand_ifn_atomic_compare_exchange_into_call):
        Update calls to read_complex_part and write_complex_part
        (expand_ifn_atomic_compare_exchange): Likewise
        * expmed.cc (flip_storage_order): Likewise
        * expr.cc (clear_storage_hints): Likewise
        (emit_move_complex_push): Likewise
        (emit_move_complex_parts): Likewise
        (expand_assignment): Likewise
        (expand_expr_real_2): Likewise
        (expand_expr_real_1): Likewise
        (const_vector_from_tree): Likewise
        * internal-fn.cc (expand_arith_set_overflow): Likewise
        (expand_arith_overflow_result_store): Likewise
        (expand_addsub_overflow): Likewise
        (expand_neg_overflow): Likewise
        (expand_mul_overflow): Likewise
        (expand_arith_overflow): Likewise
        (expand_UADDC): Likewise
---
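
Note for reviewers: the dispatch introduced here can be modelled outside
of GCC with the toy program below. It is only a sketch under stated
assumptions: toy_value stands in for an rtx, toy_targetm for targetm, the
enum values are assumed, and native_write_complex_part is a hypothetical
backend override. It shows the generic entry points forwarding to the
hooks, and the default write splitting BOTH_P into a REAL_P and an IMAG_P
write.

```cpp
#include <cassert>

// Assumed values; the real enum is not defined in this patch.
enum complex_part_t { REAL_P = 0, IMAG_P = 1, BOTH_P = 2 };

// Toy stand-in for an rtx: a scalar uses only `re`, a complex uses both.
struct toy_value { double re; double im; };

toy_value read_complex_part (const toy_value &cplx, complex_part_t part);
void write_complex_part (toy_value &cplx, const toy_value &val,
			 complex_part_t part);

// Default read: BOTH_P hands back the whole value, otherwise one component.
toy_value
default_read_complex_part (const toy_value &cplx, complex_part_t part)
{
  if (part == BOTH_P)
    return cplx;
  return toy_value{part == IMAG_P ? cplx.im : cplx.re, 0.0};
}

// Default write: BOTH_P is split into a REAL_P write and an IMAG_P write.
void
default_write_complex_part (toy_value &cplx, const toy_value &val,
			    complex_part_t part)
{
  if (part == BOTH_P)
    {
      write_complex_part (cplx, read_complex_part (val, REAL_P), REAL_P);
      write_complex_part (cplx, read_complex_part (val, IMAG_P), IMAG_P);
      return;
    }
  (part == IMAG_P ? cplx.im : cplx.re) = val.re;
}

// Toy hook table playing the role of targetm.
struct toy_hooks
{
  toy_value (*read_complex_part) (const toy_value &, complex_part_t);
  void (*write_complex_part) (toy_value &, const toy_value &, complex_part_t);
};

toy_hooks toy_targetm = {default_read_complex_part,
			 default_write_complex_part};

// The generic entry points only forward to the current hooks.
toy_value
read_complex_part (const toy_value &cplx, complex_part_t part)
{
  return toy_targetm.read_complex_part (cplx, part);
}

void
write_complex_part (toy_value &cplx, const toy_value &val,
		    complex_part_t part)
{
  toy_targetm.write_complex_part (cplx, val, part);
}

// Hypothetical backend override: store both parts in a single step.
int native_both_writes = 0;

void
native_write_complex_part (toy_value &cplx, const toy_value &val,
			   complex_part_t part)
{
  if (part == BOTH_P)
    {
      cplx = val;
      ++native_both_writes;
      return;
    }
  default_write_complex_part (cplx, val, part);
}

// Exercise the default hook, then the override; returns the number of
// BOTH_P writes handled natively.
int
demo_hooks ()
{
  toy_value target = {0.0, 0.0};
  write_complex_part (target, toy_value{1.0, 2.0}, BOTH_P);
  assert (target.re == 1.0 && target.im == 2.0);

  toy_targetm.write_complex_part = native_write_complex_part;
  write_complex_part (target, toy_value{3.0, 4.0}, BOTH_P);
  assert (target.re == 3.0 && target.im == 4.0);
  assert (read_complex_part (target, IMAG_P).re == 4.0);
  return native_both_writes;
}
```

In a real backend the override would be installed through the usual
TARGET_WRITE_COMPLEX_PART macro rather than by assigning to a table at
run time; the assignment above is only a convenience of the toy model.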
 gcc/builtins.cc    |   8 +--
 gcc/doc/tm.texi    |  10 +++
 gcc/doc/tm.texi.in |   4 ++
 gcc/expmed.cc      |   4 +-
 gcc/expr.cc        | 165 +++++++++------------------------------------
 gcc/expr.h         |   5 +-
 gcc/internal-fn.cc |  16 ++---
 gcc/target.def     |  18 +++++
 gcc/targhooks.cc   | 139 ++++++++++++++++++++++++++++++++++++++
 gcc/targhooks.h    |   4 ++
 10 files changed, 221 insertions(+), 152 deletions(-)

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 3b453b3ec8c..b5cb652c413 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -6349,8 +6349,8 @@ expand_ifn_atomic_compare_exchange_into_call (gcall *call, machine_mode mode)
       if (GET_MODE (boolret) != mode)
 	boolret = convert_modes (mode, GET_MODE (boolret), boolret, 1);
       x = force_reg (mode, x);
-      write_complex_part (target, boolret, true, true);
-      write_complex_part (target, x, false, false);
+      write_complex_part (target, boolret, IMAG_P, true);
+      write_complex_part (target, x, REAL_P, false);
     }
 }
 
@@ -6405,8 +6405,8 @@ expand_ifn_atomic_compare_exchange (gcall *call)
       rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
       if (GET_MODE (boolret) != mode)
 	boolret = convert_modes (mode, GET_MODE (boolret), boolret, 1);
-      write_complex_part (target, boolret, true, true);
-      write_complex_part (target, oldval, false, false);
+      write_complex_part (target, boolret, IMAG_P, true);
+      write_complex_part (target, oldval, REAL_P, false);
     }
 }
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index ff69207fb9f..c4f935b5746 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4620,6 +4620,16 @@ to return a nonzero value when it is required, the compiler will run out
 of spill registers and print a fatal error message.
 @end deftypefn
 
+@deftypefn {Target Hook} rtx TARGET_READ_COMPLEX_PART (rtx @var{cplx}, complex_part_t @var{part})
+This hook should return the rtx representing the specified @var{part} of the complex given by @var{cplx}.
+  @var{part} can be the real part, the imaginary part, or both of them.
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_WRITE_COMPLEX_PART (rtx @var{cplx}, rtx @var{val}, complex_part_t @var{part})
+This hook should move the rtx value given by @var{val} to the specified @var{part} of the complex given by @var{cplx}.
+  @var{part} can be the real part, the imaginary part, or both of them.
+@end deftypefn
+
 @node Scalar Return
 @subsection How Scalar Function Values Are Returned
 @cindex return values in registers
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index cad6308a87c..b8970761c8d 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -3392,6 +3392,10 @@ stack.
 
 @hook TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P
 
+@hook TARGET_READ_COMPLEX_PART
+
+@hook TARGET_WRITE_COMPLEX_PART
+
 @node Scalar Return
 @subsection How Scalar Function Values Are Returned
 @cindex return values in registers
diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index b294eabb08d..973c16a14d3 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -394,8 +394,8 @@ flip_storage_order (machine_mode mode, rtx x)
 
   if (COMPLEX_MODE_P (mode))
     {
-      rtx real = read_complex_part (x, false);
-      rtx imag = read_complex_part (x, true);
+      rtx real = read_complex_part (x, REAL_P);
+      rtx imag = read_complex_part (x, IMAG_P);
 
       real = flip_storage_order (GET_MODE_INNER (mode), real);
       imag = flip_storage_order (GET_MODE_INNER (mode), imag);
diff --git a/gcc/expr.cc b/gcc/expr.cc
index d5b6494b4fc..12b74273144 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -3475,8 +3475,8 @@ clear_storage_hints (rtx object, rtx size, enum block_op_methods method,
 	  zero = CONST0_RTX (GET_MODE_INNER (mode));
 	  if (zero != NULL)
 	    {
-	      write_complex_part (object, zero, 0, true);
-	      write_complex_part (object, zero, 1, false);
+	      write_complex_part (object, zero, REAL_P, true);
+	      write_complex_part (object, zero, IMAG_P, false);
 	      return NULL;
 	    }
 	}
@@ -3641,126 +3641,18 @@ set_storage_via_setmem (rtx object, rtx size, rtx val, unsigned int align,
    If UNDEFINED_P then the value in CPLX is currently undefined.  */
 
 void
-write_complex_part (rtx cplx, rtx val, bool imag_p, bool undefined_p)
+write_complex_part (rtx cplx, rtx val, complex_part_t part, bool undefined_p)
 {
-  machine_mode cmode;
-  scalar_mode imode;
-  unsigned ibitsize;
-
-  if (GET_CODE (cplx) == CONCAT)
-    {
-      emit_move_insn (XEXP (cplx, imag_p), val);
-      return;
-    }
-
-  cmode = GET_MODE (cplx);
-  imode = GET_MODE_INNER (cmode);
-  ibitsize = GET_MODE_BITSIZE (imode);
-
-  /* For MEMs simplify_gen_subreg may generate an invalid new address
-     because, e.g., the original address is considered mode-dependent
-     by the target, which restricts simplify_subreg from invoking
-     adjust_address_nv.  Instead of preparing fallback support for an
-     invalid address, we call adjust_address_nv directly.  */
-  if (MEM_P (cplx))
-    {
-      emit_move_insn (adjust_address_nv (cplx, imode,
-					 imag_p ? GET_MODE_SIZE (imode) : 0),
-		      val);
-      return;
-    }
-
-  /* If the sub-object is at least word sized, then we know that subregging
-     will work.  This special case is important, since store_bit_field
-     wants to operate on integer modes, and there's rarely an OImode to
-     correspond to TCmode.  */
-  if (ibitsize >= BITS_PER_WORD
-      /* For hard regs we have exact predicates.  Assume we can split
-	 the original object if it spans an even number of hard regs.
-	 This special case is important for SCmode on 64-bit platforms
-	 where the natural size of floating-point regs is 32-bit.  */
-      || (REG_P (cplx)
-	  && REGNO (cplx) < FIRST_PSEUDO_REGISTER
-	  && REG_NREGS (cplx) % 2 == 0))
-    {
-      rtx part = simplify_gen_subreg (imode, cplx, cmode,
-				      imag_p ? GET_MODE_SIZE (imode) : 0);
-      if (part)
-        {
-	  emit_move_insn (part, val);
-	  return;
-	}
-      else
-	/* simplify_gen_subreg may fail for sub-word MEMs.  */
-	gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD);
-    }
-
-  store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, 0, 0, imode, val,
-		   false, undefined_p);
+  targetm.write_complex_part (cplx, val, part, undefined_p);
 }
 
 /* Extract one of the components of the complex value CPLX.  Extract the
    real part if IMAG_P is false, and the imaginary part if it's true.  */
 
 rtx
-read_complex_part (rtx cplx, bool imag_p)
-{
-  machine_mode cmode;
-  scalar_mode imode;
-  unsigned ibitsize;
-
-  if (GET_CODE (cplx) == CONCAT)
-    return XEXP (cplx, imag_p);
-
-  cmode = GET_MODE (cplx);
-  imode = GET_MODE_INNER (cmode);
-  ibitsize = GET_MODE_BITSIZE (imode);
-
-  /* Special case reads from complex constants that got spilled to memory.  */
-  if (MEM_P (cplx) && GET_CODE (XEXP (cplx, 0)) == SYMBOL_REF)
-    {
-      tree decl = SYMBOL_REF_DECL (XEXP (cplx, 0));
-      if (decl && TREE_CODE (decl) == COMPLEX_CST)
-	{
-	  tree part = imag_p ? TREE_IMAGPART (decl) : TREE_REALPART (decl);
-	  if (CONSTANT_CLASS_P (part))
-	    return expand_expr (part, NULL_RTX, imode, EXPAND_NORMAL);
-	}
-    }
-
-  /* For MEMs simplify_gen_subreg may generate an invalid new address
-     because, e.g., the original address is considered mode-dependent
-     by the target, which restricts simplify_subreg from invoking
-     adjust_address_nv.  Instead of preparing fallback support for an
-     invalid address, we call adjust_address_nv directly.  */
-  if (MEM_P (cplx))
-    return adjust_address_nv (cplx, imode,
-			      imag_p ? GET_MODE_SIZE (imode) : 0);
-
-  /* If the sub-object is at least word sized, then we know that subregging
-     will work.  This special case is important, since extract_bit_field
-     wants to operate on integer modes, and there's rarely an OImode to
-     correspond to TCmode.  */
-  if (ibitsize >= BITS_PER_WORD
-      /* For hard regs we have exact predicates.  Assume we can split
-	 the original object if it spans an even number of hard regs.
-	 This special case is important for SCmode on 64-bit platforms
-	 where the natural size of floating-point regs is 32-bit.  */
-      || (REG_P (cplx)
-	  && REGNO (cplx) < FIRST_PSEUDO_REGISTER
-	  && REG_NREGS (cplx) % 2 == 0))
-    {
-      rtx ret = simplify_gen_subreg (imode, cplx, cmode,
-				     imag_p ? GET_MODE_SIZE (imode) : 0);
-      if (ret)
-        return ret;
-      else
-	/* simplify_gen_subreg may fail for sub-word MEMs.  */
-	gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD);
-    }
-
-  return extract_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0,
-			    true, NULL_RTX, imode, imode, false, NULL);
+read_complex_part (rtx cplx, complex_part_t part)
+{
+  return targetm.read_complex_part (cplx, part);
 }
 \f
 /* A subroutine of emit_move_insn_1.  Yet another lowpart generator.
@@ -3931,9 +3823,10 @@ emit_move_complex_push (machine_mode mode, rtx x, rtx y)
     }
 
   emit_move_insn (gen_rtx_MEM (submode, XEXP (x, 0)),
-		  read_complex_part (y, imag_first));
+		  read_complex_part (y, (imag_first) ? IMAG_P : REAL_P));
   return emit_move_insn (gen_rtx_MEM (submode, XEXP (x, 0)),
-			 read_complex_part (y, !imag_first));
+			 read_complex_part (y,
+					    (imag_first) ? REAL_P : IMAG_P));
 }
 
 /* A subroutine of emit_move_complex.  Perform the move from Y to X
@@ -3949,8 +3842,8 @@ emit_move_complex_parts (rtx x, rtx y)
       && REG_P (x) && !reg_overlap_mentioned_p (x, y))
     emit_clobber (x);
 
-  write_complex_part (x, read_complex_part (y, false), false, true);
-  write_complex_part (x, read_complex_part (y, true), true, false);
+  write_complex_part (x, read_complex_part (y, REAL_P), REAL_P, true);
+  write_complex_part (x, read_complex_part (y, IMAG_P), IMAG_P, false);
 
   return get_last_insn ();
 }
@@ -5807,9 +5700,9 @@ expand_assignment (tree to, tree from, bool nontemporal)
 		  if (from_rtx)
 		    {
 		      emit_move_insn (XEXP (to_rtx, 0),
-				      read_complex_part (from_rtx, false));
+				      read_complex_part (from_rtx, REAL_P));
 		      emit_move_insn (XEXP (to_rtx, 1),
-				      read_complex_part (from_rtx, true));
+				      read_complex_part (from_rtx, IMAG_P));
 		    }
 		  else
 		    {
@@ -5831,14 +5724,16 @@ expand_assignment (tree to, tree from, bool nontemporal)
 	    concat_store_slow:;
 	      rtx temp = assign_stack_temp (GET_MODE (to_rtx),
 					    GET_MODE_SIZE (GET_MODE (to_rtx)));
-	      write_complex_part (temp, XEXP (to_rtx, 0), false, true);
-	      write_complex_part (temp, XEXP (to_rtx, 1), true, false);
+	      write_complex_part (temp, XEXP (to_rtx, 0), REAL_P, true);
+	      write_complex_part (temp, XEXP (to_rtx, 1), IMAG_P, false);
 	      result = store_field (temp, bitsize, bitpos,
 				    bitregion_start, bitregion_end,
 				    mode1, from, get_alias_set (to),
 				    nontemporal, reversep);
-	      emit_move_insn (XEXP (to_rtx, 0), read_complex_part (temp, false));
-	      emit_move_insn (XEXP (to_rtx, 1), read_complex_part (temp, true));
+	      emit_move_insn (XEXP (to_rtx, 0),
+			      read_complex_part (temp, REAL_P));
+	      emit_move_insn (XEXP (to_rtx, 1),
+			      read_complex_part (temp, IMAG_P));
 	    }
 	}
       /* For calls to functions returning variable length structures, if TO_RTX
@@ -10317,8 +10212,8 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 	      complex_expr_swap_order:
 		/* Move the imaginary (op1) and real (op0) parts to their
 		   location.  */
-		write_complex_part (target, op1, true, true);
-		write_complex_part (target, op0, false, false);
+		write_complex_part (target, op1, IMAG_P, true);
+		write_complex_part (target, op0, REAL_P, false);
 
 		return target;
 	      }
@@ -10346,9 +10241,8 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 	    break;
 	  }
 
-      /* Move the real (op0) and imaginary (op1) parts to their location.  */
-      write_complex_part (target, op0, false, true);
-      write_complex_part (target, op1, true, false);
+      /* Temporarily use a CONCAT to pass both real and imag parts in one call.  */
+      write_complex_part (target, gen_rtx_CONCAT (GET_MODE (target), op0, op1), BOTH_P, true);
 
       return target;
 
@@ -11550,7 +11444,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 		    rtx parts[2];
 		    for (int i = 0; i < 2; i++)
 		      {
-			rtx op = read_complex_part (op0, i != 0);
+			rtx op = read_complex_part (op0, (i != 0) ? IMAG_P
+						    : REAL_P);
 			if (GET_CODE (op) == SUBREG)
 			  op = force_reg (GET_MODE (op), op);
 			temp = gen_lowpart_common (GET_MODE_INNER (mode1), op);
@@ -12150,11 +12045,11 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 
     case REALPART_EXPR:
       op0 = expand_normal (treeop0);
-      return read_complex_part (op0, false);
+      return read_complex_part (op0, REAL_P);
 
     case IMAGPART_EXPR:
       op0 = expand_normal (treeop0);
-      return read_complex_part (op0, true);
+      return read_complex_part (op0, IMAG_P);
 
     case RETURN_EXPR:
     case LABEL_EXPR:
@@ -13494,8 +13389,8 @@ const_vector_from_tree (tree exp)
 	builder.quick_push (const_double_from_real_value (TREE_REAL_CST (elt),
 							  inner));
       else if (TREE_CODE (elt) == FIXED_CST)
-	builder.quick_push (CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt),
-							  inner));
+	builder.quick_push (CONST_FIXED_FROM_FIXED_VALUE
+			    (TREE_FIXED_CST (elt), inner));
       else
 	builder.quick_push (immed_wide_int_const (wi::to_poly_wide (elt),
 						  inner));
diff --git a/gcc/expr.h b/gcc/expr.h
index 11bff531862..833ff16bd0d 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -261,9 +261,8 @@ extern rtx_insn *emit_move_insn_1 (rtx, rtx);
 
 extern rtx_insn *emit_move_complex_push (machine_mode, rtx, rtx);
 extern rtx_insn *emit_move_complex_parts (rtx, rtx);
-extern rtx read_complex_part (rtx, bool);
-extern void write_complex_part (rtx, rtx, bool, bool);
-extern rtx read_complex_part (rtx, bool);
+extern rtx read_complex_part (rtx, complex_part_t);
+extern void write_complex_part (rtx, rtx, complex_part_t, bool);
 extern rtx emit_move_resolve_push (machine_mode, rtx);
 
 /* Push a block of length SIZE (perhaps variable)
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 0fd34359247..a01b7160303 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -919,9 +919,9 @@ expand_arith_set_overflow (tree lhs, rtx target)
 {
   if (TYPE_PRECISION (TREE_TYPE (TREE_TYPE (lhs))) == 1
       && !TYPE_UNSIGNED (TREE_TYPE (TREE_TYPE (lhs))))
-    write_complex_part (target, constm1_rtx, true, false);
+    write_complex_part (target, constm1_rtx, IMAG_P, false);
   else
-    write_complex_part (target, const1_rtx, true, false);
+    write_complex_part (target, const1_rtx, IMAG_P, false);
 }
 
 /* Helper for expand_*_overflow.  Store RES into the __real__ part
@@ -976,7 +976,7 @@ expand_arith_overflow_result_store (tree lhs, rtx target,
       expand_arith_set_overflow (lhs, target);
       emit_label (done_label);
     }
-  write_complex_part (target, lres, false, false);
+  write_complex_part (target, lres, REAL_P, false);
 }
 
 /* Helper for expand_*_overflow.  Store RES into TARGET.  */
@@ -1051,7 +1051,7 @@ expand_addsub_overflow (location_t loc, tree_code code, tree lhs,
     {
       target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
       if (!is_ubsan)
-	write_complex_part (target, const0_rtx, true, false);
+	write_complex_part (target, const0_rtx, IMAG_P, false);
     }
 
   /* We assume both operands and result have the same precision
@@ -1496,7 +1496,7 @@ expand_neg_overflow (location_t loc, tree lhs, tree arg1, bool is_ubsan,
     {
       target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
       if (!is_ubsan)
-	write_complex_part (target, const0_rtx, true, false);
+	write_complex_part (target, const0_rtx, IMAG_P, false);
     }
 
   enum insn_code icode = optab_handler (negv3_optab, mode);
@@ -1621,7 +1621,7 @@ expand_mul_overflow (location_t loc, tree lhs, tree arg0, tree arg1,
     {
       target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
       if (!is_ubsan)
-	write_complex_part (target, const0_rtx, true, false);
+	write_complex_part (target, const0_rtx, IMAG_P, false);
     }
 
   if (is_ubsan)
@@ -2444,7 +2444,7 @@ expand_mul_overflow (location_t loc, tree lhs, tree arg0, tree arg1,
       do_compare_rtx_and_jump (op1, res, NE, true, mode, NULL_RTX, NULL,
 			       all_done_label, profile_probability::very_unlikely ());
       emit_label (set_noovf);
-      write_complex_part (target, const0_rtx, true, false);
+      write_complex_part (target, const0_rtx, IMAG_P, false);
       emit_label (all_done_label);
     }
 
@@ -2713,7 +2713,7 @@ expand_arith_overflow (enum tree_code code, gimple *stmt)
 	{
 	  /* The infinity precision result will always fit into result.  */
 	  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
-	  write_complex_part (target, const0_rtx, true, false);
+	  write_complex_part (target, const0_rtx, IMAG_P, false);
 	  scalar_int_mode mode = SCALAR_INT_TYPE_MODE (type);
 	  struct separate_ops ops;
 	  ops.code = code;
diff --git a/gcc/target.def b/gcc/target.def
index 42622177ef9..f99df939776 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -3313,6 +3313,24 @@ a pointer to int.",
  bool, (ao_ref *ref),
  default_ref_may_alias_errno)
 
+/* Returns the value corresponding to the specified part of a complex.  */
+DEFHOOK
+(read_complex_part,
+ "This hook should return the rtx representing the specified @var{part} of the complex given by @var{cplx}.\n\
+  @var{part} can be the real part, the imaginary part, or both of them.",
+ rtx,
+ (rtx cplx, complex_part_t part),
+ default_read_complex_part)
+
+/* Moves a value to the specified part of a complex  */
+DEFHOOK
+(write_complex_part,
+ "This hook should move the rtx value given by @var{val} to the specified @var{part} of the complex given by @var{cplx}.\n\
+  @var{part} can be the real part, the imaginary part, or both of them.",
+ void,
+ (rtx cplx, rtx val, complex_part_t part),
+ default_write_complex_part)
+
 /* Support for named address spaces.  */
 #undef HOOK_PREFIX
 #define HOOK_PREFIX "TARGET_ADDR_SPACE_"
diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
index 4f5b240f8d6..df852eb18e3 100644
--- a/gcc/targhooks.cc
+++ b/gcc/targhooks.cc
@@ -1533,6 +1533,145 @@ default_preferred_simd_mode (scalar_mode)
   return word_mode;
 }
 
+/* By default, extract one of the components of the complex value CPLX.  Extract the
+   real part if part is REAL_P, and the imaginary part if it is IMAG_P. If part is
+   BOTH_P, return cplx directly.  */
+
+rtx
+default_read_complex_part (rtx cplx, complex_part_t part)
+{
+  machine_mode cmode;
+  scalar_mode imode;
+  unsigned ibitsize;
+
+  if (part == BOTH_P)
+    return cplx;
+
+  if (GET_CODE (cplx) == CONCAT)
+    return XEXP (cplx, part);
+
+  cmode = GET_MODE (cplx);
+  imode = GET_MODE_INNER (cmode);
+  ibitsize = GET_MODE_BITSIZE (imode);
+
+  /* Special case reads from complex constants that got spilled to memory.  */
+  if (MEM_P (cplx) && GET_CODE (XEXP (cplx, 0)) == SYMBOL_REF)
+    {
+      tree decl = SYMBOL_REF_DECL (XEXP (cplx, 0));
+      if (decl && TREE_CODE (decl) == COMPLEX_CST)
+	{
+	  tree cplx_part =
+	    (part == IMAG_P) ? TREE_IMAGPART (decl) : TREE_REALPART (decl);
+	  if (CONSTANT_CLASS_P (cplx_part))
+	    return expand_expr (cplx_part, NULL_RTX, imode, EXPAND_NORMAL);
+	}
+    }
+
+  /* For MEMs simplify_gen_subreg may generate an invalid new address
+     because, e.g., the original address is considered mode-dependent
+     by the target, which restricts simplify_subreg from invoking
+     adjust_address_nv.  Instead of preparing fallback support for an
+     invalid address, we call adjust_address_nv directly.  */
+  if (MEM_P (cplx))
+    return adjust_address_nv (cplx, imode, (part == IMAG_P)
+			      ? GET_MODE_SIZE (imode) : 0);
+
+  /* If the sub-object is at least word sized, then we know that subregging
+     will work.  This special case is important, since extract_bit_field
+     wants to operate on integer modes, and there's rarely an OImode to
+     correspond to TCmode.  */
+  if (ibitsize >= BITS_PER_WORD
+      /* For hard regs we have exact predicates.  Assume we can split
+	 the original object if it spans an even number of hard regs.
+	 This special case is important for SCmode on 64-bit platforms
+	 where the natural size of floating-point regs is 32-bit.  */
+      || (REG_P (cplx)
+	  && REGNO (cplx) < FIRST_PSEUDO_REGISTER
+	  && REG_NREGS (cplx) % 2 == 0))
+    {
+      rtx ret = simplify_gen_subreg (imode, cplx, cmode, (part == IMAG_P)
+				     ? GET_MODE_SIZE (imode) : 0);
+      if (ret)
+	return ret;
+      else
+	/* simplify_gen_subreg may fail for sub-word MEMs.  */
+	gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD);
+    }
+
+  return extract_bit_field (cplx, ibitsize, (part == IMAG_P) ? ibitsize : 0,
+			    true, NULL_RTX, imode, imode, false, NULL);
+}
+
+/* By default, write to one of the components of the complex value CPLX.  Write VAL to
+   the real part if part is REAL_P, and the imaginary part if it is IMAG_P. If part is
+   BOTH_P, call recursively with REAL_P and IMAG_P.  */
+
+void
+default_write_complex_part (rtx cplx, rtx val, complex_part_t part)
+{
+  machine_mode cmode;
+  scalar_mode imode;
+  unsigned ibitsize;
+
+  if (part == BOTH_P)
+    {
+      write_complex_part (cplx, read_complex_part (val, REAL_P), REAL_P);
+      write_complex_part (cplx, read_complex_part (val, IMAG_P), IMAG_P);
+      return;
+    }
+
+  if (GET_CODE (cplx) == CONCAT)
+    {
+      emit_move_insn (XEXP (cplx, part == IMAG_P), val);
+      return;
+    }
+
+  cmode = GET_MODE (cplx);
+  imode = GET_MODE_INNER (cmode);
+  ibitsize = GET_MODE_BITSIZE (imode);
+
+  /* For MEMs simplify_gen_subreg may generate an invalid new address
+     because, e.g., the original address is considered mode-dependent
+     by the target, which restricts simplify_subreg from invoking
+     adjust_address_nv.  Instead of preparing fallback support for an
+     invalid address, we call adjust_address_nv directly.  */
+  if (MEM_P (cplx))
+    {
+      emit_move_insn (adjust_address_nv (cplx, imode, (part == IMAG_P)
+					 ? GET_MODE_SIZE (imode) : 0), val);
+      return;
+    }
+
+  /* If the sub-object is at least word sized, then we know that subregging
+     will work.  This special case is important, since store_bit_field
+     wants to operate on integer modes, and there's rarely an OImode to
+     correspond to TCmode.  */
+  if (ibitsize >= BITS_PER_WORD
+      /* For hard regs we have exact predicates.  Assume we can split
+	 the original object if it spans an even number of hard regs.
+	 This special case is important for SCmode on 64-bit platforms
+	 where the natural size of floating-point regs is 32-bit.  */
+      || (REG_P (cplx)
+	  && REGNO (cplx) < FIRST_PSEUDO_REGISTER
+	  && REG_NREGS (cplx) % 2 == 0))
+    {
+      rtx cplx_part = simplify_gen_subreg (imode, cplx, cmode,
+					   (part == IMAG_P) ?
+					   GET_MODE_SIZE (imode) : 0);
+      if (cplx_part)
+	{
+	  emit_move_insn (cplx_part, val);
+	  return;
+	}
+      else
+	/* simplify_gen_subreg may fail for sub-word MEMs.  */
+	gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD);
+    }
+
+  store_bit_field (cplx, ibitsize, (part == IMAG_P) ? ibitsize : 0, 0, 0,
+		   imode, val, false);
+}
+
 /* By default do not split reductions further.  */
 
 machine_mode
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 189549cb1c7..dcacc725e27 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -124,6 +124,10 @@ extern opt_machine_mode default_get_mask_mode (machine_mode);
 extern bool default_empty_mask_is_expensive (unsigned);
 extern vector_costs *default_vectorize_create_costs (vec_info *, bool);
 
+extern rtx default_read_complex_part (rtx cplx, complex_part_t part);
+extern void default_write_complex_part (rtx cplx, rtx val,
+					complex_part_t part);
+
 /* OpenACC hooks.  */
 extern bool default_goacc_validate_dims (tree, int [], int, unsigned);
 extern int default_goacc_dim_limit (int);
-- 
2.17.1