[PATCH][ARM] Implementation of VFP hard float ABI

* [PATCH][ARM] Implementation of VFP hard float ABI
@ 2008-06-30  9:02 Chung-Lin Tang
  0 siblings, 0 replies; only message in thread
From: Chung-Lin Tang @ 2008-06-30  9:02 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 4052 bytes --]

Hi,

This patch implements VFP hard float, i.e. the -mfloat-abi=hard
-mfpu=vfp configuration, as specified in section 6 of the ARM AAPCS.
It passes floating point parameters and returns floating point values
in VFP registers on ARM, including vector type support for NEON.
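
To make the scope concrete, here is a minimal C sketch of the kinds of
arguments the patch routes to VFP registers, going by the rules in
arm_vfp_aapcs_handled_p: float/double scalars and homogeneous aggregates
of at most four elements. The struct and function names are hypothetical,
and the register assignments in the comments are what those rules should
produce, not verified compiler output.

```c
/* Hypothetical types illustrating the classification rules. */
struct vec3  { float x, y, z; };      /* 3 floats: homogeneous, <= 4 elements */
struct quad  { double d[4]; };        /* array inside a struct: 4 doubles     */
struct mixed { float f; double d; };  /* two different FP modes: not handled  */
struct five  { float v[5]; };         /* 5 elements: falls back to base AAPCS */

/* Expected under the hard ABI: v in s0-s2, scale in s3, result in s0. */
float vec3_scaled_sum (struct vec3 v, float scale)
{
  return (v.x + v.y + v.z) * scale;
}

/* Expected under the hard ABI: q in d0-d3, result in d0. */
double quad_sum (struct quad q)
{
  return q.d[0] + q.d[1] + q.d[2] + q.d[3];
}

/* Expected: m is passed by the base AAPCS rules (core registers or
   stack), because its two fields have different modes. */
double mixed_sum (struct mixed m)
{
  return m.f + m.d;
}
```

The source is plain C, so it behaves the same under any float ABI; only
the generated calling sequence should differ.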

Our main support here is for AAPCS.
The ATPCS conventions for VFP seem to be identical (AAPCS seems only to
expand on the specifications for vector types), so nothing in particular
was done for ATPCS. However, this VFP ATPCS compatibility was not
tested against any other compiler, so correct me if I'm wrong. I can't
find any standard for APCS + VFP, so, assuming there is none, APCS
simply gets whatever AAPCS conventions fall through.

We have built complete toolchains with this patch. It has been run
through the testsuite and checked for ABI conformance against ARM's
compiler tools (except for GCC language features that RVCT 3.1 does not
currently support) under ARM EABI Linux (using QEMU), so the core
argument passing/return code should be fairly complete and working. I'm
hoping others more familiar with their respective ARM environments can
also check it out (in case I overlooked some configuration assumption
or anything else).

(I believe I'll need to have the copyright assignment and employer
disclaimer completed before this patch can be accepted. Can anyone help
me with that? Thanks.)


Regards,
Chung-Lin Tang - Marvell Taiwan


--
Notes:
* As this configuration is not compatible with existing libraries,
it is advisable to simply rebuild the entire toolchain to try out this
patch. We configured the GCC build with --with-float=hard
--with-fpu=vfp. I have not yet worked on any multilib configuration
support in this patch; others will know better than I do how it should
be organized to be convenient to use.


* gcc.dg/builtin-apply2.c no longer passes with this patch applied
:P
The case seems similar to a recent AVR issue with this same testcase
that appeared on this list: calling conventions differ between
variadic and non-variadic functions.

The AAPCS specifies that the VFP hard ABI applies only to non-variadic
functions; variadic functions fall back to the base calling
convention standard. In effect, once any FP arguments are involved, the
ABIs for variadic and non-variadic functions diverge. So, as exercised
in the builtin-apply2.c testcase, using the __builtin_apply set of
features to marshal FP arguments between a caller and a callee, one
variadic and one not, simply does not work.
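
As a rough illustration of that split, the sketch below defines one
non-variadic and one variadic function that compute the same sum. The
function names are hypothetical, and the register placements in the
comments are what the patch's stdarg_p checks imply, not measured output.

```c
#include <stdarg.h>

/* Non-variadic: under the VFP hard ABI, a and b are expected in d0 and
   d1, and the result in d0. */
double add_fixed (double a, double b)
{
  return a + b;
}

/* Variadic: falls back to the base standard, so the doubles travel in
   core registers or on the stack, and the result comes back in r0/r1,
   even with -mfloat-abi=hard. */
double add_variadic (int count, ...)
{
  va_list ap;
  double sum = 0.0;
  int i;

  va_start (ap, count);
  for (i = 0; i < count; i++)
    sum += va_arg (ap, double);
  va_end (ap);
  return sum;
}
```

Both functions are correct when called normally. The trouble arises only
when an argument block captured for one kind of function is replayed
into the other, which is what builtin-apply2.c does.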


--

2008-06-30  Chung-Lin Tang  <ctang@marvell.com>

	* config/arm/arm-protos.h (arm_function_arg_advance): New
	function prototype.
	* config/arm/arm.c (arm_override_options): Remove VFP hard float
	blocker.
	(arm_vfp_aapcs_handled_p): New function.
	(arm_vfp_aapcs_handled_mode_p): New function.
	(arm_vfp_allocate_args): New function.
	(arm_vfp_gen_aggregate_parallel): New function.
	(arm_function_value): Remove ATTRIBUTE_UNUSED from func
	argument. VFP hard float support.
	(arm_apply_result_size): VFP hard float support.
	(arm_return_in_memory): Remove ATTRIBUTE_UNUSED from fntype
	argument. VFP hard float support.
	(arm_init_cumulative_args): VFP hard float support.
	(arm_function_arg): VFP hard float support.
	(arm_function_arg_advance): New function.
	(arm_arg_partial_bytes): VFP hard float support.
	* config/arm/arm.h (LIBCALL_VALUE): VFP hard float support.
	Modify to call ARM_MODE_RETURN_REG.
	(ARM_MODE_RETURN_REG): New macro, subsumes much of original
	LIBCALL_VALUE.
	(FUNCTION_VALUE_REGNO_P): VFP hard float support.
	(CUMULATIVE_ARGS): New fields vfp_aapcs_on and vfp_regset.
	(FUNCTION_ARG_ADVANCE): Modify to call
	arm_function_arg_advance().
	(FUNCTION_ARG_REGNO_P): VFP hard float support.
	* config/arm/ieee754-sf.S (aeabi_l2f, aeabi_ul2f): long long
	conversion support for VFP hard float.
	* config/arm/ieee754-df.S (aeabi_l2d, aeabi_ul2d): Likewise.

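
For readers who want to follow the register search in
arm_vfp_allocate_args without a GCC tree, here is a stand-alone model of
the same bitset logic. The name vfp_model_allocate and its plain-integer
interface are assumptions made for illustration; in the patch, the
register count per element comes from HARD_REGNO_NREGS and the busy set
from CUMULATIVE_ARGS.vfp_regset.

```c
/* Stand-alone model of the search in arm_vfp_allocate_args.
   USED is a 16-bit set of busy registers s0-s15, NREGS is the number of
   S registers one element needs (1 for float, 2 for double), and NUM is
   the element count.  Returns the bits to mark as used, or 0 when no
   run of free registers is found.  If FIRST is non-null, *FIRST
   receives the index of the first S register chosen.  */
unsigned vfp_model_allocate (unsigned used, int nregs, int num, int *first)
{
  unsigned regset = (1u << nregs) - 1;
  int regset_len = num * nregs;
  int i;

  /* Build a mask of regset_len consecutive registers. */
  for (i = 1; i < num; i++)
    regset |= regset << nregs;

  /* Slide the mask over s0-s15 in element-sized steps. */
  for (i = 0; i + regset_len <= 16; i += nregs)
    if ((used & (regset << i)) == 0)
      {
        if (first)
          *first = i;
        return regset << i;
      }

  return 0;
}
```

For a call such as f(float, double, float), the model picks s0, then
s2-s3 (that is, d1), then s1: the second float back-fills the gap left
by aligning the double.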

[-- Attachment #2: vfp-hard-float.patch --]
[-- Type: application/octet-stream, Size: 21230 bytes --]

diff -u arm.orig/arm-protos.h arm/arm-protos.h
--- arm.orig/arm-protos.h	2008-06-19 10:51:26.000000000 +0800
+++ arm/arm-protos.h	2008-06-30 12:59:30.000000000 +0800
@@ -152,6 +152,7 @@
 
 #if defined TREE_CODE
 extern rtx arm_function_arg (CUMULATIVE_ARGS *, enum machine_mode, tree, int);
+extern void arm_function_arg_advance (CUMULATIVE_ARGS *, enum machine_mode, tree, int);
 extern void arm_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree);
 extern bool arm_pad_arg_upward (enum machine_mode, const_tree);
 extern bool arm_pad_reg_upward (enum machine_mode, tree, int);
diff -u arm.orig/arm.c arm/arm.c
--- arm.orig/arm.c	2008-06-28 00:53:54.000000000 +0800
+++ arm/arm.c	2008-06-30 15:32:56.000000000 +0800
@@ -145,6 +145,11 @@
 static rtx emit_set_insn (rtx, rtx);
 static int arm_arg_partial_bytes (CUMULATIVE_ARGS *, enum machine_mode,
 				  tree, bool);
+static bool arm_vfp_aapcs_handled_p (enum machine_mode, const_tree, enum machine_mode*, int*);
+static bool arm_vfp_aapcs_handled_mode_p (enum machine_mode);
+static int arm_vfp_allocate_args (CUMULATIVE_ARGS *, enum machine_mode, int, int*);
+static rtx arm_vfp_gen_aggregate_parallel (enum machine_mode mode, int first_reg,
+					   enum machine_mode elmode, int elnum);
 
 #ifdef OBJECT_FORMAT_ELF
 static void arm_elf_asm_constructor (rtx, int) ATTRIBUTE_UNUSED;
@@ -1367,9 +1372,6 @@
   else
     arm_float_abi = TARGET_DEFAULT_FLOAT_ABI;
 
-  if (arm_float_abi == ARM_FLOAT_ABI_HARD && TARGET_VFP)
-    sorry ("-mfloat-abi=hard and VFP");
-
   /* FPA and iWMMXt are incompatible because the insn encodings overlap.
      VFP and iWMMXt can theoretically coexist, but it's unlikely such silicon
      will ever exist.  GCC makes no attempt to support this combination.  */
@@ -2697,11 +2699,224 @@
   return code;
 }
 
+/* Determines whether MODE is passed in VFP registers
+   when VFP hard float ABI (AAPCS) is in effect. */
+static bool
+arm_vfp_aapcs_handled_mode_p (enum machine_mode mode)
+{
+  return (mode == SFmode || mode == DFmode
+	  || (TARGET_NEON
+	      && arm_vector_mode_supported_p (mode)));
+}
+
+/* Return true if a value of TYPE/TYPE_MODE should be handled as specified
+   by the VFP hard ABI (AAPCS). This includes SF,DF mode values, vector
+   modes supported by VFP (NEON), and homogeneous aggregates of these modes.
+   The MODE and NUM pointers are out parameters if non-NULL.
+
+   *MODE returns the element mode of a homogeneous aggregate type (HAT).
+   HATs are structs/unions consisting entirely of either
+   (1) only one kind of fundamental type (here of course we only handle FP modes).
+   (2) vector types of same size, e.g. V4SF and V2DI (in this case we may
+       fill *MODE with any one of the appearing vector modes, size is what matters).
+   *NUM returns the number of *MODE mode HAT elements,
+   which is <= 4 according to VFP AAPCS. (larger aggregates fall back to base AAPCS)
+   For non-aggregate types,
+   *MODE will return TYPE_MODE, and *NUM will return 1.
+*/
+static bool
+arm_vfp_aapcs_handled_p (enum machine_mode type_mode, const_tree type,
+			 enum machine_mode *mode, int *num)
+{
+  /* If TYPE == NULL, handle according to TYPE_MODE. */
+  if (!type)
+    {
+      /* Will BLKmode or complex modes ever reach here? */
+      gcc_assert (type_mode != BLKmode
+		  && GET_MODE_CLASS (type_mode) != MODE_COMPLEX_FLOAT);
+
+      if (arm_vfp_aapcs_handled_mode_p (type_mode))
+	{
+	  if (mode) *mode = type_mode;
+	  if (num) *num = 1;
+	  return true;
+	}
+      else
+	return false;
+    }
+
+  /* Empty structs, etc. are not handled. */
+  if (!TYPE_SIZE (type)
+      || integer_zerop (TYPE_SIZE (type)))
+    return false;
+
+  if ((TREE_CODE (type) == REAL_TYPE
+       && (TYPE_MODE (type) == SFmode
+	   || TYPE_MODE (type) == DFmode))
+      || (TARGET_NEON
+	  && TREE_CODE (type) == VECTOR_TYPE
+	  && arm_vector_mode_supported_p (TYPE_MODE (type))))
+    {
+      if (mode) *mode = TYPE_MODE (type);
+      if (num) *num = 1; /* Non-aggregate. */
+      return true;
+    }
+
+  if (TREE_CODE (type) == COMPLEX_TYPE
+      && (TYPE_MODE (type) == SCmode
+	  || TYPE_MODE (type) == DCmode))
+    {
+      if (mode) *mode = GET_MODE_INNER (TYPE_MODE (type));
+      if (num) *num = 2;
+      return true;
+    }
+
+  if (TREE_CODE (type) == RECORD_TYPE
+      || TREE_CODE (type) == UNION_TYPE
+      || TREE_CODE (type) == QUAL_UNION_TYPE)
+    {
+      tree t;
+      int field_num;
+      int total_field_num = 0;
+      enum machine_mode field_mode;
+      enum machine_mode first_mode = VOIDmode;
+
+      for (t = TYPE_FIELDS (type); t; t = TREE_CHAIN (t))
+	{
+	  if (TREE_CODE (t) != FIELD_DECL)
+	    continue;
+
+	  /* We handle arrays only within structs/unions.*/
+	  if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
+	    {
+	      tree eltype = TREE_TYPE (TREE_TYPE (t));
+	      enum machine_mode elmode;
+	      int elnum, array_num;
+
+	      if (! arm_vfp_aapcs_handled_p (TYPE_MODE (eltype), eltype,
+					     &elmode, &elnum))
+		return false;
+
+	      gcc_assert (int_size_in_bytes (eltype));
+	      /* Number of elements of ELTYPE. */
+	      array_num = int_size_in_bytes (TREE_TYPE (t)) / int_size_in_bytes (eltype);
+
+	      /* The number of ELMODE fields must not exceed 4. */
+	      if (array_num * elnum > 4)
+		return false;
+
+	      field_mode = elmode;
+	      field_num = array_num * elnum;
+	    }
+	  else
+	    {
+	      /* For other field types, recursively test. */
+	      if (! arm_vfp_aapcs_handled_p (TYPE_MODE (TREE_TYPE (t)), TREE_TYPE (t),
+					    &field_mode, &field_num))
+		return false;
+	    }
+
+	  if (first_mode == VOIDmode)
+	    /* Record first field mode. */
+	    first_mode = field_mode;
+	  else
+	    {
+	      /* Compare each subsequent field mode with first field mode.
+		 If they are different, and are not vector modes of same size,
+		 return false. */
+	      if (first_mode != field_mode
+		  && ! (TARGET_NEON
+			&& arm_vector_mode_supported_p (first_mode)
+			&& arm_vector_mode_supported_p (field_mode)
+			&& GET_MODE_SIZE (first_mode) == GET_MODE_SIZE (field_mode)))
+		return false;
+	    }
+
+	  if (TREE_CODE (type) == RECORD_TYPE)
+	    /* For structs, add field sizes together. */
+	    total_field_num += field_num;
+	  else
+	    {
+	      /* For unions, overlay all fields. */
+	      if (total_field_num < field_num)
+		total_field_num = field_num;
+	    }
+
+	  /*The AAPCS limits to 4 element aggregates. */
+	  if (total_field_num > 4)
+	    return false;
+	}
+
+      /* Found no FIELD_DECLs at all. */
+      if (first_mode == VOIDmode)
+	return false;
+
+      if (mode) *mode = first_mode;
+      if (num) *num = total_field_num;
+      return true;
+    }
+
+  return false;
+}
+
+/* Looks for NUM consecutive MODE mode VFP registers in the
+   CUMULATIVE_ARGS state. The return value is an int bitset
+   that can be ORed later with CUMULATIVE_ARGS.vfp_regset
+   to mark VFP arg regs as used; the return value is zero when
+   allocation fails. If FIRST_REG is non-NULL, *FIRST_REG returns
+   the regno of the first VFP reg found available.
+*/
+static int
+arm_vfp_allocate_args (CUMULATIVE_ARGS *pcum,
+		       enum machine_mode mode, int num, int *first_reg)
+{
+  int i;
+  int nregs = HARD_REGNO_NREGS (FIRST_VFP_REGNUM, mode);
+  int regset = (1 << nregs) - 1;
+  int regset_len = num * nregs;
+
+  /* Build a regset of regset_len consecutive registers*/
+  for (i = 1; i < num; i++)
+    regset |= (regset << nregs);
+
+  /* Search pcum->vfp_regset for regset_len available
+     consecutive regs. */
+  for (i = 0; i + regset_len <= 16; i += nregs)
+    if ((pcum->vfp_regset & (regset << i)) == 0)
+      {
+	/* Found space to place VFP args. */
+	if (first_reg)
+	  *first_reg = FIRST_VFP_REGNUM + i; /* Rebase to first VFP reg. */
+	return (regset << i);
+      }
+
+  return 0;
+}
+
+/* Generates the PARALLEL rtx returned by function_arg and function_value
+   when passing/returning VFP homogeneous aggregates. */
+static rtx
+arm_vfp_gen_aggregate_parallel (enum machine_mode mode, int first_reg,
+				enum machine_mode elmode, int elnum)
+{
+  int i, elmode_nregs = HARD_REGNO_NREGS (first_reg, elmode);
+  rtx loc[4];
+
+  for (i = 0; i < elnum; i++)
+    loc [i] =
+      gen_rtx_EXPR_LIST (VOIDmode,
+			 gen_rtx_REG (elmode,
+				      first_reg + elmode_nregs * i),
+			 GEN_INT (GET_MODE_SIZE (elmode) * i));
+
+  return gen_rtx_PARALLEL (mode, gen_rtvec_v (elnum, loc));
+}
+
 
 /* Define how to find the value returned by a function.  */
 
 rtx
-arm_function_value(const_tree type, const_tree func ATTRIBUTE_UNUSED)
+arm_function_value(const_tree type, const_tree func)
 {
   enum machine_mode mode;
   int unsignedp ATTRIBUTE_UNUSED;
@@ -2712,6 +2927,31 @@
   if (INTEGRAL_TYPE_P (type))
     PROMOTE_FUNCTION_MODE (mode, unsignedp, type);
 
+  /* VFP homogeneous aggregates. */
+  if (TARGET_32BIT && TARGET_HARD_FLOAT_ABI && TARGET_VFP)
+    {
+      int elnum;
+      enum machine_mode elmode;
+
+      /* If FUNC is not NULL, make sure it is the function type,
+	 not the DECL. */
+      if (func && TREE_CODE (func) == FUNCTION_DECL)
+	func = TREE_TYPE (func);
+
+      if (!stdarg_p ((tree) func)
+	  && arm_vfp_aapcs_handled_p (mode, type, &elmode, &elnum))
+	{
+	  if (mode != elmode)
+	    /* Differing MODE and ELMODE indicates an aggregate,
+	       return PARALLEL rtx to represent aggregate function value. */
+	    return arm_vfp_gen_aggregate_parallel (mode, FIRST_VFP_REGNUM, elmode, elnum);
+	  else
+	    /* Return single REG. */
+	    return
+	      gen_rtx_REG (elmode, FIRST_VFP_REGNUM);
+	}
+    }
+
   /* Promotes small structs returned in a register to full-word size
      for big-endian AAPCS.  */
   if (arm_return_in_msb (type))
@@ -2724,7 +2964,7 @@
 	}
     }
 
-  return LIBCALL_VALUE(mode);
+  return ARM_MODE_RETURN_REG(mode);
 }
 
 /* Determine the amount of memory needed to store the possible return
@@ -2742,6 +2982,8 @@
 	    size += 12;
 	  if (TARGET_MAVERICK)
 	    size += 8;
+	  if (TARGET_VFP)
+	    size += 64;
 	}
       if (TARGET_IWMMXT_ABI)
 	size += 8;
@@ -2754,7 +2996,7 @@
    or in a register (false).  This is called as the target hook
    TARGET_RETURN_IN_MEMORY.  */
 static bool
-arm_return_in_memory (const_tree type, const_tree fntype ATTRIBUTE_UNUSED)
+arm_return_in_memory (const_tree type, const_tree fntype)
 {
   HOST_WIDE_INT size;
 
@@ -2772,6 +3014,18 @@
        For AAPCS, complex types are treated the same as aggregates.  */
     return 0;
 
+#ifndef ARM_WINCE
+  /* When the VFP hard ABI is in effect, FP scalars and homogeneous
+     aggregates with <= 4 elements are returned in registers. */
+  if (TARGET_32BIT && TARGET_HARD_FLOAT_ABI && TARGET_VFP)
+    {
+      /* Only for non-variadic functions. */
+      if (!stdarg_p ((tree) fntype)
+	  && arm_vfp_aapcs_handled_p (TYPE_MODE (type), type, NULL, NULL))
+	return 0;
+    }
+#endif /* not ARM_WINCE */
+
   if (arm_abi != ARM_ABI_APCS)
     {
       /* ATPCS and later return aggregate types in memory only if they are
@@ -2904,6 +3158,20 @@
   pcum->named_count = 0;
   pcum->nargs = 0;
 
+  /* VFP hard float fields. */
+  pcum->vfp_aapcs_on = 0;
+  pcum->vfp_regset = 0;
+
+  if (TARGET_32BIT && TARGET_HARD_FLOAT_ABI && TARGET_VFP)
+    {
+      /* The AAPCS standard says that the VFP hard float ABI only applies
+	 to non-variadic functions (i.e. no stdarg/vararg "..." params).
+	 For variadic functions, still fall back to base standard
+	 (i.e. normal 'softfp') calling conventions; so we further qualify
+	 the 'vfp_aapcs_on' condition by testing if FNTYPE is non-variadic. */
+      pcum->vfp_aapcs_on = !stdarg_p (fntype);
+    }
+
   if (TARGET_REALLY_IWMMXT && fntype)
     {
       tree fn_arg;
@@ -2947,6 +3215,33 @@
 {
   int nregs;
 
+  if (pcum->vfp_aapcs_on)
+    {
+      int elnum;
+      enum machine_mode elmode;
+      int regset, first_reg;
+
+      if (arm_vfp_aapcs_handled_p (mode, type, &elmode, &elnum))
+	{
+	  /* Try to find ELNUM consecutive VFP arg regs. */
+	  regset = arm_vfp_allocate_args (pcum, elmode, elnum, &first_reg);
+
+	  /* Not enough VFP regs available. */
+	  if (!regset)
+	    return NULL_RTX;
+
+	  if (mode != elmode)
+	    /* Differing MODE and ELMODE indicates an aggregate.
+	       Return a PARALLEL rtx. */
+	    return arm_vfp_gen_aggregate_parallel (mode, first_reg, elmode, elnum);
+	  else
+	    {
+	      /* Return single REG. */
+	      return gen_rtx_REG (elmode, first_reg);
+	    }
+	}
+    }
+
   /* Varargs vectors are treated the same as long long.
      named_count avoids having to change the way arm handles 'named' */
   if (TARGET_IWMMXT_ABI
@@ -2986,12 +3281,54 @@
   return gen_rtx_REG (mode, pcum->nregs);
 }
 
+void
+arm_function_arg_advance (CUMULATIVE_ARGS *pcum,
+			  enum machine_mode mode, tree type, int named ATTRIBUTE_UNUSED)
+{
+  pcum->nargs += 1;
+
+  if (TARGET_IWMMXT_ABI
+      && arm_vector_mode_supported_p (mode)
+      && pcum->named_count > pcum->nargs)
+    {
+      pcum->iwmmxt_nregs += 1;
+      return;
+    }
+
+  if (pcum->vfp_aapcs_on)
+    {
+      int elnum, regset;
+      enum machine_mode elmode;
+
+      if (arm_vfp_aapcs_handled_p (mode, type, &elmode, &elnum))
+	{
+	  regset = arm_vfp_allocate_args (pcum, elmode, elnum, NULL);
+
+	  if (regset == 0)
+	    /* If VFP register arg allocation fails,
+	       mark any remaining regs as unavailable. */
+	    pcum->vfp_regset = (int) 0xFFFF;
+	  else
+	    pcum->vfp_regset |= regset;
+	  return;
+	}
+    }
+
+  pcum->nregs += ARM_NUM_REGS2 (mode, type);
+}
+
 static int
 arm_arg_partial_bytes (CUMULATIVE_ARGS *pcum, enum machine_mode mode,
 		       tree type, bool named ATTRIBUTE_UNUSED)
 {
   int nregs = pcum->nregs;
 
+  /* VFP AAPCS handled types are always either entirely in regs,
+     or entirely in memory. */
+  if (pcum->vfp_aapcs_on
+      && arm_vfp_aapcs_handled_p (mode, type, NULL, NULL))
+    return 0;
+
   if (TARGET_IWMMXT_ABI && arm_vector_mode_supported_p (mode))
     return 0;
 
diff -u arm.orig/arm.h arm/arm.h
--- arm.orig/arm.h	2008-06-19 10:51:26.000000000 +0800
+++ arm/arm.h	2008-06-30 12:59:30.000000000 +0800
@@ -1446,6 +1446,17 @@
 /* Define how to find the value returned by a library function
    assuming the value has mode MODE.  */
 #define LIBCALL_VALUE(MODE)  \
+  (TARGET_32BIT && TARGET_HARD_FLOAT_ABI && TARGET_VFP			\
+   && (GET_MODE_CLASS (MODE) == MODE_FLOAT				\
+       || (TARGET_NEON && arm_vector_mode_supported_p (MODE)))		\
+   ? gen_rtx_REG (MODE, FIRST_VFP_REGNUM)				\
+   /* The rest are non-VFP return reg cases, see macro below. */	\
+   : ARM_MODE_RETURN_REG (MODE))
+
+/* Define return reg RTX for each machine mode. These do NOT include
+   the VFP registers, for the VFP hard FP ABI depends on other conditions
+   in determining value placement. */
+#define ARM_MODE_RETURN_REG(MODE) \
   (TARGET_32BIT && TARGET_HARD_FLOAT_ABI && TARGET_FPA			\
    && GET_MODE_CLASS (MODE) == MODE_FLOAT				\
    ? gen_rtx_REG (MODE, FIRST_FPA_REGNUM)				\
@@ -1466,11 +1477,14 @@
 /* 1 if N is a possible register number for a function value.
    On the ARM, only r0 and f0 can return results.  */
 /* On a Cirrus chip, mvf0 can return results.  */
+/* On VFP hard ABI, s0/d0/q0 can return results.  */
 #define FUNCTION_VALUE_REGNO_P(REGNO)  \
   ((REGNO) == ARG_REGISTER (1) \
    || (TARGET_32BIT && ((REGNO) == FIRST_CIRRUS_FP_REGNUM)		\
        && TARGET_HARD_FLOAT_ABI && TARGET_MAVERICK)			\
    || ((REGNO) == FIRST_IWMMXT_REGNUM && TARGET_IWMMXT_ABI) \
+   || (TARGET_32BIT && ((REGNO) == FIRST_VFP_REGNUM)			\
+       && TARGET_HARD_FLOAT_ABI && TARGET_VFP)				\
    || (TARGET_32BIT && ((REGNO) == FIRST_FPA_REGNUM)			\
        && TARGET_HARD_FLOAT_ABI && TARGET_FPA))
 
@@ -1583,6 +1597,10 @@
   int named_count;
   int nargs;
   int can_split;
+  /* Set when VFP hard ABI is in effect for this function. */
+  int vfp_aapcs_on;
+  /* Bitset to record allocated VFP regs (s0-s15) when passing parameters. */
+  unsigned vfp_regset : 16;
 } CUMULATIVE_ARGS;
 
 /* Define where to put the arguments to a function.
@@ -1628,13 +1646,7 @@
    of mode MODE and data type TYPE.
    (TYPE is null for libcalls where that information may not be available.)  */
 #define FUNCTION_ARG_ADVANCE(CUM, MODE, TYPE, NAMED)	\
-  (CUM).nargs += 1;					\
-  if (arm_vector_mode_supported_p (MODE)		\
-      && (CUM).named_count > (CUM).nargs		\
-      && TARGET_IWMMXT_ABI)				\
-    (CUM).iwmmxt_nregs += 1;				\
-  else							\
-    (CUM).nregs += ARM_NUM_REGS2 (MODE, TYPE)
+  arm_function_arg_advance (&(CUM), (MODE), (TYPE), (NAMED))
 
 /* If defined, a C expression that gives the alignment boundary, in bits, of an
    argument with the specified mode and type.  If it is not defined,
@@ -1648,6 +1660,10 @@
    On the ARM, r0-r3 are used to pass args.  */
 #define FUNCTION_ARG_REGNO_P(REGNO)	\
    (IN_RANGE ((REGNO), 0, 3)		\
+    || (TARGET_32BIT			\
+	&& TARGET_HARD_FLOAT_ABI	\
+	&& TARGET_VFP			\
+	&& IN_RANGE ((REGNO), FIRST_VFP_REGNUM, D7_VFP_REGNUM))	\
     || (TARGET_IWMMXT_ABI		\
 	&& IN_RANGE ((REGNO), FIRST_IWMMXT_REGNUM, FIRST_IWMMXT_REGNUM + 9)))
 
diff -u arm.orig/ieee754-df.S arm/ieee754-df.S
--- arm.orig/ieee754-df.S	2008-03-03 22:30:48.000000000 +0800
+++ arm/ieee754-df.S	2008-06-30 12:59:30.000000000 +0800
@@ -499,19 +499,27 @@
 ARM_FUNC_ALIAS aeabi_ul2d floatundidf
 
 	orrs	r2, r0, r1
-#if !defined (__VFP_FP__) && !defined(__SOFTFP__)
+#if !defined(__SOFTFP__)
 	do_it	eq, t
+# if defined(__VFP_FP__)
+	fmdrreq d0, r0, r1
+# else
 	mvfeqd	f0, #0.0
+# endif
 #else
 	do_it	eq
 #endif
 	RETc(eq)
 
-#if !defined (__VFP_FP__) && !defined(__SOFTFP__)
-	@ For hard FPA code we want to return via the tail below so that
-	@ we can return the result in f0 as well as in r0/r1 for backwards
+#if !defined(__SOFTFP__)
+	@ For hard VFP/FPA code we want to return via the tail below so that
+	@ we can return the result in d0/f0 as well as in r0/r1 for backwards
 	@ compatibility.
+# if defined(__VFP_FP__)
+	adr	ip, LSYM(d0_ret)
+# else
 	adr	ip, LSYM(f0_ret)
+# endif
 	@ Push pc as well so that RETLDM works correctly.
 	do_push	{r4, r5, ip, lr, pc}
 #else
@@ -525,19 +533,27 @@
 ARM_FUNC_ALIAS aeabi_l2d floatdidf
 
 	orrs	r2, r0, r1
-#if !defined (__VFP_FP__) && !defined(__SOFTFP__)
+#if !defined(__SOFTFP__)
 	do_it	eq, t
+# if defined(__VFP_FP__)
+	fmdrreq d0, r0, r1
+# else
 	mvfeqd	f0, #0.0
+# endif
 #else
 	do_it	eq
 #endif
 	RETc(eq)
 
-#if !defined (__VFP_FP__) && !defined(__SOFTFP__)
-	@ For hard FPA code we want to return via the tail below so that
-	@ we can return the result in f0 as well as in r0/r1 for backwards
+#if !defined(__SOFTFP__)
+	@ For hard VFP/FPA code we want to return via the tail below so that
+	@ we can return the result in d0/f0 as well as in r0/r1 for backwards
 	@ compatibility.
+# if defined(__VFP_FP__)
+	adr	ip, LSYM(d0_ret)
+# else
 	adr	ip, LSYM(f0_ret)
+# endif
 	@ Push pc as well so that RETLDM works correctly.
 	do_push	{r4, r5, ip, lr, pc}
 #else
@@ -585,8 +601,16 @@
 	add	r4, r4, r2
 	b	LSYM(Lad_p)
 
-#if !defined (__VFP_FP__) && !defined(__SOFTFP__)
-
+#if !defined(__SOFTFP__)
+# if defined(__VFP_FP__)
+	@ We place a copy of the result in d0, this provides compatibility
+	@ for both VFP softfp and hard ABI variants.
+LSYM(d0_ret):
+	@ Return (xl,xh), which expands to (r0,r1) or (r1,r0)
+	@ depending on endian mode.
+	fmdrr	d0, xl, xh
+	RETLDM
+# else
 	@ Legacy code expects the result to be returned in f0.  Copy it
 	@ there as well.
 LSYM(f0_ret):
@@ -594,6 +618,7 @@
 	ldfd	f0, [sp], #8
 	RETLDM
 
+# endif
 #endif
 
 	FUNC_END floatdidf
diff -u arm.orig/ieee754-sf.S arm/ieee754-sf.S
--- arm.orig/ieee754-sf.S	2008-03-03 22:30:48.000000000 +0800
+++ arm/ieee754-sf.S	2008-06-30 12:59:30.000000000 +0800
@@ -330,9 +330,13 @@
 ARM_FUNC_ALIAS aeabi_ul2f floatundisf
 
 	orrs	r2, r0, r1
-#if !defined (__VFP_FP__) && !defined(__SOFTFP__)
+#if !defined(__SOFTFP__)
 	do_it	eq, t
+# if defined(__VFP_FP__)
+	fmsreq	s0, r0
+# else
 	mvfeqs	f0, #0.0
+# endif
 #else
 	do_it	eq
 #endif
@@ -345,9 +349,13 @@
 ARM_FUNC_ALIAS aeabi_l2f floatdisf
 
 	orrs	r2, r0, r1
-#if !defined (__VFP_FP__) && !defined(__SOFTFP__)
+#if !defined(__SOFTFP__)
 	do_it	eq, t
+# if defined(__VFP_FP__)
+	fmsreq	s0, r0
+# else
 	mvfeqs	f0, #0.0
+# endif
 #else
 	do_it	eq
 #endif
@@ -363,12 +371,16 @@
 	rsc	ah, ah, #0
 #endif
 1:
-#if !defined (__VFP_FP__) && !defined(__SOFTFP__)
-	@ For hard FPA code we want to return via the tail below so that
-	@ we can return the result in f0 as well as in r0 for backwards
+#if !defined(__SOFTFP__)
+	@ For hard VFP/FPA code we want to return via the tail below so that
+	@ we can return the result in s0/f0 as well as in r0 for backwards
 	@ compatibility.
 	str	lr, [sp, #-8]!
+# if defined(__VFP_FP__)
+	adr	lr, LSYM(s0_ret)
+# else
 	adr	lr, LSYM(f0_ret)
+# endif
 #endif
 
 	movs	ip, ah
@@ -432,13 +444,20 @@
 	biceq	r0, r0, ip, lsr #31
 	RET
 
-#if !defined (__VFP_FP__) && !defined(__SOFTFP__)
-
+#if !defined(__SOFTFP__)
+# if defined(__VFP_FP__)
+	@ We place a copy of the result in s0, this provides compatibility
+	@ for both VFP softfp and hard ABI variants.
+LSYM(s0_ret):
+	fmsr	s0, r0
+	RETLDM
+# else
 LSYM(f0_ret):
 	str	r0, [sp, #-4]!
 	ldfs	f0, [sp], #4
 	RETLDM
 
+# endif
 #endif
 
 	FUNC_END floatdisf
