[PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ

public inbox for gcc-patches@gcc.gnu.org

* [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsics.
@ 2016-10-19  9:36 Tamar Christina
  2016-10-19 10:24 ` Christophe Lyon
  0 siblings, 1 reply; 8+ messages in thread
From: Tamar Christina @ 2016-10-19  9:36 UTC (permalink / raw)
  To: GCC Patches, Kyrylo Tkachov, Christophe Lyon; +Cc: nd

[-- Attachment #1: Type: text/plain, Size: 1505 bytes --]

Hi All,

This patch implements the vmaxnmQ_ST and vminnmQ_ST intrinsics. The
current builtin registration code cannot access standard pattern
names, to which vmaxnmQ_ST and vminnmQ_ST map directly. Thus, so that
the vectoriser can keep using those standard patterns, we implement
the intrinsics as builtin functions, which we expand to the proper
standard pattern using a define_expand.

This patch also implements the __ARM_FEATURE_NUMERIC_MAXMIN macro,
which is defined when __ARM_ARCH >= 8 and an ARMv8 NEON FPU is
available (TARGET_NEON && TARGET_FPU_ARMV8), and which enables the
intrinsics.
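
As a rough illustration of how user code might key off the new macro,
here is a minimal sketch. The wrapper name max2 and the scalar fallback
are hypothetical and not part of the patch; the fallback is only an
assumption about how one could mirror the vmaxnm behaviour (a quiet NaN
yields the other operand, and +0 is preferred over -0) when the macro
is not defined.

```c
#include <assert.h>
#include <math.h>

#if defined (__ARM_FEATURE_NUMERIC_MAXMIN)
#include <arm_neon.h>
#endif

/* Hypothetical wrapper, not part of the patch: lane-wise maximum of two
   2-element float vectors, treating a quiet NaN as missing data.  */
static void
max2 (const float a[2], const float b[2], float out[2])
{
  assert (a && b && out);
#if defined (__ARM_FEATURE_NUMERIC_MAXMIN)
  /* The feature macro is defined, so use the vmaxnm_f32 intrinsic.  */
  vst1_f32 (out, vmaxnm_f32 (vld1_f32 (a), vld1_f32 (b)));
#else
  /* Scalar fallback (an assumption, for targets without the macro).  */
  for (int i = 0; i < 2; ++i)
    {
      if (isnan (a[i]))
        out[i] = b[i];          /* NaN in a: take b, even if b is NaN.  */
      else if (isnan (b[i]))
        out[i] = a[i];          /* NaN in b only: take a.  */
      else if (a[i] == b[i])
        /* Covers +0 versus -0: prefer the operand without a sign bit.  */
        out[i] = signbit (a[i]) ? b[i] : a[i];
      else
        out[i] = a[i] > b[i] ? a[i] : b[i];
    }
#endif
}
```

Calling max2 with {NaN, 2} and {1, NaN} should store {1, 2}, which is
the same case the quiet_NaN_both_args tests in this series exercise.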

Regression tested on arm-none-eabi with no regressions.

This patch is a rework of a previous patch:
https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01971.html

OK for trunk?

Thanks,
Tamar

---

gcc/

2016-10-19  Bilyan Borisov  <bilyan.borisov@arm.com>
	    Tamar Christina <tamar.christina@arm.com>

	* config/arm/arm-c.c (arm_cpu_builtins): New macro definition.
	* config/arm/arm_neon.h (vmaxnm_f32): New intrinsic.
	(vmaxnmq_f32): Likewise.
	(vminnm_f32): Likewise.
	(vminnmq_f32): Likewise.
	* config/arm/arm_neon_builtins.def (vmaxnm): New builtin.
	(vminnm): Likewise.
	* config/arm/neon.md (neon_<fmaxmin_op><mode>, VCVTF): New
	expander.

gcc/testsuite/

2016-10-19  Bilyan Borisov  <bilyan.borisov@arm.com>

	* gcc.target/arm/simd/vmaxnm_f32_1.c: New.
	* gcc.target/arm/simd/vmaxnmq_f32_1.c: Likewise.
	* gcc.target/arm/simd/vminnm_f32_1.c: Likewise.
	* gcc.target/arm/simd/vminnmq_f32_1.c: Likewise.


[-- Attachment #2: gcc-arm-patch-intrinsics.patch --]
[-- Type: text/x-patch; name=gcc-arm-patch-intrinsics.patch, Size: 3712 bytes --]

diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index 72837001d1011e366233236a6ba3d1e5775583b1..dcb883d750506a02257e6e2e49880f2d1b9888fa 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -86,6 +86,9 @@ arm_cpu_builtins (struct cpp_reader* pfile)
 		      ((TARGET_ARM_ARCH >= 5 && !TARGET_THUMB)
 		       || TARGET_ARM_ARCH_ISA_THUMB >=2));
 
+  def_or_undef_macro (pfile, "__ARM_FEATURE_NUMERIC_MAXMIN",
+		      TARGET_ARM_ARCH >= 8 && TARGET_NEON && TARGET_FPU_ARMV8);
+
   def_or_undef_macro (pfile, "__ARM_FEATURE_SIMD32", TARGET_INT_SIMD);
 
   builtin_define_with_int_value ("__ARM_SIZEOF_MINIMAL_ENUM",
diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 54bbc7dd83cf979b6fad7724ba1d4b327b311f5c..3898ff7302dc3f21e6b50a8a7b835033c1ae2021 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -2956,6 +2956,34 @@ vmaxq_f32 (float32x4_t __a, float32x4_t __b)
   return (float32x4_t)__builtin_neon_vmaxfv4sf (__a, __b);
 }
 
+#pragma GCC push_options
+#pragma GCC target ("fpu=neon-fp-armv8")
+__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
+vmaxnm_f32 (float32x2_t a, float32x2_t b)
+{
+  return (float32x2_t)__builtin_neon_vmaxnmv2sf (a, b);
+}
+
+__extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
+vmaxnmq_f32 (float32x4_t a, float32x4_t b)
+{
+  return (float32x4_t)__builtin_neon_vmaxnmv4sf (a, b);
+}
+
+__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
+vminnm_f32 (float32x2_t a, float32x2_t b)
+{
+  return (float32x2_t)__builtin_neon_vminnmv2sf (a, b);
+}
+
+__extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
+vminnmq_f32 (float32x4_t a, float32x4_t b)
+{
+  return (float32x4_t)__builtin_neon_vminnmv4sf (a, b);
+}
+#pragma GCC pop_options
+
+
 __extension__ static __inline uint8x16_t __attribute__ ((__always_inline__))
 vmaxq_u8 (uint8x16_t __a, uint8x16_t __b)
 {
diff --git a/gcc/config/arm/arm_neon_builtins.def b/gcc/config/arm/arm_neon_builtins.def
index b29aa91a64ecb85dfb5eb9661ed67d4fa326062f..58b10207c1f5c0380cb01fdb4a92a3f0b4dec591 100644
--- a/gcc/config/arm/arm_neon_builtins.def
+++ b/gcc/config/arm/arm_neon_builtins.def
@@ -147,12 +147,12 @@ VAR6 (BINOP, vmaxs, v8qi, v4hi, v2si, v16qi, v8hi, v4si)
 VAR6 (BINOP, vmaxu, v8qi, v4hi, v2si, v16qi, v8hi, v4si)
 VAR2 (BINOP, vmaxf, v2sf, v4sf)
 VAR2 (BINOP, vmaxf, v8hf, v4hf)
-VAR2 (BINOP, vmaxnm, v4hf, v8hf)
+VAR4 (BINOP, vmaxnm, v2sf, v4sf, v4hf, v8hf)
 VAR6 (BINOP, vmins, v8qi, v4hi, v2si, v16qi, v8hi, v4si)
 VAR6 (BINOP, vminu, v8qi, v4hi, v2si, v16qi, v8hi, v4si)
 VAR2 (BINOP, vminf, v2sf, v4sf)
 VAR2 (BINOP, vminf, v4hf, v8hf)
-VAR2 (BINOP, vminnm, v8hf, v4hf)
+VAR4 (BINOP, vminnm, v2sf, v4sf, v8hf, v4hf)
 
 VAR3 (BINOP, vpmaxs, v8qi, v4hi, v2si)
 VAR3 (BINOP, vpmaxu, v8qi, v4hi, v2si)
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 05323334ffd81aeff33ee407b96c788d123b3fe3..3ae4f6a3bf26032f4c34d83ff79e27b30d4000de 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -2841,6 +2841,18 @@
  [(set_attr "type" "neon_fp_minmax_s<q>")]
 )
 
+;; Expander for v<maxmin>nm intrinsics.
+(define_expand "neon_<fmaxmin_op><mode>"
+  [(unspec:VCVTF [(match_operand:VCVTF 0 "s_register_operand" "")
+   (match_operand:VCVTF 1 "s_register_operand" "")
+   (match_operand:VCVTF 2 "s_register_operand" "")]
+		  VMAXMINFNM)]
+  "TARGET_NEON && TARGET_FPU_ARMV8"
+{
+  emit_insn (gen_<fmaxmin><mode>3 (operands[0], operands[1], operands[2]));
+  DONE;
+})
+
 ;; Vector forms for the IEEE-754 fmax()/fmin() functions
 (define_insn "<fmaxmin><mode>3"
   [(set (match_operand:VCVTF 0 "s_register_operand" "=w")


* Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsics.
  2016-10-19  9:36 [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsics Tamar Christina
@ 2016-10-19 10:24 ` Christophe Lyon
  2016-10-19 10:39   ` Tamar Christina
  2016-10-26 15:02   ` Tamar Christina
  0 siblings, 2 replies; 8+ messages in thread
From: Christophe Lyon @ 2016-10-19 10:24 UTC (permalink / raw)
  To: Tamar Christina; +Cc: GCC Patches, Kyrylo Tkachov, nd

On 19 October 2016 at 11:36, Tamar Christina <Tamar.Christina@arm.com> wrote:
> Hi All,
>
> This patch implements the vmaxnmQ_ST and vminnmQ_ST intrinsics. The
> current builtin registration code is deficient since it can't access
> standard pattern names, to which vmaxnmQ_ST and vminnmQ_ST map
> directly. Thus, to enable the vectoriser to have access to these
> intrinsics, we implement them using builtin functions, which we
> expand to the proper standard pattern using a define_expand.
>
> This patch also implements the __ARM_FEATURE_NUMERIC_MAXMIN macro,
> which is defined when __ARM_ARCH >= 8, and which enables the
> intrinsics.
>
> Regression tested on arm-none-eabi and no regressions.
>
> This patch is a rework of a previous patch:
> https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01971.html
>
> OK for trunk?
>
> Thanks,
> Tamar
>
> ---
>
> gcc/
>
> 2016-10-19  Bilyan Borisov  <bilyan.borisov@arm.com>
>             Tamar Christina <tamar.christina@arm.com>
>
>         * config/arm/arm-c.c (arm_cpu_builtins): New macro definition.
>         * config/arm/arm_neon.h (vmaxnm_f32): New intrinsic.
>         (vmaxnmq_f32): Likewise.
>         (vminnm_f32): Likewise.
>         (vminnmq_f32): Likewise.
>         * config/arm/arm_neon_builtins.def (vmaxnm): New builtin.
>         (vminnm): Likewise.
>         * config/arm/neon.md (neon_<fmaxmin_op><mode>, VCVTF): New
>         expander.
>
> gcc/testsuite/
>
> 2016-10-19  Bilyan Borisov  <bilyan.borisov@arm.com>
>
>         * gcc.target/arm/simd/vmaxnm_f32_1.c: New.
>         * gcc.target/arm/simd/vmaxnmq_f32_1.c: Likewise.
>         * gcc.target/arm/simd/vminnm_f32_1.c: Likewise.
>         * gcc.target/arm/simd/vminnmq_f32_1.c: Likewise.
>

I think you forgot to attach the new tests.

Christophe


* Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsics.
  2016-10-19 10:24 ` Christophe Lyon
@ 2016-10-19 10:39   ` Tamar Christina
  2016-10-26 15:02   ` Tamar Christina
  1 sibling, 0 replies; 8+ messages in thread
From: Tamar Christina @ 2016-10-19 10:39 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: GCC Patches, Kyrylo Tkachov, nd

________________________________________
> I think you forgot to attach the new tests.

Ah, you're right! I forgot an add.

New patch coming soon.

Thanks,
Tamar


* Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsics.
  2016-10-19 10:24 ` Christophe Lyon
  2016-10-19 10:39   ` Tamar Christina
@ 2016-10-26 15:02   ` Tamar Christina
  2016-11-01  9:24     ` Tamar Christina
  2016-11-01 12:22     ` Kyrill Tkachov
  1 sibling, 2 replies; 8+ messages in thread
From: Tamar Christina @ 2016-10-26 15:02 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: GCC Patches, Kyrylo Tkachov, nd

[-- Attachment #1: Type: text/plain, Size: 2182 bytes --]

Hi Christophe,

Here's the updated patch.

Cheers,
Tamar


[-- Attachment #2: gcc-arm-patch-intrinsics-v2.patch --]
[-- Type: text/x-patch; name=gcc-arm-patch-intrinsics-v2.patch, Size: 22230 bytes --]

diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index 72837001d1011e366233236a6ba3d1e5775583b1..dcb883d750506a02257e6e2e49880f2d1b9888fa 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -86,6 +86,9 @@ arm_cpu_builtins (struct cpp_reader* pfile)
 		      ((TARGET_ARM_ARCH >= 5 && !TARGET_THUMB)
 		       || TARGET_ARM_ARCH_ISA_THUMB >=2));
 
+  def_or_undef_macro (pfile, "__ARM_FEATURE_NUMERIC_MAXMIN",
+		      TARGET_ARM_ARCH >= 8 && TARGET_NEON && TARGET_FPU_ARMV8);
+
   def_or_undef_macro (pfile, "__ARM_FEATURE_SIMD32", TARGET_INT_SIMD);
 
   builtin_define_with_int_value ("__ARM_SIZEOF_MINIMAL_ENUM",
diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 54bbc7dd83cf979b6fad7724ba1d4b327b311f5c..3898ff7302dc3f21e6b50a8a7b835033c1ae2021 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -2956,6 +2956,34 @@ vmaxq_f32 (float32x4_t __a, float32x4_t __b)
   return (float32x4_t)__builtin_neon_vmaxfv4sf (__a, __b);
 }
 
+#pragma GCC push_options
+#pragma GCC target ("fpu=neon-fp-armv8")
+__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
+vmaxnm_f32 (float32x2_t a, float32x2_t b)
+{
+  return (float32x2_t)__builtin_neon_vmaxnmv2sf (a, b);
+}
+
+__extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
+vmaxnmq_f32 (float32x4_t a, float32x4_t b)
+{
+  return (float32x4_t)__builtin_neon_vmaxnmv4sf (a, b);
+}
+
+__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
+vminnm_f32 (float32x2_t a, float32x2_t b)
+{
+  return (float32x2_t)__builtin_neon_vminnmv2sf (a, b);
+}
+
+__extension__ static __inline float32x4_t __attribute__ ((__always_inline__))
+vminnmq_f32 (float32x4_t a, float32x4_t b)
+{
+  return (float32x4_t)__builtin_neon_vminnmv4sf (a, b);
+}
+#pragma GCC pop_options
+
+
 __extension__ static __inline uint8x16_t __attribute__ ((__always_inline__))
 vmaxq_u8 (uint8x16_t __a, uint8x16_t __b)
 {
diff --git a/gcc/config/arm/arm_neon_builtins.def b/gcc/config/arm/arm_neon_builtins.def
index b29aa91a64ecb85dfb5eb9661ed67d4fa326062f..58b10207c1f5c0380cb01fdb4a92a3f0b4dec591 100644
--- a/gcc/config/arm/arm_neon_builtins.def
+++ b/gcc/config/arm/arm_neon_builtins.def
@@ -147,12 +147,12 @@ VAR6 (BINOP, vmaxs, v8qi, v4hi, v2si, v16qi, v8hi, v4si)
 VAR6 (BINOP, vmaxu, v8qi, v4hi, v2si, v16qi, v8hi, v4si)
 VAR2 (BINOP, vmaxf, v2sf, v4sf)
 VAR2 (BINOP, vmaxf, v8hf, v4hf)
-VAR2 (BINOP, vmaxnm, v4hf, v8hf)
+VAR4 (BINOP, vmaxnm, v2sf, v4sf, v4hf, v8hf)
 VAR6 (BINOP, vmins, v8qi, v4hi, v2si, v16qi, v8hi, v4si)
 VAR6 (BINOP, vminu, v8qi, v4hi, v2si, v16qi, v8hi, v4si)
 VAR2 (BINOP, vminf, v2sf, v4sf)
 VAR2 (BINOP, vminf, v4hf, v8hf)
-VAR2 (BINOP, vminnm, v8hf, v4hf)
+VAR4 (BINOP, vminnm, v2sf, v4sf, v8hf, v4hf)
 
 VAR3 (BINOP, vpmaxs, v8qi, v4hi, v2si)
 VAR3 (BINOP, vpmaxu, v8qi, v4hi, v2si)
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 05323334ffd81aeff33ee407b96c788d123b3fe3..4f7358effdbbd7b8e7667af68dd54c2732459ced 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -2841,6 +2841,17 @@
  [(set_attr "type" "neon_fp_minmax_s<q>")]
 )
 
+;; v<maxmin>nm intrinsics.
+(define_insn "neon_<fmaxmin_op><mode>"
+  [(set (match_operand:VCVTF 0 "s_register_operand" "=w")
+	(unspec:VCVTF [(match_operand:VCVTF 1 "s_register_operand" "w")
+		       (match_operand:VCVTF 2 "s_register_operand" "w")]
+		       VMAXMINFNM))]
+  "TARGET_NEON && TARGET_FPU_ARMV8"
+  "<fmaxmin_op>.<V_s_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+  [(set_attr "type" "neon_fp_minmax_s<q>")]
+)
+
 ;; Vector forms for the IEEE-754 fmax()/fmin() functions
 (define_insn "<fmaxmin><mode>3"
   [(set (match_operand:VCVTF 0 "s_register_operand" "=w")
diff --git a/gcc/testsuite/gcc.target/arm/simd/vmaxnm_f32_1.c b/gcc/testsuite/gcc.target/arm/simd/vmaxnm_f32_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..c3a9f3671b36a1491ed6d33dc894a3b4b559c4ae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vmaxnm_f32_1.c
@@ -0,0 +1,159 @@
+/* Test the `vmaxnmf32' ARM Neon intrinsic.  */
+
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_neon_ok } */
+/* { dg-options "-save-temps -O3 -march=armv8-a" } */
+/* { dg-add-options arm_v8_neon } */
+
+#include "arm_neon.h"
+
+extern void abort ();
+
+void __attribute__ ((noinline))
+test_vmaxnm_f32__regular_input1 ()
+{
+  float32_t a1[] = {1,2};
+  float32_t b1[] = {3,4};
+  float32x2_t a = vld1_f32 (a1);
+  float32x2_t b = vld1_f32 (b1);
+  float32x2_t c = vmaxnm_f32 (a, b);
+  float32_t actual[2];
+  vst1_f32 (actual, c);
+
+  for (int i = 0; i < 2; ++i)
+    if (actual[i] != b1[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vmaxnm_f32__regular_input2 ()
+{
+  float32_t a1[] = {3,2};
+  float32_t b1[] = {1,4};
+  float32_t e[] = {3,4};
+  float32x2_t a = vld1_f32 (a1);
+  float32x2_t b = vld1_f32 (b1);
+  float32x2_t c = vmaxnm_f32 (a, b);
+  float32_t actual[2];
+  vst1_f32 (actual, c);
+
+  for (int i = 0; i < 2; ++i)
+    if (actual[i] != e[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vmaxnm_f32__quiet_NaN_one_arg ()
+{
+  /* When given a quiet NaN, vmaxnm returns the other operand.
+     In this test case we have NaNs in only one operand.  */
+  float32_t n = __builtin_nanf ("");
+  float32_t a1[] = {1,2};
+  float32_t b1[] = {n,n};
+  float32_t e[] = {1,2};
+  float32x2_t a = vld1_f32 (a1);
+  float32x2_t b = vld1_f32 (b1);
+  float32x2_t c = vmaxnm_f32 (a, b);
+  float32_t actual[2];
+  vst1_f32 (actual, c);
+
+  for (int i = 0; i < 2; ++i)
+    if (actual[i] != e[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vmaxnm_f32__quiet_NaN_both_args ()
+{
+  /* When given a quiet NaN, vmaxnm returns the other operand.
+     In this test case we have NaNs in both operands.  */
+  float32_t n = __builtin_nanf ("");
+  float32_t a1[] = {n,2};
+  float32_t b1[] = {1,n};
+  float32_t e[] = {1,2};
+  float32x2_t a = vld1_f32 (a1);
+  float32x2_t b = vld1_f32 (b1);
+  float32x2_t c = vmaxnm_f32 (a, b);
+  float32_t actual[2];
+  vst1_f32 (actual, c);
+
+  for (int i = 0; i < 2; ++i)
+    if (actual[i] != e[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vmaxnm_f32__zero_both_args ()
+{
+  /* For 0 and -0, vmaxnm returns 0.  Since 0 == -0, check sign bit.  */
+  float32_t a1[] = {0.0, 0.0};
+  float32_t b1[] = {-0.0, -0.0};
+  float32_t e[] = {0.0, 0.0};
+
+  float32x2_t a = vld1_f32 (a1);
+  float32x2_t b = vld1_f32 (b1);
+  float32x2_t c = vmaxnm_f32 (a, b);
+
+  float32_t actual1[2];
+  vst1_f32 (actual1, c);
+
+  for (int i = 0; i < 2; ++i)
+    if (actual1[i] != e[i] || __builtin_signbit (actual1[i]) != 0)
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vmaxnm_f32__inf_both_args ()
+{
+  /* The max of inf and inf is inf.  The max of -inf and -inf is -inf.  */
+  float32_t inf = __builtin_huge_valf ();
+  float32_t a1[] = {inf, -inf};
+  float32_t b1[] = {inf, -inf};
+  float32_t e[] = {inf, -inf};
+
+  float32x2_t a = vld1_f32 (a1);
+  float32x2_t b = vld1_f32 (b1);
+  float32x2_t c = vmaxnm_f32 (a, b);
+
+  float32_t actual1[2];
+  vst1_f32 (actual1, c);
+
+  for (int i = 0; i < 2; ++i)
+    if (actual1[i] != e[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vmaxnm_f32__two_quiet_NaNs_both_args ()
+{
+  /* When given 2 NaNs, return a NaN.  Since a NaN is not equal to anything,
+     not even another NaN, use __builtin_isnan () to check.  */
+  float32_t n = __builtin_nanf ("");
+  float32_t a1[] = {n,n};
+  float32_t b1[] = {n,n};
+  float32_t e[] = {n,n};
+  float32x2_t a = vld1_f32 (a1);
+  float32x2_t b = vld1_f32 (b1);
+  float32x2_t c = vmaxnm_f32 (a, b);
+  float32_t actual[2];
+  vst1_f32 (actual, c);
+
+  for (int i = 0; i < 2; ++i)
+    if (!__builtin_isnan (actual[i]))
+      abort ();
+}
+
+int
+main ()
+{
+  test_vmaxnm_f32__regular_input1 ();
+  test_vmaxnm_f32__regular_input2 ();
+  test_vmaxnm_f32__quiet_NaN_one_arg ();
+  test_vmaxnm_f32__quiet_NaN_both_args ();
+  test_vmaxnm_f32__zero_both_args ();
+  test_vmaxnm_f32__inf_both_args ();
+  test_vmaxnm_f32__two_quiet_NaNs_both_args ();
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "vmaxnm\.f32\t\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+\n" 7 } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/vmaxnmq_f32_1.c b/gcc/testsuite/gcc.target/arm/simd/vmaxnmq_f32_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..80c4e9aa18810fea318b865e8c4e503238e826f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vmaxnmq_f32_1.c
@@ -0,0 +1,160 @@
+/* Test the `vmaxnmqf32' ARM Neon intrinsic.  */
+
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_neon_ok } */
+/* { dg-options "-save-temps -O3 -march=armv8-a" } */
+/* { dg-add-options arm_v8_neon } */
+
+#include "arm_neon.h"
+
+extern void abort ();
+
+void __attribute__ ((noinline))
+test_vmaxnmq_f32__regular_input1 ()
+{
+  float32_t a1[] = {1,2,5,6};
+  float32_t b1[] = {3,4,7,8};
+  float32x4_t a = vld1q_f32 (a1);
+  float32x4_t b = vld1q_f32 (b1);
+  float32x4_t c = vmaxnmq_f32 (a, b);
+  float32_t actual[4];
+  vst1q_f32 (actual, c);
+
+  for (int i = 0; i < 4; ++i)
+    if (actual[i] != b1[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vmaxnmq_f32__regular_input2 ()
+{
+  float32_t a1[] = {3,2,7,6};
+  float32_t b1[] = {1,4,5,8};
+  float32_t e[] = {3,4,7,8};
+  float32x4_t a = vld1q_f32 (a1);
+  float32x4_t b = vld1q_f32 (b1);
+  float32x4_t c = vmaxnmq_f32 (a, b);
+  float32_t actual[4];
+  vst1q_f32 (actual, c);
+
+  for (int i = 0; i < 4; ++i)
+    if (actual[i] != e[i])
+      abort ();
+}
+
+
+void __attribute__ ((noinline))
+test_vmaxnmq_f32__quiet_NaN_one_arg ()
+{
+  /* When given a quiet NaN, vmaxnmq returns the other operand.
+     In this test case we have NaNs in only one operand.  */
+  float32_t n = __builtin_nanf ("");
+  float32_t a1[] = {1,2,3,4};
+  float32_t b1[] = {n,n,n,n};
+  float32_t e[] = {1,2,3,4};
+  float32x4_t a = vld1q_f32 (a1);
+  float32x4_t b = vld1q_f32 (b1);
+  float32x4_t c = vmaxnmq_f32 (a, b);
+  float32_t actual[4];
+  vst1q_f32 (actual, c);
+
+  for (int i = 0; i < 4; ++i)
+    if (actual[i] != e[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vmaxnmq_f32__quiet_NaN_both_args ()
+{
+  /* When given a quiet NaN, vmaxnmq returns the other operand.
+     In this test case we have NaNs in both operands.  */
+  float32_t n = __builtin_nanf ("");
+  float32_t a1[] = {n,2,n,4};
+  float32_t b1[] = {1,n,3,n};
+  float32_t e[] = {1,2,3,4};
+  float32x4_t a = vld1q_f32 (a1);
+  float32x4_t b = vld1q_f32 (b1);
+  float32x4_t c = vmaxnmq_f32 (a, b);
+  float32_t actual[4];
+  vst1q_f32 (actual, c);
+
+  for (int i = 0; i < 4; ++i)
+    if (actual[i] != e[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vmaxnmq_f32__zero_both_args ()
+{
+  /* For 0 and -0, vmaxnmq returns 0.  Since 0 == -0, check sign bit.  */
+  float32_t a1[] = {0.0, 0.0, -0.0, -0.0};
+  float32_t b1[] = {-0.0, -0.0, 0.0, 0.0};
+  float32_t e[] = {0.0, 0.0, 0.0, 0.0};
+
+  float32x4_t a = vld1q_f32 (a1);
+  float32x4_t b = vld1q_f32 (b1);
+  float32x4_t c = vmaxnmq_f32 (a, b);
+
+  float32_t actual1[4];
+  vst1q_f32 (actual1, c);
+
+  for (int i = 0; i < 4; ++i)
+    if (actual1[i] != e[i] || __builtin_signbit (actual1[i]) != 0)
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vmaxnmq_f32__inf_both_args ()
+{
+  /* The max of inf and inf is inf.  The max of -inf and -inf is -inf.  */
+  float32_t inf = __builtin_huge_valf ();
+  float32_t a1[] = {inf, -inf, inf, inf};
+  float32_t b1[] = {inf, -inf, -inf, -inf};
+  float32_t e[] = {inf, -inf, inf, inf};
+
+  float32x4_t a = vld1q_f32 (a1);
+  float32x4_t b = vld1q_f32 (b1);
+  float32x4_t c = vmaxnmq_f32 (a, b);
+
+  float32_t actual1[4];
+  vst1q_f32 (actual1, c);
+
+  for (int i = 0; i < 4; ++i)
+    if (actual1[i] != e[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vmaxnmq_f32__two_quiet_NaNs_both_args ()
+{
+  /* When given 2 NaNs, return a NaN.  Since a NaN is not equal to anything,
+     not even another NaN, use __builtin_isnan () to check.  */
+  float32_t n = __builtin_nanf ("");
+  float32_t a1[] = {n,n,n,n};
+  float32_t b1[] = {n,n,n,n};
+  float32_t e[] = {n,n,n,n};
+  float32x4_t a = vld1q_f32 (a1);
+  float32x4_t b = vld1q_f32 (b1);
+  float32x4_t c = vmaxnmq_f32 (a, b);
+  float32_t actual[4];
+  vst1q_f32 (actual, c);
+
+  for (int i = 0; i < 4; ++i)
+    if (!__builtin_isnan (actual[i]))
+      abort ();
+}
+
+int
+main ()
+{
+  test_vmaxnmq_f32__regular_input1 ();
+  test_vmaxnmq_f32__regular_input2 ();
+  test_vmaxnmq_f32__quiet_NaN_one_arg ();
+  test_vmaxnmq_f32__quiet_NaN_both_args ();
+  test_vmaxnmq_f32__zero_both_args ();
+  test_vmaxnmq_f32__inf_both_args ();
+  test_vmaxnmq_f32__two_quiet_NaNs_both_args ();
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "vmaxnm\.f32\t\[qQ\]\[0-9\]+, ?\[qQ\]\[0-9\]+, ?\[qQ\]\[0-9\]+\n" 7 } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/vminnm_f32_1.c b/gcc/testsuite/gcc.target/arm/simd/vminnm_f32_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..9a1d097911748108591a11f3bd7fbf3e44adebaa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vminnm_f32_1.c
@@ -0,0 +1,159 @@
+/* Test the `vminnmf32' ARM Neon intrinsic.  */
+
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_neon_ok } */
+/* { dg-options "-save-temps -O3 -march=armv8-a" } */
+/* { dg-add-options arm_v8_neon } */
+
+#include "arm_neon.h"
+
+extern void abort ();
+
+void __attribute__ ((noinline))
+test_vminnm_f32__regular_input1 ()
+{
+  float32_t a1[] = {1,2};
+  float32_t b1[] = {3,4};
+  float32x2_t a = vld1_f32 (a1);
+  float32x2_t b = vld1_f32 (b1);
+  float32x2_t c = vminnm_f32 (a, b);
+  float32_t actual[2];
+  vst1_f32 (actual, c);
+
+  for (int i = 0; i < 2; ++i)
+    if (actual[i] != a1[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vminnm_f32__regular_input2 ()
+{
+  float32_t a1[] = {3,2};
+  float32_t b1[] = {1,4};
+  float32_t e[] = {1,2};
+  float32x2_t a = vld1_f32 (a1);
+  float32x2_t b = vld1_f32 (b1);
+  float32x2_t c = vminnm_f32 (a, b);
+  float32_t actual[2];
+  vst1_f32 (actual, c);
+
+  for (int i = 0; i < 2; ++i)
+    if (actual[i] != e[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vminnm_f32__quiet_NaN_one_arg ()
+{
+  /* When given a quiet NaN, vminnm returns the other operand.
+     In this test case we have NaNs in only one operand.  */
+  float32_t n = __builtin_nanf ("");
+  float32_t a1[] = {1,2};
+  float32_t b1[] = {n,n};
+  float32_t e[] = {1,2};
+  float32x2_t a = vld1_f32 (a1);
+  float32x2_t b = vld1_f32 (b1);
+  float32x2_t c = vminnm_f32 (a, b);
+  float32_t actual[2];
+  vst1_f32 (actual, c);
+
+  for (int i = 0; i < 2; ++i)
+    if (actual[i] != e[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vminnm_f32__quiet_NaN_both_args ()
+{
+  /* When given a quiet NaN, vminnm returns the other operand.
+     In this test case we have NaNs in both operands.  */
+  float32_t n = __builtin_nanf ("");
+  float32_t a1[] = {n,2};
+  float32_t b1[] = {1,n};
+  float32_t e[] = {1,2};
+  float32x2_t a = vld1_f32 (a1);
+  float32x2_t b = vld1_f32 (b1);
+  float32x2_t c = vminnm_f32 (a, b);
+  float32_t actual[2];
+  vst1_f32 (actual, c);
+
+  for (int i = 0; i < 2; ++i)
+    if (actual[i] != e[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vminnm_f32__zero_both_args ()
+{
+  /* For 0 and -0, vminnm returns -0.  Since 0 == -0, check sign bit.  */
+  float32_t a1[] = {0.0,0.0};
+  float32_t b1[] = {-0.0, -0.0};
+  float32_t e[] = {-0.0, -0.0};
+
+  float32x2_t a = vld1_f32 (a1);
+  float32x2_t b = vld1_f32 (b1);
+  float32x2_t c = vminnm_f32 (a, b);
+
+  float32_t actual1[2];
+  vst1_f32 (actual1, c);
+
+  for (int i = 0; i < 2; ++i)
+    if (actual1[i] != e[i] || __builtin_signbit (actual1[i]) == 0)
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vminnm_f32__inf_both_args ()
+{
+  /* The min of inf and inf is inf.  The min of -inf and -inf is -inf.  */
+  float32_t inf = __builtin_huge_valf ();
+  float32_t a1[] = {inf, -inf};
+  float32_t b1[] = {inf, -inf};
+  float32_t e[] = {inf, -inf};
+
+  float32x2_t a = vld1_f32 (a1);
+  float32x2_t b = vld1_f32 (b1);
+  float32x2_t c = vminnm_f32 (a, b);
+
+  float32_t actual1[2];
+  vst1_f32 (actual1, c);
+
+  for (int i = 0; i < 2; ++i)
+    if (actual1[i] != e[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vminnm_f32__two_quiet_NaNs_both_args ()
+{
+  /* When given 2 NaNs, return a NaN.  Since a NaN is not equal to anything,
+     not even another NaN, use __builtin_isnan () to check.  */
+  float32_t n = __builtin_nanf ("");
+  float32_t a1[] = {n,n};
+  float32_t b1[] = {n,n};
+  float32_t e[] = {n,n};
+  float32x2_t a = vld1_f32 (a1);
+  float32x2_t b = vld1_f32 (b1);
+  float32x2_t c = vminnm_f32 (a, b);
+  float32_t actual[2];
+  vst1_f32 (actual, c);
+
+  for (int i = 0; i < 2; ++i)
+    if (!__builtin_isnan (actual[i]))
+      abort ();
+}
+
+int
+main ()
+{
+  test_vminnm_f32__regular_input1 ();
+  test_vminnm_f32__regular_input2 ();
+  test_vminnm_f32__quiet_NaN_one_arg ();
+  test_vminnm_f32__quiet_NaN_both_args ();
+  test_vminnm_f32__zero_both_args ();
+  test_vminnm_f32__inf_both_args ();
+  test_vminnm_f32__two_quiet_NaNs_both_args ();
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "vminnm\.f32\t\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+, ?\[dD\]\[0-9\]+\n" 7 } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/vminnmq_f32_1.c b/gcc/testsuite/gcc.target/arm/simd/vminnmq_f32_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..a778abecd857e9ea83d249e0ab52886209030aa4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vminnmq_f32_1.c
@@ -0,0 +1,159 @@
+/* Test the `vminnmqf32' ARM Neon intrinsic.  */
+
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_neon_ok } */
+/* { dg-options "-save-temps -O3 -march=armv8-a" } */
+/* { dg-add-options arm_v8_neon } */
+
+#include "arm_neon.h"
+
+extern void abort ();
+
+void __attribute__ ((noinline))
+test_vminnmq_f32__regular_input1 ()
+{
+  float32_t a1[] = {1,2,5,6};
+  float32_t b1[] = {3,4,7,8};
+  float32x4_t a = vld1q_f32 (a1);
+  float32x4_t b = vld1q_f32 (b1);
+  float32x4_t c = vminnmq_f32 (a, b);
+  float32_t actual[4];
+  vst1q_f32 (actual, c);
+
+  for (int i = 0; i < 4; ++i)
+    if (actual[i] != a1[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vminnmq_f32__regular_input2 ()
+{
+  float32_t a1[] = {3,2,7,6};
+  float32_t b1[] = {1,4,5,8};
+  float32_t e[] = {1,2,5,6};
+  float32x4_t a = vld1q_f32 (a1);
+  float32x4_t b = vld1q_f32 (b1);
+  float32x4_t c = vminnmq_f32 (a, b);
+  float32_t actual[4];
+  vst1q_f32 (actual, c);
+
+  for (int i = 0; i < 4; ++i)
+    if (actual[i] != e[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vminnmq_f32__quiet_NaN_one_arg ()
+{
+  /* When given a quiet NaN, vminnmq returns the other operand.
+     In this test case we have NaNs in only one operand.  */
+  float32_t n = __builtin_nanf ("");
+  float32_t a1[] = {1,2,3,4};
+  float32_t b1[] = {n,n,n,n};
+  float32_t e[] = {1,2,3,4};
+  float32x4_t a = vld1q_f32 (a1);
+  float32x4_t b = vld1q_f32 (b1);
+  float32x4_t c = vminnmq_f32 (a, b);
+  float32_t actual[4];
+  vst1q_f32 (actual, c);
+
+  for (int i = 0; i < 4; ++i)
+    if (actual[i] != e[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vminnmq_f32__quiet_NaN_both_args ()
+{
+  /* When given a quiet NaN, vminnmq returns the other operand.
+     In this test case we have NaNs in both operands.  */
+  float32_t n = __builtin_nanf ("");
+  float32_t a1[] = {n,2,n,4};
+  float32_t b1[] = {1,n,3,n};
+  float32_t e[] = {1,2,3,4};
+  float32x4_t a = vld1q_f32 (a1);
+  float32x4_t b = vld1q_f32 (b1);
+  float32x4_t c = vminnmq_f32 (a, b);
+  float32_t actual[4];
+  vst1q_f32 (actual, c);
+
+  for (int i = 0; i < 4; ++i)
+    if (actual[i] != e[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vminnmq_f32__zero_both_args ()
+{
+  /* For 0 and -0, vminnmq returns -0.  Since 0 == -0, check sign bit.  */
+  float32_t a1[] = {0.0, 0.0, -0.0, -0.0};
+  float32_t b1[] = {-0.0, -0.0, 0.0, 0.0};
+  float32_t e[] = {-0.0, -0.0, -0.0, -0.0};
+
+  float32x4_t a = vld1q_f32 (a1);
+  float32x4_t b = vld1q_f32 (b1);
+  float32x4_t c = vminnmq_f32 (a, b);
+
+  float32_t actual1[4];
+  vst1q_f32 (actual1, c);
+
+  for (int i = 0; i < 4; ++i)
+    if (actual1[i] != e[i] || __builtin_signbit (actual1[i]) == 0)
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vminnmq_f32__inf_both_args ()
+{
+  /* The min of inf and inf is inf.  The min of -inf and -inf is -inf.  */
+  float32_t inf = __builtin_huge_valf ();
+  float32_t a1[] = {inf, -inf, inf, inf};
+  float32_t b1[] = {inf, -inf, -inf, -inf};
+  float32_t e[] = {inf, -inf, -inf, -inf};
+
+  float32x4_t a = vld1q_f32 (a1);
+  float32x4_t b = vld1q_f32 (b1);
+  float32x4_t c = vminnmq_f32 (a, b);
+
+  float32_t actual1[4];
+  vst1q_f32 (actual1, c);
+
+  for (int i = 0; i < 4; ++i)
+    if (actual1[i] != e[i])
+      abort ();
+}
+
+void __attribute__ ((noinline))
+test_vminnmq_f32__two_quiet_NaNs_both_args ()
+{
+  /* When given 2 NaNs, return a NaN.  Since a NaN is not equal to anything,
+     not even another NaN, use __builtin_isnan () to check.  */
+  float32_t n = __builtin_nanf ("");
+  float32_t a1[] = {n,n,n,n};
+  float32_t b1[] = {n,n,n,n};
+  float32_t e[] = {n,n,n,n};
+  float32x4_t a = vld1q_f32 (a1);
+  float32x4_t b = vld1q_f32 (b1);
+  float32x4_t c = vminnmq_f32 (a, b);
+  float32_t actual[4];
+  vst1q_f32 (actual, c);
+
+  for (int i = 0; i < 4; ++i)
+    if (!__builtin_isnan (actual[i]))
+      abort ();
+}
+
+int
+main ()
+{
+  test_vminnmq_f32__regular_input1 ();
+  test_vminnmq_f32__regular_input2 ();
+  test_vminnmq_f32__quiet_NaN_one_arg ();
+  test_vminnmq_f32__quiet_NaN_both_args ();
+  test_vminnmq_f32__zero_both_args ();
+  test_vminnmq_f32__inf_both_args ();
+  test_vminnmq_f32__two_quiet_NaNs_both_args ();
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "vminnm\.f32\t\[qQ\]\[0-9\]+, ?\[qQ\]\[0-9\]+, ?\[qQ\]\[0-9\]+\n" 7 } } */
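
For readers checking the expected values in the tests above, the behaviour they encode can be sketched as a portable scalar model. This is only an illustration under stated assumptions, not part of the patch: the names ref_minnm_f32, ref_vminnmq_f32 and ref_minnm_self_check are hypothetical, and only quiet NaNs are modelled (signalling NaNs are not covered).

```c
#include <assert.h>
#include <math.h>

/* Hypothetical scalar model of one vminnm.f32 lane for quiet-NaN inputs.
   The name ref_minnm_f32 is an assumption and is not part of the patch.  */
static float
ref_minnm_f32 (float x, float y)
{
  if (isnan (x))
    return y;			/* Also covers NaN/NaN: y is then a NaN.  */
  if (isnan (y))
    return x;
  if (x == 0.0f && y == 0.0f)
    return signbit (x) ? x : y;	/* Prefer -0 over +0.  */
  return x < y ? x : y;
}

/* Apply the model lane by lane, mirroring the four-lane q-register form.  */
static void
ref_vminnmq_f32 (const float a[4], const float b[4], float out[4])
{
  for (int i = 0; i < 4; ++i)
    out[i] = ref_minnm_f32 (a[i], b[i]);
}

/* Re-check the expectations encoded in the tests against the model.  */
void
ref_minnm_self_check (void)
{
  const float n = nanf ("");
  const float a[4] = {n, 2.0f, 0.0f, INFINITY};
  const float b[4] = {1.0f, n, -0.0f, -INFINITY};
  float out[4];

  ref_vminnmq_f32 (a, b, out);
  assert (out[0] == 1.0f);			/* NaN in first operand.  */
  assert (out[1] == 2.0f);			/* NaN in second operand.  */
  assert (out[2] == 0.0f && signbit (out[2]));	/* min (0, -0) is -0.  */
  assert (out[3] == -INFINITY);
  assert (isnan (ref_minnm_f32 (n, n)));	/* Two NaNs give a NaN.  */
}
```

The zero case is handled explicitly because the C library's fminf, while it also returns the non-NaN operand, is not required to prefer -0 over +0.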


* Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsincs.
  2016-10-26 15:02   ` Tamar Christina
@ 2016-11-01  9:24     ` Tamar Christina
  2016-11-01 12:22     ` Kyrill Tkachov
  1 sibling, 0 replies; 8+ messages in thread
From: Tamar Christina @ 2016-11-01  9:24 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: GCC Patches, Kyrylo Tkachov, nd

Ping.

________________________________________
From: gcc-patches-owner@gcc.gnu.org <gcc-patches-owner@gcc.gnu.org> on behalf of Tamar Christina <Tamar.Christina@arm.com>
Sent: Wednesday, October 26, 2016 4:01:42 PM
To: Christophe Lyon
Cc: GCC Patches; Kyrylo Tkachov; nd
Subject: Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsincs.

Hi Christophe,

Here's the updated patch.

Cheers,
Tamar
________________________________________
From: Christophe Lyon <christophe.lyon@linaro.org>
Sent: Wednesday, October 19, 2016 11:23:56 AM
To: Tamar Christina
Cc: GCC Patches; Kyrylo Tkachov; nd
Subject: Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsincs.

On 19 October 2016 at 11:36, Tamar Christina <Tamar.Christina@arm.com> wrote:
> Hi All,
>
> This patch implements the vmaxnmQ_ST and vminnmQ_ST intrinsics. The
> current builtin registration code is deficient since it can't access
> standard pattern names, to which vmaxnmQ_ST and vminnmQ_ST map
> directly. Thus, to enable the vectoriser to have access to these
> intrinsics, we implement them using builtin functions, which we
> expand to the proper standard pattern using a define_expand.
>
> This patch also implements the __ARM_FEATURE_NUMERIC_MAXMIN macro,
> which is defined when __ARM_ARCH >= 8, and which enables the
> intrinsics.
>
> Regression tested on arm-none-eabi and no regressions.
>
> This patch is a rework of a previous patch:
> https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01971.html
>
> OK for trunk?
>
> Thanks,
> Tamar
>
> ---
>
> gcc/
>
> 2016-10-19  Bilyan Borisov  <bilyan.borisov@arm.com>
>             Tamar Christina <tamar.christina@arm.com>
>
>         * config/arm/arm-c.c (arm_cpu_builtins): New macro definition.
>         * config/arm/arm_neon.h (vmaxnm_f32): New intrinsic.
>         (vmaxnmq_f32): Likewise.
>         (vminnm_f32): Likewise.
>         (vminnmq_f32): Likewise.
>         * config/arm/arm_neon_builtins.def (vmaxnm): New builtin.
>         (vminnm): Likewise.
>         * config/arm/neon.md (neon_<fmaxmin_op><mode>, VCVTF): New
>         expander.
>
> gcc/testsuite/
>
> 2016-10-19  Bilyan Borisov  <bilyan.borisov@arm.com>
>
>         * gcc.target/arm/simd/vmaxnm_f32_1.c: New.
>         * gcc.target/arm/simd/vmaxnmq_f32_1.c: Likewise.
>         * gcc.target/arm/simd/vminnm_f32_1.c: Likewise.
>         * gcc.target/arm/simd/vminnmq_f32_1.c: Likewise.
>

I think you forgot to attach the new tests.

Christophe


* Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsincs.
  2016-10-26 15:02   ` Tamar Christina
  2016-11-01  9:24     ` Tamar Christina
@ 2016-11-01 12:22     ` Kyrill Tkachov
  2016-11-02 11:22       ` Bin.Cheng
  1 sibling, 1 reply; 8+ messages in thread
From: Kyrill Tkachov @ 2016-11-01 12:22 UTC (permalink / raw)
  To: Tamar Christina, Christophe Lyon; +Cc: GCC Patches

Hi Tamar,

On 26/10/16 16:01, Tamar Christina wrote:
> Hi Christophe,
>
> Here's the updated patch.
>
> Cheers,
> Tamar
> ________________________________________
> From: Christophe Lyon <christophe.lyon@linaro.org>
> Sent: Wednesday, October 19, 2016 11:23:56 AM
> To: Tamar Christina
> Cc: GCC Patches; Kyrylo Tkachov; nd
> Subject: Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsincs.
>
> On 19 October 2016 at 11:36, Tamar Christina <Tamar.Christina@arm.com> wrote:
>> Hi All,
>>
>> This patch implements the vmaxnmQ_ST and vminnmQ_ST intrinsics. The
>> current builtin registration code is deficient since it can't access
>> standard pattern names, to which vmaxnmQ_ST and vminnmQ_ST map
>> directly. Thus, to enable the vectoriser to have access to these
>> intrinsics, we implement them using builtin functions, which we
>> expand to the proper standard pattern using a define_expand.
>>
>> This patch also implements the __ARM_FEATURE_NUMERIC_MAXMIN macro,
>> which is defined when __ARM_ARCH >= 8, and which enables the
>> intrinsics.
>>
>> Regression tested on arm-none-eabi and no regressions.
>>
>> This patch is a rework of a previous patch:
>> https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01971.html
>>
>> OK for trunk?

Ok.
Thanks,
Kyrill

>> Thanks,
>> Tamar
>>
>> ---
>>
>> gcc/
>>
>> 2016-10-19  Bilyan Borisov  <bilyan.borisov@arm.com>
>>              Tamar Christina <tamar.christina@arm.com>
>>
>>          * config/arm/arm-c.c (arm_cpu_builtins): New macro definition.
>>          * config/arm/arm_neon.h (vmaxnm_f32): New intrinsic.
>>          (vmaxnmq_f32): Likewise.
>>          (vminnm_f32): Likewise.
>>          (vminnmq_f32): Likewise.
>>          * config/arm/arm_neon_builtins.def (vmaxnm): New builtin.
>>          (vminnm): Likewise.
>>          * config/arm/neon.md (neon_<fmaxmin_op><mode>, VCVTF): New
>>          expander.
>>
>> gcc/testsuite/
>>
>> 2016-10-19  Bilyan Borisov  <bilyan.borisov@arm.com>
>>
>>          * gcc.target/arm/simd/vmaxnm_f32_1.c: New.
>>          * gcc.target/arm/simd/vmaxnmq_f32_1.c: Likewise.
>>          * gcc.target/arm/simd/vminnm_f32_1.c: Likewise.
>>          * gcc.target/arm/simd/vminnmq_f32_1.c: Likewise.
>>
> I think you forgot to attach the new tests.
>
> Christophe
>


* Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsincs.
  2016-11-01 12:22     ` Kyrill Tkachov
@ 2016-11-02 11:22       ` Bin.Cheng
  2016-11-02 13:29         ` Christophe Lyon
  0 siblings, 1 reply; 8+ messages in thread
From: Bin.Cheng @ 2016-11-02 11:22 UTC (permalink / raw)
  To: Kyrill Tkachov; +Cc: Tamar Christina, Christophe Lyon, GCC Patches

On Tue, Nov 1, 2016 at 12:21 PM, Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
> Hi Tamar,
>
>
> On 26/10/16 16:01, Tamar Christina wrote:
>>
>> Hi Christophe,
>>
>> Here's the updated patch.
>>
>> Cheers,
>> Tamar
>> ________________________________________
>> From: Christophe Lyon <christophe.lyon@linaro.org>
>> Sent: Wednesday, October 19, 2016 11:23:56 AM
>> To: Tamar Christina
>> Cc: GCC Patches; Kyrylo Tkachov; nd
>> Subject: Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and
>> vminnmQ_ST intrinsincs.
>>
>> On 19 October 2016 at 11:36, Tamar Christina <Tamar.Christina@arm.com>
>> wrote:
>>>
>>> Hi All,
>>>
>>> This patch implements the vmaxnmQ_ST and vminnmQ_ST intrinsics. The
>>> current builtin registration code is deficient since it can't access
>>> standard pattern names, to which vmaxnmQ_ST and vminnmQ_ST map
>>> directly. Thus, to enable the vectoriser to have access to these
>>> intrinsics, we implement them using builtin functions, which we
>>> expand to the proper standard pattern using a define_expand.
>>>
>>> This patch also implements the __ARM_FEATURE_NUMERIC_MAXMIN macro,
>>> which is defined when __ARM_ARCH >= 8, and which enables the
>>> intrinsics.
>>>
>>> Regression tested on arm-none-eabi and no regressions.
>>>
>>> This patch is a rework of a previous patch:
>>> https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01971.html
>>>
>>> OK for trunk?

These cases failed on arm-none-linux-gnueabihf as below:
FAIL: gcc.target/arm/simd/vmaxnm_f32_1.c execution test
FAIL: gcc.target/arm/simd/vmaxnmq_f32_1.c execution test
FAIL: gcc.target/arm/simd/vminnm_f32_1.c execution test
FAIL: gcc.target/arm/simd/vminnmq_f32_1.c execution test

For such changes, I would suggest regression testing with both bare-metal
and Linux toolchains, plus a bootstrap with the Linux toolchain.

Thanks,
bin

>
>
> Ok.
> Thanks,
> Kyrill
>
>
>>> Thanks,
>>> Tamar
>>>
>>> ---
>>>
>>> gcc/
>>>
>>> 2016-10-19  Bilyan Borisov  <bilyan.borisov@arm.com>
>>>              Tamar Christina <tamar.christina@arm.com>
>>>
>>>          * config/arm/arm-c.c (arm_cpu_builtins): New macro definition.
>>>          * config/arm/arm_neon.h (vmaxnm_f32): New intrinsic.
>>>          (vmaxnmq_f32): Likewise.
>>>          (vminnm_f32): Likewise.
>>>          (vminnmq_f32): Likewise.
>>>          * config/arm/arm_neon_builtins.def (vmaxnm): New builtin.
>>>          (vminnm): Likewise.
>>>          * config/arm/neon.md (neon_<fmaxmin_op><mode>, VCVTF): New
>>>          expander.
>>>
>>> gcc/testsuite/
>>>
>>> 2016-10-19  Bilyan Borisov  <bilyan.borisov@arm.com>
>>>
>>>          * gcc.target/arm/simd/vmaxnm_f32_1.c: New.
>>>          * gcc.target/arm/simd/vmaxnmq_f32_1.c: Likewise.
>>>          * gcc.target/arm/simd/vminnm_f32_1.c: Likewise.
>>>          * gcc.target/arm/simd/vminnmq_f32_1.c: Likewise.
>>>
>> I think you forgot to attach the new tests.
>>
>> Christophe
>>
>


* Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsincs.
  2016-11-02 11:22       ` Bin.Cheng
@ 2016-11-02 13:29         ` Christophe Lyon
  0 siblings, 0 replies; 8+ messages in thread
From: Christophe Lyon @ 2016-11-02 13:29 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: Kyrill Tkachov, Tamar Christina, GCC Patches

On 2 November 2016 at 12:22, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Tue, Nov 1, 2016 at 12:21 PM, Kyrill Tkachov
> <kyrylo.tkachov@foss.arm.com> wrote:
>> Hi Tamar,
>>
>>
>> On 26/10/16 16:01, Tamar Christina wrote:
>>>
>>> Hi Christophe,
>>>
>>> Here's the updated patch.
>>>
>>> Cheers,
>>> Tamar
>>> ________________________________________
>>> From: Christophe Lyon <christophe.lyon@linaro.org>
>>> Sent: Wednesday, October 19, 2016 11:23:56 AM
>>> To: Tamar Christina
>>> Cc: GCC Patches; Kyrylo Tkachov; nd
>>> Subject: Re: [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and
>>> vminnmQ_ST intrinsincs.
>>>
>>> On 19 October 2016 at 11:36, Tamar Christina <Tamar.Christina@arm.com>
>>> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> This patch implements the vmaxnmQ_ST and vminnmQ_ST intrinsics. The
>>>> current builtin registration code is deficient since it can't access
>>>> standard pattern names, to which vmaxnmQ_ST and vminnmQ_ST map
>>>> directly. Thus, to enable the vectoriser to have access to these
>>>> intrinsics, we implement them using builtin functions, which we
>>>> expand to the proper standard pattern using a define_expand.
>>>>
>>>> This patch also implements the __ARM_FEATURE_NUMERIC_MAXMIN macro,
>>>> which is defined when __ARM_ARCH >= 8, and which enables the
>>>> intrinsics.
>>>>
>>>> Regression tested on arm-none-eabi and no regressions.
>>>>
>>>> This patch is a rework of a previous patch:
>>>> https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01971.html
>>>>
>>>> OK for trunk?
> These cases failed on arm-none-linux-gnueabihf as below:
> FAIL: gcc.target/arm/simd/vmaxnm_f32_1.c execution test
> FAIL: gcc.target/arm/simd/vmaxnmq_f32_1.c execution test
> FAIL: gcc.target/arm/simd/vminnm_f32_1.c execution test
> FAIL: gcc.target/arm/simd/vminnmq_f32_1.c execution test
>
> For such changes, I would suggest reg test for both bare-metal and
> linux toolchains, plus a bootstrap for linux toolchain.
>

Hi,

I confirm some tests are failing:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/241736/report-build-info.html

Sorry I couldn't answer/test before you committed; I was on holiday.

Christophe

> Thanks,
> bin
>
>>
>>
>> Ok.
>> Thanks,
>> Kyrill
>>
>>
>>>> Thanks,
>>>> Tamar
>>>>
>>>> ---
>>>>
>>>> gcc/
>>>>
>>>> 2016-10-19  Bilyan Borisov  <bilyan.borisov@arm.com>
>>>>              Tamar Christina <tamar.christina@arm.com>
>>>>
>>>>          * config/arm/arm-c.c (arm_cpu_builtins): New macro definition.
>>>>          * config/arm/arm_neon.h (vmaxnm_f32): New intrinsic.
>>>>          (vmaxnmq_f32): Likewise.
>>>>          (vminnm_f32): Likewise.
>>>>          (vminnmq_f32): Likewise.
>>>>          * config/arm/arm_neon_builtins.def (vmaxnm): New builtin.
>>>>          (vminnm): Likewise.
>>>>          * config/arm/neon.md (neon_<fmaxmin_op><mode>, VCVTF): New
>>>>          expander.
>>>>
>>>> gcc/testsuite/
>>>>
>>>> 2016-10-19  Bilyan Borisov  <bilyan.borisov@arm.com>
>>>>
>>>>          * gcc.target/arm/simd/vmaxnm_f32_1.c: New.
>>>>          * gcc.target/arm/simd/vmaxnmq_f32_1.c: Likewise.
>>>>          * gcc.target/arm/simd/vminnm_f32_1.c: Likewise.
>>>>          * gcc.target/arm/simd/vminnmq_f32_1.c: Likewise.
>>>>
>>> I think you forgot to attach the new tests.
>>>
>>> Christophe
>>>
>>


Thread overview: 8+ messages
2016-10-19  9:36 [PATCH v2][AArch32][NEON] Implementing vmaxnmQ_ST and vminnmQ_ST intrinsincs Tamar Christina
2016-10-19 10:24 ` Christophe Lyon
2016-10-19 10:39   ` Tamar Christina
2016-10-26 15:02   ` Tamar Christina
2016-11-01  9:24     ` Tamar Christina
2016-11-01 12:22     ` Kyrill Tkachov
2016-11-02 11:22       ` Bin.Cheng
2016-11-02 13:29         ` Christophe Lyon
