[PATCH 2/6] [ARC] Fix FPX/FPUDA code gen when compiling for big-endian.

* [PATCH 2/6] [ARC] Fix FPX/FPUDA code gen when compiling for big-endian.
  2016-04-18 14:35 [PATCH 0/6] [ARC] Various fixes Claudiu Zissulescu
                   ` (2 preceding siblings ...)
  2016-04-18 14:35 ` [PATCH 3/6] [ARC] Pass mfpuda to assembler Claudiu Zissulescu
@ 2016-04-18 14:35 ` Claudiu Zissulescu
  2016-04-28 10:29   ` Joern Wolfgang Rennecke
  2016-04-18 14:35 ` [PATCH 1/6] [ARC] Don't use drsub* instructions when selecting fpuda Claudiu Zissulescu
  2016-04-18 14:35 ` [PATCH 5/6] [ARC] Fix unwanted match for sign extend 16-bit constant Claudiu Zissulescu
  5 siblings, 1 reply; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-18 14:35 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett

OK to apply?
Claudiu

gcc/
2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.c (arc_process_double_reg_moves): Fix for
	big-endian compilation.
	* config/arc/arc.md (adddf3): Likewise.
	(subdf3): Likewise.
	(muldf3): Likewise.
---
 gcc/config/arc/arc.c  | 12 ++++++++----
 gcc/config/arc/arc.md | 18 +++++++++---------
 2 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index d60db50..f4bef3e 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -8647,8 +8647,10 @@ arc_process_double_reg_moves (rtx *operands)
 	{
 	  /* When we have 'mov D, r' or 'mov D, D' then get the target
 	     register pair for use with LR insn.  */
-	  rtx destHigh = simplify_gen_subreg(SImode, dest, DFmode, 4);
-	  rtx destLow  = simplify_gen_subreg(SImode, dest, DFmode, 0);
+	  rtx destHigh = simplify_gen_subreg (SImode, dest, DFmode,
+					     TARGET_BIG_ENDIAN ? 0 : 4);
+	  rtx destLow  = simplify_gen_subreg (SImode, dest, DFmode,
+					     TARGET_BIG_ENDIAN ? 4 : 0);
 
 	  /* Produce the two LR insns to get the high and low parts.  */
 	  emit_insn (gen_rtx_SET (destHigh,
@@ -8665,8 +8667,10 @@ arc_process_double_reg_moves (rtx *operands)
     {
       /* When we have 'mov r, D' or 'mov D, D' and we have access to the
 	 LR insn get the target register pair.  */
-      rtx srcHigh = simplify_gen_subreg(SImode, src, DFmode, 4);
-      rtx srcLow  = simplify_gen_subreg(SImode, src, DFmode, 0);
+      rtx srcHigh = simplify_gen_subreg (SImode, src, DFmode,
+					TARGET_BIG_ENDIAN ? 0 : 4);
+      rtx srcLow  = simplify_gen_subreg (SImode, src, DFmode,
+					TARGET_BIG_ENDIAN ? 4 : 0);
 
       emit_insn (gen_rtx_UNSPEC_VOLATILE (Pmode,
 					  gen_rtvec (3, dest, srcHigh, srcLow),
diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 9766547..74530b1 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -5681,9 +5681,9 @@
    {
     if (GET_CODE (operands[2]) == CONST_DOUBLE)
      {
-        rtx high, low, tmp;
-        split_double (operands[2], &low, &high);
-        tmp = force_reg (SImode, high);
+        rtx first, second, tmp;
+        split_double (operands[2], &first, &second);
+        tmp = force_reg (SImode, TARGET_BIG_ENDIAN ? first : second);
         emit_insn (gen_adddf3_insn (operands[0], operands[1],
                                     operands[2], tmp, const0_rtx));
      }
@@ -5718,10 +5718,10 @@
      if ((GET_CODE (operands[1]) == CONST_DOUBLE)
           || GET_CODE (operands[2]) == CONST_DOUBLE)
       {
-        rtx high, low, tmp;
+        rtx first, second, tmp;
         int const_index = ((GET_CODE (operands[1]) == CONST_DOUBLE) ? 1 : 2);
-        split_double (operands[const_index], &low, &high);
-        tmp = force_reg (SImode, high);
+        split_double (operands[const_index], &first, &second);
+        tmp = force_reg (SImode, TARGET_BIG_ENDIAN ? first : second);
         emit_insn (gen_subdf3_insn (operands[0], operands[1],
                                     operands[2], tmp, const0_rtx));
       }
@@ -5753,9 +5753,9 @@
     {
      if (GET_CODE (operands[2]) == CONST_DOUBLE)
       {
-        rtx high, low, tmp;
-        split_double (operands[2], &low, &high);
-        tmp = force_reg (SImode, high);
+        rtx first, second, tmp;
+        split_double (operands[2], &first, &second);
+        tmp = force_reg (SImode, TARGET_BIG_ENDIAN ? first : second);
         emit_insn (gen_muldf3_insn (operands[0], operands[1],
                                     operands[2], tmp, const0_rtx));
       }
-- 
1.9.1


* [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library.
  2016-04-18 14:35 [PATCH 0/6] [ARC] Various fixes Claudiu Zissulescu
  2016-04-18 14:35 ` [PATCH 6/6] [ARC] Various instruction pattern fixes Claudiu Zissulescu
@ 2016-04-18 14:35 ` Claudiu Zissulescu
  2016-04-28 11:27   ` Joern Wolfgang Rennecke
  2016-04-18 14:35 ` [PATCH 3/6] [ARC] Pass mfpuda to assembler Claudiu Zissulescu
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-18 14:35 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett

OK to apply?
Claudiu

gcc/
2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>

	* testsuite/gcc.target/arc/ieee_eq.c: New test.

libgcc/
2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/ieee-754/eqdf2.S: Handle FPX NaN.
---
 gcc/testsuite/gcc.target/arc/ieee_eq.c | 47 ++++++++++++++++++++++++++++++++++
 libgcc/config/arc/ieee-754/eqdf2.S     | 13 ++++++----
 2 files changed, 55 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/ieee_eq.c

diff --git a/gcc/testsuite/gcc.target/arc/ieee_eq.c b/gcc/testsuite/gcc.target/arc/ieee_eq.c
new file mode 100644
index 0000000..70aebad
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/ieee_eq.c
@@ -0,0 +1,47 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include <stdio.h>
+#include <float.h>
+
+#define TEST_EQ(TYPE,X,Y,RES)				\
+  do {							\
+    volatile TYPE a, b;					\
+    a = (TYPE) X;					\
+    b = (TYPE) Y;					\
+    if ((a == b) != RES)				\
+      {							\
+	printf ("Runtime computation error @%d. %g "	\
+		"!= %g\n", __LINE__, a, b);		\
+	error = 1;					\
+      }							\
+  } while (0)
+
+#ifndef __HS__
+/* Special type of NaN found when using double FPX instructions.  */
+static const unsigned long long __nan = 0x7FF0000080000000ULL;
+# define W (*(double *) &__nan)
+#else
+# define W __builtin_nan ("")
+#endif
+
+#define Q __builtin_nan ("")
+#define H __builtin_inf ()
+
+int main (void)
+{
+  int error = 0;
+
+  TEST_EQ (double, 1, 1, 1);
+  TEST_EQ (double, 1, 2, 0);
+  TEST_EQ (double, W, W, 0);
+  TEST_EQ (double, Q, Q, 0);
+  TEST_EQ (double, __DBL_MAX__, __DBL_MAX__, 1);
+  TEST_EQ (double, __DBL_MIN__, __DBL_MIN__, 1);
+  TEST_EQ (double, H, H, 1);
+
+  if (error)
+    __builtin_abort ();
+
+  return 0;
+}
diff --git a/libgcc/config/arc/ieee-754/eqdf2.S b/libgcc/config/arc/ieee-754/eqdf2.S
index bc7d88e..3b23e04 100644
--- a/libgcc/config/arc/ieee-754/eqdf2.S
+++ b/libgcc/config/arc/ieee-754/eqdf2.S
@@ -58,11 +58,14 @@ __eqdf2:
 	   well predictable (as seen from the branch predictor).  */
 __eqdf2:
 	brne.d DBL0H,DBL1H,.Lhighdiff
-	bmsk r12,DBL0H,20
-#ifdef DPFP_COMPAT
-	or.f 0,DBL0L,DBL1L
-	bset.ne r12,r12,21
-#endif /* DPFP_COMPAT */
+#ifndef __HS__
+	/* The next two instructions are required to recognize the FPX
+	NaN, which has a pattern like this: 0x7ff0_0000_8000_0000, as
+	opposed to 0x7ff8_0000_0000_0000.  */
+	or.f    0,DBL0L,DBL1L
+	bset.ne DBL0H,DBL0H,19
+#endif /* __HS__ */
+	bmsk    r12,DBL0H,20
 	add1.f	r12,r12,DBL0H /* set c iff NaN; also, clear z if NaN.  */
 	j_s.d	[blink]
 	cmp.cc	DBL0L,DBL1L
-- 
1.9.1


* [PATCH 1/6] [ARC] Don't use drsub* instructions when selecting fpuda.
  2016-04-18 14:35 [PATCH 0/6] [ARC] Various fixes Claudiu Zissulescu
                   ` (3 preceding siblings ...)
  2016-04-18 14:35 ` [PATCH 2/6] [ARC] Fix FPX/FPUDA code gen when compiling for big-endian Claudiu Zissulescu
@ 2016-04-18 14:35 ` Claudiu Zissulescu
  2016-04-28 10:05   ` Joern Wolfgang Rennecke
  2016-04-18 14:35 ` [PATCH 5/6] [ARC] Fix unwanted match for sign extend 16-bit constant Claudiu Zissulescu
  5 siblings, 1 reply; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-18 14:35 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett

The double-precision floating-point assist instructions (fpuda) do not
implement the reverse double subtract instruction (drsub) found in the
FPX extension, so this patch restricts drsub to FPX.

OK to apply?
Claudiu

gcc/
2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.md (cpu_facility): Add fpx variant.
	(subdf3): Prohibit use reverse sub when assist operations option
	is enabled.
	* config/arc/fpx.md (subdf3_insn, *dsubh_peep2_insn): Allow drsub
	instructions only when FPX is enabled.
	* testsuite/gcc.target/arc/trsub.c: New test.
---
 gcc/config/arc/arc.md                |  8 +++++++-
 gcc/config/arc/fpx.md                |  7 ++++---
 gcc/testsuite/gcc.target/arc/trsub.c | 10 ++++++++++
 3 files changed, 21 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/trsub.c

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 4193d26..9766547 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -265,7 +265,7 @@
 		     - get_attr_length (insn)")))
 
 ; for ARCv2 we need to disable/enable different instruction alternatives
-(define_attr "cpu_facility" "std,av1,av2"
+(define_attr "cpu_facility" "std,av1,av2,fpx"
   (const_string "std"))
 
 ; We should consider all the instructions enabled until otherwise
@@ -277,6 +277,10 @@
 	 (and (eq_attr "cpu_facility" "av2")
 	      (not (match_test "TARGET_V2")))
 	 (const_string "no")
+
+	 (and (eq_attr "cpu_facility" "fpx")
+	      (match_test "TARGET_FP_DP_AX"))
+	 (const_string "no")
 	 ]
 	(const_string "yes")))
 
@@ -5709,6 +5713,8 @@
   "
    if (TARGET_DPFP)
     {
+     if (TARGET_FP_DP_AX && (GET_CODE (operands[1]) == CONST_DOUBLE))
+       operands[1] = force_reg (DFmode, operands[1]);
      if ((GET_CODE (operands[1]) == CONST_DOUBLE)
           || GET_CODE (operands[2]) == CONST_DOUBLE)
       {
diff --git a/gcc/config/arc/fpx.md b/gcc/config/arc/fpx.md
index b790600..2e11157 100644
--- a/gcc/config/arc/fpx.md
+++ b/gcc/config/arc/fpx.md
@@ -304,7 +304,8 @@
      drsubh%F0%F2 0,%H1,%L1
      drsubh%F0%F2 0,%3,%L1"
   [(set_attr "type" "dpfp_addsub")
-  (set_attr "length" "4,8,4,8")])
+   (set_attr "length" "4,8,4,8")
+   (set_attr "cpu_facility" "*,*,fpx,fpx")])
 
 ;; ;; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;; ;; Peephole for following conversion
@@ -613,5 +614,5 @@
   drsubh%F0%F2 %H6, %H1, %L1
   drsubh%F0%F2 %H6, %3, %L1"
  [(set_attr "type" "dpfp_addsub")
-  (set_attr "length" "4,8,4,8")]
-)
+  (set_attr "length" "4,8,4,8")
+  (set_attr "cpu_facility" "*,*,fpx,fpx")])
diff --git a/gcc/testsuite/gcc.target/arc/trsub.c b/gcc/testsuite/gcc.target/arc/trsub.c
new file mode 100644
index 0000000..031935f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/trsub.c
@@ -0,0 +1,10 @@
+/* Tests if we generate rsub instructions when compiling using
+   floating point assist instructions.  */
+/* { dg-do compile } */
+/* { dg-options "-mfpu=fpuda -mcpu=arcem" } */
+
+double foo (double a)
+{
+  return ((double) 0.12 - a);
+}
+/* { dg-final { scan-assembler-not "drsub.*" } } */
-- 
1.9.1


* [PATCH 5/6] [ARC] Fix unwanted match for sign extend 16-bit constant.
  2016-04-18 14:35 [PATCH 0/6] [ARC] Various fixes Claudiu Zissulescu
                   ` (4 preceding siblings ...)
  2016-04-18 14:35 ` [PATCH 1/6] [ARC] Don't use drsub* instructions when selecting fpuda Claudiu Zissulescu
@ 2016-04-18 14:35 ` Claudiu Zissulescu
  2016-04-28 11:47   ` Joern Wolfgang Rennecke
  5 siblings, 1 reply; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-18 14:35 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett

The combine pass may conclude that the umulhisi3_imm pattern also accepts
sign-extended 16-bit constants. This patch prevents combine from
considering this pattern a suitable match.
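
To illustrate the ambiguity (a host-side sketch only, not the actual combine
transformation; the function names are hypothetical): the 16-bit constant
0xF424 used in the new test is 62500 when zero-extended but -3036 when
sign-extended, so the two readings give different 32-bit products.

```c
#include <stdint.h>

/* Multiply X by the 16-bit constant K, reading K as unsigned
   (zero-extended), which is what an unsigned 16x16 multiply expects.  */
static uint32_t
mul_zero_extended (uint16_t x, uint16_t k)
{
  return (uint32_t) x * (uint32_t) k;
}

/* Multiply X by the same 16-bit pattern K, but read K as a signed
   (sign-extended) value.  The conversion is spelled out by hand to
   avoid relying on implementation-defined narrowing.  */
static int32_t
mul_sign_extended (uint16_t x, uint16_t k)
{
  int32_t signed_k = (k >= 0x8000) ? (int32_t) k - 0x10000 : (int32_t) k;
  return (int32_t) x * signed_k;
}
```

With x = 2 and k = 0xF424, the zero-extended product is 125000 (0x0001e848,
the value the test expects) while the sign-extended product is -6072.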

OK to apply?
Claudiu

gcc/
2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.md (umulhisi3_imm): Avoid unwanted match for sign
	extend 16-bit constants.
	* testsuite/gcc.target/arc/umulsihi3_z.c: New file.
---
 gcc/config/arc/arc.md                      |  3 ++-
 gcc/testsuite/gcc.target/arc/umulsihi3_z.c | 23 +++++++++++++++++++++++
 2 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/umulsihi3_z.c

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 74530b1..6731072 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -1729,7 +1729,8 @@
 (define_insn "umulhisi3_imm"
   [(set (match_operand:SI 0 "register_operand"                          "=r, r,r,  r,  r")
 	(mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" " 0, r,0,  0,  r"))
-		 (match_operand:HI 2 "short_const_int_operand"          " L, L,I,C16,C16")))]
+		 (match_operand:HI 2 "short_const_int_operand"          " L, L,I,C16,C16")))
+  (use (match_dup 2))]
   "TARGET_MPYW"
   "mpyuw%? %0,%1,%2"
   [(set_attr "length" "4,4,4,8,8")
diff --git a/gcc/testsuite/gcc.target/arc/umulsihi3_z.c b/gcc/testsuite/gcc.target/arc/umulsihi3_z.c
new file mode 100644
index 0000000..cf1c00d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/umulsihi3_z.c
@@ -0,0 +1,23 @@
+/* Check if the optimizers are not removing the umulsihi3_imm
+   instruction.  */
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-inline" } */
+
+#include <stdint.h>
+
+static int32_t test (int16_t reg_val)
+{
+  int32_t x = (reg_val & 0xf) * 62500;
+  return x;
+}
+
+int main (void)
+{
+  volatile int32_t x = 0xc172;
+  x = test (x);
+
+  if (x != 0x0001e848)
+    __builtin_abort ();
+  return 0;
+}
+
-- 
1.9.1


* [PATCH 6/6] [ARC] Various instruction pattern fixes
  2016-04-18 14:35 [PATCH 0/6] [ARC] Various fixes Claudiu Zissulescu
@ 2016-04-18 14:35 ` Claudiu Zissulescu
  2016-04-18 18:26   ` Claudiu Zissulescu
  2016-04-18 14:35 ` [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library Claudiu Zissulescu
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-18 14:35 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett

OK to apply?
Claudiu

gcc/
2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.md (mulsidi3): Change operand 0 predicate to
	register_operand.
	(umulsidi3): Likewise.
	(indirect_jump): Fix jump instruction assembly patterns.
	(arcset<code>): Change operand 1 predicate to nonmemory_operand.
	(arcsetltu, arcsetgeu): Likewise.
	(arcsethi, arcsetls): Fix pattern.
---
 gcc/config/arc/arc.md | 125 +++++++++++++++++++++++++++-----------------------
 1 file changed, 67 insertions(+), 58 deletions(-)

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 6731072..170ac1c 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -1964,7 +1964,7 @@
   (set_attr "cond" "nocond,canuse,nocond,canuse_limm,canuse,nocond")])
 
 (define_expand "mulsidi3"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "")
+  [(set (match_operand:DI 0 "register_operand" "")
 	(mult:DI (sign_extend:DI(match_operand:SI 1 "register_operand" ""))
 		 (sign_extend:DI(match_operand:SI 2 "nonmemory_operand" ""))))]
   "TARGET_ANY_MPY"
@@ -2200,9 +2200,9 @@
 }")
 
 (define_expand "umulsidi3"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "")
-	(mult:DI (zero_extend:DI(match_operand:SI 1 "register_operand" ""))
-		 (zero_extend:DI(match_operand:SI 2 "nonmemory_operand" ""))))]
+  [(set (match_operand:DI 0 "register_operand" "")
+	(mult:DI (zero_extend:DI (match_operand:SI 1 "register_operand" ""))
+		 (zero_extend:DI (match_operand:SI 2 "nonmemory_operand" ""))))]
   ""
 {
   if (TARGET_MPY)
@@ -3673,7 +3673,12 @@
 (define_insn "indirect_jump"
   [(set (pc) (match_operand:SI 0 "nonmemory_operand" "L,I,Cal,Rcqq,r"))]
   ""
-  "j%!%* [%0]%&"
+  "@
+   j%!%* %0%&
+   j%!%* %0%&
+   j%!%* %0%&
+   j%!%* [%0]%&
+   j%!%* [%0]%&"
   [(set_attr "type" "jump")
    (set_attr "iscompact" "false,false,false,maybe,false")
    (set_attr "cond" "canuse,canuse_limm,canuse,canuse,canuse")])
@@ -5425,90 +5430,94 @@
 (define_code_iterator arcCC_cond [eq ne gt lt ge le])
 
 (define_insn "arcset<code>"
-  [(set (match_operand:SI 0 "register_operand"                "=r,r,r,r,r,r,r")
-	(arcCC_cond:SI (match_operand:SI 1 "register_operand"  "0,r,0,r,0,0,r")
-		       (match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I,n,n")))]
+  [(set (match_operand:SI 0 "register_operand"                "=r,r,r,r,r,r,r,r")
+	(arcCC_cond:SI (match_operand:SI 1 "nonmemory_operand" "0,r,n,0,r,0,0,r")
+		       (match_operand:SI 2 "nonmemory_operand" "r,r,r,L,L,I,n,n")))]
   "TARGET_V2 && TARGET_CODE_DENSITY"
   "set<code>%? %0, %1, %2"
-  [(set_attr "length" "4,4,4,4,4,8,8")
+  [(set_attr "length" "4,4,8,4,4,4,8,8")
    (set_attr "iscompact" "false")
    (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,yes,no,no,yes,no")
-   (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond")
+   (set_attr "predicable" "yes,no,no,yes,no,no,yes,no")
+   (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond")
    ])
 
 (define_insn "arcsetltu"
-  [(set (match_operand:SI 0 "register_operand"         "=r,r,r,r,r,  r,  r")
-	(ltu:SI (match_operand:SI 1 "register_operand"  "0,r,0,r,0,  0,  r")
-		(match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I,  n,  n")))]
+  [(set (match_operand:SI 0 "register_operand"         "=r,r,r,r,r,r,  r,  r")
+	(ltu:SI (match_operand:SI 1 "nonmemory_operand" "0,r,n,0,r,0,  0,  r")
+		(match_operand:SI 2 "nonmemory_operand" "r,r,r,L,L,I,  n,  n")))]
   "TARGET_V2 && TARGET_CODE_DENSITY"
   "setlo%? %0, %1, %2"
-  [(set_attr "length" "4,4,4,4,4,8,8")
+  [(set_attr "length" "4,4,8,4,4,4,8,8")
    (set_attr "iscompact" "false")
    (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,yes,no,no,yes,no")
-   (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond")
+   (set_attr "predicable" "yes,no,no,yes,no,no,yes,no")
+   (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond")
    ])
 
 (define_insn "arcsetgeu"
-  [(set (match_operand:SI 0 "register_operand"         "=r,r,r,r,r,  r,  r")
-	(geu:SI (match_operand:SI 1 "register_operand"  "0,r,0,r,0,  0,  r")
-		(match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I,  n,  n")))]
+  [(set (match_operand:SI 0 "register_operand"         "=r,r,r,r,r,r,  r,  r")
+	(geu:SI (match_operand:SI 1 "nonmemory_operand" "0,r,n,0,r,0,  0,  r")
+		(match_operand:SI 2 "nonmemory_operand" "r,r,r,L,L,I,  n,  n")))]
   "TARGET_V2 && TARGET_CODE_DENSITY"
   "seths%? %0, %1, %2"
-  [(set_attr "length" "4,4,4,4,4,8,8")
+  [(set_attr "length" "4,4,8,4,4,4,8,8")
    (set_attr "iscompact" "false")
    (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,yes,no,no,yes,no")
-   (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond")
+   (set_attr "predicable" "yes,no,no,yes,no,no,yes,no")
+   (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond")
    ])
 
 ;; Special cases of SETCC
 (define_insn_and_split "arcsethi"
-  [(set (match_operand:SI 0 "register_operand"         "=r,r,  r,r")
-	(gtu:SI (match_operand:SI 1 "register_operand"  "r,r,  r,r")
-		(match_operand:SI 2 "nonmemory_operand" "0,r,C62,n")))]
+  [(set (match_operand:SI 0 "register_operand"         "=r,  r,r,r")
+	(gtu:SI (match_operand:SI 1 "nonmemory_operand" "r,  r,r,n")
+		(match_operand:SI 2 "nonmemory_operand" "r,C62,n,r")))]
   "TARGET_V2 && TARGET_CODE_DENSITY"
-  "setlo%? %0, %2, %1"
-  "reload_completed
-   && CONST_INT_P (operands[2])
-   && satisfies_constraint_C62 (operands[2])"
+  "#"
+  "reload_completed"
   [(const_int 0)]
   "{
-    /* sethi a,b,u6 => seths a,b,u6 + 1.  */
-    operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
-    emit_insn (gen_arcsetgeu (operands[0], operands[1], operands[2]));
-    DONE;
+    if (CONST_INT_P (operands[2]) && satisfies_constraint_C62 (operands[2]))
+     {
+      /* sethi a,b,u6 => seths a,b,u6 + 1.  */
+      operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
+      emit_insn (gen_arcsetgeu (operands[0], operands[1], operands[2]));
+      DONE;
+     }
+   else
+    {
+     emit_insn (gen_arcsetltu (operands[0], operands[2], operands[1]));
+     DONE;
+    }
  }"
- [(set_attr "length" "4,4,4,8")
-   (set_attr "iscompact" "false")
-   (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,no,no")
-   (set_attr "cond" "canuse,nocond,nocond,nocond")]
-)
+ [(set_attr "length" "4,4,8,8")
+   (set_attr "type" "compare")])
 
 (define_insn_and_split "arcsetls"
-  [(set (match_operand:SI 0 "register_operand"         "=r,r,  r,r")
-	(leu:SI (match_operand:SI 1 "register_operand"  "r,r,  r,r")
-		(match_operand:SI 2 "nonmemory_operand" "0,r,C62,n")))]
+  [(set (match_operand:SI 0 "register_operand"         "=r,  r,r,r")
+	(leu:SI (match_operand:SI 1 "nonmemory_operand" "r,  r,r,n")
+		(match_operand:SI 2 "nonmemory_operand" "r,C62,n,r")))]
   "TARGET_V2 && TARGET_CODE_DENSITY"
-  "seths%? %0, %2, %1"
-  "reload_completed
-   && CONST_INT_P (operands[2])
-   && satisfies_constraint_C62 (operands[2])"
+  "#"
+  "reload_completed"
   [(const_int 0)]
   "{
-    /* setls a,b,u6 => setlo a,b,u6 + 1.  */
-    operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
-    emit_insn (gen_arcsetltu (operands[0], operands[1], operands[2]));
-    DONE;
- }"
- [(set_attr "length" "4,4,4,8")
-   (set_attr "iscompact" "false")
-   (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,no,no")
-   (set_attr "cond" "canuse,nocond,nocond,nocond")]
-)
+    if (CONST_INT_P (operands[2]) && satisfies_constraint_C62 (operands[2]))
+     {
+      /* setls a,b,u6 => setlo a,b,u6 + 1.  */
+      operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
+      emit_insn (gen_arcsetltu (operands[0], operands[1], operands[2]));
+      DONE;
+     }
+   else
+    {
+     emit_insn (gen_arcsetgeu (operands[0], operands[2], operands[1]));
+     DONE;
+    }
+   }"
+ [(set_attr "length" "4,4,8,8")
+   (set_attr "type" "compare")])
 
 ; Any mode that needs to be solved by secondary reload
 (define_mode_iterator SRI [QI HI])
-- 
1.9.1


* [PATCH 3/6] [ARC] Pass mfpuda to assembler.
  2016-04-18 14:35 [PATCH 0/6] [ARC] Various fixes Claudiu Zissulescu
  2016-04-18 14:35 ` [PATCH 6/6] [ARC] Various instruction pattern fixes Claudiu Zissulescu
  2016-04-18 14:35 ` [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library Claudiu Zissulescu
@ 2016-04-18 14:35 ` Claudiu Zissulescu
  2016-04-28 10:30   ` Joern Wolfgang Rennecke
  2016-04-18 14:35 ` [PATCH 2/6] [ARC] Fix FPX/FPUDA code gen when compiling for big-endian Claudiu Zissulescu
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-18 14:35 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett

OK to apply?
Claudiu

gcc/
2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.h (ASM_SPEC): Pass mfpuda to assembler.
---
 gcc/config/arc/arc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
index 1c2a38d..299e63a 100644
--- a/gcc/config/arc/arc.h
+++ b/gcc/config/arc/arc.h
@@ -153,7 +153,7 @@ along with GCC; see the file COPYING3.  If not see
 %{mcpu=ARC700:-mEA} \
 %{!mcpu=*:" ASM_DEFAULT "} \
 %{mbarrel-shifter} %{mno-mpy} %{mmul64} %{mmul32x16:-mdsp-packa} %{mnorm} \
-%{mswap} %{mEA} %{mmin-max} %{mspfp*} %{mdpfp*} \
+%{mswap} %{mEA} %{mmin-max} %{mspfp*} %{mdpfp*} %{mfpu=fpuda*:-mfpuda} \
 %{msimd} \
 %{mmac-d16} %{mmac-24} %{mdsp-packa} %{mcrc} %{mdvbf} %{mtelephony} %{mxy} \
 %{mcpu=ARC700|!mcpu=*:%{mlock}} \
-- 
1.9.1


* [PATCH 0/6] [ARC] Various fixes
@ 2016-04-18 14:35 Claudiu Zissulescu
  2016-04-18 14:35 ` [PATCH 6/6] [ARC] Various instruction pattern fixes Claudiu Zissulescu
                   ` (5 more replies)
  0 siblings, 6 replies; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-18 14:35 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett

Hi,

This series of 6 patches fixes a number of small issues found over
time in our compiler.

Patch 1 fixes the problem of using drsub* instructions when compiling
with double assist instruction support.

Patch 2 fixes big-endian emitted code when using FPX extension
instructions.
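
As a rough illustration of the word-order issue (a host-side sketch, not
GCC code; all helper names are hypothetical, and it assumes the host uses
IEEE-754 binary64 for double): the most significant 32-bit word of a
double sits at byte offset 4 in a little-endian layout and at byte offset
0 in a big-endian one, which is what the TARGET_BIG_ENDIAN ? 0 : 4
subreg offsets in the patch express.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets of the high and low 32-bit words of a 64-bit value,
   mirroring the "TARGET_BIG_ENDIAN ? 0 : 4" selection in the patch.  */
static unsigned high_word_offset (int big_endian) { return big_endian ? 0 : 4; }
static unsigned low_word_offset (int big_endian) { return big_endian ? 4 : 0; }

/* Write the 64-bit pattern of D into IMAGE using the requested byte
   order, independent of the host's own endianness.  */
static void
store_double (double d, int big_endian, unsigned char image[8])
{
  uint64_t bits;
  assert (sizeof d == sizeof bits);
  memcpy (&bits, &d, sizeof bits);
  for (unsigned i = 0; i < 8; i++)
    {
      unsigned shift = big_endian ? (7 - i) * 8 : i * 8;
      image[i] = (unsigned char) (bits >> shift);
    }
}

/* Read the 32-bit word at byte OFFSET of IMAGE in the same byte order.  */
static uint32_t
load_word (const unsigned char image[8], unsigned offset, int big_endian)
{
  uint32_t w = 0;
  for (unsigned i = 0; i < 4; i++)
    {
      unsigned shift = big_endian ? (3 - i) * 8 : i * 8;
      w |= (uint32_t) image[offset + i] << shift;
    }
  return w;
}
```

For 1.0 (0x3FF0000000000000) the word read at high_word_offset is
0x3FF00000 in both layouts, whereas always reading offset 4 yields 0 in
the big-endian layout.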

Patch 3 passes the mfpuda option to the assembler whenever we use double
assist instructions in our compilation.

Patch 4 fixes the optimized floating-point equality routine to handle
NaNs emitted by the FPX extension.
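
A small host-side sketch of why this NaN is easy to miss (not the libgcc
routine itself; the helper names are hypothetical and the host is assumed
to use IEEE-754 binary64): in 0x7FF0000080000000 the only non-zero
fraction bit lives in the low word, so a check that inspects only the
high word would see the same bits as infinity.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit patterns quoted in the patch: the FPX NaN and the usual quiet NaN.  */
#define FPX_NAN_BITS   0x7FF0000080000000ULL
#define QUIET_NAN_BITS 0x7FF8000000000000ULL

/* Reinterpret a 64-bit pattern as a double.  */
static double
bits_to_double (uint64_t bits)
{
  double d;
  assert (sizeof d == sizeof bits);
  memcpy (&d, &bits, sizeof d);
  return d;
}

/* NaN check that looks only at the fraction bits of the high word.  */
static int
is_nan_high_word_only (uint64_t bits)
{
  uint32_t hi = (uint32_t) (bits >> 32);
  return (hi & 0x7FF00000u) == 0x7FF00000u && (hi & 0x000FFFFFu) != 0;
}

/* NaN check that also takes the low word's fraction bits into account.  */
static int
is_nan_full (uint64_t bits)
{
  uint32_t hi = (uint32_t) (bits >> 32);
  uint32_t lo = (uint32_t) bits;
  return (hi & 0x7FF00000u) == 0x7FF00000u
	 && ((hi & 0x000FFFFFu) | lo) != 0;
}
```

Here is_nan_full accepts both patterns, while is_nan_high_word_only
rejects FPX_NAN_BITS. On such a host, bits_to_double (FPX_NAN_BITS)
compares unequal to itself, which is the behaviour the new ieee_eq.c
test expects from the ARC routine.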

Patch 5 fixes the case where the combiner matches a sign-extended 16-bit
number with the umulhisi3_imm pattern.

Patch 6 fixes various instruction patterns.
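
As one example, the reworked arcsethi/arcsetls splitters rely on a few
unsigned comparison identities. The sketch below restates them in plain C
(the function names are hypothetical, and it assumes the C62 constraint
means an unsigned constant in the range 0..62, so that u6 + 1 still fits
in six bits).

```c
#include <assert.h>
#include <stdint.h>

/* Reference semantics of the four unsigned set-on-condition operations.  */
static int set_hi (uint32_t b, uint32_t c) { return b > c; }   /* gtu */
static int set_hs (uint32_t b, uint32_t c) { return b >= c; }  /* geu */
static int set_lo (uint32_t b, uint32_t c) { return b < c; }   /* ltu */
static int set_ls (uint32_t b, uint32_t c) { return b <= c; }  /* leu */

/* sethi a,b,u6 => seths a,b,u6 + 1  (b > u6 iff b >= u6 + 1).  */
static int
set_hi_via_hs (uint32_t b, uint32_t u6)
{
  assert (u6 <= 62);
  return set_hs (b, u6 + 1);
}

/* setls a,b,u6 => setlo a,b,u6 + 1  (b <= u6 iff b < u6 + 1).  */
static int
set_ls_via_lo (uint32_t b, uint32_t u6)
{
  assert (u6 <= 62);
  return set_lo (b, u6 + 1);
}

/* Otherwise swap the operands: b > c iff c < b, and b <= c iff c >= b.  */
static int set_hi_swapped (uint32_t b, uint32_t c) { return set_lo (c, b); }
static int set_ls_swapped (uint32_t b, uint32_t c) { return set_hs (c, b); }
```

The bound on u6 matters only for the encoding: u6 + 1 cannot wrap for
such small values, so the rewritten comparison is exact.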

OK to apply?
Claudiu

Claudiu Zissulescu (6):
  [ARC] Don't use drsub* instructions when selecting fpuda.
  [ARC] Fix FPX/FPUDA code gen when compiling for big-endian.
  [ARC] Pass mfpuda to assembler.
  [ARC] Handle FPX NaN within optimized floating point library.
  [ARC] Fix unwanted match for sign extend 16-bit constant.
  [ARC] Various instruction pattern fixes

 gcc/config/arc/arc.c                       |  12 ++-
 gcc/config/arc/arc.h                       |   2 +-
 gcc/config/arc/arc.md                      | 154 ++++++++++++++++-------------
 gcc/config/arc/fpx.md                      |   7 +-
 gcc/testsuite/gcc.target/arc/ieee_eq.c     |  47 +++++++++
 gcc/testsuite/gcc.target/arc/trsub.c       |  10 ++
 gcc/testsuite/gcc.target/arc/umulsihi3_z.c |  23 +++++
 libgcc/config/arc/ieee-754/eqdf2.S         |  13 ++-
 8 files changed, 186 insertions(+), 82 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/ieee_eq.c
 create mode 100644 gcc/testsuite/gcc.target/arc/trsub.c
 create mode 100644 gcc/testsuite/gcc.target/arc/umulsihi3_z.c

-- 
1.9.1


* Re: [PATCH 6/6] [ARC] Various instruction pattern fixes
  2016-04-18 14:35 ` [PATCH 6/6] [ARC] Various instruction pattern fixes Claudiu Zissulescu
@ 2016-04-18 18:26   ` Claudiu Zissulescu
  2016-04-28 12:31     ` Joern Wolfgang Rennecke
  0 siblings, 1 reply; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-18 18:26 UTC (permalink / raw)
  To: Claudiu Zissulescu, gcc-patches; +Cc: gnu, Francois.Bedard, jeremy.bennett

I forgot to add the reload cases. Here is the updated patch.

//Claudiu


gcc/
2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.md (mulsidi3): Change operand 0 predicate to
	register_operand.
	(umulsidi3): Likewise.
	(indirect_jump): Fix jump instruction assembly patterns.
	(arcset<code>): Change operand 1 predicate to nonmemory_operand.
	(arcsetltu, arcsetgeu): Likewise.
	(arcsethi, arcsetls): Fix pattern.
---
 gcc/config/arc/arc.md | 146 ++++++++++++++++++++++++++++----------------------
 1 file changed, 83 insertions(+), 63 deletions(-)

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 6731072..9d87b76 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -1964,7 +1964,7 @@
   (set_attr "cond" "nocond,canuse,nocond,canuse_limm,canuse,nocond")])
 
 (define_expand "mulsidi3"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "")
+  [(set (match_operand:DI 0 "register_operand" "")
 	(mult:DI (sign_extend:DI(match_operand:SI 1 "register_operand" ""))
 		 (sign_extend:DI(match_operand:SI 2 "nonmemory_operand" ""))))]
   "TARGET_ANY_MPY"
@@ -2200,9 +2200,9 @@
 }")
 
 (define_expand "umulsidi3"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "")
-	(mult:DI (zero_extend:DI(match_operand:SI 1 "register_operand" ""))
-		 (zero_extend:DI(match_operand:SI 2 "nonmemory_operand" ""))))]
+  [(set (match_operand:DI 0 "register_operand" "")
+	(mult:DI (zero_extend:DI (match_operand:SI 1 "register_operand" ""))
+		 (zero_extend:DI (match_operand:SI 2 "nonmemory_operand" ""))))]
   ""
 {
   if (TARGET_MPY)
@@ -3673,7 +3673,12 @@
 (define_insn "indirect_jump"
   [(set (pc) (match_operand:SI 0 "nonmemory_operand" "L,I,Cal,Rcqq,r"))]
   ""
-  "j%!%* [%0]%&"
+  "@
+   j%!%* %0%&
+   j%!%* %0%&
+   j%!%* %0%&
+   j%!%* [%0]%&
+   j%!%* [%0]%&"
   [(set_attr "type" "jump")
    (set_attr "iscompact" "false,false,false,maybe,false")
    (set_attr "cond" "canuse,canuse_limm,canuse,canuse,canuse")])
@@ -5425,90 +5430,105 @@
 (define_code_iterator arcCC_cond [eq ne gt lt ge le])
 
 (define_insn "arcset<code>"
-  [(set (match_operand:SI 0 "register_operand"                "=r,r,r,r,r,r,r")
-	(arcCC_cond:SI (match_operand:SI 1 "register_operand"  "0,r,0,r,0,0,r")
-		       (match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I,n,n")))]
-  "TARGET_V2 && TARGET_CODE_DENSITY"
+  [(set (match_operand:SI 0 "register_operand"                "=r,r,r,r,r,r,r,r")
+	(arcCC_cond:SI (match_operand:SI 1 "nonmemory_operand" "0,r,n,0,r,0,0,r")
+		       (match_operand:SI 2 "nonmemory_operand" "r,r,r,L,L,I,n,n")))]
+  "TARGET_V2 && TARGET_CODE_DENSITY
+   && (register_operand (operands[1], SImode)
+       || register_operand (operands[2], SImode))"
   "set<code>%? %0, %1, %2"
-  [(set_attr "length" "4,4,4,4,4,8,8")
+  [(set_attr "length" "4,4,8,4,4,4,8,8")
    (set_attr "iscompact" "false")
    (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,yes,no,no,yes,no")
-   (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond")
+   (set_attr "predicable" "yes,no,no,yes,no,no,yes,no")
+   (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond")
    ])
 
 (define_insn "arcsetltu"
-  [(set (match_operand:SI 0 "register_operand"         "=r,r,r,r,r,  r,  r")
-	(ltu:SI (match_operand:SI 1 "register_operand"  "0,r,0,r,0,  0,  r")
-		(match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I,  n,  n")))]
-  "TARGET_V2 && TARGET_CODE_DENSITY"
+  [(set (match_operand:SI 0 "register_operand"         "=r,r,r,r,r,r,  r,  r")
+	(ltu:SI (match_operand:SI 1 "nonmemory_operand" "0,r,n,0,r,0,  0,  r")
+		(match_operand:SI 2 "nonmemory_operand" "r,r,r,L,L,I,  n,  n")))]
+  "TARGET_V2 && TARGET_CODE_DENSITY
+   && (register_operand (operands[1], SImode)
+       || register_operand (operands[2], SImode))"
   "setlo%? %0, %1, %2"
-  [(set_attr "length" "4,4,4,4,4,8,8")
+  [(set_attr "length" "4,4,8,4,4,4,8,8")
    (set_attr "iscompact" "false")
    (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,yes,no,no,yes,no")
-   (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond")
+   (set_attr "predicable" "yes,no,no,yes,no,no,yes,no")
+   (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond")
    ])
 
 (define_insn "arcsetgeu"
-  [(set (match_operand:SI 0 "register_operand"         "=r,r,r,r,r,  r,  r")
-	(geu:SI (match_operand:SI 1 "register_operand"  "0,r,0,r,0,  0,  r")
-		(match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I,  n,  n")))]
-  "TARGET_V2 && TARGET_CODE_DENSITY"
+  [(set (match_operand:SI 0 "register_operand"         "=r,r,r,r,r,r,  r,  r")
+	(geu:SI (match_operand:SI 1 "nonmemory_operand" "0,r,n,0,r,0,  0,  r")
+		(match_operand:SI 2 "nonmemory_operand" "r,r,r,L,L,I,  n,  n")))]
+  "TARGET_V2 && TARGET_CODE_DENSITY
+   && (register_operand (operands[1], SImode)
+       || register_operand (operands[2], SImode))"
   "seths%? %0, %1, %2"
-  [(set_attr "length" "4,4,4,4,4,8,8")
+  [(set_attr "length" "4,4,8,4,4,4,8,8")
    (set_attr "iscompact" "false")
    (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,yes,no,no,yes,no")
-   (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond")
+   (set_attr "predicable" "yes,no,no,yes,no,no,yes,no")
+   (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond")
    ])
 
 ;; Special cases of SETCC
 (define_insn_and_split "arcsethi"
-  [(set (match_operand:SI 0 "register_operand"         "=r,r,  r,r")
-	(gtu:SI (match_operand:SI 1 "register_operand"  "r,r,  r,r")
-		(match_operand:SI 2 "nonmemory_operand" "0,r,C62,n")))]
-  "TARGET_V2 && TARGET_CODE_DENSITY"
-  "setlo%? %0, %2, %1"
-  "reload_completed
-   && CONST_INT_P (operands[2])
-   && satisfies_constraint_C62 (operands[2])"
+  [(set (match_operand:SI 0 "register_operand"         "=r,  r,r,r")
+	(gtu:SI (match_operand:SI 1 "nonmemory_operand" "r,  r,r,n")
+		(match_operand:SI 2 "nonmemory_operand" "r,C62,n,r")))]
+  "TARGET_V2 && TARGET_CODE_DENSITY
+   && (register_operand (operands[1], SImode)
+       || register_operand (operands[2], SImode))"
+
+  "#"
+  "reload_completed"
   [(const_int 0)]
   "{
-    /* sethi a,b,u6 => seths a,b,u6 + 1.  */
-    operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
-    emit_insn (gen_arcsetgeu (operands[0], operands[1], operands[2]));
-    DONE;
+    if (CONST_INT_P (operands[2]) && satisfies_constraint_C62 (operands[2]))
+     {
+      /* sethi a,b,u6 => seths a,b,u6 + 1.  */
+      operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
+      emit_insn (gen_arcsetgeu (operands[0], operands[1], operands[2]));
+      DONE;
+     }
+   else
+    {
+     emit_insn (gen_arcsetltu (operands[0], operands[2], operands[1]));
+     DONE;
+    }
  }"
- [(set_attr "length" "4,4,4,8")
-   (set_attr "iscompact" "false")
-   (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,no,no")
-   (set_attr "cond" "canuse,nocond,nocond,nocond")]
-)
+ [(set_attr "length" "4,4,8,8")
+   (set_attr "type" "compare")])
 
 (define_insn_and_split "arcsetls"
-  [(set (match_operand:SI 0 "register_operand"         "=r,r,  r,r")
-	(leu:SI (match_operand:SI 1 "register_operand"  "r,r,  r,r")
-		(match_operand:SI 2 "nonmemory_operand" "0,r,C62,n")))]
-  "TARGET_V2 && TARGET_CODE_DENSITY"
-  "seths%? %0, %2, %1"
-  "reload_completed
-   && CONST_INT_P (operands[2])
-   && satisfies_constraint_C62 (operands[2])"
+  [(set (match_operand:SI 0 "register_operand"         "=r,  r,r,r")
+	(leu:SI (match_operand:SI 1 "nonmemory_operand" "r,  r,r,n")
+		(match_operand:SI 2 "nonmemory_operand" "r,C62,n,r")))]
+  "TARGET_V2 && TARGET_CODE_DENSITY
+   && (register_operand (operands[1], SImode)
+       || register_operand (operands[2], SImode))"
+  "#"
+  "reload_completed"
   [(const_int 0)]
   "{
-    /* setls a,b,u6 => setlo a,b,u6 + 1.  */
-    operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
-    emit_insn (gen_arcsetltu (operands[0], operands[1], operands[2]));
-    DONE;
- }"
- [(set_attr "length" "4,4,4,8")
-   (set_attr "iscompact" "false")
-   (set_attr "type" "compare")
-   (set_attr "predicable" "yes,no,no,no")
-   (set_attr "cond" "canuse,nocond,nocond,nocond")]
-)
+    if (CONST_INT_P (operands[2]) && satisfies_constraint_C62 (operands[2]))
+     {
+      /* setls a,b,u6 => setlo a,b,u6 + 1.  */
+      operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
+      emit_insn (gen_arcsetltu (operands[0], operands[1], operands[2]));
+      DONE;
+     }
+   else
+    {
+     emit_insn (gen_arcsetgeu (operands[0], operands[2], operands[1]));
+     DONE;
+    }
+   }"
+ [(set_attr "length" "4,4,8,8")
+   (set_attr "type" "compare")])
 
 ; Any mode that needs to be solved by secondary reload
 (define_mode_iterator SRI [QI HI])
-- 
2.5.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/6] [ARC] Don't use drsub* instructions when selecting fpuda.
  2016-04-18 14:35 ` [PATCH 1/6] [ARC] Don't use drsub* instructions when selecting fpuda Claudiu Zissulescu
@ 2016-04-28 10:05   ` Joern Wolfgang Rennecke
  2016-04-28 12:16     ` Claudiu Zissulescu
  0 siblings, 1 reply; 34+ messages in thread
From: Joern Wolfgang Rennecke @ 2016-04-28 10:05 UTC (permalink / raw)
  To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett



On 18/04/16 15:33, Claudiu Zissulescu wrote:
> The double-precision floating-point assist instructions do not
> implement the reverse double subtract instruction (drsub) found in
> the FPX extension; hence this patch.
>
> OK to apply?
> Claudiu
>
> gcc/
> 2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>
>
> 	* config/arc/arc.md (cpu_facility): Add fpx variant.
> 	(subdf3): Prohibit use reverse sub when assist operations option
> 	is enabled.
> 	* config/arc/fpx.md (subdf3_insn, *dsubh_peep2_insn): Allow drsub
> 	instructions only when FPX is enabled.
>          * testsuite/gcc.target/arc/trsub.c: New test.
>
  OK.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/6] [ARC] Fix FPX/FPUDA code gen when compiling for big-endian.
  2016-04-18 14:35 ` [PATCH 2/6] [ARC] Fix FPX/FPUDA code gen when compiling for big-endian Claudiu Zissulescu
@ 2016-04-28 10:29   ` Joern Wolfgang Rennecke
  2016-04-28 12:54     ` Claudiu Zissulescu
  0 siblings, 1 reply; 34+ messages in thread
From: Joern Wolfgang Rennecke @ 2016-04-28 10:29 UTC (permalink / raw)
  To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett



On 18/04/16 15:33, Claudiu Zissulescu wrote:
> OK to apply?
> Claudiu
>
> gcc/
> 2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>
>
> 	* config/arc/arc.c (arc_process_double_reg_moves): Fix for
> 	big-endian compilation.
> 	* config/arc/arc.md (addf3): Likewise.
> 	(subdf3): Likewise.
> 	(muldf3): Likewise.
>
  OK.

FWIW, there is also a FIXME for a little-endian-centric use of 
split_double in arc.c:arc_rtx_costs.
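
For illustration only, the word-order issue behind this patch can be
sketched in plain C.  This is a host-side model, not GCC code: the
helper names below are hypothetical, and big_endian is an ordinary flag
standing in for TARGET_BIG_ENDIAN.  The offsets 0 and 4 are the ones
the patch passes to simplify_gen_subreg.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of the most significant 32-bit word of an 8-byte value.
   Mirrors the "TARGET_BIG_ENDIAN ? 0 : 4" choice made in the patch.  */
static unsigned high_word_offset (int big_endian)
{
  return big_endian ? 0u : 4u;
}

/* Byte offset of the least significant 32-bit word.  */
static unsigned low_word_offset (int big_endian)
{
  return big_endian ? 4u : 0u;
}

/* Store a 64-bit value into buf using the requested byte order.  */
static void store64 (unsigned char buf[8], uint64_t v, int big_endian)
{
  for (unsigned i = 0; i < 8; i++)
    {
      unsigned shift = big_endian ? 8 * (7 - i) : 8 * i;
      buf[i] = (unsigned char) (v >> shift);
    }
}

/* Load the 32-bit word at byte offset off, using the same byte order.  */
static uint32_t load32 (const unsigned char *buf, unsigned off, int big_endian)
{
  uint32_t w = 0;
  for (unsigned i = 0; i < 4; i++)
    {
      unsigned shift = big_endian ? 8 * (3 - i) : 8 * i;
      w |= (uint32_t) buf[off + i] << shift;
    }
  return w;
}

/* Self-check: the high word of DBL_MAX's bit pattern is found at the
   endian-dependent offset in both layouts.  */
void check_word_offsets (void)
{
  unsigned char buf[8];
  for (int be = 0; be <= 1; be++)
    {
      store64 (buf, 0x7fefffffffffffffULL, be);
      assert (load32 (buf, high_word_offset (be), be) == 0x7fefffffu);
    }
}
```

Using the little-endian offsets on a big-endian layout swaps the two
halves, which is the kind of mistake a fixed "4 for high, 0 for low"
assumption makes.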

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 3/6] [ARC] Pass mfpuda to assembler.
  2016-04-18 14:35 ` [PATCH 3/6] [ARC] Pass mfpuda to assembler Claudiu Zissulescu
@ 2016-04-28 10:30   ` Joern Wolfgang Rennecke
  2016-04-28 13:10     ` Claudiu Zissulescu
  0 siblings, 1 reply; 34+ messages in thread
From: Joern Wolfgang Rennecke @ 2016-04-28 10:30 UTC (permalink / raw)
  To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett



On 18/04/16 15:33, Claudiu Zissulescu wrote:
> OK to apply?
> Claudiu
>
> gcc/
> 2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>
>
> 	* config/arc/arc.h (ASM_SPEC): Pass mfpuda to assembler.
>
  OK.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library.
  2016-04-18 14:35 ` [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library Claudiu Zissulescu
@ 2016-04-28 11:27   ` Joern Wolfgang Rennecke
  2016-04-28 11:35     ` Claudiu Zissulescu
  0 siblings, 1 reply; 34+ messages in thread
From: Joern Wolfgang Rennecke @ 2016-04-28 11:27 UTC (permalink / raw)
  To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett



On 18/04/16 15:33, Claudiu Zissulescu wrote:
> OK to apply?
No.  You are clobbering DBL0H.

Besides, why would you change any of the code, apart from the argument
to #ifdef and the comments?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library.
  2016-04-28 11:27   ` Joern Wolfgang Rennecke
@ 2016-04-28 11:35     ` Claudiu Zissulescu
  2016-04-28 11:41       ` Joern Wolfgang Rennecke
  0 siblings, 1 reply; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-28 11:35 UTC (permalink / raw)
  To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

> Besides, why would you change any of the code, apart from the argument
> to #ifdef and the comments?

It does not work: it gives wrong results. I think the test shows this if you run it without all the libgcc mods.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library.
  2016-04-28 11:35     ` Claudiu Zissulescu
@ 2016-04-28 11:41       ` Joern Wolfgang Rennecke
  2016-04-28 11:43         ` Claudiu Zissulescu
  2016-04-28 14:12         ` Claudiu Zissulescu
  0 siblings, 2 replies; 34+ messages in thread
From: Joern Wolfgang Rennecke @ 2016-04-28 11:41 UTC (permalink / raw)
  To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett



On 28/04/16 12:35, Claudiu Zissulescu wrote:
>> Besides, why would you change any of the code, apart from the argument
>> to #ifdef and the comments?
> It does not work: it gives wrong results. I think the test shows this if you run it without all the libgcc mods.
I can't.

Where exactly does the test go wrong?
Can you show a trace of __eqdf2 with register values?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library.
  2016-04-28 11:41       ` Joern Wolfgang Rennecke
@ 2016-04-28 11:43         ` Claudiu Zissulescu
  2016-04-28 14:12         ` Claudiu Zissulescu
  1 sibling, 0 replies; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-28 11:43 UTC (permalink / raw)
  To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

> 
> Where exactly does the test go wrong?

I will try to trace it back to when I developed it. Too much time has passed since then. It is probably something related to big-endian.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 5/6] [ARC] Fix unwanted match for sign extend 16-bit constant.
  2016-04-18 14:35 ` [PATCH 5/6] [ARC] Fix unwanted match for sign extend 16-bit constant Claudiu Zissulescu
@ 2016-04-28 11:47   ` Joern Wolfgang Rennecke
  2016-04-28 17:12     ` [PATCH] " Claudiu Zissulescu
  0 siblings, 1 reply; 34+ messages in thread
From: Joern Wolfgang Rennecke @ 2016-04-28 11:47 UTC (permalink / raw)
  To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett



On 18/04/16 15:33, Claudiu Zissulescu wrote:
> The combine pass may conclude that the umulhisi3_imm pattern also
> accepts sign-extended 16-bit constants. This patch prevents combine
> from considering this pattern suitable.
>
> OK to apply?
> Claudiu
>
> gcc/
> 2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>
>
> 	* config/arc/arc.md (umulhisi3_imm): Avoid unwanted match for sign
> 	extend 16-bit constants.
...
> 	* testsuite/gcc.target/arc/umulsihi3_z.c: New file.
> -		 (match_operand:HI 2 "short_const_int_operand"          " L, L,I,C16,C16")))]
> +		 (match_operand:HI 2 "short_const_int_operand"          " L, L,I,C16,C16")))
> +  (use (match_dup 2))]
>
  That's not the way to fix it.  Get the predicates and constraints right.
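
A rough sketch of the mismatch under discussion, in plain C rather than
RTL.  The constant 62500 and the input 0xc172 come from the umulsihi3_z.c
test; the helper names are hypothetical, and fits_unsigned16 is only a
stand-in for the kind of range check a predicate or constraint would do.
It relies on the fact that a 16-bit constant with the top bit set looks
negative once it is kept sign-extended.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical range check: does v fit in 16 bits as an unsigned
   quantity?  A sign-extended "negative" constant must be rejected.  */
static int fits_unsigned16 (int64_t v)
{
  return v >= 0 && v < 0x10000;
}

/* The value a 16-bit pattern has when it is stored sign-extended.  */
static int32_t as_sign_extended16 (uint16_t bits)
{
  return (bits & 0x8000) ? (int32_t) bits - 0x10000 : (int32_t) bits;
}

/* Widening unsigned multiply: both 16-bit inputs are zero-extended,
   which is the arithmetic an unsigned 16x16->32 multiply performs.  */
static uint32_t umul16 (uint16_t a, uint16_t b)
{
  return (uint32_t) a * (uint32_t) b;
}

/* Self-check using the numbers from the test case.  */
void check_umul16 (void)
{
  uint16_t masked = 0xc172 & 0xf;          /* 2 */
  assert (umul16 (masked, 62500) == 0x0001e848u);
}
```

With these numbers, 62500 (0xf424) reads as -3036 when sign-extended, so
a signed interpretation of the same bits would give 2 * -3036 instead of
the expected 125000 (0x1e848).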

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 1/6] [ARC] Don't use drsub* instructions when selecting fpuda.
  2016-04-28 10:05   ` Joern Wolfgang Rennecke
@ 2016-04-28 12:16     ` Claudiu Zissulescu
  0 siblings, 0 replies; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-28 12:16 UTC (permalink / raw)
  To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

Committed r235562.

Thanks,
Claudiu

> >
> > gcc/
> > 2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>
> >
> > 	* config/arc/arc.md (cpu_facility): Add fpx variant.
> > 	(subdf3): Prohibit use reverse sub when assist operations option
> > 	is enabled.
> > 	* config/arc/fpx.md (subdf3_insn, *dsubh_peep2_insn): Allow drsub
> > 	instructions only when FPX is enabled.
> >          * testsuite/gcc.target/arc/trsub.c: New test.
> >
>   OK.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 6/6] [ARC] Various instruction pattern fixes
  2016-04-18 18:26   ` Claudiu Zissulescu
@ 2016-04-28 12:31     ` Joern Wolfgang Rennecke
  2016-05-02 11:21       ` Claudiu Zissulescu
  0 siblings, 1 reply; 34+ messages in thread
From: Joern Wolfgang Rennecke @ 2016-04-28 12:31 UTC (permalink / raw)
  To: Claudiu Zissulescu, Claudiu Zissulescu, gcc-patches
  Cc: Francois.Bedard, jeremy.bennett



On 18/04/16 19:25, Claudiu Zissulescu wrote:
> Forgot to add the reload cases. Here is the updated patch.
>
> //Claudiu
>
>
> gcc/
> 2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>
>
> 	* config/arc/arc.md (mulsidi3): Change operand 0 predicate to
> 	register_operand.
> 	(umulsidi3): Likewise.
> 	(indirect_jump): Fix jump instruction assembly patterns.
> 	(arcset<code>): Change operand 1 predicate to nonmemory_operand.
> 	(arcsetltu, arcsetgeu): Likewise.
ChangeLog omission: You are also adding an r/n/r alternative.
> 	(arcsethi, arcsetls): Fix pattern.
Otherwise this is OK.

If the constant / register comparisons come from an expander, in
general the expander should be fixed to swap the operands and
use the swapped comparison code, to get canonical rtl.
OTOH, constant re-materialization during register allocation can change
a reg-reg into a constant-reg comparison, and at that stage,
canonicalization would not be expected.
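
As a sanity check of the rewrites used in the arcsethi/arcsetls splits,
here is a small C sketch of the unsigned identities involved.  The
function names are hypothetical.  The bound of 62 on the immediate is an
assumption about what the C62 constraint permits (a u6 value that still
fits in u6 after adding 1); the identities themselves only need
imm + 1 not to wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned comparisons, named after the RTL codes in the patterns.  */
static int gtu (uint32_t a, uint32_t b) { return a > b; }
static int geu (uint32_t a, uint32_t b) { return a >= b; }
static int ltu (uint32_t a, uint32_t b) { return a < b; }
static int leu (uint32_t a, uint32_t b) { return a <= b; }

/* Returns 1 when all four rewrites agree with the original comparison.
   imm is assumed small enough that imm + 1 does not wrap around.  */
static int rewrites_hold (uint32_t b, uint32_t c, uint32_t imm)
{
  return gtu (b, c) == ltu (c, b)            /* sethi b,c  => setlo c,b     */
	 && leu (b, c) == geu (c, b)         /* setls b,c  => seths c,b     */
	 && gtu (b, imm) == geu (b, imm + 1) /* sethi b,u6 => seths b,u6+1  */
	 && leu (b, imm) == ltu (b, imm + 1);/* setls b,u6 => setlo b,u6+1  */
}

/* Self-check over the assumed immediate range 0..62.  */
void check_rewrites (void)
{
  for (uint32_t imm = 0; imm <= 62; imm++)
    {
      assert (rewrites_hold (imm, imm + 1, imm));
      assert (rewrites_hold (0xffffffffu, 0, imm));
    }
}
```

The "+ 1" forms break when the immediate is the maximum value, because
imm + 1 wraps to 0; that is why the immediate rewrite needs an upper
bound on the constant, while the operand swap does not.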

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 2/6] [ARC] Fix FPX/FPUDA code gen when compiling for big-endian.
  2016-04-28 10:29   ` Joern Wolfgang Rennecke
@ 2016-04-28 12:54     ` Claudiu Zissulescu
  0 siblings, 0 replies; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-28 12:54 UTC (permalink / raw)
  To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

Fixed naming in arc_rtx_costs, committed r235567.

Thanks,
Claudiu

> > gcc/
> > 2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>
> >
> > 	* config/arc/arc.c (arc_process_double_reg_moves): Fix for
> > 	big-endian compilation.
> > 	* config/arc/arc.md (addf3): Likewise.
> > 	(subdf3): Likewise.
> > 	(muldf3): Likewise.
> >
>   OK.
> 
> FWIW, there is also a FIXME for a little-endian-centric use of
> split_double in arc.c:arc_rtx_costs.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 3/6] [ARC] Pass mfpuda to assembler.
  2016-04-28 10:30   ` Joern Wolfgang Rennecke
@ 2016-04-28 13:10     ` Claudiu Zissulescu
  0 siblings, 0 replies; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-28 13:10 UTC (permalink / raw)
  To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

Committed r235568.

Thanks,
Claudiu

> > gcc/
> > 2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>
> >
> > 	* config/arc/arc.h (ASM_SPEC): Pass mfpuda to assembler.
> >
>   OK.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library.
  2016-04-28 11:41       ` Joern Wolfgang Rennecke
  2016-04-28 11:43         ` Claudiu Zissulescu
@ 2016-04-28 14:12         ` Claudiu Zissulescu
  2016-04-28 15:03           ` Joern Wolfgang Rennecke
  1 sibling, 1 reply; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-28 14:12 UTC (permalink / raw)
  To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

Hi,

> Where exactly does the test go wrong?

The test which fails is this one: 
	TEST_EQ (double, __DBL_MAX__, __DBL_MAX__, 1);
From the test file included in the patch.

> Can you show a trace of __eqdf2 with register values?

Sure thing, running for ARC700, using the original implementation with the guarded FPX-handling code enabled:

[0x000002a2] 0xc000                 K Z    ld_s           r0,[sp,0x0] : lw [0x5000c0c0] => 0xffffffff : (w1) r0 <= 0xffffffff *
[0x000002a4] 0xc101                 K Z    ld_s           r1,[sp,0x4] : lw [0x5000c0c4] => 0x7fefffff : (w1) r1 <= 0x7fefffff *
[0x000002a6] 0xc202                 K Z    ld_s           r2,[sp,0x8] : lw [0x5000c0c8] => 0xffffffff : (w1) r2 <= 0xffffffff *
[0x000002a8] 0xc303                 K Z    ld_s           r3,[sp,0xc] : lw [0x5000c0cc] => 0x7fefffff : (w1) r3 <= 0x7fefffff *
[0x000002aa] 0x0aea0000             K Z    bl             0x2e8 : (w0) r31 <= 0x000002ae *
[0x00000590] 0x091d00e1             K Z    brne.d         r1,r3,0x1c
[0x00000594] 0x2153050c             K Z    bmsk           r12,r1,0x14 : (w0) r12 <= 0x000fffff *
[0x00000598] 0x200580be             K Z    or.f           0,r0,r2 *
[0x0000059c] 0x24cf1562             K  N   bset.ne        r12,r12,0x15 : (w0) r12 <= 0x002fffff *
[0x000005a0] 0x2414904c             K  N   add1.f         r12,r12,r1 : (w0) r12 <= 0x000ffffd *
[0x000005a4] 0x7fe0                 K   C  j_s.d          [blink] *
[0x000005a6] 0x20cc8086             KD  C  cmp.cc         r0,r2
 
For reference, the routine:

	.global __eqdf2
	.balign 4
	HIDDEN_FUNC(__eqdf2)
	/* Good performance as long as the difference in high word is
	   well predictable (as seen from the branch predictor).  */
__eqdf2:
	brne.d DBL0H,DBL1H,.Lhighdiff
	bmsk    r12,DBL0H,20
#ifndef __HS__
	/* The next two instructions are required to recognize the FPX
	NaN, which has a pattern like this: 0x7ff0_0000_8000_0000, as
	oposite to 0x7ff8_0000_0000_0000.  */
	or.f    0,DBL0L,DBL1L
	bset.ne r12,r12,21
#endif /* __HS__ */
	add1.f	r12,r12,DBL0H /* set c iff NaN; also, clear z if NaN.  */
	j_s.d	[blink]
	cmp.cc	DBL0L,DBL1L
	.balign 4
.Lhighdiff:
	or	r12,DBL0H,DBL1H
	or.f	0,DBL0L,DBL1L
	j_s.d	[blink]
	bmsk.eq.f r12,r12,30
	ENDFUNC(__eqdf2)

All those results were collected using nsimfree.

Please let me know if you need more info,
Claudiu

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library.
  2016-04-28 14:12         ` Claudiu Zissulescu
@ 2016-04-28 15:03           ` Joern Wolfgang Rennecke
  2016-04-29 10:18             ` [PATCH] " Claudiu Zissulescu
  0 siblings, 1 reply; 34+ messages in thread
From: Joern Wolfgang Rennecke @ 2016-04-28 15:03 UTC (permalink / raw)
  To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett



On 28/04/16 15:11, Claudiu Zissulescu wrote:
> Sure thing, running for ARC700, using the original implementation with the guarded FPX-handling code enabled:
>
> [0x000002a2] 0xc000                 K Z    ld_s           r0,[sp,0x0] : lw [0x5000c0c0] => 0xffffffff : (w1) r0 <= 0xffffffff *
> [0x000002a4] 0xc101                 K Z    ld_s           r1,[sp,0x4] : lw [0x5000c0c4] => 0x7fefffff : (w1) r1 <= 0x7fefffff *
> [0x000002a6] 0xc202                 K Z    ld_s           r2,[sp,0x8] : lw [0x5000c0c8] => 0xffffffff : (w1) r2 <= 0xffffffff *
> [0x000002a8] 0xc303                 K Z    ld_s           r3,[sp,0xc] : lw [0x5000c0cc] => 0x7fefffff : (w1) r3 <= 0x7fefffff *
> [0x000002aa] 0x0aea0000             K Z    bl             0x2e8 : (w0) r31 <= 0x000002ae *
> [0x00000590] 0x091d00e1             K Z    brne.d         r1,r3,0x1c
> [0x00000594] 0x2153050c             K Z    bmsk           r12,r1,0x14 : (w0) r12 <= 0x000fffff *
> [0x00000598] 0x200580be             K Z    or.f           0,r0,r2 *
> [0x0000059c] 0x24cf1562             K  N   bset.ne        r12,r12,0x15 : (w0) r12 <= 0x002fffff *
> [0x000005a0] 0x2414904c             K  N   add1.f         r12,r12,r1 : (w0) r12 <= 0x000ffffd *
> [0x000005a4] 0x7fe0                 K   C  j_s.d          [blink] *
> [0x000005a6] 0x20cc8086             KD  C  cmp.cc         r0,r2
>   
>   
I see, we basically have an overflow.
I think the DPFP_COMPAT / __HS__ variant should be something like:

         brne DBL0H,DBL1H,.Lhighdiff
         mov_s r12,0x00200000
         or.f 0,DBL0L,DBL1L
         bset.ne r12,r12,0

         add1.f  r12,r12,DBL0H /* set c iff NaN; also, clear z if NaN.  */
         j_s.d   [blink]
         cmp.cc  DBL0L,DBL1L
...

Where the mov_s could be replaced with something else that loads the
same value, depending on what instructions are supported.
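
The overflow can be reproduced with a small C model of the arithmetic.
This is only a sketch: it assumes add1.f computes b + (c << 1) in 32
bits and sets carry on unsigned overflow of that sum, which is
consistent with the trace above, and the function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct flags
{
  uint32_t result;
  int carry;
  int zero;
};

/* Assumed model of "add1.f dst,b,c": dst = b + (c << 1), recording
   the carry out of bit 31 and whether the result is zero.  */
static struct flags add1_f (uint32_t b, uint32_t c)
{
  uint64_t wide = (uint64_t) b + (uint64_t) (uint32_t) (c << 1);
  struct flags f;
  f.result = (uint32_t) wide;
  f.carry = (int) (wide >> 32);
  f.zero = f.result == 0;
  return f;
}

/* Sequence from the trace:
   bmsk r12,DBL0H,20 ; or.f 0,DBL0L,DBL1L ; bset.ne r12,r12,21.  */
static uint32_t r12_traced (uint32_t hi, uint32_t lo0, uint32_t lo1)
{
  uint32_t r12 = hi & 0x001fffffu;	/* bmsk keeps bits 0..20.  */
  if ((lo0 | lo1) != 0)
    r12 |= 1u << 21;			/* bset.ne  */
  return r12;
}

/* Sequence suggested in the reply:
   mov_s r12,0x00200000 ; or.f 0,DBL0L,DBL1L ; bset.ne r12,r12,0.  */
static uint32_t r12_suggested (uint32_t lo0, uint32_t lo1)
{
  uint32_t r12 = 0x00200000u;
  if ((lo0 | lo1) != 0)
    r12 |= 1u;				/* bset.ne  */
  return r12;
}

/* Self-check against the register values shown in the trace.  */
void check_trace (void)
{
  uint32_t r12 = r12_traced (0x7fefffffu, 0xffffffffu, 0xffffffffu);
  struct flags f = add1_f (r12, 0x7fefffffu);
  assert (r12 == 0x002fffffu);
  assert (f.result == 0x000ffffdu && f.carry == 1);
}
```

In this model DBL_MAX (high word 0x7fefffff) sets carry with the traced
sequence but not with the suggested one, while the FPX NaN pattern
(high word 0x7ff00000, low word 0x80000000) still sets carry with a
non-zero result.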

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH] [ARC] Fix unwanted match for sign extend 16-bit constant.
  2016-04-28 11:47   ` Joern Wolfgang Rennecke
@ 2016-04-28 17:12     ` Claudiu Zissulescu
  2016-04-28 17:46       ` Joern Wolfgang Rennecke
  0 siblings, 1 reply; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-28 17:12 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett

Please find the updated patch.

Claudiu

gcc/
2016-04-28  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.h (UNSIGNED_INT12, UNSIGNED_INT16): Define.
	* config/arc/arc.md (umulhisi3): Use arc_short_operand predicate.
	(umulhisi3_imm): Update predicates and constraint letters.
	(umulhisi3_reg): Declare instruction as commutative.
	* config/arc/constraints.md (U12, U16): New constraints.
	* config/arc/predicates.md (short_unsigned_const_operand): New
	predicate.
	(arc_short_operand): Likewise.
	* testsuite/gcc.target/arc/umulsihi3_z.c: New file.
---
 gcc/config/arc/arc.h                       |  2 ++
 gcc/config/arc/arc.md                      | 14 +++++++-------
 gcc/config/arc/constraints.md              | 11 +++++++++++
 gcc/config/arc/predicates.md               |  8 ++++++++
 gcc/testsuite/gcc.target/arc/umulsihi3_z.c | 23 +++++++++++++++++++++++
 5 files changed, 51 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/umulsihi3_z.c

diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
index 37c1afa..1b75099 100644
--- a/gcc/config/arc/arc.h
+++ b/gcc/config/arc/arc.h
@@ -795,6 +795,8 @@ extern enum reg_class arc_regno_reg_class[];
 #define UNSIGNED_INT6(X) ((unsigned) (X) < 0x40)
 #define UNSIGNED_INT7(X) ((unsigned) (X) < 0x80)
 #define UNSIGNED_INT8(X) ((unsigned) (X) < 0x100)
+#define UNSIGNED_INT12(X) ((unsigned) (X) < 0x800)
+#define UNSIGNED_INT16(X) ((unsigned) (X) < 0x10000)
 #define IS_ONE(X) ((X) == 1)
 #define IS_ZERO(X) ((X) == 0)
 
diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 8ec0ce0..e0f74e4 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -1720,21 +1720,21 @@
 (define_expand "umulhisi3"
   [(set (match_operand:SI 0 "register_operand"                           "")
 	(mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand"  ""))
-		 (zero_extend:SI (match_operand:HI 2 "nonmemory_operand" ""))))]
+		 (zero_extend:SI (match_operand:HI 2 "arc_short_operand" ""))))]
   "TARGET_MPYW"
   "{
     if (CONSTANT_P (operands[2]))
     {
-      emit_insn (gen_umulhisi3_imm (operands[0], operands[1], operands[2]));
-      DONE;
+     emit_insn (gen_umulhisi3_imm (operands[0], operands[1], operands[2]));
+     DONE;
     }
   }"
 )
 
 (define_insn "umulhisi3_imm"
-  [(set (match_operand:SI 0 "register_operand"                          "=r, r,r,  r,  r")
-	(mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" " 0, r,0,  0,  r"))
-		 (match_operand:HI 2 "short_const_int_operand"          " L, L,I,C16,C16")))]
+  [(set (match_operand:SI 0 "register_operand"                          "=r, r,  r,  r,  r")
+	(mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" "%0, r,  0,  0,  r"))
+		 (match_operand:HI 2 "short_unsigned_const_operand"     " L, L,U12,U16,U16")))]
   "TARGET_MPYW"
   "mpyuw%? %0,%1,%2"
   [(set_attr "length" "4,4,4,8,8")
@@ -1746,7 +1746,7 @@
 
 (define_insn "umulhisi3_reg"
   [(set (match_operand:SI 0 "register_operand"                          "=Rcqq, r, r")
-	(mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" "    0, 0, r"))
+	(mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" "   %0, 0, r"))
 		 (zero_extend:SI (match_operand:HI 2 "register_operand" " Rcqq, r, r"))))]
   "TARGET_MPYW"
   "mpyuw%? %0,%1,%2"
diff --git a/gcc/config/arc/constraints.md b/gcc/config/arc/constraints.md
index 668b60a..cdf94ef 100644
--- a/gcc/config/arc/constraints.md
+++ b/gcc/config/arc/constraints.md
@@ -427,3 +427,14 @@
   "A memory with only a base register"
   (match_operand 0 "mem_noofs_operand"))
 
+(define_constraint "U12"
+  "@internal
+   An unsigned 12-bit integer constant."
+  (and (match_code "const_int")
+       (match_test "UNSIGNED_INT12 (ival)")))
+
+(define_constraint "U16"
+  "@internal
+   An unsigned 16-bit integer constant"
+  (and (match_code "const_int")
+       (match_test "UNSIGNED_INT16 (ival)")))
diff --git a/gcc/config/arc/predicates.md b/gcc/config/arc/predicates.md
index 3c657c6..9542b22 100644
--- a/gcc/config/arc/predicates.md
+++ b/gcc/config/arc/predicates.md
@@ -819,3 +819,11 @@
 (define_predicate "double_register_operand"
   (ior (match_test "even_register_operand (op, mode)")
        (match_test "arc_double_register_operand (op, mode)")))
+
+(define_predicate "short_unsigned_const_operand"
+  (and (match_code "const_int")
+       (match_test "satisfies_constraint_U16 (op)")))
+
+(define_predicate "arc_short_operand"
+  (ior (match_test "register_operand (op, mode)")
+       (match_test "short_unsigned_const_operand (op, mode)")))
diff --git a/gcc/testsuite/gcc.target/arc/umulsihi3_z.c b/gcc/testsuite/gcc.target/arc/umulsihi3_z.c
new file mode 100644
index 0000000..cf1c00d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/umulsihi3_z.c
@@ -0,0 +1,23 @@
+/* Check if the optimizers are not removing the umulsihi3_imm
+   instruction.  */
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-inline" } */
+
+#include <stdint.h>
+
+static int32_t test (int16_t reg_val)
+{
+  int32_t x = (reg_val & 0xf) * 62500;
+  return x;
+}
+
+int main (void)
+{
+  volatile int32_t x = 0xc172;
+  x = test (x);
+
+  if (x != 0x0001e848)
+    __builtin_abort ();
+  return 0;
+}
+
-- 
1.9.1

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] [ARC] Fix unwanted match for sign extend 16-bit constant.
  2016-04-28 17:12     ` [PATCH] " Claudiu Zissulescu
@ 2016-04-28 17:46       ` Joern Wolfgang Rennecke
  2016-04-28 20:31         ` Claudiu Zissulescu
  0 siblings, 1 reply; 34+ messages in thread
From: Joern Wolfgang Rennecke @ 2016-04-28 17:46 UTC (permalink / raw)
  To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett



On 28/04/16 18:10, Claudiu Zissulescu wrote:
> Please find the updated patch.
>
> Claudiu
>
> gcc/
> 2016-04-28  Claudiu Zissulescu  <claziss@synopsys.com>
>
> 	* config/arc/arc.h (UNSIGNED_INT12, UNSIGNED_INT16): Define.
> 	* config/arc/arc.md (umulhisi3): Use arc_short_operand predicate.
> 	(umulhisi3_imm): Update predicates and constraint letters.
> 	(umulhisi3_reg): Declare instruction as commutative.
> 	* config/arc/constraints.md (U12, U16): New constraints.
I'm not sure how to feel about this.  U16 looks intuitive, but we have
traditionally used U for memory constraints.  And we use it for ARC
for that purpose, too, even though with a compatible constraint
length of 3.
I suppose it's fine if you're sure we never want to have an addressing
mode that's best described with "12" or "16", or some other number
we might want for an unsigned integer.

Otherwise, I'd suggest using a traditional integer letter.  'J' is free.
>   
>   (define_expand "umulhisi3"
>     [(set (match_operand:SI 0 "register_operand"                           "")
>   	(mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand"  ""))
> -		 (zero_extend:SI (match_operand:HI 2 "nonmemory_operand" ""))))]
> +		 (zero_extend:SI (match_operand:HI 2 "arc_short_operand" ""))))]
>     "TARGET_MPYW"
>     "{
>       if (CONSTANT_P (operands[2]))
>       {
> -      emit_insn (gen_umulhisi3_imm (operands[0], operands[1], operands[2]));
> -      DONE;
> +     emit_insn (gen_umulhisi3_imm (operands[0], operands[1], operands[2]));
> +     DONE;
Why do you remove half of the indentation?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] [ARC] Fix unwanted match for sign extend 16-bit constant.
  2016-04-28 17:46       ` Joern Wolfgang Rennecke
@ 2016-04-28 20:31         ` Claudiu Zissulescu
  2016-04-28 20:57           ` Joern Wolfgang Rennecke
  0 siblings, 1 reply; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-28 20:31 UTC (permalink / raw)
  To: Joern Wolfgang Rennecke, Claudiu Zissulescu, gcc-patches
  Cc: Francois.Bedard, jeremy.bennett

>
> Otherwise, I'd suggest using a traditional integer letter.  'J' is free.
Thanks for the suggestion, I will use 'J'.

> Why do you remove half of the indentation?
Unwanted reformatting, sorry for this, I will revert it.

I have the feeling you are happy with my new patch. Is there anything to 
be added to it besides fixing the above issues?

Thanks,
Claudiu

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] [ARC] Fix unwanted match for sign extend 16-bit constant.
  2016-04-28 20:31         ` Claudiu Zissulescu
@ 2016-04-28 20:57           ` Joern Wolfgang Rennecke
  2016-04-29  8:41             ` Claudiu Zissulescu
  0 siblings, 1 reply; 34+ messages in thread
From: Joern Wolfgang Rennecke @ 2016-04-28 20:57 UTC (permalink / raw)
  To: Claudiu Zissulescu, Claudiu Zissulescu, gcc-patches
  Cc: Francois.Bedard, jeremy.bennett



On 28/04/16 21:31, Claudiu Zissulescu wrote:
>>
>> Otherwise, I'd suggest using a traditional integer letter.  'J' is free.
> Thanks for the suggestion, I will use 'J'.
>
>> Why do you remove half of the indentation?
> Unwanted reformatting, sorry for this, I will revert it.
>
> I have the feeling you are happy with my new patch. Is there anything 
> to be added to it besides fixing the above issues?
No, otherwise it looks OK.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [PATCH] [ARC] Fix unwanted match for sign extend 16-bit constant.
  2016-04-28 20:57           ` Joern Wolfgang Rennecke
@ 2016-04-29  8:41             ` Claudiu Zissulescu
  0 siblings, 0 replies; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-29  8:41 UTC (permalink / raw)
  To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

Committed r235623.

Thanks,
Claudiu

> -----Original Message-----
> From: Joern Wolfgang Rennecke [mailto:gnu@amylaar.uk]
> Sent: Thursday, April 28, 2016 10:57 PM
> To: Claudiu Zissulescu; Claudiu Zissulescu; gcc-patches@gcc.gnu.org
> Cc: Francois.Bedard@synopsys.com; jeremy.bennett@embecosm.com
> Subject: Re: [PATCH] [ARC] Fix unwanted match for sign extend 16-bit
> constant.
> 
> 
> 
> On 28/04/16 21:31, Claudiu Zissulescu wrote:
> >>
> >> Otherwise, I'd suggest using a traditional integer letter.  'J' is free.
> > Thanks for the suggestion, I will use 'J'.
> >
> >> Why do you remove half of the indentation?
> > Unwanted reformatting, sorry for this, I will revert it.
> >
> > I have the feeling you are happy with my new patch. Is there anything
> > to be added to it besides fixing the above issues?
> No, otherwise it looks OK.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH] [ARC] Handle FPX NaN within optimized floating point library.
  2016-04-28 15:03           ` Joern Wolfgang Rennecke
@ 2016-04-29 10:18             ` Claudiu Zissulescu
  2016-04-29 10:23               ` Joern Wolfgang Rennecke
  2016-04-29 10:27               ` Joern Wolfgang Rennecke
  0 siblings, 2 replies; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-29 10:18 UTC (permalink / raw)
  To: gcc-patches; +Cc: Claudiu.Zissulescu, gnu, Francois.Bedard, jeremy.bennett

This is the updated patch on handling FPX NaNs.

Ok to apply?
Claudiu
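
For readers unfamiliar with the two NaN encodings mentioned in the
patch, a host-side C sketch of how both bit patterns classify.  The
helper names are hypothetical, and the code assumes the host uses IEEE
754 binary64 doubles with matching integer and floating-point byte
order.  It uses memcpy instead of the pointer cast in the test file.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 64-bit pattern as a double without a pointer cast.  */
static double double_from_bits (uint64_t bits)
{
  double d;
  memcpy (&d, &bits, sizeof d);
  return d;
}

/* binary64 rule: NaN iff the exponent field is all ones and the
   fraction field is non-zero.  */
static int bits_are_nan (uint64_t bits)
{
  uint64_t exponent = (bits >> 52) & 0x7ff;
  uint64_t fraction = bits & 0x000fffffffffffffULL;
  return exponent == 0x7ff && fraction != 0;
}

/* Self-check: both the FPX-style pattern and the usual quiet NaN
   pattern are NaNs; infinity is not.  */
void check_nan_patterns (void)
{
  assert (bits_are_nan (0x7ff0000080000000ULL));
  assert (bits_are_nan (0x7ff8000000000000ULL));
  assert (!bits_are_nan (0x7ff0000000000000ULL));
}
```

The FPX-style pattern keeps the upper fraction bits clear and only sets
a bit in the low word, so a check that looks at the high word alone
cannot tell it apart from infinity.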


gcc/
2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>

	* testsuite/gcc.target/arc/ieee_eq.c: New test.

libgcc/
2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/ieee-754/eqdf2.S: Handle FPX NaN.
---
 gcc/testsuite/gcc.target/arc/ieee_eq.c | 47 ++++++++++++++++++++++++++++++++++
 libgcc/config/arc/ieee-754/eqdf2.S     | 15 +++++++----
 2 files changed, 57 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/ieee_eq.c

diff --git a/gcc/testsuite/gcc.target/arc/ieee_eq.c b/gcc/testsuite/gcc.target/arc/ieee_eq.c
new file mode 100644
index 0000000..70aebad
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/ieee_eq.c
@@ -0,0 +1,47 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include <stdio.h>
+#include <float.h>
+
+#define TEST_EQ(TYPE,X,Y,RES)				\
+  do {							\
+    volatile TYPE a, b;					\
+    a = (TYPE) X;					\
+    b = (TYPE) Y;					\
+    if ((a == b) != RES)				\
+      {							\
+	printf ("Runtime computation error @%d. %g "	\
+		"!= %g\n", __LINE__, a, b);		\
+	error = 1;					\
+      }							\
+  } while (0)
+
+#ifndef __HS__
+/* Special type of NaN found when using double FPX instructions.  */
+static const unsigned long long __nan = 0x7FF0000080000000ULL;
+# define W (*(double *) &__nan)
+#else
+# define W __builtin_nan ("")
+#endif
+
+#define Q __builtin_nan ("")
+#define H __builtin_inf ()
+
+int main (void)
+{
+  int error = 0;
+
+  TEST_EQ (double, 1, 1, 1);
+  TEST_EQ (double, 1, 2, 0);
+  TEST_EQ (double, W, W, 0);
+  TEST_EQ (double, Q, Q, 0);
+  TEST_EQ (double, __DBL_MAX__, __DBL_MAX__, 1);
+  TEST_EQ (double, __DBL_MIN__, __DBL_MIN__, 1);
+  TEST_EQ (double, H, H, 1);
+
+  if (error)
+    __builtin_abort ();
+
+  return 0;
+}
diff --git a/libgcc/config/arc/ieee-754/eqdf2.S b/libgcc/config/arc/ieee-754/eqdf2.S
index bc7d88e..7e80ef5 100644
--- a/libgcc/config/arc/ieee-754/eqdf2.S
+++ b/libgcc/config/arc/ieee-754/eqdf2.S
@@ -58,11 +58,16 @@ __eqdf2:
 	   well predictable (as seen from the branch predictor).  */
 __eqdf2:
 	brne.d DBL0H,DBL1H,.Lhighdiff
-	bmsk r12,DBL0H,20
-#ifdef DPFP_COMPAT
-	or.f 0,DBL0L,DBL1L
-	bset.ne r12,r12,21
-#endif /* DPFP_COMPAT */
+#ifndef __HS__
+	/* The next two instructions are required to recognize the FPX
+	NaN, which has a pattern like this: 0x7ff0_0000_8000_0000, as
+	opposed to 0x7ff8_0000_0000_0000.  */
+	or.f    0,DBL0L,DBL1L
+	mov_s	r12,0x00200000
+	bset.ne r12,r12,0
+#else
+	bmsk    r12,DBL0H,20
+#endif /* __HS__ */
 	add1.f	r12,r12,DBL0H /* set c iff NaN; also, clear z if NaN.  */
 	j_s.d	[blink]
 	cmp.cc	DBL0L,DBL1L
-- 
1.9.1
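
For readers following the patch: the comment in eqdf2.S contrasts two NaN bit patterns. The C sketch below is not part of the patch. It only illustrates, under the standard IEEE 754 binary64 layout, why a check that looks solely at the fraction bits in the high word misses the FPX pattern. The function names are hypothetical and chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* IEEE 754 binary64: 1 sign bit, 11 exponent bits, 52 fraction bits.  */
#define EXP_MASK  0x7FF0000000000000ULL
#define FRAC_MASK 0x000FFFFFFFFFFFFFULL

/* A value is a NaN iff the exponent is all ones and at least one of the
   52 fraction bits is set, in either the high or the low word.  */
bool
is_nan_bits (uint64_t bits)
{
  return (bits & EXP_MASK) == EXP_MASK && (bits & FRAC_MASK) != 0;
}

/* Hypothetical shortcut that inspects only the high 32-bit word.  It
   misses NaNs whose only set fraction bits live in the low word.  */
bool
is_nan_high_word_only (uint64_t bits)
{
  uint32_t high = (uint32_t) (bits >> 32);
  return (high & 0x7FF00000u) == 0x7FF00000u && (high & 0x000FFFFFu) != 0;
}
```

With these definitions, is_nan_bits accepts both 0x7FF0000080000000 (the FPX pattern used in ieee_eq.c) and 0x7FF8000000000000, while is_nan_high_word_only accepts only the latter. This is why the low words also have to take part in the NaN test.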

* Re: [PATCH] [ARC] Handle FPX NaN within optimized floating point library.
  2016-04-29 10:18             ` [PATCH] " Claudiu Zissulescu
@ 2016-04-29 10:23               ` Joern Wolfgang Rennecke
  2016-04-29 10:27               ` Joern Wolfgang Rennecke
  1 sibling, 0 replies; 34+ messages in thread
From: Joern Wolfgang Rennecke @ 2016-04-29 10:23 UTC (permalink / raw)
  To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett



On 29/04/16 11:16, Claudiu Zissulescu wrote:
> This is the updated patch on handling FPX NaNs.
>
> Ok to apply?
> Claudiu
>
>
  OK.

* Re: [PATCH] [ARC] Handle FPX NaN within optimized floating point library.
  2016-04-29 10:18             ` [PATCH] " Claudiu Zissulescu
  2016-04-29 10:23               ` Joern Wolfgang Rennecke
@ 2016-04-29 10:27               ` Joern Wolfgang Rennecke
  2016-04-29 10:31                 ` Claudiu Zissulescu
  1 sibling, 1 reply; 34+ messages in thread
From: Joern Wolfgang Rennecke @ 2016-04-29 10:27 UTC (permalink / raw)
  To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

P.S.: the .d suffix on the branch was there just for scheduling purposes -
not sure if that actually helped any chip's pipeline, or if it was just a bug
in the documentation.

* RE: [PATCH] [ARC] Handle FPX NaN within optimized floating point library.
  2016-04-29 10:27               ` Joern Wolfgang Rennecke
@ 2016-04-29 10:31                 ` Claudiu Zissulescu
  2016-04-29 10:37                   ` Joern Wolfgang Rennecke
  0 siblings, 1 reply; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-29 10:31 UTC (permalink / raw)
  To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

It should do the job, at least for EM, where the jump takes 2 cycles; by using delay slots we can make all the cycles count. HS has a branch prediction mechanism, so filling the delay slot doesn't have as big an impact as on EM or even earlier CPUs.

//Claudiu

> -----Original Message-----
> From: Joern Wolfgang Rennecke [mailto:gnu@amylaar.uk]
> Sent: Friday, April 29, 2016 12:27 PM
> To: Claudiu Zissulescu; gcc-patches@gcc.gnu.org
> Cc: Francois.Bedard@synopsys.com; jeremy.bennett@embecosm.com
> Subject: Re: [PATCH] [ARC] Handle FPX NaN within optimized floating point
> library.
> 
> P.S.: the .d suffix on the branch was there just for scheduling purposes -
> not sure if that actually helped any chip's pipeline, or if it was just a bug
> in the documentation.

* Re: [PATCH] [ARC] Handle FPX NaN within optimized floating point library.
  2016-04-29 10:31                 ` Claudiu Zissulescu
@ 2016-04-29 10:37                   ` Joern Wolfgang Rennecke
  2016-04-29 10:47                     ` Claudiu Zissulescu
  0 siblings, 1 reply; 34+ messages in thread
From: Joern Wolfgang Rennecke @ 2016-04-29 10:37 UTC (permalink / raw)
  To: Claudiu Zissulescu, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett



On 29/04/16 11:31, Claudiu Zissulescu wrote:
> It should do the job, at least for EM, where the jump takes 2 cycles; by using delay slots we can make all the cycles count. HS has a branch prediction mechanism, so filling the delay slot doesn't have as big an impact as on EM or even earlier CPUs.
No, the alternative is to hide the delay slot, so if the branch is predicted
properly, the case with different high words should be faster without the
.d suffix.

I.e., eagerly filling the delay slot like this has a bigger (negative) impact
on performance.

* RE: [PATCH] [ARC] Handle FPX NaN within optimized floating point library.
  2016-04-29 10:37                   ` Joern Wolfgang Rennecke
@ 2016-04-29 10:47                     ` Claudiu Zissulescu
  0 siblings, 0 replies; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-04-29 10:47 UTC (permalink / raw)
  To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

> > It should do the job, at least for EM, where the jump takes 2 cycles; by
> > using delay slots we can make all the cycles count. HS has a branch
> > prediction mechanism, so filling the delay slot doesn't have as big an
> > impact as on EM or even earlier CPUs.
> No, the alternative is to hide the delay slot, so if the branch is
> predicted properly, the case with different high words should be faster
> without the .d suffix.
> 
> I.e., eagerly filling the delay slot like this has a bigger (negative)
> impact on performance.

If we are talking about HS, then we can add another flag, 'T', which tells the branch predictor that we expect this branch to be taken. However, I haven't seen any impact of this flag on the code, and the compiler generates it. In general, the HS branch prediction has some particularities. Although what you say makes perfect sense, I am almost sure it doesn't apply to HS because of the way it is implemented. But this is a good point; I will keep it in mind and ask the hardware guys what is best.

//Claudiu

* RE: [PATCH 6/6] [ARC] Various instruction pattern fixes
  2016-04-28 12:31     ` Joern Wolfgang Rennecke
@ 2016-05-02 11:21       ` Claudiu Zissulescu
  0 siblings, 0 replies; 34+ messages in thread
From: Claudiu Zissulescu @ 2016-05-02 11:21 UTC (permalink / raw)
  To: Joern Wolfgang Rennecke, gcc-patches; +Cc: Francois.Bedard, jeremy.bennett

> > gcc/
> > 2016-04-18  Claudiu Zissulescu  <claziss@synopsys.com>
> >
> > 	* config/arc/arc.md (mulsidi3): Change operand 0 predicate to
> > 	register_operand.
> > 	(umulsidi3): Likewise.
> > 	(indirect_jump): Fix jump instruction assembly patterns.
> > 	(arcset<code>): Change operand 1 predicate to
> nonmemory_operand.
> > 	(arcsetltu, arcsetgeu): Likewise.
> ChangeLog omission: You are also adding an r/n/r alternative.
> > 	(arcsethi, arcsetls): Fix pattern.
> Otherwise this is OK.
> 
> If the constant / register comparisons come from an expander, in
> general the expander should be fixed to swap the operands and
> use the swapped comparison code, to get canonical rtl.
> OTOH, constant re-materialization during register allocation can change
> a reg-reg into
> a constant-reg comparison, and at that stage, canonicalization would not
> be expected.

I will commit this patch without the arcset* mods; this is safer.

Thanks!
Claudiu
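
As an illustration of the canonicalization point quoted above (swap the operands and use the swapped comparison code), here is a small standalone C sketch. It is not GCC code: the enum and both helper functions are hypothetical and only mirror the idea that "constant OP register" can be rewritten as "register swapped(OP) constant" without changing the result.

```c
#include <stdbool.h>

/* Hypothetical comparison codes, loosely named after RTL codes.  */
enum cmp_code { CMP_EQ, CMP_NE, CMP_LT, CMP_LE, CMP_GT, CMP_GE };

/* Return the code to use after exchanging the two operands, so that
   (a OP b) has the same value as (b swap_cmp (OP) a).  */
enum cmp_code
swap_cmp (enum cmp_code code)
{
  switch (code)
    {
    case CMP_LT: return CMP_GT;
    case CMP_LE: return CMP_GE;
    case CMP_GT: return CMP_LT;
    case CMP_GE: return CMP_LE;
    default:     return code;	/* EQ and NE are symmetric.  */
    }
}

/* Evaluate a comparison on plain integers, for checking the identity.  */
bool
eval_cmp (enum cmp_code code, int a, int b)
{
  switch (code)
    {
    case CMP_EQ: return a == b;
    case CMP_NE: return a != b;
    case CMP_LT: return a < b;
    case CMP_LE: return a <= b;
    case CMP_GT: return a > b;
    case CMP_GE: return a >= b;
    }
  return false;
}
```

For example, eval_cmp (CMP_LT, 5, r) and eval_cmp (swap_cmp (CMP_LT), r, 5) agree for every r, so an expander can always put the register first.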

Thread overview: 34+ messages
2016-04-18 14:35 [PATCH 0/6] [ARC] Various fixes Claudiu Zissulescu
2016-04-18 14:35 ` [PATCH 6/6] [ARC] Various instruction pattern fixes Claudiu Zissulescu
2016-04-18 18:26   ` Claudiu Zissulescu
2016-04-28 12:31     ` Joern Wolfgang Rennecke
2016-05-02 11:21       ` Claudiu Zissulescu
2016-04-18 14:35 ` [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library Claudiu Zissulescu
2016-04-28 11:27   ` Joern Wolfgang Rennecke
2016-04-28 11:35     ` Claudiu Zissulescu
2016-04-28 11:41       ` Joern Wolfgang Rennecke
2016-04-28 11:43         ` Claudiu Zissulescu
2016-04-28 14:12         ` Claudiu Zissulescu
2016-04-28 15:03           ` Joern Wolfgang Rennecke
2016-04-29 10:18             ` [PATCH] " Claudiu Zissulescu
2016-04-29 10:23               ` Joern Wolfgang Rennecke
2016-04-29 10:27               ` Joern Wolfgang Rennecke
2016-04-29 10:31                 ` Claudiu Zissulescu
2016-04-29 10:37                   ` Joern Wolfgang Rennecke
2016-04-29 10:47                     ` Claudiu Zissulescu
2016-04-18 14:35 ` [PATCH 3/6] [ARC] Pass mfpuda to assembler Claudiu Zissulescu
2016-04-28 10:30   ` Joern Wolfgang Rennecke
2016-04-28 13:10     ` Claudiu Zissulescu
2016-04-18 14:35 ` [PATCH 2/6] [ARC] Fix FPX/FPUDA code gen when compiling for big-endian Claudiu Zissulescu
2016-04-28 10:29   ` Joern Wolfgang Rennecke
2016-04-28 12:54     ` Claudiu Zissulescu
2016-04-18 14:35 ` [PATCH 1/6] [ARC] Don't use drsub* instructions when selecting fpuda Claudiu Zissulescu
2016-04-28 10:05   ` Joern Wolfgang Rennecke
2016-04-28 12:16     ` Claudiu Zissulescu
2016-04-18 14:35 ` [PATCH 5/6] [ARC] Fix unwanted match for sign extend 16-bit constant Claudiu Zissulescu
2016-04-28 11:47   ` Joern Wolfgang Rennecke
2016-04-28 17:12     ` [PATCH] " Claudiu Zissulescu
2016-04-28 17:46       ` Joern Wolfgang Rennecke
2016-04-28 20:31         ` Claudiu Zissulescu
2016-04-28 20:57           ` Joern Wolfgang Rennecke
2016-04-29  8:41             ` Claudiu Zissulescu
