public inbox for gcc-patches@gcc.gnu.org
* rs6000 fused multiply-add patch
@ 2002-12-02 19:01 Geoffrey Keating
  2002-12-03 15:41 ` Segher Boessenkool
  0 siblings, 1 reply; 33+ messages in thread
From: Geoffrey Keating @ 2002-12-02 19:01 UTC (permalink / raw)
  To: gcc-patches; +Cc: dje, pinskia, segher, dalej


I looked at Segher's patch, and while I thought it was the right
direction, I kept finding more missing pieces the more I looked at
it.  So, I wrote my own.  This one has test cases, documentation, and
works in nearly every case.

It ensures that the fused multiply-add operations are used whenever
appropriate when -ffast-math is used.  It also gets it right in nearly
all cases when -ffast-math is not being used, by using fneg followed
by a fused multiply-add operation.
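
As a rough illustration of the sign conventions involved (a sketch only, not part of the patch): the four instruction forms compute a*c+b, a*c-b, -(a*c+b) and -(a*c-b). The helper names below are made up for this example, and plain C arithmetic stands in for the fused operation, so the single rounding of the real instructions is not modelled.

```c
/* Hypothetical models of the four fused multiply-add forms.
   Plain arithmetic is used, so only the signs are modelled,
   not the single rounding of the hardware instructions.  */
static double model_fmadd (double a, double c, double b)
{
  return a * c + b;
}

static double model_fmsub (double a, double c, double b)
{
  return a * c - b;
}

static double model_fnmadd (double a, double c, double b)
{
  return -(a * c + b);
}

static double model_fnmsub (double a, double c, double b)
{
  return -(a * c - b);
}

/* Without -ffast-math, -b - c*d can be done as an fneg of c
   followed by fmsub, since (-c)*d - b has the same value.  */
static double neg_b_minus_cd (double b, double c, double d)
{
  double t = -c;                  /* fneg */
  return model_fmsub (t, d, b);   /* (-c)*d - b */
}
```

With b = 1, c = 2, d = 3, neg_b_minus_cd returns -7, the same as evaluating -b - c*d directly.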

The one case it misses is

(minus A (mult B C))

when ! -ffast-math, which it should really do as 

T = (neg B)
(plus (mult T C) A)

causing suboptimal code generation in the a[7] case in the testcase.
If I make combine do this, it'll generate an extra operation on
non-powerpc systems, which I thought would probably be a bad idea (all
the other simplifications I added just move operations around or
delete operations).  Then I tried to make a splitter, which I think
ought to be the right long-term solution, but couldn't get it to work,
be recognized by combine, and have T be a new temporary.  So I'm
leaving that as a project for later.
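
To make the ! -ffast-math restriction concrete, here is a small sketch (not from the patch) of why A - B*C cannot simply be emitted as fnmsub when signed zeros are honored: the two forms agree in value but can differ in the sign of a zero result. The function names are invented for the example, plain arithmetic stands in for the fused operations, and IEEE 754 binary64 doubles with round-to-nearest are assumed.

```c
#include <stdint.h>
#include <string.h>

/* Return nonzero if X is negative zero (assumes IEEE 754 binary64).  */
static int is_negative_zero (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return bits == UINT64_C (0x8000000000000000);
}

/* (minus A (mult B C)) evaluated as written.  */
static double minus_direct (double a, double b, double c)
{
  return a - b * c;
}

/* T = (neg B); (plus (mult T C) A): the fneg + fmadd form.  */
static double minus_via_fneg (double a, double b, double c)
{
  double t = -b;
  return t * c + a;
}

/* (neg (minus (mult B C) A)): the form fnmsub computes.  */
static double minus_via_fnmsub (double a, double b, double c)
{
  return -(b * c - a);
}
```

For a = b = c = 1, the first two functions return +0 while the last returns -0, which is the difference the HONOR_SIGNED_ZEROS conditions in the patterns guard against.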

I'm running a bootstrap & testrun on powerpc-darwin and if it works
I'll commit to mainline.

-- 
- Geoffrey Keating <geoffk@apple.com>

===File ~/patches/rs6000-fnmadd.patch=======================
Index: ChangeLog
2002-12-02  Geoffrey Keating  <geoffk@apple.com>

	* combine.c (combine_simplify_rtx): Add new canonicalizations.
	* doc/md.texi (Insn Canonicalizations): Document new
	canonicalizations for multiply/add combinations.
	* config/rs6000/rs6000.md: Add and modify floating add/multiply
	patterns to ensure they're used whenever they can be.

Index: testsuite/ChangeLog
2002-12-02  Geoffrey Keating  <geoffk@apple.com>

	* gcc.dg/ppc-fmadd-1.c: New file.
	* gcc.dg/ppc-fmadd-2.c: New file.
	* gcc.dg/ppc-fmadd-3.c: New file.

Index: combine.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/combine.c,v
retrieving revision 1.324
diff -u -p -u -p -r1.324 combine.c
--- combine.c	20 Nov 2002 09:43:19 -0000	1.324
+++ combine.c	3 Dec 2002 02:32:22 -0000
@@ -4029,6 +4029,23 @@ combine_simplify_rtx (x, op0_mode, last,
 	return gen_binary (MINUS, mode, XEXP (XEXP (x, 0), 1),
 			   XEXP (XEXP (x, 0), 0));
 
+      /* (neg (plus A B)) is canonicalized to (minus (neg A) B).  */
+      if (GET_CODE (XEXP (x, 0)) == PLUS
+	  && !HONOR_SIGNED_ZEROS (mode)
+	  && !HONOR_SIGN_DEPENDENT_ROUNDING (mode))
+	{
+	  temp = simplify_gen_unary (NEG, mode, XEXP (XEXP (x, 0), 0), mode);
+	  return gen_binary (MINUS, mode, temp, XEXP (XEXP (x, 0), 1));
+	}
+
+      /* (neg (mult A B)) becomes (mult (neg A) B).  
+         This works even for floating-point values.  */
+      if (GET_CODE (XEXP (x, 0)) == MULT)
+	{
+	  temp = simplify_gen_unary (NEG, mode, XEXP (XEXP (x, 0), 0), mode);
+	  return gen_binary (MULT, mode, temp, XEXP (XEXP (x, 0), 1));
+	}
+
       /* (neg (xor A 1)) is (plus A -1) if A is known to be either 0 or 1.  */
       if (GET_CODE (XEXP (x, 0)) == XOR && XEXP (XEXP (x, 0), 1) == const1_rtx
 	  && nonzero_bits (XEXP (XEXP (x, 0), 0), mode) == 1)
@@ -4217,6 +4234,19 @@ combine_simplify_rtx (x, op0_mode, last,
 #endif
 
     case PLUS:
+      /* Canonicalize (plus (mult (neg B) C) A) to (minus A (mult B C)).
+       */
+      if (GET_CODE (XEXP (x, 0)) == MULT 
+	  && GET_CODE (XEXP (XEXP (x, 0), 0)) == NEG)
+	{
+	  rtx in1, in2;
+	 
+	  in1 = XEXP (XEXP (XEXP (x, 0), 0), 0);
+	  in2 = XEXP (XEXP (x, 0), 1);
+	  return gen_binary (MINUS, mode, XEXP (x, 1),
+			     gen_binary (MULT, mode, in1, in2));
+	}
+
       /* If we have (plus (plus (A const) B)), associate it so that CONST is
 	 outermost.  That's because that's the way indexed addresses are
 	 supposed to appear.  This code used to check many more cases, but
@@ -4322,6 +4352,32 @@ combine_simplify_rtx (x, op0_mode, last,
 	  && rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0)))
 	return simplify_and_const_int (NULL_RTX, mode, XEXP (x, 0),
 				       -INTVAL (XEXP (XEXP (x, 1), 1)) - 1);
+
+      /* Canonicalize (minus A (mult (neg B) C)) to (plus (mult B C) A).
+       */
+      if (GET_CODE (XEXP (x, 1)) == MULT 
+	  && GET_CODE (XEXP (XEXP (x, 1), 0)) == NEG)
+	{
+	  rtx in1, in2;
+	 
+	  in1 = XEXP (XEXP (XEXP (x, 1), 0), 0);
+	  in2 = XEXP (XEXP (x, 1), 1);
+	  return gen_binary (PLUS, mode, gen_binary (MULT, mode, in1, in2),
+			     XEXP (x, 0));
+	}
+
+       /* Canonicalize (minus (neg A) (mult B C)) to 
+	  (minus (mult (neg B) C) A). */
+      if (GET_CODE (XEXP (x, 1)) == MULT 
+	  && GET_CODE (XEXP (x, 0)) == NEG)
+	{
+	  rtx in1, in2;
+	 
+	  in1 = simplify_gen_unary (NEG, mode, XEXP (XEXP (x, 1), 0), mode);
+	  in2 = XEXP (XEXP (x, 1), 1);
+	  return gen_binary (MINUS, mode, gen_binary (MULT, mode, in1, in2),
+			     XEXP (XEXP (x, 0), 0));
+	}
 
       /* Canonicalize (minus A (plus B C)) to (minus (minus A B) C) for
 	 integers.  */
Index: config/rs6000/rs6000.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.222
diff -u -p -u -p -r1.222 rs6000.md
--- config/rs6000/rs6000.md	16 Nov 2002 18:01:51 -0000	1.222
+++ config/rs6000/rs6000.md	3 Dec 2002 02:32:41 -0000
@@ -5280,7 +5280,18 @@
 	(neg:SF (plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
 				  (match_operand:SF 2 "gpc_reg_operand" "f"))
 			 (match_operand:SF 3 "gpc_reg_operand" "f"))))]
-  "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD"
+  "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD
+   && HONOR_SIGNED_ZEROS (SFmode)"
+  "fnmadds %0,%1,%2,%3"
+  [(set_attr "type" "fp")])
+
+(define_insn ""
+  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
+	(minus:SF (mult:SF (neg:SF (match_operand:SF 1 "gpc_reg_operand" "f"))
+			   (match_operand:SF 2 "gpc_reg_operand" "f"))
+			 (match_operand:SF 3 "gpc_reg_operand" "f")))]
+  "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD
+   && ! HONOR_SIGNED_ZEROS (SFmode)"
   "fnmadds %0,%1,%2,%3"
   [(set_attr "type" "fp")])
 
@@ -5295,10 +5306,31 @@
 
 (define_insn ""
   [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
+	(minus:SF (mult:SF (neg:SF (match_operand:SF 1 "gpc_reg_operand" "f"))
+			   (match_operand:SF 2 "gpc_reg_operand" "f"))
+			 (match_operand:SF 3 "gpc_reg_operand" "f")))]
+  "! TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD
+   && ! HONOR_SIGNED_ZEROS (SFmode)"
+  "{fnma|fnmadd} %0,%1,%2,%3"
+  [(set_attr "type" "dmul")])
+
+(define_insn ""
+  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
 	(neg:SF (minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
 				   (match_operand:SF 2 "gpc_reg_operand" "f"))
 			  (match_operand:SF 3 "gpc_reg_operand" "f"))))]
-  "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD"
+  "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD
+   && HONOR_SIGNED_ZEROS (SFmode)"
+  "fnmsubs %0,%1,%2,%3"
+  [(set_attr "type" "fp")])
+
+(define_insn ""
+  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
+	(minus:SF (match_operand:SF 3 "gpc_reg_operand" "f")
+		  (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
+			   (match_operand:SF 2 "gpc_reg_operand" "f"))))]
+  "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD
+   && ! HONOR_SIGNED_ZEROS (SFmode)"
   "fnmsubs %0,%1,%2,%3"
   [(set_attr "type" "fp")])
 
@@ -5311,6 +5343,16 @@
   "{fnms|fnmsub} %0,%1,%2,%3"
   [(set_attr "type" "dmul")])
 
+(define_insn ""
+  [(set (match_operand:SF 0 "gpc_reg_operand" "=f")
+	(minus:SF (match_operand:SF 3 "gpc_reg_operand" "f")
+		  (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f")
+			   (match_operand:SF 2 "gpc_reg_operand" "f"))))]
+  "! TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD
+   && ! HONOR_SIGNED_ZEROS (SFmode)"
+  "{fnms|fnmsub} %0,%1,%2,%3"
+  [(set_attr "type" "fp")])
+
 (define_expand "sqrtsf2"
   [(set (match_operand:SF 0 "gpc_reg_operand" "")
 	(sqrt:SF (match_operand:SF 1 "gpc_reg_operand" "")))]
@@ -5524,7 +5566,18 @@
 	(neg:DF (plus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f")
 				  (match_operand:DF 2 "gpc_reg_operand" "f"))
 			 (match_operand:DF 3 "gpc_reg_operand" "f"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD"
+  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD
+   && HONOR_SIGNED_ZEROS (DFmode)"
+  "{fnma|fnmadd} %0,%1,%2,%3"
+  [(set_attr "type" "dmul")])
+
+(define_insn ""
+  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
+	(minus:DF (mult:DF (neg:DF (match_operand:DF 1 "gpc_reg_operand" "f"))
+			   (match_operand:DF 2 "gpc_reg_operand" "f"))
+		  (match_operand:DF 3 "gpc_reg_operand" "f")))]
+  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD
+   && ! HONOR_SIGNED_ZEROS (DFmode)"
   "{fnma|fnmadd} %0,%1,%2,%3"
   [(set_attr "type" "dmul")])
 
@@ -5533,7 +5586,18 @@
 	(neg:DF (minus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f")
 				   (match_operand:DF 2 "gpc_reg_operand" "f"))
 			  (match_operand:DF 3 "gpc_reg_operand" "f"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD"
+  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD
+   && HONOR_SIGNED_ZEROS (DFmode)"
+  "{fnms|fnmsub} %0,%1,%2,%3"
+  [(set_attr "type" "dmul")])
+
+(define_insn ""
+  [(set (match_operand:DF 0 "gpc_reg_operand" "=f")
+	(minus:DF (match_operand:DF 3 "gpc_reg_operand" "f")
+	          (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%f")
+			   (match_operand:DF 2 "gpc_reg_operand" "f"))))]
+  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD 
+   && ! HONOR_SIGNED_ZEROS (DFmode)"
   "{fnms|fnmsub} %0,%1,%2,%3"
   [(set_attr "type" "dmul")])
 
Index: doc/md.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/md.texi,v
retrieving revision 1.53
diff -u -p -u -p -r1.53 md.texi
--- doc/md.texi	1 Nov 2002 07:05:57 -0000	1.53
+++ doc/md.texi	3 Dec 2002 02:32:43 -0000
@@ -3670,6 +3670,14 @@ For these operators, if only one operand
 @code{mult}, @code{plus}, or @code{minus} expression, it will be the
 first operand.
 
+@item
+In combinations of @code{neg}, @code{mult}, @code{plus}, and
+@code{minus}, the @code{neg} operations (if any) will be moved inside
+the operations as far as possible.  For instance, 
+@code{(neg (mult A B))} is canonicalized as @code{(mult (neg A) B)}, but
+@code{(plus (mult (neg B) C) A)} is canonicalized as
+@code{(minus A (mult B C))}.
+
 @cindex @code{compare}, canonicalization of
 @item
 For the @code{compare} operator, a constant is always the second operand
Index: testsuite/gcc.dg/ppc-fmadd-1.c
===================================================================
RCS file: testsuite/gcc.dg/ppc-fmadd-1.c
diff -N testsuite/gcc.dg/ppc-fmadd-1.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ testsuite/gcc.dg/ppc-fmadd-1.c	3 Dec 2002 02:33:56 -0000
@@ -0,0 +1,43 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-options "-ffast-math -O2" } */
+/* { dg-final { scan-assembler-not "f(add|sub|mul|neg)" } } */
+
+void foo(double *a, double *b, double *c, double *d)
+{
+  a[0] =  b[0] + c[0] * d[0];		// fmadd
+  a[1] =  b[1] - c[1] * d[1];		// fnmsub with fast-math
+  a[2] = -b[2] + c[2] * d[2];   	// fmsub
+  a[3] = -b[3] - c[3] * d[3];		// fnmadd with fast-math
+  a[4] = -( b[4] + c[4] * d[4]);	// fnmadd
+  a[5] = -( b[5] - c[5] * d[5]);	// fmsub with fast-math
+  a[6] = -(-b[6] + c[6] * d[6]);	// fnmsub
+  a[7] = -(-b[7] - c[7] * d[7]);	// fmadd with fast-math
+  a[10] =  b[10] - c[10] * -d[10];	// fmadd
+  a[11] =  b[11] + c[11] * -d[11];	// fnmsub with fast-math
+  a[12] = -b[12] - c[12] * -d[12];   	// fmsub
+  a[13] = -b[13] + c[13] * -d[13];	// fnmadd with fast-math
+  a[14] = -( b[14] - c[14] * -d[14]);	// fnmadd
+  a[15] = -( b[15] + c[15] * -d[15]);	// fmsub with fast-math
+  a[16] = -(-b[16] - c[16] * -d[16]);	// fnmsub
+  a[17] = -(-b[17] + c[17] * -d[17]);	// fmadd with fast-math
+}
+
+void foos(float *a, float *b, float *c, float *d)
+{
+  a[0] =  b[0] + c[0] * d[0];		// fmadd
+  a[1] =  b[1] - c[1] * d[1];		// fnmsub with fast-math
+  a[2] = -b[2] + c[2] * d[2];   	// fmsub
+  a[3] = -b[3] - c[3] * d[3];		// fnmadd with fast-math
+  a[4] = -( b[4] + c[4] * d[4]);	// fnmadd
+  a[5] = -( b[5] - c[5] * d[5]);	// fmsub with fast-math
+  a[6] = -(-b[6] + c[6] * d[6]);	// fnmsub
+  a[7] = -(-b[7] - c[7] * d[7]);	// fmadd with fast-math
+  a[10] =  b[10] - c[10] * -d[10];	// fmadd
+  a[11] =  b[11] + c[11] * -d[11];	// fnmsub with fast-math
+  a[12] = -b[12] - c[12] * -d[12];   	// fmsub
+  a[13] = -b[13] + c[13] * -d[13];	// fnmadd with fast-math
+  a[14] = -( b[14] - c[14] * -d[14]);	// fnmadd
+  a[15] = -( b[15] + c[15] * -d[15]);	// fmsub with fast-math
+  a[16] = -(-b[16] - c[16] * -d[16]);	// fnmsub
+  a[17] = -(-b[17] + c[17] * -d[17]);	// fmadd with fast-math
+}
Index: testsuite/gcc.dg/ppc-fmadd-2.c
===================================================================
RCS file: testsuite/gcc.dg/ppc-fmadd-2.c
diff -N testsuite/gcc.dg/ppc-fmadd-2.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ testsuite/gcc.dg/ppc-fmadd-2.c	3 Dec 2002 02:33:56 -0000
@@ -0,0 +1,27 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "f(add|sub|mul|neg)" } } */
+
+void foo(double *a, double *b, double *c, double *d)
+{
+  a[0] =  b[0] + c[0] * d[0];		// fmadd
+  a[2] = -b[2] + c[2] * d[2];   	// fmsub
+  a[4] = -( b[4] + c[4] * d[4]);	// fnmadd
+  a[6] = -(-b[6] + c[6] * d[6]);	// fnmsub
+  a[10] =  b[10] - c[10] * -d[10];	// fmadd
+  a[12] = -b[12] - c[12] * -d[12];   	// fmsub
+  a[14] = -( b[14] - c[14] * -d[14]);	// fnmadd
+  a[16] = -(-b[16] - c[16] * -d[16]);	// fnmsub
+}
+
+void foos(float *a, float *b, float *c, float *d)
+{
+  a[0] =  b[0] + c[0] * d[0];		// fmadd
+  a[2] = -b[2] + c[2] * d[2];   	// fmsub
+  a[4] = -( b[4] + c[4] * d[4]);	// fnmadd
+  a[6] = -(-b[6] + c[6] * d[6]);	// fnmsub
+  a[10] =  b[10] - c[10] * -d[10];	// fmadd
+  a[12] = -b[12] - c[12] * -d[12];   	// fmsub
+  a[14] = -( b[14] - c[14] * -d[14]);	// fnmadd
+  a[16] = -(-b[16] - c[16] * -d[16]);	// fnmsub
+}
Index: testsuite/gcc.dg/ppc-fmadd-3.c
===================================================================
RCS file: testsuite/gcc.dg/ppc-fmadd-3.c
diff -N testsuite/gcc.dg/ppc-fmadd-3.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ testsuite/gcc.dg/ppc-fmadd-3.c	3 Dec 2002 02:33:58 -0000
@@ -0,0 +1,36 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "f(add|sub|mul)" } } */
+
+void foo(double *a, double *b, double *c, double *d)
+{
+#if 0
+  a[1] =  b[1] - c[1] * d[1];		// fneg, fmadd without fast-math
+#endif
+  a[3] = -b[3] - c[3] * d[3];		// fneg, fmsub without fast-math
+#if 0
+  a[5] = -( b[5] - c[5] * d[5]);	// fneg, fnmadd without fast-math
+#endif
+  a[7] = -(-b[7] - c[7] * d[7]);	// fneg, fnmsub without fast-math
+  a[11] =  b[11] + c[11] * -d[11];	// fneg, fmadd without fast-math
+  a[13] = -b[13] + c[13] * -d[13];	// fneg, fmsub without fast-math
+  a[15] = -( b[15] + c[15] * -d[15]);	// fneg, fnmadd without fast-math
+  a[17] = -(-b[17] + c[17] * -d[17]);	// fneg, fnmsub without fast-math
+}
+
+void foos(float *a, float *b, float *c, float *d)
+{
+#if 0
+  a[1] =  b[1] - c[1] * d[1];		// fneg, fmadd without fast-math
+#endif
+  a[3] = -b[3] - c[3] * d[3];		// fneg, fmsub without fast-math
+#if 0
+  a[5] = -( b[5] - c[5] * d[5]);	// fneg, fnmadd without fast-math
+#endif
+  a[7] = -(-b[7] - c[7] * d[7]);	// fneg, fnmsub without fast-math
+  a[11] =  b[11] + c[11] * -d[11];	// fneg, fmadd without fast-math
+  a[13] = -b[13] + c[13] * -d[13];	// fneg, fmsub without fast-math
+  a[15] = -( b[15] + c[15] * -d[15]);	// fneg, fnmadd without fast-math
+  a[17] = -(-b[17] + c[17] * -d[17]);	// fneg, fnmsub without fast-math
+}
+
============================================================

* Re: rs6000 fused multiply-add patch [+ patchlet]
@ 2002-12-29  5:13 Richard Kenner
  2002-12-29 19:25 ` Segher Boessenkool
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Kenner @ 2002-12-29  5:13 UTC (permalink / raw)
  To: segher; +Cc: gcc-patches

    The simplifications are implemented, but combine won't simplify two insns
    into two different insns.  

Correct, and it *must* not.

The reason is that combine must always *decrease* the number of insns
to avoid infinite loops.

    The attached patch fixes this.  

This patch is *not* OK.
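
The termination argument can be sketched with a toy model (this is not GCC's combine; the opcodes, rules and function names are invented for illustration). A rewrite that always removes an element can fire at most n-1 times on a sequence of n elements, while a 2->2 rewrite has no such bound and may undo itself indefinitely.

```c
#include <stddef.h>
#include <string.h>

/* Toy opcodes standing in for insns.  */
enum { OP_X, OP_Y, OP_Z };

/* Hypothetical 2->1 rule: an adjacent (X, Y) pair merges into Z.
   Every application shortens the sequence by one, so the loop
   runs at most n-1 times.  Returns the number of merges done.  */
static size_t merge_pairs (int *seq, size_t *n)
{
  size_t steps = 0;
  size_t i = 0;

  while (i + 1 < *n)
    {
      if (seq[i] == OP_X && seq[i + 1] == OP_Y)
        {
          seq[i] = OP_Z;
          memmove (&seq[i + 1], &seq[i + 2],
                   (*n - i - 2) * sizeof seq[0]);
          --*n;
          ++steps;
          i = 0;                /* rescan from the start */
        }
      else
        ++i;
    }
  return steps;
}

/* Hypothetical 2->2 rule: an adjacent (X, Y) or (Y, X) pair is
   swapped.  The length never changes, so nothing bounds the number
   of applications; MAX_STEPS is only there to stop the demo.
   Returns 1 if a fixed point was reached, 0 if the limit was hit.  */
static int rewrite_pairs (int *seq, size_t n, size_t max_steps)
{
  for (size_t step = 0; step < max_steps; ++step)
    {
      int changed = 0;

      for (size_t i = 0; i + 1 < n; ++i)
        if ((seq[i] == OP_X && seq[i + 1] == OP_Y)
            || (seq[i] == OP_Y && seq[i + 1] == OP_X))
          {
            int t = seq[i];
            seq[i] = seq[i + 1];
            seq[i + 1] = t;
            changed = 1;
            break;
          }

      if (!changed)
        return 1;
    }
  return 0;
}
```

On the sequence X, Y, X, Y the merging rule stops after two steps with Z, Z; on X, Y the swapping rule never reaches a fixed point no matter how large the step limit is.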

* Re: rs6000 fused multiply-add patch [+ patchlet]
@ 2002-12-30  4:58 Richard Kenner
  2002-12-30 20:17 ` Segher Boessenkool
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Kenner @ 2002-12-30  4:58 UTC (permalink / raw)
  To: segher; +Cc: gcc-patches

    Isn't this part of the patch preventing such loops?

Yes, that's *part* of the code, but the other part is what you've taken
out: not combining two insns into two!

The *purpose* of combine is to reduce the number of insns.  Why would it
be helpful to replace two insns by two?  You still have the same number
of insns.  What's better about it?

* Re: rs6000 fused multiply-add patch [+ patchlet]
@ 2002-12-31  3:59 Richard Kenner
  2003-01-05 23:24 ` Zack Weinberg
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Kenner @ 2002-12-31  3:59 UTC (permalink / raw)
  To: segher; +Cc: gcc-patches

    Without it, lots of simplifications don't ever get applied.  This results
    in worse code.  For example, with the patch applied, bootstrap time goes
    down by a few percent (powerpc-unknown-linux-gnu), as well as code size.

    One common example is, without the patch, computations involving bitfields
    use mfcr insns; with it, they use logic instructions.

But that's not what combine is supposed to do!  The purpose of combine
is what its name says, to *combine* insns.

If there is a simpler way to do an insn, it should be in the MD file.

* Re: rs6000 fused multiply-add patch [+ patchlet]
@ 2003-01-02  1:39 Richard Kenner
  2003-01-04  1:59 ` Segher Boessenkool
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Kenner @ 2003-01-02  1:39 UTC (permalink / raw)
  To: segher; +Cc: gcc-patches

    Yes, but sometimes doing a 2->2 simplification will allow it to do a
    2->1 simplification, or two 2->2 simplifications will allow a 3->2, or
    maybe some even longer chain.

But that should get done when combining 3 insns, for example.

* Re: rs6000 fused multiply-add patch [+ patchlet]
@ 2003-01-06 22:09 Richard Kenner
  0 siblings, 0 replies; 33+ messages in thread
From: Richard Kenner @ 2003-01-06 22:09 UTC (permalink / raw)
  To: dje; +Cc: gcc-patches

    I don't think that anyone is objecting to the concept and the benefit.
    If I understand correctly, the patch violates the semantics of the
    combiner algorithm which requires a declining cost calculated as the
    number of instructions.  Allowing combinations that do not decrease
    the cost would make the algorithm non-deterministic and possibly not
    converge, right?

That's what I'm saying, yes.  I'm out of the country right now with poor
email access and will have more to say on this when I get back in a few days.

* Re: rs6000 fused multiply-add patch [+ patchlet]
@ 2003-01-10  2:15 Richard Kenner
  2003-01-12  1:38 ` Segher Boessenkool
  0 siblings, 1 reply; 33+ messages in thread
From: Richard Kenner @ 2003-01-10  2:15 UTC (permalink / raw)
  To: segher; +Cc: gcc-patches

    Reducing the amount of rtl isn't the same as improving performance.

No, but unless the MD file has a serious problem, it is *one way* of
improving performance.

    The only difference in this regard is that the original combine
    is easier to prove to be terminating.

And that it sticks to the semantics, which is *combining* insns.

There's no question that we don't currently have a pass which is good at
finding the simplest form of the RTL.  The idea is to move as much of
combine as possible into simplify-rtx.c.  Then combine is very small and
other passes will do what you are trying to do.


Thread overview: 33+ messages
2002-12-02 19:01 rs6000 fused multiply-add patch Geoffrey Keating
2002-12-03 15:41 ` Segher Boessenkool
2002-12-03 16:59   ` Geoff Keating
2002-12-03 17:12     ` Segher Boessenkool
2002-12-03 17:29       ` David Edelsohn
2002-12-04 19:41         ` Segher Boessenkool
2002-12-05 14:04           ` Geoff Keating
2002-12-20 21:08             ` rs6000 fused multiply-add patch [+ patchlet] Segher Boessenkool
2002-12-20 21:38               ` Geoff Keating
2002-12-20 22:21                 ` Segher Boessenkool
2002-12-20 22:28                   ` David Edelsohn
2002-12-21 21:55                   ` Geoff Keating
2002-12-28 22:08                   ` Segher Boessenkool
2002-12-29  5:13 Richard Kenner
2002-12-29 19:25 ` Segher Boessenkool
2002-12-30  4:58 Richard Kenner
2002-12-30 20:17 ` Segher Boessenkool
2002-12-30 20:51   ` David Edelsohn
2003-01-02  0:52     ` Segher Boessenkool
2003-01-02  1:44       ` Geoff Keating
2003-01-04  1:59         ` Segher Boessenkool
2002-12-31  3:59 Richard Kenner
2003-01-05 23:24 ` Zack Weinberg
2003-01-06  2:26   ` David Edelsohn
2003-01-06  4:02     ` Segher Boessenkool
2003-01-06  4:07       ` Segher Boessenkool
2003-01-06 23:21     ` Geoff Keating
2003-01-09 22:41       ` Segher Boessenkool
2003-01-02  1:39 Richard Kenner
2003-01-04  1:59 ` Segher Boessenkool
2003-01-06 22:09 Richard Kenner
2003-01-10  2:15 Richard Kenner
2003-01-12  1:38 ` Segher Boessenkool
