public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
@ 2017-06-12  7:56 Tamar Christina
  2017-06-12  9:10 ` Richard Biener
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Tamar Christina @ 2017-06-12  7:56 UTC (permalink / raw)
  To: GCC Patches; +Cc: nd, law, ian, rguenther

[-- Attachment #1: Type: text/plain, Size: 1884 bytes --]

Hi All,

this patch implements a optimization rewriting

x * copysign (1.0, y) and 
x * copysign (-1.0, y) 

to:

x ^ (y & (1 << sign_bit_position))

This is done by creating a special builtin during matching and generate the
appropriate instructions during expand. This new builtin is called XORSIGN.

The expansion of xorsign depends on if the backend has an appropriate optab
available. If this is not the case then we use a modified version of the existing
copysign which does not take the abs value of the first argument as a fall back.

This patch is a revival of a previous patch
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html

Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
Regression done on aarch64-none-linux-gnu and no regressions.

Ok for trunk?

gcc/
2017-06-07  Tamar Christina  <tamar.christina@arm.com>

	* builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
	(BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
	* match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
	(mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
	(copysigns @0 (negate @1)): Likewise.
	* builtins.c (expand_builtin_copysign): Promoted local to argument.
	(expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
	CASE_FLT_FN (BUILT_IN_XORSIGN).
	(BUILT_IN_COPYSIGN): Updated function call.
	* optabs.h (expand_copysign): New bool.
	(expand_xorsign): New.
	* optabs.def (xorsign_optab): New.
	* optabs.c (expand_copysign): New parameter.
	* fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
	* fortran/mathbuiltins.def (XORSIGN): New.

gcc/testsuite/
2017-06-07  Tamar Christina  <tamar.christina@arm.com>

	* gcc.dg/tree-ssa/xorsign.c: New.
	* gcc.dg/xorsign_exec.c: New.
	* gcc.dg/vec-xorsign_exec.c: New.
	* gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: xorgsign-1.patch --]
[-- Type: text/x-patch; name="xorgsign-1.patch", Size: 14527 bytes --]

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 30462ad0f419721fd0aa2029dbc9f8f5593b5823..2a84bebf5f6235f84a0f46f15ba2fed67b1d5564 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -5117,10 +5117,12 @@ expand_builtin_fabs (tree exp, rtx target, rtx subtarget)
 /* Expand EXP, a call to copysign, copysignf, or copysignl.
    Return NULL is a normal call should be emitted rather than expanding the
    function inline.  If convenient, the result should be placed in TARGET.
-   SUBTARGET may be used as the target for computing the operand.  */
+   SUBTARGET may be used as the target for computing the operand.
+   If OP0_NEEDS_ABS is true then abs() will be performed on the first
+   argument.  */
 
 static rtx
-expand_builtin_copysign (tree exp, rtx target, rtx subtarget)
+expand_builtin_copysign (tree exp, rtx target, rtx subtarget, bool op0_needs_abs)
 {
   rtx op0, op1;
   tree arg;
@@ -5134,7 +5136,7 @@ expand_builtin_copysign (tree exp, rtx target, rtx subtarget)
   arg = CALL_EXPR_ARG (exp, 1);
   op1 = expand_normal (arg);
 
-  return expand_copysign (op0, op1, target);
+  return expand_copysign (op0, op1, target, op0_needs_abs);
 }
 
 /* Expand a call to __builtin___clear_cache.  */
@@ -6586,7 +6588,14 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
 
     CASE_FLT_FN (BUILT_IN_COPYSIGN):
     CASE_FLT_FN_FLOATN_NX (BUILT_IN_COPYSIGN):
-      target = expand_builtin_copysign (exp, target, subtarget);
+      target = expand_builtin_copysign (exp, target, subtarget, true);
+      if (target)
+	return target;
+      break;
+
+    CASE_FLT_FN (BUILT_IN_XORSIGN):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN):
+      target = expand_builtin_copysign (exp, target, subtarget, false);
       if (target)
 	return target;
       break;
@@ -7688,7 +7697,7 @@ builtin_mathfn_code (const_tree t)
   const_call_expr_arg_iterator iter;
 
   if (TREE_CODE (t) != CALL_EXPR
-      || TREE_CODE (CALL_EXPR_FN (t)) != ADDR_EXPR)
+      || (CALL_EXPR_FN (t) && TREE_CODE (CALL_EXPR_FN (t)) != ADDR_EXPR))
     return END_BUILTINS;
 
   fndecl = get_callee_fndecl (t);
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 58d78dbbdee58df77fb7bad904362327704403c5..9508fc35d622369ab5b89fc63d3add3728931279 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -325,6 +325,12 @@ DEF_C99_BUILTIN        (BUILT_IN_COPYSIGNL, "copysignl", BT_FN_LONGDOUBLE_LONGDO
 #define COPYSIGN_TYPE(F) BT_FN_##F##_##F##_##F
 DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_COPYSIGN, "copysign", COPYSIGN_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
 #undef COPYSIGN_TYPE
+DEF_GCC_BUILTIN        (BUILT_IN_XORSIGN, "xorsign", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_XORSIGNF, "xorsignf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_XORSIGNL, "xorsignl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#define XORSIGN_TYPE(F) BT_FN_##F##_##F##_##F
+DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_XORSIGN, "xorsign", XORSIGN_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#undef XORSIGN_TYPE
 DEF_LIB_BUILTIN        (BUILT_IN_COS, "cos", BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_COSF, "cosf", BT_FN_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING)
 DEF_LIB_BUILTIN        (BUILT_IN_COSH, "cosh", BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
diff --git a/gcc/fortran/f95-lang.c b/gcc/fortran/f95-lang.c
index 44bd8dcc2ad028a8e1ec1cd477d9e8dc1790fb57..8c5fc72aed654f2470d1d57938935e0b87964fe8 100644
--- a/gcc/fortran/f95-lang.c
+++ b/gcc/fortran/f95-lang.c
@@ -747,6 +747,16 @@ gfc_init_builtin_functions (void)
   gfc_define_builtin ("__builtin_copysignf", mfunc_float[1], 
 		      BUILT_IN_COPYSIGNF, "copysignf",
 		      ATTR_CONST_NOTHROW_LEAF_LIST);
+
+  gfc_define_builtin ("__builtin_xorsignl", mfunc_longdouble[1],
+                      BUILT_IN_XORSIGNL, "xorsignl",
+                      ATTR_CONST_NOTHROW_LEAF_LIST);
+  gfc_define_builtin ("__builtin_xorsign", mfunc_double[1],
+                      BUILT_IN_XORSIGN, "xorsign",
+                      ATTR_CONST_NOTHROW_LEAF_LIST);
+  gfc_define_builtin ("__builtin_xorsignf", mfunc_float[1],
+                      BUILT_IN_XORSIGNF, "xorsignf",
+                      ATTR_CONST_NOTHROW_LEAF_LIST);
  
   gfc_define_builtin ("__builtin_nextafterl", mfunc_longdouble[1], 
 		      BUILT_IN_NEXTAFTERL, "nextafterl",
diff --git a/gcc/fortran/mathbuiltins.def b/gcc/fortran/mathbuiltins.def
index fadfedb25ffe16806c82318e8c3f13a9993f96ff..e865735c028084dd5df220559adddd23ecec5de9 100644
--- a/gcc/fortran/mathbuiltins.def
+++ b/gcc/fortran/mathbuiltins.def
@@ -58,6 +58,7 @@ DEFINE_MATH_BUILTIN   (HYPOT, "hypot",  1)
    double and long double) and to build the quad-precision decls.  */
 OTHER_BUILTIN (CABS,      "cabs",      cabs,    true)
 OTHER_BUILTIN (COPYSIGN,  "copysign",  2,       true)
+OTHER_BUILTIN (XORSIGN,   "xorsign" ,  2,       true)
 OTHER_BUILTIN (CPOW,      "cpow",      cpow,    true)
 OTHER_BUILTIN (FABS,      "fabs",      1,       true)
 OTHER_BUILTIN (FMOD,      "fmod",      2,       true)
diff --git a/gcc/match.pd b/gcc/match.pd
index 54a8e0449f8301ffaf553c139bbd2d7ccb1e8648..d6ce8f606f1fdf79020cb0f18a010fc554ca39e6 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -441,6 +441,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (coss (copysigns @0 @1))
    (coss @0)))
 
+/* x * copysign(1.0, y) -> xorsign(x, y).  */
+(for copysigns (COPYSIGN)
+     xorsigns (XORSIGN)
+ (simplify
+  (mult:c (copysigns:s real_onep @0) @1)
+    (xorsigns @1 @0)))
+
+
+/* x * copysign(-1.0, y) -> xorsign(x, y).  */
+(for copysigns (COPYSIGN)
+     xorsigns (XORSIGN)
+ (simplify
+  (mult:c (copysigns:s real_minus_onep @0) @1)
+    (xorsigns @1 @0)))
+
 /* pow(copysign(x, y), z) -> pow(x, z) if z is an even integer.  */
 (for pows (POW)
      copysigns (COPYSIGN)
diff --git a/gcc/optabs.c b/gcc/optabs.c
index b69e75943cc99d6c5792ae2f151f5343d2bdbc1a..a76d6e3041dd3b7024b6616de8d1390031390b1e 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -3433,24 +3433,26 @@ expand_copysign_bit (machine_mode mode, rtx op0, rtx op1, rtx target,
   return target;
 }
 
-/* Expand the C99 copysign operation.  OP0 and OP1 must be the same
-   scalar floating point mode.  Return NULL if we do not know how to
-   expand the operation inline.  */
+/* Expand the C99 copysign operation or an optimized 'xorsign' depending on
+   the value of OP0_NEEDS_ABS.
+   Essentially xorsign(abs(x),y) == copysign(x,y).
+   OP0 and OP1 must be the same scalar floating point mode.
+   Return NULL if we do not know how to expand the operation inline.  */
 
 rtx
-expand_copysign (rtx op0, rtx op1, rtx target)
+expand_copysign (rtx op0, rtx op1, rtx target, bool op0_needs_abs)
 {
   machine_mode mode = GET_MODE (op0);
   const struct real_format *fmt;
-  bool op0_is_abs;
   rtx temp;
 
   gcc_assert (SCALAR_FLOAT_MODE_P (mode));
   gcc_assert (GET_MODE (op1) == mode);
 
   /* First try to do it with a special instruction.  */
-  temp = expand_binop (mode, copysign_optab, op0, op1,
-		       target, 0, OPTAB_DIRECT);
+  temp = expand_binop (mode, op0_needs_abs ? copysign_optab : xorsign_optab,
+		       op0, op1, target, 0, OPTAB_DIRECT);
+
   if (temp)
     return temp;
 
@@ -3458,12 +3460,11 @@ expand_copysign (rtx op0, rtx op1, rtx target)
   if (fmt == NULL || !fmt->has_signed_zero)
     return NULL_RTX;
 
-  op0_is_abs = false;
-  if (CONST_DOUBLE_AS_FLOAT_P (op0))
+  if (op0_needs_abs && CONST_DOUBLE_AS_FLOAT_P (op0))
     {
       if (real_isneg (CONST_DOUBLE_REAL_VALUE (op0)))
 	op0 = simplify_unary_operation (ABS, mode, op0, mode);
-      op0_is_abs = true;
+      op0_needs_abs = false;
     }
 
   if (fmt->signbit_ro >= 0
@@ -3472,7 +3473,7 @@ expand_copysign (rtx op0, rtx op1, rtx target)
 	      && optab_handler (abs_optab, mode) != CODE_FOR_nothing)))
     {
       temp = expand_copysign_absneg (mode, op0, op1, target,
-				     fmt->signbit_ro, op0_is_abs);
+				     fmt->signbit_ro, !op0_needs_abs);
       if (temp)
 	return temp;
     }
@@ -3480,7 +3481,7 @@ expand_copysign (rtx op0, rtx op1, rtx target)
   if (fmt->signbit_rw < 0)
     return NULL_RTX;
   return expand_copysign_bit (mode, op0, op1, target,
-			      fmt->signbit_rw, op0_is_abs);
+			      fmt->signbit_rw, !op0_needs_abs);
 }
 \f
 /* Generate an instruction whose insn-code is INSN_CODE,
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 504c655be0210a72e7f0f34cde0131e21ccf8089..507af9ea81c42eba98780ef326c22847fdd2c043 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -253,6 +253,7 @@ OPTAB_D (asin_optab, "asin$a2")
 OPTAB_D (atan2_optab, "atan2$a3")
 OPTAB_D (atan_optab, "atan$a2")
 OPTAB_D (copysign_optab, "copysign$F$a3")
+OPTAB_D (xorsign_optab, "xorsign$F$a3")
 OPTAB_D (cos_optab, "cos$a2")
 OPTAB_D (exp10_optab, "exp10$a2")
 OPTAB_D (exp2_optab, "exp2$a2")
diff --git a/gcc/optabs.h b/gcc/optabs.h
index 728b866f08db01de2cc330ad29088ab252f4d3ad..676be0bc6c8b49e46d868ff6610de2675a8a6b79 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -220,7 +220,8 @@ extern rtx expand_abs (machine_mode, rtx, rtx, int, int);
 extern rtx expand_one_cmpl_abs_nojump (machine_mode, rtx, rtx);
 
 /* Expand the copysign operation.  */
-extern rtx expand_copysign (rtx, rtx, rtx);
+extern rtx expand_copysign (rtx, rtx, rtx, bool);
+
 /* Generate an instruction with a given INSN_CODE with an output and
    an input.  */
 extern bool maybe_emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-39.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-39.c
index 9befe184a018515da50a4dd14782d79482fd07d9..b917146bc812301194feef9c145ad93e9c2ee446 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-39.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-39.c
@@ -20,13 +20,13 @@ f1 (float x)
 double
 f2 (double x, double y)
 {
-  return x * ((1.0/12) * __builtin_copysign (1.0, y));
+  return x * ((1.0/12) * __builtin_copysign (2.0, y));
 }
 
 double
 f3 (double x, double y)
 {
-  return (x * (-1.0/12)) * __builtin_copysign (1.0, y);
+  return (x * (-1.0/12)) * __builtin_copysign (2.0, y);
 }
 
 double
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/xorsign.c b/gcc/testsuite/gcc.dg/tree-ssa/xorsign.c
new file mode 100644
index 0000000000000000000000000000000000000000..5989727bf9a80f5ff487dd7267b96cc32745f345
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/xorsign.c
@@ -0,0 +1,85 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-gimple" } */
+
+double
+check_d_pos (double x, double y)
+{
+  return x * __builtin_copysign (1.0, y);
+}
+
+float
+check_f_pos (float x, float y)
+{
+  return x * __builtin_copysignf (1.0f, y);
+}
+
+long double
+check_l_pos (long double x, long double y)
+{
+  return x * __builtin_copysignl (1.0, y);
+}
+
+/* --------------- */
+
+double
+check_d_neg (double x, double y)
+{
+  return x * __builtin_copysign (-1.0, y);
+}
+
+float
+check_f_neg (float x, float y)
+{
+  return x * __builtin_copysignf (-1.0f, y);
+}
+
+long double
+check_l_neg (long double x, long double y)
+{
+  return x * __builtin_copysignl (-1.0, y);
+}
+
+/* --------------- */
+
+double
+check_d_pos_rev (double x, double y)
+{
+  return __builtin_copysign (1.0, y) * x;
+}
+
+float
+check_f_pos_rev (float x, float y)
+{
+  return __builtin_copysignf (1.0f, y) * x;
+}
+
+long double
+check_l_pos_rev (long double x, long double y)
+{
+  return __builtin_copysignl (1.0, y) * x;
+}
+
+/* --------------- */
+
+double
+check_d_neg_rev (double x, double y)
+{
+  return __builtin_copysign (-1.0, y) * x;
+}
+
+float
+check_f_neg_rev (float x, float y)
+{
+  return __builtin_copysignf (-1.0f, y) * x;
+}
+
+long double
+check_l_neg_rev (long double x, long double y)
+{
+  return __builtin_copysignl (-1.0, y) * x;
+}
+
+/* { dg-final { scan-tree-dump-times "xorsign" 12 "gimple"} } */
+/* { dg-final { scan-tree-dump-times "xorsignf" 4 "gimple"} } */
+/* { dg-final { scan-tree-dump-times "xorsignl" 4 "gimple"} } */
+/* { dg-final { scan-assembler-not "__builtin_xorsign" } } */
diff --git a/gcc/testsuite/gcc.dg/vec-xorsign_exec.c b/gcc/testsuite/gcc.dg/vec-xorsign_exec.c
new file mode 100644
index 0000000000000000000000000000000000000000..f8c8befd336c7f2743a1621d3b0f53d78bab9df7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vec-xorsign_exec.c
@@ -0,0 +1,53 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
+/* { dg-additional-options "-march=armv8-a" { target { aarch64*-*-* } } }*/
+
+extern void abort ();
+
+#define N 16
+float a[N] = {-0.1f, -3.2f, -6.3f, -9.4f,
+	      -12.5f, -15.6f, -18.7f, -21.8f,
+	      24.9f, 27.1f, 30.2f, 33.3f,
+	      36.4f, 39.5f, 42.6f, 45.7f};
+float b[N] = {-1.2f, 3.4f, -5.6f, 7.8f,
+	      -9.0f, 1.0f, -2.0f, 3.0f,
+	      -4.0f, -5.0f, 6.0f, 7.0f,
+	      -8.0f, -9.0f, 10.0f, 11.0f};
+float r[N];
+
+float ad[N] = {-0.1fd,  -3.2d,  -6.3d,  -9.4d,
+               -12.5d, -15.6d, -18.7d, -21.8d,
+                24.9d,  27.1d,  30.2d,  33.3d,
+                36.4d,  39.5d,  42.6d, 45.7d};
+float bd[N] = {-1.2d,  3.4d, -5.6d,  7.8d,
+               -9.0d,  1.0d, -2.0d,  3.0d,
+               -4.0d, -5.0d,  6.0d,  7.0d,
+               -8.0d, -9.0d, 10.0d, 11.0d};
+float rd[N];
+
+int
+main (void)
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+    r[i] = a[i] * _builtin_copysignf (1.0f, b[i]);
+
+  /* check results:  */
+  for (i = 0; i < N; i++)
+    if (r[i] != a[i] * __builtin_copysignf (1.0f, b[i]))
+      abort ();
+
+  for (i = 0; i < N; i++)
+    rd[i] = ad[i] * _builtin_copysignd (1.0d, bd[i]);
+
+  /* check results:  */
+  for (i = 0; i < N; i++)
+    if (r[i] != ad[i] * __builtin_copysignd (1.0d, bd[i]))
+      abort ();
+
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/xorsign_exec.c b/gcc/testsuite/gcc.dg/xorsign_exec.c
new file mode 100644
index 0000000000000000000000000000000000000000..64bf8044cbd12c1cc744ff9b2a3308d71267bff0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/xorsign_exec.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+/* { dg-options "-O -ffast-math" } */
+
+#include <math.h>
+
+extern void abort(void);
+
+static double x = 2.0;
+static float  y = 2.0;
+
+int main()
+{
+  if ((2.5 * __builtin_copysign(1.0d, x)) != 2.5)
+     abort();
+
+  if ((2.5 * __builtin_copysign(1.0f, y)) != 2.5)
+     abort();
+
+  if ((2.5 * __builtin_copysignf(1.0d, -x)) != -2.5)
+     abort();
+
+  if ((2.5 * __builtin_copysignf(1.0f, -y)) != -2.5)
+     abort();
+
+  return 0;
+}

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-06-12  7:56 [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)] Tamar Christina
@ 2017-06-12  9:10 ` Richard Biener
  2017-06-12 16:27   ` Richard Sandiford
  2017-06-13 10:17   ` Tamar Christina
  2017-06-12 16:52 ` Joseph Myers
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 21+ messages in thread
From: Richard Biener @ 2017-06-12  9:10 UTC (permalink / raw)
  To: Tamar Christina; +Cc: GCC Patches, nd, law, ian, rdsandiford

On Mon, 12 Jun 2017, Tamar Christina wrote:

> Hi All,
> 
> this patch implements a optimization rewriting
> 
> x * copysign (1.0, y) and 
> x * copysign (-1.0, y) 
> 
> to:
> 
> x ^ (y & (1 << sign_bit_position))
> 
> This is done by creating a special builtin during matching and generate the
> appropriate instructions during expand. This new builtin is called XORSIGN.
> 
> The expansion of xorsign depends on if the backend has an appropriate optab
> available. If this is not the case then we use a modified version of the existing
> copysign which does not take the abs value of the first argument as a fall back.
> 
> This patch is a revival of a previous patch
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> 
> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> Regression done on aarch64-none-linux-gnu and no regressions.
> 
> Ok for trunk?

Without looking at the patch a few comments.

First, nowadays please add an internal function instead of builtins.
You can even take advantage of Richards work to directly tie those
to optabs (he might want to chime in to tell you how).  You don't need
the fortran FE changes in that case.

Second, I think we should canonicalize copysign (-CST, x) to
copysign (CST, x), thus positive real constants.  You should be
able to use

(simplify
 (copysign negate_expr_p@0 @1)
 (copysign @0 @1))

to catch both the (negate @0) and REAL_CST case.

What does

>       (copysigns @0 (negate @1)): Likewise.

do?

Third, new IL that is present throughout the compilation always
poses the risk that while passes may be able to handle copysign
they do not handle xorsign (vectorization?).  In this case it
looks like the matching is simply to enhance RTL expansion
which means it should ideally be done close to RTL expansion
only.  If you write

(match (xorsign_p @0 @1)
 (mult:c (copysign real_onep @0) @1))

you can call gimple_xorsign_p (you need to declare it, see
the generated gimple-match.c file for the definition) from,
say, pass_optimize_widening_mul, which despite its name
is used as a kitchen-sink for late GIMPLE pattern-matching
stuff to enhance RTL expansion / instruction selection.

Thanks,
Richard.

> 
> gcc/
> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> 
> 	* builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> 	(BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> 	* match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> 	(mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> 	(copysigns @0 (negate @1)): Likewise.
> 	* builtins.c (expand_builtin_copysign): Promoted local to argument.
> 	(expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
> 	CASE_FLT_FN (BUILT_IN_XORSIGN).
> 	(BUILT_IN_COPYSIGN): Updated function call.
> 	* optabs.h (expand_copysign): New bool.
> 	(expand_xorsign): New.
> 	* optabs.def (xorsign_optab): New.
> 	* optabs.c (expand_copysign): New parameter.
> 	* fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> 	* fortran/mathbuiltins.def (XORSIGN): New.
> 
> gcc/testsuite/
> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> 
> 	* gcc.dg/tree-ssa/xorsign.c: New.
> 	* gcc.dg/xorsign_exec.c: New.
> 	* gcc.dg/vec-xorsign_exec.c: New.
> 	* gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-06-12  9:10 ` Richard Biener
@ 2017-06-12 16:27   ` Richard Sandiford
  2017-06-13 10:22     ` Tamar Christina
  2017-06-13 10:17   ` Tamar Christina
  1 sibling, 1 reply; 21+ messages in thread
From: Richard Sandiford @ 2017-06-12 16:27 UTC (permalink / raw)
  To: Richard Biener; +Cc: Tamar Christina, GCC Patches, nd, law, ian

Richard Biener <rguenther@suse.de> writes:
> On Mon, 12 Jun 2017, Tamar Christina wrote:
>> Hi All,
>> 
>> this patch implements a optimization rewriting
>> 
>> x * copysign (1.0, y) and 
>> x * copysign (-1.0, y) 
>> 
>> to:
>> 
>> x ^ (y & (1 << sign_bit_position))
>> 
>> This is done by creating a special builtin during matching and generate the
>> appropriate instructions during expand. This new builtin is called XORSIGN.
>> 
>> The expansion of xorsign depends on if the backend has an appropriate optab
>> available. If this is not the case then we use a modified version of
>> the existing
>> copysign which does not take the abs value of the first argument as a
>> fall back.
>> 
>> This patch is a revival of a previous patch
>> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
>> 
>> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
>> Regression done on aarch64-none-linux-gnu and no regressions.
>> 
>> Ok for trunk?
>
> Without looking at the patch a few comments.
>
> First, nowadays please add an internal function instead of builtins.
> You can even take advantage of Richards work to directly tie those
> to optabs (he might want to chime in to tell you how).  You don't need
> the fortran FE changes in that case.

Yeah, it should just be a case of adding:

DEF_INTERNAL_OPTAB_FN (XORSIGN, ECF_CONST, xorsign, binary)

to internal-fn.def.  The supposedly useful thing about this is that it
automatically extends to vectors, so you shouldn't need the xorsign
vector builtins or the aarch64_builtin_vectorized_function change.

However, we don't yet support SLP vectorisation of internal functions.
I have a patch for that that I've been looking for an excuse to post
(at the moment I think it only helps SVE).  If this goes in I can
post it as a follow-on.

In:

> diff --git a/gcc/testsuite/gcc.dg/vec-xorsign_exec.c b/gcc/testsuite/gcc.dg/vec-xorsign_exec.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..f8c8befd336c7f2743a1621d3b0f53d78bab9df7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vec-xorsign_exec.c
> @@ -0,0 +1,53 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
> +/* { dg-additional-options "-march=armv8-a" { target { aarch64*-*-* } } }*/
> +
> +extern void abort ();
> +
> +#define N 16
> +float a[N] = {-0.1f, -3.2f, -6.3f, -9.4f,
> +	      -12.5f, -15.6f, -18.7f, -21.8f,
> +	      24.9f, 27.1f, 30.2f, 33.3f,
> +	      36.4f, 39.5f, 42.6f, 45.7f};
> +float b[N] = {-1.2f, 3.4f, -5.6f, 7.8f,
> +	      -9.0f, 1.0f, -2.0f, 3.0f,
> +	      -4.0f, -5.0f, 6.0f, 7.0f,
> +	      -8.0f, -9.0f, 10.0f, 11.0f};
> +float r[N];
> +
> +float ad[N] = {-0.1fd,  -3.2d,  -6.3d,  -9.4d,
> +               -12.5d, -15.6d, -18.7d, -21.8d,
> +                24.9d,  27.1d,  30.2d,  33.3d,
> +                36.4d,  39.5d,  42.6d, 45.7d};
> +float bd[N] = {-1.2d,  3.4d, -5.6d,  7.8d,
> +               -9.0d,  1.0d, -2.0d,  3.0d,
> +               -4.0d, -5.0d,  6.0d,  7.0d,
> +               -8.0d, -9.0d, 10.0d, 11.0d};
> +float rd[N];

Looks like these last three were meant to be doubles.

> +
> +int
> +main (void)
> +{
> +  int i;
> +
> +  for (i = 0; i < N; i++)
> +    r[i] = a[i] * _builtin_copysignf (1.0f, b[i]);
> +
> +  /* check results:  */
> +  for (i = 0; i < N; i++)
> +    if (r[i] != a[i] * __builtin_copysignf (1.0f, b[i]))
> +      abort ();
> +
> +  for (i = 0; i < N; i++)
> +    rd[i] = ad[i] * _builtin_copysignd (1.0d, bd[i]);
> +
> +  /* check results:  */
> +  for (i = 0; i < N; i++)
> +    if (r[i] != ad[i] * __builtin_copysignd (1.0d, bd[i]))
> +      abort ();
> +
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */

Why does only one loop get vectorised?

Thanks,
Richard

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-06-12  7:56 [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)] Tamar Christina
  2017-06-12  9:10 ` Richard Biener
@ 2017-06-12 16:52 ` Joseph Myers
  2017-06-12 19:50 ` Michael Meissner
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: Joseph Myers @ 2017-06-12 16:52 UTC (permalink / raw)
  To: Tamar Christina; +Cc: GCC Patches, nd, law, ian, rguenther

On Mon, 12 Jun 2017, Tamar Christina wrote:

> x * copysign (1.0, y) and 
> x * copysign (-1.0, y) 
> 
> to:
> 
> x ^ (y & (1 << sign_bit_position))

Note that this needs to be disabled for -fsignaling-nans, as if x is a 
signaling NaN, the multiplication converts it to a quiet NaN and raises 
"invalid".

> This is done by creating a special builtin during matching and generate the
> appropriate instructions during expand. This new builtin is called XORSIGN.

If the built-in function has a user-visible name such as __builtin_xorsign 
(as opposed to one that's not a C identifier), it needs to be documented 
as a user-visible feature.  I'd suggest not having such a user-visible 
built-in function.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-06-12  7:56 [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)] Tamar Christina
  2017-06-12  9:10 ` Richard Biener
  2017-06-12 16:52 ` Joseph Myers
@ 2017-06-12 19:50 ` Michael Meissner
  2017-06-13 10:14   ` Tamar Christina
  2017-06-24 23:53 ` Andrew Pinski
  2017-07-09 23:21 ` Andrew Pinski
  4 siblings, 1 reply; 21+ messages in thread
From: Michael Meissner @ 2017-06-12 19:50 UTC (permalink / raw)
  To: Tamar Christina; +Cc: GCC Patches, nd, law, ian, rguenther

On Mon, Jun 12, 2017 at 07:56:54AM +0000, Tamar Christina wrote:
> Hi All,
> 
> this patch implements a optimization rewriting
> 
> x * copysign (1.0, y) and 
> x * copysign (-1.0, y) 
> 
> to:
> 
> x ^ (y & (1 << sign_bit_position))
> 
> This is done by creating a special builtin during matching and generate the
> appropriate instructions during expand. This new builtin is called XORSIGN.
> 
> The expansion of xorsign depends on if the backend has an appropriate optab
> available. If this is not the case then we use a modified version of the existing
> copysign which does not take the abs value of the first argument as a fall back.
> 
> This patch is a revival of a previous patch
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> 
> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> Regression done on aarch64-none-linux-gnu and no regressions.
> 
> Ok for trunk?

Please only enable this if you have XORSIGN and XORSIGNF.

On the PowerPC this would involve moving the value from the vector/floating
point registers to the general purpose registers to do the XOR operation and
then back to the vector/floating point registers.

Note, the PowerPC has an instruction that does copysign directly.  It would be
better just to do the copysign/multiply.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-06-12 19:50 ` Michael Meissner
@ 2017-06-13 10:14   ` Tamar Christina
  0 siblings, 0 replies; 21+ messages in thread
From: Tamar Christina @ 2017-06-13 10:14 UTC (permalink / raw)
  To: Michael Meissner; +Cc: GCC Patches, nd, law, ian, rguenther



> 
> Please only enable this if you have XORSIGN and XORSIGNF.
> 
> On the PowerPC this would involve moving the value from the
> vector/floating point registers to the general purpose registers to do the XOR
> operation and then back to the vector/floating point registers.
> 

Fair enough, I think using Richard's earlier change request this should be fairly simple.
I'll update the patch.

Thanks

> 
> --
> Michael Meissner, IBM
> IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
> email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-06-12  9:10 ` Richard Biener
  2017-06-12 16:27   ` Richard Sandiford
@ 2017-06-13 10:17   ` Tamar Christina
  1 sibling, 0 replies; 21+ messages in thread
From: Tamar Christina @ 2017-06-13 10:17 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, nd, law, ian, rdsandiford

Hi Richard,

Thanks for the feedback, I'll update the patch accordingly.

> What does
> 
> >       (copysigns @0 (negate @1)): Likewise.
> 
> do?
> 

Sorry this slipped through my clean-up. The patch doesn't actually contain this definition anymore.

> Third, new IL that is present throughout the compilation always poses the
> risk that while passes may be able to handle copysign they do not handle
> xorsign (vectorization?).  In this case it looks like the matching is simply to
> enhance RTL expansion which means it should ideally be done close to RTL
> expansion only.  If you write
> 
> (match (xorsign_p @0 @1)
>  (mult:c (copysign real_onep @0) @1))
> 
> you can call gimple_xorsign_p (you need to declare it, see the generated
> gimple-match.c file for the definition) from, say,
> pass_optimize_widening_mul, which despite its name is used as a kitchen-
> sink for late GIMPLE pattern-matching stuff to enhance RTL expansion /
> instruction selection.
> 
> Thanks,
> Richard.
> 
> >
> > gcc/
> > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> >
> > 	* builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> > 	(BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> > 	* match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> > 	(mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> > 	(copysigns @0 (negate @1)): Likewise.
> > 	* builtins.c (expand_builtin_copysign): Promoted local to argument.
> > 	(expand_builtin): Added CASE_FLT_FN_FLOATN_NX
> (BUILT_IN_XORSIGN) and
> > 	CASE_FLT_FN (BUILT_IN_XORSIGN).
> > 	(BUILT_IN_COPYSIGN): Updated function call.
> > 	* optabs.h (expand_copysign): New bool.
> > 	(expand_xorsign): New.
> > 	* optabs.def (xorsign_optab): New.
> > 	* optabs.c (expand_copysign): New parameter.
> > 	* fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> > 	* fortran/mathbuiltins.def (XORSIGN): New.
> >
> > gcc/testsuite/
> > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> >
> > 	* gcc.dg/tree-ssa/xorsign.c: New.
> > 	* gcc.dg/xorsign_exec.c: New.
> > 	* gcc.dg/vec-xorsign_exec.c: New.
> > 	* gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-06-12 16:27   ` Richard Sandiford
@ 2017-06-13 10:22     ` Tamar Christina
  0 siblings, 0 replies; 21+ messages in thread
From: Tamar Christina @ 2017-06-13 10:22 UTC (permalink / raw)
  To: Richard Sandiford, Richard Biener; +Cc: GCC Patches, nd, law, ian

Hi Richard,

> > First, nowadays please add an internal function instead of builtins.
> > You can even take advantage of Richards work to directly tie those to
> > optabs (he might want to chime in to tell you how).  You don't need
> > the fortran FE changes in that case.
> 
> Yeah, it should just be a case of adding:
> 
> DEF_INTERNAL_OPTAB_FN (XORSIGN, ECF_CONST, xorsign, binary)
> 
> to internal-fn.def.  The supposedly useful thing about this is that it
> automatically extends to vectors, so you shouldn't need the xorsign vector
> builtins or the aarch64_builtin_vectorized_function change.

Ah, ok, thanks! I'll change it to an internal function.
And take a look at the testcases for the updated patch. 

> However, we don't yet support SLP vectorisation of internal functions.
> I have a patch for that that I've been looking for an excuse to post (at the
> moment I think it only helps SVE).  If this goes in I can post it as a follow-on.
> 
> In:
> 
> > diff --git a/gcc/testsuite/gcc.dg/vec-xorsign_exec.c
> > b/gcc/testsuite/gcc.dg/vec-xorsign_exec.c
> > new file mode 100644
> > index
> >
> 0000000000000000000000000000000000000000..f8c8befd336c7f2743a1621d3b
> 0f
> > 53d78bab9df7
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vec-xorsign_exec.c
> > @@ -0,0 +1,53 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
> > +/* { dg-additional-options "-march=armv8-a" { target { aarch64*-*-* }
> > +} }*/
> > +
> > +extern void abort ();
> > +
> > +#define N 16
> > +float a[N] = {-0.1f, -3.2f, -6.3f, -9.4f,
> > +	      -12.5f, -15.6f, -18.7f, -21.8f,
> > +	      24.9f, 27.1f, 30.2f, 33.3f,
> > +	      36.4f, 39.5f, 42.6f, 45.7f};
> > +float b[N] = {-1.2f, 3.4f, -5.6f, 7.8f,
> > +	      -9.0f, 1.0f, -2.0f, 3.0f,
> > +	      -4.0f, -5.0f, 6.0f, 7.0f,
> > +	      -8.0f, -9.0f, 10.0f, 11.0f};
> > +float r[N];
> > +
> > +float ad[N] = {-0.1fd,  -3.2d,  -6.3d,  -9.4d,
> > +               -12.5d, -15.6d, -18.7d, -21.8d,
> > +                24.9d,  27.1d,  30.2d,  33.3d,
> > +                36.4d,  39.5d,  42.6d, 45.7d}; float bd[N] = {-1.2d,
> > +3.4d, -5.6d,  7.8d,
> > +               -9.0d,  1.0d, -2.0d,  3.0d,
> > +               -4.0d, -5.0d,  6.0d,  7.0d,
> > +               -8.0d, -9.0d, 10.0d, 11.0d}; float rd[N];
> 
> Looks like these last three were meant to be doubles.
> 
> > +
> > +int
> > +main (void)
> > +{
> > +  int i;
> > +
> > +  for (i = 0; i < N; i++)
> > +    r[i] = a[i] * _builtin_copysignf (1.0f, b[i]);
> > +
> > +  /* check results:  */
> > +  for (i = 0; i < N; i++)
> > +    if (r[i] != a[i] * __builtin_copysignf (1.0f, b[i]))
> > +      abort ();
> > +
> > +  for (i = 0; i < N; i++)
> > +    rd[i] = ad[i] * _builtin_copysignd (1.0d, bd[i]);
> > +
> > +  /* check results:  */
> > +  for (i = 0; i < N; i++)
> > +    if (r[i] != ad[i] * __builtin_copysignd (1.0d, bd[i]))
> > +      abort ();
> > +
> > +
> > +  return 0;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" }
> > +} */
> 
> Why does only one loop get vectorised?
> 
> Thanks,
> Richard

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-06-12  7:56 [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)] Tamar Christina
                   ` (2 preceding siblings ...)
  2017-06-12 19:50 ` Michael Meissner
@ 2017-06-24 23:53 ` Andrew Pinski
  2017-06-26  2:09   ` Andrew Pinski
  2017-06-26  7:26   ` Richard Biener
  2017-07-09 23:21 ` Andrew Pinski
  4 siblings, 2 replies; 21+ messages in thread
From: Andrew Pinski @ 2017-06-24 23:53 UTC (permalink / raw)
  To: Tamar Christina; +Cc: GCC Patches, nd, law, ian, rguenther

On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
<Tamar.Christina@arm.com> wrote:
> Hi All,
>
> this patch implements a optimization rewriting
>
> x * copysign (1.0, y) and
> x * copysign (-1.0, y)


This reminds me:
copysign(-1.0, y) can be just optimized to:
copysign(1.0, y)

I did that in my patch here:
https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01860.html

This should allow you to reduce the number of patterns needed to match here.
Note I still think we could do this in expand without a new
builtin/internal function.
I might go and code that up soonish.

Thanks,
Andrew

>
> to:
>
> x ^ (y & (1 << sign_bit_position))
>
> This is done by creating a special builtin during matching and generate the
> appropriate instructions during expand. This new builtin is called XORSIGN.
>
> The expansion of xorsign depends on if the backend has an appropriate optab
> available. If this is not the case then we use a modified version of the existing
> copysign which does not take the abs value of the first argument as a fall back.
>
> This patch is a revival of a previous patch
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
>
> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> Regression done on aarch64-none-linux-gnu and no regressions.
>
> Ok for trunk?
>
> gcc/
> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
>
>         * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
>         (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
>         * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
>         (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
>         (copysigns @0 (negate @1)): Likewise.
>         * builtins.c (expand_builtin_copysign): Promoted local to argument.
>         (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
>         CASE_FLT_FN (BUILT_IN_XORSIGN).
>         (BUILT_IN_COPYSIGN): Updated function call.
>         * optabs.h (expand_copysign): New bool.
>         (expand_xorsign): New.
>         * optabs.def (xorsign_optab): New.
>         * optabs.c (expand_copysign): New parameter.
>         * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
>         * fortran/mathbuiltins.def (XORSIGN): New.
>
> gcc/testsuite/
> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
>
>         * gcc.dg/tree-ssa/xorsign.c: New.
>         * gcc.dg/xorsign_exec.c: New.
>         * gcc.dg/vec-xorsign_exec.c: New.
>         * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-06-24 23:53 ` Andrew Pinski
@ 2017-06-26  2:09   ` Andrew Pinski
  2017-06-26  8:46     ` Tamar Christina
  2017-06-26  7:26   ` Richard Biener
  1 sibling, 1 reply; 21+ messages in thread
From: Andrew Pinski @ 2017-06-26  2:09 UTC (permalink / raw)
  To: Tamar Christina; +Cc: GCC Patches, nd, law, ian, rguenther

[-- Attachment #1: Type: text/plain, Size: 3289 bytes --]

On Sat, Jun 24, 2017 at 4:53 PM, Andrew Pinski <pinskia@gmail.com> wrote:
> On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> <Tamar.Christina@arm.com> wrote:
>> Hi All,
>>
>> this patch implements a optimization rewriting
>>
>> x * copysign (1.0, y) and
>> x * copysign (-1.0, y)
>
>
> This reminds me:
> copysign(-1.0, y) can be just optimized to:
> copysign(1.0, y)
>
> I did that in my patch here:
> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01860.html

I updated the patch to handle all constants and not just -1.0.

>
> This should allow you to reduce the number of patterns needed to match here.
> Note I still think we could do this in expand without a new
> builtin/internal function.
> I might go and code that up soonish.

Also something like attached (NOTE this is NOT a full patch and needs
the xorsign optabs part of your patch) should work for the expand side
rather than creating a new builtin.  There still needs to handling of
the vector based copysign.  But you should get the general idea.  I
would like to see more of these special expand patterns really.

NOTE you can remove the target hook part and just check if xorsign
optab is there.  I don't know if that is what we want to do if not
allow for generic expanding of this.

Thanks,
Andrew Pinski


>
> Thanks,
> Andrew
>
>>
>> to:
>>
>> x ^ (y & (1 << sign_bit_position))
>>
>> This is done by creating a special builtin during matching and generate the
>> appropriate instructions during expand. This new builtin is called XORSIGN.
>>
>> The expansion of xorsign depends on if the backend has an appropriate optab
>> available. If this is not the case then we use a modified version of the existing
>> copysign which does not take the abs value of the first argument as a fall back.
>>
>> This patch is a revival of a previous patch
>> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
>>
>> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
>> Regression done on aarch64-none-linux-gnu and no regressions.
>>
>> Ok for trunk?
>>
>> gcc/
>> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
>>
>>         * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
>>         (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
>>         * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
>>         (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
>>         (copysigns @0 (negate @1)): Likewise.
>>         * builtins.c (expand_builtin_copysign): Promoted local to argument.
>>         (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
>>         CASE_FLT_FN (BUILT_IN_XORSIGN).
>>         (BUILT_IN_COPYSIGN): Updated function call.
>>         * optabs.h (expand_copysign): New bool.
>>         (expand_xorsign): New.
>>         * optabs.def (xorsign_optab): New.
>>         * optabs.c (expand_copysign): New parameter.
>>         * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
>>         * fortran/mathbuiltins.def (XORSIGN): New.
>>
>> gcc/testsuite/
>> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
>>
>>         * gcc.dg/tree-ssa/xorsign.c: New.
>>         * gcc.dg/xorsign_exec.c: New.
>>         * gcc.dg/vec-xorsign_exec.c: New.
>>         * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.

[-- Attachment #2: startmult-copysign.diff.txt --]
[-- Type: text/plain, Size: 2493 bytes --]

Index: gcc/expr.c
===================================================================
--- gcc/expr.c	(revision 249619)
+++ gcc/expr.c	(working copy)
@@ -8182,6 +8182,59 @@
   return NULL_RTX;
 }
 
+static bool
+is_copysign_call_with_1 (gimple *call)
+{
+  if (!is_gimple_call (call))
+    return false;
+
+  if (gimple_call_builtin_p (call, BUILT_IN_NORMAL))
+    {
+      gcall *c = as_a<gcall*> (call);
+      tree decl = gimple_call_fndecl (call);
+      switch (DECL_FUNCTION_CODE (decl))
+	{
+        CASE_FLT_FN (BUILT_IN_COPYSIGN):
+	CASE_FLT_FN_FLOATN_NX (BUILT_IN_COPYSIGN):
+	  return real_one_p (gimple_call_arg (c, 0));
+	default:
+	  return false;
+	}
+    }
+}
+
+static rtx
+maybe_expand_mult_copysign (tree treeop0, tree treeop1, rtx target)
+{
+  tree type = TREE_TYPE (treeop0);
+  rtx op0, op1;
+
+  if (!SCALAR_FLOAT_TYPE_P (type)
+      && VECTOR_FLOAT_TYPE_P (type))
+    return NULL;
+
+  if (HONOR_SNANS (type))
+    return NULL;
+
+  if (!targetm.expand_mult_copysign_xor ())
+    return NULL;
+
+  if (TREE_CODE (treeop0) == SSA_NAME)
+    {
+      gimple *call0 = SSA_NAME_DEF_STMT (treeop0);
+      if (is_copysign_call_with_1 (call0))
+	{
+	  gcall *c = as_a<gcall*> (call0);
+	  treeop0 = gimple_call_arg (c, 1);
+	  expand_operands (treeop1, treeop0, NULL_RTX, &op0, &op1, EXPAND_NORMAL);
+	  return expand_copysign (op0, op1, target, true);
+	}
+    }
+
+  return NULL;
+}
+
+
 rtx
 expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 		    enum expand_modifier modifier)
@@ -8791,6 +8844,10 @@
       if (modifier == EXPAND_STACK_PARM)
 	target = 0;
 
+      temp = maybe_expand_mult_copysign (treeop0, treeop1);
+      if (temp)
+	return temp;
+
       expand_operands (treeop0, treeop1, subtarget, &op0, &op1, EXPAND_NORMAL);
       return REDUCE_BIT_FIELD (expand_mult (mode, op0, op1, target, unsignedp));
 
Index: gcc/target.def
===================================================================
--- gcc/target.def	(revision 249619)
+++ gcc/target.def	(working copy)
@@ -2640,6 +2640,13 @@
  default_have_conditional_execution)
 
 DEFHOOK
+(expand_mult_copysign_xor,
+ "This target hook returns true if the target wants to use xor's\n
+to expand x * copysign (1., y) as x ^ (y & (1 << sign_bit_position)).\n"
+ bool, (void),
+ hook_bool_void_false)
+
+DEFHOOK
 (gen_ccmp_first,
  "This function prepares to emit a comparison insn for the first compare in a\n\
  sequence of conditional comparisions.  It returns an appropriate comparison\n\

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-06-24 23:53 ` Andrew Pinski
  2017-06-26  2:09   ` Andrew Pinski
@ 2017-06-26  7:26   ` Richard Biener
  1 sibling, 0 replies; 21+ messages in thread
From: Richard Biener @ 2017-06-26  7:26 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: Tamar Christina, GCC Patches, nd, law, ian

On Sat, 24 Jun 2017, Andrew Pinski wrote:

> On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> <Tamar.Christina@arm.com> wrote:
> > Hi All,
> >
> > this patch implements a optimization rewriting
> >
> > x * copysign (1.0, y) and
> > x * copysign (-1.0, y)
> 
> 
> This reminds me:
> copysign(-1.0, y) can be just optimized to:
> copysign(1.0, y)

I think I suggested that in my earlie review.

> I did that in my patch here:
> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01860.html
> 
> This should allow you to reduce the number of patterns needed to match here.
> Note I still think we could do this in expand without a new
> builtin/internal function.
> I might go and code that up soonish.
> 
> Thanks,
> Andrew
> 
> >
> > to:
> >
> > x ^ (y & (1 << sign_bit_position))
> >
> > This is done by creating a special builtin during matching and generate the
> > appropriate instructions during expand. This new builtin is called XORSIGN.
> >
> > The expansion of xorsign depends on if the backend has an appropriate optab
> > available. If this is not the case then we use a modified version of the existing
> > copysign which does not take the abs value of the first argument as a fall back.
> >
> > This patch is a revival of a previous patch
> > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> >
> > Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> > Regression done on aarch64-none-linux-gnu and no regressions.
> >
> > Ok for trunk?
> >
> > gcc/
> > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> >
> >         * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> >         (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> >         * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> >         (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> >         (copysigns @0 (negate @1)): Likewise.
> >         * builtins.c (expand_builtin_copysign): Promoted local to argument.
> >         (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
> >         CASE_FLT_FN (BUILT_IN_XORSIGN).
> >         (BUILT_IN_COPYSIGN): Updated function call.
> >         * optabs.h (expand_copysign): New bool.
> >         (expand_xorsign): New.
> >         * optabs.def (xorsign_optab): New.
> >         * optabs.c (expand_copysign): New parameter.
> >         * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> >         * fortran/mathbuiltins.def (XORSIGN): New.
> >
> > gcc/testsuite/
> > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> >
> >         * gcc.dg/tree-ssa/xorsign.c: New.
> >         * gcc.dg/xorsign_exec.c: New.
> >         * gcc.dg/vec-xorsign_exec.c: New.
> >         * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-06-26  2:09   ` Andrew Pinski
@ 2017-06-26  8:46     ` Tamar Christina
  0 siblings, 0 replies; 21+ messages in thread
From: Tamar Christina @ 2017-06-26  8:46 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: GCC Patches, nd, law, ian, rguenther

Hi Andrew,

Thanks! I'll put together the rest today or tomorrow.
Sorry for the slow response on this one.

Tamar
________________________________________
From: Andrew Pinski <pinskia@gmail.com>
Sent: Monday, June 26, 2017 3:09:54 AM
To: Tamar Christina
Cc: GCC Patches; nd; law@redhat.com; ian@airs.com; rguenther@suse.de
Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

On Sat, Jun 24, 2017 at 4:53 PM, Andrew Pinski <pinskia@gmail.com> wrote:
> On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> <Tamar.Christina@arm.com> wrote:
>> Hi All,
>>
>> this patch implements a optimization rewriting
>>
>> x * copysign (1.0, y) and
>> x * copysign (-1.0, y)
>
>
> This reminds me:
> copysign(-1.0, y) can be just optimized to:
> copysign(1.0, y)
>
> I did that in my patch here:
> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01860.html

I updated the patch to handle all constants and not just -1.0.

>
> This should allow you to reduce the number of patterns needed to match here.
> Note I still think we could do this in expand without a new
> builtin/internal function.
> I might go and code that up soonish.

Also something like attached (NOTE this is NOT a full patch and needs
the xorsign optabs part of your patch) should work for the expand side
rather than creating a new builtin.  There still needs to handling of
the vector based copysign.  But you should get the general idea.  I
would like to see more of these special expand patterns really.

NOTE you can remove the target hook part and just check if xorsign
optab is there.  I don't know if that is what we want to do if not
allow for generic expanding of this.

Thanks,
Andrew Pinski


>
> Thanks,
> Andrew
>
>>
>> to:
>>
>> x ^ (y & (1 << sign_bit_position))
>>
>> This is done by creating a special builtin during matching and generate the
>> appropriate instructions during expand. This new builtin is called XORSIGN.
>>
>> The expansion of xorsign depends on if the backend has an appropriate optab
>> available. If this is not the case then we use a modified version of the existing
>> copysign which does not take the abs value of the first argument as a fall back.
>>
>> This patch is a revival of a previous patch
>> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
>>
>> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
>> Regression done on aarch64-none-linux-gnu and no regressions.
>>
>> Ok for trunk?
>>
>> gcc/
>> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
>>
>>         * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
>>         (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
>>         * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
>>         (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
>>         (copysigns @0 (negate @1)): Likewise.
>>         * builtins.c (expand_builtin_copysign): Promoted local to argument.
>>         (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
>>         CASE_FLT_FN (BUILT_IN_XORSIGN).
>>         (BUILT_IN_COPYSIGN): Updated function call.
>>         * optabs.h (expand_copysign): New bool.
>>         (expand_xorsign): New.
>>         * optabs.def (xorsign_optab): New.
>>         * optabs.c (expand_copysign): New parameter.
>>         * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
>>         * fortran/mathbuiltins.def (XORSIGN): New.
>>
>> gcc/testsuite/
>> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
>>
>>         * gcc.dg/tree-ssa/xorsign.c: New.
>>         * gcc.dg/xorsign_exec.c: New.
>>         * gcc.dg/vec-xorsign_exec.c: New.
>>         * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-06-12  7:56 [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)] Tamar Christina
                   ` (3 preceding siblings ...)
  2017-06-24 23:53 ` Andrew Pinski
@ 2017-07-09 23:21 ` Andrew Pinski
  2017-07-10 15:47   ` Tamar Christina
  4 siblings, 1 reply; 21+ messages in thread
From: Andrew Pinski @ 2017-07-09 23:21 UTC (permalink / raw)
  To: Tamar Christina; +Cc: GCC Patches, nd, law, ian, rguenther

On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
<Tamar.Christina@arm.com> wrote:
> Hi All,
>
> this patch implements a optimization rewriting
>
> x * copysign (1.0, y) and
> x * copysign (-1.0, y)
>
> to:
>
> x ^ (y & (1 << sign_bit_position))
>
> This is done by creating a special builtin during matching and generate the
> appropriate instructions during expand. This new builtin is called XORSIGN.
>
> The expansion of xorsign depends on if the backend has an appropriate optab
> available. If this is not the case then we use a modified version of the existing
> copysign which does not take the abs value of the first argument as a fall back.
>
> This patch is a revival of a previous patch
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
>
> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> Regression done on aarch64-none-linux-gnu and no regressions.


Note this is also PR 19706.

Thanks,
Andrew

>
> Ok for trunk?
>
> gcc/
> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
>
>         * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
>         (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
>         * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
>         (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
>         (copysigns @0 (negate @1)): Likewise.
>         * builtins.c (expand_builtin_copysign): Promoted local to argument.
>         (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
>         CASE_FLT_FN (BUILT_IN_XORSIGN).
>         (BUILT_IN_COPYSIGN): Updated function call.
>         * optabs.h (expand_copysign): New bool.
>         (expand_xorsign): New.
>         * optabs.def (xorsign_optab): New.
>         * optabs.c (expand_copysign): New parameter.
>         * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
>         * fortran/mathbuiltins.def (XORSIGN): New.
>
> gcc/testsuite/
> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
>
>         * gcc.dg/tree-ssa/xorsign.c: New.
>         * gcc.dg/xorsign_exec.c: New.
>         * gcc.dg/vec-xorsign_exec.c: New.
>         * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-07-09 23:21 ` Andrew Pinski
@ 2017-07-10 15:47   ` Tamar Christina
  2017-07-18  8:05     ` Tamar Christina
  2017-07-18  8:38     ` Richard Biener
  0 siblings, 2 replies; 21+ messages in thread
From: Tamar Christina @ 2017-07-10 15:47 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: GCC Patches, nd, law, ian, rguenther

[-- Attachment #1: Type: text/plain, Size: 3256 bytes --]

Hi All,

I've re-spun the patch with the changes requested.


This is only done when not honoring signaling NaNs.
This transormation is done at expand time by using
a new optab "xorsign". If the optab is not available
then copysign is expanded as normal.

Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
Regression done on aarch64-none-linux-gnu and no regressions.

Ok for trunk?

gcc/
2017-07-10  Tamar Christina  <tamar.christina@arm.com>
	    Andrew Pinski <pinskia@gmail.com>

	PR middle-end/19706
	* expr.c (is_copysign_call_with_1): New.
	(maybe_expand_mult_copysign): Likewise.
	(expand_expr_real_2): Expand copysign.
	* optabs.def (xorsign_optab): New.

________________________________________
From: Andrew Pinski <pinskia@gmail.com>
Sent: Monday, July 10, 2017 12:21:29 AM
To: Tamar Christina
Cc: GCC Patches; nd; law@redhat.com; ian@airs.com; rguenther@suse.de
Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
<Tamar.Christina@arm.com> wrote:
> Hi All,
>
> this patch implements a optimization rewriting
>
> x * copysign (1.0, y) and
> x * copysign (-1.0, y)
>
> to:
>
> x ^ (y & (1 << sign_bit_position))
>
> This is done by creating a special builtin during matching and generate the
> appropriate instructions during expand. This new builtin is called XORSIGN.
>
> The expansion of xorsign depends on if the backend has an appropriate optab
> available. If this is not the case then we use a modified version of the existing
> copysign which does not take the abs value of the first argument as a fall back.
>
> This patch is a revival of a previous patch
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
>
> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> Regression done on aarch64-none-linux-gnu and no regressions.


Note this is also PR 19706.

Thanks,
Andrew

>
> Ok for trunk?
>
> gcc/
> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
>
>         * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
>         (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
>         * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
>         (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
>         (copysigns @0 (negate @1)): Likewise.
>         * builtins.c (expand_builtin_copysign): Promoted local to argument.
>         (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
>         CASE_FLT_FN (BUILT_IN_XORSIGN).
>         (BUILT_IN_COPYSIGN): Updated function call.
>         * optabs.h (expand_copysign): New bool.
>         (expand_xorsign): New.
>         * optabs.def (xorsign_optab): New.
>         * optabs.c (expand_copysign): New parameter.
>         * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
>         * fortran/mathbuiltins.def (XORSIGN): New.
>
> gcc/testsuite/
> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
>
>         * gcc.dg/tree-ssa/xorsign.c: New.
>         * gcc.dg/xorsign_exec.c: New.
>         * gcc.dg/vec-xorsign_exec.c: New.
>         * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: xorsign-1-spin3.patch --]
[-- Type: text/x-patch; name="xorsign-1-spin3.patch", Size: 3312 bytes --]

diff --git a/gcc/expr.c b/gcc/expr.c
index 5febf07929d0add0ad0ae1356baef008524f0c7c..0193231bc857bdea3e02b2845e62883e1b5c291b 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8182,6 +8182,86 @@ expand_cond_expr_using_cmove (tree treeop0 ATTRIBUTE_UNUSED,
   return NULL_RTX;
 }
 
+/* Check to see if the CALL statement is an invocation of copysign
+   with 1. being the first argument.  */
+static bool
+is_copysign_call_with_1 (gimple *call)
+{
+  if (!is_gimple_call (call))
+    return false;
+
+  enum combined_fn code = gimple_call_combined_fn (call);
+
+  if (code == CFN_LAST)
+    return false;
+
+  gcall *c = as_a<gcall*> (call);
+
+  if (builtin_fn_p (code))
+    {
+      switch (as_builtin_fn (code))
+	{
+	CASE_FLT_FN (BUILT_IN_COPYSIGN):
+	CASE_FLT_FN_FLOATN_NX (BUILT_IN_COPYSIGN):
+	  return real_onep (gimple_call_arg (c, 0));
+	default:
+	  return false;
+	}
+    }
+
+  if (internal_fn_p (code))
+    {
+      switch (as_internal_fn (code))
+	{
+	case IFN_COPYSIGN:
+	  return real_onep (gimple_call_arg (c, 0));
+	default:
+	  return false;
+	}
+    }
+
+   return false;
+}
+
+/* Try to expand the pattern x * copysign (1, y) into xorsign (x, y).
+   This only happens when the the xorsign optab is defined, if the
+   pattern is not a xorsign pattern or if expension failes NULL_RTX is
+   returned, otherwise the RTX from the optab expansion is returned.  */
+static rtx
+maybe_expand_mult_copysign (tree treeop0, tree treeop1, rtx target)
+{
+  tree type = TREE_TYPE (treeop0);
+  rtx op0, op1;
+
+  if (HONOR_SNANS (type))
+    return NULL_RTX;
+
+  if (TREE_CODE (treeop0) == SSA_NAME && TREE_CODE (treeop1) == SSA_NAME)
+    {
+      gimple *call0 = SSA_NAME_DEF_STMT (treeop0);
+      if (!is_copysign_call_with_1 (call0))
+	{
+	  /* IPA sometimes inlines and then extracts the function again,
+	     resulting in an incorrect order, so check both ways.  */
+	  call0 = SSA_NAME_DEF_STMT (treeop1);
+	  if (!is_copysign_call_with_1 (call0))
+	    return NULL_RTX;
+
+	  treeop1 = treeop0;
+	}
+
+	gcall *c = as_a<gcall*> (call0);
+	treeop0 = gimple_call_arg (c, 1);
+
+	expand_operands (treeop1, treeop0, NULL_RTX, &op0, &op1, EXPAND_NORMAL);
+
+	return expand_binop (TYPE_MODE (type), xorsign_optab, op0, op1,
+			     target, 0, OPTAB_DIRECT);
+    }
+
+  return NULL_RTX;
+}
+
 rtx
 expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 		    enum expand_modifier modifier)
@@ -8791,6 +8871,10 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       if (modifier == EXPAND_STACK_PARM)
 	target = 0;
 
+      temp = maybe_expand_mult_copysign (treeop0, treeop1, target);
+      if (temp)
+	return temp;
+
       expand_operands (treeop0, treeop1, subtarget, &op0, &op1, EXPAND_NORMAL);
       return REDUCE_BIT_FIELD (expand_mult (mode, op0, op1, target, unsignedp));
 
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 504c655be0210a72e7f0f34cde0131e21ccf8089..507af9ea81c42eba98780ef326c22847fdd2c043 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -253,6 +253,7 @@ OPTAB_D (asin_optab, "asin$a2")
 OPTAB_D (atan2_optab, "atan2$a3")
 OPTAB_D (atan_optab, "atan$a2")
 OPTAB_D (copysign_optab, "copysign$F$a3")
+OPTAB_D (xorsign_optab, "xorsign$F$a3")
 OPTAB_D (cos_optab, "cos$a2")
 OPTAB_D (exp10_optab, "exp10$a2")
 OPTAB_D (exp2_optab, "exp2$a2")

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-07-10 15:47   ` Tamar Christina
@ 2017-07-18  8:05     ` Tamar Christina
  2017-07-18  8:38     ` Richard Biener
  1 sibling, 0 replies; 21+ messages in thread
From: Tamar Christina @ 2017-07-18  8:05 UTC (permalink / raw)
  To: GCC Patches
  Cc: nd, law, ian, rguenther, Andrew Pinski, Michael Meissner, Joseph Myers

Ping.
________________________________________
From: Tamar Christina
Sent: Monday, July 10, 2017 4:47 PM
To: Andrew Pinski
Cc: GCC Patches; nd; law@redhat.com; ian@airs.com; rguenther@suse.de
Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

Hi All,

I've re-spun the patch with the changes requested.


This is only done when not honoring signaling NaNs.
This transormation is done at expand time by using
a new optab "xorsign". If the optab is not available
then copysign is expanded as normal.

Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
Regression done on aarch64-none-linux-gnu and no regressions.

Ok for trunk?

gcc/
2017-07-10  Tamar Christina  <tamar.christina@arm.com>
            Andrew Pinski <pinskia@gmail.com>

        PR middle-end/19706
        * expr.c (is_copysign_call_with_1): New.
        (maybe_expand_mult_copysign): Likewise.
        (expand_expr_real_2): Expand copysign.
        * optabs.def (xorsign_optab): New.

________________________________________
From: Andrew Pinski <pinskia@gmail.com>
Sent: Monday, July 10, 2017 12:21:29 AM
To: Tamar Christina
Cc: GCC Patches; nd; law@redhat.com; ian@airs.com; rguenther@suse.de
Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
<Tamar.Christina@arm.com> wrote:
> Hi All,
>
> this patch implements a optimization rewriting
>
> x * copysign (1.0, y) and
> x * copysign (-1.0, y)
>
> to:
>
> x ^ (y & (1 << sign_bit_position))
>
> This is done by creating a special builtin during matching and generate the
> appropriate instructions during expand. This new builtin is called XORSIGN.
>
> The expansion of xorsign depends on if the backend has an appropriate optab
> available. If this is not the case then we use a modified version of the existing
> copysign which does not take the abs value of the first argument as a fall back.
>
> This patch is a revival of a previous patch
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
>
> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> Regression done on aarch64-none-linux-gnu and no regressions.


Note this is also PR 19706.

Thanks,
Andrew

>
> Ok for trunk?
>
> gcc/
> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
>
>         * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
>         (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
>         * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
>         (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
>         (copysigns @0 (negate @1)): Likewise.
>         * builtins.c (expand_builtin_copysign): Promoted local to argument.
>         (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
>         CASE_FLT_FN (BUILT_IN_XORSIGN).
>         (BUILT_IN_COPYSIGN): Updated function call.
>         * optabs.h (expand_copysign): New bool.
>         (expand_xorsign): New.
>         * optabs.def (xorsign_optab): New.
>         * optabs.c (expand_copysign): New parameter.
>         * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
>         * fortran/mathbuiltins.def (XORSIGN): New.
>
> gcc/testsuite/
> 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
>
>         * gcc.dg/tree-ssa/xorsign.c: New.
>         * gcc.dg/xorsign_exec.c: New.
>         * gcc.dg/vec-xorsign_exec.c: New.
>         * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-07-10 15:47   ` Tamar Christina
  2017-07-18  8:05     ` Tamar Christina
@ 2017-07-18  8:38     ` Richard Biener
  2017-07-18  9:22       ` Tamar Christina
  1 sibling, 1 reply; 21+ messages in thread
From: Richard Biener @ 2017-07-18  8:38 UTC (permalink / raw)
  To: Tamar Christina; +Cc: Andrew Pinski, GCC Patches, nd, law, ian

On Mon, 10 Jul 2017, Tamar Christina wrote:

> Hi All,
> 
> I've re-spun the patch with the changes requested.
> 
> 
> This is only done when not honoring signaling NaNs.
> This transormation is done at expand time by using
> a new optab "xorsign". If the optab is not available
> then copysign is expanded as normal.
> 
> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> Regression done on aarch64-none-linux-gnu and no regressions.
> 
> Ok for trunk?

+static rtx
+maybe_expand_mult_copysign (tree treeop0, tree treeop1, rtx target)
+{
+  tree type = TREE_TYPE (treeop0);
+  rtx op0, op1;
+
+  if (HONOR_SNANS (type))
+    return NULL_RTX;
+
+  if (TREE_CODE (treeop0) == SSA_NAME && TREE_CODE (treeop1) == SSA_NAME)
+    {
+      gimple *call0 = SSA_NAME_DEF_STMT (treeop0);

you can't lookup arbitrary def stmts during RTL expansion but you
have to go through get_gimple_for_ssa_name which may return NULL
if SSA name coalescing makes doing so unsafe.

Why's this now done during RTL expansion rather than during late
GIMPLE, using match.pd and an internal function for xorsign?

Thanks,
Richard.

> 
> gcc/
> 2017-07-10  Tamar Christina  <tamar.christina@arm.com>
> 	    Andrew Pinski <pinskia@gmail.com>
> 
> 	PR middle-end/19706
> 	* expr.c (is_copysign_call_with_1): New.
> 	(maybe_expand_mult_copysign): Likewise.
> 	(expand_expr_real_2): Expand copysign.
> 	* optabs.def (xorsign_optab): New.
> 
> ________________________________________
> From: Andrew Pinski <pinskia@gmail.com>
> Sent: Monday, July 10, 2017 12:21:29 AM
> To: Tamar Christina
> Cc: GCC Patches; nd; law@redhat.com; ian@airs.com; rguenther@suse.de
> Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
> 
> On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> <Tamar.Christina@arm.com> wrote:
> > Hi All,
> >
> > this patch implements a optimization rewriting
> >
> > x * copysign (1.0, y) and
> > x * copysign (-1.0, y)
> >
> > to:
> >
> > x ^ (y & (1 << sign_bit_position))
> >
> > This is done by creating a special builtin during matching and generate the
> > appropriate instructions during expand. This new builtin is called XORSIGN.
> >
> > The expansion of xorsign depends on if the backend has an appropriate optab
> > available. If this is not the case then we use a modified version of the existing
> > copysign which does not take the abs value of the first argument as a fall back.
> >
> > This patch is a revival of a previous patch
> > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> >
> > Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
> > Regression done on aarch64-none-linux-gnu and no regressions.
> 
> 
> Note this is also PR 19706.
> 
> Thanks,
> Andrew
> 
> >
> > Ok for trunk?
> >
> > gcc/
> > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> >
> >         * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> >         (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> >         * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> >         (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> >         (copysigns @0 (negate @1)): Likewise.
> >         * builtins.c (expand_builtin_copysign): Promoted local to argument.
> >         (expand_builtin): Added CASE_FLT_FN_FLOATN_NX (BUILT_IN_XORSIGN) and
> >         CASE_FLT_FN (BUILT_IN_XORSIGN).
> >         (BUILT_IN_COPYSIGN): Updated function call.
> >         * optabs.h (expand_copysign): New bool.
> >         (expand_xorsign): New.
> >         * optabs.def (xorsign_optab): New.
> >         * optabs.c (expand_copysign): New parameter.
> >         * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> >         * fortran/mathbuiltins.def (XORSIGN): New.
> >
> > gcc/testsuite/
> > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> >
> >         * gcc.dg/tree-ssa/xorsign.c: New.
> >         * gcc.dg/xorsign_exec.c: New.
> >         * gcc.dg/vec-xorsign_exec.c: New.
> >         * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-07-18  8:38     ` Richard Biener
@ 2017-07-18  9:22       ` Tamar Christina
  2017-07-18 10:19         ` Richard Biener
  0 siblings, 1 reply; 21+ messages in thread
From: Tamar Christina @ 2017-07-18  9:22 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andrew Pinski, GCC Patches, nd, law, ian

> 
> Why's this now done during RTL expansion rather than during late GIMPLE,
> using match.pd and an internal function for xorsign?
> 

Mainly because of Andrew's email on the 10th which stated:

> But you should get the general idea.  I would like to see more of these special expand patterns really.

And there were no objections so I figured this was also an acceptable solution.

> Thanks,
> Richard.
> 
> >
> > gcc/
> > 2017-07-10  Tamar Christina  <tamar.christina@arm.com>
> > 	    Andrew Pinski <pinskia@gmail.com>
> >
> > 	PR middle-end/19706
> > 	* expr.c (is_copysign_call_with_1): New.
> > 	(maybe_expand_mult_copysign): Likewise.
> > 	(expand_expr_real_2): Expand copysign.
> > 	* optabs.def (xorsign_optab): New.
> >
> > ________________________________________
> > From: Andrew Pinski <pinskia@gmail.com>
> > Sent: Monday, July 10, 2017 12:21:29 AM
> > To: Tamar Christina
> > Cc: GCC Patches; nd; law@redhat.com; ian@airs.com; rguenther@suse.de
> > Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y)
> > [Patch (1/2)]
> >
> > On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> > <Tamar.Christina@arm.com> wrote:
> > > Hi All,
> > >
> > > this patch implements a optimization rewriting
> > >
> > > x * copysign (1.0, y) and
> > > x * copysign (-1.0, y)
> > >
> > > to:
> > >
> > > x ^ (y & (1 << sign_bit_position))
> > >
> > > This is done by creating a special builtin during matching and
> > > generate the appropriate instructions during expand. This new builtin is
> called XORSIGN.
> > >
> > > The expansion of xorsign depends on if the backend has an
> > > appropriate optab available. If this is not the case then we use a
> > > modified version of the existing copysign which does not take the abs
> value of the first argument as a fall back.
> > >
> > > This patch is a revival of a previous patch
> > > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> > >
> > > Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no
> issues.
> > > Regression done on aarch64-none-linux-gnu and no regressions.
> >
> >
> > Note this is also PR 19706.
> >
> > Thanks,
> > Andrew
> >
> > >
> > > Ok for trunk?
> > >
> > > gcc/
> > > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> > >
> > >         * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> > >         (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> > >         * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> > >         (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> > >         (copysigns @0 (negate @1)): Likewise.
> > >         * builtins.c (expand_builtin_copysign): Promoted local to argument.
> > >         (expand_builtin): Added CASE_FLT_FN_FLOATN_NX
> (BUILT_IN_XORSIGN) and
> > >         CASE_FLT_FN (BUILT_IN_XORSIGN).
> > >         (BUILT_IN_COPYSIGN): Updated function call.
> > >         * optabs.h (expand_copysign): New bool.
> > >         (expand_xorsign): New.
> > >         * optabs.def (xorsign_optab): New.
> > >         * optabs.c (expand_copysign): New parameter.
> > >         * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> > >         * fortran/mathbuiltins.def (XORSIGN): New.
> > >
> > > gcc/testsuite/
> > > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> > >
> > >         * gcc.dg/tree-ssa/xorsign.c: New.
> > >         * gcc.dg/xorsign_exec.c: New.
> > >         * gcc.dg/vec-xorsign_exec.c: New.
> > >         * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-07-18  9:22       ` Tamar Christina
@ 2017-07-18 10:19         ` Richard Biener
  2017-07-18 11:00           ` Tamar Christina
  0 siblings, 1 reply; 21+ messages in thread
From: Richard Biener @ 2017-07-18 10:19 UTC (permalink / raw)
  To: Tamar Christina; +Cc: Andrew Pinski, GCC Patches, nd, law, ian

On Tue, 18 Jul 2017, Tamar Christina wrote:

> > 
> > Why's this now done during RTL expansion rather than during late GIMPLE,
> > using match.pd and an internal function for xorsign?
> > 
> 
> Mainly because of Andrew's email on the 10th which stated:
> 
> > But you should get the general idea.  I would like to see more of 
> > these special expand patterns really.
> 
> And there were no objections so I figured this was also an acceptable 
> solution.

I see.  But the implementation challenge is that this interacts badly
with SSA coalescing done before this and thus should really happen
on GIMPLE before that.

And yes, I also like to see more of this, it's basically doing some
instruction selection on (late) GIMPLE.  Ideally we'd be able to
generate an expand.pd match.pd variant from the machine
description (named) define_insns, creating IFNs that we know how
to expand.

Think of a combine pass combining GIMPLE stmts to (recognized)
RTL insn (sequences).  Until RTL expansion the RTL insn (sequence)
would be represented by an internal function call (or alternatively
for multi-output cases an GIMPLE ASM with enumerated asm text).

Richard.

> > Thanks,
> > Richard.
> > 
> > >
> > > gcc/
> > > 2017-07-10  Tamar Christina  <tamar.christina@arm.com>
> > > 	    Andrew Pinski <pinskia@gmail.com>
> > >
> > > 	PR middle-end/19706
> > > 	* expr.c (is_copysign_call_with_1): New.
> > > 	(maybe_expand_mult_copysign): Likewise.
> > > 	(expand_expr_real_2): Expand copysign.
> > > 	* optabs.def (xorsign_optab): New.
> > >
> > > ________________________________________
> > > From: Andrew Pinski <pinskia@gmail.com>
> > > Sent: Monday, July 10, 2017 12:21:29 AM
> > > To: Tamar Christina
> > > Cc: GCC Patches; nd; law@redhat.com; ian@airs.com; rguenther@suse.de
> > > Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y)
> > > [Patch (1/2)]
> > >
> > > On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> > > <Tamar.Christina@arm.com> wrote:
> > > > Hi All,
> > > >
> > > > this patch implements a optimization rewriting
> > > >
> > > > x * copysign (1.0, y) and
> > > > x * copysign (-1.0, y)
> > > >
> > > > to:
> > > >
> > > > x ^ (y & (1 << sign_bit_position))
> > > >
> > > > This is done by creating a special builtin during matching and
> > > > generate the appropriate instructions during expand. This new builtin is
> > called XORSIGN.
> > > >
> > > > The expansion of xorsign depends on if the backend has an
> > > > appropriate optab available. If this is not the case then we use a
> > > > modified version of the existing copysign which does not take the abs
> > value of the first argument as a fall back.
> > > >
> > > > This patch is a revival of a previous patch
> > > > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> > > >
> > > > Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no
> > issues.
> > > > Regression done on aarch64-none-linux-gnu and no regressions.
> > >
> > >
> > > Note this is also PR 19706.
> > >
> > > Thanks,
> > > Andrew
> > >
> > > >
> > > > Ok for trunk?
> > > >
> > > > gcc/
> > > > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> > > >
> > > >         * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> > > >         (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX): Likewise.
> > > >         * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
> > > >         (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> > > >         (copysigns @0 (negate @1)): Likewise.
> > > >         * builtins.c (expand_builtin_copysign): Promoted local to argument.
> > > >         (expand_builtin): Added CASE_FLT_FN_FLOATN_NX
> > (BUILT_IN_XORSIGN) and
> > > >         CASE_FLT_FN (BUILT_IN_XORSIGN).
> > > >         (BUILT_IN_COPYSIGN): Updated function call.
> > > >         * optabs.h (expand_copysign): New bool.
> > > >         (expand_xorsign): New.
> > > >         * optabs.def (xorsign_optab): New.
> > > >         * optabs.c (expand_copysign): New parameter.
> > > >         * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> > > >         * fortran/mathbuiltins.def (XORSIGN): New.
> > > >
> > > > gcc/testsuite/
> > > > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> > > >
> > > >         * gcc.dg/tree-ssa/xorsign.c: New.
> > > >         * gcc.dg/xorsign_exec.c: New.
> > > >         * gcc.dg/vec-xorsign_exec.c: New.
> > > >         * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
> > >
> > 
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> > HRB 21284 (AG Nuernberg)
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-07-18 10:19         ` Richard Biener
@ 2017-07-18 11:00           ` Tamar Christina
  2017-07-18 11:19             ` Richard Biener
  0 siblings, 1 reply; 21+ messages in thread
From: Tamar Christina @ 2017-07-18 11:00 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andrew Pinski, GCC Patches, nd, law, ian


> I see.  But the implementation challenge is that this interacts badly with SSA
> coalescing done before this and thus should really happen on GIMPLE before
> that.
> 
> And yes, I also like to see more of this, it's basically doing some instruction
> selection on (late) GIMPLE.  Ideally we'd be able to generate an expand.pd
> match.pd variant from the machine description (named) define_insns,
> creating IFNs that we know how to expand.

Fair enough, Just to check I understood correctly.

It should be a match.pd rule that uses a match predicate, so expand in gimple-match.c.
but don't do this if the target doesn't have the xorsign optab and don't do it if honouring SNAN.

I'll make the changes then.
Thanks,
Tamar

> 
> Think of a combine pass combining GIMPLE stmts to (recognized) RTL insn
> (sequences).  Until RTL expansion the RTL insn (sequence) would be
> represented by an internal function call (or alternatively for multi-output
> cases an GIMPLE ASM with enumerated asm text).
> 
> Richard.
> 
> > > Thanks,
> > > Richard.
> > >
> > > >
> > > > gcc/
> > > > 2017-07-10  Tamar Christina  <tamar.christina@arm.com>
> > > > 	    Andrew Pinski <pinskia@gmail.com>
> > > >
> > > > 	PR middle-end/19706
> > > > 	* expr.c (is_copysign_call_with_1): New.
> > > > 	(maybe_expand_mult_copysign): Likewise.
> > > > 	(expand_expr_real_2): Expand copysign.
> > > > 	* optabs.def (xorsign_optab): New.
> > > >
> > > > ________________________________________
> > > > From: Andrew Pinski <pinskia@gmail.com>
> > > > Sent: Monday, July 10, 2017 12:21:29 AM
> > > > To: Tamar Christina
> > > > Cc: GCC Patches; nd; law@redhat.com; ian@airs.com;
> > > > rguenther@suse.de
> > > > Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y)
> > > > [Patch (1/2)]
> > > >
> > > > On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> > > > <Tamar.Christina@arm.com> wrote:
> > > > > Hi All,
> > > > >
> > > > > this patch implements a optimization rewriting
> > > > >
> > > > > x * copysign (1.0, y) and
> > > > > x * copysign (-1.0, y)
> > > > >
> > > > > to:
> > > > >
> > > > > x ^ (y & (1 << sign_bit_position))
> > > > >
> > > > > This is done by creating a special builtin during matching and
> > > > > generate the appropriate instructions during expand. This new
> > > > > builtin is
> > > called XORSIGN.
> > > > >
> > > > > The expansion of xorsign depends on if the backend has an
> > > > > appropriate optab available. If this is not the case then we use
> > > > > a modified version of the existing copysign which does not take
> > > > > the abs
> > > value of the first argument as a fall back.
> > > > >
> > > > > This patch is a revival of a previous patch
> > > > > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> > > > >
> > > > > Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no
> > > issues.
> > > > > Regression done on aarch64-none-linux-gnu and no regressions.
> > > >
> > > >
> > > > Note this is also PR 19706.
> > > >
> > > > Thanks,
> > > > Andrew
> > > >
> > > > >
> > > > > Ok for trunk?
> > > > >
> > > > > gcc/
> > > > > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> > > > >
> > > > >         * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> > > > >         (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX):
> Likewise.
> > > > >         * match.pd (mult (COPYSIGN:s real_onep @0) @1): New
> simplifier.
> > > > >         (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> > > > >         (copysigns @0 (negate @1)): Likewise.
> > > > >         * builtins.c (expand_builtin_copysign): Promoted local to
> argument.
> > > > >         (expand_builtin): Added CASE_FLT_FN_FLOATN_NX
> > > (BUILT_IN_XORSIGN) and
> > > > >         CASE_FLT_FN (BUILT_IN_XORSIGN).
> > > > >         (BUILT_IN_COPYSIGN): Updated function call.
> > > > >         * optabs.h (expand_copysign): New bool.
> > > > >         (expand_xorsign): New.
> > > > >         * optabs.def (xorsign_optab): New.
> > > > >         * optabs.c (expand_copysign): New parameter.
> > > > >         * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> > > > >         * fortran/mathbuiltins.def (XORSIGN): New.
> > > > >
> > > > > gcc/testsuite/
> > > > > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> > > > >
> > > > >         * gcc.dg/tree-ssa/xorsign.c: New.
> > > > >         * gcc.dg/xorsign_exec.c: New.
> > > > >         * gcc.dg/vec-xorsign_exec.c: New.
> > > > >         * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham
> > > Norton, HRB 21284 (AG Nuernberg)
> >
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-07-18 11:00           ` Tamar Christina
@ 2017-07-18 11:19             ` Richard Biener
  2017-07-18 11:24               ` Tamar Christina
  0 siblings, 1 reply; 21+ messages in thread
From: Richard Biener @ 2017-07-18 11:19 UTC (permalink / raw)
  To: Tamar Christina; +Cc: Andrew Pinski, GCC Patches, nd, law, ian

On Tue, 18 Jul 2017, Tamar Christina wrote:

> 
> > I see.  But the implementation challenge is that this interacts badly with SSA
> > coalescing done before this and thus should really happen on GIMPLE before
> > that.
> > 
> > And yes, I also like to see more of this, it's basically doing some instruction
> > selection on (late) GIMPLE.  Ideally we'd be able to generate an expand.pd
> > match.pd variant from the machine description (named) define_insns,
> > creating IFNs that we know how to expand.
> 
> Fair enough, Just to check I understood correctly.
> 
> It should be a match.pd rule that uses a match predicate, so expand in 
> gimple-match.c. but don't do this if the target doesn't have the xorsign 
> optab and don't do it if honouring SNAN.

Note that this will trigger too early (IMHO), so unless you feel
like inventing new infrastructure I'd put manual pattern matching in
tree-ssa-math-opts.c pass_optimize_widening_mul where we currently
do this kind of "late GIMPLE instruction selection".

Richard.

> I'll make the changes then.
> Thanks,
> Tamar
> 
> > 
> > Think of a combine pass combining GIMPLE stmts to (recognized) RTL insn
> > (sequences).  Until RTL expansion the RTL insn (sequence) would be
> > represented by an internal function call (or alternatively for multi-output
> > cases an GIMPLE ASM with enumerated asm text).
> > 
> > Richard.
> > 
> > > > Thanks,
> > > > Richard.
> > > >
> > > > >
> > > > > gcc/
> > > > > 2017-07-10  Tamar Christina  <tamar.christina@arm.com>
> > > > > 	    Andrew Pinski <pinskia@gmail.com>
> > > > >
> > > > > 	PR middle-end/19706
> > > > > 	* expr.c (is_copysign_call_with_1): New.
> > > > > 	(maybe_expand_mult_copysign): Likewise.
> > > > > 	(expand_expr_real_2): Expand copysign.
> > > > > 	* optabs.def (xorsign_optab): New.
> > > > >
> > > > > ________________________________________
> > > > > From: Andrew Pinski <pinskia@gmail.com>
> > > > > Sent: Monday, July 10, 2017 12:21:29 AM
> > > > > To: Tamar Christina
> > > > > Cc: GCC Patches; nd; law@redhat.com; ian@airs.com;
> > > > > rguenther@suse.de
> > > > > Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y)
> > > > > [Patch (1/2)]
> > > > >
> > > > > On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> > > > > <Tamar.Christina@arm.com> wrote:
> > > > > > Hi All,
> > > > > >
> > > > > > this patch implements a optimization rewriting
> > > > > >
> > > > > > x * copysign (1.0, y) and
> > > > > > x * copysign (-1.0, y)
> > > > > >
> > > > > > to:
> > > > > >
> > > > > > x ^ (y & (1 << sign_bit_position))
> > > > > >
> > > > > > This is done by creating a special builtin during matching and
> > > > > > generate the appropriate instructions during expand. This new
> > > > > > builtin is
> > > > called XORSIGN.
> > > > > >
> > > > > > The expansion of xorsign depends on if the backend has an
> > > > > > appropriate optab available. If this is not the case then we use
> > > > > > a modified version of the existing copysign which does not take
> > > > > > the abs
> > > > value of the first argument as a fall back.
> > > > > >
> > > > > > This patch is a revival of a previous patch
> > > > > > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> > > > > >
> > > > > > Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no
> > > > issues.
> > > > > > Regression done on aarch64-none-linux-gnu and no regressions.
> > > > >
> > > > >
> > > > > Note this is also PR 19706.
> > > > >
> > > > > Thanks,
> > > > > Andrew
> > > > >
> > > > > >
> > > > > > Ok for trunk?
> > > > > >
> > > > > > gcc/
> > > > > > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> > > > > >
> > > > > >         * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF): New.
> > > > > >         (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX):
> > Likewise.
> > > > > >         * match.pd (mult (COPYSIGN:s real_onep @0) @1): New
> > simplifier.
> > > > > >         (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> > > > > >         (copysigns @0 (negate @1)): Likewise.
> > > > > >         * builtins.c (expand_builtin_copysign): Promoted local to
> > argument.
> > > > > >         (expand_builtin): Added CASE_FLT_FN_FLOATN_NX
> > > > (BUILT_IN_XORSIGN) and
> > > > > >         CASE_FLT_FN (BUILT_IN_XORSIGN).
> > > > > >         (BUILT_IN_COPYSIGN): Updated function call.
> > > > > >         * optabs.h (expand_copysign): New bool.
> > > > > >         (expand_xorsign): New.
> > > > > >         * optabs.def (xorsign_optab): New.
> > > > > >         * optabs.c (expand_copysign): New parameter.
> > > > > >         * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> > > > > >         * fortran/mathbuiltins.def (XORSIGN): New.
> > > > > >
> > > > > > gcc/testsuite/
> > > > > > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> > > > > >
> > > > > >         * gcc.dg/tree-ssa/xorsign.c: New.
> > > > > >         * gcc.dg/xorsign_exec.c: New.
> > > > > >         * gcc.dg/vec-xorsign_exec.c: New.
> > > > > >         * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguenther@suse.de>
> > > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham
> > > > Norton, HRB 21284 (AG Nuernberg)
> > >
> > >
> > 
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> > HRB 21284 (AG Nuernberg)
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]
  2017-07-18 11:19             ` Richard Biener
@ 2017-07-18 11:24               ` Tamar Christina
  0 siblings, 0 replies; 21+ messages in thread
From: Tamar Christina @ 2017-07-18 11:24 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andrew Pinski, GCC Patches, nd, law, ian

> > It should be a match.pd rule that uses a match predicate, so expand in
> > gimple-match.c. but don't do this if the target doesn't have the
> > xorsign optab and don't do it if honouring SNAN.
> 
> Note that this will trigger too early (IMHO), so unless you feel like inventing
> new infrastructure I'd put manual pattern matching in tree-ssa-math-opts.c
> pass_optimize_widening_mul where we currently do this kind of "late
> GIMPLE instruction selection".
> 

Alright, I'll do that then, thanks!

> Richard.
> 
> > I'll make the changes then.
> > Thanks,
> > Tamar
> >
> > >
> > > Think of a combine pass combining GIMPLE stmts to (recognized) RTL
> > > insn (sequences).  Until RTL expansion the RTL insn (sequence) would
> > > be represented by an internal function call (or alternatively for
> > > multi-output cases an GIMPLE ASM with enumerated asm text).
> > >
> > > Richard.
> > >
> > > > > Thanks,
> > > > > Richard.
> > > > >
> > > > > >
> > > > > > gcc/
> > > > > > 2017-07-10  Tamar Christina  <tamar.christina@arm.com>
> > > > > > 	    Andrew Pinski <pinskia@gmail.com>
> > > > > >
> > > > > > 	PR middle-end/19706
> > > > > > 	* expr.c (is_copysign_call_with_1): New.
> > > > > > 	(maybe_expand_mult_copysign): Likewise.
> > > > > > 	(expand_expr_real_2): Expand copysign.
> > > > > > 	* optabs.def (xorsign_optab): New.
> > > > > >
> > > > > > ________________________________________
> > > > > > From: Andrew Pinski <pinskia@gmail.com>
> > > > > > Sent: Monday, July 10, 2017 12:21:29 AM
> > > > > > To: Tamar Christina
> > > > > > Cc: GCC Patches; nd; law@redhat.com; ian@airs.com;
> > > > > > rguenther@suse.de
> > > > > > Subject: Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0,
> > > > > > y) [Patch (1/2)]
> > > > > >
> > > > > > On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> > > > > > <Tamar.Christina@arm.com> wrote:
> > > > > > > Hi All,
> > > > > > >
> > > > > > > this patch implements a optimization rewriting
> > > > > > >
> > > > > > > x * copysign (1.0, y) and
> > > > > > > x * copysign (-1.0, y)
> > > > > > >
> > > > > > > to:
> > > > > > >
> > > > > > > x ^ (y & (1 << sign_bit_position))
> > > > > > >
> > > > > > > This is done by creating a special builtin during matching
> > > > > > > and generate the appropriate instructions during expand.
> > > > > > > This new builtin is
> > > > > called XORSIGN.
> > > > > > >
> > > > > > > The expansion of xorsign depends on if the backend has an
> > > > > > > appropriate optab available. If this is not the case then we
> > > > > > > use a modified version of the existing copysign which does
> > > > > > > not take the abs
> > > > > value of the first argument as a fall back.
> > > > > > >
> > > > > > > This patch is a revival of a previous patch
> > > > > > > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00069.html
> > > > > > >
> > > > > > > Bootstrapped on both aarch64-none-linux-gnu and x86_64 with
> > > > > > > no
> > > > > issues.
> > > > > > > Regression done on aarch64-none-linux-gnu and no regressions.
> > > > > >
> > > > > >
> > > > > > Note this is also PR 19706.
> > > > > >
> > > > > > Thanks,
> > > > > > Andrew
> > > > > >
> > > > > > >
> > > > > > > Ok for trunk?
> > > > > > >
> > > > > > > gcc/
> > > > > > > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> > > > > > >
> > > > > > >         * builtins.def (BUILT_IN_XORSIGN, BUILT_IN_XORSIGNF):
> New.
> > > > > > >         (BUILT_IN_XORSIGNL, BUILT_IN_XORSIGN_FLOAT_NX):
> > > Likewise.
> > > > > > >         * match.pd (mult (COPYSIGN:s real_onep @0) @1): New
> > > simplifier.
> > > > > > >         (mult (COPYSIGN:s real_mus_onep @0) @1): Likewise.
> > > > > > >         (copysigns @0 (negate @1)): Likewise.
> > > > > > >         * builtins.c (expand_builtin_copysign): Promoted
> > > > > > > local to
> > > argument.
> > > > > > >         (expand_builtin): Added CASE_FLT_FN_FLOATN_NX
> > > > > (BUILT_IN_XORSIGN) and
> > > > > > >         CASE_FLT_FN (BUILT_IN_XORSIGN).
> > > > > > >         (BUILT_IN_COPYSIGN): Updated function call.
> > > > > > >         * optabs.h (expand_copysign): New bool.
> > > > > > >         (expand_xorsign): New.
> > > > > > >         * optabs.def (xorsign_optab): New.
> > > > > > >         * optabs.c (expand_copysign): New parameter.
> > > > > > >         * fortran/f95-lang.c (xorsignl, xorsign, xorsignf): New.
> > > > > > >         * fortran/mathbuiltins.def (XORSIGN): New.
> > > > > > >
> > > > > > > gcc/testsuite/
> > > > > > > 2017-06-07  Tamar Christina  <tamar.christina@arm.com>
> > > > > > >
> > > > > > >         * gcc.dg/tree-ssa/xorsign.c: New.
> > > > > > >         * gcc.dg/xorsign_exec.c: New.
> > > > > > >         * gcc.dg/vec-xorsign_exec.c: New.
> > > > > > >         * gcc.dg/tree-ssa/reassoc-39.c (f2, f3): Updated constant to 2.
> > > > > >
> > > > >
> > > > > --
> > > > > Richard Biener <rguenther@suse.de> SUSE LINUX GmbH, GF: Felix
> > > > > Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG
> > > > > Nuernberg)
> > > >
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham
> > > Norton, HRB 21284 (AG Nuernberg)
> >
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2017-07-18 11:24 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-12  7:56 [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)] Tamar Christina
2017-06-12  9:10 ` Richard Biener
2017-06-12 16:27   ` Richard Sandiford
2017-06-13 10:22     ` Tamar Christina
2017-06-13 10:17   ` Tamar Christina
2017-06-12 16:52 ` Joseph Myers
2017-06-12 19:50 ` Michael Meissner
2017-06-13 10:14   ` Tamar Christina
2017-06-24 23:53 ` Andrew Pinski
2017-06-26  2:09   ` Andrew Pinski
2017-06-26  8:46     ` Tamar Christina
2017-06-26  7:26   ` Richard Biener
2017-07-09 23:21 ` Andrew Pinski
2017-07-10 15:47   ` Tamar Christina
2017-07-18  8:05     ` Tamar Christina
2017-07-18  8:38     ` Richard Biener
2017-07-18  9:22       ` Tamar Christina
2017-07-18 10:19         ` Richard Biener
2017-07-18 11:00           ` Tamar Christina
2017-07-18 11:19             ` Richard Biener
2017-07-18 11:24               ` Tamar Christina

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).