[PATCH 2/9][GCC][AArch64][middle-end] Add rules to strip away unneeded type casts in expressions

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH 2/9][GCC][AArch64][middle-end] Add rules to strip away unneeded type casts in expressions
@ 2018-11-11 10:26 Tamar Christina
  2018-11-13  0:49 ` Joseph Myers
  0 siblings, 1 reply; 6+ messages in thread
From: Tamar Christina @ 2018-11-11 10:26 UTC (permalink / raw)
  To: gcc-patches
  Cc: nd, James Greenhalgh, Richard Earnshaw, Marcus Shawcroft, law,
	ian, rguenther

[-- Attachment #1: Type: text/plain, Size: 1388 bytes --]

Hi All,

This patch adds a match.pd rule for stripping away the type converts when you're
converting to a type that has twice the precision of the current type in the same
class, doing a simple math operation on it and converting back to the smaller type.

The change makes it so the operations are kept in the smaller type.  The motivating
reason behind this is that the imaginary constant I in C99 is defined to be a
single precision float.  For Half precision this means the entire operation is carried
out in single precision which means that it adds a lot of type casting instructions in
the output and prevents optimal vectorization as it lowers your vectorization factor.

It means that if a and b are fp16 values, doing a * b * I will get vectorized in SFmode
instead of HFmode.

Bootstrap and Regtest on aarch64-none-linux-gnu, arm-none-gnueabihf and x86_64-pc-linux-gnu
are still on going but previous patch showed regressions in the builtin-arith-overflow-8 to -11.

However since it doesn't show any regression anywhere else I am wondering if it's just the test
that need updating or if the idea is not acceptable. Perhaps it should be done only for unsafe math?

So I am posting the patch for comments.

Thanks,
Tamar

gcc/ChangeLog:

2018-11-11  Tamar Christina  <tamar.christina@arm.com>

	* match.pd: Add type conversion stripping.

-- 

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: rb10277.patch --]
[-- Type: text/x-diff; name="rb10277.patch", Size: 1213 bytes --]

diff --git a/gcc/match.pd b/gcc/match.pd
index d07ceb7d087b8b5c5a7d7362ad9d8f71ac90dc08..3c2f8caca42d6a163fbf7faba6220d7304200100 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4709,6 +4709,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 	 (convert (op (convert:utype @0)
 		      (convert:utype @1))))))))

+/* Strip out useless type conversions:
+   ((F)((X)a op (X)b)) -> a op b
+
+   when ((F)((X)a op (X)b)) where a and b are both of type F,
+   and X has twice the precision of F then the conversion is useless
+   and should be stripped away to allow more optimizations.  */
+
+(for op (plus minus mult rdiv)
+ (simplify
+   (convert (op:s (convert@0 @1) (convert@2 @3)))
+   (if (types_match (@1, @3)
+        && types_match (type, @1)
+        && types_match (@0, @2)
+	&& GET_MODE_CLASS (TYPE_MODE (type))
+	   == GET_MODE_CLASS (TYPE_MODE (TREE_TYPE (@0)))
+        && TYPE_PRECISION (type) == (TYPE_PRECISION (TREE_TYPE (@0)) / 2))
+     (op @1 @3))))
+
 /* This is another case of narrowing, specifically when there's an outer
    BIT_AND_EXPR which masks off bits outside the type of the innermost
    operands.   Like the previous case we have to convert the operands

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/9][GCC][AArch64][middle-end] Add rules to strip away unneeded type casts in expressions
  2018-11-11 10:26 [PATCH 2/9][GCC][AArch64][middle-end] Add rules to strip away unneeded type casts in expressions Tamar Christina
@ 2018-11-13  0:49 ` Joseph Myers
  2018-11-13 16:41   ` Tamar Christina
  0 siblings, 1 reply; 6+ messages in thread
From: Joseph Myers @ 2018-11-13  0:49 UTC (permalink / raw)
  To: Tamar Christina
  Cc: gcc-patches, nd, James Greenhalgh, Richard Earnshaw,
	Marcus Shawcroft, law, ian, rguenther

On Sun, 11 Nov 2018, Tamar Christina wrote:

> This patch adds a match.pd rule for stripping away the type converts 
> when you're converting to a type that has twice the precision of the 
> current type in the same class, doing a simple math operation on it and 
> converting back to the smaller type.

What types exactly is this meant to apply to?  Floating-point?  Integer?  
Mixtures of those?  (I'm guessing not mixtures, because those would be 
something other than "convert" here.)

For integer types, it's not safe, in that if e.g. F is int and X is 
unsigned long long, you're changing from defined overflow to undefined 
overflow.

For floating-point types, using TYPE_PRECISION is suspect (it's not 
wonderfully clear what it means, but it's not the number of significand 
bits) - you need to look at the actual properties of the real format of 
the machine modes in question.

Specifically, see convert.c:convert_to_real_1, the long comment starting 
"Sometimes this transformation is safe (cannot change results through 
affecting double rounding cases) and sometimes it is not.", and the 
associated code calling real_can_shorten_arithmetic.  I think that code in 
convert.c ought to apply to your case of half precision converted to float 
for arithmetic and then converted back to half precision afterwards.  (In 
the case where the excess precision configuration - which depends on 
TARGET_FP_F16INST for AArch64 - says to do arithmetic directly on half 
precision, anyway.)

Now, there are still issues in that convert.c code in the decimal 
floating-point case (see bug 40503).  And I think match.pd is a much 
better place for this sort of thing than convert.c (and 
c-family/c-common.c:shorten_binary_op in the integer case).  But for this 
specific case of binary floating-point conversions, I think the logic in 
convert.c is what should be followed (but moved to match.pd if possible).

This patch is also lacking a testcase, which might show why the existing 
logic in convert.c isn't being applied in whatever case you're concerned 
with.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH 2/9][GCC][AArch64][middle-end] Add rules to strip away unneeded type casts in expressions
  2018-11-13  0:49 ` Joseph Myers
@ 2018-11-13 16:41   ` Tamar Christina
  2018-11-13 17:59     ` Joseph Myers
  0 siblings, 1 reply; 6+ messages in thread
From: Tamar Christina @ 2018-11-13 16:41 UTC (permalink / raw)
  To: Joseph Myers
  Cc: gcc-patches, nd, James Greenhalgh, Richard Earnshaw,
	Marcus Shawcroft, law, ian, rguenther

Hi Joseph,

> What types exactly is this meant to apply to?  Floating-point?  Integer?  
> Mixtures of those?  (I'm guessing not mixtures, because those would be something other than "convert" here.)

Originally I had it for both Floating-point and Integer, but not a mix of the two.

> For integer types, it's not safe, in that if e.g. F is int and X is unsigned long long, you're changing from defined overflow to undefined overflow.

This is a good point, so I should limited it to floating point formats only.

> For floating-point types, using TYPE_PRECISION is suspect (it's not wonderfully clear what it means, but it's not the number of significand
> bits) - you need to look at the actual properties of the real format of the machine modes in question.

Would restricting it to flag_unsafe_math_optimizations not be enough in this case? Since if it's only done for unsafe math
then you likely won't care about a small loss in precision anyway?

> Specifically, see convert.c:convert_to_real_1, the long comment starting "Sometimes this transformation is safe (cannot change results through affecting double rounding cases) and sometimes it is not.", and the associated code calling real_can_shorten_arithmetic.  I think that code in convert.c ought to apply to your case of half precision converted to float for arithmetic and then converted back to half precision afterwards.  (In the case where the excess precision configuration - which depends on TARGET_FP_F16INST for AArch64 - says to do arithmetic directly on half precision, anyway.)

Sorry I hadn't attached a test-case because I wanted to get some feedback before.

But my simple testcase is

#include <complex.h>

#define N 200

void foo (_Float16 complex a[N], _Float16 complex b[N], _Float16 complex *c)
{
  for (int x = 0; x < N; x++)
    c[x] = a[x] + b[x] * I;
}

A simplified version of one of the expressions in the gimple tree for this is

  _7 = IMAGPART_EXPR <*_4>;
  _13 = REALPART_EXPR <*_3>;
  _8 = (floatD.30) _7;
  _14 = (floatD.30) _13;
  _35 = _14 + _8;
  _19 = (_Float16D.33) _35;
  IMAGPART_EXPR <*_20> = _19;

note that the type of _4, _3 and _19 are that of half float.

  _4 = a_26(D) + _2;
  _3 = b_25(D) + _2;
  _20 = c_28(D) + _2;

  complex _Float16D.43 * a_26(D) = aD.3538;
  complex _Float16D.43 * b_25(D) = bD.3539;
  complex _Float16D.43 * c_28(D) = cD.3540;

If the code is SLP vectorized, then inside the tree (without my match.pd rule)
it will contain the casts, which is evident from the resulting vector code

	ldr	h0, [x6, x3]
	ldr	h2, [x1, x3]
	fcvt	s0, h0
	fcvt	s2, h2
	fadd	s0, s0, s2
	fcvt	h1, s1
	fcvt	h0, s0
	str	h1, [x2, x3]
	str	h0, [x4, x3]

The problem seems to be (as far as I can tell) that when a loop is present, a new
temporary is created to store the intermediate result, while it's creating the
destination variable with the dest + offset.  It can't do the cast and the computation
at the same time.

foo (complex _Float16 * a, complex _Float16 * b, complex _Float16 * c)
{
  complex _Float16 D_3548;
  complex _Float16 D_3549;
  complex float D_3550;

The type of this temporary seems to be the type of the inner computation, so in
this case float due to the 90 rotation of b in the C file.

This also has the effect that it seems to make the type of the imaginary and real
expressions all float

So when convert.c tries to convert them, all it sees are

 <imagpart_expr 0x7ffb19016c40
    type <real_type 0x7ffb18e362a0 float SF
        size <integer_cst 0x7ffb18e1df48 constant 32>
        unit-size <integer_cst 0x7ffb18e1df60 constant 4>
        align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffb18e362a0 precision:32
        pointer_to_this <pointer_type 0x7ffb18e36888>>
    side-effects
    arg:0 <save_expr 0x7ffb19016c00

Because the usage of this expression is to store it inside a complex float, then
convert and constant prop will no longer remove the casts.  The float is then
converted into half float again for storing.  Because of this all the invocations
of convert_to_real_1 exit at REAL_TYPE and essentially adds a NOP_EXPR doing no
conversions.

So it does seem like match.pd is the best place to do this since it won't care
about the temporaries.

Thanks,
Tamar

> -----Original Message-----
> From: Joseph Myers <joseph@codesourcery.com>
> Sent: Tuesday, November 13, 2018 00:49
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; James Greenhalgh
> <James.Greenhalgh@arm.com>; Richard Earnshaw
> <Richard.Earnshaw@arm.com>; Marcus Shawcroft
> <Marcus.Shawcroft@arm.com>; law@redhat.com; ian@airs.com;
> rguenther@suse.de
> Subject: Re: [PATCH 2/9][GCC][AArch64][middle-end] Add rules to strip away
> unneeded type casts in expressions
> 
> On Sun, 11 Nov 2018, Tamar Christina wrote:
> 
> > This patch adds a match.pd rule for stripping away the type converts
> > when you're converting to a type that has twice the precision of the
> > current type in the same class, doing a simple math operation on it
> > and converting back to the smaller type.
> 
> What types exactly is this meant to apply to?  Floating-point?  Integer?
> Mixtures of those?  (I'm guessing not mixtures, because those would be
> something other than "convert" here.)
> 
> For integer types, it's not safe, in that if e.g. F is int and X is unsigned long
> long, you're changing from defined overflow to undefined overflow.
> 
> For floating-point types, using TYPE_PRECISION is suspect (it's not
> wonderfully clear what it means, but it's not the number of significand
> bits) - you need to look at the actual properties of the real format of the
> machine modes in question.
> 
> Specifically, see convert.c:convert_to_real_1, the long comment starting
> "Sometimes this transformation is safe (cannot change results through
> affecting double rounding cases) and sometimes it is not.", and the
> associated code calling real_can_shorten_arithmetic.  I think that code in
> convert.c ought to apply to your case of half precision converted to float for
> arithmetic and then converted back to half precision afterwards.  (In the case
> where the excess precision configuration - which depends on
> TARGET_FP_F16INST for AArch64 - says to do arithmetic directly on half
> precision, anyway.)
> 
> Now, there are still issues in that convert.c code in the decimal floating-point
> case (see bug 40503).  And I think match.pd is a much better place for this
> sort of thing than convert.c (and c-family/c-common.c:shorten_binary_op in
> the integer case).  But for this specific case of binary floating-point
> conversions, I think the logic in convert.c is what should be followed (but
> moved to match.pd if possible).
> 
> This patch is also lacking a testcase, which might show why the existing logic
> in convert.c isn't being applied in whatever case you're concerned with.
> 
> --
> Joseph S. Myers
> joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH 2/9][GCC][AArch64][middle-end] Add rules to strip away unneeded type casts in expressions
  2018-11-13 16:41   ` Tamar Christina
@ 2018-11-13 17:59     ` Joseph Myers
  2018-11-28 17:57       ` Tamar Christina
  0 siblings, 1 reply; 6+ messages in thread
From: Joseph Myers @ 2018-11-13 17:59 UTC (permalink / raw)
  To: Tamar Christina
  Cc: gcc-patches, nd, James Greenhalgh, Richard Earnshaw,
	Marcus Shawcroft, law, ian, rguenther

On Tue, 13 Nov 2018, Tamar Christina wrote:

> Would restricting it to flag_unsafe_math_optimizations not be enough in 
> this case? Since if it's only done for unsafe math then you likely won't 
> care about a small loss in precision anyway?

We have what should be the right logic (modulo DFP issues) in 
convert_to_real_1, for converting (outertype)((innertype0)a+(innertype1)b) 
into ((newtype)a+(newtype)b).  I think the right thing to do is to move 
that logic, including the associated comments, from convert_to_real_1 to 
match.pd (not duplicate it or a subset in both places - make match.pd able 
to do everything that logic in convert.c does, so it's no longer needed in 
convert.c).  That logic knows both when the conversion is OK based on 
flag_unsafe_math_optimizations, and when it's OK based on 
real_can_shorten_arithmetic.

(Most of the other special case logic in convert_to_real_1 to optimize 
particular conversions would make sense to move as well - but for your 
present issue, only the PLUS_EXPR / MINUS_EXPR / MULT_EXPR / RDIV_EXPR 
logic should need to move.)

> But my simple testcase is

I'd hope a simpler test could be added - one not involving complex 
arithmetic at all, just an intermediate temporary variable or whatever's 
needed for convert_to_real_1 not to apply.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH 2/9][GCC][AArch64][middle-end] Add rules to strip away unneeded type casts in expressions
  2018-11-13 17:59     ` Joseph Myers
@ 2018-11-28 17:57       ` Tamar Christina
  2018-12-03 19:25         ` Jeff Law
  0 siblings, 1 reply; 6+ messages in thread
From: Tamar Christina @ 2018-11-28 17:57 UTC (permalink / raw)
  To: Joseph Myers; +Cc: gcc-patches, nd, law, ian, rguenther

[-- Attachment #1: Type: text/plain, Size: 2960 bytes --]

Hi Joseph,

This is an updated version of my first patch which moves part of the type conversion code
from convert.c to match.pd because match.pd is able to apply this transformation in the
presence of intermediate temporary variables.

I couldn't move the error parts as match.pd doesn't seem to have access to the diagnostics
bits and also I feel like this may be a bit against the philosophy of match.pd. 

Regtested and bootstrapped on aarch64-none-linux-gnu and no issues.

Ok for trunk?

Thanks,
Tamar

Concretely it makes both these cases behave the same

  float e = (float)a * (float)b;
  *c = (_Float16)e;

and 

  *c = (_Float16)((float)a * (float)b);

Thanks,
Tamar

gcc/ChangeLog:

2018-11-28  Tamar Christina  <tamar.christina@arm.com>

	* convert.c (convert_to_real_1): Move part of conversion code...
	* match.pd: ...To here.

gcc/testsuite/ChangeLog:

2018-11-28  Tamar Christina  <tamar.christina@arm.com>

	* gcc.dg/type-convert-var.c: New test.

> -----Original Message-----
> From: Joseph Myers <joseph@codesourcery.com>
> Sent: Tuesday, November 13, 2018 17:59
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; James Greenhalgh
> <James.Greenhalgh@arm.com>; Richard Earnshaw
> <Richard.Earnshaw@arm.com>; Marcus Shawcroft
> <Marcus.Shawcroft@arm.com>; law@redhat.com; ian@airs.com;
> rguenther@suse.de
> Subject: RE: [PATCH 2/9][GCC][AArch64][middle-end] Add rules to strip away
> unneeded type casts in expressions
> 
> On Tue, 13 Nov 2018, Tamar Christina wrote:
> 
> > Would restricting it to flag_unsafe_math_optimizations not be enough
> > in this case? Since if it's only done for unsafe math then you likely
> > won't care about a small loss in precision anyway?
> 
> We have what should be the right logic (modulo DFP issues) in
> convert_to_real_1, for converting (outertype)((innertype0)a+(innertype1)b)
> into ((newtype)a+(newtype)b).  I think the right thing to do is to move that
> logic, including the associated comments, from convert_to_real_1 to
> match.pd (not duplicate it or a subset in both places - make match.pd able to
> do everything that logic in convert.c does, so it's no longer needed in
> convert.c).  That logic knows both when the conversion is OK based on
> flag_unsafe_math_optimizations, and when it's OK based on
> real_can_shorten_arithmetic.
> 
> (Most of the other special case logic in convert_to_real_1 to optimize
> particular conversions would make sense to move as well - but for your
> present issue, only the PLUS_EXPR / MINUS_EXPR / MULT_EXPR /
> RDIV_EXPR logic should need to move.)
> 
> > But my simple testcase is
> 
> I'd hope a simpler test could be added - one not involving complex arithmetic
> at all, just an intermediate temporary variable or whatever's needed for
> convert_to_real_1 not to apply.
> 
> --
> Joseph S. Myers
> joseph@codesourcery.com

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: rb10277.patch --]
[-- Type: text/x-diff; name="rb10277.patch", Size: 8480 bytes --]

diff --git a/gcc/convert.c b/gcc/convert.c
index 68705f3e9b09c5d0e6bae0ba1c7bca0db7107980..73ca22967d83aa125ad0bd67e8070a775eaf8d14 100644
--- a/gcc/convert.c
+++ b/gcc/convert.c
@@ -272,92 +272,6 @@ convert_to_real_1 (tree type, tree expr, bool fold_p)
 	      return build1 (TREE_CODE (expr), type, arg);
 	    }
 	  break;
-	/* Convert (outertype)((innertype0)a+(innertype1)b)
-	   into ((newtype)a+(newtype)b) where newtype
-	   is the widest mode from all of these.  */
-	case PLUS_EXPR:
-	case MINUS_EXPR:
-	case MULT_EXPR:
-	case RDIV_EXPR:
-	   {
-	     tree arg0 = strip_float_extensions (TREE_OPERAND (expr, 0));
-	     tree arg1 = strip_float_extensions (TREE_OPERAND (expr, 1));
-
-	     if (FLOAT_TYPE_P (TREE_TYPE (arg0))
-		 && FLOAT_TYPE_P (TREE_TYPE (arg1))
-		 && DECIMAL_FLOAT_TYPE_P (itype) == DECIMAL_FLOAT_TYPE_P (type))
-	       {
-		  tree newtype = type;
-
-		  if (TYPE_MODE (TREE_TYPE (arg0)) == SDmode
-		      || TYPE_MODE (TREE_TYPE (arg1)) == SDmode
-		      || TYPE_MODE (type) == SDmode)
-		    newtype = dfloat32_type_node;
-		  if (TYPE_MODE (TREE_TYPE (arg0)) == DDmode
-		      || TYPE_MODE (TREE_TYPE (arg1)) == DDmode
-		      || TYPE_MODE (type) == DDmode)
-		    newtype = dfloat64_type_node;
-		  if (TYPE_MODE (TREE_TYPE (arg0)) == TDmode
-		      || TYPE_MODE (TREE_TYPE (arg1)) == TDmode
-		      || TYPE_MODE (type) == TDmode)
-                    newtype = dfloat128_type_node;
-		  if (newtype == dfloat32_type_node
-		      || newtype == dfloat64_type_node
-		      || newtype == dfloat128_type_node)
-		    {
-		      expr = build2 (TREE_CODE (expr), newtype,
-				     convert_to_real_1 (newtype, arg0,
-							fold_p),
-				     convert_to_real_1 (newtype, arg1,
-							fold_p));
-		      if (newtype == type)
-			return expr;
-		      break;
-		    }
-
-		  if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (newtype))
-		    newtype = TREE_TYPE (arg0);
-		  if (TYPE_PRECISION (TREE_TYPE (arg1)) > TYPE_PRECISION (newtype))
-		    newtype = TREE_TYPE (arg1);
-		  /* Sometimes this transformation is safe (cannot
-		     change results through affecting double rounding
-		     cases) and sometimes it is not.  If NEWTYPE is
-		     wider than TYPE, e.g. (float)((long double)double
-		     + (long double)double) converted to
-		     (float)(double + double), the transformation is
-		     unsafe regardless of the details of the types
-		     involved; double rounding can arise if the result
-		     of NEWTYPE arithmetic is a NEWTYPE value half way
-		     between two representable TYPE values but the
-		     exact value is sufficiently different (in the
-		     right direction) for this difference to be
-		     visible in ITYPE arithmetic.  If NEWTYPE is the
-		     same as TYPE, however, the transformation may be
-		     safe depending on the types involved: it is safe
-		     if the ITYPE has strictly more than twice as many
-		     mantissa bits as TYPE, can represent infinities
-		     and NaNs if the TYPE can, and has sufficient
-		     exponent range for the product or ratio of two
-		     values representable in the TYPE to be within the
-		     range of normal values of ITYPE.  */
-		  if (TYPE_PRECISION (newtype) < TYPE_PRECISION (itype)
-		      && (flag_unsafe_math_optimizations
-			  || (TYPE_PRECISION (newtype) == TYPE_PRECISION (type)
-			      && real_can_shorten_arithmetic (TYPE_MODE (itype),
-							      TYPE_MODE (type))
-			      && !excess_precision_type (newtype))))
-		    {
-		      expr = build2 (TREE_CODE (expr), newtype,
-				     convert_to_real_1 (newtype, arg0,
-							fold_p),
-				     convert_to_real_1 (newtype, arg1,
-							fold_p));
-		      if (newtype == type)
-			return expr;
-		    }
-	       }
-	   }
-	  break;
 	default:
 	  break;
       }
diff --git a/gcc/match.pd b/gcc/match.pd
index d07ceb7d087b8b5c5a7d7362ad9d8f71ac90dc08..5f7b20a164be11a9e2b2b8449521cb29868ca9f9 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4709,6 +4709,83 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 	 (convert (op (convert:utype @0)
 		      (convert:utype @1))))))))
 
+/* Convert (outertype)((innertype0)a+(innertype1)b)
+   into ((newtype)a+(newtype)b) where newtype
+   is the widest mode from all of these.  */
+(for op (plus minus mult rdiv)
+ (simplify
+   (convert (op:s@0 (convert1? @1) (convert2? @2)))
+   (with { tree arg0 = strip_float_extensions (@1);
+	   tree arg1 = strip_float_extensions (@2);
+	   tree itype = TREE_TYPE (@0);
+	   tree ty1 = TREE_TYPE (arg0);
+	   tree ty2 = TREE_TYPE (arg1);
+	   enum tree_code code = TREE_CODE (itype); }
+    (if (FLOAT_TYPE_P (ty1)
+	 && FLOAT_TYPE_P (ty2)
+	 && FLOAT_TYPE_P (type)
+	 && DECIMAL_FLOAT_TYPE_P (itype) == DECIMAL_FLOAT_TYPE_P (type))
+       (with { tree newtype = type;
+	       if (TYPE_MODE (ty1) == SDmode
+		   || TYPE_MODE (ty2) == SDmode
+		   || TYPE_MODE (type) == SDmode)
+		 newtype = dfloat32_type_node;
+	       if (TYPE_MODE (ty1) == DDmode
+		   || TYPE_MODE (ty2) == DDmode
+		   || TYPE_MODE (type) == DDmode)
+		 newtype = dfloat64_type_node;
+	       if (TYPE_MODE (ty1) == TDmode
+		   || TYPE_MODE (ty2) == TDmode
+		   || TYPE_MODE (type) == TDmode)
+		 newtype = dfloat128_type_node; }
+	(if ((newtype == dfloat32_type_node
+	      || newtype == dfloat64_type_node
+	      || newtype == dfloat128_type_node)
+	     && newtype == type)
+	   (convert:newtype (op (convert:newtype @1) (convert:newtype @2)))
+	   (with { if (TYPE_PRECISION (ty1) > TYPE_PRECISION (newtype))
+		     newtype = ty1;
+		   if (TYPE_PRECISION (ty2) > TYPE_PRECISION (newtype))
+		     newtype = ty2; }
+	     /* Sometimes this transformation is safe (cannot
+	        change results through affecting double rounding
+	        cases) and sometimes it is not.  If NEWTYPE is
+	        wider than TYPE, e.g. (float)((long double)double
+	        + (long double)double) converted to
+	        (float)(double + double), the transformation is
+	        unsafe regardless of the details of the types
+	        involved; double rounding can arise if the result
+	        of NEWTYPE arithmetic is a NEWTYPE value half way
+	        between two representable TYPE values but the
+	        exact value is sufficiently different (in the
+	        right direction) for this difference to be
+	        visible in ITYPE arithmetic.  If NEWTYPE is the
+	        same as TYPE, however, the transformation may be
+	        safe depending on the types involved: it is safe
+	        if the ITYPE has strictly more than twice as many
+	        mantissa bits as TYPE, can represent infinities
+	        and NaNs if the TYPE can, and has sufficient
+	        exponent range for the product or ratio of two
+	        values representable in the TYPE to be within the
+	        range of normal values of ITYPE.  */
+	     (if (TYPE_PRECISION (newtype) < TYPE_PRECISION (itype)
+		  && (flag_unsafe_math_optimizations
+		      || (TYPE_PRECISION (newtype) == TYPE_PRECISION (type)
+			  && real_can_shorten_arithmetic (TYPE_MODE (itype),
+							  TYPE_MODE (type))
+			  && !excess_precision_type (newtype))))
+		(convert:newtype (op (convert:newtype @1) (convert:newtype @2)))
+	))))
+	(if (code == REAL_TYPE)
+	  /* Ignore the conversion if we don't need to store intermediate
+	     results and neither type is a decimal float.  */
+	  (if (flag_float_store
+	       || DECIMAL_FLOAT_TYPE_P (type)
+	       || DECIMAL_FLOAT_TYPE_P (itype))
+	    (convert:type (op (convert:ty1 @1) (convert:ty2 @2)))
+	    (nop:type (op (convert:ty1 @1) (convert:ty2 @2))))))
+)))
+
 /* This is another case of narrowing, specifically when there's an outer
    BIT_AND_EXPR which masks off bits outside the type of the innermost
    operands.   Like the previous case we have to convert the operands
diff --git a/gcc/testsuite/gcc.dg/type-convert-var.c b/gcc/testsuite/gcc.dg/type-convert-var.c
new file mode 100644
index 0000000000000000000000000000000000000000..88d74e2a49d7123515b87ff64a18bd9b306d57e9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/type-convert-var.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O1 -fdump-tree-optimized" } */
+void foo (float a, float b, float *c)
+{
+  double e = (double)a * (double)b;
+  *c = (float)e;
+}
+
+/* { dg-final { scan-tree-dump-not {double} "optimized" } } */


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/9][GCC][AArch64][middle-end] Add rules to strip away unneeded type casts in expressions
  2018-11-28 17:57       ` Tamar Christina
@ 2018-12-03 19:25         ` Jeff Law
  0 siblings, 0 replies; 6+ messages in thread
From: Jeff Law @ 2018-12-03 19:25 UTC (permalink / raw)
  To: Tamar Christina, Joseph Myers; +Cc: gcc-patches, nd, ian, rguenther

On 11/28/18 10:57 AM, Tamar Christina wrote:
> Hi Joseph,
> 
> This is an updated version of my first patch which moves part of the type conversion code
> from convert.c to match.pd because match.pd is able to apply this transformation in the
> presence of intermediate temporary variables.
> 
> I couldn't move the error parts as match.pd doesn't seem to have access to the diagnostics
> bits and also I feel like this may be a bit against the philosophy of match.pd. 
> 
> Regtested and bootstrapped on aarch64-none-linux-gnu and no issues.
> 
> Ok for trunk?
> 
> Thanks,
> Tamar
> 
> Concretely it makes both these cases behave the same
> 
>   float e = (float)a * (float)b;
>   *c = (_Float16)e;
> 
> and 
> 
>   *c = (_Float16)((float)a * (float)b);
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 2018-11-28  Tamar Christina  <tamar.christina@arm.com>
> 
> 	* convert.c (convert_to_real_1): Move part of conversion code...
> 	* match.pd: ...To here.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-11-28  Tamar Christina  <tamar.christina@arm.com>
> 
> 	* gcc.dg/type-convert-var.c: New test.
OK.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-12-03 19:25 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-11 10:26 [PATCH 2/9][GCC][AArch64][middle-end] Add rules to strip away unneeded type casts in expressions Tamar Christina
2018-11-13  0:49 ` Joseph Myers
2018-11-13 16:41   ` Tamar Christina
2018-11-13 17:59     ` Joseph Myers
2018-11-28 17:57       ` Tamar Christina
2018-12-03 19:25         ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).