public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Fix PR55152
@ 2016-09-28 14:27 Richard Biener
  2016-09-28 17:49 ` Joseph Myers
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Biener @ 2016-09-28 14:27 UTC (permalink / raw)
  To: gcc-patches


The following adds a pattern to optimize MAX (a, -a) to abs (a).

Bootstrap / regtest pending on x86_64-unknown-linux-gnu.

Richard.

2016-09-28  Richard Biener  <rguenther@suse.de>

	PR middle-end/55152
	* match.pd: Add max(a,-a) -> abs(a) pattern.

	* gcc.dg/pr55152.c: New testcase.

Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 240565)
+++ gcc/match.pd	(working copy)
@@ -1271,6 +1284,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (simplify
  (max:c (min:c @0 @1) @1)
  @1)
+/* max(a,-a) -> abs(a).  */
+(simplify
+ (max:c @0 (negate @0))
+ (if (TREE_CODE (type) != COMPLEX_TYPE
+      && (! ANY_INTEGRAL_TYPE_P (type)
+	  || TYPE_OVERFLOW_UNDEFINED (type)))
+  (abs @0)))
 (simplify
  (min @0 @1)
  (switch
Index: gcc/testsuite/gcc.dg/pr55152.c
===================================================================
--- gcc/testsuite/gcc.dg/pr55152.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr55152.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O -ffinite-math-only -fstrict-overflow -fdump-tree-optimized" } */
+
+double g (double a)
+{
+  return (a>=-a)?a:-a;
+}
+int f(int a)
+{
+  return (a>=-a)?a:-a;
+}
+
+/* { dg-final { scan-tree-dump-times "ABS_EXPR" 2 "optimized" } } */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix PR55152
  2016-09-28 14:27 [PATCH] Fix PR55152 Richard Biener
@ 2016-09-28 17:49 ` Joseph Myers
  2016-09-29  7:55   ` Richard Biener
  0 siblings, 1 reply; 14+ messages in thread
From: Joseph Myers @ 2016-09-28 17:49 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On Wed, 28 Sep 2016, Richard Biener wrote:

> Index: gcc/testsuite/gcc.dg/pr55152.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/pr55152.c	(revision 0)
> +++ gcc/testsuite/gcc.dg/pr55152.c	(working copy)
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -ffinite-math-only -fstrict-overflow -fdump-tree-optimized" } */
> +
> +double g (double a)
> +{
> +  return (a>=-a)?a:-a;

You should need -fno-signed-zeros for this (that is, for the 
transformation to MAX_EXPR), not -ffinite-math-only.  For a == -0, that 
function should return -0, but abs would produce +0.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix PR55152
  2016-09-28 17:49 ` Joseph Myers
@ 2016-09-29  7:55   ` Richard Biener
  2016-09-29 12:45     ` Richard Biener
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Biener @ 2016-09-29  7:55 UTC (permalink / raw)
  To: Joseph Myers; +Cc: gcc-patches

On Wed, 28 Sep 2016, Joseph Myers wrote:

> On Wed, 28 Sep 2016, Richard Biener wrote:
> 
> > Index: gcc/testsuite/gcc.dg/pr55152.c
> > ===================================================================
> > --- gcc/testsuite/gcc.dg/pr55152.c	(revision 0)
> > +++ gcc/testsuite/gcc.dg/pr55152.c	(working copy)
> > @@ -0,0 +1,13 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -ffinite-math-only -fstrict-overflow -fdump-tree-optimized" } */
> > +
> > +double g (double a)
> > +{
> > +  return (a>=-a)?a:-a;
> 
> You should need -fno-signed-zeros for this (that is, for the 
> transformation to MAX_EXPR), not -ffinite-math-only.  For a == -0, that 
> function should return -0, but abs would produce +0.

This means that tree-ssa-phiopt.c has a bug:

static bool
minmax_replacement (basic_block cond_bb, basic_block middle_bb,
                    edge e0, edge e1, gimple *phi,
                    tree arg0, tree arg1)
{
...
  /* The optimization may be unsafe due to NaNs.  */
  if (HONOR_NANS (type))
    return false;

and it should check HONOR_SIGNED_ZEROS as well.

I'll fix that as a followup.

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix PR55152
  2016-09-29  7:55   ` Richard Biener
@ 2016-09-29 12:45     ` Richard Biener
  2016-09-29 18:05       ` Andrew Pinski
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Biener @ 2016-09-29 12:45 UTC (permalink / raw)
  To: Joseph Myers; +Cc: gcc-patches

On Thu, 29 Sep 2016, Richard Biener wrote:

> On Wed, 28 Sep 2016, Joseph Myers wrote:
> 
> > On Wed, 28 Sep 2016, Richard Biener wrote:
> > 
> > > Index: gcc/testsuite/gcc.dg/pr55152.c
> > > ===================================================================
> > > --- gcc/testsuite/gcc.dg/pr55152.c	(revision 0)
> > > +++ gcc/testsuite/gcc.dg/pr55152.c	(working copy)
> > > @@ -0,0 +1,13 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -ffinite-math-only -fstrict-overflow -fdump-tree-optimized" } */
> > > +
> > > +double g (double a)
> > > +{
> > > +  return (a>=-a)?a:-a;
> > 
> > You should need -fno-signed-zeros for this (that is, for the 
> > transformation to MAX_EXPR), not -ffinite-math-only.  For a == -0, that 
> > function should return -0, but abs would produce +0.
> 
> This means that tree-ssa-phiopt.c has a bug:
> 
> static bool
> minmax_replacement (basic_block cond_bb, basic_block middle_bb,
>                     edge e0, edge e1, gimple *phi,
>                     tree arg0, tree arg1)
> {
> ...
>   /* The optimization may be unsafe due to NaNs.  */
>   if (HONOR_NANS (type))
>     return false;
> 
> and it should check HONOR_SIGNED_ZEROS as well.
> 
> I'll fix that as a followup.

Committed as follows.

Bootstrapped / tested on x86_64-unknown-linux-gnu.

Richard.

2016-09-29  Richard Biener  <rguenther@suse.de>

	PR middle-end/55152
	* match.pd: Add max(a,-a) -> abs(a) pattern.
	* tree-ssa-phiopt.c (minmax_replacement): Disable for
	HONOR_SIGNED_ZEROS types.

	* gcc.dg/pr55152.c: New testcase.
	* gcc.dg/tree-ssa/phi-opt-5.c: Adjust.

Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 240565)
+++ gcc/match.pd	(working copy)
@@ -1271,6 +1284,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (simplify
  (max:c (min:c @0 @1) @1)
  @1)
+/* max(a,-a) -> abs(a).  */
+(simplify
+ (max:c @0 (negate @0))
+ (if (TREE_CODE (type) != COMPLEX_TYPE
+      && (! ANY_INTEGRAL_TYPE_P (type)
+	  || TYPE_OVERFLOW_UNDEFINED (type)))
+  (abs @0)))
 (simplify
  (min @0 @1)
  (switch
Index: gcc/tree-ssa-phiopt.c
===================================================================
--- gcc/tree-ssa-phiopt.c	(revision 240565)
+++ gcc/tree-ssa-phiopt.c	(working copy)
@@ -1075,7 +1075,7 @@ minmax_replacement (basic_block cond_bb,
   type = TREE_TYPE (PHI_RESULT (phi));
 
   /* The optimization may be unsafe due to NaNs.  */
-  if (HONOR_NANS (type))
+  if (HONOR_NANS (type) || HONOR_SIGNED_ZEROS (type))
     return false;
 
   cond = as_a <gcond *> (last_stmt (cond_bb));
Index: gcc/testsuite/gcc.dg/pr55152.c
===================================================================
--- gcc/testsuite/gcc.dg/pr55152.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr55152.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O -ffinite-math-only -fno-signed-zeros -fstrict-overflow -fdump-tree-optimized" } */
+
+double g (double a)
+{
+  return (a>=-a)?a:-a;
+}
+int f(int a)
+{
+  return (a>=-a)?a:-a;
+}
+
+/* { dg-final { scan-tree-dump-times "ABS_EXPR" 2 "optimized" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-5.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-5.c	(revision 240612)
+++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-5.c	(working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O1 -ffinite-math-only -fdump-tree-phiopt1" } */
+/* { dg-options "-O1 -ffinite-math-only -fno-signed-zeros -fdump-tree-phiopt1" } */
 
 float repl1 (float varx)
 {

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix PR55152
  2016-09-29 12:45     ` Richard Biener
@ 2016-09-29 18:05       ` Andrew Pinski
  2016-10-04  7:30         ` Richard Biener
  0 siblings, 1 reply; 14+ messages in thread
From: Andrew Pinski @ 2016-09-29 18:05 UTC (permalink / raw)
  To: Richard Biener; +Cc: Joseph Myers, GCC Patches

On Thu, Sep 29, 2016 at 8:23 PM, Richard Biener <rguenther@suse.de> wrote:
> On Thu, 29 Sep 2016, Richard Biener wrote:
>
>> On Wed, 28 Sep 2016, Joseph Myers wrote:
>>
>> > On Wed, 28 Sep 2016, Richard Biener wrote:
>> >
>> > > Index: gcc/testsuite/gcc.dg/pr55152.c
>> > > ===================================================================
>> > > --- gcc/testsuite/gcc.dg/pr55152.c        (revision 0)
>> > > +++ gcc/testsuite/gcc.dg/pr55152.c        (working copy)
>> > > @@ -0,0 +1,13 @@
>> > > +/* { dg-do compile } */
>> > > +/* { dg-options "-O -ffinite-math-only -fstrict-overflow -fdump-tree-optimized" } */
>> > > +
>> > > +double g (double a)
>> > > +{
>> > > +  return (a>=-a)?a:-a;
>> >
>> > You should need -fno-signed-zeros for this (that is, for the
>> > transformation to MAX_EXPR), not -ffinite-math-only.  For a == -0, that
>> > function should return -0, but abs would produce +0.
>>
>> This means that tree-ssa-phiopt.c has a bug:
>>
>> static bool
>> minmax_replacement (basic_block cond_bb, basic_block middle_bb,
>>                     edge e0, edge e1, gimple *phi,
>>                     tree arg0, tree arg1)
>> {
>> ...
>>   /* The optimization may be unsafe due to NaNs.  */
>>   if (HONOR_NANS (type))
>>     return false;
>>
>> and it should check HONOR_SIGNED_ZEROS as well.
>>
>> I'll fix that as a followup.
>
> Committed as follows.
>
> Bootstrapped / tested on x86_64-unknown-linux-gnu.
>
> Richard.
>
> 2016-09-29  Richard Biener  <rguenther@suse.de>
>
>         PR middle-end/55152
>         * match.pd: Add max(a,-a) -> abs(a) pattern.


Hmm, shouldn't we also do "min(a, -a) -> - abs(a)" ?

Thanks,
Andrew Pinski

>         * tree-ssa-phiopt.c (minmax_replacement): Disable for
>         HONOR_SIGNED_ZEROS types.
>
>         * gcc.dg/pr55152.c: New testcase.
>         * gcc.dg/tree-ssa/phi-opt-5.c: Adjust.
>
> Index: gcc/match.pd
> ===================================================================
> --- gcc/match.pd        (revision 240565)
> +++ gcc/match.pd        (working copy)
> @@ -1271,6 +1284,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (simplify
>   (max:c (min:c @0 @1) @1)
>   @1)
> +/* max(a,-a) -> abs(a).  */
> +(simplify
> + (max:c @0 (negate @0))
> + (if (TREE_CODE (type) != COMPLEX_TYPE
> +      && (! ANY_INTEGRAL_TYPE_P (type)
> +         || TYPE_OVERFLOW_UNDEFINED (type)))
> +  (abs @0)))
>  (simplify
>   (min @0 @1)
>   (switch
> Index: gcc/tree-ssa-phiopt.c
> ===================================================================
> --- gcc/tree-ssa-phiopt.c       (revision 240565)
> +++ gcc/tree-ssa-phiopt.c       (working copy)
> @@ -1075,7 +1075,7 @@ minmax_replacement (basic_block cond_bb,
>    type = TREE_TYPE (PHI_RESULT (phi));
>
>    /* The optimization may be unsafe due to NaNs.  */
> -  if (HONOR_NANS (type))
> +  if (HONOR_NANS (type) || HONOR_SIGNED_ZEROS (type))
>      return false;
>
>    cond = as_a <gcond *> (last_stmt (cond_bb));
> Index: gcc/testsuite/gcc.dg/pr55152.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/pr55152.c      (revision 0)
> +++ gcc/testsuite/gcc.dg/pr55152.c      (working copy)
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -ffinite-math-only -fno-signed-zeros -fstrict-overflow -fdump-tree-optimized" } */
> +
> +double g (double a)
> +{
> +  return (a>=-a)?a:-a;
> +}
> +int f(int a)
> +{
> +  return (a>=-a)?a:-a;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "ABS_EXPR" 2 "optimized" } } */
> Index: gcc/testsuite/gcc.dg/tree-ssa/phi-opt-5.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/phi-opt-5.c   (revision 240612)
> +++ gcc/testsuite/gcc.dg/tree-ssa/phi-opt-5.c   (working copy)
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O1 -ffinite-math-only -fdump-tree-phiopt1" } */
> +/* { dg-options "-O1 -ffinite-math-only -fno-signed-zeros -fdump-tree-phiopt1" } */
>
>  float repl1 (float varx)
>  {

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix PR55152
  2016-09-29 18:05       ` Andrew Pinski
@ 2016-10-04  7:30         ` Richard Biener
  2016-10-04  9:41           ` Marc Glisse
  2016-10-04 12:09           ` Joseph Myers
  0 siblings, 2 replies; 14+ messages in thread
From: Richard Biener @ 2016-10-04  7:30 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: Joseph Myers, GCC Patches

On Fri, 30 Sep 2016, Andrew Pinski wrote:

> On Thu, Sep 29, 2016 at 8:23 PM, Richard Biener <rguenther@suse.de> wrote:
> > On Thu, 29 Sep 2016, Richard Biener wrote:
> >
> >> On Wed, 28 Sep 2016, Joseph Myers wrote:
> >>
> >> > On Wed, 28 Sep 2016, Richard Biener wrote:
> >> >
> >> > > Index: gcc/testsuite/gcc.dg/pr55152.c
> >> > > ===================================================================
> >> > > --- gcc/testsuite/gcc.dg/pr55152.c        (revision 0)
> >> > > +++ gcc/testsuite/gcc.dg/pr55152.c        (working copy)
> >> > > @@ -0,0 +1,13 @@
> >> > > +/* { dg-do compile } */
> >> > > +/* { dg-options "-O -ffinite-math-only -fstrict-overflow -fdump-tree-optimized" } */
> >> > > +
> >> > > +double g (double a)
> >> > > +{
> >> > > +  return (a>=-a)?a:-a;
> >> >
> >> > You should need -fno-signed-zeros for this (that is, for the
> >> > transformation to MAX_EXPR), not -ffinite-math-only.  For a == -0, that
> >> > function should return -0, but abs would produce +0.
> >>
> >> This means that tree-ssa-phiopt.c has a bug:
> >>
> >> static bool
> >> minmax_replacement (basic_block cond_bb, basic_block middle_bb,
> >>                     edge e0, edge e1, gimple *phi,
> >>                     tree arg0, tree arg1)
> >> {
> >> ...
> >>   /* The optimization may be unsafe due to NaNs.  */
> >>   if (HONOR_NANS (type))
> >>     return false;
> >>
> >> and it should check HONOR_SIGNED_ZEROS as well.
> >>
> >> I'll fix that as a followup.
> >
> > Committed as follows.
> >
> > Bootstrapped / tested on x86_64-unknown-linux-gnu.
> >
> > Richard.
> >
> > 2016-09-29  Richard Biener  <rguenther@suse.de>
> >
> >         PR middle-end/55152
> >         * match.pd: Add max(a,-a) -> abs(a) pattern.
> 
> 
> Hmm, shouldn't we also do "min(a, -a) -> - abs(a)" ?

Possibly.  Though then for FP we also want - abs (a) -> copysign (a, -1).

Testing the following.

Richard.

2016-10-04  Richard Biener  <rguenther@suse.de>

	PR middle-end/55152
	* match.pd (- abs (x) -> copysign (x, -1)): New pattern.
	(min(a,-a) -> -abs(a)): Likewise.

	* gcc.dg/pr55152-2.c: New testcase.

Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 240738)
+++ gcc/match.pd	(working copy)
@@ -782,6 +782,17 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (simplify
  (abs tree_expr_nonnegative_p@0)
  @0)
+/* - abs (x) -> copysign (x, -1).  */
+(simplify
+ (negate (abs @0))
+ (if (SCALAR_FLOAT_TYPE_P (type))
+  (switch
+   (if (types_match (type, float_type_node))
+    (BUILT_IN_COPYSIGNF @0 { build_minus_one_cst (type); }))
+   (if (types_match (type, double_type_node))
+    (BUILT_IN_COPYSIGN @0 { build_minus_one_cst (type); }))
+   (if (types_match (type, long_double_type_node))
+    (BUILT_IN_COPYSIGNL @0 { build_minus_one_cst (type); })))))
 
 /* A few cases of fold-const.c negate_expr_p predicate.  */
 (match negate_expr_p
@@ -1291,6 +1302,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
       && (! ANY_INTEGRAL_TYPE_P (type)
 	  || TYPE_OVERFLOW_UNDEFINED (type)))
   (abs @0)))
+/* min(a,-a) -> -abs(a).  */
+(simplify
+ (min:c @0 (negate @0))
+ (if (TREE_CODE (type) != COMPLEX_TYPE
+      && (! ANY_INTEGRAL_TYPE_P (type)
+	  || TYPE_OVERFLOW_UNDEFINED (type)))
+  (negate (abs @0))))
 (simplify
  (min @0 @1)
  (switch
Index: gcc/testsuite/gcc.dg/pr55152-2.c
===================================================================
--- gcc/testsuite/gcc.dg/pr55152-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr55152-2.c	(working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O -ffinite-math-only -fno-signed-zeros -fstrict-overflow -fdump-tree-optimized" } */
+
+double g (double a)
+{
+  return (a<-a)?a:-a;
+}
+int f(int a)
+{
+  return (a<-a)?a:-a;
+}
+
+/* { dg-final { scan-tree-dump-times "ABS_EXPR" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "copysign" 1 "optimized" } } */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix PR55152
  2016-10-04  7:30         ` Richard Biener
@ 2016-10-04  9:41           ` Marc Glisse
  2016-10-04  9:50             ` Richard Biener
  2016-10-04 12:09           ` Joseph Myers
  1 sibling, 1 reply; 14+ messages in thread
From: Marc Glisse @ 2016-10-04  9:41 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andrew Pinski, Joseph Myers, GCC Patches

On Tue, 4 Oct 2016, Richard Biener wrote:

> Possibly.  Though then for FP we also want - abs (a) -> copysign (a, -1).

I thought this might fix PR 62055, but at least on x86_64, we generate 
much worse code for copysign(,-1) than for -abs :-(

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix PR55152
  2016-10-04  9:41           ` Marc Glisse
@ 2016-10-04  9:50             ` Richard Biener
  2016-10-16 20:57               ` Andrew Pinski
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Biener @ 2016-10-04  9:50 UTC (permalink / raw)
  To: GCC Patches; +Cc: Andrew Pinski, Joseph Myers, ubizjak

On Tue, 4 Oct 2016, Marc Glisse wrote:

> On Tue, 4 Oct 2016, Richard Biener wrote:
> 
> > Possibly.  Though then for FP we also want - abs (a) -> copysign (a, -1).
> 
> I thought this might fix PR 62055, but at least on x86_64, we generate much
> worse code for copysign(,-1) than for -abs :-(

I would have expected copysign(,-1) to be a simple IOR ...

double foo (double x)
{
  return __builtin_copysign (x, -1.);
}

foo:
.LFB0:
        .cfi_startproc
        movsd   .LC0(%rip), %xmm1
        movapd  %xmm1, %xmm2
        andpd   .LC2(%rip), %xmm2
        andpd   .LC1(%rip), %xmm0
        orpd    %xmm2, %xmm0
        ret

ICK. -fabs (x) yields

foo:
.LFB0:
        .cfi_startproc
        andpd  .LC0(%rip), %xmm0
        xorpd  .LC1(%rip), %xmm0
        ret

I expected a simple orpd .LC0(%rip), %xmm0 ...

Richard.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix PR55152
  2016-10-04  7:30         ` Richard Biener
  2016-10-04  9:41           ` Marc Glisse
@ 2016-10-04 12:09           ` Joseph Myers
  2016-10-04 13:23             ` Richard Biener
  1 sibling, 1 reply; 14+ messages in thread
From: Joseph Myers @ 2016-10-04 12:09 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andrew Pinski, GCC Patches

On Tue, 4 Oct 2016, Richard Biener wrote:

> Possibly.  Though then for FP we also want - abs (a) -> copysign (a, -1).

For architectures such as powerpc that have a negated-abs instruction, 
does it get properly generated from the copysign code?  (The relevant 
pattern in rs6000.md uses (neg (abs)) - is that the correct canonical RTL 
for this?)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix PR55152
  2016-10-04 12:09           ` Joseph Myers
@ 2016-10-04 13:23             ` Richard Biener
  2016-10-04 17:12               ` Joseph Myers
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Biener @ 2016-10-04 13:23 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Andrew Pinski, GCC Patches

On Tue, 4 Oct 2016, Joseph Myers wrote:

> On Tue, 4 Oct 2016, Richard Biener wrote:
> 
> > Possibly.  Though then for FP we also want - abs (a) -> copysign (a, -1).
> 
> For architectures such as powerpc that have a negated-abs instruction, 
> does it get properly generated from the copysign code?  (The relevant 
> pattern in rs6000.md uses (neg (abs)) - is that the correct canonical RTL 
> for this?)

I have no idea - given that there is no copysign RTL code I suppose
it is the only machine independent RTL that can do this?

The question would be whether the copysign optab of said targets would
special-case the -1 case appropriately.

Given your comment I'm putting the copysign part of the patch on the side 
for now.  And I have to re-do the other part then.

Richard.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix PR55152
  2016-10-04 13:23             ` Richard Biener
@ 2016-10-04 17:12               ` Joseph Myers
  2016-10-05  7:47                 ` Richard Biener
  0 siblings, 1 reply; 14+ messages in thread
From: Joseph Myers @ 2016-10-04 17:12 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andrew Pinski, GCC Patches

On Tue, 4 Oct 2016, Richard Biener wrote:

> On Tue, 4 Oct 2016, Joseph Myers wrote:
> 
> > On Tue, 4 Oct 2016, Richard Biener wrote:
> > 
> > > Possibly.  Though then for FP we also want - abs (a) -> copysign (a, -1).
> > 
> > For architectures such as powerpc that have a negated-abs instruction, 
> > does it get properly generated from the copysign code?  (The relevant 
> > pattern in rs6000.md uses (neg (abs)) - is that the correct canonical RTL 
> > for this?)
> 
> I have no idea - given that there is no copysign RTL code I suppose
> it is the only machine independent RTL that can do this?
> 
> The question would be whether the copysign optab of said targets would
> special-case the -1 case appropriately.

Why -1?  Any constant in copysign should be handled appropriately.  I'd 
say that (neg (abs)) is probably better as a canonical representation, so 
map copysign from a constant to either (abs) or (neg (abs)) appropriately.

Then in the case where abs is expanded with bit manipulation, (neg (abs)) 
should be expanded to a single OR.  I don't know whether the RTL 
optimizers will map the AND / OR combination to a single OR, but if they 
do then there should be no need to special-case (neg (abs)) in expansion, 
and when (abs) RTL is generated a machine description's (neg (abs)) 
pattern should match automatically.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix PR55152
  2016-10-04 17:12               ` Joseph Myers
@ 2016-10-05  7:47                 ` Richard Biener
  2016-10-05 11:25                   ` Richard Biener
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Biener @ 2016-10-05  7:47 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Andrew Pinski, GCC Patches

On Tue, 4 Oct 2016, Joseph Myers wrote:

> On Tue, 4 Oct 2016, Richard Biener wrote:
> 
> > On Tue, 4 Oct 2016, Joseph Myers wrote:
> > 
> > > On Tue, 4 Oct 2016, Richard Biener wrote:
> > > 
> > > > Possibly.  Though then for FP we also want - abs (a) -> copysign (a, -1).
> > > 
> > > For architectures such as powerpc that have a negated-abs instruction, 
> > > does it get properly generated from the copysign code?  (The relevant 
> > > pattern in rs6000.md uses (neg (abs)) - is that the correct canonical RTL 
> > > for this?)
> > 
> > I have no idea - given that there is no copysign RTL code I suppose
> > it is the only machine independent RTL that can do this?
> > 
> > The question would be whether the copysign optab of said targets would
> > special-case the -1 case appropriately.
> 
> Why -1?  Any constant in copysign should be handled appropriately.  I'd 
> say that (neg (abs)) is probably better as a canonical representation, so 
> map copysign from a constant to either (abs) or (neg (abs)) appropriately.
> 
> Then in the case where abs is expanded with bit manipulation, (neg (abs)) 
> should be expanded to a single OR.  I don't know whether the RTL 
> optimizers will map the AND / OR combination to a single OR, but if they 
> do then there should be no need to special-case (neg (abs)) in expansion, 
> and when (abs) RTL is generated a machine description's (neg (abs)) 
> pattern should match automatically.

Ok, that's a good point -- I'll do copysign (X, const) canonicalization
to [-]abs() then.  Hopefully that's a valid transform even if NaNs and
signed zeros are involved.

For now I've installed the requested min(a,-a)->-abs(a) pattern.

Bootstrapped / tested on x86_64-unknown-linux-gnu.

Richard.

2016-10-05  Richard Biener  <rguenther@suse.de>

	PR middle-end/55152
	* match.pd (min(a,-a) -> -abs(a)): New pattern.

	* gcc.dg/pr55152-2.c: New testcase.

Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 240738)
+++ gcc/match.pd	(working copy)
@@ -1291,6 +1302,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
       && (! ANY_INTEGRAL_TYPE_P (type)
 	  || TYPE_OVERFLOW_UNDEFINED (type)))
   (abs @0)))
+/* min(a,-a) -> -abs(a).  */
+(simplify
+ (min:c @0 (negate @0))
+ (if (TREE_CODE (type) != COMPLEX_TYPE
+      && (! ANY_INTEGRAL_TYPE_P (type)
+	  || TYPE_OVERFLOW_UNDEFINED (type)))
+  (negate (abs @0))))
 (simplify
  (min @0 @1)
  (switch
Index: gcc/testsuite/gcc.dg/pr55152-2.c
===================================================================
--- gcc/testsuite/gcc.dg/pr55152-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr55152-2.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O -ffinite-math-only -fno-signed-zeros -fstrict-overflow -fdump-tree-optimized" } */
+
+double g (double a)
+{
+  return (a<-a)?a:-a;
+}
+int f(int a)
+{
+  return (a<-a)?a:-a;
+}
+
+/* { dg-final { scan-tree-dump-times "ABS_EXPR" 2 "optimized" } } */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix PR55152
  2016-10-05  7:47                 ` Richard Biener
@ 2016-10-05 11:25                   ` Richard Biener
  0 siblings, 0 replies; 14+ messages in thread
From: Richard Biener @ 2016-10-05 11:25 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Andrew Pinski, GCC Patches

On Wed, 5 Oct 2016, Richard Biener wrote:

> On Tue, 4 Oct 2016, Joseph Myers wrote:
> 
> > On Tue, 4 Oct 2016, Richard Biener wrote:
> > 
> > > On Tue, 4 Oct 2016, Joseph Myers wrote:
> > > 
> > > > On Tue, 4 Oct 2016, Richard Biener wrote:
> > > > 
> > > > > Possibly.  Though then for FP we also want - abs (a) -> copysign (a, -1).
> > > > 
> > > > For architectures such as powerpc that have a negated-abs instruction, 
> > > > does it get properly generated from the copysign code?  (The relevant 
> > > > pattern in rs6000.md uses (neg (abs)) - is that the correct canonical RTL 
> > > > for this?)
> > > 
> > > I have no idea - given that there is no copysign RTL code I suppose
> > > it is the only machine independent RTL that can do this?
> > > 
> > > The question would be whether the copysign optab of said targets would
> > > special-case the -1 case appropriately.
> > 
> > Why -1?  Any constant in copysign should be handled appropriately.  I'd 
> > say that (neg (abs)) is probably better as a canonical representation, so 
> > map copysign from a constant to either (abs) or (neg (abs)) appropriately.
> > 
> > Then in the case where abs is expanded with bit manipulation, (neg (abs)) 
> > should be expanded to a single OR.  I don't know whether the RTL 
> > optimizers will map the AND / OR combination to a single OR, but if they 
> > do then there should be no need to special-case (neg (abs)) in expansion, 
> > and when (abs) RTL is generated a machine description's (neg (abs)) 
> > pattern should match automatically.
> 
> Ok, that's a good point -- I'll do copysign (X, const) canonicalization
> to [-]abs() then.  Hopefully that's a valid transform even if NaNs and
> signed zeros are involved.

AFAICS it is.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2016-10-05  Richard Biener  <rguenther@suse.de>

	* match.pd (copysign(x, CST) -> [-]abs (x)): New pattern.

	* gcc.dg/fold-copysign-1.c: New testcase.

Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 240770)
+++ gcc/match.pd	(working copy)
@@ -452,6 +452,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (hypots @0 (copysigns @1 @2))
   (hypots @0 @1)))
 
+/* copysign(x, CST) -> [-]abs (x).  */
+(for copysigns (COPYSIGN)
+ (simplify
+  (copysigns @0 REAL_CST@1)
+  (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))
+   (negate (abs @0))
+   (abs @0))))
+
 /* copysign(copysign(x, y), z) -> copysign(x, z).  */
 (for copysigns (COPYSIGN)
  (simplify
Index: gcc/testsuite/gcc.dg/fold-copysign-1.c
===================================================================
--- gcc/testsuite/gcc.dg/fold-copysign-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/fold-copysign-1.c	(working copy)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-cddce1" } */
+
+double foo (double x)
+{
+  double one = 1.;
+  return __builtin_copysign (x, one);
+}
+double bar (double x)
+{
+  double minuszero = -0.;
+  return __builtin_copysign (x, minuszero);
+}
+
+/* { dg-final { scan-tree-dump-times "= -" 1 "cddce1" } } */
+/* { dg-final { scan-tree-dump-times "= ABS_EXPR" 2 "cddce1" } } */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Fix PR55152
  2016-10-04  9:50             ` Richard Biener
@ 2016-10-16 20:57               ` Andrew Pinski
  0 siblings, 0 replies; 14+ messages in thread
From: Andrew Pinski @ 2016-10-16 20:57 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, Joseph Myers, Uros Bizjak

On Tue, Oct 4, 2016 at 2:50 AM, Richard Biener <rguenther@suse.de> wrote:
> On Tue, 4 Oct 2016, Marc Glisse wrote:
>
>> On Tue, 4 Oct 2016, Richard Biener wrote:
>>
>> > Possibly.  Though then for FP we also want - abs (a) -> copysign (a, -1).
>>
>> I thought this might fix PR 62055, but at least on x86_64, we generate much
>> worse code for copysign(,-1) than for -abs :-(
>
> I would have expected copysign(,-1) to be a simple IOR ...
>
> double foo (double x)
> {
>   return __builtin_copysign (x, -1.);
> }
>
> foo:
> .LFB0:
>         .cfi_startproc
>         movsd   .LC0(%rip), %xmm1
>         movapd  %xmm1, %xmm2
>         andpd   .LC2(%rip), %xmm2
>         andpd   .LC1(%rip), %xmm0
>         orpd    %xmm2, %xmm0
>         ret
>
> ICK. -fabs (x) yields
>
> foo:
> .LFB0:
>         .cfi_startproc
>         andpd  .LC0(%rip), %xmm0
>         xorpd  .LC1(%rip), %xmm0
>         ret
>
> I expected a simple orpd .LC0(%rip), %xmm0 ...

That is recorded already as PR62055 :).

Thanks,
Andrew

>
> Richard.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-10-16 20:57 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-28 14:27 [PATCH] Fix PR55152 Richard Biener
2016-09-28 17:49 ` Joseph Myers
2016-09-29  7:55   ` Richard Biener
2016-09-29 12:45     ` Richard Biener
2016-09-29 18:05       ` Andrew Pinski
2016-10-04  7:30         ` Richard Biener
2016-10-04  9:41           ` Marc Glisse
2016-10-04  9:50             ` Richard Biener
2016-10-16 20:57               ` Andrew Pinski
2016-10-04 12:09           ` Joseph Myers
2016-10-04 13:23             ` Richard Biener
2016-10-04 17:12               ` Joseph Myers
2016-10-05  7:47                 ` Richard Biener
2016-10-05 11:25                   ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).