public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/51938] New: missed optimization: 2 comparisons
@ 2012-01-21 22:54 marc.glisse at normalesup dot org
  2012-01-23 10:29 ` [Bug rtl-optimization/51938] " rguenth at gcc dot gnu.org
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: marc.glisse at normalesup dot org @ 2012-01-21 22:54 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51938

             Bug #: 51938
           Summary: missed optimization: 2 comparisons
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: marc.glisse@normalesup.org


Hello,

this is possibly related to bug 28685, or to the bugs about the tracking of
flag register values.

void f();
enum Sign { NEG=-1, ZERO, POS };
enum Sign sign(double x){
    if(x>0) return POS;
    if(x<0) return NEG;
    return ZERO;
}
void g(double x){
    if(sign(x)==NEG) f();
}


Ideally, the compiler would realize that g is equivalent to:
if(x<0) f();

but on x86_64 I get with 4.6 (optimization -O3):

    xorpd    %xmm1, %xmm1
    ucomisd    %xmm1, %xmm0
    ja    .L8
    ucomisd    %xmm0, %xmm1
    jbe    .L8
    xorl    %eax, %eax
    jmp    f
.L8:
    rep
    ret

And with 4.7:

    xorpd    %xmm1, %xmm1
    ucomisd    %xmm1, %xmm0
    jbe    .L12
.L9:
    rep
    ret
.L12:
    ucomisd    %xmm0, %xmm1
    jbe    .L9
    xorl    %eax, %eax
    jmp    f

Other compilers: clang 2.9 does the optimization (it has only one ucomisd and
j*), but neither Intel 11.1 nor Oracle 5.12.

In my application, this optimization has a 5% impact on the total runtime.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug rtl-optimization/51938] missed optimization: 2 comparisons
  2012-01-21 22:54 [Bug tree-optimization/51938] New: missed optimization: 2 comparisons marc.glisse at normalesup dot org
@ 2012-01-23 10:29 ` rguenth at gcc dot gnu.org
  2012-01-23 13:07 ` marc.glisse at normalesup dot org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-01-23 10:29 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51938

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2012-01-23
          Component|tree-optimization           |rtl-optimization
     Ever Confirmed|0                           |1
           Severity|normal                      |enhancement

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-01-23 10:23:31 UTC ---
Confirmed.

Something for PHI-OPT, recognize

<bb 2>:
  if (x_2(D) > 0.0)
    goto <bb 5>;
  else
    goto <bb 3>;

<bb 3>:
  if (x_2(D) < 0.0)
    goto <bb 4>;
  else
    goto <bb 5>;

<bb 4>:

<bb 5>:
  # D.1719_1 = PHI <1(2), -1(4), 0(3)>
  return D.1719_1;

I suspect also worthwhile for integral types.  Note that for real types
you need -ffinite-math-only - I bet the clang result is wrong for NaNs.

Btw, what's the optimal assembly you expect?

Not sure what's the best way to represent this on the tree level either,
so eventually we have to resort to RTL if-conversion (doesn't work
there for integral types either).


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug rtl-optimization/51938] missed optimization: 2 comparisons
  2012-01-21 22:54 [Bug tree-optimization/51938] New: missed optimization: 2 comparisons marc.glisse at normalesup dot org
  2012-01-23 10:29 ` [Bug rtl-optimization/51938] " rguenth at gcc dot gnu.org
@ 2012-01-23 13:07 ` marc.glisse at normalesup dot org
  2012-06-06 21:23 ` glisse at gcc dot gnu.org
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: marc.glisse at normalesup dot org @ 2012-01-23 13:07 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51938

--- Comment #2 from Marc Glisse <marc.glisse at normalesup dot org> 2012-01-23 12:51:42 UTC ---
(In reply to comment #1)
> I suspect also worthwhile for integral types.  Note that for real types
> you need -ffinite-math-only - I bet the clang result is wrong for NaNs.

I hadn't thought about it (this code could indeed use -ffinite-math-only), but
it appears I am lucky, the code really is equivalent to if(x<0) f(); as
claimed. Indeed, for a NaN, x>0 and x<0 are false, so sign returns ZERO which
is not NEG. Comparing sign(x) to ZERO would indeed be different than x==0. On
the other hand, ucomisd sets ZF to 1 for QNaN. clang's code appears to be right
on all variations I tried.

> Btw, what's the optimal assembly you expect?

clang generates:

    pxor    %xmm1, %xmm1
    ucomisd    %xmm0, %xmm1
    ja    .LBB1_2
    ret
.LBB1_2:
    xorb    %al, %al
    jmp    f                       # TAILCALL

No idea if that's optimal (it also depends on which branch is most likely), but
one pair of ucomisd+ja is certainly better than 2.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug rtl-optimization/51938] missed optimization: 2 comparisons
  2012-01-21 22:54 [Bug tree-optimization/51938] New: missed optimization: 2 comparisons marc.glisse at normalesup dot org
  2012-01-23 10:29 ` [Bug rtl-optimization/51938] " rguenth at gcc dot gnu.org
  2012-01-23 13:07 ` marc.glisse at normalesup dot org
@ 2012-06-06 21:23 ` glisse at gcc dot gnu.org
  2012-06-07 14:54 ` [Bug tree-optimization/51938] " marc.glisse at normalesup dot org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: glisse at gcc dot gnu.org @ 2012-06-06 21:23 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51938

Marc Glisse <glisse at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |glisse at gcc dot gnu.org

--- Comment #3 from Marc Glisse <glisse at gcc dot gnu.org> 2012-06-06 21:22:59 UTC ---
Note that if I replace:
    if(x<0) return NEG;
with:
    if(!(x>=0)) return NEG;
then ifcombine recognizes the pattern and optimizes it (the generated code is
slightly different since that changes the behavior for NaN).

No time to investigate right now, but I wanted to add this note to the PR.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/51938] missed optimization: 2 comparisons
  2012-01-21 22:54 [Bug tree-optimization/51938] New: missed optimization: 2 comparisons marc.glisse at normalesup dot org
                   ` (2 preceding siblings ...)
  2012-06-06 21:23 ` glisse at gcc dot gnu.org
@ 2012-06-07 14:54 ` marc.glisse at normalesup dot org
  2012-06-08 13:07 ` glisse at gcc dot gnu.org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: marc.glisse at normalesup dot org @ 2012-06-07 14:54 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51938

Marc Glisse <marc.glisse at normalesup dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|rtl-optimization            |tree-optimization

--- Comment #4 from Marc Glisse <marc.glisse at normalesup dot org> 2012-06-07 14:54:02 UTC ---
Changing to tree-optimization (doing the optimization at RTL level would
require finite-math-only).

There is plenty of code that corresponds to A&&B and A||B, but (almost) nothing
for A&&!B. Quite a big missing piece...

<bb 2>:
  if (x_2(D) > 0.0)
    goto <bb 5>;
  else
    goto <bb 3>;

<bb 3>:
  if (x_2(D) < 0.0)
    goto <bb 4>;
  else
    goto <bb 5>;

The 2 conditions don't share the same then branch or the same else branch (it
is a mix), so ifcombine doesn't even try to turn it into

  if (x_2(D) > 0.0 || !(x_2(D) < 0.0))
    goto <bb 5>;
  else
    goto <bb 4>;

Besides, it doesn't look like the logic is in place to fold that condition into
just its second half (but I may have missed it).


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/51938] missed optimization: 2 comparisons
  2012-01-21 22:54 [Bug tree-optimization/51938] New: missed optimization: 2 comparisons marc.glisse at normalesup dot org
                   ` (3 preceding siblings ...)
  2012-06-07 14:54 ` [Bug tree-optimization/51938] " marc.glisse at normalesup dot org
@ 2012-06-08 13:07 ` glisse at gcc dot gnu.org
  2012-06-08 19:49 ` glisse at gcc dot gnu.org
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: glisse at gcc dot gnu.org @ 2012-06-08 13:07 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51938

--- Comment #5 from Marc Glisse <glisse at gcc dot gnu.org> 2012-06-08 13:06:52 UTC ---
The main difficulty is trapping math. It isn't hard to add to
tree-ssa-ifcombine an ifcombine_ifnotandif variant, and we could make it call
maybe_fold_and_comparisons (as ifcombine_ifandif does) with
invert_tree_comparison (gimple_cond_code (outer_cond),...), but unless we pass
-fno-trapping-math, it won't help.

combine_comparisons (fold-const.c) has relevant code to determine whether the
optimization is safe wrt trapping, but it does so with the semantic that
UNLT_EXPR never traps, whereas I am (wrongfully) using it for
TRUTH_NOT_EXPR(GE_EXPR). So I guess that means two more versions of
maybe_fold_and_comparisons/maybe_fold_or_comparisons are needed :-(


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/51938] missed optimization: 2 comparisons
  2012-01-21 22:54 [Bug tree-optimization/51938] New: missed optimization: 2 comparisons marc.glisse at normalesup dot org
                   ` (4 preceding siblings ...)
  2012-06-08 13:07 ` glisse at gcc dot gnu.org
@ 2012-06-08 19:49 ` glisse at gcc dot gnu.org
  2012-06-08 20:03 ` glisse at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: glisse at gcc dot gnu.org @ 2012-06-08 19:49 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51938

--- Comment #6 from Marc Glisse <glisse at gcc dot gnu.org> 2012-06-08 19:49:07 UTC ---
Created attachment 27590
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27590
incomplete patch

I wonder if this is the right direction. At least it passes the testsuite and
optimizes the original testcase.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/51938] missed optimization: 2 comparisons
  2012-01-21 22:54 [Bug tree-optimization/51938] New: missed optimization: 2 comparisons marc.glisse at normalesup dot org
                   ` (5 preceding siblings ...)
  2012-06-08 19:49 ` glisse at gcc dot gnu.org
@ 2012-06-08 20:03 ` glisse at gcc dot gnu.org
  2012-06-09 21:36 ` glisse at gcc dot gnu.org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: glisse at gcc dot gnu.org @ 2012-06-08 20:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51938

Marc Glisse <glisse at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #27590|0                           |1
        is obsolete|                            |

--- Comment #7 from Marc Glisse <glisse at gcc dot gnu.org> 2012-06-08 20:03:35 UTC ---
Created attachment 27591
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27591
incomplete patch

The same with one less typo and more up-to-date comments.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/51938] missed optimization: 2 comparisons
  2012-01-21 22:54 [Bug tree-optimization/51938] New: missed optimization: 2 comparisons marc.glisse at normalesup dot org
                   ` (6 preceding siblings ...)
  2012-06-08 20:03 ` glisse at gcc dot gnu.org
@ 2012-06-09 21:36 ` glisse at gcc dot gnu.org
  2012-08-06 16:39 ` glisse at gcc dot gnu.org
  2012-08-06 16:53 ` glisse at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: glisse at gcc dot gnu.org @ 2012-06-09 21:36 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51938

Marc Glisse <glisse at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #27591|0                           |1
        is obsolete|                            |

--- Comment #8 from Marc Glisse <glisse at gcc dot gnu.org> 2012-06-09 21:36:04 UTC ---
Created attachment 27592
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27592
patch

Still needs testcases before it can be submitted.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/51938] missed optimization: 2 comparisons
  2012-01-21 22:54 [Bug tree-optimization/51938] New: missed optimization: 2 comparisons marc.glisse at normalesup dot org
                   ` (7 preceding siblings ...)
  2012-06-09 21:36 ` glisse at gcc dot gnu.org
@ 2012-08-06 16:39 ` glisse at gcc dot gnu.org
  2012-08-06 16:53 ` glisse at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: glisse at gcc dot gnu.org @ 2012-08-06 16:39 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51938

--- Comment #9 from Marc Glisse <glisse at gcc dot gnu.org> 2012-08-06 16:38:52 UTC ---
Author: glisse
Date: Mon Aug  6 16:38:48 2012
New Revision: 190184

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190184
Log:
2012-08-06 Marc Glisse <marc.glisse@inria.fr>

gcc/
    PR tree-optimization/51938
    PR tree-optimization/52005
    * tree-ssa-ifcombine.c (ifcombine_ifandif): New parameters for
    inverted conditions.
    (ifcombine_iforif): Remove, merge code into ifcombine_ifandif.
    (tree_ssa_ifcombine_bb): Update calls to the above. Detect !a&&b
    and !a||b patterns.

gcc/testsuite/
    PR tree-optimization/51938
    PR tree-optimization/52005
    * gcc.dg/tree-ssa/ssa-ifcombine-8.c: New testcase.
    * gcc.dg/tree-ssa/ssa-ifcombine-9.c: Likewise.
    * gcc.dg/tree-ssa/ssa-ifcombine-10.c: Likewise.
    * gcc.dg/tree-ssa/ssa-ifcombine-11.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-10.c   (with props)
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-11.c   (with props)
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-8.c   (with props)
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-9.c   (with props)
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-ifcombine.c

Propchange: trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-10.c
            ('svn:eol-style' added)

Propchange: trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-10.c
            ('svn:keywords' added)

Propchange: trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-11.c
            ('svn:eol-style' added)

Propchange: trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-11.c
            ('svn:keywords' added)

Propchange: trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-8.c
            ('svn:eol-style' added)

Propchange: trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-8.c
            ('svn:keywords' added)

Propchange: trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-9.c
            ('svn:eol-style' added)

Propchange: trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-ifcombine-9.c
            ('svn:keywords' added)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/51938] missed optimization: 2 comparisons
  2012-01-21 22:54 [Bug tree-optimization/51938] New: missed optimization: 2 comparisons marc.glisse at normalesup dot org
                   ` (8 preceding siblings ...)
  2012-08-06 16:39 ` glisse at gcc dot gnu.org
@ 2012-08-06 16:53 ` glisse at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: glisse at gcc dot gnu.org @ 2012-08-06 16:53 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51938

Marc Glisse <glisse at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #10 from Marc Glisse <glisse at gcc dot gnu.org> 2012-08-06 16:53:19 UTC ---
The testcase is now optimized with -fno-trapping-math. Ideally it should also
be optimized with -ftrapping-math (which a first unapplied patch did), but that
option needs too much work to fix all its issues.

Thus closing (feel free to reopen if someone is interested in the trapping
case).


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2012-08-06 16:53 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-21 22:54 [Bug tree-optimization/51938] New: missed optimization: 2 comparisons marc.glisse at normalesup dot org
2012-01-23 10:29 ` [Bug rtl-optimization/51938] " rguenth at gcc dot gnu.org
2012-01-23 13:07 ` marc.glisse at normalesup dot org
2012-06-06 21:23 ` glisse at gcc dot gnu.org
2012-06-07 14:54 ` [Bug tree-optimization/51938] " marc.glisse at normalesup dot org
2012-06-08 13:07 ` glisse at gcc dot gnu.org
2012-06-08 19:49 ` glisse at gcc dot gnu.org
2012-06-08 20:03 ` glisse at gcc dot gnu.org
2012-06-09 21:36 ` glisse at gcc dot gnu.org
2012-08-06 16:39 ` glisse at gcc dot gnu.org
2012-08-06 16:53 ` glisse at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).