public inbox for gcc-bugs@sourceware.org
* [Bug optimization/15187] New: Inefficient if optimization with -O2 -ffast-math
@ 2004-04-28  7:26 uros at kss-loka dot si
  2004-04-28  9:50 ` [Bug optimization/15187] " zack at codesourcery dot com
                   ` (11 more replies)
  0 siblings, 12 replies; 15+ messages in thread
From: uros at kss-loka dot si @ 2004-04-28  7:26 UTC (permalink / raw)
  To: gcc-bugs

This testcase compiles with '-O2 -ffast-math' into extremely inefficient asm code:
#include <math.h>

double test(double x) {
        if (x > 0.0)
                return cos(x);
        else
                return sin(x);
}

--cut here--
test:
        pushl   %ebp
        movl    %esp, %ebp
        fldl    8(%ebp)
        fcoml   .LC1
        fnstsw  %ax
        fld     %st(0)
        fcos
        sahf
        ja      .L6
        fstp    %st(0)
        fsin
        jmp     .L1
        .p2align 4,,7
.L6:
        fstp    %st(1)
.L1:
        popl    %ebp
        ret
--cut here--

The generated code _always_ executes the fcos instruction and then, depending on
the input, overwrites its result with the result of fsin. This problem is not
specific to fsin/fcos.

The problem is in the find_if_case_1() function in ifcvt.c. Around line 2889,
there is a condition:
  /* THEN is small.  */
  if (count_bb_insns (then_bb) > BRANCH_COST)
    return FALSE;

This condition prevents then_bb from being moved before the 'if' when then_bb is
not small. 'Small' means no more than BRANCH_COST instructions (BRANCH_COST = 1
by default).

However, if an instruction is an UNSPEC_*, it can take hundreds of cycles (as is
the case with fsin or fcos) and still count as _one_ RTL instruction. The check
above therefore does not trigger, and fcos is moved before the 'if'.
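
To make the point concrete, here is a minimal model of a count-based size
check. Everything in it (struct model_insn, MODEL_BRANCH_COST, the cycle
figures) is a hypothetical stand-in for illustration and is not GCC's real
data structure or API.

```c
#include <stddef.h>

/* Assumed default, taken from the text above. */
enum { MODEL_BRANCH_COST = 1 };

/* Hypothetical stand-in for one RTL instruction. */
struct model_insn {
    const char *name;
    int cycles;             /* assumed latency, for illustration only */
};

/* Mirrors the quoted test: the block is "small" when its instruction
   count does not exceed the branch cost. Latency is never consulted. */
int small_by_count(const struct model_insn *bb, size_t n)
{
    (void)bb;
    return n <= (size_t)MODEL_BRANCH_COST;
}

/* Sum of the assumed latencies, to show what the count check ignores. */
int total_cycles(const struct model_insn *bb, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += bb[i].cycles;
    return total;
}
```

A block holding a single instruction with an assumed latency of 100 cycles
passes small_by_count() just as a block holding a single register move would.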

The situation is even worse with '-O2 -ffast-math -march=i686': both fsin and
fcos are executed every time...

test:
        pushl   %ebp
        movl    %esp, %ebp
        fldl    8(%ebp)
        fldz
        fld     %st(1)
        fld     %st(2)
        fxch    %st(1)
        fcos
        fxch    %st(3)
        popl    %ebp
        fcomip  %st(2), %st
        fstp    %st(1)
        fsin
        fcmovnbe        %st(1), %st
        fstp    %st(1)
        ret

-- 
           Summary: Inefficient if optimization with -O2 -ffast-math
           Product: gcc
           Version: 3.5.0
            Status: UNCONFIRMED
          Severity: critical
          Priority: P2
         Component: optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: uros at kss-loka dot si
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187



* [Bug optimization/15187] Inefficient if optimization with -O2 -ffast-math
  2004-04-28  7:26 [Bug optimization/15187] New: Inefficient if optimization with -O2 -ffast-math uros at kss-loka dot si
@ 2004-04-28  9:50 ` zack at codesourcery dot com
  2004-04-28 11:22 ` uros at kss-loka dot si
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: zack at codesourcery dot com @ 2004-04-28  9:50 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From zack at codesourcery dot com  2004-04-28 07:44 -------
Subject: Re:  New: Inefficient if optimization with
 -O2 -ffast-math

 
Ideal code here would be to call fsincos and then pick one of the
outputs, yes?

zw


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187



* [Bug optimization/15187] Inefficient if optimization with -O2 -ffast-math
  2004-04-28  7:26 [Bug optimization/15187] New: Inefficient if optimization with -O2 -ffast-math uros at kss-loka dot si
  2004-04-28  9:50 ` [Bug optimization/15187] " zack at codesourcery dot com
@ 2004-04-28 11:22 ` uros at kss-loka dot si
  2004-04-28 12:52 ` pinskia at gcc dot gnu dot org
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: uros at kss-loka dot si @ 2004-04-28 11:22 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From uros at kss-loka dot si  2004-04-28 09:52 -------
Yes, fsincos would be the best solution in this particular case, where both sin()
and cos() are involved. However, consider a more general testcase, for example:

#include <math.h>

double test(double x) {
        if (x > 0.0)
                return x + 1.0;
        else
                return sqrt(x); // or sin(x), cos(x), etc...
}

The square root is _always_ computed, even if x > 0.0. If we return sin(x) [which
cannot be combined with sqrt in any way] instead of (x + 1.0), both sin() and
sqrt() are computed when x > 0.0. In this case, -march=i686 (which has the cmov
instruction) produces code that _always_ computes sin() and sqrt() and then lets
cmov pick one of the results. Since the x87 fsqrt instruction takes 70 cycles and
the x87 fsin instruction takes 65-100 cycles to execute, this is not a win in
any case.
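
The arithmetic behind "not a win" can be sketched as follows. The instruction
latencies come from the comment above; the misprediction rate, misprediction
penalty and cmov cost are assumptions chosen for illustration, and the model
assumes the two x87 operations do not overlap.

```c
/* Expected cycles when only the chosen arm is executed.
   p_first is the probability that the first arm is taken;
   mispredict_rate and mispredict_penalty are assumed values. */
double cost_branchy(double p_first, double cost_first, double cost_second,
                    double mispredict_rate, double mispredict_penalty)
{
    return p_first * cost_first
         + (1.0 - p_first) * cost_second
         + mispredict_rate * mispredict_penalty;
}

/* Cycles when both arms are always executed and a cmov picks the result.
   Assumes the two operations serialize on the same functional unit. */
double cost_select(double cost_first, double cost_second, double cmov_cost)
{
    return cost_first + cost_second + cmov_cost;
}
```

With fsin at 65 cycles, fsqrt at 70 cycles, a 50/50 branch that is mispredicted
half of the time, and an assumed 20-cycle penalty, cost_branchy() gives 77.5
cycles while cost_select() with an assumed 1-cycle cmov gives 136 cycles.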

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187



* [Bug optimization/15187] Inefficient if optimization with -O2 -ffast-math
  2004-04-28  7:26 [Bug optimization/15187] New: Inefficient if optimization with -O2 -ffast-math uros at kss-loka dot si
  2004-04-28  9:50 ` [Bug optimization/15187] " zack at codesourcery dot com
  2004-04-28 11:22 ` uros at kss-loka dot si
@ 2004-04-28 12:52 ` pinskia at gcc dot gnu dot org
  2004-04-28 13:30 ` uros at kss-loka dot si
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-04-28 12:52 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-04-28 11:41 -------
Confirmed.  I think this has to do with the branch probability being 50%, so GCC merges both
branches together. If you use GCC's profile feedback, it should work correctly as long as the
branch is not taken 50% of the time.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|critical                    |normal
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
           Keywords|                            |pessimizes-code
   Last reconfirmed|0000-00-00 00:00:00         |2004-04-28 11:41:56
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187



* [Bug optimization/15187] Inefficient if optimization with -O2 -ffast-math
  2004-04-28  7:26 [Bug optimization/15187] New: Inefficient if optimization with -O2 -ffast-math uros at kss-loka dot si
                   ` (2 preceding siblings ...)
  2004-04-28 12:52 ` pinskia at gcc dot gnu dot org
@ 2004-04-28 13:30 ` uros at kss-loka dot si
  2004-04-28 13:42 ` pinskia at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: uros at kss-loka dot si @ 2004-04-28 13:30 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From uros at kss-loka dot si  2004-04-28 12:04 -------
Andrew: I don't think that profile feedback will help. fcos will be computed
every time, but in 50% of the cases that result will be thrown away, fsin will be
calculated and its result will be used. With -mbranch-cost=0, the generated code
looks a lot better, exactly what one would expect (with only one conditional jump):

--cut here--
test:
      pushl     %ebp
      movl      %esp, %ebp
      fldl      8(%ebp)
      fcoml     .LC1
      fnstsw    %ax
      sahf
      jbe       .L2
      fcos
      popl      %ebp
      ret
      .p2align 4,,7
.L2:
      fsin
      popl      %ebp
      ret
--cut here--

-mbranch-cost=0 does not help with -march=i686; the resulting code is the same as
without -mbranch-cost.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187



* [Bug optimization/15187] Inefficient if optimization with -O2 -ffast-math
  2004-04-28  7:26 [Bug optimization/15187] New: Inefficient if optimization with -O2 -ffast-math uros at kss-loka dot si
                   ` (3 preceding siblings ...)
  2004-04-28 13:30 ` uros at kss-loka dot si
@ 2004-04-28 13:42 ` pinskia at gcc dot gnu dot org
  2004-04-28 14:45 ` falk at debian dot org
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-04-28 13:42 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-04-28 12:52 -------
Note that branch cost and branch probabilities are two separate things.  I tried to find which flag is
causing this but could not; I would need to look into the RTL dumps to see what is going on.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187



* [Bug optimization/15187] Inefficient if optimization with -O2 -ffast-math
  2004-04-28  7:26 [Bug optimization/15187] New: Inefficient if optimization with -O2 -ffast-math uros at kss-loka dot si
                   ` (4 preceding siblings ...)
  2004-04-28 13:42 ` pinskia at gcc dot gnu dot org
@ 2004-04-28 14:45 ` falk at debian dot org
  2004-06-04  6:15 ` [Bug rtl-optimization/15187] " pinskia at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: falk at debian dot org @ 2004-04-28 14:45 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From falk at debian dot org  2004-04-28 13:35 -------
It seems to me that cases where this is a bad decision are really rare. Basically
it is only bad if both branches use the same functional unit, which additionally
isn't fully pipelined. This seems like a rare situation, and in the future CPUs
will have even more pipelined units, which will usually not have to be
represented with UNSPEC, and even more expensive branches. IMHO we should leave
it just as it is and maybe try to mitigate it with target specific hacks if
it is really an issue in real programs.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187



* [Bug rtl-optimization/15187] Inefficient if optimization with -O2 -ffast-math
  2004-04-28  7:26 [Bug optimization/15187] New: Inefficient if optimization with -O2 -ffast-math uros at kss-loka dot si
                   ` (5 preceding siblings ...)
  2004-04-28 14:45 ` falk at debian dot org
@ 2004-06-04  6:15 ` pinskia at gcc dot gnu dot org
  2004-07-07 13:34 ` roger at eyesopen dot com
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-06-04  6:15 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187



* [Bug rtl-optimization/15187] Inefficient if optimization with -O2 -ffast-math
  2004-04-28  7:26 [Bug optimization/15187] New: Inefficient if optimization with -O2 -ffast-math uros at kss-loka dot si
                   ` (6 preceding siblings ...)
  2004-06-04  6:15 ` [Bug rtl-optimization/15187] " pinskia at gcc dot gnu dot org
@ 2004-07-07 13:34 ` roger at eyesopen dot com
  2004-09-17  5:32 ` cvs-commit at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: roger at eyesopen dot com @ 2004-07-07 13:34 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From roger at eyesopen dot com  2004-07-07 13:34 -------
There's a relevant discussion of these issues posted to gcc-patches at:
http://gcc.gnu.org/ml/gcc-patches/2004-07/msg00597.html

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187



* [Bug rtl-optimization/15187] Inefficient if optimization with -O2 -ffast-math
  2004-04-28  7:26 [Bug optimization/15187] New: Inefficient if optimization with -O2 -ffast-math uros at kss-loka dot si
                   ` (7 preceding siblings ...)
  2004-07-07 13:34 ` roger at eyesopen dot com
@ 2004-09-17  5:32 ` cvs-commit at gcc dot gnu dot org
  2004-09-17  5:41 ` uros at kss-loka dot si
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: cvs-commit at gcc dot gnu dot org @ 2004-09-17  5:32 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From cvs-commit at gcc dot gnu dot org  2004-09-17 05:32 -------
Subject: Bug 15187

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	uros@gcc.gnu.org	2004-09-17 05:32:37

Modified files:
	gcc            : ChangeLog ifcvt.c 

Log message:
	PR rtl-optimization/15187
	* ifcvt.c (noce_try_cmove_arith): Exit early if total
	insn_rtx_cost of both branches > BRANCH_COST

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.5487&r2=2.5488
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ifcvt.c.diff?cvsroot=gcc&r1=1.164&r2=1.165
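
The early exit described in the log message can be modeled roughly as below.
This is a sketch, not the committed patch: the function name is hypothetical,
the costs are plain integers in abstract units, and how GCC scales
insn_rtx_cost relative to BRANCH_COST is not shown in this thread and is not
modeled here.

```c
/* Hypothetical model: returns 1 if converting the branch into
   "compute both arms, then conditionally move" looks worthwhile,
   and 0 if the conversion should be abandoned early. */
int cmove_arith_worthwhile(int cost_then, int cost_else, int branch_cost)
{
    /* Both arms will be executed unconditionally, so their costs add up. */
    if (cost_then + cost_else > branch_cost)
        return 0;   /* too expensive: keep the conditional jump */
    return 1;       /* cheap enough: allow the conversion */
}
```

Under this model two expensive arms such as fsin and fcos fail the check for
any small branch cost, so the jump is kept and only one of them is executed.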



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187



* [Bug rtl-optimization/15187] Inefficient if optimization with -O2 -ffast-math
  2004-04-28  7:26 [Bug optimization/15187] New: Inefficient if optimization with -O2 -ffast-math uros at kss-loka dot si
                   ` (8 preceding siblings ...)
  2004-09-17  5:32 ` cvs-commit at gcc dot gnu dot org
@ 2004-09-17  5:41 ` uros at kss-loka dot si
  2004-09-17  5:47 ` uros at gcc dot gnu dot org
  2004-09-17  8:44 ` pinskia at gcc dot gnu dot org
  11 siblings, 0 replies; 15+ messages in thread
From: uros at kss-loka dot si @ 2004-09-17  5:41 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From uros at kss-loka dot si  2004-09-17 05:41 -------
Fixed by:
http://gcc.gnu.org/ml/gcc-patches/2004-07/msg00654.html
http://gcc.gnu.org/ml/gcc-patches/2004-07/msg01107.html

(TARGET_CMOV case by):
http://gcc.gnu.org/ml/gcc-patches/2004-09/msg01667.html

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187



* [Bug rtl-optimization/15187] Inefficient if optimization with -O2 -ffast-math
  2004-04-28  7:26 [Bug optimization/15187] New: Inefficient if optimization with -O2 -ffast-math uros at kss-loka dot si
                   ` (9 preceding siblings ...)
  2004-09-17  5:41 ` uros at kss-loka dot si
@ 2004-09-17  5:47 ` uros at gcc dot gnu dot org
  2004-09-17  8:44 ` pinskia at gcc dot gnu dot org
  11 siblings, 0 replies; 15+ messages in thread
From: uros at gcc dot gnu dot org @ 2004-09-17  5:47 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From uros at gcc dot gnu dot org  2004-09-17 05:47 -------
I forgot to mark this bug as RESOLVED FIXED in previous comment.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187



* [Bug rtl-optimization/15187] Inefficient if optimization with -O2 -ffast-math
  2004-04-28  7:26 [Bug optimization/15187] New: Inefficient if optimization with -O2 -ffast-math uros at kss-loka dot si
                   ` (10 preceding siblings ...)
  2004-09-17  5:47 ` uros at gcc dot gnu dot org
@ 2004-09-17  8:44 ` pinskia at gcc dot gnu dot org
  11 siblings, 0 replies; 15+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-09-17  8:44 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.0.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187



* [Bug rtl-optimization/15187] Inefficient if optimization with -O2 -ffast-math
       [not found] <bug-15187-1649@http.gcc.gnu.org/bugzilla/>
  2006-03-28 10:34 ` pluto at agmk dot net
@ 2006-03-29 14:08 ` uros at kss-loka dot si
  1 sibling, 0 replies; 15+ messages in thread
From: uros at kss-loka dot si @ 2006-03-29 14:08 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from uros at kss-loka dot si  2006-03-29 14:08 -------
(In reply to comment #11)
> It looks like 4.1.1 and 4.2.0 still produce suboptimal code.

> test:   pushl   %ebp
>         movl    %esp, %ebp
>         fldl    8(%ebp)
>         fldz
>         fcomip  %st(1), %st
>         jae     .L2
>         popl    %ebp
>         fcos
>         ret
> 
> .L2:    popl    %ebp
>         fsin
>         ret

No, this code is optimal. Please compare the code above to the code in the
description, where fcos is calculated even if x <= 0.0.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187



* [Bug rtl-optimization/15187] Inefficient if optimization with -O2 -ffast-math
       [not found] <bug-15187-1649@http.gcc.gnu.org/bugzilla/>
@ 2006-03-28 10:34 ` pluto at agmk dot net
  2006-03-29 14:08 ` uros at kss-loka dot si
  1 sibling, 0 replies; 15+ messages in thread
From: pluto at agmk dot net @ 2006-03-28 10:34 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from pluto at agmk dot net  2006-03-28 10:34 -------
It looks like 4.1.1 and 4.2.0 still produce suboptimal code.

#include <math.h>
double test(double x)
{
        if (x > 0.0)
                return cos(x);
        else
                return sin(x);
}

[ 4.1.1-0.20060322 / rev.112277 /x86-64 ]

$ gcc bug.c -O2 -ffast-math -m32 -march=i686 -S

test:   pushl   %ebp
        movl    %esp, %ebp
        fldl    8(%ebp)
        fldz
        fcomip  %st(1), %st
        jae     .L2
        popl    %ebp
        fcos
        ret

.L2:    popl    %ebp
        fsin
        ret

[ 4.2.0-20060323 / rev.112317 / x86-64 ]

$ ./xgcc -B. bug.c -O2 -ffast-math -m32 -march=i686 -S
bug.c: In function ‘test’:
bug.c:7: internal compiler error: in bsi_last, at tree-flow-inline.h:760


-- 

pluto at agmk dot net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pluto at agmk dot net


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15187



end of thread, other threads:[~2006-03-29 14:08 UTC | newest]

Thread overview: 15+ messages
2004-04-28  7:26 [Bug optimization/15187] New: Inefficient if optimization with -O2 -ffast-math uros at kss-loka dot si
2004-04-28  9:50 ` [Bug optimization/15187] " zack at codesourcery dot com
2004-04-28 11:22 ` uros at kss-loka dot si
2004-04-28 12:52 ` pinskia at gcc dot gnu dot org
2004-04-28 13:30 ` uros at kss-loka dot si
2004-04-28 13:42 ` pinskia at gcc dot gnu dot org
2004-04-28 14:45 ` falk at debian dot org
2004-06-04  6:15 ` [Bug rtl-optimization/15187] " pinskia at gcc dot gnu dot org
2004-07-07 13:34 ` roger at eyesopen dot com
2004-09-17  5:32 ` cvs-commit at gcc dot gnu dot org
2004-09-17  5:41 ` uros at kss-loka dot si
2004-09-17  5:47 ` uros at gcc dot gnu dot org
2004-09-17  8:44 ` pinskia at gcc dot gnu dot org
     [not found] <bug-15187-1649@http.gcc.gnu.org/bugzilla/>
2006-03-28 10:34 ` pluto at agmk dot net
2006-03-29 14:08 ` uros at kss-loka dot si
