[Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01

public inbox for gcc-bugs@sourceware.org

* [Bug tree-optimization/38785]  New: huge performance regression on EEMBC bitmnp01
@ 2009-01-09 15:46 amylaar at gcc dot gnu dot org
  2009-01-09 15:55 ` [Bug tree-optimization/38785] " rguenth at gcc dot gnu dot org
                   ` (28 more replies)
  0 siblings, 29 replies; 30+ messages in thread
From: amylaar at gcc dot gnu dot org @ 2009-01-09 15:46 UTC (permalink / raw)
  To: gcc-bugs

After merging ARCompact support into gcc 4.4.0 20081210, we noticed that
cycle count is up by 155% compared to gcc 4.2.1 for ARC700 on the EEMBC
bitmnp01 benchmark.  There are long sequences of putting integer constants
on the stack, and shuffling stack locations / registers around in the
inner loop.
The *084t.pre dump shows that partial redundancy elimination / constant
propagation has gone berserk, calculating combined ORed values through
all the paths of the sequence of ifs in the main loop.

I've built an i686-pc-linux-gnu compiler from the same sources and
verified that the 084t.pre dump and the .s file show the same bogosity.
(Using options -O3 -fomit-frame-pointer -gstabs -fdump-tree-all.)
I've confirmed the same findings for i686-pc-linux-gnu with a pristine
svn snapshot from today, Revision: 143207.

-- 
           Summary: huge performance regression on EEMBC bitmnp01
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: amylaar at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785


* [Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
@ 2009-01-09 15:55 ` rguenth at gcc dot gnu dot org
  2009-01-09 16:39 ` amylaar at gcc dot gnu dot org
                   ` (27 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-01-09 15:55 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from rguenth at gcc dot gnu dot org  2009-01-09 15:55 -------
Testcase?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
  2009-01-09 15:55 ` [Bug tree-optimization/38785] " rguenth at gcc dot gnu dot org
@ 2009-01-09 16:39 ` amylaar at gcc dot gnu dot org
  2009-01-09 17:35 ` amylaar at gcc dot gnu dot org
                   ` (26 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: amylaar at gcc dot gnu dot org @ 2009-01-09 16:39 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from amylaar at gcc dot gnu dot org  2009-01-09 16:39 -------
(In reply to comment #1)
> Testcase?

Unfortunately, the EEMBC benchmarks are not freely redistributable.
See http://www.eembc.org .

I'm not sure yet which parts of the benchmark are intrinsic to the problem
and which would be a copyrightable expression.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
  2009-01-09 15:55 ` [Bug tree-optimization/38785] " rguenth at gcc dot gnu dot org
  2009-01-09 16:39 ` amylaar at gcc dot gnu dot org
@ 2009-01-09 17:35 ` amylaar at gcc dot gnu dot org
  2009-01-09 17:59 ` rguenth at gcc dot gnu dot org
                   ` (25 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: amylaar at gcc dot gnu dot org @ 2009-01-09 17:35 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from amylaar at gcc dot gnu dot org  2009-01-09 17:34 -------
(In reply to comment #1)
> Testcase?

Ok, I now have a testcase that is almost, but not quite, entirely unlike
fbital.  About the only characteristic it shares with fbital is that it has
a loop which provides opportunities for pessimizing constant propagation
through phi nodes.

void
f (int i, long *a, long *b)
{
  for (; --i >=  0; a++, b++)
    {
      b[i] = 0;
#define PART(I) if ((a[i] << (I)) > -15) b[i] += 0x7fffffffL / (I);
      PART (1);
      PART (2);
      PART (3);
      PART (4);
      PART (5);
      PART (6);
    }
}


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (2 preceding siblings ...)
  2009-01-09 17:35 ` amylaar at gcc dot gnu dot org
@ 2009-01-09 17:59 ` rguenth at gcc dot gnu dot org
  2009-01-09 20:55 ` steven at gcc dot gnu dot org
                   ` (24 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-01-09 17:59 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from rguenth at gcc dot gnu dot org  2009-01-09 17:59 -------
It's indeed partial-PRE that performs these insertions.  Steven has some
patches to tune down regular insertion that may also apply to partial
insertion.

See also PR38401.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu dot
                   |                            |org
  BugsThisDependsOn|                            |38401
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
           Keywords|                            |missed-optimization
   Last reconfirmed|0000-00-00 00:00:00         |2009-01-09 17:59:31
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (3 preceding siblings ...)
  2009-01-09 17:59 ` rguenth at gcc dot gnu dot org
@ 2009-01-09 20:55 ` steven at gcc dot gnu dot org
  2009-01-10 16:10 ` amylaar at gcc dot gnu dot org
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-01-09 20:55 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from steven at gcc dot gnu dot org  2009-01-09 20:55 -------
Joern, re. comment #4, Richi refers to my patch to enable PRE at -Os, see [1]. 
An extension to this patch that we tested on x86 machines, is to disable PRE
for scalar integer registers, via SMALL_REGISTER_CLASSES.  I changed
SMALL_REGISTER_CLASSES into a target hook for this purpose, see [2]. You could
play with this, see if you can use this to cure your problem...

[1] http://gcc.gnu.org/ml/gcc-patches/2008-12/msg00199.html
[2] http://gcc.gnu.org/ml/gcc-patches/2008-12/msg00590.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (4 preceding siblings ...)
  2009-01-09 20:55 ` steven at gcc dot gnu dot org
@ 2009-01-10 16:10 ` amylaar at gcc dot gnu dot org
  2009-01-14 10:08 ` Joey dot ye at intel dot com
                   ` (22 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: amylaar at gcc dot gnu dot org @ 2009-01-10 16:10 UTC (permalink / raw)
  To: gcc-bugs

------- Comment #6 from amylaar at gcc dot gnu dot org  2009-01-10 16:10 -------
(In reply to comment #5)
> Joern, re. comment #4, Richi refers to my patch to enable PRE at -Os, see [1]. 
> An extension to this patch that we tested on x86 machines, is to disable PRE
> for scalar integer registers, via SMALL_REGISTER_CLASSES.  I changed
> SMALL_REGISTER_CLASSES into a target hook for this purpose, see [2]. You could
> play with this, see if you can use this to cure your problem...

This is not a problem of having inserted more expressions.  The number of
expressions actually goes down: 7 add expressions are eliminated.
The problem is that this comes at the cost of adding 127 phi nodes, which
are not free. It's a cascade of 64/32/16/8/4/2/1 phi nodes that are used
to compute the 7 chained adds through all the possible paths through the
6 consecutive if-blocks, and this requires 64/32/16/8/4/2/1 reg-reg copies
to implement inside these if-blocks, plus 64 unconditional constant loads
at the start.
Requiring 64 + x registers for this one computation alone does cause register
allocation trouble for ARCompact, but it is in good company here, as lots of
other RISC architectures also have no more than 32 general purpose registers.
And even if you did this for a processor with lots of registers like the i960,
requiring 64 constant loads in the unconditional path and then having 127
conditional reg-reg copies is certainly worse than having 7 conditional add
operations.
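
As a quick check of the arithmetic above, here is a small C sketch that sums
a halving cascade starting at 2^n entries for n consecutive if-blocks.  The
helper names are made up for illustration; nothing here is GCC code, and it
only reproduces the counting, not the transformation itself.

```c
#include <assert.h>

/* Hypothetical helper: width of the widest level of the cascade, i.e. the
   number of distinct paths through n_ifs consecutive if-blocks.  */
unsigned
cascade_width (unsigned n_ifs)
{
  assert (n_ifs < 32);		/* keep the shift well defined */
  return 1u << n_ifs;
}

/* Hypothetical helper: sum of the halving cascade
   2^n + 2^(n-1) + ... + 2 + 1.  */
unsigned
cascade_total (unsigned n_ifs)
{
  unsigned total = 0;
  for (unsigned w = cascade_width (n_ifs); w > 0; w >>= 1)
    total += w;
  return total;
}
```

For the 6 if-blocks of the testcase, cascade_width (6) gives the 64 constant
loads and cascade_total (6) gives the 127 phi nodes / reg-reg copies, to be
compared against the 7 adds that were eliminated.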

I think the problem is that the algorithm, like many SSA algorithms,
has no idea of the run time cost of a phi node, and uses the lower
bound 0 as an approximation.
As you make more sophisticated algorithms to approximate the cost minimum
for a flawed cost function, you will pessimize more and more code.

Is there a way to determine when replacing one expression causes the
number of phi nodes in a dominator to increase?  I would think that
this criterion would be a possibly useful indicator of register
pressure and instruction count increase.

I haven't looked at what the SSA PRE does exactly, but from the code
transformation performed I would guess that it sees one expression which
is fed by a directed acyclic graph (DAG) of phi nodes with constant leaves,
and figures it only needs to use a replacement DAG of phi nodes where the
expression is evaluated for each constant, assuming that the original DAG
will be mostly dead then.
However, what happens in the testcase is that the original DAG is still
fully live, the replacement DAG is added in parallel, and thus the number
of phi nodes from the original DAG has been doubled.
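
To make the shape of this transformation concrete, here is a hand-written C
illustration cut down to two if-blocks.  It is not actual GCC output: the
constants K1/K2, the simplified conditions and all function names are
assumptions made up for this sketch.  The second function evaluates "s + K2"
ahead of time for every constant that can reach it, so the adds disappear
but constant loads and copies take their place.

```c
#include <assert.h>

enum { K1 = 1000, K2 = 500 };	/* illustrative constants only */

/* Straightforward form: one conditional add per if-block.  */
long
sum_plain (int c1, int c2)
{
  long s = 0;
  if (c1)
    s += K1;
  if (c2)
    s += K2;
  return s;
}

/* Hand-transformed form: no adds at run time, but two unconditional
   constant loads and up to three conditional copies.  */
long
sum_precomputed (int c1, int c2)
{
  long s = 0;			/* constant load */
  long s_k2 = K2;		/* constant load: 0 + K2, precomputed */
  if (c1)
    {
      s = K1;			/* copy replacing the first add */
      s_k2 = K1 + K2;		/* parallel copy for the precomputed value */
    }
  if (c2)
    s = s_k2;			/* copy replacing the second add */
  return s;
}

/* Self-check: both forms agree on every path through the two ifs.  */
void
check_forms_agree (void)
{
  for (int c1 = 0; c1 <= 1; c1++)
    for (int c2 = 0; c2 <= 1; c2++)
      assert (sum_plain (c1, c2) == sum_precomputed (c1, c2));
}
```

Note that s and s_k2 are both live across the first if-block here, which is
the "original DAG plus replacement DAG in parallel" situation described
above; with each further if-block the number of precomputed values doubles.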

For the testcase (and for fbital, if you have access to it), it is
interesting to observe that the problem would not arise if pre had a notion
of which expressions are better left for conditional execution.
Unfortunately, on many architectures, conditional execution requires
two-address (wrt the ALU) operations, which cannot be expressed in SSA.

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785


* [Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (5 preceding siblings ...)
  2009-01-10 16:10 ` amylaar at gcc dot gnu dot org
@ 2009-01-14 10:08 ` Joey dot ye at intel dot com
  2009-01-14 10:54 ` steven at gcc dot gnu dot org
                   ` (21 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: Joey dot ye at intel dot com @ 2009-01-14 10:08 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from Joey dot ye at intel dot com  2009-01-14 10:08 -------
(In reply to comment #5)
> Joern, re. comment #4, Richi refers to my patch to enable PRE at -Os, see [1]. 
> An extension to this patch that we tested on x86 machines, is to disable PRE
> for scalar integer registers, via SMALL_REGISTER_CLASSES.  I changed
> SMALL_REGISTER_CLASSES into a target hook for this purpose, see [2]. You could
> play with this, see if you can use this to cure your problem...
> [1] http://gcc.gnu.org/ml/gcc-patches/2008-12/msg00199.html
> [2] http://gcc.gnu.org/ml/gcc-patches/2008-12/msg00590.html
Reproduced on x86. But I fail to build with patch [2] on x86_64, anything
wrong?
../../src/gcc/target-def.h:476:1: error: unterminated #ifndef
../../src/gcc/c-common.c:8197: error: 'TARGETCM_INITIALIZER' undeclared here
(not in a function)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (6 preceding siblings ...)
  2009-01-14 10:08 ` Joey dot ye at intel dot com
@ 2009-01-14 10:54 ` steven at gcc dot gnu dot org
  2009-01-14 18:47 ` amylaar at gcc dot gnu dot org
                   ` (20 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-01-14 10:54 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from steven at gcc dot gnu dot org  2009-01-14 10:54 -------
Re comment #7

Those patches are just proof-of-concept, and wouldn't actually help without
additional changes in tree-ssa-pre.c.  If you want, I can make the patches
apply and work properly, and send them to you to play with (just send me a mail
if you want them).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (7 preceding siblings ...)
  2009-01-14 10:54 ` steven at gcc dot gnu dot org
@ 2009-01-14 18:47 ` amylaar at gcc dot gnu dot org
  2009-01-14 20:51 ` rguenther at suse dot de
                   ` (19 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: amylaar at gcc dot gnu dot org @ 2009-01-14 18:47 UTC (permalink / raw)
  To: gcc-bugs

------- Comment #9 from amylaar at gcc dot gnu dot org  2009-01-14 18:47 -------
I think the disregard for conditional execution opportunities and the
assumption that phi nodes have no execution cost are two separate issues.
I'd like to address the latter first, because it causes exponential code and
execution time growth.

A phi node joining two constants has at least the cost of a constant load.
A phi node joining two different variables which are initialized by a graph
with constant leafs costs at least a reg-reg copy on one arm, plus the cost
of its parents if these are needed solely for this phi node.

Therefore, if an expression is only partially anticipatable, we should compare
the cost of any phi node needed to compute it early with the estimated
likelihood that such a computation, once done, is actually needed, multiplied
by the cost of the replaced operation.
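
The comparison proposed above could be sketched roughly as follows.  This is
only an illustration under stated assumptions: the struct, its fields and
the function name are hypothetical and do not exist in tree-ssa-pre.c, and
how the three estimates would be obtained is exactly the open question.

```c
#include <assert.h>

/* Hypothetical cost record; field names are illustrative only.  */
struct ppre_estimate
{
  double phi_cost;		/* cost of the phi nodes needed to compute early */
  double use_probability;	/* estimated chance the early result is needed */
  double replaced_cost;		/* cost of the operation being replaced */
};

/* Return nonzero if computing early looks cheaper than the expected
   saving, i.e. phi_cost < use_probability * replaced_cost.  */
int
ppre_looks_profitable (const struct ppre_estimate *e)
{
  assert (e->use_probability >= 0.0 && e->use_probability <= 1.0);
  return e->phi_cost < e->use_probability * e->replaced_cost;
}
```

With unit costs, the testcase would come out as phi_cost = 127 against a
replaced_cost of 7, so the check would reject the insertion even at
use_probability = 1.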

Can we use edge probabilities inside tree-pre to calculate execution
probabilities?

Can we calculate the cost of replaced expressions?

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785


* [Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (8 preceding siblings ...)
  2009-01-14 18:47 ` amylaar at gcc dot gnu dot org
@ 2009-01-14 20:51 ` rguenther at suse dot de
  2009-01-14 22:06 ` amylaar at gcc dot gnu dot org
                   ` (18 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: rguenther at suse dot de @ 2009-01-14 20:51 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from rguenther at suse dot de  2009-01-14 20:51 -------
Subject: Re:  huge performance regression on
 EEMBC bitmnp01

On Wed, 14 Jan 2009, amylaar at gcc dot gnu dot org wrote:

> I think the disregard for conditional execution opportunities and the
> assumption that phi nodes have no execution cost are two separate issues.
> I'd like to address the latter first, because it causes exponential code and
> execution time growth.
> 
> A phi node joining two constants has at least the cost of a constant load.
> A phi node joining two different variables which are initialized by a graph
> with constant leafs costs at least a reg-reg copy on one arm, plus the cost
> of its parents if these are needed solely for this phi node.
> 
> Therefore, if an expression is only partially anticipatable, we should compare
> the cost of any phi node needed to compute it early with the estimated
> likelihood that such a computation, once done, is actually needed, multiplied
> by the cost of the replaced operation.
> 
> Can we use edge probabilities inside tree-pre to calculate execution
> probabilities?
> 
> Can we calculate the cost of replaced expressions?

You would completely underestimate the optimization opportunities PRE
unleashes.

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (9 preceding siblings ...)
  2009-01-14 20:51 ` rguenther at suse dot de
@ 2009-01-14 22:06 ` amylaar at gcc dot gnu dot org
  2009-01-15 11:36 ` amylaar at gcc dot gnu dot org
                   ` (17 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: amylaar at gcc dot gnu dot org @ 2009-01-14 22:06 UTC (permalink / raw)
  To: gcc-bugs

------- Comment #11 from amylaar at gcc dot gnu dot org  2009-01-14 22:06 -------
(In reply to comment #10)
> You would completely underestimate the optimization opportunities PRE
> unleashes.

Well, at least for partial-partial-RE, as mentioned before in PR38401,
benchmarks indicate that we'd be better off without it altogether than
with what it is in its current form.
If we could use a cost calculation to avoid the code pessimization while
keeping some of the potential benefit of partial-partial-RE, that should
be even better.

The potentially harmful impact of ordinary PRE is lesser because if
you put in a constant load where the expression is at least fully
anticipatable, you swap a constant load for some other operation; worst
case that means an expensive constant load and two branches instead of a
simple icond-exec operation -
that's bad, but not nearly as bad as blowing up the always-executed path
along with everything else exponentially.

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785


* [Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (10 preceding siblings ...)
  2009-01-14 22:06 ` amylaar at gcc dot gnu dot org
@ 2009-01-15 11:36 ` amylaar at gcc dot gnu dot org
  2009-01-20 23:02 ` steven at gcc dot gnu dot org
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: amylaar at gcc dot gnu dot org @ 2009-01-15 11:36 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from amylaar at gcc dot gnu dot org  2009-01-15 11:36 -------
(In reply to comment #11)
P.S.:
Another feature that we could look at is the number of times an input
ssa name is used.  If it is used more than once, we cannot rely on the
original ssa name to go away, and hence the odds of exponential execution
time explosion are higher for doing partial-partial redundancy elimination
on an expression using such an ssa name as input.
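
As a rough sketch of that check (the function name and the plain array of
use counts are assumptions for illustration, not how GCC represents SSA
names or their uses):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check, not GCC code: use_counts[i] is the number of uses
   of the i-th input name of a candidate expression.  Returns nonzero if
   any input is used more than once, in which case the original name
   cannot be expected to go away after the replacement.  */
int
inputs_may_stay_live (const unsigned *use_counts, size_t n)
{
  assert (use_counts != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (use_counts[i] > 1)
      return 1;
  return 0;
}
```

A candidate for which this returns nonzero would then be treated as more
likely to blow up, for instance by weighting its phi cost higher.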


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (11 preceding siblings ...)
  2009-01-15 11:36 ` amylaar at gcc dot gnu dot org
@ 2009-01-20 23:02 ` steven at gcc dot gnu dot org
  2009-03-04 22:58 ` amylaar at gcc dot gnu dot org
                   ` (15 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-01-20 23:02 UTC (permalink / raw)
  To: gcc-bugs

------- Comment #13 from steven at gcc dot gnu dot org  2009-01-20 23:01 -------
Created an attachment (id=17155)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17155&action=view)
Throttle PRE, hookize SMALL_REGISTER_CLASSES

This is the patch I have in my local tree (bootstrapped&tested on AMD64
multilib).

It's rather brute-force, but it shows what I would do: There is a new function
want_to_pre_p that you can feed an expression to decide whether to perform the
redundancy elimination or not.

For x86 normal PRE (i.e. not PPRE) I toyed, quite successfully, with the idea
to make the transformation depend on the probability that a PRE-ed expression
will result in spills.  I have not investigated at all whether there is a
difference for PPRE and PRE in the want_to_pre_p decision.  Some of Joern's
ideas, and more (or less) should be included in this want_to_pre_p function.  

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785


* [Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (12 preceding siblings ...)
  2009-01-20 23:02 ` steven at gcc dot gnu dot org
@ 2009-03-04 22:58 ` amylaar at gcc dot gnu dot org
  2009-03-05  0:32 ` amylaar at gcc dot gnu dot org
                   ` (14 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: amylaar at gcc dot gnu dot org @ 2009-03-04 22:58 UTC (permalink / raw)
  To: gcc-bugs



-- 

amylaar at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  BugsThisDependsOn|39302                       |
OtherBugsDependingO|                            |39363
              nThis|                            |
         AssignedTo|unassigned at gcc dot gnu   |amylaar at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
           Keywords|                            |patch
   Last reconfirmed|2009-01-09 17:59:31         |2009-03-04 22:58:24
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (13 preceding siblings ...)
  2009-03-04 22:58 ` amylaar at gcc dot gnu dot org
@ 2009-03-05  0:32 ` amylaar at gcc dot gnu dot org
  2009-03-31 16:08 ` [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] " jsm28 at gcc dot gnu dot org
                   ` (13 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: amylaar at gcc dot gnu dot org @ 2009-03-05  0:32 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #14 from amylaar at gcc dot gnu dot org  2009-03-05 00:32 -------
My combined patch is here:
http://gcc.gnu.org/ml/gcc-patches/2009-03/msg00250.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (14 preceding siblings ...)
  2009-03-05  0:32 ` amylaar at gcc dot gnu dot org
@ 2009-03-31 16:08 ` jsm28 at gcc dot gnu dot org
  2009-04-14  9:49 ` jakub at gcc dot gnu dot org
                   ` (12 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: jsm28 at gcc dot gnu dot org @ 2009-03-31 16:08 UTC (permalink / raw)
  To: gcc-bugs



-- 

jsm28 at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|huge performance regression |[4.3/4.4/4.5 Regression]
                   |on EEMBC bitmnp01           |huge performance regression
                   |                            |on EEMBC bitmnp01
   Target Milestone|---                         |4.3.4


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (15 preceding siblings ...)
  2009-03-31 16:08 ` [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] " jsm28 at gcc dot gnu dot org
@ 2009-04-14  9:49 ` jakub at gcc dot gnu dot org
  2009-07-15 21:12 ` steven at gcc dot gnu dot org
                   ` (11 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: jakub at gcc dot gnu dot org @ 2009-04-14  9:49 UTC (permalink / raw)
  To: gcc-bugs



-- 

jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (16 preceding siblings ...)
  2009-04-14  9:49 ` jakub at gcc dot gnu dot org
@ 2009-07-15 21:12 ` steven at gcc dot gnu dot org
  2009-07-15 21:35 ` steven at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-07-15 21:12 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #15 from steven at gcc dot gnu dot org  2009-07-15 21:12 -------
*** Bug 40768 has been marked as a duplicate of this bug. ***


-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kazu at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (17 preceding siblings ...)
  2009-07-15 21:12 ` steven at gcc dot gnu dot org
@ 2009-07-15 21:35 ` steven at gcc dot gnu dot org
  2009-07-23 21:51 ` drow at gcc dot gnu dot org
                   ` (9 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-07-15 21:35 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #16 from steven at gcc dot gnu dot org  2009-07-15 21:35 -------
*** Bug 40768 has been marked as a duplicate of this bug. ***


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (18 preceding siblings ...)
  2009-07-15 21:35 ` steven at gcc dot gnu dot org
@ 2009-07-23 21:51 ` drow at gcc dot gnu dot org
  2009-07-23 22:23 ` stevenb dot gcc at gmail dot com
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: drow at gcc dot gnu dot org @ 2009-07-23 21:51 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #17 from drow at gcc dot gnu dot org  2009-07-23 21:50 -------
Steven, have you had time for this?  Anything we can do to help?


-- 

drow at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steven at gcc dot gnu dot
                   |                            |org, drow at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (19 preceding siblings ...)
  2009-07-23 21:51 ` drow at gcc dot gnu dot org
@ 2009-07-23 22:23 ` stevenb dot gcc at gmail dot com
  2009-08-04 12:46 ` rguenth at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: stevenb dot gcc at gmail dot com @ 2009-07-23 22:23 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #18 from stevenb dot gcc at gmail dot com  2009-07-23 22:23 -------
Subject: Re:  [4.3/4.4/4.5 Regression] huge 
        performance regression on EEMBC bitmnp01

I had the patch ready but Matz' PRE patch means I have to rework things a bit.
Since I only have time for this on weekends...


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (20 preceding siblings ...)
  2009-07-23 22:23 ` stevenb dot gcc at gmail dot com
@ 2009-08-04 12:46 ` rguenth at gcc dot gnu dot org
  2010-01-29 20:33 ` amylaar at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-08-04 12:46 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #19 from rguenth at gcc dot gnu dot org  2009-08-04 12:29 -------
GCC 4.3.4 is being released, adjusting target milestone.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.3.4                       |4.3.5


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (21 preceding siblings ...)
  2009-08-04 12:46 ` rguenth at gcc dot gnu dot org
@ 2010-01-29 20:33 ` amylaar at gcc dot gnu dot org
  2010-02-19 12:18 ` rguenth at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: amylaar at gcc dot gnu dot org @ 2010-01-29 20:33 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #20 from amylaar at gcc dot gnu dot org  2010-01-29 20:33 -------
Subject: Bug 38785

Author: amylaar
Date: Fri Jan 29 20:33:19 2010
New Revision: 156365

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=156365
Log:
  PR tree-optimization/38785
  2009-02-02  J"orn Rennecke  <joern.rennecke@arc.com>
        * tree-ssa-pre.c (ppre_n_insert_for_speed_p): New function.
        * (do_partial_partial_insertion): Use it to throttle
        insert_into_preds_of_block calls.
        * common.opt (-ftree-pre-partial-partial-obliviously): New option.
  2009-01-15  Steven Bosscher  <steven@gcc.gnu.org>
        http://gcc.gnu.org/ml/gcc-patches/2008-12/msg00199.html
        * opts.c (decode_options): Fix initialization of
        flag_tree_switch_conversion.
        * tree-ssa-pre.c: Update outline of the algorithm.
        (bitmap_set_and): Prototype.
        (insert_into_preds_of_block): Don't report discovery of partial
        redundancies here, do so from the callers instead (see below).
        (do_regular_insertion): Add counter for an estimate for the number
        of inserts required to eliminate a partial redundancy.  If the
        current function is optimized for size, only perform the partial
        redundancy elimination if this requires inserting in only one
        predecessor.  Report all found partial redundancies from here.
        (do_partial_partial_insertion): Report them from here too.
        (insert_aux): Do not insert for partial-partial redundancies when
        optimizing for size.
        (do_pre): Run FRE at least, if PRE is disabled.
        (gate_pre): Return true if flag_tree_pre or flag_tree_fre is set.
  2009-01-15  J"orn Rennecke  <joern.rennecke@arc.com>
        * common.opt (ftree-pre-partial-partial): New option.
        * opts.c (decode_options): Initialize flag_tree_pre_partial_partial.
        * tree-ssa-pre.c (execute_pre): Use flag_tree_pre_partial_partial.

Modified:
    branches/mpost-opt-imp-20100127/gcc/ChangeLog.mpost
    branches/mpost-opt-imp-20100127/gcc/FAILED-PATCHES
    branches/mpost-opt-imp-20100127/gcc/common.opt
    branches/mpost-opt-imp-20100127/gcc/opts.c
    branches/mpost-opt-imp-20100127/gcc/tree-ssa-pre.c
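
The size heuristic described in the do_regular_insertion entry of the log
above can be illustrated with a rough C sketch.  This is not the
tree-ssa-pre.c code: the names estimate_inserts and worth_inserting, and the
boolean array modelling per-predecessor availability, are hypothetical and
serve only to show the shape of the decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: avail[i] is true when the expression is already
   available at the end of predecessor i of the block being considered.
   Returns the number of predecessors that would need an inserted copy.  */
size_t
estimate_inserts (const bool *avail, size_t n_preds)
{
  assert (avail != NULL || n_preds == 0);

  size_t missing = 0;
  for (size_t i = 0; i < n_preds; i++)
    if (!avail[i])
      missing++;
  return missing;
}

/* Decide whether eliminating the partial redundancy is worthwhile.  */
bool
worth_inserting (const bool *avail, size_t n_preds, bool optimize_for_size)
{
  size_t missing = estimate_inserts (avail, n_preds);

  /* Available everywhere: fully redundant, nothing to insert.
     Available nowhere: not partially redundant at all.  */
  if (missing == 0 || missing == n_preds)
    return false;

  /* When optimizing for size, accept only a single inserted copy.  */
  if (optimize_for_size)
    return missing == 1;

  return true;
}
```

With three predecessors of which only one already computes the expression,
this model would insert when optimizing for speed but not when optimizing
for size, since two copies would be needed.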


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (22 preceding siblings ...)
  2010-01-29 20:33 ` amylaar at gcc dot gnu dot org
@ 2010-02-19 12:18 ` rguenth at gcc dot gnu dot org
  2010-02-19 12:29 ` amylaar at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-02-19 12:18 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #21 from rguenth at gcc dot gnu dot org  2010-02-19 12:17 -------
The patch would need updating now.  Micha, does your tracer handle this case?


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at gcc dot gnu dot org
   Last reconfirmed|2009-03-04 22:58:24         |2010-02-19 12:17:54
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (23 preceding siblings ...)
  2010-02-19 12:18 ` rguenth at gcc dot gnu dot org
@ 2010-02-19 12:29 ` amylaar at gcc dot gnu dot org
  2010-02-19 13:19 ` matz at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: amylaar at gcc dot gnu dot org @ 2010-02-19 12:29 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #22 from amylaar at gcc dot gnu dot org  2010-02-19 12:29 -------
(In reply to comment #21)
> The patch would need updating now.  Micha, does your tracer handle this case?

See comment #20 for an updated patch.  However, I don't have EEMBC available
at the moment to make absolutely sure that the original problem is still
addressed.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (24 preceding siblings ...)
  2010-02-19 12:29 ` amylaar at gcc dot gnu dot org
@ 2010-02-19 13:19 ` matz at gcc dot gnu dot org
  2010-02-19 14:08 ` drow at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: matz at gcc dot gnu dot org @ 2010-02-19 13:19 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #23 from matz at gcc dot gnu dot org  2010-02-19 13:19 -------
Richi: not really.  It tries to separate paths where at least one has mostly
constants in its PHI args.  This applies to this testcase for the first
chain of PHI nodes, which are separated as intended.  But this simply leads
to constant propagation into the next chain of PHI nodes.  If repeated until
nothing changes, this just ripples the constants further and further down, but
doubles the number of incoming edges for the successor blocks.  So we still
get exponential growth, only in the number of edges rather than in the number
of PHI nodes.  I.e. effectively every possible path is then represented by an
edge, as would be expected from a tracer.  At that point the last PHI node
with constants would be useless and removed, but we would still have
exponential code size growth.
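
The doubling can be made concrete with a small C sketch.  It assumes a loop
body of the shape the original report describes, a sequence of ifs that each
OR a constant into an accumulator.  The function names, the three conditions
and the constants are made up for illustration; they are not taken from the
EEMBC source or from GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the benchmark's chain of ifs.  After each if,
   the value of ACC is a PHI of "unchanged" and "ORed with a constant".
   Once the paths into a join are separated, every incoming value is a
   known constant, and the same happens again at the next join.  */
uint32_t
or_chain3 (int a, int b, int c)
{
  uint32_t acc = 0;
  if (a)
    acc |= 0x1;   /* 2 possible constants after this join */
  if (b)
    acc |= 0x2;   /* 4 possible constants after this join */
  if (c)
    acc |= 0x4;   /* 8 possible constants after this join */
  return acc;
}

/* Number of distinct paths through a chain of N_IFS two-way branches.
   If every path gets its own edge, this is how many edges reach the
   end of the chain.  */
uint64_t
paths_through_chain (unsigned n_ifs)
{
  assert (n_ifs < 64);

  uint64_t paths = 1;
  for (unsigned i = 0; i < n_ifs; i++)
    paths *= 2;   /* taken / not taken */
  return paths;
}
```

Three ifs already give eight paths, each with its own combined constant; ten
ifs give 1024, which is the code size growth referred to above.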


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (25 preceding siblings ...)
  2010-02-19 13:19 ` matz at gcc dot gnu dot org
@ 2010-02-19 14:08 ` drow at gcc dot gnu dot org
  2010-02-19 23:33 ` stevenb dot gcc at gmail dot com
  2010-05-22 18:28 ` [Bug tree-optimization/38785] [4.3/4.4/4.5/4.6 " rguenth at gcc dot gnu dot org
  28 siblings, 0 replies; 30+ messages in thread
From: drow at gcc dot gnu dot org @ 2010-02-19 14:08 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #24 from drow at gcc dot gnu dot org  2010-02-19 14:08 -------
If no one else has EEMBC available, ask me and we can verify any fix.  We've
been using Steven's and Joern's patches; we tried other approaches, but in the
end we weren't able to come up with any other approach that worked as well.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (26 preceding siblings ...)
  2010-02-19 14:08 ` drow at gcc dot gnu dot org
@ 2010-02-19 23:33 ` stevenb dot gcc at gmail dot com
  2010-05-22 18:28 ` [Bug tree-optimization/38785] [4.3/4.4/4.5/4.6 " rguenth at gcc dot gnu dot org
  28 siblings, 0 replies; 30+ messages in thread
From: stevenb dot gcc at gmail dot com @ 2010-02-19 23:33 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #25 from stevenb dot gcc at gmail dot com  2010-02-19 23:32 -------
Subject: Re:  [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01

On 2/19/10, drow at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #24 from drow at gcc dot gnu dot org  2010-02-19 14:08 -------
> If no one else has EEMBC available, ask me and we can verify any fix.  We've
> been using Steven's and Joern's patches; we tried other approaches, but in the
> end we weren't able to come up with any other approach that worked as well.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



* [Bug tree-optimization/38785] [4.3/4.4/4.5/4.6 Regression] huge performance regression on EEMBC bitmnp01
  2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
                   ` (27 preceding siblings ...)
  2010-02-19 23:33 ` stevenb dot gcc at gmail dot com
@ 2010-05-22 18:28 ` rguenth at gcc dot gnu dot org
  28 siblings, 0 replies; 30+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-05-22 18:28 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #26 from rguenth at gcc dot gnu dot org  2010-05-22 18:13 -------
GCC 4.3.5 is being released, adjusting target milestone.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.3.5                       |4.3.6


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



end of thread, other threads:[~2010-05-22 18:28 UTC | newest]

Thread overview: 30+ messages
2009-01-09 15:46 [Bug tree-optimization/38785] New: huge performance regression on EEMBC bitmnp01 amylaar at gcc dot gnu dot org
2009-01-09 15:55 ` [Bug tree-optimization/38785] " rguenth at gcc dot gnu dot org
2009-01-09 16:39 ` amylaar at gcc dot gnu dot org
2009-01-09 17:35 ` amylaar at gcc dot gnu dot org
2009-01-09 17:59 ` rguenth at gcc dot gnu dot org
2009-01-09 20:55 ` steven at gcc dot gnu dot org
2009-01-10 16:10 ` amylaar at gcc dot gnu dot org
2009-01-14 10:08 ` Joey dot ye at intel dot com
2009-01-14 10:54 ` steven at gcc dot gnu dot org
2009-01-14 18:47 ` amylaar at gcc dot gnu dot org
2009-01-14 20:51 ` rguenther at suse dot de
2009-01-14 22:06 ` amylaar at gcc dot gnu dot org
2009-01-15 11:36 ` amylaar at gcc dot gnu dot org
2009-01-20 23:02 ` steven at gcc dot gnu dot org
2009-03-04 22:58 ` amylaar at gcc dot gnu dot org
2009-03-05  0:32 ` amylaar at gcc dot gnu dot org
2009-03-31 16:08 ` [Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] " jsm28 at gcc dot gnu dot org
2009-04-14  9:49 ` jakub at gcc dot gnu dot org
2009-07-15 21:12 ` steven at gcc dot gnu dot org
2009-07-15 21:35 ` steven at gcc dot gnu dot org
2009-07-23 21:51 ` drow at gcc dot gnu dot org
2009-07-23 22:23 ` stevenb dot gcc at gmail dot com
2009-08-04 12:46 ` rguenth at gcc dot gnu dot org
2010-01-29 20:33 ` amylaar at gcc dot gnu dot org
2010-02-19 12:18 ` rguenth at gcc dot gnu dot org
2010-02-19 12:29 ` amylaar at gcc dot gnu dot org
2010-02-19 13:19 ` matz at gcc dot gnu dot org
2010-02-19 14:08 ` drow at gcc dot gnu dot org
2010-02-19 23:33 ` stevenb dot gcc at gmail dot com
2010-05-22 18:28 ` [Bug tree-optimization/38785] [4.3/4.4/4.5/4.6 " rguenth at gcc dot gnu dot org
