public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/35639]  New: -fprofile-generate + PRE = big compile-time
@ 2008-03-19 15:51 bonzini at gnu dot org
  2008-03-19 15:52 ` [Bug tree-optimization/35639] " bonzini at gnu dot org
                   ` (23 more replies)
  0 siblings, 24 replies; 25+ messages in thread
From: bonzini at gnu dot org @ 2008-03-19 15:51 UTC (permalink / raw)
  To: gcc-bugs

The attached .i file (from sed) spends 50% of its compilation time on PRE with
"-fprofile-generate -O2".


-- 
           Summary: -fprofile-generate + PRE = big compile-time
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Keywords: compile-time-hog
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: bonzini at gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] -fprofile-generate + PRE = big compile-time
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
  2008-03-19 15:52 ` [Bug tree-optimization/35639] " bonzini at gnu dot org
@ 2008-03-19 15:52 ` bonzini at gnu dot org
  2008-03-19 20:18 ` [Bug tree-optimization/35639] [4.3/4.4 Regression] " rguenth at gcc dot gnu dot org
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bonzini at gnu dot org @ 2008-03-19 15:52 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from bonzini at gnu dot org  2008-03-19 15:51 -------
Created an attachment (id=15344)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15344&action=view)
preprocessed source


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] -fprofile-generate + PRE = big compile-time
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
@ 2008-03-19 15:52 ` bonzini at gnu dot org
  2008-03-19 15:52 ` bonzini at gnu dot org
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bonzini at gnu dot org @ 2008-03-19 15:52 UTC (permalink / raw)
  To: gcc-bugs



-- 

bonzini at gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2008-03-19 15:52:03
               date|                            |
            Summary|-fprofile-generate + PRE =  |-fprofile-generate + PRE =
                   |big compile-time            |big compile-time


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
  2008-03-19 15:52 ` [Bug tree-optimization/35639] " bonzini at gnu dot org
  2008-03-19 15:52 ` bonzini at gnu dot org
@ 2008-03-19 20:18 ` rguenth at gcc dot gnu dot org
  2008-03-19 20:33 ` rguenth at gcc dot gnu dot org
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-03-19 20:18 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from rguenth at gcc dot gnu dot org  2008-03-19 20:17 -------
compile_program is the offending function.  I can probably limit the walking
with the alias oracle.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (2 preceding siblings ...)
  2008-03-19 20:18 ` [Bug tree-optimization/35639] [4.3/4.4 Regression] " rguenth at gcc dot gnu dot org
@ 2008-03-19 20:33 ` rguenth at gcc dot gnu dot org
  2008-03-19 20:35 ` rguenth at gcc dot gnu dot org
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-03-19 20:33 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from rguenth at gcc dot gnu dot org  2008-03-19 20:33 -------
Created an attachment (id=15347)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15347&action=view)
unincluded testcase


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (3 preceding siblings ...)
  2008-03-19 20:33 ` rguenth at gcc dot gnu dot org
@ 2008-03-19 20:35 ` rguenth at gcc dot gnu dot org
  2008-03-19 20:37 ` rguenth at gcc dot gnu dot org
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-03-19 20:35 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from rguenth at gcc dot gnu dot org  2008-03-19 20:35 -------
Err, whoops?


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (4 preceding siblings ...)
  2008-03-19 20:35 ` rguenth at gcc dot gnu dot org
@ 2008-03-19 20:37 ` rguenth at gcc dot gnu dot org
  2008-04-28  4:41 ` mmitchel at gcc dot gnu dot org
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-03-19 20:37 UTC (permalink / raw)
  To: gcc-bugs



-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.3.1


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (5 preceding siblings ...)
  2008-03-19 20:37 ` rguenth at gcc dot gnu dot org
@ 2008-04-28  4:41 ` mmitchel at gcc dot gnu dot org
  2008-06-06 15:03 ` rguenth at gcc dot gnu dot org
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2008-04-28  4:41 UTC (permalink / raw)
  To: gcc-bugs



-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (6 preceding siblings ...)
  2008-04-28  4:41 ` mmitchel at gcc dot gnu dot org
@ 2008-06-06 15:03 ` rguenth at gcc dot gnu dot org
  2008-08-27 22:13 ` jsm28 at gcc dot gnu dot org
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-06-06 15:03 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from rguenth at gcc dot gnu dot org  2008-06-06 14:59 -------
4.3.1 is being released, adjusting target milestone.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.3.1                       |4.3.2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (7 preceding siblings ...)
  2008-06-06 15:03 ` rguenth at gcc dot gnu dot org
@ 2008-08-27 22:13 ` jsm28 at gcc dot gnu dot org
  2008-11-22 10:55 ` steven at gcc dot gnu dot org
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: jsm28 at gcc dot gnu dot org @ 2008-08-27 22:13 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from jsm28 at gcc dot gnu dot org  2008-08-27 22:03 -------
4.3.2 is released, changing milestones to 4.3.3.


-- 

jsm28 at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.3.2                       |4.3.3


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (8 preceding siblings ...)
  2008-08-27 22:13 ` jsm28 at gcc dot gnu dot org
@ 2008-11-22 10:55 ` steven at gcc dot gnu dot org
  2008-11-22 16:02 ` bonzini at gnu dot org
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: steven at gcc dot gnu dot org @ 2008-11-22 10:55 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from steven at gcc dot gnu dot org  2008-11-22 10:53 -------
The last time this bug was reconfirmed was in March 2008.  PRE has been
completely rewritten since then.  With today's trunk, I still see PRE take most
of the compile time, but it's "only" 20% (on x86 and on x86_64 with Richi's
testcase from t3.c.gz).

Paolo, do you still see this problem?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (9 preceding siblings ...)
  2008-11-22 10:55 ` steven at gcc dot gnu dot org
@ 2008-11-22 16:02 ` bonzini at gnu dot org
  2009-01-24 10:24 ` rguenth at gcc dot gnu dot org
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bonzini at gnu dot org @ 2008-11-22 16:02 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from bonzini at gnu dot org  2008-11-22 16:00 -------
The problem was that without -fprofile-generate the time spent there was
basically zero.  Since the profiling code is building a MST of the control-flow
graph, it should produce absolutely no PRE opportunity and it's a pity that it
causes such a slowdown.

I wonder if there is some low-hanging fruit to eliminate value numbers that
cannot be redundant.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (10 preceding siblings ...)
  2008-11-22 16:02 ` bonzini at gnu dot org
@ 2009-01-24 10:24 ` rguenth at gcc dot gnu dot org
  2009-02-05  9:01 ` [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE bonzini at gnu dot org
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-01-24 10:24 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from rguenth at gcc dot gnu dot org  2009-01-24 10:20 -------
GCC 4.3.3 is being released, adjusting target milestone.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.3.3                       |4.3.4


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (11 preceding siblings ...)
  2009-01-24 10:24 ` rguenth at gcc dot gnu dot org
@ 2009-02-05  9:01 ` bonzini at gnu dot org
  2009-02-05  9:14 ` rguenth at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bonzini at gnu dot org @ 2009-02-05  9:01 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from bonzini at gnu dot org  2009-02-05 09:01 -------
I get some SCCs that have 1500+ values in them.  That is quite expensive to
maintain.

345,334,000  iterative_hash_hashval_t
199,259,509  ???:???
151,133,289  iterative_hash_expr
111,911,577  bitmap_bit_p
 96,901,657  htab_find_slot_with_hash
 93,662,766  get_expr_value_id
 64,328,952  bitmap_set_bit


-- 

bonzini at gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.3/4.4 Regression] -      |[4.3/4.4 Regression] -
                   |fprofile-generate + PRE =   |fprofile-generate = huge
                   |big compile-time            |SCCs for PRE


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (12 preceding siblings ...)
  2009-02-05  9:01 ` [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE bonzini at gnu dot org
@ 2009-02-05  9:14 ` rguenth at gcc dot gnu dot org
  2009-02-05 10:26 ` bonzini at gnu dot org
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-02-05  9:14 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from rguenth at gcc dot gnu dot org  2009-02-05 09:14 -------
I get

 tree PRE              :   1.57 (50%) usr   0.09 (29%) sys   1.68 (29%) wall     295 kB ( 3%) ggc

on current trunk (w/o checking).  While 50% is quite high the absolute time
is low.  Though I wonder why FRE is so much better.

 tree FRE              :   0.03 ( 1%) usr   0.01 ( 3%) sys   0.05 ( 1%) wall      85 kB ( 1%) ggc


Danny, does your recent hacking around SCCVN improve this testcase?


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dberlin at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (13 preceding siblings ...)
  2009-02-05  9:14 ` rguenth at gcc dot gnu dot org
@ 2009-02-05 10:26 ` bonzini at gnu dot org
  2009-02-05 10:32 ` rguenther at suse dot de
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bonzini at gnu dot org @ 2009-02-05 10:26 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from bonzini at gnu dot org  2009-02-05 10:26 -------
FRE is not a problem because all the time (93%) is spent computing ANTIC; of
this, half is phi_translate and the other half is bitmap_set operations.

I get a relatively good (15%) improvement from

Index: tree-ssa-sccvn.c
===================================================================
--- tree-ssa-sccvn.c     (revision 143938)
+++ tree-ssa-sccvn.c     (working copy)
@@ -398,9 +398,14 @@ vn_reference_op_eq (const void *p1, cons
 static hashval_t
 vn_reference_op_compute_hash (const vn_reference_op_t vro1)
 {
-  return iterative_hash_expr (vro1->op0, vro1->opcode)
-    + iterative_hash_expr (vro1->op1, vro1->opcode)
-    + iterative_hash_expr (vro1->op2, vro1->opcode);
+  hashval_t result = 0;
+  if (vro1->op0)
+    result += iterative_hash_expr (vro1->op0, vro1->opcode);
+  if (vro1->op1)
+    result += iterative_hash_expr (vro1->op1, vro1->opcode);
+  if (vro1->op2)
+    result += iterative_hash_expr (vro1->op2, vro1->opcode);
+  return result;
 }

 /* Return the hashcode for a given reference operation P1.  */


and another 8% from this:

Index: tree-ssa-pre.c
===================================================================
--- tree-ssa-pre.c      (revision 143938)
+++ tree-ssa-pre.c      (working copy)
@@ -216,11 +216,11 @@ pre_expr_hash (const void *p1)
     case CONSTANT:
       return vn_hash_constant_with_type (PRE_EXPR_CONSTANT (e));
     case NAME:
-      return iterative_hash_expr (PRE_EXPR_NAME (e), 0);
+      return iterative_hash_hashval_t (SSA_NAME_VERSION (PRE_EXPR_NAME (e)), 0);
     case NARY:
-      return vn_nary_op_compute_hash (PRE_EXPR_NARY (e));
+      return PRE_EXPR_NARY (e)->hashcode;
     case REFERENCE:
-      return vn_reference_compute_hash (PRE_EXPR_REFERENCE (e));
+      return PRE_EXPR_REFERENCE (e)->hashcode;
     default:
       abort ();
     }

(Tested with "make check RUNTESTFLAGS=tree-ssa.exp=*[pf]re*").  At least these
two kick hashing almost out of the profile and bring PRE down from 50% to 40%
of the compilation time.  They also speed up the bitmap_sets a bit, since
get_or_alloc_expression_id was also doing hashing.

The remaining main offenders are phi_translate_set and phi_translate_1.  Apart
from some bitmap_sets, their profile is quite flat, so I guess there is no more
micro-optimization to be had.

I'll bootstrap/regtest the above.
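
For illustration, the idea behind the second patch (return a hash code that
was stored when the node was created, instead of re-walking the expression on
every table lookup) can be sketched in isolation as below.  This is a minimal
sketch, not GCC code: struct expr_node, mix and the expr_node_* functions are
hypothetical names, and the mixing function is only a stand-in for
iterative_hash_expr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical expression node; not GCC's pre_expr or vn_nary_op_t.  */
struct expr_node
{
  uint32_t hashcode;   /* computed once, when the node is initialized */
  int opcode;
  int operands[3];     /* -1 marks an absent operand */
};

/* Mix one word into a running hash (multiply-xor step, illustrative only).  */
static uint32_t
mix (uint32_t hash, uint32_t value)
{
  return (hash ^ value) * 16777619u;
}

/* The expensive walk: done once per node, skipping absent operands.  */
static uint32_t
expr_node_compute_hash (const struct expr_node *e)
{
  uint32_t h = mix (2166136261u, (uint32_t) e->opcode);
  for (size_t i = 0; i < 3; i++)
    if (e->operands[i] >= 0)
      h = mix (h, (uint32_t) e->operands[i]);
  return h;
}

/* Fill in a node and cache its hash code.  */
static void
expr_node_init (struct expr_node *e, int opcode, int op0, int op1, int op2)
{
  assert (e != NULL);
  e->opcode = opcode;
  e->operands[0] = op0;
  e->operands[1] = op1;
  e->operands[2] = op2;
  e->hashcode = expr_node_compute_hash (e);
}

/* Hash-table callback: return the stored value, no recomputation.  */
static uint32_t
expr_node_hash (const struct expr_node *e)
{
  return e->hashcode;
}
```

The cached value is only valid as long as the fields that feed the hash are
not modified after initialization.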


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (14 preceding siblings ...)
  2009-02-05 10:26 ` bonzini at gnu dot org
@ 2009-02-05 10:32 ` rguenther at suse dot de
  2009-02-05 13:02 ` bonzini at gnu dot org
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rguenther at suse dot de @ 2009-02-05 10:32 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from rguenther at suse dot de  2009-02-05 10:31 -------
Subject: Re:  [4.3/4.4 Regression] -fprofile-generate
 = huge SCCs for PRE

On Thu, 5 Feb 2009, bonzini at gnu dot org wrote:

> 
> 
> ------- Comment #12 from bonzini at gnu dot org  2009-02-05 10:26 -------
> FRE is not a problem because all the time (93%) is spent computing ANTIC; of
> this, half is phi_translate and the other half is bitmap_set operations.
> 
> I get a relatively good (15%) improvement from
> 
> Index: tree-ssa-sccvn.c
> ===================================================================
> --- tree-ssa-sccvn.c     (revision 143938)
> +++ tree-ssa-sccvn.c     (working copy)
> @@ -398,9 +398,14 @@ vn_reference_op_eq (const void *p1, cons
>  static hashval_t
>  vn_reference_op_compute_hash (const vn_reference_op_t vro1)
>  {
> -  return iterative_hash_expr (vro1->op0, vro1->opcode)
> -    + iterative_hash_expr (vro1->op1, vro1->opcode)
> -    + iterative_hash_expr (vro1->op2, vro1->opcode);
> +  hashval_t result = 0;
> +  if (vro1->op0)
> +    result += iterative_hash_expr (vro1->op0, vro1->opcode);
> +  if (vro1->op1)
> +    result += iterative_hash_expr (vro1->op1, vro1->opcode);
> +  if (vro1->op2)
> +    result += iterative_hash_expr (vro1->op2, vro1->opcode);
> +  return result;
>  }
> 
>  /* Return the hashcode for a given reference operation P1.  */
> 
> 
> and another 8% from this:
> 
> Index: tree-ssa-pre.c
> ===================================================================
> --- tree-ssa-pre.c      (revision 143938)
> +++ tree-ssa-pre.c      (working copy)
> @@ -216,11 +216,11 @@ pre_expr_hash (const void *p1)
>      case CONSTANT:
>        return vn_hash_constant_with_type (PRE_EXPR_CONSTANT (e));
>      case NAME:
> -      return iterative_hash_expr (PRE_EXPR_NAME (e), 0);
> +      return iterative_hash_hashval_t (SSA_NAME_VERSION (PRE_EXPR_NAME (e)),
> 0);
>      case NARY:
> -      return vn_nary_op_compute_hash (PRE_EXPR_NARY (e));
> +      return PRE_EXPR_NARY (e)->hashcode;
>      case REFERENCE:
> -      return vn_reference_compute_hash (PRE_EXPR_REFERENCE (e));
> +      return PRE_EXPR_REFERENCE (e)->hashcode;
>      default:
>        abort ();
>      }
> 
> (Tested with "make check RUNTESTFLAGS=tree-ssa.exp=*[pf]re*").  At least these
> two kick hashing almost out of the profile and bring PRE down from 50% to 40%
> of the compilation time.  They also speedup a bit the bitmap_sets since
> get_or_alloc_expression_id was also doing hashing.
> 
> The remaining main offenders are phi_translate_set and phi_translate_1.  Apart
> from some bitmap_sets, their profile is quite flat so no more microoptimization
> I guess.
> 
> I'll bootstrap/regtest the above.

Ah, can you test in addition to the above

--- ../trunk/gcc/tree-ssa-sccvn.c       2009-01-28 13:11:34.000000000 +0100
+++ gcc/tree-ssa-sccvn.c        2009-02-01 12:26:36.000000000 +0100
@@ -316,6 +316,10 @@
   const struct vn_constant_s *vc1 = (const struct vn_constant_s *) p1;
   const struct vn_constant_s *vc2 = (const struct vn_constant_s *) p2;

+  /* Early out if this is not a hash collision.  */
+  if (vc1->hashcode != vc2->hashcode)
+    return false;
+
   return vn_constant_eq_with_type (vc1->constant, vc2->constant);
 }

and similar in the other hash compare fns?

Richard.
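
As a standalone illustration of the early-out suggested above, here is a
minimal sketch of an equality callback that compares stored hash codes before
doing the full comparison.  struct entry, entry_eq and the full_compares
counter are hypothetical and exist only for this example; the shortcut is
sound only if equal keys always carry equal stored hash codes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table entry with a precomputed hash code.  */
struct entry
{
  uint32_t hashcode;
  const char *key;
};

/* Counts how often the expensive comparison actually runs.  */
static int full_compares;

/* Equality callback: entries with different hash codes cannot be equal,
   so the full comparison is only needed for candidates whose hashes match.  */
static bool
entry_eq (const struct entry *a, const struct entry *b)
{
  assert (a != NULL && b != NULL);

  /* Early out if the stored hash codes differ.  */
  if (a->hashcode != b->hashcode)
    return false;

  full_compares++;
  return strcmp (a->key, b->key) == 0;
}
```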

> 
> 
> 


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (15 preceding siblings ...)
  2009-02-05 10:32 ` rguenther at suse dot de
@ 2009-02-05 13:02 ` bonzini at gnu dot org
  2009-02-05 13:40 ` rguenth at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bonzini at gnu dot org @ 2009-02-05 13:02 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #14 from bonzini at gnu dot org  2009-02-05 13:02 -------
20% of PRE time is spent in qsort via sorted_array_from_bitmap_set.  Rewriting
it like this makes it a bit faster (it shaves another 12% off PRE time):

  FOR_EACH_VALUE_ID_IN_SET (set, i, bi)
    {
      /* The number of expressions having a given value is usually
         relatively small.  Thus, rather than making a vector of all the
         expressions and sorting it by value id, we walk the values and
         check in the reverse mapping that tells us what expressions have
         a given value, to filter those in our set.  If this is somehow
         a significant lose for some cases, we can choose which set to
         walk based on the set size.  */
      bitmap_set_t exprset = VEC_index (bitmap_set_t, value_expressions, i);
      FOR_EACH_EXPR_ID_IN_SET (exprset, j, bj)
        {
          if (bitmap_bit_p (set->expressions, j))
            VEC_safe_push (pre_expr, heap, result, expression_for_id (j));
        }
    }

I'll add this to the bootstrap/regtest.
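
The same idea, stripped of GCC's bitmap and VEC machinery, can be sketched as
follows: walk the value ids in ascending order and use a reverse map from
value to expressions to pick out the members of the set, so the result comes
out grouped by value id without a sort.  This is a minimal sketch under
simplifying assumptions: ids are below 64 so that each set fits in one 64-bit
mask, and sorted_exprs_from_set and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_IDS 64   /* ids fit in one 64-bit mask in this sketch */

/* Collect the expressions of a set, grouped by ascending value id.
   SET_VALUES has bit v set if value v is in the set.
   SET_EXPRS has bit e set if expression e is in the set.
   VALUE_EXPRS[v] is the mask of all expressions having value v.
   Expression ids are written to OUT; the count is returned.  */
static size_t
sorted_exprs_from_set (uint64_t set_values, uint64_t set_exprs,
                       const uint64_t value_exprs[MAX_IDS],
                       unsigned out[MAX_IDS])
{
  size_t n = 0;

  assert (value_exprs != NULL && out != NULL);
  for (unsigned v = 0; v < MAX_IDS; v++)
    {
      if (!(set_values & (UINT64_C (1) << v)))
        continue;

      /* Only expressions that have value v and are in our set.  */
      uint64_t candidates = value_exprs[v] & set_exprs;
      for (unsigned e = 0; e < MAX_IDS; e++)
        if (candidates & (UINT64_C (1) << e))
          {
            assert (n < MAX_IDS);
            out[n++] = e;
          }
    }
  return n;
}
```

Because the outer loop already runs in value-id order, no comparison sort is
needed; the cost depends on how many expressions share each value, which the
comment in the snippet above expects to be small.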


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (16 preceding siblings ...)
  2009-02-05 13:02 ` bonzini at gnu dot org
@ 2009-02-05 13:40 ` rguenth at gcc dot gnu dot org
  2009-02-05 16:30 ` bonzini at gnu dot org
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-02-05 13:40 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #15 from rguenth at gcc dot gnu dot org  2009-02-05 13:40 -------
Pre-allocating the VEC with some reasonable defaults (add some stats for this
testcase) should make it even faster.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (17 preceding siblings ...)
  2009-02-05 13:40 ` rguenth at gcc dot gnu dot org
@ 2009-02-05 16:30 ` bonzini at gnu dot org
  2009-02-05 16:41 ` dberlin at dberlin dot org
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: bonzini at gnu dot org @ 2009-02-05 16:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #16 from bonzini at gnu dot org  2009-02-05 16:30 -------
Not much.  The remaining compile-time hogs (~25%) are the pre_expr and
expr_pred_trans hash tables.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (18 preceding siblings ...)
  2009-02-05 16:30 ` bonzini at gnu dot org
@ 2009-02-05 16:41 ` dberlin at dberlin dot org
  2009-02-05 17:43 ` dberlin at dberlin dot org
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: dberlin at dberlin dot org @ 2009-02-05 16:41 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #17 from dberlin at gcc dot gnu dot org  2009-02-05 16:41 -------
Subject: Re:  [4.3/4.4 Regression] 
        -fprofile-generate = huge SCCs for PRE

Ugh.
It might make sense to just replace the hash table implementation we
use with something better (simple power of 2, key-value stuff instead
of what we have now).
I've found in my testing that it can be quite a time sink.
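
To make "simple power of 2, key-value" concrete, here is a minimal sketch of
such a table: fixed capacity, open addressing with linear probing, and index
wrap-around done with a mask instead of a modulo.  It is an assumption about
what such a replacement could look like, not a proposal from this thread; all
names (kv_table, kv_put, kv_get) are hypothetical, and deletion and resizing
are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_BITS 6u
#define TABLE_SIZE (1u << TABLE_BITS)   /* power of two by construction */
#define TABLE_MASK (TABLE_SIZE - 1u)

struct kv_slot
{
  bool used;
  uint32_t key;
  int value;
};

struct kv_table
{
  struct kv_slot slots[TABLE_SIZE];
  size_t count;
};

/* Multiplicative hash; the top TABLE_BITS bits select the home slot.  */
static uint32_t
kv_home (uint32_t key)
{
  return (key * 2654435761u) >> (32u - TABLE_BITS);
}

/* Insert or update KEY.  Returns false only if the table is full.  */
static bool
kv_put (struct kv_table *t, uint32_t key, int value)
{
  assert (t != NULL);
  uint32_t i = kv_home (key);
  for (uint32_t probes = 0; probes < TABLE_SIZE; probes++)
    {
      struct kv_slot *s = &t->slots[i];
      if (!s->used)
        {
          s->used = true;
          s->key = key;
          s->value = value;
          t->count++;
          return true;
        }
      if (s->key == key)
        {
          s->value = value;
          return true;
        }
      i = (i + 1u) & TABLE_MASK;   /* wrap with a mask, no division */
    }
  return false;
}

/* Look up KEY; on success store its value in *VALUE and return true.  */
static bool
kv_get (const struct kv_table *t, uint32_t key, int *value)
{
  assert (t != NULL && value != NULL);
  uint32_t i = kv_home (key);
  for (uint32_t probes = 0; probes < TABLE_SIZE; probes++)
    {
      const struct kv_slot *s = &t->slots[i];
      if (!s->used)
        return false;
      if (s->key == key)
        {
          *value = s->value;
          return true;
        }
      i = (i + 1u) & TABLE_MASK;
    }
  return false;
}
```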


On Thu, Feb 5, 2009 at 11:30 AM, bonzini at gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #16 from bonzini at gnu dot org  2009-02-05 16:30 -------
> Not much.  The remaining compile-time hogs (~25%) are the pre_expr and
> expr_pred_trans hash tables.
>
>
> --
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639
>


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (19 preceding siblings ...)
  2009-02-05 16:41 ` dberlin at dberlin dot org
@ 2009-02-05 17:43 ` dberlin at dberlin dot org
  2009-02-06  8:31 ` bonzini at gnu dot org
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: dberlin at dberlin dot org @ 2009-02-05 17:43 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #18 from dberlin at gcc dot gnu dot org  2009-02-05 17:43 -------
Subject: Re:  [4.3/4.4 Regression] 
        -fprofile-generate = huge SCCs for PRE

My hacking will seriously improve this, since it doesn't iterate over
pieces of the SCC that aren't changing (which often is most of it).
On large SCCs, most of the time is actually being spent revisiting
things that can't possibly have changed.

(I.e., out of 5000 members of the SCC, 200 could have changed but we
visit all 5000 anyway.)
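
The difference between revisiting a whole SCC and revisiting only what may
have changed can be illustrated with a generic worklist.  The sketch below is
not the SCCVN change itself: it is a toy monotone propagation (each user takes
the maximum of its own value and its definition's value) over a small graph,
with hypothetical names (struct node, propagate), where a node is revisited
only after one of its inputs actually changed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_NODES 8

/* Hypothetical graph node; USERS lists the nodes that depend on it.  */
struct node
{
  int value;
  size_t n_users;
  size_t users[N_NODES];
};

/* Propagate values until nothing changes, starting from the CHANGED
   nodes only.  Returns the number of node visits performed.  */
static size_t
propagate (struct node nodes[N_NODES], const size_t *changed,
           size_t n_changed)
{
  size_t worklist[N_NODES];            /* circular queue */
  bool queued[N_NODES] = { false };    /* keeps each node queued at most once */
  size_t head = 0, tail = 0, len = 0, visits = 0;

  for (size_t i = 0; i < n_changed; i++)
    {
      assert (changed[i] < N_NODES);
      if (!queued[changed[i]])
        {
          queued[changed[i]] = true;
          worklist[tail] = changed[i];
          tail = (tail + 1) % N_NODES;
          len++;
        }
    }

  while (len > 0)
    {
      size_t n = worklist[head];
      head = (head + 1) % N_NODES;
      len--;
      queued[n] = false;
      visits++;

      for (size_t u = 0; u < nodes[n].n_users; u++)
        {
          size_t user = nodes[n].users[u];
          if (nodes[user].value < nodes[n].value)
            {
              /* The user's value changed, so its own users need a visit.  */
              nodes[user].value = nodes[n].value;
              if (!queued[user])
                {
                  queued[user] = true;
                  worklist[tail] = user;
                  tail = (tail + 1) % N_NODES;
                  len++;
                }
            }
        }
    }
  return visits;
}
```

With a three-node cycle and a second, untouched cycle in the same array,
seeding only the node that changed leaves the second cycle unvisited.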


On Thu, Feb 5, 2009 at 4:14 AM, rguenth at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #11 from rguenth at gcc dot gnu dot org  2009-02-05 09:14 -------
> I get
>
>  tree PRE              :   1.57 (50%) usr   0.09 (29%) sys   1.68 (29%) wall
>  295 kB ( 3%) ggc
>
> on current trunk (w/o checking).  While 50% is quite high the absolute time
> is low.  Though I wonder why FRE is so much better.
>
>  tree FRE              :   0.03 ( 1%) usr   0.01 ( 3%) sys   0.05 ( 1%) wall
>  85 kB ( 1%) ggc
>
>
> Danny, does your recent hacking around SCCVN improve this testcase?
>
>
> --
>
> rguenth at gcc dot gnu dot org changed:
>
>           What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                 CC|                            |dberlin at gcc dot gnu dot
>                   |                            |org
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639
>


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (20 preceding siblings ...)
  2009-02-05 17:43 ` dberlin at dberlin dot org
@ 2009-02-06  8:31 ` bonzini at gnu dot org
  2009-08-04 12:39 ` [Bug tree-optimization/35639] [4.3/4.4/4.5 " rguenth at gcc dot gnu dot org
  2010-01-26 10:43 ` rguenth at gcc dot gnu dot org
  23 siblings, 0 replies; 25+ messages in thread
From: bonzini at gnu dot org @ 2009-02-06  8:31 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #19 from bonzini at gnu dot org  2009-02-06 08:30 -------
Patch committed; PRE is now down to 35-40% of compilation time.  I defer to the
RMs (release managers) on whether this should be closed.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4/4.5 Regression] -fprofile-generate = huge SCCs for PRE
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (21 preceding siblings ...)
  2009-02-06  8:31 ` bonzini at gnu dot org
@ 2009-08-04 12:39 ` rguenth at gcc dot gnu dot org
  2010-01-26 10:43 ` rguenth at gcc dot gnu dot org
  23 siblings, 0 replies; 25+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-08-04 12:39 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #20 from rguenth at gcc dot gnu dot org  2009-08-04 12:29 -------
GCC 4.3.4 is being released, adjusting target milestone.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.3.4                       |4.3.5


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639



* [Bug tree-optimization/35639] [4.3/4.4/4.5 Regression] -fprofile-generate = huge SCCs for PRE
  2008-03-19 15:51 [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time bonzini at gnu dot org
                   ` (22 preceding siblings ...)
  2009-08-04 12:39 ` [Bug tree-optimization/35639] [4.3/4.4/4.5 " rguenth at gcc dot gnu dot org
@ 2010-01-26 10:43 ` rguenth at gcc dot gnu dot org
  23 siblings, 0 replies; 25+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-01-26 10:43 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #21 from rguenth at gcc dot gnu dot org  2010-01-26 10:43 -------
New timings (I suppose machines get faster ...):

GCC 4.3.4:
 tree PRE              :   2.18 (68%) usr   0.04 (33%) sys   2.23 (66%) wall     972 kB ( 7%) ggc
 TOTAL                 :   3.22             0.12             3.36               13328 kB

GCC 4.4.[01]:
 tree PRE              :   0.63 (48%) usr   0.01 (20%) sys   0.66 (48%) wall     291 kB ( 3%) ggc

GCC 4.4.[23]:
 tree PRE              :   0.05 ( 7%) usr   0.00 ( 0%) sys   0.04 ( 5%) wall     261 kB ( 3%) ggc
 TOTAL                 :   0.72             0.04             0.77               10324 kB

trunk:
 tree PRE              :   0.03 ( 6%) usr   0.00 ( 0%) sys   0.03 ( 4%) wall     129 kB ( 2%) ggc
 TOTAL                 :   0.51             0.08             0.84                7674 kB


I suppose the maximal set translation fixed this.

WONTFIX for the 4.3 branch.
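
As a sanity check on the timings above, this small sketch recomputes PRE's
share of user time from the reported PRE and TOTAL figures, and the PRE
speedup relative to 4.3.4. The dictionary and helper names are illustrative
only. GCC 4.4.[01] is left out because no TOTAL line was reported for it.

```python
# (PRE usr seconds, TOTAL usr seconds) as reported in comment #21.
timings = {
    "GCC 4.3.4": (2.18, 3.22),
    "GCC 4.4.[23]": (0.05, 0.72),
    "trunk": (0.03, 0.51),
}


def pre_share(release):
    """PRE user time as a whole-number percentage of total user time."""
    pre, total = timings[release]
    return round(100 * pre / total)


def pre_speedup(baseline, release):
    """How many times less user time PRE takes in release than in baseline."""
    return timings[baseline][0] / timings[release][0]


for name in timings:
    print(f"{name}: PRE is {pre_share(name)}% of usr time")
print(f"4.3.4 -> trunk PRE speedup: {pre_speedup('GCC 4.3.4', 'trunk'):.1f}x")
```

The recomputed shares agree with the percentages printed in the time
report (68%, 7% and 6%).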


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
      Known to work|4.2.3                       |4.2.3 4.4.2 4.5.0
         Resolution|                            |FIXED
   Target Milestone|4.3.5                       |4.4.2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639

