public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/35639] New: -fprofile-generate + PRE = big compile-time
@ 2008-03-19 15:51 bonzini at gnu dot org
2008-03-19 15:52 ` [Bug tree-optimization/35639] " bonzini at gnu dot org
` (23 more replies)
0 siblings, 24 replies; 25+ messages in thread
From: bonzini at gnu dot org @ 2008-03-19 15:51 UTC (permalink / raw)
To: gcc-bugs
The attached .i file (from sed) spends 50% of its compilation time on PRE with
"-fprofile-generate -O2".
--
Summary: -fprofile-generate + PRE = big compile-time
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Keywords: compile-time-hog
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bonzini at gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639
* [Bug tree-optimization/35639] -fprofile-generate + PRE = big compile-time
From: bonzini at gnu dot org @ 2008-03-19 15:52 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from bonzini at gnu dot org 2008-03-19 15:51 -------
Created an attachment (id=15344)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15344&action=view)
preprocessed source
* [Bug tree-optimization/35639] -fprofile-generate + PRE = big compile-time
From: bonzini at gnu dot org @ 2008-03-19 15:52 UTC (permalink / raw)
To: gcc-bugs
--
bonzini at gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
Last reconfirmed|0000-00-00 00:00:00 |2008-03-19 15:52:03
date| |
Summary|-fprofile-generate + PRE = |-fprofile-generate + PRE =
|big compile-time |big compile-time
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
From: rguenth at gcc dot gnu dot org @ 2008-03-19 20:18 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from rguenth at gcc dot gnu dot org 2008-03-19 20:17 -------
compile_program is the offending function. I can probably limit the walking
with the alias oracle.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
From: rguenth at gcc dot gnu dot org @ 2008-03-19 20:33 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from rguenth at gcc dot gnu dot org 2008-03-19 20:33 -------
Created an attachment (id=15347)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15347&action=view)
unincluded testcase
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
From: rguenth at gcc dot gnu dot org @ 2008-03-19 20:35 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from rguenth at gcc dot gnu dot org 2008-03-19 20:35 -------
Err, whoops?
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |REOPENED
Resolution|FIXED |
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
From: rguenth at gcc dot gnu dot org @ 2008-03-19 20:37 UTC (permalink / raw)
To: gcc-bugs
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|--- |4.3.1
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
From: mmitchel at gcc dot gnu dot org @ 2008-04-28 4:41 UTC (permalink / raw)
To: gcc-bugs
--
mmitchel at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Priority|P3 |P2
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
From: rguenth at gcc dot gnu dot org @ 2008-06-06 15:03 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from rguenth at gcc dot gnu dot org 2008-06-06 14:59 -------
4.3.1 is being released, adjusting target milestone.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.3.1 |4.3.2
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
From: jsm28 at gcc dot gnu dot org @ 2008-08-27 22:13 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from jsm28 at gcc dot gnu dot org 2008-08-27 22:03 -------
4.3.2 is released, changing milestones to 4.3.3.
--
jsm28 at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.3.2 |4.3.3
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
From: steven at gcc dot gnu dot org @ 2008-11-22 10:55 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from steven at gcc dot gnu dot org 2008-11-22 10:53 -------
The last time this bug was reconfirmed was in March 2008. PRE has been
completely rewritten since then. With today's trunk, I still see PRE take most
of the compile time, but it's "only" 20% (on x86 and on x86_64 with Richi's
testcase from t3.c.gz).
Paolo, do you still see this problem?
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
From: bonzini at gnu dot org @ 2008-11-22 16:02 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from bonzini at gnu dot org 2008-11-22 16:00 -------
The problem was that without -fprofile-generate the time spent there was
basically zero. Since the profiling code is building a MST of the control-flow
graph, it should produce absolutely no PRE opportunity and it's a pity that it
causes such a slowdown.
I wonder if there is some low-hanging fruit to eliminate value numbers that
cannot be redundant.
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate + PRE = big compile-time
From: rguenth at gcc dot gnu dot org @ 2009-01-24 10:24 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from rguenth at gcc dot gnu dot org 2009-01-24 10:20 -------
GCC 4.3.3 is being released, adjusting target milestone.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.3.3 |4.3.4
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
From: bonzini at gnu dot org @ 2009-02-05 9:01 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from bonzini at gnu dot org 2009-02-05 09:01 -------
I get some SCCs that have 1500+ values in them. That is quite expensive to
maintain.
345,334,000 iterative_hash_hashval_t
199,259,509 ???:???
151,133,289 iterative_hash_expr
111,911,577 bitmap_bit_p
96,901,657 htab_find_slot_with_hash
93,662,766 get_expr_value_id
64,328,952 bitmap_set_bit
--
bonzini at gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[4.3/4.4 Regression] - |[4.3/4.4 Regression] -
|fprofile-generate + PRE = |fprofile-generate = huge
|big compile-time |SCCs for PRE
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
From: rguenth at gcc dot gnu dot org @ 2009-02-05 9:14 UTC (permalink / raw)
To: gcc-bugs
------- Comment #11 from rguenth at gcc dot gnu dot org 2009-02-05 09:14 -------
I get
tree PRE : 1.57 (50%) usr 0.09 (29%) sys 1.68 (29%) wall 295 kB ( 3%) ggc
on current trunk (w/o checking). While 50% is quite high the absolute time
is low. Though I wonder why FRE is so much better.
tree FRE : 0.03 ( 1%) usr 0.01 ( 3%) sys 0.05 ( 1%) wall 85 kB ( 1%) ggc
Danny, does your recent hacking around SCCVN improve this testcase?
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |dberlin at gcc dot gnu dot
| |org
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
From: bonzini at gnu dot org @ 2009-02-05 10:26 UTC (permalink / raw)
To: gcc-bugs
------- Comment #12 from bonzini at gnu dot org 2009-02-05 10:26 -------
FRE is not a problem because all the time (93%) is spent computing ANTIC; of
this, half is phi_translate and the other half is bitmap_set operations.
I get a relatively good (15%) improvement from
Index: tree-ssa-sccvn.c
===================================================================
--- tree-ssa-sccvn.c (revision 143938)
+++ tree-ssa-sccvn.c (working copy)
@@ -398,9 +398,14 @@ vn_reference_op_eq (const void *p1, cons
static hashval_t
vn_reference_op_compute_hash (const vn_reference_op_t vro1)
{
- return iterative_hash_expr (vro1->op0, vro1->opcode)
- + iterative_hash_expr (vro1->op1, vro1->opcode)
- + iterative_hash_expr (vro1->op2, vro1->opcode);
+ hashval_t result = 0;
+ if (vro1->op0)
+ result += iterative_hash_expr (vro1->op0, vro1->opcode);
+ if (vro1->op1)
+ result += iterative_hash_expr (vro1->op1, vro1->opcode);
+ if (vro1->op2)
+ result += iterative_hash_expr (vro1->op2, vro1->opcode);
+ return result;
}
/* Return the hashcode for a given reference operation P1. */
and another 8% from this:
Index: tree-ssa-pre.c
===================================================================
--- tree-ssa-pre.c (revision 143938)
+++ tree-ssa-pre.c (working copy)
@@ -216,11 +216,11 @@ pre_expr_hash (const void *p1)
case CONSTANT:
return vn_hash_constant_with_type (PRE_EXPR_CONSTANT (e));
case NAME:
- return iterative_hash_expr (PRE_EXPR_NAME (e), 0);
+ return iterative_hash_hashval_t (SSA_NAME_VERSION (PRE_EXPR_NAME (e)), 0);
case NARY:
- return vn_nary_op_compute_hash (PRE_EXPR_NARY (e));
+ return PRE_EXPR_NARY (e)->hashcode;
case REFERENCE:
- return vn_reference_compute_hash (PRE_EXPR_REFERENCE (e));
+ return PRE_EXPR_REFERENCE (e)->hashcode;
default:
abort ();
}
(Tested with "make check RUNTESTFLAGS=tree-ssa.exp=*[pf]re*"). At least these
two kick hashing almost out of the profile and bring PRE down from 50% to 40%
of the compilation time. They also speed up the bitmap_sets a bit, since
get_or_alloc_expression_id was also doing hashing.
The remaining main offenders are phi_translate_set and phi_translate_1. Apart
from some bitmap_sets, their profile is quite flat so no more microoptimization
I guess.
I'll bootstrap/regtest the above.
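For readers outside GCC, here is a minimal sketch of the two ideas in the patches above: skip operands that are absent when hashing, and cache the hash in the entry so that table lookups never recompute it. Everything in it (struct expr_node, operand_hash, expr_node_init, expr_node_hash) is a hypothetical stand-in for illustration, not GCC's data structures or its iterative_hash_expr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts calls to the expensive hash, for illustration only.  */
static unsigned long operand_hash_calls;

/* Hypothetical expression node: up to three optional operands (NULL means
   absent) and a hash code cached when the node is created.  */
struct expr_node {
  unsigned opcode;
  const char *op[3];
  uint32_t hashcode;
};

/* Stand-in for a costly expression hash (FNV-1a over a string, seeded
   with SEED).  This is not GCC's iterative_hash_expr.  */
static uint32_t operand_hash(const char *s, uint32_t seed)
{
  uint32_t h = 2166136261u ^ seed;
  operand_hash_calls++;
  for (; *s != '\0'; s++) {
    h ^= (unsigned char)*s;
    h *= 16777619u;
  }
  return h;
}

/* Hash only the operands that are present.  */
static uint32_t expr_node_compute_hash(const struct expr_node *e)
{
  uint32_t result = 0;
  for (int i = 0; i < 3; i++)
    if (e->op[i] != NULL)
      result += operand_hash(e->op[i], e->opcode);
  return result;
}

/* Fill in a node and compute its hash exactly once.  */
static void expr_node_init(struct expr_node *e, unsigned opcode,
                           const char *a, const char *b, const char *c)
{
  assert(e != NULL);
  e->opcode = opcode;
  e->op[0] = a;
  e->op[1] = b;
  e->op[2] = c;
  e->hashcode = expr_node_compute_hash(e);
}

/* Hash callback for table lookups: return the cached value.  */
static uint32_t expr_node_hash(const struct expr_node *e)
{
  return e->hashcode;
}
```

The cached value is only valid as long as the node is not modified after expr_node_init; a node that can change would need its hashcode recomputed at that point.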
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
From: rguenther at suse dot de @ 2009-02-05 10:32 UTC (permalink / raw)
To: gcc-bugs
------- Comment #13 from rguenther at suse dot de 2009-02-05 10:31 -------
Subject: Re: [4.3/4.4 Regression] -fprofile-generate
= huge SCCs for PRE
On Thu, 5 Feb 2009, bonzini at gnu dot org wrote:
>
>
> ------- Comment #12 from bonzini at gnu dot org 2009-02-05 10:26 -------
> FRE is not a problem because all the time (93%) is spent computing ANTIC; of
> this, half is phi_translate and the other half is bitmap_set operations.
>
> I get a relatively good (15%) improvement from
>
> Index: tree-ssa-sccvn.c
> ===================================================================
> --- tree-ssa-sccvn.c (revision 143938)
> +++ tree-ssa-sccvn.c (working copy)
> @@ -398,9 +398,14 @@ vn_reference_op_eq (const void *p1, cons
> static hashval_t
> vn_reference_op_compute_hash (const vn_reference_op_t vro1)
> {
> - return iterative_hash_expr (vro1->op0, vro1->opcode)
> - + iterative_hash_expr (vro1->op1, vro1->opcode)
> - + iterative_hash_expr (vro1->op2, vro1->opcode);
> + hashval_t result = 0;
> + if (vro1->op0)
> + result += iterative_hash_expr (vro1->op0, vro1->opcode);
> + if (vro1->op1)
> + result += iterative_hash_expr (vro1->op1, vro1->opcode);
> + if (vro1->op2)
> + result += iterative_hash_expr (vro1->op2, vro1->opcode);
> + return result;
> }
>
> /* Return the hashcode for a given reference operation P1. */
>
>
> and another 8% from this:
>
> Index: tree-ssa-pre.c
> ===================================================================
> --- tree-ssa-pre.c (revision 143938)
> +++ tree-ssa-pre.c (working copy)
> @@ -216,11 +216,11 @@ pre_expr_hash (const void *p1)
> case CONSTANT:
> return vn_hash_constant_with_type (PRE_EXPR_CONSTANT (e));
> case NAME:
> - return iterative_hash_expr (PRE_EXPR_NAME (e), 0);
> + return iterative_hash_hashval_t (SSA_NAME_VERSION (PRE_EXPR_NAME (e)), 0);
> case NARY:
> - return vn_nary_op_compute_hash (PRE_EXPR_NARY (e));
> + return PRE_EXPR_NARY (e)->hashcode;
> case REFERENCE:
> - return vn_reference_compute_hash (PRE_EXPR_REFERENCE (e));
> + return PRE_EXPR_REFERENCE (e)->hashcode;
> default:
> abort ();
> }
>
> (Tested with "make check RUNTESTFLAGS=tree-ssa.exp=*[pf]re*"). At least these
> two kick hashing almost out of the profile and bring PRE down from 50% to 40%
> of the compilation time. They also speed up the bitmap_sets a bit, since
> get_or_alloc_expression_id was also doing hashing.
>
> The remaining main offenders are phi_translate_set and phi_translate_1. Apart
> from some bitmap_sets, their profile is quite flat so no more microoptimization
> I guess.
>
> I'll bootstrap/regtest the above.
Ah, can you test in addition to the above
--- ../trunk/gcc/tree-ssa-sccvn.c 2009-01-28 13:11:34.000000000 +0100
+++ gcc/tree-ssa-sccvn.c 2009-02-01 12:26:36.000000000 +0100
@@ -316,6 +316,10 @@
const struct vn_constant_s *vc1 = (const struct vn_constant_s *) p1;
const struct vn_constant_s *vc2 = (const struct vn_constant_s *) p2;
+ /* Early out if this is not a hash collision. */
+ if (vc1->hashcode != vc2->hashcode)
+ return false;
+
return vn_constant_eq_with_type (vc1->constant, vc2->constant);
}
and similar in the other hash compare fns?
Richard.
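The early-out suggested in this message can be sketched outside GCC as follows. This is only an illustration under assumed names: struct const_entry, its text payload, and the deep_compares counter are hypothetical, and strcmp stands in for whatever full comparison the real entries need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table entry with its hash cached at insertion time.  */
struct const_entry {
  uint32_t hashcode;
  const char *text;   /* stand-in for the constant's payload */
};

/* Counts full comparisons, for illustration only.  */
static unsigned long deep_compares;

/* Equality callback: two entries with different cached hashes cannot be
   equal, so the full comparison runs only for equal hashes.  */
static bool const_entry_eq(const struct const_entry *a,
                           const struct const_entry *b)
{
  assert(a != NULL && b != NULL);
  if (a->hashcode != b->hashcode)
    return false;
  deep_compares++;
  return strcmp(a->text, b->text) == 0;
}
```

This helps only when the table compares entries that landed in the same bucket or probe sequence with different full hashes; entries whose hashes genuinely collide still pay for the full comparison.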
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
From: bonzini at gnu dot org @ 2009-02-05 13:02 UTC (permalink / raw)
To: gcc-bugs
------- Comment #14 from bonzini at gnu dot org 2009-02-05 13:02 -------
20% of PRE time is spent in qsort via sorted_array_from_bitmap_set. Rewriting
it like this makes it a bit faster (it shaves 12% more of PRE time):
FOR_EACH_VALUE_ID_IN_SET (set, i, bi)
{
/* The number of expressions having a given value is usually
relatively small. Thus, rather than making a vector of all the
expressions and sorting it by value id, we walk the values and
check in the reverse mapping that tells us what expressions have
a given value, to filter those in our set. If this is somehow
a significant loss for some cases, we can choose which set to
walk based on the set size. */
bitmap_set_t exprset = VEC_index (bitmap_set_t, value_expressions, i);
FOR_EACH_EXPR_ID_IN_SET (exprset, j, bj)
{
if (bitmap_bit_p (set->expressions, j))
VEC_safe_push (pre_expr, heap, result, expression_for_id (j));
}
}
I'll add this to the bootstrap/regtest.
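The same idea in a self-contained form, as a sketch only: instead of collecting the set's expressions and sorting them by value id, walk the value ids in ascending order and use a reverse mapping from value to expressions, keeping the expressions that are in the set. The names struct value_exprs and exprs_ordered_by_value, and the plain bool arrays used in place of bitmaps, are assumptions made for this example and not GCC's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reverse mapping entry: the expression ids that have one
   particular value id.  */
struct value_exprs {
  const unsigned *expr_ids;
  size_t count;
};

/* Write the members of a set to OUT ordered by value id, without sorting.
   VALUE_IN_SET[v] and EXPR_IN_SET[e] are the set's membership flags.
   Returns the number of expression ids written.  */
static size_t exprs_ordered_by_value(const struct value_exprs *by_value,
                                     const bool *value_in_set, size_t n_values,
                                     const bool *expr_in_set, size_t n_exprs,
                                     unsigned *out, size_t out_cap)
{
  size_t n = 0;
  for (size_t v = 0; v < n_values; v++) {
    if (!value_in_set[v])
      continue;
    for (size_t k = 0; k < by_value[v].count; k++) {
      unsigned e = by_value[v].expr_ids[k];
      assert(e < n_exprs);
      if (expr_in_set[e]) {
        assert(n < out_cap);
        out[n++] = e;
      }
    }
  }
  return n;
}
```

The walk costs time proportional to the number of expressions attached to the values in the set, so it pays off when, as the comment in the snippet says, each value has only a few expressions.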
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
From: rguenth at gcc dot gnu dot org @ 2009-02-05 13:40 UTC (permalink / raw)
To: gcc-bugs
------- Comment #15 from rguenth at gcc dot gnu dot org 2009-02-05 13:40 -------
Pre-allocating the VEC with some reasonable defaults (add some stats for this
testcase) should make it even faster.
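As an illustration of what pre-allocation saves, here is a sketch with a plain growable array rather than GCC's VEC. The type struct int_vec and the functions int_vec_reserve and int_vec_push are hypothetical; grow_count exists only to make the number of reallocations visible.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of ints.  */
struct int_vec {
  int *data;
  size_t len, cap;
  unsigned grow_count;   /* how many times the buffer was (re)allocated */
};

/* Make room for at least CAP elements.  */
static void int_vec_reserve(struct int_vec *v, size_t cap)
{
  int *p;
  if (cap <= v->cap)
    return;
  p = realloc(v->data, cap * sizeof *p);
  assert(p != NULL);
  v->data = p;
  v->cap = cap;
  v->grow_count++;
}

/* Append X, doubling the capacity when the buffer is full.  */
static void int_vec_push(struct int_vec *v, int x)
{
  if (v->len == v->cap)
    int_vec_reserve(v, v->cap ? v->cap * 2 : 4);
  v->data[v->len++] = x;
}
```

Calling int_vec_reserve once with the expected element count before a run of pushes replaces the repeated doubling with a single allocation. Whether that is measurable depends on how large the vectors get, which is why gathering statistics first is a sensible step.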
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
From: bonzini at gnu dot org @ 2009-02-05 16:30 UTC (permalink / raw)
To: gcc-bugs
------- Comment #16 from bonzini at gnu dot org 2009-02-05 16:30 -------
Not much. The remaining compile-time hogs (~25%) are the pre_expr and
expr_pred_trans hash tables.
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
From: dberlin at dberlin dot org @ 2009-02-05 16:41 UTC (permalink / raw)
To: gcc-bugs
------- Comment #17 from dberlin at gcc dot gnu dot org 2009-02-05 16:41 -------
Subject: Re: [4.3/4.4 Regression]
-fprofile-generate = huge SCCs for PRE
Ugh.
It might make sense to just replace the hash table implementation we
use with something better (simple power of 2, key-value stuff instead
of what we have now)
I've found in my testing that it can be quite a time sink.
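To make "simple power of 2, key-value" concrete, here is a rough sketch of such a table: fixed size, open addressing with linear probing, and an index computed with a mask. It is not a proposal for GCC's actual replacement; TABLE_SIZE, struct kv_table, kv_init, kv_put and kv_get are names invented for the example, and there is no deletion or resizing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 64u   /* must be a power of two */

struct kv_slot { bool used; uint32_t key; int value; };
struct kv_table { struct kv_slot slots[TABLE_SIZE]; unsigned count; };

static void kv_init(struct kv_table *t)
{
  memset(t, 0, sizeof *t);
}

/* Scramble the key, then mask: with a power-of-two size no division
   is needed to find the starting slot.  */
static uint32_t kv_index(uint32_t key)
{
  assert((TABLE_SIZE & (TABLE_SIZE - 1u)) == 0);
  return (key * 2654435761u) & (TABLE_SIZE - 1u);
}

/* Insert or overwrite KEY.  Returns false if the table is full.  */
static bool kv_put(struct kv_table *t, uint32_t key, int value)
{
  uint32_t i = kv_index(key);
  for (unsigned probes = 0; probes < TABLE_SIZE; probes++) {
    struct kv_slot *s = &t->slots[i];
    if (!s->used) {
      s->used = true;
      s->key = key;
      s->value = value;
      t->count++;
      return true;
    }
    if (s->key == key) {
      s->value = value;
      return true;
    }
    i = (i + 1u) & (TABLE_SIZE - 1u);
  }
  return false;
}

/* Look up KEY.  Returns true and stores the value in *VALUE if found.  */
static bool kv_get(const struct kv_table *t, uint32_t key, int *value)
{
  uint32_t i = kv_index(key);
  for (unsigned probes = 0; probes < TABLE_SIZE; probes++) {
    const struct kv_slot *s = &t->slots[i];
    if (!s->used)
      return false;
    if (s->key == key) {
      *value = s->value;
      return true;
    }
    i = (i + 1u) & (TABLE_SIZE - 1u);
  }
  return false;
}
```

Keys and values are stored inline, so a lookup involves no callback for hashing or equality; that is the kind of per-lookup overhead the message is complaining about.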
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
From: dberlin at dberlin dot org @ 2009-02-05 17:43 UTC (permalink / raw)
To: gcc-bugs
------- Comment #18 from dberlin at gcc dot gnu dot org 2009-02-05 17:43 -------
Subject: Re: [4.3/4.4 Regression]
-fprofile-generate = huge SCCs for PRE
My hacking will seriously improve this, since it doesn't iterate over
pieces of the SCC that aren't changing (which often is most of it).
On large SCCs, most of the time is actually being spent revisiting
things that can't possibly have changed.
(I.e., out of 5000 members of the SCC, 200 could have changed, but we
visit all 5000 anyway.)
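The general technique being described, revisiting only what may have changed, can be sketched with a worklist. The following is not the SCCVN code and does not do value numbering; it propagates a minimum around a small graph, and struct graph, N_NODES and propagate_min are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_NODES 6

/* users[i] lists the nodes that read node i, terminated by -1.  */
struct graph {
  int users[N_NODES][N_NODES + 1];
};

/* Propagate the minimum value along the edges until nothing changes.
   A node is put back on the worklist only after its own value changed,
   so unchanged parts of a cycle are not revisited.
   Returns the number of node visits.  */
static unsigned propagate_min(const struct graph *g, int value[N_NODES])
{
  int queue[N_NODES];        /* circular queue; each node is in it at most once */
  bool queued[N_NODES];
  size_t head = 0, tail = 0, pending = 0;
  unsigned visits = 0;

  for (int i = 0; i < N_NODES; i++) {
    queue[tail] = i;
    tail = (tail + 1) % N_NODES;
    queued[i] = true;
    pending++;
  }
  while (pending > 0) {
    int n = queue[head];
    head = (head + 1) % N_NODES;
    pending--;
    queued[n] = false;
    visits++;
    for (size_t k = 0; g->users[n][k] >= 0; k++) {
      int u = g->users[n][k];
      assert(u < N_NODES);
      if (value[n] < value[u]) {
        value[u] = value[n];
        if (!queued[u]) {
          queue[tail] = u;
          tail = (tail + 1) % N_NODES;
          queued[u] = true;
          pending++;
        }
      }
    }
  }
  return visits;
}
```

The alternative, sweeping every member of the cycle until a full sweep changes nothing, always pays for at least one extra sweep in which nothing changes; the worklist stops as soon as no node has a pending change.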
* [Bug tree-optimization/35639] [4.3/4.4 Regression] -fprofile-generate = huge SCCs for PRE
From: bonzini at gnu dot org @ 2009-02-06 8:31 UTC (permalink / raw)
To: gcc-bugs
------- Comment #19 from bonzini at gnu dot org 2009-02-06 08:30 -------
Patch committed; PRE is now down to 35-40% of compilation time. I defer to the
RMs whether this should be closed.
* [Bug tree-optimization/35639] [4.3/4.4/4.5 Regression] -fprofile-generate = huge SCCs for PRE
From: rguenth at gcc dot gnu dot org @ 2009-08-04 12:39 UTC (permalink / raw)
To: gcc-bugs
------- Comment #20 from rguenth at gcc dot gnu dot org 2009-08-04 12:29 -------
GCC 4.3.4 is being released, adjusting target milestone.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.3.4 |4.3.5
* [Bug tree-optimization/35639] [4.3/4.4/4.5 Regression] -fprofile-generate = huge SCCs for PRE
From: rguenth at gcc dot gnu dot org @ 2010-01-26 10:43 UTC (permalink / raw)
To: gcc-bugs
------- Comment #21 from rguenth at gcc dot gnu dot org 2010-01-26 10:43 -------
New timings (I suppose machines get faster ...):
GCC 4.3.4:
 tree PRE :   2.18 (68%) usr   0.04 (33%) sys   2.23 (66%) wall     972 kB ( 7%) ggc
 TOTAL    :   3.22             0.12             3.36              13328 kB
GCC 4.4.[01]:
 tree PRE :   0.63 (48%) usr   0.01 (20%) sys   0.66 (48%) wall     291 kB ( 3%) ggc
GCC 4.4.[23]:
 tree PRE :   0.05 ( 7%) usr   0.00 ( 0%) sys   0.04 ( 5%) wall     261 kB ( 3%) ggc
 TOTAL    :   0.72             0.04             0.77              10324 kB
trunk:
 tree PRE :   0.03 ( 6%) usr   0.00 ( 0%) sys   0.03 ( 4%) wall     129 kB ( 2%) ggc
 TOTAL    :   0.51             0.08             0.84               7674 kB
I suppose the maximal set translation fixed this.
WONTFIX for the 4.3 branch.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|REOPENED |RESOLVED
Known to work|4.2.3 |4.2.3 4.4.2 4.5.0
Resolution| |FIXED
Target Milestone|4.3.5 |4.4.2