* [tcb] Improve SSA-CCP to work with aliased stores, structures and arrays
@ 2004-10-15 4:55 Diego Novillo
2004-10-15 19:22 ` Daniel Berlin
0 siblings, 1 reply; 2+ messages in thread
From: Diego Novillo @ 2004-10-15 4:55 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 7175 bytes --]
Given
----------------------------------------------------------------------
struct A a;
const int B = 42;
void foo (int i)
{
if (i > 10)
a.a = 42;
else
{
a.b = 21;
a.a = a.b + 21;
}
if (a.a != B)
link_error ();
}
----------------------------------------------------------------------
GCC will not fold 'a.a != B' into '0'.
While CCP has all the logic to propagate the constants in this case, it
is missing the ability to associate constants with stores and loads
(i.e., pointer dereferences, structures and global/aliased variables).
We don't keep loads and stores in SSA, but we do build a factored
use-def web for them (in the virtual operands).
Essentially, the patch extends Brian Booth's work to propagate constants
in global variables. It associates constant values with the SSA names
in the V_MAY_DEF/V_MUST_DEF operands. However, this is not enough,
because we glob partial loads/stores with the base symbol:
# a_5 = V_MAY_DEF <a_4>
a.a = 2;
# VUSE <a_5>
x_3 = a.b;
In the example above, CCP will associate value '2' with 'a_5', but it
would be wrong to replace the load from 'a.b' with '2', because '2' had
been stored into a.a. So, we also keep track of the memory reference
expression where the store was done. This may not be necessary when we
start representing partial loads and stores (Dan?).
We also can't rely on looking up the store by checking the defining
statement of the SSA name, because we also need to propagate through
PHIs.
Store CCP is not too expensive, but it does have to do more work than
regular CCP, so I only enabled it at -O2 and parameterized CCP with a
flag to only run store CCP separately. In particular, it's not
necessary to run store CCP before scalarization. After timing various
alternatives, I settled on 3 CCP runs: one standard CCP early, one store
CCP after SRA, and another standard CCP near the end.
The patch also rearranges calls to FRE and copy-prop, and removes the
loop around the dominator pass. In a test of cc1-i-files and PR8361, we
propagate 35% more constants than in mainline. The timings aren't
significantly different. Here's a summary of the major differences on
cc1-i-files:
Phase Before After % change
tree copy propagation 0.00 7.36 N/A
tree PTA 2.10 1.76 -16.2%
tree alias analysis 2.01 1.96 -2.5%
tree PHI insertion 4.15 3.85 -7.2%
tree SSA rewrite 5.47 5.04 -7.9%
tree SSA other 23.25 20.37 -12.4%
tree operand scan 19.43 21.10 8.6%
dominator optimization 32.78 16.56 -49.5%
tree SRA 0.33 0.28 -15.2%
tree STORE-CCP 0.00 2.28 N/A
tree CCP 2.00 4.42 121.0%
tree split crit edges 3.17 3.03 -4.4%
tree FRE 0.00 13.89 N/A
tree PRE 5.50 6.68 21.5%
tree conservative DCE 2.36 4.64 96.6%
tree aggressive DCE 1.22 1.55 27.0%
expand 25.30 20.25 -20.0%
varconst 2.85 2.65 -7.0%
jump 1.50 1.38 -8.0%
CSE 44.73 41.72 -6.7%
bypass jumps 5.02 4.28 -14.7%
CSE 2 28.61 26.40 -7.7%
branch prediction 5.16 4.91 -4.8%
local alloc 8.10 7.65 -5.6%
global alloc 23.31 23.00 -1.3%
reload CSE regs 11.21 10.22 -8.8%
flow 2 2.55 2.19 -14.1%
Total tree passes 143.28 157.10 9.6%
Total compile time 751.50 749.87 -0.2%
So the net effect is positive. We spend more time in the tree passes,
but the total compile time is somewhat better. I expect to keep
improving these times.
Bootstrapped and tested x86, x86-64 and ppc.
Diego.
2004-10-14 Diego Novillo <dnovillo@redhat.com>
Constant propagation of stores and loads.
* common.opt (ftree-store-ccp): New flag.
* opts.c (decode_options): Enable STORE-CCP at -O2.
* timevar.def (TV_TREE_STORE_CCP): New timer.
* tree-flow.h (struct prop_value_d): Declare.
(unmodifiable_var_p): Declare.
* tree-optimize.c (init_tree_optimization_passes): Re-arrange
CCP, FRE and copy-prop passes.
Add call to pass_store_ccp.
* tree-pass.h (pass_store_ccp): Declare.
* tree-ssa-ccp.c (value): Remove. Update all users.
(const_val): Declare.
(value_vector): Remove. Update all users.
(ccp_lattice_t): Rename from latticevalue. Update all users.
(do_store_ccp): Declare.
(debug_lattice_value): New function.
(get_default_value): Re-write.
(set_lattice_value): Do not allow lattice transitions to lower
lattice values (with the exception of CONSTANT->UNKNOWN_VAL).
Return false on UNINITIALIZED->UNDEFINED transitions.
(def_to_varying): Remove.
(likely_value): Re-write.
(need_imm_uses_for): Call likely_value if VAR is defined by an
assignment.
(ccp_initialize): Remove variable 'is_may_def'.
(struct prop_stats_d): Declare.
(prop_stats): Declare.
(replace_uses_in): Count number of constants and copies
propagated.
(replace_vuse_in): Likewise.
(replace_phi_args_in): Factor out from ...
(substitute_and_fold): ... here.
Initialize prop_stats.
Always remove EH edges if the statement no longer has EH
information.
(ccp_lattice_meet): Re-write. Add documentation for
UNKNOWN_VAL.
(ccp_visit_phi_node): Do not handle UNKNOWN_VAL if
do_store_ccp is false.
(ccp_fold): Do not handle loads if do_store_ccp is false.
(visit_assignment): Re-write.
(ccp_visit_stmt): Visit all assignments.
(execute_ssa_ccp): Add argument 'store_ccp'.
(do_ssa_ccp): New function.
(do_ssa_store_ccp): New function.
(gate_store_ccp): Declare.
* tree-ssa-copy.c (copy_of): Declare.
(do_store_copy_prop): Declare.
(stmt_may_generate_copy): New function.
(need_imm_uses_for): Call it.
(get_first_copy_of): Use copy_of instead of SSA_NAME_VALUE.
(get_last_copy_of): Likewise.
(set_first_copy_of): Likewise.
(dump_copy_of): Likewise.
(copy_prop_visit_assignment): Likewise.
(copy_prop_visit_cond_stmt): Likewise.
(copy_prop_visit_stmt): Likewise.
(init_copy_prop): Likewise.
(fini_copy_prop): Likewise.
* tree-ssa-dom.c (struct opt_stats_d): Add fields
num_const_prop and num_copy_prop.
(tree_ssa_dominator_optimize): Do not execute more than once.
(dump_dominator_optimization_stats): Show constants and copies
propagated.
(cprop_into_successor_phis): Do not replace if NEW == ORIG.
(cprop_operand): Do not replace if OP == VAL.
Do not call get_virtual_var when VAL is a GIMPLE register.
Keep statistics on propagated copies and constants.
* tree-ssa-propagate.c (first_vuse): New function.
(first_vdef): New function.
(stmt_makes_single_load): New function.
(stmt_makes_single_store): New function.
* tree-ssa-propagate.h (first_vuse): Declare.
(first_vdef): Declare.
(stmt_makes_single_load): Declare.
(stmt_makes_single_store): Declare.
* tree-ssa.c (verify_ssa): Call verify_stmts.
* doc/invoke.texi (-fdump-tree-storeccp): Document.
(-ftree-store-ccp): Document.
[-- Attachment #2: 20041014-store-ccp.diff.gz --]
[-- Type: application/x-gzip, Size: 24299 bytes --]
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: [tcb] Improve SSA-CCP to work with aliased stores, structures and arrays
2004-10-15 4:55 [tcb] Improve SSA-CCP to work with aliased stores, structures and arrays Diego Novillo
@ 2004-10-15 19:22 ` Daniel Berlin
0 siblings, 0 replies; 2+ messages in thread
From: Daniel Berlin @ 2004-10-15 19:22 UTC (permalink / raw)
To: Diego Novillo; +Cc: gcc-patches
[-- Attachment #1: Type: TEXT/PLAIN, Size: 2100 bytes --]
On Fri, 15 Oct 2004, Diego Novillo wrote:
>
> Given
>
> ----------------------------------------------------------------------
> struct A a;
> const int B = 42;
>
> void foo (int i)
> {
> if (i > 10)
> a.a = 42;
> else
> {
> a.b = 21;
> a.a = a.b + 21;
> }
>
> if (a.a != B)
> link_error ();
> }
> ----------------------------------------------------------------------
>
> GCC will not fold 'a.a != B' into '0'.
>
> While CCP has all the logic to propagate the constants in this case, it
> is missing the ability to associate constants with stores and loads
> (i.e., pointer dereferences, structures and global/aliased variables).
> We don't keep loads and stores in SSA, but we do build a factored
> use-def web for them (in the virtual operands).
>
> Essentially, the patch extends Brian Booth's work to propagate constants
> in global variables. It associates constant values with the SSA names
> in the V_MAY_DEF/V_MUST_DEF operands. However, this is not enough,
> because we glob partial loads/stores with the base symbol:
>
> # a_5 = V_MAY_DEF <a_4>
> a.a = 2;
>
> # VUSE <a_5>
> x_3 = a.b;
>
> In the example above, CCP will associate value '2' with 'a_5', but it
> would be wrong to replace the load from 'a.b' with '2', because '2' had
> been stored into a.a. So, we also keep track of the memory reference
> expression where the store was done. This may not be necessary when we
> start representing partial loads and stores (Dan?).
It won't be necessary.
Where we don't know, you'll see VUSE<a_5, 0, -1> (IE it's somewhere int he
object) and where we do know, you'll see VUSE <a_5, 2, 4> or whatever.
Also, the immediate uses of <a_5, 2, 4> will be linked up with the right
uses, and vice versa.
The only thing you'll have to watch out for is the "don't know"
definitions of variables, which you can't propagate from (since we don't
know if it really wrote that part), and don't know
uses of variablse, which you can't propagate to.
You can think of V_MAY_DEF <a_5, 2, 4> as V_MUST_DEF <a_5, 2, 4>. We
*know* it defined bytes 2-6 of a_5.
--Dan
[-- Attachment #2: Type: APPLICATION/X-GZIP, Size: 24299 bytes --]
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2004-10-15 19:22 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-10-15 4:55 [tcb] Improve SSA-CCP to work with aliased stores, structures and arrays Diego Novillo
2004-10-15 19:22 ` Daniel Berlin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).