* [patch] Remove redundant calls to copy_virtual_operands
@ 2007-08-07 12:10 Ira Rosen
2007-08-07 13:06 ` Dorit Nuzman
2007-08-07 13:22 ` Diego Novillo
0 siblings, 2 replies; 9+ messages in thread
From: Ira Rosen @ 2007-08-07 12:10 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 899 bytes --]
Currently the vectorizer copies virtual operands from scalar memory
operations to the corresponding vector operations (that the vectorizer
creates). This seems to be redundant, since we set memory tags for the
memory operations that we create and call update_ssa at the end of loop
vectorization. This patch removes these redundant calls to
copy_virtual_operands().
Bootstrapped and tested on ppc-linux and x86_64-linux.
O.K. for mainline?
Ira
:ADDPATCH SSA:
ChangeLog entry:
* tree-vect-transform.c (vectorizable_store): Remove call to
copy_virtual_operands() and call mark_symbols_for_renaming() for
the created vector store.
(vect_setup_realignment): Don't call copy_virtual_operands() and
update_vuses_to_preheader().
(vectorizable_load): Don't call copy_virtual_operands().
(update_vuses_to_preheader): Remove.
(See attached file: copy_vops.txt)
[-- Attachment #2: copy_vops.txt --]
[-- Type: text/plain, Size: 5948 bytes --]
--- tree-vect-transform.c 2007-08-06 18:15:44.000000000 +0300
+++ tree-vect-transform.c 2007-08-07 15:12:48.000000000 +0300
@@ -58,7 +58,6 @@ static tree vect_init_vector (tree, tree
static void vect_finish_stmt_generation
(tree stmt, tree vec_stmt, block_stmt_iterator *bsi);
static bool vect_is_simple_cond (tree, loop_vec_info);
-static void update_vuses_to_preheader (tree, struct loop*);
static void vect_create_epilog_for_reduction (tree, tree, enum tree_code, tree);
static tree get_initial_def_for_reduction (tree, tree, tree *);
@@ -3871,8 +3870,6 @@ vectorizable_store (tree stmt, block_stm
enum machine_mode vec_mode;
tree dummy;
enum dr_alignment_support alignment_support_cheme;
- ssa_op_iter iter;
- def_operand_p def_p;
tree def, def_stmt;
enum vect_def_type dt;
stmt_vec_info prev_stmt_info = NULL;
@@ -4089,36 +4086,12 @@ vectorizable_store (tree stmt, block_stm
/* Arguments are ready. Create the new vector stmt. */
new_stmt = build_gimple_modify_stmt (data_ref, vec_oprnd);
vect_finish_stmt_generation (stmt, new_stmt, bsi);
-
- /* Set the VDEFs for the vector pointer. If this virtual def
- has a use outside the loop and a loop peel is performed
- then the def may be renamed by the peel. Mark it for
- renaming so the later use will also be renamed. */
- copy_virtual_operands (new_stmt, next_stmt);
- if (j == 0)
- {
- /* The original store is deleted so the same SSA_NAMEs
- can be used. */
- FOR_EACH_SSA_TREE_OPERAND (def, next_stmt, iter, SSA_OP_VDEF)
- {
- SSA_NAME_DEF_STMT (def) = new_stmt;
- mark_sym_for_renaming (SSA_NAME_VAR (def));
- }
-
- STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt;
- }
+ mark_symbols_for_renaming (new_stmt);
+
+ if (j == 0)
+ STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt;
else
- {
- /* Create new names for all the definitions created by COPY and
- add replacement mappings for each new name. */
- FOR_EACH_SSA_DEF_OPERAND (def_p, new_stmt, iter, SSA_OP_VDEF)
- {
- create_new_def_for (DEF_FROM_PTR (def_p), new_stmt, def_p);
- mark_sym_for_renaming (SSA_NAME_VAR (DEF_FROM_PTR (def_p)));
- }
-
- STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
- }
+ STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
prev_stmt_info = vinfo_for_stmt (new_stmt);
next_stmt = DR_GROUP_NEXT_DR (vinfo_for_stmt (next_stmt));
@@ -4204,8 +4177,6 @@ vect_setup_realignment (tree stmt, block
new_bb = bsi_insert_on_edge_immediate (pe, new_stmt);
gcc_assert (!new_bb);
msq_init = GIMPLE_STMT_OPERAND (new_stmt, 0);
- copy_virtual_operands (new_stmt, stmt);
- update_vuses_to_preheader (new_stmt, loop);
/* 2. Create permutation mask, if required, in loop preheader. */
if (targetm.vectorize.builtin_mask_for_load)
@@ -4780,7 +4751,6 @@ vectorizable_load (tree stmt, block_stmt
new_temp = make_ssa_name (vec_dest, new_stmt);
GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
vect_finish_stmt_generation (stmt, new_stmt, bsi);
- copy_virtual_operands (new_stmt, stmt);
mark_symbols_for_renaming (new_stmt);
/* 3. Handle explicit realignment if necessary/supported. */
@@ -5273,82 +5243,6 @@ vect_generate_tmps_on_preheader (loop_ve
}
-/* Function update_vuses_to_preheader.
-
- Input:
- STMT - a statement with potential VUSEs.
- LOOP - the loop whose preheader will contain STMT.
-
- It's possible to vectorize a loop even though an SSA_NAME from a VUSE
- appears to be defined in a VDEF in another statement in a loop.
- One such case is when the VUSE is at the dereference of a __restricted__
- pointer in a load and the VDEF is at the dereference of a different
- __restricted__ pointer in a store. Vectorization may result in
- copy_virtual_uses being called to copy the problematic VUSE to a new
- statement that is being inserted in the loop preheader. This procedure
- is called to change the SSA_NAME in the new statement's VUSE from the
- SSA_NAME updated in the loop to the related SSA_NAME available on the
- path entering the loop.
-
- When this function is called, we have the following situation:
-
- # vuse <name1>
- S1: vload
- do {
- # name1 = phi < name0 , name2>
-
- # vuse <name1>
- S2: vload
-
- # name2 = vdef <name1>
- S3: vstore
-
- }while...
-
- Stmt S1 was created in the loop preheader block as part of misaligned-load
- handling. This function fixes the name of the vuse of S1 from 'name1' to
- 'name0'. */
-
-static void
-update_vuses_to_preheader (tree stmt, struct loop *loop)
-{
- basic_block header_bb = loop->header;
- edge preheader_e = loop_preheader_edge (loop);
- ssa_op_iter iter;
- use_operand_p use_p;
-
- FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_VUSE)
- {
- tree ssa_name = USE_FROM_PTR (use_p);
- tree def_stmt = SSA_NAME_DEF_STMT (ssa_name);
- tree name_var = SSA_NAME_VAR (ssa_name);
- basic_block bb = bb_for_stmt (def_stmt);
-
- /* For a use before any definitions, def_stmt is a NOP_EXPR. */
- if (!IS_EMPTY_STMT (def_stmt)
- && flow_bb_inside_loop_p (loop, bb))
- {
- /* If the block containing the statement defining the SSA_NAME
- is in the loop then it's necessary to find the definition
- outside the loop using the PHI nodes of the header. */
- tree phi;
- bool updated = false;
-
- for (phi = phi_nodes (header_bb); phi; phi = PHI_CHAIN (phi))
- {
- if (SSA_NAME_VAR (PHI_RESULT (phi)) == name_var)
- {
- SET_USE (use_p, PHI_ARG_DEF (phi, preheader_e->dest_idx));
- updated = true;
- break;
- }
- }
- gcc_assert (updated);
- }
- }
-}
-
-
/* Function vect_update_ivs_after_vectorizer.
"Advance" the induction variables of LOOP to the value they should take
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch] Remove redundant calls to copy_virtual_operands
2007-08-07 12:10 [patch] Remove redundant calls to copy_virtual_operands Ira Rosen
@ 2007-08-07 13:06 ` Dorit Nuzman
2007-08-07 13:23 ` Diego Novillo
2007-08-07 13:22 ` Diego Novillo
1 sibling, 1 reply; 9+ messages in thread
From: Dorit Nuzman @ 2007-08-07 13:06 UTC (permalink / raw)
To: Ira Rosen; +Cc: gcc-patches, dnovillo, dberlin, dvorakz
>
> Currently the vectorizer copies virtual operands from scalar memory
> operations to the corresponding vector operations (that the vectorizer
> creates). This seems to be redundant, since we set memory tags for the
> memory operations that we create and call update_ssa at the end of loop
> vectorization. This patch removes these redundant calls to
> copy_virtual_operands().
>
If I'm not mistaken this patch also fixes an ICE when applied on top of
gcc4.1.1, so not only are these calls apparently redundant, but it seems
that they can also be harmful.
> Bootstrapped and tested on ppc-linux and x86_64-linux.
on x86 it also passed bootstrap with vectorization enabled, right?
> O.K. for mainline?
>
it looks ok to me, but it would be much better if one of the aliasing/vops
experts took a look
thanks,
dorit
> Ira
>
> :ADDPATCH SSA:
>
> ChangeLog entry:
>
> * tree-vect-transform.c (vectorizable_store): Remove call to
> copy_virtual_operands() and call mark_symbols_for_renaming() for
> the created vector store.
> (vect_setup_realignment): Don't call copy_virtual_operands() and
> update_vuses_to_preheader().
> (vectorizable_load): Don't call copy_virtual_operands().
> (update_vuses_to_preheader): Remove.
>
> (See attached file: copy_vops.txt)[attachment "copy_vops.txt"
> deleted by Dorit Nuzman/Haifa/IBM]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch] Remove redundant calls to copy_virtual_operands
2007-08-07 12:10 [patch] Remove redundant calls to copy_virtual_operands Ira Rosen
2007-08-07 13:06 ` Dorit Nuzman
@ 2007-08-07 13:22 ` Diego Novillo
1 sibling, 0 replies; 9+ messages in thread
From: Diego Novillo @ 2007-08-07 13:22 UTC (permalink / raw)
To: Ira Rosen; +Cc: gcc-patches
On 8/7/07 8:08 AM, Ira Rosen wrote:
> * tree-vect-transform.c (vectorizable_store): Remove call to
> copy_virtual_operands() and call mark_symbols_for_renaming() for
> the created vector store.
> (vect_setup_realignment): Don't call copy_virtual_operands() and
> update_vuses_to_preheader().
> (vectorizable_load): Don't call copy_virtual_operands().
> (update_vuses_to_preheader): Remove.
Well, it's certainly safe to do this. However, you may be marking more
symbols for renaming than strictly necessary. Have you compared compile
times? If you are marking more symbols for renaming than before, you
are probably increasing the time needed to update the SSA form.
It may certainly make it easier to maintain, but if the compile times
are a problem, I would suggest going with what you had before. Though
I'd be willing to bet that this won't really affect compile times all
that much.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch] Remove redundant calls to copy_virtual_operands
2007-08-07 13:06 ` Dorit Nuzman
@ 2007-08-07 13:23 ` Diego Novillo
2007-08-08 8:02 ` Ira Rosen
2007-08-08 9:41 ` Zdenek Dvorak
0 siblings, 2 replies; 9+ messages in thread
From: Diego Novillo @ 2007-08-07 13:23 UTC (permalink / raw)
To: Dorit Nuzman; +Cc: Ira Rosen, gcc-patches, dberlin, dvorakz
On 8/7/07 9:07 AM, Dorit Nuzman wrote:
> it looks ok to me, but it would be much better if one of the aliasing/vops
> experts took a look
Yeah, it is OK. As I mentioned earlier, I would only suggest doing
compile-time tests. Though if this fixes ICEs, then it's really better
even if a tad slower. Leaving this to update_ssa is certainly easier to
maintain.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch] Remove redundant calls to copy_virtual_operands
2007-08-07 13:23 ` Diego Novillo
@ 2007-08-08 8:02 ` Ira Rosen
2007-08-08 12:06 ` Diego Novillo
2007-08-08 9:41 ` Zdenek Dvorak
1 sibling, 1 reply; 9+ messages in thread
From: Ira Rosen @ 2007-08-08 8:02 UTC (permalink / raw)
To: Diego Novillo; +Cc: dberlin, Dorit Nuzman, dvorakz, gcc-patches
Diego Novillo <dnovillo@google.com> wrote on 07/08/2007 16:23:08:
> Yeah, it is OK. As I mentioned earlier, I would only suggest doing
> compile-time tests. Though if this fixes ICEs, then it's really better
> even if a tad slower. Leaving this to update_ssa is certainly easier to
> maintain.
As Dorit mentioned, it is indeed a matter of correctness. This patch fixes
bootstrap with vectorization enabled failure on x86_64 and ICE on 4.1.1 if
built with enable-checking.
So, is it OK to apply?
Thanks,
Ira
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch] Remove redundant calls to copy_virtual_operands
2007-08-07 13:23 ` Diego Novillo
2007-08-08 8:02 ` Ira Rosen
@ 2007-08-08 9:41 ` Zdenek Dvorak
2007-08-08 11:04 ` Ira Rosen
1 sibling, 1 reply; 9+ messages in thread
From: Zdenek Dvorak @ 2007-08-08 9:41 UTC (permalink / raw)
To: Diego Novillo; +Cc: Dorit Nuzman, Ira Rosen, gcc-patches, dberlin, dvorakz
Hello,
> On 8/7/07 9:07 AM, Dorit Nuzman wrote:
>
> > it looks ok to me, but it would be much better if one of the aliasing/vops
> > experts took a look
>
> Yeah, it is OK. As I mentioned earlier, I would only suggest doing
> compile-time tests. Though if this fixes ICEs, then it's really better
> even if a tad slower. Leaving this to update_ssa is certainly easier to
> maintain.
on the other hand, the fact that there are ICEs shows that there is some
problem in setting the alias information for the new statements in
vectorizer -- if this were correct, the virtual operands would be
exactly the same as the ones of the original statement; so although I
agree that leaving things to update_ssa is easier to maintain (although
possibly much slower), it would be nice if someone could check what does
the alias information updating code in vectorizer get wrong,
Zdenek
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch] Remove redundant calls to copy_virtual_operands
2007-08-08 9:41 ` Zdenek Dvorak
@ 2007-08-08 11:04 ` Ira Rosen
2007-08-12 19:40 ` Zdenek Dvorak
0 siblings, 1 reply; 9+ messages in thread
From: Ira Rosen @ 2007-08-08 11:04 UTC (permalink / raw)
To: Zdenek Dvorak; +Cc: dberlin, Diego Novillo, Dorit Nuzman, dvorakz, gcc-patches
Zdenek Dvorak <rakdver@kam.mff.cuni.cz> wrote on 08/08/2007 12:41:01:
> on the other hand, the fact that there are ICEs shows that there is some
> problem in setting the alias information for the new statements in
> vectorizer -- if this were correct, the virtual operands would be
> exactly the same as the ones of the original statement; so although I
> agree that leaving things to update_ssa is easier to maintain (although
> possibly much slower), it would be nice if someone could check what does
> the alias information updating code in vectorizer get wrong,
Currently the vectorizer both sets a memory tag and copies virtual
operands. Sometimes, like in the example below, the tag provided by the
vectorizer is more accurate than the alias information we had before, i.e.,
the virtual operands of the scalar memory operations (although I only have
a testcase for this on 4.1.1).
Here is the testcase (for gcc 4.1.1):
#define N 64
short AA[4096], BB[4096];
main()
{
int j;
for (j = 0; j < N; j++){
*(AA+j) = 1.0;
*(BB+j) = 1.0;
}
}
After vectorization and copying vops from scalars we get:
...
# BBD.1944_22 = PHI <BBD.1944_19(2), BBD.1944_14(4)>;
# AAD.1943_21 = PHI <AAD.1943_18(2), AAD.1943_15(4)>;
<L0>:;
...
# AAD.1943_16 = V_MAY_DEF <AAD.1943_21>;
# BBD.1944_17 = V_MAY_DEF <BBD.1944_22>;
*ivtmp.40D.2017_28 = vect_cst_.35D.2012_25;
# AAD.1943_18 = V_MAY_DEF <AAD.1943_16>;
# BBD.1944_19 = V_MAY_DEF <BBD.1944_17>;
*ivtmp.46D.2024_33 = vect_cst_.41D.2019_30;
...
While the memtags the vectorizer sets distinguish between AA and BB.
Therefore, by both setting more accurate tags and copying less accurate
vops, we create an inconsistency.
The ICE we get on gcc 4.1.1 is that ivopts pass rewrites the memory
operations and calls update_stmt() to update vops based on the memtags set
by the vectorizer.
Here is what we get after ivopts pass:
# BBD.1944_22 = PHI <BBD.1944_19(2), BBD.1944_14(0)>;
# AAD.1943_21 = PHI <AAD.1943_18(2), AAD.1943_15(0)>;
<L0>:;
...
# AAD.1943_16 = V_MAY_DEF <AAD.1943_21>;
MEM[base: &AAD.1943, index: D.2034_10]{*ivtmp.40D.2017} =
vect_cst_.35D.2012_25;
...
# BBD.1944_19 = V_MAY_DEF <BBD.1944_17>;
MEM[base: &BBD.1944, index: D.2035_6]{*ivtmp.46D.2024} =
vect_cst_.41D.2019_30;
...
it removes V_MAY_DEF of BB when writing to AA and V_MAY_DEF of AA when
writing to BB. The problem is that the remaining defs are not updated and
verify_ssa fails (there is a use of BBD.1944_17 without a def).
Our assumption was that creating such an inconsistent information in the
vectorizer is incorrect. And that since by setting more accurate memtags we
improve the alias info, we should simply remove the calls to
copy_virtual_operands.
Thanks,
Ira
>
> Zdenek
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch] Remove redundant calls to copy_virtual_operands
2007-08-08 8:02 ` Ira Rosen
@ 2007-08-08 12:06 ` Diego Novillo
0 siblings, 0 replies; 9+ messages in thread
From: Diego Novillo @ 2007-08-08 12:06 UTC (permalink / raw)
To: Ira Rosen; +Cc: dberlin, Dorit Nuzman, dvorakz, gcc-patches
On 8/8/07 4:01 AM, Ira Rosen wrote:
> So, is it OK to apply?
Sure.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch] Remove redundant calls to copy_virtual_operands
2007-08-08 11:04 ` Ira Rosen
@ 2007-08-12 19:40 ` Zdenek Dvorak
0 siblings, 0 replies; 9+ messages in thread
From: Zdenek Dvorak @ 2007-08-12 19:40 UTC (permalink / raw)
To: Ira Rosen; +Cc: dberlin, Diego Novillo, Dorit Nuzman, dvorakz, gcc-patches
Hello,
> > on the other hand, the fact that there are ICEs shows that there is some
> > problem in setting the alias information for the new statements in
> > vectorizer -- if this were correct, the virtual operands would be
> > exactly the same as the ones of the original statement; so although I
> > agree that leaving things to update_ssa is easier to maintain (although
> > possibly much slower), it would be nice if someone could check what does
> > the alias information updating code in vectorizer get wrong,
>
> Currently the vectorizer both sets a memory tag and copies virtual
> operands. Sometimes, like in the example below, the tag provided by the
> vectorizer is more accurate than the alias information we had before, i.e.,
> the virtual operands of the scalar memory operations (although I only have
> a testcase for this on 4.1.1).
OK, the patch is correct, then.
Zdenek
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2007-08-12 19:40 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-08-07 12:10 [patch] Remove redundant calls to copy_virtual_operands Ira Rosen
2007-08-07 13:06 ` Dorit Nuzman
2007-08-07 13:23 ` Diego Novillo
2007-08-08 8:02 ` Ira Rosen
2007-08-08 12:06 ` Diego Novillo
2007-08-08 9:41 ` Zdenek Dvorak
2007-08-08 11:04 ` Ira Rosen
2007-08-12 19:40 ` Zdenek Dvorak
2007-08-07 13:22 ` Diego Novillo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).