public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Question about simplify_cond_and_lookup_avail_expr
@ 2004-11-24 17:17 Richard Kenner
  2004-11-24 17:41 ` Jeffrey A Law
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Kenner @ 2004-11-24 17:17 UTC (permalink / raw)
  To: law; +Cc: gcc

When looking at prev->low and prev->high, how do we know they've already
been computed?  Isn't the computation done lazily?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Question about simplify_cond_and_lookup_avail_expr
  2004-11-24 17:17 Question about simplify_cond_and_lookup_avail_expr Richard Kenner
@ 2004-11-24 17:41 ` Jeffrey A Law
  2004-11-24 20:03   ` Devang Patel
  2004-11-25  0:20   ` Jeffrey A Law
  0 siblings, 2 replies; 12+ messages in thread
From: Jeffrey A Law @ 2004-11-24 17:41 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc

On Wed, 2004-11-24 at 12:07 -0500, Richard Kenner wrote:
> When looking at prev->low and prev->high, how do we know they've already
> been computed?  Isn't the computation done lazily?
Yes the computation is done lazily, but there will never be more
than one unprocessed vrp element.

1. We always call simplify_cond ... before we walk down its children
   in the dominator tree.  If there is an element on the vrp element
   stack, we process it.

2. When we walk down the true arm's dominator children, we record an
   unprocessed range for the true condition.  When we're done with the
   true arm we pop that range off.

3. Same as 2, but for the false arm.

If we encounter another conditional during step #2 or step #3 we 
start the entire process again with a new conditional.  Thus
we can never have more than 1 unprocessed element on any SSA_NAME's
vrp element stack.

jeff



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Question about simplify_cond_and_lookup_avail_expr
  2004-11-24 17:41 ` Jeffrey A Law
@ 2004-11-24 20:03   ` Devang Patel
  2004-11-25  0:33     ` Jeffrey A Law
  2004-11-25  0:20   ` Jeffrey A Law
  1 sibling, 1 reply; 12+ messages in thread
From: Devang Patel @ 2004-11-24 20:03 UTC (permalink / raw)
  To: law; +Cc: gcc, Richard Kenner

Related question, Why do find_equivalent_equality_comparison() only for 
EQ and NE ?
Thanks,
-
Devang

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Question about simplify_cond_and_lookup_avail_expr
  2004-11-24 17:41 ` Jeffrey A Law
  2004-11-24 20:03   ` Devang Patel
@ 2004-11-25  0:20   ` Jeffrey A Law
  1 sibling, 0 replies; 12+ messages in thread
From: Jeffrey A Law @ 2004-11-25  0:20 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc

On Wed, 2004-11-24 at 10:38 -0700, Jeffrey A Law wrote:
> On Wed, 2004-11-24 at 12:07 -0500, Richard Kenner wrote:
> > When looking at prev->low and prev->high, how do we know they've already
> > been computed?  Isn't the computation done lazily?
> Yes the computation is done lazily, but there will never be more
> than one unprocessed vrp element.
> 
> 1. We always call simplify_cond ... before we walk down its children
>    in the dominator tree.  If there is an element on the vrp element
>    stack, we process it.
> 
> 2. When we walk down the true arm's dominator children, we record an
>    unprocessed range for the true condition.  When we're done with the
>    true arm we pop that range off.
> 
> 3. Same as 2, but for the false arm.
> 
> If we encounter another conditional during step #2 or step #3 we 
> start the entire process again with a new conditional.  Thus
> we can never have more than 1 unprocessed element on any SSA_NAME's
> vrp element stack.
[ Replying to myself.... ]

I'll try to clarify the comments to make it clearer.

jeff


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Question about simplify_cond_and_lookup_avail_expr
  2004-11-24 20:03   ` Devang Patel
@ 2004-11-25  0:33     ` Jeffrey A Law
  2004-11-25  2:43       ` Devang Patel
  0 siblings, 1 reply; 12+ messages in thread
From: Jeffrey A Law @ 2004-11-25  0:33 UTC (permalink / raw)
  To: Devang Patel; +Cc: gcc, Richard Kenner

On Wed, 2004-11-24 at 11:57 -0800, Devang Patel wrote:
> Related question, Why do find_equivalent_equality_comparison() only for 
> EQ and NE ?
I can't think of a good reason offhand.  I might have been
overly paranoid about whether or not we can apply that
tranformation with relational tests.

jeff


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Question about simplify_cond_and_lookup_avail_expr
  2004-11-25  0:33     ` Jeffrey A Law
@ 2004-11-25  2:43       ` Devang Patel
  2004-11-25  9:45         ` Andrew Pinski
  0 siblings, 1 reply; 12+ messages in thread
From: Devang Patel @ 2004-11-25  2:43 UTC (permalink / raw)
  To: law; +Cc: gcc mailing list


On Nov 24, 2004, at 4:14 PM, Jeffrey A Law wrote:

> On Wed, 2004-11-24 at 11:57 -0800, Devang Patel wrote:
>> Related question, Why do find_equivalent_equality_comparison() only 
>> for
>> EQ and NE ?
> I can't think of a good reason offhand.  I might have been
> overly paranoid about whether or not we can apply that
> tranformation with relational tests.

OK. I am playing with following patch. It allows me to remove few 
unnecessary casts that prevents me to auto vectorize SPEC gzip loop. 
With this patch, plus couple of other patches not yet in mainline, I am 
able to vectorize candidate gzip loop.

I have not polished it to be a real patch and full testing is not yet 
complete, but am I on the right track?

Thanks,
-
Devang

         * tree-ssa-dom.c (eliminate_unnecessary_casts): New function.
         (optimize_stmt): Eliminate unnecessary casts.

Index: tree-ssa-dom.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-dom.c,v
retrieving revision 2.65
diff -Idpatel.pbxuser -c -3 -p -r2.65 tree-ssa-dom.c
*** tree-ssa-dom.c      30 Oct 2004 12:15:09 -0000      2.65
--- tree-ssa-dom.c      25 Nov 2004 01:43:49 -0000
*************** cprop_into_stmt (tree stmt)
*** 2914,2919 ****
--- 2917,2995 ----
     return may_have_exposed_new_symbols;
   }

+ /* Remove unnecessary cast.
+    For example,
+    int a;
+    short int s1, s2;
+
+    a = (int) s1;
+    s2 = (short) a;
+ */
+
+ static void
+ eliminate_unnecessary_casts (tree stmt)
+ {
+   tree lhs, rhs;
+ #ifdef ENABLE_CHECKING
+   gcc_assert (stmt);
+ #endif
+
+   if (TREE_CODE (stmt) == COND_EXPR
+       && COMPARISON_CLASS_P (COND_EXPR_COND (stmt)))
+     {
+       tree new_cond = NULL_TREE;
+       tree op0 = TREE_OPERAND (COND_EXPR_COND (stmt), 0);
+       if (op0 && TREE_CODE (op0) == SSA_NAME)
+       new_cond = find_equivalent_equality_comparison (COND_EXPR_COND 
(stmt));
+       if (new_cond)
+       {
+         COND_EXPR_COND (stmt) = new_cond;
+         modify_stmt (stmt);
+         return;
+       }
+     }
+
+   if (TREE_CODE (stmt) != MODIFY_EXPR)
+     return;
+
+   lhs = TREE_OPERAND (stmt, 0);
+   rhs = TREE_OPERAND (stmt, 1);
+
+   if ((TREE_CODE (rhs) == NOP_EXPR
+        || TREE_CODE (rhs) == CONVERT_EXPR)
+       && TREE_CODE (TREE_OPERAND (rhs, 0)) == SSA_NAME)
+     {
+       tree def_stmt, def_rhs, def_lhs;
+
+       def_stmt = SSA_NAME_DEF_STMT (TREE_OPERAND (rhs, 0));
+
+       if (TREE_CODE (def_stmt) == MODIFY_EXPR)
+       {
+         def_lhs = TREE_OPERAND (def_stmt, 0);
+         def_rhs = TREE_OPERAND (def_stmt, 1);
+
+         if ((TREE_CODE (def_rhs) == NOP_EXPR
+              || TREE_CODE (def_rhs) == CONVERT_EXPR)
+             && TREE_CODE (TREE_OPERAND (def_rhs, 0)) == SSA_NAME)
+           {
+             tree rhs_inner_type = TREE_TYPE (TREE_OPERAND (def_rhs, 
0));
+             tree lhs_type = TREE_TYPE (lhs);
+
+             if (TYPE_PRECISION (rhs_inner_type) == TYPE_PRECISION 
(lhs_type)
+                 && TYPE_MAIN_VARIANT (rhs_inner_type) == 
TYPE_MAIN_VARIANT (lhs_type))
+               {
+                 tree expr = VARRAY_TOP_TREE (avail_exprs_stack);
+
+                 TREE_OPERAND (stmt, 1) = TREE_OPERAND (def_rhs, 0);
+                 modify_stmt (stmt);
+
+                 VARRAY_POP (avail_exprs_stack);
+
+               }
+           }
+       }
+     }
+ }

   /* Optimize the statement pointed by iterator SI.

*************** optimize_stmt (struct dom_walk_data *wal
*** 2998,3003 ****
--- 3074,3081 ----
       may_have_exposed_new_symbols
         |= eliminate_redundant_computations (walk_data, stmt, ann);

+   eliminate_unnecessary_casts (stmt);
+
     /* Record any additional equivalences created by this statement.  */
     if (TREE_CODE (stmt) == MODIFY_EXPR)
       record_equivalences_from_stmt (stmt,

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Question about simplify_cond_and_lookup_avail_expr
  2004-11-25  2:43       ` Devang Patel
@ 2004-11-25  9:45         ` Andrew Pinski
  2004-11-29 20:06           ` Devang Patel
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Pinski @ 2004-11-25  9:45 UTC (permalink / raw)
  To: Devang Patel; +Cc: law, gcc mailing list


On Nov 24, 2004, at 9:03 PM, Devang Patel wrote:

>
> On Nov 24, 2004, at 4:14 PM, Jeffrey A Law wrote:
>
>> On Wed, 2004-11-24 at 11:57 -0800, Devang Patel wrote:
>>> Related question, Why do find_equivalent_equality_comparison() only 
>>> for
>>> EQ and NE ?
>> I can't think of a good reason offhand.  I might have been
>> overly paranoid about whether or not we can apply that
>> tranformation with relational tests.
>
> OK. I am playing with following patch. It allows me to remove few 
> unnecessary casts that prevents me to auto vectorize SPEC gzip loop. 
> With this patch, plus couple of other patches not yet in mainline, I 
> am able to vectorize candidate gzip loop.
>
> I have not polished it to be a real patch and full testing is not yet 
> complete, but am I on the right track?

No this is in the wrong place, this should be done by a tree combiner.
I did post an example of how to do a tree combiner a while back.
I also did post a pass which removed redundant casts before, it was
rejected for the same things as I have mentioned above.  If you want
I can post a new tree combiner but it needs some work.  Also fold does
all the work for you already.

Thanks,
Andrew Pinski

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Question about simplify_cond_and_lookup_avail_expr
  2004-11-25  9:45         ` Andrew Pinski
@ 2004-11-29 20:06           ` Devang Patel
  2004-11-29 20:20             ` Steven Bosscher
  2004-11-29 20:33             ` Jeffrey A Law
  0 siblings, 2 replies; 12+ messages in thread
From: Devang Patel @ 2004-11-29 20:06 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: law, gcc mailing list


On Nov 24, 2004, at 6:09 PM, Andrew Pinski wrote:

> No this is in the wrong place, this should be done by a tree combiner.

Tree combiner is mentioned many times but I do not understand its role 
clearly. What is it suppose to do?

-
Devang

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Question about simplify_cond_and_lookup_avail_expr
  2004-11-29 20:06           ` Devang Patel
@ 2004-11-29 20:20             ` Steven Bosscher
  2004-11-29 20:33             ` Jeffrey A Law
  1 sibling, 0 replies; 12+ messages in thread
From: Steven Bosscher @ 2004-11-29 20:20 UTC (permalink / raw)
  To: gcc; +Cc: Devang Patel, Andrew Pinski, law

On Monday 29 November 2004 20:58, Devang Patel wrote:
> On Nov 24, 2004, at 6:09 PM, Andrew Pinski wrote:
> > No this is in the wrong place, this should be done by a tree combiner.
>
> Tree combiner is mentioned many times but I do not understand its role
> clearly. What is it suppose to do?

Much like combine.c, only working on tree-ssa.  Basically it is
a very heavy form of forward propagation.

Gr.
Steven


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Question about simplify_cond_and_lookup_avail_expr
  2004-11-29 20:06           ` Devang Patel
  2004-11-29 20:20             ` Steven Bosscher
@ 2004-11-29 20:33             ` Jeffrey A Law
  2004-11-29 21:21               ` Devang Patel
  1 sibling, 1 reply; 12+ messages in thread
From: Jeffrey A Law @ 2004-11-29 20:33 UTC (permalink / raw)
  To: Devang Patel; +Cc: Andrew Pinski, gcc mailing list

On Mon, 2004-11-29 at 11:58 -0800, Devang Patel wrote:
> On Nov 24, 2004, at 6:09 PM, Andrew Pinski wrote:
> 
> > No this is in the wrong place, this should be done by a tree combiner.
> 
> Tree combiner is mentioned many times but I do not understand its role 
> clearly. What is it suppose to do?
It's basically the same as the RTL combiner.  It would take a series of
dependent expressions combine them into a single expression and try
to simplify it.

There are a few possible outcomes:

  1. The result can be simplified and it simplifies into a gimple
     expression.

  2. The result can be simplified, but it does not simplify into
     a gimple expression.  However, we may be able to gimplify
     the expression and still end up with something that is better
     than the original series of expressions (this is similar to 
     the 3->2 insn combinations in the RTL combiner)

  3. The result simplifies, but gimplification doesn't produce
     a series of statements that is cheaper than the original.

  4. The result does not simplify.

Cases 1 & 2 are wins.  Cases 3 & 4 do not win and we would not change
the IL in those cases.

One trivial example would be the reassociation done by the dominator
optimizer.

  t = a + C1
  res = t + C2

A tree combiner would turn that into

  res = a + C3  /* Where C3 is C1 + C2 computed at compile time */

Or consider a couple typecasts in series:

  t = (type1) a;
  res = (type2) t;
  

Depending on the types of a, type1 and type2, we may be able to simplify
that into

  res = (type2) a;

This happens often enough that I nearly wrote an optimizer to deal
with it explicitly.
 


Everything in the forward propagation pass (and the proposed extensions
to it is) just a specialized form of tree combination.

And the list of goes on and on and on.

jeff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Question about simplify_cond_and_lookup_avail_expr
  2004-11-29 20:33             ` Jeffrey A Law
@ 2004-11-29 21:21               ` Devang Patel
  0 siblings, 0 replies; 12+ messages in thread
From: Devang Patel @ 2004-11-29 21:21 UTC (permalink / raw)
  To: law; +Cc: gcc mailing list, Richard Kenner, Steven Bosscher


On Nov 29, 2004, at 12:13 PM, Jeffrey A Law wrote:

> On Mon, 2004-11-29 at 11:58 -0800, Devang Patel wrote:
>> On Nov 24, 2004, at 6:09 PM, Andrew Pinski wrote:
>>
>>> No this is in the wrong place, this should be done by a tree 
>>> combiner.
>>
>> Tree combiner is mentioned many times but I do not understand its role
>> clearly. What is it suppose to do?
> It's basically the same as the RTL combiner.  It would take a series of
> dependent expressions combine them into a single expression and try
> to simplify it.
>
> There are a few possible outcomes:
>
>   1. The result can be simplified and it simplifies into a gimple
>      expression.
>
>   2. The result can be simplified, but it does not simplify into
>      a gimple expression.  However, we may be able to gimplify
>      the expression and still end up with something that is better
>      than the original series of expressions (this is similar to
>      the 3->2 insn combinations in the RTL combiner)
>
>   3. The result simplifies, but gimplification doesn't produce
>      a series of statements that is cheaper than the original.
>
>   4. The result does not simplify.
>
> Cases 1 & 2 are wins.  Cases 3 & 4 do not win and we would not change
> the IL in those cases.
>
> One trivial example would be the reassociation done by the dominator
> optimizer.
>
>   t = a + C1
>   res = t + C2
>
> A tree combiner would turn that into
>
>   res = a + C3  /* Where C3 is C1 + C2 computed at compile time */
>
> Or consider a couple typecasts in series:
>
>   t = (type1) a;
>   res = (type2) t;
>
>
> Depending on the types of a, type1 and type2, we may be able to 
> simplify
> that into
>
>   res = (type2) a;
>
> This happens often enough that I nearly wrote an optimizer to deal
> with it explicitly.
>
>
>
> Everything in the forward propagation pass (and the proposed extensions
> to it is) just a specialized form of tree combination.
>
> And the list of goes on and on and on.

Thank you guys.
-
Devang

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Question about simplify_cond_and_lookup_avail_expr
@ 2004-11-29 20:14 Richard Kenner
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Kenner @ 2004-11-29 20:14 UTC (permalink / raw)
  To: dpatel; +Cc: gcc

    Tree combiner is mentioned many times but I do not understand its role 
    clearly. What is it suppose to do?

The same as the RTL combiner, but on trees.  ;-)

Basically, the idea (oversimplified) is to try to find a pair of statements
where the output of the first is the input of the second and do a macro
substitution replacing LHS of the first inside the second with the RHS of the
first, then trying to simplify the resulting expression in the hope that
it'll remain GIMPLE.  If so, we've reduced by one the number of statements in
the function.

Of course, the same tricks with respect to "3->2 merging" done in RTL would
also apply here, though the making of PARALLELs would have no analog.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2004-11-29 20:32 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-11-24 17:17 Question about simplify_cond_and_lookup_avail_expr Richard Kenner
2004-11-24 17:41 ` Jeffrey A Law
2004-11-24 20:03   ` Devang Patel
2004-11-25  0:33     ` Jeffrey A Law
2004-11-25  2:43       ` Devang Patel
2004-11-25  9:45         ` Andrew Pinski
2004-11-29 20:06           ` Devang Patel
2004-11-29 20:20             ` Steven Bosscher
2004-11-29 20:33             ` Jeffrey A Law
2004-11-29 21:21               ` Devang Patel
2004-11-25  0:20   ` Jeffrey A Law
2004-11-29 20:14 Richard Kenner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).