public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* [tree-ssa] Out of SSA status and issues
@ 2003-05-12 14:42 Andrew MacLeod
  2003-05-12 15:38 ` Diego Novillo
  2003-05-12 18:57 ` Andrew MacLeod
  0 siblings, 2 replies; 36+ messages in thread
From: Andrew MacLeod @ 2003-05-12 14:42 UTC (permalink / raw)
  To: gcc mailing list



Most of the out of ssa pass is ready to go, except for a couple of
issues. One is large, one is not.

First, the smaller issue. When overlapping live ranges are allowed, we
will sometimes have to issue copies on edges in order to satify PHI
assignments. Since it is impossible to issue a copy on an abnormal
critical edge, we must coalesce all the variables which could
potentially cause a copy across one of these edges. This prevents the
need for a copy on that edge.

The worst case scenario happens when you have 2 abnormal critical edges
feeding a PHI node in which the 2 values coming in conflict. This means
you *have* to issue a copy on one of the edges, and that can't be done.

Copy propagation can currently cause this to happen in C++ code. EH
generates lots of abnormal critical edges, and copy propagation can
cause this as the following example shows:

void f();

int val();
void g();

int *q;
int i;
void
a()
{
 int *p;

  p = &i;
  try {
    f();
    *p++ = val();
  }
  catch (...)
    {
    }

   *p =  20;
}

With copy propagation we end up with:

   #   (*T.2)_10 = VDEF <(*T.2)_7>;
    #   .GLOBAL_VAR_11 = VDEF <.GLOBAL_VAR_8>;
    #   VUSE <(*T.2)_7>;
    #   VUSE <.GLOBAL_VAR_8>;
    p_9 = &i;
    try
      {

        # BLOCK 1 (a.C:16).  PRED: 0.  SUCC: 2 3.
        {

          #   .GLOBAL_VAR_12 = VDEF <.GLOBAL_VAR_11>;
          f ();

          # BLOCK 2.  PRED: 1.  SUCC: 7 3.
          (void)0;

          #   (*T.2)_15 = VDEF <(*T.2)_10>;
          #   .GLOBAL_VAR_16 = VDEF <.GLOBAL_VAR_12>;
          p_14 = p_9 + 4B;

          #   (*T.2)_18 = VDEF <(*T.2)_15>;
          #   .GLOBAL_VAR_19 = VDEF <.GLOBAL_VAR_16>;
          T.2_17 = p_9;

          #   .GLOBAL_VAR_20 = VDEF <.GLOBAL_VAR_19>;
          #   (*T.2)_21 = VDEF <(*T.2)_18>;
          #   VUSE <T.2_17>;
          *T.2 = val ()
        }
      }
    catch
      {

        # BLOCK 3 (a.C:20).  PRED: 2 1.  SUCC: 7 4.
        #   p_1 = PHI <p_9(1), p_14(2)>;
        #   .GLOBAL_VAR_3 = PHI <.GLOBAL_VAR_12(1), .GLOBAL_VAR_20(2)>;
        #   (*T.2)_5 = PHI <(*T.2)_10(1), (*T.2)_21(2)>;
        catch ()
          {

            # BLOCK 4 (a.C:20).  PRED: 3.  SUCC: 7 5.
            {

              #   .GLOBAL_VAR_22 = VDEF <.GLOBAL_VAR_3>;
              __cxa_begin_catch (<<<exception object>>>);
              try
                {

                  # BLOCK 5 (a.C:21).  PRED: 4.  SUCC: 6 7.
                  {
                    (void)0
                  }
                }
              finally
                {

                  # BLOCK 6 (a.C:20).  PRED: 5.  SUCC: 7.

                  #   .GLOBAL_VAR_23 = VDEF <.GLOBAL_VAR_22>;
                  __cxa_end_catch ()
                }
            }
          }
      };

    # BLOCK 7 (a.C:24).  PRED: 6 5 4 3 2 0.  SUCC: -2.
    #   p_2 = PHI <p_9(0), p_14(2), p_1(3), p_1(4), p_1(5), p_1(6)>;
    #   .GLOBAL_VAR_4 = PHI <.GLOBAL_VAR_11(0), .GLOBAL_VAR_20(2),
.GLOBAL_VAR_3(3), .GLOBAL_VAR_22(4), .GLOBAL_VAR_22(5),
.GLOBAL_VAR_23(6)>;
    #   (*T.2)_6 = PHI <(*T.2)_10(0), (*T.2)_21(2), (*T.2)_5(3),
(*T.2)_5(4), (*T.2)_5(5), (*T.2)_5(6)>;

    #   (*T.2)_24 = VDEF <(*T.2)_6>;
    #   .GLOBAL_VAR_25 = VDEF <.GLOBAL_VAR_4>;
    #   VUSE <p_2>;
    *p = 20
  }
}

p_9 and p_14 overlap in block 2, and both are elements of a PHI node in
block 3. Both come into the PHI node on abnormal critical edges (block 3
is the start of the catch), and cannot occupy the same memory location,
so we can't coalesce them .

This occurs in libstdc++, and prevent us from compiling the library when
we enable overlapping live ranges.

So this problem needs to be resolved before we can turn on overlapping
ranges. We must not propagate copies which are going to cause conflicts
with other registers that are used across abnormal critical edges.  

                -------------------------

The second, and more serious problem, has to do with the relationship
between pointers and dereferenced pointers in our SSA implementation.

Given something simple, say:

int a[100];

int
b() 
{
  int *p= a;
  int y;

  *p = 20;
  p += 10;
  y = *p;
  if (y > 20)
    *p++ = 30;

  return *p;

}

we see something like:

  #   (*p)_5 = VDEF <(*p)_3>;
  #   VUSE <(*p)_3>;
  p_4 = &a;
  
  #   (*p)_6 = VDEF <(*p)_5>;
  #   VUSE <p_4>;
  *p = 20;
  
  #   (*p)_8 = VDEF <(*p)_6>;
  p_7 = p_4 + 40B;
  
  #   VUSE <(*p)_8>;
  #   VUSE <p_7>;
  y_9 = *p;
  if (y_9 > 20)
    {
      
      #   (*p)_10 = VDEF <(*p)_8>;
      #   VUSE <p_7>;
      *p = 30;
      
      #   (*p)_12 = VDEF <(*p)_10>;
      p_11 = p_7 + 4B
    }


if none of the versions of p overlap, then p_1, p_4, p_7 and p_11 all
coalesce together, and are assigned to 'p'. And this program works. 

If one or more of them can't be coalesced toegther, we need to create a
new variable to represent the ones which overlap. Anywhere where that
pointer is used, we have to replace the 'p' with the new variable.

Our SSA representation doesn't include these in the operands of a stmt. 
Take for example:

  #   VUSE <(*p)_8>;
  #   VUSE <p_7>;
  y_9 = *p;

the dereference of p actually uses p_7, but the use is a virtual use
instead of a real use.  If p_7 didn't coalesce with 'p', it would be
asigned to a new variable, say P.33. This stmt would  then need to be
rewritten as:

  y = *p.33

But since the use of p_7 is virtual instead of a real use, we don't see
this when we go to rewrite.

I have created a hack which Ive been using which will look for
derefernces of variables and try to match them up with a VUSE, and if
the VUSE has been coalesced to a different variable, then it rewrites
the variable in the stmt. So it will take care of the above situation.

However, its not flawless. There are 2 situations which can occur under
which it fails. If 2 different versions of p, say p_66 and p_77 are both
dereferenced on the same stmt, there is no way to know which one needs
to be rewritten, all we know is one of them needs it.

The second case occurs when a derefernce is copy propagated into a PHI
node:
 
   if (T.1_5 != 0B)
    {

      #   VUSE <(*map)_3>;
      #   VUSE <map_4>;
      T.2_8 = map->compact_to_partition;
      i.3_9 = (unsigned int)i_6;
      T.4_10 = i.3_9 * 4;
      T.5_11 = (int *)T.4_10;

      #   (*T.6)_13 = VDEF <(*T.6)_7>;
      T.6_12 = T.2_8 + T.5_11;

      #   VUSE <T.6_12>;
      i_14 = (*T.6)_13
    };
  #   i_1 = PHI <i_6(0), (*T.6)_13(1)>;
  #   (*T.6)_2 = PHI <(*T.6)_7(0), (*T.6)_13(1)>;

  #   (*T.7)_17 = VDEF <(*T.7)_16>;
  #   VUSE <(*map)_3>;
  #   VUSE <map_4>;
     
The value of i_14 has been propagated into the PHI node. DCE the deletes
the stmt 
   i_14 = (*T.6)_13

When we go to rewrite this, all we know is that its a derefernce of T.6.
There is no VUSE now to look at to figure out what the correct pointer
it.  The original def has it as T.6_12. The information could be found
by looking for the def of (*T.6)_13 (which is virtual), and looking at
the real def, which is T.6_12. I am about to try that in my hack and see
if it works.


The real solution to this is to use the real pointer in the stmt, and
make it a a real use instead of a virtual use.  Diego is currently
investigating this. Then rewrite would simply be replacing p_7 with the
correct variable and not need to go hunting through virtual operands to 
see if it needs to do something or not.

The orignal program example I gave would then look something like:

  #   (*p_4)_5 = VDEF <(*p)_3>;
  #   VUSE <(*p)_3>;
  p_4 = &a;
  
  #   (*p_4)_6 = VDEF <(*p_4)_5>;
  #   VUSE <p_4>;
  *p_4 = 20;
  
  #   (*p_7)_8 = VDEF <(*p_4)_6>;
  p_7 = p_4 + 40B;
  
  #   VUSE <(*p_7)_8>;
  y_9 = *p_7;
  if (y_9 > 20)
    {
      
      #   (*p_7)_10 = VDEF <(*p_7)_8>;
      *p_7 = 30;
      
      #   (*p_11)_12 = VDEF <(*p_7)_10>;
      p_11 = p_7 + 4B
    }
 

Once we resolve these issues, we should be able to at least attempt to
turn on overlapping live ranges, and give ourselves lots of other bugs
to look at :-)

Andrew


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-12 14:42 [tree-ssa] Out of SSA status and issues Andrew MacLeod
@ 2003-05-12 15:38 ` Diego Novillo
  2003-05-12 15:57   ` Andrew MacLeod
  2003-05-12 18:57 ` Andrew MacLeod
  1 sibling, 1 reply; 36+ messages in thread
From: Diego Novillo @ 2003-05-12 15:38 UTC (permalink / raw)
  To: Andrew MacLeod; +Cc: gcc mailing list

On Mon, May 12, 2003 at 10:42:43AM -0400, Andrew MacLeod wrote:

> So this problem needs to be resolved before we can turn on overlapping
> ranges. We must not propagate copies which are going to cause conflicts
> with other registers that are used across abnormal critical edges.  
> 
Perhaps we could just teach copy-prop not to use critical edges
for the time being.  Or, to avoid pessimizing too much, we could
try and use the same check we have now for overlapping live
ranges in the SSA renamer.  I'm not sure if it's feasible,
though.

> The real solution to this is to use the real pointer in the stmt, and
> make it a a real use instead of a virtual use.  Diego is currently
> investigating this. Then rewrite would simply be replacing p_7 with the
> correct variable and not need to go hunting through virtual operands to 
> see if it needs to do something or not.
> 
Yes, I've got a partial patch for this.  The main problem that I
need to fix now is the fact that INDIRECT_REF nodes are shared.
So, when rename the base pointer of an INDIRECT_REF variable, we
automatically rename them all.

We can't just unshare the INDIRECT_REF nodes.  The solution needs
to share all the nodes that have the same pointer (i.e., all the
(*p_3)_n SSA names need to have the same variable *p_3 as their
SSA_NAME_VAR).  Once I fix that, it should all just work (FLW).


> Once we resolve these issues, we should be able to at least attempt to
> turn on overlapping live ranges, and give ourselves lots of other bugs
> to look at :-)
> 
Indeed :)


Diego.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-12 15:38 ` Diego Novillo
@ 2003-05-12 15:57   ` Andrew MacLeod
  2003-05-12 16:05     ` Michael Matz
  2003-05-12 17:08     ` law
  0 siblings, 2 replies; 36+ messages in thread
From: Andrew MacLeod @ 2003-05-12 15:57 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc mailing list

On Mon, 2003-05-12 at 11:38, Diego Novillo wrote:
> On Mon, May 12, 2003 at 10:42:43AM -0400, Andrew MacLeod wrote:
> 
> > So this problem needs to be resolved before we can turn on overlapping
> > ranges. We must not propagate copies which are going to cause conflicts
> > with other registers that are used across abnormal critical edges.  
> > 
> Perhaps we could just teach copy-prop not to use critical edges
> for the time being.  Or, to avoid pessimizing too much, we could
> try and use the same check we have now for overlapping live
> ranges in the SSA renamer.  I'm not sure if it's feasible,
> though.
> 

Unfortunately, it not propagating over a critical edge thats the
problem. Its propagating a copy past another use to form an overlapping
live range where both of those values are used in a PHI node at some
point across abnormal critical edges.

I beleive the bvrute force approach is to flag any variables which are
elements of PHI's acroiss an abnormal ciroitcal edge, and never copy
propagate them  ie, so in the example, when you look at the PHI, you see
p_9 and p_14 are used across abnormal critical edges, so you stick them
in a list of variables never to copy propagate.

I dont know that another optimiation might not do something which
results in the same condition or not. Time will tell.

On another note, shouldn't the into-ssa-pass's copy propagation only be
turned on when -ftree-copyprop is specified?  I think we do it all the
time right now... thus my problem in libstdc++ even without
-ftree-copyprop...


> We can't just unshare the INDIRECT_REF nodes.  The solution needs
> to share all the nodes that have the same pointer (i.e., all the
> (*p_3)_n SSA names need to have the same variable *p_3 as their
> SSA_NAME_VAR).  Once I fix that, it should all just work (FLW).
> 

Thats the working theory anwyay :-)

Andrew

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-12 15:57   ` Andrew MacLeod
@ 2003-05-12 16:05     ` Michael Matz
  2003-05-12 16:10       ` Andrew MacLeod
  2003-05-12 16:16       ` law
  2003-05-12 17:08     ` law
  1 sibling, 2 replies; 36+ messages in thread
From: Michael Matz @ 2003-05-12 16:05 UTC (permalink / raw)
  To: Andrew MacLeod; +Cc: Diego Novillo, gcc mailing list

Hi,

On 12 May 2003, Andrew MacLeod wrote:

> I beleive the bvrute force approach is to flag any variables which are
> elements of PHI's acroiss an abnormal ciroitcal edge, and never copy
> propagate them  ie, so in the example, when you look at the PHI, you see
> p_9 and p_14 are used across abnormal critical edges, so you stick them
> in a list of variables never to copy propagate.

A better heuristic is to start with ignoring abnormal critical edges.
Then _if_ there are copies inserted on such edges, mark the involved
variables as never-propagate.  In that case rerun the pass.  Yes, this
sometimes is slower, but produces better code (i.e. something or -O3
depending on how slow it actually is).


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-12 16:05     ` Michael Matz
@ 2003-05-12 16:10       ` Andrew MacLeod
  2003-05-12 16:16       ` law
  1 sibling, 0 replies; 36+ messages in thread
From: Andrew MacLeod @ 2003-05-12 16:10 UTC (permalink / raw)
  To: Michael Matz; +Cc: Diego Novillo, gcc mailing list

On Mon, 2003-05-12 at 12:04, Michael Matz wrote:
> Hi,
> 
> On 12 May 2003, Andrew MacLeod wrote:
> 
> > I beleive the bvrute force approach is to flag any variables which are
> > elements of PHI's acroiss an abnormal ciroitcal edge, and never copy
> > propagate them  ie, so in the example, when you look at the PHI, you see
> > p_9 and p_14 are used across abnormal critical edges, so you stick them
> > in a list of variables never to copy propagate.
> 
> A better heuristic is to start with ignoring abnormal critical edges.
> Then _if_ there are copies inserted on such edges, mark the involved
> variables as never-propagate.  In that case rerun the pass.  Yes, this
> sometimes is slower, but produces better code (i.e. something or -O3
> depending on how slow it actually is).
> 

Unfortunately, we don't know which edges we are going to need to insert
copies on until we are exiting SSA form and have built the interference
graph et all. So by that point, its too late to make a decision not to
do something in an earlier pass.

Andrew

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-12 16:05     ` Michael Matz
  2003-05-12 16:10       ` Andrew MacLeod
@ 2003-05-12 16:16       ` law
  1 sibling, 0 replies; 36+ messages in thread
From: law @ 2003-05-12 16:16 UTC (permalink / raw)
  To: Michael Matz; +Cc: Andrew MacLeod, Diego Novillo, gcc mailing list

In message <Pine.LNX.4.44.0305121802110.15564-100000@wotan.suse.de>, Michael Ma
tz writes:
 >Hi,
 >
 >On 12 May 2003, Andrew MacLeod wrote:
 >
 >> I beleive the bvrute force approach is to flag any variables which are
 >> elements of PHI's acroiss an abnormal ciroitcal edge, and never copy
 >> propagate them  ie, so in the example, when you look at the PHI, you see
 >> p_9 and p_14 are used across abnormal critical edges, so you stick them
 >> in a list of variables never to copy propagate.
 >
 >A better heuristic is to start with ignoring abnormal critical edges.
 >Then _if_ there are copies inserted on such edges, mark the involved
 >variables as never-propagate.  In that case rerun the pass.  Yes, this
 >sometimes is slower, but produces better code (i.e. something or -O3
 >depending on how slow it actually is).
This heuristic won't work for Andrew's problem since we don't know if
things are "mucked up" until we're about to leave SSA form, at which
point you can't easily go back and undo the copy propagation.

This kind of heuristic does work for things like PRE/LCM though.

jeff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-12 15:57   ` Andrew MacLeod
  2003-05-12 16:05     ` Michael Matz
@ 2003-05-12 17:08     ` law
  2003-05-12 17:12       ` Andrew MacLeod
  1 sibling, 1 reply; 36+ messages in thread
From: law @ 2003-05-12 17:08 UTC (permalink / raw)
  To: Andrew MacLeod; +Cc: Diego Novillo, gcc mailing list

In message <1052755028.2743.368.camel@p4>, Andrew MacLeod writes:
 >On another note, shouldn't the into-ssa-pass's copy propagation only be
 >turned on when -ftree-copyprop is specified?  I think we do it all the
 >time right now... thus my problem in libstdc++ even without
 >-ftree-copyprop...
Probably...


FWIW, I dug up my changes to identify the problem variables (those
occurring in abnormal PHIs) and cobbled together the two lines of
code necessary to use that information to avoid copy propagating
those variables during SSA renaming.

Interested in playing with it to see if it resolves your problems?


jeff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-12 17:08     ` law
@ 2003-05-12 17:12       ` Andrew MacLeod
  2003-05-12 17:26         ` law
  0 siblings, 1 reply; 36+ messages in thread
From: Andrew MacLeod @ 2003-05-12 17:12 UTC (permalink / raw)
  To: Jeff Law; +Cc: Diego Novillo, gcc mailing list

On Mon, 2003-05-12 at 13:08, law@redhat.com wrote:
> In message <1052755028.2743.368.camel@p4>, Andrew MacLeod writes:
>  >On another note, shouldn't the into-ssa-pass's copy propagation only be
>  >turned on when -ftree-copyprop is specified?  I think we do it all the
>  >time right now... thus my problem in libstdc++ even without
>  >-ftree-copyprop...
> Probably...
> 
> 
> FWIW, I dug up my changes to identify the problem variables (those
> occurring in abnormal PHIs) and cobbled together the two lines of
> code necessary to use that information to avoid copy propagating
> those variables during SSA renaming.
> 
> Interested in playing with it to see if it resolves your problems?
> 
Absolutely :-)

Andrew


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-12 17:12       ` Andrew MacLeod
@ 2003-05-12 17:26         ` law
  0 siblings, 0 replies; 36+ messages in thread
From: law @ 2003-05-12 17:26 UTC (permalink / raw)
  To: Andrew MacLeod; +Cc: Diego Novillo, gcc mailing list

In message <1052759521.3373.0.camel@p4>, Andrew MacLeod writes:
 >> FWIW, I dug up my changes to identify the problem variables (those
 >> occurring in abnormal PHIs) and cobbled together the two lines of
 >> code necessary to use that information to avoid copy propagating
 >> those variables during SSA renaming.
 >> 
 >> Interested in playing with it to see if it resolves your problems?
 >> 
 >Absolutely :-)

Here you go.  Feel free to check it in if it resolves the problems to
your satisfaction.


	* tree-flow.h (struct var_ann_d): New field occurs_in_abnormal_phi.
	* tree-ssa.c (MAY_COPYPROP_P): Do not allow copy propagations
	if either argument occurs in an abnormal phi.
	* tree-dfa.c (add_phi_arg): Set occurs_in_abrnomal_phi as needed.
	* tree-ssa-copyprop.c (copyprop_stmt): Do not allow copy
	propagations if either argument occurs in an abnormal phi.
	(copyprop_phi): Likewise.
	

Index: tree-flow.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Attic/tree-flow.h,v
retrieving revision 1.1.4.77
diff -c -3 -p -r1.1.4.77 tree-flow.h
*** tree-flow.h	8 May 2003 20:29:29 -0000	1.1.4.77
--- tree-flow.h	12 May 2003 17:18:10 -0000
*************** struct var_ann_d GTY(())
*** 93,100 ****
    /* Used when building root_var structures in tree_ssa_live.[ch].  */
    unsigned root_var_processed : 1;
  
    /* Unused bits.  */
!   unsigned unused : 24;
  
    /* An INDIRECT_REF expression representing all the dereferences of this
       pointer.  Used to store aliasing information for pointer dereferences
--- 93,103 ----
    /* Used when building root_var structures in tree_ssa_live.[ch].  */
    unsigned root_var_processed : 1;
  
+   /* Nonzero if the variable occurs in an abnormal PHI.  */
+   unsigned occurs_in_abnormal_phi : 1;
+ 
    /* Unused bits.  */
!   unsigned unused : 23;
  
    /* An INDIRECT_REF expression representing all the dereferences of this
       pointer.  Used to store aliasing information for pointer dereferences
Index: tree-ssa.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Attic/tree-ssa.c,v
retrieving revision 1.1.4.75
diff -c -3 -p -r1.1.4.75 tree-ssa.c
*** tree-ssa.c	8 May 2003 20:29:30 -0000	1.1.4.75
--- tree-ssa.c	12 May 2003 17:19:20 -0000
*************** static bool var_is_live			PARAMS ((tree,
*** 226,232 ****
  	   && TREE_CODE (SSA_NAME_VAR (RHS)) == INDIRECT_REF)		\
       /* FIXME.  For now, don't propagate pointers if they haven't been	\
          dereferenced (see update_indirect_ref_vuses).  */		\
!      && (!POINTER_TYPE_P (TREE_TYPE (RHS)) || indirect_ref (RHS)))
  
  
  /* Main entry point to the SSA builder.  FNDECL is the gimplified function
--- 226,234 ----
  	   && TREE_CODE (SSA_NAME_VAR (RHS)) == INDIRECT_REF)		\
       /* FIXME.  For now, don't propagate pointers if they haven't been	\
          dereferenced (see update_indirect_ref_vuses).  */		\
!      && (!POINTER_TYPE_P (TREE_TYPE (RHS)) || indirect_ref (RHS))	\
!      && ! var_ann (SSA_NAME_VAR (LHS))->occurs_in_abnormal_phi		\
!      && ! var_ann (SSA_NAME_VAR (RHS))->occurs_in_abnormal_phi)
  
  
  /* Main entry point to the SSA builder.  FNDECL is the gimplified function
Index: tree-dfa.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Attic/tree-dfa.c,v
retrieving revision 1.1.4.111
diff -c -3 -p -r1.1.4.111 tree-dfa.c
*** tree-dfa.c	8 May 2003 20:29:29 -0000	1.1.4.111
--- tree-dfa.c	12 May 2003 17:19:22 -0000
*************** add_phi_arg (phi, def, e)
*** 937,942 ****
--- 937,951 ----
    if (i >= PHI_ARG_CAPACITY (phi))
      abort ();
  #endif
+ 
+   /* Copy propagation needs to know what object occur in abnormal
+      PHI nodes.  This is a convenient place to record such information.  */
+   if (e->flags & EDGE_ABNORMAL)
+     {
+       var_ann (def)->occurs_in_abnormal_phi = 1;
+       var_ann (PHI_RESULT (phi))->occurs_in_abnormal_phi = 1;
+     }
+ 
    PHI_ARG_DEF (phi, i) = def;
    PHI_ARG_EDGE (phi, i) = e;
    PHI_NUM_ARGS (phi)++;
Index: tree-ssa-copyprop.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Attic/tree-ssa-copyprop.c,v
retrieving revision 1.1.2.1
diff -c -3 -p -r1.1.2.1 tree-ssa-copyprop.c
*** tree-ssa-copyprop.c	1 Mar 2003 00:09:07 -0000	1.1.2.1
--- tree-ssa-copyprop.c	12 May 2003 17:19:22 -0000
*************** copyprop_stmt (stmt)
*** 114,120 ****
        tree *use_p = (tree *) VARRAY_GENERIC_PTR (uses, i);
        tree orig = get_original (*use_p, &vuse);
  
!       if (orig)
  	{
  	  if (dump_file && (dump_flags & TDF_DETAILS))
  	    {
--- 114,122 ----
        tree *use_p = (tree *) VARRAY_GENERIC_PTR (uses, i);
        tree orig = get_original (*use_p, &vuse);
  
!       if (orig
! 	  && ! var_ann (SSA_NAME_VAR (*use_p))->occurs_in_abnormal_phi
! 	  && ! var_ann (SSA_NAME_VAR (orig))->occurs_in_abnormal_phi)
  	{
  	  if (dump_file && (dump_flags & TDF_DETAILS))
  	    {
*************** copyprop_phi (phi)
*** 159,165 ****
        tree arg = PHI_ARG_DEF (phi, i);
        tree orig = get_original (arg, &vuse);
  
!       if (orig)
  	{
  	  if (dump_file && dump_flags & TDF_DETAILS)
  	    {
--- 161,169 ----
        tree arg = PHI_ARG_DEF (phi, i);
        tree orig = get_original (arg, &vuse);
  
!       if (orig
! 	  && ! var_ann (SSA_NAME_VAR (arg))->occurs_in_abnormal_phi
! 	  && ! var_ann (SSA_NAME_VAR (orig))->occurs_in_abnormal_phi)
  	{
  	  if (dump_file && dump_flags & TDF_DETAILS)
  	    {



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-12 14:42 [tree-ssa] Out of SSA status and issues Andrew MacLeod
  2003-05-12 15:38 ` Diego Novillo
@ 2003-05-12 18:57 ` Andrew MacLeod
  2003-05-13  9:07   ` Michael Matz
  2003-05-13 12:33   ` Diego Novillo
  1 sibling, 2 replies; 36+ messages in thread
From: Andrew MacLeod @ 2003-05-12 18:57 UTC (permalink / raw)
  To: Andrew MacLeod; +Cc: gcc mailing list

On Mon, 2003-05-12 at 10:42, Andrew MacLeod wrote:

> 
> The second case occurs when a derefernce is copy propagated into a PHI
> node:
>  
>    if (T.1_5 != 0B)
>     {
> 
>       #   VUSE <(*map)_3>;
>       #   VUSE <map_4>;
>       T.2_8 = map->compact_to_partition;
>       i.3_9 = (unsigned int)i_6;
>       T.4_10 = i.3_9 * 4;
>       T.5_11 = (int *)T.4_10;
> 
>       #   (*T.6)_13 = VDEF <(*T.6)_7>;
>       T.6_12 = T.2_8 + T.5_11;
> 
>       #   VUSE <T.6_12>;
>       i_14 = (*T.6)_13
>     };
>   #   i_1 = PHI <i_6(0), (*T.6)_13(1)>;
>   #   (*T.6)_2 = PHI <(*T.6)_7(0), (*T.6)_13(1)>;
> 
>   #   (*T.7)_17 = VDEF <(*T.7)_16>;
>   #   VUSE <(*map)_3>;
>   #   VUSE <map_4>;
>      
> The value of i_14 has been propagated into the PHI node. DCE the deletes
> the stmt 
>    i_14 = (*T.6)_13
> 
> When we go to rewrite this, all we know is that its a derefernce of T.6.
> There is no VUSE now to look at to figure out what the correct pointer
> it.  The original def has it as T.6_12. The information could be found
> by looking for the def of (*T.6)_13 (which is virtual), and looking at
> the real def, which is T.6_12. I am about to try that in my hack and see
> if it works.
>

Actually, is propagating this copy a safe thing to do? 

Copy propagation simply looks at the def, and if the stmt is a copy,
copies it... So if there was a store after the definition of i_14 which
killed the memory location that T.6_12 points to, then the PHI is going
to get the wrong result...  isn't it?

ie it would look something like this hacked up example:
 
      #   (*T.6)_13 = VDEF <(*T.6)_7>;
      T.6_12 = T.2_8 + T.5_11;

      #   VUSE <T.6_12>;
      i_14 = (*T.6)_13

      #   (*T.6)_22 = VDEF <(*T.6)_13>      
      #   VUSE <T.6_12>
      *T.6 = 30;
    };
  #   i_1 = PHI <i_6(0), (*T.6)_13(1)>;

Or is there a reason that copyprop would never happen?
Should we never copyprop an indirect reference?

Andrew

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-12 18:57 ` Andrew MacLeod
@ 2003-05-13  9:07   ` Michael Matz
  2003-05-13 12:42     ` Diego Novillo
  2003-05-13 12:33   ` Diego Novillo
  1 sibling, 1 reply; 36+ messages in thread
From: Michael Matz @ 2003-05-13  9:07 UTC (permalink / raw)
  To: Andrew MacLeod; +Cc: gcc mailing list

Hi,

On 12 May 2003, Andrew MacLeod wrote:

> Actually, is propagating this copy a safe thing to do?
>
> Copy propagation simply looks at the def, and if the stmt is a copy,
> copies it... So if there was a store after the definition of i_14 which
> killed the memory location that T.6_12 points to, then the PHI is going
> to get the wrong result...  isn't it?

An indirect reference is not a copy.  I don't know if tree-ssa thinks it
is, but it definitely shouldn't.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-12 18:57 ` Andrew MacLeod
  2003-05-13  9:07   ` Michael Matz
@ 2003-05-13 12:33   ` Diego Novillo
  2003-05-13 12:49     ` Andrew MacLeod
  1 sibling, 1 reply; 36+ messages in thread
From: Diego Novillo @ 2003-05-13 12:33 UTC (permalink / raw)
  To: Andrew Macleod; +Cc: gcc mailing list

On Mon, 2003-05-12 at 14:45, Andrew MacLeod wrote:

> ie it would look something like this hacked up example:
>  
>       #   (*T.6)_13 = VDEF <(*T.6)_7>;
>       T.6_12 = T.2_8 + T.5_11;
> 
>       #   VUSE <T.6_12>;
>       i_14 = (*T.6)_13
> 
>       #   (*T.6)_22 = VDEF <(*T.6)_13>      
>       #   VUSE <T.6_12>
>       *T.6 = 30;
>     };
>   #   i_1 = PHI <i_6(0), (*T.6)_13(1)>;
> 
> Or is there a reason that copyprop would never happen?
>
Correctness wise?  No, there's no reason.  We can safely do the
propagation in this case.

But I wouldn't think it's efficient.  Pointer dereferences are bound to
be slower than a straight scalar reference.  The copy propagator in the
SSA renamer blocks propagation of INDIRECT_REF nodes.  The stand-alone
copy propagator should probably do the same.


Diego.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13  9:07   ` Michael Matz
@ 2003-05-13 12:42     ` Diego Novillo
  2003-05-13 12:50       ` Andrew MacLeod
  2003-05-13 12:57       ` Michael Matz
  0 siblings, 2 replies; 36+ messages in thread
From: Diego Novillo @ 2003-05-13 12:42 UTC (permalink / raw)
  To: Michael Matz; +Cc: Andrew Macleod, gcc mailing list

On Tue, 2003-05-13 at 05:07, Michael Matz wrote:

> An indirect reference is not a copy.  I don't know if tree-ssa thinks it
> is, but it definitely shouldn't.
> 
Why?

     1. foo()
     2. {
     3.   int i, *p;
     4. 
     5.   p = malloc();
     6.   i = *p;
     7.   return i + 9;
     8. }

I see nothing wrong in replacing 'i + 9' with '*p + 9'.  It would
probably not be efficient, but I can't see it being wrong.

Not that tree-ssa will do anything with this code, the default
type-based aliasing is too conservative, but PTA may disambiguate this.


Diego.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 12:33   ` Diego Novillo
@ 2003-05-13 12:49     ` Andrew MacLeod
  2003-05-13 12:58       ` Diego Novillo
  0 siblings, 1 reply; 36+ messages in thread
From: Andrew MacLeod @ 2003-05-13 12:49 UTC (permalink / raw)
  To: Diego Novillo; +Cc: gcc mailing list

On Tue, 2003-05-13 at 08:33, Diego Novillo wrote:
> On Mon, 2003-05-12 at 14:45, Andrew MacLeod wrote:
> 
> > ie it would look something like this hacked up example:
> >  
> >       #   (*T.6)_13 = VDEF <(*T.6)_7>;
> >       T.6_12 = T.2_8 + T.5_11;
> > 
> >       #   VUSE <T.6_12>;
> >       i_14 = (*T.6)_13
> > 
> >       #   (*T.6)_22 = VDEF <(*T.6)_13>      
> >       #   VUSE <T.6_12>
> >       *T.6 = 30;
> >     };
> >   #   i_1 = PHI <i_6(0), (*T.6)_13(1)>;
> > 
> > Or is there a reason that copyprop would never happen?
> >
> Correctness wise?  No, there's no reason.  We can safely do the
> propagation in this case.
> 
> But I wouldn't think it's efficient.  Pointer dereferences are bound to
> be slower than a straight scalar reference.  The copy propagator in the
> SSA renamer blocks propagation of INDIRECT_REF nodes.  The stand-alone
> copy propagator should probably do the same.
> 
>
but this code is rewritten into:

 T.6 = T.2 + T.5
 i_14 = *T.6
*T.6 = 30
 i_1 = *T.6

Thats not the same meaning at all....

Andrew


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 12:42     ` Diego Novillo
@ 2003-05-13 12:50       ` Andrew MacLeod
  2003-05-13 13:05         ` Diego Novillo
  2003-05-13 12:57       ` Michael Matz
  1 sibling, 1 reply; 36+ messages in thread
From: Andrew MacLeod @ 2003-05-13 12:50 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Michael Matz, gcc mailing list

On Tue, 2003-05-13 at 08:42, Diego Novillo wrote:
> On Tue, 2003-05-13 at 05:07, Michael Matz wrote:
> 
> > An indirect reference is not a copy.  I don't know if tree-ssa thinks it
> > is, but it definitely shouldn't.
> > 
> Why?
> 
>      1. foo()
>      2. {
>      3.   int i, *p;
>      4. 
>      5.   p = malloc();
>      6.   i = *p;
>      7.   return i + 9;
>      8. }
> 
> I see nothing wrong in replacing 'i + 9' with '*p + 9'.  It would
> probably not be efficient, but I can't see it being wrong.
> 
> Not that tree-ssa will do anything with this code, the default
> type-based aliasing is too conservative, but PTA may disambiguate this.

There is nothing wrong with it, as long as you know for sure that *p
hasn't changed between 'i = *p' and 'i+9'. Copyprop doesnt look for
that. All it does is says 'i = *p', I can replace all occurences of i
with *p...

Andrew

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 12:42     ` Diego Novillo
  2003-05-13 12:50       ` Andrew MacLeod
@ 2003-05-13 12:57       ` Michael Matz
  2003-05-13 13:11         ` Diego Novillo
  2003-05-13 15:01         ` Daniel Berlin
  1 sibling, 2 replies; 36+ messages in thread
From: Michael Matz @ 2003-05-13 12:57 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Andrew Macleod, gcc mailing list

Hi,

On 13 May 2003, Diego Novillo wrote:

> Why?
>
>      1. foo()
>      2. {
>      3.   int i, *p;
>      4.
>      5.   p = malloc();
>      6.   i = *p;
>      7.   return i + 9;
>      8. }
>
> I see nothing wrong in replacing 'i + 9' with '*p + 9'.

I do.  *p could have been changed in between 6 and 7.  The useful thing of
SSA is, that there is exactly one definition of an entity, and that is the
reason that you can copy-propagate extremely easily (because you know,
that the source of the copy can't be possibly changed after that copy
insn, because the def must have reached it, and that was the only def).

To be able to propagate *p you have to explicitely proove by some means
that it still is unchanged at the places of the uses of the copied-into
entity.  This would be exactly detrimental to the usefullness of copy
propagation.

I.e. *p is not a register but an expression, and a copy instruction copies
registers to registers.

> Not that tree-ssa will do anything with this code, the default
> type-based aliasing is too conservative, but PTA may disambiguate this.

Of course you _can_ optimize such cases, but it's not a subset of copy
propagation but, hmm, un-PRE or something like that.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 12:49     ` Andrew MacLeod
@ 2003-05-13 12:58       ` Diego Novillo
  0 siblings, 0 replies; 36+ messages in thread
From: Diego Novillo @ 2003-05-13 12:58 UTC (permalink / raw)
  To: Andrew Macleod; +Cc: gcc mailing list

On Tue, 2003-05-13 at 08:49, Andrew MacLeod wrote:
> On Tue, 2003-05-13 at 08:33, Diego Novillo wrote:
> > On Mon, 2003-05-12 at 14:45, Andrew MacLeod wrote:
> > 
> > > ie it would look something like this hacked up example:
> > >  
> > >       #   (*T.6)_13 = VDEF <(*T.6)_7>;
> > >       T.6_12 = T.2_8 + T.5_11;
> > > 
> > >       #   VUSE <T.6_12>;
> > >       i_14 = (*T.6)_13
> > > 
> > >       #   (*T.6)_22 = VDEF <(*T.6)_13>      
> > >       #   VUSE <T.6_12>
> > >       *T.6 = 30;
> > >     };
> > >   #   i_1 = PHI <i_6(0), (*T.6)_13(1)>;
> > > 
> > > Or is there a reason that copyprop would never happen?
> > >
> > Correctness wise?  No, there's no reason.  We can safely do the
> > propagation in this case.
> > 
> > But I wouldn't think it's efficient.  Pointer dereferences are bound to
> > be slower than a straight scalar reference.  The copy propagator in the
> > SSA renamer blocks propagation of INDIRECT_REF nodes.  The stand-alone
> > copy propagator should probably do the same.
> > 
> >
> but this code is rewritten into:
> 
>  T.6 = T.2 + T.5
>  i_14 = *T.6
> *T.6 = 30
>  i_1 = *T.6
> 
> Thats not the same meaning at all....
> 
Drat, I hadn't noticed that *T.6 was aliased.  Yes, in this case
copyprop is doing the wrong thing.  It shouldn't be propagating from
VDEFs into PHIs like that.  Sorry about that.

So, is it copyprop or the copy propagator in the SSA rename pass?  I
don't think the SSA renamer would do this because we specifically block
propagating dereferences.


Diego.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 12:50       ` Andrew MacLeod
@ 2003-05-13 13:05         ` Diego Novillo
  2003-05-13 13:29           ` Andrew MacLeod
  0 siblings, 1 reply; 36+ messages in thread
From: Diego Novillo @ 2003-05-13 13:05 UTC (permalink / raw)
  To: Andrew Macleod; +Cc: Michael Matz, gcc mailing list

On Tue, 2003-05-13 at 08:50, Andrew MacLeod wrote:
> On Tue, 2003-05-13 at 08:42, Diego Novillo wrote:
> > On Tue, 2003-05-13 at 05:07, Michael Matz wrote:
> > 
> > > An indirect reference is not a copy.  I don't know if tree-ssa thinks it
> > > is, but it definitely shouldn't.
> > > 
> > Why?
> > 
> >      1. foo()
> >      2. {
> >      3.   int i, *p;
> >      4. 
> >      5.   p = malloc();
> >      6.   i = *p;
> >      7.   return i + 9;
> >      8. }
> > 
> > I see nothing wrong in replacing 'i + 9' with '*p + 9'.  It would
> > probably not be efficient, but I can't see it being wrong.
> > 
> > Not that tree-ssa will do anything with this code, the default
> > type-based aliasing is too conservative, but PTA may disambiguate this.
> 
> There is nothing wrong with it, as long as you know for sure that *p
> hasn't changed between 'i = *p' and 'i+9'. Copyprop doesnt look for
> that. All it does is says 'i = *p', I can replace all occurences of i
> with *p...
> 
Yes.  After I finish the patch to proper rename INDIRECT_REFs, the code
above should be rewritten into:

     1. foo()
     2. {
     3.   int i, *p;
     4. 
     5.   p_1 = malloc();
     6.   i_2 = *(p_1)_3;
     7.   return i_2 + 9;
     8. }

Now I ask, suppose that we insert a new definition for 'p' in between
lines 6 and 7:

     1. foo()
     2. {
     3.   int i, *p;
     4. 
     5.   p_1 = malloc();
     6.   i_2 = *(p_1)_3;
     7.   p_4 = malloc();
     8.   return i_2 + 9;
     9. }

Copyprop and DCE now convert this code into:

     1. foo()
     2. {
     3.   int *p;
     4. 
     5.   p_1 = malloc();
     6.   p_4 = malloc();
     7.   return *(p_1)_3 + 9;
     8. }

Will the coalescer DTRT here?  Yes, it's a silly transformation, but I'm
wondering if we don't have a bigger problem here.  I've been toying with
the idea of not bothering with renaming INDIRECT_REFs *at all*.  We
certainly get little benefit from it because they are often aliased and
it really is somewhat painful to deal with them.


Diego.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 12:57       ` Michael Matz
@ 2003-05-13 13:11         ` Diego Novillo
  2003-05-13 13:18           ` Andrew MacLeod
  2003-05-13 15:01         ` Daniel Berlin
  1 sibling, 1 reply; 36+ messages in thread
From: Diego Novillo @ 2003-05-13 13:11 UTC (permalink / raw)
  To: Michael Matz; +Cc: Andrew Macleod, gcc mailing list

On Tue, 2003-05-13 at 08:56, Michael Matz wrote:
> Hi,
> 
> On 13 May 2003, Diego Novillo wrote:
> 
> > Why?
> >
> >      1. foo()
> >      2. {
> >      3.   int i, *p;
> >      4.
> >      5.   p = malloc();
> >      6.   i = *p;
> >      7.   return i + 9;
> >      8. }
> >
> > I see nothing wrong in replacing 'i + 9' with '*p + 9'.
> 
> I do.  *p could have been changed in between 6 and 7.
>
The above is a complete program, *p has not changed between 6 and 7 nor
it is volatile.

> I.e. *p is not a register but an expression, and a copy instruction copies
> registers to registers.
> 
Not in tree-ssa.  INDIRECT_REFs are first-class variables.  Since in
GIMPLE pointers can only be one level deep, we can maintain SSA version
numbers for the pointer and the pointed-to memory location.  Granted, it
is somewhat painful (see my other message in this thread) and I'm
starting to think that it may not be sufficiently worth the pain.

> > Not that tree-ssa will do anything with this code, the default
> > type-based aliasing is too conservative, but PTA may disambiguate this.
> 
> Of course you _can_ optimize such cases, but it's not a subset of copy
> propagation but, hmm, un-PRE or something like that.
> 
It is if INDIRECT_REFs are treated as regular variables, like we
currently do.  I agree that otherwise you need to treat them as
expressions.  If we stopped dealing with INDIRECT_REFs as first-class
variables, we would probably move this in to the hands of SSAPRE.


Diego.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 13:11         ` Diego Novillo
@ 2003-05-13 13:18           ` Andrew MacLeod
  2003-05-14 17:19             ` Jan Vroonhof
  0 siblings, 1 reply; 36+ messages in thread
From: Andrew MacLeod @ 2003-05-13 13:18 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Michael Matz, gcc mailing list

On Tue, 2003-05-13 at 09:11, Diego Novillo wrote:
> On Tue, 2003-05-13 at 08:56, Michael Matz wrote:
> > Hi,
> > 
> > On 13 May 2003, Diego Novillo wrote:
> > 
> > > Why?
> > >
> > >      1. foo()
> > >      2. {
> > >      3.   int i, *p;
> > >      4.
> > >      5.   p = malloc();
> > >      6.   i = *p;
> > >      7.   return i + 9;
> > >      8. }
> > >
> > > I see nothing wrong in replacing 'i + 9' with '*p + 9'.
> > 
> > I do.  *p could have been changed in between 6 and 7.
> >
> The above is a complete program, *p has not changed between 6 and 7 nor
> it is volatile.
> 

but you are missing the point. There are lots of trivial examples which
dont show problems.

The problem occurs when yoiu add a new stmt, 6.5 *p = 10

copyprop still thinks it can propagate the stmt, but it can't.

*p = 10 is not a co[py. It is a store to memory, and you have to go
looking at aliasing and such in order to move around loads. copy prop
doesnt do that.

> > I.e. *p is not a register but an expression, and a copy instruction copies
> > registers to registers.
> > 
> Not in tree-ssa.  INDIRECT_REFs are first-class variables.  Since in
> GIMPLE pointers can only be one level deep, we can maintain SSA version
> numbers for the pointer and the pointed-to memory location.  Granted, it
> is somewhat painful (see my other message in this thread) and I'm
> starting to think that it may not be sufficiently worth the pain.
> 
> > > Not that tree-ssa will do anything with this code, the default
> > > type-based aliasing is too conservative, but PTA may disambiguate this.
> > 
> > Of course you _can_ optimize such cases, but it's not a subset of copy
> > propagation but, hmm, un-PRE or something like that.
> > 
> It is if INDIRECT_REFs are treated as regular variables, like we
> currently do.  I agree that otherwise you need to treat them as
> expressions.  If we stopped dealing with INDIRECT_REFs as first-class
> variables, we would probably move this in to the hands of SSAPRE.
> 

I think the only issue is that the copy prop pass is incorrectly
assuming that a load from memory through a pointer is the same thing as
a copy between variables, and it isn't.

Andrew


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 13:05         ` Diego Novillo
@ 2003-05-13 13:29           ` Andrew MacLeod
  2003-05-13 13:57             ` Diego Novillo
  0 siblings, 1 reply; 36+ messages in thread
From: Andrew MacLeod @ 2003-05-13 13:29 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Michael Matz, gcc mailing list

On Tue, 2003-05-13 at 09:05, Diego Novillo wrote:
> On Tue, 2003-05-13 at 08:50, Andrew MacLeod wrote:
> > On Tue, 2003-05-13 at 08:42, Diego Novillo wrote:
> > > On Tue, 2003-05-13 at 05:07, Michael Matz wrote:
> > > 
> > > > An indirect reference is not a copy.  I don't know if tree-ssa thinks it
> > > > is, but it definitely shouldn't.
> > > > 
> > > Why?
> > > 
> > >      1. foo()
> > >      2. {
> > >      3.   int i, *p;
> > >      4. 
> > >      5.   p = malloc();
> > >      6.   i = *p;
> > >      7.   return i + 9;
> > >      8. }
> > > 
> > > I see nothing wrong in replacing 'i + 9' with '*p + 9'.  It would
> > > probably not be efficient, but I can't see it being wrong.
> > > 
> > > Not that tree-ssa will do anything with this code, the default
> > > type-based aliasing is too conservative, but PTA may disambiguate this.
> > 
> > There is nothing wrong with it, as long as you know for sure that *p
> > hasn't changed between 'i = *p' and 'i+9'. Copyprop doesnt look for
> > that. All it does is says 'i = *p', I can replace all occurences of i
> > with *p...
> > 

> 
>      1. foo()
>      2. {
>      3.   int *p;
>      4. 
>      5.   p_1 = malloc();
>      6.   p_4 = malloc();
>      7.   return *(p_1)_3 + 9;
>      8. }
> 
> Will the coalescer DTRT here?  Yes, it's a silly transformation, but I'm
> wondering if we don't have a bigger problem here.  I've been toying with

Sure, the coalescer will handle this easily. p_1 and p_4 conflict, so
they will get assigned something like:

p = malloc ()
p.33 = malloc()
return *p + 9


Doesn't make the copy-prop example right tho. If you take the original
case, and use the new format, you'd get:

      #   (*T.6_12)_13 = VDEF <(*T.6)_7>;
      T.6_12 = T.2_8 + T.5_11;

      #   VUSE <T.6_12>;
      i_14 = (*T.6_12)_13       <<<--- This is an issue.... 

      #   (*T.6_12)_22 = VDEF <(*T.6_12)_13>      
      #   VUSE <T.6_12>
      *T.6_12 = 30;
    };
  #   i_1 = PHI <i_6(0), (*T.6_12)_13(1)>;
 

Since the point itself never changes (just the memory), It would still
be written out the same way, and be wrong.

The stmt I flagged with "This is an issue", IMHO should actually be:
  i_14 = *T.6_12

I don't think it should have the _13. And of course, the VUSE shouldn't
be there since T.6_12 is now a real use. In fact, the aliased variable
should be a VUSE... so I think it ought to look like:

   #   VUSE <(*T.6_12)_13>
   i_14 = *T.6_12

I think that is consistant..... Virtual variables used for aliasing,
real variables used for code.

Andrew

> the idea of not bothering with renaming INDIRECT_REFs *at all*.  We
> certainly get little benefit from it because they are often aliased and
> it really is somewhat painful to deal with them.
> 

Well, renaming of indirect refs serves a different purpose.. we dont use
them when we are rewriting out of SSA form, but I suspect anyone who
tries to manipulate loads and stores is going to want them, because they
effectively give versions to loads and stores, allowing you to use SSA
on memory references... 

But I could be imagining things :-)

The above exmaple seems to make it pretty clear that the value of
(*T.6_12) is different when we set *T.6_12 = 30.  Perhaps there is a
better way.   You simply cannot "use" (*T.6_12)_13 any more, because the
memory it refers to has now changed... 

Whats PRE think of all this?

Andrew.  






^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 13:29           ` Andrew MacLeod
@ 2003-05-13 13:57             ` Diego Novillo
  0 siblings, 0 replies; 36+ messages in thread
From: Diego Novillo @ 2003-05-13 13:57 UTC (permalink / raw)
  To: Andrew Macleod; +Cc: Michael Matz, gcc mailing list

On Tue, 2003-05-13 at 09:29, Andrew MacLeod wrote:

> > 
> >      1. foo()
> >      2. {
> >      3.   int *p;
> >      4. 
> >      5.   p_1 = malloc();
> >      6.   p_4 = malloc();
> >      7.   return *(p_1)_3 + 9;
> >      8. }
> > 
> > Will the coalescer DTRT here?  Yes, it's a silly transformation, but I'm
> > wondering if we don't have a bigger problem here.  I've been toying with
> 
> Sure, the coalescer will handle this easily. p_1 and p_4 conflict, so
> they will get assigned something like:
> 
> p = malloc ()
> p.33 = malloc()
> return *p + 9
> 
Excellent.

> Doesn't make the copy-prop example right tho.
>
What example?  The one I gave with 'p = malloc()' or the one you had
with 'T.6'?  I already agreed that copyprop in the case of your example
code was wrong, *T.6 is aliased, right?  In which case, we should never
see *T.6 in a real operand.  If we do, then we have a bug in
get_stmt_operands.


>  If you take the original case, and use the new format, you'd get:
> 
>       #   (*T.6_12)_13 = VDEF <(*T.6)_7>;
>       T.6_12 = T.2_8 + T.5_11;
> 
>       #   VUSE <T.6_12>;
>       i_14 = (*T.6_12)_13       <<<--- This is an issue.... 
> 
>       #   (*T.6_12)_22 = VDEF <(*T.6_12)_13>      
>       #   VUSE <T.6_12>
>       *T.6_12 = 30;
>     };
>   #   i_1 = PHI <i_6(0), (*T.6_12)_13(1)>;
>  
Wait.  The above is wrong.  On some statements you have *T.6 in VDEF
operands, and in 'i_14 = (*T.6_12)_13' you have it as a real operand. 
Did we really rename the program like that?  Is *T.6 aliased?  If so,
then we have a different problem.  Aliased references should always be
voperands.

Now, let's assume for a minute that the code above didn't use pointers. 
I'll replace uses of *T.6 with J:

      i_14 = J_13;
      J_12 = 30;
    };
  #   i_1 = PHI <i_6(0), J_13(1)>;

The above transformation should be OK, right?  The point I've been
trying to convey is that if we now changed J for an UNALIASED pointer
dereference '*p', then the trasnformation should also be valid:

      i_14 = *(p_3)_13;
      *(p_3)_14 = 30;
    };
  #   i_1 = PHI <i_6(0), *(p_3)_13(1)>;

The copies that the coalescer must insert are exactly the same, right? 
It needs to create a temporary to hold the value of *(p_3)_13.  We know
that we are referencing the same memory location p_3 and that memory
location is not aliased, because all references to *p appear as real
operands.

What am I missing?  I agree that we should not do this for performance
reasons.  I'm merely trying to figure out if treating INDIRECT_REFs as
first-class variables is doable as I think it is.  It would certainly be
easier for me to punt and treat them as regular expressions and let
SSAPRE deal with them.


> I don't think it should have the _13. And of course, the VUSE shouldn't
> be there since T.6_12 is now a real use. In fact, the aliased variable
> should be a VUSE... so I think it ought to look like:
> 
>    #   VUSE <(*T.6_12)_13>
>    i_14 = *T.6_12
> 
Yes.  I'm surprised it doesn't.  Send me a test case?


Diego.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 12:57       ` Michael Matz
  2003-05-13 13:11         ` Diego Novillo
@ 2003-05-13 15:01         ` Daniel Berlin
  1 sibling, 0 replies; 36+ messages in thread
From: Daniel Berlin @ 2003-05-13 15:01 UTC (permalink / raw)
  To: Michael Matz; +Cc: Diego Novillo, Andrew Macleod, gcc mailing list


On Tuesday, May 13, 2003, at 08:56  AM, Michael Matz wrote:

> Hi,
>
> On 13 May 2003, Diego Novillo wrote:
>
>> Why?
>>
>>      1. foo()
>>      2. {
>>      3.   int i, *p;
>>      4.
>>      5.   p = malloc();
>>      6.   i = *p;
>>      7.   return i + 9;
>>      8. }
>>
>> I see nothing wrong in replacing 'i + 9' with '*p + 9'.
>
> I do.  *p could have been changed in between 6 and 7.  The useful 
> thing of
> SSA is, that there is exactly one definition of an entity, and that is 
> the
> reason that you can copy-propagate extremely easily (because you know,
> that the source of the copy can't be possibly changed after that copy
> insn, because the def must have reached it, and that was the only def).
>
> To be able to propagate *p you have to explicitely proove by some means
> that it still is unchanged at the places of the uses of the copied-into
> entity.  This would be exactly detrimental to the usefullness of copy
> propagation.
>
> I.e. *p is not a register but an expression, and a copy instruction 
> copies
> registers to registers.
>

It would also ruin sparse register promotion that load PRE will do 
soon, by unpromoting things.

>> Not that tree-ssa will do anything with this code, the default
>> type-based aliasing is too conservative, but PTA may disambiguate 
>> this.
>
> Of course you _can_ optimize such cases, but it's not a subset of copy
> propagation but, hmm, un-PRE or something like that.
>

It's SSUSPRE (Static Single Use Store PRE).
Store PRE builds a single static use form, ...

>
> Ciao,
> Michael.
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 13:18           ` Andrew MacLeod
@ 2003-05-14 17:19             ` Jan Vroonhof
  2003-05-14 18:05               ` Andrew MacLeod
  2003-05-14 18:33               ` Diego Novillo
  0 siblings, 2 replies; 36+ messages in thread
From: Jan Vroonhof @ 2003-05-14 17:19 UTC (permalink / raw)
  To: gcc


I have a distinct feeling you guys are talking past each other. Feel
free to flame me to bits if I am missing to point.

If I understand things correctly:

Diego seems to think that in the tree-ssa IR memory locations (INDIRECT_REFs)
are treated on equal footing with normal temporaries. (I believe this
is partly what in the Morgan book is called tags). SSA renaming will
take care of aliasing issues etc. 
Whereas you have the more typical view that memory loads are normal
expressions.

In Diego's definition then loads are simply a copy and they can be
propagated. I would think that also would mean that the UNSSA phase
also has to treat INDIRECT_REFs as temporaries and do proper live
range analysis on them.

So in Diego's view

Copy propagating

   a = (*p)_1;
   (*p)_2 = b;
   .
   .
   .
   PHI(..,a,..)

into 

   (void)0;
   (*p)_2 = b;
   .
   .
   .
   PHI(..,(*p)_1,..)

is fine (assuming non-volatile of course), as UNSSA will notice the
overlapping live ranges and assign (*p)_1 and (*p)_2 to different
partitions, create a new temporary for (*p)_1 and voila: it gets rewritten as

   t = *p;   // (*p)_1 is in the same partition as *p on entry
   .
   .
   .
  (void)0;
   *p = b;   // (*p)_2 is the same partition *p on exit
   .
   .
   .
   PHI(..,t,..)

In fact if we can prove that the memory pointed to by p is not live on
access we have potentially eliminated memory from the equation
altogether (at the expense of memory pressure).

So the two phases make SSA do a bit of PRE as a side effect.  So
AFAICS the main problem is that copyprop and the toSSA on the one hand
and UNSSA on the other hand have a different view of the meaning of
the treessa representation. So whether it is a bug in copyprop or a
weakness in UNSSA simply depends on your point of view.

Whether this partitioning of memory locations actually is doable I do
not know. For instance I would think that the partitioner needs to
make sure that there are partitions for (*p)_onentry and (*p)_onexit
(if the memory is live) and assign those *p as a location otherwise
the loads and stores will never happen and that is wrong. For
non-volatiles one can remove all but 1-load and 1-store.

However, Not treating this as a PRE problem might have the tendency to
move the load towards entry and the store towards exit too much,
increasing register pressure and putting loads and stores in codepaths
that previously didn't have them. This seems to suggest leaving it to PRE.

On the other hand the SSAPRE as far as i understand (which is very
little) simply would SSA rename it as an expression and then UNSSA
it doing essentially the same thing.[1] So in the end the copy versus
expression distinction is rahter weak (when SSAPRE is enabled).

I hope these ramblings from a beginner didn't just confuse things even
more.

Jan

Footnotes: 
[1]  In fact I am wondering whether SSA of expressions (as per SSAPRE) and SSA with one unique temporary per expression (as per Morgan) aren't in fact the same thing.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-14 17:19             ` Jan Vroonhof
@ 2003-05-14 18:05               ` Andrew MacLeod
  2003-05-14 18:33               ` Diego Novillo
  1 sibling, 0 replies; 36+ messages in thread
From: Andrew MacLeod @ 2003-05-14 18:05 UTC (permalink / raw)
  To: Jan Vroonhof; +Cc: gcc mailing list

On Tue, 2003-05-13 at 13:29, Jan Vroonhof wrote:
> 
> I have a distinct feeling you guys are talking past each other. Feel
> free to flame me to bits if I am missing to point.
> 

Yes, we sorted it out eventually :-)

> If I understand things correctly:
> 
> Diego seems to think that in the tree-ssa IR memory locations (INDIRECT_REFs)
> are treated on equal footing with normal temporaries. (I believe this
> is partly what in the Morgan book is called tags). SSA renaming will
> take care of aliasing issues etc. 
> Whereas you have the more typical view that memory loads are normal
> expressions.
> 
> In Diego's definition then loads are simply a copy and they can be
> propagated. I would think that also would mean that the UNSSA phase
> also has to treat INDIRECT_REFs as temporaries and do proper live
> range analysis on them.
> 

The confusing part is that only objects which have no alaises that were
treated that way. They were represented by a Virtual DEF, and by real
USES. As soon as that same dereference was alaised with something, they
became virtual USES, and the real USES became dereference of the actual
pointer. 

When the INDIRECT_REF variables were treated as real uses, SSA->Normal
in that case requires a little twisting. It had to replace the
INDIRECT_REF temporary with a dereference of the correct pointer.
 
  The inconsistancy is what was bothering me. (Not to mention the extra
work :-)



> 
> Whether this partitioning of memory locations actually is doable I do
> not know. For instance I would think that the partitioner needs to
> make sure that there are partitions for (*p)_onentry and (*p)_onexit
> (if the memory is live) and assign those *p as a location otherwise
> the loads and stores will never happen and that is wrong. For
> non-volatiles one can remove all but 1-load and 1-store.

Part of the problem was that there was never a real definition of the
INDIRECT_REF variable. It was a virtual definition when the value of the
pointer was set.

> 
> However, Not treating this as a PRE problem might have the tendency to
> move the load towards entry and the store towards exit too much,
> increasing register pressure and putting loads and stores in codepaths
> that previously didn't have them. This seems to suggest leaving it to PRE.
> 

I think this is where we are finally headed :-)

> On the other hand the SSAPRE as far as i understand (which is very
> little) simply would SSA rename it as an expression and then UNSSA
> it doing essentially the same thing.[1] So in the end the copy versus
> expression distinction is rahter weak (when SSAPRE is enabled).

By PRE would put hings into real temporaries, and then UNSSA doesn't
have to go looking for special cases... it just does a direct
partitioning and rewrite.

Andrew



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-14 17:19             ` Jan Vroonhof
  2003-05-14 18:05               ` Andrew MacLeod
@ 2003-05-14 18:33               ` Diego Novillo
  2003-05-14 19:11                 ` Daniel Berlin
  1 sibling, 1 reply; 36+ messages in thread
From: Diego Novillo @ 2003-05-14 18:33 UTC (permalink / raw)
  To: Jan Vroonhof; +Cc: gcc

On Tue, 2003-05-13 at 13:29, Jan Vroonhof wrote:

> However, Not treating this as a PRE problem might have the tendency to
> move the load towards entry and the store towards exit too much,
> increasing register pressure and putting loads and stores in codepaths
> that previously didn't have them. This seems to suggest leaving it to PRE.
> 
Exactly.  This is what we are converging to.  In my initial design, I
had thought that by treating INDIRECT_REF nodes like regular variables
would simplify the implementation and give us the extra benefit of being
able to do the same transformations we do to any other variable.  Boy,
was I wrong!

INDIRECT_REF nodes are like any other unary expression (OK, they have
slightly different semantics as they can be on the LHS of an
assignment), so the expression-SSA built by SSAPRE is perfect for them. 
All we need is to do is build the SSA form for expressions and let that
framework handle them.

Daniel, how clean is the separation between the expression-SSA and the
PRE specific bits in SSAPRE?  It'd be nice if we have a clean separation
between the infrastructure proper and the various passes we want to do
that operate on expressions.


Diego.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-14 18:33               ` Diego Novillo
@ 2003-05-14 19:11                 ` Daniel Berlin
  0 siblings, 0 replies; 36+ messages in thread
From: Daniel Berlin @ 2003-05-14 19:11 UTC (permalink / raw)
  To: Diego Novillo; +Cc: Jan Vroonhof, gcc


On Wednesday, May 14, 2003, at 02:33  PM, Diego Novillo wrote:

> On Tue, 2003-05-13 at 13:29, Jan Vroonhof wrote:
>
>> However, Not treating this as a PRE problem might have the tendency to
>> move the load towards entry and the store towards exit too much,
>> increasing register pressure and putting loads and stores in codepaths
>> that previously didn't have them. This seems to suggest leaving it to 
>> PRE.
>>
> Exactly.  This is what we are converging to.  In my initial design, I
> had thought that by treating INDIRECT_REF nodes like regular variables
> would simplify the implementation and give us the extra benefit of 
> being
> able to do the same transformations we do to any other variable.  Boy,
> was I wrong!
>
> INDIRECT_REF nodes are like any other unary expression (OK, they have
> slightly different semantics as they can be on the LHS of an
> assignment)

This is why there is an ELEFT tree expression (in fact, specifically 
for this reason).

> , so the expression-SSA built by SSAPRE is perfect for them.
> All we need is to do is build the SSA form for expressions and let that
> framework handle them.
>
> Daniel, how clean is the separation between the expression-SSA and the
> PRE specific bits in SSAPRE?  It'd be nice if we have a clean 
> separation
> between the infrastructure proper and the various passes we want to do
> that operate on expressions.

Besides the fact that there are SSAPRE specific fields in the various 
E* structures, into-ESSA and PRE are cleanly separated.
You could stop after the call to rename_1, and you'd have ESSA 
structures available in expr_info.
We don't tack the EUSE/EDEF onto statements, for lack of any point in 
doing so for SSAPRE (we always walk them in pre-order, DT order, and 
keep pointers to all of them in an array).
The statement associated with a node is available from an E* node.
We do, however, tack ephi nodes onto BB's.

Remember also that we build ESSA for one lexical expression at a time, 
optimize the expression's occurrences, then move out of ESSA and move 
to the next lexical expression.  So if you are expecting to build ESSA 
for all expressions at once, it would
1. Be heavy on memory, most likely (in particular, the number of phi 
nodes would be high, since you have a phi node everywhere any variable  
in the expression has a phi, in addition to the IDF of the statement).
2. Probably require changes to the helper routines, but not much.

An ESSA optimization pass should probably be working one lexical 
expression at a time anyway.

The ESSA structures are factored (IE linked), but the version number is 
stored in the structures explicitly if one wants it (we use it for 
available expression determination).

Out-of-ESSA isn't really the same as out-of-SSA, because these are 
expressions, not variables, and thus they never occur on the LHS except 
for load PRE.  Out-of-ESSA is quite simple as a result.  It expects 
something else to set the save and reload flags on the EUSE's, and it 
just saves to and reloads from, temporaries, where told to.  It also 
creates phis for the ephis.


I'm not sure you can come up with an ESSA pass that is useful besides 
PRE, that runs after PRE.  Value numbering can easily be done on ESSA 
if you want (Open64 has a value numbering pass formulated using the 
SSAPRE pieces to do the work of renaming and saving). Past that, it's 
hard to come up with a useful pass.
Since load PRE is promoting pointer accesses into temporaries where 
possible/profitable, the regular SSA optimizers can take care of 
working on pointers by working on the temporaries.  Value numbering 
would take care of the constant loads/stores.
Other than that, let's see:
1.  All partially redundant expressions are eliminated
2.  Constant propagation on non-pointer expressions is the same as 
constant propagation on the variables *in* the expression.
3.  Whatever copy-prop you can do can be integrated into the 
code_motion pass (it's on my todo list).

--Dan
>
> Diego.
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 15:23 Richard Kenner
  2003-05-13 18:50 ` Geoff Keating
@ 2003-05-17 17:19 ` Michael S. Zick
  1 sibling, 0 replies; 36+ messages in thread
From: Michael S. Zick @ 2003-05-17 17:19 UTC (permalink / raw)
  To: Richard Kenner, mszick; +Cc: gcc

On Tuesday 13 May 2003 10:28 am, Richard Kenner wrote:
>     Consider instead that '*p' is just the name given a register's
>     contents.  Neither 'i' nor '9' are processor internal.  Both are
>     external to the cpu.  The memory reference for 'i' is source code
>     implicit, the memory reference for '*p' is source code explicit.
>
> True, but given caching effects, it's hard to see how *p could be less
> expensive than 'i'.

Six of one and half a dozen of the other.  Its a Union.

I am using "Union" in the general sense of either the union of two or
more data structures OR two or more code structures.
In this usage of "Union", an assignment statement is a "Union" -
including the assignment of a function result.

Our terminology and viewpoint differ, but we are talking about the
same thing.

As to the subject of this thread, a link you may find interesting:
<http://www.sac-home.org/>

Not for the specialization of "C" that the compiler processes, but for the
compiler options that allow it to be stopped and dump its internals.

Clipped from the help menu:

OPTIMIZATION OPTIONS:

         -ssa           use optimizations based on ssa-form.
         -no <opt>      disable optimization technique <opt>
         -do <opt>      enable optimization technique <opt>

        The following optimization techniques are currently supported:

                CF      constant folding
                INL     function inlining
                LUR     loop unrolling
                WLUR    with-loop unrolling
                LUS     loop unswitching
                DCR     dead code removal
                DFR     dead function removal
                LIR     loop invariant removal
                CSE     common subexpression elimination
                WLT     with-loop transformation
                WLF     with-loop folding
                DLAW    application of the distributive law
                IVE     index vector elimination
                AE      array elimination
                RCO     refcount optimization
                UIP     update-in-place
                AP      array padding
                APL     array placement
                TSI     tile size inference (blocking)
                TSP     tile size pragmas (blocking)
                MTO     multi-thread optimization
                SBE     syncronisation barrier elimination
                PHM     private heap management
                APS     arena preselection           (in conjunction with PHM)
                RCAO    refcount allocation optimiz. (in conjunction with PHM)
                MSCA    memory size cache adjustment (in conjunction with PHM)

                OPT     enables/disables all optimizations at once.

        Lower case letters may be used to indicate optimization techniques.

        Command line arguments are evaluated from left to right, i.e.,
        "-no OPT -do INL" disables all optimizations except for
        function inlining.

Mike

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 18:50 ` Geoff Keating
@ 2003-05-13 23:28   ` Michael S. Zick
  0 siblings, 0 replies; 36+ messages in thread
From: Michael S. Zick @ 2003-05-13 23:28 UTC (permalink / raw)
  To: Geoff Keating, Richard Kenner; +Cc: gcc

On Tuesday 13 May 2003 01:50 pm, Geoff Keating wrote:
> kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:
> >     Consider instead that '*p' is just the name given a register's
> >     contents.  Neither 'i' nor '9' are processor internal.  Both are
> >     external to the cpu.  The memory reference for 'i' is source code
> >     implicit, the memory reference for '*p' is source code explicit.
> >
> > True, but given caching effects, it's hard to see how *p could be less
> > expensive than 'i'.
>
> Given the code
>
> i = *p;
> // operations that don't use 'i' or change '*p'
> foo(i);
>
> and this is the only use of 'i', surely this is always as fast or faster
> than
>
> foo(*p);
>
> due to reducing register pressure.
I think this sub-thread has developed a split viewpoint.
I (and I think RK) was talking about optimization at the hardware
independent level.
The above seems to be in the area of hardware dependent
optimizations.
Translation: You folks have lost me on this one, I'll go back
to lurking.

Mike

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 15:23 Richard Kenner
@ 2003-05-13 18:50 ` Geoff Keating
  2003-05-13 23:28   ` Michael S. Zick
  2003-05-17 17:19 ` Michael S. Zick
  1 sibling, 1 reply; 36+ messages in thread
From: Geoff Keating @ 2003-05-13 18:50 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc

kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:

>     Consider instead that '*p' is just the name given a register's
>     contents.  Neither 'i' nor '9' are processor internal.  Both are
>     external to the cpu.  The memory reference for 'i' is source code
>     implicit, the memory reference for '*p' is source code explicit.  
> 
> True, but given caching effects, it's hard to see how *p could be less
> expensive than 'i'.

Given the code

i = *p;
// operations that don't use 'i' or change '*p'
foo(i);

and this is the only use of 'i', surely this is always as fast or faster than

foo(*p);

due to reducing register pressure.

-- 
- Geoffrey Keating <geoffk@geoffk.org>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
@ 2003-05-13 15:23 Richard Kenner
  2003-05-13 18:50 ` Geoff Keating
  2003-05-17 17:19 ` Michael S. Zick
  0 siblings, 2 replies; 36+ messages in thread
From: Richard Kenner @ 2003-05-13 15:23 UTC (permalink / raw)
  To: mszick; +Cc: gcc

    Consider instead that '*p' is just the name given a register's
    contents.  Neither 'i' nor '9' are processor internal.  Both are
    external to the cpu.  The memory reference for 'i' is source code
    implicit, the memory reference for '*p' is source code explicit.  

True, but given caching effects, it's hard to see how *p could be less
expensive than 'i'.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 13:17 Richard Kenner
  2003-05-13 13:27 ` Diego Novillo
  2003-05-13 13:40 ` Michael Matz
@ 2003-05-13 15:08 ` Michael S. Zick
  2 siblings, 0 replies; 36+ messages in thread
From: Michael S. Zick @ 2003-05-13 15:08 UTC (permalink / raw)
  To: Richard Kenner, dnovillo; +Cc: gcc

On Tuesday 13 May 2003 08:22 am, Richard Kenner wrote:
>     I see nothing wrong in replacing 'i + 9' with '*p + 9'.  It would
>     probably not be efficient, but I can't see it being wrong.
>
> If an optimizer pessimizes the code, I'd consider that "wrong".
>
> This isn't a machine-dependent issue: with CPU speeds the way they are,
> a memory reference is *always* many times more expensive than an addition.
Nice generalization, but only relevant while considering '*p' to be an
expression.  
Consider instead that '*p' is just the name given a register's
contents.  Neither 'i' nor '9'  are processor internal.  Both are external
to the cpu.  The memory reference for 'i' is source code implicit, the memory
reference for '*p' is source code explicit.
Mike

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
@ 2003-05-13 13:42 Richard Kenner
  0 siblings, 0 replies; 36+ messages in thread
From: Richard Kenner @ 2003-05-13 13:42 UTC (permalink / raw)
  To: matz; +Cc: gcc

    Suppose there is register pressure and we are in the register allocator
    and decide that p1 doesn't get a hardreg.  If you now don't allow the use
    of p1 to be replaced with *p2, you are forced to emit such code:

Of course, but we weren't talking about the register allocator.  Instead,
we were talking about a high-level optimization.

The register allocator indeed needs to know what a psuedo corresponds to
so that it can undo the optimization is there aren't enough hard registers.
We do some of this, but not enough.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 13:17 Richard Kenner
  2003-05-13 13:27 ` Diego Novillo
@ 2003-05-13 13:40 ` Michael Matz
  2003-05-13 15:08 ` Michael S. Zick
  2 siblings, 0 replies; 36+ messages in thread
From: Michael Matz @ 2003-05-13 13:40 UTC (permalink / raw)
  To: Richard Kenner; +Cc: dnovillo, gcc

Hi,

On Tue, 13 May 2003, Richard Kenner wrote:

>     I see nothing wrong in replacing 'i + 9' with '*p + 9'.  It would
>     probably not be efficient, but I can't see it being wrong.
>
> If an optimizer pessimizes the code, I'd consider that "wrong".
>
> This isn't a machine-dependent issue: with CPU speeds the way they are,
> a memory reference is *always* many times more expensive than an addition.

You apply a much too narrow view to optimization, or better to
transformation.  For instance in this situation:

  p1 = *p2;
  ... nothing changing *p2 or using p1...
  p3 = p1 + 1;

Suppose there is register pressure and we are in the register allocator
and decide that p1 doesn't get a hardreg.  If you now don't allow the use
of p1 to be replaced with *p2, you are forced to emit such code:

  p1 = *p2;
  stack[place] = p1;
  ... nothing changing *p2 or using p1...
  p3 = stack[place] + 1;

whereas it would have been better to use *p2 directly:

  (p1 = ... removed, because dead)
  ... nothing changing ....
  p3 = *p2 + 1;


I.e. it depends on the context if you want to replace register references
with memory accesses or not.  You can't just say it pessimizes code in all
cases.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
  2003-05-13 13:17 Richard Kenner
@ 2003-05-13 13:27 ` Diego Novillo
  2003-05-13 13:40 ` Michael Matz
  2003-05-13 15:08 ` Michael S. Zick
  2 siblings, 0 replies; 36+ messages in thread
From: Diego Novillo @ 2003-05-13 13:27 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc

On Tue, 2003-05-13 at 09:22, Richard Kenner wrote:
>     I see nothing wrong in replacing 'i + 9' with '*p + 9'.  It would
>     probably not be efficient, but I can't see it being wrong.
> 
> If an optimizer pessimizes the code, I'd consider that "wrong".
> 
correctness != efficiency.  But we agree, transformations that are known
to pessimize code will not be allowed.

> This isn't a machine-dependent issue: with CPU speeds the way they are,
> a memory reference is *always* many times more expensive than an addition.
>
Read the rest of my message.  This is *exactly* why we don't do this
transformation.


Diego.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [tree-ssa] Out of SSA status and issues
@ 2003-05-13 13:17 Richard Kenner
  2003-05-13 13:27 ` Diego Novillo
                   ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Richard Kenner @ 2003-05-13 13:17 UTC (permalink / raw)
  To: dnovillo; +Cc: gcc

    I see nothing wrong in replacing 'i + 9' with '*p + 9'.  It would
    probably not be efficient, but I can't see it being wrong.

If an optimizer pessimizes the code, I'd consider that "wrong".

This isn't a machine-dependent issue: with CPU speeds the way they are,
a memory reference is *always* many times more expensive than an addition.

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2003-05-17 17:09 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-05-12 14:42 [tree-ssa] Out of SSA status and issues Andrew MacLeod
2003-05-12 15:38 ` Diego Novillo
2003-05-12 15:57   ` Andrew MacLeod
2003-05-12 16:05     ` Michael Matz
2003-05-12 16:10       ` Andrew MacLeod
2003-05-12 16:16       ` law
2003-05-12 17:08     ` law
2003-05-12 17:12       ` Andrew MacLeod
2003-05-12 17:26         ` law
2003-05-12 18:57 ` Andrew MacLeod
2003-05-13  9:07   ` Michael Matz
2003-05-13 12:42     ` Diego Novillo
2003-05-13 12:50       ` Andrew MacLeod
2003-05-13 13:05         ` Diego Novillo
2003-05-13 13:29           ` Andrew MacLeod
2003-05-13 13:57             ` Diego Novillo
2003-05-13 12:57       ` Michael Matz
2003-05-13 13:11         ` Diego Novillo
2003-05-13 13:18           ` Andrew MacLeod
2003-05-14 17:19             ` Jan Vroonhof
2003-05-14 18:05               ` Andrew MacLeod
2003-05-14 18:33               ` Diego Novillo
2003-05-14 19:11                 ` Daniel Berlin
2003-05-13 15:01         ` Daniel Berlin
2003-05-13 12:33   ` Diego Novillo
2003-05-13 12:49     ` Andrew MacLeod
2003-05-13 12:58       ` Diego Novillo
2003-05-13 13:17 Richard Kenner
2003-05-13 13:27 ` Diego Novillo
2003-05-13 13:40 ` Michael Matz
2003-05-13 15:08 ` Michael S. Zick
2003-05-13 13:42 Richard Kenner
2003-05-13 15:23 Richard Kenner
2003-05-13 18:50 ` Geoff Keating
2003-05-13 23:28   ` Michael S. Zick
2003-05-17 17:19 ` Michael S. Zick

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).