public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* USE patch vs the PA
@ 1997-10-24 17:58 Jeffrey A Law
  1997-10-24 19:00 ` John Carr
  1997-10-25  9:02 ` Toon Moene
  0 siblings, 2 replies; 7+ messages in thread
From: Jeffrey A Law @ 1997-10-24 17:58 UTC (permalink / raw)
  To: Toon Moene; +Cc: egcs

Well, after hacking the alias code a little, tomcatv with the
simplify_giv_expr patch (aka USE patch) now runs slightly faster
than tomcatv without the USE patch on my PA (as opposed to about
10% slower!)

There were basically three problems with the alias code.  I'll write
them up in more detail later this weekend.

  * Didn't handle autoincrement addressing.  It just punted.
  autoincrement addresses can be treated just like a set of the
  base value to itself plus a small constant.

  * find_base_value didn't do a good enough job when given the
  sum of two pseudos.

  * find_base_value (psuedo) would return the pseudo instead of
  the base value of the pseudo.


  	    
I'm going to do some more benchmarking, but hopefully we've solved
the mystery!




--
Jeff Law (law@cygnus.com)
Cygnus Solutions		EGCS GNU Compiler System
http://www.cygnus.com		http://www.cygnus.com/egcs


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: USE patch vs the PA
  1997-10-24 17:58 USE patch vs the PA Jeffrey A Law
@ 1997-10-24 19:00 ` John Carr
  1997-10-24 22:37   ` Jeffrey A Law
  1997-10-24 22:45   ` Jeffrey A Law
  1997-10-25  9:02 ` Toon Moene
  1 sibling, 2 replies; 7+ messages in thread
From: John Carr @ 1997-10-24 19:00 UTC (permalink / raw)
  To: law; +Cc: egcs

> There were basically three problems with the alias code.  I'll write
> them up in more detail later this weekend.
> 
>   * Didn't handle autoincrement addressing.  It just punted.
>   autoincrement addresses can be treated just like a set of the
>   base value to itself plus a small constant.

I thought it would work with autoincrement addressing.  It uses
note_stores to detect sets, and note_stores only looks for explicit
sets.

>   * find_base_value (psuedo) would return the pseudo instead of
>   the base value of the pseudo.

That's how it was intended to work.  Insns are not scanned in the same
order they are executed.  Consider insns which appear in rtl in this
order:

(1)	R1 = constant
(2)	R2 = R1
(3)	R1 = variable

If the instructions are executed in the order (3), (2), (1) due to
labels and branches then R2 will not have a known value.

The current find_base_value handles this case by marking R2 as having the
same base value as R1, a value which is not yet known and might be NULL.
Only after all insns have been seen and the base value of R1 is known can
R2 be assigned a legitimate base value.

If there are no labels between (1) and (2) then the code is too
conservative.  Handling that case is not hard; I'm not sure how helpful it
would be.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: USE patch vs the PA
  1997-10-24 19:00 ` John Carr
@ 1997-10-24 22:37   ` Jeffrey A Law
  1997-10-25  6:31     ` John Carr
  1997-10-24 22:45   ` Jeffrey A Law
  1 sibling, 1 reply; 7+ messages in thread
From: Jeffrey A Law @ 1997-10-24 22:37 UTC (permalink / raw)
  To: John Carr; +Cc: egcs

  In message < 199710250200.WAA00802@jfc. >you write:
  > That's how it was intended to work.  Insns are not scanned in the same
  > order they are executed.  Consider insns which appear in rtl in this
  > order:
  > 
  > (1)	R1 = constant
  > (2)	R2 = R1
  > (3)	R1 = variable
Note that if REG_N_SETS (R1) == 1, then find_base_value should return
reg_base_value [REGNO (R1)] if it is nonzero.

Doing that, improving PLUS and handling autoinc is enough for the
testcase that I've been working with.

--

If we wanted to get more clever, then we'd change how we do alias propagation.

The way it works right now, if regA is set to regB + regC and we
don't know if regB & regC are "constant" when we see the set for
regA, then we just set reg_base_value[regA] = regA.

Assume that we later find that regC is initially set to a symbol_ref
and just incremented in a loop.  Thus reg_base_value[regC] = regC
and can be considered "constant" once we've scanned all the insns.

But, when we do propagation of the aliases we've lost any information
that says regA might be related to regB or regC, and thus we can't
determine that reg_base_values[regA] can be set to regC.

To get better propagation, we could do something like

while (something changed)
  for each insn
    note_stores....
  propagate alias info

We'd keep two arrays for base values.  The first contains the constant
entries computed so far.  The second contains the new tenative entries.

In find_base_value, you return the constant entries computed so far
whenever possible.

record_set records the info into the tenative base values array.

When we've scanned all the insns, we add entries from the tenative
reg base array to the known reg base array and repeat the entire
process until no new real base entries are created.

It's really not all that difficult -- I've actually got it in
my sandbox, though I haven't tested it all that much.

If you think it's worth trying to make "real" I can clean it up
easily and let you comment on it.

Obviously it will run slower than yours, so maybe we limit the
number of passes based on -O{X} or something of that nature.

jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: USE patch vs the PA
  1997-10-24 19:00 ` John Carr
  1997-10-24 22:37   ` Jeffrey A Law
@ 1997-10-24 22:45   ` Jeffrey A Law
  1 sibling, 0 replies; 7+ messages in thread
From: Jeffrey A Law @ 1997-10-24 22:45 UTC (permalink / raw)
  To: John Carr; +Cc: egcs

  In message < 199710250200.WAA00802@jfc. >you write:
  > 
  > > There were basically three problems with the alias code.  I'll write
  > > them up in more detail later this weekend.
  > > 
  > >   * Didn't handle autoincrement addressing.  It just punted.
  > >   autoincrement addresses can be treated just like a set of the
  > >   base value to itself plus a small constant.
  > 
  > I thought it would work with autoincrement addressing.  It uses
  > note_stores to detect sets, and note_stores only looks for explicit
  > sets.
It's not a major problem -- find_base_term and find_base_value give up
when they encounter an autoincrement address.

Stripping off the autoinc and exposing the base register should be all
we need to do unless I'm missing something important (which is certainly
possible!)


  > >   * find_base_value (psuedo) would return the pseudo instead of
  > >   the base value of the pseudo.
  > 
  > That's how it was intended to work.  Insns are not scanned in the same
  > order they are executed.  Consider insns which appear in rtl in this
  > order:
You're absolutely right -- I missed the propagation code at the bottom
of the file which should have figured this out.

jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: USE patch vs the PA
  1997-10-24 22:37   ` Jeffrey A Law
@ 1997-10-25  6:31     ` John Carr
  1997-10-25 10:21       ` Jeffrey A Law
  0 siblings, 1 reply; 7+ messages in thread
From: John Carr @ 1997-10-25  6:31 UTC (permalink / raw)
  To: law; +Cc: egcs

>   > (1)	R1 = constant
>   > (2)	R2 = R1
>   > (3)	R1 = variable
> Note that if REG_N_SETS (R1) == 1, then find_base_value should return
> reg_base_value [REGNO (R1)] if it is nonzero.

Can you send me an example where this helps?


> The way it works right now, if regA is set to regB + regC and we
> don't know if regB & regC are "constant" when we see the set for
> regA, then we just set reg_base_value[regA] = regA.

There is code to pick the right register as the base: the one with
REGNO_POINTER_FLAG set.  A register without this flag shouldn't
contain a pointer.  Why isn't this working?


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: USE patch vs the PA
  1997-10-24 17:58 USE patch vs the PA Jeffrey A Law
  1997-10-24 19:00 ` John Carr
@ 1997-10-25  9:02 ` Toon Moene
  1 sibling, 0 replies; 7+ messages in thread
From: Toon Moene @ 1997-10-25  9:02 UTC (permalink / raw)
  To: law; +Cc: egcs

>  There were basically three problems with the alias code.
>  I'll write them up in more detail later this weekend.
>
>    * Didn't handle autoincrement addressing.  It just
>    punted.  autoincrement addresses can be treated just
>    like a set of the base value to itself plus a small
>    constant.

Heh, heh, that means that I was right:  The PA is `interesting' in  
this regard; it's one of the few mainstream architectures with  
autoincrement addressing for which gcc defines function units for  
instruction scheduling.  So it's no wonder no-one else noticed this  
;-)

The arm and the pdp11 are two others (yeah, right, SPEC95 on a PDP 11 :-)

Great job !

Cheers,
Toon.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: USE patch vs the PA
  1997-10-25  6:31     ` John Carr
@ 1997-10-25 10:21       ` Jeffrey A Law
  0 siblings, 0 replies; 7+ messages in thread
From: Jeffrey A Law @ 1997-10-25 10:21 UTC (permalink / raw)
  To: John Carr; +Cc: egcs

  In message < 199710251330.JAA02308@jfc. >you write:
  > 
  > >   > (1)	R1 = constant
  > >   > (2)	R2 = R1
  > >   > (3)	R1 = variable
  > > Note that if REG_N_SETS (R1) == 1, then find_base_value should return
  > > reg_base_value [REGNO (R1)] if it is nonzero.
  > 
  > Can you send me an example where this helps?
Not easily since the examples I'm working from sources with the
simplify_giv_expr patch, a rerun-loop patch, and other random
stuff I'm playing with.

It's worth noting that paying attention to REG_N_SETS and the
PLUS handling are closely related.  It's not all that useful to
discuss one outside the context of the other.

Right now PLUS handling looks something like this:

    case PLUS:
    case MINUS:
      {
        rtx src_0 = XEXP (src, 0), src_1 = XEXP (src, 1);

        /* Guess which operand is the base address.

           If the first operand is a symbol or the second operand is
           an integer, the first operand is the base address.  Else if
           either operand is a register marked as a pointer, it is the
           base address.  */

        if (GET_CODE (src_1) == CONST_INT
            || GET_CODE (src_0) == SYMBOL_REF
            || GET_CODE (src_0) == LABEL_REF
            || GET_CODE (src_0) == CONST)
          return find_base_value (src_0);

        if (GET_CODE (src_0) == REG && REGNO_POINTER_FLAG (REGNO (src_0)))
          return find_base_value (src_0);

        if (GET_CODE (src_1) == REG && REGNO_POINTER_FLAG (REGNO (src_1)))
          return find_base_value (src_1);

        return 0;
      }


So, if you have (plus (reg) (reg)) and neither has REGNO_POINTER_FLAG
set we do nothing.  Even if one of the regs has a known base value.


One might think that one of the regs should have REGNO_POINTER_FLAG set.

  * Conceptually REGNO_POINTER_FLAG should only be set for a "pure
  pointer".  By pure pointer I mean the value in the reg must be
  within the object being pointed to.

  That sounds odd, but gcc will internally often create a pseudoreg
  outside the bounds of an object, then "correct" it with a later
  addition.  The pseudo which is out of the bounds of the object
  must not have REGNO_POINTER_FLAG set.

  Thus, if you have (set (regC) (plus (regA) (regB))) and regA has
  REGNO_POINTER_FLAG set, you can _not_ set REGNO_POINTER_FLAG for
  regC unless you can prove that the sum of (regA) (regB) still
  points within the same object as (regA).

  It is important to realize that for alias gathering you only
  care about relating values to some base value (ie a symbol_ref,
  frame_pointer_rtx, etc).


  * On a technical level consider a call to emit_iv_add_mult.
  
  It basically computes b * m + a.

  In one case I care about it computes:

  (plus (mult (reg 101) (const_int 8))
	(plus (plus (reg 975) (const_int -8))
	      (reg 1024))

  [ That's not canonicalized, but you get the idea... ]

  Which (of course) creates multiple insns, with temporary
  pseudos holding some values -- we can't get at those pseudos
  easily to propagate REGNO_POINTER_FLAG the way we might want to.

  Furthermore, even though (reg 1024) is the pointer, it is
  difficult to prove that any subexpression involving reg 1024
  is a pointer.



So, how does the PLUS handling tie into peeking at REG_N_SETS?

Basically at the top of the PLUS/MINUS code in find_base_value my
code does something like this:

        if (GET_CODE (src_0) == REG)
          {
            temp = find_base_value (src_0);
            if (temp)
              src_0 = temp;
          }

        if (GET_CODE (src_1) == REG)
          {
            temp = find_base_value (src_1);
            if (temp)
              src_1 = temp;
          }


Which will allow us to pick up any underlying information about
the two regs, regardless of whether or not one is a strict pointer.

Using the example above we might emit code like:

(set (temp1) (plus (reg 975) (const_int -8)))

(set (temp2) (plus (reg 1024) (temp1)))

(set (temp3) (plus (mult (reg 101) (const_int 8)) (temp2)))


Remember (reg 1024) is related to a SYMBOL_REF.

So consider what happens when we find the set for temp2.

find_base_value is called with (plus (reg 1024) (temp1))

(reg 1024) is a pointer, so we call find_base_value (reg 1024), but
since find_base_value just returns the original reg we get nothing
useful back.

If we change find_base_value to return reg_base_values[X] if
the register is only set once, then we can relate temp2 back to
the SYMBOL_REF (via reg 1024).

That's the first improvement, we've now propagated the base value
to one more register.

Then consider what happens for temp3.  We call into find_base_value
with (plus (mult (reg 101) (const_int 8)) (temp2)))

temp2 does not have REGNO_POINTER_FLAG set, so we basically do
nothing and are unable to relate temp3 back to the SYMBOL_REF
(via temp2 & (reg 1024)).

With my change we call find_base_value on temp2, which will give
us back the SYMBOL_REF.  This allows us to relate temp3 back to the
original SYMBOL_REF (via temp2 & (reg 1024).



jeff



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~1997-10-25 10:21 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1997-10-24 17:58 USE patch vs the PA Jeffrey A Law
1997-10-24 19:00 ` John Carr
1997-10-24 22:37   ` Jeffrey A Law
1997-10-25  6:31     ` John Carr
1997-10-25 10:21       ` Jeffrey A Law
1997-10-24 22:45   ` Jeffrey A Law
1997-10-25  9:02 ` Toon Moene

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).