Using MEM_EXPR inside a call expression

* Using MEM_EXPR inside a call expression
@ 2009-08-28 18:16 Adam Nemet
  2009-09-01 16:22 ` Richard Henderson
  0 siblings, 1 reply; 6+ messages in thread
From: Adam Nemet @ 2009-08-28 18:16 UTC (permalink / raw)
  To: gcc

On MIPS, PIC calls are indirect calls that need to be dispatched via an
ABI-mandated register.  At expansion time we load the symbol into a pseudo, then
expand the call.  There is a linker optimization that can turn these indirect
calls into direct calls under some circumstances.  This can improve branch
prediction (the real ABI requirement is that the PIC register is live on entry
to the callee).  To assist the linker we need to annotate the indirect call
with the function symbol.

Since the call is expanded early, later optimizations can change the call so
that it no longer refers to the original function.  E.g.:

void g (void);
void h (void);

void
f (int i, int j)
{
  while (i--)
    if (j)
      g ();
    else
      h ();
}

You would hope that GCC would move the condition out of the loop and preset
the PIC register accordingly with only the indirect call in the loop body.
(No, this does not happen ATM.)

I have, however, seen the call target change in this way with cross-jumping.

AFAICT we have two options.  We either create a simple (local) dataflow in
md_reorg and annotate the calls with the result or set MEM_EXPR of the mem
inside the call to the function decl during expansion.

I have a patch for the latter (I used to do the former in our GCC).  It seems
to me that this perfectly fits with the definition of MEM_EXPR but I don't
think MEM_EXPR is ever used for mems inside calls.  In fact, I don't think any
of the MEM_ATTRS are meaningful in a call expression.

What's promising is that cross-jumping treats them correctly.  As it merges
MEM_ATTRS it clears mismatching MEM_EXPRs no matter where the mem expression
is found in the insn.  And my patch bootstraps successfully.
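
To make the merging behaviour concrete, here is a toy model of it.  It is only
a sketch: struct toy_mem_attrs and toy_merge_mem_attrs are hypothetical names
made up for illustration, not GCC's real MEM_ATTRS representation or its
cross-jumping code.  The only point is that the function annotation survives a
merge when both calls agree and is dropped when they do not.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the MEM_EXPR part of a mem's attributes: the name of
   the function the call is known to target, or NULL if unknown.
   Hypothetical; this is NOT GCC's real data structure.  */
struct toy_mem_attrs
{
  const char *expr;
};

/* Merge the attributes of two calls that are being combined into one.
   The annotation is kept only if both sides name the same function;
   otherwise it is cleared.  */
static struct toy_mem_attrs
toy_merge_mem_attrs (struct toy_mem_attrs a, struct toy_mem_attrs b)
{
  struct toy_mem_attrs merged = { NULL };

  if (a.expr != NULL && b.expr != NULL && strcmp (a.expr, b.expr) == 0)
    merged.expr = a.expr;
  return merged;
}
```

In this model, merging the call to g with the call to h from the loop example
yields a call with no annotation, so the linker would leave it as an indirect
call.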

So, is using MEM_EXPR for this a bad idea?

Adam

* Re: Using MEM_EXPR inside a call expression
  2009-08-28 18:16 Using MEM_EXPR inside a call expression Adam Nemet
@ 2009-09-01 16:22 ` Richard Henderson
  2009-09-01 19:48   ` Adam Nemet
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Henderson @ 2009-09-01 16:22 UTC (permalink / raw)
  To: Adam Nemet; +Cc: gcc

On 08/28/2009 12:38 AM, Adam Nemet wrote:
> ... To assist the linker we need to annotate the indirect call
> with the function symbol.
>
> Since the call is expanded early...

Having experimented with this on Alpha a few years back,
the only thing I can suggest is to not expand them early.

I use a combination of peep2's and normal splitters to
determine if the post-call GP reload is needed, and to
expand the call itself.

r~

* Re: Using MEM_EXPR inside a call expression
  2009-09-01 16:22 ` Richard Henderson
@ 2009-09-01 19:48   ` Adam Nemet
  2009-09-01 20:46     ` Richard Henderson
  0 siblings, 1 reply; 6+ messages in thread
From: Adam Nemet @ 2009-09-01 19:48 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc

Richard Henderson writes:
> On 08/28/2009 12:38 AM, Adam Nemet wrote:
> > ... To assist the linker we need to annotate the indirect call
> > with the function symbol.
> >
> > Since the call is expanded early...
> 
> Having experimented with this on Alpha a few years back,
> the only thing I can suggest is to not expand them early.
> 
> I use a combination of peep2's and normal splitters to
> determine if the post-call GP reload is needed, and to
> expand the call itself.

I see.  So I guess you're saying that there is little chance to optimize the
loop I had in my previous email ;(.

Now suppose we split late: shouldn't we still assume that data-flow can change
later?  IOW, wouldn't we be required to use the literal/lituse counting that
alpha does?

If yes then I guess it's still better to use MEM_EXPR.  MEM_EXPR also has the
benefit that it does not deem indirect calls as different when cross-jumping
compares the insns.  I don't know how important this is though.

Adam

* Re: Using MEM_EXPR inside a call expression
  2009-09-01 19:48   ` Adam Nemet
@ 2009-09-01 20:46     ` Richard Henderson
  2009-09-01 21:50       ` Adam Nemet
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Henderson @ 2009-09-01 20:46 UTC (permalink / raw)
  To: Adam Nemet; +Cc: gcc

On 09/01/2009 12:48 PM, Adam Nemet wrote:
> I see.  So I guess you're saying that there is little chance to optimize the
> loop I had in my previous email ;(.

Not at the rtl level.  Gimple-level loop splitting should do it though.

> Now suppose we split late, shouldn't we still assume that data-flow can change
> later.  IOW, wouldn't we be required to use the literal/lituse counting that
> alpha does?

If you split post-reload, data flow isn't going to change
in any significant way.

> If yes then I guess it's still better to use MEM_EXPR.  MEM_EXPR also has the
> benefit that it does not deem indirect calls as different when cross-jumping
> compares the insns.  I don't know how important this is though.

It depends on how much benefit you get from the direct
branch.  On alpha it's quite a bit, so we work hard to
make sure that we can get one, if at all possible.


r~

* Re: Using MEM_EXPR inside a call expression
  2009-09-01 20:46     ` Richard Henderson
@ 2009-09-01 21:50       ` Adam Nemet
  2009-09-02 18:10         ` Richard Sandiford
  0 siblings, 1 reply; 6+ messages in thread
From: Adam Nemet @ 2009-09-01 21:50 UTC (permalink / raw)
  To: Richard Henderson, rdsandiford; +Cc: gcc

Richard Henderson writes:
> On 09/01/2009 12:48 PM, Adam Nemet wrote:
> > I see.  So I guess you're saying that there is little chance to optimize the
> > loop I had in my previous email ;(.
> 
> Not at the rtl level.  Gimple-level loop splitting should do it though.
> 
> > Now suppose we split late, shouldn't we still assume that data-flow can change
> > later.  IOW, wouldn't we be required to use the literal/lituse counting that
> > alpha does?
> 
> If you split post-reload, data flow isn't going to change
> in any significant way.
> 
> > If yes then I guess it's still better to use MEM_EXPR.  MEM_EXPR also has the
> > benefit that it does not deem indirect calls as different when cross-jumping
> > compares the insns.  I don't know how important this is though.
> 
> It depends on how much benefit you get from the direct
> branch.  On alpha it's quite a bit, so we work hard to
> make sure that we can get one, if at all possible.

Thanks, RTH.

RichardS,

Can you comment on what RTH is suggesting?  Besides cross-jumping I haven't
seen indirect PIC calls get optimized much, and it seems that splitting late
will avoid the data-flow complications.

I can experiment with this but it would be nice to get some early buy-in.

BTW, I have the R_MIPS_JALR patch ready for submission but if we don't need to
worry about data-flow changes then using MEM_EXPR is not necessary.

Adam

* Re: Using MEM_EXPR inside a call expression
  2009-09-01 21:50       ` Adam Nemet
@ 2009-09-02 18:10         ` Richard Sandiford
  0 siblings, 0 replies; 6+ messages in thread
From: Richard Sandiford @ 2009-09-02 18:10 UTC (permalink / raw)
  To: Adam Nemet; +Cc: Richard Henderson, gcc

Adam Nemet <anemet@caviumnetworks.com> writes:
> Richard Henderson writes:
>> On 09/01/2009 12:48 PM, Adam Nemet wrote:
>> > I see.  So I guess you're saying that there is little chance to optimize the
>> > loop I had in my previous email ;(.
>> 
>> Not at the rtl level.  Gimple-level loop splitting should do it though.
>> 
>> > Now suppose we split late, shouldn't we still assume that data-flow can change
>> > later.  IOW, wouldn't we be required to use the literal/lituse counting that
>> > alpha does?
>> 
>> If you split post-reload, data flow isn't going to change
>> in any significant way.
>> 
>> > If yes then I guess it's still better to use MEM_EXPR.  MEM_EXPR also has the
>> > benefit that it does not deem indirect calls as different when cross-jumping
>> > compares the insns.  I don't know how important this is though.
>> 
>> It depends on how much benefit you get from the direct
>> branch.  On alpha it's quite a bit, so we work hard to
>> make sure that we can get one, if at all possible.
>
> Thanks, RTH.
>
> RichardS,
>
> Can you comment on what RTH is suggesting?  Besides cross-jumping I haven't
> seen indirect PIC calls get optimized much, and it seems that splitting late
> will avoid the data-flow complications.
>
> I can experiment with this but it would be nice to get some early buy-in.
>
> BTW, I have the R_MIPS_JALR patch ready for submission but if we don't need to
> worry about data-flow changes then using MEM_EXPR is not necessary.

I guess all three would work, but TBH, I think it's too dangerous to
rely on dataflow not changing in an unwanted way.  We'd also have to say
specifically what that way is, and preferably assert for it somehow.

Personally, I like the dataflow approach you said you'd taken originally.
It's the kind of thing df was designed to make easy, and we already
use df in md_reorg to implement -mr10k-cache-barrier.  It should just
be a case of making sure that all definitions have the same value.
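
Roughly, the check could look like the sketch below.  This is a toy model
under stated assumptions, not code against the real df interfaces: struct
toy_def and toy_common_callee are hypothetical names, and each entry is assumed
to summarize one definition of the call register that reaches the call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of one definition of the PIC call register that
   reaches a call: the symbol it loads, or NULL if the definition is not
   a plain symbol load.  Not a real df structure.  */
struct toy_def
{
  const char *symbol;
};

/* Return the callee symbol if every reaching definition loads the same
   symbol.  Return NULL otherwise, meaning the call should be left
   unannotated.  */
static const char *
toy_common_callee (const struct toy_def *defs, size_t n_defs)
{
  if (n_defs == 0 || defs[0].symbol == NULL)
    return NULL;

  for (size_t i = 1; i < n_defs; i++)
    if (defs[i].symbol == NULL
        || strcmp (defs[i].symbol, defs[0].symbol) != 0)
      return NULL;

  return defs[0].symbol;
}
```

A call reached by two loads of g would be annotated with g; a call reached by
a load of g and a load of h would get no annotation.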

I suppose the danger of using MEM_EXPR is that (in the MIPS case)
it isn't technically correct for functions that are initially
directed at a lazy-binding stub.  It probably wouldn't matter
in practice though, since there'll be no lazy-binding stub if
the address is ever used in a different way.  I don't really
have any objections to using MEM_EXPR.

Richard
