public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
@ 2007-10-08 12:14 Uros Bizjak
  2007-10-08 12:35 ` [Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr Kenneth Zadeck
  2007-10-09  6:01 ` [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638 Ian Lance Taylor
  0 siblings, 2 replies; 22+ messages in thread
From: Uros Bizjak @ 2007-10-08 12:14 UTC (permalink / raw)
  To: GCC Patches

Hello!

This patch fixes the problem with -fforce-addr, where we blindly move
calculation of a function argument stack slot address into the
temporary, but we fail to update REG_DEP_TRUE of a function call,
breaking the dependancy chain.

The failure silently propagates up to dse1 pass, where "unused" stack
slot is removed.

The problem starts at expand, where we generate:

;; bscale = __builtin_powf (2.0e+0, (real4) D.1062)
(insn 1310 1309 1311 t.f:61 (set (reg:SF 646)
        (float:SF (reg:SI 182 [ D.1062 ]))) -1 (nil))

(insn 1311 1310 1312 t.f:61 (parallel [
            (set (reg/f:SI 647)
                (plus:SI (reg/f:SI 56 virtual-outgoing-args)
                    (const_int 4 [0x4])))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil))

(insn 1312 1311 1313 t.f:61 (set (mem:SF (reg/f:SI 647) [0 S4 A32])
        (reg:SF 646)) -1 (nil))

(insn 1313 1312 1314 t.f:61 (set (reg:SF 648)
        (mem/u/c/i:SF (symbol_ref/u:SI ("*.LC4") [flags 0x2]) [9 S4
A32])) -1 (nil))

(insn 1314 1313 1315 t.f:61 (set (mem:SF (reg/f:SI 56
virtual-outgoing-args) [0 S4 A32])
        (reg:SF 648)) -1 (nil))

(call_insn/u 1315 1314 1316 t.f:61 (set (reg:SF 8 st)
        (call (mem:QI (symbol_ref:SI ("powf") [flags 0x41]
<function_decl 0xb7c3a200 __builtin_powf>) [0 S1 A8])
            (const_int 8 [0x8]))) -1 (nil)
    (expr_list:REG_DEP_TRUE (use (mem/i:SF (reg/f:SI 56
virtual-outgoing-args) [0 S4 A32]))
        (expr_list:REG_DEP_TRUE (use (mem/i:SF (plus:SI (reg/f:SI 56
virtual-outgoing-args)
                        (const_int 4 [0x4])) [0 S4 A32]))
            (nil))))


Note, that (call_insn 1315) still refers to "mem (v-o-a + 4)", where
the argument's address is in fact "mem (reg 647)", generated in (insn
1311).

There are two solutions to this problem:

a) update call_insn dependencies with temporary pointer or
b) prevent v-o-a and stack_reg referred addresses to be pushed to a temporary.

Attached patch implements b). Patch was bootstrapped and regression
tested on i686-pc-linux-gnu. OK for mainline?

Unfortunatelly, I was not able to construct short test case, but at
the end of the day, the problem looks kind of obvious.

2007-10-08  Uros Bizjak  <ubizjak@gmail.com>

	PR rtl-optimization/33638
	* explow.c (memory_address): When -fforce-addr is in effect,
	do not force addresses using virtual_outgoing_args_rtx or
	stack_pointer_rtx to a temporary register.

Uros.

Index: explow.c
===================================================================
--- explow.c    (revision 129120)
+++ explow.c    (working copy)
@@ -488,7 +488,13 @@ memory_address (enum machine_mode mode,
     win2:
       x = oldx;
     win:
-      if (flag_force_addr && ! cse_not_expected && !REG_P (x))
+      if (flag_force_addr && ! cse_not_expected && !REG_P (x)
+         /* Do not force address using v_o_a and SP to a temporary.
+            The code below does not update function call's argument
+            dependencies, so dependency chain is broken when
+            function arguments are passed on the stack.  */
+         && ! reg_mentioned_p (virtual_outgoing_args_rtx, x)
+         && ! reg_mentioned_p (stack_pointer_rtx, x))
        {
          x = force_operand (x, NULL_RTX);
          x = force_reg (Pmode, x);

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr
  2007-10-08 12:14 [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638 Uros Bizjak
@ 2007-10-08 12:35 ` Kenneth Zadeck
  2007-10-09  6:01 ` [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638 Ian Lance Taylor
  1 sibling, 0 replies; 22+ messages in thread
From: Kenneth Zadeck @ 2007-10-08 12:35 UTC (permalink / raw)
  To: gcc-bugzilla, gcc-patches; +Cc: Uros Bizjak


Your (a) solution would only paper over the problem: dse assumes that
it can see all of the loads and stores off of the frame in more places
than that call.  So fixing the call dependencies would have just been
the tip of the iceberg.

I like this solution but i am not authorized to approve this patch.

Kenny

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-08 12:14 [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638 Uros Bizjak
  2007-10-08 12:35 ` [Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr Kenneth Zadeck
@ 2007-10-09  6:01 ` Ian Lance Taylor
  2007-10-09  6:38   ` Eric Botcazou
  1 sibling, 1 reply; 22+ messages in thread
From: Ian Lance Taylor @ 2007-10-09  6:01 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: GCC Patches, zadeck

"Uros Bizjak" <ubizjak@gmail.com> writes:

> 2007-10-08  Uros Bizjak  <ubizjak@gmail.com>
> 
> 	PR rtl-optimization/33638
> 	* explow.c (memory_address): When -fforce-addr is in effect,
> 	do not force addresses using virtual_outgoing_args_rtx or
> 	stack_pointer_rtx to a temporary register.

It's hard for me to be entirely happy with this patch.

Some processors don't have register offset addressing, so all access
to the stack is done by copying and adding registers.  It seems to me
that such processors might have trouble with dse.c as currently
written.

On the other hand, -fforce-addr is meaningful for stack addresses,
especially on architectures with only small stack pointer offsets or
for which register direct instructions are smaller than register
offset addresses.

If I understand the problem correctly, the issue is that the address
of a function argument was loaded into a register.  A later store
using that register was not noticed as being for a function argument,
and was eliminated as dead although it was not.  All would be well if
you knew that the store was for a function argument.  And, of course,
the issue only arises for const functions, as for a non-const function
all stores are live.

There is special code in calls.c to mark the stack usage of const
functions: around line 2700, in expand_call.  It seems to me that if
store_one_arg comes up with a different address, because it calls
memory_address, you should put that address also in
CALL_INSN_FUNCTION_USAGE.  That ought to fix the immediate problem as
well.

I think there is still a lurking issue, which is that in principle
dse.c really needs to record which stores may be for function
arguments.  I don't see a really good way to do that, though of course
it could be done by setting and propagating a register flag.

Ian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-09  6:01 ` [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638 Ian Lance Taylor
@ 2007-10-09  6:38   ` Eric Botcazou
  2007-10-09  7:19     ` Ian Lance Taylor
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Botcazou @ 2007-10-09  6:38 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-patches, Uros Bizjak, zadeck

> > 2007-10-08  Uros Bizjak  <ubizjak@gmail.com>
> >
> > 	PR rtl-optimization/33638
> > 	* explow.c (memory_address): When -fforce-addr is in effect,
> > 	do not force addresses using virtual_outgoing_args_rtx or
> > 	stack_pointer_rtx to a temporary register.
>
> It's hard for me to be entirely happy with this patch.

FWIW I agree.

> There is special code in calls.c to mark the stack usage of const
> functions: around line 2700, in expand_call.  It seems to me that if
> store_one_arg comes up with a different address, because it calls
> memory_address, you should put that address also in
> CALL_INSN_FUNCTION_USAGE.  That ought to fix the immediate problem as
> well.

Does dse.c take into account CALL_INSN_FUNCTION_USAGE at all?  Flow.c did.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-09  6:38   ` Eric Botcazou
@ 2007-10-09  7:19     ` Ian Lance Taylor
  2007-10-09 11:51       ` Kenneth Zadeck
  2007-10-09 19:37       ` Eric Botcazou
  0 siblings, 2 replies; 22+ messages in thread
From: Ian Lance Taylor @ 2007-10-09  7:19 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Uros Bizjak, zadeck

Eric Botcazou <ebotcazou@libertysurf.fr> writes:

> > There is special code in calls.c to mark the stack usage of const
> > functions: around line 2700, in expand_call.  It seems to me that if
> > store_one_arg comes up with a different address, because it calls
> > memory_address, you should put that address also in
> > CALL_INSN_FUNCTION_USAGE.  That ought to fix the immediate problem as
> > well.
> 
> Does dse.c take into account CALL_INSN_FUNCTION_USAGE at all?  Flow.c did.

It does.  The problem is that CALL_INSN_FUNCTION_USAGE has a mem with
virtual-outgoing-args, but the actual store is done via a reg.  The
address was copied into the reg due to -fforce-addr.  This can be seen
in the RTL which Uros posted.

dse.c is not smart enough to see that the address in the
CALL_INSN_FUNCTION_USAGE is the same as the address in the register.
This is an issue that only arises for const functions.

Ian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-09  7:19     ` Ian Lance Taylor
@ 2007-10-09 11:51       ` Kenneth Zadeck
  2007-10-09 12:31         ` Uros Bizjak
  2007-10-09 15:42         ` Ian Lance Taylor
  2007-10-09 19:37       ` Eric Botcazou
  1 sibling, 2 replies; 22+ messages in thread
From: Kenneth Zadeck @ 2007-10-09 11:51 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: Eric Botcazou, gcc-patches, Uros Bizjak

Ian Lance Taylor wrote:
> Eric Botcazou <ebotcazou@libertysurf.fr> writes:
>
>   
>>> There is special code in calls.c to mark the stack usage of const
>>> functions: around line 2700, in expand_call.  It seems to me that if
>>> store_one_arg comes up with a different address, because it calls
>>> memory_address, you should put that address also in
>>> CALL_INSN_FUNCTION_USAGE.  That ought to fix the immediate problem as
>>> well.
>>>       
>> Does dse.c take into account CALL_INSN_FUNCTION_USAGE at all?  Flow.c did.
>>     
>
> It does.  The problem is that CALL_INSN_FUNCTION_USAGE has a mem with
> virtual-outgoing-args, but the actual store is done via a reg.  The
> address was copied into the reg due to -fforce-addr.  This can be seen
> in the RTL which Uros posted.
>
> dse.c is not smart enough to see that the address in the
> CALL_INSN_FUNCTION_USAGE is the same as the address in the register.
> This is an issue that only arises for const functions.
>
> Ian
>   
There are two places where dse uses knowledge of an address being
frame_related:
1) in this code for const functions.
2) in code that deletes stores that die because the function ends.

(1) is sensitive to this problem in that it will get the wrong answer
for an arg to a const function.  (2) will only miss opportunities to
remove dead insns if some frame based stores are not identified. 

it is not that dse is not smart enough to see this. dse is cselib based
which is a local analysis, and it is not smart enough to see this.  I
believe that the reason that flow never ran into this problem was that
flow did not try to do this.  My dse is more aggressive than flow was.

As to ian's point in the prior mail:

> Some processors don't have register offset addressing, so all access
> to the stack is done by copying and adding registers.  It seems to me
> that such processors might have trouble with dse.c as currently
> written.


I am surprised that this has not bitten us before.  this code has been
in since the dataflow merge and if some arch really had to do this, i
would have expected we would have seen this before.  I can understand
being lucky about this because -fforce-addr is a rarely used flag, but i
have trouble seeing how this could even bootstrap on a processor that
had to do it.

I can easily turn off the optimization if the -fforce-addr is given.  I
guess we could add a flag the the md files of the processors that need
to do this of the form: PROCESSOR_LOADS_STACK_ADDRESSES_INTO_REGS or
something like this.
I am a little scared about having to thread a new flag onto registers
thru the optimization stack at this point. 

Kenny

Kenny

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-09 11:51       ` Kenneth Zadeck
@ 2007-10-09 12:31         ` Uros Bizjak
  2007-10-09 15:42           ` Ian Lance Taylor
  2007-10-09 15:42         ` Ian Lance Taylor
  1 sibling, 1 reply; 22+ messages in thread
From: Uros Bizjak @ 2007-10-09 12:31 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: GCC Patches, Ian Lance Taylor, Eric Botcazou

Hello!

Looking a bit further into the problem, I have noticed that actual
argument address becomes different from what CALL_INSN_FUNCTION_USAGE
claims in store_one_arg() function. The problem is in the call to
emit_push_insn() in line 4280. emit_push_insn() calls memory_address()
that fubars the address.

An (less intrusive?) variant of my previous patch is to change call to
memory_address in emit_push_insn() to memory_address_noforce(). I
can't see the benefit of

load reg <- (SP + off)
store (reg) <- regX

since IMO the address of the pushed argument won't be CSE'd with any
other address in any way.

BTW: I have tried to update arg->stack just after the call to
emit_push_insn in store_one_arg with SET_DEST (PATTERN (get_last_insn
())) to update CALL_INSN_FUNCTION_USAGE, but IMO, this is too hacky.

Uros.

Index: expr.c
===================================================================
--- expr.c      (revision 129152)
+++ expr.c      (working copy)
@@ -3910,13 +3910,12 @@ emit_push_insn (rtx x, enum machine_mode
 #endif
        {
          if (GET_CODE (args_so_far) == CONST_INT)
-           addr
-             = memory_address (mode,
-                               plus_constant (args_addr,
-                                              INTVAL (args_so_far)));
+           addr = plus_constant (args_addr, INTVAL (args_so_far));
          else
-           addr = memory_address (mode, gen_rtx_PLUS (Pmode, args_addr,
-                                                      args_so_far));
+           addr = gen_rtx_PLUS (Pmode, args_addr, args_so_far);
+
+         addr = memory_address_noforce (mode, addr);
+
          dest = gen_rtx_MEM (mode, addr);

          /* We do *not* set_mem_attributes here, because incoming arguments

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-09 11:51       ` Kenneth Zadeck
  2007-10-09 12:31         ` Uros Bizjak
@ 2007-10-09 15:42         ` Ian Lance Taylor
  2007-10-09 16:05           ` Kenneth Zadeck
  1 sibling, 1 reply; 22+ messages in thread
From: Ian Lance Taylor @ 2007-10-09 15:42 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: Eric Botcazou, gcc-patches, Uros Bizjak

Kenneth Zadeck <zadeck@naturalbridge.com> writes:

> There are two places where dse uses knowledge of an address being
> frame_related:
> 1) in this code for const functions.
> 2) in code that deletes stores that die because the function ends.

I think you may be confusing "frame related" with "pushes argument
onto the stack."  A frame related instruction is one which, e.g.,
saves registers on the stack in the function prologue.  That is not
the issue here.  The instructions which push arguments onto the stack
are not frame related.

> As to ian's point in the prior mail:
> 
> > Some processors don't have register offset addressing, so all access
> > to the stack is done by copying and adding registers.  It seems to me
> > that such processors might have trouble with dse.c as currently
> > written.
> 
> 
> I am surprised that this has not bitten us before.  this code has been
> in since the dataflow merge and if some arch really had to do this, i
> would have expected we would have seen this before.  I can understand
> being lucky about this because -fforce-addr is a rarely used flag, but i
> have trouble seeing how this could even bootstrap on a processor that
> had to do it.

The processor's I'm talking about are all DSP processors, rarely used,
certainly never used to bootstrap gcc.

> I can easily turn off the optimization if the -fforce-addr is given.  I
> guess we could add a flag the the md files of the processors that need
> to do this of the form: PROCESSOR_LOADS_STACK_ADDRESSES_INTO_REGS or
> something like this.
> I am a little scared about having to thread a new flag onto registers
> thru the optimization stack at this point. 

dse.c has special optimization for const and pure functions (search
for CONST_OR_PURE_CALL in dse.c).  A safe change would be to remove
that optimization when those functions take arguments on the stack.

Ian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-09 12:31         ` Uros Bizjak
@ 2007-10-09 15:42           ` Ian Lance Taylor
  0 siblings, 0 replies; 22+ messages in thread
From: Ian Lance Taylor @ 2007-10-09 15:42 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Kenneth Zadeck, GCC Patches, Eric Botcazou

"Uros Bizjak" <ubizjak@gmail.com> writes:

> An (less intrusive?) variant of my previous patch is to change call to
> memory_address in emit_push_insn() to memory_address_noforce(). I
> can't see the benefit of
> 
> load reg <- (SP + off)
> store (reg) <- regX
> 
> since IMO the address of the pushed argument won't be CSE'd with any
> other address in any way.

It can be CSE'd if you make two function calls in a row.

Whether that CSE is useful is admittedly another question.

> BTW: I have tried to update arg->stack just after the call to
> emit_push_insn in store_one_arg with SET_DEST (PATTERN (get_last_insn
> ())) to update CALL_INSN_FUNCTION_USAGE, but IMO, this is too hacky.

But CALL_INSN_FUNCTION_USAGE has not been set at that point.  The code
calls store_one_arg, then adds an entry to CALL_INSN_FUNCTION_USAGE
for a const function.  You would just have to arrange for the call to
store_one_arg to return the actual address that it uses.

This is not a complete solution, as nothing prevents that address from
getting munged in some other way between expand and dse.

Arguably the bug is assuming that a note does a good job of conveying
that an address is being used.  This only works if the address doesn't
change in any way, or if the other optimizations are smart enough to
pick up (which in general they aren't).  The note was added for PR
12372, which was about a similar case for a sibcall: the store was
getting deleted before the jump.  It wouldn't be surprising if the
test case in PR 12372 failed with -fforce-addr when using flow.

Ian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-09 15:42         ` Ian Lance Taylor
@ 2007-10-09 16:05           ` Kenneth Zadeck
  2007-10-09 17:09             ` Ian Lance Taylor
  0 siblings, 1 reply; 22+ messages in thread
From: Kenneth Zadeck @ 2007-10-09 16:05 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: Eric Botcazou, gcc-patches, Uros Bizjak

Ian Lance Taylor wrote:
> Kenneth Zadeck <zadeck@naturalbridge.com> writes:
>
>   
>> There are two places where dse uses knowledge of an address being
>> frame_related:
>> 1) in this code for const functions.
>> 2) in code that deletes stores that die because the function ends.
>>     
>
> I think you may be confusing "frame related" with "pushes argument
> onto the stack."  A frame related instruction is one which, e.g.,
> saves registers on the stack in the function prologue.  That is not
> the issue here.  The instructions which push arguments onto the stack
> are not frame related.
>
>   
I understand that there is something called frame_related which is used
in prologue and epilogue.

in a earlier email i made it clear that inside of dse, i call things
frame related if they are related to the stack frame as these pushes are. 
>> As to ian's point in the prior mail:
>>
>>     
>>> Some processors don't have register offset addressing, so all access
>>> to the stack is done by copying and adding registers.  It seems to me
>>> that such processors might have trouble with dse.c as currently
>>> written.
>>>       
>> I am surprised that this has not bitten us before.  this code has been
>> in since the dataflow merge and if some arch really had to do this, i
>> would have expected we would have seen this before.  I can understand
>> being lucky about this because -fforce-addr is a rarely used flag, but i
>> have trouble seeing how this could even bootstrap on a processor that
>> had to do it.
>>     
>
> The processor's I'm talking about are all DSP processors, rarely used,
> certainly never used to bootstrap gcc.
>
>   
>> I can easily turn off the optimization if the -fforce-addr is given.  I
>> guess we could add a flag the the md files of the processors that need
>> to do this of the form: PROCESSOR_LOADS_STACK_ADDRESSES_INTO_REGS or
>> something like this.
>> I am a little scared about having to thread a new flag onto registers
>> thru the optimization stack at this point. 
>>     
>
> dse.c has special optimization for const and pure functions (search
> for CONST_OR_PURE_CALL in dse.c).  A safe change would be to remove
> that optimization when those functions take arguments on the stack.
>
>   
The whole point of this part of the optimization was in fact to remove
redundant pushes.
Given that this is a perfectly useful optimization on most real
architectures, i think that it would be more useful to mark/discover the
ones that it is not useful for rather than turning it off for everyone. 

> Ian
>   

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-09 16:05           ` Kenneth Zadeck
@ 2007-10-09 17:09             ` Ian Lance Taylor
  0 siblings, 0 replies; 22+ messages in thread
From: Ian Lance Taylor @ 2007-10-09 17:09 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: Eric Botcazou, gcc-patches, Uros Bizjak

Kenneth Zadeck <zadeck@naturalbridge.com> writes:

> in a earlier email i made it clear that inside of dse, i call things
> frame related if they are related to the stack frame as these pushes are. 

Sorry, I missed that.  Frame related has a specific meaning in gcc.
The problem here is specifically pushing arguments on the stack, not
any other part of frame usage.


> > dse.c has special optimization for const and pure functions (search
> > for CONST_OR_PURE_CALL in dse.c).  A safe change would be to remove
> > that optimization when those functions take arguments on the stack.
> >
> >   
> The whole point of this part of the optimization was in fact to remove
> redundant pushes.
> Given that this is a perfectly useful optimization on most real
> architectures, i think that it would be more useful to mark/discover the
> ones that it is not useful for rather than turning it off for everyone. 

The obvious first cut is that you only have to worry about const
functions for which parameters are stored on the stack without using
push instructions.  Admittedly that includes newer x86 processors.

If you want to do this safely then you need to know which stores are
for arguments.  I don't see any way around that.  Focusing on
-fforce-addr amounts to focusing on the current state of RTL
optimizations, rather than on what is valid.

We can go that way--something along the lines of Uros's patch to use
memory_addr_noforce--if nobody wants to imlement the real solution.
But we need some detailed comments in the code, and we need some
complete test cases to find the future breakage.

Ian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-09  7:19     ` Ian Lance Taylor
  2007-10-09 11:51       ` Kenneth Zadeck
@ 2007-10-09 19:37       ` Eric Botcazou
  2007-10-09 19:55         ` Kenneth Zadeck
  2007-10-10 10:45         ` Eric Botcazou
  1 sibling, 2 replies; 22+ messages in thread
From: Eric Botcazou @ 2007-10-09 19:37 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-patches, Uros Bizjak, zadeck

> It does.  The problem is that CALL_INSN_FUNCTION_USAGE has a mem with
> virtual-outgoing-args, but the actual store is done via a reg.  The
> address was copied into the reg due to -fforce-addr.

No, I don't think it does.  Apply

Index: cselib.c
===================================================================
--- cselib.c    (revision 129158)
+++ cselib.c    (working copy)
@@ -1687,7 +1687,8 @@ cselib_process_insn (rtx insn)
   if (CALL_P (insn))
     {
       for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-       if (call_used_regs[i]
+       if ((call_used_regs[i]
+            && !(ACCUMULATE_OUTGOING_ARGS && i == STACK_POINTER_REGNUM))
            || (REG_VALUES (i) && REG_VALUES (i)->elt
                && HARD_REGNO_CALL_PART_CLOBBERED (i,
                      GET_MODE (REG_VALUES (i)->elt->val_rtx))))

and comment out the special code in dse.c for const functions that invalidates
all the stores into the frame.  Now compile the testcase without -fforce-addr 
and you'll get in cse2:

(insn 1039 1038 1040 149 comunpack.f:61 (set (mem:SF (plus:SI (reg/f:SI 7 sp)
                (const_int 4 [0x4])) [0 S4 A32])
        (reg:SF 374)) 66 {*movsf_1} (expr_list:REG_DEAD (reg:SF 374)
        (nil)))

[...]

(call_insn/u 1042 1041 1043 149 comunpack.f:61 (set (reg:SF 8 st)
        (call (mem:QI (symbol_ref:SI ("powf") [flags 0x41] <function_decl 
0x55704200 __builtin_powf>) [0 S1 A8])
            (const_int 8 [0x8]))) 611 {*call_value_0} (nil)
    (expr_list:REG_DEP_TRUE (use (mem/i:SF (reg/f:SI 7 sp) [0 S4 A32]))
        (expr_list:REG_DEP_TRUE (use (mem/i:SF (plus:SI (reg/f:SI 7 sp)
                        (const_int 4 [0x4])) [0 S4 A32]))
            (nil))))

[...]

(insn 1048 1047 1049 149 comunpack.f:62 (set (mem:SF (plus:SI (reg/f:SI 7 sp)
                (const_int 4 [0x4])) [0 S4 A32])
        (reg:SF 379)) 66 {*movsf_1} (expr_list:REG_DEAD (reg:SF 379)
        (nil)))


And then in dse1:

**scanning insn=1048
  mem: (plus:SI (reg/f:SI 7 sp)
    (const_int 4 [0x4]))

   after cselib_expand address: (plus:SI (reg/f:SI 7 sp)
    (const_int 4 [0x4]))

   after canon_rtx address: (plus:SI (reg/f:SI 7 sp)
    (const_int 4 [0x4]))
  varying cselib base=2 offset = 4
 processing cselib store [4..8)
    trying store in insn=1041 gid=-1[0..4)
    trying store in insn=1039 gid=-1[4..8)
Locally deleting insn 1039 
deferring deletion of insn with uid = 1039.
mems_found = 1, cannot_delete = false
[...]
deleting insn with uid = 1039.

and yet there is an explicit use for the store in CALL_INSN_FUNCTION_USAGE.

> dse.c is not smart enough to see that the address in the
> CALL_INSN_FUNCTION_USAGE is the same as the address in the register.

I think that dse.c doesn't even look into CALL_INSN_FUNCTION_USAGE.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-09 19:37       ` Eric Botcazou
@ 2007-10-09 19:55         ` Kenneth Zadeck
  2007-10-09 21:55           ` Eric Botcazou
  2007-10-10 10:45         ` Eric Botcazou
  1 sibling, 1 reply; 22+ messages in thread
From: Kenneth Zadeck @ 2007-10-09 19:55 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Ian Lance Taylor, gcc-patches, Uros Bizjak

Eric Botcazou wrote:
>> It does.  The problem is that CALL_INSN_FUNCTION_USAGE has a mem with
>> virtual-outgoing-args, but the actual store is done via a reg.  The
>> address was copied into the reg due to -fforce-addr.
>>     
>
> No, I don't think it does.  Apply
>
> Index: cselib.c
> ===================================================================
> --- cselib.c    (revision 129158)
> +++ cselib.c    (working copy)
> @@ -1687,7 +1687,8 @@ cselib_process_insn (rtx insn)
>    if (CALL_P (insn))
>      {
>        for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
> -       if (call_used_regs[i]
> +       if ((call_used_regs[i]
> +            && !(ACCUMULATE_OUTGOING_ARGS && i == STACK_POINTER_REGNUM))
>             || (REG_VALUES (i) && REG_VALUES (i)->elt
>                 && HARD_REGNO_CALL_PART_CLOBBERED (i,
>                       GET_MODE (REG_VALUES (i)->elt->val_rtx))))
>
> and comment out the special code in dse.c for const functions that invalidates
> all the stores into the frame.  Now compile the testcase without -fforce-addr 
> and you'll get in cse2:
>
> (insn 1039 1038 1040 149 comunpack.f:61 (set (mem:SF (plus:SI (reg/f:SI 7 sp)
>                 (const_int 4 [0x4])) [0 S4 A32])
>         (reg:SF 374)) 66 {*movsf_1} (expr_list:REG_DEAD (reg:SF 374)
>         (nil)))
>
> [...]
>
> (call_insn/u 1042 1041 1043 149 comunpack.f:61 (set (reg:SF 8 st)
>         (call (mem:QI (symbol_ref:SI ("powf") [flags 0x41] <function_decl 
> 0x55704200 __builtin_powf>) [0 S1 A8])
>             (const_int 8 [0x8]))) 611 {*call_value_0} (nil)
>     (expr_list:REG_DEP_TRUE (use (mem/i:SF (reg/f:SI 7 sp) [0 S4 A32]))
>         (expr_list:REG_DEP_TRUE (use (mem/i:SF (plus:SI (reg/f:SI 7 sp)
>                         (const_int 4 [0x4])) [0 S4 A32]))
>             (nil))))
>
> [...]
>
> (insn 1048 1047 1049 149 comunpack.f:62 (set (mem:SF (plus:SI (reg/f:SI 7 sp)
>                 (const_int 4 [0x4])) [0 S4 A32])
>         (reg:SF 379)) 66 {*movsf_1} (expr_list:REG_DEAD (reg:SF 379)
>         (nil)))
>
>
> And then in dse1:
>
> **scanning insn=1048
>   mem: (plus:SI (reg/f:SI 7 sp)
>     (const_int 4 [0x4]))
>
>    after cselib_expand address: (plus:SI (reg/f:SI 7 sp)
>     (const_int 4 [0x4]))
>
>    after canon_rtx address: (plus:SI (reg/f:SI 7 sp)
>     (const_int 4 [0x4]))
>   varying cselib base=2 offset = 4
>  processing cselib store [4..8)
>     trying store in insn=1041 gid=-1[0..4)
>     trying store in insn=1039 gid=-1[4..8)
> Locally deleting insn 1039 
> deferring deletion of insn with uid = 1039.
> mems_found = 1, cannot_delete = false
> [...]
> deleting insn with uid = 1039.
>
> and yet there is an explicit use for the store in CALL_INSN_FUNCTION_USAGE.
>
>   
>> dse.c is not smart enough to see that the address in the
>> CALL_INSN_FUNCTION_USAGE is the same as the address in the register.
>>     
>
> I think that dse.c doesn't even look into CALL_INSN_FUNCTION_USAGE.
>
>   
no, it doesn't.  I did not know i had to.  At the time i wrote this
code, i assumed that when i canonized the insn, i would just see the
reference to the frame in the rtl. 

Eric, i am not arguing that my code in dse should stay the way that it
is, i was just hoping somehow i could salvage the ability to do better
things for accesses off of the frame pointer. 

Another issue that this brings up is the code that deletes stores into
the frame pointer that die when the function returns.  I am now a little
worried that code may be also compromised.  If i am able to see the
store as being relative to the frame but there is some load whose
address has been stored in a register, i am screwed.

Kenny

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-09 19:55         ` Kenneth Zadeck
@ 2007-10-09 21:55           ` Eric Botcazou
  2007-10-10 12:19             ` Eric Botcazou
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Botcazou @ 2007-10-09 21:55 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: gcc-patches, Ian Lance Taylor, Uros Bizjak

> Eric, i am not arguing that my code in dse should stay the way that it
> is, i was just hoping somehow i could salvage the ability to do better
> things for accesses off of the frame pointer.

No disagreement. :-)  I'm still trying to figure out whether flow.c would 
handle this correctly with the help of CALL_INSN_FUNCTION_USAGE.

Note that dse.c also uses the alias.c machinery (canon_rtx), which is global.
I'm under the impression that by enhancing a little this machinery, we could 
match the ability of gcse.c to detect that the stack pointer is invariant in 
the function; if we had this, canon_rtx would return the right thing.

> Another issue that this brings up is the code that deletes stores into
> the frame pointer that die when the function returns.  I am now a little
> worried that code may be also compromised.  If i am able to see the
> store as being relative to the frame but there is some load whose
> address has been stored in a register, i am screwed.

This should be more or less an equivalent problem.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-09 19:37       ` Eric Botcazou
  2007-10-09 19:55         ` Kenneth Zadeck
@ 2007-10-10 10:45         ` Eric Botcazou
  1 sibling, 0 replies; 22+ messages in thread
From: Eric Botcazou @ 2007-10-10 10:45 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-patches, Uros Bizjak, zadeck

> No, I don't think it does.  Apply
>
> Index: cselib.c
> ===================================================================
> --- cselib.c    (revision 129158)
> +++ cselib.c    (working copy)
> @@ -1687,7 +1687,8 @@ cselib_process_insn (rtx insn)
>    if (CALL_P (insn))
>      {
>        for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
> -       if (call_used_regs[i]
> +       if ((call_used_regs[i]
> +            && !(ACCUMULATE_OUTGOING_ARGS && i == STACK_POINTER_REGNUM))
>
>             || (REG_VALUES (i) && REG_VALUES (i)->elt
>
>                 && HARD_REGNO_CALL_PART_CLOBBERED (i,
>                       GET_MODE (REG_VALUES (i)->elt->val_rtx))))
>
> and comment out the special code in dse.c for const functions that
> invalidates all the stores into the frame.

In fact the second part is not necessary, just apply the first part or

Index: cselib.c
===================================================================
--- cselib.c    (revision 129158)
+++ cselib.c    (working copy)
@@ -1687,7 +1687,7 @@ cselib_process_insn (rtx insn)
   if (CALL_P (insn))
     {
       for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-       if (call_used_regs[i]
+       if (TEST_HARD_REG_BIT (regs_invalidated_by_call, i)
            || (REG_VALUES (i) && REG_VALUES (i)->elt
                && HARD_REGNO_CALL_PART_CLOBBERED (i,
                      GET_MODE (REG_VALUES (i)->elt->val_rtx))))

and dse.c will delete stores based on the stack pointer around const 
functions.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-09 21:55           ` Eric Botcazou
@ 2007-10-10 12:19             ` Eric Botcazou
  2007-10-10 19:21               ` Eric Botcazou
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Botcazou @ 2007-10-10 12:19 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: gcc-patches, Ian Lance Taylor, Uros Bizjak

> > Another issue that this brings up is the code that deletes stores into
> > the frame pointer that die when the function returns.  I am now a little
> > worried that code may be also compromised.  If i am able to see the
> > store as being relative to the frame but there is some load whose
> > address has been stored in a register, i am screwed.
>
> This should be more or less an equivalent problem.

I was wrong.  Part of the problem stems from the confusion in dse.c about what
"stores into the frame" are.  They are defined as stores whose base is either
frame_pointer_rtx or hard_frame_pointer_rtx.  That's the right definition for 
the purpose quoted above and I think that these stores should be immune to 
the problem (address loaded into pseudo) because both frame_pointer_rtx and 
hard_frame_pointer_rtx are !rtx_varies_p, so canon_rtx will really return the 
canonicalized address.

Now, for const functions, we are not interested in these stores into the frame 
as defined above, we are interested in stores based on the *stack pointer*.

Quoting flow.c:

	  /* Non-constant calls clobber memory, constant calls do not
	     clobber memory, though they may clobber outgoing arguments
	     on the stack.  */
	  if (! CONST_OR_PURE_CALL_P (insn))
	    {
	      free_EXPR_LIST_list (&pbi->mem_set_list);
	      pbi->mem_set_list_len = 0;
	    }
	  else
	    invalidate_mems_from_set (pbi, stack_pointer_rtx);

And the solution to the canonicalization problem for those stores is already 
present in gcse.c:

static bool
store_killed_in_insn (const_rtx x, const_rtx x_regs, const_rtx insn, int 
after)
{
  const_rtx reg, base, note, pat;

  if (!INSN_P (insn))
    return false;

  if (CALL_P (insn))
    {
      /* A normal or pure call might read from pattern,
	 but a const call will not.  */
      if (! CONST_OR_PURE_CALL_P (insn) || pure_call_p (insn))
	return true;

      /* But even a const call reads its parameters.  Check whether the
	 base of some of registers used in mem is stack pointer.  */
      for (reg = x_regs; reg; reg = XEXP (reg, 1))
	{
	  base = find_base_term (XEXP (reg, 0));
	  if (!base
	      || (GET_CODE (base) == ADDRESS
		  && GET_MODE (base) == Pmode
		  && XEXP (base, 0) == stack_pointer_rtx))
	    return true;


(insn 1312 1310 1313 149 comunpack.f:61 (set (mem:SF (reg/f:SI 755) [0 S4 
A32])
        (reg:SF 646)) 66 {*movsf_1} (expr_list:REG_DEAD (reg:SF 646)
        (nil)))

(gdb) p debug_rtx(0x558bde40)
(reg/f:SI 755)
$19 = void
(gdb) p debug_rtx(find_base_term(0x558bde40))
(address:SI (reg/f:SI 7 sp))
$20 = void

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-10 12:19             ` Eric Botcazou
@ 2007-10-10 19:21               ` Eric Botcazou
  2007-10-12 15:49                 ` Ian Lance Taylor
  0 siblings, 1 reply; 22+ messages in thread
From: Eric Botcazou @ 2007-10-10 19:21 UTC (permalink / raw)
  To: Kenneth Zadeck; +Cc: gcc-patches, Ian Lance Taylor, Uros Bizjak

[-- Attachment #1: Type: text/plain, Size: 660 bytes --]

> Now, for const functions, we are not interested in these stores into the
> frame as defined above, we are interested in stores based on the *stack
> pointer*.

	 * dse.c (struct insn_info): Remove 'stack_read' field,
	add 'stack_pointer_based' field.
	(record_store): For stores with non-constant base, record
	whether it is stack pointer based.
	(scan_insn): For the call to a const function, remove stack
	pointer based stores from the list of local active stores.
	(scan_reads_nospill): Delete code dealing with const functions.
	

This is enough to fix the problem, but I think we should add support for 
CALL_INSN_FUNCTION_USAGE too.

-- 
Eric Botcazou

[-- Attachment #2: pr33638.diff --]
[-- Type: text/x-diff, Size: 4101 bytes --]

Index: dse.c
===================================================================
--- dse.c	(revision 129158)
+++ dse.c	(working copy)
@@ -284,12 +284,11 @@ struct insn_info 
      contains a wild read, the use_rec will be null.  */
   bool wild_read;
 
-  /* This field is set for const function calls.  Const functions
-     cannot read memory, but they can read the stack because that is
-     where they may get their parms.  So having this set is less
-     severe than a wild read, it just means that all of the stores to
-     the stack are killed rather than all stores.  */
-  bool stack_read;
+  /* This field is only used for the processing of const functions.
+     These functions cannot read memory, but they can read the stack
+     because that is where they may get their parms.  It is set to
+     true if the insn may contain a stack pointer based store.  */
+  bool stack_pointer_based;
 
   /* This is true if any of the sets within the store contains a
      cselib base.  Such stores can only be deleted by the local
@@ -941,8 +940,9 @@ add_wild_read (bb_info_t bb_info)
 }
 
 
-/* Return true if X is a constant or one of the registers that behaves
-   as a constant over the life of a function.  */
+/* Return true if X is a constant or one of the registers that behave
+   as a constant over the life of a function.  This is equivalent to
+   !rtx_varies_p for memory addresses.  */
 
 static bool
 const_or_frame_p (rtx x)
@@ -1245,8 +1245,15 @@ record_store (rtx body, bb_info_t bb_inf
     }
   else
     {
-      store_info = pool_alloc (cse_store_info_pool);
+      rtx base_term = find_base_term (XEXP (mem, 0));
+      if (!base_term
+	  || (GET_CODE (base_term) == ADDRESS
+	      && GET_MODE (base_term) == Pmode
+	      && XEXP (base_term, 0) == stack_pointer_rtx))
+	insn_info->stack_pointer_based = true;
       insn_info->contains_cselib_groups = true;
+
+      store_info = pool_alloc (cse_store_info_pool);
       group_id = -1;
 
       if (dump_file)
@@ -1948,9 +1955,10 @@ scan_insn (bb_info_t bb_info, rtx insn)
   if (CALL_P (insn))
     {
       insn_info->cannot_delete = true;
+
       /* Const functions cannot do anything bad i.e. read memory,
-	 however, they can read their parameters which may have been
-	 pushed onto the stack.  */
+	 however, they can read their parameters which may have
+	 been pushed onto the stack.  */
       if (CONST_OR_PURE_CALL_P (insn) && !pure_call_p (insn))
 	{
 	  insn_info_t i_ptr = active_local_stores;
@@ -1961,15 +1969,8 @@ scan_insn (bb_info_t bb_info, rtx insn)
 
 	  while (i_ptr)
 	    {
-	      store_info_t store_info = i_ptr->store_rec;
-
-	      /* Skip the clobbers.  */
-	      while (!store_info->is_set)
-		store_info = store_info->next;
-
-	      /* Remove the frame related stores.  */
-	      if (store_info->group_id >= 0
-		  && VEC_index (group_info_t, rtx_group_vec, store_info->group_id)->frame_related)
+	      /* Remove the stack pointer based stores.  */
+	      if (i_ptr->stack_pointer_based)
 		{
 		  if (dump_file)
 		    dump_insn_info ("removing from active", i_ptr);
@@ -1983,14 +1984,12 @@ scan_insn (bb_info_t bb_info, rtx insn)
 		last = i_ptr;
 	      i_ptr = i_ptr->next_local_store;
 	    }
-
-	  insn_info->stack_read = true;
-	  
-	  return;
 	}
 
-      /* Every other call, including pure functions may read memory.  */
-      add_wild_read (bb_info);
+      else
+	/* Every other call, including pure functions, may read memory.  */
+	add_wild_read (bb_info);
+
       return;
     }
 
@@ -2492,18 +2491,6 @@ scan_reads_nospill (insn_info_t insn_inf
   int i;
   group_info_t group;
 
-  /* For const function calls kill the stack related stores.  */
-  if (insn_info->stack_read)
-    {
-      for (i = 0; VEC_iterate (group_info_t, rtx_group_vec, i, group); i++)
-	if (group->process_globally && group->frame_related)
-	  {
-	    if (kill)
-	      bitmap_ior_into (kill, group->group_kill);
-	    bitmap_and_compl_into (gen, group->group_kill); 
-	  }
-    }
-
   while (read_info)
     {
       for (i = 0; VEC_iterate (group_info_t, rtx_group_vec, i, group); i++)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-10 19:21               ` Eric Botcazou
@ 2007-10-12 15:49                 ` Ian Lance Taylor
  2007-10-12 17:10                   ` Eric Botcazou
  2007-10-14 16:29                   ` Eric Botcazou
  0 siblings, 2 replies; 22+ messages in thread
From: Ian Lance Taylor @ 2007-10-12 15:49 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Kenneth Zadeck, gcc-patches, Uros Bizjak

Eric Botcazou <ebotcazou@libertysurf.fr> writes:

> > Now, for const functions, we are not interested in these stores into the
> > frame as defined above, we are interested in stores based on the *stack
> > pointer*.
> 
> 	 * dse.c (struct insn_info): Remove 'stack_read' field,
> 	add 'stack_pointer_based' field.
> 	(record_store): For stores with non-constant base, record
> 	whether it is stack pointer based.
> 	(scan_insn): For the call to a const function, remove stack
> 	pointer based stores from the list of local active stores.
> 	(scan_reads_nospill): Delete code dealing with const functions.

+      rtx base_term = find_base_term (XEXP (mem, 0));
+      if (!base_term
+	  || (GET_CODE (base_term) == ADDRESS
+	      && GET_MODE (base_term) == Pmode
+	      && XEXP (base_term, 0) == stack_pointer_rtx))
+	insn_info->stack_pointer_based = true;

Are you sure that checking stack_pointer_rtx is enough here?  Note
that DSE is run after reload.  I would think that in some cases dse
might see hard_frame_pointer_rtx.

Otherwise, this looks good to me.  Please commit it if it passes
testing.

> This is enough to fix the problem, but I think we should add support for 
> CALL_INSN_FUNCTION_USAGE too.

Your patch is conservative, in that it assumes that a const call may
use any store to the stack frame.  The use I see for
CALL_INSN_FUNCTION_USAGE would be to make this less conservative, in
that it would then be possible to prove that some stores to the stack
frame are not used by the const call.  Is that what you mean?

Ian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-12 15:49                 ` Ian Lance Taylor
@ 2007-10-12 17:10                   ` Eric Botcazou
  2007-10-14 16:29                   ` Eric Botcazou
  1 sibling, 0 replies; 22+ messages in thread
From: Eric Botcazou @ 2007-10-12 17:10 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: Kenneth Zadeck, gcc-patches, Uros Bizjak

> Are you sure that checking stack_pointer_rtx is enough here?  Note
> that DSE is run after reload.  I would think that in some cases dse
> might see hard_frame_pointer_rtx.

For stores related to outgoing arguments?  If so, no, this is not enough.

> Otherwise, this looks good to me.  Please commit it if it passes testing.

It passed testing and I committed it yesterday.

> Your patch is conservative, in that it assumes that a const call may
> use any store to the stack frame.  The use I see for
> CALL_INSN_FUNCTION_USAGE would be to make this less conservative, in
> that it would then be possible to prove that some stores to the stack
> frame are not used by the const call.  Is that what you mean?

I was thinking from a more general point of view.  Theoritically nothing would 
prevent CALL_INSN_FUNCTION_USAGE from containing "unexpected" uses.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-12 15:49                 ` Ian Lance Taylor
  2007-10-12 17:10                   ` Eric Botcazou
@ 2007-10-14 16:29                   ` Eric Botcazou
  2007-10-15  5:47                     ` Ian Lance Taylor
  1 sibling, 1 reply; 22+ messages in thread
From: Eric Botcazou @ 2007-10-14 16:29 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: Kenneth Zadeck, gcc-patches, Uros Bizjak

[-- Attachment #1: Type: text/plain, Size: 590 bytes --]

> Are you sure that checking stack_pointer_rtx is enough here?  Note
> that DSE is run after reload.  I would think that in some cases dse
> might see hard_frame_pointer_rtx.

Follow-up patch attached, it restores the killing of all the frame related 
stores after reload.  Is it OK with you?


	* dse.c (struct insn_info): Add 'frame_read' field.
	(scan_insn): For the call to a const function, set frame_read if
	reload has been run.
	If the insn reads the frame, remove the frame related stores from
	the list of local active stores.
	(scan_reads_nospill): Likewise.


-- 
Eric Botcazou

[-- Attachment #2: pr33638-2.diff --]
[-- Type: text/x-diff, Size: 3733 bytes --]

Index: dse.c
===================================================================
--- dse.c	(revision 129282)
+++ dse.c	(working copy)
@@ -286,8 +286,26 @@ struct insn_info 
 
   /* This field is only used for the processing of const functions.
      These functions cannot read memory, but they can read the stack
-     because that is where they may get their parms.  It is set to
-     true if the insn may contain a stack pointer based store.  */
+     because that is where they may get their parms.  We need to be
+     this conservative because, like the store motion pass, we don't
+     consider CALL_INSN_FUNCTION_USAGE when processing call insns.
+     Moreover, we need to distinguish two cases:
+     1. Before reload (register elimination), the stores related to
+        outgoing arguments are stack pointer based and thus deemed
+        of non-constant base in this pass.  This requires special
+        handling but also means that the frame pointer based stores
+        need not be killed upon encountering a const function call.
+     2. After reload, the stores related to outgoing arguments may be
+        either stack pointer or hard frame pointer based.  This means
+        that we have no other choice than also killing all the frame
+        pointer based stores upon encountering a const function call.
+     This field is set after reload for const function calls.  Having
+     this set is less severe than a wild read, it just means that all
+     the frame related stores are killed rather than all the stores.  */
+  bool frame_read;
+
+  /* This field is only used for the processing of const functions.
+     It is set if the insn may contain a stack pointer based store.  */
   bool stack_pointer_based;
 
   /* This is true if any of the sets within the store contains a
@@ -1967,10 +1985,36 @@ scan_insn (bb_info_t bb_info, rtx insn)
 	  if (dump_file)
 	    fprintf (dump_file, "const call %d\n", INSN_UID (insn));
 
+	  /* See the head comment of the frame_read field.  */
+	  if (reload_completed)
+	    insn_info->frame_read = true;
+
+	  /* Loop over the active stores and remove those which are
+	     killed by the const function call.  */
 	  while (i_ptr)
 	    {
-	      /* Remove the stack pointer based stores.  */
+	      bool remove_store = false;
+
+	      /* The stack pointer based stores are always killed.  */
 	      if (i_ptr->stack_pointer_based)
+	        remove_store = true;
+
+	      /* If the frame is read, the frame related stores are killed.  */
+	      else if (insn_info->frame_read)
+		{
+		  store_info_t store_info = i_ptr->store_rec;
+
+		  /* Skip the clobbers.  */
+		  while (!store_info->is_set)
+		    store_info = store_info->next;
+
+		  if (store_info->group_id >= 0
+		      && VEC_index (group_info_t, rtx_group_vec,
+				    store_info->group_id)->frame_related)
+		    remove_store = true;
+		}
+
+	      if (remove_store)
 		{
 		  if (dump_file)
 		    dump_insn_info ("removing from active", i_ptr);
@@ -1982,6 +2026,7 @@ scan_insn (bb_info_t bb_info, rtx insn)
 		}
 	      else
 		last = i_ptr;
+
 	      i_ptr = i_ptr->next_local_store;
 	    }
 	}
@@ -2491,6 +2536,18 @@ scan_reads_nospill (insn_info_t insn_inf
   int i;
   group_info_t group;
 
+  /* If this insn reads the frame, kill all the frame related stores.  */
+  if (insn_info->frame_read)
+    {
+      for (i = 0; VEC_iterate (group_info_t, rtx_group_vec, i, group); i++)
+	if (group->process_globally && group->frame_related)
+	  {
+	    if (kill)
+	      bitmap_ior_into (kill, group->group_kill);
+	    bitmap_and_compl_into (gen, group->group_kill); 
+	  }
+    }
+
   while (read_info)
     {
       for (i = 0; VEC_iterate (group_info_t, rtx_group_vec, i, group); i++)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-14 16:29                   ` Eric Botcazou
@ 2007-10-15  5:47                     ` Ian Lance Taylor
  2007-10-15  7:43                       ` Eric Botcazou
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Lance Taylor @ 2007-10-15  5:47 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Kenneth Zadeck, gcc-patches, Uros Bizjak

Eric Botcazou <ebotcazou@libertysurf.fr> writes:

> 	* dse.c (struct insn_info): Add 'frame_read' field.
> 	(scan_insn): For the call to a const function, set frame_read if
> 	reload has been run.
> 	If the insn reads the frame, remove the frame related stores from
> 	the list of local active stores.
> 	(scan_reads_nospill): Likewise.

Note that a const call only reads the frame if there is a MEM in
CALL_INSN_FUNCTION_USAGE.  On many platforms a const call will not
normally read the frame.

Otherwise looks reasonable to me.

I have the feeling that your previous patch has more or less wiped out
DSE for const calls.  Does DSE still actually eliminate any stores in
the presence of a const call?

Ian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638
  2007-10-15  5:47                     ` Ian Lance Taylor
@ 2007-10-15  7:43                       ` Eric Botcazou
  0 siblings, 0 replies; 22+ messages in thread
From: Eric Botcazou @ 2007-10-15  7:43 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-patches, Kenneth Zadeck, Uros Bizjak

> Note that a const call only reads the frame if there is a MEM in
> CALL_INSN_FUNCTION_USAGE.  On many platforms a const call will not
> normally read the frame.

Sure, that's what is explained at the beginning of the big comment.  Note that 
the DSE from flow.c did the same (invalidate all stack pointer based stores).

> I have the feeling that your previous patch has more or less wiped out
> DSE for const calls.  Does DSE still actually eliminate any stores in
> the presence of a const call?

Definitely, the patch doesn't change anything for stores with constant base:

__attribute__ ((const)) int f (int i) { return i; }

int a;

void g (void)
{
  a = 1;
  a = f(0);
}

Compile at -O -fno-tree-dse.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2007-10-15  6:02 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-10-08 12:14 [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638 Uros Bizjak
2007-10-08 12:35 ` [Bug rtl-optimization/33638] [4.3 regression]: wrong code with -fforce-addr Kenneth Zadeck
2007-10-09  6:01 ` [PATCH, rtl-optimization]: Fix PR rtl-optimization/33638 Ian Lance Taylor
2007-10-09  6:38   ` Eric Botcazou
2007-10-09  7:19     ` Ian Lance Taylor
2007-10-09 11:51       ` Kenneth Zadeck
2007-10-09 12:31         ` Uros Bizjak
2007-10-09 15:42           ` Ian Lance Taylor
2007-10-09 15:42         ` Ian Lance Taylor
2007-10-09 16:05           ` Kenneth Zadeck
2007-10-09 17:09             ` Ian Lance Taylor
2007-10-09 19:37       ` Eric Botcazou
2007-10-09 19:55         ` Kenneth Zadeck
2007-10-09 21:55           ` Eric Botcazou
2007-10-10 12:19             ` Eric Botcazou
2007-10-10 19:21               ` Eric Botcazou
2007-10-12 15:49                 ` Ian Lance Taylor
2007-10-12 17:10                   ` Eric Botcazou
2007-10-14 16:29                   ` Eric Botcazou
2007-10-15  5:47                     ` Ian Lance Taylor
2007-10-15  7:43                       ` Eric Botcazou
2007-10-10 10:45         ` Eric Botcazou

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).