Re: [patch] for PR 18040

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* Re: [patch] for PR 18040
@ 2004-10-18 14:52 Richard Kenner
  2004-10-18 15:20 ` Daniel Berlin
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Kenner @ 2004-10-18 14:52 UTC (permalink / raw)
  To: dberlin; +Cc: gcc-patches

    A quick check of a random function handling component_refs shows they
    will also be missing optimizations

Let me be very clear here:

If the nops aren't needed, they won't be there.  So the question isn't
if they are needed or not.  Rather, it's whether the expression should be
split up among statements.

Are you really arguing that it's more likely to miss optimizations if they
are in *one* statement than if they are in multiple statements?  If so,
can you explain?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-18 14:52 [patch] for PR 18040 Richard Kenner
@ 2004-10-18 15:20 ` Daniel Berlin
  0 siblings, 0 replies; 61+ messages in thread
From: Daniel Berlin @ 2004-10-18 15:20 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc-patches


On Oct 18, 2004, at 10:52 AM, Richard Kenner wrote:

>     A quick check of a random function handling component_refs shows 
> they
>     will also be missing optimizations
>
> Let me be very clear here:
>
> If the nops aren't needed, they won't be there.  So the question isn't
> if they are needed or not.  Rather, it's whether the expression should 
> be
> split up among statements.
>
> Are you really arguing that it's more likely to miss optimizations if 
> they
> are in *one* statement than if they are in multiple statements?

Yes.

>  If so,
> can you explain?
>
It is much easier to optimize expressions that are simple, regardless 
of how they are split up, because it makes your optimizers simpler to 
write, and you are much less likely to miss corner cases, etc.

There is no magic problem with a statement being split up into multiple 
statements, because that's just not how algorithms to do optimization 
are written.  From the POV, of say, CCP, it doesn't care about the 
difference between

a = 10
b = 20
c = a + b
return c;

and

c = 10 + 20
return c;

or even about the difference between

return 10 + 20;

and

a = 5
b = a + 2
c = b + 2
d = c + 1

e = 10
f = e + 5
g = f + 5

h = d + g

return h;

Both are just as easy to optimize.


--Dan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-20 13:27 Richard Kenner
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Kenner @ 2004-10-20 13:27 UTC (permalink / raw)
  To: jason; +Cc: gcc-patches

    But we should only allow VIEW_CONVERT_EXPR, not NOP_EXPR or any other
    conversion codes.

I agree.  I think the others are historical at this point and not
actually used.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-19 22:19     ` Daniel Berlin
@ 2004-10-20  7:03       ` Jason Merrill
  0 siblings, 0 replies; 61+ messages in thread
From: Jason Merrill @ 2004-10-20  7:03 UTC (permalink / raw)
  To: Daniel Berlin
  Cc: Richard Henderson, gcc-patches, Richard Kenner, Zack Weinberg

On Tue, 19 Oct 2004 18:18:24 -0400, Daniel Berlin <dberlin@dberlin.org> wrote:

> On Oct 19, 2004, at 5:35 PM, Richard Henderson wrote:
>
>> On Sun, Oct 17, 2004 at 02:04:13PM -0700, Zack Weinberg wrote:
>>> What he's saying is that
>>>
>>>    <COMPONENT_REF type struct field>
>>>
>>> (and all other reference nodes) would implicitly convert FIELD to TYPE,
>>> if that's not FIELD's intrinsic type.  I think this makes a hell of a
>>> lot of sense, personally.
>>
>> I think this is an exceptionally bad idea.
>>
> Is this because you'd rather force the optimizers to deal with the fact
> that there is a cast there, than only care if they want to?
>
> Or something else?

I think rth expects a COMPONENT_REF to always have the same type as the
field.  This seems like a good invariant, though it's not always the case
for bitfields (canonicalize_component_ref).

> And if it's the first solution, than I would assume your preferred solution
> to this problem is to make
> ((cast)var).field gimple, and audit the optimizers to make sure they aren't
> going to silently generate bad code in this case?

I think that's going to be necessary; adjusting the type of a COMPONENT_REF
won't handle the case you mention, since the VIEW_CONVERT_EXPR is wrapping
a VAR_DECL, not a COMPONENT_REF.

But we should only allow VIEW_CONVERT_EXPR, not NOP_EXPR or any other
conversion codes.

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-20  0:25 Richard Kenner
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Kenner @ 2004-10-20  0:25 UTC (permalink / raw)
  To: rth; +Cc: gcc-patches

    Folks that abuse unchecked_conversion like this get what they deserve.

Perhaps, though I doubt many would see it that way.  But the more
significant point is that VIEW_CONVERT_EXPR occurs in generated trees
for lots of operations, such as object-oriented stuff, that we do
want to operate efficiently.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-19 22:51     ` Zack Weinberg
@ 2004-10-20  0:02       ` Richard Henderson
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Henderson @ 2004-10-20  0:02 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Richard Kenner, dberlin, gcc-patches

On Tue, Oct 19, 2004 at 03:51:18PM -0700, Zack Weinberg wrote:
> Okay.  Can you explain why?  Do you have an alternative suggestion?

Because the field has type T.  If we need something of type U,
then we need a conversion.  Period.

Failure to abide by type rules DOES lead to failures.  We've
already seen this.  This sort of sloppiness was once rampant
in the front ends, and is why check_pointer_types_r exists.

Alternate suggestions?  None particularly constructive.
Personally I think we shouldn't bother worrying about getting
decent code out of some of these test cases.  Folks that abuse
unchecked_conversion like this get what they deserve.

r~

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-19 23:26 Richard Kenner
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Kenner @ 2004-10-19 23:26 UTC (permalink / raw)
  To: jason; +Cc: gcc-patches

    a.b.c would still be valid GIMPLE, but the initial gimplification may break
    it down in future.  A later pass would be free to put it back together.

I have no problem with that.  However, I though the purpose of the
proposal was to make it *not* valid GIMPLE.  Of course, until we have a
tree combiner, it's not going to be put back together.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-19 23:22 Richard Kenner
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Kenner @ 2004-10-19 23:22 UTC (permalink / raw)
  To: zack; +Cc: gcc-patches

    But this problem goes away if you start manipulating all the
    aggregates by reference.  

See my last mesasge on that topic.

    Sure, but this is decidedly _not_ the way the tree optimizers want to
    think about it.  Where RTL requires something to be done a certain
    way, we should not make that into a design constraint on trees; at
    most, there might be an impedance mismatch to be resolved in the
    expander.

Sure, but the point isn't RTL-specific.  It relates to code
generation.  When generating code, you only have to do *one* operation
no matter how many dereferences there are: all the extra defererences
do is change the parameters (offsets in this case) of that operation.
That's why I like to view this nest as conceptually one operation.

    Hmm.  That might turn out to be equivalent to my handwavey suggestion
    of inventing some way to make unaddressable things addressable for
    tree purposes.  Do you have a pointer to that proposal?

I don't think so, but the proposal is at
	http://gcc.gnu.org/ml/gcc/2004-06/msg01376.html

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-19 23:03 Richard Kenner
@ 2004-10-19 23:13 ` Jason Merrill
  0 siblings, 0 replies; 61+ messages in thread
From: Jason Merrill @ 2004-10-19 23:13 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc-patches

On Tue, 19 Oct 04 19:05:29 EDT, kenner@vlsi1.ultra.nyu.edu (Richard Kenner) wrote:

>     I don't think anybody is suggesting that we break up reference chains
>     in a way that would introduce aggregate temporaries.  Only things that
>     can be expressed like
>
>            t = &a.b;
>     a.b.c  =>    t->c
>
>     "t" here is a scalar temporary, so the optimizers should be able to do
>     useful things with it.
>
> But that has it's own set of problems:
>
> (1) We're taking an address of something that didn't used to have its
> address taken.

The address-of should be removed by the time we leave SSA, unless it proves
useful.

> (2) Suppose a.b is not addressable?  Then you *do* have to make a copy
> and hence an aggregate temporary.

No, then you just leave it as a.b.c.

> If a.b.c is not valid GIMPLE, then even if we had a tree combiner (and
> ignoring the nonaddressable issue), we couldn't undo this pessimization.

a.b.c would still be valid GIMPLE, but the initial gimplification may break
it down in future.  A later pass would be free to put it back together.

> For example, suppose A were small enough to go into a register.  Taking
> the address of a.b would preclude that.

Only if we still take its address by the time we get to RTL.

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-19 22:48 Richard Kenner
  2004-10-19 23:01 ` Jason Merrill
@ 2004-10-19 23:07 ` Zack Weinberg
  1 sibling, 0 replies; 61+ messages in thread
From: Zack Weinberg @ 2004-10-19 23:07 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc-patches

kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:

>     The tree optimizers do better on simple operations, which is why we
>     have GIMPLE in the first place.  
>
> Sure, but this begs the question of how we define "simple" ..
>
>     In fact, some people have argued that we should be breaking up
>     _all_ chains of dereference operations so that the optimizers
>     can see better.
>
> The problem, however, is that the SSA-based optimizations operate mostly
> on *scalars*, so making more aggregate temporaries, which is what
> happens when you break up those chains, creates things that are hard to
> remove under the present optimization environment.
>
> Moreover, some of those temporaries will be of variable-size, which
> we don't currently support.

But this problem goes away if you start manipulating all the
aggregates by reference.  Then you only have scalar (pointer)
temporaries.  If that doesn't do a good job, then that's a different
(but easier to resolve) optimization deficiency.

In the cases where we can't manipulate components of aggregates by
reference because they are not addressable, well, maybe we ought to
cook up some kind of thing that makes them addressable for the tree
optimizers' purposes.

> ... what I was trying to get at is that at the level of generating
> RTL, we treat a chain of reference operators as if it were one
> operation, by merging all the offset calculations into one.

Sure, but this is decidedly _not_ the way the tree optimizers want to
think about it.  Where RTL requires something to be done a certain
way, we should not make that into a design constraint on trees; at
most, there might be an impedance mismatch to be resolved in the
expander.

> Indeed, I once made a proposal of having a "reference annotation"
> that would formalize that approach and, for each dereference
> operation, produce a variable that represented the cumulative offset
> up to that point.  If you do that, then the entire chain really *is*
> one operation for all purposes except loop optimizers, which will
> want to look at array references.

Hmm.  That might turn out to be equivalent to my handwavey suggestion
of inventing some way to make unaddressable things addressable for
tree purposes.  Do you have a pointer to that proposal?

zw

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-19 23:03 Richard Kenner
  2004-10-19 23:13 ` Jason Merrill
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Kenner @ 2004-10-19 23:03 UTC (permalink / raw)
  To: jason; +Cc: gcc-patches

    I don't think anybody is suggesting that we break up reference chains
    in a way that would introduce aggregate temporaries.  Only things that
    can be expressed like

           t = &a.b;
    a.b.c  =>    t->c

    "t" here is a scalar temporary, so the optimizers should be able to do
    useful things with it.

But that has it's own set of problems:

(1) We're taking an address of something that didn't used to have its
address taken.

(2) Suppose a.b is not addressable?  Then you *do* have to make a copy
and hence an aggregate temporary.

If a.b.c is not valid GIMPLE, then even if we had a tree combiner (and
ignoring the nonaddressable issue), we couldn't undo this pessimization.
For example, suppose A were small enough to go into a register.  Taking
the address of a.b would preclude that.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-19 22:48 Richard Kenner
@ 2004-10-19 23:01 ` Jason Merrill
  2004-10-19 23:07 ` Zack Weinberg
  1 sibling, 0 replies; 61+ messages in thread
From: Jason Merrill @ 2004-10-19 23:01 UTC (permalink / raw)
  To: Richard Kenner; +Cc: zack, gcc-patches

On Tue, 19 Oct 04 18:51:10 EDT, kenner@vlsi1.ultra.nyu.edu (Richard Kenner) wrote:

>     In fact, some people have argued that
>     we should be breaking up _all_ chains of dereference operations so
>     that the optimizers can see better.
>
> The problem, however, is that the SSA-based optimizations operate mostly
> on *scalars*, so making more aggregate temporaries, which is what
> happens when you break up those chains, creates things that are hard to
> remove under the present optimization environment.
>
> Moreover, some of those temporaries will be of variable-size, which
> we don't currently support.

I don't think anybody is suggesting that we break up reference chains in a
way that would introduce aggregate temporaries.  Only things that can be
expressed like

           t = &a.b;
a.b.c  =>    t->c

"t" here is a scalar temporary, so the optimizers should be able to do
useful things with it.

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-19 21:36   ` Richard Henderson
  2004-10-19 22:19     ` Daniel Berlin
@ 2004-10-19 22:51     ` Zack Weinberg
  2004-10-20  0:02       ` Richard Henderson
  1 sibling, 1 reply; 61+ messages in thread
From: Zack Weinberg @ 2004-10-19 22:51 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Richard Kenner, dberlin, gcc-patches

Richard Henderson <rth@redhat.com> writes:

> On Sun, Oct 17, 2004 at 02:04:13PM -0700, Zack Weinberg wrote:
>> What he's saying is that
>> 
>>    <COMPONENT_REF type struct field>
>> 
>> (and all other reference nodes) would implicitly convert FIELD to TYPE,
>> if that's not FIELD's intrinsic type.  I think this makes a hell of a
>> lot of sense, personally.
>
> I think this is an exceptionally bad idea.

Okay.  Can you explain why?  Do you have an alternative suggestion?

Another possibility which has occurred to me, since an awful lot of
the problem has to do with not being able to take apart chains of
*_REFs when the types aren't addressable, is to invent a BIT_ADDR_EXPR
whose purpose is to make them addressable dammit.  With necessary
attendant baggage and stuff.

zw

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-18 14:24 Richard Kenner
@ 2004-10-19 22:50 ` Zack Weinberg
  0 siblings, 0 replies; 61+ messages in thread
From: Zack Weinberg @ 2004-10-19 22:50 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc-patches

kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:

>     If you provide an example which produces a type conversion in the
>     middle of a chain of dereference expressions, I will endeavor to
>     clarify what I mean.
>
> I'll try, but, as I said, I don't write Ada very well ...
>
> 	with unchecked_conversion;
> 	function foo return integer is
> 	    type arr is array (1 .. 100) of integer;
> 	    type r is record
> 		f1: integer;
> 		f2: array (1..99) of integer;
> 	    end record;
> 	    function uc is new unchecked_conversion (arr, r);
> 	    type q is record f1: arr; end record;
>
> 	    var: q;
> 	begin
> 	    return (uc (q.f1).f1;
> 	end foo;

This isn't a great example because everything in here is addressable,
so one could equally well solve the problem by decomposing the chain
of references.

Florian's example is better in this regard.  I showed elsewhere how to
solve it by promoting the transformation to an addressable level.
With the suggestion that COMPONENT_REFS would implicitly convert,
you'd have something which I can no longer write in any source
language, so here it is in pseudo-dump_tree() notation (just the body
of the function):

  <modify_expr type <void>
     <var_decl type <U> var2>
     <component_ref type <U> 
        <field_decl type <U> B>
        <component_ref type <Baz> 
           <field_decl type <T> B>
           <var_decl type <Foo> Var1>>>>

The crucial bit being that the inner COMPONENT_REF has a different
type than its FIELD_DECL, indicating an implicit conversion to that
type.

zw

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-19 22:48 Richard Kenner
  2004-10-19 23:01 ` Jason Merrill
  2004-10-19 23:07 ` Zack Weinberg
  0 siblings, 2 replies; 61+ messages in thread
From: Richard Kenner @ 2004-10-19 22:48 UTC (permalink / raw)
  To: zack; +Cc: gcc-patches

    The tree optimizers do better on simple operations, which is why we
    have GIMPLE in the first place.  

Sure, but this begs the question of how we define "simple" ..

    In fact, some people have argued that
    we should be breaking up _all_ chains of dereference operations so
    that the optimizers can see better.

The problem, however, is that the SSA-based optimizations operate mostly
on *scalars*, so making more aggregate temporaries, which is what
happens when you break up those chains, creates things that are hard to
remove under the present optimization environment.

Moreover, some of those temporaries will be of variable-size, which
we don't currently support.

    > No, of course not.  But I thought the idea is to encode each statement as
    > one *expansion* operation.

    I do not understand this statement.

Well, it's indeed not very well-defined, but what I was trying to get at is
that at the level of generating RTL, we treat a chain of reference
operators as if it were one operation, by merging all the offset calculations
into one.  Indeed, I once made a proposal of having a "reference annotation"
that would formalize that approach and, for each dereference operation,
produce a variable that represented the cumulative offset up to that point.
If you do that, then the entire chain really *is* one operation for all
purposes except loop optimizers, which will want to look at array
references.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-18 14:22 Richard Kenner
@ 2004-10-19 22:47 ` Zack Weinberg
  0 siblings, 0 replies; 61+ messages in thread
From: Zack Weinberg @ 2004-10-19 22:47 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc-patches

kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:

> [First of all, let me apologize about the lack of threading.  I *do* have
> the sources I need to implement that, but I want to catch up on some of
> the technical stuff (ACATS bugs) first.]
>
>     Ok.  Does that also address the objection you raised to the
>     pointer-punning operation at the outermost level?
>
> The *correctness* objection, yes, but not the code quality
> objection.  We'd have a situation where it's impossible for the tree
> optimizers to improve the code because there's no way of
> representing the improvement.  That's bad.

The tree optimizers do better on simple operations, which is why we
have GIMPLE in the first place.  In fact, some people have argued that
we should be breaking up _all_ chains of dereference operations so
that the optimizers can see better.

>     The goal is not to be able to represent as one statement every
>     operation that can be encoded in one machine instruction, and it is
>     certainly not to be able to represent as one statement every primitive
>     operation of every supported language.
>
> No, of course not.  But I thought the idea is to encode each statement as
> one *expansion* operation.

I do not understand this statement.

zw

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-18 13:13   ` Florian Weimer
  2004-10-18 17:24     ` Jason Merrill
@ 2004-10-19 22:40     ` Zack Weinberg
  1 sibling, 0 replies; 61+ messages in thread
From: Zack Weinberg @ 2004-10-19 22:40 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Richard Kenner, gcc-patches

Florian Weimer <fw@deneb.enyo.de> writes:

> Unfortunately, var doesn't have to be addressable.  Consider the
> following example:
>
> with Ada.Unchecked_Conversion;
>
> procedure Bar is
>
>    type T is mod 2**4;
>    type Foo is record
>       A, B : T;
>    end record;
>    pragma Pack (Foo);
>    for Foo'Size use 8;
>
>    type U is mod 2**2;
>    type Baz is record
>       A, B : U;
>    end record;
>    pragma Pack (Baz);
>    for Baz'Size use 4;
>
>    Var1 : Foo;
>    pragma Import (Ada, Var1);
>
>    Var2 : U;
>    pragma Import (Ada, Var2);
>
>    function Convert is new Ada.Unchecked_Conversion (T, Baz);
>
> begin
>    Var2 := Convert (Var1.B).B;
> end Bar;

This is the case that I suggested could be handled by pushing the type
conversion to the top level of the nest, as-if the programmer had
written

with Ada.Unchecked_Conversion;

procedure Bar is

   type T is mod 2**4;
   type U is mod 2**2;

   type Foo is record
      A, B : T;
   end record;
   pragma Pack (Foo);
   for Foo'Size use 8;  -- N.B. I am assuming this number is in bits.

   type Quux is record
     AA, AB, BA, BB : U;
   end record;
   pragma Pack (Quux);
   for Quux'Size use 8;

   Var1 : Foo;
   pragma Import (Ada, Var1);
   
   Var2 : U;
   pragma Import (Ada, Var2);

   function Convert is new Ada.Unchecked_Conversion (Foo, Quux);

begin
   Var2 := Convert(Var1).BB;
end;

or, in C,

struct Foo __attribute__ ((packed)) {
  unsigned char A : 4;
  unsigned char B : 4;
};

struct Quux __attribute__ ((packed)) {
  unsigned char AA : 2;
  unsigned char AB : 2;
  unsigned char BA : 2;
  unsigned char BB : 2;
};

extern struct Foo var1;
extern char var2;  /* original type not representable in C */

void bar(void)
{
  struct Quux *q = (struct Quux *) &var1;

  var2.value = q->BB;
}

This transformation, however, has the problem that Kenner pointed out,
of possibly requiring a quadratic number of type-synthesis operations
at compile time.

> For the example above, all versions of GNAT I've tested (GNAT 3.15p,
> GCC 3.4, mainline) generate somewhat questionable code which involves
> bit operations on registers that are only partly defined (the
> undefined bits are subsequently masked away, though).

I'll add that I get much better code out of GNAT after the
transformation I demonstrated.  With GCC 3.3.5, before:

        movzbl  var1, %edx
        andb    $-16, %al
        shrb    $4, %dl
        orb     %dl, %al
        shrb    $2, %al
        andb    $3, %al
        movb    %al, var2
        ret

and after:

        movzbl  var1, %eax
        shrb    $6, %al
        movb    %al, var2
        ret

The C version also produces this improved code.

>> The only catch is that alias analysis must understand that rp aliases
>> var.  This may require some adjustment to GNAT's alias set logic.
>
> I'm not sure if this is actually required by the language, but
> real-world code probably needs this.

Agree.

zw

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-19 21:36   ` Richard Henderson
@ 2004-10-19 22:19     ` Daniel Berlin
  2004-10-20  7:03       ` Jason Merrill
  2004-10-19 22:51     ` Zack Weinberg
  1 sibling, 1 reply; 61+ messages in thread
From: Daniel Berlin @ 2004-10-19 22:19 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc-patches, Richard Kenner, Zack Weinberg


On Oct 19, 2004, at 5:35 PM, Richard Henderson wrote:

> On Sun, Oct 17, 2004 at 02:04:13PM -0700, Zack Weinberg wrote:
>> What he's saying is that
>>
>>    <COMPONENT_REF type struct field>
>>
>> (and all other reference nodes) would implicitly convert FIELD to 
>> TYPE,
>> if that's not FIELD's intrinsic type.  I think this makes a hell of a
>> lot of sense, personally.
>
> I think this is an exceptionally bad idea.
>
Is this because you'd rather force the optimizers to deal with the fact 
that there is a cast there, than only care if they want to?

Or something else?

And if it's the first solution, than I would assume your preferred 
solution to this problem is to make
((cast)var).field gimple, and audit the optimizers to make sure they 
aren't going to silently generate bad code in this case?
Or do you have something else in mind?


>
> r~

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-19 22:04     ` Andrew Pinski
@ 2004-10-19 22:06       ` Zdenek Dvorak
  0 siblings, 0 replies; 61+ messages in thread
From: Zdenek Dvorak @ 2004-10-19 22:06 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc-patches, Richard Henderson

Hello,

> >>On Sun, Oct 17, 2004 at 09:15:35PM +0200, Zdenek Dvorak wrote:
> >>>	PR tree-optimization/18040
> >>>	* tree-gimple.c (is_gimple_component): New function.
> >>>	(is_gimple_addressable, get_base_address): Use is_gimple_component.
> >>
> >>I think we should simply remove the casts from handled_component_p.
> >
> >then it would also be necessary to remove other superfluous entries 
> >from
> >it that are not recognized as gimple components (BIT_FIELD_REF and
> >NON_LVALUE_EXPR).  I do not care for later, but removing former seems
> >likely to break other code.
> 
> BIT_FIELD_REF should be handled the same way as COMPONENT_REF I think.

not according to gimple grammar (and also it does not make sense to have
BIT_FIELD_REF's nested).

Zdenek

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-19 22:03   ` Zdenek Dvorak
@ 2004-10-19 22:04     ` Andrew Pinski
  2004-10-19 22:06       ` Zdenek Dvorak
  0 siblings, 1 reply; 61+ messages in thread
From: Andrew Pinski @ 2004-10-19 22:04 UTC (permalink / raw)
  To: Zdenek Dvorak; +Cc: gcc-patches, Richard Henderson


On Oct 19, 2004, at 5:55 PM, Zdenek Dvorak wrote:

> Hello,
>
>> On Sun, Oct 17, 2004 at 09:15:35PM +0200, Zdenek Dvorak wrote:
>>> 	PR tree-optimization/18040
>>> 	* tree-gimple.c (is_gimple_component): New function.
>>> 	(is_gimple_addressable, get_base_address): Use is_gimple_component.
>>
>> I think we should simply remove the casts from handled_component_p.
>
> then it would also be necessary to remove other superfluous entries 
> from
> it that are not recognized as gimple components (BIT_FIELD_REF and
> NON_LVALUE_EXPR).  I do not care for later, but removing former seems
> likely to break other code.

BIT_FIELD_REF should be handled the same way as COMPONENT_REF I think.

Thanks,
Andrew Pinski

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-19 21:36 ` Richard Henderson
@ 2004-10-19 22:03   ` Zdenek Dvorak
  2004-10-19 22:04     ` Andrew Pinski
  0 siblings, 1 reply; 61+ messages in thread
From: Zdenek Dvorak @ 2004-10-19 22:03 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches

Hello,

> On Sun, Oct 17, 2004 at 09:15:35PM +0200, Zdenek Dvorak wrote:
> > 	PR tree-optimization/18040
> > 	* tree-gimple.c (is_gimple_component): New function.
> > 	(is_gimple_addressable, get_base_address): Use is_gimple_component.
> 
> I think we should simply remove the casts from handled_component_p.

then it would also be necessary to remove other superfluous entries from
it that are not recognized as gimple components (BIT_FIELD_REF and
NON_LVALUE_EXPR).  I do not care for later, but removing former seems
likely to break other code.

Zdenek

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-17 19:28 Zdenek Dvorak
@ 2004-10-19 21:36 ` Richard Henderson
  2004-10-19 22:03   ` Zdenek Dvorak
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Henderson @ 2004-10-19 21:36 UTC (permalink / raw)
  To: Zdenek Dvorak; +Cc: gcc-patches

On Sun, Oct 17, 2004 at 09:15:35PM +0200, Zdenek Dvorak wrote:
> 	PR tree-optimization/18040
> 	* tree-gimple.c (is_gimple_component): New function.
> 	(is_gimple_addressable, get_base_address): Use is_gimple_component.

I think we should simply remove the casts from handled_component_p.


r~

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-17 21:10 ` Zack Weinberg
@ 2004-10-19 21:36   ` Richard Henderson
  2004-10-19 22:19     ` Daniel Berlin
  2004-10-19 22:51     ` Zack Weinberg
  0 siblings, 2 replies; 61+ messages in thread
From: Richard Henderson @ 2004-10-19 21:36 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Richard Kenner, dberlin, gcc-patches

On Sun, Oct 17, 2004 at 02:04:13PM -0700, Zack Weinberg wrote:
> What he's saying is that
> 
>    <COMPONENT_REF type struct field>
> 
> (and all other reference nodes) would implicitly convert FIELD to TYPE,
> if that's not FIELD's intrinsic type.  I think this makes a hell of a
> lot of sense, personally.

I think this is an exceptionally bad idea.


r~

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-18 17:37       ` Florian Weimer
@ 2004-10-18 18:02         ` Jason Merrill
  0 siblings, 0 replies; 61+ messages in thread
From: Jason Merrill @ 2004-10-18 18:02 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Zack Weinberg, Richard Kenner, gcc-patches

On Mon, 18 Oct 2004 19:34:09 +0200, Florian Weimer <fw@deneb.enyo.de> wrote:

> * Jason Merrill:
>
>> On Mon, 18 Oct 2004 14:33:03 +0200, Florian Weimer <fw@deneb.enyo.de> wrote:
>>
>>> Unfortunately, var doesn't have to be addressable.
>>
>> If var isn't addressable, there's no way to break up references into
>> multiple statements, because there's no way to get a handle on an
>> intermediate reference.  So it seems that we need to support
>> VIEW_CONVERT_EXPR (but not NOP_EXPR) inside references.
>>
> The language permits to work on an addressable copy instead

But that can be arbitrarily expensive for large types, and impossible for
variable-sized types.

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-18 17:48 Richard Kenner
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Kenner @ 2004-10-18 17:48 UTC (permalink / raw)
  To: fw; +Cc: gcc-patches

    The language permits to work on an addressable copy instead, 

Yes, but (1) that copy can be very large and (2) there is also a use
of a VIEW_CONVERT_EXPR on inside the nest of references in the LHS
where the copy isn't permitted.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-18 17:24     ` Jason Merrill
@ 2004-10-18 17:37       ` Florian Weimer
  2004-10-18 18:02         ` Jason Merrill
  0 siblings, 1 reply; 61+ messages in thread
From: Florian Weimer @ 2004-10-18 17:37 UTC (permalink / raw)
  To: Jason Merrill; +Cc: Zack Weinberg, Richard Kenner, gcc-patches

* Jason Merrill:

> On Mon, 18 Oct 2004 14:33:03 +0200, Florian Weimer <fw@deneb.enyo.de> wrote:
>
>> Unfortunately, var doesn't have to be addressable.
>
> If var isn't addressable, there's no way to break up references into
> multiple statements, because there's no way to get a handle on an
> intermediate reference.  So it seems that we need to support
> VIEW_CONVERT_EXPR (but not NOP_EXPR) inside references.
>
> If var is addressable, uses of VIEW_CONVERT_EXPR can be broken up into
> pointer casts.

The language permits to work on an addressable copy instead, but
previous GNAT versions seem to behave differently and return a
reference to the argument in Unchecked_Conversion, be it addressable
or not.  Although it's not documented explicitly in the GNAT RM, this
behavior must be preserved because some real-world code is likely to
rely on it.  At least that's my gut feeling, someone at AdaCore should
know better.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-18 13:13   ` Florian Weimer
@ 2004-10-18 17:24     ` Jason Merrill
  2004-10-18 17:37       ` Florian Weimer
  2004-10-19 22:40     ` Zack Weinberg
  1 sibling, 1 reply; 61+ messages in thread
From: Jason Merrill @ 2004-10-18 17:24 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Zack Weinberg, Richard Kenner, gcc-patches

On Mon, 18 Oct 2004 14:33:03 +0200, Florian Weimer <fw@deneb.enyo.de> wrote:

> Unfortunately, var doesn't have to be addressable.

If var isn't addressable, there's no way to break up references into
multiple statements, because there's no way to get a handle on an
intermediate reference.  So it seems that we need to support
VIEW_CONVERT_EXPR (but not NOP_EXPR) inside references.

If var is addressable, uses of VIEW_CONVERT_EXPR can be broken up into
pointer casts.

My first draft of the gimplifier broke up references into multiple
statements, but I disabled that after I found that our RTL-level alias
analysis did better with a.b.c than with t1 = &a.b; t1->c.  Hopefully our
alias analysis has improved since then; as Daniel says, the optimizers work
better with smaller chunks.

Jason

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-18 15:42 Richard Kenner
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Kenner @ 2004-10-18 15:42 UTC (permalink / raw)
  To: dberlin; +Cc: gcc-patches

    It is much easier to optimize expressions that are simple, regardless 
    of how they are split up, because it makes your optimizers simpler to 
    write, and you are much less likely to miss corner cases, etc.

Sorry, I wasn't asking the general question, but about this specific case.
I'd like to see how it's easier to optimize cases with a conversion between
two field references when they are in two statements than in one.  

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-18 14:38 Richard Kenner
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Kenner @ 2004-10-18 14:38 UTC (permalink / raw)
  To: dberlin; +Cc: gcc-patches

    Inserting STRIP_NOPS into everywhere that breaks apart component_refs
    and looking for variables and things is not a pleasant idea, unless it
    was hidden in component_ref iterators of some sort that you could tell
    to auto-skip them (the way we have ssa operand iterators).

Actually, it wouldn't do anything, because it doesn't strip
VIEW_CONVERT_EXPR and only strips those NOP_EXPRs that would have already
been removed!

    Thus, if you want ((cast)var).field to be legal, you are going to have
    to go through every single function in the optimizers that walks
    component_refs and fix them.  A lot of them are simply returning NULL
    or (whatever they have don't know) in the case where they hit the
    NOP_EXPR, instead of aborting.  Some are probably silently doing bad
    things :)

Well, how many of those are there?  Most just treat an "LHS" as a 
single object.  Very few would look inside it.

    This will miss the indirect_ref (and an optimization opportunity) if you
    have COMPONENT_REF <NOP_EXPR <INDIRECT_REF <SSA_NAME>>>AR_DECL>>

But I'm not sure that *is* an optimization opportunity: if that
NOP_EXPR is still there, it's there for a reason!

The question I'd ask is whether you'd have the optimization opportunity if
the NOP_EXPR were in a separate statement.  The answer is clearly "no".
So how would doing that help?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-18 14:24 Richard Kenner
  2004-10-19 22:50 ` Zack Weinberg
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Kenner @ 2004-10-18 14:24 UTC (permalink / raw)
  To: zack; +Cc: gcc-patches

    If you provide an example which produces a type conversion in the
    middle of a chain of dereference expressions, I will endeavor to
    clarify what I mean.

I'll try, but, as I said, I don't write Ada very well ...

	with unchecked_conversion;
	function foo return integer is
	    type arr is array (1 .. 100) of integer;
	    type r is record
		f1: integer;
		f2: array (1..99) of integer;
	    end record;
	    function uc is new unchecked_conversion (arr, r);
	    type q is record f1: arr; end record;

	    var: q;
	begin
	    return (uc (q.f1).f1;
	end foo;

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-18 14:22 Richard Kenner
  2004-10-19 22:47 ` Zack Weinberg
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Kenner @ 2004-10-18 14:22 UTC (permalink / raw)
  To: zack; +Cc: gcc-patches

[First of all, let me apologize about the lack of threading.  I *do* have
the sources I need to implement that, but I want to catch up on some of
the technical stuff (ACATS bugs) first.]

    Ok.  Does that also address the objection you raised to the
    pointer-punning operation at the outermost level?

The *correctness* objection, yes, but not the code quality objection.  We'd
have a situation where it's impossible for the tree optimizers to improve
the code because there's no way of representing the improvement.  That's bad.

    You'd only need to construct the modified outermost type when you had
    a type conversion somewhere along the chain of references.  I suppose
    one could construct a program that accesses each field *with* a type
    conversion; I can't think of any good reason such a program would pop
    up in the wild, but still, if we can find a solution that doesn't
    entail quadratic behavior for this case that would be good.

Many Ada programs that access each field of a complex record in a nested
way would have such conversions: remember, I said they weren't just there
in explicit user cases, but the front-end inserts such conversions too.
Yes, it may add too many of those and looking into that is certainly on
my list, but this adding yet more complexity to the issue.

    The goal is not to be able to represent as one statement every
    operation that can be encoded in one machine instruction, and it is
    certainly not to be able to represent as one statement every primitive
    operation of every supported language.

No, of course not.  But I thought the idea is to encode each statement as
one *expansion* operation.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-17 23:41 ` Zack Weinberg
@ 2004-10-18 13:13   ` Florian Weimer
  2004-10-18 17:24     ` Jason Merrill
  2004-10-19 22:40     ` Zack Weinberg
  0 siblings, 2 replies; 61+ messages in thread
From: Florian Weimer @ 2004-10-18 13:13 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Richard Kenner, gcc-patches

* Zack Weinberg:

> If I have understood correctly, then my suggestion would be that you
> should have Gigi translate this code to GENERIC equivalent to the
> following C example:
>
> int foo(void)
> {
>   typedef int arr[100000];
>   typedef struct { int f1; int f2[999999]; } r;
>
>   arr var;
>
>   return ((r *) &var)->f1 + var[2000];
> }

Unfortunately, var doesn't have to be addressable.  Consider the
following example:

with Ada.Unchecked_Conversion;

procedure Bar is

   type T is mod 2**4;
   type Foo is record
      A, B : T;
   end record;
   pragma Pack (Foo);
   for Foo'Size use 8;

   type U is mod 2**2;
   type Baz is record
      A, B : U;
   end record;
   pragma Pack (Baz);
   for Baz'Size use 4;

   Var1 : Foo;
   pragma Import (Ada, Var1);

   Var2 : U;
   pragma Import (Ada, Var2);

   function Convert is new Ada.Unchecked_Conversion (T, Baz);

begin
   Var2 := Convert (Var1.B).B;
end Bar;

However, this looks much like the distinction between by-copy and
by-reference types that is already needed for function calls.

For the example above, all versions of GNAT I've tested (GNAT 3.15p,
GCC 3.4, mainline) generate somewhat questionable code which involves
bit operations on registers that are only partly defined (the
undefined bits are subsequently masked away, though).  For example,
GNAT 3.15p generates:

        movb var1,%al
        shrb $4,%al
        shrl $2,%eax
        andl $3,%eax
        movb %al,var2
        ret

Still not optimal, but mainline looks worse:

        movzbl  var1, %eax
        andl    $-16, %edx
        shrb    $4, %al
        orl     %eax, %edx
        movl    %edx, %eax
        shrb    $2, %al
        andl    $3, %eax
        movb    %al, var2
        ret

> The only catch is that alias analysis must understand that rp aliases
> var.  This may require some adjustment to GNAT's alias set logic.

I'm not sure if this is actually required by the language, but
real-world code probably needs this.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-18  4:02 ` Zack Weinberg
@ 2004-10-18  4:26   ` Daniel Berlin
  0 siblings, 0 replies; 61+ messages in thread
From: Daniel Berlin @ 2004-10-18  4:26 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Richard Kenner, gcc-patches

On Sun, 2004-10-17 at 20:21 -0700, Zack Weinberg wrote:
> kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:
> 
> >     Your opinion on my alternative suggestion - hoist the type conversion
> >     to the outermost type, thus reducing the nested case to the unnested
> >     case - would be appreciated (and please read the whole message before
> >     answering the question, because I address another objection below).
> >
> > Aside from the issue of possible quadratic behavior, I don't completely
> > understand the proposal here, so I can't fully comment on it.  I should
> > point out, though, that some of these expressions in practice are
> > quite complex: there are lots of implicit conversions and dereferences
> > generated by the front end.
> 
> If you provide an example which produces a type conversion in the
> middle of a chain of dereference expressions, I will endeavor to
> clarify what I mean.
> 
> 
> >     > This stuff is *very* tricky and, as I said, we've been
> >     > throught it before.
> >
> >     The PR indicates that the solution which has already been
> >     implemented does not work.  Thus, the entire topic is open for
> >     reexamination.
> >
> > The PR indicates that a particular optimizer has problems with the
> > overall approach.  One way of dealing with that is to change the
> > approach.  But a more local way is to fix that particular optimizer.
> 
> Fair point, however, comments upthread indicate that lots of people
> are not happy with the overall approach.

Inserting STRIP_NOPS into everywhere that breaks apart component_refs
and looking for variables and things is not a pleasant idea, unless it
was hidden in component_ref iterators of some sort that you could tell
to auto-skip them (the way we have ssa operand iterators).

The optimizer's function in question (for_each_index in tree-ssa-loop-
ivopts.c) was written to assume what the gimple grammar specified, which
is that ((cast)var).field is not legal.
All of the optimizers are written in this manner (IE assuming that the
gimple grammar written is the gimple grammar).

Thus, if you want ((cast)var).field to be legal, you are going to have
to go through every single function in the optimizers that walks
component_refs and fix them.  A lot of them are simply returning NULL or
(whatever they have don't know) in the case where they hit the NOP_EXPR,
instead of aborting.  Some are probably silently doing bad things :)

A quick check of a random function handling component_refs shows they
will also be missing optimizations if you have 

[s/VAR_DECL/SSA_NAME/ is applicable for all the ssa optimizers i'm about
to reference.]

COMPONENT_REF <NOP_EXPR <VAR_DECL> > 
or 
COMPONENT_REF <NOP_EXPR <INDIRECT_REF <VAR_DECL>>

Let me give an example from tree-ssa-dom.c:

if (INDIRECT_REF_P (t))
          {
            tree op = TREE_OPERAND (t, 0);

            /* If the pointer is a SSA variable, then enter new
               equivalences into the hash table.  */
            while (TREE_CODE (op) == SSA_NAME)
              {

This will miss the indirect_ref (and an optimization opportunity) if you
have COMPONENT_REF <NOP_EXPR <INDIRECT_REF <SSA_NAME>>>AR_DECL>>

It will also fail if you have COMPONENT_REF <INDIRECT_REF <NOP_EXPR
<SSA_NAME>>>

Note that to fix both of these with your approach of making
((cast)var).field legal, you'd have to add 2 STRIP_NOPS to just this
small 5 lines of code.

In fact, you'd have to add STRIP_NOPS *everywhere*, and be vigilant
about it, or you'd miss optimizations or generate wrong code.
And that's a mess.
And a mess that other compilers don't have, so it must be possible to do
without it.
In fact, to be honest, i've never, looking at other compiler source,
seen the equivalent of NOP_EXPR (a conversion that doesn't generate
code) around.

For example, for intel, before lowering, etc (IE straight from the front
end) looks like this for zach's example:
Original Source line 9:
        return ((r *) &var)->f1 + var[2000];

9       1               return ( &var.1_V$2->f1_V$3 + var.1_V$2[2000] );

Note the lack of casts. :)

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-18  3:12 Richard Kenner
@ 2004-10-18  4:02 ` Zack Weinberg
  2004-10-18  4:26   ` Daniel Berlin
  0 siblings, 1 reply; 61+ messages in thread
From: Zack Weinberg @ 2004-10-18  4:02 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc-patches

kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:

>     Your opinion on my alternative suggestion - hoist the type conversion
>     to the outermost type, thus reducing the nested case to the unnested
>     case - would be appreciated (and please read the whole message before
>     answering the question, because I address another objection below).
>
> Aside from the issue of possible quadratic behavior, I don't completely
> understand the proposal here, so I can't fully comment on it.  I should
> point out, though, that some of these expressions in practice are
> quite complex: there are lots of implicit conversions and dereferences
> generated by the front end.

If you provide an example which produces a type conversion in the
middle of a chain of dereference expressions, I will endeavor to
clarify what I mean.


>     > This stuff is *very* tricky and, as I said, we've been
>     > throught it before.
>
>     The PR indicates that the solution which has already been
>     implemented does not work.  Thus, the entire topic is open for
>     reexamination.
>
> The PR indicates that a particular optimizer has problems with the
> overall approach.  One way of dealing with that is to change the
> approach.  But a more local way is to fix that particular optimizer.

Fair point, however, comments upthread indicate that lots of people
are not happy with the overall approach.

zw

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-18  2:38 Richard Kenner
@ 2004-10-18  3:14 ` Zack Weinberg
  0 siblings, 0 replies; 61+ messages in thread
From: Zack Weinberg @ 2004-10-18  3:14 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc-patches

kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:

>     Hmm.  Is it possible to construct an example which BOTH requires
>     taking the address of an intermediate field, and wouldn't be cured by
>     making all *_REF expressions implicitly convert to their type?
>
> Sorry, I didn't realize that you were talking about a different solutions
> to each problem.

Well, at the time I had forgotten about the intermediate-field case,
but I think that two different solutions might well be appropriate.

> I believe the answer to the question you pose is "no", but it's
> going to be hard to prove that.  There is a potential aliasing
> problem, but I don't think it can actually occur because you can
> always view the semantics of the VIEW_CONVERT_EXPR as being pointer
> punning and could make the resulting pointer TYPE_REF_CAN_ALIAS_ALL.

Ok.  Does that also address the objection you raised to the
pointer-punning operation at the outermost level?

>     Also, I just now thought of an alternative: whenever you have a chain
>     of *_REF expressions which requires a type conversion somewhere along
>     the way, construct a modified version of the *outermost* type which
>     will give the field in question the effective type that it needs.
>
> That construction operation is linear in the number of fields in the
> record.  If you have a program that access each field, doing that
> construction produces quadratic behavior (we have such a program in
> the testsuite for simple field accesses).

You'd only need to construct the modified outermost type when you had
a type conversion somewhere along the chain of references.  I suppose
one could construct a program that accesses each field *with* a type
conversion; I can't think of any good reason such a program would pop
up in the wild, but still, if we can find a solution that doesn't
entail quadratic behavior for this case that would be good.

> What I'm not sure about are issues relating to code quality: we've
> taken what used to be expanded as one operation between tree and RTL
> and converted it into four distinct operations (ADDR_EXPR, NOP_EXPR,
> INDIRECT_REF, and COMPONENT_REF) and it'll be up to the RTL
> optimizers to put it all back together.
>
> I think it's taking a step backwards when we can no longer represent
> what is a single expansion operation in GIMPLE as one statement.

Experimentally speaking, the C example I showed you compiles to
exactly the same code as the Ada example.  So the optimizers are
effective at putting it all back together.  That's not a very
complicated example, of course.

Now I'm not the person to make final judgements about design criteria
of GIMPLE; my involvement in the tree-ssa project has been minimal.
However, I think this is a mistaken design rationale.  The goal of
GIMPLE as I understand it is to be simple, logically consistent, and
both language- and machine-independent (by which I mean that all
language- and machine-specific semantics are encoded explicitly in the
representation, rather than referring to 'hooks' into front or back
end).  The goal is not to be able to represent as one statement every
operation that can be encoded in one machine instruction, and it is
certainly not to be able to represent as one statement every primitive
operation of every supported language.

In this case, it seems to me that it may well be better overall to
translate this particular language operation to multiple GIMPLE
statements even though it is going to wind up being compressed back
down to one machine instruction, since that seems to make life so much
easier for both the optimizers and the humans trying to debug the
optimizers.

zw

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-18  3:12 Richard Kenner
  2004-10-18  4:02 ` Zack Weinberg
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Kenner @ 2004-10-18  3:12 UTC (permalink / raw)
  To: zack; +Cc: gcc-patches

    Your opinion on my alternative suggestion - hoist the type conversion
    to the outermost type, thus reducing the nested case to the unnested
    case - would be appreciated (and please read the whole message before
    answering the question, because I address another objection below).

Aside from the issue of possible quadratic behavior, I don't completely
understand the proposal here, so I can't fully comment on it.  I should
point out, though, that some of these expressions in practice are
quite complex: there are lots of implicit conversions and dereferences
generated by the front end.

    I didn't say this had to be captured in the type-based alias sets, I
    said it had to be comprehensible to the alias analyzer.  Those two
    assertions are not the same thing.

No, but your comment about the GNAT alias set computation was sounding like
you were talking about type-based alias sets.

    > This stuff is *very* tricky and, as I said, we've been throught it
    > before.

    The PR indicates that the solution which has already been implemented
    does not work.  Thus, the entire topic is open for reexamination.

The PR indicates that a particular optimizer has problems with the
overall approach.  One way of dealing with that is to change the approach.
But a more local way is to fix that particular optimizer.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-18  2:35 Richard Kenner
@ 2004-10-18  2:51 ` Zack Weinberg
  0 siblings, 0 replies; 61+ messages in thread
From: Zack Weinberg @ 2004-10-18  2:51 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc-patches

kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:

>       return ((r *) &var)->f1 + var[2000];
>
> As I said earlier, that works *in this case*, but not in the general
> case of nested field references, because you might be needing to
> take the address of something non-addressable and ...
>
>     This does not look very different from what you are generating
>     now, but sandwiching the cast inside an address-of and
>     dereference means that the gimplifier can break it down without
>     needing to create an aggregate temporary, producing something
>     like this:
>
> But you *would* need the aggerate temporary in the nested reference
> case, which is the general case.  I don't see a strong argument for
> doing something special for the degenerate case here.

I talked about the unnested case because I had forgotten that the
nested case existed, so I assumed that the example you gave was the
most general case.  As I said earlier, I would have appreciated being
shown an example not amenable to a solution which has already been
ruled out.

Your opinion on my alternative suggestion - hoist the type conversion
to the outermost type, thus reducing the nested case to the unnested
case - would be appreciated (and please read the whole message before
answering the question, because I address another objection below).

>     The only catch is that alias analysis must understand that rp
>     aliases var.  This may require some adjustment to GNAT's alias
>     set logic.
>
> If it did, then it couldn't be done because the types and the
> unchecked conversion could be in different translation units, so you
> can't know the unchecked conversion existed when you were assigning
> alias sets to the types.

I didn't say this had to be captured in the type-based alias sets, I
said it had to be comprehensible to the alias analyzer.  Those two
assertions are not the same thing.

In particular, I'm guessing that the unchecked_conversion operation
needs to be visible at the point that its result is *used*; otherwise
you would be forced to make extremely conservative aliasing
assumptions about all pointers or references whose initialization was
not visible, and I doubt this is the case.  Thus, you could use
flow-based alias analysis to get it right.

> This stuff is *very* tricky and, as I said, we've been throught it
> before.

The PR indicates that the solution which has already been implemented
does not work.  Thus, the entire topic is open for reexamination.

zw

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-18  2:46 Richard Kenner
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Kenner @ 2004-10-18  2:46 UTC (permalink / raw)
  To: zack; +Cc: gcc-patches

    Also, I just now thought of an alternative: whenever you have a chain
    of *_REF expressions which requires a type conversion somewhere along
    the way, construct a modified version of the *outermost* type which
    will give the field in question the effective type that it needs.

That construction operation is linear in the number of fields in the
record.  If you have a program that access each field, doing that
construction produces quadratic behavior (we have such a program in
the testsuite for simple field accesses).

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-18  2:38 Richard Kenner
  2004-10-18  3:14 ` Zack Weinberg
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Kenner @ 2004-10-18  2:38 UTC (permalink / raw)
  To: zack; +Cc: gcc-patches

    Hmm.  Is it possible to construct an example which BOTH requires
    taking the address of an intermediate field, and wouldn't be cured by
    making all *_REF expressions implicitly convert to their type?

Sorry, I didn't realize that you were talking about a different solutions
to each problem.

I believe the answer to the question you pose is "no", but it's going to be
hard to prove that.  There is a potential aliasing problem, but I don't think
it can actually occur because you can always view the semantics of the
VIEW_CONVERT_EXPR as being pointer punning and could make the resulting
pointer TYPE_REF_CAN_ALIAS_ALL.

What I'm not sure about are issues relating to code quality: we've taken what
used to be expanded as one operation between tree and RTL and converted it
into four distinct operations (ADDR_EXPR, NOP_EXPR, INDIRECT_REF, and
COMPONENT_REF) and it'll be up to the RTL optimizers to put it all back
together.

I think it's taking a step backwards when we can no longer represent what
is a single expansion operation in GIMPLE as one statement.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-18  2:35 Richard Kenner
  2004-10-18  2:51 ` Zack Weinberg
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Kenner @ 2004-10-18  2:35 UTC (permalink / raw)
  To: zack; +Cc: gcc-patches

    It only makes any sense at all if I assume you meant to write 'var'
    instead of 'foo' in both places inside the body of the function.

Sort of.

    Also, you left out the declaration of uc, which meant I had to go look
    up how unchecked_conversion works.  

I wrote it at one point, but apparently edited it out.  I actually
can't write Ada very well.

      return ((r *) &var)->f1 + var[2000];

As I said earlier, that works *in this case*, but not in the general
case of nested field references, because you might be needing to take
the address of something non-addressable and ...

    This does not look very different from what you are generating now,
    but sandwiching the cast inside an address-of and dereference means
    that the gimplifier can break it down without needing to create an
    aggregate temporary, producing something like this:

But you *would* need the aggerate temporary in the nested reference case,
which is the general case.  I don't see a strong argument for doing
something special for the degenerate case here.

    The only catch is that alias analysis must understand that rp aliases
    var.  This may require some adjustment to GNAT's alias set logic.

If it did, then it couldn't be done because the types and the unchecked
conversion could be in different translation units, so you can't know
the unchecked conversion existed when you were assigning alias sets
to the types.

This stuff is *very* tricky and, as I said, we've been throught it before.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-17 23:45 ` Zack Weinberg
@ 2004-10-18  0:23   ` Zack Weinberg
  0 siblings, 0 replies; 61+ messages in thread
From: Zack Weinberg @ 2004-10-18  0:23 UTC (permalink / raw)
  To: Richard Kenner; +Cc: pinskia, gcc-patches

Zack Weinberg <zack@codesourcery.com> writes:

> Hmm.  Is it possible to construct an example which BOTH requires
> taking the address of an intermediate field, and wouldn't be cured by
> making all *_REF expressions implicitly convert to their type?

Correction: I meant to say "... requires changing the type of an
intermediate component of the chain of *_REF expressions ..."

Also, I just now thought of an alternative: whenever you have a chain
of *_REF expressions which requires a type conversion somewhere along
the way, construct a modified version of the *outermost* type which
will give the field in question the effective type that it needs.
Then take the address of the whole aggregate, convert it to that
modified type, and do the dereferences.

zw

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-17 23:06 Richard Kenner
@ 2004-10-17 23:45 ` Zack Weinberg
  2004-10-18  0:23   ` Zack Weinberg
  0 siblings, 1 reply; 61+ messages in thread
From: Zack Weinberg @ 2004-10-17 23:45 UTC (permalink / raw)
  To: Richard Kenner; +Cc: pinskia, gcc-patches

kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:

>     The way to fix this would be take the address and cast that pointer
>     to the pointer of the new type.
>
> That only works when it's addressable.  In the case of a variable it is,
> but not when it's an intermediate field.

I wish you would have given me an example that wasn't susceptible to
fixing by taking the address of the aggregate, then.

Hmm.  Is it possible to construct an example which BOTH requires
taking the address of an intermediate field, and wouldn't be cured by
making all *_REF expressions implicitly convert to their type?

zw

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-17 22:30 Richard Kenner
  2004-10-17 22:36 ` Andrew Pinski
@ 2004-10-17 23:41 ` Zack Weinberg
  2004-10-18 13:13   ` Florian Weimer
  1 sibling, 1 reply; 61+ messages in thread
From: Zack Weinberg @ 2004-10-17 23:41 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc-patches

kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:

>     > ... actually, it's not good enough because it doesn't handle the
>     > 	((cast) var).field
>     > case and that's an expensive one.
>
>     My gut feeling is that needing this cast means there's something wrong
>     with the static type of VAR.  Could you show an example of source code
>     that needs this cast, and the type tree generated for VAR?
>
> Well, in Ada you can get it with an explicit Unchecked_Conversion to a
> composite type.  For example:
>
> 	with unchecked_conversion;
> 	function foo return integer is
> 	    type arr is array (1..100_000) of integer;
> 	    type r is record
> 		f1: length;
> 		f2: array (1..99_999) of integer;
> 	    end record;
>
> 	    var: foo;
> 	begin
> 	    return uc (foo).f1 + foo (2000);
> 	end foo;

This example has problems.  It only makes any sense at all if I assume
you meant to write 'var' instead of 'foo' in both places inside the
body of the function.  Also, you left out the declaration of uc, which
meant I had to go look up how unchecked_conversion works.  And my copy
of GNAT (from GCC 3.3) doesn't like the record declaration.  After
some bludgeoning I have managed to produce an example that compiles
and still, I believe, captures what you intended to demonstrate:

with unchecked_conversion;
function foo return integer is
    type arr is array (1..100_000) of integer;
    type arr2 is array (1..99_999) of integer;
    type r is record
        f1: integer;
        f2: arr2;
    end record;
    function uc is new unchecked_conversion (source => arr, target => r);
    var : arr;
begin
    return uc(var).f1 + var(2000);
end foo;

If I have understood correctly, then my suggestion would be that you
should have Gigi translate this code to GENERIC equivalent to the
following C example:

int foo(void)
{
  typedef int arr[100000];
  typedef struct { int f1; int f2[999999]; } r;

  arr var;

  return ((r *) &var)->f1 + var[2000];
}

This does not look very different from what you are generating now,
but sandwiching the cast inside an address-of and dereference means
that the gimplifier can break it down without needing to create an
aggregate temporary, producing something like this:

int foo(void)
{
  typedef int arr[100000];
  typedef struct { int f1; int f2[999999]; } r;

  arr var;
  r *var0;

  int v1, v2, rv;

  var0 = (r *) &var;

  v1 = var0->f1;
  v2 = var[2000];
  rv = v1 + v2;

  return rv;
}

You would do this any time either side of the unchecked_conversion is
not a scalar type.

The only catch is that alias analysis must understand that rp aliases
var.  This may require some adjustment to GNAT's alias set logic.

zw

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-17 23:06 Richard Kenner
  2004-10-17 23:45 ` Zack Weinberg
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Kenner @ 2004-10-17 23:06 UTC (permalink / raw)
  To: pinskia; +Cc: gcc-patches

    The way to fix this would be take the address and cast that pointer
    to the pointer of the new type.

That only works when it's addressable.  In the case of a variable it is,
but not when it's an intermediate field.

    We should never have a cast to/from a non-scaler value (except for
    maybe vectors).

We've been through all this months ago.  The addressability (alignment)
problem is the show-stopper for doing that sort of transformation in general.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-17 22:30 Richard Kenner
@ 2004-10-17 22:36 ` Andrew Pinski
  2004-10-17 23:41 ` Zack Weinberg
  1 sibling, 0 replies; 61+ messages in thread
From: Andrew Pinski @ 2004-10-17 22:36 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc-patches, zack


On Oct 17, 2004, at 6:29 PM, Richard Kenner wrote:
> This is, admittedly, a somewhat contrived example, but this sort of
> thing occurs commonly in expanded code for different constructs.

The way to fix this would be take the address and cast that pointer
to the pointer of the new type.

We should never have a cast to/from a non-scaler value (except for
maybe vectors).


-- Pinski

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-17 22:30 Richard Kenner
  2004-10-17 22:36 ` Andrew Pinski
  2004-10-17 23:41 ` Zack Weinberg
  0 siblings, 2 replies; 61+ messages in thread
From: Richard Kenner @ 2004-10-17 22:30 UTC (permalink / raw)
  To: zack; +Cc: gcc-patches

    > ... actually, it's not good enough because it doesn't handle the
    > 	((cast) var).field
    > case and that's an expensive one.

    My gut feeling is that needing this cast means there's something wrong
    with the static type of VAR.  Could you show an example of source code
    that needs this cast, and the type tree generated for VAR?

Well, in Ada you can get it with an explicit Unchecked_Conversion to a
composite type.  For example:

	with unchecked_conversion;
	function foo return integer is
	    type arr is array (1..100_000) of integer;
	    type r is record
		f1: length;
		f2: array (1..99_999) of integer;
	    end record;

	    var: foo;
	begin
	    return uc (foo).f1 + foo (2000);
	end foo;

The type of FOO has to be an array of 100,000 integers, but the
programmer would be quite surprised if the reference to F1 involved a
400,000 byte copy!

This is, admittedly, a somewhat contrived example, but this sort of
thing occurs commonly in expanded code for different constructs.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-17 21:24 Richard Kenner
@ 2004-10-17 21:43 ` Zack Weinberg
  0 siblings, 0 replies; 61+ messages in thread
From: Zack Weinberg @ 2004-10-17 21:43 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc-patches

kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:

>     What he's saying is that
>
>        <COMPONENT_REF type struct field>
>
>     (and all other reference nodes) would implicitly convert FIELD to TYPE,
>     if that's not FIELD's intrinsic type.  I think this makes a hell of a
>     lot of sense, personally.
>
> ... actually, it's not good enough because it doesn't handle the
> 	((cast) var).field
> case and that's an expensive one.

My gut feeling is that needing this cast means there's something wrong
with the static type of VAR.  Could you show an example of source code
that needs this cast, and the type tree generated for VAR?

zw

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-17 21:18 Richard Kenner
@ 2004-10-17 21:26 ` Zack Weinberg
  0 siblings, 0 replies; 61+ messages in thread
From: Zack Weinberg @ 2004-10-17 21:26 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc-patches

kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:

>     What he's saying is that
>
>        <COMPONENT_REF type struct field>
>
>     (and all other reference nodes) would implicitly convert FIELD to TYPE,
>     if that's not FIELD's intrinsic type.  I think this makes a hell of a
>     lot of sense, personally.
>
> I have no problem with that, though I suspect it's a bit of an earthquake
> at this point. 

Not clear.  If as few places look inside a reference chain as you say,
it should be pretty limited.  But maybe it's tree-cleanup-branch
material.

> Would we do it for both GENERIC and GIMPLE or just the latter?

I would personally prefer both, as this is simpler and means that
front ends can avoid generating tree nodes which are going to be
folded away.

zw

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-17 21:24 Richard Kenner
  2004-10-17 21:43 ` Zack Weinberg
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Kenner @ 2004-10-17 21:24 UTC (permalink / raw)
  To: zack; +Cc: gcc-patches

    What he's saying is that

       <COMPONENT_REF type struct field>

    (and all other reference nodes) would implicitly convert FIELD to TYPE,
    if that's not FIELD's intrinsic type.  I think this makes a hell of a
    lot of sense, personally.

... actually, it's not good enough because it doesn't handle the
	((cast) var).field
case and that's an expensive one.  If we don't have a way of representing
that, we might have to do a full copy of VAR, which might be quite large.

It's been suggested that an optimizer can remove that copy, but how could it?
There's no way of representing having the copy be removed in the tree!  Only
an RTL optimizer could remove it and that's way too low level.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
@ 2004-10-17 21:18 Richard Kenner
  2004-10-17 21:26 ` Zack Weinberg
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Kenner @ 2004-10-17 21:18 UTC (permalink / raw)
  To: zack; +Cc: gcc-patches

    What he's saying is that

       <COMPONENT_REF type struct field>

    (and all other reference nodes) would implicitly convert FIELD to TYPE,
    if that's not FIELD's intrinsic type.  I think this makes a hell of a
    lot of sense, personally.

I have no problem with that, though I suspect it's a bit of an earthquake
at this point.  Would we do it for both GENERIC and GIMPLE or just the
latter?

Note that I also think that get_inner_reference and handle_component_p
probably only need to handle VIEW_CONVERT_EXPR and not the rest of the
conversions at this point.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-17 21:04 Richard Kenner
@ 2004-10-17 21:10 ` Zack Weinberg
  2004-10-19 21:36   ` Richard Henderson
  0 siblings, 1 reply; 61+ messages in thread
From: Zack Weinberg @ 2004-10-17 21:10 UTC (permalink / raw)
  To: Richard Kenner; +Cc: dberlin, gcc-patches

kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes:

>     2. Make them mandatory inside component_refs, so that you *always* have
>     them there.  This is what LLVM does with getelementptr. Each piece has a
>     type specifier
>
> I don't follow.  Since every tree node has a type, we always have it.
> It seems that all you need is an access function that strips off the
> conversion and you have the same form.

What he's saying is that

   <COMPONENT_REF type struct field>

(and all other reference nodes) would implicitly convert FIELD to TYPE,
if that's not FIELD's intrinsic type.  I think this makes a hell of a
lot of sense, personally.

zw

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re:  [patch] for PR 18040
@ 2004-10-17 21:04 Richard Kenner
  2004-10-17 21:10 ` Zack Weinberg
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Kenner @ 2004-10-17 21:04 UTC (permalink / raw)
  To: dberlin; +Cc: gcc-patches

    Except that this is exactly what you are doing by making it so
    conversions can appear more or less anywhere in an expression, instead
    of just a single place.

Not "more or less anywhere", it's precisely just inside a chain of references.
Since the expander treats this as a single reference, I don't see why the
optimizers care about what's inside that except in special situations.

It *really* pessimizes the code to not allow them, plus, as I've said, it
means we can't convert all valid GENERIC to GIMPLE. 

    1. Don't allow casts except at the top level. Deal with consequences.

I'd argue exactly the other way around: having a cast at the top level
of a nest of references isn't necessary, so we can require that it be
in it's own statement. But inside is where it's useful.

    2. Make them mandatory inside component_refs, so that you *always* have
    them there.  This is what LLVM does with getelementptr. Each piece has a
    type specifier

I don't follow.  Since every tree node has a type, we always have it.
It seems that all you need is an access function that strips off the
conversion and you have the same form.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re:  [patch] for PR 18040
  2004-10-17 20:25 Richard Kenner
@ 2004-10-17 20:47 ` Daniel Berlin
  0 siblings, 0 replies; 61+ messages in thread
From: Daniel Berlin @ 2004-10-17 20:47 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc-patches

On Sun, 2004-10-17 at 16:29 -0400, Richard Kenner wrote:
>     It is the job of the optimizers to produce good code, not the job of
>     the front end.  If the optimizers can't produce good code from this,
>     then fix the optimizers.
> 
> Yeah, but there's no point in making things arbitrarily hard for them.
> GIGO still applies.
Except that this is exactly what you are doing by making it so
conversions can appear more or less anywhere in an expression, instead
of just a single place.

I see two sane options here.

1. Don't allow casts except at the top level. Deal with consequences.
2. Make them mandatory inside component_refs, so that you *always* have
them there.  This is what LLVM does with getelementptr. Each piece has a
type specifier


--Dan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re:  [patch] for PR 18040
@ 2004-10-17 20:25 Richard Kenner
  2004-10-17 20:47 ` Daniel Berlin
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Kenner @ 2004-10-17 20:25 UTC (permalink / raw)
  To: dberlin; +Cc: gcc-patches

    It is the job of the optimizers to produce good code, not the job of
    the front end.  If the optimizers can't produce good code from this,
    then fix the optimizers.

Yeah, but there's no point in making things arbitrarily hard for them.
GIGO still applies.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re:  [patch] for PR 18040
  2004-10-17 20:17 Richard Kenner
@ 2004-10-17 20:25 ` Daniel Berlin
  0 siblings, 0 replies; 61+ messages in thread
From: Daniel Berlin @ 2004-10-17 20:25 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc-patches

On Sun, 2004-10-17 at 16:07 -0400, Richard Kenner wrote:
>     ((cast)var).field is not valid gimple.
> 
>     Casts must be in a statement by themselves, unless the argument is
>     is_gimple_min_invariant.
> 
> The issue here is historical.  The point is to view a single nested component
> reference as valid so we don't try to split it up since it can normally be
> done in a single operation and spliting it up will make it more expensive.
> The idea is that handled_component_p returns true for exactly those things
> that will be stripped by get_inner_reference.
> 
> So we wanted to support conversions *inside* a nest of references.  In other
> words, ((cast) var.field1).field2.  I thought that was specified in the
> grammer; if not, it should be.

I don't believe so.  It drastically changes where type conversions are
allowed.


> Your example above is simply a degenerate form of that and certainly needs to
> be supported because otherwise we pessimize code quite badly (at best).
> Suppose that cast is to a variable-size type.  Then we can't even gimplify
> that expression because we can't have a variable-size temporary.  Such an
> expression is certainly valid GENERIC, so saying we can't gimplify it is bad.
> But if we say that it's the responsibility of the front-end to generate that
> temporary, we now have horrible code: we might be copying a 1MB structure
> into a temporary just to retreive one byte!

It is the job of the optimizers to produce good code, not the job of the
front end.
If the optimizers can't produce good code from this, then fix the
optimizers.


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re:  [patch] for PR 18040
@ 2004-10-17 20:17 Richard Kenner
  2004-10-17 20:25 ` Daniel Berlin
  0 siblings, 1 reply; 61+ messages in thread
From: Richard Kenner @ 2004-10-17 20:17 UTC (permalink / raw)
  To: dberlin; +Cc: gcc-patches

    ((cast)var).field is not valid gimple.

    Casts must be in a statement by themselves, unless the argument is
    is_gimple_min_invariant.

The issue here is historical.  The point is to view a single nested component
reference as valid so we don't try to split it up since it can normally be
done in a single operation and spliting it up will make it more expensive.
The idea is that handled_component_p returns true for exactly those things
that will be stripped by get_inner_reference.

So we wanted to support conversions *inside* a nest of references.  In other
words, ((cast) var.field1).field2.  I thought that was specified in the
grammer; if not, it should be.

Your example above is simply a degenerate form of that and certainly needs to
be supported because otherwise we pessimize code quite badly (at best).
Suppose that cast is to a variable-size type.  Then we can't even gimplify
that expression because we can't have a variable-size temporary.  Such an
expression is certainly valid GENERIC, so saying we can't gimplify it is bad.
But if we say that it's the responsibility of the front-end to generate that
temporary, we now have horrible code: we might be copying a 1MB structure
into a temporary just to retreive one byte!

However, a side-effect of allowing this is that we allow the conversion as
the *outer* operation.  That hasn't be quite correct for a while, but
apparently never caused problems before.

So my suggestion is to modify handled_component_p to exclude a conversion at
the outer level but to have a recursive version of it that *does* allow the
conversion inside fields.  But I don't know if that addresses your issue.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-17 19:48 ` Zdenek Dvorak
@ 2004-10-17 20:01   ` Daniel Berlin
  0 siblings, 0 replies; 61+ messages in thread
From: Daniel Berlin @ 2004-10-17 20:01 UTC (permalink / raw)
  To: Zdenek Dvorak; +Cc: Richard Kenner, gcc-patches

On Sun, 2004-10-17 at 21:46 +0200, Zdenek Dvorak wrote:
> Hello,
> 
> >     is_gimple_addressable uses handled_component_p to determine whether its
> >     operand is a component reference.  However handled_component_p allows
> >     also NOP_EXPRs and CONVERT_EXPRs.  This in turn makes SRA believe that
> >     ((cast) var).field is valid gimple (since "(cast) var" passes
> >     is_gimple_addressable), which causes crash later.
> > 
> > Can you explain this more?  I very much think it's going in the wrong
> > direction to not use handled_component_p.  If we don't want to view a
> > an expression as being a component that's handled, we should change
> > handled_component_p.  But right now, what you show above in valid GIMPLE.
> > Perhaps there's a bug elsewehre.
> 
> no, it is not valid gimple, since it does not match grammar as
> described in documentation.  Also considering "(cast) something"
> to be addressable is a bit weird by itself.
> 
> Of course feel free to propose to extend gimple for this if you think it
> would be useful; handling the no-op casts in addresses would not be a
> problem from optimizations point of view.

Except then the optimizers would have to start breaking apart
expressions and stripping off nops again where they didn't matter,
wouldn't they?
Right now the only time you see a nop_expr is at the top level of a RHS.
Changing gimple to allow casts in any other part of the tree is a very
bad idea, IMHO.



> 
> Zdenek
-- 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [patch] for PR 18040
  2004-10-17 19:46 Richard Kenner
  2004-10-17 19:46 ` Daniel Berlin
@ 2004-10-17 19:48 ` Zdenek Dvorak
  2004-10-17 20:01   ` Daniel Berlin
  1 sibling, 1 reply; 61+ messages in thread
From: Zdenek Dvorak @ 2004-10-17 19:48 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc-patches

Hello,

>     is_gimple_addressable uses handled_component_p to determine whether its
>     operand is a component reference.  However handled_component_p allows
>     also NOP_EXPRs and CONVERT_EXPRs.  This in turn makes SRA believe that
>     ((cast) var).field is valid gimple (since "(cast) var" passes
>     is_gimple_addressable), which causes crash later.
> 
> Can you explain this more?  I very much think it's going in the wrong
> direction to not use handled_component_p.  If we don't want to view a
> an expression as being a component that's handled, we should change
> handled_component_p.  But right now, what you show above in valid GIMPLE.
> Perhaps there's a bug elsewehre.

no, it is not valid gimple, since it does not match grammar as
described in documentation.  Also considering "(cast) something"
to be addressable is a bit weird by itself.

Of course feel free to propose to extend gimple for this if you think it
would be useful; handling the no-op casts in addresses would not be a
problem from optimizations point of view.

Zdenek

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re:  [patch] for PR 18040
@ 2004-10-17 19:46 Richard Kenner
  2004-10-17 19:46 ` Daniel Berlin
  2004-10-17 19:48 ` Zdenek Dvorak
  0 siblings, 2 replies; 61+ messages in thread
From: Richard Kenner @ 2004-10-17 19:46 UTC (permalink / raw)
  To: rakdver; +Cc: gcc-patches

    is_gimple_addressable uses handled_component_p to determine whether its
    operand is a component reference.  However handled_component_p allows
    also NOP_EXPRs and CONVERT_EXPRs.  This in turn makes SRA believe that
    ((cast) var).field is valid gimple (since "(cast) var" passes
    is_gimple_addressable), which causes crash later.

Can you explain this more?  I very much think it's going in the wrong
direction to not use handled_component_p.  If we don't want to view a
an expression as being a component that's handled, we should change
handled_component_p.  But right now, what you show above in valid GIMPLE.
Perhaps there's a bug elsewehre.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re:  [patch] for PR 18040
  2004-10-17 19:46 Richard Kenner
@ 2004-10-17 19:46 ` Daniel Berlin
  2004-10-17 19:48 ` Zdenek Dvorak
  1 sibling, 0 replies; 61+ messages in thread
From: Daniel Berlin @ 2004-10-17 19:46 UTC (permalink / raw)
  To: Richard Kenner; +Cc: rakdver, gcc-patches

On Sun, 2004-10-17 at 15:44 -0400, Richard Kenner wrote:
>     is_gimple_addressable uses handled_component_p to determine whether its
>     operand is a component reference.  However handled_component_p allows
>     also NOP_EXPRs and CONVERT_EXPRs.  This in turn makes SRA believe that
>     ((cast) var).field is valid gimple (since "(cast) var" passes
>     is_gimple_addressable), which causes crash later.
> 
> Can you explain this more?  I very much think it's going in the wrong
> direction to not use handled_component_p.  If we don't want to view a
> an expression as being a component that's handled, we should change
> handled_component_p.  But right now, what you show above in valid GIMPLE.
> Perhaps there's a bug elsewehre.
-- 

((cast)var).field is not valid gimple.

Casts must be in a statement by themselves, unless the argument is
is_gimple_min_invariant.


IE (cast)(var.field) would be legal if var.field passes
is_gimple_min_invariant, however ((cast)var).field is not valid gimple
ever.

--Dan



^ permalink raw reply	[flat|nested] 61+ messages in thread

* [patch] for PR 18040
@ 2004-10-17 19:28 Zdenek Dvorak
  2004-10-19 21:36 ` Richard Henderson
  0 siblings, 1 reply; 61+ messages in thread
From: Zdenek Dvorak @ 2004-10-17 19:28 UTC (permalink / raw)
  To: gcc-patches

Hello,

is_gimple_addressable uses handled_component_p to determine whether its
operand is a component reference.  However handled_component_p allows
also NOP_EXPRs and CONVERT_EXPRs.  This in turn makes SRA believe that
((cast) var).field is valid gimple (since "(cast) var" passes
is_gimple_addressable), which causes crash later.

Bootstrapped & regtested on i686.

Zdenek

	PR tree-optimization/18040
	* tree-gimple.c (is_gimple_component): New function.
	(is_gimple_addressable, get_base_address): Use is_gimple_component.

Index: tree-gimple.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-gimple.c,v
retrieving revision 2.28
diff -c -3 -p -r2.28 tree-gimple.c
*** tree-gimple.c	30 Sep 2004 01:22:05 -0000	2.28
--- tree-gimple.c	17 Oct 2004 16:23:57 -0000
*************** is_gimple_condexpr (tree t)
*** 162,173 ****
    return (is_gimple_val (t) || COMPARISON_CLASS_P (t));
  }
  
  /*  Return true if T is something whose address can be taken.  */
  
  bool
  is_gimple_addressable (tree t)
  {
!   return (is_gimple_id (t) || handled_component_p (t)
  	  || TREE_CODE (t) == REALPART_EXPR
  	  || TREE_CODE (t) == IMAGPART_EXPR
  	  || INDIRECT_REF_P (t));
--- 162,192 ----
    return (is_gimple_val (t) || COMPARISON_CLASS_P (t));
  }
  
+ /* Return true if T is a gimple component reference.  */
+ 
+ static bool
+ is_gimple_component (tree t)
+ {
+   switch (TREE_CODE (t))
+     {
+     case COMPONENT_REF:
+     case ARRAY_REF:
+     case ARRAY_RANGE_REF:
+     case VIEW_CONVERT_EXPR:
+       return true;
+ 
+     default:
+       return false;;
+     }
+ }
+ 
  /*  Return true if T is something whose address can be taken.  */
  
  bool
  is_gimple_addressable (tree t)
  {
!   return (is_gimple_id (t)
! 	  || is_gimple_component (t)
  	  || TREE_CODE (t) == REALPART_EXPR
  	  || TREE_CODE (t) == IMAGPART_EXPR
  	  || INDIRECT_REF_P (t));
*************** get_call_expr_in (tree t)
*** 437,444 ****
  tree
  get_base_address (tree t)
  {
!   while (TREE_CODE (t) == REALPART_EXPR || TREE_CODE (t) == IMAGPART_EXPR
! 	 || handled_component_p (t))
      t = TREE_OPERAND (t, 0);
    
    if (SSA_VAR_P (t)
--- 456,465 ----
  tree
  get_base_address (tree t)
  {
!   while (TREE_CODE (t) == REALPART_EXPR
! 	 || TREE_CODE (t) == IMAGPART_EXPR
! 	 || TREE_CODE (t) == BIT_FIELD_REF
! 	 || is_gimple_component (t))
      t = TREE_OPERAND (t, 0);
    
    if (SSA_VAR_P (t)

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2004-10-20 13:26 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-10-18 14:52 [patch] for PR 18040 Richard Kenner
2004-10-18 15:20 ` Daniel Berlin
  -- strict thread matches above, loose matches on Subject: below --
2004-10-20 13:27 Richard Kenner
2004-10-20  0:25 Richard Kenner
2004-10-19 23:26 Richard Kenner
2004-10-19 23:22 Richard Kenner
2004-10-19 23:03 Richard Kenner
2004-10-19 23:13 ` Jason Merrill
2004-10-19 22:48 Richard Kenner
2004-10-19 23:01 ` Jason Merrill
2004-10-19 23:07 ` Zack Weinberg
2004-10-18 17:48 Richard Kenner
2004-10-18 15:42 Richard Kenner
2004-10-18 14:38 Richard Kenner
2004-10-18 14:24 Richard Kenner
2004-10-19 22:50 ` Zack Weinberg
2004-10-18 14:22 Richard Kenner
2004-10-19 22:47 ` Zack Weinberg
2004-10-18  3:12 Richard Kenner
2004-10-18  4:02 ` Zack Weinberg
2004-10-18  4:26   ` Daniel Berlin
2004-10-18  2:46 Richard Kenner
2004-10-18  2:38 Richard Kenner
2004-10-18  3:14 ` Zack Weinberg
2004-10-18  2:35 Richard Kenner
2004-10-18  2:51 ` Zack Weinberg
2004-10-17 23:06 Richard Kenner
2004-10-17 23:45 ` Zack Weinberg
2004-10-18  0:23   ` Zack Weinberg
2004-10-17 22:30 Richard Kenner
2004-10-17 22:36 ` Andrew Pinski
2004-10-17 23:41 ` Zack Weinberg
2004-10-18 13:13   ` Florian Weimer
2004-10-18 17:24     ` Jason Merrill
2004-10-18 17:37       ` Florian Weimer
2004-10-18 18:02         ` Jason Merrill
2004-10-19 22:40     ` Zack Weinberg
2004-10-17 21:24 Richard Kenner
2004-10-17 21:43 ` Zack Weinberg
2004-10-17 21:18 Richard Kenner
2004-10-17 21:26 ` Zack Weinberg
2004-10-17 21:04 Richard Kenner
2004-10-17 21:10 ` Zack Weinberg
2004-10-19 21:36   ` Richard Henderson
2004-10-19 22:19     ` Daniel Berlin
2004-10-20  7:03       ` Jason Merrill
2004-10-19 22:51     ` Zack Weinberg
2004-10-20  0:02       ` Richard Henderson
2004-10-17 20:25 Richard Kenner
2004-10-17 20:47 ` Daniel Berlin
2004-10-17 20:17 Richard Kenner
2004-10-17 20:25 ` Daniel Berlin
2004-10-17 19:46 Richard Kenner
2004-10-17 19:46 ` Daniel Berlin
2004-10-17 19:48 ` Zdenek Dvorak
2004-10-17 20:01   ` Daniel Berlin
2004-10-17 19:28 Zdenek Dvorak
2004-10-19 21:36 ` Richard Henderson
2004-10-19 22:03   ` Zdenek Dvorak
2004-10-19 22:04     ` Andrew Pinski
2004-10-19 22:06       ` Zdenek Dvorak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).