public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* strict-aliasing and typedefs
@ 2003-05-14 20:56 Brad Lucier
  2003-05-14 21:07 ` Fergus Henderson
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Brad Lucier @ 2003-05-14 20:56 UTC (permalink / raw)
  To: gcc; +Cc: Brad Lucier

I've read the documentation and done a google and gcc search, but
I still have this question.

If I have the typedefs and variables

typedef int stackslot;
typedef int heapslot;

stackslot *sp;
heapslot *hp;

then can sp and hp point to the same location in memory with
ISO C's aliasing rules?  I'm wondering if a Scheme->C compiler
can use typedefs and ISO C's aliasing rules to tell gcc that certain
memory locations cannot alias each other.

Brad

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 20:56 strict-aliasing and typedefs Brad Lucier
@ 2003-05-14 21:07 ` Fergus Henderson
  2003-05-14 21:28   ` Gabriel Dos Reis
  2003-05-14 21:08 ` Gabriel Dos Reis
  2003-05-14 21:21 ` strict-aliasing and typedefs Andreas Schwab
  2 siblings, 1 reply; 30+ messages in thread
From: Fergus Henderson @ 2003-05-14 21:07 UTC (permalink / raw)
  To: Brad Lucier; +Cc: gcc

On 14-May-2003, Brad Lucier <lucier@math.purdue.edu> wrote:
> I've read the documentation and done a google and gcc search, but
> I still have this question.
> 
> If I have the typedefs and variables
> 
> typedef int stackslot;
> typedef int heapslot;
> 
> stackslot *sp;
> heapslot *hp;
> 
> then can sp and hp point to the same location in memory with
> ISO C's aliasing rules?

Yes.

> I'm wondering if a Scheme->C compiler
> can use typedefs and ISO C's aliasing rules to tell gcc that certain
> memory locations cannot alias each other.

I don't think so.

You might be able to do it using single-member structs instead of typedefs;
but doing that might inhibit some of GCC's other optimizations.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 20:56 strict-aliasing and typedefs Brad Lucier
  2003-05-14 21:07 ` Fergus Henderson
@ 2003-05-14 21:08 ` Gabriel Dos Reis
  2003-05-14 21:21   ` Joe Buck
  2003-05-14 21:21 ` strict-aliasing and typedefs Andreas Schwab
  2 siblings, 1 reply; 30+ messages in thread
From: Gabriel Dos Reis @ 2003-05-14 21:08 UTC (permalink / raw)
  To: Brad Lucier; +Cc: gcc

Brad Lucier <lucier@math.purdue.edu> writes:

| I've read the documentation and done a google and gcc search, but
| I still have this question.
| 
| If I have the typedefs and variables
| 
| typedef int stackslot;
| typedef int heapslot;
| 
| stackslot *sp;
| heapslot *hp;
| 
| then can sp and hp point to the same location in memory with
| ISO C's aliasing rules?

Yes, that is because typedef-name acts like a keyword that is
synonymous for the original type, hence sp and hp can alias the same
object. 

|  I'm wondering if a Scheme->C compiler
| can use typedefs and ISO C's aliasing rules to tell gcc that certain
| memory locations cannot alias each other.

No, a typedef does not introduce a distinct type:  It just introduces a
new type-name -- it is all syntactical.

-- Gaby

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 21:08 ` Gabriel Dos Reis
@ 2003-05-14 21:21   ` Joe Buck
  2003-05-14 21:30     ` Gabriel Dos Reis
  0 siblings, 1 reply; 30+ messages in thread
From: Joe Buck @ 2003-05-14 21:21 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Brad Lucier, gcc

Brad Lucier <lucier@math.purdue.edu> writes:
> |  I'm wondering if a Scheme->C compiler
> | can use typedefs and ISO C's aliasing rules to tell gcc that certain
> | memory locations cannot alias each other.
 
On Wed, May 14, 2003 at 11:08:27PM +0200, Gabriel Dos Reis wrote:
> No, a typedef does not introduce a distinct type:  It just introduces a
> new type-name -- it is all syntactical.

How about one-element structs (with same layout but distinct names)?  Those
can't alias.
 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 20:56 strict-aliasing and typedefs Brad Lucier
  2003-05-14 21:07 ` Fergus Henderson
  2003-05-14 21:08 ` Gabriel Dos Reis
@ 2003-05-14 21:21 ` Andreas Schwab
  2003-05-14 21:32   ` Gabriel Dos Reis
  2 siblings, 1 reply; 30+ messages in thread
From: Andreas Schwab @ 2003-05-14 21:21 UTC (permalink / raw)
  To: Brad Lucier; +Cc: gcc

Brad Lucier <lucier@math.purdue.edu> writes:

|> I've read the documentation and done a google and gcc search, but
|> I still have this question.
|> 
|> If I have the typedefs and variables
|> 
|> typedef int stackslot;
|> typedef int heapslot;
|> 
|> stackslot *sp;
|> heapslot *hp;
|> 
|> then can sp and hp point to the same location in memory with
|> ISO C's aliasing rules?

Yes.  Typedefs don't introduce new types, just new names.  The following
is completely equivalent:

typedef int stackslot;
typedef int heapslot;

int *sp;
int *hp;

|> I'm wondering if a Scheme->C compiler
|> can use typedefs and ISO C's aliasing rules to tell gcc that certain
|> memory locations cannot alias each other.

One way to do this is to define differently named structures with a single
member, but they may pessimize the generated code due to ABI
peculiarities.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 21:07 ` Fergus Henderson
@ 2003-05-14 21:28   ` Gabriel Dos Reis
  0 siblings, 0 replies; 30+ messages in thread
From: Gabriel Dos Reis @ 2003-05-14 21:28 UTC (permalink / raw)
  To: Fergus Henderson; +Cc: Brad Lucier, gcc

Fergus Henderson <fjh@cs.mu.OZ.AU> writes:

[...]

| > I'm wondering if a Scheme->C compiler
| > can use typedefs and ISO C's aliasing rules to tell gcc that certain
| > memory locations cannot alias each other.
| 
| I don't think so.
| 
| You might be able to do it using single-member structs instead of typedefs;
| but doing that might inhibit some of GCC's other optimizations.

This might be an occasion to use the restrict keyword.

-- Gaby

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 21:21   ` Joe Buck
@ 2003-05-14 21:30     ` Gabriel Dos Reis
  2003-05-14 21:41       ` Joe Buck
  0 siblings, 1 reply; 30+ messages in thread
From: Gabriel Dos Reis @ 2003-05-14 21:30 UTC (permalink / raw)
  To: Joe Buck; +Cc: Brad Lucier, gcc

Joe Buck <jbuck@synopsys.com> writes:

| Brad Lucier <lucier@math.purdue.edu> writes:
| > |  I'm wondering if a Scheme->C compiler
| > | can use typedefs and ISO C's aliasing rules to tell gcc that certain
| > | memory locations cannot alias each other.
|  
| On Wed, May 14, 2003 at 11:08:27PM +0200, Gabriel Dos Reis wrote:
| > No, a typedef does not introduce a distinct type:  It just introduces a
| > new type-name -- it is all syntactical.
| 
| How about one-element structs (with same layout but distinct names)?  Those
| can't alias.

Yes, that will work but then, as Fergus pointed out, GCC seems to very
fragile when it comes to optimize "simple" structures.  

I would use "restrict" here.

-- Gaby

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 21:21 ` strict-aliasing and typedefs Andreas Schwab
@ 2003-05-14 21:32   ` Gabriel Dos Reis
  2003-05-14 21:46     ` Brad Lucier
  2003-05-14 21:47     ` Daniel Jacobowitz
  0 siblings, 2 replies; 30+ messages in thread
From: Gabriel Dos Reis @ 2003-05-14 21:32 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Brad Lucier, gcc

Andreas Schwab <schwab@suse.de> writes:

[...]

| int *sp;
| int *hp;
| 
| |> I'm wondering if a Scheme->C compiler
| |> can use typedefs and ISO C's aliasing rules to tell gcc that certain
| |> memory locations cannot alias each other.
| 
| One way to do this is to define differently named structures with a single
| member, but they may pessimize the generated code due to ABI
| peculiarities.

As I understand it, he would like an object-based non-alias
optimization, I believe restrict is a good candidate.

-- Gaby

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 21:30     ` Gabriel Dos Reis
@ 2003-05-14 21:41       ` Joe Buck
  2003-05-14 21:53         ` Gabriel Dos Reis
  2003-05-14 22:17         ` Gabriel Dos Reis
  0 siblings, 2 replies; 30+ messages in thread
From: Joe Buck @ 2003-05-14 21:41 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Brad Lucier, gcc

Joe Buck <jbuck@synopsys.com> writes:
> | How about one-element structs (with same layout but distinct names)?  Those
> | can't alias.
 
On Wed, May 14, 2003 at 11:30:15PM +0200, Gabriel Dos Reis wrote:
> Yes, that will work but then, as Fergus pointed out, GCC seems to very
> fragile when it comes to optimize "simple" structures.  

GCC does a decent job with one-element structs; its performance quickly drops like
a rock as soon as there are two elements.  Still, there may well be some loss from
using it.
 
> I would use "restrict" here.

"restrict" says that nothing at all aliases with some object.  What if you have
two pointers to Fred and two pointers to Barney?  "restrict" cannot express the
aliasing relationship.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 21:32   ` Gabriel Dos Reis
@ 2003-05-14 21:46     ` Brad Lucier
  2003-05-14 21:56       ` Gabriel Dos Reis
  2003-05-14 21:47     ` Daniel Jacobowitz
  1 sibling, 1 reply; 30+ messages in thread
From: Brad Lucier @ 2003-05-14 21:46 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Andreas Schwab, Brad Lucier, gcc

> As I understand it, he would like an object-based non-alias
> optimization, I believe restrict is a good candidate.

Well, actually, I think I would have many things that point to the heap,
and perhaps several things that point to the stack (but that's not clear
in my mind, perhaps every access to the stack would go through the stack
pointer).

Brad

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 21:32   ` Gabriel Dos Reis
  2003-05-14 21:46     ` Brad Lucier
@ 2003-05-14 21:47     ` Daniel Jacobowitz
  2003-05-14 21:59       ` Gabriel Dos Reis
  1 sibling, 1 reply; 30+ messages in thread
From: Daniel Jacobowitz @ 2003-05-14 21:47 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Andreas Schwab, Brad Lucier, gcc

On Wed, May 14, 2003 at 11:31:49PM +0200, Gabriel Dos Reis wrote:
> Andreas Schwab <schwab@suse.de> writes:
> 
> [...]
> 
> | int *sp;
> | int *hp;
> | 
> | |> I'm wondering if a Scheme->C compiler
> | |> can use typedefs and ISO C's aliasing rules to tell gcc that certain
> | |> memory locations cannot alias each other.
> | 
> | One way to do this is to define differently named structures with a single
> | member, but they may pessimize the generated code due to ABI
> | peculiarities.
> 
> As I understand it, he would like an object-based non-alias
> optimization, I believe restrict is a good candidate.

That depends.  Brad's initial example could also be used for type-based
non-alias optimization, for different sets of int objects - which is
somewhat looser than restrict allows.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 21:41       ` Joe Buck
@ 2003-05-14 21:53         ` Gabriel Dos Reis
  2003-05-14 22:17         ` Gabriel Dos Reis
  1 sibling, 0 replies; 30+ messages in thread
From: Gabriel Dos Reis @ 2003-05-14 21:53 UTC (permalink / raw)
  To: Joe Buck; +Cc: Brad Lucier, gcc

Joe Buck <jbuck@synopsys.com> writes:

| Joe Buck <jbuck@synopsys.com> writes:
| > | How about one-element structs (with same layout but distinct names)?  Those
| > | can't alias.
|  
| On Wed, May 14, 2003 at 11:30:15PM +0200, Gabriel Dos Reis wrote:
| > Yes, that will work but then, as Fergus pointed out, GCC seems to very
| > fragile when it comes to optimize "simple" structures.  
| 
| GCC does a decent job with one-element structs; its performance
| quickly drops like a rock as soon as there are two elements.  Still,
| there may well be some loss from using it.
|  
| > I would use "restrict" here.
| 
| "restrict" says that nothing at all aliases with some object.  What

No, that is not exact.  Here is an example from C standard:

       [#7] EXAMPLE 1 The file scope declarations

               int * restrict a;
               int * restrict b;
               extern int c[];

       assert  that  if an object is accessed using one of a, b, or
       c, and that object is modified anywhere in the program, then
       it is never accessed using either of the other two.

both a, b, c may alias the same object, but they cannot be used
simultaneously to modify the object. 

| if you have two pointers to Fred and two pointers to Barney?
| "restrict" cannot express the aliasing relationship.

His example is:

   stackslot *sp;
   heapslot *hp;

 rewrite it as

   stackslot *restrict sp = some_stack_pointer;
   heapslot *restrict hp = some_heap_pointer;

to meet the example from the C standard.

-- Gaby

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 21:46     ` Brad Lucier
@ 2003-05-14 21:56       ` Gabriel Dos Reis
  0 siblings, 0 replies; 30+ messages in thread
From: Gabriel Dos Reis @ 2003-05-14 21:56 UTC (permalink / raw)
  To: Brad Lucier; +Cc: Andreas Schwab, gcc

Brad Lucier <lucier@math.purdue.edu> writes:

| > As I understand it, he would like an object-based non-alias
| > optimization, I believe restrict is a good candidate.
| 
| Well, actually, I think I would have many things that point to the heap,
| and perhaps several things that point to the stack (but that's not clear
| in my mind, perhaps every access to the stack would go through the stack
| pointer).

If sp and hp do not point simultaneously to the same object, then
restrict should help -- if GCC does not honor then we should improve
GCC :-).

-- Gaby

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 21:47     ` Daniel Jacobowitz
@ 2003-05-14 21:59       ` Gabriel Dos Reis
  2003-05-14 22:04         ` Daniel Jacobowitz
  0 siblings, 1 reply; 30+ messages in thread
From: Gabriel Dos Reis @ 2003-05-14 21:59 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Andreas Schwab, Brad Lucier, gcc

Daniel Jacobowitz <drow@mvista.com> writes:

| On Wed, May 14, 2003 at 11:31:49PM +0200, Gabriel Dos Reis wrote:
| > Andreas Schwab <schwab@suse.de> writes:
| > 
| > [...]
| > 
| > | int *sp;
| > | int *hp;
| > | 
| > | |> I'm wondering if a Scheme->C compiler
| > | |> can use typedefs and ISO C's aliasing rules to tell gcc that certain
| > | |> memory locations cannot alias each other.
| > | 
| > | One way to do this is to define differently named structures with a single
| > | member, but they may pessimize the generated code due to ABI
| > | peculiarities.
| > 
| > As I understand it, he would like an object-based non-alias
| > optimization, I believe restrict is a good candidate.
| 
| That depends.  Brad's initial example could also be used for type-based
| non-alias optimization, for different sets of int objects - which is
| somewhat looser than restrict allows.

I'm not sure I get what you meant.

At the type-level, one cannot distingush an int* from another int*,
they are both int* so may alias each other. 

At the the object-level, where one considers different sets of objects
of the *same type*, one needs an additional information not provided
just by the type.  What did I miss in your remark?

-- Gaby

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 21:59       ` Gabriel Dos Reis
@ 2003-05-14 22:04         ` Daniel Jacobowitz
  2003-05-14 22:07           ` Brad Lucier
  0 siblings, 1 reply; 30+ messages in thread
From: Daniel Jacobowitz @ 2003-05-14 22:04 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Andreas Schwab, Brad Lucier, gcc

On Wed, May 14, 2003 at 11:59:30PM +0200, Gabriel Dos Reis wrote:
> Daniel Jacobowitz <drow@mvista.com> writes:
> 
> | On Wed, May 14, 2003 at 11:31:49PM +0200, Gabriel Dos Reis wrote:
> | > Andreas Schwab <schwab@suse.de> writes:
> | > 
> | > [...]
> | > 
> | > | int *sp;
> | > | int *hp;
> | > | 
> | > | |> I'm wondering if a Scheme->C compiler
> | > | |> can use typedefs and ISO C's aliasing rules to tell gcc that certain
> | > | |> memory locations cannot alias each other.
> | > | 
> | > | One way to do this is to define differently named structures with a single
> | > | member, but they may pessimize the generated code due to ABI
> | > | peculiarities.
> | > 
> | > As I understand it, he would like an object-based non-alias
> | > optimization, I believe restrict is a good candidate.
> | 
> | That depends.  Brad's initial example could also be used for type-based
> | non-alias optimization, for different sets of int objects - which is
> | somewhat looser than restrict allows.
> 
> I'm not sure I get what you meant.
> 
> At the type-level, one cannot distingush an int* from another int*,
> they are both int* so may alias each other. 
> 
> At the the object-level, where one considers different sets of objects
> of the *same type*, one needs an additional information not provided
> just by the type.  What did I miss in your remark?

You're entirely correct, but I think that what Brad wanted (based on
his message) was:
  typedef int kindone;
  typedef int kindtwo;

and then to have any kindtwo* pointer not alias any kindone* pointer. 
Which is not quite the same as restrict, and is not possible in C.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 22:04         ` Daniel Jacobowitz
@ 2003-05-14 22:07           ` Brad Lucier
  2003-05-14 22:11             ` Gabriel Dos Reis
  0 siblings, 1 reply; 30+ messages in thread
From: Brad Lucier @ 2003-05-14 22:07 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Gabriel Dos Reis, Andreas Schwab, Brad Lucier, gcc

> You're entirely correct, but I think that what Brad wanted (based on
> his message) was:
>   typedef int kindone;
>   typedef int kindtwo;
> 
> and then to have any kindtwo* pointer not alias any kindone* pointer. 
> Which is not quite the same as restrict, and is not possible in C.
> 
> -- 
> Daniel Jacobowitz
> MontaVista Software                         Debian GNU/Linux Developer
> 

Yes, that's what I wanted, but I'm a practical person, and I'll take what
I can get ;-).

Brad

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 22:07           ` Brad Lucier
@ 2003-05-14 22:11             ` Gabriel Dos Reis
  0 siblings, 0 replies; 30+ messages in thread
From: Gabriel Dos Reis @ 2003-05-14 22:11 UTC (permalink / raw)
  To: Brad Lucier; +Cc: Daniel Jacobowitz, Andreas Schwab, gcc

Brad Lucier <lucier@math.purdue.edu> writes:

| > You're entirely correct, but I think that what Brad wanted (based on
| > his message) was:
| >   typedef int kindone;
| >   typedef int kindtwo;
| > 
| > and then to have any kindtwo* pointer not alias any kindone* pointer. 
| > Which is not quite the same as restrict, and is not possible in C.

Yes, you're right.

| > -- 
| > Daniel Jacobowitz
| > MontaVista Software                         Debian GNU/Linux Developer
| > 
| 
| Yes, that's what I wanted, but I'm a practical person, and I'll take what
| I can get ;-).

I'm curious...

-- Gaby

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 21:41       ` Joe Buck
  2003-05-14 21:53         ` Gabriel Dos Reis
@ 2003-05-14 22:17         ` Gabriel Dos Reis
  2003-05-14 22:29           ` two-element struct performance (was: strict-aliasing and typedefs) Joe Buck
  1 sibling, 1 reply; 30+ messages in thread
From: Gabriel Dos Reis @ 2003-05-14 22:17 UTC (permalink / raw)
  To: Joe Buck; +Cc: Brad Lucier, gcc

Joe Buck <jbuck@synopsys.com> writes:

| GCC does a decent job with one-element structs; its performance
| quickly drops like a rock as soon as there are two elements.  Still,
| there may well be some loss from using it.

Would that loss of performance be related to ABI issues (in single
element case)? 

-- Gaby

^ permalink raw reply	[flat|nested] 30+ messages in thread

* two-element struct performance (was: strict-aliasing and typedefs)
  2003-05-14 22:17         ` Gabriel Dos Reis
@ 2003-05-14 22:29           ` Joe Buck
  2003-05-14 22:49             ` Gabriel Dos Reis
  2004-02-20  0:43             ` law
  0 siblings, 2 replies; 30+ messages in thread
From: Joe Buck @ 2003-05-14 22:29 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Brad Lucier, gcc

On Thu, May 15, 2003 at 12:16:56AM +0200, Gabriel Dos Reis wrote:
> Joe Buck <jbuck@synopsys.com> writes:
> 
> | GCC does a decent job with one-element structs; its performance
> | quickly drops like a rock as soon as there are two elements.  Still,
> | there may well be some loss from using it.
> 
> Would that loss of performance be related to ABI issues (in single
> element case)? 

No, it has to do with premature commitment of structs to memory.  The only
way out is to be able to do more aggressive transformations on trees, to get
rid of structs that aren't needed.  For the one-element case, there are some
special-case tricks, which give us a nice (but misleading) Stepanov abstraction
penalty score.  Every object in the Stepanov benchmark has one element.

Actually there's another way out: be able to treat aggregates as scalars.  However,
it seems this isn't really needed for many of the cases that hurt C++ performance
(e.g. iterators with more than one element).

Take a look, for example, at the code generated for the following:

--------------------------------
struct complex {
    double re, im;
    complex(double r, double i) : re(r), im(i) {}
};

inline complex operator+(const complex& a, const complex& b) {
    return complex(a.re+b.re, a.im+b.im);
}

complex addone(const complex& arg) {
    return arg + complex(1,0);
}
------------------------------

It's pretty horrific, because gcc insists on making an actual RAM object for the
temporary struct (even for tree-ssa).

From looking at the tree dumps on the tree-ssa branch, it would seem that a
decent copy propagation pass could propagate the 1.0 and 0.0 values through,
leaving the temporary struct as a dead object.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: two-element struct performance (was: strict-aliasing and typedefs)
  2003-05-14 22:29           ` two-element struct performance (was: strict-aliasing and typedefs) Joe Buck
@ 2003-05-14 22:49             ` Gabriel Dos Reis
  2003-05-14 23:06               ` Joe Buck
  2004-02-20  0:43             ` law
  1 sibling, 1 reply; 30+ messages in thread
From: Gabriel Dos Reis @ 2003-05-14 22:49 UTC (permalink / raw)
  To: Joe Buck; +Cc: Brad Lucier, gcc

Joe Buck <jbuck@synopsys.com> writes:

| On Thu, May 15, 2003 at 12:16:56AM +0200, Gabriel Dos Reis wrote:
| > Joe Buck <jbuck@synopsys.com> writes:
| > 
| > | GCC does a decent job with one-element structs; its performance
| > | quickly drops like a rock as soon as there are two elements.  Still,
| > | there may well be some loss from using it.
| > 
| > Would that loss of performance be related to ABI issues (in single
| > element case)? 
| 
| No, it has to do with premature commitment of structs to memory.

Aha, thanks.  I thought that since some ABIs seem to require that every
structure ought to be returned in memory, GCC (on those targets)
cannot optimize away the abstraction penatly in the case a structure
contains a single or two scalar(s). 

[...]

| Take a look, for example, at the code generated for the following:
| 
| --------------------------------
| struct complex {
|     double re, im;
|     complex(double r, double i) : re(r), im(i) {}
| };
| 
| inline complex operator+(const complex& a, const complex& b) {
|     return complex(a.re+b.re, a.im+b.im);
| }
| 
| complex addone(const complex& arg) {
|     return arg + complex(1,0);
| }

I believe we have a (old) bug report along that line.

[...]

| >From looking at the tree dumps on the tree-ssa branch, it would seem that a
| decent copy propagation pass could propagate the 1.0 and 0.0 values through,
| leaving the temporary struct as a dead object.

That is good news.

-- Gaby

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: two-element struct performance (was: strict-aliasing and typedefs)
  2003-05-14 22:49             ` Gabriel Dos Reis
@ 2003-05-14 23:06               ` Joe Buck
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Buck @ 2003-05-14 23:06 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Brad Lucier, gcc

On Thu, May 15, 2003 at 12:49:23AM +0200, Gabriel Dos Reis wrote:
> | > Would that loss of performance be related to ABI issues (in single
> | > element case)? 
> | 
> | No, it has to do with premature commitment of structs to memory.
> 
> Aha, thanks.  I thought that since some ABIs seem to require that every
> structure ought to be returned in memory, GCC (on those targets)
> cannot optimize away the abstraction penatly in the case a structure
> contains a single or two scalar(s). 

ABI rules don't affect what can be done with local (after inlining) structs
and classes, and that's where GCC is getting killed by, say, KAI's compiler.
 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: two-element struct performance (was: strict-aliasing and  typedefs)
  2003-05-14 22:29           ` two-element struct performance (was: strict-aliasing and typedefs) Joe Buck
  2003-05-14 22:49             ` Gabriel Dos Reis
@ 2004-02-20  0:43             ` law
  2004-02-20  9:06               ` Richard Henderson
  1 sibling, 1 reply; 30+ messages in thread
From: law @ 2004-02-20  0:43 UTC (permalink / raw)
  To: Joe Buck; +Cc: Gabriel Dos Reis, Brad Lucier, gcc

In message <20030514152914.A12355@synopsys.com>, Joe Buck writes:
 >On Thu, May 15, 2003 at 12:16:56AM +0200, Gabriel Dos Reis wrote:
 >> Joe Buck <jbuck@synopsys.com> writes:
 >> 
 >> | GCC does a decent job with one-element structs; its performance
 >> | quickly drops like a rock as soon as there are two elements.  Still,
 >> | there may well be some loss from using it.
 >> 
 >> Would that loss of performance be related to ABI issues (in single
 >> element case)? 
 >
 >No, it has to do with premature commitment of structs to memory.  The only
 >way out is to be able to do more aggressive transformations on trees, to get
 >rid of structs that aren't needed.  For the one-element case, there are some
 >special-case tricks, which give us a nice (but misleading) Stepanov
 >abstraction penalty score.  Every object in the Stepanov benchmark has one
 >element.
 >
 >Actually there's another way out: be able to treat aggregates as scalars.
 >However, it seems this isn't really needed for many of the cases that hurt
 > C++ performance (e.g. iterators with more than one element).
 >
 >Take a look, for example, at the code generated for the following:
 >
 >--------------------------------
 >struct complex {
 >    double re, im;
 >    complex(double r, double i) : re(r), im(i) {}
 >};
 >
 >inline complex operator+(const complex& a, const complex& b) {
 >    return complex(a.re+b.re, a.im+b.im);
 >}
 >
 >complex addone(const complex& arg) {
 >    return arg + complex(1,0);
 >}
 >------------------------------
 >
 >It's pretty horrific, because gcc insists on making an actual RAM object
 > for the temporary struct (even for tree-ssa).
 >
 >>From looking at the tree dumps on the tree-ssa branch, it would seem that a
 >decent copy propagation pass could propagate the 1.0 and 0.0 values through,
 >leaving the temporary struct as a dead object.
We're getting closer, but we're not quite there yet.

As of this morning we got the following at the end of DCE2:

  this<D1518>_1 = &<D1509>;
  this<D1518>_1->re = 1.0e+0;
  this<D1518>_1->im = 0.0;
  b<D1523>_6 = &<D1509>;
  T.0<D1526>_7 = arg<D1506>_4->im;
  T.1<D1527>_8 = b<D1523>_6->im;
  T.2<D1528>_9 = T.0<D1526>_7 + T.1<D1527>_8;
  T.3<D1529>_10 = arg<D1506>_4->re;
  T.4<D1530>_11 = b<D1523>_6->re;
  T.5<D1531>_12 = T.3<D1529>_10 + T.4<D1530>_11;
  this<D1533>_13 = &<D1525>;
  this<D1533>_13->re = T.5<D1531>_12;
  this<D1533>_13->im = T.2<D1528>_9;
  retval.7<D1532> = <D1525>;
  T.6<D1513> = retval.7<D1532>;
  return T.6<D1513>;


Egad.  How in the world are we supposed to get good code from that mess.
That's AWFUL.


Of particular interest is the fact that this_1 and this_13 are totally
unnecessary -- we should have const-propagated the address of D1509
and D1525 in all uses of this_1 and this_13.

It turns out there's a couple minor oversights in the dominator optimizer
which are preventing that constant propagation.  Fixing those leads to the
following code out of DCE2:

  <D1509>.re = 1.0e+0;
  <D1509>.im = 0.0;
  b<D1523>_6 = &<D1509>;
  T.0<D1526>_7 = arg<D1506>_4->im;
  T.1<D1527>_8 = b<D1523>_6->im;
  T.2<D1528>_9 = T.0<D1526>_7 + T.1<D1527>_8;
  T.3<D1529>_10 = arg<D1506>_4->re;
  T.4<D1530>_11 = b<D1523>_6->re; 
  T.5<D1531>_12 = T.3<D1529>_10 + T.4<D1530>_11;
  <D1525>.re = T.5<D1531>_12;
  <D1525>.im = T.2<D1528>_9;
  retval.7<D1532> = <D1525>;
  T.6<D1513> = retval.7<D1532>;
  return T.6<D1513>;


Which is marginally better.    It allows us to generate slightly
better code during tree->rtl conversion and avoid creating a useless
stack object.  In terms of the final assembly code we just reduced the
stack allocation for that function from 56 bytes to 40 bytes.  Nothing
to write home about, but it's a start.

However, if we look at the DCE2 output even closer, we'll see that there's
still a constant propagation opportunity we're missing.  Namely b = &<D1509>.
This is simply a matter of the dominator optimizer playing it too safe
regarding types.  Fix that results in the following code after DSE2:


  T.0<D1526>_7 = arg<D1506>_4->im;
  T.2<D1528>_9 = T.0<D1526>_7 + 0.0;
  T.3<D1529>_10 = arg<D1506>_4->re;
  T.5<D1531>_12 = T.3<D1529>_10 + 1.0e+0;
  T.6<D1513>.re = T.5<D1531>_12;
  T.6<D1513>.im = T.2<D1528>_9;
  return T.6<D1513>;



Which is looking much better -- the constants are propagated in the manner
we want and the resulting assembly code looks quite a bit nicer too.
It's also much cleaner from an aliasing standpoint, so there may be
secondary effects if this code were inlined into some other function.

Doing better would require that the tree-ssa optimizers know about ABI
details for passing parameters and return values -- if it had that knowledge
then it would know that the caller will provide a suitable memory location
for the return value and we could use it directly instead of first building
the return value in T.6.

Anyway, I'll be testing those changes today/tomorrow.  It'll be interesting to
see if/how they affect other C++ code.

jeff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: two-element struct performance (was: strict-aliasing and typedefs)
  2004-02-20  0:43             ` law
@ 2004-02-20  9:06               ` Richard Henderson
  2004-02-20 15:21                 ` law
  2004-02-24  6:19                 ` law
  0 siblings, 2 replies; 30+ messages in thread
From: Richard Henderson @ 2004-02-20  9:06 UTC (permalink / raw)
  To: law; +Cc: Joe Buck, Gabriel Dos Reis, Brad Lucier, gcc

On Thu, Feb 19, 2004 at 04:16:59PM -0700, law@redhat.com wrote:
> Doing better would require that the tree-ssa optimizers know about ABI
> details for passing parameters and return values -- if it had that knowledge
> then it would know that the caller will provide a suitable memory location
> for the return value and we could use it directly instead of first building
> the return value in T.6.

I don't know that tree-ssa knowledge of ABI details is quite necessary.

I think if we'd performed the computation into a RESULT_DECL rather than
into a VAR_DECL temporary, that would effectively do the job since we
can arrange for the RESULT_DECL to have rtl appropriate for a caller
provided memory location.


r~

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: two-element struct performance (was: strict-aliasing and  typedefs)
  2004-02-20  9:06               ` Richard Henderson
@ 2004-02-20 15:21                 ` law
  2004-02-24  6:19                 ` law
  1 sibling, 0 replies; 30+ messages in thread
From: law @ 2004-02-20 15:21 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Joe Buck, Gabriel Dos Reis, Brad Lucier, gcc

In message <20040220080315.GA25175@redhat.com>, Richard Henderson writes:
 >On Thu, Feb 19, 2004 at 04:16:59PM -0700, law@redhat.com wrote:
 >> Doing better would require that the tree-ssa optimizers know about ABI
 >> details for passing parameters and return values -- if it had that knowledg
 >e
 >> then it would know that the caller will provide a suitable memory location
 >> for the return value and we could use it directly instead of first building
 >> the return value in T.6.
 >
 >I don't know that tree-ssa knowledge of ABI details is quite necessary.
 >
 >I think if we'd performed the computation into a RESULT_DECL rather than
 >into a VAR_DECL temporary, that would effectively do the job since we
 >can arrange for the RESULT_DECL to have rtl appropriate for a caller
 >provided memory location.
Yea, I came to to roughly the same conclusion after posting that message.
In fact, I _think_ that's precisely what the named return value optimization
is supposed to do for us.  However, NRV optimizations happen too early
in the C++ front-end.  Jason and I briefly discussed moving it to a later
point some time ago.  I'll take a look at that once I've got the dominator
optimizer improvements installed (they're triggering some weirdness in
PRE).

jeff



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: two-element struct performance (was: strict-aliasing and  typedefs)
  2004-02-20  9:06               ` Richard Henderson
  2004-02-20 15:21                 ` law
@ 2004-02-24  6:19                 ` law
  2004-02-24 10:29                   ` Gabriel Dos Reis
  1 sibling, 1 reply; 30+ messages in thread
From: law @ 2004-02-24  6:19 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Joe Buck, Gabriel Dos Reis, Brad Lucier, gcc

In message <20040220080315.GA25175@redhat.com>, Richard Henderson writes:
 >On Thu, Feb 19, 2004 at 04:16:59PM -0700, law@redhat.com wrote:
 >> Doing better would require that the tree-ssa optimizers know about ABI
 >> details for passing parameters and return values -- if it had that knowledg
 >e
 >> then it would know that the caller will provide a suitable memory location
 >> for the return value and we could use it directly instead of first building
 >> the return value in T.6.
 >
 >I don't know that tree-ssa knowledge of ABI details is quite necessary.
 >
 >I think if we'd performed the computation into a RESULT_DECL rather than
 >into a VAR_DECL temporary, that would effectively do the job since we
 >can arrange for the RESULT_DECL to have rtl appropriate for a caller
 >provided memory location.
Done.  It's pretty trivial.  We now get the following:

  T.2 = arg->im + 0.0;
  T.6.re = arg->re + 1.0e+0;
  T.6.im = T.2;
  return T.6;

The only thing I see now is that TER didn't replace the use of T.2 with
its earlier assignment.  Probably due to the store between the assignment
to T.2 and the later use of T.2.

Testing in progress.

jeff


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: two-element struct performance (was: strict-aliasing and  typedefs)
  2004-02-24  6:19                 ` law
@ 2004-02-24 10:29                   ` Gabriel Dos Reis
  2004-02-24 10:44                     ` Paolo Carlini
  0 siblings, 1 reply; 30+ messages in thread
From: Gabriel Dos Reis @ 2004-02-24 10:29 UTC (permalink / raw)
  To: law; +Cc: Richard Henderson, Joe Buck, Brad Lucier, gcc

law@redhat.com writes:

| In message <20040220080315.GA25175@redhat.com>, Richard Henderson writes:
|  >On Thu, Feb 19, 2004 at 04:16:59PM -0700, law@redhat.com wrote:
|  >> Doing better would require that the tree-ssa optimizers know about ABI
|  >> details for passing parameters and return values -- if it had that knowledg
|  >e
|  >> then it would know that the caller will provide a suitable memory location
|  >> for the return value and we could use it directly instead of first building
|  >> the return value in T.6.
|  >
|  >I don't know that tree-ssa knowledge of ABI details is quite necessary.
|  >
|  >I think if we'd performed the computation into a RESULT_DECL rather than
|  >into a VAR_DECL temporary, that would effectively do the job since we
|  >can arrange for the RESULT_DECL to have rtl appropriate for a caller
|  >provided memory location.
| Done.  It's pretty trivial.  We now get the following:
| 
|   T.2 = arg->im + 0.0;
|   T.6.re = arg->re + 1.0e+0;
|   T.6.im = T.2;
|   return T.6;
| 
| The only thing I see now is that TER didn't replace the use of T.2 with
| its earlier assignment.  Probably due to the store between the assignment
| to T.2 and the later use of T.2.

Hey, pretty amazing progress, Jeff! Wow.

-- Gaby

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: two-element struct performance (was: strict-aliasing and  typedefs)
  2004-02-24 10:29                   ` Gabriel Dos Reis
@ 2004-02-24 10:44                     ` Paolo Carlini
  0 siblings, 0 replies; 30+ messages in thread
From: Paolo Carlini @ 2004-02-24 10:44 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: law, Richard Henderson, Joe Buck, Brad Lucier, gcc

Gabriel Dos Reis wrote:

>law@redhat.com writes:
>
>| Done.  It's pretty trivial.  We now get the following:
>| 
>|   T.2 = arg->im + 0.0;
>|   T.6.re = arg->re + 1.0e+0;
>|   T.6.im = T.2;
>|   return T.6;
>| 
>| The only thing I see now is that TER didn't replace the use of T.2 with
>| its earlier assignment.  Probably due to the store between the assignment
>| to T.2 and the later use of T.2.
>
>Hey, pretty amazing progress, Jeff! Wow.
>  
>
Really! My best compliments too!

Paolo.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
  2003-05-14 23:10 Robert Dewar
@ 2003-05-14 23:14 ` Joe Buck
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Buck @ 2003-05-14 23:14 UTC (permalink / raw)
  To: Robert Dewar; +Cc: gdr, gcc, lucier

On Wed, May 14, 2003 at 07:10:02PM -0400, Robert Dewar wrote:
> > Would that loss of performance be related to ABI issues (in single
> > element case)? 
> 
> Right, for instance on SPARC, you have to pass even a one elemnt struct
> by pointer (at least that's my understanding!)

While such costs are real, they are in many cases dwarfed by GCC's
inefficiency at handling internal structs (those within one function,
or passed only to inlined functions).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
@ 2003-05-14 23:10 Robert Dewar
  2003-05-14 23:14 ` Joe Buck
  0 siblings, 1 reply; 30+ messages in thread
From: Robert Dewar @ 2003-05-14 23:10 UTC (permalink / raw)
  To: gdr, jbuck; +Cc: gcc, lucier

> Would that loss of performance be related to ABI issues (in single
> element case)? 

Right, for instance on SPARC, you have to pass even a one elemnt struct
by pointer (at least that's my understanding!)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: strict-aliasing and typedefs
@ 2003-05-14 21:38 Robert Dewar
  0 siblings, 0 replies; 30+ messages in thread
From: Robert Dewar @ 2003-05-14 21:38 UTC (permalink / raw)
  To: gcc, lucier

> If I have the typedefs and variables
> 
> typedef int stackslot;
> typedef int heapslot;
> 
> stackslot *sp;
> heapslot *hp;
> 
> then can sp and hp point to the same location in memory with
> ISO C's aliasing rules?

Yes indeed. C is not Ada :-) There is only one int type, and giving it
different names does not make distinct types.

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2004-02-24 10:29 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-05-14 20:56 strict-aliasing and typedefs Brad Lucier
2003-05-14 21:07 ` Fergus Henderson
2003-05-14 21:28   ` Gabriel Dos Reis
2003-05-14 21:08 ` Gabriel Dos Reis
2003-05-14 21:21   ` Joe Buck
2003-05-14 21:30     ` Gabriel Dos Reis
2003-05-14 21:41       ` Joe Buck
2003-05-14 21:53         ` Gabriel Dos Reis
2003-05-14 22:17         ` Gabriel Dos Reis
2003-05-14 22:29           ` two-element struct performance (was: strict-aliasing and typedefs) Joe Buck
2003-05-14 22:49             ` Gabriel Dos Reis
2003-05-14 23:06               ` Joe Buck
2004-02-20  0:43             ` law
2004-02-20  9:06               ` Richard Henderson
2004-02-20 15:21                 ` law
2004-02-24  6:19                 ` law
2004-02-24 10:29                   ` Gabriel Dos Reis
2004-02-24 10:44                     ` Paolo Carlini
2003-05-14 21:21 ` strict-aliasing and typedefs Andreas Schwab
2003-05-14 21:32   ` Gabriel Dos Reis
2003-05-14 21:46     ` Brad Lucier
2003-05-14 21:56       ` Gabriel Dos Reis
2003-05-14 21:47     ` Daniel Jacobowitz
2003-05-14 21:59       ` Gabriel Dos Reis
2003-05-14 22:04         ` Daniel Jacobowitz
2003-05-14 22:07           ` Brad Lucier
2003-05-14 22:11             ` Gabriel Dos Reis
2003-05-14 21:38 Robert Dewar
2003-05-14 23:10 Robert Dewar
2003-05-14 23:14 ` Joe Buck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).