public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* EGCS bad on loop invariant optimization ?
@ 1998-04-19 10:52 Gabriel Dos Reis
  1998-04-19 15:57 ` John Carr
  1998-04-19 18:48 ` Martin von Loewis
  0 siblings, 2 replies; 11+ messages in thread
From: Gabriel Dos Reis @ 1998-04-19 10:52 UTC (permalink / raw)
  To: egcs

Hi all!

Consider 

#include <cstddef>

class array {
public:
    array (size_t i) : sz(i), data (new double[i]) {}

    double operator[] (size_t i) const { return data[i]; }
    double& operator[] (size_t i) { return data[i]; }

private:
    size_t sz;
    double* data;
};

int main ()
{
    const size_t n = 1000;

    array a(n), b(n), c(n);

    for (size_t i=0; i<n; ++i) a[i] = b[i] + c[i];

    return 0;
}

EGCS-980315 with -O6 option generates, on a SPARCstation, this (poor) code:

main:
.LLFB1:
	!#PROLOGUE# 0
	save %sp,-136,%sp
.LLCFI0:
	!#PROLOGUE# 1
	mov 1000,%l2
	mov 0,%l0
	sethi %hi(8000),%l1
	or %l1,%lo(8000),%l1
	st %l2,[%fp-24]
	call __builtin_vec_new,0
	mov %l1,%o0
	st %o0,[%fp-20]
	st %l2,[%fp-32]
	call __builtin_vec_new,0
	mov %l1,%o0
	st %o0,[%fp-28]
	st %l2,[%fp-40]
	call __builtin_vec_new,0
	mov %l1,%o0
	st %o0,[%fp-36]
	mov 0,%o3
.LL15:
	ld [%fp-28],%o0			! b.data
	sll %o3,3,%o1
	ld [%fp-36],%o2			! c.data
	add %o3,1,%o3
	ldd [%o0+%o1],%f4
	cmp %o3,999
	ldd [%o2+%o1],%f2
	faddd %f4,%f2,%f4
	ld [%fp-20],%o0			! a.data
	bleu .LL15
	std %f4,[%o0+%o1]
	ret
	restore %g0,0,%o0



Can't EGCS see that a.data, b.data, c.data are loop invariants and
then move the loads out of the loop ? At least, one can expect
b.data and c.data to be subject of this kind of optimization since the
only way to access them is through the (leaf) *const* class member
fonction `array::operator[](size_t) const'. What am I missing ? 


-- Gaby
"One reason that life is complex is that it has a 
real part and imaginary part." -- Andrew Koenig

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: EGCS bad on loop invariant optimization ?
  1998-04-19 10:52 EGCS bad on loop invariant optimization ? Gabriel Dos Reis
@ 1998-04-19 15:57 ` John Carr
  1998-04-20 21:13   ` Jeffrey A Law
  1998-04-21 13:29   ` Gabriel Dos Reis
  1998-04-19 18:48 ` Martin von Loewis
  1 sibling, 2 replies; 11+ messages in thread
From: John Carr @ 1998-04-19 15:57 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: egcs

Try this patch.

Sun Apr 19 18:07:14 1998  John Carr  <jfc@mit.edu>

	* alias.c (true_dependence): If neither mode is QI or BLK and the
	modes are different, X and MEM do not alias.

*** alias.c~	Fri Apr 17 15:54:33 1998
--- alias.c	Sun Apr 19 18:06:30 1998
***************
*** 863,878 ****
  			    SIZE_FOR_MODE (x), x_addr, 0))
      return 0;
  
    /* If both references are struct references, or both are not, nothing
       is known about aliasing.
  
-      If either reference is QImode or BLKmode, ANSI C permits aliasing.
- 
       If both addresses are constant, or both are not, nothing is known
       about aliasing.  */
    if (MEM_IN_STRUCT_P (x) == MEM_IN_STRUCT_P (mem)
-       || mem_mode == QImode || mem_mode == BLKmode
-       || GET_MODE (x) == QImode || GET_MODE (x) == BLKmode
        || GET_CODE (x_addr) == AND || GET_CODE (mem_addr) == AND
        || varies (x_addr) == varies (mem_addr))
      return 1;
--- 863,884 ----
  			    SIZE_FOR_MODE (x), x_addr, 0))
      return 0;
  
+   /* If either reference is QImode or BLKmode, ANSI C permits aliasing.
+      Otherwise, if the types are different there is no aliasing.  */
+   if (mem_mode == QImode || mem_mode == BLKmode
+       || GET_MODE (x) == QImode || GET_MODE (x) == BLKmode)
+     return 1;
+ 
+   if (mem_mode != GET_MODE (x))
+     return 0;
+ 
    /* If both references are struct references, or both are not, nothing
       is known about aliasing.
  
       If both addresses are constant, or both are not, nothing is known
       about aliasing.  */
+ 
    if (MEM_IN_STRUCT_P (x) == MEM_IN_STRUCT_P (mem)
        || GET_CODE (x_addr) == AND || GET_CODE (mem_addr) == AND
        || varies (x_addr) == varies (mem_addr))
      return 1;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: EGCS bad on loop invariant optimization ?
  1998-04-19 10:52 EGCS bad on loop invariant optimization ? Gabriel Dos Reis
  1998-04-19 15:57 ` John Carr
@ 1998-04-19 18:48 ` Martin von Loewis
  1998-04-21 17:06   ` Gabriel Dos Reis
  1 sibling, 1 reply; 11+ messages in thread
From: Martin von Loewis @ 1998-04-19 18:48 UTC (permalink / raw)
  To: Gabriel.Dos-Reis; +Cc: egcs

> Consider 
> 
> #include <cstddef>
> 
> class array {
> public:
>     array (size_t i) : sz(i), data (new double[i]) {}
> 
>     double operator[] (size_t i) const { return data[i]; }
>     double& operator[] (size_t i) { return data[i]; }
> 
> private:
>     size_t sz;
>     double* data;
> };
> 
> int main ()
> {
>     const size_t n = 1000;
> 
>     array a(n), b(n), c(n);
> 
>     for (size_t i=0; i<n; ++i) a[i] = b[i] + c[i];
> 
>     return 0;
> }
> 

...

> Can't EGCS see that a.data, b.data, c.data are loop invariants and
> then move the loads out of the loop ? At least, one can expect
> b.data and c.data to be subject of this kind of optimization since the
> only way to access them is through the (leaf) *const* class member
> fonction `array::operator[](size_t) const'. What am I missing ? 

I believe this would depend on how ::operator new is implemented.  It
*could* return pointers into the stack, so that the writes to a[i]
overwrite the contents of b and c, so that reloading b.date and c.data
is necessary.

Of course, ::operator new never does such things. g++ doesn't now
that, though, and assumes worst-case.

Regards,
Martin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: EGCS bad on loop invariant optimization ?
  1998-04-19 15:57 ` John Carr
@ 1998-04-20 21:13   ` Jeffrey A Law
  1998-04-21  7:36     ` Gabriel Dos Reis
  1998-04-21 13:29   ` Gabriel Dos Reis
  1 sibling, 1 reply; 11+ messages in thread
From: Jeffrey A Law @ 1998-04-20 21:13 UTC (permalink / raw)
  To: John Carr; +Cc: Gabriel Dos Reis, egcs, wilson

  In message < 199804192208.SAA19223@jfc. >you write:
  > 
  > Try this patch.
  > 
  > Sun Apr 19 18:07:14 1998  John Carr  <jfc@mit.edu>
  > 
  > 	* alias.c (true_dependence): If neither mode is QI or BLK and the
  > 	modes are different, X and MEM do not alias.
Looks correct to me.

One day it would be nice to distinguish items which have the same
size, but are different types (ie pointers vs int/log, but we need
help from the front end, or a first-class Pmode for that).

jeff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: EGCS bad on loop invariant optimization ?
  1998-04-20 21:13   ` Jeffrey A Law
@ 1998-04-21  7:36     ` Gabriel Dos Reis
  0 siblings, 0 replies; 11+ messages in thread
From: Gabriel Dos Reis @ 1998-04-21  7:36 UTC (permalink / raw)
  To: law; +Cc: John Carr, Gabriel Dos Reis, egcs, wilson

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 497 bytes --]

>>>>> «Jeff», Jeffrey A Law <law@hurl.cygnus.com> wrote:

Jeff>   In message < 199804192208.SAA19223@jfc. >you write:
>> 
>> Try this patch.
>> 
>> Sun Apr 19 18:07:14 1998  John Carr  <jfc@mit.edu>
>> 
>> * alias.c (true_dependence): If neither mode is QI or BLK and the
>> modes are different, X and MEM do not alias.
Jeff> Looks correct to me.

I can't get egcs-19989418 compiled (on solaris-2.5.x) due to several
parse errors. As soon as it compiles it'll let know the results.

-- Gaby

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: EGCS bad on loop invariant optimization ?
  1998-04-19 15:57 ` John Carr
  1998-04-20 21:13   ` Jeffrey A Law
@ 1998-04-21 13:29   ` Gabriel Dos Reis
  1 sibling, 0 replies; 11+ messages in thread
From: Gabriel Dos Reis @ 1998-04-21 13:29 UTC (permalink / raw)
  To: John Carr; +Cc: Gabriel Dos Reis, egcs

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 115 bytes --]

>>>>> «John», John Carr <jfc@mit.edu> wrote:

John> Try this patch.

It works for me. Thanks a lot.

-- Gaby

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: EGCS bad on loop invariant optimization ?
  1998-04-19 18:48 ` Martin von Loewis
@ 1998-04-21 17:06   ` Gabriel Dos Reis
  1998-04-23  6:57     ` Martin von Loewis
  0 siblings, 1 reply; 11+ messages in thread
From: Gabriel Dos Reis @ 1998-04-21 17:06 UTC (permalink / raw)
  To: Martin von Loewis; +Cc: Gabriel.Dos-Reis, egcs

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 1955 bytes --]

>>>>> «Martin», Martin von Loewis <martin@mira.isdn.cs.tu-berlin.de> wrote:

[code snipped]

>> Can't EGCS see that a.data, b.data, c.data are loop invariants and
>> then move the loads out of the loop ? At least, one can expect
>> b.data and c.data to be subject of this kind of optimization since the
>> only way to access them is through the (leaf) *const* class member
>> fonction `array::operator[](size_t) const'. What am I missing ? 

Martin> I believe this would depend on how ::operator new is implemented.  It
Martin> *could* return pointers into the stack, so that the writes to a[i]
Martin> overwrite the contents of b and c, so that reloading b.date and c.data
Martin> is necessary.

Martin> Of course, ::operator new never does such things. g++ doesn't now
Martin> that, though, and assumes worst-case.

Hmm. I can't get the point. Let me show my ignorance. Consider the
following code:

inline double& nth_element(double* data, unsigned i)
{ return data[i]; }

inline double nth_element(const double* data, unsigned i)
{ return data[i]; }


int main()
{
    const unsigned n = 1000;
    double* a = new double[n];
    double* b = new double[n];
    double* c = new double[n];

    for (unsigned i=0; i<n; ++i)
        nth_element(a, i) = nth_element(static_cast<const double*>(b), i)
            + nth_element(static_cast<const double*>(c), i);

    return 0;
}
    

EGCS-980325 with -O option outputs:

main:
.LLFB1:
	!#PROLOGUE# 0
	save %sp,-112,%sp
.LLCFI0:
	!#PROLOGUE# 1
	sethi %hi(8000),%l0
	call __builtin_vec_new,0
	or %l0,%lo(8000),%o0
	mov %o0,%l1
	call __builtin_vec_new,0
	or %l0,%lo(8000),%o0
	mov %o0,%i0
	call __builtin_vec_new,0
	or %l0,%lo(8000),%o0
	mov %o0,%o2
	mov 0,%o1
.LL9:
	sll %o1,3,%o0
	ldd [%i0+%o0],%f2
	ldd [%o2+%o0],%f4
	faddd %f2,%f4,%f2
	add %o1,1,%o1
	cmp %o1,999
	bleu .LL9
	std %f2,[%l1+%o0]
	ret
	restore %g0,0,%o0


As you can see, there is no store/load of a, b and c inside the loop. 

-- Gaby

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: EGCS bad on loop invariant optimization ?
  1998-04-21 17:06   ` Gabriel Dos Reis
@ 1998-04-23  6:57     ` Martin von Loewis
  1998-04-24 10:11       ` Gabriel Dos Reis
  0 siblings, 1 reply; 11+ messages in thread
From: Martin von Loewis @ 1998-04-23  6:57 UTC (permalink / raw)
  To: Gabriel.Dos-Reis; +Cc: egcs

> Hmm. I can't get the point. Let me show my ignorance. Consider the
> following code:
[snip]
> As you can see, there is no store/load of a, b and c inside the loop. 

First I found this quite convincing. However, you'll have to keep the
C++ storage model in mind. In order to modify an object through a
pointer, you need to take the address of the object. In you second
example, g++ knows that the address of a, b, and c is never taken, so
it knows that no store operation might modify them (at least not in a
compliant program, non-compliant ones might exhibit unexpected
results).

So I modified your last test case to read

inline double& nth_element(double* data, unsigned i)
{ return data[i]; }

inline double nth_element(const double* data, unsigned i)
{ return data[i]; }


int main()
{
    const unsigned n = 1000;
    double* a = new double[n];
    double* b = new double[n];
    double* c = new double[n];

    extern void f(double **);
    f(&a);
    f(&b);
    f(&c);

    for (unsigned i=0; i<n; ++i)
        nth_element(a, i) = nth_element(static_cast<const double*>(b), i)
            + nth_element(static_cast<const double*>(c), i);

    return 0;
}

That way, g++ has no way of knowing what ::f does with the pointers
it gets. In turn, it generates for the loop:

.LL6:
	cmp %o3,999
	bgu .LL7
	ld [%fp-24],%o1
	ld [%fp-28],%o2
	sll %o3,3,%o0
	ldd [%o1+%o0],%f4
	ldd [%o2+%o0],%f2
	faddd %f4,%f2,%f4
	ld [%fp-20],%o1
	add %o3,1,%o3
	b .LL6
	std %f4,[%o1+%o0]

As you can see, it generates these superfluous (sp? sem?) loads. Now,
in C++, the address of almost every object is taken, as it is an
implicit parameter of non-static member functions. So in your first
case, it doesn't know that *this is a loop invariant for a,b, or c.

Regards,
Martin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: EGCS bad on loop invariant optimization ?
  1998-04-23  6:57     ` Martin von Loewis
@ 1998-04-24 10:11       ` Gabriel Dos Reis
  1998-04-24 16:23         ` Martin von Loewis
  0 siblings, 1 reply; 11+ messages in thread
From: Gabriel Dos Reis @ 1998-04-24 10:11 UTC (permalink / raw)
  To: Martin von Loewis; +Cc: Gabriel.Dos-Reis, egcs

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 2669 bytes --]

>>>>> «Martin», Martin von Loewis <martin@mira.isdn.cs.tu-berlin.de> wrote:


Martin> First I found this quite convincing. However, you'll have to keep the
Martin> C++ storage model in mind. In order to modify an object through a
Martin> pointer, you need to take the address of the object. In you second
Martin> example, g++ knows that the address of a, b, and c is never taken, so
Martin> it knows that no store operation might modify them (at least not in a
Martin> compliant program, non-compliant ones might exhibit unexpected
Martin> results).

Quite right.

Martin> So I modified your last test case to read

Martin> inline double& nth_element(double* data, unsigned i)
Martin> { return data[i]; }

Martin> inline double nth_element(const double* data, unsigned i)
Martin> { return data[i]; }


Martin> int main()
Martin> {
Martin>     const unsigned n = 1000;
Martin>     double* a = new double[n];
Martin>     double* b = new double[n];
Martin>     double* c = new double[n];

Martin>     extern void f(double **);
Martin>     f(&a);
Martin>     f(&b);
Martin>     f(&c);

Martin>     for (unsigned i=0; i<n; ++i)
Martin>         nth_element(a, i) = nth_element(static_cast<const double*>(b), i)
Martin>             + nth_element(static_cast<const double*>(c), i);

Martin>     return 0;
Martin> }

Martin> That way, g++ has no way of knowing what ::f does with the pointers
Martin> it gets. In turn, it generates for the loop:

Martin> .LL6:
Martin> 	cmp %o3,999
Martin> 	bgu .LL7
Martin> 	ld [%fp-24],%o1
Martin> 	ld [%fp-28],%o2
Martin> 	sll %o3,3,%o0
Martin> 	ldd [%o1+%o0],%f4
Martin> 	ldd [%o2+%o0],%f2
Martin> 	faddd %f4,%f2,%f4
Martin> 	ld [%fp-20],%o1
Martin> 	add %o3,1,%o3
Martin> 	b .LL6
Martin> 	std %f4,[%o1+%o0]

Martin> As you can see, it generates these superfluous (sp? sem?) loads. Now,
Martin> in C++, the address of almost every object is taken, as it is an
Martin> implicit parameter of non-static member functions. So in your first
Martin> case, it doesn't know that *this is a loop invariant for a,b, or c.

Yep. But, we (and g++ as well) know that the execution of f(&c); is
completed before entering the for-loop which doesn't included any
other function call whose sided effect is unknown. So my point is: in
the for-loop a, b, and c are loop invariants (because, function calls
involved are inlined, and the arguments aren't modified); how the
semantic could be affected if the loads of a, b, and c were moved out of
the loop ? 

What about this proposal: members functions declared *const*,
*inlined* (i.e. g++ does see their definitions) which are *leaf* 
functions should be subject to code-motion optimization ?


-- Gaby

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: EGCS bad on loop invariant optimization ?
  1998-04-24 10:11       ` Gabriel Dos Reis
@ 1998-04-24 16:23         ` Martin von Loewis
  1998-04-25  9:37           ` Gabriel Dos Reis
  0 siblings, 1 reply; 11+ messages in thread
From: Martin von Loewis @ 1998-04-24 16:23 UTC (permalink / raw)
  To: Gabriel.Dos-Reis; +Cc: egcs

> What about this proposal: members functions declared *const*,
> *inlined* (i.e. g++ does see their definitions) which are *leaf* 
> functions should be subject to code-motion optimization ?

I don't know whether this is a good proposal or not; it seems to me
that constness of the data is also relevant. If you declare b and c
const in the class object example, g++ will know they won't be
modified through their addresses. Unfortunately, declaring just the
the pointer const won't help.

The real problem is in the implementation: I believe it is the
TREE_ADDRESSABLE flag which tells g++ whether something can live in a
register or not. Since this flag is either on or off for a given
variable, g++ doesn't know the concept 'modification through a pointer
is not possible in certain blocks'.

I have really no clue whether g++ can be changed easily to optimize
your example in a way that you like. So you either have to try
modifying it yourself, or collect convincing arguments that this is a
frequent case and absolutely deserves optimization.

Regards,
Martin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: EGCS bad on loop invariant optimization ?
  1998-04-24 16:23         ` Martin von Loewis
@ 1998-04-25  9:37           ` Gabriel Dos Reis
  0 siblings, 0 replies; 11+ messages in thread
From: Gabriel Dos Reis @ 1998-04-25  9:37 UTC (permalink / raw)
  To: Martin von Loewis; +Cc: Gabriel.Dos-Reis, egcs

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 2492 bytes --]

>>>>> «Martin», Martin von Loewis <martin@mira.isdn.cs.tu-berlin.de> wrote:

>> What about this proposal: members functions declared *const*,
>> *inlined* (i.e. g++ does see their definitions) which are *leaf* 
>> functions should be subject to code-motion optimization ?

Martin> I don't know whether this is a good proposal or not; it seems to me
Martin> that constness of the data is also relevant. If you declare b and c
Martin> const in the class object example, g++ will know they won't be
Martin> modified through their addresses. Unfortunately, declaring just the
Martin> the pointer const won't help.


Even declaring data to be const changes nothing (without John's patch).
BTW, I find this restriction unnatural.

Martin> The real problem is in the implementation: I believe it is the
Martin> TREE_ADDRESSABLE flag which tells g++ whether something can live in a
Martin> register or not. Since this flag is either on or off for a given
Martin> variable, g++ doesn't know the concept 'modification through a pointer
Martin> is not possible in certain blocks'.

Seems like you are right: it appears to me (through  analysis of code
g++ generates) that g++ doesn't know (or don't take into account) the
concept of constness in C++, apart the parsing stage. IMHO, this
can do nothing but to lead to poor code generation.


Martin> I have really no clue whether g++ can be changed easily to optimize
Martin> your example in a way that you like. So you either have to try
Martin> modifying it yourself, 

Unfortunately, I'm not a compiler designer expert, I implement
library. Probably Mark Mitchell or Jason Merrill would like to take a
look at this (crucial) problem.

Martin> or collect convincing arguments that this is a
Martin> frequent case and absolutely deserves optimization.

In fact this kind of optimization is a prerequisite for an effectively
use of C++ in scientific computing. Actually, without this
optimization *any* attempt to implement valarray<> cannot be
convincing (compared to equivalent code written in C). There are numerous
algorithms in Linear Algebra which basically consist of 


	for (size_t i=0; i<N; ++i)
		a[i] = arithmetic expressions involving b[i], c[i]...

where a, b, c ... are valarray<>s or classes providing array
fonctionnalities. Last but not least, this optimization is needed to
support expression template (which I used in my valarray<>
implementation).  


I hope some of you will put this problem on their hot TODO lists.

-- Gaby

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~1998-04-25  9:37 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1998-04-19 10:52 EGCS bad on loop invariant optimization ? Gabriel Dos Reis
1998-04-19 15:57 ` John Carr
1998-04-20 21:13   ` Jeffrey A Law
1998-04-21  7:36     ` Gabriel Dos Reis
1998-04-21 13:29   ` Gabriel Dos Reis
1998-04-19 18:48 ` Martin von Loewis
1998-04-21 17:06   ` Gabriel Dos Reis
1998-04-23  6:57     ` Martin von Loewis
1998-04-24 10:11       ` Gabriel Dos Reis
1998-04-24 16:23         ` Martin von Loewis
1998-04-25  9:37           ` Gabriel Dos Reis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).