public inbox for gcc@gcc.gnu.org
* RE: Finally a main line version that passes LAPACK's test suite ...
@ 1999-03-01 21:45 Billinghurst, David (RTD)
  1999-03-31 23:46 ` Billinghurst, David (RTD)
  0 siblings, 1 reply; 26+ messages in thread
From: Billinghurst, David (RTD) @ 1999-03-01 21:45 UTC (permalink / raw)
  To: egcs

Unfortunately not for Irix 6.2 with egcs-19980228 snapshot.
Still get single precision complex failures in cspr.f at -O2


* RE: Finally a main line version that passes LAPACK's test suite ...
  1999-03-01 21:45 Finally a main line version that passes LAPACK's test suite Billinghurst, David (RTD)
@ 1999-03-31 23:46 ` Billinghurst, David (RTD)
  0 siblings, 0 replies; 26+ messages in thread
From: Billinghurst, David (RTD) @ 1999-03-31 23:46 UTC (permalink / raw)
  To: egcs

Unfortunately not for Irix 6.2 with egcs-19980228 snapshot.
Still get single precision complex failures in cspr.f at -O2



* Re: Finally a main line version that passes LAPACK's test suite ...
  1999-02-26 12:09           ` Mark Mitchell
       [not found]             ` < 199902262012.MAA07375@adsl-206-170-148-33.dsl.pacbell.net >
@ 1999-02-28 22:53             ` Mark Mitchell
  1 sibling, 0 replies; 26+ messages in thread
From: Mark Mitchell @ 1999-02-28 22:53 UTC (permalink / raw)
  To: law; +Cc: craig, toon, egcs

The most likely cause for the stack-frame expansion is this change, by
yours truly.

  Mon Feb 22 13:33:47 1999  Mark Mitchell  <mark@markmitchell.com>

	* cse.c (dump_class): New function.
	(invalidate_memory): Fix typo in comment.
	* function.c (temp_slot): Add an alias set field.  
	(assign_stack_temp): Only reuse slots if they will have the
	same alias set as before.
	(combine_temp_slots): Don't combine if -fstrict-aliasing;
	that's unsafe.
	* rtl.c (copy_rtx): Copy all the flags (in particular,
	MEM_SCALAR_P).

If -fno-strict-aliasing makes the expansion go away, that will confirm
this hypothesis.
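
To illustrate why such a rule can enlarge a frame, here is a toy model of a temporary-slot pool. It is only a sketch and not the code in function.c: `struct slot`, `struct frame`, `get_slot`, `free_slot`, `two_temps`, and the alias-set numbers are all hypothetical. A free slot is reused only if its size matches and, in "strict" mode, its alias set matches too.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: not GCC's function.c.  All names are hypothetical. */
#define MAX_SLOTS 16

struct slot {
    size_t size;      /* bytes reserved in the frame */
    int alias_set;    /* alias set the slot was first used with */
    int in_use;       /* nonzero while a temporary lives in it */
};

struct frame {
    struct slot slots[MAX_SLOTS];
    int nslots;
    size_t bytes;     /* total frame bytes reserved for temporaries */
    int strict;       /* nonzero: reuse requires a matching alias set */
};

/* Return a slot index for a temporary, reusing a free slot when allowed,
   or -1 if the toy pool is full. */
int get_slot(struct frame *f, size_t size, int alias_set)
{
    for (int i = 0; i < f->nslots; i++) {
        struct slot *s = &f->slots[i];
        if (!s->in_use && s->size == size
            && (!f->strict || s->alias_set == alias_set)) {
            s->in_use = 1;
            return i;
        }
    }
    if (f->nslots == MAX_SLOTS)
        return -1;
    f->slots[f->nslots] = (struct slot){ size, alias_set, 1 };
    f->bytes += size;
    return f->nslots++;
}

/* Mark a slot as dead so that a later request may reuse it. */
void free_slot(struct frame *f, int i)
{
    assert(i >= 0 && i < f->nslots);
    f->slots[i].in_use = 0;
}

/* Frame bytes needed for two 8-byte temporaries with disjoint
   lifetimes but different alias sets. */
size_t two_temps(int strict)
{
    struct frame f = { .strict = strict };
    int a = get_slot(&f, 8, 1);
    free_slot(&f, a);
    (void)get_slot(&f, 8, 2);
    return f.bytes;
}
```

In this model `two_temps(0)` needs 8 bytes and `two_temps(1)` needs 16, because the second temporary cannot take over the first one's slot once alias sets must match.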

There are a few ways to fix this problem:

  o Don't use -fstrict-aliasing.  This might make sense if type-based
    alias analysis is not useful in Fortran.  In the long run, we plan 
    to use alias sets for other things as well, like temporary
    variables, so this might not be a great long-term solution.

  o Teach g77 about alias sets.  (It doesn't use them now, right?)
    You can see c_get_alias_set for the kind of thing you might
    want to do.  I don't remember enough about recent Fortran (F90)
    to know if this makes sense.

  o Enhance the stack-slot allocating code.  This is the best plan,
    but hardest.  We need a way to tell the compiler that a stack
    slot is dead, and being reused.

You can search the archives for a few days prior to the 2/22 to find
some discussion of the problem that prompted my patch.

-- 
Mark Mitchell 			mark@markmitchell.com
Mark Mitchell Consulting	http://www.markmitchell.com


* Re: Finally a main line version that passes LAPACK's test suite ...
  1999-02-28 13:43   ` Mark Mitchell
@ 1999-02-28 22:53     ` Mark Mitchell
  0 siblings, 0 replies; 26+ messages in thread
From: Mark Mitchell @ 1999-02-28 22:53 UTC (permalink / raw)
  To: toon; +Cc: egcs

>>>>> "Toon" == Toon Moene <toon@moene.indiv.nluug.nl> writes:

    Toon> Each of these subroutines has several megabytes of local
    Toon> (stack based) storage.  So at the end of DYN, several
    Toon> megabytes of stack get "freed", and subsequently RADIA
    Toon> "allocates" several megabytes of stack, which - given a
    Toon> reasonable underlying OS, would mean: The same megabytes.

Thanks.  I get what you're saying now.

    Toon> AFAICS, your alias sets do not do anything across function
    Toon> boundaries (i.e. they work for a function at a time).  Note
    Toon> that inlining doesn't play a role here.

    Toon> So why does -fno-strict-aliasing have any effect ?

I don't have the foggiest.  I'm afraid you'll have to resort to the
debugger!  I would have guessed inlining, but if you say there's none
going on, I'm really at a loss.

I'm sorry not to be able to be of more help!  I suggest trying to
verify that each subroutine is freeing all of the stack space it
allocates, and finding which one is using so much more stack than
before.  Then, perhaps, it will be simpler to see what's going on.

-- 
Mark Mitchell 			mark@markmitchell.com
Mark Mitchell Consulting	http://www.markmitchell.com


* Re: Finally a main line version that passes LAPACK's test suite ...
  1999-02-26 12:21                   ` Mark Mitchell
       [not found]                     ` < 199902262024.MAA07438@adsl-206-170-148-33.dsl.pacbell.net >
@ 1999-02-28 22:53                     ` Mark Mitchell
  1 sibling, 0 replies; 26+ messages in thread
From: Mark Mitchell @ 1999-02-28 22:53 UTC (permalink / raw)
  To: law; +Cc: craig, toon, egcs

>>>>> "Jeffrey" == Jeffrey A Law <law@hurl.cygnus.com> writes:

    Jeffrey> Another thought -- for arrays on the stack, maybe we
    Jeffrey> could set the alias set based on the underlying type
    Jeffrey> instead of the array itself.  That way arrays with
    Jeffrey> different lifetimes, but the same underlying type could
    Jeffrey> share stack slots.

I bet, although I have not yet confirmed this, that we are already
set up to do this in C:

  else if (TREE_CODE (type) == ARRAY_TYPE)
    /* Anything that can alias one of the array elements can alias
       the entire array as well.  */
    TYPE_ALIAS_SET (type) = c_get_alias_set (TREE_TYPE (type));

But, you're right; we won't currently combine or split a stack slot that
is an array to fit a new request, so only two arrays of the same
size will get overlaid.  You're right that it would be safe to do
this.

One thing I plan to do RSN is enhance c_get_alias_set to DTRT
w.r.t. structs/unions.  We have the subset machinery already, and now
that alias sets seem to be working in a pretty stable fashion, I think
this is the right time to try this.  That will (as a side-effect)
allow us to reuse stack-slots when they are allocated to structure
types.  (Right now, we can't because all structures have alias set
zero.)  This will probably make a reasonable improvement on the bloat
induced by my change for C/C++, but, of course will not affect
Fortran.

-- 
Mark Mitchell 			mark@markmitchell.com
Mark Mitchell Consulting	http://www.markmitchell.com


* Re: Finally a main line version that passes LAPACK's test suite ...
  1999-02-28 11:14 Toon Moene
       [not found] ` < 36D994C0.479B0657@moene.indiv.nluug.nl >
@ 1999-02-28 22:53 ` Toon Moene
  1 sibling, 0 replies; 26+ messages in thread
From: Toon Moene @ 1999-02-28 22:53 UTC (permalink / raw)
  To: Mark Mitchell, egcs

Mark,

You wrote:

> It is not presently safe to reuse the slot for `s' for `t' since the
> rest of the compiler will then assume that `s' does not alias `t'.
> This is unsafe even if the compiler assigns alias set zero to both `s'
> and `t', because other uses of the stack slot may assign alias sets to
> individual pieces of `s' or `t'; for example adding in:
     
>   int* g(struct S* sp) { return &sp->i; }
     
> and then modifying the first block of `f' to look like:
     
>    {
>      struct S s;
>      *g(&s) = 3;
>   }
     
> would give a piece of the stack slot the alias set for `int'.
   
>     Toon> *We* certainly do not have close to 2 million temporary
>     Toon> single precision variables !  The whole model state is just
>     Toon> 5 variables x 110 x 100 x 31 x 4 bytes ~ 7 Mbyte ...
    
> Isn't that exactly the blowup you're seeing?

Yeah, but the point I try to get across is that we're not reusing stack
space (temporary arrays) *within* SUBROUTINEs.  g77 is not able to
generate array temporaries, and it won't generate scalar temporaries
that alias stack slots assigned to arrays.  It also doesn't generate
pointers to "sections" of arrays - this is all Fortran 90 stuff that g77
doesn't do yet.  So I'm still lost as to how your change can bring this
about.

To put an end to this speculation, our code looks like:

       PROGRAM HIRLAM
       DO I = 1, NSTEP
       CALL DYN		! Newton's second law, etc.
       CALL RADIA	! Radiative transfer
       CALL CONVEC	! Convective processes
       CALL CONDENS	! Large scale condensation
       CALL SURF	! (momentum, heat) fluxes at surface
       CALL VDIFF	! Vertical (turbulent) exchange of energy
       ENDDO
       END

Each of these subroutines has several megabytes of local (stack based)
storage.  So at the end of DYN, several megabytes of stack get "freed",
and subsequently RADIA "allocates" several megabytes of stack, which -
given a reasonable underlying OS, would mean:  The same megabytes.

AFAICS, your alias sets do not do anything across function boundaries
(i.e. they work for a function at a time).  Note that inlining doesn't
play a role here.

So why does -fno-strict-aliasing have any effect ?

Cheers,
 
-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
g77 Support: fortran@gnu.org; egcs: egcs-bugs@cygnus.com


* Re: Finally a main line version that passes LAPACK's test suite ...
  1999-02-26 12:13               ` Jeffrey A Law
       [not found]                 ` < 24783.920059954@hurl.cygnus.com >
@ 1999-02-28 22:53                 ` Jeffrey A Law
  1 sibling, 0 replies; 26+ messages in thread
From: Jeffrey A Law @ 1999-02-28 22:53 UTC (permalink / raw)
  To: mark; +Cc: craig, toon, egcs

  In message < 199902262012.MAA07375@adsl-206-170-148-33.dsl.pacbell.net >you write:
  > If -fno-strict-aliasing makes the expansion go away, that will confirm
  > this hypothesis.
Doh.  Forgot about that change.

  > There are a few ways to fix this problem:
  > 
  >   o Don't use -fstrict-aliasing.  This might make sense if type-based
  >     alias analysis is not useful in Fortran.  In the long run, we plan 
  >     to use alias sets for other things as well, like temporary
  >     variables, so this might not be a great long-term solution.
  > 
  >   o Teach g77 about alias sets.  (It doesn't use them now, right?)
  >     You can see c_get_alias_set for the kind of thing you might
  >     want to do.  I don't remember enough about recent Fortran (F90)
  >     to know if this makes sense.
  > 
  >   o Enhance the stack-slot allocating code.  This is the best plan,
  >     but hardest.  We need a way to tell the compiler that a stack
  >     slot is dead, and being reused.
  > 
  > You can search the archives for a few days prior to the 2/22 to find
  > some discussion of the problem that prompted my patch.
Another thought -- for arrays on the stack, maybe we could set the alias
set based on the underlying type instead of the array itself.  That way
arrays with different lifetimes, but the same underlying type could share
stack slots.

I don't know how feasible this really is, just an idea that popped into 
my head.

jeff


* Re: Finally a main line version that passes LAPACK's test suite ...
  1999-02-26  5:40   ` craig
       [not found]     ` < 19990226133449.22215.qmail@deer >
@ 1999-02-28 22:53     ` craig
  1 sibling, 0 replies; 26+ messages in thread
From: craig @ 1999-02-28 22:53 UTC (permalink / raw)
  To: toon; +Cc: craig

>A further disturbing feature of the 19990225 snapshot is that the
>forecast part of the run took 54 Megabytes of RSS (as reported by ps and
>top) in stark contrast with 43 Megabytes of the 19990127 snapshot.
>
>Note that this is *Resident* size, so we're not talking about unused
>virtual memory here, but memory that's really *used*.
>
>Unfortunately, due to the long time between the two successful runs,
>it's hard to pinpoint the change that brought this about.

Well, I noticed some "interesting" expansions in the stack-frame
sizes of compiled codes (from my private test suite), and the only
thing offhand I could see that might be responsible was:

Thu Feb 18 18:47:09 1999  Jeffrey A Law  (law@cygnus.com)

	* function.c (assign_stack_temp_for_type): Round SIZE before calling
	assign_stack_local for BLKmode slots.

But I haven't looked into whether that's the culprit, or even what
the patch *is*, yet, and probably won't have time.

Anyway, it seemed offhand as though COMPLEX variables were the ones
getting much more space allocated to them on the stack (at least).

        tq vm, (burley)


* Re: Finally a main line version that passes LAPACK's test suite ...
  1999-02-26 11:43       ` Jeffrey A Law
       [not found]         ` < 24635.920058096@hurl.cygnus.com >
@ 1999-02-28 22:53         ` Jeffrey A Law
  1 sibling, 0 replies; 26+ messages in thread
From: Jeffrey A Law @ 1999-02-28 22:53 UTC (permalink / raw)
  To: craig; +Cc: toon, egcs

  In message < 19990226133449.22215.qmail@deer >you write:
  > >A further disturbing feature of the 19990225 snapshot is that the
  > >forecast part of the run took 54 Megabytes of RSS (as reported by ps and
  > >top) in stark contrast with 43 Megabytes of the 19990127 snapshot.
  > >
  > >Note that this is *Resident* size, so we're not talking about unused
  > >virtual memory here, but memory that's really *used*.
  > >
  > >Unfortunately, due to the long time between the two successful runs,
  > >it's hard to pinpoint the change that brought this about.
  > 
  > Well, I noticed some "interesting" expansions in the stack-frame
  > sizes of compiled codes (from my private test suite), and the only
  > thing offhand I could see that might be responsible was:
  > 
  > Thu Feb 18 18:47:09 1999  Jeffrey A Law  (law@cygnus.com)
  > 
  > 	* function.c (assign_stack_temp_for_type): Round SIZE before calling
  > 	assign_stack_local for BLKmode slots.
  > 
  > But I haven't looked into whether that's the culprit, or even what
  > the patch *is*, yet, and probably won't have time.
  > 
  > Anyway, it seemed offhand as though COMPLEX variables were the ones
  > getting much more space allocated to them on the stack (at least).
This should have only put back previous behavior.  I've got a tweak from
John to that code, but I doubt either would be responsible for this kind of
increase in RSS.

The only way (of course) to find out is to debug the problem.

jeff


* Re: Finally a main line version that passes LAPACK's test suite ...
  1999-02-28  9:06   ` Mark Mitchell
@ 1999-02-28 22:53     ` Mark Mitchell
  0 siblings, 0 replies; 26+ messages in thread
From: Mark Mitchell @ 1999-02-28 22:53 UTC (permalink / raw)
  To: toon; +Cc: egcs

>>>>> "Toon" == Toon Moene <toon@moene.indiv.nluug.nl> writes:

    Toon> Unfortunately, I still do not understand this.  I thought
    Toon> that your alias-setting of stack slots was meant to prevent
    Toon> overwriting a stack slot still in use (because of aliasing
    Toon> between temporary variables) ?

Right.  The problem is that if we don't know what alias set to use for
a stack-slot, we must assume that we cannot reuse it.  Consider, in C,
something like:

   struct S {
     int i;
   };

   struct T {
     long l;
   };

   void f() { 
     {
       struct S s;
     }
     {
       struct T t;
     }
   }

It is not presently safe to reuse the slot for `s' for `t' since the
rest of the compiler will then assume that `s' does not alias `t'.
This is unsafe even if the compiler assigns alias set zero to both `s'
and `t', because other uses of the stack slot may assign alias sets to
individual pieces of `s' or `t'; for example adding in:

  int* g(struct S* sp) { return &sp->i; }

and then modifying the first block of `f' to look like:

   {
     struct S s;
     *g(&s) = 3;
   }

would give a piece of the stack slot the alias set for `int'.
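
For reference, a complete, compilable rendering of this example might look like the following. The `seen` variable, the stored values, and the return value of `f` are additions for illustration only; whether a particular compiler overlays `s` and `t` is not something this sketch determines.

```c
#include <assert.h>

struct S { int i; };
struct T { long l; };

/* Hands out a pointer to a member, so the store in f is an `int' access. */
int *g(struct S *sp) { return &sp->i; }

long f(void)
{
    int seen;
    {
        struct S s;
        *g(&s) = 3;       /* writes part of s through an int pointer */
        seen = s.i;
    }
    {
        struct T t;       /* disjoint lifetime: a candidate for s's slot */
        t.l = 4;
        assert(seen == 3);
        return seen + t.l;
    }
}
```

The store through `g(&s)` is the access that gives part of the slot the alias set for `int', which is why assigning alias set zero to the structs alone does not make slot sharing safe.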

    Toon> *We* certainly do not have close to 2 million temporary
    Toon> single precision variables !  The whole model state is just
    Toon> 5 variables x 110 x 100 x 31 x 4 bytes ~ 7 Mbyte ...

Isn't that exactly the blowup you're seeing?

    Toon> If the compiler can't see that two items might overlap, they
    Toon> shouldn't (or else ....)  Types do not play a role in the
    Toon> aliasing rules.

There are two things you can do then:

  o Give every separate variable a different alias set.  This will
    tell the compiler that none of them overlap, which could improve
    code-generation.  This will not reduce stack usage until we
    fix the compiler to somehow know how to reallocate stack-slots and
    not screw up alias analysis in the process.  This is a hard
    problem; I'm still thinking about it.

  o Turn off -fstrict-aliasing for Fortran.

-- 
Mark Mitchell 			mark@markmitchell.com
Mark Mitchell Consulting	http://www.markmitchell.com


* Re: Finally a main line version that passes LAPACK's test suite ...
  1999-02-28 13:03   ` craig
@ 1999-02-28 22:53     ` craig
  0 siblings, 0 replies; 26+ messages in thread
From: craig @ 1999-02-28 22:53 UTC (permalink / raw)
  To: toon; +Cc: craig

>Yeah, but the point I try to get across is that we're not reusing stack
>space (temporary arrays) *within* SUBROUTINEs.  g77 is not able to
>generate array temporaries, and it won't generate scalar temporaries
>that alias stack slots assigned to arrays.  It also doesn't generate
>pointers to "sections" of arrays - this is all Fortran 90 stuff that g77
>doesn't do yet.  So I'm still lost as to how your change can bring this
>about.

Toon's basically right, from probably almost every pertinent viewpoint.

But, I should mention g77 *does* generate array temporaries.  The
culprit would be ffecom_push_tempvar, and the cases where it is
called upon to do so appear to be entirely limited to generating
run-time calls to libg2c to do things like character concatenation.

Sorry I don't yet have time to help investigate this, but I'm very
interested in whatever is discovered until I do -- so be sure
to keep emailing the list or, if you decide to take the discussion
off-line, please cc me.

        tq vm, (burley)


* Finally a main line version that passes LAPACK's test suite ...
  1999-02-26  4:58 Toon Moene
       [not found] ` < 36D69A4F.3CC03D4A@moene.indiv.nluug.nl >
@ 1999-02-28 22:53 ` Toon Moene
  1 sibling, 0 replies; 26+ messages in thread
From: Toon Moene @ 1999-02-28 22:53 UTC (permalink / raw)
  To: egcs

Hi,

The snapshot I CVS'd from the main line at 199902250900 UTC was the
first since 19990127 that passed LAPACK's test suite on my
i686-pc-linux-gnu (standard Red Hat 5.2 install).

I used the following options to compile LAPACK:

-g -O3 -funroll-loops -fomit-frame-pointer -fno-emulate-complex
-malign-double

Subsequently, I compiled and ran the test suite that goes with our
numerical weather forecasting system; I compiled it with:

-g -O2 -malign-double -ffast-math -fomit-frame-pointer -funroll-loops
-Wall -Wsurprising -funix-intrinsics-hide

Here are the timings [all times in seconds] with respect to the 19990127
snapshot (not encouraging):

			19990127		19990225

0SUPOBS TOOK :         0.36000061035        0.32999992371
0DATACH TOOK :       148.4499969482       161.0299987793
0ANAEVA TOOK :       207.3500213623       291.8599853516
0GRPEVA TOOK :      1270.1099853516      1333.4100341797
0HUMSUP TOOK :         0.04992675781        0.05993652344
0DATACH TOOK :        45.6899414062        47.8800048828
0HUMEVA TOOK :        22.9599609375        32.1099853516
0GRPEVA TOOK :        31.1900634766        33.4400634766
 PREPARATIONS TOOK     7.5400               7.1400
 FORECAST TOOK      2445.3301            2509.9102

A further disturbing feature of the 19990225 snapshot is that the
forecast part of the run took 54 Megabytes of RSS (as reported by ps and
top) in stark contrast with 43 Megabytes of the 19990127 snapshot.

Note that this is *Resident* size, so we're not talking about unused
virtual memory here, but memory that's really *used*.

Unfortunately, due to the long time between the two successful runs,
it's hard to pinpoint the change that brought this about.

Cheers,

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
g77 Support: fortran@gnu.org; egcs: egcs-bugs@cygnus.com


* Re: Finally a main line version that passes LAPACK's test suite ...
  1999-02-26 12:49                       ` craig
@ 1999-02-28 22:53                         ` craig
  0 siblings, 0 replies; 26+ messages in thread
From: craig @ 1999-02-28 22:53 UTC (permalink / raw)
  To: mark; +Cc: craig

Thanks for looking into this.  I hope to look into it myself, and see
what g77 can offer to improve the situation, sometime soon, but want
to wrap up my 1.1.2 involvement and get a few other things out of the
way first.  Don't take my silence until then as lack of interest!

        tq vm, (burley)


* Re: Finally a main line version that passes LAPACK's test suite ...
  1999-02-28  3:02 Toon Moene
       [not found] ` < 36D9215B.E17961A0@moene.indiv.nluug.nl >
@ 1999-02-28 22:53 ` Toon Moene
  1 sibling, 0 replies; 26+ messages in thread
From: Toon Moene @ 1999-02-28 22:53 UTC (permalink / raw)
  To: Mark Mitchell, egcs

Mark,

You wrote (concerning the growth in RSS I saw with our NWP code):

	If -fno-strict-aliasing makes the expansion go away, that will 
	confirm this hypothesis.

Well, it did remove most of the (otherwise unexplained) growth - I'm now
back at 47 Mbytes (was 54 Mb without -fno-strict-aliasing and 43Mb a
month ago).

Unfortunately, I still do not understand this.  I thought that your
alias-setting of stack slots was meant to prevent overwriting a stack
slot still in use (because of aliasing between temporary variables) ?

*We* certainly do not have close to 2 million temporary single precision
variables !  The whole model state is just 5 variables x 110 x 100 x 31
x 4 bytes ~ 7 Mbyte ...
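
Spelled out, that product is 6,820,000 bytes, or roughly 6.5 MiB. A minimal sketch of the calculation follows; the function name is hypothetical, and reading 110 x 100 x 31 as grid dimensions is an assumption.

```c
#include <assert.h>

/* Bytes in the model state: nvars fields on an nx x ny x nz grid,
   each element taking elem_size bytes. */
long model_state_bytes(long nvars, long nx, long ny, long nz, long elem_size)
{
    assert(nvars > 0 && nx > 0 && ny > 0 && nz > 0 && elem_size > 0);
    return nvars * nx * ny * nz * elem_size;
}
```

Calling `model_state_bytes(5, 110, 100, 31, 4)` gives 6820000, which is the "~ 7 Mbyte" figure above.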

One of your recommendations mentions:

  o Don't use -fstrict-aliasing.  This might make sense if type-based
    alias analysis is not useful in Fortran.  In the long run, we plan
    to use alias sets for other things as well, like temporary
    variables, so this might not be a great long-term solution.

Well, yes.  The aliasing rules in Fortran are very simple:  If the
compiler can't see that two items might overlap, they shouldn't (or else
....)  Types do not play a role in the aliasing rules.

Hope this makes things clear,

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
g77 Support: fortran@gnu.org; egcs: egcs-bugs@cygnus.com


* Re: Finally a main line version that passes LAPACK's test suite ...
       [not found] ` < 36D994C0.479B0657@moene.indiv.nluug.nl >
  1999-02-28 13:03   ` craig
@ 1999-02-28 13:43   ` Mark Mitchell
  1999-02-28 22:53     ` Mark Mitchell
  1 sibling, 1 reply; 26+ messages in thread
From: Mark Mitchell @ 1999-02-28 13:43 UTC (permalink / raw)
  To: toon; +Cc: egcs

>>>>> "Toon" == Toon Moene <toon@moene.indiv.nluug.nl> writes:

    Toon> Each of these subroutines has several megabytes of local
    Toon> (stack based) storage.  So at the end of DYN, several
    Toon> megabytes of stack get "freed", and subsequently RADIA
    Toon> "allocates" several megabytes of stack, which - given a
    Toon> reasonable underlying OS, would mean: The same megabytes.

Thanks.  I get what you're saying now.

    Toon> AFAICS, your alias sets do not do anything across function
    Toon> boundaries (i.e. they work for a function at a time).  Note
    Toon> that inlining doesn't play a role here.

    Toon> So why does -fno-strict-aliasing have any effect ?

I don't have the foggiest.  I'm afraid you'll have to resort to the
debugger!  I would have guessed inlining, but if you say there's none
going on, I'm really at a loss.

I'm sorry not to be able to be of more help!  I suggest trying to
verify that each subroutine is freeing all of the stack space it
allocates, and finding which one is using so much more stack than
before.  Then, perhaps, it will be simpler to see what's going on.

-- 
Mark Mitchell 			mark@markmitchell.com
Mark Mitchell Consulting	http://www.markmitchell.com


* Re: Finally a main line version that passes LAPACK's test suite ...
       [not found] ` < 36D994C0.479B0657@moene.indiv.nluug.nl >
@ 1999-02-28 13:03   ` craig
  1999-02-28 22:53     ` craig
  1999-02-28 13:43   ` Mark Mitchell
  1 sibling, 1 reply; 26+ messages in thread
From: craig @ 1999-02-28 13:03 UTC (permalink / raw)
  To: toon; +Cc: craig

>Yeah, but the point I try to get across is that we're not reusing stack
>space (temporary arrays) *within* SUBROUTINEs.  g77 is not able to
>generate array temporaries, and it won't generate scalar temporaries
>that alias stack slots assigned to arrays.  It also doesn't generate
>pointers to "sections" of arrays - this is all Fortran 90 stuff that g77
>doesn't do yet.  So I'm still lost as to how your change can bring this
>about.

Toon's basically right, from probably almost every pertinent viewpoint.

But, I should mention g77 *does* generate array temporaries.  The
culprit would be ffecom_push_tempvar, and the cases where it is
called upon to do so appear to be entirely limited to generating
run-time calls to libg2c to do things like character concatenation.

Sorry I don't yet have time to help investigate this, but I'm very
interested in whatever is discovered until I do -- so be sure
to keep emailing the list or, if you decide to take the discussion
off-line, please cc me.

        tq vm, (burley)


* Re: Finally a main line version that passes LAPACK's test suite ...
@ 1999-02-28 11:14 Toon Moene
       [not found] ` < 36D994C0.479B0657@moene.indiv.nluug.nl >
  1999-02-28 22:53 ` Toon Moene
  0 siblings, 2 replies; 26+ messages in thread
From: Toon Moene @ 1999-02-28 11:14 UTC (permalink / raw)
  To: Mark Mitchell, egcs

Mark,

You wrote:

> It is not presently safe to reuse the slot for `s' for `t' since the
> rest of the compiler will then assume that `s' does not alias `t'.
> This is unsafe even if the compiler assigns alias set zero to both `s'
> and `t', because other uses of the stack slot may assign alias sets to
> individual pieces of `s' or `t'; for example adding in:
     
>   int* g(struct S* sp) { return &sp->i; }
     
> and then modifying the first block of `f' to look like:
     
>    {
>      struct S s;
>      *g(&s) = 3;
>   }
     
> would give a piece of the stack slot the alias set for `int'.
   
>     Toon> *We* certainly do not have close to 2 million temporary
>     Toon> single precision variables !  The whole model state is just
>     Toon> 5 variables x 110 x 100 x 31 x 4 bytes ~ 7 Mbyte ...
    
> Isn't that exactly the blowup you're seeing?

Yeah, but the point I try to get across is that we're not reusing stack
space (temporary arrays) *within* SUBROUTINEs.  g77 is not able to
generate array temporaries, and it won't generate scalar temporaries
that alias stack slots assigned to arrays.  It also doesn't generate
pointers to "sections" of arrays - this is all Fortran 90 stuff that g77
doesn't do yet.  So I'm still lost as to how your change can bring this
about.

To put an end to this speculation, our code looks like:

       PROGRAM HIRLAM
       DO I = 1, NSTEP
       CALL DYN		! Newton's second law, etc.
       CALL RADIA	! Radiative transfer
       CALL CONVEC	! Convective processes
       CALL CONDENS	! Large scale condensation
       CALL SURF	! (momentum, heat) fluxes at surface
       CALL VDIFF	! Vertical (turbulent) exchange of energy
       ENDDO
       END

Each of these subroutines has several megabytes of local (stack based)
storage.  So at the end of DYN, several megabytes of stack get "freed",
and subsequently RADIA "allocates" several megabytes of stack, which -
given a reasonable underlying OS, would mean:  The same megabytes.

AFAICS, your alias sets do not do anything across function boundaries
(i.e. they work for a function at a time).  Note that inlining doesn't
play a role here.

So why does -fno-strict-aliasing have any effect ?

Cheers,
 
-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
g77 Support: fortran@gnu.org; egcs: egcs-bugs@cygnus.com


* Re: Finally a main line version that passes LAPACK's test suite ...
       [not found] ` < 36D9215B.E17961A0@moene.indiv.nluug.nl >
@ 1999-02-28  9:06   ` Mark Mitchell
  1999-02-28 22:53     ` Mark Mitchell
  0 siblings, 1 reply; 26+ messages in thread
From: Mark Mitchell @ 1999-02-28  9:06 UTC (permalink / raw)
  To: toon; +Cc: egcs

>>>>> "Toon" == Toon Moene <toon@moene.indiv.nluug.nl> writes:

    Toon> Unfortunately, I still do not understand this.  I thought
    Toon> that your alias-setting of stack slots was meant to prevent
    Toon> overwriting a stack slot still in use (because of aliasing
    Toon> between temporary variables) ?

Right.  The problem is that if we don't know what alias set to use for
a stack-slot, we must assume that we cannot reuse it.  Consider, in C,
something like:

   struct S {
     int i;
   };

   struct T {
     long l;
   };

   void f() { 
     {
       struct S s;
     }
     {
       struct T t;
     }
   }

It is not presently safe to reuse the slot for `s' for `t' since the
rest of the compiler will then assume that `s' does not alias `t'.
This is unsafe even if the compiler assigns alias set zero to both `s'
and `t', because other uses of the stack slot may assign alias sets to
individual pieces of `s' or `t'; for example adding in:

  int* g(struct S* sp) { return &sp->i; }

and then modifying the first block of `f' to look like:

   {
     struct S s;
     *g(&s) = 3;
   }

would give a piece of the stack slot the alias set for `int'.

    Toon> *We* certainly do not have close to 2 million temporary
    Toon> single precision variables !  The whole model state is just
    Toon> 5 variables x 110 x 100 x 31 x 4 bytes ~ 7 Mbyte ...

Isn't that exactly the blowup you're seeing?

    Toon> If the compiler can't see that two items might overlap, they
    Toon> shouldn't (or else ....)  Types do not play a role in the
    Toon> aliasing rules.

There are two things you can do then:

  o Give every separate variable a different alias set.  This will
    tell the compiler that none of them overlap, which could improve
    code-generation.  This will not reduce stack usage until we
    fix the compiler to somehow know how to reallocate stack-slots and
    not screw up alias analysis in the process.  This is a hard
    problem; I'm still thinking about it.

  o Turn off -fstrict-aliasing for Fortran.

-- 
Mark Mitchell 			mark@markmitchell.com
Mark Mitchell Consulting	http://www.markmitchell.com


* Re: Finally a main line version that passes LAPACK's test suite ...
@ 1999-02-28  3:02 Toon Moene
       [not found] ` < 36D9215B.E17961A0@moene.indiv.nluug.nl >
  1999-02-28 22:53 ` Toon Moene
  0 siblings, 2 replies; 26+ messages in thread
From: Toon Moene @ 1999-02-28  3:02 UTC (permalink / raw)
  To: Mark Mitchell, egcs

Mark,

You wrote (concerning the growth in RSS I saw with our NWP code):

	If -fno-strict-aliasing makes the expansion go away, that will 
	confirm this hypothesis.

Well, it did remove most of the (otherwise unexplained) growth - I'm now
back at 47 Mbytes (was 54 Mbytes without -fno-strict-aliasing and
43 Mbytes a month ago).

Unfortunately, I still do not understand this.  I thought that your
alias-setting of stack slots was meant to prevent overwriting a stack
slot still in use (because of aliasing between temporary variables) ?

*We* certainly do not have close to 2 million temporary single precision
variables !  The whole model state is just 5 variables x 110 x 100 x 31
x 4 bytes ~ 7 Mbyte ...

One of your recommendations mentions:

  o Don't use -fstrict-aliasing.  This might make sense if type-based
    alias analysis is not useful in Fortran.  In the long run, we plan
    to use alias sets for other things as well, like temporary
    variables, so this might not be a great long-term solution.

Well, yes.  The aliasing rules in Fortran are very simple:  If the
compiler can't see that two items might overlap, they shouldn't (or else
....)  Types do not play a role in the aliasing rules.
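
For readers who think in C, the closest analogue to this rule is
probably the `restrict' qualifier: the programmer promises that the
arguments do not overlap, and their types play no part in that promise.
The sketch below is an analogy only; it is not taken from g77 or from
the code under discussion, and the routine is just an example.

```c
#include <stddef.h>

/* y := a*x + y.  The restrict qualifiers promise that x and y do not
   overlap, much as a Fortran compiler may assume for dummy arguments.  */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
  for (size_t i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```

Calling this with overlapping `x' and `y' would break the promise, which
is the "or else" above.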

Hope this makes things clear,

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
g77 Support: fortran@gnu.org; egcs: egcs-bugs@cygnus.com

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Finally a main line version that passes LAPACK's test suite ...
       [not found]                     ` < 199902262024.MAA07438@adsl-206-170-148-33.dsl.pacbell.net >
@ 1999-02-26 12:49                       ` craig
  1999-02-28 22:53                         ` craig
  0 siblings, 1 reply; 26+ messages in thread
From: craig @ 1999-02-26 12:49 UTC (permalink / raw)
  To: mark; +Cc: craig

Thanks for looking into this.  I hope to look into it myself, and see
what g77 can offer to improve the situation, sometime soon, but want
to wrap up my 1.1.2 involvement and get a few other things out of the
way first.  Don't take my silence until then as lack of interest!

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Finally a main line version that passes LAPACK's test suite ...
       [not found]                 ` < 24783.920059954@hurl.cygnus.com >
@ 1999-02-26 12:21                   ` Mark Mitchell
       [not found]                     ` < 199902262024.MAA07438@adsl-206-170-148-33.dsl.pacbell.net >
  1999-02-28 22:53                     ` Mark Mitchell
  0 siblings, 2 replies; 26+ messages in thread
From: Mark Mitchell @ 1999-02-26 12:21 UTC (permalink / raw)
  To: law; +Cc: craig, toon, egcs

>>>>> "Jeffrey" == Jeffrey A Law <law@hurl.cygnus.com> writes:

    Jeffrey> Another thought -- for arrays on the stack, maybe we
    Jeffrey> could set the alias set based on the underlying type
    Jeffrey> instead of the array itself.  That way arrays with
    Jeffrey> different lifetimes, but the same underlying type could
    Jeffrey> share stack slots.

I bet, although I have not yet confirmed this, that we are already
set up to do this in C:

  else if (TREE_CODE (type) == ARRAY_TYPE)
    /* Anything that can alias one of the array elements can alias
       the entire array as well.  */
    TYPE_ALIAS_SET (type) = c_get_alias_set (TREE_TYPE (type));

But, you're right; we won't currently combine or split a stack slot that
is an array to fit a new request, so only two arrays of the same
size will get overlaid.  You're right that it would be safe to do
this.
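
A toy model of the rule in the snippet above, in case it helps: an
array type takes the alias set of its element type, so two arrays of
the same element type land in the same alias set whatever their
lengths.  `struct type_desc' and `get_alias_set' here are hypothetical
stand-ins, not the compiler's tree nodes or c_get_alias_set itself.

```c
enum kind { KIND_SCALAR, KIND_ARRAY };

/* Hypothetical, much simplified type descriptor.  */
struct type_desc {
  enum kind kind;
  int alias_set;                  /* meaningful for scalars only */
  const struct type_desc *elem;   /* element type, for arrays */
};

/* Anything that can alias an array element can alias the whole
   array, so an array simply inherits its element's alias set.  */
int get_alias_set(const struct type_desc *t)
{
  if (t->kind == KIND_ARRAY)
    return get_alias_set(t->elem);
  return t->alias_set;
}
```

Equal alias sets are necessary but not sufficient for sharing a slot:
as noted above, the sizes currently have to match as well.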

One thing I plan to do RSN is enhance c_get_alias_set to DTRT
w.r.t. structs/unions.  We have the subset machinery already, and now
that alias sets seem to be working in a pretty stable fashion, I think
this is the right time to try this.  That will (as a side-effect)
allow us to reuse stack-slots when they are allocated to structure
types.  (Right now, we can't because all structures have alias set
zero.)  This will probably make a reasonable improvement on the bloat
induced by my change for C/C++, but, of course will not affect
Fortran.

-- 
Mark Mitchell 			mark@markmitchell.com
Mark Mitchell Consulting	http://www.markmitchell.com

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Finally a main line version that passes LAPACK's test suite ...
       [not found]             ` < 199902262012.MAA07375@adsl-206-170-148-33.dsl.pacbell.net >
@ 1999-02-26 12:13               ` Jeffrey A Law
       [not found]                 ` < 24783.920059954@hurl.cygnus.com >
  1999-02-28 22:53                 ` Jeffrey A Law
  0 siblings, 2 replies; 26+ messages in thread
From: Jeffrey A Law @ 1999-02-26 12:13 UTC (permalink / raw)
  To: mark; +Cc: craig, toon, egcs

  In message < 199902262012.MAA07375@adsl-206-170-148-33.dsl.pacbell.net >you write:
  > If -fno-strict-aliasing makes the expansion go away, that will confirm
  > this hypothesis.
Doh.  Forgot about that change.

  > There are a few ways to fix this problem:
  > 
  >   o Don't use -fstrict-aliasing.  This might make sense if type-based
  >     alias analysis is not useful in Fortran.  In the long run, we plan 
  >     to use alias sets for other things as well, like temporary
  >     variables, so this might not be a great long-term solution.
  > 
  >   o Teach g77 about alias sets.  (It doesn't use them now, right?)
  >     You can see c_get_alias_set for the kind of thing you might
  >     want to do.  I don't remember enough about recent Fortran (F90)
  >     to know if this makes sense.
  > 
  >   o Enhance the stack-slot allocating code.  This is the best plan,
  >     but hardest.  We need a way to tell the compiler that a stack
  >     slot is dead, and being reused.
  > 
  > You can search the archives for a few days prior to the 2/22 to find
  > some discussion of the problem that prompted my patch.
Another thought -- for arrays on the stack, maybe we could set the alias
set based on the underlying type instead of the array itself.  That way
arrays with different lifetimes, but the same underlying type could share
stack slots.

I don't know how feasible this really is, just an idea that popped into 
my head.

jeff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Finally a main line version that passes LAPACK's test suite ...
       [not found]         ` < 24635.920058096@hurl.cygnus.com >
@ 1999-02-26 12:09           ` Mark Mitchell
       [not found]             ` < 199902262012.MAA07375@adsl-206-170-148-33.dsl.pacbell.net >
  1999-02-28 22:53             ` Mark Mitchell
  0 siblings, 2 replies; 26+ messages in thread
From: Mark Mitchell @ 1999-02-26 12:09 UTC (permalink / raw)
  To: law; +Cc: craig, toon, egcs

The most likely cause for the stack-frame expansion is this change, by
yours truly.

  Mon Feb 22 13:33:47 1999  Mark Mitchell  <mark@markmitchell.com>

	* cse.c (dump_class): New function.
	(invalidate_memory): Fix typo in comment.
	* function.c (temp_slot): Add an alias set field.  
	(assign_stack_temp): Only reuse slots if they will have the
	same alias set as before.
	(combine_temp_slots): Don't combine if -fstrict-aliasing;
	that's unsafe.
	* rtl.c (copy_rtx): Copy all the flags (in particular,
	MEM_SCALAR_P).

If -fno-strict-aliasing makes the expansion go away, that will confirm
this hypothesis.
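
To see why this makes frames grow, here is a deliberately simplified
model of the reuse rule.  It is a sketch, not the code in function.c:
`struct slot', `struct frame', `assign_temp' and `free_temp' are
hypothetical names, and the `strict' flag only models the net effect
of -fstrict-aliasing on slot reuse.

```c
#include <stddef.h>

#define MAX_SLOTS 16

/* Hypothetical stand-in for a temporary stack slot.  */
struct slot {
  size_t size;     /* bytes reserved in the frame */
  int alias_set;   /* alias set of the current or last occupant */
  int in_use;      /* nonzero while a temporary lives here */
};

struct frame {
  struct slot slots[MAX_SLOTS];
  int nslots;
  size_t bytes;    /* total frame size so far */
  int strict;      /* models -fstrict-aliasing */
};

/* Returns a slot index, or -1 if the table is full.  A free slot is
   reused only if it is big enough and, in strict mode, carries the
   same alias set as the new request.  */
int assign_temp(struct frame *f, size_t size, int alias_set)
{
  for (int i = 0; i < f->nslots; i++) {
    struct slot *s = &f->slots[i];
    if (!s->in_use && s->size >= size
        && (!f->strict || s->alias_set == alias_set)) {
      s->in_use = 1;
      s->alias_set = alias_set;
      return i;
    }
  }
  if (f->nslots == MAX_SLOTS)
    return -1;
  f->slots[f->nslots] = (struct slot){ size, alias_set, 1 };
  f->bytes += size;   /* no reuse possible: the frame grows */
  return f->nslots++;
}

void free_temp(struct frame *f, int i) { f->slots[i].in_use = 0; }
```

With `strict' set, two temporaries with disjoint lifetimes but
different alias sets each get their own slot, so the frame is the sum
of their sizes; with it cleared, the second one reuses the first slot.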

There are a few ways to fix this problem:

  o Don't use -fstrict-aliasing.  This might make sense if type-based
    alias analysis is not useful in Fortran.  In the long run, we plan 
    to use alias sets for other things as well, like temporary
    variables, so this might not be a great long-term solution.

  o Teach g77 about alias sets.  (It doesn't use them now, right?)
    You can see c_get_alias_set for the kind of thing you might
    want to do.  I don't remember enough about recent Fortran (F90)
    to know if this makes sense.

  o Enhance the stack-slot allocating code.  This is the best plan,
    but hardest.  We need a way to tell the compiler that a stack
    slot is dead, and being reused.

You can search the archives for a few days prior to the 2/22 to find
some discussion of the problem that prompted my patch.

-- 
Mark Mitchell 			mark@markmitchell.com
Mark Mitchell Consulting	http://www.markmitchell.com

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Finally a main line version that passes LAPACK's test suite ...
       [not found]     ` < 19990226133449.22215.qmail@deer >
@ 1999-02-26 11:43       ` Jeffrey A Law
       [not found]         ` < 24635.920058096@hurl.cygnus.com >
  1999-02-28 22:53         ` Jeffrey A Law
  0 siblings, 2 replies; 26+ messages in thread
From: Jeffrey A Law @ 1999-02-26 11:43 UTC (permalink / raw)
  To: craig; +Cc: toon, egcs

  In message < 19990226133449.22215.qmail@deer >you write:
  > >A further disturbing feature of the 19990225 snapshot is that the
  > >forecast part of the run took 54 Megabytes of RSS (as reported by ps and
  > >top) in stark contrast with 43 Megabytes of the 19990127 snapshot.
  > >
  > >Note that this is *Resident* size, so we're not talking about unused
  > >virtual memory here, but memory that's really *used*.
  > >
  > >Unfortunately, due to the long time between the two successful runs,
  > >it's hard to pinpoint the change that brought this about.
  > 
  > Well, I noticed some "interesting" expansions in the stack-frame
  > sizes of compiled codes (from my private test suite), and the only
  > thing offhand I could see that might be responsible was:
  > 
  > Thu Feb 18 18:47:09 1999  Jeffrey A Law  (law@cygnus.com)
  > 
  > 	* function.c (assign_stack_temp_for_type): Round SIZE before calling
  > 	assign_stack_local for BLKmode slots.
  > 
  > But I haven't looked into whether that's the culprit, or even what
  > the patch *is*, yet, and probably won't have time.
  > 
  > Anyway, it seemed offhand as though COMPLEX variables were the ones
  > getting much more space allocated to them on the stack (at least).
This should have only put back previous behavior.  I've got a tweak from
John to that code, but I doubt either would be responsible for this kind of
increase in RSS.

The only way (of course) to find out is to debug the problem.

jeff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Finally a main line version that passes LAPACK's test suite ...
       [not found] ` < 36D69A4F.3CC03D4A@moene.indiv.nluug.nl >
@ 1999-02-26  5:40   ` craig
       [not found]     ` < 19990226133449.22215.qmail@deer >
  1999-02-28 22:53     ` craig
  0 siblings, 2 replies; 26+ messages in thread
From: craig @ 1999-02-26  5:40 UTC (permalink / raw)
  To: toon; +Cc: craig

>A further disturbing feature of the 19990225 snapshot is that the
>forecast part of the run took 54 Megabytes of RSS (as reported by ps and
>top) in stark contrast with 43 Megabytes of the 19990127 snapshot.
>
>Note that this is *Resident* size, so we're not talking about unused
>virtual memory here, but memory that's really *used*.
>
>Unfortunately, due to the long time between the two successful runs,
>it's hard to pinpoint the change that brought this about.

Well, I noticed some "interesting" expansions in the stack-frame
sizes of compiled codes (from my private test suite), and the only
thing offhand I could see that might be responsible was:

Thu Feb 18 18:47:09 1999  Jeffrey A Law  (law@cygnus.com)

	* function.c (assign_stack_temp_for_type): Round SIZE before calling
	assign_stack_local for BLKmode slots.

But I haven't looked into whether that's the culprit, or even what
the patch *is*, yet, and probably won't have time.

Anyway, it seemed offhand as though COMPLEX variables were the ones
getting much more space allocated to them on the stack (at least).

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Finally a main line version that passes LAPACK's test suite ...
@ 1999-02-26  4:58 Toon Moene
       [not found] ` < 36D69A4F.3CC03D4A@moene.indiv.nluug.nl >
  1999-02-28 22:53 ` Toon Moene
  0 siblings, 2 replies; 26+ messages in thread
From: Toon Moene @ 1999-02-26  4:58 UTC (permalink / raw)
  To: egcs

Hi,

The snapshot I CVS'd from the main line at 199902250900 UTC was the
first since 19990127 that passed LAPACK's test suite on my
i686-pc-linux-gnu (standard Red Hat 5.2 install).

I used the following options to compile LAPACK:

-g -O3 -funroll-loops -fomit-frame-pointer -fno-emulate-complex
-malign-double

Subsequently, I compiled and ran the test suite that goes with our
numerical weather forecasting system; I compiled it with:

-g -O2 -malign-double -ffast-math -fomit-frame-pointer -funroll-loops
-Wall -Wsurprising -funix-intrinsics-hide

Here are the timings [all times in seconds] with respect to the 19990127
snapshot (not encouraging):

			19990127		19990225

0SUPOBS TOOK :         0.36000061035        0.32999992371
0DATACH TOOK :       148.4499969482       161.0299987793
0ANAEVA TOOK :       207.3500213623       291.8599853516
0GRPEVA TOOK :      1270.1099853516      1333.4100341797
0HUMSUP TOOK :         0.04992675781        0.05993652344
0DATACH TOOK :        45.6899414062        47.8800048828
0HUMEVA TOOK :        22.9599609375        32.1099853516
0GRPEVA TOOK :        31.1900634766        33.4400634766
 PREPARATIONS TOOK     7.5400               7.1400
 FORECAST TOOK      2445.3301            2509.9102

A further disturbing feature of the 19990225 snapshot is that the
forecast part of the run took 54 Megabytes of RSS (as reported by ps and
top) in stark contrast with 43 Megabytes of the 19990127 snapshot.
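
To put these figures in relative terms, a small helper like the
following can be used.  `percent_change' is a made-up name for
illustration, not part of any test harness mentioned here.

```c
/* Relative change from `before' to `after', in percent.  */
double percent_change(double before, double after)
{
  return 100.0 * (after - before) / before;
}
```

For instance, percent_change(43.0, 54.0) gives the growth in RSS,
about 26 percent, while the FORECAST time grew by under 3 percent.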

Note that this is *Resident* size, so we're not talking about unused
virtual memory here, but memory that's really *used*.

Unfortunately, due to the long time between the two successful runs,
it's hard to pinpoint the change that brought this about.

Cheers,

-- 
Toon Moene (toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
g77 Support: fortran@gnu.org; egcs: egcs-bugs@cygnus.com

^ permalink raw reply	[flat|nested] 26+ messages in thread
