public inbox for gcc@gcc.gnu.org
* Compiler uses a lot of memory for large initialized arrays
@ 2004-12-02 16:08 Ian Lance Taylor
  2004-12-02 16:34 ` Joseph S. Myers
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Ian Lance Taylor @ 2004-12-02 16:08 UTC (permalink / raw)
  To: mark; +Cc: gcc

A customer just handed me a test case in which the compiler uses way
too much memory when compiling a large array initialization.  This
turns out to be PR 12245, which previously didn't have a test case
attached to it.

This test case works fine with 2.95.3.  Looking into the patches I
came across this note from you from four years ago:
    http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00937.html

So, I'm reminding you about your four-year-old promise to address this
problem if it arose.

Ian


* Re: Compiler uses a lot of memory for large initialized arrays
  2004-12-02 16:08 Compiler uses a lot of memory for large initialized arrays Ian Lance Taylor
@ 2004-12-02 16:34 ` Joseph S. Myers
  2004-12-02 17:03   ` Ian Lance Taylor
                     ` (2 more replies)
  2004-12-02 17:31 ` Mark Mitchell
  2004-12-02 17:33 ` Giovanni Bajo
  2 siblings, 3 replies; 15+ messages in thread
From: Joseph S. Myers @ 2004-12-02 16:34 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: mark, gcc

On Thu, 2 Dec 2004, Ian Lance Taylor wrote:

> This test case works fine with 2.95.3.  Looking into the patches I
> came across this note from you from four years ago:
>     http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00937.html

And hopefully you came across the November part of the thread as well: C99 
designated initializers allow

int i[10000000] = { 0, 1, 2, 3, 4, 5, /* ... */
9999998, 9999999, [0] = -1 };

which rules out the simplest optimization: writing initializers out to
the assembler output before the whole initializer has been parsed.
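
A scaled-down sketch of the same construct shows why nothing can be committed early: a designator at the very end of the list overrides the element that was written first, so the final value of element 0 is not known until the closing brace. The array name `tiny`, its size, and the two helper functions are illustrative assumptions, not part of the test case.

```c
#include <assert.h>

/* Ten elements instead of ten million; same shape as the example above.
   The trailing [0] = -1 overrides the leading 0. */
static const int tiny[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, [0] = -1 };

/* Returns the value that finally lands in element 0. */
int first_element(void)
{
    return tiny[0];
}

/* Returns element idx, to check that the other elements are untouched. */
int element_at(int idx)
{
    assert(idx >= 0 && idx < 10);
    return tiny[idx];
}
```

A compiler that had already emitted the leading 0 to the assembler output would have produced the wrong value for element 0.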

-- 
Joseph S. Myers               http://www.srcf.ucam.org/~jsm28/gcc/
    jsm@polyomino.org.uk (personal mail)
    joseph@codesourcery.com (CodeSourcery mail)
    jsm28@gcc.gnu.org (Bugzilla assignments and CCs)


* Re: Compiler uses a lot of memory for large initialized arrays
  2004-12-02 16:34 ` Joseph S. Myers
@ 2004-12-02 17:03   ` Ian Lance Taylor
  2004-12-02 17:05   ` Steven Bosscher
  2004-12-02 18:18   ` Joe Buck
  2 siblings, 0 replies; 15+ messages in thread
From: Ian Lance Taylor @ 2004-12-02 17:03 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: mark, gcc

"Joseph S. Myers" <joseph@codesourcery.com> writes:

> On Thu, 2 Dec 2004, Ian Lance Taylor wrote:
> 
> > This test case works fine with 2.95.3.  Looking into the patches I
> > came across this note from you from four years ago:
> >     http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00937.html
> 
> And hopefully you came across the November part of the thread as well: C99 
> designated initializers allow
> 
> int i[10000000] = { 0, 1, 2, 3, 4, 5, /* ... */
> 9999998, 9999999, [0] = -1 };
> 
> which rules out the simplest optimization: writing initializers out to
> the assembler output before the whole initializer has been parsed.

Oh, ick.

Ian


* Re: Compiler uses a lot of memory for large initialized arrays
  2004-12-02 16:34 ` Joseph S. Myers
  2004-12-02 17:03   ` Ian Lance Taylor
@ 2004-12-02 17:05   ` Steven Bosscher
  2004-12-02 17:12     ` Richard Guenther
  2004-12-02 18:18   ` Joe Buck
  2 siblings, 1 reply; 15+ messages in thread
From: Steven Bosscher @ 2004-12-02 17:05 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Ian Lance Taylor, mark, gcc

On Dec 02, 2004 05:34 PM, Joseph S. Myers <joseph@codesourcery.com> wrote:

> On Thu, 2 Dec 2004, Ian Lance Taylor wrote:
> 
> > This test case works fine with 2.95.3.  Looking into the patches I
> > came across this note from you from four years ago:
> >     http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00937.html
> 
> And hopefully you came across the November part of the thread as well: C99 
> designated initializers allow
> 
> int i[10000000] = { 0, 1, 2, 3, 4, 5, /* ... */
> 9999998, 9999999, [0] = -1 };
> 
> which rules out the simplest optimization: writing initializers out to
> the assembler output before the whole initializer has been parsed.

Ouch.

Do we disable this if -std!=c99?

Gr.
Steven


* Re: Compiler uses a lot of memory for large initialized arrays
  2004-12-02 17:05   ` Steven Bosscher
@ 2004-12-02 17:12     ` Richard Guenther
  2004-12-02 17:17       ` Nathan Sidwell
  2004-12-02 17:36       ` Andreas Schwab
  0 siblings, 2 replies; 15+ messages in thread
From: Richard Guenther @ 2004-12-02 17:12 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Joseph S. Myers, Ian Lance Taylor, mark, gcc

On Thu, 2 Dec 2004 18:05:17 +0100 (CET), Steven Bosscher
<stevenb@suse.de> wrote:
> On Dec 02, 2004 05:34 PM, Joseph S. Myers <joseph@codesourcery.com> wrote:
> 
> > On Thu, 2 Dec 2004, Ian Lance Taylor wrote:
> >
> > > This test case works fine with 2.95.3.  Looking into the patches I
> > > came across this note from you from four years ago:
> > >     http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00937.html
> >
> > And hopefully you came across the November part of the thread as well: C99
> > designated initializers allow
> >
> > int i[10000000] = { 0, 1, 2, 3, 4, 5, /* ... */
> > 9999998, 9999999, [0] = -1 };
> >
> > > which rules out the simplest optimization: writing initializers out to
> > > the assembler output before the whole initializer has been parsed.
> 
> Ouch.
> 
> Do we disable this if -std!=c99?

Or can't we seek in the asm output and overwrite previously written values?
Might be slow, though.

Richard.


* Re: Compiler uses a lot of memory for large initialized arrays
  2004-12-02 17:12     ` Richard Guenther
@ 2004-12-02 17:17       ` Nathan Sidwell
  2004-12-02 17:29         ` Peter Barada
  2004-12-02 17:36       ` Andreas Schwab
  1 sibling, 1 reply; 15+ messages in thread
From: Nathan Sidwell @ 2004-12-02 17:17 UTC (permalink / raw)
  To: Richard Guenther
  Cc: Steven Bosscher, Joseph S. Myers, Ian Lance Taylor, mark, gcc

Richard Guenther wrote:

> Or can't we seek in the asm output and overwrite previously written values?
> Might be slow, though.

Even if that were sensible, where will we remember the file offsets
of each and every element :)

nathan

-- 
Nathan Sidwell    ::   http://www.codesourcery.com   ::     CodeSourcery LLC
nathan@codesourcery.com    ::     http://www.planetfall.pwp.blueyonder.co.uk


* Re: Compiler uses a lot of memory for large initialized arrays
  2004-12-02 17:17       ` Nathan Sidwell
@ 2004-12-02 17:29         ` Peter Barada
  2004-12-02 17:39           ` Mark Mitchell
                             ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Peter Barada @ 2004-12-02 17:29 UTC (permalink / raw)
  To: nathan; +Cc: richard.guenther, stevenb, joseph, ian, mark, gcc


>> Or can't we seek in the asm output and overwrite previously written values?
>> Might be slow, though.
>
>Even if that were sensible, where will we remember the file offsets
>of each and every element :)

I was going to suggest using .org to go back and overwrite the
previous value in the assembler code (since you can compute the offset
quite easily, at least for arrays), but I see the GAS .org info page
has the following:

  `.org' may only increase the location counter, or leave it
  unchanged; you cannot use `.org' to move the location counter
  backwards.

  Because `as' tries to assemble programs in one pass, NEW-LC may not
  be undefined.  If you really detest this restriction we eagerly await
  a chance to share your improved assembler.

So I have to ask how many GCC targets *don't* use GAS?  If that is
very few (to none), then perhaps the best solution is to look into
modifying GAS to allow backward setting of .org, or is that an even
bigger problem?

-- 
Peter Barada
peter@the-baradas.com


* Re: Compiler uses a lot of memory for large initialized arrays
  2004-12-02 16:08 Compiler uses a lot of memory for large initialized arrays Ian Lance Taylor
  2004-12-02 16:34 ` Joseph S. Myers
@ 2004-12-02 17:31 ` Mark Mitchell
  2004-12-02 17:33 ` Giovanni Bajo
  2 siblings, 0 replies; 15+ messages in thread
From: Mark Mitchell @ 2004-12-02 17:31 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc

Ian Lance Taylor wrote:
> A customer just handed me a test case in which the compiler uses way
> too much memory when compiling a large array initialization.  This
> turns out to be PR 12245, which previously didn't have a test case
> attached to it.
> 
> This test case works fine with 2.95.3.  Looking into the patches I
> came across this note from you from four years ago:
>     http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00937.html
> 
> So, I'm reminding you about your four-year-old promise to address this
> problem if it arose.

Fair enough.

-- 
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304


* Re: Compiler uses a lot of memory for large initialized arrays
  2004-12-02 16:08 Compiler uses a lot of memory for large initialized arrays Ian Lance Taylor
  2004-12-02 16:34 ` Joseph S. Myers
  2004-12-02 17:31 ` Mark Mitchell
@ 2004-12-02 17:33 ` Giovanni Bajo
  2 siblings, 0 replies; 15+ messages in thread
From: Giovanni Bajo @ 2004-12-02 17:33 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc

Ian Lance Taylor <ian@wasabisystems.com> wrote:

> A customer just handed me a test case in which the compiler uses way
> too much memory when compiling a large array initialization.  This
> turns out to be PR 12245, which previously didn't have a test case
> attached to it.


Check also PR 14179, which is the same issue for C++. The testcase has 4
million initializers. We have almost fixed that testcase (that is: we do
much better than a few months ago), and I have been meaning to post the
final patch (to process_init_constructor) for a long time now. With that
patch applied, we can compile the testcase with about 220 MB of RAM.

There are other optimizations that can be done here and there. For instance,
the CONSTRUCTOR_ELTS could use Nathan's Vec where feasible, instead of
TREE_LISTs.
-- 
Giovanni Bajo


* Re: Compiler uses a lot of memory for large initialized arrays
  2004-12-02 17:12     ` Richard Guenther
  2004-12-02 17:17       ` Nathan Sidwell
@ 2004-12-02 17:36       ` Andreas Schwab
  1 sibling, 0 replies; 15+ messages in thread
From: Andreas Schwab @ 2004-12-02 17:36 UTC (permalink / raw)
  To: Richard Guenther
  Cc: Steven Bosscher, Joseph S. Myers, Ian Lance Taylor, mark, gcc

Richard Guenther <richard.guenther@gmail.com> writes:

> Or can't we seek in the asm output and overwrite previously written values?

Not with -pipe.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: Compiler uses a lot of memory for large initialized arrays
  2004-12-02 17:29         ` Peter Barada
@ 2004-12-02 17:39           ` Mark Mitchell
  2004-12-02 17:42           ` Ian Lance Taylor
  2004-12-02 17:49           ` Dave Korn
  2 siblings, 0 replies; 15+ messages in thread
From: Mark Mitchell @ 2004-12-02 17:39 UTC (permalink / raw)
  To: Peter Barada; +Cc: nathan, richard.guenther, stevenb, joseph, ian, gcc

Peter Barada wrote:
>>>Or can't we seek in the asm output and overwrite previously written values?
>>>Might be slow, though.
>>
>>Even if that were sensible, where will we remember the file offsets
>>of each and every element :)

I think overwriting stuff in the assembler is a horrible idea.

I also think that trying to achieve 2.95.3 memory usage for huge arrays 
is foolish.  Since then, we've deliberately substantially increased the 
amount of memory we need for lots of things: function-at-a-time is more 
expensive than statement-at-a-time, and now we're unit-at-a-time for 
many compilations -- as we should be.  Most compilers suck up lots of 
memory with vast arrays; I don't think we need to be different, alleged 
regression or not.

However, the C++ front end (and perhaps the C front end) does some pretty 
silly stuff when constructing the arrays.  I believe that when I 
analyzed this, I determined that there were factor-of-eight sorts of 
improvements possible.  That's what I think we should fix.

We've already got some of that, in that, for example, Nathan's changed 
things so that we share integer constants.  The next major step is to 
change CONSTRUCTOR to use an array, rather than a linked list, of 
elements for CONSTRUCTOR_ELTS.
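
A rough sketch of why the array form is smaller: each linked-list node has to carry a chain pointer in addition to the index/value pair, while elements of a contiguous array need no link at all. The struct layouts and function names below are simplified, hypothetical stand-ins and not GCC's actual tree node definitions; allocator overhead is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list node: one link plus the index/value pair. */
struct list_node {
    struct list_node *next;  /* chain to the following element */
    void *index;             /* which element is being initialized */
    void *value;             /* the initializer expression */
};

/* Hypothetical array element: just the pair, no link. */
struct vec_elt {
    void *index;
    void *value;
};

/* Bytes needed to hold n elements as separately linked nodes. */
size_t list_bytes(size_t n)
{
    assert(n <= SIZE_MAX / sizeof(struct list_node));
    return n * sizeof(struct list_node);
}

/* Bytes needed to hold n elements in one contiguous array. */
size_t vec_bytes(size_t n)
{
    assert(n <= SIZE_MAX / sizeof(struct vec_elt));
    return n * sizeof(struct vec_elt);
}
```

Comparing `list_bytes(4000000)` with `vec_bytes(4000000)` gives a feel for the saving on a testcase of the size mentioned for PR 14179, under these simplified layouts.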

-- 
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304


* Re: Compiler uses a lot of memory for large initialized arrays
  2004-12-02 17:29         ` Peter Barada
  2004-12-02 17:39           ` Mark Mitchell
@ 2004-12-02 17:42           ` Ian Lance Taylor
  2004-12-02 17:49           ` Dave Korn
  2 siblings, 0 replies; 15+ messages in thread
From: Ian Lance Taylor @ 2004-12-02 17:42 UTC (permalink / raw)
  To: Peter Barada; +Cc: nathan, richard.guenther, stevenb, joseph, mark, gcc

Peter Barada <peter@the-baradas.com> writes:

> I was going to suggest using .org to go back and overwrite the
> previous value in the assembler code (since you can compute the offset
> quite easily, at least for arrays), but I see the GAS .org info page
> has the following:
> 
>   `.org' may only increase the location counter, or leave it
>   unchanged; you cannot use `.org' to move the location counter
>   backwards.
> 
>   Because `as' tries to assemble programs in one pass, NEW-LC may not
>   be undefined.  If you really detest this restriction we eagerly await
>   a chance to share your improved assembler.
> 
> So I have to ask how many GCC targets *don't* use GAS?  If that is
> very few (to none), then perhaps the best solution is to look into
> modifying GAS to allow backward setting of .org, or is that an even
> bigger problem?

Having gas support going backward with .org would be doable, though
hardly simple.  gas is not really a one-pass assembler, not since 1990
or so; it stores all the assembled data in memory, and then writes it
out at the end of the assembly.  (On the other hand, gas is also not
really a two-pass assembler; it doesn't actually make another pass
over the input, except to write out the data.)

Backward .org could be supported with a new type of frag which
specified the exact offset to use.  When writing out that frag we
would then use that offset, instead of just keeping track of the
current offset as we do now.  Several places in the assembler would
need to be updated with information about the new frag type.  Probably
not much target dependent code would be involved.

We would have to permit general expressions in the .org, or else it
would be too hard for gcc to generate the correct expression.  That
is, gcc will want to generate ".org array + 10".  Since the data
following the .org can itself affect the placement of future symbols,
some expressions will not be resolvable.  Some attention would have to
be paid to defining what would be permitted, and handling failing
cases.  This becomes particularly complex when using a backward .org
in a code section for a target for which the assembler can relax
branches, such as the i386 (or, for that matter, the Coldfire).

There are certainly gcc targets which don't use gas, so any such
optimization would have to be made conditional on support within the
assembler.
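
As a sketch of the compiler side only, this is roughly the text gcc would have to produce for such a directive: the byte offset is the element index times the element size, added to the array symbol. The function name `format_backward_org` is hypothetical, and gas as documented above would reject the resulting backward .org; nothing here implies assembler support.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats a hypothetical "\t.org\tSYMBOL + OFFSET" line for element
   `index` of the array named `symbol`.  Returns the number of characters
   written (excluding the terminating NUL), or -1 if buf is too small. */
int format_backward_org(char *buf, size_t buflen, const char *symbol,
                        size_t index, size_t elt_size)
{
    assert(buf != NULL && symbol != NULL && elt_size > 0);

    size_t offset = index * elt_size;  /* byte offset from array start */
    int n = snprintf(buf, buflen, "\t.org\t%s + %zu", symbol, offset);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}
```

Calling it with symbol "array", index 10 and element size 1 yields the ".org array + 10" form mentioned above.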

Ian


* RE: Compiler uses a lot of memory for large initialized arrays
  2004-12-02 17:29         ` Peter Barada
  2004-12-02 17:39           ` Mark Mitchell
  2004-12-02 17:42           ` Ian Lance Taylor
@ 2004-12-02 17:49           ` Dave Korn
  2 siblings, 0 replies; 15+ messages in thread
From: Dave Korn @ 2004-12-02 17:49 UTC (permalink / raw)
  To: 'Peter Barada', nathan
  Cc: richard.guenther, stevenb, joseph, ian, mark, gcc

> -----Original Message-----
> From: gcc-owner On Behalf Of Peter Barada
> Sent: 02 December 2004 17:30

> So I have to ask how many GCC targets *don't* use GAS?  If that is
> very few (to none), then perhaps the best solution is to look into
> modifying GAS to allow backward setting of .org, or is that an even
> bigger problem?

  Well, it is an even more bletch-worthy hack than zapping backwards and
forwards in the assembler output overwriting stuff.  Wash your mouth out for
even suggesting such a dirty idea!

  Plus, gcc is very much supposed to interoperate with a target's native
binutils-equivalents.  I quite regularly read posts from people who use it
with Sun's native as/ld on solaris, for example.




    cheers, 
      DaveK
-- 
Can't think of a witty .sigline today....


* Re: Compiler uses a lot of memory for large initialized arrays
  2004-12-02 16:34 ` Joseph S. Myers
  2004-12-02 17:03   ` Ian Lance Taylor
  2004-12-02 17:05   ` Steven Bosscher
@ 2004-12-02 18:18   ` Joe Buck
  2004-12-02 18:23     ` Mark Mitchell
  2 siblings, 1 reply; 15+ messages in thread
From: Joe Buck @ 2004-12-02 18:18 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Ian Lance Taylor, mark, gcc

On Thu, Dec 02, 2004 at 04:34:35PM +0000, Joseph S. Myers wrote:
> On Thu, 2 Dec 2004, Ian Lance Taylor wrote:
> 
> > This test case works fine with 2.95.3.  Looking into the patches I
> > came across this note from you from four years ago:
> >     http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00937.html
> 
> And hopefully you came across the November part of the thread as well: C99 
> designated initializers allow
> 
> int i[10000000] = { 0, 1, 2, 3, 4, 5, /* ... */
> 9999998, 9999999, [0] = -1 };
> 
> which rules out the simplest optimization: writing initializers out to
> the assembler output before the whole initializer has been parsed.

Here's an ugly hack: write large initializers to a temporary file
(say, when the number of elements in the initializer reaches some
threshold, like 500).

If we successfully reach the end of the initializer without anything evil
like the above case, then append the temp file to the assembly output.
Otherwise, go back and do it in memory.
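
A minimal sketch of that control flow, under several simplifying assumptions: it spills from the first element rather than after a threshold, it treats any designator as "evil" (even one that never revisits an earlier element), and it emits every value as a `.long` line. The `init_elt` struct and the function name `emit_via_temp_file` are hypothetical, not GCC code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One parsed initializer element.  has_designator is nonzero for the
   "[index] = value" form, which may revisit an earlier element. */
struct init_elt {
    int has_designator;
    size_t index;
    long value;
};

/* Streams the elements to a temporary file and, if no designator turned
   up, appends that file to asm_out.  Returns 1 on success, or 0 if the
   caller must fall back to building the initializer in memory (also
   returned if the temporary file cannot be created). */
int emit_via_temp_file(const struct init_elt *elts, size_t n, FILE *asm_out)
{
    assert(elts != NULL && asm_out != NULL);

    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return 0;

    for (size_t i = 0; i < n; i++) {
        if (elts[i].has_designator) {
            /* Give up: discard the spill, asm_out is untouched. */
            fclose(tmp);
            return 0;
        }
        fprintf(tmp, "\t.long\t%ld\n", elts[i].value);
    }

    /* Clean run: copy the spill file onto the end of the real output. */
    rewind(tmp);
    int c;
    while ((c = getc(tmp)) != EOF)
        putc(c, asm_out);
    fclose(tmp);
    return 1;
}
```

The point of the design is that the real assembler output is only touched once the whole initializer is known to be safe, so the fallback path never has anything to undo.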


* Re: Compiler uses a lot of memory for large initialized arrays
  2004-12-02 18:18   ` Joe Buck
@ 2004-12-02 18:23     ` Mark Mitchell
  0 siblings, 0 replies; 15+ messages in thread
From: Mark Mitchell @ 2004-12-02 18:23 UTC (permalink / raw)
  To: Joe Buck; +Cc: Joseph S. Myers, Ian Lance Taylor, gcc

Joe Buck wrote:

> If we successfully reach the end of the initializer without anything evil
> like the above case, then append the temp file to the assembly output.
> Otherwise, go back and do it in memory.

I can't fathom why we are considering things like this.

Like anything else, we should try to use an efficient representation, 
which we are not, at present.  We should fix that.  But why should we go 
beyond that, jumping through complicated hoops to support initialization 
of gigabytes of data?  We don't jump through these kinds of hoops to 
support compilation of functions with millions of lines of code, or 
translation units with millions of functions.

I think it's perfectly reasonable to run out of memory processing a 
truly huge array -- provided that the compiler is being sensible about 
its internal representation.

-- 
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304
