* Compiler uses a lot of memory for large initialized arrays @ 2004-12-02 16:08 Ian Lance Taylor 2004-12-02 16:34 ` Joseph S. Myers ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: Ian Lance Taylor @ 2004-12-02 16:08 UTC (permalink / raw) To: mark; +Cc: gcc A customer just handed me a test case in which the compiler uses way too much memory when compiling a large array initialization. This turns out to be PR 12245, which previously didn't have a test case attached to it. This test case works fine with 2.95.3. Looking into the patches I came across this note from you from four years ago: http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00937.html So, I'm reminding you about your four-year-old promise to address this problem if it arose. Ian ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Compiler uses a lot of memory for large initialized arrays 2004-12-02 16:08 Compiler uses a lot of memory for large initialized arrays Ian Lance Taylor @ 2004-12-02 16:34 ` Joseph S. Myers 2004-12-02 17:03 ` Ian Lance Taylor ` (2 more replies) 2004-12-02 17:31 ` Mark Mitchell 2004-12-02 17:33 ` Giovanni Bajo 2 siblings, 3 replies; 15+ messages in thread From: Joseph S. Myers @ 2004-12-02 16:34 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: mark, gcc On Thu, 2 Dec 2004, Ian Lance Taylor wrote: > This test case works fine with 2.95.3. Looking into the patches I > came across this note from you from four years ago: > http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00937.html And hopefully you came across the November part of the thread as well: C99 designated initializers allow int i[10000000] = { 0, 1, 2, 3, 4, 5, /* ... */ 9999998, 9999999, [0] = -1 }; which stops optimizing in the simplest way by writing out initializers to the assembler output before the whole initializer has been parsed. -- Joseph S. Myers http://www.srcf.ucam.org/~jsm28/gcc/ jsm@polyomino.org.uk (personal mail) joseph@codesourcery.com (CodeSourcery mail) jsm28@gcc.gnu.org (Bugzilla assignments and CCs) ^ permalink raw reply [flat|nested] 15+ messages in thread
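[Illustration] The designated-initializer case Joseph describes can be reproduced at a small scale. The sketch below is not from the thread: the array name `values`, its eight-element size, and the helper `value_at` are made up for illustration. It only shows that a late `[0] = -1` replaces the first element, which is why a compiler cannot safely emit element 0 before it has parsed the whole initializer.

```c
#include <assert.h>
#include <stddef.h>

/* A small stand-in for the ten-million-element array in the thread.
   The trailing designator revisits element 0 after it was already given. */
static const int values[8] = { 0, 1, 2, 3, 4, 5, 6, 7, [0] = -1 };

/* Hypothetical helper: return element idx so the effect of the late
   designator can be observed. */
int value_at(size_t idx)
{
    assert(idx < sizeof values / sizeof values[0]);
    return values[idx];
}
```

Calling `value_at(0)` yields the overriding value rather than the first one written; GCC may warn about the overridden initializer when `-Wextra` is enabled.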
* Re: Compiler uses a lot of memory for large initialized arrays 2004-12-02 16:34 ` Joseph S. Myers @ 2004-12-02 17:03 ` Ian Lance Taylor 2004-12-02 17:05 ` Steven Bosscher 2004-12-02 18:18 ` Joe Buck 2 siblings, 0 replies; 15+ messages in thread From: Ian Lance Taylor @ 2004-12-02 17:03 UTC (permalink / raw) To: Joseph S. Myers; +Cc: mark, gcc "Joseph S. Myers" <joseph@codesourcery.com> writes: > On Thu, 2 Dec 2004, Ian Lance Taylor wrote: > > > This test case works fine with 2.95.3. Looking into the patches I > > came across this note from you from four years ago: > > http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00937.html > > And hopefully you came across the November part of the thread as well: C99 > designated initializers allow > > int i[10000000] = { 0, 1, 2, 3, 4, 5, /* ... */ > 9999998, 9999999, [0] = -1 }; > > which stops optimizing in the simplest way by writing out initializers to > the assembler output before the whole initializer has been parsed. Oh, ick. Ian ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Compiler uses a lot of memory for large initialized arrays 2004-12-02 16:34 ` Joseph S. Myers 2004-12-02 17:03 ` Ian Lance Taylor @ 2004-12-02 17:05 ` Steven Bosscher 2004-12-02 17:12 ` Richard Guenther 2004-12-02 18:18 ` Joe Buck 2 siblings, 1 reply; 15+ messages in thread From: Steven Bosscher @ 2004-12-02 17:05 UTC (permalink / raw) To: Joseph S. Myers; +Cc: Ian Lance Taylor, mark, gcc On Dec 02, 2004 05:34 PM, Joseph S. Myers <joseph@codesourcery.com> wrote: > On Thu, 2 Dec 2004, Ian Lance Taylor wrote: > > > This test case works fine with 2.95.3. Looking into the patches I > > came across this note from you from four years ago: > > http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00937.html > > And hopefully you came across the November part of the thread as well: C99 > designated initializers allow > > int i[10000000] = { 0, 1, 2, 3, 4, 5, /* ... */ > 9999998, 9999999, [0] = -1 }; > > which stops optimizing in the simplest way by writing out initializers to > the assembler output before the whole initializer has been parsed. Ouch. Do we disable this if -std!=c99? Gr. Steven ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Compiler uses a lot of memory for large initialized arrays 2004-12-02 17:05 ` Steven Bosscher @ 2004-12-02 17:12 ` Richard Guenther 2004-12-02 17:17 ` Nathan Sidwell 2004-12-02 17:36 ` Andreas Schwab 0 siblings, 2 replies; 15+ messages in thread From: Richard Guenther @ 2004-12-02 17:12 UTC (permalink / raw) To: Steven Bosscher; +Cc: Joseph S. Myers, Ian Lance Taylor, mark, gcc On Thu, 2 Dec 2004 18:05:17 +0100 (CET), Steven Bosscher <stevenb@suse.de> wrote: > On Dec 02, 2004 05:34 PM, Joseph S. Myers <joseph@codesourcery.com> wrote: > > > On Thu, 2 Dec 2004, Ian Lance Taylor wrote: > > > > > This test case works fine with 2.95.3. Looking into the patches I > > > came across this note from you from four years ago: > > > http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00937.html > > > > And hopefully you came across the November part of the thread as well: C99 > > designated initializers allow > > > > int i[10000000] = { 0, 1, 2, 3, 4, 5, /* ... */ > > 9999998, 9999999, [0] = -1 }; > > > > which stops optimizing in the simplest way by writing out initializers to > > the assembler output before the whole initializer has been parsed. > > Ouch. > > Do we disable this if -std!=c99? Or can't we seek in the asm output and overwrite previously written values? Might be slow, though. Richard. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Compiler uses a lot of memory for large initialized arrays 2004-12-02 17:12 ` Richard Guenther @ 2004-12-02 17:17 ` Nathan Sidwell 2004-12-02 17:29 ` Peter Barada 2004-12-02 17:36 ` Andreas Schwab 1 sibling, 1 reply; 15+ messages in thread From: Nathan Sidwell @ 2004-12-02 17:17 UTC (permalink / raw) To: Richard Guenther Cc: Steven Bosscher, Joseph S. Myers, Ian Lance Taylor, mark, gcc Richard Guenther wrote: > Or can't we seek in the asm output and overwrite previously written values? > Might be slow, though. Even if that were sensible, where will we remember the file offsets of each and every element :) nathan -- Nathan Sidwell :: http://www.codesourcery.com :: CodeSourcery LLC nathan@codesourcery.com :: http://www.planetfall.pwp.blueyonder.co.uk ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Compiler uses a lot of memory for large initialized arrays 2004-12-02 17:17 ` Nathan Sidwell @ 2004-12-02 17:29 ` Peter Barada 2004-12-02 17:39 ` Mark Mitchell ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: Peter Barada @ 2004-12-02 17:29 UTC (permalink / raw) To: nathan; +Cc: richard.guenther, stevenb, joseph, ian, mark, gcc >> Or can't we seek in the asm output and overwrite previously written values? >> Might be slow, though. > >Even if that were sensible, where will we remember the file offsets >of each and every element :) I was going to suggest using .org to go back and overwrite the previous value in the assembler code(since you can compute the offset quite easily, at least for arrays), but I see the GAS .org info page has the following: `.org' may only increase the location counter, or leave it unchanged; you cannot use `.org' to move the location counter backwards. Because `as' tries to assemble programs in one pass, NEW-LC may not be undefined. If you really detest this restriction we eagerly await a chance to share your improved assembler. So I have to ask how many GCC targets *don't* use GAS? If that is very few(to none), then perhaps the best solution is to look into modifying GAS to allow backward setting of .org, or is that an even bigger problem? -- Peter Barada peter@the-baradas.com ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Compiler uses a lot of memory for large initialized arrays 2004-12-02 17:29 ` Peter Barada @ 2004-12-02 17:39 ` Mark Mitchell 2004-12-02 17:42 ` Ian Lance Taylor 2004-12-02 17:49 ` Dave Korn 2 siblings, 0 replies; 15+ messages in thread From: Mark Mitchell @ 2004-12-02 17:39 UTC (permalink / raw) To: Peter Barada; +Cc: nathan, richard.guenther, stevenb, joseph, ian, gcc Peter Barada wrote: >>>Or can't we seek in the asm output and overwrite previously written values? >>>Might be slow, though. >> >>Even if that were sensible, where will we remember the file offsets >>of each and every element :) I think overwriting stuff in the assembler is a horrible idea. I also think that trying to achieve 2.95.3 memory usage for huge arrays is foolish. Since then, we've deliberately substantially increased the amount of memory we need for lots of things: function-at-a-time is more expensive than statement-at-a-time, and now we're unit-at-a-time for many compilations -- as we should be. Most compilers suck up lots of memory with vast arrays; I don't think we need to be different, alleged regression or not. However, the C++ front end (and perhaps the C front end) do some pretty silly stuff when constructing the arrays. I believe that when I analyzed this, I determined that there were factor-of-eight sorts of improvements possible. That's what I think we should fix. We've already got some of that, in that, for example, Nathan's changed things so that we share integer constants. The next major step is to change CONSTRUCTOR to use an array, rather than a linked list, of elements for CONSTRUCTOR_ELTS. -- Mark Mitchell CodeSourcery, LLC mark@codesourcery.com (916) 791-8304 ^ permalink raw reply [flat|nested] 15+ messages in thread
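[Illustration] The array-versus-linked-list point can be made concrete with a rough size comparison. The sketch below is an assumption-laden model, not GCC's actual data structures: `struct list_elt`, `struct vec_elt`, and the three functions are hypothetical names, and real tree nodes carry more per-node bookkeeping than the single link pointer modelled here. It shows only that dropping the per-element chain pointer shrinks the storage for n elements; it does not reproduce the factor-of-eight estimate from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node: one separately allocated object per element,
   chained to the next one. */
struct list_elt {
    struct list_elt *next;  /* link to the following element */
    void *index;            /* which array element is initialized */
    void *value;            /* the initializer value */
};

/* Hypothetical array slot: the same payload, with no link pointer. */
struct vec_elt {
    void *index;
    void *value;
};

/* Bytes of element storage for n initializers in each layout
   (allocator overhead is ignored). */
size_t list_bytes(size_t n) { return n * sizeof(struct list_elt); }
size_t vec_bytes(size_t n)  { return n * sizeof(struct vec_elt); }

/* Bytes saved by the array layout for n elements. */
size_t bytes_saved(size_t n)
{
    assert(list_bytes(n) >= vec_bytes(n));
    return list_bytes(n) - vec_bytes(n);
}
```

In this model the saving is one pointer per element; a contiguous array also avoids one allocation per element, which the sketch does not count.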
* Re: Compiler uses a lot of memory for large initialized arrays 2004-12-02 17:29 ` Peter Barada 2004-12-02 17:39 ` Mark Mitchell @ 2004-12-02 17:42 ` Ian Lance Taylor 2004-12-02 17:49 ` Dave Korn 2 siblings, 0 replies; 15+ messages in thread From: Ian Lance Taylor @ 2004-12-02 17:42 UTC (permalink / raw) To: Peter Barada; +Cc: nathan, richard.guenther, stevenb, joseph, mark, gcc Peter Barada <peter@the-baradas.com> writes: > I was going to suggest using .org to go back and overwrite the > previous value in the assembler code(since you can compute the offset > quite easily, at least for arrays), but I see the GAS .org info page > has the following: > > `.org' may only increase the location counter, or leave it > unchanged; you cannot use `.org' to move the location counter > backwards. > > Because `as' tries to assemble programs in one pass, NEW-LC may not > be undefined. If you really detest this restriction we eagerly await > a chance to share your improved assembler. > > So I have to ask how many GCC targets *don't* use GAS? If that is > very few(to none), then perhaps the best solution is to look into > modifying GAS to allow backward setting of .org, or is that an even > bigger problem? Having gas support going backward with .org would be doable, though hardly simple. gas is not really a one-pass assembler, not since 1990 or so; it stores all the assembled data in memory, and then writes it out at the end of the assembly. (On the other hand, gas is also not really a two-pass assembler; it doesn't actually make another pass over the input, except to write out the data.) Backward .org could be supported with a new type of frag which specified the exact offset to use. When writing out that frag we would then use that offset, instead of just keeping track of the current offset as we do now. Several places in the assembler would need to be updated with information about the new frag type. Probably not much target dependent code would be involved. 
We would have to permit general expressions in the .org, or else it would be too hard for gcc to generate the correct expression. That is, gcc will want to generate ".org array + 10". Since the data following the .org can itself affect the placement of future symbols, some expressions will not be resolvable. Some attention would have to be paid to defining what would be permitted, and handling failing cases. This becomes particularly complex when using a backward .org in a code section for a target for which the assembler can relax branches, such as the i386 (or, for that matter, the Coldfire). There are certainly gcc targets which don't use gas, so any such optimization would have to be made conditional on support within the assembler. Ian ^ permalink raw reply [flat|nested] 15+ messages in thread
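[Illustration] The "frag with an exact offset" idea can be modelled outside the assembler. The following is a toy simulation under stated assumptions, not gas code: `struct frag`, its fields, and `write_frags` are hypothetical, and the real assembler's frag handling, symbol resolution, and relaxation are far more involved. The writer normally advances a running offset, but a pinned frag resets the offset first, so it can overwrite bytes emitted earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical frag: a run of bytes, optionally pinned to an offset. */
struct frag {
    const unsigned char *data;
    size_t len;
    int has_fixed_offset;  /* nonzero for the imagined backward-.org frag */
    size_t fixed_offset;   /* offset from the section start, if pinned */
};

/* Write all frags into out[0..cap). Returns the section size (the
   high-water mark), or (size_t)-1 if a frag would not fit. */
size_t write_frags(const struct frag *frags, size_t nfrags,
                   unsigned char *out, size_t cap)
{
    size_t offset = 0;  /* running location counter */
    size_t size = 0;    /* largest offset reached so far */

    for (size_t i = 0; i < nfrags; i++) {
        if (frags[i].has_fixed_offset)
            offset = frags[i].fixed_offset;  /* may move backward */
        if (frags[i].len > cap || offset > cap - frags[i].len)
            return (size_t)-1;
        if (frags[i].len != 0)
            memcpy(out + offset, frags[i].data, frags[i].len);
        offset += frags[i].len;
        if (offset > size)
            size = offset;
    }
    return size;
}
```

With a four-byte frag followed by a one-byte frag pinned at offset 0, the section stays four bytes long and only its first byte changes. The model sidesteps the hard part Ian raises: here every offset is a known constant, whereas a real `.org array + 10` would need expression resolution.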
* RE: Compiler uses a lot of memory for large initialized arrays 2004-12-02 17:29 ` Peter Barada 2004-12-02 17:39 ` Mark Mitchell 2004-12-02 17:42 ` Ian Lance Taylor @ 2004-12-02 17:49 ` Dave Korn 2 siblings, 0 replies; 15+ messages in thread From: Dave Korn @ 2004-12-02 17:49 UTC (permalink / raw) To: 'Peter Barada', nathan Cc: richard.guenther, stevenb, joseph, ian, mark, gcc > -----Original Message----- > From: gcc-owner On Behalf Of Peter Barada > Sent: 02 December 2004 17:30 > So I have to ask how many GCC targets *don't* use GAS? If that is > very few(to none), then perhaps the best solution is to look into > modifying GAS to allow backward setting of .org, or is that an even > bigger problem? Well, it is an even more bletch-worthy hack than zapping backwards and forwards in the assembler output overwriting stuff. Wash your mouth out for even suggesting such a dirty idea! Plus, gcc is very much supposed to interoperate with a target's native binutils-equivalents. I quite regularly read posts from people who use it with Sun's native as/ld on solaris, for example. cheers, DaveK -- Can't think of a witty .sigline today.... ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Compiler uses a lot of memory for large initialized arrays 2004-12-02 17:12 ` Richard Guenther 2004-12-02 17:17 ` Nathan Sidwell @ 2004-12-02 17:36 ` Andreas Schwab 1 sibling, 0 replies; 15+ messages in thread From: Andreas Schwab @ 2004-12-02 17:36 UTC (permalink / raw) To: Richard Guenther Cc: Steven Bosscher, Joseph S. Myers, Ian Lance Taylor, mark, gcc Richard Guenther <richard.guenther@gmail.com> writes: > Or can't we seek in the asm output and overwrite previously written values? Not with -pipe. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Compiler uses a lot of memory for large initialized arrays 2004-12-02 16:34 ` Joseph S. Myers 2004-12-02 17:03 ` Ian Lance Taylor 2004-12-02 17:05 ` Steven Bosscher @ 2004-12-02 18:18 ` Joe Buck 2004-12-02 18:23 ` Mark Mitchell 2 siblings, 1 reply; 15+ messages in thread From: Joe Buck @ 2004-12-02 18:18 UTC (permalink / raw) To: Joseph S. Myers; +Cc: Ian Lance Taylor, mark, gcc On Thu, Dec 02, 2004 at 04:34:35PM +0000, Joseph S. Myers wrote: > On Thu, 2 Dec 2004, Ian Lance Taylor wrote: > > > This test case works fine with 2.95.3. Looking into the patches I > > came across this note from you from four years ago: > > http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00937.html > > And hopefully you came across the November part of the thread as well: C99 > designated initializers allow > > int i[10000000] = { 0, 1, 2, 3, 4, 5, /* ... */ > 9999998, 9999999, [0] = -1 }; > > which stops optimizing in the simplest way by writing out initializers to > the assembler output before the whole initializer has been parsed. Here's an ugly hack: write large initializers to a temporary file (say, when the number of elements in the initializer reaches some threshold, like 500). If we successfully reach the end of the initializer without anything evil like the above case, then append the temp file to the assembly output. Otherwise, go back and do it in memory. ^ permalink raw reply [flat|nested] 15+ messages in thread
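[Illustration] Joe's temporary-file hack could look roughly like the sketch below. Everything in it is an assumption made for illustration: `struct init_elt`, `stream_initializer`, and the use of a `.long` directive are hypothetical, the 500-element threshold is left out, and any out-of-order designator is conservatively treated as the "evil" case. Elements are streamed to a temporary file; if the initializer ends without revisiting an element, the file is appended to the assembly output, otherwise the caller is told to fall back to the in-memory path.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical parsed element: the position being initialized and its value. */
struct init_elt {
    size_t index;
    int value;
};

/* Returns 1 if the initializer was streamed to asm_out, 0 if the caller
   must fall back to building it in memory, -1 on an I/O error. */
int stream_initializer(FILE *asm_out, const struct init_elt *elts, size_t n)
{
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return -1;

    size_t next = 0;  /* index expected if elements arrive in order */
    for (size_t i = 0; i < n; i++) {
        if (elts[i].index != next) {
            /* A designator jumped elsewhere: give up on streaming. */
            fclose(tmp);
            return 0;
        }
        fprintf(tmp, "\t.long\t%d\n", elts[i].value);
        next++;
    }

    /* No surprises: append the temporary file to the real output. */
    rewind(tmp);
    int c;
    while ((c = getc(tmp)) != EOF)
        putc(c, asm_out);
    fclose(tmp);
    return 1;
}
```

Nothing reaches `asm_out` in the fallback case, so the in-memory path can start from a clean position.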
* Re: Compiler uses a lot of memory for large initialized arrays 2004-12-02 18:18 ` Joe Buck @ 2004-12-02 18:23 ` Mark Mitchell 0 siblings, 0 replies; 15+ messages in thread From: Mark Mitchell @ 2004-12-02 18:23 UTC (permalink / raw) To: Joe Buck; +Cc: Joseph S. Myers, Ian Lance Taylor, gcc Joe Buck wrote: > If we successfully reach the end of the initializer without anything evil > like the above case, then append the temp file to the assembly output. > Otherwise, go back and do it in memory. I can't fathom why we are considering things like this. Like anything else, we should try to use an efficient representation, which we are not, at present. We should fix that. But why should we go beyond that, jumping through complicated hoops to support initialization of gigabytes of data? We don't jump through these kinds of hoops to support compilation of functions with millions of lines of code, or translation units with millions of functions. I think it's perfectly reasonable to run out of memory processing a truly huge array -- provided that the compiler is being sensible about its internal representation. -- Mark Mitchell CodeSourcery, LLC mark@codesourcery.com (916) 791-8304 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Compiler uses a lot of memory for large initialized arrays 2004-12-02 16:08 Compiler uses a lot of memory for large initialized arrays Ian Lance Taylor 2004-12-02 16:34 ` Joseph S. Myers @ 2004-12-02 17:31 ` Mark Mitchell 2004-12-02 17:33 ` Giovanni Bajo 2 siblings, 0 replies; 15+ messages in thread From: Mark Mitchell @ 2004-12-02 17:31 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: gcc Ian Lance Taylor wrote: > A customer just handed me a test case in which the compiler uses way > too much memory when compiling a large array initialization. This > turns out to be PR 12245, which previously didn't have a test case > attached to it. > > This test case works fine with 2.95.3. Looking into the patches I > came across this note from you from four years ago: > http://gcc.gnu.org/ml/gcc-patches/2000-10/msg00937.html > > So, I'm reminding you about your four-year-old promise to address this > problem if it arose. Fair enough. -- Mark Mitchell CodeSourcery, LLC mark@codesourcery.com (916) 791-8304 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Compiler uses a lot of memory for large initialized arrays 2004-12-02 16:08 Compiler uses a lot of memory for large initialized arrays Ian Lance Taylor 2004-12-02 16:34 ` Joseph S. Myers 2004-12-02 17:31 ` Mark Mitchell @ 2004-12-02 17:33 ` Giovanni Bajo 2 siblings, 0 replies; 15+ messages in thread From: Giovanni Bajo @ 2004-12-02 17:33 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: gcc Ian Lance Taylor <ian@wasabisystems.com> wrote: > A customer just handed me a test case in which the compiler uses way > too much memory when compiling a large array initialization. This > turns out to be PR 12245, which previously didn't have a test case > attached to it. Check also PR 14179, which is the same issue for C++. The testcase has 4 million initializers. We have almost fixed that testcase (that is: we do much better than a few months ago), and I have been meaning to post the final patch (to process_init_constructor) for a long time now. With that patch applied, we can compile the testcase with about 220 MB of RAM. There are other optimizations that can be done here and there. For instance, the CONSTRUCTOR_ELTS could be using Nathan's Vec where feasible, instead of TREE_LISTs. -- Giovanni Bajo ^ permalink raw reply [flat|nested] 15+ messages in thread