public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug middle-end/39157]  New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
@ 2009-02-11 21:42 lucier at math dot purdue dot edu
  2009-02-12  9:58 ` [Bug middle-end/39157] " rguenth at gcc dot gnu dot org
                   ` (23 more replies)
  0 siblings, 24 replies; 26+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-11 21:42 UTC (permalink / raw)
  To: gcc-bugs

With this compiler

[lucier@descartes gambit]$ gcc -v
Using built-in specs.
Target: powerpc64-unknown-linux-gnu
Configured with: ../../gcc-4.3.3/configure --prefix=/pkgs/gcc-4.3.3
--enable-languages=c --with-cpu=default64
Thread model: posix
gcc version 4.3.3 (GCC) 

with the file compiler.i found here:

http://www.math.purdue.edu/~lucier/bugzilla/8/

attempting to compile with these options:

gcc -m64 -mcpu=970 -Wall -W -Wno-unused -O1 -fno-math-errno -fschedule-insns2
-fno-trapping-math -fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC
-fno-common -rdynamic -shared

can't compile in 8GB of RAM.  With this compiler:

euler-77% /pkgs/gcc-4.2.3/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../../gcc-4.2.3/configure --prefix=/pkgs/gcc-4.2.3
--enable-checking=release --with-gmp=/pkgs/gmp-4.2.2
--with-mpfr=/pkgs/gmp-4.2.2
Thread model: posix
gcc version 4.2.3

and these options:

gcc -Wall -W -Wno-unused -O1 -fno-math-errno -fschedule-insns2
-fno-trapping-math -fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC
-fno-common -mieee-fp -rdynamic -shared

it can't compiler in 20GB of RAM.  (That machine has only 16GB of RAM, so I
killed the compile when it hit 20GB of physical+virtual memory.)

It compiles just fine in about 1GB of RAM with 

euler-76% gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../configure --prefix=/pkgs/gcc-4.1.2
Thread model: posix
gcc version 4.1.2

compiler.i is the output from the Gambit Scheme->C compiler; the source scheme
program is from a standard benchmark suite for Scheme compilers.  So I found
this by trying to change the code generator for Gambit and running the
benchmark suite on x86_64.

I don't know how this can be "fixed".  Basically, the entire middle-end
infrastructure since 4.1.* is telling people like me with computer-generated
code like this to just go away (to put it very politely).  On Mac OS X 10.5.*,
Apple bundles their version of 4.0.1, which compiles this just fine; on Red Hat
5.2, they bundle their version of 4.1.2 (I think, my RH5.2 box is down at the
moment), which compiles this just fine; but on Ubuntu 8.10 or Fedora 10 you
can't compile this because they bundle newer compilers.  (I guess I'll see if I
can install 4.1.* on both of these.)

As a stopgap measure, perhaps someone can tell me what optimization level to
use.  As you can see, I use -O1 and a few others (mainly -fschedule-insns2). 
gcc 4.1.* and earlier compiled something like this just fine, but -O1 must mean
something different now.


-- 
           Summary: Code that compiles fine in 1GB of memory with 4.1.2
                    requires > 20GB in 4.2.* and higher
           Product: gcc
           Version: 4.3.3
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: lucier at math dot purdue dot edu
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
@ 2009-02-12  9:58 ` rguenth at gcc dot gnu dot org
  2009-02-12 10:41 ` jakub at gcc dot gnu dot org
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-02-12  9:58 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from rguenth at gcc dot gnu dot org  2009-02-12 09:58 -------
Is this the same as PR26854?  Note that it would be extremely helpful to have
a testcase that fits into less memory to be able to analyze this.  As you say
it is autogenerated source, can you shrink the testcase?

Thanks.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
  2009-02-12  9:58 ` [Bug middle-end/39157] " rguenth at gcc dot gnu dot org
@ 2009-02-12 10:41 ` jakub at gcc dot gnu dot org
  2009-02-12 10:46 ` rguenth at gcc dot gnu dot org
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2009-02-12 10:41 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from jakub at gcc dot gnu dot org  2009-02-12 10:41 -------
The peak on x86_64 (with normal checking cc1) seems to be around 29.3GB, but
most of the time top numbers were in the .5GB - 1GB or at most 1.3GB range,
with just a signle very high peak.  Will now look closer where all the memory
is needed.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
  2009-02-12  9:58 ` [Bug middle-end/39157] " rguenth at gcc dot gnu dot org
  2009-02-12 10:41 ` jakub at gcc dot gnu dot org
@ 2009-02-12 10:46 ` rguenth at gcc dot gnu dot org
  2009-02-12 11:26 ` steven at gcc dot gnu dot org
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-02-12 10:46 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from rguenth at gcc dot gnu dot org  2009-02-12 10:46 -------
If it is like PR26854 then it is all df memory.  The gambit stuff consists of
a state machine with loads of computed gotos (thus a nearly fully connected
CFG) IIRC.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (2 preceding siblings ...)
  2009-02-12 10:46 ` rguenth at gcc dot gnu dot org
@ 2009-02-12 11:26 ` steven at gcc dot gnu dot org
  2009-02-12 11:40 ` ebotcazou at gcc dot gnu dot org
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-02-12 11:26 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from steven at gcc dot gnu dot org  2009-02-12 11:26 -------
Even for code with lots of computed gotos, the CFG should not be fully
connected. We factorize computed gotos to avoid exactly that.  At least we used
to.  Maybe the factorizing is broken, or it is undone somewhere too early?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (3 preceding siblings ...)
  2009-02-12 11:26 ` steven at gcc dot gnu dot org
@ 2009-02-12 11:40 ` ebotcazou at gcc dot gnu dot org
  2009-02-12 12:01 ` steven at gcc dot gnu dot org
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: ebotcazou at gcc dot gnu dot org @ 2009-02-12 11:40 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from ebotcazou at gcc dot gnu dot org  2009-02-12 11:40 -------
> Even for code with lots of computed gotos, the CFG should not be fully
> connected. We factorize computed gotos to avoid exactly that.  At least we used
> to.  Maybe the factorizing is broken, or it is undone somewhere too early?

If I'm not mistaken the factoring is undone during RTL expansion.


-- 

ebotcazou at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebotcazou at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (4 preceding siblings ...)
  2009-02-12 11:40 ` ebotcazou at gcc dot gnu dot org
@ 2009-02-12 12:01 ` steven at gcc dot gnu dot org
  2009-02-12 12:16 ` jakub at gcc dot gnu dot org
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-02-12 12:01 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from steven at gcc dot gnu dot org  2009-02-12 12:01 -------
In the past, we did not unfactor them (see e.g. Bug 15242).


-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-02-12 12:01:30
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (5 preceding siblings ...)
  2009-02-12 12:01 ` steven at gcc dot gnu dot org
@ 2009-02-12 12:16 ` jakub at gcc dot gnu dot org
  2009-02-12 12:49 ` steven at gcc dot gnu dot org
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2009-02-12 12:16 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from jakub at gcc dot gnu dot org  2009-02-12 12:16 -------
Adding -fno-move-loop-invariants to the x86_64 mentioned options results in
VIRT memory topping around 1224m (only because of IRA, before RA it never went
above 1GB).  Seems it is really the loop2_invariant pass that ate 28GB of RAM.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (6 preceding siblings ...)
  2009-02-12 12:16 ` jakub at gcc dot gnu dot org
@ 2009-02-12 12:49 ` steven at gcc dot gnu dot org
  2009-02-12 12:59 ` rguenth at gcc dot gnu dot org
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-02-12 12:49 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from steven at gcc dot gnu dot org  2009-02-12 12:49 -------
If there is a test case that compiles in less than 4GB, I'll take this bug (I
have no access to machines with more memory than that ;-)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (7 preceding siblings ...)
  2009-02-12 12:49 ` steven at gcc dot gnu dot org
@ 2009-02-12 12:59 ` rguenth at gcc dot gnu dot org
  2009-02-12 13:16 ` jakub at gcc dot gnu dot org
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-02-12 12:59 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from rguenth at gcc dot gnu dot org  2009-02-12 12:58 -------
PR26854 is "fixed" as well with -fno-move-loop-invariants.  It has a little
less peak memory requirement than the testcase here.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (8 preceding siblings ...)
  2009-02-12 12:59 ` rguenth at gcc dot gnu dot org
@ 2009-02-12 13:16 ` jakub at gcc dot gnu dot org
  2009-02-12 14:21 ` jakub at gcc dot gnu dot org
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2009-02-12 13:16 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from jakub at gcc dot gnu dot org  2009-02-12 13:16 -------
Most of the memory is allocated in df_chain->block_pool:
p *df->problems_by_index[4]->block_pool
$37 = {name = 0xbc3114 "df_chain_block pool", id = 475, elts_per_block = 50,
returned_free_list = 0x0, 
  virgin_free_list = 0x72ce7cae8 "", virgin_elts_remaining = 18, elts_allocated
= 775734800, elts_free = 18, 
  blocks_allocated = 15514696, block_list = 0x72ce7c6e0, block_size = 1608,
elt_size = 32}
p *cfun->cfg
$39 = {x_entry_block_ptr = 0x7ffff0cb11e0, x_exit_block_ptr = 0x7ffff0cb1240,
x_basic_block_info = 0x7ffff283f000, 
  x_n_basic_blocks = 76672, x_n_edges = 135946, x_last_basic_block = 76676,
x_label_to_block_map = 0x0, 
  x_profile_status = PROFILE_GUESSED, x_dom_computed = {DOM_NO_FAST_QUERY,
DOM_NONE}, x_n_bbs_in_dom_tree = {76672, 0}, 
  max_jumptable_ents = 0, last_label_uid = 97987}


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (9 preceding siblings ...)
  2009-02-12 13:16 ` jakub at gcc dot gnu dot org
@ 2009-02-12 14:21 ` jakub at gcc dot gnu dot org
  2009-02-12 14:23 ` rguenther at suse dot de
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2009-02-12 14:21 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from jakub at gcc dot gnu dot org  2009-02-12 14:21 -------
Created an attachment (id=17288)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17288&action=view)
gcc44-pr39157.patch

Quick hack to avoid loop invariant motion from excessively large loops at -O1.
With this cc1 still tops at 1.2GB, but loop invariants are still moved out of
reasonably sized loops.  For the huge ones I wonder how much it buys in anyway,
there it increases register pressure a lot.  10000 is pretty arbitrary, I
wonder if we want a param or something.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (10 preceding siblings ...)
  2009-02-12 14:21 ` jakub at gcc dot gnu dot org
@ 2009-02-12 14:23 ` rguenther at suse dot de
  2009-02-12 15:47 ` rguenth at gcc dot gnu dot org
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2009-02-12 14:23 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from rguenther at suse dot de  2009-02-12 14:23 -------
Subject: Re:  Code that compiles fine in 1GB of memory
 with 4.1.2 requires > 20GB in 4.2.* and higher

On Thu, 12 Feb 2009, jakub at gcc dot gnu dot org wrote:

> ------- Comment #11 from jakub at gcc dot gnu dot org  2009-02-12 14:21 -------
> Created an attachment (id=17288)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17288&action=view)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17288&action=view)
> gcc44-pr39157.patch
> 
> Quick hack to avoid loop invariant motion from excessively large loops at -O1.
> With this cc1 still tops at 1.2GB, but loop invariants are still moved out of
> reasonably sized loops.  For the huge ones I wonder how much it buys in anyway,
> there it increases register pressure a lot.  10000 is pretty arbitrary, I
> wonder if we want a param or something.

I'd say for -O1 disable it completely, for -O2 and up use the
heuristic and add a param for it.

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (11 preceding siblings ...)
  2009-02-12 14:23 ` rguenther at suse dot de
@ 2009-02-12 15:47 ` rguenth at gcc dot gnu dot org
  2009-02-12 15:57 ` jakub at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-02-12 15:47 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from rguenth at gcc dot gnu dot org  2009-02-12 15:47 -------
Zdenek, in this case (and PR26854) can we make sure not to recognize loops
that involve the single non-local goto BB?  Maybe this would solve the
problem as well.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rakdver at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (12 preceding siblings ...)
  2009-02-12 15:47 ` rguenth at gcc dot gnu dot org
@ 2009-02-12 15:57 ` jakub at gcc dot gnu dot org
  2009-02-12 16:35 ` lucier at math dot purdue dot edu
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2009-02-12 15:57 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #14 from jakub at gcc dot gnu dot org  2009-02-12 15:57 -------
Created an attachment (id=17291)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17291&action=view)
gcc44-pr39157.patch

Patch to add loop-invariant-max-bbs-in-loop parameter.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (13 preceding siblings ...)
  2009-02-12 15:57 ` jakub at gcc dot gnu dot org
@ 2009-02-12 16:35 ` lucier at math dot purdue dot edu
  2009-02-12 16:52 ` rguenth at gcc dot gnu dot org
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-12 16:35 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #15 from lucier at math dot purdue dot edu  2009-02-12 16:35 -------
Some comments (a lot went on while I was sleeping):

1.  Yes, this is similar to the test case of PR26854, but the C code generator
has changed significantly since that test case was filed.  I don't know if the
changes in the code generator really affect what's happening here, however.

2.  I'm trying to get a "moderately" sized test case that will compile in about
3GB of RAM, as Steven requested.  (The test case from PR26854 takes at least
7GB of RAM to compile on ppc64.)

3.  If the amount of memory and cpu time required by the test case at -O1
doesn't increase significantly when loop-invariant motion is performed on loops
of size up to 10,000, then it would be good if the parameter at -O1 could be
10,000 instead of 100 (or at least larger than 100), as is suggested in the
most recent patch.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (14 preceding siblings ...)
  2009-02-12 16:35 ` lucier at math dot purdue dot edu
@ 2009-02-12 16:52 ` rguenth at gcc dot gnu dot org
  2009-02-12 18:29 ` rakdver at kam dot mff dot cuni dot cz
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-02-12 16:52 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #16 from rguenth at gcc dot gnu dot org  2009-02-12 16:52 -------
Actually for PR26854 it is just one loop that is detected, covering all of
the function (with approx. 56000 basic blocks and one basic-block that has
edges to all other basic blocks in the loop).  So the default for the param
could indeed be 10000 here.  And possibly it's not really the large number
of blocks but that the in/out-degree of that single basic-block with the
computed goto.

Thus, I'd suggest again to not recognize such loops or maybe mark them
specially?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (15 preceding siblings ...)
  2009-02-12 16:52 ` rguenth at gcc dot gnu dot org
@ 2009-02-12 18:29 ` rakdver at kam dot mff dot cuni dot cz
  2009-02-12 19:55 ` lucier at math dot purdue dot edu
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: rakdver at kam dot mff dot cuni dot cz @ 2009-02-12 18:29 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #17 from rakdver at kam dot mff dot cuni dot cz  2009-02-12 18:29 -------
Subject: Re:  Code that compiles fine in 1GB of memory with 4.1.2 requires >
20GB in 4.2.* and higher

> Zdenek, in this case (and PR26854) can we make sure not to recognize loops
> that involve the single non-local goto BB?  Maybe this would solve the
> problem as well.

maybe; however, I think it would even make sense to add an option to the
loop analysis not to detect loops over certain size at all.  Most of the
loop optimization passes become senseless on loops with thousands of
basic blocks, just wasting the resources.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (16 preceding siblings ...)
  2009-02-12 18:29 ` rakdver at kam dot mff dot cuni dot cz
@ 2009-02-12 19:55 ` lucier at math dot purdue dot edu
  2009-02-12 20:51 ` lucier at math dot purdue dot edu
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-12 19:55 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #18 from lucier at math dot purdue dot edu  2009-02-12 19:54 -------
There is now a file slatex.i at

http://www.math.purdue.edu/~lucier/bugzilla/8/

that compiles in about 650MB of memory with gcc-4.2.3 on x86-64 with the same
options; I don't know if that will help Steven.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (17 preceding siblings ...)
  2009-02-12 19:55 ` lucier at math dot purdue dot edu
@ 2009-02-12 20:51 ` lucier at math dot purdue dot edu
  2009-02-13  7:44 ` steven at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: lucier at math dot purdue dot edu @ 2009-02-12 20:51 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #19 from lucier at math dot purdue dot edu  2009-02-12 20:51 -------
Subject: Re:  Code that compiles fine in 1GB of
 memory with 4.1.2 requires > 20GB in 4.2.* and higher

On Thu, 2009-02-12 at 16:52 +0000, rguenth at gcc dot gnu dot org wrote:
> ------- Comment #16 from rguenth at gcc dot gnu dot org  2009-02-12 16:52 -------
> Actually for PR26854 it is just one loop that is detected, covering all of
> the function (with approx. 56000 basic blocks and one basic-block that has
> edges to all other basic blocks in the loop).

Richard:

I'm wondering if you could look at a smaller file that's generated in a
somewhat different way.

At

http://www.math.purdue.edu/~lucier/bugzilla/8/

there's a file _num.i.gz that I think should have smaller (nested) loops
than the entire file, for example, from the label

___L189__23__23_bignum_2e__2a_:

at line 50031 to just before label

___L190__23__23_bignum_2e__2a_:

at line 50105.

Moving loop invariants out of this loop might help if it detected as a
loop, but I don't know how to check whether it is.

Perhaps you might check and report whether this small loop is treated as
a loop or whether, again, the entire function is the only "loop"
detected.

Brad


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (18 preceding siblings ...)
  2009-02-12 20:51 ` lucier at math dot purdue dot edu
@ 2009-02-13  7:44 ` steven at gcc dot gnu dot org
  2009-02-13  9:17 ` jakub at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-02-13  7:44 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #20 from steven at gcc dot gnu dot org  2009-02-13 07:44 -------
Re: "Moving loop invariants out of this loop might help if it detected as a
loop, but I don't know how to check whether it is." (Comment #19):

It's not like there will not be any loop invariant code motion (LICM) at all
anymore if the RTL LICM pass is disabled.  There is an LICM pass on GIMPLE, and
there is also PRE for GIMPLE (and lazy code motion for RTL but I think it
disables itself for your test case).

The RTL LICM pass mostly cleans up after expand, i.e. moves things that are not
exposed in GIMPLE. This is mostly just address calculations.

I think Zdenek is right in Comment #17.  LICM on big loops probably does more
harm than good.  Still, randomly disabling RTL LICM on big loops, when we do
nothing about the other LICM-like passes, looks like a hack to me and not like
a permanent solution.


There are a couple more ways to fix the problem (ISTR I've listed them before
somewhere...):

1. Stop using the DF UD chains and build them locally only.  This is not
something I would like much, but clearly there is something with the DF chains
that consumes too many resources.

2. Figure out why *only* RTL LICM has a big problem with the DF chains for this
test case.  Why does fwprop work?  Does it shut down itself somehow?  Is fwprop
asking only for UD chains for pseudos and LICM also for hardregs, perhaps?

3. Make RTL FUD chains work.  Shame on me for being unable so far to make that
work.  Not 4.4 material.

4. Poor-man's-FUD: For CFGs with many basic blocks and a large join-split point
(like the block that holds the factored computed goto), add a no-op move for
all live registers for which a valid reg-reg move instruction is available.
This effectively splits the UD chains, and the no-op moves would be cleaned up
later (during RA, or maybe earlier).


Just to see if it works, I want to give the poor-man's-FUD option a try with
Brad's smaller test case.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (19 preceding siblings ...)
  2009-02-13  7:44 ` steven at gcc dot gnu dot org
@ 2009-02-13  9:17 ` jakub at gcc dot gnu dot org
  2009-02-13 11:06 ` rguenther at suse dot de
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2009-02-13  9:17 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #21 from jakub at gcc dot gnu dot org  2009-02-13 09:17 -------
To answer 2), I bet fwprop would suffer the same problem, but fwprop is
disabled at -O1, LICM is not.  I can try it today.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (20 preceding siblings ...)
  2009-02-13  9:17 ` jakub at gcc dot gnu dot org
@ 2009-02-13 11:06 ` rguenther at suse dot de
  2009-02-13 11:08 ` rguenth at gcc dot gnu dot org
  2009-02-20 12:56 ` jakub at gcc dot gnu dot org
  23 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2009-02-13 11:06 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #22 from rguenther at suse dot de  2009-02-13 11:06 -------
Subject: Re:  Code that compiles fine in 1GB of memory
 with 4.1.2 requires > 20GB in 4.2.* and higher

On Thu, 12 Feb 2009, lucier at math dot purdue dot edu wrote:

> ------- Comment #19 from lucier at math dot purdue dot edu  2009-02-12 20:51 -------
> Subject: Re:  Code that compiles fine in 1GB of
>  memory with 4.1.2 requires > 20GB in 4.2.* and higher
> 
> On Thu, 2009-02-12 at 16:52 +0000, rguenth at gcc dot gnu dot org wrote:
> > ------- Comment #16 from rguenth at gcc dot gnu dot org  2009-02-12 16:52 -------
> > Actually for PR26854 it is just one loop that is detected, covering all of
> > the function (with approx. 56000 basic blocks and one basic-block that has
> > edges to all other basic blocks in the loop).
> 
> Richard:
> 
> I'm wondering if you could look at a smaller file that's generated in a
> somewhat different way.
> 
> At
> 
> http://www.math.purdue.edu/~lucier/bugzilla/8/
> 
> there's a file _num.i.gz that I think should have smaller (nested) loops
> than the entire file, for example, from the label
> 
> ___L189__23__23_bignum_2e__2a_:
> 
> at line 50031 to just before label
> 
> ___L190__23__23_bignum_2e__2a_:
> 
> at line 50105.
> 
> Moving loop invariants out of this loop might help if it detected as a
> loop, but I don't know how to check whether it is.
> 
> Perhaps you might check and report whether this small loop is treated as
> a loop or whether, again, the entire function is the only "loop"
> detected.

Yes, there are several small loops detected for this testcase (139 in
total, including one large with the computed goto).

Btw, thanks for the smaller testcase.

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (21 preceding siblings ...)
  2009-02-13 11:06 ` rguenther at suse dot de
@ 2009-02-13 11:08 ` rguenth at gcc dot gnu dot org
  2009-02-20 12:56 ` jakub at gcc dot gnu dot org
  23 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-02-13 11:08 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #23 from rguenth at gcc dot gnu dot org  2009-02-13 11:08 -------
Lemme close this bug as a dup of the one marked as regression.

*** This bug has been marked as a duplicate of 26854 ***


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |DUPLICATE


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
  2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
                   ` (22 preceding siblings ...)
  2009-02-13 11:08 ` rguenth at gcc dot gnu dot org
@ 2009-02-20 12:56 ` jakub at gcc dot gnu dot org
  23 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2009-02-20 12:56 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #24 from jakub at gcc dot gnu dot org  2009-02-20 12:56 -------
Subject: Bug 39157

Author: jakub
Date: Fri Feb 20 12:56:01 2009
New Revision: 144320

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144320
Log:
        PR middle-end/39157
        * Makefile.in (loop-invariant.o): Depend on $(PARAMS_H).
        * params.h (LOOP_INVARIANT_MAX_BBS_IN_LOOP): Define.
        * params.def (loop-invariant-max-bbs-in-loop): New parameter.
        * opts.c (decode_options): Set loop-invariant-max-bbs-in-loop
        parameter to 1000 for -O1 by default.
        * doc/invoke.texi (loop-invariant-max-bbs-in-loop): Document new
        parameter.
        * loop-invariant.c: Include params.h.
        (move_loop_invariants): Don't call move_single_loop_invariants on
        very large loops.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/Makefile.in
    trunk/gcc/doc/invoke.texi
    trunk/gcc/loop-invariant.c
    trunk/gcc/opts.c
    trunk/gcc/params.def
    trunk/gcc/params.h


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug middle-end/39157] Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher
       [not found] <bug-39157-4@http.gcc.gnu.org/bugzilla/>
@ 2012-11-05 22:02 ` steven at gcc dot gnu.org
  0 siblings, 0 replies; 26+ messages in thread
From: steven at gcc dot gnu.org @ 2012-11-05 22:02 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39157

--- Comment #25 from Steven Bosscher <steven at gcc dot gnu.org> 2012-11-05 22:02:18 UTC ---
This problem has been fixed in DF with the DF_RD_PRUNE_DEAD_DEFS flag.
I see no good reason to deprecate the param, though.  For such a huge
loop, any loop invariant code motion is unlikely to be a win.


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2012-11-05 22:02 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-11 21:42 [Bug middle-end/39157] New: Code that compiles fine in 1GB of memory with 4.1.2 requires > 20GB in 4.2.* and higher lucier at math dot purdue dot edu
2009-02-12  9:58 ` [Bug middle-end/39157] " rguenth at gcc dot gnu dot org
2009-02-12 10:41 ` jakub at gcc dot gnu dot org
2009-02-12 10:46 ` rguenth at gcc dot gnu dot org
2009-02-12 11:26 ` steven at gcc dot gnu dot org
2009-02-12 11:40 ` ebotcazou at gcc dot gnu dot org
2009-02-12 12:01 ` steven at gcc dot gnu dot org
2009-02-12 12:16 ` jakub at gcc dot gnu dot org
2009-02-12 12:49 ` steven at gcc dot gnu dot org
2009-02-12 12:59 ` rguenth at gcc dot gnu dot org
2009-02-12 13:16 ` jakub at gcc dot gnu dot org
2009-02-12 14:21 ` jakub at gcc dot gnu dot org
2009-02-12 14:23 ` rguenther at suse dot de
2009-02-12 15:47 ` rguenth at gcc dot gnu dot org
2009-02-12 15:57 ` jakub at gcc dot gnu dot org
2009-02-12 16:35 ` lucier at math dot purdue dot edu
2009-02-12 16:52 ` rguenth at gcc dot gnu dot org
2009-02-12 18:29 ` rakdver at kam dot mff dot cuni dot cz
2009-02-12 19:55 ` lucier at math dot purdue dot edu
2009-02-12 20:51 ` lucier at math dot purdue dot edu
2009-02-13  7:44 ` steven at gcc dot gnu dot org
2009-02-13  9:17 ` jakub at gcc dot gnu dot org
2009-02-13 11:06 ` rguenther at suse dot de
2009-02-13 11:08 ` rguenth at gcc dot gnu dot org
2009-02-20 12:56 ` jakub at gcc dot gnu dot org
     [not found] <bug-39157-4@http.gcc.gnu.org/bugzilla/>
2012-11-05 22:02 ` steven at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).