public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
       [not found] <bug-28071-4@http.gcc.gnu.org/bugzilla/>
@ 2023-07-28  8:40 ` cvs-commit at gcc dot gnu.org
  0 siblings, 0 replies; 20+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-07-28  8:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071

--- Comment #71 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:095eb138f736d94dabf9a07a6671bd351be0e66a

commit r14-2851-g095eb138f736d94dabf9a07a6671bd351be0e66a
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Fri Jul 28 09:39:46 2023 +0100

    PR rtl-optimization/110587: Reduce useless moves in compile-time hog.

    This patch is one of a series of fixes for PR rtl-optimization/110587,
    a compile-time regression with -O0, that attempts to address the underlying
    cause.  As noted previously, the pathological test case pr28071.c contains
    a large number of useless register-to-register moves that can produce
    quadratic behaviour (in LRA).  These moves are generated during RTL
    expansion in emit_group_load_1, where the middle-end attempts to simplify
    the source before calling extract_bit_field.  This is reasonable if the
    source is a complex expression (from before the tree-ssa optimizers), or
    a SUBREG, or a hard register, but it's not particularly useful to copy
    a pseudo register into a new pseudo register.  This patch eliminates that
    redundancy.

    The -fdump-tree-expand for pr28071.c compiled with -O0 currently contains
    777K lines, with this patch it contains 717K lines, i.e. saving about 60K
    lines (admittedly of debugging text output, but it makes the point).

    2023-07-28  Roger Sayle  <roger@nextmovesoftware.com>
                Richard Biener  <rguenther@suse.de>

    gcc/ChangeLog
            PR middle-end/28071
            PR rtl-optimization/110587
            * expr.cc (emit_group_load_1): Simplify logic for calling
            force_reg on ORIG_SRC, to avoid making a copy if the source
            is already in a pseudo register.
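
The decision the commit message describes can be modelled in a few lines of C.
This is a toy sketch, not the code in expr.cc: the enum src_kind and the
function needs_fresh_pseudo are hypothetical names standing in for GCC's rtx
predicates. The only thing taken from the message is which kinds of source are
worth copying before extract_bit_field is called.

```c
#include <stdbool.h>

/* Hypothetical classification of the source operand; the real code
   inspects an rtx instead.  */
enum src_kind
{
  SRC_COMPLEX_EXPR,  /* an expression that still needs evaluating */
  SRC_SUBREG,        /* a SUBREG of some register */
  SRC_HARD_REG,      /* a hard (physical) register */
  SRC_PSEUDO_REG     /* already a pseudo register */
};

/* Return true if the source should be copied into a new pseudo before
   bit-field extraction.  A pseudo is the one case where the copy would
   only be a redundant register-to-register move.  */
static bool
needs_fresh_pseudo (enum src_kind kind)
{
  return kind != SRC_PSEUDO_REG;
}
```

Under this model only the pseudo-register case skips the copy, which is the
redundancy the patch is said to remove.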

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
                   ` (16 preceding siblings ...)
  2007-10-09 19:25 ` mmitchel at gcc dot gnu dot org
@ 2007-11-03  8:07 ` ebotcazou at gcc dot gnu dot org
  17 siblings, 0 replies; 20+ messages in thread
From: ebotcazou at gcc dot gnu dot org @ 2007-11-03  8:07 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #70 from ebotcazou at gcc dot gnu dot org  2007-11-03 08:07 -------
> Audit trail shows that this isn't a problem with 4.2.  Target -> 4.1.3?

Yes, this has been fixed in the 4.2 series according to comment #54.


-- 

ebotcazou at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|4.2.3                       |4.2.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
                   ` (15 preceding siblings ...)
  2007-07-20  3:47 ` mmitchel at gcc dot gnu dot org
@ 2007-10-09 19:25 ` mmitchel at gcc dot gnu dot org
  2007-11-03  8:07 ` ebotcazou at gcc dot gnu dot org
  17 siblings, 0 replies; 20+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2007-10-09 19:25 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #69 from mmitchel at gcc dot gnu dot org  2007-10-09 19:21 -------
Change target milestone to 4.2.3, as 4.2.2 has been released.


-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.2.2                       |4.2.3


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
                   ` (14 preceding siblings ...)
  2007-05-14 21:49 ` fang at csl dot cornell dot edu
@ 2007-07-20  3:47 ` mmitchel at gcc dot gnu dot org
  2007-10-09 19:25 ` mmitchel at gcc dot gnu dot org
  2007-11-03  8:07 ` ebotcazou at gcc dot gnu dot org
  17 siblings, 0 replies; 20+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2007-07-20  3:47 UTC (permalink / raw)
  To: gcc-bugs



-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.2.1                       |4.2.2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
                   ` (13 preceding siblings ...)
  2007-05-14 21:37 ` mmitchel at gcc dot gnu dot org
@ 2007-05-14 21:49 ` fang at csl dot cornell dot edu
  2007-07-20  3:47 ` mmitchel at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: fang at csl dot cornell dot edu @ 2007-05-14 21:49 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #68 from fang at csl dot cornell dot edu  2007-05-14 22:49 -------
Audit trail shows that this isn't a problem with 4.2.  Target -> 4.1.3?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
                   ` (12 preceding siblings ...)
  2007-04-17 18:38 ` hubicka at ucw dot cz
@ 2007-05-14 21:37 ` mmitchel at gcc dot gnu dot org
  2007-05-14 21:49 ` fang at csl dot cornell dot edu
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2007-05-14 21:37 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #67 from mmitchel at gcc dot gnu dot org  2007-05-14 22:25 -------
Will not be fixed in 4.2.0; retargeting at 4.2.1.


-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.2.0                       |4.2.1


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
                   ` (11 preceding siblings ...)
  2007-04-17 18:16 ` hubicka at gcc dot gnu dot org
@ 2007-04-17 18:38 ` hubicka at ucw dot cz
  2007-05-14 21:37 ` mmitchel at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: hubicka at ucw dot cz @ 2007-04-17 18:38 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #66 from hubicka at ucw dot cz  2007-04-17 19:38 -------
Subject: Re:  [4.1 regression] A file that can not be compiled in reasonable
time/space

Just to add some explanation to the numbers: df_scan_ref_pool is 50MB and
the bitmaps quoted are 8MB each.  Given the nature of the testcase, I think
we are doing a satisfactory job at -O2.  At -O3 there are still problems
(at -O2 the testcase has one huge BB, at -O3 we have many BBs).  PRE explodes
completely, and we need over 1.2GB for -O3 -fno-tree-pre -fno-tree-fre.
What is also killing us at -O3 are the bitmaps.
385MB:
df-problems.c:2951 (df_chain_create_bb)    40198  386574160  385195560  385195560     462958
200MB:
df-problems.c:984 (df_rd_alloc)            40198  385290320  208450840          0          0
110MB:
df-problems.c:985 (df_rd_alloc)            40198  201714640  110324160          0          0
tree-ssa-live.c:540 (new_tree_live_info)   31939  114031520  113098360          0      84523
tree-ssa-live.c:536 (new_tree_live_info)   31939  113096920  113092320          0      80895

Honza


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
                   ` (10 preceding siblings ...)
  2007-04-16 15:07 ` mkuvyrkov at gcc dot gnu dot org
@ 2007-04-17 18:16 ` hubicka at gcc dot gnu dot org
  2007-04-17 18:38 ` hubicka at ucw dot cz
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2007-04-17 18:16 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #65 from hubicka at gcc dot gnu dot org  2007-04-17 19:16 -------
I can confirm that at -O2, memory consumption dropped from 0.5GB to 0.28GB,
which is indeed a good improvement.  To summarize
http://www.suse.de/~gcctest/memory/results/200704171438/pr28071-O2.rep

Compile-time wise, the major offenders are:
PRE                   : 259.18 (34%) usr   0.00 ( 0%) sys 259.18 (34%) wall    1421 kB ( 1%) ggc
scheduling 2          : 366.76 (49%) usr   0.00 ( 0%) sys 366.82 (49%) wall    3062 kB ( 1%) ggc

There is a lot of non-GGC memory.  The major allocpool offender is:
df_scan_ref pool          36  130400160   58647984          0

bitmaps:
tree-ssa-pre.c:549 (bitmap_set_new)        95283   14667640    8814400    8798320    9704128
reload1.c:518 (new_insn_chain)             90286    8425760    8425760    8425760        761
tree-ssa-pre.c:548 (bitmap_set_new)        95283   20190640    9860640    9826200    3268384
tree-ssa-structalias.c:879 (add_pred_grap  94816    7585280    7585280    7585280     189632

Thanks,
Honza


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
                   ` (9 preceding siblings ...)
  2007-04-16 15:04 ` mkuvyrkov at gcc dot gnu dot org
@ 2007-04-16 15:07 ` mkuvyrkov at gcc dot gnu dot org
  2007-04-17 18:16 ` hubicka at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: mkuvyrkov at gcc dot gnu dot org @ 2007-04-16 15:07 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #64 from mkuvyrkov at gcc dot gnu dot org  2007-04-16 16:07 -------
(In reply to comment #63)

The scheduler's memory hunger should be fixed by the above commit.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
                   ` (8 preceding siblings ...)
  2007-03-26 15:50 ` bonzini at gnu dot org
@ 2007-04-16 15:04 ` mkuvyrkov at gcc dot gnu dot org
  2007-04-16 15:07 ` mkuvyrkov at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: mkuvyrkov at gcc dot gnu dot org @ 2007-04-16 15:04 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #63 from mkuvyrkov at gcc dot gnu dot org  2007-04-16 16:04 -------
Subject: Bug 28071

Author: mkuvyrkov
Date: Mon Apr 16 16:04:18 2007
New Revision: 123874

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=123874
Log:
PR middle-end/28071
* sched-int.h (struct deps): Split field 'pending_lists_length' into
'pending_read_list_length' and 'pending_write_list_length'.  Update
comment.
* sched-deps.c (add_insn_mem_dependence): Change signature.  Update
to handle two length counters instead of one.  Update all uses.
(flush_pending_lists, sched_analyze_1, init_deps): Update to handle
two length counters instead of one.
* sched-rgn.c (propagate_deps): Update to handle two length counters
instead of one.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/sched-deps.c
    trunk/gcc/sched-int.h
    trunk/gcc/sched-rgn.c
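
As an illustration of the change logged above, the split counter can be
pictured as follows. This is a reduced sketch, not the scheduler's code: the
two field names come from the ChangeLog entry, while struct pending_counts and
the helper functions are hypothetical stand-ins for 'struct deps',
add_insn_mem_dependence and flush_pending_lists.

```c
#include <stdbool.h>

/* Hypothetical, much reduced stand-in for the scheduler's 'struct deps'.  */
struct pending_counts
{
  int pending_read_list_length;   /* entries on the pending read list */
  int pending_write_list_length;  /* entries on the pending write list */
};

/* Start with both lists empty.  */
static void
pending_init (struct pending_counts *d)
{
  d->pending_read_list_length = 0;
  d->pending_write_list_length = 0;
}

/* Record one pending memory reference, bumping only the matching counter.  */
static void
pending_add (struct pending_counts *d, bool is_read)
{
  if (is_read)
    d->pending_read_list_length++;
  else
    d->pending_write_list_length++;
}

/* Flushing empties both lists, so both counters return to zero.  */
static void
pending_flush (struct pending_counts *d)
{
  pending_init (d);
}
```

With separate counters, a caller can apply a length limit to each list on its
own instead of to their sum; whether the real patch uses them exactly this way
is an assumption here.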


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
                   ` (7 preceding siblings ...)
  2007-02-06 22:15 ` hubicka at gcc dot gnu dot org
@ 2007-03-26 15:50 ` bonzini at gnu dot org
  2007-04-16 15:04 ` mkuvyrkov at gcc dot gnu dot org
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: bonzini at gnu dot org @ 2007-03-26 15:50 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #62 from bonzini at gnu dot org  2007-03-26 16:50 -------
The dataflow branch cannot complete this at -O3 -fno-tree-pre -fno-tree-fre.


-- 

bonzini at gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |zadeck at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
                   ` (6 preceding siblings ...)
  2007-02-06 22:05 ` hubicka at gcc dot gnu dot org
@ 2007-02-06 22:15 ` hubicka at gcc dot gnu dot org
  2007-03-26 15:50 ` bonzini at gnu dot org
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2007-02-06 22:15 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #61 from hubicka at gcc dot gnu dot org  2007-02-06 22:14 -------
I also forgot to mention that integration is slow because split_block is
quadratic.

For -O2:
We need 531MB of RAM; GGC memory peaks at 100MB, and a large portion of the
non-GGC memory is definitely the scheduler dependency lists.

Execution times (seconds)
 garbage collection    :  14.26 ( 5%) usr   0.03 ( 1%) sys  14.27 ( 5%) wall       0 kB ( 0%) ggc
 life analysis         :  73.96 (24%) usr   1.55 (46%) sys  75.52 (24%) wall    7207 kB ( 2%) ggc
 alias analysis        :   0.92 ( 0%) usr   0.00 ( 0%) sys   0.87 ( 0%) wall    8530 kB ( 3%) ggc
 inline heuristics     :  11.64 ( 4%) usr   0.12 ( 4%) sys  11.77 ( 4%) wall    2695 kB ( 1%) ggc
 integration           :  16.71 ( 5%) usr   0.19 ( 6%) sys  16.91 ( 5%) wall   69808 kB (21%) ggc
 tree gimplify         :   0.49 ( 0%) usr   0.07 ( 2%) sys   0.58 ( 0%) wall   14977 kB ( 4%) ggc
 tree operand scan     :   1.25 ( 0%) usr   0.11 ( 3%) sys   1.29 ( 0%) wall   20889 kB ( 6%) ggc
 tree SRA              :   1.20 ( 0%) usr   0.07 ( 2%) sys   1.37 ( 0%) wall   40364 kB (12%) ggc
 tree FRE              :   1.14 ( 0%) usr   0.07 ( 2%) sys   1.21 ( 0%) wall    9230 kB ( 3%) ggc
 expand                :   3.29 ( 1%) usr   0.10 ( 3%) sys   3.39 ( 1%) wall   45828 kB (14%) ggc
 PRE                   :  21.54 ( 7%) usr   0.00 ( 0%) sys  21.54 ( 7%) wall     898 kB ( 0%) ggc
 regmove               :  93.59 (30%) usr   0.05 ( 1%) sys  93.64 (30%) wall     156 kB ( 0%) ggc
 local alloc           :   5.34 ( 2%) usr   0.00 ( 0%) sys   5.33 ( 2%) wall    2838 kB ( 1%) ggc
 global alloc          :   4.25 ( 1%) usr   0.06 ( 2%) sys   4.30 ( 1%) wall   19946 kB ( 6%) ggc
 reload CSE regs       :   4.09 ( 1%) usr   0.00 ( 0%) sys   4.11 ( 1%) wall   11354 kB ( 3%) ggc
 scheduling 2          :  16.97 ( 6%) usr   0.44 (13%) sys  17.53 ( 6%) wall   20069 kB ( 6%) ggc
 TOTAL                 : 308.36             3.39           312.58             334207 kB
total: 531915 kB

regmove has the quadratic loop issue that I added a param for earlier in this
audit trail, but the parameter is now apparently a bit too large, since the rest
of the compiler is a lot faster.  Scheduler/out-of-SSA slowness is gone.
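
A common way to tame this kind of quadratic loop is to cap how far each inner
scan may look, with the cap exposed as a tunable parameter. The sketch below is
generic and is not regmove's code: bounded_backward_find and its max_scan
argument are hypothetical names, with max_scan playing the role of the param.

```c
#include <stddef.h>

/* Search backwards from just below index START for KEY, visiting at most
   MAX_SCAN elements.  Returns the index where KEY was found, or -1 if it
   was not found within the allowed window.  */
static long
bounded_backward_find (const int *v, size_t start, int key, size_t max_scan)
{
  size_t i = start;
  size_t visited = 0;

  while (i > 0 && visited < max_scan)
    {
      i--;
      visited++;
      if (v[i] == key)
        return (long) i;
    }
  return -1;
}
```

Calling this once per element costs at most n * max_scan steps instead of n * n,
at the price of missing matches that lie outside the window. That trade-off is
why a cap chosen when the rest of the compiler was slower can later look too
generous.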

There are no overly large bitmaps, one large allocpool:
df_scan_ref pool          18   74449440   67061984          0

Looks like we are in pretty good shape on this one; IMO the only important
problems are the slowness of life analysis (hopefully fixed by DFA) and the
memory hunger of the scheduler.

Honza


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
                   ` (5 preceding siblings ...)
  2007-01-18  9:52 ` hubicka at ucw dot cz
@ 2007-02-06 22:05 ` hubicka at gcc dot gnu dot org
  2007-02-06 22:15 ` hubicka at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2007-02-06 22:05 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #60 from hubicka at gcc dot gnu dot org  2007-02-06 22:05 -------
Hi,
a small update on status.  At -O3 -fno-tree-fre -fno-tree-pre we now have a
1.1GB footprint, 800MB of this out of gimple.  We still explode in FRE/PRE, but
the majority of the other problems were fixed:
Execution times (seconds)
 garbage collection    :  18.23 (12%) usr   0.04 ( 1%) sys  18.46 (10%) wall       0 kB ( 0%) ggc
 callgraph construction:  10.31 ( 7%) usr   0.04 ( 1%) sys  10.36 ( 5%) wall    2296 kB ( 0%) ggc
 life analysis         :   4.08 ( 3%) usr   0.16 ( 3%) sys   4.26 ( 2%) wall    7350 kB ( 2%) ggc
 inline heuristics     :  10.46 ( 7%) usr   0.12 ( 2%) sys  10.57 ( 6%) wall    2438 kB ( 1%) ggc
 integration           :  16.48 (11%) usr   0.46 ( 9%) sys  17.00 ( 9%) wall  143049 kB (29%) ggc
 tree CFG cleanup      :   4.69 ( 3%) usr   0.00 ( 0%) sys   4.69 ( 2%) wall       0 kB ( 0%) ggc
 tree SSA incremental  :   2.32 ( 2%) usr   0.40 ( 8%) sys   2.76 ( 1%) wall    3276 kB ( 1%) ggc
 tree operand scan     :   1.42 ( 1%) usr   0.22 ( 4%) sys   1.54 ( 1%) wall   27071 kB ( 6%) ggc
 dominator optimization:   2.25 ( 2%) usr   0.00 ( 0%) sys   2.24 ( 1%) wall   14657 kB ( 3%) ggc
 tree split crit edges :   0.39 ( 0%) usr   0.00 ( 0%) sys   0.39 ( 0%) wall   17558 kB ( 4%) ggc
 tree SSA to normal    :   8.06 ( 5%) usr   0.40 ( 8%) sys   8.51 ( 4%) wall   22874 kB ( 5%) ggc
 expand                :   3.83 ( 3%) usr   0.69 (14%) sys  38.08 (20%) wall   54312 kB (11%) ggc
 forward prop          :   3.20 ( 2%) usr   0.82 (16%) sys   4.22 ( 2%) wall    2470 kB ( 1%) ggc
 if-conversion         :   6.37 ( 4%) usr   0.00 ( 0%) sys   6.41 ( 3%) wall    9157 kB ( 2%) ggc
 global alloc          :  12.12 ( 8%) usr   0.94 (19%) sys  15.48 ( 8%) wall   18801 kB ( 4%) ggc
 TOTAL                 : 147.90             5.02           191.03             486834 kB

We get considerable usage in bitmaps (just those over 100MB of peak memory
usage are listed):
df-problems.c:2957 (df_chain_create_bb)  208MB
df-problems.c:986 (df_rd_alloc)  207MB
df-problems.c:987 (df_rd_alloc)  110MB
tree-ssa-live.c:534 (new_tree_live_info)  110MB
tree-ssa-live.c:538 (new_tree_live_info)  110MB

At least 100MB, but probably more, is consumed by the new linked lists used by
the scheduler.  Hopefully this can be tracked by moving everything to allocpools.

I will send -O2 in a separate post.
Honza


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
                   ` (4 preceding siblings ...)
  2007-01-15 15:31 ` zaks at il dot ibm dot com
@ 2007-01-18  9:52 ` hubicka at ucw dot cz
  2007-02-06 22:05 ` hubicka at gcc dot gnu dot org
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: hubicka at ucw dot cz @ 2007-01-18  9:52 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #59 from hubicka at ucw dot cz  2007-01-18 09:51 -------
Subject: Re:  [4.1 regression] A file that can not be compiled in reasonable
time/space

Hi,
just as a heads-up, the early inlining change now makes the inliner fully
inline into the function at -O2 (originally we stopped because of inline
unit growth after doing just a few inlines).  This enables more optimizations
and reduces memory usage of all other passes except for the scheduler, whose
usage increases.  So we have a peak of roughly 60MB GGC memory without
scheduling and 360MB with scheduling, so this patch would be even more
greatly appreciated ;)

http://www.suse.de/~aj/SPEC/amd64/memory/pr28071-O2.rep

Honza


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
                   ` (3 preceding siblings ...)
  2007-01-15  7:53 ` mkuvyrkov at ispras dot ru
@ 2007-01-15 15:31 ` zaks at il dot ibm dot com
  2007-01-18  9:52 ` hubicka at ucw dot cz
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: zaks at il dot ibm dot com @ 2007-01-15 15:31 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #58 from zaks at il dot ibm dot com  2007-01-15 15:30 -------
(In reply to comment #57)
> Subject: Re:  [4.1 regression] A file that can not be
>  compiled in reasonable time/space
> Thanks!  Very useful comments.  I'm continuing to work on cleaning the 
> patch (especially on writing the comments)

Enjoy! One suggestion that may help explain the data structure is to provide a
drawing of ddn with its dep and nodes connected.

> > o dep_node_def: this is a node in a (doubly-linked) chain, but it represents an
> > *edge* in terms of the data-dependence graph. The prev_nextp field is a "/*
> Right!  I struggled to figure out the correct name and didn't prevail. 
> Thanks for the tip.  It'll be dep_edge.
Ah, on second thought, perhaps the important property of this struct is the
fact that it's a link on a forward or backward chain; so how about dep_link?


> > Pointer to the next field of the previous node in the list.  */" except for the
> > first node on the list, whose prev_nextp points to itself, right?
> No.  Prev_nextp field of the first node points to deps_list->first. 
> This allows us not to distinguish first node from the others.  I'll fix 
> the comment.
Ah, right.

> > 
> > o dep_data_node_def: holding the two conjugate dependence edges together is
> > very useful when switching directions. But perhaps most of the accesses go in
> > one direction (e.g. iterating over cons of a pro), and having both conjugates
> > structed together may reduce cache efficiency. So you may consider connecting
> > each dep_node_def to its conjugate, not necessarily forcing them to be placed
> > adjacent in memory.
> Dep_def and both edges were placed in one structure so that they could 
> be allocated and freed within a single alloc/free.  As I understand you 
> propose putting two pointers inside dep_edge_def: one to the dep_def and 
> one to the opposite edge.  Currently we have one pointer in dep_edge_def 
> to the dep_data_node which has all those pointers.  And probably I'm 
> missing something, but I don't see how your way can improve cache 
> efficiency.
You're right. There's probably not much to gain if anything paying an extra
pointer to save the fields of the conjugate dep_node. Perhaps only place
dep_def between back and forw (been too much into struct-reorg, I guess :). It
does seem wasteful to hold two 'data' pointers for such nearby offsets ... ;)

And another note: INSN_DEPS may be renamed INSN_BACK_DEPS to better distinguish
it from INSN_DEPEND (which in turn might be called INSN_FORW_DEPS). And maybe
INSN_RESOLVED_BACK_DEPS for consistency.

Ayal.


-- 

zaks at il dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |zaks at il dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
                   ` (2 preceding siblings ...)
  2007-01-15  7:19 ` zaks at il dot ibm dot com
@ 2007-01-15  7:53 ` mkuvyrkov at ispras dot ru
  2007-01-15 15:31 ` zaks at il dot ibm dot com
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: mkuvyrkov at ispras dot ru @ 2007-01-15  7:53 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #57 from mkuvyrkov at ispras dot ru  2007-01-15 07:52 -------
Subject: Re:  [4.1 regression] A file that can not be
 compiled in reasonable time/space

Thanks!  Very useful comments.  I'm continuing to work on cleaning the 
patch (especially on writing the comments) and making code more 
transparent.  Below are my comments on yours:

zaks at il dot ibm dot com wrote:
> ------- Comment #56 from zaks at il dot ibm dot com  2007-01-15 07:19 -------
> (In reply to comment #55)
>> Created an attachment (id=12879)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12879&action=view) [edit]
>> Patch for scheduler dependency lists.
> 
> Looks like a pretty good cleanup IMHO. Here are some comments.
> 
> o dep_def: representing a dependence edge including both producer and consumer
> is very handy, albeit somewhat redundant as we're usually traversing all cons
> connected to a pro or vice versa.
This allows us to keep all things in one place - one of the things
current deps don't provide.  I.e., with the current deps, when changing some
property of a dep we need to find the nodes corresponding to that dep in both
the backward and forward lists and apply the change in two places instead of one.

> (I.e., has its pros and cons, but mostly pros
> I agree - also done in ddg.h/ddg_edge.) Maybe comment why both 'kind' and 'ds'
> are needed, as one supersedes the other.
There will be.  Thanks.

> 
> o dep_node_def: this is a node in a (doubly-linked) chain, but it represents an
> *edge* in terms of the data-dependence graph. The prev_nextp field is a "/*
Right!  I struggled to figure out the correct name and didn't prevail. 
Thanks for the tip.  It'll be dep_edge.

> Pointer to the next field of the previous node in the list.  */" except for the
> first node on the list, whose prev_nextp points to itself, right?
No.  Prev_nextp field of the first node points to deps_list->first. 
This allows us not to distinguish first node from the others.  I'll fix 
the comment.

> 
> o dep_data_node_def: holding the two conjugate dependence edges together is
> very useful when switching directions. But perhaps most of the accesses go in
> one direction (e.g. iterating over cons of a pro), and having both conjugates
> structed together may reduce cache efficiency. So you may consider connecting
> each dep_node_def to its conjugate, not necessarily forcing them to be placed
> adjacent in memory.
Dep_def and both edges were placed in one structure so that they could
be allocated and freed within a single alloc/free.  As I understand, you
propose putting two pointers inside dep_edge_def: one to the dep_def and
one to the opposite edge.  Currently we have one pointer in dep_edge_def
to the dep_data_node, which has all those pointers.  And probably I'm
missing something, but I don't see how your way can improve cache
efficiency.
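
The single-allocation layout described above could look roughly like this.
It is a sketch under assumptions, not the patch itself: the struct and function
names are hypothetical, producer and consumer are plain ints rather than insns,
and the dependence is placed between the two links as one possible ordering.

```c
#include <stdlib.h>

struct dep_data_node;

/* One link on a backward or forward list (hypothetical layout).  */
struct dep_link
{
  struct dep_link *next;
  struct dep_link **prev_nextp;
  struct dep_data_node *node;  /* the single pointer back to the shared data */
};

/* The dependence itself: producer and consumer, here plain ints.  */
struct dep
{
  int pro;
  int con;
};

/* Dependence plus both of its links, allocated and freed as one block.  */
struct dep_data_node
{
  struct dep_link back;
  struct dep dep;
  struct dep_link forw;
};

/* Allocate the dependence and both links with one calloc.  */
static struct dep_data_node *
dep_data_node_create (int pro, int con)
{
  struct dep_data_node *n = calloc (1, sizeof *n);
  if (n == NULL)
    return NULL;
  n->dep.pro = pro;
  n->dep.con = con;
  n->back.node = n;
  n->forw.node = n;
  return n;
}

/* From either link, reach the opposite one through the shared node.  */
static struct dep_link *
dep_link_conjugate (struct dep_link *l)
{
  struct dep_data_node *n = l->node;
  return l == &n->back ? &n->forw : &n->back;
}
```

One free (n) releases the dependence and both links together, and each link
needs only the one node pointer to reach both the dep and its conjugate.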

> 
> o To add to the checking routines, the following can be checked: every
> dep_node_def is pointed-to by either its data->back xor its data->forw, right?
> If so, this can be used to identify if a dep_node_def is forward or backward;
> that all nodes on a list are forward (and share the same pro) or backward (and
> share the same con); and to assert the following regarding L:
> +/* Add a dependency described by DEP to the list L.
> +   L should be either INSN_DEPS1 or RESOLVED_DEPS1.  */
Good idea.
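
The suggested check could be written along these lines. This is only a sketch:
the reduced structs and the function names are hypothetical, and it assumes each
link stores a pointer to the node that embeds its back and forw links.

```c
#include <stdbool.h>
#include <stddef.h>

struct dep_data_node;

/* Reduced link: only the pointer to the shared node matters here.  */
struct dep_link
{
  struct dep_data_node *node;
};

/* Reduced node embedding the backward and forward links.  */
struct dep_data_node
{
  struct dep_link back;
  struct dep_link forw;
};

/* True if L is the backward link of its node.  */
static bool
dep_link_is_back (const struct dep_link *l)
{
  return l == &l->node->back;
}

/* True if L is the forward link of its node.  */
static bool
dep_link_is_forw (const struct dep_link *l)
{
  return l == &l->node->forw;
}

/* A well-formed link is exactly one of the two: back xor forw.  */
static bool
dep_link_valid (const struct dep_link *l)
{
  return l->node != NULL && (dep_link_is_back (l) != dep_link_is_forw (l));
}
```

A list-level check can then walk a list and require that every link on it
answers dep_link_is_back (or dep_link_is_forw) the same way.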

> 
> o insn_cost (insn, dep): maybe it's better to break this into insn_cost (insn)
> of a producer regardless of consumers, and "dep_cost (dep)".
Agree.

> 
> o The comment explaining what 'resolve_dep' does can be inlined together with
> its code. 
Agree.

> 
> +/* Detach dep_node N from the list.  */
> +static void
> +dep_node_detach (dep_node_t n)
> +{
> +  dep_node_t *prev_nextp = DEP_NODE_PREV_NEXTP (n);
> +  dep_node_t next = DEP_NODE_NEXT (n);
> +
> +  *prev_nextp = next;
> +
> +  if (next != NULL)
> +    DEP_NODE_PREV_NEXTP (next) = prev_nextp;
> maybe complete the detachment by adding:
> DEP_NODE_PREV_NEXTP (n) = NULL;
> DEP_NODE_NEXT (n) = NULL;
Probably, you are right.
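
To make the prev_nextp convention concrete, here is a small self-contained
sketch that uses plain struct fields instead of the DEP_NODE_* accessor macros.
The struct layouts and dep_node_attach are assumptions for illustration.
dep_node_detach follows the quoted code and includes the two extra assignments
suggested above.

```c
#include <stddef.h>

struct dep_node
{
  struct dep_node *next;
  struct dep_node **prev_nextp;  /* address of the pointer that points at us */
};

struct deps_list
{
  struct dep_node *first;
};

/* Push N on the front of L.  The first node's prev_nextp is &L->first,
   so the head needs no special treatment later.  */
static void
dep_node_attach (struct deps_list *l, struct dep_node *n)
{
  n->next = l->first;
  n->prev_nextp = &l->first;
  if (n->next != NULL)
    n->next->prev_nextp = &n->next;
  l->first = n;
}

/* Unlink N from whatever list it is on, then clear its own pointers.  */
static void
dep_node_detach (struct dep_node *n)
{
  *n->prev_nextp = n->next;
  if (n->next != NULL)
    n->next->prev_nextp = n->prev_nextp;
  n->prev_nextp = NULL;
  n->next = NULL;
}
```

Because prev_nextp of the head points at the list's first field, detaching the
head updates the list automatically, and the detach routine never needs to be
given the list.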

> Ayal.

Thanks,

Maxim


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Bug middle-end/28071] [4.1 regression] A file that can not be  compiled in reasonable time/space
  2007-01-15  7:19 ` zaks at il dot ibm dot com
@ 2007-01-15  7:52   ` Maxim Kuvyrkov
  0 siblings, 0 replies; 20+ messages in thread
From: Maxim Kuvyrkov @ 2007-01-15  7:52 UTC (permalink / raw)
  To: gcc-bugzilla; +Cc: gcc-bugs

Thanks!  Very useful comments.  I'm continuing to work on cleaning the 
patch (especially on writing the comments) and making code more 
transparent.  Below are my comments on yours:

zaks at il dot ibm dot com wrote:
> ------- Comment #56 from zaks at il dot ibm dot com  2007-01-15 07:19 -------
> (In reply to comment #55)
>> Created an attachment (id=12879)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12879&action=view) [edit]
>> Patch for scheduler dependency lists.
> 
> Looks like a pretty good cleanup IMHO. Here are some comments.
> 
> o dep_def: representing a dependence edge including both producer and consumer
> is very handy, albeit somewhat redundant as we're usually traversing all cons
> connected to a pro or vice versa.
This allows us to keep all things in one place - one of the things
current deps don't provide.  I.e., with the current deps, when changing some
property of a dep we need to find the nodes corresponding to that dep in both
the backward and forward lists and apply the change in two places instead of one.

> (I.e., has its pros and cons, but mostly pros
> I agree - also done in ddg.h/ddg_edge.) Maybe comment why both 'kind' and 'ds'
> are needed, as one supersedes the other.
There will be.  Thanks.

> 
> o dep_node_def: this is a node in a (doubly-linked) chain, but it represents an
> *edge* in terms of the data-dependence graph. The prev_nextp field is a "/*
Right!  I struggled to figure out the correct name and didn't prevail. 
Thanks for the tip.  It'll be dep_edge.

> Pointer to the next field of the previous node in the list.  */" except for the
> first node on the list, whose prev_nextp points to itself, right?
No.  The prev_nextp field of the first node points to deps_list->first. 
This means the first node needs no special treatment compared to the 
others.  I'll fix the comment.
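
A minimal sketch of this pointer-to-pointer scheme, assuming simplified
stand-in types: the names struct dep_node, struct deps_list, node_attach and
node_detach are hypothetical and are not taken from the patch.  It only
illustrates why the first node needs no special case when its prev_nextp
points at the list head's first field; the detach also clears the node's own
links, as suggested in the review.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the patch's types.  */
struct dep_node
{
  struct dep_node *next;        /* Next node in the list, or NULL.  */
  struct dep_node **prev_nextp; /* Address of the pointer that points to us:
                                   either &list->first or &prev->next.  */
};

struct deps_list
{
  struct dep_node *first;
};

/* Attach N at the place pointed to by PREV_NEXTP.  */
static void
node_attach (struct dep_node *n, struct dep_node **prev_nextp)
{
  struct dep_node *next = *prev_nextp;

  n->prev_nextp = prev_nextp;
  n->next = next;
  if (next != NULL)
    next->prev_nextp = &n->next;
  *prev_nextp = n;
}

/* Detach N from its list.  The same code handles the first node and
   any other node, because prev_nextp is never NULL for a listed node.  */
static void
node_detach (struct dep_node *n)
{
  assert (n->prev_nextp != NULL);

  *n->prev_nextp = n->next;
  if (n->next != NULL)
    n->next->prev_nextp = n->prev_nextp;

  /* Complete the detachment so stale links cannot be followed.  */
  n->prev_nextp = NULL;
  n->next = NULL;
}
```

Passing &list.first to node_attach inserts at the head; passing &x->next
inserts after node x.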

> 
> o dep_data_node_def: holding the two conjugate dependence edges together is
> very useful when switching directions. But perhaps most of the accesses go in
> one direction (e.g. iterating over cons of a pro), and having both conjugates
> structed together may reduce cache efficiency. So you may consider connecting
> each dep_node_def to its conjugate, not necessarily forcing them to be placed
> adjacent in memory.
Dep_def and both edges were placed in one structure so that they could 
be allocated and freed within a single alloc/free.  As I understand it, 
you propose putting two pointers inside dep_edge_def: one to the dep_def 
and one to the opposite edge.  Currently we have one pointer in 
dep_edge_def to the dep_data_node, which has all those pointers.  I'm 
probably missing something, but I don't see how your way can improve 
cache efficiency.

> 
> o To add to the checking routines, the following can be checked: every
> dep_node_def is pointed-to by either its data->back xor its data->forw, right?
> If so, this can be used to identify if a dep_node_def is forward or backward;
> that all nodes on a list are forward (and share the same pro) or backward (and
> share the same con); and to assert the following regarding L:
> +/* Add a dependency described by DEP to the list L.
> +   L should be either INSN_DEPS1 or RESOLVED_DEPS1.  */
Good idea.
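
One way such a check could look, as a sketch only.  It assumes the two list
nodes are embedded in the shared data structure (consistent with the
single-allocation design described in this thread); the type layout and the
name dep_node_forward_p are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct dep_data_node;

/* Hypothetical list node: only the back-pointer to the shared data
   matters for this check.  */
struct dep_node
{
  struct dep_data_node *data;
};

/* Hypothetical shared record holding both conjugate nodes.  */
struct dep_data_node
{
  struct dep_node back; /* Node on the consumer's backward list.  */
  struct dep_node forw; /* Node on the producer's forward list.  */
};

/* Return true if N is the forward node of its dependence, false if it
   is the backward one.  Asserts that it is exactly one of the two.  */
static bool
dep_node_forward_p (const struct dep_node *n)
{
  bool is_back = (n == &n->data->back);
  bool is_forw = (n == &n->data->forw);

  assert (is_back != is_forw);
  return is_forw;
}
```

A list-level check could then walk a list and require that
dep_node_forward_p returns the same value for every node on it.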

> 
> o insn_cost (insn, dep): maybe it's better to break this into insn_cost (insn)
> of a producer regardless of consumers, and "dep_cost (dep)".
Agree.

> 
> o The comment explaining what 'resolve_dep' does can be inlined together with
> its code. 
Agree.

> 
> +/* Detach dep_node N from the list.  */
> +static void
> +dep_node_detach (dep_node_t n)
> +{
> +  dep_node_t *prev_nextp = DEP_NODE_PREV_NEXTP (n);
> +  dep_node_t next = DEP_NODE_NEXT (n);
> +
> +  *prev_nextp = next;
> +
> +  if (next != NULL)
> +    DEP_NODE_PREV_NEXTP (next) = prev_nextp;
> maybe complete the detachment by adding:
> DEP_NODE_PREV_NEXTP (n) = NULL;
> DEP_NODE_NEXT (n) = NULL;
Probably, you are right.

> Ayal.

Thanks,

Maxim




* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
  2006-09-23 10:22 ` [Bug middle-end/28071] [4.1 regression] " rguenth at gcc dot gnu dot org
  2007-01-10 11:43 ` mkuvyrkov at gcc dot gnu dot org
@ 2007-01-15  7:19 ` zaks at il dot ibm dot com
  2007-01-15  7:52   ` Maxim Kuvyrkov
  2007-01-15  7:53 ` mkuvyrkov at ispras dot ru
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 20+ messages in thread
From: zaks at il dot ibm dot com @ 2007-01-15  7:19 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #56 from zaks at il dot ibm dot com  2007-01-15 07:19 -------
(In reply to comment #55)
> Created an attachment (id=12879)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12879&action=view) [edit]
> Patch for scheduler dependency lists.

Looks like a pretty good cleanup IMHO. Here are some comments.

o dep_def: representing a dependence edge including both producer and consumer
is very handy, albeit somewhat redundant as we're usually traversing all cons
connected to a pro or vice versa. (I.e., has its pros and cons, but mostly pros
I agree - also done in ddg.h/ddg_edge.) Maybe comment why both 'kind' and 'ds'
are needed, as one supersedes the other.

o dep_node_def: this is a node in a (doubly-linked) chain, but it represents an
*edge* in terms of the data-dependence graph. The prev_nextp field is a "/*
Pointer to the next field of the previous node in the list.  */" except for the
first node on the list, whose prev_nextp points to itself, right?

o dep_data_node_def: holding the two conjugate dependence edges together is
very useful when switching directions. But perhaps most of the accesses go in
one direction (e.g. iterating over cons of a pro), and having both conjugates
structed together may reduce cache efficiency. So you may consider connecting
each dep_node_def to its conjugate, not necessarily forcing them to be placed
adjacent in memory.

o To add to the checking routines, the following can be checked: every
dep_node_def is pointed-to by either its data->back xor its data->forw, right?
If so, this can be used to identify if a dep_node_def is forward or backward;
that all nodes on a list are forward (and share the same pro) or backward (and
share the same con); and to assert the following regarding L:
+/* Add a dependency described by DEP to the list L.
+   L should be either INSN_DEPS1 or RESOLVED_DEPS1.  */

o insn_cost (insn, dep): maybe it's better to break this into insn_cost (insn)
of a producer regardless of consumers, and "dep_cost (dep)".

o The comment explaining what 'resolve_dep' does can be inlined together with
its code. 

+/* Detach dep_node N from the list.  */
+static void
+dep_node_detach (dep_node_t n)
+{
+  dep_node_t *prev_nextp = DEP_NODE_PREV_NEXTP (n);
+  dep_node_t next = DEP_NODE_NEXT (n);
+
+  *prev_nextp = next;
+
+  if (next != NULL)
+    DEP_NODE_PREV_NEXTP (next) = prev_nextp;
maybe complete the detachment by adding:
DEP_NODE_PREV_NEXTP (n) = NULL;
DEP_NODE_NEXT (n) = NULL;
+}


+/* Attach NEXT to the next field pointed to by PREV_NEXTP.  */
^^^^^^^^^^^N to appear after node X whose &DEP_NODE_NEXT (X) is given by 
PREV_NEXT_P
+static void
+dep_node_attach (dep_node_t n, dep_node_t *prev_nextp)


better place
+dep_node_check_p (dep_node_t n)
next to
+dep_nodes_check_p (dep_node_t n)


+/* Make a copy of FROM in TO with substitutin consumer with CON.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^substituting consumer with CON.

Ayal.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
  2006-09-23 10:22 ` [Bug middle-end/28071] [4.1 regression] " rguenth at gcc dot gnu dot org
@ 2007-01-10 11:43 ` mkuvyrkov at gcc dot gnu dot org
  2007-01-15  7:19 ` zaks at il dot ibm dot com
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: mkuvyrkov at gcc dot gnu dot org @ 2007-01-10 11:43 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #55 from mkuvyrkov at gcc dot gnu dot org  2007-01-10 11:42 -------
Created an attachment (id=12879)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12879&action=view)
Patch for scheduler dependency lists.

Hi,

This patch introduces new dependency lists to the scheduler, so that LOG_LINKS
are no longer used in the schedulers.  The patch is preliminary and I will post
an updated version to gcc-patches in a few days.

The structure of the change:
As before, we have backward dependencies (INSN_DEPS - replacement for
LOG_LINKS) and forward dependencies (INSN_DEPEND).  These lists consist of
dep_nodes.
Each dep_node has a pointer to a dep_data_node, which contains the dependency
data (data field), the dep_node of the backward dep_list (back field) and the
dep_node of the forward dep_list (forw field).  Thus we can easily get the
forward dep_node from the backward one and vice versa.
Each dep_node also contains a pointer to the next field of the previous node in
the dep_list (i.e. to the place where the pointer to it is stored), making
removal from the list fast and easy.
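
The layout described above can be sketched roughly as follows.  This is an
illustration under assumptions, not the patch itself: the struct members, the
use of plain integers for producer and consumer, and the helper names
dep_data_node_create and dep_node_conjugate are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical dependency data: producer and consumer insn ids.  */
struct dep
{
  int pro;
  int con;
};

struct dep_data_node;

/* A list node; it represents one direction of a dependence.  */
struct dep_node
{
  struct dep_node *next;
  struct dep_node **prev_nextp; /* Where the pointer to this node lives.  */
  struct dep_data_node *data;   /* Shared record for both directions.  */
};

/* One allocation holds the data and both conjugate list nodes.  */
struct dep_data_node
{
  struct dep data;
  struct dep_node back; /* Goes on the backward list of the consumer.  */
  struct dep_node forw; /* Goes on the forward list of the producer.  */
};

/* Allocate a dependence PRO -> CON; returns NULL on allocation failure.
   The caller releases it with a single free ().  */
static struct dep_data_node *
dep_data_node_create (int pro, int con)
{
  struct dep_data_node *d = calloc (1, sizeof *d);

  if (d == NULL)
    return NULL;
  d->data.pro = pro;
  d->data.con = con;
  d->back.data = d;
  d->forw.data = d;
  return d;
}

/* Get the forward node from the backward one, or vice versa.  */
static struct dep_node *
dep_node_conjugate (struct dep_node *n)
{
  struct dep_data_node *d = n->data;

  assert (n == &d->back || n == &d->forw);
  return n == &d->back ? &d->forw : &d->back;
}
```

Because both nodes share one record, a property changed through either
direction is visible from the other without a second lookup.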

Changes are mostly just a pattern replacement of macro names.  The patched
compiler produces exactly the same output as the original (except for one small
thing: removal of DEPS_LIST from rtl.def somehow results in different numbering
of the registers.  The same occurs if one adds an additional rtx description to
rtl.def.  I don't know why this happens, but would be glad if someone could
explain.)

Minimal changes to the backends were introduced.
1. ia64 scheduler hook adjust_cost was restored to its original version (as in
gcc 4.1)
2. ia64 and rs6000 backends were fixed to walk through the new dependency
lists, which they do for their own heuristics (no other backend does that).
3. rs6000 scheduler hook is_costly_dependency () was changed so that there'll
be no need to do a compatibility transformation (as is done for adjust_cost,
btw) for a hook that is implemented on a single target.

The patch was bootstrapped on x86_64 and ia64.  I've also built a cross to
powerpc-740.

Results (on x86_64):
scheduler2 is now 4s instead of 12s.
Memory consumption: 11.5M instead of 48M


Thanks,

Maxim


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



* [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space
  2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
@ 2006-09-23 10:22 ` rguenth at gcc dot gnu dot org
  2007-01-10 11:43 ` mkuvyrkov at gcc dot gnu dot org
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2006-09-23 10:22 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #54 from rguenth at gcc dot gnu dot org  2006-09-23 10:22 -------
It's at least still a regression on the 4.1 branch, which still does

cc1: out of memory allocating 290995744 bytes after a total of 43593728 bytes

at -O1.  Otherwise we have

3.4.6: 106s
4.0.3: 108s
4.1.2:  OOM
4.2.0:  86s

and 4.2.0 uses a lot less memory than 4.0.3.  So, let's remove the 4.2
regression marker.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|4.2.0 4.1.2                 |4.1.2
      Known to work|4.0.2                       |3.4.6 4.0.2 4.2.0
            Summary|[4.1/4.2 regression] A file |[4.1 regression] A file that
                   |that can not be compiled in |can not be compiled in
                   |reasonable time/space       |reasonable time/space


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28071



end of thread, other threads:[~2023-07-28  8:41 UTC | newest]

Thread overview: 20+ messages
     [not found] <bug-28071-4@http.gcc.gnu.org/bugzilla/>
2023-07-28  8:40 ` [Bug middle-end/28071] [4.1 regression] A file that can not be compiled in reasonable time/space cvs-commit at gcc dot gnu.org
2006-06-17  9:27 [Bug c/28071] New: " raffalli at univ-savoie dot fr
2006-09-23 10:22 ` [Bug middle-end/28071] [4.1 regression] " rguenth at gcc dot gnu dot org
2007-01-10 11:43 ` mkuvyrkov at gcc dot gnu dot org
2007-01-15  7:19 ` zaks at il dot ibm dot com
2007-01-15  7:52   ` Maxim Kuvyrkov
2007-01-15  7:53 ` mkuvyrkov at ispras dot ru
2007-01-15 15:31 ` zaks at il dot ibm dot com
2007-01-18  9:52 ` hubicka at ucw dot cz
2007-02-06 22:05 ` hubicka at gcc dot gnu dot org
2007-02-06 22:15 ` hubicka at gcc dot gnu dot org
2007-03-26 15:50 ` bonzini at gnu dot org
2007-04-16 15:04 ` mkuvyrkov at gcc dot gnu dot org
2007-04-16 15:07 ` mkuvyrkov at gcc dot gnu dot org
2007-04-17 18:16 ` hubicka at gcc dot gnu dot org
2007-04-17 18:38 ` hubicka at ucw dot cz
2007-05-14 21:37 ` mmitchel at gcc dot gnu dot org
2007-05-14 21:49 ` fang at csl dot cornell dot edu
2007-07-20  3:47 ` mmitchel at gcc dot gnu dot org
2007-10-09 19:25 ` mmitchel at gcc dot gnu dot org
2007-11-03  8:07 ` ebotcazou at gcc dot gnu dot org
