public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
@ 2023-01-23 17:30 dhekir at gmail dot com
  2023-01-23 17:55 ` [Bug c/108500] " dhekir at gmail dot com
                   ` (24 more replies)
  0 siblings, 25 replies; 26+ messages in thread
From: dhekir at gmail dot com @ 2023-01-23 17:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

            Bug ID: 108500
           Summary: -O -finline-small-functions results in "internal
                    compiler error: Segmentation fault" on a very large
                    program (700k function calls)
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dhekir at gmail dot com
  Target Milestone: ---

Created attachment 54328
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54328&action=edit
compressed version of a simplified program causing the ICE

In the attached preprocessed program (compressed with .tar.gz), running
'gcc -O -finline-small-functions' results in:

gcc: internal compiler error: Segmentation fault signal terminated program cc1

The original program is more interesting than this simplified version. Still,
it does have more than 700k function calls in the main function, which is
causing the problem.

The original command line was simply 'gcc -O2', then I narrowed the options
down to -finline-small-functions.

I tried several GCC Docker images (running 'gcc -O2' on the attached file), and
I narrowed it down to:

- with gcc:10.4 (or older), compilation works without any errors;
- with gcc:11.1 (or newer; I tested up to 12.2.0), segmentation fault happens.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug c/108500] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
@ 2023-01-23 17:55 ` dhekir at gmail dot com
  2023-01-24  2:57 ` [Bug ipa/108500] " pinskia at gcc dot gnu.org
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: dhekir at gmail dot com @ 2023-01-23 17:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

dhekir at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #54328|0                           |1
        is obsolete|                            |

--- Comment #1 from dhekir at gmail dot com ---
Created attachment 54329
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54329&action=edit
.tar.gz compressed version of program causing crash

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug ipa/108500] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
  2023-01-23 17:55 ` [Bug c/108500] " dhekir at gmail dot com
@ 2023-01-24  2:57 ` pinskia at gcc dot gnu.org
  2023-01-24  3:18 ` [Bug tree-optimization/108500] " pinskia at gcc dot gnu.org
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-01-24  2:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---

#21 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x137731a8,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#22 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13773160,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#23 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13773118,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#24 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x137730d0,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#25 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13773088,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#26 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13773040,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#27 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772ff8,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#28 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772fb0,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#29 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772f68,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#30 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772f20,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#31 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772ed8,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#32 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772e90,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#33 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772e48,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#34 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772e00,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#35 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772db8,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#36 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772d70,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#37 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772d28,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#38 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772ce0,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#39 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772c98,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#40 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772c50,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#41 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772c08,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#42 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772bc0,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
#43 0x0000000000b0ad8b in assign_dfs_numbers (node=node@entry=0x13772b78,
num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:648
...

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
  2023-01-23 17:55 ` [Bug c/108500] " dhekir at gmail dot com
  2023-01-24  2:57 ` [Bug ipa/108500] " pinskia at gcc dot gnu.org
@ 2023-01-24  3:18 ` pinskia at gcc dot gnu.org
  2023-01-24  9:32 ` rguenth at gcc dot gnu.org
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-01-24  3:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-01-24
          Component|ipa                         |tree-optimization
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
assign_dfs_numbers has not changed for years ...


1: *cfun.cfg = {x_entry_block_ptr = 0x7ffff73ef6c0, x_exit_block_ptr =
0x7ffff73ef720, x_basic_block_info = 0x7fffbcdd4000, x_n_basic_blocks =
1399837, x_n_edges = 1399836, x_last_basic_block = 1399837, last_label_uid = 1,
x_label_to_block_map = 0x7ffff7277f18, x_profile_status = PROFILE_ABSENT,
x_dom_computed = {
    DOM_NO_FAST_QUERY, DOM_NONE}, x_n_bbs_in_dom_tree = {1399837, 0},
max_jumptable_ents = 0, count_max = {static n_bits = 61, static max_count =
2305843009213693950, static uninitialized_count = 2305843009213693951, m_val =
2305843009213693951, m_quality = GUESSED_LOCAL}, edge_flags_allocated = 262143,
  bb_flags_allocated = 32767}

#0  assign_dfs_numbers (node=0x45b8ae0, num=num@entry=0x7fffffffd850) at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:644
#1  0x0000000000b0ca0c in compute_dom_fast_query (dir=<optimized out>,
dir@entry=<error reading variable: dwarf2_find_location_expression: Corrupted
DWARF expression.>) at /home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:674
#2  calculate_dominance_info(cdi_direction) () at
/home/apinski/src/upstream-gcc/gcc/gcc/dominance.cc:748
#3  0x0000000000fbb0ea in cleanup_tree_cfg_noloop (ssa_update_flags=<error
reading variable: dwarf2_find_location_expression: Corrupted DWARF
expression.>) at /home/apinski/src/upstream-gcc/gcc/gcc/tree-cfgcleanup.cc:1111
#4  cleanup_tree_cfg(unsigned int) () at
/home/apinski/src/upstream-gcc/gcc/gcc/tree-cfgcleanup.cc:1212
#5  0x0000000000e7b27d in execute_function_todo(function*, void*) () at
/home/apinski/src/upstream-gcc/gcc/gcc/passes.cc:2057
#6  0x0000000000e7b70f in execute_todo(unsigned int) () at
/home/apinski/src/upstream-gcc/gcc/gcc/passes.cc:2145
#7  0x0000000000e7e9f7 in execute_one_pass(opt_pass*) () at
/home/apinski/src/upstream-gcc/gcc/gcc/passes.cc:2690
#8  0x0000000000e7ef00 in execute_pass_list_1(opt_pass*) () at
/home/apinski/src/upstream-gcc/gcc/gcc/passes.cc:2753
#9  0x0000000000e7ef39 in execute_pass_list(function*, opt_pass*) () at
/home/apinski/src/upstream-gcc/gcc/gcc/passes.cc:2764
#10 0x0000000000e7f85d in do_per_function_toporder(void (*)(function*, void*),
void*) [clone .part.0] () at
/home/apinski/src/upstream-gcc/gcc/gcc/passes.cc:1780
#11 0x0000000000e7fa8f in do_per_function_toporder (data=<optimized out>,
callback=0xe7ef20 <execute_pass_list(function*, opt_pass*)>) at
/home/apinski/src/upstream-gcc/gcc/gcc/passes.cc:3098
#12 execute_ipa_pass_list(opt_pass*) () at
/home/apinski/src/upstream-gcc/gcc/gcc/passes.cc:3098
#13 0x0000000000ad8274 in ipa_passes () at
/home/apinski/src/upstream-gcc/gcc/gcc/cgraphunit.cc:2203
#14 symbol_table::compile() [clone .part.0] () at
/home/apinski/src/upstream-gcc/gcc/gcc/cgraphunit.cc:2324
#15 0x0000000000adac18 in symbol_table::compile (this=<error reading variable:
dwarf2_find_location_expression: Corrupted DWARF expression.>) at
/home/apinski/src/upstream-gcc/gcc/gcc/cgraphunit.cc:2576
#16 symbol_table::finalize_compilation_unit (this=0x7ffff7246000) at
/home/apinski/src/upstream-gcc/gcc/gcc/cgraphunit.cc:2576
#17 0x0000000000f69660 in compile_file () at
/home/apinski/src/upstream-gcc/gcc/gcc/toplev.cc:471
#18 0x0000000000907eb2 in do_compile (no_backend=false, no_backend@entry=<error
reading variable: dwarf2_find_location_expression: Corrupted DWARF
expression.>) at /home/apinski/src/upstream-gcc/gcc/gcc/toplev.cc:2125
#19 toplev::main(int, char**) () at
/home/apinski/src/upstream-gcc/gcc/gcc/toplev.cc:2277
#20 0x0000000000909afb in main (argc=20, argv=0x7fffffffdd48) at
/home/apinski/src/upstream-gcc/gcc/gcc/main.cc:39


almost 2 basic blocks per time the function inlined ...


Maybe we can do some bb merging before calculate_dominance_info

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (2 preceding siblings ...)
  2023-01-24  3:18 ` [Bug tree-optimization/108500] " pinskia at gcc dot gnu.org
@ 2023-01-24  9:32 ` rguenth at gcc dot gnu.org
  2023-01-24  9:43 ` rguenth at gcc dot gnu.org
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-01-24  9:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ice-on-valid-code

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #3)
> assign_dfs_numbers has not changed for years ...

[...]

> Maybe we can do some bb merging before calculate_dominance_info

Unfortunately not (easily).  A more sustainable fix would be to avoid
the recursion in assign_dfs_numbers in favor of a worklist approach,
pushing 'node->son, son' as "iterator" like the cfganal DFS walks do.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (3 preceding siblings ...)
  2023-01-24  9:32 ` rguenth at gcc dot gnu.org
@ 2023-01-24  9:43 ` rguenth at gcc dot gnu.org
  2023-01-24  9:48 ` rguenth at gcc dot gnu.org
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-01-24  9:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 54331
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54331&action=edit
CFG cleanup "early merging"

OK, I take that back.  For this simple testcase the attached works, delaying
the dominance compute until the iterative cleanup phase for blocks with
multiple predecessors.

I didn't otherwise test it and I think if you add a diamond to the inlined
function you will still hit the issue, so it isn't a "fix".  It's also
going to slow down CFG cleanup a bit.  Though it would be nice to not
require dominance queries during it since it's inevitably going to do
slow queries.  That also hints at another possible fix which is to simply
not compute those from CFG cleanup (but there's no API entry for that yet).

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (4 preceding siblings ...)
  2023-01-24  9:43 ` rguenth at gcc dot gnu.org
@ 2023-01-24  9:48 ` rguenth at gcc dot gnu.org
  2023-01-24 12:36 ` [Bug tree-optimization/108500] [11/12/13 Regression] " rguenth at gcc dot gnu.org
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-01-24  9:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 54332
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54332&action=edit
do not compute fast-query from CFG cleanup

This is the more obvious workaround (as said, the bug is the recursive DFS
number assignment).  I'm going to test this variant.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12/13 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (5 preceding siblings ...)
  2023-01-24  9:48 ` rguenth at gcc dot gnu.org
@ 2023-01-24 12:36 ` rguenth at gcc dot gnu.org
  2023-01-24 14:29 ` cvs-commit at gcc dot gnu.org
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-01-24 12:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|-O -finline-small-functions |[11/12/13 Regression] -O
                   |results in "internal        |-finline-small-functions
                   |compiler error:             |results in "internal
                   |Segmentation fault" on a    |compiler error:
                   |very large program (700k    |Segmentation fault" on a
                   |function calls)             |very large program (700k
                   |                            |function calls)
   Target Milestone|---                         |11.4
      Known to work|                            |10.4.0, 7.5.0, 9.3.1

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
It works with GCC 10 for me but I'm sure that makes it a regression only
because of the size of the testcase, it still inlines everything and removes
1399835 blocks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12/13 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (6 preceding siblings ...)
  2023-01-24 12:36 ` [Bug tree-optimization/108500] [11/12/13 Regression] " rguenth at gcc dot gnu.org
@ 2023-01-24 14:29 ` cvs-commit at gcc dot gnu.org
  2023-01-24 14:30 ` [Bug tree-optimization/108500] [11/12 " rguenth at gcc dot gnu.org
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-01-24 14:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:f31fa9ea35ebcf221a2abaacba5511225f5d036e

commit r13-5325-gf31fa9ea35ebcf221a2abaacba5511225f5d036e
Author: Richard Biener <rguenther@suse.de>
Date:   Tue Jan 24 10:49:18 2023 +0100

    tree-optimization/108500 - avoid useless fast-query compute in CFG cleanup

    CFG cleanup computes dominators before the loop over blocks looking
    for merging opportunities.  That computes also the fast-query DFS
    numbers but that's a bit pointless since any CFG cleanup will invalidate
    them immediately (they are re-computed before fixing up loops).
    The following avoids this and fixes the SIGSEGV due to the deep
    recursion in assign_dfs_numbers after inlining very many small
    functions.

            PR tree-optimization/108500
            * dominance.h (calculate_dominance_info): Add parameter
            to indicate fast-query compute, defaulted to true.
            * dominance.cc (calculate_dominance_info): Honor
            fast-query compute parameter.
            * tree-cfgcleanup.cc (cleanup_tree_cfg_noloop): Do
            not compute the dominator fast-query DFS numbers.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (7 preceding siblings ...)
  2023-01-24 14:29 ` cvs-commit at gcc dot gnu.org
@ 2023-01-24 14:30 ` rguenth at gcc dot gnu.org
  2023-02-01  7:47 ` cvs-commit at gcc dot gnu.org
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-01-24 14:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[11/12/13 Regression] -O    |[11/12 Regression] -O
                   |-finline-small-functions    |-finline-small-functions
                   |results in "internal        |results in "internal
                   |compiler error:             |compiler error:
                   |Segmentation fault" on a    |Segmentation fault" on a
                   |very large program (700k    |very large program (700k
                   |function calls)             |function calls)
      Known to fail|                            |12.2.1
      Known to work|                            |13.0

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
The testcase now works on trunk by means of avoiding the recursion before CFG
cleanup.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (8 preceding siblings ...)
  2023-01-24 14:30 ` [Bug tree-optimization/108500] [11/12 " rguenth at gcc dot gnu.org
@ 2023-02-01  7:47 ` cvs-commit at gcc dot gnu.org
  2023-02-01  8:42 ` rguenth at gcc dot gnu.org
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-02-01  7:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:97258480438db77e52f4b3947fa2c075b09d3fe1

commit r13-5617-g97258480438db77e52f4b3947fa2c075b09d3fe1
Author: Richard Biener <rguenther@suse.de>
Date:   Tue Jan 31 15:45:43 2023 +0100

    middle-end/108500 - replace recursive domtree DFS traversal

    The following replaces the recursive DFS traversal of the dominator
    tree in assign_dfs_numbers with a tree traversal using the fact
    that we have recorded parents.

    Bootstrapped and tested on x86_64-unknown-linux-gnu.

    This makes r13-5325 somewhat obsolete, though not computing the
    DFS numbers at all is beneficial in the cases where we perform
    immediate CFG manipulations.

    OK for trunk and later branch(es)?

    Thanks,
    Richard.

            PR middle-end/108500
            * dominance.cc (assign_dfs_numbers): Replace recursive DFS
            with tree traversal algorithm.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (9 preceding siblings ...)
  2023-02-01  7:47 ` cvs-commit at gcc dot gnu.org
@ 2023-02-01  8:42 ` rguenth at gcc dot gnu.org
  2023-02-01 16:06 ` dhekir at gmail dot com
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-01  8:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
And now the recursion is avoided completely.  I'm going to backport this last
fix after some time and would be interested if that solves your original issue.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (10 preceding siblings ...)
  2023-02-01  8:42 ` rguenth at gcc dot gnu.org
@ 2023-02-01 16:06 ` dhekir at gmail dot com
  2023-02-01 16:07 ` dhekir at gmail dot com
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: dhekir at gmail dot com @ 2023-02-01 16:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #12 from dhekir at gmail dot com ---
Created attachment 54386
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54386&action=edit
another test case, this time with 1M calls and structs as arguments

A more complex test case, which still works (no segmentation fault), but takes
too long to compile.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (11 preceding siblings ...)
  2023-02-01 16:06 ` dhekir at gmail dot com
@ 2023-02-01 16:07 ` dhekir at gmail dot com
  2023-02-02 10:22 ` rguenth at gcc dot gnu.org
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: dhekir at gmail dot com @ 2023-02-01 16:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #13 from dhekir at gmail dot com ---
Thank you very much for the work.

Running the attached file with `-O -finline-small-functions` does compile in
under 30 seconds on my computer.

However, when trying to compile the original program (which is about 1 million
lines, and each call passes 2 structures as arguments, instead of just calling
a function without any arguments), it's taking several dozen minutes. I tried
preprocessing it (5s to obtain the .i) file, and then running it with `-O
-finline-small-functions`, or `-O2`, or `-O3`, and without any options at all,
and in all cases, I ended up terminating the program before it finished (after
more than 10 minutes; in some cases I waited up to 30 minutes).

I tried re-simplifying the program. After preprocessing, I tried the following
variants, with options `-O -finline-small-functions`:

- 1M calls, no arguments, function returning a (global) struct: compiles in
30s;
- 1M calls, each with a single argument of type `struct s`, function returns
that same argument (that is, `struct s f(struct s s1) {return s1;}`): compiles
in <2 minutes;
- 1M calls, each with 2 arguments of types `struct s1` and `struct s2`,
returning the second argument (that is, `struct s2 f(struct s1 arg1, struct s2
arg2) {return arg2;}`): >50 minutes (I had to terminate it).

The last version, with -O2, I left it compiling for almost 3h before having to
stop it.

In any case, this bug seems definitely solved for me, and I no longer have the
original stack overflow. However, I am still unable to compile my original
code, so I'll have to try something else. It's possibly not a regression,
however.

I'm attaching it in case you may want to try it, but feel free to ignore it.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (12 preceding siblings ...)
  2023-02-01 16:07 ` dhekir at gmail dot com
@ 2023-02-02 10:22 ` rguenth at gcc dot gnu.org
  2023-02-02 10:34 ` rguenth at gcc dot gnu.org
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-02 10:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamborm at gcc dot gnu.org,
                   |                            |vmakarov at gcc dot gnu.org

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
Thanks for the new testcase.  With -O0 (and a --enable-checking=release built
compiler) this builds in ~11 minutes (on a Ryzen 9 7900X) with

 integrated RA                      :  38.96 (  6%)   1.94 ( 20%)  42.00 (  6%)
 3392M ( 23%)
 LRA non-specific                   :  18.93 (  3%)   1.24 ( 13%)  23.78 (  4%)
  450M (  3%)
 LRA virtuals elimination           :   5.67 (  1%)   0.05 (  1%)   5.75 (  1%)
  457M (  3%)
 LRA reload inheritance             : 318.25 ( 49%)   0.24 (  2%) 318.51 ( 48%)
    0  (  0%)
 LRA create live ranges             : 199.24 ( 31%)   0.12 (  1%) 199.38 ( 30%)
  228M (  2%)
645.67user 10.29system 11:04.42elapsed 98%CPU (0avgtext+0avgdata
30577844maxresident)k
3936200inputs+1091808outputs (122053major+10664929minor)pagefaults 0swaps

so register allocation taking all of the time.  There's maybe the possibility
to gate some of its features on the # of BBs or insns (or whatever the actual
"bad" thing is - I didn't look closer yet).

It also seems to use 30GB of peak memory at -O0 ...

For -O the situation is "better":

 tree PTA                           : 987.21 ( 99%)   0.41 ( 12%) 987.70 ( 99%)
  128  (  0%)
992.56user 3.53system 16:36.20elapsed 99%CPU (0avgtext+0avgdata
2968740maxresident)k
42576inputs+8outputs (28major+717414minor)pagefaults 0swaps

which suggests a clear workaround, -fno-tree-pta, which makes it compile
in 5s for me.

Doing -O -finline-small-functions -fno-tree-pta we get a very high
compile-time in SRAs propagate_all_subaccesses which probably sees
a very large struct copy chain

  tem1 = s2;
  s2 = tem1;
  tem2 = s2;
  s2 = tem2;
...

and somehow ends up quadratic (possibly switching the candidate_bitmap
to tree form at the start of propagate_all_subaccesses will help a bit).
tree form bitmap doesn't help, I guess we end up queueing all elements in
the copy chain to the worklist and via the chains end up with a O(n^2)
working set.  The testcase can probably be shortened to get at this
problem.  SRA is actually quite important here, so disabling SRA as a
workaround doesn't look to improve the situation a lot.

Still with -fno-tree-sra added we get good compile time and DCE/DSE
remove all code plus -fno-tree-pta isn't required.

Martin, can you look at the SRA issue?  Do you want me to create a separate
bugreport for this?  The IL into SRA looks like

  <bb 2> :
  s2D.2755 = {};
  s1D.2756 = {};
  _unusedD.2002766 = s1D.2756;
  sD.2002767 = s2D.2755;
  s2D.2755 = sD.2002767;
  _unusedD.2002766 ={v} {CLOBBER(eol)};
  sD.2002767 ={v} {CLOBBER(eol)};
  _unusedD.2002764 = s1D.2756;
  sD.2002765 = s2D.2755;
  s2D.2755 = sD.2002765;
  _unusedD.2002764 ={v} {CLOBBER(eol)};
  sD.2002765 ={v} {CLOBBER(eol)};
  _unusedD.2002762 = s1D.2756;
  sD.2002763 = s2D.2755;
  s2D.2755 = sD.2002763;
  _unusedD.2002762 ={v} {CLOBBER(eol)};
  sD.2002763 ={v} {CLOBBER(eol)};
  _unusedD.2002760 = s1D.2756;
  sD.2002761 = s2D.2755;
  s2D.2755 = sD.2002761;
  _unusedD.2002760 ={v} {CLOBBER(eol)};
  sD.2002761 ={v} {CLOBBER(eol)};
...

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (13 preceding siblings ...)
  2023-02-02 10:22 ` rguenth at gcc dot gnu.org
@ 2023-02-02 10:34 ` rguenth at gcc dot gnu.org
  2023-02-02 14:01 ` rguenth at gcc dot gnu.org
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-02 10:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> ---
To not look at "nothing" (after successful SRA it should indeed become almost
nothing) I've added a store to a volatile 'x' global variable to the end
of main:

...
  s2 = f(s1,s2);
  x = s2;
  return 0;
}

not using main but a function returning a struct would probably work as well.
Otherwise DCE/DSE will remove all code.  Doing that reveals that RTL DSE
(more specifically rtx_equal_for_cselib_1) is very slow (via
dse_step1 calling cselib_process_insn).

I do wonder if you can share the "real" testcase?  It doesn't need to
be able to link, preprocessed source would be enough.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (14 preceding siblings ...)
  2023-02-02 10:34 ` rguenth at gcc dot gnu.org
@ 2023-02-02 14:01 ` rguenth at gcc dot gnu.org
  2023-02-02 17:12 ` dhekir at gmail dot com
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-02 14:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #14)
> Martin, can you look at the SRA issue?  Do you want me to create a separate
> bugreport for this?  The IL into SRA looks like
> 
>   <bb 2> :
>   s2D.2755 = {};
>   s1D.2756 = {};
>   _unusedD.2002766 = s1D.2756;
>   sD.2002767 = s2D.2755;
>   s2D.2755 = sD.2002767;
>   _unusedD.2002766 ={v} {CLOBBER(eol)};
>   sD.2002767 ={v} {CLOBBER(eol)};
>   _unusedD.2002764 = s1D.2756;
>   sD.2002765 = s2D.2755;
>   s2D.2755 = sD.2002765;
>   _unusedD.2002764 ={v} {CLOBBER(eol)};
>   sD.2002765 ={v} {CLOBBER(eol)};
>   _unusedD.2002762 = s1D.2756;
>   sD.2002763 = s2D.2755;
>   s2D.2755 = sD.2002763;
>   _unusedD.2002762 ={v} {CLOBBER(eol)};
>   sD.2002763 ={v} {CLOBBER(eol)};
>   _unusedD.2002760 = s1D.2756;
>   sD.2002761 = s2D.2755;
>   s2D.2755 = sD.2002761;
>   _unusedD.2002760 ={v} {CLOBBER(eol)};
>   sD.2002761 ={v} {CLOBBER(eol)};
> ...

struct s1 {
   char a[20] ;
   char b[20] ;
};
struct s2 {
   int id;
   char a[20];
};
static inline  __attribute__((always_inline)) struct s2
f(struct s1 _unused, struct s2 s)
{ 
  return s;
} 

volatile struct s2 x;
int main(void)
{
  struct s2 s2 = {0};
  struct s1 s1 = {0};
#define TEN \
  s2 = f(s1,s2); \
  s2 = f(s1,s2); \
  s2 = f(s1,s2); \
  s2 = f(s1,s2); \
  s2 = f(s1,s2); \
  s2 = f(s1,s2); \
  s2 = f(s1,s2); \
  s2 = f(s1,s2); \
  s2 = f(s1,s2); \
  s2 = f(s1,s2);
  TEN
  TEN
  x = s2;
  return 0;
}

shows this.  With

diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index ad0c738645d..ab3f47badcb 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -2980,6 +2980,7 @@ propagate_subaccesses_from_lhs (struct access *lacc,
struct access *racc)
 static void
 propagate_all_subaccesses (void)
 {
+  unsigned cnt = 0;
   propagation_budget = new hash_map<tree, unsigned>;
   while (rhs_work_queue_head)
     {
@@ -2994,6 +2995,7 @@ propagate_all_subaccesses (void)
        {
          struct access *lacc = link->lacc;

+         cnt++;
          if (!bitmap_bit_p (candidate_bitmap, DECL_UID (lacc->base)))
            continue;
          lacc = lacc->group_representative;
@@ -3019,6 +3021,7 @@ propagate_all_subaccesses (void)
            while (lacc);
        }
     }
+  fprintf (stderr, "%d\n", cnt);

   while (lhs_work_queue_head)
     {

we have with one TEN
> ./cc1 -quiet t.c -O
0
220
120

and with two TEN
> ./cc1 -quiet t.c -O
0
840
440

and with four TEN
> ./cc1 -quiet t.c -O
0
3280
1680

that's quadratic and has a quite high linear factor as well.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (15 preceding siblings ...)
  2023-02-02 14:01 ` rguenth at gcc dot gnu.org
@ 2023-02-02 17:12 ` dhekir at gmail dot com
  2023-02-03  7:06 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: dhekir at gmail dot com @ 2023-02-02 17:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #17 from dhekir at gmail dot com ---
To be honest, the "real" test case is very similar to the last one I sent: it's
a semi-generated code, with some initialization of the data in the beginning,
and then a lot of statements which perform not necessarily useful operations,
and in the end a few assertions are checked (e.g. that the initialized data was
not tampered with).

So, in reality, I expected GCC to discard most of the program after
optimization and execute it almost instantly. When I encountered the
segmentation fault during compilation, I thought it might also be relevant for
other users, so I submitted the bug.

Now, however, that the issue is mostly a "performance" issue, it's less likely
that other users will encounter such a huge program with "useful" purposes, so
I understand completely if you decide this is just not interesting/useful
enough.

To be honest, I tried compiling the code with other open source C compilers
(Clang, and another not-so-mature one), and one failed with a stack overflow,
and the other didn't complete until 1h30m, so I terminated it.

So, the simple fact that you were able to succesfully compile it with those
options is already very interesting to me and sufficient for my "real" test
case.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (16 preceding siblings ...)
  2023-02-02 17:12 ` dhekir at gmail dot com
@ 2023-02-03  7:06 ` rguenth at gcc dot gnu.org
  2023-02-03  8:20 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-03  7:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #18 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, thanks for the info.  These kind of testcases are quite useful in that they
are not done for the purpose of breaking the compiler and they show algorithmic
deficiencies in GCC.  GCC aims to be able to compile machine-generated code
like this at -O1 [-g] with reasonable compile-time resources (reasonable means
not quadratic in the size of the input).

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (17 preceding siblings ...)
  2023-02-03  7:06 ` rguenth at gcc dot gnu.org
@ 2023-02-03  8:20 ` rguenth at gcc dot gnu.org
  2023-02-10 14:05 ` vmakarov at gcc dot gnu.org
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-03  8:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #19 from Richard Biener <rguenth at gcc dot gnu.org> ---
I have split out the SRA issue to PR108653.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (18 preceding siblings ...)
  2023-02-03  8:20 ` rguenth at gcc dot gnu.org
@ 2023-02-10 14:05 ` vmakarov at gcc dot gnu.org
  2023-02-10 16:45 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: vmakarov at gcc dot gnu.org @ 2023-02-10 14:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #20 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #14)
> Thanks for the new testcase.  With -O0 (and a --enable-checking=release
> built compiler) this builds in ~11 minutes (on a Ryzen 9 7900X) with
> 
>  integrated RA                      :  38.96 (  6%)   1.94 ( 20%)  42.00 ( 
> 6%)  3392M ( 23%)
>  LRA non-specific                   :  18.93 (  3%)   1.24 ( 13%)  23.78 ( 
> 4%)   450M (  3%)
>  LRA virtuals elimination           :   5.67 (  1%)   0.05 (  1%)   5.75 ( 
> 1%)   457M (  3%)
>  LRA reload inheritance             : 318.25 ( 49%)   0.24 (  2%) 318.51 (
> 48%)     0  (  0%)
>  LRA create live ranges             : 199.24 ( 31%)   0.12 (  1%) 199.38 (
> 30%)   228M (  2%)
> 645.67user 10.29system 11:04.42elapsed 98%CPU (0avgtext+0avgdata
> 30577844maxresident)k
> 3936200inputs+1091808outputs (122053major+10664929minor)pagefaults 0swaps
>

I've tried test-1M.i with -O0 for clang-14.  It took about 12hours on E5-2697
v3 vs about 30min for GCC.  The most time (99%) of clang is spent in "fast
register allocator":

  Total Execution Time: 42103.9395 seconds (42243.9819 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  ---
Name ---
  41533.7657 ( 99.5%)  269.5347 ( 78.6%)  41803.3005 ( 99.3%)  41942.4177 (
99.3%)  Fast Register Allocator
  139.1669 (  0.3%)  16.4785 (  4.8%)  155.6454 (  0.4%)  156.3196 (  0.4%) 
X86 DAG->DAG Instruction Selection

I've tried the same for -O1.  Again gcc took about 30min and I stopped clang
(with another used RA algorithm) after 120hours.

So the situation with RA is not so bad for GCC.  But in any case I'll try to
improve the speed for this case.

> so register allocation taking all of the time.  There's maybe the possibility
> to gate some of its features on the # of BBs or insns (or whatever the actual
> "bad" thing is - I didn't look closer yet).
> 
> It also seems to use 30GB of peak memory at -O0 ...
> 

I see only 3GB.  Improving this is hard task.  The IRA for -O0 uses very simple
algorithm with usage of very few resources.  We could use even simpler method
(assigning memory only for all pseudos) but I think it does not worth to do as
the generated code will be much bigger and probably will be 1.5-2 times slower.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (19 preceding siblings ...)
  2023-02-10 14:05 ` vmakarov at gcc dot gnu.org
@ 2023-02-10 16:45 ` cvs-commit at gcc dot gnu.org
  2023-02-13  7:42 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-02-10 16:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #21 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Vladimir Makarov <vmakarov@gcc.gnu.org>:

https://gcc.gnu.org/g:3c5154d0f0d2185b518465b264ca17fb7c60c1e8

commit r13-5808-g3c5154d0f0d2185b518465b264ca17fb7c60c1e8
Author: Vladimir N. Makarov <vmakarov@redhat.com>
Date:   Fri Feb 10 11:12:37 2023 -0500

    RA: Use simple LRA for huge functions

    The PR108500 test contains a huge function and RA spends a lot of time
    to compile the test with -O0.  The patch decreases compilation time
    considerably for huge functions.  Compilation time for the PR test
    decreases from 1235s to 709s on Intel i7-13600K.

            PR tree-optimization/108500

    gcc/ChangeLog:

            * params.opt (ira-simple-lra-insn-threshold): Add new param.
            * ira.cc (ira): Use the param to switch on simple LRA.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (20 preceding siblings ...)
  2023-02-10 16:45 ` cvs-commit at gcc dot gnu.org
@ 2023-02-13  7:42 ` rguenth at gcc dot gnu.org
  2023-03-15  9:47 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-13  7:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #22 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Vladimir Makarov from comment #20)
> (In reply to Richard Biener from comment #14)
> > Thanks for the new testcase.  With -O0 (and a --enable-checking=release
> > built compiler) this builds in ~11 minutes (on a Ryzen 9 7900X) with
> > 
> >  integrated RA                      :  38.96 (  6%)   1.94 ( 20%)  42.00 ( 
> > 6%)  3392M ( 23%)
> >  LRA non-specific                   :  18.93 (  3%)   1.24 ( 13%)  23.78 ( 
> > 4%)   450M (  3%)
> >  LRA virtuals elimination           :   5.67 (  1%)   0.05 (  1%)   5.75 ( 
> > 1%)   457M (  3%)
> >  LRA reload inheritance             : 318.25 ( 49%)   0.24 (  2%) 318.51 (
> > 48%)     0  (  0%)
> >  LRA create live ranges             : 199.24 ( 31%)   0.12 (  1%) 199.38 (
> > 30%)   228M (  2%)
> > 645.67user 10.29system 11:04.42elapsed 98%CPU (0avgtext+0avgdata
> > 30577844maxresident)k
> > 3936200inputs+1091808outputs (122053major+10664929minor)pagefaults 0swaps
> >
> 
> I've tried test-1M.i with -O0 for clang-14.  It took about 12hours on
> E5-2697 v3 vs about 30min for GCC.  The most time (99%) of clang is spent in
> "fast register allocator":
> 
>   Total Execution Time: 42103.9395 seconds (42243.9819 wall clock)
> 
>    ---User Time---   --System Time--   --User+System--   ---Wall Time--- 
> --- Name ---
>   41533.7657 ( 99.5%)  269.5347 ( 78.6%)  41803.3005 ( 99.3%)  41942.4177 (
> 99.3%)  Fast Register Allocator
>   139.1669 (  0.3%)  16.4785 (  4.8%)  155.6454 (  0.4%)  156.3196 (  0.4%) 
> X86 DAG->DAG Instruction Selection
> 
> I've tried the same for -O1.  Again gcc took about 30min and I stopped clang
> (with another used RA algorithm) after 120hours.
> 
> So the situation with RA is not so bad for GCC.  But in any case I'll try to
> improve the speed for this case.

I bet the LLVM folks do not focus on making -O{0,1} usable for these kind
of testcases which have practical application for auto-generated code.

Of course that's not a reason to not improve GCC even more! ;)

> > so register allocation taking all of the time.  There's maybe the possibility
> > to gate some of its features on the # of BBs or insns (or whatever the actual
> > "bad" thing is - I didn't look closer yet).
> > 
> > It also seems to use 30GB of peak memory at -O0 ...
> > 
> 
> I see only 3GB.  Improving this is hard task.  The IRA for -O0 uses very
> simple algorithm with usage of very few resources.  We could use even
> simpler method (assigning memory only for all pseudos) but I think it does
> not worth to do as the generated code will be much bigger and probably will
> be 1.5-2 times slower.

For some RTL opts algorithm simply splitting large blocks tends to help.
Also some gate on the number of BBs only but their algorithms are quadratic
in the number of insns instead ...

Of course we cannot simply gate RA ... maybe there's a way to have a
"simpler" algorithm that works on smaller regions of a function and
only promote allocnos live across region boundaries to memory?  Ideally
you'd have sth that has linear time complexity - for LRA that should be
possible, since we have done global RA already?

Anyway - thanks for improving things here!

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11/12 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (21 preceding siblings ...)
  2023-02-13  7:42 ` rguenth at gcc dot gnu.org
@ 2023-03-15  9:47 ` cvs-commit at gcc dot gnu.org
  2023-03-15 10:03 ` [Bug tree-optimization/108500] [11 " rguenth at gcc dot gnu.org
  2023-05-05  8:09 ` rguenth at gcc dot gnu.org
  24 siblings, 0 replies; 26+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-03-15  9:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

--- Comment #23 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Richard Biener
<rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:b955080a5ab8690902f7cc99a770f9c3da171d6f

commit r12-9256-gb955080a5ab8690902f7cc99a770f9c3da171d6f
Author: Richard Biener <rguenther@suse.de>
Date:   Tue Jan 31 15:45:43 2023 +0100

    middle-end/108500 - replace recursive domtree DFS traversal

    The following replaces the recursive DFS traversal of the dominator
    tree in assign_dfs_numbers with a tree traversal using the fact
    that we have recorded parents.

    Bootstrapped and tested on x86_64-unknown-linux-gnu.

    This makes r13-5325 somewhat obsolete, though not computing the
    DFS numbers at all is beneficial in the cases where we perform
    immediate CFG manipulations.

    OK for trunk and later branch(es)?

    Thanks,
    Richard.

            PR middle-end/108500
            * dominance.cc (assign_dfs_numbers): Replace recursive DFS
            with tree traversal algorithm.

    (cherry picked from commit 97258480438db77e52f4b3947fa2c075b09d3fe1)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (22 preceding siblings ...)
  2023-03-15  9:47 ` cvs-commit at gcc dot gnu.org
@ 2023-03-15 10:03 ` rguenth at gcc dot gnu.org
  2023-05-05  8:09 ` rguenth at gcc dot gnu.org
  24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-15 10:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |12.2.1
            Summary|[11/12 Regression] -O       |[11 Regression] -O
                   |-finline-small-functions    |-finline-small-functions
                   |results in "internal        |results in "internal
                   |compiler error:             |compiler error:
                   |Segmentation fault" on a    |Segmentation fault" on a
                   |very large program (700k    |very large program (700k
                   |function calls)             |function calls)
           Priority|P3                          |P2
      Known to fail|12.2.1                      |12.2.0

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Bug tree-optimization/108500] [11 Regression] -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls)
  2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
                   ` (23 preceding siblings ...)
  2023-03-15 10:03 ` [Bug tree-optimization/108500] [11 " rguenth at gcc dot gnu.org
@ 2023-05-05  8:09 ` rguenth at gcc dot gnu.org
  24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-05-05  8:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108500

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
   Target Milestone|11.4                        |12.3
         Resolution|---                         |FIXED

--- Comment #24 from Richard Biener <rguenth at gcc dot gnu.org> ---
Not going to backport this further, it's not really a "regression".

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2023-05-05  8:09 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-23 17:30 [Bug c/108500] New: -O -finline-small-functions results in "internal compiler error: Segmentation fault" on a very large program (700k function calls) dhekir at gmail dot com
2023-01-23 17:55 ` [Bug c/108500] " dhekir at gmail dot com
2023-01-24  2:57 ` [Bug ipa/108500] " pinskia at gcc dot gnu.org
2023-01-24  3:18 ` [Bug tree-optimization/108500] " pinskia at gcc dot gnu.org
2023-01-24  9:32 ` rguenth at gcc dot gnu.org
2023-01-24  9:43 ` rguenth at gcc dot gnu.org
2023-01-24  9:48 ` rguenth at gcc dot gnu.org
2023-01-24 12:36 ` [Bug tree-optimization/108500] [11/12/13 Regression] " rguenth at gcc dot gnu.org
2023-01-24 14:29 ` cvs-commit at gcc dot gnu.org
2023-01-24 14:30 ` [Bug tree-optimization/108500] [11/12 " rguenth at gcc dot gnu.org
2023-02-01  7:47 ` cvs-commit at gcc dot gnu.org
2023-02-01  8:42 ` rguenth at gcc dot gnu.org
2023-02-01 16:06 ` dhekir at gmail dot com
2023-02-01 16:07 ` dhekir at gmail dot com
2023-02-02 10:22 ` rguenth at gcc dot gnu.org
2023-02-02 10:34 ` rguenth at gcc dot gnu.org
2023-02-02 14:01 ` rguenth at gcc dot gnu.org
2023-02-02 17:12 ` dhekir at gmail dot com
2023-02-03  7:06 ` rguenth at gcc dot gnu.org
2023-02-03  8:20 ` rguenth at gcc dot gnu.org
2023-02-10 14:05 ` vmakarov at gcc dot gnu.org
2023-02-10 16:45 ` cvs-commit at gcc dot gnu.org
2023-02-13  7:42 ` rguenth at gcc dot gnu.org
2023-03-15  9:47 ` cvs-commit at gcc dot gnu.org
2023-03-15 10:03 ` [Bug tree-optimization/108500] [11 " rguenth at gcc dot gnu.org
2023-05-05  8:09 ` rguenth at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).