public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/116002] New: GCC Compiler Hang with Recursive Macros in Function
@ 2024-07-19 12:06 iamanonymous.cs at gmail dot com
2024-07-19 13:20 ` [Bug rtl-optimization/116002] GCC Compiler time-hog with large basic block " rguenth at gcc dot gnu.org
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: iamanonymous.cs at gmail dot com @ 2024-07-19 12:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116002
Bug ID: 116002
Summary: GCC Compiler Hang with Recursive Macros in Function
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: iamanonymous.cs at gmail dot com
Target Milestone: ---
Target: x86_64
*******************************************************************************
I encountered an issue where the GCC compiler hangs indefinitely while trying
to compile a specific piece of code involving recursive macros within a
function. The code snippet is provided below.Clang also has this problem.
*******************************************************************************
OS and Platform:
# uname -a
Linux ubuntu 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023
x86_64 x86_64 x86_64 GNU/Linux
*******************************************************************************
# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/root/gdbtest/gcc/gcc-15/libexec/gcc/x86_64-pc-linux-gnu/15.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /root/gdbtest/gcc/obj/../gcc/configure
--prefix=/root/gdbtest/gcc/gcc-15 --enable-languages=c,c++,fortran,go
--disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 15.0.0 20240509 (experimental) (GCC)
*******************************************************************************
Program:
# cat 1.c
int a(int c) {
if (c < 12345) {
#define A \
{ int d = c % 12345; \
int e = c ^ d * 12345; \
int f = c * d % e % 12345; \
c = f; }
#define B A A A A A A
#define C B B B B B B
#define D C C C C C C
#define E D D D D D D
E E E E E
}
return c;
}
*******************************************************************************
Output: Nothing (Waited for nearly 300 seconds.
*******************************************************************************
Also got killed on compiler explorer:https://godbolt.org/z/9Exof1rrW
*******************************************************************************
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug rtl-optimization/116002] GCC Compiler time-hog with large basic block in Function
2024-07-19 12:06 [Bug c/116002] New: GCC Compiler Hang with Recursive Macros in Function iamanonymous.cs at gmail dot com
@ 2024-07-19 13:20 ` rguenth at gcc dot gnu.org
2024-07-19 13:30 ` rguenth at gcc dot gnu.org
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-07-19 13:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116002
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to fail| |14.1.1
Component|c |rtl-optimization
Ever confirmed|0 |1
Summary|GCC Compiler Hang with |GCC Compiler time-hog with
|Recursive Macros in |large basic block in
|Function |Function
Last reconfirmed| |2024-07-19
Keywords| |compile-time-hog
Status|UNCONFIRMED |NEW
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
With only 'E E' it's
CPROP : 18.28 ( 62%)
with 'E E E'
dead store elim1 : 17.04 ( 22%)
CPROP : 50.19 ( 65%)
with 'E E E E'
dead store elim1 : 36.41 ( 23%)
CPROP : 106.68 ( 67%)
this is the usual gcse memory/compile-time hog with its (and DF) bitmap
operations.
We already disable it via gcse_or_cprop_is_too_expensive but the heuristic
doesn't trigger here - it's few basic blocks only. We end up
<bb 2> [local count: 1073741824]:
if (c_11666(D) <= 12344)
goto <bb 3>; [50.00%]
else
goto <bb 4>; [50.00%]
<bb 3> [local count: 536870912]:
d_11667 = c_11666(D) % 12345;
_1 = d_11667 * 12345;
e_11668 = _1 ^ c_11666(D);
_2 = c_11666(D) * d_11667;
_3 = _2 % e_11668;
f_11669 = _3 % 12345;
_4 = f_11669 * 12345;
e_11670 = _4 ^ f_11669;
_5 = f_11669 * f_11669;
_6 = _5 % e_11670;
f_11671 = _6 % 12345;
_7 = f_11671 * 12345;
e_11672 = _7 ^ f_11671;
_8 = f_11671 * f_11671;
_9 = _8 % e_11672;
...
_11664 = _11663 % e_19442;
f_19443 = _11664 % 12345;
<bb 4> [local count: 1073741824]:
# c_11665 = PHI <c_11666(D)(2), f_19443(3)>
return c_11665;
thus very many pseudos and insns but few basic-blocks. I believe there's
a duplicate.
Samples: 114K of event 'cycles:P', Event count (approx.): 121898207258
Overhead Samples Command Shared Object Symbol
61.00% 70222 cc1 cc1 [.]
_Z22rtx_equal_for_cselib_1P7rtx_defS0_12machine_modei
10.17% 11704 cc1 cc1 [.]
_Z13cselib_lookupP7rtx_def12machine_modeiS1_
6.42% 7400 cc1 cc1 [.]
_ZN10hash_tableI13cselib_hasherLb0E11xcallocatorE19find_slot_with
4.75% 5251 cc1 cc1 [.]
_ZL10topo_visitP16constraint_graphR3vecIj7va_heap6vl_ptrEP17simpl
2.91% 3219 cc1 cc1 [.]
_ZL11solve_graphP16constraint_graph
ah, I've seen this as well - the CSELIB hashing is a joke; I've run into this
earlier. It certainly will produce collisions a lot
with a lot of expressions with the same RTX code like this. Ah, it was
cse.cc hashing that was even worse. But CSELIB hashing also uses + for
mixing.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug rtl-optimization/116002] GCC Compiler time-hog with large basic block in Function
2024-07-19 12:06 [Bug c/116002] New: GCC Compiler Hang with Recursive Macros in Function iamanonymous.cs at gmail dot com
2024-07-19 13:20 ` [Bug rtl-optimization/116002] GCC Compiler time-hog with large basic block " rguenth at gcc dot gnu.org
@ 2024-07-19 13:30 ` rguenth at gcc dot gnu.org
2024-07-19 14:33 ` rguenth at gcc dot gnu.org
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-07-19 13:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116002
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
Status|NEW |ASSIGNED
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 58708
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58708&action=edit
prototype patch
This improves compile time to 23s for 'E E E E E' with now
tree PTA : 16.93 ( 74%)
CPROP : 0.15 ( 1%)
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug rtl-optimization/116002] GCC Compiler time-hog with large basic block in Function
2024-07-19 12:06 [Bug c/116002] New: GCC Compiler Hang with Recursive Macros in Function iamanonymous.cs at gmail dot com
2024-07-19 13:20 ` [Bug rtl-optimization/116002] GCC Compiler time-hog with large basic block " rguenth at gcc dot gnu.org
2024-07-19 13:30 ` rguenth at gcc dot gnu.org
@ 2024-07-19 14:33 ` rguenth at gcc dot gnu.org
2024-07-23 7:34 ` cvs-commit at gcc dot gnu.org
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-07-19 14:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116002
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to fail| |13.3.0
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
The minimal fix for this is the following with the first hunk only for
consistency (I'm not sure this is relevant). This might be more appropriate
when one considers backporting.
diff --git a/gcc/cselib.cc b/gcc/cselib.cc
index cbaab7d515c..67f57b67649 100644
--- a/gcc/cselib.cc
+++ b/gcc/cselib.cc
@@ -1267,12 +1267,12 @@ cselib_hash_plus_const_int (rtx x, HOST_WIDE_INT c, int
create,
return e->hash;
unsigned hash = (unsigned) PLUS + (unsigned) GET_MODE (x);
- hash += e->hash;
+ hash = iterative_hash_hashval_t (hash, e->hash);
unsigned int tem_hash = (unsigned) CONST_INT + (unsigned) VOIDmode;
tem_hash += ((unsigned) CONST_INT << 7) + (unsigned HOST_WIDE_INT) c;
if (tem_hash == 0)
tem_hash = (unsigned int) CONST_INT;
- hash += tem_hash;
+ hash = iterative_hash_hashval_t (hash, tem_hash);
return hash ? hash : 1 + (unsigned int) PLUS;
}
@@ -1509,7 +1509,7 @@ cselib_hash_rtx (rtx x, int create, machine_mode memmode)
if (tem_hash == 0)
return 0;
- hash += tem_hash;
+ hash = iterative_hash_hashval_t (hash, tem_hash);
}
break;
case 'E':
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug rtl-optimization/116002] GCC Compiler time-hog with large basic block in Function
2024-07-19 12:06 [Bug c/116002] New: GCC Compiler Hang with Recursive Macros in Function iamanonymous.cs at gmail dot com
` (2 preceding siblings ...)
2024-07-19 14:33 ` rguenth at gcc dot gnu.org
@ 2024-07-23 7:34 ` cvs-commit at gcc dot gnu.org
2024-07-23 7:59 ` rguenth at gcc dot gnu.org
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-07-23 7:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116002
--- Comment #4 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:44e065a52fa6069d6c8cacebc8f876840d278dd0
commit r15-2218-g44e065a52fa6069d6c8cacebc8f876840d278dd0
Author: Richard Biener <rguenther@suse.de>
Date: Fri Jul 19 16:23:51 2024 +0200
[v2] rtl-optimization/116002 - cselib hash is bad
The following addresses the bad hash function of cselib which uses
integer plus for merging. This causes a huge number of collisions
for the testcase in the PR and thus very large compile-time.
The following rewrites it to use inchash, eliding duplicate mixing
of RTX code and mode in some cases and more consistently avoiding
a return value of zero as well as treating zero as fatal. An
important part is to preserve mixing of hashes of commutative
operators as commutative.
For cselib_hash_plus_const_int this removes the apparent attempt
of making sure to hash the same as a PLUS as cselib_hash_rtx makes
sure to dispatch to cselib_hash_plus_const_int consistently.
This reduces compile-time for the testcase in the PR from unknown
to 22s and for a reduced testcase from 73s to 9s. There's another
pending patchset to improve the speed of inchash mixing, but it's
not in the profile for this testcase (PTA pops up now).
The generated code is equal. I've also compared cc1 builds
with and without the patch and they are now commparing equal
after retaining commutative hashing for commutative operators.
PR rtl-optimization/116002
* cselib.cc (cselib_hash_rtx): Use inchash to get proper mixing.
Consistently avoid a zero return value when hashing successfully.
Consistently treat a zero hash value from recursing as fatal.
Use hashval_t where appropriate.
(cselib_hash_plus_const_int): Likewise.
(new_cselib_val): Use hashval_t.
(cselib_lookup_1): Likewise.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug rtl-optimization/116002] GCC Compiler time-hog with large basic block in Function
2024-07-19 12:06 [Bug c/116002] New: GCC Compiler Hang with Recursive Macros in Function iamanonymous.cs at gmail dot com
` (3 preceding siblings ...)
2024-07-23 7:34 ` cvs-commit at gcc dot gnu.org
@ 2024-07-23 7:59 ` rguenth at gcc dot gnu.org
2024-07-23 8:35 ` rguenth at gcc dot gnu.org
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-07-23 7:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116002
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to work| |15.0
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
The CPROP and DSE issues are fixed for GCC 15.
With -O1 we need 4s to compile even 'E E E E E E E E E', adding 'F'
and testing with 'F F F F F F F F F' needs 43s which looks linear behavior.
Popping up on the radar are then
combiner : 4.65 ( 11%)
integrated RA : 7.92 ( 18%)
LRA reload inheritance : 4.15 ( 10%)
LRA create live ranges : 5.66 ( 13%)
so mainly register allocation.
With -O2 we have PTA showing up
tree PTA : 12.21 ( 77%)
PTA solving should be pretty quick, but it seems we take a lot of iterations
for unknown reasons:
32.27% 20663 cc1 cc1 [.] topo_visit
16.29% 10432 cc1 cc1 [.] solve_constraints
14.69% 9412 cc1 cc1 [.] find
14.10% 9021 cc1 cc1 [.] bitmap_clear_bit
1.19% 764 cc1 cc1 [.] bitmap_set_bit
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug rtl-optimization/116002] GCC Compiler time-hog with large basic block in Function
2024-07-19 12:06 [Bug c/116002] New: GCC Compiler Hang with Recursive Macros in Function iamanonymous.cs at gmail dot com
` (4 preceding siblings ...)
2024-07-23 7:59 ` rguenth at gcc dot gnu.org
@ 2024-07-23 8:35 ` rguenth at gcc dot gnu.org
2024-07-23 9:51 ` cvs-commit at gcc dot gnu.org
2024-07-25 11:49 ` rguenth at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-07-23 8:35 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116002
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #5)
> With -O2 we have PTA showing up
>
> tree PTA : 12.21 ( 77%)
>
> PTA solving should be pretty quick, but it seems we take a lot of iterations
> for unknown reasons:
>
> 32.27% 20663 cc1 cc1 [.] topo_visit
> 16.29% 10432 cc1 cc1 [.] solve_constraints
> 14.69% 9412 cc1 cc1 [.] find
> 14.10% 9021 cc1 cc1 [.] bitmap_clear_bit
> 1.19% 764 cc1 cc1 [.] bitmap_set_bit
the issue is that we have only constraints like
_7 = &NONLOCAL
e_30 = _7 + UNKNOWN
e_30 = f_27 + UNKNOWN
_8 = f_27 + UNKNOWN
_8 = f_27 + UNKNOWN
_9 = _8 + UNKNOWN
those form a graph with N nodes and no copy edges as those constraints
are all complex. Each solving iteration a few nodes get a new solution
but those dependent have already been evaluated so we bottleneck on
computing the optimal order to visit.
I'm testing a patch for this.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug rtl-optimization/116002] GCC Compiler time-hog with large basic block in Function
2024-07-19 12:06 [Bug c/116002] New: GCC Compiler Hang with Recursive Macros in Function iamanonymous.cs at gmail dot com
` (5 preceding siblings ...)
2024-07-23 8:35 ` rguenth at gcc dot gnu.org
@ 2024-07-23 9:51 ` cvs-commit at gcc dot gnu.org
2024-07-25 11:49 ` rguenth at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-07-23 9:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116002
--- Comment #7 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:15d3b2dab9182eff036a604169b5e6f4ab3b2a40
commit r15-2223-g15d3b2dab9182eff036a604169b5e6f4ab3b2a40
Author: Richard Biener <rguenther@suse.de>
Date: Tue Jul 23 10:29:58 2024 +0200
tree-optimization/116002 - PTA solving slow with degenerate graph
When the constraint graph consists of N nodes with only complex
constraints and no copy edges we have to be lucky to arrive at
a constraint solving order that requires the optimal number of
iterations. What happens in the testcase is that we bottle-neck
on computing the visitation order but propagate changes only
very slowly. Luckily the testcase complex constraints are
all copy-with-offset and those do provide a way to order
visitation. The following adds this which reduces the iteration
count to one.
PR tree-optimization/116002
* tree-ssa-structalias.cc (topo_visit): Also consider
SCALAR = SCALAR complex constraints as edges.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug rtl-optimization/116002] GCC Compiler time-hog with large basic block in Function
2024-07-19 12:06 [Bug c/116002] New: GCC Compiler Hang with Recursive Macros in Function iamanonymous.cs at gmail dot com
` (6 preceding siblings ...)
2024-07-23 9:51 ` cvs-commit at gcc dot gnu.org
@ 2024-07-25 11:49 ` rguenth at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-07-25 11:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116002
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Target Milestone|--- |15.0
Resolution|--- |FIXED
--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
The RTL combiner is now topmost spot at -O3
combiner : 5.39 ( 21%)
but I consider this bug fixed now. Thus, fixed for GCC 15. I don't think it's
worth backporting this for artificial testcases like this.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-07-25 11:49 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-07-19 12:06 [Bug c/116002] New: GCC Compiler Hang with Recursive Macros in Function iamanonymous.cs at gmail dot com
2024-07-19 13:20 ` [Bug rtl-optimization/116002] GCC Compiler time-hog with large basic block " rguenth at gcc dot gnu.org
2024-07-19 13:30 ` rguenth at gcc dot gnu.org
2024-07-19 14:33 ` rguenth at gcc dot gnu.org
2024-07-23 7:34 ` cvs-commit at gcc dot gnu.org
2024-07-23 7:59 ` rguenth at gcc dot gnu.org
2024-07-23 8:35 ` rguenth at gcc dot gnu.org
2024-07-23 9:51 ` cvs-commit at gcc dot gnu.org
2024-07-25 11:49 ` rguenth at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).