public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/113910] New: [12/13/14 regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: ro at gcc dot gnu.org @ 2024-02-13 15:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
Bug ID: 113910
Summary: [12/13/14 regression] Factor 15 slowdown compiling
AMDGPUDisassembler.cpp on SPARC
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ro at gcc dot gnu.org
Target Milestone: ---
Target: sparcv9-sun-solaris2.11
After GCC 11, the compile time for LLVM's
lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
on 64-bit Solaris/SPARC regressed by a factor of 15:
cc1plus -fpreprocessed AMDGPUDisassembler.cpp.ii -quiet -mcpu=v9 -O -std=c++17
-ftime-report -o AMDGPUDisassembler.cpp.s
* GCC 11.4.0:
real 2:14.94
user 2:09.96
sys 4.83
* GCC 14.0.1:
real 33:03.33
user 32:57.32
sys 5.52
I'm attaching the preprocessed input and -ftime-report output for both.
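For reference, the headline factor follows directly from the two "real"
times above. The snippet below is only a worked check of that arithmetic;
the helper names (to_seconds, slowdown_factor, demo) are made up for
illustration and are not part of GCC or LLVM.

```cpp
#include <cassert>

// Convert a "minutes:seconds" reading from time(1) into seconds.
double to_seconds(int minutes, double seconds) {
    assert(minutes >= 0 && seconds >= 0.0);
    return minutes * 60.0 + seconds;
}

// Ratio of the regressed wall-clock time to the baseline.
double slowdown_factor(double baseline_s, double regressed_s) {
    assert(baseline_s > 0.0);
    return regressed_s / baseline_s;
}

// Usage: the "real" times quoted in the report.
void demo() {
    double gcc11 = to_seconds(2, 14.94);   // 134.94 s
    double gcc14 = to_seconds(33, 3.33);   // 1983.33 s
    double factor = slowdown_factor(gcc11, gcc14);
    assert(factor > 14.6 && factor < 14.8);  // i.e. roughly a factor of 15
}
```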
* [Bug tree-optimization/113910] [12/13/14 regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: ro at gcc dot gnu.org @ 2024-02-13 15:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
--- Comment #1 from Rainer Orth <ro at gcc dot gnu.org> ---
Created attachment 57414
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57414&action=edit
preprocessed input
* [Bug tree-optimization/113910] [12/13/14 regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: ro at gcc dot gnu.org @ 2024-02-13 15:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
--- Comment #2 from Rainer Orth <ro at gcc dot gnu.org> ---
Created attachment 57415
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57415&action=edit
GCC 11.4.0 -ftime-report output
* [Bug tree-optimization/113910] [12/13/14 regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: ro at gcc dot gnu.org @ 2024-02-13 15:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
--- Comment #3 from Rainer Orth <ro at gcc dot gnu.org> ---
Created attachment 57416
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57416&action=edit
GCC 14.0.1 -ftime-report output
* [Bug tree-optimization/113910] [12/13/14 regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: pinskia at gcc dot gnu.org @ 2024-02-13 15:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |compile-time-hog
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
>Configure with --enable-checking=release to disable checks.
Can you try that if you are comparing compile times?
Some of the slowdown is definitely related to that:
> tree SSA verifier : 12.28 ( 1%) 0.02 ( 0%) 12.12 ( 1%) 0 ( 0%)
> tree STMT verifier : 18.62 ( 1%) 0.00 ( 0%) 18.79 ( 1%) 0 ( 0%)
> CFG verifier : 9.77 ( 0%) 0.01 ( 0%) 10.01 ( 1%) 0 ( 0%)
> verify RTL sharing : 12.45 ( 1%) 0.01 ( 0%) 12.46 ( 1%) 0 ( 0%)
For example.
* [Bug tree-optimization/113910] [12/13/14 regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: ro at CeBiTec dot Uni-Bielefeld.DE @ 2024-02-13 16:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
--- Comment #5 from ro at CeBiTec dot Uni-Bielefeld.DE <ro at CeBiTec dot Uni-Bielefeld.DE> ---
> --- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
>>Configure with --enable-checking=release to disable checks.
I'm seeing the same slowdown with release builds of GCC 12.3.0 and
13.2.0.
> Can you try that if you are comparing compile times?
> Some of the slow down is definitely related to that:
>> tree SSA verifier : 12.28 ( 1%) 0.02 ( 0%) 12.12 ( 1%) 0 ( 0%)
>> tree STMT verifier : 18.62 ( 1%) 0.00 ( 0%) 18.79 ( 1%) 0 ( 0%)
>> CFG verifier : 9.77 ( 0%) 0.01 ( 0%) 10.01 ( 1%) 0 ( 0%)
>> verify RTL sharing : 12.45 ( 1%) 0.01 ( 0%) 12.46 ( 1%) 0 ( 0%)
>
>
> For an example.
13.2.0 takes
real 19.59
user 16.05
sys 3.43
but was still in the half-hour range with the original full set of
flags. I'll retry that and report.
* [Bug tree-optimization/113910] [12/13/14 regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: rguenth at gcc dot gnu.org @ 2024-02-14 9:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2024-02-14
Status|UNCONFIRMED |NEW
Keywords| |needs-bisection
Ever confirmed|0 |1
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
tree PTA :1795.76 ( 91%)
"nice". Possibly some of the PTA speedups done have regressed this case.
Bisecting would be nice. It seems the preprocessed source "works" on x86_64 as
well at least, for both trunk and GCC 11 (and I confirm 11 is fast).
It might be that inlining heuristic changes make a difference here. PTA is
known to have difficulties with functions with very many calls.
* [Bug tree-optimization/113910] [12/13/14 regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: rguenth at gcc dot gnu.org @ 2024-02-14 9:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target|sparcv9-sun-solaris2.11 |sparcv9-sun-solaris2.11,
| |x86_64-*-*
Target Milestone|--- |12.4
Priority|P3 |P2
* [Bug tree-optimization/113910] [12/13/14 regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: rguenth at gcc dot gnu.org @ 2024-02-14 9:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Note GCC 13 seems to dislike the preprocessed source (odd, 12 and trunk are
happy...)
In file included from /usr/gcc/11/include/c++/11.4.0/memory:76,
from /vol/llvm/src/llvm-project/local/llvm/include/llvm/ADT/SmallVector.h:28,
from /vol/llvm/src/llvm-project/local/llvm/include/llvm/ADT/SmallString.h:17,
from /vol/llvm/src/llvm-project/local/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h:19,
from /vol/llvm/src/llvm-project/local/llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp:19:
/usr/gcc/11/include/c++/11.4.0/bits/unique_ptr.h:486:8: error: expected identifier before '__remove_cv'
/usr/gcc/11/include/c++/11.4.0/bits/unique_ptr.h:486:20: error: expected '(' before '=' token
/usr/gcc/11/include/c++/11.4.0/bits/unique_ptr.h:486:20: error: expected type-specifier before '=' token
/usr/gcc/11/include/c++/11.4.0/bits/unique_ptr.h:486:20: error: expected unqualified-id before '=' token
/usr/gcc/11/include/c++/11.4.0/bits/unique_ptr.h:492:55: error: wrong number of template arguments (1, should be 2)
that's
using __remove_cv = typename remove_cv<_Up>::type;
template<typename _Up>
using __is_derived_Tp
= __and_< is_base_of<_Tp, _Up>,
__not_<is_same<__remove_cv<_Tp>, __remove_cv<_Up>>> >;
I think.
* [Bug tree-optimization/113910] [12/13/14 regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: rguenth at gcc dot gnu.org @ 2024-02-14 10:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
Status|NEW |ASSIGNED
--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
early PTA for decodeToMCInst runs on 241782 variables, and we have 751952
constraints.
A fun testcase ;) A little bit large to work with of course.
* [Bug tree-optimization/113910] [12/13/14 regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: rguenth at gcc dot gnu.org @ 2024-02-14 10:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
With only early PTA enabled (via -fdisable-tree-alias -fdisable-tree-pre), the
compile finished in 18 minutes, with
tree PTA :1044.48 ( 98%) 1.53 ( 27%)1046.29 ( 97%) 4341k ( 0%)
* [Bug tree-optimization/113910] [12/13/14 regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: rguenth at gcc dot gnu.org @ 2024-02-14 11:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
Cutting the switch in decodeToMCInst after case 693: (roughly halving it by the
number of source lines) gets us to
tree PTA : 129.70 ( 92%) 0.51 ( 14%) 130.28 ( 90%) 2279k ( 0%)
TOTAL : 140.28 3.68 144.01 982M
a profile shows
Samples: 657K of event 'cycles:u', Event count (approx.): 873340708228
Overhead Samples Command Shared Object Symbol
88.08% 578019 cc1plus cc1plus [.] bitmap_equal_p
4.76% 31340 cc1plus cc1plus [.] equiv_class_lookup_or_a
0.59% 4039 cc1plus cc1plus [.] bitmap_set_bit
0.24% 1611 cc1plus cc1plus [.] bitmap_copy
The way we hash bitmaps is quite bad: we effectively hash the set bits and
only a subset of the unset bits. Adding a simple additional "hash", the number
of set bits, improves this to
Samples: 214K of event 'cycles:u', Event count (approx.): 283548833048
Overhead Samples Command Shared Object Symbol
69.73% 148209 cc1plus cc1plus [.] bitmap_equal_p
6.29% 13499 cc1plus cc1plus [.] equiv_class_lookup_or_add
Of course we still have too many calls (or bitmaps that are too large but
almost equal). Still, I have a handle on this.
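To illustrate why a cheap extra discriminator such as the number of set bits
helps, here is a toy sketch, assuming a bitmap is just a vector of 64-bit
words. The names Words, weak_hash, set_bit_count and needs_full_compare are
hypothetical and are not GCC functions; the real code works on bitmap
elements and differs in detail.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <vector>

using Words = std::vector<std::uint64_t>;  // toy stand-in for a GCC bitmap

// Toy weak hash: XOR all words together. Bits that appear an even number
// of times in the same position cancel out.
std::uint64_t weak_hash(const Words &w) {
    std::uint64_t h = 0;
    for (std::uint64_t x : w) h ^= x;
    return h;
}

// Cheap second discriminator: the total number of set bits.
unsigned set_bit_count(const Words &w) {
    unsigned n = 0;
    for (std::uint64_t x : w)
        n += static_cast<unsigned>(std::bitset<64>(x).count());
    return n;
}

// True only when both cheap checks agree, i.e. when the expensive
// word-by-word comparison cannot be avoided.
bool needs_full_compare(const Words &a, const Words &b) {
    return weak_hash(a) == weak_hash(b) && set_bit_count(a) == set_bit_count(b);
}

void demo() {
    Words a{0x3, 0x3};  // 4 set bits, XOR of the words is 0
    Words b{0x7, 0x7};  // 6 set bits, XOR of the words is 0
    Words c{0x5, 0x5};  // 4 set bits, XOR of the words is 0
    assert(weak_hash(a) == weak_hash(b));  // the weak hash collides
    assert(!needs_full_compare(a, b));     // bit counts differ: skip compare
    assert(needs_full_compare(a, c));      // bit counts agree: must compare
}
```

The bit count does not remove collisions, it only filters out some of them
before the full comparison, which matches the partial improvement in the
second profile.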
* [Bug tree-optimization/113910] [12/13/14 regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: rguenth at gcc dot gnu.org @ 2024-02-14 11:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 57422
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57422&action=edit
patch
I'm testing the attached which brings down compile-time to the levels of GCC 11
again (a bit faster even, 30s vs. 50s).
* [Bug tree-optimization/113910] [12/13/14 regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: rguenth at gcc dot gnu.org @ 2024-02-14 14:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
Note after the proper fix we still have
(gdb) p pointer_equiv_class_table->m_searches
$17 = 180497
(gdb) p pointer_equiv_class_table->m_collisions
$18 = 4101085
(gdb) p pointer_equiv_class_table->m_n_elements
$22 = 143701
(gdb) p pointer_equiv_class_table->m_size
$23 = 262139
"perfecting" the hash helps (mixing each individual bit number) but then
all the time is spent hashing ;)
Samples: 177K of event 'cycles:u', Event count (approx.): 232966151280
Overhead Samples Command Shared Object Symbol
35.77% 65423 cc1plus cc1plus [.] bitmap_hash
9.64% 16589 cc1plus cc1plus [.] bitmap_set_bit
I think the data structure used is simply far from optimal.
Mixing each bitmap word has higher collision rates than the XOR (dropping
the XOR of the first bit number). Mixing in ptr->indx as well gives
OK collision rates but still
12.78% 16684 cc1plus cc1plus [.] bitmap_set_bit
12.56% 19318 cc1plus cc1plus [.] bitmap_hash
XORing the words on top of mixing ptr->indx gives slightly worse but still
reasonable rates, with
14.03% 16837 cc1plus cc1plus [.] bitmap_set_bit
6.33% 7694 cc1plus cc1plus [.] insert_updated_phi_node
4.74% 7500 cc1plus cc1plus [.] bitmap_hash
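As a rough reading aid for the gdb counters earlier in this comment, the
sketch below derives two summary figures from them: collisions per search
and the table's load factor. TableStats and the helper names are
hypothetical; only the four numbers come from the gdb output.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mirror of the four counters printed in gdb.
struct TableStats {
    std::size_t searches;
    std::size_t collisions;
    std::size_t n_elements;
    std::size_t size;
};

// Average number of collisions encountered per lookup.
double collisions_per_search(const TableStats &s) {
    assert(s.searches > 0);
    return static_cast<double>(s.collisions) / static_cast<double>(s.searches);
}

// Fraction of slots in use.
double load_factor(const TableStats &s) {
    assert(s.size > 0);
    return static_cast<double>(s.n_elements) / static_cast<double>(s.size);
}

// Values copied from the gdb session.
TableStats reported_stats() {
    return TableStats{180497, 4101085, 143701, 262139};
}

void demo() {
    TableStats s = reported_stats();
    assert(collisions_per_search(s) > 22.7 && collisions_per_search(s) < 22.8);
    assert(load_factor(s) > 0.54 && load_factor(s) < 0.55);
}
```

With the table only a little over half full, about 23 collisions per search
suggests the hash values cluster rather than the table being too small.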
* [Bug tree-optimization/113910] [12/13/14 regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: cvs-commit at gcc dot gnu.org @ 2024-02-14 14:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
--- Comment #13 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:ad7a365aaccecd23ea287c7faaab9c7bd50b944a
commit r14-8980-gad7a365aaccecd23ea287c7faaab9c7bd50b944a
Author: Richard Biener <rguenther@suse.de>
Date: Wed Feb 14 12:33:13 2024 +0100
tree-optimization/113910 - huge compile time during PTA
For the testcase in PR113910 we spend a lot of time in PTA comparing
bitmaps for looking up equivalence class members. This points to
the very weak bitmap_hash function which effectively hashes set
and a subset of not set bits.
The major problem with it is that it simply truncates the
BITMAP_WORD sized intermediate hash to hashval_t which is
unsigned int, effectively not hashing half of the bits.
This reduces the compile-time for the testcase from tens of minutes
to 42 seconds and PTA time from 99% to 46%.
PR tree-optimization/113910
* bitmap.cc (bitmap_hash): Mix the full element "hash" to
the hashval_t hash.
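The truncation problem described in the commit message can be reproduced in
a toy model, assuming a 64-bit BITMAP_WORD and a 32-bit hashval_t. The
type aliases and both functions below are illustrative stand-ins; fold_hash
is just one simple way to let all bits contribute and is not the mixing
function the commit uses.

```cpp
#include <cassert>
#include <cstdint>

using word_t = std::uint64_t;  // assumed stand-in for a 64-bit BITMAP_WORD
using hash_t = std::uint32_t;  // assumed stand-in for hashval_t

// Truncating conversion: the upper 32 bits never reach the hash value.
hash_t truncate_hash(word_t h) {
    return static_cast<hash_t>(h);
}

// Illustrative alternative: fold the upper half into the lower half so
// that every bit of the intermediate value influences the result.
hash_t fold_hash(word_t h) {
    return static_cast<hash_t>(h ^ (h >> 32));
}

void demo() {
    word_t a = 0x0000000100000001ULL;
    word_t b = 0x0000000200000001ULL;  // differs from a only in the upper half
    assert(truncate_hash(a) == truncate_hash(b));  // difference is lost
    assert(fold_hash(a) != fold_hash(b));          // difference survives
}
```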
* [Bug tree-optimization/113910] [12/13 Regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: rguenth at gcc dot gnu.org @ 2024-02-14 15:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[12/13/14 regression] |[12/13 Regression] Factor
|Factor 15 slowdown |15 slowdown compiling
|compiling |AMDGPUDisassembler.cpp on
|AMDGPUDisassembler.cpp on |SPARC
|SPARC |
Known to work| |14.0
--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
The regression should be fixed; can you check that we're no longer slower on
trunk? (Either use a release-checking build or use -fno-checking, which should
get you reasonably close.)
* [Bug tree-optimization/113910] [12/13 Regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: ro at CeBiTec dot Uni-Bielefeld.DE @ 2024-02-14 20:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
--- Comment #15 from ro at CeBiTec dot Uni-Bielefeld.DE <ro at CeBiTec dot Uni-Bielefeld.DE> ---
> --- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
> The regression should be fixed, can you check we're now no longer slower on
> trunk? (either use a release checking build or use -fno-checking which should
> get you reasonably close)
I've done a --enable-checking=release build of trunk and compared compile
times on the -save-temps output against g++ 11.4.0:
$ time cc1plus -fpreprocessed AMDGPUDisassembler.cpp.ii -quiet -mcpu=v9 -O
-std=c++17 -o AMDGPUDisassembler.cpp.s
* 11.4.0:
real 2:04.33
user 2:03.86
sys 0.30
* 14.0.1:
real 2:17.58
user 2:16.47
sys 0.87
* [Bug tree-optimization/113910] [12/13 Regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: rguenth at gcc dot gnu.org @ 2024-02-15 10:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, thanks for checking. Btw, -ftime-report for GCC 11 shows different
bottlenecks, which have been fixed in the meantime:
tree PTA : 1.66 ( 3%)
tree SSA incremental : 31.86 ( 61%)
TOTAL : 52.08
but its PTA was a bit less bloated.
I now have
tree PTA : 12.21 ( 35%)
tree SSA incremental : 3.96 ( 11%)
TOTAL : 35.24
on trunk. I guess with bitmaps it also always depends on the memory
hierarchy of the machine; nevertheless, overall it looks fine on SPARC
then.
Queued for backporting. An RFC for further improvements to bitmap_hash
is on the mailing list, but I'm not going to backport that.
* [Bug tree-optimization/113910] [12/13 Regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: rguenth at gcc dot gnu.org @ 2024-02-16 12:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
The following still helps quite a bit on its own.
diff --git a/gcc/bitmap.cc b/gcc/bitmap.cc
index 459e32c1ad1..a05ad810800 100644
--- a/gcc/bitmap.cc
+++ b/gcc/bitmap.cc
@@ -2695,18 +2695,21 @@ hashval_t
bitmap_hash (const_bitmap head)
{
const bitmap_element *ptr;
- BITMAP_WORD hash = 0;
+ hashval_t hash = 0;
int ix;
gcc_checking_assert (!head->tree_form);
for (ptr = head->first; ptr; ptr = ptr->next)
{
- hash ^= ptr->indx;
+ hash = iterative_hash_hashval_t (ptr->indx, hash);
for (ix = 0; ix != BITMAP_ELEMENT_WORDS; ix++)
- hash ^= ptr->bits[ix];
+ if (sizeof (BITMAP_WORD) > sizeof (hashval_t))
+ hash = iterative_hash_host_wide_int (ptr->bits[ix], hash);
+ else
+ hash = iterative_hash_hashval_t (ptr->bits[ix], hash);
}
- return iterative_hash (&hash, sizeof (hash), 0);
+ return hash;
}
^L
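A self-contained sketch of the idea in the diff above: instead of XORing the
element index and words into one value, each of them is fed through a mixer
in sequence, so position matters. Element is a simplified stand-in for
bitmap_element, and mix is a deliberately trivial, hypothetical mixer (a
multiply-by-31 step); it is not GCC's iterative_hash_hashval_t or
iterative_hash_host_wide_int, so the values differ from what GCC computes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for GCC's bitmap_element: an index plus two words.
struct Element {
    unsigned indx;
    std::uint64_t bits[2];
};

// Hypothetical mixer: fold the value to 32 bits, then do a polynomial step.
std::uint32_t mix(std::uint64_t value, std::uint32_t seed) {
    std::uint32_t folded = static_cast<std::uint32_t>(value ^ (value >> 32));
    return seed * 31u + folded;
}

// Old shape: XOR everything together, so position and order are lost.
std::uint32_t xor_hash(const std::vector<Element> &bm) {
    std::uint64_t h = 0;
    for (const Element &e : bm) {
        h ^= e.indx;
        for (std::uint64_t w : e.bits) h ^= w;
    }
    return static_cast<std::uint32_t>(h ^ (h >> 32));
}

// New shape: mix the index and every word into the running hash in order.
std::uint32_t mixed_hash(const std::vector<Element> &bm) {
    std::uint32_t h = 0;
    for (const Element &e : bm) {
        h = mix(e.indx, h);
        for (std::uint64_t w : e.bits) h = mix(w, h);
    }
    return h;
}

void demo() {
    // Same single set bit, but stored in a different word of the element.
    std::vector<Element> a{Element{0, {1, 0}}};
    std::vector<Element> b{Element{0, {0, 1}}};
    assert(xor_hash(a) == xor_hash(b));      // XOR cannot tell them apart
    assert(mixed_hash(a) != mixed_hash(b));  // sequential mixing can
}
```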
* [Bug tree-optimization/113910] [12/13 Regression] Factor 15 slowdown compiling AMDGPUDisassembler.cpp on SPARC
From: cvs-commit at gcc dot gnu.org @ 2024-03-21 11:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910
--- Comment #18 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-13 branch has been updated by Richard Biener
<rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:9a19811ea1e9b3024c0f41b074d71679088bb2d7
commit r13-8478-g9a19811ea1e9b3024c0f41b074d71679088bb2d7
Author: Richard Biener <rguenther@suse.de>
Date: Wed Feb 14 12:33:13 2024 +0100
tree-optimization/113910 - huge compile time during PTA
(cherry picked from commit ad7a365aaccecd23ea287c7faaab9c7bd50b944a)