public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/108738] New: compile-time and memory-hog in mdreorg
@ 2023-02-09 8:10 rguenth at gcc dot gnu.org
2023-02-09 8:47 ` [Bug target/108738] " rguenth at gcc dot gnu.org
` (8 more replies)
0 siblings, 9 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-09 8:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738
Bug ID: 108738
Summary: compile-time and memory-hog in mdreorg
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---
The compiler.i testcase from PR26854 at -O2 exhibits
machine dep reorg : 550.15 ( 71%) 0.00 ( 0%) 550.16 ( 70%)
24 ( 0%)
TOTAL : 770.35 14.29 784.73
1475M
770.35user 14.33system 13:04.78elapsed 99%CPU (0avgtext+0avgdata
27915760maxresident)k
0inputs+18424outputs (0major+26040556minor)pagefaults 0swaps
I've fixed compile-time issues in that in the past but it seems another such
issue has re-surfaced.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/108738] compile-time and memory-hog in mdreorg
2023-02-09 8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
@ 2023-02-09 8:47 ` rguenth at gcc dot gnu.org
2023-02-09 9:11 ` rguenth at gcc dot gnu.org
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-09 8:47 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target| |x86_64-*-*
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
A perf report isn't very conclusive, but at least points at STV:
Samples: 3M of event 'cycles:u', Event count (approx.): 4455606089800
Samples Command Shared Object Symbol
2153935 cc1 cc1 [.] bitmap_bit_p
288315 cc1 cc1 [.] (anonymous
namespace)::scalar_chain::analyze_register_chain
145980 cc1 cc1 [.] bitmap_equal_p
85953 cc1 cc1 [.] pop_scope
70985 cc1 cc1 [.] bitmap_ior_into
87675 cc1 cc1 [.] df_chain_create
with -O2 -mno-stv we get to
machine dep reorg : 0.01 ( 0%) 0.00 ( 0%) 0.01 ( 0%)
0 ( 0%)
TOTAL : 202.90 8.46 211.41
1475M
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/108738] compile-time and memory-hog in mdreorg
2023-02-09 8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
2023-02-09 8:47 ` [Bug target/108738] " rguenth at gcc dot gnu.org
@ 2023-02-09 9:11 ` rguenth at gcc dot gnu.org
2023-02-09 9:24 ` rguenth at gcc dot gnu.org
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-09 9:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2023-02-09
Ever confirmed|0 |1
Known to fail| |10.4.0
Status|UNCONFIRMED |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Happens at least since GCC 10.
There's a comment already:
/* ??? The following is quadratic since analyze_register_chain
iterates over all refs to look for dual-mode regs. Instead this
should be done separately for all regs mentioned in the chain once. */
df_ref ref;
for (ref = DF_INSN_UID_DEFS (insn_uid); ref; ref = DF_REF_NEXT_LOC (ref))
if (!HARD_REGISTER_P (DF_REF_REG (ref)))
analyze_register_chain (candidates, ref);
but the walk also adds to the queue, so it's more interwinded and the suggested
solution isn't the proper. Instead it might be possible to just record
already visited regs.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/108738] compile-time and memory-hog in mdreorg
2023-02-09 8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
2023-02-09 8:47 ` [Bug target/108738] " rguenth at gcc dot gnu.org
2023-02-09 9:11 ` rguenth at gcc dot gnu.org
@ 2023-02-09 9:24 ` rguenth at gcc dot gnu.org
2023-02-09 12:48 ` rguenth at gcc dot gnu.org
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-09 9:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
That brings it down to
machine dep reorg : 87.09 ( 28%)
let me see if there's something else obvious to do.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/108738] compile-time and memory-hog in mdreorg
2023-02-09 8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
` (2 preceding siblings ...)
2023-02-09 9:24 ` rguenth at gcc dot gnu.org
@ 2023-02-09 12:48 ` rguenth at gcc dot gnu.org
2023-02-15 7:28 ` cvs-commit at gcc dot gnu.org
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-09 12:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
The odd thing is that DF_REF_CHAIN of a USE ref contains _all_ definitions of
the pseudo while DF_REF_CHAIN of a DEF ref contains only reaching uses!?
Probably an artifact of how df_chain_create_bb_process_use operates.
That makes this much harder to fix ...
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/108738] compile-time and memory-hog in mdreorg
2023-02-09 8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
` (3 preceding siblings ...)
2023-02-09 12:48 ` rguenth at gcc dot gnu.org
@ 2023-02-15 7:28 ` cvs-commit at gcc dot gnu.org
2023-02-15 7:28 ` cvs-commit at gcc dot gnu.org
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-02-15 7:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738
--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:ec23e9e25eb64bb066dc408fd498861b8587bec8
commit r13-5994-gec23e9e25eb64bb066dc408fd498861b8587bec8
Author: Richard Biener <rguenther@suse.de>
Date: Thu Feb 9 11:03:25 2023 +0100
target/108738 - STV bitmap operations compile-time hog
When the set of candidates becomes very large then repeated
bit checks on it during the build of an actual chain can become
slow because of the O(n) nature of bitmap tests. The following
switches the candidates bitmaps to the tree representation before
building the chains to get O(log n) amortized behavior.
For the testcase at hand this improves STV time by 50%.
PR target/108738
* config/i386/i386-features.cc (convert_scalars_to_vector):
Switch candidates bitmaps to tree view before building the chains.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/108738] compile-time and memory-hog in mdreorg
2023-02-09 8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
` (4 preceding siblings ...)
2023-02-15 7:28 ` cvs-commit at gcc dot gnu.org
@ 2023-02-15 7:28 ` cvs-commit at gcc dot gnu.org
2023-03-02 11:38 ` rguenth at gcc dot gnu.org
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-02-15 7:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738
--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:e1dfac7e71056e879f101fef1c5ecb8ff6be1a1f
commit r13-5995-ge1dfac7e71056e879f101fef1c5ecb8ff6be1a1f
Author: Richard Biener <rguenther@suse.de>
Date: Thu Feb 9 13:40:43 2023 +0100
target/108738 - optimize bit operations in STV
The following does low-hanging optimizations, combining bitmap
test and set and removing redundant operations.
PR target/108738
* config/i386/i386-features.cc (scalar_chain::add_to_queue):
Combine bitmap test and set.
(scalar_chain::add_insn): Likewise.
(scalar_chain::analyze_register_chain): Remove redundant
attempt to add to queue and instead strengthen assert.
Sink common attempts to mark the def dual-mode.
(scalar_chain::add_to_queue): Remove redundant insn bitmap
check.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/108738] compile-time and memory-hog in mdreorg
2023-02-09 8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
` (5 preceding siblings ...)
2023-02-15 7:28 ` cvs-commit at gcc dot gnu.org
@ 2023-03-02 11:38 ` rguenth at gcc dot gnu.org
2023-03-03 7:31 ` cvs-commit at gcc dot gnu.org
2023-03-03 7:33 ` rguenth at gcc dot gnu.org
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-02 11:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
So it's now better than before but still quadratic.
Finding a strathegic place to limit the search with some --param might be a
solution, but there's no easy
point to hook that into. You'd not want to disable the whole pass but
terminate the greedy search and axe the candidates sofar processed (to not run
into the same ones again), which might then result in "odd" STV decisions if
the remains are picked up. To avoid this maybe maintain a "too big" set of
candidates
and if a further greedy search lands at a insn in that set, axe that search as
well. Note it's not the size of the set but the complexity of the search that
needs limiting, so count the number of ref visits through
analyze_register_chain
for an invocation of scalar_chain::build and limit that to some --param.
I'm trying to prototype that.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/108738] compile-time and memory-hog in mdreorg
2023-02-09 8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
` (6 preceding siblings ...)
2023-03-02 11:38 ` rguenth at gcc dot gnu.org
@ 2023-03-03 7:31 ` cvs-commit at gcc dot gnu.org
2023-03-03 7:33 ` rguenth at gcc dot gnu.org
8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-03-03 7:31 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738
--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:6010189923908501ca5b02bd1f4aee05d2283118
commit r13-6439-g6010189923908501ca5b02bd1f4aee05d2283118
Author: Richard Biener <rguenther@suse.de>
Date: Thu Mar 2 13:43:51 2023 +0100
target/108738 - limit STV chain discovery
The following puts a hard limit on the inherently quadratic STV chain
discovery. Without a limit for the compiler.i testcase in PR26854
we see at -O2
machine dep reorg : 574.45 ( 53%)
with release checking while with the proposed limit it's
machine dep reorg : 2.86 ( 1%)
PR target/108738
* config/i386/i386.opt (--param x86-stv-max-visits): New param.
* doc/invoke.texi (--param x86-stv-max-visits): Document it.
* config/i386/i386-features.h (scalar_chain::max_visits): New.
(scalar_chain::build): Add bitmap parameter, return boolean.
(scalar_chain::add_insn): Likewise.
(scalar_chain::analyze_register_chain): Likewise.
* config/i386/i386-features.cc (scalar_chain::scalar_chain):
Initialize max_visits.
(scalar_chain::analyze_register_chain): When we exhaust
max_visits, abort. Also abort when running into any
disallowed insn.
(scalar_chain::add_insn): Propagate abort.
(scalar_chain::build): Likewise. When aborting amend
the set of disallowed insn with the insns set.
(convert_scalars_to_vector): Adjust. Do not convert aborted
chains.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/108738] compile-time and memory-hog in mdreorg
2023-02-09 8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
` (7 preceding siblings ...)
2023-03-03 7:31 ` cvs-commit at gcc dot gnu.org
@ 2023-03-03 7:33 ` rguenth at gcc dot gnu.org
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-03 7:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Keywords| |compile-time-hog
Target Milestone|--- |13.0
Status|ASSIGNED |RESOLVED
--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
This has been mitigated for GCC 13, in theory backporting is possible but
there's a workaround available (-mno-stv), so I think that's not strictly
necessary since the issue is quite old already.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2023-03-03 7:33 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-09 8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
2023-02-09 8:47 ` [Bug target/108738] " rguenth at gcc dot gnu.org
2023-02-09 9:11 ` rguenth at gcc dot gnu.org
2023-02-09 9:24 ` rguenth at gcc dot gnu.org
2023-02-09 12:48 ` rguenth at gcc dot gnu.org
2023-02-15 7:28 ` cvs-commit at gcc dot gnu.org
2023-02-15 7:28 ` cvs-commit at gcc dot gnu.org
2023-03-02 11:38 ` rguenth at gcc dot gnu.org
2023-03-03 7:31 ` cvs-commit at gcc dot gnu.org
2023-03-03 7:33 ` rguenth at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).