public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/110034] New: The first popped allcono doesn't take precedence over later popped in ira coloring
From: guihaoc at gcc dot gnu.org @ 2023-05-30 8:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110034
Bug ID: 110034
Summary: The first popped allcono doesn't take precedence over
later popped in ira coloring
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: guihaoc at gcc dot gnu.org
Target Milestone: ---
The following are IRA dumps from a test case. r134 has only one copy (cp,
a shuffle) with r173. r173 and r124 both prefer hard register r3. r134 is
popped first, but it fails to get hard register r3 because the conflict cost
is high. If r173 were ahead of r134, r134 could get hard register r3, as there
would be no hop between r3 and r134 once r173 is assigned r3. It seems the
first popped allocno (r134) doesn't take precedence over the later popped
allocno (r124).
r173 has a preferred hard register and no conflicting allocnos, so r173 can
always be assigned r3.
;; a29(r173,l0) conflicts:
;; total conflict hard regs:
;; conflict hard regs:
...
cp11:a18(r134)<->a29(r173)@125:shuffle
pref0:a12(r158)<-hr100@711
pref1:a23(r144)<-hr100@920
pref2:a25(r140)<-hr100@1842
pref3:a29(r173)<-hr3@2000
pref4:a0(r124)<-hr3@125
...
Start updating from pref of hr3 for a29r173:
a18r134 (hr3): update cost by -62, conflict cost by -62
...
Pushing a1(r169,l0)(cost 0)
Pushing a0(r124,l0)(cost 0)
Pushing a22(r146,l0)(cost 0)
Pushing a20(r125,l0)(cost 0)
Pushing a29(r173,l0)(cost 0)
Pushing a18(r134,l0)(cost 0)
Popping a18(r134,l0) -- (9=0,0) (10=0,0) (8=0,0) (7=0,0) (6=0,0) (5=0,0)
(4=0,0) (3=-62,147) (11=0,0) (0=8000,8000) (31=7,7) (30=7,7) (29=7,7) (28=7,7)
(27=7,7) (26=7,7) (25=7,7) (24=7,7) (23=7,7) (22=7,7) (21=7,7) (20=7,7)
(19=7,7) (18=7,7) (17=7,7) (16=7,7) (15=7,7) (14=7,7) (12=0,0)
Start restoring from a18r134:
a29r173 (hr3): update cost by -62, conflict cost by -62
Start updating from a18r134 by copies:
a29r173 (hr9): update cost by -250, conflict cost by -250
assign reg 9
Popping a29(r173,l0) -- (9=1750,1750) (10=2000,2000) (8=2000,2000)
(7=2000,2000) (6=2000,2000) (5=2000,2000) (4=2000,2000) (3=-2062,-2062)
(11=2000,2000) (0=2000,2000) (31=2007,2007) (30=2007,2007) (29=2007,2007)
(28=2007,2007) (27=2007,2007) (26=2007,2007) (25=2007,2007) (24=2007,2007)
(23=2007,2007) (22=2007,2007) (21=2007,2007) (20=2007,2007) (19=2007,2007)
(18=2007,2007) (17=2007,2007) (16=2007,2007) (15=2007,2007) (14=2007,2007)
(12=2000,2000)
Start restoring from a29r173:
Start updating from a29r173 by copies:
assign reg 3
...
Popping a0(r124,l0) -- (8=0,0) (7=0,0) (6=0,0) (5=0,0) (4=0,0)
(3=-250,-250) (11=0,0) (0=0,0) (31=7,7) (30=7,7) (29=7,7) (28=7,7) (27=7,7)
(26=7,7) (25=7,7) (24=7,7) (23=7,7) (22=7,7) (21=7,7) (20=7,7) (19=7,7)
(18=7,7) (17=7,7) (16=7,7) (15=7,7) (14=7,7) (12=0,0)
Start restoring from a0r124:
Start updating from a0r124 by copies:
a1r169 (hr3): update cost by -44, conflict cost by -44
a2r177 (hr3): update cost by -11, conflict cost by -11
assign reg 3
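The choices in the dump above can be reproduced with a short script. This is only a rough sketch, not IRA's code. It assumes each `(N=A,B)` entry is hard register N with its own cost A and its full cost B (including conflict costs), and that the lowest full cost wins, with the first listed register kept on a tie. Both points are inferred from the dump rather than stated in it, and `parse_costs` and `pick_hard_reg` are hypothetical helper names.

```python
import re

# Cost lists copied from the "Popping" lines above, abridged to a few entries.
POPPED = {
    "r134": "(9=0,0) (10=0,0) (8=0,0) (3=-62,147) (0=8000,8000) (31=7,7)",
    "r173": "(9=1750,1750) (10=2000,2000) (3=-2062,-2062) (31=2007,2007)",
    "r124": "(8=0,0) (7=0,0) (3=-250,-250) (0=0,0) (31=7,7)",
}

def parse_costs(text):
    """Return [(hard_regno, cost, full_cost), ...] in dump order."""
    return [tuple(int(g) for g in m.groups())
            for m in re.finditer(r"\((\d+)=(-?\d+),(-?\d+)\)", text)]

def pick_hard_reg(text):
    """Assumed rule: lowest full cost wins; first listed wins on a tie."""
    # min() returns the first minimal element, which gives the tie-break.
    return min(parse_costs(text), key=lambda entry: entry[2])[0]

for pseudo, costs in POPPED.items():
    print(pseudo, "->", "hr%d" % pick_hard_reg(costs))
```

Under these assumptions r134 sees a full cost of 147 for hr3 against 0 for hr9, so it takes hr9, while r173 and r124 both take hr3, which matches the "assign reg" lines in the dump.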
From: rguenth at gcc dot gnu.org @ 2023-05-30 8:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110034
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |missed-optimization, ra
CC| |vmakarov at gcc dot gnu.org
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Do you have a testcase that reproduces this? Please also specify target and
options.
From: guihaoc at gcc dot gnu.org @ 2023-05-30 9:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110034
HaoChen Gui <guihaoc at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |guihaoc at gcc dot gnu.org
--- Comment #2 from HaoChen Gui <guihaoc at gcc dot gnu.org> ---
Created attachment 55214
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55214&action=edit
test case
Compile it with -O3 -mcpu=power9 -fira-verbose=20 > ira_dump.out 2>&1
From: guihaoc at gcc dot gnu.org @ 2023-05-30 9:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110034
--- Comment #3 from HaoChen Gui <guihaoc at gcc dot gnu.org> ---
Created attachment 55215
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55215&action=edit
ira dump
From: vmakarov at gcc dot gnu.org @ 2023-08-24 13:35 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110034
--- Comment #4 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
Thank you for providing the test case.
To be honest, I don't see why assigning hr3 to r134 is better.
Currently we have the following assignments:
hr9->r134; hr3->r173; hr3->r124
and the related preferences:
cp11:a18(r134)<->a29(r173)@125:shuffle
pref3:a29(r173)<-hr3@2000
pref4:a0(r124)<-hr3@125
This removes cost 2000 (pref3) and cost 125 (pref4) and adds cost 125
(cp11). The profit is 2000.
If we started with r173, we would have the following assignments:
hr3->r173; hr3->r134; <some hard reg but hr3>->r124
This would remove cost 2000 (pref3) and cost 125 (cp11) and add cost
125 (pref4). The profit would be the same 2000.
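The arithmetic behind the two cases can be written out as a small check. This is only an illustration of the comparison made above, using the frequencies from the quoted cp/pref lines. The `profit` helper is a hypothetical name, and the 125 added in the second case is read here as pref4 (r124 no longer getting hr3), which is an interpretation of the text.

```python
# Frequencies taken from the dump lines quoted above.
PREF3 = 2000  # pref3: a29(r173) <- hr3
PREF4 = 125   # pref4: a0(r124)  <- hr3
CP11 = 125    # cp11:  a18(r134) <-> a29(r173), shuffle

def profit(removed, added):
    """Net saving: costs removed by satisfied prefs/copies minus costs added."""
    return sum(removed) - sum(added)

# Current order: hr9->r134, hr3->r173, hr3->r124.
current = profit(removed=[PREF3, PREF4], added=[CP11])

# r173 first: hr3->r173, hr3->r134, r124 gets some hard register other than hr3.
alternative = profit(removed=[PREF3, CP11], added=[PREF4])

print(current, alternative)  # 2000 2000
```

Because cp11 and pref4 carry the same frequency, swapping which one is satisfied leaves the total unchanged.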
Choosing heuristics is very time consuming. I spent a lot of time
trying and benchmarking numerous ones. I clearly remember that the
introduction of pseudo threads for the colorable bucket gave a visible
performance improvement. Currently we assign pseudos from the thread with
the biggest frequency first (r173 and r134), and within the same thread the
pseudo with the biggest frequency first (r134). I think it is logical.
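The ordering rule just described can be sketched as a two-level sort. This is not the IRA implementation. The frequencies below are hypothetical (the thread does not give them), the thread labels `t0` and `t1` are made up, and treating a thread's frequency as the sum of its members' frequencies is an assumption. Only the grouping of r173 and r134 into one thread comes from the text.

```python
from collections import namedtuple

Allocno = namedtuple("Allocno", "name thread freq")

# Hypothetical frequencies, for illustration only.
allocnos = [
    Allocno("r124", thread="t0", freq=125),
    Allocno("r173", thread="t1", freq=300),
    Allocno("r134", thread="t1", freq=500),
]

def assignment_order(allocnos):
    """Highest-frequency thread first, then highest-frequency pseudo within it."""
    thread_freq = {}
    for a in allocnos:
        # Assumption: a thread's frequency is the sum over its members.
        thread_freq[a.thread] = thread_freq.get(a.thread, 0) + a.freq
    return sorted(allocnos, key=lambda a: (-thread_freq[a.thread], -a.freq))

print([a.name for a in assignment_order(allocnos)])
```

With these made-up numbers the order is r134, r173, r124, the same relative order in which the three pseudos are popped in the reported dump.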
Also, it is always possible to find a test (not this case) where
heuristics give some undesirable results. RA is an NP-complete task even
in its simplest formulation. We cannot get the optimal solution in
reasonable time.
Still I am open to change any heuristic if somebody can show that it
improves performance for some credible benchmark (I prefer SPEC2007)
on major GCC targets.
From: guihaoc at gcc dot gnu.org @ 2023-08-29 9:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110034
--- Comment #5 from HaoChen Gui <guihaoc at gcc dot gnu.org> ---
(In reply to Vladimir Makarov from comment #4)
Thanks for your explanation. I agree with it. I also checked the assembly and
found that there is no potential performance gain when r3 is assigned to r134.
It should not be a bug.
From: guihaoc at gcc dot gnu.org @ 2023-08-31 5:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110034
HaoChen Gui <guihaoc at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |INVALID
Status|UNCONFIRMED |RESOLVED
--- Comment #6 from HaoChen Gui <guihaoc at gcc dot gnu.org> ---
It's not a problem.