public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/108041] New: ivopts results in extra instruction in simple loop
@ 2022-12-09 22:45 law at gcc dot gnu.org
2022-12-11 12:26 ` [Bug tree-optimization/108041] " rguenth at gcc dot gnu.org
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: law at gcc dot gnu.org @ 2022-12-09 22:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108041
Bug ID: 108041
Summary: ivopts results in extra instruction in simple loop
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: law at gcc dot gnu.org
CC: rzinsly at ventanamicro dot com
Target Milestone: ---
ivopts seems to make a bit of a mess of this code, leaving the loop with an
unnecessary instruction. Compile for rv64 with -O2:
typedef struct network
{
  long nr_group, full_groups, max_elems;
} network_t;

void marc_arcs(network_t* net)
{
  while (net->full_groups < 0) {
    net->full_groups = net->nr_group + net->full_groups;
    net->max_elems--;
  }
}
After slp1 we have this loop:
;;   basic block 3, loop depth 0
;;    pred:       2
  _1 = net_8(D)->nr_group;
  net__max_elems_lsm.4_16 = net_8(D)->max_elems;
;;    succ:       4

;;   basic block 4, loop depth 1
;;    pred:       7
;;                3
  # _13 = PHI <_2(7), _11(3)>
  # net__max_elems_lsm.4_5 = PHI <_4(7), net__max_elems_lsm.4_16(3)>
  _2 = _1 + _13;
  _4 = net__max_elems_lsm.4_5 + -1;
  if (_2 < 0)
    goto <bb 7>; [89.00%]
  else
    goto <bb 5>; [11.00%]
;;    succ:       7
;;                5

;;   basic block 7, loop depth 1
;;    pred:       4
  goto <bb 4>; [100.00%]
;;    succ:       4

;;   basic block 5, loop depth 0
;;    pred:       4
  # _12 = PHI <_2(4)>
  # _17 = PHI <_4(4)>
  net_8(D)->full_groups = _12;
  net_8(D)->max_elems = _17;
;;    succ:       6
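
For readers less used to GIMPLE, the dump above can be read roughly as the C
below. This is a hand-written sketch, not compiler output. The function name
marc_arcs_after_lsm is made up for illustration, the loop entry test and the
load of _11 are assumed to live in bb 2 (which the dump does not show), and the
assert is an added guard so that the sketch always terminates.

```c
#include <assert.h>

typedef struct network {
    long nr_group, full_groups, max_elems;
} network_t;

/* Hypothetical name: a C-level reading of the slp1 dump. */
void marc_arcs_after_lsm(network_t *net)
{
    long full = net->full_groups;   /* _11 (assumed to be loaded in bb 2) */
    if (full >= 0)
        return;                     /* loop entry test (assumed to be in bb 2) */

    long nr = net->nr_group;        /* _1 */
    assert(nr > 0);                 /* sketch-only guard: the loop must terminate */
    long max = net->max_elems;      /* net__max_elems_lsm.4_16 */

    do {
        full = nr + full;           /* _2 = _1 + _13 */
        max = max + -1;             /* _4 = net__max_elems_lsm.4_5 + -1 */
    } while (full < 0);

    net->full_groups = full;        /* store of _12 in bb 5 */
    net->max_elems = max;           /* store of _17 in bb 5 */
}
```

Both fields are kept in locals for the duration of the loop and written back
once on exit, which is the effect of store motion described in the report.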
Of particular interest is the max_elems computation into _4. We accumulate it
in the loop, then do the final store after the loop (thank you LSM!). After
ivopts we have:
;;   basic block 3, loop depth 0
;;    pred:       2
  _1 = net_8(D)->nr_group;
  net__max_elems_lsm.4_16 = net_8(D)->max_elems;
  _22 = net__max_elems_lsm.4_16 + -1;
  ivtmp.10_21 = (unsigned long) _22;
;;    succ:       4

;;   basic block 4, loop depth 1
;;    pred:       7
;;                3
  # _13 = PHI <_2(7), _11(3)>
  # ivtmp.10_3 = PHI <ivtmp.10_18(7), ivtmp.10_21(3)>
  _2 = _1 + _13;
  _4 = (long int) ivtmp.10_3;
  ivtmp.10_18 = ivtmp.10_3 - 1;
  if (_2 < 0)
    goto <bb 7>; [89.00%]
  else
    goto <bb 5>; [11.00%]
;;    succ:       7
;;                5

;;   basic block 7, loop depth 1
;;    pred:       4
  goto <bb 4>; [100.00%]
;;    succ:       4

;;   basic block 5, loop depth 0
;;    pred:       4
  # _12 = PHI <_2(4)>
  # _17 = PHI <_4(4)>
  net_8(D)->full_groups = _12;
  net_8(D)->max_elems = _17;
;;    succ:       6
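
A rough C reading of the post-ivopts dump may make the two live values easier
to see. Again this is only a sketch under assumptions: the name
marc_arcs_after_ivopts is hypothetical, the entry test is assumed to sit in
bb 2, the assert is an added termination guard, and the unsigned-to-signed
conversion relies on GCC's wrapping behaviour for out-of-range values.

```c
#include <assert.h>

typedef struct network {
    long nr_group, full_groups, max_elems;
} network_t;

/* Hypothetical name: a C-level reading of the ivopts dump. */
void marc_arcs_after_ivopts(network_t *net)
{
    long full = net->full_groups;                    /* _11 (assumed, bb 2) */
    if (full >= 0)
        return;                                      /* entry test (assumed, bb 2) */

    long nr = net->nr_group;                         /* _1 */
    assert(nr > 0);                                  /* sketch-only guard */
    long max0 = net->max_elems;                      /* net__max_elems_lsm.4_16 */
    unsigned long iv = (unsigned long)(max0 + -1);   /* _22, then ivtmp.10_21 */
    long max;

    do {
        full = nr + full;                            /* _2 = _1 + _13 */
        max = (long)iv;                              /* _4: IV value before the decrement */
        iv = iv - 1;                                 /* ivtmp.10_18 */
    } while (full < 0);

    net->full_groups = full;                         /* store of _12 */
    net->max_elems = max;                            /* store of _17: needs the old IV value */
}
```

Inside the loop both max and iv are updated on every iteration, yet only max is
used after the exit, and iv exists solely to produce it.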
Note the introduction of the IV and its relationship to _4. Essentially we
compute both in the loop even though _4 is always one greater than the IV.
Worse yet, the IV is only used to compute _4! And since they differ by 1, we
actually compute both and keep them alive, resulting in this final code for rv64:
.L3:
        add     a5,a5,a2
        mv      a3,a4
        addi    a4,a4,-1
        blt     a5,zero,.L3
        sd      a5,8(a0)
        sd      a3,16(a0)
Note how we had to "stash away" the value of a4 before the decrement so that we
could store it after the loop. The induction variable doesn't really buy us
anything in this loop -- it's actively harmful. Not using the IV would
probably be best. Second best would be to realize that _4 (aka a3) can be
derived from the IV (a4) after the loop by adding 1.
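
The "second best" option can be sketched at the C level as follows. This is an
illustration of the idea only, not a proposed patch and not what any GCC
version is known to emit. The name marc_arcs_derived is hypothetical, the
assert is an added termination guard, and the original marc_arcs is repeated
so the two can be compared.

```c
#include <assert.h>

typedef struct network {
    long nr_group, full_groups, max_elems;
} network_t;

/* Original function from the report, kept for comparison. */
void marc_arcs(network_t *net)
{
    while (net->full_groups < 0) {
        net->full_groups = net->nr_group + net->full_groups;
        net->max_elems--;
    }
}

/* Hypothetical variant: only the IV is updated in the loop, and the value
   to store is recovered after the exit by adding 1 back. */
void marc_arcs_derived(network_t *net)
{
    long full = net->full_groups;
    if (full >= 0)
        return;

    long nr = net->nr_group;
    assert(nr > 0);                                  /* sketch-only guard */
    unsigned long iv = (unsigned long)(net->max_elems + -1);

    do {
        full = nr + full;
        iv = iv - 1;                                 /* the only counter kept live */
    } while (full < 0);

    net->full_groups = full;
    net->max_elems = (long)(iv + 1);                 /* _4 is the IV plus 1 */
}
```

With this shape the loop body needs no copy of the pre-decrement value, at the
cost of one extra add after the loop. Dropping the IV entirely and decrementing
max_elems directly in a register would avoid even that add.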
* [Bug tree-optimization/108041] ivopts results in extra instruction in simple loop
From: rguenth at gcc dot gnu.org @ 2022-12-11 12:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108041
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think IVOPTs does not consider the use on the exit edge and its effect on
liveness, but it does consider both IVs.
* [Bug tree-optimization/108041] ivopts results in extra instruction in simple loop
From: rguenth at gcc dot gnu.org @ 2022-12-11 12:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108041
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-12-11
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed also on x86_64 btw.
* [Bug tree-optimization/108041] ivopts results in extra instruction in simple loop
From: law at gcc dot gnu.org @ 2023-05-29 16:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108041
--- Comment #3 from Jeffrey A. Law <law at gcc dot gnu.org> ---
Created attachment 55185
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55185&action=edit
(Incomplete) Patch
* [Bug tree-optimization/108041] ivopts results in extra instruction in simple loop
From: law at gcc dot gnu.org @ 2023-05-30 20:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108041
--- Comment #4 from Jeffrey A. Law <law at gcc dot gnu.org> ---
Patch was for a different problem. Sorry.