public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/68136] New: missed tree-level optimization with redundant computations
From: ktkachov at gcc dot gnu.org @ 2015-10-28 14:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68136
Bug ID: 68136
Summary: missed tree-level optimization with redundant computations
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ktkachov at gcc dot gnu.org
Target Milestone: ---
Take the testcase gcc.dg/ifcvt-3.c:
  typedef long long s64;
  int
  foo (s64 a, s64 b, s64 c)
  {
    s64 d = a - b;
    if (d == 0)
      return a + c;
    else
      return b + d + c;
  }
On aarch64 this produces the simplest possible code:
  foo:
          add     w0, w2, w0
          ret
However, this is due to RTL-level if-conversion.
The final tree dump is more complex:
foo (s64D.2694 aD.2695, s64D.2694 bD.2696, s64D.2694 cD.2697)
{
s64D.2694 dD.2700;
intD.7 _1;
unsigned int _5;
unsigned int _7;
unsigned int _8;
intD.7 _9;
unsigned int _10;
unsigned int _11;
unsigned int _13;
unsigned int _14;
intD.7 _15;
unsigned int _17;
;; basic block 2, loop depth 0, count 0, freq 10000, maybe hot
;; prev block 0, next block 3, flags: (NEW, REACHABLE)
;; pred: ENTRY [100.0%] (FALLTHRU,EXECUTABLE)
d_4 = a_2(D) - b_3(D);
if (d_4 == 0)
goto <bb 3>;
else
goto <bb 4>;
;; succ: 3 [39.0%] (TRUE_VALUE,EXECUTABLE)
;; 4 [61.0%] (FALSE_VALUE,EXECUTABLE)
;; basic block 3, loop depth 0, count 0, freq 3900, maybe hot
;; prev block 2, next block 4, flags: (NEW, REACHABLE)
;; pred: 2 [39.0%] (TRUE_VALUE,EXECUTABLE)
# RANGE [0, 4294967295]
_5 = (unsigned int) a_2(D);
# RANGE [0, 4294967295]
_7 = (unsigned int) c_6(D);
# RANGE [0, 4294967295]
_8 = _5 + _7;
_9 = (intD.7) _8;
goto <bb 5>;
;; succ: 5 [100.0%] (FALLTHRU,EXECUTABLE)
;; basic block 4, loop depth 0, count 0, freq 6100, maybe hot
;; prev block 3, next block 5, flags: (NEW, REACHABLE)
;; pred: 2 [61.0%] (FALSE_VALUE,EXECUTABLE)
# RANGE [0, 4294967295]
_10 = (unsigned int) b_3(D);
# RANGE [0, 4294967295]
_11 = (unsigned int) d_4;
# RANGE [0, 4294967295]
_13 = (unsigned int) c_6(D);
# RANGE [0, 4294967295]
_17 = _10 + _13;
# RANGE [0, 4294967295]
_14 = _11 + _17;
_15 = (intD.7) _14;
;; succ: 5 [100.0%] (FALLTHRU,EXECUTABLE)
;; basic block 5, loop depth 0, count 0, freq 10000, maybe hot
;; prev block 4, next block 1, flags: (NEW, REACHABLE)
;; pred: 3 [100.0%] (FALLTHRU,EXECUTABLE)
;; 4 [100.0%] (FALLTHRU,EXECUTABLE)
# _1 = PHI <_9(3), _15(4)>
# VUSE <.MEM_16(D)>
return _1;
;; succ: EXIT [100.0%]
}
It's probably a good idea to detect this earlier and produce a "return a + c;"
at the tree level.
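As a minimal sketch of the equivalence being asked for (not GCC code): in the else arm, b + d + c expands to b + (a - b) + c, which is a + c, so both arms return the same value. The names foo_simplified and check_small_inputs are hypothetical helpers introduced here for illustration, and the inputs are kept small so that neither the s64 arithmetic nor the conversion to int can overflow.

```c
#include <assert.h>

typedef long long s64;

/* The testcase from the report, unchanged. */
int
foo (s64 a, s64 b, s64 c)
{
  s64 d = a - b;
  if (d == 0)
    return a + c;
  else
    return b + d + c;
}

/* Hypothetical hand-simplified form: b + (a - b) + c == a + c,
   so the branch and the use of b disappear.  */
int
foo_simplified (s64 a, s64 c)
{
  return a + c;
}

/* Compare both versions over a small range of inputs, chosen so that
   no intermediate result overflows.  */
void
check_small_inputs (void)
{
  for (s64 a = -8; a <= 8; a++)
    for (s64 b = -8; b <= 8; b++)
      for (s64 c = -8; c <= 8; c++)
        assert (foo (a, b, c) == foo_simplified (a, c));
}
```

Calling check_small_inputs () exercises both the d == 0 arm (a == b) and the else arm. It only checks the equivalence on sample values and says nothing about which pass should perform the simplification.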
* [Bug tree-optimization/68136] missed tree-level optimization with redundant computations
From: ktkachov at gcc dot gnu.org @ 2015-10-28 14:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68136
--- Comment #1 from ktkachov at gcc dot gnu.org ---
This came out of PR67462.
* [Bug tree-optimization/68136] missed tree-level optimization with redundant computations
From: rguenth at gcc dot gnu.org @ 2015-10-28 15:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68136
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-10-28
                 CC|                            |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
  int
  foo (s64 a, s64 b, s64 c)
  {
    s64 d = a - b;
    if (d == 0)
      return a + c;
    else
      return b + d + c;
  }
OK, so the issue here is that the additions are narrowed to unsigned int
because of the apparent cast to 'int' for the return, while the
d = a - b assignment is not narrowed in this way.
  (u32)b + (u32)(a - b) + (u32)c
cannot be easily simplified to the desired (u32)a + (u32)c.
As 'd' is multi-use, we don't change the compare to a == b either,
which would expose the then single-use 'd' to reassoc (which could get
some tricks to perform narrowing).
I don't see a very good answer here. Pattern-matching the whole thing
is of course an option but I don't see that as a scalable solution.
Maybe hacking up reassoc to do that narrowing trick and also consider
non-single-use chains (marked specially so they only participate in
simplification but never actual association or code-gen).
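To make the narrowing point concrete, here is a small sketch (an illustration only, not reassoc code) of why the two forms agree in value even though the mixed-width expression is hard to simplify syntactically. Truncation to 32 bits commutes with addition and subtraction modulo 2^32, so (u32)(a - b) equals (u32)a - (u32)b; once the subtraction is narrowed as well, b cancels within a single unsigned chain. The function names below are hypothetical, u32 is spelled uint32_t, and the sample inputs are chosen so that the 64-bit a - b does not overflow.

```c
#include <assert.h>
#include <stdint.h>

typedef long long s64;

/* The else arm as it appears after narrowing the additions only:
   the subtraction is still done in 64 bits.  */
uint32_t
narrowed_additions (s64 a, s64 b, s64 c)
{
  return (uint32_t) b + (uint32_t) (a - b) + (uint32_t) c;
}

/* The same arm with the subtraction narrowed too.  Every operation is
   now 32-bit unsigned, so ub and -ub cancel by modular arithmetic.  */
uint32_t
narrowed_everything (s64 a, s64 b, s64 c)
{
  uint32_t ua = (uint32_t) a, ub = (uint32_t) b, uc = (uint32_t) c;
  return ub + (ua - ub) + uc;
}

/* The desired result.  */
uint32_t
desired (s64 a, s64 c)
{
  return (uint32_t) a + (uint32_t) c;
}

/* Returns 1 when all three forms agree for the given inputs.  */
int
all_agree (s64 a, s64 b, s64 c)
{
  uint32_t want = desired (a, c);
  return narrowed_additions (a, b, c) == want
         && narrowed_everything (a, b, c) == want;
}
```

For example, all_agree (0x1234567890LL, 0x0FEDCBA987LL, -3) holds even though the operands do not fit in 32 bits. The sketch shows only that the values match; getting the compiler to see this still requires narrowing the a - b definition or teaching reassoc to look through the conversion, as the comment suggests.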
* [Bug tree-optimization/68136] missed tree-level optimization with redundant computations
From: pinskia at gcc dot gnu.org @ 2021-07-26 23:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68136
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2015-10-28 00:00:00         |2021-7-26
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Improved for GCC 11 by r11-3207. The testcase was even changed to show that it
is not fully fixed:
  typedef long long s64;
  int
  foo (s64 a, s64 b, s64 c)
  {
    s64 d = a - b;
    if (d == 0)
      return a + c;
    else
      return b + c + d;
  }