public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/67628] New: [tree-optimization] (a && b) && c shows better codegen than a && (b && c)
@ 2015-09-18 14:32 ktkachov at gcc dot gnu.org
2015-09-18 14:48 ` [Bug tree-optimization/67628] " pinskia at gcc dot gnu.org
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2015-09-18 14:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67628
Bug ID: 67628
Summary: [tree-optimization] (a && b) && c shows better codegen
than a && (b && c)
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ktkachov at gcc dot gnu.org
Target Milestone: ---
Consider the two functions:
int
foo1 (int a, int b, int c, int d)
{
return a > b && b <= c && c > d;
}
int
foo2 (int a, int b, int c, int d)
{
return a > b && (b <= c && c > d);
}
On aarch64 foo1 generates:
foo1:
cmp w1, w2
ccmp w2, w3, 4, le
ccmp w0, w1, 4, gt
cset w0, gt
ret
but for foo2 generates:
foo2:
cmp w0, w1
ble .L4
cmp w1, w2
cset w1, le
cmp w2, w3
cset w0, gt
and w0, w1, w0
ret
Something similar is observed on x86_64 where foo2 contains a conditional
branch instruction where foo1 is a single basic block
In foo2 we end up generating multiple basic blocks whereas for foo1 we manage
to merge them all into 1 basic block which ends up going through the
conditional-compare pass nicely.
Looking at the final .optimized tree dump the foo1 tree is:
_BoolD.2673 _1;
_BoolD.2673 _4;
_BoolD.2673 _6;
_BoolD.2673 _10;
_BoolD.2673 _11;
intD.7 _12;
;; basic block 2, loop depth 0, count 0, freq 10000, maybe hot
;; prev block 0, next block 1, flags: (NEW, REACHABLE)
;; pred: ENTRY [100.0%] (FALLTHRU,EXECUTABLE)
# RANGE [0, 1]
_4 = a_2(D) > b_3(D);
# RANGE [0, 1]
_6 = b_3(D) <= c_5(D);
# RANGE [0, 1]
_10 = c_5(D) > d_8(D);
# RANGE [0, 1]
_1 = _6 & _10;
# RANGE [0, 1]
_11 = _1 & _4;
# RANGE [0, 1] NONZERO 1
_12 = (intD.7) _11;
# VUSE <.MEM_9(D)>
return _12;
;; succ: EXIT [100.0%]
whereas for foo2 it's more complex:
intD.7 iftmp.0_1;
_BoolD.2673 _5;
_BoolD.2673 _7;
_BoolD.2673 _8;
intD.7 _10;
_BoolD.2673 _11;
;; basic block 2, loop depth 0, count 0, freq 10000, maybe hot
;; prev block 0, next block 3, flags: (NEW, REACHABLE)
;; pred: ENTRY [100.0%] (FALLTHRU,EXECUTABLE)
if (a_2(D) > b_3(D))
goto <bb 3>;
else
goto <bb 4>;
;; succ: 3 [50.0%] (TRUE_VALUE,EXECUTABLE)
;; 4 [50.0%] (FALSE_VALUE,EXECUTABLE)
;; basic block 3, loop depth 0, count 0, freq 5000, maybe hot
;; prev block 2, next block 4, flags: (NEW, REACHABLE)
;; pred: 2 [50.0%] (TRUE_VALUE,EXECUTABLE)
# RANGE [0, 1]
_5 = b_3(D) <= c_4(D);
# RANGE [0, 1]
_7 = c_4(D) > d_6(D);
# RANGE [0, 1]
_8 = _5 & _7;
;; succ: 4 [100.0%] (FALLTHRU,EXECUTABLE)
;; basic block 4, loop depth 0, count 0, freq 10000, maybe hot
;; prev block 3, next block 1, flags: (NEW, REACHABLE)
;; pred: 3 [100.0%] (FALLTHRU,EXECUTABLE)
;; 2 [50.0%] (FALSE_VALUE,EXECUTABLE)
# _11 = PHI <_8(3), 0(2)>
# RANGE [0, 1] NONZERO 1
iftmp.0_1 = (intD.7) _11;
# VUSE <.MEM_9(D)>
return iftmp.0_1;
;; succ: EXIT [100.0%]
If we were to pick some kind of canonicalization for these equivalent
expressions it would make life easier for later passes to generate consistent
code.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/67628] [tree-optimization] (a && b) && c shows better codegen than a && (b && c)
2015-09-18 14:32 [Bug tree-optimization/67628] New: [tree-optimization] (a && b) && c shows better codegen than a && (b && c) ktkachov at gcc dot gnu.org
@ 2015-09-18 14:48 ` pinskia at gcc dot gnu.org
2015-09-18 15:00 ` ktkachov at gcc dot gnu.org
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2015-09-18 14:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67628
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Last reconfirmed| |2015-09-18
CC| |pinskia at gcc dot gnu.org
Ever confirmed|0 |1
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This is due to the fold-const.c optimization which should not be there any
more. You need to do benchmarking on x86 also if you remove it.
I have not done that yet which is why I have not submitted the patch to fix
this.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/67628] [tree-optimization] (a && b) && c shows better codegen than a && (b && c)
2015-09-18 14:32 [Bug tree-optimization/67628] New: [tree-optimization] (a && b) && c shows better codegen than a && (b && c) ktkachov at gcc dot gnu.org
2015-09-18 14:48 ` [Bug tree-optimization/67628] " pinskia at gcc dot gnu.org
@ 2015-09-18 15:00 ` ktkachov at gcc dot gnu.org
2015-09-18 15:10 ` pinskia at gcc dot gnu.org
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2015-09-18 15:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67628
--- Comment #2 from ktkachov at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #1)
> This is due to the fold-const.c optimization which should not be there any
> more. You need to do benchmarking on x86 also if you remove it.
>
could you elaborate what optimization is that?
Is it a matter of fold-const.c or match.pd canonicalizing the expression in
some way?
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/67628] [tree-optimization] (a && b) && c shows better codegen than a && (b && c)
2015-09-18 14:32 [Bug tree-optimization/67628] New: [tree-optimization] (a && b) && c shows better codegen than a && (b && c) ktkachov at gcc dot gnu.org
2015-09-18 14:48 ` [Bug tree-optimization/67628] " pinskia at gcc dot gnu.org
2015-09-18 15:00 ` ktkachov at gcc dot gnu.org
@ 2015-09-18 15:10 ` pinskia at gcc dot gnu.org
2015-09-21 14:23 ` ktkachov at gcc dot gnu.org
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2015-09-18 15:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67628
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to ktkachov from comment #2)
> (In reply to Andrew Pinski from comment #1)
> > This is due to the fold-const.c optimization which should not be there any
> > more. You need to do benchmarking on x86 also if you remove it.
> >
>
> could you elaborate what optimization is that?
> Is it a matter of fold-const.c or match.pd canonicalizing the expression in
> some way?
The optimization is converting andif into and. Basically ifcombine pass does
not recombine them if fold-const does it early on.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/67628] [tree-optimization] (a && b) && c shows better codegen than a && (b && c)
2015-09-18 14:32 [Bug tree-optimization/67628] New: [tree-optimization] (a && b) && c shows better codegen than a && (b && c) ktkachov at gcc dot gnu.org
` (2 preceding siblings ...)
2015-09-18 15:10 ` pinskia at gcc dot gnu.org
@ 2015-09-21 14:23 ` ktkachov at gcc dot gnu.org
2021-06-07 7:30 ` pinskia at gcc dot gnu.org
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2015-09-21 14:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67628
--- Comment #4 from ktkachov at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #3)
> (In reply to ktkachov from comment #2)
> > (In reply to Andrew Pinski from comment #1)
> > > This is due to the fold-const.c optimization which should not be there any
> > > more. You need to do benchmarking on x86 also if you remove it.
> > >
> >
> > could you elaborate what optimization is that?
> > Is it a matter of fold-const.c or match.pd canonicalizing the expression in
> > some way?
>
> The optimization is converting andif into and. Basically ifcombine pass does
> not recombine them if fold-const does it early on.
Right, seems like the code in fold_truth_andor.
I think ifcombine should be doing a better job here though
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/67628] [tree-optimization] (a && b) && c shows better codegen than a && (b && c)
2015-09-18 14:32 [Bug tree-optimization/67628] New: [tree-optimization] (a && b) && c shows better codegen than a && (b && c) ktkachov at gcc dot gnu.org
` (3 preceding siblings ...)
2015-09-21 14:23 ` ktkachov at gcc dot gnu.org
@ 2021-06-07 7:30 ` pinskia at gcc dot gnu.org
2021-07-20 6:50 ` pinskia at gcc dot gnu.org
2023-09-03 3:29 ` pinskia at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-06-07 7:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67628
--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Another testcase which shows the problem to be even worse:
#define isgap(c) ((c) == ' ' || (c) == '.' || (c) == '_' || (c) == '-' || (c)
== '~')
int f(char a)
{
return isgap(a);
}
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/67628] [tree-optimization] (a && b) && c shows better codegen than a && (b && c)
2015-09-18 14:32 [Bug tree-optimization/67628] New: [tree-optimization] (a && b) && c shows better codegen than a && (b && c) ktkachov at gcc dot gnu.org
` (4 preceding siblings ...)
2021-06-07 7:30 ` pinskia at gcc dot gnu.org
@ 2021-07-20 6:50 ` pinskia at gcc dot gnu.org
2023-09-03 3:29 ` pinskia at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-20 6:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67628
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed|2021-06-07 00:00:00 |2021-7-19
Severity|normal |enhancement
--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
if (a_5(D) > b_6(D))
goto <bb 3>; [50.00%]
else
goto <bb 4>; [50.00%]
<bb 3> [local count: 536870913]:
_1 = b_6(D) <= c_7(D);
_2 = c_7(D) > d_8(D);
_3 = _1 & _2;
_10 = (int) _3;
<bb 4> [local count: 1073741824]:
# iftmp.1_4 = PHI <_10(3), 0(2)>
If what was in BB 3 was simplier, I have patches to implement that already.
The last time I tried to disable the code in fold_truth_andor there was some
testsuite fall out. I won't be able to get to trying to fix this until next
year at the earliest I think.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/67628] [tree-optimization] (a && b) && c shows better codegen than a && (b && c)
2015-09-18 14:32 [Bug tree-optimization/67628] New: [tree-optimization] (a && b) && c shows better codegen than a && (b && c) ktkachov at gcc dot gnu.org
` (5 preceding siblings ...)
2021-07-20 6:50 ` pinskia at gcc dot gnu.org
@ 2023-09-03 3:29 ` pinskia at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-09-03 3:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67628
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org
--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Mine.
I have ideas on how to fix ifcombine here to allow defining statements but only
if they are boolean in type ...
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-09-03 3:29 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-18 14:32 [Bug tree-optimization/67628] New: [tree-optimization] (a && b) && c shows better codegen than a && (b && c) ktkachov at gcc dot gnu.org
2015-09-18 14:48 ` [Bug tree-optimization/67628] " pinskia at gcc dot gnu.org
2015-09-18 15:00 ` ktkachov at gcc dot gnu.org
2015-09-18 15:10 ` pinskia at gcc dot gnu.org
2015-09-21 14:23 ` ktkachov at gcc dot gnu.org
2021-06-07 7:30 ` pinskia at gcc dot gnu.org
2021-07-20 6:50 ` pinskia at gcc dot gnu.org
2023-09-03 3:29 ` pinskia at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).