public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp
@ 2021-10-15 21:07 christophm30 at gmail dot com
2021-10-15 21:10 ` [Bug tree-optimization/102793] " pinskia at gcc dot gnu.org
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: christophm30 at gmail dot com @ 2021-10-15 21:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793
Bug ID: 102793
Summary: AArch64: sequential comparisons with equal conditional
blocks don't use ccmp
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: christophm30 at gmail dot com
Target Milestone: ---
The following code:
int ccmp(uint64_t* s1, uint64_t* s2, int(*foo)(void))
{
uint64_t d1, d2, bar;
d1 = *s1++;
d2 = *s2++;
bar = (d1 ^ d2) & 0xabcd;
if (bar == 0 || d1 != d2)
return foo();
return 0;
}
int noccmp(uint64_t* s1, uint64_t* s2, int(*foo)(void))
{
uint64_t d1, d2, bar;
d1 = *s1++;
d2 = *s2++;
bar = (d1 ^ d2) & 0xabcd;
if (bar == 0)
return foo();
if (d1 != d2)
return foo();
return 0;
}
...produces (GCC master or earlier, ARM64, with -O3):
ccmp:
ldr x3, [x0]
mov x4, 43981
ldr x0, [x1]
eor x1, x3, x0
tst x1, x4
ccmp x3, x0, 0, ne
bne .L5
mov w0, 0
ret
.L5:
mov x16, x2
br x16
noccmp:
ldr x3, [x0]
mov x4, 43981
ldr x0, [x1]
eor x1, x3, x0
tst x1, x4
beq .L8
cmp x3, x0
beq .L9
.L8:
mov x16, x2
br x16
.L9:
mov w0, 0
ret
Since both conditional blocks do exactly the same (they call foo()), both
functions could use the ccmp instruction.
I just observed this and have not analysed this any further.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/102793] AArch64: sequential comparisons with equal conditional blocks don't use ccmp
2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
@ 2021-10-15 21:10 ` pinskia at gcc dot gnu.org
2021-10-15 21:19 ` pinskia at gcc dot gnu.org
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-10-15 21:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2021-10-15
Severity|normal |enhancement
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1
Component|middle-end |tree-optimization
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This is a gimple level missed optimization where the indirect function call is
not "commonized". There are a few other bugs which are similar to this too.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/102793] AArch64: sequential comparisons with equal conditional blocks don't use ccmp
2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
2021-10-15 21:10 ` [Bug tree-optimization/102793] " pinskia at gcc dot gnu.org
@ 2021-10-15 21:19 ` pinskia at gcc dot gnu.org
2021-10-18 6:35 ` rguenth at gcc dot gnu.org
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-10-15 21:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #1)
> This is a gimple level missed optimization where the indirect function call
> is not "commonized". There are a few other bugs which are similar to this
> too.
I should mention the "commonization" happens at the RTL level which makes it
too late to convert to ccmp.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/102793] AArch64: sequential comparisons with equal conditional blocks don't use ccmp
2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
2021-10-15 21:10 ` [Bug tree-optimization/102793] " pinskia at gcc dot gnu.org
2021-10-15 21:19 ` pinskia at gcc dot gnu.org
@ 2021-10-18 6:35 ` rguenth at gcc dot gnu.org
2021-10-18 6:43 ` pinskia at gcc dot gnu.org
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-10-18 6:35 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target| |aarch64
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
I wonder why tail-merging doesn't do it's job here. It does (on x86_64-linux):
find_duplicates: <bb 3> duplicate of <bb 5>
Removing basic block 5
;; basic block 5, loop depth 0
;; pred:
_12 = foo_10(D) ();
;; succ: 6
and .optimized:
int noccmp (uint64_t * s1, uint64_t * s2, int (*<T3aa>) (void) foo)
{
uint64_t bar;
uint64_t d2;
uint64_t d1;
long unsigned int _1;
int _2;
int _14;
<bb 2> [local count: 1073741824]:
d1_6 = *s1_4(D);
d2_8 = *s2_7(D);
_1 = d1_6 ^ d2_8;
bar_9 = _1 & 43981;
if (bar_9 == 0)
goto <bb 3>; [34.00%]
else
goto <bb 4>; [66.00%]
<bb 3> [local count: 719407024]:
_14 = foo_10(D) (); [tail call]
goto <bb 5>; [100.00%]
<bb 4> [local count: 708669601]:
if (d1_6 != d2_8)
goto <bb 3>; [50.00%]
else
goto <bb 5>; [50.00%]
<bb 5> [local count: 1073741824]:
# _2 = PHI <_14(3), 0(4)>
return _2;
}
but what's missing is possibly some if-combine?
IMHO ccmp expansion should be re-written to a pre RTL expansion GIMPLE
transform.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/102793] AArch64: sequential comparisons with equal conditional blocks don't use ccmp
2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
` (2 preceding siblings ...)
2021-10-18 6:35 ` rguenth at gcc dot gnu.org
@ 2021-10-18 6:43 ` pinskia at gcc dot gnu.org
2023-08-26 7:27 ` pinskia at gcc dot gnu.org
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-10-18 6:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #3)
> I wonder why tail-merging doesn't do it's job here. It does (on
> x86_64-linux):
Oh I missed that.
> but what's missing is possibly some if-combine?
Yes because ifcombine happens way way too early. I wonder if we should not have
another one or move the currently one to be after the loop optimizations are
done.
> IMHO ccmp expansion should be re-written to a pre RTL expansion GIMPLE
> transform.
basically ccmp expension is just ifcombine really.
That is:
cc = bar cmp 0
cc = cc.eq ? ne : d1 cmp d2
cset cc.ne
So this is why doing a late pass ifcombine might be good for aarch64; I don't
know if how much compile time it would cost though.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/102793] AArch64: sequential comparisons with equal conditional blocks don't use ccmp
2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
` (3 preceding siblings ...)
2021-10-18 6:43 ` pinskia at gcc dot gnu.org
@ 2023-08-26 7:27 ` pinskia at gcc dot gnu.org
2024-04-05 12:35 ` manolis.tsamis at vrull dot eu
2024-04-05 12:41 ` manolis.tsamis at vrull dot eu
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-26 7:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #4)
> basically ccmp expension is just ifcombine really.
> That is:
> cc = bar cmp 0
> cc = cc.eq ? ne : d1 cmp d2
> cset cc.ne
> So this is why doing a late pass ifcombine might be good for aarch64; I
> don't know if how much compile time it would cost though.
Or maybe isel can do this too ...
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/102793] AArch64: sequential comparisons with equal conditional blocks don't use ccmp
2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
` (4 preceding siblings ...)
2023-08-26 7:27 ` pinskia at gcc dot gnu.org
@ 2024-04-05 12:35 ` manolis.tsamis at vrull dot eu
2024-04-05 12:41 ` manolis.tsamis at vrull dot eu
6 siblings, 0 replies; 8+ messages in thread
From: manolis.tsamis at vrull dot eu @ 2024-04-05 12:35 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793
Manolis Tsamis <manolis.tsamis at vrull dot eu> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |manolis.tsamis at vrull dot eu
--- Comment #6 from Manolis Tsamis <manolis.tsamis at vrull dot eu> ---
Created attachment 57886
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57886&action=edit
Patch with proposed fix for the issue.
Adding a copy of the ifcombine pass after pre solves this issues and improves
code generation is some other cases too. I specifically chose after the pre
pass because it deduplicates basic blocks and helps in maximizing the effect of
ifcombine.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/102793] AArch64: sequential comparisons with equal conditional blocks don't use ccmp
2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
` (5 preceding siblings ...)
2024-04-05 12:35 ` manolis.tsamis at vrull dot eu
@ 2024-04-05 12:41 ` manolis.tsamis at vrull dot eu
6 siblings, 0 replies; 8+ messages in thread
From: manolis.tsamis at vrull dot eu @ 2024-04-05 12:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793
--- Comment #7 from Manolis Tsamis <manolis.tsamis at vrull dot eu> ---
Also submitted in the lists:
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648856.html
I should note that I needed to modify the test uninit-pred-6_c.c and remove
this check:
if (l)
if (n > 12)
blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
because it was inconsistent. If the branch is merged to a simple if (l && (n >
12)) is fails (on master). With the proposed patch we get more ifcombining and
better codegen but this condition fails. I believe this would need an
enhancement in tree-ssa-uninit.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-04-05 12:41 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-15 21:07 [Bug target/102793] New: AArch64: sequential comparisons with equal conditional blocks don't use ccmp christophm30 at gmail dot com
2021-10-15 21:10 ` [Bug tree-optimization/102793] " pinskia at gcc dot gnu.org
2021-10-15 21:19 ` pinskia at gcc dot gnu.org
2021-10-18 6:35 ` rguenth at gcc dot gnu.org
2021-10-18 6:43 ` pinskia at gcc dot gnu.org
2023-08-26 7:27 ` pinskia at gcc dot gnu.org
2024-04-05 12:35 ` manolis.tsamis at vrull dot eu
2024-04-05 12:41 ` manolis.tsamis at vrull dot eu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).