public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/111011] New: gcc-13 incorrectly decrements by 2. It's twice as fast as gcc-12 and clang!
From: adam.warner.nz at gmail dot com @ 2023-08-14 9:52 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111011
Bug ID: 111011
Summary: gcc-13 incorrectly decrements by 2. It's twice as fast
as gcc-12 and clang!
Product: gcc
Version: 13.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: adam.warner.nz at gmail dot com
Target Milestone: ---
(Please fix my guess at the correct component for this bug report)
I'm amused by a ghost in the GCC virtual machine. I'm running this code on a
Debian Linux x86-64 desktop with these software versions:
gcc-12 (Debian 12.3.0-7) 12.3.0
gcc-13 (Debian 13.2.0-2) 13.2.0
gcc (Debian 20230718-1) 14.0.0 20230718 (experimental) [master r14-2597-g6bab2772dbc]
Debian clang version 17.0.0 (++20230128060150+75153adeda1a-1~exp1)
My CPU is locked at 2.7GHz. It should take a nice round 10 seconds to decrement
2.7x10^10 to zero if each decrement takes one clock cycle.
And indeed it used to:
$ cat countdown.c
#include <stdint.h>
int main() {
  int64_t count=27000000000;
  while (count>0) {
    __asm__ __volatile__("" : : : "memory");
    --count;
  }
  return 0;
}
$ gcc-12 -O3 countdown.c && time ./a.out
real 0m10.029s
user 0m10.024s
sys 0m0.004s
$ clang-17 -O3 countdown.c && time ./a.out
real 0m10.032s
user 0m10.030s
sys 0m0.000s
But now it only takes 5 seconds:
$ gcc-13 -O3 countdown.c && time ./a.out
real 0m5.022s
user 0m5.021s
sys 0m0.001s
$ gcc-snapshot.sh -O3 countdown.c && time ./a.out
real 0m5.023s
user 0m5.022s
sys 0m0.000s
By disassembling the machine code we can clearly see why:
$ gcc-13 -O3 countdown.c && objdump -d -m i386:x86-64:intel a.out
...
0000000000001040 <main>:
    1040:  48 b8 00 4e 53 49 06    movabs rax,0x649534e00
    1047:  00 00 00
    104a:  66 0f 1f 44 00 00       nop    WORD PTR [rax+rax*1+0x0]
    1050:  48 83 e8 02             sub    rax,0x2
    1054:  75 fa                   jne    1050 <main+0x10>
    1056:  31 c0                   xor    eax,eax
    1058:  c3                      ret
    1059:  0f 1f 80 00 00 00 00    nop    DWORD PTR [rax+0x0]
...
* [Bug rtl-optimization/111011] gcc-13 incorrectly decrements by 2. It's twice as fast as gcc-12 and clang!
From: rguenth at gcc dot gnu.org @ 2023-08-14 11:10 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111011
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
There's nothing wrong; we unroll the loop.
> ./cc1 -quiet t.c -O3 -fopt-info
t.c:5:15: optimized: loop unrolled 1 times
Adding "# foo" to the asm text, you'll see:
.L2:
#APP
# 6 "t.c" 1
        # foo
# 0 "" 2
# 6 "t.c" 1
        # foo
# 0 "" 2
#NO_APP
        subq    $2, %rax
        jne     .L2
There's no data dependence between the asm and 'count', so the two decrements can be combined into one. You can instead use:
#include <stdint.h>
int main() {
  int64_t count=27000000000;
  while (count>0) {
    __asm__ __volatile__("" : "=g" (count) : "0" (count) : "memory");
    --count;
  }
  return 0;
}
to get the desired effect.