public inbox for gcc-bugs@sourceware.org
* [Bug target/112760] New: [14 Regression] wrong code with -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 and __builtin_add_overflow_p()
From: zsojka at seznam dot cz @ 2023-11-29 7:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760
Bug ID: 112760
Summary: [14 Regression] wrong code with -O2 -fno-dce
-fno-guess-branch-probability -m8bit-idiv -mavx
--param=max-cse-insns=0 and __builtin_add_overflow_p()
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: zsojka at seznam dot cz
Target Milestone: ---
Host: x86_64-pc-linux-gnu
Target: i686-pc-linux-gnu
Created attachment 56715
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56715&action=edit
reduced testcase
Output:
$ x86_64-pc-linux-gnu-gcc -m32 -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 testcase.c
$ ./a.out
Aborted
The code generated for the __builtin_add_overflow_p() check looks wrong:
# testcase.c:9: u16 x = __builtin_add_overflow_p (a, g, (u16) 0);
add eax, ecx # tmp110, g.0_1
mov eax, 1 # tmp118,
setc bl #, _8
cmovne ebx, eax # _8,, _8, tmp118
Comparing against the code generated without -mavx makes the breakage easier to see:
$ diff -u a-testcase.GOOD.s a-testcase.BAD.s
--- a-testcase.GOOD.s 2023-11-29 08:34:39.978807709 +0100
+++ a-testcase.BAD.s 2023-11-29 08:32:27.458809580 +0100
@@ -4,7 +4,7 @@
# compiled by GNU C version 14.0.0 20231128 (experimental), GMP version 6.3.0, MPFR version 4.2.1, MPC version 1.3.1, isl version isl-0.26-GMP
# GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
-# options passed: -m32 -m8bit-idiv -masm=intel -mtune=generic -march=x86-64 -O2 -fno-dce -fno-guess-branch-probability --param=max-cse-insns=0
+# options passed: -m32 -m8bit-idiv -mavx -masm=intel -mtune=generic -march=x86-64 -O2 -fno-dce -fno-guess-branch-probability --param=max-cse-insns=0
.text
.p2align 4
.globl foo0
@@ -28,10 +28,8 @@
movzx esi, WORD PTR [esp+16] # b, b
# testcase.c:9: u16 x = __builtin_add_overflow_p (a, g, (u16) 0);
add eax, ecx # tmp110, g.0_1
- movzx edx, ax # tmp111, tmp110
- setc bl #, _8
- cmp eax, edx # tmp110, tmp111
mov eax, 1 # tmp118,
+ setc bl #, _8
cmovne ebx, eax # _8,, _8, tmp118
# testcase.c:10: g -= g / b;
mov eax, ecx # tmp119, g.0_1
With the "cmp" gone, the "cmovne" instruction is using the Z flag from a different comparison: the one left behind by the preceding "add".
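To make the difference between the GOOD and BAD sequences concrete, here is a minimal C model of what each one computes. It is only a sketch read off the assembly above: the attached testcase is not quoted in this message, so the 32-bit unsigned operand types and both function names are assumptions, not the reporter's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Intended result: true when the exact sum a + g does not fit in 16 bits.
   Assumes 32-bit unsigned operands, which the report does not state. */
static bool add_overflows_u16(uint32_t a, uint32_t g)
{
    uint32_t sum = a + g;                    /* `add eax, ecx`, wraps mod 2^32 */
    bool carry = sum < a;                    /* what `setc bl` records */
    bool truncated = sum != (uint16_t)sum;   /* `movzx edx, ax` + `cmp eax, edx` */
    return carry || truncated;               /* `cmovne` forces 1 on mismatch */
}

/* Model of the BAD sequence: with the `cmp` gone, ZF still comes from the
   `add`, so the result is forced to 1 whenever the sum is nonzero. */
static bool add_overflows_u16_miscompiled(uint32_t a, uint32_t g)
{
    uint32_t sum = a + g;
    bool carry = sum < a;
    bool sum_nonzero = sum != 0;             /* ZF clear after `add` */
    return carry || sum_nonzero;
}
```

Under this model the two functions disagree for any small nonzero sum, for example a = 1 and g = 2, which would be enough to make a self-checking testcase abort.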
$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-5940-20231128183456-g3d104d93a70-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
--with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-5940-20231128183456-g3d104d93a70-checking-yes-rtl-df-extra-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.0 20231128 (experimental) (GCC)
* [Bug target/112760] [14 Regression] wrong code with -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 and __builtin_add_overflow_p()
From: zsojka at seznam dot cz @ 2023-11-29 8:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760
--- Comment #1 from Zdenek Sojka <zsojka at seznam dot cz> ---
Created attachment 56716
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56716&action=edit
more complex testcase, with fewer compiler flags
Attached a testcase that needs only -O2 -mavx; might be a different issue,
though.
$ x86_64-pc-linux-gnu-gcc -m32 -O2 -mavx testcase.c
$ ./a.out
Aborted
$ diff -u a-testcase.GOOD.s a-testcase.BAD.s
--- a-testcase.GOOD.s 2023-11-29 08:53:43.058791568 +0100
+++ a-testcase.BAD.s 2023-11-29 08:52:39.878792460 +0100
@@ -4,7 +4,7 @@
# compiled by GNU C version 14.0.0 20231128 (experimental), GMP version 6.3.0, MPFR version 4.2.1, MPC version 1.3.1, isl version isl-0.26-GMP
# GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
-# options passed: -m32 -masm=intel -mtune=generic -march=x86-64 -O2
+# options passed: -m32 -mavx -masm=intel -mtune=generic -march=x86-64 -O2
.text
.p2align 4
.globl foo0
@@ -37,19 +37,19 @@
rol ax, 8 # _5,
# testcase.c:11: u64 u64_1 = __builtin_bswap16 (SHL (u64_0, u8_0)) % u16_0;
div bx # u16_0
-# testcase.c:12: u8 u8_1 = __builtin_add_overflow_p (u8_0, u8_0, u8_0);
- mov eax, ecx # u8_0, u8_0
# testcase.c:13: u32 u32_1 = foo0_u32_0 & u16_0;
movzx ebx, bx # u16_0, u16_0
# testcase.c:12: u8 u8_1 = __builtin_add_overflow_p (u8_0, u8_0, u8_0);
- add al, cl # u8_0, u8_0
+ add al, al # u8_0, u8_0
+# testcase.c:14: u32 u32_2 = __builtin_sub_overflow_p (0, u64_1, (u8) 0);
+ movzx edx, dx # _6, tmp127
+# testcase.c:12: u8 u8_1 = __builtin_add_overflow_p (u8_0, u8_0, u8_0);
setc al #, _15
# testcase.c:14: u32 u32_2 = __builtin_sub_overflow_p (0, u64_1, (u8) 0);
xor ecx, ecx # tmp133
# testcase.c:13: u32 u32_1 = foo0_u32_0 & u16_0;
and ebx, DWORD PTR foo0_u32_0 # u32_1, foo0_u32_0
# testcase.c:14: u32 u32_2 = __builtin_sub_overflow_p (0, u64_1, (u8) 0);
- movzx edx, dx # _6, tmp127
sub ecx, edx # tmp132, _6
setb dl #, _18
and ecx, -256 # tmp132,
Here, the problem is with:
# testcase.c:12: u8 u8_1 = __builtin_add_overflow_p (u8_0, u8_0, u8_0);
add al, al # u8_0, u8_0
because this move was dropped:
-# testcase.c:12: u8 u8_1 = __builtin_add_overflow_p (u8_0, u8_0, u8_0);
- mov eax, ecx # u8_0, u8_0
so there is no "u8_0" in eax when the add executes.
Maybe the compiler tries to use an AVX instruction that is not generated in the end?
* [Bug rtl-optimization/112760] [14 Regression] wrong code with -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 and __builtin_add_overflow_p()
From: ubizjak at gmail dot com @ 2023-11-29 10:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760
Uroš Bizjak <ubizjak at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|target |rtl-optimization
Last reconfirmed| |2023-11-29
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
Target Milestone|--- |14.0
--- Comment #2 from Uroš Bizjak <ubizjak at gmail dot com> ---
With the original testcase, the ce1 pass if-converts:
20: flags:CCZ=cmp(r110:SI,r111:SI)
REG_DEAD r111:SI
REG_DEAD r110:SI
21: pc={(flags:CCZ==0)?L23:pc}
REG_DEAD flags:CCZ
39: NOTE_INSN_BASIC_BLOCK 5
22: r103:HI=0x1
23: L23:
with:
IF-THEN-JOIN block found, pass 2, test 2, then 5, join 6
scanning new insn with uid = 45.
scanning new insn with uid = 44.
scanning new insn with uid = 46.
if-conversion succeeded through noce_try_cmove
Removing jump 21.
deleting insn with uid = 21.
deleting insn with uid = 22.
to:
20: flags:CCZ=cmp(r110:SI,r111:SI)
REG_DEAD r111:SI
REG_DEAD r110:SI
45: r118:HI=0x1
44: flags:CCZ=cmp(r110:SI,r111:SI)
46: r103:HI={(flags:CCZ==0)?r103:HI:r118:HI}
And things go downhill from here. Before postreload we have:
20: flags:CCZ=cmp(ax:SI,dx:SI)
REG_UNUSED flags:CCZ
44: flags:CCZ=cmp(ax:SI,dx:SI)
REG_DEAD dx:SI
REG_DEAD ax:SI
62: ax:HI=0x1
REG_EQUIV 0x1
46: bx:HI={(flags:CCZ==0)?bx:HI:ax:HI}
REG_DEAD flags:CCZ
REG_DEAD ax:HI
and in the postreload pass, insn 44 is removed:
20: flags:CCZ=cmp(ax:SI,dx:SI)
REG_UNUSED flags:CCZ
62: ax:HI=0x1
REG_EQUIV 0x1
46: bx:HI={(flags:CCZ==0)?bx:HI:ax:HI}
REG_DEAD flags:CCZ
REG_DEAD ax:HI
Then the pro_and_epilogue pass sees insn 20 marked as "unused" and removes it:
df_analyze called
deleting insn with uid = 20.
Confirmed as an RTL optimization problem.
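The chain of events described above can be replayed on a toy model. This is not GCC code: the struct, the field names and the three helpers are hypothetical stand-ins for RTL insns, the REG_UNUSED note, the postreload deletion and the note-driven deletion, kept only detailed enough to show why a stale note ends up removing the last live definition of the flags.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy insn: only tracks how it interacts with the flags register. */
struct toy_insn {
    int uid;
    bool sets_flags;
    bool uses_flags;
    bool note_unused;  /* stand-in for a REG_UNUSED note on the flags set */
    bool deleted;
};

/* Delete one insn by uid without touching anyone's notes. */
static void toy_delete_uid(struct toy_insn *insns, size_t n, int uid)
{
    for (size_t i = 0; i < n; i++)
        if (insns[i].uid == uid)
            insns[i].deleted = true;
}

/* Delete every flags set that carries the "unused" note, trusting the note. */
static void toy_delete_noted_unused(struct toy_insn *insns, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!insns[i].deleted && insns[i].sets_flags && insns[i].note_unused)
            insns[i].deleted = true;
}

/* True if every surviving flags reader is preceded by a surviving flags set. */
static bool toy_flags_uses_have_def(const struct toy_insn *insns, size_t n)
{
    bool have_def = false;
    for (size_t i = 0; i < n; i++) {
        if (insns[i].deleted)
            continue;
        if (insns[i].uses_flags && !have_def)
            return false;
        if (insns[i].sets_flags)
            have_def = true;
    }
    return true;
}
```

Starting from the sequence 20 (compare, noted unused), 44 (compare), 62 (move), 46 (conditional move), deleting 44 by uid still leaves insn 20 to feed insn 46. Running the note-driven deletion afterwards removes insn 20 as well, and the check then fails because insn 46 reads flags that nothing in the block sets.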
* [Bug rtl-optimization/112760] [14 Regression] wrong code with -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 and __builtin_add_overflow_p() since r14-5355
From: jakub at gcc dot gnu.org @ 2023-12-01 12:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[14 Regression] wrong code |[14 Regression] wrong code
|with -O2 -fno-dce |with -O2 -fno-dce
|-fno-guess-branch-probabili |-fno-guess-branch-probabili
|ty -m8bit-idiv -mavx |ty -m8bit-idiv -mavx
|--param=max-cse-insns=0 and |--param=max-cse-insns=0 and
|__builtin_add_overflow_p() |__builtin_add_overflow_p()
| |since r14-5355
CC| |jakub at gcc dot gnu.org,
| |rsandifo at gcc dot gnu.org
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Started with r14-5355-g3cd3a09b3f91a1d023cb180763d40598d6bb274b
* [Bug rtl-optimization/112760] [14 Regression] wrong code with -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 and __builtin_add_overflow_p() since r14-5355
From: jakub at gcc dot gnu.org @ 2023-12-01 12:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords|needs-bisection |
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
In the reload dump I see no changes (except function_decl/var_decl addresses);
in the vzeroupper, postreload, split2, ree and cmpelim dumps there are a bunch
of extra REG_DEAD notes here and there in r14-5355 compared to r14-5354, and
finally pro_and_epilogue deletes the
(insn 20 19 62 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 0 ax [110])
(reg:SI 1 dx [111]))) "pr112760.c":6:22 11 {*cmpsi_1}
(expr_list:REG_UNUSED (reg:CCZ 17 flags)
(nil)))
insn.
In reload dump there is:
(insn 20 19 44 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 0 ax [110])
(reg:SI 1 dx [111]))) "pr112760.c":6:22 11 {*cmpsi_1}
(nil))
(insn 44 20 62 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 0 ax [110])
(reg:SI 1 dx [111]))) "pr112760.c":6:22 11 {*cmpsi_1}
(nil))
(insn 62 44 46 2 (set (reg:HI 0 ax [118])
(const_int 1 [0x1])) "pr112760.c":6:22 86 {*movhi_internal}
(expr_list:REG_EQUIV (const_int 1 [0x1])
(nil)))
(insn 46 62 25 2 (set (reg:HI 3 bx [orig:103 _8+2 ] [103])
(if_then_else:HI (eq (reg:CCZ 17 flags)
(const_int 0 [0]))
(reg:HI 3 bx [orig:103 _8+2 ] [103])
(reg:HI 0 ax [118]))) "pr112760.c":6:22 1381 {*movhicc_noc}
(nil))
so insn 20 is indeed useless, and the vzeroupper pass correctly marked that in
the notes:
(insn 20 19 44 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 0 ax [110])
(reg:SI 1 dx [111]))) "pr112760.c":6:22 11 {*cmpsi_1}
(expr_list:REG_UNUSED (reg:CCZ 17 flags)
(nil)))
(insn 44 20 62 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 0 ax [110])
(reg:SI 1 dx [111]))) "pr112760.c":6:22 11 {*cmpsi_1}
(expr_list:REG_DEAD (reg:SI 1 dx [111])
(expr_list:REG_DEAD (reg:SI 0 ax [110])
(nil))))
(insn 62 44 46 2 (set (reg:HI 0 ax [118])
(const_int 1 [0x1])) "pr112760.c":6:22 86 {*movhi_internal}
(expr_list:REG_EQUIV (const_int 1 [0x1])
(nil)))
(insn 46 62 25 2 (set (reg:HI 3 bx [orig:103 _8+2 ] [103])
(if_then_else:HI (eq (reg:CCZ 17 flags)
(const_int 0 [0]))
(reg:HI 3 bx [orig:103 _8+2 ] [103])
(reg:HI 0 ax [118]))) "pr112760.c":6:22 1381 {*movhicc_noc}
(expr_list:REG_DEAD (reg:CCZ 17 flags)
(expr_list:REG_DEAD (reg:HI 0 ax [118])
(nil))))
But then postreload deletes insn 44 rather than 20 and keeps the notes around
unchanged, so the REG_UNUSED note on insn 20 is now stale.
Insn 20 is deleted in
#2 0x0000000000cce9df in copyprop_hardreg_forward_1 (bb=<basic_block
0x7fffea2f7c60 (2)>, vd=0x3bd2be0) at ../../gcc/regcprop.cc:829
#3 0x0000000000ccfe1c in copyprop_hardreg_forward_bb_without_debug_insn
(bb=<basic_block 0x7fffea2f7c60 (2)>) at ../../gcc/regcprop.cc:1235
#4 0x0000000000d5b371 in prepare_shrink_wrap (entry_block=<basic_block
0x7fffea2f7c60 (2)>) at ../../gcc/shrink-wrap.cc:451
#5 0x0000000000d5bb70 in try_shrink_wrapping (entry_edge=0x7fffffffd900,
prologue_seq=0x7fffe9f25240) at ../../gcc/shrink-wrap.cc:674
#6 0x00000000008b4320 in thread_prologue_and_epilogue_insns () at
../../gcc/function.cc:6056
and regcprop.cc documents that it relies on up-to-date REG_DEAD/REG_UNUSED
notes; after all, the removal happens in
/* Detect obviously dead sets (via REG_UNUSED notes) and remove them. */
if (set
&& !RTX_FRAME_RELATED_P (insn)
&& NONJUMP_INSN_P (insn)
&& !may_trap_p (set)
&& find_reg_note (insn, REG_UNUSED, SET_DEST (set))
&& !side_effects_p (SET_SRC (set))
&& !side_effects_p (SET_DEST (set)))
{
bool last = insn == BB_END (bb);
delete_insn (insn);
if (last)
break;
continue;
}
and regcprop.cc calls df_note_add_problem () before calling df_analyze ().
Except that in the pro_and_epilogue case that analysis is done elsewhere, and
the pass just calls into the regcprop.cc functions.
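As a rough illustration of what "up-to-date notes" means here, the following sketch recomputes an "unused" marker for flags sets with a backward scan over a single block. It is a toy under stated assumptions, not the df note problem itself: the struct and the function name are hypothetical, only one register (the flags) is tracked, and the flags are assumed dead at the end of the block.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy insn: only tracks how it interacts with the flags register. */
struct toy_insn {
    int uid;
    bool sets_flags;
    bool uses_flags;
    bool note_unused;  /* stand-in for a REG_UNUSED note on the flags set */
    bool deleted;
};

/* Walk the block backwards; a flags set is unused exactly when nothing
   reads the flags before the next surviving set (or the block end). */
static void toy_recompute_unused_notes(struct toy_insn *insns, size_t n)
{
    bool flags_live = false;  /* assumption: flags are dead at block end */
    for (size_t i = n; i-- > 0; ) {
        struct toy_insn *insn = &insns[i];
        if (insn->deleted)
            continue;
        if (insn->sets_flags) {
            insn->note_unused = !flags_live;
            flags_live = false;   /* the set kills earlier values */
        }
        if (insn->uses_flags)
            flags_live = true;    /* a read makes earlier sets needed */
    }
}
```

With insn 44 already deleted, rerunning this scan clears the marker on insn 20, because insn 46 now depends on it. A deletion that trusts the marker is therefore only safe if such a recomputation happens after the last change to the insn stream.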
* [Bug rtl-optimization/112760] [14 Regression] wrong code with -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 and __builtin_add_overflow_p() since r14-5355
From: jakub at gcc dot gnu.org @ 2023-12-01 12:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 56753
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56753&action=edit
gcc14-pr112760.patch
Untested fix.
* [Bug rtl-optimization/112760] [14 Regression] wrong code with -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 and __builtin_add_overflow_p() since r14-5355
From: cvs-commit at gcc dot gnu.org @ 2023-12-06 8:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760
--- Comment #6 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:e44ed92dbbe9d4e5c23f486cd2f77a6f9ee513c5
commit r14-6210-ge44ed92dbbe9d4e5c23f486cd2f77a6f9ee513c5
Author: Jakub Jelinek <jakub@redhat.com>
Date: Wed Dec 6 09:59:12 2023 +0100
i386: Move vzeroupper pass from after reload pass to after postreload_cse
[PR112760]
Regardless of the outcome of the REG_UNUSED discussions, I think
it is a good idea to move the vzeroupper pass one pass later.
As can be seen in the multiple PRs and as postreload.cc documents,
reload/LRA is known to create dead statements quite often, which
is the reason why we have postreload_cse pass at all.
Doing vzeroupper pass before such cleanup means the pass including
df_analyze for it needs to process more instructions than needed
and because mode switching adds note problem, also higher chance of
having stale REG_UNUSED notes.
And, I really don't see why vzeroupper can't wait until those cleanups
are done.
2023-12-06 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/112760
* config/i386/i386-passes.def (pass_insert_vzeroupper): Insert
after pass_postreload_cse rather than pass_reload.
* config/i386/i386-features.cc (rest_of_handle_insert_vzeroupper):
Adjust comment for it.
* gcc.dg/pr112760.c: New test.
* [Bug rtl-optimization/112760] [14 Regression] wrong code with -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 and __builtin_add_overflow_p() since r14-5355
From: jakub at gcc dot gnu.org @ 2023-12-06 9:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
This is now latent. We still need to decide about the updating and usability of
REG_UNUSED notes, but after moving the pass it shouldn't trigger on this
testcase.
* [Bug rtl-optimization/112760] [14 Regression] wrong code with -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 and __builtin_add_overflow_p() since r14-5355
From: pinskia at gcc dot gnu.org @ 2023-12-06 18:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |DUPLICATE
--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The REG_UNUSED vs single_set issue is being tracked in PR 40209 .
*** This bug has been marked as a duplicate of bug 40209 ***