* [Bug rtl-optimization/110701] [14 Regression] Wrong code at -O1/2/3/s on x86_64-linux-gnu
2023-07-17 10:21 [Bug c/110701] New: Wrong code at -O1/2/3/s on x86_64-linux-gnu shaohua.li at inf dot ethz.ch
@ 2023-07-17 12:43 ` xry111 at gcc dot gnu.org
2023-07-17 17:51 ` pinskia at gcc dot gnu.org
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: xry111 at gcc dot gnu.org @ 2023-07-17 12:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110701
Xi Ruoyao <xry111 at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|c |rtl-optimization
--- Comment #1 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
Combine pass seems guilty. Before combine:
(insn 21 20 23 2 (parallel [
(set (subreg:HI (reg:DI 92) 0)
(and:HI (subreg:HI (reg:DI 92) 0)
(const_int 340 [0x154])))
(clobber (reg:CC 17 flags))
]) "/app/example.c":5:39 596 {*andhi_1}
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))
(insn 23 21 24 2 (parallel [
(set (reg:HI 94)
(xor:HI (subreg:HI (reg:DI 92) 0)
(const_int 3 [0x3])))
(clobber (reg:CC 17 flags))
]) "/app/example.c":7:23 discrim 1 634 {*xorhi_1}
(expr_list:REG_DEAD (reg:DI 92)
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil))))
(insn 24 23 25 2 (set (reg:SI 95)
(sign_extend:SI (reg:HI 94))) "/app/example.c":7:23 discrim 1 179
{extendhisi2}
(expr_list:REG_DEAD (reg:HI 94)
(nil)))
(insn 25 24 26 2 (set (mem:SI (reg/f:DI 85 [ c.3_6 ]) [3 *c.3_6+0 S4 A32])
(reg:SI 95)) "/app/example.c":7:23 discrim 1 91 {*movsi_internal}
(expr_list:REG_DEAD (reg:SI 95)
(expr_list:REG_DEAD (reg/f:DI 85 [ c.3_6 ])
(nil))))
This is perfectly legal. After combine:
(insn 21 20 23 2 (parallel [
(set (subreg:HI (reg:DI 92) 0)
(and:HI (subreg:HI (reg:DI 92) 0)
(const_int 340 [0x154])))
(clobber (reg:CC 17 flags))
]) "/app/example.c":5:39 596 {*andhi_1}
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))
(note 23 21 24 2 NOTE_INSN_DELETED)
(insn 24 23 25 2 (parallel [
(set (reg:SI 95)
(ior:SI (subreg:SI (reg:DI 92) 0)
(const_int 3 [0x3])))
(clobber (reg:CC 17 flags))
]) "/app/example.c":7:23 discrim 1 635 {*iorsi_1}
(expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_DEAD (reg:DI 92)
(nil))))
(insn 25 24 26 2 (set (mem:SI (reg/f:DI 85 [ c.3_6 ]) [3 *c.3_6+0 S4 A32])
(reg:SI 95)) "/app/example.c":7:23 discrim 1 91 {*movsi_internal}
(expr_list:REG_DEAD (reg:SI 95)
(expr_list:REG_DEAD (reg/f:DI 85 [ c.3_6 ])
(nil))))
The removal of sign_extend seems wrong.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug rtl-optimization/110701] [14 Regression] Wrong code at -O1/2/3/s on x86_64-linux-gnu
2023-07-17 10:21 [Bug c/110701] New: Wrong code at -O1/2/3/s on x86_64-linux-gnu shaohua.li at inf dot ethz.ch
2023-07-17 12:43 ` [Bug rtl-optimization/110701] [14 Regression] " xry111 at gcc dot gnu.org
@ 2023-07-17 17:51 ` pinskia at gcc dot gnu.org
2023-07-17 17:55 ` pinskia at gcc dot gnu.org
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-07-17 17:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110701
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |wrong-code
Last reconfirmed| |2023-07-17
Target Milestone|--- |14.0
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.
Trying 16 -> 17:
16: {r94:HI=r92:DI#0^0x3;clobber flags:CC;}
REG_DEAD r92:DI
REG_UNUSED flags:CC
17: r95:SI=sign_extend(r94:HI)
REG_DEAD r94:HI
Successfully matched this instruction:
(set (reg:SI 95)
(ior:SI (subreg:SI (reg:DI 92) 0)
(const_int 3 [0x3])))
allowing combination of insns 16 and 17
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 16.
modifying insn i3 17: {r95:SI=r92:DI#0|0x3;clobber flags:CC;}
REG_UNUSED flags:CC
REG_DEAD r92:DI
deferring rescan insn with uid = 17.
BUT r92 is defined by
(insn 12 11 13 2 (parallel [
(set (reg:DI 92)
(ashiftrt:DI (reg:DI 93 [ bD.2759 ])
(const_int 63 [0x3f])))
(clobber (reg:CC 17 flags))
]) "/app/example.cpp":6:53 901 {ashrdi3_cvt}
(expr_list:REG_DEAD (reg:DI 93 [ bD.2759 ])
(expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_EQUAL (ashiftrt:DI (mem/c:DI (symbol_ref:DI ("b")
[flags 0x2] <var_decl 0x7f6efc025d80 b>) [1 bD.2759+0 S8 A64])
(const_int 63 [0x3f]))
(nil)))))
(insn 13 12 14 2 (set (subreg:HI (reg:DI 92) 0)
(not:HI (subreg:HI (reg:DI 92) 0))) "/app/example.cpp":6:53 816
{*one_cmplhi2_1}
(nil))
(insn 14 13 16 2 (parallel [
(set (subreg:HI (reg:DI 92) 0)
(and:HI (subreg:HI (reg:DI 92) 0)
(const_int 340 [0x154])))
(clobber (reg:CC 17 flags))
]) "/app/example.cpp":6:53 596 {*andhi_1}
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))
So that should be ok, well no because SI != HI here ... and
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug rtl-optimization/110701] [14 Regression] Wrong code at -O1/2/3/s on x86_64-linux-gnu
2023-07-17 10:21 [Bug c/110701] New: Wrong code at -O1/2/3/s on x86_64-linux-gnu shaohua.li at inf dot ethz.ch
2023-07-17 12:43 ` [Bug rtl-optimization/110701] [14 Regression] " xry111 at gcc dot gnu.org
2023-07-17 17:51 ` pinskia at gcc dot gnu.org
@ 2023-07-17 17:55 ` pinskia at gcc dot gnu.org
2023-07-17 17:58 ` pinskia at gcc dot gnu.org
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-07-17 17:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110701
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
What GCC 13 did was
Trying 16 -> 17:
16: {r94:HI=r92:DI#0^0x3;clobber flags:CC;}
REG_DEAD r92:DI
REG_UNUSED flags:CC
17: r95:SI=sign_extend(r94:HI)
REG_DEAD r94:HI
Failed to match this instruction:
(set (reg:SI 95)
(ior:SI (and:SI (subreg:SI (reg:DI 92) 0)
(const_int 340 [0x154]))
(const_int 3 [0x3])))
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug rtl-optimization/110701] [14 Regression] Wrong code at -O1/2/3/s on x86_64-linux-gnu
2023-07-17 10:21 [Bug c/110701] New: Wrong code at -O1/2/3/s on x86_64-linux-gnu shaohua.li at inf dot ethz.ch
` (2 preceding siblings ...)
2023-07-17 17:55 ` pinskia at gcc dot gnu.org
@ 2023-07-17 17:58 ` pinskia at gcc dot gnu.org
2023-07-18 15:38 ` roger at nextmovesoftware dot com
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-07-17 17:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110701
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
See Also|https://gcc.gnu.org/bugzill |
|a/show_bug.cgi?id=109040 |
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Maybe r14-2047-gd0e891406b16dc28905717de2
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug rtl-optimization/110701] [14 Regression] Wrong code at -O1/2/3/s on x86_64-linux-gnu
2023-07-17 10:21 [Bug c/110701] New: Wrong code at -O1/2/3/s on x86_64-linux-gnu shaohua.li at inf dot ethz.ch
` (3 preceding siblings ...)
2023-07-17 17:58 ` pinskia at gcc dot gnu.org
@ 2023-07-18 15:38 ` roger at nextmovesoftware dot com
2023-07-18 17:17 ` roger at nextmovesoftware dot com
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: roger at nextmovesoftware dot com @ 2023-07-18 15:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110701
Roger Sayle <roger at nextmovesoftware dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |roger at nextmovesoftware dot com
--- Comment #5 from Roger Sayle <roger at nextmovesoftware dot com> ---
nonzero_bits ((reg:DI 92),SImode) is returning 340, so combine (or more
specifically simplify_and_const_int_1) believes that the AND (ZERO_EXTEND)
isn't unnecessary. So it's the same nonzero_bits information that allows us to
turn the XOR into IOR (in insn 16) that's incorrectly telling us the AND 340
(or AND 343, or ZERO_EXTEND) is unnecessary (in insn 17).
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug rtl-optimization/110701] [14 Regression] Wrong code at -O1/2/3/s on x86_64-linux-gnu
2023-07-17 10:21 [Bug c/110701] New: Wrong code at -O1/2/3/s on x86_64-linux-gnu shaohua.li at inf dot ethz.ch
` (4 preceding siblings ...)
2023-07-18 15:38 ` roger at nextmovesoftware dot com
@ 2023-07-18 17:17 ` roger at nextmovesoftware dot com
2023-07-27 18:23 ` roger at nextmovesoftware dot com
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: roger at nextmovesoftware dot com @ 2023-07-18 17:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110701
Roger Sayle <roger at nextmovesoftware dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |roger at nextmovesoftware dot com
Status|NEW |ASSIGNED
--- Comment #6 from Roger Sayle <roger at nextmovesoftware dot com> ---
I have a fix (to combine.cc's record_dead_and_set_regs_1). Bootstrapping and
regression testing.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug rtl-optimization/110701] [14 Regression] Wrong code at -O1/2/3/s on x86_64-linux-gnu
2023-07-17 10:21 [Bug c/110701] New: Wrong code at -O1/2/3/s on x86_64-linux-gnu shaohua.li at inf dot ethz.ch
` (5 preceding siblings ...)
2023-07-18 17:17 ` roger at nextmovesoftware dot com
@ 2023-07-27 18:23 ` roger at nextmovesoftware dot com
2023-10-04 16:12 ` cvs-commit at gcc dot gnu.org
2023-10-11 7:59 ` roger at nextmovesoftware dot com
8 siblings, 0 replies; 10+ messages in thread
From: roger at nextmovesoftware dot com @ 2023-07-27 18:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110701
--- Comment #7 from Roger Sayle <roger at nextmovesoftware dot com> ---
Patch proposed here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625532.html
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug rtl-optimization/110701] [14 Regression] Wrong code at -O1/2/3/s on x86_64-linux-gnu
2023-07-17 10:21 [Bug c/110701] New: Wrong code at -O1/2/3/s on x86_64-linux-gnu shaohua.li at inf dot ethz.ch
` (6 preceding siblings ...)
2023-07-27 18:23 ` roger at nextmovesoftware dot com
@ 2023-10-04 16:12 ` cvs-commit at gcc dot gnu.org
2023-10-11 7:59 ` roger at nextmovesoftware dot com
8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-10-04 16:12 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110701
--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:
https://gcc.gnu.org/g:263369b2f7f726a3d4b269678d2c13a9d06a041e
commit r14-4398-g263369b2f7f726a3d4b269678d2c13a9d06a041e
Author: Roger Sayle <roger@nextmovesoftware.com>
Date: Wed Oct 4 17:11:23 2023 +0100
PR rtl-optimization/110701: Fix SUBREG SET_DEST handling in combine.
This patch is my proposed fix to PR rtl-optimization 110701, a latent bug
in combine's record_dead_and_set_regs_1 exposed by recent improvements to
simplify_subreg.
The issue involves the handling of (normal) SUBREG SET_DESTs as in the
instruction:
(set (subreg:HI (reg:SI x) 0) (expr:HI y))
The semantics of this are that the bits specified by the SUBREG are set
to the SET_SRC, y, and that the other bits of the SET_DEST are left/become
undefined. To simplify explanation, we'll only consider lowpart SUBREGs
(though in theory non-lowpart SUBREGS could be handled), and the fact that
bits outside of the lowpart WORD retain their original values (treating
these as undefined is a missed optimization rather than incorrect code
bug, that only affects targets with less than 64-bit words).
The bug is that combine simulates the behaviour of the above instruction,
for calculating nonzero_bits and set_sign_bit_copies, in the function
record_value_for_reg, by using the equivalent of:
(set (reg:SI x) (subreg:SI (expr:HI y))
by calling gen_lowpart on the SET_SRC. Alas, the semantics of this
revised instruction aren't always equivalent to the original.
In the test case for PR110701, the original instruction
(set (subreg:HI (reg:SI x), 0)
(and:HI (subreg:HI (reg:SI y) 0)
(const_int 340)))
which (by definition) leaves the top bits of x undefined, is mistakenly
considered to be equivalent to
(set (reg:SI x) (and:SI (reg:SI y) (const_int 340)))
where gen_lowpart's freedom to do anything with paradoxical SUBREG bits,
has now cleared the high bits. The same bug also triggers when the
SET_SRC is say (subreg:HI (reg:DI z)), where gen_lowpart transforms
this into (subreg:SI (reg:DI z)) which defines bits 16-31 to be the
same as bits 16-31 of z.
The fix is that after calling record_value_for_reg, we need to mark
the bits that should be undefined as undefined, in case gen_lowpart,
which performs transforms appropriate for r-values, has changed the
interpretation of the SUBREG when used as an l-value.
2023-10-04 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR rtl-optimization/110701
* combine.cc (record_dead_and_set_regs_1): Split comment into
pieces placed before the relevant clauses. When the SET_DEST
is a partial_subreg_p, mark the bits outside of the updated
portion of the destination as undefined.
gcc/testsuite/ChangeLog
PR rtl-optimization/110701
* gcc.target/i386/pr110701.c: New test case.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug rtl-optimization/110701] [14 Regression] Wrong code at -O1/2/3/s on x86_64-linux-gnu
2023-07-17 10:21 [Bug c/110701] New: Wrong code at -O1/2/3/s on x86_64-linux-gnu shaohua.li at inf dot ethz.ch
` (7 preceding siblings ...)
2023-10-04 16:12 ` cvs-commit at gcc dot gnu.org
@ 2023-10-11 7:59 ` roger at nextmovesoftware dot com
8 siblings, 0 replies; 10+ messages in thread
From: roger at nextmovesoftware dot com @ 2023-10-11 7:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110701
Roger Sayle <roger at nextmovesoftware dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
--- Comment #9 from Roger Sayle <roger at nextmovesoftware dot com> ---
This should now be fixed on mainline.
^ permalink raw reply [flat|nested] 10+ messages in thread