* [PATCH] PR rtl-optimization/110701: Fix SUBREG SET_DEST handling in combine.
@ 2023-07-26 11:40 Roger Sayle
0 siblings, 0 replies; only message in thread
From: Roger Sayle @ 2023-07-26 11:40 UTC (permalink / raw)
To: gcc-patches; +Cc: segher, jeffreyalaw
[-- Attachment #1: Type: text/plain, Size: 2968 bytes --]
This patch is my proposed fix to PR rtl-optimization 110701, a latent bug
in combine's record_dead_and_set_regs_1 exposed by recent improvements to
simplify_subreg.
The issue involves the handling of (normal) SUBREG SET_DESTs as in the
instruction:
(set (subreg:HI (reg:SI x) 0) (expr:HI y))
The semantics of this are that the bits specified by the SUBREG are set
to the SET_SRC, y, and that the other bits of the SET_DEST are left/become
undefined. To simplify explanation, we'll only consider lowpart SUBREGs
(though in theory non-lowpart SUBREGS could be handled), and the fact that
bits outside of the lowpart WORD retain their original values (treating
these as undefined is a missed optimization rather than incorrect code
bug, that only affects targets with less than 64-bit words).
The bug is that combine simulates the behaviour of the above instruction,
for calculating nonzero_bits and set_sign_bit_copies, in the function
record_value_for_reg, by using the equivalent of:
(set (reg:SI x) (subreg:SI (expr:HI y))
by calling gen_lowpart on the SET_SRC. Alas, the semantics of this
revised instruction aren't always equivalent to the original.
In the test case for PR110701, the original instruction
(set (subreg:HI (reg:SI x), 0)
(and:HI (subreg:HI (reg:SI y) 0)
(const_int 340)))
which (by definition) leaves the top bits of x undefined, is mistakenly
considered to be equivalent to
(set (reg:SI x) (and:SI (reg:SI y) (const_int 340)))
where gen_lowpart's freedom to do anything with paradoxical SUBREG bits,
has now cleared the high bits. The same bug also triggers when the
SET_SRC is say (subreg:HI (reg:DI z)), where gen_lowpart transforms
this into (subreg:SI (reg:DI z)) which defines bits 16-31 to be the
same as bits 16-31 of z.
The fix is that after calling record_value_for_reg, we need to mark
the bits that should be undefined as undefined, in case gen_lowpart,
which performs transforms appropriate for r-values, has changed the
interpretation of the SUBREG when used as an l-value.
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures. Ok for mainline?
I've a version of this patch that preserves the original bits outside
of the lowpart WORD that can be submitted as a follow-up, but this is
the piece that addresses the wrong code regression.
2023-07-26 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR rtl-optimization/110701
* combine.cc (record_dead_and_set_regs_1): Split comment into
pieces placed before the relevant clauses. When the SET_DEST
is a partial_subreg_p, mark the bits outside of the updated
portion of the destination as undefined.
gcc/testsuite/ChangeLog
PR rtl-optimization/110701
* gcc.target/i386/pr110701.c: New test case.
Thanks in advance,
Roger
--
[-- Attachment #2: patchss3.txt --]
[-- Type: text/plain, Size: 3283 bytes --]
diff --git a/gcc/combine.cc b/gcc/combine.cc
index 4bf867d..c5ebb78 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -13337,27 +13337,43 @@ record_dead_and_set_regs_1 (rtx dest, const_rtx setter, void *data)
if (REG_P (dest))
{
- /* If we are setting the whole register, we know its value. Otherwise
- show that we don't know the value. We can handle a SUBREG if it's
- the low part, but we must be careful with paradoxical SUBREGs on
- RISC architectures because we cannot strip e.g. an extension around
- a load and record the naked load since the RTL middle-end considers
- that the upper bits are defined according to LOAD_EXTEND_OP. */
+ /* If we are setting the whole register, we know its value. */
if (GET_CODE (setter) == SET && dest == SET_DEST (setter))
record_value_for_reg (dest, record_dead_insn, SET_SRC (setter));
+ /* We can handle a SUBREG if it's the low part, but we must be
+ careful with paradoxical SUBREGs on RISC architectures because
+ we cannot strip e.g. an extension around a load and record the
+ naked load since the RTL middle-end considers that the upper bits
+ are defined according to LOAD_EXTEND_OP. */
else if (GET_CODE (setter) == SET
&& GET_CODE (SET_DEST (setter)) == SUBREG
&& SUBREG_REG (SET_DEST (setter)) == dest
&& known_le (GET_MODE_PRECISION (GET_MODE (dest)),
BITS_PER_WORD)
&& subreg_lowpart_p (SET_DEST (setter)))
- record_value_for_reg (dest, record_dead_insn,
- WORD_REGISTER_OPERATIONS
- && word_register_operation_p (SET_SRC (setter))
- && paradoxical_subreg_p (SET_DEST (setter))
- ? SET_SRC (setter)
- : gen_lowpart (GET_MODE (dest),
- SET_SRC (setter)));
+ {
+ if (WORD_REGISTER_OPERATIONS
+ && word_register_operation_p (SET_SRC (setter))
+ && paradoxical_subreg_p (SET_DEST (setter)))
+ record_value_for_reg (dest, record_dead_insn, SET_SRC (setter));
+ else if (!partial_subreg_p (SET_DEST (setter)))
+ record_value_for_reg (dest, record_dead_insn,
+ gen_lowpart (GET_MODE (dest),
+ SET_SRC (setter)));
+ else
+ {
+ record_value_for_reg (dest, record_dead_insn,
+ gen_lowpart (GET_MODE (dest),
+ SET_SRC (setter)));
+
+ unsigned HOST_WIDE_INT mask;
+ reg_stat_type *rsp = ®_stat[REGNO (dest)];
+ mask = GET_MODE_MASK (GET_MODE (SET_DEST (setter)));
+ rsp->last_set_nonzero_bits |= ~mask;
+ rsp->last_set_sign_bit_copies = 1;
+ }
+ }
+ /* Otherwise show that we don't know the value. */
else
record_value_for_reg (dest, record_dead_insn, NULL_RTX);
}
diff --git a/gcc/testsuite/gcc.target/i386/pr110701.c b/gcc/testsuite/gcc.target/i386/pr110701.c
new file mode 100644
index 0000000..3f2cea5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr110701.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+int a;
+long b;
+int *c = &a;
+short d(short e, short f) { return e * f; }
+void foo() {
+ *c = d(340, b >= 0) ^ 3;
+}
+
+/* { dg-final { scan-assembler "andl\[ \\t]\\\$340," } } */
+/* { dg-final { scan-assembler-not "andw\[ \\t]\\\$340," } } */
^ permalink raw reply [flat|nested] only message in thread
only message in thread, other threads:[~2023-07-26 11:41 UTC | newest]
Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-26 11:40 [PATCH] PR rtl-optimization/110701: Fix SUBREG SET_DEST handling in combine Roger Sayle
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).