* [PATCH] combine: Use reg_used_between_p rather than modified_between_p in two spots [PR119291]
@ 2025-03-28 11:09 Jakub Jelinek
2025-04-01 9:27 ` Richard Biener
0 siblings, 1 reply; 4+ messages in thread
From: Jakub Jelinek @ 2025-03-28 11:09 UTC (permalink / raw)
To: Segher Boessenkool, Richard Biener, Jeff Law, Richard Sandiford
Cc: gcc-patches
Hi!
The following testcase is miscompiled on x86_64-linux at -O2 by the combiner.
We have from earlier combinations
(insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
(const_int 0 [0])) "pr119291.c":25:15 96 {*movsi_internal}
(nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
(reg/v:SI 116 [ e ])) 96 {*movsi_internal}
(expr_list:REG_DEAD (reg/v:SI 116 [ e ])
(nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (parallel [
(set (reg:CCZ 17 flags)
(compare:CCZ (neg:SI (reg:SI 104 [ _7 ]))
(const_int 0 [0])))
(set (reg/v:SI 116 [ e ])
(neg:SI (reg:SI 104 [ _7 ])))
]) "pr119291.c":26:13 977 {*negsi_2}
(expr_list:REG_DEAD (reg:SI 104 [ _7 ])
(nil)))
(note 26 25 27 4 NOTE_INSN_DELETED)
(insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
(ne:DI (reg:CCZ 17 flags)
(const_int 0 [0]))) "pr119291.c":26:13 1447 {*setcc_di_1}
(expr_list:REG_DEAD (reg:CCZ 17 flags)
(nil)))
and try_combine is called on i3 25 and i2 22 (second time)
and reach the hunk being patched with simplified i3
(insn 25 24 26 4 (parallel [
(set (pc)
(pc))
(set (reg/v:SI 116 [ e ])
(const_int 0 [0]))
]) "pr119291.c":28:13 977 {*negsi_2}
(expr_list:REG_DEAD (reg:SI 104 [ _7 ])
(nil)))
and
(insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
(const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
(nil))
Now, the try_combine code there attempts to split two independent
sets in newpat by moving one of them to i2.
And among other tests it checks
!modified_between_p (SET_DEST (set1), i2, i3)
which is certainly needed, if there would be say
(set (reg/v:SI 116 [ e ]) (const_int 42 [0x2a]))
in between i2 and i3, we couldn't do that, as that set would overwrite
the value set by set1 we want to move to the i2 position.
But in this case pseudo 116 isn't set in between i2 and i3, but used
(and additionally there is a REG_DEAD note for it).
This is equally bad for the move, because while the i3 insn
and later will see the pseudo value that we set, the insn in between
which uses the value will see a different value from the one that
it should see.
As we don't check for that, in the end try_combine succeeds and
changes the IL to:
(insn 22 21 23 4 (set (reg/v:SI 116 [ e ])
(const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
(nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
(reg/v:SI 116 [ e ])) 96 {*movsi_internal}
(expr_list:REG_DEAD (reg/v:SI 116 [ e ])
(nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (set (pc)
(pc)) "pr119291.c":28:13 2147483647 {NOOP_MOVE}
(nil))
(note 26 25 27 4 NOTE_INSN_DELETED)
(insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
(const_int 0 [0])) "pr119291.c":28:13 95 {*movdi_internal}
(nil))
(note, the i3 got turned into a nop and try_combine also modified insn 27).
The following patch replaces the modified_between_p
tests with reg_used_between_p, my understanding is that
modified_between_p is a subset of reg_used_between_p, so one
doesn't need both.
Bootstrapped/regtested on x86_64-linux, i686-linux, aarch64-linux,
powerpc64le-linux and s390x-linux, ok for trunk?
Looking at this some more today, I think we should special case
set_noop_p because that can be put into i2 (except for the JUMP_P
violations), currently both modified_between_p (pc_rtx, i2, i3)
and reg_used_between_p (pc_rtx, i2, i3) returns false.
I'll post a patch incrementally for that (but that feels like
new optimization, so probably not something that should be backported).
2025-03-28 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/119291
* combine.cc (try_combine): For splitting of PARALLEL with
2 independent SETs into i2 and i3 sets check reg_used_between_p
of the SET_DESTs rather than just modified_between_p.
* gcc.c-torture/execute/pr119291.c: New test.
--- gcc/combine.cc.jj 2025-03-25 09:34:33.469102343 +0100
+++ gcc/combine.cc 2025-03-27 09:50:15.768567383 +0100
@@ -4012,18 +4012,18 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
rtx set1 = XVECEXP (newpat, 0, 1);
/* Normally, it doesn't matter which of the two is done first, but
- one which uses any regs/memory set in between i2 and i3 can't
- be first. The PARALLEL might also have been pre-existing in i3,
- so we need to make sure that we won't wrongly hoist a SET to i2
- that would conflict with a death note present in there, or would
- have its dest modified between i2 and i3. */
+ one which uses any regs/memory set or used in between i2 and i3
+ can't be first. The PARALLEL might also have been pre-existing
+ in i3, so we need to make sure that we won't wrongly hoist a SET
+ to i2 that would conflict with a death note present in there, or
+ would have its dest modified or used between i2 and i3. */
if (!modified_between_p (SET_SRC (set1), i2, i3)
&& !(REG_P (SET_DEST (set1))
&& find_reg_note (i2, REG_DEAD, SET_DEST (set1)))
&& !(GET_CODE (SET_DEST (set1)) == SUBREG
&& find_reg_note (i2, REG_DEAD,
SUBREG_REG (SET_DEST (set1))))
- && !modified_between_p (SET_DEST (set1), i2, i3)
+ && !reg_used_between_p (SET_DEST (set1), i2, i3)
/* If I3 is a jump, ensure that set0 is a jump so that
we do not create invalid RTL. */
&& (!JUMP_P (i3) || SET_DEST (set0) == pc_rtx)
@@ -4038,7 +4038,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
&& !(GET_CODE (SET_DEST (set0)) == SUBREG
&& find_reg_note (i2, REG_DEAD,
SUBREG_REG (SET_DEST (set0))))
- && !modified_between_p (SET_DEST (set0), i2, i3)
+ && !reg_used_between_p (SET_DEST (set0), i2, i3)
/* If I3 is a jump, ensure that set1 is a jump so that
we do not create invalid RTL. */
&& (!JUMP_P (i3) || SET_DEST (set1) == pc_rtx)
--- gcc/testsuite/gcc.c-torture/execute/pr119291.c.jj 2025-03-27 09:48:01.917407084 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr119291.c 2025-03-27 09:47:48.020598094 +0100
@@ -0,0 +1,33 @@
+/* PR rtl-optimization/119291 */
+
+int a;
+long c;
+
+__attribute__((noipa)) void
+foo (int x)
+{
+ if (x != 0)
+ __builtin_abort ();
+ a = 42;
+}
+
+int
+main ()
+{
+ int e = 1;
+lab:
+ if (a < 2)
+ {
+ int b = e;
+ _Bool d = a != 0;
+ _Bool f = b != 0;
+ unsigned long g = -(d & f);
+ unsigned long h = c & g;
+ unsigned long i = ~c;
+ e = -(i & h);
+ c = e != 0;
+ a = ~e + b;
+ foo (e);
+ goto lab;
+ }
+}
Jakub
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] combine: Use reg_used_between_p rather than modified_between_p in two spots [PR119291]
2025-03-28 11:09 [PATCH] combine: Use reg_used_between_p rather than modified_between_p in two spots [PR119291] Jakub Jelinek
@ 2025-04-01 9:27 ` Richard Biener
2025-04-01 10:29 ` Jakub Jelinek
0 siblings, 1 reply; 4+ messages in thread
From: Richard Biener @ 2025-04-01 9:27 UTC (permalink / raw)
To: Jakub Jelinek
Cc: Segher Boessenkool, Jeff Law, Richard Sandiford, gcc-patches
On Fri, 28 Mar 2025, Jakub Jelinek wrote:
> Hi!
>
> The following testcase is miscompiled on x86_64-linux at -O2 by the combiner.
> We have from earlier combinations
> (insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
> (const_int 0 [0])) "pr119291.c":25:15 96 {*movsi_internal}
> (nil))
> (insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
> (reg/v:SI 116 [ e ])) 96 {*movsi_internal}
> (expr_list:REG_DEAD (reg/v:SI 116 [ e ])
> (nil)))
> (note 24 23 25 4 NOTE_INSN_DELETED)
> (insn 25 24 26 4 (parallel [
> (set (reg:CCZ 17 flags)
> (compare:CCZ (neg:SI (reg:SI 104 [ _7 ]))
> (const_int 0 [0])))
> (set (reg/v:SI 116 [ e ])
> (neg:SI (reg:SI 104 [ _7 ])))
> ]) "pr119291.c":26:13 977 {*negsi_2}
> (expr_list:REG_DEAD (reg:SI 104 [ _7 ])
> (nil)))
> (note 26 25 27 4 NOTE_INSN_DELETED)
> (insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
> (ne:DI (reg:CCZ 17 flags)
> (const_int 0 [0]))) "pr119291.c":26:13 1447 {*setcc_di_1}
> (expr_list:REG_DEAD (reg:CCZ 17 flags)
> (nil)))
> and try_combine is called on i3 25 and i2 22 (second time)
> and reach the hunk being patched with simplified i3
> (insn 25 24 26 4 (parallel [
> (set (pc)
> (pc))
> (set (reg/v:SI 116 [ e ])
> (const_int 0 [0]))
> ]) "pr119291.c":28:13 977 {*negsi_2}
> (expr_list:REG_DEAD (reg:SI 104 [ _7 ])
> (nil)))
> and
> (insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
> (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
> (nil))
> Now, the try_combine code there attempts to split two independent
> sets in newpat by moving one of them to i2.
> And among other tests it checks
> !modified_between_p (SET_DEST (set1), i2, i3)
> which is certainly needed, if there would be say
> (set (reg/v:SI 116 [ e ]) (const_int 42 [0x2a]))
> in between i2 and i3, we couldn't do that, as that set would overwrite
> the value set by set1 we want to move to the i2 position.
> But in this case pseudo 116 isn't set in between i2 and i3, but used
> (and additionally there is a REG_DEAD note for it).
>
> This is equally bad for the move, because while the i3 insn
> and later will see the pseudo value that we set, the insn in between
> which uses the value will see a different value from the one that
> it should see.
>
> As we don't check for that, in the end try_combine succeeds and
> changes the IL to:
> (insn 22 21 23 4 (set (reg/v:SI 116 [ e ])
> (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
> (nil))
> (insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
> (reg/v:SI 116 [ e ])) 96 {*movsi_internal}
> (expr_list:REG_DEAD (reg/v:SI 116 [ e ])
> (nil)))
> (note 24 23 25 4 NOTE_INSN_DELETED)
> (insn 25 24 26 4 (set (pc)
> (pc)) "pr119291.c":28:13 2147483647 {NOOP_MOVE}
> (nil))
> (note 26 25 27 4 NOTE_INSN_DELETED)
> (insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
> (const_int 0 [0])) "pr119291.c":28:13 95 {*movdi_internal}
> (nil))
> (note, the i3 got turned into a nop and try_combine also modified insn 27).
>
> The following patch replaces the modified_between_p
> tests with reg_used_between_p, my understanding is that
> modified_between_p is a subset of reg_used_between_p, so one
> doesn't need both.
>
> Bootstrapped/regtested on x86_64-linux, i686-linux, aarch64-linux,
> powerpc64le-linux and s390x-linux, ok for trunk?
>
> Looking at this some more today, I think we should special case
> set_noop_p because that can be put into i2 (except for the JUMP_P
> violations), currently both modified_between_p (pc_rtx, i2, i3)
> and reg_used_between_p (pc_rtx, i2, i3) returns false.
> I'll post a patch incrementally for that (but that feels like
> new optimization, so probably not something that should be backported).
Trying to review this I noticed that both the comment in combine suggests
that memory set is important and modified_between_p handles MEM_P
'x' and for non-REG 'x' possibly MEM operands (CONCAT?) while
reg_used_between_p only handles REG_P 'reg'. Also shouldn't
you use reg_referenced_p? At least that function seems to handle
SET_SRC/SET_DEST separately?
Can we constrain SET_DEST (set1/set0) to a REG_P in combine? Why
does the comment talk about memory?
> 2025-03-28 Jakub Jelinek <jakub@redhat.com>
>
> PR rtl-optimization/119291
> * combine.cc (try_combine): For splitting of PARALLEL with
> 2 independent SETs into i2 and i3 sets check reg_used_between_p
> of the SET_DESTs rather than just modified_between_p.
>
> * gcc.c-torture/execute/pr119291.c: New test.
>
> --- gcc/combine.cc.jj 2025-03-25 09:34:33.469102343 +0100
> +++ gcc/combine.cc 2025-03-27 09:50:15.768567383 +0100
> @@ -4012,18 +4012,18 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
> rtx set1 = XVECEXP (newpat, 0, 1);
>
> /* Normally, it doesn't matter which of the two is done first, but
> - one which uses any regs/memory set in between i2 and i3 can't
> - be first. The PARALLEL might also have been pre-existing in i3,
> - so we need to make sure that we won't wrongly hoist a SET to i2
> - that would conflict with a death note present in there, or would
> - have its dest modified between i2 and i3. */
> + one which uses any regs/memory set or used in between i2 and i3
> + can't be first. The PARALLEL might also have been pre-existing
> + in i3, so we need to make sure that we won't wrongly hoist a SET
> + to i2 that would conflict with a death note present in there, or
> + would have its dest modified or used between i2 and i3. */
> if (!modified_between_p (SET_SRC (set1), i2, i3)
> && !(REG_P (SET_DEST (set1))
> && find_reg_note (i2, REG_DEAD, SET_DEST (set1)))
> && !(GET_CODE (SET_DEST (set1)) == SUBREG
> && find_reg_note (i2, REG_DEAD,
> SUBREG_REG (SET_DEST (set1))))
> - && !modified_between_p (SET_DEST (set1), i2, i3)
> + && !reg_used_between_p (SET_DEST (set1), i2, i3)
> /* If I3 is a jump, ensure that set0 is a jump so that
> we do not create invalid RTL. */
> && (!JUMP_P (i3) || SET_DEST (set0) == pc_rtx)
> @@ -4038,7 +4038,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
> && !(GET_CODE (SET_DEST (set0)) == SUBREG
> && find_reg_note (i2, REG_DEAD,
> SUBREG_REG (SET_DEST (set0))))
> - && !modified_between_p (SET_DEST (set0), i2, i3)
> + && !reg_used_between_p (SET_DEST (set0), i2, i3)
> /* If I3 is a jump, ensure that set1 is a jump so that
> we do not create invalid RTL. */
> && (!JUMP_P (i3) || SET_DEST (set1) == pc_rtx)
> --- gcc/testsuite/gcc.c-torture/execute/pr119291.c.jj 2025-03-27 09:48:01.917407084 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr119291.c 2025-03-27 09:47:48.020598094 +0100
> @@ -0,0 +1,33 @@
> +/* PR rtl-optimization/119291 */
> +
> +int a;
> +long c;
> +
> +__attribute__((noipa)) void
> +foo (int x)
> +{
> + if (x != 0)
> + __builtin_abort ();
> + a = 42;
> +}
> +
> +int
> +main ()
> +{
> + int e = 1;
> +lab:
> + if (a < 2)
> + {
> + int b = e;
> + _Bool d = a != 0;
> + _Bool f = b != 0;
> + unsigned long g = -(d & f);
> + unsigned long h = c & g;
> + unsigned long i = ~c;
> + e = -(i & h);
> + c = e != 0;
> + a = ~e + b;
> + foo (e);
> + goto lab;
> + }
> +}
>
> Jakub
>
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] combine: Use reg_used_between_p rather than modified_between_p in two spots [PR119291]
2025-04-01 9:27 ` Richard Biener
@ 2025-04-01 10:29 ` Jakub Jelinek
2025-04-01 12:42 ` Richard Biener
0 siblings, 1 reply; 4+ messages in thread
From: Jakub Jelinek @ 2025-04-01 10:29 UTC (permalink / raw)
To: Richard Biener
Cc: Segher Boessenkool, Jeff Law, Richard Sandiford, gcc-patches
On Tue, Apr 01, 2025 at 11:27:25AM +0200, Richard Biener wrote:
> > Looking at this some more today, I think we should special case
> > set_noop_p because that can be put into i2 (except for the JUMP_P
> > violations), currently both modified_between_p (pc_rtx, i2, i3)
> > and reg_used_between_p (pc_rtx, i2, i3) returns false.
> > I'll post a patch incrementally for that (but that feels like
> > new optimization, so probably not something that should be backported).
>
> Trying to review this I noticed that both the comment in combine suggests
> that memory set is important and modified_between_p handles MEM_P
> 'x' and for non-REG 'x' possibly MEM operands (CONCAT?) while
> reg_used_between_p only handles REG_P 'reg'. Also shouldn't
> you use reg_referenced_p? At least that function seems to handle
> SET_SRC/SET_DEST separately?
>
> Can we constrain SET_DEST (set1/set0) to a REG_P in combine? Why
> does the comment talk about memory?
I was worried about making too risky changes this late in stage4
(and especially also for backports). Most of this code is 1992-ish.
I think many of the functions are just misnamed, the reg_ in there doesn't
match what those functions do (bet they initially supported just REGs
and later on support for other kinds of expressions was added, but haven't
done git archeology to prove that).
What we know for sure is:
&& GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT
&& GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART
&& GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT
&& GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != STRICT_LOW_PART
that is checked earlier in the condition.
Then it calls
&& ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
XVECEXP (newpat, 0, 0))
&& ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
XVECEXP (newpat, 0, 1))
While it has reg_* in it, that function mostly calls reg_overlap_mentioned_p
which is also misnamed, that function handles just fine all of
REG, MEM, SUBREG of REG, (SUBREG of MEM not, see below), ZERO_EXTRACT,
STRICT_LOW_PART, PC and even some further cases.
So, IMHO SET_DEST (set0) or SET_DEST (set0) can be certainly a REG, SUBREG
of REG, PC (at least the REG and PC cases are triggered on the testcase)
and quite possibly also MEM (SUBREG of MEM not, see below).
Now, the code uses !modified_between_p (SET_SRC (set{1,0}), i2, i3) where that
function for constants just returns false, for PC returns true, for REG
returns reg_set_between_p, for MEM recurses on the address, for
MEM_READONLY_P otherwise returns false, otherwise checks using alias.cc code
whether the memory could have been modified in between, for all other
rtxes recurses on the subrtxes. This part didn't change in my patch.
I've only changed those
- && !modified_between_p (SET_DEST (set{1,0}), i2, i3)
+ && !reg_used_between_p (SET_DEST (set{1,0}), i2, i3)
where the former has been described above and clearly handles all of
REG, SUBREG of REG, PC, MEM and SUBREG of MEM among other things.
The replacement reg_used_between_p calls reg_overlap_mentioned_p on each
instruction in between i2 and i3. So, there is clearly a difference
in behavior if SET_DEST (set{1,0}) is pc_rtx, in that case modified_between_p
returns unconditionally true even if there are no instructions in between,
but reg_used_between_p if there are no non-debug insns in between returns
false. Sorry for missing that, guess I should check for that (with the
exception of the noop moves which are often (set (pc) (pc)) and handled
by the incremental patch). In fact not just that, reg_used_between_p
will only return true for PC if it is mentioned anywhere in the insns
in between.
Anyway, except for that, for REG it calls refers_to_regno_p
and so should find any occurrences of any of the REG or parts of it for hard
registers, for MEM returns true if it sees any MEMs in insns in between
(conservatively), for SUBREGs apparently it relies on it being SUBREG of REG
(so doesn't handle SUBREG of MEM) and handles SUBREG of REG like the
SUBREG_REG, PC I've already described.
Now, because reg_overlap_mentioned_p doesn't handle SUBREG of MEM, I think
already the initial
&& ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
XVECEXP (newpat, 0, 0))
&& ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
XVECEXP (newpat, 0, 1))
calls would have failed --enable-checking=rtl or would have misbehaved, so
I think there is no need to check for it further.
To your question why I don't use reg_referenced_p, that is because
reg_referenced_p is something to call on one insn pattern, while
reg_used_between_p is pretty much that on all insns in between two
instructions (excluding the boundaries).
So, I think it would be safer to add && SET_DEST (set{1,0} != pc_rtx
checks to preserve former behavior, like in the following version.
2025-04-01 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/119291
* combine.cc (try_combine): For splitting of PARALLEL with
2 independent SETs into i2 and i3 sets check reg_used_between_p
of the SET_DESTs rather than just modified_between_p.
* gcc.c-torture/execute/pr119291.c: New test.
--- gcc/combine.cc.jj 2025-03-25 09:34:33.469102343 +0100
+++ gcc/combine.cc 2025-04-01 12:24:25.956711734 +0200
@@ -4012,18 +4012,19 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
rtx set1 = XVECEXP (newpat, 0, 1);
/* Normally, it doesn't matter which of the two is done first, but
- one which uses any regs/memory set in between i2 and i3 can't
- be first. The PARALLEL might also have been pre-existing in i3,
- so we need to make sure that we won't wrongly hoist a SET to i2
- that would conflict with a death note present in there, or would
- have its dest modified between i2 and i3. */
+ one which uses any regs/memory set or used in between i2 and i3
+ can't be first. The PARALLEL might also have been pre-existing
+ in i3, so we need to make sure that we won't wrongly hoist a SET
+ to i2 that would conflict with a death note present in there, or
+ would have its dest modified or used between i2 and i3. */
if (!modified_between_p (SET_SRC (set1), i2, i3)
&& !(REG_P (SET_DEST (set1))
&& find_reg_note (i2, REG_DEAD, SET_DEST (set1)))
&& !(GET_CODE (SET_DEST (set1)) == SUBREG
&& find_reg_note (i2, REG_DEAD,
SUBREG_REG (SET_DEST (set1))))
- && !modified_between_p (SET_DEST (set1), i2, i3)
+ && SET_DEST (set1) != pc_rtx
+ && !reg_used_between_p (SET_DEST (set1), i2, i3)
/* If I3 is a jump, ensure that set0 is a jump so that
we do not create invalid RTL. */
&& (!JUMP_P (i3) || SET_DEST (set0) == pc_rtx)
@@ -4038,7 +4039,8 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
&& !(GET_CODE (SET_DEST (set0)) == SUBREG
&& find_reg_note (i2, REG_DEAD,
SUBREG_REG (SET_DEST (set0))))
- && !modified_between_p (SET_DEST (set0), i2, i3)
+ && SET_DEST (set0) != pc_rtx
+ && !reg_used_between_p (SET_DEST (set0), i2, i3)
/* If I3 is a jump, ensure that set1 is a jump so that
we do not create invalid RTL. */
&& (!JUMP_P (i3) || SET_DEST (set1) == pc_rtx)
--- gcc/testsuite/gcc.c-torture/execute/pr119291.c.jj 2025-03-27 09:48:01.917407084 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr119291.c 2025-03-27 09:47:48.020598094 +0100
@@ -0,0 +1,33 @@
+/* PR rtl-optimization/119291 */
+
+int a;
+long c;
+
+__attribute__((noipa)) void
+foo (int x)
+{
+ if (x != 0)
+ __builtin_abort ();
+ a = 42;
+}
+
+int
+main ()
+{
+ int e = 1;
+lab:
+ if (a < 2)
+ {
+ int b = e;
+ _Bool d = a != 0;
+ _Bool f = b != 0;
+ unsigned long g = -(d & f);
+ unsigned long h = c & g;
+ unsigned long i = ~c;
+ e = -(i & h);
+ c = e != 0;
+ a = ~e + b;
+ foo (e);
+ goto lab;
+ }
+}
Jakub
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] combine: Use reg_used_between_p rather than modified_between_p in two spots [PR119291]
2025-04-01 10:29 ` Jakub Jelinek
@ 2025-04-01 12:42 ` Richard Biener
0 siblings, 0 replies; 4+ messages in thread
From: Richard Biener @ 2025-04-01 12:42 UTC (permalink / raw)
To: Jakub Jelinek
Cc: Segher Boessenkool, Jeff Law, Richard Sandiford, gcc-patches
On Tue, 1 Apr 2025, Jakub Jelinek wrote:
> On Tue, Apr 01, 2025 at 11:27:25AM +0200, Richard Biener wrote:
> > > Looking at this some more today, I think we should special case
> > > set_noop_p because that can be put into i2 (except for the JUMP_P
> > > violations), currently both modified_between_p (pc_rtx, i2, i3)
> > > and reg_used_between_p (pc_rtx, i2, i3) returns false.
> > > I'll post a patch incrementally for that (but that feels like
> > > new optimization, so probably not something that should be backported).
> >
> > Trying to review this I noticed that both the comment in combine suggests
> > that memory set is important and modified_between_p handles MEM_P
> > 'x' and for non-REG 'x' possibly MEM operands (CONCAT?) while
> > reg_used_between_p only handles REG_P 'reg'. Also shouldn't
> > you use reg_referenced_p? At least that function seems to handle
> > SET_SRC/SET_DEST separately?
> >
> > Can we constrain SET_DEST (set1/set0) to a REG_P in combine? Why
> > does the comment talk about memory?
>
> I was worried about making too risky changes this late in stage4
> (and especially also for backports). Most of this code is 1992-ish.
> I think many of the functions are just misnamed, the reg_ in there doesn't
> match what those functions do (bet they initially supported just REGs
> and later on support for other kinds of expressions was added, but haven't
> done git archeology to prove that).
>
> What we know for sure is:
> && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT
> && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART
> && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT
> && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != STRICT_LOW_PART
> that is checked earlier in the condition.
> Then it calls
> && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
> XVECEXP (newpat, 0, 0))
> && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
> XVECEXP (newpat, 0, 1))
> While it has reg_* in it, that function mostly calls reg_overlap_mentioned_p
> which is also misnamed, that function handles just fine all of
> REG, MEM, SUBREG of REG, (SUBREG of MEM not, see below), ZERO_EXTRACT,
> STRICT_LOW_PART, PC and even some further cases.
> So, IMHO SET_DEST (set0) or SET_DEST (set0) can be certainly a REG, SUBREG
> of REG, PC (at least the REG and PC cases are triggered on the testcase)
> and quite possibly also MEM (SUBREG of MEM not, see below).
>
> Now, the code uses !modified_between_p (SET_SRC (set{1,0}), i2, i3) where that
> function for constants just returns false, for PC returns true, for REG
> returns reg_set_between_p, for MEM recurses on the address, for
> MEM_READONLY_P otherwise returns false, otherwise checks using alias.cc code
> whether the memory could have been modified in between, for all other
> rtxes recurses on the subrtxes. This part didn't change in my patch.
>
> I've only changed those
> - && !modified_between_p (SET_DEST (set{1,0}), i2, i3)
> + && !reg_used_between_p (SET_DEST (set{1,0}), i2, i3)
> where the former has been described above and clearly handles all of
> REG, SUBREG of REG, PC, MEM and SUBREG of MEM among other things.
>
> The replacement reg_used_between_p calls reg_overlap_mentioned_p on each
> instruction in between i2 and i3. So, there is clearly a difference
> in behavior if SET_DEST (set{1,0}) is pc_rtx, in that case modified_between_p
> returns unconditionally true even if there are no instructions in between,
> but reg_used_between_p if there are no non-debug insns in between returns
> false. Sorry for missing that, guess I should check for that (with the
> exception of the noop moves which are often (set (pc) (pc)) and handled
> by the incremental patch). In fact not just that, reg_used_between_p
> will only return true for PC if it is mentioned anywhere in the insns
> in between.
> Anyway, except for that, for REG it calls refers_to_regno_p
> and so should find any occurrences of any of the REG or parts of it for hard
> registers, for MEM returns true if it sees any MEMs in insns in between
> (conservatively), for SUBREGs apparently it relies on it being SUBREG of REG
> (so doesn't handle SUBREG of MEM) and handles SUBREG of REG like the
> SUBREG_REG, PC I've already described.
>
> Now, because reg_overlap_mentioned_p doesn't handle SUBREG of MEM, I think
> already the initial
> && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
> XVECEXP (newpat, 0, 0))
> && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
> XVECEXP (newpat, 0, 1))
> calls would have failed --enable-checking=rtl or would have misbehaved, so
> I think there is no need to check for it further.
>
> To your question why I don't use reg_referenced_p, that is because
> reg_referenced_p is something to call on one insn pattern, while
> reg_used_between_p is pretty much that on all insns in between two
> instructions (excluding the boundaries).
>
> So, I think it would be safer to add && SET_DEST (set{1,0} != pc_rtx
> checks to preserve former behavior, like in the following version.
OK.
Richard.
> 2025-04-01 Jakub Jelinek <jakub@redhat.com>
>
> PR rtl-optimization/119291
> * combine.cc (try_combine): For splitting of PARALLEL with
> 2 independent SETs into i2 and i3 sets check reg_used_between_p
> of the SET_DESTs rather than just modified_between_p.
>
> * gcc.c-torture/execute/pr119291.c: New test.
>
> --- gcc/combine.cc.jj 2025-03-25 09:34:33.469102343 +0100
> +++ gcc/combine.cc 2025-04-01 12:24:25.956711734 +0200
> @@ -4012,18 +4012,19 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
> rtx set1 = XVECEXP (newpat, 0, 1);
>
> /* Normally, it doesn't matter which of the two is done first, but
> - one which uses any regs/memory set in between i2 and i3 can't
> - be first. The PARALLEL might also have been pre-existing in i3,
> - so we need to make sure that we won't wrongly hoist a SET to i2
> - that would conflict with a death note present in there, or would
> - have its dest modified between i2 and i3. */
> + one which uses any regs/memory set or used in between i2 and i3
> + can't be first. The PARALLEL might also have been pre-existing
> + in i3, so we need to make sure that we won't wrongly hoist a SET
> + to i2 that would conflict with a death note present in there, or
> + would have its dest modified or used between i2 and i3. */
> if (!modified_between_p (SET_SRC (set1), i2, i3)
> && !(REG_P (SET_DEST (set1))
> && find_reg_note (i2, REG_DEAD, SET_DEST (set1)))
> && !(GET_CODE (SET_DEST (set1)) == SUBREG
> && find_reg_note (i2, REG_DEAD,
> SUBREG_REG (SET_DEST (set1))))
> - && !modified_between_p (SET_DEST (set1), i2, i3)
> + && SET_DEST (set1) != pc_rtx
> + && !reg_used_between_p (SET_DEST (set1), i2, i3)
> /* If I3 is a jump, ensure that set0 is a jump so that
> we do not create invalid RTL. */
> && (!JUMP_P (i3) || SET_DEST (set0) == pc_rtx)
> @@ -4038,7 +4039,8 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
> && !(GET_CODE (SET_DEST (set0)) == SUBREG
> && find_reg_note (i2, REG_DEAD,
> SUBREG_REG (SET_DEST (set0))))
> - && !modified_between_p (SET_DEST (set0), i2, i3)
> + && SET_DEST (set0) != pc_rtx
> + && !reg_used_between_p (SET_DEST (set0), i2, i3)
> /* If I3 is a jump, ensure that set1 is a jump so that
> we do not create invalid RTL. */
> && (!JUMP_P (i3) || SET_DEST (set1) == pc_rtx)
> --- gcc/testsuite/gcc.c-torture/execute/pr119291.c.jj 2025-03-27 09:48:01.917407084 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr119291.c 2025-03-27 09:47:48.020598094 +0100
> @@ -0,0 +1,33 @@
> +/* PR rtl-optimization/119291 */
> +
> +int a;
> +long c;
> +
> +__attribute__((noipa)) void
> +foo (int x)
> +{
> + if (x != 0)
> + __builtin_abort ();
> + a = 42;
> +}
> +
> +int
> +main ()
> +{
> + int e = 1;
> +lab:
> + if (a < 2)
> + {
> + int b = e;
> + _Bool d = a != 0;
> + _Bool f = b != 0;
> + unsigned long g = -(d & f);
> + unsigned long h = c & g;
> + unsigned long i = ~c;
> + e = -(i & h);
> + c = e != 0;
> + a = ~e + b;
> + foo (e);
> + goto lab;
> + }
> +}
>
> Jakub
>
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-04-01 12:42 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-03-28 11:09 [PATCH] combine: Use reg_used_between_p rather than modified_between_p in two spots [PR119291] Jakub Jelinek
2025-04-01 9:27 ` Richard Biener
2025-04-01 10:29 ` Jakub Jelinek
2025-04-01 12:42 ` Richard Biener
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).