* [PATCH] [MIPS] Fix wrong instruction in the delay slot
@ 2015-09-04 15:27 Robert Suchanek
2015-09-07 9:14 ` Matthew Fortune
2015-09-09 15:58 ` Jeff Law
0 siblings, 2 replies; 3+ messages in thread
From: Robert Suchanek @ 2015-09-04 15:27 UTC (permalink / raw)
To: Matthew Fortune, Catherine_Moore; +Cc: gcc-patches
Hi,
The attached test case that uses __builtin_unreachable in the default case
in a switch statement triggers a situation where a wrong instruction is placed
in the delay slot by the eager delay slot filler.
The issue should be reproducible with ToT compiler with -mips32r2 -G0 -mno-abicalls -O2.
It appears that a possibly related issue is already reported to Bugzilla (bug 51513)
where the branch is not optimized away, leaving the compare and branch instructions.
It would also appear that this should be fixed at the tree level, however, handling
a special case when a branch is generated, pointing to 'nowhere' block, is probably
more suitable.
Without the fix, the generated assembly is following:
...
addiu $2,$3,-3
sltu $4,$2,33
.set noreorder
.set nomacro
beq $4,$0,$L22
lw $4,%lo(mips_cm_is64)($4)
.set macro
.set reorder
sll $4,$2,2
...
The load instruction should not be placed in the delay slot as it is a part of
the taken branch with an undefined behaviour. The 'sll' was expected to be in the slot.
This is a series of unfortunate events that causes this to happen:
1. After the expansion, the 'nowhere' block is placed after the switch statement and
behaves as a fallthrough case with a diamond shape.
2. The block reordering pass randomly rearranges the nowhere block and it happens to be
placed before the load instruction.
3. The eager delay slot filler firstly redirects jumps by skipping consecutive labels
(ignoring barriers), pointing directly to the load. The analysis is doomed to failure at this
stage as the own thread is analysed and the shift instruction is rejected as it uses
conflicting resources (most likely because $4 is referenced in the load) but the opposite
thread succeeds partly because the set and needed registers sets do not intersect when
the resources are being computed by mark_target_live_regs ().
IMO, the fix is to recognize the empty basic block that has a code_label followed by
a barrier (ignoring notes and debug_insns), forbid going beyond the barrier if the empty block
is found in skip_consecutive_labels () and first_active_target_insn ().
The patch was cross tested on mips-img-linux-gnu and sparc-linux-gnu.
Regards,
Robert
gcc/
* reorg.c (label_with_barrier_p): New function.
(skip_consecutive_labels): Use it. Don't skip the label if an empty
block is found.
(first_active_target_insn): Likewise. Don't ignore the empty
block when searching for the next active instruction.
gcc/testsuite
* gcc.target/mips/builtin-unreachable-bug-1.c: New test.
---
gcc/reorg.c | 28 +++++++
.../gcc.target/mips/builtin-unreachable-bug-1.c | 90 ++++++++++++++++++++++
2 files changed, 118 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/mips/builtin-unreachable-bug-1.c
diff --git a/gcc/reorg.c b/gcc/reorg.c
index cdaa60c..269f666 100644
--- a/gcc/reorg.c
+++ b/gcc/reorg.c
@@ -145,6 +145,30 @@ along with GCC; see the file COPYING3. If not see
These functions are now only used here in reorg.c, and have therefore
been moved here to avoid inadvertent misuse elsewhere in the compiler. */
+/* Return true iff a LABEL is followed by a BARRIER. Ignore notes and debug
+ instructions. */
+
+static bool
+label_with_barrier_p (rtx_insn *label)
+{
+ bool empty_bb = true;
+
+ if (GET_CODE (label) != CODE_LABEL)
+ empty_bb = false;
+ else
+ label = NEXT_INSN (label);
+
+ while (!BARRIER_P (label) && empty_bb)
+ {
+ if (!(DEBUG_INSN_P (label)
+ || NOTE_P (label)))
+ empty_bb = false;
+ label = NEXT_INSN (label);
+ }
+
+ return empty_bb;
+}
+
/* Return the last label to mark the same position as LABEL. Return LABEL
itself if it is null or any return rtx. */
@@ -159,6 +183,8 @@ skip_consecutive_labels (rtx label_or_return)
rtx_insn *label = as_a <rtx_insn *> (label_or_return);
for (insn = label; insn != 0 && !INSN_P (insn); insn = NEXT_INSN (insn))
+ if (LABEL_P (insn) && label_with_barrier_p (insn))
+ break;
if (LABEL_P (insn))
label = insn;
@@ -267,6 +293,8 @@ first_active_target_insn (rtx insn)
{
if (ANY_RETURN_P (insn))
return insn;
+ if (LABEL_P (insn) && label_with_barrier_p (as_a <rtx_insn *> (insn)))
+ return NULL_RTX;
return next_active_insn (as_a <rtx_insn *> (insn));
}
diff --git a/gcc/testsuite/gcc.target/mips/builtin-unreachable-bug-1.c b/gcc/testsuite/gcc.target/mips/builtin-unreachable-bug-1.c
new file mode 100644
index 0000000..65eb9a3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/builtin-unreachable-bug-1.c
@@ -0,0 +1,90 @@
+/* { dg-options "-mcompact-branches=never -G0 -mno-abicalls" } */
+/* { dg-final { scan-assembler-not "beq\t\\\$4,\\\$0,\\\$L\[0-9\]+\n\tlw\t\\\$4,%lo\\(mips_cm_is64\\)\\(\\\$4\\)" } } */
+
+enum
+{
+ CPU_R4700,
+ CPU_R5000,
+ CPU_R5500,
+ CPU_NEVADA,
+ CPU_RM7000,
+ CPU_SR71000,
+ CPU_4KC,
+ CPU_4KEC,
+ CPU_4KSC,
+ CPU_24K,
+ CPU_34K,
+ CPU_1004K,
+ CPU_74K,
+ CPU_ALCHEMY,
+ CPU_PR4450,
+ CPU_BMIPS32,
+ CPU_BMIPS3300,
+ CPU_BMIPS5000,
+ CPU_JZRISC,
+ CPU_M14KC,
+ CPU_M14KEC,
+ CPU_INTERAPTIV,
+ CPU_P5600,
+ CPU_PROAPTIV,
+ CPU_1074K,
+ CPU_M5150,
+ CPU_I6400,
+ CPU_R3000,
+ CPU_5KC,
+ CPU_5KE,
+ CPU_20KC,
+ CPU_25KF,
+ CPU_SB1,
+ CPU_SB1A,
+ CPU_XLP,
+ CPU_QEMU_GENERIC
+};
+
+struct cpuinfo_mips
+{
+ long options;
+ int isa_level;
+} cpu_data[1];
+
+struct thread_info
+{
+ int cpu;
+};
+
+int a, b, c, d, e, mips_cm_is64;
+
+static int __attribute__ ((__cold__))
+mips_sc_probe_cm3 ()
+{
+ struct thread_info *info;
+ struct cpuinfo_mips *c = &cpu_data[info->cpu];
+ if (mips_cm_is64)
+ c->options = 0;
+ return 1;
+}
+
+int
+mips_sc_probe ()
+{
+ struct cpuinfo_mips ci = cpu_data[c];
+ if (mips_cm_is64)
+ __asm__("" ::: "memory");
+ if (d)
+ return mips_sc_probe_cm3 ();
+ if (ci.isa_level)
+ switch (a)
+ {
+ case CPU_QEMU_GENERIC:
+ case CPU_I6400:
+ case CPU_R3000:
+ case CPU_NEVADA:
+ case CPU_BMIPS3300:
+ break;
+ default:
+ __builtin_unreachable ();
+ }
+ switch (a)
+ case CPU_R3000:
+ e = b;
+}
--
2.4.5
^ permalink raw reply [flat|nested] 3+ messages in thread
* RE: [PATCH] [MIPS] Fix wrong instruction in the delay slot
2015-09-04 15:27 [PATCH] [MIPS] Fix wrong instruction in the delay slot Robert Suchanek
@ 2015-09-07 9:14 ` Matthew Fortune
2015-09-09 15:58 ` Jeff Law
1 sibling, 0 replies; 3+ messages in thread
From: Matthew Fortune @ 2015-09-07 9:14 UTC (permalink / raw)
To: Robert Suchanek, Catherine_Moore; +Cc: gcc-patches
Robert Suchanek <Robert.Suchanek@imgtec.com>
> IMO, the fix is to recognize the empty basic block that has a code_label
> followed by a barrier (ignoring notes and debug_insns), forbid going
> beyond the barrier if the empty block is found in
> skip_consecutive_labels () and first_active_target_insn ().
I can't approve this but I agree that the delay slot filler should not be
allowed to look past barriers to fill delay slots (especially if moving
a memory operation).
The redundant/no-op branch here is also annoying but the delay slot issue
does seem to be an independent problem. I've seen lots of pointless branches
being generated by GCC that get all the way though to emitting code. Perhaps
this is just one of several reasons for generating either always or never
taken conditional branches. They end up as things like BEQ $4, $4 or
BNE $4, $4 which MIPS R6 converts to 'B' or 'nothing' respectively as I have
not had time to track their origin.
Thanks,
Matthew
>
> The patch was cross tested on mips-img-linux-gnu and sparc-linux-gnu.
>
>
> Regards,
> Robert
>
> gcc/
> * reorg.c (label_with_barrier_p): New function.
> (skip_consecutive_labels): Use it. Don't skip the label if an
> empty
> block is found.
> (first_active_target_insn): Likewise. Don't ignore the empty
> block when searching for the next active instruction.
>
> gcc/testsuite
> * gcc.target/mips/builtin-unreachable-bug-1.c: New test.
> ---
> gcc/reorg.c | 28 +++++++
> .../gcc.target/mips/builtin-unreachable-bug-1.c | 90
> ++++++++++++++++++++++
> 2 files changed, 118 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/mips/builtin-unreachable-
> bug-1.c
>
> diff --git a/gcc/reorg.c b/gcc/reorg.c
> index cdaa60c..269f666 100644
> --- a/gcc/reorg.c
> +++ b/gcc/reorg.c
> @@ -145,6 +145,30 @@ along with GCC; see the file COPYING3. If not see
> These functions are now only used here in reorg.c, and have
> therefore
> been moved here to avoid inadvertent misuse elsewhere in the
> compiler. */
>
> +/* Return true iff a LABEL is followed by a BARRIER. Ignore notes and
> debug
> + instructions. */
> +
> +static bool
> +label_with_barrier_p (rtx_insn *label)
> +{
> + bool empty_bb = true;
> +
> + if (GET_CODE (label) != CODE_LABEL)
> + empty_bb = false;
> + else
> + label = NEXT_INSN (label);
> +
> + while (!BARRIER_P (label) && empty_bb)
> + {
> + if (!(DEBUG_INSN_P (label)
> + || NOTE_P (label)))
> + empty_bb = false;
> + label = NEXT_INSN (label);
> + }
> +
> + return empty_bb;
> +}
> +
> /* Return the last label to mark the same position as LABEL. Return
> LABEL
> itself if it is null or any return rtx. */
>
> @@ -159,6 +183,8 @@ skip_consecutive_labels (rtx label_or_return)
> rtx_insn *label = as_a <rtx_insn *> (label_or_return);
>
> for (insn = label; insn != 0 && !INSN_P (insn); insn = NEXT_INSN
> (insn))
> + if (LABEL_P (insn) && label_with_barrier_p (insn))
> + break;
> if (LABEL_P (insn))
> label = insn;
>
> @@ -267,6 +293,8 @@ first_active_target_insn (rtx insn) {
> if (ANY_RETURN_P (insn))
> return insn;
> + if (LABEL_P (insn) && label_with_barrier_p (as_a <rtx_insn *>
> (insn)))
> + return NULL_RTX;
> return next_active_insn (as_a <rtx_insn *> (insn)); }
>
>
> diff --git a/gcc/testsuite/gcc.target/mips/builtin-unreachable-bug-1.c
> b/gcc/testsuite/gcc.target/mips/builtin-unreachable-bug-1.c
> new file mode 100644
> index 0000000..65eb9a3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/mips/builtin-unreachable-bug-1.c
> @@ -0,0 +1,90 @@
> +/* { dg-options "-mcompact-branches=never -G0 -mno-abicalls" } */
> +/* { dg-final { scan-assembler-not
> +"beq\t\\\$4,\\\$0,\\\$L\[0-9\]+\n\tlw\t\\\$4,%lo\\(mips_cm_is64\\)\\(\\
> +\$4\\)" } } */
> +
> +enum
> +{
> + CPU_R4700,
> + CPU_R5000,
> + CPU_R5500,
> + CPU_NEVADA,
> + CPU_RM7000,
> + CPU_SR71000,
> + CPU_4KC,
> + CPU_4KEC,
> + CPU_4KSC,
> + CPU_24K,
> + CPU_34K,
> + CPU_1004K,
> + CPU_74K,
> + CPU_ALCHEMY,
> + CPU_PR4450,
> + CPU_BMIPS32,
> + CPU_BMIPS3300,
> + CPU_BMIPS5000,
> + CPU_JZRISC,
> + CPU_M14KC,
> + CPU_M14KEC,
> + CPU_INTERAPTIV,
> + CPU_P5600,
> + CPU_PROAPTIV,
> + CPU_1074K,
> + CPU_M5150,
> + CPU_I6400,
> + CPU_R3000,
> + CPU_5KC,
> + CPU_5KE,
> + CPU_20KC,
> + CPU_25KF,
> + CPU_SB1,
> + CPU_SB1A,
> + CPU_XLP,
> + CPU_QEMU_GENERIC
> +};
> +
> +struct cpuinfo_mips
> +{
> + long options;
> + int isa_level;
> +} cpu_data[1];
> +
> +struct thread_info
> +{
> + int cpu;
> +};
> +
> +int a, b, c, d, e, mips_cm_is64;
> +
> +static int __attribute__ ((__cold__))
> +mips_sc_probe_cm3 ()
> +{
> + struct thread_info *info;
> + struct cpuinfo_mips *c = &cpu_data[info->cpu];
> + if (mips_cm_is64)
> + c->options = 0;
> + return 1;
> +}
> +
> +int
> +mips_sc_probe ()
> +{
> + struct cpuinfo_mips ci = cpu_data[c];
> + if (mips_cm_is64)
> + __asm__("" ::: "memory");
> + if (d)
> + return mips_sc_probe_cm3 ();
> + if (ci.isa_level)
> + switch (a)
> + {
> + case CPU_QEMU_GENERIC:
> + case CPU_I6400:
> + case CPU_R3000:
> + case CPU_NEVADA:
> + case CPU_BMIPS3300:
> + break;
> + default:
> + __builtin_unreachable ();
> + }
> + switch (a)
> + case CPU_R3000:
> + e = b;
> +}
> --
> 2.4.5
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] [MIPS] Fix wrong instruction in the delay slot
2015-09-04 15:27 [PATCH] [MIPS] Fix wrong instruction in the delay slot Robert Suchanek
2015-09-07 9:14 ` Matthew Fortune
@ 2015-09-09 15:58 ` Jeff Law
1 sibling, 0 replies; 3+ messages in thread
From: Jeff Law @ 2015-09-09 15:58 UTC (permalink / raw)
To: Robert Suchanek, Matthew Fortune, Catherine_Moore; +Cc: gcc-patches
On 09/04/2015 09:19 AM, Robert Suchanek wrote:
>
> It appears that a possibly related issue is already reported to Bugzilla (bug 51513)
> where the branch is not optimized away, leaving the compare and branch instructions.
>
> It would also appear that this should be fixed at the tree level, however, handling
> a special case when a branch is generated, pointing to 'nowhere' block, is probably
> more suitable.
Fixing the unnecessary branch may be something worth tackling at the
tree level. I'd have to sit down with the dumps for a few minutes to be
sure.
This is one of the reasons why I really dislike __builtin_unreachable.
I much prefer __builtin_trap so that if by some impossible situation we
get into this code we get an immediate fault rather than wandering off
in the instruction stream executing whatever happens to be there.
However, I think that addressing the unnecessary branching in 51513 is
independent of fixing reorg.c.
>
> Without the fix, the generated assembly is following:
>
> ...
> addiu $2,$3,-3
> sltu $4,$2,33
> .set noreorder
> .set nomacro
> beq $4,$0,$L22
> lw $4,%lo(mips_cm_is64)($4)
> .set macro
> .set reorder
>
> sll $4,$2,2
> ...
>
> The load instruction should not be placed in the delay slot as it is a part of
> the taken branch with an undefined behaviour. The 'sll' was expected to be in the slot.
>
> This is a series of unfortunate events that causes this to happen:
> 1. After the expansion, the 'nowhere' block is placed after the switch statement and
> behaves as a fallthrough case with a diamond shape.
> 2. The block reordering pass randomly rearranges the nowhere block and it happens to be
> placed before the load instruction.
> 3. The eager delay slot filler firstly redirects jumps by skipping consecutive labels
> (ignoring barriers), pointing directly to the load. The analysis is doomed to failure at this
> stage as the own thread is analysed and the shift instruction is rejected as it uses
> conflicting resources (most likely because $4 is referenced in the load) but the opposite
> thread succeeds partly because the set and needed registers sets do not intersect when
> the resources are being computed by mark_target_live_regs ().
>
> IMO, the fix is to recognize the empty basic block that has a code_label followed by
> a barrier (ignoring notes and debug_insns), forbid going beyond the barrier if the empty block
> is found in skip_consecutive_labels () and first_active_target_insn ().
>
> The patch was cross tested on mips-img-linux-gnu and sparc-linux-gnu.
>
>
> Regards,
> Robert
>
> gcc/
> * reorg.c (label_with_barrier_p): New function.
> (skip_consecutive_labels): Use it. Don't skip the label if an empty
> block is found.
> (first_active_target_insn): Likewise. Don't ignore the empty
> block when searching for the next active instruction.
Unless I'm missing something, I don't think we should treat empty blocks
special here. Instead, just stop the search when we hit a BARRIER,
regardless of what other instructions (or lack thereof) are in the block.
Jeff
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2015-09-09 15:45 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-04 15:27 [PATCH] [MIPS] Fix wrong instruction in the delay slot Robert Suchanek
2015-09-07 9:14 ` Matthew Fortune
2015-09-09 15:58 ` Jeff Law
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).