From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 4791 invoked by alias); 25 Feb 2011 20:55:38 -0000 Received: (qmail 4782 invoked by uid 22791); 25 Feb 2011 20:55:36 -0000 X-SWARE-Spam-Status: No, hits=-0.4 required=5.0 tests=AWL,BAYES_50,TW_XF,T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from mail.codesourcery.com (HELO mail.codesourcery.com) (38.113.113.100) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 25 Feb 2011 20:55:27 +0000 Received: (qmail 18798 invoked from network); 25 Feb 2011 20:55:25 -0000 Received: from unknown (HELO tp.orcam.me.uk) (macro@127.0.0.2) by mail.codesourcery.com with ESMTPA; 25 Feb 2011 20:55:25 -0000 Date: Fri, 25 Feb 2011 20:55:00 -0000 From: "Maciej W. Rozycki" To: binutils@sourceware.org cc: Richard Sandiford Subject: [PATCH] MIPS/GAS: Clean-up no-delay slot instruction annotation Message-ID: User-Agent: Alpine 1.10 (DEB 962 2008-03-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Mailing-List: contact binutils-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: binutils-owner@sourceware.org X-SW-Source: 2011-02/txt/msg00318.txt.bz2 Hi, The MIPS architecture defines a set of instructions that are not allowed to be placed in a branch delay slot or unpredictable behaviour will happen. Additionally GAS deliberately does not reorder some other instructions that are actually allowed in a branch delay slot, but their intent is to trigger an exception under some circumstances and handling such an exception, although architecturally supported, is sometimes problematic if coming from a branch delay slot. Currently we do not distinguish between the two cases and furthermore we correctly annotate some instructions to keep them out of delay slots, we annotate some instructions unnecessarily, and we fail to annotate some, which we handle by checking their opcodes explicitly. Ah, and we have this honourable SYNC instruction that got awarded with its own flag for this very purpose. I believe this all is unnecessary. We can do with a single flag, if set correctly for all the affected instructions, and no explicit opcode checks. Here's a change that implements my idea. I have renamed the INSN_TRAP flag to INSN_NO_DELAY_SLOT, to reflect its meaning more accurately. I have kept the TRAP alias though for instructions that GAS deliberately chooses not to schedule into delay slots even they are otherwise permitted there. I have added a new alias, NODS, for these instructions that are truly forbidden there to make it easy to distinguish between the two classes. I have used the new flag for SYNC, the ERET and DERET instructions that were checked explicitly and lacked any annotation, and updated all the other instructions of this kind including, but not limited to MIPS16 compact jumps. I haven't added this flag to branches that are already excluded implicitly by means of some obscure conditions (like being relaxed or having a fixup attached); though frankly perhaps we should use this flag on them too, to make it explicit they cannot be scheduled into delay slots. WDYT? Interestingly enough we keep all the MIPS MT ASE instructions out of delay slots even though only YIELD is truly forbidden there and all the others are fine. I can understand our wish to support twisted self-accesses with MFTR and MTTR instructions that may touch registers a branch may have a dependency against (I'm not sure if such accesses have architecturally consistent behaviour though and whether we should really go that extra mile and do anything about them), but why we disable delay-slot scheduling of instructions such as EMT or DI remains a mystery to me. I have decided to keep their annotations for the time being though. If anyhow, such a change should be made separately. This has been regression-tested successfully with the mips-sde-elf and mips-linux-gnu targets. OK to apply? 25-02-2011 Maciej W. Rozycki include/opcode/ * mips.h (INSN_TRAP): Rename to... (INSN_NO_DELAY_SLOT): ... this. (INSN_SYNC): Remove macro. gas/ * config/tc-mips.c (append_insn): Adjust for the rename of INSN_TRAP to INSN_NO_DELAY_SLOT. Remove the check for INSN_SYNC as well as explicit checks for ERET and DERET when scheduling branch delay slots. opcodes/ * mips-opc.c (NODS): New macro. (TRAP): Adjust for the rename of INSN_TRAP to INSN_NO_DELAY_SLOT. (DSP_VOLA): Likewise. (mips_builtin_opcodes): Add NODS annotation to "deret" and "eret". Replace INSN_SYNC with NODS throughout. Use NODS in place of TRAP for "wait", "waiti" and "yield". * mips16-opc.c (NODS): New macro. (TRAP): Adjust for the rename of INSN_TRAP to INSN_NO_DELAY_SLOT. (mips16_opcodes): Use NODS in place of TRAP for "jalrc", "jrc", "restore" and "save". Maciej binutils-mips-opcode-trap.diff Index: binutils-fsf-trunk-quilt/include/opcode/mips.h =================================================================== --- binutils-fsf-trunk-quilt.orig/include/opcode/mips.h 2011-02-25 15:36:55.000000000 +0000 +++ binutils-fsf-trunk-quilt/include/opcode/mips.h 2011-02-25 15:41:01.000000000 +0000 @@ -489,8 +489,9 @@ struct mips_opcode #define INSN_WRITE_HI 0x01000000 /* Modifies the LO register. */ #define INSN_WRITE_LO 0x02000000 -/* Takes a trap (easier to keep out of delay slot). */ -#define INSN_TRAP 0x04000000 +/* Not to be placed in a branch delay slot, either architecturally + or for ease of handling (such as with instructions that take a trap). */ +#define INSN_NO_DELAY_SLOT 0x04000000 /* Instruction stores value into memory. */ #define INSN_STORE_MEMORY 0x08000000 /* Instruction uses single precision floating point. */ @@ -499,8 +500,6 @@ struct mips_opcode #define FP_D 0x20000000 /* Instruction is part of the tx39's integer multiply family. */ #define INSN_MULT 0x40000000 -/* Instruction synchronize shared memory. */ -#define INSN_SYNC 0x80000000 /* Instruction is actually a macro. It should be ignored by the disassembler, and requires special treatment by the assembler. */ #define INSN_MACRO 0xffffffff Index: binutils-fsf-trunk-quilt/gas/config/tc-mips.c =================================================================== --- binutils-fsf-trunk-quilt.orig/gas/config/tc-mips.c 2011-02-25 15:36:55.000000000 +0000 +++ binutils-fsf-trunk-quilt/gas/config/tc-mips.c 2011-02-25 15:41:01.000000000 +0000 @@ -3005,9 +3005,12 @@ append_insn (struct mips_cl_insn *ip, ex /* Check for conflicts between the swapped sequence and the target of the branch. */ && nops_for_sequence (2, history + 1, ip, history) == 0 - /* We do not swap with a trap instruction, since it complicates - trap handlers to have the trap instruction be in a delay slot. */ - && !(prev_pinfo & INSN_TRAP) + /* We do not swap with instructions that cannot architecturally + be placed in a branch delay slot, such as SYNC or ERET. We + also refrain from swapping with a trap instruction, since it + complicates trap handlers to have the trap instruction be in + a delay slot. */ + && !(prev_pinfo & INSN_NO_DELAY_SLOT) /* If the branch reads a register that the previous instruction sets, we cannot swap. */ && (mips_opts.mips16 @@ -3091,13 +3094,7 @@ append_insn (struct mips_cl_insn *ip, ex /* If the previous instruction had a fixup in mips16 mode, we cannot swap. This normally means that the previous instruction was a 4 byte branch anyhow. */ - && !(mips_opts.mips16 && history[0].fixp[0]) - /* If the previous instruction is a sync, sync.l, or sync.p, - we cannot swap. */ - && !(prev_pinfo & INSN_SYNC) - /* If the previous instruction is an ERET or DERET, avoid the swap. */ - && history[0].insn_opcode != INSN_ERET - && history[0].insn_opcode != INSN_DERET) + && !(mips_opts.mips16 && history[0].fixp[0])) { /* It looks like we can actually do the swap. */ branch_disp = insn_length (history); Index: binutils-fsf-trunk-quilt/opcodes/mips-opc.c =================================================================== --- binutils-fsf-trunk-quilt.orig/opcodes/mips-opc.c 2011-02-25 15:36:55.000000000 +0000 +++ binutils-fsf-trunk-quilt/opcodes/mips-opc.c 2011-02-25 15:41:01.000000000 +0000 @@ -37,7 +37,8 @@ #define COD INSN_COPROC_MOVE_DELAY #define CLD INSN_COPROC_MEMORY_DELAY #define CBL INSN_COND_BRANCH_LIKELY -#define TRAP INSN_TRAP +#define NODS INSN_NO_DELAY_SLOT +#define TRAP INSN_NO_DELAY_SLOT #define SM INSN_STORE_MEMORY #define WR_d INSN_WRITE_GPR_D @@ -150,13 +151,14 @@ to track dependencies of these fields. However, "bposge32" is a branch instruction that depends on the "pos" field. In order to make sure that GAS does not reorder DSP instructions - that writes the "pos" field and "bposge32", we add DSP_VOLA (INSN_TRAP) - attribute to those instructions that write the "pos" field. */ + that writes the "pos" field and "bposge32", we add DSP_VOLA + (INSN_NO_DELAY_SLOT) attribute to those instructions that write the "pos" + field. */ #define WR_a WR_HILO /* Write dsp accumulators (reuse WR_HILO) */ #define RD_a RD_HILO /* Read dsp accumulators (reuse RD_HILO) */ #define MOD_a WR_a|RD_a -#define DSP_VOLA INSN_TRAP +#define DSP_VOLA INSN_NO_DELAY_SLOT #define D32 INSN_DSP #define D33 INSN_DSPR2 #define D64 INSN_DSP64 @@ -631,7 +633,7 @@ const struct mips_opcode mips_builtin_op /* dctr and dctw are used on the r5000. */ {"dctr", "o(b)", 0xbc050000, 0xfc1f0000, RD_b, 0, I3 }, {"dctw", "o(b)", 0xbc090000, 0xfc1f0000, RD_b, 0, I3 }, -{"deret", "", 0x4200001f, 0xffffffff, 0, 0, I32|G2 }, +{"deret", "", 0x4200001f, 0xffffffff, NODS, 0, I32|G2 }, {"dext", "t,r,I,+I", 0, (int) M_DEXT, INSN_MACRO, 0, I65 }, {"dext", "t,r,+A,+C", 0x7c000003, 0xfc00003f, WR_t|RD_s, 0, I65 }, {"dextm", "t,r,+A,+G", 0x7c000001, 0xfc00003f, WR_t|RD_s, 0, I65 }, @@ -763,7 +765,7 @@ const struct mips_opcode mips_builtin_op {"ei", "t", 0x41606020, 0xffe0ffff, WR_t|WR_C0, 0, I33|IOCT}, {"emt", "", 0x41600be1, 0xffffffff, TRAP, 0, MT32 }, {"emt", "t", 0x41600be1, 0xffe0ffff, TRAP|WR_t, 0, MT32 }, -{"eret", "", 0x42000018, 0xffffffff, 0, 0, I3_32 }, +{"eret", "", 0x42000018, 0xffffffff, NODS, 0, I3_32 }, {"evpe", "", 0x41600021, 0xffffffff, TRAP, 0, MT32 }, {"evpe", "t", 0x41600021, 0xffe0ffff, TRAP|WR_t, 0, MT32 }, {"ext", "t,r,+A,+C", 0x7c000000, 0xfc00003f, WR_t|RD_s, 0, I33 }, @@ -1400,19 +1402,19 @@ const struct mips_opcode mips_builtin_op {"invalidate", "t,o(b)",0xb8000000, 0xfc000000, RD_t|RD_b, 0, I2 }, /* same */ {"invalidate", "t,A(b)",0, (int) M_SWR_AB, INSN_MACRO, 0, I2 }, /* as swr */ {"swxc1", "S,t(b)", 0x4c000008, 0xfc0007ff, SM|RD_S|RD_t|RD_b|FP_S, 0, I4_33 }, -{"synciobdma", "", 0x0000008f, 0xffffffff, INSN_SYNC, 0, IOCT }, -{"syncs", "", 0x0000018f, 0xffffffff, INSN_SYNC, 0, IOCT }, -{"syncw", "", 0x0000010f, 0xffffffff, INSN_SYNC, 0, IOCT }, -{"syncws", "", 0x0000014f, 0xffffffff, INSN_SYNC, 0, IOCT }, -{"sync_acquire", "", 0x0000044f, 0xffffffff, INSN_SYNC, 0, I33 }, -{"sync_mb", "", 0x0000040f, 0xffffffff, INSN_SYNC, 0, I33 }, -{"sync_release", "", 0x0000048f, 0xffffffff, INSN_SYNC, 0, I33 }, -{"sync_rmb", "", 0x000004cf, 0xffffffff, INSN_SYNC, 0, I33 }, -{"sync_wmb", "", 0x0000010f, 0xffffffff, INSN_SYNC, 0, I33 }, -{"sync", "", 0x0000000f, 0xffffffff, INSN_SYNC, 0, I2|G1 }, -{"sync", "1", 0x0000000f, 0xfffff83f, INSN_SYNC, 0, I32 }, -{"sync.p", "", 0x0000040f, 0xffffffff, INSN_SYNC, 0, I2 }, -{"sync.l", "", 0x0000000f, 0xffffffff, INSN_SYNC, 0, I2 }, +{"synciobdma", "", 0x0000008f, 0xffffffff, NODS, 0, IOCT }, +{"syncs", "", 0x0000018f, 0xffffffff, NODS, 0, IOCT }, +{"syncw", "", 0x0000010f, 0xffffffff, NODS, 0, IOCT }, +{"syncws", "", 0x0000014f, 0xffffffff, NODS, 0, IOCT }, +{"sync_acquire", "", 0x0000044f, 0xffffffff, NODS, 0, I33 }, +{"sync_mb", "", 0x0000040f, 0xffffffff, NODS, 0, I33 }, +{"sync_release", "", 0x0000048f, 0xffffffff, NODS, 0, I33 }, +{"sync_rmb", "", 0x000004cf, 0xffffffff, NODS, 0, I33 }, +{"sync_wmb", "", 0x0000010f, 0xffffffff, NODS, 0, I33 }, +{"sync", "", 0x0000000f, 0xffffffff, NODS, 0, I2|G1 }, +{"sync", "1", 0x0000000f, 0xfffff83f, NODS, 0, I32 }, +{"sync.p", "", 0x0000040f, 0xffffffff, NODS, 0, I2 }, +{"sync.l", "", 0x0000000f, 0xffffffff, NODS, 0, I2 }, {"synci", "o(b)", 0x041f0000, 0xfc1f0000, SM|RD_b, 0, I33 }, {"syscall", "", 0x0000000c, 0xffffffff, TRAP, 0, I1 }, {"syscall", "B", 0x0000000c, 0xfc00003f, TRAP, 0, I1 }, @@ -1481,9 +1483,9 @@ const struct mips_opcode mips_builtin_op {"wacl.ob", "Y,Z", 0x7800003e, 0xffe007ff, RD_S|RD_T|FP_D, WR_MACC, MX|SB1 }, {"wacl.ob", "S,T", 0x4800003e, 0xffe007ff, RD_S|RD_T, 0, N54 }, {"wacl.qh", "Y,Z", 0x7820003e, 0xffe007ff, RD_S|RD_T|FP_D, WR_MACC, MX }, -{"wait", "", 0x42000020, 0xffffffff, TRAP, 0, I3_32 }, -{"wait", "J", 0x42000020, 0xfe00003f, TRAP, 0, I32|N55 }, -{"waiti", "", 0x42000020, 0xffffffff, TRAP, 0, L1 }, +{"wait", "", 0x42000020, 0xffffffff, NODS, 0, I3_32 }, +{"wait", "J", 0x42000020, 0xfe00003f, NODS, 0, I32|N55 }, +{"waiti", "", 0x42000020, 0xffffffff, NODS, 0, L1 }, {"wrpgpr", "d,w", 0x41c00000, 0xffe007ff, RD_t, 0, I33 }, {"wsbh", "d,w", 0x7c0000a0, 0xffe007ff, WR_d|RD_t, 0, I33 }, {"xor", "d,v,t", 0x00000026, 0xfc0007ff, WR_d|RD_s|RD_t, 0, I1 }, @@ -1496,8 +1498,8 @@ const struct mips_opcode mips_builtin_op {"xor.ob", "D,S,k", 0x4bc0000d, 0xffe0003f, WR_D|RD_S|RD_T, 0, N54 }, {"xor.qh", "X,Y,Q", 0x7820000d, 0xfc20003f, WR_D|RD_S|RD_T|FP_D, 0, MX }, {"xori", "t,r,i", 0x38000000, 0xfc000000, WR_t|RD_s, 0, I1 }, -{"yield", "s", 0x7c000009, 0xfc1fffff, TRAP|RD_s, 0, MT32 }, -{"yield", "d,s", 0x7c000009, 0xfc1f07ff, TRAP|WR_d|RD_s, 0, MT32 }, +{"yield", "s", 0x7c000009, 0xfc1fffff, NODS|RD_s, 0, MT32 }, +{"yield", "d,s", 0x7c000009, 0xfc1f07ff, NODS|WR_d|RD_s, 0, MT32 }, /* User Defined Instruction. */ {"udi0", "s,t,d,+1",0x70000010, 0xfc00003f, WR_d|RD_s|RD_t, 0, I33 }, Index: binutils-fsf-trunk-quilt/opcodes/mips16-opc.c =================================================================== --- binutils-fsf-trunk-quilt.orig/opcodes/mips16-opc.c 2011-02-25 15:36:55.000000000 +0000 +++ binutils-fsf-trunk-quilt/opcodes/mips16-opc.c 2011-02-25 15:41:01.000000000 +0000 @@ -58,7 +58,8 @@ #define RD_HI INSN_READ_HI #define RD_LO INSN_READ_LO -#define TRAP INSN_TRAP +#define NODS INSN_NO_DELAY_SLOT +#define TRAP INSN_NO_DELAY_SLOT #define I1 INSN_ISA1 #define I3 INSN_ISA3 @@ -227,12 +228,12 @@ const struct mips_opcode mips16_opcodes[ {"sw", "R,V(S)", 0x6200, 0xff00, RD_31|RD_SP, 0, I1 }, {"xor", "x,y", 0xe80e, 0xf81f, WR_x|RD_x|RD_y, 0, I1 }, /* MIPS16e additions */ -{"jalrc", "x", 0xe8c0, 0xf8ff, UBR|WR_31|RD_x|TRAP, 0, I32 }, -{"jalrc", "R,x", 0xe8c0, 0xf8ff, UBR|WR_31|RD_x|TRAP, 0, I32 }, -{"jrc", "x", 0xe880, 0xf8ff, UBR|RD_x|TRAP, 0, I32 }, -{"jrc", "R", 0xe8a0, 0xffff, UBR|RD_31|TRAP, 0, I32 }, -{"restore", "M", 0x6400, 0xff80, WR_31|RD_SP|WR_SP|TRAP, 0, I32 }, -{"save", "m", 0x6480, 0xff80, RD_31|RD_SP|WR_SP|TRAP, 0, I32 }, +{"jalrc", "x", 0xe8c0, 0xf8ff, UBR|WR_31|RD_x|NODS, 0, I32 }, +{"jalrc", "R,x", 0xe8c0, 0xf8ff, UBR|WR_31|RD_x|NODS, 0, I32 }, +{"jrc", "x", 0xe880, 0xf8ff, UBR|RD_x|NODS, 0, I32 }, +{"jrc", "R", 0xe8a0, 0xffff, UBR|RD_31|NODS, 0, I32 }, +{"restore", "M", 0x6400, 0xff80, WR_31|RD_SP|WR_SP|NODS, 0, I32 }, +{"save", "m", 0x6480, 0xff80, RD_31|RD_SP|WR_SP|NODS, 0, I32 }, {"sdbbp", "6", 0xe801, 0xf81f, TRAP, 0, I32 }, {"seb", "x", 0xe891, 0xf8ff, WR_x|RD_x, 0, I32 }, {"seh", "x", 0xe8b1, 0xf8ff, WR_x|RD_x, 0, I32 },