public inbox for gcc-patches@gcc.gnu.org
* [PATCH][MIPS] NetLogic XLP scheduling
@ 2012-07-13 10:11 Chung-Lin Tang
  2012-07-15 16:28 ` Richard Sandiford
From: Chung-Lin Tang @ 2012-07-13 10:11 UTC
  To: gcc-patches; +Cc: Richard Sandiford

[-- Attachment #1: Type: text/plain, Size: 492 bytes --]

Hi Richard,
This patch adds scheduling support for the NetLogic XLP, including a new
pipeline description, and associated changes.

Aside from the new xlp.md description file, there are also some sync
primitive attribute modifications, for better scheduling of sync loops
(Maxim should be able to better explain this).

Other generic changes include a new "hilo" insn attribute, to mark which
of HI/LO an m[ft]hilo insn accesses.
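
To make the intended attribute values concrete, here is a minimal
standalone C sketch of the classification.  It is not the GCC code: the
enum names and register numbers are hypothetical stand-ins for the real
TYPE_* values and HI_REGNUM/LO_REGNUM, and only the sign convention
(positive for HI, negative for LO, zero otherwise) follows the patch.

```c
#include <assert.h>

/* Hypothetical stand-ins for GCC's insn types and hard register numbers;
   the numeric values are arbitrary and chosen only for this sketch.  */
enum sketch_type { SK_OTHER, SK_MTHILO, SK_MFHILO };
enum { SK_HI_REGNUM = 64, SK_LO_REGNUM = 65 };

/* Return 1 if the move accesses HI, -1 if it accesses LO, 0 otherwise.
   DEST_REGNO and SRC_REGNO model REGNO (SET_DEST) and REGNO (SET_SRC).  */
static int
sketch_hilo_use (enum sketch_type type, int dest_regno, int src_regno)
{
  int regno;

  /* A move-to writes HI/LO, so inspect the destination;
     a move-from reads HI/LO, so inspect the source.  */
  if (type == SK_MTHILO)
    regno = dest_regno;
  else if (type == SK_MFHILO)
    regno = src_regno;
  else
    return 0;

  if (regno == SK_HI_REGNUM)
    return 1;
  if (regno == SK_LO_REGNUM)
    return -1;
  return 0;
}

/* Map the tri-state result onto the "hilo" attribute values.  */
static const char *
sketch_hilo_attr (int use)
{
  assert (use >= -1 && use <= 1);
  return use > 0 ? "hi" : use < 0 ? "lo" : "none";
}
```

In this model an mthilo whose destination is the HI stand-in maps to
"hi", and any insn that is neither mthilo nor mfhilo maps to "none".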

Can you see if this is okay for trunk?

Thanks,
Chung-Lin

[-- Attachment #2: 0001-XLP-scheduling.patch --]
[-- Type: text/plain, Size: 21884 bytes --]

From 014ff721a2e6cb96236dcf5e11d7f15c3b927386 Mon Sep 17 00:00:00 2001
From: Maxim Kuvyrkov <maxim@codesourcery.com>
Date: Mon, 18 Jun 2012 18:10:19 -0700
Subject: [PATCH] XLP scheduling.

2012-07-13  Chung-Lin Tang  <cltang@codesourcery.com>
	    Maxim Kuvyrkov  <maxim@codesourcery.com>
	    NetLogic Microsystems Inc.

	* config/mips/mips.md (define_attr "type"): New values "atomic" and
	"syncloop".
	(hilo): New attribute indicating which of hi/lo is accessed.
	(include xlp.md): New include.
	(mfhi<GPR:mode>_<HILO:mode>,mthi<GPR:mode>_<HILO:mode>):
	Set "hilo" attribute.
	* config/mips/sync.md: Set "type" attribute for instructions.
	* config/mips/generic.md (generic_atomic, generic_sync_loop):
	New reservations.
	* config/mips/xlp.md: New file.
	* config/mips/mips-protos.h (mips_hilo_use): Declare.
	* config/mips/mips.c (mips_issue_rate): Handle XLP.
	(mips_hilo_use): New function for computing "hilo" attribute.
---
 gcc/config/mips/generic.md    |   16 +++
 gcc/config/mips/mips-protos.h |    2 +
 gcc/config/mips/mips.c        |   28 ++++++
 gcc/config/mips/mips.md       |   18 +++-
 gcc/config/mips/sync.md       |   78 ++++++++++-----
 gcc/config/mips/xlp.md        |  217 +++++++++++++++++++++++++++++++++++++++++
 6 files changed, 332 insertions(+), 27 deletions(-)
 create mode 100644 gcc/config/mips/xlp.md

diff --git a/gcc/config/mips/generic.md b/gcc/config/mips/generic.md
index d61511f..02b1d8b 100644
--- a/gcc/config/mips/generic.md
+++ b/gcc/config/mips/generic.md
@@ -103,3 +103,19 @@
 (define_insn_reservation "generic_frecip_fsqrt_step" 5
   (eq_attr "type" "frdiv1,frdiv2,frsqrt1,frsqrt2")
   "alu")
+
+(define_insn_reservation "generic_atomic" 10
+  (eq_attr "type" "atomic")
+  "alu")
+
+;; Sync loop consists of (in order)
+;; (1) optional sync,
+;; (2) LL instruction,
+;; (3) branch and 1-2 ALU instructions,
+;; (4) SC instruction,
+;; (5) branch and ALU instruction.
+;; The net result of this reservation is a big delay with a flush of
+;; ALU pipeline.
+(define_insn_reservation "generic_sync_loop" 40
+  (eq_attr "type" "syncloop")
+  "alu*39")
diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-protos.h
index d1fa160..1b1cbcb 100644
--- a/gcc/config/mips/mips-protos.h
+++ b/gcc/config/mips/mips-protos.h
@@ -334,6 +334,8 @@ extern void mips_final_prescan_insn (rtx, rtx *, int);
 extern int mips_trampoline_code_size (void);
 extern void mips_function_profiler (FILE *);
 
+extern int mips_hilo_use (rtx);
+
 typedef rtx (*mulsidi3_gen_fn) (rtx, rtx, rtx);
 #ifdef RTX_CODE
 extern mulsidi3_gen_fn mips_mulsidi3_gen_fn (enum rtx_code);
diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 7356ce5..f46eb49 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -12480,6 +12480,9 @@ mips_issue_rate (void)
     case PROCESSOR_LOONGSON_3A:
       return 4;
 
+    case PROCESSOR_XLP:
+      return (reload_completed ? 4 : 3);
+
     default:
       return 1;
     }
@@ -17407,6 +17410,31 @@ mips_expand_vec_minmax (rtx target, rtx op0, rtx op1,
   x = gen_rtx_IOR (vmode, t0, t1);
   emit_insn (gen_rtx_SET (VOIDmode, target, x));
 }
+
+/* Determine HI/LO access on INSN: return 1 for HI, -1 for LO,
+   and 0 for no access.  Used for determining the "hilo" attribute.  */
+
+int
+mips_hilo_use (rtx insn)
+{
+  rtx pat, dest, src;
+  enum attr_type insn_type;
+
+  if (! (pat = single_set (insn)))
+    return 0;
+
+  insn_type = get_attr_type (insn);
+  dest = SET_DEST (pat);
+  src = SET_SRC (pat);
+
+  if ((insn_type == TYPE_MTHILO && REGNO (dest) == HI_REGNUM)
+      || (insn_type == TYPE_MFHILO && REGNO (src) == HI_REGNUM))
+    return 1;
+  if ((insn_type == TYPE_MTHILO && REGNO (dest) == LO_REGNUM)
+      || (insn_type == TYPE_MFHILO && REGNO (src) == LO_REGNUM))
+    return -1;
+  return 0;
+}
 \f
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index 5b1735f..d2a304e 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -274,6 +274,8 @@
 ;; frsqrt1      floating point reciprocal square root step1
 ;; frsqrt2      floating point reciprocal square root step2
 ;; multi	multiword sequence (or user asm statements)
+;; atomic	atomic memory update instruction
+;; syncloop	memory atomic operation implemented as a sync loop
 ;; nop		no operation
 ;; ghost	an instruction that produces no real code
 (define_attr "type"
@@ -281,7 +283,7 @@
    prefetch,prefetchx,condmove,mtc,mfc,mthilo,mfhilo,const,arith,logical,
    shift,slt,signext,clz,pop,trap,imul,imul3,imul3nc,imadd,idiv,idiv3,move,
    fmove,fadd,fmul,fmadd,fdiv,frdiv,frdiv1,frdiv2,fabs,fneg,fcmp,fcvt,fsqrt,
-   frsqrt,frsqrt1,frsqrt2,multi,nop,ghost"
+   frsqrt,frsqrt1,frsqrt2,multi,atomic,syncloop,nop,ghost"
   (cond [(eq_attr "jal" "!unset") (const_string "call")
 	 (eq_attr "got" "load") (const_string "load")
 
@@ -589,6 +591,17 @@
 		(const_string "yes")
 		(const_string "no")))
 
+;; For mfhilo/mthilo insns, determine which of hi or lo is the operand.
+(define_attr "hilo" "hi,lo,none"
+  (cond [(and (eq_attr "type" "mthilo,mfhilo")
+              (gt (symbol_ref "mips_hilo_use (insn)") (const_int 0)))
+         (const_string "hi")
+
+         (and (eq_attr "type" "mthilo,mfhilo")
+              (lt (symbol_ref "mips_hilo_use (insn)") (const_int 0)))
+         (const_string "lo")]
+        (const_string "none")))
+
 ;; Describe a user's asm statement.
 (define_asm_attributes
   [(set_attr "type" "multi")
@@ -936,6 +949,7 @@
 (include "sb1.md")
 (include "sr71k.md")
 (include "xlr.md")
+(include "xlp.md")
 (include "generic.md")
 \f
 ;;
@@ -4735,6 +4749,7 @@
   ""
   { return ISA_HAS_MACCHI ? "<GPR:d>macchi\t%0,%.,%." : "mfhi\t%0"; }
   [(set_attr "move_type" "mfhilo")
+   (set_attr "hilo" "hi")
    (set_attr "mode" "<GPR:MODE>")])
 
 ;; Set the high part of a HI/LO value, given that the low part has
@@ -4748,6 +4763,7 @@
   ""
   "mthi\t%z1"
   [(set_attr "move_type" "mthilo")
+   (set_attr "hilo" "hi")
    (set_attr "mode" "SI")])
 
 ;; Emit a doubleword move in which exactly one of the operands is
diff --git a/gcc/config/mips/sync.md b/gcc/config/mips/sync.md
index 0a7905a..6585d91 100644
--- a/gcc/config/mips/sync.md
+++ b/gcc/config/mips/sync.md
@@ -99,7 +99,8 @@
 			    UNSPEC_COMPARE_AND_SWAP_12))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_oldval" "0")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_oldval" "0")
    (set_attr "sync_mem" "1")
    (set_attr "sync_inclusive_mask" "2")
    (set_attr "sync_exclusive_mask" "3")
@@ -114,7 +115,8 @@
 	  UNSPEC_SYNC_OLD_OP))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "addiu,addu")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "addiu,addu")
    (set_attr "sync_mem" "0")
    (set_attr "sync_insn1_op2" "1")])
 
@@ -145,7 +147,8 @@
    (clobber (match_scratch:SI 4 "=&d"))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "<insn>")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "<insn>")
    (set_attr "sync_insn2" "and")
    (set_attr "sync_mem" "0")
    (set_attr "sync_inclusive_mask" "1")
@@ -186,7 +189,8 @@
    (clobber (match_scratch:SI 5 "=&d"))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "<insn>")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "<insn>")
    (set_attr "sync_insn2" "and")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_mem" "1")
@@ -232,7 +236,8 @@
 	   (match_dup 4)] UNSPEC_SYNC_NEW_OP_12))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "<insn>")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "<insn>")
    (set_attr "sync_insn2" "and")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_newval" "0")
@@ -268,7 +273,8 @@
    (clobber (match_scratch:SI 4 "=&d"))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "and")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "and")
    (set_attr "sync_insn2" "xor")
    (set_attr "sync_mem" "0")
    (set_attr "sync_inclusive_mask" "1")
@@ -307,7 +313,8 @@
    (clobber (match_scratch:SI 5 "=&d"))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "and")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "and")
    (set_attr "sync_insn2" "xor")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_mem" "1")
@@ -351,7 +358,8 @@
 	   (match_dup 4)] UNSPEC_SYNC_NEW_OP_12))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "and")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "and")
    (set_attr "sync_insn2" "xor")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_newval" "0")
@@ -368,7 +376,8 @@
 	 UNSPEC_SYNC_OLD_OP))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "subu")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "subu")
    (set_attr "sync_mem" "0")
    (set_attr "sync_insn1_op2" "1")])
 
@@ -383,7 +392,8 @@
 	 UNSPEC_SYNC_OLD_OP))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "addiu,addu")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "addiu,addu")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_mem" "1")
    (set_attr "sync_insn1_op2" "2")])
@@ -398,7 +408,8 @@
 	 UNSPEC_SYNC_OLD_OP))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "subu")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "subu")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_mem" "1")
    (set_attr "sync_insn1_op2" "2")])
@@ -413,7 +424,8 @@
 	 UNSPEC_SYNC_NEW_OP))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "addiu,addu")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "addiu,addu")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_newval" "0")
    (set_attr "sync_mem" "1")
@@ -429,7 +441,8 @@
 	 UNSPEC_SYNC_NEW_OP))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "subu")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "subu")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_newval" "0")
    (set_attr "sync_mem" "1")
@@ -443,7 +456,8 @@
 	 UNSPEC_SYNC_OLD_OP))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "<immediate_insn>,<insn>")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "<immediate_insn>,<insn>")
    (set_attr "sync_mem" "0")
    (set_attr "sync_insn1_op2" "1")])
 
@@ -457,7 +471,8 @@
 	 UNSPEC_SYNC_OLD_OP))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "<immediate_insn>,<insn>")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "<immediate_insn>,<insn>")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_mem" "1")
    (set_attr "sync_insn1_op2" "2")])
@@ -472,7 +487,8 @@
 	 UNSPEC_SYNC_NEW_OP))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "<immediate_insn>,<insn>")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "<immediate_insn>,<insn>")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_newval" "0")
    (set_attr "sync_mem" "1")
@@ -484,7 +500,8 @@
 	 UNSPEC_SYNC_OLD_OP))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "andi,and")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "andi,and")
    (set_attr "sync_insn2" "not")
    (set_attr "sync_mem" "0")
    (set_attr "sync_insn1_op2" "1")])
@@ -497,7 +514,8 @@
 	 UNSPEC_SYNC_OLD_OP))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "andi,and")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "andi,and")
    (set_attr "sync_insn2" "not")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_mem" "1")
@@ -511,7 +529,8 @@
 	 UNSPEC_SYNC_NEW_OP))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "andi,and")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "andi,and")
    (set_attr "sync_insn2" "not")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_newval" "0")
@@ -526,7 +545,8 @@
 	 UNSPEC_SYNC_EXCHANGE))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_memmodel" "11")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_memmodel" "11")
    (set_attr "sync_insn1" "li,move")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_mem" "1")
@@ -555,7 +575,8 @@
 	  UNSPEC_SYNC_EXCHANGE_12))]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_memmodel" "11")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_memmodel" "11")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_mem" "1")
    ;; Unused, but needed to give the number of operands expected by
@@ -594,7 +615,8 @@
     UNSPEC_ATOMIC_COMPARE_AND_SWAP)]
   "GENERATE_LL_SC"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "li,move")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "li,move")
    (set_attr "sync_oldval" "1")
    (set_attr "sync_cmp" "0")
    (set_attr "sync_mem" "2")
@@ -639,7 +661,8 @@
     UNSPEC_ATOMIC_EXCHANGE)]
   "GENERATE_LL_SC && !ISA_HAS_SWAP"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "li,move")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "li,move")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_mem" "1")
    (set_attr "sync_insn1_op2" "2")
@@ -654,7 +677,8 @@
 	(unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "0")]
 	 UNSPEC_ATOMIC_EXCHANGE))]
   "ISA_HAS_SWAP"
-  "swap<size>\t%0,%b1")
+  "swap<size>\t%0,%b1"
+  [(set_attr "type" "atomic")])
 
 (define_expand "atomic_fetch_add<mode>"
   [(match_operand:GPR 0 "register_operand")
@@ -695,7 +719,8 @@
     UNSPEC_ATOMIC_FETCH_OP)]
   "GENERATE_LL_SC && !ISA_HAS_LDADD"
   { return mips_output_sync_loop (insn, operands); }
-  [(set_attr "sync_insn1" "addiu,addu")
+  [(set_attr "type" "syncloop")
+   (set_attr "sync_insn1" "addiu,addu")
    (set_attr "sync_oldval" "0")
    (set_attr "sync_mem" "1")
    (set_attr "sync_insn1_op2" "2")
@@ -712,4 +737,5 @@
 		    (match_operand:GPR 2 "register_operand" "0"))]
 	 UNSPEC_ATOMIC_FETCH_OP))]
   "ISA_HAS_LDADD"
-  "ldadd<size>\t%0,%b1")
+  "ldadd<size>\t%0,%b1"
+  [(set_attr "type" "atomic")])
diff --git a/gcc/config/mips/xlp.md b/gcc/config/mips/xlp.md
new file mode 100644
index 0000000..e57133f
--- /dev/null
+++ b/gcc/config/mips/xlp.md
@@ -0,0 +1,217 @@
+;; DFA-based pipeline description for the XLP.
+;; Copyright (C) 2012 Free Software Foundation, Inc.
+;;
+;; xlp.md   Machine Description for the Broadcom XLP Microprocessor
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_automaton "xlp_cpu")
+
+;; CPU function units.
+(define_cpu_unit "xlp_ex0" "xlp_cpu")
+(define_cpu_unit "xlp_ex1" "xlp_cpu")
+(define_cpu_unit "xlp_ex2" "xlp_cpu")
+(define_cpu_unit "xlp_ex3" "xlp_cpu")
+
+;; Integer Multiply Unit
+(define_cpu_unit "xlp_div" "xlp_cpu")
+
+;; ALU2 completion port.
+(define_cpu_unit "xlp_ex2_wrb" "xlp_cpu")
+
+(define_automaton "xlp_fpu")
+
+;; Floating-point units.
+(define_cpu_unit "xlp_fp" "xlp_fpu")
+
+;; Floating Point Sqrt/Divide
+(define_cpu_unit "xlp_divsq" "xlp_fpu")
+
+;; FPU completion port.
+(define_cpu_unit "xlp_fp_wrb" "xlp_fpu")
+
+;; Define reservations for common combinations.
+
+;;
+;; The ordering of the instruction-execution-path/resource-usage
+;; descriptions (also known as reservation RTL) is roughly ordered
+;; based on the define attribute RTL for the "type" classification.
+;; When modifying, remember that the first test that matches is the
+;; reservation used!
+;;
+(define_insn_reservation "ir_xlp_unknown" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "unknown,multi"))
+  "xlp_ex0+xlp_ex1+xlp_ex2+xlp_ex3")
+
+(define_insn_reservation "ir_xlp_branch" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "branch,jump,call"))
+  "xlp_ex3")
+
+(define_insn_reservation "ir_xlp_prefetch" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "prefetch,prefetchx"))
+  "xlp_ex0|xlp_ex1")
+
+(define_insn_reservation "ir_xlp_load" 4
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "load"))
+  "xlp_ex0|xlp_ex1")
+
+(define_insn_reservation "ir_xlp_fpload" 5
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "fpload,fpidxload"))
+  "xlp_ex0|xlp_ex1")
+
+(define_insn_reservation "ir_xlp_alu" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "const,arith,shift,slt,clz,signext,logical,move,trap,nop"))
+  "xlp_ex0|xlp_ex1|(xlp_ex2,xlp_ex2_wrb)|xlp_ex3")
+
+(define_insn_reservation "ir_xlp_condmov" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "condmove")
+       (eq_attr "mode" "SI,DI"))
+  "xlp_ex2,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_mul" 5
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "imul,imadd"))
+  "xlp_ex2,nothing*4,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_mul3" 3
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "imul3"))
+  "xlp_ex2,nothing*2,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_div" 24
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "SI")
+       (eq_attr "type" "idiv"))
+  "xlp_ex2+xlp_div,xlp_div*23,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_ddiv" 48
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "DI")
+       (eq_attr "type" "idiv"))
+  "xlp_ex2+xlp_div,xlp_div*47,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_store" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "store,fpstore,fpidxstore"))
+  "xlp_ex0|xlp_ex1")
+
+(define_insn_reservation "ir_xlp_fpmove" 2
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "mfc"))
+ "xlp_ex3,xlp_fp,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_mfhi" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "mfhilo")
+       (eq_attr "hilo" "hi"))
+  "xlp_ex2,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_mflo" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "mfhilo")
+       (eq_attr "hilo" "lo"))
+  "xlp_ex2,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_mthi" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "mthilo")
+       (eq_attr "hilo" "hi"))
+  "xlp_ex2,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_mtlo" 3
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "mthilo")
+       (eq_attr "hilo" "lo"))
+  "xlp_ex2,nothing*2,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_fp2" 2
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "fmove,fneg,fabs,condmove"))
+  "xlp_fp,nothing,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fp3" 3
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "fcmp"))
+  "xlp_fp,nothing*2,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fp4" 4
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "fcvt"))
+  "xlp_fp,nothing*3,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fp5" 5
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "SF")
+       (eq_attr "type" "fadd,fmul"))
+  "xlp_fp,nothing*4,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fp6" 6
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "DF")
+       (eq_attr "type" "fadd,fmul"))
+  "xlp_fp,nothing*5,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fp9" 9
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "SF")
+       (eq_attr "type" "fmadd"))
+  "xlp_fp,nothing*3,xlp_fp,nothing*3,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fp11" 11
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "DF")
+       (eq_attr "type" "fmadd"))
+  "xlp_fp,nothing*4,xlp_fp,nothing*4,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fpcomplex_s" 23
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "SF")
+       (eq_attr "type" "fdiv,frdiv,frdiv1,frdiv2,fsqrt,frsqrt,frsqrt1,frsqrt2"))
+  "xlp_fp+xlp_divsq,xlp_divsq*22,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fpcomplex_d" 38
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "DF")
+       (eq_attr "type" "fdiv,frdiv,frdiv1,frdiv2,fsqrt,frsqrt,frsqrt1,frsqrt2"))
+  "xlp_fp+xlp_divsq,xlp_divsq*37,xlp_fp_wrb")
+
+(define_bypass 3 "ir_xlp_mul" "ir_xlp_mfhi")
+
+(define_insn_reservation "ir_xlp_atomic" 15
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "atomic"))
+  "xlp_ex0|xlp_ex1")
+
+;; Sync loop consists of (in order)
+;; (1) optional sync,
+;; (2) LL instruction,
+;; (3) branch and 1-2 ALU instructions,
+;; (4) SC instruction,
+;; (5) optional sync,
+;; (6) branch and ALU instruction.
+;; The net result of this reservation is a big delay with flush of
+;; ALU pipeline and outgoing reservations discouraging use of EX3.
+(define_insn_reservation "ir_xlp_sync_loop" 40
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "syncloop"))
+  "(xlp_ex0+xlp_ex1+xlp_ex2+xlp_ex3)*39,xlp_ex3+(xlp_ex0|xlp_ex1|(xlp_ex2,xlp_ex2_wrb))")
-- 
1.7.4.1



* Re: [PATCH][MIPS] NetLogic XLP scheduling
  2012-07-13 10:11 [PATCH][MIPS] NetLogic XLP scheduling Chung-Lin Tang
@ 2012-07-15 16:28 ` Richard Sandiford
  2012-07-16  6:37   ` Chung-Lin Tang
From: Richard Sandiford @ 2012-07-15 16:28 UTC
  To: Chung-Lin Tang; +Cc: gcc-patches

Chung-Lin Tang <cltang@codesourcery.com> writes:
> This patch adds scheduling support for the NetLogic XLP, including a new
> pipeline description, and associated changes.
>
> Aside from the new xlp.md description file, there are also some sync
> primitive attribute modifications, for better scheduling of sync loops
> (Maxim should be able to better explain this).

Rather than add a "type" attribute to each sync loop, please just add:

	  (not (eq_attr "sync_mem" "none"))
	  (const_string "syncloop")

to the default value of the "type" attribute.  You'll probably need
to swap the order of the sync* attributes with the "type" attribute
in order for this to compile.
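
As a rough illustration of the effect, here is a small standalone C
model of how the "type" of an insn would then be derived.  It is only a
sketch: the struct and function names are hypothetical, and it leaves
out the other clauses of the real default (jal, got, move_type and so
on).  The one assumption carried over from above is that an insn with
no sync loop has "sync_mem" equal to "none".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the attributes a pattern sets explicitly.
   A null field means the pattern did not set that attribute.  */
struct sketch_insn
{
  const char *type;      /* explicit (set_attr "type" ...), or null */
  const char *sync_mem;  /* explicit (set_attr "sync_mem" ...), or null */
};

/* Compute "type": an explicit value wins; otherwise any insn that names
   a sync_mem operand is a sync loop; otherwise fall back to "unknown".  */
static const char *
sketch_type (const struct sketch_insn *insn)
{
  assert (insn != NULL);
  if (insn->type)
    return insn->type;
  if (insn->sync_mem && strcmp (insn->sync_mem, "none") != 0)
    return "syncloop";
  return "unknown";
}
```

With this default in place, the individual sync patterns would not need
their own "type" setting, while swap and ldadd could still set "atomic"
explicitly.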

The patch is effectively changing the type of the sync loops from
"unknown" to "syncloop".  That's certainly OK, but you'll need to
add "syncloop" to the "unknown" reservations of all other schedulers
(except for generic.md, where what you've done instead is fine).
It might be easier if you split out the addition of syncloop
as a separate patch.

> Other generic changes include a new "hilo" insn attribute, to mark which
> of HI/LO an m[ft]hilo insn accesses.

The way other schedulers handle this is with things like:

(define_insn_reservation "ir_sb1_mfhi" 1
  (and (eq_attr "cpu" "sb1,sb1a")
       (and (eq_attr "type" "mfhilo")
	    (not (match_operand 1 "lo_operand"))))
  "sb1_ex1")

which seems simpler.  mfhilo and mthilo are required to read operand 1
and write to operand 0 (respectively) in order to support this kind of
construct.

That said, even the above is a hold-over from when we tried to allow
high registers to store independent values.  These days we can be a bit
more precise, as with the patch below.  (As the comment says:

	 ;; If a doubleword move uses these expensive instructions,
	 ;; it is usually better to schedule them in the same way
	 ;; as the singleword form, rather than as "multi".

I'm continuing to assume that mflo and mtlo are the best type choices
for unsplit double-register moves.  That path should be very rarely
used outside of MIPS16 anyway -- just by sched1 if hi and lo are exposed
directly -- and no current scheduler tries to model a doubleword hi/lo
move separately from single-register ones.  The information is available
via the dword_mode attribute if required.)
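
If a scheduler ever did want that distinction, the test would amount to
something like the following standalone C sketch.  The enum and function
names are hypothetical and this is not code from the patch; it only
models combining the new mtlo/mflo types with the dword_mode attribute.

```c
#include <assert.h>

/* Hypothetical subset of the "type" values after the split.  */
enum sketch_type { SK_MTHI, SK_MTLO, SK_MFHI, SK_MFLO, SK_OTHER };

/* Return nonzero if an insn of TYPE is an unsplit doubleword hi/lo move.
   DWORD_MODE models the "dword_mode" attribute (1 for "yes", 0 for "no").
   Only mtlo and mflo qualify, since those are the types given to
   double-register moves; mthi and mfhi are single-register by definition.  */
static int
sketch_is_dword_hilo_move (enum sketch_type type, int dword_mode)
{
  assert (dword_mode == 0 || dword_mode == 1);
  return dword_mode && (type == SK_MTLO || type == SK_MFLO);
}
```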

Tested on mips64-elf, and by making sure that there were no changes in
-O2 output for a recent set of cc1 .ii files.  Applied.

I'm probably punishing you for being honest here, but the only other
thing is that you've listed NetLogic Microsystems Inc. as one of the
authors.  I think that means they'll need to sign a copyright assignment.
Have they already done that?

Thanks,
Richard


gcc/
	* config/mips/mips.md (move_type): Replace mfhilo and mthilo
	with mflo and mtlo.
	(type): Split mfhilo into mfhi and mflo.  Split mthilo into mthi
	and mtlo.  Adjust move_type->type mapping.
	(may_clobber_hilo): Split mthilo into mthi and mtlo.
	(*movdi_32bit, *movdi_32bit_mips16, *movdi_64bit, *movdi_64bit_mips16)
	(*mov<mode>_internal, *mov<mode>_mips16, *movhi_internal)
	(*movhi_mips16, *movqi_internal, *movqi_mips16): Use mtlo and mflo
	instead of mthilo and mfhilo.
	(mfhi<GPR:mode>_<HILO:mode>): Use mfhi instead of mfhilo.
	(mthi<GPR:mode>_<HILO:mode>): Use mthi instead of mthilo.
	* config/mips/mips-dsp.md (mips_extr_w, mips_extr_r_w, mips_extr_rs_w)
	(mips_extr_s_h, mips_extp, mips_extpdp, mips_shilo, mips_mthlip):
	Use mflo instead of mfhilo.
	* config/mips/10000.md (r10k_arith): Split mthilo.
	(r10k_mfhi, r10k_mflo): Use mfhi and mflo directly.
	* config/mips/sb1.md (ir_sb1_mfhi, ir_sb1_mflo): Likewise.
	(ir_sb1_mthilo): Split mthilo into mthi and mtlo.
	* config/mips/20kc.md (r20kc_imthilo, r20kc_imfhilo): Split
	mthilo and mfhilo.
	* config/mips/24k.md (r24k_int_mfhilo, r24k_int_mthilo): Likewise.
	* config/mips/4130.md (vr4130_class, vr4130_mfhilo, vr4130_mthilo):
	Likewise.
	* config/mips/4k.md (r4k_int_mthilo, r4k_int_mfhilo): Likewise.
	* config/mips/5400.md (ir_vr54_hilo): Likewise.
	* config/mips/5500.md (ir_vr55_mthilo, ir_vr55_mfhilo): Likewise.
	* config/mips/5k.md (r5k_int_mthilo, r5k_int_mfhilo): Likewise.
	* config/mips/7000.md (rm7_mthilo, rm7_mfhilo): Likewise.
	* config/mips/74k.md (r74k_int_mfhilo, r74k_int_mthilo): Likewise.
	* config/mips/9000.md (rm9k_mfhilo, rm9k_mthilo): Likewise.
	* config/mips/generic.md (generic_hilo): Likewise.
	* config/mips/loongson2ef.md (ls2_alu): Likewise.
	* config/mips/loongson3a.md (ls3a_mfhilo): Likewise.
	* config/mips/octeon.md (octeon_imul_o1, octeon_imul_o2)
	(octeon_mfhilo_o1, octeon_mfhilo_o2): Likewise.
	* config/mips/sr71k.md (ir_sr70_hilo): Likewise.
	* config/mips/xlr.md (xlr_hilo): Likewise.

Index: gcc/config/mips/mips.md
===================================================================
--- gcc/config/mips/mips.md	2012-06-23 08:30:36.000000000 +0100
+++ gcc/config/mips/mips.md	2012-07-14 13:26:35.795953795 +0100
@@ -201,7 +201,7 @@ (define_attr "jal_macro" "no,yes"
 ;; the split instructions; in some cases, it is more appropriate for the
 ;; scheduling type to be "multi" instead.
 (define_attr "move_type"
-  "unknown,load,fpload,store,fpstore,mtc,mfc,mthilo,mfhilo,move,fmove,
+  "unknown,load,fpload,store,fpstore,mtc,mfc,mtlo,mflo,move,fmove,
    const,constN,signext,ext_ins,logical,arith,sll0,andi,loadpool,
    shift_shift,lui_movf"
   (const_string "unknown"))
@@ -239,8 +239,10 @@ (define_attr "dword_mode" "no,yes"
 ;; condmove	conditional moves
 ;; mtc		transfer to coprocessor
 ;; mfc		transfer from coprocessor
-;; mthilo	transfer to hi/lo registers
-;; mfhilo	transfer from hi/lo registers
+;; mthi		transfer to a hi register
+;; mtlo		transfer to a lo register
+;; mfhi		transfer from a hi register
+;; mflo		transfer from a lo register
 ;; const	load constant
 ;; arith	integer arithmetic instructions
 ;; logical      integer logical instructions
@@ -278,7 +280,7 @@ (define_attr "dword_mode" "no,yes"
 ;; ghost	an instruction that produces no real code
 (define_attr "type"
   "unknown,branch,jump,call,load,fpload,fpidxload,store,fpstore,fpidxstore,
-   prefetch,prefetchx,condmove,mtc,mfc,mthilo,mfhilo,const,arith,logical,
+   prefetch,prefetchx,condmove,mtc,mfc,mthi,mtlo,mfhi,mflo,const,arith,logical,
    shift,slt,signext,clz,pop,trap,imul,imul3,imul3nc,imadd,idiv,idiv3,move,
    fmove,fadd,fmul,fmadd,fdiv,frdiv,frdiv1,frdiv2,fabs,fneg,fcmp,fcvt,fsqrt,
    frsqrt,frsqrt1,frsqrt2,multi,nop,ghost"
@@ -298,8 +300,8 @@ (define_attr "type"
 	 (eq_attr "move_type" "fpstore") (const_string "fpstore")
 	 (eq_attr "move_type" "mtc") (const_string "mtc")
 	 (eq_attr "move_type" "mfc") (const_string "mfc")
-	 (eq_attr "move_type" "mthilo") (const_string "mthilo")
-	 (eq_attr "move_type" "mfhilo") (const_string "mfhilo")
+	 (eq_attr "move_type" "mtlo") (const_string "mtlo")
+	 (eq_attr "move_type" "mflo") (const_string "mflo")
 
 	 ;; These types of move are always single insns.
 	 (eq_attr "move_type" "fmove") (const_string "fmove")
@@ -475,7 +477,7 @@ (define_attr "length" ""
 
 	  ;; Check for doubleword moves that are decomposed into two
 	  ;; instructions.
-	  (and (eq_attr "move_type" "mtc,mfc,mthilo,mfhilo,move")
+	  (and (eq_attr "move_type" "mtc,mfc,mtlo,mflo,move")
 	       (eq_attr "dword_mode" "yes"))
 	  (const_int 8)
 
@@ -557,7 +559,7 @@ (define_attr "hazard" "none,delay,hilo"
 	      (match_test "TARGET_FIX_R4000"))
 	 (const_string "hilo")
 
-	 (and (eq_attr "type" "mfhilo")
+	 (and (eq_attr "type" "mfhi,mflo")
 	      (not (match_test "ISA_HAS_HILO_INTERLOCKS")))
 	 (const_string "hilo")]
 	(const_string "none")))
@@ -585,7 +587,7 @@ (define_attr "branch_likely" "no,yes"
 ;; True if an instruction might assign to hi or lo when reloaded.
 ;; This is used by the TUNE_MACC_CHAINS code.
 (define_attr "may_clobber_hilo" "no,yes"
-  (if_then_else (eq_attr "type" "imul,imul3,imadd,idiv,mthilo")
+  (if_then_else (eq_attr "type" "imul,imul3,imadd,idiv,mthi,mtlo")
 		(const_string "yes")
 		(const_string "no")))
 
@@ -4115,7 +4117,7 @@ (define_insn "*movdi_32bit"
    && (register_operand (operands[0], DImode)
        || reg_or_0_operand (operands[1], DImode))"
   { return mips_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "move,const,load,store,mthilo,mfhilo,mtc,fpload,mfc,fpstore,mtc,fpload,mfc,fpstore")
+  [(set_attr "move_type" "move,const,load,store,mtlo,mflo,mtc,fpload,mfc,fpstore,mtc,fpload,mfc,fpstore")
    (set_attr "mode" "DI")])
 
 (define_insn "*movdi_32bit_mips16"
@@ -4125,7 +4127,7 @@ (define_insn "*movdi_32bit_mips16"
    && (register_operand (operands[0], DImode)
        || register_operand (operands[1], DImode))"
   { return mips_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "move,move,move,const,constN,load,store,mfhilo")
+  [(set_attr "move_type" "move,move,move,const,constN,load,store,mflo")
    (set_attr "mode" "DI")])
 
 (define_insn "*movdi_64bit"
@@ -4135,7 +4137,7 @@ (define_insn "*movdi_64bit"
    && (register_operand (operands[0], DImode)
        || reg_or_0_operand (operands[1], DImode))"
   { return mips_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "move,const,const,load,store,mtc,fpload,mfc,fpstore,mthilo,mfhilo,mtc,fpload,mfc,fpstore")
+  [(set_attr "move_type" "move,const,const,load,store,mtc,fpload,mfc,fpstore,mtlo,mflo,mtc,fpload,mfc,fpstore")
    (set_attr "mode" "DI")])
 
 (define_insn "*movdi_64bit_mips16"
@@ -4145,7 +4147,7 @@ (define_insn "*movdi_64bit_mips16"
    && (register_operand (operands[0], DImode)
        || register_operand (operands[1], DImode))"
   { return mips_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "move,move,move,const,constN,const,loadpool,load,store,mfhilo")
+  [(set_attr "move_type" "move,move,move,const,constN,const,loadpool,load,store,mflo")
    (set_attr "mode" "DI")])
 
 ;; On the mips16, we can split ld $r,N($r) into an add and a load,
@@ -4213,7 +4215,7 @@ (define_insn "*mov<mode>_internal"
    && (register_operand (operands[0], <MODE>mode)
        || reg_or_0_operand (operands[1], <MODE>mode))"
   { return mips_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "move,const,const,load,store,mtc,fpload,mfc,fpstore,mfc,mtc,mthilo,mfhilo,mtc,fpload,mfc,fpstore")
+  [(set_attr "move_type" "move,const,const,load,store,mtc,fpload,mfc,fpstore,mfc,mtc,mtlo,mflo,mtc,fpload,mfc,fpstore")
    (set_attr "mode" "SI")])
 
 (define_insn "*mov<mode>_mips16"
@@ -4223,7 +4225,7 @@ (define_insn "*mov<mode>_mips16"
    && (register_operand (operands[0], <MODE>mode)
        || register_operand (operands[1], <MODE>mode))"
   { return mips_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "move,move,move,const,constN,const,loadpool,load,store,mfhilo")
+  [(set_attr "move_type" "move,move,move,const,constN,const,loadpool,load,store,mflo")
    (set_attr "mode" "SI")])
 
 ;; On the mips16, we can split lw $r,N($r) into an add and a load,
@@ -4400,7 +4402,7 @@ (define_insn "*movhi_internal"
    && (register_operand (operands[0], HImode)
        || reg_or_0_operand (operands[1], HImode))"
   { return mips_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "move,const,load,store,mthilo,mfhilo")
+  [(set_attr "move_type" "move,const,load,store,mtlo,mflo")
    (set_attr "mode" "HI")])
 
 (define_insn "*movhi_mips16"
@@ -4410,7 +4412,7 @@ (define_insn "*movhi_mips16"
    && (register_operand (operands[0], HImode)
        || register_operand (operands[1], HImode))"
   { return mips_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "move,move,move,const,constN,load,store,mfhilo")
+  [(set_attr "move_type" "move,move,move,const,constN,load,store,mflo")
    (set_attr "mode" "HI")])
 
 ;; On the mips16, we can split lh $r,N($r) into an add and a load,
@@ -4475,7 +4477,7 @@ (define_insn "*movqi_internal"
    && (register_operand (operands[0], QImode)
        || reg_or_0_operand (operands[1], QImode))"
   { return mips_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "move,const,load,store,mthilo,mfhilo")
+  [(set_attr "move_type" "move,const,load,store,mtlo,mflo")
    (set_attr "mode" "QI")])
 
 (define_insn "*movqi_mips16"
@@ -4485,7 +4487,7 @@ (define_insn "*movqi_mips16"
    && (register_operand (operands[0], QImode)
        || register_operand (operands[1], QImode))"
   { return mips_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "move,move,move,const,constN,load,store,mfhilo")
+  [(set_attr "move_type" "move,move,move,const,constN,load,store,mflo")
    (set_attr "mode" "QI")])
 
 ;; On the mips16, we can split lb $r,N($r) into an add and a load,
@@ -4616,7 +4618,7 @@ (define_insn "*movti"
    && (register_operand (operands[0], TImode)
        || reg_or_0_operand (operands[1], TImode))"
   "#"
-  [(set_attr "move_type" "move,const,load,store,mthilo,mfhilo")
+  [(set_attr "move_type" "move,const,load,store,mtlo,mflo")
    (set_attr "mode" "TI")])
 
 (define_insn "*movti_mips16"
@@ -4627,7 +4629,7 @@ (define_insn "*movti_mips16"
    && (register_operand (operands[0], TImode)
        || register_operand (operands[1], TImode))"
   "#"
-  [(set_attr "move_type" "move,move,move,const,constN,load,store,mfhilo")
+  [(set_attr "move_type" "move,move,move,const,constN,load,store,mflo")
    (set_attr "mode" "TI")])
 
 ;; 128-bit floating point moves
@@ -4734,7 +4736,7 @@ (define_insn "mfhi<GPR:mode>_<HILO:mode>
 		    UNSPEC_MFHI))]
   ""
   { return ISA_HAS_MACCHI ? "<GPR:d>macchi\t%0,%.,%." : "mfhi\t%0"; }
-  [(set_attr "move_type" "mfhilo")
+  [(set_attr "type" "mfhi")
    (set_attr "mode" "<GPR:MODE>")])
 
 ;; Set the high part of a HI/LO value, given that the low part has
@@ -4747,7 +4749,7 @@ (define_insn "mthi<GPR:mode>_<HILO:mode>
 		     UNSPEC_MTHI))]
   ""
   "mthi\t%z1"
-  [(set_attr "move_type" "mthilo")
+  [(set_attr "type" "mthi")
    (set_attr "mode" "SI")])
 
 ;; Emit a doubleword move in which exactly one of the operands is
Index: gcc/config/mips/mips-dsp.md
===================================================================
--- gcc/config/mips/mips-dsp.md	2012-01-04 19:04:48.000000000 +0000
+++ gcc/config/mips/mips-dsp.md	2012-07-14 09:55:30.923984782 +0100
@@ -909,7 +909,7 @@ (define_insn "mips_extr_w"
     }
   return "extrv.w\t%0,%q1,%2";
 }
-  [(set_attr "type"	"mfhilo")
+  [(set_attr "type"	"mflo")
    (set_attr "mode"	"SI")])
 
 (define_insn "mips_extr_r_w"
@@ -930,7 +930,7 @@ (define_insn "mips_extr_r_w"
     }
   return "extrv_r.w\t%0,%q1,%2";
 }
-  [(set_attr "type"	"mfhilo")
+  [(set_attr "type"	"mflo")
    (set_attr "mode"	"SI")])
 
 (define_insn "mips_extr_rs_w"
@@ -951,7 +951,7 @@ (define_insn "mips_extr_rs_w"
     }
   return "extrv_rs.w\t%0,%q1,%2";
 }
-  [(set_attr "type"	"mfhilo")
+  [(set_attr "type"	"mflo")
    (set_attr "mode"	"SI")])
 
 ;; EXTR*_S.H
@@ -973,7 +973,7 @@ (define_insn "mips_extr_s_h"
     }
   return "extrv_s.h\t%0,%q1,%2";
 }
-  [(set_attr "type"	"mfhilo")
+  [(set_attr "type"	"mflo")
    (set_attr "mode"	"SI")])
 
 ;; EXTP*
@@ -996,7 +996,7 @@ (define_insn "mips_extp"
     }
   return "extpv\t%0,%q1,%2";
 }
-  [(set_attr "type"	"mfhilo")
+  [(set_attr "type"	"mflo")
    (set_attr "mode"	"SI")])
 
 (define_insn "mips_extpdp"
@@ -1021,7 +1021,7 @@ (define_insn "mips_extpdp"
     }
   return "extpdpv\t%0,%q1,%2";
 }
-  [(set_attr "type"	"mfhilo")
+  [(set_attr "type"	"mflo")
    (set_attr "mode"	"SI")])
 
 ;; SHILO*
@@ -1040,7 +1040,7 @@ (define_insn "mips_shilo"
     }
   return "shilov\t%q0,%2";
 }
-  [(set_attr "type"	"mfhilo")
+  [(set_attr "type"	"mflo")
    (set_attr "mode"	"SI")])
 
 ;; MTHLIP*
@@ -1056,7 +1056,7 @@ (define_insn "mips_mthlip"
 			 (reg:CCDSP CCDSP_PO_REGNUM)] UNSPEC_MTHLIP))])]
   "ISA_HAS_DSP && !TARGET_64BIT"
   "mthlip\t%2,%q0"
-  [(set_attr "type"	"mfhilo")
+  [(set_attr "type"	"mflo")
    (set_attr "mode"	"SI")])
 
 ;; WRDSP
Index: gcc/config/mips/10000.md
===================================================================
--- gcc/config/mips/10000.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/10000.md	2012-07-14 09:03:32.058992414 +0100
@@ -68,21 +68,19 @@ (define_insn_reservation "r10k_fpload" 3
 ;; Miscellaneous arith goes here too (this is a guess).
 (define_insn_reservation "r10k_arith" 1
   (and (eq_attr "cpu" "r10000")
-       (eq_attr "type" "arith,mthilo,slt,clz,const,nop,trap,logical"))
+       (eq_attr "type" "arith,mthi,mtlo,slt,clz,const,nop,trap,logical"))
   "r10k_alu1 | r10k_alu2")
 
 ;; We treat mfhilo differently, because we need to know when
 ;; it's HI and when it's LO.
 (define_insn_reservation "r10k_mfhi" 1
   (and (eq_attr "cpu" "r10000")
-       (and (eq_attr "type" "mfhilo")
-            (not (match_operand 1 "lo_operand"))))
+       (eq_attr "type" "mfhi"))
   "r10k_alu1 | r10k_alu2")
 
 (define_insn_reservation "r10k_mflo" 1
   (and (eq_attr "cpu" "r10000")
-       (and (eq_attr "type" "mfhilo")
-            (match_operand 1 "lo_operand")))
+       (eq_attr "type" "mflo"))
   "r10k_alu1 | r10k_alu2")
 
 
Index: gcc/config/mips/sb1.md
===================================================================
--- gcc/config/mips/sb1.md	2011-09-11 18:19:40.000000000 +0100
+++ gcc/config/mips/sb1.md	2012-07-14 09:59:09.174984249 +0100
@@ -295,21 +295,19 @@ (define_bypass 5
 
 (define_insn_reservation "ir_sb1_mfhi" 1
   (and (eq_attr "cpu" "sb1,sb1a")
-       (and (eq_attr "type" "mfhilo")
-	    (not (match_operand 1 "lo_operand"))))
+       (eq_attr "type" "mfhi"))
   "sb1_ex1")
 
 (define_insn_reservation "ir_sb1_mflo" 1
   (and (eq_attr "cpu" "sb1,sb1a")
-       (and (eq_attr "type" "mfhilo")
-	    (match_operand 1 "lo_operand")))
+       (eq_attr "type" "mflo"))
   "sb1_ex1")
 
 ;; mt{hi,lo} to mul/div is 4 cycles.
 
 (define_insn_reservation "ir_sb1_mthilo" 4
   (and (eq_attr "cpu" "sb1,sb1a")
-       (eq_attr "type" "mthilo"))
+       (eq_attr "type" "mthi,mtlo"))
   "sb1_ex1")
 
 ;; mt{hi,lo} to mf{hi,lo} is 3 cycles.
Index: gcc/config/mips/20kc.md
===================================================================
--- gcc/config/mips/20kc.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/20kc.md	2012-07-14 09:00:05.818992918 +0100
@@ -195,12 +195,12 @@ (define_insn_reservation "r20kc_impy_di"
 ;; cycle latency.  Repeat rate is 3 for both.
 (define_insn_reservation "r20kc_imthilo" 3 
 			 (and (eq_attr "cpu" "20kc")
-			      (eq_attr "type" "mthilo"))
+			      (eq_attr "type" "mthi,mtlo"))
 			 "r20kc_impydiv+(r20kc_impydiv_iter*3)")
 
 (define_insn_reservation "r20kc_imfhilo" 1
 			 (and (eq_attr "cpu" "20kc")
-			      (eq_attr "type" "mfhilo"))
+			      (eq_attr "type" "mfhi,mflo"))
 			 "r20kc_impydiv+(r20kc_impydiv_iter*3)")
 
 ;; Move to fp coprocessor.
Index: gcc/config/mips/24k.md
===================================================================
--- gcc/config/mips/24k.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/24k.md	2012-07-14 09:00:05.822992916 +0100
@@ -94,13 +94,13 @@ (define_insn_reservation "r24k_int_mul3"
 ;; mfhi, mflo, mflhxu - deliver result to gpr in 5 cycles
 (define_insn_reservation "r24k_int_mfhilo" 5
   (and (eq_attr "cpu" "24kc,24kf2_1,24kf1_1")
-       (eq_attr "type" "mfhilo"))
+       (eq_attr "type" "mfhi,mflo"))
   "r24k_iss+(r24k_mul3a|r24k_mul3b|r24k_mul3c)")
 
 ;; mthi, mtlo, mtlhx - deliver result to hi/lo, thence madd, handled as bypass
 (define_insn_reservation "r24k_int_mthilo" 1
   (and (eq_attr "cpu" "24kc,24kf2_1,24kf1_1")
-       (eq_attr "type" "mthilo"))
+       (eq_attr "type" "mthi,mtlo"))
   "r24k_iss+(r24k_mul3a|r24k_mul3b|r24k_mul3c)")
 
 ;; div - default to 36 cycles for 32bit operands.  Faster for 24bit, 16bit and
Index: gcc/config/mips/4130.md
===================================================================
--- gcc/config/mips/4130.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/4130.md	2012-07-14 09:00:05.824992918 +0100
@@ -72,7 +72,7 @@ (define_attr "vr4130_class" "mul,mem,alu
   (cond [(eq_attr "type" "load,store")
 	 (const_string "mem")
 
-	 (eq_attr "type" "mfhilo,mthilo,imul,imul3,imadd,idiv")
+	 (eq_attr "type" "mfhi,mflo,mthi,mtlo,imul,imul3,imadd,idiv")
 	 (const_string "mul")]
 	(const_string "alu")))
 
@@ -98,12 +98,12 @@ (define_insn_reservation "vr4130_store"
 
 (define_insn_reservation "vr4130_mfhilo" 3
   (and (eq_attr "cpu" "r4130")
-       (eq_attr "type" "mfhilo"))
+       (eq_attr "type" "mfhi,mflo"))
   "vr4130_muldiv")
 
 (define_insn_reservation "vr4130_mthilo" 1
   (and (eq_attr "cpu" "r4130")
-       (eq_attr "type" "mthilo"))
+       (eq_attr "type" "mthi,mtlo"))
   "vr4130_muldiv")
 
 ;; The product is available in LO & HI after one cycle.  Moving the result
Index: gcc/config/mips/4k.md
===================================================================
--- gcc/config/mips/4k.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/4k.md	2012-07-14 09:00:05.825992918 +0100
@@ -114,13 +114,13 @@ (define_insn_reservation "r4k_madd_4kp"
 ;; Move to HI/LO -> MADD/MSUB,MFHI/MFLO has a 1 cycle latency.
 (define_insn_reservation "r4k_int_mthilo" 1
   (and (eq_attr "cpu" "4kc,4kp")
-       (eq_attr "type" "mthilo"))
+       (eq_attr "type" "mthi,mtlo"))
   "r4k_ixu_arith+r4k_ixu_mpydiv")
 
 ;; Move from HI/LO -> integer operation has a 2 cycle latency.
 (define_insn_reservation "r4k_int_mfhilo" 2
   (and (eq_attr "cpu" "4kc,4kp")
-       (eq_attr "type" "mfhilo"))
+       (eq_attr "type" "mfhi,mflo"))
   "r4k_ixu_arith+r4k_ixu_mpydiv")
 
 ;; All other integer insns.
Index: gcc/config/mips/5400.md
===================================================================
--- gcc/config/mips/5400.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/5400.md	2012-07-14 09:00:05.826992918 +0100
@@ -73,7 +73,7 @@ (define_insn_reservation "ir_vr54_xfer"
 
 (define_insn_reservation "ir_vr54_hilo" 1
   (and (eq_attr "cpu" "r5400")
-       (eq_attr "type" "mthilo,mfhilo"))
+       (eq_attr "type" "mthi,mtlo,mfhi,mflo"))
   "vr54_dp0|vr54_dp1")
 
 (define_insn_reservation "ir_vr54_arith" 1
Index: gcc/config/mips/5500.md
===================================================================
--- gcc/config/mips/5500.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/5500.md	2012-07-14 09:00:05.826992918 +0100
@@ -84,12 +84,12 @@ (define_bypass 2
 
 (define_insn_reservation "ir_vr55_mthilo" 1
   (and (eq_attr "cpu" "r5500")
-       (eq_attr "type" "mthilo"))
+       (eq_attr "type" "mthi,mtlo"))
   "vr55_mac")
 
 (define_insn_reservation "ir_vr55_mfhilo" 5
   (and (eq_attr "cpu" "r5500")
-       (eq_attr "type" "mfhilo"))
+       (eq_attr "type" "mfhi,mflo"))
   "vr55_mac")
 
 ;; The default latency is for the GPR result of a mul.  Bypasses handle the
Index: gcc/config/mips/5k.md
===================================================================
--- gcc/config/mips/5k.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/5k.md	2012-07-14 09:00:05.827992918 +0100
@@ -88,13 +88,13 @@ (define_insn_reservation "r5k_int_mul" 4
 ;; Move to HI/LO -> MADD/MSUB,MFHI/MFLO has a 1 cycle latency.
 (define_insn_reservation "r5k_int_mthilo" 1
   (and (eq_attr "cpu" "5kc,5kf")
-       (eq_attr "type" "mthilo"))
+       (eq_attr "type" "mthi,mtlo"))
   "r5k_ixu_arith+r5k_ixu_mpydiv")
 
 ;; Move from HI/LO -> integer operation has a 2 cycle latency.
 (define_insn_reservation "r5k_int_mfhilo" 2
   (and (eq_attr "cpu" "5kc,5kf")
-       (eq_attr "type" "mfhilo"))
+       (eq_attr "type" "mfhi,mflo"))
   "r5k_ixu_arith+r5k_ixu_mpydiv")
 
 ;; All other integer insns.
Index: gcc/config/mips/7000.md
===================================================================
--- gcc/config/mips/7000.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/7000.md	2012-07-14 09:00:05.828992918 +0100
@@ -134,12 +134,12 @@ (define_insn_reservation "rm7_impy_di" 9
 ;; Move to/from HI/LO.
 (define_insn_reservation "rm7_mthilo" 3
   (and (eq_attr "cpu" "r7000")
-       (eq_attr "type" "mthilo"))
+       (eq_attr "type" "mthi,mtlo"))
   "rm7_impydiv")
 
 (define_insn_reservation "rm7_mfhilo" 1
   (and (eq_attr "cpu" "r7000")
-       (eq_attr "type" "mfhilo"))
+       (eq_attr "type" "mfhi,mflo"))
   "rm7_impydiv")
 
 ;; Move to/from fp coprocessor.
Index: gcc/config/mips/74k.md
===================================================================
--- gcc/config/mips/74k.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/74k.md	2012-07-14 09:00:05.829992918 +0100
@@ -80,13 +80,13 @@ (define_insn_reservation "r74k_int_mul3"
 ;; mfhi, mflo, mflhxu - deliver result to gpr in 7 cycles
 (define_insn_reservation "r74k_int_mfhilo" 7
   (and (eq_attr "cpu" "74kc,74kf2_1,74kf1_1,74kf3_2")
-       (eq_attr "type" "mfhilo"))
+       (eq_attr "type" "mfhi,mflo"))
   "r74k_alu+r74k_mul")
 
 ;; mthi, mtlo, mtlhx - deliver result to hi/lo, thence madd, handled as bypass
 (define_insn_reservation "r74k_int_mthilo" 7
   (and (eq_attr "cpu" "74kc,74kf2_1,74kf1_1,74kf3_2")
-       (eq_attr "type" "mthilo"))
+       (eq_attr "type" "mthi,mtlo"))
   "r74k_alu+r74k_mul")
 
 ;; div - default to 50 cycles for 32bit operands.  Faster for 8 bit,
Index: gcc/config/mips/9000.md
===================================================================
--- gcc/config/mips/9000.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/9000.md	2012-07-14 09:00:05.829992918 +0100
@@ -87,12 +87,12 @@ (define_insn_reservation "rm9k_divdi" 70
 
 (define_insn_reservation "rm9k_mfhilo" 1
   (and (eq_attr "cpu" "r9000")
-       (eq_attr "type" "mfhilo"))
+       (eq_attr "type" "mfhi,mflo"))
   "rm9k_f_int")
 
 (define_insn_reservation "rm9k_mthilo" 5
   (and (eq_attr "cpu" "r9000")
-       (eq_attr "type" "mthilo"))
+       (eq_attr "type" "mthi,mtlo"))
   "rm9k_f_int")
 
 (define_insn_reservation "rm9k_xfer" 2
Index: gcc/config/mips/generic.md
===================================================================
--- gcc/config/mips/generic.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/generic.md	2012-07-14 09:00:05.832992918 +0100
@@ -43,7 +43,7 @@ (define_insn_reservation "generic_branch
   "alu")
 
 (define_insn_reservation "generic_hilo" 1
-  (eq_attr "type" "mfhilo,mthilo")
+  (eq_attr "type" "mfhi,mflo,mthi,mtlo")
   "imuldiv*3")
 
 (define_insn_reservation "generic_imul" 17
Index: gcc/config/mips/loongson2ef.md
===================================================================
--- gcc/config/mips/loongson2ef.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/loongson2ef.md	2012-07-14 10:00:13.085984090 +0100
@@ -154,8 +154,8 @@ (define_query_cpu_unit "ls2_mem" "ls2_me
 ;; Reservation for integer instructions.
 (define_insn_reservation "ls2_alu" 2
   (and (eq_attr "cpu" "loongson_2e,loongson_2f")
-       (eq_attr "type" "arith,condmove,const,logical,mfhilo,move,
-                        mthilo,nop,shift,signext,slt"))
+       (eq_attr "type" "arith,condmove,const,logical,mfhi,mflo,move,
+                        mthi,mtlo,nop,shift,signext,slt"))
   "ls2_alu")
 
 ;; Reservation for branch instructions.
Index: gcc/config/mips/loongson3a.md
===================================================================
--- gcc/config/mips/loongson3a.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/loongson3a.md	2012-07-14 09:00:05.834992918 +0100
@@ -53,7 +53,7 @@ (define_insn_reservation "ls3a_branch" 1
 
 (define_insn_reservation "ls3a_mfhilo" 1
   (and (eq_attr "cpu" "loongson_3a")
-       (eq_attr "type" "mfhilo,mthilo"))
+       (eq_attr "type" "mfhi,mflo,mthi,mtlo"))
   "ls3a_alu2")
 
 ;; Operation imul3nc is fully pipelined.
Index: gcc/config/mips/octeon.md
===================================================================
--- gcc/config/mips/octeon.md	2011-12-14 19:37:20.000000000 +0000
+++ gcc/config/mips/octeon.md	2012-07-14 09:00:05.903992918 +0100
@@ -83,22 +83,22 @@ (define_insn_reservation "octeon_imul3_o
 
 (define_insn_reservation "octeon_imul_o1" 2
   (and (eq_attr "cpu" "octeon")
-       (eq_attr "type" "imul,mthilo"))
+       (eq_attr "type" "imul,mthi,mtlo"))
   "(octeon_pipe0 | octeon_pipe1) + octeon_mult, octeon_mult")
 
 (define_insn_reservation "octeon_imul_o2" 1
   (and (eq_attr "cpu" "octeon2")
-       (eq_attr "type" "imul,mthilo"))
+       (eq_attr "type" "imul,mthi,mtlo"))
   "octeon_pipe1 + octeon_mult")
 
 (define_insn_reservation "octeon_mfhilo_o1" 5
   (and (eq_attr "cpu" "octeon")
-       (eq_attr "type" "mfhilo"))
+       (eq_attr "type" "mfhi,mflo"))
   "(octeon_pipe0 | octeon_pipe1) + octeon_mult")
 
 (define_insn_reservation "octeon_mfhilo_o2" 6
   (and (eq_attr "cpu" "octeon2")
-       (eq_attr "type" "mfhilo"))
+       (eq_attr "type" "mfhi,mflo"))
   "octeon_pipe1 + octeon_mult")
 
 (define_insn_reservation "octeon_imadd_o1" 4
Index: gcc/config/mips/sr71k.md
===================================================================
--- gcc/config/mips/sr71k.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/sr71k.md	2012-07-14 09:00:05.907992918 +0100
@@ -201,7 +201,7 @@ (define_insn_reservation "ir_sr70_xfer_t
 
 (define_insn_reservation "ir_sr70_hilo" 1
   (and (eq_attr "cpu" "sr71000")
-       (eq_attr "type" "mthilo,mfhilo"))
+       (eq_attr "type" "mthi,mtlo,mfhi,mflo"))
   "ri_insns")
 
 (define_insn_reservation "ir_sr70_arith" 1
Index: gcc/config/mips/xlr.md
===================================================================
--- gcc/config/mips/xlr.md	2011-09-03 10:05:51.000000000 +0100
+++ gcc/config/mips/xlr.md	2012-07-14 09:00:05.913992918 +0100
@@ -85,5 +85,5 @@ (define_insn_reservation "ir_xlr_div" 68
 
 (define_insn_reservation "xlr_hilo" 2
   (and (eq_attr "cpu" "xlr") 
-       (eq_attr "type" "mfhilo,mthilo"))
+       (eq_attr "type" "mfhi,mflo,mthi,mtlo"))
   "xlr_imuldiv_nopipe")


* Re: [PATCH][MIPS] NetLogic XLP scheduling
  2012-07-15 16:28 ` Richard Sandiford
@ 2012-07-16  6:37   ` Chung-Lin Tang
  2012-07-16  6:57     ` Maxim Kuvyrkov
  2012-07-16  9:06     ` Richard Sandiford
  0 siblings, 2 replies; 10+ messages in thread
From: Chung-Lin Tang @ 2012-07-16  6:37 UTC (permalink / raw)
  To: gcc-patches, rdsandiford, Maxim Kuvyrkov

On 2012/7/16 12:28 AM, Richard Sandiford wrote:
> Chung-Lin Tang <cltang@codesourcery.com> writes:
>> This patch adds scheduling support for the NetLogic XLP, including a new
>> pipeline description, and associated changes.
>>
>> Aside from the new xlp.md description file, there are also some sync
>> primitive attribute modifications, for better scheduling of sync loops
>> (Maxim should be able to better explain this).
> 
> Rather than add a "type" attribute to each sync loop, please just add:
> 
> 	  (not (eq_attr "sync_mem" "none"))
> 	  (symbol_ref "syncloop")
> 
> to the default value of the "type" attribute.  You'll probably need
> to swap the order of the sync* attributes with the "type" attribute
> in order for this to compile.
> 
> The patch is effectively changing the type of the sync loops from
> "unknown" to "syncloop".  That's certainly OK, but you'll need to
> add "syncloop" to the "unknown" reservations of all other schedulers
> (except for generic.md, where what you've done instead is fine).
> It might be easier if you split out the addition of syncloop
> as a separate patch.

I'll leave it to Maxim to respond to the sync parts.
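
(For reference, a rough sketch of how Richard's suggestion could look.
This is an illustration only, not the committed change.  It assumes the
new arm is spelled with const_string, like the existing move_type arms
of the "type" attribute.  The cut-down value lists, "examplecpu" and
"example_unit" are made up for the example.)

```
;; Sketch only.  A real "type" attribute has many more values and arms.

;; The sync attribute probably has to be defined before "type" so that
;; the default value below can refer to it.
(define_attr "sync_mem" "none,0,1,2,3,4,5" (const_string "none"))

(define_attr "type" "unknown,multi,atomic,syncloop"
  (cond [;; Any insn that sets sync_mem is treated as a sync loop.
         (not (eq_attr "sync_mem" "none")) (const_string "syncloop")]
        (const_string "unknown")))

;; A scheduler with no dedicated model keeps handling sync loops the
;; same way it handles "unknown" insns.
(define_insn_reservation "example_unknown" 1
  (and (eq_attr "cpu" "examplecpu")
       (eq_attr "type" "unknown,atomic,syncloop"))
  "example_unit")
```

With the default in place, the individual sync loop patterns would not
need their own (set_attr "type" ...) entries.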

>> Other generic changes include a new "hilo" insn attribute, to mark which
>> of HI/LO an m[ft]hilo insn accesses.
> 
> The way other schedulers handle this is with things like:
> 
> (define_insn_reservation "ir_sb1_mfhi" 1
>   (and (eq_attr "cpu" "sb1,sb1a")
>        (and (eq_attr "type" "mfhilo")
> 	    (not (match_operand 1 "lo_operand"))))
>   "sb1_ex1")
> 
> which seems simpler.  mfhilo and mthilo are required to read operand 1
> and write to operand 0 (respectively) in order to support this kind of
> construct.
> 
> That said, even the above is a hold-over from when we tried to allow
> high registers to store independent values.  These days we can be a bit
> more precise, as with the patch below.  (As the comment says:
> 
> 	 ;; If a doubleword move uses these expensive instructions,
> 	 ;; it is usually better to schedule them in the same way
> 	 ;; as the singleword form, rather than as "multi".
> 
> I'm continuing to assume that mflo and mtlo are the best type choices
> for unsplit double-register moves.  That path should be very rarely used
> outside of MIPS16 anyway -- just by sched1 if hi and lo are exposed
> directly -- and no current scheduler tries to model a doubleword hi/lo
> move separately from single-register ones.  The information is available
> via the dword_mode attribute if required.)

I suppose this means that actual generation of moves as mfhi/mthi should
almost never happen under normal conditions?

> Tested on mips64-elf, and by making sure that there were no changes in
> -O2 output for a recent set of cc1 .ii files.  Applied.
> 
> I'm probably punishing you for being honest here, but the only other
> thing is that you've listed NetLogic Microsystems Inc. as one of the
> authors.  I think that means they'll need to sign a copyright assignment.
> Have they already done that?

They have assigned the copyright to Mentor Graphics, so it should mean
the code can be contributed by us.

Thanks,
Chung-Lin


* Re: [PATCH][MIPS] NetLogic XLP scheduling
  2012-07-16  6:37   ` Chung-Lin Tang
@ 2012-07-16  6:57     ` Maxim Kuvyrkov
  2012-07-16  9:08       ` Richard Sandiford
  2012-07-20  4:27       ` Maxim Kuvyrkov
  2012-07-16  9:06     ` Richard Sandiford
  1 sibling, 2 replies; 10+ messages in thread
From: Maxim Kuvyrkov @ 2012-07-16  6:57 UTC (permalink / raw)
  To: Chung-Lin Tang; +Cc: gcc-patches, rdsandiford

On 16/07/2012, at 6:37 PM, Chung-Lin Tang wrote:

> On 2012/7/16 12:28 AM, Richard Sandiford wrote:
>> Chung-Lin Tang <cltang@codesourcery.com> writes:
>>> This patch adds scheduling support for the NetLogic XLP, including a new
>>> pipeline description, and associated changes.
>>> 
>>> Aside from the new xlp.md description file, there are also some sync
>>> primitive attribute modifications, for better scheduling of sync loops
>>> (Maxim should be able to better explain this).
>> 
>> Rather than add a "type" attribute to each sync loop, please just add:
>> 
>> 	  (not (eq_attr "sync_mem" "none"))
>> 	  (symbol_ref "syncloop")
>> 
>> to the default value of the "type" attribute.  You'll probably need
>> to swap the order of the sync* attributes with the "type" attribute
>> in order for this to compile.
>> 
>> The patch is effectively changing the type of the sync loops from
>> "unknown" to "syncloop".  That's certainly OK, but you'll need to
>> add "syncloop" to the "unknown" reservations of all other schedulers
>> (except for generic.md, where what you've done instead is fine).
>> It might be easier if you split out the addition of syncloop
>> as a separate patch.
> 
> I'll leave it to Maxim to respond to the sync parts.

Richard, that's indeed simpler, thanks.

Chung-Lin, I'll try to make a patch for the patch in the next couple of days and will send it to you.  Let me know if you'd rather fix this yourself.

...

>> Tested on mips64-elf, and by making sure that there were no changes in
>> -O2 output for a recent set of cc1 .ii files.  Applied.
>> 
>> I'm probably punishing you for being honest here, but the only other
>> thing is that you've listed NetLogic Microsystems Inc. as one of the
>> authors.  I think that means they'll need to sign a copyright assignment.
>> Have they already done that?
> 
> They have assigned the copyright to Mentor Graphics, so it should mean
> the code can be contributed by us.

That is correct.  NetLogic developed the original xlp.md description, which Chung-Lin essentially rewrote.  In any case, Mentor has copyright assignment for the original xlp.md specifically so that we can contribute this upstream.

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics


* Re: [PATCH][MIPS] NetLogic XLP scheduling
  2012-07-16  6:37   ` Chung-Lin Tang
  2012-07-16  6:57     ` Maxim Kuvyrkov
@ 2012-07-16  9:06     ` Richard Sandiford
  1 sibling, 0 replies; 10+ messages in thread
From: Richard Sandiford @ 2012-07-16  9:06 UTC (permalink / raw)
  To: Chung-Lin Tang; +Cc: gcc-patches, Maxim Kuvyrkov

Chung-Lin Tang <cltang@codesourcery.com> writes:
>>> Other generic changes include a new "hilo" insn attribute, to mark which
>>> of HI/LO an m[ft]hilo insn accesses.
>> 
>> The way other schedulers handle this is with things like:
>> 
>> (define_insn_reservation "ir_sb1_mfhi" 1
>>   (and (eq_attr "cpu" "sb1,sb1a")
>>        (and (eq_attr "type" "mfhilo")
>> 	    (not (match_operand 1 "lo_operand"))))
>>   "sb1_ex1")
>> 
>> which seems simpler.  mfhilo and mthilo are required to read operand 1
>> and write to operand 0 (respectively) in order to support this kind of
>> construct.
>> 
>> That said, even the above is a hold-over from when we tried to allow
>> high registers to store independent values.  These days we can be a bit
>> more precise, as with the patch below.  (As the comment says:
>> 
>> 	 ;; If a doubleword move uses these expensive instructions,
>> 	 ;; it is usually better to schedule them in the same way
>> 	 ;; as the singleword form, rather than as "multi".
>> 
>> I'm continuing to assume that mflo and mtlo are the best type choices
>> for unsplit double-register moves.  That path should be very rarely used
>> outside of MIPS16 anyway -- just by sched1 if hi and lo are exposed
>> directly -- and no current scheduler tries to model a doubleword hi/lo
>> move separately from single-register ones.  The information is available
>> via the dword_mode attribute if required.)
>
> I suppose this means that actual generation of moves as mfhi/mthi should
> almost never happen under normal conditions?

Yeah, the move patterns themselves never generate mfhi or mthi in isolation.
They either generate mflo, mtlo, mfhi+mflo, or mthi+mtlo.  mfhi and mthi
have special patterns that are generated by post-reload splits.

Richard


* Re: [PATCH][MIPS] NetLogic XLP scheduling
  2012-07-16  6:57     ` Maxim Kuvyrkov
@ 2012-07-16  9:08       ` Richard Sandiford
  2012-07-20  4:27       ` Maxim Kuvyrkov
  1 sibling, 0 replies; 10+ messages in thread
From: Richard Sandiford @ 2012-07-16  9:08 UTC (permalink / raw)
  To: Maxim Kuvyrkov; +Cc: Chung-Lin Tang, gcc-patches

Maxim Kuvyrkov <maxim@codesourcery.com> writes:
> On 16/07/2012, at 6:37 PM, Chung-Lin Tang wrote:
>> On 2012/7/16 12:28 AM, Richard Sandiford wrote:
>>> Chung-Lin Tang <cltang@codesourcery.com> writes:
>>> Tested on mips64-elf, and by making sure that there were no changes in
>>> -O2 output for a recent set of cc1 .ii files.  Applied.
>>> 
>>> I'm probably punishing you for being honest here, but the only other
>>> thing is that you've listed NetLogic Microsystems Inc. as one of the
>>> authors.  I think that means they'll need to sign a copyright assignment.
>>> Have they already done that?
>> 
>> They have assigned the copyright to Mentor Graphics, so it should mean
>> the code can be contributed by us.
>
> That is correct.  NetLogic developed the original xlp.md description, which Chung-Lin essentially rewrote.  In any case, Mentor has copyright assignment for the original xlp.md specifically so that we can contribute this upstream.

Ah, excellent, thanks.

Richard


* Re: [PATCH][MIPS] NetLogic XLP scheduling
  2012-07-16  6:57     ` Maxim Kuvyrkov
  2012-07-16  9:08       ` Richard Sandiford
@ 2012-07-20  4:27       ` Maxim Kuvyrkov
  2012-07-20  7:06         ` Chung-Lin Tang
  2012-07-20  8:26         ` Richard Sandiford
  1 sibling, 2 replies; 10+ messages in thread
From: Maxim Kuvyrkov @ 2012-07-20  4:27 UTC (permalink / raw)
  To: Maxim Kuvyrkov; +Cc: Chung-Lin Tang, gcc-patches, rdsandiford

[-- Attachment #1: Type: text/plain, Size: 1640 bytes --]

On 16/07/2012, at 6:56 PM, Maxim Kuvyrkov wrote:

> On 16/07/2012, at 6:37 PM, Chung-Lin Tang wrote:
> 
>> On 2012/7/16 12:28 AM, Richard Sandiford wrote:
>>> Chung-Lin Tang <cltang@codesourcery.com> writes:
>>>> This patch adds scheduling support for the NetLogic XLP, including a new
>>>> pipeline description, and associated changes.
>>>> 
>>>> Aside from the new xlp.md description file, there are also some sync
>>>> primitive attribute modifications, for better scheduling of sync loops
>>>> (Maxim should be able to better explain this).
>>> 
>>> Rather than add a "type" attribute to each sync loop, please just add:
>>> 
>>> 	  (not (eq_attr "sync_mem" "none"))
>>> 	  (symbol_ref "syncloop")
>>> 
>>> to the default value of the "type" attribute.  You'll probably need
>>> to swap the order of the sync* attributes with the "type" attribute
>>> in order for this to compile.
>>> 
>>> The patch is effectively changing the type of the sync loops from
>>> "unknown" to "syncloop".  That's certainly OK, but you'll need to
>>> add "syncloop" to the "unknown" reservations of all other schedulers
>>> (except for generic.md, where what you've done instead is fine).
>>> It might be easier if you split out the addition of syncloop
>>> as a separate patch.
>> 
>> I'll leave it to Maxim to respond to the sync parts.
> 
> Richard, that's indeed simpler, thanks.

Attached is a stand-alone patch that adds handling of the "syncloop" and "atomic" values of the "type" attribute.

Tested by building a mips64-linux-gnu cross-toolchain, including GLIBC.  OK to apply?
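
For anyone reading along, the new arm at the end of the "type" default can be pictured roughly as below.  This is only an illustrative sketch in C, not code from the patch: sketch_default_type and its string arguments are hypothetical names, and it models just the last three arms of the cond plus the fallback.  It also suggests why, as Richard noted, the sync_* attributes need to come before "type": the default for "type" now reads "sync_mem".

```c
#include <string.h>

/* Hypothetical model of the tail of the "type" default in mips.md.
   Arms are tried in order and the first match wins; an insn that
   matches none of them stays "unknown".  */
const char *
sketch_default_type (const char *move_type, const char *sync_mem)
{
  if (strcmp (move_type, "move") == 0)
    return "move";
  if (strcmp (move_type, "const") == 0)
    return "const";
  /* New arm: any insn that names a sync_mem operand is a sync loop.  */
  if (strcmp (sync_mem, "none") != 0)
    return "syncloop";
  return "unknown";
}
```

With this model, an insn whose sync_mem is "1" and that matches no earlier arm comes out as "syncloop" rather than "unknown", which is what lets the per-CPU descriptions reserve for it explicitly.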

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics


[-- Attachment #2: 0001-Support-scheduling-of-sync-loops-and-atomic-instruct.patch --]
[-- Type: application/octet-stream, Size: 17699 bytes --]

From 95e7caa3273c603d1bc9a7b7d262c8ad88acfc96 Mon Sep 17 00:00:00 2001
From: Maxim Kuvyrkov <maxim@codesourcery.com>
Date: Thu, 19 Jul 2012 20:59:55 -0700
Subject: [PATCH 1/2] Support scheduling of sync loops and atomic instructions.

2012-07-13  Maxim Kuvyrkov  <maxim@codesourcery.com>

	* config/mips/mips.md (define_attr sync_*): Move before "type".
	(define_attr "type"): New values "atomic" and "syncloop".
	* config/mips/sync.md (atomic_exchange<mode>, atomic_fetch_add<mode>):
	Set "type" attribute.
	* config/mips/generic.md (generic_atomic, generic_sync_loop):
	New reservations.
	* gcc/config/mips/10000.md, gcc/config/mips/20kc.md,
	* gcc/config/mips/24k.md, gcc/config/mips/4130.md,
	* gcc/config/mips/4k.md, gcc/config/mips/5400.md,
	* gcc/config/mips/5500.md, gcc/config/mips/5k.md,
	* gcc/config/mips/7000.md, gcc/config/mips/74k.md,
	* gcc/config/mips/9000.md, gcc/config/mips/loongson2ef.md,
	* gcc/config/mips/loongson3a.md, gcc/config/mips/octeon.md,
	* gcc/config/mips/sb1.md, gcc/config/mips/sr71k.md,
	* gcc/config/mips/xlr.md: Handle "atomic" and "syncloop" types.
---
 gcc/config/mips/10000.md       |    2 +-
 gcc/config/mips/20kc.md        |    2 +-
 gcc/config/mips/24k.md         |    2 +-
 gcc/config/mips/4130.md        |    2 +-
 gcc/config/mips/4k.md          |    2 +-
 gcc/config/mips/5400.md        |    2 +-
 gcc/config/mips/5500.md        |    2 +-
 gcc/config/mips/5k.md          |    2 +-
 gcc/config/mips/7000.md        |    2 +-
 gcc/config/mips/74k.md         |    2 +-
 gcc/config/mips/9000.md        |    2 +-
 gcc/config/mips/generic.md     |   16 ++++++
 gcc/config/mips/loongson2ef.md |    2 +-
 gcc/config/mips/loongson3a.md  |    2 +-
 gcc/config/mips/mips.md        |  108 +++++++++++++++++++++-------------------
 gcc/config/mips/octeon.md      |    2 +-
 gcc/config/mips/sb1.md         |    2 +-
 gcc/config/mips/sr71k.md       |    2 +-
 gcc/config/mips/sync.md        |    6 ++-
 gcc/config/mips/xlr.md         |    2 +-
 20 files changed, 93 insertions(+), 71 deletions(-)

diff --git a/gcc/config/mips/10000.md b/gcc/config/mips/10000.md
index ad21e9e..95d9c3a 100644
--- a/gcc/config/mips/10000.md
+++ b/gcc/config/mips/10000.md
@@ -249,5 +249,5 @@
 ;; Handle unknown/multi insns here (this is a guess).
 (define_insn_reservation "r10k_unknown" 1
   (and (eq_attr "cpu" "r10000")
-       (eq_attr "type" "unknown,multi"))
+       (eq_attr "type" "unknown,multi,atomic,syncloop"))
   "r10k_alu1 + r10k_alu2")
diff --git a/gcc/config/mips/20kc.md b/gcc/config/mips/20kc.md
index 1d3aadf..ac67bb6 100644
--- a/gcc/config/mips/20kc.md
+++ b/gcc/config/mips/20kc.md
@@ -280,5 +280,5 @@
 ;; Force single-dispatch for unknown or multi.
 (define_insn_reservation "r20kc_unknown" 1 
 			 (and (eq_attr "cpu" "20kc")
-			      (eq_attr "type" "unknown,multi"))
+			      (eq_attr "type" "unknown,multi,atomic,syncloop"))
 			 "r20kc_single_dispatch")
diff --git a/gcc/config/mips/24k.md b/gcc/config/mips/24k.md
index 5df8a32..e557699 100644
--- a/gcc/config/mips/24k.md
+++ b/gcc/config/mips/24k.md
@@ -149,7 +149,7 @@
 ;;    scheduling via log links, but not used here).
 (define_insn_reservation "r24k_int_unknown" 0
   (and (eq_attr "cpu" "24kc,24kf2_1,24kf1_1")
-       (eq_attr "type" "unknown"))
+       (eq_attr "type" "unknown,atomic,syncloop"))
   "r24k_iss")
 
 
diff --git a/gcc/config/mips/4130.md b/gcc/config/mips/4130.md
index 6de814f..ca211b2 100644
--- a/gcc/config/mips/4130.md
+++ b/gcc/config/mips/4130.md
@@ -78,7 +78,7 @@
 
 (define_insn_reservation "vr4130_multi" 1
   (and (eq_attr "cpu" "r4130")
-       (eq_attr "type" "multi,unknown"))
+       (eq_attr "type" "multi,unknown,atomic,syncloop"))
   "vr4130_alu1 + vr4130_alu2 + vr4130_dcache + vr4130_muldiv")
 
 (define_insn_reservation "vr4130_int" 1
diff --git a/gcc/config/mips/4k.md b/gcc/config/mips/4k.md
index 88cdbd1..7852054 100644
--- a/gcc/config/mips/4k.md
+++ b/gcc/config/mips/4k.md
@@ -149,5 +149,5 @@
 ;; Unknown or multi - single issue
 (define_insn_reservation "r4k_unknown" 1
   (and (eq_attr "cpu" "4kc,4kp")
-       (eq_attr "type" "unknown,multi"))
+       (eq_attr "type" "unknown,multi,atomic,syncloop"))
   "r4k_ixu_arith+r4k_ixu_mpydiv")
diff --git a/gcc/config/mips/5400.md b/gcc/config/mips/5400.md
index 362999d..2649ef7 100644
--- a/gcc/config/mips/5400.md
+++ b/gcc/config/mips/5400.md
@@ -33,7 +33,7 @@
 
 (define_insn_reservation "ir_vr54_unknown" 1
   (and (eq_attr "cpu" "r5400")
-       (eq_attr "type" "unknown"))
+       (eq_attr "type" "unknown,atomic,syncloop"))
   "vr54_dp0+vr54_dp1+vr54_mem+vr54_mac")
 
 ;; Assume prediction fails.
diff --git a/gcc/config/mips/5500.md b/gcc/config/mips/5500.md
index 0b59af1..67a3a05 100644
--- a/gcc/config/mips/5500.md
+++ b/gcc/config/mips/5500.md
@@ -35,7 +35,7 @@
 
 (define_insn_reservation "ir_vr55_unknown" 1
   (and (eq_attr "cpu" "r5500")
-       (eq_attr "type" "unknown"))
+       (eq_attr "type" "unknown,atomic,syncloop"))
   "vr55_dp0+vr55_dp1+vr55_mem+vr55_mac+vr55_fp+vr55_bru")
 
 ;; Assume prediction fails.
diff --git a/gcc/config/mips/5k.md b/gcc/config/mips/5k.md
index ade06ec..dd2033b 100644
--- a/gcc/config/mips/5k.md
+++ b/gcc/config/mips/5k.md
@@ -127,7 +127,7 @@
 ;; Unknown or multi - single issue
 (define_insn_reservation "r5k_int_unknown" 1
   (and (eq_attr "cpu" "5kc,5kf")
-       (eq_attr "type" "unknown,multi"))
+       (eq_attr "type" "unknown,multi,atomic,syncloop"))
   "r5k_ixu_arith+r5k_ixu_mpydiv")
 
 
diff --git a/gcc/config/mips/7000.md b/gcc/config/mips/7000.md
index 6c91d04..8ff3fbf 100644
--- a/gcc/config/mips/7000.md
+++ b/gcc/config/mips/7000.md
@@ -210,5 +210,5 @@
 ;; Force single-dispatch for unknown or multi.
 (define_insn_reservation "rm7_unknown" 1
   (and (eq_attr "cpu" "r7000")
-       (eq_attr "type" "unknown,multi"))
+       (eq_attr "type" "unknown,multi,atomic,syncloop"))
   "rm7_single_dispatch")
diff --git a/gcc/config/mips/74k.md b/gcc/config/mips/74k.md
index b75bfc4..7bd64b5 100644
--- a/gcc/config/mips/74k.md
+++ b/gcc/config/mips/74k.md
@@ -129,7 +129,7 @@
 ;;
 (define_insn_reservation "r74k_unknown" 1 
   (and (eq_attr "cpu" "74kc,74kf2_1,74kf1_1,74kf3_2")
-       (eq_attr "type" "unknown"))
+       (eq_attr "type" "unknown,atomic,syncloop"))
   "r74k_alu")
 
 (define_insn_reservation "r74k_multi" 10
diff --git a/gcc/config/mips/9000.md b/gcc/config/mips/9000.md
index c0c8d3a..9d71691 100644
--- a/gcc/config/mips/9000.md
+++ b/gcc/config/mips/9000.md
@@ -147,5 +147,5 @@
 
 (define_insn_reservation "rm9k_unknown" 1
   (and (eq_attr "cpu" "r9000")
-       (eq_attr "type" "unknown,multi"))
+       (eq_attr "type" "unknown,multi,atomic,syncloop"))
   "rm9k_m + rm9k_f_int + rm9k_any1 + rm9k_any2")
diff --git a/gcc/config/mips/generic.md b/gcc/config/mips/generic.md
index d61511f..02b1d8b 100644
--- a/gcc/config/mips/generic.md
+++ b/gcc/config/mips/generic.md
@@ -103,3 +103,19 @@
 (define_insn_reservation "generic_frecip_fsqrt_step" 5
   (eq_attr "type" "frdiv1,frdiv2,frsqrt1,frsqrt2")
   "alu")
+
+(define_insn_reservation "generic_atomic" 10
+  (eq_attr "type" "atomic")
+  "alu")
+
+;; Sync loop consists of (in order)
+;; (1) optional sync,
+;; (2) LL instruction,
+;; (3) branch and 1-2 ALU instructions,
+;; (4) SC instruction,
+;; (5) branch and ALU instruction.
+;; The net result of this reservation is a big delay with a flush of
+;; ALU pipeline.
+(define_insn_reservation "generic_sync_loop" 40
+  (eq_attr "type" "syncloop")
+  "alu*39")
diff --git a/gcc/config/mips/loongson2ef.md b/gcc/config/mips/loongson2ef.md
index fa5ae7e..c05d132 100644
--- a/gcc/config/mips/loongson2ef.md
+++ b/gcc/config/mips/loongson2ef.md
@@ -98,7 +98,7 @@
 ;; ls2_[f]alu{1,2}_turn_enabled units according to this attribute.
 ;; These instructions are used in mips.c: sched_ls2_dfa_post_advance_cycle.
 
-(define_attr "ls2_turn_type" "alu1,alu2,falu1,falu2,unknown"
+(define_attr "ls2_turn_type" "alu1,alu2,falu1,falu2,unknown,atomic,syncloop"
   (const_string "unknown"))
 
 ;; Subscribe ls2_alu1_turn_enabled.
diff --git a/gcc/config/mips/loongson3a.md b/gcc/config/mips/loongson3a.md
index c584f42..caaff18 100644
--- a/gcc/config/mips/loongson3a.md
+++ b/gcc/config/mips/loongson3a.md
@@ -131,7 +131,7 @@
 ;; Force single-dispatch for unknown or multi.
 (define_insn_reservation "ls3a_unknown" 1
   (and (eq_attr "cpu" "loongson_3a")
-       (eq_attr "type" "unknown,multi"))
+       (eq_attr "type" "unknown,multi,atomic,syncloop"))
   "ls3a_alu1 + ls3a_alu2 + ls3a_falu1 + ls3a_falu2 + ls3a_mem")
 
 ;; End of DFA-based pipeline description for loongson_3a
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index 5b1735f..efff201 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -224,6 +224,57 @@
 	 (const_string "yes")]
 	(const_string "no")))
 
+;; Attributes describing a sync loop.  These loops have the form:
+;;
+;;       if (RELEASE_BARRIER == YES) sync
+;;    1: OLDVAL = *MEM
+;;       if ((OLDVAL & INCLUSIVE_MASK) != REQUIRED_OLDVAL) goto 2
+;;         CMP  = 0 [delay slot]
+;;       $TMP1 = OLDVAL & EXCLUSIVE_MASK
+;;       $TMP2 = INSN1 (OLDVAL, INSN1_OP2)
+;;       $TMP3 = INSN2 ($TMP2, INCLUSIVE_MASK)
+;;       $AT |= $TMP1 | $TMP3
+;;       if (!commit (*MEM = $AT)) goto 1.
+;;         if (INSN1 != MOVE && INSN1 != LI) NEWVAL = $TMP3 [delay slot]
+;;       CMP  = 1
+;;       if (ACQUIRE_BARRIER == YES) sync
+;;    2:
+;;
+;; where "$" values are temporaries and where the other values are
+;; specified by the attributes below.  Values are specified as operand
+;; numbers and insns are specified as enums.  If no operand number is
+;; specified, the following values are used instead:
+;;
+;;    - OLDVAL: $AT
+;;    - CMP: NONE
+;;    - NEWVAL: $AT
+;;    - INCLUSIVE_MASK: -1
+;;    - REQUIRED_OLDVAL: OLDVAL & INCLUSIVE_MASK
+;;    - EXCLUSIVE_MASK: 0
+;;
+;; MEM and INSN1_OP2 are required.
+;;
+;; Ideally, the operand attributes would be integers, with -1 meaning "none",
+;; but the gen* programs don't yet support that.
+(define_attr "sync_mem" "none,0,1,2,3,4,5" (const_string "none"))
+(define_attr "sync_oldval" "none,0,1,2,3,4,5" (const_string "none"))
+(define_attr "sync_cmp" "none,0,1,2,3,4,5" (const_string "none"))
+(define_attr "sync_newval" "none,0,1,2,3,4,5" (const_string "none"))
+(define_attr "sync_inclusive_mask" "none,0,1,2,3,4,5" (const_string "none"))
+(define_attr "sync_exclusive_mask" "none,0,1,2,3,4,5" (const_string "none"))
+(define_attr "sync_required_oldval" "none,0,1,2,3,4,5" (const_string "none"))
+(define_attr "sync_insn1_op2" "none,0,1,2,3,4,5" (const_string "none"))
+(define_attr "sync_insn1" "move,li,addu,addiu,subu,and,andi,or,ori,xor,xori"
+  (const_string "move"))
+(define_attr "sync_insn2" "nop,and,xor,not"
+  (const_string "nop"))
+;; Memory model specifier.
+;; "0"-"9" values specify the operand that stores the memory model value.
+;; "10" specifies MEMMODEL_ACQ_REL,
+;; "11" specifies MEMMODEL_ACQUIRE.
+(define_attr "sync_memmodel" "" (const_int 10))
+
+
 ;; Classification of each insn.
 ;; branch	conditional branch
 ;; jump		unconditional jump
@@ -274,6 +325,8 @@
 ;; frsqrt1      floating point reciprocal square root step1
 ;; frsqrt2      floating point reciprocal square root step2
 ;; multi	multiword sequence (or user asm statements)
+;; atomic	atomic memory update instruction
+;; syncloop	memory atomic operation implemented as a sync loop
 ;; nop		no operation
 ;; ghost	an instruction that produces no real code
 (define_attr "type"
@@ -281,7 +334,7 @@
    prefetch,prefetchx,condmove,mtc,mfc,mthilo,mfhilo,const,arith,logical,
    shift,slt,signext,clz,pop,trap,imul,imul3,imul3nc,imadd,idiv,idiv3,move,
    fmove,fadd,fmul,fmadd,fdiv,frdiv,frdiv1,frdiv2,fabs,fneg,fcmp,fcvt,fsqrt,
-   frsqrt,frsqrt1,frsqrt2,multi,nop,ghost"
+   frsqrt,frsqrt1,frsqrt2,multi,atomic,syncloop,nop,ghost"
   (cond [(eq_attr "jal" "!unset") (const_string "call")
 	 (eq_attr "got" "load") (const_string "load")
 
@@ -320,7 +373,8 @@
 	      (eq_attr "dword_mode" "yes"))
 	   (const_string "multi")
 	 (eq_attr "move_type" "move") (const_string "move")
-	 (eq_attr "move_type" "const") (const_string "const")]
+	 (eq_attr "move_type" "const") (const_string "const")
+	 (eq_attr "sync_mem" "!none") (const_string "syncloop")]
 	;; We classify "lui_movf" as "unknown" rather than "multi"
 	;; because we don't split it.  FIXME: we should split instead.
 	(const_string "unknown")))
@@ -344,56 +398,6 @@
 		(const_string "yes")
 		(const_string "no")))
 
-;; Attributes describing a sync loop.  These loops have the form:
-;;
-;;       if (RELEASE_BARRIER == YES) sync
-;;    1: OLDVAL = *MEM
-;;       if ((OLDVAL & INCLUSIVE_MASK) != REQUIRED_OLDVAL) goto 2
-;;         CMP  = 0 [delay slot]
-;;       $TMP1 = OLDVAL & EXCLUSIVE_MASK
-;;       $TMP2 = INSN1 (OLDVAL, INSN1_OP2)
-;;       $TMP3 = INSN2 ($TMP2, INCLUSIVE_MASK)
-;;       $AT |= $TMP1 | $TMP3
-;;       if (!commit (*MEM = $AT)) goto 1.
-;;         if (INSN1 != MOVE && INSN1 != LI) NEWVAL = $TMP3 [delay slot]
-;;       CMP  = 1
-;;       if (ACQUIRE_BARRIER == YES) sync
-;;    2:
-;;
-;; where "$" values are temporaries and where the other values are
-;; specified by the attributes below.  Values are specified as operand
-;; numbers and insns are specified as enums.  If no operand number is
-;; specified, the following values are used instead:
-;;
-;;    - OLDVAL: $AT
-;;    - CMP: NONE
-;;    - NEWVAL: $AT
-;;    - INCLUSIVE_MASK: -1
-;;    - REQUIRED_OLDVAL: OLDVAL & INCLUSIVE_MASK
-;;    - EXCLUSIVE_MASK: 0
-;;
-;; MEM and INSN1_OP2 are required.
-;;
-;; Ideally, the operand attributes would be integers, with -1 meaning "none",
-;; but the gen* programs don't yet support that.
-(define_attr "sync_mem" "none,0,1,2,3,4,5" (const_string "none"))
-(define_attr "sync_oldval" "none,0,1,2,3,4,5" (const_string "none"))
-(define_attr "sync_cmp" "none,0,1,2,3,4,5" (const_string "none"))
-(define_attr "sync_newval" "none,0,1,2,3,4,5" (const_string "none"))
-(define_attr "sync_inclusive_mask" "none,0,1,2,3,4,5" (const_string "none"))
-(define_attr "sync_exclusive_mask" "none,0,1,2,3,4,5" (const_string "none"))
-(define_attr "sync_required_oldval" "none,0,1,2,3,4,5" (const_string "none"))
-(define_attr "sync_insn1_op2" "none,0,1,2,3,4,5" (const_string "none"))
-(define_attr "sync_insn1" "move,li,addu,addiu,subu,and,andi,or,ori,xor,xori"
-  (const_string "move"))
-(define_attr "sync_insn2" "nop,and,xor,not"
-  (const_string "nop"))
-;; Memory model specifier.
-;; "0"-"9" values specify the operand that stores the memory model value.
-;; "10" specifies MEMMODEL_ACQ_REL,
-;; "11" specifies MEMMODEL_ACQUIRE.
-(define_attr "sync_memmodel" "" (const_int 10))
-
 ;; Length of instruction in bytes.
 (define_attr "length" ""
    (cond [(and (eq_attr "extended_mips16" "yes")
diff --git a/gcc/config/mips/octeon.md b/gcc/config/mips/octeon.md
index 566beea..18b0ea6 100644
--- a/gcc/config/mips/octeon.md
+++ b/gcc/config/mips/octeon.md
@@ -133,5 +133,5 @@
 
 (define_insn_reservation "octeon_unknown" 1
   (and (eq_attr "cpu" "octeon,octeon2")
-       (eq_attr "type" "unknown,multi"))
+       (eq_attr "type" "unknown,multi,atomic,syncloop"))
   "octeon_pipe0 + octeon_pipe1")
diff --git a/gcc/config/mips/sb1.md b/gcc/config/mips/sb1.md
index 2d36c22..9226fd7 100644
--- a/gcc/config/mips/sb1.md
+++ b/gcc/config/mips/sb1.md
@@ -108,7 +108,7 @@
 
 (define_insn_reservation "ir_sb1_unknown" 1
   (and (eq_attr "cpu" "sb1,sb1a")
-       (eq_attr "type" "unknown,multi"))
+       (eq_attr "type" "unknown,multi,atomic,syncloop"))
   "sb1_ls0+sb1_ls1+sb1_ex0+sb1_ex1+sb1_fp0+sb1_fp1")
 
 ;; predicted taken branch causes 2 cycle ifetch bubble.  predicted not
diff --git a/gcc/config/mips/sr71k.md b/gcc/config/mips/sr71k.md
index 9b2a784..9f33096 100644
--- a/gcc/config/mips/sr71k.md
+++ b/gcc/config/mips/sr71k.md
@@ -144,7 +144,7 @@
 
 (define_insn_reservation "ir_sr70_unknown" 1
   (and (eq_attr "cpu" "sr71000")
-       (eq_attr "type" "unknown"))
+       (eq_attr "type" "unknown,atomic,syncloop"))
   "serial_dispatch")
 
 
diff --git a/gcc/config/mips/sync.md b/gcc/config/mips/sync.md
index 0a7905a..4c8dde9 100644
--- a/gcc/config/mips/sync.md
+++ b/gcc/config/mips/sync.md
@@ -654,7 +654,8 @@
 	(unspec_volatile:GPR [(match_operand:GPR 2 "register_operand" "0")]
 	 UNSPEC_ATOMIC_EXCHANGE))]
   "ISA_HAS_SWAP"
-  "swap<size>\t%0,%b1")
+  "swap<size>\t%0,%b1"
+  [(set_attr "type" "atomic")])
 
 (define_expand "atomic_fetch_add<mode>"
   [(match_operand:GPR 0 "register_operand")
@@ -712,4 +713,5 @@
 		    (match_operand:GPR 2 "register_operand" "0"))]
 	 UNSPEC_ATOMIC_FETCH_OP))]
   "ISA_HAS_LDADD"
-  "ldadd<size>\t%0,%b1")
+  "ldadd<size>\t%0,%b1"
+  [(set_attr "type" "atomic")])
diff --git a/gcc/config/mips/xlr.md b/gcc/config/mips/xlr.md
index 69913b7..aa4a602 100644
--- a/gcc/config/mips/xlr.md
+++ b/gcc/config/mips/xlr.md
@@ -31,7 +31,7 @@
 ;; Integer arithmetic instructions.
 (define_insn_reservation "ir_xlr_alu" 1
   (and (eq_attr "cpu" "xlr") 
-       (eq_attr "type" "move,arith,shift,clz,logical,signext,const,unknown,multi,nop,trap"))
+       (eq_attr "type" "move,arith,shift,clz,logical,signext,const,unknown,multi,nop,trap,atomic,syncloop"))
   "xlr_main_pipe")
 
 ;; Integer arithmetic instructions.
-- 
1.7.4.1


* Re: [PATCH][MIPS] NetLogic XLP scheduling
  2012-07-20  4:27       ` Maxim Kuvyrkov
@ 2012-07-20  7:06         ` Chung-Lin Tang
  2012-07-20  8:28           ` Richard Sandiford
  2012-07-20  8:26         ` Richard Sandiford
  1 sibling, 1 reply; 10+ messages in thread
From: Chung-Lin Tang @ 2012-07-20  7:06 UTC (permalink / raw)
  To: Maxim Kuvyrkov; +Cc: gcc-patches, rdsandiford

[-- Attachment #1: Type: text/plain, Size: 758 bytes --]

On 2012/7/20 12:27 PM, Maxim Kuvyrkov wrote:
> Attached is a stand-alone patch that adds handling of "syncloop" and "atomic" type attributes.
> 
> Tested by building cross-toolchain mips64-linux-gnu including GLIBC.  OK to apply?
> 
> --
> Maxim Kuvyrkov
> CodeSourcery / Mentor Graphics

And here's the associated updated xlp.md, essentially the same
description posted earlier, but with the m(f|t)(hi|lo) attributes
updated to use the new style. Also included is the new XLP case in
mips_issue_rate().
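
For readers skimming the patch, the mips_issue_rate() hunk amounts to the following.  This is a stand-alone sketch, not the code in mips.c: sketch_processor and sketch_issue_rate are hypothetical names, and only the cases visible in the hunk are modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the PROCESSOR_* values seen in the hunk.  */
enum sketch_processor { SKETCH_LOONGSON_3A, SKETCH_XLP, SKETCH_OTHER };

/* Mirrors the shape of the switch in mips_issue_rate (): the XLP case
   reports 3 before reload has completed and 4 afterwards.  */
int
sketch_issue_rate (enum sketch_processor cpu, bool reload_completed)
{
  switch (cpu)
    {
    case SKETCH_LOONGSON_3A:
      return 4;

    case SKETCH_XLP:
      return reload_completed ? 4 : 3;

    default:
      return 1;
    }
}
```

So the first scheduling pass sees an issue rate of 3 for XLP and the post-reload pass sees 4; every other CPU in this sketch keeps a fixed rate.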

Thanks,
Chung-Lin

2012-07-20  Chung-Lin Tang  <cltang@codesourcery.com>
            Maxim Kuvyrkov  <maxim@codesourcery.com>
            NetLogic Microsystems Inc.

	* config/mips/mips.c (mips_issue_rate): Handle XLP.
	* config/mips/xlp.md: New file.


[-- Attachment #2: xlp-sched-2.patch --]
[-- Type: text/plain, Size: 7317 bytes --]

Index: config/mips/xlp.md
===================================================================
--- config/mips/xlp.md	(revision 0)
+++ config/mips/xlp.md	(revision 0)
@@ -0,0 +1,213 @@
+;; DFA-based pipeline description for the XLP.
+;; Copyright (C) 2012 Free Software Foundation, Inc.
+;;
+;; xlp.md   Machine Description for the Broadcom XLP Microprocessor
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_automaton "xlp_cpu")
+
+;; CPU function units.
+(define_cpu_unit "xlp_ex0" "xlp_cpu")
+(define_cpu_unit "xlp_ex1" "xlp_cpu")
+(define_cpu_unit "xlp_ex2" "xlp_cpu")
+(define_cpu_unit "xlp_ex3" "xlp_cpu")
+
+;; Integer Multiply Unit
+(define_cpu_unit "xlp_div" "xlp_cpu")
+
+;; ALU2 completion port.
+(define_cpu_unit "xlp_ex2_wrb" "xlp_cpu")
+
+(define_automaton "xlp_fpu")
+
+;; Floating-point units.
+(define_cpu_unit "xlp_fp" "xlp_fpu")
+
+;; Floating Point Sqrt/Divide
+(define_cpu_unit "xlp_divsq" "xlp_fpu")
+
+;; FPU completion port.
+(define_cpu_unit "xlp_fp_wrb" "xlp_fpu")
+
+;; Define reservations for common combinations.
+
+;;
+;; The ordering of the instruction-execution-path/resource-usage
+;; descriptions (also known as reservation RTL) is roughly ordered
+;; based on the define attribute RTL for the "type" classification.
+;; When modifying, remember that the first test that matches is the
+;; reservation used!
+;;
+(define_insn_reservation "ir_xlp_unknown" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "unknown,multi"))
+  "xlp_ex0+xlp_ex1+xlp_ex2+xlp_ex3")
+
+(define_insn_reservation "ir_xlp_branch" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "branch,jump,call"))
+  "xlp_ex3")
+
+(define_insn_reservation "ir_xlp_prefetch" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "prefetch,prefetchx"))
+  "xlp_ex0|xlp_ex1")
+
+(define_insn_reservation "ir_xlp_load" 4
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "load"))
+  "xlp_ex0|xlp_ex1")
+
+(define_insn_reservation "ir_xlp_fpload" 5
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "fpload,fpidxload"))
+  "xlp_ex0|xlp_ex1")
+
+(define_insn_reservation "ir_xlp_alu" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "const,arith,shift,slt,clz,signext,logical,move,trap,nop"))
+  "xlp_ex0|xlp_ex1|(xlp_ex2,xlp_ex2_wrb)|xlp_ex3")
+
+(define_insn_reservation "ir_xlp_condmov" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "condmove")
+       (eq_attr "mode" "SI,DI"))
+  "xlp_ex2,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_mul" 5
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "imul,imadd"))
+  "xlp_ex2,nothing*4,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_mul3" 3
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "imul3"))
+  "xlp_ex2,nothing*2,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_div" 24
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "SI")
+       (eq_attr "type" "idiv"))
+  "xlp_ex2+xlp_div,xlp_div*23,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_ddiv" 48
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "DI")
+       (eq_attr "type" "idiv"))
+  "xlp_ex2+xlp_div,xlp_div*47,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_store" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "store,fpstore,fpidxstore"))
+  "xlp_ex0|xlp_ex1")
+
+(define_insn_reservation "ir_xlp_fpmove" 2
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "mfc"))
+ "xlp_ex3,xlp_fp,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_mfhi" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "mfhi"))
+  "xlp_ex2,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_mflo" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "mflo"))
+  "xlp_ex2,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_mthi" 1
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "mthi"))
+  "xlp_ex2,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_mtlo" 3
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "mtlo"))
+  "xlp_ex2,nothing*2,xlp_ex2_wrb")
+
+(define_insn_reservation "ir_xlp_fp2" 2
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "fmove,fneg,fabs,condmove"))
+  "xlp_fp,nothing,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fp3" 3
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "fcmp"))
+  "xlp_fp,nothing*2,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fp4" 4
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "fcvt"))
+  "xlp_fp,nothing*3,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fp5" 5
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "SF")
+       (eq_attr "type" "fadd,fmul"))
+  "xlp_fp,nothing*4,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fp6" 6
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "DF")
+       (eq_attr "type" "fadd,fmul"))
+  "xlp_fp,nothing*5,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fp9" 9
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "SF")
+       (eq_attr "type" "fmadd"))
+  "xlp_fp,nothing*3,xlp_fp,nothing*3,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fp11" 11
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "DF")
+       (eq_attr "type" "fmadd"))
+  "xlp_fp,nothing*4,xlp_fp,nothing*4,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fpcomplex_s" 23
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "SF")
+       (eq_attr "type" "fdiv,frdiv,frdiv1,frdiv2,fsqrt,frsqrt,frsqrt1,frsqrt2"))
+  "xlp_fp+xlp_divsq,xlp_divsq*22,xlp_fp_wrb")
+
+(define_insn_reservation "ir_xlp_fpcomplex_d" 38
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "mode" "DF")
+       (eq_attr "type" "fdiv,frdiv,frdiv1,frdiv2,fsqrt,frsqrt,frsqrt1,frsqrt2"))
+  "xlp_fp+xlp_divsq,xlp_divsq*37,xlp_fp_wrb")
+
+(define_bypass 3 "ir_xlp_mul" "ir_xlp_mfhi")
+
+(define_insn_reservation "ir_xlp_atomic" 15
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "atomic"))
+  "xlp_ex0|xlp_ex1")
+
+;; Sync loop consists of (in order)
+;; (1) optional sync,
+;; (2) LL instruction,
+;; (3) branch and 1-2 ALU instructions,
+;; (4) SC instruction,
+;; (5) optional sync,
+;; (6) branch and ALU instruction.
+;; The net result of this reservation is a big delay with flush of
+;; ALU pipeline and outgoing reservations discouraging use of EX3.
+(define_insn_reservation "ir_xlp_sync_loop" 40
+  (and (eq_attr "cpu" "xlp")
+       (eq_attr "type" "syncloop"))
+  "(xlp_ex0+xlp_ex1+xlp_ex2+xlp_ex3)*39,xlp_ex3+(xlp_ex0|xlp_ex1|(xlp_ex2,xlp_ex2_wrb))")
Index: config/mips/mips.c
===================================================================
--- config/mips/mips.c	(revision 189702)
+++ config/mips/mips.c	(working copy)
@@ -12480,6 +12480,9 @@ mips_issue_rate (void)
     case PROCESSOR_LOONGSON_3A:
       return 4;
 
+    case PROCESSOR_XLP:
+      return (reload_completed ? 4 : 3);
+
     default:
       return 1;
     }

* Re: [PATCH][MIPS] NetLogic XLP scheduling
  2012-07-20  4:27       ` Maxim Kuvyrkov
  2012-07-20  7:06         ` Chung-Lin Tang
@ 2012-07-20  8:26         ` Richard Sandiford
  1 sibling, 0 replies; 10+ messages in thread
From: Richard Sandiford @ 2012-07-20  8:26 UTC (permalink / raw)
  To: Maxim Kuvyrkov; +Cc: Chung-Lin Tang, gcc-patches

Maxim Kuvyrkov <maxim@codesourcery.com> writes:
> 2012-07-13  Maxim Kuvyrkov  <maxim@codesourcery.com>
>
> 	* config/mips/mips.md (define_attr sync_*): Move before "type".
> 	(define_attr "type"): New values "atomic" and "syncloop".
> 	* config/mips/sync.md (atomic_exchange<mode>, atomic_fetch_add<mode>):
> 	Set "type" attribute.
> 	* config/mips/generic.md (generic_atomic, generic_sync_loop):
> 	New reservations.
> 	* gcc/config/mips/10000.md, gcc/config/mips/20kc.md,
> 	* gcc/config/mips/24k.md, gcc/config/mips/4130.md,
>         * gcc/config/mips/4k.md, gcc/config/mips/5400.md,
> 	* gcc/config/mips/5500.md, gcc/config/mips/5k.md,
>         * gcc/config/mips/7000.md, gcc/config/mips/74k.md,
> 	* gcc/config/mips/9000.md, gcc/config/mips/loongson2ef.md,
> 	* gcc/config/mips/loongson3a.md, gcc/config/mips/octeon.md,
> 	* gcc/config/mips/sb1.md, gcc/config/mips/sr71k.md,
> 	* gcc/config/mips/xlr.md: Handle "atomic" and "syncloop" types.

OK, thanks.

Richard

* Re: [PATCH][MIPS] NetLogic XLP scheduling
  2012-07-20  7:06         ` Chung-Lin Tang
@ 2012-07-20  8:28           ` Richard Sandiford
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Sandiford @ 2012-07-20  8:28 UTC (permalink / raw)
  To: Chung-Lin Tang; +Cc: Maxim Kuvyrkov, gcc-patches

Chung-Lin Tang <cltang@codesourcery.com> writes:
> 2012-07-20  Chung-Lin Tang  <cltang@codesourcery.com>
>             Maxim Kuvyrkov  <maxim@codesourcery.com>
>             NetLogic Microsystems Inc.
>
> 	* config/mips/mips.c (mips_issue_rate): Handle XLP.
> 	* config/mips/xlp.md: New file.

OK, thanks.

Richard

Thread overview: 10+ messages
2012-07-13 10:11 [PATCH][MIPS] NetLogic XLP scheduling Chung-Lin Tang
2012-07-15 16:28 ` Richard Sandiford
2012-07-16  6:37   ` Chung-Lin Tang
2012-07-16  6:57     ` Maxim Kuvyrkov
2012-07-16  9:08       ` Richard Sandiford
2012-07-20  4:27       ` Maxim Kuvyrkov
2012-07-20  7:06         ` Chung-Lin Tang
2012-07-20  8:28           ` Richard Sandiford
2012-07-20  8:26         ` Richard Sandiford
2012-07-16  9:06     ` Richard Sandiford