public inbox for gcc-patches@gcc.gnu.org
* [PATCH, rs6000] Scheduling update
@ 2016-06-21 17:46 Pat Haugen
From: Pat Haugen @ 2016-06-21 17:46 UTC
  To: GCC Patches, Segher Boessenkool, David Edelsohn

[-- Attachment #1: Type: text/plain, Size: 2389 bytes --]

This patch adds instruction scheduling support for the Power9 processor. Bootstrap/regression tested on powerpc64/powerpc64le with no new failures. Ok for trunk? Ok for backport to GCC 6 branch after successful bootstrap/regtest there?

-Pat


2016-06-21  Pat Haugen  <pthaugen@us.ibm.com>

        * config/rs6000/power8.md (power8-fp): Include dfp type.
        * config/rs6000/power6.md (power6-fp): Likewise.
        * config/rs6000/htm.md (various insns): Change type attribute to
        htmsimple and set power9_alu2 appropriately.
        * config/rs6000/power9.md: New file.
        * config/rs6000/t-rs6000 (MD_INCLUDES): Add power9.md.
        * config/rs6000/power7.md (power7-fp): Include dfp type.
        * config/rs6000/rs6000.c (power9_cost): Update costs, cache size
        and prefetch streams.
        (rs6000_option_override_internal): Remove temporary code setting
        tuning to power8. Don't set rs6000_sched_groups for power9.
        (last_scheduled_insn): Change to rtx_insn *.
        (divCnt, vec_load_pendulum): New variables.
        (rs6000_adjust_cost): Add Power9 to test for store->load separation.
        (rs6000_issue_rate): Set issue rate for Power9.
        (is_power9_pairable_vec_type): New.
        (rs6000_sched_reorder2): Add Power9 code to group fixed point divide
        insns and group/alternate vector operations with vector loads.
        (insn_must_be_first_in_group): Remove Power9.
        (insn_must_be_last_in_group): Likewise.
        (force_new_group): Likewise.
        (rs6000_sched_init): Fix initialization of last_scheduled_insn.
        Initialize divCnt/vec_load_pendulum.
        (_rs6000_sched_context, rs6000_init_sched_context,
        rs6000_set_sched_context): Handle context save/restore of new
        variables.
        * config/rs6000/vsx.md (various insns): Set power9_alu2 attribute.
        * config/rs6000/altivec.md (various insns): Likewise.
        * config/rs6000/dfp.md (various insns): Change type attribute to dfp.
        * config/rs6000/crypto.md (crypto_vshasigma<CR_char>): Change type
        and set power9_alu2.
        * config/rs6000/rs6000.md ('type' attribute): Add htmsimple/dfp types.
        Define "power9_alu2" and "mnemonic" attributes.
        Include power9.md.
        (*cmp<mode>_fpr, *fpmask<mode>): Set power9_alu2 attribute.
        (*cmp<mode>_hw): Change type to veccmp.
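
For readers who want the ready-list manipulation from the rs6000_sched_reorder2 entry above in isolation, here is a minimal standalone sketch of the divide-pairing step. It is not the patch code: enum insn_kind, promote_to_end, promote_first_of_kind, reorder_after_issue and div_cnt are hypothetical stand-ins for the real rtx_insn * ready list, get_attr_type and divCnt. It assumes only what the patch comments state, namely that the insn at the end of the ready list is the one chosen next.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the insn "type" attribute; the real code
   calls get_attr_type on rtx_insn * entries.  */
enum insn_kind { KIND_OTHER, KIND_DIV, KIND_VECLOAD, KIND_VEC };

/* Move ready[pos] to the end of the list, shifting the entries after it
   down by one.  The entry at the end is the one chosen next.  */
static void
promote_to_end (enum insn_kind *ready, int n_ready, int pos)
{
  assert (pos >= 0 && pos < n_ready);
  enum insn_kind tmp = ready[pos];
  for (int i = pos; i < n_ready - 1; i++)
    ready[i] = ready[i + 1];
  ready[n_ready - 1] = tmp;
}

/* Scan from the end of the list toward the front for the first entry of
   kind WANTED and promote it.  Return true if one was found.  */
static bool
promote_first_of_kind (enum insn_kind *ready, int n_ready,
		       enum insn_kind wanted)
{
  for (int pos = n_ready - 1; pos >= 0; pos--)
    if (ready[pos] == wanted)
      {
	promote_to_end (ready, n_ready, pos);
	return true;
      }
  return false;
}

/* Counter mirroring the role of divCnt: 1 after the first divide of a
   pair has issued, 0 otherwise.  */
static int div_cnt;

/* Called after an insn of kind LAST has issued.  After the first divide,
   try to line up a second one; in every other case reset the counter.  */
static void
reorder_after_issue (enum insn_kind last, enum insn_kind *ready, int n_ready)
{
  if (last == KIND_DIV && div_cnt == 0)
    {
      div_cnt = 1;
      promote_first_of_kind (ready, n_ready, KIND_DIV);
    }
  else
    div_cnt = 0;
}
```

The same scan-and-promote step is what the vector/vector-load alternation uses, keyed on vec_load_pendulum rather than a simple counter; that state machine is left out of this sketch.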


[-- Attachment #2: scheduling.diff --]
[-- Type: text/x-patch, Size: 57597 bytes --]

Index: config/rs6000/power8.md
===================================================================
--- config/rs6000/power8.md	(revision 237621)
+++ config/rs6000/power8.md	(working copy)
@@ -317,7 +317,7 @@ (define_bypass 4 "power8-branch" "power8
 
 ; VS Unit (includes FP/VSX/VMX/DFP/Crypto)
 (define_insn_reservation "power8-fp" 6
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,dmul,dfp")
        (eq_attr "cpu" "power8"))
   "DU_any_power8,VSU_power8")
 
Index: config/rs6000/power6.md
===================================================================
--- config/rs6000/power6.md	(revision 237621)
+++ config/rs6000/power6.md	(working copy)
@@ -500,7 +500,7 @@ (define_insn_reservation "power6-mtcr" 4
 (define_bypass 9 "power6-mtcr" "power6-branch")
 
 (define_insn_reservation "power6-fp" 6
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,dmul,dfp")
        (eq_attr "cpu" "power6"))
   "FPU_power6")
 
Index: config/rs6000/htm.md
===================================================================
--- config/rs6000/htm.md	(revision 237621)
+++ config/rs6000/htm.md	(working copy)
@@ -72,7 +72,8 @@ (define_insn "*tabort"
    (set (match_operand:BLK 2) (unspec:BLK [(match_dup 2)] UNSPEC_HTM_FENCE))]
   "TARGET_HTM"
   "tabort. %0"
-  [(set_attr "type" "htm")
+  [(set_attr "type" "htmsimple")
+   (set_attr "power9_alu2" "yes")
    (set_attr "length" "4")])
 
 (define_expand "tabort<wd>c"
@@ -98,7 +99,8 @@ (define_insn "*tabort<wd>c"
    (set (match_operand:BLK 4) (unspec:BLK [(match_dup 4)] UNSPEC_HTM_FENCE))]
   "TARGET_HTM"
   "tabort<wd>c. %0,%1,%2"
-  [(set_attr "type" "htm")
+  [(set_attr "type" "htmsimple")
+   (set_attr "power9_alu2" "yes")
    (set_attr "length" "4")])
 
 (define_expand "tabort<wd>ci"
@@ -124,7 +126,8 @@ (define_insn "*tabort<wd>ci"
    (set (match_operand:BLK 4) (unspec:BLK [(match_dup 4)] UNSPEC_HTM_FENCE))]
   "TARGET_HTM"
   "tabort<wd>ci. %0,%1,%2"
-  [(set_attr "type" "htm")
+  [(set_attr "type" "htmsimple")
+   (set_attr "power9_alu2" "yes")
    (set_attr "length" "4")])
 
 (define_expand "tbegin"
@@ -146,7 +149,7 @@ (define_insn "*tbegin"
    (set (match_operand:BLK 2) (unspec:BLK [(match_dup 2)] UNSPEC_HTM_FENCE))]
   "TARGET_HTM"
   "tbegin. %0"
-  [(set_attr "type" "htm")
+  [(set_attr "type" "htmsimple")
    (set_attr "length" "4")])
 
 (define_expand "tcheck"
@@ -208,7 +211,7 @@ (define_insn "*trechkpt"
    (set (match_operand:BLK 1) (unspec:BLK [(match_dup 1)] UNSPEC_HTM_FENCE))]
   "TARGET_HTM"
   "trechkpt."
-  [(set_attr "type" "htm")
+  [(set_attr "type" "htmsimple")
    (set_attr "length" "4")])
 
 (define_expand "treclaim"
@@ -230,7 +233,7 @@ (define_insn "*treclaim"
    (set (match_operand:BLK 2) (unspec:BLK [(match_dup 2)] UNSPEC_HTM_FENCE))]
   "TARGET_HTM"
   "treclaim. %0"
-  [(set_attr "type" "htm")
+  [(set_attr "type" "htmsimple")
    (set_attr "length" "4")])
 
 (define_expand "tsr"
@@ -252,7 +255,7 @@ (define_insn "*tsr"
    (set (match_operand:BLK 2) (unspec:BLK [(match_dup 2)] UNSPEC_HTM_FENCE))]
   "TARGET_HTM"
   "tsr. %0"
-  [(set_attr "type" "htm")
+  [(set_attr "type" "htmsimple")
    (set_attr "length" "4")])
 
 (define_expand "ttest"
@@ -272,7 +275,8 @@ (define_insn "*ttest"
    (set (match_operand:BLK 1) (unspec:BLK [(match_dup 1)] UNSPEC_HTM_FENCE))]
   "TARGET_HTM"
   "tabortwci. 0,1,0"
-  [(set_attr "type" "htm")
+  [(set_attr "type" "htmsimple")
+   (set_attr "power9_alu2" "yes")
    (set_attr "length" "4")])
 
 (define_insn "htm_mfspr_<mode>"
Index: config/rs6000/power9.md
===================================================================
--- config/rs6000/power9.md	(revision 0)
+++ config/rs6000/power9.md	(revision 0)
@@ -0,0 +1,525 @@
+;; Scheduling description for IBM POWER9 processor.
+;; Copyright (C) 2016 Free Software Foundation, Inc.
+;;
+;; Contributed by Pat Haugen (pthaugen@us.ibm.com).
+
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_automaton "power9dsp,power9lsu,power9vsu,power9misc")
+
+(define_cpu_unit "lsu0_power9,lsu1_power9,lsu2_power9,lsu3_power9" "power9lsu")
+(define_cpu_unit "vsu0_power9,vsu1_power9,vsu2_power9,vsu3_power9" "power9vsu")
+; Two vector permute units, part of vsu
+(define_cpu_unit "prm0_power9,prm1_power9" "power9vsu")
+; Two fixed point divide units, not pipelined
+(define_cpu_unit "fx_div0_power9,fx_div1_power9" "power9misc")
+(define_cpu_unit "bru_power9,cryptu_power9,dfu_power9" "power9misc")
+
+(define_cpu_unit "x0_power9,x1_power9,xa0_power9,xa1_power9,\
+		  x2_power9,x3_power9,xb0_power9,xb1_power9,
+		  br0_power9,br1_power9" "power9dsp")
+
+
+; Dispatch port reservations
+;
+; Power9 can dispatch a maximum of 6 iops per cycle with the following
+; general restrictions (other restrictions also apply):
+;   1) At most 2 iops per execution slice
+;   2) At most 2 iops to the branch unit
+; Note that insn position in a dispatch group of 6 insns does not imply which
+; execution slice the insn is routed to. The units are used to infer the
+; conflicts that exist (i.e. an 'even' requirement will preclude dispatch
+; with 2 insns with 'superslice' requirement).
+
+; The xa0/xa1 units really represent the 3rd dispatch port for a superslice but
+; are listed as separate units to allow those insns that preclude its use to
+; still be scheduled two to a superslice while reserving the 3rd slot. The
+; same applies for xb0/xb1.
+(define_reservation "DU_xa_power9" "xa0_power9+xa1_power9")
+(define_reservation "DU_xb_power9" "xb0_power9+xb1_power9")
+
+; Any execution slice dispatch 
+(define_reservation "DU_any_power9"
+		    "x0_power9|x1_power9|DU_xa_power9|x2_power9|x3_power9|\
+		     DU_xb_power9")
+
+; Even slice, actually takes even/odd slots
+(define_reservation "DU_even_power9" "x0_power9+x1_power9|x2_power9+x3_power9")
+
+; Slice plus 3rd slot
+(define_reservation "DU_slice_3_power9"
+		    "x0_power9+xa0_power9|x1_power9+xa1_power9|\
+		     x2_power9+xb0_power9|x3_power9+xb1_power9")
+
+; Superslice
+(define_reservation "DU_super_power9"
+		    "x0_power9+x1_power9|x2_power9+x3_power9")
+
+; 2-way cracked
+(define_reservation "DU_C2_power9" "(x0_power9+x1_power9)|\
+				    (x1_power9+DU_xa_power9)|\
+				    (x1_power9+x2_power9)|\
+				    (DU_xa_power9+x2_power9)|\
+				    (x2_power9+x3_power9)|\
+				    (x3_power9+DU_xb_power9)")
+
+; 2-way cracked plus 3rd slot
+(define_reservation "DU_C2_3_power9" "(x0_power9+x1_power9+xa0_power9)|\
+				      (x1_power9+x2_power9+xa0_power9)|\
+				      (x1_power9+x2_power9+xb0_power9)|\
+				      (x2_power9+x3_power9+xb0_power9)")
+
+; 3-way cracked (consumes whole decode/dispatch cycle)
+(define_reservation "DU_C3_power9"
+		    "x0_power9+x1_power9+xa0_power9+xa1_power9+x2_power9+\
+		     x3_power9+xb0_power9+xb1_power9+br0_power9+br1_power9")
+
+; Branch ports
+(define_reservation "DU_branch_power9" "br0_power9|br1_power9")
+
+
+; Execution unit reservations
+(define_reservation "LSU_power9"
+		    "lsu0_power9|lsu1_power9|lsu2_power9|lsu3_power9")
+
+(define_reservation "LSU_pair_power9"
+		    "lsu0_power9+lsu1_power9|lsu1_power9+lsu2_power9|\
+		     lsu2_power9+lsu3_power9|lsu3_power9+lsu1_power9")
+
+(define_reservation "VSU_power9"
+		    "vsu0_power9|vsu1_power9|vsu2_power9|vsu3_power9")
+
+(define_reservation "VSU_super_power9"
+		    "vsu0_power9+vsu1_power9|vsu2_power9+vsu3_power9")
+
+(define_reservation "VSU_PRM_power9" "prm0_power9|prm1_power9")
+
+
+; 2 cycle FP ops
+(define_attr "power9_fp_2cyc" "no,yes"
+  (cond [(eq_attr "mnemonic" "fabs,fcpsgn,fmr,fmrgow,fnabs,fneg,\
+			      xsabsdp,xscpsgndp,xsnabsdp,xsnegdp,\
+			      xsabsqp,xscpsgnqp,xsnabsqp,xsnegqp")
+	 (const_string "yes")]
+        (const_string "no")))
+
+; Quad-precision FP ops, execute in DFU
+(define_attr "power9_qp" "no,yes"
+  (if_then_else (ior (match_operand:KF 0 "" "")
+                     (match_operand:TF 0 "" "")
+                     (match_operand:KF 1 "" "")
+                     (match_operand:TF 1 "" ""))
+                (const_string "yes")
+                (const_string "no")))
+
+
+; LS Unit
+(define_insn_reservation "power9-load" 4
+  (and (eq_attr "type" "load")
+       (eq_attr "sign_extend" "no")
+       (eq_attr "update" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,LSU_power9")
+
+(define_insn_reservation "power9-load-update" 4
+  (and (eq_attr "type" "load")
+       (eq_attr "sign_extend" "no")
+       (eq_attr "update" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_power9,LSU_power9+VSU_power9")
+
+(define_insn_reservation "power9-load-ext" 6
+  (and (eq_attr "type" "load")
+       (eq_attr "sign_extend" "yes")
+       (eq_attr "update" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_power9,LSU_power9")
+
+(define_insn_reservation "power9-load-ext-update" 6
+  (and (eq_attr "type" "load")
+       (eq_attr "sign_extend" "yes")
+       (eq_attr "update" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_C3_power9,LSU_power9+VSU_power9")
+
+(define_insn_reservation "power9-fpload-double" 4
+  (and (eq_attr "type" "fpload")
+       (eq_attr "update" "no")
+       (match_operand:DF 0 "" "")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,LSU_power9")
+
+(define_insn_reservation "power9-fpload-update-double" 4
+  (and (eq_attr "type" "fpload")
+       (eq_attr "update" "yes")
+       (match_operand:DF 0 "" "")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_3_power9,LSU_power9+VSU_power9")
+
+; SFmode loads are cracked and have additional 2 cycles over DFmode
+(define_insn_reservation "power9-fpload-single" 6
+  (and (eq_attr "type" "fpload")
+       (eq_attr "update" "no")
+       (match_operand:SF 0 "" "")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_3_power9,LSU_power9")
+
+(define_insn_reservation "power9-fpload-update-single" 6
+  (and (eq_attr "type" "fpload")
+       (eq_attr "update" "yes")
+       (match_operand:SF 0 "" "")
+       (eq_attr "cpu" "power9"))
+  "DU_C3_power9,LSU_power9+VSU_power9")
+
+(define_insn_reservation "power9-vecload" 5
+  (and (eq_attr "type" "vecload")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,LSU_pair_power9")
+
+; Store data can issue 2 cycles after AGEN issue, 3 cycles for vector store
+(define_insn_reservation "power9-store" 0
+  (and (eq_attr "type" "store")
+       (not (and (eq_attr "update" "yes")
+		 (eq_attr "indexed" "yes")))
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,LSU_power9")
+
+(define_insn_reservation "power9-store-indexed" 0
+  (and (eq_attr "type" "store")
+       (eq_attr "update" "no")
+       (eq_attr "indexed" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,LSU_power9")
+
+; Update forms have 2 cycle latency for updated addr reg
+(define_insn_reservation "power9-store-update" 2
+  (and (eq_attr "type" "store")
+       (eq_attr "update" "yes")
+       (eq_attr "indexed" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_3_power9,LSU_power9+VSU_power9")
+
+; Update forms have 2 cycle latency for updated addr reg
+(define_insn_reservation "power9-store-update-indexed" 2
+  (and (eq_attr "type" "store")
+       (eq_attr "update" "yes")
+       (eq_attr "indexed" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_3_power9,LSU_power9+VSU_power9")
+
+(define_insn_reservation "power9-fpstore" 0
+  (and (eq_attr "type" "fpstore")
+       (eq_attr "update" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,LSU_power9")
+
+; Update forms have 2 cycle latency for updated addr reg
+(define_insn_reservation "power9-fpstore-update" 2
+  (and (eq_attr "type" "fpstore")
+       (eq_attr "update" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_3_power9,LSU_power9+VSU_power9")
+
+(define_insn_reservation "power9-vecstore" 0
+  (and (eq_attr "type" "vecstore")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,LSU_pair_power9")
+
+(define_insn_reservation "power9-larx" 4
+  (and (eq_attr "type" "load_l")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,LSU_power9")
+
+(define_insn_reservation "power9-stcx" 2
+  (and (eq_attr "type" "store_c")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_3_power9,LSU_power9+VSU_power9")
+
+(define_insn_reservation "power9-sync" 4
+  (and (eq_attr "type" "sync,isync")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,LSU_power9")
+
+
+; VSU Execution Unit
+
+; Fixed point ops
+
+; Most ALU insns are simple 2 cycle, including record form
+(define_insn_reservation "power9-alu" 2
+  (and (ior (eq_attr "type" "add,cmp,exts,integer,logical,trap,isel")
+	    (and (eq_attr "type" "insert,shift")
+		 (eq_attr "dot" "no")))
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+; Record form rotate/shift are cracked
+(define_insn_reservation "power9-cracked-alu" 2
+  (and (eq_attr "type" "insert,shift")
+       (eq_attr "dot" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_power9,VSU_power9")
+; 4 cycle CR latency 
+(define_bypass 4 "power9-cracked-alu"
+		 "power9-crlogical,power9-mfcr,power9-mfcrf,power9-branch")
+
+(define_insn_reservation "power9-alu2" 3
+  (and (eq_attr "type" "cntlz,popcnt")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+; Treat 'two' and 'three' types as 2 or 3 way cracked
+(define_insn_reservation "power9-two" 4
+  (and (eq_attr "type" "two")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_power9,VSU_power9")
+
+(define_insn_reservation "power9-three" 6
+  (and (eq_attr "type" "three")
+       (eq_attr "cpu" "power9"))
+  "DU_C3_power9,VSU_power9")
+
+(define_insn_reservation "power9-mul" 4
+  (and (eq_attr "type" "mul")
+       (eq_attr "dot" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+(define_insn_reservation "power9-mul-compare" 4
+  (and (eq_attr "type" "mul")
+       (eq_attr "dot" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_power9,VSU_power9")
+; 6 cycle CR latency 
+(define_bypass 6 "power9-mul-compare"
+		 "power9-crlogical,power9-mfcr,power9-mfcrf,power9-branch")
+
+; Fixed point divides reserve the divide units for a minimum of 8 cycles
+(define_insn_reservation "power9-idiv" 16
+  (and (eq_attr "type" "div")
+       (eq_attr "size" "32")
+       (eq_attr "cpu" "power9"))
+  "DU_even_power9,fx_div0_power9*8|fx_div1_power9*8")
+
+(define_insn_reservation "power9-ldiv" 24
+  (and (eq_attr "type" "div")
+       (eq_attr "size" "64")
+       (eq_attr "cpu" "power9"))
+  "DU_even_power9,fx_div0_power9*8|fx_div1_power9*8")
+
+(define_insn_reservation "power9-crlogical" 2
+  (and (eq_attr "type" "cr_logical,delayed_cr")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+(define_insn_reservation "power9-mfcrf" 2
+  (and (eq_attr "type" "mfcrf")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+(define_insn_reservation "power9-mfcr" 6
+  (and (eq_attr "type" "mfcr")
+       (eq_attr "cpu" "power9"))
+  "DU_C3_power9,VSU_power9")
+
+; Should differentiate between 1 cr field and > 1 since target of > 1 cr
+; is cracked
+(define_insn_reservation "power9-mtcr" 2
+  (and (eq_attr "type" "mtcr")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+; Move to LR/CTR are executed in VSU
+(define_insn_reservation "power9-mtjmpr" 5
+  (and (eq_attr "type" "mtjmpr")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+; Floating point/Vector ops
+(define_insn_reservation "power9-fp" 7
+  (and (eq_attr "type" "fp,dmul")
+       (eq_attr "power9_fp_2cyc" "no")
+       (eq_attr "power9_alu2" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+(define_insn_reservation "power9-fp2" 2
+  (and (eq_attr "type" "fp,dmul")
+       (eq_attr "power9_fp_2cyc" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+(define_insn_reservation "power9-fp-alu2" 3
+  (and (eq_attr "type" "fp,dmul")
+       (eq_attr "power9_alu2" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+(define_insn_reservation "power9-fpcompare" 3
+  (and (eq_attr "type" "fpcompare")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+; FP div/sqrt are executed in VSU slices, not pipelined for other divides,
+; but for the most part do not block pipelined ops.
+(define_insn_reservation "power9-sdiv" 22
+  (and (eq_attr "type" "sdiv")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+(define_insn_reservation "power9-ddiv" 33
+  (and (eq_attr "type" "ddiv")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+(define_insn_reservation "power9-sqrt" 26
+  (and (eq_attr "type" "ssqrt")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+(define_insn_reservation "power9-dsqrt" 36
+  (and (eq_attr "type" "dsqrt")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+(define_insn_reservation "power9-veccmp" 3
+  (and (eq_attr "type" "veccmp")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+(define_insn_reservation "power9-vecsimple" 2
+  (and (eq_attr "type" "vecsimple")
+       (eq_attr "power9_alu2" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+(define_insn_reservation "power9-vecsimple-alu2" 3
+  (and (eq_attr "type" "vecsimple")
+       (eq_attr "power9_alu2" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+(define_insn_reservation "power9-vecnormal" 7
+  (and (eq_attr "type" "vecfloat,vecdouble")
+       (eq_attr "power9_fp_2cyc" "no")
+       (eq_attr "power9_alu2" "no")
+       (eq_attr "power9_qp" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+(define_insn_reservation "power9-vecnormal2" 2
+  (and (eq_attr "type" "vecfloat")
+       (eq_attr "power9_fp_2cyc" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+(define_insn_reservation "power9-vecnormal-alu2" 3
+  (and (eq_attr "type" "vecfloat,vecdouble")
+       (eq_attr "power9_alu2" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+(define_insn_reservation "power9-qp" 12
+  (and (eq_attr "type" "vecfloat,vecdouble")
+       (eq_attr "power9_qp" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,dfu_power9")
+
+(define_insn_reservation "power9-vecperm" 2
+  (and (eq_attr "type" "vecperm")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_PRM_power9")
+
+(define_insn_reservation "power9-veccomplex" 7
+  (and (eq_attr "type" "veccomplex")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+(define_insn_reservation "power9-vecfdiv" 28
+  (and (eq_attr "type" "vecfdiv")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+(define_insn_reservation "power9-vecdiv" 32
+  (and (eq_attr "type" "vecdiv")
+       (eq_attr "power9_qp" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+(define_insn_reservation "power9-qpdiv" 56
+  (and (eq_attr "type" "vecdiv")
+       (eq_attr "power9_qp" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,dfu_power9")
+
+(define_insn_reservation "power9-mffgpr" 2
+  (and (eq_attr "type" "mffgpr")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+(define_insn_reservation "power9-mftgpr" 2
+  (and (eq_attr "type" "mftgpr")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+
+; Branch Unit
+; Move from LR/CTR are executed in BRU but consume a writeback port from an
+; execution slice.
+(define_insn_reservation "power9-mfjmpr" 6
+  (and (eq_attr "type" "mfjmpr")
+       (eq_attr "cpu" "power9"))
+  "DU_branch_power9,bru_power9+VSU_power9")
+
+; Branch is 2 cycles
+(define_insn_reservation "power9-branch" 2
+  (and (eq_attr "type" "jmpreg,branch")
+       (eq_attr "cpu" "power9"))
+  "DU_branch_power9,bru_power9")
+
+
+; Crypto Unit
+(define_insn_reservation "power9-crypto" 6
+  (and (eq_attr "type" "crypto")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,cryptu_power9")
+
+
+; HTM Unit
+(define_insn_reservation "power9-htm" 6
+  (and (eq_attr "type" "htm")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_power9,LSU_power9")
+
+(define_insn_reservation "power9-htm-simple" 2
+  (and (eq_attr "type" "htmsimple")
+       (eq_attr "power9_alu2" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+(define_insn_reservation "power9-htm-simple-alu2" 3
+  (and (eq_attr "type" "htmsimple")
+       (eq_attr "power9_alu2" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+; DFP Unit
+(define_insn_reservation "power9-dfp" 12
+  (and (eq_attr "type" "dfp")
+       (eq_attr "cpu" "power9"))
+  "DU_even_power9,dfu_power9")
+
Index: config/rs6000/t-rs6000
===================================================================
--- config/rs6000/t-rs6000	(revision 237621)
+++ config/rs6000/t-rs6000	(working copy)
@@ -50,6 +50,7 @@ MD_INCLUDES = $(srcdir)/config/rs6000/rs
 	$(srcdir)/config/rs6000/power6.md \
 	$(srcdir)/config/rs6000/power7.md \
 	$(srcdir)/config/rs6000/power8.md \
+	$(srcdir)/config/rs6000/power9.md \
 	$(srcdir)/config/rs6000/cell.md \
 	$(srcdir)/config/rs6000/xfpu.md \
 	$(srcdir)/config/rs6000/a2.md \
Index: config/rs6000/power7.md
===================================================================
--- config/rs6000/power7.md	(revision 237621)
+++ config/rs6000/power7.md	(working copy)
@@ -292,7 +292,7 @@ (define_insn_reservation "power7-branch"
 
 ; VS Unit (includes FP/VSX/VMX/DFP)
 (define_insn_reservation "power7-fp" 6
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,dmul,dfp")
        (eq_attr "cpu" "power7"))
   "DU_power7,VSU_power7")
 
Index: config/rs6000/rs6000.c
===================================================================
--- config/rs6000/rs6000.c	(revision 237621)
+++ config/rs6000/rs6000.c	(working copy)
@@ -1104,16 +1104,16 @@ struct processor_costs power9_cost = {
   COSTS_N_INSNS (3),	/* mulsi_const */
   COSTS_N_INSNS (3),	/* mulsi_const9 */
   COSTS_N_INSNS (3),	/* muldi */
-  COSTS_N_INSNS (19),	/* divsi */
-  COSTS_N_INSNS (35),	/* divdi */
+  COSTS_N_INSNS (8),	/* divsi */
+  COSTS_N_INSNS (12),	/* divdi */
   COSTS_N_INSNS (3),	/* fp */
   COSTS_N_INSNS (3),	/* dmul */
-  COSTS_N_INSNS (14),	/* sdiv */
-  COSTS_N_INSNS (17),	/* ddiv */
+  COSTS_N_INSNS (13),	/* sdiv */
+  COSTS_N_INSNS (18),	/* ddiv */
   128,			/* cache line size */
   32,			/* l1 cache */
-  256,			/* l2 cache */
-  12,			/* prefetch streams */
+  512,			/* l2 cache */
+  8,			/* prefetch streams */
   COSTS_N_INSNS (3),	/* SF->DF convert */
 };
 
@@ -3841,22 +3841,7 @@ rs6000_option_override_internal (bool gl
   if (rs6000_tune_index >= 0)
     tune_index = rs6000_tune_index;
   else if (have_cpu)
-    {
-      /* Until power9 tuning is available, use power8 tuning if -mcpu=power9.  */
-      if (processor_target_table[cpu_index].processor != PROCESSOR_POWER9)
-	rs6000_tune_index = tune_index = cpu_index;
-      else
-	{
-	  size_t i;
-	  tune_index = -1;
-	  for (i = 0; i < ARRAY_SIZE (processor_target_table); i++)
-	    if (processor_target_table[i].processor == PROCESSOR_POWER8)
-	      {
-		rs6000_tune_index = tune_index = i;
-		break;
-	      }
-	}
-    }
+    rs6000_tune_index = tune_index = cpu_index;
   else
     {
       size_t i;
@@ -4636,8 +4621,7 @@ rs6000_option_override_internal (bool gl
   rs6000_sched_groups = (rs6000_cpu == PROCESSOR_POWER4
 			 || rs6000_cpu == PROCESSOR_POWER5
 			 || rs6000_cpu == PROCESSOR_POWER7
-			 || rs6000_cpu == PROCESSOR_POWER8
-			 || rs6000_cpu == PROCESSOR_POWER9);
+			 || rs6000_cpu == PROCESSOR_POWER8);
   rs6000_align_branch_targets = (rs6000_cpu == PROCESSOR_POWER4
 				 || rs6000_cpu == PROCESSOR_POWER5
 				 || rs6000_cpu == PROCESSOR_POWER6
@@ -29825,13 +29809,19 @@ output_function_profiler (FILE *file, in
 
 /* The following variable value is the last issued insn.  */
 
-static rtx last_scheduled_insn;
+static rtx_insn * last_scheduled_insn;
 
 /* The following variable helps to balance issuing of load and
    store instructions */
 
 static int load_store_pendulum;
 
+/* The following variables are used to keep track of various scheduling
+   information. */
+static int divCnt;
+static int vec_load_pendulum;
+
+
 /* Power4 load update and store update instructions are cracked into a
    load or store and an integer insn which are executed in the same cycle.
    Branches have their own dispatch slot which does not count against the
@@ -29906,7 +29896,7 @@ rs6000_adjust_cost (rtx_insn *insn, rtx 
 	   some cycles later.  */
 
 	/* Separate a load from a narrower, dependent store.  */
-	if (rs6000_sched_groups
+	if ((rs6000_sched_groups || rs6000_cpu_attr == CPU_POWER9)
 	    && GET_CODE (PATTERN (insn)) == SET
 	    && GET_CODE (PATTERN (dep_insn)) == SET
 	    && GET_CODE (XEXP (PATTERN (insn), 1)) == MEM
@@ -30144,6 +30134,8 @@ rs6000_adjust_cost (rtx_insn *insn, rtx 
               break;
             }
         }
+      /* Fall through, no cost for output dependency. */
+
     case REG_DEP_ANTI:
       /* Anti dependency; DEP_INSN reads a register that INSN writes some
 	 cycles later.  */
@@ -30516,8 +30508,9 @@ rs6000_issue_rate (void)
   case CPU_POWER7:
     return 5;
   case CPU_POWER8:
-  case CPU_POWER9:
     return 7;
+  case CPU_POWER9:
+    return 6;
   default:
     return 1;
   }
@@ -30675,6 +30668,28 @@ is_store_insn (rtx insn, rtx *str_mem)
   return is_store_insn1 (PATTERN (insn), str_mem);
 }
 
+/* Return whether TYPE is a Power9 pairable vector instruction type.  */
+
+static bool
+is_power9_pairable_vec_type (enum attr_type type)
+{
+  switch (type)
+    {
+      case TYPE_VECSIMPLE:
+      case TYPE_VECCOMPLEX:
+      case TYPE_VECDIV:
+      case TYPE_VECCMP:
+      case TYPE_VECPERM:
+      case TYPE_VECFLOAT:
+      case TYPE_VECFDIV:
+      case TYPE_VECDOUBLE:
+	return true;
+      default:
+	break;
+    }
+  return false;
+}
+
 /* Returns whether the dependence between INSN and NEXT is considered
    costly by the given target.  */
 
@@ -30786,6 +30801,10 @@ static int
 rs6000_sched_reorder2 (FILE *dump, int sched_verbose, rtx_insn **ready,
 		         int *pn_ready, int clock_var ATTRIBUTE_UNUSED)
 {
+  int pos;
+  int i;
+  rtx_insn *tmp;
+
   if (sched_verbose)
     fprintf (dump, "// rs6000_sched_reorder2 :\n");
 
@@ -30831,9 +30850,6 @@ rs6000_sched_reorder2 (FILE *dump, int s
    */
   if (rs6000_cpu == PROCESSOR_POWER6 && last_scheduled_insn)
     {
-      int pos;
-      int i;
-      rtx_insn *tmp;
       rtx load_mem, str_mem;
 
       if (is_store_insn (last_scheduled_insn, &str_mem))
@@ -30982,6 +30998,224 @@ rs6000_sched_reorder2 (FILE *dump, int s
         }
     }
 
+  /* Do Power9 dependent reordering if necessary.  */
+  if (rs6000_cpu == PROCESSOR_POWER9 && last_scheduled_insn
+      && recog_memoized (last_scheduled_insn) >= 0)
+    {
+      enum attr_type type;
+
+      type = get_attr_type (last_scheduled_insn);
+
+      /* Try to issue fixed point divides back-to-back in pairs so they will
+	 be routed to separate execution units and execute in parallel. */
+      if (type == TYPE_DIV && divCnt == 0)
+	{
+	  /* First divide has been scheduled. */
+	  divCnt = 1;
+
+	  /* Scan the ready list looking for another divide, if found move it
+	     to the end of the list so it is chosen next. */
+	  pos = *pn_ready-1;
+	  while (pos >= 0)
+	    {
+	      if (recog_memoized (ready[pos]) >= 0
+		  && get_attr_type (ready[pos]) == TYPE_DIV)
+		{
+		  tmp = ready[pos];
+		  for (i = pos; i < *pn_ready-1; i++)
+		    ready[i] = ready[i + 1];
+		  ready[*pn_ready-1] = tmp;
+		  break;
+		}
+	      pos--;
+	    }
+	}
+      else
+	{
+	  /* Last insn was the 2nd divide or not a divide, reset the counter. */
+	  divCnt = 0;
+
+	  /* Power9 can execute 2 vector operations and 2 vector loads in a
+	     single cycle. So try to pair up and alternate groups of vector and
+	     vector load instructions.
+
+	     To aid this formation, a counter is maintained to keep track of
+	     vec/vecload insns issued. The value of vec_load_pendulum maintains
+	     the current state with the following values:
+
+	     0  : Initial state, no vec/vecload group has been started.
+
+	     -1 : 1 vector load has been issued and another has been found on
+		  the ready list and moved to the end.
+
+	     -2 : 2 vector loads have been issued and a vector operation has
+		  been found and moved to the end of the ready list.
+
+	     -3 : 2 vector loads and a vector insn have been issued and a
+		  vector operation has been found and moved to the end of the
+		  ready list.
+
+	     1  : 1 vector insn has been issued and another has been found and
+		  moved to the end of the ready list.
+
+	     2  : 2 vector insns have been issued and a vector load has been
+		  found and moved to the end of the ready list.
+
+	     3  : 2 vector insns and a vector load have been issued and another
+		  vector load has been found and moved to the end of the ready
+		  list.
+	  */
+	  if (type == TYPE_VECLOAD)
+	    {
+	      /* Issued a vecload.  */
+	      if (vec_load_pendulum == 0)
+		{
+		  /* We issued a single vecload, look for another and move it
+		     to the end of the ready list so it will be scheduled next.
+		     Set pendulum if found.  */
+		  pos = *pn_ready-1;
+		  while (pos >= 0)
+		    {
+		      if (recog_memoized (ready[pos]) >= 0
+			  && get_attr_type (ready[pos]) == TYPE_VECLOAD)
+			{
+			  tmp = ready[pos];
+			  for (i = pos; i < *pn_ready-1; i++)
+			    ready[i] = ready[i + 1];
+			  ready[*pn_ready-1] = tmp;
+			  vec_load_pendulum = -1;
+			  return cached_can_issue_more;
+			}
+		      pos--;
+		    }
+		}
+	      else if (vec_load_pendulum == -1)
+		{
+		  /* This is the second vecload we've issued, search the ready
+		     list for a vector operation so we can try to schedule a
+		     pair of those next. If found move to the end of the ready
+		     list so it is scheduled next and set the pendulum.  */
+		  pos = *pn_ready-1;
+		  while (pos >= 0)
+		    {
+		      if (recog_memoized (ready[pos]) >= 0
+			  && is_power9_pairable_vec_type (
+			       get_attr_type (ready[pos])))
+			{
+			  tmp = ready[pos];
+			  for (i = pos; i < *pn_ready-1; i++)
+			    ready[i] = ready[i + 1];
+			  ready[*pn_ready-1] = tmp;
+			  vec_load_pendulum = -2;
+			  return cached_can_issue_more;
+			}
+		      pos--;
+		    }
+		}
+	      else if (vec_load_pendulum == 2)
+		{
+		  /* Two vector ops have been issued and we've just issued a
+		     vecload, look for another vecload and move to end of ready
+		     list if found.  */
+		  pos = *pn_ready-1;
+		  while (pos >= 0)
+		    {
+		      if (recog_memoized (ready[pos]) >= 0
+			  && get_attr_type (ready[pos]) == TYPE_VECLOAD)
+			{
+			  tmp = ready[pos];
+			  for (i = pos; i < *pn_ready-1; i++)
+			    ready[i] = ready[i + 1];
+			  ready[*pn_ready-1] = tmp;
+			  /* Set pendulum so that next vecload will be seen as
+			     finishing a group, not start of one.  */
+			  vec_load_pendulum = 3;
+			  return cached_can_issue_more;
+			}
+		      pos--;
+		    }
+		}
+	    }
+	  else if (is_power9_pairable_vec_type (type))
+	    {
+	      /* Issued a vector operation.  */
+	      if (vec_load_pendulum == 0)
+		  /* We issued a single vec op, look for another and move it
+		     to the end of the ready list so it will be scheduled next.
+		     Set pendulum if found.  */
+		{
+		  pos = *pn_ready-1;
+		  while (pos >= 0)
+		    {
+		      if (recog_memoized (ready[pos]) >= 0
+			  && is_power9_pairable_vec_type (
+			       get_attr_type (ready[pos])))
+			{
+			  tmp = ready[pos];
+			  for (i = pos; i < *pn_ready-1; i++)
+			    ready[i] = ready[i + 1];
+			  ready[*pn_ready-1] = tmp;
+			  vec_load_pendulum = 1;
+			  return cached_can_issue_more;
+			}
+		      pos--;
+		    }
+		}
+	      else if (vec_load_pendulum == 1)
+		{
+		  /* This is the second vec op we've issued, search the ready
+		     list for a vecload operation so we can try to schedule a
+		     pair of those next. If found move to the end of the ready
+		     list so it is scheduled next and set the pendulum.  */
+		  pos = *pn_ready-1;
+		  while (pos >= 0)
+		    {
+		      if (recog_memoized (ready[pos]) >= 0
+			  && get_attr_type (ready[pos]) == TYPE_VECLOAD)
+			{
+			  tmp = ready[pos];
+			  for (i = pos; i < *pn_ready-1; i++)
+			    ready[i] = ready[i + 1];
+			  ready[*pn_ready-1] = tmp;
+			  vec_load_pendulum = 2;
+			  return cached_can_issue_more;
+			}
+		      pos--;
+		    }
+		}
+	      else if (vec_load_pendulum == -2)
+		{
+		  /* Two vecload ops have been issued and we've just issued a
+		     vec op, look for another vec op and move to end of ready
+		     list if found.  */
+		  pos = *pn_ready-1;
+		  while (pos >= 0)
+		    {
+		      if (recog_memoized (ready[pos]) >= 0
+			  && is_power9_pairable_vec_type (
+			       get_attr_type (ready[pos])))
+			{
+			  tmp = ready[pos];
+			  for (i = pos; i < *pn_ready-1; i++)
+			    ready[i] = ready[i + 1];
+			  ready[*pn_ready-1] = tmp;
+			  /* Set pendulum so that next vec op will be seen as
+			     finishing a group, not start of one.  */
+			  vec_load_pendulum = -3;
+			  return cached_can_issue_more;
+			}
+		      pos--;
+		    }
+		}
+	    }
+	}
+
+      /* We've either finished a vec/vecload group, couldn't find an insn to
+	 continue the current group, or the last insn had nothing to do with
+	 a group. In any case, reset the pendulum.  */
+      vec_load_pendulum = 0;
+    }
+
   return cached_can_issue_more;
 }
 
@@ -31150,7 +31384,6 @@ insn_must_be_first_in_group (rtx_insn *i
         }
       break;
     case PROCESSOR_POWER8:
-    case PROCESSOR_POWER9:
       type = get_attr_type (insn);
 
       switch (type)
@@ -31281,7 +31514,6 @@ insn_must_be_last_in_group (rtx_insn *in
     }
     break;
   case PROCESSOR_POWER8:
-  case PROCESSOR_POWER9:
     type = get_attr_type (insn);
 
     switch (type)
@@ -31400,7 +31632,7 @@ force_new_group (int sched_verbose, FILE
 
       /* Do we have a special group ending nop? */
       if (rs6000_cpu_attr == CPU_POWER6 || rs6000_cpu_attr == CPU_POWER7
-	  || rs6000_cpu_attr == CPU_POWER8 || rs6000_cpu_attr == CPU_POWER9)
+	  || rs6000_cpu_attr == CPU_POWER8)
 	{
 	  nop = gen_group_ending_nop ();
 	  emit_insn_before (nop, next_insn);
@@ -31654,8 +31886,10 @@ rs6000_sched_init (FILE *dump ATTRIBUTE_
 		     int sched_verbose ATTRIBUTE_UNUSED,
 		     int max_ready ATTRIBUTE_UNUSED)
 {
-  last_scheduled_insn = NULL_RTX;
+  last_scheduled_insn = NULL;
   load_store_pendulum = 0;
+  divCnt = 0;
+  vec_load_pendulum = 0;
 }
 
 /* The following function is called at the end of scheduling BB.
@@ -31699,8 +31933,10 @@ rs6000_sched_finish (FILE *dump, int sch
 struct _rs6000_sched_context
 {
   short cached_can_issue_more;
-  rtx last_scheduled_insn;
+  rtx_insn * last_scheduled_insn;
   int load_store_pendulum;
+  int divCnt;
+  int vec_load_pendulum;
 };
 
 typedef struct _rs6000_sched_context rs6000_sched_context_def;
@@ -31723,14 +31959,18 @@ rs6000_init_sched_context (void *_sc, bo
   if (clean_p)
     {
       sc->cached_can_issue_more = 0;
-      sc->last_scheduled_insn = NULL_RTX;
+      sc->last_scheduled_insn = NULL;
       sc->load_store_pendulum = 0;
+      sc->divCnt = 0;
+      sc->vec_load_pendulum = 0;
     }
   else
     {
       sc->cached_can_issue_more = cached_can_issue_more;
       sc->last_scheduled_insn = last_scheduled_insn;
       sc->load_store_pendulum = load_store_pendulum;
+      sc->divCnt = divCnt;
+      sc->vec_load_pendulum = vec_load_pendulum;
     }
 }
 
@@ -31745,6 +31985,8 @@ rs6000_set_sched_context (void *_sc)
   cached_can_issue_more = sc->cached_can_issue_more;
   last_scheduled_insn = sc->last_scheduled_insn;
   load_store_pendulum = sc->load_store_pendulum;
+  divCnt = sc->divCnt;
+  vec_load_pendulum = sc->vec_load_pendulum;
 }
 
 /* Free _SC.  */
Index: config/rs6000/vsx.md
===================================================================
--- config/rs6000/vsx.md	(revision 237621)
+++ config/rs6000/vsx.md	(working copy)
@@ -1252,6 +1252,7 @@ (define_insn "vsx_smax<mode>3"
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvmax<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
+   (set_attr "power9_alu2" "yes")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_smin<mode>3"
@@ -1261,6 +1262,7 @@ (define_insn "*vsx_smin<mode>3"
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvmin<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
+   (set_attr "power9_alu2" "yes")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_sqrt<mode>2"
@@ -1421,6 +1423,7 @@ (define_insn "vsx_eq<mode>"
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvcmpeq<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
+   (set_attr "power9_alu2" "yes")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "vsx_gt<mode>"
@@ -1430,6 +1433,7 @@ (define_insn "vsx_gt<mode>"
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvcmpgt<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
+   (set_attr "power9_alu2" "yes")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 (define_insn "*vsx_ge<mode>"
@@ -1439,6 +1443,7 @@ (define_insn "*vsx_ge<mode>"
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvcmpge<VSs> %x0,%x1,%x2"
   [(set_attr "type" "<VStype_simple>")
+   (set_attr "power9_alu2" "yes")
    (set_attr "fp_type" "<VSfptype_simple>")])
 
 ;; Compare vectors producing a vector result and a predicate, setting CR6 to
@@ -1454,7 +1459,8 @@ (define_insn "*vsx_eq_<mode>_p"
 		  (match_dup 2)))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvcmpeq<VSs>. %x0,%x1,%x2"
-  [(set_attr "type" "<VStype_simple>")])
+  [(set_attr "type" "<VStype_simple>")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "*vsx_gt_<mode>_p"
   [(set (reg:CC 74)
@@ -1467,7 +1473,8 @@ (define_insn "*vsx_gt_<mode>_p"
 		  (match_dup 2)))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvcmpgt<VSs>. %x0,%x1,%x2"
-  [(set_attr "type" "<VStype_simple>")])
+  [(set_attr "type" "<VStype_simple>")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "*vsx_ge_<mode>_p"
   [(set (reg:CC 74)
@@ -1480,7 +1487,8 @@ (define_insn "*vsx_ge_<mode>_p"
 		  (match_dup 2)))]
   "VECTOR_UNIT_VSX_P (<MODE>mode)"
   "xvcmpge<VSs>. %x0,%x1,%x2"
-  [(set_attr "type" "<VStype_simple>")])
+  [(set_attr "type" "<VStype_simple>")
+   (set_attr "power9_alu2" "yes")])
 
 ;; Vector select
 (define_insn "*vsx_xxsel<mode>"
@@ -1667,7 +1675,8 @@ (define_insn "vsx_xscvspdpn"
 		   UNSPEC_VSX_CVSPDPN))]
   "TARGET_XSCVSPDPN"
   "xscvspdpn %x0,%x1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "fp")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "vsx_xscvdpspn_scalar"
   [(set (match_operand:V4SF 0 "vsx_register_operand" "=wf,?wa")
@@ -1684,7 +1693,8 @@ (define_insn "vsx_xscvspdpn_directmove"
 		   UNSPEC_VSX_CVSPDPN))]
   "TARGET_XSCVSPDPN"
   "xscvspdpn %x0,%x1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "fp")
+   (set_attr "power9_alu2" "yes")])
 
 ;; Convert and scale (used by vec_ctf, vec_cts, vec_ctu for double/long long)
 
Index: config/rs6000/altivec.md
===================================================================
--- config/rs6000/altivec.md	(revision 237621)
+++ config/rs6000/altivec.md	(working copy)
@@ -511,7 +511,8 @@ (define_insn "altivec_vaddu<VI_char>s"
    (set (reg:SI 110) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))]
   "<VI_unit>"
   "vaddu<VI_char>s %0,%1,%2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "altivec_vadds<VI_char>s"
   [(set (match_operand:VI 0 "register_operand" "=v")
@@ -521,7 +522,8 @@ (define_insn "altivec_vadds<VI_char>s"
    (set (reg:SI 110) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))]
   "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
   "vadds<VI_char>s %0,%1,%2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "power9_alu2" "yes")])
 
 ;; sub
 (define_insn "sub<mode>3"
@@ -557,7 +559,8 @@ (define_insn "altivec_vsubu<VI_char>s"
    (set (reg:SI 110) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))]
   "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
   "vsubu<VI_char>s %0,%1,%2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "altivec_vsubs<VI_char>s"
   [(set (match_operand:VI 0 "register_operand" "=v")
@@ -567,7 +570,8 @@ (define_insn "altivec_vsubs<VI_char>s"
    (set (reg:SI 110) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))]
   "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
   "vsubs<VI_char>s %0,%1,%2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "power9_alu2" "yes")])
 
 ;;
 (define_insn "altivec_vavgu<VI_char>"
@@ -577,7 +581,8 @@ (define_insn "altivec_vavgu<VI_char>"
 		   UNSPEC_VAVGU))]
   "TARGET_ALTIVEC"
   "vavgu<VI_char> %0,%1,%2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "altivec_vavgs<VI_char>"
   [(set (match_operand:VI 0 "register_operand" "=v")
@@ -586,7 +591,8 @@ (define_insn "altivec_vavgs<VI_char>"
 		   UNSPEC_VAVGS))]
   "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
   "vavgs<VI_char> %0,%1,%2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "altivec_vcmpbfp"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
@@ -595,7 +601,8 @@ (define_insn "altivec_vcmpbfp"
                       UNSPEC_VCMPBFP))]
   "VECTOR_UNIT_ALTIVEC_P (V4SImode)"
   "vcmpbfp %0,%1,%2"
-  [(set_attr "type" "veccmp")])
+  [(set_attr "type" "veccmp")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "*altivec_eq<mode>"
   [(set (match_operand:VI2 0 "altivec_register_operand" "=v")
@@ -627,7 +634,8 @@ (define_insn "*altivec_eqv4sf"
 		 (match_operand:V4SF 2 "altivec_register_operand" "v")))]
   "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
   "vcmpeqfp %0,%1,%2"
-  [(set_attr "type" "veccmp")])
+  [(set_attr "type" "veccmp")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "*altivec_gtv4sf"
   [(set (match_operand:V4SF 0 "altivec_register_operand" "=v")
@@ -635,7 +643,8 @@ (define_insn "*altivec_gtv4sf"
 		 (match_operand:V4SF 2 "altivec_register_operand" "v")))]
   "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
   "vcmpgtfp %0,%1,%2"
-  [(set_attr "type" "veccmp")])
+  [(set_attr "type" "veccmp")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "*altivec_gev4sf"
   [(set (match_operand:V4SF 0 "altivec_register_operand" "=v")
@@ -643,7 +652,8 @@ (define_insn "*altivec_gev4sf"
 		 (match_operand:V4SF 2 "altivec_register_operand" "v")))]
   "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
   "vcmpgefp %0,%1,%2"
-  [(set_attr "type" "veccmp")])
+  [(set_attr "type" "veccmp")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "*altivec_vsel<mode>"
   [(set (match_operand:VM 0 "altivec_register_operand" "=v")
@@ -854,7 +864,8 @@ (define_insn "umax<mode>3"
 		  (match_operand:VI2 2 "register_operand" "v")))]
   "<VI_unit>"
   "vmaxu<VI_char> %0,%1,%2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "smax<mode>3"
   [(set (match_operand:VI2 0 "register_operand" "=v")
@@ -862,7 +873,8 @@ (define_insn "smax<mode>3"
 		  (match_operand:VI2 2 "register_operand" "v")))]
   "<VI_unit>"
   "vmaxs<VI_char> %0,%1,%2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "*altivec_smaxv4sf3"
   [(set (match_operand:V4SF 0 "register_operand" "=v")
@@ -870,7 +882,8 @@ (define_insn "*altivec_smaxv4sf3"
                    (match_operand:V4SF 2 "register_operand" "v")))]
   "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
   "vmaxfp %0,%1,%2"
-  [(set_attr "type" "veccmp")])
+  [(set_attr "type" "veccmp")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "umin<mode>3"
   [(set (match_operand:VI2 0 "register_operand" "=v")
@@ -878,7 +891,8 @@ (define_insn "umin<mode>3"
 		  (match_operand:VI2 2 "register_operand" "v")))]
   "<VI_unit>"
   "vminu<VI_char> %0,%1,%2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "smin<mode>3"
   [(set (match_operand:VI2 0 "register_operand" "=v")
@@ -886,7 +900,8 @@ (define_insn "smin<mode>3"
 		  (match_operand:VI2 2 "register_operand" "v")))]
   "<VI_unit>"
   "vmins<VI_char> %0,%1,%2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "*altivec_sminv4sf3"
   [(set (match_operand:V4SF 0 "register_operand" "=v")
@@ -894,7 +909,8 @@ (define_insn "*altivec_sminv4sf3"
                    (match_operand:V4SF 2 "register_operand" "v")))]
   "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
   "vminfp %0,%1,%2"
-  [(set_attr "type" "veccmp")])
+  [(set_attr "type" "veccmp")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "altivec_vmhaddshs"
   [(set (match_operand:V8HI 0 "register_operand" "=v")
@@ -1614,7 +1630,8 @@ (define_insn "*altivec_vrl<VI_char>"
 		    (match_operand:VI2 2 "register_operand" "v")))]
   "<VI_unit>"
   "vrl<VI_char> %0,%1,%2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "altivec_vsl"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
@@ -1658,7 +1675,8 @@ (define_insn "*altivec_vsl<VI_char>"
 		    (match_operand:VI2 2 "register_operand" "v")))]
   "<VI_unit>"
   "vsl<VI_char> %0,%1,%2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "*altivec_vsr<VI_char>"
   [(set (match_operand:VI2 0 "register_operand" "=v")
@@ -1666,7 +1684,8 @@ (define_insn "*altivec_vsr<VI_char>"
 		      (match_operand:VI2 2 "register_operand" "v")))]
   "<VI_unit>"
   "vsr<VI_char> %0,%1,%2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "*altivec_vsra<VI_char>"
   [(set (match_operand:VI2 0 "register_operand" "=v")
@@ -1674,7 +1693,8 @@ (define_insn "*altivec_vsra<VI_char>"
 		      (match_operand:VI2 2 "register_operand" "v")))]
   "<VI_unit>"
   "vsra<VI_char> %0,%1,%2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "altivec_vsr"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
@@ -3463,6 +3483,7 @@ (define_insn "*p8v_clz<mode>2"
   "TARGET_P8_VECTOR"
   "vclz<wd> %0,%1"
   [(set_attr "length" "4")
+   (set_attr "power9_alu2" "yes")
    (set_attr "type" "vecsimple")])
 
 ;; Vector absolute difference unsigned
@@ -3490,6 +3511,7 @@ (define_insn "*p9v_ctz<mode>2"
   "TARGET_P9_VECTOR"
   "vctz<wd> %0,%1"
   [(set_attr "length" "4")
+   (set_attr "power9_alu2" "yes")
    (set_attr "type" "vecsimple")])
 
 ;; Vector population count
@@ -3499,6 +3521,7 @@ (define_insn "*p8v_popcount<mode>2"
   "TARGET_P8_VECTOR"
   "vpopcnt<wd> %0,%1"
   [(set_attr "length" "4")
+   (set_attr "power9_alu2" "yes")
    (set_attr "type" "vecsimple")])
 
 ;; Vector parity
@@ -3508,6 +3531,7 @@ (define_insn "*p9v_parity<mode>2"
   "TARGET_P9_VECTOR"
   "vprtyb<wd> %0,%1"
   [(set_attr "length" "4")
+   (set_attr "power9_alu2" "yes")
    (set_attr "type" "vecsimple")])
 
 ;; Vector Gather Bits by Bytes by Doubleword
Index: config/rs6000/dfp.md
===================================================================
--- config/rs6000/dfp.md	(revision 237621)
+++ config/rs6000/dfp.md	(working copy)
@@ -58,7 +58,7 @@ (define_insn "extendsddd2"
 	(float_extend:DD (match_operand:SD 1 "gpc_reg_operand" "f")))]
   "TARGET_DFP"
   "dctdp %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_expand "extendsdtd2"
   [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
@@ -76,7 +76,7 @@ (define_insn "truncddsd2"
 	(float_truncate:SD (match_operand:DD 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "drsp %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_expand "negdd2"
   [(set (match_operand:DD 0 "gpc_reg_operand" "")
@@ -160,7 +160,7 @@ (define_insn "extendddtd2"
 	(float_extend:TD (match_operand:DD 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dctqpq %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 ;; The result of drdpq is an even/odd register pair with the converted
 ;; value in the even register and zero in the odd register.
@@ -173,7 +173,7 @@ (define_insn "trunctddd2"
    (clobber (match_scratch:TD 2 "=d"))]
   "TARGET_DFP"
   "drdpq %2,%1\;fmr %0,%2"
-  [(set_attr "type" "fp")
+  [(set_attr "type" "dfp")
    (set_attr "length" "8")])
 
 (define_insn "adddd3"
@@ -182,7 +182,7 @@ (define_insn "adddd3"
 		 (match_operand:DD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dadd %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "addtd3"
   [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
@@ -190,7 +190,7 @@ (define_insn "addtd3"
 		 (match_operand:TD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "daddq %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "subdd3"
   [(set (match_operand:DD 0 "gpc_reg_operand" "=d")
@@ -198,7 +198,7 @@ (define_insn "subdd3"
 		  (match_operand:DD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dsub %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "subtd3"
   [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
@@ -206,7 +206,7 @@ (define_insn "subtd3"
 		  (match_operand:TD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dsubq %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "muldd3"
   [(set (match_operand:DD 0 "gpc_reg_operand" "=d")
@@ -214,7 +214,7 @@ (define_insn "muldd3"
 		 (match_operand:DD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dmul %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "multd3"
   [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
@@ -222,7 +222,7 @@ (define_insn "multd3"
 		 (match_operand:TD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dmulq %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "divdd3"
   [(set (match_operand:DD 0 "gpc_reg_operand" "=d")
@@ -230,7 +230,7 @@ (define_insn "divdd3"
 		(match_operand:DD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "ddiv %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "divtd3"
   [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
@@ -238,7 +238,7 @@ (define_insn "divtd3"
 		(match_operand:TD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "ddivq %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "*cmpdd_internal1"
   [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
@@ -246,7 +246,7 @@ (define_insn "*cmpdd_internal1"
 		      (match_operand:DD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dcmpu %0,%1,%2"
-  [(set_attr "type" "fpcompare")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "*cmptd_internal1"
   [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
@@ -254,21 +254,21 @@ (define_insn "*cmptd_internal1"
 		      (match_operand:TD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dcmpuq %0,%1,%2"
-  [(set_attr "type" "fpcompare")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "floatdidd2"
   [(set (match_operand:DD 0 "gpc_reg_operand" "=d")
 	(float:DD (match_operand:DI 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP && TARGET_POPCNTD"
   "dcffix %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "floatditd2"
   [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
 	(float:TD (match_operand:DI 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dcffixq %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 ;; Convert a decimal64 to a decimal64 whose value is an integer.
 ;; This is the first stage of converting it to an integer type.
@@ -278,7 +278,7 @@ (define_insn "ftruncdd2"
 	(fix:DD (match_operand:DD 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "drintn. 0,%0,%1,1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 ;; Convert a decimal64 whose value is an integer to an actual integer.
 ;; This is the second stage of converting decimal float to integer type.
@@ -288,7 +288,7 @@ (define_insn "fixdddi2"
 	(fix:DI (match_operand:DD 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dctfix %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 ;; Convert a decimal128 to a decimal128 whose value is an integer.
 ;; This is the first stage of converting it to an integer type.
@@ -298,7 +298,7 @@ (define_insn "ftrunctd2"
 	(fix:TD (match_operand:TD 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "drintnq. 0,%0,%1,1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 ;; Convert a decimal128 whose value is an integer to an actual integer.
 ;; This is the second stage of converting decimal float to integer type.
@@ -308,7 +308,7 @@ (define_insn "fixtddi2"
 	(fix:DI (match_operand:TD 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dctfixq %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 \f
 ;; Decimal builtin support
@@ -333,7 +333,7 @@ (define_insn "dfp_ddedpd_<mode>"
 			 UNSPEC_DDEDPD))]
   "TARGET_DFP"
   "ddedpd<dfp_suffix> %1,%0,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "dfp_denbcd_<mode>"
   [(set (match_operand:D64_D128 0 "gpc_reg_operand" "=d")
@@ -342,7 +342,7 @@ (define_insn "dfp_denbcd_<mode>"
 			 UNSPEC_DENBCD))]
   "TARGET_DFP"
   "denbcd<dfp_suffix> %1,%0,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "dfp_dxex_<mode>"
   [(set (match_operand:D64_D128 0 "gpc_reg_operand" "=d")
@@ -350,7 +350,7 @@ (define_insn "dfp_dxex_<mode>"
 			 UNSPEC_DXEX))]
   "TARGET_DFP"
   "dxex<dfp_suffix> %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "dfp_diex_<mode>"
   [(set (match_operand:D64_D128 0 "gpc_reg_operand" "=d")
@@ -359,7 +359,7 @@ (define_insn "dfp_diex_<mode>"
 			 UNSPEC_DXEX))]
   "TARGET_DFP"
   "diex<dfp_suffix> %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "dfp_dscli_<mode>"
   [(set (match_operand:D64_D128 0 "gpc_reg_operand" "=d")
@@ -368,7 +368,7 @@ (define_insn "dfp_dscli_<mode>"
 			 UNSPEC_DSCLI))]
   "TARGET_DFP"
   "dscli<dfp_suffix> %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "dfp_dscri_<mode>"
   [(set (match_operand:D64_D128 0 "gpc_reg_operand" "=d")
@@ -377,4 +377,4 @@ (define_insn "dfp_dscri_<mode>"
 			 UNSPEC_DSCRI))]
   "TARGET_DFP"
   "dscri<dfp_suffix> %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
Index: config/rs6000/crypto.md
===================================================================
--- config/rs6000/crypto.md	(revision 237621)
+++ config/rs6000/crypto.md	(working copy)
@@ -107,4 +107,5 @@ (define_insn "crypto_vshasigma<CR_char>"
 			UNSPEC_VSHASIGMA))]
   "TARGET_CRYPTO"
   "vshasigma<CR_char> %0,%1,%2,%3"
-  [(set_attr "type" "crypto")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "power9_alu2" "yes")])
Index: config/rs6000/rs6000.md
===================================================================
--- config/rs6000/rs6000.md	(revision 237621)
+++ config/rs6000/rs6000.md	(working copy)
@@ -183,7 +183,7 @@ (define_attr "type"
    brinc,
    vecsimple,veccomplex,vecdiv,veccmp,veccmpsimple,vecperm,
    vecfloat,vecfdiv,vecdouble,mffgpr,mftgpr,crypto,
-   htm"
+   htm,htmsimple,dfp"
   (const_string "integer"))
 
 ;; What data size does this instruction work on?
@@ -275,6 +275,13 @@ (define_attr "cell_micro" "not,condition
 		(const_string "always")
 		(const_string "not")))
 
+;; Is this instruction a Power9 ALU2 insn?
+(define_attr "power9_alu2" "no,yes" (const_string "no"))
+
+;; Define attribute for insn mnemonic
+(define_attr "mnemonic" "unknown" (const_string "unknown"))
+
+
 (automata_option "ndfa")
 
 (include "rs64.md")
@@ -298,6 +305,7 @@ (define_attr "cell_micro" "not,condition
 (include "power6.md")
 (include "power7.md")
 (include "power8.md")
+(include "power9.md")
 (include "cell.md")
 (include "xfpu.md")
 (include "a2.md")
@@ -4510,7 +4518,8 @@ (define_insn "*cmp<mode>_fpr"
   "@
    fcmpu %0,%1,%2
    xscmpudp %0,%x1,%x2"
-  [(set_attr "type" "fpcompare")])
+  [(set_attr "type" "fpcompare")
+   (set_attr "power9_alu2" "yes")])
 
 ;; Floating point conversions
 (define_expand "extendsfdf2"
@@ -4830,7 +4839,8 @@ (define_insn "*fpmask<mode>"
 	 (match_operand:V2DI 5 "zero_constant" "")))]
   "TARGET_P9_MINMAX"
   "xscmp%V1dp %x0,%x2,%x3"
-  [(set_attr "type" "fpcompare")])
+  [(set_attr "type" "fpcompare")
+   (set_attr "power9_alu2" "yes")])
 
 (define_insn "*xxsel<mode>"
   [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>")
@@ -13658,7 +13668,7 @@ (define_insn "*cmp<mode>_hw"
 		      (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
    "xscmpuqp %0,%1,%2"
-  [(set_attr "type" "fpcompare")])
+  [(set_attr "type" "veccmp")])
 
 \f
 

^ permalink raw reply	[flat|nested] 8+ messages in thread
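
Reader's sketch, not part of the patch: the Power9 code added to
rs6000_sched_reorder2 repeats one idiom many times. It scans the ready
list from the last slot down and rotates the first matching insn into the
last slot so that it is chosen next; the vec/vecload pendulum decides which
kind of insn to look for. The standalone C below models both under stated
assumptions. The names struct insn, enum kind, promote_first_match,
pendulum_step and after_issue are hypothetical and do not appear in the
patch, and plain structs stand in for rtx_insn. The divCnt handling is left
out.

```c
/* Hypothetical stand-ins for rtx_insn and the "type" attribute.  */
enum kind { K_OTHER, K_DIV, K_VECLOAD, K_VEC };
struct insn { int id; enum kind kind; };

/* Scan READY from the last slot down.  Rotate the first insn of kind WANT
   into the last slot, which is the one the scheduler picks next.
   Return 1 if an insn was moved, 0 otherwise.  */
static int
promote_first_match (struct insn *ready, int n_ready, enum kind want)
{
  for (int pos = n_ready - 1; pos >= 0; pos--)
    if (ready[pos].kind == want)
      {
        struct insn tmp = ready[pos];
        for (int i = pos; i < n_ready - 1; i++)
          ready[i] = ready[i + 1];
        ready[n_ready - 1] = tmp;
        return 1;
      }
  return 0;
}

/* What to look for next, and the pendulum value to store if it is found.  */
struct step { enum kind want; int next; };

/* Model of the pendulum transitions listed in the patch comment.  ISSUED is
   the kind of the insn just scheduled.  A result with want == K_OTHER means
   "no search; reset the pendulum to 0".  */
static struct step
pendulum_step (int pendulum, enum kind issued)
{
  struct step reset = { K_OTHER, 0 };

  if (issued == K_VECLOAD)
    switch (pendulum)
      {
      case 0:  return (struct step) { K_VECLOAD, -1 };
      case -1: return (struct step) { K_VEC, -2 };
      case 2:  return (struct step) { K_VECLOAD, 3 };
      default: return reset;
      }
  if (issued == K_VEC)
    switch (pendulum)
      {
      case 0:  return (struct step) { K_VEC, 1 };
      case 1:  return (struct step) { K_VECLOAD, 2 };
      case -2: return (struct step) { K_VEC, -3 };
      default: return reset;
      }
  return reset;
}

/* The pendulum only advances when the wanted insn is actually found on the
   ready list; in every other case it swings back to 0.  */
static void
after_issue (struct insn *ready, int n_ready, int *pendulum, enum kind issued)
{
  struct step s = pendulum_step (*pendulum, issued);

  if (s.want != K_OTHER && promote_first_match (ready, n_ready, s.want))
    *pendulum = s.next;
  else
    *pendulum = 0;
}
```

In this model the divide pairing would be a single call to
promote_first_match (ready, n_ready, K_DIV) after the first divide issues.
Factoring the scan into one helper is a choice made for the sketch only;
the patch as posted writes the loop out at each site.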

* Re: [PATCH, rs6000] Scheduling update
  2016-06-21 17:46 [PATCH, rs6000] Scheduling update Pat Haugen
@ 2016-06-22 19:11 ` Segher Boessenkool
  2016-06-27 14:58   ` Pat Haugen
  2016-06-27 22:21   ` Pat Haugen
  0 siblings, 2 replies; 8+ messages in thread
From: Segher Boessenkool @ 2016-06-22 19:11 UTC (permalink / raw)
  To: Pat Haugen; +Cc: GCC Patches, David Edelsohn

Hi Pat,

On Tue, Jun 21, 2016 at 12:45:26PM -0500, Pat Haugen wrote:
> 2016-06-21  Pat Haugen  <pthaugen@us.ibm.com>
> 
>         * config/rs6000/power8.md (power8-fp): Include dfp type.
>         * config/rs6000/power6.md (power6-fp): Likewise.

Please put the files in the changelog in some logical order, even if
your patch tool is a bit dumb.

>         * config/rs6000/htm.md (various insns): Change type atribute to 
>         htmsimple and set power9_alu2 appropriately.

"attribute", trailing space.

The "power9_alu2" attribute is writing part of the scheduling description
inside the machine description proper.  Can this be reduced, maybe by
adding an attribute describing something about the insns that makes them
be handled by the alu2?  I realise it isn't all so regular :-(

>         (rs6000_option_override_internal): Remove temporary code setting
>         tuning to power8. Don't set rs6000_sched_groups for power9.

Two spaces after full stop.

>         (divCnt, vec_load_pendulum): New variables.

camelCase?

>         (_rs6000_sched_context, rs6000_init_sched_context,
>         rs6000_set_sched_context): Handle context save/restore of new
>         variables.

Pre-existing, but we shouldn't use names starting with underscore+lowercase.

> Index: config/rs6000/htm.md
> ===================================================================
> --- config/rs6000/htm.md	(revision 237621)
> +++ config/rs6000/htm.md	(working copy)
> @@ -72,7 +72,8 @@ (define_insn "*tabort"
>     (set (match_operand:BLK 2) (unspec:BLK [(match_dup 2)] UNSPEC_HTM_FENCE))]
>    "TARGET_HTM"
>    "tabort. %0"
> -  [(set_attr "type" "htm")
> +  [(set_attr "type" "htmsimple")
> +   (set_attr "power9_alu2" "yes")
>     (set_attr "length" "4")])

What determines if an insn is htm or htmsimple?

> +(define_cpu_unit "x0_power9,x1_power9,xa0_power9,xa1_power9,\
> +		  x2_power9,x3_power9,xb0_power9,xb1_power9,
> +		  br0_power9,br1_power9" "power9dsp")

One line has a backslash and one does not.  None are needed I think?

> +; The xa0/xa1 units really represent the 3rd dispatch port for a superslice but
> +; are listed as separate units to allow those insns that preclude its use to
> +; still be scheduled two to a superslice while reserving the 3rd slot. The
> +; same applies for xb0/xb1.

Two spaces after a full stop.

> +; Any execution slice dispatch 

Trailing space.

> +; Superslice
> +(define_reservation "DU_super_power9"
> +		    "x0_power9+x1_power9|x2_power9+x3_power9")

This needs parens around the alternatives?  Or is it superfluous in all
the other cases that use it?

> +(define_reservation "LSU_pair_power9"
> +		    "lsu0_power9+lsu1_power9|lsu1_power9+lsu2_power9|\
> +		     lsu2_power9+lsu3_power9|lsu3_power9+lsu1_power9")

The 3+1 looks strange, please check (we've talked about that).

> +; 2 cycle FP ops
> +(define_attr "power9_fp_2cyc" "no,yes"
> +  (cond [(eq_attr "mnemonic" "fabs,fcpsgn,fmr,fmrgow,fnabs,fneg,\
> +			      xsabsdp,xscpsgndp,xsnabsdp,xsnegdp,\
> +			      xsabsqp,xscpsgnqp,xsnabsqp,xsnegqp")
> +	 (const_string "yes")]
> +        (const_string "no")))

Eww.  Can we have an attribute for the FP move instructions, instead?
Maybe a value "fpmove" for the "type", even?

> +; Quad-precision FP ops, execute in DFU
> +(define_attr "power9_qp" "no,yes"
> +  (if_then_else (ior (match_operand:KF 0 "" "")
> +                     (match_operand:TF 0 "" "")
> +                     (match_operand:KF 1 "" "")
> +                     (match_operand:TF 1 "" ""))
> +                (const_string "yes")
> +                (const_string "no")))

(The "" are not needed I think).

This perhaps could be better handled with the "size" attribute.

> +(define_insn_reservation "power9-load-ext" 6
> +  (and (eq_attr "type" "load")
> +       (eq_attr "sign_extend" "yes")
> +       (eq_attr "update" "no")
> +       (eq_attr "cpu" "power9"))
> +  "DU_C2_power9,LSU_power9")

So you do not describe the units used after the first cycle?  Why is
that, to keep the size of the automaton down?


> +(define_insn_reservation "power9-fpload-double" 4
> +  (and (eq_attr "type" "fpload")
> +       (eq_attr "update" "no")
> +       (match_operand:DF 0 "" "")
> +       (eq_attr "cpu" "power9"))
> +  "DU_slice_3_power9,LSU_power9")

Using match_operand here is asking for trouble.  How about using "size"
instead?  You can default that for "fpload" insns, and document there that
it looks at the mode of operands[0] for fpload.

> +; Store data can issue 2 cycles after AGEN issue, 3 cycles for vector store
> +(define_insn_reservation "power9-store" 0
> +  (and (eq_attr "type" "store")
> +       (not (and (eq_attr "update" "yes")
> +		 (eq_attr "indexed" "yes")))
> +       (eq_attr "cpu" "power9"))
> +  "DU_slice_3_power9,LSU_power9")

That should be

+(define_insn_reservation "power9-store" 0
+  (and (eq_attr "type" "store")
+       (eq_attr "update" "no")
+       (eq_attr "indexed" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,LSU_power9")

> +; Fixed point ops
> +
> +; Most ALU insns are simple 2 cycl, including record form

"cycl"?

> +(define_insn_reservation "power9-alu" 2
> +  (and (ior (eq_attr "type" "add,cmp,exts,integer,logical,trap,isel")
> +	    (and (eq_attr "type" "insert,shift")
> +		 (eq_attr "dot" "no")))
> +       (eq_attr "cpu" "power9"))
> +  "DU_any_power9,VSU_power9")
> +
> +; Record form rotate/shift are cracked
> +(define_insn_reservation "power9-cracked-alu" 2
> +  (and (eq_attr "type" "insert,shift")
> +       (eq_attr "dot" "yes")
> +       (eq_attr "cpu" "power9"))
> +  "DU_C2_power9,VSU_power9")
> +; 4 cycle CR latency 

Trailing space.

> +(define_insn_reservation "power9-alu2" 3
> +  (and (eq_attr "type" "cntlz,popcnt")
> +       (eq_attr "cpu" "power9"))
> +  "DU_any_power9,VSU_power9")

These alu2 insns are nice and clean and easy, maybe we can do something
similar for (at least some of) the others.

> +; FP div/sqrt are executed in VSU slices, not pipelined for other divides, but for the
> +; most part do not block pipelined ops.

Line too long.  Maybe you can rewrite it, it's not super clear ;-)

> +(define_insn_reservation "power9-vecnormal" 7
> +  (and (eq_attr "type" "vecfloat,vecdouble")
> +       (eq_attr "power9_fp_2cyc" "no")
> +       (eq_attr "power9_alu2" "no")
> +       (eq_attr "power9_qp" "no")
> +       (eq_attr "cpu" "power9"))
> +  "DU_super_power9,VSU_super_power9")

So this has all three "eww"s :-)

> +(define_insn_reservation "power9-vecnormal2" 2
> +  (and (eq_attr "type" "vecfloat")
> +       (eq_attr "power9_fp_2cyc" "yes")
> +       (eq_attr "cpu" "power9"))
> +  "DU_super_power9,VSU_super_power9")
> +
> +(define_insn_reservation "power9-vecnormal-alu2" 3
> +  (and (eq_attr "type" "vecfloat,vecdouble")
> +       (eq_attr "power9_alu2" "yes")
> +       (eq_attr "cpu" "power9"))
> +  "DU_super_power9,VSU_super_power9")
> +
> +(define_insn_reservation "power9-qp" 12
> +  (and (eq_attr "type" "vecfloat,vecdouble")
> +       (eq_attr "power9_qp" "yes")
> +       (eq_attr "cpu" "power9"))
> +  "DU_super_power9,dfu_power9")

... luckily they are exclusive.

> +; Crytpo Unit

Typo.

>  /* The following variable value is the last issued insn.  */
>  
> -static rtx last_scheduled_insn;
> +static rtx_insn * last_scheduled_insn;

No space after *.

> +/* The following variables are used to keep track of various scheduling
> +   information. */
> +static int divCnt;
> +static int vec_load_pendulum;

"divCnt" is typographically wrong, and not very descriptive either (and
neither is the comment btw).  It's not like this variable is used so
often, so you can give it a somewhat longer name if that helps.

> @@ -30144,6 +30134,8 @@ rs6000_adjust_cost (rtx_insn *insn, rtx 
>                break;
>              }
>          }
> +      /* Fall through, no cost for output dependency. */

Two spaces after full stop.

> @@ -30786,6 +30801,10 @@ static int
>  rs6000_sched_reorder2 (FILE *dump, int sched_verbose, rtx_insn **ready,
>  		         int *pn_ready, int clock_var ATTRIBUTE_UNUSED)

We can write this as

 		         int *pn_ready, int /*clock_var*/)

nowadays.  Not sure which is preferred.
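
For illustration only, a minimal sketch of the two spellings.  The function
names and the local ATTRIBUTE_UNUSED definition are stand-ins I made up for
this example (the GCC/Clang "unused" attribute is assumed); this is not the
real hook.

```cpp
// Stand-in for GCC's own macro; assumes a compiler that understands the
// GNU "unused" attribute.
#define ATTRIBUTE_UNUSED __attribute__ ((__unused__))

// Style 1: the parameter keeps its name and is marked as unused.
static int
reorder_named (int *pn_ready, int clock_var ATTRIBUTE_UNUSED)
{
  return *pn_ready;
}

// Style 2: the parameter is left unnamed, with the name kept as a comment.
// This only works because the file is compiled as C++.
static int
reorder_unnamed (int *pn_ready, int /*clock_var*/)
{
  return *pn_ready;
}
```

Both compile without an unused-parameter warning; the difference is purely
one of style.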

>  {
> +  int pos;
> +  int i;
> +  rtx_insn *tmp;

Moving these to an outer scope is really a step back.  The new code could
just declare them itself; in fact, it should probably be a separate
function anyway.

> +      /* Try to issue fixed point divides back-to-back in pairs so they will
> +	 be routed to separate execution units and execute in parallel. */

Two spaces after dot (many times here).

> +	  pos = *pn_ready-1;

Spaces around "-" (many times).  Maybe you want an extra variable to hold
this value, you use it a lot?  lastpos or something.
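
Something like the following is what I have in mind; it is only a sketch.
The helper name move_to_end is hypothetical, and I am assuming (as the
comments in the patch suggest) that the chosen insn gets moved to the end of
the ready list.  It also shows the temporaries declared where they are used,
in a separate function.

```cpp
#include <cassert>

// Hypothetical helper: move ready[pos] to the last slot of the ready list,
// shifting the entries after it down by one so their relative order is kept.
template <typename T>
static void
move_to_end (T *ready, int n_ready, int pos)
{
  int lastpos = n_ready - 1;
  assert (pos >= 0 && pos <= lastpos);

  T tmp = ready[pos];
  for (int i = pos; i < lastpos; i++)
    ready[i] = ready[i + 1];
  ready[lastpos] = tmp;
}
```

With rtx_insn * as T, the callers would pass ready, *pn_ready and the
position of the insn they found.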

> +	     3  : 2 vector insns and a vector load have been issued and another
> +		  vector load has been found and moved to the end of the ready
> +		  list.
> +	  */

*/ should not be on a line on its own.

> @@ -31699,8 +31933,10 @@ rs6000_sched_finish (FILE *dump, int sch
>  struct _rs6000_sched_context
>  {
>    short cached_can_issue_more;
> -  rtx last_scheduled_insn;
> +  rtx_insn * last_scheduled_insn;

No space after asterisk.


Segher


* Re: [PATCH, rs6000] Scheduling update
  2016-06-22 19:11 ` Segher Boessenkool
@ 2016-06-27 14:58   ` Pat Haugen
  2016-06-27 20:45     ` Segher Boessenkool
  2016-06-27 22:21   ` Pat Haugen
  1 sibling, 1 reply; 8+ messages in thread
From: Pat Haugen @ 2016-06-27 14:58 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches, David Edelsohn

[-- Attachment #1: Type: text/plain, Size: 5186 bytes --]

On 06/22/2016 02:10 PM, Segher Boessenkool wrote:
> The "power9_alu2" attribute is writing part of the scheduling description
> inside the machine description proper.  Can this be reduced, maybe by
> adding an attribute describing something about the insns that makes them
> be handled by the alu2?  I realise it isn't all so regular :-(
 
 
>> > +; 2 cycle FP ops
>> > +(define_attr "power9_fp_2cyc" "no,yes"
>> > +  (cond [(eq_attr "mnemonic" "fabs,fcpsgn,fmr,fmrgow,fnabs,fneg,\
>> > +			      xsabsdp,xscpsgndp,xsnabsdp,xsnegdp,\
>> > +			      xsabsqp,xscpsgnqp,xsnabsqp,xsnegqp")
>> > +	 (const_string "yes")]
>> > +        (const_string "no")))
> Eww.  Can we have an attribute for the FP move instructions, instead?
> Maybe a value "fpmove" for the "type", even?

The following patch adds new insn 'type' values that will be used for the Power9 patch to overcome the items listed above. There is no functional change to existing processor types. Bootstrap/regtested on powerpc64/powerpc64le with no new failures. Ok for trunk? Ok for backport to GCC 6 branch after successful bootstrap/regtest there?

-Pat

2016-06-27  Pat Haugen  <pthaugen@us.ibm.com>

        * config/rs6000/rs6000.md ('type' attribute): Add
        vec_logical,veccmp_fx,vec_extend,vecmove insn types.
        (*abs<mode>2_fpr, *nabs<mode>2_fpr, *neg<mode>2_fpr, *extendsfdf2_fpr,
        copysign<mode>3_fcpsgn, trunc<mode>df2_internal1, neg<mode>2_internal,
        p8_fmrgow_<mode>, pack<mode>): Change type to fpsimple.
        (*xxsel<mode>, copysign<mode>3_hard, neg<mode>2_hw, abs<mode>2_hw,
        *nabs<mode>2_hw): Change type to vecmove.
        (*and<mode>3_internal, *bool<mode>3_internal, *boolc<mode>3_internal,
        *boolcc<mode>3_internal, *eqv<mode>3_internal,
        *one_cmpl<mode>3_internal, *ieee_128bit_vsx_neg<mode>2_internal,
        *ieee_128bit_vsx_abs<mode>2_internal,
        *ieee_128bit_vsx_nabs<mode>2_internal, extendkftf2, trunctfkf2,
        *ieee128_mfvsrd_64bit, *ieee128_mfvsrd_32bit, *ieee128_mtvsrd_64bit,
        *ieee128_mtvsrd_32bit): Change type to vec_logical.
        (mov<mode>_hardfloat, *mov<mode>_hardfloat32, *mov<mode>_hardfloat64,
        *movdi_internal32, *movdi_internal64): Update insn types.
        * config/rs6000/vsx.md (*vsx_le_undo_permute_<mode>,
        vsx_extract_<mode>): Change type to vec_logical.
        (*vsx_xxsel<mode>, *vsx_xxsel<mode>_uns): Change type to vecmove.
        (vsx_sign_extend_qi_<mode>, *vsx_sign_extend_hi_<mode>,
        *vsx_sign_extend_si_v2di): Change type to vec_extend.
        * config/rs6000/altivec.md (*altivec_mov<mode>, *altivec_movti): Change
        type to vec_logical.
        (*altivec_eq<mode>, *altivec_gt<mode>, *altivec_gtu<mode>,
        *altivec_vcmpequ<VI_char>_p, *altivec_vcmpgts<VI_char>_p,
        *altivec_vcmpgtu<VI_char>_p): Change type to veccmp_fx.
        (*altivec_vsel<mode>, *altivec_vsel<mode>_uns): Change type to vecmove.
        * config/rs6000/dfp.md (*negdd2_fpr, *absdd2_fpr, *nabsdd2_fpr,
        negtd2, *abstd2_fpr, *nabstd2_fpr): Change type to fpsimple.
        * config/rs6000/40x.md (ppc405-float): Add fpsimple.
        * config/rs6000/440.md (ppc440-fp): Add fpsimple.
        * config/rs6000/476.md (ppc476-fp): Add fpsimple.
        * config/rs6000/601.md (ppc601-fp): Add fpsimple.
        * config/rs6000/603.md (ppc603-fp): Add fpsimple.
        * config/rs6000/6xx.md (ppc604-fp): Add fpsimple.
        * config/rs6000/7xx.md (ppc750-fp): Add fpsimple.
        (ppc7400-vecsimple): Add vec_logical, vecmove, veccmp_fx.
        * config/rs6000/7450.md (ppc7450-fp): Add fpsimple.
        (ppc7450-vecsimple): Add vec_logical, vecmove.
        (ppc7450-veccmp): Add veccmp_fx.
        * config/rs6000/8540.md (ppc8540_simple_vector): Add vec_logical,
        vecmove.
        (ppc8540_vector_compare): Add veccmp_fx.
        * config/rs6000/a2.md (ppca2-fp): Add fpsimple.
        * config/rs6000/cell.md (cell-fp): Add fpsimple.
        (cell-vecsimple): Add vec_logical, vecmove.
        (cell-veccmp): Add veccmp_fx.
        * config/rs6000/e300c2c3.md (ppce300c3_fp): Add fpsimple.
        * config/rs6000/e6500.md (e6500_vecsimple): Add vec_logical, vecmove,
        veccmp_fx.
        * config/rs6000/mpc.md (mpccore-fp): Add fpsimple.
        * config/rs6000/power4.md (power4-fp): Add fpsimple.
        (power4-vecsimple): Add vec_logical, vecmove.
        (power4-veccmp): Add veccmp_fx.
        * config/rs6000/power5.md (power5-fp): Add fpsimple.
        * config/rs6000/power6.md (power6-fp): Add fpsimple.
        (power6-vecsimple): Add vec_logical, vecmove.
        (power6-veccmp): Add veccmp_fx.
        * config/rs6000/power7.md (power7-fp): Add fpsimple.
        (power7-vecsimple): Add vec_logical, vecmove, veccmp_fx.
        * config/rs6000/power8.md (power8-fp): Add fpsimple.
        (power8-vecsimple): Add vec_logical, vecmove, veccmp_fx.
        * config/rs6000/rs64.md (rs64a-fp): Add fpsimple.
        * config/rs6000/titan.md (titan_fp): Add fpsimple.
        * config/rs6000/xfpu.md (fp-default, fp-addsub-s, fp-addsub-d): Add
        fpsimple.
        * config/rs6000/rs6000.c (rs6000_adjust_cost): Add TYPE_FPSIMPLE.




[-- Attachment #2: insn_types.diff --]
[-- Type: text/x-patch, Size: 34643 bytes --]

Index: config/rs6000/476.md
===================================================================
--- config/rs6000/476.md	(revision 237783)
+++ config/rs6000/476.md	(working copy)
@@ -124,7 +124,7 @@ (define_insn_reservation "ppc476-fpcompa
    ppc476_f_pipe+ppc476_i_pipe")
 
 (define_insn_reservation "ppc476-fp" 6
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,fpsimple,dmul")
        (eq_attr "cpu" "ppc476"))
   "ppc476_issue_fp,\
    ppc476_f_pipe")
Index: config/rs6000/e300c2c3.md
===================================================================
--- config/rs6000/e300c2c3.md	(revision 237783)
+++ config/rs6000/e300c2c3.md	(working copy)
@@ -150,7 +150,7 @@ (define_insn_reservation "ppce300c3_fpco
   "ppce300c3_decode,ppce300c3_issue+ppce300c3_fpu,nothing,ppce300c3_retire")
 
 (define_insn_reservation "ppce300c3_fp" 3
-  (and (eq_attr "type" "fp")
+  (and (eq_attr "type" "fp,fpsimple")
        (eq_attr "cpu" "ppce300c3"))
   "ppce300c3_decode,ppce300c3_issue+ppce300c3_fpu,nothing,ppce300c3_retire")
 
Index: config/rs6000/power8.md
===================================================================
--- config/rs6000/power8.md	(revision 237783)
+++ config/rs6000/power8.md	(working copy)
@@ -317,7 +317,7 @@ (define_bypass 4 "power8-branch" "power8
 
 ; VS Unit (includes FP/VSX/VMX/DFP/Crypto)
 (define_insn_reservation "power8-fp" 6
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,fpsimple,dmul")
        (eq_attr "cpu" "power8"))
   "DU_any_power8,VSU_power8")
 
@@ -350,7 +350,8 @@ (define_insn_reservation "power8-dsqrt" 
   "DU_any_power8,VSU_power8")
 
 (define_insn_reservation "power8-vecsimple" 2
-  (and (eq_attr "type" "vecperm,vecsimple,veccmp")
+  (and (eq_attr "type" "vecperm,vecsimple,vec_logical,vecmove,veccmp,
+			veccmp_fx")
        (eq_attr "cpu" "power8"))
   "DU_any_power8,VSU_power8")
 
Index: config/rs6000/6xx.md
===================================================================
--- config/rs6000/6xx.md	(revision 237783)
+++ config/rs6000/6xx.md	(working copy)
@@ -160,7 +160,7 @@ (define_insn_reservation "ppc604-fpcompa
   "fpu_6xx")
 
 (define_insn_reservation "ppc604-fp" 3
-  (and (eq_attr "type" "fp")
+  (and (eq_attr "type" "fp,fpsimple")
        (eq_attr "cpu" "ppc604,ppc604e,ppc620"))
   "fpu_6xx")
 
Index: config/rs6000/8540.md
===================================================================
--- config/rs6000/8540.md	(revision 237783)
+++ config/rs6000/8540.md	(working copy)
@@ -190,7 +190,7 @@ (define_insn_reservation "ppc8540_brinc"
 
 ;; Simple vector
 (define_insn_reservation "ppc8540_simple_vector" 1
-  (and (eq_attr "type" "vecsimple")
+  (and (eq_attr "type" "vecsimple,vec_logical,vecmove")
        (eq_attr "cpu" "ppc8540,ppc8548"))
   "ppc8540_decode,ppc8540_issue+ppc8540_su1_stage0+ppc8540_retire")
 
@@ -202,7 +202,7 @@ (define_insn_reservation "ppc8540_simple
 
 ;; Vector compare
 (define_insn_reservation "ppc8540_vector_compare" 1
-  (and (eq_attr "type" "veccmp")
+  (and (eq_attr "type" "veccmp,veccmp_fx")
        (eq_attr "cpu" "ppc8540,ppc8548"))
   "ppc8540_decode,ppc8540_issue+ppc8540_su1_stage0+ppc8540_retire")
 
Index: config/rs6000/7450.md
===================================================================
--- config/rs6000/7450.md	(revision 237783)
+++ config/rs6000/7450.md	(working copy)
@@ -120,7 +120,7 @@ (define_insn_reservation "ppc7450-fpcomp
   "ppc7450_du,fpu_7450")
 
 (define_insn_reservation "ppc7450-fp" 5
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,fpsimple,dmul")
        (eq_attr "cpu" "ppc7450"))
   "ppc7450_du,fpu_7450")
 
@@ -162,7 +162,7 @@ (define_insn_reservation "ppc7450-jmpreg
 
 ;; Altivec
 (define_insn_reservation "ppc7450-vecsimple" 1
-  (and (eq_attr "type" "vecsimple")
+  (and (eq_attr "type" "vecsimple,vec_logical,vecmove")
        (eq_attr "cpu" "ppc7450"))
   "ppc7450_du,ppc7450_vec_du,vecsmpl_7450")
 
@@ -172,7 +172,7 @@ (define_insn_reservation "ppc7450-veccom
   "ppc7450_du,ppc7450_vec_du,veccmplx_7450")
 
 (define_insn_reservation "ppc7450-veccmp" 2
-  (and (eq_attr "type" "veccmp")
+  (and (eq_attr "type" "veccmp,veccmp_fx")
        (eq_attr "cpu" "ppc7450"))
   "ppc7450_du,ppc7450_vec_du,veccmplx_7450")
 
Index: config/rs6000/440.md
===================================================================
--- config/rs6000/440.md	(revision 237783)
+++ config/rs6000/440.md	(working copy)
@@ -107,7 +107,7 @@ (define_insn_reservation "ppc440-fpcompa
   "ppc440_issue,ppc440_f_pipe+ppc440_i_pipe")
 
 (define_insn_reservation "ppc440-fp" 5
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,fpsimple,dmul")
        (eq_attr "cpu" "ppc440"))
   "ppc440_issue,ppc440_f_pipe")
 
Index: config/rs6000/power6.md
===================================================================
--- config/rs6000/power6.md	(revision 237783)
+++ config/rs6000/power6.md	(working copy)
@@ -500,7 +500,7 @@ (define_insn_reservation "power6-mtcr" 4
 (define_bypass 9 "power6-mtcr" "power6-branch")
 
 (define_insn_reservation "power6-fp" 6
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,fpsimple,dmul")
        (eq_attr "cpu" "power6"))
   "FPU_power6")
 
@@ -556,7 +556,7 @@ (define_insn_reservation "power6-vecstor
   "LSF_power6")
 
 (define_insn_reservation "power6-vecsimple" 3
-  (and (eq_attr "type" "vecsimple")
+  (and (eq_attr "type" "vecsimple,vec_logical,vecmove")
        (eq_attr "cpu" "power6"))
   "FPU_power6")
 
@@ -568,7 +568,7 @@ (define_bypass 5 "power6-vecsimple" "pow
 (define_bypass 4 "power6-vecsimple" "power6-vecstore" )
 
 (define_insn_reservation "power6-veccmp" 1
-  (and (eq_attr "type" "veccmp")
+  (and (eq_attr "type" "veccmp,veccmp_fx")
        (eq_attr "cpu" "power6"))
   "FPU_power6")
 
Index: config/rs6000/rs64.md
===================================================================
--- config/rs6000/rs64.md	(revision 237783)
+++ config/rs6000/rs64.md	(working copy)
@@ -111,7 +111,7 @@ (define_insn_reservation "rs64a-fpcompar
   "mciu_rs64,fpu_rs64,bpu_rs64")
 
 (define_insn_reservation "rs64a-fp" 4
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,fpsimple,dmul")
        (eq_attr "cpu" "rs64a"))
   "mciu_rs64,fpu_rs64")
 
Index: config/rs6000/e6500.md
===================================================================
--- config/rs6000/e6500.md	(revision 237783)
+++ config/rs6000/e6500.md	(working copy)
@@ -205,7 +205,7 @@ (define_insn_reservation "e6500_cr_logic
 
 ;; VSFX.
 (define_insn_reservation "e6500_vecsimple" 1
-  (and (eq_attr "type" "vecsimple,veccmp")
+  (and (eq_attr "type" "vecsimple,vec_logical,vecmove,veccmp,veccmp_fx")
        (eq_attr "cpu" "ppce6500"))
   "e6500_decode,e6500_vec")
 
Index: config/rs6000/40x.md
===================================================================
--- config/rs6000/40x.md	(revision 237783)
+++ config/rs6000/40x.md	(working copy)
@@ -119,6 +119,6 @@ (define_insn_reservation "ppc403-cr" 2
   "bpu_40x")
 
 (define_insn_reservation "ppc405-float" 11
-  (and (eq_attr "type" "fpload,fpstore,fpcompare,fp,dmul,sdiv,ddiv")
+  (and (eq_attr "type" "fpload,fpstore,fpcompare,fp,fpsimple,dmul,sdiv,ddiv")
        (eq_attr "cpu" "ppc405"))
   "fpu_405*10")
Index: config/rs6000/power4.md
===================================================================
--- config/rs6000/power4.md	(revision 237783)
+++ config/rs6000/power4.md	(working copy)
@@ -381,7 +381,7 @@ (define_insn_reservation "power4-mtcr" 4
 
 ; Basic FP latency is 6 cycles
 (define_insn_reservation "power4-fp" 6
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,fpsimple,dmul")
        (eq_attr "cpu" "power4"))
   "fpq_power4")
 
@@ -410,7 +410,7 @@ (define_insn_reservation "power4-isync" 
 
 ; VMX
 (define_insn_reservation "power4-vecsimple" 2
-  (and (eq_attr "type" "vecsimple")
+  (and (eq_attr "type" "vecsimple,vec_logical,vecmove")
        (eq_attr "cpu" "power4"))
   "vq_power4")
 
@@ -421,7 +421,7 @@ (define_insn_reservation "power4-veccomp
 
 ; vecfp compare
 (define_insn_reservation "power4-veccmp" 8
-  (and (eq_attr "type" "veccmp")
+  (and (eq_attr "type" "veccmp,veccmp_fx")
        (eq_attr "cpu" "power4"))
   "vq_power4")
 
Index: config/rs6000/xfpu.md
===================================================================
--- config/rs6000/xfpu.md	(revision 237783)
+++ config/rs6000/xfpu.md	(working copy)
@@ -55,7 +55,7 @@ (define_cpu_unit "Xfpu_issue,Xfpu_addsub
 
 (define_insn_reservation "fp-default" 2
   (and (and 
-        (eq_attr "type" "fp")
+        (eq_attr "type" "fp,fpsimple")
         (eq_attr "fp_type" "fp_default"))
        (eq_attr "cpu" "ppc405"))
   "Xfpu_issue*2")
@@ -67,14 +67,14 @@ (define_insn_reservation "fp-compare" 6
 
 (define_insn_reservation "fp-addsub-s" 14
   (and (and
-        (eq_attr "type" "fp")
+        (eq_attr "type" "fp,fpsimple")
         (eq_attr "fp_type" "fp_addsub_s"))
        (eq_attr "cpu" "ppc405"))
   "Xfpu_issue*2,Xfpu_addsub")
 
 (define_insn_reservation "fp-addsub-d" 18
   (and (and
-        (eq_attr "type" "fp")
+        (eq_attr "type" "fp,fpsimple")
         (eq_attr "fp_type" "fp_addsub_d"))
        (eq_attr "cpu" "ppc405"))
   "Xfpu_issue*2,Xfpu_addsub")
Index: config/rs6000/603.md
===================================================================
--- config/rs6000/603.md	(revision 237783)
+++ config/rs6000/603.md	(working copy)
@@ -105,7 +105,7 @@ (define_insn_reservation "ppc603-fpcompa
   "(fpu_603+iu_603*2),bpu_603")
 
 (define_insn_reservation "ppc603-fp" 3
-  (and (eq_attr "type" "fp")
+  (and (eq_attr "type" "fp,fpsimple")
        (eq_attr "cpu" "ppc603"))
   "fpu_603")
 
Index: config/rs6000/mpc.md
===================================================================
--- config/rs6000/mpc.md	(revision 237783)
+++ config/rs6000/mpc.md	(working copy)
@@ -81,7 +81,7 @@ (define_insn_reservation "mpccore-fpcomp
   "fpu_mpc,bpu_mpc")
 
 (define_insn_reservation "mpccore-fp" 4
-  (and (eq_attr "type" "fp")
+  (and (eq_attr "type" "fp,fpsimple")
        (eq_attr "cpu" "mpccore"))
   "fpu_mpc*2")
 
Index: config/rs6000/cell.md
===================================================================
--- config/rs6000/cell.md	(revision 237783)
+++ config/rs6000/cell.md	(working copy)
@@ -306,7 +306,7 @@ (define_insn_reservation "cell-mtcrf" 1
 
 ; Basic FP latency is 10 cycles, thoughput is 1/cycle
 (define_insn_reservation "cell-fp" 10
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,fpsimple,dmul")
        (eq_attr "cpu" "cell"))
   "slot01,vsu1_cell,vsu1_cell*8")
 
@@ -329,7 +329,7 @@ (define_insn_reservation "cell-sqrt" 84
 
 ; VMX
 (define_insn_reservation "cell-vecsimple" 4
-  (and (eq_attr "type" "vecsimple")
+  (and (eq_attr "type" "vecsimple,vec_logical,vecmove")
        (eq_attr "cpu" "cell"))
   "slot01,vsu1_cell,vsu1_cell*2")
 
@@ -341,7 +341,7 @@ (define_insn_reservation "cell-veccomple
 
 ;; TODO: add support for recording instructions
 (define_insn_reservation "cell-veccmp" 4
-  (and (eq_attr "type" "veccmp")
+  (and (eq_attr "type" "veccmp,veccmp_fx")
        (eq_attr "cpu" "cell"))
   "slot01,vsu1_cell,vsu1_cell*2")
 
Index: config/rs6000/power7.md
===================================================================
--- config/rs6000/power7.md	(revision 237783)
+++ config/rs6000/power7.md	(working copy)
@@ -292,7 +292,7 @@ (define_insn_reservation "power7-branch"
 
 ; VS Unit (includes FP/VSX/VMX/DFP)
 (define_insn_reservation "power7-fp" 6
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,fpsimple,dmul")
        (eq_attr "cpu" "power7"))
   "DU_power7,VSU_power7")
 
@@ -324,7 +324,7 @@ (define_insn_reservation "power7-dsqrt" 
   "DU_power7,VSU_power7")
 
 (define_insn_reservation "power7-vecsimple" 2
-  (and (eq_attr "type" "vecsimple,veccmp")
+  (and (eq_attr "type" "vecsimple,vec_logical,vecmove,veccmp,veccmp_fx")
        (eq_attr "cpu" "power7"))
   "DU_power7,vsu1_power7")
 
Index: config/rs6000/7xx.md
===================================================================
--- config/rs6000/7xx.md	(revision 237783)
+++ config/rs6000/7xx.md	(working copy)
@@ -113,7 +113,7 @@ (define_insn_reservation "ppc750-fpcompa
   "ppc750_du,fpu_7xx")
 
 (define_insn_reservation "ppc750-fp" 3
-  (and (eq_attr "type" "fp")
+  (and (eq_attr "type" "fp,fpsimple")
        (eq_attr "cpu" "ppc750,ppc7400"))
   "ppc750_du,fpu_7xx")
 
@@ -165,7 +165,7 @@ (define_insn_reservation "ppc750-jmpreg"
 
 ;; Altivec
 (define_insn_reservation "ppc7400-vecsimple" 1
-  (and (eq_attr "type" "vecsimple,veccmp")
+  (and (eq_attr "type" "vecsimple,vec_logical,vecmove,veccmp,veccmp_fx")
        (eq_attr "cpu" "ppc7400"))
   "ppc750_du,ppc7400_vec_du,veccmplx_7xx")
 
Index: config/rs6000/rs6000.c
===================================================================
--- config/rs6000/rs6000.c	(revision 237783)
+++ config/rs6000/rs6000.c	(working copy)
@@ -30195,7 +30195,9 @@ rs6000_adjust_cost (rtx_insn *insn, rtx 
           switch (attr_type)
             {
             case TYPE_FP:
-              if (get_attr_type (dep_insn) == TYPE_FP)
+            case TYPE_FPSIMPLE:
+              if (get_attr_type (dep_insn) == TYPE_FP
+		  || get_attr_type (dep_insn) == TYPE_FPSIMPLE)
                 return 1;
               break;
             case TYPE_FPLOAD:
Index: config/rs6000/titan.md
===================================================================
--- config/rs6000/titan.md	(revision 237783)
+++ config/rs6000/titan.md	(working copy)
@@ -156,7 +156,7 @@ (define_insn_reservation "titan_fp_singl
 ;; Make sure the "titan_fp" rule stays last, as it's a catch all for
 ;; double-precision and unclassified (e.g. fsel) FP-instructions
 (define_insn_reservation "titan_fp" 10
-  (and (eq_attr "type" "fpcompare,fp,dmul")
+  (and (eq_attr "type" "fpcompare,fp,fpsimple,dmul")
        (eq_attr "cpu" "titan"))
   "titan_issue,titan_fp0*2,nothing*8,titan_fpwb")
 
Index: config/rs6000/vsx.md
===================================================================
--- config/rs6000/vsx.md	(revision 237783)
+++ config/rs6000/vsx.md	(working copy)
@@ -685,7 +685,7 @@ (define_insn_and_split "*vsx_le_undo_per
     }
 }
   [(set_attr "length" "0,4")
-   (set_attr "type" "vecsimple")])
+   (set_attr "type" "vec_logical")])
 
 (define_insn_and_split "*vsx_le_perm_load_<mode>"
   [(set (match_operand:VSX_LE_128 0 "vsx_register_operand" "=<VSa>")
@@ -1492,7 +1492,7 @@ (define_insn "*vsx_xxsel<mode>"
 	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
   "xxsel %x0,%x3,%x2,%x1"
-  [(set_attr "type" "vecperm")])
+  [(set_attr "type" "vecmove")])
 
 (define_insn "*vsx_xxsel<mode>_uns"
   [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?<VSa>")
@@ -1503,7 +1503,7 @@ (define_insn "*vsx_xxsel<mode>_uns"
 	 (match_operand:VSX_L 3 "vsx_register_operand" "<VSr>,<VSa>")))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
   "xxsel %x0,%x3,%x2,%x1"
-  [(set_attr "type" "vecperm")])
+  [(set_attr "type" "vecmove")])
 
 ;; Copy sign
 (define_insn "vsx_copysign<mode>3"
@@ -2157,7 +2157,7 @@ (define_insn "vsx_extract_<mode>"
   else
     gcc_unreachable ();
 }
-  [(set_attr "type" "vecsimple,mftgpr,mftgpr,vecperm")])
+  [(set_attr "type" "vec_logical,mftgpr,mftgpr,vecperm")])
 
 ;; Optimize extracting a single scalar element from memory if the scalar is in
 ;; the correct location to use a single load.
@@ -2703,7 +2703,7 @@ (define_insn "vsx_sign_extend_qi_<mode>"
 	 UNSPEC_VSX_SIGN_EXTEND))]
   "TARGET_P9_VECTOR"
   "vextsb2<wd> %0,%1"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vec_extend")])
 
 (define_insn "*vsx_sign_extend_hi_<mode>"
   [(set (match_operand:VSINT_84 0 "vsx_register_operand" "=v")
@@ -2712,7 +2712,7 @@ (define_insn "*vsx_sign_extend_hi_<mode>
 	 UNSPEC_VSX_SIGN_EXTEND))]
   "TARGET_P9_VECTOR"
   "vextsh2<wd> %0,%1"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vec_extend")])
 
 (define_insn "*vsx_sign_extend_si_v2di"
   [(set (match_operand:V2DI 0 "vsx_register_operand" "=v")
@@ -2720,4 +2720,4 @@ (define_insn "*vsx_sign_extend_si_v2di"
 		     UNSPEC_VSX_SIGN_EXTEND))]
   "TARGET_P9_VECTOR"
   "vextsw2d %0,%1"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vec_extend")])
Index: config/rs6000/altivec.md
===================================================================
--- config/rs6000/altivec.md	(revision 237783)
+++ config/rs6000/altivec.md	(working copy)
@@ -242,7 +242,7 @@ (define_insn "*altivec_mov<mode>"
     default: gcc_unreachable ();
     }
 }
-  [(set_attr "type" "vecstore,vecload,vecsimple,store,load,*,vecsimple,*,*")
+  [(set_attr "type" "vecstore,vecload,vec_logical,store,load,*,vec_logical,*,*")
    (set_attr "length" "4,4,4,20,20,20,4,8,32")])
 
 ;; Unlike other altivec moves, allow the GPRs, since a normal use of TImode
@@ -268,7 +268,7 @@ (define_insn "*altivec_movti"
     default: gcc_unreachable ();
     }
 }
-  [(set_attr "type" "vecstore,vecload,vecsimple,store,load,*,vecsimple,*")])
+  [(set_attr "type" "vecstore,vecload,vec_logical,store,load,*,vec_logical,*")])
 
 ;; Load up a vector with the most significant bit set by loading up -1 and
 ;; doing a shift left
@@ -603,7 +603,7 @@ (define_insn "*altivec_eq<mode>"
 		(match_operand:VI2 2 "altivec_register_operand" "v")))]
   "<VI_unit>"
   "vcmpequ<VI_char> %0,%1,%2"
-  [(set_attr "type" "veccmp")])
+  [(set_attr "type" "veccmp_fx")])
 
 (define_insn "*altivec_gt<mode>"
   [(set (match_operand:VI2 0 "altivec_register_operand" "=v")
@@ -611,7 +611,7 @@ (define_insn "*altivec_gt<mode>"
 		(match_operand:VI2 2 "altivec_register_operand" "v")))]
   "<VI_unit>"
   "vcmpgts<VI_char> %0,%1,%2"
-  [(set_attr "type" "veccmp")])
+  [(set_attr "type" "veccmp_fx")])
 
 (define_insn "*altivec_gtu<mode>"
   [(set (match_operand:VI2 0 "altivec_register_operand" "=v")
@@ -619,7 +619,7 @@ (define_insn "*altivec_gtu<mode>"
 		 (match_operand:VI2 2 "altivec_register_operand" "v")))]
   "<VI_unit>"
   "vcmpgtu<VI_char> %0,%1,%2"
-  [(set_attr "type" "veccmp")])
+  [(set_attr "type" "veccmp_fx")])
 
 (define_insn "*altivec_eqv4sf"
   [(set (match_operand:V4SF 0 "altivec_register_operand" "=v")
@@ -654,7 +654,7 @@ (define_insn "*altivec_vsel<mode>"
 	 (match_operand:VM 3 "altivec_register_operand" "v")))]
   "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
   "vsel %0,%3,%2,%1"
-  [(set_attr "type" "vecperm")])
+  [(set_attr "type" "vecmove")])
 
 (define_insn "*altivec_vsel<mode>_uns"
   [(set (match_operand:VM 0 "altivec_register_operand" "=v")
@@ -665,7 +665,7 @@ (define_insn "*altivec_vsel<mode>_uns"
 	 (match_operand:VM 3 "altivec_register_operand" "v")))]
   "VECTOR_MEM_ALTIVEC_P (<MODE>mode)"
   "vsel %0,%3,%2,%1"
-  [(set_attr "type" "vecperm")])
+  [(set_attr "type" "vecmove")])
 
 ;; Fused multiply add.
 
@@ -2283,7 +2283,7 @@ (define_insn "*altivec_vcmpequ<VI_char>_
 		(match_dup 2)))]
   "<VI_unit>"
   "vcmpequ<VI_char>. %0,%1,%2"
-  [(set_attr "type" "veccmp")])
+  [(set_attr "type" "veccmp_fx")])
 
 (define_insn "*altivec_vcmpgts<VI_char>_p"
   [(set (reg:CC 74)
@@ -2295,7 +2295,7 @@ (define_insn "*altivec_vcmpgts<VI_char>_
 		(match_dup 2)))]
   "<VI_unit>"
   "vcmpgts<VI_char>. %0,%1,%2"
-  [(set_attr "type" "veccmp")])
+  [(set_attr "type" "veccmp_fx")])
 
 (define_insn "*altivec_vcmpgtu<VI_char>_p"
   [(set (reg:CC 74)
@@ -2307,7 +2307,7 @@ (define_insn "*altivec_vcmpgtu<VI_char>_
 		 (match_dup 2)))]
   "<VI_unit>"
   "vcmpgtu<VI_char>. %0,%1,%2"
-  [(set_attr "type" "veccmp")])
+  [(set_attr "type" "veccmp_fx")])
 
 (define_insn "*altivec_vcmpeqfp_p"
   [(set (reg:CC 74)
Index: config/rs6000/601.md
===================================================================
--- config/rs6000/601.md	(revision 237783)
+++ config/rs6000/601.md	(working copy)
@@ -86,7 +86,7 @@ (define_insn_reservation "ppc601-fpcompa
   "(fpu_ppc601+iu_ppc601*2),nothing*2,bpu_ppc601")
 
 (define_insn_reservation "ppc601-fp" 4
-  (and (eq_attr "type" "fp")
+  (and (eq_attr "type" "fp,fpsimple")
        (eq_attr "cpu" "ppc601"))
   "fpu_ppc601")
 
Index: config/rs6000/dfp.md
===================================================================
--- config/rs6000/dfp.md	(revision 237783)
+++ config/rs6000/dfp.md	(working copy)
@@ -89,7 +89,7 @@ (define_insn "*negdd2_fpr"
 	(neg:DD (match_operand:DD 1 "gpc_reg_operand" "d")))]
   "TARGET_HARD_FLOAT && TARGET_FPRS"
   "fneg %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "fpsimple")])
 
 (define_expand "absdd2"
   [(set (match_operand:DD 0 "gpc_reg_operand" "")
@@ -102,14 +102,14 @@ (define_insn "*absdd2_fpr"
 	(abs:DD (match_operand:DD 1 "gpc_reg_operand" "d")))]
   "TARGET_HARD_FLOAT && TARGET_FPRS"
   "fabs %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "fpsimple")])
 
 (define_insn "*nabsdd2_fpr"
   [(set (match_operand:DD 0 "gpc_reg_operand" "=d")
 	(neg:DD (abs:DD (match_operand:DD 1 "gpc_reg_operand" "d"))))]
   "TARGET_HARD_FLOAT && TARGET_FPRS"
   "fnabs %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "fpsimple")])
 
 (define_expand "negtd2"
   [(set (match_operand:TD 0 "gpc_reg_operand" "")
@@ -124,7 +124,7 @@ (define_insn "*negtd2_fpr"
   "@
    fneg %0,%1
    fneg %0,%1\;fmr %L0,%L1"
-  [(set_attr "type" "fp")
+  [(set_attr "type" "fpsimple")
    (set_attr "length" "4,8")])
 
 (define_expand "abstd2"
@@ -140,7 +140,7 @@ (define_insn "*abstd2_fpr"
   "@
    fabs %0,%1
    fabs %0,%1\;fmr %L0,%L1"
-  [(set_attr "type" "fp")
+  [(set_attr "type" "fpsimple")
    (set_attr "length" "4,8")])
 
 (define_insn "*nabstd2_fpr"
@@ -150,7 +150,7 @@ (define_insn "*nabstd2_fpr"
   "@
    fnabs %0,%1
    fnabs %0,%1\;fmr %L0,%L1"
-  [(set_attr "type" "fp")
+  [(set_attr "type" "fpsimple")
    (set_attr "length" "4,8")])
 
 ;; Hardware support for decimal floating point operations.
Index: config/rs6000/power5.md
===================================================================
--- config/rs6000/power5.md	(revision 237783)
+++ config/rs6000/power5.md	(working copy)
@@ -322,7 +322,7 @@ (define_insn_reservation "power5-mtcr" 4
 
 ; Basic FP latency is 6 cycles
 (define_insn_reservation "power5-fp" 6
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,fpsimple,dmul")
        (eq_attr "cpu" "power5"))
   "fpq_power5")
 
Index: config/rs6000/rs6000.md
===================================================================
--- config/rs6000/rs6000.md	(revision 237783)
+++ config/rs6000/rs6000.md	(working copy)
@@ -183,6 +183,7 @@ (define_attr "type"
    brinc,
    vecsimple,veccomplex,vecdiv,veccmp,veccmpsimple,vecperm,
    vecfloat,vecfdiv,vecdouble,mffgpr,mftgpr,crypto,
+   vec_logical,veccmp_fx,vec_extend,vecmove,
    htm"
   (const_string "integer"))
 
@@ -4348,7 +4349,7 @@ (define_insn "*abs<mode>2_fpr"
   "@
    fabs %0,%1
    xsabsdp %x0,%x1"
-  [(set_attr "type" "fp")
+  [(set_attr "type" "fpsimple")
    (set_attr "fp_type" "fp_addsub_<Fs>")])
 
 (define_insn "*nabs<mode>2_fpr"
@@ -4360,7 +4361,7 @@ (define_insn "*nabs<mode>2_fpr"
   "@
    fnabs %0,%1
    xsnabsdp %x0,%x1"
-  [(set_attr "type" "fp")
+  [(set_attr "type" "fpsimple")
    (set_attr "fp_type" "fp_addsub_<Fs>")])
 
 (define_expand "neg<mode>2"
@@ -4376,7 +4377,7 @@ (define_insn "*neg<mode>2_fpr"
   "@
    fneg %0,%1
    xsnegdp %x0,%x1"
-  [(set_attr "type" "fp")
+  [(set_attr "type" "fpsimple")
    (set_attr "fp_type" "fp_addsub_<Fs>")])
 
 (define_expand "add<mode>3"
@@ -4537,7 +4538,7 @@ (define_insn_and_split "*extendsfdf2_fpr
   emit_note (NOTE_INSN_DELETED);
   DONE;
 }
-  [(set_attr "type" "fp,fp,fpload,fp,fp,fpload,fpload")])
+  [(set_attr "type" "fp,fpsimple,fpload,fp,fpsimple,fpload,fpload")])
 
 (define_expand "truncdfsf2"
   [(set (match_operand:SF 0 "gpc_reg_operand" "")
@@ -4627,7 +4628,7 @@ (define_insn "copysign<mode>3_fcpsgn"
   "@
    fcpsgn %0,%2,%1
    xscpsgndp %x0,%x2,%x1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "fpsimple")])
 
 ;; For MIN, MAX, and conditional move, we use DEFINE_EXPAND's that involve a
 ;; fsel instruction and some auxiliary computations.  Then we just have a
@@ -4840,7 +4841,7 @@ (define_insn "*xxsel<mode>"
 			   (match_operand:SFDF 4 "vsx_register_operand" "<Fv>")))]
   "TARGET_P9_MINMAX"
   "xxsel %x0,%x1,%x3,%x4"
-  [(set_attr "type" "vecperm")])
+  [(set_attr "type" "vecmove")])
 
 \f
 ;; Conversions to and from floating-point.
@@ -5951,7 +5952,7 @@ (define_insn_and_split "*and<mode>3_inte
   [(set (attr "type")
       (if_then_else
 	(match_test "vsx_register_operand (operands[0], <MODE>mode)")
-	(const_string "vecsimple")
+	(const_string "vec_logical")
 	(const_string "integer")))
    (set (attr "length")
       (if_then_else
@@ -5987,7 +5988,7 @@ (define_insn_and_split "*bool<mode>3_int
   [(set (attr "type")
       (if_then_else
 	(match_test "vsx_register_operand (operands[0], <MODE>mode)")
-	(const_string "vecsimple")
+	(const_string "vec_logical")
 	(const_string "integer")))
    (set (attr "length")
       (if_then_else
@@ -6025,7 +6026,7 @@ (define_insn_and_split "*boolc<mode>3_in
   [(set (attr "type")
       (if_then_else
 	(match_test "vsx_register_operand (operands[0], <MODE>mode)")
-	(const_string "vecsimple")
+	(const_string "vec_logical")
 	(const_string "integer")))
    (set (attr "length")
       (if_then_else
@@ -6085,7 +6086,7 @@ (define_insn_and_split "*boolcc<mode>3_i
   [(set (attr "type")
       (if_then_else
 	(match_test "vsx_register_operand (operands[0], <MODE>mode)")
-	(const_string "vecsimple")
+	(const_string "vec_logical")
 	(const_string "integer")))
    (set (attr "length")
       (if_then_else
@@ -6143,7 +6144,7 @@ (define_insn_and_split "*eqv<mode>3_inte
   [(set (attr "type")
       (if_then_else
 	(match_test "vsx_register_operand (operands[0], <MODE>mode)")
-	(const_string "vecsimple")
+	(const_string "vec_logical")
 	(const_string "integer")))
    (set (attr "length")
       (if_then_else
@@ -6199,7 +6200,7 @@ (define_insn_and_split "*one_cmpl<mode>3
   [(set (attr "type")
       (if_then_else
 	(match_test "vsx_register_operand (operands[0], <MODE>mode)")
-	(const_string "vecsimple")
+	(const_string "vec_logical")
 	(const_string "integer")))
    (set (attr "length")
       (if_then_else
@@ -6514,7 +6515,7 @@ (define_insn "mov<mode>_hardfloat"
    mt%0 %1
    mf%1 %0
    nop"
-  [(set_attr "type" "*,load,store,fp,fp,vecsimple,integer,fpload,fpload,fpstore,fpstore,fpload,fpstore,mffgpr,mftgpr,mtjmpr,mfjmpr,*")
+  [(set_attr "type" "*,load,store,fpsimple,fpsimple,vec_logical,integer,fpload,fpload,fpstore,fpstore,fpload,fpstore,mffgpr,mftgpr,mtjmpr,mfjmpr,*")
    (set_attr "length" "4")])
 
 (define_insn "*mov<mode>_softfloat"
@@ -6649,7 +6650,7 @@ (define_insn "*mov<mode>_hardfloat32"
    #
    #
    #"
-  [(set_attr "type" "fpstore,fpload,fp,fpload,fpstore,fpload,fpstore,vecsimple,vecsimple,two,store,load,two")
+  [(set_attr "type" "fpstore,fpload,fpsimple,fpload,fpstore,fpload,fpstore,vec_logical,vec_logical,two,store,load,two")
    (set_attr "length" "4,4,4,4,4,4,4,4,4,8,8,8,8")])
 
 (define_insn "*mov<mode>_softfloat32"
@@ -6694,7 +6695,7 @@ (define_insn "*mov<mode>_hardfloat64"
    mffgpr %0,%1
    mfvsrd %0,%x1
    mtvsrd %x0,%1"
-  [(set_attr "type" "fpstore,fpload,fp,fpload,fpstore,fpload,fpstore,vecsimple,vecsimple,integer,store,load,*,mtjmpr,mfjmpr,*,mftgpr,mffgpr,mftgpr,mffgpr")
+  [(set_attr "type" "fpstore,fpload,fpsimple,fpload,fpstore,fpload,fpstore,vec_logical,vec_logical,integer,store,load,*,mtjmpr,mfjmpr,*,mftgpr,mffgpr,mftgpr,mffgpr")
    (set_attr "length" "4")])
 
 (define_insn "*mov<mode>_softfloat64"
@@ -6905,7 +6906,7 @@ (define_insn_and_split "trunc<mode>df2_i
   emit_note (NOTE_INSN_DELETED);
   DONE;
 }
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "fpsimple")])
 
 (define_insn "trunc<mode>df2_internal2"
   [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
@@ -7138,7 +7139,7 @@ (define_insn "neg<mode>2_internal"
   else
     return \"fneg %0,%1\;fneg %L0,%L1\";
 }"
-  [(set_attr "type" "fp")
+  [(set_attr "type" "fpsimple")
    (set_attr "length" "8")])
 
 (define_expand "abs<mode>2"
@@ -7273,7 +7274,7 @@ (define_insn "*ieee_128bit_vsx_neg<mode>
    (use (match_operand:V16QI 2 "register_operand" "v"))]
   "TARGET_FLOAT128 && !TARGET_FLOAT128_HW"
   "xxlxor %x0,%x1,%x2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vec_logical")])
 
 ;; IEEE 128-bit absolute value
 (define_insn_and_split "ieee_128bit_vsx_abs<mode>2"
@@ -7302,7 +7303,7 @@ (define_insn "*ieee_128bit_vsx_abs<mode>
    (use (match_operand:V16QI 2 "register_operand" "v"))]
   "TARGET_FLOAT128 && !TARGET_FLOAT128_HW"
   "xxlandc %x0,%x1,%x2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vec_logical")])
 
 ;; IEEE 128-bit negative absolute value
 (define_insn_and_split "*ieee_128bit_vsx_nabs<mode>2"
@@ -7335,7 +7336,7 @@ (define_insn "*ieee_128bit_vsx_nabs<mode
    (use (match_operand:V16QI 2 "register_operand" "v"))]
   "TARGET_FLOAT128 && !TARGET_FLOAT128_HW"
   "xxlor %x0,%x1,%x2"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vec_logical")])
 
 ;; Float128 conversion functions.  These expand to library function calls.
 ;; We use expand to convert from IBM double double to IEEE 128-bit
@@ -7491,7 +7492,7 @@ (define_insn "p8_fmrgow_<mode>"
 			 UNSPEC_P8V_FMRGOW))]
   "!TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
   "fmrgow %0,%1,%2"
-  [(set_attr "type" "vecperm")])
+  [(set_attr "type" "fpsimple")])
 
 (define_insn "p8_mtvsrwz"
   [(set (match_operand:DF 0 "register_operand" "=d")
@@ -7743,9 +7744,9 @@ (define_insn "*movdi_internal32"
    #
    #"
   [(set_attr "type"
-               "store,     load,      *,         fpstore,   fpload,     fp,
-                *,         fpstore,   fpstore,   fpload,    fpload,     vecsimple,
-                vecsimple, vecsimple, vecsimple, vecsimple, vecsimple,  vecsimple,
+               "store,     load,      *,         fpstore,     fpload,       fpsimple,
+                *,         fpstore,   fpstore,   fpload,      fpload,       vec_logical,
+                vecsimple, vecsimple, vecsimple, vec_logical, vec_logical,  vecsimple,
                 vecsimple")])
 
 (define_split
@@ -7829,11 +7830,11 @@ (define_insn "*movdi_internal64"
    mfvsrd %0,%x1
    mtvsrd %x0,%1"
   [(set_attr "type"
-               "store,     load,      *,         *,         *,          *,
-                fpstore,   fpload,    fp,        fpstore,   fpstore,    fpload,
-                fpload,    vecsimple, vecsimple, vecsimple, vecsimple,  vecsimple,
-                vecsimple, vecsimple, vecsimple, mfjmpr,    mtjmpr,     *,
-                mftgpr,    mffgpr,    mftgpr,    mffgpr")
+               "store,       load,	  *,         *,         *,         *,
+                fpstore,     fpload,      fpsimple,  fpstore,   fpstore,   fpload,
+                fpload,      vec_logical, vecsimple, vecsimple, vecsimple, vec_logical,
+                vec_logical, vecsimple,   vecsimple, mfjmpr,    mtjmpr,    *,
+                mftgpr,      mffgpr,      mftgpr,    mffgpr")
 
    (set_attr "length"
                "4,         4,         4,         4,         4,          20,
@@ -13250,7 +13251,7 @@ (define_insn_and_split "pack<mode>"
   operands[3] = gen_rtx_REG (<FP128_64>mode, dest_hi);
   operands[4] = gen_rtx_REG (<FP128_64>mode, dest_lo);
 }
-  [(set_attr "type" "fp,fp")
+  [(set_attr "type" "fpsimple,fp")
    (set_attr "length" "4,8")])
 
 (define_insn "unpack<mode>"
@@ -13352,7 +13353,7 @@ (define_insn "copysign<mode>3_hard"
 	 UNSPEC_COPYSIGN))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
    "xscpsgnqp %0,%2,%1"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecmove")])
 
 (define_insn "copysign<mode>3_soft"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
@@ -13372,7 +13373,7 @@ (define_insn "neg<mode>2_hw"
 	 (match_operand:IEEE128 1 "altivec_register_operand" "v")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xsnegqp %0,%1"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecmove")])
 
 
 (define_insn "abs<mode>2_hw"
@@ -13381,7 +13382,7 @@ (define_insn "abs<mode>2_hw"
 	 (match_operand:IEEE128 1 "altivec_register_operand" "v")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xsabsqp %0,%1"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecmove")])
 
 
 (define_insn "*nabs<mode>2_hw"
@@ -13391,7 +13392,7 @@ (define_insn "*nabs<mode>2_hw"
 	  (match_operand:IEEE128 1 "altivec_register_operand" "v"))))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xsnabsqp %0,%1"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecmove")])
 
 ;; Initially don't worry about doing fusion
 (define_insn "*fma<mode>4_hw"
@@ -13461,7 +13462,7 @@ (define_insn_and_split "extendkftf2"
   emit_note (NOTE_INSN_DELETED);
   DONE;
 }
-  [(set_attr "type" "*,vecsimple")
+  [(set_attr "type" "*,vec_logical")
    (set_attr "length" "0,4")])
 
 (define_insn_and_split "trunctfkf2"
@@ -13477,7 +13478,7 @@ (define_insn_and_split "trunctfkf2"
   emit_note (NOTE_INSN_DELETED);
   DONE;
 }
-  [(set_attr "type" "*,vecsimple")
+  [(set_attr "type" "*,vec_logical")
    (set_attr "length" "0,4")])
 
 (define_insn "trunc<mode>df2_hw"
@@ -13613,7 +13614,7 @@ (define_insn "*ieee128_mfvsrd_64bit"
    mfvsrd %0,%x1
    stxsdx %x1,%y0
    xxlor %x0,%x1,%x1"
-  [(set_attr "type" "mftgpr,fpstore,vecsimple")])
+  [(set_attr "type" "mftgpr,fpstore,vec_logical")])
 
 
 (define_insn "*ieee128_mfvsrd_32bit"
@@ -13624,7 +13625,7 @@ (define_insn "*ieee128_mfvsrd_32bit"
   "@
    stxsdx %x1,%y0
    xxlor %x0,%x1,%x1"
-  [(set_attr "type" "fpstore,vecsimple")])
+  [(set_attr "type" "fpstore,vec_logical")])
 
 (define_insn "*ieee128_mfvsrwz"
   [(set (match_operand:SI 0 "reg_or_indexed_operand" "=r,Z")
@@ -13660,7 +13661,7 @@ (define_insn "*ieee128_mtvsrd_64bit"
    mtvsrd %x0,%1
    lxsdx %x0,%y1
    xxlor %x0,%x1,%x1"
-  [(set_attr "type" "mffgpr,fpload,vecsimple")])
+  [(set_attr "type" "mffgpr,fpload,vec_logical")])
 
 (define_insn "*ieee128_mtvsrd_32bit"
   [(set (match_operand:V2DI 0 "altivec_register_operand" "=v,v")
@@ -13670,7 +13671,7 @@ (define_insn "*ieee128_mtvsrd_32bit"
   "@
    lxsdx %x0,%y1
    xxlor %x0,%x1,%x1"
-  [(set_attr "type" "fpload,vecsimple")])
+  [(set_attr "type" "fpload,vec_logical")])
 
 ;; IEEE 128-bit instructions with round to odd semantics
 (define_insn "*trunc<mode>df2_odd"
Index: config/rs6000/a2.md
===================================================================
--- config/rs6000/a2.md	(revision 237783)
+++ config/rs6000/a2.md	(working copy)
@@ -81,7 +81,7 @@ (define_insn_reservation "ppca2-load" 5
 
 ;; D.8.1
 (define_insn_reservation "ppca2-fp" 6
-  (and (eq_attr "type" "fp")     	   ;; Ignore fpsimple insn types (SPE only).
+  (and (eq_attr "type" "fp,fpsimple")
        (eq_attr "cpu" "ppca2"))
   "axu")
 

* Re: [PATCH, rs6000] Scheduling update
  2016-06-27 14:58   ` Pat Haugen
@ 2016-06-27 20:45     ` Segher Boessenkool
  2016-06-27 21:30       ` Pat Haugen
  0 siblings, 1 reply; 8+ messages in thread
From: Segher Boessenkool @ 2016-06-27 20:45 UTC (permalink / raw)
  To: Pat Haugen; +Cc: GCC Patches, David Edelsohn

On Mon, Jun 27, 2016 at 09:54:09AM -0500, Pat Haugen wrote:
> On 06/22/2016 02:10 PM, Segher Boessenkool wrote:
> > The "power9_alu2" attribute is writing part of the scheduling description
> > inside the machine description proper.  Can this be reduced, maybe by
> > adding an attribute describing something about the insns that makes them
> > be handled by the alu2?  I realise it isn't all so regular :-(
>  
>  
> >> > +; 2 cycle FP ops
> >> > +(define_attr "power9_fp_2cyc" "no,yes"
> >> > +  (cond [(eq_attr "mnemonic" "fabs,fcpsgn,fmr,fmrgow,fnabs,fneg,\
> >> > +			      xsabsdp,xscpsgndp,xsnabsdp,xsnegdp,\
> >> > +			      xsabsqp,xscpsgnqp,xsnabsqp,xsnegqp")
> >> > +	 (const_string "yes")]
> >> > +        (const_string "no")))
> > Eww.  Can we have an attribute for the FP move instructions, instead?
> > Maybe a value "fpmove" for the "type", even?
> 
> The following patch adds new insn 'type' values that will be used for the Power9 patch to overcome the items listed above. There is no functional change to existing processor types. Bootstrap/regtested on powerpc64/powerpc64le with no new failures. Ok for trunk? Ok for backport to GCC 6 branch after successful bootstrap/regtest there?

Hi Pat,

>         * config/rs6000/rs6000.md ('type' attribute): Add
>         vec_logical,veccmp_fx,vec_extend,vecmove insn types.

Those names are a bit irregular (underscore vs. no underscore after "vec",
"extend" is called "exts" for integer, "vec_logical" holds no relation to
integer "logical").

That said...  If this makes the power9 scheduling patch better, okay
for trunk and 6 later.  So please wait for a review of that patch.

Thanks,


Segher

* Re: [PATCH, rs6000] Scheduling update
  2016-06-27 20:45     ` Segher Boessenkool
@ 2016-06-27 21:30       ` Pat Haugen
  2016-06-27 21:44         ` Segher Boessenkool
  0 siblings, 1 reply; 8+ messages in thread
From: Pat Haugen @ 2016-06-27 21:30 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches, David Edelsohn

On 06/27/2016 03:41 PM, Segher Boessenkool wrote:
>>         * config/rs6000/rs6000.md ('type' attribute): Add
>> >         vec_logical,veccmp_fx,vec_extend,vecmove insn types.
> Those names are a bit irregular (underscore vs. no underscore after "vec",
> "extend" is called "exts" for integer, "vec_logical" holds no relation to
> integer "logical").

I can remove the underscore to match existing types and change extend->exts. As for the vec_logical/integer logical point, not sure I'm understanding, these are the vector forms of and/or/xor/etc.

Thanks,
Pat

* Re: [PATCH, rs6000] Scheduling update
  2016-06-27 21:30       ` Pat Haugen
@ 2016-06-27 21:44         ` Segher Boessenkool
  0 siblings, 0 replies; 8+ messages in thread
From: Segher Boessenkool @ 2016-06-27 21:44 UTC (permalink / raw)
  To: Pat Haugen; +Cc: GCC Patches, David Edelsohn

On Mon, Jun 27, 2016 at 04:24:39PM -0500, Pat Haugen wrote:
> On 06/27/2016 03:41 PM, Segher Boessenkool wrote:
> >>         * config/rs6000/rs6000.md ('type' attribute): Add
> >> >         vec_logical,veccmp_fx,vec_extend,vecmove insn types.
> > Those names are a bit irregular (underscore vs. no underscore after "vec",
> > "extend" is called "exts" for integer, "vec_logical" holds no relation to
> > integer "logical").
> 
> I can remove the underscore to match existing types and change extend->exts. As for the vec_logical/integer logical point, not sure I'm understanding, these are the vector forms of and/or/xor/etc.

Oh!  So they were wrongly called "perm" before?  Ha.

If you can easily make the changes, please do, otherwise just postpone
it, we'll go over this a few more times anyway.


Segher

* Re: [PATCH, rs6000] Scheduling update
  2016-06-22 19:11 ` Segher Boessenkool
  2016-06-27 14:58   ` Pat Haugen
@ 2016-06-27 22:21   ` Pat Haugen
  2016-06-27 22:43     ` Segher Boessenkool
  1 sibling, 1 reply; 8+ messages in thread
From: Pat Haugen @ 2016-06-27 22:21 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches, David Edelsohn

[-- Attachment #1: Type: text/plain, Size: 5534 bytes --]

On 06/22/2016 02:10 PM, Segher Boessenkool wrote:
>> Index: config/rs6000/htm.md
>> ===================================================================
>> --- config/rs6000/htm.md	(revision 237621)
>> +++ config/rs6000/htm.md	(working copy)
>> @@ -72,7 +72,8 @@ (define_insn "*tabort"
>>     (set (match_operand:BLK 2) (unspec:BLK [(match_dup 2)] UNSPEC_HTM_FENCE))]
>>    "TARGET_HTM"
>>    "tabort. %0"
>> -  [(set_attr "type" "htm")
>> +  [(set_attr "type" "htmsimple")
>> +   (set_attr "power9_alu2" "yes")
>>     (set_attr "length" "4")])
> 
> What determines if an insn is htm or htmsimple?
> 
htm insns are cracked whereas htmsimple are not.


> 
>> +; Quad-precision FP ops, execute in DFU
>> +(define_attr "power9_qp" "no,yes"
>> +  (if_then_else (ior (match_operand:KF 0 "" "")
>> +                     (match_operand:TF 0 "" "")
>> +                     (match_operand:KF 1 "" "")
>> +                     (match_operand:TF 1 "" ""))
>> +                (const_string "yes")
>> +                (const_string "no")))
> 
> (The "" are not needed I think).
> 
> This perhaps could be better handled with the "size" attribute.
> 
The patch has been modified to annotate 128-bit FP insns with size '128', and the scheduling description now tests that attribute instead.


>> +(define_insn_reservation "power9-load-ext" 6
>> +  (and (eq_attr "type" "load")
>> +       (eq_attr "sign_extend" "yes")
>> +       (eq_attr "update" "no")
>> +       (eq_attr "cpu" "power9"))
>> +  "DU_C2_power9,LSU_power9")
> 
> So you do not describe the units used after the first cycle?  Why is
> that, to keep the size of the automaton down?
> 
Yes, I ran into problems with DFA state explosion when trying to list follow-on cycles/unit reservations.

> 
>> +(define_insn_reservation "power9-fpload-double" 4
>> +  (and (eq_attr "type" "fpload")
>> +       (eq_attr "update" "no")
>> +       (match_operand:DF 0 "" "")
>> +       (eq_attr "cpu" "power9"))
>> +  "DU_slice_3_power9,LSU_power9")
> 
> Using match_operand here is asking for trouble.  "size", and you can
> default that for "fpload" insns, and document there that it looks at the
> mode of operands[0] for fpload?
Handled with size '64' additions to fpload insns.


>>  {
>> +  int pos;
>> +  int i;
>> +  rtx_insn *tmp;
> 
> Moving these to an outer scope is really a step back.  The new code could
> just declare them itself; in fact, it should probably be a separate
> function anyway.
> 
Separate function created.
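
A minimal sketch of the shape of that refactoring, not the code from the patch. It assumes the ready list keeps the next insn to issue at its end (as the scheduler hooks do), and it models insns as a small enum instead of rtx_insn *. The names power9_reorder_sketch, sched_reorder2_sketch and insn_kind are hypothetical; only divide_cnt and the locals pos, i and tmp come from the thread.

```c
/* Hypothetical stand-in for insn types; the real hook works on rtx_insn *.  */
enum insn_kind { KIND_OTHER, KIND_DIV };

/* Pairing state kept at file scope, like divide_cnt in the ChangeLog.  */
static int divide_cnt;

/* Power9-specific helper.  pos, i and tmp are declared here, so the
   generic hook below no longer needs them in an outer scope.  */
static void
power9_reorder_sketch (enum insn_kind *ready, int n_ready,
                       enum insn_kind last_scheduled)
{
  int pos;
  int i;
  enum insn_kind tmp;
  int lastpos = n_ready - 1;

  if (last_scheduled == KIND_DIV && divide_cnt == 0)
    {
      /* First divide of a pair was just scheduled.  */
      divide_cnt = 1;

      /* Look for another divide and move it to the end of the ready
         list so that it is chosen next.  */
      for (pos = lastpos; pos >= 0; pos--)
        if (ready[pos] == KIND_DIV)
          {
            tmp = ready[pos];
            for (i = pos; i < lastpos; i++)
              ready[i] = ready[i + 1];
            ready[lastpos] = tmp;
            break;
          }
    }
  else
    /* Anything else ends the current pair.  */
    divide_cnt = 0;
}

/* Generic hook: it only dispatches to the CPU-specific helper.  */
static void
sched_reorder2_sketch (int is_power9, enum insn_kind *ready, int n_ready,
                       enum insn_kind last_scheduled)
{
  if (is_power9)
    power9_reorder_sketch (ready, n_ready, last_scheduled);
}
```

Keeping the scan in its own function means the loop variables live only where they are used, which is the point of the review comment above.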


The updated changelog and patch follow, including the additional coding style corrections you pointed out. The diff is against current trunk; I am currently bootstrapping/regtesting it on top of the other patch you already reviewed.

Thanks,
Pat


2016-06-27  Pat Haugen  <pthaugen@us.ibm.com>

        * config/rs6000/rs6000.md ('type' attribute): Add htmsimple/dfp types.
        ('size' attribute): Add '128'.
        Include power9.md.
        (*mov<mode>_hardfloat32, *mov<mode>_hardfloat64, *movdi_internal32,
        *movdi_internal64, *movdf_update1): Set size attribute to '64'.
        (add<mode>3, sub<mode>3, mul<mode>3, div<mode>3, sqrt<mode>2,
        copysign<mode>3, neg<mode>2_hw, abs<mode>2_hw, *nabs<mode>2_hw,
        *fma<mode>4_hw, *fms<mode>4_hw, *nfma<mode>4_hw, *nfms<mode>4_hw,
        extend<SFDF:mode><IEEE128:mode>2_hw, trunc<mode>df2_hw,
        *xscvqp<su>wz_<mode>, *xscvqp<su>dz_<mode>, *xscv<su>dqp_<mode>,
        *trunc<mode>df2_odd): Set size attribute to '128'.
        (*cmp<mode>_hw): Change type to veccmp and set size attribute to '128'.
        * config/rs6000/power6.md (power6-fp): Include dfp type.
        * config/rs6000/power7.md (power7-fp): Likewise.
        * config/rs6000/power8.md (power8-fp): Likewise.
        * config/rs6000/power9.md: New file.
        * config/rs6000/t-rs6000 (MD_INCLUDES): Add power9.md.
        * config/rs6000/htm.md (*tabort, *tabort<wd>c, *tabort<wd>ci,
        *trechkpt, *treclaim, *tsr, *ttest): Change type attribute to
        htmsimple.
        * config/rs6000/dfp.md (extendsddd2, truncddsd2, extendddtd2,
        trunctddd2, adddd3, addtd3, subdd3, subtd3, muldd3, multd3, divdd3,
        divtd3, *cmpdd_internal1, *cmptd_internal1, floatdidd2, floatditd2,
        ftruncdd2, fixdddi2, ftrunctd2, fixtddi2, dfp_ddedpd_<mode>,
        dfp_denbcd_<mode>, dfp_dxex_<mode>, dfp_diex_<mode>, dfp_dscli_<mode>,
        dfp_dscri_<mode>): Change type attribute to dfp.
        * config/rs6000/crypto.md (crypto_vshasigma<CR_char>): Change type
        attribute to vecsimple.
        * config/rs6000/rs6000.c (power9_cost): Update costs, cache size
        and prefetch streams.
        (rs6000_option_override_internal): Remove temporary code setting
        tuning to power8.  Don't set rs6000_sched_groups for power9.
        (last_scheduled_insn): Change to rtx_insn *.
        (divide_cnt, vec_load_pendulum): New variables.
        (rs6000_adjust_cost): Add Power9 to test for store->load separation.
        (rs6000_issue_rate): Set issue rate for Power9.
        (is_power9_pairable_vec_type): New.
        (power9_sched_reorder2): New.
        (rs6000_sched_reorder2): Call new function for Power9 specific
        reordering.
        (insn_must_be_first_in_group): Remove Power9.
        (insn_must_be_last_in_group): Likewise.
        (force_new_group): Likewise.
        (rs6000_sched_init): Fix initialization of last_scheduled_insn.
        Initialize divide_cnt/vec_load_pendulum.
        (_rs6000_sched_context, rs6000_init_sched_context,
        rs6000_set_sched_context): Handle context save/restore of new
        variables.


[-- Attachment #2: p9_scheduling.diff --]
[-- Type: text/x-patch, Size: 52609 bytes --]

Index: config/rs6000/power8.md
===================================================================
--- config/rs6000/power8.md	(revision 237621)
+++ config/rs6000/power8.md	(working copy)
@@ -317,7 +317,7 @@ (define_bypass 4 "power8-branch" "power8
 
 ; VS Unit (includes FP/VSX/VMX/DFP/Crypto)
 (define_insn_reservation "power8-fp" 6
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,dmul,dfp")
        (eq_attr "cpu" "power8"))
   "DU_any_power8,VSU_power8")
 
Index: config/rs6000/power6.md
===================================================================
--- config/rs6000/power6.md	(revision 237621)
+++ config/rs6000/power6.md	(working copy)
@@ -500,7 +500,7 @@ (define_insn_reservation "power6-mtcr" 4
 (define_bypass 9 "power6-mtcr" "power6-branch")
 
 (define_insn_reservation "power6-fp" 6
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,dmul,dfp")
        (eq_attr "cpu" "power6"))
   "FPU_power6")
 
Index: config/rs6000/htm.md
===================================================================
--- config/rs6000/htm.md	(revision 237621)
+++ config/rs6000/htm.md	(working copy)
@@ -72,7 +72,7 @@ (define_insn "*tabort"
    (set (match_operand:BLK 2) (unspec:BLK [(match_dup 2)] UNSPEC_HTM_FENCE))]
   "TARGET_HTM"
   "tabort. %0"
-  [(set_attr "type" "htm")
+  [(set_attr "type" "htmsimple")
    (set_attr "length" "4")])
 
 (define_expand "tabort<wd>c"
@@ -98,7 +98,7 @@ (define_insn "*tabort<wd>c"
    (set (match_operand:BLK 4) (unspec:BLK [(match_dup 4)] UNSPEC_HTM_FENCE))]
   "TARGET_HTM"
   "tabort<wd>c. %0,%1,%2"
-  [(set_attr "type" "htm")
+  [(set_attr "type" "htmsimple")
    (set_attr "length" "4")])
 
 (define_expand "tabort<wd>ci"
@@ -124,7 +124,7 @@ (define_insn "*tabort<wd>ci"
    (set (match_operand:BLK 4) (unspec:BLK [(match_dup 4)] UNSPEC_HTM_FENCE))]
   "TARGET_HTM"
   "tabort<wd>ci. %0,%1,%2"
-  [(set_attr "type" "htm")
+  [(set_attr "type" "htmsimple")
    (set_attr "length" "4")])
 
 (define_expand "tbegin"
@@ -208,7 +208,7 @@ (define_insn "*trechkpt"
    (set (match_operand:BLK 1) (unspec:BLK [(match_dup 1)] UNSPEC_HTM_FENCE))]
   "TARGET_HTM"
   "trechkpt."
-  [(set_attr "type" "htm")
+  [(set_attr "type" "htmsimple")
    (set_attr "length" "4")])
 
 (define_expand "treclaim"
@@ -230,7 +230,7 @@ (define_insn "*treclaim"
    (set (match_operand:BLK 2) (unspec:BLK [(match_dup 2)] UNSPEC_HTM_FENCE))]
   "TARGET_HTM"
   "treclaim. %0"
-  [(set_attr "type" "htm")
+  [(set_attr "type" "htmsimple")
    (set_attr "length" "4")])
 
 (define_expand "tsr"
@@ -252,7 +252,7 @@ (define_insn "*tsr"
    (set (match_operand:BLK 2) (unspec:BLK [(match_dup 2)] UNSPEC_HTM_FENCE))]
   "TARGET_HTM"
   "tsr. %0"
-  [(set_attr "type" "htm")
+  [(set_attr "type" "htmsimple")
    (set_attr "length" "4")])
 
 (define_expand "ttest"
@@ -272,7 +272,7 @@ (define_insn "*ttest"
    (set (match_operand:BLK 1) (unspec:BLK [(match_dup 1)] UNSPEC_HTM_FENCE))]
   "TARGET_HTM"
   "tabortwci. 0,1,0"
-  [(set_attr "type" "htm")
+  [(set_attr "type" "htmsimple")
    (set_attr "length" "4")])
 
 (define_insn "htm_mfspr_<mode>"
Index: config/rs6000/power9.md
===================================================================
--- config/rs6000/power9.md	(revision 0)
+++ config/rs6000/power9.md	(revision 0)
@@ -0,0 +1,477 @@
+;; Scheduling description for IBM POWER9 processor.
+;; Copyright (C) 2016 Free Software Foundation, Inc.
+;;
+;; Contributed by Pat Haugen (pthaugen@us.ibm.com).
+
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_automaton "power9dsp,power9lsu,power9vsu,power9misc")
+
+(define_cpu_unit "lsu0_power9,lsu1_power9,lsu2_power9,lsu3_power9" "power9lsu")
+(define_cpu_unit "vsu0_power9,vsu1_power9,vsu2_power9,vsu3_power9" "power9vsu")
+; Two vector permute units, part of vsu
+(define_cpu_unit "prm0_power9,prm1_power9" "power9vsu")
+; Two fixed point divide units, not pipelined
+(define_cpu_unit "fx_div0_power9,fx_div1_power9" "power9misc")
+(define_cpu_unit "bru_power9,cryptu_power9,dfu_power9" "power9misc")
+
+(define_cpu_unit "x0_power9,x1_power9,xa0_power9,xa1_power9,
+		  x2_power9,x3_power9,xb0_power9,xb1_power9,
+		  br0_power9,br1_power9" "power9dsp")
+
+
+; Dispatch port reservations
+;
+; Power9 can dispatch a maximum of 6 iops per cycle with the following
+; general restrictions (other restrictions also apply):
+;   1) At most 2 iops per execution slice
+;   2) At most 2 iops to the branch unit
+; Note that insn position in a dispatch group of 6 insns does not infer which
+; execution slice the insn is routed to.  The units are used to infer the
+; conflicts that exist (i.e. an 'even' requirement will preclude dispatch
+; with 2 insns with 'superslice' requirement).
+
+; The xa0/xa1 units really represent the 3rd dispatch port for a superslice but
+; are listed as separate units to allow those insns that preclude its use to
+; still be scheduled two to a superslice while reserving the 3rd slot.  The
+; same applies for xb0/xb1.
+(define_reservation "DU_xa_power9" "xa0_power9+xa1_power9")
+(define_reservation "DU_xb_power9" "xb0_power9+xb1_power9")
+
+; Any execution slice dispatch
+(define_reservation "DU_any_power9"
+		    "x0_power9|x1_power9|DU_xa_power9|x2_power9|x3_power9|
+		     DU_xb_power9")
+
+; Even slice, actually takes even/odd slots
+(define_reservation "DU_even_power9" "x0_power9+x1_power9|x2_power9+x3_power9")
+
+; Slice plus 3rd slot
+(define_reservation "DU_slice_3_power9"
+		    "x0_power9+xa0_power9|x1_power9+xa1_power9|
+		     x2_power9+xb0_power9|x3_power9+xb1_power9")
+
+; Superslice
+(define_reservation "DU_super_power9"
+		    "x0_power9+x1_power9|x2_power9+x3_power9")
+
+; 2-way cracked
+(define_reservation "DU_C2_power9" "x0_power9+x1_power9|
+				    x1_power9+DU_xa_power9|
+				    x1_power9+x2_power9|
+				    DU_xa_power9+x2_power9|
+				    x2_power9+x3_power9|
+				    x3_power9+DU_xb_power9")
+
+; 2-way cracked plus 3rd slot
+(define_reservation "DU_C2_3_power9" "x0_power9+x1_power9+xa0_power9|
+				      x1_power9+x2_power9+xa0_power9|
+				      x1_power9+x2_power9+xb0_power9|
+				      x2_power9+x3_power9+xb0_power9")
+
+; 3-way cracked (consumes whole decode/dispatch cycle)
+(define_reservation "DU_C3_power9"
+		    "x0_power9+x1_power9+xa0_power9+xa1_power9+x2_power9+
+		     x3_power9+xb0_power9+xb1_power9+br0_power9+br1_power9")
+
+; Branch ports
+(define_reservation "DU_branch_power9" "br0_power9|br1_power9")
+
+
+; Execution unit reservations
+(define_reservation "LSU_power9"
+		    "lsu0_power9|lsu1_power9|lsu2_power9|lsu3_power9")
+
+(define_reservation "LSU_pair_power9"
+		    "lsu0_power9+lsu1_power9|lsu1_power9+lsu2_power9|
+		     lsu2_power9+lsu3_power9|lsu3_power9+lsu0_power9")
+
+(define_reservation "VSU_power9"
+		    "vsu0_power9|vsu1_power9|vsu2_power9|vsu3_power9")
+
+(define_reservation "VSU_super_power9"
+		    "vsu0_power9+vsu1_power9|vsu2_power9+vsu3_power9")
+
+(define_reservation "VSU_PRM_power9" "prm0_power9|prm1_power9")
+
+
+; LS Unit
+(define_insn_reservation "power9-load" 4
+  (and (eq_attr "type" "load")
+       (eq_attr "sign_extend" "no")
+       (eq_attr "update" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,LSU_power9")
+
+(define_insn_reservation "power9-load-update" 4
+  (and (eq_attr "type" "load")
+       (eq_attr "sign_extend" "no")
+       (eq_attr "update" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_power9,LSU_power9+VSU_power9")
+
+(define_insn_reservation "power9-load-ext" 6
+  (and (eq_attr "type" "load")
+       (eq_attr "sign_extend" "yes")
+       (eq_attr "update" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_power9,LSU_power9")
+
+(define_insn_reservation "power9-load-ext-update" 6
+  (and (eq_attr "type" "load")
+       (eq_attr "sign_extend" "yes")
+       (eq_attr "update" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_C3_power9,LSU_power9+VSU_power9")
+
+(define_insn_reservation "power9-fpload-double" 4
+  (and (eq_attr "type" "fpload")
+       (eq_attr "update" "no")
+       (eq_attr "size" "64")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,LSU_power9")
+
+(define_insn_reservation "power9-fpload-update-double" 4
+  (and (eq_attr "type" "fpload")
+       (eq_attr "update" "yes")
+       (eq_attr "size" "64")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_3_power9,LSU_power9+VSU_power9")
+
+; SFmode loads are cracked and have 2 additional cycles of latency over DFmode
+(define_insn_reservation "power9-fpload-single" 6
+  (and (eq_attr "type" "fpload")
+       (eq_attr "update" "no")
+       (eq_attr "size" "32")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_3_power9,LSU_power9")
+
+(define_insn_reservation "power9-fpload-update-single" 6
+  (and (eq_attr "type" "fpload")
+       (eq_attr "update" "yes")
+       (eq_attr "size" "32")
+       (eq_attr "cpu" "power9"))
+  "DU_C3_power9,LSU_power9+VSU_power9")
+
+(define_insn_reservation "power9-vecload" 5
+  (and (eq_attr "type" "vecload")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,LSU_pair_power9")
+
+; Store data can issue 2 cycles after AGEN issue, 3 cycles for vector store
+(define_insn_reservation "power9-store" 0
+  (and (eq_attr "type" "store")
+       (eq_attr "update" "no")
+       (eq_attr "indexed" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,LSU_power9")
+
+(define_insn_reservation "power9-store-indexed" 0
+  (and (eq_attr "type" "store")
+       (eq_attr "update" "no")
+       (eq_attr "indexed" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,LSU_power9")
+
+; Update forms have 2 cycle latency for updated addr reg
+(define_insn_reservation "power9-store-update" 2
+  (and (eq_attr "type" "store")
+       (eq_attr "update" "yes")
+       (eq_attr "indexed" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_3_power9,LSU_power9+VSU_power9")
+
+; Update forms have 2 cycle latency for updated addr reg
+(define_insn_reservation "power9-store-update-indexed" 2
+  (and (eq_attr "type" "store")
+       (eq_attr "update" "yes")
+       (eq_attr "indexed" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_3_power9,LSU_power9+VSU_power9")
+
+(define_insn_reservation "power9-fpstore" 0
+  (and (eq_attr "type" "fpstore")
+       (eq_attr "update" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,LSU_power9")
+
+; Update forms have 2 cycle latency for updated addr reg
+(define_insn_reservation "power9-fpstore-update" 2
+  (and (eq_attr "type" "fpstore")
+       (eq_attr "update" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_3_power9,LSU_power9+VSU_power9")
+
+(define_insn_reservation "power9-vecstore" 0
+  (and (eq_attr "type" "vecstore")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,LSU_pair_power9")
+
+(define_insn_reservation "power9-larx" 4
+  (and (eq_attr "type" "load_l")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,LSU_power9")
+
+(define_insn_reservation "power9-stcx" 2
+  (and (eq_attr "type" "store_c")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_3_power9,LSU_power9+VSU_power9")
+
+(define_insn_reservation "power9-sync" 4
+  (and (eq_attr "type" "sync,isync")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,LSU_power9")
+
+
+; VSU Execution Unit
+
+; Fixed point ops
+
+; Most ALU insns are simple 2 cycle, including record form
+(define_insn_reservation "power9-alu" 2
+  (and (ior (eq_attr "type" "add,cmp,exts,integer,logical,isel")
+	    (and (eq_attr "type" "insert,shift")
+		 (eq_attr "dot" "no")))
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+; Record form rotate/shift are cracked
+(define_insn_reservation "power9-cracked-alu" 2
+  (and (eq_attr "type" "insert,shift")
+       (eq_attr "dot" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_power9,VSU_power9")
+; 4 cycle CR latency
+(define_bypass 4 "power9-cracked-alu"
+		 "power9-crlogical,power9-mfcr,power9-mfcrf,power9-branch")
+
+(define_insn_reservation "power9-alu2" 3
+  (and (eq_attr "type" "cntlz,popcnt,trap")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+; Treat 'two' and 'three' types as 2 or 3 way cracked
+(define_insn_reservation "power9-two" 4
+  (and (eq_attr "type" "two")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_power9,VSU_power9")
+
+(define_insn_reservation "power9-three" 6
+  (and (eq_attr "type" "three")
+       (eq_attr "cpu" "power9"))
+  "DU_C3_power9,VSU_power9")
+
+(define_insn_reservation "power9-mul" 4
+  (and (eq_attr "type" "mul")
+       (eq_attr "dot" "no")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+(define_insn_reservation "power9-mul-compare" 4
+  (and (eq_attr "type" "mul")
+       (eq_attr "dot" "yes")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_power9,VSU_power9")
+; 6 cycle CR latency
+(define_bypass 6 "power9-mul-compare"
+		 "power9-crlogical,power9-mfcr,power9-mfcrf,power9-branch")
+
+; Fixed point divides reserve the divide units for a minimum of 8 cycles
+(define_insn_reservation "power9-idiv" 16
+  (and (eq_attr "type" "div")
+       (eq_attr "size" "32")
+       (eq_attr "cpu" "power9"))
+  "DU_even_power9,fx_div0_power9*8|fx_div1_power9*8")
+
+(define_insn_reservation "power9-ldiv" 24
+  (and (eq_attr "type" "div")
+       (eq_attr "size" "64")
+       (eq_attr "cpu" "power9"))
+  "DU_even_power9,fx_div0_power9*8|fx_div1_power9*8")
+
+(define_insn_reservation "power9-crlogical" 2
+  (and (eq_attr "type" "cr_logical,delayed_cr")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+(define_insn_reservation "power9-mfcrf" 2
+  (and (eq_attr "type" "mfcrf")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+(define_insn_reservation "power9-mfcr" 6
+  (and (eq_attr "type" "mfcr")
+       (eq_attr "cpu" "power9"))
+  "DU_C3_power9,VSU_power9")
+
+; Should differentiate between 1 CR field and > 1, since a move that targets
+; more than one CR field is cracked
+(define_insn_reservation "power9-mtcr" 2
+  (and (eq_attr "type" "mtcr")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+; Moves to LR/CTR are executed in the VSU
+(define_insn_reservation "power9-mtjmpr" 5
+  (and (eq_attr "type" "mtjmpr")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+; Floating point/Vector ops
+(define_insn_reservation "power9-fpsimple" 2
+  (and (eq_attr "type" "fpsimple")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+(define_insn_reservation "power9-fp" 7
+  (and (eq_attr "type" "fp,dmul")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+(define_insn_reservation "power9-fpcompare" 3
+  (and (eq_attr "type" "fpcompare")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+; FP div/sqrt are executed in the VSU slices.  They are not pipelined wrt other
+; divide insns, but for the most part do not block pipelined ops.
+(define_insn_reservation "power9-sdiv" 22
+  (and (eq_attr "type" "sdiv")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+(define_insn_reservation "power9-ddiv" 33
+  (and (eq_attr "type" "ddiv")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+(define_insn_reservation "power9-sqrt" 26
+  (and (eq_attr "type" "ssqrt")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+(define_insn_reservation "power9-dsqrt" 36
+  (and (eq_attr "type" "dsqrt")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+(define_insn_reservation "power9-vec-2cyc" 2
+  (and (eq_attr "type" "vecmove,vec_logical,vec_extend,veccmp_fx")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+(define_insn_reservation "power9-veccmp" 3
+  (and (eq_attr "type" "veccmp")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+(define_insn_reservation "power9-vecsimple" 3
+  (and (eq_attr "type" "vecsimple")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+(define_insn_reservation "power9-vecnormal" 7
+  (and (eq_attr "type" "vecfloat,vecdouble")
+       (eq_attr "size" "!128")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+; Quad-precision FP ops, execute in DFU
+(define_insn_reservation "power9-qp" 12
+  (and (eq_attr "type" "vecfloat,vecdouble")
+       (eq_attr "size" "128")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,dfu_power9")
+
+(define_insn_reservation "power9-vecperm" 3
+  (and (eq_attr "type" "vecperm")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_PRM_power9")
+
+(define_insn_reservation "power9-veccomplex" 7
+  (and (eq_attr "type" "veccomplex")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+(define_insn_reservation "power9-vecfdiv" 28
+  (and (eq_attr "type" "vecfdiv")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+(define_insn_reservation "power9-vecdiv" 32
+  (and (eq_attr "type" "vecdiv")
+       (eq_attr "size" "!128")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,VSU_super_power9")
+
+(define_insn_reservation "power9-qpdiv" 56
+  (and (eq_attr "type" "vecdiv")
+       (eq_attr "size" "128")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,dfu_power9")
+
+(define_insn_reservation "power9-mffgpr" 2
+  (and (eq_attr "type" "mffgpr")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+(define_insn_reservation "power9-mftgpr" 2
+  (and (eq_attr "type" "mftgpr")
+       (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
+
+; Branch Unit
+; Moves from LR/CTR are executed in the BRU but consume a writeback port from
+; an execution slice.
+(define_insn_reservation "power9-mfjmpr" 6
+  (and (eq_attr "type" "mfjmpr")
+       (eq_attr "cpu" "power9"))
+  "DU_branch_power9,bru_power9+VSU_power9")
+
+; Branch is 2 cycles
+(define_insn_reservation "power9-branch" 2
+  (and (eq_attr "type" "jmpreg,branch")
+       (eq_attr "cpu" "power9"))
+  "DU_branch_power9,bru_power9")
+
+
+; Crypto Unit
+(define_insn_reservation "power9-crypto" 6
+  (and (eq_attr "type" "crypto")
+       (eq_attr "cpu" "power9"))
+  "DU_super_power9,cryptu_power9")
+
+
+; HTM Unit
+(define_insn_reservation "power9-htm" 4
+  (and (eq_attr "type" "htm")
+       (eq_attr "cpu" "power9"))
+  "DU_C2_power9,LSU_power9")
+
+(define_insn_reservation "power9-htm-simple" 2
+  (and (eq_attr "type" "htmsimple")
+       (eq_attr "cpu" "power9"))
+  "DU_any_power9,VSU_power9")
+
+
+; DFP Unit
+(define_insn_reservation "power9-dfp" 12
+  (and (eq_attr "type" "dfp")
+       (eq_attr "cpu" "power9"))
+  "DU_even_power9,dfu_power9")
+
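The two define_bypass entries in power9.md give record-form producers a longer latency when the consumer reads the CR result. As a rough illustration only, that lookup can be modelled in C as below. This is a sketch and not how GCC's generated automaton is implemented: the enum and the toy_* function names are hypothetical, and only the latency values (2, 4 and 6 cycles) are taken from the reservations in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tags for a few of the power9.md reservations.  */
enum toy_resv
{
  TOY_ALU, TOY_CRACKED_ALU, TOY_MUL_COMPARE,
  TOY_CRLOGICAL, TOY_MFCR, TOY_MFCRF, TOY_BRANCH
};

/* True for the consumers named in both define_bypass entries.  */
static bool
toy_reads_cr (enum toy_resv consumer)
{
  return (consumer == TOY_CRLOGICAL || consumer == TOY_MFCR
          || consumer == TOY_MFCRF || consumer == TOY_BRANCH);
}

/* Default latency, as given on the define_insn_reservation line.  */
static int
toy_default_latency (enum toy_resv producer)
{
  switch (producer)
    {
    case TOY_MUL_COMPARE:
      return 4;
    case TOY_MFCR:
      return 6;
    default:
      return 2;
    }
}

/* Latency between PRODUCER and CONSUMER: a matching bypass replaces the
   default latency, otherwise the default applies.  */
static int
toy_latency (enum toy_resv producer, enum toy_resv consumer)
{
  int base = toy_default_latency (producer);
  int lat = base;

  if (toy_reads_cr (consumer))
    {
      if (producer == TOY_CRACKED_ALU)
        lat = 4;        /* "4 cycle CR latency" */
      else if (producer == TOY_MUL_COMPARE)
        lat = 6;        /* "6 cycle CR latency" */
    }

  /* Both bypasses in this patch lengthen the latency.  */
  assert (lat >= base);
  return lat;
}
```

For example, toy_latency (TOY_MUL_COMPARE, TOY_ALU) stays at the default 4 cycles, while toy_latency (TOY_MUL_COMPARE, TOY_BRANCH) uses the 6 cycle CR latency.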
Index: config/rs6000/t-rs6000
===================================================================
--- config/rs6000/t-rs6000	(revision 237621)
+++ config/rs6000/t-rs6000	(working copy)
@@ -50,6 +50,7 @@ MD_INCLUDES = $(srcdir)/config/rs6000/rs
 	$(srcdir)/config/rs6000/power6.md \
 	$(srcdir)/config/rs6000/power7.md \
 	$(srcdir)/config/rs6000/power8.md \
+	$(srcdir)/config/rs6000/power9.md \
 	$(srcdir)/config/rs6000/cell.md \
 	$(srcdir)/config/rs6000/xfpu.md \
 	$(srcdir)/config/rs6000/a2.md \
Index: config/rs6000/power7.md
===================================================================
--- config/rs6000/power7.md	(revision 237621)
+++ config/rs6000/power7.md	(working copy)
@@ -292,7 +292,7 @@ (define_insn_reservation "power7-branch"
 
 ; VS Unit (includes FP/VSX/VMX/DFP)
 (define_insn_reservation "power7-fp" 6
-  (and (eq_attr "type" "fp,dmul")
+  (and (eq_attr "type" "fp,dmul,dfp")
        (eq_attr "cpu" "power7"))
   "DU_power7,VSU_power7")
 
Index: config/rs6000/rs6000.c
===================================================================
--- config/rs6000/rs6000.c	(revision 237621)
+++ config/rs6000/rs6000.c	(working copy)
@@ -1104,16 +1104,16 @@ struct processor_costs power9_cost = {
   COSTS_N_INSNS (3),	/* mulsi_const */
   COSTS_N_INSNS (3),	/* mulsi_const9 */
   COSTS_N_INSNS (3),	/* muldi */
-  COSTS_N_INSNS (19),	/* divsi */
-  COSTS_N_INSNS (35),	/* divdi */
+  COSTS_N_INSNS (8),	/* divsi */
+  COSTS_N_INSNS (12),	/* divdi */
   COSTS_N_INSNS (3),	/* fp */
   COSTS_N_INSNS (3),	/* dmul */
-  COSTS_N_INSNS (14),	/* sdiv */
-  COSTS_N_INSNS (17),	/* ddiv */
+  COSTS_N_INSNS (13),	/* sdiv */
+  COSTS_N_INSNS (18),	/* ddiv */
   128,			/* cache line size */
   32,			/* l1 cache */
-  256,			/* l2 cache */
-  12,			/* prefetch streams */
+  512,			/* l2 cache */
+  8,			/* prefetch streams */
   COSTS_N_INSNS (3),	/* SF->DF convert */
 };
 
@@ -3841,22 +3841,7 @@ rs6000_option_override_internal (bool gl
   if (rs6000_tune_index >= 0)
     tune_index = rs6000_tune_index;
   else if (have_cpu)
-    {
-      /* Until power9 tuning is available, use power8 tuning if -mcpu=power9.  */
-      if (processor_target_table[cpu_index].processor != PROCESSOR_POWER9)
-	rs6000_tune_index = tune_index = cpu_index;
-      else
-	{
-	  size_t i;
-	  tune_index = -1;
-	  for (i = 0; i < ARRAY_SIZE (processor_target_table); i++)
-	    if (processor_target_table[i].processor == PROCESSOR_POWER8)
-	      {
-		rs6000_tune_index = tune_index = i;
-		break;
-	      }
-	}
-    }
+    rs6000_tune_index = tune_index = cpu_index;
   else
     {
       size_t i;
@@ -4636,8 +4621,7 @@ rs6000_option_override_internal (bool gl
   rs6000_sched_groups = (rs6000_cpu == PROCESSOR_POWER4
 			 || rs6000_cpu == PROCESSOR_POWER5
 			 || rs6000_cpu == PROCESSOR_POWER7
-			 || rs6000_cpu == PROCESSOR_POWER8
-			 || rs6000_cpu == PROCESSOR_POWER9);
+			 || rs6000_cpu == PROCESSOR_POWER8);
   rs6000_align_branch_targets = (rs6000_cpu == PROCESSOR_POWER4
 				 || rs6000_cpu == PROCESSOR_POWER5
 				 || rs6000_cpu == PROCESSOR_POWER6
@@ -29825,13 +29809,20 @@ output_function_profiler (FILE *file, in
 
 /* The following variable value is the last issued insn.  */
 
-static rtx last_scheduled_insn;
+static rtx_insn *last_scheduled_insn;
 
 /* The following variable helps to balance issuing of load and
    store instructions */
 
 static int load_store_pendulum;
 
+/* The following variable helps pair divide insns during scheduling.  */
+static int divide_cnt;
+/* The following variable helps pair and alternate vector and vector load
+   insns during scheduling.  */
+static int vec_load_pendulum;
+
+
 /* Power4 load update and store update instructions are cracked into a
    load or store and an integer insn which are executed in the same cycle.
    Branches have their own dispatch slot which does not count against the
@@ -29906,7 +29897,7 @@ rs6000_adjust_cost (rtx_insn *insn, rtx 
 	   some cycles later.  */
 
 	/* Separate a load from a narrower, dependent store.  */
-	if (rs6000_sched_groups
+	if ((rs6000_sched_groups || rs6000_cpu_attr == CPU_POWER9)
 	    && GET_CODE (PATTERN (insn)) == SET
 	    && GET_CODE (PATTERN (dep_insn)) == SET
 	    && GET_CODE (XEXP (PATTERN (insn), 1)) == MEM
@@ -30144,6 +30135,8 @@ rs6000_adjust_cost (rtx_insn *insn, rtx 
               break;
             }
         }
+      /* Fall through, no cost for output dependency.  */
+
     case REG_DEP_ANTI:
       /* Anti dependency; DEP_INSN reads a register that INSN writes some
 	 cycles later.  */
@@ -30516,8 +30509,9 @@ rs6000_issue_rate (void)
   case CPU_POWER7:
     return 5;
   case CPU_POWER8:
-  case CPU_POWER9:
     return 7;
+  case CPU_POWER9:
+    return 6;
   default:
     return 1;
   }
@@ -30675,6 +30669,28 @@ is_store_insn (rtx insn, rtx *str_mem)
   return is_store_insn1 (PATTERN (insn), str_mem);
 }
 
+/* Return whether TYPE is a Power9 pairable vector instruction type.  */
+
+static bool
+is_power9_pairable_vec_type (enum attr_type type)
+{
+  switch (type)
+    {
+      case TYPE_VECSIMPLE:
+      case TYPE_VECCOMPLEX:
+      case TYPE_VECDIV:
+      case TYPE_VECCMP:
+      case TYPE_VECPERM:
+      case TYPE_VECFLOAT:
+      case TYPE_VECFDIV:
+      case TYPE_VECDOUBLE:
+	return true;
+      default:
+	break;
+    }
+  return false;
+}
+
 /* Returns whether the dependence between INSN and NEXT is considered
    costly by the given target.  */
 
@@ -30751,6 +30767,229 @@ get_next_active_insn (rtx_insn *insn, rt
   return insn;
 }
 
+/* Do Power9 specific sched_reorder2 reordering of ready list.  */
+
+static int
+power9_sched_reorder2 (rtx_insn **ready, int lastpos)
+{
+  int pos;
+  int i;
+  rtx_insn *tmp;
+  enum attr_type type;
+
+  type = get_attr_type (last_scheduled_insn);
+
+  /* Try to issue fixed point divides back-to-back in pairs so they will be
+     routed to separate execution units and execute in parallel.  */
+  if (type == TYPE_DIV && divide_cnt == 0)
+    {
+      /* First divide has been scheduled.  */
+      divide_cnt = 1;
+
+      /* Scan the ready list looking for another divide, if found move it
+	 to the end of the list so it is chosen next.  */
+      pos = lastpos;
+      while (pos >= 0)
+	{
+	  if (recog_memoized (ready[pos]) >= 0
+	      && get_attr_type (ready[pos]) == TYPE_DIV)
+	    {
+	      tmp = ready[pos];
+	      for (i = pos; i < lastpos; i++)
+		ready[i] = ready[i + 1];
+	      ready[lastpos] = tmp;
+	      break;
+	    }
+	  pos--;
+	}
+    }
+  else
+    {
+      /* Last insn was the 2nd divide or not a divide, reset the counter.  */
+      divide_cnt = 0;
+
+      /* Power9 can execute 2 vector operations and 2 vector loads in a single
+	 cycle.  So try to pair up and alternate groups of vector and vector
+	 load instructions.
+
+	 To aid this formation, a counter is maintained to keep track of
+	 vec/vecload insns issued.  The value of vec_load_pendulum maintains
+	 the current state with the following values:
+
+	     0  : Initial state, no vec/vecload group has been started.
+
+	     -1 : 1 vector load has been issued and another has been found on
+		  the ready list and moved to the end.
+
+	     -2 : 2 vector loads have been issued and a vector operation has
+		  been found and moved to the end of the ready list.
+
+	     -3 : 2 vector loads and a vector insn have been issued and a
+		  vector operation has been found and moved to the end of the
+		  ready list.
+
+	     1  : 1 vector insn has been issued and another has been found and
+		  moved to the end of the ready list.
+
+	     2  : 2 vector insns have been issued and a vector load has been
+		  found and moved to the end of the ready list.
+
+	     3  : 2 vector insns and a vector load have been issued and another
+		  vector load has been found and moved to the end of the ready
+		  list.  */
+      if (type == TYPE_VECLOAD)
+	{
+	  /* Issued a vecload.  */
+	  if (vec_load_pendulum == 0)
+	    {
+	      /* We issued a single vecload, look for another and move it to
+		 the end of the ready list so it will be scheduled next.
+		 Set pendulum if found.  */
+	      pos = lastpos;
+	      while (pos >= 0)
+		{
+		  if (recog_memoized (ready[pos]) >= 0
+		      && get_attr_type (ready[pos]) == TYPE_VECLOAD)
+		    {
+		      tmp = ready[pos];
+		      for (i = pos; i < lastpos; i++)
+			ready[i] = ready[i + 1];
+		      ready[lastpos] = tmp;
+		      vec_load_pendulum = -1;
+		      return cached_can_issue_more;
+		    }
+		  pos--;
+		}
+	    }
+	  else if (vec_load_pendulum == -1)
+	    {
+	      /* This is the second vecload we've issued, search the ready
+		 list for a vector operation so we can try to schedule a
+		 pair of those next.  If found move to the end of the ready
+		 list so it is scheduled next and set the pendulum.  */
+	      pos = lastpos;
+	      while (pos >= 0)
+		{
+		  if (recog_memoized (ready[pos]) >= 0
+		      && is_power9_pairable_vec_type (
+			   get_attr_type (ready[pos])))
+		    {
+		      tmp = ready[pos];
+		      for (i = pos; i < lastpos; i++)
+			ready[i] = ready[i + 1];
+		      ready[lastpos] = tmp;
+		      vec_load_pendulum = -2;
+		      return cached_can_issue_more;
+		    }
+		  pos--;
+		}
+	    }
+	  else if (vec_load_pendulum == 2)
+	    {
+	      /* Two vector ops have been issued and we've just issued a
+		 vecload, look for another vecload and move to end of ready
+		 list if found.  */
+	      pos = lastpos;
+	      while (pos >= 0)
+		{
+		  if (recog_memoized (ready[pos]) >= 0
+		      && get_attr_type (ready[pos]) == TYPE_VECLOAD)
+		    {
+		      tmp = ready[pos];
+		      for (i = pos; i < lastpos; i++)
+			ready[i] = ready[i + 1];
+		      ready[lastpos] = tmp;
+		      /* Set pendulum so that next vecload will be seen as
+			 finishing a group, not start of one.  */
+		      vec_load_pendulum = 3;
+		      return cached_can_issue_more;
+		    }
+		  pos--;
+		}
+	    }
+	}
+      else if (is_power9_pairable_vec_type (type))
+	{
+	  /* Issued a vector operation.  */
+	  if (vec_load_pendulum == 0)
+	    {
+	      /* We issued a single vec op, look for another and move it
+		 to the end of the ready list so it will be scheduled next.
+		 Set pendulum if found.  */
+	      pos = lastpos;
+	      while (pos >= 0)
+		{
+		  if (recog_memoized (ready[pos]) >= 0
+		      && is_power9_pairable_vec_type (
+			   get_attr_type (ready[pos])))
+		    {
+		      tmp = ready[pos];
+		      for (i = pos; i < lastpos; i++)
+			ready[i] = ready[i + 1];
+		      ready[lastpos] = tmp;
+		      vec_load_pendulum = 1;
+		      return cached_can_issue_more;
+		    }
+		  pos--;
+		}
+	    }
+	  else if (vec_load_pendulum == 1)
+	    {
+	      /* This is the second vec op we've issued, search the ready
+		 list for a vecload operation so we can try to schedule a
+		 pair of those next.  If found move to the end of the ready
+		 list so it is scheduled next and set the pendulum.  */
+	      pos = lastpos;
+	      while (pos >= 0)
+		{
+		  if (recog_memoized (ready[pos]) >= 0
+		      && get_attr_type (ready[pos]) == TYPE_VECLOAD)
+		    {
+		      tmp = ready[pos];
+		      for (i = pos; i < lastpos; i++)
+			ready[i] = ready[i + 1];
+		      ready[lastpos] = tmp;
+		      vec_load_pendulum = 2;
+		      return cached_can_issue_more;
+		    }
+		  pos--;
+		}
+	    }
+	  else if (vec_load_pendulum == -2)
+	    {
+	      /* Two vecload ops have been issued and we've just issued a
+		 vec op, look for another vec op and move to end of ready
+		 list if found.  */
+	      pos = lastpos;
+	      while (pos >= 0)
+		{
+		  if (recog_memoized (ready[pos]) >= 0
+		      && is_power9_pairable_vec_type (
+			   get_attr_type (ready[pos])))
+		    {
+		      tmp = ready[pos];
+		      for (i = pos; i < lastpos; i++)
+			ready[i] = ready[i + 1];
+		      ready[lastpos] = tmp;
+		      /* Set pendulum so that next vec op will be seen as
+			 finishing a group, not start of one.  */
+		      vec_load_pendulum = -3;
+		      return cached_can_issue_more;
+		    }
+		  pos--;
+		}
+	    }
+	}
+
+      /* We've either finished a vec/vecload group, couldn't find an insn to
+	 continue the current group, or the last insn had nothing to do
+	 with a group.  In any case, reset the pendulum.  */
+      vec_load_pendulum = 0;
+    }
+
+  return cached_can_issue_more;
+}
+
 /* We are about to begin issuing insns for this clock cycle. */
 
 static int
@@ -30982,6 +31221,11 @@ rs6000_sched_reorder2 (FILE *dump, int s
         }
     }
 
+  /* Do Power9 dependent reordering if necessary.  */
+  if (rs6000_cpu == PROCESSOR_POWER9 && last_scheduled_insn
+      && recog_memoized (last_scheduled_insn) >= 0)
+    return power9_sched_reorder2 (ready, *pn_ready - 1);
+
   return cached_can_issue_more;
 }
 
@@ -31150,7 +31394,6 @@ insn_must_be_first_in_group (rtx_insn *i
         }
       break;
     case PROCESSOR_POWER8:
-    case PROCESSOR_POWER9:
       type = get_attr_type (insn);
 
       switch (type)
@@ -31281,7 +31524,6 @@ insn_must_be_last_in_group (rtx_insn *in
     }
     break;
   case PROCESSOR_POWER8:
-  case PROCESSOR_POWER9:
     type = get_attr_type (insn);
 
     switch (type)
@@ -31400,7 +31642,7 @@ force_new_group (int sched_verbose, FILE
 
       /* Do we have a special group ending nop? */
       if (rs6000_cpu_attr == CPU_POWER6 || rs6000_cpu_attr == CPU_POWER7
-	  || rs6000_cpu_attr == CPU_POWER8 || rs6000_cpu_attr == CPU_POWER9)
+	  || rs6000_cpu_attr == CPU_POWER8)
 	{
 	  nop = gen_group_ending_nop ();
 	  emit_insn_before (nop, next_insn);
@@ -31654,8 +31896,10 @@ rs6000_sched_init (FILE *dump ATTRIBUTE_
 		     int sched_verbose ATTRIBUTE_UNUSED,
 		     int max_ready ATTRIBUTE_UNUSED)
 {
-  last_scheduled_insn = NULL_RTX;
+  last_scheduled_insn = NULL;
   load_store_pendulum = 0;
+  divide_cnt = 0;
+  vec_load_pendulum = 0;
 }
 
 /* The following function is called at the end of scheduling BB.
@@ -31696,14 +31940,16 @@ rs6000_sched_finish (FILE *dump, int sch
     }
 }
 
-struct _rs6000_sched_context
+struct rs6000_sched_context
 {
   short cached_can_issue_more;
-  rtx last_scheduled_insn;
+  rtx_insn *last_scheduled_insn;
   int load_store_pendulum;
+  int divide_cnt;
+  int vec_load_pendulum;
 };
 
-typedef struct _rs6000_sched_context rs6000_sched_context_def;
+typedef struct rs6000_sched_context rs6000_sched_context_def;
 typedef rs6000_sched_context_def *rs6000_sched_context_t;
 
 /* Allocate store for new scheduling context.  */
@@ -31723,14 +31969,18 @@ rs6000_init_sched_context (void *_sc, bo
   if (clean_p)
     {
       sc->cached_can_issue_more = 0;
-      sc->last_scheduled_insn = NULL_RTX;
+      sc->last_scheduled_insn = NULL;
       sc->load_store_pendulum = 0;
+      sc->divide_cnt = 0;
+      sc->vec_load_pendulum = 0;
     }
   else
     {
       sc->cached_can_issue_more = cached_can_issue_more;
       sc->last_scheduled_insn = last_scheduled_insn;
       sc->load_store_pendulum = load_store_pendulum;
+      sc->divide_cnt = divide_cnt;
+      sc->vec_load_pendulum = vec_load_pendulum;
     }
 }
 
@@ -31745,6 +31995,8 @@ rs6000_set_sched_context (void *_sc)
   cached_can_issue_more = sc->cached_can_issue_more;
   last_scheduled_insn = sc->last_scheduled_insn;
   load_store_pendulum = sc->load_store_pendulum;
+  divide_cnt = sc->divide_cnt;
+  vec_load_pendulum = sc->vec_load_pendulum;
 }
 
 /* Free _SC.  */
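power9_sched_reorder2 repeats the same scan-and-rotate loop for each pendulum state. The sketch below restates the vec/vecload part of that logic as a standalone C model, to make the state transitions easier to follow. It is an illustration under assumptions, not a proposed replacement: enum toy_type stands in for enum attr_type, the ready list holds plain type tags rather than rtx_insn pointers, and toy_move_to_end and toy_pendulum_step are hypothetical names that do not exist in rs6000.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the insn types the pendulum cares about.  */
enum toy_type { TOY_OTHER, TOY_VECLOAD, TOY_VEC };

/* Scan ready[0..lastpos] from the end for an entry of type WANT and rotate
   it to ready[lastpos], keeping the other entries in order.  Return true
   if such an entry was found.  */
static bool
toy_move_to_end (enum toy_type *ready, int lastpos, enum toy_type want)
{
  assert (ready != NULL || lastpos < 0);

  for (int pos = lastpos; pos >= 0; pos--)
    if (ready[pos] == want)
      {
        enum toy_type tmp = ready[pos];
        for (int i = pos; i < lastpos; i++)
          ready[i] = ready[i + 1];
        ready[lastpos] = tmp;
        return true;
      }
  return false;
}

/* One pendulum step.  LAST is the type of the insn just issued; the return
   value is the new pendulum state.  Any case that cannot continue the
   current group falls through to the reset at the end.  */
static int
toy_pendulum_step (int pendulum, enum toy_type last,
                   enum toy_type *ready, int lastpos)
{
  if (last == TOY_VECLOAD)
    {
      if (pendulum == 0 && toy_move_to_end (ready, lastpos, TOY_VECLOAD))
        return -1;
      if (pendulum == -1 && toy_move_to_end (ready, lastpos, TOY_VEC))
        return -2;
      if (pendulum == 2 && toy_move_to_end (ready, lastpos, TOY_VECLOAD))
        return 3;
    }
  else if (last == TOY_VEC)
    {
      if (pendulum == 0 && toy_move_to_end (ready, lastpos, TOY_VEC))
        return 1;
      if (pendulum == 1 && toy_move_to_end (ready, lastpos, TOY_VECLOAD))
        return 2;
      if (pendulum == -2 && toy_move_to_end (ready, lastpos, TOY_VEC))
        return -3;
    }

  /* Group finished, no candidate found, or unrelated insn.  */
  return 0;
}
```

As in the patch, the end of the ready array is the entry chosen next, so moving a candidate to ready[lastpos] is what requests that it be issued on the following step. States 3 and -3 have no outgoing transition other than the reset, which is what makes the fourth insn finish a group instead of starting a new one.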
Index: config/rs6000/dfp.md
===================================================================
--- config/rs6000/dfp.md	(revision 237621)
+++ config/rs6000/dfp.md	(working copy)
@@ -58,7 +58,7 @@ (define_insn "extendsddd2"
 	(float_extend:DD (match_operand:SD 1 "gpc_reg_operand" "f")))]
   "TARGET_DFP"
   "dctdp %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_expand "extendsdtd2"
   [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
@@ -76,7 +76,7 @@ (define_insn "truncddsd2"
 	(float_truncate:SD (match_operand:DD 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "drsp %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_expand "negdd2"
   [(set (match_operand:DD 0 "gpc_reg_operand" "")
@@ -160,7 +160,7 @@ (define_insn "extendddtd2"
 	(float_extend:TD (match_operand:DD 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dctqpq %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 ;; The result of drdpq is an even/odd register pair with the converted
 ;; value in the even register and zero in the odd register.
@@ -173,7 +173,7 @@ (define_insn "trunctddd2"
    (clobber (match_scratch:TD 2 "=d"))]
   "TARGET_DFP"
   "drdpq %2,%1\;fmr %0,%2"
-  [(set_attr "type" "fp")
+  [(set_attr "type" "dfp")
    (set_attr "length" "8")])
 
 (define_insn "adddd3"
@@ -182,7 +182,7 @@ (define_insn "adddd3"
 		 (match_operand:DD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dadd %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "addtd3"
   [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
@@ -190,7 +190,7 @@ (define_insn "addtd3"
 		 (match_operand:TD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "daddq %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "subdd3"
   [(set (match_operand:DD 0 "gpc_reg_operand" "=d")
@@ -198,7 +198,7 @@ (define_insn "subdd3"
 		  (match_operand:DD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dsub %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "subtd3"
   [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
@@ -206,7 +206,7 @@ (define_insn "subtd3"
 		  (match_operand:TD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dsubq %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "muldd3"
   [(set (match_operand:DD 0 "gpc_reg_operand" "=d")
@@ -214,7 +214,7 @@ (define_insn "muldd3"
 		 (match_operand:DD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dmul %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "multd3"
   [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
@@ -222,7 +222,7 @@ (define_insn "multd3"
 		 (match_operand:TD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dmulq %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "divdd3"
   [(set (match_operand:DD 0 "gpc_reg_operand" "=d")
@@ -230,7 +230,7 @@ (define_insn "divdd3"
 		(match_operand:DD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "ddiv %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "divtd3"
   [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
@@ -238,7 +238,7 @@ (define_insn "divtd3"
 		(match_operand:TD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "ddivq %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "*cmpdd_internal1"
   [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
@@ -246,7 +246,7 @@ (define_insn "*cmpdd_internal1"
 		      (match_operand:DD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dcmpu %0,%1,%2"
-  [(set_attr "type" "fpcompare")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "*cmptd_internal1"
   [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
@@ -254,21 +254,21 @@ (define_insn "*cmptd_internal1"
 		      (match_operand:TD 2 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dcmpuq %0,%1,%2"
-  [(set_attr "type" "fpcompare")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "floatdidd2"
   [(set (match_operand:DD 0 "gpc_reg_operand" "=d")
 	(float:DD (match_operand:DI 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP && TARGET_POPCNTD"
   "dcffix %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "floatditd2"
   [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
 	(float:TD (match_operand:DI 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dcffixq %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 ;; Convert a decimal64 to a decimal64 whose value is an integer.
 ;; This is the first stage of converting it to an integer type.
@@ -278,7 +278,7 @@ (define_insn "ftruncdd2"
 	(fix:DD (match_operand:DD 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "drintn. 0,%0,%1,1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 ;; Convert a decimal64 whose value is an integer to an actual integer.
 ;; This is the second stage of converting decimal float to integer type.
@@ -288,7 +288,7 @@ (define_insn "fixdddi2"
 	(fix:DI (match_operand:DD 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dctfix %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 ;; Convert a decimal128 to a decimal128 whose value is an integer.
 ;; This is the first stage of converting it to an integer type.
@@ -298,7 +298,7 @@ (define_insn "ftrunctd2"
 	(fix:TD (match_operand:TD 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "drintnq. 0,%0,%1,1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 ;; Convert a decimal128 whose value is an integer to an actual integer.
 ;; This is the second stage of converting decimal float to integer type.
@@ -308,7 +308,7 @@ (define_insn "fixtddi2"
 	(fix:DI (match_operand:TD 1 "gpc_reg_operand" "d")))]
   "TARGET_DFP"
   "dctfixq %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 \f
 ;; Decimal builtin support
@@ -333,7 +333,7 @@ (define_insn "dfp_ddedpd_<mode>"
 			 UNSPEC_DDEDPD))]
   "TARGET_DFP"
   "ddedpd<dfp_suffix> %1,%0,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "dfp_denbcd_<mode>"
   [(set (match_operand:D64_D128 0 "gpc_reg_operand" "=d")
@@ -342,7 +342,7 @@ (define_insn "dfp_denbcd_<mode>"
 			 UNSPEC_DENBCD))]
   "TARGET_DFP"
   "denbcd<dfp_suffix> %1,%0,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "dfp_dxex_<mode>"
   [(set (match_operand:D64_D128 0 "gpc_reg_operand" "=d")
@@ -350,7 +350,7 @@ (define_insn "dfp_dxex_<mode>"
 			 UNSPEC_DXEX))]
   "TARGET_DFP"
   "dxex<dfp_suffix> %0,%1"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "dfp_diex_<mode>"
   [(set (match_operand:D64_D128 0 "gpc_reg_operand" "=d")
@@ -359,7 +359,7 @@ (define_insn "dfp_diex_<mode>"
 			 UNSPEC_DXEX))]
   "TARGET_DFP"
   "diex<dfp_suffix> %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "dfp_dscli_<mode>"
   [(set (match_operand:D64_D128 0 "gpc_reg_operand" "=d")
@@ -368,7 +368,7 @@ (define_insn "dfp_dscli_<mode>"
 			 UNSPEC_DSCLI))]
   "TARGET_DFP"
   "dscli<dfp_suffix> %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
 
 (define_insn "dfp_dscri_<mode>"
   [(set (match_operand:D64_D128 0 "gpc_reg_operand" "=d")
@@ -377,4 +377,4 @@ (define_insn "dfp_dscri_<mode>"
 			 UNSPEC_DSCRI))]
   "TARGET_DFP"
   "dscri<dfp_suffix> %0,%1,%2"
-  [(set_attr "type" "fp")])
+  [(set_attr "type" "dfp")])
Index: config/rs6000/crypto.md
===================================================================
--- config/rs6000/crypto.md	(revision 237621)
+++ config/rs6000/crypto.md	(working copy)
@@ -107,4 +107,4 @@ (define_insn "crypto_vshasigma<CR_char>"
 			UNSPEC_VSHASIGMA))]
   "TARGET_CRYPTO"
   "vshasigma<CR_char> %0,%1,%2,%3"
-  [(set_attr "type" "crypto")])
+  [(set_attr "type" "vecsimple")])
Index: config/rs6000/rs6000.md
===================================================================
--- config/rs6000/rs6000.md	(revision 237621)
+++ config/rs6000/rs6000.md	(working copy)
@@ -183,12 +183,12 @@ (define_attr "type"
    brinc,
    vecsimple,veccomplex,vecdiv,veccmp,veccmpsimple,vecperm,
    vecfloat,vecfdiv,vecdouble,mffgpr,mftgpr,crypto,
-   htm"
+   htm,htmsimple,dfp"
   (const_string "integer"))
 
 ;; What data size does this instruction work on?
-;; This is used for insert, mul.
-(define_attr "size" "8,16,32,64" (const_string "32"))
+;; This is used for insert, mul and others as necessary.
+(define_attr "size" "8,16,32,64,128" (const_string "32"))
 
 ;; Is this instruction record form ("dot", signed compare to 0, writing CR0)?
 ;; This is used for add, logical, shift, exts, mul.
@@ -298,6 +298,7 @@ (define_attr "cell_micro" "not,condition
 (include "power6.md")
 (include "power7.md")
 (include "power8.md")
+(include "power9.md")
 (include "cell.md")
 (include "xfpu.md")
 (include "a2.md")
@@ -6650,6 +6651,7 @@ (define_insn "*mov<mode>_hardfloat32"
    #
    #"
   [(set_attr "type" "fpstore,fpload,fp,fpload,fpstore,fpload,fpstore,vecsimple,vecsimple,two,store,load,two")
+   (set_attr "size" "64")
    (set_attr "length" "4,4,4,4,4,4,4,4,4,8,8,8,8")])
 
 (define_insn "*mov<mode>_softfloat32"
@@ -6695,6 +6697,7 @@ (define_insn "*mov<mode>_hardfloat64"
    mfvsrd %0,%x1
    mtvsrd %x0,%1"
   [(set_attr "type" "fpstore,fpload,fp,fpload,fpstore,fpload,fpstore,vecsimple,vecsimple,integer,store,load,*,mtjmpr,mfjmpr,*,mftgpr,mffgpr,mftgpr,mffgpr")
+   (set_attr "size" "64")
    (set_attr "length" "4")])
 
 (define_insn "*mov<mode>_softfloat64"
@@ -7746,7 +7749,8 @@ (define_insn "*movdi_internal32"
                "store,     load,      *,         fpstore,   fpload,     fp,
                 *,         fpstore,   fpstore,   fpload,    fpload,     vecsimple,
                 vecsimple, vecsimple, vecsimple, vecsimple, vecsimple,  vecsimple,
-                vecsimple")])
+                vecsimple")
+   (set_attr "size" "64")])
 
 (define_split
   [(set (match_operand:DI 0 "gpc_reg_operand" "")
@@ -7835,6 +7839,7 @@ (define_insn "*movdi_internal64"
                 vecsimple, vecsimple, vecsimple, mfjmpr,    mtjmpr,     *,
                 mftgpr,    mffgpr,    mftgpr,    mffgpr")
 
+   (set_attr "size" "64")
    (set_attr "length"
                "4,         4,         4,         4,         4,          20,
                 4,         4,         4,         4,         4,          4,
@@ -8884,7 +8889,8 @@ (define_insn "*movdf_update1"
    lfdu %3,%2(%0)"
   [(set_attr "type" "fpload")
    (set_attr "update" "yes")
-   (set_attr "indexed" "yes,no")])
+   (set_attr "indexed" "yes,no")
+   (set_attr "size" "64")])
 
 (define_insn "*movdf_update2"
   [(set (mem:DF (plus:SI (match_operand:SI 1 "gpc_reg_operand" "0,0")
@@ -13289,7 +13295,8 @@ (define_insn "add<mode>3"
 	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xsaddqp %0,%1,%2"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
 
 (define_insn "sub<mode>3"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
@@ -13298,7 +13305,8 @@ (define_insn "sub<mode>3"
 	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xssubqp %0,%1,%2"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
 
 (define_insn "mul<mode>3"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
@@ -13307,7 +13315,8 @@ (define_insn "mul<mode>3"
 	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xsmulqp %0,%1,%2"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
 
 (define_insn "div<mode>3"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
@@ -13316,7 +13325,8 @@ (define_insn "div<mode>3"
 	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xsdivqp %0,%1,%2"
-  [(set_attr "type" "vecdiv")])
+  [(set_attr "type" "vecdiv")
+   (set_attr "size" "128")])
 
 (define_insn "sqrt<mode>2"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
@@ -13324,7 +13334,8 @@ (define_insn "sqrt<mode>2"
 	 (match_operand:IEEE128 1 "altivec_register_operand" "v")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
    "xssqrtqp %0,%1"
-  [(set_attr "type" "vecdiv")])
+  [(set_attr "type" "vecdiv")
+   (set_attr "size" "128")])
 
 (define_insn "copysign<mode>3"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
@@ -13334,7 +13345,8 @@ (define_insn "copysign<mode>3"
 	 UNSPEC_COPYSIGN))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
    "xscpsgnqp %0,%2,%1"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "size" "128")])
 
 (define_insn "neg<mode>2_hw"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
@@ -13342,7 +13354,8 @@ (define_insn "neg<mode>2_hw"
 	 (match_operand:IEEE128 1 "altivec_register_operand" "v")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xsnegqp %0,%1"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "size" "128")])
 
 
 (define_insn "abs<mode>2_hw"
@@ -13351,7 +13364,8 @@ (define_insn "abs<mode>2_hw"
 	 (match_operand:IEEE128 1 "altivec_register_operand" "v")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xsabsqp %0,%1"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "size" "128")])
 
 
 (define_insn "*nabs<mode>2_hw"
@@ -13361,7 +13375,8 @@ (define_insn "*nabs<mode>2_hw"
 	  (match_operand:IEEE128 1 "altivec_register_operand" "v"))))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xsnabsqp %0,%1"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "size" "128")])
 
 ;; Initially don't worry about doing fusion
 (define_insn "*fma<mode>4_hw"
@@ -13372,7 +13387,8 @@ (define_insn "*fma<mode>4_hw"
 	 (match_operand:IEEE128 3 "altivec_register_operand" "0")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xsmaddqp %0,%1,%2"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
 
 (define_insn "*fms<mode>4_hw"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
@@ -13383,7 +13399,8 @@ (define_insn "*fms<mode>4_hw"
 	  (match_operand:IEEE128 3 "altivec_register_operand" "0"))))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xsmsubqp %0,%1,%2"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
 
 (define_insn "*nfma<mode>4_hw"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
@@ -13394,7 +13411,8 @@ (define_insn "*nfma<mode>4_hw"
 	  (match_operand:IEEE128 3 "altivec_register_operand" "0"))))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xsnmaddqp %0,%1,%2"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
 
 (define_insn "*nfms<mode>4_hw"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
@@ -13406,7 +13424,8 @@ (define_insn "*nfms<mode>4_hw"
 	   (match_operand:IEEE128 3 "altivec_register_operand" "0")))))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xsnmsubqp %0,%1,%2"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
 
 (define_insn "extend<SFDF:mode><IEEE128:mode>2_hw"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
@@ -13414,7 +13433,8 @@ (define_insn "extend<SFDF:mode><IEEE128:
 	 (match_operand:SFDF 1 "altivec_register_operand" "v")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<IEEE128:MODE>mode)"
   "xscvdpqp %0,%1"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
 
 ;; Conversion between KFmode and TFmode if TFmode is ieee 128-bit floating
 ;; point is a simple copy.
@@ -13456,7 +13476,8 @@ (define_insn "trunc<mode>df2_hw"
 	 (match_operand:IEEE128 1 "altivec_register_operand" "v")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xscvqpdp %0,%1"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
 
 ;; There is no KFmode -> SFmode instruction. Preserve the accuracy by doing
 ;; the KFmode -> DFmode conversion using round to odd rather than the normal
@@ -13553,7 +13574,8 @@ (define_insn "*xscvqp<su>wz_<mode>"
 	 UNSPEC_IEEE128_CONVERT))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xscvqp<su>wz %0,%1"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
 
 (define_insn "*xscvqp<su>dz_<mode>"
   [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
@@ -13563,7 +13585,8 @@ (define_insn "*xscvqp<su>dz_<mode>"
 	 UNSPEC_IEEE128_CONVERT))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xscvqp<su>dz %0,%1"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
 
 (define_insn "*xscv<su>dqp_<mode>"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
@@ -13572,7 +13595,8 @@ (define_insn "*xscv<su>dqp_<mode>"
 		    UNSPEC_IEEE128_CONVERT)))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xscv<su>dqp %0,%1"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
 
 (define_insn "*ieee128_mfvsrd_64bit"
   [(set (match_operand:DI 0 "reg_or_indexed_operand" "=wr,Z,wi")
@@ -13649,7 +13673,8 @@ (define_insn "*trunc<mode>df2_odd"
 		   UNSPEC_ROUND_TO_ODD))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xscvqpdpo %0,%1"
-  [(set_attr "type" "vecfloat")])
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
 
 ;; IEEE 128-bit comparisons
 (define_insn "*cmp<mode>_hw"
@@ -13658,7 +13683,8 @@ (define_insn "*cmp<mode>_hw"
 		      (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
    "xscmpuqp %0,%1,%2"
-  [(set_attr "type" "fpcompare")])
+  [(set_attr "type" "veccmp")
+   (set_attr "size" "128")])
 
 \f
 


* Re: [PATCH, rs6000] Scheduling update
  2016-06-27 22:21   ` Pat Haugen
@ 2016-06-27 22:43     ` Segher Boessenkool
  0 siblings, 0 replies; 8+ messages in thread
From: Segher Boessenkool @ 2016-06-27 22:43 UTC (permalink / raw)
  To: Pat Haugen; +Cc: GCC Patches, David Edelsohn

On Mon, Jun 27, 2016 at 04:46:00PM -0500, Pat Haugen wrote:
> On 06/22/2016 02:10 PM, Segher Boessenkool wrote:
> >> Index: config/rs6000/htm.md
> >> ===================================================================
> >> --- config/rs6000/htm.md	(revision 237621)
> >> +++ config/rs6000/htm.md	(working copy)
> >> @@ -72,7 +72,8 @@ (define_insn "*tabort"
> >>     (set (match_operand:BLK 2) (unspec:BLK [(match_dup 2)] UNSPEC_HTM_FENCE))]
> >>    "TARGET_HTM"
> >>    "tabort. %0"
> >> -  [(set_attr "type" "htm")
> >> +  [(set_attr "type" "htmsimple")
> >> +   (set_attr "power9_alu2" "yes")
> >>     (set_attr "length" "4")])
> > 
> > What determines if an insn is htm or htmsimple?
> > 
> htm insns are cracked whereas htmsimple are not.

Sorry, I wasn't clear.  That is the difference on p9, sure.
But is there some pattern to this, some difference that does not depend
on a specific CPU implementation?

>         (rs6000_sched_init): Fix initialization of last_scheduled_insn.
>         Initialize divCnt/vec_load_pendulum.

You missed divCnt here :-)

> +(define_insn_reservation "power9-vecdiv" 32
> +  (and (eq_attr "type" "vecdiv")
> +       (eq_attr "size" "!128")
> +       (eq_attr "cpu" "power9"))
> +  "DU_super_power9,VSU_super_power9")

Does that work, the ! ?
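
For reference, the GCC internals manual (Attribute Expressions) describes
an eq_attr value that begins with '!' as a negated test: it is true when
the attribute's value is not in the comma-separated list that follows.
A minimal sketch of that matching rule in C, offered only as a model of
the documented semantics; eq_attr_model is a hypothetical name and this
is not the genattrtab implementation:

```c
#include <stdbool.h>
#include <string.h>

/* Illustrative model only (hypothetical helper, not GCC code).
   TEST is a comma-separated list of attribute values, optionally
   prefixed with '!'.  Without '!', the result is true when ATTR_VALUE
   is in the list; with '!', it is true when ATTR_VALUE is not.  */
static bool
eq_attr_model (const char *attr_value, const char *test)
{
  bool negate = (test[0] == '!');
  if (negate)
    test++;

  size_t len = strlen (attr_value);
  bool found = false;
  const char *p = test;

  /* Walk the list one item at a time, comparing whole items only.  */
  while (*p != '\0')
    {
      const char *comma = strchr (p, ',');
      size_t item_len = comma ? (size_t) (comma - p) : strlen (p);
      if (item_len == len && strncmp (p, attr_value, len) == 0)
        found = true;
      if (!comma)
        break;
      p = comma + 1;
    }

  return negate ? !found : found;
}
```

Under this model, a "size" of "64" passes the test "!128", while a
"size" of "128" does not.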


This looks much better :-)

Okay for trunk; okay for the GCC 6 branch later.  Thanks,


Segher



Thread overview: 8+ messages
2016-06-21 17:46 [PATCH, rs6000] Scheduling update Pat Haugen
2016-06-22 19:11 ` Segher Boessenkool
2016-06-27 14:58   ` Pat Haugen
2016-06-27 20:45     ` Segher Boessenkool
2016-06-27 21:30       ` Pat Haugen
2016-06-27 21:44         ` Segher Boessenkool
2016-06-27 22:21   ` Pat Haugen
2016-06-27 22:43     ` Segher Boessenkool
