On 06/22/2016 02:10 PM, Segher Boessenkool wrote:
>> Index: config/rs6000/htm.md
>> ===================================================================
>> --- config/rs6000/htm.md	(revision 237621)
>> +++ config/rs6000/htm.md	(working copy)
>> @@ -72,7 +72,8 @@ (define_insn "*tabort"
>>     (set (match_operand:BLK 2) (unspec:BLK [(match_dup 2)] UNSPEC_HTM_FENCE))]
>>    "TARGET_HTM"
>>    "tabort. %0"
>> -  [(set_attr "type" "htm")
>> +  [(set_attr "type" "htmsimple")
>> +   (set_attr "power9_alu2" "yes")
>>     (set_attr "length" "4")])
>
> What determines if an insn is htm or htmsimple?
>

htm insns are cracked whereas htmsimple are not.

>
>> +; Quad-precision FP ops, execute in DFU
>> +(define_attr "power9_qp" "no,yes"
>> +  (if_then_else (ior (match_operand:KF 0 "" "")
>> +                     (match_operand:TF 0 "" "")
>> +                     (match_operand:KF 1 "" "")
>> +                     (match_operand:TF 1 "" ""))
>> +                (const_string "yes")
>> +                (const_string "no")))
>
> (The "" are not needed I think).
>
> This perhaps could be better handled with the "size" attribute.
>

Patch has been modified to annotate 128-bit FP insns with size '128' and
handled that way.

>> +(define_insn_reservation "power9-load-ext" 6
>> +  (and (eq_attr "type" "load")
>> +       (eq_attr "sign_extend" "yes")
>> +       (eq_attr "update" "no")
>> +       (eq_attr "cpu" "power9"))
>> +  "DU_C2_power9,LSU_power9")
>
> So you do not describe the units used after the first cycle?  Why is
> that, to keep the size of the automaton down?
>

Yes, I ran into problems with DFA state explosion when trying to list
follow-on cycles/unit reservations.

>
>> +(define_insn_reservation "power9-fpload-double" 4
>> +  (and (eq_attr "type" "fpload")
>> +       (eq_attr "update" "no")
>> +       (match_operand:DF 0 "" "")
>> +       (eq_attr "cpu" "power9"))
>> +  "DU_slice_3_power9,LSU_power9")
>
> Using match_operand here is asking for trouble.  "size", and you can
> default that for "fpload" insns, and document there that it looks at the
> mode of operands[0] for fpload?
>

Handled with size '64' additions to fpload insns.

>>  {
>> +  int pos;
>> +  int i;
>> +  rtx_insn *tmp;
>
> Moving these to an outer scope is really a step back.  The new code could
> just declare them itself; in fact, it should probably be a separate
> function anyway.
>

Separate function created.

Updated changelog/patch follow, with the additional coding style corrections
you pointed out also made. The diff is against current trunk; I am currently
bootstrapping/regtesting it on top of the other patch you already reviewed.

Thanks,
Pat


2016-06-27  Pat Haugen

	* config/rs6000/rs6000.md ('type' attribute): Add htmsimple/dfp types.
	('size' attribute): Add '128'.
	Include power9.md.
	(*mov_hardfloat32, *mov_hardfloat64, *movdi_internal32,
	*movdi_internal64, *movdf_update1): Set size attribute to '64'.
	(add3, sub3, mul3, div3, sqrt2, copysign3, neg2_hw, abs2_hw,
	*nabs2_hw, *fma4_hw, *fms4_hw, *nfma4_hw, *nfms4_hw, extend2_hw,
	truncdf2_hw, *xscvqpwz_, *xscvqpdz_, *xscvdqp_, *truncdf2_odd):
	Set size attribute to '128'.
	(*cmp_hw): Change type to veccmp and set size attribute to '128'.
	* config/rs6000/power6.md (power6-fp): Include dfp type.
	* config/rs6000/power7.md (power7-fp): Likewise.
	* config/rs6000/power8.md (power8-fp): Likewise.
	* config/rs6000/power9.md: New file.
	* config/rs6000/t-rs6000 (MD_INCLUDES): Add power9.md.
	* config/rs6000/htm.md (*tabort, *tabortc, *tabortci, *trechkpt,
	*treclaim, *tsr, *ttest): Change type attribute to htmsimple.
	* config/rs6000/dfp.md (extendsddd2, truncddsd2, extendddtd2,
	trunctddd2, adddd3, addtd3, subdd3, subtd3, muldd3, multd3, divdd3,
	divtd3, *cmpdd_internal1, *cmptd_internal1, floatdidd2, floatditd2,
	ftruncdd2, fixdddi2, ftrunctd2, fixtddi2, dfp_ddedpd_, dfp_denbcd_,
	dfp_dxex_, dfp_diex_, dfp_dscli_, dfp_dscri_): Change type
	attribute to dfp.
	* config/rs6000/crypto.md (crypto_vshasigma): Change type attribute
	to vecsimple.
	* config/rs6000/rs6000.c (power9_cost): Update costs, cache size
	and prefetch streams.
	(rs6000_option_override_internal): Remove temporary code setting
	tuning to power8.  Don't set rs6000_sched_groups for power9.
	(last_scheduled_insn): Change to rtx_insn *.
	(divide_cnt, vec_load_pendulum): New variables.
	(rs6000_adjust_cost): Add Power9 to test for store->load separation.
	(rs6000_issue_rate): Set issue rate for Power9.
	(is_power9_pairable_vec_type): New.
	(power9_sched_reorder2): New.
	(rs6000_sched_reorder2): Call new function for Power9 specific
	reordering.
	(insn_must_be_first_in_group): Remove Power9.
	(insn_must_be_last_in_group): Likewise.
	(force_new_group): Likewise.
	(rs6000_sched_init): Fix initialization of last_scheduled_insn.
	Initialize divide_cnt/vec_load_pendulum.
	(_rs6000_sched_context, rs6000_init_sched_context,
	rs6000_set_sched_context): Handle context save/restore of new
	variables.
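
For reference, a minimal sketch of what the "size"-based form of the
"power9-fpload-double" reservation discussed in the review could look like.
It is derived only from the snippet quoted in this mail, by swapping the
match_operand:DF test for an eq_attr test on "size"; the exact text in the
patch may differ.

```
; Sketch only: same reservation as quoted in the review, but keyed off the
; "size" attribute instead of the mode of operands[0].  Only the first
; cycle's units are listed, to keep the automaton small.
(define_insn_reservation "power9-fpload-double" 4
  (and (eq_attr "type" "fpload")
       (eq_attr "update" "no")
       (eq_attr "size" "64")
       (eq_attr "cpu" "power9"))
  "DU_slice_3_power9,LSU_power9")
```

With this form the scheduling description no longer inspects operand modes;
it relies on the fpload insn patterns setting size to '64' themselves, as
listed in the rs6000.md ChangeLog entry.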