From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from esa2.mentor.iphmx.com (esa2.mentor.iphmx.com [68.232.141.98]) by sourceware.org (Postfix) with ESMTPS id CA6D63858D37 for ; Tue, 21 Mar 2023 16:11:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org CA6D63858D37 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.98,279,1673942400"; d="scan'208";a="102535902" Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa2.mentor.iphmx.com with ESMTP; 21 Mar 2023 08:11:36 -0800 IronPort-SDR: YR0B8e9NDbyMnCXP4Wk6SKyIjq2rxQ8z5M3fbBS5o8kOJy99ee/7AdBfrjWtqM9+t0vJJH7zpL euDhkrzyfiC9lDB9QQPdwsRX6OkYcjeJ/SkgFY3WYcwe4wZnjwFRo3GnpdVNI5JsX5BhG4ibx3 Tg2EVnLtnVd74g8X0/9EcjJ99XlMG38MO5q7MsVnPAyYxTogbMEM8nC+g3oZSIk52xGJEwSZtn GugJCdUhp+LpX/2fHAqdkySkAkppVwlthNUjOjj0KXIJ8ZSziDQZmLi7tKhEqfEePOAc+z+LmT 8Es= Message-ID: <2e9d505c-f3dd-b9c4-5ac9-be2fa7478bea@codesourcery.com> Date: Tue, 21 Mar 2023 16:11:32 +0000 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.9.0 Subject: Re: [PATCH] amdgcn: Add accumulator VGPR registers Content-Language: en-GB To: Andrew Jenner , GCC Patches References: From: Andrew Stubbs In-Reply-To: Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: 8bit X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) To svr-ies-mbx-11.mgc.mentorg.com (139.181.222.11) X-Spam-Status: No, score=-6.2 required=5.0 tests=BAYES_00,HEADER_FROM_DIFFERENT_DOMAINS,KAM_DMARC_STATUS,NICE_REPLY_A,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 21/03/2023 13:42, Andrew Jenner wrote: > This patch gives GCC to use the accumulator VGPR registers on CDNA1 and > later architectures. The backend does not yet attempt to make use of the > matrix acceleration instructions, but the new registers are still useful > as fast space for register spills. And they can now be used in inline > assembly statements. > > I haven't written a dedicated testcase for this - just building libgcc > and libgfortran seems to have thoroughly exercised the code paths involved. > > I have a test run in progress - assuming that this doesn't find any > breakage, OK to commit? > > gcc/ChangeLog: > >     * config/gcn/constraints.md: Add AVGPR constraints. >     * config/gcn/gcn-valu.md (*mov, mov_sgprbase) >     (reload_in, reload_out): Add AVGPR alternatives. >     (gather_insn_1offset, gather_insn_1offset_ds) >     (gather_insn_2offsets) >     (scatter_store_insn_1offset     (scatter_insn_1offset_ds) >     (scatter_insn_2offsets): Allow use of AVGPRs. >     * config/gcn/gcn.cc (MAX_NORMAL_AVGPR_COUNT): Define. >     (gcn_class_max_nregs): Handle AVGPR_REGS. >     (gcn_hard_regno_mode_ok): Likewise. >     (gcn_spill_class): Allow spilling to AVGPRs on TARGET_CDNA2_PLUS. >     (gcn_sgpr_move_p): Handle AVGPRs. >     (gcn_secondary_reload): Reload AVGPRs via VGPRs. >     (gcn_conditional_register_usage): Handle AVGPRs. >     (gcn_vgpr_equivalent_register_operand): New function. >     (gcn_valid_move_p): Check for validity of AVGPR moves. >     (gcn_memory_move_cost): Handle AVGPRs. >     (gcn_register_move_cost): Liekwise. >     (gcn_vmem_insn_p): Handle TYPE_VOP3P_MAI. >     (gcn_hsa_declare_function_name): Handle AVGPRs. >     (print_reg): Likewise. >     (gcn_dwarf_register_numbe): Likewise. >     * config/gcn/gcn.h (FIRST_AVGPR_REG, AVGPR_REGNO, LAST_AVGPR_REG): >     Define. >     (SOFT_ARG_REG, FRAME_POINTER_REGNUM, DWARF_LINK_REGISTER) >     (FIRST_PSEUDO_REGISTER): Update. >     (AVGPR_REGNO_P): Define. >     (FIXED_REGISTERS, CALL_USED_REGISTERS): Add AVGPRs. >     (enum reg_class, REG_CLASS_NAMES): Add AVGPR_REGS and ALL_VGPR_REGS. >     (REG_CLASS_CONTENTS): Add new register classes and add entries for >     AVGPRs to all classes. >     (REGISTER_NAMES): Add AVGPRs. >     * config/gcn/gcn.md (FIRST_AVGPR_REG, LAST_AVGPR_REG): Define. >     (AP_REGNUM, FP_REGNUM): Update. >     (define_attr "type"): Add vop3p_mai. >     (*mov_insn, *movti_insn): Add AVGPR alternatives. >     * gcc/config/gcn/predicates.md (gcn_avgpr_register_operand) >     (gcn_avgpr_hard_register_operand): New predicates. I don't like the "a" and "b" constraints. It feels error prone and we don't use that sort of conditional on any other constraint. Please global replace the "b" constraint with "a", and then set the "gcn_version" attribute to "cdna1only" and "cdna2" on each alternative of each insn that previously used "a" and "b". I think you'll need an extra alternative for the load/store insns. > @@ -801,7 +804,7 @@ gcn_spill_class (reg_class_t c, machine_mode /*mode */ ) > || c == VCC_CONDITIONAL_REG || c == EXEC_MASK_REG) > return SGPR_REGS; > else > - return NO_REGS; > + return c == VGPR_REGS && TARGET_CDNA2_PLUS ? AVGPR_REGS : NO_REGS; > } Shouldn't that be CDNA1? > @@ -2524,6 +2539,16 @@ gcn_conditional_register_usage (void) > fixed_regs[cfun->machine->args.reg[WORK_ITEM_ID_Z_ARG]] = 1; > } > > +static bool > +gcn_vgpr_equivalent_register_operand (rtx x, machine_mode mode) > +{ Comment before the function please. > @@ -316,6 +354,8 @@ enum reg_class > SGPR_SRC_REGS, > GENERAL_REGS, > VGPR_REGS, > + AVGPR_REGS, > + ALL_VGPR_REGS, > ALL_GPR_REGS, > SRCDST_REGS, > AFP_REGS, What is ALL_VGPR_REGS for? > @@ -530,9 +538,9 @@ > > (define_insn "*mov_insn" > [(set (match_operand:SISF 0 "nonimmediate_operand" > - "=SD,SD,SD,SD,RB,Sm,RS,v,Sg, v, v,RF,v,RLRG, v,SD, v,RM") > + "=SD,SD,SD,SD,RB,Sm,RS,v,Sg, v,vb,RF,v,RLRG, v,SD,vb,RM, v, a, b") > (match_operand:SISF 1 "gcn_load_operand" > - "SSA, J, B,RB,Sm,RS,Sm,v, v,Sv,RF, v,B, v,RLRG, Y,RM, v"))] > + "SSA, J, B,RB,Sm,RS,Sm,v, v,Sv,RF,vb,B, v,RLRG, Y,RM,vb,^a, v, b"))] These lines are now too long. In this backend we have adopted a style that has the constraints right justified such that the longest line has the last character in the right-most column, and the shorter lines have the alternatives aligned. There are other similar long lines further down the patch. > @@ -580,19 +591,22 @@ > ds_write%b0\t%A0, %1%O0\;s_waitcnt\tlgkmcnt(0) > ds_read%u1\t%0, %A1%O1\;s_waitcnt\tlgkmcnt(0) > global_load%o1\t%0, %A1%O1%g1\;s_waitcnt\tvmcnt(0) > - global_store%s0\t%A0, %1%O0%g0" > + global_store%s0\t%A0, %1%O0%g0 > + v_accvgpr_read_b32\t%0, %1 > + v_accvgpr_write_b32\t%0, %1 > + v_accvgpr_mov_b32\t%0, %1" > [(set_attr "type" > - "sop1,sopk,sop1,vop1,vop3a,vop3a,flat,flat,vop1,ds,ds,flat,flat") > - (set_attr "exec" "*,*,*,*,none,none,*,*,*,*,*,*,*") > - (set_attr "length" "4,4,8,4,4,4,12,12,8,12,12,12,12")]) > + "sop1,sopk,sop1,vop1,vop3a,vop3a,flat,flat,vop1,ds,ds,flat,flat, vop3p_mai,vop3p_mai,vop1") > + (set_attr "exec" "*,*,*,*,none,none,*,*,*,*,*,*,*,*,*,*") > + (set_attr "length" "4,4,8,4,4,4,12,12,8,12,12,12,12,8,8,4")]) You can break the long attribute strings after a comma. Andrew