From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH 2/5] amdgcn: Add [us]mulsi3_highpart SGPR alternatives & [us]mulsid3/muldi3 expanders
To: Julian Brown ,
CC: , Tobias Burnus , Jakub Jelinek , Thomas Schwinge
References:
From: Andrew Stubbs
Message-ID: <24911c47-fa2f-2317-d2b6-572f4a01c811@codesourcery.com>
Date: Fri, 18 Jun 2021 15:55:09 +0100
In-Reply-To:
List-Id: Gcc-patches mailing list

On 18/06/2021 15:19, Julian Brown wrote:
> This patch improves 64-bit multiplication for AMD GCN: patterns for
> unsigned and signed 32x32->64 bit multiplication have been added, and
> also 64x64->64 bit multiplication is now open-coded rather than calling
> a library function (which may be a win for code size as well as speed:
> the function calling sequence isn't particularly concise for GCN).
>
> The mulsi3_highpart pattern has also been extended for GCN5+, since
> that ISA version supports high-part result multiply instructions with
> SGPR operands.
>
> The DImode multiply implementation is lost from libgcc if we build it
> for DImode/TImode rather than SImode/DImode, a change we make in a later
> patch in this series.
>
> I can probably self-approve this, but I'll give Andrew Stubbs a chance
> to comment.
>
> Thanks,
>
> Julian
>
> 2021-06-18  Julian Brown
>
> gcc/
>         * config/gcn/gcn.md (mulsi3_highpart): Add SGPR alternatives for
>         GCN5+.
>         (mulsidi3, muldi3): Add expanders.
> ---
>  gcc/config/gcn/gcn.md | 55 ++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 49 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/config/gcn/gcn.md b/gcc/config/gcn/gcn.md
> index b5f895a93e2..70655ca4b8b 100644
> --- a/gcc/config/gcn/gcn.md
> +++ b/gcc/config/gcn/gcn.md
> @@ -1392,19 +1392,62 @@
>  (define_code_attr e [(sign_extend "e") (zero_extend "")])
>
>  (define_insn "mulsi3_highpart"
> -  [(set (match_operand:SI 0 "register_operand" "= v")
> +  [(set (match_operand:SI 0 "register_operand" "=Sg, Sg, v")
>         (truncate:SI
>           (lshiftrt:DI
>             (mult:DI
>               (any_extend:DI
> -               (match_operand:SI 1 "register_operand" "% v"))
> +               (match_operand:SI 1 "register_operand" "%SgA,SgA, v"))
>               (any_extend:DI
> -               (match_operand:SI 2 "register_operand" "vSv")))
> +               (match_operand:SI 2 "register_operand" "SgA, B,vSv")))
>             (const_int 32))))]
>    ""
> -  "v_mul_hi0\t%0, %2, %1"
> -  [(set_attr "type" "vop3a")
> -   (set_attr "length" "8")])
> +  "@
> +  s_mul_hi0\t%0, %1, %2
> +  s_mul_hi0\t%0, %1, %2
> +  v_mul_hi0\t%0, %2, %1"
> +  [(set_attr "type" "sop2,sop2,vop3a")
> +   (set_attr "length" "4,8,8")
> +   (set_attr "gcn_version" "gcn5,gcn5,*")])
> +
> +(define_expand "mulsidi3"
> +  [(set (match_operand:DI 0 "register_operand" "")
> +       (mult:DI
> +         (any_extend:DI (match_operand:SI 1 "register_operand" ""))
> +         (any_extend:DI (match_operand:SI 2 "register_operand" ""))))]
> +  ""
> +  {
> +    rtx dst = gen_reg_rtx (DImode);
> +    rtx dstlo = gen_lowpart (SImode, dst);
> +    rtx dsthi = gen_highpart_mode (SImode, DImode, dst);
> +    emit_insn (gen_mulsi3 (dstlo, operands[1], operands[2]));
> +    emit_insn (gen_mulsi3_highpart (dsthi, operands[1], operands[2]));
> +    emit_move_insn (operands[0], dst);
> +    DONE;
> +  })
> +
> +(define_expand "muldi3"
> +  [(set (match_operand:DI 0 "register_operand" "")
> +       (mult:DI (match_operand:DI 1 "register_operand" "")
> +                (match_operand:DI 2 "register_operand" "")))]
> +  ""
> +  {
> +    rtx tmp0 = gen_reg_rtx (SImode);
> +    rtx tmp1 = gen_reg_rtx (SImode);
> +    rtx dst = gen_reg_rtx (DImode);
> +    rtx dsthi = gen_highpart_mode (SImode, DImode, dst);
> +    rtx op1lo = gen_lowpart (SImode, operands[1]);
> +    rtx op1hi = gen_highpart_mode (SImode, DImode, operands[1]);
> +    rtx op2lo = gen_lowpart (SImode, operands[2]);
> +    rtx op2hi = gen_highpart_mode (SImode, DImode, operands[2]);
> +    emit_insn (gen_umulsidi3 (dst, op1lo, op2lo));
> +    emit_insn (gen_mulsi3 (tmp0, op1lo, op2hi));
> +    emit_insn (gen_addsi3 (dsthi, dsthi, tmp0));
> +    emit_insn (gen_mulsi3 (tmp1, op1hi, op2lo));
> +    emit_insn (gen_addsi3 (dsthi, dsthi, tmp1));
> +    emit_move_insn (operands[0], dst);
> +    DONE;
> +  })
>
>  (define_insn "mulhisi3"
>    [(set (match_operand:SI 0 "register_operand" "=v")
>

Most of the rest of the backend expands 64-bit operations to 32-bit pairs
much later, using define_insn_and_split, because there were lots of
issues with splitting it early. I don't recall exactly what right now,
unfortunately. (It might have been related to spilling only half the
value to the stack?)

It also makes it hard to debug, I think.

Andrew
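
For reference, the arithmetic that the quoted muldi3 expander open-codes can be sketched in plain C. This is an illustration only, not code from the patch or from GCC: the names umul32_wide, mul64_via_32 and mul64_self_check are hypothetical, and the sketch says nothing about whether the split into 32-bit halves is better done at expand time or later.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: unsigned 32x32->64 widening multiply, i.e. the
   role the umulsidi3 pattern plays in the quoted expander.  */
uint64_t umul32_wide(uint32_t a, uint32_t b)
{
    return (uint64_t)a * (uint64_t)b;
}

/* Hypothetical sketch of a 64x64->64 multiply built from 32-bit pieces,
   following the same order of operations as the quoted muldi3 expander.  */
uint64_t mul64_via_32(uint64_t x, uint64_t y)
{
    uint32_t xlo = (uint32_t)x, xhi = (uint32_t)(x >> 32);
    uint32_t ylo = (uint32_t)y, yhi = (uint32_t)(y >> 32);

    /* Full 64-bit product of the two low halves.  */
    uint64_t dst = umul32_wide(xlo, ylo);
    uint32_t dsthi = (uint32_t)(dst >> 32);

    /* The cross terms only affect the high word, so 32-bit wrapping
       multiplies and adds are enough; xhi * yhi falls outside 64 bits.  */
    dsthi += xlo * yhi;
    dsthi += xhi * ylo;

    return ((uint64_t)dsthi << 32) | (uint32_t)dst;
}

/* Small self-check comparing against the native 64-bit multiply.  */
void mul64_self_check(void)
{
    assert(mul64_via_32(3, 5) == 15);
    assert(mul64_via_32(UINT64_MAX, 2) == UINT64_MAX * 2);
}
```

The point of the sketch is that only one widening multiply is needed; the two cross products can be ordinary truncating 32-bit multiplies because their upper halves would land above bit 63.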