From: Julian Brown
CC: Tobias Burnus, Jakub Jelinek, Thomas Schwinge, Andrew Stubbs
Subject: [PATCH 2/5] amdgcn: Add [us]mulsi3_highpart SGPR alternatives & [us]mulsidi3/muldi3 expanders
Date: Fri, 18 Jun 2021 07:19:31 -0700
List-Id: Gcc-patches mailing list

This patch improves 64-bit multiplication for AMD GCN: patterns for
unsigned and signed 32x32->64 bit multiplication have been added, and
64x64->64 bit multiplication is now open-coded rather than calling a
library function (which may be a win for code size as well as speed:
the function calling sequence isn't particularly concise for GCN).

The <su>mulsi3_highpart pattern has also been extended for GCN5+, since
that ISA version supports high-part result multiply instructions with
SGPR operands.

The DImode multiply implementation is lost from libgcc if we build it
for DImode/TImode rather than SImode/DImode, a change we make in a later
patch in this series.

I can probably self-approve this, but I'll give Andrew Stubbs a chance to
comment.

Thanks,

Julian

2021-06-18  Julian Brown

gcc/
	* config/gcn/gcn.md (<su>mulsi3_highpart): Add SGPR alternatives for
	GCN5+.
	(<su>mulsidi3, muldi3): Add expanders.
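
For illustration only (this is not part of the patch): a minimal C sketch
of the arithmetic the new expanders open-code, assuming unsigned operands.
The function names umulsidi3_sketch and muldi3_sketch are hypothetical and
exist only for this example. The 64x64->64 product is the full 32x32->64
product of the low words, plus the two cross products added into the high
word; the high x high product would land entirely above bit 63 and is
dropped.

```c
#include <stdint.h>

/* Hypothetical helper: unsigned 32x32->64 multiply assembled from a
   low-part multiply and a high-part multiply, the same split the
   mulsidi3 expander makes.  */
static uint64_t
umulsidi3_sketch (uint32_t a, uint32_t b)
{
  uint32_t lo = a * b;                                  /* low 32 bits */
  uint32_t hi = (uint32_t) (((uint64_t) a * b) >> 32);  /* high 32 bits */
  return ((uint64_t) hi << 32) | lo;
}

/* Hypothetical helper: 64x64->64 multiply from 32-bit pieces, following
   the order of operations in the muldi3 expander.  */
static uint64_t
muldi3_sketch (uint64_t x, uint64_t y)
{
  uint32_t xlo = (uint32_t) x, xhi = (uint32_t) (x >> 32);
  uint32_t ylo = (uint32_t) y, yhi = (uint32_t) (y >> 32);

  uint64_t dst = umulsidi3_sketch (xlo, ylo);  /* lo * lo, full width */
  uint32_t dsthi = (uint32_t) (dst >> 32);

  dsthi += xlo * yhi;  /* cross product, only its low 32 bits matter */
  dsthi += xhi * ylo;  /* cross product, only its low 32 bits matter */

  return ((uint64_t) dsthi << 32) | (uint32_t) dst;
}
```

The result is the same for signed operands, since the low 64 bits of a
product do not depend on signedness.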
---
 gcc/config/gcn/gcn.md | 55 ++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 49 insertions(+), 6 deletions(-)

diff --git a/gcc/config/gcn/gcn.md b/gcc/config/gcn/gcn.md
index b5f895a93e2..70655ca4b8b 100644
--- a/gcc/config/gcn/gcn.md
+++ b/gcc/config/gcn/gcn.md
@@ -1392,19 +1392,62 @@ (define_code_attr e [(sign_extend "e") (zero_extend "")])
 
 
 (define_insn "<su>mulsi3_highpart"
-  [(set (match_operand:SI 0 "register_operand" "= v")
+  [(set (match_operand:SI 0 "register_operand" "=Sg, Sg, v")
         (truncate:SI
           (lshiftrt:DI
             (mult:DI
               (any_extend:DI
-                (match_operand:SI 1 "register_operand" "% v"))
+                (match_operand:SI 1 "register_operand" "%SgA,SgA, v"))
               (any_extend:DI
-                (match_operand:SI 2 "register_operand" "vSv")))
+                (match_operand:SI 2 "register_operand" "SgA, B,vSv")))
             (const_int 32))))]
   ""
-  "v_mul_hi<sgnsuffix>0\t%0, %2, %1"
-  [(set_attr "type" "vop3a")
-   (set_attr "length" "8")])
+  "@
+  s_mul_hi<sgnsuffix>0\t%0, %1, %2
+  s_mul_hi<sgnsuffix>0\t%0, %1, %2
+  v_mul_hi<sgnsuffix>0\t%0, %2, %1"
+  [(set_attr "type" "sop2,sop2,vop3a")
+   (set_attr "length" "4,8,8")
+   (set_attr "gcn_version" "gcn5,gcn5,*")])
+
+(define_expand "<su>mulsidi3"
+  [(set (match_operand:DI 0 "register_operand" "")
+        (mult:DI
+          (any_extend:DI (match_operand:SI 1 "register_operand" ""))
+          (any_extend:DI (match_operand:SI 2 "register_operand" ""))))]
+  ""
+  {
+    rtx dst = gen_reg_rtx (DImode);
+    rtx dstlo = gen_lowpart (SImode, dst);
+    rtx dsthi = gen_highpart_mode (SImode, DImode, dst);
+    emit_insn (gen_mulsi3 (dstlo, operands[1], operands[2]));
+    emit_insn (gen_<su>mulsi3_highpart (dsthi, operands[1], operands[2]));
+    emit_move_insn (operands[0], dst);
+    DONE;
+  })
+
+(define_expand "muldi3"
+  [(set (match_operand:DI 0 "register_operand" "")
+        (mult:DI (match_operand:DI 1 "register_operand" "")
+                 (match_operand:DI 2 "register_operand" "")))]
+  ""
+  {
+    rtx tmp0 = gen_reg_rtx (SImode);
+    rtx tmp1 = gen_reg_rtx (SImode);
+    rtx dst = gen_reg_rtx (DImode);
+    rtx dsthi = gen_highpart_mode (SImode, DImode, dst);
+    rtx op1lo = gen_lowpart (SImode, operands[1]);
+    rtx op1hi = gen_highpart_mode (SImode, DImode, operands[1]);
+    rtx op2lo = gen_lowpart (SImode, operands[2]);
+    rtx op2hi = gen_highpart_mode (SImode, DImode, operands[2]);
+    emit_insn (gen_umulsidi3 (dst, op1lo, op2lo));
+    emit_insn (gen_mulsi3 (tmp0, op1lo, op2hi));
+    emit_insn (gen_addsi3 (dsthi, dsthi, tmp0));
+    emit_insn (gen_mulsi3 (tmp1, op1hi, op2lo));
+    emit_insn (gen_addsi3 (dsthi, dsthi, tmp1));
+    emit_move_insn (operands[0], dst);
+    DONE;
+  })
 
 (define_insn "mulhisi3"
   [(set (match_operand:SI 0 "register_operand" "=v")
-- 
2.29.2