Date: Wed, 18 Oct 2023 20:03:02 -0400
From: Michael Meissner
To: Michael Meissner, gcc-patches@gcc.gnu.org, Segher Boessenkool, "Kewen.Lin", David Edelsohn, Peter Bergner
Subject: [PATCH 4/6] PowerPC: Make MMA insns support DMR registers.

This patch changes the MMA instructions to use either FPR registers
(-mcpu=power10) or DMRs (-mcpu=future).  In this patch, the existing MMA
instruction names are used.

A macro (__PPC_DMR__) is defined if the MMA instructions use the DMRs.

The patches have been tested on both little and big endian systems.  Can I check
it into the master branch?

2023-10-18   Michael Meissner

gcc/

	* config/rs6000/mma.md (mma_): New define_expand to handle
	mma_ for dense math and non dense math.
	(mma_ insn): Restrict to non dense math.
	(mma_xxsetaccz): Convert to define_expand to handle non dense math and
	dense math.
	(mma_xxsetaccz_vsx): Rename from mma_xxsetaccz and restrict usage to non
	dense math.
	(mma_xxsetaccz_dm): Dense math version of mma_xxsetaccz.
	(mma_): Add support for dense math.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	(mma_): Likewise.
	* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
	__PPC_DMR__ if we have dense math instructions.
	* config/rs6000/rs6000.cc (print_operand): Make %A handle only DMRs if
	dense math and only FPRs if not dense math.
	(rs6000_split_multireg_move): Do not generate the xxmtacc instruction to
	prime the DMR registers or the xxmfacc instruction to de-prime
	instructions if we have dense math register support.
---
 gcc/config/rs6000/mma.md      | 247 +++++++++++++++++++++-------------
 gcc/config/rs6000/rs6000-c.cc |   3 +
 gcc/config/rs6000/rs6000.cc   |  35 ++---
 3 files changed, 176 insertions(+), 109 deletions(-)

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index d2c5b73fa8f..e5589d8eccc 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -596,190 +596,249 @@ (define_insn "*mma_disassemble_acc_dm"
   "dmxxextfdmr256 %0,%1,2"
   [(set_attr "type" "mma")])
 
-(define_insn "mma_"
+;; MMA instructions that do not use their accumulators as an input, still must
+;; not allow their vector operands to overlap the registers used by the
+;; accumulator.  We enforce this by marking the output as early clobber.  If we
+;; have dense math, we don't need the whole prime/de-prime action, so just make
+;; these instructions be NOPs.
+
+(define_expand "mma_"
+  [(set (match_operand:XO 0 "register_operand")
+        (unspec:XO [(match_operand:XO 1 "register_operand")]
+                   MMA_ACC))]
+  "TARGET_MMA"
+{
+  if (TARGET_DENSE_MATH)
+    {
+      if (!rtx_equal_p (operands[0], operands[1]))
+        emit_move_insn (operands[0], operands[1]);
+      DONE;
+    }
+
+  /* Generate the prime/de-prime code.  */
+})
+
+(define_insn "*mma_"
   [(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
         (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")]
                     MMA_ACC))]
-  "TARGET_MMA"
+  "TARGET_MMA && !TARGET_DENSE_MATH"
   " %A0"
   [(set_attr "type" "mma")])
 
 ;; We can't have integer constants in XOmode so we wrap this in an
-;; UNSPEC_VOLATILE.
+;; UNSPEC_VOLATILE for the non-dense math case.  For dense math, we don't need
+;; to disable optimization and we can do a normal UNSPEC.
 
-(define_insn "mma_xxsetaccz"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=d")
+(define_expand "mma_xxsetaccz"
+  [(set (match_operand:XO 0 "register_operand")
         (unspec_volatile:XO [(const_int 0)]
                             UNSPECV_MMA_XXSETACCZ))]
   "TARGET_MMA"
+{
+  if (TARGET_DENSE_MATH)
+    {
+      emit_insn (gen_mma_xxsetaccz_dm (operands[0]));
+      DONE;
+    }
+})
+
+(define_insn "*mma_xxsetaccz_vsx"
+  [(set (match_operand:XO 0 "fpr_reg_operand" "=d")
+        (unspec_volatile:XO [(const_int 0)]
+                            UNSPECV_MMA_XXSETACCZ))]
+  "TARGET_MMA && !TARGET_DENSE_MATH"
   "xxsetaccz %A0"
   [(set_attr "type" "mma")])
 
+
+(define_insn "mma_xxsetaccz_dm"
+  [(set (match_operand:XO 0 "dmr_operand" "=wD")
+        (unspec:XO [(const_int 0)]
+                   UNSPECV_MMA_XXSETACCZ))]
+  "TARGET_DENSE_MATH"
+  "dmsetdmrz %0"
+  [(set_attr "type" "mma")])
+
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-        (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
-                    (match_operand:V16QI 2 "vsx_register_operand" "v,?wa")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=wD,&d,&d")
+        (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")]
                     MMA_VV))]
   "TARGET_MMA"
   " %A0,%x1,%x2"
-  [(set_attr "type" "mma")])
+  [(set_attr "type" "mma")
+   (set_attr "isa" "dm,not_dm,not_dm")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-        (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
-                    (match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
-                    (match_operand:V16QI 3 "vsx_register_operand" "v,?wa")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=wD,&d,&d")
+        (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0,0")
+                    (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:V16QI 3 "vsx_register_operand" "wa,v,?wa")]
                     MMA_AVV))]
   "TARGET_MMA"
   " %A0,%x2,%x3"
-  [(set_attr "type" "mma")])
+  [(set_attr "type" "mma")
+   (set_attr "isa" "dm,not_dm,not_dm")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-        (unspec:XO [(match_operand:OO 1 "vsx_register_operand" "v,?wa")
-                    (match_operand:V16QI 2 "vsx_register_operand" "v,?wa")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=wD,&d,&d")
+        (unspec:XO [(match_operand:OO 1 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")]
                     MMA_PV))]
   "TARGET_MMA"
   " %A0,%x1,%x2"
-  [(set_attr "type" "mma")])
+  [(set_attr "type" "mma")
+   (set_attr "isa" "dm,not_dm,not_dm")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-        (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
-                    (match_operand:OO 2 "vsx_register_operand" "v,?wa")
-                    (match_operand:V16QI 3 "vsx_register_operand" "v,?wa")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=wD,&d,&d")
+        (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0,0")
+                    (match_operand:OO 2 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:V16QI 3 "vsx_register_operand" "wa,v,?wa")]
                     MMA_APV))]
   "TARGET_MMA"
   " %A0,%x2,%x3"
-  [(set_attr "type" "mma")])
+  [(set_attr "type" "mma")
+   (set_attr "isa" "dm,not_dm,not_dm")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-        (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
-                    (match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
-                    (match_operand:SI 3 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 4 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 5 "u8bit_cint_operand" "n,n")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=wD,&d,&d")
+        (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:SI 3 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 4 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 5 "u8bit_cint_operand" "n,n,n")]
                     MMA_VVI4I4I8))]
   "TARGET_MMA"
   " %A0,%x1,%x2,%3,%4,%5"
   [(set_attr "type" "mma")
-   (set_attr "prefixed" "yes")])
+   (set_attr "prefixed" "yes")
+   (set_attr "isa" "dm,not_dm,not_dm")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-        (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
-                    (match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
-                    (match_operand:V16QI 3 "vsx_register_operand" "v,?wa")
-                    (match_operand:SI 4 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 5 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 6 "u8bit_cint_operand" "n,n")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=wD,&d,&d")
+        (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0,0")
+                    (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:V16QI 3 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:SI 4 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 5 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 6 "u8bit_cint_operand" "n,n,n")]
                     MMA_AVVI4I4I8))]
   "TARGET_MMA"
   " %A0,%x2,%x3,%4,%5,%6"
   [(set_attr "type" "mma")
-   (set_attr "prefixed" "yes")])
+   (set_attr "prefixed" "yes")
+   (set_attr "isa" "dm,not_dm,not_dm")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-        (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
-                    (match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
-                    (match_operand:SI 3 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 4 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 5 "const_0_to_3_operand" "n,n")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=wD,&d,&d")
+        (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:SI 3 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 4 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 5 "const_0_to_3_operand" "n,n,n")]
                     MMA_VVI4I4I2))]
   "TARGET_MMA"
   " %A0,%x1,%x2,%3,%4,%5"
   [(set_attr "type" "mma")
-   (set_attr "prefixed" "yes")])
+   (set_attr "prefixed" "yes")
+   (set_attr "isa" "dm,not_dm,not_dm")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-        (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
-                    (match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
-                    (match_operand:V16QI 3 "vsx_register_operand" "v,?wa")
-                    (match_operand:SI 4 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 5 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 6 "const_0_to_3_operand" "n,n")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=wD,&d,&d")
+        (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0,0")
+                    (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:V16QI 3 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:SI 4 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 5 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 6 "const_0_to_3_operand" "n,n,n")]
                     MMA_AVVI4I4I2))]
   "TARGET_MMA"
   " %A0,%x2,%x3,%4,%5,%6"
   [(set_attr "type" "mma")
-   (set_attr "prefixed" "yes")])
+   (set_attr "prefixed" "yes")
+   (set_attr "isa" "dm,not_dm,not_dm")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-        (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
-                    (match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
-                    (match_operand:SI 3 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 4 "const_0_to_15_operand" "n,n")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=wD,&d,&d")
+        (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:SI 3 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 4 "const_0_to_15_operand" "n,n,n")]
                     MMA_VVI4I4))]
   "TARGET_MMA"
   " %A0,%x1,%x2,%3,%4"
   [(set_attr "type" "mma")
-   (set_attr "prefixed" "yes")])
+   (set_attr "prefixed" "yes")
+   (set_attr "isa" "dm,not_dm,not_dm")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-        (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
-                    (match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
-                    (match_operand:V16QI 3 "vsx_register_operand" "v,?wa")
-                    (match_operand:SI 4 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 5 "const_0_to_15_operand" "n,n")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=wD,&d,&d")
+        (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0,0")
+                    (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:V16QI 3 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:SI 4 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 5 "const_0_to_15_operand" "n,n,n")]
                     MMA_AVVI4I4))]
   "TARGET_MMA"
   " %A0,%x2,%x3,%4,%5"
   [(set_attr "type" "mma")
-   (set_attr "prefixed" "yes")])
+   (set_attr "prefixed" "yes")
+   (set_attr "isa" "dm,not_dm,not_dm")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-        (unspec:XO [(match_operand:OO 1 "vsx_register_operand" "v,?wa")
-                    (match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
-                    (match_operand:SI 3 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 4 "const_0_to_3_operand" "n,n")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=wD,&d,&d")
+        (unspec:XO [(match_operand:OO 1 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:SI 3 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 4 "const_0_to_3_operand" "n,n,n")]
                     MMA_PVI4I2))]
   "TARGET_MMA"
   " %A0,%x1,%x2,%3,%4"
   [(set_attr "type" "mma")
-   (set_attr "prefixed" "yes")])
+   (set_attr "prefixed" "yes")
+   (set_attr "isa" "dm,not_dm,not_dm")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-        (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
-                    (match_operand:OO 2 "vsx_register_operand" "v,?wa")
-                    (match_operand:V16QI 3 "vsx_register_operand" "v,?wa")
-                    (match_operand:SI 4 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 5 "const_0_to_3_operand" "n,n")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=wD,&d,&d")
+        (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0,0")
+                    (match_operand:OO 2 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:V16QI 3 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:SI 4 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 5 "const_0_to_3_operand" "n,n,n")]
                     MMA_APVI4I2))]
   "TARGET_MMA"
   " %A0,%x2,%x3,%4,%5"
   [(set_attr "type" "mma")
-   (set_attr "prefixed" "yes")])
+   (set_attr "prefixed" "yes")
+   (set_attr "isa" "dm,not_dm,not_dm")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-        (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
-                    (match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
-                    (match_operand:SI 3 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 4 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 5 "const_0_to_15_operand" "n,n")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=wD,&d,&d")
+        (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:SI 3 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 4 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 5 "const_0_to_15_operand" "n,n,n")]
                     MMA_VVI4I4I4))]
   "TARGET_MMA"
   " %A0,%x1,%x2,%3,%4,%5"
   [(set_attr "type" "mma")
-   (set_attr "prefixed" "yes")])
+   (set_attr "prefixed" "yes")
+   (set_attr "isa" "dm,not_dm,not_dm")])
 
 (define_insn "mma_"
-  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
-        (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0,0")
-                    (match_operand:V16QI 2 "vsx_register_operand" "v,?wa")
-                    (match_operand:V16QI 3 "vsx_register_operand" "v,?wa")
-                    (match_operand:SI 4 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 5 "const_0_to_15_operand" "n,n")
-                    (match_operand:SI 6 "const_0_to_15_operand" "n,n")]
+  [(set (match_operand:XO 0 "accumulator_operand" "=wD,&d,&d")
+        (unspec:XO [(match_operand:XO 1 "accumulator_operand" "0,0,0")
+                    (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:V16QI 3 "vsx_register_operand" "wa,v,?wa")
+                    (match_operand:SI 4 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 5 "const_0_to_15_operand" "n,n,n")
+                    (match_operand:SI 6 "const_0_to_15_operand" "n,n,n")]
                     MMA_AVVI4I4I4))]
   "TARGET_MMA"
   " %A0,%x2,%x3,%4,%5,%6"
   [(set_attr "type" "mma")
-   (set_attr "prefixed" "yes")])
+   (set_attr "prefixed" "yes")
+   (set_attr "isa" "dm,not_dm,not_dm")])
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index e276c20cccd..4fccb6d251f 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -600,6 +600,9 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags)
   /* Tell the user if we support the MMA instructions.  */
   if ((flags & OPTION_MASK_MMA) != 0)
     rs6000_define_or_undefine_macro (define_p, "__MMA__");
+  /* Tell the user if we support the dense math instructions.  */
+  if ((flags & OPTION_MASK_DENSE_MATH) != 0)
+    rs6000_define_or_undefine_macro (define_p, "__PPC_DMR__");
   /* Whether pc-relative code is being generated.  */
   if ((flags & OPTION_MASK_PCREL) != 0)
     rs6000_define_or_undefine_macro (define_p, "__PCREL__");
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 056214d2ab1..7d8b9ec442b 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -14235,8 +14235,13 @@ print_operand (FILE *file, rtx x, int code)
          overlapping with the FPR registers.  */
       if (!REG_P (x))
         output_operand_lossage ("invalid %%A value");
-      else if (TARGET_DENSE_MATH && DMR_REGNO_P (REGNO (x)))
-        fprintf (file, "%d", REGNO (x) - FIRST_DMR_REGNO);
+      else if (TARGET_DENSE_MATH)
+        {
+          if (DMR_REGNO_P (REGNO (x)))
+            fprintf (file, "%d", REGNO (x) - FIRST_DMR_REGNO);
+          else
+            output_operand_lossage ("%%A operand is not a DMR");
+        }
       else if (!FP_REGNO_P (REGNO (x)) || (REGNO (x) % 4) != 0)
         output_operand_lossage ("invalid %%A value");
       else
@@ -27674,7 +27679,7 @@ rs6000_split_multireg_move (rtx dst, rtx src)
 
           /* If we are reading an accumulator register, we have to
              deprime it before we can access it.  */
-          if (TARGET_MMA
+          if (TARGET_MMA && !TARGET_DENSE_MATH
               && GET_MODE (src) == XOmode && FP_REGNO_P (REGNO (src)))
             emit_insn (gen_mma_xxmfacc (src, src));
 
@@ -27706,9 +27711,9 @@ rs6000_split_multireg_move (rtx dst, rtx src)
               emit_insn (gen_rtx_SET (dst2, src2));
             }
 
-          /* If we are writing an accumulator register, we have to
-             prime it after we've written it.  */
-          if (TARGET_MMA
+          /* If we are writing an accumulator register that overlaps with the
+             FPR registers, we have to prime it after we've written it.  */
+          if (TARGET_MMA && !TARGET_DENSE_MATH
               && GET_MODE (dst) == XOmode && FP_REGNO_P (REGNO (dst)))
             emit_insn (gen_mma_xxmtacc (dst, dst));
 
@@ -27777,9 +27782,9 @@ rs6000_split_multireg_move (rtx dst, rtx src)
               emit_insn (gen_rtx_SET (dst_i, op));
             }
 
-          /* We are writing an accumulator register, so we have to
-             prime it after we've written it.  */
-          if (GET_MODE (src) == XOmode)
+          /* On systems without dense math where accumulators overlap with the
+             vector registers, we have to prime it after we've written it.  */
+          if (GET_MODE (src) == XOmode && !TARGET_DENSE_MATH)
             emit_insn (gen_mma_xxmtacc (dst, dst));
 
           return;
@@ -27790,9 +27795,9 @@ rs6000_split_multireg_move (rtx dst, rtx src)
 
   if (REG_P (src) && REG_P (dst) && (REGNO (src) < REGNO (dst)))
     {
-      /* If we are reading an accumulator register, we have to
-         deprime it before we can access it.  */
-      if (TARGET_MMA
+      /* If we are reading an accumulator register and we don't have dense
+         math, we have to deprime it before we can access it.  */
+      if (TARGET_MMA && !TARGET_DENSE_MATH
           && GET_MODE (src) == XOmode && FP_REGNO_P (REGNO (src)))
         emit_insn (gen_mma_xxmfacc (src, src));
 
@@ -27820,7 +27825,7 @@ rs6000_split_multireg_move (rtx dst, rtx src)
 
       /* If we are writing an accumulator register, we have to
          prime it after we've written it.  */
-      if (TARGET_MMA
+      if (TARGET_MMA && !TARGET_DENSE_MATH
           && GET_MODE (dst) == XOmode && FP_REGNO_P (REGNO (dst)))
         emit_insn (gen_mma_xxmtacc (dst, dst));
     }
@@ -27957,7 +27962,7 @@ rs6000_split_multireg_move (rtx dst, rtx src)
 
       /* If we are reading an accumulator register, we have to
          deprime it before we can access it.  */
-      if (TARGET_MMA && REG_P (src)
+      if (TARGET_MMA && !TARGET_DENSE_MATH && REG_P (src)
           && GET_MODE (src) == XOmode && FP_REGNO_P (REGNO (src)))
         emit_insn (gen_mma_xxmfacc (src, src));
 
@@ -27989,7 +27994,7 @@ rs6000_split_multireg_move (rtx dst, rtx src)
 
       /* If we are writing an accumulator register, we have to
          prime it after we've written it.  */
-      if (TARGET_MMA && REG_P (dst)
+      if (TARGET_MMA && !TARGET_DENSE_MATH && REG_P (dst)
           && GET_MODE (dst) == XOmode && FP_REGNO_P (REGNO (dst)))
         emit_insn (gen_mma_xxmtacc (dst, dst));
 
-- 
2.41.0

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com