From: Aaron Sawdey
Subject: [PATCH,rs6000] Make MMA builtins use opaque modes [v2]
Date: Thu, 19 Nov 2020 19:02:44 -0600
To: gcc-patches
Message-Id: <8D4F8443-D780-46FC-9739-CF518C1C0763@linux.ibm.com>
References: <20201119185847.703536-1-acsawdey@linux.ibm.com>

For some reason this patch never showed up on gcc-patches.

Aaron Sawdey, Ph.D.
sawdey@linux.ibm.com
IBM Linux on POWER Toolchain

> Begin forwarded message:
>
> From: acsawdey@linux.ibm.com
> Subject: [PATCH,rs6000] Make MMA builtins use opaque modes [v2]
> Date: November 19, 2020 at 12:58:47 PM CST
> To: gcc-patches@gcc.gnu.org
> Cc: segher@kernel.crashing.org, wschmidt@linux.ibm.com, bergner@linux.ibm.com, Aaron Sawdey
>
> From: Aaron Sawdey
>
> Segher & Bergner -
> Thanks for the reviews, here's the updated patch after fixing those things.
> We now have an UNSPEC for xxsetaccz, and an accompanying change to
> rs6000_rtx_costs to make it cost 0 so that CSE doesn't try to replace it
> with a bunch of register moves.
>
> If bootstrap/regtest looks good, ok for trunk?
>
> Thanks,
>    Aaron
>
> gcc/
> 	* gcc/config/rs6000/mma.md (unspec): Add assemble/extract UNSPECs.
> 	(movoi): Change to movoo.
> 	(*movpoi): Change to *movoo.
> 	(movxi): Change to movxo.
> 	(*movpxi): Change to *movxo.
> 	(mma_assemble_pair): Change to OO mode.
> 	(*mma_assemble_pair): New define_insn_and_split.
> 	(mma_disassemble_pair): New define_expand.
> 	(*mma_disassemble_pair): New define_insn_and_split.
> 	(mma_assemble_acc): Change to XO mode.
> 	(*mma_assemble_acc): Change to XO mode.
> 	(mma_disassemble_acc): New define_expand.
> 	(*mma_disassemble_acc): New define_insn_and_split.
> 	(mma_): Change to XO mode.
> 	(mma_): Change to XO mode.
> 	(mma_): Change to XO mode.
> 	(mma_): Change to OO mode.
> 	(mma_): Change to XO/OO mode.
> 	(mma_): Change to XO mode.
> 	(mma_): Change to XO mode.
> 	(mma_): Change to XO mode.
> 	(mma_): Change to XO mode.
> 	(mma_): Change to XO mode.
> 	(mma_): Change to XO mode.
> 	(mma_): Change to XO/OO mode.
> 	(mma_): Change to XO/OO mode.
> 	(mma_): Change to XO mode.
> 	(mma_): Change to XO mode.
> 	* gcc/config/rs6000/predicates.md (input_operand): Allow opaque.
> 	(mma_disassemble_output_operand): New predicate.
> 	* gcc/config/rs6000/rs6000-builtin.def:
> 	Changes to disassemble builtins.
> 	* gcc/config/rs6000/rs6000-call.c (rs6000_return_in_memory):
> 	Disallow __vector_pair/__vector_quad as return types.
> 	(rs6000_promote_function_mode): Remove function return type
> 	check because we can't test it here any more.
> 	(rs6000_function_arg): Do not allow __vector_pair/__vector_quad
> 	as function arguments.
> 	(rs6000_gimple_fold_mma_builtin):
> 	Handle mma_disassemble_* builtins.
> 	(rs6000_init_builtins): Create types for XO/OO modes.
> 	* gcc/config/rs6000/rs6000-modes.def: Delete OI, XI,
> 	POI, and PXI modes, and create XO and OO modes.
> 	* gcc/config/rs6000/rs6000-string.c (expand_block_move):
> 	Update to OO mode.
> 	* gcc/config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok_uncached):
> 	Update for XO/OO modes.
> 	(rs6000_rtx_costs): Make UNSPEC_MMA_XXSETACCZ cost 0.
> 	(rs6000_modes_tieable_p): Update for XO/OO modes.
> 	(rs6000_debug_reg_global): Update for XO/OO modes.
> 	(rs6000_setup_reg_addr_masks): Update for XO/OO modes.
> 	(rs6000_init_hard_regno_mode_ok): Update for XO/OO modes.
> 	(reg_offset_addressing_ok_p): Update for XO/OO modes.
> 	(rs6000_emit_move): Update for XO/OO modes.
> 	(rs6000_preferred_reload_class): Update for XO/OO modes.
> 	(rs6000_split_multireg_move): Update for XO/OO modes.
> 	(rs6000_mangle_type): Update for opaque types.
> 	(rs6000_invalid_conversion): Update for XO/OO modes.
> 	* gcc/config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P):
> 	Update for XO/OO modes.
> 	* gcc/config/rs6000/rs6000.md (RELOAD): Update for XO/OO modes.
> gcc/testsuite/
> 	* gcc.target/powerpc/mma-double-test.c (main): Call abort for failure.
> 	* gcc.target/powerpc/mma-single-test.c (main): Call abort for failure.
> 	* gcc.target/powerpc/pr96506.c: Rename to pr96506-1.c.
> 	* gcc.target/powerpc/pr96506-2.c: New test.
> ---
>  gcc/config/rs6000/mma.md                     | 421 ++++++++++--------
>  gcc/config/rs6000/predicates.md              |  12 +
>  gcc/config/rs6000/rs6000-builtin.def         |  14 +-
>  gcc/config/rs6000/rs6000-call.c              | 142 +++---
>  gcc/config/rs6000/rs6000-modes.def           |  10 +-
>  gcc/config/rs6000/rs6000-string.c            |   6 +-
>  gcc/config/rs6000/rs6000.c                   | 193 ++++----
>  gcc/config/rs6000/rs6000.h                   |   3 +-
>  gcc/config/rs6000/rs6000.md                  |   2 +-
>  .../gcc.target/powerpc/mma-double-test.c     |   3 +
>  .../gcc.target/powerpc/mma-single-test.c     |   3 +
>  .../powerpc/{pr96506.c => pr96506-1.c}       |  24 -
>  gcc/testsuite/gcc.target/powerpc/pr96506-2.c |  38 ++
>  13 files changed, 508 insertions(+), 363 deletions(-)
>  rename gcc/testsuite/gcc.target/powerpc/{pr96506.c => pr96506-1.c} (61%)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr96506-2.c
>
> diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
> index a3fd28bdd0a..63bb73a01e7 100644
> --- a/gcc/config/rs6000/mma.md
> +++ b/gcc/config/rs6000/mma.md
> @@ -19,24 +19,18 @@
>  ;; along with GCC; see the file COPYING3.  If not see
>  ;; .
>
> -;; The MMA patterns use the multi-register PXImode and POImode partial
> -;; integer modes to implement the target specific __vector_quad and
> -;; __vector_pair types that the MMA built-in functions reference.
> -;; To use these modes, we must define XImode and OImode move patterns
> -;; so the independent parts of the compiler can use our large partial
> -;; integer modes.  However, if we enable the XImode and OImode move
> -;; patterns, then the compiler will attempt to use them and this can
> -;; cause byte swapping issues on litte-endian systems.  We don't need
> -;; the XImode and OImode move patterns for actual code generation,
> -;; therefore, we define the XImode and OImode move patterns, but we
> -;; disable their use with a "false" condition flag.
> +;; The MMA patterns use the multi-register XOmode and OOmode opaque
> +;; modes to implement the target specific __vector_quad and
> +;; __vector_pair types that the MMA built-in functions reference.  We
> +;; use OPAQUE_MODE to prevent anything from trying to open them up.
>
>  (define_constants [(MAX_MMA_OPERANDS 7)])
>
>  ;; Constants for creating unspecs
>
>  (define_c_enum "unspec"
> -  [UNSPEC_MMA_ASSEMBLE_ACC
> +  [UNSPEC_MMA_ASSEMBLE
> +   UNSPEC_MMA_EXTRACT
>     UNSPEC_MMA_PMXVBF16GER2
>     UNSPEC_MMA_PMXVBF16GER2NN
>     UNSPEC_MMA_PMXVBF16GER2NP
> @@ -97,6 +91,7 @@ (define_c_enum "unspec"
>     UNSPEC_MMA_XVI8GER4SPP
>     UNSPEC_MMA_XXMFACC
>     UNSPEC_MMA_XXMTACC
> +   UNSPEC_MMA_XXSETACCZ
>    ])
>
>  ;; MMA instructions with 1 accumulator argument
> @@ -265,31 +260,22 @@ (define_int_attr avvi4i4i4 [(UNSPEC_MMA_PMXVI8GER4PP "pmxvi8ger4pp")
> 			    (UNSPEC_MMA_PMXVI8GER4SPP "pmxvi8ger4spp")])
>
>
> -;; Define a disabled OImode move pattern, so we can use POImode.
> -(define_expand "movoi"
> -  [(set (match_operand:OI 0 "nonimmediate_operand")
> -	(match_operand:OI 1 "input_operand"))]
> -  "0"
> -{
> -  gcc_unreachable ();
> -})
> -
> -;; Vector pair support.  POImode can only live in VSRs.
> -(define_expand "movpoi"
> -  [(set (match_operand:POI 0 "nonimmediate_operand")
> -	(match_operand:POI 1 "input_operand"))]
> +;; Vector pair support.  OOmode can only live in VSRs.
> +(define_expand "movoo" > + [(set (match_operand:OO 0 "nonimmediate_operand") > + (match_operand:OO 1 "input_operand"))] > "TARGET_MMA" > { > - rs6000_emit_move (operands[0], operands[1], POImode); > + rs6000_emit_move (operands[0], operands[1], OOmode); > DONE; > }) >=20 > -(define_insn_and_split "*movpoi" > - [(set (match_operand:POI 0 "nonimmediate_operand" "=3Dwa,m,wa") > - (match_operand:POI 1 "input_operand" "m,wa,wa"))] > +(define_insn_and_split "*movoo" > + [(set (match_operand:OO 0 "nonimmediate_operand" "=3Dwa,m,wa") > + (match_operand:OO 1 "input_operand" "m,wa,wa"))] > "TARGET_MMA > - && (gpc_reg_operand (operands[0], POImode) > - || gpc_reg_operand (operands[1], POImode))" > + && (gpc_reg_operand (operands[0], OOmode) > + || gpc_reg_operand (operands[1], OOmode))" > "@ > lxvp%X1 %x0,%1 > stxvp%X0 %x1,%0 > @@ -305,287 +291,370 @@ (define_insn_and_split "*movpoi" > (set_attr "length" "*,*,8")]) >=20 >=20 > -;; Define a disabled XImode move pattern, so we can use PXImode. > -(define_expand "movxi" > - [(set (match_operand:XI 0 "nonimmediate_operand") > - (match_operand:XI 1 "input_operand"))] > - "0" > -{ > - gcc_unreachable (); > -}) > - > -;; Vector quad support. PXImode can only live in FPRs. > -(define_expand "movpxi" > - [(set (match_operand:PXI 0 "nonimmediate_operand") > - (match_operand:PXI 1 "input_operand"))] > +;; Vector quad support. XOmode can only live in FPRs. > +(define_expand "movxo" > + [(set (match_operand:XO 0 "nonimmediate_operand") > + (match_operand:XO 1 "input_operand"))] > "TARGET_MMA" > { > - rs6000_emit_move (operands[0], operands[1], PXImode); > + rs6000_emit_move (operands[0], operands[1], XOmode); > DONE; > }) >=20 > -(define_insn_and_split "*movpxi" > - [(set (match_operand:PXI 0 "nonimmediate_operand" "=3Dd,m,d,d") > - (match_operand:PXI 1 "input_operand" "m,d,d,O"))] > +(define_insn_and_split "*movxo" > + [(set (match_operand:XO 0 "nonimmediate_operand" "=3Dd,m,d") > + (match_operand:XO 1 "input_operand" "m,d,d"))] > "TARGET_MMA > - && (gpc_reg_operand (operands[0], PXImode) > - || gpc_reg_operand (operands[1], PXImode))" > + && (gpc_reg_operand (operands[0], XOmode) > + || gpc_reg_operand (operands[1], XOmode))" > "@ > # > # > - # > - xxsetaccz %A0" > - "&& reload_completed > - && !(fpr_reg_operand (operands[0], PXImode) && operands[1] =3D=3D = const0_rtx)" > + #" > + "&& reload_completed" > [(const_int 0)] > { > rs6000_split_multireg_move (operands[0], operands[1]); > DONE; > } > - [(set_attr "type" "vecload,vecstore,veclogical,mma") > - (set_attr "length" "8,8,16,*") > - (set_attr "max_prefixed_insns" "2,2,*,*")]) > + [(set_attr "type" "vecload,vecstore,veclogical") > + (set_attr "length" "8,8,16") > + (set_attr "max_prefixed_insns" "2,2,*")]) >=20 > (define_expand "mma_assemble_pair" > - [(match_operand:POI 0 "vsx_register_operand") > - (match_operand:V16QI 1 "input_operand") > - (match_operand:V16QI 2 "input_operand")] > + [(match_operand:OO 0 "vsx_register_operand") > + (match_operand:V16QI 1 "mma_assemble_input_operand") > + (match_operand:V16QI 2 "mma_assemble_input_operand")] > "TARGET_MMA" > { > - rtx dst; > + rtx src =3D gen_rtx_UNSPEC (OOmode, > + gen_rtvec (2, operands[1], operands[2]), > + UNSPEC_MMA_ASSEMBLE); > + emit_move_insn (operands[0], src); > + DONE; > +}) >=20 > - /* Let the compiler know the code below fully defines our output = value. 
*/ > - emit_clobber (operands[0]); > +(define_insn_and_split "*mma_assemble_pair" > + [(set (match_operand:OO 0 "vsx_register_operand" "=3Dwa") > + (unspec:OO [(match_operand:V16QI 1 "mma_assemble_input_operand" = "mwa") > + (match_operand:V16QI 2 "mma_assemble_input_operand" = "mwa")] > + UNSPEC_MMA_ASSEMBLE))] > + "TARGET_MMA" > + "#" > + "&& reload_completed" > + [(const_int 0)] > +{ > + rtx src =3D gen_rtx_UNSPEC (OOmode, > + gen_rtvec (2, operands[1], operands[2]), > + UNSPEC_MMA_ASSEMBLE); > + rs6000_split_multireg_move (operands[0], src); > + DONE; > +}) > + > +(define_expand "mma_disassemble_pair" > + [(match_operand:V16QI 0 "mma_disassemble_output_operand") > + (match_operand:OO 1 "input_operand") > + (match_operand 2 "const_0_to_1_operand")] > + "TARGET_MMA" > +{ > + rtx src; > + int regoff =3D INTVAL (operands[2]); > + src =3D gen_rtx_UNSPEC (V16QImode, > + gen_rtvec (2, operands[1], GEN_INT (regoff)), > + UNSPEC_MMA_EXTRACT); > + emit_move_insn (operands[0], src); > + DONE; > +}) >=20 > - dst =3D simplify_gen_subreg (V16QImode, operands[0], POImode, 0); > - emit_move_insn (dst, operands[1]); > - dst =3D simplify_gen_subreg (V16QImode, operands[0], POImode, 16); > - emit_move_insn (dst, operands[2]); > +(define_insn_and_split "*mma_disassemble_pair" > + [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" = "=3Dmwa") > + (unspec:V16QI [(match_operand:OO 1 "input_operand" "wa") > + (match_operand 2 "const_0_to_1_operand")] > + UNSPEC_MMA_EXTRACT))] > + "TARGET_MMA > + && fpr_reg_operand (operands[1], OOmode)" > + "#" > + "&& reload_completed" > + [(const_int 0)] > +{ > + int reg =3D REGNO (operands[1]); > + int regoff =3D INTVAL (operands[2]); > + rtx src =3D gen_rtx_REG (V16QImode, reg + regoff); > + emit_move_insn (operands[0], src); > DONE; > }) >=20 > (define_expand "mma_assemble_acc" > - [(match_operand:PXI 0 "fpr_reg_operand") > - (match_operand:V16QI 1 "input_operand") > - (match_operand:V16QI 2 "input_operand") > - (match_operand:V16QI 3 "input_operand") > - (match_operand:V16QI 4 "input_operand")] > + [(match_operand:XO 0 "fpr_reg_operand") > + (match_operand:V16QI 1 "mma_assemble_input_operand") > + (match_operand:V16QI 2 "mma_assemble_input_operand") > + (match_operand:V16QI 3 "mma_assemble_input_operand") > + (match_operand:V16QI 4 "mma_assemble_input_operand")] > "TARGET_MMA" > { > - rtx src =3D gen_rtx_UNSPEC (PXImode, > + rtx src =3D gen_rtx_UNSPEC (XOmode, > gen_rtvec (4, operands[1], operands[2], > operands[3], operands[4]), > - UNSPEC_MMA_ASSEMBLE_ACC); > + UNSPEC_MMA_ASSEMBLE); > emit_move_insn (operands[0], src); > DONE; > }) >=20 > (define_insn_and_split "*mma_assemble_acc" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3Dd") > - (unspec:PXI [(match_operand:V16QI 1 "mma_assemble_input_operand" = "mwa") > - (match_operand:V16QI 2 "mma_assemble_input_operand" = "mwa") > - (match_operand:V16QI 3 "mma_assemble_input_operand" = "mwa") > - (match_operand:V16QI 4 "mma_assemble_input_operand" = "mwa")] > - UNSPEC_MMA_ASSEMBLE_ACC))] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3Dd") > + (unspec:XO [(match_operand:V16QI 1 "mma_assemble_input_operand" = "mwa") > + (match_operand:V16QI 2 "mma_assemble_input_operand" = "mwa") > + (match_operand:V16QI 3 "mma_assemble_input_operand" = "mwa") > + (match_operand:V16QI 4 "mma_assemble_input_operand" = "mwa")] > + UNSPEC_MMA_ASSEMBLE))] > "TARGET_MMA > - && fpr_reg_operand (operands[0], PXImode)" > + && fpr_reg_operand (operands[0], XOmode)" > "#" > "&& reload_completed" > [(const_int 0)] > { > - rtx src =3D 
gen_rtx_UNSPEC (PXImode, > + rtx src =3D gen_rtx_UNSPEC (XOmode, > gen_rtvec (4, operands[1], operands[2], > operands[3], operands[4]), > - UNSPEC_MMA_ASSEMBLE_ACC); > + UNSPEC_MMA_ASSEMBLE); > rs6000_split_multireg_move (operands[0], src); > DONE; > }) >=20 > +(define_expand "mma_disassemble_acc" > + [(match_operand:V16QI 0 "mma_disassemble_output_operand") > + (match_operand:XO 1 "input_operand") > + (match_operand 2 "const_0_to_3_operand")] > + "TARGET_MMA" > +{ > + rtx src; > + int regoff =3D INTVAL (operands[2]); > + src =3D gen_rtx_UNSPEC (V16QImode, > + gen_rtvec (2, operands[1], GEN_INT (regoff)), > + UNSPEC_MMA_EXTRACT); > + emit_move_insn (operands[0], src); > + DONE; > +}) > + > +(define_insn_and_split "*mma_disassemble_acc" > + [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" = "=3Dmwa") > + (unspec:V16QI [(match_operand:XO 1 "input_operand" "d") > + (match_operand 2 "const_0_to_3_operand")] > + UNSPEC_MMA_EXTRACT))] > + "TARGET_MMA > + && fpr_reg_operand (operands[1], XOmode)" > + "#" > + "&& reload_completed" > + [(const_int 0)] > +{ > + int reg =3D REGNO (operands[1]); > + int regoff =3D INTVAL (operands[2]); > + rtx src =3D gen_rtx_REG (V16QImode, reg + regoff); > + emit_move_insn (operands[0], src); > + DONE; > +}) > + > ;; MMA instructions that do not use their accumulators as an input, = still > ;; must not allow their vector operands to overlap the registers used = by > ;; the accumulator. We enforce this by marking the output as early = clobber. >=20 > (define_insn "mma_" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3D&d") > - (unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0")] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3D&d") > + (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")] > MMA_ACC))] > "TARGET_MMA" > " %A0" > [(set_attr "type" "mma")]) >=20 > +;; We can't have integer constants in XOmode so we wrap this in an = UNSPEC. 
> + > (define_expand "mma_xxsetaccz" > - [(set (match_operand:PXI 0 "fpr_reg_operand") > + [(set (match_operand:XO 0 "fpr_reg_operand") > (const_int 0))] > "TARGET_MMA" > { > - emit_insn (gen_movpxi (operands[0], const0_rtx)); > + rtx xo0 =3D gen_rtx_UNSPEC (XOmode, gen_rtvec (1, const0_rtx), > + UNSPEC_MMA_XXSETACCZ); > + emit_insn (gen_rtx_SET (operands[0], xo0)); > DONE; > }) >=20 > +(define_insn_and_split "*mma_xxsetaccz" > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3Dd") > + (unspec:XO [(match_operand 1 "const_0_to_1_operand" "O")] > + UNSPEC_MMA_XXSETACCZ))] > + "TARGET_MMA" > + "xxsetaccz %A0" > + "&& reload_completed" > + [(set (match_dup 0) (unspec:XO [(match_dup 1)] = UNSPEC_MMA_XXSETACCZ))] > + "" > + [(set_attr "type" "mma") > + (set_attr "length" "4")]) > + > (define_insn "mma_" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3D&d") > - (unspec:PXI [(match_operand:V16QI 1 "vsx_register_operand" "wa") > - (match_operand:V16QI 2 "vsx_register_operand" = "wa")] > - MMA_VV))] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3D&d") > + (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa") > + (match_operand:V16QI 2 "vsx_register_operand" "wa")] > + MMA_VV))] > "TARGET_MMA" > " %A0,%x1,%x2" > [(set_attr "type" "mma")]) >=20 > (define_insn "mma_" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3D&d") > - (unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0") > - (match_operand:V16QI 2 "vsx_register_operand" "wa") > - (match_operand:V16QI 3 "vsx_register_operand" = "wa")] > - MMA_AVV))] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3D&d") > + (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0") > + (match_operand:V16QI 2 "vsx_register_operand" "wa") > + (match_operand:V16QI 3 "vsx_register_operand" "wa")] > + MMA_AVV))] > "TARGET_MMA" > " %A0,%x2,%x3" > [(set_attr "type" "mma")]) >=20 > (define_insn "mma_" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3D&d") > - (unspec:PXI [(match_operand:POI 1 "vsx_register_operand" "wa") > - (match_operand:V16QI 2 "vsx_register_operand" = "wa")] > - MMA_PV))] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3D&d") > + (unspec:XO [(match_operand:OO 1 "vsx_register_operand" "wa") > + (match_operand:V16QI 2 "vsx_register_operand" "wa")] > + MMA_PV))] > "TARGET_MMA" > " %A0,%x1,%x2" > [(set_attr "type" "mma")]) >=20 > (define_insn "mma_" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3D&d") > - (unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0") > - (match_operand:POI 2 "vsx_register_operand" "wa") > - (match_operand:V16QI 3 "vsx_register_operand" = "wa")] > - MMA_APV))] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3D&d") > + (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0") > + (match_operand:OO 2 "vsx_register_operand" "wa") > + (match_operand:V16QI 3 "vsx_register_operand" "wa")] > + MMA_APV))] > "TARGET_MMA" > " %A0,%x2,%x3" > [(set_attr "type" "mma")]) >=20 > (define_insn "mma_" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3D&d") > - (unspec:PXI [(match_operand:V16QI 1 "vsx_register_operand" "wa") > - (match_operand:V16QI 2 "vsx_register_operand" "wa") > - (match_operand:SI 3 "const_0_to_15_operand" "n") > - (match_operand:SI 4 "const_0_to_15_operand" "n") > - (match_operand:SI 5 "u8bit_cint_operand" "n")] > - MMA_VVI4I4I8))] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3D&d") > + (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa") > + (match_operand:V16QI 2 "vsx_register_operand" "wa") > + (match_operand:SI 3 "const_0_to_15_operand" "n") > + (match_operand:SI 4 
"const_0_to_15_operand" "n") > + (match_operand:SI 5 "u8bit_cint_operand" "n")] > + MMA_VVI4I4I8))] > "TARGET_MMA" > " %A0,%x1,%x2,%3,%4,%5" > [(set_attr "type" "mma") > (set_attr "length" "8")]) >=20 > (define_insn "mma_" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3D&d") > - (unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0") > - (match_operand:V16QI 2 "vsx_register_operand" "wa") > - (match_operand:V16QI 3 "vsx_register_operand" "wa") > - (match_operand:SI 4 "const_0_to_15_operand" "n") > - (match_operand:SI 5 "const_0_to_15_operand" "n") > - (match_operand:SI 6 "u8bit_cint_operand" "n")] > - MMA_AVVI4I4I8))] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3D&d") > + (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0") > + (match_operand:V16QI 2 "vsx_register_operand" "wa") > + (match_operand:V16QI 3 "vsx_register_operand" "wa") > + (match_operand:SI 4 "const_0_to_15_operand" "n") > + (match_operand:SI 5 "const_0_to_15_operand" "n") > + (match_operand:SI 6 "u8bit_cint_operand" "n")] > + MMA_AVVI4I4I8))] > "TARGET_MMA" > " %A0,%x2,%x3,%4,%5,%6" > [(set_attr "type" "mma") > (set_attr "length" "8")]) >=20 > (define_insn "mma_" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3D&d") > - (unspec:PXI [(match_operand:V16QI 1 "vsx_register_operand" "wa") > - (match_operand:V16QI 2 "vsx_register_operand" "wa") > - (match_operand:SI 3 "const_0_to_15_operand" "n") > - (match_operand:SI 4 "const_0_to_15_operand" "n") > - (match_operand:SI 5 "const_0_to_3_operand" "n")] > - MMA_VVI4I4I2))] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3D&d") > + (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa") > + (match_operand:V16QI 2 "vsx_register_operand" "wa") > + (match_operand:SI 3 "const_0_to_15_operand" "n") > + (match_operand:SI 4 "const_0_to_15_operand" "n") > + (match_operand:SI 5 "const_0_to_3_operand" "n")] > + MMA_VVI4I4I2))] > "TARGET_MMA" > " %A0,%x1,%x2,%3,%4,%5" > [(set_attr "type" "mma") > (set_attr "length" "8")]) >=20 > (define_insn "mma_" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3D&d") > - (unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0") > - (match_operand:V16QI 2 "vsx_register_operand" "wa") > - (match_operand:V16QI 3 "vsx_register_operand" "wa") > - (match_operand:SI 4 "const_0_to_15_operand" "n") > - (match_operand:SI 5 "const_0_to_15_operand" "n") > - (match_operand:SI 6 "const_0_to_3_operand" "n")] > - MMA_AVVI4I4I2))] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3D&d") > + (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0") > + (match_operand:V16QI 2 "vsx_register_operand" "wa") > + (match_operand:V16QI 3 "vsx_register_operand" "wa") > + (match_operand:SI 4 "const_0_to_15_operand" "n") > + (match_operand:SI 5 "const_0_to_15_operand" "n") > + (match_operand:SI 6 "const_0_to_3_operand" "n")] > + MMA_AVVI4I4I2))] > "TARGET_MMA" > " %A0,%x2,%x3,%4,%5,%6" > [(set_attr "type" "mma") > (set_attr "length" "8")]) >=20 > (define_insn "mma_" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3D&d") > - (unspec:PXI [(match_operand:V16QI 1 "vsx_register_operand" "wa") > - (match_operand:V16QI 2 "vsx_register_operand" "wa") > - (match_operand:SI 3 "const_0_to_15_operand" "n") > - (match_operand:SI 4 "const_0_to_15_operand" "n")] > - MMA_VVI4I4))] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3D&d") > + (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa") > + (match_operand:V16QI 2 "vsx_register_operand" "wa") > + (match_operand:SI 3 "const_0_to_15_operand" "n") > + (match_operand:SI 4 "const_0_to_15_operand" "n")] > + 
MMA_VVI4I4))] > "TARGET_MMA" > " %A0,%x1,%x2,%3,%4" > [(set_attr "type" "mma") > (set_attr "length" "8")]) >=20 > (define_insn "mma_" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3D&d") > - (unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0") > - (match_operand:V16QI 2 "vsx_register_operand" "wa") > - (match_operand:V16QI 3 "vsx_register_operand" "wa") > - (match_operand:SI 4 "const_0_to_15_operand" "n") > - (match_operand:SI 5 "const_0_to_15_operand" "n")] > - MMA_AVVI4I4))] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3D&d") > + (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0") > + (match_operand:V16QI 2 "vsx_register_operand" "wa") > + (match_operand:V16QI 3 "vsx_register_operand" "wa") > + (match_operand:SI 4 "const_0_to_15_operand" "n") > + (match_operand:SI 5 "const_0_to_15_operand" "n")] > + MMA_AVVI4I4))] > "TARGET_MMA" > " %A0,%x2,%x3,%4,%5" > [(set_attr "type" "mma") > (set_attr "length" "8")]) >=20 > (define_insn "mma_" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3D&d") > - (unspec:PXI [(match_operand:POI 1 "vsx_register_operand" "wa") > - (match_operand:V16QI 2 "vsx_register_operand" "wa") > - (match_operand:SI 3 "const_0_to_15_operand" "n") > - (match_operand:SI 4 "const_0_to_3_operand" "n")] > - MMA_PVI4I2))] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3D&d") > + (unspec:XO [(match_operand:OO 1 "vsx_register_operand" "wa") > + (match_operand:V16QI 2 "vsx_register_operand" "wa") > + (match_operand:SI 3 "const_0_to_15_operand" "n") > + (match_operand:SI 4 "const_0_to_3_operand" "n")] > + MMA_PVI4I2))] > "TARGET_MMA" > " %A0,%x1,%x2,%3,%4" > [(set_attr "type" "mma") > (set_attr "length" "8")]) >=20 > (define_insn "mma_" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3D&d") > - (unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0") > - (match_operand:POI 2 "vsx_register_operand" "wa") > - (match_operand:V16QI 3 "vsx_register_operand" "wa") > - (match_operand:SI 4 "const_0_to_15_operand" "n") > - (match_operand:SI 5 "const_0_to_3_operand" "n")] > - MMA_APVI4I2))] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3D&d") > + (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0") > + (match_operand:OO 2 "vsx_register_operand" "wa") > + (match_operand:V16QI 3 "vsx_register_operand" "wa") > + (match_operand:SI 4 "const_0_to_15_operand" "n") > + (match_operand:SI 5 "const_0_to_3_operand" "n")] > + MMA_APVI4I2))] > "TARGET_MMA" > " %A0,%x2,%x3,%4,%5" > [(set_attr "type" "mma") > (set_attr "length" "8")]) >=20 > (define_insn "mma_" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3D&d") > - (unspec:PXI [(match_operand:V16QI 1 "vsx_register_operand" "wa") > - (match_operand:V16QI 2 "vsx_register_operand" "wa") > - (match_operand:SI 3 "const_0_to_15_operand" "n") > - (match_operand:SI 4 "const_0_to_15_operand" "n") > - (match_operand:SI 5 "const_0_to_15_operand" "n")] > - MMA_VVI4I4I4))] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3D&d") > + (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa") > + (match_operand:V16QI 2 "vsx_register_operand" "wa") > + (match_operand:SI 3 "const_0_to_15_operand" "n") > + (match_operand:SI 4 "const_0_to_15_operand" "n") > + (match_operand:SI 5 "const_0_to_15_operand" "n")] > + MMA_VVI4I4I4))] > "TARGET_MMA" > " %A0,%x1,%x2,%3,%4,%5" > [(set_attr "type" "mma") > (set_attr "length" "8")]) >=20 > (define_insn "mma_" > - [(set (match_operand:PXI 0 "fpr_reg_operand" "=3D&d") > - (unspec:PXI [(match_operand:PXI 1 "fpr_reg_operand" "0") > - (match_operand:V16QI 2 "vsx_register_operand" "wa") > - 
(match_operand:V16QI 3 "vsx_register_operand" "wa") > - (match_operand:SI 4 "const_0_to_15_operand" "n") > - (match_operand:SI 5 "const_0_to_15_operand" "n") > - (match_operand:SI 6 "const_0_to_15_operand" "n")] > - MMA_AVVI4I4I4))] > + [(set (match_operand:XO 0 "fpr_reg_operand" "=3D&d") > + (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0") > + (match_operand:V16QI 2 "vsx_register_operand" "wa") > + (match_operand:V16QI 3 "vsx_register_operand" "wa") > + (match_operand:SI 4 "const_0_to_15_operand" "n") > + (match_operand:SI 5 "const_0_to_15_operand" "n") > + (match_operand:SI 6 "const_0_to_15_operand" "n")] > + MMA_AVVI4I4I4))] > "TARGET_MMA" > " %A0,%x2,%x3,%4,%5,%6" > [(set_attr "type" "mma") > diff --git a/gcc/config/rs6000/predicates.md = b/gcc/config/rs6000/predicates.md > index 4c2fe7fa312..9ad5ae67302 100644 > --- a/gcc/config/rs6000/predicates.md > +++ b/gcc/config/rs6000/predicates.md > @@ -1144,6 +1144,18 @@ (define_special_predicate = "mma_assemble_input_operand" > (match_test "(mode =3D=3D V16QImode > && (vsx_register_operand (op, mode) || MEM_P (op)))")) >=20 > +;; Return 1 if this operand is valid for an MMA disassemble insn. > +(define_predicate "mma_disassemble_output_operand" > + (match_code "reg,subreg,mem") > +{ > + if (SUBREG_P (op)) > + op =3D SUBREG_REG (op); > + if (!REG_P (op)) > + return true; > + > + return vsx_register_operand (op, mode); > +}) > + > ;; Return true if operand is an operator used in rotate-and-mask = instructions. > (define_predicate "rotate_mask_operator" > (match_code "rotate,ashift,lshiftrt")) > diff --git a/gcc/config/rs6000/rs6000-builtin.def = b/gcc/config/rs6000/rs6000-builtin.def > index a58102c3785..47b1f74e616 100644 > --- a/gcc/config/rs6000/rs6000-builtin.def > +++ b/gcc/config/rs6000/rs6000-builtin.def > @@ -352,7 +352,7 @@ > | RS6000_BTC_UNARY), = \ > CODE_FOR_ ## ICODE) /* ICODE */ >=20 > -#define BU_MMA_V2(ENUM, NAME, ATTR, ICODE) = \ > +#define BU_MMA_2(ENUM, NAME, ATTR, ICODE) = \ > RS6000_BUILTIN_M (MMA_BUILTIN_ ## ENUM, /* ENUM */ = \ > "__builtin_mma_" NAME, /* NAME */ = \ > RS6000_BTM_MMA, /* MASK */ = \ > @@ -360,7 +360,13 @@ > | RS6000_BTC_BINARY = \ > | RS6000_BTC_VOID = \ > | RS6000_BTC_GIMPLE), = \ > - CODE_FOR_nothing) /* ICODE */ > + CODE_FOR_nothing) /* ICODE */ = \ > + RS6000_BUILTIN_M (MMA_BUILTIN_ ## ENUM ## _INTERNAL, /* ENUM = */ \ > + "__builtin_mma_" NAME "_internal", /* NAME */ = \ > + RS6000_BTM_MMA, /* MASK */ = \ > + (RS6000_BTC_ ## ATTR /* ATTR */ = \ > + | RS6000_BTC_BINARY), = \ > + CODE_FOR_ ## ICODE) /* ICODE */ >=20 > #define BU_MMA_3(ENUM, NAME, ATTR, ICODE) = \ > RS6000_BUILTIN_M (MMA_BUILTIN_ ## ENUM, /* ENUM */ = \ > @@ -3108,8 +3114,8 @@ BU_MMA_1 (XXMFACC, "xxmfacc", = QUAD, mma_xxmfacc) > BU_MMA_1 (XXMTACC, "xxmtacc", QUAD, mma_xxmtacc) > BU_MMA_1 (XXSETACCZ, "xxsetaccz", MISC, mma_xxsetaccz) >=20 > -BU_MMA_V2 (DISASSEMBLE_ACC, "disassemble_acc", QUAD, nothing) > -BU_MMA_V2 (DISASSEMBLE_PAIR,"disassemble_pair", PAIR, nothing) > +BU_MMA_2 (DISASSEMBLE_ACC, "disassemble_acc", QUAD, = mma_disassemble_acc) > +BU_MMA_2 (DISASSEMBLE_PAIR,"disassemble_pair", PAIR, = mma_disassemble_pair) >=20 > BU_MMA_3 (ASSEMBLE_PAIR, "assemble_pair", MISC, mma_assemble_pair) > BU_MMA_3 (XVBF16GER2, "xvbf16ger2", MISC, mma_xvbf16ger2) > diff --git a/gcc/config/rs6000/rs6000-call.c = b/gcc/config/rs6000/rs6000-call.c > index 3bd89a79bad..ca0c75778a9 100644 > --- a/gcc/config/rs6000/rs6000-call.c > +++ b/gcc/config/rs6000/rs6000-call.c > @@ -6325,6 +6325,22 @@ rs6000_discover_homogeneous_aggregate = (machine_mode 
mode, const_tree type, > bool > rs6000_return_in_memory (const_tree type, const_tree fntype = ATTRIBUTE_UNUSED) > { > + /* We do not allow MMA types being used as return values. Only = report > + the invalid return value usage the first time we encounter it. = */ > + if (cfun > + && !cfun->machine->mma_return_type_error > + && TREE_TYPE (cfun->decl) =3D=3D fntype > + && (TYPE_MODE (type) =3D=3D OOmode || TYPE_MODE (type) =3D=3D = XOmode)) > + { > + /* Record we have now handled function CFUN, so the next time = we > + are called, we do not re-report the same error. */ > + cfun->machine->mma_return_type_error =3D true; > + if (TYPE_CANONICAL (type) !=3D NULL_TREE) > + type =3D TYPE_CANONICAL (type); > + error ("invalid use of MMA type %qs as a function return = value", > + IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type)))); > + } > + > /* For the Darwin64 ABI, test if we can fit the return value in = regs. */ > if (TARGET_MACHO > && rs6000_darwin64_abi > @@ -6577,30 +6593,8 @@ machine_mode > rs6000_promote_function_mode (const_tree type ATTRIBUTE_UNUSED, > machine_mode mode, > int *punsignedp ATTRIBUTE_UNUSED, > - const_tree, int for_return) > + const_tree, int for_return = ATTRIBUTE_UNUSED) > { > - /* Warning: this is a static local variable and not always NULL! > - This function is called multiple times for the same function > - and return value. PREV_FUNC is used to keep track of the > - first time we encounter a function's return value in order > - to not report an error with that return value multiple times. = */ > - static struct function *prev_func =3D NULL; > - > - /* We do not allow MMA types being used as return values. Only = report > - the invalid return value usage the first time we encounter it. = */ > - if (for_return > - && prev_func !=3D cfun > - && (mode =3D=3D POImode || mode =3D=3D PXImode)) > - { > - /* Record we have now handled function CFUN, so the next time = we > - are called, we do not re-report the same error. */ > - prev_func =3D cfun; > - if (TYPE_CANONICAL (type) !=3D NULL_TREE) > - type =3D TYPE_CANONICAL (type); > - error ("invalid use of MMA type %qs as a function return = value", > - IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type)))); > - } > - > PROMOTE_MODE (mode, *punsignedp, type); >=20 > return mode; > @@ -7552,7 +7546,7 @@ rs6000_function_arg (cumulative_args_t cum_v, = const function_arg_info &arg) > int n_elts; >=20 > /* We do not allow MMA types being used as function arguments. */ > - if (mode =3D=3D POImode || mode =3D=3D PXImode) > + if (mode =3D=3D OOmode || mode =3D=3D XOmode) > { > if (TYPE_CANONICAL (type) !=3D NULL_TREE) > type =3D TYPE_CANONICAL (type); > @@ -10073,7 +10067,8 @@ mma_expand_builtin (tree exp, rtx target, bool = *expandedp) > } >=20 > unsigned attr_args =3D attr & RS6000_BTC_OPND_MASK; > - if (attr & RS6000_BTC_QUAD) > + if (attr & RS6000_BTC_QUAD > + || fcode =3D=3D MMA_BUILTIN_DISASSEMBLE_PAIR_INTERNAL) > attr_args++; >=20 > gcc_assert (nopnds =3D=3D attr_args); > @@ -11687,23 +11682,24 @@ rs6000_gimple_fold_mma_builtin = (gimple_stmt_iterator *gsi) > gimple *new_call; > tree new_decl; >=20 > - if (rs6000_builtin_info[fncode + 1].icode =3D=3D CODE_FOR_nothing) > + if (fncode =3D=3D MMA_BUILTIN_DISASSEMBLE_ACC > + || fncode =3D=3D MMA_BUILTIN_DISASSEMBLE_PAIR) > { > /* This is an MMA disassemble built-in function. 
*/ > - gcc_assert (fncode =3D=3D MMA_BUILTIN_DISASSEMBLE_ACC > - || fncode =3D=3D MMA_BUILTIN_DISASSEMBLE_PAIR); > - > push_gimplify_context (true); > + unsigned nvec =3D (fncode =3D=3D MMA_BUILTIN_DISASSEMBLE_ACC) ? = 4 : 2; > tree dst_ptr =3D gimple_call_arg (stmt, 0); > tree src_ptr =3D gimple_call_arg (stmt, 1); > tree src_type =3D TREE_TYPE (src_ptr); > tree src =3D make_ssa_name (TREE_TYPE (src_type)); > gimplify_assign (src, build_simple_mem_ref (src_ptr), &new_seq); >=20 > - /* If we are not disassembling an accumulator or our = destination is > - another accumulator, then just copy the entire thing as is. */ > - if (fncode !=3D MMA_BUILTIN_DISASSEMBLE_ACC > - || TREE_TYPE (TREE_TYPE (dst_ptr)) =3D=3D = vector_quad_type_node) > + /* If we are not disassembling an accumulator/pair or our = destination is > + another accumulator/pair, then just copy the entire thing as = is. */ > + if ((fncode =3D=3D MMA_BUILTIN_DISASSEMBLE_ACC > + && TREE_TYPE (TREE_TYPE (dst_ptr)) =3D=3D = vector_quad_type_node) > + || (fncode =3D=3D MMA_BUILTIN_DISASSEMBLE_PAIR > + && TREE_TYPE (TREE_TYPE (dst_ptr)) =3D=3D = vector_pair_type_node)) > { > tree dst =3D build_simple_mem_ref (build1 (VIEW_CONVERT_EXPR, > src_type, dst_ptr)); > @@ -11713,29 +11709,33 @@ rs6000_gimple_fold_mma_builtin = (gimple_stmt_iterator *gsi) > return true; > } >=20 > - /* We're disassembling an accumulator into a different type, so = we need > + /* If we're disassembling an accumulator into a different type, = we need > to emit a xxmfacc instruction now, since we cannot do it later. = */ > - new_decl =3D = rs6000_builtin_decls[MMA_BUILTIN_XXMFACC_INTERNAL]; > - new_call =3D gimple_build_call (new_decl, 1, src); > - src =3D make_ssa_name (vector_quad_type_node); > - gimple_call_set_lhs (new_call, src); > - gimple_seq_add_stmt (&new_seq, new_call); > + if (fncode =3D=3D MMA_BUILTIN_DISASSEMBLE_ACC) > + { > + new_decl =3D = rs6000_builtin_decls[MMA_BUILTIN_XXMFACC_INTERNAL]; > + new_call =3D gimple_build_call (new_decl, 1, src); > + src =3D make_ssa_name (vector_quad_type_node); > + gimple_call_set_lhs (new_call, src); > + gimple_seq_add_stmt (&new_seq, new_call); > + } >=20 > - /* Copy the accumulator vector by vector. */ > + /* Copy the accumulator/pair vector by vector. */ > + new_decl =3D rs6000_builtin_decls[fncode + 1]; > tree dst_type =3D build_pointer_type_for_mode = (unsigned_V16QI_type_node, > ptr_mode, true); > tree dst_base =3D build1 (VIEW_CONVERT_EXPR, dst_type, dst_ptr); > - tree array_type =3D build_array_type_nelts = (unsigned_V16QI_type_node, 4); > - tree src_array =3D build1 (VIEW_CONVERT_EXPR, array_type, src); > - for (unsigned i =3D 0; i < 4; i++) > + for (unsigned i =3D 0; i < nvec; i++) > { > - unsigned index =3D WORDS_BIG_ENDIAN ? i : 3 - i; > - tree ref =3D build4 (ARRAY_REF, unsigned_V16QI_type_node, = src_array, > - build_int_cst (size_type_node, i), > - NULL_TREE, NULL_TREE); > + unsigned index =3D WORDS_BIG_ENDIAN ? 
i : nvec - 1 - i; > tree dst =3D build2 (MEM_REF, unsigned_V16QI_type_node, = dst_base, > build_int_cst (dst_type, index * 16)); > - gimplify_assign (dst, ref, &new_seq); > + tree dstssa =3D make_ssa_name (unsigned_V16QI_type_node); > + new_call =3D gimple_build_call (new_decl, 2, src, > + build_int_cstu = (uint16_type_node, i)); > + gimple_call_set_lhs (new_call, dstssa); > + gimple_seq_add_stmt (&new_seq, new_call); > + gimplify_assign (dst, dstssa, &new_seq); > } > pop_gimplify_context (NULL); > gsi_replace_with_seq (gsi, new_seq, true); > @@ -13190,17 +13190,23 @@ rs6000_init_builtins (void) > /* Vector pair and vector quad support. */ > if (TARGET_EXTRA_BUILTINS) > { > - vector_pair_type_node =3D make_unsigned_type (256); > + vector_pair_type_node =3D make_node (OPAQUE_TYPE); > + SET_TYPE_MODE (vector_pair_type_node, OOmode); > + TYPE_SIZE (vector_pair_type_node) =3D bitsize_int = (GET_MODE_BITSIZE (OOmode)); > + TYPE_PRECISION (vector_pair_type_node) =3D GET_MODE_BITSIZE = (OOmode); > + TYPE_SIZE_UNIT (vector_pair_type_node) =3D size_int = (GET_MODE_SIZE (OOmode)); > SET_TYPE_ALIGN (vector_pair_type_node, 256); > - SET_TYPE_MODE (vector_pair_type_node, POImode); > - layout_type (vector_pair_type_node); > + TYPE_USER_ALIGN (vector_pair_type_node) =3D 0; > lang_hooks.types.register_builtin_type (vector_pair_type_node, > "__vector_pair"); >=20 > - vector_quad_type_node =3D make_unsigned_type (512); > + vector_quad_type_node =3D make_node (OPAQUE_TYPE); > + SET_TYPE_MODE (vector_quad_type_node, XOmode); > + TYPE_SIZE (vector_quad_type_node) =3D bitsize_int = (GET_MODE_BITSIZE (XOmode)); > + TYPE_PRECISION (vector_quad_type_node) =3D GET_MODE_BITSIZE = (XOmode); > + TYPE_SIZE_UNIT (vector_quad_type_node) =3D size_int = (GET_MODE_SIZE (XOmode)); > SET_TYPE_ALIGN (vector_quad_type_node, 512); > - SET_TYPE_MODE (vector_quad_type_node, PXImode); > - layout_type (vector_quad_type_node); > + TYPE_USER_ALIGN (vector_quad_type_node) =3D 0; > lang_hooks.types.register_builtin_type (vector_quad_type_node, > "__vector_quad"); > } > @@ -13236,8 +13242,8 @@ rs6000_init_builtins (void) > builtin_mode_to_type[V8HImode][1] =3D unsigned_V8HI_type_node; > builtin_mode_to_type[V16QImode][0] =3D V16QI_type_node; > builtin_mode_to_type[V16QImode][1] =3D unsigned_V16QI_type_node; > - builtin_mode_to_type[POImode][1] =3D vector_pair_type_node; > - builtin_mode_to_type[PXImode][1] =3D vector_quad_type_node; > + builtin_mode_to_type[OOmode][1] =3D vector_pair_type_node; > + builtin_mode_to_type[XOmode][1] =3D vector_quad_type_node; >=20 > tdecl =3D add_builtin_type ("__bool char", bool_char_type_node); > TYPE_NAME (bool_char_type_node) =3D tdecl; > @@ -14049,21 +14055,21 @@ mma_init_builtins (void) > } > else > { > - if ((attr & RS6000_BTC_QUAD) =3D=3D 0) > + if ( !(d->code =3D=3D MMA_BUILTIN_DISASSEMBLE_ACC_INTERNAL > + || d->code =3D=3D = MMA_BUILTIN_DISASSEMBLE_PAIR_INTERNAL) > + && (attr & RS6000_BTC_QUAD) =3D=3D 0) > attr_args--; >=20 > /* Ensure we have the correct number and type of operands. */ > gcc_assert (attr_args =3D=3D insn_data[icode].n_operands - 1); > } >=20 > - if (icode =3D=3D CODE_FOR_nothing) > + /* This is a disassemble pair/acc function. */ > + if (d->code =3D=3D MMA_BUILTIN_DISASSEMBLE_ACC > + || d->code =3D=3D MMA_BUILTIN_DISASSEMBLE_PAIR) > { > - /* This is a disassemble MMA built-in function. 
*/ > - gcc_assert (attr_args =3D=3D RS6000_BTC_BINARY > - && (d->code =3D=3D MMA_BUILTIN_DISASSEMBLE_ACC > - || d->code =3D=3D = MMA_BUILTIN_DISASSEMBLE_PAIR)); > op[nopnds++] =3D build_pointer_type (void_type_node); > - if (attr & RS6000_BTC_QUAD) > + if (d->code =3D=3D MMA_BUILTIN_DISASSEMBLE_ACC) > op[nopnds++] =3D build_pointer_type (vector_quad_type_node); > else > op[nopnds++] =3D build_pointer_type (vector_pair_type_node); > @@ -14071,13 +14077,17 @@ mma_init_builtins (void) > else > { > /* This is a normal MMA built-in function. */ > - unsigned j =3D (attr & RS6000_BTC_QUAD) ? 1 : 0; > + unsigned j =3D 0; > + if (attr & RS6000_BTC_QUAD > + && d->code !=3D MMA_BUILTIN_DISASSEMBLE_ACC_INTERNAL > + && d->code !=3D MMA_BUILTIN_DISASSEMBLE_PAIR_INTERNAL) > + j =3D 1; > for (; j < (unsigned) insn_data[icode].n_operands; j++) > { > machine_mode mode =3D insn_data[icode].operand[j].mode; > - if (gimple_func && mode =3D=3D PXImode) > + if (gimple_func && mode =3D=3D XOmode) > op[nopnds++] =3D build_pointer_type = (vector_quad_type_node); > - else if (gimple_func && mode =3D=3D POImode > + else if (gimple_func && mode =3D=3D OOmode > && d->code =3D=3D MMA_BUILTIN_ASSEMBLE_PAIR) > op[nopnds++] =3D build_pointer_type = (vector_pair_type_node); > else > diff --git a/gcc/config/rs6000/rs6000-modes.def = b/gcc/config/rs6000/rs6000-modes.def > index ddb218b3fba..e81a32c8c36 100644 > --- a/gcc/config/rs6000/rs6000-modes.def > +++ b/gcc/config/rs6000/rs6000-modes.def > @@ -83,12 +83,6 @@ VECTOR_MODE (INT, SI, 2); /* = V2SI */ > combination. */ > PARTIAL_INT_MODE (TI, 128, PTI); >=20 > -/* Define, but don't use the larger integer modes. We need an = integer mode > - defined that is the same size as the vector pair and vector quad = modes. */ > - > -INT_MODE (OI, 32); > -INT_MODE (XI, 64); > - > /* Modes used by __vector_pair and __vector_quad. */ > -PARTIAL_INT_MODE (OI, 256, POI); /* __vector_pair. */ > -PARTIAL_INT_MODE (XI, 512, PXI); /* __vector_quad. */ > +OPAQUE_MODE (OO, 32); > +OPAQUE_MODE (XO, 64); > diff --git a/gcc/config/rs6000/rs6000-string.c = b/gcc/config/rs6000/rs6000-string.c > index 82cc24ecdda..a2e6821d353 100644 > --- a/gcc/config/rs6000/rs6000-string.c > +++ b/gcc/config/rs6000/rs6000-string.c > @@ -2787,7 +2787,7 @@ expand_block_move (rtx operands[], bool = might_overlap) > rtx src, dest; > bool move_with_length =3D false; >=20 > - /* Use POImode for paired vsx load/store. Use V2DI for single > + /* Use OOmode for paired vsx load/store. Use V2DI for single > unaligned vsx load/store, for consistency with what other > expansions (compare) already do, and so we can use lxvd2x on > p8. Order is VSX pair unaligned, VSX unaligned, Altivec, VSX > @@ -2799,8 +2799,8 @@ expand_block_move (rtx operands[], bool = might_overlap) > && (align >=3D 256 || !STRICT_ALIGNMENT)) > { > move_bytes =3D 32; > - mode =3D POImode; > - gen_func.mov =3D gen_movpoi; > + mode =3D OOmode; > + gen_func.mov =3D gen_movoo; > } > else if (TARGET_POWERPC64 && TARGET_BLOCK_OPS_UNALIGNED_VSX > && VECTOR_MEM_VSX_P (V2DImode) > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > index d7dcd93f088..bd8205c87f7 100644 > --- a/gcc/config/rs6000/rs6000.c > +++ b/gcc/config/rs6000/rs6000.c > @@ -1826,15 +1826,12 @@ rs6000_hard_regno_mode_ok_uncached (int regno, = machine_mode mode) > mode =3D GET_MODE_INNER (mode); >=20 > /* Vector pair modes need even/odd VSX register pairs. Only allow = vector > - registers. 
We need to allow OImode to have the same registers = as POImode, > - even though we do not enable the move pattern for OImode. */ > - if (mode =3D=3D POImode || mode =3D=3D OImode) > + registers. */ > + if (mode =3D=3D OOmode) > return (TARGET_MMA && VSX_REGNO_P (regno) && (regno & 1) =3D=3D = 0); >=20 > - /* MMA accumulator modes need FPR registers divisible by 4. We = need to allow > - XImode to have the same registers as PXImode, even though we do = not enable > - the move pattern for XImode. */ > - if (mode =3D=3D PXImode || mode =3D=3D XImode) > + /* MMA accumulator modes need FPR registers divisible by 4. */ > + if (mode =3D=3D XOmode) > return (TARGET_MMA && FP_REGNO_P (regno) && (regno & 3) =3D=3D 0); >=20 > /* PTImode can only go in GPRs. Quad word memory operations require = even/odd > @@ -1941,8 +1938,8 @@ rs6000_hard_regno_mode_ok (unsigned int regno, = machine_mode mode) > GPR registers, and TImode can go in any GPR as well as VSX = registers (PR > 57744). >=20 > - Similarly, don't allow POImode (vector pair, restricted to even = VSX > - registers) or PXImode (vector quad, restricted to FPR registers = divisible > + Similarly, don't allow OOmode (vector pair, restricted to even VSX > + registers) or XOmode (vector quad, restricted to FPR registers = divisible > by 4) to tie with other modes. >=20 > Altivec/VSX vector tests were moved ahead of scalar float mode, so = that IEEE > @@ -1951,8 +1948,8 @@ rs6000_hard_regno_mode_ok (unsigned int regno, = machine_mode mode) > static bool > rs6000_modes_tieable_p (machine_mode mode1, machine_mode mode2) > { > - if (mode1 =3D=3D PTImode || mode1 =3D=3D POImode || mode1 =3D=3D = PXImode > - || mode2 =3D=3D PTImode || mode2 =3D=3D POImode || mode2 =3D=3D = PXImode) > + if (mode1 =3D=3D PTImode || mode1 =3D=3D OOmode || mode1 =3D=3D = XOmode > + || mode2 =3D=3D PTImode || mode2 =3D=3D OOmode || mode2 =3D=3D = XOmode) > return mode1 =3D=3D mode2; >=20 > if (ALTIVEC_OR_VSX_VECTOR_MODE (mode1)) > @@ -2241,10 +2238,8 @@ rs6000_debug_reg_global (void) > V2DFmode, > V8SFmode, > V4DFmode, > - OImode, > - XImode, > - POImode, > - PXImode, > + OOmode, > + XOmode, > CCmode, > CCUNSmode, > CCEQmode, > @@ -2706,13 +2701,13 @@ rs6000_setup_reg_addr_masks (void) > since it will be broken into two vector moves. Vector = quads can > only do offset loads. */ > else if ((addr_mask !=3D 0) && TARGET_MMA > - && (m2 =3D=3D POImode || m2 =3D=3D PXImode)) > + && (m2 =3D=3D OOmode || m2 =3D=3D XOmode)) > { > addr_mask |=3D RELOAD_REG_OFFSET; > if (rc =3D=3D RELOAD_REG_FPR || rc =3D=3D RELOAD_REG_VMX) > { > addr_mask |=3D RELOAD_REG_QUAD_OFFSET; > - if (m2 =3D=3D POImode) > + if (m2 =3D=3D OOmode) > addr_mask |=3D RELOAD_REG_INDEXED; > } > } > @@ -2921,13 +2916,13 @@ rs6000_init_hard_regno_mode_ok (bool = global_init_p) > /* Add support for vector pairs and vector quad registers. 
*/ > if (TARGET_MMA) > { > - rs6000_vector_unit[POImode] =3D VECTOR_NONE; > - rs6000_vector_mem[POImode] =3D VECTOR_VSX; > - rs6000_vector_align[POImode] =3D 256; > + rs6000_vector_unit[OOmode] =3D VECTOR_NONE; > + rs6000_vector_mem[OOmode] =3D VECTOR_VSX; > + rs6000_vector_align[OOmode] =3D 256; >=20 > - rs6000_vector_unit[PXImode] =3D VECTOR_NONE; > - rs6000_vector_mem[PXImode] =3D VECTOR_VSX; > - rs6000_vector_align[PXImode] =3D 512; > + rs6000_vector_unit[XOmode] =3D VECTOR_NONE; > + rs6000_vector_mem[XOmode] =3D VECTOR_VSX; > + rs6000_vector_align[XOmode] =3D 512; > } >=20 > /* Register class constraints for the constraints that depend on = compile > @@ -3064,10 +3059,10 @@ rs6000_init_hard_regno_mode_ok (bool = global_init_p) >=20 > if (TARGET_MMA) > { > - reg_addr[POImode].reload_store =3D = CODE_FOR_reload_poi_di_store; > - reg_addr[POImode].reload_load =3D = CODE_FOR_reload_poi_di_load; > - reg_addr[PXImode].reload_store =3D = CODE_FOR_reload_pxi_di_store; > - reg_addr[PXImode].reload_load =3D = CODE_FOR_reload_pxi_di_load; > + reg_addr[OOmode].reload_store =3D = CODE_FOR_reload_oo_di_store; > + reg_addr[OOmode].reload_load =3D = CODE_FOR_reload_oo_di_load; > + reg_addr[XOmode].reload_store =3D = CODE_FOR_reload_xo_di_store; > + reg_addr[XOmode].reload_load =3D = CODE_FOR_reload_xo_di_load; > } > } > } > @@ -8129,8 +8124,8 @@ reg_offset_addressing_ok_p (machine_mode mode) >=20 > /* The vector pair/quad types support offset addressing if the > underlying vectors support offset addressing. */ > - case E_POImode: > - case E_PXImode: > + case E_OOmode: > + case E_XOmode: > return TARGET_MMA; >=20 > case E_SDmode: > @@ -10323,11 +10318,11 @@ rs6000_emit_move (rtx dest, rtx source, = machine_mode mode) > operands[1] =3D force_const_mem (mode, operands[1]); > break; >=20 > - case E_POImode: > - case E_PXImode: > + case E_OOmode: > + case E_XOmode: > if (CONST_INT_P (operands[1]) && INTVAL (operands[1]) !=3D 0) > error ("%qs is an opaque type, and you can't set it to other = values.", > - (mode =3D=3D POImode) ? "__vector_pair" : = "__vector_quad"); > + (mode =3D=3D OOmode) ? "__vector_pair" : = "__vector_quad"); > break; >=20 > case E_SImode: > @@ -12596,10 +12591,10 @@ rs6000_preferred_reload_class (rtx x, enum = reg_class rclass) > the GPR registers. */ > if (rclass =3D=3D GEN_OR_FLOAT_REGS) > { > - if (mode =3D=3D POImode) > + if (mode =3D=3D OOmode) > return VSX_REGS; >=20 > - if (mode =3D=3D PXImode) > + if (mode =3D=3D XOmode) > return FLOAT_REGS; >=20 > if (GET_MODE_CLASS (mode) =3D=3D MODE_INT) > @@ -16323,15 +16318,15 @@ rs6000_split_multireg_move (rtx dst, rtx = src) >=20 > /* If we have a vector quad register for MMA, and this is a load or = store, > see if we can use vector paired load/stores. */ > - if (mode =3D=3D PXImode && TARGET_MMA > + if (mode =3D=3D XOmode && TARGET_MMA > && (MEM_P (dst) || MEM_P (src))) > { > - reg_mode =3D POImode; > + reg_mode =3D OOmode; > nregs /=3D 2; > } > /* If we have a vector pair/quad mode, split it into two/four = separate > vectors. */ > - else if (mode =3D=3D POImode || mode =3D=3D PXImode) > + else if (mode =3D=3D OOmode || mode =3D=3D XOmode) > reg_mode =3D V1TImode; > else if (FP_REGNO_P (reg)) > reg_mode =3D DECIMAL_FLOAT_MODE_P (mode) ? 
DDmode :
> @@ -16377,12 +16372,16 @@ rs6000_split_multireg_move (rtx dst, rtx src)
>        return;
>      }
>
> -  /* The __vector_pair and __vector_quad modes are multi-register modes,
> -     so if have to load or store the registers, we have to be careful to
> -     properly swap them if we're in little endian mode below.  This means
> -     the last register gets the first memory location.  */
> -  if (mode == POImode || mode == PXImode)
> +  /* The __vector_pair and __vector_quad modes are multi-register
> +     modes, so if we have to load or store the registers, we have to be
> +     careful to properly swap them if we're in little endian mode
> +     below.  This means the last register gets the first memory
> +     location.  We also need to be careful of using the right register
> +     numbers if we are splitting XO to OO.  */
> +  if (mode == OOmode || mode == XOmode)
>      {
> +      nregs = hard_regno_nregs (reg, mode);
> +      int reg_mode_nregs = hard_regno_nregs (reg, reg_mode);
>        if (MEM_P (dst))
>          {
>            unsigned offset = 0;
> @@ -16391,15 +16390,15 @@ rs6000_split_multireg_move (rtx dst, rtx src)
>            /* If we are reading an accumulator register, we have to
>               deprime it before we can access it.  */
>            if (TARGET_MMA
> -              && GET_MODE (src) == PXImode && FP_REGNO_P (REGNO (src)))
> +              && GET_MODE (src) == XOmode && FP_REGNO_P (REGNO (src)))
>              emit_insn (gen_mma_xxmfacc (src, src));
>
> -          for (int i = 0; i < nregs; i++)
> +          for (int i = 0; i < nregs; i += reg_mode_nregs)
>              {
> -              unsigned subreg = (WORDS_BIG_ENDIAN)
> -                                ? i * size : (nregs - 1 - i) * size;
> +              unsigned subreg =
> +                (WORDS_BIG_ENDIAN) ? i : (nregs - reg_mode_nregs - i);
>                rtx dst2 = adjust_address (dst, reg_mode, offset);
> -              rtx src2 = simplify_gen_subreg (reg_mode, src, mode, subreg);
> +              rtx src2 = gen_rtx_REG (reg_mode, reg + subreg);
>                offset += size;
>                emit_insn (gen_rtx_SET (dst2, src2));
>              }
> @@ -16412,11 +16411,11 @@ rs6000_split_multireg_move (rtx dst, rtx src)
>            unsigned offset = 0;
>            unsigned size = GET_MODE_SIZE (reg_mode);
>
> -          for (int i = 0; i < nregs; i++)
> +          for (int i = 0; i < nregs; i += reg_mode_nregs)
>              {
> -              unsigned subreg = (WORDS_BIG_ENDIAN)
> -                                ? i * size : (nregs - 1 - i) * size;
> -              rtx dst2 = simplify_gen_subreg (reg_mode, dst, mode, subreg);
> +              unsigned subreg =
> +                (WORDS_BIG_ENDIAN) ? i : (nregs - reg_mode_nregs - i);
> +              rtx dst2 = gen_rtx_REG (reg_mode, reg + subreg);
>                rtx src2 = adjust_address (src, reg_mode, offset);
>                offset += size;
>                emit_insn (gen_rtx_SET (dst2, src2));
> @@ -16425,7 +16424,7 @@ rs6000_split_multireg_move (rtx dst, rtx src)
>            /* If we are writing an accumulator register, we have to
>               prime it after we've written it.  */
>            if (TARGET_MMA
> -              && GET_MODE (dst) == PXImode && FP_REGNO_P (REGNO (dst)))
> +              && GET_MODE (dst) == XOmode && FP_REGNO_P (REGNO (dst)))
>              emit_insn (gen_mma_xxmtacc (dst, dst));
>
>            return;
> @@ -16433,9 +16432,12 @@ rs6000_split_multireg_move (rtx dst, rtx src)
>
>    if (GET_CODE (src) == UNSPEC)
>      {
> -      gcc_assert (REG_P (dst)
> -                  && FP_REGNO_P (REGNO (dst))
> -                  && XINT (src, 1) == UNSPEC_MMA_ASSEMBLE_ACC);
> +      gcc_assert (XINT (src, 1) == UNSPEC_MMA_ASSEMBLE);
> +      gcc_assert (REG_P (dst));
> +      if (GET_MODE (src) == XOmode)
> +        gcc_assert (FP_REGNO_P (REGNO (dst)));
> +      if (GET_MODE (src) == OOmode)
> +        gcc_assert (VSX_REGNO_P (REGNO (dst)));
>
>        reg_mode = GET_MODE (XVECEXP (src, 0, 0));
>        for (int i = 0; i < XVECLEN (src, 0); i++)
> @@ -16446,7 +16448,8 @@ rs6000_split_multireg_move (rtx dst, rtx src)
>
>        /* We are writing an accumulator register, so we have to
>           prime it after we've written it.  */
> -      emit_insn (gen_mma_xxmtacc (dst, dst));
> +      if (GET_MODE (src) == XOmode)
> +        emit_insn (gen_mma_xxmtacc (dst, dst));
>
>        return;
>      }
> @@ -16459,22 +16462,35 @@ rs6000_split_multireg_move (rtx dst, rtx src)
>        /* If we are reading an accumulator register, we have to
>           deprime it before we can access it.  */
>        if (TARGET_MMA
> -          && GET_MODE (src) == PXImode && FP_REGNO_P (REGNO (src)))
> +          && GET_MODE (src) == XOmode && FP_REGNO_P (REGNO (src)))
>          emit_insn (gen_mma_xxmfacc (src, src));
>
>        /* Move register range backwards, if we might have destructive
>           overlap.  */
>        int i;
> -      for (i = nregs - 1; i >= 0; i--)
> -        emit_insn (gen_rtx_SET (simplify_gen_subreg (reg_mode, dst, mode,
> -                                                     i * reg_mode_size),
> -                                simplify_gen_subreg (reg_mode, src, mode,
> -                                                     i * reg_mode_size)));
> +      /* XO/OO are opaque so cannot use subregs.  */
> +      if (mode == OOmode || mode == XOmode )
> +        {
> +          for (i = nregs - 1; i >= 0; i--)
> +            {
> +              rtx dst_i = gen_rtx_REG (reg_mode, REGNO (dst) + i);
> +              rtx src_i = gen_rtx_REG (reg_mode, REGNO (src) + i);
> +              emit_insn (gen_rtx_SET (dst_i, src_i));
> +            }
> +        }
> +      else
> +        {
> +          for (i = nregs - 1; i >= 0; i--)
> +            emit_insn (gen_rtx_SET (simplify_gen_subreg (reg_mode, dst, mode,
> +                                                         i * reg_mode_size),
> +                                    simplify_gen_subreg (reg_mode, src, mode,
> +                                                         i * reg_mode_size)));
> +        }
>
>        /* If we are writing an accumulator register, we have to
>           prime it after we've written it.  */
>        if (TARGET_MMA
> -          && GET_MODE (dst) == PXImode && FP_REGNO_P (REGNO (dst)))
> +          && GET_MODE (dst) == XOmode && FP_REGNO_P (REGNO (dst)))
>          emit_insn (gen_mma_xxmtacc (dst, dst));
>      }
>    else
> @@ -16611,7 +16627,7 @@ rs6000_split_multireg_move (rtx dst, rtx src)
>        /* If we are reading an accumulator register, we have to
>           deprime it before we can access it.  */
>        if (TARGET_MMA && REG_P (src)
> -          && GET_MODE (src) == PXImode && FP_REGNO_P (REGNO (src)))
> +          && GET_MODE (src) == XOmode && FP_REGNO_P (REGNO (src)))
>          emit_insn (gen_mma_xxmfacc (src, src));
>
>        for (i = 0; i < nregs; i++)
> @@ -16626,16 +16642,24 @@ rs6000_split_multireg_move (rtx dst, rtx src)
>            if (j == 0 && used_update)
>              continue;
>
> -          emit_insn (gen_rtx_SET (simplify_gen_subreg (reg_mode, dst, mode,
> -                                                       j * reg_mode_size),
> -                                  simplify_gen_subreg (reg_mode, src, mode,
> -                                                       j * reg_mode_size)));
> +          /* XO/OO are opaque so cannot use subregs.  */
> +          if (mode == OOmode || mode == XOmode )
> +            {
> +              rtx dst_i = gen_rtx_REG (reg_mode, REGNO (dst) + j);
> +              rtx src_i = gen_rtx_REG (reg_mode, REGNO (src) + j);
> +              emit_insn (gen_rtx_SET (dst_i, src_i));
> +            }
> +          else
> +            emit_insn (gen_rtx_SET (simplify_gen_subreg (reg_mode, dst, mode,
> +                                                         j * reg_mode_size),
> +                                    simplify_gen_subreg (reg_mode, src, mode,
> +                                                         j * reg_mode_size)));
>          }
>
>        /* If we are writing an accumulator register, we have to
>           prime it after we've written it.  */
>        if (TARGET_MMA && REG_P (dst)
> -          && GET_MODE (dst) == PXImode && FP_REGNO_P (REGNO (dst)))
> +          && GET_MODE (dst) == XOmode && FP_REGNO_P (REGNO (dst)))
>          emit_insn (gen_mma_xxmtacc (dst, dst));
>
>        if (restore_basereg != NULL_RTX)
> @@ -19865,7 +19889,8 @@ rs6000_mangle_type (const_tree type)
>    type = TYPE_MAIN_VARIANT (type);
>
>    if (TREE_CODE (type) != VOID_TYPE && TREE_CODE (type) != BOOLEAN_TYPE
> -      && TREE_CODE (type) != INTEGER_TYPE && TREE_CODE (type) != REAL_TYPE)
> +      && TREE_CODE (type) != INTEGER_TYPE && TREE_CODE (type) != REAL_TYPE
> +      && TREE_CODE (type) != OPAQUE_TYPE)
>      return NULL;
>
>    if (type == bool_char_type_node) return "U6__boolc";
> @@ -21753,6 +21778,14 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
>        }
>        break;
>
> +    case UNSPEC:
> +      if (XINT (x, 1) == UNSPEC_MMA_XXSETACCZ)
> +        {
> +          *total = 0;
> +          return true;
> +        }
> +      break;
> +
>      default:
>        break;
>      }
> @@ -27186,14 +27219,14 @@ rs6000_invalid_conversion (const_tree fromtype, const_tree totype)
>
>    if (frommode != tomode)
>      {
> -      /* Do not allow conversions to/from PXImode and POImode types.  */
> -      if (frommode == PXImode)
> +      /* Do not allow conversions to/from XOmode and OOmode types.  */
> +      if (frommode == XOmode)
>          return N_("invalid conversion from type %<__vector_quad%>");
> -      if (tomode == PXImode)
> +      if (tomode == XOmode)
>          return N_("invalid conversion to type %<__vector_quad%>");
> -      if (frommode == POImode)
> +      if (frommode == OOmode)
>          return N_("invalid conversion from type %<__vector_pair%>");
> -      if (tomode == POImode)
> +      if (tomode == OOmode)
>          return N_("invalid conversion to type %<__vector_pair%>");
>      }
>    else if (POINTER_TYPE_P (fromtype) && POINTER_TYPE_P (totype))
> @@ -27202,19 +27235,19 @@ rs6000_invalid_conversion (const_tree fromtype, const_tree totype)
>        frommode = TYPE_MODE (TREE_TYPE (fromtype));
>        tomode = TYPE_MODE (TREE_TYPE (totype));
>
> -      /* Do not allow conversions to/from PXImode and POImode pointer
> +      /* Do not allow conversions to/from XOmode and OOmode pointer
>           types, except to/from void pointers.  */
>        if (frommode != tomode
>            && frommode != VOIDmode
>            && tomode != VOIDmode)
>          {
> -          if (frommode == PXImode)
> +          if (frommode == XOmode)
>              return N_("invalid conversion from type %<* __vector_quad%>");
> -          if (tomode == PXImode)
> +          if (tomode == XOmode)
>              return N_("invalid conversion to type %<* __vector_quad%>");
> -          if (frommode == POImode)
> +          if (frommode == OOmode)
>              return N_("invalid conversion from type %<* __vector_pair%>");
> -          if (tomode == POImode)
> +          if (tomode == OOmode)
>              return N_("invalid conversion to type %<* __vector_pair%>");
>          }
>      }
> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index 5a47aa14722..f35aaf4ffd1 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -1041,7 +1041,7 @@ enum data_align { align_abi, align_opt, align_both };
>  /* Modes that are not vectors, but require vector alignment.  Treat these like
>     vectors in terms of loads and stores.  */
>  #define VECTOR_ALIGNMENT_P(MODE)                                      \
> -  (FLOAT128_VECTOR_P (MODE) || (MODE) == POImode || (MODE) == PXImode)
> +  (FLOAT128_VECTOR_P (MODE) || (MODE) == OOmode || (MODE) == XOmode)
>
>  #define ALTIVEC_VECTOR_MODE(MODE)                                     \
>    ((MODE) == V16QImode                                                \
> @@ -2556,6 +2556,7 @@ typedef struct GTY(()) machine_function
>    bool fpr_is_wrapped_separately[32];
>    bool lr_is_wrapped_separately;
>    bool toc_is_wrapped_separately;
> +  bool mma_return_type_error;
>  } machine_function;
>  #endif
>
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 5e5ad9f7c3d..b3f77ec665c 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -778,7 +778,7 @@ (define_mode_attr BOOL_REGS_UNARY [(TI "r,0,0,wa,v")
>  ;; supplement addressing modes.
>  (define_mode_iterator RELOAD [V16QI V8HI V4SI V2DI V4SF V2DF V1TI
>                                SF SD SI DF DD DI TI PTI KF IF TF
> -                              POI PXI])
> +                              OO XO])
>
>  ;; Iterate over smin, smax
>  (define_code_iterator fp_minmax [smin smax])
> diff --git a/gcc/testsuite/gcc.target/powerpc/mma-double-test.c b/gcc/testsuite/gcc.target/powerpc/mma-double-test.c
> index 53843794a95..254af7f8f79 100755
> --- a/gcc/testsuite/gcc.target/powerpc/mma-double-test.c
> +++ b/gcc/testsuite/gcc.target/powerpc/mma-double-test.c
> @@ -181,6 +181,9 @@ main (int argc, char *argv[])
>      printf ("MMA double test fail: %d errors\n",ret);
>    else
>      printf ("MMA single test success: 0 MMA errors\n");
> +#else
> +  if (ret)
> +    abort();
>  #endif
>
>    return ret;
> diff --git a/gcc/testsuite/gcc.target/powerpc/mma-single-test.c b/gcc/testsuite/gcc.target/powerpc/mma-single-test.c
> index ac4125ba329..ebbc5ae2e1b 100755
> --- a/gcc/testsuite/gcc.target/powerpc/mma-single-test.c
> +++ b/gcc/testsuite/gcc.target/powerpc/mma-single-test.c
> @@ -189,6 +189,9 @@ main (int argc, char *argv[])
>      printf ("MMA single test fail: %d errors\n",ret);
>    else
>      printf ("MMA single test success: 0 MMA errors\n");
> +#else
> +  if (ret)
> +    abort();
>  #endif
>
>    return ret;
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr96506.c b/gcc/testsuite/gcc.target/powerpc/pr96506-1.c
> similarity index 61%
> rename from gcc/testsuite/gcc.target/powerpc/pr96506.c
> rename to gcc/testsuite/gcc.target/powerpc/pr96506-1.c
> index b1b40c5a5c8..91835cec30c 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr96506.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr96506-1.c
> @@ -40,27 +40,3 @@ foo3 (void)
>    vquad_t v;
>    bar3 (v); /* { dg-error "invalid use of MMA operand of type .__vector_quad. as a function parameter" } */
>  }
> -
> -__vector_pair
> -foo4 (__vector_pair *src) /* { dg-error "invalid use of MMA type .__vector_pair. as a function return value" } */
> -{
> -  return *src;
> -}
> -
> -vpair_t
> -foo5 (vpair_t *src) /* { dg-error "invalid use of MMA type .__vector_pair. as a function return value" } */
> -{
> -  return *src;
> -}
> -
> -__vector_quad
> -foo6 (__vector_quad *src) /* { dg-error "invalid use of MMA type .__vector_quad. as a function return value" } */
> -{
> -  return *src;
> -}
> -
> -vquad_t
> -foo7 (vquad_t *src) /* { dg-error "invalid use of MMA type .__vector_quad. as a function return value" } */
> -{
> -  return *src;
> -}
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr96506-2.c b/gcc/testsuite/gcc.target/powerpc/pr96506-2.c
> new file mode 100644
> index 00000000000..9cffd2576c9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr96506-2.c
> @@ -0,0 +1,38 @@
> +/* PR target/96506 */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> +
> +extern void bar0();
> +extern void bar1();
> +extern void bar2();
> +extern void bar3();
> +
> +typedef __vector_pair vpair_t;
> +typedef __vector_quad vquad_t;
> +
> +/* Verify we flag errors on the following.  */
> +
> +__vector_pair
> +foo4 (__vector_pair *src)
> +{ /* { dg-error "invalid use of MMA type .__vector_pair. as a function return value" } */
> +  return *src;
> +}
> +
> +vpair_t
> +foo5 (vpair_t *src)
> +{ /* { dg-error "invalid use of MMA type .__vector_pair. as a function return value" } */
> +  return *src;
> +}
> +
> +__vector_quad
> +foo6 (__vector_quad *src)
> +{ /* { dg-error "invalid use of MMA type .__vector_quad. as a function return value" } */
> +  return *src;
> +}
> +
> +vquad_t
> +foo7 (vquad_t *src)
> +{ /* { dg-error "invalid use of MMA type .__vector_quad. as a function return value" } */
> +  return *src;
> +}
> -- 
> 2.18.4
> 
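A couple of illustrative notes below the patch, in case they help review.

First, the new index computation in the little-endian branch of
rs6000_split_multireg_move.  The standalone sketch here is mine, not part of
the patch; it just replays that arithmetic for an assumed XOmode store split
into four vector-register-sized pieces (nregs == 4, reg_mode_nregs == 1,
16-byte pieces) to show the ordering it produces.

  #include <stdio.h>

  int
  main (void)
  {
    /* Assumed values, not taken from GCC: one accumulator split into
       four vector-register-sized pieces on a little-endian target.  */
    const int nregs = 4, reg_mode_nregs = 1, size = 16;
    const int words_big_endian = 0;
    unsigned offset = 0;

    for (int i = 0; i < nregs; i += reg_mode_nregs)
      {
        /* Same expression as in the hunk:
           subreg = (WORDS_BIG_ENDIAN) ? i : (nregs - reg_mode_nregs - i).  */
        unsigned subreg = words_big_endian ? i : (nregs - reg_mode_nregs - i);
        printf ("store reg+%u to memory offset %u\n", subreg, offset);
        offset += size;
      }
    return 0;
  }

It prints reg+3 at offset 0 down to reg+0 at offset 48, i.e. the last
register still gets the first memory location, as the comment requires.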
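Second, the user-level view of the OO/XO values these moves handle.  This is
a rough usage sketch (mine, not from the patch or the testsuite); the builtin
names and prototypes are the existing MMA ones, but please treat the details
as an assumption rather than documentation.  Compile for power10, e.g. with
-mcpu=power10.

  typedef unsigned char vec_t __attribute__ ((vector_size (16)));

  /* dst must point to at least 64 bytes (the four vector rows of the
     accumulator).  */
  void
  mma_sketch (double *dst, vec_t a0, vec_t a1, vec_t b)
  {
    __vector_pair pair;   /* an OOmode value */
    __vector_quad acc;    /* an XOmode value */

    __builtin_mma_assemble_pair (&pair, a0, a1);
    __builtin_mma_xxsetaccz (&acc);            /* the cost-0 UNSPEC above */
    __builtin_mma_xvf64gerpp (&acc, pair, b);
    __builtin_mma_disassemble_acc (dst, &acc); /* handled by the new
                                                  gimple folding */
  }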
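Third, the pointer-conversion rule in rs6000_invalid_conversion: conversions
to and from void * stay legal, while other pointer conversions involving the
opaque modes are rejected.  The expected diagnostics in this sketch are my
assumption, based on the strings in the hunk above.

  typedef unsigned char vec_t __attribute__ ((vector_size (16)));

  void
  conv_sketch (__vector_pair *p)
  {
    void *v = p;               /* accepted: void * is exempted */
    vec_t *bad = (vec_t *) p;  /* expected error: "invalid conversion
                                  from type '* __vector_pair'" */
    (void) v;
    (void) bad;
  }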