Date: Thu, 26 Aug 2021 17:28:42 -0400
From: Michael Meissner
To: will schmidt
Cc: Michael Meissner, gcc-patches@gcc.gnu.org, Segher Boessenkool, David Edelsohn, Bill Schmidt, Peter Bergner
Subject: Re: [PATCH] Generate XXSPLTIDP on power10.
List-Id: Gcc-patches mailing list

On Thu, Aug 26,
2021 at 02:17:57PM -0500, will schmidt wrote:
> On Wed, 2021-08-25 at 15:46 -0400, Michael Meissner wrote:
> > Generate XXSPLTIDP on power10.
> >
> > I have added a temporary switch (-mxxspltidp) to control whether or not the
> > XXSPLTIDP instruction is generated.
>
> How temporary?

Until we decide we no longer need to disable the option to do tests.  Probably
at the end of stage1.

> >
> > I added 3 new tests to test loading up SF/DF scalar and V2DF vector
> > constants.

...

> > gcc/
> > 	* config/rs6000/constraints.md (eF): New constraint.
> > 	* config/rs6000/predicates.md (easy_fp_constant): If we can load
> > 	the scalar constant with XXSPLTIDP, the floating point constant is
> > 	easy.
>
> Could be shortened to something like ?
> 	Add clause to accept xxspltidp_operand as easy.

Sure.

> > 	(xxspltidp_operand): New predicate.
>
> Will there ever be another instruction using the SF/DF CONST_DOUBLE or
> V2DF CONST_VECTOR ?  I tentatively question the name of the operand,
> but defer..

This is the convention I've used for adding other instructions like xxspltib.
You use the name of the instruction followed by operand.  The predicate only
matches RTXes that we would use the XXSPLTIDP for.

And then for operands like XXSPLTIDP and XXSPLTIB where you want additional
information, there is the C++ function in rs6000.c that has the additional
arguments.  In the case of XXSPLTIDP, it returns the 32-bit immediate value
that is used in the instruction.  The predicate calls this internal function,
adding the address of internal variables that aren't used in this case.

This way we have just one place that centralizes the knowledge about the
instruction.  This means that the places that deal with decomposing the
XXSPLTIDP instruction don't have to do their own parsing.  And if we need to
add new enhancements or restrictions, there is only one place to go.

There is the XXSPLTI32DX instruction that can also load SFmode, DFmode, and
V2DFmode constants.
You use a pair of instructions, each of which fills in either the top 32 bits
or the bottom 32 bits of each 64-bit element.  I have patches for adding
XXSPLTI32DX, but so far, I'm not sure whether it is a win or not.  This has
the xxsplti32dx_operand predicate and the xxsplti32dx_constant_p internal
function and a separate constraint ("eD") for matching it.

If xxspltidp_constant_p returns true, then xxsplti32dx_operand and
xxsplti32dx_constant_p return false.  I.e. they only return true if we should
use XXSPLTI32DX instead of some other instruction.

You can't combine XXSPLTIDP and XXSPLTI32DX because you have to set the
various attributes differently (XXSPLTIDP is a single prefixed instruction,
while before split, XXSPLTI32DX is two prefixed instructions).

There is another instruction (XXSPLTIW) that can be used for loading up
certain V16QImode, V8HImode, V4SImode, and V4SFmode constants.  I have patches
for this as well.  At the moment, there are 1-2 regressions if I enable
XXSPLTIW, and I'm trying to figure out how to improve them.  But it likely
will be submitted as a future patch also.  This would have the
xxspltiw_operand predicate, "eW" constraint and the xxspltiw_constant_p
internal function.

>
> > 	(easy_vector_constant): If we can generate XXSPLTIDP, mark the
> > 	vector constant as easy.
>
> Duplicated from above.

Yep, both of the easy_*_constant predicates need to call this.  The
easy_fp_constant predicate is for scalar floating point (i.e. SFmode and
DFmode).  The easy_vector_constant predicate is for vector (i.e. V2DFmode).

What the easy_{fp,vector}_constant predicates are used for is to decide
whether we can load up the constant via insns that are created via
define_splits or if we need to push the constant to memory (because it isn't
'easy' to load up the constants).

These two predicates predate my current involvement with PowerPC on GCC
(i.e. 10 years or so).  I don't recall if they were used back in the 1990s
when I worked on PowerPC at Cygnus Solutions.
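
As a rough illustration of the "one helper per instruction" convention and of
the XXSPLTI32DX word pair described above, here is a minimal standalone sketch
in C.  It is not the code from the patch.  The function names
(sketch_xxspltidp_imm, sketch_split_words) are hypothetical, and the test it
applies is an assumption: that XXSPLTIDP's 32-bit immediate is a
single-precision value that gets widened to double precision, so a double
qualifies only if it round-trips exactly through a normal (non-denormal)
single-precision value.  NaNs are simply rejected to keep the sketch short.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a centralized "can this constant be loaded
   with XXSPLTIDP?" helper.  On success it stores the 32-bit immediate
   (the single-precision bit pattern) in *IMM and returns true.  */
static bool
sketch_xxspltidp_imm (double value, uint32_t *imm)
{
  assert (imm != NULL);

  /* Assumption: keep the sketch simple and never accept NaNs.  */
  if (isnan (value))
    return false;

  /* Finite values outside the single-precision range cannot be encoded.  */
  if (!isinf (value) && fabs (value) > FLT_MAX)
    return false;

  /* Assumption: values in the single-precision denormal range are
     rejected, since they would need a denormal immediate.  */
  if (value != 0.0 && fabs (value) < FLT_MIN)
    return false;

  /* The value must survive a round trip through single precision.  */
  float narrowed = (float) value;
  if ((double) narrowed != value)
    return false;

  memcpy (imm, &narrowed, sizeof *imm);
  return true;
}

/* Hypothetical helper showing the two 32-bit words that a pair of
   XXSPLTI32DX-style instructions would have to supply for one 64-bit
   element: the top 32 bits and the bottom 32 bits.  */
static void
sketch_split_words (double value, uint32_t *high, uint32_t *low)
{
  uint64_t bits;

  assert (high != NULL && low != NULL);
  memcpy (&bits, &value, sizeof bits);
  *high = (uint32_t) (bits >> 32);
  *low = (uint32_t) bits;
}
```

With these assumptions, 1.0 is accepted with the immediate 0x3F800000, while
0.1 is rejected because it is not exactly representable in single precision;
a caller in that position would fall back to the two-word form or to a load
from memory.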
>
> > 	* config/rs6000/rs6000-protos.h (xxspltidp_constant_p): New
> > 	declaration.
> > 	(prefixed_permute_p): Likewise.
> >
> > 	* config/rs6000/rs6000.c (xxspltidp_constant_p): New function.
> > 	(output_vec_const_move): Add support for XXSPLTIDP.
> > 	(prefixed_permute_p): New function.
>
> Duplicated.

Yes.  You need the external declaration in rs6000-protos.h and the definition
in rs6000.c.  So you will have the duplication.

> > 	* config/rs6000/rs6000.md (prefixed attribute): Add support for
> > 	permute prefixed instructions.
> > 	(movsf_hardfloat): Add XXSPLTIDP support.
> > 	(mov<mode>_hardfloat32, FMOVE64 iterator): Likewise.
> > 	(mov<mode>_hardfloat64, FMOVE64 iterator): Likewise.
> > 	* config/rs6000/rs6000.opt (-mxxspltidp): New switch.
> > 	* config/rs6000/vsx.md (vsx_move<mode>_64bit): Add XXSPLTIDP
> > 	support.
> > 	(vsx_move<mode>_32bit): Likewise.
>
> No e in mov (per patch contents below).

Thanks.

> > 	(vsx_splat_v2df_xxspltidp): New insn.
> > 	(XXSPLTIDP): New mode iterator.
> > 	(xxspltidp_<mode>_internal): New insn and splits.
> > 	(xxspltidp_<mode>_inst): Replace xxspltidp_v2df_inst with an
> > 	iterated form that also does SFmode, and DFmode.
> Swap "an iterated form" with "xxspltidp_<mode>_inst" ?

Ok.

> > @@ -8170,6 +8178,7 @@ (define_insn "*mov<mode>_softfloat64"
> >     (set_attr "length"
> >  	"*,         *,         *,          *,         *,         8,
> >           12,        16,        *")])
> > +
> >
>
> Unnecessarily blank line?

Probably.

> >  (define_expand "mov<mode>"
> >    [(set (match_operand:FMOVE128 0 "general_operand")
> > diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
> > index 0538db387dc..928c4fafe07 100644
> > --- a/gcc/config/rs6000/rs6000.opt
> > +++ b/gcc/config/rs6000/rs6000.opt
> > @@ -639,3 +639,7 @@ Enable instructions that guard against return-oriented programming attacks.
> >  mprivileged
> >  Target Var(rs6000_privileged) Init(0)
> >  Generate code that will run in privileged state.
> > +
> > +mxxspltidp
> > +Target Undocumented Var(TARGET_XXSPLTIDP) Init(1) Save
> > +Generate (do not generate) XXSPLTIDP instructions.
> >
> Ok.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797