From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <meissner@linux.ibm.com>
Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com
 [148.163.156.1])
 by sourceware.org (Postfix) with ESMTPS id 6443E3858D3C
 for <gcc-patches@gcc.gnu.org>; Thu, 26 Aug 2021 21:28:54 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6443E3858D3C
Received: from pps.filterd (m0098404.ppops.net [127.0.0.1])
 by mx0a-001b2d01.pphosted.com (8.16.0.43/8.16.0.43) with SMTP id
 17QL6Y8Q079457; Thu, 26 Aug 2021 17:28:52 -0400
Received: from pps.reinject (localhost [127.0.0.1])
 by mx0a-001b2d01.pphosted.com with ESMTP id 3apjbb8r8c-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Thu, 26 Aug 2021 17:28:52 -0400
Received: from m0098404.ppops.net (m0098404.ppops.net [127.0.0.1])
 by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 17QLGK5q157069;
 Thu, 26 Aug 2021 17:28:51 -0400
Received: from ppma04wdc.us.ibm.com (1a.90.2fa9.ip4.static.sl-reverse.com
 [169.47.144.26])
 by mx0a-001b2d01.pphosted.com with ESMTP id 3apjbb8r7v-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Thu, 26 Aug 2021 17:28:51 -0400
Received: from pps.filterd (ppma04wdc.us.ibm.com [127.0.0.1])
 by ppma04wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 17QLCuIJ026784;
 Thu, 26 Aug 2021 21:28:50 GMT
Received: from b03cxnp07028.gho.boulder.ibm.com
 (b03cxnp07028.gho.boulder.ibm.com [9.17.130.15])
 by ppma04wdc.us.ibm.com with ESMTP id 3ajs4ey7g9-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Thu, 26 Aug 2021 21:28:50 +0000
Received: from b03ledav003.gho.boulder.ibm.com
 (b03ledav003.gho.boulder.ibm.com [9.17.130.234])
 by b03cxnp07028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id
 17QLSmkt44958186
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
 Thu, 26 Aug 2021 21:28:48 GMT
Received: from b03ledav003.gho.boulder.ibm.com (unknown [127.0.0.1])
 by IMSVA (Postfix) with ESMTP id 922A26A063;
 Thu, 26 Aug 2021 21:28:48 +0000 (GMT)
Received: from b03ledav003.gho.boulder.ibm.com (unknown [127.0.0.1])
 by IMSVA (Postfix) with ESMTP id A6F936A058;
 Thu, 26 Aug 2021 21:28:47 +0000 (GMT)
Received: from toto.the-meissners.org (unknown [9.160.31.187])
 by b03ledav003.gho.boulder.ibm.com (Postfix) with ESMTPS;
 Thu, 26 Aug 2021 21:28:47 +0000 (GMT)
Date: Thu, 26 Aug 2021 17:28:42 -0400
From: Michael Meissner <meissner@linux.ibm.com>
To: will schmidt <will_schmidt@vnet.ibm.com>
Cc: Michael Meissner <meissner@linux.ibm.com>, gcc-patches@gcc.gnu.org,
 Segher Boessenkool <segher@kernel.crashing.org>,
 David Edelsohn <dje.gcc@gmail.com>, Bill Schmidt <wschmidt@linux.ibm.com>,
 Peter Bergner <bergner@linux.ibm.com>
Subject: Re: [PATCH] Generate XXSPLTIDP on power10.
Message-ID: <YSgHiqMxlf/EQYHz@toto.the-meissners.org>
Mail-Followup-To: Michael Meissner <meissner@linux.ibm.com>,
 will schmidt <will_schmidt@vnet.ibm.com>, gcc-patches@gcc.gnu.org,
 Segher Boessenkool <segher@kernel.crashing.org>,
 David Edelsohn <dje.gcc@gmail.com>,
 Bill Schmidt <wschmidt@linux.ibm.com>,
 Peter Bergner <bergner@linux.ibm.com>
References: <YSaeI5BVUxofem9y@toto.the-meissners.org>
 <f83b1e45fae0ddbabf3d0322362a23c92bd67481.camel@vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <f83b1e45fae0ddbabf3d0322362a23c92bd67481.camel@vnet.ibm.com>
X-TM-AS-GCONF: 00
X-Proofpoint-GUID: eHaieaebfaSYBY1V6VjP5HxT5ax6qbfF
X-Proofpoint-ORIG-GUID: x0MSzsDJgEYeMjdUiNp_JsYcLEYDuswX
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.391, 18.0.790
 definitions=2021-08-26_05:2021-08-26,
 2021-08-26 signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
 priorityscore=1501
 clxscore=1015 adultscore=0 impostorscore=0 phishscore=0 mlxscore=0
 malwarescore=0 bulkscore=0 lowpriorityscore=0 suspectscore=0
 mlxlogscore=999 spamscore=0 classifier=spam adjust=0 reason=mlx
 scancount=1 engine=8.12.0-2107140000 definitions=main-2108260118
X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL,
 SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Thu, 26 Aug 2021 21:29:04 -0000

On Thu, Aug 26, 2021 at 02:17:57PM -0500, will schmidt wrote:
> On Wed, 2021-08-25 at 15:46 -0400, Michael Meissner wrote:
> > Generate XXSPLTIDP on power10.
> > 
> > I have added a temporary switch (-mxxspltidp) to control whether or not the
> > XXSPLTIDP instruction is generated.
> 
> How temporary?  

Until we decide we no longer need to be able to turn the instruction off for
testing.  Probably at the end of stage1.

> > 
> > I added 3 new tests to test loading up SF/DF scalar and V2DF vector
> > constants.

...

> > gcc/
> > 	* config/rs6000/constraints.md (eF): New constraint.
> > 	* config/rs6000/predicates.md (easy_fp_constant): If we can load
> > 	the scalar constant with XXSPLTIDP, the floating point constant is
> > 	easy.
> 
> Could be shortened to something like ? 
>   Add clause to accept xxspltidp_operand as easy.

Sure.

> > 	(xxspltidp_operand): New predicate.
> 
> Will there ever be another instruction using the SF/DF CONST_DOUBLE  or
> V2DF CONST_VECTOR ?   I tentatively question the name of the operand,
> but defer..

This is the convention I've used for adding other instructions like xxspltib.
You use the name of the instruction followed by "_operand".  The predicate only
matches RTXes that we would use XXSPLTIDP for.  Then, for instructions like
XXSPLTIDP and XXSPLTIB where you want additional information, there is a C++
function in rs6000.c that takes additional arguments.  In the case of
XXSPLTIDP, it returns the 32-bit immediate value that is used in the
instruction.  The predicate calls this internal function, passing the addresses
of local variables whose values aren't used in this case.

This way we have just one place that centralizes the knowledge about the
instruction.  This means that the places that deal with decomposing the
XXSPLTIDP instruction don't have to do their own parsing.  And if we need to
add new enhancements or restrictions, there is only one place to go.
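
As a rough illustration of that convention (a sketch only, not the code from
rs6000.c), the split between a central helper and a thin predicate might look
like the following.  The names xxspltidp_constant_sketch and
xxspltidp_operand_sketch are hypothetical, plain doubles stand in for RTXes,
and the sketch assumes IEEE binary32/binary64 host types.  It relies on
XXSPLTIDP taking a single-precision immediate that is widened to double
precision, so NaNs, single-precision denormals, and values that do not round
trip through float are simply rejected here.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert (sizeof (float) == sizeof (std::uint32_t),
               "sketch assumes a 32-bit float");

// Hypothetical stand-in for the central helper: decide whether VALUE can be
// loaded by splatting a single-precision immediate and, if so, store that
// 32-bit immediate in *IMM_OUT.
bool
xxspltidp_constant_sketch (double value, std::uint32_t *imm_out)
{
  assert (imm_out != nullptr);

  // Simplification: reject NaNs and finite values outside the float range.
  if (std::isnan (value))
    return false;
  if (std::isfinite (value) && std::fabs (value) > FLT_MAX)
    return false;

  // The value must survive a round trip through single precision unchanged.
  float narrowed = static_cast<float> (value);
  if (static_cast<double> (narrowed) != value)
    return false;

  std::uint32_t bits;
  std::memcpy (&bits, &narrowed, sizeof bits);

  // Reject single-precision denormals (zero exponent, non-zero mantissa).
  std::uint32_t exponent = (bits >> 23) & 0xffu;
  std::uint32_t mantissa = bits & 0x7fffffu;
  if (exponent == 0 && mantissa != 0)
    return false;

  *imm_out = bits;
  return true;
}

// Hypothetical predicate wrapper: same test, immediate value discarded.
bool
xxspltidp_operand_sketch (double value)
{
  std::uint32_t unused_imm;
  return xxspltidp_constant_sketch (value, &unused_imm);
}
```

Callers that need the immediate (for example when emitting the instruction)
would call the helper directly, while pattern matching would only go through
the wrapper.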

There is the XXSPLTI32DX instruction that can also load SFmode, DFmode, and
V2DFmode constants.  You use a pair of instructions, one to fill in the top
32 bits and one to fill in the bottom 32 bits of each 64-bit element.
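
To make the two-step idea concrete, here is a minimal sketch (not GCC code) of
splitting a DFmode bit pattern into the two 32-bit immediates and rebuilding
one 64-bit element from them.  The names WordPair, split_double_sketch and
insert_word_sketch are hypothetical, and the upper_half flag is only an
illustration, not the instruction's actual operand encoding.

```cpp
#include <cstdint>
#include <cstring>

static_assert (sizeof (double) == sizeof (std::uint64_t),
               "sketch assumes a 64-bit double");

// The two 32-bit immediates needed to build one 64-bit element.
struct WordPair
{
  std::uint32_t upper;
  std::uint32_t lower;
};

// Split the bit pattern of VALUE into its upper and lower 32-bit words.
WordPair
split_double_sketch (double value)
{
  std::uint64_t bits;
  std::memcpy (&bits, &value, sizeof bits);
  return { static_cast<std::uint32_t> (bits >> 32),
           static_cast<std::uint32_t> (bits & 0xffffffffu) };
}

// Model one instruction of the pair: replace either the upper or the lower
// 32 bits of ELEMENT with IMM and leave the other half untouched.
std::uint64_t
insert_word_sketch (std::uint64_t element, bool upper_half, std::uint32_t imm)
{
  if (upper_half)
    return (element & 0x00000000ffffffffull)
           | (static_cast<std::uint64_t> (imm) << 32);
  return (element & 0xffffffff00000000ull) | imm;
}
```

Applying insert_word_sketch twice, once per half, yields the full 64-bit
pattern regardless of what the element held before, which is why the pair can
load constants that a single 32-bit immediate cannot.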

I have patches for adding XXSPLTI32DX, but so far, I'm not sure whether it is a
win or not.  Those patches have the xxsplti32dx_operand predicate, the
xxsplti32dx_constant_p internal function, and a separate constraint ("eD") for
matching it.  If xxspltidp_constant_p returns true, then xxsplti32dx_operand
and xxsplti32dx_constant_p return false.  I.e. they only return true if we
should use XXSPLTI32DX instead of some other instruction.

You can't combine XXSPLTIDP and XXSPLTI32DX because you have to set the various
attributes differently (XXSPLTIDP is a single prefixed instruction, while
before split, XXSPLTI32DX is two prefixed instructions).

There is another instruction (XXSPLTIW) that can be used for loading up certain
V16QImode, V8HImode, V4SImode, and V4SFmode constants.  I have patches for this
as well.  At the moment, there are one or two regressions if I enable XXSPLTIW,
and I'm trying to figure out how to fix them.  But it will likely be submitted
as a future patch as well.  This would have the xxspltiw_operand predicate, the
"eW" constraint, and the xxspltiw_constant_p internal function.
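
For a sense of which vector constants qualify, the core test is whether the
16-byte constant consists of the same 32-bit word repeated four times.  The
sketch below is an assumption-laden illustration, not the proposed
xxspltiw_constant_p: word_splat_sketch is a hypothetical name, the vector is
modelled as raw bytes, and the returned word is in host byte order.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Return true if BYTES is one 32-bit word repeated four times, storing that
// word (in host byte order) in *WORD_OUT.
bool
word_splat_sketch (const std::array<std::uint8_t, 16> &bytes,
                   std::uint32_t *word_out)
{
  std::uint32_t first;
  std::memcpy (&first, bytes.data (), sizeof first);

  // Compare each remaining 4-byte chunk against the first one.
  for (std::size_t offset = 4; offset < bytes.size (); offset += 4)
    {
      std::uint32_t word;
      std::memcpy (&word, bytes.data () + offset, sizeof word);
      if (word != first)
        return false;
    }

  *word_out = first;
  return true;
}
```

Byte and halfword splats pass this test automatically, since repeating a byte
or halfword also repeats the containing word, which is how one word-splat
instruction can cover V16QImode and V8HImode constants too.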

> 
> > 	(easy_vector_constant): If we can generate XXSPLTIDP, mark the
> > 	vector constant as easy.
> 
> Duplicated from above.

Yep, both of the easy_*_constant predicates need to call this.  The
easy_fp_constant predicate is for scalar floating point (i.e. SFmode and
DFmode).  The easy_vector_constant predicate is for vector (i.e. V2DFmode).

The easy_{fp,vector}_constant predicates are used to decide whether we can load
up the constant via insns that are created via define_splits, or whether we
need to push the constant to memory (because it isn't 'easy' to load up the
constant).  These two predicates predate my current involvement with PowerPC on
GCC (i.e. 10 years or so).  I don't recall if they were used back in the 1990s
when I worked on PowerPC at Cygnus Solutions.

> 
> > 	* config/rs6000/rs6000-protos.h (xxspltidp_constant_p): New
> > 	declaration.
> > 	(prefixed_permute_p): Likewise.
> 
> 
> > 	* config/rs6000/rs6000.c (xxspltidp_constant_p): New function.
> > 	(output_vec_const_move): Add support for XXSPLTIDP.
> > 	(prefixed_permute_p): New function.
> 
> Duplicated.

Yes.  You need the external declaration in rs6000-protos.h and the definition
in rs6000.c.  So you will have the duplication.

> > 	* config/rs6000/rs6000.md (prefixed attribute): Add support for
> > 	permute prefixed instructions.
> > 	(movsf_hardfloat): Add XXSPLTIDP support.
> > 	(mov<mode>_hardfloat32, FMOVE64 iterator): Likewise.
> > 	(mov<mode>_hardfloat64, FMOVE64 iterator): Likewise.
> > 	* config/rs6000/rs6000.opt (-mxxspltidp): New switch.
> > 	* config/rs6000/vsx.md (vsx_move<mode>_64bit): Add XXSPLTIDP
> > 	support.
> > 	(vsx_move<mode>_32bit): Likewise.
> 
> No e in mov (per patch contents below).

Thanks.

> > 	(vsx_splat_v2df_xxspltidp): New insn.
> > 	(XXSPLTIDP): New mode iterator.
> > 	(xxspltidp_<mode>_internal): New insn and splits.
> > 	(xxspltidp_<mode>_inst): Replace xxspltidp_v2df_inst with an
> > 	iterated form that also does SFmode, and DFmode.
> Swap "an iterated form" with "xxspltidp_<mode>_inst  ?

Ok.

> > @@ -8170,6 +8178,7 @@ (define_insn "*mov<mode>_softfloat64"
> >     (set_attr "length"
> >              "*,       *,      *,      *,      *,      8,
> >               12,      16,     *")])
> > +
> >  
> 
> Unnecessarily blank line?

Probably.

> >  (define_expand "mov<mode>"
> >    [(set (match_operand:FMOVE128 0 "general_operand")
> > diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
> > index 0538db387dc..928c4fafe07 100644
> > --- a/gcc/config/rs6000/rs6000.opt
> > +++ b/gcc/config/rs6000/rs6000.opt
> > @@ -639,3 +639,7 @@ Enable instructions that guard against return-oriented programming attacks.
> >  mprivileged
> >  Target Var(rs6000_privileged) Init(0)
> >  Generate code that will run in privileged state.
> > +
> > +mxxspltidp
> > +Target Undocumented Var(TARGET_XXSPLTIDP) Init(1) Save
> > +Generate (do not generate) XXSPLTIDP instructions.
> 
> 
> Ok.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797