From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <will_schmidt@vnet.ibm.com>
Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com
 [148.163.156.1])
 by sourceware.org (Postfix) with ESMTPS id 03C5A385840E
 for <gcc-patches@gcc.gnu.org>; Fri,  5 Nov 2021 17:01:47 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 03C5A385840E
Received: from pps.filterd (m0098410.ppops.net [127.0.0.1])
 by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1A5GUqH1029992; 
 Fri, 5 Nov 2021 17:01:46 GMT
Received: from pps.reinject (localhost [127.0.0.1])
 by mx0a-001b2d01.pphosted.com with ESMTP id 3c55t9c4u3-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Fri, 05 Nov 2021 17:01:46 +0000
Received: from m0098410.ppops.net (m0098410.ppops.net [127.0.0.1])
 by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 1A5Gaiij010699;
 Fri, 5 Nov 2021 17:01:46 GMT
Received: from ppma04wdc.us.ibm.com (1a.90.2fa9.ip4.static.sl-reverse.com
 [169.47.144.26])
 by mx0a-001b2d01.pphosted.com with ESMTP id 3c55t9c4te-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Fri, 05 Nov 2021 17:01:46 +0000
Received: from pps.filterd (ppma04wdc.us.ibm.com [127.0.0.1])
 by ppma04wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 1A5GgxHD026250;
 Fri, 5 Nov 2021 17:01:45 GMT
Received: from b01cxnp23032.gho.pok.ibm.com (b01cxnp23032.gho.pok.ibm.com
 [9.57.198.27]) by ppma04wdc.us.ibm.com with ESMTP id 3c4t48x8ae-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Fri, 05 Nov 2021 17:01:45 +0000
Received: from b01ledav002.gho.pok.ibm.com (b01ledav002.gho.pok.ibm.com
 [9.57.199.107])
 by b01cxnp23032.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id
 1A5H1iA647579602
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
 Fri, 5 Nov 2021 17:01:44 GMT
Received: from b01ledav002.gho.pok.ibm.com (unknown [127.0.0.1])
 by IMSVA (Postfix) with ESMTP id 219F912405B;
 Fri,  5 Nov 2021 17:01:44 +0000 (GMT)
Received: from b01ledav002.gho.pok.ibm.com (unknown [127.0.0.1])
 by IMSVA (Postfix) with ESMTP id 94B7312405A;
 Fri,  5 Nov 2021 17:01:43 +0000 (GMT)
Received: from sig-9-65-204-72.ibm.com (unknown [9.65.204.72])
 by b01ledav002.gho.pok.ibm.com (Postfix) with ESMTP;
 Fri,  5 Nov 2021 17:01:43 +0000 (GMT)
Message-ID: <1e69e1d27984ec906792d068a6df191a0e033ed1.camel@vnet.ibm.com>
Subject: Re: [PATCH 1/5] Add XXSPLTI* and LXVKQ instructions (new data
 structure and function)
From: will schmidt <will_schmidt@vnet.ibm.com>
To: Michael Meissner <meissner@linux.ibm.com>, gcc-patches@gcc.gnu.org,
 Segher Boessenkool <segher@kernel.crashing.org>, David Edelsohn
 <dje.gcc@gmail.com>, Bill Schmidt <wschmidt@linux.ibm.com>, Peter Bergner
 <bergner@linux.ibm.com>
Date: Fri, 05 Nov 2021 12:01:43 -0500
In-Reply-To: <YYStWC018qK1Ta33@toto.the-meissners.org>
References: <YYSs1OZk3bR0VxED@toto.the-meissners.org>
 <YYStWC018qK1Ta33@toto.the-meissners.org>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.28.5 (3.28.5-16.el8) 
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
X-TM-AS-GCONF: 00
X-Proofpoint-ORIG-GUID: vjWoH9jgsSXO-6lqlrjm1xYWBS9ABC75
X-Proofpoint-GUID: 5jd2RJx0Fo3WBM7bF3JfSTAF7_J4KOkH
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.205,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475
 definitions=2021-11-05_02,2021-11-03_01,2020-04-07_01
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
 priorityscore=1501
 adultscore=0 clxscore=1015 lowpriorityscore=0 spamscore=0 suspectscore=0
 mlxscore=0 mlxlogscore=999 impostorscore=0 bulkscore=0 phishscore=0
 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1
 engine=8.12.0-2110150000 definitions=main-2111050095
X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE,
 SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Fri, 05 Nov 2021 17:01:50 -0000

On Fri, 2021-11-05 at 00:04 -0400, Michael Meissner wrote:
> Add new constant data structure.
> 
> This patch provides the data structure and function to convert a
> CONST_INT, CONST_DOUBLE, CONST_VECTOR, or VEC_DUPLICATE of a constant) to
> an array of bytes, half-words, words, and  double words that can be loaded
> into a 128-bit vector register.
> 
> The next patches will use this data structure to generate code that
> generates load of the vector/floating point registers using the XXSPLTIDP,
> XXSPLTIW, and LXVKQ instructions that were added in power10.
> 
> 2021-11-05  Michael Meissner  <meissner@the-meissners.org>
> 

Email here is different than the from:.  No big deal either way.  

> gcc/
> 
> 	* config/rs6000/rs6000-protos.h (VECTOR_128BIT_*): New macros.

I defer to maintainers.  I like to explicitly include the full macro names here so a grep later on can easily find it.  


> 	(vec_const_128bit_type): New structure type.
> 	(vec_const_128bit_to_bytes): New declaration.
> 	* config/rs6000/rs6000.c (constant_int_to_128bit_vector): New
> 	helper function.
> 	(constant_fp_to_128bit_vector): New helper function.
> 	(vec_const_128bit_to_bytes): New function.

ok

> ---
>  gcc/config/rs6000/rs6000-protos.h |  28 ++++
>  gcc/config/rs6000/rs6000.c        | 253 ++++++++++++++++++++++++++++++
>  2 files changed, 281 insertions(+)
> 
> diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
> index 14f6b313105..490d6e33736 100644
> --- a/gcc/config/rs6000/rs6000-protos.h
> +++ b/gcc/config/rs6000/rs6000-protos.h
> @@ -222,6 +222,34 @@ address_is_prefixed (rtx addr,
>    return (iform == INSN_FORM_PREFIXED_NUMERIC
>  	  || iform == INSN_FORM_PCREL_LOCAL);
>  }
> +
> +/* Functions and data structures relating to 128-bit constants that are
> +   converted to byte, half-word, word, and double-word values.  All fields are
> +   kept in big endian order.  We also convert scalar values to 128-bits if they
> +   are going to be loaded into vector registers.  */
> +#define VECTOR_128BIT_BITS		128
> +#define VECTOR_128BIT_BYTES		(128 / 8)
> +#define VECTOR_128BIT_HALF_WORDS	(128 / 16)
> +#define VECTOR_128BIT_WORDS		(128 / 32)
> +#define VECTOR_128BIT_DOUBLE_WORDS	(128 / 64)

ok

> +
> +typedef struct {
> +  /* Constant as various sized items.  */
> +  unsigned HOST_WIDE_INT double_words[VECTOR_128BIT_DOUBLE_WORDS];
> +  unsigned int words[VECTOR_128BIT_WORDS];
> +  unsigned short half_words[VECTOR_128BIT_HALF_WORDS];
> +  unsigned char bytes[VECTOR_128BIT_BYTES];
> +
> +  unsigned original_size;		/* Constant size before splat.  */
> +  bool fp_constant_p;			/* Is the constant floating point?  */
> +  bool all_double_words_same;		/* Are the double words all equal?  */
> +  bool all_words_same;			/* Are the words all equal?  */
> +  bool all_half_words_same;		/* Are the halft words all equal?  */

half

> +  bool all_bytes_same;			/* Are the bytes all equal?  */


> +} vec_const_128bit_type;
> +

ok.  


> +extern bool vec_const_128bit_to_bytes (rtx, machine_mode,
> +				       vec_const_128bit_type *);
>  #endif /* RTX_CODE */
> 
>  #ifdef TREE_CODE
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 01affc7a47c..f285022294a 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -28619,6 +28619,259 @@ rs6000_output_addr_vec_elt (FILE *file, int value)
>    fprintf (file, "\n");
>  }
> 
> +
> +/* Copy an integer constant to the vector constant structure.  */
> +

Here and subsequent comments, I'd debate on whether to enhance the
comment to be explicit on the structure name being copied to/from.
(vec_const_128bit_type is easy to search for, vector or constant or
structure are not as unique)

> +static void
> +constant_int_to_128bit_vector (rtx op,
> +			       machine_mode mode,
> +			       size_t byte_num,
> +			       vec_const_128bit_type *info)
> +{
> +  unsigned HOST_WIDE_INT uvalue = UINTVAL (op);
> +  unsigned bitsize = GET_MODE_BITSIZE (mode);
> +
> +  for (int shift = bitsize - 8; shift >= 0; shift -= 8)
> +    info->bytes[byte_num++] = (uvalue >> shift) & 0xff;
> +}

I didn't confirm the maths, but looks OK at a glance.


> +
> +/* Copy an floating point constant to the vector constant structure.  */
> +

s/an/a/

> +static void
> +constant_fp_to_128bit_vector (rtx op,
> +			      machine_mode mode,
> +			      size_t byte_num,
> +			      vec_const_128bit_type *info)
> +{
> +  unsigned bitsize = GET_MODE_BITSIZE (mode);
> +  unsigned num_words = bitsize / 32;
> +  const REAL_VALUE_TYPE *rtype = CONST_DOUBLE_REAL_VALUE (op);
> +  long real_words[VECTOR_128BIT_WORDS];
> +
> +  /* Make sure we don't overflow the real_words array and that it is
> +     filled completely.  */
> +  gcc_assert (num_words <= VECTOR_128BIT_WORDS && (bitsize % 32) == 0);

Not clear to me on the potential to partially fill the real_words
array. 

> +
> +  real_to_target (real_words, rtype, mode);
> +
> +  /* Iterate over each 32-bit word in the floating point constant.  The
> +     real_to_target function puts out words in endian fashion.  We need

Meaning host-endian fashion, or is that meant to be big-endian ? 

Perhaps also rephrase or move the comment up to indicate that
real_to_target will have placed or has already placed the words in
<whatever> endian fashion.
As stated I was expecting to see a call to real_to_target() below the
comment. 

The description of how real_to_target() works may be better placed 
on that function over in real.c .  :-)


> +     to arrange so the words are written in big endian order.  */

> +  for (unsigned num = 0; num < num_words; num++)
> +    {
> +      unsigned endian_num = (BYTES_BIG_ENDIAN
> +			     ? num
> +			     : num_words - 1 - num);
> +
> +      unsigned uvalue = real_words[endian_num];
> +      for (int shift = 32 - 8; shift >= 0; shift -= 8)
> +	info->bytes[byte_num++] = (uvalue >> shift) & 0xff;
> +    }
> +
> +  /* Mark that this constant involves floating point.  */
> +  info->fp_constant_p = true;
> +}
> +
> +/* Convert a vector constant OP with mode MODE to a vector 128-bit constant
> +   structure INFO.
> +
> +   Break out the constant out to bytes, half words, words, and double words.
> +   Return true if we have successfully broken out a constant.

Too many outs.  

> +
> +   We handle CONST_INT, CONST_DOUBLE, CONST_VECTOR, and VEC_DUPLICATE of
> +   constants.  Integer and floating point scalar constants are splatted to fill
> +   out the vector.  */
> +
> +bool
> +vec_const_128bit_to_bytes (rtx op,
> +			   machine_mode mode,
> +			   vec_const_128bit_type *info)
> +{
> +  /* Initialize the constant structure.  */
> +  memset ((void *)info, 0, sizeof (vec_const_128bit_type));
> +
> +  /* Assume CONST_INTs are DImode.  */
> +  if (mode == VOIDmode)
> +    mode = CONST_INT_P (op) ? DImode : GET_MODE (op);
> +
> +  if (mode == VOIDmode)
> +    return false;
> +
> +  unsigned size = GET_MODE_SIZE (mode);
> +  bool splat_p = false;
> +
> +  if (size > VECTOR_128BIT_BYTES)
> +    return false;
> +
> +  /* Set up the bits.  */
> +  switch (GET_CODE (op))
> +    {
> +      /* Integer constants, default to double word.  */
> +    case CONST_INT:
> +      {
> +	constant_int_to_128bit_vector (op, mode, 0, info);
> +	splat_p = true;
> +	break;
> +      }
> +
> +      /* Floating point constants.  */
> +    case CONST_DOUBLE:
> +      {
> +	/* Fail if the floating point constant is the wrong mode.  */
> +	if (GET_MODE (op) != mode)
> +	  return false;
> +
> +	/* SFmode stored as scalars are stored in DFmode format.  */
> +	if (mode == SFmode)
> +	  {
> +	    mode = DFmode;
> +	    size = GET_MODE_SIZE (DFmode);
> +	  }
> +
> +	constant_fp_to_128bit_vector (op, mode, 0, info);
> +	splat_p = true;
> +	break;
> +      }
> +
> +      /* Vector constants, iterate over each element.  On little endian
> +	 systems, we have to reverse the element numbers.  */
> +    case CONST_VECTOR:
> +      {
> +	/* Fail if the vector constant is the wrong mode or size.  */
> +	if (GET_MODE (op) != mode
> +	    || GET_MODE_SIZE (mode) != VECTOR_128BIT_BYTES)
> +	  return false;
> +
> +	machine_mode ele_mode = GET_MODE_INNER (mode);
> +	size_t ele_size = GET_MODE_SIZE (ele_mode);
> +	size_t nunits = GET_MODE_NUNITS (mode);
> +
> +	for (size_t num = 0; num < nunits; num++)
> +	  {
> +	    rtx ele = CONST_VECTOR_ELT (op, num);
> +	    size_t byte_num = (BYTES_BIG_ENDIAN
> +			       ? num
> +			       : nunits - 1 - num) * ele_size;
> +
> +	    if (CONST_INT_P (ele))
> +	      constant_int_to_128bit_vector (ele, ele_mode, byte_num, info);
> +	    else if (CONST_DOUBLE_P (ele))
> +	      constant_fp_to_128bit_vector (ele, ele_mode, byte_num, info);
> +	    else
> +	      return false;
> +	  }
> +
> +	break;
> +      }
> +
> +	/* Treat VEC_DUPLICATE of a constant just like a vector constant.
> +	   Since we are duplicating the element, we don't have to worry about
> +	   endian issues.  */
> +    case VEC_DUPLICATE:
> +      {
> +	/* Fail if the vector duplicate is the wrong mode or size.  */
> +	if (GET_MODE (op) != mode
> +	    || GET_MODE_SIZE (mode) != VECTOR_128BIT_BYTES)
> +	  return false;
> +
> +	machine_mode ele_mode = GET_MODE_INNER (mode);
> +	size_t ele_size = GET_MODE_SIZE (ele_mode);
> +	rtx ele = XEXP (op, 0);
> +	size_t nunits = GET_MODE_NUNITS (mode);
> +
> +	if (!CONST_INT_P (ele) && !CONST_DOUBLE_P (ele))
> +	  return false;
> +
> +	for (size_t num = 0; num < nunits; num++)
> +	  {
> +	    size_t byte_num = num * ele_size;
> +
> +	    if (CONST_INT_P (ele))
> +	      constant_int_to_128bit_vector (ele, ele_mode, byte_num, info);
> +	    else
> +	      constant_fp_to_128bit_vector (ele, ele_mode, byte_num, info);
> +	  }
> +
> +	break;
> +      }
> +
> +      /* Any thing else, just return failure.  */
> +    default:
> +      return false;
> +    }

Seems OK.

> +
> +  /* Possibly splat the constant to fill a vector size.  */


Suggest "Splat the constant to fill a vector size if ..."


> +  if (splat_p && size < VECTOR_128BIT_BYTES)
> +    {
> +      if ((VECTOR_128BIT_BYTES % size) != 0)
> +	return false;
> +
> +      for (size_t offset = size;
> +	   offset < VECTOR_128BIT_BYTES;
> +	   offset += size)
> +	memcpy ((void *) &info->bytes[offset],
> +		(void *) &info->bytes[0],
> +		size);
> +    }
> +
> +  /* Remember original size.  */
> +  info->original_size = size;
> +
> +  /* Determine if the bytes are all the same.  */
> +  unsigned char first_byte = info->bytes[0];
> +  info->all_bytes_same = true;
> +  for (size_t i = 1; i < VECTOR_128BIT_BYTES; i++)
> +    if (first_byte != info->bytes[i])
> +      {
> +	info->all_bytes_same = false;
> +	break;
> +      }
> +
> +  /* Pack half words together & determine if all of the half words are the
> +     same.  */
> +  for (size_t i = 0; i < VECTOR_128BIT_HALF_WORDS; i++)
> +    info->half_words[i] = ((info->bytes[i * 2] << 8)
> +			   | info->bytes[(i * 2) + 1]);
> +
> +  unsigned short first_hword = info->half_words[0];
> +  info->all_half_words_same = true;
> +  for (size_t i = 1; i < VECTOR_128BIT_HALF_WORDS; i++)
> +    if (first_hword != info->half_words[i])
> +      {
> +	info->all_half_words_same = false;
> +	break;
> +      }

ok

> +
> +  /* Pack words together & determine if all of the words are the same.  */
> +  for (size_t i = 0; i < VECTOR_128BIT_WORDS; i++)
> +    info->words[i] = ((info->bytes[i * 4] << 24)
> +		      | (info->bytes[(i * 4) + 1] << 16)
> +		      | (info->bytes[(i * 4) + 2] << 8)
> +		      | info->bytes[(i * 4) + 3]);
> +
> +  info->all_words_same
> +    = (info->words[0] == info->words[1]
> +       && info->words[0] == info->words[1]
> +       && info->words[0] == info->words[2]
> +       && info->words[0] == info->words[3]);
> +
> +  /* Pack double words together & determine if all of the double words are the
> +     same.  */
> +  for (size_t i = 0; i < VECTOR_128BIT_DOUBLE_WORDS; i++)
> +    {
> +      unsigned HOST_WIDE_INT d_word = 0;
> +      for (size_t j = 0; j < 8; j++)
> +	d_word = (d_word << 8) | info->bytes[(i * 8) + j];
> +
> +      info->double_words[i] = d_word;
> +    }
> +
> +  info->all_double_words_same
> +    = (info->double_words[0] == info->double_words[1]);
> +
> +  return true;
> +}
> +

ok.


>  struct gcc_target targetm = TARGET_INITIALIZER;
> 
>  #include "gt-rs6000.h"
> -- 
> 2.31.1
> 
>