From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id D86433857C5F for ; Sun, 4 Jul 2021 16:32:25 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D86433857C5F Received: from pps.filterd (m0098420.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.43/8.16.0.43) with SMTP id 164G403g159366; Sun, 4 Jul 2021 12:32:25 -0400 Received: from ppma05wdc.us.ibm.com (1b.90.2fa9.ip4.static.sl-reverse.com [169.47.144.27]) by mx0b-001b2d01.pphosted.com with ESMTP id 39k5h4t6vg-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Sun, 04 Jul 2021 12:32:25 -0400 Received: from pps.filterd (ppma05wdc.us.ibm.com [127.0.0.1]) by ppma05wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 164GQhT1001407; Sun, 4 Jul 2021 16:32:24 GMT Received: from b01cxnp23034.gho.pok.ibm.com (b01cxnp23034.gho.pok.ibm.com [9.57.198.29]) by ppma05wdc.us.ibm.com with ESMTP id 39jfh9vjwr-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Sun, 04 Jul 2021 16:32:24 +0000 Received: from b01ledav004.gho.pok.ibm.com (b01ledav004.gho.pok.ibm.com [9.57.199.109]) by b01cxnp23034.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 164GWOML42402236 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 4 Jul 2021 16:32:24 GMT Received: from b01ledav004.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 378F9112064; Sun, 4 Jul 2021 16:32:24 +0000 (GMT) Received: from b01ledav004.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 7B6EF112062; Sun, 4 Jul 2021 16:32:23 +0000 (GMT) Received: from Bills-MacBook-Pro.local (unknown [9.211.124.44]) by b01ledav004.gho.pok.ibm.com (Postfix) with ESMTP; Sun, 4 Jul 2021 16:32:23 +0000 (GMT) Reply-To: wschmidt@linux.ibm.com Subject: Re: [PATCH] rs6000: Add MMA __builtin_vsx_lxvp and __builtin_vsx_stxvp built-ins To: Peter Bergner , Segher Boessenkool Cc: GCC Patches References: <6797c56b-bc5a-2410-4a90-424cda38abaf@linux.ibm.com> <20210701190106.GE1583@gate.crashing.org> From: Bill Schmidt Message-ID: Date: Sun, 4 Jul 2021 11:32:22 -0500 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-GB X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: IaSbHg2k87GvHbhD8uOzU08wj8Ud6aA6 X-Proofpoint-GUID: IaSbHg2k87GvHbhD8uOzU08wj8Ud6aA6 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.391, 18.0.790 definitions=2021-07-04_12:2021-07-02, 2021-07-04 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 lowpriorityscore=0 adultscore=0 impostorscore=0 mlxscore=0 bulkscore=0 mlxlogscore=999 priorityscore=1501 phishscore=0 suspectscore=0 malwarescore=0 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2104190000 definitions=main-2107040100 X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, NICE_REPLY_A, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Jul 2021 16:32:28 -0000 Hi Peter, On 7/1/21 2:48 PM, Peter Bergner via Gcc-patches wrote: > gcc/ > * config/rs6000/rs6000-builtin.def (BU_MMA_PAIR_LD, BU_MMA_PAIR_ST): > New macros. > (__builtin_vsx_lxvp, __builtin_vsx_stxvp): New built-ins. > * config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Expand > lxvp and stxvp built-ins. > (mma_init_builtins): Handle lxvp and stxvp built-ins. > (builtin_function_type): Likewise. > * doc/extend.texi (__builtin_vsx_lxvp, __builtin_mma_stxvp): Document. > > gcc/testsuite/ > * gcc.target/powerpc/mma-builtin-7.c: New test. > * gcc.target/powerpc/mma-builtin-8.c: New test. > > > diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def > index d7ce4de421e..6270444ef70 100644 > --- a/gcc/config/rs6000/rs6000-builtin.def > +++ b/gcc/config/rs6000/rs6000-builtin.def > @@ -484,6 +484,25 @@ > | RS6000_BTC_SENARY), \ > CODE_FOR_ ## ICODE) /* ICODE */ > > +#define BU_MMA_PAIR_LD(ENUM, NAME, ATTR) \ > + RS6000_BUILTIN_M (VSX_BUILTIN_ ## ENUM, /* ENUM */ \ > + "__builtin_vsx_" NAME, /* NAME */ \ > + RS6000_BTM_MMA, /* MASK */ \ > + (RS6000_BTC_ ## ATTR /* ATTR */ \ > + | RS6000_BTC_BINARY \ > + | RS6000_BTC_GIMPLE), \ > + CODE_FOR_nothing) /* ICODE */ > + > +#define BU_MMA_PAIR_ST(ENUM, NAME, ATTR) \ > + RS6000_BUILTIN_M (VSX_BUILTIN_ ## ENUM, /* ENUM */ \ > + "__builtin_vsx_" NAME, /* NAME */ \ > + RS6000_BTM_MMA, /* MASK */ \ > + (RS6000_BTC_ ## ATTR /* ATTR */ \ > + | RS6000_BTC_TERNARY \ > + | RS6000_BTC_VOID \ > + | RS6000_BTC_GIMPLE), \ > + CODE_FOR_nothing) /* ICODE */ > + > /* ISA 2.05 (power6) convenience macros. */ > /* For functions that depend on the CMPB instruction */ > #define BU_P6_2(ENUM, NAME, ATTR, ICODE) \ > @@ -3253,6 +3272,9 @@ BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", RS6000_BTM_ALWAYS, > BU_P10V_VSX_1 (XVCVBF16SPN, "xvcvbf16spn", MISC, vsx_xvcvbf16spn) > BU_P10V_VSX_1 (XVCVSPBF16, "xvcvspbf16", MISC, vsx_xvcvspbf16) > > +BU_MMA_PAIR_LD (LXVP, "lxvp", MISC) > +BU_MMA_PAIR_ST (STXVP, "stxvp", PAIR) > + > BU_MMA_1 (XXMFACC, "xxmfacc", QUAD, mma_xxmfacc) > BU_MMA_1 (XXMTACC, "xxmtacc", QUAD, mma_xxmtacc) > BU_MMA_1 (XXSETACCZ, "xxsetaccz", MISC, mma_xxsetaccz) > diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c > index b67789845a5..6115e3b34d9 100644 > --- a/gcc/config/rs6000/rs6000-call.c > +++ b/gcc/config/rs6000/rs6000-call.c > @@ -11913,6 +11913,32 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi) > gsi_replace_with_seq (gsi, new_seq, true); > return true; > } > + else if (fncode == VSX_BUILTIN_LXVP) > + { > + push_gimplify_context (true); > + tree offset = gimple_call_arg (stmt, 0); > + tree ptr = gimple_call_arg (stmt, 1); > + tree lhs = gimple_call_lhs (stmt); > + tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, > + TREE_TYPE (ptr), ptr, offset)); > + gimplify_assign (lhs, mem, &new_seq); > + pop_gimplify_context (NULL); > + gsi_replace_with_seq (gsi, new_seq, true); > + return true; > + } > + else if (fncode == VSX_BUILTIN_STXVP) > + { > + push_gimplify_context (true); > + tree src = gimple_call_arg (stmt, 0); > + tree offset = gimple_call_arg (stmt, 1); > + tree ptr = gimple_call_arg (stmt, 2); > + tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, > + TREE_TYPE (ptr), ptr, offset)); > + gimplify_assign (mem, src, &new_seq); > + pop_gimplify_context (NULL); > + gsi_replace_with_seq (gsi, new_seq, true); > + return true; > + } > > /* Convert this built-in into an internal version that uses pass-by-value > arguments. The internal built-in follows immediately after this one. */ > @@ -14264,11 +14290,14 @@ mma_init_builtins (void) > if (gimple_func) > { > gcc_assert (icode == CODE_FOR_nothing); > - op[nopnds++] = void_type_node; > /* Some MMA built-ins that are expanded into gimple are converted > into internal MMA built-ins that are expanded into rtl. > The internal built-in follows immediately after this built-in. */ > - icode = d[1].icode; > + if (d[1].icode != CODE_FOR_nothing) > + { > + op[nopnds++] = void_type_node; > + icode = d[1].icode; > + } This hunk bothers me.  The new macros BU_MMA_PAIR_LD and BU_MMA_PAIR_ST define only one builtin, not two.  They are both flagged as RS6000_BTC_GIMPLE.  The use of d[1] in this case is suspect and fragile.  It appears you're relying on the built-in following each of __builtin_vsx_lxvp and __builtin_vsx_stxvp to exist, as otherwise d[1] will be out of bounds.  It's otherwise rather meaningless because you later handle VSX_BUILTIN_LXVP and VSX_BUILTIN_STXVP separately.  Can you please replace this with something less fragile?  Perhaps leave the gimple_func leg alone, but first test for these two builtins and do the right thing for them instead. Otherwise this LGTM. Thanks, Bill > } > else > { > @@ -14291,6 +14320,19 @@ mma_init_builtins (void) > else > op[nopnds++] = build_pointer_type (vector_pair_type_node); > } > + else if (d->code == VSX_BUILTIN_LXVP) > + { > + op[nopnds++] = vector_pair_type_node; > + op[nopnds++] = sizetype; > + op[nopnds++] = build_pointer_type (vector_pair_type_node); > + } > + else if (d->code == VSX_BUILTIN_STXVP) > + { > + op[nopnds++] = void_type_node; > + op[nopnds++] = vector_pair_type_node; > + op[nopnds++] = sizetype; > + op[nopnds++] = build_pointer_type (vector_pair_type_node); > + } > else > { > /* This is a normal MMA built-in function. */ > @@ -14781,6 +14823,16 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, > h.uns_p[2] = 1; > break; > > + case VSX_BUILTIN_LXVP: > + h.uns_p[0] = 1; > + h.uns_p[2] = 1; > + break; > + > + case VSX_BUILTIN_STXVP: > + h.uns_p[1] = 1; > + h.uns_p[3] = 1; > + break; > + > default: > break; > } > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > index 8fc66d626d8..b83cd4919bb 100644 > --- a/gcc/doc/extend.texi > +++ b/gcc/doc/extend.texi > @@ -20731,6 +20731,9 @@ void __builtin_vsx_disassemble_pair (void *, __vector_pair *); > > vec_t __builtin_vsx_xvcvspbf16 (vec_t); > vec_t __builtin_vsx_xvcvbf16spn (vec_t); > + > +__vector_pair __builtin_vsx_lxvp (size_t, __vector_pair *); > +void __builtin_vsx_stxvp (__vector_pair, size_t, __vector_pair *); > @end smallexample > > @node PRU Built-in Functions > diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c b/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c > new file mode 100644 > index 00000000000..c661a4b84bc > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c > @@ -0,0 +1,26 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target power10_ok } */ > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ > + > +void > +foo (__vector_pair *dst, __vector_pair *src, long idx) > +{ > + dst[0] = __builtin_vsx_lxvp (0, src); > + dst[2] = __builtin_vsx_lxvp (32, src); > + dst[4] = __builtin_vsx_lxvp (64, src); > + /* Non-constant offset should generate a lxvpx. */ > + dst[6] = __builtin_vsx_lxvp (idx, src); > + /* Non-aligned offset should generate a plxvp. */ > + dst[8] = __builtin_vsx_lxvp (257, src); > +} > + > +#if !__has_builtin (__builtin_vsx_lxvp) > +# error "__has_builtin (__builtin_vsx_lxvp) failed" > +#endif > + > +/* { dg-final { scan-assembler-not {\mlxv\M} } } */ > +/* { dg-final { scan-assembler-not {\mstxv\M} } } */ > +/* { dg-final { scan-assembler-times {\mlxvp\M} 3 } } */ > +/* { dg-final { scan-assembler-times {\mlxvpx\M} 1 } } */ > +/* { dg-final { scan-assembler-times {\mplxvp\M} 1 } } */ > +/* { dg-final { scan-assembler-times {\mstxvp\M} 5 } } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c b/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c > new file mode 100644 > index 00000000000..af29e479f83 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c > @@ -0,0 +1,27 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target power10_ok } */ > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ > + > +void > +foo (__vector_pair *dst, __vector_pair *src, long idx) > +{ > + __vector_pair pair = *src; > + __builtin_vsx_stxvp (pair, 0, dst); > + __builtin_vsx_stxvp (pair, 32, dst); > + __builtin_vsx_stxvp (pair, 64, dst); > + /* Non-constant offset should generate a stxvpx. */ > + __builtin_vsx_stxvp (pair, idx, dst); > + /* Non-aligned offset should generate a pstxvp. */ > + __builtin_vsx_stxvp (pair, 257, dst); > +} > + > +#if !__has_builtin (__builtin_vsx_stxvp) > +# error "__has_builtin (__builtin_vsx_stxvp) failed" > +#endif > + > +/* { dg-final { scan-assembler-not {\mlxv\M} } } */ > +/* { dg-final { scan-assembler-not {\mstxv\M} } } */ > +/* { dg-final { scan-assembler-times {\mlxvp\M} 1 } } */ > +/* { dg-final { scan-assembler-times {\mstxvp\M} 3 } } */ > +/* { dg-final { scan-assembler-times {\mstxvpx\M} 1 } } */ > +/* { dg-final { scan-assembler-times {\mpstxvp\M} 1 } } */