From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bill Schmidt
To: gcc-patches@gcc.gnu.org
Cc: segher@kernel.crashing.org, dje.gcc@gmail.com
Subject: [PATCH 05/18] rs6000: Support for vectorizing built-in functions
Date: Wed, 1 Sep 2021 11:13:41 -0500
X-Mailer: git-send-email 2.17.1
List-Id: Gcc-patches mailing list

This patch just duplicates a couple of functions and adjusts them to use
the new builtin names.  There's no logical change otherwise.

2021-08-31  Bill Schmidt

gcc/
	* config/rs6000/rs6000.c (rs6000-builtins.h): New include.
	(rs6000_new_builtin_vectorized_function): New function.
	(rs6000_new_builtin_md_vectorized_function): Likewise.
	(rs6000_builtin_vectorized_function): Call
	rs6000_new_builtin_vectorized_function.
	(rs6000_builtin_md_vectorized_function): Call
	rs6000_new_builtin_md_vectorized_function.
---
 gcc/config/rs6000/rs6000.c | 253 +++++++++++++++++++++++++++++++++++++
 1 file changed, 253 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index b7ea1483da5..52c78c7500c 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -78,6 +78,7 @@
 #include "case-cfn-macros.h"
 #include "ppc-auxv.h"
 #include "rs6000-internal.h"
+#include "rs6000-builtins.h"
 #include "opts.h"
 
 /* This file should be included last.  */
@@ -5501,6 +5502,251 @@ rs6000_loop_unroll_adjust (unsigned nunroll, struct loop *loop)
   return nunroll;
 }
 
+/* Returns a function decl for a vectorized version of the builtin function
+   with builtin function code FN and the result vector type TYPE, or NULL_TREE
+   if it is not available.  */
+
+static tree
+rs6000_new_builtin_vectorized_function (unsigned int fn, tree type_out,
+                                        tree type_in)
+{
+  machine_mode in_mode, out_mode;
+  int in_n, out_n;
+
+  if (TARGET_DEBUG_BUILTIN)
+    fprintf (stderr, "rs6000_new_builtin_vectorized_function (%s, %s, %s)\n",
+             combined_fn_name (combined_fn (fn)),
+             GET_MODE_NAME (TYPE_MODE (type_out)),
+             GET_MODE_NAME (TYPE_MODE (type_in)));
+
+  if (TREE_CODE (type_out) != VECTOR_TYPE
+      || TREE_CODE (type_in) != VECTOR_TYPE)
+    return NULL_TREE;
+
+  out_mode = TYPE_MODE (TREE_TYPE (type_out));
+  out_n = TYPE_VECTOR_SUBPARTS (type_out);
+  in_mode = TYPE_MODE (TREE_TYPE (type_in));
+  in_n = TYPE_VECTOR_SUBPARTS (type_in);
+
+  switch (fn)
+    {
+    CASE_CFN_COPYSIGN:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+          && out_mode == DFmode && out_n == 2
+          && in_mode == DFmode && in_n == 2)
+        return rs6000_builtin_decls_x[RS6000_BIF_CPSGNDP];
+      if (VECTOR_UNIT_VSX_P (V4SFmode)
+          && out_mode == SFmode && out_n == 4
+          && in_mode == SFmode && in_n == 4)
+        return rs6000_builtin_decls_x[RS6000_BIF_CPSGNSP];
+      if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)
+          && out_mode == SFmode && out_n == 4
+          && in_mode == SFmode && in_n == 4)
+        return rs6000_builtin_decls_x[RS6000_BIF_COPYSIGN_V4SF];
+      break;
+    CASE_CFN_CEIL:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+          && out_mode == DFmode && out_n == 2
+          && in_mode == DFmode && in_n == 2)
+        return rs6000_builtin_decls_x[RS6000_BIF_XVRDPIP];
+      if (VECTOR_UNIT_VSX_P (V4SFmode)
+          && out_mode == SFmode && out_n == 4
+          && in_mode == SFmode && in_n == 4)
+        return rs6000_builtin_decls_x[RS6000_BIF_XVRSPIP];
+      if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)
+          && out_mode == SFmode && out_n == 4
+          && in_mode == SFmode && in_n == 4)
+        return rs6000_builtin_decls_x[RS6000_BIF_VRFIP];
+      break;
+    CASE_CFN_FLOOR:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+          && out_mode == DFmode && out_n == 2
+          && in_mode == DFmode && in_n == 2)
+        return rs6000_builtin_decls_x[RS6000_BIF_XVRDPIM];
+      if (VECTOR_UNIT_VSX_P (V4SFmode)
+          && out_mode == SFmode && out_n == 4
+          && in_mode == SFmode && in_n == 4)
+        return rs6000_builtin_decls_x[RS6000_BIF_XVRSPIM];
+      if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)
+          && out_mode == SFmode && out_n == 4
+          && in_mode == SFmode && in_n == 4)
+        return rs6000_builtin_decls_x[RS6000_BIF_VRFIM];
+      break;
+    CASE_CFN_FMA:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+          && out_mode == DFmode && out_n == 2
+          && in_mode == DFmode && in_n == 2)
+        return rs6000_builtin_decls_x[RS6000_BIF_XVMADDDP];
+      if (VECTOR_UNIT_VSX_P (V4SFmode)
+          && out_mode == SFmode && out_n == 4
+          && in_mode == SFmode && in_n == 4)
+        return rs6000_builtin_decls_x[RS6000_BIF_XVMADDSP];
+      if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)
+          && out_mode == SFmode && out_n == 4
+          && in_mode == SFmode && in_n == 4)
+        return rs6000_builtin_decls_x[RS6000_BIF_VMADDFP];
+      break;
+    CASE_CFN_TRUNC:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+          && out_mode == DFmode && out_n == 2
+          && in_mode == DFmode && in_n == 2)
+        return rs6000_builtin_decls_x[RS6000_BIF_XVRDPIZ];
+      if (VECTOR_UNIT_VSX_P (V4SFmode)
+          && out_mode == SFmode && out_n == 4
+          && in_mode == SFmode && in_n == 4)
+        return rs6000_builtin_decls_x[RS6000_BIF_XVRSPIZ];
+      if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)
+          && out_mode == SFmode && out_n == 4
+          && in_mode == SFmode && in_n == 4)
+        return rs6000_builtin_decls_x[RS6000_BIF_VRFIZ];
+      break;
+    CASE_CFN_NEARBYINT:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+          && flag_unsafe_math_optimizations
+          && out_mode == DFmode && out_n == 2
+          && in_mode == DFmode && in_n == 2)
+        return rs6000_builtin_decls_x[RS6000_BIF_XVRDPI];
+      if (VECTOR_UNIT_VSX_P (V4SFmode)
+          && flag_unsafe_math_optimizations
+          && out_mode == SFmode && out_n == 4
+          && in_mode == SFmode && in_n == 4)
+        return rs6000_builtin_decls_x[RS6000_BIF_XVRSPI];
+      break;
+    CASE_CFN_RINT:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+          && !flag_trapping_math
+          && out_mode == DFmode && out_n == 2
+          && in_mode == DFmode && in_n == 2)
+        return rs6000_builtin_decls_x[RS6000_BIF_XVRDPIC];
+      if (VECTOR_UNIT_VSX_P (V4SFmode)
+          && !flag_trapping_math
+          && out_mode == SFmode && out_n == 4
+          && in_mode == SFmode && in_n == 4)
+        return rs6000_builtin_decls_x[RS6000_BIF_XVRSPIC];
+      break;
+    default:
+      break;
+    }
+
+  /* Generate calls to libmass if appropriate.  */
+  if (rs6000_veclib_handler)
+    return rs6000_veclib_handler (combined_fn (fn), type_out, type_in);
+
+  return NULL_TREE;
+}
+
+/* Implement TARGET_VECTORIZE_BUILTIN_MD_VECTORIZED_FUNCTION.  */
+
+static tree
+rs6000_new_builtin_md_vectorized_function (tree fndecl, tree type_out,
+                                           tree type_in)
+{
+  machine_mode in_mode, out_mode;
+  int in_n, out_n;
+
+  if (TARGET_DEBUG_BUILTIN)
+    fprintf (stderr,
+             "rs6000_new_builtin_md_vectorized_function (%s, %s, %s)\n",
+             IDENTIFIER_POINTER (DECL_NAME (fndecl)),
+             GET_MODE_NAME (TYPE_MODE (type_out)),
+             GET_MODE_NAME (TYPE_MODE (type_in)));
+
+  if (TREE_CODE (type_out) != VECTOR_TYPE
+      || TREE_CODE (type_in) != VECTOR_TYPE)
+    return NULL_TREE;
+
+  out_mode = TYPE_MODE (TREE_TYPE (type_out));
+  out_n = TYPE_VECTOR_SUBPARTS (type_out);
+  in_mode = TYPE_MODE (TREE_TYPE (type_in));
+  in_n = TYPE_VECTOR_SUBPARTS (type_in);
+
+  enum rs6000_gen_builtins fn
+    = (enum rs6000_gen_builtins) DECL_MD_FUNCTION_CODE (fndecl);
+  switch (fn)
+    {
+    case RS6000_BIF_RSQRTF:
+      if (VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)
+          && out_mode == SFmode && out_n == 4
+          && in_mode == SFmode && in_n == 4)
+        return rs6000_builtin_decls_x[RS6000_BIF_VRSQRTFP];
+      break;
+    case RS6000_BIF_RSQRT:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+          && out_mode == DFmode && out_n == 2
+          && in_mode == DFmode && in_n == 2)
+        return rs6000_builtin_decls_x[RS6000_BIF_RSQRT_2DF];
+      break;
+    case RS6000_BIF_RECIPF:
+      if (VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)
+          && out_mode == SFmode && out_n == 4
+          && in_mode == SFmode && in_n == 4)
+        return rs6000_builtin_decls_x[RS6000_BIF_VRECIPFP];
+      break;
+    case RS6000_BIF_RECIP:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+          && out_mode == DFmode && out_n == 2
+          && in_mode == DFmode && in_n == 2)
+        return rs6000_builtin_decls_x[RS6000_BIF_RECIP_V2DF];
+      break;
+    default:
+      break;
+    }
+
+  machine_mode in_vmode = TYPE_MODE (type_in);
+  machine_mode out_vmode = TYPE_MODE (type_out);
+
+  /* Power10 supported vectorized built-in functions.  */
+  if (TARGET_POWER10
+      && in_vmode == out_vmode
+      && VECTOR_UNIT_ALTIVEC_OR_VSX_P (in_vmode))
+    {
+      machine_mode exp_mode = DImode;
+      machine_mode exp_vmode = V2DImode;
+      enum rs6000_gen_builtins bif;
+      switch (fn)
+        {
+        case RS6000_BIF_DIVWE:
+        case RS6000_BIF_DIVWEU:
+          exp_mode = SImode;
+          exp_vmode = V4SImode;
+          if (fn == RS6000_BIF_DIVWE)
+            bif = RS6000_BIF_VDIVESW;
+          else
+            bif = RS6000_BIF_VDIVEUW;
+          break;
+        case RS6000_BIF_DIVDE:
+        case RS6000_BIF_DIVDEU:
+          if (fn == RS6000_BIF_DIVDE)
+            bif = RS6000_BIF_VDIVESD;
+          else
+            bif = RS6000_BIF_VDIVEUD;
+          break;
+        case RS6000_BIF_CFUGED:
+          bif = RS6000_BIF_VCFUGED;
+          break;
+        case RS6000_BIF_CNTLZDM:
+          bif = RS6000_BIF_VCLZDM;
+          break;
+        case RS6000_BIF_CNTTZDM:
+          bif = RS6000_BIF_VCTZDM;
+          break;
+        case RS6000_BIF_PDEPD:
+          bif = RS6000_BIF_VPDEPD;
+          break;
+        case RS6000_BIF_PEXTD:
+          bif = RS6000_BIF_VPEXTD;
+          break;
+        default:
+          return NULL_TREE;
+        }
+
+      if (in_mode == exp_mode && in_vmode == exp_vmode)
+        return rs6000_builtin_decls_x[bif];
+    }
+
+  return NULL_TREE;
+}
+
 /* Handler for the Mathematical Acceleration Subsystem (mass) interface to a
    library with vectorized intrinsics.  */
 
@@ -5620,6 +5866,9 @@ rs6000_builtin_vectorized_function (unsigned int fn, tree type_out,
   machine_mode in_mode, out_mode;
   int in_n, out_n;
 
+  if (new_builtins_are_live)
+    return rs6000_new_builtin_vectorized_function (fn, type_out, type_in);
+
   if (TARGET_DEBUG_BUILTIN)
     fprintf (stderr, "rs6000_builtin_vectorized_function (%s, %s, %s)\n",
              combined_fn_name (combined_fn (fn)),
@@ -5751,6 +6000,10 @@ rs6000_builtin_md_vectorized_function (tree fndecl, tree type_out,
   machine_mode in_mode, out_mode;
   int in_n, out_n;
 
+  if (new_builtins_are_live)
+    return rs6000_new_builtin_md_vectorized_function (fndecl, type_out,
+                                                      type_in);
+
   if (TARGET_DEBUG_BUILTIN)
     fprintf (stderr, "rs6000_builtin_md_vectorized_function (%s, %s, %s)\n",
              IDENTIFIER_POINTER (DECL_NAME (fndecl)),
-- 
2.27.0
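
For readers who want the shape of the change without the GCC internals, the lookup in this patch can be sketched as standalone C. This is only an illustration, not GCC code: `struct target`, `enum elt_mode`, `enum bif`, `new_pick_ceil` and `pick_ceil` are all hypothetical names, the three boolean flags stand in for the `VECTOR_UNIT_VSX_P`, `VECTOR_UNIT_ALTIVEC_P` and `new_builtins_are_live` tests, and the legacy table lookup is elided. It shows the two ideas the patch relies on: match the element mode and lane count of both vector types before picking a built-in, and have the old entry point forward to the new one when the flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's scalar modes and built-in codes.  */
enum elt_mode { ELT_SF, ELT_DF };
enum bif { BIF_NONE, BIF_XVRDPIP, BIF_XVRSPIP, BIF_VRFIP };

/* Hypothetical target description; the real code queries GCC macros.  */
struct target
{
  bool vsx;               /* Stand-in for VECTOR_UNIT_VSX_P.  */
  bool altivec;           /* Stand-in for VECTOR_UNIT_ALTIVEC_P.  */
  bool new_builtins_live; /* Stand-in for new_builtins_are_live.  */
};

/* New-style lookup for ceil: VSX candidates first, then AltiVec.
   Returns BIF_NONE when no vector built-in matches.  */
enum bif
new_pick_ceil (const struct target *t, enum elt_mode out_mode, int out_n,
               enum elt_mode in_mode, int in_n)
{
  assert (t != NULL);

  if (t->vsx
      && out_mode == ELT_DF && out_n == 2
      && in_mode == ELT_DF && in_n == 2)
    return BIF_XVRDPIP;
  if (t->vsx
      && out_mode == ELT_SF && out_n == 4
      && in_mode == ELT_SF && in_n == 4)
    return BIF_XVRSPIP;
  if (t->altivec
      && out_mode == ELT_SF && out_n == 4
      && in_mode == ELT_SF && in_n == 4)
    return BIF_VRFIP;
  return BIF_NONE;
}

/* Existing entry point: forward to the new lookup when the flag is set.
   The legacy lookup is elided in this sketch and reports no match.  */
enum bif
pick_ceil (const struct target *t, enum elt_mode out_mode, int out_n,
           enum elt_mode in_mode, int in_n)
{
  assert (t != NULL);

  if (t->new_builtins_live)
    return new_pick_ceil (t, out_mode, out_n, in_mode, in_n);

  return BIF_NONE;
}
```

With all three flags set, `pick_ceil (&t, ELT_SF, 4, ELT_SF, 4)` selects `BIF_XVRSPIP`; clearing `vsx` makes the same query fall through to `BIF_VRFIP`, which mirrors the VSX-then-AltiVec ordering of the `if` chains in the patch.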