Date: Mon, 10 Jul 2023 15:50:47 -0400
From: Michael Meissner
To: gcc-patches@gcc.gnu.org, Michael Meissner, Segher Boessenkool, "Kewen.Lin", David Edelsohn, Peter Bergner
Subject: [PATCH] Optimize vec_splats of vec_extract for V2DI/V2DF (PR target/99293)

This patch optimizes cases like:

        vector double v1, v2;
        /* ... */
        v2 = vec_splats (vec_extract (v1, 0));  /* or  */
        v2 = vec_splats (vec_extract (v1, 1));

Previously:

        vector long long
        splat_dup_l_0 (vector long long v)
        {
          return __builtin_vec_splats (__builtin_vec_extract (v, 0));
        }

would generate:

        mfvsrld 9,34
        mtvsrdd 34,9,9
        blr

With this patch, GCC generates:

        xxpermdi 34,34,34,3
        blr

2023-07-10  Michael Meissner

gcc/

        PR target/99293
        * gcc/config/rs6000/vsx.md (vsx_splat_extract_<mode>): New combiner
        insn.

gcc/testsuite/

        PR target/108958
        * gcc.target/powerpc/pr99293.c: New test.
        * gcc.target/powerpc/builtins-1.c: Update insn count.
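
For readers who want to check the immediate that the new insn passes to
xxpermdi, the selection can be modelled in plain C.  This is only a sketch
and not part of the patch: vsr_t, model_xxpermdi, splat_extract_dm and
model_self_check are hypothetical names introduced here, and the register is
assumed to hold two doublewords numbered in ISA (big-endian) order, so that
element 0 of a little-endian V2DI/V2DF vector is doubleword 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit VSX register: dw[0] is the most
   significant doubleword, dw[1] the least significant (ISA numbering).  */
typedef struct { uint64_t dw[2]; } vsr_t;

/* Model of "xxpermdi XT,XA,XB,DM": the high bit of the 2-bit DM field picks
   the doubleword of XA that lands in dw[0], the low bit picks the doubleword
   of XB that lands in dw[1].  */
vsr_t
model_xxpermdi (vsr_t a, vsr_t b, int dm)
{
  vsr_t r;
  r.dw[0] = a.dw[(dm >> 1) & 1];
  r.dw[1] = b.dw[dm & 1];
  return r;
}

/* Mirror of the immediate computation in the new insn: flip the element
   number on little endian, then use DM = 3 (both halves from doubleword 1)
   or DM = 0 (both halves from doubleword 0).  BIG_ENDIAN is passed in as an
   argument here instead of using GCC's BYTES_BIG_ENDIAN macro.  */
int
splat_extract_dm (int element, int big_endian)
{
  int which_word = element;
  if (!big_endian)
    which_word = 1 - which_word;

  return which_word ? 3 : 0;
}

/* Small self-check of the little-endian cases covered by the new test.  */
void
model_self_check (void)
{
  vsr_t v = { { 0x1111, 0x2222 } };

  /* Little endian, element 0: ISA doubleword 1 is splatted.  */
  vsr_t r = model_xxpermdi (v, v, splat_extract_dm (0, 0));
  assert (r.dw[0] == 0x2222 && r.dw[1] == 0x2222);

  /* Little endian, element 1: ISA doubleword 0 is splatted.  */
  r = model_xxpermdi (v, v, splat_extract_dm (1, 0));
  assert (r.dw[0] == 0x1111 && r.dw[1] == 0x1111);
}
```

Calling model_self_check () from a small driver exercises both little-endian
cases; under the assumptions above, element 0 maps to the immediate 3 and
element 1 to the immediate 0, matching the comments in the new test.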
---
 gcc/config/rs6000/vsx.md                      | 18 ++++++
 gcc/testsuite/gcc.target/powerpc/builtins-1.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr99293.c    | 55 +++++++++++++++++++
 3 files changed, 74 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99293.c

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 0c269e4e8d9..d34c3b21abe 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4600,6 +4600,24 @@ (define_insn "vsx_splat_<mode>_mem"
   "lxvdsx %x0,%y1"
   [(set_attr "type" "vecload")])
 
+;; Optimize SPLAT of an extract from a V2DF/V2DI vector with a constant element
+(define_insn "*vsx_splat_extract_<mode>"
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa")
+        (vec_duplicate:VSX_D
+         (vec_select:<VEC_base>
+          (match_operand:VSX_D 1 "vsx_register_operand" "wa")
+          (parallel [(match_operand 2 "const_0_to_1_operand" "n")]))))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)"
+{
+  int which_word = INTVAL (operands[2]);
+  if (!BYTES_BIG_ENDIAN)
+    which_word = 1 - which_word;
+
+  operands[3] = GEN_INT (which_word ? 3 : 0);
+  return "xxpermdi %x0,%x1,%x1,%3";
+}
+  [(set_attr "type" "vecperm")])
+
 ;; V4SI splat support
 (define_insn "vsx_splat_v4si"
   [(set (match_operand:V4SI 0 "vsx_register_operand" "=wa,wa")
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-1.c b/gcc/testsuite/gcc.target/powerpc/builtins-1.c
index 28cd1aa6b1a..98783668bce 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-1.c
@@ -1035,4 +1035,4 @@ foo156 (vector unsigned short usa)
 /* { dg-final { scan-assembler-times {\mvmrglb\M} 3 } } */
 /* { dg-final { scan-assembler-times {\mvmrgew\M} 4 } } */
 /* { dg-final { scan-assembler-times {\mvsplth|xxsplth\M} 4 } } */
-/* { dg-final { scan-assembler-times {\mxxpermdi\M} 44 } } */
+/* { dg-final { scan-assembler-times {\mxxpermdi\M} 42 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr99293.c b/gcc/testsuite/gcc.target/powerpc/pr99293.c
new file mode 100644
index 00000000000..e5f44bd7346
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr99293.c
@@ -0,0 +1,55 @@
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-O2 -mpower8-vector" } */
+
+/* Test for PR 99293, which wants to do:
+        __builtin_vec_splats (__builtin_vec_extract (v, n))
+
+   where v is a V2DF or V2DI vector and n is either 0 or 1.  Previously the GCC
+   compiler would do a direct move to the GPR registers to select the item and a
+   direct move from the GPR registers to do the splat.
+
+   Before the patch, splat_dup_ll_0 or splat_dup_dbl_0 below would generate:
+
+        mfvsrld 9,34
+        mtvsrdd 34,9,9
+        blr
+
+   and now it generates:
+
+        xxpermdi 34,34,34,3
+        blr  */
+
+#include <altivec.h>
+
+vector long long
+splat_dup_ll_0 (vector long long v)
+{
+  /* xxpermdi 34,34,34,3 */
+  return __builtin_vec_splats (vec_extract (v, 0));
+}
+
+vector double
+splat_dup_dbl_0 (vector double v)
+{
+  /* xxpermdi 34,34,34,3 */
+  return __builtin_vec_splats (vec_extract (v, 0));
+}
+
+vector long long
+splat_dup_ll_1 (vector long long v)
+{
+  /* xxpermdi 34,34,34,0 */
+  return __builtin_vec_splats (vec_extract (v, 1));
+}
+
+vector double
+splat_dup_dbl_1 (vector double v)
+{
+  /* xxpermdi 34,34,34,0 */
+  return __builtin_vec_splats (vec_extract (v, 1));
+}
+
+/* { dg-final { scan-assembler-times "xxpermdi" 4 } } */
+/* { dg-final { scan-assembler-not   "mfvsrd"     } } */
+/* { dg-final { scan-assembler-not   "mfvsrld"    } } */
+/* { dg-final { scan-assembler-not   "mtvsrdd"    } } */
-- 
2.41.0

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com