From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Schulze Frielinghaus
To: krebbel@linux.ibm.com, gcc-patches@gcc.gnu.org
Cc: Stefan Schulze Frielinghaus
Subject: [PATCH 1/3] s390: Emulate vec_cmp{eq,gt,gtu} for 128-bit integers
Date: Mon, 1 Jul 2024 10:32:29 +0200
Message-ID: <20240701083231.160970-2-stefansf@gcc.gnu.org>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240701083231.160970-1-stefansf@gcc.gnu.org>
References: <20240701083231.160970-1-stefansf@gcc.gnu.org>

Mode iterator V_HW enables V1TI for target VXE, which makes
vec_cmpv1tiv1ti available.  This leads to an ICE because there is no
corresponding insn.  Fix this by emulating the comparisons and enabling
mode V1TI unconditionally for V_HW.

For the sake of symmetry, also add TI mode to V_HW, since TF mode is
already included.  As a consequence, the consumers of V_HW,
vec_{splat,slb,sld,sldw,sldb,srdb,srab,srb,test_mask_int,test_mask},
also become available for 128-bit integers.

This fixes gcc.c-torture/execute/pr105613.c and gcc.dg/pr106063.c.

gcc/ChangeLog:

	* config/s390/vector.md (V_HW): Enable V1TI unconditionally and add TI.
	(vec_cmpu): Add 128-bit integer variants.
	(*vec_cmpeq_nocc_emu): Emulate operation.
	(*vec_cmpgt_nocc_emu): Emulate operation.
	(*vec_cmpgtu_nocc_emu): Emulate operation.

gcc/testsuite/ChangeLog:

	* gcc.target/s390/vector/vec-cmp-emu-1.c: New test.
	* gcc.target/s390/vector/vec-cmp-emu-2.c: New test.
	* gcc.target/s390/vector/vec-cmp-emu-3.c: New test.
---
Bootstrapped and regtested on s390.  Ok for mainline and GCC 14?

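For readers unfamiliar with the approach, here is a minimal scalar sketch of the idea behind the emulation: a 128-bit comparison is decomposed into comparisons of its two 64-bit halves. It is not the RTL sequence from the patch and says nothing about element numbering. The names `halves`, `split`, `eq128`, `gtu128`, `gt128` and `matches_native` are hypothetical, and the code assumes a compiler that provides `__int128` (e.g. GCC on a 64-bit target).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a 128-bit value split into 64-bit halves.  */
typedef struct { uint64_t hi, lo; } halves;

static halves
split (unsigned __int128 v)
{
  halves h = { (uint64_t) (v >> 64), (uint64_t) v };
  return h;
}

/* Equal iff both halves are equal.  */
static int
eq128 (halves a, halves b)
{
  return (a.hi == b.hi) & (a.lo == b.lo);
}

/* Unsigned greater-than: the high halves decide; on a tie the low
   halves are compared unsigned.  */
static int
gtu128 (halves a, halves b)
{
  return (a.hi > b.hi) | ((a.hi == b.hi) & (a.lo > b.lo));
}

/* Signed greater-than: only the high half carries the sign, so it is
   compared signed, while the low half is still compared unsigned.
   The cast to int64_t relies on GCC's wrapping conversion.  */
static int
gt128 (halves a, halves b)
{
  return ((int64_t) a.hi > (int64_t) b.hi)
	 | ((a.hi == b.hi) & (a.lo > b.lo));
}

/* Return 1 if all three emulated results agree with the native
   __int128 comparisons for the given pair.  */
static int
matches_native (unsigned __int128 x, unsigned __int128 y)
{
  halves a = split (x), b = split (y);
  return eq128 (a, b) == (x == y)
	 && gtu128 (a, b) == (x > y)
	 && gt128 (a, b) == ((__int128) x > (__int128) y);
}

/* Small self-check over a few hand-picked pairs.  */
static void
self_check (void)
{
  unsigned __int128 one = 1;
  assert (matches_native (0, 0));
  assert (matches_native (one << 64, 1));
  assert (matches_native (one << 127, 1));
}
```

Calling `self_check ()` exercises a tie in the high half, a difference in the high half only, and a pair whose signed and unsigned orderings differ.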
 gcc/config/s390/vector.md                     | 113 ++++++++++++++++--
 .../gcc.target/s390/vector/vec-cmp-emu-1.c    |  35 ++++++
 .../gcc.target/s390/vector/vec-cmp-emu-2.c    |  18 +++
 .../gcc.target/s390/vector/vec-cmp-emu-3.c    |  17 +++
 4 files changed, 171 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-cmp-emu-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-cmp-emu-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-cmp-emu-3.c

diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 40de0c75a7c..032ec44542c 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -30,7 +30,7 @@
 ; V_HW2 is for having two iterators expanding independently e.g. vcond.
 ; It's similar to V_HW, but not fully identical: V1TI is not included, because
 ; there are no 128-bit compares.
-(define_mode_iterator V_HW [V16QI V8HI V4SI V2DI (V1TI "TARGET_VXE") V2DF
+(define_mode_iterator V_HW [V16QI V8HI V4SI V2DI V1TI TI V2DF
                             (V4SF "TARGET_VXE") (V1TF "TARGET_VXE")
                             (TF "TARGET_VXE")])
 (define_mode_iterator V_HW2 [V16QI V8HI V4SI V2DI V2DF (V4SF "TARGET_VXE")
@@ -50,6 +50,7 @@
 (define_mode_iterator VI_HW_HSDT [V8HI V4SI V2DI V1TI TI])
 (define_mode_iterator VI_HW_HS [V8HI V4SI])
 (define_mode_iterator VI_HW_QH [V16QI V8HI])
+(define_mode_iterator VI_HW_T [V1TI TI])
 
 ; Directly supported vector modes with a certain number of elements
 (define_mode_iterator V_HW_2 [V2DI V2DF])
@@ -151,7 +152,7 @@
    (V1HI "V1HI") (V2HI "V2HI") (V4HI "V4HI") (V8HI "V8HI")
    (V1SI "V1SI") (V2SI "V2SI") (V4SI "V4SI")
    (V1DI "V1DI") (V2DI "V2DI")
-   (V1TI "V1TI")
+   (V1TI "V1TI") (TI "V1TI")
    (V1SF "V1SI") (V2SF "V2SI") (V4SF "V4SI")
    (V1DF "V1DI") (V2DF "V2DI")
    (V1TF "V1TI") (TF "V1TI")])
@@ -160,7 +161,7 @@
    (V1HI "v1hi") (V2HI "v2hi") (V4HI "v4hi") (V8HI "v8hi")
    (V1SI "v1si") (V2SI "v2si") (V4SI "v4si")
    (V1DI "v1di") (V2DI "v2di")
-   (V1TI "v1ti")
+   (V1TI "v1ti") (TI "v1ti")
    (V1SF "v1si") (V2SF "v2si") (V4SF "v4si")
    (V1DF "v1di") (V2DF "v2di")
    (V1TF "v1ti") (TF "v1ti")])
@@ -1956,11 +1957,11 @@
   DONE;
 })
 
-(define_expand "vec_cmpu"
-  [(set (match_operand:VI_HW 0 "register_operand" "")
-        (match_operator:VI_HW 1 ""
-          [(match_operand:VI_HW 2 "register_operand" "")
-           (match_operand:VI_HW 3 "register_operand" "")]))]
+(define_expand "vec_cmpu"
+  [(set (match_operand:VIT_HW 0 "register_operand" "")
+        (match_operator:VIT_HW 1 ""
+          [(match_operand:VIT_HW 2 "register_operand" "")
+           (match_operand:VIT_HW 3 "register_operand" "")]))]
   "TARGET_VX"
 {
   s390_expand_vec_compare (operands[0], GET_CODE(operands[1]), operands[2], operands[3]);
@@ -1975,6 +1976,94 @@
   "vc\t%v2,%v0,%v1"
   [(set_attr "op_type" "VRR")])
 
+(define_insn_and_split "*vec_cmpeq_nocc_emu"
+  [(set (match_operand:VI_HW_T 0 "register_operand" "=v")
+        (eq:VI_HW_T (match_operand:VI_HW_T 1 "register_operand" "v")
+                    (match_operand:VI_HW_T 2 "register_operand" "v")))]
+  "TARGET_VX"
+  "#"
+  "&& can_create_pseudo_p ()"
+  [(set (match_dup 3)
+        (eq:V2DI (match_dup 1) (match_dup 2)))
+   (set (match_dup 4)
+        (vec_select:V2DI (match_dup 3) (parallel [(const_int 1) (const_int 0)])))
+   (set (match_dup 3)
+        (and:V2DI (match_dup 3) (match_dup 4)))
+   (set (match_dup 0)
+        (subreg: (match_dup 3) 0))]
+{
+  operands[1] = simplify_gen_subreg (V2DImode, operands[1], mode, 0);
+  operands[2] = simplify_gen_subreg (V2DImode, operands[2], mode, 0);
+  operands[3] = gen_reg_rtx (V2DImode);
+  operands[4] = gen_reg_rtx (V2DImode);
+})
+
+(define_insn_and_split "*vec_cmpgt_nocc_emu"
+  [(set (match_operand:VI_HW_T 0 "register_operand" "=v")
+        (gt:VI_HW_T (match_operand:VI_HW_T 1 "register_operand" "v")
+                    (match_operand:VI_HW_T 2 "register_operand" "v")))]
+  "TARGET_VX"
+  "#"
+  "&& can_create_pseudo_p ()"
+  [(set (match_dup 3)
+        (gt:V2DI (match_dup 1) (match_dup 2)))
+   (set (match_dup 4)
+        (eq:V2DI (match_dup 1) (match_dup 2)))
+   (set (match_dup 5)
+        (gtu:V2DI (match_dup 1) (match_dup 2)))
+   (set (match_dup 5)
+        (vec_select:V2DI (match_dup 5) (parallel [(const_int 1) (const_int 0)])))
+   (set (match_dup 4)
+        (and:V2DI (match_dup 4) (match_dup 5)))
+   (set (match_dup 4)
+        (ior:V2DI (match_dup 3) (match_dup 4)))
+   (set (match_dup 4)
+        (vec_duplicate:V2DI
+         (vec_select:DI
+          (match_dup 4)
+          (parallel [(const_int 1)]))))
+   (set (match_dup 0)
+        (subreg: (match_dup 4) 0))]
+{
+  operands[1] = simplify_gen_subreg (V2DImode, operands[1], mode, 0);
+  operands[2] = simplify_gen_subreg (V2DImode, operands[2], mode, 0);
+  operands[3] = gen_reg_rtx (V2DImode);
+  operands[4] = gen_reg_rtx (V2DImode);
+  operands[5] = gen_reg_rtx (V2DImode);
+})
+
+(define_insn_and_split "*vec_cmpgtu_nocc_emu"
+  [(set (match_operand:VI_HW_T 0 "register_operand" "=v")
+        (gtu:VI_HW_T (match_operand:VI_HW_T 1 "register_operand" "v")
+                     (match_operand:VI_HW_T 2 "register_operand" "v")))]
+  "TARGET_VX"
+  "#"
+  "&& can_create_pseudo_p ()"
+  [(set (match_dup 3)
+        (gtu:V2DI (match_dup 1) (match_dup 2)))
+   (set (match_dup 4)
+        (eq:V2DI (match_dup 1) (match_dup 2)))
+   (set (match_dup 5)
+        (vec_select:V2DI (match_dup 3) (parallel [(const_int 1) (const_int 0)])))
+   (set (match_dup 4)
+        (and:V2DI (match_dup 4) (match_dup 5)))
+   (set (match_dup 4)
+        (ior:V2DI (match_dup 3) (match_dup 4)))
+   (set (match_dup 4)
+        (vec_duplicate:V2DI
+         (vec_select:DI
+          (match_dup 4)
+          (parallel [(const_int 1)]))))
+   (set (match_dup 0)
+        (subreg: (match_dup 4) 0))]
+{
+  operands[1] = simplify_gen_subreg (V2DImode, operands[1], mode, 0);
+  operands[2] = simplify_gen_subreg (V2DImode, operands[2], mode, 0);
+  operands[3] = gen_reg_rtx (V2DImode);
+  operands[4] = gen_reg_rtx (V2DImode);
+  operands[5] = gen_reg_rtx (V2DImode);
+})
+
 ;;
 ;; Floating point compares
@@ -2311,12 +2400,12 @@
 
 ; op0 = op3 == 0 ? op1 : op2
 (define_insn "*vec_sel0"
-  [(set (match_operand:V 0 "register_operand" "=v")
-        (if_then_else:V
+  [(set (match_operand:VT 0 "register_operand" "=v")
+        (if_then_else:VT
          (eq (match_operand: 3 "register_operand" "v")
              (match_operand: 4 "const0_operand" ""))
-         (match_operand:V 1 "register_operand" "v")
-         (match_operand:V 2 "register_operand" "v")))]
+         (match_operand:VT 1 "register_operand" "v")
+         (match_operand:VT 2 "register_operand" "v")))]
   "TARGET_VX"
   "vsel\t%v0,%2,%1,%3"
   [(set_attr "op_type" "VRR")])
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-cmp-emu-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-cmp-emu-1.c
new file mode 100644
index 00000000000..c92a2b41678
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-cmp-emu-1.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mzarch -march=z13" } */
+/* { dg-require-effective-target int128 } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+typedef __attribute__ ((vector_size (16))) signed __int128 v1ti;
+typedef __attribute__ ((vector_size (16))) unsigned __int128 uv1ti;
+
+/*
+** eq:
+**	vceqg	(%v[0-9]+),%v[0-9]+,%v[0-9]+
+**	vpdi	(%v[0-9]+),\1,\1,4
+**	vn	%v24,(\1,\2|\2,\1)
+**	br	%r14
+*/
+
+v1ti
+eq (v1ti x, v1ti y)
+{
+  return x == y;
+}
+
+/*
+** ueq:
+**	vceqg	(%v[0-9]+),%v[0-9]+,%v[0-9]+
+**	vpdi	(%v[0-9]+),\1,\1,4
+**	vn	%v24,(\1,\2|\2,\1)
+**	br	%r14
+*/
+
+uv1ti
+ueq (uv1ti x, uv1ti y)
+{
+  return x == y;
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-cmp-emu-2.c b/gcc/testsuite/gcc.target/s390/vector/vec-cmp-emu-2.c
new file mode 100644
index 00000000000..b3ee3197dc7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-cmp-emu-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+/* { dg-require-effective-target int128 } */
+/* { dg-final { scan-assembler-times {\tvchg\t} 1 } } */
+/* { dg-final { scan-assembler-times {\tvchlg\t} 1 } } */
+/* { dg-final { scan-assembler-times {\tvceqg\t} 1 } } */
+/* { dg-final { scan-assembler-times {\tvpdi\t} 1 } } */
+/* { dg-final { scan-assembler-times {\tvn\t} 1 } } */
+/* { dg-final { scan-assembler-times {\tvo\t} 1 } } */
+/* { dg-final { scan-assembler-times {\tvrepg\t} 1 } } */
+
+typedef __attribute__ ((vector_size (16))) __int128 v1ti;
+
+v1ti
+gt (v1ti x, v1ti y)
+{
+  return x > y;
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-cmp-emu-3.c b/gcc/testsuite/gcc.target/s390/vector/vec-cmp-emu-3.c
new file mode 100644
index 00000000000..8887814b176
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-cmp-emu-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+/* { dg-require-effective-target int128 } */
+/* { dg-final { scan-assembler-times {\tvchlg\t} 1 } } */
+/* { dg-final { scan-assembler-times {\tvceqg\t} 1 } } */
+/* { dg-final { scan-assembler-times {\tvpdi\t} 1 } } */
+/* { dg-final { scan-assembler-times {\tvn\t} 1 } } */
+/* { dg-final { scan-assembler-times {\tvo\t} 1 } } */
+/* { dg-final { scan-assembler-times {\tvrepg\t} 1 } } */
+
+typedef __attribute__ ((vector_size (16))) unsigned __int128 uv1ti;
+
+uv1ti
+gt (uv1ti x, uv1ti y)
+{
+  return x > y;
+}
-- 
2.45.2