From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 126062 invoked by alias); 16 Jun 2017 21:19:10 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 126038 invoked by uid 89); 16 Jun 2017 21:19:09 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-26.5 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KHOP_DYNAMIC,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mx0a-001b2d01.pphosted.com Received: from mx0b-001b2d01.pphosted.com (HELO mx0a-001b2d01.pphosted.com) (148.163.158.5) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 16 Jun 2017 21:19:07 +0000 Received: from pps.filterd (m0098420.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.20/8.16.0.20) with SMTP id v5GLIhXG043075 for ; Fri, 16 Jun 2017 17:19:10 -0400 Received: from e14.ny.us.ibm.com (e14.ny.us.ibm.com [129.33.205.204]) by mx0b-001b2d01.pphosted.com with ESMTP id 2b4pug0h0j-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Fri, 16 Jun 2017 17:19:09 -0400 Received: from localhost by e14.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 16 Jun 2017 17:19:09 -0400 Received: from b01cxnp22034.gho.pok.ibm.com (9.57.198.24) by e14.ny.us.ibm.com (146.89.104.201) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Fri, 16 Jun 2017 17:19:07 -0400 Received: from b01ledav003.gho.pok.ibm.com (b01ledav003.gho.pok.ibm.com [9.57.199.108]) by b01cxnp22034.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id v5GLJ5LK42139818; Fri, 16 Jun 2017 21:19:06 GMT Received: from b01ledav003.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A8FF1B2058; Fri, 16 Jun 2017 17:16:40 -0400 (EDT) Received: from oc3304648336.ibm.com (unknown [9.70.82.187]) by b01ledav003.gho.pok.ibm.com (Postfix) with ESMTP id 262E5B204D; Fri, 16 Jun 2017 17:16:40 -0400 (EDT) Subject: [PATCH, rs6000] Fix vec_mulo and vec_mule instruction generation From: Carl Love To: gcc-patches@gcc.gnu.org, David Edelsohn , Segher Boessenkool Cc: Bill Schmidt , cel@us.ibm.com Date: Fri, 16 Jun 2017 21:19:00 -0000 Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 x-cbid: 17061621-0052-0000-0000-000002257422 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00007245; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000214; SDB=6.00875779; UDB=6.00436076; IPR=6.00655867; BA=6.00005425; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00015856; XFM=3.00000015; UTC=2017-06-16 21:19:08 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17061621-0053-0000-0000-000050FD9741 Message-Id: <1497647945.3664.36.camel@us.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-06-16_11:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1703280000 definitions=main-1706160360 X-IsSubscribed: yes X-SW-Source: 2017-06/txt/msg01241.txt.bz2 GCC Maintainers: The support for the vec_mulo and vec_mule has yet another bug. For the case of signed/unsigned integer arguments the builtin generates the half word instruction not the word instruction. This patch fixes the issue. The fix has been tested and verified on powerpc64le-unknown-linux-gnu (Power 8 LE) Is the patch OK for gcc mainline? Carl Love --------------------------------------------- >From 3127a3f9c8480fde428c4a13bc37d6eaefd0edfe Mon Sep 17 00:00:00 2001 From: Carl Love Date: Fri, 16 Jun 2017 16:10:56 -0500 Subject: [PATCH] vec_mule, vec_mulo fix 2 gcc/ChangeLog: 2017-06-17 Carl Love * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add ALTIVEC_BUILTIN_VMULESW, ALTIVEC_BUILTIN_VMULEUW, ALTIVEC_BUILTIN_VMULOSW, ALTIVEC_BUILTIN_VMULOUW enties. * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin(), builtin_function_type()): Add needed ALTIVEC_BUILTIN_* case statements. * config/rs6000/altivec.md (define_c_enum "unspec", define_expand "vec_widen_umult_even_v4si", define_expand "vec_widen_smult_even_v4si", define_expand "vec_widen_umult_odd_v4si", define_expand "vec_widen_smult_odd_v4si", define_insn "altivec_vmuleuw", define_insn "altivec_vmulesw", define_insn "altivec_vmulouw", define_insn "altivec_vmulosw"): Add support to generate vmuleuw, vmulesw, vmulouw, vmulosw instructions. * config/rs6000/rs6000-builtin.def (VMLEUW, VMULESW, VMULOUW, VMULOSW): Add definitions. --- gcc/config/rs6000/altivec.md | 91 ++++++++++++++++++++++++++++++++++++ gcc/config/rs6000/rs6000-builtin.def | 8 ++++ gcc/config/rs6000/rs6000-c.c | 12 +++-- gcc/config/rs6000/rs6000.c | 6 +++ 4 files changed, 113 insertions(+), 4 deletions(-) diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index 487b9a4..142300a 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.md @@ -36,10 +36,14 @@ UNSPEC_VMULESB UNSPEC_VMULEUH UNSPEC_VMULESH + UNSPEC_VMULEUW + UNSPEC_VMULESW UNSPEC_VMULOUB UNSPEC_VMULOSB UNSPEC_VMULOUH UNSPEC_VMULOSH + UNSPEC_VMULOUW + UNSPEC_VMULOSW UNSPEC_VPKPX UNSPEC_VPACK_SIGN_SIGN_SAT UNSPEC_VPACK_SIGN_UNS_SAT @@ -1412,6 +1416,32 @@ DONE; }) +(define_expand "vec_widen_umult_even_v4si" + [(use (match_operand:V2DI 0 "register_operand" "")) + (use (match_operand:V4SI 1 "register_operand" "")) + (use (match_operand:V4SI 2 "register_operand" ""))] + "TARGET_ALTIVEC" +{ + if (VECTOR_ELT_ORDER_BIG) + emit_insn (gen_altivec_vmuleuw (operands[0], operands[1], operands[2])); + else + emit_insn (gen_altivec_vmulouw (operands[0], operands[1], operands[2])); + DONE; +}) + +(define_expand "vec_widen_smult_even_v4si" + [(use (match_operand:V2DI 0 "register_operand" "")) + (use (match_operand:V4SI 1 "register_operand" "")) + (use (match_operand:V4SI 2 "register_operand" ""))] + "TARGET_ALTIVEC" +{ + if (VECTOR_ELT_ORDER_BIG) + emit_insn (gen_altivec_vmulesw (operands[0], operands[1], operands[2])); + else + emit_insn (gen_altivec_vmulosw (operands[0], operands[1], operands[2])); + DONE; +}) + (define_expand "vec_widen_umult_odd_v16qi" [(use (match_operand:V8HI 0 "register_operand" "")) (use (match_operand:V16QI 1 "register_operand" "")) @@ -1464,6 +1494,32 @@ DONE; }) +(define_expand "vec_widen_umult_odd_v4si" + [(use (match_operand:V2DI 0 "register_operand" "")) + (use (match_operand:V4SI 1 "register_operand" "")) + (use (match_operand:V4SI 2 "register_operand" ""))] + "TARGET_ALTIVEC" +{ + if (VECTOR_ELT_ORDER_BIG) + emit_insn (gen_altivec_vmulouw (operands[0], operands[1], operands[2])); + else + emit_insn (gen_altivec_vmuleuw (operands[0], operands[1], operands[2])); + DONE; +}) + +(define_expand "vec_widen_smult_odd_v4si" + [(use (match_operand:V2DI 0 "register_operand" "")) + (use (match_operand:V4SI 1 "register_operand" "")) + (use (match_operand:V4SI 2 "register_operand" ""))] + "TARGET_ALTIVEC" +{ + if (VECTOR_ELT_ORDER_BIG) + emit_insn (gen_altivec_vmulosw (operands[0], operands[1], operands[2])); + else + emit_insn (gen_altivec_vmulesw (operands[0], operands[1], operands[2])); + DONE; +}) + (define_insn "altivec_vmuleub" [(set (match_operand:V8HI 0 "register_operand" "=v") (unspec:V8HI [(match_operand:V16QI 1 "register_operand" "v") @@ -1536,6 +1592,41 @@ "vmulosh %0,%1,%2" [(set_attr "type" "veccomplex")]) +(define_insn "altivec_vmuleuw" + [(set (match_operand:V2DI 0 "register_operand" "=v") + (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "v") + (match_operand:V4SI 2 "register_operand" "v")] + UNSPEC_VMULEUW))] + "TARGET_ALTIVEC" + "vmuleuw %0,%1,%2" + [(set_attr "type" "veccomplex")]) + +(define_insn "altivec_vmulouw" + [(set (match_operand:V2DI 0 "register_operand" "=v") + (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "v") + (match_operand:V4SI 2 "register_operand" "v")] + UNSPEC_VMULOUW))] + "TARGET_ALTIVEC" + "vmulouw %0,%1,%2" + [(set_attr "type" "veccomplex")]) + +(define_insn "altivec_vmulesw" + [(set (match_operand:V2DI 0 "register_operand" "=v") + (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "v") + (match_operand:V4SI 2 "register_operand" "v")] + UNSPEC_VMULESW))] + "TARGET_ALTIVEC" + "vmulesw %0,%1,%2" + [(set_attr "type" "veccomplex")]) + +(define_insn "altivec_vmulosw" + [(set (match_operand:V2DI 0 "register_operand" "=v") + (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "v") + (match_operand:V4SI 2 "register_operand" "v")] + UNSPEC_VMULOSW))] + "TARGET_ALTIVEC" + "vmulosw %0,%1,%2" + [(set_attr "type" "veccomplex")]) ;; Vector pack/unpack (define_insn "altivec_vpkpx" diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index 241c439..780d452 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -1031,10 +1031,14 @@ BU_ALTIVEC_2 (VMULEUB, "vmuleub", CONST, vec_widen_umult_even_v16qi) BU_ALTIVEC_2 (VMULESB, "vmulesb", CONST, vec_widen_smult_even_v16qi) BU_ALTIVEC_2 (VMULEUH, "vmuleuh", CONST, vec_widen_umult_even_v8hi) BU_ALTIVEC_2 (VMULESH, "vmulesh", CONST, vec_widen_smult_even_v8hi) +BU_ALTIVEC_2 (VMULEUW, "vmuleuw", CONST, vec_widen_umult_even_v4si) +BU_ALTIVEC_2 (VMULESW, "vmulesw", CONST, vec_widen_smult_even_v4si) BU_ALTIVEC_2 (VMULOUB, "vmuloub", CONST, vec_widen_umult_odd_v16qi) BU_ALTIVEC_2 (VMULOSB, "vmulosb", CONST, vec_widen_smult_odd_v16qi) BU_ALTIVEC_2 (VMULOUH, "vmulouh", CONST, vec_widen_umult_odd_v8hi) BU_ALTIVEC_2 (VMULOSH, "vmulosh", CONST, vec_widen_smult_odd_v8hi) +BU_ALTIVEC_2 (VMULOUW, "vmulouw", CONST, vec_widen_umult_odd_v4si) +BU_ALTIVEC_2 (VMULOSW, "vmulosw", CONST, vec_widen_smult_odd_v4si) BU_ALTIVEC_2 (VNOR, "vnor", CONST, norv4si3) BU_ALTIVEC_2 (VOR, "vor", CONST, iorv4si3) BU_ALTIVEC_2 (VPKUHUM, "vpkuhum", CONST, altivec_vpkuhum) @@ -1346,12 +1350,16 @@ BU_ALTIVEC_OVERLOAD_2 (VMRGLH, "vmrglh") BU_ALTIVEC_OVERLOAD_2 (VMRGLW, "vmrglw") BU_ALTIVEC_OVERLOAD_2 (VMULESB, "vmulesb") BU_ALTIVEC_OVERLOAD_2 (VMULESH, "vmulesh") +BU_ALTIVEC_OVERLOAD_2 (VMULESW, "vmulesw") BU_ALTIVEC_OVERLOAD_2 (VMULEUB, "vmuleub") BU_ALTIVEC_OVERLOAD_2 (VMULEUH, "vmuleuh") +BU_ALTIVEC_OVERLOAD_2 (VMULEUW, "vmuleuw") BU_ALTIVEC_OVERLOAD_2 (VMULOSB, "vmulosb") BU_ALTIVEC_OVERLOAD_2 (VMULOSH, "vmulosh") +BU_ALTIVEC_OVERLOAD_2 (VMULOSW, "vmulosw") BU_ALTIVEC_OVERLOAD_2 (VMULOUB, "vmuloub") BU_ALTIVEC_OVERLOAD_2 (VMULOUH, "vmulouh") +BU_ALTIVEC_OVERLOAD_2 (VMULOUW, "vmulouw") BU_ALTIVEC_OVERLOAD_2 (VPKSHSS, "vpkshss") BU_ALTIVEC_OVERLOAD_2 (VPKSHUS, "vpkshus") BU_ALTIVEC_OVERLOAD_2 (VPKSWSS, "vpkswss") diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c index f1e8d3d..af0bafd 100644 --- a/gcc/config/rs6000/rs6000-c.c +++ b/gcc/config/rs6000/rs6000-c.c @@ -2207,11 +2207,13 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 0 }, { ALTIVEC_BUILTIN_VEC_MULE, ALTIVEC_BUILTIN_VMULESH, RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 }, - { ALTIVEC_BUILTIN_VEC_MULE, ALTIVEC_BUILTIN_VMULESH, + + { ALTIVEC_BUILTIN_VEC_MULE, ALTIVEC_BUILTIN_VMULESW, RS6000_BTI_V2DI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, - { ALTIVEC_BUILTIN_VEC_MULE, ALTIVEC_BUILTIN_VMULEUH, + { ALTIVEC_BUILTIN_VEC_MULE, ALTIVEC_BUILTIN_VMULEUW, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_VMULEUB, ALTIVEC_BUILTIN_VMULEUB, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_VMULESB, ALTIVEC_BUILTIN_VMULESB, @@ -2226,11 +2228,13 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V8HI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOUH, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 0 }, - { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOSH, + + { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOSW, RS6000_BTI_V2DI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, - { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOUH, + { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOUW, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOSH, RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 }, { ALTIVEC_BUILTIN_VEC_VMULOSH, ALTIVEC_BUILTIN_VMULOSH, diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 6b28658..d1a4937 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -16332,9 +16332,11 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) /* Even element flavors of vec_mul (signed). */ case ALTIVEC_BUILTIN_VMULESB: case ALTIVEC_BUILTIN_VMULESH: + case ALTIVEC_BUILTIN_VMULESW: /* Even element flavors of vec_mul (unsigned). */ case ALTIVEC_BUILTIN_VMULEUB: case ALTIVEC_BUILTIN_VMULEUH: + case ALTIVEC_BUILTIN_VMULEUW: { arg0 = gimple_call_arg (stmt, 0); arg1 = gimple_call_arg (stmt, 1); @@ -16347,9 +16349,11 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) /* Odd element flavors of vec_mul (signed). */ case ALTIVEC_BUILTIN_VMULOSB: case ALTIVEC_BUILTIN_VMULOSH: + case ALTIVEC_BUILTIN_VMULOSW: /* Odd element flavors of vec_mul (unsigned). */ case ALTIVEC_BUILTIN_VMULOUB: case ALTIVEC_BUILTIN_VMULOUH: + case ALTIVEC_BUILTIN_VMULOUW: { arg0 = gimple_call_arg (stmt, 0); arg1 = gimple_call_arg (stmt, 1); @@ -17965,8 +17969,10 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, /* unsigned 2 argument functions. */ case ALTIVEC_BUILTIN_VMULEUB: case ALTIVEC_BUILTIN_VMULEUH: + case ALTIVEC_BUILTIN_VMULEUW: case ALTIVEC_BUILTIN_VMULOUB: case ALTIVEC_BUILTIN_VMULOUH: + case ALTIVEC_BUILTIN_VMULOUW: case CRYPTO_BUILTIN_VCIPHER: case CRYPTO_BUILTIN_VCIPHERLAST: case CRYPTO_BUILTIN_VNCIPHER: -- 1.9.1