Subject: [Patch 2/5] rs6000, 128-bit multiply, divide, modulo, shift, compare
From: Carl Love
To: segher@gcc.gnu.org, dje.gcc@gmail.com, gcc-patches@gcc.gnu.org, Will Schmidt
Cc: Bill Schmidt, cel@ibm.com
Date: Tue, 11 Aug 2020 12:22:46 -0700

Segher, Will:

Patch 2 adds support for multiply, divide, modulo, shift, and compare of
128-bit integers.  It adds both the instruction patterns and the builtin
support.
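For readers who want a quick picture of the intended results, here is a minimal scalar sketch of the 128-bit operations named above.  It is only a reference model, not the patch's implementation: it assumes a compiler that provides the `__int128` extension (GCC or Clang on a 64-bit target), and the `ref_*` and `make_u128` names are hypothetical helpers invented for this illustration.  The 7-bit count masking and the all-ones compare result reflect my reading of the patterns in this patch and should be checked against the ISA.

```c
#include <stdint.h>

typedef __int128 i128;            /* assumes the GCC/Clang __int128 extension */
typedef unsigned __int128 u128;

/* Hypothetical helper: build a 128-bit value from two 64-bit halves.  */
static u128 make_u128(uint64_t hi, uint64_t lo)
{
  return ((u128)hi << 64) | lo;
}

/* Signed divide and modulo, truncating toward zero as C does.
   The divisor must be non-zero; the minimum value divided by -1 overflows.  */
static i128 ref_div(i128 a, i128 b) { return a / b; }
static i128 ref_mod(i128 a, i128 b) { return a % b; }

/* Shifts and rotate: assumed to use only the low 7 bits of the count.  */
static u128 ref_sl(u128 a, unsigned n) { return a << (n & 127); }
static u128 ref_sr(u128 a, unsigned n) { return a >> (n & 127); }
static u128 ref_rl(u128 a, unsigned n)
{
  n &= 127;
  /* Guard n == 0, since shifting a 128-bit value by 128 is undefined in C.  */
  return n ? (a << n) | (a >> (128 - n)) : a;
}

/* Signed greater-than compare: assumed to yield an all-ones mask when true
   and zero when false.  */
static u128 ref_cmpgt(i128 a, i128 b) { return a > b ? ~(u128)0 : 0; }
```

On a Power10 target the corresponding operations would take single-element `vector __int128` arguments; the scalar model above only pins down the arithmetic.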
                   Carl Love

-------------------------------------------------------
rs6000, 128-bit multiply, divide, shift, compare

gcc/ChangeLog

2020-08-10  Carl Love

	* config/rs6000/altivec.h (vec_signextq, vec_dive, vec_mod): Add
	defines for new builtins.
	* config/rs6000/altivec.md (UNSPEC_VMULEUD, UNSPEC_VMULESD,
	UNSPEC_VMULOUD, UNSPEC_VMULOSD): New unspecs.
	(altivec_eqv1ti, altivec_gtv1ti, altivec_gtuv1ti, altivec_vmuleud,
	altivec_vmuloud, altivec_vmulesd, altivec_vmulosd, altivec_vrlq,
	altivec_vrlqmi, altivec_vrlqmi_inst, altivec_vrlqnm,
	altivec_vrlqnm_inst, altivec_vslq, altivec_vsrq, altivec_vsraq,
	altivec_vcmpequt_p, altivec_vcmpgtst_p, altivec_vcmpgtut_p): New
	define_insns.
	(vec_widen_umult_even_v2di, vec_widen_smult_even_v2di,
	vec_widen_umult_odd_v2di, vec_widen_smult_odd_v2di, altivec_vrlqmi,
	altivec_vrlqnm): New define_expands.
	* config/rs6000/rs6000-builtin.def (BU_P10_P, BU_P10_128BIT_1,
	BU_P10_128BIT_2, BU_P10_128BIT_3): New macro definitions.
	(VCMPEQUT_P, VCMPGTST_P, VCMPGTUT_P): Add macro expansions.
	(VCMPGTUT, VCMPGTST, VCMPEQUT, CMPNET, CMPGE_1TI, CMPGE_U1TI,
	CMPLE_1TI, CMPLE_U1TI, VNOR_V1TI_UNS, VNOR_V1TI, VCMPNET_P,
	VCMPAET_P): New macro expansions.
	(VSIGNEXTSD2Q, VMULEUD, VMULESD, VMULOUD, VMULOSD, VRLQ, VSLQ,
	VSRQ, VSRAQ, VRLQNM, DIV_V1TI, UDIV_V1TI, DIVES_V1TI, DIVEU_V1TI,
	MODS_V1TI, MODU_V1TI, VRLQMI): New macro expansions.
	(VRLQ, VSLQ, VSRQ, VSRAQ, SIGNEXT): New overload expansions.
	* config/rs6000/rs6000-call.c (P10_BUILTIN_VCMPEQUT,
	P10_BUILTIN_CMPGE_1TI, P10_BUILTIN_CMPGE_U1TI,
	P10_BUILTIN_VCMPGTUT, P10_BUILTIN_VCMPGTST,
	P10_BUILTIN_CMPLE_1TI, P10_BUILTIN_128BIT_DIV_V1TI,
	P10_BUILTIN_128BIT_UDIV_V1TI, P10_BUILTIN_128BIT_VMULESD,
	P10_BUILTIN_128BIT_VMULEUD, P10_BUILTIN_128BIT_VMULOSD,
	P10_BUILTIN_128BIT_VMULOUD, P10_BUILTIN_VNOR_V1TI,
	P10_BUILTIN_VNOR_V1TI_UNS, P10_BUILTIN_128BIT_VRLQ,
	P10_BUILTIN_128BIT_VRLQMI, P10_BUILTIN_128BIT_VRLQNM,
	P10_BUILTIN_128BIT_VSLQ, P10_BUILTIN_128BIT_VSRQ,
	P10_BUILTIN_128BIT_VSRAQ, P10_BUILTIN_VCMPGTUT_P,
	P10_BUILTIN_VCMPGTST_P, P10_BUILTIN_VCMPEQUT_P,
	P10_BUILTIN_CMPNET, P10_BUILTIN_VCMPNET_P,
	P10_BUILTIN_VCMPAET_P, P10_BUILTIN_128BIT_VSIGNEXTSD2Q,
	P10_BUILTIN_128BIT_DIVES_V1TI, P10_BUILTIN_128BIT_MODS_V1TI,
	P10_BUILTIN_128BIT_MODU_V1TI): New overloaded definitions.
	(int_ftype_int_v1ti_v1ti) [P10_BUILTIN_VCMPEQUT,
	P10_BUILTIN_CMPNET, P10_BUILTIN_CMPGE_1TI,
	P10_BUILTIN_CMPGE_U1TI, P10_BUILTIN_VCMPGTUT,
	P10_BUILTIN_VCMPGTST, P10_BUILTIN_CMPLE_1TI,
	P10_BUILTIN_CMPLE_U1TI, E_V1TImode]: New case statements.
	(int_ftype_int_v1ti_v1ti) [bool_V1TI_type_node,
	int_ftype_int_v1ti_v1ti]: New assignments.
	(int_ftype_int_v1ti_v1ti) [P10_BUILTIN_128BIT_VMULEUD,
	P10_BUILTIN_128BIT_VMULOUD, P10_BUILTIN_128BIT_DIVEU_V1TI,
	P10_BUILTIN_128BIT_MODU_V1TI, P10_BUILTIN_CMPGE_U1TI,
	P10_BUILTIN_VCMPGTUT, P10_BUILTIN_VCMPEQUT]: New case statements.
	* config/rs6000/rs6000.c (rs6000_builtin_mask_calculate): New
	TARGET_TI_VECTOR_OPS definition.
	(rs6000_option_override_internal): Add if TARGET_POWER10 statement.
	(rs6000_handle_altivec_attribute) [E_TImode, E_V1TImode]: New case
	statements.
	(rs6000_opt_masks): Add ti-vector-ops entry.
	* config/rs6000/rs6000.h (MASK_TI_VECTOR_OPS, RS6000_BTM_P10_128BIT,
	RS6000_BTM_TI_VECTOR_OPS, bool_V1TI_type_node): New defines.
	(rs6000_builtin_type_index): New enum value RS6000_BTI_bool_V1TI.
	* config/rs6000/rs6000.opt: New mti-vector-ops entry.
	* config/rs6000/vector.md (vector_eqv1ti, vector_gtv1ti,
	vector_nltv1ti, vector_gtuv1ti, vector_nltuv1ti, vector_ngtv1ti,
	vector_ngtuv1ti, vector_eq_v1ti_p, vector_ne_v1ti_p,
	vector_ae_v1ti_p, vector_gt_v1ti_p, vector_gtu_v1ti_p, vrotlv1ti3,
	vashlv1ti3, vlshrv1ti3, vashrv1ti3): New define_expands.
	* config/rs6000/vsx.md (UNSPEC_VSX_DIVSQ, UNSPEC_VSX_DIVUQ,
	UNSPEC_VSX_DIVESQ, UNSPEC_VSX_DIVEUQ, UNSPEC_VSX_MODSQ,
	UNSPEC_VSX_MODUQ, UNSPEC_XXSWAPD_V1TI): New unspecs.
	(vsx_div_v1ti, vsx_udiv_v1ti, vsx_dives_v1ti, vsx_diveu_v1ti,
	vsx_mods_v1ti, vsx_modu_v1ti, xxswapd_v1ti,
	vsx_sign_extend_v2di_v1ti): New define_insns.
	(vcmpnet): New define_expand.
	* doc/extend.texi: Add documentation for the new builtins vec_rl,
	vec_rlmi, vec_rlnm, vec_sl, vec_sr, vec_sra, vec_mule, vec_mulo,
	vec_div, vec_dive, vec_mod, vec_cmpeq, vec_cmpne, vec_cmpgt,
	vec_cmplt, vec_cmpge, vec_cmple, vec_all_eq, vec_all_ne,
	vec_all_gt, vec_all_lt, vec_all_ge, vec_all_le, vec_any_eq,
	vec_any_ne, vec_any_gt, vec_any_lt, vec_any_ge, vec_any_le.

gcc/testsuite/ChangeLog

2020-08-10  Carl Love

	* gcc.target/powerpc/int_128bit-runnable.c: New test file.
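As an aid to reviewing the vec_mule/vec_mulo entries above, the following is a rough scalar model of the widening doubleword multiplies.  It is a sketch under stated assumptions, not code from the patch: `v2du`, `make_v2du`, `ref_mule`, `ref_mulo`, and `widen_umult_even_insn` are hypothetical names made up for this illustration, elements are indexed in the target's natural element order, and `__int128` is assumed to be available (GCC or Clang on a 64-bit target).  The instruction choice mirrors the BYTES_BIG_ENDIAN test in the vec_widen_umult_even_v2di expander in this patch.

```c
#include <stdint.h>

typedef unsigned __int128 u128;   /* assumes the GCC/Clang __int128 extension */

/* Hypothetical stand-in for a vector of two unsigned 64-bit elements,
   indexed in the target's natural element order.  */
typedef struct { uint64_t elem[2]; } v2du;

static v2du make_v2du(uint64_t e0, uint64_t e1)
{
  v2du v = { { e0, e1 } };
  return v;
}

/* Model of vec_mule: full 128-bit product of the even elements (index 0).  */
static u128 ref_mule(v2du a, v2du b)
{
  return (u128)a.elem[0] * b.elem[0];
}

/* Model of vec_mulo: full 128-bit product of the odd elements (index 1).  */
static u128 ref_mulo(v2du a, v2du b)
{
  return (u128)a.elem[1] * b.elem[1];
}

/* Which hardware instruction the unsigned "even" expander selects:
   vmuleud on big-endian targets, vmuloud on little-endian targets.  */
enum insn { VMULEUD, VMULOUD };

static enum insn widen_umult_even_insn(int bytes_big_endian)
{
  return bytes_big_endian ? VMULEUD : VMULOUD;
}
```

The swap on little-endian targets is what lets the builtin keep the same element numbering in both byte orders, while the hardware's even/odd numbering stays big-endian.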
---
 gcc/config/rs6000/altivec.h                   |    6 +-
 gcc/config/rs6000/altivec.md                  |  242 +-
 gcc/config/rs6000/rs6000-builtin.def          |   77 +
 gcc/config/rs6000/rs6000-call.c               |  150 +-
 gcc/config/rs6000/rs6000.c                    |   17 +-
 gcc/config/rs6000/rs6000.h                    |    6 +-
 gcc/config/rs6000/rs6000.opt                  |    4 +
 gcc/config/rs6000/vector.md                   |  199 ++
 gcc/config/rs6000/vsx.md                      |   99 +-
 gcc/doc/extend.texi                           |  174 ++
 .../gcc.target/powerpc/int_128bit-runnable.c  | 2254 +++++++++++++++++
 11 files changed, 3217 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 09320df14ca..a121004b3af 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -183,7 +183,7 @@
 #define vec_recipdiv __builtin_vec_recipdiv
 #define vec_rlmi __builtin_vec_rlmi
 #define vec_vrlnm __builtin_vec_rlnm
-#define vec_rlnm(a,b,c) (__builtin_vec_rlnm((a),((c)<<8)|(b)))
+#define vec_rlnm(a,b,c) (__builtin_vec_rlnm((a),((b)<<8)|(c)))
 #define vec_rsqrt __builtin_vec_rsqrt
 #define vec_rsqrte __builtin_vec_rsqrte
 #define vec_signed __builtin_vec_vsigned
@@ -694,6 +694,10 @@ __altivec_scalar_pred(vec_any_nle,
 #define vec_step(x) __builtin_vec_step (* (__typeof__ (x) *) 0)
 
 #ifdef _ARCH_PWR10
+#define vec_signextq __builtin_vec_vsignextq
+#define vec_dive __builtin_vec_dive
+#define vec_mod __builtin_vec_mod
+
 /* May modify these macro definitions if future capabilities overload
    with support for different vector argument and result types. */
 #define vec_cntlzm(a, b) __builtin_altivec_vclzdm (a, b)
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 0a2e634d6b0..2763d920828 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -39,12 +39,16 @@
    UNSPEC_VMULESH
    UNSPEC_VMULEUW
    UNSPEC_VMULESW
+   UNSPEC_VMULEUD
+   UNSPEC_VMULESD
    UNSPEC_VMULOUB
    UNSPEC_VMULOSB
    UNSPEC_VMULOUH
    UNSPEC_VMULOSH
    UNSPEC_VMULOUW
    UNSPEC_VMULOSW
+   UNSPEC_VMULOUD
+   UNSPEC_VMULOSD
    UNSPEC_VPKPX
    UNSPEC_VPACK_SIGN_SIGN_SAT
    UNSPEC_VPACK_SIGN_UNS_SAT
@@ -628,6 +632,14 @@
   "vcmpequ %0,%1,%2"
   [(set_attr "type" "veccmpfx")])
 
+(define_insn "altivec_eqv1ti"
+  [(set (match_operand:V1TI 0 "altivec_register_operand" "=v")
+        (eq:V1TI (match_operand:V1TI 1 "altivec_register_operand" "v")
+                 (match_operand:V1TI 2 "altivec_register_operand" "v")))]
+  "TARGET_TI_VECTOR_OPS"
+  "vcmpequq %0,%1,%2"
+  [(set_attr "type" "veccmpfx")])
+
 (define_insn "*altivec_gt"
   [(set (match_operand:VI2 0 "altivec_register_operand" "=v")
         (gt:VI2 (match_operand:VI2 1 "altivec_register_operand" "v")
@@ -636,6 +648,14 @@
   "vcmpgts %0,%1,%2"
   [(set_attr "type" "veccmpfx")])
 
+(define_insn "*altivec_gtv1ti"
+  [(set (match_operand:V1TI 0 "altivec_register_operand" "=v")
+        (gt:V1TI (match_operand:V1TI 1 "altivec_register_operand" "v")
+                 (match_operand:V1TI 2 "altivec_register_operand" "v")))]
+  "TARGET_TI_VECTOR_OPS"
+  "vcmpgtsq %0,%1,%2"
+  [(set_attr "type" "veccmpfx")])
+
 (define_insn "*altivec_gtu"
   [(set (match_operand:VI2 0 "altivec_register_operand" "=v")
         (gtu:VI2 (match_operand:VI2 1 "altivec_register_operand" "v")
@@ -644,6 +664,14 @@
   "vcmpgtu %0,%1,%2"
   [(set_attr "type" "veccmpfx")])
 
+(define_insn "*altivec_gtuv1ti"
+  [(set (match_operand:V1TI 0 "altivec_register_operand" "=v")
+        (gtu:V1TI (match_operand:V1TI 1 "altivec_register_operand" "v")
+                  (match_operand:V1TI 2 "altivec_register_operand" "v")))]
+  "TARGET_TI_VECTOR_OPS"
+  "vcmpgtuq %0,%1,%2"
+  [(set_attr "type" "veccmpfx")])
+
 (define_insn "*altivec_eqv4sf"
   [(set (match_operand:V4SF 0 "altivec_register_operand" "=v")
         (eq:V4SF (match_operand:V4SF 1 "altivec_register_operand" "v")
@@ -1687,6 +1715,19 @@
   DONE;
 })
 
+(define_expand "vec_widen_umult_even_v2di"
+  [(use (match_operand:V1TI 0 "register_operand"))
+   (use (match_operand:V2DI 1 "register_operand"))
+   (use (match_operand:V2DI 2 "register_operand"))]
+  "TARGET_TI_VECTOR_OPS"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_altivec_vmuleud (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (gen_altivec_vmuloud (operands[0], operands[1], operands[2]));
+  DONE;
+})
+
 (define_expand "vec_widen_smult_even_v4si"
   [(use (match_operand:V2DI 0 "register_operand"))
    (use (match_operand:V4SI 1 "register_operand"))
@@ -1695,11 +1736,24 @@
 {
   if (BYTES_BIG_ENDIAN)
     emit_insn (gen_altivec_vmulesw (operands[0], operands[1], operands[2]));
-  else
+  else
     emit_insn (gen_altivec_vmulosw (operands[0], operands[1], operands[2]));
   DONE;
 })
 
+(define_expand "vec_widen_smult_even_v2di"
+  [(use (match_operand:V1TI 0 "register_operand"))
+   (use (match_operand:V2DI 1 "register_operand"))
+   (use (match_operand:V2DI 2 "register_operand"))]
+  "TARGET_TI_VECTOR_OPS"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_altivec_vmulesd (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (gen_altivec_vmulosd (operands[0], operands[1], operands[2]));
+  DONE;
+})
+
 (define_expand "vec_widen_umult_odd_v16qi"
   [(use (match_operand:V8HI 0 "register_operand"))
    (use (match_operand:V16QI 1 "register_operand"))
@@ -1765,6 +1819,19 @@
   DONE;
 })
 
+(define_expand "vec_widen_umult_odd_v2di"
+  [(use (match_operand:V1TI 0 "register_operand"))
+   (use (match_operand:V2DI 1 "register_operand"))
+   (use (match_operand:V2DI 2 "register_operand"))]
+  "TARGET_TI_VECTOR_OPS"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_altivec_vmuloud (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (gen_altivec_vmuleud (operands[0], operands[1], operands[2]));
+  DONE;
+})
+
 (define_expand "vec_widen_smult_odd_v4si"
   [(use (match_operand:V2DI 0 "register_operand"))
    (use (match_operand:V4SI 1 "register_operand"))
@@ -1778,6 +1845,19 @@
   DONE;
 })
 
+(define_expand "vec_widen_smult_odd_v2di"
+  [(use (match_operand:V1TI 0 "register_operand"))
+   (use (match_operand:V2DI 1 "register_operand"))
+   (use (match_operand:V2DI 2 "register_operand"))]
+  "TARGET_TI_VECTOR_OPS"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_altivec_vmulosd (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (gen_altivec_vmulesd (operands[0], operands[1], operands[2]));
+  DONE;
+})
+
 (define_insn "altivec_vmuleub"
   [(set (match_operand:V8HI 0 "register_operand" "=v")
         (unspec:V8HI [(match_operand:V16QI 1 "register_operand" "v")
@@ -1859,6 +1939,15 @@
   "vmuleuw %0,%1,%2"
   [(set_attr "type" "veccomplex")])
 
+(define_insn "altivec_vmuleud"
+  [(set (match_operand:V1TI 0 "register_operand" "=v")
+        (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
+                      (match_operand:V2DI 2 "register_operand" "v")]
+                     UNSPEC_VMULEUD))]
+  "TARGET_TI_VECTOR_OPS"
+  "vmuleud %0,%1,%2"
+  [(set_attr "type" "veccomplex")])
+
 (define_insn "altivec_vmulouw"
   [(set (match_operand:V2DI 0 "register_operand" "=v")
         (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "v")
@@ -1868,6 +1957,15 @@
   "vmulouw %0,%1,%2"
   [(set_attr "type" "veccomplex")])
 
+(define_insn "altivec_vmuloud"
+  [(set (match_operand:V1TI 0 "register_operand" "=v")
+        (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
+                      (match_operand:V2DI 2 "register_operand" "v")]
+                     UNSPEC_VMULOUD))]
+  "TARGET_TI_VECTOR_OPS"
+  "vmuloud %0,%1,%2"
+  [(set_attr "type" "veccomplex")])
+
 (define_insn "altivec_vmulesw"
   [(set (match_operand:V2DI 0 "register_operand" "=v")
         (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "v")
@@ -1877,6 +1975,15 @@
   "vmulesw %0,%1,%2"
   [(set_attr "type" "veccomplex")])
 
+(define_insn "altivec_vmulesd"
+  [(set (match_operand:V1TI 0 "register_operand" "=v")
+        (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
+                      (match_operand:V2DI 2 "register_operand" "v")]
+                     UNSPEC_VMULESD))]
+  "TARGET_TI_VECTOR_OPS"
+  "vmulesd %0,%1,%2"
+  [(set_attr "type" "veccomplex")])
+
 (define_insn "altivec_vmulosw"
   [(set (match_operand:V2DI 0 "register_operand" "=v")
         (unspec:V2DI [(match_operand:V4SI 1 "register_operand" "v")
@@ -1886,6 +1993,15 @@
   "vmulosw %0,%1,%2"
   [(set_attr "type" "veccomplex")])
 
+(define_insn "altivec_vmulosd"
+  [(set (match_operand:V1TI 0 "register_operand" "=v")
+        (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
+                      (match_operand:V2DI 2 "register_operand" "v")]
+                     UNSPEC_VMULOSD))]
+  "TARGET_TI_VECTOR_OPS"
+  "vmulosd %0,%1,%2"
+  [(set_attr "type" "veccomplex")])
+
 ;; Vector pack/unpack
 (define_insn "altivec_vpkpx"
   [(set (match_operand:V8HI 0 "register_operand" "=v")
@@ -1979,6 +2095,15 @@
   "vrl %0,%1,%2"
   [(set_attr "type" "vecsimple")])
 
+(define_insn "altivec_vrlq"
+  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
+        (rotate:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
+                     (match_operand:V1TI 2 "vsx_register_operand" "v")))]
+  "TARGET_TI_VECTOR_OPS"
+;; rotate amount in needs to be in bits[57:63] of operand2.
+  "vrlq %0,%1,%2"
+  [(set_attr "type" "vecsimple")])
+
 (define_insn "altivec_vrlmi"
   [(set (match_operand:VIlong 0 "register_operand" "=v")
         (unspec:VIlong [(match_operand:VIlong 1 "register_operand" "0")
@@ -1989,6 +2114,33 @@
   "vrlmi %0,%2,%3"
   [(set_attr "type" "veclogical")])
 
+(define_expand "altivec_vrlqmi"
+  [(set (match_operand:V1TI 0 "vsx_register_operand")
+        (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand")
+                      (match_operand:V1TI 2 "vsx_register_operand")
+                      (match_operand:V1TI 3 "vsx_register_operand")]
+                     UNSPEC_VRLMI))]
+  "TARGET_TI_VECTOR_OPS"
+{
+  /* Mask bit begin, end fields need to be in bits [41:55] of 128-bit operand2. */
+  /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */
+  rtx tmp = gen_reg_rtx (V1TImode);
+
+  emit_insn(gen_xxswapd_v1ti (tmp, operands[3]));
+  emit_insn(gen_altivec_vrlqmi_inst (operands[0], operands[1], operands[2], tmp));
+  DONE;
+})
+
+(define_insn "altivec_vrlqmi_inst"
+  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
+        (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v")
+                      (match_operand:V1TI 2 "vsx_register_operand" "0")
+                      (match_operand:V1TI 3 "vsx_register_operand" "v")]
+                     UNSPEC_VRLMI))]
+  "TARGET_TI_VECTOR_OPS"
+  "vrlqmi %0,%1,%3"
+  [(set_attr "type" "veclogical")])
+
 (define_insn "altivec_vrlnm"
   [(set (match_operand:VIlong 0 "register_operand" "=v")
         (unspec:VIlong [(match_operand:VIlong 1 "register_operand" "v")
@@ -1998,6 +2150,31 @@
   "vrlnm %0,%1,%2"
   [(set_attr "type" "veclogical")])
 
+(define_expand "altivec_vrlqnm"
+  [(set (match_operand:V1TI 0 "vsx_register_operand")
+        (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand")
+                      (match_operand:V1TI 2 "vsx_register_operand")]
+                     UNSPEC_VRLNM))]
+  "TARGET_TI_VECTOR_OPS"
+{
+  /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */
+  rtx tmp = gen_reg_rtx (V1TImode);
+
+  emit_insn(gen_xxswapd_v1ti (tmp, operands[2]));
+  emit_insn(gen_altivec_vrlqnm_inst (operands[0], operands[1], tmp));
+  DONE;
+})
+
+(define_insn "altivec_vrlqnm_inst"
+  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
+        (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v")
+                      (match_operand:V1TI 2 "vsx_register_operand" "v")]
+                     UNSPEC_VRLNM))]
+  "TARGET_TI_VECTOR_OPS"
+  ;; rotate and mask bits need to be in upper 64-bits of operand2.
+  "vrlqnm %0,%1,%2"
+  [(set_attr "type" "veclogical")])
+
 (define_insn "altivec_vsl"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
         (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
@@ -2042,6 +2219,15 @@
   "vsl %0,%1,%2"
   [(set_attr "type" "vecsimple")])
 
+(define_insn "altivec_vslq"
+  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
+        (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
+                     (match_operand:V1TI 2 "vsx_register_operand" "v")))]
+  "TARGET_TI_VECTOR_OPS"
+  /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */
+  "vslq %0,%1,%2"
+  [(set_attr "type" "vecsimple")])
+
 (define_insn "*altivec_vsr"
   [(set (match_operand:VI2 0 "register_operand" "=v")
         (lshiftrt:VI2 (match_operand:VI2 1 "register_operand" "v")
@@ -2050,6 +2236,15 @@
   "vsr %0,%1,%2"
   [(set_attr "type" "vecsimple")])
 
+(define_insn "altivec_vsrq"
+  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
+        (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
+                       (match_operand:V1TI 2 "vsx_register_operand" "v")))]
+  "TARGET_TI_VECTOR_OPS"
+  /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */
+  "vsrq %0,%1,%2"
+  [(set_attr "type" "vecsimple")])
+
 (define_insn "*altivec_vsra"
   [(set (match_operand:VI2 0 "register_operand" "=v")
         (ashiftrt:VI2 (match_operand:VI2 1 "register_operand" "v")
@@ -2058,6 +2253,15 @@
   "vsra %0,%1,%2"
   [(set_attr "type" "vecsimple")])
 
+(define_insn "altivec_vsraq"
+  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
+        (ashiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
+                       (match_operand:V1TI 2 "vsx_register_operand" "v")))]
+  "TARGET_TI_VECTOR_OPS"
+  /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */
+  "vsraq %0,%1,%2"
+  [(set_attr "type" "vecsimple")])
+
 (define_insn "altivec_vsr"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
         (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
@@ -2618,6 +2822,18 @@
   "vcmpequ. %0,%1,%2"
   [(set_attr "type" "veccmpfx")])
 
+(define_insn "altivec_vcmpequt_p"
+  [(set (reg:CC CR6_REGNO)
+        (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand" "v")
+                           (match_operand:V1TI 2 "altivec_register_operand" "v"))]
+                   UNSPEC_PREDICATE))
+   (set (match_operand:V1TI 0 "altivec_register_operand" "=v")
+        (eq:V1TI (match_dup 1)
+                 (match_dup 2)))]
+  "TARGET_TI_VECTOR_OPS"
+  "vcmpequq. %0,%1,%2"
+  [(set_attr "type" "veccmpfx")])
+
 (define_insn "*altivec_vcmpgts_p"
   [(set (reg:CC CR6_REGNO)
         (unspec:CC [(gt:CC (match_operand:VI2 1 "register_operand" "v")
@@ -2630,6 +2846,18 @@
   "vcmpgts. %0,%1,%2"
   [(set_attr "type" "veccmpfx")])
 
+(define_insn "*altivec_vcmpgtst_p"
+  [(set (reg:CC CR6_REGNO)
+        (unspec:CC [(gt:CC (match_operand:V1TI 1 "register_operand" "v")
+                           (match_operand:V1TI 2 "register_operand" "v"))]
+                   UNSPEC_PREDICATE))
+   (set (match_operand:V1TI 0 "register_operand" "=v")
+        (gt:V1TI (match_dup 1)
+                 (match_dup 2)))]
+  "TARGET_TI_VECTOR_OPS"
+  "vcmpgtsq. %0,%1,%2"
+  [(set_attr "type" "veccmpfx")])
+
 (define_insn "*altivec_vcmpgtu_p"
   [(set (reg:CC CR6_REGNO)
         (unspec:CC [(gtu:CC (match_operand:VI2 1 "register_operand" "v")
@@ -2642,6 +2870,18 @@
   "vcmpgtu. %0,%1,%2"
   [(set_attr "type" "veccmpfx")])
 
+(define_insn "*altivec_vcmpgtut_p"
+  [(set (reg:CC CR6_REGNO)
+        (unspec:CC [(gtu:CC (match_operand:V1TI 1 "register_operand" "v")
+                            (match_operand:V1TI 2 "register_operand" "v"))]
+                   UNSPEC_PREDICATE))
+   (set (match_operand:V1TI 0 "register_operand" "=v")
+        (gtu:V1TI (match_dup 1)
+                  (match_dup 2)))]
+  "TARGET_TI_VECTOR_OPS"
+  "vcmpgtuq. %0,%1,%2"
+  [(set_attr "type" "veccmpfx")])
+
 (define_insn "*altivec_vcmpeqfp_p"
   [(set (reg:CC CR6_REGNO)
         (unspec:CC [(eq:CC (match_operand:V4SF 1 "register_operand" "v")
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 667c2450d41..871da6c4cf7 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1070,6 +1070,15 @@
                      | RS6000_BTC_UNARY), \
                     CODE_FOR_ ## ICODE) /* ICODE */
 
+
+#define BU_P10_P(ENUM, NAME, ATTR, ICODE) \
+  RS6000_BUILTIN_P (P10_BUILTIN_ ## ENUM, /* ENUM */ \
+                    "__builtin_altivec_" NAME, /* NAME */ \
+                    RS6000_BTM_P10_128BIT, /* MASK */ \
+                    (RS6000_BTC_ ## ATTR /* ATTR */ \
+                     | RS6000_BTC_PREDICATE), \
+                    CODE_FOR_ ## ICODE) /* ICODE */
+
 #define BU_P10_OVERLOAD_1(ENUM, NAME) \
   RS6000_BUILTIN_1 (P10_BUILTIN_VEC_ ## ENUM, /* ENUM */ \
                     "__builtin_vec_" NAME, /* NAME */ \
@@ -1152,6 +1161,30 @@
                     (RS6000_BTC_ ## ATTR /* ATTR */ \
                      | RS6000_BTC_BINARY), \
                     CODE_FOR_ ## ICODE) /* ICODE */
+
+#define BU_P10_128BIT_1(ENUM, NAME, ATTR, ICODE) \
+  RS6000_BUILTIN_1 (P10_BUILTIN_128BIT_ ## ENUM, /* ENUM */ \
+                    "__builtin_altivec_" NAME, /* NAME */ \
+                    RS6000_BTM_P10_128BIT, /* MASK */ \
+                    (RS6000_BTC_ ## ATTR /* ATTR */ \
+                     | RS6000_BTC_UNARY), \
+                    CODE_FOR_ ## ICODE) /* ICODE */
+
+#define BU_P10_128BIT_2(ENUM, NAME, ATTR, ICODE) \
+  RS6000_BUILTIN_2 (P10_BUILTIN_128BIT_ ## ENUM, /* ENUM */ \
+                    "__builtin_altivec_" NAME, /* NAME */ \
+                    RS6000_BTM_P10_128BIT, /* MASK */ \
+                    (RS6000_BTC_ ## ATTR /* ATTR */ \
+                     | RS6000_BTC_BINARY), \
+                    CODE_FOR_ ## ICODE) /* ICODE */
+
+#define BU_P10_128BIT_3(ENUM, NAME, ATTR, ICODE) \
+  RS6000_BUILTIN_3 (P10_BUILTIN_128BIT_ ## ENUM, /* ENUM */ \
+                    "__builtin_altivec_" NAME, /* NAME */ \
+                    RS6000_BTM_P10_128BIT, /* MASK */ \
+                    (RS6000_BTC_ ## ATTR /* ATTR */ \
+                     | RS6000_BTC_TERNARY), \
+                    CODE_FOR_ ## ICODE) /* ICODE */
 #endif
 
 
@@ -2712,6 +2745,10 @@ BU_P9V_AV_1 (VSIGNEXTSH2D, "vsignextsh2d", CONST, vsx_sign_extend_hi_v2di)
 BU_P9V_AV_1 (VSIGNEXTSW2D, "vsignextsw2d", CONST, vsx_sign_extend_si_v2di)
 
 /* Builtins for scalar instructions added in ISA 3.1 (power10). */
+BU_P10_P (VCMPEQUT_P, "vcmpequt_p", CONST, vector_eq_v1ti_p)
+BU_P10_P (VCMPGTST_P, "vcmpgtst_p", CONST, vector_gt_v1ti_p)
+BU_P10_P (VCMPGTUT_P, "vcmpgtut_p", CONST, vector_gtu_v1ti_p)
+
 BU_P10_MISC_2 (CFUGED, "cfuged", CONST, cfuged)
 BU_P10_MISC_2 (CNTLZDM, "cntlzdm", CONST, cntlzdm)
 BU_P10_MISC_2 (CNTTZDM, "cnttzdm", CONST, cnttzdm)
@@ -2733,6 +2770,39 @@ BU_P10V_2 (XXGENPCVM_V8HI, "xxgenpcvm_v8hi", CONST, xxgenpcvm_v8hi)
 BU_P10V_2 (XXGENPCVM_V4SI, "xxgenpcvm_v4si", CONST, xxgenpcvm_v4si)
 BU_P10V_2 (XXGENPCVM_V2DI, "xxgenpcvm_v2di", CONST, xxgenpcvm_v2di)
 
+BU_P10V_2 (VCMPGTUT, "vcmpgtut", CONST, vector_gtuv1ti)
+BU_P10V_2 (VCMPGTST, "vcmpgtst", CONST, vector_gtv1ti)
+BU_P10V_2 (VCMPEQUT, "vcmpequt", CONST, vector_eqv1ti)
+BU_P10V_2 (CMPNET, "vcmpnet", CONST, vcmpnet)
+BU_P10V_2 (CMPGE_1TI, "cmpge_1ti", CONST, vector_nltv1ti)
+BU_P10V_2 (CMPGE_U1TI, "cmpge_u1ti", CONST, vector_nltuv1ti)
+BU_P10V_2 (CMPLE_1TI, "cmple_1ti", CONST, vector_ngtv1ti)
+BU_P10V_2 (CMPLE_U1TI, "cmple_u1ti", CONST, vector_ngtuv1ti)
+BU_P10V_2 (VNOR_V1TI_UNS, "vnor_v1ti_uns",CONST, norv1ti3)
+BU_P10V_2 (VNOR_V1TI, "vnor_v1ti", CONST, norv1ti3)
+BU_P10V_2 (VCMPNET_P, "vcmpnet_p", CONST, vector_ne_v1ti_p)
+BU_P10V_2 (VCMPAET_P, "vcmpaet_p", CONST, vector_ae_v1ti_p)
+
+BU_P10_128BIT_1 (VSIGNEXTSD2Q, "vsignext", CONST, vsx_sign_extend_v2di_v1ti)
+
+BU_P10_128BIT_2 (VMULEUD, "vmuleud", CONST, vec_widen_umult_even_v2di)
+BU_P10_128BIT_2 (VMULESD, "vmulesd", CONST, vec_widen_smult_even_v2di)
+BU_P10_128BIT_2 (VMULOUD, "vmuloud", CONST, vec_widen_umult_odd_v2di)
+BU_P10_128BIT_2 (VMULOSD, "vmulosd", CONST, vec_widen_smult_odd_v2di)
+BU_P10_128BIT_2 (VRLQ, "vrlq", CONST, vrotlv1ti3)
+BU_P10_128BIT_2 (VSLQ, "vslq", CONST, vashlv1ti3)
+BU_P10_128BIT_2 (VSRQ, "vsrq", CONST, vlshrv1ti3)
+BU_P10_128BIT_2 (VSRAQ, "vsraq", CONST, vashrv1ti3)
+BU_P10_128BIT_2 (VRLQNM, "vrlqnm", CONST, altivec_vrlqnm)
+BU_P10_128BIT_2 (DIV_V1TI, "div_1ti", CONST, vsx_div_v1ti)
+BU_P10_128BIT_2 (UDIV_V1TI, "udiv_1ti", CONST, vsx_udiv_v1ti)
+BU_P10_128BIT_2 (DIVES_V1TI, "dives", CONST, vsx_dives_v1ti)
+BU_P10_128BIT_2 (DIVEU_V1TI, "diveu", CONST, vsx_diveu_v1ti)
+BU_P10_128BIT_2 (MODS_V1TI, "mods", CONST, vsx_mods_v1ti)
+BU_P10_128BIT_2 (MODU_V1TI, "modu", CONST, vsx_modu_v1ti)
+
+BU_P10_128BIT_3 (VRLQMI, "vrlqmi", CONST, altivec_vrlqmi)
+
 BU_P10V_3 (VEXTRACTBL, "vextdubvlx", CONST, vextractlv16qi)
 BU_P10V_3 (VEXTRACTHL, "vextduhvlx", CONST, vextractlv8hi)
 BU_P10V_3 (VEXTRACTWL, "vextduwvlx", CONST, vextractlv4si)
@@ -2839,6 +2909,12 @@ BU_P10_OVERLOAD_2 (CLRR, "clrr")
 BU_P10_OVERLOAD_2 (GNB, "gnb")
 BU_P10_OVERLOAD_4 (XXEVAL, "xxeval")
 BU_P10_OVERLOAD_2 (XXGENPCVM, "xxgenpcvm")
+BU_P10_OVERLOAD_2 (VRLQ, "vrlq")
+BU_P10_OVERLOAD_2 (VSLQ, "vslq")
+BU_P10_OVERLOAD_2 (VSRQ, "vsrq")
+BU_P10_OVERLOAD_2 (VSRAQ, "vsraq")
+BU_P10_OVERLOAD_2 (DIVE, "dive")
+BU_P10_OVERLOAD_2 (MOD, "mod")
 
 BU_P10_OVERLOAD_3 (EXTRACTL, "extractl")
 BU_P10_OVERLOAD_3 (EXTRACTH, "extracth")
@@ -2854,6 +2930,7 @@ BU_P10_OVERLOAD_1 (VSTRIL, "stril")
 
 BU_P10_OVERLOAD_1 (VSTRIR_P, "strir_p")
 BU_P10_OVERLOAD_1 (VSTRIL_P, "stril_p")
+BU_P10_OVERLOAD_1 (SIGNEXT, "vsignextq")
 
 BU_P10_OVERLOAD_1 (XVTLSBB_ZEROS, "xvtlsbb_all_zeros")
 BU_P10_OVERLOAD_1 (XVTLSBB_ONES, "xvtlsbb_all_ones")
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 87699be8a07..2bd6412a502 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -839,6 +839,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPEQ, P8V_BUILTIN_VCMPEQUD,
     RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
+  { ALTIVEC_BUILTIN_VEC_CMPEQ, P10_BUILTIN_VCMPEQUT,
+    RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 },
+  { ALTIVEC_BUILTIN_VEC_CMPEQ, P10_BUILTIN_VCMPEQUT,
+    RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPEQ, ALTIVEC_BUILTIN_VCMPEQFP,
     RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPEQ, VSX_BUILTIN_XVCMPEQDP,
@@ -885,6 +889,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
   { ALTIVEC_BUILTIN_VEC_CMPGE, VSX_BUILTIN_CMPGE_U2DI,
     RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI,
     RS6000_BTI_unsigned_V2DI, 0},
+  { ALTIVEC_BUILTIN_VEC_CMPGE, P10_BUILTIN_CMPGE_1TI,
+    RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0},
+  { ALTIVEC_BUILTIN_VEC_CMPGE, P10_BUILTIN_CMPGE_U1TI,
+    RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI,
+    RS6000_BTI_unsigned_V1TI, 0},
   { ALTIVEC_BUILTIN_VEC_CMPGT, ALTIVEC_BUILTIN_VCMPGTUB,
     RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPGT, ALTIVEC_BUILTIN_VCMPGTSB,
@@ -899,8 +908,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPGT, P8V_BUILTIN_VCMPGTUD,
     RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
+  { ALTIVEC_BUILTIN_VEC_CMPGT, P10_BUILTIN_VCMPGTUT,
+    RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPGT, P8V_BUILTIN_VCMPGTSD,
     RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
+  { ALTIVEC_BUILTIN_VEC_CMPGT, P10_BUILTIN_VCMPGTST,
+    RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPGT, ALTIVEC_BUILTIN_VCMPGTFP,
     RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPGT, VSX_BUILTIN_XVCMPGTDP,
@@ -943,6 +956,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
   { ALTIVEC_BUILTIN_VEC_CMPLE, VSX_BUILTIN_CMPLE_U2DI,
     RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI,
     RS6000_BTI_unsigned_V2DI, 0},
+  { ALTIVEC_BUILTIN_VEC_CMPLE, P10_BUILTIN_CMPLE_1TI,
+    RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0},
+  { ALTIVEC_BUILTIN_VEC_CMPLE, P10_BUILTIN_CMPLE_U1TI,
+    RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI,
+    RS6000_BTI_unsigned_V1TI, 0},
   { ALTIVEC_BUILTIN_VEC_CMPLT, ALTIVEC_BUILTIN_VCMPGTUB,
     RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPLT, ALTIVEC_BUILTIN_VCMPGTSB,
@@ -995,6 +1013,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
   { VSX_BUILTIN_VEC_DIV, VSX_BUILTIN_UDIV_V2DI,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
+  { VSX_BUILTIN_VEC_DIV, P10_BUILTIN_128BIT_DIV_V1TI,
+    RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 },
+  { VSX_BUILTIN_VEC_DIV, P10_BUILTIN_128BIT_UDIV_V1TI,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI,
+    RS6000_BTI_unsigned_V1TI, 0 },
+
   { VSX_BUILTIN_VEC_DOUBLE, VSX_BUILTIN_XVCVSXDDP,
     RS6000_BTI_V2DF, RS6000_BTI_V2DI, 0, 0 },
   { VSX_BUILTIN_VEC_DOUBLE, VSX_BUILTIN_XVCVUXDDP,
@@ -1789,6 +1813,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
   { ALTIVEC_BUILTIN_VEC_MULE, P8V_BUILTIN_VMULEUW,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V4SI,
     RS6000_BTI_unsigned_V4SI, 0 },
+  { ALTIVEC_BUILTIN_VEC_MULE, P10_BUILTIN_128BIT_VMULESD,
+    RS6000_BTI_V1TI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
+  { ALTIVEC_BUILTIN_VEC_MULE, P10_BUILTIN_128BIT_VMULEUD,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V2DI,
+    RS6000_BTI_unsigned_V2DI, 0 },
+
   { ALTIVEC_BUILTIN_VEC_VMULEUB, ALTIVEC_BUILTIN_VMULEUB,
     RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 },
   { ALTIVEC_BUILTIN_VEC_VMULESB, ALTIVEC_BUILTIN_VMULESB,
@@ -1812,6 +1842,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
   { ALTIVEC_BUILTIN_VEC_MULO, P8V_BUILTIN_VMULOUW,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V4SI,
     RS6000_BTI_unsigned_V4SI, 0 },
+  { ALTIVEC_BUILTIN_VEC_MULO, P10_BUILTIN_128BIT_VMULOSD,
+    RS6000_BTI_V1TI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
+  { ALTIVEC_BUILTIN_VEC_MULO, P10_BUILTIN_128BIT_VMULOUD,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V2DI,
+    RS6000_BTI_unsigned_V2DI, 0 },
   { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOSH,
     RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 },
   { ALTIVEC_BUILTIN_VEC_VMULOSH, ALTIVEC_BUILTIN_VMULOSH,
@@ -1860,6 +1895,16 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_bool_V2DI, 0 },
   { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR_V2DI_UNS,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
+  { ALTIVEC_BUILTIN_VEC_NOR, P10_BUILTIN_VNOR_V1TI,
+    RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_bool_V1TI, 0 },
+  { ALTIVEC_BUILTIN_VEC_NOR, P10_BUILTIN_VNOR_V1TI,
+    RS6000_BTI_V1TI, RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, 0 },
+  { ALTIVEC_BUILTIN_VEC_NOR, P10_BUILTIN_VNOR_V1TI_UNS,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
+  { ALTIVEC_BUILTIN_VEC_NOR, P10_BUILTIN_VNOR_V1TI_UNS,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_bool_V1TI, 0 },
+  { ALTIVEC_BUILTIN_VEC_NOR, P10_BUILTIN_VNOR_V1TI_UNS,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
   { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR_V2DI_UNS,
     RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, 0 },
   { ALTIVEC_BUILTIN_VEC_NOR, ALTIVEC_BUILTIN_VNOR_V4SI,
@@ -2115,6 +2160,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
   { ALTIVEC_BUILTIN_VEC_RL, P8V_BUILTIN_VRLD,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
+  { ALTIVEC_BUILTIN_VEC_RL, P10_BUILTIN_128BIT_VRLQ,
+    RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
+  { ALTIVEC_BUILTIN_VEC_RL, P10_BUILTIN_128BIT_VRLQ,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI,
+    RS6000_BTI_unsigned_V1TI, 0 },
   { ALTIVEC_BUILTIN_VEC_VRLW, ALTIVEC_BUILTIN_VRLW,
     RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI, 0 },
   { ALTIVEC_BUILTIN_VEC_VRLW, ALTIVEC_BUILTIN_VRLW,
@@ -2133,12 +2183,23 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
   { P9V_BUILTIN_VEC_RLMI, P9V_BUILTIN_VRLDMI,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI },
+  { P9V_BUILTIN_VEC_RLMI, P10_BUILTIN_128BIT_VRLQMI,
+    RS6000_BTI_V1TI, RS6000_BTI_V1TI,
+    RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI },
+  { P9V_BUILTIN_VEC_RLMI, P10_BUILTIN_128BIT_VRLQMI,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI },
   { P9V_BUILTIN_VEC_RLNM, P9V_BUILTIN_VRLWNM,
     RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI,
     RS6000_BTI_unsigned_V4SI, 0 },
   { P9V_BUILTIN_VEC_RLNM, P9V_BUILTIN_VRLDNM,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI,
     RS6000_BTI_unsigned_V2DI, 0 },
+  { P9V_BUILTIN_VEC_RLNM, P10_BUILTIN_128BIT_VRLQNM,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI,
+    RS6000_BTI_unsigned_V1TI, 0 },
+  { P9V_BUILTIN_VEC_RLNM, P10_BUILTIN_128BIT_VRLQNM,
+    RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
   { ALTIVEC_BUILTIN_VEC_SL, ALTIVEC_BUILTIN_VSLB,
     RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_unsigned_V16QI, 0 },
   { ALTIVEC_BUILTIN_VEC_SL, ALTIVEC_BUILTIN_VSLB,
@@ -2155,6 +2216,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
   { ALTIVEC_BUILTIN_VEC_SL, P8V_BUILTIN_VSLD,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
+  { ALTIVEC_BUILTIN_VEC_SL, P10_BUILTIN_128BIT_VSLQ,
+    RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 },
+  { ALTIVEC_BUILTIN_VEC_SL, P10_BUILTIN_128BIT_VSLQ,
+    RS6000_BTI_unsigned_V1TI,
RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_SQRT, VSX_BUILTIN_XVSQRTDP, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0, 0 }, { ALTIVEC_BUILTIN_VEC_SQRT, VSX_BUILTIN_XVSQRTSP, @@ -2351,6 +2417,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_SR, P8V_BUILTIN_VSRD, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_SR, P10_BUILTIN_128BIT_VSRQ, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_SR, P10_BUILTIN_128BIT_VSRQ, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_VSRW, ALTIVEC_BUILTIN_VSRW, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_VSRW, ALTIVEC_BUILTIN_VSRW, @@ -2379,6 +2450,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_SRA, P8V_BUILTIN_VSRAD, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_SRA, P10_BUILTIN_128BIT_VSRAQ, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_unsigned_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_SRA, P10_BUILTIN_128BIT_VSRAQ, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, { ALTIVEC_BUILTIN_VEC_VSRAW, ALTIVEC_BUILTIN_VSRAW, RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_VSRAW, ALTIVEC_BUILTIN_VSRAW, @@ -3996,12 +4072,16 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_bool_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTUD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI }, + { ALTIVEC_BUILTIN_VEC_VCMPGT_P, 
P10_BUILTIN_VCMPGTUT_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_V2DI }, + { ALTIVEC_BUILTIN_VEC_VCMPGT_P, P10_BUILTIN_VCMPGTST_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, ALTIVEC_BUILTIN_VCMPGTFP_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, { ALTIVEC_BUILTIN_VEC_VCMPGT_P, VSX_BUILTIN_XVCMPGTDP_P, @@ -4066,6 +4146,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, P8V_BUILTIN_VCMPEQUD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI }, + { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, P10_BUILTIN_VCMPEQUT_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI }, + { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, P10_BUILTIN_VCMPEQUT_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI }, { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, ALTIVEC_BUILTIN_VCMPEQFP_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, { ALTIVEC_BUILTIN_VEC_VCMPEQ_P, VSX_BUILTIN_XVCMPEQDP_P, @@ -4117,12 +4201,16 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_bool_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTUD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI }, + { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P10_BUILTIN_VCMPGTUT_P, + 
RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P8V_BUILTIN_VCMPGTSD_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V2DI, RS6000_BTI_V2DI }, + { ALTIVEC_BUILTIN_VEC_VCMPGE_P, P10_BUILTIN_VCMPGTST_P, + RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, ALTIVEC_BUILTIN_VCMPGEFP_P, RS6000_BTI_INTSI, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF }, { ALTIVEC_BUILTIN_VEC_VCMPGE_P, VSX_BUILTIN_XVCMPGEDP_P, @@ -4771,6 +4859,12 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { ALTIVEC_BUILTIN_VEC_CMPNE, P9V_BUILTIN_CMPNEW, RS6000_BTI_bool_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPNE, P10_BUILTIN_CMPNET, + RS6000_BTI_bool_V1TI, RS6000_BTI_V1TI, + RS6000_BTI_V1TI, 0 }, + { ALTIVEC_BUILTIN_VEC_CMPNE, P10_BUILTIN_CMPNET, + RS6000_BTI_bool_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, /* The following 2 entries have been deprecated. */ { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNEB_P, @@ -4856,8 +4950,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_bool_V2DI, 0 }, { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNED_P, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, - RS6000_BTI_unsigned_V2DI, 0 - }, + RS6000_BTI_unsigned_V2DI, 0 }, + { P9V_BUILTIN_VEC_VCMPNE_P, P10_BUILTIN_VCMPNET_P, + RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, /* The following 2 entries have been deprecated. 
*/ { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNED_P, @@ -4871,6 +4967,8 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNED_P, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, 0 }, + { P9V_BUILTIN_VEC_VCMPNE_P, P10_BUILTIN_VCMPNET_P, + RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, { P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNEFP_P, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, @@ -4961,8 +5059,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_bool_V2DI, 0 }, { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAED_P, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, - RS6000_BTI_unsigned_V2DI, 0 - }, + RS6000_BTI_unsigned_V2DI, 0 }, + { P9V_BUILTIN_VEC_VCMPAE_P, P10_BUILTIN_VCMPAET_P, + RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, /* The following 2 entries have been deprecated. */ { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAED_P, @@ -4976,7 +5076,8 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAED_P, RS6000_BTI_INTSI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, 0 }, - + { P9V_BUILTIN_VEC_VCMPAE_P, P10_BUILTIN_VCMPAET_P, + RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAEFP_P, RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 }, { P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAEDP_P, @@ -5903,6 +6004,21 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { P10_BUILTIN_VEC_XVTLSBB_ONES, P10_BUILTIN_XVTLSBB_ONES, RS6000_BTI_INTSI, RS6000_BTI_unsigned_V16QI, 0, 0 }, + { P10_BUILTIN_VEC_SIGNEXT, P10_BUILTIN_128BIT_VSIGNEXTSD2Q, + RS6000_BTI_V1TI, RS6000_BTI_V2DI, 0, 0 }, + + { P10_BUILTIN_VEC_DIVE, P10_BUILTIN_128BIT_DIVES_V1TI, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, + { P10_BUILTIN_VEC_DIVE, P10_BUILTIN_128BIT_DIVEU_V1TI, + RS6000_BTI_unsigned_V1TI, 
RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, + + { P10_BUILTIN_VEC_MOD, P10_BUILTIN_128BIT_MODS_V1TI, + RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 }, + { P10_BUILTIN_VEC_MOD, P10_BUILTIN_128BIT_MODU_V1TI, + RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, + RS6000_BTI_unsigned_V1TI, 0 }, + { RS6000_BUILTIN_NONE, RS6000_BUILTIN_NONE, 0, 0, 0, 0 } }; @@ -12228,12 +12344,14 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) case ALTIVEC_BUILTIN_VCMPEQUH: case ALTIVEC_BUILTIN_VCMPEQUW: case P8V_BUILTIN_VCMPEQUD: + case P10_BUILTIN_VCMPEQUT: fold_compare_helper (gsi, EQ_EXPR, stmt); return true; case P9V_BUILTIN_CMPNEB: case P9V_BUILTIN_CMPNEH: case P9V_BUILTIN_CMPNEW: + case P10_BUILTIN_CMPNET: fold_compare_helper (gsi, NE_EXPR, stmt); return true; @@ -12245,6 +12363,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) case VSX_BUILTIN_CMPGE_U4SI: case VSX_BUILTIN_CMPGE_2DI: case VSX_BUILTIN_CMPGE_U2DI: + case P10_BUILTIN_CMPGE_1TI: + case P10_BUILTIN_CMPGE_U1TI: fold_compare_helper (gsi, GE_EXPR, stmt); return true; @@ -12256,6 +12376,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) case ALTIVEC_BUILTIN_VCMPGTUW: case P8V_BUILTIN_VCMPGTUD: case P8V_BUILTIN_VCMPGTSD: + case P10_BUILTIN_VCMPGTUT: + case P10_BUILTIN_VCMPGTST: fold_compare_helper (gsi, GT_EXPR, stmt); return true; @@ -12267,6 +12389,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) case VSX_BUILTIN_CMPLE_U4SI: case VSX_BUILTIN_CMPLE_2DI: case VSX_BUILTIN_CMPLE_U2DI: + case P10_BUILTIN_CMPLE_1TI: + case P10_BUILTIN_CMPLE_U1TI: fold_compare_helper (gsi, LE_EXPR, stmt); return true; @@ -12978,6 +13102,8 @@ rs6000_init_builtins (void) ? 
"__vector __bool long" : "__vector __bool long long", bool_long_long_type_node, 2); + bool_V1TI_type_node = rs6000_vector_type ("__vector __bool __int128", + intTI_type_node, 1); pixel_V8HI_type_node = rs6000_vector_type ("__vector __pixel", pixel_type_node, 8); @@ -13163,6 +13289,10 @@ altivec_init_builtins (void) = build_function_type_list (integer_type_node, integer_type_node, V2DI_type_node, V2DI_type_node, NULL_TREE); + tree int_ftype_int_v1ti_v1ti + = build_function_type_list (integer_type_node, + integer_type_node, V1TI_type_node, + V1TI_type_node, NULL_TREE); tree void_ftype_v4si = build_function_type_list (void_type_node, V4SI_type_node, NULL_TREE); tree v8hi_ftype_void @@ -13515,6 +13645,9 @@ altivec_init_builtins (void) case E_VOIDmode: type = int_ftype_int_opaque_opaque; break; + case E_V1TImode: + type = int_ftype_int_v1ti_v1ti; + break; case E_V2DImode: type = int_ftype_int_v2di_v2di; break; @@ -14114,6 +14247,10 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, case P10_BUILTIN_XXGENPCVM_V8HI: case P10_BUILTIN_XXGENPCVM_V4SI: case P10_BUILTIN_XXGENPCVM_V2DI: + case P10_BUILTIN_128BIT_VMULEUD: + case P10_BUILTIN_128BIT_VMULOUD: + case P10_BUILTIN_128BIT_DIVEU_V1TI: + case P10_BUILTIN_128BIT_MODU_V1TI: h.uns_p[0] = 1; h.uns_p[1] = 1; h.uns_p[2] = 1; @@ -14213,10 +14350,13 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, case VSX_BUILTIN_CMPGE_U8HI: case VSX_BUILTIN_CMPGE_U4SI: case VSX_BUILTIN_CMPGE_U2DI: + case P10_BUILTIN_CMPGE_U1TI: case ALTIVEC_BUILTIN_VCMPGTUB: case ALTIVEC_BUILTIN_VCMPGTUH: case ALTIVEC_BUILTIN_VCMPGTUW: case P8V_BUILTIN_VCMPGTUD: + case P10_BUILTIN_VCMPGTUT: + case P10_BUILTIN_VCMPEQUT: h.uns_p[1] = 1; h.uns_p[2] = 1; break; diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 40ee0a695f1..1fa4a527f12 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -3401,7 +3401,9 @@ rs6000_builtin_mask_calculate (void) | ((TARGET_FLOAT128_TYPE) ? 
RS6000_BTM_FLOAT128 : 0) | ((TARGET_FLOAT128_HW) ? RS6000_BTM_FLOAT128_HW : 0) | ((TARGET_MMA) ? RS6000_BTM_MMA : 0) - | ((TARGET_POWER10) ? RS6000_BTM_P10 : 0)); + | ((TARGET_POWER10) ? RS6000_BTM_P10 : 0) + | ((TARGET_TI_VECTOR_OPS) ? RS6000_BTM_TI_VECTOR_OPS : 0)); + } /* Implement TARGET_MD_ASM_ADJUST. All asm statements are considered @@ -3732,6 +3734,17 @@ rs6000_option_override_internal (bool global_init_p) if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET) rs6000_print_isa_options (stderr, 0, "before defaults", rs6000_isa_flags); + /* The -mti-vector-ops option requires ISA 3.1 support and -maltivec for + the 128-bit instructions. Currently, TARGET_POWER10 is sufficient to + enable it by default. */ + if (TARGET_POWER10) + { + if (rs6000_isa_flags_explicit & OPTION_MASK_VSX) + warning(0, ("%<-mno-altivec%> disables -mti-vector-ops (128-bit integer vector register operations).")); + else + rs6000_isa_flags |= OPTION_MASK_TI_VECTOR_OPS; + } + /* Handle explicit -mno-{altivec,vsx,power8-vector,power9-vector} and turn off all of the options that depend on those flags. 
*/ ignore_masks = rs6000_disable_incompatible_switches (); @@ -19489,6 +19502,7 @@ rs6000_handle_altivec_attribute (tree *node, case 'b': switch (mode) { + case E_TImode: case E_V1TImode: result = bool_V1TI_type_node; break; case E_DImode: case E_V2DImode: result = bool_V2DI_type_node; break; case E_SImode: case E_V4SImode: result = bool_V4SI_type_node; break; case E_HImode: case E_V8HImode: result = bool_V8HI_type_node; break; @@ -23218,6 +23232,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] = { "float128-hardware", OPTION_MASK_FLOAT128_HW, false, true }, { "fprnd", OPTION_MASK_FPRND, false, true }, { "power10", OPTION_MASK_POWER10, false, true }, + { "ti-vector-ops", OPTION_MASK_TI_VECTOR_OPS, false, true }, { "hard-dfp", OPTION_MASK_DFP, false, true }, { "htm", OPTION_MASK_HTM, false, true }, { "isel", OPTION_MASK_ISEL, false, true }, diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h index bbd8060e143..da84abde671 100644 --- a/gcc/config/rs6000/rs6000.h +++ b/gcc/config/rs6000/rs6000.h @@ -539,6 +539,7 @@ extern int rs6000_vector_align[]; #define MASK_UPDATE OPTION_MASK_UPDATE #define MASK_VSX OPTION_MASK_VSX #define MASK_POWER10 OPTION_MASK_POWER10 +#define MASK_TI_VECTOR_OPS OPTION_MASK_TI_VECTOR_OPS #ifndef IN_LIBGCC2 #define MASK_POWERPC64 OPTION_MASK_POWERPC64 @@ -2305,6 +2306,7 @@ extern int frame_pointer_needed; #define RS6000_BTM_P8_VECTOR MASK_P8_VECTOR /* ISA 2.07 vector. */ #define RS6000_BTM_P9_VECTOR MASK_P9_VECTOR /* ISA 3.0 vector. */ #define RS6000_BTM_P9_MISC MASK_P9_MISC /* ISA 3.0 misc. non-vector */ +#define RS6000_BTM_P10_128BIT MASK_POWER10 /* ISA P10 vector. */ #define RS6000_BTM_CRYPTO MASK_CRYPTO /* crypto funcs. */ #define RS6000_BTM_HTM MASK_HTM /* hardware TM funcs. */ #define RS6000_BTM_FRE MASK_POPCNTB /* FRE instruction. */ @@ -2322,7 +2324,7 @@ extern int frame_pointer_needed; #define RS6000_BTM_FLOAT128_HW MASK_FLOAT128_HW /* IEEE 128-bit float h/w. 
*/ #define RS6000_BTM_MMA MASK_MMA /* ISA 3.1 MMA. */ #define RS6000_BTM_P10 MASK_POWER10 - +#define RS6000_BTM_TI_VECTOR_OPS MASK_TI_VECTOR_OPS /* 128-bit integer support */ #define RS6000_BTM_COMMON (RS6000_BTM_ALTIVEC \ | RS6000_BTM_VSX \ @@ -2436,6 +2438,7 @@ enum rs6000_builtin_type_index RS6000_BTI_bool_V8HI, /* __vector __bool short */ RS6000_BTI_bool_V4SI, /* __vector __bool int */ RS6000_BTI_bool_V2DI, /* __vector __bool long */ + RS6000_BTI_bool_V1TI, /* __vector __bool __int128 */ RS6000_BTI_pixel_V8HI, /* __vector __pixel */ RS6000_BTI_long, /* long_integer_type_node */ RS6000_BTI_unsigned_long, /* long_unsigned_type_node */ @@ -2489,6 +2492,7 @@ enum rs6000_builtin_type_index #define bool_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V8HI]) #define bool_V4SI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V4SI]) #define bool_V2DI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V2DI]) +#define bool_V1TI_type_node (rs6000_builtin_types[RS6000_BTI_bool_V1TI]) #define pixel_V8HI_type_node (rs6000_builtin_types[RS6000_BTI_pixel_V8HI]) #define long_long_integer_type_internal_node (rs6000_builtin_types[RS6000_BTI_long_long]) diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt index 9d3e740e930..67d667bf1fd 100644 --- a/gcc/config/rs6000/rs6000.opt +++ b/gcc/config/rs6000/rs6000.opt @@ -585,3 +585,7 @@ Generate (do not generate) pc-relative memory addressing. mmma Target Report Mask(MMA) Var(rs6000_isa_flags) Generate (do not generate) MMA instructions. + +mti-vector-ops +Target Report Mask(TI_VECTOR_OPS) Var(rs6000_isa_flags) +Use integer 128-bit instructions for a future architecture.
\ No newline at end of file diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md index 796345c80d3..2deff282076 100644 --- a/gcc/config/rs6000/vector.md +++ b/gcc/config/rs6000/vector.md @@ -678,6 +678,13 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +(define_expand "vector_eqv1ti" + [(set (match_operand:V1TI 0 "vlogical_operand") + (eq:V1TI (match_operand:V1TI 1 "vlogical_operand") + (match_operand:V1TI 2 "vlogical_operand")))] + "TARGET_TI_VECTOR_OPS" + "") + (define_expand "vector_gt" [(set (match_operand:VEC_C 0 "vlogical_operand") (gt:VEC_C (match_operand:VEC_C 1 "vlogical_operand") @@ -685,6 +692,13 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +(define_expand "vector_gtv1ti" + [(set (match_operand:V1TI 0 "vlogical_operand") + (gt:V1TI (match_operand:V1TI 1 "vlogical_operand") + (match_operand:V1TI 2 "vlogical_operand")))] + "TARGET_TI_VECTOR_OPS" + "") + ; >= for integer vectors: swap operands and apply not-greater-than (define_expand "vector_nlt" [(set (match_operand:VEC_I 3 "vlogical_operand") @@ -697,6 +711,17 @@ operands[3] = gen_reg_rtx_and_attrs (operands[0]); }) +(define_expand "vector_nltv1ti" + [(set (match_operand:V1TI 3 "vlogical_operand") + (gt:V1TI (match_operand:V1TI 2 "vlogical_operand") + (match_operand:V1TI 1 "vlogical_operand"))) + (set (match_operand:V1TI 0 "vlogical_operand") + (not:V1TI (match_dup 3)))] + "TARGET_TI_VECTOR_OPS" +{ + operands[3] = gen_reg_rtx_and_attrs (operands[0]); +}) + (define_expand "vector_gtu" [(set (match_operand:VEC_I 0 "vint_operand") (gtu:VEC_I (match_operand:VEC_I 1 "vint_operand") @@ -704,6 +729,13 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +(define_expand "vector_gtuv1ti" + [(set (match_operand:V1TI 0 "altivec_register_operand") + (gtu:V1TI (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand")))] + "TARGET_TI_VECTOR_OPS" + "") + ; >= for integer vectors: swap operands and apply not-greater-than (define_expand "vector_nltu" [(set 
(match_operand:VEC_I 3 "vlogical_operand") @@ -716,6 +748,17 @@ operands[3] = gen_reg_rtx_and_attrs (operands[0]); }) +(define_expand "vector_nltuv1ti" + [(set (match_operand:V1TI 3 "vlogical_operand") + (gtu:V1TI (match_operand:V1TI 2 "vlogical_operand") + (match_operand:V1TI 1 "vlogical_operand"))) + (set (match_operand:V1TI 0 "vlogical_operand") + (not:V1TI (match_dup 3)))] + "TARGET_TI_VECTOR_OPS" +{ + operands[3] = gen_reg_rtx_and_attrs (operands[0]); +}) + (define_expand "vector_geu" [(set (match_operand:VEC_I 0 "vint_operand") (geu:VEC_I (match_operand:VEC_I 1 "vint_operand") @@ -735,6 +778,17 @@ operands[3] = gen_reg_rtx_and_attrs (operands[0]); }) +(define_expand "vector_ngtv1ti" + [(set (match_operand:V1TI 3 "vlogical_operand") + (gt:V1TI (match_operand:V1TI 1 "vlogical_operand") + (match_operand:V1TI 2 "vlogical_operand"))) + (set (match_operand:V1TI 0 "vlogical_operand") + (not:V1TI (match_dup 3)))] + "TARGET_TI_VECTOR_OPS" +{ + operands[3] = gen_reg_rtx_and_attrs (operands[0]); +}) + (define_expand "vector_ngtu" [(set (match_operand:VEC_I 3 "vlogical_operand") (gtu:VEC_I (match_operand:VEC_I 1 "vlogical_operand") @@ -746,6 +800,17 @@ operands[3] = gen_reg_rtx_and_attrs (operands[0]); }) +(define_expand "vector_ngtuv1ti" + [(set (match_operand:V1TI 3 "vlogical_operand") + (gtu:V1TI (match_operand:V1TI 1 "vlogical_operand") + (match_operand:V1TI 2 "vlogical_operand"))) + (set (match_operand:V1TI 0 "vlogical_operand") + (not:V1TI (match_dup 3)))] + "TARGET_TI_VECTOR_OPS" +{ + operands[3] = gen_reg_rtx_and_attrs (operands[0]); +}) + ; There are 14 possible vector FP comparison operators, gt and eq of them have ; been expanded above, so just support 12 remaining operators here. 
@@ -894,6 +959,18 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +(define_expand "vector_eq_v1ti_p" + [(parallel + [(set (reg:CC CR6_REGNO) + (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand"))] + UNSPEC_PREDICATE)) + (set (match_operand:V1TI 0 "vlogical_operand") + (eq:V1TI (match_dup 1) + (match_dup 2)))])] + "TARGET_TI_VECTOR_OPS" + "") + ;; This expansion handles the V16QI, V8HI, and V4SI modes in the ;; implementation of the vec_all_ne built-in functions on Power9. (define_expand "vector_ne__p" @@ -976,6 +1053,23 @@ operands[3] = gen_reg_rtx (V2DImode); }) +(define_expand "vector_ne_v1ti_p" + [(parallel + [(set (reg:CC CR6_REGNO) + (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand"))] + UNSPEC_PREDICATE)) + (set (match_dup 3) + (eq:V1TI (match_dup 1) + (match_dup 2)))]) + (set (match_operand:SI 0 "register_operand" "=r") + (eq:SI (reg:CC CR6_REGNO) + (const_int 0)))] + "TARGET_TI_VECTOR_OPS" +{ + operands[3] = gen_reg_rtx (V1TImode); +}) + ;; This expansion handles the V2DI mode in the implementation of the ;; vec_any_eq built-in function on Power9. ;; @@ -1002,6 +1096,27 @@ operands[3] = gen_reg_rtx (V2DImode); }) +;; Power 10 +(define_expand "vector_ae_v1ti_p" + [(parallel + [(set (reg:CC CR6_REGNO) + (unspec:CC [(eq:CC (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand"))] + UNSPEC_PREDICATE)) + (set (match_dup 3) + (eq:V1TI (match_dup 1) + (match_dup 2)))]) + (set (match_operand:SI 0 "register_operand" "=r") + (eq:SI (reg:CC CR6_REGNO) + (const_int 0))) + (set (match_dup 0) + (xor:SI (match_dup 0) + (const_int 1)))] + "TARGET_TI_VECTOR_OPS" +{ + operands[3] = gen_reg_rtx (V1TImode); +}) + ;; This expansion handles the V4SF and V2DF modes in the Power9 ;; implementation of the vec_all_ne built-in functions. 
Note that the ;; expansions for this pattern with these modes makes no use of power9- @@ -1061,6 +1176,18 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +(define_expand "vector_gt_v1ti_p" + [(parallel + [(set (reg:CC CR6_REGNO) + (unspec:CC [(gt:CC (match_operand:V1TI 1 "vlogical_operand") + (match_operand:V1TI 2 "vlogical_operand"))] + UNSPEC_PREDICATE)) + (set (match_operand:V1TI 0 "vlogical_operand") + (gt:V1TI (match_dup 1) + (match_dup 2)))])] + "TARGET_TI_VECTOR_OPS" + "") + (define_expand "vector_ge__p" [(parallel [(set (reg:CC CR6_REGNO) @@ -1085,6 +1212,18 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +(define_expand "vector_gtu_v1ti_p" + [(parallel + [(set (reg:CC CR6_REGNO) + (unspec:CC [(gtu:CC (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand"))] + UNSPEC_PREDICATE)) + (set (match_operand:V1TI 0 "altivec_register_operand") + (gtu:V1TI (match_dup 1) + (match_dup 2)))])] + "TARGET_TI_VECTOR_OPS" + "") + ;; AltiVec/VSX predicates. ;; This expansion is triggered during expansion of predicate built-in @@ -1460,6 +1599,20 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +(define_expand "vrotlv1ti3" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (rotate:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_TI_VECTOR_OPS" +{ + /* Shift amount needs to be put in bits[57:63] of 128-bit operand2.
*/ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn (gen_xxswapd_v1ti (tmp, operands[2])); + emit_insn (gen_altivec_vrlq (operands[0], operands[1], tmp)); + DONE; +}) + ;; Expanders for rotatert to make use of vrotl (define_expand "vrotr3" [(set (match_operand:VEC_I 0 "vint_operand") @@ -1481,6 +1634,21 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +;; No immediate version of this 128-bit instruction +(define_expand "vashlv1ti3" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_TI_VECTOR_OPS" +{ + /* Shift amount needs to be put in bits[57:63] of 128-bit operand2. */ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn (gen_xxswapd_v1ti (tmp, operands[2])); + emit_insn (gen_altivec_vslq (operands[0], operands[1], tmp)); + DONE; +}) + ;; Expanders for logical shift right on each vector element (define_expand "vlshr3" [(set (match_operand:VEC_I 0 "vint_operand") @@ -1489,6 +1657,21 @@ "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") +;; No immediate version of this 128-bit instruction +(define_expand "vlshrv1ti3" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_TI_VECTOR_OPS" +{ + /* Shift amount needs to be put into bits[57:63] of 128-bit operand2.
*/ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn (gen_xxswapd_v1ti (tmp, operands[2])); + emit_insn (gen_altivec_vsrq (operands[0], operands[1], tmp)); + DONE; +}) + ;; Expanders for arithmetic shift right on each vector element (define_expand "vashr3" [(set (match_operand:VEC_I 0 "vint_operand") @@ -1496,6 +1679,22 @@ (match_operand:VEC_I 2 "vint_operand")))] "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)" "") + +;; No immediate version of this 128-bit instruction +(define_expand "vashrv1ti3" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (ashiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")))] + "TARGET_TI_VECTOR_OPS" +{ + /* Shift amount needs to be put into bits[57:63] of 128-bit operand2. */ + rtx tmp = gen_reg_rtx (V1TImode); + + emit_insn (gen_xxswapd_v1ti (tmp, operands[2])); + emit_insn (gen_altivec_vsraq (operands[0], operands[1], tmp)); + DONE; +}) + ;; Vector reduction expanders for VSX ; The (VEC_reduc:...
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index 1153a01b4ef..998af3908ad 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -298,6 +298,12 @@ UNSPEC_VSX_XXSPLTD UNSPEC_VSX_DIVSD UNSPEC_VSX_DIVUD + UNSPEC_VSX_DIVSQ + UNSPEC_VSX_DIVUQ + UNSPEC_VSX_DIVESQ + UNSPEC_VSX_DIVEUQ + UNSPEC_VSX_MODSQ + UNSPEC_VSX_MODUQ UNSPEC_VSX_MULSD UNSPEC_VSX_SIGN_EXTEND UNSPEC_VSX_XVCVBF16SP @@ -361,6 +367,7 @@ UNSPEC_INSERTR UNSPEC_REPLACE_ELT UNSPEC_REPLACE_UN + UNSPEC_XXSWAPD_V1TI ]) (define_int_iterator XVCVBF16 [UNSPEC_VSX_XVCVSPBF16 @@ -1732,7 +1739,61 @@ } [(set_attr "type" "div")]) -;; *tdiv* instruction returning the FG flag +(define_insn "vsx_div_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_DIVSQ))] + "TARGET_TI_VECTOR_OPS" + "vdivsq %0,%1,%2" + [(set_attr "type" "div")]) + +(define_insn "vsx_udiv_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_DIVUQ))] + "TARGET_TI_VECTOR_OPS" + "vdivuq %0,%1,%2" + [(set_attr "type" "div")]) + +(define_insn "vsx_dives_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_DIVESQ))] + "TARGET_TI_VECTOR_OPS" + "vdivesq %0,%1,%2" + [(set_attr "type" "div")]) + +(define_insn "vsx_diveu_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_DIVEUQ))] + "TARGET_TI_VECTOR_OPS" + "vdiveuq %0,%1,%2" + [(set_attr "type" "div")]) + +(define_insn "vsx_mods_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI 
[(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_MODSQ))] + "TARGET_TI_VECTOR_OPS" + "vmodsq %0,%1,%2" + [(set_attr "type" "div")]) + +(define_insn "vsx_modu_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (match_operand:V1TI 2 "vsx_register_operand" "v")] + UNSPEC_VSX_MODUQ))] + "TARGET_TI_VECTOR_OPS" + "vmoduq %0,%1,%2" + [(set_attr "type" "div")]) + + ;; *tdiv* instruction returning the FG flag (define_expand "vsx_tdiv3_fg" [(set (match_dup 3) (unspec:CCFP [(match_operand:VSX_B 1 "vsx_register_operand") @@ -3083,6 +3144,18 @@ "xxpermdi %x0,%x1,%x1,2" [(set_attr "type" "vecperm")]) +;; Swap upper/lower 64-bit values in a 128-bit vector +(define_insn "xxswapd_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V1TI 1 "vsx_register_operand" "v") + (parallel [(const_int 0)(const_int 1)])] + UNSPEC_XXSWAPD_V1TI))] + "TARGET_POWER10" +;; AIX does not support extended mnemonic xxswapd. Use the basic +;; mnemonic xxpermdi instead. 
+ "xxpermdi %x0,%x1,%x1,2" + [(set_attr "type" "vecperm")]) + (define_insn "xxgenpcvm__internal" [(set (match_operand:VSX_EXTRACT_I4 0 "altivec_register_operand" "=wa") (unspec:VSX_EXTRACT_I4 @@ -4767,8 +4840,16 @@ (set_attr "type" "vecload")]) -;; ISA 3.0 vector extend sign support +;; ISA 3.1 vector extend sign support +(define_insn "vsx_sign_extend_v2di_v1ti" + [(set (match_operand:V1TI 0 "vsx_register_operand" "=v") + (unspec:V1TI [(match_operand:V2DI 1 "vsx_register_operand" "v")] + UNSPEC_VSX_SIGN_EXTEND))] + "TARGET_TI_VECTOR_OPS" + "vextsd2q %0,%1" + [(set_attr "type" "vecexts")]) +;; ISA 3.0 vector extend sign support (define_insn "vsx_sign_extend_qi_" [(set (match_operand:VSINT_84 0 "vsx_register_operand" "=v") (unspec:VSINT_84 @@ -5508,6 +5589,20 @@ "vcmpnew %0,%1,%2" [(set_attr "type" "vecsimple")]) +;; Vector Compare Not Equal v1ti (specified/not+eq:) +(define_expand "vcmpnet" + [(set (match_operand:V1TI 0 "altivec_register_operand") + (not:V1TI + (eq:V1TI (match_operand:V1TI 1 "altivec_register_operand") + (match_operand:V1TI 2 "altivec_register_operand"))))] + "TARGET_TI_VECTOR_OPS" +{ + emit_insn (gen_vector_eqv1ti (operands[0], operands[1], operands[2])); + emit_insn (gen_one_cmplv1ti2 (operands[0], operands[0])); + DONE; +}) + + ;; Vector Compare Not Equal or Zero Word (define_insn "vcmpnezw" [(set (match_operand:V4SI 0 "altivec_register_operand" "=v") diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index cb501ab2d75..346885de545 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -21270,6 +21270,180 @@ Generate PCV from specified Mask size, as if implemented by the immediate value is either 0, 1, 2 or 3. 
@findex vec_genpcvm +@smallexample +@exdent vector unsigned __int128 vec_rl (vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_rl (vector signed __int128, + vector unsigned __int128); +@end smallexample + +Returns the result of rotating the first input left by the number of bits +specified in the most significant quad word of the second input truncated to +7 bits (bits [125:131]). + +@smallexample +@exdent vector unsigned __int128 vec_rlmi (vector unsigned __int128, + vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_rlmi (vector signed __int128, + vector signed __int128, + vector unsigned __int128); +@end smallexample + +Returns the result of rotating the first input and inserting it under mask into the +second input. The first and last bits of the mask are obtained from the +two 7-bit fields bits [108:115] and bits [117:123] respectively of the second input. +The shift is obtained from the third input in the 7-bit field [125:131] where all bits +are counted from zero at the left. + +@smallexample +@exdent vector unsigned __int128 vec_rlnm (vector unsigned __int128, + vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_rlnm (vector signed __int128, + vector unsigned __int128, + vector unsigned __int128); +@end smallexample + +Returns the result of rotating the first input and ANDing it with a mask. The first +and last bits of the mask are obtained from the two +7-bit fields bits [117:123] and bits [125:131] respectively of the second input. +The shift is obtained from the third input in the 7-bit field bits [125:131] where all +bits are counted from zero at the left. 
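For readers checking the three rotate descriptions above against scalar code, here is a rough reference model in plain C. It is not part of the patch. It assumes a compiler that provides unsigned __int128, counts mask bits from zero at the left as the text does, and takes the shift, first-mask-bit and last-mask-bit values as already-extracted integers, so it makes no claim about which operand bits carry them. The wrap-around behaviour when the first mask bit is greater than the last is an assumption of this model. The helper names (make_u128, rotl128, mask128, rlnm128, rlmi128) are made up for illustration.

```c
#include <stdint.h>

typedef unsigned __int128 u128;

/* Build a 128-bit value from two 64-bit halves (hypothetical helper). */
static u128 make_u128(uint64_t hi, uint64_t lo)
{
    return ((u128)hi << 64) | lo;
}

/* Rotate left; only the low 7 bits of the count are used. */
static u128 rotl128(u128 x, unsigned count)
{
    unsigned n = count & 0x7F;
    return n ? (x << n) | (x >> (128 - n)) : x;
}

/* Ones from bit `first` through bit `last`, where bit 0 is the leftmost
   bit.  Assumption: if first > last the mask wraps around. */
static u128 mask128(unsigned first, unsigned last)
{
    unsigned f = first & 0x7F;
    unsigned l = last & 0x7F;
    u128 from_first = ~(u128)0 >> f;
    u128 up_to_last = ~(u128)0 << (127 - l);
    return f <= l ? (from_first & up_to_last) : (from_first | up_to_last);
}

/* Rotate left, then AND with the mask. */
static u128 rlnm128(u128 x, unsigned shift, unsigned first, unsigned last)
{
    return rotl128(x, shift) & mask128(first, last);
}

/* Rotate left, then insert under the mask into `into`. */
static u128 rlmi128(u128 x, u128 into, unsigned shift,
                    unsigned first, unsigned last)
{
    u128 m = mask128(first, last);
    return (rotl128(x, shift) & m) | (into & ~m);
}
```

Fed the inputs used in int_128bit-runnable.c, this model reproduces the expected values in that test. For example, rotating 0x1234567890ABCDEF_AABBCCDDEEFF1122 left by 32 gives 0x90ABCDEFAABBCCDD_EEFF112212345678.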
+ +@smallexample +@exdent vector unsigned __int128 vec_sl(vector unsigned __int128, vector unsigned __int128); +@exdent vector signed __int128 vec_sl(vector signed __int128, vector unsigned __int128); +@end smallexample + +Returns the result of shifting the first input left by the number of bits +specified in the most significant bits of the second input truncated to +7 bits (bits [125:131]). + +@smallexample +@exdent vector unsigned __int128 vec_sr(vector unsigned __int128, vector unsigned __int128); +@exdent vector signed __int128 vec_sr(vector signed __int128, vector unsigned __int128); +@end smallexample + +Returns the result of performing a logical right shift of the first argument +by the number of bits specified in the most significant double word of the +second input truncated to 7 bits (bits [125:131]). + +@smallexample +@exdent vector unsigned __int128 vec_sra(vector unsigned __int128, vector unsigned __int128); +@exdent vector signed __int128 vec_sra(vector signed __int128, vector unsigned __int128); +@end smallexample + +Returns the result of performing arithmetic right shift of the first argument +by the number of bits specified in the most significant bits of the +second input truncated to 7 bits (bits [125:131]). + + +@smallexample +@exdent vector unsigned __int128 vec_mule (vector unsigned long long, + vector unsigned long long); +@exdent vector signed __int128 vec_mule (vector signed long long, + vector signed long long); +@end smallexample + +Returns a vector containing a 128-bit integer result of multiplying the even doubleword +elements of the two inputs. + +@smallexample +@exdent vector unsigned __int128 vec_mulo (vector unsigned long long, + vector unsigned long long); +@exdent vector signed __int128 vec_mulo (vector signed long long, + vector signed long long); +@end smallexample + +Returns a vector containing a 128-bit integer result of multiplying the odd doubleword +elements of the two inputs. 
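As a reading aid for the shift and multiply descriptions above, here is a small scalar model in plain C. It is not part of the patch and only sketches the documented behaviour under stated assumptions: the compiler provides unsigned __int128, the shift count has already been extracted from the second operand, and only its low 7 bits are used. The names make_u128, sl128, sr128, sra128 and mul64_wide are hypothetical.

```c
#include <stdint.h>

typedef unsigned __int128 u128;

/* Build a 128-bit value from two 64-bit halves (hypothetical helper). */
static u128 make_u128(uint64_t hi, uint64_t lo)
{
    return ((u128)hi << 64) | lo;
}

/* Shift left by the low 7 bits of the count. */
static u128 sl128(u128 x, unsigned count)
{
    return x << (count & 0x7F);
}

/* Logical shift right by the low 7 bits of the count. */
static u128 sr128(u128 x, unsigned count)
{
    return x >> (count & 0x7F);
}

/* Arithmetic shift right: vacated bits are filled with copies of the
   leftmost bit, whatever the declared signedness of the operand. */
static u128 sra128(u128 x, unsigned count)
{
    unsigned n = count & 0x7F;
    u128 logical = x >> n;

    if (n == 0 || (x >> 127) == 0)
        return logical;
    return logical | (~(u128)0 << (128 - n));
}

/* 64 x 64 -> 128-bit unsigned product, the per-element operation
   behind the even/odd doubleword multiplies. */
static u128 mul64_wide(uint64_t a, uint64_t b)
{
    return (u128)a * b;
}
```

With the inputs from int_128bit-runnable.c, sra128 applied to 0xAABBCCDDEEFF1122_1234567890ABCDEF with a count of 48 gives 0xFFFFFFFFFFFFAABB_CCDDEEFF11221234, the value that test expects from vec_sra on an unsigned operand.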
+ +@smallexample +@exdent vector unsigned __int128 vec_div (vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_div (vector signed __int128, + vector signed __int128); +@end smallexample + +Returns the result of dividing the first operand by the second operand. An attempt to +divide any value by zero or to divide the most negative signed 128-bit integer by +negative one results in an undefined value. + +@smallexample +@exdent vector unsigned __int128 vec_dive (vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_dive (vector signed __int128, + vector signed __int128); +@end smallexample + +The result is produced by shifting the first input left by 128 bits and dividing by the +second. If an attempt is made to divide by zero or the result is larger than 128 bits, +the result is undefined. + +@smallexample +@exdent vector unsigned __int128 vec_mod (vector unsigned __int128, + vector unsigned __int128); +@exdent vector signed __int128 vec_mod (vector signed __int128, + vector signed __int128); +@end smallexample + +The result is the remainder of dividing the first input by the second input. + + +The following builtins perform 128-bit vector comparisons. The @code{vec_all_xx}, +@code{vec_any_xx}, and @code{vec_cmpxx} functions, where @code{xx} is one of the operations +@code{eq, ne, gt, lt, ge, le}, perform pairwise comparisons between the elements +at the same positions within their two vector arguments. The @code{vec_all_xx} +function returns a non-zero value if and only if all pairwise comparisons are true. The +@code{vec_any_xx} function returns a non-zero value if and only if at least one pairwise +comparison is true. The @code{vec_cmpxx} function returns a @code{vector bool __int128}, +within which each element consists of all ones to denote that the specified +logical comparison of the corresponding elements was true. 
Otherwise, the element of the +returned vector contains all zeros. + +@smallexample +vector bool __int128 vec_cmpeq (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmpeq (vector unsigned __int128, vector unsigned __int128); +vector bool __int128 vec_cmpne (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmpne (vector unsigned __int128, vector unsigned __int128); +vector bool __int128 vec_cmpgt (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmpgt (vector unsigned __int128, vector unsigned __int128); +vector bool __int128 vec_cmplt (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmplt (vector unsigned __int128, vector unsigned __int128); +vector bool __int128 vec_cmpge (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmpge (vector unsigned __int128, vector unsigned __int128); +vector bool __int128 vec_cmple (vector signed __int128, vector signed __int128); +vector bool __int128 vec_cmple (vector unsigned __int128, vector unsigned __int128); + +int vec_all_eq (vector signed __int128, vector signed __int128); +int vec_all_eq (vector unsigned __int128, vector unsigned __int128); +int vec_all_ne (vector signed __int128, vector signed __int128); +int vec_all_ne (vector unsigned __int128, vector unsigned __int128); +int vec_all_gt (vector signed __int128, vector signed __int128); +int vec_all_gt (vector unsigned __int128, vector unsigned __int128); +int vec_all_lt (vector signed __int128, vector signed __int128); +int vec_all_lt (vector unsigned __int128, vector unsigned __int128); +int vec_all_ge (vector signed __int128, vector signed __int128); +int vec_all_ge (vector unsigned __int128, vector unsigned __int128); +int vec_all_le (vector signed __int128, vector signed __int128); +int vec_all_le (vector unsigned __int128, vector unsigned __int128); + +int vec_any_eq (vector signed __int128, vector signed __int128); +int vec_any_eq 
(vector unsigned __int128, vector unsigned __int128); +int vec_any_ne (vector signed __int128, vector signed __int128); +int vec_any_ne (vector unsigned __int128, vector unsigned __int128); +int vec_any_gt (vector signed __int128, vector signed __int128); +int vec_any_gt (vector unsigned __int128, vector unsigned __int128); +int vec_any_lt (vector signed __int128, vector signed __int128); +int vec_any_lt (vector unsigned __int128, vector unsigned __int128); +int vec_any_ge (vector signed __int128, vector signed __int128); +int vec_any_ge (vector unsigned __int128, vector unsigned __int128); +int vec_any_le (vector signed __int128, vector signed __int128); +int vec_any_le (vector unsigned __int128, vector unsigned __int128); +@end smallexample + + @node PowerPC Hardware Transactional Memory Built-in Functions @subsection PowerPC Hardware Transactional Memory Built-in Functions GCC provides two interfaces for accessing the Hardware Transactional diff --git a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c new file mode 100644 index 00000000000..c84494fc28d --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c @@ -0,0 +1,2254 @@ +/* { dg-do run } */ +/* { dg-require-effective-target power10_hw } */ +/* { dg-options "-mdejagnu-cpu=power10" } */ + + +/* Check that the expected 128-bit instructions are generated if the processor + supports the 128-bit integer instructions. 
*/ +/* { dg-final { scan-assembler-times {\mvextsd2q\M} 2 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvslq\M} 2 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvsrq\M} 2 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvsraq\M} 2 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvrlq\M} 2 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvrlqnm\M} 2 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvrlqmi\M} 2 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvcmpuq\M} 0 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvcmpsq\M} 0 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvcmpequq\M} 0 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvcmpequq.\M} 16 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 0 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvcmpgtsq.\M} 16 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 0 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvcmpgtuq.\M} 16 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvmuleud\M} 1 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvmuloud\M} 1 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvmulesd\M} 1 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvmulosd\M} 1 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvdivsq\M} 1 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvdivuq\M} 1 { target { ppc_native_128bit } } } } */ +/* { dg-final { 
scan-assembler-times {\mvdivesq\M} 1 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvdiveuq\M} 1 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvmodsq\M} 1 { target { ppc_native_128bit } } } } */ +/* { dg-final { scan-assembler-times {\mvmoduq\M} 1 { target { ppc_native_128bit } } } } */ + +#include + +#define DEBUG 0 + +#if DEBUG +#include +#include + + +void print_i128(__int128_t val) +{ + printf(" %lld %llu (0x%llx %llx)", + (signed long long)(val >> 64), + (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF), + (unsigned long long)(val >> 64), + (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF)); +} +#endif + +void abort (void); + +int main () +{ + int i, result_int; + + __int128_t arg1, result; + __uint128_t uarg2; + + vector signed long long int vec_arg1_di, vec_arg2_di; + vector unsigned long long int vec_uarg1_di, vec_uarg2_di, vec_uarg3_di; + vector unsigned long long int vec_uresult_di; + vector unsigned long long int vec_uexpected_result_di; + + __int128_t expected_result; + __uint128_t uexpected_result; + + vector __int128_t vec_arg1, vec_arg2, vec_result; + vector __uint128_t vec_uarg1, vec_uarg2, vec_uarg3, vec_uresult; + vector bool __int128 vec_result_bool; + + /* sign extend double to 128-bit integer */ + vec_arg1_di[0] = 1000; + vec_arg1_di[1] = -123456; + + expected_result = 1000; + + vec_result = vec_signextq (vec_arg1_di); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_signextq ((long long) %lld) = ", vec_arg1_di[0]); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1_di[0] = -123456; + vec_arg1_di[1] = 1000; + + expected_result = -123456; + + vec_result = vec_signextq (vec_arg1_di); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_signextq ((long long) %lld) = ", vec_arg1_di[0]); + print_i128(vec_result[0]); + 
printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* test shift 128-bit integers. + Note, the shift amount is given by the lower 7 bits of the second argument. */ + vec_arg1[0] = 3; + vec_uarg2[0] = 2; + expected_result = vec_arg1[0]*4; + + vec_result = vec_sl (vec_arg1, vec_uarg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_sl(int128, uint128): "); + print_i128(vec_arg1[0]); + printf(" << %lld", vec_uarg2[0] & 0xFF); + printf(" = "); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + arg1 = 3; + uarg2 = 4; + expected_result = arg1*16; + + result = arg1 << uarg2; + + if (result != expected_result) { +#if DEBUG + printf("ERROR: int128 << uint128: "); + print_i128(arg1); + printf(" << %lld", uarg2 & 0xFF); + printf(" = "); + print_i128(result); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 3; + vec_uarg2[0] = 2; + uexpected_result = vec_uarg1[0]*4; + + vec_uresult = vec_sl (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_sl(uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" << %lld", vec_uarg2[0] & 0xFF); + printf(" = "); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12; + vec_uarg2[0] = 2; + expected_result = vec_arg1[0]/4; + + vec_result = vec_sr (vec_arg1, vec_uarg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_sr(int128, uint128): "); + print_i128(vec_arg1[0]); + printf(" >> %lld", vec_uarg2[0] & 0xFF); + printf(" = "); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + 
print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 48; + vec_uarg2[0] = 2; + uexpected_result = vec_uarg1[0]/4; + + vec_uresult = vec_sr (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_sr(uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" >> %lld", vec_uarg2[0] & 0xFF); + printf(" = "); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + arg1 = 48; + uarg2 = 4; + expected_result = arg1/16; + + result = arg1 >> uarg2; + + if (result != expected_result) { +#if DEBUG + printf("ERROR: int128 >> uint128: "); + print_i128(arg1); + printf(" >> %lld", uarg2 & 0xFF); + printf(" = "); + print_i128(result); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 0x1234567890ABCDEFULL; + vec_arg1[0] = (vec_arg1[0] << 64) | 0xAABBCCDDEEFF1122ULL; + vec_uarg2[0] = 32; + expected_result = 0x0000000012345678ULL; + expected_result = (expected_result << 64) | 0x90ABCDEFAABBCCDDULL; + + vec_result = vec_sra (vec_arg1, vec_uarg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_sra(int128, uint128): "); + print_i128(vec_arg1[0]); + printf(" >> %lld = \n", vec_uarg2[0]); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 0xAABBCCDDEEFF1122ULL; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 0x1234567890ABCDEFULL; + vec_uarg2[0] = 48; + uexpected_result = 0xFFFFFFFFFFFFAABBLL; + uexpected_result = (uexpected_result << 64) | 0xCCDDEEFF11221234ULL; + + vec_uresult = vec_sra (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_sra(uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" 
>> %lld = \n", vec_uarg2[0] & 0xFF); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 0x1234567890ABCDEFULL; + vec_arg1[0] = (vec_arg1[0] << 64) | 0xAABBCCDDEEFF1122ULL; + vec_uarg2[0] = 32; + expected_result = 0x90ABCDEFAABBCCDDULL; + expected_result = (expected_result << 64) | 0xEEFF112212345678ULL; + + vec_result = vec_rl (vec_arg1, vec_uarg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_rl(int128, uint128): "); + print_i128(vec_arg1[0]); + printf(" >> %lld = \n", vec_uarg2[0]); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 0xAABBCCDDEEFF1122ULL; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 0x1234567890ABCDEFULL; + vec_uarg2[0] = 48; + uexpected_result = 0x11221234567890ABULL; + uexpected_result = (uexpected_result << 64) | 0xCDEFAABBCCDDEEFFULL; + + vec_uresult = vec_rl (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_rl(uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" >> %lld = \n", vec_uarg2[0]); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 0x1234567890ABCDEFULL; + vec_arg1[0] = (vec_arg1[0] << 64) | 0xAABBCCDDEEFF1122ULL; + vec_uarg2[0] = 32 << (63-55) | 95 << (63-63); + vec_uarg3[0] = 32; + expected_result = 0xAABBCCDDULL; + expected_result = (expected_result << 64) | 0xEEFF112200000000ULL; + + vec_result = vec_rlnm (vec_arg1, vec_uarg2, vec_uarg3); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_rlnm(int128, uint128, uint128): "); + print_i128(vec_arg1[0]); + printf(" << %lld = \n", vec_uarg2[0] & 0xFF); + print_i128(vec_result[0]); + printf("\n does 
not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 0xAABBCCDDEEFF1122ULL; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 0x1234567890ABCDEFULL; + vec_uarg2[0] = 8 << (63-55) | 119 << (63-63); + vec_uarg3[0] = 48; + + uexpected_result = 0x00221234567890ABULL; + uexpected_result = (uexpected_result << 64) | 0xCDEFAABBCCDDEE00ULL; + + vec_uresult = vec_rlnm (vec_uarg1, vec_uarg2, vec_uarg3); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_rlnm(uint128, uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" << %lld = \n", vec_uarg2[0] & 0xFF); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 0x1234567890ABCDEFULL; + vec_arg1[0] = (vec_arg1[0] << 64) | 0xAABBCCDDEEFF1122ULL; + vec_arg2[0] = 0x000000000000DEADULL; + vec_arg2[0] = (vec_arg2[0] << 64) | 0x0000BEEF00000000ULL; + vec_uarg3[0] = 96 << 16 | 127 << 8 | 32; + expected_result = 0x000000000000DEADULL; + expected_result = (expected_result << 64) | 0x0000BEEF12345678ULL; + + vec_result = vec_rlmi (vec_arg1, vec_arg2, vec_uarg3); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_rlmi(int128, int128, uint128): "); + print_i128(vec_arg1[0]); + printf(" << %lld = \n", vec_uarg3[0] & 0xFF); + print_i128(vec_result[0]); + printf("\n does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 0xAABBCCDDEEFF1122ULL; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 0x1234567890ABCDEFULL; + vec_uarg2[0] = 0xDEAD000000000000ULL; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 0x000000000000BEEFULL; + vec_uarg3[0] = 16 << 16 | 111 << 8 | 48; + uexpected_result = 0xDEAD1234567890ABULL; + uexpected_result = (uexpected_result << 64) | 0xCDEFAABBCCDDBEEFULL; + + vec_uresult = vec_rlmi (vec_uarg1, vec_uarg2, 
vec_uarg3); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_rlmi(uint128, uint128, uint128): "); + print_i128(vec_uarg1[0]); + printf(" << %lld = \n", vec_uarg3[0] & 0xFF); + print_i128(vec_uresult[0]); + printf("\n does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* 128-bit compare tests, result is all 1's if true */ + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1[0] = 2468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + uexpected_result = 0xFFFFFFFFFFFFFFFFULL; + uexpected_result = (uexpected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpgt (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != uexpected_result) { +#if DEBUG + printf("ERROR: unsigned vec_cmpgt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpgt (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed vec_cmpgt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = -1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmpeq (vec_arg1, vec_arg2); + + if 
(vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR:not equal signed vec_cmpeq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpeq (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed equal vec_cmpeq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 12468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmpeq (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned not equal vec_cmpeq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpeq (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: equal unsigned vec_cmpeq ( "); + print_i128(vec_uarg1[0]); + 
printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 12468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpne (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned not equal vec_cmpne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmpne (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: equal unsigned vec_cmpne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = -1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpne (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR:not equal signed vec_cmpne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + 
print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmpne (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed equal vec_cmpne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 12468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmplt (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 > arg2 vec_cmplt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 1234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 12468; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmplt (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 < arg2 vec_cmplt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + 
vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmplt (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 = arg2 vec_cmplt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = -1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmplt (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 > arg2 vec_cmplt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -1234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 12468; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmplt (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 < arg2 vec_cmplt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + expected_result = 0x0ULL; + + vec_result_bool = vec_cmplt (vec_arg1, vec_arg2); + + 
if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_cmplt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 12468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmple (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 > arg2 vec_cmple ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 1234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 12468; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmple (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 < arg2 vec_cmple ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmple (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR:
unsigned arg1 = arg2 vec_cmple ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = -1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmple (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 > arg2 vec_cmple ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -1234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 12468; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmple (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 < arg2 vec_cmple ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmple (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_cmple ( "); + print_i128(vec_arg1[0]); + printf(", "); + 
print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 12468; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpge (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 > arg2 vec_cmpge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 1234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 12468; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmpge (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 < arg2 vec_cmpge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpge (vec_uarg1, vec_uarg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: unsigned arg1 = arg2 vec_cmpge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed."); +
print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = 12468; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = -1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpge (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 > arg2 vec_cmpge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -1234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 12468; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + expected_result = 0x0; + + vec_result_bool = vec_cmpge (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 < arg2 vec_cmpge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + expected_result = 0xFFFFFFFFFFFFFFFFULL; + expected_result = (expected_result << 64) | 0xFFFFFFFFFFFFFFFFULL; + + vec_result_bool = vec_cmpge (vec_arg1, vec_arg2); + + if (vec_result_bool[0] != expected_result) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_cmpge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed."); + print_i128(vec_result_bool[0]); + printf("\n Result does not match expected_result = "); + 
print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_eq (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_eq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_eq (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_eq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_eq (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_eq ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_eq (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_eq ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_ne (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_ne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + 
abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_ne (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_ne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_ne (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_ne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_ne (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_ne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_lt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_lt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_lt (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_lt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); 
+#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_lt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_lt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_lt (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_lt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_le (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_le ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_le (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_le ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_le (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_le ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = 
(vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_le (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_le ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_gt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_gt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_gt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_gt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_gt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_gt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_gt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_gt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 
4567; + vec_arg1 = vec_arg2; + + result_int = vec_all_ge (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_all_ge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_all_ge (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_all_ge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_all_ge (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_all_ge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_all_ge (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_all_ge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_eq (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_eq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] 
<< 64) | 4567; + + result_int = vec_any_eq (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_eq ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_eq (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_eq ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_eq (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_eq ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_ne (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_ne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_ne (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_ne ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_ne 
(vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_ne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_ne (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_ne ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_lt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_lt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_lt (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_lt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_lt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_lt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_lt (vec_uarg1, 
vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_lt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_gt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_gt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_gt (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_gt ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_gt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_gt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_gt (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_gt ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_le (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed 
arg1 = arg2 vec_any_le ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_le (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 vec_any_le ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_le (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_le ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_le (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_le ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + vec_arg1 = vec_arg2; + + result_int = vec_any_ge (vec_arg1, vec_arg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: signed arg1 = arg2 vec_any_ge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1[0] = -234; + vec_arg1[0] = (vec_arg1[0] << 64) | 4567; + vec_arg2[0] = 1234; + vec_arg2[0] = (vec_arg2[0] << 64) | 4567; + + result_int = vec_any_ge (vec_arg1, vec_arg2); + + if (result_int) { +#if DEBUG + printf("ERROR: signed arg1 != arg2 
vec_any_ge ( "); + print_i128(vec_arg1[0]); + printf(", "); + print_i128(vec_arg2[0]); + printf(") failed.\n\n"); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + vec_uarg1 = vec_uarg2; + + result_int = vec_any_ge (vec_uarg1, vec_uarg2); + + if (!result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 = uarg2 vec_any_ge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 234; + vec_uarg1[0] = (vec_uarg1[0] << 64) | 4567; + vec_uarg2[0] = 1234; + vec_uarg2[0] = (vec_uarg2[0] << 64) | 4567; + + result_int = vec_any_ge (vec_uarg1, vec_uarg2); + + if (result_int) { +#if DEBUG + printf("ERROR: unsigned uarg1 != uarg2 vec_any_ge ( "); + print_i128(vec_uarg1[0]); + printf(", "); + print_i128(vec_uarg2[0]); + printf(") failed.\n\n"); +#else + abort(); +#endif + } + + /* Vector multiply Even and Odd tests */ + vec_arg1_di[0] = 200; + vec_arg1_di[1] = 400; + vec_arg2_di[0] = 1234; + vec_arg2_di[1] = 4567; + expected_result = vec_arg1_di[0] * vec_arg2_di[0]; + + vec_result = vec_mule (vec_arg1_di, vec_arg2_di); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_mule (signed, signed) failed.\n"); + printf(" vec_arg1_di[0] = %lld\n", vec_arg1_di[0]); + printf(" vec_arg2_di[0] = %lld\n", vec_arg2_di[0]); + printf("Result = "); + print_i128(vec_result[0]); + printf("\nExpected Result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_arg1_di[0] = -200; + vec_arg1_di[1] = -400; + vec_arg2_di[0] = 1234; + vec_arg2_di[1] = 4567; + expected_result = vec_arg1_di[1] * vec_arg2_di[1]; + + vec_result = vec_mulo (vec_arg1_di, vec_arg2_di); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_mulo (signed, signed) failed.\n"); + printf(" vec_arg1_di[1] = %lld\n", vec_arg1_di[1]); + printf(" vec_arg2_di[1] = %lld\n",
vec_arg2_di[1]); + printf("Result = "); + print_i128(vec_result[0]); + printf("\nExpected Result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1_di[0] = 200; + vec_uarg1_di[1] = 400; + vec_uarg2_di[0] = 1234; + vec_uarg2_di[1] = 4567; + uexpected_result = vec_uarg1_di[0] * vec_uarg2_di[0]; + + vec_uresult = vec_mule (vec_uarg1_di, vec_uarg2_di); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_mule (unsigned, unsigned) failed.\n"); + printf(" vec_uarg1_di[0] = %llu\n", vec_uarg1_di[0]); + printf(" vec_uarg2_di[0] = %llu\n", vec_uarg2_di[0]); + printf("Result = "); + print_i128(vec_uresult[0]); + printf("\nExpected Result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1_di[0] = 200; + vec_uarg1_di[1] = 400; + vec_uarg2_di[0] = 1234; + vec_uarg2_di[1] = 4567; + uexpected_result = vec_uarg1_di[1] * vec_uarg2_di[1]; + + vec_uresult = vec_mulo (vec_uarg1_di, vec_uarg2_di); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_mulo (unsigned, unsigned) failed.\n"); + printf(" vec_uarg1_di[1] = %llu\n", vec_uarg1_di[1]); + printf(" vec_uarg2_di[1] = %llu\n", vec_uarg2_di[1]); + printf("Result = "); + print_i128(vec_uresult[0]); + printf("\nExpected Result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* Vector Divide Quadword */ + vec_arg1[0] = -12345678; + vec_arg2[0] = 2; + expected_result = -6172839; + + vec_result = vec_div (vec_arg1, vec_arg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_div (signed, signed) failed.\n"); + printf("vec_arg1[0] = "); + print_i128(vec_arg1[0]); + printf("\nvec_arg2[0] = "); + print_i128(vec_arg2[0]); + printf("\nResult = "); + print_i128(vec_result[0]); + printf("\nExpected result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 24680; + vec_uarg2[0] = 4;
+ uexpected_result = 6170; + + vec_uresult = vec_div (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_div (unsigned, unsigned) failed.\n"); + printf("vec_uarg1[0] = "); + print_i128(vec_uarg1[0]); + printf("\nvec_uarg2[0] = "); + print_i128(vec_uarg2[0]); + printf("\nResult = "); + print_i128(vec_uresult[0]); + printf("\nExpected result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* Vector Divide Extended Quadword */ + vec_arg1[0] = -20; // has 128-bit of zero concatenated onto it + vec_arg2[0] = 0x2000000000000000; + vec_arg2[0] = vec_arg2[0] << 64; + expected_result = -160; + + vec_result = vec_dive (vec_arg1, vec_arg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_dive (signed, signed) failed.\n"); + printf("vec_arg1[0] = "); + print_i128(vec_arg1[0]); + printf("\nvec_arg2[0] = "); + print_i128(vec_arg2[0]); + printf("\nResult = "); + print_i128(vec_result[0]); + printf("\nExpected result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 20; // has 128-bit of zero concatenated onto it + vec_uarg2[0] = 0x4000000000000000; + vec_uarg2[0] = vec_uarg2[0] << 64; + uexpected_result = 80; + + vec_uresult = vec_dive (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_dive (unsigned, unsigned) failed.\n"); + printf("vec_uarg1[0] = "); + print_i128(vec_uarg1[0]); + printf("\nvec_uarg2[0] = "); + print_i128(vec_uarg2[0]); + printf("\nResult = "); + print_i128(vec_uresult[0]); + printf("\nExpected result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + /* Vector modulo quad word */ + vec_arg1[0] = -12345675; + vec_arg2[0] = 2; + expected_result = -1; + + vec_result = vec_mod (vec_arg1, vec_arg2); + + if (vec_result[0] != expected_result) { +#if DEBUG + printf("ERROR: vec_mod (signed, signed) failed.\n"); 
+ printf("vec_arg1[0] = "); + print_i128(vec_arg1[0]); + printf("\nvec_arg2[0] = "); + print_i128(vec_arg2[0]); + printf("\nResult = "); + print_i128(vec_result[0]); + printf("\nExpected result = "); + print_i128(expected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + vec_uarg1[0] = 24685; + vec_uarg2[0] = 4; + uexpected_result = 1; + + vec_uresult = vec_mod (vec_uarg1, vec_uarg2); + + if (vec_uresult[0] != uexpected_result) { +#if DEBUG + printf("ERROR: vec_mod (unsigned, unsigned) failed.\n"); + printf("vec_uarg1[0] = "); + print_i128(vec_uarg1[0]); + printf("\nvec_uarg2[0] = "); + print_i128(vec_uarg2[0]); + printf("\nResult = "); + print_i128(vec_uresult[0]); + printf("\nExpected result = "); + print_i128(uexpected_result); + printf("\n\n"); +#else + abort(); +#endif + } + + return 0; +} -- 2.25.1
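For reference, the scalar behaviour that the compare, divide and modulo checks above expect can be sketched in plain C using the `__int128` extension available in GCC and Clang. This is only an illustrative sketch and not part of the patch: the helper names (`make_i128`, `make_u128`, `cmplt_mask_s`, `cmplt_mask_u`, `self_check`) are hypothetical, and it models the expected results only, not the Power instructions or the `vec_*` builtins themselves.

```c
#include <assert.h>
#include <stdint.h>

typedef __int128 i128;           /* GCC/Clang extension, assumed available */
typedef unsigned __int128 u128;

/* Hypothetical helper: build a signed 128-bit value from a high and a low
   half, mirroring the test's "x = hi; x = (x << 64) | lo;" pattern.  The
   shift is done on the unsigned type so a negative high half is not
   left-shifted as a signed value. */
static i128 make_i128(int64_t hi, uint64_t lo)
{
  return (i128)(((u128)(i128)hi << 64) | lo);
}

/* Hypothetical helper: unsigned counterpart of make_i128. */
static u128 make_u128(uint64_t hi, uint64_t lo)
{
  return ((u128)hi << 64) | lo;
}

/* The tests expect a compare to yield all ones when true and zero when
   false; these two helpers model that for "less than". */
static u128 cmplt_mask_s(i128 a, i128 b)
{
  return a < b ? ~(u128)0 : (u128)0;
}

static u128 cmplt_mask_u(u128 a, u128 b)
{
  return a < b ? ~(u128)0 : (u128)0;
}

/* Re-check a few of the operand/result pairs used in the test file. */
static void self_check(void)
{
  /* signed: -1234:4567 < 12468:4567 gives all ones */
  assert(cmplt_mask_s(make_i128(-1234, 4567), make_i128(12468, 4567))
         == ~(u128)0);
  /* signed: 12468:4567 < -1234:4567 gives zero */
  assert(cmplt_mask_s(make_i128(12468, 4567), make_i128(-1234, 4567)) == 0);
  /* unsigned: equal operands give zero */
  assert(cmplt_mask_u(make_u128(1234, 4567), make_u128(1234, 4567)) == 0);
  /* divide and modulo truncate toward zero, as the expected values assume */
  assert((i128)-12345678 / 2 == -6172839);
  assert((i128)-12345675 % 2 == -1);
  assert((u128)24680 / 4 == 6170);
  assert((u128)24685 % 4 == 1);
}
```

Calling `self_check()` from a small `main` on any target with `__int128` support is one way to sanity-check the expected values independently of the new builtins. The extended divide (`vec_dive`) is left out because its dividend is 256 bits wide and has no direct scalar equivalent here.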