From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 13FBE3858D37 for ; Thu, 6 Oct 2022 21:30:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 13FBE3858D37 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vnet.ibm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vnet.ibm.com Received: from pps.filterd (m0187473.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 296Jg6it012547; Thu, 6 Oct 2022 21:30:01 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=message-id : subject : from : to : cc : date : content-type : mime-version : content-transfer-encoding; s=pp1; bh=5fn0mDGvepgfYxAOumvzJeoDgAJNgCnPqSgLpRCD0gY=; b=HlOKHptbM0l5au7pKd4CXuc2CuQ/b5DEzosjIEu+o60J314ghQGhyKpYslMakuJ/J7Ns xNdYkvc1pR2tLzUavxp9A1S9siqwOTEWVClyfYgWhxUqI/xCxSJN/PGDUus9z/84qAuU MMRUQA1tsPYLjN9uaItQwqD8HQnUJjLzAGaw/cR9l9IE5oDyZ/RDz41w4li7aMbENO0R YlDxM8AX93abH1vf7NpjwFDMHg+S3qT+4n7A5Wctr81nJkM2OkSU+y5jZR3eZl0XMVeY Q5VAbmbniA5QtGsQSGANyr6wsQBu0VdJ/XnLPliozYCBNQTUH3mAWrVZNrjxjgNeqj9r 0g== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3k25dab4bw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 06 Oct 2022 21:30:01 +0000 Received: from m0187473.ppops.net (m0187473.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 296L20Nh016969; Thu, 6 Oct 2022 21:30:01 GMT Received: from ppma02wdc.us.ibm.com (aa.5b.37a9.ip4.static.sl-reverse.com [169.55.91.170]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3k25dab4b8-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 06 Oct 2022 21:30:01 +0000 Received: from pps.filterd (ppma02wdc.us.ibm.com [127.0.0.1]) by ppma02wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 296LL78U000678; Thu, 6 Oct 2022 21:29:59 GMT Received: from b03cxnp07029.gho.boulder.ibm.com (b03cxnp07029.gho.boulder.ibm.com [9.17.130.16]) by ppma02wdc.us.ibm.com with ESMTP id 3jxd6a506v-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 06 Oct 2022 21:29:59 +0000 Received: from smtpav06.dal12v.mail.ibm.com ([9.208.128.130]) by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 296LTw9112386998 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 6 Oct 2022 21:29:58 GMT Received: from smtpav06.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id BE07858059; Thu, 6 Oct 2022 21:29:58 +0000 (GMT) Received: from smtpav06.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 29E9E58043; Thu, 6 Oct 2022 21:29:58 +0000 (GMT) Received: from sig-9-65-213-31.ibm.com (unknown [9.65.213.31]) by smtpav06.dal12v.mail.ibm.com (Postfix) with ESMTP; Thu, 6 Oct 2022 21:29:58 +0000 (GMT) Message-ID: Subject: [PATCH, rs6000] Fix addg6s builtin with long long parameters. (PR100693) From: will schmidt To: "'GCC Patches'" Cc: Segher Boessenkool , "Kewen.Lin" , David Edelsohn Date: Thu, 06 Oct 2022 16:29:57 -0500 Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.28.5 (3.28.5-18.el8) Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 X-Proofpoint-GUID: S8yXywvM5PhBprYfgaD3e5qqqTvBzsQR X-Proofpoint-ORIG-GUID: YG9gU-yllrGUKfUeNcC5vOkbnYAniJpm X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.528,FMLib:17.11.122.1 definitions=2022-10-06_04,2022-10-06_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 suspectscore=0 bulkscore=0 adultscore=0 mlxlogscore=874 spamscore=0 mlxscore=0 phishscore=0 lowpriorityscore=0 priorityscore=1501 malwarescore=0 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210060124 X-Spam-Status: No, score=-10.6 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: [PATCH, rs6000] Fix addg6s builtin with long long parameters. (PR100693) Hi, As reported in PR 100693, attempts to use __builtin_addg6s with long long arguments result in truncated results. Since the int and long long types can be coerced into each other, (documented further near the rs6000-c.cc change), this is handled by adding a builtin overload (ADDG6S_OV), and the addition of some special handling in altivec_resolve_overloaded_builtin() to map the calls to addg6s_32 or addg6s_64; similar to how the SCAL_CMPB builtins are currently handled. This has sniff-tested cleanly. I'm seeing a regression failure show up in testsuite/g++.dg/modules/adl-3*.c; which seems entirely unrelated to the content in this change. I'm poking at that a bit more to see if I can tell the what/why for that. OK for trunk? Thanks, -Will gcc/ PR target/100693 * config/rs6000/rs6000-builtins.def ([POWER7]): Replace bif-name __builtin_addg6s with bif-name __builtin_addg6s_32. ([POWER7-64]): New bif-name __builtin_addg6s_64. * config/rs6000/rs6000-c.cc (altivec_resolve_overloaded_builtin): Add handler mapping RS6000_OVLD_ADDG6S_OV to RS6000_BIF_ADDG6S and RS6000_BIF_ADDG6S_32. * config/rs6000/rs6000-overload.def (ADDG6S_OV): Add overloaded entry __builtin_addg6s mapped to ADDG6S_32 and ADDG6S. * config/rs6000/rs6000.md ("addg6s", UNSPEC_ADDG6S): Replace with ("addg6s3") and rework. * doc/extend.texi (__builtin_addg6s): Add documentation for __builtin_addg6s with unsigned long long parameters. gcc/testsuite/ * testsuite/gcc.target/powerpc/pr100693-compile.c: New. * testsuite/gcc.target/powerpc/pr100693-run.c: New. diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def index f76f54793d73..11050e4c26d5 100644 --- a/gcc/config/rs6000/rs6000-builtins.def +++ b/gcc/config/rs6000/rs6000-builtins.def @@ -2010,12 +2010,13 @@ XXSPLTD_V2DI vsx_xxspltd_v2di {} ; Power7 builtins (ISA 2.06). [power7] - const unsigned int __builtin_addg6s (unsigned int, unsigned int); - ADDG6S addg6s {} + + const unsigned int __builtin_addg6s_32 (unsigned int, unsigned int); + ADDG6S_32 addg6ssi3 {} const signed long __builtin_bpermd (signed long, signed long); BPERMD bpermd_di {32bit} const unsigned int __builtin_cbcdtd (unsigned int); @@ -2041,10 +2042,14 @@ UNPACK_V1TI unpackv1ti {} ; Power7 builtins requiring 64-bit GPRs (even with 32-bit addressing). [power7-64] + + const unsigned long __builtin_addg6s_64 (unsigned long, unsigned long); + ADDG6S addg6sdi3 {no32bit} + const signed long long __builtin_divde (signed long long, signed long long); DIVDE dive_di {} const unsigned long long __builtin_divdeu (unsigned long long, \ unsigned long long); diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc index 566094626293..28e8b6761ce5 100644 --- a/gcc/config/rs6000/rs6000-c.cc +++ b/gcc/config/rs6000/rs6000-c.cc @@ -1919,10 +1919,35 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl, instance_code, fcode, types, args); if (call != error_mark_node) return call; break; } + /* We need to special case __builtin_addg6s because the overloaded + forms of this function take (unsigned int, unsigned int) or + (unsigned long long, unsigned long long). Since C conventions + allow the respective argument types to be implicitly coerced into + each other, the default handling does not provide adequate + discrimination between the desired forms of the function. */ + case RS6000_OVLD_ADDG6S_OV: + { + machine_mode arg1_mode = TYPE_MODE (types[0]); + machine_mode arg2_mode = TYPE_MODE (types[1]); + + /* If any supplied arguments are wider than 32 bits, resolve to + 64-bit variant of built-in function. */ + if (GET_MODE_PRECISION (arg1_mode) > 32 + || GET_MODE_PRECISION (arg2_mode) > 32) + instance_code = RS6000_BIF_ADDG6S; + else + instance_code = RS6000_BIF_ADDG6S_32; + + tree call = find_instance (&unsupported_builtin, &instance, + instance_code, fcode, types, args); + if (call != error_mark_node) + return call; + break; + } case RS6000_OVLD_VEC_VSIE: { machine_mode arg1_mode = TYPE_MODE (types[0]); /* If supplied first argument is wider than 64 bits, resolve to diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def index 44e2945aaa0e..41b74c0c1500 100644 --- a/gcc/config/rs6000/rs6000-overload.def +++ b/gcc/config/rs6000/rs6000-overload.def @@ -193,10 +193,16 @@ unsigned int __builtin_cmpb (unsigned int, unsigned int); CMPB_32 unsigned long long __builtin_cmpb (unsigned long long, unsigned long long); CMPB +[ADDG6S_OV, SKIP, __builtin_addg6s] + unsigned int __builtin_addg6s (unsigned int, unsigned int); + ADDG6S_32 + unsigned long long __builtin_addg6s (unsigned long long, unsigned long long); + ADDG6S + [VEC_ABS, vec_abs, __builtin_vec_abs] vsc __builtin_vec_abs (vsc); ABS_V16QI vss __builtin_vec_abs (vss); ABS_V8HI diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index ad5a4cf2ef83..b172ffd097ab 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -14534,14 +14534,14 @@ (define_peephole2 operands[5] = change_address (mem, mode, new_addr); }) ;; Miscellaneous ISA 2.06 (power7) instructions -(define_insn "addg6s" - [(set (match_operand:SI 0 "register_operand" "=r") - (unspec:SI [(match_operand:SI 1 "register_operand" "r") - (match_operand:SI 2 "register_operand" "r")] +(define_insn "addg6s3" + [(set (match_operand:GPR 0 "register_operand" "=r") + (unspec:GPR [(match_operand:GPR 1 "register_operand" "r") + (match_operand:GPR 2 "register_operand" "r")] UNSPEC_ADDG6S))] "TARGET_POPCNTD" "addg6s %0,%1,%2" [(set_attr "type" "integer")]) diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index a5afb467d235..d393062e790f 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -18139,10 +18139,11 @@ addition to the @option{-maltivec}, @option{-mpopcntd}, and @option{-mvsx} options. The following basic built-in functions require @option{-mpopcntd}: @smallexample unsigned int __builtin_addg6s (unsigned int, unsigned int); +unsigned long long __builtin_addg6s (unsigned long long, unsigned long long); long long __builtin_bpermd (long long, long long); unsigned int __builtin_cbcdtd (unsigned int); unsigned int __builtin_cdtbcd (unsigned int); long long __builtin_divde (long long, long long); unsigned long long __builtin_divdeu (unsigned long long, unsigned long long); diff --git a/gcc/testsuite/gcc.target/powerpc/pr100693-compile.c b/gcc/testsuite/gcc.target/powerpc/pr100693-compile.c new file mode 100644 index 000000000000..113936a3d9f8 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr100693-compile.c @@ -0,0 +1,30 @@ +/* { dg-do compile { target { powerpc*-*-linux* } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-mdejagnu-cpu=power7 -O3" } */ +/* { dg-final { scan-assembler-times {\maddg6s\M} 4 } } */ +/* { dg-final { scan-assembler-not "bl __builtin" } } */ + +/* Test case for the addg6s builtin, exercising both + unsigned int and unsigned long long arguments. */ + +unsigned long long test2_ull (unsigned long long a, unsigned long long b) +{ + return __builtin_addg6s (a, b); +} + +unsigned int test1_ui (unsigned int a, unsigned int b) +{ + return __builtin_addg6s (a, b); +} + +unsigned int test3_ui (unsigned int *a, unsigned int *b) +{ + return __builtin_addg6s(*a, *b); +} + +unsigned long long test4_ull (unsigned long long *a, unsigned long long *b) +{ + return __builtin_addg6s(*a, *b); +} + diff --git a/gcc/testsuite/gcc.target/powerpc/pr100693-run.c b/gcc/testsuite/gcc.target/powerpc/pr100693-run.c new file mode 100644 index 000000000000..3694010b10ae --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr100693-run.c @@ -0,0 +1,78 @@ +/* { dg-do run { target { powerpc*-*-linux* } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-mdejagnu-cpu=power7 -O3" } */ + +/* Test case for the addg6s builtin, exercising both + unsigned int and unsigned long long arguments. */ + +#include +#include + +unsigned int test_addg6s_int (unsigned int a, unsigned int b) +{ + return __builtin_addg6s (a, b); +} + +unsigned long long test_addg6s_longlong (unsigned long long a, unsigned long long b) +{ + return __builtin_addg6s (a, b); +} + +/* Table of expected values. */ + +unsigned int exp_i[] = { +0x66666660, +0x66666606, +0x66666066, +0x66660666, +0x66606666, +0x66066666, +0x60666666, +0x06666666, +0x77777777 +}; + +unsigned long long exp_ll[] = { +0x6666666666666660, +0x6666666666666606, +0x6666666666666066, +0x6666666666660666, +0x6666666666606666, +0x6666666666066666, +0x6666666660666666, +0x6666666606666666, +0x6666666066666666, +0x6666660666666666, +0x6666606666666666, +0x6666066666666666, +0x6660666666666666, +0x6606666666666666, +0x6066666666666666, +0x0666666666666666, +0x7777777777777777 +}; + +int main() { + unsigned int intresult; + unsigned long long longresult; + int idx; + int fail=0; + + for (idx=0; idx<8; idx++) { + intresult = test_addg6s_int (0x01<<4*idx, 0x0f<<4*idx); + if (exp_i[idx] != intresult) + printf("index:%2d Got:%8x Expected:%8x %d\n", + idx, intresult, exp_i[idx], ++fail); + } + + for (idx=0; idx<16; idx++) { + longresult = test_addg6s_longlong (0x01ULL<<4*idx, 0x0fULL<<4*idx); + if (exp_ll[idx] != longresult) + printf("index:%2d Got:%16llx Expected:%16llx %d\n", + idx, longresult, exp_ll[idx], ++fail); + } + if (fail) + abort(); +} +