From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 4538 invoked by alias); 8 Feb 2018 16:38:18 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 4529 invoked by uid 89); 8 Feb 2018 16:38:18 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-9.7 required=5.0 tests=AWL,BAYES_05,GIT_PATCH_2,GIT_PATCH_3,KAM_ASCII_DIVIDERS,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 spammy=mulld, divdu, xxlxor, vctsxs X-HELO: mx0a-001b2d01.pphosted.com Received: from mx0a-001b2d01.pphosted.com (HELO mx0a-001b2d01.pphosted.com) (148.163.156.1) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 08 Feb 2018 16:38:16 +0000 Received: from pps.filterd (m0098409.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w18Gc9I0090443 for ; Thu, 8 Feb 2018 11:38:14 -0500 Received: from e37.co.us.ibm.com (e37.co.us.ibm.com [32.97.110.158]) by mx0a-001b2d01.pphosted.com with ESMTP id 2g0s8nbdv7-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Thu, 08 Feb 2018 11:38:14 -0500 Received: from localhost by e37.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 8 Feb 2018 09:38:13 -0700 Received: from b03cxnp08026.gho.boulder.ibm.com (9.17.130.18) by e37.co.us.ibm.com (192.168.1.137) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Thu, 8 Feb 2018 09:38:11 -0700 Received: from b03ledav001.gho.boulder.ibm.com (b03ledav001.gho.boulder.ibm.com [9.17.130.232]) by b03cxnp08026.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w18GcAPu11206976; Thu, 8 Feb 2018 09:38:10 -0700 Received: from b03ledav001.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 84A456E04A; Thu, 8 Feb 2018 09:38:10 -0700 (MST) Received: from otta.rchland.ibm.com (unknown [9.10.86.175]) by b03ledav001.gho.boulder.ibm.com (Postfix) with ESMTP id 0301C6E038; Thu, 8 Feb 2018 09:38:09 -0700 (MST) Subject: Re: [PATCH, rs6000] Fix PR83926, ICE using __builtin_vsx_{div,udiv,mul}_2di builtins From: Peter Bergner To: David Edelsohn , Segher Boessenkool Cc: GCC Patches , "William J. Schmidt" , Will Schmidt References: <33c2d1d6-87ca-3273-c1fd-5e7d1a1998d3@vnet.ibm.com> <36494980-cd4f-5f5d-13e9-6a9e12c9ac23@vnet.ibm.com> <34207e92-b45c-2695-2229-c9e69f2a6fb0@vnet.ibm.com> Date: Thu, 08 Feb 2018 16:38:00 -0000 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: <34207e92-b45c-2695-2229-c9e69f2a6fb0@vnet.ibm.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 x-cbid: 18020816-0024-0000-0000-000017E86B49 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00008497; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000249; SDB=6.00986793; UDB=6.00500824; IPR=6.00766152; BA=6.00005820; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00019444; XFM=3.00000015; UTC=2018-02-08 16:38:12 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18020816-0025-0000-0000-00004EA2CD09 Message-Id: <2cacd8de-94f2-5c2b-a299-85563a0d7e96@vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2018-02-08_08:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 impostorscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1709140000 definitions=main-1802080191 X-IsSubscribed: yes X-SW-Source: 2018-02/txt/msg00439.txt.bz2 On 2/6/18 10:36 AM, Peter Bergner wrote: > On 2/6/18 10:20 AM, David Edelsohn wrote: >> Do the gen_XXXdi3 calls work if you use SDI iterator instead of GPR >> iterator, as Segher suggested? > > Well it works _if_ we use the first patch that changes the gen_* > patterns. If we go this route, I agree we should use the SDI > iterator instead of GPR. Actually, my bad. While bootstrapping this on a BE system, we get an error when we attempt a 64-bit multiply in 32-bit mode. In this case, the gen_muldi3() pattern calls expand_mult(DImode, ...) and the automatic expand machinery notices the gen_muldi3() now allows DImode in the !TARGET_POWERPC64 case and then calls gen_muldi3() to emit the multiply and we go into infinite recursion. We don't have that problem in the div/udiv case, because we call out to the lib routines, so no recursion. Given this, I think we should probably go with the patch that modifies vsx.md and guards the calls to gen_{div,udiv,mul}di3() with a TARGET_POWERPC64 test. For completeness, that patch again is below with one testsuite addition. The builtins-1-be.c test case must never have been tested in 32-bit mode, since it was always ICEing from the beginning. I've fixed it to run in both 32-bit and 64-bit modes and in 32-bit mode, it now correctly scans for the 64-bit div/udiv/mul cases this patch generates. Again, this passed bootstrap and regtesting on powerpc64le-linux as well as on powerpc64-linux and running the testsuite in both 32-bit and 64-bit modes. Ok for trunk? Peter gcc/ PR target/83926 * config/rs6000/vsx.md (vsx_mul_v2di): Handle generating a 64-bit multiply in 32-bit mode. (vsx_div_v2di): Handle generating a 64-bit signed divide in 32-bit mode. (vsx_udiv_v2di): Handle generating a 64-bit unsigned divide in 32-bit mode. gcc/testsuite/ PR target/83926 * gcc.target/powerpc/pr83926.c: New test. * gcc.target/powerpc/builtins-1-be.c: Filter out gimple folding disabled message. Fix test for running in 32-bit mode. Index: gcc/config/rs6000/vsx.md =================================================================== --- gcc/config/rs6000/vsx.md (revision 257390) +++ gcc/config/rs6000/vsx.md (working copy) @@ -1650,10 +1650,22 @@ rtx op5 = gen_reg_rtx (DImode); emit_insn (gen_vsx_extract_v2di (op3, op1, GEN_INT (0))); emit_insn (gen_vsx_extract_v2di (op4, op2, GEN_INT (0))); - emit_insn (gen_muldi3 (op5, op3, op4)); + if (TARGET_POWERPC64) + emit_insn (gen_muldi3 (op5, op3, op4)); + else + { + rtx ret = expand_mult (DImode, op3, op4, NULL, 0, false); + emit_move_insn (op5, ret); + } emit_insn (gen_vsx_extract_v2di (op3, op1, GEN_INT (1))); emit_insn (gen_vsx_extract_v2di (op4, op2, GEN_INT (1))); - emit_insn (gen_muldi3 (op3, op3, op4)); + if (TARGET_POWERPC64) + emit_insn (gen_muldi3 (op3, op3, op4)); + else + { + rtx ret = expand_mult (DImode, op3, op4, NULL, 0, false); + emit_move_insn (op3, ret); + } emit_insn (gen_vsx_concat_v2di (op0, op5, op3)); DONE; }" @@ -1688,10 +1700,30 @@ rtx op5 = gen_reg_rtx (DImode); emit_insn (gen_vsx_extract_v2di (op3, op1, GEN_INT (0))); emit_insn (gen_vsx_extract_v2di (op4, op2, GEN_INT (0))); - emit_insn (gen_divdi3 (op5, op3, op4)); + if (TARGET_POWERPC64) + emit_insn (gen_divdi3 (op5, op3, op4)); + else + { + rtx libfunc = optab_libfunc (sdiv_optab, DImode); + rtx target = emit_library_call_value (libfunc, + op5, LCT_NORMAL, DImode, + op3, DImode, + op4, DImode); + emit_move_insn (op5, target); + } emit_insn (gen_vsx_extract_v2di (op3, op1, GEN_INT (1))); emit_insn (gen_vsx_extract_v2di (op4, op2, GEN_INT (1))); - emit_insn (gen_divdi3 (op3, op3, op4)); + if (TARGET_POWERPC64) + emit_insn (gen_divdi3 (op3, op3, op4)); + else + { + rtx libfunc = optab_libfunc (sdiv_optab, DImode); + rtx target = emit_library_call_value (libfunc, + op3, LCT_NORMAL, DImode, + op3, DImode, + op4, DImode); + emit_move_insn (op3, target); + } emit_insn (gen_vsx_concat_v2di (op0, op5, op3)); DONE; }" @@ -1716,10 +1748,30 @@ rtx op5 = gen_reg_rtx (DImode); emit_insn (gen_vsx_extract_v2di (op3, op1, GEN_INT (0))); emit_insn (gen_vsx_extract_v2di (op4, op2, GEN_INT (0))); - emit_insn (gen_udivdi3 (op5, op3, op4)); + if (TARGET_POWERPC64) + emit_insn (gen_udivdi3 (op5, op3, op4)); + else + { + rtx libfunc = optab_libfunc (udiv_optab, DImode); + rtx target = emit_library_call_value (libfunc, + op5, LCT_NORMAL, DImode, + op3, DImode, + op4, DImode); + emit_move_insn (op5, target); + } emit_insn (gen_vsx_extract_v2di (op3, op1, GEN_INT (1))); emit_insn (gen_vsx_extract_v2di (op4, op2, GEN_INT (1))); - emit_insn (gen_udivdi3 (op3, op3, op4)); + if (TARGET_POWERPC64) + emit_insn (gen_udivdi3 (op3, op3, op4)); + else + { + rtx libfunc = optab_libfunc (udiv_optab, DImode); + rtx target = emit_library_call_value (libfunc, + op3, LCT_NORMAL, DImode, + op3, DImode, + op4, DImode); + emit_move_insn (op3, target); + } emit_insn (gen_vsx_concat_v2di (op0, op5, op3)); DONE; }" Index: gcc/testsuite/gcc.target/powerpc/pr83926.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/pr83926.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/pr83926.c (working copy) @@ -0,0 +1,22 @@ +/* { dg-do compile { target { powerpc*-*-* } } } */ +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */ +/* { dg-options "-O2 -mcpu=power8 -mno-fold-gimple" } */ + +__attribute__ ((altivec(vector__))) long long +sdiv (__attribute__ ((altivec(vector__))) long long a, + __attribute__ ((altivec(vector__))) long long b) +{ + return __builtin_vsx_div_2di (a, b); +} +__attribute__ ((altivec(vector__))) unsigned long long +udiv (__attribute__ ((altivec(vector__))) unsigned long long a, + __attribute__ ((altivec(vector__))) unsigned long long b) +{ + return __builtin_vsx_udiv_2di (a, b); +} +__attribute__ ((altivec(vector__))) long long +smul (__attribute__ ((altivec(vector__))) long long a, + __attribute__ ((altivec(vector__))) long long b) +{ + return __builtin_vsx_mul_2di (a, b); +} Index: gcc/testsuite/gcc.target/powerpc/builtins-1-be.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/builtins-1-be.c (revision 257390) +++ gcc/testsuite/gcc.target/powerpc/builtins-1-be.c (working copy) @@ -1,6 +1,7 @@ /* { dg-do compile { target { powerpc64-*-* } } } */ /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */ /* { dg-options "-mcpu=power8 -O0 -mno-fold-gimple" } */ +/* { dg-prune-output "gimple folding of rs6000 builtins has been disabled." } */ /* Test that a number of newly added builtin overloads are accepted by the compiler. */ @@ -22,10 +23,10 @@ vec_ctf xvmuldp vec_cts xvcvdpsxds, vctsxs vec_ctu xvcvdpuxds, vctuxs - vec_div divd, divdu + vec_div divd, divdu | __divdi3(), __udivdi3() vec_mergel vmrghb, vmrghh, xxmrghw vec_mergeh xxmrglw, vmrglh - vec_mul mulld + vec_mul mulld | mullw, mulhwu vec_nor xxlnor vec_or xxlor vec_packsu vpksdus @@ -49,21 +50,26 @@ /* { dg-final { scan-assembler-times "vctsxs" 1 } } */ /* { dg-final { scan-assembler-times "xvcvdpuxds" 1 } } */ /* { dg-final { scan-assembler-times "vctuxs" 1 } } */ -/* { dg-final { scan-assembler-times "divd" 4 } } */ -/* { dg-final { scan-assembler-times "divdu" 2 } } */ /* { dg-final { scan-assembler-times "vmrghb" 0 } } */ /* { dg-final { scan-assembler-times "vmrghh" 3 } } */ /* { dg-final { scan-assembler-times "xxmrghw" 1 } } */ /* { dg-final { scan-assembler-times "xxmrglw" 4 } } */ /* { dg-final { scan-assembler-times "vmrglh" 4 } } */ -/* { dg-final { scan-assembler-times "mulld" 4 } } */ -/* { dg-final { scan-assembler-times "xxlnor" 19 } } */ -/* { dg-final { scan-assembler-times "xxlor" 14 } } */ +/* { dg-final { scan-assembler-times "xxlnor" 6 } } */ +/* { dg-final { scan-assembler-times {\mxxlor\M} 8 { target lp64 } } } */ +/* { dg-final { scan-assembler-times {\mxxlor\M} 11 { target ilp32 } } } */ /* { dg-final { scan-assembler-times "vpksdus" 1 } } */ /* { dg-final { scan-assembler-times "vperm" 2 } } */ /* { dg-final { scan-assembler-times "xvrdpi" 1 } } */ /* { dg-final { scan-assembler-times "xxsel" 6 } } */ /* { dg-final { scan-assembler-times "xxlxor" 6 } } */ +/* { dg-final { scan-assembler-times {\mdivd\M} 2 { target lp64 } } } */ +/* { dg-final { scan-assembler-times {\mdivdu\M} 2 { target lp64 } } } */ +/* { dg-final { scan-assembler-times {\mmulld\M} 4 { target lp64 } } } */ +/* { dg-final { scan-assembler-times {\mbl __divdi3\M} 2 { target ilp32 } } } */ +/* { dg-final { scan-assembler-times {\mbl __udivdi3\M} 2 { target ilp32 } } } */ +/* { dg-final { scan-assembler-times {\mmullw\M} 12 { target ilp32 } } } */ +/* { dg-final { scan-assembler-times {\mmulhwu\M} 4 { target ilp32 } } } */ /* The source code for the test is in builtins-1.h. */ #include "builtins-1.h"