From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 15900 invoked by alias); 17 Jul 2014 16:56:10 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 15811 invoked by uid 89); 17 Jul 2014 16:56:09 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,MISSING_HEADERS,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=no version=3.3.2 X-HELO: service87.mimecast.com Received: from service87.mimecast.com (HELO service87.mimecast.com) (91.220.42.44) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 17 Jul 2014 16:56:07 +0000 Received: from cam-owa2.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.21]) by service87.mimecast.com; Thu, 17 Jul 2014 17:56:03 +0100 Received: from [10.1.209.51] ([10.1.255.212]) by cam-owa2.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Thu, 17 Jul 2014 17:56:03 +0100 Message-ID: <53C80023.6000100@arm.com> Date: Thu, 17 Jul 2014 17:13:00 -0000 From: Alan Lawrence User-Agent: Thunderbird 2.0.0.24 (X11/20101213) MIME-Version: 1.0 CC: Jeff Law , "gcc-patches@gcc.gnu.org" Subject: Re: [PATCH] Relax check against commuting XOR and ASHIFTRT in combine.c References: <53B1B4FE.7010201@arm.com> <53B1D271.5000405@redhat.com> <53C69926.4050503@arm.com> In-Reply-To: <53C69926.4050503@arm.com> X-MC-Unique: 114071717560302201 Content-Type: multipart/mixed; boundary="------------050804090402020502030408" X-IsSubscribed: yes X-SW-Source: 2014-07/txt/msg01234.txt.bz2 This is a multi-part message in MIME format. --------------050804090402020502030408 Content-Type: text/plain; charset=WINDOWS-1252; format=flowed Content-Transfer-Encoding: quoted-printable Content-length: 7964 Ok, the attached tests are passing on x86_64-none-linux-gnu, aarch64-none-e= lf,=20 arm-none-eabi, and a bunch of smaller platforms for which I've only built a= =20 stage 1 compiler (i.e. as far as necessary to assemble). That's with either= =20 change to simplify_shift_const_1. As to the addition of "result_mode !=3D shift_mode", or removing the whole = check=20 against XOR - I've now tested the latter: bootstrapped on x86_64-none-linux-gnu, check-gcc and check-ada; bootstrapped on arm-none-linux-gnueabihf; bootstrapped on aarch64-none-linux-gnu; cross-tested check-gcc on aarch64-none-elf; cross-tested on arm-none-eabi; (and Uros has bootstrapped on alpha and done a suite of tests, as per=20 https://gcc.gnu.org/ml/gcc-testresults/2014-07/msg01236.html). From a perspective of paranoia, I'd lean towards adding "result_mode !=3D= =20 shift_mode", but for neatness removing the whole check against XOR is nicer= . So=20 I'd defer to the maintainers as to whether one might be preferable to the=20 other...(but my unproven suspicion is that the two are equivalent, and no c= ase=20 where result_mode !=3D shift_mode is possible!) --Alan Alan Lawrence wrote: > Thanks for the suggestions! I think I've got a reasonably platform-indepe= ndent=20 > testcase that scans the rtl dump, just trying it on a few more platforms = now... >=20 > As to running on Alpha: bootstrap succeeds, and the regression testsuite = doesn't=20 > raise any issues (https://gcc.gnu.org/ml/gcc-testresults/2014-07/msg01236= .html)=20 > - and that's with a more aggressive patch that completely rolls back the= =20 > original r76965: >=20 > Index: combine.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- combine.c (revision 212523) > +++ combine.c (working copy) > @@ -10218,9 +10218,6 @@ > if (CONST_INT_P (XEXP (varop, 1)) > /* We can't do this if we have (ashiftrt (xor)) and the > constant has its sign bit set in shift_mode. */ > - && !(code =3D=3D ASHIFTRT && GET_CODE (varop) =3D=3D XOR > - && 0 > trunc_int_for_mode (INTVAL (XEXP (varop, 1)), > - shift_mode)) > && (new_rtx =3D simplify_const_binary_operation > (code, result_mode, > gen_int_mode (INTVAL (XEXP (varop, 1)), result_mode= ), > @@ -10237,10 +10234,7 @@ > logical expression, make a new logical expression, and ap= ply > the inverse distributive law. This also can't be done > for some (ashiftrt (xor)). */ > - if (CONST_INT_P (XEXP (varop, 1)) > - && !(code =3D=3D ASHIFTRT && GET_CODE (varop) =3D=3D XOR > - && 0 > trunc_int_for_mode (INTVAL (XEXP (varop, 1)), > - shift_mode))) > + if (CONST_INT_P (XEXP (varop, 1))) > { > rtx lhs =3D simplify_shift_const (NULL_RTX, code, shift_= mode, > XEXP (varop, 0), count); >=20 > I'm testing this version more widely but initial indications are good. >=20 > However, I've not succeeded in checking Ada on Alpha, as GCC's Ada fronte= nd=20 > requires an Ada compiler to bootstrap. So I have to ask: does anyone actu= ally=20 > use Ada on Alpha? (And if so, would they please be able to test the above= patch?) >=20 > Moreover, I don't really see we have much reason to believe the check aga= inst=20 > commuting is required even for Ada/Alpha. GCC's internals have changed=20 > substantially in the interim, with the Ada frontend no longer generating = RTL=20 > directly, as we now have intervening GENERIC/GIMPLE tree stages. Unless t= here is=20 > a logical/bitwise explanation for why the commuting of ashiftrc and xor i= s=20 > unsafe, is the best explanation that the Ada frontend was generating RTL = that=20 > may have looked OK at the time but we would now consider dubious, malform= ed, bad? >=20 > (E.g., these days I don't see how to produce an ashiftrt of one mode cont= aining > an XOR of another without an intervening sign_extend, zero_extend or subr= eg.) >=20 > --Alan >=20 > Jeff Law wrote: >> On 06/30/14 13:05, Alan Lawrence wrote: >>> combine.c includes a check which prevents >>> >>> (ashiftrt (xor A C2) C1) >>> >>> from being commuted to >>> >>> (xor (ashiftrt A C1) (ashiftrt C2 C1)) >>> >>> for constants C1, C2 if C2 has its sign bit set. >>> >>> Specifically, this prevents (ashiftrt (not A) C1) from being commuted to >>> >>> (not (ashiftrt A C1)) >>> >>> because the former is rewritten to (ashiftrt (xor A -1) C1) above, with >>> a comment /* Make this fit the case below. */ - which it no longer doe= s. >>> >>> If result_mode =3D=3D shift_mode, I can see no reason to prevent this >>> normalisation (indeed, I'm not sure I can see any reason to prevent it >>> even if result_mode !=3D shift_mode - but I've not managed to produce s= uch >>> a case in my own testing, as there are always intervening subreg's or >>> sign_extend's, or to build a toolchain on which to reproduce the >>> original bug, so I'm being cautious). Hence this patch allows >>> commutation if the two modes are equal. >>> >>> As an illustrative example, on AArch64, without this patch, compiling >>> this snippet of the current arm_neon.h: >>> >>> typedef long long int int64x1_t __attribute__ ((__vector_size__((8)))); >>> int64x1 vcgez_s64(int64x1_t a) >>> { >>> return (int64x1_t) {a >=3D0 ? -1 : 0}; >>> } >>> >>> produces a sequence involving a logical not (mvn) followed by asr (plus >>> some inter-bank moves), as the combiner takes (ashiftrt (not x) 63), >>> "simplifies", and fails to match the resulting RTL >>> >>> (set (reg:DI 78 [ D.2593 ]) >>> (ashiftrt:DI (xor:DI (reg:DI 0 x0 [ aD.2565 ]) >>> (const_int -1 [0xffffffffffffffff])) >>> (const_int 63 [0x3f]))) >>> >>> However, with this patch, the combiner simplifies to >>> >>> (set (reg:DI 84 [ D.2604 ]) >>> (neg:DI (ge:DI (reg:DI 32 v0 [ aD.2569 ]) >>> (const_int 0 [0])))) >>> >>> which matches a pattern to output the desired cmge instruction. >>> >>> (For the curious: the check against commutation was added in January >>> 2004, r76965, https://gcc.gnu.org/ml/gcc-patches/2004-01/msg03406.html - >>> however, I've not been able to build an ADA toolchain of that era, or an >>> Alpha/VMS toolchain, on which to reproduce the bug; if anyone has such >>> and is willing and able to investigate, by all means!) >>> >>> Testing: >>> >>> aarch64-none-elf: check-gcc (cross-tested) >>> x86-none-linux-gnu: bootstrapped; check-gcc, check-ada, check-g++ >>> arm-none-linux-gnueabi: bootstrapped >>> arm-none-eabi: check-gcc (cross-tested) >>> >>> >>> gcc/ChangeLog: >>> >>> * combine.c (simplify_shift_const_1): Allow commuting ASHIFTRT and= XOR >>> if same mode. >> You'll want to use your sample code from above to create a testcase.=20 >> You can either dump the RTL and search it, or scan the assembly code. >> >> Look in gcc/testsuite/gcc.target/arm for examples. >> >> Given the original problem which prompted Kenner to make this change was= =20 >> Ada related on the Alpha, you might ping rth and/or uros to see if they= =20 >> could do a "quick" regression of those platforms with Ada enabled. >> >> I'm naturally hesitant to approve since this was something pretty Kenner= =20 >> explicitly tried to avoid AFAICT. Kenner wasn't ever known for trying=20 >> to paper over problems -- if he checked in a change like this, much more= =20 >> likely than not it was well thought out and analyzed. He also probably= =20 >> knew combine.c better than anyone at that time (possibly even still true= =20 >> today). >> >> >> Jeff >> >> >> >=20 >=20 >=20 >=20 --------------050804090402020502030408 Content-Type: text/x-patch; name=combrtx_tests.patch Content-Transfer-Encoding: quoted-printable Content-Disposition: inline; filename="combrtx_tests.patch" Content-length: 1894 diff --git a/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c b/gcc/testsuite/gcc.= dg/combine_ashiftrt_1.c new file mode 100644 index 0000000000000000000000000000000000000000..90e64fd10dc358f10ad03a90041= 605bc3ccb7011 --- /dev/null +++ b/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c @@ -0,0 +1,18 @@ +/* { dg-do compile {target sparc64*-*-* aarch64*-*-* x86_64-*-* powerpc64*= -*-*} } */ +/* { dg-options "-O2 -fdump-rtl-combine-all" } */ + +typedef long long int int64_t; + +int64_t +foo (int64_t a) +{ + return (~a) >> 63; +} + +/* The combine phase will try to combine not & ashiftrt, and + combine_simplify_rtx should transform (ashiftrt (not x) 63) + to (not (ashiftrt x 63)) and then to (neg (ge x 0)). We look for + the *attempt* to match this RTL pattern, regardless of whether an + actual insn may be found on the platform. */ +/* { dg-final { scan-rtl-dump "\\(neg:DI \\(ge:DI" "combine" } } */ +/* { dg-final { cleanup-rtl-dump "combine" } } */ diff --git a/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c b/gcc/testsuite/gcc.= dg/combine_ashiftrt_2.c new file mode 100644 index 0000000000000000000000000000000000000000..fd6827caed230ea5dd2d6ec4431= b11bf826531ea --- /dev/null +++ b/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c @@ -0,0 +1,18 @@ +/* { dg-do compile {target arm*-*-* i?86-*-* powerpc-*-* sparc-*-*} } */ +/* { dg-options "-O2 -fdump-rtl-combine-all" } */ + +typedef long int32_t; + +int32_t +foo (int32_t a) +{ + return (~a) >> 31; +} + +/* The combine phase will try to combine not & ashiftrt, and + combine_simplify_rtx should transform (ashiftrt (not x) 31) + to (not (ashiftrt x 63)) and then to (neg (ge x 0)). We look for + the *attempt* to match this RTL pattern, regardless of whether an + actual insn may be found on the platform. */ +/* { dg-final { scan-rtl-dump "\\(neg:SI \\(ge:SI" "combine" } } */ +/* { dg-final { cleanup-rtl-dump "combine" } } */= --------------050804090402020502030408--