From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 44173 invoked by alias); 3 Sep 2015 16:25:51 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 44153 invoked by uid 89); 3 Sep 2015 16:25:50 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00,SPF_PASS autolearn=ham version=3.3.2 X-HELO: eu-smtp-delivery-143.mimecast.com Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (146.101.78.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 03 Sep 2015 16:25:48 +0000 Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.140]) by eu-smtp-1.mimecast.com with ESMTP id uk-mta-29-SSlA0QrNSlKlTSHoI-DoGA-1; Thu, 03 Sep 2015 17:25:43 +0100 Received: from [10.2.207.50] ([10.1.2.79]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Thu, 3 Sep 2015 17:25:43 +0100 Message-ID: <55E87487.4060101@arm.com> Date: Thu, 03 Sep 2015 16:36:00 -0000 From: Kyrill Tkachov User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: Segher Boessenkool , Wilco Dijkstra CC: 'GCC Patches' Subject: Re: RFC: Combine of compare & and oddity References: <000e01d0e5a2$1e2f66b0$5a8e3410$@com> <20150902184747.GA7676@gate.crashing.org> <000f01d0e63d$c40686e0$4c1394a0$@com> <20150903131809.GA27819@gate.crashing.org> <001001d0e659$1120bb60$33623220$@com> <20150903161825.GA13559@gate.crashing.org> In-Reply-To: <20150903161825.GA13559@gate.crashing.org> X-MC-Unique: SSlA0QrNSlKlTSHoI-DoGA-1 Content-Type: text/plain; charset=WINDOWS-1252; format=flowed Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes X-SW-Source: 2015-09/txt/msg00289.txt.bz2 On 03/09/15 17:18, Segher Boessenkool wrote: > On Thu, Sep 03, 2015 at 03:59:00PM +0100, Wilco Dijkstra wrote: >>>> However there are 2 issues with this, one is the spurious subreg, >>> Combine didn't make that up out of thin air; something already used >>> DImode here. It could simplify it to SImode in this case, that is >>> true, don't know why it doesn't; it isn't necessarily faster code to >>> do so, it can be slower, it might not match, etc. >> The relevant RTL instructions on AArch64 are: > [ You never gave a full test case, or I missed it, or cannot find it > anymore -- but I can reproduce this now: > > void g(void); > void f(int *x) { if (*x & 2) g(); } > > ] > >> (insn 8 3 25 2 (set (reg:SI 77 [ D.2705 ]) >> (and:SI (reg/v:SI 76 [ xD.2641 ]) >> (const_int 2 [0x2]))) tmp5.c:122 452 {andsi3} >> (nil)) >> (insn 26 25 27 2 (set (reg:CC 66 cc) >> (compare:CC (reg:SI 77 [ D.2705 ]) >> (const_int 0 [0]))) tmp5.c:122 377 {*cmpsi} >> (expr_list:REG_DEAD (reg:SI 77 [ D.2705 ]) >> (nil))) >> >> I don't see anything using DI... > Yeah, I spoke too soon, sorry. It looks like make_compound_operation came > up with it. > >>> It's only a problem for AND-and-compare, no? >> Yes, so it looks like some other backends match the odd pattern and then= have another >> pattern change it back into the canonical AND/TST form during the split = phase (maybe >> the subreg confuses register allocation or block other optimizations). > A subreg of a pseudo is not anything special, don't worry about it, > register_operand and similar treat it just like any other register. > >> This all seems >> a lot of unnecessary complexity for a few special immediates when there = is a much >> simpler solution... > Feel free to post a patch! I would love to have this all simplified. > >>>> But there are more efficient ways to emit single bit and masks tests t= hat apply >>>> to most CPUs rather than doing something specific that works for just = one target >>>> only. For example single bit test is a simple shift into carry flag or= into the >>>> sign bit, and for mask tests, you shift out all the non-mask bits. >>> Most of those are quite target-specific. Some others are already done, >>> and/or done by other passes. >> But what combine does here is even more target-specific. > Combine puts everything (well, most things) through > make_compound_operation, on all targets. > >>> Combine converts the merged instructions to what it thinks is the >>> canonical or cheapest form, and uses that. It does not try multiple >>> options (the zero_ext* -> and+shift rewriting is not changing the >>> semantics of the pattern at all). >> But the change from AND to zero_extract is already changing semantics... > Oh? It is not supposed to! > >>>> Or would it be better to let each target decide >>>> on how to canonicalize bit tests and only try that alternative? >>> The question is how to write the pattern to be most convenient for all >>> targets. >> The obvious choice is to try the 2 original instructions merged. > ... without any simplification. Yes, I've wanted combine to fall back > to that if the "simplified" version does not work out. Not so easy to > do though. > >>>> Yes, but that doesn't mean (x & C) !=3D 0 shouldn't be tried as well... >>> Combine does not try multiple options. >> I'm not following - combine tries zero_extract and shift+AND - that's 2 = options. >> If that is feasible then adding a 3rd option should be possible. > The shift+and is *exactly the same* as the zero_extract, just written > differently. > >> We certainly need a lot more target hooks in general so GCC can do the r= ight thing >> (rather than using costs inconsistently all over the place). But that's = a different >> discussion... > This isn't about costs though. That is a big other can of worms, indeed! > > > Anyway. In that testcase I made, everything is simplified just fine on > aarch64, using *tbeqdi1; what am I missing? A testcase I was looking at is: int foo (int a) { return (a & 7) !=3D 0; } For me this generates: and w0, w0, 7 cmp w0, wzr cset w0, ne ret when it could be: tst w0, 7 cset w0, ne ret Kyrill > > > Segher >