From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 30686 invoked by alias); 7 Dec 2018 18:01:02 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 30639 invoked by uid 89); 7 Dec 2018 18:01:00 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=earnshaw, Earnshaw
X-HELO: mx1.redhat.com
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 07 Dec 2018 18:00:59 +0000
Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id BD28E3C2CE8; Fri, 7 Dec 2018 18:00:57 +0000 (UTC)
Received: from localhost.localdomain (ovpn-112-17.rdu2.redhat.com [10.10.112.17])
 by smtp.corp.redhat.com (Postfix) with ESMTP id BD9D06154C; Fri, 7 Dec 2018 18:00:56 +0000 (UTC)
Subject: Re: [RFA] [target/87369] Prefer "bit" over "bfxil"
To: "Richard Earnshaw (lists)", gcc-patches, James Greenhalgh
References:
From: Jeff Law
Openpgp: preference=signencrypt
Message-ID: <88d2f56f-1195-336a-b942-5611bb62bb3d@redhat.com>
Date: Fri, 07 Dec 2018 18:01:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-IsSubscribed: yes
X-SW-Source: 2018-12/txt/msg00496.txt.bz2

On 12/7/18 10:31 AM, Richard Earnshaw (lists) wrote:
> On 07/12/2018 15:52, Jeff Law wrote:
>> As I suggested in the BZ, this patch rejects constants with just the
>> high bit set for the recently added "bfxil" pattern.
>> As a result we'll return to using "bit" for the test in the BZ.
>>
>> I'm not versed enough in aarch64 performance tuning to know if "bit" is
>> actually a better choice than "bfxil".  "bit" results in better code for
>> the testcase, but that seems more a function of register allocation than
>> "bit" being inherently better than "bfxil".  Obviously someone with
>> more aarch64 knowledge needs to make a decision here.
>>
>> My first iteration of the patch changed "aarch64_high_bits_all_ones_p".
>> We could still go that way too, though the name probably needs to change.
>>
>> I've bootstrapped and regression tested on aarch64-linux-gnu and it
>> fixes the regression.  I've also bootstrapped aarch64_be-linux-gnu, but
>> haven't done any kind of regression testing on that platform.
>>
>> OK for the trunk?
>
> The problem here is that the optimum solution depends on the register
> classes involved and we don't know this during combine.  If we have
> general registers, then we want bfi/bfxil to be used; if we have a vector
> register, then bit is preferable as it changes 3 inter-bank register
> copies to a single inter-bank copy; and that copy might be hoisted out
> of a loop.
Ugh.  Things are never simple, are they?

>
> Ultimately, the best solution here will probably depend on which we
> think is more likely, copysign or the example I give above.
I'd tend to suspect we'd see more pure integer bit twiddling than the
copysign stuff.

>
> It might be that for copysign we'll need to expand initially to some
> unspec that uses a register initialized with a suitable immediate, but
> otherwise hides the operation from combine until after that has run,
> thus preventing the compiler from doing the otherwise right thing.  We'd
> lose in the (hopefully) rare case where the operands really were in
> general registers, but otherwise win for the more common case where they
> aren't.
Could we give the bfxil pattern an alternative that accepts vector
regs and generates bit in appropriate circumstances?

Hmm, maybe the other way around would be better.  Have the "bit" pattern
with a general register alternative that generates bfxil when presented
with general registers.

I would generally warn against hiding things in unspecs like you've
suggested above.  We're seeing cases where that's been done in the x86
backend and it's inhibiting optimizations in various places.

Thoughts?

Jeff
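
For anyone following along without the BZ open, here is a rough C model of
the operation being discussed.  It is only a sketch: the function names
are made up for illustration and are not the GCC predicates (in
particular, high_bits_all_ones below is a guess at the general shape of
"aarch64_high_bits_all_ones_p", not its actual definition).  copysign is a
bitwise select with a mask that has only the top bit set, which is the
constant the patch stops the "bfxil" pattern from accepting.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Bitwise select: take bits from A where MASK is 1, otherwise from B.
   This is the (ior (and ...) (and ...)) shape under discussion.  */
uint64_t bit_select(uint64_t mask, uint64_t a, uint64_t b)
{
    return (a & mask) | (b & ~mask);
}

/* copysign modelled as a bit select: sign bit from SGN, the remaining
   63 bits from MAG.  memcpy is used to reinterpret the bits safely.  */
double copysign_bits(double mag, double sgn)
{
    uint64_t m, s, r;
    double out;
    memcpy(&m, &mag, sizeof m);
    memcpy(&s, &sgn, sizeof s);
    r = bit_select(UINT64_C(1) << 63, s, m);
    memcpy(&out, &r, sizeof out);
    return out;
}

/* Hypothetical model: C is a run of ones in the high bits followed by
   zeros, i.e. ~C has the form 2^n - 1.  */
bool high_bits_all_ones(uint64_t c)
{
    uint64_t low = ~c;
    return c != 0 && (low & (low + 1)) == 0;
}

/* Hypothetical model of the extra rejection: only the top bit is set,
   which is the copysign mask.  */
bool only_high_bit_set(uint64_t c)
{
    return c == (UINT64_C(1) << 63);
}
```

In this model the copysign mask 0x8000000000000000 satisfies both
predicates, so a check of the form "high bits all ones and not just the
high bit" lets wider masks such as 0xffff000000000000 through while
leaving the copysign case to "bit".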