From: Richard Sandiford
To: Richard Biener
Cc: "Richard Earnshaw (lists)", Richard Henderson, GCC Patches, Marcus Shawcroft, Segher Boessenkool, Wilco Dijkstra
Subject: Re: [PATCH v2 00/11] aarch64: Implement TImode comparisons
References: <20200402185353.11047-1-richard.henderson@linaro.org> <868411a5-48fa-b025-0451-d23e72fbe37a@arm.com> <333e33ae-4836-1003-042e-80027ab38abd@arm.com>
Date: Wed, 08 Apr 2020 10:10:36 +0100
In-Reply-To: (Richard Biener's message of "Mon, 6 Apr 2020 14:22:42 +0200")

Richard Biener writes:
> On Mon, Apr 6, 2020 at 1:20 PM Richard Sandiford wrote:
>>
>> "Richard Earnshaw (lists)" writes:
>> > On 03/04/2020 16:03, Richard Sandiford wrote:
>> >> "Richard Earnshaw (lists)" writes:
>> >>> On 03/04/2020 13:27, Richard Sandiford wrote:
>> >>>> "Richard Earnshaw (lists)" writes:
>> >>>>> On 02/04/2020 19:53, Richard Henderson via Gcc-patches wrote:
>> >>>>>> This is attacking case 3 of PR 94174.
>> >>>>>>
>> >>>>>> In v2, I unify the various subtract-with-borrow and add-with-carry
>> >>>>>> patterns that also output flags with unspecs. As suggested by
>> >>>>>> Richard Sandiford during review of v1. It does seem cleaner.
>> >>>>>>
>> >>>>>
>> >>>>> Really? I didn't need to use any unspecs for the Arm version of this.
>> >>>>>
>> >>>>> R.
>> >>>>
>> >>>> See https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543063.html
>> >>>> (including quoted context) for how we got here.
>> >>>>
>> >>>> The same problem affects the existing aarch64 patterns like
>> >>>> *usub<GPI:mode>3_carryinC. Although that pattern avoids unspecs,
>> >>>> the compare:CC doesn't seem to be correct.
>> >>>>
>> >>>> Richard
>> >>>>
>> >>>
>> >>> But I don't think you can use ANY_EXTEND in these comparisons. It
>> >>> doesn't describe what the instruction does, since the instruction does
>> >>> not really extend the values first.
>> >>
>> >> Yeah, that was the starting point in the thread above too.
>> >> And using zero_extend in the existing *usub<GPI:mode>3_carryinC pattern:
>> >>
>> >> (define_insn "*usub<GPI:mode>3_carryinC"
>> >>   [(set (reg:CC CC_REGNUM)
>> >>         (compare:CC
>> >>           (zero_extend:<DWI>
>> >>             (match_operand:GPI 1 "register_operand" "r"))
>> >>           (plus:<DWI>
>> >>             (zero_extend:<DWI>
>> >>               (match_operand:GPI 2 "register_operand" "r"))
>> >>             (match_operand:<DWI> 3 "aarch64_borrow_operation" ""))))
>> >>    (set (match_operand:GPI 0 "register_operand" "=r")
>> >>         (minus:GPI
>> >>           (minus:GPI (match_dup 1) (match_dup 2))
>> >>           (match_operand:GPI 4 "aarch64_borrow_operation" "")))]
>> >>   ""
>> >>   "sbcs\\t%<w>0, %<w>1, %<w>2"
>> >>   [(set_attr "type" "adc_reg")]
>> >> )
>> >>
>> >> looks wrong for the same reason. But the main problem IMO isn't how the
>> >> inputs to the compare:CC are represented, but that we're using compare:CC
>> >> at all. Using compare doesn't accurately model the effect of SBCS on NZCV
>> >> for all inputs, so if we're going to use a compare here, it can't be :CC.
>> >>
>> >>> I would really expect this patch series to be pretty much a dual of this
>> >>> series that I posted last year for Arm.
>> >>>
>> >>> https://gcc.gnu.org/pipermail/gcc-patches/2019-October/532180.html
>> >>
>> >> That series uses compares with modes like CC_V and CC_B, so I think
>> >> you're saying that given the choice in the earlier thread between adding
>> >> a new CC mode or using unspecs, you would have preferred a new CC mode,
>> >> is that right?
>> >>
>> >
>> > Yes. It surprised me, when doing the aarch32 version, just how often
>> > the mid-end parts of the compiler were able to reason about parts of the
>> > parallel insn and optimize things accordingly (eg propagating the truth
>> > of the comparison). If you use an unspec that can never happen.
>>
>> That could be changed though. E.g. we could add something like a
>> simplify_unspec target hook if this becomes a problem (either here
>> or for other unspecs). A fancy implementation could even use
>> match.pd-style rules in the .md file.
>>
>> The reason I'm not keen on using special modes for this case is that
>> they'd describe one way in which the result can be used rather than
>> describing what the instruction actually does. The instruction really
>> does set all four flags to useful values. The "problem" is that they're
>> not the values associated with a compare between two values, so representing
>> them that way will always lose information.
>
> Can't you recover the pieces by using a parallel with multiple
> set:CC_X that tie together the pieces in the "correct" way?

That would mean splitting apart the flags register for the set but
(I guess) continuing to treat them as a single unit for uses.
That's likely to make life harder for the optimisers.

Thanks,
Richard
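
For anyone who wants to see the NZCV point on concrete numbers, here is a
minimal Python sketch. It is an illustration only, not part of the patch
series. It assumes that SBCS computes op1 + NOT(op2) + C in the usual
add-with-carry way, and that the compare:CC in the pattern quoted above can
be read as an ordinary subtraction carried out in the double-width mode.
The helper names (add_with_carry, sbcs_flags, wide_compare_flags) are made
up for this example, and 8-bit operands stand in for the real register
width to keep the numbers readable.

```python
def add_with_carry(x, y, carry_in, bits):
    """Return (result, (N, Z, C, V)) for x + y + carry_in in `bits` bits."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)

    def signed(v):
        # Reinterpret a `bits`-wide unsigned value as two's complement.
        return v - (1 << bits) if v & sign else v

    x &= mask
    y &= mask
    unsigned_sum = x + y + carry_in
    signed_sum = signed(x) + signed(y) + carry_in
    result = unsigned_sum & mask
    n = int(bool(result & sign))           # sign bit of the result
    z = int(result == 0)                   # truncated result is zero
    c = int(unsigned_sum != result)        # unsigned carry out
    v = int(signed(result) != signed_sum)  # signed overflow
    return result, (n, z, c, v)


def sbcs_flags(x, y, carry_in, bits=8):
    """NZCV after x - y - (1 - carry_in), computed as x + ~y + carry_in."""
    mask = (1 << bits) - 1
    return add_with_carry(x, ~y & mask, carry_in, bits)[1]


def wide_compare_flags(x, y, carry_in, bits=8):
    """NZCV of comparing zext(x) with zext(y) + borrow in 2*bits bits.

    Assumption: this is how the compare:CC in the quoted pattern is read.
    """
    mask = (1 << bits) - 1
    wide_mask = (1 << (2 * bits)) - 1
    a = x & mask
    b = (y & mask) + (1 - carry_in)
    # A compare is a subtraction whose result is discarded: a + ~b + 1.
    return add_with_carry(a, ~b & wide_mask, 1, 2 * bits)[1]


if __name__ == "__main__":
    # 0 - 0xff - 1 wraps to 0 in 8 bits, so SBCS sets Z, but the wide
    # compare sees 0 versus 0x100 and reports "not equal, negative".
    print(sbcs_flags(0x00, 0xFF, 0))          # (0, 1, 0, 0)
    print(wide_compare_flags(0x00, 0xFF, 0))  # (1, 0, 0, 0)
    # 0x80 - 0x01 overflows as a signed 8-bit subtraction, so SBCS sets V,
    # but nothing overflows in the 16-bit compare.
    print(sbcs_flags(0x80, 0x01, 1))          # (0, 0, 1, 1)
    print(wide_compare_flags(0x80, 0x01, 1))  # (0, 0, 1, 0)
```

In this model the C flag agrees between the two for every 8-bit input, while
N, Z and V can each differ, which is one way of reading "it can't be :CC".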