From: Richard Biener
Date: Mon, 6 Apr 2020 14:22:42 +0200
Subject: Re: [PATCH v2 00/11] aarch64: Implement TImode comparisons
To: "Richard Earnshaw (lists)", Richard Henderson, GCC Patches, Marcus Shawcroft, Segher Boessenkool, Wilco Dijkstra, Richard Sandiford
References: <20200402185353.11047-1-richard.henderson@linaro.org> <868411a5-48fa-b025-0451-d23e72fbe37a@arm.com> <333e33ae-4836-1003-042e-80027ab38abd@arm.com>
List-Id: Gcc-patches mailing list

On Mon, Apr 6, 2020 at 1:20 PM Richard Sandiford wrote:
>
> "Richard Earnshaw (lists)" writes:
> > On 03/04/2020 16:03, Richard Sandiford wrote:
> >> "Richard Earnshaw (lists)" writes:
> >>> On 03/04/2020 13:27, Richard Sandiford wrote:
> >>>> "Richard Earnshaw (lists)" writes:
> >>>>> On 02/04/2020 19:53, Richard Henderson via Gcc-patches wrote:
> >>>>>> This is attacking case 3 of PR 94174.
> >>>>>>
> >>>>>> In v2, I unify the various subtract-with-borrow and add-with-carry
> >>>>>> patterns that also output flags with unspecs. As suggested by
> >>>>>> Richard Sandiford during review of v1. It does seem cleaner.
> >>>>>>
> >>>>>
> >>>>> Really? I didn't need to use any unspecs for the Arm version of this.
> >>>>>
> >>>>> R.
> >>>>
> >>>> See https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543063.html
> >>>> (including quoted context) for how we got here.
> >>>>
> >>>> The same problem affects the existing aarch64 patterns like
> >>>> *usub3_carryinC. Although that pattern avoids unspecs,
> >>>> the compare:CC doesn't seem to be correct.
> >>>>
> >>>> Richard
> >>>>
> >>>
> >>> But I don't think you can use ANY_EXTEND in these comparisons. It
> >>> doesn't describe what the instruction does, since the instruction does
> >>> not really extend the values first.
> >>
> >> Yeah, that was the starting point in the thread above too.
> >> And using zero_extend in the existing *usub3_carryinC pattern:
> >>
> >> (define_insn "*usub3_carryinC"
> >>   [(set (reg:CC CC_REGNUM)
> >>         (compare:CC
> >>           (zero_extend:
> >>             (match_operand:GPI 1 "register_operand" "r"))
> >>           (plus:
> >>             (zero_extend:
> >>               (match_operand:GPI 2 "register_operand" "r"))
> >>             (match_operand: 3 "aarch64_borrow_operation" ""))))
> >>    (set (match_operand:GPI 0 "register_operand" "=r")
> >>         (minus:GPI
> >>           (minus:GPI (match_dup 1) (match_dup 2))
> >>           (match_operand:GPI 4 "aarch64_borrow_operation" "")))]
> >>   ""
> >>   "sbcs\\t%0, %1, %2"
> >>   [(set_attr "type" "adc_reg")]
> >> )
> >>
> >> looks wrong for the same reason. But the main problem IMO isn't how the
> >> inputs to the compare:CC are represented, but that we're using compare:CC
> >> at all. Using compare doesn't accurately model the effect of SBCS on NZCV
> >> for all inputs, so if we're going to use a compare here, it can't be :CC.
> >>
> >>> I would really expect this patch series to be pretty much a dual of this
> >>> series that I posted last year for Arm.
> >>>
> >>> https://gcc.gnu.org/pipermail/gcc-patches/2019-October/532180.html
> >>
> >> That series uses compares with modes like CC_V and CC_B, so I think
> >> you're saying that given the choice in the earlier thread between adding
> >> a new CC mode or using unspecs, you would have preferred a new CC mode,
> >> is that right?
> >>
> >
> > Yes. It surprised me, when doing the aarch32 version, just how often
> > the mid-end parts of the compiler were able to reason about parts of the
> > parallel insn and optimize things accordingly (eg propagating the truth
> > of the comparison). If you use an unspec that can never happen.
>
> That could be changed though. E.g. we could add something like a
> simplify_unspec target hook if this becomes a problem (either here
> or for other unspecs). A fancy implementation could even use
> match.pd-style rules in the .md file.
>
> The reason I'm not keen on using special modes for this case is that
> they'd describe one way in which the result can be used rather than
> describing what the instruction actually does. The instruction really
> does set all four flags to useful values. The "problem" is that they're
> not the values associated with a compare between two values, so representing
> them that way will always lose information.

Can't you recover the pieces by using a parallel with multiple
set:CC_X that tie together the pieces in the "correct" way?

Richard.

> Thanks,
> Richard
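
The point about SBCS flags not being "the values associated with a compare
between two values" can be illustrated with a small Python model. This is a
sketch for illustration only, not GCC code. It assumes the usual AArch64
definition of SBCS as x + NOT(y) + C on 64-bit operands, and it models the
quoted pattern's compare as zero_extend(x) against zero_extend(y) + borrow
in a wider type. The names sbcs_flags and wide_compare_equal are
hypothetical helpers introduced here.

```python
MASK64 = (1 << 64) - 1


def to_signed(value):
    """Interpret a 64-bit pattern as a two's-complement signed integer."""
    return value - (1 << 64) if value >> 63 else value


def sbcs_flags(x, y, carry_in):
    """Return (N, Z, C, V) for a 64-bit SBCS, modelled as x + NOT(y) + C.

    carry_in is the incoming C flag: 1 means "no borrow", 0 means "borrow".
    """
    not_y = ~y & MASK64
    unsigned_sum = x + not_y + carry_in
    signed_sum = to_signed(x) + to_signed(not_y) + carry_in
    result = unsigned_sum & MASK64
    n = result >> 63
    z = int(result == 0)
    c = int(unsigned_sum != result)            # carry out of bit 63
    v = int(signed_sum != to_signed(result))   # signed overflow
    return n, z, c, v


def wide_compare_equal(x, y, carry_in):
    """Equality as a double-width compare would see it (hypothetical helper).

    Both operands are zero-extended, so y + borrow cannot wrap around.
    """
    borrow = 1 - carry_in
    return x == y + borrow


# x = 0, y = 2**64 - 1, with an incoming borrow: the 64-bit result wraps to 0.
print(sbcs_flags(0, MASK64, 0))
print(wide_compare_equal(0, MASK64, 0))
```

For this input the modelled SBCS sets Z, because the 64-bit result wraps to
zero, while the wide compare reports the operands as unequal. It is one
input on which the two views disagree, offered as an illustration rather
than as the specific case discussed in the linked thread.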