From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <Richard.Earnshaw@arm.com>
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by sourceware.org (Postfix) with ESMTP id 0D435385BF83
 for <gcc-patches@gcc.gnu.org>; Tue,  7 Apr 2020 09:52:13 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 0D435385BF83
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=arm.com
Authentication-Results: sourceware.org;
 spf=pass smtp.mailfrom=Richard.Earnshaw@arm.com
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B2E041FB;
 Tue,  7 Apr 2020 02:52:12 -0700 (PDT)
Received: from [192.168.1.19] (unknown [172.31.20.19])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id BD2803F73D;
 Tue,  7 Apr 2020 02:52:11 -0700 (PDT)
Subject: Re: [PATCH v2 00/11] aarch64: Implement TImode comparisons
To: Richard Henderson <richard.henderson@linaro.org>,
 gcc-patches@gcc.gnu.org, marcus.shawcroft@arm.com,
 segher@kernel.crashing.org, Wilco.Dijkstra@arm.com, richard.sandiford@arm.com
References: <20200402185353.11047-1-richard.henderson@linaro.org>
 <e66dbabe-059b-197e-bc3d-87121287ae39@arm.com> <mpt8sjczrd6.fsf@arm.com>
 <868411a5-48fa-b025-0451-d23e72fbe37a@arm.com> <mptftdky5l3.fsf@arm.com>
 <333e33ae-4836-1003-042e-80027ab38abd@arm.com> <mptlfn8uai9.fsf@arm.com>
From: "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>
Message-ID: <01ec0411-b9b0-7233-271e-98e5dc36c6e1@arm.com>
Date: Tue, 7 Apr 2020 10:52:10 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
 Thunderbird/68.4.1
MIME-Version: 1.0
In-Reply-To: <mptlfn8uai9.fsf@arm.com>
Content-Type: text/plain; charset=utf-8
Content-Language: en-GB
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-12.8 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS,
 KAM_SHORT, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <http://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <http://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2020 09:52:14 -0000

On 06/04/2020 12:19, Richard Sandiford wrote:
> "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com> writes:
>> On 03/04/2020 16:03, Richard Sandiford wrote:
>>> "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com> writes:
>>>> On 03/04/2020 13:27, Richard Sandiford wrote:
>>>>> "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com> writes:
>>>>>> On 02/04/2020 19:53, Richard Henderson via Gcc-patches wrote:
>>>>>>> This is attacking case 3 of PR 94174.
>>>>>>>
>>>>>>> In v2, I use unspecs to unify the various subtract-with-borrow and
>>>>>>> add-with-carry patterns that also output flags, as suggested by
>>>>>>> Richard Sandiford during review of v1.  It does seem cleaner.
>>>>>>>
>>>>>>
>>>>>> Really?  I didn't need to use any unspecs for the Arm version of this.
>>>>>>
>>>>>> R.
>>>>>
>>>>> See https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543063.html
>>>>> (including quoted context) for how we got here.
>>>>>
>>>>> The same problem affects the existing aarch64 patterns like
>>>>> *usub<GPI:mode>3_carryinC.  Although that pattern avoids unspecs,
>>>>> the compare:CC doesn't seem to be correct.
>>>>>
>>>>> Richard
>>>>>
>>>>
>>>> But I don't think you can use ANY_EXTEND in these comparisons.  It
>>>> doesn't describe what the instruction does, since the instruction does
>>>> not really extend the values first.
>>>
>>> Yeah, that was the starting point in the thread above too.  And using
>>> zero_extend in the existing *usub<GPI:mode>3_carryinC pattern:
>>>
>>> (define_insn "*usub<GPI:mode>3_carryinC"
>>>   [(set (reg:CC CC_REGNUM)
>>>   	(compare:CC
>>> 	  (zero_extend:<DWI>
>>> 	    (match_operand:GPI 1 "register_operand" "r"))
>>> 	  (plus:<DWI>
>>> 	    (zero_extend:<DWI>
>>> 	      (match_operand:GPI 2 "register_operand" "r"))
>>> 	    (match_operand:<DWI> 3 "aarch64_borrow_operation" ""))))
>>>    (set (match_operand:GPI 0 "register_operand" "=r")
>>> 	(minus:GPI
>>> 	  (minus:GPI (match_dup 1) (match_dup 2))
>>> 	  (match_operand:GPI 4 "aarch64_borrow_operation" "")))]
>>>    ""
>>>    "sbcs\\t%<w>0, %<w>1, %<w>2"
>>>   [(set_attr "type" "adc_reg")]
>>> )
>>>
>>> looks wrong for the same reason.  But the main problem IMO isn't how the
>>> inputs to the compare:CC are represented, but that we're using compare:CC
>>> at all.  Using compare doesn't accurately model the effect of SBCS on NZCV
>>> for all inputs, so if we're going to use a compare here, it can't be :CC.
>>>
>>>> I would really expect this patch series to be pretty much a dual of this
>>>> series that I posted last year for Arm.
>>>>
>>>> https://gcc.gnu.org/pipermail/gcc-patches/2019-October/532180.html
>>>
>>> That series uses compares with modes like CC_V and CC_B, so I think
>>> you're saying that given the choice in the earlier thread between adding
>>> a new CC mode or using unspecs, you would have preferred a new CC mode,
>>> is that right?
>>>
>>
>> Yes.  It surprised me, when doing the aarch32 version, just how often
>> the mid-end parts of the compiler were able to reason about parts of the
>> parallel insn and optimize things accordingly (e.g. propagating the truth
>> of the comparison).  If you use an unspec, that can never happen.
> 
> That could be changed though.  E.g. we could add something like a
> simplify_unspec target hook if this becomes a problem (either here
> or for other unspecs).  A fancy implementation could even use
> match.pd-style rules in the .md file.

I really don't like that.  It sounds like the top of a long slippery
slope.  What about all the other cases where the RTL is comprehended by
the mid-end?

> 
> The reason I'm not keen on using special modes for this case is that
> they'd describe one way in which the result can be used rather than
> describing what the instruction actually does.  The instruction really
> does set all four flags to useful values.  The "problem" is that they're
> not the values associated with a compare between two values, so representing
> them that way will always lose information.
> 

Yes, it's true that the RTL -> machine instruction transform is not 100%
reversible.  That's always been the case, but it's the price we pay for
a generic IL that describes instructions on multiple architectures.
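
To make the information loss concrete, here is a rough C sketch (not GCC
code).  It assumes 32-bit operands so that the double-width value fits in
a uint64_t, and the names sbcs_flags and wide_compare_flags are made up
for illustration.  The second function is only my reading of what a
compare:CC of the zero-extended operands, as in the quoted
*usub<GPI:mode>3_carryinC pattern, would claim about the flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* NZCV packed into four bits. */
enum { FLAG_N = 8, FLAG_Z = 4, FLAG_C = 2, FLAG_V = 1 };

/* Flags set by a 32-bit SBCS, which computes a + ~b + carry_in. */
unsigned sbcs_flags(uint32_t a, uint32_t b, bool carry_in)
{
    uint32_t nb = ~b;
    uint64_t wide = (uint64_t)a + (uint64_t)nb + (carry_in ? 1 : 0);
    uint32_t res = (uint32_t)wide;
    unsigned f = 0;
    if (res >> 31) f |= FLAG_N;
    if (res == 0) f |= FLAG_Z;
    if (wide >> 32) f |= FLAG_C;                  /* carry out, i.e. no borrow */
    if (((a ^ res) & (nb ^ res)) >> 31) f |= FLAG_V; /* signed overflow */
    return f;
}

/* Flags a 64-bit compare of zext(a) against zext(b) + borrow would set
   (assumed model of the compare:CC in the quoted pattern). */
unsigned wide_compare_flags(uint32_t a, uint32_t b, bool borrow)
{
    uint64_t x = a;
    uint64_t y = (uint64_t)b + (borrow ? 1 : 0);
    uint64_t res = x - y;
    unsigned f = 0;
    if (res >> 63) f |= FLAG_N;
    if (res == 0) f |= FLAG_Z;
    if (x >= y) f |= FLAG_C;
    if (((x ^ y) & (x ^ res)) >> 63) f |= FLAG_V;
    return f;
}

/* Two inputs where the models agree on C but disagree elsewhere. */
void check_examples(void)
{
    /* INT32_MIN - 1 with no borrow: SBCS sets V, the wide compare does not. */
    assert(sbcs_flags(0x80000000u, 1u, true) == (FLAG_C | FLAG_V));
    assert(wide_compare_flags(0x80000000u, 1u, false) == FLAG_C);

    /* 0 - 0xffffffff - 1: SBCS result wraps to zero (Z), the wide compare
       sees 0 < 2^32 and reports N instead. */
    assert(sbcs_flags(0u, 0xffffffffu, false) == FLAG_Z);
    assert(wide_compare_flags(0u, 0xffffffffu, true) == FLAG_N);
}
```

In both examples the C flag agrees, so a carry-only consumer is described
correctly; it is the other three flags that the compare form cannot
express.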

R.

> Thanks,
> Richard
>