From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-return-196342-listarch-gcc=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 35473 invoked by alias); 11 Jun 2018 23:03:53 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc/>
List-Post: <mailto:gcc@gcc.gnu.org>
List-Help: <http://gcc.gnu.org/ml/>
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 35457 invoked by uid 89); 11 Jun 2018 23:03:50 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-0.7 required=5.0 tests=BAYES_00,KAM_LAZY_DOMAIN_SECURITY,KAM_MANYTO,SPF_HELO_PASS autolearn=no version=3.3.2 spammy=SET, pain, Application, teach
X-HELO: mx1.redhat.com
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 11 Jun 2018 23:03:48 +0000
Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13])	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))	(No client certificate requested)	by mx1.redhat.com (Postfix) with ESMTPS id 852F53082A48;	Mon, 11 Jun 2018 23:03:47 +0000 (UTC)
Received: from localhost.localdomain (ovpn-112-10.rdu2.redhat.com [10.10.112.10])	by smtp.corp.redhat.com (Postfix) with ESMTP id 44957A09B6;	Mon, 11 Jun 2018 23:03:44 +0000 (UTC)
Subject: Re: [Aarch64] Vector Function Application Binary Interface Specification for OpenMP
To: Steve Ellcey <sellcey@cavium.com>, Alan.Hayward@arm.com, "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>, Francesco Petrogalli <Francesco.Petrogalli@arm.com>, James Greenhalgh <James.Greenhalgh@arm.com>, "Sekhar, Ashwin" <Ashwin.Sekhar@cavium.com>, gcc <gcc@gcc.gnu.org>, Marcus Shawcroft <Marcus.Shawcroft@arm.com>, nd <nd@arm.com>, richard.sandiford@linaro.org
References: <1518212868.14236.47.camel@cavium.com> <32617133-64DC-4F62-B7A0-A6B417C5B14E@arm.com> <1526487700.29509.6.camel@cavium.com> <a8761c95-e4fb-dd92-8988-825c8b34475f@arm.com> <1526491802.29509.19.camel@cavium.com> <87a7sznw5c.fsf@linaro.org> <1527184223.22014.13.camel@cavium.com> <87a7smbuej.fsf@linaro.org> <a94b6e17-fbf7-1dc3-b65c-cac4b476b06e@redhat.com> <871sdubwv6.fsf@linaro.org>
From: Jeff Law <law@redhat.com>
Openpgp: preference=signencrypt
Message-ID: <8e71e1ff-b108-fb74-eb08-e7eb104bbad1@redhat.com>
Date: Mon, 11 Jun 2018 23:06:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0
MIME-Version: 1.0
In-Reply-To: <871sdubwv6.fsf@linaro.org>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-IsSubscribed: yes
X-SW-Source: 2018-06/txt/msg00143.txt.bz2

On 05/29/2018 04:05 AM, Richard Sandiford wrote:
> Jeff Law <law@redhat.com> writes:
>> Now that we're in stage1 I do want to revisit the CLOBBER_HIGH stuff.
>> When we left things I think we were trying to decide between
>> CLOBBER_HIGH and clobbering the appropriate subreg.  The problem with
>> the latter is the dataflow we compute is inaccurate (overly pessimistic)
>> so that'd have to be fixed.
> 
> The clobbered part of the register in this case is a high-part subreg,
> which is ill-formed for single registers.  It would also be difficult
> to represent in terms of the mode, since there are no defined modes for
> what can be stored in the high part of an SVE register.  For 128-bit
> SVE that mode would have zero bits. :-)
> 
> I thought the alternative suggestion was instead to have:
> 
>    (set (reg:M X) (reg:M X))
You're right.  I mis-remembered.  It happens far too often these days.

> 
> when X is preserved in mode M but not in wider modes.  But that seems
> like too much of a special case to me, both in terms of the source and
> the destination:
Well, the hope was this would "just work" without having to introduce a
new RTX code and teach all the RTL passes about it.  The self-assignment
has the right semantics, but I believe Alan showed that the DF
infrastructure pessimized it horribly.  At that point the question
became how painful it would be to fix DF, compared with the pain of
adding a new RTX code.


> 
> - On the destination side, a SET normally provides something for later
>   instructions to use, whereas here the effect is intended to be the
>   opposite: the instruction has no effect at all on a value of mode M
>   in X.  As you say, this would pessimise df without specific handling.
>   But I think all optimisations that look for the definition of a value
>   would need to be taught to "look through" this set to find the real
>   definition of (reg:M X) (or any value of a mode no larger than M in X).
>   Very few passes use the df def-uses chains for this due to its high cost.
But how often do we really need to look for the REG in a larger mode than
M?  Yea, it happens occasionally, but I don't think it's pervasive and
the cases where we do probably aren't *that* important performance-wise.

Though at a conceptual level I agree.  SET is meant to provide something
for later consumption; we'd be abusing it.


> 
>   More fundamentally, it should be possible in RTL to express an
>   instruction J that *does* read X in mode M and clobbers its high part.
>   If we use the SET above to represent the clobber, and treat the rhs use
>   as special, then presumably J would need two uses of X, one "dummy" one
>   on the no-op SET and one "real" one on some other SET (or perhaps in a
>   top-level USE).  Having the number of uses determine this seems
>   a bit awkward.
> 
> IMO CLOBBER and SET have different semantics for good reason: CLOBBER
> represents an optimisation barrier for things that care about the value
> of a certain rtx object, while SET represents a productive effect or
> side-effect.  The effect we want here is the same as a normal clobber,
> except that the clobber is mode-dependent.
I largely agree.  It was really a matter of whether or not using the
self-set would simplify the implementation in a significant way.
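
To make the "mode-dependent clobber" idea concrete, here is a toy model
of the semantics being discussed.  It is only a sketch, not GCC code:
the Value class, the function names, and the bit widths are all
hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """A value known to live in a register (hypothetical model)."""
    reg: str   # register name, e.g. "v8"
    bits: int  # width of the mode in which the value is held


def survives_clobber(value: Value, reg: str) -> bool:
    # A normal CLOBBER of `reg` invalidates any value held in it,
    # whatever the mode.
    return value.reg != reg


def survives_clobber_high(value: Value, reg: str, preserved_bits: int) -> bool:
    # A mode-dependent clobber only invalidates values that are wider
    # than the part of `reg` that is preserved.
    return value.reg != reg or value.bits <= preserved_bits


# Assumed example: the low 128 bits of v8 are preserved.
narrow = Value("v8", 128)
wide = Value("v8", 256)
print(survives_clobber_high(narrow, "v8", 128))  # value fits in the preserved part
print(survives_clobber_high(wide, "v8", 128))    # value extends into the clobbered part
```

In this model the barrier behaves like an ordinary clobber for any
value wider than the preserved part and like a no-op otherwise, which
is the distinction a self-set cannot express without special handling.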

jeff