public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Jeff Law <law@redhat.com>
Cc: Alan Hayward <Alan.Hayward@arm.com>,
		"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	nd <nd@arm.com>
Subject: Re: [PATCH 1/7]: SVE: Add CLOBBER_HIGH expression
Date: Tue, 28 Nov 2017 11:58:00 -0000	[thread overview]
Message-ID: <CAFiYyc1iu_BYV6EiNBtC_RMqZfBt9eOjGHAHn6a5aWwYP=DgWg@mail.gmail.com> (raw)
In-Reply-To: <599c2582-6d8d-a909-a878-28e8135d5100@redhat.com>

On Mon, Nov 27, 2017 at 6:29 PM, Jeff Law <law@redhat.com> wrote:
> On 11/23/2017 04:11 AM, Alan Hayward wrote:
>>
>>> On 22 Nov 2017, at 17:33, Jeff Law <law@redhat.com> wrote:
>>>
>>> On 11/22/2017 04:31 AM, Alan Hayward wrote:
>>>>
>>>>> On 21 Nov 2017, at 03:13, Jeff Law <law@redhat.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> You might also look at TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  I'd
>>>>>>> totally forgotten about it.  And in fact it seems to come pretty close
>>>>>>> to what you need…
>>>>>>
>>>>>> Yes, some of the code is similar to the way
>>>>>> TARGET_HARD_REGNO_CALL_PART_CLOBBERED works. Both that code and the
>>>>>> CLOBBER expr code served as a starting point for writing the patch. The main difference
>>>>>> here, is that _PART_CLOBBERED is around all calls and is not tied to a specific Instruction,
>>>>>> it’s part of the calling abi. Whereas clobber_high is explicitly tied to an expression (tls_desc).
>>>>>> It meant there wasn’t really any opportunity to resume any existing code.
>>>>> Understood.  Though your first patch mentions that you're trying to
>>>>> describe partial preservation "around TLS calls". Presumably those are
>>>>> represented as normal insns, not call_insn.
>>>>>
>>>>> That brings me back to Richi's idea of exposing a set of the low subreg
>>>>> to itself using whatever mode is wide enough to cover the neon part of
>>>>> the register.
>>>>>
>>>>> That should tell the generic parts of the compiler that you're just
>>>>> clobbering the upper part and at least in theory you can implement in
>>>>> the aarch64 backend and the rest of the compiler should "just work"
>>>>> because that's the existing semantics of a subreg store.
>>>>>
>>>>> The only worry would be if a pass tried to get overly smart and
>>>>> considered that kind of set a nop -- but I think I'd argue that's simply
>>>>> wrong given the semantics of a partial store.
>>>>>
>>>>
>>>> So, the instead of using clobber_high(reg X), to use set(reg X, reg X).
>>>> It’s something we considered, and then dismissed.
>>>>
>>>> The problem then is you are now using SET semantics on those registers, and it
>>>> would make the register live around the function, which might not be the case.
>>>> Whereas clobber semantics will just make the register dead - which is exactly
>>>> what we want (but only conditionally).
>>> ?!?  A set of the subreg is the *exact* semantics you want.  It says the
>>> low part is preserved while the upper part is clobbered across the TLS
>>> insns.
>>>
>>> jeff
>>
>> Consider where the TLS call is inside a loop. The compiler would normally want
>> to hoist that out of the loop. By adding a set(x,x) into the parallel of the tls_desc we
>> are now making x live across the loop, x is dependant on the value from the previous
>> iteration, and the tls_desc can no longer be hoisted.
> Hmm.  I think I see the problem you're trying to point out.  Let me
> restate it and see if you agree.
>
> The low subreg set does clearly indicate the upper part of the SVE
> register is clobbered.  The problem is from a liveness standpoint the
> compiler is considering the low part live, even though it's a self-set.
>
> In fact, if that is the case, then a single TLS call (independent of a
> loop) would make the low part of the register globally live.  This
> should be testable.  Include one of these low part self sets on the
> existing TLS calls and compile a little test function and let's look at
> the liveness data.
>
>
> Now it could be the case that various local analysis could sub-optimally
> handle things.  You mention LICM.  I know our original LICM did have a
> problem in that if it saw a use of a hard reg in a loop without seeing a
> set of that hard reg it considered the register varying within the loop.
>  I have no idea if we carried that forward when the loop code was
> rewritten (when I looked at this it was circa 1992).
>
>
>>
>> Or consider a stream of code containing two tls_desc calls (ok, the compiler might
>> optimise one of the tls calls away, but this approach should be reusable for other exprs).
>> Between the two set(x,x)’s x is considered live so the register allocator can’t use that
>> register.
>> Given that we are applying this to all the neon registers, the register allocator now throws
>> an ICE because it can’t find any free hard neon registers to use.
> Given your statements it sounds like the liveness infrastructure is
> making those neon regs globally live when it sees the low part subreg
> self-set.  Let's confirm that one way or the other and see where it
> takes us.

Indeed in (set (subreg:neon reg1) (subreg:neon reg1)) it appears that
the lowpart of reg1
is used and thus it is live but liveness analysis can (and should)
simply ignore such sets.

> Jeff

  reply	other threads:[~2017-11-28 11:55 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-16 12:34 Alan Hayward
2017-11-16 18:15 ` Jeff Law
2017-11-16 18:39   ` Richard Biener
2017-11-16 18:57     ` Alan Hayward
2017-11-17 19:45       ` Jeff Law
2017-11-20 15:42         ` Alan Hayward
2017-11-21  6:19           ` Jeff Law
2017-11-22 11:41             ` Alan Hayward
2017-11-22 17:44               ` Jeff Law
2017-11-23 11:25                 ` Alan Hayward
2017-11-27 17:47                   ` Jeff Law
2017-11-28 11:58                     ` Richard Biener [this message]
2017-11-28 15:17                       ` Jeff Law
2017-11-30 11:16                     ` Alan Hayward
2017-12-12 11:11                       ` Alan Hayward
2017-12-19 10:12                         ` Alan Hayward
2017-12-19 16:27                           ` Jeff Law
2018-01-12 12:14                             ` Alan Hayward
2018-01-24 12:20                               ` Alan Hayward

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFiYyc1iu_BYV6EiNBtC_RMqZfBt9eOjGHAHn6a5aWwYP=DgWg@mail.gmail.com' \
    --to=richard.guenther@gmail.com \
    --cc=Alan.Hayward@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    --cc=nd@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).