From: Richard Sandiford
To: Steve Ellcey
Cc: Alan.Haward@arm.com, "Richard Earnshaw (lists)", Francesco Petrogalli, James Greenhalgh, "Sekhar, Ashwin", gcc, Marcus Shawcroft, nd
Subject: Re: [Aarch64] Vector Function Application Binary Interface Specification for OpenMP
Date: Sat, 26 May 2018 10:09:00 -0000
Message-ID: <87a7smbuej.fsf@linaro.org>
In-Reply-To: <1527184223.22014.13.camel@cavium.com> (Steve Ellcey's message of "Thu, 24 May 2018 10:50:23 -0700")

Steve Ellcey writes:
> On Wed, 2018-05-16 at 22:11 +0100, Richard Sandiford wrote:
>>
>> TARGET_HARD_REGNO_CALL_PART_CLOBBERED is the only current way
>> of saying that an rtl instruction preserves the low part of a
>> register but clobbers the high part.  We would need something like
>> Alan H's CLOBBER_HIGH patches to do it using explicit clobbers.
>>
>> Another approach would be to piggy-back on the -fipa-ra
>> infrastructure and record that vector PCS functions only clobber
>> Q0-Q7.  If -fipa-ra knows that a function doesn't clobber Q8-Q15
>> then that should override TARGET_HARD_REGNO_CALL_PART_CLOBBERED.
>> (I'm not sure whether it does in practice, but it should :-)  And
>> if it doesn't that's a bug that's worth fixing for its own sake.)
>>
>> Thanks,
>> Richard
>
> Alan,
>
> I have been looking at your CLOBBER_HIGH patches to see if they
> might be helpful in implementing the ARM SIMD Vector ABI in GCC.
> I have also been looking at the -fipa-ra flag and how it works.
>
> I was wondering if you considered using the ipa-ra infrastructure
> for the SVE work that you are currently trying to support with
> the CLOBBER_HIGH macro?
>
> My current thought for the ABI work is to mark all the floating
> point / vector registers as caller saved (the lower half of V8-V15
> are currently callee saved) and remove
> TARGET_HARD_REGNO_CALL_PART_CLOBBERED.
> This should work but would be inefficient.
>
> The next step would be to split get_call_reg_set_usage up into
> two functions so that I don't have to pass in a default set of
> registers.  One function would return call_used_reg_set by
> default (but could return a smaller set if it had actual used
> register information) and the other would return
> regs_invalidated_by_call by default (but could also return a
> smaller set).
>
> Next I would add a 'largest mode used' array to call_cgraph_rtl_info
> structure in addition to the current function_used_regs register
> set.
>
> Then I could turn the get_call_reg_set_usage replacement functions
> into target specific functions and with the information in the
> call_cgraph_rtl_info structure and any simd attribute information on
> a function I could modify what registers are really being
> used/invalidated without being saved.
>
> If the called function only uses the bottom half of a register it
> would not be marked as used/invalidated.  If it uses the entire
> register and the function is not marked as simd, then the register
> would be marked as used/invalidated.  If the function was marked as
> simd the register would not be marked because a simd function would
> save both the upper and lower halves of a callee saved register
> (whereas a non simd function would only save the lower half).
>
> Does this sound like something that could be used in place of your
> CLOBBER_HIGH patch?

One of the advantages of CLOBBER_HIGH is that it can be attached to
arbitrary instructions, not just calls.  The motivating example was
tlsdesc_small_, which isn't treated as a call but as a normal
instruction.  (And I don't think we want to change that, since it's much
easier for rtl optimisers to deal with normal instructions compared to
calls.  In general a call is part of a longer sequence of instructions
that includes setting up arguments, etc.)

The other use case (not implemented in the posted patches) would be
to represent the effect of syscalls, which clobber the "SVE part" of all
vector registers.  In that case the clobber would need to be attached to
an inline asm insn.

On the wider point about changing the way call clobber information
is represented: I agree it would be good to generalise what we have
now.  But if possible I think we should avoid target hooks that take
a specific call, and instead make it an inherent part of the call insn
itself, much like CALL_INSN_FUNCTION_USAGE is now.  E.g. we could add
a field that points to an ABI description, with -fipa-ra effectively
creating ad-hoc ABIs.  That ABI description could start out with
whatever we think is relevant now and could grow over time.

Thanks,
Richard
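
To make the "ABI description attached to the call" idea a bit more
concrete, here is a rough standalone sketch in C.  It is not GCC code:
the names reg_set, struct abi_desc, base_abi, simd_abi and
value_survives_call are all hypothetical, and the register sets are
only an illustration of the base/simd split described in the quoted
message (base calls preserve just the low 64 bits of V8-V15, simd
calls preserve those registers in full).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: one bit per vector register V0-V31.  */
typedef uint32_t reg_set;

/* Bits LO..HI inclusive, computed in 64 bits to avoid shift overflow.  */
#define REG_RANGE(lo, hi) \
  ((reg_set) (((UINT64_C (1) << ((hi) - (lo) + 1)) - 1) << (lo)))

/* Hypothetical ABI description that a call insn could point to,
   instead of asking a target hook about each individual call.  */
struct abi_desc
{
  const char *name;
  reg_set full_clobber; /* Whole register is lost across the call.  */
  reg_set high_clobber; /* Low 64 bits survive, the rest is lost.  */
};

/* Illustrative base ABI: V8-V15 keep only their low 64 bits.  */
static const struct abi_desc base_abi = {
  "base", REG_RANGE (0, 7) | REG_RANGE (16, 31), REG_RANGE (8, 15)
};

/* Illustrative simd ABI (assumption): V8-V15 are preserved in full.  */
static const struct abi_desc simd_abi = {
  "simd", REG_RANGE (0, 7) | REG_RANGE (16, 31), 0
};

/* Return true if a value of SIZE_BYTES bytes held in vector register
   REGNO is still valid after a call that follows ABI.  */
static bool
value_survives_call (const struct abi_desc *abi, unsigned regno,
                     unsigned size_bytes)
{
  assert (regno < 32);
  reg_set bit = (reg_set) 1 << regno;
  if (abi->full_clobber & bit)
    return false;
  if ((abi->high_clobber & bit) && size_bytes > 8)
    return false;
  return true;
}
```

With these definitions a 16-byte value in V8 would have to be saved
around a base_abi call but not around a simd_abi call, while an 8-byte
value in V8 survives both.  An -fipa-ra style analysis could, under the
same assumptions, build a smaller ad-hoc abi_desc for a specific callee.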