Subject: Re: [PATCH] [1/2] arm: Implement cortex-M return signing address codegen
From: Richard Earnshaw
To: Andrea Corallo, gcc-patches@gcc.gnu.org
Cc: nd, Richard Earnshaw
Date: Wed, 8 Dec 2021 11:31:26 +0000
Message-ID: <1145487a-afa0-9b05-aeb0-91ae92f99891@foss.arm.com>

On 05/11/2021 08:52, Andrea Corallo via Gcc-patches wrote:
> Hi all,
>
> this patch enables address return signature and verification based on
> Armv8.1-M Pointer Authentication [1].
>
> To sign the return address, we use the PAC R12, LR, SP instruction
> upon function entry.
> This is signing LR using SP and storing the
> result in R12. R12 will be pushed into the stack.
>
> During function epilogue R12 will be popped and AUT R12, LR, SP will
> be used to verify that the content of LR is still valid before return.
>
> Here an example of PAC instrumented function prologue and epilogue:
>
>     pac     r12, lr, sp
>     push    {r3, r7, lr}
>     push    {r12}
>     sub     sp, sp, #4

Which, as shown here, generates a stack which does not preserve 8-byte
alignment.

Also, what's wrong with

    pac     r12, lr, sp
    push    {r3, r7, ip, lr}

?

Which saves 2 bytes in the prologue and ...

>     [...] function body
>     add     sp, sp, #4
>     pop     {r12}
>     pop     {r3, r7, lr}
>     aut     r12, lr, sp
>     bx      lr

    pop     {r3, r7, ip, lr}
    aut     r12, lr, sp
    bx      lr

which saves 4 bytes in the epilogue (repeated for each instance of the
epilogue).

>
> The patch also takes care of generating a PACBTI instruction in place
> of the sequence BTI+PAC when Branch Target Identification is enabled
> contextually.
>

What about variadic functions?

What about functions where lr is live on entry (where it's used for
passing the closure in nested functions)?

> These two patches apply on top of Tejas series posted here [2].
>
> Regressioned and arm-linux-gnu aarch64-linux-gnu bootstraped.
>
> Best Regards
>
>   Andrea
>
> [1]
> [2]

> +static bool arm_pac_enabled_for_curr_function_p (void);

I really don't like that name. There are a lot of functions with
variations of 'current function' in the name already and this creates
yet another variant. Something like
arm_current_function_pac_enabled_p() would be preferable; or, if that
really is too long, use 'current_func' which already has usage within
the compiler.

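As a rough check of the byte counts and the alignment remark on the
quoted prologue and epilogue, here is a small C sketch. It assumes the
usual Thumb-2 encoding rule that a 16-bit PUSH can only name r0-r7 plus
lr, and a 16-bit POP only r0-r7 plus pc, with everything else taking the
32-bit form. All helper names and the BIT macro are hypothetical, made
up for illustration; none of them exist in GCC or in the patch.

```c
#include <assert.h>

/* Illustrative model only; none of these helpers exist in GCC.  */
enum { R3 = 3, R7 = 7, IP = 12, LR = 14, PC = 15 };

#define BIT(r)   (1u << (r))
#define LOW_REGS 0xffu          /* r0-r7 */

/* Number of registers named in a register-list mask.  */
static int reg_count (unsigned mask)
{
  int n = 0;
  for (; mask != 0; mask >>= 1)
    n += (int) (mask & 1u);
  return n;
}

/* Assumed rule: 16-bit PUSH allows only r0-r7 and lr, else 32-bit.  */
static int push_size (unsigned mask)
{
  return (mask & ~(LOW_REGS | BIT (LR))) != 0 ? 4 : 2;
}

/* Assumed rule: 16-bit POP allows only r0-r7 and pc, else 32-bit.  */
static int pop_size (unsigned mask)
{
  return (mask & ~(LOW_REGS | BIT (PC))) != 0 ? 4 : 2;
}

/* Stack bytes used by the quoted prologue:
   push {r3, r7, lr}; push {r12}; sub sp, sp, #4.  */
static int quoted_prologue_stack_bytes (void)
{
  return 4 * reg_count (BIT (R3) | BIT (R7) | BIT (LR))
         + 4 * reg_count (BIT (IP))
         + 4;
}

/* Code bytes saved by folding ip into the single push.  */
static int prologue_bytes_saved (void)
{
  unsigned saved = BIT (R3) | BIT (R7) | BIT (LR);
  return push_size (saved) + push_size (BIT (IP))
         - push_size (saved | BIT (IP));
}

/* Code bytes saved by folding ip into the single pop.  */
static int epilogue_bytes_saved (void)
{
  unsigned saved = BIT (R3) | BIT (R7) | BIT (LR);
  return pop_size (BIT (IP)) + pop_size (saved)
         - pop_size (saved | BIT (IP));
}

/* Checks the figures quoted in the review against the model.  */
void check_review_figures (void)
{
  assert (quoted_prologue_stack_bytes () % 8 != 0);
  assert (prologue_bytes_saved () == 2);
  assert (epilogue_bytes_saved () == 4);
}
```

Under that model the quoted prologue moves sp by 20 bytes, which is not
a multiple of 8, and the merged register lists save 2 bytes per prologue
and 4 bytes per epilogue.
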
+(define_insn "pac_ip_lr_sp"
+  [(set (reg:DI IP_REGNUM)
+        (unspec:DI [(reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
+                   UNSPEC_PAC_IP_LR_SP))]
+  ""
+  "pac\tr12, lr, sp")
+
+(define_insn "pacbti_ip_lr_sp"
+  [(set (reg:DI IP_REGNUM)
+        (unspec:DI [(reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
+                   UNSPEC_PACBTI_IP_LR_SP))]
+  ""
+  "pacbti\tr12, lr, sp")
+
+(define_insn "aut_ip_lr_sp"
+  [(unspec:DI [(reg:DI IP_REGNUM) (reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
+              UNSPEC_AUT_IP_LR_SP)]
+  ""
+  "aut\tr12, lr, sp")
+

I think all these need a length attribute. Also, they should only be
enabled for thumb2 (certainly not in Arm state). And when using
explicit register names in an asm, prefix each name with '%|', just in
case the assembler dialect has a register name prefix.

The names are somewhat unwieldy; can't we use something more usefully
descriptive, like 'pac_nop', 'pacbti_nop' and 'aut_nop', since all
these instructions are using the architectural NOP space?

Finally, I think we need some more tests that cover the various
frame-pointer flavours when used in combination with this feature and
for various corners of the PCS.

R.