From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5810A6D0.7000808@foss.arm.com>
Date: Wed, 26 Oct 2016 12:51:00 -0000
From: Kyrill Tkachov
To: "Andre Vieira (lists)", gcc-patches@gcc.gnu.org
Subject: Re: [PATCHv2 4/7, GCC, ARM, V8M] ARMv8-M Security Extension's cmse_nonsecure_entry: clear registers
References: <5796116C.6010100@arm.com> <579612EE.3050606@arm.com> <57BD7E8B.4000108@arm.com> <580F885D.6030107@arm.com>
In-Reply-To: <580F885D.6030107@arm.com>
X-SW-Source: 2016-10/txt/msg02108.txt.bz2

Hi Andre,

On 25/10/16 17:29, Andre Vieira (lists) wrote:
> On 24/08/16 12:01, Andre Vieira (lists) wrote:
>> On 25/07/16 14:23, Andre Vieira (lists) wrote:
>>> This patch extends
>>> support for the ARMv8-M Security Extensions
>>> 'cmse_nonsecure_entry' attribute to safeguard against leak of
>>> information through unbanked registers.
>>>
>>> When returning from a nonsecure entry function we clear all caller-saved
>>> registers that are not used to pass return values, by writing either the
>>> LR, in case of general purpose registers, or the value 0, in case of FP
>>> registers. We use the LR to write to APSR and FPSCR too. We currently do
>>> not support entry functions that pass arguments or return variables on
>>> the stack and we diagnose this. This patch relies on the existing code
>>> to make sure callee-saved registers used in cmse_nonsecure_entry
>>> functions are saved and restored thus retaining their nonsecure mode
>>> value, this should be happening already as it is required by AAPCS.
>>>
>>> This patch also clears padding bits for cmse_nonsecure_entry functions
>>> with struct and union return types. For unions a bit is only considered
>>> a padding bit if it is an unused bit in every field of that union. The
>>> function that calculates these is used in a later patch to do the same
>>> for arguments of cmse_nonsecure_call's.
>>>
>>> *** gcc/ChangeLog ***
>>> 2016-07-25  Andre Vieira
>>>             Thomas Preud'homme
>>>
>>>     * config/arm/arm.c (output_return_instruction): Clear
>>>     registers.
>>>     (thumb2_expand_return): Likewise.
>>>     (thumb1_expand_epilogue): Likewise.
>>>     (thumb_exit): Likewise.
>>>     (arm_expand_epilogue): Likewise.
>>>     (cmse_nonsecure_entry_clear_before_return): New.
>>>     (comp_not_to_clear_mask_str_un): New.
>>>     (compute_not_to_clear_mask): New.
>>>     * config/arm/thumb1.md (*epilogue_insns): Change length attribute.
>>>     * config/arm/thumb2.md (*thumb2_return): Likewise.
>>>
>>> *** gcc/testsuite/ChangeLog ***
>>> 2016-07-25  Andre Vieira
>>>             Thomas Preud'homme
>>>
>>>     * gcc.target/arm/cmse/cmse.exp: Test different multilibs separate.
>>>     * gcc.target/arm/cmse/struct-1.c: New.
>>>     * gcc.target/arm/cmse/bitfield-1.c: New.
>>>     * gcc.target/arm/cmse/bitfield-2.c: New.
>>>     * gcc.target/arm/cmse/bitfield-3.c: New.
>>>     * gcc.target/arm/cmse/baseline/cmse-2.c: Test that registers are
>>>     cleared.
>>>     * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New.
>>>     * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New.
>>>     * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New.
>>>     * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New.
>>>     * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New.
>>>
>> Updated this patch to correctly clear only the cumulative
>> exception-status (0-4, 7) and the condition code bits (28-31) of the
>> FPSCR. I also adapted the code to handle the bigger floating point
>> register files.
>>
>> ----
>>
>> This patch extends support for the ARMv8-M Security Extensions
>> 'cmse_nonsecure_entry' attribute to safeguard against leak of
>> information through unbanked registers.
>>
>> When returning from a nonsecure entry function we clear all caller-saved
>> registers that are not used to pass return values, by writing either the
>> LR, in case of general purpose registers, or the value 0, in case of FP
>> registers. We use the LR to write to APSR. For FPSCR we clear only the
>> cumulative exception-status (0-4, 7) and the condition code bits
>> (28-31). We currently do not support entry functions that pass arguments
>> or return variables on the stack and we diagnose this. This patch relies
>> on the existing code to make sure callee-saved registers used in
>> cmse_nonsecure_entry functions are saved and restored, thus retaining
>> their nonsecure mode value; this should be happening already as it is
>> required by the AAPCS.
>>
>> This patch also clears padding bits for cmse_nonsecure_entry functions
>> with struct and union return types. For unions a bit is only considered
>> a padding bit if it is an unused bit in every field of that union. The
>> function that calculates these is used in a later patch to do the same
>> for arguments of cmse_nonsecure_call's.
>>
>> *** gcc/ChangeLog ***
>> 2016-07-xx  Andre Vieira
>>             Thomas Preud'homme
>>
>>     * config/arm/arm.c (output_return_instruction): Clear
>>     registers.
>>     (thumb2_expand_return): Likewise.
>>     (thumb1_expand_epilogue): Likewise.
>>     (thumb_exit): Likewise.
>>     (arm_expand_epilogue): Likewise.
>>     (cmse_nonsecure_entry_clear_before_return): New.
>>     (comp_not_to_clear_mask_str_un): New.
>>     (compute_not_to_clear_mask): New.
>>     * config/arm/thumb1.md (*epilogue_insns): Change length attribute.
>>     * config/arm/thumb2.md (*thumb2_return): Duplicate pattern for
>>     cmse_nonsecure_entry functions.
>>
>> *** gcc/testsuite/ChangeLog ***
>> 2016-07-xx  Andre Vieira
>>             Thomas Preud'homme
>>
>>     * gcc.target/arm/cmse/cmse.exp: Test different multilibs separate.
>>     * gcc.target/arm/cmse/struct-1.c: New.
>>     * gcc.target/arm/cmse/bitfield-1.c: New.
>>     * gcc.target/arm/cmse/bitfield-2.c: New.
>>     * gcc.target/arm/cmse/bitfield-3.c: New.
>>     * gcc.target/arm/cmse/baseline/cmse-2.c: Test that registers are
>>     cleared.
>>     * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New.
>>     * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New.
>>     * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New.
>>     * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New.
>>     * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New.
>>
> Hi,
>
> Rebased previous patch on top of trunk as requested. No changes to
> ChangeLog.
>
> Cheers,
> Andre

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index bb81e5662e81a26c7d3ccf9f749e8e356e6de35e..c6260323ecfd2f2842e6a5aab06b67da16619c73 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -17496,6 +17496,279 @@ note_invalid_constants (rtx_insn *insn, HOST_WIDE_INT address, int do_pushes)
   return;
 }
 
+/* This function computes the clear mask and PADDING_BITS_TO_CLEAR for structs
+   and unions in the context of ARMv8-M Security Extensions.  It is used as a
+   helper function for both 'cmse_nonsecure_call' and 'cmse_nonsecure_entry'
+   functions.
+   The PADDING_BITS_TO_CLEAR pointer can be the base to either one
+   or four masks, depending on whether it is being computed for a
+   'cmse_nonsecure_entry' return value or a 'cmse_nonsecure_call' argument
+   respectively.  The tree for the type of the argument or a field within an
+   argument is passed in ARG_TYPE, the current register this argument or field
+   starts in is kept in the pointer REGNO and updated accordingly, the bit this
+   argument or field starts at is passed in STARTING_BIT and the last used bit
+   is kept in LAST_USED_BIT which is also updated accordingly.  */
+
+static unsigned HOST_WIDE_INT
+comp_not_to_clear_mask_str_un (tree arg_type, int * regno,
+			       uint32_t * padding_bits_to_clear,
+			       unsigned starting_bit, int * last_used_bit)
+
+{
+  unsigned HOST_WIDE_INT not_to_clear_reg_mask = 0;
+
+  if (TREE_CODE (arg_type) == RECORD_TYPE)
+    {
+      unsigned current_bit = starting_bit;
+      tree field;
+      long int offset, size;
+
+      field = TYPE_FIELDS (arg_type);
+      while (field)
+	{
+	  /* The offset within a structure is always an offset from
+	     the start of that structure.  Make sure we take that into the
+	     calculation of the register based offset that we use here.  */
+	  offset = starting_bit;
+	  offset += TREE_INT_CST_ELT (DECL_FIELD_BIT_OFFSET (field), 0);
+	  offset %= 32;
+
+	  /* This is the actual size of the field, for bitfields this is the
+	     bitfield width and not the container size.  */
+	  size = TREE_INT_CST_ELT (DECL_SIZE (field), 0);
+
+	  if (*last_used_bit != offset)
+	    {
+	      if (offset < *last_used_bit)
+		{
+		  /* This field's offset is before the 'last_used_bit', that
+		     means this field goes on the next register.  So we need to
+		     pad the rest of the current register and increase the
+		     register number.
+		     */
+		  uint32_t mask;
+		  mask = UINT32_MAX - ((uint32_t) 1 << *last_used_bit);
+		  mask++;
+
+		  *(padding_bits_to_clear + *regno) |= mask;

padding_bits_to_clear[*regno] |= mask;

+		  not_to_clear_reg_mask |= HOST_WIDE_INT_1U << *regno;
+		  (*regno)++;
+		}
+	      else
+		{
+		  /* Otherwise we pad the bits between the last field's end and
+		     the start of the new field.  */
+		  uint32_t mask;
+
+		  mask = UINT32_MAX >> (32 - offset);
+		  mask -= ((uint32_t) 1 << *last_used_bit) - 1;
+		  *(padding_bits_to_clear + *regno) |= mask;

Likewise.

+		}
+	      current_bit = offset;
+	    }
+
+	  /* Calculate further padding bits for inner structs/unions too.  */
+	  if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (field)))
+	    {
+	      *last_used_bit = current_bit;
+	      not_to_clear_reg_mask
+		|= comp_not_to_clear_mask_str_un (TREE_TYPE (field), regno,
+						  padding_bits_to_clear, offset,
+						  last_used_bit);
+	    }
+	  else
+	    {
+	      /* Update 'current_bit' with this field's size.  If the
+		 'current_bit' lies in a subsequent register, update 'regno' and
+		 reset 'current_bit' to point to the current bit in that new
+		 register.  */
+	      current_bit += size;
+	      while (current_bit >= 32)
+		{
+		  current_bit -= 32;
+		  not_to_clear_reg_mask |= HOST_WIDE_INT_1U << *regno;
+		  (*regno)++;
+		}
+	      *last_used_bit = current_bit;
+	    }
+
+	  field = TREE_CHAIN (field);
+	}
+      not_to_clear_reg_mask |= HOST_WIDE_INT_1U << *regno;
+    }
+  else if (TREE_CODE (arg_type) == UNION_TYPE)
+    {
+      tree field, field_t;
+      int i, regno_t, field_size;
+      int max_reg = -1;
+      int max_bit = -1;
+      uint32_t mask;
+      uint32_t padding_bits_to_clear_res[NUM_ARG_REGS]
+	= {UINT32_MAX, UINT32_MAX, UINT32_MAX, UINT32_MAX};
+
+      /* To compute the padding bits in a union we only consider bits as
+	 padding bits if they are always either a padding bit or fall outside a
+	 fields size for all fields in the union.
+	 */
+      field = TYPE_FIELDS (arg_type);
+      while (field)
+	{
+	  uint32_t padding_bits_to_clear_t[NUM_ARG_REGS]
+	    = {0U, 0U, 0U, 0U};
+	  int last_used_bit_t = *last_used_bit;
+	  regno_t = *regno;
+	  field_t = TREE_TYPE (field);
+
+	  /* If the field's type is either a record or a union make sure to
+	     compute their padding bits too.  */
+	  if (RECORD_OR_UNION_TYPE_P (field_t))
+	    not_to_clear_reg_mask
+	      |= comp_not_to_clear_mask_str_un (field_t, &regno_t,
+						&padding_bits_to_clear_t[0],
+						starting_bit, &last_used_bit_t);
+	  else
+	    {
+	      field_size = TREE_INT_CST_ELT (DECL_SIZE (field), 0);
+	      regno_t = (field_size / 32) + *regno;
+	      last_used_bit_t = (starting_bit + field_size) % 32;
+	    }
+
+	  for (i = *regno; i < regno_t; i++)
+	    {
+	      /* For all but the last register used by this field only keep the
+		 padding bits that were padding bits in this field.  */
+	      padding_bits_to_clear_res[i] &= padding_bits_to_clear_t[i];
+	    }
+
+	  /* For the last register, keep all padding bits that were padding
+	     bits in this field and any padding bits that are still valid
+	     as padding bits but fall outside of this field's size.  */
+	  mask = (UINT32_MAX - ((uint32_t) 1 << last_used_bit_t)) + 1;
+	  padding_bits_to_clear_res[regno_t]
+	    &= padding_bits_to_clear_t[regno_t] | mask;
+
+	  /* Update the maximum size of the fields in terms of registers used
+	     ('max_reg') and the 'last_used_bit' in said register.  */
+	  if (max_reg < regno_t)
+	    {
+	      max_reg = regno_t;
+	      max_bit = last_used_bit_t;
+	    }
+	  else if (max_reg == regno_t && max_bit < last_used_bit_t)
+	    max_bit = last_used_bit_t;
+
+	  field = TREE_CHAIN (field);
+	}
+
+      /* Update the current padding_bits_to_clear using the intersection of the
+	 padding bits of all the fields.  */
+      for (i=*regno; i < max_reg; i++)
+	padding_bits_to_clear[i] |= padding_bits_to_clear_res[i];

watch the spacing in the 'for' definition.

+      /* Do not keep trailing padding bits, we do not know yet whether this
+	 is the end of the argument.
+	 */
+      mask = ((uint32_t) 1 << max_bit) - 1;
+      padding_bits_to_clear[max_reg]
+	|= padding_bits_to_clear_res[max_reg] & mask;
+
+      *regno = max_reg;
+      *last_used_bit = max_bit;
+    }
+  else
+    /* This function should only be used for structs and unions.  */
+    gcc_unreachable ();
+
+  return not_to_clear_reg_mask;
+}
+
+/* In the context of ARMv8-M Security Extensions, this function is used for
+   both 'cmse_nonsecure_call' and 'cmse_nonsecure_entry' functions to compute
+   what registers are used when returning or passing arguments, which is then
+   returned as a mask.  It will also compute a mask to indicate padding/unused
+   bits for each of these registers, and passes this through the
+   PADDING_BITS_TO_CLEAR pointer.  The tree of the argument type is passed in
+   ARG_TYPE, the rtl representation of the argument is passed in ARG_RTX and
+   the starting register used to pass this argument or return value is passed
+   in REGNO.  It makes use of 'comp_not_to_clear_mask_str_un' to compute these
+   for struct and union types.  */
+
+static unsigned HOST_WIDE_INT
+compute_not_to_clear_mask (tree arg_type, rtx arg_rtx, int regno,
+			   uint32_t * padding_bits_to_clear)
+
+{
+  int last_used_bit = 0;
+  unsigned HOST_WIDE_INT not_to_clear_mask;
+
+  if (RECORD_OR_UNION_TYPE_P (arg_type))
+    {
+      not_to_clear_mask
+	= comp_not_to_clear_mask_str_un (arg_type, &regno,
+					 padding_bits_to_clear, 0,
+					 &last_used_bit);
+
+      /* If the 'last_used_bit' is not zero, that means we are still using a
+	 part of the last 'regno'.  In such cases we must clear the trailing
+	 bits.  Otherwise we are not using regno and we should mark it as to
+	 clear.  */
+      if (last_used_bit != 0)
+	*(padding_bits_to_clear + regno)
+	  |= UINT32_MAX - ((uint32_t) 1 << last_used_bit) + 1;

padding_bits_to_clear[regno] |= ...

+      else
+	not_to_clear_mask &= ~(HOST_WIDE_INT_1U << regno);
+    }
+  else
+    {
+      not_to_clear_mask = 0;
+      /* We are not dealing with structs nor unions.  So these arguments may
+	 be passed in floating point registers too.
+	 In some cases a BLKmode is used when returning or passing arguments
+	 in multiple VFP registers.  */
+      if (GET_MODE (arg_rtx) == BLKmode)
+	{
+	  int i, arg_regs;
+	  rtx reg;
+
+	  /* This should really only occur when dealing with the hard-float
+	     ABI.  */
+	  gcc_assert (TARGET_HARD_FLOAT_ABI);
+
+	  for (i = 0; i < XVECLEN (arg_rtx, 0); i++)
+	    {
+	      reg = XEXP (XVECEXP (arg_rtx, 0, i), 0);
+	      gcc_assert (REG_P (reg));
+
+	      not_to_clear_mask |= HOST_WIDE_INT_1U << REGNO (reg);
+
+	      /* If we are dealing with DF mode, make sure we don't
+		 clear either of the registers it addresses.  */
+	      arg_regs = ARM_NUM_REGS (GET_MODE (reg));

Better assert here that you're indeed dealing with DFmode and/or you have 2 registers.

+/* Clear caller saved registers not used to pass return values and leaked
+   condition flags before exiting a cmse_nonsecure_entry function.  */
+
+void
+cmse_nonsecure_entry_clear_before_return (void)
+{
+  uint64_t to_clear_mask[2];
+  uint32_t padding_bits_to_clear = 0;
+  uint32_t * padding_bits_to_clear_ptr = &padding_bits_to_clear;
+  int regno, maxregno = IP_REGNUM;
+  tree result_type;
+  rtx result_rtl;
+
+  to_clear_mask[0] = (1ULL << (NUM_ARG_REGS)) - 1;
+  to_clear_mask[0] |= (1ULL << IP_REGNUM);
+
+  /* If we are not dealing with -mfloat-abi=soft we will need to clear VFP
+     registers.  We also check TARGET_HARD_FLOAT to make sure these are
+     present.  */
+  if (TARGET_HARD_FLOAT)
+    {
+      uint64_t float_mask = (1ULL << (D7_VFP_REGNUM + 1)) - 1;
+      maxregno = LAST_VFP_REGNUM;
+
+      float_mask &= ~((1ULL << FIRST_VFP_REGNUM) - 1);
+      to_clear_mask[0] |= float_mask;
+
+      float_mask = (1ULL << (maxregno - 63)) - 1;
+      to_clear_mask[1] = float_mask;
+
+      /* Make sure we dont clear the two scratch registers used to clear the
+	 relevant FPSCR bits in output_return_instruction.  We have only
+	 implemented the clearing of FP registers for Thumb-2, so we assert
+	 here that VFP was not enabled for Thumb-1 ARMv8-M targets.
+	 */
+      gcc_assert (arm_arch_thumb2);
+      emit_use (gen_rtx_REG (SImode, IP_REGNUM));
+      to_clear_mask[0] &= ~(1ULL << IP_REGNUM);
+      emit_use (gen_rtx_REG (SImode, 4));
+      to_clear_mask[0] &= ~(1ULL << 4);
+    }
+
+  /* If the user has defined registers to be caller saved, these are no longer
+     restored by the function before returning and must thus be cleared for
+     security purposes.  */
+  for (regno = NUM_ARG_REGS; regno < LAST_VFP_REGNUM; regno++)
+    {
+      /* We do not touch registers that can be used to pass arguments as per
+	 the AAPCS, since these should never be made callee-saved by user
+	 options.  */
+      if (regno >= FIRST_VFP_REGNUM && regno <= D7_VFP_REGNUM)
+	continue;
+      if (regno >= IP_REGNUM && regno <= PC_REGNUM)
+	continue;

Please use IN_RANGE.

+      if (call_used_regs[regno])
+	to_clear_mask[regno / 64] |= (1ULL << (regno % 64));
+    }
+
+  /* Make sure we do not clear the registers used to return the result in.  */
+  result_type = TREE_TYPE (DECL_RESULT (current_function_decl));
+  if (!VOID_TYPE_P (result_type))
+    {
+      result_rtl = arm_function_value (result_type, current_function_decl, 0);
+
+      /* No need to check that we return in registers, because we don't
+	 support returning on stack yet.  */
+      to_clear_mask[0]
+	&= ~compute_not_to_clear_mask (result_type, result_rtl, 0,
+				       padding_bits_to_clear_ptr);
+    }
+
+  if (padding_bits_to_clear != 0)
+    {
+      rtx reg_rtx;
+      /* Padding bits to clear is not 0 so we know we are dealing with
+	 returning a composite type, which only uses r0.  Let's make sure that
+	 r1-r3 is cleared too, we will use r1 as a scratch register.  */
+      gcc_assert ((to_clear_mask[0] & 0xe) == 0xe);
+
+      reg_rtx = gen_rtx_REG (SImode, 1);
+

Use R1_REGNUM.

+      /* Fill the lower half of the negated padding_bits_to_clear.  */
+      emit_move_insn (reg_rtx,
+		      GEN_INT ((((~padding_bits_to_clear) << 16u) >> 16u)));
+
+      /* Also fill the top half of the negated padding_bits_to_clear.
+	 */
+      if (((~padding_bits_to_clear) >> 16) > 0)
+	emit_insn (gen_rtx_SET (gen_rtx_ZERO_EXTRACT (SImode, reg_rtx,
+						      GEN_INT (16),
+						      GEN_INT (16)),
+			        GEN_INT ((~padding_bits_to_clear) >> 16)));
+
+      emit_insn (gen_andsi3 (gen_rtx_REG (SImode, 0),
+			     gen_rtx_REG (SImode, 0),
+			     reg_rtx));

Likewise, use R0_REGNUM.

This is ok with the changes above.

Thanks,
Kyrill