On 27/10/16 11:44, Kyrill Tkachov wrote:
> 
> On 27/10/16 11:00, Andre Vieira (lists) wrote:
>> On 26/10/16 17:30, Kyrill Tkachov wrote:
>>> On 26/10/16 17:26, Andre Vieira (lists) wrote:
>>>> On 26/10/16 13:51, Kyrill Tkachov wrote:
>>>>> Hi Andre,
>>>>>
>>>>> On 25/10/16 17:29, Andre Vieira (lists) wrote:
>>>>>> On 24/08/16 12:01, Andre Vieira (lists) wrote:
>>>>>>> On 25/07/16 14:23, Andre Vieira (lists) wrote:
>>>>>>>> This patch extends support for the ARMv8-M Security Extensions
>>>>>>>> 'cmse_nonsecure_entry' attribute to safeguard against leak of
>>>>>>>> information through unbanked registers.
>>>>>>>>
>>>>>>>> When returning from a nonsecure entry function we clear all
>>>>>>>> caller-saved
>>>>>>>> registers that are not used to pass return values, by writing
>>>>>>>> either
>>>>>>>> the
>>>>>>>> LR, in case of general purpose registers, or the value 0, in case
>>>>>>>> of FP
>>>>>>>> registers. We use the LR to write to APSR and FPSCR too. We
>>>>>>>> currently do
>>>>>>>> not support entry functions that pass arguments or return
>>>>>>>> variables on
>>>>>>>> the stack and we diagnose this. This patch relies on the existing
>>>>>>>> code
>>>>>>>> to make sure callee-saved registers used in cmse_nonsecure_entry
>>>>>>>> functions are saved and restored thus retaining their nonsecure
>>>>>>>> mode
>>>>>>>> value, this should be happening already as it is required by AAPCS.
>>>>>>>>
>>>>>>>> This patch also clears padding bits for cmse_nonsecure_entry
>>>>>>>> functions
>>>>>>>> with struct and union return types. For unions a bit is only
>>>>>>>> considered
>>>>>>>> a padding bit if it is an unused bit in every field of that union.
>>>>>>>> The
>>>>>>>> function that calculates these is used in a later patch to do the
>>>>>>>> same
>>>>>>>> for arguments of cmse_nonsecure_call's.
>>>>>>>>
>>>>>>>> *** gcc/ChangeLog ***
>>>>>>>> 2016-07-25  Andre Vieira        <andre.simoesdiasvieira@arm.com>
>>>>>>>>                Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>>>>>>>
>>>>>>>>            * config/arm/arm.c (output_return_instruction): Clear
>>>>>>>>            registers.
>>>>>>>>            (thumb2_expand_return): Likewise.
>>>>>>>>            (thumb1_expand_epilogue): Likewise.
>>>>>>>>            (thumb_exit): Likewise.
>>>>>>>>            (arm_expand_epilogue): Likewise.
>>>>>>>>            (cmse_nonsecure_entry_clear_before_return): New.
>>>>>>>>            (comp_not_to_clear_mask_str_un): New.
>>>>>>>>            (compute_not_to_clear_mask): New.
>>>>>>>>            * config/arm/thumb1.md (*epilogue_insns): Change length
>>>>>>>> attribute.
>>>>>>>>            * config/arm/thumb2.md (*thumb2_return): Likewise.
>>>>>>>>
>>>>>>>> *** gcc/testsuite/ChangeLog ***
>>>>>>>> 2016-07-25  Andre Vieira        <andre.simoesdiasvieira@arm.com>
>>>>>>>>                Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>>>>>>>
>>>>>>>>            * gcc.target/arm/cmse/cmse.exp: Test different multilibs
>>>>>>>> separate.
>>>>>>>>            * gcc.target/arm/cmse/struct-1.c: New.
>>>>>>>>            * gcc.target/arm/cmse/bitfield-1.c: New.
>>>>>>>>            * gcc.target/arm/cmse/bitfield-2.c: New.
>>>>>>>>            * gcc.target/arm/cmse/bitfield-3.c: New.
>>>>>>>>            * gcc.target/arm/cmse/baseline/cmse-2.c: Test that
>>>>>>>> registers are
>>>>>>>> cleared.
>>>>>>>>            * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New.
>>>>>>>>            * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New.
>>>>>>>>            * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New.
>>>>>>>>            * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New.
>>>>>>>>            * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New.
>>>>>>>>
>>>>>>> Updated this patch to correctly clear only the cumulative
>>>>>>> exception-status (0-4,7) and the condition code bits (28-31) of the
>>>>>>> FPSCR. I also adapted the code to be handle the bigger floating
>>>>>>> point
>>>>>>> register files.
>>>>>>>
>>>>>>> ----
>>>>>>>
>>>>>>> This patch extends support for the ARMv8-M Security Extensions
>>>>>>> 'cmse_nonsecure_entry' attribute to safeguard against leak of
>>>>>>> information through unbanked registers.
>>>>>>>
>>>>>>> When returning from a nonsecure entry function we clear all
>>>>>>> caller-saved
>>>>>>> registers that are not used to pass return values, by writing
>>>>>>> either the
>>>>>>> LR, in case of general purpose registers, or the value 0, in case
>>>>>>> of FP
>>>>>>> registers. We use the LR to write to APSR. For FPSCR we clear
>>>>>>> only the
>>>>>>> cumulative exception-status (0-4, 7) and the condition code bits
>>>>>>> (28-31). We currently do not support entry functions that pass
>>>>>>> arguments
>>>>>>> or return variables on the stack and we diagnose this. This patch
>>>>>>> relies
>>>>>>> on the existing code to make sure callee-saved registers used in
>>>>>>> cmse_nonsecure_entry functions are saved and restored thus retaining
>>>>>>> their nonsecure mode value, this should be happening already as
>>>>>>> it is
>>>>>>> required by AAPCS.
>>>>>>>
>>>>>>> This patch also clears padding bits for cmse_nonsecure_entry
>>>>>>> functions
>>>>>>> with struct and union return types. For unions a bit is only
>>>>>>> considered
>>>>>>> a padding bit if it is an unused bit in every field of that
>>>>>>> union. The
>>>>>>> function that calculates these is used in a later patch to do the
>>>>>>> same
>>>>>>> for arguments of cmse_nonsecure_call's.
>>>>>>>
>>>>>>> *** gcc/ChangeLog ***
>>>>>>> 2016-07-xx  Andre Vieira        <andre.simoesdiasvieira@arm.com>
>>>>>>>                Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>>>>>>
>>>>>>>            * config/arm/arm.c (output_return_instruction): Clear
>>>>>>>            registers.
>>>>>>>            (thumb2_expand_return): Likewise.
>>>>>>>            (thumb1_expand_epilogue): Likewise.
>>>>>>>            (thumb_exit): Likewise.
>>>>>>>            (arm_expand_epilogue): Likewise.
>>>>>>>            (cmse_nonsecure_entry_clear_before_return): New.
>>>>>>>            (comp_not_to_clear_mask_str_un): New.
>>>>>>>            (compute_not_to_clear_mask): New.
>>>>>>>            * config/arm/thumb1.md (*epilogue_insns): Change length
>>>>>>> attribute.
>>>>>>>            * config/arm/thumb2.md (*thumb2_return): Duplicate
>>>>>>> pattern for
>>>>>>>            cmse_nonsecure_entry functions.
>>>>>>>
>>>>>>> *** gcc/testsuite/ChangeLog ***
>>>>>>> 2016-07-xx  Andre Vieira        <andre.simoesdiasvieira@arm.com>
>>>>>>>                Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>>>>>>
>>>>>>>            * gcc.target/arm/cmse/cmse.exp: Test different multilibs
>>>>>>> separate.
>>>>>>>            * gcc.target/arm/cmse/struct-1.c: New.
>>>>>>>            * gcc.target/arm/cmse/bitfield-1.c: New.
>>>>>>>            * gcc.target/arm/cmse/bitfield-2.c: New.
>>>>>>>            * gcc.target/arm/cmse/bitfield-3.c: New.
>>>>>>>            * gcc.target/arm/cmse/baseline/cmse-2.c: Test that
>>>>>>> registers
>>>>>>> are
>>>>>>> cleared.
>>>>>>>            * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New.
>>>>>>>            * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New.
>>>>>>>            * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New.
>>>>>>>            * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New.
>>>>>>>            * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New.
>>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Rebased previous patch on top of trunk as requested. No changes to
>>>>>> ChangeLog.
>>>>>>
>>>>>> Cheers,
>>>>>> Andre
>>>>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>>>>> index
>>>>> bb81e5662e81a26c7d3ccf9f749e8e356e6de35e..c6260323ecfd2f2842e6a5aab06b67da16619c73
>>>>>
>>>>>
>>>>> 100644
>>>>> --- a/gcc/config/arm/arm.c
>>>>> +++ b/gcc/config/arm/arm.c
>>>>> @@ -17496,6 +17496,279 @@ note_invalid_constants (rtx_insn *insn,
>>>>> HOST_WIDE_INT address, int do_pushes)
>>>>>      return;
>>>>>    }
>>>>>    +/* This function computes the clear mask and PADDING_BITS_TO_CLEAR
>>>>> for
>>>>> structs
>>>>> +   and unions in the context of ARMv8-M Security Extensions.  It is
>>>>> used as a
>>>>> +   helper function for both 'cmse_nonsecure_call' and
>>>>> 'cmse_nonsecure_entry'
>>>>> +   functions.  The PADDING_BITS_TO_CLEAR pointer can be the base to
>>>>> either one
>>>>> +   or four masks, depending on whether it is being computed for a
>>>>> +   'cmse_nonsecure_entry' return value or a 'cmse_nonsecure_call'
>>>>> argument
>>>>> +   respectively.  The tree for the type of the argument or a field
>>>>> within an
>>>>> +   argument is passed in ARG_TYPE, the current register this argument
>>>>> or field
>>>>> +   starts in is kept in the pointer REGNO and updated accordingly,
>>>>> the
>>>>> bit this
>>>>> +   argument or field starts at is passed in STARTING_BIT and the last
>>>>> used bit
>>>>> +   is kept in LAST_USED_BIT which is also updated accordingly.  */
>>>>> +
>>>>> +static unsigned HOST_WIDE_INT
>>>>> +comp_not_to_clear_mask_str_un (tree arg_type, int * regno,
>>>>> +                   uint32_t * padding_bits_to_clear,
>>>>> +                   unsigned starting_bit, int * last_used_bit)
>>>>> +
>>>>> +{
>>>>> +  unsigned HOST_WIDE_INT not_to_clear_reg_mask = 0;
>>>>> +
>>>>> +  if (TREE_CODE (arg_type) == RECORD_TYPE)
>>>>> +    {
>>>>> +      unsigned current_bit = starting_bit;
>>>>> +      tree field;
>>>>> +      long int offset, size;
>>>>> +
>>>>> +
>>>>> +      field = TYPE_FIELDS (arg_type);
>>>>> +      while (field)
>>>>> +    {
>>>>> +      /* The offset within a structure is always an offset from
>>>>> +         the start of that structure.  Make sure we take that into
>>>>> the
>>>>> +         calculation of the register based offset that we use
>>>>> here.  */
>>>>> +      offset = starting_bit;
>>>>> +      offset += TREE_INT_CST_ELT (DECL_FIELD_BIT_OFFSET (field), 0);
>>>>> +      offset %= 32;
>>>>> +
>>>>> +      /* This is the actual size of the field, for bitfields this is
>>>>> the
>>>>> +         bitfield width and not the container size.  */
>>>>> +      size = TREE_INT_CST_ELT (DECL_SIZE (field), 0);
>>>>> +
>>>>> +      if (*last_used_bit != offset)
>>>>> +        {
>>>>> +          if (offset < *last_used_bit)
>>>>> +        {
>>>>> +          /* This field's offset is before the 'last_used_bit', that
>>>>> +             means this field goes on the next register.  So we
>>>>> need to
>>>>> +             pad the rest of the current register and increase the
>>>>> +             register number.  */
>>>>> +          uint32_t mask;
>>>>> +          mask  = UINT32_MAX - ((uint32_t) 1 << *last_used_bit);
>>>>> +          mask++;
>>>>> +
>>>>> +          *(padding_bits_to_clear + *regno) |= mask;
>>>>>
>>>>> padding_bits_to_clear[*regno] |= mask;
>>>>>
>>>>> +          not_to_clear_reg_mask |= HOST_WIDE_INT_1U << *regno;
>>>>> +          (*regno)++;
>>>>> +        }
>>>>> +          else
>>>>> +        {
>>>>> +          /* Otherwise we pad the bits between the last field's
>>>>> end and
>>>>> +             the start of the new field.  */
>>>>> +          uint32_t mask;
>>>>> +
>>>>> +          mask = UINT32_MAX >> (32 - offset);
>>>>> +          mask -= ((uint32_t) 1 << *last_used_bit) - 1;
>>>>> +          *(padding_bits_to_clear + *regno) |= mask;
>>>>>
>>>>> Likewise.
>>>>>
>>>>> +        }
>>>>> +          current_bit = offset;
>>>>> +        }
>>>>> +
>>>>> +      /* Calculate further padding bits for inner structs/unions
>>>>> too.  */
>>>>> +      if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (field)))
>>>>> +        {
>>>>> +          *last_used_bit = current_bit;
>>>>> +          not_to_clear_reg_mask
>>>>> +        |= comp_not_to_clear_mask_str_un (TREE_TYPE (field), regno,
>>>>> +                          padding_bits_to_clear, offset,
>>>>> +                          last_used_bit);
>>>>> +        }
>>>>> +      else
>>>>> +        {
>>>>> +          /* Update 'current_bit' with this field's size.  If the
>>>>> +         'current_bit' lies in a subsequent register, update 'regno'
>>>>> and
>>>>> +         reset 'current_bit' to point to the current bit in that new
>>>>> +         register.  */
>>>>> +          current_bit += size;
>>>>> +          while (current_bit >= 32)
>>>>> +        {
>>>>> +          current_bit-=32;
>>>>> +          not_to_clear_reg_mask |= HOST_WIDE_INT_1U << *regno;
>>>>> +          (*regno)++;
>>>>> +        }
>>>>> +          *last_used_bit = current_bit;
>>>>> +        }
>>>>> +
>>>>> +      field = TREE_CHAIN (field);
>>>>> +    }
>>>>> +      not_to_clear_reg_mask |= HOST_WIDE_INT_1U << *regno;
>>>>> +    }
>>>>> +  else if (TREE_CODE (arg_type) == UNION_TYPE)
>>>>> +    {
>>>>> +      tree field, field_t;
>>>>> +      int i, regno_t, field_size;
>>>>> +      int max_reg = -1;
>>>>> +      int max_bit = -1;
>>>>> +      uint32_t mask;
>>>>> +      uint32_t padding_bits_to_clear_res[NUM_ARG_REGS]
>>>>> +    = {UINT32_MAX, UINT32_MAX, UINT32_MAX, UINT32_MAX};
>>>>> +
>>>>> +      /* To compute the padding bits in a union we only consider
>>>>> bits as
>>>>> +     padding bits if they are always either a padding bit or fall
>>>>> outside a
>>>>> +     fields size for all fields in the union.  */
>>>>> +      field = TYPE_FIELDS (arg_type);
>>>>> +      while (field)
>>>>> +    {
>>>>> +      uint32_t padding_bits_to_clear_t[NUM_ARG_REGS]
>>>>> +        = {0U, 0U, 0U, 0U};
>>>>> +      int last_used_bit_t = *last_used_bit;
>>>>> +      regno_t = *regno;
>>>>> +      field_t = TREE_TYPE (field);
>>>>> +
>>>>> +      /* If the field's type is either a record or a union make
>>>>> sure to
>>>>> +         compute their padding bits too.  */
>>>>> +      if (RECORD_OR_UNION_TYPE_P (field_t))
>>>>> +        not_to_clear_reg_mask
>>>>> +          |= comp_not_to_clear_mask_str_un (field_t, &regno_t,
>>>>> +                        &padding_bits_to_clear_t[0],
>>>>> +                        starting_bit, &last_used_bit_t);
>>>>> +      else
>>>>> +        {
>>>>> +          field_size = TREE_INT_CST_ELT (DECL_SIZE (field), 0);
>>>>> +          regno_t = (field_size / 32) + *regno;
>>>>> +          last_used_bit_t = (starting_bit + field_size) % 32;
>>>>> +        }
>>>>> +
>>>>> +      for (i = *regno; i < regno_t; i++)
>>>>> +        {
>>>>> +          /* For all but the last register used by this field only
>>>>> keep
>>>>> the
>>>>> +         padding bits that were padding bits in this field.  */
>>>>> +          padding_bits_to_clear_res[i] &= padding_bits_to_clear_t[i];
>>>>> +        }
>>>>> +
>>>>> +        /* For the last register, keep all padding bits that were
>>>>> padding
>>>>> +           bits in this field and any padding bits that are still
>>>>> valid
>>>>> +           as padding bits but fall outside of this field's size.  */
>>>>> +        mask = (UINT32_MAX - ((uint32_t) 1 << last_used_bit_t)) + 1;
>>>>> +        padding_bits_to_clear_res[regno_t]
>>>>> +          &= padding_bits_to_clear_t[regno_t] | mask;
>>>>> +
>>>>> +      /* Update the maximum size of the fields in terms of registers
>>>>> used
>>>>> +         ('max_reg') and the 'last_used_bit' in said register.  */
>>>>> +      if (max_reg < regno_t)
>>>>> +        {
>>>>> +          max_reg = regno_t;
>>>>> +          max_bit = last_used_bit_t;
>>>>> +        }
>>>>> +      else if (max_reg == regno_t && max_bit < last_used_bit_t)
>>>>> +        max_bit = last_used_bit_t;
>>>>> +
>>>>> +      field = TREE_CHAIN (field);
>>>>> +    }
>>>>> +
>>>>> +      /* Update the current padding_bits_to_clear using the
>>>>> intersection of the
>>>>> +     padding bits of all the fields.  */
>>>>> +      for (i=*regno; i < max_reg; i++)
>>>>> +    padding_bits_to_clear[i] |= padding_bits_to_clear_res[i];
>>>>> +
>>>>>
>>>>> watch the spacing in the 'for' definition.
>>>>>
>>>>>    +      /* Do not keep trailing padding bits, we do not know yet
>>>>> whether
>>>>> this
>>>>> +     is the end of the argument.  */
>>>>> +      mask = ((uint32_t) 1 << max_bit) - 1;
>>>>> +      padding_bits_to_clear[max_reg]
>>>>> +    |= padding_bits_to_clear_res[max_reg] & mask;
>>>>> +
>>>>> +      *regno = max_reg;
>>>>> +      *last_used_bit = max_bit;
>>>>> +    }
>>>>> +  else
>>>>> +    /* This function should only be used for structs and unions.  */
>>>>> +    gcc_unreachable ();
>>>>> +
>>>>> +  return not_to_clear_reg_mask;
>>>>> +}
>>>>> +
>>>>> +/* In the context of ARMv8-M Security Extensions, this function is
>>>>> used
>>>>> for both
>>>>> +   'cmse_nonsecure_call' and 'cmse_nonsecure_entry' functions to
>>>>> compute what
>>>>> +   registers are used when returning or passing arguments, which is
>>>>> then
>>>>> +   returned as a mask.  It will also compute a mask to indicate
>>>>> padding/unused
>>>>> +   bits for each of these registers, and passes this through the
>>>>> +   PADDING_BITS_TO_CLEAR pointer.  The tree of the argument type is
>>>>> passed in
>>>>> +   ARG_TYPE, the rtl representation of the argument is passed in
>>>>> ARG_RTX and
>>>>> +   the starting register used to pass this argument or return
>>>>> value is
>>>>> passed
>>>>> +   in REGNO.  It makes use of 'comp_not_to_clear_mask_str_un' to
>>>>> compute these
>>>>> +   for struct and union types.  */
>>>>> +
>>>>> +static unsigned HOST_WIDE_INT
>>>>> +compute_not_to_clear_mask (tree arg_type, rtx arg_rtx, int regno,
>>>>> +                 uint32_t * padding_bits_to_clear)
>>>>> +
>>>>> +{
>>>>> +  int last_used_bit = 0;
>>>>> +  unsigned HOST_WIDE_INT not_to_clear_mask;
>>>>> +
>>>>> +  if (RECORD_OR_UNION_TYPE_P (arg_type))
>>>>> +    {
>>>>> +      not_to_clear_mask
>>>>> +    = comp_not_to_clear_mask_str_un (arg_type, &regno,
>>>>> +                     padding_bits_to_clear, 0,
>>>>> +                     &last_used_bit);
>>>>> +
>>>>> +
>>>>> +      /* If the 'last_used_bit' is not zero, that means we are still
>>>>> using a
>>>>> +     part of the last 'regno'.  In such cases we must clear the
>>>>> trailing
>>>>> +     bits.  Otherwise we are not using regno and we should mark it
>>>>> as to
>>>>> +     clear.  */
>>>>> +      if (last_used_bit != 0)
>>>>> +    *(padding_bits_to_clear + regno)
>>>>> +      |= UINT32_MAX - ((uint32_t) 1 << last_used_bit) + 1;
>>>>>
>>>>> padding_bits_to_clear[regno] |= ...
>>>>>
>>>>> +      else
>>>>> +    not_to_clear_mask &= ~(HOST_WIDE_INT_1U << regno);
>>>>> +    }
>>>>> +  else
>>>>> +    {
>>>>> +      not_to_clear_mask = 0;
>>>>> +      /* We are not dealing with structs nor unions.  So these
>>>>> arguments may be
>>>>> +     passed in floating point registers too.  In some cases a
>>>>> BLKmode is
>>>>> +     used when returning or passing arguments in multiple VFP
>>>>> registers.  */
>>>>> +      if (GET_MODE (arg_rtx) == BLKmode)
>>>>> +    {
>>>>> +      int i, arg_regs;
>>>>> +      rtx reg;
>>>>> +
>>>>> +      /* This should really only occur when dealing with the
>>>>> hard-float
>>>>> +         ABI.  */
>>>>> +      gcc_assert (TARGET_HARD_FLOAT_ABI);
>>>>> +
>>>>> +      for (i = 0; i < XVECLEN (arg_rtx, 0); i++)
>>>>> +        {
>>>>> +          reg = XEXP (XVECEXP (arg_rtx, 0, i), 0);
>>>>> +          gcc_assert (REG_P (reg));
>>>>> +
>>>>> +          not_to_clear_mask |= HOST_WIDE_INT_1U << REGNO (reg);
>>>>> +
>>>>> +          /* If we are dealing with DF mode, make sure we don't
>>>>> +         clear either of the registers it addresses.  */
>>>>> +          arg_regs = ARM_NUM_REGS (GET_MODE (reg));
>>>>>
>>>>> Better assert here that you're indeed dealing with DFmode and/or you
>>>>> have 2 registers.
>>>>>
>>>> The current code actually works for larger modes too. I don't think
>>>> code
>>>> will ever be generated with larger types, but why assert if it works
>>>> anyway?
>>> ok, no need to assert here then.
>>> I suppose it doesn't add much if the code handles the other modes fine.
>>>
>>> Kyrill
>>>
>> Hi,
>>
>> Reworked comments. No change to ChangeLogs.
>>
>> Cheers,
>> Andre
> 
> 
> +/* Clear caller saved registers not used to pass return values and leaked
> +   condition flags before exiting a cmse_nonsecure_entry function.  */
> +
> +void
> +cmse_nonsecure_entry_clear_before_return (void)
> +{
> +  uint64_t to_clear_mask[2];
> +  uint32_t padding_bits_to_clear = 0;
> +  uint32_t * padding_bits_to_clear_ptr = &padding_bits_to_clear;
> +  int regno, maxregno = IP_REGNUM;
> +  tree result_type;
> +  rtx result_rtl;
> +
> +  to_clear_mask[0] = (1ULL << (NUM_ARG_REGS)) - 1;
> +  to_clear_mask[0] |= (1ULL << IP_REGNUM);
> +  /* If we are not dealing with -mfloat-abi=soft we will need to clear VFP
> +     registers.  We also check TARGET_HARD_FLOAT to make sure these are
> +     present.  */
> +  if (TARGET_HARD_FLOAT)
> +    {
> +      uint64_t float_mask = (1ULL << (D7_VFP_REGNUM + 1)) - 1;
> +      maxregno = LAST_VFP_REGNUM;
> +
> +      float_mask &= ~((1ULL << FIRST_VFP_REGNUM) - 1);
> +      to_clear_mask[0] |= float_mask;
> +
> +      float_mask = (1ULL << (maxregno - 63)) - 1;
> +      to_clear_mask[1] = float_mask;
> +
> +      /* Make sure we dont clear the two scratch registers used to
> clear the
> +     relevant FPSCR bits in output_return_instruction.  We have only
> +     implemented the clearing of FP registers for Thumb-2, so we assert
> +     here that VFP was not enabled for Thumb-1 ARMv8-M targets.  */
> +      gcc_assert (arm_arch_thumb2);
> 
> I see this assert triggering when running the testsuite cmse.exp with
> /-march=armv8-m.base/-mfloat-abi=softfp/-mfpu=fpv5-d16
> from a toolchain configured with "--with-cpu=cortex-a15
> --with-float=hard --with-mode=thumb --with-fpu=neon-vfpv4".
> I think some more validation needs to happen to reject the attributes
> for invalid configurations.
> 
> Kyrill
> 
> 
> 
Hi,

So I believe the option is there to allow for interworking on targets
that support 16bit Thumb (but not 32bit Thumb), using pragmas and
compiling with -mfloat-abi=softfp. Yes, I agree it doesn't make sense to
allow it for targets like ARMv6-M or ARMv8-M Baseline, but we can look
at that separately.

For now I changed the checks for clearing FP registers to check not only
for 'TARGET_HARD_FLOAT' but also '!TARGET_THUMB1'.

I suggest we re-evaluate the definition of 'TARGET_HARD_FLOAT' at a
later point, either adding a '&& !TARGET_THUMB1' or change the comment
above it. The comment reads 'Use hardware floating point instructions',
which is wrong as we have seen from this exercise. I also got rid of the
assert you were hitting, since given the changes it is impossible to hit
it. I did add a test to the testsuite checking that we don't generate FP
instructions when compiling with '-march=armv8-m.base -mfloat-abi=softfp'.

Is this OK?

Cheers,
Andre

*** gcc/ChangeLog ***
2016-10-28  Andre Vieira        <andre.simoesdiasvieira@arm.com>
            Thomas Preud'homme  <thomas.preudhomme@arm.com>

        * config/arm/arm.c (output_return_instruction): Clear
        registers.
        (thumb2_expand_return): Likewise.
        (thumb1_expand_epilogue): Likewise.
        (thumb_exit): Likewise.
        (arm_expand_epilogue): Likewise.
        (cmse_nonsecure_entry_clear_before_return): New.
        (comp_not_to_clear_mask_str_un): New.
        (compute_not_to_clear_mask): New.
        * config/arm/thumb1.md (*epilogue_insns): Change length attribute.
        * config/arm/thumb2.md (*thumb2_return): Duplicate pattern for
        cmse_nonsecure_entry functions.

*** gcc/testsuite/ChangeLog ***
2016-10-28  Andre Vieira        <andre.simoesdiasvieira@arm.com>
            Thomas Preud'homme  <thomas.preudhomme@arm.com>

        * gcc.target/arm/cmse/cmse.exp: Test different multilibs separate.
        * gcc.target/arm/cmse/struct-1.c: New.
        * gcc.target/arm/cmse/bitfield-1.c: New.
        * gcc.target/arm/cmse/bitfield-2.c: New.
        * gcc.target/arm/cmse/bitfield-3.c: New.
        * gcc.target/arm/cmse/baseline/cmse-2.c: Test that registers are
cleared.
        * gcc.target/arm/cmse/baseline/softfp.c: New.
        * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New.
        * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New.
        * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New.
        * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New.
        * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New.