public inbox for gcc-cvs@sourceware.org
* [gcc(refs/vendors/ARM/heads/morello)] Pad and align objects to enable precisely bounded capabilities
@ 2021-12-10 16:48 Matthew Malcomson
From: Matthew Malcomson @ 2021-12-10 16:48 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:b302420cb558d799669fbf7b7f06ec3f8b6f541e

commit b302420cb558d799669fbf7b7f06ec3f8b6f541e
Author: Matthew Malcomson <matthew.malcomson@arm.com>
Date:   Wed Oct 6 12:05:13 2021 +0100

    Pad and align objects to enable precisely bounded capabilities
    
    There are some limits on the bounds that capabilities can represent.
    This is only a problem at large sizes (greater than 2^14 bytes).
    That means that if an object is of a problematic size, pointers to
    that object cannot be given bounds precise enough to prevent them
    being used to access neighbouring information.
    
    The algorithm for determining whether a given base and limit can be
    represented precisely is described in the Morello ARM, section 2.5.1,
    rule R_KDDZF (under "Setting and encoding bounds").
    
    While the ABI does not require that capabilities pointing to objects
    avoid overlapping other objects, it is very important for security.
    GNU ld currently refuses to link objects where the capability bounds
    cannot avoid overlapping other objects.  LLVM adds padding directly
    after objects, and specifies the size of the object including this
    padding, to try to always get precise bounds.
    
    LLD does not refuse to link object files containing capabilities that
    cannot be precisely bounded.  I believe we should follow suit in GNU
    ld, since the current behaviour is an artificially strong
    restriction.  However, maintaining the error for now is a useful
    mechanism to ensure that our padding implementation works, and if
    no-one hits this error I don't see any reason to remove it (i.e.
    let's wait and see how the balance between finding GCC bugs and
    rejecting correct code plays out).  That said, in the GCC testsuite
    we have at least one case where we cannot specify the required
    alignment because it is overridden by a user-specified one.  This
    indicates that we are somewhat likely to require this relaxation at
    some point in the future.
    
    We implement the algorithm following the Morello ARM, and use it both
    to determine the alignment required for an object and the size that
    its symbol should have.
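    The rule can be sketched as a stand-alone function.  This is an
    illustrative reimplementation, not the compiler's own code: the name
    `required_alignment` and the use of `__builtin_clzll` are assumptions
    of this sketch.  Lengths below 2^14 need no special alignment; above
    that the exponent E grows with the length and the required alignment
    is 2^(E+3) bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch of the Morello precise-bounds alignment rule
   described above (not the compiler's own code).  */
uint64_t
required_alignment (uint64_t length)
{
  if (length == 0)
    return 1;  /* __builtin_clzll is undefined for 0.  */
  /* Plus one to account for the specification using a 65-bit length
     while we use a 64-bit one.  */
  unsigned num_zeros = (unsigned) __builtin_clzll (length) + 1;
  if (num_zeros > 50)
    return 1;  /* Below 2^14: no special alignment needed.  */
  uint64_t E = 50 - num_zeros;
  uint64_t align = 1ULL << (E + 3);
  /* Rounding the length up to this alignment may carry into a higher
     bit and raise the requirement once more.  */
  uint64_t rounded = (length + align - 1) & ~(align - 1);
  if (rounded != length)
    return required_alignment (rounded);
  return align;
}
```

    With this sketch a 16383-byte object needs no special alignment,
    while a 32761-byte object rounds up to 32768 bytes and needs
    16-byte alignment, matching the selftest values in the patch below.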
    
    In cases where the user has specified an alignment, or when we are
    emitting a TLS variable, we do not adjust the alignment.
    
    LLVM still adds the additional padding and alignment to objects which
    have user-specified alignment.  GCC has previously had problems with
    padding and alignment greater than the alignment requested by the user.
    https://gcc.gnu.org/pipermail/gcc-patches/2001-August/056652.html
    
    Enabling this cheri-bounds padding and alignment for user-aligned
    symbols would work for us, given that all the symbols in crtstuff.c
    which we know require exact alignment are small enough that the code
    would not add padding to them.  However, it does not seem a cohesive
    approach, since it would not work if the same requirement for a
    specific alignment (and no greater than that) arose on a larger
    object.
    
    Similarly, we could have introduced a new alignment attribute to
    distinguish a request for exactly a given alignment (and no more)
    from the usual request for at least that alignment, but that seems
    to add extra complexity that is not quite worth it.
    
    Here we disable the alignment and padding for objects with a
    user-specified alignment only if the decl of that object also has a
    named section.  This does not *quite* mesh with the existing GCC
    behaviour of avoiding any changes to an object's alignment if it was
    user-specified, but it is a minor variation, accepted due to
    implementation concerns outlined in the implementation notes below.
    This introduces a difference in behaviour between LLVM and GCC
    (though we still maintain ABI compatibility since -- to repeat --
    this is not a *requirement*).  It also introduces another use-case
    that, if hit in the wild, would force us to relax the restriction in
    GNU ld.  A warning is added for such cases.
    
    The decision to avoid extra alignment and padding on TLS variables
    was made mainly for consistency with the decision on user-aligned
    variables; there is no strong reason for or against it.  Elsewhere
    in GCC we avoid adding extra alignment to TLS variables "because TLS
    space is too precious".  Padding such variables is unlikely to be
    needed, since very large TLS objects are rarely seen.
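    The overall padding policy described above (skip user-aligned decls
    with a named section, skip TLS, otherwise pad to the representable
    size) can be sketched as below.  The names are hypothetical: the two
    booleans stand in for the DECL_USER_ALIGN + DECL_SECTION_NAME and
    DECL_THREAD_LOCAL_P checks in the real hooks, and the helper is an
    illustrative restatement of the Morello ARM rule cited above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Alignment needed so that Morello capability bounds around `length`
   bytes can be precise (illustrative, per the rule cited above).  */
uint64_t
representable_alignment (uint64_t length)
{
  if (length == 0)
    return 1;  /* __builtin_clzll is undefined for 0.  */
  unsigned num_zeros = (unsigned) __builtin_clzll (length) + 1;
  if (num_zeros > 50)
    return 1;
  uint64_t align = 1ULL << ((50 - num_zeros) + 3);
  uint64_t rounded = (length + align - 1) & ~(align - 1);
  /* Rounding up may carry a bit and raise the requirement once.  */
  return rounded == length ? align : representable_alignment (rounded);
}

/* Zero bytes of padding to emit after an object of `size` bytes.  */
uint64_t
padding_for_object (uint64_t size, bool user_aligned_with_section,
                    bool is_tls)
{
  if (user_aligned_with_section || is_tls)
    return 0;  /* Respect the user's layout; TLS space is too precious.  */
  uint64_t align = representable_alignment (size);
  uint64_t padded = (size + align - 1) & ~(align - 1);
  return padded - size;
}
```

    For example, a 32761-byte object would receive 7 bytes of padding to
    reach 32768, unless it is thread-local or user-aligned into a named
    section.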
    
    --- Notes on implementation:
    
    Having precise bounds on a capability to an object is optional -- it
    is a security enhancement rather than an ABI requirement.  This means
    that we can't set DECL_ALIGN on all declarations.  DECL_ALIGN is used
    both for emitting the variable and, by code accessing the variable,
    as a "guaranteed" alignment.  We can only set the alignment on
    objects that are emitted in the current translation unit.
    
    There are two obvious approaches to setting this alignment, one is to
    adjust the alignment at the point it is emitted (along with the extra
    padding), and the other is to change the DECL_ALIGN only for those
    objects which will be emitted into the current object file.
    
    There are two minor points in favour of using the DECL_ALIGN method:
    1) noswitch sections use callbacks to implement the actual emission of
       alignment, and these take their alignment directly from the DECL.
       Hence using DECL_ALIGN means we only have to adjust
       assemble_noswitch_variable, while adjusting the alignment when
       emitting means we have to adjust each of the noswitch section
       callbacks it uses.
    2) Any code which may take advantage of the alignment for better
       codegen can see it (though this is expected to be rare, since the
       alignment only changes for very large objects).
    And there are three minor points against using the DECL_ALIGN method:
    1) We can only set DECL_ALIGN on *known* local objects.  In shared
       libraries symbols may be interposed.  Hence DECL_ALIGN cannot be
       set on possibly interposed DECLs, but the alignment can still be
       emitted when outputting the assembly for the object that will be
       stored in this shared library.
    2) Alignment and padding are specified in different places in the code,
       which makes it slightly less obvious that both have been done
       everywhere we require.
    3) DECL_ALIGN stores bitwise alignment in an unsigned integer.  This
       means that we can overflow the alignment requirement more easily
       than if we were to use a uint64_t value directly (which would be
       easy to implement if not trying to store things on the TREE node
       structure).
    On balance we have chosen to set the alignment when emitting the variable
    and not store it in the DECL.
    
    N.b. ASAN adjusts the alignment using DECL_ALIGN everywhere, but ASAN
    *requires* alignment adjustment.  Hence it can rely on external objects
    having the correct alignment.
    
    In a similar manner, we do not want to change the DECL_SIZE/TYPE_SIZE of
    the object, since that would imply that there is more data than there
    actually is.  This would have consequences in all code accessing the
    object.
    
    Hence, we add padding and alignment adjustments just when the objects are
    being emitted.  Padding is emitted in the 4 "leaf emission" functions (as
    I'm calling them): assemble_constant_contents, assemble_variable_contents,
    assemble_noswitch_variable, and output_constant_pool_1.  Alignment must
    sometimes be handled in slightly different places due to the existing
    structure of the code.  It's handled in the noswitch callbacks as well as
    assemble_noswitch_variable, in assemble_variable rather than
    assemble_variable_contents, and in output_constant_def_contents rather
    than assemble_constant_contents.
    
    We calculate padding and alternate alignment using two new hooks,
    targetm.data_padding_size and targetm.data_alignment.
    The padding is emitted at the end of the object with zeros.  In all those
    leaf functions except assemble_noswitch_variable we emit the extra padding
    as a separate directive just to make the assembly a little clearer for
    anyone reading it.
    
    The hooks are not made specific to capabilities.  Rather, they are
    designed as general "add padding" and "add alignment" hooks that only
    act on data when it is being emitted.  To satisfy that, we make sure
    they are still called for objects which can never be large enough to
    cause capability representability problems.
    The main example of such objects is those which are
    CONSTANT_POOL_ADDRESS_P, since these can only have a size associated
    with their mode (and there are currently no modes whose sizes are too
    large for precise capability bounds).
    
    This adds the requirement that the alignment and size we are using
    must be passed to the hooks directly, since there may not be a DECL
    to provide the current size and/or alignment.
    
    We still pass the original DECL if it's available so that the hooks can
    provide warnings if necessary (it's nice to have this in the hook so that
    target-specific warning messages can be given).
    
    In order to let the linker know that capabilities to this symbol will no
    longer overlap with neighbouring objects we also need to adjust the size
    specified on the object with the size directive.  This is mostly
    accounted for by adjusting the size in those "leaf emission" functions,
    but assemble_variable_contents uses ASM_DECLARE_OBJECT_NAME to declare
    the size of the object and hence that needs to be updated.
    This macro is also used in asm_output_aligned_bss, which is an
    alternative to using the bss noswitch section callback via
    assemble_noswitch_section.  This is not used for AArch64, but we update
    it with the data padding size anyway in order to keep things consistent.
    Similarly, though the ASM_FINISH_DECLARE_OBJECT macro is not used in the
    AArch64 testsuite we update that too.
    
    For the implementation of object blocks, we must know the size and
    alignment of every element in the block.  As mentioned above we do
    not adjust the DECL_SIZE or DECL_ALIGN of objects, so we need to
    update the code in place_block_symbol and output_object_block to
    account for this padding too.  We can't add the padding between
    objects directly in output_object_block, since it uses functions to
    emit variables or constants which are also used for objects *not* in
    an object block.  Here we must make sure we account for the different
    size *before* adjusting the size to add ASAN redzones.  The changes
    are straightforward.
    
    We leave assemble_trampoline_template unmodified, with the new
    requirement that if the target wants padding and alignment for
    trampolines it must change the template it uses and the
    TRAMPOLINE_SIZE macro accordingly.
    
    GCC uses DECL_USER_ALIGN to indicate that a declaration has a
    user-specified alignment.  However, some places in the compiler set
    this flag artificially.  The `increase_alignment` pass does so in
    `ensure_base_align` and `symtab_node::increase_alignment`.  This
    particular use is troublesome since it happens quite often, causing
    many objects not to get aligned for capabilities.
    We could avoid this by changing the `vect_can_force_dr_alignment_p`
    function to always return `false` when the target would like to
    increase alignment due to our new hook, but that would lose some more
    optimisation.
    We suspect the use of DECL_USER_ALIGN here is not necessary and that
    we could simply remove it, but that runs the risk of introducing
    hard-to-notice problems (since no such problem shows up in the
    testsuite).
    Here we relax the alignment restrictions on DECL_USER_ALIGN decls so
    that our hook only avoids adjusting when the decl also has a
    specified section.  This adds more complexity to the decision tree
    (DECL_USER_ALIGN stops any further adjustment of alignment *except*
    when the alignment is increased by this hook).  We could reduce that
    complexity by changing the decision in the rest of the compiler to
    match this new one; this patch doesn't implement that extra step.
    
    Changes in the testsuite from this commit (excluding the new passing
    testcases) are below.  The extra warnings are removed from the
    testsuite output by adding -Wno-cheri-bounds to the relevant tests.
    This is done rather than adding a `dg-warning` because the warning is
    only sometimes emitted, with the divergence based on whether given
    variables are optimised away (which is not a stable thing to rely
    on).
    
    Avoiding linker errors in:
      gcc.c-torture/execute/pr60822.c
      gcc.c-torture/execute/pr91137.c
      gcc.dg/graphite/interchange-0.c
      gcc.dg/graphite/interchange-1.c
      gcc.dg/graphite/interchange-2.c
      gcc.dg/graphite/interchange-3.c
      gcc.dg/graphite/interchange-4.c
      gcc.dg/graphite/interchange-5.c
      gcc.dg/graphite/interchange-10.c
      gcc.dg/graphite/interchange-11.c
      gcc.dg/graphite/interchange-15.c
      gcc.dg/graphite/interchange-mvt.c
      gcc.dg/graphite/pr46185.c
      gcc.dg/graphite/uns-interchange-15.c
      gcc.dg/graphite/uns-interchange-mvt.c
      gcc.dg/torture/pr53366-1.c
      gcc.dg/torture/pr60183.c
      gcc.dg/torture/pr95248.c
      gcc.dg/torture/ldist-27.c
      gcc.dg/tree-ssa/loop-interchange-1.c
      gcc.dg/tree-ssa/loop-interchange-2.c
      gcc.dg/tree-ssa/loop-interchange-5.c
      gcc.dg/tree-ssa/loop-interchange-6.c
      gcc.dg/tree-ssa/loop-interchange-7.c
      gcc.dg/tree-ssa/loop-interchange-8.c
      gcc.dg/tree-ssa/loop-interchange-9.c
      gcc.dg/tree-ssa/loop-interchange-10.c
      gcc.dg/tree-ssa/loop-interchange-14.c
      gcc.dg/tree-ssa/loop-interchange-1b.c
      gcc.dg/vect/no-section-anchors-vect-68.c
      gcc.dg/vect/no-section-anchors-vect-69.c
      gcc.dg/vect/section-anchors-vect-69.c
      gcc.target/aarch64/symbol-range.c
      tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o-c_compat_y_tst.o link

Diff:
---
 gcc/common.opt                                     |   4 +
 gcc/config/aarch64/aarch64-protos.h                |   2 +
 gcc/config/aarch64/aarch64.c                       | 157 +++++++++++++++++
 gcc/config/aarch64/aarch64.h                       |   2 +-
 gcc/config/elfos.h                                 |   4 +-
 gcc/doc/tm.texi                                    |  31 ++++
 gcc/doc/tm.texi.in                                 |   4 +
 gcc/output.h                                       |   2 +-
 gcc/target.def                                     |  38 +++++
 gcc/targhooks.c                                    |  17 ++
 gcc/targhooks.h                                    |   2 +
 .../aarch64/morello/precise-bounds-padding-2.c     |  76 +++++++++
 .../aarch64/morello/precise-bounds-padding.c       |  60 +++++++
 gcc/toplev.c                                       |   3 +-
 gcc/varasm.c                                       | 186 ++++++++++++++++-----
 15 files changed, 539 insertions(+), 49 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 513125f0c00..1b22bc59fcb 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -570,6 +570,10 @@ Wcpp
 Common Var(warn_cpp) Init(1) Warning
 Warn when a #warning directive is encountered.
 
+Wcheri-bounds
+Common Var(warn_cheri_bounds) Init(1) Warning
+Warn when an object can not be aligned to ensure non-overlapping bounds.
+
 Wattribute-warning
 Common Var(warn_attribute_warning) Init(1) Warning
 Warn about uses of __attribute__((warning)) declarations.
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 98ca35ef1a1..ca27b0caaa9 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -666,6 +666,8 @@ void aarch64_split_simd_move (rtx, rtx);
 /* Check for a legitimate floating point constant for FMOV.  */
 bool aarch64_float_const_representable_p (rtx);
 
+/* Find alignment required for precise bounds on an object of given type.  */
+uint64_t aarch64_morello_precise_bounds_align (const_tree, uint64_t);
 extern int aarch64_epilogue_uses (int);
 
 #if defined (RTX_CODE)
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 3512cd32ba3..c60939ebe4f 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2906,12 +2906,127 @@ aarch64_high_bits_all_ones_p (HOST_WIDE_INT i)
   return exact_log2 (-i) != HOST_WIDE_INT_M1;
 }
 
+/* Helper function for aarch64_morello_get_size_align_req.
+   Finds the alignment we need to round `length` up to so that the Morello
+   capability bounds compression algorithm can precisely represent it.
+
+   In cases where `length` is smaller than 2^14 then there will be no special
+   alignment requirements.  */
+static uint64_t
+aarch64_required_alignment (uint64_t length, bool recursive = false)
+{
+  /* Plus one to account for the specification using a 65 bit length and us
+     using a 64 bit one.  */
+  unsigned num_zeros = clz_hwi (length) + 1;
+  if (num_zeros > 50)
+    return 1;
+
+  uint64_t E = 50 - num_zeros;
+  uint64_t req_alignment = (1ULL << (E+3));
+  /* Rounding up may carry a bit and end up needing a greater alignment.
+     Should only happen once. */
+  uint64_t newlength = ROUND_UP (length, req_alignment);
+  if (newlength != length)
+    {
+      gcc_assert (! recursive);
+      return aarch64_required_alignment (newlength, true);
+    }
+  return req_alignment;
+}
+
+/* Take a length and return a structure containing both the alignment required
+   for such a length and the new length matching that alignment.  The
+   requirement we handle here is what is needed for Morello capabilities to
+   have precise bounds around an object of length `size`.
+
+   The operand and result are in units of *bytes*.  */
+struct align_and_size {
+    uint64_t align;
+    uint64_t size;
+};
+static struct align_and_size
+aarch64_morello_get_size_align_req (uint64_t size)
+{
+  /* size => alignment requirement.
+     alignment requirement => new size
+     new size => different alignment requirement.
+     Can only happen once and that would be done in aarch64_required_alignment.
+     */
+  uint64_t align = aarch64_required_alignment (size);
+  size = ROUND_UP (size, align);
+  gcc_assert (aarch64_required_alignment (size) == align);
+  struct align_and_size ret;
+  ret.align = align;
+  ret.size = size;
+  return ret;
+}
+
+/* This function takes its `size` and `alignment` argument in bytes and
+   returns alignment in the same units.  */
+uint64_t
+aarch64_morello_precise_bounds_align (uint64_t size, uint64_t align,
+				      const_tree decl)
+{
+  if (!TARGET_CAPABILITY_PURE)
+    return align;
+  struct align_and_size required = aarch64_morello_get_size_align_req (size);
+  /* Only avoid extra padding if the user has specifically requested a named
+     section.  Without a named section the user can not know which section the
+     compiler will pick and hence can't be sure what padding will be between
+     objects.  */
+  if (decl && DECL_USER_ALIGN (decl) && DECL_SECTION_NAME (decl)
+      && required.align > align)
+    {
+      warning (OPT_Wcheri_bounds,
+	       "object %qD has cheri alignment overridden by a user-specified one",
+	       decl);
+      return align;
+    }
+  if (decl && DECL_THREAD_LOCAL_P (decl) && required.align > align)
+    {
+      warning (OPT_Wcheri_bounds,
+	       "object %qD has cheri alignment ignored since it is thread local",
+	       decl);
+      return align;
+    }
+  return MAX(required.align, align);
+}
+
+/* Takes a size and alignment in *bytes* and returns the number of extra bytes
+   of padding necessary for Morello capabilities to an object of such size to
+   have precise bounds.  */
+static uint64_t
+aarch64_data_padding_size (uint64_t size, uint64_t align ATTRIBUTE_UNUSED, const_tree decl)
+{
+  if (!TARGET_CAPABILITY_PURE)
+    return 0;
+  struct align_and_size required = aarch64_morello_get_size_align_req (size);
+  /* Only avoid extra padding if the user has specifically requested a named
+     section.  Without a named section the user can not know which section the
+     compiler will pick and hence can't be sure what padding will be between
+     objects.  */
+  if (decl && DECL_USER_ALIGN (decl) && DECL_SECTION_NAME (decl))
+      return 0;
+  /* As in align_variable, TLS space is too precious to waste.  */
+  if (decl && DECL_THREAD_LOCAL_P (decl))
+    return 0;
+  if (required.align > UINT_MAX)
+    return 0;
+  if (required.size > size)
+    return required.size - size;
+  return 0;
+}
+
 /* Implement TARGET_CONSTANT_ALIGNMENT.  Make strings word-aligned so
    that strcpy from constants will be faster.  */
 
 static HOST_WIDE_INT
 aarch64_constant_alignment (const_tree exp, HOST_WIDE_INT align)
 {
+  HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (exp));
+  gcc_assert (size != -1);
+  if (TARGET_CAPABILITY_PURE)
+    align = aarch64_morello_precise_bounds_align (size, align, NULL_TREE);
   if (TREE_CODE (exp) == STRING_CST && !optimize_size)
     return MAX (align, BITS_PER_WORD);
   return align;
@@ -24410,12 +24525,48 @@ aarch64_test_loading_full_dump ()
   ASSERT_EQ (SImode, GET_MODE (crtl->return_rtx));
 }
 
+static void
+aarch64_test_morello_alignment ()
+{
+  /* Using a few choice numbers that we know should behave a given way.
+     16383 is the maximum number where we don't need special alignment.
+	It is (1 << 14) - 1
+     Between that number and 32760 we always have alignment requirement of 8.
+     32761 is (1 << 15) - 7.  When this number is rounded up to 8 we carry the
+     bit which makes it 1 << 15.  That means the alignment requirement becomes
+     16.
+     Between that number and (1 << 16) - 15 all numbers should require an
+     alignment of 16.  */
+  ASSERT_EQ (aarch64_required_alignment (16383), 1);
+  ASSERT_EQ (aarch64_required_alignment (16384), 8);
+  ASSERT_EQ (aarch64_required_alignment (16385), 8);
+  ASSERT_EQ (aarch64_required_alignment (32759), 8);
+  ASSERT_EQ (aarch64_required_alignment (32760), 8);
+  ASSERT_EQ (aarch64_required_alignment (32761), 16);
+  ASSERT_EQ (aarch64_required_alignment (32762), 16);
+  ASSERT_EQ (aarch64_required_alignment (32768), 16);
+  ASSERT_EQ (aarch64_required_alignment (32781), 16);
+
+  unsigned nums[9] = {16383, 16384, 16385, 32759, 32760,
+		      32761, 32762, 32768, 32781};
+  struct align_and_size ret;
+  for (int i = 0; i < 9; i++)
+    {
+      unsigned size = nums[i];
+      ret = aarch64_morello_get_size_align_req (size);
+      ASSERT_EQ (aarch64_required_alignment (ret.size), ret.align);
+      ASSERT_EQ (ret.size % ret.align, 0);
+    }
+  return;
+}
+
 /* Run all target-specific selftests.  */
 
 static void
 aarch64_run_selftests (void)
 {
   aarch64_test_loading_full_dump ();
+  aarch64_test_morello_alignment ();
 }
 
 } // namespace selftest
@@ -24880,6 +25031,12 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_CONSTANT_ALIGNMENT
 #define TARGET_CONSTANT_ALIGNMENT aarch64_constant_alignment
 
+#undef TARGET_DATA_PADDING_SIZE
+#define TARGET_DATA_PADDING_SIZE aarch64_data_padding_size
+
+#undef TARGET_DATA_ALIGNMENT
+#define TARGET_DATA_ALIGNMENT aarch64_morello_precise_bounds_align
+
 #undef TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE
 #define TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE \
   aarch64_stack_clash_protection_alloca_probe_range
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index fcc3b1055c8..080c858c35b 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -125,7 +125,7 @@
 
 /* Align global data.  */
 #define DATA_ALIGNMENT(EXP, ALIGN)			\
-  AARCH64_EXPAND_ALIGNMENT (!optimize_size, EXP, ALIGN)
+    AARCH64_EXPAND_ALIGNMENT (!optimize_size, EXP, ALIGN)
 
 /* Similarly, make sure that objects on the stack are sensibly aligned.  */
 #define LOCAL_ALIGNMENT(EXP, ALIGN)				\
diff --git a/gcc/config/elfos.h b/gcc/config/elfos.h
index 74a3eafda6b..a29b7ffddb1 100644
--- a/gcc/config/elfos.h
+++ b/gcc/config/elfos.h
@@ -167,7 +167,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
     {									\
       fprintf ((FILE), "%s", COMMON_ASM_OP);				\
       assemble_name ((FILE), (NAME));					\
-      fprintf ((FILE), "," HOST_WIDE_INT_PRINT_UNSIGNED ",%u\n",		\
+      fprintf ((FILE), "," HOST_WIDE_INT_PRINT_UNSIGNED ",%lu\n",		\
 	       (SIZE), (ALIGN) / BITS_PER_UNIT);			\
     }									\
   while (0)
@@ -341,6 +341,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 	{								\
 	  size_directive_output = 1;					\
 	  size = tree_to_uhwi (DECL_SIZE_UNIT (DECL));			\
+	  size += targetm.data_padding_size (size, DECL_ALIGN_UNIT (DECL), DECL); \
 	  ASM_OUTPUT_SIZE_DIRECTIVE (FILE, NAME, size);			\
 	}								\
 									\
@@ -369,6 +370,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 	{							\
 	  size_directive_output = 1;				\
 	  size = tree_to_uhwi (DECL_SIZE_UNIT (DECL));		\
+	  size += targetm.data_padding_size (size, DECL_ALIGN_UNIT (DECL), DECL); \
 	  ASM_OUTPUT_SIZE_DIRECTIVE (FILE, name, size);		\
 	}							\
     }								\
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 982f51bcfc4..6670eb420e9 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -1149,6 +1149,37 @@ make it all fit in fewer cache lines.
 If the value of this macro has a type, it should be an unsigned type.
 @end defmac
 
+@deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_DATA_PADDING_SIZE (unsigned HOST_WIDE_INT @var{size}, unsigned HOST_WIDE_INT @var{align}, const_tree @var{decl})
+This hook returns the padding required for an object of size @var{size}
+when writing it out to memory.
+The size of the object for accesses is not affected, the compiler emits
+padding of the given amount when emitting this variable.
+
+This hook takes both its arguments in bytes. The default definition returns
+@code{0}.
+
+The typical use of this hook is to add padding to the end of objects on
+a capability architecture to ensure that the bounds of a capability pointing
+to objects do not allow accesses to any neighbouring objects.
+
+A requirement on the implementation of this function is that if @var{decl}
+has a user-specified alignment on a decl which has an associated section
+then this hook must return @code{0}.
+@end deftypefn
+
+@deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_DATA_ALIGNMENT (unsigned HOST_WIDE_INT @var{size}, unsigned HOST_WIDE_INT @var{align}, const_tree @var{decl})
+This hook returns a possibly adjusted alignment to emit for an object of
+size @var{size} when writing it out to memory.
+
+This hook takes its argument in bytes.  The default definition returns the
+alignment given as an argument.
+
+The typical use of this hook is to ensure alignment in order to give precise
+bounds for a capability pointing to the given object on capability systems.
+A requirement on the implementation of this function is that if @var{decl}
+has a user-specified alignment then this hook must not decrease the alignment.
+@end deftypefn
+
 @deftypefn {Target Hook} HOST_WIDE_INT TARGET_VECTOR_ALIGNMENT (const_tree @var{type})
 This hook can be used to define the alignment for a vector of type
 @var{type}, in order to comply with a platform ABI.  The default is to
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 067be34c62c..a7c5769a1a0 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -1080,6 +1080,10 @@ make it all fit in fewer cache lines.
 If the value of this macro has a type, it should be an unsigned type.
 @end defmac
 
+@hook TARGET_DATA_PADDING_SIZE
+
+@hook TARGET_DATA_ALIGNMENT
+
 @hook TARGET_VECTOR_ALIGNMENT
 
 @defmac STACK_SLOT_ALIGNMENT (@var{type}, @var{mode}, @var{basic-align})
diff --git a/gcc/output.h b/gcc/output.h
index 8705aeb2981..f7335da8f17 100644
--- a/gcc/output.h
+++ b/gcc/output.h
@@ -220,7 +220,7 @@ extern void assemble_external (tree);
 extern void assemble_zeros (unsigned HOST_WIDE_INT);
 
 /* Assemble an alignment pseudo op for an ALIGN-bit boundary.  */
-extern void assemble_align (unsigned int);
+extern void assemble_align (unsigned HOST_WIDE_INT);
 
 /* Assemble a string constant with the specified C string as contents.  */
 extern void assemble_string (const char *, int);
diff --git a/gcc/target.def b/gcc/target.def
index 2571c5583c0..b450354ab23 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -3420,6 +3420,44 @@ constants can be done inline.  The function\n\
  HOST_WIDE_INT, (const_tree constant, HOST_WIDE_INT basic_align),
  default_constant_alignment)
 
+DEFHOOK
+(data_padding_size,
+ "This hook returns the padding required for an object of size @var{size}\n\
+when writing it out to memory.\n\
+The size of the object for accesses is not affected, the compiler emits\n\
+padding of the given amount when emitting this variable.\n\
+\n\
+This hook takes both its arguments in bytes. The default definition returns\n\
+@code{0}.\n\
+\n\
+The typical use of this hook is to add padding to the end of objects on\n\
+a capability architecture to ensure that the bounds of a capability pointing\n\
+to objects do not allow accesses to any neighbouring objects.\n\
+\n\
+A requirement on the implementation of this function is that if @var{decl}\n\
+has a user-specified alignment on a decl which has an associated section\n\
+then this hook must return @code{0}.",
+  unsigned HOST_WIDE_INT, (unsigned HOST_WIDE_INT size,
+			   unsigned HOST_WIDE_INT align, const_tree decl),
+  default_data_padding_size)
+
+DEFHOOK
+(data_alignment,
+ "This hook returns a possibly adjusted alignment to emit for an object of\n\
+size @var{size} when writing it out to memory.\n\
+\n\
+This hook takes its argument in bytes.  The default definition returns the\n\
+alignment given as an argument.\n\
+\n\
+The typical use of this hook is to ensure alignment in order to give precise\n\
+bounds for a capability pointing to the given object on capability systems.\n\
+A requirement on the implementation of this function is that if @var{decl}\n\
+has a user-specified alignment then this hook must not decrease the alignment.",
+ unsigned HOST_WIDE_INT, (unsigned HOST_WIDE_INT size,
+			  unsigned HOST_WIDE_INT align, const_tree decl),
+ default_data_padding_size)
+
+
 DEFHOOK
 (translate_mode_attribute,
  "Define this hook if during mode attribute processing, the port should\n\
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 6315bcba8f2..3b812a2b98b 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1213,6 +1213,23 @@ default_constant_alignment (const_tree, HOST_WIDE_INT align)
   return align;
 }
 
+/* The default implementation of TARGET_DATA_PADDING_SIZE.  */
+unsigned HOST_WIDE_INT
+default_data_padding_size (unsigned HOST_WIDE_INT size ATTRIBUTE_UNUSED,
+			   unsigned HOST_WIDE_INT align ATTRIBUTE_UNUSED,
+			   const_tree decl ATTRIBUTE_UNUSED)
+{
+  return 0;
+}
+
+unsigned HOST_WIDE_INT
+default_data_alignment (unsigned HOST_WIDE_INT size ATTRIBUTE_UNUSED,
+			unsigned HOST_WIDE_INT align,
+			const_tree decl ATTRIBUTE_UNUSED)
+{
+  return align;
+}
+
 /* An implementation of TARGET_CONSTANT_ALIGNMENT that aligns strings
    to at least BITS_PER_WORD but otherwise makes no changes.  */
 
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index c9276768f21..85a92f18547 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -99,6 +99,8 @@ extern tree default_builtin_reciprocal (tree);
 
 extern HOST_WIDE_INT default_static_rtx_alignment (machine_mode);
 extern HOST_WIDE_INT default_constant_alignment (const_tree, HOST_WIDE_INT);
+extern unsigned HOST_WIDE_INT default_data_padding_size
+  (unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT, const_tree);
 extern HOST_WIDE_INT constant_alignment_word_strings (const_tree,
 						      HOST_WIDE_INT);
 extern HOST_WIDE_INT default_vector_alignment (const_tree);
diff --git a/gcc/testsuite/gcc.target/aarch64/morello/precise-bounds-padding-2.c b/gcc/testsuite/gcc.target/aarch64/morello/precise-bounds-padding-2.c
new file mode 100644
index 00000000000..c50dad60b38
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/morello/precise-bounds-padding-2.c
@@ -0,0 +1,76 @@
+/* { dg-do assemble { target cheri_capability_pure } } */
+/* { dg-additional-options "--save-temps -fsection-anchors" }  */
+
+/* Similar to precise-bounds-padding.c, but without the -fno-section-anchors
+   flag (that flag was there to check the .comm symbol); allowing section
+   anchors means we can at least exercise the object_block code.
+   Testing it is a little tricky since we don't use section anchors for
+   purecap code, so we can't just trigger an access and check that the access
+   makes sense.  Here we exercise the code and rely on asserts in the
+   compiler to check that everything is working.  */
+
+/* Taken from gcc.dg/pr46534.c, checking that creating a very large constant
+   introduces padding to allow precise bounds for capabilities.  */
+
+extern int printf (const char *, ...);
+
+#define S1 "                    "
+#define S2 S1 S1 S1 S1 S1 S1 S1 S1 S1 S1
+#define S3 S2 S2 S2 S2 S2 S2 S2 S2 S2 S2
+#define S4 S3 S3 S3 S3 S3 S3 S3 S3 S3 S3
+#define S5 S4 S4 S4 S4 S4 S4 S4 S4 S4 S4
+#define S6 S5 S5 S5 S5 S5 S5 S5 S5 S5 S5
+#define S7 S6 S6 S6 S6 S6 S6 S6 S6 S6 S6
+
+void
+foo (void)
+{
+  printf (S7 "\n");
+}
+
+/* Note: Using scan-assembler for "zero", "align", and "size" directives.
+   Nothing *forces* the padding we check for to come from each specific
+   test in this file, but we choose the lengths to be different in order
+   to strongly increase the chances that we're identifying the correct
+   directive.  We use unit tests to ensure our calculation of lengths is
+   correct.  */
+/* { dg-final { scan-assembler "\.align\t13" } }  */
+/* { dg-final { scan-assembler "\.size.* 20004864" } }  */
+/* { dg-final { scan-assembler "\.zero\t4863" } }  */
+
+/* Ensuring that large variables are padded accordingly.  */
+int bigarray[16389];
+/* Cannot check for an alignment of 5 since this is now in an object block,
+   and the object block takes the alignment of the most-aligned object
+   (which is 6, for otherbigarray).  */
+/* { dg-final { scan-assembler "\.size\tbigarray, 65568" } }  */
+/* { dg-final { scan-assembler "\.zero\t12" } }  */
+
+static int otherbigarray[33076];
+int getidx (__SIZE_TYPE__ index)
+{
+  return otherbigarray[index];
+}
+void setidx (__SIZE_TYPE__ index, int val)
+{
+  otherbigarray[index] = val;
+}
+/* { dg-final { scan-assembler "\.align\t6" } }  */
+/* { dg-final { scan-assembler "\.size\totherbigarray, 132352" } }  */
+/* { dg-final { scan-assembler "\.zero\t48" } }  */
+
+__thread int tls_array[16394];
+/* N.b. here we use a slightly different size than in precise-bounds-padding.c
+   since here we enable section anchors and this object would go in an object
+   block.  That means there is still padding between this object and the next,
+   and the padding happens to be the same as this object would need for precise
+   bounds.  Hence we avoid the padding of 24 (while still asserting it is not
+   emitted for the TLS variable above).  */
+int aligned_array[16395] __attribute__ ((aligned(4),section(".aligned_sect")));
+/* { dg-warning "object 'aligned_array' has cheri alignment overridden by a user-specified one" "" { target cheri_capability_pure } .-1 } */
+/* { dg-final { scan-assembler-not "\.zero\t24\n" } } */
+/* N.B. Checking for the non-existence of this line rather than the existence
+   of an alternate line to allow running this testcase on bare-metal targets
+   which don't have TLS and instead use an emutls structure.   */
+/* { dg-final { scan-assembler-not "\.size\ttls_array, 65600" } } */
+/* { dg-final { scan-assembler "\.size\taligned_array, 65580" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/morello/precise-bounds-padding.c b/gcc/testsuite/gcc.target/aarch64/morello/precise-bounds-padding.c
new file mode 100644
index 00000000000..25d81ccb462
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/morello/precise-bounds-padding.c
@@ -0,0 +1,60 @@
+/* { dg-do assemble { target cheri_capability_pure } } */
+/* { dg-additional-options "--save-temps -fno-section-anchors" } */
+
+/* Taken from gcc.dg/pr46534.c, checking that creating a very large constant
+   introduces padding to allow precise bounds for capabilities.  */
+
+extern int printf (const char *, ...);
+
+#define S1 "                    "
+#define S2 S1 S1 S1 S1 S1 S1 S1 S1 S1 S1
+#define S3 S2 S2 S2 S2 S2 S2 S2 S2 S2 S2
+#define S4 S3 S3 S3 S3 S3 S3 S3 S3 S3 S3
+#define S5 S4 S4 S4 S4 S4 S4 S4 S4 S4 S4
+#define S6 S5 S5 S5 S5 S5 S5 S5 S5 S5 S5
+#define S7 S6 S6 S6 S6 S6 S6 S6 S6 S6 S6
+
+void
+foo (void)
+{
+  printf (S7 "\n");
+}
+
+/* Note: Using scan-assembler for "zero", "align", and "size" directives.
+   Nothing *forces* the padding we check for to come from each specific
+   test in this file, but we choose the lengths to be different in order
+   to strongly increase the chances that we're identifying the correct
+   directive.  We use unit tests to ensure our calculation of lengths is
+   correct.  */
+/* { dg-final { scan-assembler "\.align\t13" } }  */
+/* { dg-final { scan-assembler "\.size.* 20004864" } }  */
+/* { dg-final { scan-assembler "\.zero\t4863" } }  */
+
+/* Ensuring that large variables are padded accordingly.  */
+int bigarray[16389];
+/* { dg-final { scan-assembler "\.align\t5" } }  */
+/* { dg-final { scan-assembler "\.size\tbigarray, 65568" } }  */
+/* { dg-final { scan-assembler "\.zero\t12" } }  */
+
+/* Ensuring that local .comm variables are padded accordingly.  */
+static int otherbigarray[33076];
+int getidx (__SIZE_TYPE__ index)
+{
+  return otherbigarray[index];
+}
+void setidx (__SIZE_TYPE__ index, int val)
+{
+  otherbigarray[index] = val;
+}
+/* { dg-final { scan-assembler "\.comm\totherbigarray,132352,64" } } */
+
+/* Using the same  */
+__thread int tls_array[16394];
+int aligned_array[16394] __attribute__ ((aligned(4),section(".aligned_sect")));
+/* { dg-warning "object 'aligned_array' has cheri alignment overridden by a user-specified one" "" { target cheri_capability_pure } .-1 } */
+/* { dg-final { scan-assembler-not "\.zero\t24\n" } } */
+/* N.B. Checking for the non-existence of this line rather than the existence
+   of an alternate line to allow running this testcase on bare-metal targets
+   which don't have TLS and instead use an emutls structure.  */
+/* { dg-final { scan-assembler-not "\.size\ttls_array, 65564" } } */
+/* { dg-final { scan-assembler "\.size\taligned_array, 65576" } } */
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 07457d08c3a..4b95687b20c 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -551,7 +551,8 @@ compile_file (void)
 				      HOST_WIDE_INT_1U, 8);
 #elif defined ASM_OUTPUT_ALIGNED_COMMON
       ASM_OUTPUT_ALIGNED_COMMON (asm_out_file, "__gnu_lto_slim",
-				 HOST_WIDE_INT_1U, 8);
+				 HOST_WIDE_INT_1U,
+				 (unsigned HOST_WIDE_INT)8);
 #else
       ASM_OUTPUT_COMMON (asm_out_file, "__gnu_lto_slim",
 			 HOST_WIDE_INT_1U,
diff --git a/gcc/varasm.c b/gcc/varasm.c
index b0b1e7c2c93..c42519a316f 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -487,7 +487,9 @@ asm_output_aligned_bss (FILE *file, tree decl ATTRIBUTE_UNUSED,
 			int align)
 {
   switch_to_section (bss_section);
-  ASM_OUTPUT_ALIGN (file, floor_log2 (align / BITS_PER_UNIT));
+  unsigned HOST_WIDE_INT align_used
+    = targetm.data_alignment (size, align / BITS_PER_UNIT, decl);
+  ASM_OUTPUT_ALIGN (file, floor_log2 (align_used));
 #ifdef ASM_DECLARE_OBJECT_NAME
   last_assemble_variable_decl = decl;
   ASM_DECLARE_OBJECT_NAME (file, name, decl);
@@ -495,6 +497,7 @@ asm_output_aligned_bss (FILE *file, tree decl ATTRIBUTE_UNUSED,
   /* Standard thing is just output label for the object.  */
   ASM_OUTPUT_LABEL (file, name);
 #endif /* ASM_DECLARE_OBJECT_NAME */
+  size += targetm.data_padding_size (size, align_used, decl);
   ASM_OUTPUT_SKIP (file, size ? size : 1);
 }
 
@@ -1951,7 +1954,7 @@ assemble_zeros (unsigned HOST_WIDE_INT size)
 /* Assemble an alignment pseudo op for an ALIGN-bit boundary.  */
 
 void
-assemble_align (unsigned int align)
+assemble_align (unsigned HOST_WIDE_INT align)
 {
   if (align > BITS_PER_UNIT)
     {
@@ -1983,6 +1986,24 @@ assemble_string (const char *p, int size)
 }
 
 \f
+
+/* Handle using targetm.data_alignment hook on an alignment provided in bits.
+   Since the hook takes an alignment provided in bytes we could lose some
+   bit-wise alignment requirement.  This ensures that we maintain the bit-wise
+   alignment if the hook does not increase the alignment requirement.  */
+static unsigned HOST_WIDE_INT
+alignment_pad_from_bits (unsigned HOST_WIDE_INT size,
+			 unsigned HOST_WIDE_INT align_orig,
+			 const_tree decl)
+{
+  unsigned HOST_WIDE_INT align
+    = targetm.data_alignment (size, align_orig / BITS_PER_UNIT, decl);
+  if (align == 1 && align_orig < BITS_PER_UNIT)
+    return align_orig;
+  else
+    return align * BITS_PER_UNIT;
+}
+
 /* A noswitch_section_callback for lcomm_section.  */
 
 static bool
@@ -1991,13 +2012,16 @@ emit_local (tree decl ATTRIBUTE_UNUSED,
 	    unsigned HOST_WIDE_INT size ATTRIBUTE_UNUSED,
 	    unsigned HOST_WIDE_INT rounded ATTRIBUTE_UNUSED)
 {
+#if defined(ASM_OUTPUT_ALIGNED_DECL_LOCAL) || defined(ASM_OUTPUT_ALIGNED_LOCAL)
+  unsigned HOST_WIDE_INT align
+    = alignment_pad_from_bits
+    (size, symtab_node::get (decl)->definition_alignment (), decl);
+#endif
+
 #if defined ASM_OUTPUT_ALIGNED_DECL_LOCAL
-  unsigned int align = symtab_node::get (decl)->definition_alignment ();
-  ASM_OUTPUT_ALIGNED_DECL_LOCAL (asm_out_file, decl, name,
-				 size, align);
+  ASM_OUTPUT_ALIGNED_DECL_LOCAL (asm_out_file, decl, name, size, align);
   return true;
 #elif defined ASM_OUTPUT_ALIGNED_LOCAL
-  unsigned int align = symtab_node::get (decl)->definition_alignment ();
   ASM_OUTPUT_ALIGNED_LOCAL (asm_out_file, name, size, align);
   return true;
 #else
@@ -2015,8 +2039,9 @@ emit_bss (tree decl ATTRIBUTE_UNUSED,
 	  unsigned HOST_WIDE_INT size ATTRIBUTE_UNUSED,
 	  unsigned HOST_WIDE_INT rounded ATTRIBUTE_UNUSED)
 {
-  ASM_OUTPUT_ALIGNED_BSS (asm_out_file, decl, name, size,
-			  get_variable_align (decl));
+  unsigned HOST_WIDE_INT align
+    = alignment_pad_from_bits (size, get_variable_align (decl), decl);
+  ASM_OUTPUT_ALIGNED_BSS (asm_out_file, decl, name, size, align);
   return true;
 }
 #endif
@@ -2029,13 +2054,16 @@ emit_common (tree decl ATTRIBUTE_UNUSED,
 	     unsigned HOST_WIDE_INT size ATTRIBUTE_UNUSED,
 	     unsigned HOST_WIDE_INT rounded ATTRIBUTE_UNUSED)
 {
+#if defined(ASM_OUTPUT_ALIGNED_DECL_LOCAL) || defined(ASM_OUTPUT_ALIGNED_LOCAL)
+  unsigned HOST_WIDE_INT align
+    = alignment_pad_from_bits (size, get_variable_align (decl), decl);
+#endif
+
 #if defined ASM_OUTPUT_ALIGNED_DECL_COMMON
-  ASM_OUTPUT_ALIGNED_DECL_COMMON (asm_out_file, decl, name,
-				  size, get_variable_align (decl));
+  ASM_OUTPUT_ALIGNED_DECL_COMMON (asm_out_file, decl, name, size, align);
   return true;
 #elif defined ASM_OUTPUT_ALIGNED_COMMON
-  ASM_OUTPUT_ALIGNED_COMMON (asm_out_file, name, size,
-			     get_variable_align (decl));
+  ASM_OUTPUT_ALIGNED_COMMON (asm_out_file, name, size, align);
   return true;
 #else
   ASM_OUTPUT_COMMON (asm_out_file, name, size, rounded);
@@ -2070,6 +2098,7 @@ assemble_noswitch_variable (tree decl, const char *name, section *sect,
   unsigned HOST_WIDE_INT size, rounded;
 
   size = tree_to_uhwi (DECL_SIZE_UNIT (decl));
+  size += targetm.data_padding_size (size, align, decl);
   rounded = size;
 
   if ((flag_sanitize & SANITIZE_ADDRESS) && asan_protect_global (decl))
@@ -2111,20 +2140,33 @@ assemble_variable_contents (tree decl, const char *name,
 
   if (!dont_output_data)
     {
+      unsigned HOST_WIDE_INT size = tree_to_uhwi (DECL_SIZE_UNIT (decl));
+      unsigned HOST_WIDE_INT padding
+	= targetm.data_padding_size (size, DECL_ALIGN_UNIT (decl), decl);
       /* Caller is supposed to use varpool_get_constructor when it wants
 	 to output the body.  */
       gcc_assert (!in_lto_p || DECL_INITIAL (decl) != error_mark_node);
       if (DECL_INITIAL (decl)
 	  && DECL_INITIAL (decl) != error_mark_node
 	  && !initializer_zerop (DECL_INITIAL (decl)))
-	/* Output the actual data.  */
-	output_constant (DECL_INITIAL (decl),
-			 tree_to_uhwi (DECL_SIZE_UNIT (decl)),
-			 get_variable_align (decl),
-			 false, merge_strings);
+	/* Output the actual data.
+	   N.b. we use `get_variable_align` here rather than updating it with
+	   targetm.data_alignment since this parameter to `output_constant`
+	   is the *known alignment* rather than the *requested alignment*.
+	   While in this case they are the same (since we always ensure the
+	   requested alignment before calling assemble_variable_contents),
+	   it doesn't make a difference, and the *known alignment* matches
+	   the DECL_ALIGN idea better.  */
+	output_constant (DECL_INITIAL (decl), size + padding,
+			 get_variable_align (decl), false, merge_strings);
       else
-	/* Leave space for it.  */
-	assemble_zeros (tree_to_uhwi (DECL_SIZE_UNIT (decl)));
+	{
+	  /* Leave space for it.  */
+	  assemble_zeros (size);
+	  /* Have the padding separate just to make it more obvious when
+	     inspecting the assembly.  */
+	  assemble_zeros (padding);
+	}
       targetm.asm_out.decl_end ();
     }
 }
@@ -2255,7 +2297,9 @@ assemble_variable (tree decl, int top_level ATTRIBUTE_UNUSED,
 
   set_mem_align (decl_rtl, DECL_ALIGN (decl));
 
-  align = get_variable_align (decl);
+  gcc_assert (DECL_SIZE_UNIT (decl) && tree_fits_uhwi_p (DECL_SIZE_UNIT (decl)));
+  align = alignment_pad_from_bits (tree_to_uhwi (DECL_SIZE_UNIT (decl)),
+				   get_variable_align (decl), decl);
 
   if (TREE_PUBLIC (decl))
     maybe_assemble_visibility (decl);
@@ -2514,13 +2558,19 @@ assemble_label (FILE *file, const char *name)
   ASM_OUTPUT_LABEL (file, name);
 }
 
+/* Equivalent of the first half of ASM_DECLARE_OBJECT_NAME but for constants.
+   This means that we don't have to worry about types that we don't (yet) know
+   the size of, or of decls that we want declared as "gnu_unique_object".
+   It also means the interface can simply take the name and size, since we
+   may be emitting a constant that doesn't have an associated DECL.  */
 void
-assemble_object_type_and_size (FILE *file, const char *in_name,
+assemble_object_type_and_size (FILE *file, const char *name,
 			       HOST_WIDE_INT size)
 {
-  const char *name = targetm.strip_name_encoding (in_name);
-  asm_fprintf (file, "\t.type\t%s, %%object\n", name);
-  asm_fprintf (file, "\t.size\t%s, %" PRId64 "\n", name, size);
+  ASM_OUTPUT_TYPE_DIRECTIVE (file, name, "object");
+  if (flag_inhibit_size_directive)
+    return;
+  ASM_OUTPUT_SIZE_DIRECTIVE (file, name, size);
 }
 
 /* Set the symbol_referenced flag for ID.  */
@@ -2628,7 +2678,8 @@ assemble_static_space (unsigned HOST_WIDE_INT size)
 				 BIGGEST_ALIGNMENT);
 #else
 #ifdef ASM_OUTPUT_ALIGNED_LOCAL
-  ASM_OUTPUT_ALIGNED_LOCAL (asm_out_file, name, size, BIGGEST_ALIGNMENT);
+  ASM_OUTPUT_ALIGNED_LOCAL (asm_out_file, name, size,
+			    (unsigned HOST_WIDE_INT)BIGGEST_ALIGNMENT);
 #else
   {
     /* Round size up to multiple of BIGGEST_ALIGNMENT bits
@@ -3355,7 +3406,7 @@ compare_constant (const tree t1, const tree t2)
 /* Return the section into which constant EXP should be placed.  */
 
 static section *
-get_constant_section (tree exp, unsigned int align)
+get_constant_section (tree exp, unsigned HOST_WIDE_INT align)
 {
   return targetm.asm_out.select_section (exp,
 					 compute_reloc_for_constant (exp),
@@ -3564,11 +3615,12 @@ maybe_output_constant_def_contents (struct constant_descriptor_tree *desc,
 
 static void
 assemble_constant_contents (tree exp, const char *label, unsigned int align,
-			    bool merge_strings)
+			    bool merge_strings, const_tree decl)
 {
   HOST_WIDE_INT size;
 
   size = get_constant_size (exp);
+  size += targetm.data_padding_size (size, align, decl);
 
   /* Do any machine/system dependent processing of the constant.  */
   targetm.asm_out.declare_constant_name (asm_out_file, label, exp, size);
@@ -3611,17 +3663,21 @@ output_constant_def_contents (rtx symbol)
     place_block_symbol (symbol);
   else
     {
-      int align = (TREE_CODE (decl) == CONST_DECL
+      unsigned HOST_WIDE_INT align = (TREE_CODE (decl) == CONST_DECL
 		   || (VAR_P (decl) && DECL_IN_CONSTANT_POOL (decl))
 		   ? DECL_ALIGN (decl)
 		   : symtab_node::get (decl)->definition_alignment ());
       section *sect = get_constant_section (exp, align);
       switch_to_section (sect);
-      if (align > BITS_PER_UNIT)
-	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align / BITS_PER_UNIT));
+      align = targetm.data_alignment (get_constant_size (exp),
+				      align / BITS_PER_UNIT,
+				      decl);
+      if (align)
+	ASM_OUTPUT_ALIGN (asm_out_file, floor_log2 (align));
       assemble_constant_contents (exp, XSTR (symbol, 0), align,
 				  (sect->common.flags & SECTION_MERGE)
-				  && (sect->common.flags & SECTION_STRINGS));
+				  && (sect->common.flags & SECTION_STRINGS),
+				  decl);
       if (asan_protected)
 	{
 	  HOST_WIDE_INT size = get_constant_size (exp);
@@ -3984,10 +4040,12 @@ constant_pool_empty_p (void)
 }
 \f
 /* Worker function for output_constant_pool_1.  Emit assembly for X
-   in MODE with known alignment ALIGN.  */
+   in MODE with known alignment ALIGN.  Emit PADDING zero bytes after the
+   above.  */
 
 static void
-output_constant_pool_2 (fixed_size_mode mode, rtx x, unsigned int align)
+output_constant_pool_2 (fixed_size_mode mode, rtx x, unsigned int align,
+			unsigned int padding)
 {
   switch (GET_MODE_CLASS (mode))
     {
@@ -4035,7 +4093,7 @@ output_constant_pool_2 (fixed_size_mode mode, rtx x, unsigned int align)
 	      if (INTVAL (CONST_VECTOR_ELT (x, i + j)) != 0)
 		value |= 1 << (j * elt_bits);
 	    output_constant_pool_2 (int_mode, gen_int_mode (value, int_mode),
-				    i != 0 ? MIN (align, int_bits) : align);
+				    i != 0 ? MIN (align, int_bits) : align, 0);
 	  }
 	break;
       }
@@ -4056,7 +4114,7 @@ output_constant_pool_2 (fixed_size_mode mode, rtx x, unsigned int align)
 	for (i = 0; i < units; i++)
 	  {
 	    rtx elt = CONST_VECTOR_ELT (x, i);
-	    output_constant_pool_2 (submode, elt, i ? subalign : align);
+	    output_constant_pool_2 (submode, elt, i ? subalign : align, 0);
 	  }
       }
       break;
@@ -4064,6 +4122,7 @@ output_constant_pool_2 (fixed_size_mode mode, rtx x, unsigned int align)
     default:
       gcc_unreachable ();
     }
+  assemble_zeros (padding);
 }
 
 /* Worker function for output_constant_pool.  Emit constant DESC,
@@ -4109,17 +4168,21 @@ output_constant_pool_1 (class constant_descriptor_rtx *desc,
       break;
     }
 
+  uint64_t size = GET_MODE_SIZE (desc->mode);
+  uint64_t align_used = targetm.data_alignment (size, align, NULL_TREE);
+  uint64_t padding = targetm.data_padding_size (size, align_used, NULL_TREE);
+
 #ifdef ASM_OUTPUT_SPECIAL_POOL_ENTRY
   ASM_OUTPUT_SPECIAL_POOL_ENTRY (asm_out_file, x, desc->mode,
-				 align, desc->labelno, done);
+				 align_used, desc->labelno, done);
 #endif
 
-  assemble_align (align);
+  assemble_align (align_used);
 
   /* Output the label.  */
   char buf[42];
   ASM_GENERATE_INTERNAL_LABEL (buf, "LC", desc->labelno);
-  assemble_object_type_and_size (asm_out_file, buf, GET_MODE_SIZE (desc->mode));
+  assemble_object_type_and_size (asm_out_file, buf, size + padding);
   ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, buf);
 
   /* Output the data.
@@ -4127,7 +4190,7 @@ output_constant_pool_1 (class constant_descriptor_rtx *desc,
      as function 'output_constant_pool_1' explicitly passes the alignment as 1
      assuming that the data is already aligned which prevents the generation 
      of fix-up table entries.  */
-  output_constant_pool_2 (desc->mode, x, desc->align);
+  output_constant_pool_2 (desc->mode, x, desc->align, padding);
 
   /* Make sure all constants in SECTION_MERGE and not SECTION_STRINGS
      sections have proper size.  */
@@ -4969,7 +5032,10 @@ check_string_literal (tree string, unsigned HOST_WIDE_INT size)
     return false;
   if (size < (unsigned)len)
     return false;
-  if (mem_size != size)
+  /* Allow the size that we're generating for this object to be greater than
+     the size the object needs.  This is for the case where there is padding in
+     the object from targetm.data_padding_size.  */
+  if (mem_size > size)
     return false;
   return true;
 }
@@ -7666,7 +7732,7 @@ place_block_symbol (rtx symbol)
 {
   unsigned HOST_WIDE_INT size, mask, offset;
   class constant_descriptor_rtx *desc;
-  unsigned int alignment;
+  unsigned HOST_WIDE_INT alignment;
   struct object_block *block;
   tree decl;
 
@@ -7680,6 +7746,10 @@ place_block_symbol (rtx symbol)
       desc = SYMBOL_REF_CONSTANT (symbol);
       alignment = desc->align;
       size = GET_MODE_SIZE (desc->mode);
+      alignment = targetm.data_alignment (size, alignment / BITS_PER_UNIT,
+					  NULL_TREE);
+      size += targetm.data_padding_size (size, alignment, NULL_TREE);
+      alignment *= BITS_PER_UNIT;
     }
   else if (TREE_CONSTANT_POOL_ADDRESS_P (symbol))
     {
@@ -7687,6 +7757,9 @@ place_block_symbol (rtx symbol)
       gcc_checking_assert (DECL_IN_CONSTANT_POOL (decl));
       alignment = DECL_ALIGN (decl);
       size = get_constant_size (DECL_INITIAL (decl));
+      alignment = targetm.data_alignment (size, alignment / BITS_PER_UNIT, decl);
+      size += targetm.data_padding_size (size, alignment, decl);
+      alignment *= BITS_PER_UNIT;
       if ((flag_sanitize & SANITIZE_ADDRESS)
 	  && TREE_CODE (DECL_INITIAL (decl)) == STRING_CST
 	  && asan_protect_global (DECL_INITIAL (decl)))
@@ -7716,6 +7789,9 @@ place_block_symbol (rtx symbol)
 	}
       alignment = get_variable_align (decl);
       size = tree_to_uhwi (DECL_SIZE_UNIT (decl));
+      alignment = targetm.data_alignment (size, alignment / BITS_PER_UNIT, decl);
+      size += targetm.data_padding_size (size, alignment, decl);
+      alignment *= BITS_PER_UNIT;
       if ((flag_sanitize & SANITIZE_ADDRESS)
 	  && asan_protect_global (decl))
 	{
@@ -7858,26 +7934,45 @@ output_object_block (struct object_block *block)
   offset = 0;
   FOR_EACH_VEC_ELT (*block->objects, i, symbol)
     {
+      /* N.B. Here we assert that there is never negative padding necessary.
+	 That implies that we've not made one kind of mistake calculating the
+	 offset into the current object block.
+	 We cast the symbol ref block offset to an unsigned HOST_WIDE_INT in
+	 this assertion due to a bug that we have not fixed yet.
+	 This field is signed in the structure so we can represent
+	 "uninitialised" as a negative number.  However, with very large
+	 offsets (e.g. an object after 'large_string' in gcc.dg/strlenopt-55.c)
+	 the offset can end up negative due to integer overflow.  This is a
+	 problem, but not one that is too important, since objects that large
+	 are not expected to be seen very often.
+	 Hence we avoid the overflow problems for this assert in order to help
+	 check for more problematic bugs, but leave them in the rest of the
+	 code.  */
+      gcc_assert ((unsigned HOST_WIDE_INT)SYMBOL_REF_BLOCK_OFFSET (symbol)
+		  >= (unsigned HOST_WIDE_INT)offset);
       /* Move to the object's offset, padding with zeros if necessary.  */
       assemble_zeros (SYMBOL_REF_BLOCK_OFFSET (symbol) - offset);
       offset = SYMBOL_REF_BLOCK_OFFSET (symbol);
       if (CONSTANT_POOL_ADDRESS_P (symbol))
 	{
+	  HOST_WIDE_INT size;
 	  desc = SYMBOL_REF_CONSTANT (symbol);
 	  /* Pass 1 for align as we have already laid out everything in the block.
 	     So aligning shouldn't be necessary.  */
 	  output_constant_pool_1 (desc, 1);
-	  offset += GET_MODE_SIZE (desc->mode);
+	  size = GET_MODE_SIZE (desc->mode);
+	  offset += size;
+	  offset += targetm.data_padding_size (size, desc->align, NULL_TREE);
 	}
       else if (TREE_CONSTANT_POOL_ADDRESS_P (symbol))
 	{
-	  HOST_WIDE_INT size;
 	  decl = SYMBOL_REF_DECL (symbol);
+	  HOST_WIDE_INT size = get_constant_size (DECL_INITIAL (decl));
 	  assemble_constant_contents (DECL_INITIAL (decl), XSTR (symbol, 0),
-				      DECL_ALIGN (decl), false);
+				      DECL_ALIGN (decl), false, decl);
 
-	  size = get_constant_size (DECL_INITIAL (decl));
 	  offset += size;
+	  offset += targetm.data_padding_size (size, DECL_ALIGN_UNIT (decl), decl);
 	  if ((flag_sanitize & SANITIZE_ADDRESS)
 	      && TREE_CODE (DECL_INITIAL (decl)) == STRING_CST
 	      && asan_protect_global (DECL_INITIAL (decl)))
@@ -7894,6 +7989,7 @@ output_object_block (struct object_block *block)
 	  assemble_variable_contents (decl, XSTR (symbol, 0), false, false);
 	  size = tree_to_uhwi (DECL_SIZE_UNIT (decl));
 	  offset += size;
+	  offset += targetm.data_padding_size (size, DECL_ALIGN_UNIT (decl), decl);
 	  if ((flag_sanitize & SANITIZE_ADDRESS)
 	      && asan_protect_global (decl))
 	    {


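Putting the varasm.c changes together, the per-object emission sequence is: let the target adjust the alignment, then grow the emitted size by the padding so the `.size` directive and trailing `.zero` cover both.  The sketch below uses made-up hook behaviour (align to at least 16 bytes, pad up to that alignment) purely to exercise the flow; it is not Morello's rule.

```c
#include <assert.h>

typedef unsigned long long uhwi;

/* Hypothetical target hooks: the "at least 16 bytes" rule is invented
   for illustration only.  */
static uhwi
hook_alignment (uhwi size, uhwi align)
{
  (void) size;
  return align < 16 ? 16 : align;
}

static uhwi
hook_padding (uhwi size, uhwi align)
{
  return (align - size % align) % align;   /* pad up to the alignment */
}

/* Model of the per-object sequence varasm.c now follows: adjust the
   alignment first, then compute the padding against the alignment that
   was actually used, so the reported size covers data plus padding.  */
static uhwi
emit_object_size (uhwi size, uhwi align)
{
  uhwi align_used = hook_alignment (size, align);
  return size + hook_padding (size, align_used);
}
```

A 100-byte object requested at 8-byte alignment is bumped to 16-byte alignment and padded by 12 bytes, while an object already a multiple of the adjusted alignment gets no padding at all.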
2021-12-10 16:48 [gcc(refs/vendors/ARM/heads/morello)] Pad and align objects to enable precisely bounded capabilities Matthew Malcomson
