public inbox for gcc-cvs@sourceware.org
* [gcc(refs/vendors/ARM/heads/morello)] Make bit_field_mode_iterator behave conservatively for capabilities
@ 2022-06-16 13:46 Richard Sandiford
From: Richard Sandiford @ 2022-06-16 13:46 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:351a6da872acf4c6095ad16632697ffb9be63851

commit 351a6da872acf4c6095ad16632697ffb9be63851
Author: Richard Sandiford <richard.sandiford@arm.com>
Date:   Thu Jun 16 14:42:41 2022 +0100

    Make bit_field_mode_iterator behave conservatively for capabilities
    
    bit_field_mode_iterator assumes that increasing an access from
    M bytes to N bytes won't trap whenever (a) the access is known
    to be aligned to N bytes and (b) N is not in danger of approaching
    a likely page size.  This assumption isn't safe for capabilities,
    since without any information to the contrary, we have to assume
    that increasing the size of an access could take it out of bounds.
    
    This patch adds an extra parameter to the bit_field_mode_iterator
    constructor that indicates whether the address is a capability.
    The alignment optimisation then requires this new parameter
    to be false.
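    
    As a sketch of the resulting interface (mirroring the updated
    extract_fixed_bit_field call further down; the surrounding
    variables are illustrative rather than a specific line of the
    patch), a get_best_mode query now passes the capability
    information through as an extra bool:
    
      scalar_int_mode best_mode;
      if (get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end,
                         MEM_ALIGN (op0), BITS_PER_WORD,
                         MEM_VOLATILE_P (op0),
                         has_capability_address (op0), &best_mode))
        /* best_mode is safe to use even if op0 has a capability address.  */
        ...;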
    
    Providing nonzero bitregion_start and bitregion_end values
    for extractions thus becomes an optimisation, rather than
    something that is needed for correctness.  Zero values are
    conservatively correct, while nonzero values give
    bit_field_mode_iterator the information it needs to widen
    the access safely.
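    
    For example, the extraction calls in store_split_bit_field now
    simply pass a zero bit region when pulling bits out of VALUE
    (one such call from the patch, shown here for illustration):
    
      part = extract_fixed_bit_field (word_mode, value, value_mode,
                                      thissize, bitsdone, 0, 0,
                                      NULL_RTX, 1, false);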
    
    The bitregion parameters therefore become redundant if their
    only purpose was to prevent the access being widened on capability
    targets (as for emit_group_load_1).  Also, the bitregions used
    for extracts in store_split_bit_field and store_integral_bit_field
    weren't right, since the values applied to the target rtx (op0)
    rather than the source rtx (value).
    
    The get_inner_reference change is needed to avoid a regression
    in pr69990.c.  Without it we would access the bitfield in that
    testcase according to -fstrict-volatile-bitfields rules, even
    though the structure is smaller than the underlying type of
    the bitfield (see PR105805).
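    
    The problematic shape is roughly the following (a sketch of the
    situation rather than the actual pr69990.c testcase):
    
      struct __attribute__((packed)) T {
        volatile unsigned f : 7;
      };
      /* sizeof (struct T) == 1, yet the underlying type of "f" is a
         4-byte int, so a strict-volatile-bitfields access in SImode
         would read beyond the end of the object.  */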
    
    Using the alignment of the innermost type is supposed to be
    a conservative fix, in terms of changing the volatile bitfield
    semantics as little as possible.  It means that for things like:
    
    struct S {
      __attribute__((packed)) volatile int a : 16;
      volatile _Float16 b; // byte offset 2
      int c; // gives the structure 4-byte alignment
    };
    
    we'll continue to use a 32-bit access for "a" (thus loading "b" too).
    The handling continues to be asymmetrical: accessing "b" will not
    access "a" in the same way.
    
    The store_split_bit_field and extract_split_bit_field changes
    are because aligned word-sized accesses are no longer guaranteed
    to be allowed.  Without the changes there would be infinite
    recursion between the split and fixed functions.
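    
    In outline (simplified from the store_split_bit_field hunk below,
    showing only the no-bitregion capability case), each iteration now
    clamps THISSIZE first and then halves UNIT whenever a full
    UNIT-wide access would touch more bytes than the bits being stored
    actually need, which without bit-region information must be assumed
    to be out of bounds:
    
      thissize = MIN (bitsize - bitsdone, BITS_PER_WORD);
      thissize = MIN (thissize, unit - thispos);
      if (unit > BITS_PER_UNIT
          && !REG_P (op0)
          && has_capability_address (op0)
          && ROUND_UP (thispos + thissize, BITS_PER_UNIT) != unit)
        {
          unit /= 2;
          continue;
        }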

Diff:
---
 gcc/expmed.c      | 174 +++++++++++++++++++++++++-----------------------------
 gcc/expr.c        | 107 ++++++++++++++++++---------------
 gcc/fold-const.c  |  13 +++-
 gcc/machmode.h    |   5 +-
 gcc/stor-layout.c |  38 ++++++++----
 gcc/tree.c        |  22 +++++++
 gcc/tree.h        |   1 +
 7 files changed, 199 insertions(+), 161 deletions(-)

diff --git a/gcc/expmed.c b/gcc/expmed.c
index 45db93c331a..4d08bd8551e 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -485,7 +485,8 @@ adjust_bit_field_mem_for_reg (enum extraction_pattern pattern,
 {
   bit_field_mode_iterator iter (bitsize, bitnum, bitregion_start,
 				bitregion_end, MEM_ALIGN (op0),
-				MEM_VOLATILE_P (op0));
+				MEM_VOLATILE_P (op0),
+				has_capability_address (op0));
   scalar_int_mode best_mode;
   if (iter.next_mode (&best_mode))
     {
@@ -515,6 +516,42 @@ adjust_bit_field_mem_for_reg (enum extraction_pattern pattern,
   return NULL_RTX;
 }
 
+/* The caller needs to access BITSIZE bits at bit offset *PBITNUM from
+   the start of memory *PSTRUCT, but it is not allowed to access bytes
+   outside the bit region [*PBITREGION_START, *PBITREGION_END].
+   Adjust *PSTRUCT to the start of this bitregion and reduce bit offsets
+   *PBITNUM, *PBITREGION_START, and *PBITREGION_END so that they are relative
+   to the new start address.
+
+   *PBITREGION_START is guaranteed to be a whole number of bytes.  */
+
+static void
+constrain_access_to_bitregion (rtx *pstruct, poly_uint64 bitsize,
+			       poly_uint64 *pbitnum,
+			       poly_uint64 *pbitregion_start,
+			       poly_uint64 *pbitregion_end)
+{
+  scalar_int_mode best_mode;
+  machine_mode addr_mode = VOIDmode;
+  unsigned HOST_WIDE_INT ibitsize, ibitnum;
+
+  poly_uint64 offset = exact_div (*pbitregion_start, BITS_PER_UNIT);
+  *pbitnum -= *pbitregion_start;
+  poly_int64 size = bits_to_bytes_round_up (*pbitnum + bitsize);
+  *pbitregion_end -= *pbitregion_start;
+  *pbitregion_start = 0;
+  if (bitsize.is_constant (&ibitsize)
+      && pbitnum->is_constant (&ibitnum)
+      && get_best_mode (ibitsize, ibitnum,
+			*pbitregion_start, *pbitregion_end,
+			MEM_ALIGN (*pstruct), INT_MAX,
+			MEM_VOLATILE_P (*pstruct),
+			has_capability_address (*pstruct), &best_mode))
+    addr_mode = best_mode;
+  *pstruct = adjust_bitfield_address_size (*pstruct, addr_mode,
+					   offset, size);
+}
+
 /* Return true if a bitfield of size BITSIZE at bit number BITNUM within
    a structure of mode STRUCT_MODE represents a lowpart subreg.   The subreg
    offset is then BITNUM / BITS_PER_UNIT.  */
@@ -978,8 +1015,7 @@ store_integral_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
 	  rtx value_word
 	    = fieldmode == BLKmode
 	      ? extract_bit_field (value, new_bitsize, wordnum * BITS_PER_WORD,
-				   wordnum * BITS_PER_WORD, bitregion_end,
-				   1, NULL_RTX, word_mode, word_mode,
+				   0, 0, 1, NULL_RTX, word_mode, word_mode,
 				   false, NULL)
 	      : operand_subword_force (value, wordnum, value_mode);
 
@@ -1157,25 +1193,8 @@ store_bit_field (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
      bit region.  Adjust the address to start at the beginning of the
      bit region.  */
   if (MEM_P (str_rtx) && maybe_ne (bitregion_start, 0U))
-    {
-      scalar_int_mode best_mode;
-      machine_mode addr_mode = VOIDmode;
-
-      poly_uint64 offset = exact_div (bitregion_start, BITS_PER_UNIT);
-      bitnum -= bitregion_start;
-      poly_int64 size = bits_to_bytes_round_up (bitnum + bitsize);
-      bitregion_end -= bitregion_start;
-      bitregion_start = 0;
-      if (bitsize.is_constant (&ibitsize)
-	  && bitnum.is_constant (&ibitnum)
-	  && get_best_mode (ibitsize, ibitnum,
-			    bitregion_start, bitregion_end,
-			    MEM_ALIGN (str_rtx), INT_MAX,
-			    MEM_VOLATILE_P (str_rtx), &best_mode))
-	addr_mode = best_mode;
-      str_rtx = adjust_bitfield_address_size (str_rtx, addr_mode,
-					      offset, size);
-    }
+    constrain_access_to_bitregion (&str_rtx, bitsize, &bitnum,
+				   &bitregion_start, &bitregion_end);
 
   if (!store_bit_field_1 (str_rtx, bitsize, bitnum,
 			  bitregion_start, bitregion_end,
@@ -1214,7 +1233,7 @@ store_fixed_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
 
       if (!get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end,
 			  MEM_ALIGN (op0), max_bitsize, MEM_VOLATILE_P (op0),
-			  &best_mode))
+			  has_capability_address (op0), &best_mode))
 	{
 	  /* The only way this should occur is if the field spans word
 	     boundaries.  */
@@ -1393,26 +1412,29 @@ store_split_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
       offset = (bitpos + bitsdone) / unit;
       thispos = (bitpos + bitsdone) % unit;
 
-      /* When region of bytes we can touch is restricted, decrease
+      /* THISSIZE must not overrun a word boundary.  Otherwise,
+	 store_fixed_bit_field will call us again, and we will mutually
+	 recurse forever.  */
+      thissize = MIN (bitsize - bitsdone, BITS_PER_WORD);
+      thissize = MIN (thissize, unit - thispos);
+
+      /* When the region of bytes we can touch is restricted, decrease
 	 UNIT close to the end of the region as needed.  If op0 is a REG
 	 or SUBREG of REG, don't do this, as there can't be data races
 	 on a register and we can expand shorter code in some cases.  */
-      if (maybe_ne (bitregion_end, 0U)
-	  && unit > BITS_PER_UNIT
-	  && maybe_gt (bitpos + bitsdone - thispos + unit, bitregion_end + 1)
+      if (unit > BITS_PER_UNIT
 	  && !REG_P (op0)
-	  && (GET_CODE (op0) != SUBREG || !REG_P (SUBREG_REG (op0))))
+	  && (GET_CODE (op0) != SUBREG || !REG_P (SUBREG_REG (op0)))
+	  && (maybe_ne (bitregion_end, 0U)
+	      ? maybe_gt (bitpos + bitsdone - thispos + unit,
+			  bitregion_end + 1)
+	      : (has_capability_address (op0)
+		 && ROUND_UP (thispos + thissize, BITS_PER_UNIT) != unit)))
 	{
 	  unit = unit / 2;
 	  continue;
 	}
 
-      /* THISSIZE must not overrun a word boundary.  Otherwise,
-	 store_fixed_bit_field will call us again, and we will mutually
-	 recurse forever.  */
-      thissize = MIN (bitsize - bitsdone, BITS_PER_WORD);
-      thissize = MIN (thissize, unit - thispos);
-
       if (reverse ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN)
 	{
 	  /* Fetch successively less significant portions.  */
@@ -1425,8 +1447,7 @@ store_split_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
 	    part = extract_fixed_bit_field (word_mode, value, value_mode,
 					    thissize,
 					    bitsize - bitsdone - thissize,
-					    bitregion_start, bitregion_end,
-					    NULL_RTX, 1, false);
+					    0, 0, NULL_RTX, 1, false);
 	  else
 	    /* The args are chosen so that the last part includes the
 	       lsb.  Give extract_bit_field the value it needs (with
@@ -1434,8 +1455,7 @@ store_split_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
 	    part = extract_fixed_bit_field (word_mode, value, value_mode,
 					    thissize,
 					    total_bits - bitsize + bitsdone,
-					    bitregion_start, bitregion_end,
-					    NULL_RTX, 1, false);
+					    0, 0, NULL_RTX, 1, false);
 	}
       else
 	{
@@ -1449,12 +1469,10 @@ store_split_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
 	    part = extract_fixed_bit_field (word_mode, value, value_mode,
 					    thissize,
 					    total_bits - bitsdone - thissize,
-					    bitregion_start, bitregion_end,
-					    NULL_RTX, 1, false);
+					    0, 0, NULL_RTX, 1, false);
 	  else
 	    part = extract_fixed_bit_field (word_mode, value, value_mode,
-					    thissize, bitsdone,
-					    bitregion_start, bitregion_end,
+					    thissize, bitsdone, 0, 0,
 					    NULL_RTX, 1, false);
 	}
 
@@ -2135,34 +2153,15 @@ extract_bit_field (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
 				  target, mode, tmode, reverse, true, alt_rtl);
     }
 
-  /* If bitregion_start has been set to a value, then we are compiling for a
-     target that limits memory accesses to tight bounds so we must not touch
-     bits outside the bit region.  Adjust the address to start at the beginning of the
-     bit region.  */
+  /* Adjust the address to start at the beginning of the bit region,
+     if one is specified.  */
   if (MEM_P (str_rtx) && maybe_ne (bitregion_start, 0U))
-    {
-      scalar_int_mode best_mode;
-      machine_mode addr_mode = VOIDmode;
-
-      poly_uint64 offset = exact_div (bitregion_start, BITS_PER_UNIT);
-      bitnum -= bitregion_start;
-      poly_int64 size = bits_to_bytes_round_up (bitnum + bitsize);
-      bitregion_end -= bitregion_start;
-      bitregion_start = 0;
-      if (bitsize.is_constant (&ibitsize)
-	  && bitnum.is_constant (&ibitnum)
-	  && get_best_mode (ibitsize, ibitnum,
-			    bitregion_start, bitregion_end,
-			    MEM_ALIGN (str_rtx), INT_MAX,
-			    MEM_VOLATILE_P (str_rtx), &best_mode))
-	addr_mode = best_mode;
-      str_rtx = adjust_bitfield_address_size (str_rtx, addr_mode,
-					      offset, size);
-    }
+    constrain_access_to_bitregion (&str_rtx, bitsize, &bitnum,
+				   &bitregion_start, &bitregion_end);
 
   return extract_bit_field_1 (str_rtx, bitsize, bitnum, bitregion_start,
-			      bitregion_end, unsignedp,
-			      target, mode, tmode, reverse, true, alt_rtl);
+			      bitregion_end, unsignedp, target, mode,
+			      tmode, reverse, true, alt_rtl);
 }
 \f
 /* Use shifts and boolean operations to extract a field of BITSIZE bits
@@ -2188,15 +2187,9 @@ extract_fixed_bit_field (machine_mode tmode, rtx op0,
   scalar_int_mode mode;
   if (MEM_P (op0))
     {
-      unsigned int max_bitsize = BITS_PER_WORD;
-      scalar_int_mode imode;
-      if (op0_mode.exists (&imode) && GET_MODE_BITSIZE (imode) < max_bitsize
-	  && maybe_ne (bitregion_end, 0U))
-	max_bitsize = GET_MODE_BITSIZE (imode);
-
       if (!get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end,
-			  MEM_ALIGN (op0), max_bitsize, MEM_VOLATILE_P (op0),
-			  &mode))
+			  MEM_ALIGN (op0), BITS_PER_WORD, MEM_VOLATILE_P (op0),
+			  has_capability_address (op0), &mode))
 	/* The only way this should occur is if the field spans word
 	   boundaries.  */
 	return extract_split_bit_field (op0, op0_mode, bitsize, bitnum,
@@ -2336,13 +2329,6 @@ extract_split_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
   else
     unit = MIN (MEM_ALIGN (op0), BITS_PER_WORD);
 
-  /* For capability targets, if OP0 is a memory with a mode,
-     then UNIT must not be larger than OP0's mode as well.
-     Otherwise, extract_fixed_bit_field will call us
-     again, and we will mutually recurse forever.  */
-  if (MEM_P (op0) && op0_mode.exists () && maybe_ne (bitregion_end, 0U))
-    unit = MIN (unit, GET_MODE_BITSIZE (op0_mode.require ()));
-
   while (bitsdone < bitsize)
     {
       unsigned HOST_WIDE_INT thissize;
@@ -2353,19 +2339,6 @@ extract_split_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
       offset = (bitpos + bitsdone) / unit;
       thispos = (bitpos + bitsdone) % unit;
 
-      /* For capability targets the region of bytes we can read is restricted
-	 by the capability bounds, so we must decrease UNIT close to the end
-	 of the region as needed.  */
-      if (maybe_ne (bitregion_end, 0U)
-	  && unit > BITS_PER_UNIT
-	  && maybe_gt (bitpos + bitsdone - thispos + unit, bitregion_end + 1)
-	  && !REG_P (op0)
-	  && (GET_CODE (op0) != SUBREG || !REG_P (SUBREG_REG (op0))))
-	{
-	  unit = unit / 2;
-	  continue;
-	}
-
       /* THISSIZE must not overrun a word boundary.  Otherwise,
 	 extract_fixed_bit_field will call us again, and we will mutually
 	 recurse forever.  */
@@ -2381,6 +2354,17 @@ extract_split_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
 	  op0_piece_mode = word_mode;
 	  offset = 0;
 	}
+      /* When the region of bytes we can touch is restricted, decrease
+	 UNIT close to the end of the region as needed.  */
+      else if (maybe_ne (bitregion_end, 0U)
+	       ? maybe_gt (bitpos + bitsdone - thispos + unit,
+			   bitregion_end + 1)
+	       : (has_capability_address (op0)
+		  && ROUND_UP (thispos + thissize, BITS_PER_UNIT) != unit))
+	{
+	  unit /= 2;
+	  continue;
+	}
 
       /* Extract the parts in bit-counting order,
 	 whose meaning is determined by BYTES_PER_UNIT.
diff --git a/gcc/expr.c b/gcc/expr.c
index 4d9f2017527..cbb2a29d038 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -2209,10 +2209,9 @@ move_block_to_reg (int regno, rtx x, int nregs, machine_mode mode)
    {
       rtx return_reg = gen_rtx_REG (word_mode, regno + nregs - 1);
       emit_move_insn (return_reg,
-		      extract_bit_field (x, remainder * BITS_PER_UNIT, 0,
-					 0, remainder * BITS_PER_UNIT - 1, 0,
-					 return_reg, word_mode, word_mode, 0,
-					 NULL));
+		      extract_bit_field (x, remainder * BITS_PER_UNIT, 0, 0, 0,
+					 0, return_reg, word_mode, word_mode,
+					 0, NULL));
       return;
    }
    else
@@ -2336,8 +2335,6 @@ emit_group_load_1 (rtx *tmps, rtx dst, rtx orig_src, tree type,
       poly_int64 bytepos = rtx_to_poly_int64 (XEXP (XVECEXP (dst, 0, i), 1));
       poly_int64 bytelen = GET_MODE_SIZE (mode);
       poly_int64 shift = 0;
-      poly_int64 bitregion_start = 0;
-      poly_int64 bitregion_end = 0;
 
       /* Handle trailing fragments that run over the size of the struct.
 	 It's the target's responsibility to make sure that the fragment
@@ -2358,12 +2355,7 @@ emit_group_load_1 (rtx *tmps, rtx dst, rtx orig_src, tree type,
 	      )
 	    shift = (bytelen - (ssize - bytepos)) * BITS_PER_UNIT;
 	  bytelen = ssize - bytepos;
-	  if (MEM_P (orig_src)
-	      && CAPABILITY_MODE_P (GET_MODE (XEXP (orig_src, 0))))
-	  {
-	    bitregion_start = bytepos * BITS_PER_UNIT;
-	    bitregion_end = (bytepos + bytelen) * BITS_PER_UNIT - 1;
-	  }
+	  gcc_assert (maybe_gt (bytelen, 0));
 	}
 
       /* If we won't be loading directly from memory, protect the real source
@@ -2418,10 +2410,9 @@ emit_group_load_1 (rtx *tmps, rtx dst, rtx orig_src, tree type,
 		  || (!CONSTANT_P (tmps[i])
 		      && (!REG_P (tmps[i]) || GET_MODE (tmps[i]) != mode)))
 		tmps[i] = extract_bit_field (tmps[i], bytelen * BITS_PER_UNIT,
-					     subpos * BITS_PER_UNIT,
-					     bitregion_start, bitregion_end,
-					     1, NULL_RTX, mode, mode,
-					     false, NULL);
+					     subpos * BITS_PER_UNIT, 0, 0,
+					     1, NULL_RTX, mode, mode, false,
+					     NULL);
 	    }
 	  else
 	    {
@@ -2431,8 +2422,7 @@ emit_group_load_1 (rtx *tmps, rtx dst, rtx orig_src, tree type,
 	      mem = assign_stack_temp (GET_MODE (src), slen);
 	      emit_move_insn (mem, src);
 	      tmps[i] = extract_bit_field (mem, bytelen * BITS_PER_UNIT, 0,
-					   bitregion_start, bitregion_end,
-					   1, NULL_RTX, mode, mode,
+					   0, 0, 1, NULL_RTX, mode, mode,
 					   false, NULL);
 	    }
 	}
@@ -2473,8 +2463,7 @@ emit_group_load_1 (rtx *tmps, rtx dst, rtx orig_src, tree type,
 	tmps[i] = src;
       else
 	tmps[i] = extract_bit_field (src, bytelen * BITS_PER_UNIT,
-				     bytepos * BITS_PER_UNIT,
-				     bitregion_start, bitregion_end,
+				     bytepos * BITS_PER_UNIT, 0, 0,
 				     1, NULL_RTX, mode, mode, false, NULL);
 
       if (maybe_ne (shift, 0))
@@ -5030,7 +5019,8 @@ optimize_bitfield_assignment_op (poly_uint64 pbitsize,
 
       scalar_int_mode best_mode;
       if (!get_best_mode (bitsize, bitpos, bitregion_start, bitregion_end,
-			  MEM_ALIGN (str_rtx), str_bitsize, false, &best_mode))
+			  MEM_ALIGN (str_rtx), str_bitsize, false,
+			  has_capability_address (str_rtx), &best_mode))
 	return false;
       str_mode = best_mode;
       str_bitsize = GET_MODE_BITSIZE (best_mode);
@@ -7445,9 +7435,8 @@ store_field (rtx target, poly_int64 bitsize, poly_int64 bitpos,
       if (GET_MODE (temp) == BLKmode && known_le (bitsize, BITS_PER_WORD))
 	{
 	  temp_mode = smallest_int_mode_for_size (bitsize);
-	  temp = extract_bit_field (temp, bitsize, 0, bitregion_start,
-				    bitregion_end, 1, NULL_RTX, temp_mode,
-				    temp_mode, false, NULL);
+	  temp = extract_bit_field (temp, bitsize, 0, 0, 0, 1, NULL_RTX,
+				    temp_mode, temp_mode, false, NULL);
 	}
 
       /* Store the value in the bitfield.  */
@@ -7513,6 +7502,7 @@ get_inner_reference (tree exp, poly_int64_pod *pbitsize,
 {
   tree size_tree = 0;
   machine_mode mode = VOIDmode;
+  bool volatile_bitfield = false;
   bool blkmode_bitfield = false;
   tree offset = size_zero_node;
   poly_offset_int bit_offset = 0;
@@ -7528,10 +7518,16 @@ get_inner_reference (tree exp, poly_int64_pod *pbitsize,
 	  && TREE_THIS_VOLATILE (exp)
 	  && DECL_BIT_FIELD_TYPE (field)
 	  && DECL_MODE (field) != BLKmode)
-	/* Volatile bitfields should be accessed in the mode of the
-	     field's type, not the mode computed based on the bit
-	     size.  */
-	mode = TYPE_MODE (DECL_BIT_FIELD_TYPE (field));
+	{
+	  /* In general, volatile bitfields should be accessed in the mode
+	     of the field's type, not the mode computed based on the bit size.
+	     However, whether we apply -fstrict-volatile-bitfields ultimately
+	     depends on whether such an access would be sufficiently aligned;
+	     see strict_volatile_bitfield_p for details.  We can't decide
+	     that until we've peeled all accesses.  */
+	  mode = TYPE_MODE (DECL_BIT_FIELD_TYPE (field));
+	  volatile_bitfield = true;
+	}
       else if (!DECL_BIT_FIELD (field))
 	{
 	  mode = DECL_MODE (field);
@@ -7698,6 +7694,21 @@ get_inner_reference (tree exp, poly_int64_pod *pbitsize,
       *poffset = offset;
     }
 
+  /* Drop the mode of a volatile bitfield if the containing type doesn't
+     have enough alignment to support it.  This explicitly avoids looking
+     at DECL_ALIGN for two reasons:
+
+     (1) It shouldn't matter for correctness whether a decl with a volatile
+	 bitfield is accessed directly or indirectly; the interpretation of
+	 "volatile" should be the same in both cases.
+
+     (2) The size of the type is not necessarily a factor of DECL_ALIGN,
+	 so relying on DECL_ALIGN can lead to out-of-range accesses on
+	 targets (such as capability targets) that track type sizes.  */
+  if (volatile_bitfield
+      && TYPE_ALIGN (TREE_TYPE (exp)) < GET_MODE_ALIGNMENT (mode))
+    mode = VOIDmode;
+
   /* We can use BLKmode for a byte-aligned BLKmode bitfield.  */
   if (mode == VOIDmode
       && blkmode_bitfield
@@ -11111,8 +11122,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 				 &unsignedp, &reversep, &volatilep);
 	rtx orig_op0, memloc;
 	bool clear_mem_expr = false;
-      poly_uint64 bitregion_start = 0;
-      poly_uint64 bitregion_end = 0;
+	poly_uint64 bitregion_start = 0;
+	poly_uint64 bitregion_end = 0;
 
 	/* If we got back the original object, something is wrong.  Perhaps
 	   we are evaluating an expression too early.  In any event, don't
@@ -11134,25 +11145,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 			      modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier,
 			      NULL, true);
 
-      /* For capability targets there is a need to limit the memory access to
-	 the bounds of the struct or union.  For bit fields in a struct use
-	 get_bit_range and for other cases fall back to a sensible alternative
-	 based on MEM_SIZE.  This needs to be done here, before
-	 op0 gets modified in a way that increases the MEM_SIZE.  */
-      if (MEM_P (op0) && CAPABILITY_MODE_P (GET_MODE (XEXP (op0, 0)))
-	  && TREE_CODE (exp) == COMPONENT_REF
-	  && DECL_BIT_FIELD_TYPE (TREE_OPERAND (exp, 1))
-	  && DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (exp, 1)))
-	get_bit_range (&bitregion_start, &bitregion_end, exp, &bitpos,
-		       &offset);
-      else if (MEM_P (op0) && MEM_SIZE_KNOWN_P (op0)
-	       && CAPABILITY_MODE_P (GET_MODE (XEXP (op0, 0)))
-	       && known_gt (MEM_SIZE (op0) * BITS_PER_UNIT,
-			    force_align_down (bitpos, BITS_PER_UNIT)))
-	{
-	  bitregion_start = force_align_down (bitpos, BITS_PER_UNIT);
-	  bitregion_end = MEM_SIZE (op0) * BITS_PER_UNIT - 1;
-	}
+	/* The default assumption on capability targets is that the access
+	   must not go beyond the last byte in the field.  Widening the
+	   access beyond that byte could in general lead to an out-of-bounds
+	   condition.
+
+	   However, as an optimization, the normal bitfield semantics
+	   should guarantee that any access to the containing bit range
+	   is safe.  */
+	if (MEM_P (op0)
+	    && has_capability_address (op0)
+	    && TREE_CODE (exp) == COMPONENT_REF
+	    && DECL_BIT_FIELD_TYPE (TREE_OPERAND (exp, 1)))
+	  {
+	    get_bit_range (&bitregion_start, &bitregion_end, exp, &bitpos,
+			   &offset);
+	    bitregion_start = 0;
+	  }
 
 	/* If the field has a mode, we want to access it in the
 	   field's mode, not the computed mode.
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 384d3c149eb..b5ed548633c 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -4524,6 +4524,7 @@ optimize_bit_field_compare (location_t loc, enum tree_code code,
       || lvolatilep)
     return 0;
 
+  bool is_capability_address = has_capability_address (linner);
   if (const_p)
     rreversep = lreversep;
   else
@@ -4543,6 +4544,8 @@ optimize_bit_field_compare (location_t loc, enum tree_code code,
 	 || TREE_CODE (rinner) == PLACEHOLDER_EXPR
 	 || rvolatilep)
        return 0;
+
+     is_capability_address |= has_capability_address (rinner);
    }
 
   /* Honor the C++ memory model and mimic what RTL expansion does.  */
@@ -4561,7 +4564,7 @@ optimize_bit_field_compare (location_t loc, enum tree_code code,
 		      const_p ? TYPE_ALIGN (TREE_TYPE (linner))
 		      : MIN (TYPE_ALIGN (TREE_TYPE (linner)),
 			     TYPE_ALIGN (TREE_TYPE (rinner))),
-		      BITS_PER_WORD, false, &nmode))
+		      BITS_PER_WORD, false, is_capability_address, &nmode))
     return 0;
 
   /* Set signed and unsigned types of the precision of this mode for the
@@ -6410,9 +6413,11 @@ fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type,
      to be relative to a field of that size.  */
   first_bit = MIN (ll_bitpos, rl_bitpos);
   end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize);
+  bool capabilityp = (has_capability_address (ll_inner)
+		      || has_capability_address (rl_inner));
   if (!get_best_mode (end_bit - first_bit, first_bit, 0, 0,
 		      TYPE_ALIGN (TREE_TYPE (ll_inner)), BITS_PER_WORD,
-		      volatilep, &lnmode))
+		      volatilep, capabilityp, &lnmode))
     return 0;
 
   lnbitsize = GET_MODE_BITSIZE (lnmode);
@@ -6475,9 +6480,11 @@ fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type,
 
       first_bit = MIN (lr_bitpos, rr_bitpos);
       end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize);
+      capabilityp = (has_capability_address (lr_inner)
+		     || has_capability_address (rr_inner));
       if (!get_best_mode (end_bit - first_bit, first_bit, 0, 0,
 			  TYPE_ALIGN (TREE_TYPE (lr_inner)), BITS_PER_WORD,
-			  volatilep, &rnmode))
+			  volatilep, capabilityp, &rnmode))
 	return 0;
 
       rnbitsize = GET_MODE_BITSIZE (rnmode);
diff --git a/gcc/machmode.h b/gcc/machmode.h
index 1678266fb38..ede475912cd 100644
--- a/gcc/machmode.h
+++ b/gcc/machmode.h
@@ -1019,7 +1019,7 @@ class bit_field_mode_iterator
 public:
   bit_field_mode_iterator (HOST_WIDE_INT, HOST_WIDE_INT,
 			   poly_int64, poly_int64,
-			   unsigned int, bool);
+			   unsigned int, bool, bool);
   bool next_mode (scalar_int_mode *);
   bool prefer_smaller_modes ();
 
@@ -1039,7 +1039,8 @@ private:
 /* Find the best mode to use to access a bit field.  */
 
 extern bool get_best_mode (int, int, poly_uint64, poly_uint64, unsigned int,
-			   unsigned HOST_WIDE_INT, bool, scalar_int_mode *);
+			   unsigned HOST_WIDE_INT, bool, bool,
+			   scalar_int_mode *);
 
 /* Determine alignment, 1<=result<=BIGGEST_ALIGNMENT.  */
 
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index f3e650b6d8a..5b085e28c19 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -2933,14 +2933,18 @@ fixup_unsigned_type (tree type)
    memory access to that range.  Otherwise, we are allowed to touch
    any adjacent non bit-fields.
 
-   ALIGN is the alignment of the underlying object in bits.
-   VOLATILEP says whether the bitfield is volatile.  */
+   In addition:
+   - ALIGN is the alignment of the underlying object in bits.
+   - VOLATILEP says whether the bitfield is volatile.
+   - CAPABILITY_ADDRESSP says whether the bitfield is part of a MEM
+     that has a capability address.  */
 
 bit_field_mode_iterator
 ::bit_field_mode_iterator (HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos,
 			   poly_int64 bitregion_start,
 			   poly_int64 bitregion_end,
-			   unsigned int align, bool volatilep)
+			   unsigned int align, bool volatilep,
+			   bool capability_addressp)
 : m_mode (NARROWEST_INT_MODE), m_bitsize (bitsize),
   m_bitpos (bitpos), m_bitregion_start (bitregion_start),
   m_bitregion_end (bitregion_end), m_align (align),
@@ -2948,12 +2952,18 @@ bit_field_mode_iterator
 {
   if (known_eq (m_bitregion_end, 0))
     {
-      /* We can assume that any aligned chunk of ALIGN bits that overlaps
-	 the bitfield is mapped and won't trap, provided that ALIGN isn't
-	 too large.  The cap is the biggest required alignment for data,
-	 or at least the word size.  And force one such chunk at least.  */
-      unsigned HOST_WIDE_INT units
-	= MIN (align, MAX (BIGGEST_ALIGNMENT, BITS_PER_WORD));
+      /* If the bitfield is being addressed using capabilities, we must
+	 conservatively assume that the bounds of the capability do not
+	 extend beyond the bitfield.
+
+	 Otherwise, we can assume that any aligned chunk of ALIGN bits
+	 that overlaps the bitfield is mapped and won't trap, provided
+	 that ALIGN isn't too large.  The cap is the biggest required
+	 alignment for data, or at least the word size.  And force one
+	 such chunk at least.  */
+      unsigned HOST_WIDE_INT units = BITS_PER_UNIT;
+      if (!capability_addressp)
+	units = MIN (align, MAX (BIGGEST_ALIGNMENT, BITS_PER_WORD));
       if (bitsize <= 0)
 	bitsize = 1;
       HOST_WIDE_INT end = bitpos + bitsize + units - 1;
@@ -3050,17 +3060,21 @@ bit_field_mode_iterator::prefer_smaller_modes ()
    all the conditions.
 
    If VOLATILEP is true the narrow_volatile_bitfields target hook is used to
-   decide which of the above modes should be used.  */
+   decide which of the above modes should be used.
+
+   CAPABILITY_ADDRESSP says whether the bitfield is part of a MEM
+   that has a capability address.  */
 
 bool
 get_best_mode (int bitsize, int bitpos,
 	       poly_uint64 bitregion_start, poly_uint64 bitregion_end,
 	       unsigned int align,
 	       unsigned HOST_WIDE_INT largest_mode_bitsize, bool volatilep,
-	       scalar_int_mode *best_mode)
+	       bool capability_addressp, scalar_int_mode *best_mode)
 {
   bit_field_mode_iterator iter (bitsize, bitpos, bitregion_start,
-				bitregion_end, align, volatilep);
+				bitregion_end, align, volatilep,
+				capability_addressp);
   scalar_int_mode mode;
   bool found = false;
   while (iter.next_mode (&mode)
diff --git a/gcc/tree.c b/gcc/tree.c
index 67349b66914..c44b89fdf42 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -14270,6 +14270,28 @@ unqualified_addr_expr ()
   return CAPABILITY_MODE_P (Pmode) ? CAP_ADDR_EXPR : NONCAP_ADDR_EXPR;
 }
 
+/* EXP is something that might live in memory.  Return true if a MEM
+   expansion of EXP would have a capability address.
+
+   This function can be called for a decl before we've decided where
+   the decl will live, so the question about what would happen for a
+   MEM expansion might be purely speculative.  */
+
+bool
+has_capability_address (const_tree exp)
+{
+  while (handled_component_p (exp)
+	 || TREE_CODE (exp) == TARGET_EXPR)
+    exp = TREE_OPERAND (exp, 0);
+
+  if (TREE_CODE (exp) == INDIRECT_REF
+      || TREE_CODE (exp) == MEM_REF
+      || TREE_CODE (exp) == TARGET_MEM_REF)
+    return capability_type_p (TREE_TYPE (TREE_OPERAND (exp, 0)));
+
+  return CAPABILITY_MODE_P (Pmode);
+}
+
 /* Return the machine mode of T.  For vectors, returns the mode of the
    inner type.  The main use case is to feed the result to HONOR_NANS,
    avoiding the BLKmode that a direct TYPE_MODE (T) might return.  */
diff --git a/gcc/tree.h b/gcc/tree.h
index 61c217ebf86..821fdf30e84 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -2036,6 +2036,7 @@ extern tree fold_drop_capability (tree);
 extern tree build_replace_address_value_loc (location_t, tree, tree);
 extern tree build_cap_global_data_get_loc (location_t, tree);
 extern bool valid_capability_code_p (tree_code);
+extern bool has_capability_address (const_tree);
 extern tree_code addr_expr_code (const_tree);
 extern tree_code unqualified_addr_expr ();

