* [C++0x] contiguous bitfields race implementation
@ 2011-05-09 17:12 Aldy Hernandez
2011-05-09 18:04 ` Jeff Law
0 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-05-09 17:12 UTC (permalink / raw)
To: Jason Merrill; +Cc: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 1324 bytes --]
Seeing that the current C++ draft has been approved, I'd like to submit
this for mainline, and get the proper review everyone's being quietly
avoiding :).
To refresh everyone's memory, here is the problem:
struct
{
  unsigned int a : 4;
  unsigned char b;
  unsigned int c : 6;
} var;

void seta ()
{
  var.a = 12;
}
In the new C++ standard, stores into <a> cannot touch <b>, so we can't
store with anything wider (e.g. a 32 bit store) that will touch <b>.
This problem can be seen on strictly aligned targets such as ARM, where
we store the above sequence with a 32-bit store. Or on x86-64 with <a>
being volatile (PR48124).
This patch fixes both problems, but only for the C++ memory model. This
is NOT a generic fix for PR48124, only a fix when using "--param
allow-store-data-races=0". I will gladly change the parameter name, if
another is preferred.
The gist of this patch is in max_field_size(), where we calculate the
maximum number of bits we can store into. In doing this calculation I
assume we can store into the padding without causing any races. So,
padding between fields and at the end of the structure is included.
Tested on x86-64 both with and without "--param
allow-store-data-races=0", and by visually inspecting the assembly on
arm-linux and ia64-linux.
OK for trunk?
Aldy
[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 26957 bytes --]
* params.h (ALLOW_STORE_DATA_RACES): New.
* params.def (PARAM_ALLOW_STORE_DATA_RACES): New.
* Makefile.in (expr.o): Depend on PARAMS_H.
* machmode.h (get_best_mode): Add argument.
* fold-const.c (optimize_bit_field_compare): Add argument to
get_best_mode.
(fold_truthop): Same.
* ifcvt.c (noce_emit_move_insn): Add argument to store_bit_field.
* expr.c (emit_group_store): Same.
(copy_blkmode_from_reg): Same.
(write_complex_part): Same.
(optimize_bitfield_assignment_op): Add argument.
Add argument to get_best_mode.
(max_field_size): New.
(expand_assignment): Calculate maxbits and pass it down
accordingly.
(store_field): New argument.
(expand_expr_real_2): New argument to store_field.
Include params.h.
* expr.h (store_bit_field): New argument.
* stor-layout.c (get_best_mode): Restrict mode expansion by taking
into account maxbits.
* calls.c (store_unaligned_arguments_into_pseudos): New argument
to store_bit_field.
* expmed.c (store_bit_field_1): New argument. Use it.
(store_bit_field): Same.
(store_fixed_bit_field): Same.
(store_split_bit_field): Same.
(extract_bit_field_1): Pass new argument to get_best_mode.
(extract_bit_field): Same.
* stmt.c (expand_return): Pass new argument to store_bit_field.
* tree.h (DECL_THREAD_VISIBLE_P): New.
* doc/invoke.texi: Document parameter allow-store-data-races.
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi (revision 173263)
+++ doc/invoke.texi (working copy)
@@ -8886,6 +8886,11 @@ The maximum number of conditional stores
if either vectorization (@option{-ftree-vectorize}) or if-conversion
(@option{-ftree-loop-if-convert}) is disabled. The default is 2.
+@item allow-store-data-races
+Allow optimizers to introduce new data races on stores.
+Set to 1 to allow, otherwise to 0. This option is enabled by default
+unless implicitly set by the @option{-fmemory-model=} option.
+
@end table
@end table
Index: machmode.h
===================================================================
--- machmode.h (revision 173263)
+++ machmode.h (working copy)
@@ -248,7 +248,9 @@ extern enum machine_mode mode_for_vector
/* Find the best mode to use to access a bit field. */
-extern enum machine_mode get_best_mode (int, int, unsigned int,
+extern enum machine_mode get_best_mode (int, int,
+ unsigned HOST_WIDE_INT,
+ unsigned int,
enum machine_mode, int);
/* Determine alignment, 1<=result<=BIGGEST_ALIGNMENT. */
Index: tree.h
===================================================================
--- tree.h (revision 173263)
+++ tree.h (working copy)
@@ -3156,6 +3156,10 @@ struct GTY(()) tree_parm_decl {
#define DECL_THREAD_LOCAL_P(NODE) \
(VAR_DECL_CHECK (NODE)->decl_with_vis.tls_model >= TLS_MODEL_REAL)
+/* Return true if a VAR_DECL is visible from another thread. */
+#define DECL_THREAD_VISIBLE_P(NODE) \
+ (TREE_STATIC (NODE) && !DECL_THREAD_LOCAL_P (NODE))
+
/* In a non-local VAR_DECL with static storage duration, true if the
variable has an initialization priority. If false, the variable
will be initialized at the DEFAULT_INIT_PRIORITY. */
Index: fold-const.c
===================================================================
--- fold-const.c (revision 173263)
+++ fold-const.c (working copy)
@@ -3409,7 +3409,7 @@ optimize_bit_field_compare (location_t l
&& flag_strict_volatile_bitfields > 0)
nmode = lmode;
else
- nmode = get_best_mode (lbitsize, lbitpos,
+ nmode = get_best_mode (lbitsize, lbitpos, 0,
const_p ? TYPE_ALIGN (TREE_TYPE (linner))
: MIN (TYPE_ALIGN (TREE_TYPE (linner)),
TYPE_ALIGN (TREE_TYPE (rinner))),
@@ -5237,7 +5237,7 @@ fold_truthop (location_t loc, enum tree_
to be relative to a field of that size. */
first_bit = MIN (ll_bitpos, rl_bitpos);
end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize);
- lnmode = get_best_mode (end_bit - first_bit, first_bit,
+ lnmode = get_best_mode (end_bit - first_bit, first_bit, 0,
TYPE_ALIGN (TREE_TYPE (ll_inner)), word_mode,
volatilep);
if (lnmode == VOIDmode)
@@ -5302,7 +5302,7 @@ fold_truthop (location_t loc, enum tree_
first_bit = MIN (lr_bitpos, rr_bitpos);
end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize);
- rnmode = get_best_mode (end_bit - first_bit, first_bit,
+ rnmode = get_best_mode (end_bit - first_bit, first_bit, 0,
TYPE_ALIGN (TREE_TYPE (lr_inner)), word_mode,
volatilep);
if (rnmode == VOIDmode)
Index: params.h
===================================================================
--- params.h (revision 173263)
+++ params.h (working copy)
@@ -206,4 +206,6 @@ extern void init_param_values (int *para
PARAM_VALUE (PARAM_MIN_NONDEBUG_INSN_UID)
#define MAX_STORES_TO_SINK \
PARAM_VALUE (PARAM_MAX_STORES_TO_SINK)
+#define ALLOW_STORE_DATA_RACES \
+ PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES)
#endif /* ! GCC_PARAMS_H */
Index: ifcvt.c
===================================================================
--- ifcvt.c (revision 173263)
+++ ifcvt.c (working copy)
@@ -885,7 +885,7 @@ noce_emit_move_insn (rtx x, rtx y)
}
gcc_assert (start < (MEM_P (op) ? BITS_PER_UNIT : BITS_PER_WORD));
- store_bit_field (op, size, start, GET_MODE (x), y);
+ store_bit_field (op, size, start, 0, GET_MODE (x), y);
return;
}
@@ -939,7 +939,7 @@ noce_emit_move_insn (rtx x, rtx y)
inner = XEXP (outer, 0);
outmode = GET_MODE (outer);
bitpos = SUBREG_BYTE (outer) * BITS_PER_UNIT;
- store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos, outmode, y);
+ store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos, 0, outmode, y);
}
/* Return sequence of instructions generated by if conversion. This
Index: expr.c
===================================================================
--- expr.c (revision 173263)
+++ expr.c (working copy)
@@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.
#include "diagnostic.h"
#include "ssaexpand.h"
#include "target-globals.h"
+#include "params.h"
/* Decide whether a function's arguments should be processed
from first to last or from last to first.
@@ -142,7 +143,8 @@ static void store_constructor_field (rtx
HOST_WIDE_INT, enum machine_mode,
tree, tree, int, alias_set_type);
static void store_constructor (tree, rtx, int, HOST_WIDE_INT);
-static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT, enum machine_mode,
+static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT, enum machine_mode,
tree, tree, alias_set_type, bool);
static unsigned HOST_WIDE_INT highest_pow2_factor_for_target (const_tree, const_tree);
@@ -2063,7 +2065,7 @@ emit_group_store (rtx orig_dst, rtx src,
emit_move_insn (adjust_address (dest, mode, bytepos), tmps[i]);
else
store_bit_field (dest, bytelen * BITS_PER_UNIT, bytepos * BITS_PER_UNIT,
- mode, tmps[i]);
+ 0, mode, tmps[i]);
}
/* Copy from the pseudo into the (probable) hard reg. */
@@ -2157,7 +2159,7 @@ copy_blkmode_from_reg (rtx tgtblk, rtx s
/* Use xbitpos for the source extraction (right justified) and
bitpos for the destination store (left justified). */
- store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, copy_mode,
+ store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, 0, copy_mode,
extract_bit_field (src, bitsize,
xbitpos % BITS_PER_WORD, 1, false,
NULL_RTX, copy_mode, copy_mode));
@@ -2794,7 +2796,7 @@ write_complex_part (rtx cplx, rtx val, b
gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD);
}
- store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, imode, val);
+ store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, 0, imode, val);
}
/* Extract one of the components of the complex value CPLX. Extract the
@@ -3929,6 +3931,7 @@ get_subtarget (rtx x)
static bool
optimize_bitfield_assignment_op (unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT maxbits,
enum machine_mode mode1, rtx str_rtx,
tree to, tree src)
{
@@ -3989,7 +3992,7 @@ optimize_bitfield_assignment_op (unsigne
if (str_bitsize == 0 || str_bitsize > BITS_PER_WORD)
str_mode = word_mode;
- str_mode = get_best_mode (bitsize, bitpos,
+ str_mode = get_best_mode (bitsize, bitpos, maxbits,
MEM_ALIGN (str_rtx), str_mode, 0);
if (str_mode == VOIDmode)
return false;
@@ -4098,6 +4101,92 @@ optimize_bitfield_assignment_op (unsigne
return false;
}
+/* In the C++ memory model, consecutive bit fields in a structure are
+ considered one memory location.
+
+ Given a COMPONENT_REF, this function returns the maximum number of
+ bits we are allowed to store into, when storing into the
+ COMPONENT_REF. We return 0, if there is no restriction.
+
+ EXP is the COMPONENT_REF.
+
+ BITPOS is the position in bits where the bit starts within the structure.
+ BITSIZE is size in bits of the field being referenced in EXP.
+
+ For example, while storing into FOO.A here...
+
+ struct {
+ BIT 0:
+ unsigned int a : 4;
+ unsigned int b : 1;
+ BIT 8:
+ unsigned char c;
+ unsigned int d : 6;
+ } foo;
+
+ ...we are not allowed to store past <b>, so for the layout above,
+ we would return 8 maximum bits (because who cares if we store into
+ the padding). */
+
+
+static unsigned HOST_WIDE_INT
+max_field_size (tree exp, HOST_WIDE_INT bitpos, HOST_WIDE_INT bitsize)
+{
+ tree field, record_type, fld;
+ HOST_WIDE_INT maxbits = bitsize;
+
+ gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
+
+ /* If other threads can't see this value, no need to restrict stores. */
+ if (ALLOW_STORE_DATA_RACES
+ || !DECL_THREAD_VISIBLE_P (TREE_OPERAND (exp, 0)))
+ return 0;
+
+ field = TREE_OPERAND (exp, 1);
+ record_type = DECL_FIELD_CONTEXT (field);
+
+ /* Find the original field within the structure. */
+ for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld))
+ if (fld == field)
+ break;
+ gcc_assert (fld == field);
+
+ /* If this is the last element in the structure, we can touch from
+ BITPOS to the end of the structure (including the padding). */
+ if (!DECL_CHAIN (fld))
+ return TREE_INT_CST_LOW (TYPE_SIZE (record_type)) - bitpos;
+
+ /* Count contiguous bit fields not separated by a 0-length bit-field. */
+ for (fld = DECL_CHAIN (fld); fld; fld = DECL_CHAIN (fld))
+ {
+ tree t, offset;
+ enum machine_mode mode;
+ int unsignedp, volatilep;
+
+ if (TREE_CODE (fld) != FIELD_DECL)
+ continue;
+
+ t = build3 (COMPONENT_REF, TREE_TYPE (exp),
+ unshare_expr (TREE_OPERAND (exp, 0)),
+ fld, NULL_TREE);
+ get_inner_reference (t, &bitsize, &bitpos, &offset,
+ &mode, &unsignedp, &volatilep, true);
+
+ /* Only count contiguous bit fields, that are not separated by a
+ zero-length bit field. */
+ if (!DECL_BIT_FIELD (fld)
+ || bitsize == 0)
+ {
+ /* Include the padding up to the next field. */
+ maxbits += bitpos - maxbits;
+ break;
+ }
+
+ maxbits += bitsize;
+ }
+
+ return maxbits;
+}
/* Expand an assignment that stores the value of FROM into TO. If NONTEMPORAL
is true, try generating a nontemporal store. */
@@ -4197,6 +4286,9 @@ expand_assignment (tree to, tree from, b
{
enum machine_mode mode1;
HOST_WIDE_INT bitsize, bitpos;
+ /* Max consecutive bits we are allowed to touch while storing
+ into TO. */
+ HOST_WIDE_INT maxbits = 0;
tree offset;
int unsignedp;
int volatilep = 0;
@@ -4206,6 +4298,10 @@ expand_assignment (tree to, tree from, b
tem = get_inner_reference (to, &bitsize, &bitpos, &offset, &mode1,
&unsignedp, &volatilep, true);
+ if (TREE_CODE (to) == COMPONENT_REF
+ && DECL_BIT_FIELD (TREE_OPERAND (to, 1)))
+ maxbits = max_field_size (to, bitpos, bitsize);
+
/* If we are going to use store_bit_field and extract_bit_field,
make sure to_rtx will be safe for multiple use. */
@@ -4286,12 +4382,13 @@ expand_assignment (tree to, tree from, b
result = store_expr (from, XEXP (to_rtx, bitpos != 0), false,
nontemporal);
else if (bitpos + bitsize <= mode_bitsize / 2)
- result = store_field (XEXP (to_rtx, 0), bitsize, bitpos,
+ result = store_field (XEXP (to_rtx, 0), bitsize, bitpos, maxbits,
mode1, from, TREE_TYPE (tem),
get_alias_set (to), nontemporal);
else if (bitpos >= mode_bitsize / 2)
result = store_field (XEXP (to_rtx, 1), bitsize,
- bitpos - mode_bitsize / 2, mode1, from,
+ bitpos - mode_bitsize / 2, maxbits,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
else if (bitpos == 0 && bitsize == mode_bitsize)
@@ -4312,7 +4409,8 @@ expand_assignment (tree to, tree from, b
0);
write_complex_part (temp, XEXP (to_rtx, 0), false);
write_complex_part (temp, XEXP (to_rtx, 1), true);
- result = store_field (temp, bitsize, bitpos, mode1, from,
+ result = store_field (temp, bitsize, bitpos, maxbits,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
emit_move_insn (XEXP (to_rtx, 0), read_complex_part (temp, false));
@@ -4337,11 +4435,12 @@ expand_assignment (tree to, tree from, b
MEM_KEEP_ALIAS_SET_P (to_rtx) = 1;
}
- if (optimize_bitfield_assignment_op (bitsize, bitpos, mode1,
+ if (optimize_bitfield_assignment_op (bitsize, bitpos, maxbits, mode1,
to_rtx, to, from))
result = NULL;
else
- result = store_field (to_rtx, bitsize, bitpos, mode1, from,
+ result = store_field (to_rtx, bitsize, bitpos, maxbits,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
}
@@ -4734,7 +4833,7 @@ store_expr (tree exp, rtx target, int ca
: BLOCK_OP_NORMAL));
else if (GET_MODE (target) == BLKmode)
store_bit_field (target, INTVAL (expr_size (exp)) * BITS_PER_UNIT,
- 0, GET_MODE (temp), temp);
+ 0, 0, GET_MODE (temp), temp);
else
convert_move (target, temp, unsignedp);
}
@@ -5177,7 +5276,8 @@ store_constructor_field (rtx target, uns
store_constructor (exp, target, cleared, bitsize / BITS_PER_UNIT);
}
else
- store_field (target, bitsize, bitpos, mode, exp, type, alias_set, false);
+ store_field (target, bitsize, bitpos, 0, mode, exp, type, alias_set,
+ false);
}
/* Store the value of constructor EXP into the rtx TARGET.
@@ -5751,6 +5851,8 @@ store_constructor (tree exp, rtx target,
BITSIZE bits, starting BITPOS bits from the start of TARGET.
If MODE is VOIDmode, it means that we are storing into a bit-field.
+ MAXBITS is the number of bits we can store into, 0 if no limit.
+
Always return const0_rtx unless we have something particular to
return.
@@ -5764,6 +5866,7 @@ store_constructor (tree exp, rtx target,
static rtx
store_field (rtx target, HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT maxbits,
enum machine_mode mode, tree exp, tree type,
alias_set_type alias_set, bool nontemporal)
{
@@ -5796,8 +5899,8 @@ store_field (rtx target, HOST_WIDE_INT b
if (bitsize != (HOST_WIDE_INT) GET_MODE_BITSIZE (GET_MODE (target)))
emit_move_insn (object, target);
- store_field (blk_object, bitsize, bitpos, mode, exp, type, alias_set,
- nontemporal);
+ store_field (blk_object, bitsize, bitpos, maxbits,
+ mode, exp, type, alias_set, nontemporal);
emit_move_insn (target, object);
@@ -5911,7 +6014,7 @@ store_field (rtx target, HOST_WIDE_INT b
}
/* Store the value in the bitfield. */
- store_bit_field (target, bitsize, bitpos, mode, temp);
+ store_bit_field (target, bitsize, bitpos, maxbits, mode, temp);
return const0_rtx;
}
@@ -7323,7 +7426,7 @@ expand_expr_real_2 (sepops ops, rtx targ
(treeop0))
* BITS_PER_UNIT),
(HOST_WIDE_INT) GET_MODE_BITSIZE (mode)),
- 0, TYPE_MODE (valtype), treeop0,
+ 0, 0, TYPE_MODE (valtype), treeop0,
type, 0, false);
}
Index: expr.h
===================================================================
--- expr.h (revision 173263)
+++ expr.h (working copy)
@@ -665,7 +665,8 @@ extern enum machine_mode
mode_for_extraction (enum extraction_pattern, int);
extern void store_bit_field (rtx, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, enum machine_mode, rtx);
+ unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT,
+ enum machine_mode, rtx);
extern rtx extract_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT, int, bool, rtx,
enum machine_mode, enum machine_mode);
Index: stor-layout.c
===================================================================
--- stor-layout.c (revision 173263)
+++ stor-layout.c (working copy)
@@ -2428,6 +2428,9 @@ fixup_unsigned_type (tree type)
/* Find the best machine mode to use when referencing a bit field of length
BITSIZE bits starting at BITPOS.
+ MAXBITS is the maximum number of bits we are allowed to touch, when
+ referencing this bit field. MAXBITS is 0 if there is no limit.
+
The underlying object is known to be aligned to a boundary of ALIGN bits.
If LARGEST_MODE is not VOIDmode, it means that we should not use a mode
larger than LARGEST_MODE (usually SImode).
@@ -2445,7 +2448,8 @@ fixup_unsigned_type (tree type)
decide which of the above modes should be used. */
enum machine_mode
-get_best_mode (int bitsize, int bitpos, unsigned int align,
+get_best_mode (int bitsize, int bitpos, unsigned HOST_WIDE_INT maxbits,
+ unsigned int align,
enum machine_mode largest_mode, int volatilep)
{
enum machine_mode mode;
@@ -2484,6 +2488,7 @@ get_best_mode (int bitsize, int bitpos,
if (bitpos / unit == (bitpos + bitsize - 1) / unit
&& unit <= BITS_PER_WORD
&& unit <= MIN (align, BIGGEST_ALIGNMENT)
+ && (!maxbits || unit <= maxbits)
&& (largest_mode == VOIDmode
|| unit <= GET_MODE_BITSIZE (largest_mode)))
wide_mode = tmode;
Index: calls.c
===================================================================
--- calls.c (revision 173263)
+++ calls.c (working copy)
@@ -909,7 +909,7 @@ store_unaligned_arguments_into_pseudos (
emit_move_insn (reg, const0_rtx);
bytes -= bitsize / BITS_PER_UNIT;
- store_bit_field (reg, bitsize, endian_correction, word_mode,
+ store_bit_field (reg, bitsize, endian_correction, 0, word_mode,
word);
}
}
Index: expmed.c
===================================================================
--- expmed.c (revision 173263)
+++ expmed.c (working copy)
@@ -47,9 +47,13 @@ struct target_expmed *this_target_expmed
static void store_fixed_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, rtx);
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ rtx);
static void store_split_bit_field (rtx, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, rtx);
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ rtx);
static rtx extract_fixed_bit_field (enum machine_mode, rtx,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
@@ -333,7 +337,9 @@ mode_for_extraction (enum extraction_pat
static bool
store_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum, enum machine_mode fieldmode,
+ unsigned HOST_WIDE_INT bitnum,
+ unsigned HOST_WIDE_INT maxbits,
+ enum machine_mode fieldmode,
rtx value, bool fallback_p)
{
unsigned int unit
@@ -547,7 +553,9 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!store_bit_field_1 (op0, MIN (BITS_PER_WORD,
bitsize - i * BITS_PER_WORD),
- bitnum + bit_offset, word_mode,
+ bitnum + bit_offset,
+ maxbits,
+ word_mode,
value_word, fallback_p))
{
delete_insns_since (last);
@@ -718,9 +726,10 @@ store_bit_field_1 (rtx str_rtx, unsigned
mode. Otherwise, use the smallest mode containing the field. */
if (GET_MODE (op0) == BLKmode
+ || (maxbits && GET_MODE_BITSIZE (GET_MODE (op0)) > maxbits)
|| (op_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (op_mode)))
- bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
+ bestmode = get_best_mode (bitsize, bitnum, maxbits, MEM_ALIGN (op0),
(op_mode == MAX_MACHINE_MODE
? VOIDmode : op_mode),
MEM_VOLATILE_P (op0));
@@ -748,7 +757,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
/* Fetch that unit, store the bitfield in it, then store
the unit. */
tempreg = copy_to_reg (xop0);
- if (store_bit_field_1 (tempreg, bitsize, xbitpos,
+ if (store_bit_field_1 (tempreg, bitsize, xbitpos, maxbits,
fieldmode, orig_value, false))
{
emit_move_insn (xop0, tempreg);
@@ -761,21 +770,28 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!fallback_p)
return false;
- store_fixed_bit_field (op0, offset, bitsize, bitpos, value);
+ store_fixed_bit_field (op0, offset, bitsize, bitpos, maxbits, value);
return true;
}
/* Generate code to store value from rtx VALUE
into a bit-field within structure STR_RTX
containing BITSIZE bits starting at bit BITNUM.
+
+ MAXBITS is the maximum number of bits we are allowed to store into,
+ 0 if no limit.
+
FIELDMODE is the machine-mode of the FIELD_DECL node for this field. */
void
store_bit_field (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum, enum machine_mode fieldmode,
+ unsigned HOST_WIDE_INT bitnum,
+ unsigned HOST_WIDE_INT maxbits,
+ enum machine_mode fieldmode,
rtx value)
{
- if (!store_bit_field_1 (str_rtx, bitsize, bitnum, fieldmode, value, true))
+ if (!store_bit_field_1 (str_rtx, bitsize, bitnum, maxbits,
+ fieldmode, value, true))
gcc_unreachable ();
}
\f
@@ -791,7 +807,9 @@ store_bit_field (rtx str_rtx, unsigned H
static void
store_fixed_bit_field (rtx op0, unsigned HOST_WIDE_INT offset,
unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitpos, rtx value)
+ unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT maxbits,
+ rtx value)
{
enum machine_mode mode;
unsigned int total_bits = BITS_PER_WORD;
@@ -812,7 +830,7 @@ store_fixed_bit_field (rtx op0, unsigned
/* Special treatment for a bit field split across two registers. */
if (bitsize + bitpos > BITS_PER_WORD)
{
- store_split_bit_field (op0, bitsize, bitpos, value);
+ store_split_bit_field (op0, bitsize, bitpos, maxbits, value);
return;
}
}
@@ -830,10 +848,12 @@ store_fixed_bit_field (rtx op0, unsigned
if (MEM_VOLATILE_P (op0)
&& GET_MODE_BITSIZE (GET_MODE (op0)) > 0
+ && GET_MODE_BITSIZE (GET_MODE (op0)) <= maxbits
&& flag_strict_volatile_bitfields > 0)
mode = GET_MODE (op0);
else
mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
+ maxbits,
MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
@@ -841,7 +861,7 @@ store_fixed_bit_field (rtx op0, unsigned
/* The only way this should occur is if the field spans word
boundaries. */
store_split_bit_field (op0, bitsize, bitpos + offset * BITS_PER_UNIT,
- value);
+ maxbits, value);
return;
}
@@ -961,7 +981,9 @@ store_fixed_bit_field (rtx op0, unsigned
static void
store_split_bit_field (rtx op0, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitpos, rtx value)
+ unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT maxbits,
+ rtx value)
{
unsigned int unit;
unsigned int bitsdone = 0;
@@ -1076,7 +1098,7 @@ store_split_bit_field (rtx op0, unsigned
it is just an out-of-bounds access. Ignore it. */
if (word != const0_rtx)
store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize,
- thispos, part);
+ thispos, maxbits, part);
bitsdone += thissize;
}
}
@@ -1520,7 +1542,7 @@ extract_bit_field_1 (rtx str_rtx, unsign
if (GET_MODE (op0) == BLKmode
|| (ext_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (ext_mode)))
- bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
+ bestmode = get_best_mode (bitsize, bitnum, 0, MEM_ALIGN (op0),
(ext_mode == MAX_MACHINE_MODE
? VOIDmode : ext_mode),
MEM_VOLATILE_P (op0));
@@ -1646,7 +1668,7 @@ extract_fixed_bit_field (enum machine_mo
mode = tmode;
}
else
- mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
+ mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT, 0,
MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
Index: Makefile.in
===================================================================
--- Makefile.in (revision 173263)
+++ Makefile.in (working copy)
@@ -2916,7 +2916,7 @@ expr.o : expr.c $(CONFIG_H) $(SYSTEM_H)
typeclass.h hard-reg-set.h toplev.h $(DIAGNOSTIC_CORE_H) hard-reg-set.h $(EXCEPT_H) \
reload.h langhooks.h intl.h $(TM_P_H) $(TARGET_H) \
tree-iterator.h gt-expr.h $(MACHMODE_H) $(TIMEVAR_H) $(TREE_FLOW_H) \
- $(TREE_PASS_H) $(DF_H) $(DIAGNOSTIC_H) vecprim.h $(SSAEXPAND_H)
+ $(TREE_PASS_H) $(DF_H) $(DIAGNOSTIC_H) vecprim.h $(SSAEXPAND_H) $(PARAMS_H)
dojump.o : dojump.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \
$(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(OPTABS_H) $(INSN_ATTR_H) insn-config.h \
langhooks.h $(GGC_H) gt-dojump.h vecprim.h $(BASIC_BLOCK_H) output.h
Index: stmt.c
===================================================================
--- stmt.c (revision 173263)
+++ stmt.c (working copy)
@@ -1758,7 +1758,7 @@ expand_return (tree retval)
/* Use bitpos for the source extraction (left justified) and
xbitpos for the destination store (right justified). */
- store_bit_field (dst, bitsize, xbitpos % BITS_PER_WORD, word_mode,
+ store_bit_field (dst, bitsize, xbitpos % BITS_PER_WORD, 0, word_mode,
extract_bit_field (src, bitsize,
bitpos % BITS_PER_WORD, 1, false,
NULL_RTX, word_mode, word_mode));
Index: params.def
===================================================================
--- params.def (revision 173263)
+++ params.def (working copy)
@@ -884,6 +884,13 @@ DEFPARAM (PARAM_MAX_STORES_TO_SINK,
"Maximum number of conditional store pairs that can be sunk",
2, 0, 0)
+/* Data race flags for C++0x memory model compliance. */
+
+DEFPARAM (PARAM_ALLOW_STORE_DATA_RACES,
+ "allow-store-data-races",
+ "Allow new data races on stores to be introduced",
+ 1, 0, 1)
+
/*
Local variables:
* Re: [C++0x] contiguous bitfields race implementation
2011-05-09 17:12 [C++0x] contiguous bitfields race implementation Aldy Hernandez
@ 2011-05-09 18:04 ` Jeff Law
2011-05-09 18:05 ` Aldy Hernandez
2011-05-09 20:11 ` Aldy Hernandez
0 siblings, 2 replies; 81+ messages in thread
From: Jeff Law @ 2011-05-09 18:04 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jason Merrill, gcc-patches
On 05/09/11 10:24, Aldy Hernandez wrote:
> Seeing that the current C++ draft has been approved, I'd like to submit
> this for mainline, and get the proper review everyone's being quietly
> avoiding :).
>
> To refresh everyone's memory, here is the problem:
>
> struct
> {
> unsigned int a : 4;
> unsigned char b;
> unsigned int c: 6;
> } var;
>
>
> void seta(){
> var.a = 12;
> }
>
>
> In the new C++ standard, stores into <a> cannot touch <b>, so we can't
> store with anything wider (e.g. a 32 bit store) that will touch <b>.
> This problem can be seen on strictly aligned targets such as ARM, where
> we store the above sequence with a 32-bit store. Or on x86-64 with <a>
> being volatile (PR48124).
>
> This patch fixes both problems, but only for the C++ memory model. This
> is NOT a generic fix for PR48124, only a fix when using "--param
> allow-store-data-races=0". I will gladly change the parameter name, if
> another is preferred.
>
> The gist of this patch is in max_field_size(), where we calculate the
> maximum number of bits we can store into. In doing this calculation I
> assume we can store into the padding without causing any races. So,
> padding between fields and at the end of the structure is included.
Well, the kernel guys would like to be able to preserve the
padding bits too. It's a long long sad story that I won't repeat...
And I don't think we should further complicate this stuff with the
desire to not clobber padding bits :-) Though be aware the request
might come one day....
>
> Tested on x86-64 both with and without "--param
> allow-store-data-races=0", and visually inspecting the assembly on
> arm-linux and ia64-linux.
Any way to add a test to the testsuite?
General approach seems OK; I didn't dive deeply into the implementation.
I'll leave that for rth & jason :-)
jeff
* Re: [C++0x] contiguous bitfields race implementation
2011-05-09 18:04 ` Jeff Law
@ 2011-05-09 18:05 ` Aldy Hernandez
2011-05-09 19:19 ` Jeff Law
2011-05-09 20:11 ` Aldy Hernandez
1 sibling, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-05-09 18:05 UTC (permalink / raw)
To: Jeff Law; +Cc: Jason Merrill, gcc-patches
>> struct
>> {
>> unsigned int a : 4;
>> unsigned char b;
>> unsigned int c: 6;
>> } var;
> Well, the kernel guys would like to be able to preserve the
> padding bits too. It's a long long sad story that I won't repeat...
> And I don't think we should further complicate this stuff with the
> desire to not clobber padding bits :-) Though be aware the request
> might come one day....
Woah, let me see if I got this right. If we were to store in VAR.C
above, the default for this memory model would be NOT to clobber the
padding bits past <c>? That definitely makes my implementation simpler,
so I won't complain, but that's just weird.
>> Tested on x86-64 both with and without "--param
>> allow-store-data-races=0", and visually inspecting the assembly on
>> arm-linux and ia64-linux.
> Any way to add a test to the testsuite?
Arghhh... I was afraid you'd ask for one. It was much easier with the
test harness on cxx-memory-model. I'll whip one up though...
Aldy
* Re: [C++0x] contiguous bitfields race implementation
2011-05-09 18:05 ` Aldy Hernandez
@ 2011-05-09 19:19 ` Jeff Law
0 siblings, 0 replies; 81+ messages in thread
From: Jeff Law @ 2011-05-09 19:19 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jason Merrill, gcc-patches
On 05/09/11 11:26, Aldy Hernandez wrote:
>
>>> struct
>>> {
>>> unsigned int a : 4;
>>> unsigned char b;
>>> unsigned int c: 6;
>>> } var;
>
>
>> Well, the kernel guys would like to be able to preserve the
>> padding bits too. It's a long long sad story that I won't repeat...
>> And I don't think we should further complicate this stuff with the
>> desire to not clobber padding bits :-) Though be aware the request
>> might come one day....
>
> Woah, let me see if I got this right. If we were to store in VAR.C
> above, the default for this memory model would be NOT to clobber the
> padding bits past <c>? That definitely makes my implementation simpler,
> so I won't complain, but that's just weird.
Just to be clear, it's something I've discussed with the kernel guys and
is completely separate from the C++ memory model. I don't think we
should wrap this into your current work.
Consider if the kernel team wanted to add some information to a
structure without growing the structure. Furthermore, assume that the
structure escapes, say into modules that aren't necessarily going to be
rebuilt, but those modules won't need to ever access this new
information. And assume there happens to be enough padding bits to hold
this auxiliary information.
This has actually occurred and the kernel team wanted to use the padding
bits to hold the auxiliary information and maintain kernel ABI/API
compatibility. Unfortunately, a store to a nearby bitfield can
overwrite the padding, so if the structure escaped to a module that
still thought the bits were padding, that module could clobber
those padding bits, destroying the auxiliary data.
If GCC had a mode where it would preserve the padding bits (when
possible), it'd help the kernel team in these situations.
>
> Arghhh... I was afraid you'd ask for one. It was much easier with the
> test harness on cxx-memory-model. I'll whip one up though...
Given others have (rightly) called me out on it a lot recently, I
figured I'd pass along the love :-)
jeff
* Re: [C++0x] contiguous bitfields race implementation
2011-05-09 18:04 ` Jeff Law
2011-05-09 18:05 ` Aldy Hernandez
@ 2011-05-09 20:11 ` Aldy Hernandez
2011-05-09 20:28 ` Jakub Jelinek
2011-05-09 20:49 ` Jason Merrill
1 sibling, 2 replies; 81+ messages in thread
From: Aldy Hernandez @ 2011-05-09 20:11 UTC (permalink / raw)
To: Jeff Law; +Cc: Jason Merrill, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 579 bytes --]
>> Tested on x86-64 both with and without "--param
>> allow-store-data-races=0", and visually inspecting the assembly on
>> arm-linux and ia64-linux.
> Any way to add a test to the testsuite?
I was able to find a testcase for i386/x86_64 by making the bitfield
volatile (similar to the problem in PR48124). So there you go...
testcase and all :).
Jakub also gave me a testcase which triggered a buglet in
max_field_size. I have now added a parameter INNERDECL which is the
inner reference, so we can properly determine if the inner decl is
thread visible or not.
Aldy
[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 27584 bytes --]
* params.h (ALLOW_STORE_DATA_RACES): New.
* params.def (PARAM_ALLOW_STORE_DATA_RACES): New.
* Makefile.in (expr.o): Depend on PARAMS_H.
* machmode.h (get_best_mode): Add argument.
* fold-const.c (optimize_bit_field_compare): Add argument to
get_best_mode.
(fold_truthop): Same.
* ifcvt.c (noce_emit_move_insn): Add argument to store_bit_field.
* expr.c (emit_group_store): Same.
(copy_blkmode_from_reg): Same.
(write_complex_part): Same.
(optimize_bitfield_assignment_op): Add argument.
Add argument to get_best_mode.
(max_field_size): New.
(expand_assignment): Calculate maxbits and pass it down
accordingly.
(store_field): New argument.
(expand_expr_real_2): New argument to store_field.
Include params.h.
* expr.h (store_bit_field): New argument.
* stor-layout.c (get_best_mode): Restrict mode expansion by taking
into account maxbits.
* calls.c (store_unaligned_arguments_into_pseudos): New argument
to store_bit_field.
* expmed.c (store_bit_field_1): New argument. Use it.
(store_bit_field): Same.
(store_fixed_bit_field): Same.
(store_split_bit_field): Same.
(extract_bit_field_1): Pass new argument to get_best_mode.
(extract_bit_field): Same.
* stmt.c (store_bit_field): Pass new argument to store_bit_field.
* tree.h (DECL_THREAD_VISIBLE_P): New.
* doc/invoke.texi: Document parameter allow-store-data-races.
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi (revision 173263)
+++ doc/invoke.texi (working copy)
@@ -8886,6 +8886,11 @@ The maximum number of conditional stores
if either vectorization (@option{-ftree-vectorize}) or if-conversion
(@option{-ftree-loop-if-convert}) is disabled. The default is 2.
+@item allow-store-data-races
+Allow optimizers to introduce new data races on stores.
+Set to 1 to allow, otherwise to 0. This option is enabled by default
+unless implicitly set by the @option{-fmemory-model=} option.
+
@end table
@end table
Index: machmode.h
===================================================================
--- machmode.h (revision 173263)
+++ machmode.h (working copy)
@@ -248,7 +248,9 @@ extern enum machine_mode mode_for_vector
/* Find the best mode to use to access a bit field. */
-extern enum machine_mode get_best_mode (int, int, unsigned int,
+extern enum machine_mode get_best_mode (int, int,
+ unsigned HOST_WIDE_INT,
+ unsigned int,
enum machine_mode, int);
/* Determine alignment, 1<=result<=BIGGEST_ALIGNMENT. */
Index: tree.h
===================================================================
--- tree.h (revision 173263)
+++ tree.h (working copy)
@@ -3156,6 +3156,10 @@ struct GTY(()) tree_parm_decl {
#define DECL_THREAD_LOCAL_P(NODE) \
(VAR_DECL_CHECK (NODE)->decl_with_vis.tls_model >= TLS_MODEL_REAL)
+/* Return true if a VAR_DECL is visible from another thread. */
+#define DECL_THREAD_VISIBLE_P(NODE) \
+ (TREE_STATIC (NODE) && !DECL_THREAD_LOCAL_P (NODE))
+
/* In a non-local VAR_DECL with static storage duration, true if the
variable has an initialization priority. If false, the variable
will be initialized at the DEFAULT_INIT_PRIORITY. */
Index: fold-const.c
===================================================================
--- fold-const.c (revision 173263)
+++ fold-const.c (working copy)
@@ -3409,7 +3409,7 @@ optimize_bit_field_compare (location_t l
&& flag_strict_volatile_bitfields > 0)
nmode = lmode;
else
- nmode = get_best_mode (lbitsize, lbitpos,
+ nmode = get_best_mode (lbitsize, lbitpos, 0,
const_p ? TYPE_ALIGN (TREE_TYPE (linner))
: MIN (TYPE_ALIGN (TREE_TYPE (linner)),
TYPE_ALIGN (TREE_TYPE (rinner))),
@@ -5237,7 +5237,7 @@ fold_truthop (location_t loc, enum tree_
to be relative to a field of that size. */
first_bit = MIN (ll_bitpos, rl_bitpos);
end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize);
- lnmode = get_best_mode (end_bit - first_bit, first_bit,
+ lnmode = get_best_mode (end_bit - first_bit, first_bit, 0,
TYPE_ALIGN (TREE_TYPE (ll_inner)), word_mode,
volatilep);
if (lnmode == VOIDmode)
@@ -5302,7 +5302,7 @@ fold_truthop (location_t loc, enum tree_
first_bit = MIN (lr_bitpos, rr_bitpos);
end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize);
- rnmode = get_best_mode (end_bit - first_bit, first_bit,
+ rnmode = get_best_mode (end_bit - first_bit, first_bit, 0,
TYPE_ALIGN (TREE_TYPE (lr_inner)), word_mode,
volatilep);
if (rnmode == VOIDmode)
Index: params.h
===================================================================
--- params.h (revision 173263)
+++ params.h (working copy)
@@ -206,4 +206,6 @@ extern void init_param_values (int *para
PARAM_VALUE (PARAM_MIN_NONDEBUG_INSN_UID)
#define MAX_STORES_TO_SINK \
PARAM_VALUE (PARAM_MAX_STORES_TO_SINK)
+#define ALLOW_STORE_DATA_RACES \
+ PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES)
#endif /* ! GCC_PARAMS_H */
Index: testsuite/gcc.dg/20110509.c
===================================================================
--- testsuite/gcc.dg/20110509.c (revision 0)
+++ testsuite/gcc.dg/20110509.c (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+/* Test that we don't store past VAR.A. */
+
+struct S
+{
+ volatile unsigned int a : 4;
+ unsigned char b;
+ unsigned int c : 6;
+} var;
+
+void set_a()
+{
+ var.a = 12;
+}
+
+/* { dg-final { scan-assembler-not "movl.*, var" } } */
Index: ifcvt.c
===================================================================
--- ifcvt.c (revision 173263)
+++ ifcvt.c (working copy)
@@ -885,7 +885,7 @@ noce_emit_move_insn (rtx x, rtx y)
}
gcc_assert (start < (MEM_P (op) ? BITS_PER_UNIT : BITS_PER_WORD));
- store_bit_field (op, size, start, GET_MODE (x), y);
+ store_bit_field (op, size, start, 0, GET_MODE (x), y);
return;
}
@@ -939,7 +939,7 @@ noce_emit_move_insn (rtx x, rtx y)
inner = XEXP (outer, 0);
outmode = GET_MODE (outer);
bitpos = SUBREG_BYTE (outer) * BITS_PER_UNIT;
- store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos, outmode, y);
+ store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos, 0, outmode, y);
}
/* Return sequence of instructions generated by if conversion. This
Index: expr.c
===================================================================
--- expr.c (revision 173263)
+++ expr.c (working copy)
@@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.
#include "diagnostic.h"
#include "ssaexpand.h"
#include "target-globals.h"
+#include "params.h"
/* Decide whether a function's arguments should be processed
from first to last or from last to first.
@@ -142,7 +143,8 @@ static void store_constructor_field (rtx
HOST_WIDE_INT, enum machine_mode,
tree, tree, int, alias_set_type);
static void store_constructor (tree, rtx, int, HOST_WIDE_INT);
-static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT, enum machine_mode,
+static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT, enum machine_mode,
tree, tree, alias_set_type, bool);
static unsigned HOST_WIDE_INT highest_pow2_factor_for_target (const_tree, const_tree);
@@ -2063,7 +2065,7 @@ emit_group_store (rtx orig_dst, rtx src,
emit_move_insn (adjust_address (dest, mode, bytepos), tmps[i]);
else
store_bit_field (dest, bytelen * BITS_PER_UNIT, bytepos * BITS_PER_UNIT,
- mode, tmps[i]);
+ 0, mode, tmps[i]);
}
/* Copy from the pseudo into the (probable) hard reg. */
@@ -2157,7 +2159,7 @@ copy_blkmode_from_reg (rtx tgtblk, rtx s
/* Use xbitpos for the source extraction (right justified) and
bitpos for the destination store (left justified). */
- store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, copy_mode,
+ store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, 0, copy_mode,
extract_bit_field (src, bitsize,
xbitpos % BITS_PER_WORD, 1, false,
NULL_RTX, copy_mode, copy_mode));
@@ -2794,7 +2796,7 @@ write_complex_part (rtx cplx, rtx val, b
gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD);
}
- store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, imode, val);
+ store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, 0, imode, val);
}
/* Extract one of the components of the complex value CPLX. Extract the
@@ -3929,6 +3931,7 @@ get_subtarget (rtx x)
static bool
optimize_bitfield_assignment_op (unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT maxbits,
enum machine_mode mode1, rtx str_rtx,
tree to, tree src)
{
@@ -3989,7 +3992,7 @@ optimize_bitfield_assignment_op (unsigne
if (str_bitsize == 0 || str_bitsize > BITS_PER_WORD)
str_mode = word_mode;
- str_mode = get_best_mode (bitsize, bitpos,
+ str_mode = get_best_mode (bitsize, bitpos, maxbits,
MEM_ALIGN (str_rtx), str_mode, 0);
if (str_mode == VOIDmode)
return false;
@@ -4098,6 +4101,93 @@ optimize_bitfield_assignment_op (unsigne
return false;
}
+/* In the C++ memory model, consecutive bit fields in a structure are
+ considered one memory location.
+
+ Given a COMPONENT_REF, this function returns the maximum number of
+ bits we are allowed to store into, when storing into the
+ COMPONENT_REF. We return 0, if there is no restriction.
+
+ EXP is the COMPONENT_REF.
+ INNERDECL is actual object being referenced.
+ BITPOS is the position in bits where the bit starts within the structure.
+ BITSIZE is size in bits of the field being referenced in EXP.
+
+ For example, while storing into FOO.A here...
+
+ struct {
+ BIT 0:
+ unsigned int a : 4;
+ unsigned int b : 1;
+ BIT 8:
+ unsigned char c;
+ unsigned int d : 6;
+ } foo;
+
+ ...we are not allowed to store past <b>, so for the layout above,
+ we would return 8 maximum bits (because who cares if we store into
+ the padding). */
+
+
+static unsigned HOST_WIDE_INT
+max_field_size (tree exp, tree innerdecl,
+ HOST_WIDE_INT bitpos, HOST_WIDE_INT bitsize)
+{
+ tree field, record_type, fld;
+ HOST_WIDE_INT maxbits = bitsize;
+
+ gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
+
+ /* If other threads can't see this value, no need to restrict stores. */
+ if (ALLOW_STORE_DATA_RACES
+ || !DECL_THREAD_VISIBLE_P (innerdecl))
+ return 0;
+
+ field = TREE_OPERAND (exp, 1);
+ record_type = DECL_FIELD_CONTEXT (field);
+
+ /* Find the original field within the structure. */
+ for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld))
+ if (fld == field)
+ break;
+ gcc_assert (fld == field);
+
+ /* If this is the last element in the structure, we can touch from
+ BITPOS to the end of the structure (including the padding). */
+ if (!DECL_CHAIN (fld))
+ return TREE_INT_CST_LOW (TYPE_SIZE (record_type)) - bitpos;
+
+ /* Count contiguous bit fields not separated by a 0-length bit-field. */
+ for (fld = DECL_CHAIN (fld); fld; fld = DECL_CHAIN (fld))
+ {
+ tree t, offset;
+ enum machine_mode mode;
+ int unsignedp, volatilep;
+
+ if (TREE_CODE (fld) != FIELD_DECL)
+ continue;
+
+ t = build3 (COMPONENT_REF, TREE_TYPE (exp),
+ unshare_expr (TREE_OPERAND (exp, 0)),
+ fld, NULL_TREE);
+ get_inner_reference (t, &bitsize, &bitpos, &offset,
+ &mode, &unsignedp, &volatilep, true);
+
+ /* Only count contiguous bit fields, that are not separated by a
+ zero-length bit field. */
+ if (!DECL_BIT_FIELD (fld)
+ || bitsize == 0)
+ {
+ /* Include the padding up to the next field. */
+ maxbits += bitpos - maxbits;
+ break;
+ }
+
+ maxbits += bitsize;
+ }
+
+ return maxbits;
+}
/* Expand an assignment that stores the value of FROM into TO. If NONTEMPORAL
is true, try generating a nontemporal store. */
@@ -4197,6 +4287,9 @@ expand_assignment (tree to, tree from, b
{
enum machine_mode mode1;
HOST_WIDE_INT bitsize, bitpos;
+ /* Max consecutive bits we are allowed to touch while storing
+ into TO. */
+ HOST_WIDE_INT maxbits = 0;
tree offset;
int unsignedp;
int volatilep = 0;
@@ -4206,6 +4299,10 @@ expand_assignment (tree to, tree from, b
tem = get_inner_reference (to, &bitsize, &bitpos, &offset, &mode1,
&unsignedp, &volatilep, true);
+ if (TREE_CODE (to) == COMPONENT_REF
+ && DECL_BIT_FIELD (TREE_OPERAND (to, 1)))
+ maxbits = max_field_size (to, tem, bitpos, bitsize);
+
/* If we are going to use store_bit_field and extract_bit_field,
make sure to_rtx will be safe for multiple use. */
@@ -4286,12 +4383,13 @@ expand_assignment (tree to, tree from, b
result = store_expr (from, XEXP (to_rtx, bitpos != 0), false,
nontemporal);
else if (bitpos + bitsize <= mode_bitsize / 2)
- result = store_field (XEXP (to_rtx, 0), bitsize, bitpos,
+ result = store_field (XEXP (to_rtx, 0), bitsize, bitpos, maxbits,
mode1, from, TREE_TYPE (tem),
get_alias_set (to), nontemporal);
else if (bitpos >= mode_bitsize / 2)
result = store_field (XEXP (to_rtx, 1), bitsize,
- bitpos - mode_bitsize / 2, mode1, from,
+ bitpos - mode_bitsize / 2, maxbits,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
else if (bitpos == 0 && bitsize == mode_bitsize)
@@ -4312,7 +4410,8 @@ expand_assignment (tree to, tree from, b
0);
write_complex_part (temp, XEXP (to_rtx, 0), false);
write_complex_part (temp, XEXP (to_rtx, 1), true);
- result = store_field (temp, bitsize, bitpos, mode1, from,
+ result = store_field (temp, bitsize, bitpos, maxbits,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
emit_move_insn (XEXP (to_rtx, 0), read_complex_part (temp, false));
@@ -4337,11 +4436,12 @@ expand_assignment (tree to, tree from, b
MEM_KEEP_ALIAS_SET_P (to_rtx) = 1;
}
- if (optimize_bitfield_assignment_op (bitsize, bitpos, mode1,
+ if (optimize_bitfield_assignment_op (bitsize, bitpos, maxbits, mode1,
to_rtx, to, from))
result = NULL;
else
- result = store_field (to_rtx, bitsize, bitpos, mode1, from,
+ result = store_field (to_rtx, bitsize, bitpos, maxbits,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
}
@@ -4734,7 +4834,7 @@ store_expr (tree exp, rtx target, int ca
: BLOCK_OP_NORMAL));
else if (GET_MODE (target) == BLKmode)
store_bit_field (target, INTVAL (expr_size (exp)) * BITS_PER_UNIT,
- 0, GET_MODE (temp), temp);
+ 0, 0, GET_MODE (temp), temp);
else
convert_move (target, temp, unsignedp);
}
@@ -5177,7 +5277,8 @@ store_constructor_field (rtx target, uns
store_constructor (exp, target, cleared, bitsize / BITS_PER_UNIT);
}
else
- store_field (target, bitsize, bitpos, mode, exp, type, alias_set, false);
+ store_field (target, bitsize, bitpos, 0, mode, exp, type, alias_set,
+ false);
}
/* Store the value of constructor EXP into the rtx TARGET.
@@ -5751,6 +5852,8 @@ store_constructor (tree exp, rtx target,
BITSIZE bits, starting BITPOS bits from the start of TARGET.
If MODE is VOIDmode, it means that we are storing into a bit-field.
+ MAXBITS is the number of bits we can store into, 0 if no limit.
+
Always return const0_rtx unless we have something particular to
return.
@@ -5764,6 +5867,7 @@ store_constructor (tree exp, rtx target,
static rtx
store_field (rtx target, HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT maxbits,
enum machine_mode mode, tree exp, tree type,
alias_set_type alias_set, bool nontemporal)
{
@@ -5796,8 +5900,8 @@ store_field (rtx target, HOST_WIDE_INT b
if (bitsize != (HOST_WIDE_INT) GET_MODE_BITSIZE (GET_MODE (target)))
emit_move_insn (object, target);
- store_field (blk_object, bitsize, bitpos, mode, exp, type, alias_set,
- nontemporal);
+ store_field (blk_object, bitsize, bitpos, maxbits,
+ mode, exp, type, alias_set, nontemporal);
emit_move_insn (target, object);
@@ -5911,7 +6015,7 @@ store_field (rtx target, HOST_WIDE_INT b
}
/* Store the value in the bitfield. */
- store_bit_field (target, bitsize, bitpos, mode, temp);
+ store_bit_field (target, bitsize, bitpos, maxbits, mode, temp);
return const0_rtx;
}
@@ -7323,7 +7427,7 @@ expand_expr_real_2 (sepops ops, rtx targ
(treeop0))
* BITS_PER_UNIT),
(HOST_WIDE_INT) GET_MODE_BITSIZE (mode)),
- 0, TYPE_MODE (valtype), treeop0,
+ 0, 0, TYPE_MODE (valtype), treeop0,
type, 0, false);
}
Index: expr.h
===================================================================
--- expr.h (revision 173263)
+++ expr.h (working copy)
@@ -665,7 +665,8 @@ extern enum machine_mode
mode_for_extraction (enum extraction_pattern, int);
extern void store_bit_field (rtx, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, enum machine_mode, rtx);
+ unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT,
+ enum machine_mode, rtx);
extern rtx extract_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT, int, bool, rtx,
enum machine_mode, enum machine_mode);
Index: stor-layout.c
===================================================================
--- stor-layout.c (revision 173263)
+++ stor-layout.c (working copy)
@@ -2428,6 +2428,9 @@ fixup_unsigned_type (tree type)
/* Find the best machine mode to use when referencing a bit field of length
BITSIZE bits starting at BITPOS.
+ MAXBITS is the maximum number of bits we are allowed to touch, when
+ referencing this bit field. MAXBITS is 0 if there is no limit.
+
The underlying object is known to be aligned to a boundary of ALIGN bits.
If LARGEST_MODE is not VOIDmode, it means that we should not use a mode
larger than LARGEST_MODE (usually SImode).
@@ -2445,7 +2448,8 @@ fixup_unsigned_type (tree type)
decide which of the above modes should be used. */
enum machine_mode
-get_best_mode (int bitsize, int bitpos, unsigned int align,
+get_best_mode (int bitsize, int bitpos, unsigned HOST_WIDE_INT maxbits,
+ unsigned int align,
enum machine_mode largest_mode, int volatilep)
{
enum machine_mode mode;
@@ -2484,6 +2488,7 @@ get_best_mode (int bitsize, int bitpos,
if (bitpos / unit == (bitpos + bitsize - 1) / unit
&& unit <= BITS_PER_WORD
&& unit <= MIN (align, BIGGEST_ALIGNMENT)
+ && (!maxbits || unit <= maxbits)
&& (largest_mode == VOIDmode
|| unit <= GET_MODE_BITSIZE (largest_mode)))
wide_mode = tmode;
Index: calls.c
===================================================================
--- calls.c (revision 173263)
+++ calls.c (working copy)
@@ -909,7 +909,7 @@ store_unaligned_arguments_into_pseudos (
emit_move_insn (reg, const0_rtx);
bytes -= bitsize / BITS_PER_UNIT;
- store_bit_field (reg, bitsize, endian_correction, word_mode,
+ store_bit_field (reg, bitsize, endian_correction, 0, word_mode,
word);
}
}
Index: expmed.c
===================================================================
--- expmed.c (revision 173263)
+++ expmed.c (working copy)
@@ -47,9 +47,13 @@ struct target_expmed *this_target_expmed
static void store_fixed_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, rtx);
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ rtx);
static void store_split_bit_field (rtx, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, rtx);
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ rtx);
static rtx extract_fixed_bit_field (enum machine_mode, rtx,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
@@ -333,7 +337,9 @@ mode_for_extraction (enum extraction_pat
static bool
store_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum, enum machine_mode fieldmode,
+ unsigned HOST_WIDE_INT bitnum,
+ unsigned HOST_WIDE_INT maxbits,
+ enum machine_mode fieldmode,
rtx value, bool fallback_p)
{
unsigned int unit
@@ -547,7 +553,9 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!store_bit_field_1 (op0, MIN (BITS_PER_WORD,
bitsize - i * BITS_PER_WORD),
- bitnum + bit_offset, word_mode,
+ bitnum + bit_offset,
+ maxbits,
+ word_mode,
value_word, fallback_p))
{
delete_insns_since (last);
@@ -718,9 +726,10 @@ store_bit_field_1 (rtx str_rtx, unsigned
mode. Otherwise, use the smallest mode containing the field. */
if (GET_MODE (op0) == BLKmode
+ || (maxbits && GET_MODE_BITSIZE (GET_MODE (op0)) > maxbits)
|| (op_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (op_mode)))
- bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
+ bestmode = get_best_mode (bitsize, bitnum, maxbits, MEM_ALIGN (op0),
(op_mode == MAX_MACHINE_MODE
? VOIDmode : op_mode),
MEM_VOLATILE_P (op0));
@@ -748,7 +757,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
/* Fetch that unit, store the bitfield in it, then store
the unit. */
tempreg = copy_to_reg (xop0);
- if (store_bit_field_1 (tempreg, bitsize, xbitpos,
+ if (store_bit_field_1 (tempreg, bitsize, xbitpos, maxbits,
fieldmode, orig_value, false))
{
emit_move_insn (xop0, tempreg);
@@ -761,21 +770,28 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!fallback_p)
return false;
- store_fixed_bit_field (op0, offset, bitsize, bitpos, value);
+ store_fixed_bit_field (op0, offset, bitsize, bitpos, maxbits, value);
return true;
}
/* Generate code to store value from rtx VALUE
into a bit-field within structure STR_RTX
containing BITSIZE bits starting at bit BITNUM.
+
+ MAXBITS is the maximum number of bits we are allowed to store into,
+ 0 if no limit.
+
FIELDMODE is the machine-mode of the FIELD_DECL node for this field. */
void
store_bit_field (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum, enum machine_mode fieldmode,
+ unsigned HOST_WIDE_INT bitnum,
+ unsigned HOST_WIDE_INT maxbits,
+ enum machine_mode fieldmode,
rtx value)
{
- if (!store_bit_field_1 (str_rtx, bitsize, bitnum, fieldmode, value, true))
+ if (!store_bit_field_1 (str_rtx, bitsize, bitnum, maxbits,
+ fieldmode, value, true))
gcc_unreachable ();
}
\f
@@ -791,7 +807,9 @@ store_bit_field (rtx str_rtx, unsigned H
static void
store_fixed_bit_field (rtx op0, unsigned HOST_WIDE_INT offset,
unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitpos, rtx value)
+ unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT maxbits,
+ rtx value)
{
enum machine_mode mode;
unsigned int total_bits = BITS_PER_WORD;
@@ -812,7 +830,7 @@ store_fixed_bit_field (rtx op0, unsigned
/* Special treatment for a bit field split across two registers. */
if (bitsize + bitpos > BITS_PER_WORD)
{
- store_split_bit_field (op0, bitsize, bitpos, value);
+ store_split_bit_field (op0, bitsize, bitpos, maxbits, value);
return;
}
}
@@ -830,10 +848,12 @@ store_fixed_bit_field (rtx op0, unsigned
if (MEM_VOLATILE_P (op0)
&& GET_MODE_BITSIZE (GET_MODE (op0)) > 0
+ && GET_MODE_BITSIZE (GET_MODE (op0)) <= maxbits
&& flag_strict_volatile_bitfields > 0)
mode = GET_MODE (op0);
else
mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
+ maxbits,
MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
@@ -841,7 +861,7 @@ store_fixed_bit_field (rtx op0, unsigned
/* The only way this should occur is if the field spans word
boundaries. */
store_split_bit_field (op0, bitsize, bitpos + offset * BITS_PER_UNIT,
- value);
+ maxbits, value);
return;
}
@@ -961,7 +981,9 @@ store_fixed_bit_field (rtx op0, unsigned
static void
store_split_bit_field (rtx op0, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitpos, rtx value)
+ unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT maxbits,
+ rtx value)
{
unsigned int unit;
unsigned int bitsdone = 0;
@@ -1076,7 +1098,7 @@ store_split_bit_field (rtx op0, unsigned
it is just an out-of-bounds access. Ignore it. */
if (word != const0_rtx)
store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize,
- thispos, part);
+ thispos, maxbits, part);
bitsdone += thissize;
}
}
@@ -1520,7 +1542,7 @@ extract_bit_field_1 (rtx str_rtx, unsign
if (GET_MODE (op0) == BLKmode
|| (ext_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (ext_mode)))
- bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
+ bestmode = get_best_mode (bitsize, bitnum, 0, MEM_ALIGN (op0),
(ext_mode == MAX_MACHINE_MODE
? VOIDmode : ext_mode),
MEM_VOLATILE_P (op0));
@@ -1646,7 +1668,7 @@ extract_fixed_bit_field (enum machine_mo
mode = tmode;
}
else
- mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
+ mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT, 0,
MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
Index: Makefile.in
===================================================================
--- Makefile.in (revision 173263)
+++ Makefile.in (working copy)
@@ -2916,7 +2916,7 @@ expr.o : expr.c $(CONFIG_H) $(SYSTEM_H)
typeclass.h hard-reg-set.h toplev.h $(DIAGNOSTIC_CORE_H) hard-reg-set.h $(EXCEPT_H) \
reload.h langhooks.h intl.h $(TM_P_H) $(TARGET_H) \
tree-iterator.h gt-expr.h $(MACHMODE_H) $(TIMEVAR_H) $(TREE_FLOW_H) \
- $(TREE_PASS_H) $(DF_H) $(DIAGNOSTIC_H) vecprim.h $(SSAEXPAND_H)
+ $(TREE_PASS_H) $(DF_H) $(DIAGNOSTIC_H) vecprim.h $(SSAEXPAND_H) $(PARAMS_H)
dojump.o : dojump.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \
$(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(OPTABS_H) $(INSN_ATTR_H) insn-config.h \
langhooks.h $(GGC_H) gt-dojump.h vecprim.h $(BASIC_BLOCK_H) output.h
Index: stmt.c
===================================================================
--- stmt.c (revision 173263)
+++ stmt.c (working copy)
@@ -1758,7 +1758,7 @@ expand_return (tree retval)
/* Use bitpos for the source extraction (left justified) and
xbitpos for the destination store (right justified). */
- store_bit_field (dst, bitsize, xbitpos % BITS_PER_WORD, word_mode,
+ store_bit_field (dst, bitsize, xbitpos % BITS_PER_WORD, 0, word_mode,
extract_bit_field (src, bitsize,
bitpos % BITS_PER_WORD, 1, false,
NULL_RTX, word_mode, word_mode));
Index: params.def
===================================================================
--- params.def (revision 173263)
+++ params.def (working copy)
@@ -884,6 +884,13 @@ DEFPARAM (PARAM_MAX_STORES_TO_SINK,
"Maximum number of conditional store pairs that can be sunk",
2, 0, 0)
+/* Data race flags for C++0x memory model compliance. */
+
+DEFPARAM (PARAM_ALLOW_STORE_DATA_RACES,
+ "allow-store-data-races",
+ "Allow new data races on stores to be introduced",
+ 1, 0, 1)
+
/*
Local variables:
* Re: [C++0x] contiguous bitfields race implementation
2011-05-09 20:11 ` Aldy Hernandez
@ 2011-05-09 20:28 ` Jakub Jelinek
2011-05-10 11:42 ` Richard Guenther
2011-05-09 20:49 ` Jason Merrill
1 sibling, 1 reply; 81+ messages in thread
From: Jakub Jelinek @ 2011-05-09 20:28 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jeff Law, Jason Merrill, gcc-patches
On Mon, May 09, 2011 at 01:41:13PM -0500, Aldy Hernandez wrote:
> Jakub also gave me a testcase which triggered a buglet in
> max_field_size. I have now added a parameter INNERDECL which is the
> inner reference, so we can properly determine if the inner decl is
> thread visible or not.
What I actually meant was something different: if max_field_size
and get_inner_reference were called on, say,
COMPONENT_REF <ARRAY_REF <x, 4>, bitfld>
then get_inner_reference returns the whole x and bitpos
is the relative bit position of bitfld within the struct plus
4 * sizeof the containing struct. Then
TREE_INT_CST_LOW (TYPE_SIZE (record_type)) - bitpos
might get negative (well, it is unsigned, so huge).
Maybe with MEM_REF such nested handled components shouldn't appear,
if that's the case, you should assert that somewhere.
If it appears, you should probably use TREE_OPERAND (component_ref, 2)
instead of bitpos.
BTW, shouldn't BIT_FIELD_REF also be handled similarly to the COMPONENT_REF?
And, probably some coordination with Richi is needed with his bitfield tree
lowering.
Jakub
* Re: [C++0x] contiguous bitfields race implementation
2011-05-09 20:11 ` Aldy Hernandez
2011-05-09 20:28 ` Jakub Jelinek
@ 2011-05-09 20:49 ` Jason Merrill
2011-05-13 22:35 ` Aldy Hernandez
1 sibling, 1 reply; 81+ messages in thread
From: Jason Merrill @ 2011-05-09 20:49 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jeff Law, gcc-patches
From a quick look it seems that this patch considers bitfields
following the one we're deliberately touching, but not previous
bitfields in the same memory location; we need to include those as well.
With your struct foo, the bits touched are the same regardless of
whether we name .a or .b.
Jason
* Re: [C++0x] contiguous bitfields race implementation
2011-05-09 20:28 ` Jakub Jelinek
@ 2011-05-10 11:42 ` Richard Guenther
0 siblings, 0 replies; 81+ messages in thread
From: Richard Guenther @ 2011-05-10 11:42 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: Aldy Hernandez, Jeff Law, Jason Merrill, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 1561 bytes --]
On Mon, May 9, 2011 at 8:54 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, May 09, 2011 at 01:41:13PM -0500, Aldy Hernandez wrote:
>> Jakub also gave me a testcase which triggered a buglet in
>> max_field_size. I have now added a parameter INNERDECL which is the
>> inner reference, so we can properly determine if the inner decl is
>> thread visible or not.
>
> What I meant actually was something different, if max_field_size
> and get_inner_reference was called on say
> COMPONENT_REF <ARRAY_REF <x, 4>, bitfld>
> then get_inner_reference returns the whole x and bitpos
> is the relative bit position of bitfld within the struct plus
> 4 * sizeof the containing struct. Then
> TREE_INT_CST_LOW (TYPE_SIZE (record_type)) - bitpos
> might get negative (well, it is unsigned, so huge).
> Maybe with MEM_REF such nested handled components shouldn't appear,
> if that's the case, you should assert that somewhere.
> If it appears, you should probably use TREE_OPERAND (component_ref, 2)
> instead of bitpos.
>
> BTW, shouldn't BIT_FIELD_REF also be handled similarly to the COMPONENT_REF?
> And, probably some coordination with Richi is needed with his bitfield tree
> lowering.
Yes, we would need to handle BIT_FIELD_REFs similarly (fold can introduce
them for example). I attached a work-in-progress patch that does
bitfield lowering at the tree level. There are interesting issues when
trying to work out the underlying object, as bitfield layout can deliberately
obfuscate things a lot.
Richard.
> Jakub
>
[-- Attachment #2: lower-bitfields-to-mem-ref --]
[-- Type: application/octet-stream, Size: 23825 bytes --]
2011-05-06 Richard Guenther <rguenther@suse.de>
PR rtl-optimization/48696
PR tree-optimization/45144
Index: gcc/gimple-low.c
===================================================================
*** gcc/gimple-low.c.orig 2011-05-06 10:46:48.000000000 +0200
--- gcc/gimple-low.c 2011-05-06 15:01:09.000000000 +0200
*************** along with GCC; see the file COPYING3.
*** 32,37 ****
--- 32,38 ----
#include "function.h"
#include "diagnostic-core.h"
#include "tree-pass.h"
+ #include "tree-pretty-print.h"
/* The differences between High GIMPLE and Low GIMPLE are the
following:
*************** record_vars (tree vars)
*** 950,952 ****
--- 951,1194 ----
{
record_vars_into (vars, current_function_decl);
}
+
+
+ /* From the bit-field reference tree REF get the offset and size of the
+ underlying non-bit-field object in *OFF and *SIZE, relative to
+ TREE_OPERAND (ref, 0) and the bits referenced of that object in
+ *BIT_OFFSET and *BIT_SIZE.
+
+ Return false if this is a reference tree we cannot handle. */
+
+ static bool
+ get_underlying_offset_and_size (tree ref, tree *off, unsigned *size,
+ unsigned *bit_offset, unsigned *bit_size,
+ tree *type)
+ {
+ tree field;
+
+ /* ??? Handle BIT_FIELD_REF as well. */
+ if (!REFERENCE_CLASS_P (ref)
+ || TREE_CODE (ref) != COMPONENT_REF
+ || !DECL_BIT_FIELD (TREE_OPERAND (ref, 1)))
+ return false;
+
+ /* ??? It's surely not for optimization, but we eventually want
+ to canonicalize all bitfield accesses anyway. */
+ if (TREE_THIS_VOLATILE (ref))
+ return false;
+
+ field = TREE_OPERAND (ref, 1);
+
+ *off = component_ref_field_offset (ref);
+ *bit_offset = TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field));
+ *bit_size = TREE_INT_CST_LOW (DECL_SIZE (field));
+
+ /* We probably need to walk adjacent preceding FIELD_DECLs of the
+ aggregate, looking for the beginning of the bit-field. */
+ /* For non-packed structs we could also guess based on
+ DECL_BIT_FIELD_TYPEs size and alignment (if it matches). */
+ /* Ok, for the DECL_PACKED just allocate a byte-aligned minimal-size
+ chunk that covers the field, for !DECL_PACKED assume we can
+ use the alignment of DECL_BIT_FIELD_TYPE to guess the start of
+ the underlying object and return an aligned chunk of memory. */
+
+ if (!DECL_PACKED (field))
+ {
+ /* What to do for bool bitfields or enum bitfields?
+ For both expansion does not perform bitfield reduction ... */
+ /* Maybe just always use a mode-based type? */
+ if (TREE_CODE (DECL_BIT_FIELD_TYPE (field)) != INTEGER_TYPE)
+ *type = build_nonstandard_integer_type
+ (GET_MODE_PRECISION
+ (TYPE_MODE (DECL_BIT_FIELD_TYPE (field))), 1);
+ else
+ *type = DECL_BIT_FIELD_TYPE (field);
+
+ *size = TREE_INT_CST_LOW (TYPE_SIZE (*type));
+ *off = fold_build2 (PLUS_EXPR, sizetype, *off,
+ size_int ((*bit_offset & ~(*size - 1))
+ / BITS_PER_UNIT));
+ *bit_offset &= (*size - 1);
+
+ /* ??? If we have to do two loads give up for now. */
+ if (*bit_offset + *bit_size > *size)
+ return false;
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "For ");
+ print_generic_expr (dump_file, ref, 0);
+ fprintf (dump_file, " use ");
+ print_generic_expr (dump_file, *off, 0);
+ fprintf (dump_file, " size %d, bit offset %d size %d\n",
+ *size, *bit_offset, *bit_size);
+ }
+
+ return true;
+ }
+ else
+ {
+ /* FIXME */
+ return false;
+ }
+ }
+
+ /* Lower a bitfield store at *GSI to a read-modify-write cycle. */
+
+ static void
+ lower_mem_lhs (gimple_stmt_iterator *gsi)
+ {
+ gimple stmt = gsi_stmt (*gsi);
+ tree lhs;
+ tree off;
+ unsigned size, bit_size, bit_offset;
+ gimple load;
+ tree type, tem, ref, val;
+
+ lhs = gimple_assign_lhs (stmt);
+ if (!get_underlying_offset_and_size (lhs, &off, &size, &bit_offset, &bit_size,
+ &type))
+ return;
+
+ /* Build a MEM_REF tree that can be used to load/store the word we
+ want to manipulate.
+ ??? Gimplifying here avoids having to explicitly handle address-taken
+ stuff in case the access was variable. */
+ tem = create_tmp_reg (type, "BF");
+ ref = fold_build2 (MEM_REF, type,
+ build_fold_addr_expr
+ (unshare_expr (TREE_OPERAND (lhs, 0))),
+ /* TBAA and bitfields is tricky - various ABIs pack
+ different underlying typed bit-fields together.
+ So use the type of the bit-field container instead. */
+ fold_convert (reference_alias_ptr_type (lhs), off));
+ ref = force_gimple_operand_gsi (gsi, ref, false,
+ NULL_TREE, true, GSI_SAME_STMT);
+
+ /* Load the word. */
+ load = gimple_build_assign (tem, ref);
+ gsi_insert_before (gsi, load, GSI_SAME_STMT);
+
+ /* Or the shifted and zero-extended val to the partially cleared
+ loaded value.
+ ??? The old mem-ref branch had BIT_FIELD_EXPR for this, but it
+ had four operands ...
+ ??? Using all the fold stuff makes us handle constants nicely
+ and transparently ...
+ ??? Do we need to think about BITS/BYTES_BIG_ENDIAN here? */
+ val = gimple_assign_rhs1 (stmt);
+ tem = force_gimple_operand_gsi
+ (gsi,
+ fold_build2 (BIT_IOR_EXPR, type,
+ /* Mask out existing bits. */
+ fold_build2 (BIT_AND_EXPR, type,
+ tem,
+ double_int_to_tree
+ (type, double_int_not
+ (double_int_lshift
+ (double_int_mask (bit_size),
+ bit_offset, size, false)))),
+ /* Shift val into place. */
+ fold_build2 (LSHIFT_EXPR, type,
+ /* Zero-extend val to type. */
+ fold_convert
+ (type,
+ fold_convert
+ (build_nonstandard_integer_type
+ (bit_size, 1), val)),
+ build_int_cst (integer_type_node, bit_offset))),
+ true, tem, true, GSI_SAME_STMT);
+
+ /* Modify the old store. */
+ gimple_assign_set_lhs (stmt, unshare_expr (ref));
+ gimple_assign_set_rhs1 (stmt, tem);
+ }
+
+ /* Lower a bitfield load at *GSI. */
+
+ static void
+ lower_mem_rhs (gimple_stmt_iterator *gsi)
+ {
+ gimple stmt = gsi_stmt (*gsi);
+ tree rhs;
+ tree off;
+ unsigned size, bit_size, bit_offset;
+ gimple load;
+ tree type, tem, ref;
+
+ rhs = gimple_assign_rhs1 (stmt);
+ if (!get_underlying_offset_and_size (rhs, &off, &size, &bit_offset, &bit_size,
+ &type))
+ return;
+
+ /* Build a MEM_REF tree that can be used to load/store the word we
+ want to manipulate.
+ ??? Gimplifying here avoids having to explicitly handle address-taken
+ stuff in case the access was variable. */
+ tem = create_tmp_reg (type, "BF");
+ ref = fold_build2 (MEM_REF, type,
+ build_fold_addr_expr
+ (unshare_expr (TREE_OPERAND (rhs, 0))),
+ /* TBAA and bitfields is tricky - various ABIs pack
+ different underlying typed bit-fields together.
+ So use the type of the bit-field container instead. */
+ fold_convert (reference_alias_ptr_type (rhs), off));
+ ref = force_gimple_operand_gsi (gsi, ref, false,
+ NULL_TREE, true, GSI_SAME_STMT);
+
+ /* Load the word. */
+ load = gimple_build_assign (tem, ref);
+ gsi_insert_before (gsi, load, GSI_SAME_STMT);
+
+ /* Shift the value into place and properly zero-/sign-extend it. */
+ tem = force_gimple_operand_gsi
+ (gsi,
+ fold_convert (TREE_TYPE (rhs),
+ fold_build2 (RSHIFT_EXPR, type, tem,
+ build_int_cst (integer_type_node, bit_offset))),
+ false, NULL_TREE, true, GSI_SAME_STMT);
+
+ /* Modify the old load. */
+ gimple_assign_set_rhs_from_tree (gsi, tem);
+ }
+
+ /* Lower (some) bitfield accesses. */
+
+ static unsigned int
+ lower_mem_exprs (void)
+ {
+ gimple_seq body = gimple_body (current_function_decl);
+ gimple_stmt_iterator gsi;
+
+ for (gsi = gsi_start (body); !gsi_end_p (gsi); gsi_next (&gsi))
+ {
+ gimple stmt = gsi_stmt (gsi);
+ if (gimple_assign_single_p (stmt))
+ {
+ lower_mem_lhs (&gsi);
+ lower_mem_rhs (&gsi);
+ }
+ }
+
+ return 0;
+ }
+
+ struct gimple_opt_pass pass_lower_mem =
+ {
+ {
+ GIMPLE_PASS,
+ "memlower", /* name */
+ NULL, /* gate */
+ lower_mem_exprs, /* execute */
+ NULL, /* sub */
+ NULL, /* next */
+ 0, /* static_pass_number */
+ TV_NONE, /* tv_id */
+ PROP_gimple_lcf, /* properties_required */
+ 0/*PROP_gimple_lmem*/, /* properties_provided */
+ 0, /* properties_destroyed */
+ 0, /* todo_flags_start */
+ TODO_dump_func /* todo_flags_finish */
+ }
+ };
Index: gcc/passes.c
===================================================================
*** gcc/passes.c.orig 2011-05-06 10:46:48.000000000 +0200
--- gcc/passes.c 2011-05-06 12:16:07.000000000 +0200
*************** init_optimization_passes (void)
*** 727,732 ****
--- 727,733 ----
NEXT_PASS (pass_lower_cf);
NEXT_PASS (pass_refactor_eh);
NEXT_PASS (pass_lower_eh);
+ NEXT_PASS (pass_lower_mem);
NEXT_PASS (pass_build_cfg);
NEXT_PASS (pass_warn_function_return);
NEXT_PASS (pass_build_cgraph_edges);
Index: gcc/tree-pass.h
===================================================================
*** gcc/tree-pass.h.orig 2011-05-06 10:46:48.000000000 +0200
--- gcc/tree-pass.h 2011-05-06 12:16:07.000000000 +0200
*************** extern void tree_lowering_passes (tree d
*** 352,357 ****
--- 352,358 ----
extern struct gimple_opt_pass pass_mudflap_1;
extern struct gimple_opt_pass pass_mudflap_2;
extern struct gimple_opt_pass pass_lower_cf;
+ extern struct gimple_opt_pass pass_lower_mem;
extern struct gimple_opt_pass pass_refactor_eh;
extern struct gimple_opt_pass pass_lower_eh;
extern struct gimple_opt_pass pass_lower_eh_dispatch;
Index: gcc/fold-const.c
===================================================================
*** gcc/fold-const.c.orig 2011-05-06 10:46:48.000000000 +0200
--- gcc/fold-const.c 2011-05-06 12:16:07.000000000 +0200
*************** fold_binary_loc (location_t loc,
*** 12432,12437 ****
--- 12432,12438 ----
/* If this is a comparison of a field, we may be able to simplify it. */
if ((TREE_CODE (arg0) == COMPONENT_REF
|| TREE_CODE (arg0) == BIT_FIELD_REF)
+ && 0
/* Handle the constant case even without -O
to make sure the warnings are given. */
&& (optimize || TREE_CODE (arg1) == INTEGER_CST))
Index: gcc/Makefile.in
===================================================================
*** gcc/Makefile.in.orig 2011-05-06 10:46:48.000000000 +0200
--- gcc/Makefile.in 2011-05-06 12:16:07.000000000 +0200
*************** gimple-low.o : gimple-low.c $(CONFIG_H)
*** 2669,2675 ****
$(DIAGNOSTIC_H) $(GIMPLE_H) $(TREE_INLINE_H) langhooks.h \
$(LANGHOOKS_DEF_H) $(TREE_FLOW_H) $(TIMEVAR_H) $(TM_H) coretypes.h \
$(EXCEPT_H) $(FLAGS_H) $(RTL_H) $(FUNCTION_H) $(EXPR_H) $(TREE_PASS_H) \
! $(HASHTAB_H) $(DIAGNOSTIC_CORE_H) tree-iterator.h
omp-low.o : omp-low.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
$(RTL_H) $(GIMPLE_H) $(TREE_INLINE_H) langhooks.h $(DIAGNOSTIC_CORE_H) \
$(TREE_FLOW_H) $(TIMEVAR_H) $(FLAGS_H) $(EXPR_H) $(DIAGNOSTIC_CORE_H) \
--- 2669,2675 ----
$(DIAGNOSTIC_H) $(GIMPLE_H) $(TREE_INLINE_H) langhooks.h \
$(LANGHOOKS_DEF_H) $(TREE_FLOW_H) $(TIMEVAR_H) $(TM_H) coretypes.h \
$(EXCEPT_H) $(FLAGS_H) $(RTL_H) $(FUNCTION_H) $(EXPR_H) $(TREE_PASS_H) \
! $(HASHTAB_H) $(DIAGNOSTIC_CORE_H) tree-iterator.h tree-pretty-print.h
omp-low.o : omp-low.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
$(RTL_H) $(GIMPLE_H) $(TREE_INLINE_H) langhooks.h $(DIAGNOSTIC_CORE_H) \
$(TREE_FLOW_H) $(TIMEVAR_H) $(FLAGS_H) $(EXPR_H) $(DIAGNOSTIC_CORE_H) \
Index: gcc/expr.c
===================================================================
*** gcc/expr.c.orig 2011-05-06 10:46:48.000000000 +0200
--- gcc/expr.c 2011-05-06 12:16:07.000000000 +0200
*************** expand_expr_real_2 (sepops ops, rtx targ
*** 7264,7270 ****
/* An operation in what may be a bit-field type needs the
result to be reduced to the precision of the bit-field type,
which is narrower than that of the type's mode. */
! reduce_bit_field = (TREE_CODE (type) == INTEGER_TYPE
&& GET_MODE_PRECISION (mode) > TYPE_PRECISION (type));
if (reduce_bit_field && modifier == EXPAND_STACK_PARM)
--- 7264,7270 ----
/* An operation in what may be a bit-field type needs the
result to be reduced to the precision of the bit-field type,
which is narrower than that of the type's mode. */
! reduce_bit_field = (INTEGRAL_TYPE_P (type)
&& GET_MODE_PRECISION (mode) > TYPE_PRECISION (type));
if (reduce_bit_field && modifier == EXPAND_STACK_PARM)
*************** expand_expr_real_1 (tree exp, rtx target
*** 8330,8336 ****
result to be reduced to the precision of the bit-field type,
which is narrower than that of the type's mode. */
reduce_bit_field = (!ignore
! && TREE_CODE (type) == INTEGER_TYPE
&& GET_MODE_PRECISION (mode) > TYPE_PRECISION (type));
/* If we are going to ignore this result, we need only do something
--- 8330,8336 ----
result to be reduced to the precision of the bit-field type,
which is narrower than that of the type's mode. */
reduce_bit_field = (!ignore
! && INTEGRAL_TYPE_P (type)
&& GET_MODE_PRECISION (mode) > TYPE_PRECISION (type));
/* If we are going to ignore this result, we need only do something
Index: gcc/gimple.c
===================================================================
*** gcc/gimple.c.orig 2011-04-21 16:33:55.000000000 +0200
--- gcc/gimple.c 2011-05-06 14:14:01.000000000 +0200
*************** canonicalize_cond_expr_cond (tree t)
*** 3137,3152 ****
&& truth_value_p (TREE_CODE (TREE_OPERAND (t, 0))))
t = TREE_OPERAND (t, 0);
- /* For (bool)x use x != 0. */
- if (CONVERT_EXPR_P (t)
- && TREE_CODE (TREE_TYPE (t)) == BOOLEAN_TYPE)
- {
- tree top0 = TREE_OPERAND (t, 0);
- t = build2 (NE_EXPR, TREE_TYPE (t),
- top0, build_int_cst (TREE_TYPE (top0), 0));
- }
/* For !x use x == 0. */
! else if (TREE_CODE (t) == TRUTH_NOT_EXPR)
{
tree top0 = TREE_OPERAND (t, 0);
t = build2 (EQ_EXPR, TREE_TYPE (t),
--- 3137,3144 ----
&& truth_value_p (TREE_CODE (TREE_OPERAND (t, 0))))
t = TREE_OPERAND (t, 0);
/* For !x use x == 0. */
! if (TREE_CODE (t) == TRUTH_NOT_EXPR)
{
tree top0 = TREE_OPERAND (t, 0);
t = build2 (EQ_EXPR, TREE_TYPE (t),
Index: gcc/testsuite/gcc.dg/tree-ssa/pr45144.c
===================================================================
*** gcc/testsuite/gcc.dg/tree-ssa/pr45144.c.orig 2011-04-20 10:52:16.000000000 +0200
--- gcc/testsuite/gcc.dg/tree-ssa/pr45144.c 2011-05-06 13:51:29.000000000 +0200
*************** union TMP
*** 22,47 ****
static unsigned
foo (struct A *p)
{
! union TMP t;
! struct A x;
! x = *p;
! t.a = x;
! return t.b;
}
void
bar (unsigned orig, unsigned *new)
{
! struct A a;
! union TMP s;
! s.b = orig;
! a = s.a;
! if (a.a1)
! baz (a.a2);
! *new = foo (&a);
}
! /* { dg-final { scan-tree-dump " = VIEW_CONVERT_EXPR<unsigned int>\\(a\\);" "optimized"} } */
/* { dg-final { cleanup-tree-dump "optimized" } } */
--- 22,48 ----
static unsigned
foo (struct A *p)
{
! union TMP tmpvar;
! struct A avar;
! avar = *p;
! tmpvar.a = avar;
! return tmpvar.b;
}
void
bar (unsigned orig, unsigned *new)
{
! struct A avar;
! union TMP tmpvar;
! tmpvar.b = orig;
! avar = tmpvar.a;
! if (avar.a1)
! baz (avar.a2);
! *new = foo (&avar);
}
! /* { dg-final { scan-tree-dump-not "avar" "optimized"} } */
! /* { dg-final { scan-tree-dump-not "tmpvar" "optimized"} } */
/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/testsuite/gcc.target/i386/bitfield4.c
===================================================================
*** /dev/null 1970-01-01 00:00:00.000000000 +0000
--- gcc/testsuite/gcc.target/i386/bitfield4.c 2011-05-06 15:07:41.000000000 +0200
***************
*** 0 ****
--- 1,19 ----
+ /* PR48696 */
+ /* { dg-do compile } */
+ /* { dg-options "-O" } */
+
+ struct bad_gcc_code_generation {
+ unsigned type:6,
+ pos:16,
+ stream:10;
+ };
+
+ int
+ show_bug(struct bad_gcc_code_generation *a)
+ {
+ /* Avoid store-forwarding failure due to access size mismatch. */
+ a->type = 0;
+ return a->pos;
+ }
+
+ /* { dg-final { scan-assembler-not "andb" } } */
Index: gcc/tree-ssa-forwprop.c
===================================================================
*** gcc/tree-ssa-forwprop.c.orig 2011-05-04 11:07:08.000000000 +0200
--- gcc/tree-ssa-forwprop.c 2011-05-06 17:00:44.000000000 +0200
*************** out:
*** 1938,1943 ****
--- 1938,2103 ----
return false;
}
+ /* Combine two conversions in a row for the second conversion at *GSI.
+ Returns true if there were any changes made. */
+
+ static bool
+ combine_conversions (gimple_stmt_iterator *gsi)
+ {
+ gimple stmt = gsi_stmt (*gsi);
+ gimple def_stmt;
+ tree op0, lhs;
+ enum tree_code code = gimple_assign_rhs_code (stmt);
+
+ gcc_checking_assert (CONVERT_EXPR_CODE_P (code)
+ || code == FLOAT_EXPR
+ || code == FIX_TRUNC_EXPR);
+
+ lhs = gimple_assign_lhs (stmt);
+ op0 = gimple_assign_rhs1 (stmt);
+ if (useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (op0)))
+ {
+ gimple_assign_set_rhs_code (stmt, TREE_CODE (op0));
+ return true;
+ }
+
+ if (TREE_CODE (op0) != SSA_NAME)
+ return false;
+
+ def_stmt = SSA_NAME_DEF_STMT (op0);
+ if (!is_gimple_assign (def_stmt))
+ return false;
+
+ if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt)))
+ {
+ tree defop0 = gimple_assign_rhs1 (def_stmt);
+ tree type = TREE_TYPE (lhs);
+ tree inside_type = TREE_TYPE (defop0);
+ tree inter_type = TREE_TYPE (op0);
+ int inside_int = INTEGRAL_TYPE_P (inside_type);
+ int inside_ptr = POINTER_TYPE_P (inside_type);
+ int inside_float = FLOAT_TYPE_P (inside_type);
+ int inside_vec = TREE_CODE (inside_type) == VECTOR_TYPE;
+ unsigned int inside_prec = TYPE_PRECISION (inside_type);
+ int inside_unsignedp = TYPE_UNSIGNED (inside_type);
+ int inter_int = INTEGRAL_TYPE_P (inter_type);
+ int inter_ptr = POINTER_TYPE_P (inter_type);
+ int inter_float = FLOAT_TYPE_P (inter_type);
+ int inter_vec = TREE_CODE (inter_type) == VECTOR_TYPE;
+ unsigned int inter_prec = TYPE_PRECISION (inter_type);
+ int inter_unsignedp = TYPE_UNSIGNED (inter_type);
+ int final_int = INTEGRAL_TYPE_P (type);
+ int final_ptr = POINTER_TYPE_P (type);
+ int final_float = FLOAT_TYPE_P (type);
+ int final_vec = TREE_CODE (type) == VECTOR_TYPE;
+ unsigned int final_prec = TYPE_PRECISION (type);
+ int final_unsignedp = TYPE_UNSIGNED (type);
+
+ /* In addition to the cases of two conversions in a row
+ handled below, if we are converting something to its own
+ type via an object of identical or wider precision, neither
+ conversion is needed. */
+ if (useless_type_conversion_p (type, inside_type)
+ && (((inter_int || inter_ptr) && final_int)
+ || (inter_float && final_float))
+ && inter_prec >= final_prec)
+ {
+ gimple_assign_set_rhs1 (stmt, unshare_expr (defop0));
+ gimple_assign_set_rhs_code (stmt, TREE_CODE (defop0));
+ update_stmt (stmt);
+ return true;
+ }
+
+ /* Likewise, if the intermediate and initial types are either both
+ float or both integer, we don't need the middle conversion if the
+ former is wider than the latter and doesn't change the signedness
+ (for integers). Avoid this if the final type is a pointer since
+ then we sometimes need the middle conversion. Likewise if the
+ final type has a precision not equal to the size of its mode. */
+ if (((inter_int && inside_int)
+ || (inter_float && inside_float)
+ || (inter_vec && inside_vec))
+ && inter_prec >= inside_prec
+ && (inter_float || inter_vec
+ || inter_unsignedp == inside_unsignedp)
+ && ! (final_prec != GET_MODE_BITSIZE (TYPE_MODE (type))
+ && TYPE_MODE (type) == TYPE_MODE (inter_type))
+ && ! final_ptr
+ && (! final_vec || inter_prec == inside_prec))
+ {
+ gimple_assign_set_rhs1 (stmt, defop0);
+ update_stmt (stmt);
+ return true;
+ }
+
+ /* If we have a sign-extension of a zero-extended value, we can
+ replace that by a single zero-extension. */
+ if (inside_int && inter_int && final_int
+ && inside_prec < inter_prec && inter_prec < final_prec
+ && inside_unsignedp && !inter_unsignedp)
+ {
+ gimple_assign_set_rhs1 (stmt, defop0);
+ update_stmt (stmt);
+ return true;
+ }
+
+ /* Two conversions in a row are not needed unless:
+ - some conversion is floating-point (overstrict for now), or
+ - some conversion is a vector (overstrict for now), or
+ - the intermediate type is narrower than both initial and
+ final, or
+ - the intermediate type and innermost type differ in signedness,
+ and the outermost type is wider than the intermediate, or
+ - the initial type is a pointer type and the precisions of the
+ intermediate and final types differ, or
+ - the final type is a pointer type and the precisions of the
+ initial and intermediate types differ. */
+ if (! inside_float && ! inter_float && ! final_float
+ && ! inside_vec && ! inter_vec && ! final_vec
+ && (inter_prec >= inside_prec || inter_prec >= final_prec)
+ && ! (inside_int && inter_int
+ && inter_unsignedp != inside_unsignedp
+ && inter_prec < final_prec)
+ && ((inter_unsignedp && inter_prec > inside_prec)
+ == (final_unsignedp && final_prec > inter_prec))
+ && ! (inside_ptr && inter_prec != final_prec)
+ && ! (final_ptr && inside_prec != inter_prec)
+ && ! (final_prec != GET_MODE_BITSIZE (TYPE_MODE (type))
+ && TYPE_MODE (type) == TYPE_MODE (inter_type)))
+ {
+ gimple_assign_set_rhs1 (stmt, defop0);
+ update_stmt (stmt);
+ return true;
+ }
+
+ /* A truncation to an unsigned type should be canonicalized as
+ bitwise and of a mask. */
+ if (final_int && inter_int && inside_int
+ && final_prec == inside_prec
+ && final_prec > inter_prec
+ && inter_unsignedp)
+ {
+ tree tem;
+ tem = fold_build2 (BIT_AND_EXPR, inside_type,
+ defop0,
+ double_int_to_tree
+ (inside_type, double_int_mask (inter_prec)));
+ if (!useless_type_conversion_p (type, inside_type))
+ {
+ tem = force_gimple_operand_gsi (gsi, tem, true, NULL_TREE, true,
+ GSI_SAME_STMT);
+ gimple_assign_set_rhs1 (stmt, tem);
+ }
+ else
+ gimple_assign_set_rhs_from_tree (gsi, tem);
+ update_stmt (gsi_stmt (*gsi));
+ return true;
+ }
+ }
+
+ return false;
+ }
+
/* Main entry point for the forward propagation optimizer. */
static unsigned int
*************** tree_ssa_forward_propagate_single_use_va
*** 2061,2066 ****
--- 2221,2233 ----
cfg_changed |= associate_plusminus (stmt);
gsi_next (&gsi);
}
+ else if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt))
+ || gimple_assign_rhs_code (stmt) == FLOAT_EXPR
+ || gimple_assign_rhs_code (stmt) == FIX_TRUNC_EXPR)
+ {
+ if (!combine_conversions (&gsi))
+ gsi_next (&gsi);
+ }
else
gsi_next (&gsi);
}
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [C++0x] contiguous bitfields race implementation
2011-05-09 20:49 ` Jason Merrill
@ 2011-05-13 22:35 ` Aldy Hernandez
2011-05-16 21:20 ` Aldy Hernandez
2011-05-19 7:17 ` Jason Merrill
0 siblings, 2 replies; 81+ messages in thread
From: Aldy Hernandez @ 2011-05-13 22:35 UTC (permalink / raw)
To: Jason Merrill; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 1148 bytes --]
On 05/09/11 14:23, Jason Merrill wrote:
> From a quick look it seems that this patch considers bitfields
> following the one we're deliberately touching, but not previous
> bitfields in the same memory location; we need to include those as well.
> With your struct foo, the bits touched are the same regardless of
> whether we name .a or .b.
Thanks all for looking into this.
Attached is a new patch that takes into account the previous bitfields
as well.
If I understand Jakub correctly, this patch also fixes the relative bit
position problem he pointed out, since it no longer uses a relative bit
position but rather the difference between the start and the end of the
memory region.
I have hand-tested various bitfield combinations (at the beginning, at
the end, in the middle, with and without zero-length bitfields, etc.).
There is also the generic x86 test from the previous incantation.
Bootstrapped without any issues. Running the entire testsuite with
--param=allow-store-data-races=0 is still in progress.
How does this one look?
p.s. I would like to address BIT_FIELD_REF in a followup patch to keep
things simple.
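For what it's worth, the grouping rule max_field_size implements can be
modeled in plain C like this (a hypothetical sketch with the field layout
passed in explicitly; field_desc and max_store_bits are made up for
illustration, not the GCC implementation):

```c
#include <assert.h>

/* One entry per FIELD_DECL, in declaration order.  */
struct field_desc
{
  int is_bitfield;   /* DECL_BIT_FIELD */
  unsigned bitpos;   /* bit position within the record */
  unsigned bitsize;  /* DECL_SIZE in bits */
};

/* Return the maximum number of bits a store into field TARGET may
   touch, counted from the start of the bitfield run containing it:
   consecutive non-zero-width bitfields form one memory location, and
   the store may extend up to the next non-bitfield member (padding
   included), or to the end of the record.  */
static unsigned
max_store_bits (const struct field_desc *flds, unsigned n,
                unsigned target, unsigned record_bits)
{
  unsigned start = 0, i;
  int in_run = 0, found = 0;

  for (i = 0; i < n; i++)
    {
      if (flds[i].is_bitfield && flds[i].bitsize > 0)
        {
          if (!in_run)
            {
              start = flds[i].bitpos;
              in_run = 1;
            }
        }
      else
        {
          in_run = 0;
          if (found)
            /* End of the run: include padding up to this field.  */
            return flds[i].bitpos - start;
        }
      if (i == target)
        found = 1;
    }

  /* Last run in the record: include trailing padding.  */
  return record_bits - start;
}
```

For the struct in the patch comment (a:4, b:1, unsigned char c, d:6), a
store into either a or b yields 8 bits, matching the "8 maximum bits" the
comment describes.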
[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 27777 bytes --]
* params.h (ALLOW_STORE_DATA_RACES): New.
* params.def (PARAM_ALLOW_STORE_DATA_RACES): New.
* Makefile.in (expr.o): Depend on PARAMS_H.
* machmode.h (get_best_mode): Add argument.
* fold-const.c (optimize_bit_field_compare): Add argument to
get_best_mode.
(fold_truthop): Same.
* ifcvt.c (noce_emit_move_insn): Add argument to store_bit_field.
* expr.c (emit_group_store): Same.
(copy_blkmode_from_reg): Same.
(write_complex_part): Same.
(optimize_bitfield_assignment_op): Add argument.
Add argument to get_best_mode.
(max_field_size): New.
(expand_assignment): Calculate maxbits and pass it down
accordingly.
(store_field): New argument.
(expand_expr_real_2): New argument to store_field.
Include params.h.
* expr.h (store_bit_field): New argument.
* stor-layout.c (get_best_mode): Restrict mode expansion by taking
into account maxbits.
* calls.c (store_unaligned_arguments_into_pseudos): New argument
to store_bit_field.
* expmed.c (store_bit_field_1): New argument. Use it.
(store_bit_field): Same.
(store_fixed_bit_field): Same.
(store_split_bit_field): Same.
(extract_bit_field_1): Pass new argument to get_best_mode.
(extract_bit_field): Same.
* stmt.c (store_bit_field): Pass new argument to store_bit_field.
* tree.h (DECL_THREAD_VISIBLE_P): New.
* doc/invoke.texi: Document parameter allow-store-data-races.
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi (revision 173263)
+++ doc/invoke.texi (working copy)
@@ -8886,6 +8886,11 @@ The maximum number of conditional stores
if either vectorization (@option{-ftree-vectorize}) or if-conversion
(@option{-ftree-loop-if-convert}) is disabled. The default is 2.
+@item allow-store-data-races
+Allow optimizers to introduce new data races on stores.
+Set to 1 to allow, otherwise to 0. This option is enabled by default
+unless implicitly set by the @option{-fmemory-model=} option.
+
@end table
@end table
Index: machmode.h
===================================================================
--- machmode.h (revision 173263)
+++ machmode.h (working copy)
@@ -248,7 +248,9 @@ extern enum machine_mode mode_for_vector
/* Find the best mode to use to access a bit field. */
-extern enum machine_mode get_best_mode (int, int, unsigned int,
+extern enum machine_mode get_best_mode (int, int,
+ unsigned HOST_WIDE_INT,
+ unsigned int,
enum machine_mode, int);
/* Determine alignment, 1<=result<=BIGGEST_ALIGNMENT. */
Index: tree.h
===================================================================
--- tree.h (revision 173263)
+++ tree.h (working copy)
@@ -3156,6 +3156,10 @@ struct GTY(()) tree_parm_decl {
#define DECL_THREAD_LOCAL_P(NODE) \
(VAR_DECL_CHECK (NODE)->decl_with_vis.tls_model >= TLS_MODEL_REAL)
+/* Return true if a VAR_DECL is visible from another thread. */
+#define DECL_THREAD_VISIBLE_P(NODE) \
+ (TREE_STATIC (NODE) && !DECL_THREAD_LOCAL_P (NODE))
+
/* In a non-local VAR_DECL with static storage duration, true if the
variable has an initialization priority. If false, the variable
will be initialized at the DEFAULT_INIT_PRIORITY. */
Index: fold-const.c
===================================================================
--- fold-const.c (revision 173263)
+++ fold-const.c (working copy)
@@ -3409,7 +3409,7 @@ optimize_bit_field_compare (location_t l
&& flag_strict_volatile_bitfields > 0)
nmode = lmode;
else
- nmode = get_best_mode (lbitsize, lbitpos,
+ nmode = get_best_mode (lbitsize, lbitpos, 0,
const_p ? TYPE_ALIGN (TREE_TYPE (linner))
: MIN (TYPE_ALIGN (TREE_TYPE (linner)),
TYPE_ALIGN (TREE_TYPE (rinner))),
@@ -5237,7 +5237,7 @@ fold_truthop (location_t loc, enum tree_
to be relative to a field of that size. */
first_bit = MIN (ll_bitpos, rl_bitpos);
end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize);
- lnmode = get_best_mode (end_bit - first_bit, first_bit,
+ lnmode = get_best_mode (end_bit - first_bit, first_bit, 0,
TYPE_ALIGN (TREE_TYPE (ll_inner)), word_mode,
volatilep);
if (lnmode == VOIDmode)
@@ -5302,7 +5302,7 @@ fold_truthop (location_t loc, enum tree_
first_bit = MIN (lr_bitpos, rr_bitpos);
end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize);
- rnmode = get_best_mode (end_bit - first_bit, first_bit,
+ rnmode = get_best_mode (end_bit - first_bit, first_bit, 0,
TYPE_ALIGN (TREE_TYPE (lr_inner)), word_mode,
volatilep);
if (rnmode == VOIDmode)
Index: params.h
===================================================================
--- params.h (revision 173263)
+++ params.h (working copy)
@@ -206,4 +206,6 @@ extern void init_param_values (int *para
PARAM_VALUE (PARAM_MIN_NONDEBUG_INSN_UID)
#define MAX_STORES_TO_SINK \
PARAM_VALUE (PARAM_MAX_STORES_TO_SINK)
+#define ALLOW_STORE_DATA_RACES \
+ PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES)
#endif /* ! GCC_PARAMS_H */
Index: testsuite/gcc.dg/20110509.c
===================================================================
--- testsuite/gcc.dg/20110509.c (revision 0)
+++ testsuite/gcc.dg/20110509.c (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+/* Test that we don't store past VAR.A. */
+
+struct S
+{
+ volatile unsigned int a : 4;
+ unsigned char b;
+ unsigned int c : 6;
+} var;
+
+void set_a()
+{
+ var.a = 12;
+}
+
+/* { dg-final { scan-assembler-not "movl.*, var" } } */
Index: ifcvt.c
===================================================================
--- ifcvt.c (revision 173263)
+++ ifcvt.c (working copy)
@@ -885,7 +885,7 @@ noce_emit_move_insn (rtx x, rtx y)
}
gcc_assert (start < (MEM_P (op) ? BITS_PER_UNIT : BITS_PER_WORD));
- store_bit_field (op, size, start, GET_MODE (x), y);
+ store_bit_field (op, size, start, 0, GET_MODE (x), y);
return;
}
@@ -939,7 +939,7 @@ noce_emit_move_insn (rtx x, rtx y)
inner = XEXP (outer, 0);
outmode = GET_MODE (outer);
bitpos = SUBREG_BYTE (outer) * BITS_PER_UNIT;
- store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos, outmode, y);
+ store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos, 0, outmode, y);
}
/* Return sequence of instructions generated by if conversion. This
Index: expr.c
===================================================================
--- expr.c (revision 173263)
+++ expr.c (working copy)
@@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.
#include "diagnostic.h"
#include "ssaexpand.h"
#include "target-globals.h"
+#include "params.h"
/* Decide whether a function's arguments should be processed
from first to last or from last to first.
@@ -142,7 +143,8 @@ static void store_constructor_field (rtx
HOST_WIDE_INT, enum machine_mode,
tree, tree, int, alias_set_type);
static void store_constructor (tree, rtx, int, HOST_WIDE_INT);
-static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT, enum machine_mode,
+static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT, enum machine_mode,
tree, tree, alias_set_type, bool);
static unsigned HOST_WIDE_INT highest_pow2_factor_for_target (const_tree, const_tree);
@@ -2063,7 +2065,7 @@ emit_group_store (rtx orig_dst, rtx src,
emit_move_insn (adjust_address (dest, mode, bytepos), tmps[i]);
else
store_bit_field (dest, bytelen * BITS_PER_UNIT, bytepos * BITS_PER_UNIT,
- mode, tmps[i]);
+ 0, mode, tmps[i]);
}
/* Copy from the pseudo into the (probable) hard reg. */
@@ -2157,7 +2159,7 @@ copy_blkmode_from_reg (rtx tgtblk, rtx s
/* Use xbitpos for the source extraction (right justified) and
bitpos for the destination store (left justified). */
- store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, copy_mode,
+ store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, 0, copy_mode,
extract_bit_field (src, bitsize,
xbitpos % BITS_PER_WORD, 1, false,
NULL_RTX, copy_mode, copy_mode));
@@ -2794,7 +2796,7 @@ write_complex_part (rtx cplx, rtx val, b
gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD);
}
- store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, imode, val);
+ store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, 0, imode, val);
}
/* Extract one of the components of the complex value CPLX. Extract the
@@ -3929,6 +3931,7 @@ get_subtarget (rtx x)
static bool
optimize_bitfield_assignment_op (unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT maxbits,
enum machine_mode mode1, rtx str_rtx,
tree to, tree src)
{
@@ -3989,7 +3992,7 @@ optimize_bitfield_assignment_op (unsigne
if (str_bitsize == 0 || str_bitsize > BITS_PER_WORD)
str_mode = word_mode;
- str_mode = get_best_mode (bitsize, bitpos,
+ str_mode = get_best_mode (bitsize, bitpos, maxbits,
MEM_ALIGN (str_rtx), str_mode, 0);
if (str_mode == VOIDmode)
return false;
@@ -4098,6 +4101,103 @@ optimize_bitfield_assignment_op (unsigne
return false;
}
+/* In the C++ memory model, consecutive bit fields in a structure are
+ considered one memory location.
+
+ Given a COMPONENT_REF, this function returns the maximum number of
+ bits we are allowed to store into, when storing into the
+ COMPONENT_REF. We return 0 if there is no restriction.
+
+ EXP is the COMPONENT_REF.
+ INNERDECL is the actual object being referenced.
+ BITPOS is the position in bits where the bit starts within the structure.
+ BITSIZE is size in bits of the field being referenced in EXP.
+
+ For example, while storing into FOO.A here...
+
+ struct {
+ BIT 0:
+ unsigned int a : 4;
+ unsigned int b : 1;
+ BIT 8:
+ unsigned char c;
+ unsigned int d : 6;
+ } foo;
+
+ ...we are not allowed to store past <b>, so for the layout above,
+ we would return a maximum of 8 bits (because who cares if we store
+ into the padding). */
+
+static unsigned HOST_WIDE_INT
+max_field_size (tree exp, tree innerdecl,
+ HOST_WIDE_INT bitpos, HOST_WIDE_INT bitsize)
+{
+ tree field, record_type, fld;
+ bool found_field = false;
+ bool prev_field_is_bitfield;
+ /* Starting bitpos for the current memory location. */
+ int start_bitpos;
+
+ gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
+
+ /* If other threads can't see this value, no need to restrict stores. */
+ if (ALLOW_STORE_DATA_RACES
+ || !DECL_THREAD_VISIBLE_P (innerdecl))
+ return 0;
+
+ /* Bit field we're storing into. */
+ field = TREE_OPERAND (exp, 1);
+ record_type = DECL_FIELD_CONTEXT (field);
+
+ /* Count the contiguous bitfields for the memory location that
+ contains FIELD. */
+ start_bitpos = 0;
+ prev_field_is_bitfield = true;
+ for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld))
+ {
+ tree t, offset;
+ enum machine_mode mode;
+ int unsignedp, volatilep;
+
+ if (TREE_CODE (fld) != FIELD_DECL)
+ continue;
+
+ t = build3 (COMPONENT_REF, TREE_TYPE (exp),
+ unshare_expr (TREE_OPERAND (exp, 0)),
+ fld, NULL_TREE);
+ get_inner_reference (t, &bitsize, &bitpos, &offset,
+ &mode, &unsignedp, &volatilep, true);
+
+ if (field == fld)
+ found_field = true;
+
+ if (DECL_BIT_FIELD (fld) && bitsize > 0)
+ {
+ if (prev_field_is_bitfield == false)
+ {
+ start_bitpos = bitpos;
+ prev_field_is_bitfield = true;
+ }
+ }
+ else
+ {
+ prev_field_is_bitfield = false;
+ if (found_field)
+ break;
+ }
+ }
+ gcc_assert (found_field);
+
+ if (fld)
+ {
+ /* We found the end of the bit field sequence. Include the
+ padding up to the next field and be done. */
+ return bitpos - start_bitpos;
+ }
+ /* If this is the last element in the structure, include the padding
+ at the end of structure. */
+ return TREE_INT_CST_LOW (TYPE_SIZE (record_type)) - start_bitpos;
+}
/* Expand an assignment that stores the value of FROM into TO. If NONTEMPORAL
is true, try generating a nontemporal store. */
@@ -4197,6 +4297,9 @@ expand_assignment (tree to, tree from, b
{
enum machine_mode mode1;
HOST_WIDE_INT bitsize, bitpos;
+ /* Max consecutive bits we are allowed to touch while storing
+ into TO. */
+ HOST_WIDE_INT maxbits = 0;
tree offset;
int unsignedp;
int volatilep = 0;
@@ -4206,6 +4309,10 @@ expand_assignment (tree to, tree from, b
tem = get_inner_reference (to, &bitsize, &bitpos, &offset, &mode1,
&unsignedp, &volatilep, true);
+ if (TREE_CODE (to) == COMPONENT_REF
+ && DECL_BIT_FIELD (TREE_OPERAND (to, 1)))
+ maxbits = max_field_size (to, tem, bitpos, bitsize);
+
/* If we are going to use store_bit_field and extract_bit_field,
make sure to_rtx will be safe for multiple use. */
@@ -4286,12 +4393,13 @@ expand_assignment (tree to, tree from, b
result = store_expr (from, XEXP (to_rtx, bitpos != 0), false,
nontemporal);
else if (bitpos + bitsize <= mode_bitsize / 2)
- result = store_field (XEXP (to_rtx, 0), bitsize, bitpos,
+ result = store_field (XEXP (to_rtx, 0), bitsize, bitpos, maxbits,
mode1, from, TREE_TYPE (tem),
get_alias_set (to), nontemporal);
else if (bitpos >= mode_bitsize / 2)
result = store_field (XEXP (to_rtx, 1), bitsize,
- bitpos - mode_bitsize / 2, mode1, from,
+ bitpos - mode_bitsize / 2, maxbits,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
else if (bitpos == 0 && bitsize == mode_bitsize)
@@ -4312,7 +4420,8 @@ expand_assignment (tree to, tree from, b
0);
write_complex_part (temp, XEXP (to_rtx, 0), false);
write_complex_part (temp, XEXP (to_rtx, 1), true);
- result = store_field (temp, bitsize, bitpos, mode1, from,
+ result = store_field (temp, bitsize, bitpos, maxbits,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
emit_move_insn (XEXP (to_rtx, 0), read_complex_part (temp, false));
@@ -4337,11 +4446,12 @@ expand_assignment (tree to, tree from, b
MEM_KEEP_ALIAS_SET_P (to_rtx) = 1;
}
- if (optimize_bitfield_assignment_op (bitsize, bitpos, mode1,
+ if (optimize_bitfield_assignment_op (bitsize, bitpos, maxbits, mode1,
to_rtx, to, from))
result = NULL;
else
- result = store_field (to_rtx, bitsize, bitpos, mode1, from,
+ result = store_field (to_rtx, bitsize, bitpos, maxbits,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
}
@@ -4734,7 +4844,7 @@ store_expr (tree exp, rtx target, int ca
: BLOCK_OP_NORMAL));
else if (GET_MODE (target) == BLKmode)
store_bit_field (target, INTVAL (expr_size (exp)) * BITS_PER_UNIT,
- 0, GET_MODE (temp), temp);
+ 0, 0, GET_MODE (temp), temp);
else
convert_move (target, temp, unsignedp);
}
@@ -5177,7 +5287,8 @@ store_constructor_field (rtx target, uns
store_constructor (exp, target, cleared, bitsize / BITS_PER_UNIT);
}
else
- store_field (target, bitsize, bitpos, mode, exp, type, alias_set, false);
+ store_field (target, bitsize, bitpos, 0, mode, exp, type, alias_set,
+ false);
}
/* Store the value of constructor EXP into the rtx TARGET.
@@ -5751,6 +5862,8 @@ store_constructor (tree exp, rtx target,
BITSIZE bits, starting BITPOS bits from the start of TARGET.
If MODE is VOIDmode, it means that we are storing into a bit-field.
+ MAXBITS is the number of bits we can store into, or 0 if there is no limit.
+
Always return const0_rtx unless we have something particular to
return.
@@ -5764,6 +5877,7 @@ store_constructor (tree exp, rtx target,
static rtx
store_field (rtx target, HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT maxbits,
enum machine_mode mode, tree exp, tree type,
alias_set_type alias_set, bool nontemporal)
{
@@ -5796,8 +5910,8 @@ store_field (rtx target, HOST_WIDE_INT b
if (bitsize != (HOST_WIDE_INT) GET_MODE_BITSIZE (GET_MODE (target)))
emit_move_insn (object, target);
- store_field (blk_object, bitsize, bitpos, mode, exp, type, alias_set,
- nontemporal);
+ store_field (blk_object, bitsize, bitpos, maxbits,
+ mode, exp, type, alias_set, nontemporal);
emit_move_insn (target, object);
@@ -5911,7 +6025,7 @@ store_field (rtx target, HOST_WIDE_INT b
}
/* Store the value in the bitfield. */
- store_bit_field (target, bitsize, bitpos, mode, temp);
+ store_bit_field (target, bitsize, bitpos, maxbits, mode, temp);
return const0_rtx;
}
@@ -7323,7 +7437,7 @@ expand_expr_real_2 (sepops ops, rtx targ
(treeop0))
* BITS_PER_UNIT),
(HOST_WIDE_INT) GET_MODE_BITSIZE (mode)),
- 0, TYPE_MODE (valtype), treeop0,
+ 0, 0, TYPE_MODE (valtype), treeop0,
type, 0, false);
}
Index: expr.h
===================================================================
--- expr.h (revision 173263)
+++ expr.h (working copy)
@@ -665,7 +665,8 @@ extern enum machine_mode
mode_for_extraction (enum extraction_pattern, int);
extern void store_bit_field (rtx, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, enum machine_mode, rtx);
+ unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT,
+ enum machine_mode, rtx);
extern rtx extract_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT, int, bool, rtx,
enum machine_mode, enum machine_mode);
Index: stor-layout.c
===================================================================
--- stor-layout.c (revision 173263)
+++ stor-layout.c (working copy)
@@ -2428,6 +2428,9 @@ fixup_unsigned_type (tree type)
/* Find the best machine mode to use when referencing a bit field of length
BITSIZE bits starting at BITPOS.
+ MAXBITS is the maximum number of bits we are allowed to touch when
+ referencing this bit field. MAXBITS is 0 if there is no limit.
+
The underlying object is known to be aligned to a boundary of ALIGN bits.
If LARGEST_MODE is not VOIDmode, it means that we should not use a mode
larger than LARGEST_MODE (usually SImode).
@@ -2445,7 +2448,8 @@ fixup_unsigned_type (tree type)
decide which of the above modes should be used. */
enum machine_mode
-get_best_mode (int bitsize, int bitpos, unsigned int align,
+get_best_mode (int bitsize, int bitpos, unsigned HOST_WIDE_INT maxbits,
+ unsigned int align,
enum machine_mode largest_mode, int volatilep)
{
enum machine_mode mode;
@@ -2484,6 +2488,7 @@ get_best_mode (int bitsize, int bitpos,
if (bitpos / unit == (bitpos + bitsize - 1) / unit
&& unit <= BITS_PER_WORD
&& unit <= MIN (align, BIGGEST_ALIGNMENT)
+ && (!maxbits || unit <= maxbits)
&& (largest_mode == VOIDmode
|| unit <= GET_MODE_BITSIZE (largest_mode)))
wide_mode = tmode;
Index: calls.c
===================================================================
--- calls.c (revision 173263)
+++ calls.c (working copy)
@@ -909,7 +909,7 @@ store_unaligned_arguments_into_pseudos (
emit_move_insn (reg, const0_rtx);
bytes -= bitsize / BITS_PER_UNIT;
- store_bit_field (reg, bitsize, endian_correction, word_mode,
+ store_bit_field (reg, bitsize, endian_correction, 0, word_mode,
word);
}
}
Index: expmed.c
===================================================================
--- expmed.c (revision 173263)
+++ expmed.c (working copy)
@@ -47,9 +47,13 @@ struct target_expmed *this_target_expmed
static void store_fixed_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, rtx);
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ rtx);
static void store_split_bit_field (rtx, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, rtx);
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ rtx);
static rtx extract_fixed_bit_field (enum machine_mode, rtx,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
@@ -333,7 +337,9 @@ mode_for_extraction (enum extraction_pat
static bool
store_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum, enum machine_mode fieldmode,
+ unsigned HOST_WIDE_INT bitnum,
+ unsigned HOST_WIDE_INT maxbits,
+ enum machine_mode fieldmode,
rtx value, bool fallback_p)
{
unsigned int unit
@@ -547,7 +553,9 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!store_bit_field_1 (op0, MIN (BITS_PER_WORD,
bitsize - i * BITS_PER_WORD),
- bitnum + bit_offset, word_mode,
+ bitnum + bit_offset,
+ maxbits,
+ word_mode,
value_word, fallback_p))
{
delete_insns_since (last);
@@ -718,9 +726,10 @@ store_bit_field_1 (rtx str_rtx, unsigned
mode. Otherwise, use the smallest mode containing the field. */
if (GET_MODE (op0) == BLKmode
+ || (maxbits && GET_MODE_BITSIZE (GET_MODE (op0)) > maxbits)
|| (op_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (op_mode)))
- bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
+ bestmode = get_best_mode (bitsize, bitnum, maxbits, MEM_ALIGN (op0),
(op_mode == MAX_MACHINE_MODE
? VOIDmode : op_mode),
MEM_VOLATILE_P (op0));
@@ -748,7 +757,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
/* Fetch that unit, store the bitfield in it, then store
the unit. */
tempreg = copy_to_reg (xop0);
- if (store_bit_field_1 (tempreg, bitsize, xbitpos,
+ if (store_bit_field_1 (tempreg, bitsize, xbitpos, maxbits,
fieldmode, orig_value, false))
{
emit_move_insn (xop0, tempreg);
@@ -761,21 +770,28 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!fallback_p)
return false;
- store_fixed_bit_field (op0, offset, bitsize, bitpos, value);
+ store_fixed_bit_field (op0, offset, bitsize, bitpos, maxbits, value);
return true;
}
/* Generate code to store value from rtx VALUE
into a bit-field within structure STR_RTX
containing BITSIZE bits starting at bit BITNUM.
+
+ MAXBITS is the maximum number of bits we are allowed to store into,
+ or 0 if there is no limit.
+
FIELDMODE is the machine-mode of the FIELD_DECL node for this field. */
void
store_bit_field (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum, enum machine_mode fieldmode,
+ unsigned HOST_WIDE_INT bitnum,
+ unsigned HOST_WIDE_INT maxbits,
+ enum machine_mode fieldmode,
rtx value)
{
- if (!store_bit_field_1 (str_rtx, bitsize, bitnum, fieldmode, value, true))
+ if (!store_bit_field_1 (str_rtx, bitsize, bitnum, maxbits,
+ fieldmode, value, true))
gcc_unreachable ();
}
\f
@@ -791,7 +807,9 @@ store_bit_field (rtx str_rtx, unsigned H
static void
store_fixed_bit_field (rtx op0, unsigned HOST_WIDE_INT offset,
unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitpos, rtx value)
+ unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT maxbits,
+ rtx value)
{
enum machine_mode mode;
unsigned int total_bits = BITS_PER_WORD;
@@ -812,7 +830,7 @@ store_fixed_bit_field (rtx op0, unsigned
/* Special treatment for a bit field split across two registers. */
if (bitsize + bitpos > BITS_PER_WORD)
{
- store_split_bit_field (op0, bitsize, bitpos, value);
+ store_split_bit_field (op0, bitsize, bitpos, maxbits, value);
return;
}
}
@@ -830,10 +848,12 @@ store_fixed_bit_field (rtx op0, unsigned
if (MEM_VOLATILE_P (op0)
&& GET_MODE_BITSIZE (GET_MODE (op0)) > 0
+ && (!maxbits || GET_MODE_BITSIZE (GET_MODE (op0)) <= maxbits)
&& flag_strict_volatile_bitfields > 0)
mode = GET_MODE (op0);
else
mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
+ maxbits,
MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
@@ -841,7 +861,7 @@ store_fixed_bit_field (rtx op0, unsigned
/* The only way this should occur is if the field spans word
boundaries. */
store_split_bit_field (op0, bitsize, bitpos + offset * BITS_PER_UNIT,
- value);
+ maxbits, value);
return;
}
@@ -961,7 +981,9 @@ store_fixed_bit_field (rtx op0, unsigned
static void
store_split_bit_field (rtx op0, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitpos, rtx value)
+ unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT maxbits,
+ rtx value)
{
unsigned int unit;
unsigned int bitsdone = 0;
@@ -1076,7 +1098,7 @@ store_split_bit_field (rtx op0, unsigned
it is just an out-of-bounds access. Ignore it. */
if (word != const0_rtx)
store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize,
- thispos, part);
+ thispos, maxbits, part);
bitsdone += thissize;
}
}
@@ -1520,7 +1542,7 @@ extract_bit_field_1 (rtx str_rtx, unsign
if (GET_MODE (op0) == BLKmode
|| (ext_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (ext_mode)))
- bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
+ bestmode = get_best_mode (bitsize, bitnum, 0, MEM_ALIGN (op0),
(ext_mode == MAX_MACHINE_MODE
? VOIDmode : ext_mode),
MEM_VOLATILE_P (op0));
@@ -1646,7 +1668,7 @@ extract_fixed_bit_field (enum machine_mo
mode = tmode;
}
else
- mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
+ mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT, 0,
MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
Index: Makefile.in
===================================================================
--- Makefile.in (revision 173263)
+++ Makefile.in (working copy)
@@ -2916,7 +2916,7 @@ expr.o : expr.c $(CONFIG_H) $(SYSTEM_H)
typeclass.h hard-reg-set.h toplev.h $(DIAGNOSTIC_CORE_H) hard-reg-set.h $(EXCEPT_H) \
reload.h langhooks.h intl.h $(TM_P_H) $(TARGET_H) \
tree-iterator.h gt-expr.h $(MACHMODE_H) $(TIMEVAR_H) $(TREE_FLOW_H) \
- $(TREE_PASS_H) $(DF_H) $(DIAGNOSTIC_H) vecprim.h $(SSAEXPAND_H)
+ $(TREE_PASS_H) $(DF_H) $(DIAGNOSTIC_H) vecprim.h $(SSAEXPAND_H) $(PARAMS_H)
dojump.o : dojump.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \
$(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(OPTABS_H) $(INSN_ATTR_H) insn-config.h \
langhooks.h $(GGC_H) gt-dojump.h vecprim.h $(BASIC_BLOCK_H) output.h
Index: stmt.c
===================================================================
--- stmt.c (revision 173263)
+++ stmt.c (working copy)
@@ -1758,7 +1758,7 @@ expand_return (tree retval)
/* Use bitpos for the source extraction (left justified) and
xbitpos for the destination store (right justified). */
- store_bit_field (dst, bitsize, xbitpos % BITS_PER_WORD, word_mode,
+ store_bit_field (dst, bitsize, xbitpos % BITS_PER_WORD, 0, word_mode,
extract_bit_field (src, bitsize,
bitpos % BITS_PER_WORD, 1, false,
NULL_RTX, word_mode, word_mode));
Index: params.def
===================================================================
--- params.def (revision 173263)
+++ params.def (working copy)
@@ -884,6 +884,13 @@ DEFPARAM (PARAM_MAX_STORES_TO_SINK,
"Maximum number of conditional store pairs that can be sunk",
2, 0, 0)
+/* Data race flags for C++0x memory model compliance. */
+
+DEFPARAM (PARAM_ALLOW_STORE_DATA_RACES,
+ "allow-store-data-races",
+ "Allow new data races on stores to be introduced",
+ 1, 0, 1)
+
/*
Local variables:
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [C++0x] contiguous bitfields race implementation
2011-05-13 22:35 ` Aldy Hernandez
@ 2011-05-16 21:20 ` Aldy Hernandez
2011-05-19 7:17 ` Jason Merrill
1 sibling, 0 replies; 81+ messages in thread
From: Aldy Hernandez @ 2011-05-16 21:20 UTC (permalink / raw)
To: Jason Merrill; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
> Bootstrapped without any issues. Running the entire testsuite with
> --param=allow-store-data-races=0 is still in progress.
BTW, no regressions, even running the entire thing at
--param=allow-store-data-races=0 to force testing this new bitfield
implementation on all tests.
* Re: [C++0x] contiguous bitfields race implementation
2011-05-13 22:35 ` Aldy Hernandez
2011-05-16 21:20 ` Aldy Hernandez
@ 2011-05-19 7:17 ` Jason Merrill
2011-05-20 9:21 ` Aldy Hernandez
1 sibling, 1 reply; 81+ messages in thread
From: Jason Merrill @ 2011-05-19 7:17 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
It seems like you're calculating maxbits correctly now, but an access
doesn't necessarily start from the beginning of the sequence of
bit-fields, especially given store_split_bit_field. That is,
struct A
{
int i;
int j: 32;
int k: 8;
char c[2];
};
Here maxbits would be 40, so we decide that it's OK to use SImode to
access the word starting with k, and clobber c in the process. Am I wrong?
Jason
* Re: [C++0x] contiguous bitfields race implementation
2011-05-19 7:17 ` Jason Merrill
@ 2011-05-20 9:21 ` Aldy Hernandez
2011-05-26 18:05 ` Jason Merrill
0 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-05-20 9:21 UTC (permalink / raw)
To: Jason Merrill; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 1110 bytes --]
On 05/18/11 16:58, Jason Merrill wrote:
> It seems like you're calculating maxbits correctly now, but an access
> doesn't necessarily start from the beginning of the sequence of
> bit-fields, especially given store_split_bit_field. That is,
This is what I was trying to explain to you on IRC. And I obviously
muffed up the whole explanation :).
>
> struct A
> {
> int i;
> int j: 32;
> int k: 8;
> char c[2];
> };
>
> Here maxbits would be 40, so we decide that it's OK to use SImode to
> access the word starting with k, and clobber c in the process. Am I wrong?
You are correct. I have redesigned the patch to pass around starting
and ending bit positions, so get_best_mode() can make a more informed
decision.
I also started using DECL_BIT_FIELD_TYPE instead of DECL_BIT_FIELD to
determine if a DECL is a bit field. It turns out DECL_BIT_FIELD is not
set for bit fields with a mode-sized number of bits (32 bits, 16 bits, etc.).
Furthermore, I added another test to check the above scenario.
Bootstrapped and tested on x86-64 with --param=allow-store-data-races=0.
How do you like these apples?
[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 31414 bytes --]
* params.h (ALLOW_STORE_DATA_RACES): New.
* params.def (PARAM_ALLOW_STORE_DATA_RACES): New.
* Makefile.in (expr.o): Depend on PARAMS_H.
* machmode.h (get_best_mode): Add argument.
* fold-const.c (optimize_bit_field_compare): Add argument to
get_best_mode.
(fold_truthop): Same.
* ifcvt.c (noce_emit_move_insn): Add argument to store_bit_field.
* expr.c (emit_group_store): Same.
(copy_blkmode_from_reg): Same.
(write_complex_part): Same.
(optimize_bitfield_assignment_op): Add argument.
Add argument to get_best_mode.
(get_bit_range): New.
(expand_assignment): Calculate maxbits and pass it down
accordingly.
(store_field): New argument.
(expand_expr_real_2): New argument to store_field.
Include params.h.
* expr.h (store_bit_field): New argument.
* stor-layout.c (get_best_mode): Restrict mode expansion by taking
into account maxbits.
* calls.c (store_unaligned_arguments_into_pseudos): New argument
to store_bit_field.
* expmed.c (store_bit_field_1): New argument. Use it.
(store_bit_field): Same.
(store_fixed_bit_field): Same.
(store_split_bit_field): Same.
(extract_bit_field_1): Pass new argument to get_best_mode.
(extract_fixed_bit_field): Same.
* stmt.c (expand_return): Pass new argument to store_bit_field.
* tree.h (DECL_THREAD_VISIBLE_P): New.
* doc/invoke.texi: Document parameter allow-store-data-races.
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi (revision 173263)
+++ doc/invoke.texi (working copy)
@@ -8886,6 +8886,11 @@ The maximum number of conditional stores
if either vectorization (@option{-ftree-vectorize}) or if-conversion
(@option{-ftree-loop-if-convert}) is disabled. The default is 2.
+@item allow-store-data-races
+Allow optimizers to introduce new data races on stores.
+Set to 1 to allow, or to 0 to disallow. This option is enabled by default
+unless implicitly set by the @option{-fmemory-model=} option.
+
@end table
@end table
Index: machmode.h
===================================================================
--- machmode.h (revision 173263)
+++ machmode.h (working copy)
@@ -248,7 +248,10 @@ extern enum machine_mode mode_for_vector
/* Find the best mode to use to access a bit field. */
-extern enum machine_mode get_best_mode (int, int, unsigned int,
+extern enum machine_mode get_best_mode (int, int,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ unsigned int,
enum machine_mode, int);
/* Determine alignment, 1<=result<=BIGGEST_ALIGNMENT. */
Index: tree.h
===================================================================
--- tree.h (revision 173263)
+++ tree.h (working copy)
@@ -3156,6 +3156,10 @@ struct GTY(()) tree_parm_decl {
#define DECL_THREAD_LOCAL_P(NODE) \
(VAR_DECL_CHECK (NODE)->decl_with_vis.tls_model >= TLS_MODEL_REAL)
+/* Return true if a VAR_DECL is visible from another thread. */
+#define DECL_THREAD_VISIBLE_P(NODE) \
+ (TREE_STATIC (NODE) && !DECL_THREAD_LOCAL_P (NODE))
+
/* In a non-local VAR_DECL with static storage duration, true if the
variable has an initialization priority. If false, the variable
will be initialized at the DEFAULT_INIT_PRIORITY. */
Index: fold-const.c
===================================================================
--- fold-const.c (revision 173263)
+++ fold-const.c (working copy)
@@ -3409,7 +3409,7 @@ optimize_bit_field_compare (location_t l
&& flag_strict_volatile_bitfields > 0)
nmode = lmode;
else
- nmode = get_best_mode (lbitsize, lbitpos,
+ nmode = get_best_mode (lbitsize, lbitpos, 0, 0,
const_p ? TYPE_ALIGN (TREE_TYPE (linner))
: MIN (TYPE_ALIGN (TREE_TYPE (linner)),
TYPE_ALIGN (TREE_TYPE (rinner))),
@@ -5237,7 +5237,7 @@ fold_truthop (location_t loc, enum tree_
to be relative to a field of that size. */
first_bit = MIN (ll_bitpos, rl_bitpos);
end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize);
- lnmode = get_best_mode (end_bit - first_bit, first_bit,
+ lnmode = get_best_mode (end_bit - first_bit, first_bit, 0, 0,
TYPE_ALIGN (TREE_TYPE (ll_inner)), word_mode,
volatilep);
if (lnmode == VOIDmode)
@@ -5302,7 +5302,7 @@ fold_truthop (location_t loc, enum tree_
first_bit = MIN (lr_bitpos, rr_bitpos);
end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize);
- rnmode = get_best_mode (end_bit - first_bit, first_bit,
+ rnmode = get_best_mode (end_bit - first_bit, first_bit, 0, 0,
TYPE_ALIGN (TREE_TYPE (lr_inner)), word_mode,
volatilep);
if (rnmode == VOIDmode)
Index: params.h
===================================================================
--- params.h (revision 173263)
+++ params.h (working copy)
@@ -206,4 +206,6 @@ extern void init_param_values (int *para
PARAM_VALUE (PARAM_MIN_NONDEBUG_INSN_UID)
#define MAX_STORES_TO_SINK \
PARAM_VALUE (PARAM_MAX_STORES_TO_SINK)
+#define ALLOW_STORE_DATA_RACES \
+ PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES)
#endif /* ! GCC_PARAMS_H */
Index: testsuite/gcc.dg/20110509.c
===================================================================
--- testsuite/gcc.dg/20110509.c (revision 0)
+++ testsuite/gcc.dg/20110509.c (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+/* Test that we don't store past VAR.A. */
+
+struct S
+{
+ volatile unsigned int a : 4;
+ unsigned char b;
+ unsigned int c : 6;
+} var;
+
+void set_a()
+{
+ var.a = 12;
+}
+
+/* { dg-final { scan-assembler-not "movl.*, var" } } */
Index: testsuite/gcc.dg/20110509-2.c
===================================================================
--- testsuite/gcc.dg/20110509-2.c (revision 0)
+++ testsuite/gcc.dg/20110509-2.c (revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+/* Test that we don't store past VAR.K. */
+
+struct S
+{
+ volatile int i;
+ volatile int j: 32;
+ volatile int k: 15;
+ volatile char c[2];
+} var;
+
+void setit()
+{
+ var.k = 13;
+}
+
+/* { dg-final { scan-assembler-not "movl.*, var" } } */
Index: ifcvt.c
===================================================================
--- ifcvt.c (revision 173263)
+++ ifcvt.c (working copy)
@@ -885,7 +885,7 @@ noce_emit_move_insn (rtx x, rtx y)
}
gcc_assert (start < (MEM_P (op) ? BITS_PER_UNIT : BITS_PER_WORD));
- store_bit_field (op, size, start, GET_MODE (x), y);
+ store_bit_field (op, size, start, 0, 0, GET_MODE (x), y);
return;
}
@@ -939,7 +939,8 @@ noce_emit_move_insn (rtx x, rtx y)
inner = XEXP (outer, 0);
outmode = GET_MODE (outer);
bitpos = SUBREG_BYTE (outer) * BITS_PER_UNIT;
- store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos, outmode, y);
+ store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos,
+ 0, 0, outmode, y);
}
/* Return sequence of instructions generated by if conversion. This
Index: expr.c
===================================================================
--- expr.c (revision 173263)
+++ expr.c (working copy)
@@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.
#include "diagnostic.h"
#include "ssaexpand.h"
#include "target-globals.h"
+#include "params.h"
/* Decide whether a function's arguments should be processed
from first to last or from last to first.
@@ -142,7 +143,9 @@ static void store_constructor_field (rtx
HOST_WIDE_INT, enum machine_mode,
tree, tree, int, alias_set_type);
static void store_constructor (tree, rtx, int, HOST_WIDE_INT);
-static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT, enum machine_mode,
+static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT,
+ enum machine_mode,
tree, tree, alias_set_type, bool);
static unsigned HOST_WIDE_INT highest_pow2_factor_for_target (const_tree, const_tree);
@@ -2063,7 +2066,7 @@ emit_group_store (rtx orig_dst, rtx src,
emit_move_insn (adjust_address (dest, mode, bytepos), tmps[i]);
else
store_bit_field (dest, bytelen * BITS_PER_UNIT, bytepos * BITS_PER_UNIT,
- mode, tmps[i]);
+ 0, 0, mode, tmps[i]);
}
/* Copy from the pseudo into the (probable) hard reg. */
@@ -2157,7 +2160,7 @@ copy_blkmode_from_reg (rtx tgtblk, rtx s
/* Use xbitpos for the source extraction (right justified) and
bitpos for the destination store (left justified). */
- store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, copy_mode,
+ store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, 0, 0, copy_mode,
extract_bit_field (src, bitsize,
xbitpos % BITS_PER_WORD, 1, false,
NULL_RTX, copy_mode, copy_mode));
@@ -2794,7 +2797,7 @@ write_complex_part (rtx cplx, rtx val, b
gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD);
}
- store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, imode, val);
+ store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, 0, 0, imode, val);
}
/* Extract one of the components of the complex value CPLX. Extract the
@@ -3929,6 +3932,8 @@ get_subtarget (rtx x)
static bool
optimize_bitfield_assignment_op (unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
enum machine_mode mode1, rtx str_rtx,
tree to, tree src)
{
@@ -3990,6 +3995,7 @@ optimize_bitfield_assignment_op (unsigne
if (str_bitsize == 0 || str_bitsize > BITS_PER_WORD)
str_mode = word_mode;
str_mode = get_best_mode (bitsize, bitpos,
+ bitregion_start, bitregion_end,
MEM_ALIGN (str_rtx), str_mode, 0);
if (str_mode == VOIDmode)
return false;
@@ -4098,6 +4104,111 @@ optimize_bitfield_assignment_op (unsigne
return false;
}
+/* In the C++ memory model, consecutive bit fields in a structure are
+ considered one memory location.
+
+ Given a COMPONENT_REF, this function returns the bit range of
+ consecutive bits in which this COMPONENT_REF belongs. The
+ values are returned in *BITSTART and *BITEND. If either the C++
+ memory model is not activated, or this memory access is not
+ thread visible, 0 is returned in both *BITSTART and *BITEND.
+
+ EXP is the COMPONENT_REF.
+ INNERDECL is the actual object being referenced.
+ BITPOS is the position in bits where the field starts within the structure.
+ BITSIZE is the size in bits of the field being referenced in EXP.
+
+ For example, while storing into FOO.A here...
+
+ struct {
+ BIT 0:
+ unsigned int a : 4;
+ unsigned int b : 1;
+ BIT 8:
+ unsigned char c;
+ unsigned int d : 6;
+ } foo;
+
+ ...we are not allowed to store past <b>, so for the layout above, we
+ would return a range of 0..7 (stores into the padding are
+ harmless). */
+
+static void
+get_bit_range (unsigned HOST_WIDE_INT *bitstart,
+ unsigned HOST_WIDE_INT *bitend,
+ tree exp, tree innerdecl,
+ HOST_WIDE_INT bitpos, HOST_WIDE_INT bitsize)
+{
+ tree field, record_type, fld;
+ bool found_field = false;
+ bool prev_field_is_bitfield;
+
+ gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
+
+ /* If other threads can't see this value, no need to restrict stores. */
+ if (ALLOW_STORE_DATA_RACES
+ || !DECL_THREAD_VISIBLE_P (innerdecl))
+ {
+ *bitstart = *bitend = 0;
+ return;
+ }
+
+ /* Bit field we're storing into. */
+ field = TREE_OPERAND (exp, 1);
+ record_type = DECL_FIELD_CONTEXT (field);
+
+ /* Count the contiguous bitfields for the memory location that
+ contains FIELD. */
+ *bitstart = 0;
+ prev_field_is_bitfield = true;
+ for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld))
+ {
+ tree t, offset;
+ enum machine_mode mode;
+ int unsignedp, volatilep;
+
+ if (TREE_CODE (fld) != FIELD_DECL)
+ continue;
+
+ t = build3 (COMPONENT_REF, TREE_TYPE (exp),
+ unshare_expr (TREE_OPERAND (exp, 0)),
+ fld, NULL_TREE);
+ get_inner_reference (t, &bitsize, &bitpos, &offset,
+ &mode, &unsignedp, &volatilep, true);
+
+ if (field == fld)
+ found_field = true;
+
+ if (DECL_BIT_FIELD_TYPE (fld) && bitsize > 0)
+ {
+ if (prev_field_is_bitfield == false)
+ {
+ *bitstart = bitpos;
+ prev_field_is_bitfield = true;
+ }
+ }
+ else
+ {
+ prev_field_is_bitfield = false;
+ if (found_field)
+ break;
+ }
+ }
+ gcc_assert (found_field);
+
+ if (fld)
+ {
+ /* We found the end of the bit field sequence. Include the
+ padding up to the next field and be done. */
+ *bitend = bitpos - 1;
+ }
+ else
+ {
+ /* If this is the last element in the structure, include the padding
+ at the end of structure. */
+ *bitend = TREE_INT_CST_LOW (TYPE_SIZE (record_type)) - 1;
+ }
+}
/* Expand an assignment that stores the value of FROM into TO. If NONTEMPORAL
is true, try generating a nontemporal store. */
@@ -4197,6 +4308,8 @@ expand_assignment (tree to, tree from, b
{
enum machine_mode mode1;
HOST_WIDE_INT bitsize, bitpos;
+ unsigned HOST_WIDE_INT bitregion_start = 0;
+ unsigned HOST_WIDE_INT bitregion_end = 0;
tree offset;
int unsignedp;
int volatilep = 0;
@@ -4206,6 +4319,11 @@ expand_assignment (tree to, tree from, b
tem = get_inner_reference (to, &bitsize, &bitpos, &offset, &mode1,
&unsignedp, &volatilep, true);
+ if (TREE_CODE (to) == COMPONENT_REF
+ && DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
+ get_bit_range (&bitregion_start, &bitregion_end,
+ to, tem, bitpos, bitsize);
+
/* If we are going to use store_bit_field and extract_bit_field,
make sure to_rtx will be safe for multiple use. */
@@ -4287,11 +4405,14 @@ expand_assignment (tree to, tree from, b
nontemporal);
else if (bitpos + bitsize <= mode_bitsize / 2)
result = store_field (XEXP (to_rtx, 0), bitsize, bitpos,
+ bitregion_start, bitregion_end,
mode1, from, TREE_TYPE (tem),
get_alias_set (to), nontemporal);
else if (bitpos >= mode_bitsize / 2)
result = store_field (XEXP (to_rtx, 1), bitsize,
- bitpos - mode_bitsize / 2, mode1, from,
+ bitpos - mode_bitsize / 2,
+ bitregion_start, bitregion_end,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
else if (bitpos == 0 && bitsize == mode_bitsize)
@@ -4312,7 +4433,9 @@ expand_assignment (tree to, tree from, b
0);
write_complex_part (temp, XEXP (to_rtx, 0), false);
write_complex_part (temp, XEXP (to_rtx, 1), true);
- result = store_field (temp, bitsize, bitpos, mode1, from,
+ result = store_field (temp, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
emit_move_insn (XEXP (to_rtx, 0), read_complex_part (temp, false));
@@ -4337,11 +4460,15 @@ expand_assignment (tree to, tree from, b
MEM_KEEP_ALIAS_SET_P (to_rtx) = 1;
}
- if (optimize_bitfield_assignment_op (bitsize, bitpos, mode1,
+ if (optimize_bitfield_assignment_op (bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode1,
to_rtx, to, from))
result = NULL;
else
- result = store_field (to_rtx, bitsize, bitpos, mode1, from,
+ result = store_field (to_rtx, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
}
@@ -4734,7 +4861,7 @@ store_expr (tree exp, rtx target, int ca
: BLOCK_OP_NORMAL));
else if (GET_MODE (target) == BLKmode)
store_bit_field (target, INTVAL (expr_size (exp)) * BITS_PER_UNIT,
- 0, GET_MODE (temp), temp);
+ 0, 0, 0, GET_MODE (temp), temp);
else
convert_move (target, temp, unsignedp);
}
@@ -5177,7 +5304,8 @@ store_constructor_field (rtx target, uns
store_constructor (exp, target, cleared, bitsize / BITS_PER_UNIT);
}
else
- store_field (target, bitsize, bitpos, mode, exp, type, alias_set, false);
+ store_field (target, bitsize, bitpos, 0, 0, mode, exp, type, alias_set,
+ false);
}
/* Store the value of constructor EXP into the rtx TARGET.
@@ -5751,6 +5879,11 @@ store_constructor (tree exp, rtx target,
BITSIZE bits, starting BITPOS bits from the start of TARGET.
If MODE is VOIDmode, it means that we are storing into a bit-field.
+ BITREGION_START is the bit position of the first bit in this region.
+ BITREGION_END is the bit position of the last bit in this region.
+ Both are 0 if the C++ memory model does not apply, or if we are not
+ interested in keeping track of bitfield regions.
+
Always return const0_rtx unless we have something particular to
return.
@@ -5764,6 +5897,8 @@ store_constructor (tree exp, rtx target,
static rtx
store_field (rtx target, HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
enum machine_mode mode, tree exp, tree type,
alias_set_type alias_set, bool nontemporal)
{
@@ -5796,8 +5931,9 @@ store_field (rtx target, HOST_WIDE_INT b
if (bitsize != (HOST_WIDE_INT) GET_MODE_BITSIZE (GET_MODE (target)))
emit_move_insn (object, target);
- store_field (blk_object, bitsize, bitpos, mode, exp, type, alias_set,
- nontemporal);
+ store_field (blk_object, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode, exp, type, alias_set, nontemporal);
emit_move_insn (target, object);
@@ -5911,7 +6047,9 @@ store_field (rtx target, HOST_WIDE_INT b
}
/* Store the value in the bitfield. */
- store_bit_field (target, bitsize, bitpos, mode, temp);
+ store_bit_field (target, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode, temp);
return const0_rtx;
}
@@ -7323,7 +7461,7 @@ expand_expr_real_2 (sepops ops, rtx targ
(treeop0))
* BITS_PER_UNIT),
(HOST_WIDE_INT) GET_MODE_BITSIZE (mode)),
- 0, TYPE_MODE (valtype), treeop0,
+ 0, 0, 0, TYPE_MODE (valtype), treeop0,
type, 0, false);
}
Index: expr.h
===================================================================
--- expr.h (revision 173263)
+++ expr.h (working copy)
@@ -665,7 +665,10 @@ extern enum machine_mode
mode_for_extraction (enum extraction_pattern, int);
extern void store_bit_field (rtx, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, enum machine_mode, rtx);
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ enum machine_mode, rtx);
extern rtx extract_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT, int, bool, rtx,
enum machine_mode, enum machine_mode);
Index: stor-layout.c
===================================================================
--- stor-layout.c (revision 173263)
+++ stor-layout.c (working copy)
@@ -2428,6 +2428,13 @@ fixup_unsigned_type (tree type)
/* Find the best machine mode to use when referencing a bit field of length
BITSIZE bits starting at BITPOS.
+ BITREGION_START is the bit position of the first bit in this
+ sequence of bit fields. BITREGION_END is the bit position of the
+ last bit in this sequence. If these two fields are non-zero, we
+ should restrict the memory access to a chunk of at most
+ BITREGION_END - BITREGION_START + 1 bits. Otherwise, we are allowed
+ to touch any adjacent non bit-fields.
+
The underlying object is known to be aligned to a boundary of ALIGN bits.
If LARGEST_MODE is not VOIDmode, it means that we should not use a mode
larger than LARGEST_MODE (usually SImode).
@@ -2445,11 +2452,23 @@ fixup_unsigned_type (tree type)
decide which of the above modes should be used. */
enum machine_mode
-get_best_mode (int bitsize, int bitpos, unsigned int align,
+get_best_mode (int bitsize, int bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ unsigned int align,
enum machine_mode largest_mode, int volatilep)
{
enum machine_mode mode;
unsigned int unit = 0;
+ unsigned HOST_WIDE_INT maxbits;
+
+ /* If unset, no restriction. */
+ if (!bitregion_end)
+ maxbits = 0;
+ else if ((unsigned) bitpos < bitregion_start)
+ maxbits = bitregion_end - bitregion_start + 1;
+ else
+ maxbits = bitregion_end - bitpos + 1;
/* Find the narrowest integer mode that contains the bit field. */
for (mode = GET_CLASS_NARROWEST_MODE (MODE_INT); mode != VOIDmode;
@@ -2484,6 +2503,7 @@ get_best_mode (int bitsize, int bitpos,
if (bitpos / unit == (bitpos + bitsize - 1) / unit
&& unit <= BITS_PER_WORD
&& unit <= MIN (align, BIGGEST_ALIGNMENT)
+ && (!maxbits || unit <= maxbits)
&& (largest_mode == VOIDmode
|| unit <= GET_MODE_BITSIZE (largest_mode)))
wide_mode = tmode;
Index: calls.c
===================================================================
--- calls.c (revision 173263)
+++ calls.c (working copy)
@@ -909,8 +909,8 @@ store_unaligned_arguments_into_pseudos (
emit_move_insn (reg, const0_rtx);
bytes -= bitsize / BITS_PER_UNIT;
- store_bit_field (reg, bitsize, endian_correction, word_mode,
- word);
+ store_bit_field (reg, bitsize, endian_correction, 0, 0,
+ word_mode, word);
}
}
}
Index: expmed.c
===================================================================
--- expmed.c (revision 173263)
+++ expmed.c (working copy)
@@ -47,9 +47,15 @@ struct target_expmed *this_target_expmed
static void store_fixed_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, rtx);
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ rtx);
static void store_split_bit_field (rtx, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, rtx);
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ rtx);
static rtx extract_fixed_bit_field (enum machine_mode, rtx,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
@@ -333,7 +339,10 @@ mode_for_extraction (enum extraction_pat
static bool
store_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum, enum machine_mode fieldmode,
+ unsigned HOST_WIDE_INT bitnum,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ enum machine_mode fieldmode,
rtx value, bool fallback_p)
{
unsigned int unit
@@ -547,7 +556,9 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!store_bit_field_1 (op0, MIN (BITS_PER_WORD,
bitsize - i * BITS_PER_WORD),
- bitnum + bit_offset, word_mode,
+ bitnum + bit_offset,
+ bitregion_start, bitregion_end,
+ word_mode,
value_word, fallback_p))
{
delete_insns_since (last);
@@ -711,6 +722,12 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (HAVE_insv && MEM_P (op0))
{
enum machine_mode bestmode;
+ unsigned HOST_WIDE_INT maxbits;
+
+ if (bitnum < bitregion_start)
+ maxbits = bitregion_end - bitregion_start + 1;
+ else
+ maxbits = bitregion_end - bitnum + 1;
/* Get the mode to use for inserting into this field. If OP0 is
BLKmode, get the smallest mode consistent with the alignment. If
@@ -718,9 +735,12 @@ store_bit_field_1 (rtx str_rtx, unsigned
mode. Otherwise, use the smallest mode containing the field. */
if (GET_MODE (op0) == BLKmode
+ || (bitregion_end && GET_MODE_BITSIZE (GET_MODE (op0)) > maxbits)
|| (op_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (op_mode)))
- bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
+ bestmode = get_best_mode (bitsize, bitnum,
+ bitregion_start, bitregion_end,
+ MEM_ALIGN (op0),
(op_mode == MAX_MACHINE_MODE
? VOIDmode : op_mode),
MEM_VOLATILE_P (op0));
@@ -749,6 +769,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
the unit. */
tempreg = copy_to_reg (xop0);
if (store_bit_field_1 (tempreg, bitsize, xbitpos,
+ bitregion_start, bitregion_end,
fieldmode, orig_value, false))
{
emit_move_insn (xop0, tempreg);
@@ -761,21 +782,33 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!fallback_p)
return false;
- store_fixed_bit_field (op0, offset, bitsize, bitpos, value);
+ store_fixed_bit_field (op0, offset, bitsize, bitpos,
+ bitregion_start, bitregion_end, value);
return true;
}
/* Generate code to store value from rtx VALUE
into a bit-field within structure STR_RTX
containing BITSIZE bits starting at bit BITNUM.
+
+ BITREGION_START is the bit position of the first bit in this region.
+ BITREGION_END is the bit position of the last bit in this region.
+ Both are 0, if the C++ memory model does not apply, or if we are not
+ interested in keeping track of bitfield regions.
+
FIELDMODE is the machine-mode of the FIELD_DECL node for this field. */
void
store_bit_field (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum, enum machine_mode fieldmode,
+ unsigned HOST_WIDE_INT bitnum,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ enum machine_mode fieldmode,
rtx value)
{
- if (!store_bit_field_1 (str_rtx, bitsize, bitnum, fieldmode, value, true))
+ if (!store_bit_field_1 (str_rtx, bitsize, bitnum,
+ bitregion_start, bitregion_end,
+ fieldmode, value, true))
gcc_unreachable ();
}
\f
@@ -791,7 +824,10 @@ store_bit_field (rtx str_rtx, unsigned H
static void
store_fixed_bit_field (rtx op0, unsigned HOST_WIDE_INT offset,
unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitpos, rtx value)
+ unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ rtx value)
{
enum machine_mode mode;
unsigned int total_bits = BITS_PER_WORD;
@@ -812,12 +848,23 @@ store_fixed_bit_field (rtx op0, unsigned
/* Special treatment for a bit field split across two registers. */
if (bitsize + bitpos > BITS_PER_WORD)
{
- store_split_bit_field (op0, bitsize, bitpos, value);
+ store_split_bit_field (op0, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ value);
return;
}
}
else
{
+ unsigned HOST_WIDE_INT maxbits;
+
+ if (!bitregion_end)
+ maxbits = 0;
+ else if (bitpos + offset * BITS_PER_UNIT < bitregion_start)
+ maxbits = bitregion_end - bitregion_start + 1;
+ else
+ maxbits = bitregion_end - (bitpos + offset * BITS_PER_UNIT) + 1;
+
/* Get the proper mode to use for this field. We want a mode that
includes the entire field. If such a mode would be larger than
a word, we won't be doing the extraction the normal way.
@@ -830,10 +877,12 @@ store_fixed_bit_field (rtx op0, unsigned
if (MEM_VOLATILE_P (op0)
&& GET_MODE_BITSIZE (GET_MODE (op0)) > 0
+ && GET_MODE_BITSIZE (GET_MODE (op0)) <= maxbits
&& flag_strict_volatile_bitfields > 0)
mode = GET_MODE (op0);
else
mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
+ bitregion_start, bitregion_end,
MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
@@ -841,7 +890,7 @@ store_fixed_bit_field (rtx op0, unsigned
/* The only way this should occur is if the field spans word
boundaries. */
store_split_bit_field (op0, bitsize, bitpos + offset * BITS_PER_UNIT,
- value);
+ bitregion_start, bitregion_end, value);
return;
}
@@ -961,7 +1010,10 @@ store_fixed_bit_field (rtx op0, unsigned
static void
store_split_bit_field (rtx op0, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitpos, rtx value)
+ unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ rtx value)
{
unsigned int unit;
unsigned int bitsdone = 0;
@@ -1076,7 +1128,7 @@ store_split_bit_field (rtx op0, unsigned
it is just an out-of-bounds access. Ignore it. */
if (word != const0_rtx)
store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize,
- thispos, part);
+ thispos, bitregion_start, bitregion_end, part);
bitsdone += thissize;
}
}
@@ -1520,7 +1572,7 @@ extract_bit_field_1 (rtx str_rtx, unsign
if (GET_MODE (op0) == BLKmode
|| (ext_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (ext_mode)))
- bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
+ bestmode = get_best_mode (bitsize, bitnum, 0, 0, MEM_ALIGN (op0),
(ext_mode == MAX_MACHINE_MODE
? VOIDmode : ext_mode),
MEM_VOLATILE_P (op0));
@@ -1646,7 +1698,7 @@ extract_fixed_bit_field (enum machine_mo
mode = tmode;
}
else
- mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
+ mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT, 0, 0,
MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
Index: Makefile.in
===================================================================
--- Makefile.in (revision 173263)
+++ Makefile.in (working copy)
@@ -2916,7 +2916,7 @@ expr.o : expr.c $(CONFIG_H) $(SYSTEM_H)
typeclass.h hard-reg-set.h toplev.h $(DIAGNOSTIC_CORE_H) hard-reg-set.h $(EXCEPT_H) \
reload.h langhooks.h intl.h $(TM_P_H) $(TARGET_H) \
tree-iterator.h gt-expr.h $(MACHMODE_H) $(TIMEVAR_H) $(TREE_FLOW_H) \
- $(TREE_PASS_H) $(DF_H) $(DIAGNOSTIC_H) vecprim.h $(SSAEXPAND_H)
+ $(TREE_PASS_H) $(DF_H) $(DIAGNOSTIC_H) vecprim.h $(SSAEXPAND_H) $(PARAMS_H)
dojump.o : dojump.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \
$(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(OPTABS_H) $(INSN_ATTR_H) insn-config.h \
langhooks.h $(GGC_H) gt-dojump.h vecprim.h $(BASIC_BLOCK_H) output.h
Index: stmt.c
===================================================================
--- stmt.c (revision 173263)
+++ stmt.c (working copy)
@@ -1758,7 +1758,8 @@ expand_return (tree retval)
/* Use bitpos for the source extraction (left justified) and
xbitpos for the destination store (right justified). */
- store_bit_field (dst, bitsize, xbitpos % BITS_PER_WORD, word_mode,
+ store_bit_field (dst, bitsize, xbitpos % BITS_PER_WORD,
+ 0, 0, word_mode,
extract_bit_field (src, bitsize,
bitpos % BITS_PER_WORD, 1, false,
NULL_RTX, word_mode, word_mode));
Index: params.def
===================================================================
--- params.def (revision 173263)
+++ params.def (working copy)
@@ -884,6 +884,13 @@ DEFPARAM (PARAM_MAX_STORES_TO_SINK,
"Maximum number of conditional store pairs that can be sunk",
2, 0, 0)
+/* Data race flags for C++0x memory model compliance. */
+
+DEFPARAM (PARAM_ALLOW_STORE_DATA_RACES,
+ "allow-store-data-races",
+ "Allow new data races on stores to be introduced",
+ 1, 0, 1)
+
/*
Local variables:
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [C++0x] contiguous bitfields race implementation
2011-05-20 9:21 ` Aldy Hernandez
@ 2011-05-26 18:05 ` Jason Merrill
2011-05-26 18:28 ` Aldy Hernandez
0 siblings, 1 reply; 81+ messages in thread
From: Jason Merrill @ 2011-05-26 18:05 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
I'm afraid I think this is still wrong; the computation of maxbits in
various places assumes that the bitfield is at the start of the unit
we're going to access, so given
struct A
{
int i: 4;
int j: 28;
};
we won't use SImode to access A::j because we're setting maxbits to 28.
Jason
* Re: [C++0x] contiguous bitfields race implementation
2011-05-26 18:05 ` Jason Merrill
@ 2011-05-26 18:28 ` Aldy Hernandez
2011-05-26 19:07 ` Jason Merrill
0 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-05-26 18:28 UTC (permalink / raw)
To: Jason Merrill; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
On 05/26/11 12:24, Jason Merrill wrote:
> I'm afraid I think this is still wrong; the computation of maxbits in
> various places assumes that the bitfield is at the start of the unit
> we're going to access, so given
>
> struct A
> {
> int i: 4;
> int j: 28;
> };
>
> we won't use SImode to access A::j because we're setting maxbits to 28.
No, maxbits is actually 32, because we include padding. So it's correct
in this case.
* Re: [C++0x] contiguous bitfields race implementation
2011-05-26 18:28 ` Aldy Hernandez
@ 2011-05-26 19:07 ` Jason Merrill
2011-05-26 20:19 ` Aldy Hernandez
0 siblings, 1 reply; 81+ messages in thread
From: Jason Merrill @ 2011-05-26 19:07 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
On 05/26/2011 01:39 PM, Aldy Hernandez wrote:
> On 05/26/11 12:24, Jason Merrill wrote:
>> struct A
>> {
>> int i: 4;
>> int j: 28;
>> };
>>
>> we won't use SImode to access A::j because we're setting maxbits to 28.
>
> No, maxbits is actually 32, because we include padding. So it's correct
> in this case.
What padding? bitregion_end-bitregion_start+1 will be 32, but in
get_best_mode I see
> + maxbits = bitregion_end - bitpos + 1;
which is 28. No?
Incidentally, I would expect _end to be one past the end rather than the
index of the last element, but perhaps I just expect that because C++
iterators work that way.
Jason
* Re: [C++0x] contiguous bitfields race implementation
2011-05-26 19:07 ` Jason Merrill
@ 2011-05-26 20:19 ` Aldy Hernandez
2011-05-27 20:41 ` Jason Merrill
0 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-05-26 20:19 UTC (permalink / raw)
To: Jason Merrill; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
> What padding? bitregion_end-bitregion_start+1 will be 32, but in
Poop, I misread your example.
> get_best_mode I see
>
>> + maxbits = bitregion_end - bitpos + 1;
>
> which is 28. No?
Yes, but if you look at the next few lines you'll see:
/* Find the narrowest integer mode that contains the bit field. */
for (mode = GET_CLASS_NARROWEST_MODE (MODE_INT); mode != VOIDmode;
mode = GET_MODE_WIDER_MODE (mode))
{
unit = GET_MODE_BITSIZE (mode);
if ((bitpos % unit) + bitsize <= unit)
break;
}
The narrowest integer mode containing the bit field is still 32, so we
access the bitfield with an SI instruction as expected.
> Incidentally, I would expect _end to be one past the end rather than the
> index of the last element, but perhaps I just expect that because C++
> iterators work that way.
I can fix that.
Aldy
* Re: [C++0x] contiguous bitfields race implementation
2011-05-26 20:19 ` Aldy Hernandez
@ 2011-05-27 20:41 ` Jason Merrill
2011-07-18 13:10 ` Aldy Hernandez
0 siblings, 1 reply; 81+ messages in thread
From: Jason Merrill @ 2011-05-27 20:41 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
On 05/26/2011 02:37 PM, Aldy Hernandez wrote:
> The narrowest integer mode containing the bit field is still 32, so we
> access the bitfield with an SI instruction as expected.
OK, then:
struct A
{
int i: 4;
int j: 4;
int k: 8;
int l: 8;
int m: 8;
};
now the narrowest mode containing 'j' is QI/8, but it would still be
safe to use SI.
Jason
* Re: [C++0x] contiguous bitfields race implementation
2011-05-27 20:41 ` Jason Merrill
@ 2011-07-18 13:10 ` Aldy Hernandez
2011-07-22 19:16 ` Jason Merrill
0 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-07-18 13:10 UTC (permalink / raw)
To: Jason Merrill; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 742 bytes --]
On 05/27/11 14:18, Jason Merrill wrote:
> On 05/26/2011 02:37 PM, Aldy Hernandez wrote:
>> The narrowest integer mode containing the bit field is still 32, so we
>> access the bitfield with an SI instruction as expected.
>
> OK, then:
>
> struct A
> {
> int i: 4;
> int j: 4;
> int k: 8;
> int l: 8;
> int m: 8;
> };
>
> now the narrowest mode containing 'j' is QI/8, but it would still be
> safe to use SI.
Hi Jason.
Sorry to have dropped the ball on this. Your last review coincided with
me going on vacation.
Here is another stab at it. I am now taking into account alignment,
which I believe addresses your issue. I have also added the new
testcase above, which the patch also fixes.
Tested on x86-64 Linux.
How is this?
Aldy
[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 31959 bytes --]
* params.h (ALLOW_STORE_DATA_RACES): New.
* params.def (PARAM_ALLOW_STORE_DATA_RACES): New.
* Makefile.in (expr.o): Depend on PARAMS_H.
* machmode.h (get_best_mode): Add argument.
* fold-const.c (optimize_bit_field_compare): Add argument to
get_best_mode.
(fold_truthop): Same.
* ifcvt.c (noce_emit_move_insn): Add argument to store_bit_field.
* expr.c (emit_group_store): Same.
(copy_blkmode_from_reg): Same.
(write_complex_part): Same.
(optimize_bitfield_assignment_op): Add argument.
Add argument to get_best_mode.
(get_bit_range): New.
(expand_assignment): Calculate maxbits and pass it down
accordingly.
(store_field): New argument.
(expand_expr_real_2): New argument to store_field.
Include params.h.
* expr.h (store_bit_field): New argument.
* stor-layout.c (get_best_mode): Restrict mode expansion by taking
into account maxbits.
* calls.c (store_unaligned_arguments_into_pseudos): New argument
to store_bit_field.
* expmed.c (store_bit_field_1): New argument. Use it.
(store_bit_field): Same.
(store_fixed_bit_field): Same.
(store_split_bit_field): Same.
(extract_bit_field_1): Pass new argument to get_best_mode.
(extract_bit_field): Same.
* stmt.c (expand_return): Pass new argument to store_bit_field.
* tree.h (DECL_THREAD_VISIBLE_P): New.
* doc/invoke.texi: Document parameter allow-store-data-races.
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi (revision 176280)
+++ doc/invoke.texi (working copy)
@@ -9027,6 +9027,11 @@ The maximum number of conditional stores
if either vectorization (@option{-ftree-vectorize}) or if-conversion
(@option{-ftree-loop-if-convert}) is disabled. The default is 2.
+@item allow-store-data-races
+Allow optimizers to introduce new data races on stores.
+Set this to 1 to allow such races, or to 0 to disallow them. It is
+enabled by default unless implicitly set by @option{-fmemory-model=}.
+
@item case-values-threshold
The smallest number of different values for which it is best to use a
jump-table instead of a tree of conditional branches. If the value is
Index: machmode.h
===================================================================
--- machmode.h (revision 176280)
+++ machmode.h (working copy)
@@ -248,7 +248,10 @@ extern enum machine_mode mode_for_vector
/* Find the best mode to use to access a bit field. */
-extern enum machine_mode get_best_mode (int, int, unsigned int,
+extern enum machine_mode get_best_mode (int, int,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ unsigned int,
enum machine_mode, int);
/* Determine alignment, 1<=result<=BIGGEST_ALIGNMENT. */
Index: tree.h
===================================================================
--- tree.h (revision 176280)
+++ tree.h (working copy)
@@ -3213,6 +3213,10 @@ struct GTY(()) tree_parm_decl {
#define DECL_THREAD_LOCAL_P(NODE) \
(VAR_DECL_CHECK (NODE)->decl_with_vis.tls_model >= TLS_MODEL_REAL)
+/* Return true if a VAR_DECL is visible from another thread. */
+#define DECL_THREAD_VISIBLE_P(NODE) \
+ (TREE_STATIC (NODE) && !DECL_THREAD_LOCAL_P (NODE))
+
/* In a non-local VAR_DECL with static storage duration, true if the
variable has an initialization priority. If false, the variable
will be initialized at the DEFAULT_INIT_PRIORITY. */
Index: fold-const.c
===================================================================
--- fold-const.c (revision 176280)
+++ fold-const.c (working copy)
@@ -3394,7 +3394,7 @@ optimize_bit_field_compare (location_t l
&& flag_strict_volatile_bitfields > 0)
nmode = lmode;
else
- nmode = get_best_mode (lbitsize, lbitpos,
+ nmode = get_best_mode (lbitsize, lbitpos, 0, 0,
const_p ? TYPE_ALIGN (TREE_TYPE (linner))
: MIN (TYPE_ALIGN (TREE_TYPE (linner)),
TYPE_ALIGN (TREE_TYPE (rinner))),
@@ -5222,7 +5222,7 @@ fold_truthop (location_t loc, enum tree_
to be relative to a field of that size. */
first_bit = MIN (ll_bitpos, rl_bitpos);
end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize);
- lnmode = get_best_mode (end_bit - first_bit, first_bit,
+ lnmode = get_best_mode (end_bit - first_bit, first_bit, 0, 0,
TYPE_ALIGN (TREE_TYPE (ll_inner)), word_mode,
volatilep);
if (lnmode == VOIDmode)
@@ -5287,7 +5287,7 @@ fold_truthop (location_t loc, enum tree_
first_bit = MIN (lr_bitpos, rr_bitpos);
end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize);
- rnmode = get_best_mode (end_bit - first_bit, first_bit,
+ rnmode = get_best_mode (end_bit - first_bit, first_bit, 0, 0,
TYPE_ALIGN (TREE_TYPE (lr_inner)), word_mode,
volatilep);
if (rnmode == VOIDmode)
Index: params.h
===================================================================
--- params.h (revision 176280)
+++ params.h (working copy)
@@ -211,4 +211,6 @@ extern void init_param_values (int *para
PARAM_VALUE (PARAM_MIN_NONDEBUG_INSN_UID)
#define MAX_STORES_TO_SINK \
PARAM_VALUE (PARAM_MAX_STORES_TO_SINK)
+#define ALLOW_STORE_DATA_RACES \
+ PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES)
#endif /* ! GCC_PARAMS_H */
Index: testsuite/gcc.dg/20110509.c
===================================================================
--- testsuite/gcc.dg/20110509.c (revision 0)
+++ testsuite/gcc.dg/20110509.c (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+/* Test that we don't store past VAR.A. */
+
+struct S
+{
+ volatile unsigned int a : 4;
+ unsigned char b;
+ unsigned int c : 6;
+} var;
+
+void set_a()
+{
+ var.a = 12;
+}
+
+/* { dg-final { scan-assembler-not "movl.*, var" } } */
Index: testsuite/gcc.dg/20110509-2.c
===================================================================
--- testsuite/gcc.dg/20110509-2.c (revision 0)
+++ testsuite/gcc.dg/20110509-2.c (revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+/* Test that we don't store past VAR.K. */
+
+struct S
+{
+ volatile int i;
+ volatile int j: 32;
+ volatile int k: 15;
+ volatile char c[2];
+} var;
+
+void setit()
+{
+ var.k = 13;
+}
+
+/* { dg-final { scan-assembler-not "movl.*, var" } } */
Index: testsuite/gcc.dg/20110509-3.c
===================================================================
--- testsuite/gcc.dg/20110509-3.c (revision 0)
+++ testsuite/gcc.dg/20110509-3.c (revision 0)
@@ -0,0 +1,21 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+/* Make sure we don't narrow down to a QI or HI to store into VAR.J,
+ but instead use an SI. */
+
+struct S
+{
+ volatile int i: 4;
+ volatile int j: 4;
+ volatile int k: 8;
+ volatile int l: 8;
+ volatile int m: 8;
+} var;
+
+void setit()
+{
+ var.j = 5;
+}
+
+/* { dg-final { scan-assembler "movl.*, var" } } */
Index: ifcvt.c
===================================================================
--- ifcvt.c (revision 176280)
+++ ifcvt.c (working copy)
@@ -885,7 +885,7 @@ noce_emit_move_insn (rtx x, rtx y)
}
gcc_assert (start < (MEM_P (op) ? BITS_PER_UNIT : BITS_PER_WORD));
- store_bit_field (op, size, start, GET_MODE (x), y);
+ store_bit_field (op, size, start, 0, 0, GET_MODE (x), y);
return;
}
@@ -939,7 +939,8 @@ noce_emit_move_insn (rtx x, rtx y)
inner = XEXP (outer, 0);
outmode = GET_MODE (outer);
bitpos = SUBREG_BYTE (outer) * BITS_PER_UNIT;
- store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos, outmode, y);
+ store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos,
+ 0, 0, outmode, y);
}
/* Return sequence of instructions generated by if conversion. This
Index: expr.c
===================================================================
--- expr.c (revision 176280)
+++ expr.c (working copy)
@@ -55,6 +55,7 @@ along with GCC; see the file COPYING3.
#include "diagnostic.h"
#include "ssaexpand.h"
#include "target-globals.h"
+#include "params.h"
/* Decide whether a function's arguments should be processed
from first to last or from last to first.
@@ -143,7 +144,9 @@ static void store_constructor_field (rtx
HOST_WIDE_INT, enum machine_mode,
tree, tree, int, alias_set_type);
static void store_constructor (tree, rtx, int, HOST_WIDE_INT);
-static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT, enum machine_mode,
+static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT,
+ enum machine_mode,
tree, tree, alias_set_type, bool);
static unsigned HOST_WIDE_INT highest_pow2_factor_for_target (const_tree, const_tree);
@@ -2074,7 +2077,7 @@ emit_group_store (rtx orig_dst, rtx src,
emit_move_insn (adjust_address (dest, mode, bytepos), tmps[i]);
else
store_bit_field (dest, bytelen * BITS_PER_UNIT, bytepos * BITS_PER_UNIT,
- mode, tmps[i]);
+ 0, 0, mode, tmps[i]);
}
/* Copy from the pseudo into the (probable) hard reg. */
@@ -2168,7 +2171,7 @@ copy_blkmode_from_reg (rtx tgtblk, rtx s
/* Use xbitpos for the source extraction (right justified) and
bitpos for the destination store (left justified). */
- store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, copy_mode,
+ store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, 0, 0, copy_mode,
extract_bit_field (src, bitsize,
xbitpos % BITS_PER_WORD, 1, false,
NULL_RTX, copy_mode, copy_mode));
@@ -2805,7 +2808,7 @@ write_complex_part (rtx cplx, rtx val, b
gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD);
}
- store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, imode, val);
+ store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, 0, 0, imode, val);
}
/* Extract one of the components of the complex value CPLX. Extract the
@@ -3940,6 +3943,8 @@ get_subtarget (rtx x)
static bool
optimize_bitfield_assignment_op (unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
enum machine_mode mode1, rtx str_rtx,
tree to, tree src)
{
@@ -4001,6 +4006,7 @@ optimize_bitfield_assignment_op (unsigne
if (str_bitsize == 0 || str_bitsize > BITS_PER_WORD)
str_mode = word_mode;
str_mode = get_best_mode (bitsize, bitpos,
+ bitregion_start, bitregion_end,
MEM_ALIGN (str_rtx), str_mode, 0);
if (str_mode == VOIDmode)
return false;
@@ -4109,6 +4115,111 @@ optimize_bitfield_assignment_op (unsigne
return false;
}
+/* In the C++ memory model, consecutive bit fields in a structure are
+ considered one memory location.
+
+ Given a COMPONENT_REF, this function returns the bit range of
+ consecutive bits to which this COMPONENT_REF belongs. The values
+ are returned in *BITSTART and *BITEND. If either the C++ memory
+ model is not activated, or this memory access is not thread
+ visible, 0 is returned in both *BITSTART and *BITEND.
+
+ EXP is the COMPONENT_REF.
+ INNERDECL is the actual object being referenced.
+ BITPOS is the position in bits where the bit starts within the structure.
+ BITSIZE is size in bits of the field being referenced in EXP.
+
+ For example, while storing into FOO.A here...
+
+ struct {
+ BIT 0:
+ unsigned int a : 4;
+ unsigned int b : 1;
+ BIT 8:
+ unsigned char c;
+ unsigned int d : 6;
+ } foo;
+
+ ...we are not allowed to store past <b>, so for the layout above we
+ return a range of 0..7 (because no one cares if we store into the
+ padding). */
+
+static void
+get_bit_range (unsigned HOST_WIDE_INT *bitstart,
+ unsigned HOST_WIDE_INT *bitend,
+ tree exp, tree innerdecl,
+ HOST_WIDE_INT bitpos, HOST_WIDE_INT bitsize)
+{
+ tree field, record_type, fld;
+ bool found_field = false;
+ bool prev_field_is_bitfield;
+
+ gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
+
+ /* If other threads can't see this value, no need to restrict stores. */
+ if (ALLOW_STORE_DATA_RACES
+ || !DECL_THREAD_VISIBLE_P (innerdecl))
+ {
+ *bitstart = *bitend = 0;
+ return;
+ }
+
+ /* Bit field we're storing into. */
+ field = TREE_OPERAND (exp, 1);
+ record_type = DECL_FIELD_CONTEXT (field);
+
+ /* Count the contiguous bitfields for the memory location that
+ contains FIELD. */
+ *bitstart = 0;
+ prev_field_is_bitfield = true;
+ for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld))
+ {
+ tree t, offset;
+ enum machine_mode mode;
+ int unsignedp, volatilep;
+
+ if (TREE_CODE (fld) != FIELD_DECL)
+ continue;
+
+ t = build3 (COMPONENT_REF, TREE_TYPE (exp),
+ unshare_expr (TREE_OPERAND (exp, 0)),
+ fld, NULL_TREE);
+ get_inner_reference (t, &bitsize, &bitpos, &offset,
+ &mode, &unsignedp, &volatilep, true);
+
+ if (field == fld)
+ found_field = true;
+
+ if (DECL_BIT_FIELD_TYPE (fld) && bitsize > 0)
+ {
+ if (prev_field_is_bitfield == false)
+ {
+ *bitstart = bitpos;
+ prev_field_is_bitfield = true;
+ }
+ }
+ else
+ {
+ prev_field_is_bitfield = false;
+ if (found_field)
+ break;
+ }
+ }
+ gcc_assert (found_field);
+
+ if (fld)
+ {
+ /* We found the end of the bit field sequence. Include the
+ padding up to the next field and be done. */
+ *bitend = bitpos - 1;
+ }
+ else
+ {
+ /* If this is the last element in the structure, include the padding
+ at the end of structure. */
+ *bitend = TREE_INT_CST_LOW (TYPE_SIZE (record_type));
+ }
+}
/* Expand an assignment that stores the value of FROM into TO. If NONTEMPORAL
is true, try generating a nontemporal store. */
@@ -4208,6 +4319,8 @@ expand_assignment (tree to, tree from, b
{
enum machine_mode mode1;
HOST_WIDE_INT bitsize, bitpos;
+ unsigned HOST_WIDE_INT bitregion_start = 0;
+ unsigned HOST_WIDE_INT bitregion_end = 0;
tree offset;
int unsignedp;
int volatilep = 0;
@@ -4217,6 +4330,11 @@ expand_assignment (tree to, tree from, b
tem = get_inner_reference (to, &bitsize, &bitpos, &offset, &mode1,
&unsignedp, &volatilep, true);
+ if (TREE_CODE (to) == COMPONENT_REF
+ && DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
+ get_bit_range (&bitregion_start, &bitregion_end,
+ to, tem, bitpos, bitsize);
+
/* If we are going to use store_bit_field and extract_bit_field,
make sure to_rtx will be safe for multiple use. */
@@ -4298,11 +4416,14 @@ expand_assignment (tree to, tree from, b
nontemporal);
else if (bitpos + bitsize <= mode_bitsize / 2)
result = store_field (XEXP (to_rtx, 0), bitsize, bitpos,
+ bitregion_start, bitregion_end,
mode1, from, TREE_TYPE (tem),
get_alias_set (to), nontemporal);
else if (bitpos >= mode_bitsize / 2)
result = store_field (XEXP (to_rtx, 1), bitsize,
- bitpos - mode_bitsize / 2, mode1, from,
+ bitpos - mode_bitsize / 2,
+ bitregion_start, bitregion_end,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
else if (bitpos == 0 && bitsize == mode_bitsize)
@@ -4323,7 +4444,9 @@ expand_assignment (tree to, tree from, b
0);
write_complex_part (temp, XEXP (to_rtx, 0), false);
write_complex_part (temp, XEXP (to_rtx, 1), true);
- result = store_field (temp, bitsize, bitpos, mode1, from,
+ result = store_field (temp, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
emit_move_insn (XEXP (to_rtx, 0), read_complex_part (temp, false));
@@ -4348,11 +4471,15 @@ expand_assignment (tree to, tree from, b
MEM_KEEP_ALIAS_SET_P (to_rtx) = 1;
}
- if (optimize_bitfield_assignment_op (bitsize, bitpos, mode1,
+ if (optimize_bitfield_assignment_op (bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode1,
to_rtx, to, from))
result = NULL;
else
- result = store_field (to_rtx, bitsize, bitpos, mode1, from,
+ result = store_field (to_rtx, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
}
@@ -4745,7 +4872,7 @@ store_expr (tree exp, rtx target, int ca
: BLOCK_OP_NORMAL));
else if (GET_MODE (target) == BLKmode)
store_bit_field (target, INTVAL (expr_size (exp)) * BITS_PER_UNIT,
- 0, GET_MODE (temp), temp);
+ 0, 0, 0, GET_MODE (temp), temp);
else
convert_move (target, temp, unsignedp);
}
@@ -5210,7 +5337,8 @@ store_constructor_field (rtx target, uns
store_constructor (exp, target, cleared, bitsize / BITS_PER_UNIT);
}
else
- store_field (target, bitsize, bitpos, mode, exp, type, alias_set, false);
+ store_field (target, bitsize, bitpos, 0, 0, mode, exp, type, alias_set,
+ false);
}
/* Store the value of constructor EXP into the rtx TARGET.
@@ -5784,6 +5912,11 @@ store_constructor (tree exp, rtx target,
BITSIZE bits, starting BITPOS bits from the start of TARGET.
If MODE is VOIDmode, it means that we are storing into a bit-field.
+ BITREGION_START is the bit position of the first bit in this region.
+ BITREGION_END is the bit position of the last bit in this region.
+ These two arguments are 0 if the C++ memory model does not apply,
+ or if we are not interested in keeping track of bitfield regions.
+
Always return const0_rtx unless we have something particular to
return.
@@ -5797,6 +5930,8 @@ store_constructor (tree exp, rtx target,
static rtx
store_field (rtx target, HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
enum machine_mode mode, tree exp, tree type,
alias_set_type alias_set, bool nontemporal)
{
@@ -5829,8 +5964,9 @@ store_field (rtx target, HOST_WIDE_INT b
if (bitsize != (HOST_WIDE_INT) GET_MODE_BITSIZE (GET_MODE (target)))
emit_move_insn (object, target);
- store_field (blk_object, bitsize, bitpos, mode, exp, type, alias_set,
- nontemporal);
+ store_field (blk_object, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode, exp, type, alias_set, nontemporal);
emit_move_insn (target, object);
@@ -5944,7 +6080,9 @@ store_field (rtx target, HOST_WIDE_INT b
}
/* Store the value in the bitfield. */
- store_bit_field (target, bitsize, bitpos, mode, temp);
+ store_bit_field (target, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode, temp);
return const0_rtx;
}
@@ -7354,7 +7492,7 @@ expand_expr_real_2 (sepops ops, rtx targ
(treeop0))
* BITS_PER_UNIT),
(HOST_WIDE_INT) GET_MODE_BITSIZE (mode)),
- 0, TYPE_MODE (valtype), treeop0,
+ 0, 0, 0, TYPE_MODE (valtype), treeop0,
type, 0, false);
}
Index: expr.h
===================================================================
--- expr.h (revision 176280)
+++ expr.h (working copy)
@@ -665,7 +665,10 @@ extern enum machine_mode
mode_for_extraction (enum extraction_pattern, int);
extern void store_bit_field (rtx, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, enum machine_mode, rtx);
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ enum machine_mode, rtx);
extern rtx extract_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT, int, bool, rtx,
enum machine_mode, enum machine_mode);
Index: stor-layout.c
===================================================================
--- stor-layout.c (revision 176280)
+++ stor-layout.c (working copy)
@@ -2361,6 +2361,13 @@ fixup_unsigned_type (tree type)
/* Find the best machine mode to use when referencing a bit field of length
BITSIZE bits starting at BITPOS.
+ BITREGION_START is the bit position of the first bit in this
+ sequence of bit fields. BITREGION_END is the last bit in this
+ sequence. If these two values are non-zero, we should restrict
+ the memory access to a chunk of at most
+ BITREGION_END - BITREGION_START + 1 bits. Otherwise, we are allowed
+ to touch any adjacent non bit-fields.
+
The underlying object is known to be aligned to a boundary of ALIGN bits.
If LARGEST_MODE is not VOIDmode, it means that we should not use a mode
larger than LARGEST_MODE (usually SImode).
@@ -2378,11 +2385,21 @@ fixup_unsigned_type (tree type)
decide which of the above modes should be used. */
enum machine_mode
-get_best_mode (int bitsize, int bitpos, unsigned int align,
+get_best_mode (int bitsize, int bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ unsigned int align,
enum machine_mode largest_mode, int volatilep)
{
enum machine_mode mode;
unsigned int unit = 0;
+ unsigned HOST_WIDE_INT maxbits;
+
+ /* If unset, no restriction. */
+ if (!bitregion_end)
+ maxbits = 0;
+ else
+ maxbits = (bitregion_end - bitregion_start) % align;
/* Find the narrowest integer mode that contains the bit field. */
for (mode = GET_CLASS_NARROWEST_MODE (MODE_INT); mode != VOIDmode;
@@ -2419,6 +2436,7 @@ get_best_mode (int bitsize, int bitpos,
&& bitpos / unit == (bitpos + bitsize - 1) / unit
&& unit <= BITS_PER_WORD
&& unit <= MIN (align, BIGGEST_ALIGNMENT)
+ && (!maxbits || unit <= maxbits)
&& (largest_mode == VOIDmode
|| unit <= GET_MODE_BITSIZE (largest_mode)))
wide_mode = tmode;
Index: calls.c
===================================================================
--- calls.c (revision 176280)
+++ calls.c (working copy)
@@ -924,8 +924,8 @@ store_unaligned_arguments_into_pseudos (
emit_move_insn (reg, const0_rtx);
bytes -= bitsize / BITS_PER_UNIT;
- store_bit_field (reg, bitsize, endian_correction, word_mode,
- word);
+ store_bit_field (reg, bitsize, endian_correction, 0, 0,
+ word_mode, word);
}
}
}
Index: expmed.c
===================================================================
--- expmed.c (revision 176280)
+++ expmed.c (working copy)
@@ -47,9 +47,15 @@ struct target_expmed *this_target_expmed
static void store_fixed_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, rtx);
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ rtx);
static void store_split_bit_field (rtx, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, rtx);
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ rtx);
static rtx extract_fixed_bit_field (enum machine_mode, rtx,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
@@ -333,7 +339,10 @@ mode_for_extraction (enum extraction_pat
static bool
store_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum, enum machine_mode fieldmode,
+ unsigned HOST_WIDE_INT bitnum,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ enum machine_mode fieldmode,
rtx value, bool fallback_p)
{
unsigned int unit
@@ -547,7 +556,9 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!store_bit_field_1 (op0, MIN (BITS_PER_WORD,
bitsize - i * BITS_PER_WORD),
- bitnum + bit_offset, word_mode,
+ bitnum + bit_offset,
+ bitregion_start, bitregion_end,
+ word_mode,
value_word, fallback_p))
{
delete_insns_since (last);
@@ -710,6 +721,12 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (HAVE_insv && MEM_P (op0))
{
enum machine_mode bestmode;
+ unsigned HOST_WIDE_INT maxbits;
+
+ if (!bitregion_end)
+ maxbits = 0;
+ else
+ maxbits = bitregion_end - bitregion_start;
/* Get the mode to use for inserting into this field. If OP0 is
BLKmode, get the smallest mode consistent with the alignment. If
@@ -717,9 +734,12 @@ store_bit_field_1 (rtx str_rtx, unsigned
mode. Otherwise, use the smallest mode containing the field. */
if (GET_MODE (op0) == BLKmode
+ || (bitregion_end && GET_MODE_BITSIZE (GET_MODE (op0)) > maxbits)
|| (op_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (op_mode)))
- bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
+ bestmode = get_best_mode (bitsize, bitnum,
+ bitregion_start, bitregion_end,
+ MEM_ALIGN (op0),
(op_mode == MAX_MACHINE_MODE
? VOIDmode : op_mode),
MEM_VOLATILE_P (op0));
@@ -748,6 +768,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
the unit. */
tempreg = copy_to_reg (xop0);
if (store_bit_field_1 (tempreg, bitsize, xbitpos,
+ bitregion_start, bitregion_end,
fieldmode, orig_value, false))
{
emit_move_insn (xop0, tempreg);
@@ -760,21 +781,33 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!fallback_p)
return false;
- store_fixed_bit_field (op0, offset, bitsize, bitpos, value);
+ store_fixed_bit_field (op0, offset, bitsize, bitpos,
+ bitregion_start, bitregion_end, value);
return true;
}
/* Generate code to store value from rtx VALUE
into a bit-field within structure STR_RTX
containing BITSIZE bits starting at bit BITNUM.
+
+ BITREGION_START is the bit position of the first bit in this region.
+ BITREGION_END is the bit position of the last bit in this region.
+ These two arguments are 0 if the C++ memory model does not apply,
+ or if we are not interested in keeping track of bitfield regions.
+
FIELDMODE is the machine-mode of the FIELD_DECL node for this field. */
void
store_bit_field (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum, enum machine_mode fieldmode,
+ unsigned HOST_WIDE_INT bitnum,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ enum machine_mode fieldmode,
rtx value)
{
- if (!store_bit_field_1 (str_rtx, bitsize, bitnum, fieldmode, value, true))
+ if (!store_bit_field_1 (str_rtx, bitsize, bitnum,
+ bitregion_start, bitregion_end,
+ fieldmode, value, true))
gcc_unreachable ();
}
\f
@@ -790,7 +823,10 @@ store_bit_field (rtx str_rtx, unsigned H
static void
store_fixed_bit_field (rtx op0, unsigned HOST_WIDE_INT offset,
unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitpos, rtx value)
+ unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ rtx value)
{
enum machine_mode mode;
unsigned int total_bits = BITS_PER_WORD;
@@ -811,12 +847,23 @@ store_fixed_bit_field (rtx op0, unsigned
/* Special treatment for a bit field split across two registers. */
if (bitsize + bitpos > BITS_PER_WORD)
{
- store_split_bit_field (op0, bitsize, bitpos, value);
+ store_split_bit_field (op0, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ value);
return;
}
}
else
{
+ unsigned HOST_WIDE_INT maxbits;
+
+ if (!bitregion_end)
+ maxbits = 0;
+ else if (1||bitpos + offset * BITS_PER_UNIT < bitregion_start)
+ maxbits = bitregion_end - bitregion_start;
+ else
+ maxbits = bitregion_end - (bitpos + offset * BITS_PER_UNIT) + 1;
+
/* Get the proper mode to use for this field. We want a mode that
includes the entire field. If such a mode would be larger than
a word, we won't be doing the extraction the normal way.
@@ -829,10 +876,12 @@ store_fixed_bit_field (rtx op0, unsigned
if (MEM_VOLATILE_P (op0)
&& GET_MODE_BITSIZE (GET_MODE (op0)) > 0
+ && GET_MODE_BITSIZE (GET_MODE (op0)) <= maxbits
&& flag_strict_volatile_bitfields > 0)
mode = GET_MODE (op0);
else
mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
+ bitregion_start, bitregion_end,
MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
@@ -840,7 +889,7 @@ store_fixed_bit_field (rtx op0, unsigned
/* The only way this should occur is if the field spans word
boundaries. */
store_split_bit_field (op0, bitsize, bitpos + offset * BITS_PER_UNIT,
- value);
+ bitregion_start, bitregion_end, value);
return;
}
@@ -960,7 +1009,10 @@ store_fixed_bit_field (rtx op0, unsigned
static void
store_split_bit_field (rtx op0, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitpos, rtx value)
+ unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ rtx value)
{
unsigned int unit;
unsigned int bitsdone = 0;
@@ -1075,7 +1127,7 @@ store_split_bit_field (rtx op0, unsigned
it is just an out-of-bounds access. Ignore it. */
if (word != const0_rtx)
store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize,
- thispos, part);
+ thispos, bitregion_start, bitregion_end, part);
bitsdone += thissize;
}
}
@@ -1515,7 +1567,7 @@ extract_bit_field_1 (rtx str_rtx, unsign
if (GET_MODE (op0) == BLKmode
|| (ext_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (ext_mode)))
- bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
+ bestmode = get_best_mode (bitsize, bitnum, 0, 0, MEM_ALIGN (op0),
(ext_mode == MAX_MACHINE_MODE
? VOIDmode : ext_mode),
MEM_VOLATILE_P (op0));
@@ -1641,7 +1693,7 @@ extract_fixed_bit_field (enum machine_mo
mode = tmode;
}
else
- mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
+ mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT, 0, 0,
MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
Index: Makefile.in
===================================================================
--- Makefile.in (revision 176280)
+++ Makefile.in (working copy)
@@ -2908,7 +2908,7 @@ expr.o : expr.c $(CONFIG_H) $(SYSTEM_H)
reload.h langhooks.h intl.h $(TM_P_H) $(TARGET_H) \
tree-iterator.h gt-expr.h $(MACHMODE_H) $(TIMEVAR_H) $(TREE_FLOW_H) \
$(TREE_PASS_H) $(DF_H) $(DIAGNOSTIC_H) vecprim.h $(SSAEXPAND_H) \
- $(COMMON_TARGET_H)
+ $(PARAMS_H) $(COMMON_TARGET_H)
dojump.o : dojump.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \
$(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(OPTABS_H) $(INSN_ATTR_H) insn-config.h \
langhooks.h $(GGC_H) gt-dojump.h vecprim.h $(BASIC_BLOCK_H) output.h
Index: stmt.c
===================================================================
--- stmt.c (revision 176280)
+++ stmt.c (working copy)
@@ -1759,7 +1759,8 @@ expand_return (tree retval)
/* Use bitpos for the source extraction (left justified) and
xbitpos for the destination store (right justified). */
- store_bit_field (dst, bitsize, xbitpos % BITS_PER_WORD, word_mode,
+ store_bit_field (dst, bitsize, xbitpos % BITS_PER_WORD,
+ 0, 0, word_mode,
extract_bit_field (src, bitsize,
bitpos % BITS_PER_WORD, 1, false,
NULL_RTX, word_mode, word_mode));
Index: params.def
===================================================================
--- params.def (revision 176280)
+++ params.def (working copy)
@@ -902,6 +902,12 @@ DEFPARAM (PARAM_CASE_VALUES_THRESHOLD,
"if 0, use the default for the machine",
0, 0, 0)
+/* Data race flags for C++0x memory model compliance. */
+DEFPARAM (PARAM_ALLOW_STORE_DATA_RACES,
+ "allow-store-data-races",
+ "Allow new data races on stores to be introduced",
+ 1, 0, 1)
+
/*
Local variables:
* Re: [C++0x] contiguous bitfields race implementation
2011-07-18 13:10 ` Aldy Hernandez
@ 2011-07-22 19:16 ` Jason Merrill
2011-07-25 17:41 ` Aldy Hernandez
0 siblings, 1 reply; 81+ messages in thread
From: Jason Merrill @ 2011-07-22 19:16 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
On 07/18/2011 08:02 AM, Aldy Hernandez wrote:
> + /* If other threads can't see this value, no need to restrict stores. */
> + if (ALLOW_STORE_DATA_RACES
> + || !DECL_THREAD_VISIBLE_P (innerdecl))
> + {
> + *bitstart = *bitend = 0;
> + return;
> + }
What if get_inner_reference returns something that isn't a DECL, such as
an INDIRECT_REF?
> + if (fld)
> + {
> + /* We found the end of the bit field sequence. Include the
> + padding up to the next field and be done. */
> + *bitend = bitpos - 1;
> + }
bitpos is the position of "field", and it seems to me we want the
position of "fld" here.
> + /* If unset, no restriction. */
> + if (!bitregion_end)
> + maxbits = 0;
> + else
> + maxbits = (bitregion_end - bitregion_start) % align;
Maybe use MAX_FIXED_MODE_SIZE so you don't have to test it against 0?
> + if (!bitregion_end)
> + maxbits = 0;
> + else if (1||bitpos + offset * BITS_PER_UNIT < bitregion_start)
> + maxbits = bitregion_end - bitregion_start;
> + else
> + maxbits = bitregion_end - (bitpos + offset * BITS_PER_UNIT) + 1;
I assume the 1|| was there for debugging?
Surely bitpos+offset*BITS_PER_UNIT, which would be the bit position of
the bit-field, must be within [bitregion_start,bitregion_end)?
Jason
* Re: [C++0x] contiguous bitfields race implementation
2011-07-22 19:16 ` Jason Merrill
@ 2011-07-25 17:41 ` Aldy Hernandez
2011-07-26 5:28 ` Jason Merrill
2011-07-27 18:24 ` H.J. Lu
0 siblings, 2 replies; 81+ messages in thread
From: Aldy Hernandez @ 2011-07-25 17:41 UTC (permalink / raw)
To: Jason Merrill; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 2132 bytes --]
On 07/22/11 13:44, Jason Merrill wrote:
> On 07/18/2011 08:02 AM, Aldy Hernandez wrote:
>> + /* If other threads can't see this value, no need to restrict
>> stores. */
>> + if (ALLOW_STORE_DATA_RACES
>> + || !DECL_THREAD_VISIBLE_P (innerdecl))
>> + {
>> + *bitstart = *bitend = 0;
>> + return;
>> + }
>
> What if get_inner_reference returns something that isn't a DECL, such as
> an INDIRECT_REF?
I had changed this already to take into account aliasing, so if we get
an INDIRECT_REF, ptr_deref_may_alias_global_p() returns true, and we
proceed with the restriction:
+ /* If other threads can't see this value, no need to restrict stores. */
+ if (ALLOW_STORE_DATA_RACES
+ || (!ptr_deref_may_alias_global_p (innerdecl)
+ && (DECL_THREAD_LOCAL_P (innerdecl)
+ || !TREE_STATIC (innerdecl))))
>> + if (fld)
>> + {
>> + /* We found the end of the bit field sequence. Include the
>> + padding up to the next field and be done. */
>> + *bitend = bitpos - 1;
>> + }
>
> bitpos is the position of "field", and it seems to me we want the
> position of "fld" here.
Notice that bitpos gets recalculated at each iteration by
get_inner_reference, so bitpos is actually the position of fld.
>> + /* If unset, no restriction. */
>> + if (!bitregion_end)
>> + maxbits = 0;
>> + else
>> + maxbits = (bitregion_end - bitregion_start) % align;
>
> Maybe use MAX_FIXED_MODE_SIZE so you don't have to test it against 0?
Fixed everywhere.
>> + if (!bitregion_end)
>> + maxbits = 0;
>> + else if (1||bitpos + offset * BITS_PER_UNIT < bitregion_start)
>> + maxbits = bitregion_end - bitregion_start;
>> + else
>> + maxbits = bitregion_end - (bitpos + offset * BITS_PER_UNIT) + 1;
>
> I assume the 1|| was there for debugging?
Fixed, plus I adjusted the calculation of maxbits everywhere because I
found an off-by-one error.
I have also overhauled store_bit_field() to adjust the address it is
given to point to the beginning of the bit region. This fixed a
myriad of corner cases pointed out by a test Hans Boehm was kind enough
to provide.
I have added more tests.
How does this look? (Pending tests.)
[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 32883 bytes --]
* params.h (ALLOW_STORE_DATA_RACES): New.
* params.def (PARAM_ALLOW_STORE_DATA_RACES): New.
* Makefile.in (expr.o): Depend on PARAMS_H.
* machmode.h (get_best_mode): Add argument.
* fold-const.c (optimize_bit_field_compare): Add argument to
get_best_mode.
(fold_truthop): Same.
* ifcvt.c (noce_emit_move_insn): Add argument to store_bit_field.
* expr.c (emit_group_store): Same.
(copy_blkmode_from_reg): Same.
(write_complex_part): Same.
(optimize_bitfield_assignment_op): Add argument.
Add argument to get_best_mode.
(get_bit_range): New.
(expand_assignment): Calculate maxbits and pass it down
accordingly.
(store_field): New argument.
(expand_expr_real_2): New argument to store_field.
Include params.h.
* expr.h (store_bit_field): New argument.
* stor-layout.c (get_best_mode): Restrict mode expansion by taking
into account maxbits.
* calls.c (store_unaligned_arguments_into_pseudos): New argument
to store_bit_field.
* expmed.c (store_bit_field_1): New argument. Use it.
(store_bit_field): Same.
(store_fixed_bit_field): Same.
(store_split_bit_field): Same.
(extract_bit_field_1): Pass new argument to get_best_mode.
(extract_bit_field): Same.
* stmt.c (store_bit_field): Pass new argument to store_bit_field.
* doc/invoke.texi: Document parameter allow-store-data-races.
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi (revision 176280)
+++ doc/invoke.texi (working copy)
@@ -9027,6 +9027,11 @@ The maximum number of conditional stores
if either vectorization (@option{-ftree-vectorize}) or if-conversion
(@option{-ftree-loop-if-convert}) is disabled. The default is 2.
+@item allow-store-data-races
+Allow optimizers to introduce new data races on stores.
+Set to 1 to allow, otherwise to 0. This option is enabled by default
+unless implicitly set by the @option{-fmemory-model=} option.
+
@item case-values-threshold
The smallest number of different values for which it is best to use a
jump-table instead of a tree of conditional branches. If the value is
Index: machmode.h
===================================================================
--- machmode.h (revision 176280)
+++ machmode.h (working copy)
@@ -248,7 +248,10 @@ extern enum machine_mode mode_for_vector
/* Find the best mode to use to access a bit field. */
-extern enum machine_mode get_best_mode (int, int, unsigned int,
+extern enum machine_mode get_best_mode (int, int,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ unsigned int,
enum machine_mode, int);
/* Determine alignment, 1<=result<=BIGGEST_ALIGNMENT. */
Index: fold-const.c
===================================================================
--- fold-const.c (revision 176280)
+++ fold-const.c (working copy)
@@ -3394,7 +3394,7 @@ optimize_bit_field_compare (location_t l
&& flag_strict_volatile_bitfields > 0)
nmode = lmode;
else
- nmode = get_best_mode (lbitsize, lbitpos,
+ nmode = get_best_mode (lbitsize, lbitpos, 0, 0,
const_p ? TYPE_ALIGN (TREE_TYPE (linner))
: MIN (TYPE_ALIGN (TREE_TYPE (linner)),
TYPE_ALIGN (TREE_TYPE (rinner))),
@@ -5222,7 +5222,7 @@ fold_truthop (location_t loc, enum tree_
to be relative to a field of that size. */
first_bit = MIN (ll_bitpos, rl_bitpos);
end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize);
- lnmode = get_best_mode (end_bit - first_bit, first_bit,
+ lnmode = get_best_mode (end_bit - first_bit, first_bit, 0, 0,
TYPE_ALIGN (TREE_TYPE (ll_inner)), word_mode,
volatilep);
if (lnmode == VOIDmode)
@@ -5287,7 +5287,7 @@ fold_truthop (location_t loc, enum tree_
first_bit = MIN (lr_bitpos, rr_bitpos);
end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize);
- rnmode = get_best_mode (end_bit - first_bit, first_bit,
+ rnmode = get_best_mode (end_bit - first_bit, first_bit, 0, 0,
TYPE_ALIGN (TREE_TYPE (lr_inner)), word_mode,
volatilep);
if (rnmode == VOIDmode)
Index: params.h
===================================================================
--- params.h (revision 176280)
+++ params.h (working copy)
@@ -211,4 +211,6 @@ extern void init_param_values (int *para
PARAM_VALUE (PARAM_MIN_NONDEBUG_INSN_UID)
#define MAX_STORES_TO_SINK \
PARAM_VALUE (PARAM_MAX_STORES_TO_SINK)
+#define ALLOW_STORE_DATA_RACES \
+ PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES)
#endif /* ! GCC_PARAMS_H */
Index: testsuite/gcc.dg/20110509-4.c
===================================================================
--- testsuite/gcc.dg/20110509-4.c (revision 0)
+++ testsuite/gcc.dg/20110509-4.c (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+struct bits
+{
+ char a;
+ int b:7;
+ int c:9;
+ unsigned char d;
+} x;
+
+/* Store into <c> should not clobber <d>. */
+void update_c(struct bits *p, int val)
+{
+ p -> c = val;
+}
+
+/* { dg-final { scan-assembler-not "movl" } } */
Index: testsuite/gcc.dg/20110509.c
===================================================================
--- testsuite/gcc.dg/20110509.c (revision 0)
+++ testsuite/gcc.dg/20110509.c (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+/* Test that we don't store past VAR.A. */
+
+struct S
+{
+ volatile unsigned int a : 4;
+ unsigned char b;
+ unsigned int c : 6;
+} var;
+
+void set_a()
+{
+ var.a = 12;
+}
+
+/* { dg-final { scan-assembler-not "movl.*, var" } } */
Index: testsuite/gcc.dg/20110509-2.c
===================================================================
--- testsuite/gcc.dg/20110509-2.c (revision 0)
+++ testsuite/gcc.dg/20110509-2.c (revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+/* Test that we don't store past VAR.K. */
+
+struct S
+{
+ volatile int i;
+ volatile int j: 32;
+ volatile int k: 15;
+ volatile char c[2];
+} var;
+
+void setit()
+{
+ var.k = 13;
+}
+
+/* { dg-final { scan-assembler-not "movl.*, var" } } */
Index: testsuite/gcc.dg/20110509-3.c
===================================================================
--- testsuite/gcc.dg/20110509-3.c (revision 0)
+++ testsuite/gcc.dg/20110509-3.c (revision 0)
@@ -0,0 +1,21 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+/* Make sure we don't narrow down to a QI or HI to store into VAR.J,
+ but instead use an SI. */
+
+struct S
+{
+ volatile int i: 4;
+ volatile int j: 4;
+ volatile int k: 8;
+ volatile int l: 8;
+ volatile int m: 8;
+} var;
+
+void setit()
+{
+ var.j = 5;
+}
+
+/* { dg-final { scan-assembler "movl.*, var" } } */
Index: ifcvt.c
===================================================================
--- ifcvt.c (revision 176280)
+++ ifcvt.c (working copy)
@@ -885,7 +885,7 @@ noce_emit_move_insn (rtx x, rtx y)
}
gcc_assert (start < (MEM_P (op) ? BITS_PER_UNIT : BITS_PER_WORD));
- store_bit_field (op, size, start, GET_MODE (x), y);
+ store_bit_field (op, size, start, 0, 0, GET_MODE (x), y);
return;
}
@@ -939,7 +939,8 @@ noce_emit_move_insn (rtx x, rtx y)
inner = XEXP (outer, 0);
outmode = GET_MODE (outer);
bitpos = SUBREG_BYTE (outer) * BITS_PER_UNIT;
- store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos, outmode, y);
+ store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos,
+ 0, 0, outmode, y);
}
/* Return sequence of instructions generated by if conversion. This
Index: expr.c
===================================================================
--- expr.c (revision 176280)
+++ expr.c (working copy)
@@ -55,6 +55,7 @@ along with GCC; see the file COPYING3.
#include "diagnostic.h"
#include "ssaexpand.h"
#include "target-globals.h"
+#include "params.h"
/* Decide whether a function's arguments should be processed
from first to last or from last to first.
@@ -143,7 +144,9 @@ static void store_constructor_field (rtx
HOST_WIDE_INT, enum machine_mode,
tree, tree, int, alias_set_type);
static void store_constructor (tree, rtx, int, HOST_WIDE_INT);
-static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT, enum machine_mode,
+static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT,
+ enum machine_mode,
tree, tree, alias_set_type, bool);
static unsigned HOST_WIDE_INT highest_pow2_factor_for_target (const_tree, const_tree);
@@ -2074,7 +2077,7 @@ emit_group_store (rtx orig_dst, rtx src,
emit_move_insn (adjust_address (dest, mode, bytepos), tmps[i]);
else
store_bit_field (dest, bytelen * BITS_PER_UNIT, bytepos * BITS_PER_UNIT,
- mode, tmps[i]);
+ 0, 0, mode, tmps[i]);
}
/* Copy from the pseudo into the (probable) hard reg. */
@@ -2168,7 +2171,7 @@ copy_blkmode_from_reg (rtx tgtblk, rtx s
/* Use xbitpos for the source extraction (right justified) and
bitpos for the destination store (left justified). */
- store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, copy_mode,
+ store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, 0, 0, copy_mode,
extract_bit_field (src, bitsize,
xbitpos % BITS_PER_WORD, 1, false,
NULL_RTX, copy_mode, copy_mode));
@@ -2805,7 +2808,7 @@ write_complex_part (rtx cplx, rtx val, b
gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD);
}
- store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, imode, val);
+ store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, 0, 0, imode, val);
}
/* Extract one of the components of the complex value CPLX. Extract the
@@ -3940,6 +3943,8 @@ get_subtarget (rtx x)
static bool
optimize_bitfield_assignment_op (unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
enum machine_mode mode1, rtx str_rtx,
tree to, tree src)
{
@@ -4001,6 +4006,7 @@ optimize_bitfield_assignment_op (unsigne
if (str_bitsize == 0 || str_bitsize > BITS_PER_WORD)
str_mode = word_mode;
str_mode = get_best_mode (bitsize, bitpos,
+ bitregion_start, bitregion_end,
MEM_ALIGN (str_rtx), str_mode, 0);
if (str_mode == VOIDmode)
return false;
@@ -4109,6 +4115,113 @@ optimize_bitfield_assignment_op (unsigne
return false;
}
+/* In the C++ memory model, consecutive bit fields in a structure are
+ considered one memory location.
+
+ Given a COMPONENT_REF, this function returns the bit range of
+ consecutive bits to which this COMPONENT_REF belongs. The
+ values are returned in *BITSTART and *BITEND. If either the C++
+ memory model is not activated, or this memory access is not thread
+ visible, 0 is returned in *BITSTART and *BITEND.
+
+ EXP is the COMPONENT_REF.
+ INNERDECL is the actual object being referenced.
+ BITPOS is the position in bits where the bit starts within the structure.
+ BITSIZE is size in bits of the field being referenced in EXP.
+
+ For example, while storing into FOO.A here...
+
+ struct {
+ BIT 0:
+ unsigned int a : 4;
+ unsigned int b : 1;
+ BIT 8:
+ unsigned char c;
+ unsigned int d : 6;
+ } foo;
+
+ ...we are not allowed to store past <b>, so for the layout above we
+ return a range of 0..7 (because no one cares if we store into the
+ padding). */
+
+static void
+get_bit_range (unsigned HOST_WIDE_INT *bitstart,
+ unsigned HOST_WIDE_INT *bitend,
+ tree exp, tree innerdecl,
+ HOST_WIDE_INT bitpos, HOST_WIDE_INT bitsize)
+{
+ tree field, record_type, fld;
+ bool found_field = false;
+ bool prev_field_is_bitfield;
+
+ gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
+
+ /* If other threads can't see this value, no need to restrict stores. */
+ if (ALLOW_STORE_DATA_RACES
+ || (!ptr_deref_may_alias_global_p (innerdecl)
+ && (DECL_THREAD_LOCAL_P (innerdecl)
+ || !TREE_STATIC (innerdecl))))
+ {
+ *bitstart = *bitend = 0;
+ return;
+ }
+
+ /* Bit field we're storing into. */
+ field = TREE_OPERAND (exp, 1);
+ record_type = DECL_FIELD_CONTEXT (field);
+
+ /* Count the contiguous bitfields for the memory location that
+ contains FIELD. */
+ *bitstart = 0;
+ prev_field_is_bitfield = true;
+ for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld))
+ {
+ tree t, offset;
+ enum machine_mode mode;
+ int unsignedp, volatilep;
+
+ if (TREE_CODE (fld) != FIELD_DECL)
+ continue;
+
+ t = build3 (COMPONENT_REF, TREE_TYPE (exp),
+ unshare_expr (TREE_OPERAND (exp, 0)),
+ fld, NULL_TREE);
+ get_inner_reference (t, &bitsize, &bitpos, &offset,
+ &mode, &unsignedp, &volatilep, true);
+
+ if (field == fld)
+ found_field = true;
+
+ if (DECL_BIT_FIELD_TYPE (fld) && bitsize > 0)
+ {
+ if (prev_field_is_bitfield == false)
+ {
+ *bitstart = bitpos;
+ prev_field_is_bitfield = true;
+ }
+ }
+ else
+ {
+ prev_field_is_bitfield = false;
+ if (found_field)
+ break;
+ }
+ }
+ gcc_assert (found_field);
+
+ if (fld)
+ {
+ /* We found the end of the bit field sequence. Include the
+ padding up to the next field and be done. */
+ *bitend = bitpos - 1;
+ }
+ else
+ {
+ /* If this is the last element in the structure, include the padding
+ at the end of structure. */
+ *bitend = TREE_INT_CST_LOW (TYPE_SIZE (record_type)) - 1;
+ }
+}
/* Expand an assignment that stores the value of FROM into TO. If NONTEMPORAL
is true, try generating a nontemporal store. */
@@ -4208,6 +4321,8 @@ expand_assignment (tree to, tree from, b
{
enum machine_mode mode1;
HOST_WIDE_INT bitsize, bitpos;
+ unsigned HOST_WIDE_INT bitregion_start = 0;
+ unsigned HOST_WIDE_INT bitregion_end = 0;
tree offset;
int unsignedp;
int volatilep = 0;
@@ -4217,6 +4332,11 @@ expand_assignment (tree to, tree from, b
tem = get_inner_reference (to, &bitsize, &bitpos, &offset, &mode1,
&unsignedp, &volatilep, true);
+ if (TREE_CODE (to) == COMPONENT_REF
+ && DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
+ get_bit_range (&bitregion_start, &bitregion_end,
+ to, tem, bitpos, bitsize);
+
/* If we are going to use store_bit_field and extract_bit_field,
make sure to_rtx will be safe for multiple use. */
@@ -4298,11 +4418,14 @@ expand_assignment (tree to, tree from, b
nontemporal);
else if (bitpos + bitsize <= mode_bitsize / 2)
result = store_field (XEXP (to_rtx, 0), bitsize, bitpos,
+ bitregion_start, bitregion_end,
mode1, from, TREE_TYPE (tem),
get_alias_set (to), nontemporal);
else if (bitpos >= mode_bitsize / 2)
result = store_field (XEXP (to_rtx, 1), bitsize,
- bitpos - mode_bitsize / 2, mode1, from,
+ bitpos - mode_bitsize / 2,
+ bitregion_start, bitregion_end,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
else if (bitpos == 0 && bitsize == mode_bitsize)
@@ -4323,7 +4446,9 @@ expand_assignment (tree to, tree from, b
0);
write_complex_part (temp, XEXP (to_rtx, 0), false);
write_complex_part (temp, XEXP (to_rtx, 1), true);
- result = store_field (temp, bitsize, bitpos, mode1, from,
+ result = store_field (temp, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
emit_move_insn (XEXP (to_rtx, 0), read_complex_part (temp, false));
@@ -4348,11 +4473,15 @@ expand_assignment (tree to, tree from, b
MEM_KEEP_ALIAS_SET_P (to_rtx) = 1;
}
- if (optimize_bitfield_assignment_op (bitsize, bitpos, mode1,
+ if (optimize_bitfield_assignment_op (bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode1,
to_rtx, to, from))
result = NULL;
else
- result = store_field (to_rtx, bitsize, bitpos, mode1, from,
+ result = store_field (to_rtx, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
}
@@ -4745,7 +4874,7 @@ store_expr (tree exp, rtx target, int ca
: BLOCK_OP_NORMAL));
else if (GET_MODE (target) == BLKmode)
store_bit_field (target, INTVAL (expr_size (exp)) * BITS_PER_UNIT,
- 0, GET_MODE (temp), temp);
+ 0, 0, 0, GET_MODE (temp), temp);
else
convert_move (target, temp, unsignedp);
}
@@ -5210,7 +5339,8 @@ store_constructor_field (rtx target, uns
store_constructor (exp, target, cleared, bitsize / BITS_PER_UNIT);
}
else
- store_field (target, bitsize, bitpos, mode, exp, type, alias_set, false);
+ store_field (target, bitsize, bitpos, 0, 0, mode, exp, type, alias_set,
+ false);
}
/* Store the value of constructor EXP into the rtx TARGET.
@@ -5784,6 +5914,11 @@ store_constructor (tree exp, rtx target,
BITSIZE bits, starting BITPOS bits from the start of TARGET.
If MODE is VOIDmode, it means that we are storing into a bit-field.
+ BITREGION_START is bitpos of the first bitfield in this region.
+ BITREGION_END is the bitpos of the ending bitfield in this region.
+ These two fields are 0 if the C++ memory model does not apply,
+ or we are not interested in keeping track of bitfield regions.
+
Always return const0_rtx unless we have something particular to
return.
@@ -5797,6 +5932,8 @@ store_constructor (tree exp, rtx target,
static rtx
store_field (rtx target, HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
enum machine_mode mode, tree exp, tree type,
alias_set_type alias_set, bool nontemporal)
{
@@ -5829,8 +5966,9 @@ store_field (rtx target, HOST_WIDE_INT b
if (bitsize != (HOST_WIDE_INT) GET_MODE_BITSIZE (GET_MODE (target)))
emit_move_insn (object, target);
- store_field (blk_object, bitsize, bitpos, mode, exp, type, alias_set,
- nontemporal);
+ store_field (blk_object, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode, exp, type, alias_set, nontemporal);
emit_move_insn (target, object);
@@ -5944,7 +6082,9 @@ store_field (rtx target, HOST_WIDE_INT b
}
/* Store the value in the bitfield. */
- store_bit_field (target, bitsize, bitpos, mode, temp);
+ store_bit_field (target, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode, temp);
return const0_rtx;
}
@@ -7354,7 +7494,7 @@ expand_expr_real_2 (sepops ops, rtx targ
(treeop0))
* BITS_PER_UNIT),
(HOST_WIDE_INT) GET_MODE_BITSIZE (mode)),
- 0, TYPE_MODE (valtype), treeop0,
+ 0, 0, 0, TYPE_MODE (valtype), treeop0,
type, 0, false);
}
Index: expr.h
===================================================================
--- expr.h (revision 176280)
+++ expr.h (working copy)
@@ -665,7 +665,10 @@ extern enum machine_mode
mode_for_extraction (enum extraction_pattern, int);
extern void store_bit_field (rtx, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, enum machine_mode, rtx);
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ enum machine_mode, rtx);
extern rtx extract_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT, int, bool, rtx,
enum machine_mode, enum machine_mode);
Index: stor-layout.c
===================================================================
--- stor-layout.c (revision 176280)
+++ stor-layout.c (working copy)
@@ -2361,6 +2361,13 @@ fixup_unsigned_type (tree type)
/* Find the best machine mode to use when referencing a bit field of length
BITSIZE bits starting at BITPOS.
+ BITREGION_START is the bit position of the first bit in this
+ sequence of bit fields. BITREGION_END is the last bit in this
+ sequence. If these two fields are non-zero, we should restrict the
+ memory access to a chunk of at most
+ BITREGION_END - BITREGION_START + 1 bits. Otherwise, we are allowed
+ to touch any adjacent non-bit-field members.
+
The underlying object is known to be aligned to a boundary of ALIGN bits.
If LARGEST_MODE is not VOIDmode, it means that we should not use a mode
larger than LARGEST_MODE (usually SImode).
@@ -2378,11 +2385,21 @@ fixup_unsigned_type (tree type)
decide which of the above modes should be used. */
enum machine_mode
-get_best_mode (int bitsize, int bitpos, unsigned int align,
+get_best_mode (int bitsize, int bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ unsigned int align,
enum machine_mode largest_mode, int volatilep)
{
enum machine_mode mode;
unsigned int unit = 0;
+ unsigned HOST_WIDE_INT maxbits;
+
+ /* If unset, no restriction. */
+ if (!bitregion_end)
+ maxbits = MAX_FIXED_MODE_SIZE;
+ else
+ maxbits = (bitregion_end - bitregion_start) % align + 1;
/* Find the narrowest integer mode that contains the bit field. */
for (mode = GET_CLASS_NARROWEST_MODE (MODE_INT); mode != VOIDmode;
@@ -2419,6 +2436,7 @@ get_best_mode (int bitsize, int bitpos,
&& bitpos / unit == (bitpos + bitsize - 1) / unit
&& unit <= BITS_PER_WORD
&& unit <= MIN (align, BIGGEST_ALIGNMENT)
+ && unit <= maxbits
&& (largest_mode == VOIDmode
|| unit <= GET_MODE_BITSIZE (largest_mode)))
wide_mode = tmode;
Index: calls.c
===================================================================
--- calls.c (revision 176280)
+++ calls.c (working copy)
@@ -924,8 +924,8 @@ store_unaligned_arguments_into_pseudos (
emit_move_insn (reg, const0_rtx);
bytes -= bitsize / BITS_PER_UNIT;
- store_bit_field (reg, bitsize, endian_correction, word_mode,
- word);
+ store_bit_field (reg, bitsize, endian_correction, 0, 0,
+ word_mode, word);
}
}
}
Index: expmed.c
===================================================================
--- expmed.c (revision 176280)
+++ expmed.c (working copy)
@@ -47,9 +47,15 @@ struct target_expmed *this_target_expmed
static void store_fixed_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, rtx);
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ rtx);
static void store_split_bit_field (rtx, unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, rtx);
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ unsigned HOST_WIDE_INT,
+ rtx);
static rtx extract_fixed_bit_field (enum machine_mode, rtx,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
@@ -333,7 +339,10 @@ mode_for_extraction (enum extraction_pat
static bool
store_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum, enum machine_mode fieldmode,
+ unsigned HOST_WIDE_INT bitnum,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ enum machine_mode fieldmode,
rtx value, bool fallback_p)
{
unsigned int unit
@@ -455,6 +464,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
/* We may be accessing data outside the field, which means
we can alias adjacent data. */
+ /* ?? not always for C++0x memory model ?? */
if (MEM_P (op0))
{
op0 = shallow_copy_rtx (op0);
@@ -547,7 +557,9 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!store_bit_field_1 (op0, MIN (BITS_PER_WORD,
bitsize - i * BITS_PER_WORD),
- bitnum + bit_offset, word_mode,
+ bitnum + bit_offset,
+ bitregion_start, bitregion_end,
+ word_mode,
value_word, fallback_p))
{
delete_insns_since (last);
@@ -710,6 +722,10 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (HAVE_insv && MEM_P (op0))
{
enum machine_mode bestmode;
+ unsigned HOST_WIDE_INT maxbits = MAX_FIXED_MODE_SIZE;
+
+ if (bitregion_end)
+ maxbits = bitregion_end - bitregion_start + 1;
/* Get the mode to use for inserting into this field. If OP0 is
BLKmode, get the smallest mode consistent with the alignment. If
@@ -717,9 +733,12 @@ store_bit_field_1 (rtx str_rtx, unsigned
mode. Otherwise, use the smallest mode containing the field. */
if (GET_MODE (op0) == BLKmode
+ || GET_MODE_BITSIZE (GET_MODE (op0)) > maxbits
|| (op_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (op_mode)))
- bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
+ bestmode = get_best_mode (bitsize, bitnum,
+ bitregion_start, bitregion_end,
+ MEM_ALIGN (op0),
(op_mode == MAX_MACHINE_MODE
? VOIDmode : op_mode),
MEM_VOLATILE_P (op0));
@@ -748,6 +767,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
the unit. */
tempreg = copy_to_reg (xop0);
if (store_bit_field_1 (tempreg, bitsize, xbitpos,
+ bitregion_start, bitregion_end,
fieldmode, orig_value, false))
{
emit_move_insn (xop0, tempreg);
@@ -760,21 +780,59 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!fallback_p)
return false;
- store_fixed_bit_field (op0, offset, bitsize, bitpos, value);
+ store_fixed_bit_field (op0, offset, bitsize, bitpos,
+ bitregion_start, bitregion_end, value);
return true;
}
/* Generate code to store value from rtx VALUE
into a bit-field within structure STR_RTX
containing BITSIZE bits starting at bit BITNUM.
+
+ BITREGION_START is bitpos of the first bitfield in this region.
+ BITREGION_END is the bitpos of the ending bitfield in this region.
+ These two fields are 0 if the C++ memory model does not apply,
+ or we are not interested in keeping track of bitfield regions.
+
FIELDMODE is the machine-mode of the FIELD_DECL node for this field. */
void
store_bit_field (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitnum, enum machine_mode fieldmode,
+ unsigned HOST_WIDE_INT bitnum,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ enum machine_mode fieldmode,
rtx value)
{
- if (!store_bit_field_1 (str_rtx, bitsize, bitnum, fieldmode, value, true))
+ /* Under the C++0x memory model, we must not touch bits outside the
+ bit region. Adjust the address to start at the beginning of the
+ bit region. */
+ if (MEM_P (str_rtx)
+ && bitregion_start > 0)
+ {
+ enum machine_mode bestmode;
+ enum machine_mode op_mode;
+ unsigned HOST_WIDE_INT offset;
+
+ op_mode = mode_for_extraction (EP_insv, 3);
+ if (op_mode == MAX_MACHINE_MODE)
+ op_mode = VOIDmode;
+
+ offset = bitregion_start / BITS_PER_UNIT;
+ bitnum -= bitregion_start;
+ bitregion_end -= bitregion_start;
+ bitregion_start = 0;
+ bestmode = get_best_mode (bitsize, bitnum,
+ bitregion_start, bitregion_end,
+ MEM_ALIGN (str_rtx),
+ op_mode,
+ MEM_VOLATILE_P (str_rtx));
+ str_rtx = adjust_address (str_rtx, bestmode, offset);
+ }
+
+ if (!store_bit_field_1 (str_rtx, bitsize, bitnum,
+ bitregion_start, bitregion_end,
+ fieldmode, value, true))
gcc_unreachable ();
}
\f
@@ -790,7 +848,10 @@ store_bit_field (rtx str_rtx, unsigned H
static void
store_fixed_bit_field (rtx op0, unsigned HOST_WIDE_INT offset,
unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitpos, rtx value)
+ unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ rtx value)
{
enum machine_mode mode;
unsigned int total_bits = BITS_PER_WORD;
@@ -811,12 +872,19 @@ store_fixed_bit_field (rtx op0, unsigned
/* Special treatment for a bit field split across two registers. */
if (bitsize + bitpos > BITS_PER_WORD)
{
- store_split_bit_field (op0, bitsize, bitpos, value);
+ store_split_bit_field (op0, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ value);
return;
}
}
else
{
+ unsigned HOST_WIDE_INT maxbits = MAX_FIXED_MODE_SIZE;
+
+ if (bitregion_end)
+ maxbits = bitregion_end - bitregion_start + 1;
+
/* Get the proper mode to use for this field. We want a mode that
includes the entire field. If such a mode would be larger than
a word, we won't be doing the extraction the normal way.
@@ -829,10 +897,12 @@ store_fixed_bit_field (rtx op0, unsigned
if (MEM_VOLATILE_P (op0)
&& GET_MODE_BITSIZE (GET_MODE (op0)) > 0
+ && GET_MODE_BITSIZE (GET_MODE (op0)) <= maxbits
&& flag_strict_volatile_bitfields > 0)
mode = GET_MODE (op0);
else
mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
+ bitregion_start, bitregion_end,
MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
@@ -840,7 +910,7 @@ store_fixed_bit_field (rtx op0, unsigned
/* The only way this should occur is if the field spans word
boundaries. */
store_split_bit_field (op0, bitsize, bitpos + offset * BITS_PER_UNIT,
- value);
+ bitregion_start, bitregion_end, value);
return;
}
@@ -960,7 +1030,10 @@ store_fixed_bit_field (rtx op0, unsigned
static void
store_split_bit_field (rtx op0, unsigned HOST_WIDE_INT bitsize,
- unsigned HOST_WIDE_INT bitpos, rtx value)
+ unsigned HOST_WIDE_INT bitpos,
+ unsigned HOST_WIDE_INT bitregion_start,
+ unsigned HOST_WIDE_INT bitregion_end,
+ rtx value)
{
unsigned int unit;
unsigned int bitsdone = 0;
@@ -1075,7 +1148,7 @@ store_split_bit_field (rtx op0, unsigned
it is just an out-of-bounds access. Ignore it. */
if (word != const0_rtx)
store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize,
- thispos, part);
+ thispos, bitregion_start, bitregion_end, part);
bitsdone += thissize;
}
}
@@ -1515,7 +1588,7 @@ extract_bit_field_1 (rtx str_rtx, unsign
if (GET_MODE (op0) == BLKmode
|| (ext_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (ext_mode)))
- bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
+ bestmode = get_best_mode (bitsize, bitnum, 0, 0, MEM_ALIGN (op0),
(ext_mode == MAX_MACHINE_MODE
? VOIDmode : ext_mode),
MEM_VOLATILE_P (op0));
@@ -1641,7 +1714,7 @@ extract_fixed_bit_field (enum machine_mo
mode = tmode;
}
else
- mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
+ mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT, 0, 0,
MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
Index: Makefile.in
===================================================================
--- Makefile.in (revision 176280)
+++ Makefile.in (working copy)
@@ -2908,7 +2908,7 @@ expr.o : expr.c $(CONFIG_H) $(SYSTEM_H)
reload.h langhooks.h intl.h $(TM_P_H) $(TARGET_H) \
tree-iterator.h gt-expr.h $(MACHMODE_H) $(TIMEVAR_H) $(TREE_FLOW_H) \
$(TREE_PASS_H) $(DF_H) $(DIAGNOSTIC_H) vecprim.h $(SSAEXPAND_H) \
- $(COMMON_TARGET_H)
+ $(PARAMS_H) $(COMMON_TARGET_H)
dojump.o : dojump.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \
$(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(OPTABS_H) $(INSN_ATTR_H) insn-config.h \
langhooks.h $(GGC_H) gt-dojump.h vecprim.h $(BASIC_BLOCK_H) output.h
Index: stmt.c
===================================================================
--- stmt.c (revision 176280)
+++ stmt.c (working copy)
@@ -1759,7 +1759,8 @@ expand_return (tree retval)
/* Use bitpos for the source extraction (left justified) and
xbitpos for the destination store (right justified). */
- store_bit_field (dst, bitsize, xbitpos % BITS_PER_WORD, word_mode,
+ store_bit_field (dst, bitsize, xbitpos % BITS_PER_WORD,
+ 0, 0, word_mode,
extract_bit_field (src, bitsize,
bitpos % BITS_PER_WORD, 1, false,
NULL_RTX, word_mode, word_mode));
Index: params.def
===================================================================
--- params.def (revision 176280)
+++ params.def (working copy)
@@ -902,6 +902,12 @@ DEFPARAM (PARAM_CASE_VALUES_THRESHOLD,
"if 0, use the default for the machine",
0, 0, 0)
+/* Data race flags for C++0x memory model compliance. */
+DEFPARAM (PARAM_ALLOW_STORE_DATA_RACES,
+ "allow-store-data-races",
+ "Allow new data races on stores to be introduced",
+ 1, 0, 1)
+
/*
Local variables:
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [C++0x] contiguous bitfields race implementation
2011-07-25 17:41 ` Aldy Hernandez
@ 2011-07-26 5:28 ` Jason Merrill
2011-07-26 18:37 ` Aldy Hernandez
2011-07-26 20:05 ` Aldy Hernandez
2011-07-27 18:24 ` H.J. Lu
1 sibling, 2 replies; 81+ messages in thread
From: Jason Merrill @ 2011-07-26 5:28 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
On 07/25/2011 10:07 AM, Aldy Hernandez wrote:
> I had changed this already to take into account aliasing, so if we get
> an INDIRECT_REF, ptr_deref_may_alias_global_p() returns true, and we
> proceed with the restriction:
Sounds good. "global" includes malloc'd memory, right? There don't
seem to be any tests for that.
Speaking of tests, please put them in c-c++-common.
> + bitnum -= bitregion_start;
> + bitregion_end -= bitregion_start;
> + bitregion_start = 0;
Why is this necessary/useful?
Jason
* Re: [C++0x] contiguous bitfields race implementation
2011-07-26 17:54 ` Jason Merrill
@ 2011-07-26 17:51 ` Aldy Hernandez
2011-07-26 18:05 ` Jason Merrill
0 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-07-26 17:51 UTC (permalink / raw)
To: Jason Merrill; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
> I think the adjustment above is intended to match the adjustment of the
> address by bitregion_start/BITS_PER_UNIT, but the above seems to assume
> that bitregion_start%BITS_PER_UNIT == 0.
That was intentional. bitregion_start always falls on a byte boundary,
does it not?
struct {
stuff;
unsigned int b:3;
unsigned int other_bits:22;
other_stuff;
}
Does not "b" always start at a byte boundary?
* Re: [C++0x] contiguous bitfields race implementation
2011-07-26 18:37 ` Aldy Hernandez
@ 2011-07-26 17:54 ` Jason Merrill
2011-07-26 17:51 ` Aldy Hernandez
0 siblings, 1 reply; 81+ messages in thread
From: Jason Merrill @ 2011-07-26 17:54 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
On 07/26/2011 09:36 AM, Aldy Hernandez wrote:
>
>>> + bitnum -= bitregion_start;
>>> + bitregion_end -= bitregion_start;
>>> + bitregion_start = 0;
>>
>> Why is this necessary/useful?
>
> You mean, why am I resetting these values (because the call to
> get_best_mode() following it needs the adjusted values). Or why am I
> adjusting the address to point to the beginning of the region?
I think the adjustment above is intended to match the adjustment of the
address by bitregion_start/BITS_PER_UNIT, but the above seems to assume
that bitregion_start%BITS_PER_UNIT == 0.
Jason
* Re: [C++0x] contiguous bitfields race implementation
2011-07-26 17:51 ` Aldy Hernandez
@ 2011-07-26 18:05 ` Jason Merrill
2011-07-27 15:03 ` Richard Guenther
0 siblings, 1 reply; 81+ messages in thread
From: Jason Merrill @ 2011-07-26 18:05 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
On 07/26/2011 10:32 AM, Aldy Hernandez wrote:
>
>> I think the adjustment above is intended to match the adjustment of the
>> address by bitregion_start/BITS_PER_UNIT, but the above seems to assume
>> that bitregion_start%BITS_PER_UNIT == 0.
>
> That was intentional. bitregion_start always falls on a byte boundary,
> does it not?
Ah, yes, of course, it's bitnum that might not. The code changes look
good, then.
Jason
* Re: [C++0x] contiguous bitfields race implementation
2011-07-26 5:28 ` Jason Merrill
@ 2011-07-26 18:37 ` Aldy Hernandez
2011-07-26 17:54 ` Jason Merrill
2011-07-26 20:05 ` Aldy Hernandez
1 sibling, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-07-26 18:37 UTC (permalink / raw)
To: Jason Merrill; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
>> + bitnum -= bitregion_start;
>> + bitregion_end -= bitregion_start;
>> + bitregion_start = 0;
>
> Why is this necessary/useful?
You mean, why am I resetting these values (because the call to
get_best_mode() following it needs the adjusted values). Or why am I
adjusting the address to point to the beginning of the region?
A
* Re: [C++0x] contiguous bitfields race implementation
2011-07-26 5:28 ` Jason Merrill
2011-07-26 18:37 ` Aldy Hernandez
@ 2011-07-26 20:05 ` Aldy Hernandez
1 sibling, 0 replies; 81+ messages in thread
From: Aldy Hernandez @ 2011-07-26 20:05 UTC (permalink / raw)
To: Jason Merrill; +Cc: Jeff Law, gcc-patches, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 406 bytes --]
On 07/25/11 18:55, Jason Merrill wrote:
> On 07/25/2011 10:07 AM, Aldy Hernandez wrote:
>> I had changed this already to take into account aliasing, so if we get
>> an INDIRECT_REF, ptr_deref_may_alias_global_p() returns true, and we
>> proceed with the restriction:
>
> Sounds good. "global" includes malloc'd memory, right? There don't seem
> to be any tests for that.
Is the attached test appropriate?
[-- Attachment #2: cxxbitfields-5.c --]
[-- Type: text/plain, Size: 403 bytes --]
/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
/* { dg-options "-O2 --param allow-store-data-races=0" } */
#include <stdlib.h>
struct bits
{
char a;
int b:7;
int c:9;
unsigned char d;
} x;
struct bits *p;
static void allocit()
{
p = (struct bits *) malloc (sizeof (struct bits));
}
void foo()
{
allocit();
p->c = 55;
}
/* { dg-final { scan-assembler-not "movl\t\\(" } } */
* Re: [C++0x] contiguous bitfields race implementation
2011-07-26 18:05 ` Jason Merrill
@ 2011-07-27 15:03 ` Richard Guenther
2011-07-27 15:12 ` Richard Guenther
` (2 more replies)
0 siblings, 3 replies; 81+ messages in thread
From: Richard Guenther @ 2011-07-27 15:03 UTC (permalink / raw)
To: Jason Merrill; +Cc: Aldy Hernandez, Jeff Law, gcc-patches, Jakub Jelinek
On Tue, Jul 26, 2011 at 7:38 PM, Jason Merrill <jason@redhat.com> wrote:
> On 07/26/2011 10:32 AM, Aldy Hernandez wrote:
>>
>>> I think the adjustment above is intended to match the adjustment of the
>>> address by bitregion_start/BITS_PER_UNIT, but the above seems to assume
>>> that bitregion_start%BITS_PER_UNIT == 0.
>>
>> That was intentional. bitregion_start always falls on a byte boundary,
>> does it not?
>
> Ah, yes, of course, it's bitnum that might not. The code changes look good,
> then.
Looks like this was an approval ...
Anyway, I don't think a --param is appropriate to control a flag whether
to allow store data-races to be created. Why not use a regular option instead?
I believe that any after-the-fact attempt to recover bitfield boundaries is
going to fail unless you preserve more information during bitfield layout.
Consider
struct {
char : 8;
char : 0;
char : 8;
};
where the : 0 isn't preserved in any way and you can't distinguish
it from struct { char : 8; char : 8; }.
Richard.
> Jason
>
* Re: [C++0x] contiguous bitfields race implementation
2011-07-27 15:03 ` Richard Guenther
@ 2011-07-27 15:12 ` Richard Guenther
2011-07-27 15:53 ` Richard Guenther
2011-07-27 18:22 ` Aldy Hernandez
2011-07-27 17:29 ` Aldy Hernandez
2011-07-28 22:26 ` Aldy Hernandez
2 siblings, 2 replies; 81+ messages in thread
From: Richard Guenther @ 2011-07-27 15:12 UTC (permalink / raw)
To: Jason Merrill; +Cc: Aldy Hernandez, Jeff Law, gcc-patches, Jakub Jelinek
On Wed, Jul 27, 2011 at 4:52 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Tue, Jul 26, 2011 at 7:38 PM, Jason Merrill <jason@redhat.com> wrote:
>> On 07/26/2011 10:32 AM, Aldy Hernandez wrote:
>>>
>>>> I think the adjustment above is intended to match the adjustment of the
>>>> address by bitregion_start/BITS_PER_UNIT, but the above seems to assume
>>>> that bitregion_start%BITS_PER_UNIT == 0.
>>>
>>> That was intentional. bitregion_start always falls on a byte boundary,
>>> does it not?
>>
>> Ah, yes, of course, it's bitnum that might not. The code changes look good,
>> then.
>
> Looks like this was an approval ...
>
> Anyway, I don't think a --param is appropriate to control a flag whether
> to allow store data-races to be created. Why not use a regular option instead?
>
> I believe that any after-the-fact attempt to recover bitfield boundaries is
> going to fail unless you preserve more information during bitfield layout.
>
> Consider
>
> struct {
> char : 8;
> char : 0;
> char : 8;
> };
>
> where the : 0 isn't preserved in any way and you can't distinguish
> it from struct { char : 8; char : 8; }.
Oh, and
INNERDECL is the actual object being referenced.
|| (!ptr_deref_may_alias_global_p (innerdecl)
is surely not what you want. That asks if *innerdecl is global memory.
I suppose you want is_global_var (innerdecl)? But with
&& (DECL_THREAD_LOCAL_P (innerdecl)
|| !TREE_STATIC (innerdecl))))
you can simply skip this test. Or what was it supposed to do?
Richard.
* Re: [C++0x] contiguous bitfields race implementation
2011-07-27 15:12 ` Richard Guenther
@ 2011-07-27 15:53 ` Richard Guenther
2011-07-28 13:00 ` Richard Guenther
2011-07-28 19:42 ` Aldy Hernandez
2011-07-27 18:22 ` Aldy Hernandez
1 sibling, 2 replies; 81+ messages in thread
From: Richard Guenther @ 2011-07-27 15:53 UTC (permalink / raw)
To: Jason Merrill; +Cc: Aldy Hernandez, Jeff Law, gcc-patches, Jakub Jelinek
On Wed, Jul 27, 2011 at 4:56 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Wed, Jul 27, 2011 at 4:52 PM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Tue, Jul 26, 2011 at 7:38 PM, Jason Merrill <jason@redhat.com> wrote:
>>> On 07/26/2011 10:32 AM, Aldy Hernandez wrote:
>>>>
>>>>> I think the adjustment above is intended to match the adjustment of the
>>>>> address by bitregion_start/BITS_PER_UNIT, but the above seems to assume
>>>>> that bitregion_start%BITS_PER_UNIT == 0.
>>>>
>>>> That was intentional. bitregion_start always falls on a byte boundary,
>>>> does it not?
>>>
>>> Ah, yes, of course, it's bitnum that might not. The code changes look good,
>>> then.
>>
>> Looks like this was an approval ...
>>
>> Anyway, I don't think a --param is appropriate to control a flag whether
>> to allow store data-races to be created. Why not use a regular option instead?
>>
>> I believe that any after-the-fact attempt to recover bitfield boundaries is
>> going to fail unless you preserve more information during bitfield layout.
>>
>> Consider
>>
>> struct {
>> char : 8;
>> char : 0;
>> char : 8;
>> };
>>
>> where the : 0 isn't preserved in any way and you can't distinguish
>> it from struct { char : 8; char : 8; }.
>
> Oh, and
>
> INNERDECL is the actual object being referenced.
>
> || (!ptr_deref_may_alias_global_p (innerdecl)
>
> is surely not what you want. That asks if *innerdecl is global memory.
> I suppose you want is_global_var (innerdecl)? But with
>
> && (DECL_THREAD_LOCAL_P (innerdecl)
> || !TREE_STATIC (innerdecl))))
>
> you can simply skip this test. Or what was it supposed to do?
And
t = build3 (COMPONENT_REF, TREE_TYPE (exp),
unshare_expr (TREE_OPERAND (exp, 0)),
fld, NULL_TREE);
get_inner_reference (t, &bitsize, &bitpos, &offset,
&mode, &unsignedp, &volatilep, true);
for each field of a struct type is of course ... gross! In fact you already
have the FIELD_DECL in the single caller! Yes I know there is not
enough information preserved by bitfield layout - see my previous reply.
if (TREE_CODE (to) == COMPONENT_REF
&& DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
get_bit_range (&bitregion_start, &bitregion_end,
to, tem, bitpos, bitsize);
and shouldn't this test DECL_BIT_FIELD instead of DECL_BIT_FIELD_TYPE?
Richard.
* Re: [C++0x] contiguous bitfields race implementation
2011-07-27 15:03 ` Richard Guenther
2011-07-27 15:12 ` Richard Guenther
@ 2011-07-27 17:29 ` Aldy Hernandez
2011-07-27 17:57 ` Andrew MacLeod
2011-07-28 22:26 ` Aldy Hernandez
2 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-07-27 17:29 UTC (permalink / raw)
To: Richard Guenther; +Cc: Jason Merrill, Jeff Law, gcc-patches, Jakub Jelinek
> Anyway, I don't think a --param is appropriate to control a flag whether
> to allow store data-races to be created. Why not use a regular option instead?
I don't care either way. What -foption-name do you suggest?
* Re: [C++0x] contiguous bitfields race implementation
2011-07-27 17:29 ` Aldy Hernandez
@ 2011-07-27 17:57 ` Andrew MacLeod
2011-07-27 22:27 ` Joseph S. Myers
2011-07-28 8:58 ` Richard Guenther
0 siblings, 2 replies; 81+ messages in thread
From: Andrew MacLeod @ 2011-07-27 17:57 UTC (permalink / raw)
To: Aldy Hernandez
Cc: Richard Guenther, Jason Merrill, Jeff Law, gcc-patches, Jakub Jelinek
On 07/27/2011 01:08 PM, Aldy Hernandez wrote:
>
>> Anyway, I don't think a --param is appropriate to control a flag whether
>> to allow store data-races to be created. Why not use a regular
>> option instead?
>
> I don't care either way. What -foption-name do you suggest?
Well, I suggested a -f option set last year when this was laid out, and
Ian suggested that it should be a --param
http://gcc.gnu.org/ml/gcc/2010-05/msg00118.html
"I don't agree with your proposed command line options. They seem fine
for internal use, but I think very very few users would know when or
whether they should use -fno-data-race-stores. I think you should
downgrade those options to a --param value, and think about a
multi-layered -fmemory-model option. "
Andrew
* Re: [C++0x] contiguous bitfields race implementation
2011-07-27 15:12 ` Richard Guenther
2011-07-27 15:53 ` Richard Guenther
@ 2011-07-27 18:22 ` Aldy Hernandez
2011-07-28 8:52 ` Richard Guenther
1 sibling, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-07-27 18:22 UTC (permalink / raw)
To: Richard Guenther; +Cc: Jason Merrill, Jeff Law, gcc-patches, Jakub Jelinek
> Oh, and
>
> INNERDECL is the actual object being referenced.
>
> || (!ptr_deref_may_alias_global_p (innerdecl)
>
> is surely not what you want. That asks if *innerdecl is global memory.
> I suppose you want is_global_var (innerdecl)? But with
>
> && (DECL_THREAD_LOCAL_P (innerdecl)
> || !TREE_STATIC (innerdecl))))
>
> you can simply skip this test. Or what was it supposed to do?
The test was there because neither DECL_THREAD_LOCAL_P nor is_global_var
can handle MEM_REF's.
Would you prefer an explicit check for a *_DECL?
if (ALLOW_STORE_DATA_RACES
- || (!ptr_deref_may_alias_global_p (innerdecl)
+ || (DECL_P (innerdecl)
&& (DECL_THREAD_LOCAL_P (innerdecl)
|| !TREE_STATIC (innerdecl))))
* Re: [C++0x] contiguous bitfields race implementation
2011-07-25 17:41 ` Aldy Hernandez
2011-07-26 5:28 ` Jason Merrill
@ 2011-07-27 18:24 ` H.J. Lu
2011-07-27 20:39 ` Aldy Hernandez
1 sibling, 1 reply; 81+ messages in thread
From: H.J. Lu @ 2011-07-27 18:24 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jason Merrill, Jeff Law, gcc-patches, Jakub Jelinek
On Mon, Jul 25, 2011 at 10:07 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
> On 07/22/11 13:44, Jason Merrill wrote:
>>
>> On 07/18/2011 08:02 AM, Aldy Hernandez wrote:
>>>
>>> + /* If other threads can't see this value, no need to restrict
>>> stores. */
>>> + if (ALLOW_STORE_DATA_RACES
>>> + || !DECL_THREAD_VISIBLE_P (innerdecl))
>>> + {
>>> + *bitstart = *bitend = 0;
>>> + return;
>>> + }
>>
>> What if get_inner_reference returns something that isn't a DECL, such as
>> an INDIRECT_REF?
>
> I had changed this already to take into account aliasing, so if we get an
> INDIRECT_REF, ptr_deref_may_alias_global_p() returns true, and we proceed
> with the restriction:
>
> + /* If other threads can't see this value, no need to restrict stores. */
> + if (ALLOW_STORE_DATA_RACES
> + || (!ptr_deref_may_alias_global_p (innerdecl)
> + && (DECL_THREAD_LOCAL_P (innerdecl)
> + || !TREE_STATIC (innerdecl))))
>
>
>>> + if (fld)
>>> + {
>>> + /* We found the end of the bit field sequence. Include the
>>> + padding up to the next field and be done. */
>>> + *bitend = bitpos - 1;
>>> + }
>>
>> bitpos is the position of "field", and it seems to me we want the
>> position of "fld" here.
>
> Notice that bitpos gets recalculated at each iteration by
> get_inner_reference, so bitpos is actually the position of fld.
>
>>> + /* If unset, no restriction. */
>>> + if (!bitregion_end)
>>> + maxbits = 0;
>>> + else
>>> + maxbits = (bitregion_end - bitregion_start) % align;
>>
>> Maybe use MAX_FIXED_MODE_SIZE so you don't have to test it against 0?
>
> Fixed everywhere.
>
>>> + if (!bitregion_end)
>>> + maxbits = 0;
>>> + else if (1||bitpos + offset * BITS_PER_UNIT < bitregion_start)
>>> + maxbits = bitregion_end - bitregion_start;
>>> + else
>>> + maxbits = bitregion_end - (bitpos + offset * BITS_PER_UNIT) + 1;
>>
>> I assume the 1|| was there for debugging?
>
> Fixed, plus I adjusted the calculation of maxbits everywhere because I found
> an off-by-one error.
>
> I have also overhauled store_bit_field() to adjust the address of the
> address to point to the beginning of the bit region. This fixed a myriad of
> corner cases pointed out by a test Hans Boehm was kind enough to provide.
>
> I have added more tests.
>
> How does this look? (Pending tests.)
>
This caused:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49875
--
H.J.
* Re: [C++0x] contiguous bitfields race implementation
2011-07-27 18:24 ` H.J. Lu
@ 2011-07-27 20:39 ` Aldy Hernandez
2011-07-27 20:54 ` Jakub Jelinek
0 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-07-27 20:39 UTC (permalink / raw)
To: H.J. Lu; +Cc: Jason Merrill, Jeff Law, gcc-patches, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 219 bytes --]
> This caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49875
The assembler sequence on ia32 was a bit different.
H.J. Can you try this on your end? If it fixes the problem, I will
commit as obvious.
Aldy
[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 481 bytes --]
PR middle-end/49875
* c-c++-common/cxxbitfields-4.c: Check for smaller than long
moves.
Index: c-c++-common/cxxbitfields-4.c
===================================================================
--- c-c++-common/cxxbitfields-4.c (revision 176824)
+++ c-c++-common/cxxbitfields-4.c (working copy)
@@ -15,4 +15,4 @@ void update_c(struct bits *p, int val)
p -> c = val;
}
-/* { dg-final { scan-assembler-not "movl" } } */
+/* { dg-final { scan-assembler "mov\[bw\]" } } */
* Re: [C++0x] contiguous bitfields race implementation
2011-07-27 20:39 ` Aldy Hernandez
@ 2011-07-27 20:54 ` Jakub Jelinek
2011-07-27 21:00 ` Aldy Hernandez
0 siblings, 1 reply; 81+ messages in thread
From: Jakub Jelinek @ 2011-07-27 20:54 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: H.J. Lu, Jason Merrill, Jeff Law, gcc-patches
On Wed, Jul 27, 2011 at 01:51:04PM -0500, Aldy Hernandez wrote:
> >This caused:
> >
> >http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49875
>
> The assembler sequence on ia32 was a bit different.
>
> H.J. Can you try this on your end? If it fixes the problem, I will
> commit as obvious.
You could test it yourself on x86_64-linux too with
make check -k RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} dg.exp=cxxbit*'
> PR middle-end/49875
> * c-c++-common/cxxbitfields-4.c: Check for smaller than long
> moves.
Jakub
* Re: [C++0x] contiguous bitfields race implementation
2011-07-27 20:54 ` Jakub Jelinek
@ 2011-07-27 21:00 ` Aldy Hernandez
0 siblings, 0 replies; 81+ messages in thread
From: Aldy Hernandez @ 2011-07-27 21:00 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: H.J. Lu, Jason Merrill, Jeff Law, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 483 bytes --]
On 07/27/11 13:55, Jakub Jelinek wrote:
> On Wed, Jul 27, 2011 at 01:51:04PM -0500, Aldy Hernandez wrote:
>>> This caused:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49875
>>
>> The assembler sequence on ia32 was a bit different.
>>
>> H.J. Can you try this on your end? If it fixes the problem, I will
>> commit as obvious.
>
> You could test it yourself on x86_64-linux too with
> make check -k RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} dg.exp=cxxbit*'
Committed.
[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 883 bytes --]
PR middle-end/49875
* c-c++-common/cxxbitfields-4.c: Check for smaller than long
moves.
* c-c++-common/cxxbitfields-5.c: Same.
Index: c-c++-common/cxxbitfields-4.c
===================================================================
--- c-c++-common/cxxbitfields-4.c (revision 176824)
+++ c-c++-common/cxxbitfields-4.c (working copy)
@@ -15,4 +15,4 @@ void update_c(struct bits *p, int val)
p -> c = val;
}
-/* { dg-final { scan-assembler-not "movl" } } */
+/* { dg-final { scan-assembler "mov\[bw\]" } } */
Index: c-c++-common/cxxbitfields-5.c
===================================================================
--- c-c++-common/cxxbitfields-5.c (revision 176824)
+++ c-c++-common/cxxbitfields-5.c (working copy)
@@ -26,4 +26,4 @@ void foo()
p -> c = 55;
}
-/* { dg-final { scan-assembler-not "movl\t\\(" } } */
+/* { dg-final { scan-assembler "mov\[bw\]" } } */
* Re: [C++0x] contiguous bitfields race implementation
2011-07-27 17:57 ` Andrew MacLeod
@ 2011-07-27 22:27 ` Joseph S. Myers
2011-07-28 8:58 ` Richard Guenther
1 sibling, 0 replies; 81+ messages in thread
From: Joseph S. Myers @ 2011-07-27 22:27 UTC (permalink / raw)
To: Andrew MacLeod
Cc: Aldy Hernandez, Richard Guenther, Jason Merrill, Jeff Law,
gcc-patches, Jakub Jelinek
On Wed, 27 Jul 2011, Andrew MacLeod wrote:
> On 07/27/2011 01:08 PM, Aldy Hernandez wrote:
> >
> > > Anyway, I don't think a --param is appropriate to control a flag whether
> > > to allow store data-races to be created. Why not use a regular option
> > > instead?
> >
> > I don't care either way. What -foption-name do you suggest?
> Well, I suggested a -f option set last year when this was laid out, and Ian
> suggested that it should be a --param
>
> http://gcc.gnu.org/ml/gcc/2010-05/msg00118.html
>
> "I don't agree with your proposed command line options. They seem fine
> for internal use, but I think very very few users would know when or
> whether they should use -fno-data-race-stores. I think you should
> downgrade those options to a --param value, and think about a
> multi-layered -fmemory-model option. "
The documentation says --param is for "various constants to control the
amount of optimization that is done". I don't think it should be used for
anything that affects the semantics of the program; I think -f options are
what's appropriate here (with appropriate warnings in the documentation if
most of the options should not generally be used directly by users).
--
Joseph S. Myers
joseph@codesourcery.com
* Re: [C++0x] contiguous bitfields race implementation
2011-07-27 18:22 ` Aldy Hernandez
@ 2011-07-28 8:52 ` Richard Guenther
2011-07-29 12:05 ` Aldy Hernandez
0 siblings, 1 reply; 81+ messages in thread
From: Richard Guenther @ 2011-07-28 8:52 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jason Merrill, Jeff Law, gcc-patches, Jakub Jelinek
On Wed, Jul 27, 2011 at 7:36 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>> Oh, and
>>
>> INNERDECL is the actual object being referenced.
>>
>> || (!ptr_deref_may_alias_global_p (innerdecl)
>>
>> is surely not what you want. That asks if *innerdecl is global memory.
>> I suppose you want is_global_var (innerdecl)? But with
>>
>> && (DECL_THREAD_LOCAL_P (innerdecl)
>> || !TREE_STATIC (innerdecl))))
>>
>> you can simply skip this test. Or what was it supposed to do?
>
> The test was there because neither DECL_THREAD_LOCAL_P nor is_global_var can
> handle MEM_REF's.
Ok, in that case you want
(TREE_CODE (innerdecl) == MEM_REF || TREE_CODE (innerdecl) == TARGET_MEM_REF)
&& !ptr_deref_may_alias_global_p (TREE_OPERAND (innerdecl, 0)))
which gets you at the actual pointer.
> Would you prefer an explicit check for a *_DECL?
>
> if (ALLOW_STORE_DATA_RACES
> - || (!ptr_deref_may_alias_global_p (innerdecl)
> + || (DECL_P (innerdecl)
> && (DECL_THREAD_LOCAL_P (innerdecl)
> || !TREE_STATIC (innerdecl))))
Yes. Together with the above it looks then optimal.
Richard.
* Re: [C++0x] contiguous bitfields race implementation
2011-07-27 17:57 ` Andrew MacLeod
2011-07-27 22:27 ` Joseph S. Myers
@ 2011-07-28 8:58 ` Richard Guenther
1 sibling, 0 replies; 81+ messages in thread
From: Richard Guenther @ 2011-07-28 8:58 UTC (permalink / raw)
To: Andrew MacLeod
Cc: Aldy Hernandez, Jason Merrill, Jeff Law, gcc-patches, Jakub Jelinek
On Wed, Jul 27, 2011 at 7:19 PM, Andrew MacLeod <amacleod@redhat.com> wrote:
> On 07/27/2011 01:08 PM, Aldy Hernandez wrote:
>>
>>> Anyway, I don't think a --param is appropriate to control a flag whether
>>> to allow store data-races to be created. Why not use a regular option
>>> instead?
>>
>> I don't care either way. What -foption-name do you suggest?
>
> Well, I suggested a -f option set last year when this was laid out, and Ian
> suggested that it should be a --param
>
> http://gcc.gnu.org/ml/gcc/2010-05/msg00118.html
>
> "I don't agree with your proposed command line options. They seem fine
> for internal use, but I think very very few users would know when or
> whether they should use -fno-data-race-stores. I think you should
> downgrade those options to a --param value, and think about a
> multi-layered -fmemory-model option. "
Hm, ok. I suppose we can revisit this when implementing such -fmemory-model
option then. --params we can at least freely remove between releases.
Richard.
> Andrew
>
* Re: [C++0x] contiguous bitfields race implementation
2011-07-27 15:53 ` Richard Guenther
@ 2011-07-28 13:00 ` Richard Guenther
2011-07-29 2:58 ` Jason Merrill
` (2 more replies)
2011-07-28 19:42 ` Aldy Hernandez
1 sibling, 3 replies; 81+ messages in thread
From: Richard Guenther @ 2011-07-28 13:00 UTC (permalink / raw)
To: Jason Merrill; +Cc: Aldy Hernandez, Jeff Law, gcc-patches, Jakub Jelinek
On Wed, Jul 27, 2011 at 5:03 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Wed, Jul 27, 2011 at 4:56 PM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Wed, Jul 27, 2011 at 4:52 PM, Richard Guenther
>> <richard.guenther@gmail.com> wrote:
>>> On Tue, Jul 26, 2011 at 7:38 PM, Jason Merrill <jason@redhat.com> wrote:
>>>> On 07/26/2011 10:32 AM, Aldy Hernandez wrote:
>>>>>
>>>>>> I think the adjustment above is intended to match the adjustment of the
>>>>>> address by bitregion_start/BITS_PER_UNIT, but the above seems to assume
>>>>>> that bitregion_start%BITS_PER_UNIT == 0.
>>>>>
>>>>> That was intentional. bitregion_start always falls on a byte boundary,
>>>>> does it not?
>>>>
>>>> Ah, yes, of course, it's bitnum that might not. The code changes look good,
>>>> then.
>>>
>>> Looks like this was an approval ...
>>>
>>> Anyway, I don't think a --param is appropriate to control a flag whether
>>> to allow store data-races to be created. Why not use a regular option instead?
>>>
>>> I believe that any after-the-fact attempt to recover bitfield boundaries is
>>> going to fail unless you preserve more information during bitfield layout.
>>>
>>> Consider
>>>
>>> struct {
>>> char : 8;
>>> char : 0;
>>> char : 8;
>>> };
>>>
>>> where the : 0 isn't preserved in any way and you can't distinguish
>>> it from struct { char : 8; char : 8; }.
>>
>> Oh, and
>>
>> INNERDECL is the actual object being referenced.
>>
>> || (!ptr_deref_may_alias_global_p (innerdecl)
>>
>> is surely not what you want. That asks if *innerdecl is global memory.
>> I suppose you want is_global_var (innerdecl)? But with
>>
>> && (DECL_THREAD_LOCAL_P (innerdecl)
>> || !TREE_STATIC (innerdecl))))
>>
>> you can simply skip this test. Or what was it supposed to do?
>
> And
>
> t = build3 (COMPONENT_REF, TREE_TYPE (exp),
> unshare_expr (TREE_OPERAND (exp, 0)),
> fld, NULL_TREE);
> get_inner_reference (t, &bitsize, &bitpos, &offset,
> &mode, &unsignedp, &volatilep, true);
>
> for each field of a struct type is of course ... gross! In fact you already
> have the FIELD_DECL in the single caller! Yes I know there is not
> enough information preserved by bitfield layout - see my previous reply.
Looking at the C++ memory model what you need is indeed simple enough
to recover here. Still this loop does quadratic work for a struct with
N bitfield members and a function which stores into all of them.
And that with a big constant factor as you build a component-ref
and even unshare trees (which isn't necessary here anyway). In fact
you could easily manually keep track of bitpos when walking adjacent
bitfield members. An initial call to get_inner_reference on
TREE_OPERAND (exp, 0) would give you the starting position of the record.
That would still be quadratic of course.
For bitfield lowering I'd like to preserve a way to get from a field-decl to
the first field-decl of a group of bitfield members that occupy an aligned
amount of storage (as place_field assigns it). That wouldn't necessarily
match the first bitfield field in the C++ bitfield group sense but would
probably be sensible enough for conforming accesses (and you'd only
need to search forward from that first field looking for a zero-size
field). Now, the question is of course what to do for DECL_PACKED
fields (I suppose, simply ignore the C++ memory model as C++ doesn't
have a notion of packed or specially (mis-)aligned structs or bitfields).
Richard.
* Re: [C++0x] contiguous bitfields race implementation
2011-07-27 15:53 ` Richard Guenther
2011-07-28 13:00 ` Richard Guenther
@ 2011-07-28 19:42 ` Aldy Hernandez
1 sibling, 0 replies; 81+ messages in thread
From: Aldy Hernandez @ 2011-07-28 19:42 UTC (permalink / raw)
To: Richard Guenther; +Cc: Jason Merrill, Jeff Law, gcc-patches, Jakub Jelinek
> if (TREE_CODE (to) == COMPONENT_REF
> && DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
> get_bit_range (&bitregion_start,&bitregion_end,
> to, tem, bitpos, bitsize);
>
> and shouldn't this test DECL_BIT_FIELD instead of DECL_BIT_FIELD_TYPE?
As I mentioned here:
http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01416.html
I am using DECL_BIT_FIELD_TYPE instead of DECL_BIT_FIELD to determine if
a DECL is a bit field because DECL_BIT_FIELD is not set for bit fields
with mode sized number of bits (32-bits, 16-bits, etc).
* Re: [C++0x] contiguous bitfields race implementation
2011-07-29 12:05 ` Aldy Hernandez
@ 2011-07-28 19:58 ` Richard Guenther
0 siblings, 0 replies; 81+ messages in thread
From: Richard Guenther @ 2011-07-28 19:58 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jason Merrill, Jeff Law, gcc-patches, Jakub Jelinek
On Thu, Jul 28, 2011 at 9:12 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>> Yes. Together with the above it looks then optimal.
>
> Attached patch tested on x86-64 Linux.
>
> OK for mainline?
Ok with the || moved to the next line as per coding-standards.
Thanks,
Richard.
* Re: [C++0x] contiguous bitfields race implementation
2011-07-27 15:03 ` Richard Guenther
2011-07-27 15:12 ` Richard Guenther
2011-07-27 17:29 ` Aldy Hernandez
@ 2011-07-28 22:26 ` Aldy Hernandez
2 siblings, 0 replies; 81+ messages in thread
From: Aldy Hernandez @ 2011-07-28 22:26 UTC (permalink / raw)
To: Richard Guenther; +Cc: Jason Merrill, Jeff Law, gcc-patches, Jakub Jelinek
> I believe that any after-the-fact attempt to recover bitfield boundaries is
> going to fail unless you preserve more information during bitfield layout.
>
> Consider
>
> struct {
> char : 8;
> char : 0;
> char : 8;
> };
>
> where the : 0 isn't preserved in any way and you can't distinguish
> it from struct { char : 8; char : 8; }.
Huh? In my tests the :0 is preserved; it just doesn't have a DECL_NAME.
(gdb) p fld
$41 = (tree) 0x7ffff7778130
(gdb) pt
<field_decl 0x7ffff7778130 D.1593
...
I have tried the following scenario, and we calculate the beginning of
the bit region correctly (bit 32).
struct bits
{
char a;
int b:7;
int :0; <-- bitregion start
int c:9; <-- bitregion start
unsigned char d;
} *p;
void foo() { p -> c = 55; }
Am I misunderstanding? Why do you suggest we need to preserve more
information during bitfield layout?
FWIW, I should add a zero-length bit test.
* Re: [C++0x] contiguous bitfields race implementation
2011-07-28 13:00 ` Richard Guenther
@ 2011-07-29 2:58 ` Jason Merrill
2011-07-29 12:02 ` Aldy Hernandez
2011-08-05 17:28 ` Aldy Hernandez
2 siblings, 0 replies; 81+ messages in thread
From: Jason Merrill @ 2011-07-29 2:58 UTC (permalink / raw)
To: Richard Guenther; +Cc: Aldy Hernandez, Jeff Law, gcc-patches, Jakub Jelinek
On 07/28/2011 04:40 AM, Richard Guenther wrote:
> field). Now, the question is of course what to do for DECL_PACKED
> fields (I suppose, simply ignore the C++ memory model as C++ doesn't
> have a notion of packed or specially (mis-)aligned structs or bitfields).
I think treat them as bitfields for this purpose.
Jason
* Re: [C++0x] contiguous bitfields race implementation
2011-07-29 12:02 ` Aldy Hernandez
@ 2011-07-29 11:00 ` Richard Guenther
2011-08-01 13:51 ` Richard Guenther
0 siblings, 1 reply; 81+ messages in thread
From: Richard Guenther @ 2011-07-29 11:00 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jason Merrill, Jeff Law, gcc-patches, Jakub Jelinek
On Fri, Jul 29, 2011 at 4:12 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
> On 07/28/11 06:40, Richard Guenther wrote:
>
>> Looking at the C++ memory model what you need is indeed simple enough
>> to recover here. Still this loop does quadratic work for a struct with
>> N bitfield members and a function which stores into all of them.
>> And that with a big constant factor as you build a component-ref
>> and even unshare trees (which isn't necessary here anyway). In fact
>> you could easily manually keep track of bitpos when walking adjacent
>> bitfield members. An initial call to get_inner_reference on
>> TREE_OPERAND (exp, 0) would give you the starting position of the record.
>>
>> That would still be quadratic of course.
>
> Actually, we don't need to call get_inner_reference at all. It seems
> DECL_FIELD_BIT_OFFSET has all the information we need.
>
> How about we simplify things further as in the attached patch?
>
> Tested on x86-64 Linux.
>
> OK for mainline?
Well ... byte pieces of the offset can be in the tree offset
(DECL_FIELD_OFFSET). Only up to DECL_OFFSET_ALIGN bits
are tracked in DECL_FIELD_BIT_OFFSET (and DECL_FIELD_OFFSET
can be a non-constant - at least for Ada, not sure about C++).
But - can you please expand a bit on the desired semantics of
get_bit_range? Especially, relative to what is *bitstart / *bitend
supposed to be? Why do you pass in bitpos and bitsize - they
seem to be used as local variables only. Why is the check for
thread-local storage in this function and not in the caller (and
what's the magic [0,0] bit-range relative to?)?
The existing get_inner_reference calls give you a bitpos relative
to the start of the containing object - but
/* If this is the last element in the structure, include the padding
at the end of structure. */
*bitend = TREE_INT_CST_LOW (TYPE_SIZE (record_type)) - 1;
will set *bitend to the size of the direct parent structure size, not the
size of the underlying object. Your proposed patch changes
bitpos to be relative to the direct parent structure.
So - I guess you need to play with some testcases like
struct {
int some_padding;
struct {
int bitfield :1;
} x;
};
and split / clarify some of get_bit_range comments.
Thanks,
Richard.
>
* Re: [C++0x] contiguous bitfields race implementation
2011-07-28 13:00 ` Richard Guenther
2011-07-29 2:58 ` Jason Merrill
@ 2011-07-29 12:02 ` Aldy Hernandez
2011-07-29 11:00 ` Richard Guenther
2011-08-05 17:28 ` Aldy Hernandez
2 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-07-29 12:02 UTC (permalink / raw)
To: Richard Guenther; +Cc: Jason Merrill, Jeff Law, gcc-patches, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 885 bytes --]
On 07/28/11 06:40, Richard Guenther wrote:
> Looking at the C++ memory model what you need is indeed simple enough
> to recover here. Still this loop does quadratic work for a struct with
> N bitfield members and a function which stores into all of them.
> And that with a big constant factor as you build a component-ref
> and even unshare trees (which isn't necessary here anyway). In fact
> you could easily manually keep track of bitpos when walking adjacent
> bitfield members. An initial call to get_inner_reference on
> TREE_OPERAND (exp, 0) would give you the starting position of the record.
>
> That would still be quadratic of course.
Actually, we don't need to call get_inner_reference at all. It seems
DECL_FIELD_BIT_OFFSET has all the information we need.
How about we simplify things further as in the attached patch?
Tested on x86-64 Linux.
OK for mainline?
[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 869 bytes --]
* expr.c (get_bit_range): Get field bit offset from
DECL_FIELD_BIT_OFFSET.
Index: expr.c
===================================================================
--- expr.c (revision 176891)
+++ expr.c (working copy)
@@ -4179,18 +4179,10 @@ get_bit_range (unsigned HOST_WIDE_INT *b
prev_field_is_bitfield = true;
for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld))
{
- tree t, offset;
- enum machine_mode mode;
- int unsignedp, volatilep;
-
if (TREE_CODE (fld) != FIELD_DECL)
continue;
- t = build3 (COMPONENT_REF, TREE_TYPE (exp),
- unshare_expr (TREE_OPERAND (exp, 0)),
- fld, NULL_TREE);
- get_inner_reference (t, &bitsize, &bitpos, &offset,
- &mode, &unsignedp, &volatilep, true);
+ bitpos = TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (fld));
if (field == fld)
found_field = true;
* Re: [C++0x] contiguous bitfields race implementation
2011-07-28 8:52 ` Richard Guenther
@ 2011-07-29 12:05 ` Aldy Hernandez
2011-07-28 19:58 ` Richard Guenther
0 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-07-29 12:05 UTC (permalink / raw)
To: Richard Guenther; +Cc: Jason Merrill, Jeff Law, gcc-patches, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 114 bytes --]
> Yes. Together with the above it looks then optimal.
Attached patch tested on x86-64 Linux.
OK for mainline?
[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 688 bytes --]
* expr.c (get_bit_range): Handle *MEM_REF's.
Index: expr.c
===================================================================
--- expr.c (revision 176824)
+++ expr.c (working copy)
@@ -4158,7 +4158,10 @@ get_bit_range (unsigned HOST_WIDE_INT *b
/* If other threads can't see this value, no need to restrict stores. */
if (ALLOW_STORE_DATA_RACES
- || (!ptr_deref_may_alias_global_p (innerdecl)
+ || ((TREE_CODE (innerdecl) == MEM_REF ||
+ TREE_CODE (innerdecl) == TARGET_MEM_REF)
+ && !ptr_deref_may_alias_global_p (TREE_OPERAND (innerdecl, 0)))
+ || (DECL_P (innerdecl)
&& (DECL_THREAD_LOCAL_P (innerdecl)
|| !TREE_STATIC (innerdecl))))
{
* Re: [C++0x] contiguous bitfields race implementation
2011-07-29 11:00 ` Richard Guenther
@ 2011-08-01 13:51 ` Richard Guenther
0 siblings, 0 replies; 81+ messages in thread
From: Richard Guenther @ 2011-08-01 13:51 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jason Merrill, Jeff Law, gcc-patches, Jakub Jelinek
On Fri, Jul 29, 2011 at 11:37 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Fri, Jul 29, 2011 at 4:12 AM, Aldy Hernandez <aldyh@redhat.com> wrote:
>> On 07/28/11 06:40, Richard Guenther wrote:
>>
>>> Looking at the C++ memory model what you need is indeed simple enough
>>> to recover here. Still this loop does quadratic work for a struct with
>>> N bitfield members and a function which stores into all of them.
>>> And that with a big constant factor as you build a component-ref
>>> and even unshare trees (which isn't necessary here anyway). In fact
>>> you could easily manually keep track of bitpos when walking adjacent
>>> bitfield members. An initial call to get_inner_reference on
>>> TREE_OPERAND (exp, 0) would give you the starting position of the record.
>>>
>>> That would still be quadratic of course.
>>
>> Actually, we don't need to call get_inner_reference at all. It seems
>> DECL_FIELD_BIT_OFFSET has all the information we need.
>>
>> How about we simplify things further as in the attached patch?
>>
>> Tested on x86-64 Linux.
>>
>> OK for mainline?
>
> Well ... byte pieces of the offset can be in the tree offset
> (DECL_FIELD_OFFSET). Only up to DECL_OFFSET_ALIGN bits
> are tracked in DECL_FIELD_BIT_OFFSET (and DECL_FIELD_OFFSET
> can be a non-constant - at least for Ada, not sure about C++).
>
> But - can you please expand a bit on the desired semantics of
> get_bit_range? Especially, relative to what is *bitstart / *bitend
> supposed to be? Why do you pass in bitpos and bitsize - they
> seem to be used as local variables only. Why is the check for
> thread-local storage in this function and not in the caller (and
> what's the magic [0,0] bit-range relative to?)?
>
> The existing get_inner_reference calls give you a bitpos relative
> to the start of the containing object - but
>
> /* If this is the last element in the structure, include the padding
> at the end of structure. */
> *bitend = TREE_INT_CST_LOW (TYPE_SIZE (record_type)) - 1;
>
> will set *bitend to the size of the direct parent structure size, not the
> size of the underlying object. Your proposed patch changes
> bitpos to be relative to the direct parent structure.
Using TYPE_SIZE can also run into issues with C++ tail packing,
you need to use DECL_SIZE of the respective field instead. Consider
struct A {
int : 17;
};
struct B : public A {
char c;
};
where I'm not sure we are not allowed to pack c into the tail padding
in A. Also neither TYPE_SIZE nor DECL_SIZE has to be constant,
at least in Ada you can have a variable-sized array before, and in
C you can have a trailing one.
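To make the C case concrete, a record with a trailing C99 flexible array member has a declared size that excludes the array entirely (a hypothetical illustration; the struct and helper names are invented, not from the patch):

```c
#include <stdlib.h>

/* Hypothetical illustration: a record with a bit-field followed by a
   C99 flexible array member.  sizeof covers only the fixed part, so
   the declared type's size does not bound the allocated object; the
   trailing array's extent is known only at allocation time.  */
struct msg {
  unsigned len : 12;   /* bit-field in the fixed part */
  char payload[];      /* flexible array member, not counted by sizeof */
};

/* Allocate a msg with ROOM bytes of payload.  */
static struct msg *msg_alloc(size_t room)
{
  struct msg *m = malloc(sizeof (struct msg) + room);
  if (m)
    m->len = 0;
  return m;
}
```

Any bit-region bound derived from the declared type size therefore cannot assume it covers the whole allocated object.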
Richard.
> So - I guess you need to play with some testcases like
>
> struct {
> int some_padding;
> struct {
> int bitfield :1;
> } x;
> };
>
> and split / clarify some of get_bit_range comments.
>
> Thanks,
> Richard.
>
>>
>
* Re: [C++0x] contiguous bitfields race implementation
2011-07-28 13:00 ` Richard Guenther
2011-07-29 2:58 ` Jason Merrill
2011-07-29 12:02 ` Aldy Hernandez
@ 2011-08-05 17:28 ` Aldy Hernandez
2011-08-09 10:52 ` Richard Guenther
2 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-08-05 17:28 UTC (permalink / raw)
To: Richard Guenther; +Cc: Jason Merrill, gcc-patches, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 2782 bytes --]
Alright, I'm back and bearing patches, firmly ready for the
crucifixion you will likely subject me to. :)
I've pretty much rewritten everything, taking into account all your
suggestions, and adding a handful of tests for corner cases we will now
handle correctly.
It seems the minimum needed is to calculate the byte offset of the start
of the bit region, and the length of the bit region. (Notice I say BYTE
offset, as the start of any bit region will happily coincide with a byte
boundary). These will of course be adjusted as various parts of the
bitfield infrastructure adjust offsets and memory addresses throughout.
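To make that bookkeeping concrete, here is a minimal C sketch (the helper names are invented for illustration, not the patch's API) of the byte-offset and rebasing arithmetic described above:

```c
/* Hypothetical sketch of the bit-region bookkeeping.  The region
   start is assumed byte-aligned, mirroring the patch's gcc_assert;
   BITS_PER_UNIT is 8 on common targets.  */
#define BITS_PER_UNIT 8

/* Byte offset of the region start from the containing object.  */
static unsigned long region_byte_offset(unsigned long start_bitpos)
{
  /* A bit region always begins on a byte boundary.  */
  return start_bitpos / BITS_PER_UNIT;
}

/* Re-base a bit position after the address has been advanced to the
   region start, as the patch does before store_bit_field_1.  */
static unsigned long rebase_bitnum(unsigned long bitnum,
                                   unsigned long byte_offset)
{
  return bitnum - byte_offset * BITS_PER_UNIT;
}
```

In the patch itself, store_bit_field performs the equivalent rebasing (bitnum -= offset * BITS_PER_UNIT) together with adjust_address.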
First, it's not as easy as calling get_inner_reference() only once, as
you've suggested. The only way to determine the padding at the end of
a field is to get the bit position of the field following the field in
question (or the size of the direct parent structure when the field in
question is the last field in the structure). So we need two calls to
get_inner_reference for the general case, which is at least better
than my original approach of one get_inner_reference() call per field.
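Schematically, the two-call computation reduces to the following (a simplified sketch with invented names; the real code works on the trees returned by get_inner_reference):

```c
/* Hypothetical sketch: the number of bits we may touch in a bit
   region is bounded either by the bit position of the field that
   follows the region, or, when the region ends the structure, by the
   size of the direct parent structure (so the region extends through
   the trailing padding).  All positions are relative to the
   containing object.  */
static unsigned long region_maxbits(unsigned long region_start_bit,
                                    int have_next_field,
                                    unsigned long next_field_bit,
                                    unsigned long record_size_bits)
{
  unsigned long end = have_next_field ? next_field_bit : record_size_bits;
  return end - region_start_bit;
}
```

With the <a>/<b>/<c> layout from earlier in the thread, the region containing <a> would be bounded by the bit position of <b>, while the last region would extend through the structure's trailing padding.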
I have clarified the comments and made it clear what the offsets are
relative to.
I am now handling large offsets that may appear as a tree OFFSET from
get_inner_reference, and have added a test for one such corner case,
including nested structures with head padding as you suggested. I am
still unsure whether a variable-length offset can occur before a bit
field region, so currently we assert that the final offset is host
integer representable. If you have a testcase that invalidates my
assumption, I will gladly add a test and fix the code.
Honestly, the code isn't pretty, but neither is the rest of the bit
field machinery. I tried to make do, but I'll gladly take suggestions
that are not in the form of "the entire bit field code needs to be
rewritten" :-).
To aid in reviewing, the crux of everything is in the rewritten
get_bit_range() and the first block of store_bit_field(). Everything
else is mostly noise. I have attached all of get_bit_range() as a
separate attachment to aid in reviewing, since that's the main engine,
and it has been largely rewritten.
This patch handles all the testcases I could come up with, mostly
inspired by your suggestions. Eventually I would like to replace these
target-specific tests with target-agnostic tests using the gdb
simulated-thread test harness in the cxx-mem-model branch.
Finally, you had mentioned possible problems with tail padding in C++,
and suggested I use DECL_SIZE instead of calculating the padding from
the size of the direct parent structure. DECL_SIZE doesn't include
padding, so I'm open to suggestions.
Fire away, but please be kind :).
[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 33421 bytes --]
* machmode.h (get_best_mode): Remove 2 arguments.
* fold-const.c (optimize_bit_field_compare): Same.
(fold_truthop): Same.
* expr.c (store_field): Change argument types in prototype.
(emit_group_store): Change argument types to store_bit_field call.
(copy_blkmode_from_reg): Same.
(write_complex_part): Same.
(optimize_bitfield_assignment_op): Change argument types.
Change arguments to get_best_mode.
(get_bit_range): Rewrite.
(expand_assignment): Adjust new call to get_bit_range.
Adjust bitregion_offset when to_rtx is changed.
Adjust calls to store_field with new argument types.
(store_field): New argument types.
Adjust calls to store_bit_field with new arguments.
* expr.h (store_bit_field): Change argument types.
* stor-layout.c (get_best_mode): Remove use of bitregion* arguments.
* expmed.c (store_bit_field_1): Change argument types.
Do not calculate maxbits.
Adjust bitregion_maxbits if offset changes.
(store_bit_field): Change argument types.
Adjust address taking into account bitregion_offset.
(store_fixed_bit_field): Change argument types.
Do not calculate maxbits.
(store_split_bit_field): Change argument types.
(extract_bit_field_1): Adjust arguments to get_best_mode.
(extract_fixed_bit_field): Same.
Index: machmode.h
===================================================================
--- machmode.h (revision 176891)
+++ machmode.h (working copy)
@@ -249,8 +249,6 @@ extern enum machine_mode mode_for_vector
/* Find the best mode to use to access a bit field. */
extern enum machine_mode get_best_mode (int, int,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
unsigned int,
enum machine_mode, int);
Index: fold-const.c
===================================================================
--- fold-const.c (revision 176891)
+++ fold-const.c (working copy)
@@ -3394,7 +3394,7 @@ optimize_bit_field_compare (location_t l
&& flag_strict_volatile_bitfields > 0)
nmode = lmode;
else
- nmode = get_best_mode (lbitsize, lbitpos, 0, 0,
+ nmode = get_best_mode (lbitsize, lbitpos,
const_p ? TYPE_ALIGN (TREE_TYPE (linner))
: MIN (TYPE_ALIGN (TREE_TYPE (linner)),
TYPE_ALIGN (TREE_TYPE (rinner))),
@@ -5221,7 +5221,7 @@ fold_truthop (location_t loc, enum tree_
to be relative to a field of that size. */
first_bit = MIN (ll_bitpos, rl_bitpos);
end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize);
- lnmode = get_best_mode (end_bit - first_bit, first_bit, 0, 0,
+ lnmode = get_best_mode (end_bit - first_bit, first_bit,
TYPE_ALIGN (TREE_TYPE (ll_inner)), word_mode,
volatilep);
if (lnmode == VOIDmode)
@@ -5286,7 +5286,7 @@ fold_truthop (location_t loc, enum tree_
first_bit = MIN (lr_bitpos, rr_bitpos);
end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize);
- rnmode = get_best_mode (end_bit - first_bit, first_bit, 0, 0,
+ rnmode = get_best_mode (end_bit - first_bit, first_bit,
TYPE_ALIGN (TREE_TYPE (lr_inner)), word_mode,
volatilep);
if (rnmode == VOIDmode)
Index: testsuite/c-c++-common/cxxbitfields-6.c
===================================================================
--- testsuite/c-c++-common/cxxbitfields-6.c (revision 0)
+++ testsuite/c-c++-common/cxxbitfields-6.c (revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+struct bits
+{
+ char a;
+ int b:7;
+ int :0;
+ volatile int c:7;
+ unsigned char d;
+} x;
+
+/* Store into <c> should not clobber <d>. */
+void update_c(struct bits *p, int val)
+{
+ p -> c = val;
+}
+
+/* { dg-final { scan-assembler "movb" } } */
Index: testsuite/c-c++-common/cxxbitfields-8.c
===================================================================
--- testsuite/c-c++-common/cxxbitfields-8.c (revision 0)
+++ testsuite/c-c++-common/cxxbitfields-8.c (revision 0)
@@ -0,0 +1,29 @@
+/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-options "-O --param allow-store-data-races=0" } */
+
+struct bits {
+ /* Make sure the bit position of the bitfield is larger than what
+ can be represented in an unsigned HOST_WIDE_INT, to force
+ get_inner_reference() to return something in POFFSET. */
+
+ struct {
+ int some_padding[1<<30];
+ char more_padding;
+ } pad[1<<29];
+
+ struct {
+ volatile char bitfield :1;
+ } x;
+ char b;
+};
+
+struct bits *p;
+
+/* Test that the store into <bitfield> is not done with something
+ wider than a byte move. */
+void foo()
+{
+ p->x.bitfield = 1;
+}
+
+/* { dg-final { scan-assembler "movb" } } */
Index: testsuite/c-c++-common/cxxbitfields-7.c
===================================================================
--- testsuite/c-c++-common/cxxbitfields-7.c (revision 0)
+++ testsuite/c-c++-common/cxxbitfields-7.c (revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+struct bits
+{
+ int some_padding;
+ struct {
+ volatile char bitfield :1;
+ } x;
+ char b;
+};
+
+/* Store into <bitfield> should not clobber <b>. */
+void update(struct bits *p)
+{
+ p->x.bitfield = 1;
+}
+
+/* { dg-final { scan-assembler "movb" } } */
Index: expr.c
===================================================================
--- expr.c (revision 176891)
+++ expr.c (working copy)
@@ -145,7 +145,7 @@ static void store_constructor_field (rtx
tree, tree, int, alias_set_type);
static void store_constructor (tree, rtx, int, HOST_WIDE_INT);
static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT,
+ tree, HOST_WIDE_INT,
enum machine_mode,
tree, tree, alias_set_type, bool);
@@ -2077,7 +2077,8 @@ emit_group_store (rtx orig_dst, rtx src,
emit_move_insn (adjust_address (dest, mode, bytepos), tmps[i]);
else
store_bit_field (dest, bytelen * BITS_PER_UNIT, bytepos * BITS_PER_UNIT,
- 0, 0, mode, tmps[i]);
+ integer_zero_node, MAX_FIXED_MODE_SIZE,
+ mode, tmps[i]);
}
/* Copy from the pseudo into the (probable) hard reg. */
@@ -2171,7 +2172,8 @@ copy_blkmode_from_reg (rtx tgtblk, rtx s
/* Use xbitpos for the source extraction (right justified) and
bitpos for the destination store (left justified). */
- store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, 0, 0, copy_mode,
+ store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD,
+ integer_zero_node, MAX_FIXED_MODE_SIZE, copy_mode,
extract_bit_field (src, bitsize,
xbitpos % BITS_PER_WORD, 1, false,
NULL_RTX, copy_mode, copy_mode));
@@ -2808,7 +2810,8 @@ write_complex_part (rtx cplx, rtx val, b
gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD);
}
- store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, 0, 0, imode, val);
+ store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0,
+ integer_zero_node, MAX_FIXED_MODE_SIZE, imode, val);
}
/* Extract one of the components of the complex value CPLX. Extract the
@@ -3943,8 +3946,8 @@ get_subtarget (rtx x)
static bool
optimize_bitfield_assignment_op (unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ tree bitregion_offset ATTRIBUTE_UNUSED,
+ HOST_WIDE_INT bitregion_maxbits,
enum machine_mode mode1, rtx str_rtx,
tree to, tree src)
{
@@ -4005,8 +4008,9 @@ optimize_bitfield_assignment_op (unsigne
if (str_bitsize == 0 || str_bitsize > BITS_PER_WORD)
str_mode = word_mode;
+ if (bitregion_maxbits < GET_MODE_BITSIZE (str_mode))
+ str_mode = smallest_mode_for_size (bitregion_maxbits, MODE_INT);
str_mode = get_best_mode (bitsize, bitpos,
- bitregion_start, bitregion_end,
MEM_ALIGN (str_rtx), str_mode, 0);
if (str_mode == VOIDmode)
return false;
@@ -4118,18 +4122,31 @@ optimize_bitfield_assignment_op (unsigne
/* In the C++ memory model, consecutive bit fields in a structure are
considered one memory location.
- Given a COMPONENT_REF, this function returns the bit range of
- consecutive bits in which this COMPONENT_REF belongs in. The
- values are returned in *BITSTART and *BITEND. If either the C++
- memory model is not activated, or this memory access is not thread
- visible, 0 is returned in *BITSTART and *BITEND.
+ Given a COMPONENT_REF, this function calculates the byte offset of
+ the beginning of the memory location containing bit field being
+ referenced. The byte offset is returned in *OFFSET and is the byte
+ offset from the beginning of the containing object (INNERDECL).
+
+ The largest mode that can be used to write into the bit field will
+ be returned in *LARGEST_MODE.
+
+ For example, in the following structure, the bit region starts in
+ byte 4. In an architecture where the size of BITS gets padded to
+ 32-bits, SImode will be returned in *LARGEST_MODE.
+
+ struct bits {
+ int some_padding;
+ struct {
+ volatile char bitfield :1;
+ } bits;
+ char b;
+ };
EXP is the COMPONENT_REF.
- INNERDECL is the actual object being referenced.
- BITPOS is the position in bits where the bit starts within the structure.
- BITSIZE is size in bits of the field being referenced in EXP.
- For example, while storing into FOO.A here...
+ Examples.
+
+ While storing into FOO.A here...
struct {
BIT 0:
@@ -4140,67 +4157,99 @@ optimize_bitfield_assignment_op (unsigne
unsigned int d : 6;
} foo;
- ...we are not allowed to store past <b>, so for the layout above, a
- range of 0..7 (because no one cares if we store into the
- padding). */
+ ...we are not allowed to store past <b>, so for the layout above,
+ *OFFSET will be byte 0, and *LARGEST_MODE will be QImode.
+
+ Here we have 3 distinct memory locations because of the zero-sized
+ bit-field separating the bits:
+
+ struct bits
+ {
+ char a;
+ int b:7;
+ int :0;
+ int c:7;
+ } foo;
+
+ Here we also have 3 distinct memory locations because
+ structure/union boundaries will separate contiguous bit-field
+ sequences:
+
+ struct {
+ char a:3;
+ struct { char b:4; } x;
+ char c:5;
+ } foo; */
static void
-get_bit_range (unsigned HOST_WIDE_INT *bitstart,
- unsigned HOST_WIDE_INT *bitend,
- tree exp, tree innerdecl,
- HOST_WIDE_INT bitpos, HOST_WIDE_INT bitsize)
+get_bit_range (tree exp, tree *offset, HOST_WIDE_INT *maxbits)
{
tree field, record_type, fld;
bool found_field = false;
bool prev_field_is_bitfield;
+ tree start_offset, end_offset, maxbits_tree;
+ tree start_bitpos_direct_parent = NULL_TREE;
+ HOST_WIDE_INT start_bitpos, end_bitpos;
+ HOST_WIDE_INT cumulative_bitsize = 0;
gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
- /* If other threads can't see this value, no need to restrict stores. */
- if (ALLOW_STORE_DATA_RACES
- || ((TREE_CODE (innerdecl) == MEM_REF
- || TREE_CODE (innerdecl) == TARGET_MEM_REF)
- && !ptr_deref_may_alias_global_p (TREE_OPERAND (innerdecl, 0)))
- || (DECL_P (innerdecl)
- && (DECL_THREAD_LOCAL_P (innerdecl)
- || !TREE_STATIC (innerdecl))))
- {
- *bitstart = *bitend = 0;
- return;
- }
-
/* Bit field we're storing into. */
field = TREE_OPERAND (exp, 1);
record_type = DECL_FIELD_CONTEXT (field);
/* Count the contiguous bitfields for the memory location that
contains FIELD. */
- *bitstart = 0;
- prev_field_is_bitfield = true;
+ start_offset = size_zero_node;
+ start_bitpos = 0;
+ prev_field_is_bitfield = false;
for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld))
{
- tree t, offset;
- enum machine_mode mode;
- int unsignedp, volatilep;
-
if (TREE_CODE (fld) != FIELD_DECL)
continue;
- t = build3 (COMPONENT_REF, TREE_TYPE (exp),
- unshare_expr (TREE_OPERAND (exp, 0)),
- fld, NULL_TREE);
- get_inner_reference (t, &bitsize, &bitpos, &offset,
- &mode, &unsignedp, &volatilep, true);
-
if (field == fld)
found_field = true;
- if (DECL_BIT_FIELD_TYPE (fld) && bitsize > 0)
+ /* If we have a bit-field with a bitsize > 0... */
+ if (DECL_BIT_FIELD_TYPE (fld)
+ && (!host_integerp (DECL_SIZE (fld), 1)
+ || tree_low_cst (DECL_SIZE (fld), 1) > 0))
{
+ /* Start of a new bit region. */
if (prev_field_is_bitfield == false)
{
- *bitstart = bitpos;
+ HOST_WIDE_INT bitsize;
+ enum machine_mode mode;
+ int unsignedp, volatilep;
+
+ /* Save starting bitpos and offset. */
+ get_inner_reference (build3 (COMPONENT_REF,
+ TREE_TYPE (exp),
+ TREE_OPERAND (exp, 0),
+ fld, NULL_TREE),
+ &bitsize, &start_bitpos, &start_offset,
+ &mode, &unsignedp, &volatilep, true);
+ /* Save the bit offset of the current structure. */
+ start_bitpos_direct_parent = DECL_FIELD_BIT_OFFSET (fld);
prev_field_is_bitfield = true;
+ cumulative_bitsize = 0;
+ }
+
+ cumulative_bitsize += tree_low_cst (DECL_SIZE (fld), 1);
+
+ /* Short-circuit out if we have the max bits allowed. */
+ /* ?? Is this even worth it. ?? */
+ if (cumulative_bitsize >= MAX_FIXED_MODE_SIZE)
+ {
+ *maxbits = MAX_FIXED_MODE_SIZE;
+ /* Calculate byte offset to the beginning of the bit region. */
+ gcc_assert (start_bitpos % BITS_PER_UNIT == 0);
+ *offset = fold_build2 (PLUS_EXPR, TREE_TYPE (start_offset),
+ start_offset,
+ build_int_cst (integer_type_node,
+ start_bitpos / BITS_PER_UNIT));
+ return;
}
}
else
@@ -4212,17 +4261,58 @@ get_bit_range (unsigned HOST_WIDE_INT *b
}
gcc_assert (found_field);
+ /* Calculate byte offset to the beginning of the bit region. */
+ /* OFFSET = START_OFFSET + (START_BITPOS / BITS_PER_UNIT) */
+ gcc_assert (start_bitpos % BITS_PER_UNIT == 0);
+ if (!start_offset)
+ start_offset = size_zero_node;
+ *offset = fold_build2 (PLUS_EXPR, TREE_TYPE (start_offset),
+ start_offset,
+ build_int_cst (integer_type_node,
+ start_bitpos / BITS_PER_UNIT));
if (fld)
{
+ HOST_WIDE_INT bitsize;
+ enum machine_mode mode;
+ int unsignedp, volatilep;
+
/* We found the end of the bit field sequence. Include the
- padding up to the next field and be done. */
- *bitend = bitpos - 1;
+ padding up to the next field. */
+
+ /* Calculate bitpos and offset of the next field. */
+ get_inner_reference (build3 (COMPONENT_REF,
+ TREE_TYPE (exp),
+ TREE_OPERAND (exp, 0),
+ fld, NULL_TREE),
+ &bitsize, &end_bitpos, &end_offset,
+ &mode, &unsignedp, &volatilep, true);
+ gcc_assert (end_bitpos % BITS_PER_UNIT == 0);
+
+ if (end_offset)
+ {
+ tree type = TREE_TYPE (end_offset), end;
+
+ /* Calculate byte offset to the end of the bit region. */
+ end = fold_build2 (PLUS_EXPR, type,
+ end_offset,
+ build_int_cst (type,
+ end_bitpos / BITS_PER_UNIT));
+ maxbits_tree = fold_build2 (MINUS_EXPR, type, end, *offset);
+ }
+ else
+ maxbits_tree = build_int_cst (integer_type_node,
+ end_bitpos - start_bitpos);
+
+ /* ?? Can we get a variable-lengthened offset here ?? */
+ gcc_assert (host_integerp (maxbits_tree, 1));
+ *maxbits = TREE_INT_CST_LOW (maxbits_tree);
}
else
{
/* If this is the last element in the structure, include the padding
at the end of structure. */
- *bitend = TREE_INT_CST_LOW (TYPE_SIZE (record_type)) - 1;
+ *maxbits = TREE_INT_CST_LOW (TYPE_SIZE (record_type))
+ - TREE_INT_CST_LOW (start_bitpos_direct_parent);
}
}
@@ -4324,8 +4414,8 @@ expand_assignment (tree to, tree from, b
{
enum machine_mode mode1;
HOST_WIDE_INT bitsize, bitpos;
- unsigned HOST_WIDE_INT bitregion_start = 0;
- unsigned HOST_WIDE_INT bitregion_end = 0;
+ tree bitregion_offset = size_zero_node;
+ HOST_WIDE_INT bitregion_maxbits = MAX_FIXED_MODE_SIZE;
tree offset;
int unsignedp;
int volatilep = 0;
@@ -4337,8 +4427,23 @@ expand_assignment (tree to, tree from, b
if (TREE_CODE (to) == COMPONENT_REF
&& DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
- get_bit_range (&bitregion_start, &bitregion_end,
- to, tem, bitpos, bitsize);
+ {
+ /* If other threads can't see this value, no need to
+ restrict stores. */
+ if (ALLOW_STORE_DATA_RACES
+ || ((TREE_CODE (tem) == MEM_REF
+ || TREE_CODE (tem) == TARGET_MEM_REF)
+ && !ptr_deref_may_alias_global_p (TREE_OPERAND (tem, 0)))
+ || (DECL_P (tem)
+ && (DECL_THREAD_LOCAL_P (tem)
+ || !TREE_STATIC (tem))))
+ {
+ bitregion_offset = size_zero_node;
+ bitregion_maxbits = MAX_FIXED_MODE_SIZE;
+ }
+ else
+ get_bit_range (to, &bitregion_offset, &bitregion_maxbits);
+ }
/* If we are going to use store_bit_field and extract_bit_field,
make sure to_rtx will be safe for multiple use. */
@@ -4388,12 +4493,19 @@ expand_assignment (tree to, tree from, b
&& MEM_ALIGN (to_rtx) == GET_MODE_ALIGNMENT (mode1))
{
to_rtx = adjust_address (to_rtx, mode1, bitpos / BITS_PER_UNIT);
+ bitregion_offset = fold_build2 (MINUS_EXPR, integer_type_node,
+ bitregion_offset,
+ build_int_cst (integer_type_node,
+ bitpos / BITS_PER_UNIT));
bitpos = 0;
}
to_rtx = offset_address (to_rtx, offset_rtx,
highest_pow2_factor_for_target (to,
offset));
+ bitregion_offset = fold_build2 (MINUS_EXPR, integer_type_node,
+ bitregion_offset,
+ offset);
}
/* No action is needed if the target is not a memory and the field
@@ -4421,13 +4533,13 @@ expand_assignment (tree to, tree from, b
nontemporal);
else if (bitpos + bitsize <= mode_bitsize / 2)
result = store_field (XEXP (to_rtx, 0), bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_offset, bitregion_maxbits,
mode1, from, TREE_TYPE (tem),
get_alias_set (to), nontemporal);
else if (bitpos >= mode_bitsize / 2)
result = store_field (XEXP (to_rtx, 1), bitsize,
bitpos - mode_bitsize / 2,
- bitregion_start, bitregion_end,
+ bitregion_offset, bitregion_maxbits,
mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
@@ -4450,7 +4562,7 @@ expand_assignment (tree to, tree from, b
write_complex_part (temp, XEXP (to_rtx, 0), false);
write_complex_part (temp, XEXP (to_rtx, 1), true);
result = store_field (temp, bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_offset, bitregion_maxbits,
mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
@@ -4477,13 +4589,14 @@ expand_assignment (tree to, tree from, b
}
if (optimize_bitfield_assignment_op (bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_offset,
+ bitregion_maxbits,
mode1,
to_rtx, to, from))
result = NULL;
else
result = store_field (to_rtx, bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_offset, bitregion_maxbits,
mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
@@ -5917,10 +6030,10 @@ store_constructor (tree exp, rtx target,
BITSIZE bits, starting BITPOS bits from the start of TARGET.
If MODE is VOIDmode, it means that we are storing into a bit-field.
- BITREGION_START is bitpos of the first bitfield in this region.
- BITREGION_END is the bitpos of the ending bitfield in this region.
- These two fields are 0, if the C++ memory model does not apply,
- or we are not interested in keeping track of bitfield regions.
+ BITREGION_OFFSET is the byte offset from the beginning of the
+ containing object to the start of the bit region.
+ BITREGION_MAXBITS is the size in bits of the largest mode that can
+ be used to set the bit-field in question.
Always return const0_rtx unless we have something particular to
return.
@@ -5935,8 +6048,8 @@ store_constructor (tree exp, rtx target,
static rtx
store_field (rtx target, HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ tree bitregion_offset,
+ HOST_WIDE_INT bitregion_maxbits,
enum machine_mode mode, tree exp, tree type,
alias_set_type alias_set, bool nontemporal)
{
@@ -5970,7 +6083,7 @@ store_field (rtx target, HOST_WIDE_INT b
emit_move_insn (object, target);
store_field (blk_object, bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_offset, bitregion_maxbits,
mode, exp, type, alias_set, nontemporal);
emit_move_insn (target, object);
@@ -6086,7 +6199,7 @@ store_field (rtx target, HOST_WIDE_INT b
/* Store the value in the bitfield. */
store_bit_field (target, bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_offset, bitregion_maxbits,
mode, temp);
return const0_rtx;
Index: expr.h
===================================================================
--- expr.h (revision 176891)
+++ expr.h (working copy)
@@ -666,8 +666,8 @@ mode_for_extraction (enum extraction_pat
extern void store_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
+ tree,
+ HOST_WIDE_INT,
enum machine_mode, rtx);
extern rtx extract_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT, int, bool, rtx,
Index: stor-layout.c
===================================================================
--- stor-layout.c (revision 176891)
+++ stor-layout.c (working copy)
@@ -2361,13 +2361,6 @@ fixup_unsigned_type (tree type)
/* Find the best machine mode to use when referencing a bit field of length
BITSIZE bits starting at BITPOS.
- BITREGION_START is the bit position of the first bit in this
- sequence of bit fields. BITREGION_END is the last bit in this
- sequence. If these two fields are non-zero, we should restrict the
- memory access to a maximum sized chunk of
- BITREGION_END - BITREGION_START + 1. Otherwise, we are allowed to touch
- any adjacent non bit-fields.
-
The underlying object is known to be aligned to a boundary of ALIGN bits.
If LARGEST_MODE is not VOIDmode, it means that we should not use a mode
larger than LARGEST_MODE (usually SImode).
@@ -2386,20 +2379,11 @@ fixup_unsigned_type (tree type)
enum machine_mode
get_best_mode (int bitsize, int bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
unsigned int align,
enum machine_mode largest_mode, int volatilep)
{
enum machine_mode mode;
unsigned int unit = 0;
- unsigned HOST_WIDE_INT maxbits;
-
- /* If unset, no restriction. */
- if (!bitregion_end)
- maxbits = MAX_FIXED_MODE_SIZE;
- else
- maxbits = (bitregion_end - bitregion_start) % align + 1;
/* Find the narrowest integer mode that contains the bit field. */
for (mode = GET_CLASS_NARROWEST_MODE (MODE_INT); mode != VOIDmode;
@@ -2436,7 +2420,6 @@ get_best_mode (int bitsize, int bitpos,
&& bitpos / unit == (bitpos + bitsize - 1) / unit
&& unit <= BITS_PER_WORD
&& unit <= MIN (align, BIGGEST_ALIGNMENT)
- && unit <= maxbits
&& (largest_mode == VOIDmode
|| unit <= GET_MODE_BITSIZE (largest_mode)))
wide_mode = tmode;
Index: expmed.c
===================================================================
--- expmed.c (revision 176891)
+++ expmed.c (working copy)
@@ -48,13 +48,11 @@ struct target_expmed *this_target_expmed
static void store_fixed_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
+ tree, HOST_WIDE_INT,
rtx);
static void store_split_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
+ tree, HOST_WIDE_INT,
rtx);
static rtx extract_fixed_bit_field (enum machine_mode, rtx,
unsigned HOST_WIDE_INT,
@@ -340,8 +338,8 @@ mode_for_extraction (enum extraction_pat
static bool
store_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitnum,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ tree bitregion_offset,
+ HOST_WIDE_INT bitregion_maxbits,
enum machine_mode fieldmode,
rtx value, bool fallback_p)
{
@@ -558,7 +556,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!store_bit_field_1 (op0, MIN (BITS_PER_WORD,
bitsize - i * BITS_PER_WORD),
bitnum + bit_offset,
- bitregion_start, bitregion_end,
+ bitregion_offset, bitregion_maxbits,
word_mode,
value_word, fallback_p))
{
@@ -722,10 +720,6 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (HAVE_insv && MEM_P (op0))
{
enum machine_mode bestmode;
- unsigned HOST_WIDE_INT maxbits = MAX_FIXED_MODE_SIZE;
-
- if (bitregion_end)
- maxbits = bitregion_end - bitregion_start + 1;
/* Get the mode to use for inserting into this field. If OP0 is
BLKmode, get the smallest mode consistent with the alignment. If
@@ -733,15 +727,18 @@ store_bit_field_1 (rtx str_rtx, unsigned
mode. Otherwise, use the smallest mode containing the field. */
if (GET_MODE (op0) == BLKmode
- || GET_MODE_BITSIZE (GET_MODE (op0)) > maxbits
+ || GET_MODE_BITSIZE (GET_MODE (op0)) > bitregion_maxbits
|| (op_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (op_mode)))
- bestmode = get_best_mode (bitsize, bitnum,
- bitregion_start, bitregion_end,
- MEM_ALIGN (op0),
- (op_mode == MAX_MACHINE_MODE
- ? VOIDmode : op_mode),
- MEM_VOLATILE_P (op0));
+ {
+ bestmode = (op_mode == MAX_MACHINE_MODE ? VOIDmode : op_mode);
+ if (bitregion_maxbits < GET_MODE_SIZE (op_mode))
+ bestmode = smallest_mode_for_size (bitregion_maxbits, MODE_INT);
+ bestmode = get_best_mode (bitsize, bitnum,
+ MEM_ALIGN (op0),
+ bestmode,
+ MEM_VOLATILE_P (op0));
+ }
else
bestmode = GET_MODE (op0);
@@ -767,7 +764,8 @@ store_bit_field_1 (rtx str_rtx, unsigned
the unit. */
tempreg = copy_to_reg (xop0);
if (store_bit_field_1 (tempreg, bitsize, xbitpos,
- bitregion_start, bitregion_end,
+ bitregion_offset,
+ bitregion_maxbits - xoffset * BITS_PER_UNIT,
fieldmode, orig_value, false))
{
emit_move_insn (xop0, tempreg);
@@ -780,8 +778,9 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!fallback_p)
return false;
+ bitregion_maxbits -= offset * BITS_PER_UNIT;
store_fixed_bit_field (op0, offset, bitsize, bitpos,
- bitregion_start, bitregion_end, value);
+ bitregion_offset, bitregion_maxbits, value);
return true;
}
@@ -789,18 +788,17 @@ store_bit_field_1 (rtx str_rtx, unsigned
into a bit-field within structure STR_RTX
containing BITSIZE bits starting at bit BITNUM.
- BITREGION_START is bitpos of the first bitfield in this region.
- BITREGION_END is the bitpos of the ending bitfield in this region.
- These two fields are 0, if the C++ memory model does not apply,
- or we are not interested in keeping track of bitfield regions.
+ BITREGION_OFFSET is the byte offset STR_RTX to the start of the bit
+ region. BITREGION_MAXBITS is the number of bits of the largest
+ mode that can be used to set the bit-field in question.
FIELDMODE is the machine-mode of the FIELD_DECL node for this field. */
void
store_bit_field (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitnum,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ tree bitregion_offset,
+ HOST_WIDE_INT bitregion_maxbits,
enum machine_mode fieldmode,
rtx value)
{
@@ -808,30 +806,23 @@ store_bit_field (rtx str_rtx, unsigned H
bit region. Adjust the address to start at the beginning of the
bit region. */
if (MEM_P (str_rtx)
- && bitregion_start > 0)
+ && bitregion_maxbits < MAX_FIXED_MODE_SIZE)
{
- enum machine_mode bestmode;
- enum machine_mode op_mode;
- unsigned HOST_WIDE_INT offset;
+ HOST_WIDE_INT offset;
- op_mode = mode_for_extraction (EP_insv, 3);
- if (op_mode == MAX_MACHINE_MODE)
- op_mode = VOIDmode;
-
- offset = bitregion_start / BITS_PER_UNIT;
- bitnum -= bitregion_start;
- bitregion_end -= bitregion_start;
- bitregion_start = 0;
- bestmode = get_best_mode (bitsize, bitnum,
- bitregion_start, bitregion_end,
- MEM_ALIGN (str_rtx),
- op_mode,
- MEM_VOLATILE_P (str_rtx));
- str_rtx = adjust_address (str_rtx, bestmode, offset);
+ /* ?? Can we get a variable length offset here ?? */
+ gcc_assert (host_integerp (bitregion_offset, 1));
+ offset = tree_low_cst (bitregion_offset, 1);
+
+ /* Adjust the bit position accordingly. */
+ bitnum -= offset * BITS_PER_UNIT;
+ bitregion_offset = integer_zero_node;
+ /* Adjust the actual address. */
+ str_rtx = adjust_address (str_rtx, GET_MODE (str_rtx), offset);
}
if (!store_bit_field_1 (str_rtx, bitsize, bitnum,
- bitregion_start, bitregion_end,
+ bitregion_offset, bitregion_maxbits,
fieldmode, value, true))
gcc_unreachable ();
}
@@ -849,8 +840,8 @@ static void
store_fixed_bit_field (rtx op0, unsigned HOST_WIDE_INT offset,
unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ tree bitregion_offset,
+ HOST_WIDE_INT bitregion_maxbits,
rtx value)
{
enum machine_mode mode;
@@ -873,17 +864,14 @@ store_fixed_bit_field (rtx op0, unsigned
if (bitsize + bitpos > BITS_PER_WORD)
{
store_split_bit_field (op0, bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_offset, bitregion_maxbits,
value);
return;
}
}
else
{
- unsigned HOST_WIDE_INT maxbits = MAX_FIXED_MODE_SIZE;
-
- if (bitregion_end)
- maxbits = bitregion_end - bitregion_start + 1;
+ HOST_WIDE_INT maxbits = bitregion_maxbits;
/* Get the proper mode to use for this field. We want a mode that
includes the entire field. If such a mode would be larger than
@@ -901,16 +889,19 @@ store_fixed_bit_field (rtx op0, unsigned
&& flag_strict_volatile_bitfields > 0)
mode = GET_MODE (op0);
else
- mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
- bitregion_start, bitregion_end,
- MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
+ {
+ if (bitregion_maxbits < GET_MODE_BITSIZE (mode))
+ mode = smallest_mode_for_size (bitregion_maxbits, MODE_INT);
+ mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
+ MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
+ }
if (mode == VOIDmode)
{
/* The only way this should occur is if the field spans word
boundaries. */
store_split_bit_field (op0, bitsize, bitpos + offset * BITS_PER_UNIT,
- bitregion_start, bitregion_end, value);
+ bitregion_offset, bitregion_maxbits, value);
return;
}
@@ -1031,8 +1022,8 @@ store_fixed_bit_field (rtx op0, unsigned
static void
store_split_bit_field (rtx op0, unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ tree bitregion_offset,
+ HOST_WIDE_INT bitregion_maxbits,
rtx value)
{
unsigned int unit;
@@ -1148,7 +1139,8 @@ store_split_bit_field (rtx op0, unsigned
it is just an out-of-bounds access. Ignore it. */
if (word != const0_rtx)
store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize,
- thispos, bitregion_start, bitregion_end, part);
+ thispos, bitregion_offset, bitregion_maxbits,
+ part);
bitsdone += thissize;
}
}
@@ -1588,7 +1580,7 @@ extract_bit_field_1 (rtx str_rtx, unsign
if (GET_MODE (op0) == BLKmode
|| (ext_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (ext_mode)))
- bestmode = get_best_mode (bitsize, bitnum, 0, 0, MEM_ALIGN (op0),
+ bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
(ext_mode == MAX_MACHINE_MODE
? VOIDmode : ext_mode),
MEM_VOLATILE_P (op0));
@@ -1714,7 +1706,7 @@ extract_fixed_bit_field (enum machine_mo
mode = tmode;
}
else
- mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT, 0, 0,
+ mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
[-- Attachment #3: get-bit-range --]
[-- Type: text/plain, Size: 5785 bytes --]
/* In the C++ memory model, consecutive bit fields in a structure are
considered one memory location.
Given a COMPONENT_REF, this function calculates the byte offset of
the beginning of the memory location containing the bit field being
referenced. The byte offset is returned in *OFFSET and is the byte
offset from the beginning of the containing object (INNERDECL).
The largest mode that can be used to write into the bit field will
be returned in *LARGEST_MODE.
For example, in the following structure, the bit region starts in
byte 4. In an architecture where the size of BITS gets padded to
32-bits, SImode will be returned in *LARGEST_MODE.
struct bits {
int some_padding;
struct {
volatile char bitfield :1;
} bits;
char b;
};
EXP is the COMPONENT_REF.
Examples.
While storing into FOO.A here...
struct {
BIT 0:
unsigned int a : 4;
unsigned int b : 1;
BIT 8:
unsigned char c;
unsigned int d : 6;
} foo;
...we are not allowed to store past <b>, so for the layout above,
*OFFSET will be byte 0, and *LARGEST_MODE will be QImode.
Here we have 3 distinct memory locations because of the zero-sized
bit-field separating the bits:
struct bits
{
char a;
int b:7;
int :0;
int c:7;
} foo;
Here we also have 3 distinct memory locations because
structure/union boundaries will separate contiguous bit-field
sequences:
struct {
char a:3;
struct { char b:4; } x;
char c:5;
} foo; */
static void
get_bit_range (tree exp, tree *offset, HOST_WIDE_INT *maxbits)
{
tree field, record_type, fld;
bool found_field = false;
bool prev_field_is_bitfield;
tree start_offset, end_offset, maxbits_tree;
tree start_bitpos_direct_parent = NULL_TREE;
HOST_WIDE_INT start_bitpos, end_bitpos;
HOST_WIDE_INT cumulative_bitsize = 0;
gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
/* Bit field we're storing into. */
field = TREE_OPERAND (exp, 1);
record_type = DECL_FIELD_CONTEXT (field);
/* Count the contiguous bitfields for the memory location that
contains FIELD. */
start_offset = size_zero_node;
start_bitpos = 0;
prev_field_is_bitfield = false;
for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld))
{
if (TREE_CODE (fld) != FIELD_DECL)
continue;
if (field == fld)
found_field = true;
/* If we have a bit-field with a bitsize > 0... */
if (DECL_BIT_FIELD_TYPE (fld)
&& (!host_integerp (DECL_SIZE (fld), 1)
|| tree_low_cst (DECL_SIZE (fld), 1) > 0))
{
/* Start of a new bit region. */
if (prev_field_is_bitfield == false)
{
HOST_WIDE_INT bitsize;
enum machine_mode mode;
int unsignedp, volatilep;
/* Save starting bitpos and offset. */
get_inner_reference (build3 (COMPONENT_REF,
TREE_TYPE (exp),
TREE_OPERAND (exp, 0),
fld, NULL_TREE),
&bitsize, &start_bitpos, &start_offset,
&mode, &unsignedp, &volatilep, true);
/* Save the bit offset of the current structure. */
start_bitpos_direct_parent = DECL_FIELD_BIT_OFFSET (fld);
prev_field_is_bitfield = true;
cumulative_bitsize = 0;
}
cumulative_bitsize += tree_low_cst (DECL_SIZE (fld), 1);
/* Short-circuit out if we have the max bits allowed. */
/* ?? Is this even worth it? ?? */
if (cumulative_bitsize >= MAX_FIXED_MODE_SIZE)
{
*maxbits = MAX_FIXED_MODE_SIZE;
/* Calculate byte offset to the beginning of the bit region. */
gcc_assert (start_bitpos % BITS_PER_UNIT == 0);
*offset = fold_build2 (PLUS_EXPR, TREE_TYPE (start_offset),
start_offset,
build_int_cst (integer_type_node,
start_bitpos / BITS_PER_UNIT));
return;
}
}
else
{
prev_field_is_bitfield = false;
if (found_field)
break;
}
}
gcc_assert (found_field);
/* Calculate byte offset to the beginning of the bit region. */
/* OFFSET = START_OFFSET + (START_BITPOS / BITS_PER_UNIT) */
gcc_assert (start_bitpos % BITS_PER_UNIT == 0);
if (!start_offset)
start_offset = size_zero_node;
*offset = fold_build2 (PLUS_EXPR, TREE_TYPE (start_offset),
start_offset,
build_int_cst (integer_type_node,
start_bitpos / BITS_PER_UNIT));
if (fld)
{
HOST_WIDE_INT bitsize;
enum machine_mode mode;
int unsignedp, volatilep;
/* We found the end of the bit field sequence. Include the
padding up to the next field. */
/* Calculate bitpos and offset of the next field. */
get_inner_reference (build3 (COMPONENT_REF,
TREE_TYPE (exp),
TREE_OPERAND (exp, 0),
fld, NULL_TREE),
&bitsize, &end_bitpos, &end_offset,
&mode, &unsignedp, &volatilep, true);
gcc_assert (end_bitpos % BITS_PER_UNIT == 0);
if (end_offset)
{
tree type = TREE_TYPE (end_offset), end;
/* Calculate byte offset to the end of the bit region. */
end = fold_build2 (PLUS_EXPR, type,
end_offset,
build_int_cst (type,
end_bitpos / BITS_PER_UNIT));
maxbits_tree = fold_build2 (MINUS_EXPR, type, end, *offset);
}
else
maxbits_tree = build_int_cst (integer_type_node,
end_bitpos - start_bitpos);
/* ?? Can we get a variable-length offset here ?? */
gcc_assert (host_integerp (maxbits_tree, 1));
*maxbits = TREE_INT_CST_LOW (maxbits_tree);
}
else
{
/* If this is the last element in the structure, include the padding
at the end of structure. */
*maxbits = TREE_INT_CST_LOW (TYPE_SIZE (record_type))
- TREE_INT_CST_LOW (start_bitpos_direct_parent);
}
}
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [C++0x] contiguous bitfields race implementation
2011-08-05 17:28 ` Aldy Hernandez
@ 2011-08-09 10:52 ` Richard Guenther
2011-08-09 20:53 ` Aldy Hernandez
0 siblings, 1 reply; 81+ messages in thread
From: Richard Guenther @ 2011-08-09 10:52 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jason Merrill, gcc-patches, Jakub Jelinek
On Fri, Aug 5, 2011 at 7:25 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
> Alright, I'm back and bearing patches. Firmly ready for the crucifixion you
> will likely submit me to. :)
>
> I've pretty much rewritten everything, taking into account all your
> suggestions, and adding a handful of tests for corner cases we will now
> handle correctly.
>
> It seems the minimum needed is to calculate the byte offset of the start of
> the bit region, and the length of the bit region. (Notice I say BYTE
> offset, as the start of any bit region will happily coincide with a byte
> boundary). These will of course be adjusted as various parts of the
> bitfield infrastructure adjust offsets and memory addresses throughout.
>
> First, it's not as easy as calling get_inner_reference() only once as you've
> suggested. The only way to determine the padding at the end of a field is
> getting the bit position of the field following the field in question (or
> the size of the direct parent structure in the case where the field in
> question is the last field in the structure). So we need two calls to
> get_inner_reference for the general case. Which is at least better than my
> original call to get_inner_reference() for every field.
>
> I have clarified the comments and made it clear what the offsets are
> relative to.
>
> I am now handling large offsets that may appear as a tree OFFSET from
> get_inner_reference, and have added a test for one such corner case,
> including nested structures with head padding as you suggested. I am still
> unsure that a variable length offset can happen before a bit field region.
> So currently we assert that the final offset is host integer representable.
> If you have a testcase that invalidates my assumption, I will gladly add a
> test and fix the code.
>
> Honestly, the code isn't pretty, but neither is the rest of the bit field
> machinery. I tried to make do, but I'll gladly take suggestions that are
> not in the form of "the entire bit field code needs to be rewritten" :-).
>
> To aid in reviewing, the crux of everything is in the rewritten
> get_bit_range() and the first block of store_bit_field(). Everything else
> is mostly noise. I have attached all of get_bit_range() as a separate
> attachment to aid in reviewing, since that's the main engine, and it has
> been largely rewritten.
>
> This patch handles all the testcases I could come up with, mostly inspired
> by your suggestions. Eventually I would like to replace these target
> specific tests with target-agnostic tests using the gdb simulated thread
> test harness in the cxx-mem-model branch.
>
> Finally, you had mentioned possible problems with tail padding in C++, and
> suggested I use DECL_SIZE instead of calculating the padding using the size
> of direct parent structure. DECL_SIZE doesn't include padding, so I'm open
> to suggestions.
>
> Fire away, but please be kind :).
Just reading and commenting top-down on the new get_bit_range function.
/* If we have a bit-field with a bitsize > 0... */
if (DECL_BIT_FIELD_TYPE (fld)
&& (!host_integerp (DECL_SIZE (fld), 1)
|| tree_low_cst (DECL_SIZE (fld), 1) > 0))
DECL_SIZE should always be host_integerp for bitfields.
/* Save starting bitpos and offset. */
get_inner_reference (build3 (COMPONENT_REF,
TREE_TYPE (exp),
TREE_OPERAND (exp, 0),
fld, NULL_TREE),
&bitsize, &start_bitpos, &start_offset,
&mode, &unsignedp, &volatilep, true);
ok, so now you do this only for the first field in a bitfield group. But you
do it for _all_ bitfield groups in a struct, not only for the interesting one.
May I suggest to split the loop into two, first searching the first field
in the bitfield group that contains fld and then in a separate loop computing
the bitwidth?
Backing up, considering one of my earlier questions. What is *offset
supposed to be relative to? The docs say sth like "relative to INNERDECL",
but the code doesn't contain a reference to INNERDECL anymore.
I think if the offset is really supposed to be relative to INNERDECL then
you should return a split offset, similar to get_inner_reference itself.
Thus, return a byte tree offset plus a HWI bit offset and maxbits
(that HWI bit offset is the offset to the start of the bitfield group, right?
Not the offset of the field that is referenced?)
It really feels like you should do something like
/* Get the offset to our parent structure. */
get_inner_reference (TREE_OPERAND (exp, 0), &offset, &bit_offset....);
for (fld = TYPE_FIELDS (...) ...)
/* Search for the starting field of the bitfield group of
TREE_OPERAND (exp, 1) */
offset += DECL_FIELD_OFFSET (first_field_of_group);
bit_offset += DECL_FIELD_BIT_OFFSET (first_field_of_group);
(well, basically copy what get_inner_reference would do here)
for (...)
accumulate bit-offsets of the group (mind they'll eventually wrap
when hitting DECL_OFFSET_ALIGN) to compute maxbits
(that also always will fit in a HWI)
Now we come to that padding thing. What's the C++ memory model
semantic for re-used tail padding? Consider
struct A
{
int i;
bool a:1;
}
struct B : public A
{
bool b:1;
}
The tail-padding of A is 3 bytes that may be used by b. Now, is
accessing a allowed to race with accessing b? Then the group for
a may include the 3 bytes tail padding. If not, then it may not
(in which case using DECL_SIZE would be appropriate).
There is too much get_inner_reference and tree folding stuff in this
patch (which makes it expensive given that the algorithm is still
inherently quadratic). You can rely on the bitfield group advancing
by integer-cst bits (but the start offset may be non-constant, so
may the size of the underlying record).
Now seeing all this - and considering that this is purely C++ frontend
semantics. Why can't the C++ frontend itself constrain accesses
according to the required semantics? It could simply create
BIT_FIELD_REF <MEM_REF <&containing_record,
byte-offset-to-start-of-group>, bit-size, bit-offset> for all bitfield
references (with a proper
type for the MEM_REF, specifying the size of the group). That would
also avoid issues during tree optimization and would at least allow
optimizing the bitfield accesses according to the desired C++ semantics.
Richard.
* Re: [C++0x] contiguous bitfields race implementation
2011-08-09 10:52 ` Richard Guenther
@ 2011-08-09 20:53 ` Aldy Hernandez
2011-08-10 13:34 ` Richard Guenther
0 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-08-09 20:53 UTC (permalink / raw)
To: Richard Guenther; +Cc: Jason Merrill, gcc-patches, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 2364 bytes --]
> ok, so now you do this only for the first field in a bitfield group. But you
> do it for _all_ bitfield groups in a struct, not only for the interesting one.
>
> May I suggest to split the loop into two, first searching the first field
> in the bitfield group that contains fld and then in a separate loop computing
> the bitwidth?
Excellent idea. Done! Now there are at most two calls to
get_inner_reference, and in many cases, only one.
> Backing up, considering one of my earlier questions. What is *offset
> supposed to be relative to? The docs say sth like "relative to INNERDECL",
> but the code doesn't contain a reference to INNERDECL anymore.
Sorry, I see your confusion. The comments at the top were completely
out of date. I have simplified and rewritten them accordingly. I am
attaching get_bit_range() with these and other changes you suggested.
See if it makes sense now.
> Now we come to that padding thing. What's the C++ memory model
> semantic for re-used tail padding? Consider
Andrew addressed this elsewhere.
> There is too much get_inner_reference and tree folding stuff in this
> patch (which makes it expensive given that the algorithm is still
> inherently quadratic). You can rely on the bitfield group advancing
> by integer-cst bits (but the start offset may be non-constant, so
> may the size of the underlying record).
Now there are only two tree folding calls (apart from
get_inner_reference), and the common case has very simple arithmetic
tuples. I see no clear way of removing the last call to
get_inner_reference(), as the padding after the field can only be
calculated by calling get_inner_reference() on the subsequent field.
> Now seeing all this - and considering that this is purely C++ frontend
> semantics. Why can't the C++ frontend itself constrain accesses
> according to the required semantics? It could simply create
> BIT_FIELD_REF <MEM_REF <&containing_record,
> byte-offset-to-start-of-group>, bit-size, bit-offset> for all bitfield
> references (with a proper
> type for the MEM_REF, specifying the size of the group). That would
> also avoid issues during tree optimization and would at least allow
> optimizing the bitfield accesses according to the desired C++ semantics.
Andrew addressed this as well. Could you respond to his email if you
think it is unsatisfactory?
a
[-- Attachment #2: stuff --]
[-- Type: text/plain, Size: 5150 bytes --]
/* In the C++ memory model, consecutive non-zero bit fields in a
structure are considered one memory location.
Given a COMPONENT_REF, this function calculates the byte offset
from the containing object to the start of the contiguous bit
region containing the field in question. This byte offset is
returned in *OFFSET.
The maximum number of bits that can be addressed while storing into
the COMPONENT_REF is returned in *MAXBITS. This number is the
number of bits in the contiguous bit region, up to a maximum of
MAX_FIXED_MODE_SIZE. */
static void
get_bit_range (tree exp, tree *offset, HOST_WIDE_INT *maxbits)
{
tree field, record_type, fld;
bool prev_field_is_bitfield;
tree start_offset;
tree start_bitpos_direct_parent = NULL_TREE;
HOST_WIDE_INT start_bitpos;
HOST_WIDE_INT cumulative_bitsize = 0;
/* First field of the bitfield group containing the bitfield we are
referencing. */
tree bitregion_start;
HOST_WIDE_INT tbitsize;
enum machine_mode tmode;
int tunsignedp, tvolatilep;
bool found;
gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
/* Bit field we're storing into. */
field = TREE_OPERAND (exp, 1);
record_type = DECL_FIELD_CONTEXT (field);
/* Find the bitfield group containing the field in question, and set
BITREGION_START to the start of the group. */
prev_field_is_bitfield = false;
bitregion_start = NULL_TREE;
for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld))
{
if (TREE_CODE (fld) != FIELD_DECL)
continue;
/* If we have a bit-field with a bitsize > 0... */
if (DECL_BIT_FIELD_TYPE (fld)
&& tree_low_cst (DECL_SIZE (fld), 1) > 0)
{
if (!prev_field_is_bitfield)
{
bitregion_start = fld;
prev_field_is_bitfield = true;
}
}
else
prev_field_is_bitfield = false;
if (fld == field)
break;
}
gcc_assert (bitregion_start);
gcc_assert (fld);
/* Save the starting position of the bitregion. */
get_inner_reference (build3 (COMPONENT_REF,
TREE_TYPE (exp),
TREE_OPERAND (exp, 0),
bitregion_start, NULL_TREE),
&tbitsize, &start_bitpos, &start_offset,
&tmode, &tunsignedp, &tvolatilep, true);
if (!start_offset)
start_offset = size_zero_node;
/* Calculate byte offset to the beginning of the bit region. */
/* OFFSET = START_OFFSET + (START_BITPOS / BITS_PER_UNIT) */
gcc_assert (start_bitpos % BITS_PER_UNIT == 0);
*offset = fold_build2 (PLUS_EXPR, TREE_TYPE (start_offset),
start_offset,
build_int_cst (integer_type_node,
start_bitpos / BITS_PER_UNIT));
/* Save the bit offset of the current structure. */
start_bitpos_direct_parent = DECL_FIELD_BIT_OFFSET (bitregion_start);
/* Count the bitsize of the bitregion containing the field in question. */
found = false;
cumulative_bitsize = 0;
for (fld = bitregion_start; fld; fld = DECL_CHAIN (fld))
{
if (TREE_CODE (fld) != FIELD_DECL)
continue;
if (fld == field)
found = true;
if (DECL_BIT_FIELD_TYPE (fld)
&& tree_low_cst (DECL_SIZE (fld), 1) > 0)
{
cumulative_bitsize += tree_low_cst (DECL_SIZE (fld), 1);
/* Short-circuit out if we have the max bits allowed. */
if (cumulative_bitsize >= MAX_FIXED_MODE_SIZE)
{
*maxbits = MAX_FIXED_MODE_SIZE;
/* Calculate byte offset to the beginning of the bit region. */
gcc_assert (start_bitpos % BITS_PER_UNIT == 0);
*offset = fold_build2 (PLUS_EXPR, TREE_TYPE (start_offset),
start_offset,
build_int_cst (integer_type_node,
start_bitpos / BITS_PER_UNIT));
return;
}
}
else if (found)
break;
}
/* If we found the end of the bit field sequence, include the
padding up to the next field... */
if (fld)
{
tree end_offset, maxbits_tree;
HOST_WIDE_INT end_bitpos;
/* Calculate bitpos and offset of the next field. */
get_inner_reference (build3 (COMPONENT_REF,
TREE_TYPE (exp),
TREE_OPERAND (exp, 0),
fld, NULL_TREE),
&tbitsize, &end_bitpos, &end_offset,
&tmode, &tunsignedp, &tvolatilep, true);
gcc_assert (end_bitpos % BITS_PER_UNIT == 0);
if (end_offset)
{
tree type = TREE_TYPE (end_offset), end;
/* Calculate byte offset to the end of the bit region. */
end = fold_build2 (PLUS_EXPR, type,
end_offset,
build_int_cst (type,
end_bitpos / BITS_PER_UNIT));
maxbits_tree = fold_build2 (MINUS_EXPR, type, end, *offset);
}
else
maxbits_tree = build_int_cst (integer_type_node,
end_bitpos - start_bitpos);
/* ?? Can we get a variable-length offset here ?? */
gcc_assert (host_integerp (maxbits_tree, 1));
*maxbits = TREE_INT_CST_LOW (maxbits_tree);
}
/* ...otherwise, this is the last element in the structure. */
else
{
/* Include the padding at the end of structure. */
*maxbits = TREE_INT_CST_LOW (TYPE_SIZE (record_type))
- TREE_INT_CST_LOW (start_bitpos_direct_parent);
if (*maxbits > MAX_FIXED_MODE_SIZE)
*maxbits = MAX_FIXED_MODE_SIZE;
}
}
* Re: [C++0x] contiguous bitfields race implementation
2011-08-09 20:53 ` Aldy Hernandez
@ 2011-08-10 13:34 ` Richard Guenther
2011-08-15 19:26 ` Aldy Hernandez
0 siblings, 1 reply; 81+ messages in thread
From: Richard Guenther @ 2011-08-10 13:34 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jason Merrill, gcc-patches, Jakub Jelinek
On Tue, Aug 9, 2011 at 8:39 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>> ok, so now you do this only for the first field in a bitfield group. But
>> you
>> do it for _all_ bitfield groups in a struct, not only for the interesting
>> one.
>>
>> May I suggest to split the loop into two, first searching the first field
>> in the bitfield group that contains fld and then in a separate loop
>> computing
>> the bitwidth?
>
> Excellent idea. Done! Now there are at most two calls to
> get_inner_reference, and in many cases, only one.
>
>> Backing up, considering one of my earlier questions. What is *offset
>> supposed to be relative to? The docs say sth like "relative to
>> INNERDECL",
>> but the code doesn't contain a reference to INNERDECL anymore.
>
> Sorry, I see your confusion. The comments at the top were completely out of
> date. I have simplified and rewritten them accordingly. I am attaching
> get_bit_range() with these and other changes you suggested. See if it makes
> sense now.
>
>> Now we come to that padding thing. What's the C++ memory model
>> semantic for re-used tail padding? Consider
>
> Andrew addressed this elsewhere.
>
>> There is too much get_inner_reference and tree folding stuff in this
>> patch (which makes it expensive given that the algorithm is still
>> inherently quadratic). You can rely on the bitfield group advancing
>> by integer-cst bits (but the start offset may be non-constant, so
>> may the size of the underlying record).
>
> Now there are only two tree folding calls (apart from get_inner_reference),
> and the common case has very simple arithmetic tuples. I see no clear way
> of removing the last call to get_inner_reference(), as the padding after the
> field can only be calculated by calling get_inner_reference() on the
> subsequent field.
>
>> Now seeing all this - and considering that this is purely C++ frontend
>> semantics. Why can't the C++ frontend itself constrain accesses
>> according to the required semantics? It could simply create
>> BIT_FIELD_REF <MEM_REF <&containing_record,
>> byte-offset-to-start-of-group>, bit-size, bit-offset> for all bitfield
>> references (with a proper
>> type for the MEM_REF, specifying the size of the group). That would
>> also avoid issues during tree optimization and would at least allow
>> optimizing the bitfield accesses according to the desired C++ semantics.
>
> Andrew addressed this as well. Could you respond to his email if you think
> it is unsatisfactory?
Some comments.
/* If we have a bit-field with a bitsize > 0... */
if (DECL_BIT_FIELD_TYPE (fld)
&& tree_low_cst (DECL_SIZE (fld), 1) > 0)
I think we can check bitsize != 0, thus
&& !integer_zerop (DECL_SIZE (fld))
instead. You don't break groups here with MAX_FIXED_MODE_SIZE, so
I don't think it's ok to do that in the 2nd loop
/* Short-circuit out if we have the max bits allowed. */
if (cumulative_bitsize >= MAX_FIXED_MODE_SIZE)
{
*maxbits = MAX_FIXED_MODE_SIZE;
/* Calculate byte offset to the beginning of the bit region. */
gcc_assert (start_bitpos % BITS_PER_UNIT == 0);
*offset = fold_build2 (PLUS_EXPR, TREE_TYPE (start_offset),
start_offset,
build_int_cst (integer_type_node,
start_bitpos / BITS_PER_UNIT));
return;
apart from the *offset calculation being redundant, *offset + maxbits
may not include the referenced field. How do you plan to find
an "optimal" window for such access? (*)
/* Count the bitsize of the bitregion containing the field in question. */
found = false;
cumulative_bitsize = 0;
for (fld = bitregion_start; fld; fld = DECL_CHAIN (fld))
{
if (TREE_CODE (fld) != FIELD_DECL)
continue;
if (fld == field)
found = true;
if (DECL_BIT_FIELD_TYPE (fld)
&& tree_low_cst (DECL_SIZE (fld), 1) > 0)
{
...
}
else if (found)
break;
should probably be
if (!DECL_BIT_FIELD_TYPE (fld)
|| integer_zerop (DECL_SIZE (fld)))
break;
we know that we'll eventually find field.
/* If we found the end of the bit field sequence, include the
padding up to the next field... */
if (fld)
{
could be a non-FIELD_DECL, you have to skip those first.
/* Calculate bitpos and offset of the next field. */
get_inner_reference (build3 (COMPONENT_REF,
TREE_TYPE (exp),
TREE_OPERAND (exp, 0),
fld, NULL_TREE),
&tbitsize, &end_bitpos, &end_offset,
&tmode, &tunsignedp, &tvolatilep, true);
gcc_assert (end_bitpos % BITS_PER_UNIT == 0);
if (end_offset)
{
tree type = TREE_TYPE (end_offset), end;
/* Calculate byte offset to the end of the bit region. */
end = fold_build2 (PLUS_EXPR, type,
end_offset,
build_int_cst (type,
end_bitpos / BITS_PER_UNIT));
maxbits_tree = fold_build2 (MINUS_EXPR, type, end, *offset);
}
else
maxbits_tree = build_int_cst (integer_type_node,
end_bitpos - start_bitpos);
/* ?? Can we get a variable-length offset here ?? */
gcc_assert (host_integerp (maxbits_tree, 1));
*maxbits = TREE_INT_CST_LOW (maxbits_tree);
I think you may end up enlarging maxbits to more than
MAX_FIXED_MODE_SIZE here. What you instead should do (I think)
is sth along
*maxbits = MIN (MAX_FIXED_MODE_SIZE,
                *maxbits
                + operand_equal_p (DECL_FIELD_OFFSET (fld),
                                   DECL_FIELD_OFFSET (field))
                  ? DECL_FIELD_BIT_OFFSET (fld) - DECL_FIELD_BIT_OFFSET (field)
                  : DECL_OFFSET_ALIGN (field) - DECL_FIELD_BIT_OFFSET (field));
Note that another complication comes to my mind now - the offset
field of a COMPONENT_REF is used to specify a variable offset
and has to be used, if present, instead of DECL_FIELD_OFFSET.
Thus your building of COMPONENT_REFs to then pass them to
get_inner_reference is broken. As you are in generic code and not
in the C++ frontend I believe you have to properly handle this case
(may I suggest to, at the start of the function, simply return a
minimum byte-aligned blob for the case that there is a variable
offset to the bitfield?)
/* ...otherwise, this is the last element in the structure. */
else
{
/* Include the padding at the end of structure. */
*maxbits = TREE_INT_CST_LOW (TYPE_SIZE (record_type))
- TREE_INT_CST_LOW (start_bitpos_direct_parent);
if (*maxbits > MAX_FIXED_MODE_SIZE)
*maxbits = MAX_FIXED_MODE_SIZE;
}
with Andrews answer this is invalid. You can (and should) at most do
else
*maxbits = (*maxbits + BITS_PER_UNIT - 1) & ~(BITS_PER_UNIT - 1);
thus, round *maxbits up to the next byte.
There is still the general issue of packed bitfields which will probably
make the issue of the computed group not covering all of field more
prominent (esp. if you limit to MAX_FIXED_MODE_SIZE - consider
struct __attribute__((packed)) { long long : 1; long long a : 64; char c; }
where a does not fit in a DImode mem but crosses it. Why constrain
*maxbits to MAX_FIXED_MODE_SIZE at all? Shouldn't the *offset,
*maxbits pair just constrain what the caller does, not force it to actually
use an access covering that full range (does it?)?
Richard.
(*) For bitfield lowering we discussed this a bit and the solution would be
to mirror what place_field does, fill groups until the space for the mode of
the sofar largest field is filled (doesn't work for packed bitfields of course).
* Re: [C++0x] contiguous bitfields race implementation
2011-08-10 13:34 ` Richard Guenther
@ 2011-08-15 19:26 ` Aldy Hernandez
2011-08-27 0:05 ` Aldy Hernandez
0 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-08-15 19:26 UTC (permalink / raw)
To: Richard Guenther; +Cc: Jason Merrill, gcc-patches, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 4241 bytes --]
> Some comments.
>
> /* If we have a bit-field with a bitsize > 0... */
> if (DECL_BIT_FIELD_TYPE (fld)
> && tree_low_cst (DECL_SIZE (fld), 1) > 0)
>
> I think we can check bitsize != 0, thus
>
> && !integer_zerop (DECL_SIZE (fld))
Done
> /* Short-circuit out if we have the max bits allowed. */
> if (cumulative_bitsize >= MAX_FIXED_MODE_SIZE)
> {
> *maxbits = MAX_FIXED_MODE_SIZE;
> /* Calculate byte offset to the beginning of the bit region. */
> gcc_assert (start_bitpos % BITS_PER_UNIT == 0);
> *offset = fold_build2 (PLUS_EXPR, TREE_TYPE (start_offset),
> start_offset,
> build_int_cst (integer_type_node,
> start_bitpos / BITS_PER_UNIT));
> return;
>
> apart from the *offset calculation being redundant, *offset + maxbits
> may not include the referenced field. How do you plan to find
> an "optimal" window for such access? (*)
Actually offset is always needed because we use it in store_bit_field()
to adjust the memory reference up to the bit region. However... I have
removed the MAX_FIXED_MODE_SIZE constraint all throughout. See
explanation below.
> if (!DECL_BIT_FIELD_TYPE (fld)
> || integer_zerop (DECL_SIZE (fld)))
> break;
>
> we know that we'll eventually find field.
Good catch. Done.
> /* If we found the end of the bit field sequence, include the
> padding up to the next field... */
> if (fld)
> {
>
> could be a non-FIELD_DECL, you have to skip those first.
I can't find an example of a non-FIELD_DECL here for the life of me.
I've run numerous tests and can't trigger one. Do you have one in mind
so I can handle it?
> Note that another complication comes to my mind now - the offset
> field of a COMPONENT_REF is used to specify a variable offset
> and has to be used, if present, instead of DECL_FIELD_OFFSET.
> Thus your building of COMPONENT_REFs to then pass them to
> get_inner_reference is broken. As you are in generic code and not
> in the C++ frontend I believe you have to properly handle this case
> (may I suggest to, at the start of the function, simply return a
> minimum byte-aligned blob for the case that there is a variable
> offset to the bitfield?)
Done.
BTW, where would this happen? I'd like an actual test to stick in
there. Is this for non C/C++?
> /* ...otherwise, this is the last element in the structure. */
> else
> {
> /* Include the padding at the end of structure. */
> *maxbits = TREE_INT_CST_LOW (TYPE_SIZE (record_type))
> - TREE_INT_CST_LOW (start_bitpos_direct_parent);
> if (*maxbits > MAX_FIXED_MODE_SIZE)
> *maxbits = MAX_FIXED_MODE_SIZE;
> }
>
> with Andrews answer this is invalid. You can (and should) at most do
>
> else
> *maxbits = (*maxbits + BITS_PER_UNIT - 1) & ~(BITS_PER_UNIT - 1);
I have done this, but I have also removed the constraint to
MAX_FIXED_MODE_SIZE. See below.
> There is still the general issue of packed bitfields which will probably
> make the issue of the computed group not covering all of field more
> prominent (esp. if you limit to MAX_FIXED_MODE_SIZE - consider
> struct __attribute__((packed)) { long long : 1; long long a : 64; char c; }
> where a does not fit in a DImode mem but crosses it. Why constrain
> *maxbits to MAX_FIXED_MODE_SIZE at all? Shouldn't the *offset,
> *maxbits pair just constrain what the caller does, not force it to actually
> use an access covering that full range (does it?)?
Well, I have removed the MAX_FIXED_MODE_SIZE restriction: every
offset adjustment in the bit-field machinery needs a corresponding
MAXBITS adjustment, and capping MAXBITS up front can push it into
negative territory, so we really have to track the actual number of
bits. So, we've converged, just for different reasons :).
I have bootstrapped the compiler with the bitfield restrictions in
place. This unearthed a few corner cases, which we are now handling
correctly with this revision. So, please take a look at the entire
patch, as there are small changes throughout.
I am including both the entire patch, and the get_bit_range() function
separately, to make it easier to review.
Thanks.
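(For anyone joining late, here is the motivating case from the start
of this thread as a compilable sketch. Again, the return-value check
is only a single-threaded sanity check; the point of the patch is
that the store to <a> may not be widened into an access that covers
<b>, which only matters under concurrent access.)

```c
/* Under the C++11 memory model, <a> and <b> are distinct memory
   locations: <b> is not a bitfield, so storing <a> with a 32-bit
   read-modify-write of the whole struct is no longer allowed.  */
struct S
{
  unsigned int a : 4;
  unsigned char b;
  unsigned int c : 6;
};

int
seta_preserves_b (void)
{
  struct S var = { 0, 7, 0 };
  var.a = 12;	/* must not touch <b>, even transiently */
  return var.a == 12 && var.b == 7;
}
```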
[-- Attachment #2: get-bit-range-function --]
[-- Type: text/plain, Size: 4700 bytes --]
/* In the C++ memory model, consecutive non-zero bit fields in a
structure are considered one memory location.
Given a COMPONENT_REF, this function calculates the byte offset
from the containing object to the start of the contiguous bit
region containing the field in question. This byte offset is
returned in *OFFSET.
The maximum number of bits that can be addressed while storing into
the COMPONENT_REF is returned in *MAXBITS. This number is the
number of bits in the contiguous bit region, including any
padding. */
static void
get_bit_range (tree exp, tree *offset, HOST_WIDE_INT *maxbits)
{
tree field, record_type, fld;
bool prev_field_is_bitfield;
tree start_offset;
tree start_bitpos_direct_parent = NULL_TREE;
HOST_WIDE_INT start_bitpos;
HOST_WIDE_INT cumulative_bitsize = 0;
/* First field of the bitfield group containing the bitfield we are
referencing. */
tree bitregion_start;
HOST_WIDE_INT tbitsize;
enum machine_mode tmode;
int tunsignedp, tvolatilep;
gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
/* Be as conservative as possible on variable offsets. */
if (TREE_OPERAND (exp, 2)
&& !host_integerp (TREE_OPERAND (exp, 2), 1))
{
*offset = TREE_OPERAND (exp, 2);
*maxbits = BITS_PER_UNIT;
return;
}
/* Bit field we're storing into. */
field = TREE_OPERAND (exp, 1);
record_type = DECL_FIELD_CONTEXT (field);
/* Find the bitfield group containing the field in question, and set
BITREGION_START to the start of the group. */
prev_field_is_bitfield = false;
bitregion_start = NULL_TREE;
for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld))
{
if (TREE_CODE (fld) != FIELD_DECL)
continue;
/* If we have a non-zero bit-field. */
if (DECL_BIT_FIELD_TYPE (fld)
&& !integer_zerop (DECL_SIZE (fld)))
{
if (!prev_field_is_bitfield)
{
bitregion_start = fld;
prev_field_is_bitfield = true;
}
}
else
prev_field_is_bitfield = false;
if (fld == field)
break;
}
gcc_assert (bitregion_start);
gcc_assert (fld);
/* Save the starting position of the bitregion. */
get_inner_reference (build3 (COMPONENT_REF,
TREE_TYPE (exp),
TREE_OPERAND (exp, 0),
bitregion_start, NULL_TREE),
&tbitsize, &start_bitpos, &start_offset,
&tmode, &tunsignedp, &tvolatilep, true);
if (!start_offset)
start_offset = size_zero_node;
/* Calculate byte offset to the beginning of the bit region. */
/* OFFSET = START_OFFSET + (START_BITPOS / BITS_PER_UNIT) */
gcc_assert (start_bitpos % BITS_PER_UNIT == 0);
*offset = fold_build2 (PLUS_EXPR, TREE_TYPE (start_offset),
start_offset,
build_int_cst (integer_type_node,
start_bitpos / BITS_PER_UNIT));
/* Save the bit offset of the current structure. */
start_bitpos_direct_parent = DECL_FIELD_BIT_OFFSET (bitregion_start);
/* Count the bitsize of the bitregion containing the field in question. */
cumulative_bitsize = 0;
for (fld = bitregion_start; fld; fld = DECL_CHAIN (fld))
{
if (TREE_CODE (fld) != FIELD_DECL)
continue;
if (!DECL_BIT_FIELD_TYPE (fld)
|| integer_zerop (DECL_SIZE (fld)))
break;
cumulative_bitsize += tree_low_cst (DECL_SIZE (fld), 1);
}
/* If we found the end of the bit field sequence, include the
padding up to the next field... */
if (fld)
{
tree end_offset, maxbits_tree;
HOST_WIDE_INT end_bitpos;
/* Calculate bitpos and offset of the next field. */
get_inner_reference (build3 (COMPONENT_REF,
TREE_TYPE (exp),
TREE_OPERAND (exp, 0),
fld, NULL_TREE),
&tbitsize, &end_bitpos, &end_offset,
&tmode, &tunsignedp, &tvolatilep, true);
gcc_assert (end_bitpos % BITS_PER_UNIT == 0);
if (end_offset)
{
tree type = TREE_TYPE (end_offset), end;
/* Calculate byte offset to the end of the bit region. */
end = fold_build2 (PLUS_EXPR, type,
end_offset,
build_int_cst (type,
end_bitpos / BITS_PER_UNIT));
maxbits_tree = fold_build2 (MINUS_EXPR, type, end, *offset);
}
else
maxbits_tree = build_int_cst (integer_type_node,
end_bitpos - start_bitpos);
*maxbits = TREE_INT_CST_LOW (maxbits_tree);
}
/* ...otherwise, this is the last element in the structure. */
else
{
/* Include the padding at the end of structure. */
*maxbits = TREE_INT_CST_LOW (TYPE_SIZE (record_type))
- TREE_INT_CST_LOW (start_bitpos_direct_parent);
/* Round up to the next byte. */
*maxbits = (*maxbits + BITS_PER_UNIT - 1) & ~(BITS_PER_UNIT - 1);
}
}
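As a worked example of what get_bit_range() computes, take the layout
from the cxxbitfields-6.c test in the patch: the zero-width bitfield
ends the first bit region, so <b> and <c> land in separate memory
locations, and the region for <c> is a single padded byte (hence the
test expects a movb). A compilable single-threaded sketch; the race
itself is only observable with concurrent access:

```c
#include <string.h>

/* The zero-width ": 0" terminates the bit region containing <b>, so
   <c> starts a new region, and <d> (a non-bitfield) is yet another
   memory location that a store to <c> may not clobber.  */
struct bits
{
  char a;
  int b : 7;
  int : 0;
  volatile int c : 7;
  unsigned char d;
};

/* Single-threaded check: a racy read-modify-write would also pass
   this; the model's constraint is that the store to <c> must not
   touch <d> at all.  */
int
update_c_preserves_d (void)
{
  struct bits x;
  memset (&x, 0, sizeof x);
  x.d = 0x5a;
  x.c = -1;
  return x.d == 0x5a;
}
```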
[-- Attachment #3: curr --]
[-- Type: text/plain, Size: 36103 bytes --]
* machmode.h (get_best_mode): Remove 2 arguments.
* fold-const.c (optimize_bit_field_compare): Same.
(fold_truthop): Same.
* expr.c (store_field): Change argument types in prototype.
(emit_group_store): Change argument types to store_bit_field call.
(copy_blkmode_from_reg): Same.
(write_complex_part): Same.
(optimize_bitfield_assignment_op): Change argument types.
Change arguments to get_best_mode.
(get_bit_range): Rewrite.
(expand_assignment): Adjust new call to get_bit_range.
Adjust bitregion_offset when to_rtx is changed.
Adjust calls to store_field with new argument types.
(store_field): New argument types.
Adjust calls to store_bit_field with new arguments.
* expr.h (store_bit_field): Change argument types.
* stor-layout.c (get_best_mode): Remove use of bitregion* arguments.
* expmed.c (store_bit_field_1): Change argument types.
Do not calculate maxbits.
Adjust bitregion_maxbits if offset changes.
(store_bit_field): Change argument types.
Adjust address taking into account bitregion_offset.
(store_fixed_bit_field): Change argument types.
Do not calculate maxbits.
(store_split_bit_field): Change argument types.
(extract_bit_field_1): Adjust arguments to get_best_mode.
(extract_fixed_bit_field): Same.
Index: machmode.h
===================================================================
--- machmode.h (revision 176891)
+++ machmode.h (working copy)
@@ -249,8 +249,6 @@ extern enum machine_mode mode_for_vector
/* Find the best mode to use to access a bit field. */
extern enum machine_mode get_best_mode (int, int,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
unsigned int,
enum machine_mode, int);
Index: fold-const.c
===================================================================
--- fold-const.c (revision 176891)
+++ fold-const.c (working copy)
@@ -3394,7 +3394,7 @@ optimize_bit_field_compare (location_t l
&& flag_strict_volatile_bitfields > 0)
nmode = lmode;
else
- nmode = get_best_mode (lbitsize, lbitpos, 0, 0,
+ nmode = get_best_mode (lbitsize, lbitpos,
const_p ? TYPE_ALIGN (TREE_TYPE (linner))
: MIN (TYPE_ALIGN (TREE_TYPE (linner)),
TYPE_ALIGN (TREE_TYPE (rinner))),
@@ -5221,7 +5221,7 @@ fold_truthop (location_t loc, enum tree_
to be relative to a field of that size. */
first_bit = MIN (ll_bitpos, rl_bitpos);
end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize);
- lnmode = get_best_mode (end_bit - first_bit, first_bit, 0, 0,
+ lnmode = get_best_mode (end_bit - first_bit, first_bit,
TYPE_ALIGN (TREE_TYPE (ll_inner)), word_mode,
volatilep);
if (lnmode == VOIDmode)
@@ -5286,7 +5286,7 @@ fold_truthop (location_t loc, enum tree_
first_bit = MIN (lr_bitpos, rr_bitpos);
end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize);
- rnmode = get_best_mode (end_bit - first_bit, first_bit, 0, 0,
+ rnmode = get_best_mode (end_bit - first_bit, first_bit,
TYPE_ALIGN (TREE_TYPE (lr_inner)), word_mode,
volatilep);
if (rnmode == VOIDmode)
Index: testsuite/c-c++-common/cxxbitfields-6.c
===================================================================
--- testsuite/c-c++-common/cxxbitfields-6.c (revision 0)
+++ testsuite/c-c++-common/cxxbitfields-6.c (revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+struct bits
+{
+ char a;
+ int b:7;
+ int :0;
+ volatile int c:7;
+ unsigned char d;
+} x;
+
+/* Store into <c> should not clobber <d>. */
+void update_c(struct bits *p, int val)
+{
+ p->c = val;
+}
+
+/* { dg-final { scan-assembler "movb" } } */
Index: testsuite/c-c++-common/cxxbitfields-8.c
===================================================================
--- testsuite/c-c++-common/cxxbitfields-8.c (revision 0)
+++ testsuite/c-c++-common/cxxbitfields-8.c (revision 0)
@@ -0,0 +1,29 @@
+/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-options "-O --param allow-store-data-races=0" } */
+
+struct bits {
+ /* Make sure the bit position of the bitfield is larger than what
+ can be represented in an unsigned HOST_WIDE_INT, to force
+ get_inner_reference() to return something in POFFSET. */
+
+ struct {
+ int some_padding[1<<30];
+ char more_padding;
+ } pad[1<<29];
+
+ struct {
+ volatile char bitfield :1;
+ } x;
+ char b;
+};
+
+struct bits *p;
+
+/* Test that the store into <bitfield> is not done with something
+ wider than a byte move. */
+void foo()
+{
+ p->x.bitfield = 1;
+}
+
+/* { dg-final { scan-assembler "movb" } } */
Index: testsuite/c-c++-common/cxxbitfields-7.c
===================================================================
--- testsuite/c-c++-common/cxxbitfields-7.c (revision 0)
+++ testsuite/c-c++-common/cxxbitfields-7.c (revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+struct bits
+{
+ int some_padding;
+ struct {
+ volatile char bitfield :1;
+ } x;
+ char b;
+};
+
+/* Store into <bitfield> should not clobber <b>. */
+void update(struct bits *p)
+{
+ p->x.bitfield = 1;
+}
+
+/* { dg-final { scan-assembler "movb" } } */
Index: ifcvt.c
===================================================================
--- ifcvt.c (revision 176891)
+++ ifcvt.c (working copy)
@@ -885,7 +885,8 @@ noce_emit_move_insn (rtx x, rtx y)
}
gcc_assert (start < (MEM_P (op) ? BITS_PER_UNIT : BITS_PER_WORD));
- store_bit_field (op, size, start, 0, 0, GET_MODE (x), y);
+ store_bit_field (op, size, start, integer_zero_node, 0,
+ GET_MODE (x), y);
return;
}
@@ -940,7 +941,7 @@ noce_emit_move_insn (rtx x, rtx y)
outmode = GET_MODE (outer);
bitpos = SUBREG_BYTE (outer) * BITS_PER_UNIT;
store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos,
- 0, 0, outmode, y);
+ integer_zero_node, 0, outmode, y);
}
/* Return sequence of instructions generated by if conversion. This
Index: expr.c
===================================================================
--- expr.c (revision 176891)
+++ expr.c (working copy)
@@ -145,7 +145,7 @@ static void store_constructor_field (rtx
tree, tree, int, alias_set_type);
static void store_constructor (tree, rtx, int, HOST_WIDE_INT);
static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT,
+ tree, HOST_WIDE_INT,
enum machine_mode,
tree, tree, alias_set_type, bool);
@@ -2077,7 +2077,7 @@ emit_group_store (rtx orig_dst, rtx src,
emit_move_insn (adjust_address (dest, mode, bytepos), tmps[i]);
else
store_bit_field (dest, bytelen * BITS_PER_UNIT, bytepos * BITS_PER_UNIT,
- 0, 0, mode, tmps[i]);
+ integer_zero_node, 0, mode, tmps[i]);
}
/* Copy from the pseudo into the (probable) hard reg. */
@@ -2171,7 +2171,8 @@ copy_blkmode_from_reg (rtx tgtblk, rtx s
/* Use xbitpos for the source extraction (right justified) and
bitpos for the destination store (left justified). */
- store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, 0, 0, copy_mode,
+ store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD,
+ integer_zero_node, 0, copy_mode,
extract_bit_field (src, bitsize,
xbitpos % BITS_PER_WORD, 1, false,
NULL_RTX, copy_mode, copy_mode));
@@ -2808,7 +2809,8 @@ write_complex_part (rtx cplx, rtx val, b
gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD);
}
- store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, 0, 0, imode, val);
+ store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0,
+ integer_zero_node, 0, imode, val);
}
/* Extract one of the components of the complex value CPLX. Extract the
@@ -3943,8 +3945,7 @@ get_subtarget (rtx x)
static bool
optimize_bitfield_assignment_op (unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ HOST_WIDE_INT bitregion_maxbits,
enum machine_mode mode1, rtx str_rtx,
tree to, tree src)
{
@@ -4005,8 +4006,9 @@ optimize_bitfield_assignment_op (unsigne
if (str_bitsize == 0 || str_bitsize > BITS_PER_WORD)
str_mode = word_mode;
+ if (bitregion_maxbits && bitregion_maxbits < GET_MODE_BITSIZE (str_mode))
+ str_mode = smallest_mode_for_size (bitregion_maxbits, MODE_INT);
str_mode = get_best_mode (bitsize, bitpos,
- bitregion_start, bitregion_end,
MEM_ALIGN (str_rtx), str_mode, 0);
if (str_mode == VOIDmode)
return false;
@@ -4115,57 +4117,44 @@ optimize_bitfield_assignment_op (unsigne
return false;
}
-/* In the C++ memory model, consecutive bit fields in a structure are
- considered one memory location.
+/* In the C++ memory model, consecutive non-zero bit fields in a
+ structure are considered one memory location.
- Given a COMPONENT_REF, this function returns the bit range of
- consecutive bits in which this COMPONENT_REF belongs in. The
- values are returned in *BITSTART and *BITEND. If either the C++
- memory model is not activated, or this memory access is not thread
- visible, 0 is returned in *BITSTART and *BITEND.
-
- EXP is the COMPONENT_REF.
- INNERDECL is the actual object being referenced.
- BITPOS is the position in bits where the bit starts within the structure.
- BITSIZE is size in bits of the field being referenced in EXP.
-
- For example, while storing into FOO.A here...
-
- struct {
- BIT 0:
- unsigned int a : 4;
- unsigned int b : 1;
- BIT 8:
- unsigned char c;
- unsigned int d : 6;
- } foo;
-
- ...we are not allowed to store past <b>, so for the layout above, a
- range of 0..7 (because no one cares if we store into the
- padding). */
+ Given a COMPONENT_REF, this function calculates the byte offset
+ from the containing object to the start of the contiguous bit
+ region containing the field in question. This byte offset is
+ returned in *OFFSET.
+
+ The maximum number of bits that can be addressed while storing into
+ the COMPONENT_REF is returned in *MAXBITS. This number is the
+ number of bits in the contiguous bit region, including any
+ padding. */
static void
-get_bit_range (unsigned HOST_WIDE_INT *bitstart,
- unsigned HOST_WIDE_INT *bitend,
- tree exp, tree innerdecl,
- HOST_WIDE_INT bitpos, HOST_WIDE_INT bitsize)
+get_bit_range (tree exp, tree *offset, HOST_WIDE_INT *maxbits)
{
tree field, record_type, fld;
- bool found_field = false;
bool prev_field_is_bitfield;
+ tree start_offset;
+ tree start_bitpos_direct_parent = NULL_TREE;
+ HOST_WIDE_INT start_bitpos;
+ HOST_WIDE_INT cumulative_bitsize = 0;
+ /* First field of the bitfield group containing the bitfield we are
+ referencing. */
+ tree bitregion_start;
+
+ HOST_WIDE_INT tbitsize;
+ enum machine_mode tmode;
+ int tunsignedp, tvolatilep;
gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
- /* If other threads can't see this value, no need to restrict stores. */
- if (ALLOW_STORE_DATA_RACES
- || ((TREE_CODE (innerdecl) == MEM_REF
- || TREE_CODE (innerdecl) == TARGET_MEM_REF)
- && !ptr_deref_may_alias_global_p (TREE_OPERAND (innerdecl, 0)))
- || (DECL_P (innerdecl)
- && (DECL_THREAD_LOCAL_P (innerdecl)
- || !TREE_STATIC (innerdecl))))
+ /* Be as conservative as possible on variable offsets. */
+ if (TREE_OPERAND (exp, 2)
+ && !host_integerp (TREE_OPERAND (exp, 2), 1))
{
- *bitstart = *bitend = 0;
+ *offset = TREE_OPERAND (exp, 2);
+ *maxbits = BITS_PER_UNIT;
return;
}
@@ -4173,56 +4162,109 @@ get_bit_range (unsigned HOST_WIDE_INT *b
field = TREE_OPERAND (exp, 1);
record_type = DECL_FIELD_CONTEXT (field);
- /* Count the contiguous bitfields for the memory location that
- contains FIELD. */
- *bitstart = 0;
- prev_field_is_bitfield = true;
+ /* Find the bitfield group containing the field in question, and set
+ BITREGION_START to the start of the group. */
+ prev_field_is_bitfield = false;
+ bitregion_start = NULL_TREE;
for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld))
{
- tree t, offset;
- enum machine_mode mode;
- int unsignedp, volatilep;
-
if (TREE_CODE (fld) != FIELD_DECL)
continue;
- t = build3 (COMPONENT_REF, TREE_TYPE (exp),
- unshare_expr (TREE_OPERAND (exp, 0)),
- fld, NULL_TREE);
- get_inner_reference (t, &bitsize, &bitpos, &offset,
- &mode, &unsignedp, &volatilep, true);
-
- if (field == fld)
- found_field = true;
-
- if (DECL_BIT_FIELD_TYPE (fld) && bitsize > 0)
+ /* If we have a non-zero bit-field. */
+ if (DECL_BIT_FIELD_TYPE (fld)
+ && !integer_zerop (DECL_SIZE (fld)))
{
- if (prev_field_is_bitfield == false)
+ if (!prev_field_is_bitfield)
{
- *bitstart = bitpos;
+ bitregion_start = fld;
prev_field_is_bitfield = true;
}
}
else
- {
- prev_field_is_bitfield = false;
- if (found_field)
- break;
- }
+ prev_field_is_bitfield = false;
+ if (fld == field)
+ break;
}
- gcc_assert (found_field);
+ gcc_assert (bitregion_start);
+ gcc_assert (fld);
+ /* Save the starting position of the bitregion. */
+ get_inner_reference (build3 (COMPONENT_REF,
+ TREE_TYPE (exp),
+ TREE_OPERAND (exp, 0),
+ bitregion_start, NULL_TREE),
+ &tbitsize, &start_bitpos, &start_offset,
+ &tmode, &tunsignedp, &tvolatilep, true);
+
+ if (!start_offset)
+ start_offset = size_zero_node;
+ /* Calculate byte offset to the beginning of the bit region. */
+ /* OFFSET = START_OFFSET + (START_BITPOS / BITS_PER_UNIT) */
+ gcc_assert (start_bitpos % BITS_PER_UNIT == 0);
+ *offset = fold_build2 (PLUS_EXPR, TREE_TYPE (start_offset),
+ start_offset,
+ build_int_cst (integer_type_node,
+ start_bitpos / BITS_PER_UNIT));
+
+ /* Save the bit offset of the current structure. */
+ start_bitpos_direct_parent = DECL_FIELD_BIT_OFFSET (bitregion_start);
+
+ /* Count the bitsize of the bitregion containing the field in question. */
+ cumulative_bitsize = 0;
+ for (fld = bitregion_start; fld; fld = DECL_CHAIN (fld))
+ {
+ if (TREE_CODE (fld) != FIELD_DECL)
+ continue;
+
+ if (!DECL_BIT_FIELD_TYPE (fld)
+ || integer_zerop (DECL_SIZE (fld)))
+ break;
+
+ cumulative_bitsize += tree_low_cst (DECL_SIZE (fld), 1);
+ }
+
+ /* If we found the end of the bit field sequence, include the
+ padding up to the next field... */
if (fld)
{
- /* We found the end of the bit field sequence. Include the
- padding up to the next field and be done. */
- *bitend = bitpos - 1;
+ tree end_offset, maxbits_tree;
+ HOST_WIDE_INT end_bitpos;
+
+ /* Calculate bitpos and offset of the next field. */
+ get_inner_reference (build3 (COMPONENT_REF,
+ TREE_TYPE (exp),
+ TREE_OPERAND (exp, 0),
+ fld, NULL_TREE),
+ &tbitsize, &end_bitpos, &end_offset,
+ &tmode, &tunsignedp, &tvolatilep, true);
+ gcc_assert (end_bitpos % BITS_PER_UNIT == 0);
+
+ if (end_offset)
+ {
+ tree type = TREE_TYPE (end_offset), end;
+
+ /* Calculate byte offset to the end of the bit region. */
+ end = fold_build2 (PLUS_EXPR, type,
+ end_offset,
+ build_int_cst (type,
+ end_bitpos / BITS_PER_UNIT));
+ maxbits_tree = fold_build2 (MINUS_EXPR, type, end, *offset);
+ }
+ else
+ maxbits_tree = build_int_cst (integer_type_node,
+ end_bitpos - start_bitpos);
+
+ *maxbits = TREE_INT_CST_LOW (maxbits_tree);
}
+ /* ...otherwise, this is the last element in the structure. */
else
{
- /* If this is the last element in the structure, include the padding
- at the end of structure. */
- *bitend = TREE_INT_CST_LOW (TYPE_SIZE (record_type)) - 1;
+ /* Include the padding at the end of structure. */
+ *maxbits = TREE_INT_CST_LOW (TYPE_SIZE (record_type))
+ - TREE_INT_CST_LOW (start_bitpos_direct_parent);
+ /* Round up to the next byte. */
+ *maxbits = (*maxbits + BITS_PER_UNIT - 1) & ~(BITS_PER_UNIT - 1);
}
}
@@ -4324,12 +4366,14 @@ expand_assignment (tree to, tree from, b
{
enum machine_mode mode1;
HOST_WIDE_INT bitsize, bitpos;
- unsigned HOST_WIDE_INT bitregion_start = 0;
- unsigned HOST_WIDE_INT bitregion_end = 0;
tree offset;
int unsignedp;
int volatilep = 0;
tree tem;
+ tree bitregion_offset = size_zero_node;
+ /* Set to 0 for the special case where there is no restriction
+ in play. */
+ HOST_WIDE_INT bitregion_maxbits = 0;
push_temp_slots ();
tem = get_inner_reference (to, &bitsize, &bitpos, &offset, &mode1,
@@ -4337,8 +4381,26 @@ expand_assignment (tree to, tree from, b
if (TREE_CODE (to) == COMPONENT_REF
&& DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
- get_bit_range (&bitregion_start, &bitregion_end,
- to, tem, bitpos, bitsize);
+ {
+ /* If other threads can't see this value, no need to
+ restrict stores. */
+ if (ALLOW_STORE_DATA_RACES
+ || ((TREE_CODE (tem) == MEM_REF
+ || TREE_CODE (tem) == TARGET_MEM_REF)
+ && !ptr_deref_may_alias_global_p (TREE_OPERAND (tem, 0)))
+ || TREE_CODE (tem) == RESULT_DECL
+ || (DECL_P (tem)
+ && (DECL_THREAD_LOCAL_P (tem)
+ || !TREE_STATIC (tem))))
+ {
+ bitregion_offset = size_zero_node;
+ /* Set to 0 for the special case where there is no
+ restriction in play. */
+ bitregion_maxbits = 0;
+ }
+ else
+ get_bit_range (to, &bitregion_offset, &bitregion_maxbits);
+ }
/* If we are going to use store_bit_field and extract_bit_field,
make sure to_rtx will be safe for multiple use. */
@@ -4388,12 +4450,19 @@ expand_assignment (tree to, tree from, b
&& MEM_ALIGN (to_rtx) == GET_MODE_ALIGNMENT (mode1))
{
to_rtx = adjust_address (to_rtx, mode1, bitpos / BITS_PER_UNIT);
+ bitregion_offset = fold_build2 (MINUS_EXPR, integer_type_node,
+ bitregion_offset,
+ build_int_cst (integer_type_node,
+ bitpos / BITS_PER_UNIT));
bitpos = 0;
}
to_rtx = offset_address (to_rtx, offset_rtx,
highest_pow2_factor_for_target (to,
offset));
+ bitregion_offset = fold_build2 (MINUS_EXPR, integer_type_node,
+ bitregion_offset,
+ offset);
}
/* No action is needed if the target is not a memory and the field
@@ -4421,13 +4490,13 @@ expand_assignment (tree to, tree from, b
nontemporal);
else if (bitpos + bitsize <= mode_bitsize / 2)
result = store_field (XEXP (to_rtx, 0), bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_offset, bitregion_maxbits,
mode1, from, TREE_TYPE (tem),
get_alias_set (to), nontemporal);
else if (bitpos >= mode_bitsize / 2)
result = store_field (XEXP (to_rtx, 1), bitsize,
bitpos - mode_bitsize / 2,
- bitregion_start, bitregion_end,
+ bitregion_offset, bitregion_maxbits,
mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
@@ -4450,7 +4519,7 @@ expand_assignment (tree to, tree from, b
write_complex_part (temp, XEXP (to_rtx, 0), false);
write_complex_part (temp, XEXP (to_rtx, 1), true);
result = store_field (temp, bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_offset, bitregion_maxbits,
mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
@@ -4477,13 +4546,13 @@ expand_assignment (tree to, tree from, b
}
if (optimize_bitfield_assignment_op (bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_maxbits,
mode1,
to_rtx, to, from))
result = NULL;
else
result = store_field (to_rtx, bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_offset, bitregion_maxbits,
mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
@@ -4877,7 +4946,7 @@ store_expr (tree exp, rtx target, int ca
: BLOCK_OP_NORMAL));
else if (GET_MODE (target) == BLKmode)
store_bit_field (target, INTVAL (expr_size (exp)) * BITS_PER_UNIT,
- 0, 0, 0, GET_MODE (temp), temp);
+ 0, integer_zero_node, 0, GET_MODE (temp), temp);
else
convert_move (target, temp, unsignedp);
}
@@ -5342,8 +5411,8 @@ store_constructor_field (rtx target, uns
store_constructor (exp, target, cleared, bitsize / BITS_PER_UNIT);
}
else
- store_field (target, bitsize, bitpos, 0, 0, mode, exp, type, alias_set,
- false);
+ store_field (target, bitsize, bitpos, integer_zero_node, 0, mode, exp,
+ type, alias_set, false);
}
/* Store the value of constructor EXP into the rtx TARGET.
@@ -5917,10 +5986,10 @@ store_constructor (tree exp, rtx target,
BITSIZE bits, starting BITPOS bits from the start of TARGET.
If MODE is VOIDmode, it means that we are storing into a bit-field.
- BITREGION_START is bitpos of the first bitfield in this region.
- BITREGION_END is the bitpos of the ending bitfield in this region.
- These two fields are 0, if the C++ memory model does not apply,
- or we are not interested in keeping track of bitfield regions.
+ BITREGION_OFFSET is the byte offset from the beginning of the
+ containing object to the start of the bit region.
+ BITREGION_MAXBITS is the size in bits of the largest mode that can
+ be used to set the bit-field in question.
Always return const0_rtx unless we have something particular to
return.
@@ -5935,8 +6004,8 @@ store_constructor (tree exp, rtx target,
static rtx
store_field (rtx target, HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ tree bitregion_offset,
+ HOST_WIDE_INT bitregion_maxbits,
enum machine_mode mode, tree exp, tree type,
alias_set_type alias_set, bool nontemporal)
{
@@ -5970,7 +6039,7 @@ store_field (rtx target, HOST_WIDE_INT b
emit_move_insn (object, target);
store_field (blk_object, bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_offset, bitregion_maxbits,
mode, exp, type, alias_set, nontemporal);
emit_move_insn (target, object);
@@ -6086,7 +6155,7 @@ store_field (rtx target, HOST_WIDE_INT b
/* Store the value in the bitfield. */
store_bit_field (target, bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_offset, bitregion_maxbits,
mode, temp);
return const0_rtx;
Index: expr.h
===================================================================
--- expr.h (revision 176891)
+++ expr.h (working copy)
@@ -666,8 +666,8 @@ mode_for_extraction (enum extraction_pat
extern void store_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
+ tree,
+ HOST_WIDE_INT,
enum machine_mode, rtx);
extern rtx extract_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT, int, bool, rtx,
Index: stor-layout.c
===================================================================
--- stor-layout.c (revision 176891)
+++ stor-layout.c (working copy)
@@ -2361,13 +2361,6 @@ fixup_unsigned_type (tree type)
/* Find the best machine mode to use when referencing a bit field of length
BITSIZE bits starting at BITPOS.
- BITREGION_START is the bit position of the first bit in this
- sequence of bit fields. BITREGION_END is the last bit in this
- sequence. If these two fields are non-zero, we should restrict the
- memory access to a maximum sized chunk of
- BITREGION_END - BITREGION_START + 1. Otherwise, we are allowed to touch
- any adjacent non bit-fields.
-
The underlying object is known to be aligned to a boundary of ALIGN bits.
If LARGEST_MODE is not VOIDmode, it means that we should not use a mode
larger than LARGEST_MODE (usually SImode).
@@ -2386,20 +2379,11 @@ fixup_unsigned_type (tree type)
enum machine_mode
get_best_mode (int bitsize, int bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
unsigned int align,
enum machine_mode largest_mode, int volatilep)
{
enum machine_mode mode;
unsigned int unit = 0;
- unsigned HOST_WIDE_INT maxbits;
-
- /* If unset, no restriction. */
- if (!bitregion_end)
- maxbits = MAX_FIXED_MODE_SIZE;
- else
- maxbits = (bitregion_end - bitregion_start) % align + 1;
/* Find the narrowest integer mode that contains the bit field. */
for (mode = GET_CLASS_NARROWEST_MODE (MODE_INT); mode != VOIDmode;
@@ -2436,7 +2420,6 @@ get_best_mode (int bitsize, int bitpos,
&& bitpos / unit == (bitpos + bitsize - 1) / unit
&& unit <= BITS_PER_WORD
&& unit <= MIN (align, BIGGEST_ALIGNMENT)
- && unit <= maxbits
&& (largest_mode == VOIDmode
|| unit <= GET_MODE_BITSIZE (largest_mode)))
wide_mode = tmode;
Index: expmed.c
===================================================================
--- expmed.c (revision 176891)
+++ expmed.c (working copy)
@@ -48,13 +48,11 @@ struct target_expmed *this_target_expmed
static void store_fixed_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
+ HOST_WIDE_INT,
rtx);
static void store_split_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
+ HOST_WIDE_INT,
rtx);
static rtx extract_fixed_bit_field (enum machine_mode, rtx,
unsigned HOST_WIDE_INT,
@@ -340,8 +338,7 @@ mode_for_extraction (enum extraction_pat
static bool
store_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitnum,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ HOST_WIDE_INT bitregion_maxbits,
enum machine_mode fieldmode,
rtx value, bool fallback_p)
{
@@ -558,7 +555,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!store_bit_field_1 (op0, MIN (BITS_PER_WORD,
bitsize - i * BITS_PER_WORD),
bitnum + bit_offset,
- bitregion_start, bitregion_end,
+ bitregion_maxbits,
word_mode,
value_word, fallback_p))
{
@@ -722,10 +719,6 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (HAVE_insv && MEM_P (op0))
{
enum machine_mode bestmode;
- unsigned HOST_WIDE_INT maxbits = MAX_FIXED_MODE_SIZE;
-
- if (bitregion_end)
- maxbits = bitregion_end - bitregion_start + 1;
/* Get the mode to use for inserting into this field. If OP0 is
BLKmode, get the smallest mode consistent with the alignment. If
@@ -733,15 +726,19 @@ store_bit_field_1 (rtx str_rtx, unsigned
mode. Otherwise, use the smallest mode containing the field. */
if (GET_MODE (op0) == BLKmode
- || GET_MODE_BITSIZE (GET_MODE (op0)) > maxbits
+ || (bitregion_maxbits
+ && GET_MODE_BITSIZE (GET_MODE (op0)) > bitregion_maxbits)
|| (op_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (op_mode)))
- bestmode = get_best_mode (bitsize, bitnum,
- bitregion_start, bitregion_end,
- MEM_ALIGN (op0),
- (op_mode == MAX_MACHINE_MODE
- ? VOIDmode : op_mode),
- MEM_VOLATILE_P (op0));
+ {
+ bestmode = (op_mode == MAX_MACHINE_MODE ? VOIDmode : op_mode);
+ if (bitregion_maxbits && bitregion_maxbits < GET_MODE_SIZE (op_mode))
+ bestmode = smallest_mode_for_size (bitregion_maxbits, MODE_INT);
+ bestmode = get_best_mode (bitsize, bitnum,
+ MEM_ALIGN (op0),
+ bestmode,
+ MEM_VOLATILE_P (op0));
+ }
else
bestmode = GET_MODE (op0);
@@ -752,6 +749,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
{
rtx last, tempreg, xop0;
unsigned HOST_WIDE_INT xoffset, xbitpos;
+ HOST_WIDE_INT xmaxbits = bitregion_maxbits;
last = get_last_insn ();
@@ -762,12 +760,13 @@ store_bit_field_1 (rtx str_rtx, unsigned
xoffset = (bitnum / unit) * GET_MODE_SIZE (bestmode);
xbitpos = bitnum % unit;
xop0 = adjust_address (op0, bestmode, xoffset);
+ if (xmaxbits)
+ xmaxbits -= xoffset * BITS_PER_UNIT;
/* Fetch that unit, store the bitfield in it, then store
the unit. */
tempreg = copy_to_reg (xop0);
- if (store_bit_field_1 (tempreg, bitsize, xbitpos,
- bitregion_start, bitregion_end,
+ if (store_bit_field_1 (tempreg, bitsize, xbitpos, xmaxbits,
fieldmode, orig_value, false))
{
emit_move_insn (xop0, tempreg);
@@ -780,8 +779,10 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!fallback_p)
return false;
+ if (bitregion_maxbits)
+ bitregion_maxbits -= offset * BITS_PER_UNIT;
store_fixed_bit_field (op0, offset, bitsize, bitpos,
- bitregion_start, bitregion_end, value);
+ bitregion_maxbits, value);
return true;
}
@@ -789,18 +790,17 @@ store_bit_field_1 (rtx str_rtx, unsigned
into a bit-field within structure STR_RTX
containing BITSIZE bits starting at bit BITNUM.
- BITREGION_START is bitpos of the first bitfield in this region.
- BITREGION_END is the bitpos of the ending bitfield in this region.
- These two fields are 0, if the C++ memory model does not apply,
- or we are not interested in keeping track of bitfield regions.
+ BITREGION_OFFSET is the byte offset from STR_RTX to the start of the bit
+ region. BITREGION_MAXBITS is the number of bits of the largest
+ mode that can be used to set the bit-field in question.
FIELDMODE is the machine-mode of the FIELD_DECL node for this field. */
void
store_bit_field (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitnum,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ tree bitregion_offset,
+ HOST_WIDE_INT bitregion_maxbits,
enum machine_mode fieldmode,
rtx value)
{
@@ -808,30 +808,29 @@ store_bit_field (rtx str_rtx, unsigned H
bit region. Adjust the address to start at the beginning of the
bit region. */
if (MEM_P (str_rtx)
- && bitregion_start > 0)
+ && bitregion_maxbits
+ && !integer_zerop (bitregion_offset))
{
- enum machine_mode bestmode;
- enum machine_mode op_mode;
- unsigned HOST_WIDE_INT offset;
+ HOST_WIDE_INT offset;
- op_mode = mode_for_extraction (EP_insv, 3);
- if (op_mode == MAX_MACHINE_MODE)
- op_mode = VOIDmode;
-
- offset = bitregion_start / BITS_PER_UNIT;
- bitnum -= bitregion_start;
- bitregion_end -= bitregion_start;
- bitregion_start = 0;
- bestmode = get_best_mode (bitsize, bitnum,
- bitregion_start, bitregion_end,
- MEM_ALIGN (str_rtx),
- op_mode,
- MEM_VOLATILE_P (str_rtx));
- str_rtx = adjust_address (str_rtx, bestmode, offset);
+ if (host_integerp (bitregion_offset, 1))
+ {
+ /* Adjust the bit position accordingly. */
+ offset = tree_low_cst (bitregion_offset, 1);
+ bitnum -= offset * BITS_PER_UNIT;
+ /* Adjust the actual address. */
+ str_rtx = adjust_address (str_rtx, GET_MODE (str_rtx), offset);
+ }
+ else
+ {
+ /* Handle variable length offsets. */
+ str_rtx = offset_address (str_rtx,
+ expand_normal (bitregion_offset), 1);
+ }
+ bitregion_offset = integer_zero_node;
}
- if (!store_bit_field_1 (str_rtx, bitsize, bitnum,
- bitregion_start, bitregion_end,
+ if (!store_bit_field_1 (str_rtx, bitsize, bitnum, bitregion_maxbits,
fieldmode, value, true))
gcc_unreachable ();
}
@@ -849,8 +848,7 @@ static void
store_fixed_bit_field (rtx op0, unsigned HOST_WIDE_INT offset,
unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ HOST_WIDE_INT bitregion_maxbits,
rtx value)
{
enum machine_mode mode;
@@ -872,19 +870,12 @@ store_fixed_bit_field (rtx op0, unsigned
/* Special treatment for a bit field split across two registers. */
if (bitsize + bitpos > BITS_PER_WORD)
{
- store_split_bit_field (op0, bitsize, bitpos,
- bitregion_start, bitregion_end,
- value);
+ store_split_bit_field (op0, bitsize, bitpos, bitregion_maxbits, value);
return;
}
}
else
{
- unsigned HOST_WIDE_INT maxbits = MAX_FIXED_MODE_SIZE;
-
- if (bitregion_end)
- maxbits = bitregion_end - bitregion_start + 1;
-
/* Get the proper mode to use for this field. We want a mode that
includes the entire field. If such a mode would be larger than
a word, we won't be doing the extraction the normal way.
@@ -897,20 +888,26 @@ store_fixed_bit_field (rtx op0, unsigned
if (MEM_VOLATILE_P (op0)
&& GET_MODE_BITSIZE (GET_MODE (op0)) > 0
- && GET_MODE_BITSIZE (GET_MODE (op0)) <= maxbits
+ && (!bitregion_maxbits
+ || GET_MODE_BITSIZE (GET_MODE (op0)) <= bitregion_maxbits)
&& flag_strict_volatile_bitfields > 0)
mode = GET_MODE (op0);
else
- mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
- bitregion_start, bitregion_end,
- MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
+ {
+ if (bitregion_maxbits && bitregion_maxbits < GET_MODE_BITSIZE (mode))
+ mode = smallest_mode_for_size (bitregion_maxbits, MODE_INT);
+ mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
+ MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
+ }
if (mode == VOIDmode)
{
+ if (bitregion_maxbits)
+ bitregion_maxbits -= offset * BITS_PER_UNIT;
/* The only way this should occur is if the field spans word
boundaries. */
store_split_bit_field (op0, bitsize, bitpos + offset * BITS_PER_UNIT,
- bitregion_start, bitregion_end, value);
+ bitregion_maxbits, value);
return;
}
@@ -1031,8 +1028,7 @@ store_fixed_bit_field (rtx op0, unsigned
static void
store_split_bit_field (rtx op0, unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ HOST_WIDE_INT bitregion_maxbits,
rtx value)
{
unsigned int unit;
@@ -1147,8 +1143,13 @@ store_split_bit_field (rtx op0, unsigned
store_fixed_bit_field wants offset in bytes. If WORD is const0_rtx,
it is just an out-of-bounds access. Ignore it. */
if (word != const0_rtx)
- store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize,
- thispos, bitregion_start, bitregion_end, part);
+ {
+ HOST_WIDE_INT xmaxbits = bitregion_maxbits;
+ if (bitregion_maxbits)
+ xmaxbits -= offset * unit / BITS_PER_UNIT;
+ store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize,
+ thispos, xmaxbits, part);
+ }
bitsdone += thissize;
}
}
@@ -1588,7 +1589,7 @@ extract_bit_field_1 (rtx str_rtx, unsign
if (GET_MODE (op0) == BLKmode
|| (ext_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (ext_mode)))
- bestmode = get_best_mode (bitsize, bitnum, 0, 0, MEM_ALIGN (op0),
+ bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
(ext_mode == MAX_MACHINE_MODE
? VOIDmode : ext_mode),
MEM_VOLATILE_P (op0));
@@ -1714,7 +1715,7 @@ extract_fixed_bit_field (enum machine_mo
mode = tmode;
}
else
- mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT, 0, 0,
+ mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [C++0x] contiguous bitfields race implementation
2011-08-15 19:26 ` Aldy Hernandez
@ 2011-08-27 0:05 ` Aldy Hernandez
2011-08-29 12:54 ` Richard Guenther
0 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-08-27 0:05 UTC (permalink / raw)
To: Richard Guenther; +Cc: Jason Merrill, gcc-patches, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 2028 bytes --]
This is a "slight" update from the last revision, with your issues
addressed as I explained in the last email. However, everything turned
out to be much trickier than I expected (variable-length offsets with
arrays, bit fields spanning multiple words, surprising padding
gymnastics by GCC, etc etc).
It turns out that what we need is to know the precise bit region size at
all times, and adjust it as we rearrange and cut things into pieces
throughout the RTL bit field machinery.
I enabled the C++ memory model, and forced a bootstrap and regression
test with it. This brought about many interesting cases, which I was
able to distill and add to the testsuite.
Of particular interest were the struct-layout-1.exp tests. Since many of
the tests set a global bit field, only to later check it against a local
variable containing the same value, it is the perfect stressor because,
while globals are restricted under the memory model, locals are not. So
we can check that we can interoperate with the less restrictive model,
and that the patch does not introduce ABI inconsistencies. After much
grief, we are now passing all the struct-layout-1.exp tests.
Eventually, I'd like to force the struct-layout-1.exp tests to run for
"--param allow-store-data-races=0" as well. Unfortunately, this will
increase testing time.
I have (unfortunately) introduced an additional call to
get_inner_reference(), but only for the field itself (one time). I
can't remember the details, but it was something to the effect of the bit
position + padding being impossible to calculate in one variable-length array
reference case. I can dig up the case if you'd like.
I am currently tackling a reload miscompilation failure while building a
32-bit library. I am secretly hoping your review will uncover the flaw
without me having to pick this up. Otherwise, this is a much more
comprehensive approach than what is currently in mainline, and we now
pass all the bitfield tests the GCC testsuite could throw at it.
Fire away.
[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 44911 bytes --]
* machmode.h (get_best_mode): Remove 2 arguments.
* fold-const.c (optimize_bit_field_compare): Same.
(fold_truthop): Same.
* expr.c (store_field): Change argument types in prototype.
(emit_group_store): Change argument types to store_bit_field call.
(copy_blkmode_from_reg): Same.
(write_complex_part): Same.
(optimize_bitfield_assignment_op): Change argument types.
Change arguments to get_best_mode.
(get_bit_range): Rewrite.
(expand_assignment): Adjust new call to get_bit_range.
Adjust bitregion_offset when to_rtx is changed.
Adjust calls to store_field with new argument types.
(store_field): New argument types.
Adjust calls to store_bit_field with new arguments.
* expr.h (store_bit_field): Change argument types.
* stor-layout.c (get_best_mode): Remove use of bitregion* arguments.
* expmed.c (store_bit_field_1): Change argument types.
Do not calculate maxbits.
Adjust bitregion_maxbits if offset changes.
(store_bit_field): Change argument types.
Adjust address taking into account bitregion_offset.
(store_fixed_bit_field): Change argument types.
Do not calculate maxbits.
(store_split_bit_field): Change argument types.
(extract_bit_field_1): Adjust arguments to get_best_mode.
(extract_fixed_bit_field): Same.
Index: machmode.h
===================================================================
--- machmode.h (revision 176891)
+++ machmode.h (working copy)
@@ -249,8 +249,6 @@ extern enum machine_mode mode_for_vector
/* Find the best mode to use to access a bit field. */
extern enum machine_mode get_best_mode (int, int,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
unsigned int,
enum machine_mode, int);
Index: fold-const.c
===================================================================
--- fold-const.c (revision 176891)
+++ fold-const.c (working copy)
@@ -3394,7 +3394,7 @@ optimize_bit_field_compare (location_t l
&& flag_strict_volatile_bitfields > 0)
nmode = lmode;
else
- nmode = get_best_mode (lbitsize, lbitpos, 0, 0,
+ nmode = get_best_mode (lbitsize, lbitpos,
const_p ? TYPE_ALIGN (TREE_TYPE (linner))
: MIN (TYPE_ALIGN (TREE_TYPE (linner)),
TYPE_ALIGN (TREE_TYPE (rinner))),
@@ -5221,7 +5221,7 @@ fold_truthop (location_t loc, enum tree_
to be relative to a field of that size. */
first_bit = MIN (ll_bitpos, rl_bitpos);
end_bit = MAX (ll_bitpos + ll_bitsize, rl_bitpos + rl_bitsize);
- lnmode = get_best_mode (end_bit - first_bit, first_bit, 0, 0,
+ lnmode = get_best_mode (end_bit - first_bit, first_bit,
TYPE_ALIGN (TREE_TYPE (ll_inner)), word_mode,
volatilep);
if (lnmode == VOIDmode)
@@ -5286,7 +5286,7 @@ fold_truthop (location_t loc, enum tree_
first_bit = MIN (lr_bitpos, rr_bitpos);
end_bit = MAX (lr_bitpos + lr_bitsize, rr_bitpos + rr_bitsize);
- rnmode = get_best_mode (end_bit - first_bit, first_bit, 0, 0,
+ rnmode = get_best_mode (end_bit - first_bit, first_bit,
TYPE_ALIGN (TREE_TYPE (lr_inner)), word_mode,
volatilep);
if (rnmode == VOIDmode)
Index: testsuite/c-c++-common/cxxbitfields-9.c
===================================================================
--- testsuite/c-c++-common/cxxbitfields-9.c (revision 0)
+++ testsuite/c-c++-common/cxxbitfields-9.c (revision 0)
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+
+enum bigenum
+{ bigee = 12345678901LL
+};
+
+struct objtype
+{
+ enum bigenum a;
+ int b:25;
+ int c:15;
+ signed char d;
+ unsigned int e[3] __attribute__ ((aligned));
+ int f;
+};
+
+struct objtype obj;
+
+void foo(){
+ obj.c = 33;
+}
Index: testsuite/c-c++-common/cxxbitfields-10.c
===================================================================
--- testsuite/c-c++-common/cxxbitfields-10.c (revision 0)
+++ testsuite/c-c++-common/cxxbitfields-10.c (revision 0)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+
+/* Variable length offsets with the bit field not ending the record. */
+
+typedef struct
+{
+ short f:3, g:3, h:10;
+ char xxx;
+} small;
+
+struct sometype
+{
+ int i;
+ small s[10];
+} x;
+
+int main ()
+{
+ int i;
+ for (i = 0; i < 10; i++)
+ x.s[i].f = 0;
+ return 0;
+}
Index: testsuite/c-c++-common/cxxbitfields-12.c
===================================================================
--- testsuite/c-c++-common/cxxbitfields-12.c (revision 0)
+++ testsuite/c-c++-common/cxxbitfields-12.c (revision 0)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+
+struct stuff_type
+{
+ double a;
+ int b:27;
+ int c:9;
+ int d:9;
+ unsigned char e;
+} stuff;
+
+void foo(){
+stuff.d = 3;
+}
Index: testsuite/c-c++-common/cxxbitfields-14.c
===================================================================
--- testsuite/c-c++-common/cxxbitfields-14.c (revision 0)
+++ testsuite/c-c++-common/cxxbitfields-14.c (revision 0)
@@ -0,0 +1,25 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "--param allow-store-data-races=0" } */
+
+enum E0 { e0_0 };
+
+enum E2 { e2_m3 = -3, e2_m2, e2_m1, e2_0, e2_1, e2_2, e2_3 };
+
+struct S757
+{
+ enum E0 a;
+ enum E2 b:17;
+ enum E2 c:17;
+ unsigned char d;
+};
+
+struct S757 s757;
+
+int main()
+{
+ s757.c = e2_m2;
+ return 0;
+}
+
+/* Make sure we don't load/store a full 32-bits. */
+/* { dg-final { scan-assembler "movb" } } */
Index: testsuite/c-c++-common/cxxbitfields-6.c
===================================================================
--- testsuite/c-c++-common/cxxbitfields-6.c (revision 0)
+++ testsuite/c-c++-common/cxxbitfields-6.c (revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+struct bits
+{
+ char a;
+ int b:7;
+ int :0;
+ volatile int c:7;
+ unsigned char d;
+} x;
+
+/* Store into <c> should not clobber <d>. */
+void update_c(struct bits *p, int val)
+{
+ p -> c = val;
+}
+
+/* { dg-final { scan-assembler "movb" } } */
Index: testsuite/c-c++-common/cxxbitfields-8.c
===================================================================
--- testsuite/c-c++-common/cxxbitfields-8.c (revision 0)
+++ testsuite/c-c++-common/cxxbitfields-8.c (revision 0)
@@ -0,0 +1,29 @@
+/* { dg-do compile { target x86_64-*-* } } */
+/* { dg-options "-O --param allow-store-data-races=0" } */
+
+struct bits {
+ /* Make sure the bit position of the bitfield is larger than what
+ can be represented in an unsigned HOST_WIDE_INT, to force
+ get_inner_reference() to return something in POFFSET. */
+
+ struct {
+ int some_padding[1<<30];
+ char more_padding;
+ } pad[1<<29];
+
+ struct {
+ volatile char bitfield :1;
+ } x;
+ char b;
+};
+
+struct bits *p;
+
+/* Test that the store into <bitfield> is not done with something
+ wider than a byte move. */
+void foo()
+{
+ p->x.bitfield = 1;
+}
+
+/* { dg-final { scan-assembler "movb" } } */
Index: testsuite/c-c++-common/cxxbitfields-11.c
===================================================================
--- testsuite/c-c++-common/cxxbitfields-11.c (revision 0)
+++ testsuite/c-c++-common/cxxbitfields-11.c (revision 0)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+
+struct S1075
+{
+ unsigned short int a;
+ unsigned long long int b:29;
+ unsigned long long int c:35;
+ unsigned long long int d:31;
+ unsigned long long int e:50;
+ char *f;
+};
+
+struct S1075 blob;
+void foo(){
+blob.d=55;
+}
Index: testsuite/c-c++-common/cxxbitfields-13.c
===================================================================
--- testsuite/c-c++-common/cxxbitfields-13.c (revision 0)
+++ testsuite/c-c++-common/cxxbitfields-13.c (revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "--param allow-store-data-races=0" } */
+
+/* Test bit fields that are split across word boundaries. */
+
+struct footype
+{
+ int c:9;
+ int d:9;
+ char e;
+} foo;
+
+void funky()
+{
+ foo.d = 88;
+}
+
+/* Make sure we don't load/store a full 32-bits. */
+/* { dg-final { scan-assembler-not "movl\[ \t\]foo" } } */
Index: testsuite/c-c++-common/cxxbitfields-7.c
===================================================================
--- testsuite/c-c++-common/cxxbitfields-7.c (revision 0)
+++ testsuite/c-c++-common/cxxbitfields-7.c (revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 --param allow-store-data-races=0" } */
+
+struct bits
+{
+ int some_padding;
+ struct {
+ volatile char bitfield :1;
+ } x;
+ char b;
+};
+
+/* Store into <bitfield> should not clobber <b>. */
+void update(struct bits *p)
+{
+ p->x.bitfield = 1;
+}
+
+/* { dg-final { scan-assembler "movb" } } */
Index: ifcvt.c
===================================================================
--- ifcvt.c (revision 176891)
+++ ifcvt.c (working copy)
@@ -885,7 +885,8 @@ noce_emit_move_insn (rtx x, rtx y)
}
gcc_assert (start < (MEM_P (op) ? BITS_PER_UNIT : BITS_PER_WORD));
- store_bit_field (op, size, start, 0, 0, GET_MODE (x), y);
+ store_bit_field (op, size, start, integer_zero_node, 0, 0,
+ GET_MODE (x), y);
return;
}
@@ -940,7 +941,7 @@ noce_emit_move_insn (rtx x, rtx y)
outmode = GET_MODE (outer);
bitpos = SUBREG_BYTE (outer) * BITS_PER_UNIT;
store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos,
- 0, 0, outmode, y);
+ integer_zero_node, 0, 0, outmode, y);
}
/* Return sequence of instructions generated by if conversion. This
Index: expr.c
===================================================================
--- expr.c (revision 176891)
+++ expr.c (working copy)
@@ -145,7 +145,7 @@ static void store_constructor_field (rtx
tree, tree, int, alias_set_type);
static void store_constructor (tree, rtx, int, HOST_WIDE_INT);
static rtx store_field (rtx, HOST_WIDE_INT, HOST_WIDE_INT,
- unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT,
+ tree, HOST_WIDE_INT, HOST_WIDE_INT,
enum machine_mode,
tree, tree, alias_set_type, bool);
@@ -2077,7 +2077,7 @@ emit_group_store (rtx orig_dst, rtx src,
emit_move_insn (adjust_address (dest, mode, bytepos), tmps[i]);
else
store_bit_field (dest, bytelen * BITS_PER_UNIT, bytepos * BITS_PER_UNIT,
- 0, 0, mode, tmps[i]);
+ integer_zero_node, 0, 0, mode, tmps[i]);
}
/* Copy from the pseudo into the (probable) hard reg. */
@@ -2171,7 +2171,8 @@ copy_blkmode_from_reg (rtx tgtblk, rtx s
/* Use xbitpos for the source extraction (right justified) and
bitpos for the destination store (left justified). */
- store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, 0, 0, copy_mode,
+ store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD,
+ integer_zero_node, 0, 0, copy_mode,
extract_bit_field (src, bitsize,
xbitpos % BITS_PER_WORD, 1, false,
NULL_RTX, copy_mode, copy_mode));
@@ -2808,7 +2809,8 @@ write_complex_part (rtx cplx, rtx val, b
gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD);
}
- store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, 0, 0, imode, val);
+ store_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0,
+ integer_zero_node, 0, 0, imode, val);
}
/* Extract one of the components of the complex value CPLX. Extract the
@@ -3943,8 +3945,7 @@ get_subtarget (rtx x)
static bool
optimize_bitfield_assignment_op (unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ HOST_WIDE_INT bitregion_maxbits,
enum machine_mode mode1, rtx str_rtx,
tree to, tree src)
{
@@ -4005,8 +4006,9 @@ optimize_bitfield_assignment_op (unsigne
if (str_bitsize == 0 || str_bitsize > BITS_PER_WORD)
str_mode = word_mode;
+ if (bitregion_maxbits && bitregion_maxbits < GET_MODE_BITSIZE (str_mode))
+ str_mode = get_max_mode (bitregion_maxbits);
str_mode = get_best_mode (bitsize, bitpos,
- bitregion_start, bitregion_end,
MEM_ALIGN (str_rtx), str_mode, 0);
if (str_mode == VOIDmode)
return false;
@@ -4115,114 +4117,184 @@ optimize_bitfield_assignment_op (unsigne
return false;
}
-/* In the C++ memory model, consecutive bit fields in a structure are
- considered one memory location.
+/* In the C++ memory model, consecutive non-zero-width bit fields in a
+ structure are considered one memory location.
- Given a COMPONENT_REF, this function returns the bit range of
- consecutive bits in which this COMPONENT_REF belongs in. The
- values are returned in *BITSTART and *BITEND. If either the C++
- memory model is not activated, or this memory access is not thread
- visible, 0 is returned in *BITSTART and *BITEND.
-
- EXP is the COMPONENT_REF.
- INNERDECL is the actual object being referenced.
- BITPOS is the position in bits where the bit starts within the structure.
- BITSIZE is size in bits of the field being referenced in EXP.
-
- For example, while storing into FOO.A here...
-
- struct {
- BIT 0:
- unsigned int a : 4;
- unsigned int b : 1;
- BIT 8:
- unsigned char c;
- unsigned int d : 6;
- } foo;
-
- ...we are not allowed to store past <b>, so for the layout above, a
- range of 0..7 (because no one cares if we store into the
- padding). */
+ Given a COMPONENT_REF, this function calculates the byte offset
+ from the containing object to the start of the contiguous bit
+ region containing the field in question. This byte offset is
+ returned in *BYTE_OFFSET.
+
+ The bit offset from the start of the bit region to the bit field in
+ question is returned in *BIT_OFFSET.
+
+ The maximum number of bits that can be addressed while storing into
+ the COMPONENT_REF is returned in *MAXBITS. This number is the
+ number of bits in the contiguous bit region, including any
+ padding. */
static void
-get_bit_range (unsigned HOST_WIDE_INT *bitstart,
- unsigned HOST_WIDE_INT *bitend,
- tree exp, tree innerdecl,
- HOST_WIDE_INT bitpos, HOST_WIDE_INT bitsize)
+get_bit_range (tree exp, tree *byte_offset, HOST_WIDE_INT *bit_offset,
+ HOST_WIDE_INT *maxbits)
{
tree field, record_type, fld;
- bool found_field = false;
bool prev_field_is_bitfield;
+ tree start_offset;
+ HOST_WIDE_INT start_bitpos;
+ /* First field of the bitfield group containing the bitfield we are
+ referencing. */
+ tree bitregion_start;
- gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
+ HOST_WIDE_INT tbitsize;
+ enum machine_mode tmode;
+ int tunsignedp, tvolatilep;
- /* If other threads can't see this value, no need to restrict stores. */
- if (ALLOW_STORE_DATA_RACES
- || ((TREE_CODE (innerdecl) == MEM_REF
- || TREE_CODE (innerdecl) == TARGET_MEM_REF)
- && !ptr_deref_may_alias_global_p (TREE_OPERAND (innerdecl, 0)))
- || (DECL_P (innerdecl)
- && (DECL_THREAD_LOCAL_P (innerdecl)
- || !TREE_STATIC (innerdecl))))
- {
- *bitstart = *bitend = 0;
- return;
- }
+ gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
/* Bit field we're storing into. */
field = TREE_OPERAND (exp, 1);
record_type = DECL_FIELD_CONTEXT (field);
- /* Count the contiguous bitfields for the memory location that
- contains FIELD. */
- *bitstart = 0;
- prev_field_is_bitfield = true;
+ /* Find the bitfield group containing the field in question, and set
+ BITREGION_START to the start of the group. */
+ prev_field_is_bitfield = false;
+ bitregion_start = NULL_TREE;
for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld))
{
- tree t, offset;
- enum machine_mode mode;
- int unsignedp, volatilep;
-
if (TREE_CODE (fld) != FIELD_DECL)
continue;
- t = build3 (COMPONENT_REF, TREE_TYPE (exp),
- unshare_expr (TREE_OPERAND (exp, 0)),
- fld, NULL_TREE);
- get_inner_reference (t, &bitsize, &bitpos, &offset,
- &mode, &unsignedp, &volatilep, true);
-
- if (field == fld)
- found_field = true;
-
- if (DECL_BIT_FIELD_TYPE (fld) && bitsize > 0)
+ /* If we have a non-zero bit-field. */
+ if (DECL_BIT_FIELD_TYPE (fld)
+ && !integer_zerop (DECL_SIZE (fld)))
{
- if (prev_field_is_bitfield == false)
+ if (!prev_field_is_bitfield)
{
- *bitstart = bitpos;
+ bitregion_start = fld;
prev_field_is_bitfield = true;
}
}
else
+ prev_field_is_bitfield = false;
+ if (fld == field)
+ break;
+ }
+ gcc_assert (bitregion_start);
+ gcc_assert (fld);
+
+ /* Save the starting position of the bitregion. */
+ get_inner_reference (build3 (COMPONENT_REF,
+ TREE_TYPE (exp),
+ TREE_OPERAND (exp, 0),
+ bitregion_start, NULL_TREE),
+ &tbitsize, &start_bitpos, &start_offset,
+ &tmode, &tunsignedp, &tvolatilep, true);
+
+ if (!start_offset)
+ start_offset = size_zero_node;
+ /* Calculate byte offset to the beginning of the bit region. */
+ /* BYTE_OFFSET = START_OFFSET + (START_BITPOS / BITS_PER_UNIT) */
+ gcc_assert (start_bitpos % BITS_PER_UNIT == 0);
+ *byte_offset = fold_build2 (PLUS_EXPR, TREE_TYPE (start_offset),
+ start_offset,
+ build_int_cst (integer_type_node,
+ start_bitpos / BITS_PER_UNIT));
+
+ /* Calculate the starting bit offset and find the end of the bit
+ region. */
+ for (fld = bitregion_start; fld; fld = DECL_CHAIN (fld))
+ {
+ if (TREE_CODE (fld) != FIELD_DECL)
+ continue;
+
+ if (!DECL_BIT_FIELD_TYPE (fld)
+ || integer_zerop (DECL_SIZE (fld)))
+ break;
+
+ if (fld == field)
{
- prev_field_is_bitfield = false;
- if (found_field)
- break;
+ tree t = DECL_FIELD_OFFSET (fld);
+ tree bits = build_int_cst (integer_type_node, BITS_PER_UNIT);
+ HOST_WIDE_INT tbitpos;
+ tree toffset;
+
+ get_inner_reference (build3 (COMPONENT_REF,
+ TREE_TYPE (exp),
+ TREE_OPERAND (exp, 0),
+ fld, NULL_TREE),
+ &tbitsize, &tbitpos, &toffset,
+ &tmode, &tunsignedp, &tvolatilep, true);
+
+ if (!toffset)
+ toffset = size_zero_node;
+
+ /* bitoff = start_byte * 8 - (fld.byteoff * 8 + fld.bitoff) */
+ t = fold_build2 (MINUS_EXPR, size_type_node,
+ fold_build2 (PLUS_EXPR, size_type_node,
+ fold_build2 (MULT_EXPR, size_type_node,
+ toffset, bits),
+ build_int_cst (integer_type_node,
+ tbitpos)),
+ fold_build2 (MULT_EXPR, size_type_node,
+ *byte_offset, bits));
+
+ *bit_offset = tree_low_cst (t, 1);
}
}
- gcc_assert (found_field);
+ /* Be as conservative as possible on variable offsets. */
+ if (TREE_OPERAND (exp, 2)
+ && !host_integerp (TREE_OPERAND (exp, 2), 1))
+ {
+ *byte_offset = TREE_OPERAND (exp, 2);
+ *maxbits = BITS_PER_UNIT;
+ return;
+ }
+
+ /* If we found the end of the bit field sequence, include the
+ padding up to the next field... */
if (fld)
{
- /* We found the end of the bit field sequence. Include the
- padding up to the next field and be done. */
- *bitend = bitpos - 1;
+ tree end_offset, maxbits_tree;
+ HOST_WIDE_INT end_bitpos;
+
+ /* Calculate bitpos and offset of the next field. */
+ get_inner_reference (build3 (COMPONENT_REF,
+ TREE_TYPE (exp),
+ TREE_OPERAND (exp, 0),
+ fld, NULL_TREE),
+ &tbitsize, &end_bitpos, &end_offset,
+ &tmode, &tunsignedp, &tvolatilep, true);
+ gcc_assert (end_bitpos % BITS_PER_UNIT == 0);
+
+ if (end_offset)
+ {
+ tree type = TREE_TYPE (end_offset);
+
+ maxbits_tree = fold_build2 (PLUS_EXPR, type,
+ build2 (MULT_EXPR, type,
+ build2 (MINUS_EXPR, type,
+ end_offset,
+ *byte_offset),
+ build_int_cst (size_type_node,
+ BITS_PER_UNIT)),
+ build_int_cst (size_type_node,
+ end_bitpos));
+ }
+ else
+ maxbits_tree = build_int_cst (integer_type_node,
+ end_bitpos - start_bitpos);
+
+ *maxbits = TREE_INT_CST_LOW (maxbits_tree);
}
+ /* ...otherwise, this is the last element in the structure. */
else
{
- /* If this is the last element in the structure, include the padding
- at the end of structure. */
- *bitend = TREE_INT_CST_LOW (TYPE_SIZE (record_type)) - 1;
+ /* Include the padding at the end of structure. */
+ *maxbits = TREE_INT_CST_LOW (TYPE_SIZE (record_type))
+ - TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (bitregion_start));
+ /* Round up to the next byte. */
+ *maxbits = (*maxbits + BITS_PER_UNIT - 1) & ~(BITS_PER_UNIT - 1);
}
}
@@ -4324,12 +4396,15 @@ expand_assignment (tree to, tree from, b
{
enum machine_mode mode1;
HOST_WIDE_INT bitsize, bitpos;
- unsigned HOST_WIDE_INT bitregion_start = 0;
- unsigned HOST_WIDE_INT bitregion_end = 0;
tree offset;
int unsignedp;
int volatilep = 0;
tree tem;
+ tree bitregion_byte_offset = size_zero_node;
+ HOST_WIDE_INT bitregion_bit_offset = 0;
+ /* Set to 0 for the special case where there is no restriction
+ in play. */
+ HOST_WIDE_INT bitregion_maxbits = 0;
push_temp_slots ();
tem = get_inner_reference (to, &bitsize, &bitpos, &offset, &mode1,
@@ -4337,8 +4412,30 @@ expand_assignment (tree to, tree from, b
if (TREE_CODE (to) == COMPONENT_REF
&& DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
- get_bit_range (&bitregion_start, &bitregion_end,
- to, tem, bitpos, bitsize);
+ {
+ /* If other threads can't see this value, no need to
+ restrict stores. */
+ if (ALLOW_STORE_DATA_RACES
+ || ((TREE_CODE (tem) == MEM_REF
+ || TREE_CODE (tem) == TARGET_MEM_REF)
+ && !ptr_deref_may_alias_global_p (TREE_OPERAND (tem, 0)))
+ || TREE_CODE (tem) == RESULT_DECL
+ || TREE_CODE (tem) == PARM_DECL
+ || (DECL_P (tem)
+ && ((TREE_CODE (tem) == VAR_DECL
+ && DECL_THREAD_LOCAL_P (tem))
+ || !TREE_STATIC (tem))))
+ {
+ bitregion_byte_offset = size_zero_node;
+ bitregion_bit_offset = 0;
+ /* Set to 0 for the special case where there is no
+ restriction in play. */
+ bitregion_maxbits = 0;
+ }
+ else
+ get_bit_range (to, &bitregion_byte_offset,
+ &bitregion_bit_offset, &bitregion_maxbits);
+ }
/* If we are going to use store_bit_field and extract_bit_field,
make sure to_rtx will be safe for multiple use. */
@@ -4388,6 +4485,10 @@ expand_assignment (tree to, tree from, b
&& MEM_ALIGN (to_rtx) == GET_MODE_ALIGNMENT (mode1))
{
to_rtx = adjust_address (to_rtx, mode1, bitpos / BITS_PER_UNIT);
+ bitregion_byte_offset = fold_build2 (MINUS_EXPR, integer_type_node,
+ bitregion_byte_offset,
+ build_int_cst (integer_type_node,
+ bitpos / BITS_PER_UNIT));
bitpos = 0;
}
@@ -4421,13 +4522,15 @@ expand_assignment (tree to, tree from, b
nontemporal);
else if (bitpos + bitsize <= mode_bitsize / 2)
result = store_field (XEXP (to_rtx, 0), bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_byte_offset, bitregion_bit_offset,
+ bitregion_maxbits,
mode1, from, TREE_TYPE (tem),
get_alias_set (to), nontemporal);
else if (bitpos >= mode_bitsize / 2)
result = store_field (XEXP (to_rtx, 1), bitsize,
bitpos - mode_bitsize / 2,
- bitregion_start, bitregion_end,
+ bitregion_byte_offset, bitregion_bit_offset,
+ bitregion_maxbits,
mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
@@ -4450,7 +4553,8 @@ expand_assignment (tree to, tree from, b
write_complex_part (temp, XEXP (to_rtx, 0), false);
write_complex_part (temp, XEXP (to_rtx, 1), true);
result = store_field (temp, bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_byte_offset, bitregion_bit_offset,
+ bitregion_maxbits,
mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
@@ -4477,13 +4581,14 @@ expand_assignment (tree to, tree from, b
}
if (optimize_bitfield_assignment_op (bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_maxbits,
mode1,
to_rtx, to, from))
result = NULL;
else
result = store_field (to_rtx, bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_byte_offset, bitregion_bit_offset,
+ bitregion_maxbits,
mode1, from,
TREE_TYPE (tem), get_alias_set (to),
nontemporal);
@@ -4877,7 +4982,7 @@ store_expr (tree exp, rtx target, int ca
: BLOCK_OP_NORMAL));
else if (GET_MODE (target) == BLKmode)
store_bit_field (target, INTVAL (expr_size (exp)) * BITS_PER_UNIT,
- 0, 0, 0, GET_MODE (temp), temp);
+ 0, integer_zero_node, 0, 0, GET_MODE (temp), temp);
else
convert_move (target, temp, unsignedp);
}
@@ -5342,8 +5447,8 @@ store_constructor_field (rtx target, uns
store_constructor (exp, target, cleared, bitsize / BITS_PER_UNIT);
}
else
- store_field (target, bitsize, bitpos, 0, 0, mode, exp, type, alias_set,
- false);
+ store_field (target, bitsize, bitpos, integer_zero_node, 0, 0, mode, exp,
+ type, alias_set, false);
}
/* Store the value of constructor EXP into the rtx TARGET.
@@ -5917,10 +6022,14 @@ store_constructor (tree exp, rtx target,
BITSIZE bits, starting BITPOS bits from the start of TARGET.
If MODE is VOIDmode, it means that we are storing into a bit-field.
- BITREGION_START is bitpos of the first bitfield in this region.
- BITREGION_END is the bitpos of the ending bitfield in this region.
- These two fields are 0, if the C++ memory model does not apply,
- or we are not interested in keeping track of bitfield regions.
+ BITREGION_BYTE_OFFSET is the byte offset from the beginning of the
+ containing object to the start of the bit region.
+
+ BITREGION_BIT_OFFSET is the bit offset from the start of the bit
+ region.
+
+ BITREGION_MAXBITS is the size of the bit region containing the bit
+ field in question.
Always return const0_rtx unless we have something particular to
return.
@@ -5935,8 +6044,9 @@ store_constructor (tree exp, rtx target,
static rtx
store_field (rtx target, HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ tree bitregion_byte_offset,
+ HOST_WIDE_INT bitregion_bit_offset,
+ HOST_WIDE_INT bitregion_maxbits,
enum machine_mode mode, tree exp, tree type,
alias_set_type alias_set, bool nontemporal)
{
@@ -5970,7 +6080,8 @@ store_field (rtx target, HOST_WIDE_INT b
emit_move_insn (object, target);
store_field (blk_object, bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_byte_offset, bitregion_bit_offset,
+ bitregion_maxbits,
mode, exp, type, alias_set, nontemporal);
emit_move_insn (target, object);
@@ -6086,7 +6197,8 @@ store_field (rtx target, HOST_WIDE_INT b
/* Store the value in the bitfield. */
store_bit_field (target, bitsize, bitpos,
- bitregion_start, bitregion_end,
+ bitregion_byte_offset, bitregion_bit_offset,
+ bitregion_maxbits,
mode, temp);
return const0_rtx;
@@ -7497,7 +7609,8 @@ expand_expr_real_2 (sepops ops, rtx targ
(treeop0))
* BITS_PER_UNIT),
(HOST_WIDE_INT) GET_MODE_BITSIZE (mode)),
- 0, 0, 0, TYPE_MODE (valtype), treeop0,
+ 0, integer_zero_node, 0, 0,
+ TYPE_MODE (valtype), treeop0,
type, 0, false);
}
Index: expr.h
===================================================================
--- expr.h (revision 176891)
+++ expr.h (working copy)
@@ -664,10 +664,12 @@ enum extraction_pattern { EP_insv, EP_ex
extern enum machine_mode
mode_for_extraction (enum extraction_pattern, int);
+extern enum machine_mode get_max_mode (HOST_WIDE_INT);
extern void store_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
+ tree,
+ HOST_WIDE_INT,
+ HOST_WIDE_INT,
enum machine_mode, rtx);
extern rtx extract_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT, int, bool, rtx,
Index: stor-layout.c
===================================================================
--- stor-layout.c (revision 176891)
+++ stor-layout.c (working copy)
@@ -2361,13 +2361,6 @@ fixup_unsigned_type (tree type)
/* Find the best machine mode to use when referencing a bit field of length
BITSIZE bits starting at BITPOS.
- BITREGION_START is the bit position of the first bit in this
- sequence of bit fields. BITREGION_END is the last bit in this
- sequence. If these two fields are non-zero, we should restrict the
- memory access to a maximum sized chunk of
- BITREGION_END - BITREGION_START + 1. Otherwise, we are allowed to touch
- any adjacent non bit-fields.
-
The underlying object is known to be aligned to a boundary of ALIGN bits.
If LARGEST_MODE is not VOIDmode, it means that we should not use a mode
larger than LARGEST_MODE (usually SImode).
@@ -2386,20 +2379,11 @@ fixup_unsigned_type (tree type)
enum machine_mode
get_best_mode (int bitsize, int bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
unsigned int align,
enum machine_mode largest_mode, int volatilep)
{
enum machine_mode mode;
unsigned int unit = 0;
- unsigned HOST_WIDE_INT maxbits;
-
- /* If unset, no restriction. */
- if (!bitregion_end)
- maxbits = MAX_FIXED_MODE_SIZE;
- else
- maxbits = (bitregion_end - bitregion_start) % align + 1;
/* Find the narrowest integer mode that contains the bit field. */
for (mode = GET_CLASS_NARROWEST_MODE (MODE_INT); mode != VOIDmode;
@@ -2436,7 +2420,6 @@ get_best_mode (int bitsize, int bitpos,
&& bitpos / unit == (bitpos + bitsize - 1) / unit
&& unit <= BITS_PER_WORD
&& unit <= MIN (align, BIGGEST_ALIGNMENT)
- && unit <= maxbits
&& (largest_mode == VOIDmode
|| unit <= GET_MODE_BITSIZE (largest_mode)))
wide_mode = tmode;
Index: calls.c
===================================================================
--- calls.c (revision 176891)
+++ calls.c (working copy)
@@ -924,7 +924,8 @@ store_unaligned_arguments_into_pseudos (
emit_move_insn (reg, const0_rtx);
bytes -= bitsize / BITS_PER_UNIT;
- store_bit_field (reg, bitsize, endian_correction, 0, 0,
+ store_bit_field (reg, bitsize, endian_correction,
+ integer_zero_node, 0, 0,
word_mode, word);
}
}
Index: expmed.c
===================================================================
--- expmed.c (revision 176891)
+++ expmed.c (working copy)
@@ -48,13 +48,11 @@ struct target_expmed *this_target_expmed
static void store_fixed_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
+ HOST_WIDE_INT,
rtx);
static void store_split_bit_field (rtx, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT,
+ HOST_WIDE_INT,
rtx);
static rtx extract_fixed_bit_field (enum machine_mode, rtx,
unsigned HOST_WIDE_INT,
@@ -340,8 +338,7 @@ mode_for_extraction (enum extraction_pat
static bool
store_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitnum,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ HOST_WIDE_INT bitregion_maxbits,
enum machine_mode fieldmode,
rtx value, bool fallback_p)
{
@@ -558,7 +555,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (!store_bit_field_1 (op0, MIN (BITS_PER_WORD,
bitsize - i * BITS_PER_WORD),
bitnum + bit_offset,
- bitregion_start, bitregion_end,
+ bitregion_maxbits,
word_mode,
value_word, fallback_p))
{
@@ -722,10 +719,6 @@ store_bit_field_1 (rtx str_rtx, unsigned
if (HAVE_insv && MEM_P (op0))
{
enum machine_mode bestmode;
- unsigned HOST_WIDE_INT maxbits = MAX_FIXED_MODE_SIZE;
-
- if (bitregion_end)
- maxbits = bitregion_end - bitregion_start + 1;
/* Get the mode to use for inserting into this field. If OP0 is
BLKmode, get the smallest mode consistent with the alignment. If
@@ -733,15 +726,20 @@ store_bit_field_1 (rtx str_rtx, unsigned
mode. Otherwise, use the smallest mode containing the field. */
if (GET_MODE (op0) == BLKmode
- || GET_MODE_BITSIZE (GET_MODE (op0)) > maxbits
+ || (bitregion_maxbits
+ && GET_MODE_BITSIZE (GET_MODE (op0)) > bitregion_maxbits)
|| (op_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (op_mode)))
- bestmode = get_best_mode (bitsize, bitnum,
- bitregion_start, bitregion_end,
- MEM_ALIGN (op0),
- (op_mode == MAX_MACHINE_MODE
- ? VOIDmode : op_mode),
- MEM_VOLATILE_P (op0));
+ {
+ bestmode = (op_mode == MAX_MACHINE_MODE ? VOIDmode : op_mode);
+ if (bitregion_maxbits
+ && bitregion_maxbits < GET_MODE_BITSIZE (op_mode))
+ bestmode = get_max_mode (bitregion_maxbits);
+ bestmode = get_best_mode (bitsize, bitnum,
+ MEM_ALIGN (op0),
+ bestmode,
+ MEM_VOLATILE_P (op0));
+ }
else
bestmode = GET_MODE (op0);
@@ -752,6 +750,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
{
rtx last, tempreg, xop0;
unsigned HOST_WIDE_INT xoffset, xbitpos;
+ HOST_WIDE_INT xmaxbits = bitregion_maxbits;
last = get_last_insn ();
@@ -762,13 +761,24 @@ store_bit_field_1 (rtx str_rtx, unsigned
xoffset = (bitnum / unit) * GET_MODE_SIZE (bestmode);
xbitpos = bitnum % unit;
xop0 = adjust_address (op0, bestmode, xoffset);
+ if (xmaxbits)
+ xmaxbits -= xoffset * BITS_PER_UNIT;
/* Fetch that unit, store the bitfield in it, then store
the unit. */
tempreg = copy_to_reg (xop0);
- if (store_bit_field_1 (tempreg, bitsize, xbitpos,
- bitregion_start, bitregion_end,
- fieldmode, orig_value, false))
+ if (xmaxbits && unit > xmaxbits)
+ {
+ /* Do not allow reading past the bit region.
+ Technically, you can read past the bitregion, because
+ load data races are allowed. You just can't write
+ past the bit region.
+
+ ?? Perhaps allow reading, and adjust everything else
+ accordingly. Ughh. */
+ }
+ else if (store_bit_field_1 (tempreg, bitsize, xbitpos, xmaxbits,
+ fieldmode, orig_value, false))
{
emit_move_insn (xop0, tempreg);
return true;
@@ -781,7 +791,7 @@ store_bit_field_1 (rtx str_rtx, unsigned
return false;
store_fixed_bit_field (op0, offset, bitsize, bitpos,
- bitregion_start, bitregion_end, value);
+ bitregion_maxbits, value);
return true;
}
@@ -789,18 +799,22 @@ store_bit_field_1 (rtx str_rtx, unsigned
into a bit-field within structure STR_RTX
containing BITSIZE bits starting at bit BITNUM.
- BITREGION_START is bitpos of the first bitfield in this region.
- BITREGION_END is the bitpos of the ending bitfield in this region.
- These two fields are 0, if the C++ memory model does not apply,
- or we are not interested in keeping track of bitfield regions.
+ BITREGION_BYTE_OFFSET is the byte offset from STR_RTX to the start
+ of the bit region.
+
+ BITREGION_BIT_OFFSET is the field's bit offset from the start of
+ the bit region.
+
+ BITREGION_MAXBITS is the number of bits in the bit region.
FIELDMODE is the machine-mode of the FIELD_DECL node for this field. */
void
store_bit_field (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitnum,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ tree bitregion_byte_offset,
+ HOST_WIDE_INT bitregion_bit_offset,
+ HOST_WIDE_INT bitregion_maxbits,
enum machine_mode fieldmode,
rtx value)
{
@@ -808,33 +822,51 @@ store_bit_field (rtx str_rtx, unsigned H
bit region. Adjust the address to start at the beginning of the
bit region. */
if (MEM_P (str_rtx)
- && bitregion_start > 0)
+ && bitregion_maxbits
+ && !integer_zerop (bitregion_byte_offset))
{
- enum machine_mode bestmode;
- enum machine_mode op_mode;
- unsigned HOST_WIDE_INT offset;
+ HOST_WIDE_INT offset;
- op_mode = mode_for_extraction (EP_insv, 3);
- if (op_mode == MAX_MACHINE_MODE)
- op_mode = VOIDmode;
-
- offset = bitregion_start / BITS_PER_UNIT;
- bitnum -= bitregion_start;
- bitregion_end -= bitregion_start;
- bitregion_start = 0;
- bestmode = get_best_mode (bitsize, bitnum,
- bitregion_start, bitregion_end,
- MEM_ALIGN (str_rtx),
- op_mode,
- MEM_VOLATILE_P (str_rtx));
- str_rtx = adjust_address (str_rtx, bestmode, offset);
+ if (host_integerp (bitregion_byte_offset, 1))
+ {
+ /* Adjust the bit position accordingly. */
+ offset = tree_low_cst (bitregion_byte_offset, 1);
+ /* Adjust the actual address. */
+ str_rtx = adjust_address (str_rtx, GET_MODE (str_rtx), offset);
+ }
+ else
+ {
+ /* Handle variable length offsets. */
+ str_rtx = offset_address (str_rtx,
+ expand_normal (bitregion_byte_offset), 1);
+ }
+ bitregion_byte_offset = integer_zero_node;
+ bitnum = bitregion_bit_offset;
}
- if (!store_bit_field_1 (str_rtx, bitsize, bitnum,
- bitregion_start, bitregion_end,
+ if (!store_bit_field_1 (str_rtx, bitsize, bitnum, bitregion_maxbits,
fieldmode, value, true))
gcc_unreachable ();
}
+
+/* Return the largest mode that can be used to address a bit field of
+ size BITS. This is basically a MODE whose bit size is <= BITS. */
+enum machine_mode
+get_max_mode (HOST_WIDE_INT bits)
+{
+ enum machine_mode mode, prev;
+
+ for (prev = mode = GET_CLASS_NARROWEST_MODE (MODE_INT); mode != VOIDmode;
+ mode = GET_MODE_WIDER_MODE (mode))
+ {
+ if (GET_MODE_BITSIZE (mode) > bits
+ || GET_MODE_BITSIZE (mode) > MAX_FIXED_MODE_SIZE)
+ return prev;
+ prev = mode;
+ }
+ gcc_unreachable ();
+ return VOIDmode;
+}
\f
/* Use shifts and boolean operations to store VALUE
into a bit field of width BITSIZE
@@ -843,14 +875,16 @@ store_bit_field (rtx str_rtx, unsigned H
The field starts at position BITPOS within the byte.
(If OP0 is a register, it may be a full word or a narrower mode,
but BITPOS still counts within a full word,
- which is significant on bigendian machines.) */
+ which is significant on bigendian machines.)
+
+ BITREGION_MAXBITS is the number of bits in the bit region, which
+ starts at OP0. */
static void
store_fixed_bit_field (rtx op0, unsigned HOST_WIDE_INT offset,
unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ HOST_WIDE_INT bitregion_maxbits,
rtx value)
{
enum machine_mode mode;
@@ -872,19 +906,12 @@ store_fixed_bit_field (rtx op0, unsigned
/* Special treatment for a bit field split across two registers. */
if (bitsize + bitpos > BITS_PER_WORD)
{
- store_split_bit_field (op0, bitsize, bitpos,
- bitregion_start, bitregion_end,
- value);
+ store_split_bit_field (op0, bitsize, bitpos, bitregion_maxbits, value);
return;
}
}
else
{
- unsigned HOST_WIDE_INT maxbits = MAX_FIXED_MODE_SIZE;
-
- if (bitregion_end)
- maxbits = bitregion_end - bitregion_start + 1;
-
/* Get the proper mode to use for this field. We want a mode that
includes the entire field. If such a mode would be larger than
a word, we won't be doing the extraction the normal way.
@@ -897,20 +924,26 @@ store_fixed_bit_field (rtx op0, unsigned
if (MEM_VOLATILE_P (op0)
&& GET_MODE_BITSIZE (GET_MODE (op0)) > 0
- && GET_MODE_BITSIZE (GET_MODE (op0)) <= maxbits
+ && (!bitregion_maxbits
+ || GET_MODE_BITSIZE (GET_MODE (op0)) <= bitregion_maxbits)
&& flag_strict_volatile_bitfields > 0)
mode = GET_MODE (op0);
else
- mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
- bitregion_start, bitregion_end,
- MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
+ {
+ if (bitregion_maxbits
+ && (bitregion_maxbits - offset * BITS_PER_UNIT
+ < GET_MODE_BITSIZE (mode)))
+ mode = get_max_mode (bitregion_maxbits - offset * BITS_PER_UNIT);
+ mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
+ MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
+ }
if (mode == VOIDmode)
{
/* The only way this should occur is if the field spans word
boundaries. */
store_split_bit_field (op0, bitsize, bitpos + offset * BITS_PER_UNIT,
- bitregion_start, bitregion_end, value);
+ bitregion_maxbits, value);
return;
}
@@ -932,6 +965,14 @@ store_fixed_bit_field (rtx op0, unsigned
Then alter OP0 to refer to that word. */
bitpos += (offset % (total_bits / BITS_PER_UNIT)) * BITS_PER_UNIT;
offset -= (offset % (total_bits / BITS_PER_UNIT));
+ if (bitregion_maxbits)
+ {
+ enum machine_mode tmode;
+ bitregion_maxbits -= offset * BITS_PER_UNIT;
+ tmode = get_max_mode (bitregion_maxbits);
+ if (GET_MODE_SIZE (mode) > GET_MODE_SIZE (tmode))
+ mode = tmode;
+ }
op0 = adjust_address (op0, mode, offset);
}
@@ -1031,8 +1072,7 @@ store_fixed_bit_field (rtx op0, unsigned
static void
store_split_bit_field (rtx op0, unsigned HOST_WIDE_INT bitsize,
unsigned HOST_WIDE_INT bitpos,
- unsigned HOST_WIDE_INT bitregion_start,
- unsigned HOST_WIDE_INT bitregion_end,
+ HOST_WIDE_INT bitregion_maxbits,
rtx value)
{
unsigned int unit;
@@ -1043,7 +1083,14 @@ store_split_bit_field (rtx op0, unsigned
if (REG_P (op0) || GET_CODE (op0) == SUBREG)
unit = BITS_PER_WORD;
else
- unit = MIN (MEM_ALIGN (op0), BITS_PER_WORD);
+ {
+ unit = MIN (MEM_ALIGN (op0), BITS_PER_WORD);
+
+ /* ?? Ideally we should do as much as we can with the wider
+ mode, and use BITS_PER_UNIT for the remaining bits. */
+ if (bitregion_maxbits % unit)
+ unit = BITS_PER_UNIT;
+ }
/* If VALUE is a constant other than a CONST_INT, get it into a register in
WORD_MODE. If we can do this using gen_lowpart_common, do so. Note
@@ -1148,7 +1195,7 @@ store_split_bit_field (rtx op0, unsigned
it is just an out-of-bounds access. Ignore it. */
if (word != const0_rtx)
store_fixed_bit_field (word, offset * unit / BITS_PER_UNIT, thissize,
- thispos, bitregion_start, bitregion_end, part);
+ thispos, bitregion_maxbits, part);
bitsdone += thissize;
}
}
@@ -1588,7 +1635,7 @@ extract_bit_field_1 (rtx str_rtx, unsign
if (GET_MODE (op0) == BLKmode
|| (ext_mode != MAX_MACHINE_MODE
&& GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (ext_mode)))
- bestmode = get_best_mode (bitsize, bitnum, 0, 0, MEM_ALIGN (op0),
+ bestmode = get_best_mode (bitsize, bitnum, MEM_ALIGN (op0),
(ext_mode == MAX_MACHINE_MODE
? VOIDmode : ext_mode),
MEM_VOLATILE_P (op0));
@@ -1714,7 +1761,7 @@ extract_fixed_bit_field (enum machine_mo
mode = tmode;
}
else
- mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT, 0, 0,
+ mode = get_best_mode (bitsize, bitpos + offset * BITS_PER_UNIT,
MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
if (mode == VOIDmode)
Index: stmt.c
===================================================================
--- stmt.c (revision 176891)
+++ stmt.c (working copy)
@@ -1760,7 +1760,7 @@ expand_return (tree retval)
/* Use bitpos for the source extraction (left justified) and
xbitpos for the destination store (right justified). */
store_bit_field (dst, bitsize, xbitpos % BITS_PER_WORD,
- 0, 0, word_mode,
+ integer_zero_node, 0, 0, word_mode,
extract_bit_field (src, bitsize,
bitpos % BITS_PER_WORD, 1, false,
NULL_RTX, word_mode, word_mode));
Index: params.def
===================================================================
--- params.def (revision 176891)
+++ params.def (working copy)
@@ -912,7 +912,9 @@ DEFPARAM (PARAM_CASE_VALUES_THRESHOLD,
DEFPARAM (PARAM_ALLOW_STORE_DATA_RACES,
"allow-store-data-races",
"Allow new data races on stores to be introduced",
- 1, 0, 1)
+ /* TESTING TESTING */
+ /* TESTING: Enable the memory model by default. */
+ 0, 0, 1)
/*
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [C++0x] contiguous bitfields race implementation
2011-08-27 0:05 ` Aldy Hernandez
@ 2011-08-29 12:54 ` Richard Guenther
2011-08-30 16:07 ` Aldy Hernandez
` (3 more replies)
0 siblings, 4 replies; 81+ messages in thread
From: Richard Guenther @ 2011-08-29 12:54 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jason Merrill, gcc-patches, Jakub Jelinek
On Fri, Aug 26, 2011 at 8:54 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
> This is a "slight" update from the last revision, with your issues addressed
> as I explained in the last email. However, everything turned out to be much
> trickier than I expected (variable length offsets with arrays, bit fields
> spanning multiple words, surprising padding gymnastics by GCC, etc etc).
>
> It turns out that what we need is to know the precise bit region size at all
> times, and adjust it as we rearrange and cut things into pieces throughout
> the RTL bit field machinery.
>
> I enabled the C++ memory model, and forced a bootstrap and regression test
> with it. This brought about many interesting cases, which I was able to
> distill and add to the testsuite.
>
> Of particular interest were the struct-layout-1.exp tests. Since many of the
> tests set a global bit field, only to later check it against a local
> variable containing the same value, it is the perfect stressor because,
> while globals are restricted under the memory model, locals are not. So we
> can check that we can interoperate with the less restrictive model, and that
> the patch does not introduce ABI inconsistencies. After much grief, we are
> now passing all the struct-layout-1.exp tests. Eventually, I'd like to force
> the struct-layout-1.exp tests to run for "--param allow-store-data-races=0"
> as well. Unfortunately, this will increase testing time.
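[Editor's note: the stressor pattern described above can be sketched in a few lines. This is an illustrative reconstruction, not an actual struct-layout-1.exp test; the struct and names are hypothetical.]

```c
/* Sketch of the struct-layout-1.exp stressor described above: the same
   value is stored both through a global bit field (whose stores the
   memory model restricts) and through a local one (unrestricted); the
   two code paths must agree on the value for the ABI to stay
   consistent.  The struct here is hypothetical. */
struct layout_probe
{
  unsigned int a : 4;
  unsigned char b;
};

struct layout_probe global_probe;

int
global_matches_local (void)
{
  struct layout_probe local;
  global_probe.a = 12;  /* store constrained by the memory model */
  local.a = 12;         /* store to a local: no race is possible */
  return global_probe.a == local.a;
}
```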
>
> I have (unfortunately) introduced an additional call to
> get_inner_reference(), but only for the field itself (one time). I can't
> remember the details, but it was something to effect of the bit position +
> padding being impossible to calculate in one variable array reference case.
> I can dig up the case if you'd like.
>
> I am currently tackling a reload miscompilation failure while building a
> 32-bit library. I am secretly hoping your review will uncover the flaw
> without me having to pick this up. Otherwise, this is a much more
> comprehensive approach than what is currently in mainline, and we now pass
> all the bitfield tests the GCC testsuite could throw at it.
>
> Fire away.
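[Editor's note: for readers joining mid-thread, the layout under discussion, from the opening message, is reproduced below. The sketch is single-threaded and therefore race-free on any compiler; it only marks the store whose widening is at issue.]

```c
#include <string.h>

/* The layout from the opening message of the thread.  Under the C++11
   memory model a store to the bit field <a> must not touch the bytes
   holding <b>; the bug was a 32-bit read-modify-write covering both. */
struct S
{
  unsigned int a : 4;
  unsigned char b;
  unsigned int c : 6;
};

/* Single-threaded this is always race-free; the memory-model question
   is whether the store to <a> may be widened so that a concurrent
   store to <b> could be lost. */
unsigned char
seta_then_read_b (void)
{
  struct S var;
  memset (&var, 0, sizeof var);
  var.b = 0xff;
  var.a = 12;   /* the store whose width is at issue */
  return var.b; /* must still be 0xff */
}
```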
+ /* Be as conservative as possible on variable offsets. */
+ if (TREE_OPERAND (exp, 2)
+ && !host_integerp (TREE_OPERAND (exp, 2), 1))
+ {
+ *byte_offset = TREE_OPERAND (exp, 2);
+ *maxbits = BITS_PER_UNIT;
+ return;
+ }
shouldn't this be at the very beginning of the function? Because
you've set *bit_offset to an offset that was _not_ calculated relative
to TREE_OPERAND (exp, 2). And you'll avoid ICEing
+ /* bitoff = start_byte * 8 - (fld.byteoff * 8 + fld.bitoff) */
+ t = fold_build2 (MINUS_EXPR, size_type_node,
+ fold_build2 (PLUS_EXPR, size_type_node,
+ fold_build2 (MULT_EXPR, size_type_node,
+ toffset, bits),
+ build_int_cst (integer_type_node,
+ tbitpos)),
+ fold_build2 (MULT_EXPR, size_type_node,
+ *byte_offset, bits));
+
+ *bit_offset = tree_low_cst (t, 1);
here in case t isn't an INTEGER_CST. The comment before the
tree formula above doesn't match it, please update it. If
*bit_offset is supposed to be relative to *byte_offset then it should
be easy to calculate it without another get_inner_reference.
Btw, *byte_offset is still not relative to the containing object as
documented, but relative to the base object of the exp reference
tree (thus, to a in a.i.j.k.l instead of to a.i.j.k). If it were supposed
to be relative to a.i.j.k get_inner_reference would be not needed
either. Can you clarify what "containing object" means in the
overall comment please?
If it is really relative to the innermost reference of exp you can
"CSE" the offset of TREE_OPERAND (exp, 0) and do relative
adjustments for all the other get_inner_reference calls. For
example the
+ /* If we found the end of the bit field sequence, include the
+ padding up to the next field... */
if (fld)
{
...
+ /* Calculate bitpos and offset of the next field. */
+ get_inner_reference (build3 (COMPONENT_REF,
+ TREE_TYPE (exp),
+ TREE_OPERAND (exp, 0),
+ fld, NULL_TREE),
+ &tbitsize, &end_bitpos, &end_offset,
+ &tmode, &tunsignedp, &tvolatilep, true);
case is not correct anyway, fld may have variable position
(non-INTEGER_CST DECL_FIELD_OFFSET), you can't
assume
+ *maxbits = TREE_INT_CST_LOW (maxbits_tree);
this thus.
+ /* ...otherwise, this is the last element in the structure. */
else
{
- /* If this is the last element in the structure, include the padding
- at the end of structure. */
- *bitend = TREE_INT_CST_LOW (TYPE_SIZE (record_type)) - 1;
+ /* Include the padding at the end of structure. */
+ *maxbits = TREE_INT_CST_LOW (TYPE_SIZE (record_type))
+ - TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (bitregion_start));
+ /* Round up to the next byte. */
+ *maxbits = (*maxbits + BITS_PER_UNIT - 1) & ~(BITS_PER_UNIT - 1);
}
so you weren't convinced about my worries about tail-padding re-use?
And you blindly assume a constant-size record_type ...
and you don't account for DECL_FIELD_OFFSET of bitregion_start
(shouldn't you simply use (and compute) a byte_offset relative to
the start of the record)? Well, I still think you cannot include the
padding at the end of the structure (if TREE_OPERAND (exp, 0) is
a COMPONENT_REF as well then its DECL_SIZE can be different
than its TYPE_SIZE).
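[Editor's note: as an aside, the byte-rounding step in the hunk quoted above is the usual align-up idiom; a standalone sketch, illustrative only and not GCC code:]

```c
#define BITS_PER_UNIT 8  /* bits per byte, as GCC defines it on most hosts */

/* Round a bit count up to a whole number of bytes, mirroring the
   *maxbits rounding in the quoted hunk. */
unsigned long
round_up_to_byte (unsigned long bits)
{
  return (bits + BITS_PER_UNIT - 1) & ~((unsigned long) BITS_PER_UNIT - 1);
}
```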
+ bitregion_byte_offset = fold_build2 (MINUS_EXPR, integer_type_node,
+ bitregion_byte_offset,
+ build_int_cst (integer_type_node,
+ bitpos / BITS_PER_UNIT));
general remark - you should be using sizetype for byte offsets,
bitsizetype for bit offset trees and size_binop for computations, instead
of fold_build2 (applies everywhere). And thus pass size_zero_node
to store_field as bitregion_byte_offset.
Can you split out the get_best_mode two param removal pieces? Consider
them pre-approved.
Why do you need to adjust store_bit_field with the extra param - can't
you simply pass an adjusted str_rtx from the single caller that can
have that non-zero?
Thanks,
Richard.
* Re: [C++0x] contiguous bitfields race implementation
2011-08-29 12:54 ` Richard Guenther
@ 2011-08-30 16:07 ` Aldy Hernandez
2011-08-31 8:38 ` Richard Guenther
2011-08-30 16:53 ` Aldy Hernandez
` (2 subsequent siblings)
3 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-08-30 16:07 UTC (permalink / raw)
To: Richard Guenther; +Cc: Jason Merrill, gcc-patches, Jakub Jelinek
[I'm going to respond to this piece-meal, to make sure I don't drop
anything. My apologies for the long thread, but I'm pretty sure it's in
everybody's kill file by now.]
> + /* Be as conservative as possible on variable offsets. */
> + if (TREE_OPERAND (exp, 2)
> + && !host_integerp (TREE_OPERAND (exp, 2), 1))
> + {
> + *byte_offset = TREE_OPERAND (exp, 2);
> + *maxbits = BITS_PER_UNIT;
> + return;
> + }
>
> shouldn't this be at the very beginning of the function? Because
> you've set *bit_offset to an offset that was _not_ calculated relative
Sure. I assume in this case, *bit_offset would be 0, right?
* Re: [C++0x] contiguous bitfields race implementation
2011-08-29 12:54 ` Richard Guenther
2011-08-30 16:07 ` Aldy Hernandez
@ 2011-08-30 16:53 ` Aldy Hernandez
2011-08-31 8:55 ` Richard Guenther
2011-08-30 21:33 ` Aldy Hernandez
2011-09-01 14:53 ` Aldy Hernandez
3 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-08-30 16:53 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches
> *bit_offset is supposed to be relative to *byte_offset then it should
> be easy to calculate it without another get_inner_reference.
Since, as you suggested, we will terminate early on variable length
offsets, we can assume both DECL_FIELD_OFFSET and DECL_FIELD_BIT_OFFSET
will be constants by now. So, I assume we can calculate the bit offset
like this:
*bit_offset = (TREE_INT_CST_LOW (DECL_FIELD_OFFSET (fld))
* BITS_PER_UNIT
+ TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (fld)))
- (TREE_INT_CST_LOW (DECL_FIELD_OFFSET (bitregion_start))
* BITS_PER_UNIT
+ TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (bitregion_start)));
(Yes, I know we can factor out the BITS_PER_UNIT and only do one
multiplication, it's just easier to read this way.)
Is this what you had in mind?
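[Editor's note: the arithmetic above reduces to "absolute bit position of the field minus absolute bit position of the region start". A standalone mirror of it, with plain integers and hypothetical offsets in place of GCC's tree accessors:]

```c
/* Plain-integer mirror of the proposed *bit_offset computation, with
   ordinary longs standing in for DECL_FIELD_OFFSET /
   DECL_FIELD_BIT_OFFSET.  All offsets here are hypothetical. */
long
bit_offset_from_region_start (long fld_byte_off, long fld_bit_off,
                              long region_byte_off, long region_bit_off)
{
  return (fld_byte_off * 8 + fld_bit_off)
         - (region_byte_off * 8 + region_bit_off);
}
```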
* Re: [C++0x] contiguous bitfields race implementation
2011-08-29 12:54 ` Richard Guenther
2011-08-30 16:07 ` Aldy Hernandez
2011-08-30 16:53 ` Aldy Hernandez
@ 2011-08-30 21:33 ` Aldy Hernandez
2011-08-31 8:55 ` Richard Guenther
2011-09-01 14:53 ` Aldy Hernandez
3 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-08-30 21:33 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches
> Btw, *byte_offset is still not relative to the containing object as
> documented, but relative to the base object of the exp reference
> tree (thus, to a in a.i.j.k.l instead of to a.i.j.k). If it were supposed
> to be relative to a.i.j.k get_inner_reference would be not needed
> either. Can you clarify what "containing object" means in the
> overall comment please?
I'm thoroughly confused here. Originally I had "inner decl", then we
changed the nomenclature to "containing object", and now there's this
"innermost reference".
What I mean to say is the "a" in a.i.j.k.l. How would you like me to
call that? The innermost reference? The inner decl? Would this
comment be acceptable:
Given a COMPONENT_REF, this function calculates the byte offset
from the innermost reference ("a" in a.i.j.k.l) to the start of the
contiguous bit region containing the field in question.
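[Editor's note: a standalone illustration of "from the base object", using hypothetical nested types rather than anything in the patch: the byte offset of a.i.j.k.l measured from "a" is the sum of the per-level offsets.]

```c
#include <stddef.h>

/* Hypothetical nesting standing in for the a.i.j.k.l example. */
struct Wl { int l; };
struct Wk { char c0; struct Wl k; };
struct Wj { short s0; struct Wk j; };
struct Wa { struct Wj i; };

/* Byte offset of i.j.k.l measured from the base object... */
size_t
offset_from_base (void)
{
  return offsetof (struct Wa, i.j.k.l);
}

/* ...equals the sum of the per-level (immediately containing struct)
   offsets, which is the distinction being discussed. */
size_t
offset_summed (void)
{
  return offsetof (struct Wa, i) + offsetof (struct Wj, j)
         + offsetof (struct Wk, k) + offsetof (struct Wl, l);
}
```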
>
> If it is really relative to the innermost reference of exp you can
> "CSE" the offset of TREE_OPERAND (exp, 0) and do relative
> adjustments for all the other get_inner_reference calls. For
> example the
>
> + /* If we found the end of the bit field sequence, include the
> + padding up to the next field... */
> if (fld)
> {
> ...
> + /* Calculate bitpos and offset of the next field. */
> + get_inner_reference (build3 (COMPONENT_REF,
> + TREE_TYPE (exp),
> + TREE_OPERAND (exp, 0),
> + fld, NULL_TREE),
> + &tbitsize, &end_bitpos, &end_offset,
> + &tmode, &tunsignedp, &tvolatilep, true);
>
> case is not correct anyway, fld may have variable position
> (non-INTEGER_CST DECL_FIELD_OFFSET), you can't
> assume
Innermost here means "a" in a.i.j.k.l? If so, this is what we're
currently doing, *byte_offset is the start of the bit region, and
*bit_offset is the offset from that.
First, I thought we couldn't get a variable position here because we are
now handling that case at the beginning of the function with:
/* Be as conservative as possible on variable offsets. */
if (TREE_OPERAND (exp, 2)
&& !host_integerp (TREE_OPERAND (exp, 2), 1))
{
*byte_offset = TREE_OPERAND (exp, 2);
*maxbits = BITS_PER_UNIT;
*bit_offset = 0;
return;
}
And even if we do get a variable position, I have so far been able to
get away with this...
>
> + *maxbits = TREE_INT_CST_LOW (maxbits_tree);
>
> this thus.
...because the call to fold_build2 immediately preceding this will fold
away the variable offset.
Is what you want, that we call get_inner_reference once, and then use
DECL_FIELD_OFFSET+DECL_FIELD_BIT_OFFSET to calculate any subsequent bit
offset? I found this to be quite tricky with padding, and such, but am
willing to give it a whirl again.
However, could I beg you to reconsider this, and get something working
first, only later concentrating on removing the get_inner_reference()
calls, and performing any other tweaks/optimizations?
Aldy
* Re: [C++0x] contiguous bitfields race implementation
2011-08-30 16:07 ` Aldy Hernandez
@ 2011-08-31 8:38 ` Richard Guenther
2011-08-31 13:56 ` Richard Guenther
0 siblings, 1 reply; 81+ messages in thread
From: Richard Guenther @ 2011-08-31 8:38 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jason Merrill, gcc-patches, Jakub Jelinek
On Tue, Aug 30, 2011 at 5:01 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
> [I'm going to respond to this piece-meal, to make sure I don't drop
> anything. My apologies for the long thread, but I'm pretty sure it's in
> everybody's kill file by now.]
>
>> + /* Be as conservative as possible on variable offsets. */
>> + if (TREE_OPERAND (exp, 2)
>> + && !host_integerp (TREE_OPERAND (exp, 2), 1))
>> + {
>> + *byte_offset = TREE_OPERAND (exp, 2);
>> + *maxbits = BITS_PER_UNIT;
>> + return;
>> + }
>>
>> shouldn't this be at the very beginning of the function? Because
>> you've set *bit_offset to an offset that was _not_ calculated relative
>
> Sure. I assume in this case, *bit_offset would be 0, right?
It would be DECL_FIELD_BIT_OFFSET of that field. Oh, and
*byte_offset would be
*byte_offset = size_binop (MULT_EXPR, TREE_OPERAND (exp, 2),
size_int (DECL_OFFSET_ALIGN
(field) / BITS_PER_UNIT));
see expr.c:component_ref_field_offset () (which you conveniently
could use here).
Note that both TREE_OPERAND (exp, 2) and component_ref_field_offset
return offsets relative to the immediate containing struct type, not
relative to the base object like get_inner_reference does ...
(where it is still unclear to me what we are supposed to return from this
function ...)
Thus, conservative would be using get_inner_reference here, if the
offset is supposed to be relative to the base object.
Richard.
* Re: [C++0x] contiguous bitfields race implementation
2011-08-30 21:33 ` Aldy Hernandez
@ 2011-08-31 8:55 ` Richard Guenther
2011-08-31 20:37 ` Aldy Hernandez
0 siblings, 1 reply; 81+ messages in thread
From: Richard Guenther @ 2011-08-31 8:55 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: gcc-patches
On Tue, Aug 30, 2011 at 8:13 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>> Btw, *byte_offset is still not relative to the containing object as
>> documented, but relative to the base object of the exp reference
>> tree (thus, to a in a.i.j.k.l instead of to a.i.j.k). If it were supposed
>> to be relative to a.i.j.k get_inner_reference would be not needed
>> either. Can you clarify what "containing object" means in the
>> overall comment please?
>
> I'm thoroughly confused here. Originally I had "inner decl", then we
> changed the nomenclature to "containing object", and now there's this
> "innermost reference".
Well, the nomenclature is not so important once the function only
computes one variant. It is only because it doesn't right now that I am
confused by the nomenclature, trying to figure out what it is supposed
to be relative to ...
The containing object of a component-ref is TREE_OPERAND (exp, 0)
to me. The base object would be get_base_object (exp), which is
eventually what we want, right?
> What I mean to say is the "a" in a.i.j.k.l. How would you like me to call
> that? The innermost reference? The inner decl? Would this comment be
> acceptable:
>
> Given a COMPONENT_REF, this function calculates the byte offset
> from the innermost reference ("a" in a.i.j.k.l) to the start of the
> contiguous bit region containing the field in question.
from the base object ("a" in a.i.j.k.l) ...
would be fine with me.
>>
>> If it is really relative to the innermost reference of exp you can
>> "CSE" the offset of TREE_OPERAND (exp, 0) and do relative
>> adjustments for all the other get_inner_reference calls. For
>> example the
>>
>> + /* If we found the end of the bit field sequence, include the
>> + padding up to the next field... */
>> if (fld)
>> {
>> ...
>> + /* Calculate bitpos and offset of the next field. */
>> + get_inner_reference (build3 (COMPONENT_REF,
>> + TREE_TYPE (exp),
>> + TREE_OPERAND (exp, 0),
>> + fld, NULL_TREE),
>> + &tbitsize,&end_bitpos,&end_offset,
>> + &tmode,&tunsignedp,&tvolatilep, true);
>>
>> case is not correct anyway, fld may have variable position
>> (non-INTEGER_CST DECL_FIELD_OFFSET), you can't
>> assume
>
> Innermost here means "a" in a.i.j.k.l? If so, this is what we're currently
> doing, *byte_offset is the start of the bit region, and *bit_offset is the
> offset from that.
>
> First, I thought we couldn't get a variable position here because we are now
> handling that case at the beginning of the function with:
>
> /* Be as conservative as possible on variable offsets. */
> if (TREE_OPERAND (exp, 2)
> && !host_integerp (TREE_OPERAND (exp, 2), 1))
> {
> *byte_offset = TREE_OPERAND (exp, 2);
> *maxbits = BITS_PER_UNIT;
> *bit_offset = 0;
> return;
> }
>
> And even if we do get a variable position, I have so far been able to get
> away with this...
Did you test Ada and enable the C++ memory model? ;)
Btw, even if the bitfield we access (and thus the whole region) is at a
constant offset, the field _following_ the bitregion (the one you query
above with get_inner_reference) can be at variable offset. I suggest
to simply not include any padding in that case (which would be,
TREE_CODE (DECL_FIELD_OFFSET (fld)) != INTEGER_CST).
>>
>> + *maxbits = TREE_INT_CST_LOW (maxbits_tree);
>>
>> this thus.
>
> ...because the call to fold_build2 immediately preceding this will fold away
> the variable offset.
You hope so ;)
> Is what you want, that we call get_inner_reference once, and then use
> DECL_FIELD_OFFSET+DECL_FIELD_BIT_OFFSET to calculate any subsequent bit
> offset? I found this to be quite tricky with padding, and such, but am
> willing to give it a whirl again.
Yes.
> However, could I beg you to reconsider this, and get something working
> first, only later concentrating on removing the get_inner_reference() calls,
> and performing any other tweaks/optimizations?
Sure, it's fine to tweak this in a followup.
Thanks,
Richard.
> Aldy
>
* Re: [C++0x] contiguous bitfields race implementation
2011-08-30 16:53 ` Aldy Hernandez
@ 2011-08-31 8:55 ` Richard Guenther
2011-08-31 17:24 ` Aldy Hernandez
0 siblings, 1 reply; 81+ messages in thread
From: Richard Guenther @ 2011-08-31 8:55 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: gcc-patches
On Tue, Aug 30, 2011 at 6:15 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>> *bit_offset is supposed to be relative to *byte_offset then it should
>> be easy to calculate it without another get_inner_reference.
>
> Since, as you suggested, we will terminate early on variable length offsets,
> we can assume both DECL_FIELD_OFFSET and DECL_FIELD_BIT_OFFSET will be
> constants by now.
Yes.
> So, I assume we can calculate the bit offset like this:
>
> *bit_offset = (TREE_INT_CST_LOW (DECL_FIELD_OFFSET (fld))
> * BITS_PER_UNIT
> + TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (fld)))
> - (TREE_INT_CST_LOW (DECL_FIELD_OFFSET (bitregion_start))
> * BITS_PER_UNIT
> + TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (bitregion_start)));
>
> (Yes, I know we can factor out the BITS_PER_UNIT and only do one
> multiplication, it's just easier to read this way.)
>
> Is this what you had in mind?
Yes. For convenience I'd simply use double_ints for the intermediate
calculations.
Richard.
* Re: [C++0x] contiguous bitfields race implementation
2011-08-31 8:38 ` Richard Guenther
@ 2011-08-31 13:56 ` Richard Guenther
2011-08-31 20:37 ` Aldy Hernandez
0 siblings, 1 reply; 81+ messages in thread
From: Richard Guenther @ 2011-08-31 13:56 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Jason Merrill, gcc-patches, Jakub Jelinek
On Wed, Aug 31, 2011 at 9:45 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Tue, Aug 30, 2011 at 5:01 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>> [I'm going to respond to this piecemeal, to make sure I don't drop
>> anything. My apologies for the long thread, but I'm pretty sure it's in
>> everybody's kill file by now.]
>>
>>> + /* Be as conservative as possible on variable offsets. */
>>> + if (TREE_OPERAND (exp, 2)
>>> +&& !host_integerp (TREE_OPERAND (exp, 2), 1))
>>> + {
>>> + *byte_offset = TREE_OPERAND (exp, 2);
>>> + *maxbits = BITS_PER_UNIT;
>>> + return;
>>> + }
>>>
>>> shouldn't this be at the very beginning of the function? Because
>>> you've set *bit_offset to an offset that was _not_ calculated relative
>>
>> Sure. I assume in this case, *bit_offset would be 0, right?
>
> It would be DECL_FIELD_BIT_OFFSET of that field. Oh, and
> *byte_offset would be
>
> *byte_offset = size_binop (MULT_EXPR, TREE_OPERAND (exp, 2),
> size_int (DECL_OFFSET_ALIGN
> (field) / BITS_PER_UNIT));
>
> see expr.c:component_ref_field_offset () (which you conveniently
> could use here).
>
> Note that both TREE_OPERAND (exp, 2) and component_ref_field_offset
> return offsets relative to the immediate containing struct type, not
> relative to the base object like get_inner_reference does ...
> (where it is still unclear to me what we are supposed to return from this
> function ...)
>
> Thus, conservative would be using get_inner_reference here, if the
> offset is supposed to be relative to the base object.
That said, shouldn't *maxbits at least make sure to cover the field itself?
> Richard.
>
* Re: [C++0x] contiguous bitfields race implementation
2011-08-31 8:55 ` Richard Guenther
@ 2011-08-31 17:24 ` Aldy Hernandez
0 siblings, 0 replies; 81+ messages in thread
From: Aldy Hernandez @ 2011-08-31 17:24 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches
>> *bit_offset = (TREE_INT_CST_LOW (DECL_FIELD_OFFSET (fld))
>> * BITS_PER_UNIT
>> + TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (fld)))
>> - (TREE_INT_CST_LOW (DECL_FIELD_OFFSET (bitregion_start))
>> * BITS_PER_UNIT
>> + TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (bitregion_start)));
>>
>> (Yes, I know we can factor out the BITS_PER_UNIT and only do one
>> multiplication, it's just easier to read this way.)
>>
>> Is this what you had in mind?
>
> Yes. For convenience I'd simply use double_ints for the intermediate
> calculations.
Ok, let's leave it like this for now. I have added a FIXME note, and we
can optimize this after we get everything working.
* Re: [C++0x] contiguous bitfields race implementation
2011-08-31 8:55 ` Richard Guenther
@ 2011-08-31 20:37 ` Aldy Hernandez
2011-09-01 7:02 ` Richard Guenther
0 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-08-31 20:37 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches
> Did you test Ada and enable the C++ memory model? ;)
See my earlier comment on Ada. Who would ever use the C++ memory model
on Ada?
> Btw, even if the bitfield we access (and thus the whole region) is at a
> constant offset, the field _following_ the bitregion (the one you query
> above with get_inner_reference) can be at variable offset. I suggest
> to simply not include any padding in that case (which would be,
> TREE_CODE (DECL_FIELD_OFFSET (fld)) != INTEGER_CST).
I still have not found a place where we get a variable offset here
(after folding the computation). How about we put a gcc_assert() along
with a big fat comment with your above suggestion when we encounter
this. Or can you give me an example of this case?
>> Is what you want, that we call get_inner_reference once, and then use
>> DECL_FIELD_OFFSET+DECL_FIELD_BIT_OFFSET to calculate any subsequent bit
>> offset? I found this to be quite tricky with padding, and such, but am
>> willing to give it a whirl again.
>
> Yes.
I have added a comment to this effect, and will address it along with
the get_inner_reference() removal you have suggested as a followup.
* Re: [C++0x] contiguous bitfields race implementation
2011-08-31 13:56 ` Richard Guenther
@ 2011-08-31 20:37 ` Aldy Hernandez
2011-09-01 6:58 ` Richard Guenther
0 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-08-31 20:37 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches
>>> Sure. I assume in this case, *bit_offset would be 0, right?
>>
>> It would be DECL_FIELD_BIT_OFFSET of that field. Oh, and
>> *byte_offset would be
>>
>> *byte_offset = size_binop (MULT_EXPR, TREE_OPERAND (exp, 2),
>> size_int (DECL_OFFSET_ALIGN
>> (field) / BITS_PER_UNIT));
>>
>> see expr.c:component_ref_field_offset () (which you conveniently
>> could use here).
>>
>> Note that both TREE_OPERAND (exp, 2) and component_ref_field_offset
>> return offsets relative to the immediate containing struct type, not
>> relative to the base object like get_inner_reference does ...
>> (where it is still unclear to me what we are supposed to return from this
>> function ...)
Ok, I see where your confusion lies. The function is supposed to return
a byte offset from the base object, none of this containing object or
immediate struct, or whatever. Base object, as in "a" in a.i.j.k, as in
what you get back from get_base_address().
Originally everything was calculated with get_inner_reference(), which
is relative to the base object, but now we have this hodge podge of
get_inner_reference() calls with ad-hoc calculations and optimizations.
Gladly, we've agreed to use get_inner_reference() and optimize at a
later time.
So... base object throughout, anything else is a mistake on my part.
BTW, this whole variable length offset I still can't trigger. I know
you want to cater to Ada, but does it even make sense to enable the C++
memory model in Ada? Who would ever do this? Be that as it may, I'll
humor you and handle it.
>> Thus, conservative would be using get_inner_reference here, if the
>> offset is supposed to be relative to the base object.
>
> That said, shouldn't *maxbits at least make sure to cover the field itself?
Is this what you want?
/* Be as conservative as possible on variable offsets. */
if (TREE_OPERAND (exp, 2)
&& !host_integerp (TREE_OPERAND (exp, 2), 1))
{
get_inner_reference (build3 (COMPONENT_REF,
TREE_TYPE (exp),
TREE_OPERAND (exp, 0),
field, NULL_TREE),
&tbitsize, &start_bitpos, &start_offset,
&tmode, &tunsignedp, &tvolatilep, true);
*byte_offset = start_offset ? start_offset : size_zero_node;
*bit_offset = start_bitpos;
*maxbits = tbitsize;
return;
}
* Re: [C++0x] contiguous bitfields race implementation
2011-08-31 20:37 ` Aldy Hernandez
@ 2011-09-01 6:58 ` Richard Guenther
0 siblings, 0 replies; 81+ messages in thread
From: Richard Guenther @ 2011-09-01 6:58 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: gcc-patches
On Wed, Aug 31, 2011 at 6:53 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>>>> Sure. I assume in this case, *bit_offset would be 0, right?
>>>
>>> It would be DECL_FIELD_BIT_OFFSET of that field. Oh, and
>>> *byte_offset would be
>>>
>>> *byte_offset = size_binop (MULT_EXPR, TREE_OPERAND (exp, 2),
>>> size_int (DECL_OFFSET_ALIGN
>>> (field) / BITS_PER_UNIT));
>>>
>>> see expr.c:component_ref_field_offset () (which you conveniently
>>> could use here).
>>>
>>> Note that both TREE_OPERAND (exp, 2) and component_ref_field_offset
>>> return offsets relative to the immediate containing struct type, not
>>> relative to the base object like get_inner_reference does ...
>>> (where it is still unclear to me what we are supposed to return from this
>>> function ...)
>
> Ok, I see where your confusion lies. The function is supposed to return a
> byte offset from the base object, none of this containing object or
> immediate struct, or whatever. Base object, as in "a" in a.i.j.k, as in
> what you get back from get_base_address().
>
> Originally everything was calculated with get_inner_reference(), which is
> relative to the base object, but now we have this hodge podge of
> get_inner_reference() calls with ad-hoc calculations and optimizations.
> Gladly, we've agreed to use get_inner_reference() and optimize at a later
> time.
>
> So... base object throughout, anything else is a mistake on my part.
>
> BTW, this whole variable length offset I still can't trigger. I know you
> want to cater to Ada, but does it even make sense to enable the C++ memory
> model in Ada? Who would ever do this? Be that as it may, I'll humor you
> and handle it.
>
>>> Thus, conservative would be using get_inner_reference here, if the
>>> offset is supposed to be relative to the base object.
>>
>> That said, shouldn't *maxbits at least make sure to cover the field
>> itself?
>
> Is this what you want?
>
> /* Be as conservative as possible on variable offsets. */
> if (TREE_OPERAND (exp, 2)
> && !host_integerp (TREE_OPERAND (exp, 2), 1))
> {
> get_inner_reference (build3 (COMPONENT_REF,
> TREE_TYPE (exp),
> TREE_OPERAND (exp, 0),
> field, NULL_TREE),
> &tbitsize, &start_bitpos, &start_offset,
> &tmode, &tunsignedp, &tvolatilep, true);
>
> *byte_offset = start_offset ? start_offset : size_zero_node;
> *bit_offset = start_bitpos;
> *maxbits = tbitsize;
> return;
> }
Yes, exactly.
Richard.
* Re: [C++0x] contiguous bitfields race implementation
2011-08-31 20:37 ` Aldy Hernandez
@ 2011-09-01 7:02 ` Richard Guenther
2011-09-01 7:05 ` Arnaud Charlet
2011-09-01 14:16 ` Aldy Hernandez
0 siblings, 2 replies; 81+ messages in thread
From: Richard Guenther @ 2011-09-01 7:02 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: gcc-patches
On Wed, Aug 31, 2011 at 8:09 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>> Did you test Ada and enable the C++ memory model? ;)
>
> See my earlier comment on Ada. Who would ever use the C++ memory model on
> Ada?
People interoperating Ada with C++. Our bug triager Zdenek who
figures out the --param?
>> Btw, even if the bitfield we access (and thus the whole region) is at a
>> constant offset, the field _following_ the bitregion (the one you query
>> above with get_inner_reference) can be at variable offset. I suggest
>> to simply not include any padding in that case (which would be,
>> TREE_CODE (DECL_FIELD_OFFSET (fld)) != INTEGER_CST).
>
> I still have not found a place where we get a variable offset here (after
> folding the computation). How about we put a gcc_assert() along with a big
> fat comment with your above suggestion when we encounter this. Or can you
> give me an example of this case?
My point is, the middle-end infrastructure makes it possible for this
case to appear, and it seems to be easy to handle conservatively.
There isn't a need to wait for users to run into an ICE or an assert we put
there IMHO. If I'd be fluent in Ada I'd write you a testcase, but I ain't.
>>> Is what you want, that we call get_inner_reference once, and then use
>>> DECL_FIELD_OFFSET+DECL_FIELD_BIT_OFFSET to calculate any subsequent bit
>>> offset? I found this to be quite tricky with padding, and such, but am
>>> willing to give it a whirl again.
>>
>> Yes.
>
> I have added a comment to this effect, and will address it along with the
> get_inner_reference() removal you have suggested as a followup.
Thanks,
Richard.
* Re: [C++0x] contiguous bitfields race implementation
2011-09-01 7:02 ` Richard Guenther
@ 2011-09-01 7:05 ` Arnaud Charlet
2011-09-01 14:16 ` Aldy Hernandez
1 sibling, 0 replies; 81+ messages in thread
From: Arnaud Charlet @ 2011-09-01 7:05 UTC (permalink / raw)
To: Richard Guenther; +Cc: Aldy Hernandez, gcc-patches, Eric Botcazou
> >> Did you test Ada and enable the C++ memory model? ;)
> >
> > See my earlier comment on Ada. Who would ever use the C++ memory model on
> > Ada?
>
> People interoperating Ada with C++. Our bug triager Zdenek who
> figures out the --param?
Right, that's one example. There are also actually some similarities between
the C++ memory model and the Ada language, so it's not so inconceivable
that Ada would like to take advantage of some of these capabilities.
Arno
* Re: [C++0x] contiguous bitfields race implementation
2011-09-01 7:02 ` Richard Guenther
2011-09-01 7:05 ` Arnaud Charlet
@ 2011-09-01 14:16 ` Aldy Hernandez
2011-09-02 8:48 ` Richard Guenther
1 sibling, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-09-01 14:16 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches
> My point is, the middle-end infrastructure makes it possible for this
> case to appear, and it seems to be easy to handle conservatively.
> There isn't a need to wait for users to run into an ICE or an assert we put
> there IMHO. If I'd be fluent in Ada I'd write you a testcase, but I ain't.
Ughh, this is getting messier.
Ok, I propose keeping track of the field prior (lastfld), calling
get_inner_reference() and adding DECL_SIZE (or tbitsize if you prefer)
to calculate maxbits without the padding.
Notice the comment at the top. We can get rid of yet another call to
get_inner_reference later.
Is this what you had in mind?
BTW, we don't need to round up to the next byte here, do we?
Thanks.
Aldy
/* If we found the end of the bit field sequence, include the
padding up to the next field... */
if (fld)
{
tree end_offset, t;
HOST_WIDE_INT end_bitpos;
/* FIXME: Only call get_inner_reference once (at the beginning
of the bit region), and use
DECL_FIELD_OFFSET+DECL_FIELD_BIT_OFFSET throughout to
calculate any subsequent bit offset. */
/* Even if the bitfield we access (and thus the whole region) is
at a constant offset, the field _following_ the bitregion can
be at variable offset. In this case, do not include any
padding. This is mostly for Ada. */
if (TREE_CODE (DECL_FIELD_OFFSET (fld)) != INTEGER_CST)
{
get_inner_reference (build3 (COMPONENT_REF,
TREE_TYPE (exp),
TREE_OPERAND (exp, 0),
lastfld, NULL_TREE),
&tbitsize, &end_bitpos, &end_offset,
&tmode, &tunsignedp, &tvolatilep, true);
/* Calculate the size of the bit region up to the last
bitfield, excluding any subsequent padding.
t = (end_byte_off - start_byte_offset) * 8 + end_bit_off */
end_offset = end_offset ? end_offset : size_zero_node;
t = fold_build2 (PLUS_EXPR, size_type_node,
fold_build2 (MULT_EXPR, size_type_node,
fold_build2 (MINUS_EXPR, size_type_node,
end_offset,
*byte_offset),
build_int_cst (size_type_node,
BITS_PER_UNIT)),
build_int_cst (size_type_node,
end_bitpos));
/* Add the bitsize of the last field. */
t = fold_build2 (PLUS_EXPR, size_type_node,
t, DECL_SIZE (lastfld));
*maxbits = tree_low_cst (t, 1);
return;
}
...
...
...
* Re: [C++0x] contiguous bitfields race implementation
2011-08-29 12:54 ` Richard Guenther
` (2 preceding siblings ...)
2011-08-30 21:33 ` Aldy Hernandez
@ 2011-09-01 14:53 ` Aldy Hernandez
2011-09-01 15:01 ` Jason Merrill
3 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-09-01 14:53 UTC (permalink / raw)
To: Richard Guenther; +Cc: Jason Merrill, gcc-patches
[Jason, can you pontificate on tail-padding and the upcoming C++
standard with regards to bitfields?]
> so you weren't convinced about my worries about tail-padding re-use?
To answer your question, I believe we can't touch past the last field
(into the padding) if the subsequent record will be packed into the
first's padding.
struct A {
int a : 17;
};
struct B : public A {
char c;
};
So here, if <c> gets packed into the tail-padding of A, we can't touch
the padding of A when storing into <a>. These are different structures,
and I assume would be treated as nested structures, which are distinct
memory locations.
Is there a way of distinguishing this particular variant (possible
tail-packing), or will we have to disallow storing into the record tail
padding altogether? That would seriously suck.
Aldy
* Re: [C++0x] contiguous bitfields race implementation
2011-09-01 14:53 ` Aldy Hernandez
@ 2011-09-01 15:01 ` Jason Merrill
2011-09-01 15:10 ` Aldy Hernandez
0 siblings, 1 reply; 81+ messages in thread
From: Jason Merrill @ 2011-09-01 15:01 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Richard Guenther, gcc-patches
On 09/01/2011 10:52 AM, Aldy Hernandez wrote:
> To answer your question, I believe we can't touch past the last field
> (into the padding) if the subsequent record will be packed into the
> first's padding.
Right.
> struct A {
> int a : 17;
> };
> struct B : public A {
> char c;
> };
>
> So here, if <c> gets packed into the tail-padding of A, we can't touch
> the padding of A when storing into <a>.
But that doesn't apply to this testcase because A is a POD class, so we
don't mess with its tail padding.
> Is there a way of distinguishing this particular variant (possible
> tail-packing), or will we have to disallow storing into the record tail
> padding altogether? That would seriously suck.
Basically you can only touch the size of the CLASSTYPE_AS_BASE variant.
For many classes this will be the same as the size of the class itself.
Jason
* Re: [C++0x] contiguous bitfields race implementation
2011-09-01 15:01 ` Jason Merrill
@ 2011-09-01 15:10 ` Aldy Hernandez
2011-09-01 15:20 ` Jason Merrill
0 siblings, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-09-01 15:10 UTC (permalink / raw)
To: Jason Merrill; +Cc: Richard Guenther, gcc-patches
>> Is there a way of distinguishing this particular variant (possible
>> tail-packing), or will we have to disallow storing into the record tail
>> padding altogether? That would seriously suck.
>
> Basically you can only touch the size of the CLASSTYPE_AS_BASE variant.
> For many classes this will be the same as the size of the class itself.
All this code is in the middle end, so we're language agnostic.
What do we need here, a hook to query the front-end, or is it too late?
Or will we have to play it conservative and never touch the padding
(regardless of language)?
Aldy
* Re: [C++0x] contiguous bitfields race implementation
2011-09-01 15:10 ` Aldy Hernandez
@ 2011-09-01 15:20 ` Jason Merrill
2011-09-02 8:53 ` Richard Guenther
0 siblings, 1 reply; 81+ messages in thread
From: Jason Merrill @ 2011-09-01 15:20 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: Richard Guenther, gcc-patches
On 09/01/2011 11:10 AM, Aldy Hernandez wrote:
>> Basically you can only touch the size of the CLASSTYPE_AS_BASE variant.
>> For many classes this will be the same as the size of the class itself.
>
> All this code is in the middle end, so we're language agnostic.
>
> What do we need here, a hook to query the front-end, or is it too late?
> Or will we have to play it conservative and never touch the padding
> (regardless of language)?
I think it would make sense to expose this information to the back end
somehow. A hook would do the trick: call it type_data_size or
type_min_size or some such, which in the C++ front end would return
TYPE_SIZE (CLASSTYPE_AS_BASE (t)) for classes or just TYPE_SIZE for
other types.
Jason
* Re: [C++0x] contiguous bitfields race implementation
2011-09-01 14:16 ` Aldy Hernandez
@ 2011-09-02 8:48 ` Richard Guenther
2011-09-02 12:49 ` Aldy Hernandez
2011-09-02 20:34 ` Jeff Law
0 siblings, 2 replies; 81+ messages in thread
From: Richard Guenther @ 2011-09-02 8:48 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: gcc-patches
On Thu, Sep 1, 2011 at 4:16 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>> My point is, the middle-end infrastructure makes it possible for this
>> case to appear, and it seems to be easy to handle conservatively.
>> There isn't a need to wait for users to run into an ICE or an assert we
>> put
>> there IMHO. If I'd be fluent in Ada I'd write you a testcase, but I
>> ain't.
>
> Ughh, this is getting messier.
>
> Ok, I propose keeping track of the field prior (lastfld), calling
> get_inner_reference() and adding DECL_SIZE (or tbitsize if you prefer) to
> calculate maxbits without the padding.
>
> Notice the comment at the top. We can get rid of yet another call to
> get_inner_reference later.
>
> Is this what you had in mind?
That could work also for the tail-padding re-use case, yes. Note that
DECL_SIZE of the field is just the last field's bit-precision, so ..
> BTW, we don't need to round up to the next byte here, do we?
.. rounding up to the next byte cannot hurt (depending on what the
caller will do with that value).
Note that with all this mess I'll re-iterate some of my initial thoughts.
1) why not do this C++ (or C) specific stuff in the frontends, maybe
at gimplifying/genericization time? That way you wouldn't need to
worry about middle-end features but you could rely solely on what
C/C++ permit. It is, after all, C++ _frontend_ semantics that we
enforce here, in the middle-end, which looks out-of-place.
2) all this information we try to re-construct here is sort-of readily
available when we layout the record (thus, from layout_type and
friends). We should really, really try to preserve it there, rather
than jumping through hoops here (ideally we'd have an
(unused?) FIELD_DECL that covers the whole "bitfield group"
followed by the individual FIELD_DECLS for the bits (yep, they'd
overlap that group FIELD_DECL), and they would refer back to
that group FIELD_DECL)
Is the C++ memory model stuff going to be "ready" for 4.7?
Thanks,
Richard.
> Thanks.
> Aldy
>
> /* If we found the end of the bit field sequence, include the
> padding up to the next field... */
> if (fld)
> {
> tree end_offset, t;
> HOST_WIDE_INT end_bitpos;
>
> /* FIXME: Only call get_inner_reference once (at the beginning
> of the bit region), and use
> DECL_FIELD_OFFSET+DECL_FIELD_BIT_OFFSET throughout to
> calculate any subsequent bit offset. */
>
> /* Even if the bitfield we access (and thus the whole region) is
> at a constant offset, the field _following_ the bitregion can
> be at variable offset. In this case, do not include any
> padding. This is mostly for Ada. */
> if (TREE_CODE (DECL_FIELD_OFFSET (fld)) != INTEGER_CST)
> {
> get_inner_reference (build3 (COMPONENT_REF,
> TREE_TYPE (exp),
> TREE_OPERAND (exp, 0),
> lastfld, NULL_TREE),
> &tbitsize, &end_bitpos, &end_offset,
> &tmode, &tunsignedp, &tvolatilep, true);
>
> /* Calculate the size of the bit region up to the last
> bitfield, excluding any subsequent padding.
>
> t = (end_byte_off - start_byte_offset) * 8 + end_bit_off */
> end_offset = end_offset ? end_offset : size_zero_node;
> t = fold_build2 (PLUS_EXPR, size_type_node,
> fold_build2 (MULT_EXPR, size_type_node,
> fold_build2 (MINUS_EXPR,
> size_type_node,
> end_offset,
> *byte_offset),
> build_int_cst (size_type_node,
> BITS_PER_UNIT)),
> build_int_cst (size_type_node,
> end_bitpos));
> /* Add the bitsize of the last field. */
> t = fold_build2 (PLUS_EXPR, size_type_node,
> t, DECL_SIZE (lastfld));
>
> *maxbits = tree_low_cst (t, 1);
> return;
> }
> ...
> ...
> ...
>
* Re: [C++0x] contiguous bitfields race implementation
2011-09-01 15:20 ` Jason Merrill
@ 2011-09-02 8:53 ` Richard Guenther
2011-09-02 14:10 ` Jason Merrill
0 siblings, 1 reply; 81+ messages in thread
From: Richard Guenther @ 2011-09-02 8:53 UTC (permalink / raw)
To: Jason Merrill; +Cc: Aldy Hernandez, gcc-patches
On Thu, Sep 1, 2011 at 5:19 PM, Jason Merrill <jason@redhat.com> wrote:
> On 09/01/2011 11:10 AM, Aldy Hernandez wrote:
>>>
>>> Basically you can only touch the size of the CLASSTYPE_AS_BASE variant.
>>> For many classes this will be the same as the size of the class itself.
>>
>> All this code is in the middle end, so we're language agnostic.
>>
>> What do we need here, a hook to query the front-end, or is it too late?
>> Or will we have to play it conservative and never touch the padding
>> (regardless of language)?
>
> I think it would make sense to expose this information to the back end
> somehow. A hook would do the trick: call it type_data_size or type_min_size
> or some such, which in the C++ front end would return TYPE_SIZE
> (CLASSTYPE_AS_BASE (t)) for classes or just TYPE_SIZE for other types.
That's too late to work with LTO, you'd need to store that information
permanently
somewhere.
Maybe move this whole C++ specific bitfield handling where it belongs,
namely to the C++ frontend?
I suggest we never re-use tail padding for now (I believe that if your
parent object is a COMPONENT_REF, thus x.parent.bitfield,
you can use the TYPE_SIZE vs. field-decl DECL_SIZE discrepancy
to decide whether the tail padding was reused, but please
double-check that ;)))
Richard.
> Jason
>
>
* Re: [C++0x] contiguous bitfields race implementation
2011-09-02 8:48 ` Richard Guenther
@ 2011-09-02 12:49 ` Aldy Hernandez
2011-09-02 13:05 ` Richard Guenther
2011-09-02 20:34 ` Jeff Law
1 sibling, 1 reply; 81+ messages in thread
From: Aldy Hernandez @ 2011-09-02 12:49 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches
> Note that with all this mess I'll re-iterate some of my initial thoughts.
> 1) why not do this C++ (or C) specific stuff in the frontends, maybe
> at gimplifying/genericization time? That way you wouldn't need to
> worry about middle-end features but you could rely solely on what
> C/C++ permit. It is, after all, C++ _frontend_ semantics that we
> enforce here, in the middle-end, which looks out-of-place.
The front-end, really? After all this going back and forth? After you
were all so worried about Ada, and now you're ditching it in favor of
handling only C++?
> Is the C++ memory model stuff going to be "ready" for 4.7?
No, not if you expect me rewrite things every day.
* Re: [C++0x] contiguous bitfields race implementation
2011-09-02 12:49 ` Aldy Hernandez
@ 2011-09-02 13:05 ` Richard Guenther
0 siblings, 0 replies; 81+ messages in thread
From: Richard Guenther @ 2011-09-02 13:05 UTC (permalink / raw)
To: Aldy Hernandez; +Cc: gcc-patches
On Fri, Sep 2, 2011 at 2:49 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>> Note that with all this mess I'll re-iterate some of my initial thoughts.
>> 1) why not do this C++ (or C) specific stuff in the frontends, maybe
>> at gimplifying/genericization time? That way you wouldn't need to
>> worry about middle-end features but you could rely solely on what
>> C/C++ permit. It is, after all, C++ _frontend_ semantics that we
>> enforce here, in the middle-end, which looks out-of-place.
>
> The front-end, really? After all this going back and forth?
Well, I'm fine with handling it in the middle-end if it's correct there.
> After you were
> all so worried about Ada, and now you're ditching it in favor of handling
> only C++?
I'm just showing you a possible solution for where you'd not need to
worry ;) Consider LTOing an Ada and a C++ module - you need to
enable the C++ memory model at link-time so it is in effect when we
process bit-fields. That will automatically enable it for the Ada pieces, too.
>> Is the C++ memory model stuff going to be "ready" for 4.7?
>
> No, not if you expect me to rewrite things every day.
I don't expect you to rewrite things every day.
Don't read every comment I make as a definite decision and order to
you. I am a mere mortal, too, and the bitfield thing is, I must admit,
still partially a mystery to myself (which is why I keep asking questions
instead of simply providing you with definite answers). After all I pushed
back my idea of lowering bitfield accesses somewhere on GIMPLE and
I'm not sure if I get back to it for 4.7. And I definitely would consider
2) for that work.
Btw, it would be nice if I weren't the only one reading your updated
patches :/ I'm just punching holes where I see them and hope I and
you learn something in that process.
Richard.
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [C++0x] contiguous bitfields race implementation
2011-09-02 8:53 ` Richard Guenther
@ 2011-09-02 14:10 ` Jason Merrill
2011-09-02 14:38 ` Richard Guenther
0 siblings, 1 reply; 81+ messages in thread
From: Jason Merrill @ 2011-09-02 14:10 UTC (permalink / raw)
To: Richard Guenther; +Cc: Aldy Hernandez, gcc-patches
On 09/02/2011 04:53 AM, Richard Guenther wrote:
> On Thu, Sep 1, 2011 at 5:19 PM, Jason Merrill<jason@redhat.com> wrote:
>> I think it would make sense to expose this information to the back end
>> somehow. A hook would do the trick: call it type_data_size or type_min_size
>> or some such, which in the C++ front end would return TYPE_SIZE
>> (CLASSTYPE_AS_BASE (t)) for classes or just TYPE_SIZE for other types.
>
> That's too late to work with LTO, you'd need to store that information
> permanently somewhere.
OK.
> Maybe move this whole C++ specific bitfield handling where it belongs,
> namely to the C++ frontend?
I don't think that is the way to go; C is adopting the same memory
model, and this is the only sane thing to do with bit-fields.
> I suggest to always not re-use tail padding for now (I believe if your
> parent object is a COMPONENT_REF, thus, x.parent.bitfield,
> you can use the TYPE_SIZE vs. field-decl DECL_SIZE discrepancy
> to decide about whether the tail-padding was reused, but please
> double-check that ;)))
But you don't always have a COMPONENT_REF; you still need to avoid
touching the tail padding when you just have a pointer to the type
because it might be a base sub-object.
I wonder what would break if C++ just set TYPE_SIZE to the as-base size?
Jason
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [C++0x] contiguous bitfields race implementation
2011-09-02 14:10 ` Jason Merrill
@ 2011-09-02 14:38 ` Richard Guenther
2011-09-07 18:12 ` Jason Merrill
0 siblings, 1 reply; 81+ messages in thread
From: Richard Guenther @ 2011-09-02 14:38 UTC (permalink / raw)
To: Jason Merrill; +Cc: Aldy Hernandez, gcc-patches
On Fri, Sep 2, 2011 at 4:10 PM, Jason Merrill <jason@redhat.com> wrote:
> On 09/02/2011 04:53 AM, Richard Guenther wrote:
>>
>> On Thu, Sep 1, 2011 at 5:19 PM, Jason Merrill<jason@redhat.com> wrote:
>>>
>>> I think it would make sense to expose this information to the back end
>>> somehow. A hook would do the trick: call it type_data_size or
>>> type_min_size
>>> or some such, which in the C++ front end would return TYPE_SIZE
>>> (CLASSTYPE_AS_BASE (t)) for classes or just TYPE_SIZE for other types.
>>
>> That's too late to work with LTO, you'd need to store that information
>> permanently somewhere.
>
> OK.
>
>> Maybe move this whole C++ specific bitfield handling where it belongs,
>> namely to the C++ frontend?
>
> I don't think that is the way to go; C is adopting the same memory model,
> and this is the only sane thing to do with bit-fields.
>
>> I suggest to always not re-use tail padding for now (I believe if your
>> parent object is a COMPONENT_REF, thus, x.parent.bitfield,
>> you can use the TYPE_SIZE vs. field-decl DECL_SIZE discrepancy
>> to decide about whether the tail-padding was reused, but please
>> double-check that ;)))
>
> But you don't always have a COMPONENT_REF; you still need to avoid touching
> the tail padding when you just have a pointer to the type because it might
> be a base sub-object.
>
> I wonder what would break if C++ just set TYPE_SIZE to the as-base size?
Good question. Probably argument passing, as the as-base size wouldn't
get a proper mode assigned from layout_type then(?) for small structs?
Maybe worth a try ...
Richard.
> Jason
>
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [C++0x] contiguous bitfields race implementation
2011-09-02 8:48 ` Richard Guenther
2011-09-02 12:49 ` Aldy Hernandez
@ 2011-09-02 20:34 ` Jeff Law
1 sibling, 0 replies; 81+ messages in thread
From: Jeff Law @ 2011-09-02 20:34 UTC (permalink / raw)
To: Richard Guenther; +Cc: Aldy Hernandez, gcc-patches
On 09/02/11 02:48, Richard Guenther wrote:
>
> Note that with all this mess I'll re-iterate some of my initial
> thoughts. 1) why not do this C++ (or C) specific stuff in the
> frontends, maybe at gimplifying/genericization time? That way you
> wouldn't need to worry about middle-end features but you could rely
> solely on what C/C++ permit. It is, after all, C++ _frontend_
> semantics that we enforce here, in the middle-end, which looks
> out-of-place.
Well, it's worth keeping in mind that fixing the way we handle bitfields
is just one piece of a larger project. Furthermore, many of the ideas
in the C++ memory model are applicable to other languages.
However, I must admit, I'm somewhat at a loss; I thought we were doing
all this in stor-layout.c at the time we lay out the structure's memory
form, and then just trying to keep the code generator and optimizers
from mucking things up by combining accesses and the like.
Clearly I'm going to need to sit down and review the code as well.
Which means learning about a part of GCC I've largely been able to
ignore... Sigh...
jeff
^ permalink raw reply [flat|nested] 81+ messages in thread
* Re: [C++0x] contiguous bitfields race implementation
2011-09-02 14:38 ` Richard Guenther
@ 2011-09-07 18:12 ` Jason Merrill
0 siblings, 0 replies; 81+ messages in thread
From: Jason Merrill @ 2011-09-07 18:12 UTC (permalink / raw)
To: Richard Guenther; +Cc: Aldy Hernandez, gcc-patches
On 09/02/2011 10:38 AM, Richard Guenther wrote:
> On Fri, Sep 2, 2011 at 4:10 PM, Jason Merrill<jason@redhat.com> wrote:
>> I wonder what would break if C++ just set TYPE_SIZE to the as-base size?
>
> Good question. Probably argument passing, as the as-base size wouldn't
> get a proper mode assigned from layout_type then(?) for small structs?
Classes for which the as-base size is different are passed by invisible
reference, so that wouldn't be an issue.
But layout_decl would get the wrong size for variables and fields of the
type, so that won't work.
Perhaps it's time to get serious about the change I talked about in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22488#c42 ...
Jason
^ permalink raw reply [flat|nested] 81+ messages in thread
end of thread, other threads:[~2011-09-07 18:08 UTC | newest]
Thread overview: 81+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-09 17:12 [C++0x] contiguous bitfields race implementation Aldy Hernandez
2011-05-09 18:04 ` Jeff Law
2011-05-09 18:05 ` Aldy Hernandez
2011-05-09 19:19 ` Jeff Law
2011-05-09 20:11 ` Aldy Hernandez
2011-05-09 20:28 ` Jakub Jelinek
2011-05-10 11:42 ` Richard Guenther
2011-05-09 20:49 ` Jason Merrill
2011-05-13 22:35 ` Aldy Hernandez
2011-05-16 21:20 ` Aldy Hernandez
2011-05-19 7:17 ` Jason Merrill
2011-05-20 9:21 ` Aldy Hernandez
2011-05-26 18:05 ` Jason Merrill
2011-05-26 18:28 ` Aldy Hernandez
2011-05-26 19:07 ` Jason Merrill
2011-05-26 20:19 ` Aldy Hernandez
2011-05-27 20:41 ` Jason Merrill
2011-07-18 13:10 ` Aldy Hernandez
2011-07-22 19:16 ` Jason Merrill
2011-07-25 17:41 ` Aldy Hernandez
2011-07-26 5:28 ` Jason Merrill
2011-07-26 18:37 ` Aldy Hernandez
2011-07-26 17:54 ` Jason Merrill
2011-07-26 17:51 ` Aldy Hernandez
2011-07-26 18:05 ` Jason Merrill
2011-07-27 15:03 ` Richard Guenther
2011-07-27 15:12 ` Richard Guenther
2011-07-27 15:53 ` Richard Guenther
2011-07-28 13:00 ` Richard Guenther
2011-07-29 2:58 ` Jason Merrill
2011-07-29 12:02 ` Aldy Hernandez
2011-07-29 11:00 ` Richard Guenther
2011-08-01 13:51 ` Richard Guenther
2011-08-05 17:28 ` Aldy Hernandez
2011-08-09 10:52 ` Richard Guenther
2011-08-09 20:53 ` Aldy Hernandez
2011-08-10 13:34 ` Richard Guenther
2011-08-15 19:26 ` Aldy Hernandez
2011-08-27 0:05 ` Aldy Hernandez
2011-08-29 12:54 ` Richard Guenther
2011-08-30 16:07 ` Aldy Hernandez
2011-08-31 8:38 ` Richard Guenther
2011-08-31 13:56 ` Richard Guenther
2011-08-31 20:37 ` Aldy Hernandez
2011-09-01 6:58 ` Richard Guenther
2011-08-30 16:53 ` Aldy Hernandez
2011-08-31 8:55 ` Richard Guenther
2011-08-31 17:24 ` Aldy Hernandez
2011-08-30 21:33 ` Aldy Hernandez
2011-08-31 8:55 ` Richard Guenther
2011-08-31 20:37 ` Aldy Hernandez
2011-09-01 7:02 ` Richard Guenther
2011-09-01 7:05 ` Arnaud Charlet
2011-09-01 14:16 ` Aldy Hernandez
2011-09-02 8:48 ` Richard Guenther
2011-09-02 12:49 ` Aldy Hernandez
2011-09-02 13:05 ` Richard Guenther
2011-09-02 20:34 ` Jeff Law
2011-09-01 14:53 ` Aldy Hernandez
2011-09-01 15:01 ` Jason Merrill
2011-09-01 15:10 ` Aldy Hernandez
2011-09-01 15:20 ` Jason Merrill
2011-09-02 8:53 ` Richard Guenther
2011-09-02 14:10 ` Jason Merrill
2011-09-02 14:38 ` Richard Guenther
2011-09-07 18:12 ` Jason Merrill
2011-07-28 19:42 ` Aldy Hernandez
2011-07-27 18:22 ` Aldy Hernandez
2011-07-28 8:52 ` Richard Guenther
2011-07-29 12:05 ` Aldy Hernandez
2011-07-28 19:58 ` Richard Guenther
2011-07-27 17:29 ` Aldy Hernandez
2011-07-27 17:57 ` Andrew MacLeod
2011-07-27 22:27 ` Joseph S. Myers
2011-07-28 8:58 ` Richard Guenther
2011-07-28 22:26 ` Aldy Hernandez
2011-07-26 20:05 ` Aldy Hernandez
2011-07-27 18:24 ` H.J. Lu
2011-07-27 20:39 ` Aldy Hernandez
2011-07-27 20:54 ` Jakub Jelinek
2011-07-27 21:00 ` Aldy Hernandez