From: Andrew_Pinski@PlayStation.Sony.Com
To: Richard Henderson <rth@redhat.com>
Cc: gcc-patches@gcc.gnu.org, Russell_Olsen@PlayStation.Sony.com,
Trevor_Smigiel@PlayStation.Sony.com
Subject: Re: [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment
Date: Thu, 17 May 2007 21:43:00 -0000 [thread overview]
Message-ID: <OF15BFC61A.42B05901-ON882572DE.000355EF-882572DE.00775314@playstation.sony.com> (raw)
In-Reply-To: <20070202234910.GB14896@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 2934 bytes --]
Richard Henderson <rth@redhat.com> wrote on 02/02/2007 03:49:10 PM:
> I don't like how these two things are split apart. Either cfgexpand.c
> should handle both the addition and the rounding, or
alloc_stack_frame_space
> should.
I don't see any other way to implement this because the offset needs to
take into account the oversize correctly.
> Which does suggest it'd be interesting to create a block containing
> all of the over-aligned data, create one REG that contains the
> rounded address of the base of that block, and all of the data
> therein is offsets from that one REG.
>
> Most of the infrastructure for that is already present, what with
> the partitioning and sorting that we already do...
In fact getting rid of this works and I was able to get the correct
alignment still. The patch includes a testcase for this case also.
> > @@ -1892,7 +1900,8 @@ expand_decl (tree decl)
> > set_mem_attributes (x, decl, 1);
> > SET_DECL_RTL (decl, x);
> > }
> > - else if (use_register_for_decl (decl))
> > + else if (use_register_for_decl (decl)
> > + && align <= PREFERRED_STACK_BOUNDARY)
>
> Why do we need to check alignment for register data?
Good point, I removed this code too. I blindly copied this part of the
code from our local tree thinking it might have been a good idea but I was
clearly wrong.
OK? Bootstrapped and tested on i686-linux-gnu with no regressions.
Thanks,
Andrew Pinski
ChangeLog:
* expr.h (allocate_dynamic_stack_space_1): New function
prototype.
* functionc.c (assign_temp): Take into account the alignment
of the temp if it is greater than the target's preferred
alignment.
* cfgexpand.c: Include optabs.h.
(get_decl_align_unit): Update comment and don't lower the
alignment
if it is greater than the target's preferred alignment.
(alloc_stack_frame_space): Take an unsigned HOST_WIDE_INT for
align.
Take into account the variable's alignment if it is greater than
the target's preferred alignment.
(expand_one_stack_var_at): Likewise.
* explow.c (allocate_dynamic_stack_space_1): Split out from
allocate_dynamic_stack_space and take into acount the required
alignment.
(allocate_dynamic_stack_space): Call
allocate_dynamic_stack_space_1.
* Makefile.in (cfgexpand.o): Update dependecies.
* stmt.c (expand_decl): Take into account the alignment
of the variable if it is greater than the target's preferred
alignment.
* config/i386/i386.c (ix86_compute_frame_layout): Don't assert
that preferred alignment is greater than the normal preferred
alignment. Don't assert that the stack alignment needed is
greater
than the normal preferred alignment.
* gcc.c-torture/execute/pr16660-1.c: New testcase.
* gcc.c-torture/execute/pr16660-2.c: New testcase.
[-- Attachment #2: fixpr16660.diff.txt --]
[-- Type: text/plain, Size: 12576 bytes --]
Index: testsuite/gcc.c-torture/execute/pr16660-1.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-1.c (revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-1.c (revision 0)
@@ -0,0 +1,11 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+ int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+ size_t b = (size_t)&a;
+ if (b&(ALIGNMENT-1))
+ return 1;
+ return 0;
+}
+
Index: testsuite/gcc.c-torture/execute/pr16660-2.c
===================================================================
--- testsuite/gcc.c-torture/execute/pr16660-2.c (revision 0)
+++ testsuite/gcc.c-torture/execute/pr16660-2.c (revision 0)
@@ -0,0 +1,19 @@
+typedef __SIZE_TYPE__ size_t;
+#define ALIGNMENT 256
+int main(void)
+{
+ {
+ int a[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+ size_t b = (size_t)&a;
+ if (b&(ALIGNMENT-1))
+ return 1;
+ }
+ {
+ int a1[ALIGNMENT/sizeof(int)] __attribute__((aligned(ALIGNMENT)));
+ size_t b = (size_t)&a1;
+ if (b&(ALIGNMENT-1))
+ return 1;
+ }
+ return 0;
+}
+
Index: expr.h
===================================================================
--- expr.h (revision 124657)
+++ expr.h (working copy)
@@ -712,6 +712,7 @@ extern void update_nonlocal_goto_save_ar
/* Allocate some space on the stack dynamically and return its address. An rtx
says how many bytes. */
extern rtx allocate_dynamic_stack_space (rtx, rtx, int);
+extern rtx allocate_dynamic_stack_space_1 (rtx, rtx, int, int);
/* Probe a range of stack addresses from FIRST to FIRST+SIZE, inclusive.
FIRST is a constant and size is a Pmode RTX. These are offsets from the
Index: function.c
===================================================================
--- function.c (revision 124657)
+++ function.c (working copy)
@@ -813,6 +813,7 @@ assign_temp (tree type_or_decl, int keep
if (mode == BLKmode || memory_required)
{
HOST_WIDE_INT size = int_size_in_bytes (type);
+ unsigned HOST_WIDE_INT align;
rtx tmp;
/* Zero sized arrays are GNU C extension. Set size to 1 to avoid
@@ -837,7 +838,37 @@ assign_temp (tree type_or_decl, int keep
size = 1;
}
- tmp = assign_stack_temp_for_type (mode, size, keep, type);
+ align = TYPE_ALIGN (type);
+ if (decl && DECL_ALIGN (decl) > align)
+ align = DECL_ALIGN (decl);
+
+ /* When the user specifies an alignment larger than
+ BIGGEST_ALIGNMENT we need to allocate extra space so we can
+ adjust this decl's address. */
+ if (align > PREFERRED_STACK_BOUNDARY)
+ {
+ rtx addr;
+ HOST_WIDE_INT pad = (align - PREFERRED_STACK_BOUNDARY) / BITS_PER_UNIT;
+ size += pad;
+ tmp = assign_stack_temp_for_type (mode, size, keep, type);
+ addr = XEXP (tmp, 0);
+ /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
+ but we know it can't. So add ourselves and then do
+ TRUNC_DIV_EXPR. */
+ addr = expand_binop (Pmode, add_optab, addr,
+ GEN_INT (align / BITS_PER_UNIT - 1),
+ NULL_RTX, 1, OPTAB_LIB_WIDEN);
+ addr = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, addr,
+ GEN_INT (align / BITS_PER_UNIT),
+ NULL_RTX, 1);
+ addr = expand_mult (Pmode, addr,
+ GEN_INT (align / BITS_PER_UNIT),
+ NULL_RTX, 1);
+ tmp = change_address (tmp, VOIDmode, addr);
+ }
+ else
+ tmp = assign_stack_temp_for_type (mode, size, keep, type);
+
return tmp;
}
Index: cfgexpand.c
===================================================================
--- cfgexpand.c (revision 124657)
+++ cfgexpand.c (working copy)
@@ -41,6 +41,7 @@ Boston, MA 02110-1301, USA. */
#include "params.h"
#include "tree-inline.h"
#include "value-prof.h"
+#include "optabs.h"
/* Verify that there is exactly single jump instruction since last and attach
REG_BR_PROB note specifying probability.
@@ -151,8 +152,7 @@ static bool has_protected_decls;
smaller than our cutoff threshold. Used for -Wstack-protector. */
static bool has_short_buffer;
-/* Discover the byte alignment to use for DECL. Ignore alignment
- we can't do with expected alignment of the stack boundary. */
+/* Discover the byte alignment to use for DECL. */
static unsigned int
get_decl_align_unit (tree decl)
@@ -161,8 +161,6 @@ get_decl_align_unit (tree decl)
align = DECL_ALIGN (decl);
align = LOCAL_ALIGNMENT (TREE_TYPE (decl), align);
- if (align > PREFERRED_STACK_BOUNDARY)
- align = PREFERRED_STACK_BOUNDARY;
if (cfun->stack_alignment_needed < align)
cfun->stack_alignment_needed = align;
@@ -173,11 +171,21 @@ get_decl_align_unit (tree decl)
Return the frame offset. */
static HOST_WIDE_INT
-alloc_stack_frame_space (HOST_WIDE_INT size, HOST_WIDE_INT align)
+alloc_stack_frame_space (HOST_WIDE_INT size, unsigned HOST_WIDE_INT align)
{
HOST_WIDE_INT offset, new_frame_offset;
new_frame_offset = frame_offset;
+
+ /* Allocate extra space if the alignment required is greater than the
+ stack boundary and then assume the RTL expansion of the variable, gets
+ the correct alignment. */
+ if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+ {
+ size += align;
+ align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
+ }
+
if (FRAME_GROWS_DOWNWARD)
{
new_frame_offset -= size + frame_phase;
@@ -523,23 +531,27 @@ dump_stack_var_partition (void)
static void
expand_one_stack_var_at (tree decl, HOST_WIDE_INT offset)
{
- HOST_WIDE_INT align;
+ unsigned HOST_WIDE_INT align;
rtx x;
/* If this fails, we've overflowed the stack frame. Error nicely? */
gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
x = plus_constant (virtual_stack_vars_rtx, offset);
- x = gen_rtx_MEM (DECL_MODE (decl), x);
+ align = get_decl_align_unit (decl);
+
+ if (align > PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
+ {
+ rtx addr = x;
+ addr = expand_binop (Pmode, add_optab, addr,
+ GEN_INT (align - 1),
+ NULL_RTX, 1, OPTAB_LIB_WIDEN);
+ addr = expand_and (Pmode, addr, GEN_INT (-align),
+ NULL_RTX);
+ x = addr;
+ }
- /* Set alignment we actually gave this decl. */
- offset -= frame_phase;
- align = offset & -offset;
- align *= BITS_PER_UNIT;
- if (align > STACK_BOUNDARY || align == 0)
- align = STACK_BOUNDARY;
- DECL_ALIGN (decl) = align;
- DECL_USER_ALIGN (decl) = 0;
+ x = gen_rtx_MEM (DECL_MODE (decl), x);
set_mem_attributes (x, decl, true);
SET_DECL_RTL (decl, x);
Index: explow.c
===================================================================
--- explow.c (revision 124657)
+++ explow.c (working copy)
@@ -1093,16 +1093,28 @@ update_nonlocal_goto_save_area (void)
TARGET is a place in which the address can be placed.
KNOWN_ALIGN is the alignment (in bits) that we know SIZE has. */
-
rtx
allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
{
+ return allocate_dynamic_stack_space_1 (size, target, known_align,
+ BIGGEST_ALIGNMENT);
+}
+
+/* Like allocate_dynamic_stack_space expect provide a way, REQUIRED_ALGIN,
+ to say the required alignment of the address. */
+rtx
+allocate_dynamic_stack_space_1 (rtx size, rtx target, int known_align,
+ int required_align)
+{
/* If we're asking for zero bytes, it doesn't matter what we point
to since we can't dereference it. But return a reasonable
address anyway. */
if (size == const0_rtx)
return virtual_stack_dynamic_rtx;
+ if (required_align < BIGGEST_ALIGNMENT)
+ required_align = BIGGEST_ALIGNMENT;
+
/* Otherwise, show we're calling alloca or equivalent. */
current_function_calls_alloca = 1;
@@ -1131,13 +1143,13 @@ allocate_dynamic_stack_space (rtx size,
#if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
#define MUST_ALIGN 1
#else
-#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < BIGGEST_ALIGNMENT)
+#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < required_align)
#endif
if (MUST_ALIGN)
size
= force_operand (plus_constant (size,
- BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+ required_align / BITS_PER_UNIT - 1),
NULL_RTX);
#ifdef SETJMP_VIA_SAVE_AREA
@@ -1293,18 +1305,12 @@ allocate_dynamic_stack_space (rtx size,
if (MUST_ALIGN)
{
- /* CEIL_DIV_EXPR needs to worry about the addition overflowing,
- but we know it can't. So add ourselves and then do
- TRUNC_DIV_EXPR. */
target = expand_binop (Pmode, add_optab, target,
- GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
+ GEN_INT (required_align / BITS_PER_UNIT - 1),
NULL_RTX, 1, OPTAB_LIB_WIDEN);
- target = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, target,
- GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
- NULL_RTX, 1);
- target = expand_mult (Pmode, target,
- GEN_INT (BIGGEST_ALIGNMENT / BITS_PER_UNIT),
- NULL_RTX, 1);
+ target = expand_and (Pmode, target,
+ GEN_INT (-required_align / BITS_PER_UNIT),
+ NULL_RTX);
}
/* Record the new stack level for nonlocal gotos. */
Index: Makefile.in
===================================================================
--- Makefile.in (revision 124657)
+++ Makefile.in (working copy)
@@ -2553,7 +2553,7 @@ cfgexpand.o : cfgexpand.c $(TREE_FLOW_H)
$(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \
coretypes.h $(TREE_DUMP_H) except.h langhooks.h tree-pass.h $(RTL_H) \
$(DIAGNOSTIC_H) toplev.h $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
- value-prof.h
+ value-prof.h $(OPTABS_H)
cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
$(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
output.h toplev.h $(FUNCTION_H) except.h $(TM_P_H) insn-config.h $(EXPR_H) \
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c (revision 124657)
+++ config/i386/i386.c (working copy)
@@ -5610,9 +5610,6 @@ ix86_compute_frame_layout (struct ix86_f
gcc_assert (!size || stack_alignment_needed);
gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
- gcc_assert (preferred_alignment <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
- gcc_assert (stack_alignment_needed
- <= PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
if (stack_alignment_needed < STACK_BOUNDARY / BITS_PER_UNIT)
stack_alignment_needed = STACK_BOUNDARY / BITS_PER_UNIT;
Index: stmt.c
===================================================================
--- stmt.c (revision 124657)
+++ stmt.c (working copy)
@@ -1850,6 +1850,7 @@ void
expand_decl (tree decl)
{
tree type;
+ unsigned HOST_WIDE_INT align;
type = TREE_TYPE (decl);
@@ -1874,6 +1875,13 @@ expand_decl (tree decl)
if (TREE_STATIC (decl) || DECL_EXTERNAL (decl))
return;
+ if (DECL_USER_ALIGN (decl))
+ align = DECL_ALIGN (decl);
+ else if (DECL_MODE (decl) == BLKmode)
+ align = BIGGEST_ALIGNMENT;
+ else
+ align = TYPE_ALIGN (type);
+
/* Create the RTL representation for the variable. */
if (type == error_mark_node)
@@ -1941,10 +1949,16 @@ expand_decl (tree decl)
oldaddr = XEXP (DECL_RTL (decl), 0);
}
- /* Set alignment we actually gave this decl. */
- DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
- : GET_MODE_BITSIZE (DECL_MODE (decl)));
- DECL_USER_ALIGN (decl) = 0;
+ /* Set alignment we actually gave this decl if we don't have
+ an user specific alignment and the alignment is less than
+ the biggest alignment. */
+ if (! DECL_USER_ALIGN (decl)
+ || DECL_ALIGN (decl) <= BIGGEST_ALIGNMENT)
+ {
+ DECL_ALIGN (decl) = (DECL_MODE (decl) == BLKmode ? BIGGEST_ALIGNMENT
+ : GET_MODE_BITSIZE (DECL_MODE (decl)));
+ DECL_USER_ALIGN (decl) = 0;
+ }
x = assign_temp (decl, 1, 1, 1);
set_mem_attributes (x, decl, 1);
@@ -1975,8 +1989,10 @@ expand_decl (tree decl)
DECL_ALIGN says how the variable is to be aligned and we
cannot use it to conclude anything about the alignment of
the size. */
- address = allocate_dynamic_stack_space (size, NULL_RTX,
- TYPE_ALIGN (TREE_TYPE (decl)));
+ address = allocate_dynamic_stack_space_1 (size, NULL_RTX,
+ TYPE_ALIGN (TREE_TYPE (decl)),
+ MAX (TYPE_ALIGN (TREE_TYPE (decl)),
+ align));
/* Reference the variable indirect through that rtx. */
x = gen_rtx_MEM (DECL_MODE (decl), address);
next prev parent reply other threads:[~2007-05-17 21:43 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-01-30 2:42 Andrew_Pinski
2007-01-31 15:20 ` Meissner, Michael
2007-02-02 23:49 ` Richard Henderson
2007-05-17 21:43 ` Andrew_Pinski [this message]
[not found] ` <OF15BFC61A.42B05901-ON882572DE.000355EF-882572DE.007752F8@playstation.sony.com>
2007-05-17 23:50 ` trevor_smigiel
2007-05-18 0:20 ` Andrew_Pinski
2007-05-18 17:42 ` Andrew_Pinski
2007-07-02 23:56 ` Andrew Pinski
2007-08-04 11:46 ` Andrew Pinski
2007-08-28 20:08 ` Andrew Pinski
2007-09-17 22:22 ` Andrew Pinski
2007-10-09 21:20 ` Andrew Pinski
2007-10-10 0:31 ` H.J. Lu
2007-10-10 0:38 ` Andrew_Pinski
2007-10-10 2:17 ` H.J. Lu
2007-10-10 3:01 ` Andrew Pinski
2007-10-10 3:44 ` How to make gcc psABI conformant H.J. Lu
2007-10-10 11:43 ` Daniel Jacobowitz
2007-10-10 13:05 ` H.J. Lu
2007-10-10 13:10 ` Daniel Jacobowitz
2007-10-10 13:28 ` H.J. Lu
2007-10-10 13:35 ` H.J. Lu
2007-10-10 1:07 ` [PATCH] Fix PR 16660, attribute((aligned)) doesn't work for variables on the stack for greater than required alignment Dorit Nuzman
2008-01-24 14:52 ` Andrew Pinski
2008-01-24 18:07 ` H.J. Lu
2007-02-05 22:39 ` Dorit Nuzman
[not found] <OF15BFC61A.42B05901-ON882572DE.000355EF-882572DE.00775314@LocalDomain>
2007-05-17 21:45 ` Andrew_Pinski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=OF15BFC61A.42B05901-ON882572DE.000355EF-882572DE.00775314@playstation.sony.com \
--to=andrew_pinski@playstation.sony.com \
--cc=Russell_Olsen@PlayStation.Sony.com \
--cc=Trevor_Smigiel@PlayStation.Sony.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=rth@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).