public inbox for gcc-patches@gcc.gnu.org
* ping [RFC] [patch] fix PR32893 - forcing alignment >= STACK_BOUNDARY
@ 2007-10-22  9:17 Dorit Nuzman
  2007-10-22 14:19 ` H.J. Lu
  0 siblings, 1 reply; 6+ messages in thread
From: Dorit Nuzman @ 2007-10-22  9:17 UTC (permalink / raw)
  To: gcc-patches; +Cc: H.J. Lu, Andrew Pinski, Mark Mitchell


So I'm not sure how to proceed with PR32893 (see details and proposed patch
below).

I didn't yet get a response from Pinski on whether his fix for PR16660
would also fix this PR,
and recently H.J. marked this PR as dependent on the newly opened PR33721.
It seems like there's a more general issue here that needs to be solved,
and the simple fix I suggested below in the vectorizer is only a workaround
(that would also involve a lot of testcase changes). I believe (hope...)
that a fix to either PR16660 or PR33721 would also fix this PR, but I'm not
sure what's going on with these PRs (what are the prospects of either of
these PRs to get resolved in the near future?). In the meantime we are
generating wrong code in the vectorizer...

thanks,
dorit

----- Forwarded by Dorit Nuzman/Haifa/IBM on 21/10/2007 21:44 -----

>
> Hi,
>
> There's a function in zlib with a simple loop initializing a local array.
> The vectorizer asks the compiler to force 128 bit alignment to this array
> (i.e. sets DECL_ALIGN to 128 bit), but sometimes, on some targets (e.g.
> i686-linux-gnu), the compiler doesn't respect that, and so we end up
> accessing an unaligned address using an aligned memory instruction (and
> segfault). It has been reported that several applications that link with
> zlib - "firefox/mozilla/thunderbird/seamonkey/xulrunner, rpm (notably
> rpm2cpio), openoffice" - suffer from this problem.
> This is probably because the STACK_BOUNDARY is not guaranteed to be 128 bit
> aligned on these systems. The vectorizer however is checking the
> PREFERRED_STACK_BOUNDARY instead, and hopes for the best (i.e. we knowingly
> do something that may be wrong, but is ok most of the time). This, by the
> way, is also what would happen if we ask to align this array in the source
> code (e.g. using:
> __attribute__ ((__aligned__(16))),
> i.e. no alignment tweaks from within the vectorizer. I believe this is what
> PR16660 is about?).
>
> There are two possible solutions:
>
> 1) just do the conservative right thing in the vectorizer, and check the
> STACK_BOUNDARY instead of the PREFERRED_STACK_BOUNDARY. The downside of
> course is that we will now never be able to force alignment of local arrays
> on most x86 systems, and will have to work with unaligned accesses (using
> unaligned-moves, or peeling when possible, or versioning, or not vectorizing
> at all), which can be *much* less efficient. This amounts to the following
> small change:
>
> Index: gcc/tree-vectorizer.c
> ===================================================================
> *** gcc/tree-vectorizer.c       (revision 128976)
> --- gcc/tree-vectorizer.c       (working copy)
> *************** vect_can_force_dr_alignment_p (const_tre
> *** 1606,1617 ****
>     if (TREE_STATIC (decl))
>       return (alignment <= MAX_OFILE_ALIGNMENT);
>     else
> !     /* This is not 100% correct.  The absolute correct stack alignment
> !        is STACK_BOUNDARY.  We're supposed to hope, but not assume, that
> !        PREFERRED_STACK_BOUNDARY is honored by all translation units.
> !        However, until someone implements forced stack alignment, SSE
> !        isn't really usable without this.  */
> !     return (alignment <= PREFERRED_STACK_BOUNDARY);
>   }
>
>
> --- 1606,1612 ----
>     if (TREE_STATIC (decl))
>       return (alignment <= MAX_OFILE_ALIGNMENT);
>     else
> !     return (alignment <= STACK_BOUNDARY);
>   }
>
>
> The above patch has been reported to work well and solve the problem:
>
> "FWIW, i've been running GCC-4.2 svn with the patch at
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413#c17 for a couple months now
> and have built a sizable chunk of our package repository with -ftree-vectorize
> enabled several times over and have yet to run into any trouble whatsoever."
>
> It passed bootstrap with vectorization enabled on i386-linux. However, the
> other problem with this patch is that it involves making a lot of changes
> in the vectorizer testsuite, because a lot of tests that used to get
> vectorized before now either don't get vectorized on x86, or get
> vectorized differently. I started to go over the testsuite and make the
> required changes, but it's pretty tedious work, so I'm attaching what I
> have so far as an RFC, because hopefully there is a different, probably
> better, solution that we can consider instead:
>
> 2) implement forced stack alignment, which I believe is what Pinski's patch
> is supposed to do: http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01409.html ?
> (by the way, looks like this patch has been waiting for approval for quite
> a while now). Unfortunately, there's something in his patch that still
> needs some tweaking, because it doesn't solve the PR as is:
>
> "
> Of course after my patch for PR 16660, the patch here should be
> changed to just return true always.
>
> Thanks,
> Andrew Pinski
> ----------------------
> > I also tested using Andrew's patch from bug #16660 and always returning
> > true in vect_can_force_dr_alignment_p but it does not fix this error.
>
> Andrew, makes sense to you?
> ----------------------
> I think my patch only checks PREFERRED_STACK_BOUNDARY and not STACK_BOUNDARY
> which is why it does not work but I have not looked into it at all.
> "
>
> So I think the main question is - Andrew, do you think your patch can be
> made to solve this problem?
>
> If not, we'd have to resort to the patch attached below.
>
> The patch is not ready to be committed as is, because we'd still have a
> bunch of failures on i*86-linux:
>
> FAIL: gcc.dg/vect/vect-31.c scan-tree-dump-times Alignment of access forced using peeling 2
> FAIL: gcc.dg/vect/vect-34.c scan-tree-dump-times Vectorizing an unaligned access 0
> FAIL: gcc.dg/vect/vect-36.c scan-tree-dump-times Vectorizing an unaligned access 0
> FAIL: gcc.dg/vect/vect-36.c scan-tree-dump-times Alignment of access forced using peeling 0
> FAIL: gcc.dg/vect/vect-64.c scan-tree-dump-times Alignment of access forced using peeling 2
> FAIL: gcc.dg/vect/vect-65.c scan-tree-dump-times Vectorizing an unaligned access 0
> FAIL: gcc.dg/vect/vect-66.c scan-tree-dump-times Alignment of access forced using peeling 1
> FAIL: gcc.dg/vect/vect-68.c scan-tree-dump-times Alignment of access forced using peeling 2
> FAIL: gcc.dg/vect/vect-72.c scan-tree-dump-times Alignment of access forced using peeling 0
> FAIL: gcc.dg/vect/vect-73.c scan-tree-dump-times Vectorizing an unaligned access 0
> FAIL: gcc.dg/vect/vect-76.c scan-tree-dump-times Vectorizing an unaligned access 2
> FAIL: gcc.dg/vect/vect-77.c scan-tree-dump-times Alignment of access forced using peeling 0
> FAIL: gcc.dg/vect/vect-78.c scan-tree-dump-times Alignment of access forced using peeling 0
> FAIL: gcc.dg/vect/vect-86.c scan-tree-dump-times Alignment of access forced using peeling 0
> FAIL: gcc.dg/vect/vect-all.c scan-tree-dump-times Vectorizing an unaligned access 0
> FAIL: gcc.dg/vect/vect-all.c scan-tree-dump-times Alignment of access forced using peeling 0
> FAIL: gcc.dg/vect/slp-25.c scan-tree-dump-times Alignment of access forced using peeling 2
> FAIL: gcc.dg/vect/wrapv-vect-7.c scan-tree-dump-times Vectorizing an unaligned access 0
> FAIL: gcc.dg/vect/no-scevccp-outer-6.c scan-tree-dump-times OUTER LOOP VECTORIZED. 1
>
> I am considering throwing away all these alignment checks altogether,
> because they are becoming a pain to maintain.
>
> thanks,
> dorit
>
> ChangeLog:
>
>       * tree-vectorizer.c (vect_can_force_dr_alignment_p): Check
>       STACK_BOUNDARY instead of PREFERRED_STACK_BOUNDARY.
>
>       * testsuite/lib/target-supports.exp
>       (check_effective_target_unaligned_stack): New.
>       * testsuite/gcc.dg/vect/vect-2.c: xfail for unaligned_stack targets.
>       * testsuite/gcc.dg/vect/vect-3.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-4.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-5.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-6.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-7.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-13.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-17.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-18.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-19.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-20.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-21.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-22.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-27.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-29.c: Likewise.
>
> (See attached file: stackboundary.txt)[attachment "stackboundary.txt"
> deleted by Dorit Nuzman/Haifa/IBM]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: ping [RFC] [patch] fix PR32893 - forcing alignment >=  STACK_BOUNDARY
  2007-10-22  9:17 ping [RFC] [patch] fix PR32893 - forcing alignment >= STACK_BOUNDARY Dorit Nuzman
@ 2007-10-22 14:19 ` H.J. Lu
  2007-10-22 18:31   ` Mark Mitchell
  0 siblings, 1 reply; 6+ messages in thread
From: H.J. Lu @ 2007-10-22 14:19 UTC (permalink / raw)
  To: Dorit Nuzman; +Cc: gcc-patches, Andrew Pinski, Mark Mitchell

On Sun, Oct 21, 2007 at 09:05:38PM -0800, Dorit Nuzman wrote:
> 
> So I'm not sure how to proceed with PR32893 (see details and proposed patch
> below).
> 
> I didn't yet get a response from Pinski on whether his fix for PR16660
> would also fix this PR,
> and recently H.J. marked this PR as dependent on the newly opened PR33721.
> It seems like there's a more general issue here that needs to be solved,
> and the simple fix I suggested below in the vectorizer is only a workaround
> (that would also involve a lot of testcase changes). I believe (hope...)
> that a fix to either PR16660 or PR33721 would also fix this PR, but I'm not
> sure what's going on with these PRs (what are the prospects of either of
> these PRs to get resolved in the near future?). In the meantime we are
> generating wrong code in the vectorizer...

We are working on a proposal to properly fix the stack alignment issue.
We will start a project for gcc 4.4.


H.J.


* Re: ping [RFC] [patch] fix PR32893 - forcing alignment >= STACK_BOUNDARY
  2007-10-22 14:19 ` H.J. Lu
@ 2007-10-22 18:31   ` Mark Mitchell
  2007-10-22 18:37     ` Daniel Jacobowitz
  2007-10-30  7:24     ` Dorit Nuzman
  0 siblings, 2 replies; 6+ messages in thread
From: Mark Mitchell @ 2007-10-22 18:31 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Dorit Nuzman, gcc-patches, Andrew Pinski

H.J. Lu wrote:
> On Sun, Oct 21, 2007 at 09:05:38PM -0800, Dorit Nuzman wrote:
>> So I'm not sure how to proceed with PR32893 (see details and proposed patch
>> below).
>>
>> I didn't yet get a response from Pinski on whether his fix for PR16660
>> would also fix this PR,
> We are working on a proposal to properly fix the stack alignment issue.
> We will start a project for gcc 4.4.

Dorit --

For GCC 4.3, I don't think you have a choice: you need to change the
vectorizer to use plain STACK_BOUNDARY, because that's all the compiler
can actually support.  I understand the performance cost, but
correctness has to trump performance.  ABIs designed to support vector
instructions will probably ensure that STACK_BOUNDARY is large enough
that this is not an issue.

What we probably want in the long term is for the compiler (including
the vectorizer) to be able to generate variables with arbitrary
alignment in all functions.  I would expect the implementation of that
would be relatively simple; in the function prologue, you would notice
that this is a function containing a variable of alignment N (greater
than STACK_BOUNDARY) and generate fixup code to ensure that the stack
was so aligned.  So, the cost would only be borne by functions with
variables requiring large alignment, and, even there, it would just be a few
instructions in the prologue.  Perhaps this is as simple as setting
PREFERRED_STACK_BOUNDARY to max(user-specified-preferred-stack-boundary,
biggest-alignment-used-in-this-function) at the start of generating RTL
for a function.
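
As a rough illustration of the fixup arithmetic described above (this is not
GCC code; the helper names realign_down and realign_cost are made up for the
example, and a downward-growing stack is assumed), rounding a stack pointer
to an N-byte boundary is a single mask operation, and the padding it costs is
at most N-1 bytes:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: round a stack pointer value down
   to a multiple of ALIGN bytes, as a realigning prologue might do on a
   downward-growing stack.  ALIGN must be a nonzero power of two.  */
static uintptr_t
realign_down (uintptr_t sp, uintptr_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return sp & ~(align - 1);
}

/* Hypothetical helper: number of padding bytes the realignment spends.
   Always in the range 0 .. ALIGN-1.  */
static uintptr_t
realign_cost (uintptr_t sp, uintptr_t align)
{
  return sp - realign_down (sp, align);
}
```

A real prologue would also have to remember the incoming stack pointer so
that the epilogue can restore it; the sketch only covers the rounding step.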

Certainly, I also think that the compiler should generate an error if it
cannot honor an alignment directive.  For example, in PR 16660, we
should generate an error on the declaration of "tmp" if we cannot
actually align it on a 16-byte boundary.

Thanks,

-- 
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713


* Re: ping [RFC] [patch] fix PR32893 - forcing alignment >=  STACK_BOUNDARY
  2007-10-22 18:31   ` Mark Mitchell
@ 2007-10-22 18:37     ` Daniel Jacobowitz
  2007-10-22 18:45       ` Mark Mitchell
  2007-10-30  7:24     ` Dorit Nuzman
  1 sibling, 1 reply; 6+ messages in thread
From: Daniel Jacobowitz @ 2007-10-22 18:37 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: H.J. Lu, Dorit Nuzman, gcc-patches, Andrew Pinski

On Mon, Oct 22, 2007 at 12:39:39PM -0500, Mark Mitchell wrote:
> For GCC 4.3, I don't think you have a choice: you need to change the
> vectorizer to use plain STACK_BOUNDARY, because that's all the compiler
> can actually support.  I understand the performance cost, but
> correctness has to trump performance.  ABIs designed to support vector
> instructions will probably ensure that STACK_BOUNDARY is large enough
> that this is not an issue.

FYI, as I learned recently, powerpc-linux is one target where
STACK_BOUNDARY is smaller than PREFERRED_STACK_BOUNDARY - but
in fact GCC and the ABI always arrange for the larger alignment.
So a new macro may be called for to continue taking advantage of
AltiVec in the vectorizer.

-- 
Daniel Jacobowitz
CodeSourcery


* Re: ping [RFC] [patch] fix PR32893 - forcing alignment >= STACK_BOUNDARY
  2007-10-22 18:37     ` Daniel Jacobowitz
@ 2007-10-22 18:45       ` Mark Mitchell
  0 siblings, 0 replies; 6+ messages in thread
From: Mark Mitchell @ 2007-10-22 18:45 UTC (permalink / raw)
  To: Mark Mitchell, H.J. Lu, Dorit Nuzman, gcc-patches, Andrew Pinski

Daniel Jacobowitz wrote:
> On Mon, Oct 22, 2007 at 12:39:39PM -0500, Mark Mitchell wrote:
>> For GCC 4.3, I don't think you have a choice: you need to change the
>> vectorizer to use plain STACK_BOUNDARY, because that's all the compiler
>> can actually support.  I understand the performance cost, but
>> correctness has to trump performance.  ABIs designed to support vector
>> instructions will probably ensure that STACK_BOUNDARY is large enough
>> that this is not an issue.
> 
> FYI, as I learned recently, powerpc-linux is one target where
> STACK_BOUNDARY is smaller than PREFERRED_STACK_BOUNDARY - but
> in fact GCC and the ABI always arrange for the larger alignment.
> So a new macro may be called for to continue taking advantage of
> AltiVec in the vectorizer.

I see.  It's possible that I don't understand what these macros do
exactly, but, given my reading of the documentation, I agree.  I think
we should have an ABI_STACK_BOUNDARY that says what the minimum stack
boundary is guaranteed to be by the ABI.  The default would be
STACK_BOUNDARY.  The default for PREFERRED_STACK_BOUNDARY would then be
ABI_STACK_BOUNDARY, not STACK_BOUNDARY.

The idea is that STACK_BOUNDARY says what the hardware requires.
ABI_STACK_BOUNDARY says what the ABI requires.  PREFERRED_STACK_BOUNDARY
says what alignment we preserve within a given function.  Manually
setting the preferred stack boundary (-mpreferred-stack-boundary) below
ABI_STACK_BOUNDARY should be an error, since it's not ABI compatible.

The vectorizer could use ABI_STACK_BOUNDARY as the maximum allowed
alignment for things, since we know that we'll have at least that
alignment within any function.
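
A minimal sketch of how the three boundaries would relate under this
proposal. Everything here is hypothetical: ABI_STACK_BOUNDARY does not exist
in GCC, the struct and function names are invented for the example, and the
bit values used with it are placeholders rather than real target settings.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-target description; all values are in bits.  */
struct stack_bounds
{
  unsigned hw_bits;        /* STACK_BOUNDARY: what the hardware requires.  */
  unsigned abi_bits;       /* Proposed ABI_STACK_BOUNDARY: what the ABI
                              guarantees on entry to every function.  */
  unsigned preferred_bits; /* PREFERRED_STACK_BOUNDARY: what we preserve
                              within a given function.  */
};

/* A preferred boundary below the ABI boundary would not be ABI
   compatible, so it should be rejected.  */
static bool
preferred_boundary_is_abi_compatible (const struct stack_bounds *t)
{
  assert (t->abi_bits >= t->hw_bits);
  return t->preferred_bits >= t->abi_bits;
}

/* Simplified stand-in for the stack-variable case of
   vect_can_force_dr_alignment_p: only alignments the ABI guarantees
   can be forced safely.  */
static bool
can_force_stack_alignment (const struct stack_bounds *t,
                           unsigned alignment_bits)
{
  return alignment_bits <= t->abi_bits;
}
```

With a target whose ABI guarantees only 32 bits, a 128-bit request is
refused even if the preferred boundary is 128; with an ABI that guarantees
128 bits, the same request is accepted.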

I still think that making arbitrary alignment for local
variables work is a good idea.  Then, users and optimization passes can
just align things as they like.  Paying a few extra instructions in the
prologue to get things set up is a huge win if it allows vectorization
of some loop.

-- 
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713


* Re: ping [RFC] [patch] fix PR32893 - forcing alignment >= STACK_BOUNDARY
  2007-10-22 18:31   ` Mark Mitchell
  2007-10-22 18:37     ` Daniel Jacobowitz
@ 2007-10-30  7:24     ` Dorit Nuzman
  1 sibling, 0 replies; 6+ messages in thread
From: Dorit Nuzman @ 2007-10-30  7:24 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 7057 bytes --]

...
> Dorit --
>
> For GCC 4.3, I don't think you have a choice: you need to change the
> vectorizer to use plain STACK_BOUNDARY, because that's all the compiler
> can actually support.

ok. Here is the patch that makes the vectorizer consider STACK_BOUNDARY
instead of PREFERRED_STACK_BOUNDARY. If we add the proposed
ABI_STACK_BOUNDARY
(http://gcc.gnu.org/ml/gcc-patches/2007-10/msg01302.html) we could replace
STACK_BOUNDARY with ABI_STACK_BOUNDARY.

Most of the patch is changes to the vectorizer testcases to account for the
fact that arrays on the stack may not be alignable anymore on targets whose
guaranteed stack alignment is too small. Wherever possible I globalized
arrays that were originally local. In cases where this changed the behavior
of the test, I also added a version where I left the arrays local and
either removed the alignment checks, or updated them to take into account
"unaligned_stack" targets (or both). For example:
- We can't resolve dependences in tests that have pointers and global
arrays. Therefore, these tests now require run-time dependence testing
(versioning for aliasing) (and this in turn means that we will not peel to
align references, because currently we either version the loop or peel the
loop, so versioning for alignment will be used instead).
- the pass that increases alignment of global arrays (when
-fsection-anchors is on) does not consider arrays within structs. So, when
-fsection-anchors is on (the default on powerpc-linux) we can't change the
alignment of global arrays that are struct fields, and so globalizing these
arrays doesn't help there.
- the same happens with a few cases of multi-dimensional arrays
(vect-[64,65,66].c). Need to check why.
- no-scevccp-outer-6.c changes its behavior when a local array is
globalized (the loop changes its form). Need to check why.
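
For readers unfamiliar with "versioning for aliasing" as mentioned in the
first item above, here is a hand-written analogue, loosely modelled on the
vect-77-global.c situation (a global destination array and a pointer
source). It is only a sketch of the idea, not what the vectorizer emits; the
function names are invented, and the overlap test assumes a flat address
space where pointers can be compared as integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero if the N-element ranges starting at A and B share no element.
   Assumes a flat address space.  */
static int
ranges_disjoint (const int *a, const int *b, size_t n)
{
  uintptr_t pa = (uintptr_t) a;
  uintptr_t pb = (uintptr_t) b;
  uintptr_t bytes = n * sizeof (int);

  return pa + bytes <= pb || pb + bytes <= pa;
}

/* Copy N ints from SRC to DST using two versions of the same loop,
   selected by a run-time dependence test.  Returns 1 if the version
   that is safe to vectorize ran, 0 if the scalar fallback ran.  */
static int
copy_versioned (int *dst, const int *src, size_t n)
{
  size_t i;

  if (ranges_disjoint (dst, src, n))
    {
      /* No overlap: iterations are independent, so a compiler would be
         free to execute several of them at once.  */
      for (i = 0; i < n; i++)
        dst[i] = src[i];
      return 1;
    }

  /* Possible overlap: keep strict element-by-element order.  */
  for (i = 0; i < n; i++)
    dst[i] = src[i];
  return 0;
}
```

Because the loop already carries this run-time guard, the misaligned
reference is handled by a further run-time check (versioning for alignment)
rather than by peeling, which is the effect the first item describes.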

Bootstrapped with vectorization enabled on i386-linux and powerpc64-linux,
tested on the vectorizer testcases on these platforms. To be committed to
mainline.

thanks,
dorit

ChangeLog:

        PR tree-optimization/32893
        * tree-vectorizer.c (vect_can_force_dr_alignment_p): Check
        STACK_BOUNDARY instead of PREFERRED_STACK_BOUNDARY.

testsuite/ChangeLog:

        PR tree-optimization/32893
        * testsuite/lib/target-supports.exp
        (check_effective_target_unaligned_stack): new keyword.
        * testsuite/gcc.dg/vect/vect-2.c: Globalize arrays to make the test
        not sensitive to unaligned_stack.
        * testsuite/gcc.dg/vect/vect-3.c: Likewise.
        * testsuite/gcc.dg/vect/vect-4.c: Likewise.
        * testsuite/gcc.dg/vect/vect-5.c: Likewise.
        * testsuite/gcc.dg/vect/vect-6.c: Likewise.
        * testsuite/gcc.dg/vect/vect-7.c: Likewise.
        * testsuite/gcc.dg/vect/vect-13.c: Likewise.
        * testsuite/gcc.dg/vect/vect-17.c: Likewise.
        * testsuite/gcc.dg/vect/vect-18.c: Likewise.
        * testsuite/gcc.dg/vect/vect-19.c: Likewise.
        * testsuite/gcc.dg/vect/vect-20.c: Likewise.
        * testsuite/gcc.dg/vect/vect-21.c: Likewise.
        * testsuite/gcc.dg/vect/vect-22.c: Likewise.
        * testsuite/gcc.dg/vect/vect-27.c: Likewise.
        * testsuite/gcc.dg/vect/vect-29.c: Likewise.
        * testsuite/gcc.dg/vect/vect-64.c: Likewise.
        * testsuite/gcc.dg/vect/vect-65.c: Likewise.
        * testsuite/gcc.dg/vect/vect-66.c: Likewise.
        * testsuite/gcc.dg/vect/vect-72.c: Likewise.
        * testsuite/gcc.dg/vect/vect-73.c: Likewise.
        * testsuite/gcc.dg/vect/vect-86.c: Likewise.
        * testsuite/gcc.dg/vect/vect-all.c: Likewise.
        * testsuite/gcc.dg/vect/slp-25.c: Likewise.
        * testsuite/gcc.dg/vect/wrapv-vect-7.c: Likewise.
        * testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c: Likewise.
        * testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c: Likewise.

        * testsuite/gcc.dg/vect/vect-31.c: Removed alignment checks.
        * testsuite/gcc.dg/vect/vect-34.c: Likewise.
        * testsuite/gcc.dg/vect/vect-36.c: Likewise.
        * testsuite/gcc.dg/vect/vect-64.c: Likewise.
        * testsuite/gcc.dg/vect/vect-65.c: Likewise.
        * testsuite/gcc.dg/vect/vect-66.c: Likewise.
        * testsuite/gcc.dg/vect/vect-68.c: Likewise.
        * testsuite/gcc.dg/vect/vect-76.c: Likewise.
        * testsuite/gcc.dg/vect/vect-77.c: Likewise.
        * testsuite/gcc.dg/vect/vect-78.c: Likewise.

        * testsuite/gcc.dg/vect/no-section-anchors-vect-31.c: New test. Like
        the original testcase (without no-section-anchors prefix) but with
        global arrays.
        * testsuite/gcc.dg/vect/no-section-anchors-vect-34.c: Likewise.
        * testsuite/gcc.dg/vect/no-section-anchors-vect-36.c: Likewise.
        * testsuite/gcc.dg/vect/no-section-anchors-vect-64.c: Likewise.
        * testsuite/gcc.dg/vect/no-section-anchors-vect-65.c: Likewise.
        * testsuite/gcc.dg/vect/no-section-anchors-vect-66.c: Likewise.
        * testsuite/gcc.dg/vect/no-section-anchors-vect-68.c: Likewise.
        * testsuite/gcc.dg/vect/vect-77-global.c: Likewise.
        * testsuite/gcc.dg/vect/vect-78-global.c: Likewise.

        * testsuite/gcc.dg/vect/vect-77-alignchecks.c: New test. Like the
        original testcase (without no-section-anchors prefix) but fix
        alignment checks to also consider unaligned_stack targets.
        * testsuite/gcc.dg/vect/vect-78-alignchecks.c: Likewise.

        * testsuite/gcc.dg/vect/no-scevccp-outer-6.c: xfail on
        unaligned_stack targets.
        * testsuite/gcc.dg/vect/no-scevccp-outer-6-global.c: New test. Like
        no-scevccp-outer-6.c, but with global arrays. xfail for now.

(See attached file: alignmentfix.txt)

> I understand the performance cost, but
> correctness has to trump performance.  ABIs designed to support vector
> instructions will probably ensure that STACK_BOUNDARY is large enough
> that this is not an issue.
>
> What we probably want in the long term is for the compiler (including
> the vectorizer) to be able to generate variables with arbitrary
> alignment in all functions.  I would expect the implementation of that
> would be relatively simple; in the function prologue, you would notice
> that this is a function containing a variable of alignment N (greater
> than STACK_BOUNDARY) and generate fixup code to ensure that the stack
> was so aligned.  So, the cost would only be borne by functions with
> variables requiring large alignment, and, even there, it would just be a few
> instructions in the prologue.  Perhaps this is as simple as setting
> PREFERRED_STACK_BOUNDARY to max(user-specified-preferred-stack-boundary,
> biggest-alignment-used-in-this-function) at the start of generating RTL
> for a function.
>
> Certainly, I also think that the compiler should generate an error if it
> cannot honor an alignment directive.  For example, in PR 16660, we
> should generate an error on the declaration of "tmp" if we cannot
> actually align it on a 16-byte boundary.
>
> Thanks,
>
> --
> Mark Mitchell
> CodeSourcery
> mark@codesourcery.com
> (650) 331-3385 x713

[-- Attachment #2: alignmentfix.txt --]
[-- Type: text/plain, Size: 51240 bytes --]

Index: testsuite/gcc.dg/vect/no-section-anchors-vect-36.c
===================================================================
*** testsuite/gcc.dg/vect/no-section-anchors-vect-36.c	(revision 0)
--- testsuite/gcc.dg/vect/no-section-anchors-vect-36.c	(revision 0)
***************
*** 0 ****
--- 1,48 ----
+ /* { dg-require-effective-target vect_int } */
+ 
+ #include <stdarg.h>
+ #include "tree-vect.h"
+ 
+ #define N 16
+  
+ struct {
+   char ca[N];
+   char cb[N];
+ } s;
+ 
+ __attribute__ ((noinline))
+ int main1 ()
+ {  
+   int i;
+ 
+   for (i = 0; i < N; i++)
+     {
+       s.cb[i] = 3*i;
+     }
+ 
+   for (i = 0; i < N; i++)
+     {
+       s.ca[i] = s.cb[i];
+     }
+ 
+   /* check results:  */
+   for (i = 0; i < N; i++)
+     {
+       if (s.ca[i] != s.cb[i])
+         abort ();
+     }
+ 
+   return 0;
+ }
+ 
+ int main (void)
+ { 
+   check_vect ();
+   
+   return main1 ();
+ } 
+ 
+ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/slp-25.c
===================================================================
*** testsuite/gcc.dg/vect/slp-25.c	(revision 129723)
--- testsuite/gcc.dg/vect/slp-25.c	(working copy)
***************
*** 7,17 ****
  
  /* Unaligned stores.  */
  
  int main1 (int n)
  {
    int i;
-   int ia[N+1];
-   short sa[N+1];
  
    for (i = 1; i <= N/2; i++)
      {
--- 7,18 ----
  
  /* Unaligned stores.  */
  
+ int ia[N+1];
+ short sa[N+1];
+ 
  int main1 (int n)
  {
    int i;
  
    for (i = 1; i <= N/2; i++)
      {
Index: testsuite/gcc.dg/vect/vect-34.c
===================================================================
*** testsuite/gcc.dg/vect/vect-34.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-34.c	(working copy)
***************
*** 8,18 ****
  __attribute__ ((noinline))
  int main1 ()
  {  
    struct {
      char ca[N];
    } s;
    char cb[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
-   int i;
  
    for (i = 0; i < N; i++)
      {
--- 8,18 ----
  __attribute__ ((noinline))
  int main1 ()
  {  
+   int i;
    struct {
      char ca[N];
    } s;
    char cb[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
  
    for (i = 0; i < N; i++)
      {
*************** int main (void)
*** 37,41 ****
  } 
  
  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */
--- 37,40 ----
Index: testsuite/gcc.dg/vect/vect-17.c
===================================================================
*** testsuite/gcc.dg/vect/vect-17.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-17.c	(working copy)
***************
*** 5,14 ****
  
  #define N 64
  
- __attribute__ ((noinline)) int
- main1 ()
- {
-   int i;
    int ia[N];
    int ib[N]= 
      {1,1,0,0,1,0,1,0,
--- 5,10 ----
*************** main1 ()
*** 72,77 ****
--- 68,77 ----
       1,1,0,0,1,0,1,0,
       1,1,0,0,1,0,1,0};
  
+ __attribute__ ((noinline)) int
+ main1 ()
+ {
+   int i;
    /* Check ints.  */
  
    for (i = 0; i < N; i++)
Index: testsuite/gcc.dg/vect/vect-76.c
===================================================================
*** testsuite/gcc.dg/vect/vect-76.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-76.c	(working copy)
*************** int main (void)
*** 71,75 ****
  
  
  /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail vect_no_align } } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */
--- 71,74 ----
Index: testsuite/gcc.dg/vect/no-scevccp-outer-6-global.c
===================================================================
*** testsuite/gcc.dg/vect/no-scevccp-outer-6-global.c	(revision 0)
--- testsuite/gcc.dg/vect/no-scevccp-outer-6-global.c	(revision 0)
***************
*** 0 ****
--- 1,58 ----
+ /* { dg-require-effective-target vect_int } */
+ 
+ #include <stdarg.h>
+ #include "tree-vect.h"
+ 
+ #define N 40
+ 
+ int a[N];
+ 
+ __attribute__ ((noinline)) int
+ foo (int * __restrict__ b, int k){
+   int i,j;
+   int sum,x;
+ 
+   for (i = 0; i < N; i++) {
+     sum = b[i];
+     for (j = 0; j < N; j++) {
+       sum += j;
+     }
+     a[i] = sum;
+   }
+   
+   return a[k];
+ }
+ 
+ int main (void)
+ {
+   int i,j;
+   int sum;
+   int b[N];
+   int a[N];
+ 
+   check_vect ();
+ 
+   for (i=0; i<N; i++)
+     b[i] = i + 2;
+ 
+   for (i=0; i<N; i++)
+     a[i] = foo (b,i);
+ 
+     /* check results:  */
+   for (i=0; i<N; i++)
+     {
+       sum = b[i];
+       for (j = 0; j < N; j++){
+         sum += j;
+       }
+       if (a[i] != sum)
+         abort();
+     }
+ 
+   return 0;
+ }
+ 
+ /* "Too many BBs in loop"  */
+ /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail *-*-* } } } */
+ /* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected" 1 "vect" { xfail *-*-* } } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c
===================================================================
*** testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c	(revision 129723)
--- testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c	(working copy)
*************** struct s{
*** 18,27 ****
    struct t e;   /* unaligned (offset 2N+4N+4 B) */
  };
   
  int main1 ()
  {  
    int i;
-   struct s tmp;
  
    /* unaligned */
    for (i = 0; i < N/2; i++)
--- 18,28 ----
    struct t e;   /* unaligned (offset 2N+4N+4 B) */
  };
   
+ struct s tmp;
+ 
  int main1 ()
  {  
    int i;
  
    /* unaligned */
    for (i = 0; i < N/2; i++)
Index: testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c
===================================================================
*** testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c	(revision 129723)
--- testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c	(working copy)
*************** struct s{
*** 18,27 ****
    struct t e;   /* unaligned (offset 2N+4N+4 B) */
  };
   
  int main1 ()
  {  
    int i;
-   struct s tmp;
  
    /* unaligned */
    for (i = 0; i < N/2; i++)
--- 18,28 ----
    struct t e;   /* unaligned (offset 2N+4N+4 B) */
  };
   
+ struct s tmp;
+ 
  int main1 ()
  {  
    int i;
  
    /* unaligned */
    for (i = 0; i < N/2; i++)
Index: testsuite/gcc.dg/vect/vect-77-global.c
===================================================================
*** testsuite/gcc.dg/vect/vect-77-global.c	(revision 0)
--- testsuite/gcc.dg/vect/vect-77-global.c	(revision 0)
***************
*** 0 ****
--- 1,53 ----
+ /* { dg-require-effective-target vect_int } */
+ 
+ #include <stdarg.h>
+ #include "tree-vect.h"
+ 
+ #define N 8
+ #define OFF 8
+ 
+ /* Check handling of accesses for which the "initial condition" -
+    the expression that represents the first location accessed - is
+    more involved than just an ssa_name.  */
+ 
+ int ib[N+OFF] __attribute__ ((__aligned__(16))) = {0, 1, 3, 5, 7, 11, 13, 17, 0, 2, 6, 10, 14, 22, 26, 34};
+ int ia[N];
+ 
+ __attribute__ ((noinline))
+ int main1 (int *ib, int off)
+ {
+   int i;
+ 
+   for (i = 0; i < N; i++)
+     {
+       ia[i] = ib[i+off];
+     }
+ 
+ 
+   /* check results:  */
+   for (i = 0; i < N; i++)
+     {
+      if (ia[i] != ib[i+off])
+         abort ();
+     }
+ 
+   return 0;
+ }
+ 
+ int main (void)
+ {
+   check_vect ();
+ 
+   main1 (ib, 8);
+   return 0;
+ }
+ 
+ /* For targets that don't support misaligned loads we version for the load.
+    (The store is aligned).  */
+ /* Requires versioning for aliasing.  */
+ 
+ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target vect_no_align } } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/vect-68.c
===================================================================
*** testsuite/gcc.dg/vect/vect-68.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-68.c	(working copy)
*************** int main (void)
*** 86,91 ****
  } 
  
  /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */
--- 86,89 ----
Index: testsuite/gcc.dg/vect/vect-18.c
===================================================================
*** testsuite/gcc.dg/vect/vect-18.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-18.c	(working copy)
***************
*** 5,14 ****
  
  #define N 64
  
- __attribute__ ((noinline)) int
- main1 ()
- {
-   int i;
    int ia[N];
    int ib[N]= 
      {1,1,0,0,1,0,1,0,
--- 5,10 ----
*************** main1 ()
*** 71,76 ****
--- 67,76 ----
       1,1,0,0,1,0,1,0,
       1,1,0,0,1,0,1,0};
  
+ __attribute__ ((noinline)) int
+ main1 ()
+ {
+   int i;
    /* Check ints.  */
  
    for (i = 0; i < N; i++)
Index: testsuite/gcc.dg/vect/vect-77.c
===================================================================
*** testsuite/gcc.dg/vect/vect-77.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-77.c	(working copy)
*************** int main (void)
*** 42,52 ****
    return 0;
  }
  
- /* For targets that don't support misaligned loads we version for the load.
-    (The store is aligned).  */
- 
  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } } */
- /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target vect_no_align } } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */
--- 42,46 ----
Index: testsuite/gcc.dg/vect/vect-2.c
===================================================================
*** testsuite/gcc.dg/vect/vect-2.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-2.c	(working copy)
***************
*** 5,15 ****
  
  #define N 16
  
  __attribute__ ((noinline)) 
  int main1 ()
  {  
-   char cb[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
-   char ca[N];
    int i;
  
    for (i = 0; i < N; i++)
--- 5,16 ----
  
  #define N 16
  
+ char cb[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+ char ca[N];
+ 
  __attribute__ ((noinline)) 
  int main1 ()
  {  
    int i;
  
    for (i = 0; i < N; i++)
Index: testsuite/gcc.dg/vect/vect-27.c
===================================================================
*** testsuite/gcc.dg/vect/vect-27.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-27.c	(working copy)
***************
*** 7,18 ****
  
  /* unaligned load.  */
  
  __attribute__ ((noinline))
  int main1 ()
  {
    int i;
-   int ia[N];
-   int ib[N+1];
  
    for (i=0; i <= N; i++)
      {
--- 7,19 ----
  
  /* unaligned load.  */
  
+ int ia[N];
+ int ib[N+1];
+ 
  __attribute__ ((noinline))
  int main1 ()
  {
    int i;
  
    for (i=0; i <= N; i++)
      {
Index: testsuite/gcc.dg/vect/vect-86.c
===================================================================
*** testsuite/gcc.dg/vect/vect-86.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-86.c	(working copy)
***************
*** 5,15 ****
  
  #define N 16
  
  __attribute__ ((noinline))
  int main1 (int n)
  {
    int i, j, k;
!   int a[N], b[N];
  
    for (i = 0; i < n; i++)
      {
--- 5,17 ----
  
  #define N 16
  
+ int a[N];
+ 
  __attribute__ ((noinline))
  int main1 (int n)
  {
    int i, j, k;
!   int b[N];
  
    for (i = 0; i < n; i++)
      {
Index: testsuite/gcc.dg/vect/vect-36.c
===================================================================
*** testsuite/gcc.dg/vect/vect-36.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-36.c	(working copy)
***************
*** 8,18 ****
  __attribute__ ((noinline))
  int main1 ()
  {  
    struct {
      char ca[N];
      char cb[N];
    } s;
!   int i;
  
    for (i = 0; i < N; i++)
      {
--- 8,19 ----
  __attribute__ ((noinline))
  int main1 ()
  {  
+   int i;
    struct {
      char ca[N];
      char cb[N];
    } s;
! 
  
    for (i = 0; i < N; i++)
      {
*************** int main (void)
*** 42,47 ****
  } 
  
  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */
--- 43,46 ----
Index: testsuite/gcc.dg/vect/vect-19.c
===================================================================
*** testsuite/gcc.dg/vect/vect-19.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-19.c	(working copy)
***************
*** 5,14 ****
  
  #define N 64
  
- __attribute__ ((noinline)) int
- main1 ()
- {
-   int i;
    int ia[N];
    int ib[N]= 
      {1,1,0,0,1,0,1,0,
--- 5,10 ----
*************** main1 ()
*** 71,76 ****
--- 67,76 ----
       1,1,0,0,1,0,1,0,
       1,1,0,0,1,0,1,0};
  
+ __attribute__ ((noinline)) int
+ main1 ()
+ {
+   int i;
    /* Check ints.  */
  
    for (i = 0; i < N; i++)
Index: testsuite/gcc.dg/vect/vect-78.c
===================================================================
*** testsuite/gcc.dg/vect/vect-78.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-78.c	(working copy)
*************** int main1 (int *ib)
*** 24,30 ****
        ia[i] = ib[i+off];
      }
  
- 
    /* check results:  */
    for (i = 0; i < N; i++)
      {
--- 24,29 ----
*************** int main (void)
*** 43,53 ****
    return 0;
  }
  
- /* For targets that don't support misaligned loads we version for the load.
-    (The store is aligned).  */
- 
  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } } */
- /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target vect_no_align } } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */
--- 42,46 ----
Index: testsuite/gcc.dg/vect/no-section-anchors-vect-64.c
===================================================================
*** testsuite/gcc.dg/vect/no-section-anchors-vect-64.c	(revision 0)
--- testsuite/gcc.dg/vect/no-section-anchors-vect-64.c	(revision 0)
***************
*** 0 ****
--- 1,88 ----
+ /* { dg-require-effective-target vect_int } */
+ 
+ #include <stdarg.h>
+ #include "tree-vect.h"
+ 
+ #define N 16
+ 
+ int ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+ int ia[N][4][N+1];
+ int ic[N][N][3][13];
+ int id[N][N][N];
+ 
+ __attribute__ ((noinline))
+ int main1 ()
+ {
+   int i, j;
+ 
+   /* Multidimensional array. Not aligned: vectorizable. */
+   for (i = 0; i < N; i++)
+     {
+       for (j = 0; j < N; j++)
+         {
+            ia[i][1][j] = ib[i];
+         }
+     }
+ 
+   /* Multidimensional array. Aligned: vectorizable. */
+   for (i = 0; i < N; i++)
+     {
+       for (j = 0; j < N; j++)
+         {
+            ic[i][1][1][j] = ib[i];
+         }
+     }
+ 
+   /* Multidimensional array. Not aligned: vectorizable. */
+   for (i = 0; i < N; i++)
+     {
+       for (j = 0; j < N; j++)
+         {
+            id[i][1][j+1] = ib[i];
+         }
+     }
+ 
+   /* check results: */  
+   for (i = 0; i < N; i++)
+     {
+       for (j = 0; j < N; j++)
+         {
+            if (ia[i][1][j] != ib[i])
+               abort();
+         }
+     }
+ 
+   /* check results: */  
+   for (i = 0; i < N; i++)
+     {
+       for (j = 0; j < N; j++)
+         {
+            if (ic[i][1][1][j] != ib[i])
+               abort();
+         }
+     }
+ 
+   /* check results: */  
+   for (i = 0; i < N; i++)
+     {
+       for (j = 0; j < N; j++)
+         {
+            if (id[i][1][j+1] != ib[i])
+               abort();
+         }
+     }
+ 
+   return 0;
+ }
+ 
+ int main (void)
+ { 
+   check_vect ();
+ 
+   return main1 ();
+ }
+ 
+ /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/vect-3.c
===================================================================
*** testsuite/gcc.dg/vect/vect-3.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-3.c	(working copy)
***************
*** 6,23 ****
  
  #define N 20
  
  __attribute__ ((noinline)) int
  main1 ()
  {
    int i;
-   float a[N];
-   float e[N];
-   float b[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
-   float c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
-   float d[N] = {0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30};
-   int ic[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
-   int ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
-   int ia[N];
  
    for (i = 0; i < N; i++)
      {
--- 6,24 ----
  
  #define N 20
  
+ float a[N];
+ float e[N];
+ float b[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+ float c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+ float d[N] = {0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30};
+ int ic[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+ int ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+ int ia[N];
+ 
  __attribute__ ((noinline)) int
  main1 ()
  {
    int i;
  
    for (i = 0; i < N; i++)
      {
Index: testsuite/gcc.dg/vect/vect-78-alignchecks.c
===================================================================
*** testsuite/gcc.dg/vect/vect-78-alignchecks.c	(revision 0)
--- testsuite/gcc.dg/vect/vect-78-alignchecks.c	(revision 0)
***************
*** 0 ****
--- 1,57 ----
+ /* { dg-require-effective-target vect_int } */
+ 
+ #include <stdarg.h>
+ #include "tree-vect.h"
+ 
+ #define N 8
+ #define OFF 8
+ 
+ /* Check handling of accesses for which the "initial condition" -
+    the expression that represents the first location accessed - is
+    more involved than just an ssa_name.  */
+ 
+ int ib[N+OFF] __attribute__ ((__aligned__(16))) = {0, 1, 3, 5, 7, 11, 13, 17, 0, 2, 6, 10, 14, 22, 26, 34};
+ int off = 8;
+ 
+ __attribute__ ((noinline))
+ int main1 (int *ib)
+ {
+   int i;
+   int ia[N];
+ 
+   for (i = 0; i < N; i++)
+     {
+       ia[i] = ib[i+off];
+     }
+ 
+ 
+   /* check results:  */
+   for (i = 0; i < N; i++)
+     {
+      if (ia[i] != ib[i+off])
+         abort ();
+     }
+ 
+   return 0;
+ }
+ 
+ int main (void)
+ {
+   check_vect ();
+ 
+   main1 (ib);
+   return 0;
+ }
+ 
+ /* For targets that don't support misaligned loads we version for the load.
+    The store is aligned if alignment can be forced on the stack.  Otherwise, we
+    need to peel the loop in order to align the store.  For targets that can't
+    align variables using peeling (they don't guarantee natural alignment),
+    versioning the loop is required both for the load and the store.  */
+ 
+ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } } */ 
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { {! vect_no_align} && { unaligned_stack && vector_alignment_reachable } } } } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { { {! unaligned_stack} && vect_no_align } || {unaligned_stack && { {! vector_alignment_reachable} && {! vect_no_align} } } } } } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 2 "vect" { target { { unaligned_stack && { vector_alignment_reachable && vect_no_align } } || {unaligned_stack && { {! vector_alignment_reachable} && vect_no_align } } } } } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/vect-all.c
===================================================================
*** testsuite/gcc.dg/vect/vect-all.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-all.c	(working copy)
*************** fbar2 (float *a)
*** 65,70 ****
--- 65,81 ----
    fcheck_results (a, fresults2);
  } 
  
+ float a[N];
+ float e[N];
+ float b[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+ float c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+ float d[N] = {0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30};
+ int ic[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+ int ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+ int ia[N];
+ char cb[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+ char ca[N];
+ short sa[N];
  
  /* All of the loops below are currently vectorizable.  */
  
*************** __attribute__ ((noinline)) int
*** 72,88 ****
  main1 ()
  {
    int i,j;
-   float a[N];
-   float e[N];
-   float b[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
-   float c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
-   float d[N] = {0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30};
-   int ic[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
-   int ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
-   int ia[N];
-   char cb[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
-   char ca[N];
-   short sa[N];
  
    /* Test 1: copy chars.  */
    for (i = 0; i < N; i++)
--- 83,88 ----
Index: testsuite/gcc.dg/vect/vect-20.c
===================================================================
*** testsuite/gcc.dg/vect/vect-20.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-20.c	(working copy)
***************
*** 5,14 ****
  
  #define N 64
  
- __attribute__ ((noinline)) int
- main1 ()
- {
-   int i;
    int ia[N];
    int ib[N]= 
      {1,1,0,0,1,0,1,0,
--- 5,10 ----
*************** main1 ()
*** 42,47 ****
--- 38,47 ----
       1,1,0,0,1,0,1,0,
       1,1,0,0,1,0,1,0};
  
+ __attribute__ ((noinline)) int
+ main1 ()
+ {
+   int i;
  
    /* Check ints.  */
  
Index: testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
===================================================================
*** testsuite/gcc.dg/vect/no-section-anchors-vect-31.c	(revision 0)
--- testsuite/gcc.dg/vect/no-section-anchors-vect-31.c	(revision 0)
***************
*** 0 ****
--- 1,92 ----
+ /* { dg-require-effective-target vect_int } */
+ 
+ #include <stdarg.h>
+ #include "tree-vect.h"
+ 
+ #define N 32
+ 
+ struct t{
+   int k[N];
+   int l; 
+ };
+   
+ struct s{
+   char a;	/* aligned */
+   char b[N-1];  /* unaligned (offset 1B) */
+   char c[N];    /* aligned (offset NB) */
+   struct t d;   /* aligned (offset 2NB) */
+   struct t e;   /* unaligned (offset 2N+4N+4 B) */
+ };
+  
+ struct s tmp;
+ __attribute__ ((noinline))
+ int main1 ()
+ {  
+   int i;
+ 
+   /* unaligned */
+   for (i = 0; i < N/2; i++)
+     {
+       tmp.b[i] = 5;
+     }
+ 
+   /* check results:  */
+   for (i = 0; i <N/2; i++)
+     {
+       if (tmp.b[i] != 5)
+         abort ();
+     }
+ 
+   /* aligned */
+   for (i = 0; i < N/2; i++)
+     {
+       tmp.c[i] = 6;
+     }
+ 
+   /* check results:  */
+   for (i = 0; i <N/2; i++)
+     {
+       if (tmp.c[i] != 6)
+         abort ();
+     }
+ 
+   /* aligned */
+   for (i = 0; i < N/2; i++)
+     {
+       tmp.d.k[i] = 7;
+     }
+ 
+   /* check results:  */
+   for (i = 0; i <N/2; i++)
+     {
+       if (tmp.d.k[i] != 7)
+         abort ();
+     }
+ 
+   /* unaligned */
+   for (i = 0; i < N/2; i++)
+     {
+       tmp.e.k[i] = 8;
+     }
+ 
+   /* check results:  */
+   for (i = 0; i <N/2; i++)
+     {
+       if (tmp.e.k[i] != 8)
+         abort ();
+     }
+ 
+   return 0;
+ }
+ 
+ int main (void)
+ { 
+   check_vect ();
+   
+   return main1 ();
+ } 
+ 
+ /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/no-section-anchors-vect-65.c
===================================================================
*** testsuite/gcc.dg/vect/no-section-anchors-vect-65.c	(revision 0)
--- testsuite/gcc.dg/vect/no-section-anchors-vect-65.c	(revision 0)
***************
*** 0 ****
--- 1,85 ----
+ /* { dg-require-effective-target vect_int } */
+ 
+ #include <stdarg.h>
+ #include "tree-vect.h"
+ 
+ #define N 16
+ #define M 4
+ 
+ int ib[M][M][N] = {{{0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45},
+                     {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45},
+                     {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45},
+                     {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45}},
+                    {{0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45},
+                     {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45},
+                     {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45},
+                     {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45}},
+                    {{0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45},
+                     {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45},
+                     {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45},
+                     {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45}},
+                    {{0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45},
+                     {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45},
+                     {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45},
+                     {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45}}};
+ int ia[M][M][N];
+ int ic[N];
+ 
+ __attribute__ ((noinline))
+ int main1 ()
+ {
+   int i, j;
+ 
+   /* Multidimensional array. Aligned. The "inner" dimensions
+      are invariant in the inner loop. Load and store. */
+   for (i = 0; i < M; i++)
+     {
+       for (j = 0; j < N; j++)
+         {
+            ia[i][1][j] = ib[2][i][j];
+         }
+     }
+ 
+   /* check results: */  
+   for (i = 0; i < M; i++)
+     {
+       for (j = 0; j < N; j++)
+         {
+            if (ia[i][1][j] != ib[2][i][j])
+               abort();
+         }
+     }
+ 
+   /* Multidimensional array. Aligned. The "inner" dimensions
+      are invariant in the inner loop. Load. */
+   for (i = 0; i < M; i++)
+     {
+       for (j = 0; j < N; j++)
+         {
+            ic[j] = ib[2][i][j];
+         }
+     }
+ 
+   /* check results: */
+   for (i = 0; i < M; i++)
+     {
+       for (j = 0; j < N; j++)
+         {
+            if (ic[j] != ib[2][i][j])
+               abort();
+         }
+     }
+ 
+   return 0;
+ }
+ 
+ int main (void)
+ { 
+   check_vect ();
+ 
+   return main1 ();
+ }
+ 
+ /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/vect-4.c
===================================================================
*** testsuite/gcc.dg/vect/vect-4.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-4.c	(working copy)
***************
*** 5,17 ****
  
  #define N 20
  
  __attribute__ ((noinline)) int
  main1 ()
  {
    int i;
-   float a[N];
-   float b[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,48,51,54,57};
-   float c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19};
  
    for (i = 0; i < N; i++)
      {
--- 5,18 ----
  
  #define N 20
  
+ float a[N];
+ float b[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,48,51,54,57};
+ float c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19};
+ 
  __attribute__ ((noinline)) int
  main1 ()
  {
    int i;
  
    for (i = 0; i < N; i++)
      {
Index: testsuite/gcc.dg/vect/vect-77-alignchecks.c
===================================================================
*** testsuite/gcc.dg/vect/vect-77-alignchecks.c	(revision 0)
--- testsuite/gcc.dg/vect/vect-77-alignchecks.c	(revision 0)
***************
*** 0 ****
--- 1,56 ----
+ /* { dg-require-effective-target vect_int } */
+ 
+ #include <stdarg.h>
+ #include "tree-vect.h"
+ 
+ #define N 8
+ #define OFF 8
+ 
+ /* Check handling of accesses for which the "initial condition" -
+    the expression that represents the first location accessed - is
+    more involved than just an ssa_name.  */
+ 
+ int ib[N+OFF] __attribute__ ((__aligned__(16))) = {0, 1, 3, 5, 7, 11, 13, 17, 0, 2, 6, 10, 14, 22, 26, 34};
+ 
+ __attribute__ ((noinline))
+ int main1 (int *ib, int off)
+ {
+   int i;
+   int ia[N];
+ 
+   for (i = 0; i < N; i++)
+     {
+       ia[i] = ib[i+off];
+     }
+ 
+ 
+   /* check results:  */
+   for (i = 0; i < N; i++)
+     {
+      if (ia[i] != ib[i+off])
+         abort ();
+     }
+ 
+   return 0;
+ }
+ 
+ int main (void)
+ {
+   check_vect ();
+ 
+   main1 (ib, 8);
+   return 0;
+ }
+ 
+ /* For targets that don't support misaligned loads we version for the load.
+    The store is aligned if alignment can be forced on the stack.  Otherwise, we
+    need to peel the loop in order to align the store.  For targets that can't
+    align variables using peeling (they don't guarantee natural alignment),
+    versioning the loop is required both for the load and the store.  */
+ 
+ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { {! vect_no_align} && { unaligned_stack && vector_alignment_reachable } } } } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { { {! unaligned_stack} && vect_no_align } || {unaligned_stack && { {! vector_alignment_reachable} && {! vect_no_align} } } } } } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 2 "vect" { target { { unaligned_stack && { vector_alignment_reachable && vect_no_align } } || {unaligned_stack && { {! vector_alignment_reachable} && vect_no_align } } } } } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/vect-21.c
===================================================================
*** testsuite/gcc.dg/vect/vect-21.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-21.c	(working copy)
***************
*** 5,14 ****
  
  #define N 64
  
- __attribute__ ((noinline)) int
- main1 ()
- {
-   int i;
    int ia[N];
    int ib[N]= 
      {1,1,0,0,1,0,1,0,
--- 5,10 ----
*************** main1 ()
*** 71,76 ****
--- 67,76 ----
       1,1,0,0,1,0,1,0,
       1,1,0,0,1,0,1,0};
  
+ __attribute__ ((noinline)) int
+ main1 ()
+ {
+   int i;
    /* Check ints.  */
  
    for (i = 0; i < N; i++)
Index: testsuite/gcc.dg/vect/vect-29.c
===================================================================
*** testsuite/gcc.dg/vect/vect-29.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-29.c	(working copy)
***************
*** 8,19 ****
  
  /* unaligned load.  */
  
  __attribute__ ((noinline))
  int main1 (int off)
  {
    int i;
-   int ia[N];
-   int ib[N+OFF];
  
    for (i = 0; i < N+OFF; i++)
      {
--- 8,20 ----
  
  /* unaligned load.  */
  
+ int ia[N];
+ int ib[N+OFF];
+ 
  __attribute__ ((noinline))
  int main1 (int off)
  {
    int i;
  
    for (i = 0; i < N+OFF; i++)
      {
Index: testsuite/gcc.dg/vect/vect-13.c
===================================================================
*** testsuite/gcc.dg/vect/vect-13.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-13.c	(working copy)
***************
*** 7,18 ****
  
  int a[N];
  int results[N] = {0,1,2,3,0,0,0,0,0,0,0,0,12,13,14,15};
  
  __attribute__ ((noinline))
  int main1()
  {
    int i;
-   int b[N] = {0,1,2,3,-4,-5,-6,-7,-8,-9,-10,-11,12,13,14,15};
  
    /* Max pattern.  */
    for (i = 0; i < N; i++)
--- 7,18 ----
  
  int a[N];
  int results[N] = {0,1,2,3,0,0,0,0,0,0,0,0,12,13,14,15};
+ int b[N] = {0,1,2,3,-4,-5,-6,-7,-8,-9,-10,-11,12,13,14,15};
  
  __attribute__ ((noinline))
  int main1()
  {
    int i;
  
    /* Max pattern.  */
    for (i = 0; i < N; i++)
Index: testsuite/gcc.dg/vect/vect-72.c
===================================================================
*** testsuite/gcc.dg/vect/vect-72.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-72.c	(working copy)
***************
*** 7,18 ****
  
  /* unaligned load.  */
  
  __attribute__ ((noinline))
  int main1 ()
  {
    int i;
-   char ia[N];
-   char ib[N+1];
  
    for (i=0; i < N+1; i++)
      {
--- 7,19 ----
  
  /* unaligned load.  */
  
+ char ia[N];
+ char ib[N+1];
+ 
  __attribute__ ((noinline))
  int main1 ()
  {
    int i;
  
    for (i=0; i < N+1; i++)
      {
Index: testsuite/gcc.dg/vect/no-section-anchors-vect-66.c
===================================================================
*** testsuite/gcc.dg/vect/no-section-anchors-vect-66.c	(revision 0)
--- testsuite/gcc.dg/vect/no-section-anchors-vect-66.c	(revision 0)
***************
*** 0 ****
--- 1,84 ----
+ /* { dg-require-effective-target vect_int } */
+ 
+ #include <stdarg.h>
+ #include "tree-vect.h"
+ 
+ #define N 16
+ 
+ int ib[6] = {0,3,6,9,12,15};
+ int ia[8][5][6];
+ int ic[16][16][5][6];
+ 
+ __attribute__ ((noinline))
+ int main1 ()
+ {
+   int i, j;
+ 
+   /* Multidimensional array. Aligned. */
+   for (i = 0; i < 16; i++)
+     {
+       for (j = 0; j < 4; j++)
+         {
+            ia[2][6][j] = 5;
+         }
+     }
+ 
+   /* check results: */  
+   for (i = 0; i < 16; i++)
+     {
+       for (j = 0; j < 4; j++)
+         {
+            if (ia[2][6][j] != 5)
+                 abort();
+         }
+     }
+   /* Multidimensional array. Aligned. */
+   for (i = 0; i < 16; i++)
+     {
+       for (j = 0; j < 4; j++)
+            ia[3][6][j+2] = 5;
+     }
+ 
+   /* check results: */  
+   for (i = 0; i < 16; i++)
+     {
+       for (j = 2; j < 6; j++)
+         {
+            if (ia[3][6][j] != 5)
+                 abort();
+         }
+     }
+ 
+   /* Multidimensional array. Not aligned. */
+   for (i = 0; i < 16; i++)
+     {
+       for (j = 0; j < 4; j++)
+         {
+            ic[2][1][6][j+1] = 5;
+         }
+     }
+ 
+   /* check results: */  
+   for (i = 0; i < 16; i++)
+     {
+       for (j = 0; j < 4; j++)
+         {
+            if (ic[2][1][6][j+1] != 5)
+                 abort();
+         }
+     }
+ 
+   return 0;
+ }
+ 
+ int main (void)
+ { 
+   check_vect ();
+ 
+   return main1 ();
+ }
+ 
+ /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/vect-5.c
===================================================================
*** testsuite/gcc.dg/vect/vect-5.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-5.c	(working copy)
***************
*** 5,17 ****
  
  #define N 16
  
  __attribute__ ((noinline))
  int main1 ()
  {
    int i, j;
-   float a[N];
-   float c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
-   float d[N] = {0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30};
  
    i = 0;
    j = 0;
--- 5,18 ----
  
  #define N 16
  
+ float a[N];
+ float c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+ float d[N] = {0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30};
+ 
  __attribute__ ((noinline))
  int main1 ()
  {
    int i, j;
  
    i = 0;
    j = 0;
Index: testsuite/gcc.dg/vect/vect-22.c
===================================================================
*** testsuite/gcc.dg/vect/vect-22.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-22.c	(working copy)
***************
*** 6,15 ****
  
  #define N 64
  
- __attribute__ ((noinline)) int
- main1 ()
- {
-   int i;
    int ia[N];
    int ib[N]= 
      {1,1,0,0,1,0,1,0,
--- 6,11 ----
*************** main1 ()
*** 54,59 ****
--- 50,59 ----
       1,1,0,0,1,0,1,0,
       1,1,0,0,1,0,1,0};
  
+ __attribute__ ((noinline)) int
+ main1 ()
+ {
+   int i;
    /* Check ints.  */
  
    for (i = 0; i < N; i++)
Index: testsuite/gcc.dg/vect/vect-64.c
===================================================================
*** testsuite/gcc.dg/vect/vect-64.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-64.c	(working copy)
***************
*** 5,15 ****
  
  #define N 16
  
  __attribute__ ((noinline))
  int main1 ()
  {
    int i, j;
-   int ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
    int ia[N][4][N+1];
    int ic[N][N][3][13];
    int id[N][N][N];
--- 5,16 ----
  
  #define N 16
  
+ int ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+ 
  __attribute__ ((noinline))
  int main1 ()
  {
    int i, j;
    int ia[N][4][N+1];
    int ic[N][N][3][13];
    int id[N][N][N];
*************** int main (void)
*** 82,87 ****
  }
  
  /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */
--- 83,86 ----
Index: testsuite/gcc.dg/vect/vect.exp
===================================================================
*** testsuite/gcc.dg/vect/vect.exp	(revision 129723)
--- testsuite/gcc.dg/vect/vect.exp	(working copy)
*************** lappend DEFAULT_VECTCFLAGS "-O2"
*** 109,115 ****
  dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/nodump-*.\[cS\]]]  \
  	"" $DEFAULT_VECTCFLAGS
  
! lappend DEFAULT_VECTCFLAGS "-fdump-tree-vect-details"
  
  # Main loop.
  dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/pr*.\[cS\]]]  \
--- 109,115 ----
  dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/nodump-*.\[cS\]]]  \
  	"" $DEFAULT_VECTCFLAGS
  
! lappend DEFAULT_VECTCFLAGS "-fdump-tree-vect-details" 
  
  # Main loop.
  dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/pr*.\[cS\]]]  \
Index: testsuite/gcc.dg/vect/vect-31.c
===================================================================
*** testsuite/gcc.dg/vect/vect-31.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-31.c	(working copy)
*************** int main (void)
*** 87,92 ****
  } 
  
  /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */
--- 87,90 ----
Index: testsuite/gcc.dg/vect/vect-73.c
===================================================================
*** testsuite/gcc.dg/vect/vect-73.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-73.c	(working copy)
***************
*** 6,11 ****
--- 6,12 ----
  #define N 16
  
  int ic[N*2];
+ int ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
  
  #define ia (ic+N)
  
*************** __attribute__ ((noinline))
*** 13,19 ****
  int main1 ()
  {
    int i, j;
-   int ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
  
    for (i = 0; i < N; i++)
      {
--- 14,19 ----
Index: testsuite/gcc.dg/vect/vect-78-global.c
===================================================================
*** testsuite/gcc.dg/vect/vect-78-global.c	(revision 0)
--- testsuite/gcc.dg/vect/vect-78-global.c	(revision 0)
***************
*** 0 ****
--- 1,53 ----
+ /* { dg-require-effective-target vect_int } */
+ 
+ #include <stdarg.h>
+ #include "tree-vect.h"
+ 
+ #define N 8
+ #define OFF 8
+ 
+ /* Check handling of accesses for which the "initial condition" -
+    the expression that represents the first location accessed - is
+    more involved than just an ssa_name.  */
+ 
+ int ia[N];
+ int ib[N+OFF] __attribute__ ((__aligned__(16))) = {0, 1, 3, 5, 7, 11, 13, 17, 0, 2, 6, 10, 14, 22, 26, 34};
+ int off = 8;
+ 
+ __attribute__ ((noinline))
+ int main1 (int *ib)
+ {
+   int i;
+ 
+   for (i = 0; i < N; i++)
+     {
+       ia[i] = ib[i+off];
+     }
+ 
+ 
+   /* check results:  */
+   for (i = 0; i < N; i++)
+     {
+      if (ia[i] != ib[i+off])
+         abort ();
+     }
+ 
+   return 0;
+ }
+ 
+ int main (void)
+ {
+   check_vect ();
+ 
+   main1 (ib);
+   return 0;
+ }
+ 
+ /* For targets that don't support misaligned loads we version for the load.
+    (The store is aligned).  */
+ 
+ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target vect_no_align } } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/vect-6.c
===================================================================
*** testsuite/gcc.dg/vect/vect-6.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-6.c	(working copy)
***************
*** 7,21 ****
  
  float results1[N] = {192.00,240.00,288.00,336.00,384.00,432.00,480.00,528.00,0.00};
  float results2[N] = {0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,54.00,120.00,198.00,288.00,390.00,504.00,630.00};
  
  __attribute__ ((noinline))
  int main1 ()
  {
    int i;
-   float a[N] = {0};
-   float e[N] = {0};
-   float b[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
-   float c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
  
    for (i = 0; i < N/2; i++)
      { 
--- 7,21 ----
  
  float results1[N] = {192.00,240.00,288.00,336.00,384.00,432.00,480.00,528.00,0.00};
  float results2[N] = {0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,54.00,120.00,198.00,288.00,390.00,504.00,630.00};
+ float a[N] = {0};
+ float e[N] = {0};
+ float b[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+ float c[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
  
  __attribute__ ((noinline))
  int main1 ()
  {
    int i;
  
    for (i = 0; i < N/2; i++)
      { 
Index: testsuite/gcc.dg/vect/vect-65.c
===================================================================
*** testsuite/gcc.dg/vect/vect-65.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-65.c	(working copy)
*************** int main (void)
*** 80,84 ****
  }
  
  /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */
--- 80,83 ----
Index: testsuite/gcc.dg/vect/no-section-anchors-vect-34.c
===================================================================
*** testsuite/gcc.dg/vect/no-section-anchors-vect-34.c	(revision 0)
--- testsuite/gcc.dg/vect/no-section-anchors-vect-34.c	(revision 0)
***************
*** 0 ****
--- 1,42 ----
+ /* { dg-require-effective-target vect_int } */
+ 
+ #include <stdarg.h>
+ #include "tree-vect.h"
+ 
+ #define N 16
+  
+ struct {
+   char ca[N];
+ } s;
+ char cb[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+ 
+ __attribute__ ((noinline))
+ int main1 ()
+ {  
+   int i;
+ 
+   for (i = 0; i < N; i++)
+     {
+       s.ca[i] = cb[i];
+     }
+ 
+   /* check results:  */
+   for (i = 0; i < N; i++)
+     {
+       if (s.ca[i] != cb[i])
+         abort ();
+     }
+ 
+   return 0;
+ }
+ 
+ int main (void)
+ { 
+   check_vect ();
+   
+   return main1 ();
+ } 
+ 
+ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/wrapv-vect-7.c
===================================================================
*** testsuite/gcc.dg/vect/wrapv-vect-7.c	(revision 129723)
--- testsuite/gcc.dg/vect/wrapv-vect-7.c	(working copy)
***************
*** 5,15 ****
  
  #define N 128
  
  int main1 ()
  {
    int i;
-   short sa[N];
-   short sb[N];
    
    for (i = 0; i < N; i++)
      {
--- 5,16 ----
  
  #define N 128
  
+ short sa[N];
+ short sb[N];
+ 
  int main1 ()
  {
    int i;
    
    for (i = 0; i < N; i++)
      {
Index: testsuite/gcc.dg/vect/no-scevccp-outer-6.c
===================================================================
*** testsuite/gcc.dg/vect/no-scevccp-outer-6.c	(revision 129723)
--- testsuite/gcc.dg/vect/no-scevccp-outer-6.c	(working copy)
*************** int main (void)
*** 51,56 ****
    return 0;
  }
  
! /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail vect_no_align } } } */
  /* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected" 1 "vect" { xfail *-*-* } } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */
--- 51,56 ----
    return 0;
  }
  
! /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { unaligned_stack || vect_no_align } } } } */
  /* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected" 1 "vect" { xfail *-*-* } } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/no-section-anchors-vect-68.c
===================================================================
*** testsuite/gcc.dg/vect/no-section-anchors-vect-68.c	(revision 0)
--- testsuite/gcc.dg/vect/no-section-anchors-vect-68.c	(revision 0)
***************
*** 0 ****
--- 1,92 ----
+ /* { dg-require-effective-target vect_int } */
+ 
+ #include <stdarg.h>
+ #include "tree-vect.h"
+ 
+ #define N 32
+ 
+ struct s{
+   int m;
+   int n[N][N][N];
+ };
+ 
+ struct test1{
+   struct s a; /* array a.n is unaligned */
+   int b;
+   int c;
+   struct s e; /* array e.n is aligned */
+ };
+ 
+ struct test1 tmp1;
+ 
+ __attribute__ ((noinline))
+ int main1 ()
+ {  
+   int i,j;
+ 
+   /* 1. unaligned */
+   for (i = 0; i < N; i++)
+     {
+       tmp1.a.n[1][2][i] = 5;
+     }
+ 
+   /* check results:  */
+   for (i = 0; i <N; i++)
+     {
+       if (tmp1.a.n[1][2][i] != 5)
+         abort ();
+     }
+ 
+   /* 2. aligned */
+   for (i = 3; i < N-1; i++)
+     {
+       tmp1.a.n[1][2][i] = 6;
+     }
+ 
+   /* check results:  */
+   for (i = 3; i < N-1; i++)
+     {
+       if (tmp1.a.n[1][2][i] != 6)
+         abort ();
+     }
+ 
+   /* 3. aligned */
+   for (i = 0; i < N; i++)
+     {
+       tmp1.e.n[1][2][i] = 7;
+     }
+ 
+   /* check results:  */
+   for (i = 0; i < N; i++)
+     {
+       if (tmp1.e.n[1][2][i] != 7)
+         abort ();
+     }
+ 
+   /* 4. unaligned */
+   for (i = 3; i < N-3; i++)
+     {
+       tmp1.e.n[1][2][i] = 8;
+     }
+  
+   /* check results:  */
+   for (i = 3; i <N-3; i++)
+     {
+       if (tmp1.e.n[1][2][i] != 8)
+         abort ();
+     }
+ 
+   return 0;
+ }
+ 
+ int main (void)
+ { 
+   check_vect ();
+   
+   return main1 ();
+ } 
+ 
+ /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/vect-7.c
===================================================================
*** testsuite/gcc.dg/vect/vect-7.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-7.c	(working copy)
***************
*** 5,16 ****
  
  #define N 128
  
  __attribute__ ((noinline))
  int main1 ()
  {
    int i;
-   short sa[N];
-   short sb[N];
    
    for (i = 0; i < N; i++)
      {
--- 5,17 ----
  
  #define N 128
  
+ short sa[N];
+ short sb[N];
+ 
  __attribute__ ((noinline))
  int main1 ()
  {
    int i;
    
    for (i = 0; i < N; i++)
      {
Index: testsuite/gcc.dg/vect/vect-66.c
===================================================================
*** testsuite/gcc.dg/vect/vect-66.c	(revision 129723)
--- testsuite/gcc.dg/vect/vect-66.c	(working copy)
*************** int main (void)
*** 78,83 ****
  }
  
  /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
- /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */
--- 78,81 ----
Index: testsuite/lib/target-supports.exp
===================================================================
*** testsuite/lib/target-supports.exp	(revision 129723)
--- testsuite/lib/target-supports.exp	(working copy)
*************** proc check_effective_target_vect_unpack 
*** 2109,2114 ****
--- 2109,2135 ----
      return $et_vect_unpack_saved
  }
  
+ # Return 1 if the target plus current options does not guarantee
+ # that its STACK_BOUNDARY is >= the required vector alignment.
+ #
+ # This won't change for different subtargets so cache the result.
+ 
+ proc check_effective_target_unaligned_stack { } {
+     global et_unaligned_stack_saved
+ 
+     if [info exists et_unaligned_stack_saved] {
+         verbose "check_effective_target_unaligned_stack: using cached result" 2
+     } else {
+         set et_unaligned_stack_saved 0
+         if { ( [istarget i?86-*-*] || [istarget x86_64-*-*] )
+           && (! [istarget *-*-darwin*] ) } {
+             set et_unaligned_stack_saved 1
+         }
+     }
+     verbose "check_effective_target_unaligned_stack: returning $et_unaligned_stack_saved" 2
+     return $et_unaligned_stack_saved
+ }
+ 
  # Return 1 if the target plus current options does not support a vector
  # alignment mechanism, 0 otherwise.
  #
Index: tree-vectorizer.c
===================================================================
*** tree-vectorizer.c	(revision 129723)
--- tree-vectorizer.c	(working copy)
*************** vect_can_force_dr_alignment_p (const_tre
*** 1606,1617 ****
    if (TREE_STATIC (decl))
      return (alignment <= MAX_OFILE_ALIGNMENT);
    else
!     /* This is not 100% correct.  The absolute correct stack alignment
!        is STACK_BOUNDARY.  We're supposed to hope, but not assume, that
!        PREFERRED_STACK_BOUNDARY is honored by all translation units.
!        However, until someone implements forced stack alignment, SSE
!        isn't really usable without this.  */  
!     return (alignment <= PREFERRED_STACK_BOUNDARY); 
  }
  
  
--- 1606,1614 ----
    if (TREE_STATIC (decl))
      return (alignment <= MAX_OFILE_ALIGNMENT);
    else
!     /* This used to check PREFERRED_STACK_BOUNDARY, but that is not guaranteed
!        to be honored until someone implements forced stack alignment.  */
!     return (alignment <= STACK_BOUNDARY); 
  }
  
  

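For readers skimming the final hunk, the effect of the tree-vectorizer.c change can be sketched as a small standalone C model. This is only an illustration, not GCC code: the function names and the three constants below are hypothetical, and the numeric values (in bits) are assumed for the example, since the real STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY and MAX_OFILE_ALIGNMENT are target-dependent.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical limits in bits, assumed for illustration only.
   The real macros are defined per target inside GCC.  */
enum
{
  SKETCH_STACK_BOUNDARY = 32,
  SKETCH_PREFERRED_STACK_BOUNDARY = 128,
  SKETCH_MAX_OFILE_ALIGNMENT = 256
};

/* Model of the check before the patch: for non-static (stack) decls
   it trusts the preferred boundary, which is only a hope.  */
bool
sketch_can_force_old (bool is_static, unsigned int alignment)
{
  if (is_static)
    return alignment <= SKETCH_MAX_OFILE_ALIGNMENT;
  return alignment <= SKETCH_PREFERRED_STACK_BOUNDARY;
}

/* Model of the check after the patch: for non-static (stack) decls
   it relies only on the guaranteed boundary.  */
bool
sketch_can_force_new (bool is_static, unsigned int alignment)
{
  if (is_static)
    return alignment <= SKETCH_MAX_OFILE_ALIGNMENT;
  return alignment <= SKETCH_STACK_BOUNDARY;
}

/* Self-check of the model under the assumed values above.  */
void
sketch_self_check (void)
{
  /* A 128-bit request for a local array: accepted before, rejected after.  */
  assert (sketch_can_force_old (false, 128));
  assert (!sketch_can_force_new (false, 128));
  /* Static decls are treated the same by both versions.  */
  assert (sketch_can_force_old (true, 128));
  assert (sketch_can_force_new (true, 128));
}
```

Under these assumed values, a 128-bit alignment request for a local array passes the old check but fails the new one, while static arrays are unaffected. That is presumably why several of the testcases above move their arrays to file scope.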

end of thread, other threads:[~2007-10-30  5:10 UTC | newest]

Thread overview: 6+ messages
2007-10-22  9:17 ping [RFC] [patch] fix PR32893 - forcing alignment >= STACK_BOUNDARY Dorit Nuzman
2007-10-22 14:19 ` H.J. Lu
2007-10-22 18:31   ` Mark Mitchell
2007-10-22 18:37     ` Daniel Jacobowitz
2007-10-22 18:45       ` Mark Mitchell
2007-10-30  7:24     ` Dorit Nuzman
