public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
@ 2005-12-14 15:24 David dot Monniaux at ens dot fr
2005-12-14 15:26 ` [Bug target/25413] " David dot Monniaux at ens dot fr
` (22 more replies)
0 siblings, 23 replies; 25+ messages in thread
From: David dot Monniaux at ens dot fr @ 2005-12-14 15:24 UTC (permalink / raw)
To: gcc-bugs
(The same phenomenon happens in 4.2 subversion alpha.)
Incorrect code is generated when using -ftree-vectorize on the SSE2 Pentium 4
target. (The same code works on the AMD64 64-bit target.)
See attached source code.
$ gcc -O2 -Wall -march=pentium4 -mfpmath=sse -ftree-vectorize
-ftree-vectorizer-verbose=1 gcc_plantage2.c -o ./gcc_plantage2
$ ./gcc_plantage2
will segfault (the same program works perfectly without -ftree-vectorize).
The segfault appears at the second movapd instruction:
movapd .LC0, %xmm0
.L17:
movapd %xmm0, (%eax)
addl $1, %edx
addl $16, %eax
cmpl %edx, %ebx
ja .L17
%eax is at this point equal to 4 modulo 16, which results in a segmentation
fault, since movapd assumes 16-byte alignment.
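To make the arithmetic concrete, here is a small sketch. It is not part of the original report, and the function names are made up for illustration. Because the loop advances %eax by 16 on every iteration, the address modulo 16 never changes, so a pointer that starts at 4 modulo 16 is misaligned for every store and the very first movapd in the loop faults.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes by which ADDR overshoots the previous 16-byte boundary. */
static unsigned misalignment16(uintptr_t addr)
{
    return (unsigned)(addr & 15u);
}

/* Index of the first iteration whose store address is not 16-byte aligned,
   or -1 if every store is aligned.  Models the loop at .L17: one 16-byte
   store per iteration, then the address is advanced by 16.  The loop is
   bottom-tested, so it always runs at least once. */
static long first_faulting_iteration(uintptr_t start, unsigned long iterations)
{
    assert(iterations > 0);
    uintptr_t addr = start;
    for (unsigned long i = 0; i < iterations; i++) {
        if (misalignment16(addr) != 0)
            return (long)i;
        addr += 16;   /* same residue modulo 16 as before */
    }
    return -1;
}
```

With a start address such as 0x1004 the first store (iteration 0) is already misaligned; with 0x1010 no store ever is.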
--
Summary: wrong alignment or incorrect address computation in
vectorized code on Pentium 4 SSE
Product: gcc
Version: 4.0.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: David dot Monniaux at ens dot fr
GCC build triplet: i486-linux-gnu
GCC host triplet: i486-linux-gnu
GCC target triplet: i486-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
@ 2005-12-14 15:26 ` David dot Monniaux at ens dot fr
2005-12-15 12:42 ` dorit at il dot ibm dot com
` (21 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: David dot Monniaux at ens dot fr @ 2005-12-14 15:26 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from David dot Monniaux at ens dot fr 2005-12-14 15:26 -------
Created an attachment (id=10486)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=10486&action=view)
preprocessed C source code (assumes Linux glibc) exhibiting the code generation
bug
Compile this program with -O2 -march=pentium4 -ftree-vectorize on a Pentium 4
and run it. It will segfault. It works perfectly without vectorization.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
2005-12-14 15:26 ` [Bug target/25413] " David dot Monniaux at ens dot fr
@ 2005-12-15 12:42 ` dorit at il dot ibm dot com
2005-12-15 12:50 ` dorit at il dot ibm dot com
` (20 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: dorit at il dot ibm dot com @ 2005-12-15 12:42 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from dorit at il dot ibm dot com 2005-12-15 12:41 -------
The problem is that the vectorizer applies loop-peeling in order to align the
data reference *(m->c+i), and peeling only works correctly if the data is
naturally aligned (aligned on its type size). This is what the vectorizer
currently blindly assumes, but on the Pentium 4 doubles are not necessarily
64-bit aligned.
Coincidentally, Devang and I discussed this issue last week, and Devang actually
committed a patch to the apple-ppc branch that works around the problem
(http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=108214). Devang's patch,
however, will not fix this PR: the patch he committed disables vectorization if
the vectorizer was able to compute the misalignment and discovered that it
is not a multiple of the type size. In this testcase the misalignment is
unknown at compile time.
To fix this problem we need to disable loop-peeling in the vectorizer if we
can't prove that the data is naturally aligned. Alternatively, if we can't
prove it either way, we can peel the loop but control the number of iterations
it will execute using a runtime test (i.e. have the prolog loop iterate the
entire loop-count if at runtime we discover that the data is not naturally
aligned).
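The reasoning above can be sketched in plain C. This is only an illustration of the arithmetic, not vectorizer code; peel_count and prolog_iterations are hypothetical names. Each peeled scalar iteration moves the address by one element, so vector alignment is reachable only when the misalignment is a multiple of the element size.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar iterations to peel so that ADDR + n * ELEM_SIZE becomes a multiple
   of VEC_SIZE, or -1 if no n works.  Alignment is reachable only when the
   misalignment is itself a multiple of ELEM_SIZE, i.e. when the data is
   naturally aligned. */
static long peel_count(uintptr_t addr, unsigned elem_size, unsigned vec_size)
{
    assert(elem_size != 0 && vec_size != 0 && vec_size % elem_size == 0);
    unsigned mis = (unsigned)(addr % vec_size);
    if (mis % elem_size != 0)
        return -1;
    return (long)(((vec_size - mis) % vec_size) / elem_size);
}

/* Runtime-test variant: when alignment is unreachable, the prolog loop runs
   all NITERS iterations and nothing is left for the vector loop. */
static unsigned long prolog_iterations(uintptr_t addr, unsigned elem_size,
                                       unsigned vec_size, unsigned long niters)
{
    long peel = peel_count(addr, elem_size, vec_size);
    if (peel < 0 || (unsigned long)peel > niters)
        return niters;
    return (unsigned long)peel;
}
```

For doubles (8 bytes) and 16-byte vectors, an address that is 8 modulo 16 needs one peeled iteration, while an address that is 4 modulo 16 (as in this PR) can never be aligned by peeling, so the prolog would have to execute the whole loop.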
--
dorit at il dot ibm dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |dorit at il dot ibm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
2005-12-14 15:26 ` [Bug target/25413] " David dot Monniaux at ens dot fr
2005-12-15 12:42 ` dorit at il dot ibm dot com
@ 2005-12-15 12:50 ` dorit at il dot ibm dot com
2006-01-30 22:24 ` [Bug tree-optimization/25413] " pinskia at gcc dot gnu dot org
` (19 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: dorit at il dot ibm dot com @ 2005-12-15 12:50 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from dorit at il dot ibm dot com 2005-12-15 12:50 -------
related discussion: http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
* [Bug tree-optimization/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (2 preceding siblings ...)
2005-12-15 12:50 ` dorit at il dot ibm dot com
@ 2006-01-30 22:24 ` pinskia at gcc dot gnu dot org
2007-04-02 20:52 ` reichelt at gcc dot gnu dot org
` (18 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-01-30 22:24 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from pinskia at gcc dot gnu dot org 2006-01-30 22:24 -------
Confirmed.
--
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Component|target |tree-optimization
Ever Confirmed|0 |1
Keywords| |wrong-code
Last reconfirmed|0000-00-00 00:00:00 |2006-01-30 22:24:29
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
* [Bug tree-optimization/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (3 preceding siblings ...)
2006-01-30 22:24 ` [Bug tree-optimization/25413] " pinskia at gcc dot gnu dot org
@ 2007-04-02 20:52 ` reichelt at gcc dot gnu dot org
2007-04-03 19:22 ` [Bug target/25413] " dorit at il dot ibm dot com
` (17 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: reichelt at gcc dot gnu dot org @ 2007-04-02 20:52 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from reichelt at gcc dot gnu dot org 2007-04-02 21:52 -------
Any news on this one?
The bug makes tree vectorization on Pentium 4 totally useless. :-(
Btw, here's a smaller code snippet for testing. Just compile it with
gcc -O -msse2 -ftree-vectorize
on a Pentium 4 and see the resulting executable segfault:
================================
struct
{
char c;
double d[2];
} a;
int main()
{
int i;
for ( i=0; i<2; ++i )
a.d[i]=0;
return 0;
}
================================
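A sketch of why this small testcase trips the vectorizer, under the assumption that the 32-bit x86 ABI gives a double inside a struct only 4-byte alignment, so that d lands at offset 4. The struct below is a hypothetical stand-in that forces the same layout on any target by using the GCC/Clang packed attribute with explicit padding; it is not the testcase's own struct.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the 32-bit x86 layout of the testcase's struct:
   packed plus 3 explicit padding bytes puts d at offset 4 everywhere. */
struct i386_like {
    char c;
    char pad[3];
    double d[2];
} __attribute__((packed));

/* Misalignment of d[i] relative to a 16-byte boundary, assuming (for the
   sake of the illustration) that the object itself starts on one. */
static size_t d_misalignment(size_t i)
{
    assert(i < 2);
    return (offsetof(struct i386_like, d) + i * sizeof(double)) % 16;
}
```

With 8-byte doubles the two elements sit at 4 and 12 modulo 16. Neither residue is a multiple of 8, so no number of peeled iterations brings a store to a 16-byte boundary.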
--
reichelt at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |reichelt at gcc dot gnu dot
| |org
Keywords| |monitored
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (4 preceding siblings ...)
2007-04-02 20:52 ` reichelt at gcc dot gnu dot org
@ 2007-04-03 19:22 ` dorit at il dot ibm dot com
2007-07-01 10:00 ` dorit at gcc dot gnu dot org
` (16 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: dorit at il dot ibm dot com @ 2007-04-03 19:22 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from dorit at il dot ibm dot com 2007-04-03 20:22 -------
So I see Devang had sent a patch for this over a year ago:
http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00167.html
I don't know what ever happened to it.
Maybe you want to give it a try? (you may need to implement the new target hook
for Pentium4). If you have problems applying the patch (it is a bit old) - I
could try to help update the patch (not before next week though).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (5 preceding siblings ...)
2007-04-03 19:22 ` [Bug target/25413] " dorit at il dot ibm dot com
@ 2007-07-01 10:00 ` dorit at gcc dot gnu dot org
2007-07-02 8:30 ` patchapp at dberlin dot org
` (15 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-01 10:00 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from dorit at gcc dot gnu dot org 2007-07-01 09:59 -------
I'm testing the following patch. It seems to fix the two testcases in this PR on
Pentium 4; I still need to bootstrap etc., and check the powerpc bits.
Index: gcc/targhooks.c
===================================================================
*** gcc/targhooks.c (revision 126162)
--- gcc/targhooks.c (working copy)
*************** tree default_mangle_decl_assembler_name
*** 634,637 ****
--- 634,653 ----
return id;
}
+ bool
+ default_builtin_vector_alignment_reachable (tree type, bool is_packed)
+ {
+ if (is_packed)
+ return false;
+
+ /* Assuming that types whose size is > pointer-size are not guaranteed to be
+ naturally aligned. */
+ if (tree_int_cst_compare (TYPE_SIZE (type), bitsize_int (POINTER_SIZE)) > 0)
+ return false;
+
+ /* Assuming that types whose size is <= pointer-size
+ are naturally aligned. */
+ return true;
+ }
+
#include "gt-targhooks.h"
Index: gcc/targhooks.h
===================================================================
*** gcc/targhooks.h (revision 126162)
--- gcc/targhooks.h (working copy)
*************** extern tree default_builtin_vectorized_c
*** 62,67 ****
--- 62,69 ----
extern tree default_builtin_reciprocal (enum built_in_function, bool, bool);
+ extern bool default_builtin_vector_alignment_reachable (tree, bool);
+
/* These are here, and not in hooks.[ch], because not all users of
hooks.h include tm.h, and thus we don't have CUMULATIVE_ARGS. */
Index: gcc/tree.h
===================================================================
*** gcc/tree.h (revision 126162)
--- gcc/tree.h (working copy)
*************** extern tree get_inner_reference (tree, H
*** 4327,4332 ****
--- 4327,4338 ----
tree *, enum machine_mode *, int *, int *,
bool);
+ /* Given an expression EXP that may be a COMPONENT_REF or an ARRAY_REF,
+ look for whether EXP or any nested component-refs within EXP is marked
+ as PACKED. */
+
+ extern bool contains_packed_reference (tree exp);
+
/* Return 1 if T is an expression that get_inner_reference handles. */
extern int handled_component_p (tree);
Index: gcc/target.h
===================================================================
*** gcc/target.h (revision 126162)
--- gcc/target.h (working copy)
*************** struct gcc_target
*** 413,418 ****
--- 413,422 ----
element-by-element products for the odd elements. */
tree (* builtin_mul_widen_even) (tree);
tree (* builtin_mul_widen_odd) (tree);
+
+ /* Return true if vector alignment is reachable (by peeling N
+ iterations) for the given type. */
+ bool (* vector_alignment_reachable) (tree, bool);
} vectorize;
/* The initial value of target_flags. */
Index: gcc/testsuite/gcc.dg/vect/vect-align-1.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/vect-align-1.c (revision 0)
--- gcc/testsuite/gcc.dg/vect/vect-align-1.c (revision 0)
***************
*** 0 ****
--- 1,50 ----
+ /* { dg-require-effective-target vect_int } */
+
+ #include <stdlib.h>
+ #include <stdarg.h>
+ #include "tree-vect.h"
+
+ /* Compile time known misalignment. Cannot use loop peeling to align
+ the store. */
+
+ #define N 16
+
+ struct foo {
+ char x;
+ int y[N];
+ } __attribute__((packed));
+
+ int
+ main1 (struct foo * __restrict__ p)
+ {
+ int i;
+ int x[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+
+ for (i = 0; i < N; i++)
+ {
+ p->y[i] = x[i];
+ }
+
+ /* check results: */
+ for (i = 0; i < N; i++)
+ {
+ if (p->y[i] != x[i])
+ abort ();
+ }
+ return 0;
+ }
+
+
+ int main (void)
+ {
+ int i;
+ struct foo *p = malloc (2*sizeof (struct foo));
+ check_vect ();
+
+ main1 (p);
+ return 0;
+ }
+
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/pr25413a.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/pr25413a.c (revision 0)
--- gcc/testsuite/gcc.dg/vect/pr25413a.c (revision 0)
***************
*** 0 ****
--- 1,129 ----
+ /* { dg-require-effective-target vect_double } */
+
+ #include <stdarg.h>
+ #include "tree-vect.h"
+
+ #define N 8
+
+ typedef unsigned int size_t;
+
+ extern void *malloc (size_t __size) __attribute__ ((__nothrow__)) __attribute__ ((__malloc__));
+
+ typedef double num_t;
+ static const num_t num__infty = ((num_t)1.0)/((num_t)0.0);
+
+ struct oct_tt;
+ typedef struct oct_tt oct_t;
+
+ typedef unsigned int var_t;
+ typedef enum {
+ OCT_EMPTY = 0,
+ OCT_NORMAL = 1,
+ OCT_CLOSED = 2
+ } oct_state;
+
+ struct oct_tt {
+ var_t n;
+
+ int ref;
+
+ oct_state state;
+ struct oct_tt* closed;
+
+ num_t* c;
+ };
+
+ void* octfapg_mm_malloc (size_t t);
+ oct_t* octfapg_alloc (var_t n);
+ oct_t* octfapg_full_copy (oct_t* m);
+
+ struct mmalloc_tt;
+ typedef struct mmalloc_tt mmalloc_t;
+
+ struct mmalloc_tt
+ {
+ int id;
+
+ int nb_alloc;
+ int nb_realloc;
+ int nb_free;
+
+ size_t rem;
+ size_t max;
+ size_t tot;
+
+ };
+
+ typedef struct
+ {
+ size_t size;
+
+ mmalloc_t* mm;
+ int id;
+
+ double dummy;
+
+ } mmheader_t;
+
+ void*
+ octfapg_mm_malloc (size_t t)
+ {
+ char* m = (char*)malloc(t+sizeof(mmheader_t));
+ return m+sizeof(mmheader_t);
+ }
+
+ oct_t* octfapg_empty (var_t n);
+
+ oct_t*
+ octfapg_empty (const var_t n)
+ {
+ oct_t* m;
+ /*octfapg_timing_enter("oct_empty",3);*/
+ m = ((oct_t*) octfapg_mm_malloc (sizeof(oct_t)));
+ m->n = n;
+ m->ref = 1;
+ m->state = OCT_EMPTY;
+ m->closed = (oct_t*)((void *)0);
+ m->c = (num_t*)((void *)0);
+ /*octfapg_timing_exit("oct_empty",3);*/
+ return m;
+ }
+
+ oct_t*
+ octfapg_alloc (const var_t n)
+ {
+ size_t nn = (2*(size_t)(n)*((size_t)(n)+1));
+ oct_t* m;
+ m = octfapg_empty(n);
+ m->c = ((num_t*) octfapg_mm_malloc (sizeof(num_t)*(nn)));
+ ;
+ m->state = OCT_NORMAL;
+ m->closed = (oct_t*)((void *)0);
+ return m;
+ }
+
+ oct_t*
+ octfapg_universe (const var_t n)
+ {
+ oct_t* m;
+ size_t i, nn = (2*(size_t)(n)*((size_t)(n)+1));
+ m = octfapg_alloc(n);
+ for (i=0;i<nn;i++) *(m->c+i) = num__infty;
+ for (i=0;i<2*n;i++) *(m->c+((size_t)(i)+(((size_t)(i)+1)*((size_t)(i)+1))/2)) = (num_t)(0);
+ m->state = OCT_CLOSED;
+ return m;
+ }
+
+ int main (void)
+ {
+ int i;
+ check_vect ();
+
+ oct_t *p = octfapg_universe(10);
+ return 0;
+ }
+
+ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-align-2.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/vect-align-2.c (revision 0)
--- gcc/testsuite/gcc.dg/vect/vect-align-2.c (revision 0)
***************
*** 0 ****
--- 1,46 ----
+ /* { dg-require-effective-target vect_int } */
+ /* { dg-do run } */
+
+ #include <stdlib.h>
+ #include <stdarg.h>
+ #include "tree-vect.h"
+
+ /* Compile time unknown misalignment. Cannot use loop peeling to align
+ the store. */
+
+ #define N 17
+
+ struct foo {
+ char x0;
+ int y[N][N];
+ } __attribute__ ((packed));
+
+ struct foo f2;
+ int z[16] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+
+ void fbar(struct foo *fp)
+ {
+ int i,j;
+ for (i=0; i<N; i++)
+ for (j=0; j<N; j++)
+ f2.y[i][j] = z[i];
+
+ for (i=0; i<N; i++)
+ for (j=0; j<N; j++)
+ if (f2.y[i][j] != z[i])
+ abort ();
+ }
+
+ int main (void)
+ {
+ struct foo *fp = (struct foo *) malloc (2*sizeof (struct foo));
+
+ fbar(fp);
+ return 0;
+ }
+
+
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/pr31699.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/pr31699.c (revision 126162)
--- gcc/testsuite/gcc.dg/vect/pr31699.c (working copy)
*************** int main()
*** 31,35 ****
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_intfloat_cvt } } } */
! /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail vect_no_align } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
--- 31,36 ----
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_intfloat_cvt } } } */
! /* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" } } */
! /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/pr25413.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/pr25413.c (revision 0)
--- gcc/testsuite/gcc.dg/vect/pr25413.c (revision 0)
***************
*** 0 ****
--- 1,37 ----
+ /* { dg-require-effective-target vect_double } */
+
+ #include <stdarg.h>
+ #include "tree-vect.h"
+
+ #define N 8
+
+ struct
+ {
+ char c;
+ double d[N];
+ } a;
+
+ int main1()
+ {
+ int i;
+ for ( i=0; i<N; ++i )
+ a.d[i]=1;
+ return 0;
+ }
+
+ int main (void)
+ {
+ int i;
+ check_vect ();
+
+ main1 ();
+ for (i=0; i<N; i++)
+ if (a.d[i] != 1)
+ abort ();
+ return 0;
+ }
+
+ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 1 "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/expr.c
===================================================================
*** gcc/expr.c (revision 126162)
--- gcc/expr.c (working copy)
*************** get_inner_reference (tree exp, HOST_WIDE
*** 5924,5929 ****
--- 5924,5966 ----
return exp;
}
+ bool
+ contains_packed_reference (tree exp)
+ {
+ bool packed_p = false;
+
+ while (1)
+ {
+ switch (TREE_CODE (exp))
+ {
+ case COMPONENT_REF:
+ {
+ tree field = TREE_OPERAND (exp, 1);
+ packed_p = DECL_PACKED (field)
+ || TYPE_PACKED (TREE_TYPE (field)) /* CHECKME */
+ || TYPE_PACKED (TREE_TYPE (exp));
+ if (packed_p)
+ goto done;
+ }
+ break;
+
+ case BIT_FIELD_REF:
+ case ARRAY_REF:
+ case ARRAY_RANGE_REF:
+ case REALPART_EXPR:
+ case IMAGPART_EXPR:
+ case VIEW_CONVERT_EXPR:
+ break;
+
+ default:
+ goto done;
+ }
+ exp = TREE_OPERAND (exp, 0);
+ }
+ done:
+ return packed_p;
+ }
+
/* Return a tree of sizetype representing the size, in bytes, of the element
of EXP, an ARRAY_REF. */
Index: gcc/tree-vect-analyze.c
===================================================================
*** gcc/tree-vect-analyze.c (revision 126162)
--- gcc/tree-vect-analyze.c (working copy)
*************** Software Foundation, 51 Franklin Street,
*** 25,30 ****
--- 25,31 ----
#include "tm.h"
#include "ggc.h"
#include "tree.h"
+ #include "target.h"
#include "basic-block.h"
#include "diagnostic.h"
#include "tree-flow.h"
*************** vect_verify_datarefs_alignment (loop_vec
*** 1379,1384 ****
--- 1380,1449 ----
}
+ static bool
+ vector_alignment_reachable_p (struct data_reference *dr)
+ {
+ tree stmt = DR_STMT (dr);
+ stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+ tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+
+ if (DR_GROUP_FIRST_DR (stmt_info))
+ {
+ /* For interleaved access we peel only if number of iterations in
+ the prolog loop ({VF - misalignment}), is a multiple of the
+ number of the interleaved accesses. */
+ int elem_size, mis_in_elements;
+ int nelements = TYPE_VECTOR_SUBPARTS (vectype);
+
+ /* FORNOW: handle only known alignment. */
+ if (!known_alignment_for_access_p (dr))
+ return false;
+
+ elem_size = UNITS_PER_SIMD_WORD / nelements;
+ mis_in_elements = DR_MISALIGNMENT (dr) / elem_size;
+
+ if ((nelements - mis_in_elements) % DR_GROUP_SIZE (stmt_info))
+ return false;
+ }
+
+ /* If misalignment is known at compile time then apply peeling
+ only if natural alignment is reachable through peeling. */
+ if (known_alignment_for_access_p (dr) && !aligned_access_p (dr))
+ {
+ HOST_WIDE_INT elmsize =
+ int_cst_value (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
+ if (DR_MISALIGNMENT (dr) % elmsize)
+ {
+ if (vect_print_dump_info (REPORT_DETAILS))
+ {
+ fprintf (vect_dump, "data size =" HOST_WIDE_INT_PRINT_DEC, elmsize);
+ fprintf (vect_dump, ". misalignment = %d. ", DR_MISALIGNMENT (dr));
+ fprintf (vect_dump, "data size does not divide the misalignment.\n");
+ }
+ return false;
+ }
+ }
+
+ if (!known_alignment_for_access_p (dr))
+ {
+ tree type = (TREE_TYPE (DR_REF (dr)));
+ tree ba = DR_BASE_OBJECT (dr);
+ bool is_packed = false;
+
+ if (ba)
+ is_packed = contains_packed_reference (ba);
+
+ if (vect_print_dump_info (REPORT_DETAILS))
+ fprintf (vect_dump, "Unknown misalignment, is_packed = %d",is_packed);
+ if (targetm.vectorize.vector_alignment_reachable (type, is_packed))
+ return true;
+ else
+ return false;
+ }
+
+ return true;
+ }
+
/* Function vect_enhance_data_refs_alignment
This pass will use loop versioning and loop peeling in order to enhance
*************** vect_enhance_data_refs_alignment (loop_v
*** 1540,1572 ****
if (!DR_IS_READ (dr) && !aligned_access_p (dr))
{
! if (DR_GROUP_FIRST_DR (stmt_info))
! {
! /* For interleaved access we peel only if number of iterations in
! the prolog loop ({VF - misalignment}), is a multiple of the
! number of the interleaved accesses. */
! int elem_size, mis_in_elements;
! tree vectype = STMT_VINFO_VECTYPE (stmt_info);
! int nelements = TYPE_VECTOR_SUBPARTS (vectype);
!
! /* FORNOW: handle only known alignment. */
! if (!known_alignment_for_access_p (dr))
! {
! do_peeling = false;
! break;
! }
!
! elem_size = UNITS_PER_SIMD_WORD / nelements;
! mis_in_elements = DR_MISALIGNMENT (dr) / elem_size;
!
! if ((nelements - mis_in_elements) % DR_GROUP_SIZE (stmt_info))
! {
! do_peeling = false;
! break;
! }
! }
! dr0 = dr;
! do_peeling = true;
break;
}
}
--- 1605,1615 ----
if (!DR_IS_READ (dr) && !aligned_access_p (dr))
{
! do_peeling = vector_alignment_reachable_p (dr);
! if (do_peeling)
! dr0 = dr;
! if (!do_peeling && vect_print_dump_info (REPORT_DETAILS))
! fprintf (vect_dump, "vector alignment may not be reachable");
break;
}
}
Index: gcc/target-def.h
===================================================================
*** gcc/target-def.h (revision 126162)
--- gcc/target-def.h (working copy)
*************** Foundation, 51 Franklin Street, Fifth Fl
*** 356,361 ****
--- 356,364 ----
default_builtin_vectorized_conversion
#define TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_EVEN 0
#define TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_ODD 0
+ #define TARGET_VECTOR_ALIGNMENT_REACHABLE \
+ default_builtin_vector_alignment_reachable
+
#define TARGET_VECTORIZE \
{ \
*************** Foundation, 51 Franklin Street, Fifth Fl
*** 363,369 ****
TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION, \
TARGET_VECTORIZE_BUILTIN_CONVERSION, \
TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_EVEN, \
! TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_ODD \
}
#define TARGET_DEFAULT_TARGET_FLAGS 0
--- 366,373 ----
TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION, \
TARGET_VECTORIZE_BUILTIN_CONVERSION, \
TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_EVEN, \
! TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_ODD, \
! TARGET_VECTOR_ALIGNMENT_REACHABLE \
}
#define TARGET_DEFAULT_TARGET_FLAGS 0
Index: gcc/config/rs6000/rs6000.c
===================================================================
*** gcc/config/rs6000/rs6000.c (revision 126162)
--- gcc/config/rs6000/rs6000.c (working copy)
*************** static tree rs6000_builtin_mul_widen_odd
*** 717,722 ****
--- 717,723 ----
static tree rs6000_builtin_conversion (enum tree_code, tree);
static void def_builtin (int, const char *, tree, int);
+ static bool rs6000_vector_alignment_reachable (tree, bool);
static void rs6000_init_builtins (void);
static rtx rs6000_expand_unop_builtin (enum insn_code, tree, rtx);
static rtx rs6000_expand_binop_builtin (enum insn_code, tree, rtx);
*************** static const char alt_reg_names[][8] =
*** 984,989 ****
--- 985,993 ----
#undef TARGET_VECTORIZE_BUILTIN_CONVERSION
#define TARGET_VECTORIZE_BUILTIN_CONVERSION rs6000_builtin_conversion
+ #undef TARGET_VECTOR_ALIGNMENT_REACHABLE
+ #define TARGET_VECTOR_ALIGNMENT_REACHABLE rs6000_vector_alignment_reachable
+
#undef TARGET_INIT_BUILTINS
#define TARGET_INIT_BUILTINS rs6000_init_builtins
*************** rs6000_builtin_mul_widen_odd (tree type)
*** 1806,1811 ****
--- 1810,1844 ----
}
}
+
+ /* Return true iff a data reference of TYPE can reach vector alignment (16)
+ after applying N number of iterations. This routine does not determine
+ how many iterations are required to reach the desired alignment. */
+
+ static bool
+ rs6000_vector_alignment_reachable (tree type, bool is_packed)
+ {
+ if (is_packed)
+ return false;
+
+ if (TARGET_MACHO)
+ {
+ if (TARGET_32BIT)
+ {
+ if (rs6000_alignment_flags == MASK_ALIGN_NATURAL)
+ return true;
+
+ if (rs6000_alignment_flags == MASK_ALIGN_POWER)
+ return true;
+ }
+ return false;
+ }
+
+ /* Assuming that all other types are naturally aligned. CHECKME! */
+ return true;
+ }
+
+
/* Handle generic options of the form -mfoo=yes/no.
NAME is the option name.
VALUE is the option value.
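For readers who do not want to trace the hunks, the policy of the default hook in the patch above can be modelled outside GCC roughly as follows. This is a sketch only: alignment_reachable_model and its size parameters are hypothetical and replace the tree-based TYPE_SIZE and POINTER_SIZE tests with plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Plain-integer model of default_builtin_vector_alignment_reachable:
   packed data is never assumed reachable, and types wider than a pointer
   are not assumed to be naturally aligned. */
static bool alignment_reachable_model(size_t type_bits, size_t pointer_bits,
                                      bool is_packed)
{
    assert(type_bits != 0 && pointer_bits != 0);
    if (is_packed)
        return false;
    if (type_bits > pointer_bits)
        return false;
    return true;   /* size <= pointer size: assumed naturally aligned */
}
```

Under this model a 64-bit double on a 32-bit Pentium 4 target is reported as not reachable, so the vectorizer no longer peels for it, whereas the same double on a 64-bit target is still considered reachable.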
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (6 preceding siblings ...)
2007-07-01 10:00 ` dorit at gcc dot gnu dot org
@ 2007-07-02 8:30 ` patchapp at dberlin dot org
2007-07-12 14:42 ` dorit at gcc dot gnu dot org
` (14 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: patchapp at dberlin dot org @ 2007-07-02 8:30 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from patchapp at dberlin dot org 2007-07-02 08:30 -------
Subject: Bug number PR25413
A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is
http://gcc.gnu.org/ml/gcc-patches/2007-07/msg00082.html
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (7 preceding siblings ...)
2007-07-02 8:30 ` patchapp at dberlin dot org
@ 2007-07-12 14:42 ` dorit at gcc dot gnu dot org
2007-07-13 0:13 ` dirtyepic at gentoo dot org
` (13 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-12 14:42 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from dorit at gcc dot gnu dot org 2007-07-12 14:42 -------
Subject: Bug 25413
Author: dorit
Date: Thu Jul 12 14:42:08 2007
New Revision: 126591
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=126591
Log:
2007-07-12 Dorit Nuzman <dorit@il.ibm.com>
Devang Patel <dpatel@apple.com>
PR tree-optimization/25413
* targhooks.c (default_builtin_vector_alignment_reachable): New.
* targhooks.h (default_builtin_vector_alignment_reachable): New.
* tree.h (contains_packed_reference): New.
* expr.c (contains_packed_reference): New.
* tree-vect-analyze.c (vector_alignment_reachable_p): New.
(vect_enhance_data_refs_alignment): Call
vector_alignment_reachable_p.
* target.h (vector_alignment_reachable): New builtin.
* target-def.h (TARGET_VECTOR_ALIGNMENT_REACHABLE): New.
* config/rs6000/rs6000.c (rs6000_vector_alignment_reachable): New.
(TARGET_VECTOR_ALIGNMENT_REACHABLE): Define.
Added:
trunk/gcc/testsuite/gcc.dg/vect/pr25413.c
trunk/gcc/testsuite/gcc.dg/vect/pr25413a.c
trunk/gcc/testsuite/gcc.dg/vect/vect-align-1.c
trunk/gcc/testsuite/gcc.dg/vect/vect-align-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000.c
trunk/gcc/expr.c
trunk/gcc/target-def.h
trunk/gcc/target.h
trunk/gcc/targhooks.c
trunk/gcc/targhooks.h
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/vect/pr31699.c
trunk/gcc/tree-vect-analyze.c
trunk/gcc/tree.h
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (8 preceding siblings ...)
2007-07-12 14:42 ` dorit at gcc dot gnu dot org
@ 2007-07-13 0:13 ` dirtyepic at gentoo dot org
2007-07-16 8:02 ` dorit at gcc dot gnu dot org
` (12 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: dirtyepic at gentoo dot org @ 2007-07-13 0:13 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from dirtyepic at gentoo dot org 2007-07-13 00:13 -------
any chance of a 4.2 backport?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (9 preceding siblings ...)
2007-07-13 0:13 ` dirtyepic at gentoo dot org
@ 2007-07-16 8:02 ` dorit at gcc dot gnu dot org
2007-07-17 11:05 ` reichelt at gcc dot gnu dot org
` (11 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-16 8:02 UTC (permalink / raw)
To: gcc-bugs
------- Comment #11 from dorit at gcc dot gnu dot org 2007-07-16 08:02 -------
(In reply to comment #10)
> any chance of a 4.2 backport?
Sure, as soon as 4.2 gets out of freeze. Unfortunately we missed the 4.2.1
release.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
^ permalink raw reply [flat|nested] 25+ messages in thread
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (10 preceding siblings ...)
2007-07-16 8:02 ` dorit at gcc dot gnu dot org
@ 2007-07-17 11:05 ` reichelt at gcc dot gnu dot org
2007-07-24 9:05 ` dorit at gcc dot gnu dot org
` (10 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: reichelt at gcc dot gnu dot org @ 2007-07-17 11:05 UTC (permalink / raw)
To: gcc-bugs
------- Comment #12 from reichelt at gcc dot gnu dot org 2007-07-17 11:05 -------
*** Bug 19716 has been marked as a duplicate of this bug. ***
--
reichelt at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |bangerth at dealii dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
^ permalink raw reply [flat|nested] 25+ messages in thread
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (11 preceding siblings ...)
2007-07-17 11:05 ` reichelt at gcc dot gnu dot org
@ 2007-07-24 9:05 ` dorit at gcc dot gnu dot org
2007-07-24 19:37 ` dirtyepic at gentoo dot org
` (9 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-24 9:05 UTC (permalink / raw)
To: gcc-bugs
------- Comment #13 from dorit at gcc dot gnu dot org 2007-07-24 09:05 -------
David, can you confirm that this PR can now be closed?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
^ permalink raw reply [flat|nested] 25+ messages in thread
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (12 preceding siblings ...)
2007-07-24 9:05 ` dorit at gcc dot gnu dot org
@ 2007-07-24 19:37 ` dirtyepic at gentoo dot org
2007-07-24 19:40 ` dirtyepic at gentoo dot org
` (8 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: dirtyepic at gentoo dot org @ 2007-07-24 19:37 UTC (permalink / raw)
To: gcc-bugs
------- Comment #14 from dirtyepic at gentoo dot org 2007-07-24 19:37 -------
Created an attachment (id=13965)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13965&action=view)
zlib testcase
> A fix for PR25413 was committed to mainline.
> Ryan, could you please check if it solves the zlib miscompilation?
> Andrew, would you please check if it solves the libgcc miscompilation that you
> are seeing?
Unfortunately it doesn't. I'm still getting unaligned movdqa instructions with
both mainline and the 4.2 patch. Both testcases on this bug now work however,
so maybe the problem lies elsewhere.
I'm attaching a (badly) reduced testcase from inftrees.c in zlib which I
believe shows the bug. Compile with -O -msse2 -ftree-vectorize. On i686 I'm
getting this:
inftrees.lo: file format elf32-i386
Disassembly of section .text:
00000000 <inflate_table>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 53 push %ebx
4: 83 ec 24 sub $0x24,%esp
7: 8b 5d 0c mov 0xc(%ebp),%ebx
a: 8b 4d 10 mov 0x10(%ebp),%ecx
d: 8d 45 d8 lea -0x28(%ebp),%eax
10: 66 0f ef c0 pxor %xmm0,%xmm0
14: 66 0f 7f 00 movdqa %xmm0,(%eax)
18: 66 0f 7f 40 10 movdqa %xmm0,0x10(%eax)
1d: 85 c9 test %ecx,%ecx
1f: 74 16 je 37 <inflate_table+0x37>
21: ba 00 00 00 00 mov $0x0,%edx
26: 0f b7 04 53 movzwl (%ebx,%edx,2),%eax
2a: 66 83 44 45 d8 01 addw $0x1,-0x28(%ebp,%eax,2)
30: 83 c2 01 add $0x1,%edx
33: 39 ca cmp %ecx,%edx
35: 75 ef jne 26 <inflate_table+0x26>
37: b8 0f 00 00 00 mov $0xf,%eax
3c: 8d 55 d8 lea -0x28(%ebp),%edx
3f: 66 83 3c 42 00 cmpw $0x0,(%edx,%eax,2)
44: 75 05 jne 4b <inflate_table+0x4b>
46: 83 e8 01 sub $0x1,%eax
49: 75 f4 jne 3f <inflate_table+0x3f>
4b: 83 c4 24 add $0x24,%esp
4e: 5b pop %ebx
4f: 5d pop %ebp
50: c3 ret
And without the vectorizer:
inftrees.lo: file format elf32-i386
Disassembly of section .text:
00000000 <inflate_table>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 53 push %ebx
4: 83 ec 20 sub $0x20,%esp
7: 8b 5d 0c mov 0xc(%ebp),%ebx
a: 8b 4d 10 mov 0x10(%ebp),%ecx
d: b8 00 00 00 00 mov $0x0,%eax
12: 8d 55 dc lea -0x24(%ebp),%edx
15: 66 c7 04 42 00 00 movw $0x0,(%edx,%eax,2)
1b: 83 c0 01 add $0x1,%eax
1e: 83 f8 10 cmp $0x10,%eax
21: 75 f2 jne 15 <inflate_table+0x15>
23: 85 c9 test %ecx,%ecx
25: 74 16 je 3d <inflate_table+0x3d>
27: ba 00 00 00 00 mov $0x0,%edx
2c: 0f b7 04 53 movzwl (%ebx,%edx,2),%eax
30: 66 83 44 45 dc 01 addw $0x1,-0x24(%ebp,%eax,2)
36: 83 c2 01 add $0x1,%edx
39: 39 ca cmp %ecx,%edx
3b: 75 ef jne 2c <inflate_table+0x2c>
3d: b8 0f 00 00 00 mov $0xf,%eax
42: 8d 55 dc lea -0x24(%ebp),%edx
45: 66 83 3c 42 00 cmpw $0x0,(%edx,%eax,2)
4a: 75 05 jne 51 <inflate_table+0x51>
4c: 83 e8 01 sub $0x1,%eax
4f: 75 f4 jne 45 <inflate_table+0x45>
51: 83 c4 20 add $0x20,%esp
54: 5b pop %ebx
55: 5d pop %ebp
56: c3 ret
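For readers following the vectorized listing: whether the two movdqa stores are aligned depends on the stack pointer value the caller provides. Below is a minimal sketch of that arithmetic, assuming the usual i386 call sequence (a 4-byte return address pushed by call, followed by the push %ebp seen in the prologue). The helper name is hypothetical and comes from neither GCC nor zlib.

```c
#include <stdint.h>

/* Hypothetical helper: given the caller's %esp just before the `call`
   instruction, return the address of the buffer at -0x28(%ebp) modulo 16.
   A result of 0 means the movdqa stores in the listing are aligned. */
static unsigned buffer_misalignment(uint32_t esp_before_call)
{
    uint32_t esp = esp_before_call;
    esp -= 4;                    /* call pushes the return address */
    esp -= 4;                    /* push %ebp */
    uint32_t ebp = esp;          /* mov %esp,%ebp */
    uint32_t buf = ebp - 0x28;   /* lea -0x28(%ebp),%eax */
    return buf % 16;
}
```

The buffer ends up 48 bytes below the caller's stack pointer, so it has the same residue modulo 16 as that pointer. The stores are therefore only safe if every caller keeps the stack 16-byte aligned at the call site.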
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
^ permalink raw reply [flat|nested] 25+ messages in thread
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (13 preceding siblings ...)
2007-07-24 19:37 ` dirtyepic at gentoo dot org
@ 2007-07-24 19:40 ` dirtyepic at gentoo dot org
2007-07-24 21:21 ` David dot Monniaux at ens dot fr
` (7 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: dirtyepic at gentoo dot org @ 2007-07-24 19:40 UTC (permalink / raw)
To: gcc-bugs
------- Comment #15 from dirtyepic at gentoo dot org 2007-07-24 19:40 -------
Created an attachment (id=13966)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13966&action=view)
gcc-PR25413-gdb.log
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
^ permalink raw reply [flat|nested] 25+ messages in thread
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (14 preceding siblings ...)
2007-07-24 19:40 ` dirtyepic at gentoo dot org
@ 2007-07-24 21:21 ` David dot Monniaux at ens dot fr
2007-07-25 8:40 ` dorit at gcc dot gnu dot org
` (6 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: David dot Monniaux at ens dot fr @ 2007-07-24 21:21 UTC (permalink / raw)
To: gcc-bugs
------- Comment #16 from David dot Monniaux at ens dot fr 2007-07-24 21:21 -------
(In reply to comment #13)
> David, can you confirm that this PR can now be closed?
I'm not seeing the bug any longer when compiling/testing the octagon library.
This does not imply, though, that it no longer occurs on other examples.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
^ permalink raw reply [flat|nested] 25+ messages in thread
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (15 preceding siblings ...)
2007-07-24 21:21 ` David dot Monniaux at ens dot fr
@ 2007-07-25 8:40 ` dorit at gcc dot gnu dot org
2007-07-25 9:12 ` Andrew Pinski
2007-07-25 8:51 ` dorit at gcc dot gnu dot org
` (5 subsequent siblings)
22 siblings, 1 reply; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-25 8:40 UTC (permalink / raw)
To: gcc-bugs
------- Comment #17 from dorit at gcc dot gnu dot org 2007-07-25 08:40 -------
This looks like an unrelated problem: the vectorizer does not perform loop
peeling here, so it's not an issue of natural alignment. Let's open a separate PR
for this one, unless there's already one open. In the meantime, would you
please try this patch?:
Index: tree-vectorizer.c
===================================================================
*** tree-vectorizer.c (revision 126902)
--- tree-vectorizer.c (working copy)
*************** vect_can_force_dr_alignment_p (tree decl
*** 1527,1533 ****
PREFERRED_STACK_BOUNDARY is honored by all translation units.
However, until someone implements forced stack alignment, SSE
isn't really usable without this. */
! return (alignment <= PREFERRED_STACK_BOUNDARY);
}
--- 1527,1533 ----
PREFERRED_STACK_BOUNDARY is honored by all translation units.
However, until someone implements forced stack alignment, SSE
isn't really usable without this. */
! return (alignment <= STACK_BOUNDARY);
}
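To make the effect of the one-line change concrete, here is a small standalone model of the final return statement in the diff. It is a sketch, not GCC code: the function names are hypothetical, and the boundary values (32 and 128 bits) are assumptions about 32-bit x86 with default options that are not stated in this thread.

```c
#include <stdbool.h>

/* Assumed values, in bits, for 32-bit x86 with default options;
   illustrative only. */
#define MODEL_STACK_BOUNDARY            32u   /* what the ABI guarantees */
#define MODEL_PREFERRED_STACK_BOUNDARY  128u  /* what the compiler tries to keep */

/* Check before the patch: trusts the preferred boundary. */
static bool can_force_stack_alignment_old(unsigned alignment_bits)
{
    return alignment_bits <= MODEL_PREFERRED_STACK_BOUNDARY;
}

/* Check after the patch: trusts only the guaranteed boundary. */
static bool can_force_stack_alignment_new(unsigned alignment_bits)
{
    return alignment_bits <= MODEL_STACK_BOUNDARY;
}
```

Under these assumptions a request for 128-bit alignment of a stack variable is accepted by the old check and refused by the new one, so the vectorizer would no longer assume that a local array can be forced to a 16-byte boundary.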
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
^ permalink raw reply [flat|nested] 25+ messages in thread
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (16 preceding siblings ...)
2007-07-25 8:40 ` dorit at gcc dot gnu dot org
@ 2007-07-25 8:51 ` dorit at gcc dot gnu dot org
2007-07-25 8:52 ` dorit at gcc dot gnu dot org
` (4 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-25 8:51 UTC (permalink / raw)
To: gcc-bugs
------- Comment #18 from dorit at gcc dot gnu dot org 2007-07-25 08:51 -------
Subject: Bug 25413
Author: dorit
Date: Wed Jul 25 08:51:12 2007
New Revision: 126904
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=126904
Log:
2007-07-25 Dorit Nuzman <dorit@il.ibm.com>
Devang Patel <dpatel@apple.com>
PR tree-optimization/25413
* targhooks.c (default_builtin_vector_alignment_reachable): New.
* targhooks.h (default_builtin_vector_alignment_reachable): New.
* tree.h (contains_packed_reference): New.
* expr.c (contains_packed_reference): New.
* tree-vect-analyze.c (vector_alignment_reachable_p): New.
(vect_enhance_data_refs_alignment): Call
vector_alignment_reachable_p.
* target.h (vector_alignment_reachable): New builtin.
* target-def.h (TARGET_VECTOR_ALIGNMENT_REACHABLE): New.
* config/rs6000/rs6000.c (rs6000_vector_alignment_reachable): New.
(TARGET_VECTOR_ALIGNMENT_REACHABLE): Define.
2007-07-25 Dorit Nuzman <dorit@il.ibm.com>
Devang Patel <dpatel@apple.com>
Uros Bizjak <ubizjak@gmail.com>
PR tree-optimization/25413
* lib/target-supports.exp (check_effective_target_vect_aligned_arrays):
New procedure to check if arrays are naturally aligned to the vector
alignment boundary.
* gcc.dg/vect/vect-align-1.c: New.
* gcc.dg/vect/vect-align-2.c: New.
* gcc.dg/vect/pr25413.c: New.
* gcc.dg/vect/pr25413a.c: New.
Added:
branches/gcc-4_2-branch/gcc/testsuite/gcc.dg/vect/pr25413.c
branches/gcc-4_2-branch/gcc/testsuite/gcc.dg/vect/pr25413a.c
branches/gcc-4_2-branch/gcc/testsuite/gcc.dg/vect/vect-align-1.c
branches/gcc-4_2-branch/gcc/testsuite/gcc.dg/vect/vect-align-2.c
Modified:
branches/gcc-4_2-branch/gcc/ChangeLog
branches/gcc-4_2-branch/gcc/config/rs6000/rs6000.c
branches/gcc-4_2-branch/gcc/expr.c
branches/gcc-4_2-branch/gcc/target-def.h
branches/gcc-4_2-branch/gcc/target.h
branches/gcc-4_2-branch/gcc/targhooks.c
branches/gcc-4_2-branch/gcc/targhooks.h
branches/gcc-4_2-branch/gcc/testsuite/ChangeLog
branches/gcc-4_2-branch/gcc/testsuite/lib/target-supports.exp
branches/gcc-4_2-branch/gcc/tree-vect-analyze.c
branches/gcc-4_2-branch/gcc/tree.h
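The idea behind a "vector alignment reachable" check can be illustrated with a short sketch. Loop peeling advances the address by one element per peeled iteration, so it can only reach a vector boundary if the starting misalignment is a multiple of the element size. The helper below is hypothetical and is not the code committed in this revision; it assumes the element size is nonzero and divides the vector alignment.

```c
#include <stdint.h>

/* Hypothetical helper, not GCC code: number of leading scalar iterations
   to peel so that the access becomes aligned to vec_align bytes, or -1
   if no number of peeled iterations can achieve that.
   Precondition: elem_size is nonzero and divides vec_align. */
static int peel_iterations_to_align(uintptr_t addr, unsigned elem_size,
                                    unsigned vec_align)
{
    /* Residues modulo vec_align repeat after this many steps. */
    unsigned max_peel = vec_align / elem_size;
    for (unsigned k = 0; k < max_peel; k++) {
        if ((addr + (uintptr_t)k * elem_size) % vec_align == 0)
            return (int)k;
    }
    return -1;
}
```

In the original report the store address was 4 modulo 16. Assuming 8-byte double elements, as the movapd instruction suggests, peeling can only produce residues 4 and 12, so the helper returns -1 and an aligned store can never be valid for that array.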
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
^ permalink raw reply [flat|nested] 25+ messages in thread
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (17 preceding siblings ...)
2007-07-25 8:51 ` dorit at gcc dot gnu dot org
@ 2007-07-25 8:52 ` dorit at gcc dot gnu dot org
2007-07-25 9:12 ` pinskia at gmail dot com
` (3 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-25 8:52 UTC (permalink / raw)
To: gcc-bugs
------- Comment #19 from dorit at gcc dot gnu dot org 2007-07-25 08:52 -------
problem fixed.
--
dorit at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2007-07-25 8:40 ` dorit at gcc dot gnu dot org
@ 2007-07-25 9:12 ` Andrew Pinski
0 siblings, 0 replies; 25+ messages in thread
From: Andrew Pinski @ 2007-07-25 9:12 UTC (permalink / raw)
To: gcc-bugzilla; +Cc: gcc-bugs
On 25 Jul 2007 08:40:09 -0000, dorit at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote
> In the meantime, would you please try this patch?:
Of course after my patch for PR 16660, the patch here should be
changed to just return true always.
Thanks,
Andrew Pinski
^ permalink raw reply [flat|nested] 25+ messages in thread
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (18 preceding siblings ...)
2007-07-25 8:52 ` dorit at gcc dot gnu dot org
@ 2007-07-25 9:12 ` pinskia at gmail dot com
2007-07-25 11:11 ` dorit at gcc dot gnu dot org
` (2 subsequent siblings)
22 siblings, 0 replies; 25+ messages in thread
From: pinskia at gmail dot com @ 2007-07-25 9:12 UTC (permalink / raw)
To: gcc-bugs
------- Comment #20 from pinskia at gmail dot com 2007-07-25 09:12 -------
Subject: Re: wrong alignment or incorrect address computation in vectorized
code on Pentium 4 SSE
On 25 Jul 2007 08:40:09 -0000, dorit at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote
> In the meantime, would you please try this patch?:
Of course after my patch for PR 16660, the patch here should be
changed to just return true always.
Thanks,
Andrew Pinski
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
^ permalink raw reply [flat|nested] 25+ messages in thread
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (19 preceding siblings ...)
2007-07-25 9:12 ` pinskia at gmail dot com
@ 2007-07-25 11:11 ` dorit at gcc dot gnu dot org
2007-07-25 20:24 ` dirtyepic at gentoo dot org
2007-12-28 1:01 ` reichelt at gcc dot gnu dot org
22 siblings, 0 replies; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-25 11:11 UTC (permalink / raw)
To: gcc-bugs
------- Comment #21 from dorit at gcc dot gnu dot org 2007-07-25 11:11 -------
> Of course after my patch for PR 16660, the patch here should be
> changed to just return true always.
In this case, Ryan, could you please also try to see if Andrew's patch
(http://gcc.gnu.org/ml/gcc-patches/2007-07/msg00177.html) fixes the problem?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
^ permalink raw reply [flat|nested] 25+ messages in thread
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (20 preceding siblings ...)
2007-07-25 11:11 ` dorit at gcc dot gnu dot org
@ 2007-07-25 20:24 ` dirtyepic at gentoo dot org
2007-12-28 1:01 ` reichelt at gcc dot gnu dot org
22 siblings, 0 replies; 25+ messages in thread
From: dirtyepic at gentoo dot org @ 2007-07-25 20:24 UTC (permalink / raw)
To: gcc-bugs
------- Comment #22 from dirtyepic at gentoo dot org 2007-07-25 20:24 -------
(In reply to comment #17)
> Let's open a separate PR for this one
This is now PR 32893.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
^ permalink raw reply [flat|nested] 25+ messages in thread
* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
` (21 preceding siblings ...)
2007-07-25 20:24 ` dirtyepic at gentoo dot org
@ 2007-12-28 1:01 ` reichelt at gcc dot gnu dot org
22 siblings, 0 replies; 25+ messages in thread
From: reichelt at gcc dot gnu dot org @ 2007-12-28 1:01 UTC (permalink / raw)
To: gcc-bugs
------- Comment #23 from reichelt at gcc dot gnu dot org 2007-12-28 01:01 -------
*** Bug 33958 has been marked as a duplicate of this bug. ***
--
reichelt at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |eran dot nissenhaus at
| |mobileye dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413
^ permalink raw reply [flat|nested] 25+ messages in thread
Thread overview: 25+ messages
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
2005-12-14 15:26 ` [Bug target/25413] " David dot Monniaux at ens dot fr
2005-12-15 12:42 ` dorit at il dot ibm dot com
2005-12-15 12:50 ` dorit at il dot ibm dot com
2006-01-30 22:24 ` [Bug tree-optimization/25413] " pinskia at gcc dot gnu dot org
2007-04-02 20:52 ` reichelt at gcc dot gnu dot org
2007-04-03 19:22 ` [Bug target/25413] " dorit at il dot ibm dot com
2007-07-01 10:00 ` dorit at gcc dot gnu dot org
2007-07-02 8:30 ` patchapp at dberlin dot org
2007-07-12 14:42 ` dorit at gcc dot gnu dot org
2007-07-13 0:13 ` dirtyepic at gentoo dot org
2007-07-16 8:02 ` dorit at gcc dot gnu dot org
2007-07-17 11:05 ` reichelt at gcc dot gnu dot org
2007-07-24 9:05 ` dorit at gcc dot gnu dot org
2007-07-24 19:37 ` dirtyepic at gentoo dot org
2007-07-24 19:40 ` dirtyepic at gentoo dot org
2007-07-24 21:21 ` David dot Monniaux at ens dot fr
2007-07-25 8:40 ` dorit at gcc dot gnu dot org
2007-07-25 9:12 ` Andrew Pinski
2007-07-25 8:51 ` dorit at gcc dot gnu dot org
2007-07-25 8:52 ` dorit at gcc dot gnu dot org
2007-07-25 9:12 ` pinskia at gmail dot com
2007-07-25 11:11 ` dorit at gcc dot gnu dot org
2007-07-25 20:24 ` dirtyepic at gentoo dot org
2007-12-28 1:01 ` reichelt at gcc dot gnu dot org