public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/25413]  New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
@ 2005-12-14 15:24 David dot Monniaux at ens dot fr
  2005-12-14 15:26 ` [Bug target/25413] " David dot Monniaux at ens dot fr
                   ` (22 more replies)
  0 siblings, 23 replies; 25+ messages in thread
From: David dot Monniaux at ens dot fr @ 2005-12-14 15:24 UTC (permalink / raw)
  To: gcc-bugs

(The same phenomenon happens in 4.2 subversion alpha.)

Incorrect code is generated when using -ftree-vectorize on the SSE2 Pentium 4
target. (The same code works on the AMD64 64-bit target.)

See attached source code.

$ gcc -O2 -Wall -march=pentium4 -mfpmath=sse -ftree-vectorize
-ftree-vectorizer-verbose=1 gcc_plantage2.c -o ./gcc_plantage2
$ ./gcc_plantage2
will segfault (the same program works perfectly without -ftree-vectorize).

The segfault appears at the second movapd instruction:

        movapd  .LC0, %xmm0
.L17:
        movapd  %xmm0, (%eax)
        addl    $1, %edx
        addl    $16, %eax
        cmpl    %edx, %ebx
        ja      .L17

%eax is at this point equal to 4 modulo 16, which results in a segmentation
fault, since movapd assumes 16-byte alignment.


-- 
           Summary: wrong alignment or incorrect address computation in
                    vectorized code on Pentium 4 SSE
           Product: gcc
           Version: 4.0.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: David dot Monniaux at ens dot fr
 GCC build triplet: i486-linux-gnu
  GCC host triplet: i486-linux-gnu
GCC target triplet: i486-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
@ 2005-12-14 15:26 ` David dot Monniaux at ens dot fr
  2005-12-15 12:42 ` dorit at il dot ibm dot com
                   ` (21 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: David dot Monniaux at ens dot fr @ 2005-12-14 15:26 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from David dot Monniaux at ens dot fr  2005-12-14 15:26 -------
Created an attachment (id=10486)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=10486&action=view)
preprocessed C source code (assumes Linux glibc) exhibiting the code generation
bug

Compile this program with -O2 -march=pentium4 -ftree-vectorize on a Pentium 4
and run it. It will segfault. It works perfectly without vectorization.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
  2005-12-14 15:26 ` [Bug target/25413] " David dot Monniaux at ens dot fr
@ 2005-12-15 12:42 ` dorit at il dot ibm dot com
  2005-12-15 12:50 ` dorit at il dot ibm dot com
                   ` (20 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: dorit at il dot ibm dot com @ 2005-12-15 12:42 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from dorit at il dot ibm dot com  2005-12-15 12:41 -------
The problem is that the vectorizer applies loop-peeling in order to align the
data reference *(m->c+i), and peeling only works correctly if the data is
naturally aligned (aligned on it's type size). This is what the vectorizer
currently blindly assumes, but on the Pentium4 doubles are not necessarily
64bit aligned.

Accidentally Devang and I discussed this issue last week, and Devang actually
committed a patch to apple-ppc branch that works around the problem (
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=108214). Devang's patch
however will not fix this PR - the patch he committed disables vectorization if
the vectorizer was able to compute the misalignment, and discovered that it
doesn't evenly divide by the type size. In this testcase the misalignment is
unknown at compile time. 

To fix this problem we need to disable loop-peeling in the vectorizer if we
can't prove that the data is naturally aligned. Alternatively, if we can't
prove either way we can peel the loop but control the number of iterations it
will execute using a runtime test (i.e. have the prolog loop iterate the entire
loop-count if at runtime we discover that the data is not naturally aligned). 


-- 

dorit at il dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dorit at il dot ibm dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
  2005-12-14 15:26 ` [Bug target/25413] " David dot Monniaux at ens dot fr
  2005-12-15 12:42 ` dorit at il dot ibm dot com
@ 2005-12-15 12:50 ` dorit at il dot ibm dot com
  2006-01-30 22:24 ` [Bug tree-optimization/25413] " pinskia at gcc dot gnu dot org
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: dorit at il dot ibm dot com @ 2005-12-15 12:50 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from dorit at il dot ibm dot com  2005-12-15 12:50 -------
related discussion: http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug tree-optimization/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (2 preceding siblings ...)
  2005-12-15 12:50 ` dorit at il dot ibm dot com
@ 2006-01-30 22:24 ` pinskia at gcc dot gnu dot org
  2007-04-02 20:52 ` reichelt at gcc dot gnu dot org
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-01-30 22:24 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from pinskia at gcc dot gnu dot org  2006-01-30 22:24 -------
Confirmed.


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
          Component|target                      |tree-optimization
     Ever Confirmed|0                           |1
           Keywords|                            |wrong-code
   Last reconfirmed|0000-00-00 00:00:00         |2006-01-30 22:24:29
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug tree-optimization/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (3 preceding siblings ...)
  2006-01-30 22:24 ` [Bug tree-optimization/25413] " pinskia at gcc dot gnu dot org
@ 2007-04-02 20:52 ` reichelt at gcc dot gnu dot org
  2007-04-03 19:22 ` [Bug target/25413] " dorit at il dot ibm dot com
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: reichelt at gcc dot gnu dot org @ 2007-04-02 20:52 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from reichelt at gcc dot gnu dot org  2007-04-02 21:52 -------
Any news on this one?
The bug makes tree vectorization on pentium 4 totally useless. :-(

Btw, here's a smaller code snippet for testing. Just compile it with
  gcc -O -msse2 -ftree-vectorize
on a pentium 4 and see the resulting executable segfault:

================================
struct
{
  char c;
  double d[2];
} a;

int main()
{
  int i;
  for ( i=0; i<2; ++i )
    a.d[i]=0;
  return 0;
}
================================


-- 

reichelt at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |reichelt at gcc dot gnu dot
                   |                            |org
           Keywords|                            |monitored


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (4 preceding siblings ...)
  2007-04-02 20:52 ` reichelt at gcc dot gnu dot org
@ 2007-04-03 19:22 ` dorit at il dot ibm dot com
  2007-07-01 10:00 ` dorit at gcc dot gnu dot org
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: dorit at il dot ibm dot com @ 2007-04-03 19:22 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from dorit at il dot ibm dot com  2007-04-03 20:22 -------
So I see Devang had sent a patch for this over a year ago:
http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00167.html
I don't know what ever happened to it.
Maybe you want to give it a try? (you may need to implement the new target hook
for Pentium4). If you have problems applying the patch (it is a bit old) - I
could try to help update the patch (not before next week though).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (5 preceding siblings ...)
  2007-04-03 19:22 ` [Bug target/25413] " dorit at il dot ibm dot com
@ 2007-07-01 10:00 ` dorit at gcc dot gnu dot org
  2007-07-02  8:30 ` patchapp at dberlin dot org
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-01 10:00 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from dorit at gcc dot gnu dot org  2007-07-01 09:59 -------
I'm testing the following patch (seems to fix the two testcases in this PR on
Pentium4. still need to bootstrap etc, and check the powerpc bits)

Index: gcc/targhooks.c
===================================================================
*** gcc/targhooks.c     (revision 126162)
--- gcc/targhooks.c     (working copy)
*************** tree default_mangle_decl_assembler_name
*** 634,637 ****
--- 634,653 ----
     return id;
  }

+ bool
+ default_builtin_vector_alignment_reachable (tree type, bool is_packed)
+ {
+   if (is_packed)
+     return false;
+ 
+   /* Assuming that types whose size is > pointer-size are not guaranteed to
be
+      naturally aligned.  */
+   if (tree_int_cst_compare (TYPE_SIZE (type), bitsize_int (POINTER_SIZE)) >
0)
+     return false;
+ 
+   /* Assuming that types whose size is <= pointer-size
+      are naturally aligned.  */
+   return true;
+ }
+ 
  #include "gt-targhooks.h"
Index: gcc/targhooks.h
===================================================================
*** gcc/targhooks.h     (revision 126162)
--- gcc/targhooks.h     (working copy)
*************** extern tree default_builtin_vectorized_c
*** 62,67 ****
--- 62,69 ----

  extern tree default_builtin_reciprocal (enum built_in_function, bool, bool);

+ extern bool default_builtin_vector_alignment_reachable (tree, bool);
+
  /* These are here, and not in hooks.[ch], because not all users of
     hooks.h include tm.h, and thus we don't have CUMULATIVE_ARGS.  */

Index: gcc/tree.h
===================================================================
*** gcc/tree.h  (revision 126162)
--- gcc/tree.h  (working copy)
*************** extern tree get_inner_reference (tree, H
*** 4327,4332 ****
--- 4327,4338 ----
                                 tree *, enum machine_mode *, int *, int *,
                                 bool);

+ /* Given an expression EXP that may be a COMPONENT_REF or an ARRAY_REF,
+    look for whether EXP or any nested component-refs within EXP is marked
+    as PACKED.  */
+ 
+ extern bool contains_packed_reference (tree exp);
+
  /* Return 1 if T is an expression that get_inner_reference handles.  */

  extern int handled_component_p (tree);
Index: gcc/target.h
===================================================================
*** gcc/target.h        (revision 126162)
--- gcc/target.h        (working copy)
*************** struct gcc_target
*** 413,418 ****
--- 413,422 ----
         element-by-element products for the odd elements.  */
      tree (* builtin_mul_widen_even) (tree);
      tree (* builtin_mul_widen_odd) (tree);
+ 
+     /* Return true if vector alignment is reachable (by peeling N
+        interations) for the given type.  */
+     bool (* vector_alignment_reachable) (tree, bool);
    } vectorize;

    /* The initial value of target_flags.  */
Index: gcc/testsuite/gcc.dg/vect/vect-align-1.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/vect-align-1.c    (revision 0)
--- gcc/testsuite/gcc.dg/vect/vect-align-1.c    (revision 0)
***************
*** 0 ****
--- 1,50 ----
+ /* { dg-require-effective-target vect_int } */
+ 
+ #include <stdlib.h>
+ #include <stdarg.h>
+ #include "tree-vect.h"
+
+ /* Compile time known misalignment. Cannot use loop peeling to align
+    the store.  */
+ 
+ #define N 16
+
+ struct foo {
+   char x;
+   int y[N];
+ } __attribute__((packed));
+
+ int
+ main1 (struct foo * __restrict__ p)
+ {
+   int i;
+   int x[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+ 
+   for (i = 0; i < N; i++)
+     {
+       p->y[i] = x[i];
+     }
+ 
+   /* check results:  */
+   for (i = 0; i < N; i++)
+     {
+       if (p->y[i] != x[i])
+       abort ();
+     }
+   return 0;
+ }
+
+ 
+ int main (void)
+ {
+   int i;
+   struct foo *p = malloc (2*sizeof (struct foo));
+   check_vect ();
+   
+   main1 (p);
+   return 0;
+ }
+ 
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using
versioning" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/pr25413a.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/pr25413a.c        (revision 0)
--- gcc/testsuite/gcc.dg/vect/pr25413a.c        (revision 0)
***************
*** 0 ****
--- 1,129 ----
+ /* { dg-require-effective-target vect_double } */
+ 
+ #include <stdarg.h>
+ #include "tree-vect.h"
+ 
+ #define N 8
+
+ typedef unsigned int size_t;
+ 
+ extern void *malloc (size_t __size) __attribute__ ((__nothrow__))
__attribute__ ((__malloc__));
+ 
+ typedef double num_t;
+ static const num_t num__infty = ((num_t)1.0)/((num_t)0.0);
+ 
+ struct oct_tt;
+ typedef struct oct_tt oct_t;
+ 
+ typedef unsigned int var_t;
+ typedef enum {
+   OCT_EMPTY = 0,
+   OCT_NORMAL = 1,
+   OCT_CLOSED = 2
+ } oct_state;
+ 
+ struct oct_tt {
+   var_t n;
+ 
+   int ref;
+ 
+   oct_state state;
+   struct oct_tt* closed;
+ 
+   num_t* c;
+ };
+ 
+ void* octfapg_mm_malloc (size_t t);
+ oct_t* octfapg_alloc (var_t n);
+ oct_t* octfapg_full_copy (oct_t* m);
+ 
+ struct mmalloc_tt;
+ typedef struct mmalloc_tt mmalloc_t;
+ 
+ struct mmalloc_tt
+ {
+   int id;
+ 
+   int nb_alloc;
+   int nb_realloc;
+   int nb_free;
+ 
+   size_t rem;
+   size_t max;
+   size_t tot;
+ 
+ };
+ 
+ typedef struct
+ {
+   size_t size;
+ 
+   mmalloc_t* mm;
+   int id;
+ 
+   double dummy;
+ 
+ } mmheader_t;
+
+ void*
+ octfapg_mm_malloc (size_t t)
+ {
+   char* m = (char*)malloc(t+sizeof(mmheader_t));
+   return m+sizeof(mmheader_t);
+ }
+ 
+ oct_t* octfapg_empty (var_t n);
+ 
+ oct_t*
+ octfapg_empty (const var_t n)
+ {
+   oct_t* m;
+   /*octfapg_timing_enter("oct_empty",3);*/
+   m = ((oct_t*) octfapg_mm_malloc (sizeof(oct_t)));
+   m->n = n;
+   m->ref = 1;
+   m->state = OCT_EMPTY;
+   m->closed = (oct_t*)((void *)0);
+   m->c = (num_t*)((void *)0);
+   /*octfapg_timing_exit("oct_empty",3);*/
+   return m;
+ }
+
+ oct_t*
+ octfapg_alloc (const var_t n)
+ {
+   size_t nn = (2*(size_t)(n)*((size_t)(n)+1));
+   oct_t* m;
+   m = octfapg_empty(n);
+   m->c = ((num_t*) octfapg_mm_malloc (sizeof(num_t)*(nn)));
+   ;
+   m->state = OCT_NORMAL;
+   m->closed = (oct_t*)((void *)0);
+   return m;
+ }
+ 
+ oct_t*
+ octfapg_universe (const var_t n)
+ {
+   oct_t* m;
+   size_t i, nn = (2*(size_t)(n)*((size_t)(n)+1));
+   m = octfapg_alloc(n);
+   for (i=0;i<nn;i++) *(m->c+i) = num__infty;
+   for (i=0;i<2*n;i++)
*(m->c+((size_t)(i)+(((size_t)(i)+1)*((size_t)(i)+1))/2)) = (num_t)(0);
+   m->state = OCT_CLOSED;
+   return m;
+ }
+ 
+ int main (void)
+ { 
+   int i;
+   check_vect ();
+
+   oct_t *p = octfapg_universe(10);
+   return 0;
+ } 
+ 
+ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "vector alignment may not be reachable"
1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using
versioning" 1 "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-align-2.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/vect-align-2.c    (revision 0)
--- gcc/testsuite/gcc.dg/vect/vect-align-2.c    (revision 0)
***************
*** 0 ****
--- 1,46 ----
+ /* { dg-require-effective-target vect_int } */
+ /* { dg-do run } */
+
+ #include <stdlib.h>
+ #include <stdarg.h>
+ #include "tree-vect.h"
+ 
+ /* Compile time unknown misalignment. Cannot use loop peeling to align
+    the store.  */
+ 
+ #define N 17
+ 
+ struct foo {
+   char x0;
+   int y[N][N];
+ } __attribute__ ((packed));
+ 
+ struct foo f2;
+ int z[16] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+ 
+ void fbar(struct foo *fp)
+ {
+   int i,j;
+    for (i=0; i<N; i++)
+       for (j=0; j<N; j++)
+         f2.y[i][j] = z[i];
+
+    for (i=0; i<N; i++)
+       for (j=0; j<N; j++)
+       if (f2.y[i][j] != z[i])
+         abort ();
+ }
+
+ int main (void)
+ {
+   struct foo  *fp = (struct foo *) malloc (2*sizeof (struct foo));
+ 
+   fbar(fp);
+   return 0;
+ }
+ 
+ 
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using
peeling" 0 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "Alignment of access forced using
versioning" 1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/pr31699.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/pr31699.c (revision 126162)
--- gcc/testsuite/gcc.dg/vect/pr31699.c (working copy)
*************** int main()
*** 31,35 ****
  }

  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target
vect_intfloat_cvt } } } */
! /* { dg-final { scan-tree-dump-times "Alignment of access forced using
peeling" 1 "vect" { xfail vect_no_align } } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */
--- 31,36 ----
  }

  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target
vect_intfloat_cvt } } } */
! /* { dg-final { scan-tree-dump-times "vector alignment may not be reachable"
1 "vect" } } */
! /* { dg-final { scan-tree-dump-times "Alignment of access forced using
versioning" 1 "vect" } } */
  /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/pr25413.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/pr25413.c (revision 0)
--- gcc/testsuite/gcc.dg/vect/pr25413.c (revision 0)
***************
*** 0 ****
--- 1,37 ----
+ /* { dg-require-effective-target vect_double } */
+ 
+ #include <stdarg.h>
+ #include "tree-vect.h"
+ 
+ #define N 8
+ 
+ struct
+ {
+   char c;
+   double d[N];
+ } a;
+ 
+ int main1()
+ {
+   int i;
+   for ( i=0; i<N; ++i )
+     a.d[i]=1;
+   return 0;
+ }
+ 
+ int main (void)
+ { 
+   int i;
+   check_vect ();
+   
+   main1 ();
+   for (i=0; i<N; i++)
+     if (a.d[i] != 1)
+       abort ();
+   return 0;
+ } 
+ 
+ /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "vector alignment may not be reachable"
1 "vect" } } */
+ /* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned
store" 1 "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/expr.c
===================================================================
*** gcc/expr.c  (revision 126162)
--- gcc/expr.c  (working copy)
*************** get_inner_reference (tree exp, HOST_WIDE
*** 5924,5929 ****
--- 5924,5966 ----
    return exp;
  }

+ bool
+ contains_packed_reference (tree exp)
+ {
+   bool packed_p = false;
+
+   while (1)
+     {
+       switch (TREE_CODE (exp))
+       {
+       case COMPONENT_REF:
+         {
+           tree field = TREE_OPERAND (exp, 1);
+           packed_p = DECL_PACKED (field) 
+                      || TYPE_PACKED (TREE_TYPE (field)) /* CHECKME */
+                      || TYPE_PACKED (TREE_TYPE (exp));
+           if (packed_p)
+             goto done;
+         }
+         break;
+ 
+       case BIT_FIELD_REF:
+       case ARRAY_REF:
+       case ARRAY_RANGE_REF:
+       case REALPART_EXPR:
+       case IMAGPART_EXPR:
+       case VIEW_CONVERT_EXPR:
+         break;
+ 
+       default:
+         goto done;
+       }
+       exp = TREE_OPERAND (exp, 0);
+     }
+  done:
+   return packed_p;
+ }
+ 
  /* Return a tree of sizetype representing the size, in bytes, of the element
     of EXP, an ARRAY_REF.  */

Index: gcc/tree-vect-analyze.c
===================================================================
*** gcc/tree-vect-analyze.c     (revision 126162)
--- gcc/tree-vect-analyze.c     (working copy)
*************** Software Foundation, 51 Franklin Street,
*** 25,30 ****
--- 25,31 ----
  #include "tm.h"
  #include "ggc.h"
  #include "tree.h"
+ #include "target.h"
  #include "basic-block.h"
  #include "diagnostic.h"
  #include "tree-flow.h"
*************** vect_verify_datarefs_alignment (loop_vec
*** 1379,1384 ****
--- 1380,1449 ----
  }


+ static bool
+ vector_alignment_reachable_p (struct data_reference *dr)
+ {
+   tree stmt = DR_STMT (dr);
+   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+ 
+   if (DR_GROUP_FIRST_DR (stmt_info))
+     {
+       /* For interleaved access we peel only if number of iterations in
+        the prolog loop ({VF - misalignment}), is a multiple of the
+        number of the interleaved accesses.  */
+       int elem_size, mis_in_elements;
+       int nelements = TYPE_VECTOR_SUBPARTS (vectype);
+ 
+       /* FORNOW: handle only known alignment.  */
+       if (!known_alignment_for_access_p (dr))
+       return false;
+ 
+       elem_size = UNITS_PER_SIMD_WORD / nelements;
+       mis_in_elements = DR_MISALIGNMENT (dr) / elem_size;
+ 
+       if ((nelements - mis_in_elements) % DR_GROUP_SIZE (stmt_info))
+       return false;
+     }
+ 
+   /* If misalignment is known at the compiler time then apply peeling
+      only if natural alignment is reachable through peeling.  */
+   if (known_alignment_for_access_p (dr) && !aligned_access_p (dr))
+     {
+       HOST_WIDE_INT elmsize = 
+               int_cst_value (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
+       if (DR_MISALIGNMENT (dr) % elmsize)
+       {
+         if (vect_print_dump_info (REPORT_DETAILS))
+           {
+             fprintf (vect_dump, "data size =" HOST_WIDE_INT_PRINT_DEC,
elmsize);
+             fprintf (vect_dump, ". misalignment = %d. ", DR_MISALIGNMENT
(dr));
+             fprintf (vect_dump, "data size does not divide the
misalignment.\n");
+           }
+         return false;
+       }
+     }
+ 
+   if (!known_alignment_for_access_p (dr))
+     {
+       tree type = (TREE_TYPE (DR_REF (dr)));
+       tree ba = DR_BASE_OBJECT (dr);
+       bool is_packed = false;
+ 
+       if (ba)
+       is_packed = contains_packed_reference (ba);
+
+       if (vect_print_dump_info (REPORT_DETAILS))
+       fprintf (vect_dump, "Unknown misalignment, is_packed = %d",is_packed);
+       if (targetm.vectorize.vector_alignment_reachable (type, is_packed))
+       return true;
+       else
+       return false;
+     }
+ 
+   return true;
+ }
+ 
  /* Function vect_enhance_data_refs_alignment

     This pass will use loop versioning and loop peeling in order to enhance
*************** vect_enhance_data_refs_alignment (loop_v
*** 1540,1572 ****

        if (!DR_IS_READ (dr) && !aligned_access_p (dr))
          {
!         if (DR_GROUP_FIRST_DR (stmt_info))
!           {
!             /* For interleaved access we peel only if number of iterations in
!                the prolog loop ({VF - misalignment}), is a multiple of the
!                number of the interleaved accesses.  */
!             int elem_size, mis_in_elements;
!             tree vectype = STMT_VINFO_VECTYPE (stmt_info);
!             int nelements = TYPE_VECTOR_SUBPARTS (vectype);
! 
!             /* FORNOW: handle only known alignment.  */
!             if (!known_alignment_for_access_p (dr))
!               {
!                 do_peeling = false;
!                 break;
!               }
! 
!             elem_size = UNITS_PER_SIMD_WORD / nelements;
!             mis_in_elements = DR_MISALIGNMENT (dr) / elem_size;
! 
!             if ((nelements - mis_in_elements) % DR_GROUP_SIZE (stmt_info))
!               {
!                 do_peeling = false;
!                 break;
!               }
!           }
!         dr0 = dr;
!         do_peeling = true;
          break;
        }
      }
--- 1605,1615 ----

        if (!DR_IS_READ (dr) && !aligned_access_p (dr))
          {
!         do_peeling = vector_alignment_reachable_p (dr);
!         if (do_peeling)
!           dr0 = dr;
!         if (!do_peeling && vect_print_dump_info (REPORT_DETAILS))
!             fprintf (vect_dump, "vector alignment may not be reachable");
          break;
        }
      }
Index: gcc/target-def.h
===================================================================
*** gcc/target-def.h    (revision 126162)
--- gcc/target-def.h    (working copy)
*************** Foundation, 51 Franklin Street, Fifth Fl
*** 356,361 ****
--- 356,364 ----
    default_builtin_vectorized_conversion
  #define TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_EVEN 0
  #define TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_ODD 0
+ #define TARGET_VECTOR_ALIGNMENT_REACHABLE \
+   default_builtin_vector_alignment_reachable
+ 

  #define TARGET_VECTORIZE                                                \
    {                                                                   \
*************** Foundation, 51 Franklin Street, Fifth Fl
*** 363,369 ****
      TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION,                     \
      TARGET_VECTORIZE_BUILTIN_CONVERSION,                              \
      TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_EVEN,                            \
!     TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_ODD                            \
    }

  #define TARGET_DEFAULT_TARGET_FLAGS 0
--- 366,373 ----
      TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION,                     \
      TARGET_VECTORIZE_BUILTIN_CONVERSION,                              \
      TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_EVEN,                            \
!     TARGET_VECTORIZE_BUILTIN_MUL_WIDEN_ODD,                           \
!     TARGET_VECTOR_ALIGNMENT_REACHABLE                                 \
    }

  #define TARGET_DEFAULT_TARGET_FLAGS 0
Index: gcc/config/rs6000/rs6000.c
===================================================================
*** gcc/config/rs6000/rs6000.c  (revision 126162)
--- gcc/config/rs6000/rs6000.c  (working copy)
*************** static tree rs6000_builtin_mul_widen_odd
*** 717,722 ****
--- 717,723 ----
  static tree rs6000_builtin_conversion (enum tree_code, tree);

  static void def_builtin (int, const char *, tree, int);
+ static bool rs6000_vector_alignment_reachable (tree, bool);
  static void rs6000_init_builtins (void);
  static rtx rs6000_expand_unop_builtin (enum insn_code, tree, rtx);
  static rtx rs6000_expand_binop_builtin (enum insn_code, tree, rtx);
*************** static const char alt_reg_names[][8] =
*** 984,989 ****
--- 985,993 ----
  #undef TARGET_VECTORIZE_BUILTIN_CONVERSION
  #define TARGET_VECTORIZE_BUILTIN_CONVERSION rs6000_builtin_conversion

+ #undef TARGET_VECTOR_ALIGNMENT_REACHABLE
+ #define TARGET_VECTOR_ALIGNMENT_REACHABLE rs6000_vector_alignment_reachable
+ 
  #undef TARGET_INIT_BUILTINS
  #define TARGET_INIT_BUILTINS rs6000_init_builtins

*************** rs6000_builtin_mul_widen_odd (tree type)
*** 1806,1811 ****
--- 1810,1844 ----
      }
  }

+ 
+ /* Return true iff, data reference of TYPE can reach vector alignment (16)
+    after applying N number of iterations.  This routine does not determine
+    how may iterations are required to reach desired alignment.  */
+
+ static bool
+ rs6000_vector_alignment_reachable (tree type, bool is_packed)
+ {
+   if (is_packed)
+     return false;
+
+   if (TARGET_MACHO)
+     {
+       if (TARGET_32BIT)
+       {
+         if (rs6000_alignment_flags == MASK_ALIGN_NATURAL)
+           return true;
+ 
+         if (rs6000_alignment_flags ==  MASK_ALIGN_POWER)
+           return true;
+       }
+       return false;
+     }
+ 
+   /* Assuming that all other types are naturally aligned. CHECKME!  */
+   return true;
+ }
+ 
+
  /* Handle generic options of the form -mfoo=yes/no.
     NAME is the option name.
     VALUE is the option value.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (6 preceding siblings ...)
  2007-07-01 10:00 ` dorit at gcc dot gnu dot org
@ 2007-07-02  8:30 ` patchapp at dberlin dot org
  2007-07-12 14:42 ` dorit at gcc dot gnu dot org
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: patchapp at dberlin dot org @ 2007-07-02  8:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from patchapp at dberlin dot org  2007-07-02 08:30 -------
Subject: Bug number PR25413

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is
http://gcc.gnu.org/ml/gcc-patches/2007-07/msg00082.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (7 preceding siblings ...)
  2007-07-02  8:30 ` patchapp at dberlin dot org
@ 2007-07-12 14:42 ` dorit at gcc dot gnu dot org
  2007-07-13  0:13 ` dirtyepic at gentoo dot org
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-12 14:42 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from dorit at gcc dot gnu dot org  2007-07-12 14:42 -------
Subject: Bug 25413

Author: dorit
Date: Thu Jul 12 14:42:08 2007
New Revision: 126591

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=126591
Log:
2007-07-12  Dorit Nuzman  <dorit@il.ibm.com>
            Devang Patel  <dpatel@apple.com>

        PR tree-optimization/25413
        * targhooks.c (default_builtin_vector_alignment_reachable): New.
        * targhooks.h (default_builtin_vector_alignment_reachable): New.
        * tree.h (contains_packed_reference): New.
        * expr.c (contains_packed_reference): New.
        * tree-vect-analyze.c (vector_alignment_reachable_p): New.
        (vect_enhance_data_refs_alignment): Call
        vector_alignment_reachable_p.
        * target.h (vector_alignment_reachable): New builtin.
        * target-def.h (TARGET_VECTOR_ALIGNMENT_REACHABLE): New.
        * config/rs6000/rs6000.c (rs6000_vector_alignment_reachable): New.
        (TARGET_VECTOR_ALIGNMENT_REACHABLE): Define.


Added:
    trunk/gcc/testsuite/gcc.dg/vect/pr25413.c
    trunk/gcc/testsuite/gcc.dg/vect/pr25413a.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-align-1.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-align-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/rs6000.c
    trunk/gcc/expr.c
    trunk/gcc/target-def.h
    trunk/gcc/target.h
    trunk/gcc/targhooks.c
    trunk/gcc/targhooks.h
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/vect/pr31699.c
    trunk/gcc/tree-vect-analyze.c
    trunk/gcc/tree.h


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (8 preceding siblings ...)
  2007-07-12 14:42 ` dorit at gcc dot gnu dot org
@ 2007-07-13  0:13 ` dirtyepic at gentoo dot org
  2007-07-16  8:02 ` dorit at gcc dot gnu dot org
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: dirtyepic at gentoo dot org @ 2007-07-13  0:13 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from dirtyepic at gentoo dot org  2007-07-13 00:13 -------
any chance of a 4.2 backport?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (9 preceding siblings ...)
  2007-07-13  0:13 ` dirtyepic at gentoo dot org
@ 2007-07-16  8:02 ` dorit at gcc dot gnu dot org
  2007-07-17 11:05 ` reichelt at gcc dot gnu dot org
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-16  8:02 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from dorit at gcc dot gnu dot org  2007-07-16 08:02 -------
(In reply to comment #10)
> any chance of a 4.2 backport?

sure (as soon as 4.2 gets out of freeze. unfortunately we missed the 4.2.1
release).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (10 preceding siblings ...)
  2007-07-16  8:02 ` dorit at gcc dot gnu dot org
@ 2007-07-17 11:05 ` reichelt at gcc dot gnu dot org
  2007-07-24  9:05 ` dorit at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: reichelt at gcc dot gnu dot org @ 2007-07-17 11:05 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from reichelt at gcc dot gnu dot org  2007-07-17 11:05 -------
*** Bug 19716 has been marked as a duplicate of this bug. ***


-- 

reichelt at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bangerth at dealii dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (11 preceding siblings ...)
  2007-07-17 11:05 ` reichelt at gcc dot gnu dot org
@ 2007-07-24  9:05 ` dorit at gcc dot gnu dot org
  2007-07-24 19:37 ` dirtyepic at gentoo dot org
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-24  9:05 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from dorit at gcc dot gnu dot org  2007-07-24 09:05 -------
David, can you confirm that this PR can now be closed?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (12 preceding siblings ...)
  2007-07-24  9:05 ` dorit at gcc dot gnu dot org
@ 2007-07-24 19:37 ` dirtyepic at gentoo dot org
  2007-07-24 19:40 ` dirtyepic at gentoo dot org
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: dirtyepic at gentoo dot org @ 2007-07-24 19:37 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #14 from dirtyepic at gentoo dot org  2007-07-24 19:37 -------
Created an attachment (id=13965)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13965&action=view)
zlib testcase

> A fix for PR25413 was committed to mainline.
> Ryan, could you please check if it solves the zlib miscompilation?
> Andrew, would you plase check if it solves the libgcc miscompilation that you
> are seeing?

Unfortunately it doesn't.  I'm still getting unaligned movdqa instructions with
both mainline and the 4.2 patch.  Both testcases on this bug now work however,
so maybe the problem lies elsewhere.

I'm attaching a (badly) reduced testcase from inftrees.c in zlib which i
believe shows the bug.  Compile with -O -msse2 -ftree-vectorize.  On i686 i'm
getting this:

inftrees.lo:     file format elf32-i386

Disassembly of section .text:

00000000 <inflate_table>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   53                      push   %ebx
   4:   83 ec 24                sub    $0x24,%esp
   7:   8b 5d 0c                mov    0xc(%ebp),%ebx
   a:   8b 4d 10                mov    0x10(%ebp),%ecx
   d:   8d 45 d8                lea    -0x28(%ebp),%eax
  10:   66 0f ef c0             pxor   %xmm0,%xmm0
  14:   66 0f 7f 00             movdqa %xmm0,(%eax)
  18:   66 0f 7f 40 10          movdqa %xmm0,0x10(%eax)
  1d:   85 c9                   test   %ecx,%ecx
  1f:   74 16                   je     37 <inflate_table+0x37>
  21:   ba 00 00 00 00          mov    $0x0,%edx
  26:   0f b7 04 53             movzwl (%ebx,%edx,2),%eax
  2a:   66 83 44 45 d8 01       addw   $0x1,-0x28(%ebp,%eax,2)
  30:   83 c2 01                add    $0x1,%edx
  33:   39 ca                   cmp    %ecx,%edx
  35:   75 ef                   jne    26 <inflate_table+0x26>
  37:   b8 0f 00 00 00          mov    $0xf,%eax
  3c:   8d 55 d8                lea    -0x28(%ebp),%edx
  3f:   66 83 3c 42 00          cmpw   $0x0,(%edx,%eax,2)
  44:   75 05                   jne    4b <inflate_table+0x4b>
  46:   83 e8 01                sub    $0x1,%eax
  49:   75 f4                   jne    3f <inflate_table+0x3f>
  4b:   83 c4 24                add    $0x24,%esp
  4e:   5b                      pop    %ebx
  4f:   5d                      pop    %ebp
  50:   c3                      ret


And without the vectorizer:

inftrees.lo:     file format elf32-i386

Disassembly of section .text:

00000000 <inflate_table>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   53                      push   %ebx
   4:   83 ec 20                sub    $0x20,%esp
   7:   8b 5d 0c                mov    0xc(%ebp),%ebx
   a:   8b 4d 10                mov    0x10(%ebp),%ecx
   d:   b8 00 00 00 00          mov    $0x0,%eax
  12:   8d 55 dc                lea    -0x24(%ebp),%edx
  15:   66 c7 04 42 00 00       movw   $0x0,(%edx,%eax,2)
  1b:   83 c0 01                add    $0x1,%eax
  1e:   83 f8 10                cmp    $0x10,%eax
  21:   75 f2                   jne    15 <inflate_table+0x15>
  23:   85 c9                   test   %ecx,%ecx
  25:   74 16                   je     3d <inflate_table+0x3d>
  27:   ba 00 00 00 00          mov    $0x0,%edx
  2c:   0f b7 04 53             movzwl (%ebx,%edx,2),%eax
  30:   66 83 44 45 dc 01       addw   $0x1,-0x24(%ebp,%eax,2)
  36:   83 c2 01                add    $0x1,%edx
  39:   39 ca                   cmp    %ecx,%edx
  3b:   75 ef                   jne    2c <inflate_table+0x2c>
  3d:   b8 0f 00 00 00          mov    $0xf,%eax
  42:   8d 55 dc                lea    -0x24(%ebp),%edx
  45:   66 83 3c 42 00          cmpw   $0x0,(%edx,%eax,2)
  4a:   75 05                   jne    51 <inflate_table+0x51>
  4c:   83 e8 01                sub    $0x1,%eax
  4f:   75 f4                   jne    45 <inflate_table+0x45>
  51:   83 c4 20                add    $0x20,%esp
  54:   5b                      pop    %ebx
  55:   5d                      pop    %ebp
  56:   c3                      ret


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (13 preceding siblings ...)
  2007-07-24 19:37 ` dirtyepic at gentoo dot org
@ 2007-07-24 19:40 ` dirtyepic at gentoo dot org
  2007-07-24 21:21 ` David dot Monniaux at ens dot fr
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: dirtyepic at gentoo dot org @ 2007-07-24 19:40 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #15 from dirtyepic at gentoo dot org  2007-07-24 19:40 -------
Created an attachment (id=13966)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13966&action=view)
gcc-PR25413-gdb.log


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (14 preceding siblings ...)
  2007-07-24 19:40 ` dirtyepic at gentoo dot org
@ 2007-07-24 21:21 ` David dot Monniaux at ens dot fr
  2007-07-25  8:40 ` dorit at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: David dot Monniaux at ens dot fr @ 2007-07-24 21:21 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #16 from David dot Monniaux at ens dot fr  2007-07-24 21:21 -------
(In reply to comment #13)
> David, can you confirm that this PR can now be closed?

I'm no seeing the bug any longer when compiling/testing the octagon library.
This does not imply, though, that it no longer occurs on other examples.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (15 preceding siblings ...)
  2007-07-24 21:21 ` David dot Monniaux at ens dot fr
@ 2007-07-25  8:40 ` dorit at gcc dot gnu dot org
  2007-07-25  9:12   ` Andrew Pinski
  2007-07-25  8:51 ` dorit at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  22 siblings, 1 reply; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-25  8:40 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #17 from dorit at gcc dot gnu dot org  2007-07-25 08:40 -------
This looks like an unrelated problem - the vectorizer does not perform loop
peeling here so it's not an issue of natural alignment. Lets open a separate PR
for this one, unless there's already one open. In the meantime, would you
please try this patch?:

Index: tree-vectorizer.c
===================================================================
*** tree-vectorizer.c   (revision 126902)
--- tree-vectorizer.c   (working copy)
*************** vect_can_force_dr_alignment_p (tree decl
*** 1527,1533 ****
         PREFERRED_STACK_BOUNDARY is honored by all translation units.
         However, until someone implements forced stack alignment, SSE
         isn't really usable without this.  */
!     return (alignment <= PREFERRED_STACK_BOUNDARY);
  }


--- 1527,1533 ----
         PREFERRED_STACK_BOUNDARY is honored by all translation units.
         However, until someone implements forced stack alignment, SSE
         isn't really usable without this.  */
!     return (alignment <= STACK_BOUNDARY);
  }


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (16 preceding siblings ...)
  2007-07-25  8:40 ` dorit at gcc dot gnu dot org
@ 2007-07-25  8:51 ` dorit at gcc dot gnu dot org
  2007-07-25  8:52 ` dorit at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-25  8:51 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #18 from dorit at gcc dot gnu dot org  2007-07-25 08:51 -------
Subject: Bug 25413

Author: dorit
Date: Wed Jul 25 08:51:12 2007
New Revision: 126904

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=126904
Log:
2007-07-25  Dorit Nuzman  <dorit@il.ibm.com>
            Devang Patel  <dpatel@apple.com>

        PR tree-optimization/25413
        * targhooks.c (default_builtin_vector_alignment_reachable): New.
        * targhooks.h (default_builtin_vector_alignment_reachable): New.
        * tree.h (contains_packed_reference): New.
        * expr.c (contains_packed_reference): New.
        * tree-vect-analyze.c (vector_alignment_reachable_p): New.
        (vect_enhance_data_refs_alignment): Call
        vector_alignment_reachable_p.
        * target.h (vector_alignment_reachable): New builtin.
        * target-def.h (TARGET_VECTOR_ALIGNMENT_REACHABLE): New.
        * config/rs6000/rs6000.c (rs6000_vector_alignment_reachable): New.
        (TARGET_VECTOR_ALIGNMENT_REACHABLE): Define.

2007-07-25  Dorit Nuzman  <dorit@il.ibm.com>
            Devang Patel  <dpatel@apple.com>
            Uros Bizjak  <ubizjak@gmail.com>

        PR tree-optimization/25413
        * lib/target-supports.exp (check_effective_target_vect_aligned_arrays):
        New procedure to check if arrays are naturally aligned to the vector
        alignment boundary.
        * gcc.dg/vect/vect-align-1.c: New.
        * gcc.dg/vect/vect-align-2.c: New.
        * gcc.dg/vect/pr25413.c: New.
        * gcc.dg/vect/pr25413a.c: New.


Added:
    branches/gcc-4_2-branch/gcc/testsuite/gcc.dg/vect/pr25413.c
    branches/gcc-4_2-branch/gcc/testsuite/gcc.dg/vect/pr25413a.c
    branches/gcc-4_2-branch/gcc/testsuite/gcc.dg/vect/vect-align-1.c
    branches/gcc-4_2-branch/gcc/testsuite/gcc.dg/vect/vect-align-2.c
Modified:
    branches/gcc-4_2-branch/gcc/ChangeLog
    branches/gcc-4_2-branch/gcc/config/rs6000/rs6000.c
    branches/gcc-4_2-branch/gcc/expr.c
    branches/gcc-4_2-branch/gcc/target-def.h
    branches/gcc-4_2-branch/gcc/target.h
    branches/gcc-4_2-branch/gcc/targhooks.c
    branches/gcc-4_2-branch/gcc/targhooks.h
    branches/gcc-4_2-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_2-branch/gcc/testsuite/lib/target-supports.exp
    branches/gcc-4_2-branch/gcc/tree-vect-analyze.c
    branches/gcc-4_2-branch/gcc/tree.h


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (17 preceding siblings ...)
  2007-07-25  8:51 ` dorit at gcc dot gnu dot org
@ 2007-07-25  8:52 ` dorit at gcc dot gnu dot org
  2007-07-25  9:12 ` pinskia at gmail dot com
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-25  8:52 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #19 from dorit at gcc dot gnu dot org  2007-07-25 08:52 -------
problem fixed.


-- 

dorit at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2007-07-25  8:40 ` dorit at gcc dot gnu dot org
@ 2007-07-25  9:12   ` Andrew Pinski
  0 siblings, 0 replies; 25+ messages in thread
From: Andrew Pinski @ 2007-07-25  9:12 UTC (permalink / raw)
  To: gcc-bugzilla; +Cc: gcc-bugs

On 25 Jul 2007 08:40:09 -0000, dorit at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote
> In the meantime, would you please try this patch?:

Of course after my patch for PR 16660, the patch here should be
changed to just return true always.

Thanks,
Andrew Pinski


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (18 preceding siblings ...)
  2007-07-25  8:52 ` dorit at gcc dot gnu dot org
@ 2007-07-25  9:12 ` pinskia at gmail dot com
  2007-07-25 11:11 ` dorit at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 25+ messages in thread
From: pinskia at gmail dot com @ 2007-07-25  9:12 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #20 from pinskia at gmail dot com  2007-07-25 09:12 -------
Subject: Re:  wrong alignment or incorrect address computation in vectorized
code on Pentium 4 SSE

On 25 Jul 2007 08:40:09 -0000, dorit at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote
> In the meantime, would you please try this patch?:

Of course after my patch for PR 16660, the patch here should be
changed to just return true always.

Thanks,
Andrew Pinski


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (19 preceding siblings ...)
  2007-07-25  9:12 ` pinskia at gmail dot com
@ 2007-07-25 11:11 ` dorit at gcc dot gnu dot org
  2007-07-25 20:24 ` dirtyepic at gentoo dot org
  2007-12-28  1:01 ` reichelt at gcc dot gnu dot org
  22 siblings, 0 replies; 25+ messages in thread
From: dorit at gcc dot gnu dot org @ 2007-07-25 11:11 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #21 from dorit at gcc dot gnu dot org  2007-07-25 11:11 -------
> Of course after my patch for PR 16660, the patch here should be
> changed to just return true always.

In this case, Ryan, could you please also try to see if Andrew's patch
(http://gcc.gnu.org/ml/gcc-patches/2007-07/msg00177.html) fixes the problem?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (20 preceding siblings ...)
  2007-07-25 11:11 ` dorit at gcc dot gnu dot org
@ 2007-07-25 20:24 ` dirtyepic at gentoo dot org
  2007-12-28  1:01 ` reichelt at gcc dot gnu dot org
  22 siblings, 0 replies; 25+ messages in thread
From: dirtyepic at gentoo dot org @ 2007-07-25 20:24 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #22 from dirtyepic at gentoo dot org  2007-07-25 20:24 -------
(In reply to comment #17)
> Lets open a separate PR for this one

This is now PR 32893.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE
  2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
                   ` (21 preceding siblings ...)
  2007-07-25 20:24 ` dirtyepic at gentoo dot org
@ 2007-12-28  1:01 ` reichelt at gcc dot gnu dot org
  22 siblings, 0 replies; 25+ messages in thread
From: reichelt at gcc dot gnu dot org @ 2007-12-28  1:01 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #23 from reichelt at gcc dot gnu dot org  2007-12-28 01:01 -------
*** Bug 33958 has been marked as a duplicate of this bug. ***


-- 

reichelt at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |eran dot nissenhaus at
                   |                            |mobileye dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2007-12-28  1:01 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-12-14 15:24 [Bug tree-optimization/25413] New: wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE David dot Monniaux at ens dot fr
2005-12-14 15:26 ` [Bug target/25413] " David dot Monniaux at ens dot fr
2005-12-15 12:42 ` dorit at il dot ibm dot com
2005-12-15 12:50 ` dorit at il dot ibm dot com
2006-01-30 22:24 ` [Bug tree-optimization/25413] " pinskia at gcc dot gnu dot org
2007-04-02 20:52 ` reichelt at gcc dot gnu dot org
2007-04-03 19:22 ` [Bug target/25413] " dorit at il dot ibm dot com
2007-07-01 10:00 ` dorit at gcc dot gnu dot org
2007-07-02  8:30 ` patchapp at dberlin dot org
2007-07-12 14:42 ` dorit at gcc dot gnu dot org
2007-07-13  0:13 ` dirtyepic at gentoo dot org
2007-07-16  8:02 ` dorit at gcc dot gnu dot org
2007-07-17 11:05 ` reichelt at gcc dot gnu dot org
2007-07-24  9:05 ` dorit at gcc dot gnu dot org
2007-07-24 19:37 ` dirtyepic at gentoo dot org
2007-07-24 19:40 ` dirtyepic at gentoo dot org
2007-07-24 21:21 ` David dot Monniaux at ens dot fr
2007-07-25  8:40 ` dorit at gcc dot gnu dot org
2007-07-25  9:12   ` Andrew Pinski
2007-07-25  8:51 ` dorit at gcc dot gnu dot org
2007-07-25  8:52 ` dorit at gcc dot gnu dot org
2007-07-25  9:12 ` pinskia at gmail dot com
2007-07-25 11:11 ` dorit at gcc dot gnu dot org
2007-07-25 20:24 ` dirtyepic at gentoo dot org
2007-12-28  1:01 ` reichelt at gcc dot gnu dot org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).