public inbox for gcc-patches@gcc.gnu.org
* [PATCH, ARM] Misaligned access support for ARM Neon
@ 2009-11-17 17:21 Julian Brown
  2009-11-18 12:03 ` Ira Rosen
  2009-11-18 14:25 ` Paul Brook
  0 siblings, 2 replies; 29+ messages in thread
From: Julian Brown @ 2009-11-17 17:21 UTC
  To: gcc-patches; +Cc: paul, rearnsha

[-- Attachment #1: Type: text/plain, Size: 5979 bytes --]

This patch adds misaligned-access support (for the vectorizer) for ARM
Neon in little-endian mode, along with some of the infrastructure
needed for big-endian mode (though big-endian support doesn't actually
work yet). (Most of the tricky bits in here are by Paul Brook, so
apologies if I've messed up the explanations!)

Supporting big-endian mode for vectorization is tricky for Neon, because
no distinction is made in the vectorizer at present between whole-vector
memory accesses and accesses of arrays of elements. For Neon, these are
supported by distinct operations, which (in big-endian mode at least)
access elements in different orders (namely, vldr/vstr/vstm/vldm for
whole vectors, vld1/vst1 for elements).

The right way to order vectors (i.e. the way that has been decided on,
AIUI) is as if one had a union, e.g.:

union {
  v4si vector;
  int elements[4];
};

where the numbering of the vector's elements is the same as when
accessing the elements through the array. This is the ordering provided
by the vld1/vst1 instructions in both big- and little-endian mode.
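
As a rough illustration of that ordering rule (not part of the patch),
the following sketch fills the union through its array view and reads a
lane back through the vector view. The names vec_or_array and
lane_via_vector are made up for this example, and it relies on the GNU C
vector extension (vector_size and vector subscripting), so it assumes a
reasonably recent GCC or Clang:

```c
#include <assert.h>

/* GNU C generic vector type: four 32-bit ints (16 bytes).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical name; mirrors the union in the text above.  */
union vec_or_array
{
  v4si vector;
  int elements[4];
};

/* Fill the union through the array view with BASE, BASE+1, ... and
   return lane I read through the vector view.  With array-style
   ordering, lane I is expected to equal BASE + I.  */
int
lane_via_vector (int base, int i)
{
  union vec_or_array u;
  int k;

  for (k = 0; k < 4; k++)
    u.elements[k] = base + k;

  /* Vector subscripting is a GNU C extension.  */
  return u.vector[i];
}
```

With array-style ordering, lane_via_vector (10, 3) should return 13, i.e.
lane 3 of the vector is elements[3].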

However, the vectorizer will (at present) create regular vector
loads/stores (i.e. the ones using vldr/vstr/vldm/vstm) for aligned
accesses. This patch introduces a target hook (always_misalign, set via
TARGET_VECTOR_ALWAYS_MISALIGN) which forces the vectorizer to always use
the misaligned variant (which does the right thing in big-endian mode),
even for aligned accesses.
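
To make the intended effect of the hook concrete, here is a toy model of
the decision (it is NOT the code in tree-vect-stmts.c). The enum and the
function choose_access are hypothetical names for this sketch, and it
deliberately ignores peeling, versioning and whether the target supports
misaligned accesses at all:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of how a vector access is emitted.  */
enum access_kind
{
  ACCESS_WHOLE_VECTOR,  /* Regular vector move, e.g. vldr/vstr/vldm/vstm.  */
  ACCESS_ELEMENT_ARRAY  /* movmisalign-style access, e.g. vld1/vst1.  */
};

/* Toy model: KNOWN_ALIGNED says whether the access is known to be
   vector-aligned; ALWAYS_MISALIGN stands in for the target hook's
   return value.  */
enum access_kind
choose_access (bool known_aligned, bool always_misalign)
{
  /* When the hook says so, use the element-array form even for
     aligned accesses.  */
  if (always_misalign)
    return ACCESS_ELEMENT_ARRAY;

  return known_aligned ? ACCESS_WHOLE_VECTOR : ACCESS_ELEMENT_ARRAY;
}
```

The only case the hook changes in this model is the aligned one:
choose_access (true, false) picks the whole-vector form, whereas
choose_access (true, true) picks the element-array form.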

Neon vld1/vst1 can't be used for *arbitrary* alignments: the
minimum alignment is the element size for the access in question.
Another target hook (vector_min_alignment) is defined to teach the
vectorizer this. Correspondingly for the testsuite,
check_effective_target_vect_element_align has been added to
target-supports.exp.
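
For illustration only, the difference between element alignment and
full vector alignment can be sketched in plain C. The helper names
is_aligned_to and element_aligned_only and the buffer buf are made up
for this example; the 16-byte figure assumes a quad-word (128-bit)
vector of 4-byte ints:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return nonzero if P is a multiple of ALIGN bytes.  */
int
is_aligned_to (const void *p, size_t align)
{
  return ((uintptr_t) p % align) == 0;
}

/* A buffer of ints whose start is 16-byte aligned.  */
int buf[8] __attribute__ ((aligned (16)));

/* &buf[1] is aligned to the element size (sizeof (int)) but not to
   the 16-byte vector size: the kind of address that an element-wise
   access can handle but a whole-vector access may not.  */
int
element_aligned_only (void)
{
  const int *p = &buf[1];

  return is_aligned_to (p, sizeof (int)) && !is_aligned_to (p, 16);
}
```

An access starting at &buf[1] is the sort of case the element-alignment
minimum is meant to describe: acceptable for an element-wise load, but
not aligned to the full vector.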

The latter unfortunately is fairly redundant with
check_effective_target_vect_hw_misalign, which has been introduced
since this patch was written. *_vect_element_align provides a weaker
promise (i.e. element alignment vs. arbitrary alignment), though one
which is probably sufficient in most cases. I'm not sure if it makes
sense to keep both predicates: at present, I've updated quite a few
tests which previously used *_vect_hw_misalign to use
*_vect_element_align instead when it improves test results for ARM, but
not exhaustively.

Test results (cross to ARM Linux) show some additional failures. I
think these just expose missing features elsewhere in the Neon support,
which show up now because of the dejagnu tweaks, rather than problems
with this patch as such.

OK to apply?

Julian

ChangeLog

    Julian Brown  <julian@codesourcery.com>
    Paul Brook  <paul@codesourcery.com>
    Daniel Jacobowitz  <dan@codesourcery.com>
    Joseph Myers  <joseph@codesourcery.com>

    gcc/
    * expr.c (expand_assignment): Handle MISALIGNED_INDIRECT_REF as a
    destination.
    (expand_expr_real_1): Handle writes to MISALIGNED_INDIRECT_REF.
    * target-def.h (TARGET_VECTOR_MIN_ALIGNMENT)
    (TARGET_VECTOR_ALWAYS_MISALIGN): Define.
    (TARGET_VECTORIZE): Use them.
    * target.h (gcc_target): Add vectorize.vector_min_alignment and
    vectorize.always_misalign.
    * targhooks.c (default_vector_min_alignment): New function.
    * targhooks.h (default_vector_min_alignment): Add prototype.
    * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Use
    targetm.vectorize.vector_min_alignment.
    * tree-vect-loop-manip.c (target.h): Include.
    (vect_gen_niters_for_prolog_loop): Use
    targetm.vectorize.vector_min_alignment.
    * tree-vect-stmts.c (vectorizable_store): Honour
    targetm.vectorize.always_misalign.
    (vectorizable_load): Ditto.
    * config/arm/arm.c (arm_vector_min_alignment)
    (arm_vector_always_misalign): New functions.
    (TARGET_VECTOR_MIN_ALIGNMENT, TARGET_VECTOR_ALWAYS_MISALIGN):
    Define macros, using above.
    (neon_vector_mem_operand): Disallow PRE_DEC for array loads.
    (arm_print_operand): Include alignment qualifier in %A.
    * config/arm/neon.md (UNSPEC_MISALIGNED_ACCESS): New constant.
    (movmisalign<mode>): New expander.
    (movmisalign<mode>_neon_store, movmisalign<mode>_neon_load): New
    insn patterns.

    gcc/testsuite/
    * lib/target-supports.exp
    (check_effective_target_arm_vect_no_misalign): New function.
    (check_effective_target_vect_no_align): Use above.
    (check_effective_target_vect_element_align): New function.
    * gcc.dg/vect/no-section-anchors-vect-31.c: Use vect_element_align
    (instead of vect_hw_misalign) where appropriate.
    * gcc.dg/vect/no-section-anchors-vect-64.c: Ditto.
    * gcc.dg/vect/no-section-anchors-vect-66.c: Ditto.
    * gcc.dg/vect/no-section-anchors-vect-68.c: Ditto.
    * gcc.dg/vect/no-section-anchors-vect-69.c: Ditto.
    * gcc.dg/vect/section-anchors-vect-69.c: Ditto.
    * gcc.dg/vect/slp-25.c: Ditto.
    * gcc.dg/vect/vect-109.c: Ditto.
    * gcc.dg/vect/vect-26.c: Ditto.
    * gcc.dg/vect/vect-27.c: Ditto.
    * gcc.dg/vect/vect-28.c: Ditto.
    * gcc.dg/vect/vect-29.c: Ditto.
    * gcc.dg/vect/vect-33.c: Ditto.
    * gcc.dg/vect/vect-42.c: Ditto.
    * gcc.dg/vect/vect-44.c: Ditto.
    * gcc.dg/vect/vect-48.c: Ditto.
    * gcc.dg/vect/vect-50.c: Ditto.
    * gcc.dg/vect/vect-52.c: Ditto.
    * gcc.dg/vect/vect-54.c: Ditto.
    * gcc.dg/vect/vect-56.c: Ditto.
    * gcc.dg/vect/vect-58.c: Ditto.
    * gcc.dg/vect/vect-60.c: Ditto.
    * gcc.dg/vect/vect-70.c: Ditto.
    * gcc.dg/vect/vect-72.c: Ditto.
    * gcc.dg/vect/vect-75.c: Ditto.
    * gcc.dg/vect/vect-87.c: Ditto.
    * gcc.dg/vect/vect-88.c: Ditto.
    * gcc.dg/vect/vect-89.c: Ditto.
    * gcc.dg/vect/vect-91.c: Ditto.
    * gcc.dg/vect/vect-92.c: Ditto.
    * gcc.dg/vect/vect-93.c: Ditto.
    * gcc.dg/vect/vect-95.c: Ditto.
    * gcc.dg/vect/vect-96.c: Ditto.
    * gcc.dg/vect/vect-align-1.c: Ditto.
    * gcc.dg/vect/vect-align-2.c: Ditto.
    * gcc.dg/vect/vect-multitypes-1.c: Ditto.
    * gcc.dg/vect/vect-multitypes-3.c: Ditto.
    * gcc.dg/vect/vect-multitypes-4.c: Ditto.
    * gcc.dg/vect/vect-multitypes-6.c: Ditto.

[-- Attachment #2: misaligned-neon-fsf-4.diff --]
[-- Type: text/x-patch, Size: 53518 bytes --]

commit 4985478f93a72dd58d45d48bee17bfbd949830cb
Author: Julian Brown <julian@henry3.codesourcery.com>
Date:   Tue Nov 17 09:08:19 2009 -0800

    [ARM] Misaligned access support for little-endian Neon

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5f47362..913febc 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -224,6 +224,8 @@ static bool arm_can_eliminate (const int, const int);
 static void arm_asm_trampoline_template (FILE *);
 static void arm_trampoline_init (rtx, tree, rtx);
 static rtx arm_trampoline_adjust_address (rtx);
+static int arm_vector_min_alignment (const_tree type);
+static bool arm_vector_always_misalign (const_tree);
 
 \f
 /* Table of machine attributes.  */
@@ -507,6 +509,12 @@ static const struct attribute_spec arm_attribute_table[] =
 #undef TARGET_CAN_ELIMINATE
 #define TARGET_CAN_ELIMINATE arm_can_eliminate
 
+#undef TARGET_VECTOR_MIN_ALIGNMENT
+#define TARGET_VECTOR_MIN_ALIGNMENT arm_vector_min_alignment
+
+#undef TARGET_VECTOR_ALWAYS_MISALIGN
+#define TARGET_VECTOR_ALWAYS_MISALIGN arm_vector_always_misalign
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 \f
 /* Obstack for minipool constant handling.  */
@@ -8306,7 +8314,8 @@ neon_vector_mem_operand (rtx op, int type)
     return arm_address_register_rtx_p (ind, 0);
 
   /* Allow post-increment with Neon registers.  */
-  if (type != 1 && (GET_CODE (ind) == POST_INC || GET_CODE (ind) == PRE_DEC))
+  if ((type != 1 && GET_CODE (ind) == POST_INC)
+      || (type == 0 && GET_CODE (ind) == PRE_DEC))
     return arm_address_register_rtx_p (XEXP (ind, 0), 0);
 
   /* FIXME: vld1 allows register post-modify.  */
@@ -15245,6 +15254,8 @@ arm_print_operand (FILE *stream, rtx x, int code)
       {
 	rtx addr;
 	bool postinc = FALSE;
+	unsigned align;
+
 	gcc_assert (GET_CODE (x) == MEM);
 	addr = XEXP (x, 0);
 	if (GET_CODE (addr) == POST_INC)
@@ -15252,7 +15263,13 @@ arm_print_operand (FILE *stream, rtx x, int code)
 	    postinc = 1;
 	    addr = XEXP (addr, 0);
 	  }
-	asm_fprintf (stream, "[%r]", REGNO (addr));
+	align = MEM_ALIGN (x) >> 3;
+	asm_fprintf (stream, "[%r", REGNO (addr));
+	if (align > GET_MODE_SIZE (GET_MODE (x)))
+	  align = GET_MODE_SIZE (GET_MODE (x));
+	if (align >= 8)
+	  asm_fprintf (stream, ", :%d", align << 3);
+	asm_fprintf (stream, "]");
 	if (postinc)
 	  fputs("!", stream);
       }
@@ -21274,4 +21291,34 @@ arm_have_conditional_execution (void)
   return !TARGET_THUMB1;
 }
 
+/* Return the minimum alignment required to load or store a
+   vector of the given type, which may be less than the
+   natural alignment of the type.  */
+
+static int
+arm_vector_min_alignment (const_tree type)
+{
+  if (TARGET_NEON)
+    {
+      /* The NEON element load and store instructions only require the
+	 alignment of the element type.  They can benefit from higher
+	 statically reported alignment, but we do not take advantage
+	 of that yet.  */
+      gcc_assert (TREE_CODE (type) == VECTOR_TYPE);
+      return TYPE_ALIGN_UNIT (TREE_TYPE (type));
+    }
+
+  return default_vector_min_alignment (type);
+}
+
+static bool
+arm_vector_always_misalign (const_tree type ATTRIBUTE_UNUSED)
+{
+  /* On big-endian targets array loads (vld1) and vector loads (vldm)
+     use a different format.  Always use the "misaligned" array variant.
+     FIXME: this still doesn't work for big-endian because of constant
+     loads and other operations using vldm ordering.  */
+  return TARGET_NEON && !BYTES_BIG_ENDIAN;
+}
+
 #include "gt-arm.h"
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 7d1ef11..1ba0df8 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -159,7 +159,8 @@
    (UNSPEC_VUZP1		201)
    (UNSPEC_VUZP2		202)
    (UNSPEC_VZIP1		203)
-   (UNSPEC_VZIP2		204)])
+   (UNSPEC_VZIP2		204)
+   (UNSPEC_MISALIGNED_ACCESS	205)])
 
 ;; Double-width vector modes.
 (define_mode_iterator VD [V8QI V4HI V2SI V2SF])
@@ -658,6 +659,51 @@
   neon_disambiguate_copy (operands, dest, src, 4);
 })
 
+(define_expand "movmisalign<mode>"
+  [(set (match_operand:VDQX 0 "nonimmediate_operand"	      "")
+	(unspec:VDQX [(match_operand:VDQX 1 "general_operand" "")]
+		     UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+{
+  if (!s_register_operand (operands[0], <MODE>mode)
+      && !s_register_operand (operands[1], <MODE>mode))
+    FAIL;
+})
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VDX 0 "memory_operand"		       "=Um")
+	(unspec:VDX [(match_operand:VDX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN
+   && (   s_register_operand (operands[0], <MODE>mode)
+       || s_register_operand (operands[1], <MODE>mode))"
+  "vst1.<V_sz_elem>\t{%P1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VDX 0 "s_register_operand"	   "=w")
+	(unspec:VDX [(match_operand:VDX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%P0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VQX 0 "memory_operand"		       "=Um")
+	(unspec:VQX [(match_operand:VQX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vst1.<V_sz_elem>\t{%q1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VQX 0 "s_register_operand"	   "=w")
+	(unspec:VQX [(match_operand:VQX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%q0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
 (define_insn "vec_set<mode>_internal"
   [(set (match_operand:VD 0 "s_register_operand" "=w")
         (vec_merge:VD
diff --git a/gcc/expr.c b/gcc/expr.c
index e62b530..3f514b3 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -4459,6 +4459,29 @@ expand_assignment (tree to, tree from, bool nontemporal)
 
   /* Compute FROM and store the value in the rtx we got.  */
 
+  if (TREE_CODE (to) == MISALIGNED_INDIRECT_REF)
+    {
+      rtx insn;
+      rtx from_rtx;
+      enum insn_code icode;
+      enum machine_mode mode = GET_MODE (to_rtx);
+
+      icode = optab_handler (movmisalign_optab, mode)->insn_code;
+      gcc_assert (icode != CODE_FOR_nothing);
+
+      from_rtx = expand_expr (from, NULL_RTX, mode, EXPAND_NORMAL);
+      insn = GEN_FCN (icode) (to_rtx, from_rtx);
+      /* If that failed then force the source into a reg and try again.  */
+      if (!insn)
+	{
+	  from_rtx = copy_to_mode_reg (mode, from_rtx);
+	  insn = GEN_FCN (icode) (to_rtx, from_rtx);
+	  gcc_assert (insn);
+	}
+      emit_insn (insn);
+      return;
+    }
+
   push_temp_slots ();
   result = store_expr (from, to_rtx, 0, nontemporal);
   preserve_temp_slots (result);
@@ -8730,6 +8753,10 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
 	    int icode;
 	    rtx reg, insn;
 
+	    /* For writes produce a MEM, and expand_assignment will DTRT.  */
+	    if (modifier == EXPAND_WRITE)
+	      return temp;
+
 	    gcc_assert (modifier == EXPAND_NORMAL
 			|| modifier == EXPAND_STACK_PARM);
 
diff --git a/gcc/target-def.h b/gcc/target-def.h
index ddab977..1f62193 100644
--- a/gcc/target-def.h
+++ b/gcc/target-def.h
@@ -391,6 +391,9 @@
 #define TARGET_VECTOR_ALIGNMENT_REACHABLE \
   default_builtin_vector_alignment_reachable
 #define TARGET_VECTORIZE_BUILTIN_VEC_PERM 0
+#define TARGET_VECTOR_MIN_ALIGNMENT \
+  default_vector_min_alignment
+#define TARGET_VECTOR_ALWAYS_MISALIGN hook_bool_const_tree_false
 #define TARGET_SUPPORT_VECTOR_MISALIGNMENT \
   default_builtin_support_vector_misalignment 
    
@@ -405,7 +408,9 @@
     TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST,			\
     TARGET_VECTOR_ALIGNMENT_REACHABLE,                                  \
     TARGET_VECTORIZE_BUILTIN_VEC_PERM,					\
-    TARGET_SUPPORT_VECTOR_MISALIGNMENT				\
+    TARGET_VECTOR_MIN_ALIGNMENT,					\
+    TARGET_VECTOR_ALWAYS_MISALIGN,					\
+    TARGET_SUPPORT_VECTOR_MISALIGNMENT					\
   }
 
 #define TARGET_DEFAULT_TARGET_FLAGS 0
diff --git a/gcc/target.h b/gcc/target.h
index 6d62d52..09c4174 100644
--- a/gcc/target.h
+++ b/gcc/target.h
@@ -490,6 +490,16 @@ struct gcc_target
 
     /* Target builtin that implements vector permute.  */
     tree (* builtin_vec_perm) (tree, tree*);
+
+    /* Return the minimum alignment required to load or store a
+       vector of the given type, which may be less than the
+       natural alignment of the type.  */
+    int (* vector_min_alignment) (const_tree);
+
+    /* Return true if "movmisalign" patterns should be used for all
+       loads/stores from data arrays.  */
+    bool (* always_misalign) (const_tree);
+
     /* Return true if the target supports misaligned store/load of a
        specific factor denoted in the third parameter.  The last parameter
        is true if the access is defined in a packed struct.  */
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index dfc470c..95a3107 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -932,6 +932,12 @@ default_addr_space_convert (rtx op ATTRIBUTE_UNUSED,
   gcc_unreachable ();
 }
 
+int
+default_vector_min_alignment (const_tree type)
+{
+  return TYPE_ALIGN_UNIT (type);
+}
+
 bool
 default_hard_regno_scratch_ok (unsigned int regno ATTRIBUTE_UNUSED)
 {
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 365496b..ae1d865 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -82,6 +82,8 @@ default_builtin_support_vector_misalignment (enum machine_mode mode,
 					     const_tree,
 					     int, bool); 
 
+extern int default_vector_min_alignment (const_tree);
+
 /* These are here, and not in hooks.[ch], because not all users of
    hooks.h include tm.h, and thus we don't have CUMULATIVE_ARGS.  */
 
diff --git a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
index 21b87a3..373384f 100644
--- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
+++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
@@ -88,5 +88,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c
index 1ce3fa7..6679fb2 100644
--- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c
+++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c
@@ -84,5 +84,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c
index 49a9098..48b6f0b 100644
--- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c
+++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c
@@ -79,5 +79,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c
index de036e8..e38acf2 100644
--- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c
+++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c
@@ -88,5 +88,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
index cc4f26f..5e6ac23 100644
--- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
+++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
@@ -114,7 +114,7 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vector_alignment_reachable} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_element_align || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { { ! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c b/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c
index 7b5ce73..13c8639 100644
--- a/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c
+++ b/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c
@@ -115,6 +115,6 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* Alignment forced using versioning until the pass that increases alignment
   is extended to handle structs.  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 4 "vect" { target {vect_int && vector_alignment_reachable } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 4 "vect" { target { {vect_int && vector_alignment_reachable } && { ! vect_element_align } } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 4 "vect" { target {vect_int && {! vector_alignment_reachable} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/slp-25.c b/gcc/testsuite/gcc.dg/vect/slp-25.c
index b660508..f0b7f2e 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-25.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-25.c
@@ -56,5 +56,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-109.c b/gcc/testsuite/gcc.dg/vect/vect-109.c
index dd9f8ea..af6e5ff 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-109.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-109.c
@@ -73,7 +73,7 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 2 "vect" } } */
-/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_element_align } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 10 "vect" { target vect_hw_misalign } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-26.c b/gcc/testsuite/gcc.dg/vect/vect-26.c
index bec111b..f81d5ab 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-26.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-26.c
@@ -37,5 +37,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-27.c b/gcc/testsuite/gcc.dg/vect/vect-27.c
index 4a2da22..cb84d25 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-27.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-27.c
@@ -45,6 +45,6 @@ int main (void)
 /* The initialization induction loop (with aligned access) is also vectorized.  */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { xfail vect_no_align } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-28.c b/gcc/testsuite/gcc.dg/vect/vect-28.c
index 794a7c8..0706a64 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-28.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-28.c
@@ -40,6 +40,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vector_alignment_reachable } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vector_alignment_reachable && { ! vect_element_align } } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-29.c b/gcc/testsuite/gcc.dg/vect/vect-29.c
index 0ad2848..136fac5 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-29.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-29.c
@@ -50,7 +50,7 @@ int main (void)
 
 /* The initialization induction loop (with aligned access) is also vectorized.  */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" {target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-33.c b/gcc/testsuite/gcc.dg/vect/vect-33.c
index d35bce4..2a91a1d 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-33.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-33.c
@@ -39,6 +39,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target vector_alignment_reachable } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vector_alignment_reachable && { ! vect_element_align } } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */ 
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-42.c b/gcc/testsuite/gcc.dg/vect/vect-42.c
index 3ba1c6f..6623a65 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-42.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-42.c
@@ -65,6 +65,6 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_hw_misalign } } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { { vect_no_align || vect_element_align } || { ! vector_alignment_reachable } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || { ! vector_alignment_reachable } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-44.c b/gcc/testsuite/gcc.dg/vect/vect-44.c
index ef1a463..22f0cce 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-44.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-44.c
@@ -65,8 +65,8 @@ int main (void)
    two loads to be aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target vect_no_align } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {{! vect_no_align} && {! vect_hw_misalign} } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-48.c b/gcc/testsuite/gcc.dg/vect/vect-48.c
index e47ee00..437cd86 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-48.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-48.c
@@ -54,7 +54,7 @@ int main (void)
    (The store is aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 2 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-50.c b/gcc/testsuite/gcc.dg/vect/vect-50.c
index 068c804..e037f0b 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-50.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-50.c
@@ -61,9 +61,9 @@ int main (void)
    align the store will not force the two loads to be aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }  */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } }  */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target vect_no_align } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && { {! vect_no_align } && {! vect_hw_misalign } } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-52.c b/gcc/testsuite/gcc.dg/vect/vect-52.c
index af485ab..06cef33 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-52.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-52.c
@@ -55,7 +55,7 @@ int main (void)
    (The store is aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 2 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-54.c b/gcc/testsuite/gcc.dg/vect/vect-54.c
index 629e82d..5ae19da 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-54.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-54.c
@@ -60,5 +60,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-56.c b/gcc/testsuite/gcc.dg/vect/vect-56.c
index 7b7da12..25f6d46 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-56.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-56.c
@@ -68,6 +68,6 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-58.c b/gcc/testsuite/gcc.dg/vect/vect-58.c
index fa8c91b..fb726b3 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-58.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-58.c
@@ -59,5 +59,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-60.c b/gcc/testsuite/gcc.dg/vect/vect-60.c
index cbdf63d..9500bb9 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-60.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-60.c
@@ -69,6 +69,6 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-70.c b/gcc/testsuite/gcc.dg/vect/vect-70.c
index e3ebdca..dc1311b 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-70.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-70.c
@@ -64,6 +64,6 @@ int main (void)
           
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable} } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target {{! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-72.c b/gcc/testsuite/gcc.dg/vect/vect-72.c
index 67a1975..983555c 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-72.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-72.c
@@ -46,6 +46,6 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-75.c b/gcc/testsuite/gcc.dg/vect/vect-75.c
index 092a301..6bdf6ad 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-75.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-75.c
@@ -45,5 +45,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-87.c b/gcc/testsuite/gcc.dg/vect/vect-87.c
index 9912f19..e2838c4 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-87.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-87.c
@@ -51,6 +51,6 @@ int main (void)
 /* Fails for targets that don't vectorize PLUS (e.g alpha).  */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target vector_alignment_reachable} } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-88.c b/gcc/testsuite/gcc.dg/vect/vect-88.c
index 5938546..eabee05 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-88.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-88.c
@@ -51,6 +51,6 @@ int main (void)
 /* Fails for targets that don't vectorize PLUS (e.g alpha).  */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target vector_alignment_reachable } } }  */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } }  */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-89.c b/gcc/testsuite/gcc.dg/vect/vect-89.c
index 131efea..4589e4c 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-89.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-89.c
@@ -46,5 +46,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-91.c b/gcc/testsuite/gcc.dg/vect/vect-91.c
index 632340b..3815130 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-91.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-91.c
@@ -59,6 +59,6 @@ main3 ()
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" { xfail vect_no_int_add } } } */
 /* { dg-final { scan-tree-dump-times "accesses have the same alignment." 3 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" {target { vector_alignment_reachable } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-92.c b/gcc/testsuite/gcc.dg/vect/vect-92.c
index 3a64e25..9e88471 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-92.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-92.c
@@ -92,5 +92,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-93.c b/gcc/testsuite/gcc.dg/vect/vect-93.c
index 85666d9..b8d8550 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-93.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-93.c
@@ -72,7 +72,7 @@ int main (void)
 /* main && main1 together: */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 2 "vect" { target powerpc*-*-* i?86-*-* x86_64-*-* } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { vect_no_align && {! vector_alignment_reachable} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
 
 /* in main1: */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target !powerpc*-*-* !i?86-*-* !x86_64-*-* } } } */
@@ -80,6 +80,6 @@ int main (void)
 
 /* in main: */
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-95.c b/gcc/testsuite/gcc.dg/vect/vect-95.c
index c1d5926..b6d41e9 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-95.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-95.c
@@ -62,8 +62,8 @@ int main (void)
    stores and generate misaligned accesses for the loads. For targets that 
    don't support unaligned loads we version for all four accesses.  */
 
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign} } } }  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } }  */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /*  { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target vect_no_align } } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 4 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-96.c b/gcc/testsuite/gcc.dg/vect/vect-96.c
index f392169..6521509 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-96.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-96.c
@@ -43,7 +43,7 @@ int main (void)
    For targets that don't support unaligned loads, version for the store.  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { {! vect_no_align} && vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { {! { vect_no_align || vect_element_align } } && vector_alignment_reachable } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-align-1.c b/gcc/testsuite/gcc.dg/vect/vect-align-1.c
index 099b7fe..a67de7a 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-align-1.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-align-1.c
@@ -46,7 +46,7 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { xfail vect_hw_misalign} } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { xfail vect_element_align} } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-align-2.c b/gcc/testsuite/gcc.dg/vect/vect-align-2.c
index 08a8011..71b8ae0 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-align-2.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-align-2.c
@@ -43,6 +43,6 @@ int main (void)
 
 
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { xfail vect_hw_misalign} } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { xfail vect_element_align } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c b/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
index e8fe027..5be214f 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
@@ -78,11 +78,11 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_element_align} } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c b/gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c
index 3346e71..37e2561 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c
@@ -54,6 +54,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" {xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" {xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c b/gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c
index 274fb02..fe17caf 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c
@@ -85,11 +85,11 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_hw_misalign}  } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_element_align} } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_element_align}  } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 8 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c b/gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c
index 5bb4be8..6351a0c 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c
@@ -61,6 +61,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { sparc*-*-* && ilp32 } }} } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 6 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 6 "vect" {xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 6 "vect" {xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 8d89ed8..fbd631c 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -1513,6 +1513,18 @@ proc check_effective_target_arm32 { } {
     }]
 }
 
+# Return 1 if this is an ARM target that only supports aligned vector accesses
+proc check_effective_target_arm_vect_no_misalign { } {
+    return [check_no_compiler_messages arm_vect_no_misalign assembly {
+	#if !defined(__arm__) \
+	    || (defined(__ARMEL__) \
+	        && (!defined(__thumb__) || defined(__thumb2__)))
+	#error FOO
+	#endif
+    }]
+}
+
+
 # Return 1 if this is an ARM target supporting -mfpu=vfp
 # -mfloat-abi=softfp.  Some multilibs may be incompatible with these
 # options.
@@ -2330,7 +2342,7 @@ proc check_effective_target_vect_no_align { } {
 	if { [istarget mipsisa64*-*-*]
 	     || [istarget sparc*-*-*]
 	     || [istarget ia64-*-*]
-	     || [check_effective_target_arm32] } { 
+	     || [check_effective_target_arm_vect_no_misalign] } { 
 	    set et_vect_no_align_saved 1
 	}
     }
@@ -2465,6 +2477,25 @@ proc check_effective_target_vector_alignment_reachable_for_64bit { } {
     return $et_vector_alignment_reachable_for_64bit_saved
 }
 
+# Return 1 if the target only requires element alignment for vector accesses
+
+proc check_effective_target_vect_element_align { } {
+    global et_vect_element_align
+
+    if [info exists et_vect_element_align] {
+	verbose "check_effective_target_vect_element_align: using cached result" 2
+    } else {
+	set et_vect_element_align 0
+	if { [istarget arm*-*-*]
+	     || [check_effective_target_vect_hw_misalign] } {
+	   set et_vect_element_align 1
+	}
+    }
+
+    verbose "check_effective_target_vect_element_align: returning $et_vect_element_align" 2
+    return $et_vect_element_align
+}
+
 # Return 1 if the target supports vector conditional operations, 0 otherwise.
 
 proc check_effective_target_vect_condition { } {
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index c13c275..cfd49dc 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -745,7 +745,7 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
     }
 
   base = build_fold_indirect_ref (base_addr);
-  alignment = ssize_int (TYPE_ALIGN (vectype)/BITS_PER_UNIT);
+  alignment = ssize_int (targetm.vectorize.vector_min_alignment (vectype));
 
   if ((aligned_to && tree_int_cst_compare (aligned_to, alignment) < 0)
       || !misalign)
@@ -796,7 +796,8 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
   /* At this point we assume that the base is aligned.  */
   gcc_assert (base_aligned
 	      || (TREE_CODE (base) == VAR_DECL 
-		  && DECL_ALIGN (base) >= TYPE_ALIGN (vectype)));
+		  && (DECL_ALIGN (base)
+		      >= targetm.vectorize.vector_min_alignment (vectype))));
 
   /* Modulo alignment.  */
   misalign = size_binop (FLOOR_MOD_EXPR, misalign, alignment);
diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index c0b15cd..f566c43 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -37,6 +37,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-scalar-evolution.h"
 #include "tree-vectorizer.h"
 #include "langhooks.h"
+#include "target.h"
 
 /*************************************************************************
   Simple Loop Peeling Utilities
@@ -1835,7 +1836,7 @@ vect_gen_niters_for_prolog_loop (loop_vec_info loop_vinfo, tree loop_niters)
   gimple dr_stmt = DR_STMT (dr);
   stmt_vec_info stmt_info = vinfo_for_stmt (dr_stmt);
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
-  int vectype_align = TYPE_ALIGN (vectype) / BITS_PER_UNIT;
+  int vectype_align = targetm.vectorize.vector_min_alignment (vectype);
   tree niters_type = TREE_TYPE (loop_niters);
   int step = 1;
   int element_size = GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (DR_REF (dr))));
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index fb4a5bf..e34d822 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -3185,7 +3185,8 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
 	       vect_permute_store_chain().  */
 	    vec_oprnd = VEC_index (tree, result_chain, i);
 
-          if (aligned_access_p (first_dr))
+          if (aligned_access_p (first_dr)
+	      && !targetm.vectorize.always_misalign (vectype))
             data_ref = build_fold_indirect_ref (dataref_ptr);
           else
           {
@@ -3564,10 +3565,15 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
 	    {
 	    case dr_aligned:
 	      gcc_assert (aligned_access_p (first_dr));
-	      data_ref = build_fold_indirect_ref (dataref_ptr);
-	      break;
+	      if (!targetm.vectorize.always_misalign (vectype))
+	        {
+		  data_ref = build_fold_indirect_ref (dataref_ptr);
+		  break;
+		}
+	      /* Fall through...  */
 	    case dr_unaligned_supported:
 	      {
+	        /* TODO: Record actual alignment in always_misalign case.  */
 		int mis = DR_MISALIGNMENT (first_dr);
 		tree tmis = (mis == -1 ? size_zero_node : size_int (mis));
 


* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2009-11-17 17:21 [PATCH, ARM] Misaligned access support for ARM Neon Julian Brown
@ 2009-11-18 12:03 ` Ira Rosen
  2009-11-30 13:53   ` Julian Brown
  2009-11-18 14:25 ` Paul Brook
  1 sibling, 1 reply; 29+ messages in thread
From: Ira Rosen @ 2009-11-18 12:03 UTC (permalink / raw)
  To: Julian Brown; +Cc: gcc-patches, paul, rearnsha



gcc-patches-owner@gcc.gnu.org wrote on 17/11/2009 19:19:31:

> Julian Brown <julian@codesourcery.com>
...
>
> Neon vld1/vst1 can't be used for *arbitrary* alignments: the
> minimum alignment is the element size for the access in question.
> Another target hook (vector_min_alignment) is defined to teach the
> vectorizer this. Correspondingly for the testsuite,
> check_effective_target_vect_element_align has been added to
> target-supports.exp.
>
> The latter unfortunately is fairly redundant with
> check_effective_target_vect_hw_misalign, which has been introduced
> since this patch was written. *_vect_element_align provides a weaker
> promise (i.e. element alignment vs. arbitrary alignment), though one
> which is probably sufficient in most cases. I'm not sure if it makes
> sense to keep both predicates: at present, I've updated quite a few
> tests which previously used *_vect_hw_misalign to use
> *_vect_element_align instead when it improves test results for ARM, but
> not exhaustively.

I think vect_element_align can replace vect_hw_misalign in all the
testcases except gcc.dg/vect/vect-align-1.c and gcc.dg/vect/vect-align-2.c.
These tests contain packed structures.
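
As a rough illustration of why packed structures are different (a
hypothetical layout using the GCC packed attribute, not the actual contents
of those two tests): once a struct is packed, an int array member can start
at an odd byte offset, so its elements are not even element-aligned, and a
target that only promises element alignment presumably cannot access them
directly:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout for illustration; not taken from vect-align-1.c.  */
struct __attribute__ ((packed)) packed_example
{
  char tag;      /* one byte, so the next member starts at offset 1 */
  int data[16];  /* the int elements are no longer int-aligned */
};

/* Byte misalignment of data[0] relative to the element size.  */
static size_t
element_misalignment (void)
{
  return offsetof (struct packed_example, data) % sizeof (int);
}

/* Usage: the array starts one byte past an element boundary.  */
static void
demo_packed (void)
{
  assert (element_misalignment () != 0);
}
```

A nonzero result here means the access needs arbitrary (hardware)
misalignment support rather than just element alignment.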


>
> Test results (cross to ARM Linux) show some additional failures. I
> think these are just showing up missing features elsewhere in the Neon
> support appearing now because of the dejagnu tweaks, rather than
> problems with this patch as such.
>
> OK to apply?
>



> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -796,7 +796,8 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
>    /* At this point we assume that the base is aligned.  */
>    gcc_assert (base_aligned
>           || (TREE_CODE (base) == VAR_DECL
> -           && DECL_ALIGN (base) >= TYPE_ALIGN (vectype)));
> +           && (DECL_ALIGN (base)
> +               >= targetm.vectorize.vector_min_alignment (vectype))));

Looks like you forgot to multiply by BITS_PER_UNIT here.
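
To spell out the unit mismatch (a standalone sketch with made-up helper
names, not GCC code): DECL_ALIGN is measured in bits, while the new hook is
used elsewhere in the patch in place of TYPE_ALIGN (vectype) / BITS_PER_UNIT,
so it presumably returns bytes. The value 8 for BITS_PER_UNIT and the
4-byte minimum below are assumptions for the example only:

```c
#include <assert.h>

/* Assumed value for this sketch; GCC defines BITS_PER_UNIT per target.  */
#define BITS_PER_UNIT 8

/* Hypothetical stand-in for targetm.vectorize.vector_min_alignment:
   returns a minimum alignment in bytes (here, a 4-byte element).  */
static unsigned
min_alignment_bytes (void)
{
  return 4;
}

/* Comparison as posted: bits on the left, bytes on the right.  */
static int
base_ok_as_posted (unsigned decl_align_bits)
{
  return decl_align_bits >= min_alignment_bytes ();
}

/* Comparison with both sides expressed in bits.  */
static int
base_ok_in_bits (unsigned decl_align_bits)
{
  return decl_align_bits >= min_alignment_bytes () * BITS_PER_UNIT;
}

/* A base aligned to only 16 bits passes the posted check, but fails
   once the right-hand side is scaled to bits.  */
static void
demo_units (void)
{
  assert (base_ok_as_posted (16));
  assert (!base_ok_in_bits (16));
}
```

So without the multiplication the assertion is weaker than intended and
would let an under-aligned base through.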


>    * tree-vect-stmts.c (vectorizable_store): Honour
>     targetm.vectorize.always_misalign.
>     (vectorizable_load): Ditto.

I would prefer to have all the alignment queries in
vect_supportable_dr_alignment(). Maybe you could add another enumeration
value to enum dr_alignment_support?
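
One possible shape for that, as a sketch only: the enumerator name
dr_aligned_as_misaligned and both helper functions below are hypothetical,
and the real enum has further values that are left out here. The idea is
that the hook is consulted once when the access is classified, and the
load/store code just switches on the result:

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced copy of the enum for illustration; only dr_aligned and
   dr_unaligned_supported are taken from the patch context.  */
enum dr_alignment_support
{
  dr_unaligned_supported,
  dr_aligned,
  dr_aligned_as_misaligned   /* hypothetical new value */
};

/* Hypothetical classifier: the only place that looks at the
   always-misalign property.  Unsupported and realignment cases are
   ignored in this sketch.  */
static enum dr_alignment_support
classify_access (bool aligned, bool always_misalign)
{
  if (!aligned)
    return dr_unaligned_supported;
  return always_misalign ? dr_aligned_as_misaligned : dr_aligned;
}

/* Callers switch on the classification and no longer query the hook.  */
static bool
use_misaligned_form (enum dr_alignment_support kind)
{
  switch (kind)
    {
    case dr_aligned:
      return false;
    case dr_aligned_as_misaligned:
    case dr_unaligned_supported:
      return true;
    }
  assert (!"unexpected dr_alignment_support value");
  return false;
}
```

With something like this, vectorizable_load and vectorizable_store would
not need their own calls to the target hook.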

Otherwise, the vectorizer part is OK with me.

Thanks,
Ira

> Julian
>


* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2009-11-17 17:21 [PATCH, ARM] Misaligned access support for ARM Neon Julian Brown
  2009-11-18 12:03 ` Ira Rosen
@ 2009-11-18 14:25 ` Paul Brook
  1 sibling, 0 replies; 29+ messages in thread
From: Paul Brook @ 2009-11-18 14:25 UTC (permalink / raw)
  To: Julian Brown; +Cc: gcc-patches, rearnsha

On Tuesday 17 November 2009, Julian Brown wrote:
> This patch provides support for misaligned accesses (for the
> vectorizer) for ARM Neon in little-endian mode, and some of the
> infrastructure for support in big-endian mode also (though big-endian
> support doesn't actually work yet).

The ARM bits are OK by me; you'll need someone else to sign off on the other
bits.

One thing we did consider in the original implementation was whether the 
"movmisalign" patterns should be renamed. In the ARM scheme describing these 
as misaligned references is somewhat misleading. Instead we have "opaque"
vector transfers (mov<mode>) and array data transfers (movmisalign).
It's probably not worth the pain of renaming stuff, but I suggest adding 
commentary in the movmisalign section of tm.texi.

Paul


* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2009-11-18 12:03 ` Ira Rosen
@ 2009-11-30 13:53   ` Julian Brown
  2009-11-30 14:03     ` Joseph S. Myers
  2009-12-01  8:39     ` Ira Rosen
  0 siblings, 2 replies; 29+ messages in thread
From: Julian Brown @ 2009-11-30 13:53 UTC (permalink / raw)
  To: Ira Rosen; +Cc: gcc-patches, paul, rearnsha, eres

[-- Attachment #1: Type: text/plain, Size: 6755 bytes --]

Hi,

On Wed, 18 Nov 2009 13:53:05 +0200
Ira Rosen <IRAR@il.ibm.com> wrote:

> I think, vect_element_align can replace vect_hw_misalign in all the
> testcases except gcc.dg/vect/vect-align-1.c and
> gcc.dg/vect/vect-align-2.c. These tests contain packed structures.

I've updated the tests apart from those ones to use vect_element_align.

> > Test results (cross to ARM Linux) show some additional failures. I
> > think these are just showing up missing features elsewhere in the
> > Neon support appearing now because of the dejagnu tweaks, rather
> > than problems with this patch as such.

I updated my sources (a week or two ago), and unfortunately the number
of newly-FAILing tests is now greater. I've examined a cross-section of
these failures, and some of them I think are due to testsuite changes
in the following patch:

  http://gcc.gnu.org/ml/gcc-patches/2009-10/msg01604.html

For example the vect-26.c compilation test now has:

/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access"
0 "vect" } } */

But for ARM NEON, there is a single instance of this message in the
relevant dump file. The previous version of the line looks more correct
to me:

/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access"
1 "vect" { xfail {! vect_hw_misalign} } } } */

This seems to describe the desired behaviour (at least for NEON) more
accurately (though I may have misunderstood something about the
above-linked patch).

Other failures are due to things like vectorizing *more* loops than
expected in several tests, and (as written before) missing parts in the
NEON support. I don't think there's anything which indicates actual
breakage.

> > --- a/gcc/tree-vect-data-refs.c
> > +++ b/gcc/tree-vect-data-refs.c
> > @@ -796,7 +796,8 @@ vect_compute_data_ref_alignment (struct
> data_reference *dr)
> >    /* At this point we assume that the base is aligned.  */
> >    gcc_assert (base_aligned
> >           || (TREE_CODE (base) == VAR_DECL
> > -           && DECL_ALIGN (base) >= TYPE_ALIGN (vectype)));
> > +           && (DECL_ALIGN (base)
> > +               >= targetm.vectorize.vector_min_alignment
> > (vectype))));
> 
> Looks like you forgot to multiply by BITS_PER_UNIT here.

Fixed.
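
To illustrate the units mismatch: DECL_ALIGN is measured in bits, while
the new hook (per its default, which returns TYPE_ALIGN_UNIT) reports
bytes, so the comparison needs scaling by BITS_PER_UNIT. The following is
a standalone sketch only, not the code in the attached patch; every
sketch_* name is a hypothetical stand-in, and SKETCH_BITS_PER_UNIT is
assumed to be 8.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for GCC's BITS_PER_UNIT; assumed to be 8 in this sketch.  */
#define SKETCH_BITS_PER_UNIT 8

/* Hypothetical model of the comparison: DECL_ALIGN_BITS is in bits
   (like DECL_ALIGN), MIN_ALIGN_BYTES is in bytes (like the value
   returned by the vector_min_alignment hook).  */
bool
sketch_base_sufficiently_aligned (unsigned decl_align_bits,
                                  unsigned min_align_bytes)
{
  return decl_align_bits >= min_align_bytes * SKETCH_BITS_PER_UNIT;
}

/* Hypothetical analogue of the gcc_assert in
   vect_compute_data_ref_alignment, using the scaled comparison.  */
void
sketch_assert_base_ok (bool base_aligned, bool is_var_decl,
                       unsigned decl_align_bits, unsigned min_align_bytes)
{
  assert (base_aligned
          || (is_var_decl
              && sketch_base_sufficiently_aligned (decl_align_bits,
                                                   min_align_bytes)));
}
```

Without the scale factor, a declaration aligned to 16 bits would wrongly
pass a check against a 4-byte minimum, since 16 >= 4.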

> >    * tree-vect-stmts.c (vectorizable_store): Honour
> >     targetm.vectorize.always_misalign.
> >     (vectorizable_load): Ditto.
> 
> I would prefer to have all the alignment queries in
> vect_supportable_dr_alignment(). Maybe you could add another
> enumeration value to enum dr_alignment_support?

Does something like the attached look right?
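
As a rough model of the idea (a simplified sketch under stated
assumptions, not the attached patch): only the name dr_unaligned_forced
comes from the ChangeLog below. The sketch_* enumerators, the boolean
parameters, and the reduced set of cases are hypothetical stand-ins for
the real enum dr_alignment_support and the target hooks.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for enum dr_alignment_support.  */
enum sketch_alignment_support
{
  sketch_unaligned_unsupported,
  sketch_unaligned_supported,
  sketch_unaligned_forced,	/* Models dr_unaligned_forced.  */
  sketch_aligned
};

/* Decide in one place how an access is handled.  An aligned access is
   reported as "forced" when the target asks for the misaligned
   (element-order) pattern to be used unconditionally.  */
enum sketch_alignment_support
sketch_supportable_alignment (bool access_is_aligned,
                              bool target_always_misalign,
                              bool target_supports_misalign)
{
  if (access_is_aligned)
    return target_always_misalign ? sketch_unaligned_forced : sketch_aligned;

  return (target_supports_misalign
	  ? sketch_unaligned_supported
	  : sketch_unaligned_unsupported);
}

/* Return true when a load or store of the given kind would go through
   the misaligned pattern.  Unsupported accesses must have been rejected
   before this point.  */
bool
sketch_uses_movmisalign (enum sketch_alignment_support kind)
{
  assert (kind != sketch_unaligned_unsupported);
  return kind == sketch_unaligned_forced
	 || kind == sketch_unaligned_supported;
}
```

With the decision centralised like this, callers only need to switch on
the returned value rather than query the target hook themselves.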

I've also drafted a bit of hopefully-explanatory text in md.texi about
movmisalign<mode> and element ordering.

Thanks,

Julian

ChangeLog

    Julian Brown  <julian@codesourcery.com>
    Paul Brook  <paul@codesourcery.com>
    Daniel Jacobowitz  <dan@codesourcery.com>
    Joseph Myers  <joseph@codesourcery.com>

    gcc/
    * expr.c (expand_assignment): Handle MISALIGNED_INDIRECT_REF as a
    destination.
    (expand_expr_real_1): Handle writes to MISALIGNED_INDIRECT_REF.
    * target-def.h (TARGET_VECTOR_MIN_ALIGNMENT)
    (TARGET_VECTOR_ALWAYS_MISALIGN): Define.
    (TARGET_VECTORIZE): Use them.
    * target.h (gcc_target): Add vectorize.vector_min_alignment and
    vectorize.always_misalign.
    * targhooks.c (default_vector_min_alignment): New function.
    * targhooks.h (default_vector_min_alignment): Add prototype.
    * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Use
    targetm.vectorize.vector_min_alignment.
    (vect_supportable_dr_alignment): Support forced misalignment for
    aligned accesses.
    * tree-vect-loop-manip.c (target.h): Include.
    (vect_gen_niters_for_prolog_loop): Use
    targetm.vectorize.vector_min_alignment.
    (vect_model_load_cost, vectorizable_store, vectorizable_load): Support
    dr_unaligned_forced.
    * tree-vect-stmts.c (vectorizable_store): Honour
    targetm.vectorize.always_misalign.
    (vectorizable_load): Ditto.
    * tree-vectorizer.h (operation_type): Add dr_unaligned_forced.
    * config/arm/arm.c (arm_vector_min_alignment)
    (arm_vector_always_misalign): New functions.
    (TARGET_VECTOR_MIN_ALIGNMENT, TARGET_VECTOR_ALWAYS_MISALIGN):
    Define macros, using above.
    (neon_vector_mem_operand): Disallow PRE_DEC for array loads.
    (arm_print_operand): Include alignment qualifier in %A.
    * config/arm/neon.md (UNSPEC_MISALIGNED_ACCESS): New constant.
    (movmisalign<mode>): New expander.
    (movmisalign<mode>_neon_store, movmisalign<mode>_neon_load): New
    insn patterns.

    gcc/doc/
    * md.texi (movmisalign): Add section about element ordering.

    gcc/testsuite/
    * lib/target-supports.exp
    (check_effective_target_arm_vect_no_misalign): New function.
    (check_effective_target_vect_no_align): Use above.
    (check_effective_target_vect_element_align): New function.
    * gcc.dg/vect/no-section-anchors-vect-31.c: Use vect_element_align
    (instead of vect_hw_misalign) where appropriate.
    * gcc.dg/vect/no-section-anchors-vect-64.c: Ditto.
    * gcc.dg/vect/no-section-anchors-vect-66.c: Ditto.
    * gcc.dg/vect/no-section-anchors-vect-68.c: Ditto.
    * gcc.dg/vect/no-section-anchors-vect-69.c: Ditto.
    * gcc.dg/vect/no-scevccp-outer-8.c: Ditto.
    * gcc.dg/vect/pr25413.c: Ditto.
    * gcc.dg/vect/section-anchors-vect-69.c: Ditto.
    * gcc.dg/vect/slp-25.c: Ditto.
    * gcc.dg/vect/vect-109.c: Ditto.
    * gcc.dg/vect/vect-26.c: Ditto.
    * gcc.dg/vect/vect-27.c: Ditto.
    * gcc.dg/vect/vect-28.c: Ditto.
    * gcc.dg/vect/vect-29.c: Ditto.
    * gcc.dg/vect/vect-33.c: Ditto.
    * gcc.dg/vect/vect-42.c: Ditto.
    * gcc.dg/vect/vect-44.c: Ditto.
    * gcc.dg/vect/vect-48.c: Ditto.
    * gcc.dg/vect/vect-50.c: Ditto.
    * gcc.dg/vect/vect-52.c: Ditto.
    * gcc.dg/vect/vect-54.c: Ditto.
    * gcc.dg/vect/vect-56.c: Ditto.
    * gcc.dg/vect/vect-58.c: Ditto.
    * gcc.dg/vect/vect-60.c: Ditto.
    * gcc.dg/vect/vect-70.c: Ditto.
    * gcc.dg/vect/vect-72.c: Ditto.
    * gcc.dg/vect/vect-75.c: Ditto.
    * gcc.dg/vect/vect-87.c: Ditto.
    * gcc.dg/vect/vect-88.c: Ditto.
    * gcc.dg/vect/vect-89.c: Ditto.
    * gcc.dg/vect/vect-91.c: Ditto.
    * gcc.dg/vect/vect-92.c: Ditto.
    * gcc.dg/vect/vect-93.c: Ditto.
    * gcc.dg/vect/vect-95.c: Ditto.
    * gcc.dg/vect/vect-96.c: Ditto.
    * gcc.dg/vect/vect-align-1.c: Ditto.
    * gcc.dg/vect/vect-align-2.c: Ditto.
    * gcc.dg/vect/vect-multitypes-1.c: Ditto.
    * gcc.dg/vect/vect-multitypes-3.c: Ditto.
    * gcc.dg/vect/vect-multitypes-4.c: Ditto.
    * gcc.dg/vect/vect-multitypes-6.c: Ditto.
    * gfortran.dg/vect/vect-2.f90: Ditto.
    * gfortran.dg/vect/vect-3.f90: Ditto.
    * gfortran.dg/vect/vect-4.f90: Ditto.
    * gfortran.dg/vect/vect-5.f90: Ditto.

[-- Attachment #2: misaligned-neon-fsf-5.diff --]
[-- Type: text/x-patch, Size: 63671 bytes --]

commit 59ca018f4e3b0164cbbd81c97932f573db982237
Author: Julian Brown <julian@henry3.codesourcery.com>
Date:   Sat Nov 28 12:10:59 2009 -0800

    [ARM] Misaligned access support for little-endian Neon

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index a9ad903..1e37e4e 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -224,6 +224,8 @@ static bool arm_can_eliminate (const int, const int);
 static void arm_asm_trampoline_template (FILE *);
 static void arm_trampoline_init (rtx, tree, rtx);
 static rtx arm_trampoline_adjust_address (rtx);
+static int arm_vector_min_alignment (const_tree type);
+static bool arm_vector_always_misalign (const_tree);
 
 \f
 /* Table of machine attributes.  */
@@ -507,6 +509,12 @@ static const struct attribute_spec arm_attribute_table[] =
 #undef TARGET_CAN_ELIMINATE
 #define TARGET_CAN_ELIMINATE arm_can_eliminate
 
+#undef TARGET_VECTOR_MIN_ALIGNMENT
+#define TARGET_VECTOR_MIN_ALIGNMENT arm_vector_min_alignment
+
+#undef TARGET_VECTOR_ALWAYS_MISALIGN
+#define TARGET_VECTOR_ALWAYS_MISALIGN arm_vector_always_misalign
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 \f
 /* Obstack for minipool constant handling.  */
@@ -8463,7 +8471,8 @@ neon_vector_mem_operand (rtx op, int type)
     return arm_address_register_rtx_p (ind, 0);
 
   /* Allow post-increment with Neon registers.  */
-  if (type != 1 && (GET_CODE (ind) == POST_INC || GET_CODE (ind) == PRE_DEC))
+  if ((type != 1 && GET_CODE (ind) == POST_INC)
+      || (type == 0 && GET_CODE (ind) == PRE_DEC))
     return arm_address_register_rtx_p (XEXP (ind, 0), 0);
 
   /* FIXME: vld1 allows register post-modify.  */
@@ -15411,6 +15420,8 @@ arm_print_operand (FILE *stream, rtx x, int code)
       {
 	rtx addr;
 	bool postinc = FALSE;
+	unsigned align;
+
 	gcc_assert (GET_CODE (x) == MEM);
 	addr = XEXP (x, 0);
 	if (GET_CODE (addr) == POST_INC)
@@ -15418,7 +15429,13 @@ arm_print_operand (FILE *stream, rtx x, int code)
 	    postinc = 1;
 	    addr = XEXP (addr, 0);
 	  }
-	asm_fprintf (stream, "[%r]", REGNO (addr));
+	align = MEM_ALIGN (x) >> 3;
+	asm_fprintf (stream, "[%r", REGNO (addr));
+	if (align > GET_MODE_SIZE (GET_MODE (x)))
+	  align = GET_MODE_SIZE (GET_MODE (x));
+	if (align >= 8)
+	  asm_fprintf (stream, ", :%d", align << 3);
+	asm_fprintf (stream, "]");
 	if (postinc)
 	  fputs("!", stream);
       }
@@ -21463,4 +21480,34 @@ arm_have_conditional_execution (void)
   return !TARGET_THUMB1;
 }
 
+/* Return the minimum alignment required to load or store a
+   vector of the given type, which may be less than the
+   natural alignment of the type.  */
+
+static int
+arm_vector_min_alignment (const_tree type)
+{
+  if (TARGET_NEON)
+    {
+      /* The NEON element load and store instructions only require the
+	 alignment of the element type.  They can benefit from higher
+	 statically reported alignment, but we do not take advantage
+	 of that yet.  */
+      gcc_assert (TREE_CODE (type) == VECTOR_TYPE);
+      return TYPE_ALIGN_UNIT (TREE_TYPE (type));
+    }
+
+  return default_vector_min_alignment (type);
+}
+
+static bool
+arm_vector_always_misalign (const_tree type ATTRIBUTE_UNUSED)
+{
+  /* On big-endian targets array loads (vld1) and vector loads (vldm)
+     use a different format.  Always use the "misaligned" array variant.
+     FIXME: this still doesn't work for big-endian because of constant
+     loads and other operations using vldm ordering.  */
+  return TARGET_NEON && !BYTES_BIG_ENDIAN;
+}
+
 #include "gt-arm.h"
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 43b3805..e6c28bd 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -159,7 +159,8 @@
    (UNSPEC_VUZP1		201)
    (UNSPEC_VUZP2		202)
    (UNSPEC_VZIP1		203)
-   (UNSPEC_VZIP2		204)])
+   (UNSPEC_VZIP2		204)
+   (UNSPEC_MISALIGNED_ACCESS	205)])
 
 ;; Double-width vector modes.
 (define_mode_iterator VD [V8QI V4HI V2SI V2SF])
@@ -674,6 +675,51 @@
   neon_disambiguate_copy (operands, dest, src, 4);
 })
 
+(define_expand "movmisalign<mode>"
+  [(set (match_operand:VDQX 0 "nonimmediate_operand"	      "")
+	(unspec:VDQX [(match_operand:VDQX 1 "general_operand" "")]
+		     UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+{
+  if (!s_register_operand (operands[0], <MODE>mode)
+      && !s_register_operand (operands[1], <MODE>mode))
+    FAIL;
+})
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VDX 0 "memory_operand"		       "=Um")
+	(unspec:VDX [(match_operand:VDX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN
+   && (   s_register_operand (operands[0], <MODE>mode)
+       || s_register_operand (operands[1], <MODE>mode))"
+  "vst1.<V_sz_elem>\t{%P1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VDX 0 "s_register_operand"	   "=w")
+	(unspec:VDX [(match_operand:VDX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%P0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VQX 0 "memory_operand"		       "=Um")
+	(unspec:VQX [(match_operand:VQX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vst1.<V_sz_elem>\t{%q1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VQX 0 "s_register_operand"	   "=w")
+	(unspec:VQX [(match_operand:VQX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%q0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
 (define_insn "vec_set<mode>_internal"
   [(set (match_operand:VD 0 "s_register_operand" "=w")
         (vec_merge:VD
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 2974dcf..837e6b6 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3807,6 +3807,15 @@ memory, so that it's easy to tell whether this is a load or store.
 This pattern is used by the autovectorizer, and when expanding a
 @code{MISALIGNED_INDIRECT_REF} expression.
 
+The @code{movmisalign@var{m}} pattern should load or store vector elements
+in the same memory order as an array of the element types.  If the
+target machine uses "opaque" operations to implement @code{mov@var{m}}
+for vector types (so the vector elements are in a different order to
+an equivalent array), but can also implement @code{movmisalign@var{m}}
+efficiently, then the autovectorizer should use this pattern for aligned
+accesses as well as misaligned accesses.  This behaviour is controlled
+by the TARGET_VECTOR_ALWAYS_MISALIGN hook.
+
 @cindex @code{load_multiple} instruction pattern
 @item @samp{load_multiple}
 Load several consecutive memory locations into consecutive registers.
diff --git a/gcc/expr.c b/gcc/expr.c
index 75c1792..ad8298c 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -4458,6 +4458,29 @@ expand_assignment (tree to, tree from, bool nontemporal)
 
   /* Compute FROM and store the value in the rtx we got.  */
 
+  if (TREE_CODE (to) == MISALIGNED_INDIRECT_REF)
+    {
+      rtx insn;
+      rtx from_rtx;
+      enum insn_code icode;
+      enum machine_mode mode = GET_MODE (to_rtx);
+
+      icode = optab_handler (movmisalign_optab, mode)->insn_code;
+      gcc_assert (icode != CODE_FOR_nothing);
+
+      from_rtx = expand_expr (from, NULL_RTX, mode, EXPAND_NORMAL);
+      insn = GEN_FCN (icode) (to_rtx, from_rtx);
+      /* If that failed then force the source into a reg and try again.  */
+      if (!insn)
+	{
+	  from_rtx = copy_to_mode_reg (mode, from_rtx);
+	  insn = GEN_FCN (icode) (to_rtx, from_rtx);
+	  gcc_assert (insn);
+	}
+      emit_insn (insn);
+      return;
+    }
+
   push_temp_slots ();
   result = store_expr (from, to_rtx, 0, nontemporal);
   preserve_temp_slots (result);
@@ -8697,6 +8720,10 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
 	    int icode;
 	    rtx reg, insn;
 
+	    /* For writes produce a MEM, and expand_assignment will DTRT.  */
+	    if (modifier == EXPAND_WRITE)
+	      return temp;
+
 	    gcc_assert (modifier == EXPAND_NORMAL
 			|| modifier == EXPAND_STACK_PARM);
 
diff --git a/gcc/target-def.h b/gcc/target-def.h
index c57977b..b3341a3 100644
--- a/gcc/target-def.h
+++ b/gcc/target-def.h
@@ -391,6 +391,9 @@
 #define TARGET_VECTOR_ALIGNMENT_REACHABLE \
   default_builtin_vector_alignment_reachable
 #define TARGET_VECTORIZE_BUILTIN_VEC_PERM 0
+#define TARGET_VECTOR_MIN_ALIGNMENT \
+  default_vector_min_alignment
+#define TARGET_VECTOR_ALWAYS_MISALIGN hook_bool_const_tree_false
 #define TARGET_SUPPORT_VECTOR_MISALIGNMENT \
   default_builtin_support_vector_misalignment
 
@@ -405,7 +408,9 @@
     TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST,			\
     TARGET_VECTOR_ALIGNMENT_REACHABLE,                                  \
     TARGET_VECTORIZE_BUILTIN_VEC_PERM,					\
-    TARGET_SUPPORT_VECTOR_MISALIGNMENT				\
+    TARGET_VECTOR_MIN_ALIGNMENT,					\
+    TARGET_VECTOR_ALWAYS_MISALIGN,					\
+    TARGET_SUPPORT_VECTOR_MISALIGNMENT					\
   }
 
 #define TARGET_DEFAULT_TARGET_FLAGS 0
diff --git a/gcc/target.h b/gcc/target.h
index 477a512..8490e3f 100644
--- a/gcc/target.h
+++ b/gcc/target.h
@@ -490,6 +490,16 @@ struct gcc_target
 
     /* Target builtin that implements vector permute.  */
     tree (* builtin_vec_perm) (tree, tree*);
+
+    /* Return the minimum alignment required to load or store a
+       vector of the given type, which may be less than the
+       natural alignment of the type.  */
+    int (* vector_min_alignment) (const_tree);
+
+    /* Return true if "movmisalign" patterns should be used for all
+       loads/stores from data arrays.  */
+    bool (* always_misalign) (const_tree);
+
     /* Return true if the target supports misaligned store/load of a
        specific factor denoted in the third parameter.  The last parameter
        is true if the access is defined in a packed struct.  */
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index d619ae5..5af30ab 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -932,6 +932,12 @@ default_addr_space_convert (rtx op ATTRIBUTE_UNUSED,
   gcc_unreachable ();
 }
 
+int
+default_vector_min_alignment (const_tree type)
+{
+  return TYPE_ALIGN_UNIT (type);
+}
+
 bool
 default_hard_regno_scratch_ok (unsigned int regno ATTRIBUTE_UNUSED)
 {
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 631bdf2..673f93d 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -82,6 +82,8 @@ default_builtin_support_vector_misalignment (enum machine_mode mode,
 					     const_tree,
 					     int, bool);
 
+extern int default_vector_min_alignment (const_tree);
+
 /* These are here, and not in hooks.[ch], because not all users of
    hooks.h include tm.h, and thus we don't have CUMULATIVE_ARGS.  */
 
diff --git a/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c b/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c
index ea67946..afa5b3d 100644
--- a/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c
+++ b/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c
@@ -46,5 +46,5 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { ! { vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { ! { vect_element_align } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
index 21b87a3..373384f 100644
--- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
+++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
@@ -88,5 +88,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c
index 1ce3fa7..6679fb2 100644
--- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c
+++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c
@@ -84,5 +84,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c
index 49a9098..48b6f0b 100644
--- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c
+++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c
@@ -79,5 +79,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c
index de036e8..e38acf2 100644
--- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c
+++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c
@@ -88,5 +88,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
index cc4f26f..5e6ac23 100644
--- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
+++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
@@ -114,7 +114,7 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vector_alignment_reachable} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_element_align || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { { ! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/pr25413.c b/gcc/testsuite/gcc.dg/vect/pr25413.c
index e483732..2e914bf 100644
--- a/gcc/testsuite/gcc.dg/vect/pr25413.c
+++ b/gcc/testsuite/gcc.dg/vect/pr25413.c
@@ -33,7 +33,7 @@ int main (void)
 } 
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vector_alignment_reachable_for_64bit } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c b/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c
index 7b5ce73..13c8639 100644
--- a/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c
+++ b/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c
@@ -115,6 +115,6 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* Alignment forced using versioning until the pass that increases alignment
   is extended to handle structs.  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 4 "vect" { target {vect_int && vector_alignment_reachable } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 4 "vect" { target { {vect_int && vector_alignment_reachable } && { ! vect_element_align } } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 4 "vect" { target {vect_int && {! vector_alignment_reachable} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/slp-25.c b/gcc/testsuite/gcc.dg/vect/slp-25.c
index b660508..f0b7f2e 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-25.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-25.c
@@ -56,5 +56,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-109.c b/gcc/testsuite/gcc.dg/vect/vect-109.c
index dd9f8ea..e293800 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-109.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-109.c
@@ -73,7 +73,7 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 2 "vect" } } */
-/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 10 "vect" { target vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 10 "vect" { target vect_element_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-26.c b/gcc/testsuite/gcc.dg/vect/vect-26.c
index bec111b..f81d5ab 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-26.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-26.c
@@ -37,5 +37,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-27.c b/gcc/testsuite/gcc.dg/vect/vect-27.c
index 4a2da22..cb84d25 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-27.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-27.c
@@ -45,6 +45,6 @@ int main (void)
 /* The initialization induction loop (with aligned access) is also vectorized.  */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { xfail vect_no_align } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-28.c b/gcc/testsuite/gcc.dg/vect/vect-28.c
index 794a7c8..0c80938 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-28.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-28.c
@@ -40,6 +40,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vector_alignment_reachable && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-29.c b/gcc/testsuite/gcc.dg/vect/vect-29.c
index 0ad2848..136fac5 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-29.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-29.c
@@ -50,7 +50,7 @@ int main (void)
 
 /* The initialization induction loop (with aligned access) is also vectorized.  */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" {target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-33.c b/gcc/testsuite/gcc.dg/vect/vect-33.c
index d35bce4..44ad996 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-33.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-33.c
@@ -39,6 +39,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target vector_alignment_reachable } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */ 
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vector_alignment_reachable && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */ 
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-42.c b/gcc/testsuite/gcc.dg/vect/vect-42.c
index 3ba1c6f..f97b7ad 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-42.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-42.c
@@ -64,7 +64,7 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_hw_misalign } } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { { vect_no_align || vect_element_align } || { ! vector_alignment_reachable } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || { ! vector_alignment_reachable } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-44.c b/gcc/testsuite/gcc.dg/vect/vect-44.c
index ef1a463..3a5c1b3 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-44.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-44.c
@@ -65,8 +65,8 @@ int main (void)
    two loads to be aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {{! vect_no_align} && {! vect_hw_misalign} } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {{! vect_no_align} && {! vect_element_align} } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-48.c b/gcc/testsuite/gcc.dg/vect/vect-48.c
index e47ee00..437cd86 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-48.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-48.c
@@ -54,7 +54,7 @@ int main (void)
    (The store is aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 2 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-50.c b/gcc/testsuite/gcc.dg/vect/vect-50.c
index 068c804..e9ba847 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-50.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-50.c
@@ -61,9 +61,9 @@ int main (void)
    align the store will not force the two loads to be aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }  */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } }  */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && { {! vect_no_align } && {! vect_hw_misalign } } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && { {! vect_no_align } && {! vect_element_align } } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-52.c b/gcc/testsuite/gcc.dg/vect/vect-52.c
index af485ab..06cef33 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-52.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-52.c
@@ -55,7 +55,7 @@ int main (void)
    (The store is aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 2 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-54.c b/gcc/testsuite/gcc.dg/vect/vect-54.c
index 629e82d..5ae19da 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-54.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-54.c
@@ -60,5 +60,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-56.c b/gcc/testsuite/gcc.dg/vect/vect-56.c
index 7b7da12..25f6d46 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-56.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-56.c
@@ -68,6 +68,6 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-58.c b/gcc/testsuite/gcc.dg/vect/vect-58.c
index fa8c91b..fb726b3 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-58.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-58.c
@@ -59,5 +59,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-60.c b/gcc/testsuite/gcc.dg/vect/vect-60.c
index cbdf63d..9500bb9 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-60.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-60.c
@@ -69,6 +69,6 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-70.c b/gcc/testsuite/gcc.dg/vect/vect-70.c
index e3ebdca..a742634 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-70.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-70.c
@@ -64,6 +64,6 @@ int main (void)
           
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target {{! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target {{! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-72.c b/gcc/testsuite/gcc.dg/vect/vect-72.c
index 67a1975..983555c 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-72.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-72.c
@@ -46,6 +46,6 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-75.c b/gcc/testsuite/gcc.dg/vect/vect-75.c
index 092a301..6bdf6ad 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-75.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-75.c
@@ -45,5 +45,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-87.c b/gcc/testsuite/gcc.dg/vect/vect-87.c
index 9912f19..15cd7e9 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-87.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-87.c
@@ -51,6 +51,6 @@ int main (void)
 /* Fails for targets that don't vectorize PLUS (e.g alpha).  */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target vector_alignment_reachable} } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-88.c b/gcc/testsuite/gcc.dg/vect/vect-88.c
index 5938546..ebc6885 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-88.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-88.c
@@ -51,6 +51,6 @@ int main (void)
 /* Fails for targets that don't vectorize PLUS (e.g alpha).  */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target vector_alignment_reachable } } }  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } }  */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-89.c b/gcc/testsuite/gcc.dg/vect/vect-89.c
index 131efea..4589e4c 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-89.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-89.c
@@ -46,5 +46,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-91.c b/gcc/testsuite/gcc.dg/vect/vect-91.c
index 632340b..bd4cafe 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-91.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-91.c
@@ -59,6 +59,6 @@ main3 ()
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" { xfail vect_no_int_add } } } */
 /* { dg-final { scan-tree-dump-times "accesses have the same alignment." 3 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" {target { vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" {target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-92.c b/gcc/testsuite/gcc.dg/vect/vect-92.c
index 3a64e25..9e88471 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-92.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-92.c
@@ -92,5 +92,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-93.c b/gcc/testsuite/gcc.dg/vect/vect-93.c
index 85666d9..b8d8550 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-93.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-93.c
@@ -72,7 +72,7 @@ int main (void)
 /* main && main1 together: */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 2 "vect" { target powerpc*-*-* i?86-*-* x86_64-*-* } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { vect_no_align && {! vector_alignment_reachable} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
 
 /* in main1: */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target !powerpc*-*-* !i?86-*-* !x86_64-*-* } } } */
@@ -80,6 +80,6 @@ int main (void)
 
 /* in main: */
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-95.c b/gcc/testsuite/gcc.dg/vect/vect-95.c
index c1d5926..470ee99 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-95.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-95.c
@@ -56,14 +56,14 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail {vect_hw_misalign} } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail {vect_element_align} } } } */
 
 /* For targets that support unaligned loads we version for the two unaligned 
    stores and generate misaligned accesses for the loads. For targets that 
    don't support unaligned loads we version for all four accesses.  */
 
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign} } } }  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } }  */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /*  { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target vect_no_align } } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 4 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-96.c b/gcc/testsuite/gcc.dg/vect/vect-96.c
index f392169..baaa749 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-96.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-96.c
@@ -43,7 +43,7 @@ int main (void)
    For targets that don't support unaligned loads, version for the store.  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { {! vect_no_align} && vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { {! { vect_no_align || vect_element_align } } && vector_alignment_reachable } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} && {! vect_element_align} } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-align-2.c b/gcc/testsuite/gcc.dg/vect/vect-align-2.c
index 08a8011..4f7e188 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-align-2.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-align-2.c
@@ -43,6 +43,6 @@ int main (void)
 
 
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { xfail vect_hw_misalign} } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { xfail vect_hw_misalign } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c b/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
index e8fe027..5be214f 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
@@ -78,11 +78,11 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_element_align} } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c b/gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c
index 3346e71..37e2561 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c
@@ -54,6 +54,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" {xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" {xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c b/gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c
index 274fb02..fe17caf 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c
@@ -85,11 +85,11 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_hw_misalign}  } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_element_align} } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_element_align}  } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 8 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c b/gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c
index 5bb4be8..6351a0c 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c
@@ -61,6 +61,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { sparc*-*-* && ilp32 } }} } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 6 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 6 "vect" {xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 6 "vect" {xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff --git a/gcc/testsuite/gfortran.dg/vect/vect-2.f90 b/gcc/testsuite/gfortran.dg/vect/vect-2.f90
index 0f45a70..41accfd 100644
--- a/gcc/testsuite/gfortran.dg/vect/vect-2.f90
+++ b/gcc/testsuite/gfortran.dg/vect/vect-2.f90
@@ -18,5 +18,5 @@ END
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } }
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { vect_no_align && { ! vector_alignment_reachable } } } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }
-! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" {target { vect_no_align || { { ! vector_alignment_reachable  } && { ! vect_hw_misalign } } } } } } 
+! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" {target { vect_no_align || { { ! vector_alignment_reachable  } && { ! vect_element_align } } } } } } 
 ! { dg-final { cleanup-tree-dump "vect" } }
diff --git a/gcc/testsuite/gfortran.dg/vect/vect-3.f90 b/gcc/testsuite/gfortran.dg/vect/vect-3.f90
index 5fc4fbf..2a38ce8 100644
--- a/gcc/testsuite/gfortran.dg/vect/vect-3.f90
+++ b/gcc/testsuite/gfortran.dg/vect/vect-3.f90
@@ -7,8 +7,8 @@ Y = Y + A * X
 END
 
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } }
-! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } }
-! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } }
+! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_element_align} } } } } }
+! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_element_align} } } } } }
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable}} } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { { vect_no_align } || { ! vector_alignment_reachable} } } } }
 
diff --git a/gcc/testsuite/gfortran.dg/vect/vect-4.f90 b/gcc/testsuite/gfortran.dg/vect/vect-4.f90
index 592282f..eced8b7 100644
--- a/gcc/testsuite/gfortran.dg/vect/vect-4.f90
+++ b/gcc/testsuite/gfortran.dg/vect/vect-4.f90
@@ -12,6 +12,6 @@ END
 ! { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } 
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } }
-! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } }
+! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } }
 ! { dg-final { scan-tree-dump-times "accesses have the same alignment." 1 "vect" } }
 ! { dg-final { cleanup-tree-dump "vect" } }
diff --git a/gcc/testsuite/gfortran.dg/vect/vect-5.f90 b/gcc/testsuite/gfortran.dg/vect/vect-5.f90
index 72776a6..d16a019 100644
--- a/gcc/testsuite/gfortran.dg/vect/vect-5.f90
+++ b/gcc/testsuite/gfortran.dg/vect/vect-5.f90
@@ -39,5 +39,5 @@
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } }
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 2 "vect" { target { vect_no_align } } } }
-! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } }
+! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } }
 ! { dg-final { cleanup-tree-dump "vect" } }
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 4b8d6f3..be7659f 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -1514,6 +1514,18 @@ proc check_effective_target_arm32 { } {
     }]
 }
 
+# Return 1 if this is an ARM target that only supports aligned vector accesses
+proc check_effective_target_arm_vect_no_misalign { } {
+    return [check_no_compiler_messages arm_vect_no_misalign assembly {
+	#if !defined(__arm__) \
+	    || (defined(__ARMEL__) \
+	        && (!defined(__thumb__) || defined(__thumb2__)))
+	#error FOO
+	#endif
+    }]
+}
+
+
 # Return 1 if this is an ARM target supporting -mfpu=vfp
 # -mfloat-abi=softfp.  Some multilibs may be incompatible with these
 # options.
@@ -2331,7 +2343,7 @@ proc check_effective_target_vect_no_align { } {
 	if { [istarget mipsisa64*-*-*]
 	     || [istarget sparc*-*-*]
 	     || [istarget ia64-*-*]
-	     || [check_effective_target_arm32] } { 
+	     || [check_effective_target_arm_vect_no_misalign] } { 
 	    set et_vect_no_align_saved 1
 	}
     }
@@ -2466,6 +2478,25 @@ proc check_effective_target_vector_alignment_reachable_for_64bit { } {
     return $et_vector_alignment_reachable_for_64bit_saved
 }
 
+# Return 1 if the target only requires element alignment for vector accesses
+
+proc check_effective_target_vect_element_align { } {
+    global et_vect_element_align
+
+    if [info exists et_vect_element_align] {
+	verbose "check_effective_target_vect_element_align: using cached result" 2
+    } else {
+	set et_vect_element_align 0
+	if { [istarget arm*-*-*]
+	     || [check_effective_target_vect_hw_misalign] } {
+	   set et_vect_element_align 1
+	}
+    }
+
+    verbose "check_effective_target_vect_element_align: returning $et_vect_element_align" 2
+    return $et_vect_element_align
+}
+
 # Return 1 if the target supports vector conditional operations, 0 otherwise.
 
 proc check_effective_target_vect_condition { } {
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index abc8485..c55192f 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -745,7 +745,7 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
     }
 
   base = build_fold_indirect_ref (base_addr);
-  alignment = ssize_int (TYPE_ALIGN (vectype)/BITS_PER_UNIT);
+  alignment = ssize_int (targetm.vectorize.vector_min_alignment (vectype));
 
   if ((aligned_to && tree_int_cst_compare (aligned_to, alignment) < 0)
       || !misalign)
@@ -795,8 +795,9 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
 
   /* At this point we assume that the base is aligned.  */
   gcc_assert (base_aligned
-	      || (TREE_CODE (base) == VAR_DECL
-		  && DECL_ALIGN (base) >= TYPE_ALIGN (vectype)));
+	      || (TREE_CODE (base) == VAR_DECL 
+		  && (DECL_ALIGN_UNIT (base)
+		      >= targetm.vectorize.vector_min_alignment (vectype))));
 
   /* Modulo alignment.  */
   misalign = size_binop (FLOOR_MOD_EXPR, misalign, alignment);
@@ -3388,7 +3389,12 @@ vect_supportable_dr_alignment (struct data_reference *dr)
   bool nested_in_vect_loop = false;
 
   if (aligned_access_p (dr))
-    return dr_aligned;
+    {
+      if (targetm.vectorize.always_misalign (vectype))
+	return dr_unaligned_forced;
+      else
+	return dr_aligned;
+    }
 
   if (!loop_vinfo)
     /* FORNOW: Misaligned accesses are supported only in loops.  */
diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index ea1a4d6..b767c30 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -37,6 +37,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-scalar-evolution.h"
 #include "tree-vectorizer.h"
 #include "langhooks.h"
+#include "target.h"
 
 /*************************************************************************
   Simple Loop Peeling Utilities
@@ -1835,7 +1836,7 @@ vect_gen_niters_for_prolog_loop (loop_vec_info loop_vinfo, tree loop_niters)
   gimple dr_stmt = DR_STMT (dr);
   stmt_vec_info stmt_info = vinfo_for_stmt (dr_stmt);
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
-  int vectype_align = TYPE_ALIGN (vectype) / BITS_PER_UNIT;
+  int vectype_align = targetm.vectorize.vector_min_alignment (vectype);
   tree niters_type = TREE_TYPE (loop_niters);
   int step = 1;
   int element_size = GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (DR_REF (dr))));
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 5c12697..7a2df3d 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -742,6 +742,7 @@ vect_model_load_cost (stmt_vec_info stmt_info, int ncopies, slp_tree slp_node)
         break;
       }
     case dr_unaligned_supported:
+    case dr_unaligned_forced:
       {
         /* Here, we assign an additional cost for the unaligned load.  */
         inside_cost += ncopies * TARG_VEC_UNALIGNED_LOAD_COST;
@@ -3185,7 +3186,8 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
 	       vect_permute_store_chain().  */
 	    vec_oprnd = VEC_index (tree, result_chain, i);
 
-          if (aligned_access_p (first_dr))
+          if (aligned_access_p (first_dr)
+	      && alignment_support_scheme != dr_unaligned_forced)
             data_ref = build_fold_indirect_ref (dataref_ptr);
           else
           {
@@ -3567,7 +3569,9 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
 	      data_ref = build_fold_indirect_ref (dataref_ptr);
 	      break;
 	    case dr_unaligned_supported:
+	    case dr_unaligned_forced:
 	      {
+	        /* TODO: Record actual alignment in always_misalign case.  */
 		int mis = DR_MISALIGNMENT (first_dr);
 		tree tmis = (mis == -1 ? size_zero_node : size_int (mis));
 
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index b7c6316..2cf644a 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -48,6 +48,7 @@ enum operation_type {
 enum dr_alignment_support {
   dr_unaligned_unsupported,
   dr_unaligned_supported,
+  dr_unaligned_forced,
   dr_explicit_realign,
   dr_explicit_realign_optimized,
   dr_aligned


* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2009-11-30 13:53   ` Julian Brown
@ 2009-11-30 14:03     ` Joseph S. Myers
  2009-11-30 14:15       ` Richard Earnshaw
  2009-12-01  8:39     ` Ira Rosen
  1 sibling, 1 reply; 29+ messages in thread
From: Joseph S. Myers @ 2009-11-30 14:03 UTC (permalink / raw)
  To: Julian Brown; +Cc: Ira Rosen, gcc-patches, paul, rearnsha, eres

On Mon, 30 Nov 2009, Julian Brown wrote:

> I've also drafted a bit of hopefully-explanatory text in md.texi about
> movmisalign<mode> and element ordering.

Texinfo markup fixes: ``opaque'', @code{TARGET_VECTOR_ALWAYS_MISALIGN}.  
The two new target hooks TARGET_VECTOR_ALWAYS_MISALIGN and 
TARGET_VECTOR_MIN_ALIGNMENT also need documentation in tm.texi.

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2009-11-30 14:03     ` Joseph S. Myers
@ 2009-11-30 14:15       ` Richard Earnshaw
  2009-11-30 14:33         ` Paul Brook
  0 siblings, 1 reply; 29+ messages in thread
From: Richard Earnshaw @ 2009-11-30 14:15 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Julian Brown, Ira Rosen, gcc-patches, paul, eres


On Mon, 2009-11-30 at 14:01 +0000, Joseph S. Myers wrote:
> On Mon, 30 Nov 2009, Julian Brown wrote:
> 
> > I've also drafted a bit of hopefully-explanatory text in md.texi about
> > movmisalign<mode> and element ordering.
> 
> Texinfo markup fixes: ``opaque'', @code{TARGET_VECTOR_ALWAYS_MISALIGN}.  
> The two new target hooks TARGET_VECTOR_ALWAYS_MISALIGN and 
> TARGET_VECTOR_MIN_ALIGNMENT also need documentation in tm.texi.
> 
I can't say I like this particular way of describing Neon.  It seems to
me that what's really needed is for the vectorizer to have an equivalent
concept for vector elements to the TARGET_WORDS_BIG_ENDIAN -- namely
TARGET_VECT_ELEMENTS_BIG_ENDIAN -- on Neon this would always be false.

R.


* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2009-11-30 14:15       ` Richard Earnshaw
@ 2009-11-30 14:33         ` Paul Brook
  2009-11-30 15:06           ` Richard Earnshaw
  2009-11-30 15:43           ` Joseph S. Myers
  0 siblings, 2 replies; 29+ messages in thread
From: Paul Brook @ 2009-11-30 14:33 UTC (permalink / raw)
  To: Richard Earnshaw
  Cc: Joseph S. Myers, Julian Brown, Ira Rosen, gcc-patches, eres

On Monday 30 November 2009, Richard Earnshaw wrote:
> On Mon, 2009-11-30 at 14:01 +0000, Joseph S. Myers wrote:
> > On Mon, 30 Nov 2009, Julian Brown wrote:
> > > I've also drafted a bit of hopefully-explanatory text in md.texi about
> > > movmisalign<mode> and element ordering.
> >
> > Texinfo markup fixes: ``opaque'', @code{TARGET_VECTOR_ALWAYS_MISALIGN}.
> > The two new target hooks TARGET_VECTOR_ALWAYS_MISALIGN and
> > TARGET_VECTOR_MIN_ALIGNMENT also need documentation in tm.texi.
> 
> I can't say I like this particular way of describing Neon.  It seems to
> me that what's really needed is for the vectorizer to have an equivalent
> concept for vector elements to the TARGET_WORDS_BIG_ENDIAN -- namely
> TARGET_VECT_ELEMENTS_BIG_ENDIAN -- on Neon this would always be false.

Are you saying you think the vectorizer should be using vldr, not vld1?
Or that you don't like this particular way of distinguishing between array 
loads and vector copies?
Or that we also need to fix whichever bits of the autovectorizer that know 
about vector layout and remove the BYTES_BIG_ENDIAN hack in 
arm_vector_always_misalign?

Paul


* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2009-11-30 14:33         ` Paul Brook
@ 2009-11-30 15:06           ` Richard Earnshaw
  2009-12-18 18:09             ` Julian Brown
  2009-11-30 15:43           ` Joseph S. Myers
  1 sibling, 1 reply; 29+ messages in thread
From: Richard Earnshaw @ 2009-11-30 15:06 UTC (permalink / raw)
  To: Paul Brook; +Cc: Joseph S. Myers, Julian Brown, Ira Rosen, gcc-patches, eres


On Mon, 2009-11-30 at 14:28 +0000, Paul Brook wrote:
> On Monday 30 November 2009, Richard Earnshaw wrote:
> > On Mon, 2009-11-30 at 14:01 +0000, Joseph S. Myers wrote:
> > > On Mon, 30 Nov 2009, Julian Brown wrote:
> > > > I've also drafted a bit of hopefully-explanatory text in md.texi about
> > > > movmisalign<mode> and element ordering.
> > >
> > > Texinfo markup fixes: ``opaque'', @code{TARGET_VECTOR_ALWAYS_MISALIGN}.
> > > The two new target hooks TARGET_VECTOR_ALWAYS_MISALIGN and
> > > TARGET_VECTOR_MIN_ALIGNMENT also need documentation in tm.texi.
> > 
> > I can't say I like this particular way of describing Neon.  It seems to
> > me that what's really needed is for the vectorizer to have an equivalent
> > concept for vector elements to the TARGET_WORDS_BIG_ENDIAN -- namely
> > TARGET_VECT_ELEMENTS_BIG_ENDIAN -- on Neon this would always be false.
> 
> Are you saying you think the vectorizer should be using vldr, not vld1?
> Or that you don't like this particular way of distinguishing between array 
> loads and vector copies?
> Or that we also need to fix whichever bits of the autovectorizer that know 
> about vector layout and remove the BYTES_BIG_ENDIAN hack in 
> arm_vector_always_misalign?

I certainly think we shouldn't be hiding knowledge about the element to
vector-lane mapping from the vectorizer -- and that the vectorizer
should understand that vector copies are not necessarily the same as
vectorizing loads.  Anything else and we will ultimately have parts of
the compiler fighting against each other and that way lies subtle bugs.

R.


* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2009-11-30 14:33         ` Paul Brook
  2009-11-30 15:06           ` Richard Earnshaw
@ 2009-11-30 15:43           ` Joseph S. Myers
  1 sibling, 0 replies; 29+ messages in thread
From: Joseph S. Myers @ 2009-11-30 15:43 UTC (permalink / raw)
  To: Paul Brook; +Cc: Richard Earnshaw, Julian Brown, Ira Rosen, gcc-patches, eres

On Mon, 30 Nov 2009, Paul Brook wrote:

> Are you saying you think the vectorizer should be using vldr, not vld1?
> Or that you don't like this particular way of distinguishing between array 
> loads and vector copies?
> Or that we also need to fix whichever bits of the autovectorizer that know 
> about vector layout and remove the BYTES_BIG_ENDIAN hack in 
> arm_vector_always_misalign?

For misaligned support to work for big-endian, I believe the vectorizer 
needs to know exactly what the effects of the misaligned loads are.  The 
support is presently disabled for big-endian (despite the always_misalign 
code) because, for example, the vectorizer expected to be able to do an 
operation one operand of which was the result of a misaligned load and the 
other operand of which was a constant, without knowing that the constant 
elements needed to be permuted in the same way they would have been by a 
misaligned load from memory.

GCC has a very strongly embedded assumption that the combination of 
machine mode and register number or memory address defines exactly how a 
value of that type is stored in registers or in memory.  GCC defines 
vector element numbering in GENERIC, GIMPLE and RTL in such a way that the 
memory ordering for a vector mode is array ordering.

The ARM backend can in turn define what ordering it likes for vector 
values in core registers and in NEON registers, as long as all moves 
between any combination of memory, core registers and NEON registers are 
consistent with the definition (remembering that the machine-independent 
compiler might sometimes try to synthesise moves out of moves of smaller 
pieces); there are various target hooks or macros to control what SUBREG 
expressions are allowed with what interpretation, if orderings are used 
that would make some SUBREGs behave in unexpected ways.  The backend uses 
the definition that vldr/vldm order is used, which also works conveniently 
for ldm/stm to/from core registers and transfers between core and NEON 
registers.

Given that backend definition, when the vectorizer uses vld1/vst1 for big 
endian the vectorizer needs to understand that the resulting value in the 
register is not the same value, interpreted in the normal way for mode 
V4HI (say), as the value in the array in memory, but a permutation of that 
value.  If it knows what the permutation is, then it can also know when it 
is correct to carry out a vector operation between that value and another 
value: the other value must be permuted in the same way.  Likewise, if 
using vst1 the value being stored must be permuted appropriately.  This 
should not in general result in explicit permutations at runtime; it 
should generally involve consistently using vld1/vst1 for all operands, 
and permuting constants appropriately.
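
To make the consistency requirement concrete, here is a small standalone
sketch (plain C, not GCC code). The lane permutation is modelled as a simple
element reversal, which is only a stand-in chosen for illustration; the
function names are likewise made up. It shows that an element-wise operation
gives the array-order result only when the constant operand is permuted the
same way as the loaded operand:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NELTS 4

/* Stand-in lane permutation (an assumption for illustration only):
   reverse the element order.  Reversal is its own inverse.  */
static void
permute (const int16_t in[NELTS], int16_t out[NELTS])
{
  for (int i = 0; i < NELTS; i++)
    out[i] = in[NELTS - 1 - i];
}

/* Element-wise add on two "register" values.  */
static void
vec_add (const int16_t a[NELTS], const int16_t b[NELTS], int16_t r[NELTS])
{
  for (int i = 0; i < NELTS; i++)
    r[i] = (int16_t) (a[i] + b[i]);
}

/* Load MEM with a permuting load, add the constant CST, and store the
   sum with the matching permuting store.  PERMUTE_CONST selects whether
   the constant is permuted like the loaded value (nonzero) or used in
   plain array order (zero).  */
static void
add_const_via_permuting_access (const int16_t mem[NELTS],
                                const int16_t cst[NELTS],
                                int permute_const,
                                int16_t result[NELTS])
{
  int16_t reg[NELTS], creg[NELTS], sum[NELTS];

  permute (mem, reg);                 /* permuting load */
  if (permute_const)
    permute (cst, creg);              /* constant permuted to match */
  else
    memcpy (creg, cst, sizeof creg);  /* constant left in array order */
  vec_add (reg, creg, sum);
  permute (sum, result);              /* permuting store (inverse) */
}
```

With the constant permuted, the stored array equals the element-wise sum in
array order and no runtime permutation of the data is needed; with the
constant left in array order, the elements are paired up wrongly.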

Misaligned support for little-endian is of course much simpler to make 
work, and still useful; I think most NEON hardware is little-endian.

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2009-11-30 13:53   ` Julian Brown
  2009-11-30 14:03     ` Joseph S. Myers
@ 2009-12-01  8:39     ` Ira Rosen
  1 sibling, 0 replies; 29+ messages in thread
From: Ira Rosen @ 2009-12-01  8:39 UTC (permalink / raw)
  To: Julian Brown; +Cc: gcc-patches, paul, rearnsha, Revital1 Eres



Julian Brown <julian@codesourcery.com> wrote on 30/11/2009 15:48:13:


> I updated my sources (a week or two ago), and unfortunately the number
> of newly-FAILing tests is now greater. I've examined a cross-section of
> these failures, and some of them I think are due to testsuite changes
> in the following patch:
>
>   http://gcc.gnu.org/ml/gcc-patches/2009-10/msg01604.html
>
> For example the vect-26.c compilation test now has:
>
> /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
>
> But for ARM NEON, there is a single instance of this message in the
> relevant dump file. The previous version of the line looks more correct
> to me:
>
> /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail {! vect_hw_misalign} } } } */
>
> This seems to describe the desired behaviour (at least for NEON) more
> accurately to me (but I probably misunderstood something about the
> above-linked patch).

In vect-26.c there is one misaligned store that is handled with loop
peeling. After peeling, the store is aligned and the vectorizer will not
print "Vectorizing an unaligned access" (there is only one data access in
the loop and its alignment is forced using peeling).

On NEON the alignment support scheme for the (now aligned) store is
dr_unaligned_forced, so the code

      if (supportable_dr_alignment != dr_aligned
          && vect_print_dump_info (REPORT_ALIGNMENT))
        fprintf (vect_dump, "Vectorizing an unaligned access.");

prints the message anyway.

If you do want this message to be printed, the test will need another
target keyword. Otherwise, just add
"supportable_dr_alignment != dr_unaligned_forced" to the condition.
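
A rough standalone sketch of that second option follows. The enum is copied
from the tree-vectorizer.h hunk of the patch; the helper name
report_unaligned_access_p is made up for illustration and is not an existing
GCC function, and the dump_enabled flag stands in for the
vect_print_dump_info (REPORT_ALIGNMENT) check:

```c
#include <assert.h>
#include <stdbool.h>

/* Copied from the tree-vectorizer.h hunk in the patch.  */
enum dr_alignment_support {
  dr_unaligned_unsupported,
  dr_unaligned_supported,
  dr_unaligned_forced,
  dr_explicit_realign,
  dr_explicit_realign_optimized,
  dr_aligned
};

/* Hypothetical helper: return true if "Vectorizing an unaligned access."
   should be printed.  An access that is really aligned but is forced to
   use the misaligned variant (dr_unaligned_forced) is not reported.  */
static bool
report_unaligned_access_p (enum dr_alignment_support scheme,
                           bool dump_enabled)
{
  return scheme != dr_aligned
         && scheme != dr_unaligned_forced
         && dump_enabled;
}
```

With this condition vect-26.c would see zero instances of the message on
NEON as well, since the peeled store is reported as neither aligned nor
genuinely unaligned.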


>
> Other failures are due to things like vectorizing *more* loops than
> expected in several tests, and (as written before) missing parts in the
> NEON support. I don't think there's anything which indicates actual
> breakage.
>
> > > --- a/gcc/tree-vect-data-refs.c
> > > +++ b/gcc/tree-vect-data-refs.c
> > > @@ -796,7 +796,8 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
> > >    /* At this point we assume that the base is aligned.  */
> > >    gcc_assert (base_aligned
> > >           || (TREE_CODE (base) == VAR_DECL
> > > -           && DECL_ALIGN (base) >= TYPE_ALIGN (vectype)));
> > > +           && (DECL_ALIGN (base)
> > > +               >= targetm.vectorize.vector_min_alignment (vectype))));
> >
> > Looks like you forgot to multiply by BITS_PER_UNIT here.
>
> Fixed.

Not in the attached patch...

>
> > >    * tree-vect-stmts.c (vectorizable_store): Honour
> > >     targetm.vectorize.always_misalign.
> > >     (vectorizable_load): Ditto.
> >
> > I would prefer to have all the alignment queries in
> > vect_supportable_dr_alignment(). Maybe you could add another
> > enumeration value to enum dr_alignment_support?
>
> Does something like the attached look right?

Yes.

Thanks,
Ira

>
> I've also drafted a bit of hopefully-explanatory text in md.texi about
> movmisalign<mode> and element ordering.
>
> Thanks,
>
> Julian
>
> ChangeLog
>
>     Julian Brown  <julian@codesourcery.com>
>     Paul Brook  <paul@codesourcery.com>
>     Daniel Jacobowitz  <dan@codesourcery.com>
>     Joseph Myers  <joseph@codesourcery.com>
>
>     gcc/
>     * expr.c (expand_assignment): Handle MISALIGNED_INDIRECT_REF as a
>     destination.
>     (expand_expr_real_1): Handle writes to MISALIGNED_INDIRECT_REF.
>     * target-def.h (TARGET_VECTOR_MIN_ALIGNMENT)
>     (TARGET_VECTOR_ALWAYS_MISALIGN): Define.
>     (TARGET_VECTORIZE): Use them.
>     * target.h (gcc_target): Add vectorize.vector_min_alignment and
>     vectorize.always_misalign.
>     * targhooks.c (default_vector_min_alignment): New function.
>     * targhooks.h (default_vector_min_alignment): Add prototype.
>     * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Use
>     targetm.vectorize.vector_min_alignment.
>     (vect_supportable_dr_alignment): Support forced misalignment for
>     aligned accesses.
>     * tree-vect-loop-manip.c (target.h): Include.
>     (vect_gen_niters_for_prolog_loop): Use
>     targetm.vectorize.vector_min_alignment.
>     (vect_model_load_cost, vectorizable_store, vectorizable_load): Support
>     dr_unaligned_forced.
>     * tree-vect-stmts.c (vectorizable_store): Honour
>     targetm.vectorize.always_misalign.
>     (vectorizable_load): Ditto.
>     * tree-vectorizer.h (dr_alignment_support): Add dr_unaligned_forced.
>     * config/arm/arm.c (arm_vector_min_alignment)
>     (arm_vector_always_misalign): New functions.
>     (TARGET_VECTOR_MIN_ALIGNMENT, TARGET_VECTOR_ALWAYS_MISALIGN):
>     Define macros, using above.
>     (neon_vector_mem_operand): Disallow PRE_DEC for array loads.
>     (arm_print_operand): Include alignment qualifier in %A.
>     * config/arm/neon.md (UNSPEC_MISALIGNED_ACCESS): New constant.
>     (movmisalign<mode>): New expander.
>     (movmisalign<mode>_neon_store, movmisalign<mode>_neon_load): New
>     insn patterns.
>
>     gcc/doc/
>     * md.texi (movmisalign): Add section about element ordering.
>
>     gcc/testsuite/
>     * lib/target-supports.exp
>     (check_effective_target_arm_vect_no_misalign): New function.
>     (check_effective_target_vect_no_align): Use above.
>     (check_effective_target_vect_element_align): New function.
>     * gcc.dg/vect/no-section-anchors-vect-31.c: Use vect_element_align
>     (instead of vect_hw_misalign where appropriate).
>     * gcc.dg/vect/no-section-anchors-vect-64.c: Ditto.
>     * gcc.dg/vect/no-section-anchors-vect-66.c: Ditto.
>     * gcc.dg/vect/no-section-anchors-vect-68.c: Ditto.
>     * gcc.dg/vect/no-section-anchors-vect-69.c: Ditto.
>     * gcc.dg/vect/no-scebccp-outer-8.c: Ditto.
>     * gcc.dg/vect/pr25413.c: Ditto.
>     * gcc.dg/vect/section-anchors-vect-69.c: Ditto.
>     * gcc.dg/vect/slp-25.c: Ditto.
>     * gcc.dg/vect/vect-109.c: Ditto.
>     * gcc.dg/vect/vect-26.c: Ditto.
>     * gcc.dg/vect/vect-27.c: Ditto.
>     * gcc.dg/vect/vect-28.c: Ditto.
>     * gcc.dg/vect/vect-29.c: Ditto.
>     * gcc.dg/vect/vect-33.c: Ditto.
>     * gcc.dg/vect/vect-42.c: Ditto.
>     * gcc.dg/vect/vect-44.c: Ditto.
>     * gcc.dg/vect/vect-48.c: Ditto.
>     * gcc.dg/vect/vect-50.c: Ditto.
>     * gcc.dg/vect/vect-52.c: Ditto.
>     * gcc.dg/vect/vect-54.c: Ditto.
>     * gcc.dg/vect/vect-56.c: Ditto.
>     * gcc.dg/vect/vect-58.c: Ditto.
>     * gcc.dg/vect/vect-60.c: Ditto.
>     * gcc.dg/vect/vect-70.c: Ditto.
>     * gcc.dg/vect/vect-72.c: Ditto.
>     * gcc.dg/vect/vect-75.c: Ditto.
>     * gcc.dg/vect/vect-87.c: Ditto.
>     * gcc.dg/vect/vect-88.c: Ditto.
>     * gcc.dg/vect/vect-89.c: Ditto.
>     * gcc.dg/vect/vect-91.c: Ditto.
>     * gcc.dg/vect/vect-92.c: Ditto.
>     * gcc.dg/vect/vect-93.c: Ditto.
>     * gcc.dg/vect/vect-95.c: Ditto.
>     * gcc.dg/vect/vect-96.c: Ditto.
>     * gcc.dg/vect/vect-align-1.c: Ditto.
>     * gcc.dg/vect/vect-align-2.c: Ditto.
>     * gcc.dg/vect/vect-multitypes-1.c: Ditto.
>     * gcc.dg/vect/vect-multitypes-3.c: Ditto.
>     * gcc.dg/vect/vect-multitypes-4.c: Ditto.
>     * gcc.dg/vect/vect-multitypes-6.c: Ditto.
>     * gfortran.dg/vect/vect-2.f90: Ditto.
>     * gfortran.dg/vect/vect-3.f90: Ditto.
>     * gfortran.dg/vect/vect-4.f90: Ditto.
>     * gfortran.dg/vect/vect-5.f90: Ditto.
> [attachment "misaligned-neon-fsf-5.diff" deleted by Ira Rosen/Haifa/IBM]


* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2009-11-30 15:06           ` Richard Earnshaw
@ 2009-12-18 18:09             ` Julian Brown
  2009-12-21  8:44               ` Ira Rosen
  2009-12-21 15:35               ` Paul Brook
  0 siblings, 2 replies; 29+ messages in thread
From: Julian Brown @ 2009-12-18 18:09 UTC (permalink / raw)
  To: Richard Earnshaw
  Cc: Paul Brook, Joseph S. Myers, Ira Rosen, gcc-patches, eres

[-- Attachment #1: Type: text/plain, Size: 5095 bytes --]

On Mon, 30 Nov 2009 15:02:48 +0000
Richard Earnshaw <rearnsha@arm.com> wrote:

> I certainly think we shouldn't be hiding knowledge about the element
> to vector-lane mapping from the vectorizer -- and that the vectorizer
> should understand that vector copies are not necessarily the same as
> vectorizing loads.  Anything else and we will ultimately have parts of
> the compiler fighting against each other and that way lies subtle
> bugs.

This is a version of the patch which doesn't attempt to resolve the
discrepancy between vector copies and vectorizing loads/stores (thus is
only intended to work in little-endian mode, leaving big-endian mode as
an open problem). So, vldr/vstr etc. will still be used for aligned
accesses, and any issues with adding semantics to movmisalign<mode> are
sidestepped.

As with previous versions of the patch, there are several new failures
in the vector testsuite.

Ira Rosen <IRAR@il.ibm.com> wrote:
> > > > --- a/gcc/tree-vect-data-refs.c
> > > > +++ b/gcc/tree-vect-data-refs.c
> > > > @@ -796,7 +796,8 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
> > > >    /* At this point we assume that the base is aligned.  */
> > > >    gcc_assert (base_aligned
> > > >           || (TREE_CODE (base) == VAR_DECL
> > > > -           && DECL_ALIGN (base) >= TYPE_ALIGN (vectype)));
> > > > +           && (DECL_ALIGN (base)
> > > > +               >= targetm.vectorize.vector_min_alignment (vectype))));
> > >
> > > Looks like you forgot to multiply by BITS_PER_UNIT here.  
> >
> > Fixed.  
> 
> Not in the attached patch...

(I changed DECL_ALIGN to DECL_ALIGN_UNIT, so both copies of the
inequality were in units, rather than explicitly multiplying by
BITS_PER_UNIT. I should probably have mentioned that...)
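
To spell out the units issue, here is a small standalone illustration (plain
C, not GCC code). The function names are made up, and BITS_PER_UNIT is
assumed to be 8 here. DECL_ALIGN is measured in bits, whereas the new hook
returns bytes, so the comparison has to convert one side:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption for this sketch: 8-bit addressable units.  */
#define BITS_PER_UNIT 8

/* Hypothetical stand-in for the fixed check: convert the declaration's
   alignment from bits to units (what DECL_ALIGN_UNIT does) before
   comparing against a minimum alignment given in units.  */
static bool
base_sufficiently_aligned_p (unsigned decl_align_bits,
                             unsigned min_align_units)
{
  unsigned decl_align_units = decl_align_bits / BITS_PER_UNIT;
  return decl_align_units >= min_align_units;
}

/* The earlier, mixed-units form: bits compared directly against units.
   This accepts declarations that are not aligned enough.  */
static bool
mixed_units_check (unsigned decl_align_bits, unsigned min_align_units)
{
  return decl_align_bits >= min_align_units;
}
```

For a declaration aligned to 16 bits (2 bytes) and a minimum alignment of 4
bytes, the mixed-units form wrongly passes (16 >= 4), while the converted
form correctly fails (2 < 4).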

OK to apply?

Julian

ChangeLog

    Julian Brown  <julian@codesourcery.com>
    Paul Brook  <paul@codesourcery.com>
    Daniel Jacobowitz  <dan@codesourcery.com>
    Joseph Myers  <joseph@codesourcery.com>

    gcc/
    * target-def.h (TARGET_VECTOR_MIN_ALIGNMENT): Define.
    (TARGET_VECTORIZE): Use above.
    * target.h (gcc_target): Add vectorize.vector_min_alignment.
    * targhooks.c (default_vector_min_alignment): New function.
    * targhooks.h (default_vector_min_alignment): Add prototype.
    * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Use
    targetm.vectorize.vector_min_alignment.
    * tree-vect-loop-manip.c (target.h): Include.
    (vect_gen_niters_for_prolog_loop): Use targetm.vectorize.vector_min_alignment.
    * config/arm/arm.c (arm_vector_min_alignment): New function.
    (TARGET_VECTOR_MIN_ALIGNMENT): Define macro, using above.
    (neon_vector_mem_operand): Disallow PRE_DEC for array loads.
    (arm_print_operand): Include alignment qualifier in %A.
    * config/arm/neon.md (UNSPEC_MISALIGNED_ACCESS): New constant.
    (movmisalign<mode>): New expander.
    (movmisalign<mode>_neon_store, movmisalign<mode>_neon_load): New
    insn patterns.

    gcc/testsuite/
    * lib/target-supports.exp
    (check_effective_target_arm_vect_no_misalign): New function.
    (check_effective_target_vect_no_align): Use above.
    (check_effective_target_vect_element_align): New function.
    * gcc.dg/vect/no-section-anchors-vect-31.c: Use vect_element_align (instead of
    vect_hw_misalign where appropriate).
    * gcc.dg/vect/no-section-anchors-vect-64.c: Ditto.
    * gcc.dg/vect/no-section-anchors-vect-66.c: Ditto.
    * gcc.dg/vect/no-section-anchors-vect-68.c: Ditto.
    * gcc.dg/vect/no-section-anchors-vect-69.c: Ditto.
    * gcc.dg/vect/no-scebccp-outer-8.c: Ditto.
    * gcc.dg/vect/pr25413.c: Ditto.
    * gcc.dg/vect/section-anchors-vect-69.c: Ditto.
    * gcc.dg/vect/slp-25.c: Ditto.
    * gcc.dg/vect/vect-109.c: Ditto.
    * gcc.dg/vect/vect-26.c: Ditto.
    * gcc.dg/vect/vect-27.c: Ditto.
    * gcc.dg/vect/vect-28.c: Ditto.
    * gcc.dg/vect/vect-29.c: Ditto.
    * gcc.dg/vect/vect-33.c: Ditto.
    * gcc.dg/vect/vect-42.c: Ditto.
    * gcc.dg/vect/vect-44.c: Ditto.
    * gcc.dg/vect/vect-48.c: Ditto.
    * gcc.dg/vect/vect-50.c: Ditto.
    * gcc.dg/vect/vect-52.c: Ditto.
    * gcc.dg/vect/vect-54.c: Ditto.
    * gcc.dg/vect/vect-56.c: Ditto.
    * gcc.dg/vect/vect-58.c: Ditto.
    * gcc.dg/vect/vect-60.c: Ditto.
    * gcc.dg/vect/vect-70.c: Ditto.
    * gcc.dg/vect/vect-72.c: Ditto.
    * gcc.dg/vect/vect-75.c: Ditto.
    * gcc.dg/vect/vect-87.c: Ditto.
    * gcc.dg/vect/vect-88.c: Ditto.
    * gcc.dg/vect/vect-89.c: Ditto.
    * gcc.dg/vect/vect-91.c: Ditto.
    * gcc.dg/vect/vect-92.c: Ditto.
    * gcc.dg/vect/vect-93.c: Ditto.
    * gcc.dg/vect/vect-95.c: Ditto.
    * gcc.dg/vect/vect-96.c: Ditto.
    * gcc.dg/vect/vect-align-1.c: Ditto.
    * gcc.dg/vect/vect-align-2.c: Ditto.
    * gcc.dg/vect/vect-multitypes-1.c: Ditto.
    * gcc.dg/vect/vect-multitypes-3.c: Ditto.
    * gcc.dg/vect/vect-multitypes-4.c: Ditto.
    * gcc.dg/vect/vect-multitypes-6.c: Ditto.
    * gfortran.dg/vect/vect-2.f90: Ditto.
    * gfortran.dg/vect/vect-3.f90: Ditto.
    * gfortran.dg/vect/vect-4.f90: Ditto.
    * gfortran.dg/vect/vect-5.f90: Ditto.

[-- Attachment #2: misaligned-neon-fsf-7.diff --]
[-- Type: text/x-patch, Size: 59001 bytes --]

Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c	(revision 155347)
+++ gcc/targhooks.c	(working copy)
@@ -932,6 +932,12 @@ default_addr_space_convert (rtx op ATTRI
   gcc_unreachable ();
 }
 
+int
+default_vector_min_alignment (const_tree type)
+{
+  return TYPE_ALIGN_UNIT (type);
+}
+
 bool
 default_hard_regno_scratch_ok (unsigned int regno ATTRIBUTE_UNUSED)
 {
Index: gcc/targhooks.h
===================================================================
--- gcc/targhooks.h	(revision 155347)
+++ gcc/targhooks.h	(working copy)
@@ -82,6 +82,8 @@ default_builtin_support_vector_misalignm
 					     const_tree,
 					     int, bool);
 
+extern int default_vector_min_alignment (const_tree);
+
 /* These are here, and not in hooks.[ch], because not all users of
    hooks.h include tm.h, and thus we don't have CUMULATIVE_ARGS.  */
 
Index: gcc/target.h
===================================================================
--- gcc/target.h	(revision 155347)
+++ gcc/target.h	(working copy)
@@ -499,6 +499,12 @@ struct gcc_target
        is true if the access is defined in a packed struct.  */
     bool (* builtin_support_vector_misalignment) (enum machine_mode,
                                                   const_tree, int, bool);
+
+    /* Return the minimum alignment required to load or store a
+       vector of the given type, which may be less than the
+       natural alignment of the type.  */
+    int (* vector_min_alignment) (const_tree);
+
   } vectorize;
 
   /* The initial value of target_flags.  */
Index: gcc/tree-vect-loop-manip.c
===================================================================
--- gcc/tree-vect-loop-manip.c	(revision 155347)
+++ gcc/tree-vect-loop-manip.c	(working copy)
@@ -37,6 +37,7 @@ along with GCC; see the file COPYING3.  
 #include "tree-scalar-evolution.h"
 #include "tree-vectorizer.h"
 #include "langhooks.h"
+#include "target.h"
 
 /*************************************************************************
   Simple Loop Peeling Utilities
@@ -1835,7 +1836,7 @@ vect_gen_niters_for_prolog_loop (loop_ve
   gimple dr_stmt = DR_STMT (dr);
   stmt_vec_info stmt_info = vinfo_for_stmt (dr_stmt);
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
-  int vectype_align = TYPE_ALIGN (vectype) / BITS_PER_UNIT;
+  int vectype_align = targetm.vectorize.vector_min_alignment (vectype);
   tree niters_type = TREE_TYPE (loop_niters);
   int step = 1;
   int element_size = GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (DR_REF (dr))));
Index: gcc/testsuite/gcc.dg/vect/vect-50.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-50.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-50.c	(working copy)
@@ -61,9 +61,9 @@ int main (void)
    align the store will not force the two loads to be aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }  */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } }  */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && { {! vect_no_align } && {! vect_hw_misalign } } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && { {! vect_no_align } && {! vect_element_align } } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-33.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-33.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-33.c	(working copy)
@@ -39,6 +39,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target vector_alignment_reachable } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */ 
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vector_alignment_reachable && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */ 
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c	(working copy)
@@ -114,7 +114,7 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vector_alignment_reachable} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_element_align || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { { ! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-42.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-42.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-42.c	(working copy)
@@ -64,7 +64,7 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_hw_misalign } } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { { vect_no_align || vect_element_align } || { ! vector_alignment_reachable } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || { ! vector_alignment_reachable } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c	(working copy)
@@ -61,6 +61,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { sparc*-*-* && ilp32 } }} } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 6 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 6 "vect" {xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 6 "vect" {xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-60.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-60.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-60.c	(working copy)
@@ -69,6 +69,6 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-26.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-26.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-26.c	(working copy)
@@ -37,5 +37,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-52.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-52.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-52.c	(working copy)
@@ -55,7 +55,7 @@ int main (void)
    (The store is aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 2 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-align-2.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-align-2.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-align-2.c	(working copy)
@@ -43,6 +43,6 @@ int main (void)
 
 
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { xfail vect_hw_misalign} } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { xfail vect_hw_misalign } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-44.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-44.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-44.c	(working copy)
@@ -65,8 +65,8 @@ int main (void)
    two loads to be aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {{! vect_no_align} && {! vect_hw_misalign} } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {{! vect_no_align} && {! vect_element_align} } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-27.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-27.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-27.c	(working copy)
@@ -45,6 +45,6 @@ int main (void)
 /* The initialization induction loop (with aligned access) is also vectorized.  */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { xfail vect_no_align } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-70.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-70.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-70.c	(working copy)
@@ -64,6 +64,6 @@ int main (void)
           
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target {{! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target {{! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-28.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-28.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-28.c	(working copy)
@@ -40,6 +40,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vector_alignment_reachable && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c	(working copy)
@@ -88,5 +88,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-109.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-109.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-109.c	(working copy)
@@ -73,7 +73,7 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 2 "vect" } } */
-/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 10 "vect" { target vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 10 "vect" { target vect_element_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-54.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-54.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-54.c	(working copy)
@@ -60,5 +60,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-29.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-29.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-29.c	(working copy)
@@ -50,7 +50,7 @@ int main (void)
 
 /* The initialization induction loop (with aligned access) is also vectorized.  */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" {target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-72.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-72.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-72.c	(working copy)
@@ -46,6 +46,6 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-56.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-56.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-56.c	(working copy)
@@ -68,6 +68,6 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-48.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-48.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-48.c	(working copy)
@@ -54,7 +54,7 @@ int main (void)
    (The store is aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 2 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-91.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-91.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-91.c	(working copy)
@@ -59,6 +59,6 @@ main3 ()
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" { xfail vect_no_int_add } } } */
 /* { dg-final { scan-tree-dump-times "accesses have the same alignment." 3 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" {target { vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" {target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-92.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-92.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-92.c	(working copy)
@@ -92,5 +92,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-75.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-75.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-75.c	(working copy)
@@ -45,5 +45,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-58.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-58.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-58.c	(working copy)
@@ -59,5 +59,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/slp-25.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-25.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/slp-25.c	(working copy)
@@ -56,5 +56,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-93.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-93.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-93.c	(working copy)
@@ -72,7 +72,7 @@ int main (void)
 /* main && main1 together: */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 2 "vect" { target powerpc*-*-* i?86-*-* x86_64-*-* } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { vect_no_align && {! vector_alignment_reachable} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
 
 /* in main1: */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target !powerpc*-*-* !i?86-*-* !x86_64-*-* } } } */
@@ -80,6 +80,6 @@ int main (void)
 
 /* in main: */
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c	(working copy)
@@ -46,5 +46,5 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { ! { vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { ! { vect_element_align } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c	(working copy)
@@ -115,6 +115,6 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* Alignment forced using versioning until the pass that increases alignment
   is extended to handle structs.  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 4 "vect" { target {vect_int && vector_alignment_reachable } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 4 "vect" { target { {vect_int && vector_alignment_reachable } && { ! vect_element_align } } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 4 "vect" { target {vect_int && {! vector_alignment_reachable} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-95.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-95.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-95.c	(working copy)
@@ -56,14 +56,14 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail {vect_hw_misalign} } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail {vect_element_align} } } } */
 
 /* For targets that support unaligned loads we version for the two unaligned 
    stores and generate misaligned accesses for the loads. For targets that 
    don't support unaligned loads we version for all four accesses.  */
 
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign} } } }  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } }  */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /*  { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target vect_no_align } } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 4 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c	(working copy)
@@ -84,5 +84,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-87.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-87.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-87.c	(working copy)
@@ -51,6 +51,6 @@ int main (void)
 /* Fails for targets that don't vectorize PLUS (e.g alpha).  */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target vector_alignment_reachable} } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-96.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-96.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-96.c	(working copy)
@@ -43,7 +43,7 @@ int main (void)
    For targets that don't support unaligned loads, version for the store.  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { {! vect_no_align} && vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { {! { vect_no_align || vect_element_align } } && vector_alignment_reachable } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} && {! vect_element_align} } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c	(working copy)
@@ -78,11 +78,11 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_element_align} } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-88.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-88.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-88.c	(working copy)
@@ -51,6 +51,6 @@ int main (void)
 /* Fails for targets that don't vectorize PLUS (e.g alpha).  */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target vector_alignment_reachable } } }  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } }  */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c	(working copy)
@@ -79,5 +79,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-89.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-89.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-89.c	(working copy)
@@ -46,5 +46,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c	(working copy)
@@ -54,6 +54,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" {xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" {xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc/testsuite/gcc.dg/vect/pr25413.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/pr25413.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/pr25413.c	(working copy)
@@ -33,7 +33,7 @@ int main (void)
 } 
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vector_alignment_reachable_for_64bit } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c	(working copy)
@@ -88,5 +88,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c	(revision 155347)
+++ gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c	(working copy)
@@ -85,11 +85,11 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_hw_misalign}  } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_element_align} } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_element_align}  } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 8 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	(revision 155347)
+++ gcc/testsuite/lib/target-supports.exp	(working copy)
@@ -1514,6 +1514,18 @@ proc check_effective_target_arm32 { } {
     }]
 }
 
+# Return 1 if this is an ARM target that only supports aligned vector accesses
+proc check_effective_target_arm_vect_no_misalign { } {
+    return [check_no_compiler_messages arm_vect_no_misalign assembly {
+	#if !defined(__arm__) \
+	    || (defined(__ARMEL__) \
+	        && (!defined(__thumb__) || defined(__thumb2__)))
+	#error FOO
+	#endif
+    }]
+}
+
+
 # Return 1 if this is an ARM target supporting -mfpu=vfp
 # -mfloat-abi=softfp.  Some multilibs may be incompatible with these
 # options.
@@ -2331,7 +2343,7 @@ proc check_effective_target_vect_no_alig
 	if { [istarget mipsisa64*-*-*]
 	     || [istarget sparc*-*-*]
 	     || [istarget ia64-*-*]
-	     || [check_effective_target_arm32] } { 
+	     || [check_effective_target_arm_vect_no_misalign] } { 
 	    set et_vect_no_align_saved 1
 	}
     }
@@ -2466,6 +2478,25 @@ proc check_effective_target_vector_align
     return $et_vector_alignment_reachable_for_64bit_saved
 }
 
+# Return 1 if the target only requires element alignment for vector accesses
+
+proc check_effective_target_vect_element_align { } {
+    global et_vect_element_align
+
+    if [info exists et_vect_element_align] {
+	verbose "check_effective_target_vect_element_align: using cached result" 2
+    } else {
+	set et_vect_element_align 0
+	if { [istarget arm*-*-*]
+	     || [check_effective_target_vect_hw_misalign] } {
+	   set et_vect_element_align 1
+	}
+    }
+
+    verbose "check_effective_target_vect_element_align: returning $et_vect_element_align" 2
+    return $et_vect_element_align
+}
+
 # Return 1 if the target supports vector conditional operations, 0 otherwise.
 
 proc check_effective_target_vect_condition { } {
Index: gcc/testsuite/gfortran.dg/vect/vect-2.f90
===================================================================
--- gcc/testsuite/gfortran.dg/vect/vect-2.f90	(revision 155347)
+++ gcc/testsuite/gfortran.dg/vect/vect-2.f90	(working copy)
@@ -18,5 +18,5 @@ END
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } }
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { vect_no_align && { ! vector_alignment_reachable } } } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }
-! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" {target { vect_no_align || { { ! vector_alignment_reachable  } && { ! vect_hw_misalign } } } } } } 
+! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" {target { vect_no_align || { { ! vector_alignment_reachable  } && { ! vect_element_align } } } } } } 
 ! { dg-final { cleanup-tree-dump "vect" } }
Index: gcc/testsuite/gfortran.dg/vect/vect-3.f90
===================================================================
--- gcc/testsuite/gfortran.dg/vect/vect-3.f90	(revision 155347)
+++ gcc/testsuite/gfortran.dg/vect/vect-3.f90	(working copy)
@@ -7,8 +7,8 @@ Y = Y + A * X
 END
 
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } }
-! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } }
-! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } }
+! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_element_align} } } } } }
+! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_element_align} } } } } }
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable}} } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { { vect_no_align } || { ! vector_alignment_reachable} } } } }
 
Index: gcc/testsuite/gfortran.dg/vect/vect-4.f90
===================================================================
--- gcc/testsuite/gfortran.dg/vect/vect-4.f90	(revision 155347)
+++ gcc/testsuite/gfortran.dg/vect/vect-4.f90	(working copy)
@@ -12,6 +12,6 @@ END
 ! { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } 
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } }
-! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } }
+! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } }
 ! { dg-final { scan-tree-dump-times "accesses have the same alignment." 1 "vect" } }
 ! { dg-final { cleanup-tree-dump "vect" } }
Index: gcc/testsuite/gfortran.dg/vect/vect-5.f90
===================================================================
--- gcc/testsuite/gfortran.dg/vect/vect-5.f90	(revision 155347)
+++ gcc/testsuite/gfortran.dg/vect/vect-5.f90	(working copy)
@@ -39,5 +39,5 @@
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } }
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 2 "vect" { target { vect_no_align } } } }
-! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } }
+! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } }
 ! { dg-final { cleanup-tree-dump "vect" } }
Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c	(revision 155347)
+++ gcc/tree-vect-data-refs.c	(working copy)
@@ -745,7 +745,7 @@ vect_compute_data_ref_alignment (struct 
     }
 
   base = build_fold_indirect_ref (base_addr);
-  alignment = ssize_int (TYPE_ALIGN (vectype)/BITS_PER_UNIT);
+  alignment = ssize_int (targetm.vectorize.vector_min_alignment (vectype));
 
   if ((aligned_to && tree_int_cst_compare (aligned_to, alignment) < 0)
       || !misalign)
@@ -795,8 +795,9 @@ vect_compute_data_ref_alignment (struct 
 
   /* At this point we assume that the base is aligned.  */
   gcc_assert (base_aligned
-	      || (TREE_CODE (base) == VAR_DECL
-		  && DECL_ALIGN (base) >= TYPE_ALIGN (vectype)));
+	      || (TREE_CODE (base) == VAR_DECL 
+		  && (DECL_ALIGN_UNIT (base)
+		      >= targetm.vectorize.vector_min_alignment (vectype))));
 
   /* Modulo alignment.  */
   misalign = size_binop (FLOOR_MOD_EXPR, misalign, alignment);
Index: gcc/target-def.h
===================================================================
--- gcc/target-def.h	(revision 155347)
+++ gcc/target-def.h	(working copy)
@@ -393,6 +393,8 @@
 #define TARGET_VECTORIZE_BUILTIN_VEC_PERM 0
 #define TARGET_VECTORIZE_BUILTIN_VEC_PERM_OK \
   hook_bool_tree_tree_true
+#define TARGET_VECTOR_MIN_ALIGNMENT \
+  default_vector_min_alignment
 #define TARGET_SUPPORT_VECTOR_MISALIGNMENT \
   default_builtin_support_vector_misalignment
 
@@ -408,7 +410,8 @@
     TARGET_VECTOR_ALIGNMENT_REACHABLE,                                  \
     TARGET_VECTORIZE_BUILTIN_VEC_PERM,					\
     TARGET_VECTORIZE_BUILTIN_VEC_PERM_OK,				\
-    TARGET_SUPPORT_VECTOR_MISALIGNMENT					\
+    TARGET_SUPPORT_VECTOR_MISALIGNMENT,					\
+    TARGET_VECTOR_MIN_ALIGNMENT						\
   }
 
 #define TARGET_DEFAULT_TARGET_FLAGS 0
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 155347)
+++ gcc/config/arm/arm.c	(working copy)
@@ -224,6 +224,7 @@ static bool arm_can_eliminate (const int
 static void arm_asm_trampoline_template (FILE *);
 static void arm_trampoline_init (rtx, tree, rtx);
 static rtx arm_trampoline_adjust_address (rtx);
+static int arm_vector_min_alignment (const_tree type);
 
 \f
 /* Table of machine attributes.  */
@@ -507,6 +508,9 @@ static const struct attribute_spec arm_a
 #undef TARGET_CAN_ELIMINATE
 #define TARGET_CAN_ELIMINATE arm_can_eliminate
 
+#undef TARGET_VECTOR_MIN_ALIGNMENT
+#define TARGET_VECTOR_MIN_ALIGNMENT arm_vector_min_alignment
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 \f
 /* Obstack for minipool constant handling.  */
@@ -8463,7 +8467,8 @@ neon_vector_mem_operand (rtx op, int typ
     return arm_address_register_rtx_p (ind, 0);
 
   /* Allow post-increment with Neon registers.  */
-  if (type != 1 && (GET_CODE (ind) == POST_INC || GET_CODE (ind) == PRE_DEC))
+  if ((type != 1 && GET_CODE (ind) == POST_INC)
+      || (type == 0 && GET_CODE (ind) == PRE_DEC))
     return arm_address_register_rtx_p (XEXP (ind, 0), 0);
 
   /* FIXME: vld1 allows register post-modify.  */
@@ -15411,6 +15416,8 @@ arm_print_operand (FILE *stream, rtx x, 
       {
 	rtx addr;
 	bool postinc = FALSE;
+	unsigned align;
+
 	gcc_assert (GET_CODE (x) == MEM);
 	addr = XEXP (x, 0);
 	if (GET_CODE (addr) == POST_INC)
@@ -15418,7 +15425,13 @@ arm_print_operand (FILE *stream, rtx x, 
 	    postinc = 1;
 	    addr = XEXP (addr, 0);
 	  }
-	asm_fprintf (stream, "[%r]", REGNO (addr));
+	align = MEM_ALIGN (x) >> 3;
+	asm_fprintf (stream, "[%r", REGNO (addr));
+	if (align > GET_MODE_SIZE (GET_MODE (x)))
+	  align = GET_MODE_SIZE (GET_MODE (x));
+	if (align >= 8)
+	  asm_fprintf (stream, ", :%d", align << 3);
+	asm_fprintf (stream, "]");
 	if (postinc)
 	  fputs("!", stream);
       }
@@ -21463,4 +21476,24 @@ arm_have_conditional_execution (void)
   return !TARGET_THUMB1;
 }
 
+/* Return the minimum alignment required to load or store a
+   vector of the given type, which may be less than the
+   natural alignment of the type.  */
+
+static int
+arm_vector_min_alignment (const_tree type)
+{
+  if (TARGET_NEON)
+    {
+      /* The NEON element load and store instructions only require the
+	 alignment of the element type.  They can benefit from higher
+	 statically reported alignment, but we do not take advantage
+	 of that yet.  */
+      gcc_assert (TREE_CODE (type) == VECTOR_TYPE);
+      return TYPE_ALIGN_UNIT (TREE_TYPE (type));
+    }
+
+  return default_vector_min_alignment (type);
+}
+
 #include "gt-arm.h"
Index: gcc/config/arm/neon.md
===================================================================
--- gcc/config/arm/neon.md	(revision 155347)
+++ gcc/config/arm/neon.md	(working copy)
@@ -159,7 +159,8 @@
    (UNSPEC_VUZP1		201)
    (UNSPEC_VUZP2		202)
    (UNSPEC_VZIP1		203)
-   (UNSPEC_VZIP2		204)])
+   (UNSPEC_VZIP2		204)
+   (UNSPEC_MISALIGNED_ACCESS	205)])
 
 ;; Double-width vector modes.
 (define_mode_iterator VD [V8QI V4HI V2SI V2SF])
@@ -674,6 +675,51 @@
   neon_disambiguate_copy (operands, dest, src, 4);
 })
 
+(define_expand "movmisalign<mode>"
+  [(set (match_operand:VDQX 0 "nonimmediate_operand"	      "")
+	(unspec:VDQX [(match_operand:VDQX 1 "general_operand" "")]
+		     UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+{
+  if (!s_register_operand (operands[0], <MODE>mode)
+      && !s_register_operand (operands[1], <MODE>mode))
+    FAIL;
+})
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VDX 0 "memory_operand"		       "=Um")
+	(unspec:VDX [(match_operand:VDX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN
+   && (   s_register_operand (operands[0], <MODE>mode)
+       || s_register_operand (operands[1], <MODE>mode))"
+  "vst1.<V_sz_elem>\t{%P1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VDX 0 "s_register_operand"	   "=w")
+	(unspec:VDX [(match_operand:VDX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%P0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VQX 0 "memory_operand"		       "=Um")
+	(unspec:VQX [(match_operand:VQX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vst1.<V_sz_elem>\t{%q1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VQX 0 "s_register_operand"	   "=w")
+	(unspec:VQX [(match_operand:VQX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%q0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
 (define_insn "vec_set<mode>_internal"
   [(set (match_operand:VD 0 "s_register_operand" "=w")
         (vec_merge:VD

* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2009-12-18 18:09             ` Julian Brown
@ 2009-12-21  8:44               ` Ira Rosen
  2009-12-21 15:35               ` Paul Brook
  1 sibling, 0 replies; 29+ messages in thread
From: Ira Rosen @ 2009-12-21  8:44 UTC (permalink / raw)
  To: Julian Brown
  Cc: gcc-patches, Joseph S. Myers, Paul Brook, Richard Earnshaw,
	Revital1 Eres



Julian Brown <julian@codesourcery.com> wrote on 18/12/2009 19:22:04:

> On Mon, 30 Nov 2009 15:02:48 +0000
> Richard Earnshaw <rearnsha@arm.com> wrote:
>
> > I certainly think we shouldn't be hiding knowledge about the element
> > to vector-lane mapping from the vectorizer -- and that the vectorizer
> > should understand that vector copies are not necessarily the same as
> > vectorizing loads.  Anything else and we will ultimately have parts of
> > the compiler fighting against each other and that way lies subtle
> > bugs.
>
> This is a version of the patch which doesn't attempt to resolve the
> discrepancy between vector copies and vectorizing loads/stores (thus is
> only intended to work in little-endian mode, leaving big-endian mode as
> an open problem). So, vldr/vstr etc. will still be used for aligned
> accesses, and any issues with adding semantics to movmisalign<mode> are
> sidestepped.
>
> As with previous versions of the patch, there are several new failures
> in the vector testsuite.
>
> Ira Rosen <IRAR@il.ibm.com> wrote:
> > > > > --- a/gcc/tree-vect-data-refs.c
> > > > > +++ b/gcc/tree-vect-data-refs.c
> > > > > @@ -796,7 +796,8 @@ vect_compute_data_ref_alignment (struct
> > > > data_reference *dr)
> > > > >    /* At this point we assume that the base is aligned.  */
> > > > >    gcc_assert (base_aligned
> > > > >           || (TREE_CODE (base) == VAR_DECL
> > > > > -           && DECL_ALIGN (base) >= TYPE_ALIGN (vectype)));
> > > > > +           && (DECL_ALIGN (base)
> > > > > > +               >= targetm.vectorize.vector_min_alignment (vectype))));
> > > >
> > > > Looks like you forgot to multiply by BITS_PER_UNIT here.
> > >
> > > Fixed.
> >
> > Not in the attached patch...
>
> (I changed DECL_ALIGN to DECL_ALIGN_UNIT, so both copies of the
> inequality were in units, rather than explicitly multiplying by
> BITS_PER_UNIT. I should probably have mentioned that...)
>
> OK to apply?

The vectorizer part is OK with me.

Thanks,
Ira


> Julian
>
> ChangeLog
>
>     Julian Brown  <julian@codesourcery.com>
>     Paul Brook  <paul@codesourcery.com>
>     Daniel Jacobowitz  <dan@codesourcery.com>
>     Joseph Myers  <joseph@codesourcery.com>
>
>     gcc/
>     * target-def.h (TARGET_VECTOR_MIN_ALIGNMENT): Define.
>     (TARGET_VECTORIZE): Use above.
>     * target.h (gcc_target): Add vectorize.vector_min_alignment.
>     * targhooks.c (default_vector_min_alignment): New function.
>     * targhooks.h (default_vector_min_alignment): Add prototype.
>     * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Use
>     targetm.vectorize.vector_min_alignment.
>     * tree-vect-loop-manip.c (target.h): Include.
>     (vect_gen_niters_for_prolog_loop): Use
>     targetm.vectorize.vector_min_alignment.
>     * config/arm/arm.c (arm_vector_min_alignment): New function.
>     (TARGET_VECTOR_MIN_ALIGNMENT): Define macro, using above.
>     (neon_vector_mem_operand): Disallow PRE_DEC for array loads.
>     (arm_print_operand): Include alignment qualifier in %A.
>     * config/arm/neon.md (UNSPEC_MISALIGNED_ACCESS): New constant.
>     (movmisalign<mode>): New expander.
>     (movmisalign<mode>_neon_store, movmisalign<mode>_neon_load): New
>     insn patterns.
>
>     gcc/testsuite/
>     * lib/target-supports.exp
>     (check_effective_target_arm_vect_no_misalign): New function.
>     (check_effective_target_vect_no_align): Use above.
>     (check_effective_target_vect_element_align): New function.
>     * gcc.dg/vect/no-section-anchors-vect-31.c: Use
>     vect_element_align (instead of vect_hw_misalign where appropriate).
>     * gcc.dg/vect/no-section-anchors-vect-64.c: Ditto.
>     * gcc.dg/vect/no-section-anchors-vect-66.c: Ditto.
>     * gcc.dg/vect/no-section-anchors-vect-68.c: Ditto.
>     * gcc.dg/vect/no-section-anchors-vect-69.c: Ditto.
>     * gcc.dg/vect/no-scevccp-outer-8.c: Ditto.
>     * gcc.dg/vect/pr25413.c: Ditto.
>     * gcc.dg/vect/section-anchors-vect-69.c: Ditto.
>     * gcc.dg/vect/slp-25.c: Ditto.
>     * gcc.dg/vect/vect-109.c: Ditto.
>     * gcc.dg/vect/vect-26.c: Ditto.
>     * gcc.dg/vect/vect-27.c: Ditto.
>     * gcc.dg/vect/vect-28.c: Ditto.
>     * gcc.dg/vect/vect-29.c: Ditto.
>     * gcc.dg/vect/vect-33.c: Ditto.
>     * gcc.dg/vect/vect-42.c: Ditto.
>     * gcc.dg/vect/vect-44.c: Ditto.
>     * gcc.dg/vect/vect-48.c: Ditto.
>     * gcc.dg/vect/vect-50.c: Ditto.
>     * gcc.dg/vect/vect-52.c: Ditto.
>     * gcc.dg/vect/vect-54.c: Ditto.
>     * gcc.dg/vect/vect-56.c: Ditto.
>     * gcc.dg/vect/vect-58.c: Ditto.
>     * gcc.dg/vect/vect-60.c: Ditto.
>     * gcc.dg/vect/vect-70.c: Ditto.
>     * gcc.dg/vect/vect-72.c: Ditto.
>     * gcc.dg/vect/vect-75.c: Ditto.
>     * gcc.dg/vect/vect-87.c: Ditto.
>     * gcc.dg/vect/vect-88.c: Ditto.
>     * gcc.dg/vect/vect-89.c: Ditto.
>     * gcc.dg/vect/vect-91.c: Ditto.
>     * gcc.dg/vect/vect-92.c: Ditto.
>     * gcc.dg/vect/vect-93.c: Ditto.
>     * gcc.dg/vect/vect-95.c: Ditto.
>     * gcc.dg/vect/vect-96.c: Ditto.
>     * gcc.dg/vect/vect-align-1.c: Ditto.
>     * gcc.dg/vect/vect-align-2.c: Ditto.
>     * gcc.dg/vect/vect-multitypes-1.c: Ditto.
>     * gcc.dg/vect/vect-multitypes-3.c: Ditto.
>     * gcc.dg/vect/vect-multitypes-4.c: Ditto.
>     * gcc.dg/vect/vect-multitypes-6.c: Ditto.
>     * gfortran.dg/vect/vect-2.f90: Ditto.
>     * gfortran.dg/vect/vect-3.f90: Ditto.
>     * gfortran.dg/vect/vect-4.f90: Ditto.
>     * gfortran.dg/vect/vect-5.f90: Ditto.
> [attachment "misaligned-neon-fsf-7.diff" deleted by Ira Rosen/Haifa/IBM]

* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2009-12-18 18:09             ` Julian Brown
  2009-12-21  8:44               ` Ira Rosen
@ 2009-12-21 15:35               ` Paul Brook
  2010-05-18  0:44                 ` Julian Brown
  1 sibling, 1 reply; 29+ messages in thread
From: Paul Brook @ 2009-12-21 15:35 UTC (permalink / raw)
  To: Julian Brown
  Cc: Richard Earnshaw, Joseph S. Myers, Ira Rosen, gcc-patches, eres

On Friday 18 December 2009, Julian Brown wrote:
> On Mon, 30 Nov 2009 15:02:48 +0000
> 
> Richard Earnshaw <rearnsha@arm.com> wrote:
> > I certainly think we shouldn't be hiding knowledge about the element
> > to vector-lane mapping from the vectorizer -- and that the vectorizer
> > should understand that vector copies are not necessarily the same as
> > vectorizing loads.  Anything else and we will ultimately have parts of
> > the compiler fighting against each other and that way lies subtle
> > bugs.
> 
> This is a version of the patch which doesn't attempt to resolve the
> discrepancy between vector copies and vectorizing loads/stores (thus is
> only intended to work in little-endian mode, leaving big-endian mode as
> an open problem). So, vldr/vstr etc. will still be used for aligned
> accesses, and any issues with adding semantics to movmisalign<mode> are
> sidestepped.

I don't think this is correct. The original patch contained two hooks:

* VECTOR_MIN_ALIGN: Reduce the alignment required for an "aligned" vector 
load.
* VECTOR_ALWAYS_MISALIGN: Use special instructions when loading vectorized 
array data. To minimize churn these "special instructions" used the tree codes 
and RTL patterns that already existed for misaligned vectors. With hindsight 
this may have been a mistake. When this hook returns true we probably should 
not be using movmisalign for unaligned vectors.

By my reading (unverified), your latest patch will use VLDR for element-aligned
data and VLD1 for unaligned data. Neither of these is correct.

If you are unable to distinguish between vector objects and array data then 
you have a couple of options (either or both):

- Have VECTOR_MIN_ALIGN return 32 for ARM. This is possible because, while the
ABI requires 64-bit alignment, the NEON vector load/store instructions only
require 32-bit alignment. Some care is required, as this will break if we end
up loading into core registers (ldrd/strd). This will "fix" int/float code but
not help short/char.

- Add movmisalign. Either ignore the fact that packed structures break, or add 
yet another hook for "misaligned vectors must be at least {-this-} aligned". 
This will not work for big-endian vectors, and will go away once we implement 
array load support.

Paul

P.S.
unaligned: byte aligned, typically from a packed structure.
element aligned: natural alignment of array data.
vector aligned: ABI specified alignment for vector objects.
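
A minimal sketch of the three alignment categories defined above, assuming
a 4-byte element (int/float) and the 64-bit (8-byte) ABI vector alignment
mentioned earlier in this message. The names align_class, packed_rec and
classify_alignment are hypothetical and exist only for illustration; they
are not part of GCC or of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only.  */
enum align_class { UNALIGNED, ELEMENT_ALIGNED, VECTOR_ALIGNED };

/* A packed structure: 'data' starts at byte offset 1, so accesses to it
   are only byte aligned (GCC attribute syntax).  */
struct __attribute__((packed)) packed_rec
{
  char tag;
  int data[4];
};

/* Classify ADDR given the element size and the ABI vector alignment,
   both in bytes.  The strongest category that applies wins.  */
static enum align_class
classify_alignment (uintptr_t addr, size_t elem_size, size_t vector_align)
{
  assert (elem_size != 0 && vector_align != 0);

  if (addr % vector_align == 0)
    return VECTOR_ALIGNED;
  if (addr % elem_size == 0)
    return ELEMENT_ALIGNED;
  return UNALIGNED;
}
```

With these assumptions, an int array starting at address 20 is element
aligned but not vector aligned, and the data member of packed_rec is
typically only byte aligned.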


* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2009-12-21 15:35               ` Paul Brook
@ 2010-05-18  0:44                 ` Julian Brown
  2010-05-18  8:50                   ` Ira Rosen
  2010-06-04 12:50                   ` Julian Brown
  0 siblings, 2 replies; 29+ messages in thread
From: Julian Brown @ 2010-05-18  0:44 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Earnshaw, Paul Brook, Ira Rosen

[-- Attachment #1: Type: text/plain, Size: 5335 bytes --]

Hi,

On Mon, 21 Dec 2009 12:20:12 +0000
Paul Brook <paul@codesourcery.com> wrote:

> On Friday 18 December 2009, Julian Brown wrote:
> > This is a version of the patch which doesn't attempt to resolve the
> > discrepancy between vector copies and vectorizing loads/stores
> > (thus is only intended to work in little-endian mode, leaving
> > big-endian mode as an open problem). So, vldr/vstr etc. will still
> > be used for aligned accesses, and any issues with adding semantics
> > to movmisalign<mode> are sidestepped.
> 
> I don't think this is correct. The original patch contained two hooks:
> [snip]
> - Add movmisalign. Either ignore the fact that packed structures
> break, or add yet another hook for "misaligned vectors must be at
> least {-this-} aligned". This will not work for big-endian vectors,
> and will go away once we implement array load support.

This is a new version of the patch, which adds movmisalign patterns
for little-endian NEON, and uses a new (since the last version of the
patch was posted) target hook (TARGET_SUPPORT_VECTOR_MISALIGNMENT) to
describe the alignments supported by NEON.
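
As a rough illustration of the kind of decision such a hook has to make,
the following stand-alone model accepts a misaligned access only in
little-endian mode and only when the misalignment is a multiple of the
element size. This is a sketch under assumptions, not the code in the
attached patch: the function name, its parameters, the treatment of
packed accesses as unsupported, and the use of -1 for "misalignment
unknown" are all chosen here for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-alone model; this is not GCC's internal hook
   signature.  MISALIGNMENT and ELEM_SIZE are in bytes.  */
static bool
neon_misalignment_ok (int misalignment, int elem_size, bool is_packed,
                      bool big_endian)
{
  assert (elem_size > 0);

  /* Assumption: only little-endian mode is handled.  */
  if (big_endian)
    return false;

  /* Assumption: packed data may be only byte aligned, so reject it.  */
  if (is_packed)
    return false;

  /* Assumption: -1 means "unknown"; non-packed data is taken to be at
     least element aligned.  */
  if (misalignment == -1)
    return true;

  /* Element-aligned accesses are acceptable.  */
  return misalignment % elem_size == 0;
}
```

For a vector of 4-byte elements, this model accepts a misalignment of 4
bytes and rejects one of 2 bytes.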

I've retained the check_effective_target_vect_element_align predicate
from an earlier version of this patch -- to inform the testsuite that
the target supports vectors aligned to the natural alignment of their
elements, rather than (check_effective_target_vect_hw_misalign)
vectors aligned to arbitrary alignments. Test cases (other than
gcc.dg/vect/vect-align-1.c and gcc.dg/vect/vect-align-2.c which use
packed accesses) have been adjusted to use the new predicate.

I've attempted to minimise the disruption to the test results this
patch produces. It seems that with the default vector width (64 bits),
several test cases (e.g. gcc.dg/vect/vect-outer-5.c) do not work
properly as-is. For that test for example, both outer loops get
vectorised for NEON. The second loop looks like this:

  /* Outer-loop 2: Not vectorizable because of dependence distance. */
  for (i = 0; i < 4; i++)
    {
      s = 0;
      for (j=0; j<N; j+=4)
        s += C[j];
      B[i+3] = B[i] + s;
    }

I believe this is only unvectorizable if the width of the vectors
used is four elements (floats in this case) or greater. NEON supports
such vectors in the vectorizer if the command-line option
-mvectorize-with-neon-quad is given: so, for this test (and several
others), I've added:

/* { dg-add-options quad_vectors } */

This adds that option to the test invocation, making the behaviour
more similar to other targets, albeit in a NEON-specific way (I don't
know of any other CPUs which support two vector widths in quite the same
way as NEON). A (trickier) alternative might be to pass information
about the vector width for the target down to the test harness somehow,
and adjust the tests accordingly.
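
The dependence-distance point can be checked with a small stand-alone
model. The sketch below simplifies the loop from vect-outer-5.c by
treating s as a constant, and emulates a vector of vf lanes by loading
all lanes before storing any of them. The helper names (scalar_loop,
chunked_loop, matches_scalar) are hypothetical and are not part of the
testsuite.

```c
#include <assert.h>
#include <string.h>

#define LEN 8      /* Room for B[0..6], which the loop touches.  */
#define MAX_VF 4   /* Largest emulated vector width, in elements.  */

/* Scalar reference: B[i+3] depends on B[i], a distance of 3.  */
static void
scalar_loop (float *b, float s)
{
  for (int i = 0; i < 4; i++)
    b[i + 3] = b[i] + s;
}

/* Emulate vectorization with VF lanes: each step loads all lanes,
   then stores all lanes.  */
static void
chunked_loop (float *b, float s, int vf)
{
  float tmp[MAX_VF];

  assert (vf > 0 && vf <= MAX_VF && 4 % vf == 0);

  for (int i = 0; i < 4; i += vf)
    {
      for (int lane = 0; lane < vf; lane++)
        tmp[lane] = b[i + lane] + s;
      for (int lane = 0; lane < vf; lane++)
        b[i + 3 + lane] = tmp[lane];
    }
}

/* Return 1 if the emulated vector loop gives the scalar result.  */
static int
matches_scalar (int vf)
{
  float ref[LEN], vec[LEN];

  for (int i = 0; i < LEN; i++)
    ref[i] = vec[i] = (float) i;

  scalar_loop (ref, 1.0f);
  chunked_loop (vec, 1.0f, vf);
  return memcmp (ref, vec, sizeof ref) == 0;
}
```

In this model, two-element vectors (the 64-bit default for floats)
reproduce the scalar result, because each load of B[i] happens after the
store it depends on. Four-element vectors do not, because B[3] is read
before the first chunk has written it.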

With these tweaks, test results for gcc/vect.exp change as follows:

New FAIL: default/gcc.sum:gcc.dg/vect/vect-109.c scan-tree-dump-times vect "Vectorizing an unaligned access" 10
New FAIL: default/gcc.sum:gcc.dg/vect/vect-outer-4c.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1

FAIL -> PASS: default/gcc.sum:gcc.dg/vect/pr37027.c scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL -> PASS: default/gcc.sum:gcc.dg/vect/pr37027.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1

The new failures are due to apparently-missing patterns in the NEON
backend.

A full test run is still in progress. OK to apply, assuming that is
successful?

Thanks,

Julian

ChangeLog

    gcc/
    * config/arm/arm.c (arm_builtin_support_vector_misalignment): New.
    (TARGET_SUPPORT_VECTOR_MISALIGNMENT): New. Use above.
    (neon_vector_mem_operand): Disallow PRE_DEC for array loads.
    (arm_print_operand): Include alignment qualifier in %A.
    * config/arm/neon.md (UNSPEC_MISALIGNED_ACCESS): New constant.
    (movmisalign<mode>): New expander.
    (movmisalign<mode>_neon_store, movmisalign<mode>_neon_load): New
    insn patterns.

    gcc/testsuite/
    * gcc.dg/vect/vect-50.c: Use vect_element_align instead of
    vect_hw_misalign.
    * gcc.dg/vect/vect-33.c: Likewise.
    * gcc.dg/vect/no-section-anchors-vect-69.c: Likewise.
    * gcc.dg/vect/vect-42.c: Likewise.
    * gcc.dg/vect/vect-outer-5.c: Likewise.
    * gcc.dg/vect/vect-44.c: Likewise.
    * gcc.dg/vect/vect-70.c: Likewise.
    * gcc.dg/vect/vect-28.c: Likewise.
    * gcc.dg/vect/vect-109.c: Likewise.
    * gcc.dg/vect/vect-91.c: Likewise.
    * gcc.dg/vect/no-scevccp-outer-8.c: Likewise.
    * gcc.dg/vect/vect-95.c: Likewise.
    * gcc.dg/vect/vect-87.c: Likewise.
    * gcc.dg/vect/vect-96.c: Likewise.
    * gcc.dg/vect/vect-multitypes-1.c: Likewise.
    * gcc.dg/vect/vect-88.c: Likewise.
    * gcc.dg/vect/pr25413.c: Likewise.
    * gfortran.dg/vect/vect-2.f90: Likewise.
    * gfortran.dg/vect/vect-3.f90: Likewise.
    * gfortran.dg/vect/vect-4.f90: Likewise.
    * gfortran.dg/vect/vect-5.f90: Likewise.
    * gcc.dg/vect/slp-3.c: Use quad-word vectors when available.
    * gcc.dg/vect/no-vfa-pr29145.c: Likewise.
    * gcc.dg/vect/vect-multitypes-4.c: Likewise. Use vect_element_align.
    * lib/target-supports.exp
    (check_effective_target_arm_vect_no_misalign): New.
    (check_effective_target_vect_no_align): Use above.
    (check_effective_target_vect_element_align): New.
    (add_options_for_quad_vectors): New.

[-- Attachment #2: misaligned-neon-fsf-13.diff --]
[-- Type: text/x-patch, Size: 33899 bytes --]

Index: gcc/testsuite/gcc.dg/vect/vect-50.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-50.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-50.c	(working copy)
@@ -62,8 +62,8 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }  */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target vect_element_align } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && { {! vect_no_align } && {! vect_hw_misalign } } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && { {! vect_no_align } && {! vect_element_align } } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-33.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-33.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-33.c	(working copy)
@@ -40,5 +40,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target vector_alignment_reachable } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */ 
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */ 
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c	(working copy)
@@ -115,6 +115,6 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vector_alignment_reachable} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-42.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-42.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-42.c	(working copy)
@@ -64,7 +64,7 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_element_align } } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-outer-5.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-outer-5.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-outer-5.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_float } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdio.h>
 #include <stdarg.h>
Index: gcc/testsuite/gcc.dg/vect/vect-44.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-44.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-44.c	(working copy)
@@ -68,5 +68,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {{! vect_no_align} && {! vect_hw_misalign} } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {{! vect_no_align} && {! vect_element_align} } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-70.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-70.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-70.c	(working copy)
@@ -65,5 +65,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target {{! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target {{! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-28.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-28.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-28.c	(working copy)
@@ -41,5 +41,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-109.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-109.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-109.c	(working copy)
@@ -73,7 +73,7 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 2 "vect" } } */
-/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 10 "vect" { target vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 10 "vect" { target vect_element_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-91.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-91.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-91.c	(working copy)
@@ -60,5 +60,5 @@ main3 ()
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" { xfail vect_no_int_add } } } */
 /* { dg-final { scan-tree-dump-times "accesses have the same alignment." 3 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" {target { vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" {target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c	(working copy)
@@ -46,5 +46,5 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { ! { vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { ! { vect_element_align } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-95.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-95.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-95.c	(working copy)
@@ -56,14 +56,14 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail {vect_hw_misalign} } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail {vect_element_align} } } } */
 
 /* For targets that support unaligned loads we version for the two unaligned 
    stores and generate misaligned accesses for the loads. For targets that 
    don't support unaligned loads we version for all four accesses.  */
 
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign} } } }  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } }  */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /*  { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target vect_no_align } } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 4 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-87.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-87.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-87.c	(working copy)
@@ -52,5 +52,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target vector_alignment_reachable} } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-96.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-96.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-96.c	(working copy)
@@ -45,5 +45,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { {! vect_no_align} && vector_alignment_reachable } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} && {! vect_element_align} } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
@@ -78,11 +79,11 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_element_align} } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_element_align} } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-88.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-88.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-88.c	(working copy)
@@ -52,5 +52,5 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target vector_alignment_reachable } } }  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/slp-3.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-3.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/slp-3.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include <stdio.h>
Index: gcc/testsuite/gcc.dg/vect/no-vfa-pr29145.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-vfa-pr29145.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/no-vfa-pr29145.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/pr25413.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/pr25413.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/pr25413.c	(working copy)
@@ -33,7 +33,7 @@ int main (void)
 } 
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vector_alignment_reachable_for_64bit } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c	(revision 159030)
+++ gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
@@ -85,11 +86,11 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_hw_misalign}  } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_element_align} } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_element_align}  } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 8 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	(revision 159030)
+++ gcc/testsuite/lib/target-supports.exp	(working copy)
@@ -1575,6 +1575,18 @@ proc check_effective_target_arm32 { } {
     }]
 }
 
+# Return 1 if this is an ARM target that only supports aligned vector accesses
+proc check_effective_target_arm_vect_no_misalign { } {
+    return [check_no_compiler_messages arm_vect_no_misalign assembly {
+	#if !defined(__arm__) \
+	    || (defined(__ARMEL__) \
+	        && (!defined(__thumb__) || defined(__thumb2__)))
+	#error FOO
+	#endif
+    }]
+}
+
+
 # Return 1 if this is an ARM target supporting -mfpu=vfp
 # -mfloat-abi=softfp.  Some multilibs may be incompatible with these
 # options.
@@ -2411,7 +2423,7 @@ proc check_effective_target_vect_no_alig
 	if { [istarget mipsisa64*-*-*]
 	     || [istarget sparc*-*-*]
 	     || [istarget ia64-*-*]
-	     || [check_effective_target_arm32] } { 
+	     || [check_effective_target_arm_vect_no_misalign] } { 
 	    set et_vect_no_align_saved 1
 	}
     }
@@ -2546,6 +2558,25 @@ proc check_effective_target_vector_align
     return $et_vector_alignment_reachable_for_64bit_saved
 }
 
+# Return 1 if the target only requires element alignment for vector accesses
+
+proc check_effective_target_vect_element_align { } {
+    global et_vect_element_align
+
+    if [info exists et_vect_element_align] {
+	verbose "check_effective_target_vect_element_align: using cached result" 2
+    } else {
+	set et_vect_element_align 0
+	if { [istarget arm*-*-*]
+	     || [check_effective_target_vect_hw_misalign] } {
+	   set et_vect_element_align 1
+	}
+    }
+
+    verbose "check_effective_target_vect_element_align: returning $et_vect_element_align" 2
+    return $et_vect_element_align
+}
+
 # Return 1 if the target supports vector conditional operations, 0 otherwise.
 
 proc check_effective_target_vect_condition { } {
@@ -3103,6 +3134,16 @@ proc add_options_for_bind_pic_locally { 
     return $flags
 }
 
+# Add to FLAGS the flags needed to enable 128-bit vectors.
+
+proc add_options_for_quad_vectors { flags } {
+    if [is-effective-target arm_neon_ok] {
+	return "$flags -mvectorize-with-neon-quad"
+    }
+
+    return $flags
+}
+
 # Return 1 if the target provides a full C99 runtime.
 
 proc check_effective_target_c99_runtime { } {
Index: gcc/testsuite/gfortran.dg/vect/vect-2.f90
===================================================================
--- gcc/testsuite/gfortran.dg/vect/vect-2.f90	(revision 159030)
+++ gcc/testsuite/gfortran.dg/vect/vect-2.f90	(working copy)
@@ -18,5 +18,5 @@ END
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } }
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { vect_no_align && { ! vector_alignment_reachable } } } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }
-! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" {target { vect_no_align || { { ! vector_alignment_reachable  } && { ! vect_hw_misalign } } } } } } 
+! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" {target { vect_no_align || { { ! vector_alignment_reachable  } && { ! vect_element_align } } } } } } 
 ! { dg-final { cleanup-tree-dump "vect" } }
Index: gcc/testsuite/gfortran.dg/vect/vect-3.f90
===================================================================
--- gcc/testsuite/gfortran.dg/vect/vect-3.f90	(revision 159030)
+++ gcc/testsuite/gfortran.dg/vect/vect-3.f90	(working copy)
@@ -7,8 +7,8 @@ Y = Y + A * X
 END
 
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } }
-! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } }
-! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } }
+! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_element_align} } } } } }
+! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_element_align} } } } } }
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable}} } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { { vect_no_align } || { ! vector_alignment_reachable} } } } }
 
Index: gcc/testsuite/gfortran.dg/vect/vect-4.f90
===================================================================
--- gcc/testsuite/gfortran.dg/vect/vect-4.f90	(revision 159030)
+++ gcc/testsuite/gfortran.dg/vect/vect-4.f90	(working copy)
@@ -12,6 +12,6 @@ END
 ! { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } 
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } }
-! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } }
+! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } }
 ! { dg-final { scan-tree-dump-times "accesses have the same alignment." 1 "vect" } }
 ! { dg-final { cleanup-tree-dump "vect" } }
Index: gcc/testsuite/gfortran.dg/vect/vect-5.f90
===================================================================
--- gcc/testsuite/gfortran.dg/vect/vect-5.f90	(revision 159030)
+++ gcc/testsuite/gfortran.dg/vect/vect-5.f90	(working copy)
@@ -39,5 +39,5 @@
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } }
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 2 "vect" { target { vect_no_align } } } }
-! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } }
+! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } }
 ! { dg-final { cleanup-tree-dump "vect" } }
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 159030)
+++ gcc/config/arm/arm.c	(working copy)
@@ -224,6 +224,10 @@ static void arm_asm_trampoline_template 
 static void arm_trampoline_init (rtx, tree, rtx);
 static rtx arm_trampoline_adjust_address (rtx);
 static rtx arm_pic_static_addr (rtx orig, rtx reg);
+static bool arm_builtin_support_vector_misalignment (enum machine_mode mode,
+						     const_tree type,
+						     int misalignment,
+						     bool is_packed);
 
 \f
 /* Table of machine attributes.  */
@@ -507,6 +511,10 @@ static const struct attribute_spec arm_a
 #undef TARGET_CAN_ELIMINATE
 #define TARGET_CAN_ELIMINATE arm_can_eliminate
 
+#undef TARGET_SUPPORT_VECTOR_MISALIGNMENT
+#define TARGET_SUPPORT_VECTOR_MISALIGNMENT \
+  arm_builtin_support_vector_misalignment
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 \f
 /* Obstack for minipool constant handling.  */
@@ -8606,7 +8614,8 @@ neon_vector_mem_operand (rtx op, int typ
     return arm_address_register_rtx_p (ind, 0);
 
   /* Allow post-increment with Neon registers.  */
-  if (type != 1 && (GET_CODE (ind) == POST_INC || GET_CODE (ind) == PRE_DEC))
+  if ((type != 1 && GET_CODE (ind) == POST_INC)
+      || (type == 0 && GET_CODE (ind) == PRE_DEC))
     return arm_address_register_rtx_p (XEXP (ind, 0), 0);
 
   /* FIXME: vld1 allows register post-modify.  */
@@ -15574,6 +15583,8 @@ arm_print_operand (FILE *stream, rtx x, 
       {
 	rtx addr;
 	bool postinc = FALSE;
+	unsigned align;
+
 	gcc_assert (GET_CODE (x) == MEM);
 	addr = XEXP (x, 0);
 	if (GET_CODE (addr) == POST_INC)
@@ -15581,7 +15592,13 @@ arm_print_operand (FILE *stream, rtx x, 
 	    postinc = 1;
 	    addr = XEXP (addr, 0);
 	  }
-	asm_fprintf (stream, "[%r]", REGNO (addr));
+	align = MEM_ALIGN (x) >> 3;
+	asm_fprintf (stream, "[%r", REGNO (addr));
+	if (align > GET_MODE_SIZE (GET_MODE (x)))
+	  align = GET_MODE_SIZE (GET_MODE (x));
+	if (align >= 8)
+	  asm_fprintf (stream, ", :%d", align << 3);
+	asm_fprintf (stream, "]");
 	if (postinc)
 	  fputs("!", stream);
       }
@@ -21690,4 +21707,29 @@ arm_have_conditional_execution (void)
   return !TARGET_THUMB1;
 }
 
+static bool
+arm_builtin_support_vector_misalignment (enum machine_mode mode,
+					 const_tree type, int misalignment,
+					 bool is_packed)
+{
+  if (TARGET_NEON && !BYTES_BIG_ENDIAN)
+    {
+      HOST_WIDE_INT align = TYPE_ALIGN_UNIT (type);
+
+      if (is_packed)
+        return align == 1;
+
+      /* If the misalignment is unknown, we should be able to handle the access
+	 so long as it is not to a member of a packed data structure.  */
+      if (misalignment == -1)
+        return true;
+
+      /* This is probably always true.  */
+      return (misalignment % align) == 0;
+    }
+  
+  return default_builtin_support_vector_misalignment (mode, type, misalignment,
+						      is_packed);
+}
+
 #include "gt-arm.h"
Index: gcc/config/arm/neon.md
===================================================================
--- gcc/config/arm/neon.md	(revision 159030)
+++ gcc/config/arm/neon.md	(working copy)
@@ -159,7 +159,8 @@
    (UNSPEC_VUZP1		201)
    (UNSPEC_VUZP2		202)
    (UNSPEC_VZIP1		203)
-   (UNSPEC_VZIP2		204)])
+   (UNSPEC_VZIP2		204)
+   (UNSPEC_MISALIGNED_ACCESS	205)])
 
 ;; Double-width vector modes.
 (define_mode_iterator VD [V8QI V4HI V2SI V2SF])
@@ -674,6 +675,51 @@
   neon_disambiguate_copy (operands, dest, src, 4);
 })
 
+(define_expand "movmisalign<mode>"
+  [(set (match_operand:VDQX 0 "nonimmediate_operand"	      "")
+	(unspec:VDQX [(match_operand:VDQX 1 "general_operand" "")]
+		     UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+{
+  if (!s_register_operand (operands[0], <MODE>mode)
+      && !s_register_operand (operands[1], <MODE>mode))
+    FAIL;
+})
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VDX 0 "memory_operand"		       "=Um")
+	(unspec:VDX [(match_operand:VDX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN
+   && (   s_register_operand (operands[0], <MODE>mode)
+       || s_register_operand (operands[1], <MODE>mode))"
+  "vst1.<V_sz_elem>\t{%P1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VDX 0 "s_register_operand"	   "=w")
+	(unspec:VDX [(match_operand:VDX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%P0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VQX 0 "memory_operand"		       "=Um")
+	(unspec:VQX [(match_operand:VQX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vst1.<V_sz_elem>\t{%q1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VQX 0 "s_register_operand"	   "=w")
+	(unspec:VQX [(match_operand:VQX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%q0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
 (define_insn "vec_set<mode>_internal"
   [(set (match_operand:VD 0 "s_register_operand" "=w")
         (vec_merge:VD

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2010-05-18  0:44                 ` Julian Brown
@ 2010-05-18  8:50                   ` Ira Rosen
  2010-05-18 15:58                     ` Julian Brown
  2010-06-04 12:50                   ` Julian Brown
  1 sibling, 1 reply; 29+ messages in thread
From: Ira Rosen @ 2010-05-18  8:50 UTC (permalink / raw)
  To: Julian Brown; +Cc: gcc-patches, Paul Brook, Richard Earnshaw



Julian Brown <julian@codesourcery.com> wrote on 18/05/2010 03:31:08 AM:

> I've retained the check_effective_target_vect_element_align predicate
> from an earlier version of this patch -- to inform the testsuite that
> the target supports vectors aligned to the natural alignment of their
> elements, rather than (check_effective_target_vect_hw_misalign)
> vectors aligned to arbitrary alignments. Test cases (other than
> gcc.dg/vect/vect-align-1.c and gcc.dg/vect/vect-align-2.c which use
> packed accesses) have been adjusted to use the new predicate.
>
> I've attempted to minimise the disruption to the test results this
> patch produces. It seems that with the default vector width (64 bits),
> several test cases (e.g. gcc.dg/vect/vect-outer-5.c) do not work
> properly as-is. For that test for example, both outer loops get
> vectorised for NEON. The second loop looks like this:
>
>   /* Outer-loop 2: Not vectorizable because of dependence distance. */
>   for (i = 0; i < 4; i++)
>     {
>       s = 0;
>       for (j=0; j<N; j+=4)
>         s += C[j];
>       B[i+3] = B[i] + s;
>     }
>
> I believe this is only unvectorizable if the width of the vectors
> used is four elements (floats in this case) or greater. NEON supports
> such vectors in the vectorizer if the command-line option
> -mvectorize-with-neon-quad is given: so, for this test (and several
> others), I've added:
>
> /* { dg-add-options quad_vectors } */
>
> Which adds that option to the test invocation, making the behaviour
> more similar to other targets, albeit in a NEON-specific way (I don't
> know of any other CPUs which support two vector widths in quite the same
> way as NEON). A (trickier) alternative might be to pass information
> about the vector width for the target down to the test harness somehow,
> and adjust the tests accordingly.
>
> With these tweaks, test results for gcc/vect.exp change as follows:
>
> New FAIL: default/gcc.sum:gcc.dg/vect/vect-109.c scan-tree-dump-
> times vect "Vectorizing an unaligned access" 10

I think you may want to add the quad_vectors requirement here as well, since
the test assumes 4 elements for vector int.

> New FAIL: default/gcc.sum:gcc.dg/vect/vect-outer-4c.c scan-tree-
> dump-times vect "OUTER LOOP VECTORIZED" 1

This one requires multiplication of vector short

>
> FAIL -> PASS: default/gcc.sum:gcc.dg/vect/pr37027.c scan-tree-dump-
> times vect "vectorized 1 loops" 1
> FAIL -> PASS: default/gcc.sum:gcc.dg/vect/pr37027.c scan-tree-dump-
> times vect "vectorizing stmts using SLP" 1

and this one vector int add.

Thanks,
Ira

>
> The new failures are due to apparently-missing patterns in the NEON
> backend.
>
> A full test run is still in progress. OK to apply, assuming that is
> successful?
>
> Thanks,
>
> Julian
>
> ChangeLog
>
>     gcc/
>     * config/arm/arm.c (arm_builtin_support_vector_misalignment): New.
>     (TARGET_SUPPORT_VECTOR_MISALIGNMENT): New. Use above.
>     (neon_vector_mem_operand): Disallow PRE_DEC for array loads.
>     (arm_print_operand): Include alignment qualifier in %A.
>     * config/arm/neon.md (UNSPEC_MISALIGNED_ACCESS): New constant.
>     (movmisalign<mode>): New expander.
>     (movmisalign<mode>_neon_store, movmisalign<mode>_neon_load): New
>     insn patterns.
>
>     gcc/testsuite/
>     * gcc.dg/vect/vect-50.c: Use vect_element_align instead of
>     vect_hw_misalign.
>     * gcc.dg/vect/vect-33.c: Likewise.
>     * gcc.dg/vect/no-section-anchors-vect-69.c: Likewise.
>     * gcc.dg/vect/vect-42.c: Likewise.
>     * gcc.dg/vect/vect-outer-5.c: Likewise.
>     * gcc.dg/vect/vect-44.c: Likewise.
>     * gcc.dg/vect/vect-70.c: Likewise.
>     * gcc.dg/vect/vect-28.c: Likewise.
>     * gcc.dg/vect/vect-109.c: Likewise.
>     * gcc.dg/vect/vect-91.c: Likewise.
>     * gcc.dg/vect/no-scevccp-outer-8.c: Likewise.
>     * gcc.dg/vect/vect-95.c: Likewise.
>     * gcc.dg/vect/vect-87.c: Likewise.
>     * gcc.dg/vect/vect-96.c: Likewise.
>     * gcc.dg/vect/vect-multitypes-1.c: Likewise.
>     * gcc.dg/vect/vect-88.c: Likewise.
>     * gcc.dg/vect/pr25413.c: Likewise.
>     * gfortran.dg/vect/vect-2.f90: Likewise.
>     * gfortran.dg/vect/vect-3.f90: Likewise.
>     * gfortran.dg/vect/vect-4.f90: Likewise.
>     * gfortran.dg/vect/vect-5.f90: Likewise.
>     * gcc.dg/vect/slp-3.c: Use quad-word vectors when available.
>     * gcc.dg/vect/no-vfa-pr29145.c: Likewise.
>     * gcc.dg/vect/vect-multitypes-4.c: Likewise. Use vect_element_align.
>     * lib/target-supports.exp
>     (check_effective_target_arm_vect_no_misalign): New.
>     (check_effective_target_vect_no_align): Use above.
>     (check_effective_target_vect_element_align): New.
>     (add_options_for_quad_vectors): New.
>
> [attachment "misaligned-neon-fsf-13.diff" deleted by Ira Rosen/Haifa/IBM]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2010-05-18  8:50                   ` Ira Rosen
@ 2010-05-18 15:58                     ` Julian Brown
  0 siblings, 0 replies; 29+ messages in thread
From: Julian Brown @ 2010-05-18 15:58 UTC (permalink / raw)
  To: Ira Rosen; +Cc: gcc-patches, Paul Brook, Richard Earnshaw

On Tue, 18 May 2010 11:41:26 +0300
Ira Rosen <IRAR@il.ibm.com> wrote:

> Julian Brown <julian@codesourcery.com> wrote on 18/05/2010 03:31:08
> > With these tweaks, test results for gcc/vect.exp change as follows:
> >
> > New FAIL: default/gcc.sum:gcc.dg/vect/vect-109.c scan-tree-dump-
> > times vect "Vectorizing an unaligned access" 10
> 
> I think, you may want to add quad_vectors requirement here as well,
> since the test assumes 4 elements for vector int.

Thanks, adding the quad_vectors option fixes that test.

> > New FAIL: default/gcc.sum:gcc.dg/vect/vect-outer-4c.c scan-tree-
> > dump-times vect "OUTER LOOP VECTORIZED" 1
> 
> This one requires multiplication of vector short

I might be able to fix that with a follow-up patch.

Full testing revealed some more new failures:

New FAIL: default/g++.sum:g++.dg/vect/pr36648.cc scan-tree-dump-times vect "vectorized 1 loops" 1
New FAIL: default/g++.sum:g++.dg/vect/pr36648.cc scan-tree-dump-times vect "vectorizing stmts using SLP" 1

These are due to a transformation in an earlier tree pass (091t.crited
I think), leading to:

pr36648.cc:10: note: not vectorized: control flow in loop.

I don't think this indicates a failure in this patch, as such.

Julian

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2010-05-18  0:44                 ` Julian Brown
  2010-05-18  8:50                   ` Ira Rosen
@ 2010-06-04 12:50                   ` Julian Brown
  2010-06-04 16:26                     ` Richard Earnshaw
  1 sibling, 1 reply; 29+ messages in thread
From: Julian Brown @ 2010-06-04 12:50 UTC (permalink / raw)
  To: Julian Brown; +Cc: gcc-patches, Richard Earnshaw, Paul Brook

On Tue, 18 May 2010 01:31:08 +0100
Julian Brown <julian@codesourcery.com> wrote:

> Hi,
> 
> On Mon, 21 Dec 2009 12:20:12 +0000
> Paul Brook <paul@codesourcery.com> wrote:
> 
> > On Friday 18 December 2009, Julian Brown wrote:
> > > This is a version of the patch which doesn't attempt to resolve
> > > the discrepancy between vector copies and vectorizing loads/stores
> > > (thus is only intended to work in little-endian mode, leaving
> > > big-endian mode as an open problem). So, vldr/vstr etc. will still
> > > be used for aligned accesses, and any issues with adding semantics
> > > to movmisalign<mode> are sidestepped.
> > 
> > I don't think this is correct. The original patch contained two
> > hooks: [snip]
> > - Add movmisalign. Either ignore the fact that packed structures
> > break, or add yet another hook for "misaligned vectors must be at
> > least {-this-} aligned". This will not work for big-endian vectors,
> > and will go away once we implement array load support.
> 
> This is a new version of the patch, which adds movmisalign patterns
> for little-endian NEON, and uses a new (since the last version of the
> patch was posted) target hook (TARGET_SUPPORT_VECTOR_MISALIGNMENT) to
> describe the alignments supported by NEON.

Ping (ARM maintainers)?

Julian

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2010-06-04 12:50                   ` Julian Brown
@ 2010-06-04 16:26                     ` Richard Earnshaw
  2010-06-05 14:39                       ` Joseph S. Myers
  2010-06-07 19:09                       ` Julian Brown
  0 siblings, 2 replies; 29+ messages in thread
From: Richard Earnshaw @ 2010-06-04 16:26 UTC (permalink / raw)
  To: Julian Brown; +Cc: gcc-patches, Paul Brook


On Fri, 2010-06-04 at 13:50 +0100, Julian Brown wrote:
> On Tue, 18 May 2010 01:31:08 +0100
> Julian Brown <julian@codesourcery.com> wrote:
> 
> > Hi,
> > 
> > On Mon, 21 Dec 2009 12:20:12 +0000
> > Paul Brook <paul@codesourcery.com> wrote:
> > 
> > > On Friday 18 December 2009, Julian Brown wrote:
> > > > This is a version of the patch which doesn't attempt to resolve
> > > > the discrepancy between vector copies and vectorizing loads/stores
> > > > (thus is only intended to work in little-endian mode, leaving
> > > > big-endian mode as an open problem). So, vldr/vstr etc. will still
> > > > be used for aligned accesses, and any issues with adding semantics
> > > > to movmisalign<mode> are sidestepped.
> > > 
> > > I don't think this is correct. The original patch contained two
> > > hooks: [snip]
> > > - Add movmisalign. Either ignore the fact that packed structures
> > > > break, or add yet another hook for "misaligned vectors must be at
> > > least {-this-} aligned". This will not work for big-endian vectors,
> > > and will go away once we implement array load support.
> > 
> > This is a new version of the patch, which adds movmisalign patterns
> > for little-endian NEON, and uses a new (since the last version of the
> > patch was posted) target hook (TARGET_SUPPORT_VECTOR_MISALIGNMENT) to
> > describe the alignments supported by NEON.
> 
> Ping (ARM maintainers)?
> 
> Julian

I've no particular objection to this patch, but I can't help feeling
it's not really addressing the fundamental problem.

I think the problem we're really trying to fix is GCC's builtin
assumption about the mapping of vectors to registers (ie the order of
the lanes -- Joseph alludes to this in one of his posts on the thread)
and that fundamentally most of this is trying to paper over that
built-in assumption (it's a bit like trying to make big-endian look like
little-endian, or perhaps more accurately WORDS_BIG_ENDIAN+LITTLE_ENDIAN
look like a pure big or little-endian machine).

I haven't looked at the generic code, but my feeling is that what we
probably need to do is to break that assumption in the generic code (and
that until we do, we'll probably continue to run into corner cases that
break).  The way to do this is to have something like
VECTOR_LANES_BIG_ENDIAN as a target hook -- essentially, on ARM this
would always be false.

Can someone convince me I'm wrong?

R.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2010-06-04 16:26                     ` Richard Earnshaw
@ 2010-06-05 14:39                       ` Joseph S. Myers
  2010-06-07 19:09                       ` Julian Brown
  1 sibling, 0 replies; 29+ messages in thread
From: Joseph S. Myers @ 2010-06-05 14:39 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: Julian Brown, gcc-patches, Paul Brook

On Fri, 4 Jun 2010, Richard Earnshaw wrote:

> I think the problem we're really trying to fix is GCC's builtin
> assumption about the mapping of vectors to registers (ie the order of
> the lanes -- Joseph alludes to this in one of his posts on the thread)
> and that fundamentally most of this is trying to paper over that
> built-in assumption (it's a bit like trying to make big-endian look like
> little-endian, or perhaps more accurately WORDS_BIG_ENDIAN+LITTLE_ENDIAN
> look like a pure big or little-endian machine).
> 
> I haven't looked at the generic code, but my feeling is that what we
> probably need to do is to break that assumption in the generic code (and
> that until we do, we'll probably continue to run into corner cases that
> break).  The way to do this is to have something like
> VECTOR_LANES_BIG_ENDIAN as a target hook -- essentially, on ARM this
> would always be false.

The following are my notes on NEON vector issues in GCC (in particular for 
big-endian).  Exactly which assumption do you wish to break?  That there 
is only one way to represent a given mode in a given register or at a 
given address?

GCC Vectors
-----------

GCC has data types of the form "vector of N elements of type T".  These 
appear:

* In C and C++ language source code, through use of generic vector 
  extensions (``vector_size`` attribute).

* In the GENERIC and GIMPLE representations, either through use of those 
  extensions or through being generated by the vectorizer.

* In RTL (where instead of types you have modes for both vectors and 
  elements, so ``V2SImode`` could have either signed or unsigned 
  elements).

A value of such a data type (or mode) is an ordered N-tuple of values of 
type (or mode) T. The elements are considered to be numbered from 0 to 
N-1.  When such a type (or mode) is stored in memory, the layout is 
defined to be the same as that of an array of N elements of type (or mode) 
T, with elements of greater index being stored at greater addresses.  
(However, the required alignment of a vector type (or mode) may be larger 
than that of an array; the alignment of an array is that of the element 
type.)  Although element numbers for vectors do not appear directly in C 
and C++ source as of December 2009 (it has been proposed that the 
extensions should allow vectors to be subscripted like arrays, but parts 
of those patches have not been approved), they do appear in the various 
internal representations, where they have the above semantics.
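
As a small illustration of the layout rule above, the following sketch uses 
the generic vector extension to copy a vector's memory image into a plain 
array and read an element back by index.  The type name ``v4si`` and the 
helper ``element_via_array`` are hypothetical names chosen for this example; 
the only claim is that, for GCC generic vectors, array index and vector 
element number agree on any byte order.

```c
#include <assert.h>
#include <string.h>

/* GCC generic vector extension: four 32-bit ints in one 16-byte vector.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Return element INDEX of V as seen through the array view of its
   memory image.  Hypothetical helper, for illustration only.  */
static int
element_via_array (v4si v, int index)
{
  int elems[4];

  assert (index >= 0 && index < 4);
  /* The vector is laid out in memory like "int elems[4]".  */
  memcpy (elems, &v, sizeof elems);
  return elems[index];
}
```

For instance, ``element_via_array ((v4si) { 10, 20, 30, 40 }, 2)`` should 
yield the third initializer, whatever the target's endianness.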

In RTL, a reference to a memory location as containing a value of a 
particular mode implies some particular interpretation of the bits in that 
location as a value of that mode.  (For vectors, this must be the same as 
the interpretation for arrays.)  GCC does not have any way of handling the 
possibility that a mode may be stored at the same address in more than one 
incompatible way.  Similarly, a reference to a given hard register number 
in a particular mode implies a particular interpretation of the contents 
of that register, and possibly subsequent registers as 
indicated by ``HARD_REGNO_NREGS``, as a value of that mode.  (This can 
commonly be thought of as describing which bits of which registers 
correspond to which bits of which locations in the memory representation, 
though such a model is not adequate for how floating-point registers work 
on some processors.)  GCC cannot handle the possibility of a value of a 
given mode being represented in more than one way in a given register.

Several target macros allow a GCC port to control the relation between 
values of different modes stored in the same register.  For example, they 
can describe processors where a register with the results of a 
floating-point computation in one mode cannot then have bits of that 
result read in another mode, or where a ``!QImode`` value stored in a 
register does not use the same bits as the low bits of an ``!HImode`` 
value stored in that register.

Move patterns in the machine description need to be consistent with how 
each mode is stored in each register that can hold values of that mode, as 
do any moves the target-independent compiler may produce from moves of 
smaller modes in the absence of a move pattern for a larger mode.

NEON Vectors
------------

The NEON (Advanced SIMD) instruction set extension specifies its own 
vector types and how these are stored in registers.  As with GCC vectors, 
elements are numbered from 0 to N-1. Element 0 is stored at the "least 
significant" end of a NEON vector register, and element N-1 at the "most 
significant" end.  A NEON quad register is made up of two consecutive 
double registers (Qn made of D(2n) and D(2n+1)); the least significant 
half of a quad register is always D(2n) and the most significant half is 
always D(2n+1).  When NEON instructions refer to vector elements ("lanes") 
by number, they use this convention to determine which bits of the 
register are referenced.  (Single registers Sn are only used for VFP and 
not directly referenced as such in vector operations, so that view of the 
registers is irrelevant here; the view as D registers is the primary one.)

When NEON registers are loaded from or stored to memory using VLDM, VLDR, 
VSTM and VSTR instructions, each double register is loaded or stored as if 
it contained an integer value: the most significant end of the register 
comes from low-numbered addresses if big-endian and from high-numbered 
addresses if little-endian.  If little-endian, this means that vector 
element 0 comes from low-numbered addresses; if big-endian, it comes from 
high-numbered addresses. Because the ordering of D registers within a Q 
register does not depend on endianness, if for example a Q register of 
eight 16-bit values is loaded from memory this way, the elements will come 
from memory in the order 3, 2, 1, 0, 7, 6, 5, 4 (not 7, 6, 5, 4, 3, 2, 1, 
0) for big-endian and 0, 1, 2, 3, 4, 5, 6, 7 for little-endian.

When NEON registers are loaded from or stored to memory using VLD1 and 
VST1 instructions, each element is loaded individually from successive 
memory locations, so vector element numbers always increase in the same 
order as addresses.
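
The two load orders described above can be modelled with a little index 
arithmetic.  This is only a sketch of the description in these notes, not 
code from GCC or from the patch: the function names are hypothetical, and 
``lanes_per_d`` is assumed to be the number of elements in one 64-bit D 
register (four for 16-bit elements).

```c
#include <assert.h>
#include <stdbool.h>

/* Which vector element (NEON lane number within the register group)
   is filled from memory slot MEM_INDEX by a VLDM/VLDR-style load.
   Each D register is loaded as one 64-bit integer, so for big-endian
   the lanes within each D register appear reversed in memory, while
   the order of the D registers themselves does not change.  */
static unsigned
vldm_element_at (unsigned mem_index, unsigned lanes_per_d, bool big_endian)
{
  unsigned d_reg, slot, lane;

  assert (lanes_per_d > 0);
  d_reg = mem_index / lanes_per_d;	/* which D register */
  slot = mem_index % lanes_per_d;	/* position within it, in memory */
  lane = big_endian ? lanes_per_d - 1 - slot : slot;
  return d_reg * lanes_per_d + lane;
}

/* VLD1/VST1 load element by element, so lane number always equals the
   memory slot, for either byte order.  */
static unsigned
vld1_element_at (unsigned mem_index)
{
  return mem_index;
}
```

Calling ``vldm_element_at (i, 4, true)`` for i from 0 to 7 gives 3, 2, 1, 0, 
7, 6, 5, 4, which is the big-endian order quoted above; with ``false`` it 
gives 0 to 7 unchanged.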

The instruction set only defines how elements are numbered in registers, 
with the two different ways in which they can be loaded from and stored to 
memory.  Appendix A of AAPCS defines layout in memory for "containerized 
vector" types "as loading the specified VFP registers from an array of the 
Base Type using the Fill Operation and then storing that value to memory 
using a single VSTM of the loaded 64-bit (D) registers.". The "Fill 
Operation" is a VLD1 instruction.  Lane numbers in containerized vector 
types are relevant for intrinsics in ``arm_neon.h``; the definition in 
AAPCS means that on little-endian systems the numbering is array order, as 
for GCC vectors, but on big-endian systems the numbering is the 3, 2, 1, 
0, 7, 6, 5, 4 order discussed above.

AAPCS also defines how containerized vectors are passed to and returned 
from functions. When passed in core registers, it is defined to be as if 
the value was loaded from memory with a single LDM instruction.  When 
passed in VFP registers, it is defined to be as if a VLDR or VLDM 
instruction is used.

NEON Vectors and GCC
--------------------

GCC has generic vector modes corresponding to the various vector types 
supported by NEON. Because the element ordering of these in GCC's 
target-independent internal representations is defined differently from 
the NEON containerized vector ordering for big-endian, when 
target-independent RTL is generated from the intrinsics it is necessary 
to adjust element numbers accordingly to convert from the NEON convention 
to the GCC one.

GCC needs to define a single representation for each vector mode in each 
register that can hold values of that mode, as discussed above.  The 
representation presently defined is that resulting from the use of LDM, 
STM, VLDM, VLDR, VSTM and VSTR instructions, rather than that resulting 
from the use of VLD1 and VST1 (the two differing for big-endian only). As 
memory transfers to and from core registers do not support the VLD1/VST1 
ordering, it would be problematic to implement such move patterns if that 
ordering were used, and it would also be problematic to implement the 
AAPCS requirements for passing vectors in core registers if that ordering 
were used or if vector modes were not permitted in core registers. 
Similarly, transfers between core and NEON registers cannot convert 
between the two orderings.

As a result of the representation chosen, vector element M (in GCC terms) 
is stored in part of a vector register that is not element M of that 
register in NEON terms.  Thus, when instructions referencing lane numbers 
are generated from generic RTL, those numbers must be converted from GCC 
convention to NEON convention.  (This means code generation for intrinsics 
goes through one conversion of lane numbers in each direction.)

The VLD1 and VST1 instructions only require element alignment rather than 
the larger alignment of the vector modes.  Thus, they are in principle 
convenient for vectorizing accesses to arrays of unknown alignment.  
However, on big-endian processors the effect in generic GCC terms of a 
VLD1 instruction is not to cause the specified register, interpreted in 
the specified mode, to have the same value as the memory at the specified 
address, interpreted in the specified mode; it is to cause it to have a 
permuted copy of that value.  So for vectorization (which operates on 
GIMPLE) to use these instructions for big-endian, it needs to understand 
how a permuting load or store can be used correctly in vectorized code.  
In the common case where data are loaded from two arrays, some operation 
carried out element-wise, and the data stored to a third array, permuting 
loads and stores may be used the same way as non-permuting ones, as long 
as all loads and stores are permuting the same way.  If some operands are 
constants or otherwise do not come from an array, they must be permuted 
the same way as operands coming from an array.
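
The point about constants can be checked with a small host-side model.  
This is a sketch under stated assumptions, not vectorizer code: registers 
are modelled as plain arrays, the permutation is taken to be the reversal 
within each 64-bit half discussed earlier, and every name below is 
hypothetical.

```c
#include <assert.h>

#define NELEMS 8	/* 16-bit elements in one modelled Q register */
#define LANES_PER_D 4	/* elements in each 64-bit half */

/* Register position that memory slot I lands in under the modelled
   permuting access: order reversed within each 64-bit half.  */
static unsigned
permuted_lane (unsigned i)
{
  assert (i < NELEMS);
  return (i / LANES_PER_D) * LANES_PER_D
	 + (LANES_PER_D - 1 - i % LANES_PER_D);
}

static void
permuting_load (short reg[NELEMS], const short mem[NELEMS])
{
  unsigned i;

  for (i = 0; i < NELEMS; i++)
    reg[permuted_lane (i)] = mem[i];
}

static void
permuting_store (short mem[NELEMS], const short reg[NELEMS])
{
  unsigned i;

  for (i = 0; i < NELEMS; i++)
    mem[i] = reg[permuted_lane (i)];
}

/* Compute C from A and K, where A goes through a permuting load and K
   is a constant operand placed directly in a register.  If
   PERMUTE_CONSTANT is nonzero, K is permuted the same way as A (as a
   second array operand would be); otherwise it is left in array order.  */
static void
add_constant (short c[NELEMS], const short a[NELEMS],
	      const short k[NELEMS], int permute_constant)
{
  short ra[NELEMS], rk[NELEMS], rc[NELEMS];
  unsigned i;

  permuting_load (ra, a);
  for (i = 0; i < NELEMS; i++)
    rk[permute_constant ? permuted_lane (i) : i] = k[i];
  for (i = 0; i < NELEMS; i++)
    rc[i] = (short) (ra[i] + rk[i]);	/* element-wise operation */
  permuting_store (c, rc);
}

/* Return 1 if add_constant yields a[i] + k[i] in every memory slot.  */
static int
result_is_elementwise_sum (int permute_constant)
{
  static const short a[NELEMS] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  static const short k[NELEMS] = { 10, 20, 30, 40, 50, 60, 70, 80 };
  short c[NELEMS];
  unsigned i;

  add_constant (c, a, k, permute_constant);
  for (i = 0; i < NELEMS; i++)
    if (c[i] != a[i] + k[i])
      return 0;
  return 1;
}
```

In this model the result is the expected element-wise sum only when the 
constant is permuted along with the array operand; with the constant left 
in array order, slot 0 ends up holding a[0] + k[3].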

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2010-06-04 16:26                     ` Richard Earnshaw
  2010-06-05 14:39                       ` Joseph S. Myers
@ 2010-06-07 19:09                       ` Julian Brown
  2010-06-08 15:25                         ` Mark Mitchell
  2010-08-03 16:32                         ` Julian Brown
  1 sibling, 2 replies; 29+ messages in thread
From: Julian Brown @ 2010-06-07 19:09 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches, Paul Brook

On Fri, 04 Jun 2010 17:26:17 +0100
Richard Earnshaw <rearnsha@arm.com> wrote:

> 
> On Fri, 2010-06-04 at 13:50 +0100, Julian Brown wrote:
> > On Tue, 18 May 2010 01:31:08 +0100
> > Julian Brown <julian@codesourcery.com> wrote:
> > 
> > > Hi,
> > > 
> > > On Mon, 21 Dec 2009 12:20:12 +0000
> > > Paul Brook <paul@codesourcery.com> wrote:
> > > 
> > > > On Friday 18 December 2009, Julian Brown wrote:
> > > > > This is a version of the patch which doesn't attempt to
> > > > > resolve the discrepancy between vector copies and vectorizing
> > > > > loads/stores (thus is only intended to work in little-endian
> > > > > mode, leaving big-endian mode as an open problem). So,
> > > > > vldr/vstr etc. will still be used for aligned accesses, and
> > > > > any issues with adding semantics to movmisalign<mode> are
> > > > > sidestepped.
> > > > 
> > > > I don't think this is correct. The original patch contained two
> > > > hooks: [snip]
> > > > - Add movmisalign. Either ignore the fact that packed structures
> > > > > break, or add yet another hook for "misaligned vectors must be
> > > > at least {-this-} aligned". This will not work for big-endian
> > > > vectors, and will go away once we implement array load support.
> > > 
> > > This is a new version of the patch, which adds movmisalign
> > > patterns for little-endian NEON, and uses a new (since the last
> > > version of the patch was posted) target hook
> > > (TARGET_SUPPORT_VECTOR_MISALIGNMENT) to describe the alignments
> > > supported by NEON.
> > 
> > Ping (ARM maintainers)?
> > 
> > Julian
> 
> I've no particular objection to this patch, but I can't help feeling
> it's not really addressing the fundamental problem.
> 
> I think the problem we're really trying to fix is GCC's builtin
> assumption about the mapping of vectors to registers (ie the order of
> the lanes -- Joseph alludes to this in one of his posts on the thread)
> and that fundamentally most of this is trying to paper over that
> built-in assumption (it's a bit like trying to make big-endian look
> like little-endian, or perhaps more accurately
> WORDS_BIG_ENDIAN+LITTLE_ENDIAN look like a pure big or little-endian
> machine).

This patch actually doesn't try to do anything to address the mapping
between the memory representation of vectors and element numberings --
it just allows misaligned accesses to be used (using element
load/store instructions), albeit only in little-endian mode. I believe
this makes the autovectoriser much more useful for real-world code (i.e.
able to trigger in many more cases, and/or able to produce better
output).

Yes, there's still an assumption that elements from increasing memory
locations go in increasing lane numbers (which is only true in
little-endian mode for NEON at present), but I don't think this patch
makes things any worse. Fixing big-endian mode is another problem for
another day :-).

Cheers,

Julian

* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2010-06-07 19:09                       ` Julian Brown
@ 2010-06-08 15:25                         ` Mark Mitchell
  2010-08-03 16:32                         ` Julian Brown
  1 sibling, 0 replies; 29+ messages in thread
From: Mark Mitchell @ 2010-06-08 15:25 UTC (permalink / raw)
  To: Julian Brown; +Cc: Richard Earnshaw, gcc-patches, Paul Brook

Julian Brown wrote:

> This patch actually doesn't try to do anything to address the mapping
> between the memory representation of vectors and element numberings --
> it just allows misaligned accesses to be used (using element
> load/store instructions), albeit only in little-endian mode. I believe
> this makes the autovectoriser much more useful for real-world code (i.e.
> able to trigger in many more cases, and/or able to produce better
> output).

This seems to me to be a monotonic improvement in the capabilities of
the ARM back-end, and an accurate model of the little-endian ARM NEON
ISA.  So, while solving the general problem (which may require
significant work on the generic parts of the compiler) is certainly
desirable, it seems to me that this is a definite win, with no major
downside.

Richard, you wrote:

> I've no particular objection to this patch

Is that an approval, or would you like to discuss further?

I'm not trying to apply pressure, and, for avoidance of doubt, my
opinion above is my opinion as a (lapsed) GCC engineer, not anything
RM/SC-ish.  Just want to understand where we stand, and trying to flush
our changes...

Thanks,

-- 
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713

* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2010-06-07 19:09                       ` Julian Brown
  2010-06-08 15:25                         ` Mark Mitchell
@ 2010-08-03 16:32                         ` Julian Brown
  2010-08-04  6:38                           ` Ira Rosen
  2010-09-23  9:49                           ` Richard Earnshaw
  1 sibling, 2 replies; 29+ messages in thread
From: Julian Brown @ 2010-08-03 16:32 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Earnshaw, Paul Brook, Ira Rosen

[-- Attachment #1: Type: text/plain, Size: 4738 bytes --]

On Mon, 7 Jun 2010 20:08:48 +0100
Julian Brown <julian@codesourcery.com> wrote:

> > > > This is a new version of the patch, which adds movmisalign
> > > > patterns for little-endian NEON, and uses a new (since the last
> > > > version of the patch was posted) target hook
> > > > (TARGET_SUPPORT_VECTOR_MISALIGNMENT) to describe the alignments
> > > > supported by NEON.

The previously-posted version of this patch no longer works on current
mainline, so here's a new version which does.

Backing up to the start of the problem, since it's been a while -- this
patch adds several things to NEON support in the ARM backend:

1. Implementations of the movmisalign pattern for loading and storing
vectors which are not naturally aligned.

2. Constraint/operand printing tweaks to disallow pre-decrement for 
addresses used by the above, and allow printing of alignment specifiers
for same.

3. Implementations of TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT and
TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE, to tell the middle-end
which alignments are supported for vector loads/stores by the hardware.

4. Testsuite tweaks to specify that certain tests only require vectors
to be aligned to the natural alignment of their elements, rather than
to the full alignment of the vector type. Also tweaks to force some
tests to use the -mvectorize-with-neon-quad option.
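
For item 2, the choice of alignment specifier can be sketched as a standalone C function. This mirrors the conditions in the arm_print_operand hunk of the attached patch as posted. The function name and parameters are made up for illustration, and the sketch does not check whether each hint is valid for every instruction form.

```c
#include <assert.h>

/* Return the alignment hint, in bits, to print after the base register,
   or 0 if no hint should be printed.

   ALIGN_BYTES is the known alignment of the access in bytes (the patch
   derives it as MEM_ALIGN (x) >> 3); MODE_SIZE is the size of the
   vector mode in bytes.  */
static unsigned
neon_align_hint_bits (unsigned align_bytes, unsigned mode_size)
{
  /* Assumption: a memory reference is always at least byte-aligned.  */
  assert (align_bytes >= 1);

  if (mode_size == 16 && (align_bytes % 32) == 0)
    return 256;
  else if ((mode_size == 8 || mode_size == 16) && (align_bytes % 16) == 0)
    return 128;
  else if ((align_bytes % 8) == 0)
    return 64;

  return 0;
}
```

With these conditions, a 16-byte vector known to be 8-byte aligned gets a 64-bit hint, and an access with only element (e.g. 4-byte) alignment gets no hint at all.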

> [...] there's still an assumption that elements from increasing memory
> locations go in increasing lane numbers (which is only true in
> little-endian mode for NEON at present), but I don't think this patch
> makes things any worse. Fixing big-endian mode is another problem for
> another day :-).

This still holds, but as previously discussed, probably should not be a
sticking point for getting this patch applied.

There remains a small amount of noise in testsuite results with this
patch, i.e.:

PASS -> FAIL: mthumb-march_armv7-a-mfpu_neon-mfloat-abi_softfp/gcc.sum:gcc.dg/vect/vect-72.c scan-tree-dump-times vect "Alignment of access forced using peeling" 0

This fails because a loop containing both an unaligned load and an
unaligned store is peeled, making the load aligned. That seems to be a
valid thing to do, so I'm not sure why it's a failure.
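
For reference, peeling for alignment (the transformation named in the scanned dump message) can be sketched in plain C. This is only an illustration of the idea and not the vectorizer's implementation: the 16-byte vector size, the function names and the use of a copy loop are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed vector size in bytes and elements, for illustration.  */
enum { VEC_BYTES = 16, VEC_ELEMS = VEC_BYTES / sizeof (int32_t) };

/* Number of scalar iterations to run first so that P advanced by that
   many elements is VEC_BYTES-aligned.  Assumes P is at least
   element-aligned.  The result is capped at N.  */
static size_t
peel_count (const int32_t *p, size_t n)
{
  uintptr_t mis = (uintptr_t) p % VEC_BYTES;
  size_t peel = mis ? (VEC_BYTES - mis) / sizeof (int32_t) : 0;
  return peel < n ? peel : n;
}

static void
copy_with_peeling (int32_t *dst, const int32_t *src, size_t n)
{
  size_t i = 0;
  size_t peel = peel_count (src, n);

  /* Prologue: peeled scalar iterations.  */
  for (; i < peel; i++)
    dst[i] = src[i];

  /* Main loop: src + i is now vector-aligned, so the load could be an
     aligned vector load.  dst + i may still be misaligned, so the store
     would remain a misaligned access.  */
  for (; i + VEC_ELEMS <= n; i += VEC_ELEMS)
    for (size_t l = 0; l < VEC_ELEMS; l++)
      dst[i + l] = src[i + l];

  /* Epilogue: leftover scalar iterations.  */
  for (; i < n; i++)
    dst[i] = src[i];
}
```

Only one of the two accesses can be aligned this way when their misalignments differ, which matches the situation described above: the load becomes aligned and the store stays unaligned.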

New FAIL: mthumb-march_armv7-a-mfpu_neon-mfloat-abi_softfp/g++.sum:g++.dg/vect/pr36648.cc scan-tree-dump-times vect "vectorized 1 loops" 1
New FAIL: mthumb-march_armv7-a-mfpu_neon-mfloat-abi_softfp/g++.sum:g++.dg/vect/pr36648.cc scan-tree-dump-times vect "vectorizing stmts using SLP" 1

These were analysed in:

  http://gcc.gnu.org/ml/gcc-patches/2010-05/msg01351.html

New FAIL: mthumb-march_armv7-a-mfpu_neon-mfloat-abi_softfp/gcc.sum:gcc.dg/vect/vect-outer-4c.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1

and this in:

  http://gcc.gnu.org/ml/gcc-patches/2010-05/msg01328.html

Also several tests transition from XPASS to PASS.

Tested with cross to ARM Linux (-mthumb -march=armv7-a -mfpu=neon
-mfloat-abi=softfp), gcc/g++/libstdc++. OK to apply?

ChangeLog

    gcc/
    * expr.c (expand_assignment): Add assertion to prevent emitting null rtx for
    movmisalign pattern.
    (expand_expr_real_1): Likewise.
    * config/arm/arm.c (arm_builtin_support_vector_misalignment): New.
    (TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT): New. Use above.
    (arm_vector_alignment_reachable): New.
    (TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE): New. Use above.
    (neon_vector_mem_operand): Disallow PRE_DEC for misaligned loads.
    (arm_print_operand): Include alignment qualifier in %A.
    * config/arm/neon.md (UNSPEC_MISALIGNED_ACCESS): New constant.
    (movmisalign<mode>): New expander.
    (movmisalign<mode>_neon_store, movmisalign<mode>_neon_load): New
    insn patterns.

    gcc/testsuite/
    * gcc.dg/vect/vect-42.c: Use vect_element_align instead of
    vect_hw_misalign.
    * gcc.dg/vect/vect-60.c: Likewise.
    * gcc.dg/vect/vect-56.c: Likewise.
    * gcc.dg/vect/vect-93.c: Likewise.
    * gcc.dg/vect/no-scevccp-outer-8.c: Likewise.
    * gcc.dg/vect/vect-95.c: Likewise.
    * gcc.dg/vect/vect-96.c: Likewise.
    * gcc.dg/vect/vect-outer-5.c: Use quad-word vectors when available.
    * gcc.dg/vect/slp-25.c: Likewise.
    * gcc.dg/vect/slp-3.c: Likewise.
    * gcc.dg/vect/vect-multitypes-1.c: Likewise.
    * gcc.dg/vect/no-vfa-pr29145.c: Likewise.
    * gcc.dg/vect/vect-multitypes-4.c: Likewise. Use vect_element_align.
    * gcc.dg/vect/vect-109.c: Likewise.
    * gcc.dg/vect/vect-peel-1.c: Likewise.
    * gcc.dg/vect/vect-peel-2.c: Likewise.
    * lib/target-supports.exp
    (check_effective_target_arm_vect_no_misalign): New.
    (check_effective_target_vect_no_align): Use above.
    (check_effective_target_vect_element_align): New.
    (add_options_for_quad_vectors): New.

[-- Attachment #2: misaligned-neon-fsf-17.diff --]
[-- Type: text/x-patch, Size: 26770 bytes --]

Index: gcc/testsuite/gcc.dg/vect/vect-42.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-42.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/vect-42.c	(working copy)
@@ -64,8 +64,8 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_hw_misalign } } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || { { !  vector_alignment_reachable } || vect_hw_misalign  } } } } }  */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" { target vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || { { ! vector_alignment_reachable } || vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || { { !  vector_alignment_reachable } || vect_element_align  } } } } }  */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" { target vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || { { ! vector_alignment_reachable } || vect_element_align } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-outer-5.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-outer-5.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/vect-outer-5.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_float } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdio.h>
 #include <stdarg.h>
Index: gcc/testsuite/gcc.dg/vect/vect-60.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-60.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/vect-60.c	(working copy)
@@ -69,8 +69,8 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail { vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail { vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-109.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-109.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/vect-109.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
@@ -72,8 +73,8 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" { target vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" { target vect_element_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-peel-1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-peel-1.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/vect-peel-1.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
@@ -45,7 +46,7 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail  vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target vect_hw_misalign  } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target vect_element_align  } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail  vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-peel-2.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-peel-2.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/vect-peel-2.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
@@ -46,7 +47,7 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail  vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target vect_hw_misalign  } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target vect_element_align  } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target vect_element_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-56.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-56.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/vect-56.c	(working copy)
@@ -68,8 +68,8 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail { vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail { vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/slp-25.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-25.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/slp-25.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/vect-93.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-93.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/vect-93.c	(working copy)
@@ -72,7 +72,7 @@ int main (void)
 /* main && main1 together: */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 2 "vect" { target powerpc*-*-* i?86-*-* x86_64-*-* } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { vect_no_align && {! vector_alignment_reachable} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { { vect_no_align } || { { ! vector_alignment_reachable} || vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { { vect_no_align } || { { ! vector_alignment_reachable} || vect_element_align } } } } } */
 
 /* in main1: */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target !powerpc*-*-* !i?86-*-* !x86_64-*-* } } } */
Index: gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c	(working copy)
@@ -46,5 +46,5 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { ! { vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { ! { vect_element_align } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-95.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-95.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/vect-95.c	(working copy)
@@ -56,14 +56,14 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail {vect_hw_misalign} } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail {vect_element_align} } } } */
 
 /* For targets that support unaligned loads we version for the two unaligned 
    stores and generate misaligned accesses for the loads. For targets that 
    don't support unaligned loads we version for all four accesses.  */
 
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign} } } }  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align} } } }  */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /*  { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target vect_no_align } } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 4 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-96.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-96.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/vect-96.c	(working copy)
@@ -44,6 +44,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { {! vect_no_align} && vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || { { ! vector_alignment_reachable} || vect_hw_misalign } } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || { { ! vector_alignment_reachable} || vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} && {! vect_element_align} } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/slp-3.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-3.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/slp-3.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include <stdio.h>
Index: gcc/testsuite/gcc.dg/vect/no-vfa-pr29145.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-vfa-pr29145.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/no-vfa-pr29145.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c	(revision 162827)
+++ gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
@@ -92,9 +93,9 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail { vect_no_align } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { vect_hw_misalign}  } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 8 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { target { vect_hw_misalign  } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { vect_element_align}  } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 8 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { target { vect_element_align  } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	(revision 162827)
+++ gcc/testsuite/lib/target-supports.exp	(working copy)
@@ -1739,6 +1739,18 @@ proc check_effective_target_arm32 { } {
     }]
 }
 
+# Return 1 if this is an ARM target that only supports aligned vector accesses
+proc check_effective_target_arm_vect_no_misalign { } {
+    return [check_no_compiler_messages arm_vect_no_misalign assembly {
+	#if !defined(__arm__) \
+	    || (defined(__ARMEL__) \
+	        && (!defined(__thumb__) || defined(__thumb2__)))
+	#error FOO
+	#endif
+    }]
+}
+
+
 # Return 1 if this is an ARM target supporting -mfpu=vfp
 # -mfloat-abi=softfp.  Some multilibs may be incompatible with these
 # options.
@@ -2643,7 +2655,7 @@ proc check_effective_target_vect_no_alig
 	if { [istarget mipsisa64*-*-*]
 	     || [istarget sparc*-*-*]
 	     || [istarget ia64-*-*]
-	     || [check_effective_target_arm32] } { 
+	     || [check_effective_target_arm_vect_no_misalign] } { 
 	    set et_vect_no_align_saved 1
 	}
     }
@@ -2778,6 +2790,25 @@ proc check_effective_target_vector_align
     return $et_vector_alignment_reachable_for_64bit_saved
 }
 
+# Return 1 if the target only requires element alignment for vector accesses
+
+proc check_effective_target_vect_element_align { } {
+    global et_vect_element_align
+
+    if [info exists et_vect_element_align] {
+	verbose "check_effective_target_vect_element_align: using cached result" 2
+    } else {
+	set et_vect_element_align 0
+	if { [istarget arm*-*-*]
+	     || [check_effective_target_vect_hw_misalign] } {
+	   set et_vect_element_align 1
+	}
+    }
+
+    verbose "check_effective_target_vect_element_align: returning $et_vect_element_align" 2
+    return $et_vect_element_align
+}
+
 # Return 1 if the target supports vector conditional operations, 0 otherwise.
 
 proc check_effective_target_vect_condition { } {
@@ -3339,6 +3370,16 @@ proc add_options_for_bind_pic_locally { 
     return $flags
 }
 
+# Add to FLAGS the flags needed to enable 128-bit vectors.
+
+proc add_options_for_quad_vectors { flags } {
+    if [is-effective-target arm_neon_ok] {
+	return "$flags -mvectorize-with-neon-quad"
+    }
+
+    return $flags
+}
+
 # Return 1 if the target provides a full C99 runtime.
 
 proc check_effective_target_c99_runtime { } {
Index: gcc/expr.c
===================================================================
--- gcc/expr.c	(revision 162827)
+++ gcc/expr.c	(working copy)
@@ -4327,7 +4327,10 @@ expand_assignment (tree to, tree from, b
            && op_mode1 != VOIDmode)
          reg = copy_to_mode_reg (op_mode1, reg);
 
-      insn = GEN_FCN (icode) (mem, reg);
+       insn = GEN_FCN (icode) (mem, reg);
+       /* The movmisalign<mode> pattern cannot fail, else the assignment would
+          silently be omitted.  */
+       gcc_assert (insn != NULL_RTX);
        emit_insn (insn);
        return;
      }
@@ -8643,6 +8646,7 @@ expand_expr_real_1 (tree exp, rtx target
 
 	    /* Nor can the insn generator.  */
 	    insn = GEN_FCN (icode) (reg, temp);
+	    gcc_assert (insn != NULL_RTX);
 	    emit_insn (insn);
 
 	    return reg;
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 162827)
+++ gcc/config/arm/arm.c	(working copy)
@@ -228,6 +228,11 @@ static void arm_asm_trampoline_template 
 static void arm_trampoline_init (rtx, tree, rtx);
 static rtx arm_trampoline_adjust_address (rtx);
 static rtx arm_pic_static_addr (rtx orig, rtx reg);
+static bool arm_vector_alignment_reachable (const_tree type, bool is_packed);
+static bool arm_builtin_support_vector_misalignment (enum machine_mode mode,
+						     const_tree type,
+						     int misalignment,
+						     bool is_packed);
 
 \f
 /* Table of machine attributes.  */
@@ -518,6 +523,14 @@ static const struct attribute_spec arm_a
 #undef TARGET_CAN_ELIMINATE
 #define TARGET_CAN_ELIMINATE arm_can_eliminate
 
+#undef TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE
+#define TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE \
+  arm_vector_alignment_reachable
+
+#undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
+#define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
+  arm_builtin_support_vector_misalignment
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 \f
 /* Obstack for minipool constant handling.  */
@@ -8678,7 +8691,8 @@ neon_vector_mem_operand (rtx op, int typ
     return arm_address_register_rtx_p (ind, 0);
 
   /* Allow post-increment with Neon registers.  */
-  if (type != 1 && (GET_CODE (ind) == POST_INC || GET_CODE (ind) == PRE_DEC))
+  if ((type != 1 && GET_CODE (ind) == POST_INC)
+      || (type == 0 && GET_CODE (ind) == PRE_DEC))
     return arm_address_register_rtx_p (XEXP (ind, 0), 0);
 
   /* FIXME: vld1 allows register post-modify.  */
@@ -16131,6 +16145,8 @@ arm_print_operand (FILE *stream, rtx x, 
       {
 	rtx addr;
 	bool postinc = FALSE;
+	unsigned align, modesize, align_bits;
+
 	gcc_assert (GET_CODE (x) == MEM);
 	addr = XEXP (x, 0);
 	if (GET_CODE (addr) == POST_INC)
@@ -16138,7 +16154,29 @@ arm_print_operand (FILE *stream, rtx x, 
 	    postinc = 1;
 	    addr = XEXP (addr, 0);
 	  }
-	asm_fprintf (stream, "[%r]", REGNO (addr));
+	asm_fprintf (stream, "[%r", REGNO (addr));
+
+	/* We know the alignment of this access, so we can emit a hint in the
+	   instruction (for some alignments) as an aid to the memory subsystem
+	   of the target.  */
+	align = MEM_ALIGN (x) >> 3;
+	modesize = GET_MODE_SIZE (GET_MODE (x));
+	
+	/* Only certain alignment specifiers are supported by the hardware.  */
+	if (modesize == 16 && (align % 32) == 0)
+	  align_bits = 256;
+	else if ((modesize == 8 || modesize == 16) && (align % 16) == 0)
+	  align_bits = 128;
+	else if ((align % 8) == 0)
+	  align_bits = 64;
+	else
+	  align_bits = 0;
+	
+	if (align_bits != 0)
+	  asm_fprintf (stream, ", :%d", align_bits);
+
+	asm_fprintf (stream, "]");
+
 	if (postinc)
 	  fputs("!", stream);
       }
@@ -22472,4 +22510,43 @@ arm_have_conditional_execution (void)
   return !TARGET_THUMB1;
 }
 
+static bool
+arm_vector_alignment_reachable (const_tree type, bool is_packed)
+{
+  /* Vectors which aren't in packed structures will not be less aligned than
+     the natural alignment of their element type, so this is safe.  */
+  if (TARGET_NEON && !BYTES_BIG_ENDIAN)
+    return !is_packed;
+
+  return default_builtin_vector_alignment_reachable (type, is_packed);
+}
+
+static bool
+arm_builtin_support_vector_misalignment (enum machine_mode mode,
+					 const_tree type, int misalignment,
+					 bool is_packed)
+{
+  if (TARGET_NEON && !BYTES_BIG_ENDIAN)
+    {
+      HOST_WIDE_INT align = TYPE_ALIGN_UNIT (type);
+
+      if (is_packed)
+        return align == 1;
+
+      /* If the misalignment is unknown, we should be able to handle the access
+	 so long as it is not to a member of a packed data structure.  */
+      if (misalignment == -1)
+        return true;
+
+      /* Return true if the misalignment is a multiple of the natural alignment
+         of the vector's element type.  This is probably always going to be
+	 true in practice, since we've already established that this isn't a
+	 packed access.  */
+      return ((misalignment % align) == 0);
+    }
+  
+  return default_builtin_support_vector_misalignment (mode, type, misalignment,
+						      is_packed);
+}
+
 #include "gt-arm.h"
Index: gcc/config/arm/neon.md
===================================================================
--- gcc/config/arm/neon.md	(revision 162827)
+++ gcc/config/arm/neon.md	(working copy)
@@ -140,7 +140,8 @@
    (UNSPEC_VUZP1		201)
    (UNSPEC_VUZP2		202)
    (UNSPEC_VZIP1		203)
-   (UNSPEC_VZIP2		204)])
+   (UNSPEC_VZIP2		204)
+   (UNSPEC_MISALIGNED_ACCESS	205)])
 
 ;; Double-width vector modes.
 (define_mode_iterator VD [V8QI V4HI V2SI V2SF])
@@ -660,6 +661,52 @@
   neon_disambiguate_copy (operands, dest, src, 4);
 })
 
+(define_expand "movmisalign<mode>"
+  [(set (match_operand:VDQX 0 "nonimmediate_operand"	      "")
+	(unspec:VDQX [(match_operand:VDQX 1 "general_operand" "")]
+		     UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+{
+  /* This pattern is not permitted to fail during expansion: if both arguments
+     are non-registers (e.g. memory := constant, which can be created by the
+     auto-vectorizer), force operand 1 into a register.  */
+  if (!s_register_operand (operands[0], <MODE>mode)
+      && !s_register_operand (operands[1], <MODE>mode))
+    operands[1] = force_reg (<MODE>mode, operands[1]);
+})
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VDX 0 "memory_operand"		       "=Um")
+	(unspec:VDX [(match_operand:VDX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vst1.<V_sz_elem>\t{%P1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VDX 0 "s_register_operand"	   "=w")
+	(unspec:VDX [(match_operand:VDX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%P0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VQX 0 "memory_operand"		       "=Um")
+	(unspec:VQX [(match_operand:VQX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vst1.<V_sz_elem>\t{%q1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VQX 0 "s_register_operand"	   "=w")
+	(unspec:VQX [(match_operand:VQX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%q0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
 (define_insn "vec_set<mode>_internal"
   [(set (match_operand:VD 0 "s_register_operand" "=w")
         (vec_merge:VD


* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2010-08-03 16:32                         ` Julian Brown
@ 2010-08-04  6:38                           ` Ira Rosen
  2010-09-23  9:49                           ` Richard Earnshaw
  1 sibling, 0 replies; 29+ messages in thread
From: Ira Rosen @ 2010-08-04  6:38 UTC (permalink / raw)
  To: Julian Brown; +Cc: gcc-patches, Paul Brook, Richard Earnshaw



Julian Brown <julian@codesourcery.com> wrote on 03/08/2010 07:32:00 PM:

> There remains a small amount of noise in testsuite results with this
> patch, i.e.:
>
> PASS -> FAIL: mthumb-march_armv7-a-mfpu_neon-mfloat-abi_softfp/gcc.sum:gcc.dg/vect/vect-72.c
> scan-tree-dump-times vect "Alignment of access forced using peeling" 0
>
> This fails because a loop containing both an unaligned load and an
> unaligned store is unpeeled, making the load aligned. It seems to be
> a valid thing to do, so I'm not sure why it's a failure.

The store is supposed to be aligned, and the test checks how we handle
the unaligned load. If peeling is somehow done for the load, leaving the
store unaligned, that is a valid thing to do; just make sure that it is
also reasonable for the target.
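
To make the trade-off concrete, here is a small standalone sketch of how a
peel count chosen for one access affects the other. It is not the
vectorizer's code: the helper names (peel_iterations, aligned_after_peel)
and the example addresses are made up for illustration, and it assumes a
power-of-two vector size and element-aligned addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Number of scalar iterations to peel so that an access starting at
   byte address ADDR reaches a VEC_BYTES boundary, advancing ELEM_BYTES
   per iteration.  Assumes VEC_BYTES is a power of two, ELEM_BYTES
   divides VEC_BYTES, and ADDR is element-aligned (hypothetical helper,
   not a GCC function).  */
static unsigned
peel_iterations (uintptr_t addr, unsigned elem_bytes, unsigned vec_bytes)
{
  assert (vec_bytes != 0 && (vec_bytes & (vec_bytes - 1)) == 0);
  assert (elem_bytes != 0 && vec_bytes % elem_bytes == 0);
  assert (addr % elem_bytes == 0);

  unsigned misalign = (unsigned) (addr & (vec_bytes - 1));
  return ((vec_bytes - misalign) & (vec_bytes - 1)) / elem_bytes;
}

/* Nonzero if an access starting at ADDR is vector-aligned once PEEL
   scalar iterations have been executed.  */
static int
aligned_after_peel (uintptr_t addr, unsigned peel, unsigned elem_bytes,
		    unsigned vec_bytes)
{
  return ((addr + (uintptr_t) peel * elem_bytes) % vec_bytes) == 0;
}
```

With 4-byte elements and 16-byte vectors, a load starting at 0x1004 needs
three peeled iterations; applying those same three iterations to a store
that started aligned at 0x2000 leaves the store misaligned, which is the
situation described above.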

Ira



* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2010-08-03 16:32                         ` Julian Brown
  2010-08-04  6:38                           ` Ira Rosen
@ 2010-09-23  9:49                           ` Richard Earnshaw
  2010-10-04 15:00                             ` Julian Brown
  1 sibling, 1 reply; 29+ messages in thread
From: Richard Earnshaw @ 2010-09-23  9:49 UTC (permalink / raw)
  To: Julian Brown; +Cc: gcc-patches, Richard Earnshaw, Paul Brook, Ira Rosen

On 03/08/10 17:32, Julian Brown wrote:
> On Mon, 7 Jun 2010 20:08:48 +0100
> Julian Brown <julian@codesourcery.com> wrote:
> 
>>>>> This is a new version of the patch, which adds movmisalign
>>>>> patterns for little-endian NEON, and uses a new (since the last
>>>>> version of the patch was posted) target hook
>>>>> (TARGET_SUPPORT_VECTOR_MISALIGNMENT) to describe the alignments
>>>>> supported by NEON.
> 
> The previously-posted version of this patch no longer works on current
> mainline, so here's a new version which does.
> 
> Backing up to the start of the problem, since it's been a while -- this
> patch adds several things to NEON support in the ARM backend:
> 
> 1. Implementations of the movmisalign pattern for loading and storing
> vectors which are not naturally aligned.
> 
> 2. Constraint/operand printing tweaks to disallow pre-decrement for 
> addresses used by the above, and allow printing of alignment specifiers
> for same.
> 
> 3. Implementations of TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT and
> TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE, to tell the middle-end
> which alignments are supported for vector loads/stores by the hardware.
> 
> 4. Testsuite tweaks to specify that certain tests only require vectors
> to be aligned to the natural alignment of their elements, but not
> necessarily less than that. Also tweaks to force some tests to use the
> -mvectorize-with-neon-quad option.
> 
>> [...] there's still an assumption that elements from increasing memory
>> locations go in increasing lane numbers (which is only true in
>> little-endian mode for NEON at present), but I don't think this patch
>> makes things any worse. Fixing big-endian mode is another problem for
>> another day :-).
> 
> This still holds, but as previously discussed, probably should not be a
> sticking point for getting this patch applied.
> 
> There remains a small amount of noise in testsuite results with this
> patch, i.e.:
> 
> PASS -> FAIL: mthumb-march_armv7-a-mfpu_neon-mfloat-abi_softfp/gcc.sum:gcc.dg/vect/vect-72.c
> scan-tree-dump-times vect "Alignment of access forced using peeling" 0
> 
> This fails because a loop containing both an unaligned load and an
> unaligned store is unpeeled, making the load aligned. It seems to be a
> valid thing to do, so I'm not sure why it's a failure.
> 
> New FAIL: mthumb-march_armv7-a-mfpu_neon-mfloat-abi_softfp/g++.sum:g++.dg/vect/pr36648.cc scan-tree-dump-times vect "vectorized 1 loops" 1
> New FAIL: mthumb-march_armv7-a-mfpu_neon-mfloat-abi_softfp/g++.sum:g++.dg/vect/pr36648.cc scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> 
> These were analysed in:
> 
>   http://gcc.gnu.org/ml/gcc-patches/2010-05/msg01351.html
> 
> New FAIL: mthumb-march_armv7-a-mfpu_neon-mfloat-abi_softfp/gcc.sum:gcc.dg/vect/vect-outer-4c.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> 
> and this in:
> 
>   http://gcc.gnu.org/ml/gcc-patches/2010-05/msg01328.html
> 
> Also several tests transition from XPASS to PASS.
> 
> Tested with cross to ARM Linux (-mthumb -march=armv7-a -mfpu=neon
> -mfloat-abi=softfp), gcc/g++/libstdc++. OK to apply?
> 
> ChangeLog
> 
>     gcc/
>     * expr.c (expand_assignment): Add assertion to prevent emitting null rtx for
>     movmisalign pattern.
>     (expand_expr_real_1): Likewise.
>     * config/arm/arm.c (arm_builtin_support_vector_misalignment): New.
>     (TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT): New. Use above.
>     (arm_vector_alignment_reachable): New.
>     (TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE): New. Use above.
>     (neon_vector_mem_operand): Disallow PRE_DEC for misaligned loads.
>     (arm_print_operand): Include alignment qualifier in %A.
>     * config/arm/neon.md (UNSPEC_MISALIGNED_ACCESS): New constant.
>     (movmisalign<mode>): New expander.
>     (movmisalign<mode>_neon_store, movmisalign<mode>_neon_load): New
>     insn patterns.
> 
>     gcc/testsuite/
>     * gcc.dg/vect/vect-42.c: Use vect_element_align instead of
>     vect_hw_misalign.
>     * gcc.dg/vect/vect-60.c: Likewise.
>     * gcc.dg/vect/vect-56.c: Likewise.
>     * gcc.dg/vect/vect-93.c: Likewise.
>     * gcc.dg/vect/no-scevccp-outer-8.c: Likewise.
>     * gcc.dg/vect/vect-95.c: Likewise.
>     * gcc.dg/vect/vect-96.c: Likewise.
>     * gcc.dg/vect/vect-outer-5.c: Use quad-word vectors when available.
>     * gcc.dg/vect/slp-25.c: Likewise.
>     * gcc.dg/vect/slp-3.c: Likewise.
>     * gcc.dg/vect/vect-multitypes-1.c: Likewise.
>     * gcc.dg/vect/no-vfa-pr29145.c: Likewise.
>     * gcc.dg/vect/vect-multitypes-4.c: Likewise. Use vect_element_align.
>     * gcc.dg/vect/vect-109.c: Likewise.
>     * gcc.dg/vect/vect-peel-1.c: Likewise.
>     * gcc.dg/vect/vect-peel-2.c: Likewise.
>     * lib/target-supports.exp
>     (check_effective_target_arm_vect_no_misalign): New.
>     (check_effective_target_vect_no_align): Use above.
>     (check_effective_target_vect_element_align): New.
>     (add_options_for_quad_vectors): New.


I've spent a long time pondering this patch and I'm still not entirely
happy that forcing the vectorizer to pretend these operations are
unaligned is the correct way to specify this, but I must admit that I
can't see a reasonable alternative at the moment that isn't
significantly less pleasant in some other respect.  So this is OK apart
from:

+	if (align_bits != 0)
+	  asm_fprintf (stream, ", :%d", align_bits);

The comma is incorrect in the alignment syntax.  The correct form is
[Rn:align].  That is, the ':' is a direct replacement for '@' in the
strict UAL form.
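
As a rough illustration of the intended output (a standalone sketch, not
the arm_print_operand code itself), the hint selection from the hunk above
can be paired with a formatter that emits the qualifier without the comma.
The function names neon_align_hint_bits and format_neon_address are
hypothetical, and the register name is passed as a plain string rather
than via REGNO.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Pick the alignment hint (in bits) for an access of MODESIZE bytes
   whose address is known to be ALIGN_BYTES-aligned; 0 means "emit no
   hint".  Follows the selection in the patch hunk under review.  */
static unsigned
neon_align_hint_bits (unsigned align_bytes, unsigned modesize)
{
  if (modesize == 16 && (align_bytes % 32) == 0)
    return 256;
  if ((modesize == 8 || modesize == 16) && (align_bytes % 16) == 0)
    return 128;
  if ((align_bytes % 8) == 0)
    return 64;
  return 0;
}

/* Format "[<reg>]" or "[<reg>:<bits>]" into BUF, followed by "!" when
   POSTINC is nonzero.  Returns the value returned by snprintf.  */
static int
format_neon_address (char *buf, size_t size, const char *reg,
		     unsigned align_bits, int postinc)
{
  assert (buf != NULL && size > 0);
  assert (reg != NULL && strlen (reg) > 0);

  if (align_bits != 0)
    return snprintf (buf, size, "[%s:%u]%s", reg, align_bits,
		     postinc ? "!" : "");
  return snprintf (buf, size, "[%s]%s", reg, postinc ? "!" : "");
}
```

For a 16-byte access known to be 16-byte aligned this yields "[r0:128]"
rather than "[r0, :128]"; when no supported hint applies, only "[r0]" is
printed.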

I hope I'm not going to regret this... :-)

R.




* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2010-09-23  9:49                           ` Richard Earnshaw
@ 2010-10-04 15:00                             ` Julian Brown
  2010-10-07 13:06                               ` Ramana Radhakrishnan
  0 siblings, 1 reply; 29+ messages in thread
From: Julian Brown @ 2010-10-04 15:00 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches, Richard Earnshaw, Paul Brook, Ira Rosen

[-- Attachment #1: Type: text/plain, Size: 7876 bytes --]

On Wed, 22 Sep 2010 23:54:38 +0100
Richard Earnshaw <Richard.Earnshaw@buzzard.freeserve.co.uk> wrote:

> On 03/08/10 17:32, Julian Brown wrote:
> > On Mon, 7 Jun 2010 20:08:48 +0100
> > Julian Brown <julian@codesourcery.com> wrote:
> > 
> >>>>> This is a new version of the patch, which adds movmisalign
> >>>>> patterns for little-endian NEON, and uses a new (since the last
> >>>>> version of the patch was posted) target hook
> >>>>> (TARGET_SUPPORT_VECTOR_MISALIGNMENT) to describe the alignments
> >>>>> supported by NEON.
> > 
> > The previously-posted version of this patch no longer works on
> > current mainline, so here's a new version which does.
> > 
> > Backing up to the start of the problem, since it's been a while --
> > this patch adds several things to NEON support in the ARM backend:
> > 
> > 1. Implementations of the movmisalign pattern for loading and
> > storing vectors which are not naturally aligned.
> > 
> > 2. Constraint/operand printing tweaks to disallow pre-decrement for 
> > addresses used by the above, and allow printing of alignment
> > specifiers for same.
> > 
> > 3. Implementations of TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
> > and TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE, to tell the
> > middle-end which alignments are supported for vector loads/stores
> > by the hardware.
> > 
> > 4. Testsuite tweaks to specify that certain tests only require
> > vectors to be aligned to the natural alignment of their elements,
> > but not necessarily less than that. Also tweaks to force some tests
> > to use the -mvectorize-with-neon-quad option.
> > 
> >> [...] there's still an assumption that elements from increasing
> >> memory locations go in increasing lane numbers (which is only true
> >> in little-endian mode for NEON at present), but I don't think this
> >> patch makes things any worse. Fixing big-endian mode is another
> >> problem for another day :-).
> > 
> > This still holds, but as previously discussed, probably should not
> > be a sticking point for getting this patch applied.
> > 
> > There remains a small amount of noise in testsuite results with this
> > patch, i.e.:
> > 
> > PASS -> FAIL:
> > mthumb-march_armv7-a-mfpu_neon-mfloat-abi_softfp/gcc.sum:gcc.dg/vect/vect-72.c
> > scan-tree-dump-times vect "Alignment of access forced using peeling" 0
> > 
> > This fails because a loop containing both an unaligned load and an
> > unaligned store is unpeeled, making the load aligned. It seems to
> > be a valid thing to do, so I'm not sure why it's a failure.
> > 
> > New FAIL:
> > mthumb-march_armv7-a-mfpu_neon-mfloat-abi_softfp/g++.sum:g++.dg/vect/pr36648.cc
> > scan-tree-dump-times vect "vectorized 1 loops" 1
> > New FAIL:
> > mthumb-march_armv7-a-mfpu_neon-mfloat-abi_softfp/g++.sum:g++.dg/vect/pr36648.cc
> > scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> > 
> > These were analysed in:
> > 
> >   http://gcc.gnu.org/ml/gcc-patches/2010-05/msg01351.html
> > 
> > New FAIL:
> > mthumb-march_armv7-a-mfpu_neon-mfloat-abi_softfp/gcc.sum:gcc.dg/vect/vect-outer-4c.c
> > scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> > 
> > and this in:
> > 
> >   http://gcc.gnu.org/ml/gcc-patches/2010-05/msg01328.html
> > 
> > Also several tests transition from XPASS to PASS.
> > 
> > Tested with cross to ARM Linux (-mthumb -march=armv7-a -mfpu=neon
> > -mfloat-abi=softfp), gcc/g++/libstdc++. OK to apply?
> > 
> > ChangeLog
> > 
> >     gcc/
> >     * expr.c (expand_assignment): Add assertion to prevent emitting
> >     null rtx for movmisalign pattern.
> >     (expand_expr_real_1): Likewise.
> >     * config/arm/arm.c (arm_builtin_support_vector_misalignment): New.
> >     (TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT): New. Use above.
> >     (arm_vector_alignment_reachable): New.
> >     (TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE): New. Use above.
> >     (neon_vector_mem_operand): Disallow PRE_DEC for misaligned loads.
> >     (arm_print_operand): Include alignment qualifier in %A.
> >     * config/arm/neon.md (UNSPEC_MISALIGNED_ACCESS): New constant.
> >     (movmisalign<mode>): New expander.
> >     (movmisalign<mode>_neon_store, movmisalign<mode>_neon_load): New
> >     insn patterns.
> > 
> >     gcc/testsuite/
> >     * gcc.dg/vect/vect-42.c: Use vect_element_align instead of
> >     vect_hw_misalign.
> >     * gcc.dg/vect/vect-60.c: Likewise.
> >     * gcc.dg/vect/vect-56.c: Likewise.
> >     * gcc.dg/vect/vect-93.c: Likewise.
> >     * gcc.dg/vect/no-scevccp-outer-8.c: Likewise.
> >     * gcc.dg/vect/vect-95.c: Likewise.
> >     * gcc.dg/vect/vect-96.c: Likewise.
> >     * gcc.dg/vect/vect-outer-5.c: Use quad-word vectors when available.
> >     * gcc.dg/vect/slp-25.c: Likewise.
> >     * gcc.dg/vect/slp-3.c: Likewise.
> >     * gcc.dg/vect/vect-multitypes-1.c: Likewise.
> >     * gcc.dg/vect/no-vfa-pr29145.c: Likewise.
> >     * gcc.dg/vect/vect-multitypes-4.c: Likewise. Use vect_element_align.
> >     * gcc.dg/vect/vect-109.c: Likewise.
> >     * gcc.dg/vect/vect-peel-1.c: Likewise.
> >     * gcc.dg/vect/vect-peel-2.c: Likewise.
> >     * lib/target-supports.exp
> >     (check_effective_target_arm_vect_no_misalign): New.
> >     (check_effective_target_vect_no_align): Use above.
> >     (check_effective_target_vect_element_align): New.
> >     (add_options_for_quad_vectors): New.
> 
> 
> I've spent a long time pondering this patch and I'm still not entirely
> happy that forcing the vectorizer to pretend these operations are
> unaligned is the correct way to specify this, but I must admit that I
> can't see a reasonable alternative at the moment that isn't
> significantly less pleasant in some other respect.  So this is OK
> apart from:
> 
> +	if (align_bits != 0)
> +	  asm_fprintf (stream, ", :%d", align_bits);
> 
> The comma is incorrect in the alignment syntax.  The correct form is
> [Rn:align].  That is, the ':' is a direct replacement for '@' in the
> strict UAL form.

Fixed.

Here's the version I'm about to commit, re-tested lightly. It only
differs in trivial ways from the previously-posted version, in order to
apply to current mainline.

Thanks,

Julian

ChangeLog

    gcc/
    * expr.c (expand_assignment): Add assertion to prevent emitting
    null rtx for movmisalign pattern.
    (expand_expr_real_1): Likewise.
    * config/arm/arm.c (arm_builtin_support_vector_misalignment): New.
    (TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT): New. Use above.
    (arm_vector_alignment_reachable): New.
    (TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE): New. Use above.
    (neon_vector_mem_operand): Disallow PRE_DEC for misaligned loads.
    (arm_print_operand): Include alignment qualifier in %A.
    * config/arm/neon.md (UNSPEC_MISALIGNED_ACCESS): New constant.
    (movmisalign<mode>): New expander.
    (movmisalign<mode>_neon_store, movmisalign<mode>_neon_load): New
    insn patterns.

    gcc/testsuite/
    * gcc.dg/vect/vect-42.c: Use vect_element_align instead of
    vect_hw_misalign.
    * gcc.dg/vect/vect-60.c: Likewise.
    * gcc.dg/vect/vect-56.c: Likewise.
    * gcc.dg/vect/vect-93.c: Likewise.
    * gcc.dg/vect/no-scevccp-outer-8.c: Likewise.
    * gcc.dg/vect/vect-95.c: Likewise.
    * gcc.dg/vect/vect-96.c: Likewise.
    * gcc.dg/vect/vect-outer-5.c: Use quad-word vectors when available.
    * gcc.dg/vect/slp-25.c: Likewise.
    * gcc.dg/vect/slp-3.c: Likewise.
    * gcc.dg/vect/vect-multitypes-1.c: Likewise.
    * gcc.dg/vect/no-vfa-pr29145.c: Likewise.
    * gcc.dg/vect/vect-multitypes-4.c: Likewise. Use vect_element_align.
    * gcc.dg/vect/vect-109.c: Likewise.
    * gcc.dg/vect/vect-peel-1.c: Likewise.
    * gcc.dg/vect/vect-peel-2.c: Likewise.
    * lib/target-supports.exp
    (check_effective_target_arm_vect_no_misalign): New.
    (check_effective_target_vect_no_align): Use above.
    (check_effective_target_vect_element_align): New.
    (add_options_for_quad_vectors): New.
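
For anyone reading only the ChangeLog, the decision made by
arm_builtin_support_vector_misalignment in the previously posted patch can
be modelled in isolation. This is a minimal sketch under simplifying
assumptions: it covers only the little-endian NEON case, the name
neon_misalignment_ok is hypothetical, and a plain int ALIGN stands in for
TYPE_ALIGN_UNIT (type).

```c
#include <assert.h>
#include <stdbool.h>

/* Standalone model of the misalignment query for little-endian NEON.
   ALIGN is the alignment in bytes taken from the type passed to the
   hook; MISALIGNMENT is the known byte offset from full vector
   alignment, or -1 if it is unknown.  */
static bool
neon_misalignment_ok (int align, int misalignment, bool is_packed)
{
  assert (align > 0);
  assert (misalignment >= -1);

  /* Members of packed structures are only safe for byte-aligned types.  */
  if (is_packed)
    return align == 1;

  /* Unknown misalignment is acceptable for non-packed accesses.  */
  if (misalignment == -1)
    return true;

  /* Otherwise the offset must be a multiple of the type's alignment.  */
  return (misalignment % align) == 0;
}
```

So a non-packed access with 4-byte alignment is accepted at offsets 0, 4, 8
and 12, or when the offset is unknown, and rejected at an offset of 2.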

[-- Attachment #2: misaligned-neon-fsf-20.diff --]
[-- Type: text/x-patch, Size: 26694 bytes --]

Index: gcc/testsuite/gcc.dg/vect/vect-42.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-42.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/vect-42.c	(working copy)
@@ -64,8 +64,8 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_hw_misalign } } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || { { !  vector_alignment_reachable } || vect_hw_misalign  } } } } }  */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" { target vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || { { ! vector_alignment_reachable } || vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || { { !  vector_alignment_reachable } || vect_element_align  } } } } }  */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" { target vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || { { ! vector_alignment_reachable } || vect_element_align } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-outer-5.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-outer-5.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/vect-outer-5.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_float } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdio.h>
 #include <stdarg.h>
Index: gcc/testsuite/gcc.dg/vect/vect-60.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-60.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/vect-60.c	(working copy)
@@ -69,8 +69,8 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail { vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail { vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-109.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-109.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/vect-109.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
@@ -72,8 +73,8 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" { target vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" { target vect_element_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc/testsuite/gcc.dg/vect/vect-peel-1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-peel-1.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/vect-peel-1.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
@@ -45,7 +46,7 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail  vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target vect_hw_misalign  } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target vect_element_align  } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail  vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-peel-2.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-peel-2.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/vect-peel-2.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
@@ -46,7 +47,7 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail  vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target vect_hw_misalign  } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target vect_element_align  } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target vect_element_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-56.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-56.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/vect-56.c	(working copy)
@@ -68,8 +68,8 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail { vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail { vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/slp-25.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-25.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/slp-25.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/vect-93.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-93.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/vect-93.c	(working copy)
@@ -72,7 +72,7 @@ int main (void)
 /* main && main1 together: */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 2 "vect" { target powerpc*-*-* i?86-*-* x86_64-*-* } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { vect_no_align && {! vector_alignment_reachable} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { { vect_no_align } || { { ! vector_alignment_reachable} || vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { { vect_no_align } || { { ! vector_alignment_reachable} || vect_element_align } } } } } */
 
 /* in main1: */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target !powerpc*-*-* !i?86-*-* !x86_64-*-* } } } */
Index: gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c	(working copy)
@@ -46,5 +46,5 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { ! { vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { ! { vect_element_align } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-95.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-95.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/vect-95.c	(working copy)
@@ -56,14 +56,14 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail {vect_hw_misalign} } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail {vect_element_align} } } } */
 
 /* For targets that support unaligned loads we version for the two unaligned 
    stores and generate misaligned accesses for the loads. For targets that 
    don't support unaligned loads we version for all four accesses.  */
 
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign} } } }  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align} } } }  */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /*  { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target vect_no_align } } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 4 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-96.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-96.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/vect-96.c	(working copy)
@@ -44,6 +44,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { {! vect_no_align} && vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || { { ! vector_alignment_reachable} || vect_hw_misalign } } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || { { ! vector_alignment_reachable} || vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} && {! vect_element_align} } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/slp-3.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-3.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/slp-3.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include <stdio.h>
Index: gcc/testsuite/gcc.dg/vect/no-vfa-pr29145.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/no-vfa-pr29145.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/no-vfa-pr29145.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
Index: gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c	(revision 164939)
+++ gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c	(working copy)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-add-options quad_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
@@ -92,9 +93,9 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail { vect_no_align } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { vect_hw_misalign}  } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 8 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { target { vect_hw_misalign  } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { vect_element_align}  } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 8 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { target { vect_element_align  } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	(revision 164939)
+++ gcc/testsuite/lib/target-supports.exp	(working copy)
@@ -1813,6 +1813,18 @@ proc check_effective_target_arm32 { } {
     }]
 }
 
+# Return 1 if this is an ARM target that only supports aligned vector accesses
+proc check_effective_target_arm_vect_no_misalign { } {
+    return [check_no_compiler_messages arm_vect_no_misalign assembly {
+	#if !defined(__arm__) \
+	    || (defined(__ARMEL__) \
+	        && (!defined(__thumb__) || defined(__thumb2__)))
+	#error FOO
+	#endif
+    }]
+}
+
+
 # Return 1 if this is an ARM target supporting -mfpu=vfp
 # -mfloat-abi=softfp.  Some multilibs may be incompatible with these
 # options.
@@ -2776,7 +2788,7 @@ proc check_effective_target_vect_no_alig
 	if { [istarget mipsisa64*-*-*]
 	     || [istarget sparc*-*-*]
 	     || [istarget ia64-*-*]
-	     || [check_effective_target_arm32]
+	     || [check_effective_target_arm_vect_no_misalign]
 	     || ([istarget mips*-*-*]
 		 && [check_effective_target_mips_loongson]) } {
 	    set et_vect_no_align_saved 1
@@ -2913,6 +2925,25 @@ proc check_effective_target_vector_align
     return $et_vector_alignment_reachable_for_64bit_saved
 }
 
+# Return 1 if the target only requires element alignment for vector accesses
+
+proc check_effective_target_vect_element_align { } {
+    global et_vect_element_align
+
+    if [info exists et_vect_element_align] {
+	verbose "check_effective_target_vect_element_align: using cached result" 2
+    } else {
+	set et_vect_element_align 0
+	if { [istarget arm*-*-*]
+	     || [check_effective_target_vect_hw_misalign] } {
+	   set et_vect_element_align 1
+	}
+    }
+
+    verbose "check_effective_target_vect_element_align: returning $et_vect_element_align" 2
+    return $et_vect_element_align
+}
+
 # Return 1 if the target supports vector conditional operations, 0 otherwise.
 
 proc check_effective_target_vect_condition { } {
@@ -3480,6 +3511,16 @@ proc add_options_for_bind_pic_locally { 
     return $flags
 }
 
+# Add to FLAGS the flags needed to enable 128-bit vectors.
+
+proc add_options_for_quad_vectors { flags } {
+    if [is-effective-target arm_neon_ok] {
+	return "$flags -mvectorize-with-neon-quad"
+    }
+
+    return $flags
+}
+
 # Return 1 if the target provides a full C99 runtime.
 
 proc check_effective_target_c99_runtime { } {
Index: gcc/expr.c
===================================================================
--- gcc/expr.c	(revision 164939)
+++ gcc/expr.c	(working copy)
@@ -4223,6 +4223,9 @@ expand_assignment (tree to, tree from, b
 	reg = copy_to_mode_reg (op_mode1, reg);
 
       insn = GEN_FCN (icode) (mem, reg);
+      /* The movmisalign<mode> pattern cannot fail, else the assignment would
+         silently be omitted.  */
+      gcc_assert (insn != NULL_RTX);
       emit_insn (insn);
       return;
     }
@@ -8674,6 +8677,7 @@ expand_expr_real_1 (tree exp, rtx target
 
 	    /* Nor can the insn generator.  */
 	    insn = GEN_FCN (icode) (reg, temp);
+	    gcc_assert (insn != NULL_RTX);
 	    emit_insn (insn);
 
 	    return reg;
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 164939)
+++ gcc/config/arm/arm.c	(working copy)
@@ -242,6 +242,11 @@ static bool cortex_a9_sched_adjust_cost 
 static bool xscale_sched_adjust_cost (rtx, rtx, rtx, int *);
 static unsigned int arm_units_per_simd_word (enum machine_mode);
 static bool arm_class_likely_spilled_p (reg_class_t);
+static bool arm_vector_alignment_reachable (const_tree type, bool is_packed);
+static bool arm_builtin_support_vector_misalignment (enum machine_mode mode,
+						     const_tree type,
+						     int misalignment,
+						     bool is_packed);
 
 \f
 /* Table of machine attributes.  */
@@ -557,6 +562,14 @@ static const struct attribute_spec arm_a
 #undef TARGET_CLASS_LIKELY_SPILLED_P
 #define TARGET_CLASS_LIKELY_SPILLED_P arm_class_likely_spilled_p
 
+#undef TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE
+#define TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE \
+  arm_vector_alignment_reachable
+
+#undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
+#define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
+  arm_builtin_support_vector_misalignment
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 \f
 /* Obstack for minipool constant handling.  */
@@ -8834,7 +8847,8 @@ neon_vector_mem_operand (rtx op, int typ
     return arm_address_register_rtx_p (ind, 0);
 
   /* Allow post-increment with Neon registers.  */
-  if (type != 1 && (GET_CODE (ind) == POST_INC || GET_CODE (ind) == PRE_DEC))
+  if ((type != 1 && GET_CODE (ind) == POST_INC)
+      || (type == 0 && GET_CODE (ind) == PRE_DEC))
     return arm_address_register_rtx_p (XEXP (ind, 0), 0);
 
   /* FIXME: vld1 allows register post-modify.  */
@@ -16317,6 +16331,8 @@ arm_print_operand (FILE *stream, rtx x, 
       {
 	rtx addr;
 	bool postinc = FALSE;
+	unsigned align, modesize, align_bits;
+
 	gcc_assert (GET_CODE (x) == MEM);
 	addr = XEXP (x, 0);
 	if (GET_CODE (addr) == POST_INC)
@@ -16324,7 +16340,29 @@ arm_print_operand (FILE *stream, rtx x, 
 	    postinc = 1;
 	    addr = XEXP (addr, 0);
 	  }
-	asm_fprintf (stream, "[%r]", REGNO (addr));
+	asm_fprintf (stream, "[%r", REGNO (addr));
+
+	/* We know the alignment of this access, so we can emit a hint in the
+	   instruction (for some alignments) as an aid to the memory subsystem
+	   of the target.  */
+	align = MEM_ALIGN (x) >> 3;
+	modesize = GET_MODE_SIZE (GET_MODE (x));
+	
+	/* Only certain alignment specifiers are supported by the hardware.  */
+	if (modesize == 16 && (align % 32) == 0)
+	  align_bits = 256;
+	else if ((modesize == 8 || modesize == 16) && (align % 16) == 0)
+	  align_bits = 128;
+	else if ((align % 8) == 0)
+	  align_bits = 64;
+	else
+	  align_bits = 0;
+	
+	if (align_bits != 0)
+	  asm_fprintf (stream, ":%d", align_bits);
+
+	asm_fprintf (stream, "]");
+
 	if (postinc)
 	  fputs("!", stream);
       }
@@ -23145,4 +23183,43 @@ arm_expand_sync (enum machine_mode mode,
     }
 }
 
+static bool
+arm_vector_alignment_reachable (const_tree type, bool is_packed)
+{
+  /* Vectors which aren't in packed structures will not be less aligned than
+     the natural alignment of their element type, so this is safe.  */
+  if (TARGET_NEON && !BYTES_BIG_ENDIAN)
+    return !is_packed;
+
+  return default_builtin_vector_alignment_reachable (type, is_packed);
+}
+
+static bool
+arm_builtin_support_vector_misalignment (enum machine_mode mode,
+					 const_tree type, int misalignment,
+					 bool is_packed)
+{
+  if (TARGET_NEON && !BYTES_BIG_ENDIAN)
+    {
+      HOST_WIDE_INT align = TYPE_ALIGN_UNIT (type);
+
+      if (is_packed)
+        return align == 1;
+
+      /* If the misalignment is unknown, we should be able to handle the access
+	 so long as it is not to a member of a packed data structure.  */
+      if (misalignment == -1)
+        return true;
+
+      /* Return true if the misalignment is a multiple of the natural alignment
+         of the vector's element type.  This is probably always going to be
+	 true in practice, since we've already established that this isn't a
+	 packed access.  */
+      return ((misalignment % align) == 0);
+    }
+  
+  return default_builtin_support_vector_misalignment (mode, type, misalignment,
+						      is_packed);
+}
+
 #include "gt-arm.h"
Index: gcc/config/arm/neon.md
===================================================================
--- gcc/config/arm/neon.md	(revision 164939)
+++ gcc/config/arm/neon.md	(working copy)
@@ -141,6 +141,7 @@
    (UNSPEC_VUZP2		202)
    (UNSPEC_VZIP1		203)
    (UNSPEC_VZIP2		204)
+   (UNSPEC_MISALIGNED_ACCESS	205)
    (UNSPEC_VCLE			206)
    (UNSPEC_VCLT			207)])
 
@@ -369,6 +370,52 @@
   neon_disambiguate_copy (operands, dest, src, 4);
 })
 
+(define_expand "movmisalign<mode>"
+  [(set (match_operand:VDQX 0 "nonimmediate_operand"	      "")
+	(unspec:VDQX [(match_operand:VDQX 1 "general_operand" "")]
+		     UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+{
+  /* This pattern is not permitted to fail during expansion: if both arguments
+     are non-registers (e.g. memory := constant, which can be created by the
+     auto-vectorizer), force operand 1 into a register.  */
+  if (!s_register_operand (operands[0], <MODE>mode)
+      && !s_register_operand (operands[1], <MODE>mode))
+    operands[1] = force_reg (<MODE>mode, operands[1]);
+})
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VDX 0 "memory_operand"		       "=Um")
+	(unspec:VDX [(match_operand:VDX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vst1.<V_sz_elem>\t{%P1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VDX 0 "s_register_operand"	   "=w")
+	(unspec:VDX [(match_operand:VDX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%P0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VQX 0 "memory_operand"		       "=Um")
+	(unspec:VQX [(match_operand:VQX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vst1.<V_sz_elem>\t{%q1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VQX 0 "s_register_operand"	   "=w")
+	(unspec:VQX [(match_operand:VQX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%q0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
 (define_insn "vec_set<mode>_internal"
   [(set (match_operand:VD 0 "s_register_operand" "=w")
         (vec_merge:VD


* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2010-10-04 15:00                             ` Julian Brown
@ 2010-10-07 13:06                               ` Ramana Radhakrishnan
  0 siblings, 0 replies; 29+ messages in thread
From: Ramana Radhakrishnan @ 2010-10-07 13:06 UTC (permalink / raw)
  To: Julian Brown
  Cc: Richard Earnshaw, gcc-patches, Richard Earnshaw, Paul Brook, Ira Rosen

This caused PR45932.  Can you please have a look?

Ramana


* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2009-12-03 16:57 ` Richard Guenther
@ 2009-12-04 14:40   ` Plotnikov Dmitry
  0 siblings, 0 replies; 29+ messages in thread
From: Plotnikov Dmitry @ 2009-12-04 14:40 UTC (permalink / raw)
  To: Richard Guenther; +Cc: julian, gcc-patches, paul, rearnsha, eres, IRAR

[-- Attachment #1: Type: text/plain, Size: 1471 bytes --]

Richard Guenther wrote:
> On Thu, Dec 3, 2009 at 4:09 PM, Plotnikov Dmitry <dplotnikov@ispras.ru> wrote:
>   
>> There seems to be a problem though: building libevas with
>> this patch causes a miscompile.
>>
>> Sometimes the SLP pass somehow breaks data dependencies and causes the
>> "dirty = list_zeroed" initialization to be removed by the DCE pass on RTL
>> in the sample code below:
>>     
> That sounds more like an alias bug of either the vectorizer or
> the alias-export code.  Can you re-check with a more recent
> snapshot and file a bugreport?
>   

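The problem persists in the latest snapshot (03 December 2009).  It is
fixed by copying code in expand_assignment so that the first
MISALIGNED_INDIRECT_REF case is handled in the same way as the second
one: if the movmisalign expander fails, the source is forced into a
register and the expander is called again.  See the patch below.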

--- gcc-4.5-20091203/gcc/expr.c 2009-11-30 13:39:36.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/expr.c 2009-12-04 16:59:24.000000000 +0300
@@ -4350,7 +4350,14 @@ expand_assignment (tree to, tree from, b
            && op_mode1 != VOIDmode)
          reg = copy_to_mode_reg (op_mode1, reg);

-      insn = GEN_FCN (icode) (mem, reg);
+       insn = GEN_FCN (icode) (mem, reg);
+       if (!insn)
+       {
+          reg = copy_to_mode_reg (mode, reg);
+          insn = GEN_FCN (icode) (mem, reg);
+          gcc_assert (insn);
+       }
+
        emit_insn (insn);
        return;
      }

The attached patch can be applied to the latest snapshot.
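
The retry logic in the expr.c hunk can be read in isolation with the
minimal standalone C sketch below.  It is not GCC code: the operand
and insn types and the names gen_movmisalign, force_to_reg and
emit_misaligned_store are hypothetical stand-ins (for the movmisalign
expander, copy_to_mode_reg and the expand_assignment path
respectively), and it assumes the expander fails exactly when neither
operand is a register, as the FAIL in the attached neon.md does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operand kinds; GCC's real rtx representation is far richer.  */
enum operand_kind { OP_REG, OP_MEM, OP_CONST };

struct operand
{
  enum operand_kind kind;
  int value;
};

struct insn
{
  struct operand dest;
  struct operand src;
};

/* Stand-in for a movmisalign expander.  It fails (returns NULL) when
   neither operand is a register, e.g. for memory := constant.  */
static struct insn *
gen_movmisalign (struct insn *slot, struct operand dest, struct operand src)
{
  if (dest.kind != OP_REG && src.kind != OP_REG)
    return NULL;
  slot->dest = dest;
  slot->src = src;
  return slot;
}

/* Stand-in for copy_to_mode_reg: pretend the value now lives in a register.  */
static struct operand
force_to_reg (struct operand op)
{
  struct operand reg = { OP_REG, op.value };
  return reg;
}

/* Try the expander once; if that fails, force the source into a
   register and try again.  The second attempt must succeed, otherwise
   the store would silently be dropped.  */
static struct insn *
emit_misaligned_store (struct insn *slot, struct operand mem,
                       struct operand src)
{
  struct insn *insn = gen_movmisalign (slot, mem, src);
  if (!insn)
    {
      src = force_to_reg (src);
      insn = gen_movmisalign (slot, mem, src);
      assert (insn != NULL);
    }
  return insn;
}
```

Calling emit_misaligned_store with a memory destination and a
constant source takes the retry path; a register source succeeds on
the first attempt.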

--
Best regards,
  Dmitry


[-- Attachment #2: misalign.patch --]
[-- Type: text/x-patch, Size: 70063 bytes --]

diff -rupd gcc-4.5-20091203/gcc/config/arm/arm.c gcc-4.5-20091203.patched/gcc/config/arm/arm.c
--- gcc-4.5-20091203/gcc/config/arm/arm.c	2009-11-25 14:23:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/config/arm/arm.c	2009-12-04 15:02:20.000000000 +0300
@@ -224,6 +224,8 @@ static bool arm_can_eliminate (const int
 static void arm_asm_trampoline_template (FILE *);
 static void arm_trampoline_init (rtx, tree, rtx);
 static rtx arm_trampoline_adjust_address (rtx);
+static int arm_vector_min_alignment (const_tree type);
+static bool arm_vector_always_misalign (const_tree);
 
 \f
 /* Table of machine attributes.  */
@@ -507,6 +509,12 @@ static const struct attribute_spec arm_a
 #undef TARGET_CAN_ELIMINATE
 #define TARGET_CAN_ELIMINATE arm_can_eliminate
 
+#undef TARGET_VECTOR_MIN_ALIGNMENT
+#define TARGET_VECTOR_MIN_ALIGNMENT arm_vector_min_alignment
+
+#undef TARGET_VECTOR_ALWAYS_MISALIGN
+#define TARGET_VECTOR_ALWAYS_MISALIGN arm_vector_always_misalign
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 \f
 /* Obstack for minipool constant handling.  */
@@ -8463,7 +8471,8 @@ neon_vector_mem_operand (rtx op, int typ
     return arm_address_register_rtx_p (ind, 0);
 
   /* Allow post-increment with Neon registers.  */
-  if (type != 1 && (GET_CODE (ind) == POST_INC || GET_CODE (ind) == PRE_DEC))
+  if ((type != 1 && GET_CODE (ind) == POST_INC)
+      || (type == 0 && GET_CODE (ind) == PRE_DEC))
     return arm_address_register_rtx_p (XEXP (ind, 0), 0);
 
   /* FIXME: vld1 allows register post-modify.  */
@@ -15411,6 +15420,8 @@ arm_print_operand (FILE *stream, rtx x, 
       {
 	rtx addr;
 	bool postinc = FALSE;
+	unsigned align;
+
 	gcc_assert (GET_CODE (x) == MEM);
 	addr = XEXP (x, 0);
 	if (GET_CODE (addr) == POST_INC)
@@ -15418,7 +15429,13 @@ arm_print_operand (FILE *stream, rtx x, 
 	    postinc = 1;
 	    addr = XEXP (addr, 0);
 	  }
-	asm_fprintf (stream, "[%r]", REGNO (addr));
+	align = MEM_ALIGN (x) >> 3;
+	asm_fprintf (stream, "[%r", REGNO (addr));
+	if (align > GET_MODE_SIZE (GET_MODE (x)))
+	  align = GET_MODE_SIZE (GET_MODE (x));
+	if (align >= 8)
+	  asm_fprintf (stream, ", :%d", align << 3);
+	asm_fprintf (stream, "]");
 	if (postinc)
 	  fputs("!", stream);
       }
@@ -21463,4 +21480,34 @@ arm_have_conditional_execution (void)
   return !TARGET_THUMB1;
 }
 
+/* Return the minimum alignment required to load or store a
+   vector of the given type, which may be less than the
+   natural alignment of the type.  */
+
+static int
+arm_vector_min_alignment (const_tree type)
+{
+  if (TARGET_NEON)
+    {
+      /* The NEON element load and store instructions only require the
+	 alignment of the element type.  They can benefit from higher
+	 statically reported alignment, but we do not take advantage
+	 of that yet.  */
+      gcc_assert (TREE_CODE (type) == VECTOR_TYPE);
+      return TYPE_ALIGN_UNIT (TREE_TYPE (type));
+    }
+
+  return default_vector_min_alignment (type);
+}
+
+static bool
+arm_vector_always_misalign (const_tree type ATTRIBUTE_UNUSED)
+{
+  /* On big-endian targets array loads (vld1) and vector loads (vldm)
+     use a different format.  Always use the "misaligned" array variant.
+     FIXME: this still doesn't work for big-endian because of constant
+     loads and other operations using vldm ordering.  */
+  return TARGET_NEON && !BYTES_BIG_ENDIAN;
+}
+
 #include "gt-arm.h"
diff -rupd gcc-4.5-20091203/gcc/config/arm/neon.md gcc-4.5-20091203.patched/gcc/config/arm/neon.md
--- gcc-4.5-20091203/gcc/config/arm/neon.md	2009-11-11 17:23:03.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/config/arm/neon.md	2009-12-04 15:02:20.000000000 +0300
@@ -159,7 +159,8 @@
    (UNSPEC_VUZP1		201)
    (UNSPEC_VUZP2		202)
    (UNSPEC_VZIP1		203)
-   (UNSPEC_VZIP2		204)])
+   (UNSPEC_VZIP2		204)
+   (UNSPEC_MISALIGNED_ACCESS	205)])
 
 ;; Double-width vector modes.
 (define_mode_iterator VD [V8QI V4HI V2SI V2SF])
@@ -674,6 +675,51 @@
   neon_disambiguate_copy (operands, dest, src, 4);
 })
 
+(define_expand "movmisalign<mode>"
+  [(set (match_operand:VDQX 0 "nonimmediate_operand"	      "")
+	(unspec:VDQX [(match_operand:VDQX 1 "general_operand" "")]
+		     UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+{
+  if (!s_register_operand (operands[0], <MODE>mode)
+      && !s_register_operand (operands[1], <MODE>mode))
+    FAIL;
+})
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VDX 0 "memory_operand"		       "=Um")
+	(unspec:VDX [(match_operand:VDX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN
+   && (   s_register_operand (operands[0], <MODE>mode)
+       || s_register_operand (operands[1], <MODE>mode))"
+  "vst1.<V_sz_elem>\t{%P1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VDX 0 "s_register_operand"	   "=w")
+	(unspec:VDX [(match_operand:VDX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%P0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_store"
+  [(set (match_operand:VQX 0 "memory_operand"		       "=Um")
+	(unspec:VQX [(match_operand:VQX 1 "s_register_operand" " w")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vst1.<V_sz_elem>\t{%q1}, %A0"
+  [(set_attr "neon_type" "neon_vst1_1_2_regs_vst2_2_regs")])
+
+(define_insn "*movmisalign<mode>_neon_load"
+  [(set (match_operand:VQX 0 "s_register_operand"	   "=w")
+	(unspec:VQX [(match_operand:VQX 1 "memory_operand" " Um")]
+		    UNSPEC_MISALIGNED_ACCESS))]
+  "TARGET_NEON && !BYTES_BIG_ENDIAN"
+  "vld1.<V_sz_elem>\t{%q0}, %A1"
+  [(set_attr "neon_type" "neon_vld1_1_2_regs")])
+
 (define_insn "vec_set<mode>_internal"
   [(set (match_operand:VD 0 "s_register_operand" "=w")
         (vec_merge:VD
diff -rupd gcc-4.5-20091203/gcc/doc/md.texi gcc-4.5-20091203.patched/gcc/doc/md.texi
--- gcc-4.5-20091203/gcc/doc/md.texi	2009-11-04 01:49:37.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/doc/md.texi	2009-12-04 15:02:20.000000000 +0300
@@ -3807,6 +3807,15 @@ memory, so that it's easy to tell whethe
 This pattern is used by the autovectorizer, and when expanding a
 @code{MISALIGNED_INDIRECT_REF} expression.
 
+The @code{movmisalign@var{m}} pattern should load or store vector elements
+in the same memory order as an array of the element types.  If the
+target machine uses "opaque" operations to implement @code{mov@var{m}}
+for vector types (so the vector elements are in a different order to
+an equivalent array), but can also implement @code{movmisalign@var{m}}
+efficiently, then the autovectorizer should use this pattern for aligned
+accesses as well as misaligned accesses.  This behaviour is controlled
+by the TARGET_VECTOR_ALWAYS_MISALIGN hook.
+
 @cindex @code{load_multiple} instruction pattern
 @item @samp{load_multiple}
 Load several consecutive memory locations into consecutive registers.
diff -rupd gcc-4.5-20091203/gcc/expr.c gcc-4.5-20091203.patched/gcc/expr.c
--- gcc-4.5-20091203/gcc/expr.c	2009-11-30 13:39:36.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/expr.c	2009-12-04 16:59:24.000000000 +0300
@@ -4350,7 +4350,14 @@ expand_assignment (tree to, tree from, b
            && op_mode1 != VOIDmode)
          reg = copy_to_mode_reg (op_mode1, reg);
 
-      insn = GEN_FCN (icode) (mem, reg);
+       insn = GEN_FCN (icode) (mem, reg);
+       if (!insn)
+       {
+          reg = copy_to_mode_reg (mode, reg);
+          insn = GEN_FCN (icode) (mem, reg);
+          gcc_assert (insn);
+       }
+
        emit_insn (insn);
        return;
      }
@@ -4458,6 +4465,29 @@ expand_assignment (tree to, tree from, b
 
   /* Compute FROM and store the value in the rtx we got.  */
 
+  if (TREE_CODE (to) == MISALIGNED_INDIRECT_REF)
+    {
+      rtx insn;
+      rtx from_rtx;
+      enum insn_code icode;
+      enum machine_mode mode = GET_MODE (to_rtx);
+
+      icode = optab_handler (movmisalign_optab, mode)->insn_code;
+      gcc_assert (icode != CODE_FOR_nothing);
+
+      from_rtx = expand_expr (from, NULL_RTX, mode, EXPAND_NORMAL);
+      insn = GEN_FCN (icode) (to_rtx, from_rtx);
+      /* If that failed then force the source into a reg and try again.  */
+      if (!insn)
+	{
+	  from_rtx = copy_to_mode_reg (mode, from_rtx);
+	  insn = GEN_FCN (icode) (to_rtx, from_rtx);
+	  gcc_assert (insn);
+	}
+      emit_insn (insn);
+      return;
+    }
+
   push_temp_slots ();
   result = store_expr (from, to_rtx, 0, nontemporal);
   preserve_temp_slots (result);
@@ -8696,6 +8726,10 @@ expand_expr_real_1 (tree exp, rtx target
 	    int icode;
 	    rtx reg, insn;
 
+	    /* For writes produce a MEM, and expand_assignment will DTRT.  */
+	    if (modifier == EXPAND_WRITE)
+	      return temp;
+
 	    gcc_assert (modifier == EXPAND_NORMAL
 			|| modifier == EXPAND_STACK_PARM);
 
diff -rupd gcc-4.5-20091203/gcc/target-def.h gcc-4.5-20091203.patched/gcc/target-def.h
--- gcc-4.5-20091203/gcc/target-def.h	2009-11-26 04:52:19.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/target-def.h	2009-12-04 15:06:12.000000000 +0300
@@ -393,6 +393,9 @@
 #define TARGET_VECTORIZE_BUILTIN_VEC_PERM 0
 #define TARGET_VECTORIZE_BUILTIN_VEC_PERM_OK \
   hook_bool_tree_tree_true
+#define TARGET_VECTOR_MIN_ALIGNMENT \
+     default_vector_min_alignment
+#define TARGET_VECTOR_ALWAYS_MISALIGN hook_bool_const_tree_false
 #define TARGET_SUPPORT_VECTOR_MISALIGNMENT \
   default_builtin_support_vector_misalignment
 
@@ -408,6 +411,8 @@
     TARGET_VECTOR_ALIGNMENT_REACHABLE,                                  \
     TARGET_VECTORIZE_BUILTIN_VEC_PERM,					\
     TARGET_VECTORIZE_BUILTIN_VEC_PERM_OK,				\
+    TARGET_VECTOR_MIN_ALIGNMENT,          \
+    TARGET_VECTOR_ALWAYS_MISALIGN,          \
     TARGET_SUPPORT_VECTOR_MISALIGNMENT					\
   }
 
diff -rupd gcc-4.5-20091203/gcc/target.h gcc-4.5-20091203.patched/gcc/target.h
--- gcc-4.5-20091203/gcc/target.h	2009-11-26 04:52:19.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/target.h	2009-12-04 15:05:19.000000000 +0300
@@ -491,6 +491,15 @@ struct gcc_target
     /* Target builtin that implements vector permute.  */
     tree (* builtin_vec_perm) (tree, tree*);
 
+    /* Return the minimum alignment required to load or store a
+       vector of the given type, which may be less than the
+       natural alignment of the type.  */
+    int (* vector_min_alignment) (const_tree);
+    
+    /* Return true if "movmisalign" patterns should be used for all
+       loads/stores from data arrays.  */
+    bool (* always_misalign) (const_tree);
+
     /* Return true if a vector created for builtin_vec_perm is valid.  */
     bool (* builtin_vec_perm_ok) (tree, tree);
 
diff -rupd gcc-4.5-20091203/gcc/targhooks.c gcc-4.5-20091203.patched/gcc/targhooks.c
--- gcc-4.5-20091203/gcc/targhooks.c	2009-11-25 13:55:54.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/targhooks.c	2009-12-04 15:02:20.000000000 +0300
@@ -932,6 +932,12 @@ default_addr_space_convert (rtx op ATTRI
   gcc_unreachable ();
 }
 
+int
+default_vector_min_alignment (const_tree type)
+{
+  return TYPE_ALIGN_UNIT (type);
+}
+
 bool
 default_hard_regno_scratch_ok (unsigned int regno ATTRIBUTE_UNUSED)
 {
diff -rupd gcc-4.5-20091203/gcc/targhooks.h gcc-4.5-20091203.patched/gcc/targhooks.h
--- gcc-4.5-20091203/gcc/targhooks.h	2009-11-25 13:55:54.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/targhooks.h	2009-12-04 15:02:20.000000000 +0300
@@ -82,6 +82,8 @@ default_builtin_support_vector_misalignm
 					     const_tree,
 					     int, bool);
 
+extern int default_vector_min_alignment (const_tree);
+
 /* These are here, and not in hooks.[ch], because not all users of
    hooks.h include tm.h, and thus we don't have CUMULATIVE_ARGS.  */
 
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c	2009-06-05 19:28:50.000000000 +0400
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-8.c	2009-12-04 15:02:20.000000000 +0300
@@ -46,5 +46,5 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { ! { vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { ! { vect_element_align } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-31.c	2009-12-04 15:02:20.000000000 +0300
@@ -88,5 +88,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-64.c	2009-12-04 15:02:20.000000000 +0300
@@ -84,5 +84,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-66.c	2009-12-04 15:02:20.000000000 +0300
@@ -79,5 +79,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-68.c	2009-12-04 15:02:20.000000000 +0300
@@ -88,5 +88,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c	2009-12-04 15:02:20.000000000 +0300
@@ -114,7 +114,7 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vector_alignment_reachable} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_element_align || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { { ! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/pr25413.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/pr25413.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/pr25413.c	2009-06-08 17:26:44.000000000 +0400
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/pr25413.c	2009-12-04 15:02:20.000000000 +0300
@@ -33,7 +33,7 @@ int main (void)
 } 
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vector_alignment_reachable_for_64bit } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "vector alignment may not be reachable" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_element_align} } } } } */
+/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 1 "vect" { target { {! vector_alignment_reachable_for_64bit} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c	2007-08-07 23:13:27.000000000 +0400
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c	2009-12-04 15:02:20.000000000 +0300
@@ -115,6 +115,6 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
 /* Alignment forced using versioning until the pass that increases alignment
   is extended to handle structs.  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 4 "vect" { target {vect_int && vector_alignment_reachable } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 4 "vect" { target { {vect_int && vector_alignment_reachable } && { ! vect_element_align } } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 4 "vect" { target {vect_int && {! vector_alignment_reachable} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/slp-25.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/slp-25.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/slp-25.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/slp-25.c	2009-12-04 15:02:20.000000000 +0300
@@ -56,5 +56,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-109.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-109.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-109.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-109.c	2009-12-04 15:02:20.000000000 +0300
@@ -73,7 +73,7 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 2 "vect" } } */
-/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 10 "vect" { target vect_hw_misalign } } } */
+/* { dg-final { scan-tree-dump-times "not vectorized: unsupported unaligned store" 2 "vect" { xfail vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 10 "vect" { target vect_element_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-26.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-26.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-26.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-26.c	2009-12-04 15:02:20.000000000 +0300
@@ -37,5 +37,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-27.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-27.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-27.c	2009-11-04 13:22:22.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-27.c	2009-12-04 15:02:20.000000000 +0300
@@ -45,6 +45,6 @@ int main (void)
 /* The initialization induction loop (with aligned access) is also vectorized.  */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { xfail vect_no_align } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-28.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-28.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-28.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-28.c	2009-12-04 15:02:20.000000000 +0300
@@ -40,6 +40,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vector_alignment_reachable && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-29.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-29.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-29.c	2009-11-04 13:22:22.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-29.c	2009-12-04 15:02:20.000000000 +0300
@@ -50,7 +50,7 @@ int main (void)
 
 /* The initialization induction loop (with aligned access) is also vectorized.  */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" {target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-33.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-33.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-33.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-33.c	2009-12-04 15:02:20.000000000 +0300
@@ -39,6 +39,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target vector_alignment_reachable } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */ 
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vector_alignment_reachable && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */ 
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-42.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-42.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-42.c	2009-11-04 13:22:22.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-42.c	2009-12-04 15:02:20.000000000 +0300
@@ -64,7 +64,7 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_hw_misalign } } } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { { ! vector_alignment_reachable } && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { { vect_no_align || vect_element_align } || { ! vector_alignment_reachable } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || { ! vector_alignment_reachable } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-44.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-44.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-44.c	2009-11-04 13:22:22.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-44.c	2009-12-04 15:02:20.000000000 +0300
@@ -65,8 +65,8 @@ int main (void)
    two loads to be aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {{! vect_no_align} && {! vect_hw_misalign} } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {{! vect_no_align} && {! vect_element_align} } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-48.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-48.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-48.c	2009-11-04 13:22:22.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-48.c	2009-12-04 15:02:20.000000000 +0300
@@ -54,7 +54,7 @@ int main (void)
    (The store is aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 2 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-50.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-50.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-50.c	2009-11-04 13:22:22.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-50.c	2009-12-04 15:02:20.000000000 +0300
@@ -61,9 +61,9 @@ int main (void)
    align the store will not force the two loads to be aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }  */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target vect_hw_misalign } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } }  */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && { {! vect_no_align } && {! vect_hw_misalign } } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && { {! vect_no_align } && {! vect_element_align } } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-52.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-52.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-52.c	2009-11-04 13:22:22.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-52.c	2009-12-04 15:02:20.000000000 +0300
@@ -55,7 +55,7 @@ int main (void)
    (The store is aligned).  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 2 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-54.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-54.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-54.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-54.c	2009-12-04 15:02:20.000000000 +0300
@@ -60,5 +60,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-56.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-56.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-56.c	2009-11-04 13:22:22.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-56.c	2009-12-04 15:02:20.000000000 +0300
@@ -68,6 +68,6 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-58.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-58.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-58.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-58.c	2009-12-04 15:02:20.000000000 +0300
@@ -59,5 +59,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-60.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-60.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-60.c	2009-11-04 13:22:22.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-60.c	2009-12-04 15:02:20.000000000 +0300
@@ -69,6 +69,6 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-70.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-70.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-70.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-70.c	2009-12-04 15:02:20.000000000 +0300
@@ -64,6 +64,6 @@ int main (void)
           
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target {{! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target {{! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-72.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-72.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-72.c	2009-11-04 13:22:22.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-72.c	2009-12-04 15:02:20.000000000 +0300
@@ -46,6 +46,6 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-75.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-75.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-75.c	2009-05-08 17:39:01.000000000 +0400
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-75.c	2009-12-04 15:02:20.000000000 +0300
@@ -45,5 +45,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail vect_no_align } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-87.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-87.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-87.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-87.c	2009-12-04 15:02:20.000000000 +0300
@@ -51,6 +51,6 @@ int main (void)
 /* Fails for targets that don't vectorize PLUS (e.g alpha).  */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target vector_alignment_reachable} } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-88.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-88.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-88.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-88.c	2009-12-04 15:02:20.000000000 +0300
@@ -51,6 +51,6 @@ int main (void)
 /* Fails for targets that don't vectorize PLUS (e.g alpha).  */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target vector_alignment_reachable } } }  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } }  */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" {target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-89.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-89.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-89.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-89.c	2009-12-04 15:02:20.000000000 +0300
@@ -46,5 +46,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-91.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-91.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-91.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-91.c	2009-12-04 15:02:20.000000000 +0300
@@ -59,6 +59,6 @@ main3 ()
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" { xfail vect_no_int_add } } } */
 /* { dg-final { scan-tree-dump-times "accesses have the same alignment." 3 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" {target { vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" {target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" {target { vector_alignment_reachable && { ! vect_element_align } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" {target { {! vector_alignment_reachable} && {! vect_element_align} } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-92.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-92.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-92.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-92.c	2009-12-04 15:02:20.000000000 +0300
@@ -92,5 +92,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { target { ! vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-93.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-93.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-93.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-93.c	2009-12-04 15:02:20.000000000 +0300
@@ -72,7 +72,7 @@ int main (void)
 /* main && main1 together: */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 2 "vect" { target powerpc*-*-* i?86-*-* x86_64-*-* } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { vect_no_align && {! vector_alignment_reachable} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
 
 /* in main1: */
 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target !powerpc*-*-* !i?86-*-* !x86_64-*-* } } } */
@@ -80,6 +80,6 @@ int main (void)
 
 /* in main: */
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-95.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-95.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-95.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-95.c	2009-12-04 15:02:20.000000000 +0300
@@ -56,14 +56,14 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail {vect_hw_misalign} } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" { xfail {vect_element_align} } } } */
 
 /* For targets that support unaligned loads we version for the two unaligned 
    stores and generate misaligned accesses for the loads. For targets that 
    don't support unaligned loads we version for all four accesses.  */
 
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign} } } }  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } }  */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /*  { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target vect_no_align } } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 4 "vect" { target vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-96.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-96.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-96.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-96.c	2009-12-04 15:02:20.000000000 +0300
@@ -43,7 +43,7 @@ int main (void)
    For targets that don't support unaligned loads, version for the store.  */
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { {! vect_no_align} && vector_alignment_reachable } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { target { {! { vect_no_align || vect_element_align } } && vector_alignment_reachable } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_element_align } || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} && {! vect_element_align} } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-align-2.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-align-2.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-align-2.c	2009-06-05 19:28:50.000000000 +0400
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-align-2.c	2009-12-04 15:02:20.000000000 +0300
@@ -43,6 +43,6 @@ int main (void)
 
 
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { xfail vect_hw_misalign} } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { xfail vect_hw_misalign } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c	2009-12-04 15:02:20.000000000 +0300
@@ -78,11 +78,11 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_element_align} } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c	2009-11-04 13:22:22.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-multitypes-3.c	2009-12-04 15:02:20.000000000 +0300
@@ -54,6 +54,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" {xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 3 "vect" {xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c	2009-12-04 15:02:20.000000000 +0300
@@ -85,11 +85,11 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_hw_misalign} } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_hw_misalign}  } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail {! vect_element_align} } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail {! vect_element_align}  } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 8 "vect" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || vect_hw_misalign } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff -rupd gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c
--- gcc-4.5-20091203/gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c	2009-11-10 21:01:22.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gcc.dg/vect/vect-multitypes-6.c	2009-12-04 15:02:20.000000000 +0300
@@ -61,6 +61,6 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { sparc*-*-* && ilp32 } }} } */
 /*  { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 6 "vect" { target vect_no_align } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 6 "vect" {xfail { vect_no_align } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 6 "vect" {xfail { vect_no_align || vect_element_align } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */
 
diff -rupd gcc-4.5-20091203/gcc/testsuite/gfortran.dg/vect/vect-2.f90 gcc-4.5-20091203.patched/gcc/testsuite/gfortran.dg/vect/vect-2.f90
--- gcc-4.5-20091203/gcc/testsuite/gfortran.dg/vect/vect-2.f90	2009-11-04 13:22:22.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gfortran.dg/vect/vect-2.f90	2009-12-04 15:02:20.000000000 +0300
@@ -18,5 +18,5 @@ END
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 3 "vect" { xfail { vect_no_align || { ! vector_alignment_reachable } } } } }
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { vect_no_align && { ! vector_alignment_reachable } } } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } }
-! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" {target { vect_no_align || { { ! vector_alignment_reachable  } && { ! vect_hw_misalign } } } } } } 
+! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 3 "vect" {target { vect_no_align || { { ! vector_alignment_reachable  } && { ! vect_element_align } } } } } } 
 ! { dg-final { cleanup-tree-dump "vect" } }
diff -rupd gcc-4.5-20091203/gcc/testsuite/gfortran.dg/vect/vect-3.f90 gcc-4.5-20091203.patched/gcc/testsuite/gfortran.dg/vect/vect-3.f90
--- gcc-4.5-20091203/gcc/testsuite/gfortran.dg/vect/vect-3.f90	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gfortran.dg/vect/vect-3.f90	2009-12-04 15:02:20.000000000 +0300
@@ -7,8 +7,8 @@ Y = Y + A * X
 END
 
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target vect_no_align } } }
-! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } }
-! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } }
+! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_element_align} } } } } }
+! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vect_no_align} && { {! vector_alignment_reachable} && {! vect_element_align} } } } } }
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable}} } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { { vect_no_align } || { ! vector_alignment_reachable} } } } }
 
diff -rupd gcc-4.5-20091203/gcc/testsuite/gfortran.dg/vect/vect-4.f90 gcc-4.5-20091203.patched/gcc/testsuite/gfortran.dg/vect/vect-4.f90
--- gcc-4.5-20091203/gcc/testsuite/gfortran.dg/vect/vect-4.f90	2009-10-27 14:46:07.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gfortran.dg/vect/vect-4.f90	2009-12-04 15:02:20.000000000 +0300
@@ -12,6 +12,6 @@ END
 ! { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } 
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { { vect_no_align } || {! vector_alignment_reachable} } } } }
-! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } }
+! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } }
 ! { dg-final { scan-tree-dump-times "accesses have the same alignment." 1 "vect" } }
 ! { dg-final { cleanup-tree-dump "vect" } }
diff -rupd gcc-4.5-20091203/gcc/testsuite/gfortran.dg/vect/vect-5.f90 gcc-4.5-20091203.patched/gcc/testsuite/gfortran.dg/vect/vect-5.f90
--- gcc-4.5-20091203/gcc/testsuite/gfortran.dg/vect/vect-5.f90	2009-11-04 13:22:22.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/gfortran.dg/vect/vect-5.f90	2009-12-04 15:02:20.000000000 +0300
@@ -39,5 +39,5 @@
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { vect_no_align || {! vector_alignment_reachable} } } } }
 ! { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 "vect" { xfail { vect_no_align } } } }
 ! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 2 "vect" { target { vect_no_align } } } }
-! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } }
+! { dg-final { scan-tree-dump-times "Alignment of access forced using versioning." 1 "vect" { target { {! vector_alignment_reachable} && {! vect_element_align} } } } }
 ! { dg-final { cleanup-tree-dump "vect" } }
diff -rupd gcc-4.5-20091203/gcc/testsuite/lib/target-supports.exp gcc-4.5-20091203.patched/gcc/testsuite/lib/target-supports.exp
--- gcc-4.5-20091203/gcc/testsuite/lib/target-supports.exp	2009-11-26 05:39:42.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/testsuite/lib/target-supports.exp	2009-12-04 15:02:20.000000000 +0300
@@ -1514,6 +1514,18 @@ proc check_effective_target_arm32 { } {
     }]
 }
 
+# Return 1 if this is an ARM target that only supports aligned vector accesses
+proc check_effective_target_arm_vect_no_misalign { } {
+    return [check_no_compiler_messages arm_vect_no_misalign assembly {
+	#if !defined(__arm__) \
+	    || (defined(__ARMEL__) \
+	        && (!defined(__thumb__) || defined(__thumb2__)))
+	#error FOO
+	#endif
+    }]
+}
+
+
 # Return 1 if this is an ARM target supporting -mfpu=vfp
 # -mfloat-abi=softfp.  Some multilibs may be incompatible with these
 # options.
@@ -2331,7 +2343,7 @@ proc check_effective_target_vect_no_alig
 	if { [istarget mipsisa64*-*-*]
 	     || [istarget sparc*-*-*]
 	     || [istarget ia64-*-*]
-	     || [check_effective_target_arm32] } { 
+	     || [check_effective_target_arm_vect_no_misalign] } { 
 	    set et_vect_no_align_saved 1
 	}
     }
@@ -2466,6 +2478,25 @@ proc check_effective_target_vector_align
     return $et_vector_alignment_reachable_for_64bit_saved
 }
 
+# Return 1 if the target only requires element alignment for vector accesses
+
+proc check_effective_target_vect_element_align { } {
+    global et_vect_element_align
+
+    if [info exists et_vect_element_align] {
+	verbose "check_effective_target_vect_element_align: using cached result" 2
+    } else {
+	set et_vect_element_align 0
+	if { [istarget arm*-*-*]
+	     || [check_effective_target_vect_hw_misalign] } {
+	   set et_vect_element_align 1
+	}
+    }
+
+    verbose "check_effective_target_vect_element_align: returning $et_vect_element_align" 2
+    return $et_vect_element_align
+}
+
 # Return 1 if the target supports vector conditional operations, 0 otherwise.
 
 proc check_effective_target_vect_condition { } {
diff -rupd gcc-4.5-20091203/gcc/tree-vect-data-refs.c gcc-4.5-20091203.patched/gcc/tree-vect-data-refs.c
--- gcc-4.5-20091203/gcc/tree-vect-data-refs.c	2009-11-28 19:21:00.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/tree-vect-data-refs.c	2009-12-04 15:02:20.000000000 +0300
@@ -745,7 +745,7 @@ vect_compute_data_ref_alignment (struct 
     }
 
   base = build_fold_indirect_ref (base_addr);
-  alignment = ssize_int (TYPE_ALIGN (vectype)/BITS_PER_UNIT);
+  alignment = ssize_int (targetm.vectorize.vector_min_alignment (vectype));
 
   if ((aligned_to && tree_int_cst_compare (aligned_to, alignment) < 0)
       || !misalign)
@@ -795,8 +795,9 @@ vect_compute_data_ref_alignment (struct 
 
   /* At this point we assume that the base is aligned.  */
   gcc_assert (base_aligned
-	      || (TREE_CODE (base) == VAR_DECL
-		  && DECL_ALIGN (base) >= TYPE_ALIGN (vectype)));
+	      || (TREE_CODE (base) == VAR_DECL 
+		  && (DECL_ALIGN_UNIT (base)
+		      >= targetm.vectorize.vector_min_alignment (vectype))));
 
   /* Modulo alignment.  */
   misalign = size_binop (FLOOR_MOD_EXPR, misalign, alignment);
@@ -3382,7 +3383,12 @@ vect_supportable_dr_alignment (struct da
   bool nested_in_vect_loop = false;
 
   if (aligned_access_p (dr))
-    return dr_aligned;
+    {
+      if (targetm.vectorize.always_misalign (vectype))
+	return dr_unaligned_forced;
+      else
+	return dr_aligned;
+    }
 
   if (!loop_vinfo)
     /* FORNOW: Misaligned accesses are supported only in loops.  */
diff -rupd gcc-4.5-20091203/gcc/tree-vect-loop-manip.c gcc-4.5-20091203.patched/gcc/tree-vect-loop-manip.c
--- gcc-4.5-20091203/gcc/tree-vect-loop-manip.c	2009-11-28 19:21:00.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/tree-vect-loop-manip.c	2009-12-04 15:02:20.000000000 +0300
@@ -37,6 +37,7 @@ along with GCC; see the file COPYING3.  
 #include "tree-scalar-evolution.h"
 #include "tree-vectorizer.h"
 #include "langhooks.h"
+#include "target.h"
 
 /*************************************************************************
   Simple Loop Peeling Utilities
@@ -1835,7 +1836,7 @@ vect_gen_niters_for_prolog_loop (loop_ve
   gimple dr_stmt = DR_STMT (dr);
   stmt_vec_info stmt_info = vinfo_for_stmt (dr_stmt);
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
-  int vectype_align = TYPE_ALIGN (vectype) / BITS_PER_UNIT;
+  int vectype_align = targetm.vectorize.vector_min_alignment (vectype);
   tree niters_type = TREE_TYPE (loop_niters);
   int step = 1;
   int element_size = GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (DR_REF (dr))));
diff -rupd gcc-4.5-20091203/gcc/tree-vectorizer.h gcc-4.5-20091203.patched/gcc/tree-vectorizer.h
--- gcc-4.5-20091203/gcc/tree-vectorizer.h	2009-11-25 13:55:54.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/tree-vectorizer.h	2009-12-04 15:02:20.000000000 +0300
@@ -48,6 +48,7 @@ enum operation_type {
 enum dr_alignment_support {
   dr_unaligned_unsupported,
   dr_unaligned_supported,
+  dr_unaligned_forced,
   dr_explicit_realign,
   dr_explicit_realign_optimized,
   dr_aligned
diff -rupd gcc-4.5-20091203/gcc/tree-vect-stmts.c gcc-4.5-20091203.patched/gcc/tree-vect-stmts.c
--- gcc-4.5-20091203/gcc/tree-vect-stmts.c	2009-11-30 15:17:43.000000000 +0300
+++ gcc-4.5-20091203.patched/gcc/tree-vect-stmts.c	2009-12-04 15:02:20.000000000 +0300
@@ -742,6 +742,7 @@ vect_model_load_cost (stmt_vec_info stmt
         break;
       }
     case dr_unaligned_supported:
+    case dr_unaligned_forced:
       {
         /* Here, we assign an additional cost for the unaligned load.  */
         inside_cost += ncopies * TARG_VEC_UNALIGNED_LOAD_COST;
@@ -3192,7 +3193,8 @@ vectorizable_store (gimple stmt, gimple_
 	       vect_permute_store_chain().  */
 	    vec_oprnd = VEC_index (tree, result_chain, i);
 
-          if (aligned_access_p (first_dr))
+          if (aligned_access_p (first_dr)
+	      && alignment_support_scheme != dr_unaligned_forced)
             data_ref = build_fold_indirect_ref (dataref_ptr);
           else
           {
@@ -3573,7 +3575,9 @@ vectorizable_load (gimple stmt, gimple_s
 	      data_ref = build_fold_indirect_ref (dataref_ptr);
 	      break;
 	    case dr_unaligned_supported:
+	    case dr_unaligned_forced:
 	      {
+	        /* TODO: Record actual alignment in always_misalign case.  */
 		int mis = DR_MISALIGNMENT (first_dr);
 		tree tmis = (mis == -1 ? size_zero_node : size_int (mis));
 

^ permalink raw reply	[flat|nested] 29+ messages in thread
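
The patch above replaces TYPE_ALIGN (vectype) / BITS_PER_UNIT with
targetm.vectorize.vector_min_alignment (vectype) in
vect_gen_niters_for_prolog_loop. A minimal sketch in plain C of why that
matters for peeling follows. It is not the vectorizer's code: peel_iters
and its parameters are hypothetical names introduced here, and the sketch
assumes the access is already element-aligned.

```c
#include <stddef.h>

/* Hypothetical helper, not part of GCC: number of scalar iterations to
   peel so that an access starting at byte offset ADDR reaches
   REQUIRED_ALIGN bytes of alignment, for elements of ELEM_SIZE bytes.
   Assumes ADDR is a multiple of ELEM_SIZE and REQUIRED_ALIGN is a
   nonzero multiple of ELEM_SIZE.  */
static size_t
peel_iters (size_t addr, size_t required_align, size_t elem_size)
{
  /* Bytes by which the access overshoots the previous boundary.  */
  size_t misalign = addr % required_align;
  /* Bytes left until the next boundary (0 if already aligned).  */
  size_t bytes_to_boundary = (required_align - misalign) % required_align;
  return bytes_to_boundary / elem_size;
}
```

For 4-byte elements starting 4 bytes past a 16-byte boundary,
peel_iters (4, 16, 4) gives 3 when whole-vector alignment is required,
while peel_iters (4, 4, 4) gives 0 when only element alignment is
required. That matches the direction of the testsuite changes, which stop
expecting "Alignment of access forced using peeling" on
vect_element_align targets.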

* Re: [PATCH, ARM] Misaligned access support for ARM Neon
  2009-12-03 15:12 Plotnikov Dmitry
@ 2009-12-03 16:57 ` Richard Guenther
  2009-12-04 14:40   ` Plotnikov Dmitry
  0 siblings, 1 reply; 29+ messages in thread
From: Richard Guenther @ 2009-12-03 16:57 UTC (permalink / raw)
  To: Plotnikov Dmitry; +Cc: julian, gcc-patches, paul, rearnsha, eres, IRAR

On Thu, Dec 3, 2009 at 4:09 PM, Plotnikov Dmitry <dplotnikov@ispras.ru> wrote:
>> Other failures are due to things like vectorizing *more* loops than
>> expected in several tests, and (as written before) missing parts in the
>> NEON support. I don't think there's anything which indicates actual
>> breakage.
>
> There seems to be a problem though: building libevas with
> this patch causes a miscompile.
>
> Sometimes the SLP pass somehow breaks data dependencies and causes the
> "dirty = list_zeroed" initialization to be removed by the RTL DCE pass
> in the sample code below:
>
> #include <stdlib.h>
> #include <assert.h>
> struct list {
>  int *head;
>  int *tail;
> };
>
> typedef struct list list_t;
> static const list_t list_zeroed = { NULL, NULL };
>
> int seg(list_t *arg){
>  if (arg->tail) {
>   return 1;
>  }
>  return 0;
> }
>
> int main(int argc, char* argv[]){
>  list_t dirty = list_zeroed;
>  assert(seg(&dirty)==0);
>  return 0;
> }
>
> We used the GCC 4.5 snapshot from November 12, 2009 with the options:
> "-ftree-vectorize -mfpu=neon -mfloat-abi=softfp -O2 -fno-inline"
>
> With the -fno-tree-slp-vectorize option it works correctly.

That sounds more like an alias bug in either the vectorizer or
the alias-export code.  Can you re-check with a more recent
snapshot and file a bug report?

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH, ARM] Misaligned access support for ARM Neon
@ 2009-12-03 15:12 Plotnikov Dmitry
  2009-12-03 16:57 ` Richard Guenther
  0 siblings, 1 reply; 29+ messages in thread
From: Plotnikov Dmitry @ 2009-12-03 15:12 UTC (permalink / raw)
  To: julian; +Cc: gcc-patches, paul, rearnsha, eres, IRAR

> Other failures are due to things like vectorizing *more* loops than
> expected in several tests, and (as written before) missing parts in the
> NEON support. I don't think there's anything which indicates actual
> breakage.

There seems to be a problem though: building libevas with
this patch causes a miscompile.

Sometimes the SLP pass somehow breaks data dependencies and causes the
"dirty = list_zeroed" initialization to be removed by the RTL DCE pass
in the sample code below:

#include <stdlib.h>
#include <assert.h>
struct list {
  int *head;
  int *tail;
};

typedef struct list list_t;
static const list_t list_zeroed = { NULL, NULL };

int seg(list_t *arg){
  if (arg->tail) {
    return 1;
  }
  return 0;
}

int main(int argc, char* argv[]){
  list_t dirty = list_zeroed;
  assert(seg(&dirty)==0);
  return 0;
}

We used the GCC 4.5 snapshot from November 12, 2009 with the options:
"-ftree-vectorize -mfpu=neon -mfloat-abi=softfp -O2 -fno-inline"

With the -fno-tree-slp-vectorize option it works correctly.
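
A variant of the test case that does not depend on assert() being
enabled is sketched below. It is only a sketch: check_list_init is a name
introduced here, __attribute__ ((noinline)) stands in for -fno-inline,
and it has not been verified that this variant still triggers the
miscompile.

```c
#include <stddef.h>

struct list {
  int *head;
  int *tail;
};

typedef struct list list_t;
static const list_t list_zeroed = { NULL, NULL };

/* noinline keeps the call out of line even without -fno-inline.  */
__attribute__ ((noinline)) int
seg (list_t *arg)
{
  return arg->tail != NULL;
}

/* Hypothetical wrapper: returns 0 when the initialization of 'dirty'
   survived, nonzero when its tail pointer is not NULL.  */
int
check_list_init (void)
{
  list_t dirty = list_zeroed;
  return seg (&dirty);
}
```

A caller can return check_list_init () from main, so a correctly
compiled program exits with status 0.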

-- Best regards, Dmitry

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2010-10-07 13:06 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
2009-11-17 17:21 [PATCH, ARM] Misaligned access support for ARM Neon Julian Brown
2009-11-18 12:03 ` Ira Rosen
2009-11-30 13:53   ` Julian Brown
2009-11-30 14:03     ` Joseph S. Myers
2009-11-30 14:15       ` Richard Earnshaw
2009-11-30 14:33         ` Paul Brook
2009-11-30 15:06           ` Richard Earnshaw
2009-12-18 18:09             ` Julian Brown
2009-12-21  8:44               ` Ira Rosen
2009-12-21 15:35               ` Paul Brook
2010-05-18  0:44                 ` Julian Brown
2010-05-18  8:50                   ` Ira Rosen
2010-05-18 15:58                     ` Julian Brown
2010-06-04 12:50                   ` Julian Brown
2010-06-04 16:26                     ` Richard Earnshaw
2010-06-05 14:39                       ` Joseph S. Myers
2010-06-07 19:09                       ` Julian Brown
2010-06-08 15:25                         ` Mark Mitchell
2010-08-03 16:32                         ` Julian Brown
2010-08-04  6:38                           ` Ira Rosen
2010-09-23  9:49                           ` Richard Earnshaw
2010-10-04 15:00                             ` Julian Brown
2010-10-07 13:06                               ` Ramana Radhakrishnan
2009-11-30 15:43           ` Joseph S. Myers
2009-12-01  8:39     ` Ira Rosen
2009-11-18 14:25 ` Paul Brook
2009-12-03 15:12 Plotnikov Dmitry
2009-12-03 16:57 ` Richard Guenther
2009-12-04 14:40   ` Plotnikov Dmitry
