public inbox for gcc-patches@gcc.gnu.org
* [PATCH ARM iWMMXt 0/5] Improve iWMMXt support
@ 2012-05-29  4:13 Matt Turner
  2012-05-29  4:14 ` [PATCH ARM iWMMXt 5/5] pipeline description Matt Turner
                   ` (7 more replies)
  0 siblings, 8 replies; 29+ messages in thread
From: Matt Turner @ 2012-05-29  4:13 UTC (permalink / raw)
  To: gcc-patches
  Cc: Ramana Radhakrishnan, Richard Earnshaw, Nick Clifton, Paul Brook,
	Xinyu Qi


This series was written by Marvell and sent by Xinyu Qi <xyqi@marvell.com>
a number of times in the last year.

We (One Laptop per Child) need these patches for reasonable iWMMXt support
and performance. Without them, the logical and shift intrinsics cause ICEs;
see PR 35294 and its duplicates, PR 36798 and PR 36966.
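
For readers unfamiliar with the intrinsics in question, the following is a
minimal sketch of the kind of code involved: a logical AND followed by a
per-halfword logical shift. The function name mask_and_shift is hypothetical
and is not taken from pixman or from the PRs, and the snippet is not claimed
to be a reproducer for the ICE. The scalar fallback is only there so the
sketch builds on targets without iWMMXt; count is assumed to be in 0..15.

```c
#include <stdint.h>

#ifdef __IWMMXT__
#include <mmintrin.h>

/* iWMMXt path: one logical intrinsic and one shift intrinsic.  */
uint64_t
mask_and_shift (uint64_t pixels, uint64_t mask, int count)
{
  __m64 v = _mm_and_si64 ((__m64) pixels, (__m64) mask);
  v = _mm_srli_pi16 (v, count);
  return (uint64_t) v;
}
#else
/* Portable fallback (an assumption of this sketch): apply the mask, then
   shift each of the four 16-bit lanes right independently.  */
uint64_t
mask_and_shift (uint64_t pixels, uint64_t mask, int count)
{
  uint64_t masked = pixels & mask;
  uint64_t result = 0;
  int lane;

  for (lane = 0; lane < 4; lane++)
    {
      uint64_t half = (masked >> (16 * lane)) & 0xffff;
      result |= (half >> count) << (16 * lane);
    }
  return result;
}
#endif
```

Both paths are meant to compute the same value, so the fallback can serve as
a reference when checking the output of an iWMMXt build.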

The software compositing library pixman uses MMX intrinsics to optimize
various compositing routines. The following are the minimum execution times
of cairo-perf-trace graphics workloads without and with iWMMXt-optimized
pixman for the image and image16 backends (32-bpp and 16-bpp respectively).

                             image               image16
           evolution   33.492 ->  29.590    30.334 ->  24.751
firefox-planet-gnome  191.465 -> 173.835   211.297 -> 187.570
gnome-system-monitor   51.956 ->  44.549    52.272 ->  40.525
  gnome-terminal-vim   53.625 ->  54.554    47.593 ->  47.341
      grads-heat-map    4.439 ->   4.165     4.548 ->   4.624
       midori-zoomed   38.033 ->  28.500    38.576 ->  26.937
             poppler   41.096 ->  31.949    41.230 ->  31.749
  swfdec-giant-steps   20.062 ->  16.912    28.294 ->  17.286
      swfdec-youtube   42.281 ->  37.335    52.848 ->  47.053
   xfce4-terminal-a1   64.311 ->  51.011    62.592 ->  51.191
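
To make the table easier to compare across traces, the relative change can
be computed as 100 * (before - after) / before. The sketch below does this
for a few rows of the "image" column; the struct and function names are
hypothetical, and the values are copied from the table as posted.

```c
#include <stddef.h>
#include <stdio.h>

/* One row of the table: minimum execution time without and with the
   iWMMXt-optimized pixman, in the units used by the table.  */
struct perf_row
{
  const char *trace;
  double before;
  double after;
};

/* Values copied from the "image" column of the table.  */
const struct perf_row image_rows[] = {
  { "evolution", 33.492, 29.590 },
  { "gnome-terminal-vim", 53.625, 54.554 },
  { "midori-zoomed", 38.033, 28.500 },
  { "poppler", 41.096, 31.949 },
};

/* Percentage by which the execution time dropped.  A negative result
   means the optimized run was slower.  */
double
percent_reduction (double before, double after)
{
  return 100.0 * (before - after) / before;
}

/* Print one line per row with the computed percentage.  */
void
print_reductions (void)
{
  size_t i;

  for (i = 0; i < sizeof image_rows / sizeof image_rows[0]; i++)
    printf ("%-20s %6.2f%%\n", image_rows[i].trace,
	    percent_reduction (image_rows[i].before, image_rows[i].after));
}
```

By this measure midori-zoomed improves by roughly a quarter on the image
backend, while gnome-terminal-vim comes out slightly negative, i.e. a small
slowdown.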

We have cleaned up some white-space issues with the patches and fixed a
small bug in patch 4/5 since the last time they were posted in December
(added tandc, textrc, torc and torvsc to the "wtype" attribute).

Please commit them for 4.8.

For 4.7 and 4.6 please consider committing my patch
"[PATCH] arm: Fix iwmmxt shift and logical intrinsics (PR 35294)."
which only fixes the logical and shift intrinsics.

Thanks,

Matt Turner


* [PATCH ARM iWMMXt 3/5] built in define and expand
  2012-05-29  4:13 [PATCH ARM iWMMXt 0/5] Improve iWMMXt support Matt Turner
  2012-05-29  4:14 ` [PATCH ARM iWMMXt 5/5] pipeline description Matt Turner
  2012-05-29  4:14 ` [PATCH ARM iWMMXt 1/5] ARM code generic change Matt Turner
@ 2012-05-29  4:14 ` Matt Turner
  2012-06-06 11:55   ` Ramana Radhakrishnan
  2012-05-29  4:15 ` [PATCH ARM iWMMXt 4/5] WMMX machine description Matt Turner
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Matt Turner @ 2012-05-29  4:14 UTC (permalink / raw)
  To: gcc-patches
  Cc: Ramana Radhakrishnan, Richard Earnshaw, Nick Clifton, Paul Brook,
	Xinyu Qi

From: Xinyu Qi <xyqi@marvell.com>

	gcc/
	* config/arm/arm.c (enum arm_builtins): Revise built-in fcode.
	(IWMMXT2_BUILTIN): New define.
	(IWMMXT2_BUILTIN2): Likewise.
	(iwmmx2_mbuiltin): Likewise.
	(builtin_description bdesc_2arg): Revise built in declaration.
	(builtin_description bdesc_1arg): Likewise.
	(arm_init_iwmmxt_builtins): Revise built in initialization.
	(arm_expand_builtin): Revise built in expansion.
---
 gcc/config/arm/arm.c |  620 +++++++++++++++++++++++++++++++++++++++++++++-----
 1 files changed, 559 insertions(+), 61 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index b0680ab..51eed40 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -19637,8 +19637,15 @@ static neon_builtin_datum neon_builtin_data[] =
    FIXME?  */
 enum arm_builtins
 {
-  ARM_BUILTIN_GETWCX,
-  ARM_BUILTIN_SETWCX,
+  ARM_BUILTIN_GETWCGR0,
+  ARM_BUILTIN_GETWCGR1,
+  ARM_BUILTIN_GETWCGR2,
+  ARM_BUILTIN_GETWCGR3,
+
+  ARM_BUILTIN_SETWCGR0,
+  ARM_BUILTIN_SETWCGR1,
+  ARM_BUILTIN_SETWCGR2,
+  ARM_BUILTIN_SETWCGR3,
 
   ARM_BUILTIN_WZERO,
 
@@ -19661,7 +19668,11 @@ enum arm_builtins
   ARM_BUILTIN_WSADH,
   ARM_BUILTIN_WSADHZ,
 
-  ARM_BUILTIN_WALIGN,
+  ARM_BUILTIN_WALIGNI,
+  ARM_BUILTIN_WALIGNR0,
+  ARM_BUILTIN_WALIGNR1,
+  ARM_BUILTIN_WALIGNR2,
+  ARM_BUILTIN_WALIGNR3,
 
   ARM_BUILTIN_TMIA,
   ARM_BUILTIN_TMIAPH,
@@ -19797,6 +19808,81 @@ enum arm_builtins
   ARM_BUILTIN_WUNPCKELUH,
   ARM_BUILTIN_WUNPCKELUW,
 
+  ARM_BUILTIN_WABSB,
+  ARM_BUILTIN_WABSH,
+  ARM_BUILTIN_WABSW,
+
+  ARM_BUILTIN_WADDSUBHX,
+  ARM_BUILTIN_WSUBADDHX,
+
+  ARM_BUILTIN_WABSDIFFB,
+  ARM_BUILTIN_WABSDIFFH,
+  ARM_BUILTIN_WABSDIFFW,
+
+  ARM_BUILTIN_WADDCH,
+  ARM_BUILTIN_WADDCW,
+
+  ARM_BUILTIN_WAVG4,
+  ARM_BUILTIN_WAVG4R,
+
+  ARM_BUILTIN_WMADDSX,
+  ARM_BUILTIN_WMADDUX,
+
+  ARM_BUILTIN_WMADDSN,
+  ARM_BUILTIN_WMADDUN,
+
+  ARM_BUILTIN_WMULWSM,
+  ARM_BUILTIN_WMULWUM,
+
+  ARM_BUILTIN_WMULWSMR,
+  ARM_BUILTIN_WMULWUMR,
+
+  ARM_BUILTIN_WMULWL,
+
+  ARM_BUILTIN_WMULSMR,
+  ARM_BUILTIN_WMULUMR,
+
+  ARM_BUILTIN_WQMULM,
+  ARM_BUILTIN_WQMULMR,
+
+  ARM_BUILTIN_WQMULWM,
+  ARM_BUILTIN_WQMULWMR,
+
+  ARM_BUILTIN_WADDBHUSM,
+  ARM_BUILTIN_WADDBHUSL,
+
+  ARM_BUILTIN_WQMIABB,
+  ARM_BUILTIN_WQMIABT,
+  ARM_BUILTIN_WQMIATB,
+  ARM_BUILTIN_WQMIATT,
+
+  ARM_BUILTIN_WQMIABBN,
+  ARM_BUILTIN_WQMIABTN,
+  ARM_BUILTIN_WQMIATBN,
+  ARM_BUILTIN_WQMIATTN,
+
+  ARM_BUILTIN_WMIABB,
+  ARM_BUILTIN_WMIABT,
+  ARM_BUILTIN_WMIATB,
+  ARM_BUILTIN_WMIATT,
+
+  ARM_BUILTIN_WMIABBN,
+  ARM_BUILTIN_WMIABTN,
+  ARM_BUILTIN_WMIATBN,
+  ARM_BUILTIN_WMIATTN,
+
+  ARM_BUILTIN_WMIAWBB,
+  ARM_BUILTIN_WMIAWBT,
+  ARM_BUILTIN_WMIAWTB,
+  ARM_BUILTIN_WMIAWTT,
+
+  ARM_BUILTIN_WMIAWBBN,
+  ARM_BUILTIN_WMIAWBTN,
+  ARM_BUILTIN_WMIAWTBN,
+  ARM_BUILTIN_WMIAWTTN,
+
+  ARM_BUILTIN_WMERGE,
+
   ARM_BUILTIN_THREAD_POINTER,
 
   ARM_BUILTIN_NEON_BASE,
@@ -20329,6 +20415,10 @@ static const struct builtin_description bdesc_2arg[] =
   { FL_IWMMXT, CODE_FOR_##code, "__builtin_arm_" string, \
     ARM_BUILTIN_##builtin, UNKNOWN, 0 },
 
+#define IWMMXT2_BUILTIN(code, string, builtin) \
+  { FL_IWMMXT2, CODE_FOR_##code, "__builtin_arm_" string, \
+    ARM_BUILTIN_##builtin, UNKNOWN, 0 },
+
   IWMMXT_BUILTIN (addv8qi3, "waddb", WADDB)
   IWMMXT_BUILTIN (addv4hi3, "waddh", WADDH)
   IWMMXT_BUILTIN (addv2si3, "waddw", WADDW)
@@ -20385,44 +20475,45 @@ static const struct builtin_description bdesc_2arg[] =
   IWMMXT_BUILTIN (iwmmxt_wunpckihb, "wunpckihb", WUNPCKIHB)
   IWMMXT_BUILTIN (iwmmxt_wunpckihh, "wunpckihh", WUNPCKIHH)
   IWMMXT_BUILTIN (iwmmxt_wunpckihw, "wunpckihw", WUNPCKIHW)
-  IWMMXT_BUILTIN (iwmmxt_wmadds, "wmadds", WMADDS)
-  IWMMXT_BUILTIN (iwmmxt_wmaddu, "wmaddu", WMADDU)
+  IWMMXT2_BUILTIN (iwmmxt_waddsubhx, "waddsubhx", WADDSUBHX)
+  IWMMXT2_BUILTIN (iwmmxt_wsubaddhx, "wsubaddhx", WSUBADDHX)
+  IWMMXT2_BUILTIN (iwmmxt_wabsdiffb, "wabsdiffb", WABSDIFFB)
+  IWMMXT2_BUILTIN (iwmmxt_wabsdiffh, "wabsdiffh", WABSDIFFH)
+  IWMMXT2_BUILTIN (iwmmxt_wabsdiffw, "wabsdiffw", WABSDIFFW)
+  IWMMXT2_BUILTIN (iwmmxt_avg4, "wavg4", WAVG4)
+  IWMMXT2_BUILTIN (iwmmxt_avg4r, "wavg4r", WAVG4R)
+  IWMMXT2_BUILTIN (iwmmxt_wmulwsm, "wmulwsm", WMULWSM)
+  IWMMXT2_BUILTIN (iwmmxt_wmulwum, "wmulwum", WMULWUM)
+  IWMMXT2_BUILTIN (iwmmxt_wmulwsmr, "wmulwsmr", WMULWSMR)
+  IWMMXT2_BUILTIN (iwmmxt_wmulwumr, "wmulwumr", WMULWUMR)
+  IWMMXT2_BUILTIN (iwmmxt_wmulwl, "wmulwl", WMULWL)
+  IWMMXT2_BUILTIN (iwmmxt_wmulsmr, "wmulsmr", WMULSMR)
+  IWMMXT2_BUILTIN (iwmmxt_wmulumr, "wmulumr", WMULUMR)
+  IWMMXT2_BUILTIN (iwmmxt_wqmulm, "wqmulm", WQMULM)
+  IWMMXT2_BUILTIN (iwmmxt_wqmulmr, "wqmulmr", WQMULMR)
+  IWMMXT2_BUILTIN (iwmmxt_wqmulwm, "wqmulwm", WQMULWM)
+  IWMMXT2_BUILTIN (iwmmxt_wqmulwmr, "wqmulwmr", WQMULWMR)
+  IWMMXT_BUILTIN (iwmmxt_walignr0, "walignr0", WALIGNR0)
+  IWMMXT_BUILTIN (iwmmxt_walignr1, "walignr1", WALIGNR1)
+  IWMMXT_BUILTIN (iwmmxt_walignr2, "walignr2", WALIGNR2)
+  IWMMXT_BUILTIN (iwmmxt_walignr3, "walignr3", WALIGNR3)
 
 #define IWMMXT_BUILTIN2(code, builtin) \
   { FL_IWMMXT, CODE_FOR_##code, NULL, ARM_BUILTIN_##builtin, UNKNOWN, 0 },
 
+#define IWMMXT2_BUILTIN2(code, builtin) \
+  { FL_IWMMXT2, CODE_FOR_##code, NULL, ARM_BUILTIN_##builtin, UNKNOWN, 0 },
+
+  IWMMXT2_BUILTIN2 (iwmmxt_waddbhusm, WADDBHUSM)
+  IWMMXT2_BUILTIN2 (iwmmxt_waddbhusl, WADDBHUSL)
   IWMMXT_BUILTIN2 (iwmmxt_wpackhss, WPACKHSS)
   IWMMXT_BUILTIN2 (iwmmxt_wpackwss, WPACKWSS)
   IWMMXT_BUILTIN2 (iwmmxt_wpackdss, WPACKDSS)
   IWMMXT_BUILTIN2 (iwmmxt_wpackhus, WPACKHUS)
   IWMMXT_BUILTIN2 (iwmmxt_wpackwus, WPACKWUS)
   IWMMXT_BUILTIN2 (iwmmxt_wpackdus, WPACKDUS)
-  IWMMXT_BUILTIN2 (ashlv4hi3_di,    WSLLH)
-  IWMMXT_BUILTIN2 (ashlv4hi3_iwmmxt, WSLLHI)
-  IWMMXT_BUILTIN2 (ashlv2si3_di,    WSLLW)
-  IWMMXT_BUILTIN2 (ashlv2si3_iwmmxt, WSLLWI)
-  IWMMXT_BUILTIN2 (ashldi3_di,      WSLLD)
-  IWMMXT_BUILTIN2 (ashldi3_iwmmxt,  WSLLDI)
-  IWMMXT_BUILTIN2 (lshrv4hi3_di,    WSRLH)
-  IWMMXT_BUILTIN2 (lshrv4hi3_iwmmxt, WSRLHI)
-  IWMMXT_BUILTIN2 (lshrv2si3_di,    WSRLW)
-  IWMMXT_BUILTIN2 (lshrv2si3_iwmmxt, WSRLWI)
-  IWMMXT_BUILTIN2 (lshrdi3_di,      WSRLD)
-  IWMMXT_BUILTIN2 (lshrdi3_iwmmxt,  WSRLDI)
-  IWMMXT_BUILTIN2 (ashrv4hi3_di,    WSRAH)
-  IWMMXT_BUILTIN2 (ashrv4hi3_iwmmxt, WSRAHI)
-  IWMMXT_BUILTIN2 (ashrv2si3_di,    WSRAW)
-  IWMMXT_BUILTIN2 (ashrv2si3_iwmmxt, WSRAWI)
-  IWMMXT_BUILTIN2 (ashrdi3_di,      WSRAD)
-  IWMMXT_BUILTIN2 (ashrdi3_iwmmxt,  WSRADI)
-  IWMMXT_BUILTIN2 (rorv4hi3_di,     WRORH)
-  IWMMXT_BUILTIN2 (rorv4hi3,        WRORHI)
-  IWMMXT_BUILTIN2 (rorv2si3_di,     WRORW)
-  IWMMXT_BUILTIN2 (rorv2si3,        WRORWI)
-  IWMMXT_BUILTIN2 (rordi3_di,       WRORD)
-  IWMMXT_BUILTIN2 (rordi3,          WRORDI)
-  IWMMXT_BUILTIN2 (iwmmxt_wmacuz,   WMACUZ)
-  IWMMXT_BUILTIN2 (iwmmxt_wmacsz,   WMACSZ)
+  IWMMXT_BUILTIN2 (iwmmxt_wmacuz, WMACUZ)
+  IWMMXT_BUILTIN2 (iwmmxt_wmacsz, WMACSZ)
 };
 
 static const struct builtin_description bdesc_1arg[] =
@@ -20445,6 +20536,12 @@ static const struct builtin_description bdesc_1arg[] =
   IWMMXT_BUILTIN (iwmmxt_wunpckelsb, "wunpckelsb", WUNPCKELSB)
   IWMMXT_BUILTIN (iwmmxt_wunpckelsh, "wunpckelsh", WUNPCKELSH)
   IWMMXT_BUILTIN (iwmmxt_wunpckelsw, "wunpckelsw", WUNPCKELSW)
+  IWMMXT2_BUILTIN (iwmmxt_wabsv8qi3, "wabsb", WABSB)
+  IWMMXT2_BUILTIN (iwmmxt_wabsv4hi3, "wabsh", WABSH)
+  IWMMXT2_BUILTIN (iwmmxt_wabsv2si3, "wabsw", WABSW)
+  IWMMXT_BUILTIN (tbcstv8qi, "tbcstb", TBCSTB)
+  IWMMXT_BUILTIN (tbcstv4hi, "tbcsth", TBCSTH)
+  IWMMXT_BUILTIN (tbcstv2si, "tbcstw", TBCSTW)
 };
 
 /* Set up all the iWMMXt builtins.  This is not called if
@@ -20460,9 +20557,6 @@ arm_init_iwmmxt_builtins (void)
   tree V4HI_type_node = build_vector_type_for_mode (intHI_type_node, V4HImode);
   tree V8QI_type_node = build_vector_type_for_mode (intQI_type_node, V8QImode);
 
-  tree int_ftype_int
-    = build_function_type_list (integer_type_node,
-				integer_type_node, NULL_TREE);
   tree v8qi_ftype_v8qi_v8qi_int
     = build_function_type_list (V8QI_type_node,
 				V8QI_type_node, V8QI_type_node,
@@ -20524,6 +20618,9 @@ arm_init_iwmmxt_builtins (void)
   tree v4hi_ftype_v2si_v2si
     = build_function_type_list (V4HI_type_node,
 				V2SI_type_node, V2SI_type_node, NULL_TREE);
+  tree v8qi_ftype_v4hi_v8qi
+    = build_function_type_list (V8QI_type_node,
+	                        V4HI_type_node, V8QI_type_node, NULL_TREE);
   tree v2si_ftype_v4hi_v4hi
     = build_function_type_list (V2SI_type_node,
 				V4HI_type_node, V4HI_type_node, NULL_TREE);
@@ -20538,12 +20635,10 @@ arm_init_iwmmxt_builtins (void)
     = build_function_type_list (V2SI_type_node,
 				V2SI_type_node, long_long_integer_type_node,
 				NULL_TREE);
-  tree void_ftype_int_int
-    = build_function_type_list (void_type_node,
-				integer_type_node, integer_type_node,
-				NULL_TREE);
   tree di_ftype_void
     = build_function_type_list (long_long_unsigned_type_node, NULL_TREE);
+  tree int_ftype_void
+    = build_function_type_list (integer_type_node, NULL_TREE);
   tree di_ftype_v8qi
     = build_function_type_list (long_long_integer_type_node,
 				V8QI_type_node, NULL_TREE);
@@ -20559,6 +20654,15 @@ arm_init_iwmmxt_builtins (void)
   tree v4hi_ftype_v8qi
     = build_function_type_list (V4HI_type_node,
 				V8QI_type_node, NULL_TREE);
+  tree v8qi_ftype_v8qi
+    = build_function_type_list (V8QI_type_node,
+	                        V8QI_type_node, NULL_TREE);
+  tree v4hi_ftype_v4hi
+    = build_function_type_list (V4HI_type_node,
+	                        V4HI_type_node, NULL_TREE);
+  tree v2si_ftype_v2si
+    = build_function_type_list (V2SI_type_node,
+	                        V2SI_type_node, NULL_TREE);
 
   tree di_ftype_di_v4hi_v4hi
     = build_function_type_list (long_long_unsigned_type_node,
@@ -20571,6 +20675,48 @@ arm_init_iwmmxt_builtins (void)
 				V4HI_type_node,V4HI_type_node,
 				NULL_TREE);
 
+  tree v2si_ftype_v2si_v4hi_v4hi
+    = build_function_type_list (V2SI_type_node,
+                                V2SI_type_node, V4HI_type_node,
+                                V4HI_type_node, NULL_TREE);
+
+  tree v2si_ftype_v2si_v8qi_v8qi
+    = build_function_type_list (V2SI_type_node,
+                                V2SI_type_node, V8QI_type_node,
+                                V8QI_type_node, NULL_TREE);
+
+  tree di_ftype_di_v2si_v2si
+     = build_function_type_list (long_long_unsigned_type_node,
+                                 long_long_unsigned_type_node,
+                                 V2SI_type_node, V2SI_type_node,
+                                 NULL_TREE);
+
+   tree di_ftype_di_di_int
+     = build_function_type_list (long_long_unsigned_type_node,
+                                 long_long_unsigned_type_node,
+                                 long_long_unsigned_type_node,
+                                 integer_type_node, NULL_TREE);
+
+   tree void_ftype_void
+     = build_function_type_list (void_type_node,
+                                 NULL_TREE);
+
+   tree void_ftype_int
+     = build_function_type_list (void_type_node,
+                                 integer_type_node, NULL_TREE);
+
+   tree v8qi_ftype_char
+     = build_function_type_list (V8QI_type_node,
+                                 signed_char_type_node, NULL_TREE);
+
+   tree v4hi_ftype_short
+     = build_function_type_list (V4HI_type_node,
+                                 short_integer_type_node, NULL_TREE);
+
+   tree v2si_ftype_int
+     = build_function_type_list (V2SI_type_node,
+                                 integer_type_node, NULL_TREE);
+
   /* Normal vector binops.  */
   tree v8qi_ftype_v8qi_v8qi
     = build_function_type_list (V8QI_type_node,
@@ -20628,9 +20774,19 @@ arm_init_iwmmxt_builtins (void)
   def_mbuiltin (FL_IWMMXT, "__builtin_arm_" NAME, (TYPE),	\
 		ARM_BUILTIN_ ## CODE)
 
+#define iwmmx2_mbuiltin(NAME, TYPE, CODE)                      \
+  def_mbuiltin (FL_IWMMXT2, "__builtin_arm_" NAME, (TYPE),     \
+               ARM_BUILTIN_ ## CODE)
+
   iwmmx_mbuiltin ("wzero", di_ftype_void, WZERO);
-  iwmmx_mbuiltin ("setwcx", void_ftype_int_int, SETWCX);
-  iwmmx_mbuiltin ("getwcx", int_ftype_int, GETWCX);
+  iwmmx_mbuiltin ("setwcgr0", void_ftype_int, SETWCGR0);
+  iwmmx_mbuiltin ("setwcgr1", void_ftype_int, SETWCGR1);
+  iwmmx_mbuiltin ("setwcgr2", void_ftype_int, SETWCGR2);
+  iwmmx_mbuiltin ("setwcgr3", void_ftype_int, SETWCGR3);
+  iwmmx_mbuiltin ("getwcgr0", int_ftype_void, GETWCGR0);
+  iwmmx_mbuiltin ("getwcgr1", int_ftype_void, GETWCGR1);
+  iwmmx_mbuiltin ("getwcgr2", int_ftype_void, GETWCGR2);
+  iwmmx_mbuiltin ("getwcgr3", int_ftype_void, GETWCGR3);
 
   iwmmx_mbuiltin ("wsllh", v4hi_ftype_v4hi_di, WSLLH);
   iwmmx_mbuiltin ("wsllw", v2si_ftype_v2si_di, WSLLW);
@@ -20662,8 +20818,14 @@ arm_init_iwmmxt_builtins (void)
 
   iwmmx_mbuiltin ("wshufh", v4hi_ftype_v4hi_int, WSHUFH);
 
-  iwmmx_mbuiltin ("wsadb", v2si_ftype_v8qi_v8qi, WSADB);
-  iwmmx_mbuiltin ("wsadh", v2si_ftype_v4hi_v4hi, WSADH);
+  iwmmx_mbuiltin ("wsadb", v2si_ftype_v2si_v8qi_v8qi, WSADB);
+  iwmmx_mbuiltin ("wsadh", v2si_ftype_v2si_v4hi_v4hi, WSADH);
+  iwmmx_mbuiltin ("wmadds", v2si_ftype_v4hi_v4hi, WMADDS);
+  iwmmx2_mbuiltin ("wmaddsx", v2si_ftype_v4hi_v4hi, WMADDSX);
+  iwmmx2_mbuiltin ("wmaddsn", v2si_ftype_v4hi_v4hi, WMADDSN);
+  iwmmx_mbuiltin ("wmaddu", v2si_ftype_v4hi_v4hi, WMADDU);
+  iwmmx2_mbuiltin ("wmaddux", v2si_ftype_v4hi_v4hi, WMADDUX);
+  iwmmx2_mbuiltin ("wmaddun", v2si_ftype_v4hi_v4hi, WMADDUN);
   iwmmx_mbuiltin ("wsadbz", v2si_ftype_v8qi_v8qi, WSADBZ);
   iwmmx_mbuiltin ("wsadhz", v2si_ftype_v4hi_v4hi, WSADHZ);
 
@@ -20685,6 +20847,9 @@ arm_init_iwmmxt_builtins (void)
   iwmmx_mbuiltin ("tmovmskh", int_ftype_v4hi, TMOVMSKH);
   iwmmx_mbuiltin ("tmovmskw", int_ftype_v2si, TMOVMSKW);
 
+  iwmmx2_mbuiltin ("waddbhusm", v8qi_ftype_v4hi_v8qi, WADDBHUSM);
+  iwmmx2_mbuiltin ("waddbhusl", v8qi_ftype_v4hi_v8qi, WADDBHUSL);
+
   iwmmx_mbuiltin ("wpackhss", v8qi_ftype_v4hi_v4hi, WPACKHSS);
   iwmmx_mbuiltin ("wpackhus", v8qi_ftype_v4hi_v4hi, WPACKHUS);
   iwmmx_mbuiltin ("wpackwus", v4hi_ftype_v2si_v2si, WPACKWUS);
@@ -20710,7 +20875,7 @@ arm_init_iwmmxt_builtins (void)
   iwmmx_mbuiltin ("wmacu", di_ftype_di_v4hi_v4hi, WMACU);
   iwmmx_mbuiltin ("wmacuz", di_ftype_v4hi_v4hi, WMACUZ);
 
-  iwmmx_mbuiltin ("walign", v8qi_ftype_v8qi_v8qi_int, WALIGN);
+  iwmmx_mbuiltin ("walign", v8qi_ftype_v8qi_v8qi_int, WALIGNI);
   iwmmx_mbuiltin ("tmia", di_ftype_di_int_int, TMIA);
   iwmmx_mbuiltin ("tmiaph", di_ftype_di_int_int, TMIAPH);
   iwmmx_mbuiltin ("tmiabb", di_ftype_di_int_int, TMIABB);
@@ -20718,7 +20883,48 @@ arm_init_iwmmxt_builtins (void)
   iwmmx_mbuiltin ("tmiatb", di_ftype_di_int_int, TMIATB);
   iwmmx_mbuiltin ("tmiatt", di_ftype_di_int_int, TMIATT);
 
+  iwmmx2_mbuiltin ("wabsb", v8qi_ftype_v8qi, WABSB);
+  iwmmx2_mbuiltin ("wabsh", v4hi_ftype_v4hi, WABSH);
+  iwmmx2_mbuiltin ("wabsw", v2si_ftype_v2si, WABSW);
+
+  iwmmx2_mbuiltin ("wqmiabb", v2si_ftype_v2si_v4hi_v4hi, WQMIABB);
+  iwmmx2_mbuiltin ("wqmiabt", v2si_ftype_v2si_v4hi_v4hi, WQMIABT);
+  iwmmx2_mbuiltin ("wqmiatb", v2si_ftype_v2si_v4hi_v4hi, WQMIATB);
+  iwmmx2_mbuiltin ("wqmiatt", v2si_ftype_v2si_v4hi_v4hi, WQMIATT);
+
+  iwmmx2_mbuiltin ("wqmiabbn", v2si_ftype_v2si_v4hi_v4hi, WQMIABBN);
+  iwmmx2_mbuiltin ("wqmiabtn", v2si_ftype_v2si_v4hi_v4hi, WQMIABTN);
+  iwmmx2_mbuiltin ("wqmiatbn", v2si_ftype_v2si_v4hi_v4hi, WQMIATBN);
+  iwmmx2_mbuiltin ("wqmiattn", v2si_ftype_v2si_v4hi_v4hi, WQMIATTN);
+
+  iwmmx2_mbuiltin ("wmiabb", di_ftype_di_v4hi_v4hi, WMIABB);
+  iwmmx2_mbuiltin ("wmiabt", di_ftype_di_v4hi_v4hi, WMIABT);
+  iwmmx2_mbuiltin ("wmiatb", di_ftype_di_v4hi_v4hi, WMIATB);
+  iwmmx2_mbuiltin ("wmiatt", di_ftype_di_v4hi_v4hi, WMIATT);
+
+  iwmmx2_mbuiltin ("wmiabbn", di_ftype_di_v4hi_v4hi, WMIABBN);
+  iwmmx2_mbuiltin ("wmiabtn", di_ftype_di_v4hi_v4hi, WMIABTN);
+  iwmmx2_mbuiltin ("wmiatbn", di_ftype_di_v4hi_v4hi, WMIATBN);
+  iwmmx2_mbuiltin ("wmiattn", di_ftype_di_v4hi_v4hi, WMIATTN);
+
+  iwmmx2_mbuiltin ("wmiawbb", di_ftype_di_v2si_v2si, WMIAWBB);
+  iwmmx2_mbuiltin ("wmiawbt", di_ftype_di_v2si_v2si, WMIAWBT);
+  iwmmx2_mbuiltin ("wmiawtb", di_ftype_di_v2si_v2si, WMIAWTB);
+  iwmmx2_mbuiltin ("wmiawtt", di_ftype_di_v2si_v2si, WMIAWTT);
+
+  iwmmx2_mbuiltin ("wmiawbbn", di_ftype_di_v2si_v2si, WMIAWBBN);
+  iwmmx2_mbuiltin ("wmiawbtn", di_ftype_di_v2si_v2si, WMIAWBTN);
+  iwmmx2_mbuiltin ("wmiawtbn", di_ftype_di_v2si_v2si, WMIAWTBN);
+  iwmmx2_mbuiltin ("wmiawttn", di_ftype_di_v2si_v2si, WMIAWTTN);
+
+  iwmmx2_mbuiltin ("wmerge", di_ftype_di_di_int, WMERGE);
+
+  iwmmx_mbuiltin ("tbcstb", v8qi_ftype_char, TBCSTB);
+  iwmmx_mbuiltin ("tbcsth", v4hi_ftype_short, TBCSTH);
+  iwmmx_mbuiltin ("tbcstw", v2si_ftype_int, TBCSTW);
+
 #undef iwmmx_mbuiltin
+#undef iwmmx2_mbuiltin
 }
 
 static void
@@ -21375,6 +21581,10 @@ arm_expand_builtin (tree exp,
   enum machine_mode mode0;
   enum machine_mode mode1;
   enum machine_mode mode2;
+  int opint;
+  int selector;
+  int mask;
+  int imm;
 
   if (fcode >= ARM_BUILTIN_NEON_BASE)
     return arm_expand_neon_builtin (fcode, exp, target);
@@ -21409,6 +21619,24 @@ arm_expand_builtin (tree exp,
 	  error ("selector must be an immediate");
 	  return gen_reg_rtx (tmode);
 	}
+
+      opint = INTVAL (op1);
+      if (fcode == ARM_BUILTIN_TEXTRMSB || fcode == ARM_BUILTIN_TEXTRMUB)
+	{
+	  if (opint > 7 || opint < 0)
+	    error ("the range of selector should be in 0 to 7");
+	}
+      else if (fcode == ARM_BUILTIN_TEXTRMSH || fcode == ARM_BUILTIN_TEXTRMUH)
+	{
+	  if (opint > 3 || opint < 0)
+	    error ("the range of selector should be in 0 to 3");
+	}
+      else /* ARM_BUILTIN_TEXTRMSW || ARM_BUILTIN_TEXTRMUW.  */
+	{
+	  if (opint > 1 || opint < 0)
+	    error ("the range of selector should be in 0 to 1");
+	}
+
       if (target == 0
 	  || GET_MODE (target) != tmode
 	  || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
@@ -21419,11 +21647,61 @@ arm_expand_builtin (tree exp,
       emit_insn (pat);
       return target;
 
+    case ARM_BUILTIN_WALIGNI:
+      /* If op2 is immediate, call waligni, else call walignr.  */
+      arg0 = CALL_EXPR_ARG (exp, 0);
+      arg1 = CALL_EXPR_ARG (exp, 1);
+      arg2 = CALL_EXPR_ARG (exp, 2);
+      op0 = expand_normal (arg0);
+      op1 = expand_normal (arg1);
+      op2 = expand_normal (arg2);
+      if (GET_CODE (op2) == CONST_INT)
+        {
+	  icode = CODE_FOR_iwmmxt_waligni;
+          tmode = insn_data[icode].operand[0].mode;
+	  mode0 = insn_data[icode].operand[1].mode;
+	  mode1 = insn_data[icode].operand[2].mode;
+	  mode2 = insn_data[icode].operand[3].mode;
+          if (!(*insn_data[icode].operand[1].predicate) (op0, mode0))
+	    op0 = copy_to_mode_reg (mode0, op0);
+          if (!(*insn_data[icode].operand[2].predicate) (op1, mode1))
+	    op1 = copy_to_mode_reg (mode1, op1);
+          gcc_assert ((*insn_data[icode].operand[3].predicate) (op2, mode2));
+	  selector = INTVAL (op2);
+	  if (selector > 7 || selector < 0)
+	    error ("the range of selector should be in 0 to 7");
+	}
+      else
+        {
+	  icode = CODE_FOR_iwmmxt_walignr;
+          tmode = insn_data[icode].operand[0].mode;
+	  mode0 = insn_data[icode].operand[1].mode;
+	  mode1 = insn_data[icode].operand[2].mode;
+	  mode2 = insn_data[icode].operand[3].mode;
+          if (!(*insn_data[icode].operand[1].predicate) (op0, mode0))
+	    op0 = copy_to_mode_reg (mode0, op0);
+          if (!(*insn_data[icode].operand[2].predicate) (op1, mode1))
+	    op1 = copy_to_mode_reg (mode1, op1);
+          if (!(*insn_data[icode].operand[3].predicate) (op2, mode2))
+	    op2 = copy_to_mode_reg (mode2, op2);
+	}
+      if (target == 0
+	  || GET_MODE (target) != tmode
+	  || !(*insn_data[icode].operand[0].predicate) (target, tmode))
+	target = gen_reg_rtx (tmode);
+      pat = GEN_FCN (icode) (target, op0, op1, op2);
+      if (!pat)
+	return 0;
+      emit_insn (pat);
+      return target;
+
     case ARM_BUILTIN_TINSRB:
     case ARM_BUILTIN_TINSRH:
     case ARM_BUILTIN_TINSRW:
+    case ARM_BUILTIN_WMERGE:
       icode = (fcode == ARM_BUILTIN_TINSRB ? CODE_FOR_iwmmxt_tinsrb
 	       : fcode == ARM_BUILTIN_TINSRH ? CODE_FOR_iwmmxt_tinsrh
+	       : fcode == ARM_BUILTIN_WMERGE ? CODE_FOR_iwmmxt_wmerge
 	       : CODE_FOR_iwmmxt_tinsrw);
       arg0 = CALL_EXPR_ARG (exp, 0);
       arg1 = CALL_EXPR_ARG (exp, 1);
@@ -21442,10 +21720,30 @@ arm_expand_builtin (tree exp,
 	op1 = copy_to_mode_reg (mode1, op1);
       if (! (*insn_data[icode].operand[3].predicate) (op2, mode2))
 	{
-	  /* @@@ better error message */
 	  error ("selector must be an immediate");
 	  return const0_rtx;
 	}
+      if (icode == CODE_FOR_iwmmxt_wmerge)
+	{
+	  selector = INTVAL (op2);
+	  if (selector > 7 || selector < 0)
+	    error ("the range of selector should be in 0 to 7");
+	}
+      if ((icode == CODE_FOR_iwmmxt_tinsrb)
+	  || (icode == CODE_FOR_iwmmxt_tinsrh)
+	  || (icode == CODE_FOR_iwmmxt_tinsrw))
+        {
+	  mask = 0x01;
+	  selector= INTVAL (op2);
+	  if (icode == CODE_FOR_iwmmxt_tinsrb && (selector < 0 || selector > 7))
+	    error ("the range of selector should be in 0 to 7");
+	  else if (icode == CODE_FOR_iwmmxt_tinsrh && (selector < 0 ||selector > 3))
+	    error ("the range of selector should be in 0 to 3");
+	  else if (icode == CODE_FOR_iwmmxt_tinsrw && (selector < 0 ||selector > 1))
+	    error ("the range of selector should be in 0 to 1");
+	  mask <<= selector;
+	  op2 = gen_rtx_CONST_INT (SImode, mask);
+	}
       if (target == 0
 	  || GET_MODE (target) != tmode
 	  || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
@@ -21456,19 +21754,42 @@ arm_expand_builtin (tree exp,
       emit_insn (pat);
       return target;
 
-    case ARM_BUILTIN_SETWCX:
+    case ARM_BUILTIN_SETWCGR0:
+    case ARM_BUILTIN_SETWCGR1:
+    case ARM_BUILTIN_SETWCGR2:
+    case ARM_BUILTIN_SETWCGR3:
+      icode = (fcode == ARM_BUILTIN_SETWCGR0 ? CODE_FOR_iwmmxt_setwcgr0
+	       : fcode == ARM_BUILTIN_SETWCGR1 ? CODE_FOR_iwmmxt_setwcgr1
+	       : fcode == ARM_BUILTIN_SETWCGR2 ? CODE_FOR_iwmmxt_setwcgr2
+	       : CODE_FOR_iwmmxt_setwcgr3);
       arg0 = CALL_EXPR_ARG (exp, 0);
-      arg1 = CALL_EXPR_ARG (exp, 1);
-      op0 = force_reg (SImode, expand_normal (arg0));
-      op1 = expand_normal (arg1);
-      emit_insn (gen_iwmmxt_tmcr (op1, op0));
+      op0 = expand_normal (arg0);
+      mode0 = insn_data[icode].operand[0].mode;
+      if (!(*insn_data[icode].operand[0].predicate) (op0, mode0))
+        op0 = copy_to_mode_reg (mode0, op0);
+      pat = GEN_FCN (icode) (op0);
+      if (!pat)
+	return 0;
+      emit_insn (pat);
       return 0;
 
-    case ARM_BUILTIN_GETWCX:
-      arg0 = CALL_EXPR_ARG (exp, 0);
-      op0 = expand_normal (arg0);
-      target = gen_reg_rtx (SImode);
-      emit_insn (gen_iwmmxt_tmrc (target, op0));
+    case ARM_BUILTIN_GETWCGR0:
+    case ARM_BUILTIN_GETWCGR1:
+    case ARM_BUILTIN_GETWCGR2:
+    case ARM_BUILTIN_GETWCGR3:
+      icode = (fcode == ARM_BUILTIN_GETWCGR0 ? CODE_FOR_iwmmxt_getwcgr0
+	       : fcode == ARM_BUILTIN_GETWCGR1 ? CODE_FOR_iwmmxt_getwcgr1
+	       : fcode == ARM_BUILTIN_GETWCGR2 ? CODE_FOR_iwmmxt_getwcgr2
+	       : CODE_FOR_iwmmxt_getwcgr3);
+      tmode = insn_data[icode].operand[0].mode;
+      if (target == 0
+	  || GET_MODE (target) != tmode
+	  || !(*insn_data[icode].operand[0].predicate) (target, tmode))
+        target = gen_reg_rtx (tmode);
+      pat = GEN_FCN (icode) (target);
+      if (!pat)
+        return 0;
+      emit_insn (pat);
       return target;
 
     case ARM_BUILTIN_WSHUFH:
@@ -21485,10 +21806,12 @@ arm_expand_builtin (tree exp,
 	op0 = copy_to_mode_reg (mode1, op0);
       if (! (*insn_data[icode].operand[2].predicate) (op1, mode2))
 	{
-	  /* @@@ better error message */
 	  error ("mask must be an immediate");
 	  return const0_rtx;
 	}
+      selector = INTVAL (op1);
+      if (selector < 0 || selector > 255)
+	error ("the range of mask should be in 0 to 255");
       if (target == 0
 	  || GET_MODE (target) != tmode
 	  || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
@@ -21499,10 +21822,18 @@ arm_expand_builtin (tree exp,
       emit_insn (pat);
       return target;
 
-    case ARM_BUILTIN_WSADB:
-      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wsadb, exp, target);
-    case ARM_BUILTIN_WSADH:
-      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wsadh, exp, target);
+    case ARM_BUILTIN_WMADDS:
+      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wmadds, exp, target);
+    case ARM_BUILTIN_WMADDSX:
+      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wmaddsx, exp, target);
+    case ARM_BUILTIN_WMADDSN:
+      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wmaddsn, exp, target);
+    case ARM_BUILTIN_WMADDU:
+      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wmaddu, exp, target);
+    case ARM_BUILTIN_WMADDUX:
+      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wmaddux, exp, target);
+    case ARM_BUILTIN_WMADDUN:
+      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wmaddun, exp, target);
     case ARM_BUILTIN_WSADBZ:
       return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wsadbz, exp, target);
     case ARM_BUILTIN_WSADHZ:
@@ -21511,13 +21842,38 @@ arm_expand_builtin (tree exp,
       /* Several three-argument builtins.  */
     case ARM_BUILTIN_WMACS:
     case ARM_BUILTIN_WMACU:
-    case ARM_BUILTIN_WALIGN:
     case ARM_BUILTIN_TMIA:
     case ARM_BUILTIN_TMIAPH:
     case ARM_BUILTIN_TMIATT:
     case ARM_BUILTIN_TMIATB:
     case ARM_BUILTIN_TMIABT:
     case ARM_BUILTIN_TMIABB:
+    case ARM_BUILTIN_WQMIABB:
+    case ARM_BUILTIN_WQMIABT:
+    case ARM_BUILTIN_WQMIATB:
+    case ARM_BUILTIN_WQMIATT:
+    case ARM_BUILTIN_WQMIABBN:
+    case ARM_BUILTIN_WQMIABTN:
+    case ARM_BUILTIN_WQMIATBN:
+    case ARM_BUILTIN_WQMIATTN:
+    case ARM_BUILTIN_WMIABB:
+    case ARM_BUILTIN_WMIABT:
+    case ARM_BUILTIN_WMIATB:
+    case ARM_BUILTIN_WMIATT:
+    case ARM_BUILTIN_WMIABBN:
+    case ARM_BUILTIN_WMIABTN:
+    case ARM_BUILTIN_WMIATBN:
+    case ARM_BUILTIN_WMIATTN:
+    case ARM_BUILTIN_WMIAWBB:
+    case ARM_BUILTIN_WMIAWBT:
+    case ARM_BUILTIN_WMIAWTB:
+    case ARM_BUILTIN_WMIAWTT:
+    case ARM_BUILTIN_WMIAWBBN:
+    case ARM_BUILTIN_WMIAWBTN:
+    case ARM_BUILTIN_WMIAWTBN:
+    case ARM_BUILTIN_WMIAWTTN:
+    case ARM_BUILTIN_WSADB:
+    case ARM_BUILTIN_WSADH:
       icode = (fcode == ARM_BUILTIN_WMACS ? CODE_FOR_iwmmxt_wmacs
 	       : fcode == ARM_BUILTIN_WMACU ? CODE_FOR_iwmmxt_wmacu
 	       : fcode == ARM_BUILTIN_TMIA ? CODE_FOR_iwmmxt_tmia
@@ -21526,7 +21882,32 @@ arm_expand_builtin (tree exp,
 	       : fcode == ARM_BUILTIN_TMIABT ? CODE_FOR_iwmmxt_tmiabt
 	       : fcode == ARM_BUILTIN_TMIATB ? CODE_FOR_iwmmxt_tmiatb
 	       : fcode == ARM_BUILTIN_TMIATT ? CODE_FOR_iwmmxt_tmiatt
-	       : CODE_FOR_iwmmxt_walign);
+	       : fcode == ARM_BUILTIN_WQMIABB ? CODE_FOR_iwmmxt_wqmiabb
+	       : fcode == ARM_BUILTIN_WQMIABT ? CODE_FOR_iwmmxt_wqmiabt
+	       : fcode == ARM_BUILTIN_WQMIATB ? CODE_FOR_iwmmxt_wqmiatb
+	       : fcode == ARM_BUILTIN_WQMIATT ? CODE_FOR_iwmmxt_wqmiatt
+	       : fcode == ARM_BUILTIN_WQMIABBN ? CODE_FOR_iwmmxt_wqmiabbn
+	       : fcode == ARM_BUILTIN_WQMIABTN ? CODE_FOR_iwmmxt_wqmiabtn
+	       : fcode == ARM_BUILTIN_WQMIATBN ? CODE_FOR_iwmmxt_wqmiatbn
+	       : fcode == ARM_BUILTIN_WQMIATTN ? CODE_FOR_iwmmxt_wqmiattn
+	       : fcode == ARM_BUILTIN_WMIABB ? CODE_FOR_iwmmxt_wmiabb
+	       : fcode == ARM_BUILTIN_WMIABT ? CODE_FOR_iwmmxt_wmiabt
+	       : fcode == ARM_BUILTIN_WMIATB ? CODE_FOR_iwmmxt_wmiatb
+	       : fcode == ARM_BUILTIN_WMIATT ? CODE_FOR_iwmmxt_wmiatt
+	       : fcode == ARM_BUILTIN_WMIABBN ? CODE_FOR_iwmmxt_wmiabbn
+	       : fcode == ARM_BUILTIN_WMIABTN ? CODE_FOR_iwmmxt_wmiabtn
+	       : fcode == ARM_BUILTIN_WMIATBN ? CODE_FOR_iwmmxt_wmiatbn
+	       : fcode == ARM_BUILTIN_WMIATTN ? CODE_FOR_iwmmxt_wmiattn
+	       : fcode == ARM_BUILTIN_WMIAWBB ? CODE_FOR_iwmmxt_wmiawbb
+	       : fcode == ARM_BUILTIN_WMIAWBT ? CODE_FOR_iwmmxt_wmiawbt
+	       : fcode == ARM_BUILTIN_WMIAWTB ? CODE_FOR_iwmmxt_wmiawtb
+	       : fcode == ARM_BUILTIN_WMIAWTT ? CODE_FOR_iwmmxt_wmiawtt
+	       : fcode == ARM_BUILTIN_WMIAWBBN ? CODE_FOR_iwmmxt_wmiawbbn
+	       : fcode == ARM_BUILTIN_WMIAWBTN ? CODE_FOR_iwmmxt_wmiawbtn
+	       : fcode == ARM_BUILTIN_WMIAWTBN ? CODE_FOR_iwmmxt_wmiawtbn
+	       : fcode == ARM_BUILTIN_WMIAWTTN ? CODE_FOR_iwmmxt_wmiawttn
+	       : fcode == ARM_BUILTIN_WSADB ? CODE_FOR_iwmmxt_wsadb
+	       : CODE_FOR_iwmmxt_wsadh);
       arg0 = CALL_EXPR_ARG (exp, 0);
       arg1 = CALL_EXPR_ARG (exp, 1);
       arg2 = CALL_EXPR_ARG (exp, 2);
@@ -21559,6 +21940,123 @@ arm_expand_builtin (tree exp,
       emit_insn (gen_iwmmxt_clrdi (target));
       return target;
 
+    case ARM_BUILTIN_WSRLHI:
+    case ARM_BUILTIN_WSRLWI:
+    case ARM_BUILTIN_WSRLDI:
+    case ARM_BUILTIN_WSLLHI:
+    case ARM_BUILTIN_WSLLWI:
+    case ARM_BUILTIN_WSLLDI:
+    case ARM_BUILTIN_WSRAHI:
+    case ARM_BUILTIN_WSRAWI:
+    case ARM_BUILTIN_WSRADI:
+    case ARM_BUILTIN_WRORHI:
+    case ARM_BUILTIN_WRORWI:
+    case ARM_BUILTIN_WRORDI:
+    case ARM_BUILTIN_WSRLH:
+    case ARM_BUILTIN_WSRLW:
+    case ARM_BUILTIN_WSRLD:
+    case ARM_BUILTIN_WSLLH:
+    case ARM_BUILTIN_WSLLW:
+    case ARM_BUILTIN_WSLLD:
+    case ARM_BUILTIN_WSRAH:
+    case ARM_BUILTIN_WSRAW:
+    case ARM_BUILTIN_WSRAD:
+    case ARM_BUILTIN_WRORH:
+    case ARM_BUILTIN_WRORW:
+    case ARM_BUILTIN_WRORD:
+      icode = (fcode == ARM_BUILTIN_WSRLHI ? CODE_FOR_lshrv4hi3_iwmmxt
+	       : fcode == ARM_BUILTIN_WSRLWI ? CODE_FOR_lshrv2si3_iwmmxt
+	       : fcode == ARM_BUILTIN_WSRLDI ? CODE_FOR_lshrdi3_iwmmxt
+	       : fcode == ARM_BUILTIN_WSLLHI ? CODE_FOR_ashlv4hi3_iwmmxt
+	       : fcode == ARM_BUILTIN_WSLLWI ? CODE_FOR_ashlv2si3_iwmmxt
+	       : fcode == ARM_BUILTIN_WSLLDI ? CODE_FOR_ashldi3_iwmmxt
+	       : fcode == ARM_BUILTIN_WSRAHI ? CODE_FOR_ashrv4hi3_iwmmxt
+	       : fcode == ARM_BUILTIN_WSRAWI ? CODE_FOR_ashrv2si3_iwmmxt
+	       : fcode == ARM_BUILTIN_WSRADI ? CODE_FOR_ashrdi3_iwmmxt
+	       : fcode == ARM_BUILTIN_WRORHI ? CODE_FOR_rorv4hi3
+	       : fcode == ARM_BUILTIN_WRORWI ? CODE_FOR_rorv2si3
+	       : fcode == ARM_BUILTIN_WRORDI ? CODE_FOR_rordi3
+	       : fcode == ARM_BUILTIN_WSRLH  ? CODE_FOR_lshrv4hi3_di
+	       : fcode == ARM_BUILTIN_WSRLW  ? CODE_FOR_lshrv2si3_di
+	       : fcode == ARM_BUILTIN_WSRLD  ? CODE_FOR_lshrdi3_di
+	       : fcode == ARM_BUILTIN_WSLLH  ? CODE_FOR_ashlv4hi3_di
+	       : fcode == ARM_BUILTIN_WSLLW  ? CODE_FOR_ashlv2si3_di
+	       : fcode == ARM_BUILTIN_WSLLD  ? CODE_FOR_ashldi3_di
+	       : fcode == ARM_BUILTIN_WSRAH  ? CODE_FOR_ashrv4hi3_di
+	       : fcode == ARM_BUILTIN_WSRAW  ? CODE_FOR_ashrv2si3_di
+	       : fcode == ARM_BUILTIN_WSRAD  ? CODE_FOR_ashrdi3_di
+	       : fcode == ARM_BUILTIN_WRORH  ? CODE_FOR_rorv4hi3_di
+	       : fcode == ARM_BUILTIN_WRORW  ? CODE_FOR_rorv2si3_di
+	       : fcode == ARM_BUILTIN_WRORD  ? CODE_FOR_rordi3_di
+	       : CODE_FOR_nothing);
+      arg1 = CALL_EXPR_ARG (exp, 1);
+      op1 = expand_normal (arg1);
+      if (GET_MODE (op1) == VOIDmode)
+	{
+	  imm = INTVAL (op1);
+	  if ((fcode == ARM_BUILTIN_WRORHI || fcode == ARM_BUILTIN_WRORWI
+	       || fcode == ARM_BUILTIN_WRORH || fcode == ARM_BUILTIN_WRORW)
+	      && (imm < 0 || imm > 32))
+	    {
+	      if (fcode == ARM_BUILTIN_WRORHI)
+		error ("the range of count should be in 0 to 32.  please check the intrinsic _mm_rori_pi16 in code.");
+	      else if (fcode == ARM_BUILTIN_WRORWI)
+		error ("the range of count should be in 0 to 32.  please check the intrinsic _mm_rori_pi32 in code.");
+	      else if (fcode == ARM_BUILTIN_WRORH)
+		error ("the range of count should be in 0 to 32.  please check the intrinsic _mm_ror_pi16 in code.");
+	      else
+		error ("the range of count should be in 0 to 32.  please check the intrinsic _mm_ror_pi32 in code.");
+	    }
+	  else if ((fcode == ARM_BUILTIN_WRORDI || fcode == ARM_BUILTIN_WRORD)
+		   && (imm < 0 || imm > 64))
+	    {
+	      if (fcode == ARM_BUILTIN_WRORDI)
+		error ("the range of count should be in 0 to 64.  please check the intrinsic _mm_rori_si64 in code.");
+	      else
+		error ("the range of count should be in 0 to 64.  please check the intrinsic _mm_ror_si64 in code.");
+	    }
+	  else if (imm < 0)
+	    {
+	      if (fcode == ARM_BUILTIN_WSRLHI)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_srli_pi16 in code.");
+	      else if (fcode == ARM_BUILTIN_WSRLWI)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_srli_pi32 in code.");
+	      else if (fcode == ARM_BUILTIN_WSRLDI)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_srli_si64 in code.");
+	      else if (fcode == ARM_BUILTIN_WSLLHI)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_slli_pi16 in code.");
+	      else if (fcode == ARM_BUILTIN_WSLLWI)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_slli_pi32 in code.");
+	      else if (fcode == ARM_BUILTIN_WSLLDI)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_slli_si64 in code.");
+	      else if (fcode == ARM_BUILTIN_WSRAHI)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_srai_pi16 in code.");
+	      else if (fcode == ARM_BUILTIN_WSRAWI)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_srai_pi32 in code.");
+	      else if (fcode == ARM_BUILTIN_WSRADI)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_srai_si64 in code.");
+	      else if (fcode == ARM_BUILTIN_WSRLH)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_srl_pi16 in code.");
+	      else if (fcode == ARM_BUILTIN_WSRLW)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_srl_pi32 in code.");
+	      else if (fcode == ARM_BUILTIN_WSRLD)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_srl_si64 in code.");
+	      else if (fcode == ARM_BUILTIN_WSLLH)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_sll_pi16 in code.");
+	      else if (fcode == ARM_BUILTIN_WSLLW)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_sll_pi32 in code.");
+	      else if (fcode == ARM_BUILTIN_WSLLD)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_sll_si64 in code.");
+	      else if (fcode == ARM_BUILTIN_WSRAH)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_sra_pi16 in code.");
+	      else if (fcode == ARM_BUILTIN_WSRAW)
+		error ("the count should be no less than 0.  please check the intrinsic _mm_sra_pi32 in code.");
+	      else
+		error ("the count should be no less than 0.  please check the intrinsic _mm_sra_si64 in code.");
+	    }
+	}
+      return arm_expand_binop_builtin (icode, exp, target);
+
     case ARM_BUILTIN_THREAD_POINTER:
       return arm_load_tp (target);
 
-- 
1.7.3.4
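
The constant-count checks added to arm_expand_builtin above can be summarized in a small standalone model. This is only a sketch of the range logic as it reads in the diff, not GCC code: the enum shift_kind and the function count_in_range are hypothetical names introduced here for illustration, and the real code reports a diagnostic through error () instead of returning a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical grouping of the builtins; these are not GCC identifiers.
   SHIFT_ROR_16_32 stands for WRORHI/WRORWI/WRORH/WRORW,
   SHIFT_ROR_64 for WRORDI/WRORD, SHIFT_OTHER for every other shift.  */
enum shift_kind { SHIFT_ROR_16_32, SHIFT_ROR_64, SHIFT_OTHER };

/* Return true when a constant count would pass the checks in the patch:
   16- and 32-bit rotates accept 0..32, 64-bit rotates accept 0..64,
   and all remaining shifts only reject negative counts.  */
bool
count_in_range (enum shift_kind kind, long imm)
{
  switch (kind)
    {
    case SHIFT_ROR_16_32:
      return imm >= 0 && imm <= 32;
    case SHIFT_ROR_64:
      return imm >= 0 && imm <= 64;
    case SHIFT_OTHER:
      return imm >= 0;
    }
  assert (!"unknown shift kind");
  return false;
}
```

Note that, as in the patch, large positive counts for the non-rotate shifts are not rejected here; they are handled later when the instruction is output.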

* [PATCH ARM iWMMXt 5/5] pipeline description
  2012-05-29  4:13 [PATCH ARM iWMMXt 0/5] Improve iWMMXt support Matt Turner
@ 2012-05-29  4:14 ` Matt Turner
  2012-05-29  4:14 ` [PATCH ARM iWMMXt 1/5] ARM code generic change Matt Turner
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 29+ messages in thread
From: Matt Turner @ 2012-05-29  4:14 UTC (permalink / raw)
  To: gcc-patches
  Cc: Ramana Radhakrishnan, Richard Earnshaw, Nick Clifton, Paul Brook,
	Xinyu Qi

From: Xinyu Qi <xyqi@marvell.com>

	gcc/
	* config/arm/t-arm (MD_INCLUDES): Add marvell-f-iwmmxt.md.
	* config/arm/marvell-f-iwmmxt.md: New file.
	* config/arm/arm.md (marvell-f-iwmmxt.md): Include.
---
 gcc/config/arm/arm.md              |    1 +
 gcc/config/arm/marvell-f-iwmmxt.md |  179 ++++++++++++++++++++++++++++++++++++
 gcc/config/arm/t-arm               |    1 +
 3 files changed, 181 insertions(+), 0 deletions(-)
 create mode 100644 gcc/config/arm/marvell-f-iwmmxt.md

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index b0333c2..baa3b7c 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -546,6 +546,7 @@
 	  (const_string "yes")
 	  (const_string "no"))))
 
+(include "marvell-f-iwmmxt.md")
 (include "arm-generic.md")
 (include "arm926ejs.md")
 (include "arm1020e.md")
diff --git a/gcc/config/arm/marvell-f-iwmmxt.md b/gcc/config/arm/marvell-f-iwmmxt.md
new file mode 100644
index 0000000..fe8e455
--- /dev/null
+++ b/gcc/config/arm/marvell-f-iwmmxt.md
@@ -0,0 +1,179 @@
+;; Marvell WMMX2 pipeline description
+;; Copyright (C) 2011 Free Software Foundation, Inc.
+;; Written by Marvell, Inc.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+
+(define_automaton "marvell_f_iwmmxt")
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Pipelines
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+;; This is a 7-stage pipeline:
+;;
+;;    MD | MI | ME1 | ME2 | ME3 | ME4 | MW
+;;
+;; There are various bypasses modelled to a greater or lesser extent.
+;;
+;; Latencies in this file correspond to the number of cycles after
+;; the issue stage that it takes for the result of the instruction to
+;; be computed, or for its side-effects to occur.
+
+(define_cpu_unit "mf_iwmmxt_MD" "marvell_f_iwmmxt")
+(define_cpu_unit "mf_iwmmxt_MI" "marvell_f_iwmmxt")
+(define_cpu_unit "mf_iwmmxt_ME1" "marvell_f_iwmmxt")
+(define_cpu_unit "mf_iwmmxt_ME2" "marvell_f_iwmmxt")
+(define_cpu_unit "mf_iwmmxt_ME3" "marvell_f_iwmmxt")
+(define_cpu_unit "mf_iwmmxt_ME4" "marvell_f_iwmmxt")
+(define_cpu_unit "mf_iwmmxt_MW" "marvell_f_iwmmxt")
+
+(define_reservation "mf_iwmmxt_ME"
+      "mf_iwmmxt_ME1,mf_iwmmxt_ME2,mf_iwmmxt_ME3,mf_iwmmxt_ME4"
+)
+
+(define_reservation "mf_iwmmxt_pipeline"
+      "mf_iwmmxt_MD, mf_iwmmxt_MI, mf_iwmmxt_ME, mf_iwmmxt_MW"
+)
+
+;; An attribute to indicate whether our reservations are applicable.
+(define_attr "marvell_f_iwmmxt" "yes,no"
+  (const (if_then_else (symbol_ref "arm_arch_iwmmxt")
+                       (const_string "yes") (const_string "no"))))
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; instruction classes
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+;; An attribute appended to instructions for classification
+
+(define_attr "wmmxt_shift" "yes,no"
+  (if_then_else (eq_attr "wtype" "wror, wsll, wsra, wsrl")
+		(const_string "yes") (const_string "no"))
+)
+
+(define_attr "wmmxt_pack" "yes,no"
+  (if_then_else (eq_attr "wtype" "waligni, walignr, wmerge, wpack, wshufh, wunpckeh, wunpckih, wunpckel, wunpckil")
+		(const_string "yes") (const_string "no"))
+)
+
+(define_attr "wmmxt_mult_c1" "yes,no"
+  (if_then_else (eq_attr "wtype" "wmac, wmadd, wmiaxy, wmiawxy, wmulw, wqmiaxy, wqmulwm")
+		(const_string "yes") (const_string "no"))
+)
+
+(define_attr "wmmxt_mult_c2" "yes,no"
+  (if_then_else (eq_attr "wtype" "wmul, wqmulm")
+		(const_string "yes") (const_string "no"))
+)
+
+(define_attr "wmmxt_alu_c1" "yes,no"
+  (if_then_else (eq_attr "wtype" "wabs, wabsdiff, wand, wandn, wmov, wor, wxor")
+	        (const_string "yes") (const_string "no"))
+)
+
+(define_attr "wmmxt_alu_c2" "yes,no"
+  (if_then_else (eq_attr "wtype" "wacc, wadd, waddsubhx, wavg2, wavg4, wcmpeq, wcmpgt, wmax, wmin, wsub, waddbhus, wsubaddhx")
+		(const_string "yes") (const_string "no"))
+)
+
+(define_attr "wmmxt_alu_c3" "yes,no"
+  (if_then_else (eq_attr "wtype" "wsad")
+	        (const_string "yes") (const_string "no"))
+)
+
+(define_attr "wmmxt_transfer_c1" "yes,no"
+  (if_then_else (eq_attr "wtype" "tbcst, tinsr, tmcr, tmcrr")
+                (const_string "yes") (const_string "no"))
+)
+
+(define_attr "wmmxt_transfer_c2" "yes,no"
+  (if_then_else (eq_attr "wtype" "textrm, tmovmsk, tmrc, tmrrc")
+	        (const_string "yes") (const_string "no"))
+)
+
+(define_attr "wmmxt_transfer_c3" "yes,no"
+  (if_then_else (eq_attr "wtype" "tmia, tmiaph, tmiaxy")
+	        (const_string "yes") (const_string "no"))
+)
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Main description
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+(define_insn_reservation "marvell_f_iwmmxt_alu_c1" 1
+  (and (eq_attr "marvell_f_iwmmxt" "yes")
+       (eq_attr "wmmxt_alu_c1" "yes"))
+  "mf_iwmmxt_pipeline")
+
+(define_insn_reservation "marvell_f_iwmmxt_pack" 1
+  (and (eq_attr "marvell_f_iwmmxt" "yes")
+       (eq_attr "wmmxt_pack" "yes"))
+  "mf_iwmmxt_pipeline")
+
+(define_insn_reservation "marvell_f_iwmmxt_shift" 1
+  (and (eq_attr "marvell_f_iwmmxt" "yes")
+       (eq_attr "wmmxt_shift" "yes"))
+  "mf_iwmmxt_pipeline")
+
+(define_insn_reservation "marvell_f_iwmmxt_transfer_c1" 1
+  (and (eq_attr "marvell_f_iwmmxt" "yes")
+       (eq_attr "wmmxt_transfer_c1" "yes"))
+  "mf_iwmmxt_pipeline")
+
+(define_insn_reservation "marvell_f_iwmmxt_transfer_c2" 5
+  (and (eq_attr "marvell_f_iwmmxt" "yes")
+       (eq_attr "wmmxt_transfer_c2" "yes"))
+  "mf_iwmmxt_pipeline")
+
+(define_insn_reservation "marvell_f_iwmmxt_alu_c2" 2
+  (and (eq_attr "marvell_f_iwmmxt" "yes")
+       (eq_attr "wmmxt_alu_c2" "yes"))
+  "mf_iwmmxt_pipeline")
+
+(define_insn_reservation "marvell_f_iwmmxt_alu_c3" 3
+  (and (eq_attr "marvell_f_iwmmxt" "yes")
+       (eq_attr "wmmxt_alu_c3" "yes"))
+  "mf_iwmmxt_pipeline")
+
+(define_insn_reservation "marvell_f_iwmmxt_transfer_c3" 4
+  (and (eq_attr "marvell_f_iwmmxt" "yes")
+       (eq_attr "wmmxt_transfer_c3" "yes"))
+  "mf_iwmmxt_pipeline")
+
+(define_insn_reservation "marvell_f_iwmmxt_mult_c1" 4
+  (and (eq_attr "marvell_f_iwmmxt" "yes")
+       (eq_attr "wmmxt_mult_c1" "yes"))
+  "mf_iwmmxt_pipeline")
+
+;There is a forwarding path from ME3 stage
+(define_insn_reservation "marvell_f_iwmmxt_mult_c2" 3
+  (and (eq_attr "marvell_f_iwmmxt" "yes")
+       (eq_attr "wmmxt_mult_c2" "yes"))
+  "mf_iwmmxt_pipeline")
+
+(define_insn_reservation "marvell_f_iwmmxt_wstr" 0
+  (and (eq_attr "marvell_f_iwmmxt" "yes")
+       (eq_attr "wtype" "wstr"))
+  "mf_iwmmxt_pipeline")
+
+;There is a forwarding path from MW stage
+(define_insn_reservation "marvell_f_iwmmxt_wldr" 5
+  (and (eq_attr "marvell_f_iwmmxt" "yes")
+       (eq_attr "wtype" "wldr"))
+  "mf_iwmmxt_pipeline")
diff --git a/gcc/config/arm/t-arm b/gcc/config/arm/t-arm
index 83c18f7..30687e1 100644
--- a/gcc/config/arm/t-arm
+++ b/gcc/config/arm/t-arm
@@ -51,6 +51,7 @@ MD_INCLUDES=	$(srcdir)/config/arm/arm1020e.md \
 		$(srcdir)/config/arm/iwmmxt.md \
 		$(srcdir)/config/arm/iwmmxt2.md \
 		$(srcdir)/config/arm/ldmstm.md \
+		$(srcdir)/config/arm/marvell-f-iwmmxt.md \
 		$(srcdir)/config/arm/neon.md \
 		$(srcdir)/config/arm/predicates.md \
 		$(srcdir)/config/arm/sync.md \
-- 
1.7.3.4
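
For quick reference, the latencies given to define_insn_reservation in marvell-f-iwmmxt.md can be collected into one lookup table. The sketch below only restates the numbers from the diff; the struct, the table and the function latency_for are hypothetical names made up for this illustration, and the class names are the reservation names with the "marvell_f_iwmmxt_" prefix dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row per define_insn_reservation in the patch.  */
struct class_latency
{
  const char *name;   /* Reservation name without the common prefix.  */
  int cycles;         /* Latency operand of define_insn_reservation.  */
};

static const struct class_latency latencies[] = {
  { "alu_c1", 1 },      { "pack", 1 },        { "shift", 1 },
  { "transfer_c1", 1 }, { "transfer_c2", 5 }, { "alu_c2", 2 },
  { "alu_c3", 3 },      { "transfer_c3", 4 }, { "mult_c1", 4 },
  { "mult_c2", 3 },     { "wstr", 0 },        { "wldr", 5 },
};

/* Return the latency for NAME, or -1 if NAME is not in the table.  */
int
latency_for (const char *name)
{
  size_t i;

  assert (name != NULL);
  for (i = 0; i < sizeof latencies / sizeof latencies[0]; i++)
    if (strcmp (latencies[i].name, name) == 0)
      return latencies[i].cycles;
  return -1;
}
```

All twelve classes reserve the same mf_iwmmxt_pipeline units, so in this description they differ only in the latency value.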

* [PATCH ARM iWMMXt 1/5] ARM code generic change
  2012-05-29  4:13 [PATCH ARM iWMMXt 0/5] Improve iWMMXt support Matt Turner
  2012-05-29  4:14 ` [PATCH ARM iWMMXt 5/5] pipeline description Matt Turner
@ 2012-05-29  4:14 ` Matt Turner
  2012-06-06 11:53   ` Ramana Radhakrishnan
  2012-05-29  4:14 ` [PATCH ARM iWMMXt 3/5] built in define and expand Matt Turner
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Matt Turner @ 2012-05-29  4:14 UTC (permalink / raw)
  To: gcc-patches
  Cc: Ramana Radhakrishnan, Richard Earnshaw, Nick Clifton, Paul Brook,
	Xinyu Qi

From: Xinyu Qi <xyqi@marvell.com>

	gcc/
	* config/arm/arm.c (FL_IWMMXT2): New define.
	(arm_arch_iwmmxt2): New variable.
	(arm_option_override): Enable use of iWMMXt with VFP.
	Disable use of iWMMXt with NEON. Disable use of iWMMXt under
	Thumb mode. Set arm_arch_iwmmxt2.
	(arm_expand_binop_builtin): Accept VOIDmode op.
	* config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Define __IWMMXT2__.
	(TARGET_IWMMXT2): New define.
	(TARGET_REALLY_IWMMXT2): Likewise.
	(arm_arch_iwmmxt2): Declare.
	* config/arm/arm-cores.def (iwmmxt2): Add FL_IWMMXT2.
	* config/arm/arm-arches.def (iwmmxt2): Likewise.
	* config/arm/arm.md (arch): Add "iwmmxt2".
	(arch_enabled): Handle "iwmmxt2".
---
 gcc/config/arm/arm-arches.def |    2 +-
 gcc/config/arm/arm-cores.def  |    2 +-
 gcc/config/arm/arm.c          |   25 +++++++++++++++++--------
 gcc/config/arm/arm.h          |    7 +++++++
 gcc/config/arm/arm.md         |    6 +++++-
 5 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/gcc/config/arm/arm-arches.def b/gcc/config/arm/arm-arches.def
index 3123426..f4dd6cc 100644
--- a/gcc/config/arm/arm-arches.def
+++ b/gcc/config/arm/arm-arches.def
@@ -57,4 +57,4 @@ ARM_ARCH("armv7-m", cortexm3,	7M,  FL_CO_PROC |	      FL_FOR_ARCH7M)
 ARM_ARCH("armv7e-m", cortexm4,  7EM, FL_CO_PROC |	      FL_FOR_ARCH7EM)
 ARM_ARCH("ep9312",  ep9312,     4T,  FL_LDSCHED | FL_CIRRUS | FL_FOR_ARCH4)
 ARM_ARCH("iwmmxt",  iwmmxt,     5TE, FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT)
-ARM_ARCH("iwmmxt2", iwmmxt2,    5TE, FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT)
+ARM_ARCH("iwmmxt2", iwmmxt2,    5TE, FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT | FL_IWMMXT2)
diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index d82b10b..c82eada 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -105,7 +105,7 @@ ARM_CORE("arm1020e",      arm1020e,	5TE,				 FL_LDSCHED, fastmul)
 ARM_CORE("arm1022e",      arm1022e,	5TE,				 FL_LDSCHED, fastmul)
 ARM_CORE("xscale",        xscale,	5TE,	                         FL_LDSCHED | FL_STRONG | FL_XSCALE, xscale)
 ARM_CORE("iwmmxt",        iwmmxt,	5TE,	                         FL_LDSCHED | FL_STRONG | FL_XSCALE | FL_IWMMXT, xscale)
-ARM_CORE("iwmmxt2",       iwmmxt2,	5TE,	                         FL_LDSCHED | FL_STRONG | FL_XSCALE | FL_IWMMXT, xscale)
+ARM_CORE("iwmmxt2",       iwmmxt2,	5TE,	                         FL_LDSCHED | FL_STRONG | FL_XSCALE | FL_IWMMXT | FL_IWMMXT2, xscale)
 ARM_CORE("fa606te",       fa606te,      5TE,                             FL_LDSCHED, 9e)
 ARM_CORE("fa626te",       fa626te,      5TE,                             FL_LDSCHED, 9e)
 ARM_CORE("fmp626",        fmp626,       5TE,                             FL_LDSCHED, 9e)
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7a98197..b0680ab 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -685,6 +685,7 @@ static int thumb_call_reg_needed;
 #define FL_ARM_DIV    (1 << 23)	      /* Hardware divide (ARM mode).  */
 
 #define FL_IWMMXT     (1 << 29)	      /* XScale v2 or "Intel Wireless MMX technology".  */
+#define FL_IWMMXT2    (1 << 30)       /* "Intel Wireless MMX2 technology".  */
 
 /* Flags that only effect tuning, not available instructions.  */
 #define FL_TUNE		(FL_WBUF | FL_VFPV2 | FL_STRONG | FL_LDSCHED \
@@ -766,6 +767,9 @@ int arm_arch_cirrus = 0;
 /* Nonzero if this chip supports Intel Wireless MMX technology.  */
 int arm_arch_iwmmxt = 0;
 
+/* Nonzero if this chip supports Intel Wireless MMX2 technology.  */
+int arm_arch_iwmmxt2 = 0;
+
 /* Nonzero if this chip is an XScale.  */
 int arm_arch_xscale = 0;
 
@@ -1717,6 +1721,7 @@ arm_option_override (void)
   arm_tune_wbuf = (tune_flags & FL_WBUF) != 0;
   arm_tune_xscale = (tune_flags & FL_XSCALE) != 0;
   arm_arch_iwmmxt = (insn_flags & FL_IWMMXT) != 0;
+  arm_arch_iwmmxt2 = (insn_flags & FL_IWMMXT2) != 0;
   arm_arch_thumb_hwdiv = (insn_flags & FL_THUMB_DIV) != 0;
   arm_arch_arm_hwdiv = (insn_flags & FL_ARM_DIV) != 0;
   arm_tune_cortex_a9 = (arm_tune == cortexa9) != 0;
@@ -1817,14 +1822,17 @@ arm_option_override (void)
     }
 
   /* FPA and iWMMXt are incompatible because the insn encodings overlap.
-     VFP and iWMMXt can theoretically coexist, but it's unlikely such silicon
-     will ever exist.  GCC makes no attempt to support this combination.  */
-  if (TARGET_IWMMXT && !TARGET_SOFT_FLOAT)
-    sorry ("iWMMXt and hardware floating point");
+     VFP and iWMMXt however can coexist.  */
+  if (TARGET_IWMMXT && TARGET_HARD_FLOAT && !TARGET_VFP)
+    error ("iWMMXt and non-VFP floating point unit are incompatible");
+
+  /* iWMMXt and NEON are incompatible.  */
+  if (TARGET_IWMMXT && TARGET_NEON)
+    error ("iWMMXt and NEON are incompatible");
 
-  /* ??? iWMMXt insn patterns need auditing for Thumb-2.  */
-  if (TARGET_THUMB2 && TARGET_IWMMXT)
-    sorry ("Thumb-2 iWMMXt");
+  /* iWMMXt unsupported under Thumb mode.  */
+  if (TARGET_THUMB && TARGET_IWMMXT)
+    error ("iWMMXt unsupported under Thumb mode");
 
   /* __fp16 support currently assumes the core has ldrh.  */
   if (!arm_arch4 && arm_fp16_format != ARM_FP16_FORMAT_NONE)
@@ -20867,7 +20875,8 @@ arm_expand_binop_builtin (enum insn_code icode,
       || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
     target = gen_reg_rtx (tmode);
 
-  gcc_assert (GET_MODE (op0) == mode0 && GET_MODE (op1) == mode1);
+  gcc_assert ((GET_MODE (op0) == mode0 || GET_MODE (op0) == VOIDmode)
+	      && (GET_MODE (op1) == mode1 || GET_MODE (op1) == VOIDmode));
 
   if (! (*insn_data[icode].operand[1].predicate) (op0, mode0))
     op0 = copy_to_mode_reg (mode0, op0);
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index f4204e4..c51bce9 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -97,6 +97,8 @@ extern char arm_arch_name[];
 	  builtin_define ("__XSCALE__");		\
 	if (arm_arch_iwmmxt)				\
 	  builtin_define ("__IWMMXT__");		\
+	if (arm_arch_iwmmxt2)				\
+	  builtin_define ("__IWMMXT2__");		\
 	if (TARGET_AAPCS_BASED)				\
 	  {						\
 	    if (arm_pcs_default == ARM_PCS_AAPCS_VFP)	\
@@ -194,7 +196,9 @@ extern void (*arm_lang_output_object_attributes_hook)(void);
 #define TARGET_MAVERICK		(arm_fpu_desc->model == ARM_FP_MODEL_MAVERICK)
 #define TARGET_VFP		(arm_fpu_desc->model == ARM_FP_MODEL_VFP)
 #define TARGET_IWMMXT			(arm_arch_iwmmxt)
+#define TARGET_IWMMXT2			(arm_arch_iwmmxt2)
 #define TARGET_REALLY_IWMMXT		(TARGET_IWMMXT && TARGET_32BIT)
+#define TARGET_REALLY_IWMMXT2		(TARGET_IWMMXT2 && TARGET_32BIT)
 #define TARGET_IWMMXT_ABI (TARGET_32BIT && arm_abi == ARM_ABI_IWMMXT)
 #define TARGET_ARM                      (! TARGET_THUMB)
 #define TARGET_EITHER			1 /* (TARGET_ARM | TARGET_THUMB) */
@@ -410,6 +414,9 @@ extern int arm_arch_cirrus;
 /* Nonzero if this chip supports Intel XScale with Wireless MMX technology.  */
 extern int arm_arch_iwmmxt;
 
+/* Nonzero if this chip supports Intel Wireless MMX2 technology.  */
+extern int arm_arch_iwmmxt2;
+
 /* Nonzero if this chip is an XScale.  */
 extern int arm_arch_xscale;
 
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index bbf6380..ad9d948 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -197,7 +197,7 @@
 ; for ARM or Thumb-2 with arm_arch6, and nov6 for ARM without
 ; arm_arch6.  This attribute is used to compute attribute "enabled",
 ; use type "any" to enable an alternative in all cases.
-(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,onlya8,neon_onlya8,nota8,neon_nota8"
+(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,onlya8,neon_onlya8,nota8,neon_nota8,iwmmxt,iwmmxt2"
   (const_string "any"))
 
 (define_attr "arch_enabled" "no,yes"
@@ -248,6 +248,10 @@
 	 (and (eq_attr "arch" "neon_nota8")
 	      (not (eq_attr "tune" "cortexa8"))
 	      (match_test "TARGET_NEON"))
+	 (const_string "yes")
+
+	 (and (eq_attr "arch" "iwmmxt2")
+	      (match_test "TARGET_REALLY_IWMMXT2"))
 	 (const_string "yes")]
 	(const_string "no")))
 
-- 
1.7.3.4
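
The new option checks in arm_option_override can be modelled outside the compiler as follows. This is a simplified sketch under stated assumptions: struct target_opts and iwmmxt_conflict are hypothetical names used only here, the boolean fields stand in for TARGET_IWMMXT, TARGET_HARD_FLOAT, TARGET_VFP, TARGET_NEON and TARGET_THUMB, and the model returns only the first matching message, whereas the patch calls error () for each failing condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the target macros tested in the patch.  */
struct target_opts
{
  bool iwmmxt;      /* TARGET_IWMMXT */
  bool hard_float;  /* TARGET_HARD_FLOAT */
  bool vfp;         /* TARGET_VFP */
  bool neon;        /* TARGET_NEON */
  bool thumb;       /* TARGET_THUMB */
};

/* Return NULL when the combination is accepted, otherwise the text of
   the first diagnostic the patch would issue.  */
const char *
iwmmxt_conflict (const struct target_opts *t)
{
  assert (t != NULL);
  if (t->iwmmxt && t->hard_float && !t->vfp)
    return "iWMMXt and non-VFP floating point unit are incompatible";
  if (t->iwmmxt && t->neon)
    return "iWMMXt and NEON are incompatible";
  if (t->iwmmxt && t->thumb)
    return "iWMMXt unsupported under Thumb mode";
  return NULL;
}
```

With this model, iWMMXt together with VFP hard float is accepted, which is the behaviour change the ChangeLog entry "Enable use of iWMMXt with VFP" describes.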

* [PATCH ARM iWMMXt 4/5] WMMX machine description
  2012-05-29  4:13 [PATCH ARM iWMMXt 0/5] Improve iWMMXt support Matt Turner
                   ` (2 preceding siblings ...)
  2012-05-29  4:14 ` [PATCH ARM iWMMXt 3/5] built in define and expand Matt Turner
@ 2012-05-29  4:15 ` Matt Turner
  2012-05-29  4:15 ` [PATCH ARM iWMMXt 2/5] intrinsic head file change Matt Turner
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 29+ messages in thread
From: Matt Turner @ 2012-05-29  4:15 UTC (permalink / raw)
  To: gcc-patches
  Cc: Ramana Radhakrishnan, Richard Earnshaw, Nick Clifton, Paul Brook,
	Xinyu Qi

From: Xinyu Qi <xyqi@marvell.com>

	gcc/
	* config/arm/arm.c (arm_output_iwmmxt_shift_immediate): New function.
	(arm_output_iwmmxt_tinsr): Likewise.
	* config/arm/arm-protos.h (arm_output_iwmmxt_shift_immediate): Declare.
	(arm_output_iwmmxt_tinsr): Likewise.
	* config/arm/iwmmxt.md (WCGR0, WCGR1, WCGR2, WCGR3): New constants.
	(iwmmxt_psadbw, iwmmxt_walign, iwmmxt_tmrc, iwmmxt_tmcr): Delete.
	(rorv4hi3, rorv2si3, rordi3): Likewise.
	(rorv4hi3_di, rorv2si3_di, rordi3_di): Likewise.
	(ashrv4hi3_di, ashrv2si3_di, ashrdi3_di): Likewise.
	(lshrv4hi3_di, lshrv2si3_di, lshrdi3_di): Likewise.
	(ashlv4hi3_di, ashlv2si3_di, ashldi3_di): Likewise.
	(iwmmxt_tbcstqi, iwmmxt_tbcsthi, iwmmxt_tbcstsi): Likewise.
	(*iwmmxt_clrv8qi, *iwmmxt_clrv4hi, *iwmmxt_clrv2si): Likewise.
	(tbcstv8qi, tbcstv4hi, tbcstv2si): New pattern.
	(iwmmxt_clrv8qi, iwmmxt_clrv4hi, iwmmxt_clrv2si): Likewise.
	(*and<mode>3_iwmmxt, *ior<mode>3_iwmmxt, *xor<mode>3_iwmmxt): Likewise.
	(ror<mode>3, ror<mode>3_di): Likewise.
	(ashr<mode>3_di, lshr<mode>3_di, ashl<mode>3_di): Likewise.
	(ashli<mode>3_iwmmxt, iwmmxt_waligni, iwmmxt_walignr): Likewise.
	(iwmmxt_walignr0, iwmmxt_walignr1): Likewise.
	(iwmmxt_walignr2, iwmmxt_walignr3): Likewise.
	(iwmmxt_setwcgr0, iwmmxt_setwcgr1): Likewise.
	(iwmmxt_setwcgr2, iwmmxt_setwcgr3): Likewise.
	(iwmmxt_getwcgr0, iwmmxt_getwcgr1): Likewise.
	(iwmmxt_getwcgr2, iwmmxt_getwcgr3): Likewise.
	(All instruction patterns): Add wtype attribute.
	(*iwmmxt_arm_movdi, *iwmmxt_movsi_insn): iWMMXt coexist with vfp.
	(iwmmxt_uavgrndv8qi3, iwmmxt_uavgrndv4hi3): Revise the pattern.
	(iwmmxt_uavgv8qi3, iwmmxt_uavgv4hi3): Likewise.
	(ashr<mode>3_iwmmxt, ashl<mode>3_iwmmxt, lshr<mode>3_iwmmxt): Likewise.
	(iwmmxt_tinsrb, iwmmxt_tinsrh, iwmmxt_tinsrw): Likewise.
	(eqv8qi3, eqv4hi3, eqv2si3, gtuv8qi3): Likewise.
	(gtuv4hi3, gtuv2si3, gtv8qi3, gtv4hi3, gtv2si3): Likewise.
	(iwmmxt_wunpckihh, iwmmxt_wunpckihw, iwmmxt_wunpckilh): Likewise.
	(iwmmxt_wunpckilw, iwmmxt_wunpckehub, iwmmxt_wunpckehuh): Likewise.
	(iwmmxt_wunpckehuw, iwmmxt_wunpckehsb, iwmmxt_wunpckehsh): Likewise.
	(iwmmxt_wunpckehsw, iwmmxt_wunpckelub, iwmmxt_wunpckeluh): Likewise.
	(iwmmxt_wunpckeluw, iwmmxt_wunpckelsb, iwmmxt_wunpckelsh): Likewise.
	(iwmmxt_wunpckelsw, iwmmxt_wmadds, iwmmxt_wmaddu): Likewise.
	(iwmmxt_wsadb, iwmmxt_wsadh, iwmmxt_wsadbz, iwmmxt_wsadhz): Likewise.
	(iwmmxt2.md): Include.
	* config/arm/iwmmxt2.md: New file.
	* config/arm/iterators.md (VMMX2): New mode_iterator.
	* config/arm/arm.md (wtype): New attribute.
	(UNSPEC_WMADDS, UNSPEC_WMADDU): Delete.
	(UNSPEC_WALIGNI): New unspec.
	* config/arm/t-arm (MD_INCLUDES): Add iwmmxt2.md.
	* config/arm/predicates.md (imm_or_reg_operand): New predicate.
---
 gcc/config/arm/arm-protos.h  |    2 +
 gcc/config/arm/arm.c         |   89 +++
 gcc/config/arm/arm.md        |    8 +-
 gcc/config/arm/iterators.md  |    2 +
 gcc/config/arm/iwmmxt.md     | 1753 ++++++++++++++++++++++++++----------------
 gcc/config/arm/iwmmxt2.md    |  918 ++++++++++++++++++++++
 gcc/config/arm/predicates.md |    5 +
 gcc/config/arm/t-arm         |    1 +
 8 files changed, 2122 insertions(+), 656 deletions(-)
 create mode 100644 gcc/config/arm/iwmmxt2.md

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 4e6d7bb..955f324 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -159,6 +159,8 @@ extern const char *vfp_output_fstmd (rtx *);
 extern void arm_set_return_address (rtx, rtx);
 extern int arm_eliminable_register (rtx);
 extern const char *arm_output_shift(rtx *, int);
+extern const char *arm_output_iwmmxt_shift_immediate (const char *, rtx *, bool);
+extern const char *arm_output_iwmmxt_tinsr (rtx *);
 extern unsigned int arm_sync_loop_insns (rtx , rtx *);
 extern int arm_attr_length_push_multi(rtx, rtx);
 extern void arm_expand_compare_and_swap (rtx op[]);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 51eed40..a709f2f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25149,6 +25149,95 @@ arm_output_shift(rtx * operands, int set_flags)
   return "";
 }
 
+/* Output assembly for a WMMX immediate shift instruction.  */
+const char *
+arm_output_iwmmxt_shift_immediate (const char *insn_name, rtx *operands, bool wror_or_wsra)
+{
+  int shift = INTVAL (operands[2]);
+  char templ[50];
+  enum machine_mode opmode = GET_MODE (operands[0]);
+
+  gcc_assert (shift >= 0);
+
+  /* Handle a shift value that is > 63 (for D qualifier),
+     31 (for W qualifier) or 15 (for H qualifier).  */
+  if (((opmode == V4HImode) && (shift > 15))
+	|| ((opmode == V2SImode) && (shift > 31))
+	|| ((opmode == DImode) && (shift > 63)))
+  {
+    if (wror_or_wsra)
+      {
+        sprintf (templ, "%s\t%%0, %%1, #%d", insn_name, 32);
+        output_asm_insn (templ, operands);
+        if (opmode == DImode)
+          {
+	    sprintf (templ, "%s\t%%0, %%0, #%d", insn_name, 32);
+	    output_asm_insn (templ, operands);
+          }
+      }
+    else
+      {
+        /* The destination register will contain all zeros.  */
+        sprintf (templ, "wzero\t%%0");
+        output_asm_insn (templ, operands);
+      }
+    return "";
+  }
+
+  if ((opmode == DImode) && (shift > 32))
+    {
+      sprintf (templ, "%s\t%%0, %%1, #%d", insn_name, 32);
+      output_asm_insn (templ, operands);
+      sprintf (templ, "%s\t%%0, %%0, #%d", insn_name, shift - 32);
+      output_asm_insn (templ, operands);
+    }
+  else
+    {
+      sprintf (templ, "%s\t%%0, %%1, #%d", insn_name, shift);
+      output_asm_insn (templ, operands);
+    }
+  return "";
+}
+
+/* Output assembly for a WMMX tinsr instruction.  */
+const char *
+arm_output_iwmmxt_tinsr (rtx *operands)
+{
+  int mask = INTVAL (operands[3]);
+  int i;
+  char templ[50];
+  int units = mode_nunits[GET_MODE (operands[0])];
+  gcc_assert ((mask & (mask - 1)) == 0);
+  for (i = 0; i < units; ++i)
+    {
+      if ((mask & 0x01) == 1)
+        {
+          break;
+        }
+      mask >>= 1;
+    }
+  gcc_assert (i < units);
+  {
+    switch (GET_MODE (operands[0]))
+      {
+      case V8QImode:
+	sprintf (templ, "tinsrb%%?\t%%0, %%2, #%d", i);
+	break;
+      case V4HImode:
+	sprintf (templ, "tinsrh%%?\t%%0, %%2, #%d", i);
+	break;
+      case V2SImode:
+	sprintf (templ, "tinsrw%%?\t%%0, %%2, #%d", i);
+	break;
+      default:
+	gcc_unreachable ();
+	break;
+      }
+    output_asm_insn (templ, operands);
+  }
+  return "";
+}
+
 /* Output a Thumb-1 casesi dispatch sequence.  */
 const char *
 thumb1_output_casesi (rtx *operands)
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index ad9d948..b0333c2 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -62,6 +62,7 @@
 ;; UNSPEC Usage:
 ;; Note: sin and cos are no-longer used.
 ;; Unspec enumerators for Neon are defined in neon.md.
+;; Unspec enumerators for iwmmxt2 are defined in iwmmxt2.md
 
 (define_c_enum "unspec" [
   UNSPEC_SIN            ; `sin' operation (MODE_FLOAT):
@@ -98,8 +99,7 @@
   UNSPEC_WMACSZ         ; Used by the intrinsic form of the iWMMXt WMACSZ instruction.
   UNSPEC_WMACUZ         ; Used by the intrinsic form of the iWMMXt WMACUZ instruction.
   UNSPEC_CLRDI          ; Used by the intrinsic form of the iWMMXt CLRDI instruction.
-  UNSPEC_WMADDS         ; Used by the intrinsic form of the iWMMXt WMADDS instruction.
-  UNSPEC_WMADDU         ; Used by the intrinsic form of the iWMMXt WMADDU instruction.
+  UNSPEC_WALIGNI        ; Used by the intrinsic form of the iWMMXt WALIGN instruction.
   UNSPEC_TLS            ; A symbol that has been treated properly for TLS usage.
   UNSPEC_PIC_LABEL      ; A label used for PIC access that does not appear in the
                         ; instruction stream.
@@ -366,6 +366,10 @@
 	       (const_string "yes")
 	       (const_string "no")))
 
+; wtype for WMMX insn scheduling purposes.
+(define_attr "wtype"
+        "none,wor,wxor,wand,wandn,wmov,tmcrr,tmrrc,wldr,wstr,tmcr,tmrc,wadd,wsub,wmul,wmac,wavg2,tinsr,textrm,wshufh,wcmpeq,wcmpgt,wmax,wmin,wpack,wunpckih,wunpckil,wunpckeh,wunpckel,wror,wsra,wsrl,wsll,wmadd,tmia,tmiaph,tmiaxy,tbcst,tmovmsk,wacc,waligni,walignr,tandc,textrc,torc,torvsc,wsad,wabs,wabsdiff,waddsubhx,wsubaddhx,wavg4,wmulw,wqmulm,wqmulwm,waddbhus,wqmiaxy,wmiaxy,wmiawxy,wmerge" (const_string "none"))
+
 ; Load scheduling, set from the arm_ld_sched variable
 ; initialized by arm_option_override()
 (define_attr "ldsched" "no,yes" (const (symbol_ref "arm_ld_sched")))
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 1567264..916444c 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -45,6 +45,8 @@
 ;; Integer element sizes implemented by IWMMXT.
 (define_mode_iterator VMMX [V2SI V4HI V8QI])
 
+(define_mode_iterator VMMX2 [V4HI V2SI])
+
 ;; Integer element sizes for shifts.
 (define_mode_iterator VSHFT [V4HI V2SI DI])
 
diff --git a/gcc/config/arm/iwmmxt.md b/gcc/config/arm/iwmmxt.md
index bc0b80d..12f4179 100644
--- a/gcc/config/arm/iwmmxt.md
+++ b/gcc/config/arm/iwmmxt.md
@@ -1,4 +1,3 @@
-;; ??? This file needs auditing for thumb2
 ;; Patterns for the Intel Wireless MMX technology architecture.
 ;; Copyright (C) 2003, 2004, 2005, 2007, 2008, 2010
 ;; Free Software Foundation, Inc.
@@ -20,6 +19,41 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; <http://www.gnu.org/licenses/>.
 
+;; Register numbers
+(define_constants
+  [(WCGR0           43)
+   (WCGR1           44)
+   (WCGR2           45)
+   (WCGR3           46)
+  ]
+)
+
+(define_insn "tbcstv8qi"
+  [(set (match_operand:V8QI                   0 "register_operand" "=y")
+        (vec_duplicate:V8QI (match_operand:QI 1 "s_register_operand" "r")))]
+  "TARGET_REALLY_IWMMXT"
+  "tbcstb%?\\t%0, %1"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tbcst")]
+)
+
+(define_insn "tbcstv4hi"
+  [(set (match_operand:V4HI                   0 "register_operand" "=y")
+        (vec_duplicate:V4HI (match_operand:HI 1 "s_register_operand" "r")))]
+  "TARGET_REALLY_IWMMXT"
+  "tbcsth%?\\t%0, %1"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tbcst")]
+)
+
+(define_insn "tbcstv2si"
+  [(set (match_operand:V2SI                   0 "register_operand" "=y")
+        (vec_duplicate:V2SI (match_operand:SI 1 "s_register_operand" "r")))]
+  "TARGET_REALLY_IWMMXT"
+  "tbcstw%?\\t%0, %1"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tbcst")]
+)
 
 (define_insn "iwmmxt_iordi3"
   [(set (match_operand:DI         0 "register_operand" "=y,?&r,?&r")
@@ -31,7 +65,9 @@
    #
    #"
   [(set_attr "predicable" "yes")
-   (set_attr "length" "4,8,8")])
+   (set_attr "length" "4,8,8")
+   (set_attr "wtype" "wor,none,none")]
+)
 
 (define_insn "iwmmxt_xordi3"
   [(set (match_operand:DI         0 "register_operand" "=y,?&r,?&r")
@@ -43,7 +79,9 @@
    #
    #"
   [(set_attr "predicable" "yes")
-   (set_attr "length" "4,8,8")])
+   (set_attr "length" "4,8,8")
+   (set_attr "wtype" "wxor,none,none")]
+)
 
 (define_insn "iwmmxt_anddi3"
   [(set (match_operand:DI         0 "register_operand" "=y,?&r,?&r")
@@ -55,7 +93,9 @@
    #
    #"
   [(set_attr "predicable" "yes")
-   (set_attr "length" "4,8,8")])
+   (set_attr "length" "4,8,8")
+   (set_attr "wtype" "wand,none,none")]
+)
 
 (define_insn "iwmmxt_nanddi3"
   [(set (match_operand:DI                 0 "register_operand" "=y")
@@ -63,64 +103,96 @@
 		(not:DI (match_operand:DI 2 "register_operand"  "y"))))]
   "TARGET_REALLY_IWMMXT"
   "wandn%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wandn")]
+)
 
 (define_insn "*iwmmxt_arm_movdi"
-  [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r, r, m,y,y,yr,y,yrUy")
-	(match_operand:DI 1 "di_operand"              "rIK,mi,r,y,yr,y,yrUy,y"))]
+  [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r, r, r, r, m,y,y,yr,y,yrUy,*w, r,*w,*w, *Uv")
+        (match_operand:DI 1 "di_operand"              "rDa,Db,Dc,mi,r,y,yr,y,yrUy,y, r,*w,*w,*Uvi,*w"))]
   "TARGET_REALLY_IWMMXT
    && (   register_operand (operands[0], DImode)
        || register_operand (operands[1], DImode))"
   "*
-{
   switch (which_alternative)
     {
-    default:
-      return output_move_double (operands, true, NULL);
     case 0:
+    case 1:
+    case 2:
       return \"#\";
-    case 3:
+    case 3: case 4:
+      return output_move_double (operands, true, NULL);
+    case 5:
       return \"wmov%?\\t%0,%1\";
-    case 4:
+    case 6:
       return \"tmcrr%?\\t%0,%Q1,%R1\";
-    case 5:
+    case 7:
       return \"tmrrc%?\\t%Q0,%R0,%1\";
-    case 6:
+    case 8:
       return \"wldrd%?\\t%0,%1\";
-    case 7:
+    case 9:
       return \"wstrd%?\\t%1,%0\";
+    case 10:
+      return \"fmdrr%?\\t%P0, %Q1, %R1\\t%@ int\";
+    case 11:
+      return \"fmrrd%?\\t%Q0, %R0, %P1\\t%@ int\";
+    case 12:
+      if (TARGET_VFP_SINGLE)
+	return \"fcpys%?\\t%0, %1\\t%@ int\;fcpys%?\\t%p0, %p1\\t%@ int\";
+      else
+	return \"fcpyd%?\\t%P0, %P1\\t%@ int\";
+    case 13: case 14:
+      return output_move_vfp (operands);
+    default:
+      gcc_unreachable ();
     }
-}"
-  [(set_attr "length"         "8,8,8,4,4,4,4,4")
-   (set_attr "type"           "*,load1,store2,*,*,*,*,*")
-   (set_attr "pool_range"     "*,1020,*,*,*,*,*,*")
-   (set_attr "neg_pool_range" "*,1012,*,*,*,*,*,*")]
+  "
+  [(set (attr "length") (cond [(eq_attr "alternative" "0,3,4") (const_int 8)
+                              (eq_attr "alternative" "1") (const_int 12)
+                              (eq_attr "alternative" "2") (const_int 16)
+                              (eq_attr "alternative" "12")
+                               (if_then_else
+                                 (eq (symbol_ref "TARGET_VFP_SINGLE") (const_int 1))
+                                 (const_int 8)
+                                 (const_int 4))]
+                              (const_int 4)))
+   (set_attr "type" "*,*,*,load2,store2,*,*,*,*,*,r_2_f,f_2_r,ffarithd,f_loadd,f_stored")
+   (set_attr "arm_pool_range" "*,*,*,1020,*,*,*,*,*,*,*,*,*,1020,*")
+   (set_attr "arm_neg_pool_range" "*,*,*,1008,*,*,*,*,*,*,*,*,*,1008,*")
+   (set_attr "wtype" "*,*,*,*,*,wmov,tmcrr,tmrrc,wldr,wstr,*,*,*,*,*")]
 )
 
 (define_insn "*iwmmxt_movsi_insn"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,r,rk, m,z,r,?z,Uy,z")
-	(match_operand:SI 1 "general_operand"      "rk, I,K,mi,rk,r,z,Uy,z, z"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,r,r,rk, m,z,r,?z,?Uy,*t, r,*t,*t  ,*Uv")
+	(match_operand:SI 1 "general_operand"      " rk,I,K,j,mi,rk,r,z,Uy,  z, r,*t,*t,*Uvi, *t"))]
   "TARGET_REALLY_IWMMXT
    && (   register_operand (operands[0], SImode)
        || register_operand (operands[1], SImode))"
   "*
    switch (which_alternative)
-   {
-   case 0: return \"mov\\t%0, %1\";
-   case 1: return \"mov\\t%0, %1\";
-   case 2: return \"mvn\\t%0, #%B1\";
-   case 3: return \"ldr\\t%0, %1\";
-   case 4: return \"str\\t%1, %0\";
-   case 5: return \"tmcr\\t%0, %1\";
-   case 6: return \"tmrc\\t%0, %1\";
-   case 7: return arm_output_load_gr (operands);
-   case 8: return \"wstrw\\t%1, %0\";
-   default:return \"wstrw\\t%1, [sp, #-4]!\;wldrw\\t%0, [sp], #4\\t@move CG reg\";
-  }"
-  [(set_attr "type"           "*,*,*,load1,store1,*,*,load1,store1,*")
-   (set_attr "length"         "*,*,*,*,        *,*,*,  16,     *,8")
-   (set_attr "pool_range"     "*,*,*,4096,     *,*,*,1024,     *,*")
-   (set_attr "neg_pool_range" "*,*,*,4084,     *,*,*,   *,  1012,*")
+     {
+     case 0: return \"mov\\t%0, %1\";
+     case 1: return \"mov\\t%0, %1\";
+     case 2: return \"mvn\\t%0, #%B1\";
+     case 3: return \"movw\\t%0, %1\";
+     case 4: return \"ldr\\t%0, %1\";
+     case 5: return \"str\\t%1, %0\";
+     case 6: return \"tmcr\\t%0, %1\";
+     case 7: return \"tmrc\\t%0, %1\";
+     case 8: return arm_output_load_gr (operands);
+     case 9: return \"wstrw\\t%1, %0\";
+     case 10:return \"fmsr\\t%0, %1\";
+     case 11:return \"fmrs\\t%0, %1\";
+     case 12:return \"fcpys\\t%0, %1\\t%@ int\";
+     case 13: case 14:
+       return output_move_vfp (operands);
+     default:
+       gcc_unreachable ();
+     }"
+  [(set_attr "type"           "*,*,*,*,load1,store1,*,*,*,*,r_2_f,f_2_r,fcpys,f_loads,f_stores")
+   (set_attr "length"         "*,*,*,*,*,        *,*,*,  16,     *,*,*,*,*,*")
+   (set_attr "pool_range"     "*,*,*,*,4096,     *,*,*,1024,     *,*,*,*,1020,*")
+   (set_attr "neg_pool_range" "*,*,*,*,4084,     *,*,*,   *,  1012,*,*,*,1008,*")
    ;; Note - the "predicable" attribute is not allowed to have alternatives.
    ;; Since the wSTRw wCx instruction is not predicable, we cannot support
    ;; predicating any of the alternatives in this template.  Instead,
@@ -129,7 +201,8 @@
    ;; Also - we have to pretend that these insns clobber the condition code
    ;; bits as otherwise arm_final_prescan_insn() will try to conditionalize
    ;; them.
-   (set_attr "conds" "clob")]
+   (set_attr "conds" "clob")
+   (set_attr "wtype" "*,*,*,*,*,*,tmcr,tmrc,wldr,wstr,*,*,*,*,*")]
 )
 
 ;; Because iwmmxt_movsi_insn is not predicable, we provide the
@@ -177,19 +250,110 @@
    }"
   [(set_attr "predicable" "yes")
    (set_attr "length"         "4,     4,   4,4,4,8,   8,8")
-   (set_attr "type"           "*,store1,load1,*,*,*,load1,store1")
+   (set_attr "type"           "*,*,*,*,*,*,load1,store1")
    (set_attr "pool_range"     "*,     *, 256,*,*,*, 256,*")
-   (set_attr "neg_pool_range" "*,     *, 244,*,*,*, 244,*")])
+   (set_attr "neg_pool_range" "*,     *, 244,*,*,*, 244,*")
+   (set_attr "wtype"          "wmov,wstr,wldr,tmrrc,tmcrr,*,*,*")]
+)
+
+(define_expand "iwmmxt_setwcgr0"
+  [(set (reg:SI WCGR0)
+	(match_operand:SI 0 "register_operand"  ""))]
+  "TARGET_REALLY_IWMMXT"
+  {}
+)
+
+(define_expand "iwmmxt_setwcgr1"
+  [(set (reg:SI WCGR1)
+	(match_operand:SI 0 "register_operand"  ""))]
+  "TARGET_REALLY_IWMMXT"
+  {}
+)
+
+(define_expand "iwmmxt_setwcgr2"
+  [(set (reg:SI WCGR2)
+	(match_operand:SI 0 "register_operand"  ""))]
+  "TARGET_REALLY_IWMMXT"
+  {}
+)
+
+(define_expand "iwmmxt_setwcgr3"
+  [(set (reg:SI WCGR3)
+	(match_operand:SI 0 "register_operand"  ""))]
+  "TARGET_REALLY_IWMMXT"
+  {}
+)
+
+(define_expand "iwmmxt_getwcgr0"
+  [(set (match_operand:SI 0 "register_operand"  "")
+        (reg:SI WCGR0))]
+  "TARGET_REALLY_IWMMXT"
+  {}
+)
+
+(define_expand "iwmmxt_getwcgr1"
+  [(set (match_operand:SI 0 "register_operand"  "")
+        (reg:SI WCGR1))]
+  "TARGET_REALLY_IWMMXT"
+  {}
+)
+
+(define_expand "iwmmxt_getwcgr2"
+  [(set (match_operand:SI 0 "register_operand"  "")
+        (reg:SI WCGR2))]
+  "TARGET_REALLY_IWMMXT"
+  {}
+)
+
+(define_expand "iwmmxt_getwcgr3"
+  [(set (match_operand:SI 0 "register_operand"  "")
+        (reg:SI WCGR3))]
+  "TARGET_REALLY_IWMMXT"
+  {}
+)
+
+(define_insn "*and<mode>3_iwmmxt"
+  [(set (match_operand:VMMX           0 "register_operand" "=y")
+        (and:VMMX (match_operand:VMMX 1 "register_operand"  "y")
+	          (match_operand:VMMX 2 "register_operand"  "y")))]
+  "TARGET_REALLY_IWMMXT"
+  "wand\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wand")]
+)
+
+(define_insn "*ior<mode>3_iwmmxt"
+  [(set (match_operand:VMMX           0 "register_operand" "=y")
+        (ior:VMMX (match_operand:VMMX 1 "register_operand"  "y")
+	          (match_operand:VMMX 2 "register_operand"  "y")))]
+  "TARGET_REALLY_IWMMXT"
+  "wor\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wor")]
+)
+
+(define_insn "*xor<mode>3_iwmmxt"
+  [(set (match_operand:VMMX           0 "register_operand" "=y")
+        (xor:VMMX (match_operand:VMMX 1 "register_operand"  "y")
+	          (match_operand:VMMX 2 "register_operand"  "y")))]
+  "TARGET_REALLY_IWMMXT"
+  "wxor\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wxor")]
+)
+
 
 ;; Vector add/subtract
 
 (define_insn "*add<mode>3_iwmmxt"
   [(set (match_operand:VMMX            0 "register_operand" "=y")
-        (plus:VMMX (match_operand:VMMX 1 "register_operand"  "y")
-	           (match_operand:VMMX 2 "register_operand"  "y")))]
+        (plus:VMMX (match_operand:VMMX 1 "register_operand" "y")
+	           (match_operand:VMMX 2 "register_operand" "y")))]
   "TARGET_REALLY_IWMMXT"
   "wadd<MMX_char>%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wadd")]
+)
 
 (define_insn "ssaddv8qi3"
   [(set (match_operand:V8QI               0 "register_operand" "=y")
@@ -197,7 +361,9 @@
 		      (match_operand:V8QI 2 "register_operand"  "y")))]
   "TARGET_REALLY_IWMMXT"
   "waddbss%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wadd")]
+)
 
 (define_insn "ssaddv4hi3"
   [(set (match_operand:V4HI               0 "register_operand" "=y")
@@ -205,7 +371,9 @@
 		      (match_operand:V4HI 2 "register_operand"  "y")))]
   "TARGET_REALLY_IWMMXT"
   "waddhss%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wadd")]
+)
 
 (define_insn "ssaddv2si3"
   [(set (match_operand:V2SI               0 "register_operand" "=y")
@@ -213,7 +381,9 @@
 		      (match_operand:V2SI 2 "register_operand"  "y")))]
   "TARGET_REALLY_IWMMXT"
   "waddwss%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wadd")]
+)
 
 (define_insn "usaddv8qi3"
   [(set (match_operand:V8QI               0 "register_operand" "=y")
@@ -221,7 +391,9 @@
 		      (match_operand:V8QI 2 "register_operand"  "y")))]
   "TARGET_REALLY_IWMMXT"
   "waddbus%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wadd")]
+)
 
 (define_insn "usaddv4hi3"
   [(set (match_operand:V4HI               0 "register_operand" "=y")
@@ -229,7 +401,9 @@
 		      (match_operand:V4HI 2 "register_operand"  "y")))]
   "TARGET_REALLY_IWMMXT"
   "waddhus%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wadd")]
+)
 
 (define_insn "usaddv2si3"
   [(set (match_operand:V2SI               0 "register_operand" "=y")
@@ -237,7 +411,9 @@
 		      (match_operand:V2SI 2 "register_operand"  "y")))]
   "TARGET_REALLY_IWMMXT"
   "waddwus%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wadd")]
+)
 
 (define_insn "*sub<mode>3_iwmmxt"
   [(set (match_operand:VMMX             0 "register_operand" "=y")
@@ -245,7 +421,9 @@
 		    (match_operand:VMMX 2 "register_operand"  "y")))]
   "TARGET_REALLY_IWMMXT"
   "wsub<MMX_char>%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wsub")]
+)
 
 (define_insn "sssubv8qi3"
   [(set (match_operand:V8QI                0 "register_operand" "=y")
@@ -253,7 +431,9 @@
 		       (match_operand:V8QI 2 "register_operand"  "y")))]
   "TARGET_REALLY_IWMMXT"
   "wsubbss%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wsub")]
+)
 
 (define_insn "sssubv4hi3"
   [(set (match_operand:V4HI                0 "register_operand" "=y")
@@ -261,7 +441,9 @@
 		       (match_operand:V4HI 2 "register_operand" "y")))]
   "TARGET_REALLY_IWMMXT"
   "wsubhss%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wsub")]
+)
 
 (define_insn "sssubv2si3"
   [(set (match_operand:V2SI                0 "register_operand" "=y")
@@ -269,7 +451,9 @@
 		       (match_operand:V2SI 2 "register_operand" "y")))]
   "TARGET_REALLY_IWMMXT"
   "wsubwss%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wsub")]
+)
 
 (define_insn "ussubv8qi3"
   [(set (match_operand:V8QI                0 "register_operand" "=y")
@@ -277,7 +461,9 @@
 		       (match_operand:V8QI 2 "register_operand" "y")))]
   "TARGET_REALLY_IWMMXT"
   "wsubbus%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wsub")]
+)
 
 (define_insn "ussubv4hi3"
   [(set (match_operand:V4HI                0 "register_operand" "=y")
@@ -285,7 +471,9 @@
 		       (match_operand:V4HI 2 "register_operand" "y")))]
   "TARGET_REALLY_IWMMXT"
   "wsubhus%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wsub")]
+)
 
 (define_insn "ussubv2si3"
   [(set (match_operand:V2SI                0 "register_operand" "=y")
@@ -293,7 +481,9 @@
 		       (match_operand:V2SI 2 "register_operand" "y")))]
   "TARGET_REALLY_IWMMXT"
   "wsubwus%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wsub")]
+)
 
 (define_insn "*mulv4hi3_iwmmxt"
   [(set (match_operand:V4HI            0 "register_operand" "=y")
@@ -301,63 +491,77 @@
 		   (match_operand:V4HI 2 "register_operand" "y")))]
   "TARGET_REALLY_IWMMXT"
   "wmulul%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmul")]
+)
 
 (define_insn "smulv4hi3_highpart"
-  [(set (match_operand:V4HI                                0 "register_operand" "=y")
-	(truncate:V4HI
-	 (lshiftrt:V4SI
-	  (mult:V4SI (sign_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
-		     (sign_extend:V4SI (match_operand:V4HI 2 "register_operand" "y")))
-	  (const_int 16))))]
+  [(set (match_operand:V4HI 0 "register_operand" "=y")
+	  (truncate:V4HI
+	    (lshiftrt:V4SI
+	      (mult:V4SI (sign_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+	                 (sign_extend:V4SI (match_operand:V4HI 2 "register_operand" "y")))
+	      (const_int 16))))]
   "TARGET_REALLY_IWMMXT"
   "wmulsm%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmul")]
+)
 
 (define_insn "umulv4hi3_highpart"
-  [(set (match_operand:V4HI                                0 "register_operand" "=y")
-	(truncate:V4HI
-	 (lshiftrt:V4SI
-	  (mult:V4SI (zero_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
-		     (zero_extend:V4SI (match_operand:V4HI 2 "register_operand" "y")))
-	  (const_int 16))))]
+  [(set (match_operand:V4HI 0 "register_operand" "=y")
+	  (truncate:V4HI
+	    (lshiftrt:V4SI
+	      (mult:V4SI (zero_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+	                 (zero_extend:V4SI (match_operand:V4HI 2 "register_operand" "y")))
+	      (const_int 16))))]
   "TARGET_REALLY_IWMMXT"
   "wmulum%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmul")]
+)
 
 (define_insn "iwmmxt_wmacs"
   [(set (match_operand:DI               0 "register_operand" "=y")
 	(unspec:DI [(match_operand:DI   1 "register_operand" "0")
-		    (match_operand:V4HI 2 "register_operand" "y")
-		    (match_operand:V4HI 3 "register_operand" "y")] UNSPEC_WMACS))]
+	            (match_operand:V4HI 2 "register_operand" "y")
+	            (match_operand:V4HI 3 "register_operand" "y")] UNSPEC_WMACS))]
   "TARGET_REALLY_IWMMXT"
   "wmacs%?\\t%0, %2, %3"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmac")]
+)
 
 (define_insn "iwmmxt_wmacsz"
   [(set (match_operand:DI               0 "register_operand" "=y")
 	(unspec:DI [(match_operand:V4HI 1 "register_operand" "y")
-		    (match_operand:V4HI 2 "register_operand" "y")] UNSPEC_WMACSZ))]
+	            (match_operand:V4HI 2 "register_operand" "y")] UNSPEC_WMACSZ))]
   "TARGET_REALLY_IWMMXT"
   "wmacsz%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmac")]
+)
 
 (define_insn "iwmmxt_wmacu"
   [(set (match_operand:DI               0 "register_operand" "=y")
 	(unspec:DI [(match_operand:DI   1 "register_operand" "0")
-		    (match_operand:V4HI 2 "register_operand" "y")
-		    (match_operand:V4HI 3 "register_operand" "y")] UNSPEC_WMACU))]
+	            (match_operand:V4HI 2 "register_operand" "y")
+	            (match_operand:V4HI 3 "register_operand" "y")] UNSPEC_WMACU))]
   "TARGET_REALLY_IWMMXT"
   "wmacu%?\\t%0, %2, %3"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmac")]
+)
 
 (define_insn "iwmmxt_wmacuz"
   [(set (match_operand:DI               0 "register_operand" "=y")
 	(unspec:DI [(match_operand:V4HI 1 "register_operand" "y")
-		    (match_operand:V4HI 2 "register_operand" "y")] UNSPEC_WMACUZ))]
+	            (match_operand:V4HI 2 "register_operand" "y")] UNSPEC_WMACUZ))]
   "TARGET_REALLY_IWMMXT"
   "wmacuz%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmac")]
+)
 
 ;; Same as xordi3, but don't show input operands so that we don't think
 ;; they are live.
@@ -366,168 +570,207 @@
         (unspec:DI [(const_int 0)] UNSPEC_CLRDI))]
   "TARGET_REALLY_IWMMXT"
   "wxor%?\\t%0, %0, %0"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wxor")]
+)
 
 ;; Seems like cse likes to generate these, so we have to support them.
 
-(define_insn "*iwmmxt_clrv8qi"
-  [(set (match_operand:V8QI 0 "register_operand" "=y")
+(define_insn "iwmmxt_clrv8qi"
+  [(set (match_operand:V8QI 0 "s_register_operand" "=y")
         (const_vector:V8QI [(const_int 0) (const_int 0)
 			    (const_int 0) (const_int 0)
 			    (const_int 0) (const_int 0)
 			    (const_int 0) (const_int 0)]))]
   "TARGET_REALLY_IWMMXT"
   "wxor%?\\t%0, %0, %0"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wxor")]
+)
 
-(define_insn "*iwmmxt_clrv4hi"
-  [(set (match_operand:V4HI 0 "register_operand" "=y")
+(define_insn "iwmmxt_clrv4hi"
+  [(set (match_operand:V4HI 0 "s_register_operand" "=y")
         (const_vector:V4HI [(const_int 0) (const_int 0)
 			    (const_int 0) (const_int 0)]))]
   "TARGET_REALLY_IWMMXT"
   "wxor%?\\t%0, %0, %0"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wxor")]
+)
 
-(define_insn "*iwmmxt_clrv2si"
+(define_insn "iwmmxt_clrv2si"
   [(set (match_operand:V2SI 0 "register_operand" "=y")
         (const_vector:V2SI [(const_int 0) (const_int 0)]))]
   "TARGET_REALLY_IWMMXT"
   "wxor%?\\t%0, %0, %0"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wxor")]
+)
 
 ;; Unsigned averages/sum of absolute differences
 
 (define_insn "iwmmxt_uavgrndv8qi3"
-  [(set (match_operand:V8QI              0 "register_operand" "=y")
-        (ashiftrt:V8QI
-	 (plus:V8QI (plus:V8QI
-		     (match_operand:V8QI 1 "register_operand" "y")
-		     (match_operand:V8QI 2 "register_operand" "y"))
-		    (const_vector:V8QI [(const_int 1)
-					(const_int 1)
-					(const_int 1)
-					(const_int 1)
-					(const_int 1)
-					(const_int 1)
-					(const_int 1)
-					(const_int 1)]))
-	 (const_int 1)))]
+  [(set (match_operand:V8QI                                    0 "register_operand" "=y")
+        (truncate:V8QI
+	  (lshiftrt:V8HI
+	    (plus:V8HI
+	      (plus:V8HI (zero_extend:V8HI (match_operand:V8QI 1 "register_operand" "y"))
+	                 (zero_extend:V8HI (match_operand:V8QI 2 "register_operand" "y")))
+	      (const_vector:V8HI [(const_int 1)
+	                          (const_int 1)
+	                          (const_int 1)
+	                          (const_int 1)
+	                          (const_int 1)
+	                          (const_int 1)
+	                          (const_int 1)
+	                          (const_int 1)]))
+	    (const_int 1))))]
   "TARGET_REALLY_IWMMXT"
   "wavg2br%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wavg2")]
+)
 
 (define_insn "iwmmxt_uavgrndv4hi3"
-  [(set (match_operand:V4HI              0 "register_operand" "=y")
-        (ashiftrt:V4HI
-	 (plus:V4HI (plus:V4HI
-		     (match_operand:V4HI 1 "register_operand" "y")
-		     (match_operand:V4HI 2 "register_operand" "y"))
-		    (const_vector:V4HI [(const_int 1)
-					(const_int 1)
-					(const_int 1)
-					(const_int 1)]))
-	 (const_int 1)))]
+  [(set (match_operand:V4HI                                    0 "register_operand" "=y")
+        (truncate:V4HI
+	  (lshiftrt:V4SI
+            (plus:V4SI
+	      (plus:V4SI (zero_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+	                 (zero_extend:V4SI (match_operand:V4HI 2 "register_operand" "y")))
+	      (const_vector:V4SI [(const_int 1)
+	                          (const_int 1)
+	                          (const_int 1)
+	                          (const_int 1)]))
+	    (const_int 1))))]
   "TARGET_REALLY_IWMMXT"
   "wavg2hr%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wavg2")]
+)
 
 (define_insn "iwmmxt_uavgv8qi3"
-  [(set (match_operand:V8QI                 0 "register_operand" "=y")
-        (ashiftrt:V8QI (plus:V8QI
-			(match_operand:V8QI 1 "register_operand" "y")
-			(match_operand:V8QI 2 "register_operand" "y"))
-		       (const_int 1)))]
+  [(set (match_operand:V8QI                                  0 "register_operand" "=y")
+        (truncate:V8QI
+	  (lshiftrt:V8HI
+	    (plus:V8HI (zero_extend:V8HI (match_operand:V8QI 1 "register_operand" "y"))
+	               (zero_extend:V8HI (match_operand:V8QI 2 "register_operand" "y")))
+	    (const_int 1))))]
   "TARGET_REALLY_IWMMXT"
   "wavg2b%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wavg2")]
+)
 
 (define_insn "iwmmxt_uavgv4hi3"
-  [(set (match_operand:V4HI                 0 "register_operand" "=y")
-        (ashiftrt:V4HI (plus:V4HI
-			(match_operand:V4HI 1 "register_operand" "y")
-			(match_operand:V4HI 2 "register_operand" "y"))
-		       (const_int 1)))]
+  [(set (match_operand:V4HI                                  0 "register_operand" "=y")
+        (truncate:V4HI
+	  (lshiftrt:V4SI
+	    (plus:V4SI (zero_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+	               (zero_extend:V4SI (match_operand:V4HI 2 "register_operand" "y")))
+	    (const_int 1))))]
   "TARGET_REALLY_IWMMXT"
   "wavg2h%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "iwmmxt_psadbw"
-  [(set (match_operand:V8QI                       0 "register_operand" "=y")
-        (abs:V8QI (minus:V8QI (match_operand:V8QI 1 "register_operand" "y")
-			      (match_operand:V8QI 2 "register_operand" "y"))))]
-  "TARGET_REALLY_IWMMXT"
-  "psadbw%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wavg2")]
+)
 
 ;; Insert/extract/shuffle
 
 (define_insn "iwmmxt_tinsrb"
-  [(set (match_operand:V8QI                             0 "register_operand"    "=y")
-        (vec_merge:V8QI (match_operand:V8QI             1 "register_operand"     "0")
-			(vec_duplicate:V8QI
-			 (truncate:QI (match_operand:SI 2 "nonimmediate_operand" "r")))
-			(match_operand:SI               3 "immediate_operand"    "i")))]
+  [(set (match_operand:V8QI                0 "register_operand" "=y")
+        (vec_merge:V8QI
+	  (vec_duplicate:V8QI
+	    (truncate:QI (match_operand:SI 2 "nonimmediate_operand" "r")))
+	  (match_operand:V8QI              1 "register_operand"     "0")
+	  (match_operand:SI                3 "immediate_operand"    "i")))]
   "TARGET_REALLY_IWMMXT"
-  "tinsrb%?\\t%0, %2, %3"
-  [(set_attr "predicable" "yes")])
+  "*
+   {
+     return arm_output_iwmmxt_tinsr (operands);
+   }
+   "
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tinsr")]
+)
 
 (define_insn "iwmmxt_tinsrh"
-  [(set (match_operand:V4HI                             0 "register_operand"    "=y")
-        (vec_merge:V4HI (match_operand:V4HI             1 "register_operand"     "0")
-			(vec_duplicate:V4HI
-			 (truncate:HI (match_operand:SI 2 "nonimmediate_operand" "r")))
-			(match_operand:SI               3 "immediate_operand"    "i")))]
+  [(set (match_operand:V4HI                0 "register_operand"    "=y")
+        (vec_merge:V4HI
+          (vec_duplicate:V4HI
+            (truncate:HI (match_operand:SI 2 "nonimmediate_operand" "r")))
+	  (match_operand:V4HI              1 "register_operand"     "0")
+	  (match_operand:SI                3 "immediate_operand"    "i")))]
   "TARGET_REALLY_IWMMXT"
-  "tinsrh%?\\t%0, %2, %3"
-  [(set_attr "predicable" "yes")])
+  "*
+   {
+     return arm_output_iwmmxt_tinsr (operands);
+   }
+   "
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tinsr")]
+)
 
 (define_insn "iwmmxt_tinsrw"
-  [(set (match_operand:V2SI                 0 "register_operand"    "=y")
-        (vec_merge:V2SI (match_operand:V2SI 1 "register_operand"     "0")
-			(vec_duplicate:V2SI
-			 (match_operand:SI  2 "nonimmediate_operand" "r"))
-			(match_operand:SI   3 "immediate_operand"    "i")))]
+  [(set (match_operand:V2SI   0 "register_operand"    "=y")
+        (vec_merge:V2SI
+          (vec_duplicate:V2SI
+            (match_operand:SI 2 "nonimmediate_operand" "r"))
+          (match_operand:V2SI 1 "register_operand"     "0")
+          (match_operand:SI   3 "immediate_operand"    "i")))]
   "TARGET_REALLY_IWMMXT"
-  "tinsrw%?\\t%0, %2, %3"
-  [(set_attr "predicable" "yes")])
+  "*
+   {
+     return arm_output_iwmmxt_tinsr (operands);
+   }
+   "
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tinsr")]
+)
 
 (define_insn "iwmmxt_textrmub"
-  [(set (match_operand:SI                                  0 "register_operand" "=r")
-        (zero_extend:SI (vec_select:QI (match_operand:V8QI 1 "register_operand" "y")
-				       (parallel
-					[(match_operand:SI 2 "immediate_operand" "i")]))))]
+  [(set (match_operand:SI                                   0 "register_operand" "=r")
+        (zero_extend:SI (vec_select:QI (match_operand:V8QI  1 "register_operand" "y")
+		                       (parallel
+				         [(match_operand:SI 2 "immediate_operand" "i")]))))]
   "TARGET_REALLY_IWMMXT"
   "textrmub%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "textrm")]
+)
 
 (define_insn "iwmmxt_textrmsb"
-  [(set (match_operand:SI                                  0 "register_operand" "=r")
-        (sign_extend:SI (vec_select:QI (match_operand:V8QI 1 "register_operand" "y")
+  [(set (match_operand:SI                                   0 "register_operand" "=r")
+        (sign_extend:SI (vec_select:QI (match_operand:V8QI  1 "register_operand" "y")
 				       (parallel
-					[(match_operand:SI 2 "immediate_operand" "i")]))))]
+				         [(match_operand:SI 2 "immediate_operand" "i")]))))]
   "TARGET_REALLY_IWMMXT"
   "textrmsb%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "textrm")]
+)
 
 (define_insn "iwmmxt_textrmuh"
-  [(set (match_operand:SI                                  0 "register_operand" "=r")
-        (zero_extend:SI (vec_select:HI (match_operand:V4HI 1 "register_operand" "y")
+  [(set (match_operand:SI                                   0 "register_operand" "=r")
+        (zero_extend:SI (vec_select:HI (match_operand:V4HI  1 "register_operand" "y")
 				       (parallel
-					[(match_operand:SI 2 "immediate_operand" "i")]))))]
+				         [(match_operand:SI 2 "immediate_operand" "i")]))))]
   "TARGET_REALLY_IWMMXT"
   "textrmuh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "textrm")]
+)
 
 (define_insn "iwmmxt_textrmsh"
-  [(set (match_operand:SI                                  0 "register_operand" "=r")
-        (sign_extend:SI (vec_select:HI (match_operand:V4HI 1 "register_operand" "y")
+  [(set (match_operand:SI                                   0 "register_operand" "=r")
+        (sign_extend:SI (vec_select:HI (match_operand:V4HI  1 "register_operand" "y")
 				       (parallel
-					[(match_operand:SI 2 "immediate_operand" "i")]))))]
+				         [(match_operand:SI 2 "immediate_operand" "i")]))))]
   "TARGET_REALLY_IWMMXT"
   "textrmsh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "textrm")]
+)
 
 ;; There are signed/unsigned variants of this instruction, but they are
 ;; pointless.
@@ -537,7 +780,9 @@
 		       (parallel [(match_operand:SI 2 "immediate_operand" "i")])))]
   "TARGET_REALLY_IWMMXT"
   "textrmsw%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "textrm")]
+)
 
 (define_insn "iwmmxt_wshufh"
   [(set (match_operand:V4HI               0 "register_operand" "=y")
@@ -545,7 +790,9 @@
 		      (match_operand:SI   2 "immediate_operand" "i")] UNSPEC_WSHUFH))]
   "TARGET_REALLY_IWMMXT"
   "wshufh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wshufh")]
+)
 
 ;; Mask-generating comparisons
 ;;
@@ -557,92 +804,106 @@
 ;; into the entire destination vector, (with the '1' going into the least
 ;; significant element of the vector).  This is not how these instructions
 ;; behave.
-;;
-;; Unfortunately the current patterns are illegal.  They are SET insns
-;; without a SET in them.  They work in most cases for ordinary code
-;; generation, but there are circumstances where they can cause gcc to fail.
-;; XXX - FIXME.
 
 (define_insn "eqv8qi3"
-  [(unspec_volatile [(match_operand:V8QI 0 "register_operand" "=y")
-		     (match_operand:V8QI 1 "register_operand"  "y")
-		     (match_operand:V8QI 2 "register_operand"  "y")]
-		    VUNSPEC_WCMP_EQ)]
+  [(set (match_operand:V8QI                        0 "register_operand" "=y")
+	(unspec_volatile:V8QI [(match_operand:V8QI 1 "register_operand"  "y")
+	                       (match_operand:V8QI 2 "register_operand"  "y")]
+	                      VUNSPEC_WCMP_EQ))]
   "TARGET_REALLY_IWMMXT"
   "wcmpeqb%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wcmpeq")]
+)
 
 (define_insn "eqv4hi3"
-  [(unspec_volatile [(match_operand:V4HI 0 "register_operand" "=y")
-		     (match_operand:V4HI 1 "register_operand"  "y")
-		     (match_operand:V4HI 2 "register_operand"  "y")]
-		    VUNSPEC_WCMP_EQ)]
+  [(set (match_operand:V4HI                        0 "register_operand" "=y")
+	(unspec_volatile:V4HI [(match_operand:V4HI 1 "register_operand"  "y")
+		               (match_operand:V4HI 2 "register_operand"  "y")]
+	                       VUNSPEC_WCMP_EQ))]
   "TARGET_REALLY_IWMMXT"
   "wcmpeqh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wcmpeq")]
+)
 
 (define_insn "eqv2si3"
-  [(unspec_volatile:V2SI [(match_operand:V2SI 0 "register_operand" "=y")
-			  (match_operand:V2SI 1 "register_operand"  "y")
-			  (match_operand:V2SI 2 "register_operand"  "y")]
-			 VUNSPEC_WCMP_EQ)]
+  [(set (match_operand:V2SI    0 "register_operand" "=y")
+	(unspec_volatile:V2SI
+	  [(match_operand:V2SI 1 "register_operand"  "y")
+	   (match_operand:V2SI 2 "register_operand"  "y")]
+           VUNSPEC_WCMP_EQ))]
   "TARGET_REALLY_IWMMXT"
   "wcmpeqw%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wcmpeq")]
+)
 
 (define_insn "gtuv8qi3"
-  [(unspec_volatile [(match_operand:V8QI 0 "register_operand" "=y")
-		     (match_operand:V8QI 1 "register_operand"  "y")
-		     (match_operand:V8QI 2 "register_operand"  "y")]
-		    VUNSPEC_WCMP_GTU)]
+  [(set (match_operand:V8QI                        0 "register_operand" "=y")
+	(unspec_volatile:V8QI [(match_operand:V8QI 1 "register_operand"  "y")
+	                       (match_operand:V8QI 2 "register_operand"  "y")]
+	                       VUNSPEC_WCMP_GTU))]
   "TARGET_REALLY_IWMMXT"
   "wcmpgtub%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wcmpgt")]
+)
 
 (define_insn "gtuv4hi3"
-  [(unspec_volatile [(match_operand:V4HI 0 "register_operand" "=y")
-		     (match_operand:V4HI 1 "register_operand"  "y")
-		     (match_operand:V4HI 2 "register_operand"  "y")]
-		    VUNSPEC_WCMP_GTU)]
+  [(set (match_operand:V4HI                        0 "register_operand" "=y")
+        (unspec_volatile:V4HI [(match_operand:V4HI 1 "register_operand"  "y")
+                               (match_operand:V4HI 2 "register_operand"  "y")]
+                               VUNSPEC_WCMP_GTU))]
   "TARGET_REALLY_IWMMXT"
   "wcmpgtuh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wcmpgt")]
+)
 
 (define_insn "gtuv2si3"
-  [(unspec_volatile [(match_operand:V2SI 0 "register_operand" "=y")
-		     (match_operand:V2SI 1 "register_operand"  "y")
-		     (match_operand:V2SI 2 "register_operand"  "y")]
-		    VUNSPEC_WCMP_GTU)]
+  [(set (match_operand:V2SI                        0 "register_operand" "=y")
+	(unspec_volatile:V2SI [(match_operand:V2SI 1 "register_operand"  "y")
+	                       (match_operand:V2SI 2 "register_operand"  "y")]
+	                       VUNSPEC_WCMP_GTU))]
   "TARGET_REALLY_IWMMXT"
   "wcmpgtuw%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wcmpgt")]
+)
 
 (define_insn "gtv8qi3"
-  [(unspec_volatile [(match_operand:V8QI 0 "register_operand" "=y")
-		     (match_operand:V8QI 1 "register_operand"  "y")
-		     (match_operand:V8QI 2 "register_operand"  "y")]
-		    VUNSPEC_WCMP_GT)]
+  [(set (match_operand:V8QI                        0 "register_operand" "=y")
+	(unspec_volatile:V8QI [(match_operand:V8QI 1 "register_operand"  "y")
+	                       (match_operand:V8QI 2 "register_operand"  "y")]
+	                       VUNSPEC_WCMP_GT))]
   "TARGET_REALLY_IWMMXT"
   "wcmpgtsb%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wcmpgt")]
+)
 
 (define_insn "gtv4hi3"
-  [(unspec_volatile [(match_operand:V4HI 0 "register_operand" "=y")
-		     (match_operand:V4HI 1 "register_operand"  "y")
-		     (match_operand:V4HI 2 "register_operand"  "y")]
-		    VUNSPEC_WCMP_GT)]
+  [(set (match_operand:V4HI                        0 "register_operand" "=y")
+	(unspec_volatile:V4HI [(match_operand:V4HI 1 "register_operand"  "y")
+	                       (match_operand:V4HI 2 "register_operand"  "y")]
+	                       VUNSPEC_WCMP_GT))]
   "TARGET_REALLY_IWMMXT"
   "wcmpgtsh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wcmpgt")]
+)
 
 (define_insn "gtv2si3"
-  [(unspec_volatile [(match_operand:V2SI 0 "register_operand" "=y")
-		     (match_operand:V2SI 1 "register_operand"  "y")
-		     (match_operand:V2SI 2 "register_operand"  "y")]
-		    VUNSPEC_WCMP_GT)]
+  [(set (match_operand:V2SI                        0 "register_operand" "=y")
+	(unspec_volatile:V2SI [(match_operand:V2SI 1 "register_operand"  "y")
+	                       (match_operand:V2SI 2 "register_operand"  "y")]
+	                       VUNSPEC_WCMP_GT))]
   "TARGET_REALLY_IWMMXT"
   "wcmpgtsw%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wcmpgt")]
+)
 
 ;; Max/min insns
 
@@ -652,7 +913,9 @@
 		   (match_operand:VMMX 2 "register_operand" "y")))]
   "TARGET_REALLY_IWMMXT"
   "wmaxs<MMX_char>%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmax")]
+)
 
 (define_insn "*umax<mode>3_iwmmxt"
   [(set (match_operand:VMMX            0 "register_operand" "=y")
@@ -660,7 +923,9 @@
 		   (match_operand:VMMX 2 "register_operand" "y")))]
   "TARGET_REALLY_IWMMXT"
   "wmaxu<MMX_char>%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmax")]
+)
 
 (define_insn "*smin<mode>3_iwmmxt"
   [(set (match_operand:VMMX            0 "register_operand" "=y")
@@ -668,7 +933,9 @@
 		   (match_operand:VMMX 2 "register_operand" "y")))]
   "TARGET_REALLY_IWMMXT"
   "wmins<MMX_char>%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmin")]
+)
 
 (define_insn "*umin<mode>3_iwmmxt"
   [(set (match_operand:VMMX            0 "register_operand" "=y")
@@ -676,657 +943,835 @@
 		   (match_operand:VMMX 2 "register_operand" "y")))]
   "TARGET_REALLY_IWMMXT"
   "wminu<MMX_char>%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmin")]
+)
 
 ;; Pack/unpack insns.
 
 (define_insn "iwmmxt_wpackhss"
-  [(set (match_operand:V8QI                    0 "register_operand" "=y")
+  [(set (match_operand:V8QI                     0 "register_operand" "=y")
 	(vec_concat:V8QI
-	 (ss_truncate:V4QI (match_operand:V4HI 1 "register_operand" "y"))
-	 (ss_truncate:V4QI (match_operand:V4HI 2 "register_operand" "y"))))]
+	  (ss_truncate:V4QI (match_operand:V4HI 1 "register_operand" "y"))
+	  (ss_truncate:V4QI (match_operand:V4HI 2 "register_operand" "y"))))]
   "TARGET_REALLY_IWMMXT"
   "wpackhss%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wpack")]
+)
 
 (define_insn "iwmmxt_wpackwss"
-  [(set (match_operand:V4HI                    0 "register_operand" "=y")
-	(vec_concat:V4HI
-	 (ss_truncate:V2HI (match_operand:V2SI 1 "register_operand" "y"))
-	 (ss_truncate:V2HI (match_operand:V2SI 2 "register_operand" "y"))))]
+  [(set (match_operand:V4HI                     0 "register_operand" "=y")
+        (vec_concat:V4HI
+	  (ss_truncate:V2HI (match_operand:V2SI 1 "register_operand" "y"))
+	  (ss_truncate:V2HI (match_operand:V2SI 2 "register_operand" "y"))))]
   "TARGET_REALLY_IWMMXT"
   "wpackwss%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wpack")]
+)
 
 (define_insn "iwmmxt_wpackdss"
-  [(set (match_operand:V2SI                0 "register_operand" "=y")
+  [(set (match_operand:V2SI                 0 "register_operand" "=y")
 	(vec_concat:V2SI
-	 (ss_truncate:SI (match_operand:DI 1 "register_operand" "y"))
-	 (ss_truncate:SI (match_operand:DI 2 "register_operand" "y"))))]
+	  (ss_truncate:SI (match_operand:DI 1 "register_operand" "y"))
+	  (ss_truncate:SI (match_operand:DI 2 "register_operand" "y"))))]
   "TARGET_REALLY_IWMMXT"
   "wpackdss%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wpack")]
+)
 
 (define_insn "iwmmxt_wpackhus"
-  [(set (match_operand:V8QI                    0 "register_operand" "=y")
+  [(set (match_operand:V8QI                     0 "register_operand" "=y")
 	(vec_concat:V8QI
-	 (us_truncate:V4QI (match_operand:V4HI 1 "register_operand" "y"))
-	 (us_truncate:V4QI (match_operand:V4HI 2 "register_operand" "y"))))]
+	  (us_truncate:V4QI (match_operand:V4HI 1 "register_operand" "y"))
+	  (us_truncate:V4QI (match_operand:V4HI 2 "register_operand" "y"))))]
   "TARGET_REALLY_IWMMXT"
   "wpackhus%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wpack")]
+)
 
 (define_insn "iwmmxt_wpackwus"
-  [(set (match_operand:V4HI                    0 "register_operand" "=y")
+  [(set (match_operand:V4HI                     0 "register_operand" "=y")
 	(vec_concat:V4HI
-	 (us_truncate:V2HI (match_operand:V2SI 1 "register_operand" "y"))
-	 (us_truncate:V2HI (match_operand:V2SI 2 "register_operand" "y"))))]
+	  (us_truncate:V2HI (match_operand:V2SI 1 "register_operand" "y"))
+	  (us_truncate:V2HI (match_operand:V2SI 2 "register_operand" "y"))))]
   "TARGET_REALLY_IWMMXT"
   "wpackwus%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wpack")]
+)
 
 (define_insn "iwmmxt_wpackdus"
-  [(set (match_operand:V2SI                0 "register_operand" "=y")
+  [(set (match_operand:V2SI                 0 "register_operand" "=y")
 	(vec_concat:V2SI
-	 (us_truncate:SI (match_operand:DI 1 "register_operand" "y"))
-	 (us_truncate:SI (match_operand:DI 2 "register_operand" "y"))))]
+	  (us_truncate:SI (match_operand:DI 1 "register_operand" "y"))
+	  (us_truncate:SI (match_operand:DI 2 "register_operand" "y"))))]
   "TARGET_REALLY_IWMMXT"
   "wpackdus%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wpack")]
+)
 
 (define_insn "iwmmxt_wunpckihb"
-  [(set (match_operand:V8QI                   0 "register_operand" "=y")
+  [(set (match_operand:V8QI                                      0 "register_operand" "=y")
 	(vec_merge:V8QI
-	 (vec_select:V8QI (match_operand:V8QI 1 "register_operand" "y")
-			  (parallel [(const_int 4)
-				     (const_int 0)
-				     (const_int 5)
-				     (const_int 1)
-				     (const_int 6)
-				     (const_int 2)
-				     (const_int 7)
-				     (const_int 3)]))
-	 (vec_select:V8QI (match_operand:V8QI 2 "register_operand" "y")
-			  (parallel [(const_int 0)
-				     (const_int 4)
-				     (const_int 1)
-				     (const_int 5)
-				     (const_int 2)
-				     (const_int 6)
-				     (const_int 3)
-				     (const_int 7)]))
-	 (const_int 85)))]
+	  (vec_select:V8QI (match_operand:V8QI 1 "register_operand" "y")
+		           (parallel [(const_int 4)
+			              (const_int 0)
+			              (const_int 5)
+			              (const_int 1)
+			              (const_int 6)
+			              (const_int 2)
+			              (const_int 7)
+			              (const_int 3)]))
+          (vec_select:V8QI (match_operand:V8QI 2 "register_operand" "y")
+			   (parallel [(const_int 0)
+			              (const_int 4)
+			              (const_int 1)
+			              (const_int 5)
+			              (const_int 2)
+			              (const_int 6)
+			              (const_int 3)
+			              (const_int 7)]))
+          (const_int 85)))]
   "TARGET_REALLY_IWMMXT"
   "wunpckihb%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckih")]
+)
 
 (define_insn "iwmmxt_wunpckihh"
-  [(set (match_operand:V4HI                   0 "register_operand" "=y")
+  [(set (match_operand:V4HI                                      0 "register_operand" "=y")
 	(vec_merge:V4HI
-	 (vec_select:V4HI (match_operand:V4HI 1 "register_operand" "y")
-			  (parallel [(const_int 0)
-				     (const_int 2)
-				     (const_int 1)
-				     (const_int 3)]))
-	 (vec_select:V4HI (match_operand:V4HI 2 "register_operand" "y")
-			  (parallel [(const_int 2)
-				     (const_int 0)
-				     (const_int 3)
-				     (const_int 1)]))
-	 (const_int 5)))]
+	  (vec_select:V4HI (match_operand:V4HI 1 "register_operand" "y")
+		           (parallel [(const_int 2)
+			              (const_int 0)
+			              (const_int 3)
+			              (const_int 1)]))
+	  (vec_select:V4HI (match_operand:V4HI 2 "register_operand" "y")
+		           (parallel [(const_int 0)
+			              (const_int 2)
+			              (const_int 1)
+			              (const_int 3)]))
+          (const_int 5)))]
   "TARGET_REALLY_IWMMXT"
   "wunpckihh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckih")]
+)
 
 (define_insn "iwmmxt_wunpckihw"
-  [(set (match_operand:V2SI                   0 "register_operand" "=y")
+  [(set (match_operand:V2SI                    0 "register_operand" "=y")
 	(vec_merge:V2SI
-	 (vec_select:V2SI (match_operand:V2SI 1 "register_operand" "y")
-			  (parallel [(const_int 0)
-				     (const_int 1)]))
-	 (vec_select:V2SI (match_operand:V2SI 2 "register_operand" "y")
-			  (parallel [(const_int 1)
-				     (const_int 0)]))
-	 (const_int 1)))]
+	  (vec_select:V2SI (match_operand:V2SI 1 "register_operand" "y")
+		           (parallel [(const_int 1)
+		                      (const_int 0)]))
+          (vec_select:V2SI (match_operand:V2SI 2 "register_operand" "y")
+		           (parallel [(const_int 0)
+			              (const_int 1)]))
+          (const_int 1)))]
   "TARGET_REALLY_IWMMXT"
   "wunpckihw%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckih")]
+)
 
 (define_insn "iwmmxt_wunpckilb"
-  [(set (match_operand:V8QI                   0 "register_operand" "=y")
+  [(set (match_operand:V8QI                                      0 "register_operand" "=y")
 	(vec_merge:V8QI
-	 (vec_select:V8QI (match_operand:V8QI 1 "register_operand" "y")
-			  (parallel [(const_int 0)
-				     (const_int 4)
-				     (const_int 1)
-				     (const_int 5)
-				     (const_int 2)
-				     (const_int 6)
-				     (const_int 3)
-				     (const_int 7)]))
-	 (vec_select:V8QI (match_operand:V8QI 2 "register_operand" "y")
-			  (parallel [(const_int 4)
-				     (const_int 0)
-				     (const_int 5)
-				     (const_int 1)
-				     (const_int 6)
-				     (const_int 2)
-				     (const_int 7)
-				     (const_int 3)]))
-	 (const_int 85)))]
+	  (vec_select:V8QI (match_operand:V8QI 1 "register_operand" "y")
+		           (parallel [(const_int 0)
+			              (const_int 4)
+			              (const_int 1)
+			              (const_int 5)
+		                      (const_int 2)
+				      (const_int 6)
+				      (const_int 3)
+				      (const_int 7)]))
+	  (vec_select:V8QI (match_operand:V8QI 2 "register_operand" "y")
+		           (parallel [(const_int 4)
+			              (const_int 0)
+			              (const_int 5)
+			              (const_int 1)
+			              (const_int 6)
+			              (const_int 2)
+			              (const_int 7)
+			              (const_int 3)]))
+	  (const_int 85)))]
   "TARGET_REALLY_IWMMXT"
   "wunpckilb%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckil")]
+)
 
 (define_insn "iwmmxt_wunpckilh"
-  [(set (match_operand:V4HI                   0 "register_operand" "=y")
+  [(set (match_operand:V4HI                                      0 "register_operand" "=y")
 	(vec_merge:V4HI
-	 (vec_select:V4HI (match_operand:V4HI 1 "register_operand" "y")
-			  (parallel [(const_int 2)
-				     (const_int 0)
-				     (const_int 3)
-				     (const_int 1)]))
-	 (vec_select:V4HI (match_operand:V4HI 2 "register_operand" "y")
-			  (parallel [(const_int 0)
-				     (const_int 2)
-				     (const_int 1)
-				     (const_int 3)]))
-	 (const_int 5)))]
+	  (vec_select:V4HI (match_operand:V4HI 1 "register_operand" "y")
+		           (parallel [(const_int 0)
+			              (const_int 2)
+			              (const_int 1)
+			              (const_int 3)]))
+	  (vec_select:V4HI (match_operand:V4HI 2 "register_operand" "y")
+			   (parallel [(const_int 2)
+			              (const_int 0)
+			              (const_int 3)
+			              (const_int 1)]))
+	  (const_int 5)))]
   "TARGET_REALLY_IWMMXT"
   "wunpckilh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckil")]
+)
 
 (define_insn "iwmmxt_wunpckilw"
-  [(set (match_operand:V2SI                   0 "register_operand" "=y")
+  [(set (match_operand:V2SI                    0 "register_operand" "=y")
 	(vec_merge:V2SI
-	 (vec_select:V2SI (match_operand:V2SI 1 "register_operand" "y")
-			   (parallel [(const_int 1)
-				      (const_int 0)]))
-	 (vec_select:V2SI (match_operand:V2SI 2 "register_operand" "y")
-			  (parallel [(const_int 0)
-				     (const_int 1)]))
-	 (const_int 1)))]
+	  (vec_select:V2SI (match_operand:V2SI 1 "register_operand" "y")
+		           (parallel [(const_int 0)
+				      (const_int 1)]))
+	  (vec_select:V2SI (match_operand:V2SI 2 "register_operand" "y")
+		           (parallel [(const_int 1)
+			              (const_int 0)]))
+	  (const_int 1)))]
   "TARGET_REALLY_IWMMXT"
   "wunpckilw%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckil")]
+)
 
 (define_insn "iwmmxt_wunpckehub"
-  [(set (match_operand:V4HI                   0 "register_operand" "=y")
-	(zero_extend:V4HI
-	 (vec_select:V4QI (match_operand:V8QI 1 "register_operand" "y")
-			  (parallel [(const_int 4) (const_int 5)
-				     (const_int 6) (const_int 7)]))))]
+  [(set (match_operand:V4HI                     0 "register_operand" "=y")
+	(vec_select:V4HI
+	  (zero_extend:V8HI (match_operand:V8QI 1 "register_operand" "y"))
+	  (parallel [(const_int 4) (const_int 5)
+	             (const_int 6) (const_int 7)])))]
   "TARGET_REALLY_IWMMXT"
   "wunpckehub%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckeh")]
+)
 
 (define_insn "iwmmxt_wunpckehuh"
-  [(set (match_operand:V2SI                   0 "register_operand" "=y")
-	(zero_extend:V2SI
-	 (vec_select:V2HI (match_operand:V4HI 1 "register_operand" "y")
-			  (parallel [(const_int 2) (const_int 3)]))))]
+  [(set (match_operand:V2SI                     0 "register_operand" "=y")
+	(vec_select:V2SI
+	  (zero_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+	  (parallel [(const_int 2) (const_int 3)])))]
   "TARGET_REALLY_IWMMXT"
   "wunpckehuh%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckeh")]
+)
 
 (define_insn "iwmmxt_wunpckehuw"
-  [(set (match_operand:DI                   0 "register_operand" "=y")
-	(zero_extend:DI
-	 (vec_select:SI (match_operand:V2SI 1 "register_operand" "y")
-			(parallel [(const_int 1)]))))]
+  [(set (match_operand:DI                       0 "register_operand" "=y")
+	(vec_select:DI
+	  (zero_extend:V2DI (match_operand:V2SI 1 "register_operand" "y"))
+	  (parallel [(const_int 1)])))]
   "TARGET_REALLY_IWMMXT"
   "wunpckehuw%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckeh")]
+)
 
 (define_insn "iwmmxt_wunpckehsb"
-  [(set (match_operand:V4HI                   0 "register_operand" "=y")
-	(sign_extend:V4HI
-	 (vec_select:V4QI (match_operand:V8QI 1 "register_operand" "y")
-			  (parallel [(const_int 4) (const_int 5)
-				     (const_int 6) (const_int 7)]))))]
+  [(set (match_operand:V4HI                     0 "register_operand" "=y")
+        (vec_select:V4HI
+	  (sign_extend:V8HI (match_operand:V8QI 1 "register_operand" "y"))
+	  (parallel [(const_int 4) (const_int 5)
+	             (const_int 6) (const_int 7)])))]
   "TARGET_REALLY_IWMMXT"
   "wunpckehsb%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckeh")]
+)
 
 (define_insn "iwmmxt_wunpckehsh"
-  [(set (match_operand:V2SI                   0 "register_operand" "=y")
-	(sign_extend:V2SI
-	 (vec_select:V2HI (match_operand:V4HI 1 "register_operand" "y")
-			  (parallel [(const_int 2) (const_int 3)]))))]
+  [(set (match_operand:V2SI                     0 "register_operand" "=y")
+	(vec_select:V2SI
+	  (sign_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+	  (parallel [(const_int 2) (const_int 3)])))]
   "TARGET_REALLY_IWMMXT"
   "wunpckehsh%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckeh")]
+)
 
 (define_insn "iwmmxt_wunpckehsw"
-  [(set (match_operand:DI                   0 "register_operand" "=y")
-	(sign_extend:DI
-	 (vec_select:SI (match_operand:V2SI 1 "register_operand" "y")
-			(parallel [(const_int 1)]))))]
+  [(set (match_operand:DI                       0 "register_operand" "=y")
+	(vec_select:DI
+	  (sign_extend:V2DI (match_operand:V2SI 1 "register_operand" "y"))
+	  (parallel [(const_int 1)])))]
   "TARGET_REALLY_IWMMXT"
   "wunpckehsw%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckeh")]
+)
 
 (define_insn "iwmmxt_wunpckelub"
-  [(set (match_operand:V4HI                   0 "register_operand" "=y")
-	(zero_extend:V4HI
-	 (vec_select:V4QI (match_operand:V8QI 1 "register_operand" "y")
-			  (parallel [(const_int 0) (const_int 1)
-				     (const_int 2) (const_int 3)]))))]
+  [(set (match_operand:V4HI                     0 "register_operand" "=y")
+	(vec_select:V4HI
+	  (zero_extend:V8HI (match_operand:V8QI 1 "register_operand" "y"))
+	  (parallel [(const_int 0) (const_int 1)
+		     (const_int 2) (const_int 3)])))]
   "TARGET_REALLY_IWMMXT"
   "wunpckelub%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckel")]
+)
 
 (define_insn "iwmmxt_wunpckeluh"
-  [(set (match_operand:V2SI                   0 "register_operand" "=y")
-	(zero_extend:V2SI
-	 (vec_select:V2HI (match_operand:V4HI 1 "register_operand" "y")
-			  (parallel [(const_int 0) (const_int 1)]))))]
+  [(set (match_operand:V2SI                     0 "register_operand" "=y")
+	(vec_select:V2SI
+	  (zero_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+	  (parallel [(const_int 0) (const_int 1)])))]
   "TARGET_REALLY_IWMMXT"
   "wunpckeluh%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckel")]
+)
 
 (define_insn "iwmmxt_wunpckeluw"
-  [(set (match_operand:DI                   0 "register_operand" "=y")
-	(zero_extend:DI
-	 (vec_select:SI (match_operand:V2SI 1 "register_operand" "y")
-			(parallel [(const_int 0)]))))]
+  [(set (match_operand:DI                       0 "register_operand" "=y")
+	(vec_select:DI
+	  (zero_extend:V2DI (match_operand:V2SI 1 "register_operand" "y"))
+	  (parallel [(const_int 0)])))]
   "TARGET_REALLY_IWMMXT"
   "wunpckeluw%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckel")]
+)
 
 (define_insn "iwmmxt_wunpckelsb"
-  [(set (match_operand:V4HI                   0 "register_operand" "=y")
-	(sign_extend:V4HI
-	 (vec_select:V4QI (match_operand:V8QI 1 "register_operand" "y")
-			  (parallel [(const_int 0) (const_int 1)
-				     (const_int 2) (const_int 3)]))))]
+  [(set (match_operand:V4HI                     0 "register_operand" "=y")
+	(vec_select:V4HI
+	  (sign_extend:V8HI (match_operand:V8QI 1 "register_operand" "y"))
+	  (parallel [(const_int 0) (const_int 1)
+		     (const_int 2) (const_int 3)])))]
   "TARGET_REALLY_IWMMXT"
   "wunpckelsb%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckel")]
+)
 
 (define_insn "iwmmxt_wunpckelsh"
-  [(set (match_operand:V2SI                   0 "register_operand" "=y")
-	(sign_extend:V2SI
-	 (vec_select:V2HI (match_operand:V4HI 1 "register_operand" "y")
-			  (parallel [(const_int 0) (const_int 1)]))))]
+  [(set (match_operand:V2SI                     0 "register_operand" "=y")
+	(vec_select:V2SI
+	  (sign_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+	  (parallel [(const_int 0) (const_int 1)])))]
   "TARGET_REALLY_IWMMXT"
   "wunpckelsh%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckel")]
+)
 
 (define_insn "iwmmxt_wunpckelsw"
-  [(set (match_operand:DI                   0 "register_operand" "=y")
-	(sign_extend:DI
-	 (vec_select:SI (match_operand:V2SI 1 "register_operand" "y")
-			(parallel [(const_int 0)]))))]
+  [(set (match_operand:DI                       0 "register_operand" "=y")
+        (vec_select:DI
+	  (sign_extend:V2DI (match_operand:V2SI 1 "register_operand" "y"))
+	  (parallel [(const_int 0)])))]
   "TARGET_REALLY_IWMMXT"
   "wunpckelsw%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wunpckel")]
+)
 
 ;; Shifts
 
-(define_insn "rorv4hi3"
-  [(set (match_operand:V4HI                0 "register_operand" "=y")
-        (rotatert:V4HI (match_operand:V4HI 1 "register_operand" "y")
-		       (match_operand:SI   2 "register_operand" "z")))]
-  "TARGET_REALLY_IWMMXT"
-  "wrorhg%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "rorv2si3"
-  [(set (match_operand:V2SI                0 "register_operand" "=y")
-        (rotatert:V2SI (match_operand:V2SI 1 "register_operand" "y")
-		       (match_operand:SI   2 "register_operand" "z")))]
-  "TARGET_REALLY_IWMMXT"
-  "wrorwg%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "rordi3"
-  [(set (match_operand:DI              0 "register_operand" "=y")
-	(rotatert:DI (match_operand:DI 1 "register_operand" "y")
-		   (match_operand:SI   2 "register_operand" "z")))]
+(define_insn "ror<mode>3"
+  [(set (match_operand:VSHFT                 0 "register_operand" "=y,y")
+        (rotatert:VSHFT (match_operand:VSHFT 1 "register_operand" "y,y")
+		        (match_operand:SI    2 "imm_or_reg_operand" "z,i")))]
   "TARGET_REALLY_IWMMXT"
-  "wrordg%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  "*
+  switch (which_alternative)
+    {
+    case 0:
+      return \"wror<MMX_char>g%?\\t%0, %1, %2\";
+    case 1:
+      return arm_output_iwmmxt_shift_immediate (\"wror<MMX_char>\", operands, true);
+    default:
+      gcc_unreachable ();
+    }
+  "
+  [(set_attr "predicable" "yes")
+   (set_attr "arch" "*, iwmmxt2")
+   (set_attr "wtype" "wror, wror")]
+)
 
 (define_insn "ashr<mode>3_iwmmxt"
-  [(set (match_operand:VSHFT                 0 "register_operand" "=y")
-        (ashiftrt:VSHFT (match_operand:VSHFT 1 "register_operand" "y")
-			(match_operand:SI    2 "register_operand" "z")))]
+  [(set (match_operand:VSHFT                 0 "register_operand" "=y,y")
+        (ashiftrt:VSHFT (match_operand:VSHFT 1 "register_operand" "y,y")
+			(match_operand:SI    2 "imm_or_reg_operand" "z,i")))]
   "TARGET_REALLY_IWMMXT"
-  "wsra<MMX_char>g%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  "*
+  switch (which_alternative)
+    {
+    case 0:
+      return \"wsra<MMX_char>g%?\\t%0, %1, %2\";
+    case 1:
+      return arm_output_iwmmxt_shift_immediate (\"wsra<MMX_char>\", operands, true);
+    default:
+      gcc_unreachable ();
+    }
+  "
+  [(set_attr "predicable" "yes")
+   (set_attr "arch" "*, iwmmxt2")
+   (set_attr "wtype" "wsra, wsra")]
+)
 
 (define_insn "lshr<mode>3_iwmmxt"
-  [(set (match_operand:VSHFT                 0 "register_operand" "=y")
-        (lshiftrt:VSHFT (match_operand:VSHFT 1 "register_operand" "y")
-			(match_operand:SI    2 "register_operand" "z")))]
+  [(set (match_operand:VSHFT                 0 "register_operand" "=y,y")
+        (lshiftrt:VSHFT (match_operand:VSHFT 1 "register_operand" "y,y")
+			(match_operand:SI    2 "imm_or_reg_operand" "z,i")))]
   "TARGET_REALLY_IWMMXT"
-  "wsrl<MMX_char>g%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  "*
+  switch (which_alternative)
+    {
+    case 0:
+      return \"wsrl<MMX_char>g%?\\t%0, %1, %2\";
+    case 1:
+      return arm_output_iwmmxt_shift_immediate (\"wsrl<MMX_char>\", operands, false);
+    default:
+      gcc_unreachable ();
+    }
+  "
+  [(set_attr "predicable" "yes")
+   (set_attr "arch" "*, iwmmxt2")
+   (set_attr "wtype" "wsrl, wsrl")]
+)
 
 (define_insn "ashl<mode>3_iwmmxt"
-  [(set (match_operand:VSHFT               0 "register_operand" "=y")
-        (ashift:VSHFT (match_operand:VSHFT 1 "register_operand" "y")
-		      (match_operand:SI    2 "register_operand" "z")))]
-  "TARGET_REALLY_IWMMXT"
-  "wsll<MMX_char>g%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "rorv4hi3_di"
-  [(set (match_operand:V4HI                0 "register_operand" "=y")
-        (rotatert:V4HI (match_operand:V4HI 1 "register_operand" "y")
-		       (match_operand:DI   2 "register_operand" "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wrorh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "rorv2si3_di"
-  [(set (match_operand:V2SI                0 "register_operand" "=y")
-        (rotatert:V2SI (match_operand:V2SI 1 "register_operand" "y")
-		       (match_operand:DI   2 "register_operand" "y")))]
+  [(set (match_operand:VSHFT               0 "register_operand" "=y,y")
+        (ashift:VSHFT (match_operand:VSHFT 1 "register_operand" "y,y")
+		      (match_operand:SI    2 "imm_or_reg_operand" "z,i")))]
   "TARGET_REALLY_IWMMXT"
-  "wrorw%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "rordi3_di"
-  [(set (match_operand:DI              0 "register_operand" "=y")
-	(rotatert:DI (match_operand:DI 1 "register_operand" "y")
-		   (match_operand:DI   2 "register_operand" "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wrord%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "ashrv4hi3_di"
-  [(set (match_operand:V4HI                0 "register_operand" "=y")
-        (ashiftrt:V4HI (match_operand:V4HI 1 "register_operand" "y")
-		       (match_operand:DI   2 "register_operand" "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wsrah%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "ashrv2si3_di"
-  [(set (match_operand:V2SI                0 "register_operand" "=y")
-        (ashiftrt:V2SI (match_operand:V2SI 1 "register_operand" "y")
-		       (match_operand:DI   2 "register_operand" "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wsraw%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "ashrdi3_di"
-  [(set (match_operand:DI              0 "register_operand" "=y")
-	(ashiftrt:DI (match_operand:DI 1 "register_operand" "y")
-		   (match_operand:DI   2 "register_operand" "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wsrad%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "lshrv4hi3_di"
-  [(set (match_operand:V4HI                0 "register_operand" "=y")
-        (lshiftrt:V4HI (match_operand:V4HI 1 "register_operand" "y")
-		       (match_operand:DI   2 "register_operand" "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wsrlh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "lshrv2si3_di"
-  [(set (match_operand:V2SI                0 "register_operand" "=y")
-        (lshiftrt:V2SI (match_operand:V2SI 1 "register_operand" "y")
-		       (match_operand:DI   2 "register_operand" "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wsrlw%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  "*
+  switch (which_alternative)
+    {
+    case 0:
+      return \"wsll<MMX_char>g%?\\t%0, %1, %2\";
+    case 1:
+      return arm_output_iwmmxt_shift_immediate (\"wsll<MMX_char>\", operands, false);
+    default:
+      gcc_unreachable ();
+    }
+  "
+  [(set_attr "predicable" "yes")
+   (set_attr "arch" "*, iwmmxt2")
+   (set_attr "wtype" "wsll, wsll")]
+)
 
-(define_insn "lshrdi3_di"
-  [(set (match_operand:DI              0 "register_operand" "=y")
-	(lshiftrt:DI (match_operand:DI 1 "register_operand" "y")
-		     (match_operand:DI 2 "register_operand" "y")))]
+(define_insn "ror<mode>3_di"
+  [(set (match_operand:VSHFT                 0 "register_operand" "=y,y")
+        (rotatert:VSHFT (match_operand:VSHFT 1 "register_operand" "y,y")
+		        (match_operand:DI    2 "imm_or_reg_operand" "y,i")))]
   "TARGET_REALLY_IWMMXT"
-  "wsrld%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  "*
+  switch (which_alternative)
+    {
+    case 0:
+      return \"wror<MMX_char>%?\\t%0, %1, %2\";
+    case 1:
+      return arm_output_iwmmxt_shift_immediate (\"wror<MMX_char>\", operands, true);
+    default:
+      gcc_unreachable ();
+    }
+  "
+  [(set_attr "predicable" "yes")
+   (set_attr "arch" "*, iwmmxt2")
+   (set_attr "wtype" "wror, wror")]
+)
 
-(define_insn "ashlv4hi3_di"
-  [(set (match_operand:V4HI              0 "register_operand" "=y")
-        (ashift:V4HI (match_operand:V4HI 1 "register_operand" "y")
-		     (match_operand:DI   2 "register_operand" "y")))]
+(define_insn "ashr<mode>3_di"
+  [(set (match_operand:VSHFT                 0 "register_operand" "=y,y")
+        (ashiftrt:VSHFT (match_operand:VSHFT 1 "register_operand" "y,y")
+		        (match_operand:DI    2 "imm_or_reg_operand" "y,i")))]
   "TARGET_REALLY_IWMMXT"
-  "wsllh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  "*
+  switch (which_alternative)
+    {
+    case 0:
+      return \"wsra<MMX_char>%?\\t%0, %1, %2\";
+    case 1:
+      return arm_output_iwmmxt_shift_immediate (\"wsra<MMX_char>\", operands, true);
+    default:
+      gcc_unreachable ();
+    }
+  "
+  [(set_attr "predicable" "yes")
+   (set_attr "arch" "*, iwmmxt2")
+   (set_attr "wtype" "wsra, wsra")]
+)
 
-(define_insn "ashlv2si3_di"
-  [(set (match_operand:V2SI              0 "register_operand" "=y")
-        (ashift:V2SI (match_operand:V2SI 1 "register_operand" "y")
-		       (match_operand:DI 2 "register_operand" "y")))]
+(define_insn "lshr<mode>3_di"
+  [(set (match_operand:VSHFT                 0 "register_operand" "=y,y")
+        (lshiftrt:VSHFT (match_operand:VSHFT 1 "register_operand" "y,y")
+		        (match_operand:DI    2 "imm_or_reg_operand" "y,i")))]
   "TARGET_REALLY_IWMMXT"
-  "wsllw%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  "*
+  switch (which_alternative)
+    {
+    case 0:
+      return \"wsrl<MMX_char>%?\\t%0, %1, %2\";
+    case 1:
+      return arm_output_iwmmxt_shift_immediate (\"wsrl<MMX_char>\", operands, false);
+    default:
+      gcc_unreachable ();
+    }
+  "
+  [(set_attr "predicable" "yes")
+   (set_attr "arch" "*, iwmmxt2")
+   (set_attr "wtype" "wsrl, wsrl")]
+)
 
-(define_insn "ashldi3_di"
-  [(set (match_operand:DI            0 "register_operand" "=y")
-	(ashift:DI (match_operand:DI 1 "register_operand" "y")
-		   (match_operand:DI 2 "register_operand" "y")))]
+(define_insn "ashl<mode>3_di"
+  [(set (match_operand:VSHFT               0 "register_operand" "=y,y")
+        (ashift:VSHFT (match_operand:VSHFT 1 "register_operand" "y,y")
+		      (match_operand:DI    2 "imm_or_reg_operand" "y,i")))]
   "TARGET_REALLY_IWMMXT"
-  "wslld%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  "*
+  switch (which_alternative)
+    {
+    case 0:
+      return \"wsll<MMX_char>%?\\t%0, %1, %2\";
+    case 1:
+      return arm_output_iwmmxt_shift_immediate (\"wsll<MMX_char>\", operands, false);
+    default:
+      gcc_unreachable ();
+    }
+  "
+  [(set_attr "predicable" "yes")
+   (set_attr "arch" "*, iwmmxt2")
+   (set_attr "wtype" "wsll, wsll")]
+)
 
 (define_insn "iwmmxt_wmadds"
-  [(set (match_operand:V4HI               0 "register_operand" "=y")
-        (unspec:V4HI [(match_operand:V4HI 1 "register_operand" "y")
-		      (match_operand:V4HI 2 "register_operand" "y")] UNSPEC_WMADDS))]
+  [(set (match_operand:V2SI                                        0 "register_operand" "=y")
+	(plus:V2SI
+	  (mult:V2SI
+	    (vec_select:V2SI (sign_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+	                     (parallel [(const_int 1) (const_int 3)]))
+	    (vec_select:V2SI (sign_extend:V4SI (match_operand:V4HI 2 "register_operand" "y"))
+	                     (parallel [(const_int 1) (const_int 3)])))
+	  (mult:V2SI
+	    (vec_select:V2SI (sign_extend:V4SI (match_dup 1))
+	                     (parallel [(const_int 0) (const_int 2)]))
+	    (vec_select:V2SI (sign_extend:V4SI (match_dup 2))
+	                     (parallel [(const_int 0) (const_int 2)])))))]
   "TARGET_REALLY_IWMMXT"
   "wmadds%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmadd")]
+)
 
 (define_insn "iwmmxt_wmaddu"
-  [(set (match_operand:V4HI               0 "register_operand" "=y")
-        (unspec:V4HI [(match_operand:V4HI 1 "register_operand" "y")
-		      (match_operand:V4HI 2 "register_operand" "y")] UNSPEC_WMADDU))]
+  [(set (match_operand:V2SI               0 "register_operand" "=y")
+	(plus:V2SI
+	  (mult:V2SI
+	    (vec_select:V2SI (zero_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+	                     (parallel [(const_int 1) (const_int 3)]))
+	    (vec_select:V2SI (zero_extend:V4SI (match_operand:V4HI 2 "register_operand" "y"))
+	                     (parallel [(const_int 1) (const_int 3)])))
+	  (mult:V2SI
+	    (vec_select:V2SI (zero_extend:V4SI (match_dup 1))
+	                     (parallel [(const_int 0) (const_int 2)]))
+	    (vec_select:V2SI (zero_extend:V4SI (match_dup 2))
+	                     (parallel [(const_int 0) (const_int 2)])))))]
   "TARGET_REALLY_IWMMXT"
   "wmaddu%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmadd")]
+)
 
 (define_insn "iwmmxt_tmia"
-  [(set (match_operand:DI                    0 "register_operand" "=y")
-	(plus:DI (match_operand:DI           1 "register_operand" "0")
+  [(set (match_operand:DI                     0 "register_operand" "=y")
+	(plus:DI (match_operand:DI            1 "register_operand" "0")
 		 (mult:DI (sign_extend:DI
-			   (match_operand:SI 2 "register_operand" "r"))
+			    (match_operand:SI 2 "register_operand" "r"))
 			  (sign_extend:DI
-			   (match_operand:SI 3 "register_operand" "r")))))]
+			    (match_operand:SI 3 "register_operand" "r")))))]
   "TARGET_REALLY_IWMMXT"
   "tmia%?\\t%0, %2, %3"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tmia")]
+)
 
 (define_insn "iwmmxt_tmiaph"
-  [(set (match_operand:DI          0 "register_operand" "=y")
-	(plus:DI (match_operand:DI 1 "register_operand" "0")
+  [(set (match_operand:DI                                    0 "register_operand" "=y")
+	(plus:DI (match_operand:DI                           1 "register_operand" "0")
 		 (plus:DI
-		  (mult:DI (sign_extend:DI
-			    (truncate:HI (match_operand:SI 2 "register_operand" "r")))
-			   (sign_extend:DI
-			    (truncate:HI (match_operand:SI 3 "register_operand" "r"))))
-		  (mult:DI (sign_extend:DI
-			    (truncate:HI (ashiftrt:SI (match_dup 2) (const_int 16))))
-			   (sign_extend:DI
-			    (truncate:HI (ashiftrt:SI (match_dup 3) (const_int 16))))))))]
+		   (mult:DI (sign_extend:DI
+			      (truncate:HI (match_operand:SI 2 "register_operand" "r")))
+			    (sign_extend:DI
+			      (truncate:HI (match_operand:SI 3 "register_operand" "r"))))
+		   (mult:DI (sign_extend:DI
+			      (truncate:HI (ashiftrt:SI (match_dup 2) (const_int 16))))
+			    (sign_extend:DI
+			      (truncate:HI (ashiftrt:SI (match_dup 3) (const_int 16))))))))]
   "TARGET_REALLY_IWMMXT"
   "tmiaph%?\\t%0, %2, %3"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tmiaph")]
+)
 
 (define_insn "iwmmxt_tmiabb"
-  [(set (match_operand:DI          0 "register_operand" "=y")
-	(plus:DI (match_operand:DI 1 "register_operand" "0")
+  [(set (match_operand:DI                                  0 "register_operand" "=y")
+	(plus:DI (match_operand:DI                         1 "register_operand" "0")
 		 (mult:DI (sign_extend:DI
-			   (truncate:HI (match_operand:SI 2 "register_operand" "r")))
+			    (truncate:HI (match_operand:SI 2 "register_operand" "r")))
 			  (sign_extend:DI
-			   (truncate:HI (match_operand:SI 3 "register_operand" "r"))))))]
+			    (truncate:HI (match_operand:SI 3 "register_operand" "r"))))))]
   "TARGET_REALLY_IWMMXT"
   "tmiabb%?\\t%0, %2, %3"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tmiaxy")]
+)
 
 (define_insn "iwmmxt_tmiatb"
-  [(set (match_operand:DI          0 "register_operand" "=y")
-	(plus:DI (match_operand:DI 1 "register_operand" "0")
+  [(set (match_operand:DI                         0 "register_operand" "=y")
+	(plus:DI (match_operand:DI                1 "register_operand" "0")
 		 (mult:DI (sign_extend:DI
-			   (truncate:HI (ashiftrt:SI
-					 (match_operand:SI 2 "register_operand" "r")
-					 (const_int 16))))
+			    (truncate:HI
+			      (ashiftrt:SI
+				(match_operand:SI 2 "register_operand" "r")
+				(const_int 16))))
 			  (sign_extend:DI
-			   (truncate:HI (match_operand:SI 3 "register_operand" "r"))))))]
+			    (truncate:HI
+			      (match_operand:SI   3 "register_operand" "r"))))))]
   "TARGET_REALLY_IWMMXT"
   "tmiatb%?\\t%0, %2, %3"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tmiaxy")]
+)
 
 (define_insn "iwmmxt_tmiabt"
-  [(set (match_operand:DI          0 "register_operand" "=y")
-	(plus:DI (match_operand:DI 1 "register_operand" "0")
+  [(set (match_operand:DI                         0 "register_operand" "=y")
+	(plus:DI (match_operand:DI                1 "register_operand" "0")
 		 (mult:DI (sign_extend:DI
-			   (truncate:HI (match_operand:SI 2 "register_operand" "r")))
+			    (truncate:HI
+			      (match_operand:SI   2 "register_operand" "r")))
 			  (sign_extend:DI
-			   (truncate:HI (ashiftrt:SI
-					 (match_operand:SI 3 "register_operand" "r")
-					 (const_int 16)))))))]
+			    (truncate:HI
+			      (ashiftrt:SI
+				(match_operand:SI 3 "register_operand" "r")
+				(const_int 16)))))))]
   "TARGET_REALLY_IWMMXT"
   "tmiabt%?\\t%0, %2, %3"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tmiaxy")]
+)
 
 (define_insn "iwmmxt_tmiatt"
   [(set (match_operand:DI          0 "register_operand" "=y")
 	(plus:DI (match_operand:DI 1 "register_operand" "0")
 		 (mult:DI (sign_extend:DI
-			   (truncate:HI (ashiftrt:SI
-					 (match_operand:SI 2 "register_operand" "r")
-					 (const_int 16))))
+			    (truncate:HI
+			      (ashiftrt:SI
+				(match_operand:SI 2 "register_operand" "r")
+				(const_int 16))))
 			  (sign_extend:DI
-			   (truncate:HI (ashiftrt:SI
-					 (match_operand:SI 3 "register_operand" "r")
-					 (const_int 16)))))))]
+			    (truncate:HI
+			      (ashiftrt:SI
+				(match_operand:SI 3 "register_operand" "r")
+				(const_int 16)))))))]
   "TARGET_REALLY_IWMMXT"
   "tmiatt%?\\t%0, %2, %3"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "iwmmxt_tbcstqi"
-  [(set (match_operand:V8QI                   0 "register_operand" "=y")
-	(vec_duplicate:V8QI (match_operand:QI 1 "register_operand" "r")))]
-  "TARGET_REALLY_IWMMXT"
-  "tbcstb%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "iwmmxt_tbcsthi"
-  [(set (match_operand:V4HI                   0 "register_operand" "=y")
-	(vec_duplicate:V4HI (match_operand:HI 1 "register_operand" "r")))]
-  "TARGET_REALLY_IWMMXT"
-  "tbcsth%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "iwmmxt_tbcstsi"
-  [(set (match_operand:V2SI                   0 "register_operand" "=y")
-	(vec_duplicate:V2SI (match_operand:SI 1 "register_operand" "r")))]
-  "TARGET_REALLY_IWMMXT"
-  "tbcstw%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tmiaxy")]
+)
 
 (define_insn "iwmmxt_tmovmskb"
   [(set (match_operand:SI               0 "register_operand" "=r")
 	(unspec:SI [(match_operand:V8QI 1 "register_operand" "y")] UNSPEC_TMOVMSK))]
   "TARGET_REALLY_IWMMXT"
   "tmovmskb%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tmovmsk")]
+)
 
 (define_insn "iwmmxt_tmovmskh"
   [(set (match_operand:SI               0 "register_operand" "=r")
 	(unspec:SI [(match_operand:V4HI 1 "register_operand" "y")] UNSPEC_TMOVMSK))]
   "TARGET_REALLY_IWMMXT"
   "tmovmskh%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tmovmsk")]
+)
 
 (define_insn "iwmmxt_tmovmskw"
   [(set (match_operand:SI               0 "register_operand" "=r")
 	(unspec:SI [(match_operand:V2SI 1 "register_operand" "y")] UNSPEC_TMOVMSK))]
   "TARGET_REALLY_IWMMXT"
   "tmovmskw%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tmovmsk")]
+)
 
 (define_insn "iwmmxt_waccb"
   [(set (match_operand:DI               0 "register_operand" "=y")
 	(unspec:DI [(match_operand:V8QI 1 "register_operand" "y")] UNSPEC_WACC))]
   "TARGET_REALLY_IWMMXT"
   "waccb%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wacc")]
+)
 
 (define_insn "iwmmxt_wacch"
   [(set (match_operand:DI               0 "register_operand" "=y")
 	(unspec:DI [(match_operand:V4HI 1 "register_operand" "y")] UNSPEC_WACC))]
   "TARGET_REALLY_IWMMXT"
   "wacch%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wacc")]
+)
 
 (define_insn "iwmmxt_waccw"
   [(set (match_operand:DI               0 "register_operand" "=y")
 	(unspec:DI [(match_operand:V2SI 1 "register_operand" "y")] UNSPEC_WACC))]
   "TARGET_REALLY_IWMMXT"
   "waccw%?\\t%0, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wacc")]
+)
 
-(define_insn "iwmmxt_walign"
-  [(set (match_operand:V8QI                           0 "register_operand" "=y,y")
+;; Use unspec here to prevent 8 * imm from being optimized by CSE.
+(define_insn "iwmmxt_waligni"
+  [(set (match_operand:V8QI                                0 "register_operand" "=y")
+	(unspec:V8QI [(subreg:V8QI
+		        (ashiftrt:TI
+		          (subreg:TI (vec_concat:V16QI
+				       (match_operand:V8QI 1 "register_operand" "y")
+				       (match_operand:V8QI 2 "register_operand" "y")) 0)
+		          (mult:SI
+		            (match_operand:SI              3 "immediate_operand" "i")
+		            (const_int 8))) 0)] UNSPEC_WALIGNI))]
+  "TARGET_REALLY_IWMMXT"
+  "waligni%?\\t%0, %1, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "waligni")]
+)
+
+(define_insn "iwmmxt_walignr"
+  [(set (match_operand:V8QI                           0 "register_operand" "=y")
 	(subreg:V8QI (ashiftrt:TI
-		      (subreg:TI (vec_concat:V16QI
-				  (match_operand:V8QI 1 "register_operand" "y,y")
-				  (match_operand:V8QI 2 "register_operand" "y,y")) 0)
-		      (mult:SI
-		       (match_operand:SI              3 "nonmemory_operand" "i,z")
-		       (const_int 8))) 0))]
-  "TARGET_REALLY_IWMMXT"
-  "@
-   waligni%?\\t%0, %1, %2, %3
-   walignr%U3%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+		       (subreg:TI (vec_concat:V16QI
+				    (match_operand:V8QI 1 "register_operand" "y")
+				    (match_operand:V8QI 2 "register_operand" "y")) 0)
+		       (mult:SI
+		         (zero_extract:SI (match_operand:SI 3 "register_operand" "z") (const_int 3) (const_int 0))
+		         (const_int 8))) 0))]
+  "TARGET_REALLY_IWMMXT"
+  "walignr%U3%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "walignr")]
+)
 
-(define_insn "iwmmxt_tmrc"
-  [(set (match_operand:SI                      0 "register_operand" "=r")
-	(unspec_volatile:SI [(match_operand:SI 1 "immediate_operand" "i")]
-			    VUNSPEC_TMRC))]
-  "TARGET_REALLY_IWMMXT"
-  "tmrc%?\\t%0, %w1"
-  [(set_attr "predicable" "yes")])
+(define_insn "iwmmxt_walignr0"
+  [(set (match_operand:V8QI                           0 "register_operand" "=y")
+	(subreg:V8QI (ashiftrt:TI
+		       (subreg:TI (vec_concat:V16QI
+				    (match_operand:V8QI 1 "register_operand" "y")
+				    (match_operand:V8QI 2 "register_operand" "y")) 0)
+		       (mult:SI
+		         (zero_extract:SI (reg:SI WCGR0) (const_int 3) (const_int 0))
+		         (const_int 8))) 0))]
+  "TARGET_REALLY_IWMMXT"
+  "walignr0%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "walignr")]
+)
 
-(define_insn "iwmmxt_tmcr"
-  [(unspec_volatile:SI [(match_operand:SI 0 "immediate_operand" "i")
-			(match_operand:SI 1 "register_operand"  "r")]
-		       VUNSPEC_TMCR)]
-  "TARGET_REALLY_IWMMXT"
-  "tmcr%?\\t%w0, %1"
-  [(set_attr "predicable" "yes")])
+(define_insn "iwmmxt_walignr1"
+  [(set (match_operand:V8QI                           0 "register_operand" "=y")
+	(subreg:V8QI (ashiftrt:TI
+		       (subreg:TI (vec_concat:V16QI
+				    (match_operand:V8QI 1 "register_operand" "y")
+				    (match_operand:V8QI 2 "register_operand" "y")) 0)
+		       (mult:SI
+		         (zero_extract:SI (reg:SI WCGR1) (const_int 3) (const_int 0))
+		         (const_int 8))) 0))]
+  "TARGET_REALLY_IWMMXT"
+  "walignr1%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "walignr")]
+)
+
+(define_insn "iwmmxt_walignr2"
+  [(set (match_operand:V8QI                           0 "register_operand" "=y")
+	(subreg:V8QI (ashiftrt:TI
+		       (subreg:TI (vec_concat:V16QI
+				    (match_operand:V8QI 1 "register_operand" "y")
+				    (match_operand:V8QI 2 "register_operand" "y")) 0)
+		       (mult:SI
+		         (zero_extract:SI (reg:SI WCGR2) (const_int 3) (const_int 0))
+		         (const_int 8))) 0))]
+  "TARGET_REALLY_IWMMXT"
+  "walignr2%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "walignr")]
+)
+
+(define_insn "iwmmxt_walignr3"
+  [(set (match_operand:V8QI                           0 "register_operand" "=y")
+	(subreg:V8QI (ashiftrt:TI
+		       (subreg:TI (vec_concat:V16QI
+				    (match_operand:V8QI 1 "register_operand" "y")
+				    (match_operand:V8QI 2 "register_operand" "y")) 0)
+		       (mult:SI
+		         (zero_extract:SI (reg:SI WCGR3) (const_int 3) (const_int 0))
+		         (const_int 8))) 0))]
+  "TARGET_REALLY_IWMMXT"
+  "walignr3%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "walignr")]
+)
 
 (define_insn "iwmmxt_wsadb"
-  [(set (match_operand:V8QI               0 "register_operand" "=y")
-        (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "y")
-		      (match_operand:V8QI 2 "register_operand" "y")] UNSPEC_WSAD))]
+  [(set (match_operand:V2SI               0 "register_operand" "=y")
+        (unspec:V2SI [
+		      (match_operand:V2SI 1 "register_operand" "0")
+		      (match_operand:V8QI 2 "register_operand" "y")
+		      (match_operand:V8QI 3 "register_operand" "y")] UNSPEC_WSAD))]
   "TARGET_REALLY_IWMMXT"
-  "wsadb%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  "wsadb%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wsad")]
+)
 
 (define_insn "iwmmxt_wsadh"
-  [(set (match_operand:V4HI               0 "register_operand" "=y")
-        (unspec:V4HI [(match_operand:V4HI 1 "register_operand" "y")
-		      (match_operand:V4HI 2 "register_operand" "y")] UNSPEC_WSAD))]
+  [(set (match_operand:V2SI               0 "register_operand" "=y")
+        (unspec:V2SI [
+		      (match_operand:V2SI 1 "register_operand" "0")
+		      (match_operand:V4HI 2 "register_operand" "y")
+		      (match_operand:V4HI 3 "register_operand" "y")] UNSPEC_WSAD))]
   "TARGET_REALLY_IWMMXT"
-  "wsadh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  "wsadh%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wsad")]
+)
 
 (define_insn "iwmmxt_wsadbz"
-  [(set (match_operand:V8QI               0 "register_operand" "=y")
-        (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "y")
+  [(set (match_operand:V2SI               0 "register_operand" "=y")
+        (unspec:V2SI [(match_operand:V8QI 1 "register_operand" "y")
 		      (match_operand:V8QI 2 "register_operand" "y")] UNSPEC_WSADZ))]
   "TARGET_REALLY_IWMMXT"
   "wsadbz%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wsad")]
+)
 
 (define_insn "iwmmxt_wsadhz"
-  [(set (match_operand:V4HI               0 "register_operand" "=y")
-        (unspec:V4HI [(match_operand:V4HI 1 "register_operand" "y")
+  [(set (match_operand:V2SI               0 "register_operand" "=y")
+        (unspec:V2SI [(match_operand:V4HI 1 "register_operand" "y")
 		      (match_operand:V4HI 2 "register_operand" "y")] UNSPEC_WSADZ))]
   "TARGET_REALLY_IWMMXT"
   "wsadhz%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wsad")]
+)
 
+(include "iwmmxt2.md")
diff --git a/gcc/config/arm/iwmmxt2.md b/gcc/config/arm/iwmmxt2.md
new file mode 100644
index 0000000..78fcb7f
--- /dev/null
+++ b/gcc/config/arm/iwmmxt2.md
@@ -0,0 +1,918 @@
+;; Patterns for the Intel Wireless MMX technology architecture.
+;; Copyright (C) 2011 Free Software Foundation, Inc.
+;; Written by Marvell, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_c_enum "unspec" [
+  UNSPEC_WADDC		; Used by the intrinsic form of the iWMMXt WADDC instruction.
+  UNSPEC_WABS		; Used by the intrinsic form of the iWMMXt WABS instruction.
+  UNSPEC_WQMULWMR	; Used by the intrinsic form of the iWMMXt WQMULWMR instruction.
+  UNSPEC_WQMULMR	; Used by the intrinsic form of the iWMMXt WQMULMR instruction.
+  UNSPEC_WQMULWM	; Used by the intrinsic form of the iWMMXt WQMULWM instruction.
+  UNSPEC_WQMULM		; Used by the intrinsic form of the iWMMXt WQMULM instruction.
+  UNSPEC_WQMIAxyn	; Used by the intrinsic form of the iWMMXt WQMIAxyn instruction.
+  UNSPEC_WQMIAxy	; Used by the intrinsic form of the iWMMXt WQMIAxy instruction.
+  UNSPEC_TANDC		; Used by the intrinsic form of the iWMMXt TANDC instruction.
+  UNSPEC_TORC		; Used by the intrinsic form of the iWMMXt TORC instruction.
+  UNSPEC_TORVSC		; Used by the intrinsic form of the iWMMXt TORVSC instruction.
+  UNSPEC_TEXTRC		; Used by the intrinsic form of the iWMMXt TEXTRC instruction.
+])
+
+(define_insn "iwmmxt_wabs<mode>3"
+  [(set (match_operand:VMMX               0 "register_operand" "=y")
+        (unspec:VMMX [(match_operand:VMMX 1 "register_operand"  "y")] UNSPEC_WABS))]
+  "TARGET_REALLY_IWMMXT"
+  "wabs<MMX_char>%?\\t%0, %1"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wabs")]
+)
+
+(define_insn "iwmmxt_wabsdiffb"
+  [(set (match_operand:V8QI                          0 "register_operand" "=y")
+	(truncate:V8QI
+	  (abs:V8HI
+	    (minus:V8HI
+	      (zero_extend:V8HI (match_operand:V8QI  1 "register_operand"  "y"))
+	      (zero_extend:V8HI (match_operand:V8QI  2 "register_operand"  "y"))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wabsdiffb%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wabsdiff")]
+)
+
+(define_insn "iwmmxt_wabsdiffh"
+  [(set (match_operand:V4HI                          0 "register_operand" "=y")
+        (truncate:V4HI
+          (abs:V4SI
+            (minus:V4SI
+              (zero_extend:V4SI (match_operand:V4HI  1 "register_operand"  "y"))
+	      (zero_extend:V4SI (match_operand:V4HI  2 "register_operand"  "y"))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wabsdiffh%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wabsdiff")]
+)
+
+(define_insn "iwmmxt_wabsdiffw"
+  [(set (match_operand:V2SI                          0 "register_operand" "=y")
+        (truncate:V2SI
+	  (abs:V2DI
+	    (minus:V2DI
+	      (zero_extend:V2DI (match_operand:V2SI  1 "register_operand"  "y"))
+	      (zero_extend:V2DI (match_operand:V2SI  2 "register_operand"  "y"))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wabsdiffw%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wabsdiff")]
+)
+
+(define_insn "iwmmxt_waddsubhx"
+  [(set (match_operand:V4HI                                        0 "register_operand" "=y")
+	(vec_merge:V4HI
+	  (ss_minus:V4HI
+	    (match_operand:V4HI                                    1 "register_operand" "y")
+	    (vec_select:V4HI (match_operand:V4HI 2 "register_operand" "y")
+	                     (parallel [(const_int 1) (const_int 0) (const_int 3) (const_int 2)])))
+	  (ss_plus:V4HI
+	    (match_dup 1)
+	    (vec_select:V4HI (match_dup 2)
+	                     (parallel [(const_int 1) (const_int 0) (const_int 3) (const_int 2)])))
+	  (const_int 10)))]
+  "TARGET_REALLY_IWMMXT"
+  "waddsubhx%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "waddsubhx")]
+)
+
+(define_insn "iwmmxt_wsubaddhx"
+  [(set (match_operand:V4HI                                        0 "register_operand" "=y")
+	(vec_merge:V4HI
+	  (ss_plus:V4HI
+	    (match_operand:V4HI                                    1 "register_operand" "y")
+	    (vec_select:V4HI (match_operand:V4HI 2 "register_operand" "y")
+	                     (parallel [(const_int 1) (const_int 0) (const_int 3) (const_int 2)])))
+	  (ss_minus:V4HI
+	    (match_dup 1)
+	    (vec_select:V4HI (match_dup 2)
+	                     (parallel [(const_int 1) (const_int 0) (const_int 3) (const_int 2)])))
+	  (const_int 10)))]
+  "TARGET_REALLY_IWMMXT"
+  "wsubaddhx%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wsubaddhx")]
+)
+
+(define_insn "addc<mode>3"
+  [(set (match_operand:VMMX2      0 "register_operand" "=y")
+	(unspec:VMMX2
+          [(plus:VMMX2
+             (match_operand:VMMX2 1 "register_operand"  "y")
+	     (match_operand:VMMX2 2 "register_operand"  "y"))] UNSPEC_WADDC))]
+  "TARGET_REALLY_IWMMXT"
+  "wadd<MMX_char>c%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wadd")]
+)
+
+(define_insn "iwmmxt_avg4"
+[(set (match_operand:V8QI                                 0 "register_operand" "=y")
+      (truncate:V8QI
+        (vec_select:V8HI
+	  (vec_merge:V8HI
+	    (lshiftrt:V8HI
+	      (plus:V8HI
+	        (plus:V8HI
+		  (plus:V8HI
+	            (plus:V8HI
+		      (zero_extend:V8HI (match_operand:V8QI 1 "register_operand" "y"))
+		      (zero_extend:V8HI (match_operand:V8QI 2 "register_operand" "y")))
+		    (vec_select:V8HI (zero_extend:V8HI (match_dup 1))
+		                     (parallel [(const_int 7) (const_int 0) (const_int 1) (const_int 2)
+				                (const_int 3) (const_int 4) (const_int 5) (const_int 6)])))
+		  (vec_select:V8HI (zero_extend:V8HI (match_dup 2))
+		                   (parallel [(const_int 7) (const_int 0) (const_int 1) (const_int 2)
+				              (const_int 3) (const_int 4) (const_int 5) (const_int 6)])))
+	        (const_vector:V8HI [(const_int 1) (const_int 1) (const_int 1) (const_int 1)
+	                            (const_int 1) (const_int 1) (const_int 1) (const_int 1)]))
+	      (const_int 2))
+	    (const_vector:V8HI [(const_int 0) (const_int 0) (const_int 0) (const_int 0)
+	                        (const_int 0) (const_int 0) (const_int 0) (const_int 0)])
+	    (const_int 254))
+	  (parallel [(const_int 1) (const_int 2) (const_int 3) (const_int 4)
+	             (const_int 5) (const_int 6) (const_int 7) (const_int 0)]))))]
+  "TARGET_REALLY_IWMMXT"
+  "wavg4%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wavg4")]
+)
+
+(define_insn "iwmmxt_avg4r"
+  [(set (match_operand:V8QI                                   0 "register_operand" "=y")
+	(truncate:V8QI
+	  (vec_select:V8HI
+	    (vec_merge:V8HI
+	      (lshiftrt:V8HI
+	        (plus:V8HI
+		  (plus:V8HI
+		    (plus:V8HI
+		      (plus:V8HI
+		        (zero_extend:V8HI (match_operand:V8QI 1 "register_operand" "y"))
+		        (zero_extend:V8HI (match_operand:V8QI 2 "register_operand" "y")))
+		      (vec_select:V8HI (zero_extend:V8HI (match_dup 1))
+		                       (parallel [(const_int 7) (const_int 0) (const_int 1) (const_int 2)
+				                  (const_int 3) (const_int 4) (const_int 5) (const_int 6)])))
+		    (vec_select:V8HI (zero_extend:V8HI (match_dup 2))
+		                     (parallel [(const_int 7) (const_int 0) (const_int 1) (const_int 2)
+				                (const_int 3) (const_int 4) (const_int 5) (const_int 6)])))
+		  (const_vector:V8HI [(const_int 2) (const_int 2) (const_int 2) (const_int 2)
+		                      (const_int 2) (const_int 2) (const_int 2) (const_int 2)]))
+	        (const_int 2))
+	      (const_vector:V8HI [(const_int 0) (const_int 0) (const_int 0) (const_int 0)
+	                          (const_int 0) (const_int 0) (const_int 0) (const_int 0)])
+	      (const_int 254))
+	    (parallel [(const_int 1) (const_int 2) (const_int 3) (const_int 4)
+	               (const_int 5) (const_int 6) (const_int 7) (const_int 0)]))))]
+  "TARGET_REALLY_IWMMXT"
+  "wavg4r%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wavg4")]
+)
+
+(define_insn "iwmmxt_wmaddsx"
+  [(set (match_operand:V2SI                                        0 "register_operand" "=y")
+	(plus:V2SI
+	  (mult:V2SI
+	    (vec_select:V2SI (sign_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+	                     (parallel [(const_int 1) (const_int 3)]))
+	    (vec_select:V2SI (sign_extend:V4SI (match_operand:V4HI 2 "register_operand" "y"))
+	                     (parallel [(const_int 0) (const_int 2)])))
+	  (mult:V2SI
+	    (vec_select:V2SI (sign_extend:V4SI (match_dup 1))
+	                     (parallel [(const_int 0) (const_int 2)]))
+	    (vec_select:V2SI (sign_extend:V4SI (match_dup 2))
+	                     (parallel [(const_int 1) (const_int 3)])))))]
+ "TARGET_REALLY_IWMMXT"
+  "wmaddsx%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmadd")]
+)
+
+(define_insn "iwmmxt_wmaddux"
+  [(set (match_operand:V2SI                                        0 "register_operand" "=y")
+	(plus:V2SI
+	  (mult:V2SI
+	    (vec_select:V2SI (zero_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+	                     (parallel [(const_int 1) (const_int 3)]))
+	    (vec_select:V2SI (zero_extend:V4SI (match_operand:V4HI 2 "register_operand" "y"))
+	                     (parallel [(const_int 0) (const_int 2)])))
+	  (mult:V2SI
+	    (vec_select:V2SI (zero_extend:V4SI (match_dup 1))
+	                     (parallel [(const_int 0) (const_int 2)]))
+	    (vec_select:V2SI (zero_extend:V4SI (match_dup 2))
+	                     (parallel [(const_int 1) (const_int 3)])))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmaddux%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmadd")]
+)
+
+(define_insn "iwmmxt_wmaddsn"
+ [(set (match_operand:V2SI                                     0 "register_operand" "=y")
+    (minus:V2SI
+      (mult:V2SI
+        (vec_select:V2SI (sign_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+	                 (parallel [(const_int 0) (const_int 2)]))
+        (vec_select:V2SI (sign_extend:V4SI (match_operand:V4HI 2 "register_operand" "y"))
+	                 (parallel [(const_int 0) (const_int 2)])))
+      (mult:V2SI
+        (vec_select:V2SI (sign_extend:V4SI (match_dup 1))
+	                 (parallel [(const_int 1) (const_int 3)]))
+        (vec_select:V2SI (sign_extend:V4SI (match_dup 2))
+	                 (parallel [(const_int 1) (const_int 3)])))))]
+ "TARGET_REALLY_IWMMXT"
+ "wmaddsn%?\\t%0, %1, %2"
+ [(set_attr "predicable" "yes")
+  (set_attr "wtype" "wmadd")]
+)
+
+(define_insn "iwmmxt_wmaddun"
+  [(set (match_operand:V2SI                                        0 "register_operand" "=y")
+	(minus:V2SI
+	  (mult:V2SI
+	    (vec_select:V2SI (zero_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+	                     (parallel [(const_int 0) (const_int 2)]))
+	    (vec_select:V2SI (zero_extend:V4SI (match_operand:V4HI 2 "register_operand" "y"))
+	                     (parallel [(const_int 0) (const_int 2)])))
+	  (mult:V2SI
+	    (vec_select:V2SI (zero_extend:V4SI (match_dup 1))
+	                     (parallel [(const_int 1) (const_int 3)]))
+	    (vec_select:V2SI (zero_extend:V4SI (match_dup 2))
+	                     (parallel [(const_int 1) (const_int 3)])))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmaddun%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmadd")]
+)
+
+(define_insn "iwmmxt_wmulwsm"
+  [(set (match_operand:V2SI                         0 "register_operand" "=y")
+	(truncate:V2SI
+	  (ashiftrt:V2DI
+	    (mult:V2DI
+	      (sign_extend:V2DI (match_operand:V2SI 1 "register_operand" "y"))
+	      (sign_extend:V2DI (match_operand:V2SI 2 "register_operand" "y")))
+	    (const_int 32))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmulwsm%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmulw")]
+)
+
+(define_insn "iwmmxt_wmulwum"
+  [(set (match_operand:V2SI                         0 "register_operand" "=y")
+	(truncate:V2SI
+          (lshiftrt:V2DI
+	    (mult:V2DI
+	      (zero_extend:V2DI (match_operand:V2SI 1 "register_operand" "y"))
+	      (zero_extend:V2DI (match_operand:V2SI 2 "register_operand" "y")))
+	    (const_int 32))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmulwum%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmulw")]
+)
+
+(define_insn "iwmmxt_wmulsmr"
+  [(set (match_operand:V4HI                           0 "register_operand" "=y")
+	(truncate:V4HI
+	  (ashiftrt:V4SI
+	    (plus:V4SI
+	      (mult:V4SI
+	        (sign_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+		(sign_extend:V4SI (match_operand:V4HI 2 "register_operand" "y")))
+	      (const_vector:V4SI [(const_int 32768)
+	                          (const_int 32768)
+				  (const_int 32768) (const_int 32768)]))
+	    (const_int 16))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmulsmr%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmul")]
+)
+
+(define_insn "iwmmxt_wmulumr"
+  [(set (match_operand:V4HI                           0 "register_operand" "=y")
+	(truncate:V4HI
+	  (lshiftrt:V4SI
+	    (plus:V4SI
+	      (mult:V4SI
+	        (zero_extend:V4SI (match_operand:V4HI 1 "register_operand" "y"))
+		(zero_extend:V4SI (match_operand:V4HI 2 "register_operand" "y")))
+	      (const_vector:V4SI [(const_int 32768)
+				  (const_int 32768)
+				  (const_int 32768)
+				  (const_int 32768)]))
+	    (const_int 16))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmulumr%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmul")]
+)
+
+(define_insn "iwmmxt_wmulwsmr"
+  [(set (match_operand:V2SI                           0 "register_operand" "=y")
+	(truncate:V2SI
+	  (ashiftrt:V2DI
+	    (plus:V2DI
+	      (mult:V2DI
+	        (sign_extend:V2DI (match_operand:V2SI 1 "register_operand" "y"))
+		(sign_extend:V2DI (match_operand:V2SI 2 "register_operand" "y")))
+	      (const_vector:V2DI [(const_int 2147483648)
+				  (const_int 2147483648)]))
+	    (const_int 32))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmulwsmr%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmulw")]
+)
+
+(define_insn "iwmmxt_wmulwumr"
+  [(set (match_operand:V2SI                           0 "register_operand" "=y")
+	(truncate:V2SI
+	  (lshiftrt:V2DI
+	    (plus:V2DI
+	      (mult:V2DI
+	        (zero_extend:V2DI (match_operand:V2SI 1 "register_operand" "y"))
+		(zero_extend:V2DI (match_operand:V2SI 2 "register_operand" "y")))
+	      (const_vector:V2DI [(const_int 2147483648)
+			          (const_int 2147483648)]))
+	    (const_int 32))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmulwumr%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmulw")]
+)
+
+(define_insn "iwmmxt_wmulwl"
+  [(set (match_operand:V2SI   0 "register_operand" "=y")
+        (mult:V2SI
+          (match_operand:V2SI 1 "register_operand" "y")
+	  (match_operand:V2SI 2 "register_operand" "y")))]
+  "TARGET_REALLY_IWMMXT"
+  "wmulwl%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmulw")]
+)
+
+(define_insn "iwmmxt_wqmulm"
+  [(set (match_operand:V4HI            0 "register_operand" "=y")
+        (unspec:V4HI [(match_operand:V4HI 1 "register_operand" "y")
+		      (match_operand:V4HI 2 "register_operand" "y")] UNSPEC_WQMULM))]
+  "TARGET_REALLY_IWMMXT"
+  "wqmulm%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wqmulm")]
+)
+
+(define_insn "iwmmxt_wqmulwm"
+  [(set (match_operand:V2SI               0 "register_operand" "=y")
+	(unspec:V2SI [(match_operand:V2SI 1 "register_operand" "y")
+		      (match_operand:V2SI 2 "register_operand" "y")] UNSPEC_WQMULWM))]
+  "TARGET_REALLY_IWMMXT"
+  "wqmulwm%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wqmulwm")]
+)
+
+(define_insn "iwmmxt_wqmulmr"
+  [(set (match_operand:V4HI               0 "register_operand" "=y")
+	(unspec:V4HI [(match_operand:V4HI 1 "register_operand" "y")
+		      (match_operand:V4HI 2 "register_operand" "y")] UNSPEC_WQMULMR))]
+  "TARGET_REALLY_IWMMXT"
+  "wqmulmr%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wqmulm")]
+)
+
+(define_insn "iwmmxt_wqmulwmr"
+  [(set (match_operand:V2SI            0 "register_operand" "=y")
+        (unspec:V2SI [(match_operand:V2SI 1 "register_operand" "y")
+		      (match_operand:V2SI 2 "register_operand" "y")] UNSPEC_WQMULWMR))]
+  "TARGET_REALLY_IWMMXT"
+  "wqmulwmr%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wqmulwm")]
+)
+
+(define_insn "iwmmxt_waddbhusm"
+  [(set (match_operand:V8QI                          0 "register_operand" "=y")
+	(vec_concat:V8QI
+	  (const_vector:V4QI [(const_int 0) (const_int 0) (const_int 0) (const_int 0)])
+	  (us_truncate:V4QI
+	    (ss_plus:V4HI
+	      (match_operand:V4HI                    1 "register_operand" "y")
+	      (zero_extend:V4HI
+	        (vec_select:V4QI (match_operand:V8QI 2 "register_operand" "y")
+	                         (parallel [(const_int 4) (const_int 5) (const_int 6) (const_int 7)])))))))]
+  "TARGET_REALLY_IWMMXT"
+  "waddbhusm%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "waddbhus")]
+)
+
+(define_insn "iwmmxt_waddbhusl"
+  [(set (match_operand:V8QI                          0 "register_operand" "=y")
+	(vec_concat:V8QI
+	  (us_truncate:V4QI
+	    (ss_plus:V4HI
+	      (match_operand:V4HI                    1 "register_operand" "y")
+	      (zero_extend:V4HI
+		(vec_select:V4QI (match_operand:V8QI 2 "register_operand" "y")
+		                 (parallel [(const_int 0) (const_int 1) (const_int 2) (const_int 3)])))))
+	  (const_vector:V4QI [(const_int 0) (const_int 0) (const_int 0) (const_int 0)])))]
+  "TARGET_REALLY_IWMMXT"
+  "waddbhusl%?\\t%0, %1, %2"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "waddbhus")]
+)
+
+(define_insn "iwmmxt_wqmiabb"
+  [(set (match_operand:V2SI	                             0 "register_operand" "=y")
+	(unspec:V2SI [(match_operand:V2SI                    1 "register_operand" "0")
+		      (zero_extract:V4HI (match_operand:V4HI 2 "register_operand" "y") (const_int 16) (const_int 0))
+		      (zero_extract:V4HI (match_dup 2) (const_int 16) (const_int 32))
+		      (zero_extract:V4HI (match_operand:V4HI 3 "register_operand" "y") (const_int 16) (const_int 0))
+		      (zero_extract:V4HI (match_dup 3) (const_int 16) (const_int 32))] UNSPEC_WQMIAxy))]
+  "TARGET_REALLY_IWMMXT"
+  "wqmiabb%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wqmiaxy")]
+)
+
+(define_insn "iwmmxt_wqmiabt"
+  [(set (match_operand:V2SI	                             0 "register_operand" "=y")
+	(unspec:V2SI [(match_operand:V2SI                    1 "register_operand" "0")
+	              (zero_extract:V4HI (match_operand:V4HI 2 "register_operand" "y") (const_int 16) (const_int 0))
+		      (zero_extract:V4HI (match_dup 2) (const_int 16) (const_int 32))
+		      (zero_extract:V4HI (match_operand:V4HI 3 "register_operand" "y") (const_int 16) (const_int 16))
+		      (zero_extract:V4HI (match_dup 3) (const_int 16) (const_int 48))] UNSPEC_WQMIAxy))]
+  "TARGET_REALLY_IWMMXT"
+  "wqmiabt%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wqmiaxy")]
+)
+
+(define_insn "iwmmxt_wqmiatb"
+  [(set (match_operand:V2SI                                  0 "register_operand" "=y")
+        (unspec:V2SI [(match_operand:V2SI                    1 "register_operand" "0")
+	              (zero_extract:V4HI (match_operand:V4HI 2 "register_operand" "y") (const_int 16) (const_int 16))
+	              (zero_extract:V4HI (match_dup 2) (const_int 16) (const_int 48))
+	              (zero_extract:V4HI (match_operand:V4HI 3 "register_operand" "y") (const_int 16) (const_int 0))
+	              (zero_extract:V4HI (match_dup 3) (const_int 16) (const_int 32))] UNSPEC_WQMIAxy))]
+  "TARGET_REALLY_IWMMXT"
+  "wqmiatb%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wqmiaxy")]
+)
+
+(define_insn "iwmmxt_wqmiatt"
+  [(set (match_operand:V2SI                                  0 "register_operand" "=y")
+        (unspec:V2SI [(match_operand:V2SI                    1 "register_operand" "0")
+	              (zero_extract:V4HI (match_operand:V4HI 2 "register_operand" "y") (const_int 16) (const_int 16))
+	              (zero_extract:V4HI (match_dup 2) (const_int 16) (const_int 48))
+	              (zero_extract:V4HI (match_operand:V4HI 3 "register_operand" "y") (const_int 16) (const_int 16))
+	              (zero_extract:V4HI (match_dup 3) (const_int 16) (const_int 48))] UNSPEC_WQMIAxy))]
+  "TARGET_REALLY_IWMMXT"
+  "wqmiatt%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wqmiaxy")]
+)
+
+(define_insn "iwmmxt_wqmiabbn"
+  [(set (match_operand:V2SI                                  0 "register_operand" "=y")
+        (unspec:V2SI [(match_operand:V2SI                    1 "register_operand" "0")
+                      (zero_extract:V4HI (match_operand:V4HI 2 "register_operand" "y") (const_int 16) (const_int 0))
+	              (zero_extract:V4HI (match_dup 2) (const_int 16) (const_int 32))
+	              (zero_extract:V4HI (match_operand:V4HI 3 "register_operand" "y") (const_int 16) (const_int 0))
+	              (zero_extract:V4HI (match_dup 3) (const_int 16) (const_int 32))] UNSPEC_WQMIAxyn))]
+  "TARGET_REALLY_IWMMXT"
+  "wqmiabbn%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wqmiaxy")]
+)
+
+(define_insn "iwmmxt_wqmiabtn"
+  [(set (match_operand:V2SI                                  0 "register_operand" "=y")
+        (unspec:V2SI [(match_operand:V2SI                    1 "register_operand" "0")
+                      (zero_extract:V4HI (match_operand:V4HI 2 "register_operand" "y") (const_int 16) (const_int 0))
+	              (zero_extract:V4HI (match_dup 2) (const_int 16) (const_int 32))
+	              (zero_extract:V4HI (match_operand:V4HI 3 "register_operand" "y") (const_int 16) (const_int 16))
+	              (zero_extract:V4HI (match_dup 3) (const_int 16) (const_int 48))] UNSPEC_WQMIAxyn))]
+  "TARGET_REALLY_IWMMXT"
+  "wqmiabtn%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wqmiaxy")]
+)
+
+(define_insn "iwmmxt_wqmiatbn"
+  [(set (match_operand:V2SI                                  0 "register_operand" "=y")
+        (unspec:V2SI [(match_operand:V2SI                    1 "register_operand" "0")
+                      (zero_extract:V4HI (match_operand:V4HI 2 "register_operand" "y") (const_int 16) (const_int 16))
+	              (zero_extract:V4HI (match_dup 2) (const_int 16) (const_int 48))
+	              (zero_extract:V4HI (match_operand:V4HI 3 "register_operand" "y") (const_int 16) (const_int 0))
+	              (zero_extract:V4HI (match_dup 3) (const_int 16) (const_int 32))] UNSPEC_WQMIAxyn))]
+  "TARGET_REALLY_IWMMXT"
+  "wqmiatbn%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wqmiaxy")]
+)
+
+(define_insn "iwmmxt_wqmiattn"
+ [(set (match_operand:V2SI                                  0 "register_operand" "=y")
+       (unspec:V2SI [(match_operand:V2SI                    1 "register_operand" "0")
+                     (zero_extract:V4HI (match_operand:V4HI 2 "register_operand" "y") (const_int 16) (const_int 16))
+	             (zero_extract:V4HI (match_dup 2) (const_int 16) (const_int 48))
+	             (zero_extract:V4HI (match_operand:V4HI 3 "register_operand" "y") (const_int 16) (const_int 16))
+	             (zero_extract:V4HI (match_dup 3) (const_int 16) (const_int 48))] UNSPEC_WQMIAxyn))]
+  "TARGET_REALLY_IWMMXT"
+  "wqmiattn%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wqmiaxy")]
+)
+
+(define_insn "iwmmxt_wmiabb"
+  [(set	(match_operand:DI	                          0 "register_operand" "=y")
+	(plus:DI (match_operand:DI	                  1 "register_operand" "0")
+		 (plus:DI
+		   (mult:DI
+		     (sign_extend:DI
+		       (vec_select:HI (match_operand:V4HI 2 "register_operand" "y")
+				      (parallel [(const_int 0)])))
+		     (sign_extend:DI
+		       (vec_select:HI (match_operand:V4HI 3 "register_operand" "y")
+				      (parallel [(const_int 0)]))))
+		   (mult:DI
+		     (sign_extend:DI
+		       (vec_select:HI (match_dup 2)
+			              (parallel [(const_int 2)])))
+		     (sign_extend:DI
+		       (vec_select:HI (match_dup 3)
+				      (parallel [(const_int 2)])))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiabb%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiaxy")]
+)
+
+(define_insn "iwmmxt_wmiabt"
+  [(set	(match_operand:DI	                          0 "register_operand" "=y")
+	(plus:DI (match_operand:DI	                  1 "register_operand" "0")
+		 (plus:DI
+		   (mult:DI
+		     (sign_extend:DI
+		       (vec_select:HI (match_operand:V4HI 2 "register_operand" "y")
+				      (parallel [(const_int 0)])))
+		     (sign_extend:DI
+		       (vec_select:HI (match_operand:V4HI 3 "register_operand" "y")
+				      (parallel [(const_int 1)]))))
+		   (mult:DI
+		     (sign_extend:DI
+		       (vec_select:HI (match_dup 2)
+				      (parallel [(const_int 2)])))
+		     (sign_extend:DI
+		       (vec_select:HI (match_dup 3)
+				      (parallel [(const_int 3)])))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiabt%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiaxy")]
+)
+
+(define_insn "iwmmxt_wmiatb"
+  [(set	(match_operand:DI	                          0 "register_operand" "=y")
+	(plus:DI (match_operand:DI	                  1 "register_operand" "0")
+		 (plus:DI
+		   (mult:DI
+		     (sign_extend:DI
+		       (vec_select:HI (match_operand:V4HI 2 "register_operand" "y")
+				      (parallel [(const_int 1)])))
+		     (sign_extend:DI
+		       (vec_select:HI (match_operand:V4HI 3 "register_operand" "y")
+				      (parallel [(const_int 0)]))))
+		   (mult:DI
+		     (sign_extend:DI
+		       (vec_select:HI (match_dup 2)
+				      (parallel [(const_int 3)])))
+		     (sign_extend:DI
+		       (vec_select:HI (match_dup 3)
+				      (parallel [(const_int 2)])))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiatb%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiaxy")]
+)
+
+(define_insn "iwmmxt_wmiatt"
+  [(set	(match_operand:DI	                   0 "register_operand" "=y")
+        (plus:DI (match_operand:DI	           1 "register_operand" "0")
+          (plus:DI
+            (mult:DI
+              (sign_extend:DI
+                (vec_select:HI (match_operand:V4HI 2 "register_operand" "y")
+	                       (parallel [(const_int 1)])))
+	      (sign_extend:DI
+	        (vec_select:HI (match_operand:V4HI 3 "register_operand" "y")
+	                       (parallel [(const_int 1)]))))
+            (mult:DI
+	      (sign_extend:DI
+                (vec_select:HI (match_dup 2)
+	                       (parallel [(const_int 3)])))
+              (sign_extend:DI
+                (vec_select:HI (match_dup 3)
+	                       (parallel [(const_int 3)])))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiatt%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiaxy")]
+)
+
+(define_insn "iwmmxt_wmiabbn"
+  [(set	(match_operand:DI	                           0 "register_operand" "=y")
+	(minus:DI (match_operand:DI	                   1 "register_operand" "0")
+		  (plus:DI
+		    (mult:DI
+		      (sign_extend:DI
+			(vec_select:HI (match_operand:V4HI 2 "register_operand" "y")
+				       (parallel [(const_int 0)])))
+		      (sign_extend:DI
+		        (vec_select:HI (match_operand:V4HI 3 "register_operand" "y")
+				       (parallel [(const_int 0)]))))
+		    (mult:DI
+		      (sign_extend:DI
+			(vec_select:HI (match_dup 2)
+				       (parallel [(const_int 2)])))
+		      (sign_extend:DI
+		        (vec_select:HI (match_dup 3)
+				       (parallel [(const_int 2)])))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiabbn%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiaxy")]
+)
+
+(define_insn "iwmmxt_wmiabtn"
+  [(set	(match_operand:DI	                           0 "register_operand" "=y")
+	(minus:DI (match_operand:DI	                   1 "register_operand" "0")
+		  (plus:DI
+		    (mult:DI
+		      (sign_extend:DI
+			(vec_select:HI (match_operand:V4HI 2 "register_operand" "y")
+				       (parallel [(const_int 0)])))
+		      (sign_extend:DI
+		        (vec_select:HI (match_operand:V4HI 3 "register_operand" "y")
+				       (parallel [(const_int 1)]))))
+		    (mult:DI
+		      (sign_extend:DI
+		        (vec_select:HI (match_dup 2)
+				       (parallel [(const_int 2)])))
+		      (sign_extend:DI
+			(vec_select:HI (match_dup 3)
+				       (parallel [(const_int 3)])))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiabtn%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiaxy")]
+)
+
+(define_insn "iwmmxt_wmiatbn"
+  [(set (match_operand:DI	                           0 "register_operand" "=y")
+	(minus:DI (match_operand:DI	                   1 "register_operand" "0")
+		  (plus:DI
+		    (mult:DI
+		      (sign_extend:DI
+			(vec_select:HI (match_operand:V4HI 2 "register_operand" "y")
+				       (parallel [(const_int 1)])))
+		      (sign_extend:DI
+		        (vec_select:HI (match_operand:V4HI 3 "register_operand" "y")
+				       (parallel [(const_int 0)]))))
+		    (mult:DI
+		      (sign_extend:DI
+		        (vec_select:HI (match_dup 2)
+				       (parallel [(const_int 3)])))
+		      (sign_extend:DI
+			(vec_select:HI (match_dup 3)
+				       (parallel [(const_int 2)])))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiatbn%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiaxy")]
+)
+
+(define_insn "iwmmxt_wmiattn"
+  [(set (match_operand:DI	                           0 "register_operand" "=y")
+	(minus:DI (match_operand:DI	                   1 "register_operand" "0")
+		  (plus:DI
+		    (mult:DI
+		      (sign_extend:DI
+			(vec_select:HI (match_operand:V4HI 2 "register_operand" "y")
+				       (parallel [(const_int 1)])))
+		      (sign_extend:DI
+			(vec_select:HI (match_operand:V4HI 3 "register_operand" "y")
+				       (parallel [(const_int 1)]))))
+		    (mult:DI
+		      (sign_extend:DI
+			(vec_select:HI (match_dup 2)
+				       (parallel [(const_int 3)])))
+		      (sign_extend:DI
+			(vec_select:HI (match_dup 3)
+				       (parallel [(const_int 3)])))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiattn%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiaxy")]
+)
+
+(define_insn "iwmmxt_wmiawbb"
+  [(set (match_operand:DI	0 "register_operand" "=y")
+	(plus:DI
+	  (match_operand:DI      1 "register_operand" "0")
+	  (mult:DI
+	    (sign_extend:DI (vec_select:SI (match_operand:V2SI 2 "register_operand" "y") (parallel [(const_int 0)])))
+	    (sign_extend:DI (vec_select:SI (match_operand:V2SI 3 "register_operand" "y") (parallel [(const_int 0)]))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiawbb%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiawxy")]
+)
+
+(define_insn "iwmmxt_wmiawbt"
+  [(set (match_operand:DI	                               0 "register_operand" "=y")
+	(plus:DI
+	  (match_operand:DI                                    1 "register_operand" "0")
+	  (mult:DI
+	    (sign_extend:DI (vec_select:SI (match_operand:V2SI 2 "register_operand" "y") (parallel [(const_int 0)])))
+	    (sign_extend:DI (vec_select:SI (match_operand:V2SI 3 "register_operand" "y") (parallel [(const_int 1)]))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiawbt%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiawxy")]
+)
+
+(define_insn "iwmmxt_wmiawtb"
+  [(set (match_operand:DI	                               0 "register_operand" "=y")
+	(plus:DI
+	  (match_operand:DI                                    1 "register_operand" "0")
+	  (mult:DI
+	    (sign_extend:DI (vec_select:SI (match_operand:V2SI 2 "register_operand" "y") (parallel [(const_int 1)])))
+	    (sign_extend:DI (vec_select:SI (match_operand:V2SI 3 "register_operand" "y") (parallel [(const_int 0)]))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiawtb%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiawxy")]
+)
+
+(define_insn "iwmmxt_wmiawtt"
+[(set (match_operand:DI	                                     0 "register_operand" "=y")
+      (plus:DI
+	(match_operand:DI                                    1 "register_operand" "0")
+	(mult:DI
+	  (sign_extend:DI (vec_select:SI (match_operand:V2SI 2 "register_operand" "y") (parallel [(const_int 1)])))
+	  (sign_extend:DI (vec_select:SI (match_operand:V2SI 3 "register_operand" "y") (parallel [(const_int 1)]))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiawtt%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiawxy")]
+)
+
+(define_insn "iwmmxt_wmiawbbn"
+  [(set (match_operand:DI	                               0 "register_operand" "=y")
+	(minus:DI
+	  (match_operand:DI                                    1 "register_operand" "0")
+	  (mult:DI
+	    (sign_extend:DI (vec_select:SI (match_operand:V2SI 2 "register_operand" "y") (parallel [(const_int 0)])))
+	    (sign_extend:DI (vec_select:SI (match_operand:V2SI 3 "register_operand" "y") (parallel [(const_int 0)]))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiawbbn%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiawxy")]
+)
+
+(define_insn "iwmmxt_wmiawbtn"
+  [(set (match_operand:DI	                               0 "register_operand" "=y")
+	(minus:DI
+	  (match_operand:DI                                    1 "register_operand" "0")
+	  (mult:DI
+	    (sign_extend:DI (vec_select:SI (match_operand:V2SI 2 "register_operand" "y") (parallel [(const_int 0)])))
+	    (sign_extend:DI (vec_select:SI (match_operand:V2SI 3 "register_operand" "y") (parallel [(const_int 1)]))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiawbtn%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiawxy")]
+)
+
+(define_insn "iwmmxt_wmiawtbn"
+  [(set (match_operand:DI	                               0 "register_operand" "=y")
+	(minus:DI
+	  (match_operand:DI                                    1 "register_operand" "0")
+	  (mult:DI
+	    (sign_extend:DI (vec_select:SI (match_operand:V2SI 2 "register_operand" "y") (parallel [(const_int 1)])))
+	    (sign_extend:DI (vec_select:SI (match_operand:V2SI 3 "register_operand" "y") (parallel [(const_int 0)]))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiawtbn%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiawxy")]
+)
+
+(define_insn "iwmmxt_wmiawttn"
+  [(set (match_operand:DI	                               0 "register_operand" "=y")
+	(minus:DI
+	  (match_operand:DI                                    1 "register_operand" "0")
+	  (mult:DI
+	    (sign_extend:DI (vec_select:SI (match_operand:V2SI 2 "register_operand" "y") (parallel [(const_int 1)])))
+	    (sign_extend:DI (vec_select:SI (match_operand:V2SI 3 "register_operand" "y") (parallel [(const_int 1)]))))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmiawttn%?\\t%0, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmiawxy")]
+)
+
+(define_insn "iwmmxt_wmerge"
+  [(set (match_operand:DI         0 "register_operand" "=y")
+	(ior:DI
+	  (ashift:DI
+	    (match_operand:DI     2 "register_operand" "y")
+	    (minus:SI
+	      (const_int 64)
+	      (mult:SI
+	        (match_operand:SI 3 "immediate_operand" "i")
+		(const_int 8))))
+	  (lshiftrt:DI
+	    (ashift:DI
+	      (match_operand:DI   1 "register_operand" "y")
+	      (mult:SI
+	        (match_dup 3)
+		(const_int 8)))
+	    (mult:SI
+	      (match_dup 3)
+	      (const_int 8)))))]
+  "TARGET_REALLY_IWMMXT"
+  "wmerge%?\\t%0, %1, %2, %3"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "wmerge")]
+)
+
+(define_insn "iwmmxt_tandc<mode>3"
+  [(set (reg:CC CC_REGNUM)
+	(subreg:CC (unspec:VMMX [(const_int 0)] UNSPEC_TANDC) 0))
+   (unspec:CC [(reg:SI 15)] UNSPEC_TANDC)]
+  "TARGET_REALLY_IWMMXT"
+  "tandc<MMX_char>%?\\t r15"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "tandc")]
+)
+
+(define_insn "iwmmxt_torc<mode>3"
+  [(set (reg:CC CC_REGNUM)
+	(subreg:CC (unspec:VMMX [(const_int 0)] UNSPEC_TORC) 0))
+   (unspec:CC [(reg:SI 15)] UNSPEC_TORC)]
+  "TARGET_REALLY_IWMMXT"
+  "torc<MMX_char>%?\\t r15"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "torc")]
+)
+
+(define_insn "iwmmxt_torvsc<mode>3"
+  [(set (reg:CC CC_REGNUM)
+	(subreg:CC (unspec:VMMX [(const_int 0)] UNSPEC_TORVSC) 0))
+   (unspec:CC [(reg:SI 15)] UNSPEC_TORVSC)]
+  "TARGET_REALLY_IWMMXT"
+  "torvsc<MMX_char>%?\\t r15"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "torvsc")]
+)
+
+(define_insn "iwmmxt_textrc<mode>3"
+  [(set (reg:CC CC_REGNUM)
+	(subreg:CC (unspec:VMMX [(const_int 0)
+		                 (match_operand:SI 0 "immediate_operand" "i")] UNSPEC_TEXTRC) 0))
+   (unspec:CC [(reg:SI 15)] UNSPEC_TEXTRC)]
+  "TARGET_REALLY_IWMMXT"
+  "textrc<MMX_char>%?\\t r15, %0"
+  [(set_attr "predicable" "yes")
+   (set_attr "wtype" "textrc")]
+)
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index fa2027c..8334b2b 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -493,6 +493,11 @@
   (and (match_code "const_int")
        (match_test "((unsigned HOST_WIDE_INT) INTVAL (op)) < 64")))
 
+;; iWMMXt predicates
+
+(define_predicate "imm_or_reg_operand"
+  (ior (match_operand 0 "immediate_operand")
+       (match_operand 0 "register_operand")))
 
 ;; Neon predicates
 
diff --git a/gcc/config/arm/t-arm b/gcc/config/arm/t-arm
index 1128d19..83c18f7 100644
--- a/gcc/config/arm/t-arm
+++ b/gcc/config/arm/t-arm
@@ -49,6 +49,7 @@ MD_INCLUDES=	$(srcdir)/config/arm/arm1020e.md \
 		$(srcdir)/config/arm/fpa.md \
 		$(srcdir)/config/arm/iterators.md \
 		$(srcdir)/config/arm/iwmmxt.md \
+		$(srcdir)/config/arm/iwmmxt2.md \
 		$(srcdir)/config/arm/ldmstm.md \
 		$(srcdir)/config/arm/neon.md \
 		$(srcdir)/config/arm/predicates.md \
-- 
1.7.3.4


* [PATCH ARM iWMMXt 2/5] intrinsic head file change
  2012-05-29  4:13 [PATCH ARM iWMMXt 0/5] Improve iWMMXt support Matt Turner
                   ` (3 preceding siblings ...)
  2012-05-29  4:15 ` [PATCH ARM iWMMXt 4/5] WMMX machine description Matt Turner
@ 2012-05-29  4:15 ` Matt Turner
  2012-06-06 12:22   ` Ramana Radhakrishnan
  2012-06-06 11:59 ` [PATCH ARM iWMMXt 0/5] Improve iWMMXt support Ramana Radhakrishnan
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Matt Turner @ 2012-05-29  4:15 UTC (permalink / raw)
  To: gcc-patches
  Cc: Ramana Radhakrishnan, Richard Earnshaw, Nick Clifton, Paul Brook,
	Xinyu Qi

From: Xinyu Qi <xyqi@marvell.com>

	gcc/
	* config/arm/mmintrin.h: Use __IWMMXT__ to enable iWMMXt intrinsics.
	Use __IWMMXT2__ to enable iWMMXt2 intrinsics.
	Use C name-mangling for intrinsics.
	(__v8qi): Redefine.
	(_mm_cvtsi32_si64, _mm_andnot_si64, _mm_sad_pu8): Revise.
	(_mm_sad_pu16, _mm_align_si64, _mm_setwcx, _mm_getwcx): Likewise.
	(_m_from_int): Likewise.
	(_mm_sada_pu8, _mm_sada_pu16): New intrinsic.
	(_mm_alignr0_si64, _mm_alignr1_si64, _mm_alignr2_si64): Likewise.
	(_mm_alignr3_si64, _mm_tandcb, _mm_tandch, _mm_tandcw): Likewise.
	(_mm_textrcb, _mm_textrch, _mm_textrcw, _mm_torcb): Likewise.
	(_mm_torch, _mm_torcw, _mm_tbcst_pi8, _mm_tbcst_pi16): Likewise.
	(_mm_tbcst_pi32): Likewise.
	(_mm_abs_pi8, _mm_abs_pi16, _mm_abs_pi32): New iWMMXt2 intrinsic.
	(_mm_addsubhx_pi16, _mm_absdiff_pu8, _mm_absdiff_pu16): Likewise.
	(_mm_absdiff_pu32, _mm_addc_pu16, _mm_addc_pu32): Likewise.
	(_mm_avg4_pu8, _mm_avg4r_pu8, _mm_maddx_pi16, _mm_maddx_pu16): Likewise.
	(_mm_msub_pi16, _mm_msub_pu16, _mm_mulhi_pi32): Likewise.
	(_mm_mulhi_pu32, _mm_mulhir_pi16, _mm_mulhir_pi32): Likewise.
	(_mm_mulhir_pu16, _mm_mulhir_pu32, _mm_mullo_pi32): Likewise.
	(_mm_qmulm_pi16, _mm_qmulm_pi32, _mm_qmulmr_pi16): Likewise.
	(_mm_qmulmr_pi32, _mm_subaddhx_pi16, _mm_addbhusl_pu8): Likewise.
	(_mm_addbhusm_pu8, _mm_qmiabb_pi32, _mm_qmiabbn_pi32): Likewise.
	(_mm_qmiabt_pi32, _mm_qmiabtn_pi32, _mm_qmiatb_pi32): Likewise.
	(_mm_qmiatbn_pi32, _mm_qmiatt_pi32, _mm_qmiattn_pi32): Likewise.
	(_mm_wmiabb_si64, _mm_wmiabbn_si64, _mm_wmiabt_si64): Likewise.
	(_mm_wmiabtn_si64, _mm_wmiatb_si64, _mm_wmiatbn_si64): Likewise.
	(_mm_wmiatt_si64, _mm_wmiattn_si64, _mm_wmiawbb_si64): Likewise.
	(_mm_wmiawbbn_si64, _mm_wmiawbt_si64, _mm_wmiawbtn_si64): Likewise.
	(_mm_wmiawtb_si64, _mm_wmiawtbn_si64, _mm_wmiawtt_si64): Likewise.
	(_mm_wmiawttn_si64, _mm_merge_si64): Likewise.
	(_mm_torvscb, _mm_torvsch, _mm_torvscw): Likewise.
	(_m_to_int): New define.
---
 gcc/config/arm/mmintrin.h |  649 ++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 614 insertions(+), 35 deletions(-)
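
Note on the _mm_cvtsi32_si64 change (an illustrative sketch, not part of
the patch): the old body returned the int directly, so a negative value
was sign-extended into the upper 32 bits of the 64-bit result.  Masking
with 0xffffffff leaves the upper half zero.  The model below assumes a
32-bit int and uses hypothetical names (m64_model, cvtsi32_si64_old,
cvtsi32_si64_new) in place of the header's own types and intrinsic.

```c
#include <assert.h>

/* Stand-in for the header's "unsigned long long" __m64/__int64 typedef.  */
typedef unsigned long long m64_model;

/* Old behaviour: the int is converted directly, so a negative value is
   sign-extended into the upper 32 bits.  */
static m64_model
cvtsi32_si64_old (int i)
{
  return i;
}

/* New behaviour: the mask constant has type unsigned int (with a 32-bit
   int), so the AND is done in unsigned arithmetic and the widening
   conversion zero-extends.  */
static m64_model
cvtsi32_si64_new (int i)
{
  return (i & 0xffffffff);
}
```

With these definitions, cvtsi32_si64_old (-1) is 0xffffffffffffffff,
while cvtsi32_si64_new (-1) is 0x00000000ffffffff.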

diff --git a/gcc/config/arm/mmintrin.h b/gcc/config/arm/mmintrin.h
index 2cc500d..0fe551d 100644
--- a/gcc/config/arm/mmintrin.h
+++ b/gcc/config/arm/mmintrin.h
@@ -24,16 +24,30 @@
 #ifndef _MMINTRIN_H_INCLUDED
 #define _MMINTRIN_H_INCLUDED
 
+#ifndef __IWMMXT__
+#error You must enable WMMX/WMMX2 instructions (e.g. -march=iwmmxt or -march=iwmmxt2) to use iWMMXt/iWMMXt2 intrinsics
+#else
+
+#ifndef __IWMMXT2__
+#warning You only enable iWMMXt intrinsics. Extended iWMMXt2 intrinsics available only if WMMX2 instructions enabled (e.g. -march=iwmmxt2)
+#endif
+
+
+#if defined __cplusplus
+extern "C" { /* Begin "C" */
+/* Intrinsics use C name-mangling.  */
+#endif /* __cplusplus */
+
 /* The data type intended for user use.  */
 typedef unsigned long long __m64, __int64;
 
 /* Internal data types for implementing the intrinsics.  */
 typedef int __v2si __attribute__ ((vector_size (8)));
 typedef short __v4hi __attribute__ ((vector_size (8)));
-typedef char __v8qi __attribute__ ((vector_size (8)));
+typedef signed char __v8qi __attribute__ ((vector_size (8)));
 
 /* "Convert" __m64 and __int64 into each other.  */
-static __inline __m64 
+static __inline __m64
 _mm_cvtsi64_m64 (__int64 __i)
 {
   return __i;
@@ -54,7 +68,7 @@ _mm_cvtsi64_si32 (__int64 __i)
 static __inline __int64
 _mm_cvtsi32_si64 (int __i)
 {
-  return __i;
+  return (__i & 0xffffffff);
 }
 
 /* Pack the four 16-bit values from M1 into the lower four 8-bit values of
@@ -603,7 +617,7 @@ _mm_and_si64 (__m64 __m1, __m64 __m2)
 static __inline __m64
 _mm_andnot_si64 (__m64 __m1, __m64 __m2)
 {
-  return __builtin_arm_wandn (__m1, __m2);
+  return __builtin_arm_wandn (__m2, __m1);
 }
 
 /* Bit-wise inclusive OR the 64-bit values in M1 and M2.  */
@@ -935,7 +949,13 @@ _mm_avg2_pu16 (__m64 __A, __m64 __B)
 static __inline __m64
 _mm_sad_pu8 (__m64 __A, __m64 __B)
 {
-  return (__m64) __builtin_arm_wsadb ((__v8qi)__A, (__v8qi)__B);
+  return (__m64) __builtin_arm_wsadbz ((__v8qi)__A, (__v8qi)__B);
+}
+
+static __inline __m64
+_mm_sada_pu8 (__m64 __A, __m64 __B, __m64 __C)
+{
+  return (__m64) __builtin_arm_wsadb ((__v2si)__A, (__v8qi)__B, (__v8qi)__C);
 }
 
 /* Compute the sum of the absolute differences of the unsigned 16-bit
@@ -944,9 +964,16 @@ _mm_sad_pu8 (__m64 __A, __m64 __B)
 static __inline __m64
 _mm_sad_pu16 (__m64 __A, __m64 __B)
 {
-  return (__m64) __builtin_arm_wsadh ((__v4hi)__A, (__v4hi)__B);
+  return (__m64) __builtin_arm_wsadhz ((__v4hi)__A, (__v4hi)__B);
 }
 
+static __inline __m64
+_mm_sada_pu16 (__m64 __A, __m64 __B, __m64 __C)
+{
+  return (__m64) __builtin_arm_wsadh ((__v2si)__A, (__v4hi)__B, (__v4hi)__C);
+}
+
+
 /* Compute the sum of the absolute differences of the unsigned 8-bit
    values in A and B.  Return the value in the lower 16-bit word; the
    upper words are cleared.  */
@@ -965,11 +992,8 @@ _mm_sadz_pu16 (__m64 __A, __m64 __B)
   return (__m64) __builtin_arm_wsadhz ((__v4hi)__A, (__v4hi)__B);
 }
 
-static __inline __m64
-_mm_align_si64 (__m64 __A, __m64 __B, int __C)
-{
-  return (__m64) __builtin_arm_walign ((__v8qi)__A, (__v8qi)__B, __C);
-}
+#define _mm_align_si64(__A,__B, N) \
+  (__m64) __builtin_arm_walign ((__v8qi) (__A),(__v8qi) (__B), (N))
 
 /* Creates a 64-bit zero.  */
 static __inline __m64
@@ -987,42 +1011,76 @@ _mm_setwcx (const int __value, const int __regno)
 {
   switch (__regno)
     {
-    case 0:  __builtin_arm_setwcx (__value, 0); break;
-    case 1:  __builtin_arm_setwcx (__value, 1); break;
-    case 2:  __builtin_arm_setwcx (__value, 2); break;
-    case 3:  __builtin_arm_setwcx (__value, 3); break;
-    case 8:  __builtin_arm_setwcx (__value, 8); break;
-    case 9:  __builtin_arm_setwcx (__value, 9); break;
-    case 10: __builtin_arm_setwcx (__value, 10); break;
-    case 11: __builtin_arm_setwcx (__value, 11); break;
-    default: break;
+    case 0:
+      __asm __volatile ("tmcr wcid, %0" :: "r"(__value));
+      break;
+    case 1:
+      __asm __volatile ("tmcr wcon, %0" :: "r"(__value));
+      break;
+    case 2:
+      __asm __volatile ("tmcr wcssf, %0" :: "r"(__value));
+      break;
+    case 3:
+      __asm __volatile ("tmcr wcasf, %0" :: "r"(__value));
+      break;
+    case 8:
+      __builtin_arm_setwcgr0 (__value);
+      break;
+    case 9:
+      __builtin_arm_setwcgr1 (__value);
+      break;
+    case 10:
+      __builtin_arm_setwcgr2 (__value);
+      break;
+    case 11:
+      __builtin_arm_setwcgr3 (__value);
+      break;
+    default:
+      break;
     }
 }
 
 static __inline int
 _mm_getwcx (const int __regno)
 {
+  int __value = 0;
   switch (__regno)
     {
-    case 0:  return __builtin_arm_getwcx (0);
-    case 1:  return __builtin_arm_getwcx (1);
-    case 2:  return __builtin_arm_getwcx (2);
-    case 3:  return __builtin_arm_getwcx (3);
-    case 8:  return __builtin_arm_getwcx (8);
-    case 9:  return __builtin_arm_getwcx (9);
-    case 10: return __builtin_arm_getwcx (10);
-    case 11: return __builtin_arm_getwcx (11);
-    default: return 0;
+    case 0:
+      __asm __volatile ("tmrc %0, wcid" : "=r"(__value));
+      break;
+    case 1:
+      __asm __volatile ("tmrc %0, wcon" : "=r"(__value));
+      break;
+    case 2:
+      __asm __volatile ("tmrc %0, wcssf" : "=r"(__value));
+      break;
+    case 3:
+      __asm __volatile ("tmrc %0, wcasf" : "=r"(__value));
+      break;
+    case 8:
+      return __builtin_arm_getwcgr0 ();
+    case 9:
+      return __builtin_arm_getwcgr1 ();
+    case 10:
+      return __builtin_arm_getwcgr2 ();
+    case 11:
+      return __builtin_arm_getwcgr3 ();
+    default:
+      break;
     }
+  return __value;
 }
 
 /* Creates a vector of two 32-bit values; I0 is least significant.  */
 static __inline __m64
 _mm_set_pi32 (int __i1, int __i0)
 {
-  union {
+  union
+  {
     __m64 __q;
-    struct {
+    struct
+    {
       unsigned int __i0;
       unsigned int __i1;
     } __s;
@@ -1041,7 +1099,7 @@ _mm_set_pi16 (short __w3, short __w2, short __w1, short __w0)
   unsigned int __i1 = (unsigned short)__w3 << 16 | (unsigned short)__w2;
   unsigned int __i0 = (unsigned short)__w1 << 16 | (unsigned short)__w0;
   return _mm_set_pi32 (__i1, __i0);
-		       
+
 }
 
 /* Creates a vector of eight 8-bit values; B0 is least significant.  */
@@ -1108,11 +1166,526 @@ _mm_set1_pi8 (char __b)
   return _mm_set1_pi32 (__i);
 }
 
-/* Convert an integer to a __m64 object.  */
+#ifdef __IWMMXT2__
+static __inline __m64
+_mm_abs_pi8 (__m64 m1)
+{
+  return (__m64) __builtin_arm_wabsb ((__v8qi)m1);
+}
+
+static __inline __m64
+_mm_abs_pi16 (__m64 m1)
+{
+  return (__m64) __builtin_arm_wabsh ((__v4hi)m1);
+
+}
+
+static __inline __m64
+_mm_abs_pi32 (__m64 m1)
+{
+  return (__m64) __builtin_arm_wabsw ((__v2si)m1);
+
+}
+
+static __inline __m64
+_mm_addsubhx_pi16 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_waddsubhx ((__v4hi)a, (__v4hi)b);
+}
+
+static __inline __m64
+_mm_absdiff_pu8 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wabsdiffb ((__v8qi)a, (__v8qi)b);
+}
+
+static __inline __m64
+_mm_absdiff_pu16 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wabsdiffh ((__v4hi)a, (__v4hi)b);
+}
+
+static __inline __m64
+_mm_absdiff_pu32 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wabsdiffw ((__v2si)a, (__v2si)b);
+}
+
+static __inline __m64
+_mm_addc_pu16 (__m64 a, __m64 b)
+{
+  __m64 result;
+  __asm__ __volatile__ ("waddhc	%0, %1, %2" : "=y" (result) : "y" (a),  "y" (b));
+  return result;
+}
+
+static __inline __m64
+_mm_addc_pu32 (__m64 a, __m64 b)
+{
+  __m64 result;
+  __asm__ __volatile__ ("waddwc	%0, %1, %2" : "=y" (result) : "y" (a),  "y" (b));
+  return result;
+}
+
+static __inline __m64
+_mm_avg4_pu8 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wavg4 ((__v8qi)a, (__v8qi)b);
+}
+
+static __inline __m64
+_mm_avg4r_pu8 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wavg4r ((__v8qi)a, (__v8qi)b);
+}
+
+static __inline __m64
+_mm_maddx_pi16 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wmaddsx ((__v4hi)a, (__v4hi)b);
+}
+
+static __inline __m64
+_mm_maddx_pu16 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wmaddux ((__v4hi)a, (__v4hi)b);
+}
+
+static __inline __m64
+_mm_msub_pi16 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wmaddsn ((__v4hi)a, (__v4hi)b);
+}
+
+static __inline __m64
+_mm_msub_pu16 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wmaddun ((__v4hi)a, (__v4hi)b);
+}
+
+static __inline __m64
+_mm_mulhi_pi32 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wmulwsm ((__v2si)a, (__v2si)b);
+}
+
+static __inline __m64
+_mm_mulhi_pu32 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wmulwum ((__v2si)a, (__v2si)b);
+}
+
+static __inline __m64
+_mm_mulhir_pi16 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wmulsmr ((__v4hi)a, (__v4hi)b);
+}
+
+static __inline __m64
+_mm_mulhir_pi32 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wmulwsmr ((__v2si)a, (__v2si)b);
+}
+
+static __inline __m64
+_mm_mulhir_pu16 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wmulumr ((__v4hi)a, (__v4hi)b);
+}
+
+static __inline __m64
+_mm_mulhir_pu32 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wmulwumr ((__v2si)a, (__v2si)b);
+}
+
+static __inline __m64
+_mm_mullo_pi32 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wmulwl ((__v2si)a, (__v2si)b);
+}
+
+static __inline __m64
+_mm_qmulm_pi16 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wqmulm ((__v4hi)a, (__v4hi)b);
+}
+
+static __inline __m64
+_mm_qmulm_pi32 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wqmulwm ((__v2si)a, (__v2si)b);
+}
+
+static __inline __m64
+_mm_qmulmr_pi16 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wqmulmr ((__v4hi)a, (__v4hi)b);
+}
+
+static __inline __m64
+_mm_qmulmr_pi32 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wqmulwmr ((__v2si)a, (__v2si)b);
+}
+
+static __inline __m64
+_mm_subaddhx_pi16 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_wsubaddhx ((__v4hi)a, (__v4hi)b);
+}
+
+static __inline __m64
+_mm_addbhusl_pu8 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_waddbhusl ((__v4hi)a, (__v8qi)b);
+}
+
+static __inline __m64
+_mm_addbhusm_pu8 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_waddbhusm ((__v4hi)a, (__v8qi)b);
+}
+
+#define _mm_qmiabb_pi32(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wqmiabb ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_qmiabbn_pi32(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wqmiabbn ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_qmiabt_pi32(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wqmiabt ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_qmiabtn_pi32(acc, m1, m2) \
+  ({\
+   __m64 _acc=acc;\
+   __m64 _m1=m1;\
+   __m64 _m2=m2;\
+   _acc = (__m64) __builtin_arm_wqmiabtn ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_qmiatb_pi32(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wqmiatb ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_qmiatbn_pi32(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wqmiatbn ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_qmiatt_pi32(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wqmiatt ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_qmiattn_pi32(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wqmiattn ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiabb_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiabb (_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiabbn_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiabbn (_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiabt_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiabt (_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiabtn_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiabtn (_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiatb_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiatb (_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiatbn_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiatbn (_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiatt_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiatt (_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiattn_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiattn (_acc, (__v4hi)_m1, (__v4hi)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiawbb_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiawbb (_acc, (__v2si)_m1, (__v2si)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiawbbn_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiawbbn (_acc, (__v2si)_m1, (__v2si)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiawbt_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiawbt (_acc, (__v2si)_m1, (__v2si)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiawbtn_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiawbtn (_acc, (__v2si)_m1, (__v2si)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiawtb_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiawtb (_acc, (__v2si)_m1, (__v2si)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiawtbn_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiawtbn (_acc, (__v2si)_m1, (__v2si)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiawtt_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiawtt (_acc, (__v2si)_m1, (__v2si)_m2);\
+   _acc;\
+   })
+
+#define _mm_wmiawttn_si64(acc, m1, m2) \
+  ({\
+   __m64 _acc = acc;\
+   __m64 _m1 = m1;\
+   __m64 _m2 = m2;\
+   _acc = (__m64) __builtin_arm_wmiawttn (_acc, (__v2si)_m1, (__v2si)_m2);\
+   _acc;\
+   })
+
+/* The third argument should be an immediate.  */
+#define _mm_merge_si64(a, b, n) \
+  ({\
+   __m64 result;\
+   result = (__m64) __builtin_arm_wmerge ((__m64) (a), (__m64) (b), (n));\
+   result;\
+   })
+#endif  /* __IWMMXT2__ */
+
+static __inline __m64
+_mm_alignr0_si64 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_walignr0 ((__v8qi) a, (__v8qi) b);
+}
+
+static __inline __m64
+_mm_alignr1_si64 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_walignr1 ((__v8qi) a, (__v8qi) b);
+}
+
+static __inline __m64
+_mm_alignr2_si64 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_walignr2 ((__v8qi) a, (__v8qi) b);
+}
+
+static __inline __m64
+_mm_alignr3_si64 (__m64 a, __m64 b)
+{
+  return (__m64) __builtin_arm_walignr3 ((__v8qi) a, (__v8qi) b);
+}
+
+static __inline void
+_mm_tandcb ()
+{
+  __asm __volatile ("tandcb r15");
+}
+
+static __inline void
+_mm_tandch ()
+{
+  __asm __volatile ("tandch r15");
+}
+
+static __inline void
+_mm_tandcw ()
+{
+  __asm __volatile ("tandcw r15");
+}
+
+#define _mm_textrcb(n) \
+  ({\
+   __asm__ __volatile__ (\
+     "textrcb r15, %0" : : "i" (n));\
+   })
+
+#define _mm_textrch(n) \
+  ({\
+   __asm__ __volatile__ (\
+     "textrch r15, %0" : : "i" (n));\
+   })
+
+#define _mm_textrcw(n) \
+  ({\
+   __asm__ __volatile__ (\
+     "textrcw r15, %0" : : "i" (n));\
+   })
+
+static __inline void
+_mm_torcb ()
+{
+  __asm __volatile ("torcb r15");
+}
+
+static __inline void
+_mm_torch ()
+{
+  __asm __volatile ("torch r15");
+}
+
+static __inline void
+_mm_torcw ()
+{
+  __asm __volatile ("torcw r15");
+}
+
+#ifdef __IWMMXT2__
+static __inline void
+_mm_torvscb ()
+{
+  __asm __volatile ("torvscb r15");
+}
+
+static __inline void
+_mm_torvsch ()
+{
+  __asm __volatile ("torvsch r15");
+}
+
+static __inline void
+_mm_torvscw ()
+{
+  __asm __volatile ("torvscw r15");
+}
+#endif
+
+static __inline __m64
+_mm_tbcst_pi8 (int value)
+{
+  return (__m64) __builtin_arm_tbcstb ((signed char) value);
+}
+
+static __inline __m64
+_mm_tbcst_pi16 (int value)
+{
+  return (__m64) __builtin_arm_tbcsth ((short) value);
+}
+
 static __inline __m64
-_m_from_int (int __a)
+_mm_tbcst_pi32 (int value)
 {
-  return (__m64)__a;
+  return (__m64) __builtin_arm_tbcstw (value);
 }
 
 #define _m_packsswb _mm_packs_pi16
@@ -1250,5 +1823,11 @@ _m_from_int (int __a)
 #define _m_paligniq _mm_align_si64
 #define _m_cvt_si2pi _mm_cvtsi64_m64
 #define _m_cvt_pi2si _mm_cvtm64_si64
+#define _m_from_int _mm_cvtsi32_si64
+#define _m_to_int _mm_cvtsi64_si32
 
+#if defined __cplusplus
+}; /* End "C" */
+#endif /* __cplusplus */
+#endif /* __IWMMXT__ */
 #endif /* _MMINTRIN_H_INCLUDED */
-- 
1.7.3.4
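
Note on the _mm_andnot_si64 operand swap (an illustrative sketch, not
part of the patch).  It rests on two assumptions: that the intrinsic is
meant to follow the usual MMX convention of (~m1) & m2, and that the
builtin computes "first operand AND NOT second operand".  model_wandn,
andnot_old and andnot_new are hypothetical stand-ins written in plain C
so the difference can be checked on any host.

```c
#include <assert.h>

typedef unsigned long long m64_model;

/* Assumed model of the builtin: first operand AND NOT second operand.  */
static m64_model
model_wandn (m64_model a, m64_model b)
{
  return a & ~b;
}

/* Operand order before the patch: computes m1 & ~m2.  */
static m64_model
andnot_old (m64_model m1, m64_model m2)
{
  return model_wandn (m1, m2);
}

/* Operand order after the patch: computes ~m1 & m2.  */
static m64_model
andnot_new (m64_model m1, m64_model m2)
{
  return model_wandn (m2, m1);
}
```

Under these assumptions, with m1 = 0xf0 and m2 = 0xff the old order
gives 0 and the new order gives 0x0f.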


* Re: [PATCH ARM iWMMXt 1/5] ARM code generic change
  2012-05-29  4:14 ` [PATCH ARM iWMMXt 1/5] ARM code generic change Matt Turner
@ 2012-06-06 11:53   ` Ramana Radhakrishnan
  2012-12-27  2:31     ` [PATCH, ARM, iWMMXT] Fix define_constants for WCGR Xinyu Qi
  2013-01-22  9:22     ` [PING][PATCH, " Xinyu Qi
  0 siblings, 2 replies; 29+ messages in thread
From: Ramana Radhakrishnan @ 2012-06-06 11:53 UTC (permalink / raw)
  To: Matt Turner
  Cc: gcc-patches, Richard Earnshaw, Nick Clifton, Paul Brook, Xinyu Qi

On 29 May 2012 05:13, Matt Turner <mattst88@gmail.com> wrote:
> From: Xinyu Qi <xyqi@marvell.com>
>
>        gcc/
>        * config/arm/arm.c (FL_IWMMXT2): New define.
>        (arm_arch_iwmmxt2): New variable.
>        (arm_option_override): Enable use of iWMMXt with VFP.
>        Disable use of iWMMXt with NEON. Disable use of iWMMXt under
>        Thumb mode. Set arm_arch_iwmmxt2.
>        (arm_expand_binop_builtin): Accept VOIDmode op.
>        * config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Define __IWMMXT2__.
>        (TARGET_IWMMXT2): New define.
>        (TARGET_REALLY_IWMMXT2): Likewise.
>        (arm_arch_iwmmxt2): Declare.
>        * config/arm/arm-cores.def (iwmmxt2): Add FL_IWMMXT2.
>        * config/arm/arm-arches.def (iwmmxt2): Likewise.
>        * config/arm/arm.md (arch): Add "iwmmxt2".
>        (arch_enabled): Handle "iwmmxt2".
> ---
>  gcc/config/arm/arm-arches.def |    2 +-
>  gcc/config/arm/arm-cores.def  |    2 +-
>  gcc/config/arm/arm.c          |   25 +++++++++++++++++--------
>  gcc/config/arm/arm.h          |    7 +++++++
>  gcc/config/arm/arm.md         |    6 +++++-
>  5 files changed, 31 insertions(+), 11 deletions(-)
>
> diff --git a/gcc/config/arm/arm-arches.def b/gcc/config/arm/arm-arches.def
> index 3123426..f4dd6cc 100644
> --- a/gcc/config/arm/arm-arches.def
> +++ b/gcc/config/arm/arm-arches.def
> @@ -57,4 +57,4 @@ ARM_ARCH("armv7-m", cortexm3, 7M,  FL_CO_PROC |             FL_FOR_ARCH7M)
>  ARM_ARCH("armv7e-m", cortexm4,  7EM, FL_CO_PROC |            FL_FOR_ARCH7EM)
>  ARM_ARCH("ep9312",  ep9312,     4T,  FL_LDSCHED | FL_CIRRUS | FL_FOR_ARCH4)
>  ARM_ARCH("iwmmxt",  iwmmxt,     5TE, FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT)
> -ARM_ARCH("iwmmxt2", iwmmxt2,    5TE, FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT)
> +ARM_ARCH("iwmmxt2", iwmmxt2,    5TE, FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT | FL_IWMMXT2)
> diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
> index d82b10b..c82eada 100644
> --- a/gcc/config/arm/arm-cores.def
> +++ b/gcc/config/arm/arm-cores.def
> @@ -105,7 +105,7 @@ ARM_CORE("arm1020e",      arm1020e, 5TE,                             FL_LDSCHED, fastmul)
>  ARM_CORE("arm1022e",      arm1022e,    5TE,                             FL_LDSCHED, fastmul)
>  ARM_CORE("xscale",        xscale,      5TE,                             FL_LDSCHED | FL_STRONG | FL_XSCALE, xscale)
>  ARM_CORE("iwmmxt",        iwmmxt,      5TE,                             FL_LDSCHED | FL_STRONG | FL_XSCALE | FL_IWMMXT, xscale)
> -ARM_CORE("iwmmxt2",       iwmmxt2,     5TE,                             FL_LDSCHED | FL_STRONG | FL_XSCALE | FL_IWMMXT, xscale)
> +ARM_CORE("iwmmxt2",       iwmmxt2,     5TE,                             FL_LDSCHED | FL_STRONG | FL_XSCALE | FL_IWMMXT | FL_IWMMXT2, xscale)
>  ARM_CORE("fa606te",       fa606te,      5TE,                             FL_LDSCHED, 9e)
>  ARM_CORE("fa626te",       fa626te,      5TE,                             FL_LDSCHED, 9e)
>  ARM_CORE("fmp626",        fmp626,       5TE,                             FL_LDSCHED, 9e)
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 7a98197..b0680ab 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -685,6 +685,7 @@ static int thumb_call_reg_needed;
>  #define FL_ARM_DIV    (1 << 23)              /* Hardware divide (ARM mode).  */
>
>  #define FL_IWMMXT     (1 << 29)              /* XScale v2 or "Intel Wireless MMX technology".  */
> +#define FL_IWMMXT2    (1 << 30)       /* "Intel Wireless MMX2 technology".  */
>
>  /* Flags that only effect tuning, not available instructions.  */
>  #define FL_TUNE                (FL_WBUF | FL_VFPV2 | FL_STRONG | FL_LDSCHED \
> @@ -766,6 +767,9 @@ int arm_arch_cirrus = 0;
>  /* Nonzero if this chip supports Intel Wireless MMX technology.  */
>  int arm_arch_iwmmxt = 0;
>
> +/* Nonzero if this chip supports Intel Wireless MMX2 technology.  */
> +int arm_arch_iwmmxt2 = 0;
> +
>  /* Nonzero if this chip is an XScale.  */
>  int arm_arch_xscale = 0;
>
> @@ -1717,6 +1721,7 @@ arm_option_override (void)
>   arm_tune_wbuf = (tune_flags & FL_WBUF) != 0;
>   arm_tune_xscale = (tune_flags & FL_XSCALE) != 0;
>   arm_arch_iwmmxt = (insn_flags & FL_IWMMXT) != 0;
> +  arm_arch_iwmmxt2 = (insn_flags & FL_IWMMXT2) != 0;
>   arm_arch_thumb_hwdiv = (insn_flags & FL_THUMB_DIV) != 0;
>   arm_arch_arm_hwdiv = (insn_flags & FL_ARM_DIV) != 0;
>   arm_tune_cortex_a9 = (arm_tune == cortexa9) != 0;
> @@ -1817,14 +1822,17 @@ arm_option_override (void)
>     }
>
>   /* FPA and iWMMXt are incompatible because the insn encodings overlap.
> -     VFP and iWMMXt can theoretically coexist, but it's unlikely such silicon
> -     will ever exist.  GCC makes no attempt to support this combination.  */
> -  if (TARGET_IWMMXT && !TARGET_SOFT_FLOAT)
> -    sorry ("iWMMXt and hardware floating point");
> +     VFP and iWMMXt however can coexist.  */
> +  if (TARGET_IWMMXT && TARGET_HARD_FLOAT && !TARGET_VFP)
> +    error ("iWMMXt and non-VFP floating point unit are incompatible");
> +
> +  /* iWMMXt and NEON are incompatible.  */
> +  if (TARGET_IWMMXT && TARGET_NEON)
> +    error ("iWMMXt and NEON are incompatible");
>
> -  /* ??? iWMMXt insn patterns need auditing for Thumb-2.  */
> -  if (TARGET_THUMB2 && TARGET_IWMMXT)
> -    sorry ("Thumb-2 iWMMXt");
> +  /* iWMMXt unsupported under Thumb mode.  */
> +  if (TARGET_THUMB && TARGET_IWMMXT)
> +    error ("iWMMXt unsupported under Thumb mode");
>
>   /* __fp16 support currently assumes the core has ldrh.  */
>   if (!arm_arch4 && arm_fp16_format != ARM_FP16_FORMAT_NONE)
> @@ -20867,7 +20875,8 @@ arm_expand_binop_builtin (enum insn_code icode,
>       || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
>     target = gen_reg_rtx (tmode);
>
> -  gcc_assert (GET_MODE (op0) == mode0 && GET_MODE (op1) == mode1);
> +  gcc_assert ((GET_MODE (op0) == mode0 || GET_MODE (op0) == VOIDmode)
> +             && (GET_MODE (op1) == mode1 || GET_MODE (op1) == VOIDmode));
>
>   if (! (*insn_data[icode].operand[1].predicate) (op0, mode0))
>     op0 = copy_to_mode_reg (mode0, op0);
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index f4204e4..c51bce9 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -97,6 +97,8 @@ extern char arm_arch_name[];
>          builtin_define ("__XSCALE__");                \
>        if (arm_arch_iwmmxt)                            \
>          builtin_define ("__IWMMXT__");                \
> +       if (arm_arch_iwmmxt2)                           \
> +         builtin_define ("__IWMMXT2__");               \
>        if (TARGET_AAPCS_BASED)                         \
>          {                                             \
>            if (arm_pcs_default == ARM_PCS_AAPCS_VFP)   \
> @@ -194,7 +196,9 @@ extern void (*arm_lang_output_object_attributes_hook)(void);
>  #define TARGET_MAVERICK                (arm_fpu_desc->model == ARM_FP_MODEL_MAVERICK)
>  #define TARGET_VFP             (arm_fpu_desc->model == ARM_FP_MODEL_VFP)
>  #define TARGET_IWMMXT                  (arm_arch_iwmmxt)
> +#define TARGET_IWMMXT2                 (arm_arch_iwmmxt2)
>  #define TARGET_REALLY_IWMMXT           (TARGET_IWMMXT && TARGET_32BIT)
> +#define TARGET_REALLY_IWMMXT2          (TARGET_IWMMXT2 && TARGET_32BIT)
>  #define TARGET_IWMMXT_ABI (TARGET_32BIT && arm_abi == ARM_ABI_IWMMXT)
>  #define TARGET_ARM                      (! TARGET_THUMB)
>  #define TARGET_EITHER                  1 /* (TARGET_ARM | TARGET_THUMB) */
> @@ -410,6 +414,9 @@ extern int arm_arch_cirrus;
>  /* Nonzero if this chip supports Intel XScale with Wireless MMX technology.  */
>  extern int arm_arch_iwmmxt;
>
> +/* Nonzero if this chip supports Intel Wireless MMX2 technology.  */
> +extern int arm_arch_iwmmxt2;
> +
>  /* Nonzero if this chip is an XScale.  */
>  extern int arm_arch_xscale;
>
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index bbf6380..ad9d948 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -197,7 +197,7 @@
>  ; for ARM or Thumb-2 with arm_arch6, and nov6 for ARM without
>  ; arm_arch6.  This attribute is used to compute attribute "enabled",
>  ; use type "any" to enable an alternative in all cases.
> -(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,onlya8,neon_onlya8,nota8,neon_nota8"
> +(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,onlya8,neon_onlya8,nota8,neon_nota8,iwmmxt,iwmmxt2"
>   (const_string "any"))
>
>  (define_attr "arch_enabled" "no,yes"
> @@ -248,6 +248,10 @@
>         (and (eq_attr "arch" "neon_nota8")
>              (not (eq_attr "tune" "cortexa8"))
>              (match_test "TARGET_NEON"))
> +        (const_string "yes")
> +

Unnecessary new line here.

> +        (and (eq_attr "arch" "iwmmxt2")
> +             (match_test "TARGET_REALLY_IWMMXT2"))
>         (const_string "yes")]
>        (const_string "no")))


Given that we already have iwmmxt2 as a CPU it isn't really changing
behaviour. OK with that change.
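
For illustration only (a sketch, not part of the patch or the review):
once __IWMMXT2__ is defined alongside __IWMMXT__, user code can pick a
code path at compile time.  iwmmxt_level is a hypothetical helper; only
the two macro names come from the patch.  Since -march=iwmmxt2 sets both
FL_IWMMXT and FL_IWMMXT2, the more specific macro is tested first.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: reports which iWMMXt level the translation unit
   was compiled for, based on the predefined macros.  */
static const char *
iwmmxt_level (void)
{
#if defined (__IWMMXT2__)
  return "iwmmxt2";
#elif defined (__IWMMXT__)
  return "iwmmxt";
#else
  return "none";
#endif
}
```

On a build without either macro the helper returns "none", so the same
source can still be compiled on a host without iWMMXt support.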


regards,
Ramana

>
> --
> 1.7.3.4
>


* Re: [PATCH ARM iWMMXt 3/5] built in define and expand
  2012-05-29  4:14 ` [PATCH ARM iWMMXt 3/5] built in define and expand Matt Turner
@ 2012-06-06 11:55   ` Ramana Radhakrishnan
  0 siblings, 0 replies; 29+ messages in thread
From: Ramana Radhakrishnan @ 2012-06-06 11:55 UTC (permalink / raw)
  To: Matt Turner
  Cc: gcc-patches, Ramana Radhakrishnan, Richard Earnshaw,
	Nick Clifton, Paul Brook, Xinyu Qi

On 29 May 2012 05:13, Matt Turner <mattst88@gmail.com> wrote:
> From: Xinyu Qi <xyqi@marvell.com>
>
>        gcc/
>        * config/arm/arm.c (enum arm_builtins): Revise built-in fcode.
>        (IWMMXT2_BUILTIN): New define.
>        (IWMMXT2_BUILTIN2): Likewise.
>        (iwmmx2_mbuiltin): Likewise.
>        (builtin_description bdesc_2arg): Revise built in declaration.
>        (builtin_description bdesc_1arg): Likewise.
>        (arm_init_iwmmxt_builtins): Revise built in initialization.
>        (arm_expand_builtin): Revise built in expansion.
> ---
>  gcc/config/arm/arm.c |  620 +++++++++++++++++++++++++++++++++++++++++++++-----
>  1 files changed, 559 insertions(+), 61 deletions(-)
>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index b0680ab..51eed40 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -19637,8 +19637,15 @@ static neon_builtin_datum neon_builtin_data[] =
>    FIXME?  */
>  enum arm_builtins
>  {
> -  ARM_BUILTIN_GETWCX,
> -  ARM_BUILTIN_SETWCX,
> +  ARM_BUILTIN_GETWCGR0,
> +  ARM_BUILTIN_GETWCGR1,
> +  ARM_BUILTIN_GETWCGR2,
> +  ARM_BUILTIN_GETWCGR3,
> +
> +  ARM_BUILTIN_SETWCGR0,
> +  ARM_BUILTIN_SETWCGR1,
> +  ARM_BUILTIN_SETWCGR2,
> +  ARM_BUILTIN_SETWCGR3,
>
>   ARM_BUILTIN_WZERO,
>
> @@ -19661,7 +19668,11 @@ enum arm_builtins
>   ARM_BUILTIN_WSADH,
>   ARM_BUILTIN_WSADHZ,
>
> -  ARM_BUILTIN_WALIGN,
> +  ARM_BUILTIN_WALIGNI,
> +  ARM_BUILTIN_WALIGNR0,
> +  ARM_BUILTIN_WALIGNR1,
> +  ARM_BUILTIN_WALIGNR2,
> +  ARM_BUILTIN_WALIGNR3,
>
>   ARM_BUILTIN_TMIA,
>   ARM_BUILTIN_TMIAPH,
> @@ -19797,6 +19808,81 @@ enum arm_builtins
>   ARM_BUILTIN_WUNPCKELUH,
>   ARM_BUILTIN_WUNPCKELUW,
>
> +  ARM_BUILTIN_WABSB,
> +  ARM_BUILTIN_WABSH,
> +  ARM_BUILTIN_WABSW,
> +
> +  ARM_BUILTIN_WADDSUBHX,
> +  ARM_BUILTIN_WSUBADDHX,
> +
> +  ARM_BUILTIN_WABSDIFFB,
> +  ARM_BUILTIN_WABSDIFFH,
> +  ARM_BUILTIN_WABSDIFFW,
> +
> +  ARM_BUILTIN_WADDCH,
> +  ARM_BUILTIN_WADDCW,
> +
> +  ARM_BUILTIN_WAVG4,
> +  ARM_BUILTIN_WAVG4R,
> +
> +  ARM_BUILTIN_WMADDSX,
> +  ARM_BUILTIN_WMADDUX,
> +
> +  ARM_BUILTIN_WMADDSN,
> +  ARM_BUILTIN_WMADDUN,
> +
> +  ARM_BUILTIN_WMULWSM,
> +  ARM_BUILTIN_WMULWUM,
> +
> +  ARM_BUILTIN_WMULWSMR,
> +  ARM_BUILTIN_WMULWUMR,
> +
> +  ARM_BUILTIN_WMULWL,
> +
> +  ARM_BUILTIN_WMULSMR,
> +  ARM_BUILTIN_WMULUMR,
> +
> +  ARM_BUILTIN_WQMULM,
> +  ARM_BUILTIN_WQMULMR,
> +
> +  ARM_BUILTIN_WQMULWM,
> +  ARM_BUILTIN_WQMULWMR,
> +
> +  ARM_BUILTIN_WADDBHUSM,
> +  ARM_BUILTIN_WADDBHUSL,
> +
> +  ARM_BUILTIN_WQMIABB,
> +  ARM_BUILTIN_WQMIABT,
> +  ARM_BUILTIN_WQMIATB,
> +  ARM_BUILTIN_WQMIATT,
> +
> +  ARM_BUILTIN_WQMIABBN,
> +  ARM_BUILTIN_WQMIABTN,
> +  ARM_BUILTIN_WQMIATBN,
> +  ARM_BUILTIN_WQMIATTN,
> +
> +  ARM_BUILTIN_WMIABB,
> +  ARM_BUILTIN_WMIABT,
> +  ARM_BUILTIN_WMIATB,
> +  ARM_BUILTIN_WMIATT,
> +
> +  ARM_BUILTIN_WMIABBN,
> +  ARM_BUILTIN_WMIABTN,
> +  ARM_BUILTIN_WMIATBN,
> +  ARM_BUILTIN_WMIATTN,
> +
> +  ARM_BUILTIN_WMIAWBB,
> +  ARM_BUILTIN_WMIAWBT,
> +  ARM_BUILTIN_WMIAWTB,
> +  ARM_BUILTIN_WMIAWTT,
> +
> +  ARM_BUILTIN_WMIAWBBN,
> +  ARM_BUILTIN_WMIAWBTN,
> +  ARM_BUILTIN_WMIAWTBN,
> +  ARM_BUILTIN_WMIAWTTN,
> +
> +  ARM_BUILTIN_WMERGE,
> +
>   ARM_BUILTIN_THREAD_POINTER,
>
>   ARM_BUILTIN_NEON_BASE,
> @@ -20329,6 +20415,10 @@ static const struct builtin_description bdesc_2arg[] =
>   { FL_IWMMXT, CODE_FOR_##code, "__builtin_arm_" string, \
>     ARM_BUILTIN_##builtin, UNKNOWN, 0 },
>
> +#define IWMMXT2_BUILTIN(code, string, builtin) \
> +  { FL_IWMMXT2, CODE_FOR_##code, "__builtin_arm_" string, \
> +    ARM_BUILTIN_##builtin, UNKNOWN, 0 },
> +
>   IWMMXT_BUILTIN (addv8qi3, "waddb", WADDB)
>   IWMMXT_BUILTIN (addv4hi3, "waddh", WADDH)
>   IWMMXT_BUILTIN (addv2si3, "waddw", WADDW)
> @@ -20385,44 +20475,45 @@ static const struct builtin_description bdesc_2arg[] =
>   IWMMXT_BUILTIN (iwmmxt_wunpckihb, "wunpckihb", WUNPCKIHB)
>   IWMMXT_BUILTIN (iwmmxt_wunpckihh, "wunpckihh", WUNPCKIHH)
>   IWMMXT_BUILTIN (iwmmxt_wunpckihw, "wunpckihw", WUNPCKIHW)
> -  IWMMXT_BUILTIN (iwmmxt_wmadds, "wmadds", WMADDS)
> -  IWMMXT_BUILTIN (iwmmxt_wmaddu, "wmaddu", WMADDU)
> +  IWMMXT2_BUILTIN (iwmmxt_waddsubhx, "waddsubhx", WADDSUBHX)
> +  IWMMXT2_BUILTIN (iwmmxt_wsubaddhx, "wsubaddhx", WSUBADDHX)
> +  IWMMXT2_BUILTIN (iwmmxt_wabsdiffb, "wabsdiffb", WABSDIFFB)
> +  IWMMXT2_BUILTIN (iwmmxt_wabsdiffh, "wabsdiffh", WABSDIFFH)
> +  IWMMXT2_BUILTIN (iwmmxt_wabsdiffw, "wabsdiffw", WABSDIFFW)
> +  IWMMXT2_BUILTIN (iwmmxt_avg4, "wavg4", WAVG4)
> +  IWMMXT2_BUILTIN (iwmmxt_avg4r, "wavg4r", WAVG4R)
> +  IWMMXT2_BUILTIN (iwmmxt_wmulwsm, "wmulwsm", WMULWSM)
> +  IWMMXT2_BUILTIN (iwmmxt_wmulwum, "wmulwum", WMULWUM)
> +  IWMMXT2_BUILTIN (iwmmxt_wmulwsmr, "wmulwsmr", WMULWSMR)
> +  IWMMXT2_BUILTIN (iwmmxt_wmulwumr, "wmulwumr", WMULWUMR)
> +  IWMMXT2_BUILTIN (iwmmxt_wmulwl, "wmulwl", WMULWL)
> +  IWMMXT2_BUILTIN (iwmmxt_wmulsmr, "wmulsmr", WMULSMR)
> +  IWMMXT2_BUILTIN (iwmmxt_wmulumr, "wmulumr", WMULUMR)
> +  IWMMXT2_BUILTIN (iwmmxt_wqmulm, "wqmulm", WQMULM)
> +  IWMMXT2_BUILTIN (iwmmxt_wqmulmr, "wqmulmr", WQMULMR)
> +  IWMMXT2_BUILTIN (iwmmxt_wqmulwm, "wqmulwm", WQMULWM)
> +  IWMMXT2_BUILTIN (iwmmxt_wqmulwmr, "wqmulwmr", WQMULWMR)
> +  IWMMXT_BUILTIN (iwmmxt_walignr0, "walignr0", WALIGNR0)
> +  IWMMXT_BUILTIN (iwmmxt_walignr1, "walignr1", WALIGNR1)
> +  IWMMXT_BUILTIN (iwmmxt_walignr2, "walignr2", WALIGNR2)
> +  IWMMXT_BUILTIN (iwmmxt_walignr3, "walignr3", WALIGNR3)
>
>  #define IWMMXT_BUILTIN2(code, builtin) \
>   { FL_IWMMXT, CODE_FOR_##code, NULL, ARM_BUILTIN_##builtin, UNKNOWN, 0 },
>
> +#define IWMMXT2_BUILTIN2(code, builtin) \
> +  { FL_IWMMXT2, CODE_FOR_##code, NULL, ARM_BUILTIN_##builtin, UNKNOWN, 0 },
> +
> +  IWMMXT2_BUILTIN2 (iwmmxt_waddbhusm, WADDBHUSM)
> +  IWMMXT2_BUILTIN2 (iwmmxt_waddbhusl, WADDBHUSL)
>   IWMMXT_BUILTIN2 (iwmmxt_wpackhss, WPACKHSS)
>   IWMMXT_BUILTIN2 (iwmmxt_wpackwss, WPACKWSS)
>   IWMMXT_BUILTIN2 (iwmmxt_wpackdss, WPACKDSS)
>   IWMMXT_BUILTIN2 (iwmmxt_wpackhus, WPACKHUS)
>   IWMMXT_BUILTIN2 (iwmmxt_wpackwus, WPACKWUS)
>   IWMMXT_BUILTIN2 (iwmmxt_wpackdus, WPACKDUS)
> -  IWMMXT_BUILTIN2 (ashlv4hi3_di,    WSLLH)
> -  IWMMXT_BUILTIN2 (ashlv4hi3_iwmmxt, WSLLHI)
> -  IWMMXT_BUILTIN2 (ashlv2si3_di,    WSLLW)
> -  IWMMXT_BUILTIN2 (ashlv2si3_iwmmxt, WSLLWI)
> -  IWMMXT_BUILTIN2 (ashldi3_di,      WSLLD)
> -  IWMMXT_BUILTIN2 (ashldi3_iwmmxt,  WSLLDI)
> -  IWMMXT_BUILTIN2 (lshrv4hi3_di,    WSRLH)
> -  IWMMXT_BUILTIN2 (lshrv4hi3_iwmmxt, WSRLHI)
> -  IWMMXT_BUILTIN2 (lshrv2si3_di,    WSRLW)
> -  IWMMXT_BUILTIN2 (lshrv2si3_iwmmxt, WSRLWI)
> -  IWMMXT_BUILTIN2 (lshrdi3_di,      WSRLD)
> -  IWMMXT_BUILTIN2 (lshrdi3_iwmmxt,  WSRLDI)
> -  IWMMXT_BUILTIN2 (ashrv4hi3_di,    WSRAH)
> -  IWMMXT_BUILTIN2 (ashrv4hi3_iwmmxt, WSRAHI)
> -  IWMMXT_BUILTIN2 (ashrv2si3_di,    WSRAW)
> -  IWMMXT_BUILTIN2 (ashrv2si3_iwmmxt, WSRAWI)
> -  IWMMXT_BUILTIN2 (ashrdi3_di,      WSRAD)
> -  IWMMXT_BUILTIN2 (ashrdi3_iwmmxt,  WSRADI)
> -  IWMMXT_BUILTIN2 (rorv4hi3_di,     WRORH)
> -  IWMMXT_BUILTIN2 (rorv4hi3,        WRORHI)
> -  IWMMXT_BUILTIN2 (rorv2si3_di,     WRORW)
> -  IWMMXT_BUILTIN2 (rorv2si3,        WRORWI)
> -  IWMMXT_BUILTIN2 (rordi3_di,       WRORD)
> -  IWMMXT_BUILTIN2 (rordi3,          WRORDI)
> -  IWMMXT_BUILTIN2 (iwmmxt_wmacuz,   WMACUZ)
> -  IWMMXT_BUILTIN2 (iwmmxt_wmacsz,   WMACSZ)
> +  IWMMXT_BUILTIN2 (iwmmxt_wmacuz, WMACUZ)
> +  IWMMXT_BUILTIN2 (iwmmxt_wmacsz, WMACSZ)
>  };
>
>  static const struct builtin_description bdesc_1arg[] =
> @@ -20445,6 +20536,12 @@ static const struct builtin_description bdesc_1arg[] =
>   IWMMXT_BUILTIN (iwmmxt_wunpckelsb, "wunpckelsb", WUNPCKELSB)
>   IWMMXT_BUILTIN (iwmmxt_wunpckelsh, "wunpckelsh", WUNPCKELSH)
>   IWMMXT_BUILTIN (iwmmxt_wunpckelsw, "wunpckelsw", WUNPCKELSW)
> +  IWMMXT2_BUILTIN (iwmmxt_wabsv8qi3, "wabsb", WABSB)
> +  IWMMXT2_BUILTIN (iwmmxt_wabsv4hi3, "wabsh", WABSH)
> +  IWMMXT2_BUILTIN (iwmmxt_wabsv2si3, "wabsw", WABSW)
> +  IWMMXT_BUILTIN (tbcstv8qi, "tbcstb", TBCSTB)
> +  IWMMXT_BUILTIN (tbcstv4hi, "tbcsth", TBCSTH)
> +  IWMMXT_BUILTIN (tbcstv2si, "tbcstw", TBCSTW)
>  };
>
>  /* Set up all the iWMMXt builtins.  This is not called if
> @@ -20460,9 +20557,6 @@ arm_init_iwmmxt_builtins (void)
>   tree V4HI_type_node = build_vector_type_for_mode (intHI_type_node, V4HImode);
>   tree V8QI_type_node = build_vector_type_for_mode (intQI_type_node, V8QImode);
>
> -  tree int_ftype_int
> -    = build_function_type_list (integer_type_node,
> -                               integer_type_node, NULL_TREE);
>   tree v8qi_ftype_v8qi_v8qi_int
>     = build_function_type_list (V8QI_type_node,
>                                V8QI_type_node, V8QI_type_node,
> @@ -20524,6 +20618,9 @@ arm_init_iwmmxt_builtins (void)
>   tree v4hi_ftype_v2si_v2si
>     = build_function_type_list (V4HI_type_node,
>                                V2SI_type_node, V2SI_type_node, NULL_TREE);
> +  tree v8qi_ftype_v4hi_v8qi
> +    = build_function_type_list (V8QI_type_node,
> +                               V4HI_type_node, V8QI_type_node, NULL_TREE);
>   tree v2si_ftype_v4hi_v4hi
>     = build_function_type_list (V2SI_type_node,
>                                V4HI_type_node, V4HI_type_node, NULL_TREE);
> @@ -20538,12 +20635,10 @@ arm_init_iwmmxt_builtins (void)
>     = build_function_type_list (V2SI_type_node,
>                                V2SI_type_node, long_long_integer_type_node,
>                                NULL_TREE);
> -  tree void_ftype_int_int
> -    = build_function_type_list (void_type_node,
> -                               integer_type_node, integer_type_node,
> -                               NULL_TREE);
>   tree di_ftype_void
>     = build_function_type_list (long_long_unsigned_type_node, NULL_TREE);
> +  tree int_ftype_void
> +    = build_function_type_list (integer_type_node, NULL_TREE);
>   tree di_ftype_v8qi
>     = build_function_type_list (long_long_integer_type_node,
>                                V8QI_type_node, NULL_TREE);
> @@ -20559,6 +20654,15 @@ arm_init_iwmmxt_builtins (void)
>   tree v4hi_ftype_v8qi
>     = build_function_type_list (V4HI_type_node,
>                                V8QI_type_node, NULL_TREE);
> +  tree v8qi_ftype_v8qi
> +    = build_function_type_list (V8QI_type_node,
> +                               V8QI_type_node, NULL_TREE);
> +  tree v4hi_ftype_v4hi
> +    = build_function_type_list (V4HI_type_node,
> +                               V4HI_type_node, NULL_TREE);
> +  tree v2si_ftype_v2si
> +    = build_function_type_list (V2SI_type_node,
> +                               V2SI_type_node, NULL_TREE);
>
>   tree di_ftype_di_v4hi_v4hi
>     = build_function_type_list (long_long_unsigned_type_node,
> @@ -20571,6 +20675,48 @@ arm_init_iwmmxt_builtins (void)
>                                V4HI_type_node,V4HI_type_node,
>                                NULL_TREE);
>
> +  tree v2si_ftype_v2si_v4hi_v4hi
> +    = build_function_type_list (V2SI_type_node,
> +                                V2SI_type_node, V4HI_type_node,
> +                                V4HI_type_node, NULL_TREE);
> +
> +  tree v2si_ftype_v2si_v8qi_v8qi
> +    = build_function_type_list (V2SI_type_node,
> +                                V2SI_type_node, V8QI_type_node,
> +                                V8QI_type_node, NULL_TREE);
> +
> +  tree di_ftype_di_v2si_v2si
> +     = build_function_type_list (long_long_unsigned_type_node,
> +                                 long_long_unsigned_type_node,
> +                                 V2SI_type_node, V2SI_type_node,
> +                                 NULL_TREE);
> +
> +   tree di_ftype_di_di_int
> +     = build_function_type_list (long_long_unsigned_type_node,
> +                                 long_long_unsigned_type_node,
> +                                 long_long_unsigned_type_node,
> +                                 integer_type_node, NULL_TREE);
> +
> +   tree void_ftype_void
> +     = build_function_type_list (void_type_node,
> +                                 NULL_TREE);
> +
> +   tree void_ftype_int
> +     = build_function_type_list (void_type_node,
> +                                 integer_type_node, NULL_TREE);
> +
> +   tree v8qi_ftype_char
> +     = build_function_type_list (V8QI_type_node,
> +                                 signed_char_type_node, NULL_TREE);
> +
> +   tree v4hi_ftype_short
> +     = build_function_type_list (V4HI_type_node,
> +                                 short_integer_type_node, NULL_TREE);
> +
> +   tree v2si_ftype_int
> +     = build_function_type_list (V2SI_type_node,
> +                                 integer_type_node, NULL_TREE);
> +
>   /* Normal vector binops.  */
>   tree v8qi_ftype_v8qi_v8qi
>     = build_function_type_list (V8QI_type_node,
> @@ -20628,9 +20774,19 @@ arm_init_iwmmxt_builtins (void)
>   def_mbuiltin (FL_IWMMXT, "__builtin_arm_" NAME, (TYPE),      \
>                ARM_BUILTIN_ ## CODE)
>
> +#define iwmmx2_mbuiltin(NAME, TYPE, CODE)                      \
> +  def_mbuiltin (FL_IWMMXT2, "__builtin_arm_" NAME, (TYPE),     \
> +               ARM_BUILTIN_ ## CODE)
> +
>   iwmmx_mbuiltin ("wzero", di_ftype_void, WZERO);
> -  iwmmx_mbuiltin ("setwcx", void_ftype_int_int, SETWCX);
> -  iwmmx_mbuiltin ("getwcx", int_ftype_int, GETWCX);
> +  iwmmx_mbuiltin ("setwcgr0", void_ftype_int, SETWCGR0);
> +  iwmmx_mbuiltin ("setwcgr1", void_ftype_int, SETWCGR1);
> +  iwmmx_mbuiltin ("setwcgr2", void_ftype_int, SETWCGR2);
> +  iwmmx_mbuiltin ("setwcgr3", void_ftype_int, SETWCGR3);
> +  iwmmx_mbuiltin ("getwcgr0", int_ftype_void, GETWCGR0);
> +  iwmmx_mbuiltin ("getwcgr1", int_ftype_void, GETWCGR1);
> +  iwmmx_mbuiltin ("getwcgr2", int_ftype_void, GETWCGR2);
> +  iwmmx_mbuiltin ("getwcgr3", int_ftype_void, GETWCGR3);
>
>   iwmmx_mbuiltin ("wsllh", v4hi_ftype_v4hi_di, WSLLH);
>   iwmmx_mbuiltin ("wsllw", v2si_ftype_v2si_di, WSLLW);
> @@ -20662,8 +20818,14 @@ arm_init_iwmmxt_builtins (void)
>
>   iwmmx_mbuiltin ("wshufh", v4hi_ftype_v4hi_int, WSHUFH);
>
> -  iwmmx_mbuiltin ("wsadb", v2si_ftype_v8qi_v8qi, WSADB);
> -  iwmmx_mbuiltin ("wsadh", v2si_ftype_v4hi_v4hi, WSADH);
> +  iwmmx_mbuiltin ("wsadb", v2si_ftype_v2si_v8qi_v8qi, WSADB);
> +  iwmmx_mbuiltin ("wsadh", v2si_ftype_v2si_v4hi_v4hi, WSADH);
> +  iwmmx_mbuiltin ("wmadds", v2si_ftype_v4hi_v4hi, WMADDS);
> +  iwmmx2_mbuiltin ("wmaddsx", v2si_ftype_v4hi_v4hi, WMADDSX);
> +  iwmmx2_mbuiltin ("wmaddsn", v2si_ftype_v4hi_v4hi, WMADDSN);
> +  iwmmx_mbuiltin ("wmaddu", v2si_ftype_v4hi_v4hi, WMADDU);
> +  iwmmx2_mbuiltin ("wmaddux", v2si_ftype_v4hi_v4hi, WMADDUX);
> +  iwmmx2_mbuiltin ("wmaddun", v2si_ftype_v4hi_v4hi, WMADDUN);
>   iwmmx_mbuiltin ("wsadbz", v2si_ftype_v8qi_v8qi, WSADBZ);
>   iwmmx_mbuiltin ("wsadhz", v2si_ftype_v4hi_v4hi, WSADHZ);
>
> @@ -20685,6 +20847,9 @@ arm_init_iwmmxt_builtins (void)
>   iwmmx_mbuiltin ("tmovmskh", int_ftype_v4hi, TMOVMSKH);
>   iwmmx_mbuiltin ("tmovmskw", int_ftype_v2si, TMOVMSKW);
>
> +  iwmmx2_mbuiltin ("waddbhusm", v8qi_ftype_v4hi_v8qi, WADDBHUSM);
> +  iwmmx2_mbuiltin ("waddbhusl", v8qi_ftype_v4hi_v8qi, WADDBHUSL);
> +
>   iwmmx_mbuiltin ("wpackhss", v8qi_ftype_v4hi_v4hi, WPACKHSS);
>   iwmmx_mbuiltin ("wpackhus", v8qi_ftype_v4hi_v4hi, WPACKHUS);
>   iwmmx_mbuiltin ("wpackwus", v4hi_ftype_v2si_v2si, WPACKWUS);
> @@ -20710,7 +20875,7 @@ arm_init_iwmmxt_builtins (void)
>   iwmmx_mbuiltin ("wmacu", di_ftype_di_v4hi_v4hi, WMACU);
>   iwmmx_mbuiltin ("wmacuz", di_ftype_v4hi_v4hi, WMACUZ);
>
> -  iwmmx_mbuiltin ("walign", v8qi_ftype_v8qi_v8qi_int, WALIGN);
> +  iwmmx_mbuiltin ("walign", v8qi_ftype_v8qi_v8qi_int, WALIGNI);
>   iwmmx_mbuiltin ("tmia", di_ftype_di_int_int, TMIA);
>   iwmmx_mbuiltin ("tmiaph", di_ftype_di_int_int, TMIAPH);
>   iwmmx_mbuiltin ("tmiabb", di_ftype_di_int_int, TMIABB);
> @@ -20718,7 +20883,48 @@ arm_init_iwmmxt_builtins (void)
>   iwmmx_mbuiltin ("tmiatb", di_ftype_di_int_int, TMIATB);
>   iwmmx_mbuiltin ("tmiatt", di_ftype_di_int_int, TMIATT);
>
> +  iwmmx2_mbuiltin ("wabsb", v8qi_ftype_v8qi, WABSB);
> +  iwmmx2_mbuiltin ("wabsh", v4hi_ftype_v4hi, WABSH);
> +  iwmmx2_mbuiltin ("wabsw", v2si_ftype_v2si, WABSW);
> +
> +  iwmmx2_mbuiltin ("wqmiabb", v2si_ftype_v2si_v4hi_v4hi, WQMIABB);
> +  iwmmx2_mbuiltin ("wqmiabt", v2si_ftype_v2si_v4hi_v4hi, WQMIABT);
> +  iwmmx2_mbuiltin ("wqmiatb", v2si_ftype_v2si_v4hi_v4hi, WQMIATB);
> +  iwmmx2_mbuiltin ("wqmiatt", v2si_ftype_v2si_v4hi_v4hi, WQMIATT);
> +
> +  iwmmx2_mbuiltin ("wqmiabbn", v2si_ftype_v2si_v4hi_v4hi, WQMIABBN);
> +  iwmmx2_mbuiltin ("wqmiabtn", v2si_ftype_v2si_v4hi_v4hi, WQMIABTN);
> +  iwmmx2_mbuiltin ("wqmiatbn", v2si_ftype_v2si_v4hi_v4hi, WQMIATBN);
> +  iwmmx2_mbuiltin ("wqmiattn", v2si_ftype_v2si_v4hi_v4hi, WQMIATTN);
> +
> +  iwmmx2_mbuiltin ("wmiabb", di_ftype_di_v4hi_v4hi, WMIABB);
> +  iwmmx2_mbuiltin ("wmiabt", di_ftype_di_v4hi_v4hi, WMIABT);
> +  iwmmx2_mbuiltin ("wmiatb", di_ftype_di_v4hi_v4hi, WMIATB);
> +  iwmmx2_mbuiltin ("wmiatt", di_ftype_di_v4hi_v4hi, WMIATT);
> +
> +  iwmmx2_mbuiltin ("wmiabbn", di_ftype_di_v4hi_v4hi, WMIABBN);
> +  iwmmx2_mbuiltin ("wmiabtn", di_ftype_di_v4hi_v4hi, WMIABTN);
> +  iwmmx2_mbuiltin ("wmiatbn", di_ftype_di_v4hi_v4hi, WMIATBN);
> +  iwmmx2_mbuiltin ("wmiattn", di_ftype_di_v4hi_v4hi, WMIATTN);
> +
> +  iwmmx2_mbuiltin ("wmiawbb", di_ftype_di_v2si_v2si, WMIAWBB);
> +  iwmmx2_mbuiltin ("wmiawbt", di_ftype_di_v2si_v2si, WMIAWBT);
> +  iwmmx2_mbuiltin ("wmiawtb", di_ftype_di_v2si_v2si, WMIAWTB);
> +  iwmmx2_mbuiltin ("wmiawtt", di_ftype_di_v2si_v2si, WMIAWTT);
> +
> +  iwmmx2_mbuiltin ("wmiawbbn", di_ftype_di_v2si_v2si, WMIAWBBN);
> +  iwmmx2_mbuiltin ("wmiawbtn", di_ftype_di_v2si_v2si, WMIAWBTN);
> +  iwmmx2_mbuiltin ("wmiawtbn", di_ftype_di_v2si_v2si, WMIAWTBN);
> +  iwmmx2_mbuiltin ("wmiawttn", di_ftype_di_v2si_v2si, WMIAWTTN);
> +
> +  iwmmx2_mbuiltin ("wmerge", di_ftype_di_di_int, WMERGE);
> +
> +  iwmmx_mbuiltin ("tbcstb", v8qi_ftype_char, TBCSTB);
> +  iwmmx_mbuiltin ("tbcsth", v4hi_ftype_short, TBCSTH);
> +  iwmmx_mbuiltin ("tbcstw", v2si_ftype_int, TBCSTW);
> +
>  #undef iwmmx_mbuiltin
> +#undef iwmmx2_mbuiltin
>  }
>
>  static void
> @@ -21375,6 +21581,10 @@ arm_expand_builtin (tree exp,
>   enum machine_mode mode0;
>   enum machine_mode mode1;
>   enum machine_mode mode2;
> +  int opint;
> +  int selector;
> +  int mask;
> +  int imm;
>
>   if (fcode >= ARM_BUILTIN_NEON_BASE)
>     return arm_expand_neon_builtin (fcode, exp, target);
> @@ -21409,6 +21619,24 @@ arm_expand_builtin (tree exp,
>          error ("selector must be an immediate");
>          return gen_reg_rtx (tmode);
>        }
> +
> +      opint = INTVAL (op1);
> +      if (fcode == ARM_BUILTIN_TEXTRMSB || fcode == ARM_BUILTIN_TEXTRMUB)
> +       {
> +         if (opint > 7 || opint < 0)
> +           error ("the range of selector should be in 0 to 7");
> +       }
> +      else if (fcode == ARM_BUILTIN_TEXTRMSH || fcode == ARM_BUILTIN_TEXTRMUH)
> +       {
> +         if (opint > 3 || opint < 0)
> +           error ("the range of selector should be in 0 to 3");
> +       }
> +      else /* ARM_BUILTIN_TEXTRMSW || ARM_BUILTIN_TEXTRMUW.  */
> +       {
> +         if (opint > 1 || opint < 0)
> +           error ("the range of selector should be in 0 to 1");
> +       }
> +
>       if (target == 0
>          || GET_MODE (target) != tmode
>          || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
> @@ -21419,11 +21647,61 @@ arm_expand_builtin (tree exp,
>       emit_insn (pat);
>       return target;
>
> +    case ARM_BUILTIN_WALIGNI:
> +      /* If op2 is immediate, call walighi, else call walighr.  */
> +      arg0 = CALL_EXPR_ARG (exp, 0);
> +      arg1 = CALL_EXPR_ARG (exp, 1);
> +      arg2 = CALL_EXPR_ARG (exp, 2);
> +      op0 = expand_normal (arg0);
> +      op1 = expand_normal (arg1);
> +      op2 = expand_normal (arg2);
> +      if (GET_CODE (op2) == CONST_INT)

Replace this with CONST_INT_P everywhere in your patches.
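
For illustration, a rough sketch of the idiom. The types and macros below
are cut-down stand-ins so the fragment compiles outside GCC (the real ones
live in rtl.h), and const_selector_in_range_p is a made-up helper that is
not part of the patch:

```c
/* Minimal stand-ins so this compiles outside GCC; the real definitions
   are in rtl.h.  */
enum rtx_code { REG, CONST_INT };
struct rtx_def { enum rtx_code code; long value; };
typedef struct rtx_def *rtx;

#define GET_CODE(X)    ((X)->code)
#define INTVAL(X)      ((X)->value)
/* Same shape as the predicate macro in rtl.h.  */
#define CONST_INT_P(X) (GET_CODE (X) == CONST_INT)

/* Hypothetical helper, not part of the patch: nonzero when OP is a
   constant integer in the range [0, MAX].  */
static int
const_selector_in_range_p (rtx op, long max)
{
  return CONST_INT_P (op) && INTVAL (op) >= 0 && INTVAL (op) <= max;
}
```

The point is only that CONST_INT_P (op2) reads better than the open-coded
GET_CODE (op2) == CONST_INT comparison and matches what the rest of the
backend does.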

> +        {
> +         icode = CODE_FOR_iwmmxt_waligni;
> +          tmode = insn_data[icode].operand[0].mode;
> +         mode0 = insn_data[icode].operand[1].mode;
> +         mode1 = insn_data[icode].operand[2].mode;
> +         mode2 = insn_data[icode].operand[3].mode;
> +          if (!(*insn_data[icode].operand[1].predicate) (op0, mode0))
> +           op0 = copy_to_mode_reg (mode0, op0);
> +          if (!(*insn_data[icode].operand[2].predicate) (op1, mode1))
> +           op1 = copy_to_mode_reg (mode1, op1);
> +          gcc_assert ((*insn_data[icode].operand[3].predicate) (op2, mode2));
> +         selector = INTVAL (op2);
> +         if (selector > 7 || selector < 0)
> +           error ("the range of selector should be in 0 to 7");
> +       }
> +      else
> +        {
> +         icode = CODE_FOR_iwmmxt_walignr;
> +          tmode = insn_data[icode].operand[0].mode;
> +         mode0 = insn_data[icode].operand[1].mode;
> +         mode1 = insn_data[icode].operand[2].mode;
> +         mode2 = insn_data[icode].operand[3].mode;
> +          if (!(*insn_data[icode].operand[1].predicate) (op0, mode0))
> +           op0 = copy_to_mode_reg (mode0, op0);
> +          if (!(*insn_data[icode].operand[2].predicate) (op1, mode1))
> +           op1 = copy_to_mode_reg (mode1, op1);
> +          if (!(*insn_data[icode].operand[3].predicate) (op2, mode2))
> +           op2 = copy_to_mode_reg (mode2, op2);
> +       }
> +      if (target == 0
> +         || GET_MODE (target) != tmode
> +         || !(*insn_data[icode].operand[0].predicate) (target, tmode))
> +       target = gen_reg_rtx (tmode);
> +      pat = GEN_FCN (icode) (target, op0, op1, op2);
> +      if (!pat)
> +       return 0;
> +      emit_insn (pat);
> +      return target;
> +
>     case ARM_BUILTIN_TINSRB:
>     case ARM_BUILTIN_TINSRH:
>     case ARM_BUILTIN_TINSRW:
> +    case ARM_BUILTIN_WMERGE:
>       icode = (fcode == ARM_BUILTIN_TINSRB ? CODE_FOR_iwmmxt_tinsrb
>               : fcode == ARM_BUILTIN_TINSRH ? CODE_FOR_iwmmxt_tinsrh
> +              : fcode == ARM_BUILTIN_WMERGE ? CODE_FOR_iwmmxt_wmerge
>               : CODE_FOR_iwmmxt_tinsrw);
>       arg0 = CALL_EXPR_ARG (exp, 0);
>       arg1 = CALL_EXPR_ARG (exp, 1);
> @@ -21442,10 +21720,30 @@ arm_expand_builtin (tree exp,
>        op1 = copy_to_mode_reg (mode1, op1);
>       if (! (*insn_data[icode].operand[3].predicate) (op2, mode2))
>        {
> -         /* @@@ better error message */
>          error ("selector must be an immediate");
>          return const0_rtx;
>        }
> +      if (icode == CODE_FOR_iwmmxt_wmerge)
> +       {
> +         selector = INTVAL (op2);
> +         if (selector > 7 || selector < 0)
> +           error ("the range of selector should be in 0 to 7");
> +       }
> +      if ((icode == CODE_FOR_iwmmxt_tinsrb)
> +         || (icode == CODE_FOR_iwmmxt_tinsrh)
> +         || (icode == CODE_FOR_iwmmxt_tinsrw))
> +        {
> +         mask = 0x01;
> +         selector= INTVAL (op2);
> +         if (icode == CODE_FOR_iwmmxt_tinsrb && (selector < 0 || selector > 7))
> +           error ("the range of selector should be in 0 to 7");
> +         else if (icode == CODE_FOR_iwmmxt_tinsrh && (selector < 0 ||selector > 3))
> +           error ("the range of selector should be in 0 to 3");
> +         else if (icode == CODE_FOR_iwmmxt_tinsrw && (selector < 0 ||selector > 1))
> +           error ("the range of selector should be in 0 to 1");
> +         mask <<= selector;
> +         op2 = gen_rtx_CONST_INT (SImode, mask);
> +       }
>       if (target == 0
>          || GET_MODE (target) != tmode
>          || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
> @@ -21456,19 +21754,42 @@ arm_expand_builtin (tree exp,
>       emit_insn (pat);
>       return target;
>
> -    case ARM_BUILTIN_SETWCX:
> +    case ARM_BUILTIN_SETWCGR0:
> +    case ARM_BUILTIN_SETWCGR1:
> +    case ARM_BUILTIN_SETWCGR2:
> +    case ARM_BUILTIN_SETWCGR3:
> +      icode = (fcode == ARM_BUILTIN_SETWCGR0 ? CODE_FOR_iwmmxt_setwcgr0
> +              : fcode == ARM_BUILTIN_SETWCGR1 ? CODE_FOR_iwmmxt_setwcgr1
> +              : fcode == ARM_BUILTIN_SETWCGR2 ? CODE_FOR_iwmmxt_setwcgr2
> +              : CODE_FOR_iwmmxt_setwcgr3);
>       arg0 = CALL_EXPR_ARG (exp, 0);
> -      arg1 = CALL_EXPR_ARG (exp, 1);
> -      op0 = force_reg (SImode, expand_normal (arg0));
> -      op1 = expand_normal (arg1);
> -      emit_insn (gen_iwmmxt_tmcr (op1, op0));
> +      op0 = expand_normal (arg0);
> +      mode0 = insn_data[icode].operand[0].mode;
> +      if (!(*insn_data[icode].operand[0].predicate) (op0, mode0))
> +        op0 = copy_to_mode_reg (mode0, op0);
> +      pat = GEN_FCN (icode) (op0);
> +      if (!pat)
> +       return 0;
> +      emit_insn (pat);
>       return 0;
>
> -    case ARM_BUILTIN_GETWCX:
> -      arg0 = CALL_EXPR_ARG (exp, 0);
> -      op0 = expand_normal (arg0);
> -      target = gen_reg_rtx (SImode);
> -      emit_insn (gen_iwmmxt_tmrc (target, op0));
> +    case ARM_BUILTIN_GETWCGR0:
> +    case ARM_BUILTIN_GETWCGR1:
> +    case ARM_BUILTIN_GETWCGR2:
> +    case ARM_BUILTIN_GETWCGR3:
> +      icode = (fcode == ARM_BUILTIN_GETWCGR0 ? CODE_FOR_iwmmxt_getwcgr0
> +              : fcode == ARM_BUILTIN_GETWCGR1 ? CODE_FOR_iwmmxt_getwcgr1
> +              : fcode == ARM_BUILTIN_GETWCGR2 ? CODE_FOR_iwmmxt_getwcgr2
> +              : CODE_FOR_iwmmxt_getwcgr3);
> +      tmode = insn_data[icode].operand[0].mode;
> +      if (target == 0
> +         || GET_MODE (target) != tmode
> +         || !(*insn_data[icode].operand[0].predicate) (target, tmode))
> +        target = gen_reg_rtx (tmode);
> +      pat = GEN_FCN (icode) (target);
> +      if (!pat)
> +        return 0;
> +      emit_insn (pat);
>       return target;
>
>     case ARM_BUILTIN_WSHUFH:
> @@ -21485,10 +21806,12 @@ arm_expand_builtin (tree exp,
>        op0 = copy_to_mode_reg (mode1, op0);
>       if (! (*insn_data[icode].operand[2].predicate) (op1, mode2))
>        {
> -         /* @@@ better error message */
>          error ("mask must be an immediate");
>          return const0_rtx;
>        }
> +      selector = INTVAL (op1);
> +      if (selector < 0 || selector > 255)
> +       error ("the range of mask should be in 0 to 255");
>       if (target == 0
>          || GET_MODE (target) != tmode
>          || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
> @@ -21499,10 +21822,18 @@ arm_expand_builtin (tree exp,
>       emit_insn (pat);
>       return target;
>
> -    case ARM_BUILTIN_WSADB:
> -      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wsadb, exp, target);
> -    case ARM_BUILTIN_WSADH:
> -      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wsadh, exp, target);
> +    case ARM_BUILTIN_WMADDS:
> +      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wmadds, exp, target);
> +    case ARM_BUILTIN_WMADDSX:
> +      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wmaddsx, exp, target);
> +    case ARM_BUILTIN_WMADDSN:
> +      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wmaddsn, exp, target);
> +    case ARM_BUILTIN_WMADDU:
> +      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wmaddu, exp, target);
> +    case ARM_BUILTIN_WMADDUX:
> +      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wmaddux, exp, target);
> +    case ARM_BUILTIN_WMADDUN:
> +      return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wmaddun, exp, target);
>     case ARM_BUILTIN_WSADBZ:
>       return arm_expand_binop_builtin (CODE_FOR_iwmmxt_wsadbz, exp, target);
>     case ARM_BUILTIN_WSADHZ:
> @@ -21511,13 +21842,38 @@ arm_expand_builtin (tree exp,
>       /* Several three-argument builtins.  */
>     case ARM_BUILTIN_WMACS:
>     case ARM_BUILTIN_WMACU:
> -    case ARM_BUILTIN_WALIGN:
>     case ARM_BUILTIN_TMIA:
>     case ARM_BUILTIN_TMIAPH:
>     case ARM_BUILTIN_TMIATT:
>     case ARM_BUILTIN_TMIATB:
>     case ARM_BUILTIN_TMIABT:
>     case ARM_BUILTIN_TMIABB:
> +    case ARM_BUILTIN_WQMIABB:
> +    case ARM_BUILTIN_WQMIABT:
> +    case ARM_BUILTIN_WQMIATB:
> +    case ARM_BUILTIN_WQMIATT:
> +    case ARM_BUILTIN_WQMIABBN:
> +    case ARM_BUILTIN_WQMIABTN:
> +    case ARM_BUILTIN_WQMIATBN:
> +    case ARM_BUILTIN_WQMIATTN:
> +    case ARM_BUILTIN_WMIABB:
> +    case ARM_BUILTIN_WMIABT:
> +    case ARM_BUILTIN_WMIATB:
> +    case ARM_BUILTIN_WMIATT:
> +    case ARM_BUILTIN_WMIABBN:
> +    case ARM_BUILTIN_WMIABTN:
> +    case ARM_BUILTIN_WMIATBN:
> +    case ARM_BUILTIN_WMIATTN:
> +    case ARM_BUILTIN_WMIAWBB:
> +    case ARM_BUILTIN_WMIAWBT:
> +    case ARM_BUILTIN_WMIAWTB:
> +    case ARM_BUILTIN_WMIAWTT:
> +    case ARM_BUILTIN_WMIAWBBN:
> +    case ARM_BUILTIN_WMIAWBTN:
> +    case ARM_BUILTIN_WMIAWTBN:
> +    case ARM_BUILTIN_WMIAWTTN:
> +    case ARM_BUILTIN_WSADB:
> +    case ARM_BUILTIN_WSADH:
>       icode = (fcode == ARM_BUILTIN_WMACS ? CODE_FOR_iwmmxt_wmacs
>               : fcode == ARM_BUILTIN_WMACU ? CODE_FOR_iwmmxt_wmacu
>               : fcode == ARM_BUILTIN_TMIA ? CODE_FOR_iwmmxt_tmia
> @@ -21526,7 +21882,32 @@ arm_expand_builtin (tree exp,
>               : fcode == ARM_BUILTIN_TMIABT ? CODE_FOR_iwmmxt_tmiabt
>               : fcode == ARM_BUILTIN_TMIATB ? CODE_FOR_iwmmxt_tmiatb
>               : fcode == ARM_BUILTIN_TMIATT ? CODE_FOR_iwmmxt_tmiatt
> -              : CODE_FOR_iwmmxt_walign);
> +              : fcode == ARM_BUILTIN_WQMIABB ? CODE_FOR_iwmmxt_wqmiabb
> +              : fcode == ARM_BUILTIN_WQMIABT ? CODE_FOR_iwmmxt_wqmiabt
> +              : fcode == ARM_BUILTIN_WQMIATB ? CODE_FOR_iwmmxt_wqmiatb
> +              : fcode == ARM_BUILTIN_WQMIATT ? CODE_FOR_iwmmxt_wqmiatt
> +              : fcode == ARM_BUILTIN_WQMIABBN ? CODE_FOR_iwmmxt_wqmiabbn
> +              : fcode == ARM_BUILTIN_WQMIABTN ? CODE_FOR_iwmmxt_wqmiabtn
> +              : fcode == ARM_BUILTIN_WQMIATBN ? CODE_FOR_iwmmxt_wqmiatbn
> +              : fcode == ARM_BUILTIN_WQMIATTN ? CODE_FOR_iwmmxt_wqmiattn
> +              : fcode == ARM_BUILTIN_WMIABB ? CODE_FOR_iwmmxt_wmiabb
> +              : fcode == ARM_BUILTIN_WMIABT ? CODE_FOR_iwmmxt_wmiabt
> +              : fcode == ARM_BUILTIN_WMIATB ? CODE_FOR_iwmmxt_wmiatb
> +              : fcode == ARM_BUILTIN_WMIATT ? CODE_FOR_iwmmxt_wmiatt
> +              : fcode == ARM_BUILTIN_WMIABBN ? CODE_FOR_iwmmxt_wmiabbn
> +              : fcode == ARM_BUILTIN_WMIABTN ? CODE_FOR_iwmmxt_wmiabtn
> +              : fcode == ARM_BUILTIN_WMIATBN ? CODE_FOR_iwmmxt_wmiatbn
> +              : fcode == ARM_BUILTIN_WMIATTN ? CODE_FOR_iwmmxt_wmiattn
> +              : fcode == ARM_BUILTIN_WMIAWBB ? CODE_FOR_iwmmxt_wmiawbb
> +              : fcode == ARM_BUILTIN_WMIAWBT ? CODE_FOR_iwmmxt_wmiawbt
> +              : fcode == ARM_BUILTIN_WMIAWTB ? CODE_FOR_iwmmxt_wmiawtb
> +              : fcode == ARM_BUILTIN_WMIAWTT ? CODE_FOR_iwmmxt_wmiawtt
> +              : fcode == ARM_BUILTIN_WMIAWBBN ? CODE_FOR_iwmmxt_wmiawbbn
> +              : fcode == ARM_BUILTIN_WMIAWBTN ? CODE_FOR_iwmmxt_wmiawbtn
> +              : fcode == ARM_BUILTIN_WMIAWTBN ? CODE_FOR_iwmmxt_wmiawtbn
> +              : fcode == ARM_BUILTIN_WMIAWTTN ? CODE_FOR_iwmmxt_wmiawttn
> +              : fcode == ARM_BUILTIN_WSADB ? CODE_FOR_iwmmxt_wsadb
> +              : CODE_FOR_iwmmxt_wsadh);

Can this chunk be driven from a table instead? Having a nested sequence of
ternary operations is just too gross.
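
Something along these lines would do, as a rough sketch only. The enums
here are cut-down stand-ins for the real ARM_BUILTIN_* and CODE_FOR_*
values so the fragment compiles on its own, and the struct, table and
function names are made up for illustration:

```c
#include <stddef.h>

/* Stand-ins for a few of the enumerators used in the patch.  */
enum arm_builtins
{
  ARM_BUILTIN_WMACS,
  ARM_BUILTIN_WMACU,
  ARM_BUILTIN_WSADB,
  ARM_BUILTIN_WSADH,
  ARM_BUILTIN_MAX
};

enum insn_code
{
  CODE_FOR_nothing,
  CODE_FOR_iwmmxt_wmacs,
  CODE_FOR_iwmmxt_wmacu,
  CODE_FOR_iwmmxt_wsadb,
  CODE_FOR_iwmmxt_wsadh
};

/* Hypothetical table entry: one builtin and the insn that expands it.  */
struct builtin_icode_map
{
  enum arm_builtins fcode;
  enum insn_code icode;
};

static const struct builtin_icode_map three_arg_map[] =
{
  { ARM_BUILTIN_WMACS, CODE_FOR_iwmmxt_wmacs },
  { ARM_BUILTIN_WMACU, CODE_FOR_iwmmxt_wmacu },
  { ARM_BUILTIN_WSADB, CODE_FOR_iwmmxt_wsadb },
  { ARM_BUILTIN_WSADH, CODE_FOR_iwmmxt_wsadh }
};

/* Return the insn code for FCODE, or CODE_FOR_nothing if FCODE is not
   in the table.  */
static enum insn_code
lookup_three_arg_icode (enum arm_builtins fcode)
{
  size_t i;

  for (i = 0; i < sizeof three_arg_map / sizeof three_arg_map[0]; i++)
    if (three_arg_map[i].fcode == fcode)
      return three_arg_map[i].icode;
  return CODE_FOR_nothing;
}
```

With that, the case body reduces to icode = lookup_three_arg_icode (fcode),
and adding a builtin means adding one table row rather than another arm to
the conditional.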

>       arg0 = CALL_EXPR_ARG (exp, 0);
>       arg1 = CALL_EXPR_ARG (exp, 1);
>       arg2 = CALL_EXPR_ARG (exp, 2);
> @@ -21559,6 +21940,123 @@ arm_expand_builtin (tree exp,
>       emit_insn (gen_iwmmxt_clrdi (target));
>       return target;
>
> +    case ARM_BUILTIN_WSRLHI:
> +    case ARM_BUILTIN_WSRLWI:
> +    case ARM_BUILTIN_WSRLDI:
> +    case ARM_BUILTIN_WSLLHI:
> +    case ARM_BUILTIN_WSLLWI:
> +    case ARM_BUILTIN_WSLLDI:
> +    case ARM_BUILTIN_WSRAHI:
> +    case ARM_BUILTIN_WSRAWI:
> +    case ARM_BUILTIN_WSRADI:
> +    case ARM_BUILTIN_WRORHI:
> +    case ARM_BUILTIN_WRORWI:
> +    case ARM_BUILTIN_WRORDI:
> +    case ARM_BUILTIN_WSRLH:
> +    case ARM_BUILTIN_WSRLW:
> +    case ARM_BUILTIN_WSRLD:
> +    case ARM_BUILTIN_WSLLH:
> +    case ARM_BUILTIN_WSLLW:
> +    case ARM_BUILTIN_WSLLD:
> +    case ARM_BUILTIN_WSRAH:
> +    case ARM_BUILTIN_WSRAW:
> +    case ARM_BUILTIN_WSRAD:
> +    case ARM_BUILTIN_WRORH:
> +    case ARM_BUILTIN_WRORW:
> +    case ARM_BUILTIN_WRORD:
> +      icode = (fcode == ARM_BUILTIN_WSRLHI ? CODE_FOR_lshrv4hi3_iwmmxt
> +              : fcode == ARM_BUILTIN_WSRLWI ? CODE_FOR_lshrv2si3_iwmmxt
> +              : fcode == ARM_BUILTIN_WSRLDI ? CODE_FOR_lshrdi3_iwmmxt
> +              : fcode == ARM_BUILTIN_WSLLHI ? CODE_FOR_ashlv4hi3_iwmmxt
> +              : fcode == ARM_BUILTIN_WSLLWI ? CODE_FOR_ashlv2si3_iwmmxt
> +              : fcode == ARM_BUILTIN_WSLLDI ? CODE_FOR_ashldi3_iwmmxt
> +              : fcode == ARM_BUILTIN_WSRAHI ? CODE_FOR_ashrv4hi3_iwmmxt
> +              : fcode == ARM_BUILTIN_WSRAWI ? CODE_FOR_ashrv2si3_iwmmxt
> +              : fcode == ARM_BUILTIN_WSRADI ? CODE_FOR_ashrdi3_iwmmxt
> +              : fcode == ARM_BUILTIN_WRORHI ? CODE_FOR_rorv4hi3
> +              : fcode == ARM_BUILTIN_WRORWI ? CODE_FOR_rorv2si3
> +              : fcode == ARM_BUILTIN_WRORDI ? CODE_FOR_rordi3
> +              : fcode == ARM_BUILTIN_WSRLH  ? CODE_FOR_lshrv4hi3_di
> +              : fcode == ARM_BUILTIN_WSRLW  ? CODE_FOR_lshrv2si3_di
> +              : fcode == ARM_BUILTIN_WSRLD  ? CODE_FOR_lshrdi3_di
> +              : fcode == ARM_BUILTIN_WSLLH  ? CODE_FOR_ashlv4hi3_di
> +              : fcode == ARM_BUILTIN_WSLLW  ? CODE_FOR_ashlv2si3_di
> +              : fcode == ARM_BUILTIN_WSLLD  ? CODE_FOR_ashldi3_di
> +              : fcode == ARM_BUILTIN_WSRAH  ? CODE_FOR_ashrv4hi3_di
> +              : fcode == ARM_BUILTIN_WSRAW  ? CODE_FOR_ashrv2si3_di
> +              : fcode == ARM_BUILTIN_WSRAD  ? CODE_FOR_ashrdi3_di
> +              : fcode == ARM_BUILTIN_WRORH  ? CODE_FOR_rorv4hi3_di
> +              : fcode == ARM_BUILTIN_WRORW  ? CODE_FOR_rorv2si3_di
> +              : fcode == ARM_BUILTIN_WRORD  ? CODE_FOR_rordi3_di
> +              : CODE_FOR_nothing);
> +      arg1 = CALL_EXPR_ARG (exp, 1);
> +      op1 = expand_normal (arg1);
> +      if (GET_MODE (op1) == VOIDmode)
> +       {
> +         imm = INTVAL (op1);
> +         if ((fcode == ARM_BUILTIN_WRORHI || fcode == ARM_BUILTIN_WRORWI
> +              || fcode == ARM_BUILTIN_WRORH || fcode == ARM_BUILTIN_WRORW)
> +             && (imm < 0 || imm > 32))
> +           {
> +             if (fcode == ARM_BUILTIN_WRORHI)
> +               error ("the range of count should be in 0 to 32.  please check the intrinsic _mm_rori_pi16 in code.");
> +             else if (fcode == ARM_BUILTIN_WRORWI)
> +               error ("the range of count should be in 0 to 32.  please check the intrinsic _mm_rori_pi32 in code.");
> +             else if (fcode == ARM_BUILTIN_WRORH)
> +               error ("the range of count should be in 0 to 32.  please check the intrinsic _mm_ror_pi16 in code.");
> +             else
> +               error ("the range of count should be in 0 to 32.  please check the intrinsic _mm_ror_pi32 in code.");
> +           }
> +         else if ((fcode == ARM_BUILTIN_WRORDI || fcode == ARM_BUILTIN_WRORD)
> +                  && (imm < 0 || imm > 64))
> +           {
> +             if (fcode == ARM_BUILTIN_WRORDI)
> +               error ("the range of count should be in 0 to 64.  please check the intrinsic _mm_rori_si64 in code.");
> +             else
> +               error ("the range of count should be in 0 to 64.  please check the intrinsic _mm_ror_si64 in code.");
> +           }
> +         else if (imm < 0)
> +           {
> +             if (fcode == ARM_BUILTIN_WSRLHI)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_srli_pi16 in code.");
> +             else if (fcode == ARM_BUILTIN_WSRLWI)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_srli_pi32 in code.");
> +             else if (fcode == ARM_BUILTIN_WSRLDI)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_srli_si64 in code.");
> +             else if (fcode == ARM_BUILTIN_WSLLHI)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_slli_pi16 in code.");
> +             else if (fcode == ARM_BUILTIN_WSLLWI)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_slli_pi32 in code.");
> +             else if (fcode == ARM_BUILTIN_WSLLDI)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_slli_si64 in code.");
> +             else if (fcode == ARM_BUILTIN_WSRAHI)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_srai_pi16 in code.");
> +             else if (fcode == ARM_BUILTIN_WSRAWI)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_srai_pi32 in code.");
> +             else if (fcode == ARM_BUILTIN_WSRADI)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_srai_si64 in code.");
> +             else if (fcode == ARM_BUILTIN_WSRLH)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_srl_pi16 in code.");
> +             else if (fcode == ARM_BUILTIN_WSRLW)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_srl_pi32 in code.");
> +             else if (fcode == ARM_BUILTIN_WSRLD)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_srl_si64 in code.");
> +             else if (fcode == ARM_BUILTIN_WSLLH)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_sll_pi16 in code.");
> +             else if (fcode == ARM_BUILTIN_WSLLW)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_sll_pi32 in code.");
> +             else if (fcode == ARM_BUILTIN_WSLLD)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_sll_si64 in code.");
> +             else if (fcode == ARM_BUILTIN_WSRAH)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_sra_pi16 in code.");
> +             else if (fcode == ARM_BUILTIN_WSRAW)
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_sra_pi32 in code.");
> +             else
> +               error ("the count should be no less than 0.  please check the intrinsic _mm_sra_si64 in code.");

Ugh. I'd really rather have a nicer way of doing this. Wouldn't it
make more sense to extract this information from a table rather than
have such a sequence of nested ifs?
Is there a way we can get to the location of the expansion and give
better diagnostics using error_at? Can you try to organize the table
above to have this information as well, and just index the error
string from there?
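
A rough sketch of the table-driven shape suggested here, as standalone C
rather than actual arm.c code. The enum values, struct name and helper
name are hypothetical stand-ins (the real ARM_BUILTIN_* codes and the
real diagnostic calls live in the compiler); only the count limits and
intrinsic names are taken from the quoted patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a few ARM_BUILTIN_* function codes.  */
enum builtin_code
{
  BUILTIN_WRORHI,
  BUILTIN_WRORWI,
  BUILTIN_WRORDI,
  BUILTIN_WSRLHI,
  BUILTIN_OTHER
};

/* One row per intrinsic whose count argument needs checking.  */
struct count_check
{
  enum builtin_code fcode;
  const char *intrinsic;  /* User-visible name for the diagnostic.  */
  int max_count;          /* Upper bound, or -1 if only "no less than 0".  */
};

static const struct count_check count_checks[] =
{
  { BUILTIN_WRORHI, "_mm_rori_pi16", 32 },
  { BUILTIN_WRORWI, "_mm_rori_pi32", 32 },
  { BUILTIN_WRORDI, "_mm_rori_si64", 64 },
  { BUILTIN_WSRLHI, "_mm_srli_pi16", -1 },
};

/* Return the name of the offending intrinsic if IMM is out of range for
   FCODE, or NULL if the count is acceptable or FCODE is not in the table.
   A real implementation would emit the diagnostic here instead.  */
static const char *
check_count (enum builtin_code fcode, long imm)
{
  size_t i;

  for (i = 0; i < sizeof count_checks / sizeof count_checks[0]; i++)
    {
      const struct count_check *c = &count_checks[i];

      if (c->fcode != fcode)
        continue;
      assert (c->intrinsic != NULL);
      if (imm < 0 || (c->max_count >= 0 && imm > c->max_count))
        return c->intrinsic;
      return NULL;
    }
  return NULL;
}
```

With this layout, adding an intrinsic is one new table row, and the
error text can be built from the row instead of being repeated per case.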

Also it would be nice to have some execute tests for some of these
intrinsics in the testsuite.

regards,
Ramana


> +           }
> +       }
> +      return arm_expand_binop_builtin (icode, exp, target);
> +
>     case ARM_BUILTIN_THREAD_POINTER:
>       return arm_load_tp (target);
>
> --
> 1.7.3.4
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support
  2012-05-29  4:13 [PATCH ARM iWMMXt 0/5] Improve iWMMXt support Matt Turner
                   ` (4 preceding siblings ...)
  2012-05-29  4:15 ` [PATCH ARM iWMMXt 2/5] intrinsic head file change Matt Turner
@ 2012-06-06 11:59 ` Ramana Radhakrishnan
  2012-06-11  9:24 ` nick clifton
  2012-06-13  7:36 ` nick clifton
  7 siblings, 0 replies; 29+ messages in thread
From: Ramana Radhakrishnan @ 2012-06-06 11:59 UTC (permalink / raw)
  To: Matt Turner
  Cc: gcc-patches, Richard Earnshaw, Nick Clifton, Paul Brook, Xinyu Qi

On 29 May 2012 05:13, Matt Turner <mattst88@gmail.com> wrote:
>
> This series was written by Marvell and sent by Xinyu Qi <xyqi@marvell.com>
> a number of times in the last year.
>
> We (One Laptop per Child) need these patches for reasonable iWMMXt support
> and performance. Without them, logical and shift intrinsics cause ICEs,
> see PR 35294 and its duplicates 36798 and 36966.
>
> The software compositing library pixman uses MMX intrinsics to optimize
> various compositing routines. The following are the minimum execution times
> of cairo-perf-trace graphics work loads without and with iWMMXt-optimized
> pixman for the image and image16 backends (32-bpp and 16-bpp respectively).
>
>                             image               image16
>           evolution   33.492 ->  29.590    30.334 ->  24.751
> firefox-planet-gnome  191.465 -> 173.835   211.297 -> 187.570
> gnome-system-monitor   51.956 ->  44.549    52.272 ->  40.525
>  gnome-terminal-vim   53.625 ->  54.554    47.593 ->  47.341
>      grads-heat-map    4.439 ->   4.165     4.548 ->   4.624
>       midori-zoomed   38.033 ->  28.500    38.576 ->  26.937
>             poppler   41.096 ->  31.949    41.230 ->  31.749
>  swfdec-giant-steps   20.062 ->  16.912    28.294 ->  17.286
>      swfdec-youtube   42.281 ->  37.335    52.848 ->  47.053
>   xfce4-terminal-a1   64.311 ->  51.011    62.592 ->  51.191
>
> We have cleaned up some white-space issues with the patches and fixed a
> small bug in patch 4/5 since the last time they were posted in December
> (added tandc,textrc,torc,torvsc to the "wtype" attribute)
>
> Please commit them for 4.8.

You do not mention how these patches have been tested with trunk after
you've rebased them - I understand that you are using them in your
port but can you specify how these were tested and what the results
looked like ?

Ramana

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH ARM iWMMXt 2/5] intrinsic head file change
  2012-05-29  4:15 ` [PATCH ARM iWMMXt 2/5] intrinsic head file change Matt Turner
@ 2012-06-06 12:22   ` Ramana Radhakrishnan
  0 siblings, 0 replies; 29+ messages in thread
From: Ramana Radhakrishnan @ 2012-06-06 12:22 UTC (permalink / raw)
  To: Matt Turner
  Cc: gcc-patches, Ramana Radhakrishnan, Richard Earnshaw,
	Nick Clifton, Paul Brook, Xinyu Qi

I've only had a brief look at this and have pointed out some stylistic
issues that I noticed. I would like another set of eyes on this and
the next patch.


On 29 May 2012 05:13, Matt Turner <mattst88@gmail.com> wrote:
> From: Xinyu Qi <xyqi@marvell.com>
>
>        gcc/
>        * config/arm/mmintrin.h: Use __IWMMXT__ to enable iWMMXt intrinsics.
>        Use __IWMMXT2__ to enable iWMMXt2 intrinsics.
>        Use C name-mangling for intrinsics.
>        (__v8qi): Redefine.
>        (_mm_cvtsi32_si64, _mm_andnot_si64, _mm_sad_pu8): Revise.
>        (_mm_sad_pu16, _mm_align_si64, _mm_setwcx, _mm_getwcx): Likewise.
>        (_m_from_int): Likewise.
>        (_mm_sada_pu8, _mm_sada_pu16): New intrinsic.
>        (_mm_alignr0_si64, _mm_alignr1_si64, _mm_alignr2_si64): Likewise.
>        (_mm_alignr3_si64, _mm_tandcb, _mm_tandch, _mm_tandcw): Likewise.
>        (_mm_textrcb, _mm_textrch, _mm_textrcw, _mm_torcb): Likewise.
>        (_mm_torch, _mm_torcw, _mm_tbcst_pi8, _mm_tbcst_pi16): Likewise.
>        (_mm_tbcst_pi32): Likewise.
>        (_mm_abs_pi8, _mm_abs_pi16, _mm_abs_pi32): New iWMMXt2 intrinsic.
>        (_mm_addsubhx_pi16, _mm_absdiff_pu8, _mm_absdiff_pu16): Likewise.
>        (_mm_absdiff_pu32, _mm_addc_pu16, _mm_addc_pu32): Likewise.
>        (_mm_avg4_pu8, _mm_avg4r_pu8, _mm_maddx_pi16, _mm_maddx_pu16): Likewise.
>        (_mm_msub_pi16, _mm_msub_pu16, _mm_mulhi_pi32): Likewise.
>        (_mm_mulhi_pu32, _mm_mulhir_pi16, _mm_mulhir_pi32): Likewise.
>        (_mm_mulhir_pu16, _mm_mulhir_pu32, _mm_mullo_pi32): Likewise.
>        (_mm_qmulm_pi16, _mm_qmulm_pi32, _mm_qmulmr_pi16): Likewise.
>        (_mm_qmulmr_pi32, _mm_subaddhx_pi16, _mm_addbhusl_pu8): Likewise.
>        (_mm_addbhusm_pu8, _mm_qmiabb_pi32, _mm_qmiabbn_pi32): Likewise.
>        (_mm_qmiabt_pi32, _mm_qmiabtn_pi32, _mm_qmiatb_pi32): Likewise.
>        (_mm_qmiatbn_pi32, _mm_qmiatt_pi32, _mm_qmiattn_pi32): Likewise.
>        (_mm_wmiabb_si64, _mm_wmiabbn_si64, _mm_wmiabt_si64): Likewise.
>        (_mm_wmiabtn_si64, _mm_wmiatb_si64, _mm_wmiatbn_si64): Likewise.
>        (_mm_wmiatt_si64, _mm_wmiattn_si64, _mm_wmiawbb_si64): Likewise.
>        (_mm_wmiawbbn_si64, _mm_wmiawbt_si64, _mm_wmiawbtn_si64): Likewise.
>        (_mm_wmiawtb_si64, _mm_wmiawtbn_si64, _mm_wmiawtt_si64): Likewise.
>        (_mm_wmiawttn_si64, _mm_merge_si64): Likewise.
>        (_mm_torvscb, _mm_torvsch, _mm_torvscw): Likewise.
>        (_m_to_int): New define.
> ---
>  gcc/config/arm/mmintrin.h |  649 ++++++++++++++++++++++++++++++++++++++++++---
>  1 files changed, 614 insertions(+), 35 deletions(-)
>
> diff --git a/gcc/config/arm/mmintrin.h b/gcc/config/arm/mmintrin.h
> index 2cc500d..0fe551d 100644
> --- a/gcc/config/arm/mmintrin.h
> +++ b/gcc/config/arm/mmintrin.h
> @@ -24,16 +24,30 @@
>  #ifndef _MMINTRIN_H_INCLUDED
>  #define _MMINTRIN_H_INCLUDED
>
> +#ifndef __IWMMXT__
> +#error You must enable WMMX/WMMX2 instructions (e.g. -march=iwmmxt or -march=iwmmxt2) to use iWMMXt/iWMMXt2 intrinsics
> +#else
> +
> +#ifndef __IWMMXT2__
> +#warning You only enable iWMMXt intrinsics. Extended iWMMXt2 intrinsics available only if WMMX2 instructions enabled (e.g. -march=iwmmxt2)
> +#endif
> +

Extra newline.

> +
> +#if defined __cplusplus
> +extern "C" { /* Begin "C" */
> +/* Intrinsics use C name-mangling.  */
> +#endif /* __cplusplus */
> +
>  /* The data type intended for user use.  */
>  typedef unsigned long long __m64, __int64;
>
>  /* Internal data types for implementing the intrinsics.  */
>  typedef int __v2si __attribute__ ((vector_size (8)));
>  typedef short __v4hi __attribute__ ((vector_size (8)));
> -typedef char __v8qi __attribute__ ((vector_size (8)));
> +typedef signed char __v8qi __attribute__ ((vector_size (8)));
>
>  /* "Convert" __m64 and __int64 into each other.  */
> -static __inline __m64
> +static __inline __m64
>  _mm_cvtsi64_m64 (__int64 __i)
>  {
>   return __i;
> @@ -54,7 +68,7 @@ _mm_cvtsi64_si32 (__int64 __i)
>  static __inline __int64
>  _mm_cvtsi32_si64 (int __i)
>  {
> -  return __i;
> +  return (__i & 0xffffffff);
>  }
>
>  /* Pack the four 16-bit values from M1 into the lower four 8-bit values of
> @@ -603,7 +617,7 @@ _mm_and_si64 (__m64 __m1, __m64 __m2)
>  static __inline __m64
>  _mm_andnot_si64 (__m64 __m1, __m64 __m2)
>  {
> -  return __builtin_arm_wandn (__m1, __m2);
> +  return __builtin_arm_wandn (__m2, __m1);
>  }
>
>  /* Bit-wise inclusive OR the 64-bit values in M1 and M2.  */
> @@ -935,7 +949,13 @@ _mm_avg2_pu16 (__m64 __A, __m64 __B)
>  static __inline __m64
>  _mm_sad_pu8 (__m64 __A, __m64 __B)
>  {
> -  return (__m64) __builtin_arm_wsadb ((__v8qi)__A, (__v8qi)__B);
> +  return (__m64) __builtin_arm_wsadbz ((__v8qi)__A, (__v8qi)__B);
> +}
> +
> +static __inline __m64
> +_mm_sada_pu8 (__m64 __A, __m64 __B, __m64 __C)
> +{
> +  return (__m64) __builtin_arm_wsadb ((__v2si)__A, (__v8qi)__B, (__v8qi)__C);
>  }
>
>  /* Compute the sum of the absolute differences of the unsigned 16-bit
> @@ -944,9 +964,16 @@ _mm_sad_pu8 (__m64 __A, __m64 __B)
>  static __inline __m64
>  _mm_sad_pu16 (__m64 __A, __m64 __B)
>  {
> -  return (__m64) __builtin_arm_wsadh ((__v4hi)__A, (__v4hi)__B);
> +  return (__m64) __builtin_arm_wsadhz ((__v4hi)__A, (__v4hi)__B);
>  }
>
> +static __inline __m64
> +_mm_sada_pu16 (__m64 __A, __m64 __B, __m64 __C)
> +{
> +  return (__m64) __builtin_arm_wsadh ((__v2si)__A, (__v4hi)__B, (__v4hi)__C);
> +}
> +
> +
>  /* Compute the sum of the absolute differences of the unsigned 8-bit
>    values in A and B.  Return the value in the lower 16-bit word; the
>    upper words are cleared.  */
> @@ -965,11 +992,8 @@ _mm_sadz_pu16 (__m64 __A, __m64 __B)
>   return (__m64) __builtin_arm_wsadhz ((__v4hi)__A, (__v4hi)__B);
>  }
>
> -static __inline __m64
> -_mm_align_si64 (__m64 __A, __m64 __B, int __C)
> -{
> -  return (__m64) __builtin_arm_walign ((__v8qi)__A, (__v8qi)__B, __C);
> -}
> +#define _mm_align_si64(__A,__B, N) \
> +  (__m64) __builtin_arm_walign ((__v8qi) (__A),(__v8qi) (__B), (N))
>
>  /* Creates a 64-bit zero.  */
>  static __inline __m64
> @@ -987,42 +1011,76 @@ _mm_setwcx (const int __value, const int __regno)
>  {
>   switch (__regno)
>     {
> -    case 0:  __builtin_arm_setwcx (__value, 0); break;
> -    case 1:  __builtin_arm_setwcx (__value, 1); break;
> -    case 2:  __builtin_arm_setwcx (__value, 2); break;
> -    case 3:  __builtin_arm_setwcx (__value, 3); break;
> -    case 8:  __builtin_arm_setwcx (__value, 8); break;
> -    case 9:  __builtin_arm_setwcx (__value, 9); break;
> -    case 10: __builtin_arm_setwcx (__value, 10); break;
> -    case 11: __builtin_arm_setwcx (__value, 11); break;
> -    default: break;
> +    case 0:
> +      __asm __volatile ("tmcr wcid, %0" :: "r"(__value));
> +      break;
> +    case 1:
> +      __asm __volatile ("tmcr wcon, %0" :: "r"(__value));
> +      break;
> +    case 2:
> +      __asm __volatile ("tmcr wcssf, %0" :: "r"(__value));
> +      break;
> +    case 3:
> +      __asm __volatile ("tmcr wcasf, %0" :: "r"(__value));
> +      break;
> +    case 8:
> +      __builtin_arm_setwcgr0 (__value);
> +      break;
> +    case 9:
> +      __builtin_arm_setwcgr1 (__value);
> +      break;
> +    case 10:
> +      __builtin_arm_setwcgr2 (__value);
> +      break;
> +    case 11:
> +      __builtin_arm_setwcgr3 (__value);
> +      break;
> +    default:
> +      break;
>     }
>  }
>
>  static __inline int
>  _mm_getwcx (const int __regno)
>  {
> +  int __value;
>   switch (__regno)
>     {
> -    case 0:  return __builtin_arm_getwcx (0);
> -    case 1:  return __builtin_arm_getwcx (1);
> -    case 2:  return __builtin_arm_getwcx (2);
> -    case 3:  return __builtin_arm_getwcx (3);
> -    case 8:  return __builtin_arm_getwcx (8);
> -    case 9:  return __builtin_arm_getwcx (9);
> -    case 10: return __builtin_arm_getwcx (10);
> -    case 11: return __builtin_arm_getwcx (11);
> -    default: return 0;
> +    case 0:
> +      __asm __volatile ("tmrc %0, wcid" : "=r"(__value));
> +      break;
> +    case 1:
> +      __asm __volatile ("tmrc %0, wcon" : "=r"(__value));
> +      break;
> +    case 2:
> +      __asm __volatile ("tmrc %0, wcssf" : "=r"(__value));
> +      break;
> +    case 3:
> +      __asm __volatile ("tmrc %0, wcasf" : "=r"(__value));
> +      break;
> +    case 8:
> +      return __builtin_arm_getwcgr0 ();
> +    case 9:
> +      return __builtin_arm_getwcgr1 ();
> +    case 10:
> +      return __builtin_arm_getwcgr2 ();
> +    case 11:
> +      return __builtin_arm_getwcgr3 ();
> +    default:
> +      break;
>     }
> +  return __value;
>  }
>
>  /* Creates a vector of two 32-bit values; I0 is least significant.  */
>  static __inline __m64
>  _mm_set_pi32 (int __i1, int __i0)
>  {
> -  union {
> +  union
> +  {
>     __m64 __q;
> -    struct {
> +    struct
> +    {
>       unsigned int __i0;
>       unsigned int __i1;
>     } __s;
> @@ -1041,7 +1099,7 @@ _mm_set_pi16 (short __w3, short __w2, short __w1, short __w0)
>   unsigned int __i1 = (unsigned short)__w3 << 16 | (unsigned short)__w2;
>   unsigned int __i0 = (unsigned short)__w1 << 16 | (unsigned short)__w0;
>   return _mm_set_pi32 (__i1, __i0);
> -
> +
Extra newline again here.
>  }
>
>  /* Creates a vector of eight 8-bit values; B0 is least significant.  */
> @@ -1108,11 +1166,526 @@ _mm_set1_pi8 (char __b)
>   return _mm_set1_pi32 (__i);
>  }
>
> -/* Convert an integer to a __m64 object.  */
> +#ifdef __IWMMXT2__
> +static __inline __m64
> +_mm_abs_pi8 (__m64 m1)
> +{
> +  return (__m64) __builtin_arm_wabsb ((__v8qi)m1);
> +}
> +
> +static __inline __m64
> +_mm_abs_pi16 (__m64 m1)
> +{
> +  return (__m64) __builtin_arm_wabsh ((__v4hi)m1);
> +

And here.

> +}
> +
> +static __inline __m64
> +_mm_abs_pi32 (__m64 m1)
> +{
> +  return (__m64) __builtin_arm_wabsw ((__v2si)m1);
> +
and here.

<large part snipped.>

> +
> +#define _mm_qmiabb_pi32(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wqmiabb ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_qmiabbn_pi32(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wqmiabbn ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_qmiabt_pi32(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wqmiabt ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_qmiabtn_pi32(acc, m1, m2) \
> +  ({\
> +   __m64 _acc=acc;\
> +   __m64 _m1=m1;\
> +   __m64 _m2=m2;\
> +   _acc = (__m64) __builtin_arm_wqmiabtn ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_qmiatb_pi32(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wqmiatb ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_qmiatbn_pi32(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wqmiatbn ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_qmiatt_pi32(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wqmiatt ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_qmiattn_pi32(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wqmiattn ((__v2si)_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiabb_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiabb (_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiabbn_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiabbn (_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiabt_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiabt (_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiabtn_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiabtn (_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiatb_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiatb (_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiatbn_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiatbn (_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiatt_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiatt (_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiattn_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiattn (_acc, (__v4hi)_m1, (__v4hi)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiawbb_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiawbb (_acc, (__v2si)_m1, (__v2si)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiawbbn_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiawbbn (_acc, (__v2si)_m1, (__v2si)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiawbt_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiawbt (_acc, (__v2si)_m1, (__v2si)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiawbtn_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiawbtn (_acc, (__v2si)_m1, (__v2si)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiawtb_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiawtb (_acc, (__v2si)_m1, (__v2si)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiawtbn_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiawtbn (_acc, (__v2si)_m1, (__v2si)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiawtt_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiawtt (_acc, (__v2si)_m1, (__v2si)_m2);\
> +   _acc;\
> +   })
> +
> +#define _mm_wmiawttn_si64(acc, m1, m2) \
> +  ({\
> +   __m64 _acc = acc;\
> +   __m64 _m1 = m1;\
> +   __m64 _m2 = m2;\
> +   _acc = (__m64) __builtin_arm_wmiawttn (_acc, (__v2si)_m1, (__v2si)_m2);\
> +   _acc;\
> +   })

I assume someone knows why these are macros and not inline functions
like the others ?


> +
> +/* The third arguments should be an immediate.  */

s/arguments/argument

> +#define _mm_merge_si64(a, b, n) \
> +  ({\
> +   __m64 result;\
> +   result = (__m64) __builtin_arm_wmerge ((__m64) (a), (__m64) (b), (n));\
> +   result;\
> +   })
> +#endif  /* __IWMMXT2__ */
> +

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support
  2012-05-29  4:13 [PATCH ARM iWMMXt 0/5] Improve iWMMXt support Matt Turner
                   ` (5 preceding siblings ...)
  2012-06-06 11:59 ` [PATCH ARM iWMMXt 0/5] Improve iWMMXt support Ramana Radhakrishnan
@ 2012-06-11  9:24 ` nick clifton
  2012-06-13  7:36 ` nick clifton
  7 siblings, 0 replies; 29+ messages in thread
From: nick clifton @ 2012-06-11  9:24 UTC (permalink / raw)
  To: Matt Turner
  Cc: gcc-patches, Ramana Radhakrishnan, Richard Earnshaw, Paul Brook,
	Xinyu Qi

Hi Matt,

   This is just to let you know that I am currently reviewing these 
patches.  I do have a problem however.  With the patches applied I am 
seeing very bad results from the gcc testsuite when run with 
-mcpu=iwmmxt.  (Bad as in the testsuite takes days to run and most tests 
fail).  I am currently looking into this, but it may take me some time 
to track the problem down.

Cheers
   Nick

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support
  2012-05-29  4:13 [PATCH ARM iWMMXt 0/5] Improve iWMMXt support Matt Turner
                   ` (6 preceding siblings ...)
  2012-06-11  9:24 ` nick clifton
@ 2012-06-13  7:36 ` nick clifton
  2012-06-13 15:31   ` Matt Turner
  7 siblings, 1 reply; 29+ messages in thread
From: nick clifton @ 2012-06-13  7:36 UTC (permalink / raw)
  To: Matt Turner, Xinyu Qi
  Cc: gcc-patches, Ramana Radhakrishnan, Richard Earnshaw, Paul Brook

Hi Matt, Hi Xinyu,

> This series was written by Marvell and sent by Xinyu Qi<xyqi@marvell.com>
> a number of times in the last year.

Sorry for the long delay in reviewing these patches.  Overall they were 
fine, with only a few, very minor, formatting issues.  I have committed 
the entire series of patches to the mainline.

> For 4.7 and 4.6 please consider committing my patch
> "[PATCH] arm: Fix iwmmxt shift and logical intrinsics (PR 35294)."
> which only fixes the logical and shift intrinsics.

I will look at this and post separately about it.

Cheers
   Nick

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support
  2012-06-13  7:36 ` nick clifton
@ 2012-06-13 15:31   ` Matt Turner
  2012-06-26 15:20     ` nick clifton
  0 siblings, 1 reply; 29+ messages in thread
From: Matt Turner @ 2012-06-13 15:31 UTC (permalink / raw)
  To: nick clifton
  Cc: Xinyu Qi, gcc-patches, Ramana Radhakrishnan, Richard Earnshaw,
	Paul Brook

On Wed, Jun 13, 2012 at 3:26 AM, nick clifton <nickc@redhat.com> wrote:
> Hi Matt, Hi Xinyu,
>
>
>> This series was written by Marvell and sent by Xinyu Qi<xyqi@marvell.com>
>> a number of times in the last year.
>
>
> Sorry for the long delay in reviewing these patches.  Overall they were
> fine, with only a few, very minor, formatting issues.  I have committed the
> entire series of patches to the mainline.

Great! Thank you so much! Thanks to Ramana for the reviews!

>> For 4.7 and 4.6 please consider committing my patch
>> "[PATCH] arm: Fix iwmmxt shift and logical intrinsics (PR 35294)."
>> which only fixes the logical and shift intrinsics.

Sounds good.

There's also a trivial documentation fix:

[PATCH 1/2] doc: Correct __builtin_arm_tinsr prototype documentation

and a test to exercise the intrinsics:

[PATCH 2/2] arm: add iwMMXt mmx-2.c test

Thanks a lot!

Matt

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support
  2012-06-13 15:31   ` Matt Turner
@ 2012-06-26 15:20     ` nick clifton
  2012-06-27 19:15       ` Matt Turner
  2013-01-28  3:49       ` Matt Turner
  0 siblings, 2 replies; 29+ messages in thread
From: nick clifton @ 2012-06-26 15:20 UTC (permalink / raw)
  To: Matt Turner
  Cc: Xinyu Qi, gcc-patches, Ramana Radhakrishnan, Richard Earnshaw,
	Paul Brook

Hi Matt,

> There's also a trivial documentation fix:
>
> [PATCH 1/2] doc: Correct __builtin_arm_tinsr prototype documentation
>
> and a test to exercise the intrinsics:
>
> [PATCH 2/2] arm: add iwMMXt mmx-2.c test

These have both been checked in.

It turns out that both needed minor updates as some of the builtins have 
changed since these patches were written.  I have taken care of this 
however.

Cheers
   Nick

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support
  2012-06-26 15:20     ` nick clifton
@ 2012-06-27 19:15       ` Matt Turner
  2013-01-28  3:49       ` Matt Turner
  1 sibling, 0 replies; 29+ messages in thread
From: Matt Turner @ 2012-06-27 19:15 UTC (permalink / raw)
  To: nick clifton
  Cc: Xinyu Qi, gcc-patches, Ramana Radhakrishnan, Richard Earnshaw,
	Paul Brook

On Tue, Jun 26, 2012 at 10:56 AM, nick clifton <nickc@redhat.com> wrote:
> Hi Matt,
>
>
>> There's also a trivial documentation fix:
>>
>> [PATCH 1/2] doc: Correct __builtin_arm_tinsr prototype documentation
>>
>> and a test to exercise the intrinsics:
>>
>> [PATCH 2/2] arm: add iwMMXt mmx-2.c test
>
>
> These have both been checked in.
>
> It turns out that both needed minor updates as some of the builtins have
> changed since these patches were written.  I have taken care of this
> however.
>
> Cheers
>  Nick

Thanks a lot, Nick!

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH, ARM, iWMMXT] Fix define_constants for WCGR
  2012-06-06 11:53   ` Ramana Radhakrishnan
@ 2012-12-27  2:31     ` Xinyu Qi
  2013-01-22  9:22     ` [PING][PATCH, " Xinyu Qi
  1 sibling, 0 replies; 29+ messages in thread
From: Xinyu Qi @ 2012-12-27  2:31 UTC (permalink / raw)
  To: gcc-patches

Hi,

  The constants WCGR0 to WCGR3 in iwmmxt.md need to be kept in sync
with the IWMMXT_GR_REGNUM range (FIRST_IWMMXT_GR_REGNUM to
LAST_IWMMXT_GR_REGNUM) in arm.h.
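
As an illustration of what "in sync" means here, the sketch below
recomputes the arm.h chain and checks it against the new WCGR values at
compile time. It is standalone C, not part of the patch. The value 79
for LAST_HI_VFP_REGNUM is an assumption inferred from the patch setting
WCGR0 to 96; it was not read from arm.h.

```c
#include <assert.h>

/* Assumption: inferred from WCGR0 == 96 in the patch, not taken from arm.h.  */
#define LAST_HI_VFP_REGNUM      79

/* Same chain of definitions as in the arm.h hunk.  */
#define FIRST_IWMMXT_REGNUM     (LAST_HI_VFP_REGNUM + 1)
#define LAST_IWMMXT_REGNUM      (FIRST_IWMMXT_REGNUM + 15)
#define FIRST_IWMMXT_GR_REGNUM  (LAST_IWMMXT_REGNUM + 1)
#define LAST_IWMMXT_GR_REGNUM   (FIRST_IWMMXT_GR_REGNUM + 3)

/* The values the patch puts in the iwmmxt.md define_constants.  */
enum { WCGR0 = 96, WCGR1 = 97, WCGR2 = 98, WCGR3 = 99 };

/* Compilation fails if the two sets of numbers drift apart.  */
static_assert (WCGR0 == FIRST_IWMMXT_GR_REGNUM,
               "WCGR0 must match FIRST_IWMMXT_GR_REGNUM");
static_assert (WCGR3 == LAST_IWMMXT_GR_REGNUM,
               "WCGR3 must match LAST_IWMMXT_GR_REGNUM");
```

The .md file cannot include arm.h macros this way, which is why the
patch relies on comments in both files to flag the dependency.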

ChangeLog
	* config/arm/arm.h (FIRST_IWMMXT_GR_REGNUM): Add comment.
	* config/arm/iwmmxt.md (WCGR0, WCGR1): Update.
	* config/arm/iwmmxt.md (WCGR2, WCGR3): Likewise.

Index: config/arm/arm.h
===================================================================
--- config/arm/arm.h	(revision 194603)
+++ config/arm/arm.h	(working copy)
@@ -947,6 +947,8 @@
 
 #define FIRST_IWMMXT_REGNUM	(LAST_HI_VFP_REGNUM + 1)
 #define LAST_IWMMXT_REGNUM	(FIRST_IWMMXT_REGNUM + 15)
+
+/* Need to sync with WCGR in iwmmxt.md.  */
 #define FIRST_IWMMXT_GR_REGNUM	(LAST_IWMMXT_REGNUM + 1)
 #define LAST_IWMMXT_GR_REGNUM	(FIRST_IWMMXT_GR_REGNUM + 3)
 
Index: config/arm/iwmmxt.md
===================================================================
--- config/arm/iwmmxt.md	(revision 194603)
+++ config/arm/iwmmxt.md	(working copy)
@@ -19,12 +19,12 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; <http://www.gnu.org/licenses/>.
 
-;; Register numbers
+;; Register numbers. Need to sync with FIRST_IWMMXT_GR_REGNUM in arm.h
 (define_constants
-  [(WCGR0           43)
-   (WCGR1           44)
-   (WCGR2           45)
-   (WCGR3           46)
+  [(WCGR0           96)
+   (WCGR1           97)
+   (WCGR2           98)
+   (WCGR3           99)
   ]
 )


OK?

Thanks,
Xinyu

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [PING][PATCH, ARM, iWMMXT] Fix define_constants for WCGR
  2012-06-06 11:53   ` Ramana Radhakrishnan
  2012-12-27  2:31     ` [PATCH, ARM, iWMMXT] Fix define_constants for WCGR Xinyu Qi
@ 2013-01-22  9:22     ` Xinyu Qi
  2013-01-22 11:59       ` Ramana Radhakrishnan
  1 sibling, 1 reply; 29+ messages in thread
From: Xinyu Qi @ 2013-01-22  9:22 UTC (permalink / raw)
  To: gcc-patches

Ping,

Fix ChangeLog
	* config/arm/arm.h (FIRST_IWMMXT_GR_REGNUM): Add comment.
	* config/arm/iwmmxt.md (WCGR0): Update.
	 (WCGR1, WCGR2, WCGR3): Likewise.

> Hi,
> 
>   It is necessary to sync the constants WCGR0 to WCGR3 in iwmmxt.md with
> the IWMMXT_GR_REGNUM in arm.h.
> 
> ChangeLog
> 	* config/arm/arm.h (FIRST_IWMMXT_GR_REGNUM): Add comment.
> 	* config/arm/iwmmxt.md (WCGR0, WCGR1): Update.
> 	* config/arm/iwmmxt.md (WCGR2, WCGR3): Likewise.
> 
> Index: config/arm/arm.h
> ===================================================================
> --- config/arm/arm.h	(revision 194603)
> +++ config/arm/arm.h	(working copy)
> @@ -947,6 +947,8 @@
> 
>  #define FIRST_IWMMXT_REGNUM	(LAST_HI_VFP_REGNUM + 1)
>  #define LAST_IWMMXT_REGNUM	(FIRST_IWMMXT_REGNUM + 15)
> +
> +/* Need to sync with WCGR in iwmmxt.md.  */
>  #define FIRST_IWMMXT_GR_REGNUM	(LAST_IWMMXT_REGNUM + 1)
>  #define LAST_IWMMXT_GR_REGNUM	(FIRST_IWMMXT_GR_REGNUM + 3)
> 
> Index: config/arm/iwmmxt.md
> ===================================================================
> --- config/arm/iwmmxt.md	(revision 194603)
> +++ config/arm/iwmmxt.md	(working copy)
> @@ -19,12 +19,12 @@
>  ;; along with GCC; see the file COPYING3.  If not see
>  ;; <http://www.gnu.org/licenses/>.
> 
> -;; Register numbers
> +;; Register numbers. Need to sync with FIRST_IWMMXT_GR_REGNUM in arm.h
>  (define_constants
> -  [(WCGR0           43)
> -   (WCGR1           44)
> -   (WCGR2           45)
> -   (WCGR3           46)
> +  [(WCGR0           96)
> +   (WCGR1           97)
> +   (WCGR2           98)
> +   (WCGR3           99)
>    ]
>  )
> 
> 
> OK?
> 
> Thanks,
> Xinyu

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PING][PATCH, ARM, iWMMXT] Fix define_constants for WCGR
  2013-01-22  9:22     ` [PING][PATCH, " Xinyu Qi
@ 2013-01-22 11:59       ` Ramana Radhakrishnan
  2013-01-22 13:34         ` Andreas Schwab
                           ` (3 more replies)
  0 siblings, 4 replies; 29+ messages in thread
From: Ramana Radhakrishnan @ 2013-01-22 11:59 UTC (permalink / raw)
  To: Xinyu Qi; +Cc: gcc-patches

On 01/22/13 09:21, Xinyu Qi wrote:
> Ping,
>
> Fix ChangeLog

The ChangeLog format includes:

<date>  <Author's name>  <a.b@c.com>

If you want a patch accepted in the future, please help by creating the 
ChangeLog entry in the correct format, i.e. fill in the author's name as 
well as the email address. I've created an entry below. Please 
remember to do so for every patch you submit - thanks.

<DATE>  Xinyu Qi  <xyqi@marvell.com>

	* config/arm/arm.h (FIRST_IWMMXT_GR_REGNUM): Add comment.
	* config/arm/iwmmxt.md (WCGR0): Update.
	(WCGR1, WCGR2, WCGR3): Likewise.

The patch by itself is OK but surprisingly I never saw this earlier. 
Your ping has removed the date from the original post so I couldn't 
track it down.

Anyway, please apply.


regards,
Ramana




* Re: [PING][PATCH, ARM, iWMMXT] Fix define_constants for WCGR
  2013-01-22 11:59       ` Ramana Radhakrishnan
@ 2013-01-22 13:34         ` Andreas Schwab
  2013-01-23  6:08         ` Xinyu Qi
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: Andreas Schwab @ 2013-01-22 13:34 UTC (permalink / raw)
  To: ramrad01; +Cc: Xinyu Qi, gcc-patches

Ramana Radhakrishnan <ramrad01@arm.com> writes:

> The patch by itself is OK but surprisingly I never saw this earlier. Your
> ping has removed the date from the original post so I couldn't track it
> down.

You can follow the references and look up the message-id via
http://mid.gmane.org/<msg-id>.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


* RE: [PING][PATCH, ARM, iWMMXT] Fix define_constants for WCGR
  2013-01-22 11:59       ` Ramana Radhakrishnan
  2013-01-22 13:34         ` Andreas Schwab
@ 2013-01-23  6:08         ` Xinyu Qi
  2013-01-31  8:49         ` [PATCH, " Xinyu Qi
  2013-03-20  2:43         ` Xinyu Qi
  3 siblings, 0 replies; 29+ messages in thread
From: Xinyu Qi @ 2013-01-23  6:08 UTC (permalink / raw)
  To: ramrad01; +Cc: gcc-patches

At 2013-01-22 19:58:43,"Ramana Radhakrishnan" <ramrad01@arm.com> wrote:
> On 01/22/13 09:21, Xinyu Qi wrote:
> > Ping,
> >
> > Fix ChangeLog
> 
> The ChangeLog format includes .
> 
> <date>  <Author's name>  <a.b@c.com>
> 
> If you want a patch accepted in the future, please help by creating the
> Changelog entry in the correct format, i.e. fill in the author's name as well as
> email address as below. I've created an entry as below. Please remember to do
> so for every patch you submit - thanks.
> 
> <DATE>  Xinyu Qi  <xyqi@marvell.com>
> 
> 	* config/arm/arm.h (FIRST_IWMMXT_GR_REGNUM): Add comment.
> 	* config/arm/iwmmxt.md (WCGR0): Update.
> 	(WCGR1, WCGR2, WCGR3): Likewise.
> 
> The patch by itself is OK but surprisingly I never saw this earlier.
> Your ping has removed the date from the original post so I couldn't track it
> down.

Hi Ramana,

Thanks for reviewing.
I forgot to keep the date; the original post was sent on Wed, 26 Dec 2012.
You can find it at
http://gcc.gnu.org/ml/gcc-patches/2012-12/msg01418.html
I will remember to write the ChangeLog entry in the correct format next time.

> 
> Anyway, please apply.

BTW, since I have no write access, would you mind checking in this patch for me?

Thanks!
Xinyu

> 
> 
> regards,
> Ramana
> 
> 


* Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support
  2012-06-26 15:20     ` nick clifton
  2012-06-27 19:15       ` Matt Turner
@ 2013-01-28  3:49       ` Matt Turner
  2013-01-28 15:11         ` nick clifton
  1 sibling, 1 reply; 29+ messages in thread
From: Matt Turner @ 2013-01-28  3:49 UTC (permalink / raw)
  To: nick clifton
  Cc: Xinyu Qi, gcc-patches, Ramana Radhakrishnan, Richard Earnshaw,
	Paul Brook

On Tue, Jun 26, 2012 at 7:56 AM, nick clifton <nickc@redhat.com> wrote:
> Hi Matt,
>
>
>> There's also a trivial documentation fix:
>>
>> [PATCH 1/2] doc: Correct __builtin_arm_tinsr prototype documentation
>>
>> and a test to exercise the intrinsics:
>>
>> [PATCH 2/2] arm: add iwMMXt mmx-2.c test
>
>
> These have both been checked in.
>
> It turns out that both needed minor updates as some of the builtins have
> changed since these patches were written.  I have taken care of this
> however.
>
> Cheers
>   Nick

Hi Nick,

Could this patch, or perhaps the much smaller one I attached to bug
35294 be committed to the 4.7 branch?

Also, could you close its duplicates, bugs 36798 and 36966?

Thanks,
Matt


* Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support
  2013-01-28  3:49       ` Matt Turner
@ 2013-01-28 15:11         ` nick clifton
  2013-02-21  2:35           ` closing PR's (was Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support) Hans-Peter Nilsson
  0 siblings, 1 reply; 29+ messages in thread
From: nick clifton @ 2013-01-28 15:11 UTC (permalink / raw)
  To: Matt Turner
  Cc: Xinyu Qi, gcc-patches, Ramana Radhakrishnan, Richard Earnshaw,
	Paul Brook

Hi Matt,

> Could this patch, or perhaps the much smaller one I attached to bug
> 35294 be committed to the 4.7 branch?

Yes.  Done.

> Also, could you close its duplicates, bugs 36798 and 36966?

Sorry no.  I do not actually own these PRs, so I cannot close them. :-(

Cheers
   Nick



* RE: [PATCH, ARM, iWMMXT] Fix define_constants for WCGR
  2013-01-22 11:59       ` Ramana Radhakrishnan
  2013-01-22 13:34         ` Andreas Schwab
  2013-01-23  6:08         ` Xinyu Qi
@ 2013-01-31  8:49         ` Xinyu Qi
  2013-03-20  2:43         ` Xinyu Qi
  3 siblings, 0 replies; 29+ messages in thread
From: Xinyu Qi @ 2013-01-31  8:49 UTC (permalink / raw)
  To: nick clifton; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1301 bytes --]

At 2013-01-22 19:58:43,"Ramana Radhakrishnan" <ramrad01@arm.com> wrote:
> On 01/22/13 09:21, Xinyu Qi wrote:
> > Ping,
> >
> > Fix ChangeLog
> 
> The ChangeLog format includes .
> 
> <date>  <Author's name>  <a.b@c.com>
> 
> If you want a patch accepted in the future, please help by creating the
> Changelog entry in the correct format, i.e. fill in the author's name as well as
> email address as below. I've created an entry as below. Please remember to do
> so for every patch you submit - thanks.
> 
> <DATE>  Xinyu Qi  <xyqi@marvell.com>
> 
> 	* config/arm/arm.h (FIRST_IWMMXT_GR_REGNUM): Add comment.
> 	* config/arm/iwmmxt.md (WCGR0): Update.
> 	(WCGR1, WCGR2, WCGR3): Likewise.
> 
> The patch by itself is OK but surprisingly I never saw this earlier.
> Your ping has removed the date from the original post so I couldn't track it
> down.
> 
> Anyway, please apply.
> 
> 
> regards,
> Ramana
> 
> 

Hi Nick,

Since I have no write access, would you mind checking in this patch, which has already been approved?
The patch is attached.

ChangeLog
2013-01-31  Xinyu Qi  <xyqi@marvell.com>

	* config/arm/arm.h (FIRST_IWMMXT_GR_REGNUM): Add comment.
	* config/arm/iwmmxt.md (WCGR0): Update.
	(WCGR1, WCGR2, WCGR3): Likewise.

Thanks,
Xinyu

[-- Attachment #2: WCGR.diff --]
[-- Type: application/octet-stream, Size: 1102 bytes --]

Index: gcc/config/arm/arm.h
===================================================================
--- gcc/config/arm/arm.h	(revision 195599)
+++ gcc/config/arm/arm.h	(working copy)
@@ -945,6 +945,8 @@
 
 #define FIRST_IWMMXT_REGNUM	(LAST_HI_VFP_REGNUM + 1)
 #define LAST_IWMMXT_REGNUM	(FIRST_IWMMXT_REGNUM + 15)
+
+/* Need to sync with WCGR in iwmmxt.md.  */
 #define FIRST_IWMMXT_GR_REGNUM	(LAST_IWMMXT_REGNUM + 1)
 #define LAST_IWMMXT_GR_REGNUM	(FIRST_IWMMXT_GR_REGNUM + 3)
 
Index: gcc/config/arm/iwmmxt.md
===================================================================
--- gcc/config/arm/iwmmxt.md	(revision 195599)
+++ gcc/config/arm/iwmmxt.md	(working copy)
@@ -18,12 +18,12 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; <http://www.gnu.org/licenses/>.
 
-;; Register numbers
+;; Register numbers. Need to sync with FIRST_IWMMXT_GR_REGNUM in arm.h
 (define_constants
-  [(WCGR0           43)
-   (WCGR1           44)
-   (WCGR2           45)
-   (WCGR3           46)
+  [(WCGR0           96)
+   (WCGR1           97)
+   (WCGR2           98)
+   (WCGR3           99)
   ]
 )
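
The dependency that the two new comments describe can be written down as a compile-time check. The sketch below copies the macro chain from the arm.h hunk and the constants from the iwmmxt.md hunk. LAST_HI_VFP_REGNUM is not shown in the patch, so the value 79 is an assumption inferred backwards from WCGR0 == 96, and wcgr_index is a hypothetical helper added for illustration.

```c
#include <assert.h>

/* Macro chain as in the arm.h hunk.  LAST_HI_VFP_REGNUM does not appear
   in the patch; 79 is an assumption inferred from WCGR0 == 96
   (96 - 1 - 15 - 1 = 79).  */
#define LAST_HI_VFP_REGNUM      79
#define FIRST_IWMMXT_REGNUM     (LAST_HI_VFP_REGNUM + 1)
#define LAST_IWMMXT_REGNUM      (FIRST_IWMMXT_REGNUM + 15)
#define FIRST_IWMMXT_GR_REGNUM  (LAST_IWMMXT_REGNUM + 1)
#define LAST_IWMMXT_GR_REGNUM   (FIRST_IWMMXT_GR_REGNUM + 3)

/* Values from define_constants in iwmmxt.md after the patch.  */
enum { WCGR0 = 96, WCGR1 = 97, WCGR2 = 98, WCGR3 = 99 };

/* These fail to compile if the two files drift apart again.  */
static_assert (WCGR0 == FIRST_IWMMXT_GR_REGNUM, "WCGR0 out of sync with arm.h");
static_assert (WCGR3 == LAST_IWMMXT_GR_REGNUM, "WCGR3 out of sync with arm.h");

/* Hypothetical helper: index 0..3 of a wCGR register given its hard
   register number, or -1 if REGNO is not a wCGR register.  */
int
wcgr_index (int regno)
{
  if (regno >= FIRST_IWMMXT_GR_REGNUM && regno <= LAST_IWMMXT_GR_REGNUM)
    return regno - FIRST_IWMMXT_GR_REGNUM;
  return -1;
}
```

With the old constants (43 to 46), wcgr_index would return -1 for every WCGR value, which is the mismatch the patch removes.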
 


* closing PR's (was Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support)
  2013-01-28 15:11         ` nick clifton
@ 2013-02-21  2:35           ` Hans-Peter Nilsson
  2013-02-22 12:42             ` nick clifton
  0 siblings, 1 reply; 29+ messages in thread
From: Hans-Peter Nilsson @ 2013-02-21  2:35 UTC (permalink / raw)
  To: nick clifton; +Cc: gcc-patches

On Mon, 28 Jan 2013, nick clifton wrote:
> > Also, could you close its duplicates, bugs 36798 and 36966?
>
> Sorry no.  I do not actually own these PRs, so I cannot close them. :-(

Sorry if I misinterpret, but it seems a reminder is in order:
magic powers are attached to whomever@gcc.gnu.org accounts in
bugzilla, so when people use them instead of their
whomever@employer.example.com, they are able to close PR's they
haven't created.

brgds, H-P


* Re: closing PR's (was Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support)
  2013-02-21  2:35           ` closing PR's (was Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support) Hans-Peter Nilsson
@ 2013-02-22 12:42             ` nick clifton
  0 siblings, 0 replies; 29+ messages in thread
From: nick clifton @ 2013-02-22 12:42 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: gcc-patches

Hi Hans-Peter,
> Sorry if I misinterpret, but it seems a reminder is in order:
> magic powers are attached to whomever@gcc.gnu.org accounts in
> bugzilla, so when people use them instead of their
> whomever@employer.example.com, they are able to close PR's they
> haven't created.

Ah - thank you, I did not know that.  I have now logged in using that 
address and closed the requested PR.

Cheers
   Nick


* RE: [PATCH, ARM, iWMMXT] Fix define_constants for WCGR
  2013-01-22 11:59       ` Ramana Radhakrishnan
                           ` (2 preceding siblings ...)
  2013-01-31  8:49         ` [PATCH, " Xinyu Qi
@ 2013-03-20  2:43         ` Xinyu Qi
  2013-03-26 14:01           ` Ramana Radhakrishnan
  3 siblings, 1 reply; 29+ messages in thread
From: Xinyu Qi @ 2013-03-20  2:43 UTC (permalink / raw)
  To: ramrad01; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1324 bytes --]

> At 2013-01-22 19:58:43,"Ramana Radhakrishnan" <ramrad01@arm.com> wrote:
> > On 01/22/13 09:21, Xinyu Qi wrote:
> > > Ping,
> > >
> > > Fix ChangeLog
> >
> > The ChangeLog format includes .
> >
> > <date>  <Author's name>  <a.b@c.com>
> >
> > If you want a patch accepted in the future, please help by creating
> > the Changelog entry in the correct format, i.e. fill in the author's
> > name as well as email address as below. I've created an entry as
> > below. Please remember to do so for every patch you submit - thanks.
> >
> > <DATE>  Xinyu Qi  <xyqi@marvell.com>
> >
> > 	* config/arm/arm.h (FIRST_IWMMXT_GR_REGNUM): Add comment.
> > 	* config/arm/iwmmxt.md (WCGR0): Update.
> > 	(WCGR1, WCGR2, WCGR3): Likewise.
> >
> > The patch by itself is OK but surprisingly I never saw this earlier.
> > Your ping has removed the date from the original post so I couldn't
> > track it down.
> >
> > Anyway, please apply.
> >
> >
> > regards,
> > Ramana
> >
> >
> 
Hi Ramana,

Since I have no write access, would you mind checking in this patch for me?
The patch is attached.

ChangeLog
2013-01-31  Xinyu Qi  <xyqi@marvell.com>

	* config/arm/arm.h (FIRST_IWMMXT_GR_REGNUM): Add comment.
	* config/arm/iwmmxt.md (WCGR0): Update.
	(WCGR1, WCGR2, WCGR3): Likewise.

Thanks,
Xinyu

[-- Attachment #2: WCGR.DIFF --]
[-- Type: application/octet-stream, Size: 1102 bytes --]

Index: gcc/config/arm/arm.h
===================================================================
--- gcc/config/arm/arm.h	(revision 195599)
+++ gcc/config/arm/arm.h	(working copy)
@@ -945,6 +945,8 @@
 
 #define FIRST_IWMMXT_REGNUM	(LAST_HI_VFP_REGNUM + 1)
 #define LAST_IWMMXT_REGNUM	(FIRST_IWMMXT_REGNUM + 15)
+
+/* Need to sync with WCGR in iwmmxt.md.  */
 #define FIRST_IWMMXT_GR_REGNUM	(LAST_IWMMXT_REGNUM + 1)
 #define LAST_IWMMXT_GR_REGNUM	(FIRST_IWMMXT_GR_REGNUM + 3)
 
Index: gcc/config/arm/iwmmxt.md
===================================================================
--- gcc/config/arm/iwmmxt.md	(revision 195599)
+++ gcc/config/arm/iwmmxt.md	(working copy)
@@ -18,12 +18,12 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; <http://www.gnu.org/licenses/>.
 
-;; Register numbers
+;; Register numbers. Need to sync with FIRST_IWMMXT_GR_REGNUM in arm.h
 (define_constants
-  [(WCGR0           43)
-   (WCGR1           44)
-   (WCGR2           45)
-   (WCGR3           46)
+  [(WCGR0           96)
+   (WCGR1           97)
+   (WCGR2           98)
+   (WCGR3           99)
   ]
 )
 


* Re: [PATCH, ARM, iWMMXT] Fix define_constants for WCGR
  2013-03-20  2:43         ` Xinyu Qi
@ 2013-03-26 14:01           ` Ramana Radhakrishnan
  2013-04-02  9:55             ` [PATCH, ARM, iWMMXT] PR target/54338 - Include IWMMXT_GR_REGS in ALL_REGS Xinyu Qi
  0 siblings, 1 reply; 29+ messages in thread
From: Ramana Radhakrishnan @ 2013-03-26 14:01 UTC (permalink / raw)
  To: Xinyu Qi; +Cc: gcc-patches

On Wed, Mar 20, 2013 at 2:43 AM, Xinyu Qi <xyqi@marvell.com> wrote:
>>At 2013-01-22 19:58:43,"Ramana Radhakrishnan" <ramrad01@arm.com> wrote:>
>> > On 01/22/13 09:21, Xinyu Qi wrote:
>> > > Ping,
>> > >
>> > > Fix ChangeLog
>> >
>> > The ChangeLog format includes .
>> >
>> > <date>  <Author's name>  <a.b@c.com>
>> >
>> > If you want a patch accepted in the future, please help by creating
>> > the Changelog entry in the correct format, i.e. fill in the author's
>> > name as well as email address as below. I've created an entry as
>> > below. Please remember to do so for every patch you submit - thanks.
>> >
>> > <DATE>  Xinyu Qi  <xyqi@marvell.com>
>> >
>> >     * config/arm/arm.h (FIRST_IWMMXT_GR_REGNUM): Add comment.
>> >     * config/arm/iwmmxt.md (WCGR0): Update.
>> >     (WCGR1, WCGR2, WCGR3): Likewise.
>> >
>> > The patch by itself is OK but surprisingly I never saw this earlier.
>> > Your ping has removed the date from the original post so I couldn't
>> > track it down.
>> >
>> > Anyway, please apply.
>> >
>> >
>> > regards,
>> > Ramana
>> >
>> >
>>
> Hi Ramana,
>
> Since I have no write access, would you mind to help to check in this patch?
> The patch is attached.
>
> ChangeLog
> 2013-01-31  Xinyu Qi  <xyqi@marvell.com>
>
>         * config/arm/arm.h (FIRST_IWMMXT_GR_REGNUM): Add comment.
>         * config/arm/iwmmxt.md (WCGR0): Update.
>         (WCGR1, WCGR2, WCGR3): Likewise.
>

Now applied to trunk. Sorry about the delay.

Ramana


* [PATCH, ARM, iWMMXT] PR target/54338 - Include IWMMXT_GR_REGS in ALL_REGS
  2013-03-26 14:01           ` Ramana Radhakrishnan
@ 2013-04-02  9:55             ` Xinyu Qi
  2013-04-02 10:03               ` Ramana Radhakrishnan
  0 siblings, 1 reply; 29+ messages in thread
From: Xinyu Qi @ 2013-04-02  9:55 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 693 bytes --]

Hi,
  According to Vladimir Makarov's analysis, the root cause of PR target/54338 is that ALL_REGS in REG_CLASS_CONTENTS does not contain IWMMXT_GR_REGS.
  There seems to be no reason to exclude IWMMXT_GR_REGS from ALL_REGS, since the IWMMXT_GR_REGS are real registers.
  This patch simply makes ALL_REGS include IWMMXT_GR_REGS to fix this PR.
  Since the test case gcc.target/arm/mmx-2.c fails for the same reason and passes with this fix, no extra test case needs to be added.
  Tested with arm.exp. Patch attached.

ChangeLog

2013-04-02  Xinyu Qi  <xyqi@marvell.com>

	* config/arm/arm.h (REG_CLASS_CONTENTS): Include IWMMXT_GR_REGS in ALL_REGS.


OK?

Thanks,
Xinyu

[-- Attachment #2: IWMMXT_GR_REGS.diff --]
[-- Type: application/octet-stream, Size: 1359 bytes --]

Index: config/arm/arm.h
===================================================================
*** config/arm/arm.h	(revision 197340)
--- config/arm/arm.h	(working copy)
***************
*** 1203,1213 ****
    { 0x00000000, 0x00000000, 0x00000000, 0x0000000F }, /* IWMMXT_GR_REGS */ \
    { 0x00000000, 0x00000000, 0x00000000, 0x00000010 }, /* CC_REG */	\
    { 0x00000000, 0x00000000, 0x00000000, 0x00000020 }, /* VFPCC_REG */	\
    { 0x00000000, 0x00000000, 0x00000000, 0x00000040 }, /* SFP_REG */	\
    { 0x00000000, 0x00000000, 0x00000000, 0x00000080 }, /* AFP_REG */	\
!   { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x00000000 }  /* ALL_REGS */	\
  }
  
  /* Any of the VFP register classes.  */
  #define IS_VFP_CLASS(X) \
    ((X) == VFP_D0_D7_REGS || (X) == VFP_LO_REGS \
--- 1203,1213 ----
    { 0x00000000, 0x00000000, 0x00000000, 0x0000000F }, /* IWMMXT_GR_REGS */ \
    { 0x00000000, 0x00000000, 0x00000000, 0x00000010 }, /* CC_REG */	\
    { 0x00000000, 0x00000000, 0x00000000, 0x00000020 }, /* VFPCC_REG */	\
    { 0x00000000, 0x00000000, 0x00000000, 0x00000040 }, /* SFP_REG */	\
    { 0x00000000, 0x00000000, 0x00000000, 0x00000080 }, /* AFP_REG */	\
!   { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F }  /* ALL_REGS */	\
  }
  
  /* Any of the VFP register classes.  */
  #define IS_VFP_CLASS(X) \
    ((X) == VFP_D0_D7_REGS || (X) == VFP_LO_REGS \
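
To see why the one-word change matters, the sketch below tests individual registers against the old and new ALL_REGS rows copied from the patch. It assumes the usual REG_CLASS_CONTENTS layout in which hard register N is bit N % 32 of word N / 32, and it takes 96 to 99 as the wCGR register numbers from the earlier WCGR patch. reg_in_class is a hypothetical helper, not a GCC function.

```c
#include <stdint.h>

/* wCGR0..wCGR3 hard register numbers, as set by the WCGR patch.  */
#define FIRST_IWMMXT_GR_REGNUM 96
#define LAST_IWMMXT_GR_REGNUM  99

/* ALL_REGS rows copied from the patch: before and after the change.  */
const uint32_t all_regs_before[4] =
  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x00000000 };
const uint32_t all_regs_after[4] =
  { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F };

/* Hypothetical helper: nonzero if hard register REGNO (0..127) is a
   member of the class described by SET.  Assumes register N is stored
   as bit N % 32 of word N / 32.  */
int
reg_in_class (const uint32_t set[4], unsigned regno)
{
  return (int) ((set[regno / 32] >> (regno % 32)) & 1u);
}
```

Under these assumptions, registers 96 to 99 land in bits 0 to 3 of the fourth word, so the old row (0x00000000) leaves them out of ALL_REGS and the new row (0x0000000F) includes them, matching the IWMMXT_GR_REGS row shown above.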


* Re: [PATCH, ARM, iWMMXT] PR target/54338 - Include IWMMXT_GR_REGS in ALL_REGS
  2013-04-02  9:55             ` [PATCH, ARM, iWMMXT] PR target/54338 - Include IWMMXT_GR_REGS in ALL_REGS Xinyu Qi
@ 2013-04-02 10:03               ` Ramana Radhakrishnan
  0 siblings, 0 replies; 29+ messages in thread
From: Ramana Radhakrishnan @ 2013-04-02 10:03 UTC (permalink / raw)
  To: gcc-patches

On 04/02/13 10:40, Xinyu Qi wrote:
> Hi,
>    According to Vladimir Makarov's analysis, the root cause of PR target/54338 is that ALL_REGS doesn't contain IWMMXT_GR_REGS in REG_CLASS_CONTENTS.
>    It seems there is no reason to exclude the IWMMXT_GR_REGS from ALL_REGS as IWMMXT_GR_REGS are the real registers.
>    This patch simply makes ALL_REGS include IWMMXT_GR_REGS to fix this PR.
>    Since the test case gcc.target/arm/mmx-2.c would fail for the same reason and become pass with this fix, no extra test case need to be add.
>    Pass arm.exp test. Patch attached.

Testing just with arm.exp is not enough.

Ok if no regressions running the entire regression testsuite for C and 
C++ for arm*-*-*eabi with an iwmmxt configuration.

Thanks
Ramana



Thread overview: 29+ messages
2012-05-29  4:13 [PATCH ARM iWMMXt 0/5] Improve iWMMXt support Matt Turner
2012-05-29  4:14 ` [PATCH ARM iWMMXt 5/5] pipeline description Matt Turner
2012-05-29  4:14 ` [PATCH ARM iWMMXt 1/5] ARM code generic change Matt Turner
2012-06-06 11:53   ` Ramana Radhakrishnan
2012-12-27  2:31     ` [PATCH, ARM, iWMMXT] Fix define_constants for WCGR Xinyu Qi
2013-01-22  9:22     ` [PING][PATCH, " Xinyu Qi
2013-01-22 11:59       ` Ramana Radhakrishnan
2013-01-22 13:34         ` Andreas Schwab
2013-01-23  6:08         ` Xinyu Qi
2013-01-31  8:49         ` [PATCH, " Xinyu Qi
2013-03-20  2:43         ` Xinyu Qi
2013-03-26 14:01           ` Ramana Radhakrishnan
2013-04-02  9:55             ` [PATCH, ARM, iWMMXT] PR target/54338 - Include IWMMXT_GR_REGS in ALL_REGS Xinyu Qi
2013-04-02 10:03               ` Ramana Radhakrishnan
2012-05-29  4:14 ` [PATCH ARM iWMMXt 3/5] built in define and expand Matt Turner
2012-06-06 11:55   ` Ramana Radhakrishnan
2012-05-29  4:15 ` [PATCH ARM iWMMXt 4/5] WMMX machine description Matt Turner
2012-05-29  4:15 ` [PATCH ARM iWMMXt 2/5] intrinsic head file change Matt Turner
2012-06-06 12:22   ` Ramana Radhakrishnan
2012-06-06 11:59 ` [PATCH ARM iWMMXt 0/5] Improve iWMMXt support Ramana Radhakrishnan
2012-06-11  9:24 ` nick clifton
2012-06-13  7:36 ` nick clifton
2012-06-13 15:31   ` Matt Turner
2012-06-26 15:20     ` nick clifton
2012-06-27 19:15       ` Matt Turner
2013-01-28  3:49       ` Matt Turner
2013-01-28 15:11         ` nick clifton
2013-02-21  2:35           ` closing PR's (was Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support) Hans-Peter Nilsson
2013-02-22 12:42             ` nick clifton
