[PATCH 00/13] S/390 Implement support for IBM z13

public inbox for gcc-patches@gcc.gnu.org

* [PATCH 00/13] S/390 Implement support for IBM z13
@ 2015-05-11 13:23 Andreas Krebbel
  2015-05-11 13:23 ` [PATCH 01/13] recog: Increased max number of alternatives Andreas Krebbel
                   ` (12 more replies)
  0 siblings, 13 replies; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-11 13:23 UTC
  To: gcc-patches

The attached patchset adds support for the IBM z13 machine to the
GCC S/390 backend.

The machine has been announced recently:
http://www-03.ibm.com/press/us/en/pressrelease/45808.wss

IBM z/Architecture Principles of Operation
http://publibfi.boulder.ibm.com/epubs/pdf/dz9zr010.pdf

The required Binutils support has been upstream since January:
https://sourceware.org/ml/binutils/2015-01/msg00197.html

Highlights from a toolchain perspective are:

- 32 128-bit vector registers (overlapping the existing 16 64-bit
  floating-point registers)
- vector double instructions
- vector integer instructions
- scalar vector instructions (making more floating-point registers
  available for scalar operations)
- vector string instructions
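
As a rough illustration of what the 128-bit registers hold, the sketch
below uses GCC's generic vector extension rather than anything added by
this patchset.  The names v2df and add_v2df are made up for the example.
The assumption is that a compiler targeting z13 could map such code to
the new vector double instructions; on other targets it is lowered to
whatever the hardware offers.

```c
/* Two doubles packed into 16 bytes, i.e. the width of one 128-bit
   vector register.  v2df is a hypothetical name for this example.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Element-wise addition of two vectors of doubles.  */
v2df
add_v2df (v2df a, v2df b)
{
  return a + b;
}
```

Calling add_v2df with {1.0, 2.0} and {3.0, 4.0} yields {4.0, 6.0}.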

I would also like to commit this patchset to the GCC 5 branch so that
distros can pick it up more easily.

Andreas Krebbel (13):
  recog: Increased max number of alternatives.
  optabs: Fix vec_perm -> V16QI middle end lowering.
  S/390 Fix secondary reload issue with store/load relative operands.
  S/390 Add -march/-mtune=z13 option.
  S/390 Vector base support.
  Vector base support - testcases
  S/390 Add vector scalar instruction support.
  S/390 zvector builtin support.
  S/390 Add zvector testcases.
  Testsuite These testcases require disabling hardware vector support
    on S/390.
  Testsuite S/390 vector types are only 8 byte aligned.
  S/390 Vector ABI GNU Attribute.
  S/390 Invalid vector binary ops

 gcc/common/config/s390/s390-common.c               |    3 +
 gcc/config.gcc                                     |   26 +-
 gcc/config/s390/constraints.md                     |   28 +
 gcc/config/s390/predicates.md                      |   12 +-
 gcc/config/s390/s390-builtin-types.def             |  747 ++++++
 gcc/config/s390/s390-builtins.def                  | 2486 ++++++++++++++++++++
 gcc/config/s390/s390-builtins.h                    |  160 ++
 gcc/config/s390/s390-c.c                           |  907 +++++++
 gcc/config/s390/s390-modes.def                     |   61 +
 gcc/config/s390/s390-opts.h                        |    1 +
 gcc/config/s390/s390-protos.h                      |   17 +
 gcc/config/s390/s390.c                             | 2314 +++++++++++++++---
 gcc/config/s390/s390.h                             |  220 +-
 gcc/config/s390/s390.md                            |  800 +++++--
 gcc/config/s390/s390.opt                           |   11 +
 gcc/config/s390/s390intrin.h                       |    3 +
 gcc/config/s390/t-s390                             |   27 +
 gcc/config/s390/vecintrin.h                        |  311 +++
 gcc/config/s390/vector.md                          | 1228 ++++++++++
 gcc/config/s390/vx-builtins.md                     | 2081 ++++++++++++++++
 gcc/configure                                      |   36 +
 gcc/configure.ac                                   |    7 +
 gcc/optabs.c                                       |   18 +-
 gcc/recog.h                                        |    2 +-
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11b.c       |    1 +
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11c.c       |    1 +
 gcc/testsuite/gcc.target/s390/s390.exp             |   18 +
 gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c   |   18 +
 gcc/testsuite/gcc.target/s390/vector/vec-abi-2.c   |   15 +
 gcc/testsuite/gcc.target/s390/vector/vec-abi-3.c   |  101 +
 gcc/testsuite/gcc.target/s390/vector/vec-abi-4.c   |   19 +
 .../gcc.target/s390/vector/vec-abi-align-1.c       |   48 +
 .../gcc.target/s390/vector/vec-abi-attr-1.c        |   18 +
 .../gcc.target/s390/vector/vec-abi-attr-2.c        |   53 +
 .../gcc.target/s390/vector/vec-abi-attr-3.c        |   18 +
 .../gcc.target/s390/vector/vec-abi-attr-4.c        |   17 +
 .../gcc.target/s390/vector/vec-abi-attr-5.c        |   19 +
 .../gcc.target/s390/vector/vec-abi-attr-6.c        |   24 +
 .../gcc.target/s390/vector/vec-abi-single-1.c      |   24 +
 .../gcc.target/s390/vector/vec-abi-single-2.c      |   12 +
 .../gcc.target/s390/vector/vec-abi-struct-1.c      |   37 +
 .../gcc.target/s390/vector/vec-abi-vararg-1.c      |   60 +
 .../gcc.target/s390/vector/vec-abi-vararg-2.c      |   18 +
 .../gcc.target/s390/vector/vec-clobber-1.c         |   38 +
 gcc/testsuite/gcc.target/s390/vector/vec-cmp-1.c   |   45 +
 gcc/testsuite/gcc.target/s390/vector/vec-cmp-2.c   |   38 +
 .../s390/vector/vec-dbl-math-compile-1.c           |   48 +
 .../gcc.target/s390/vector/vec-genbytemask-1.c     |   70 +
 .../gcc.target/s390/vector/vec-genbytemask-2.c     |   46 +
 .../gcc.target/s390/vector/vec-genmask-1.c         |   70 +
 .../gcc.target/s390/vector/vec-genmask-2.c         |   46 +
 gcc/testsuite/gcc.target/s390/vector/vec-init-1.c  |   68 +
 .../s390/vector/vec-int-math-compile-1.c           |   40 +
 .../gcc.target/s390/vector/vec-scalar-cmp-1.c      |   49 +
 gcc/testsuite/gcc.target/s390/vector/vec-shift-1.c |  108 +
 gcc/testsuite/gcc.target/s390/vector/vec-sub-1.c   |   51 +
 .../s390/zvector/vec-dbl-math-compile-1.c          |   67 +
 gcc/testsuite/gcc.target/s390/zvector/vec-elem-1.c |   11 +
 .../gcc.target/s390/zvector/vec-genbytemask-1.c    |   21 +
 .../gcc.target/s390/zvector/vec-genmask-1.c        |   24 +
 gcc/testsuite/gcc.target/s390/zvector/vec-lcbb-1.c |   31 +
 .../gcc.target/s390/zvector/vec-overloading-1.c    |   77 +
 .../gcc.target/s390/zvector/vec-overloading-2.c    |   54 +
 .../gcc.target/s390/zvector/vec-overloading-3.c    |   19 +
 .../gcc.target/s390/zvector/vec-overloading-4.c    |   18 +
 .../gcc.target/s390/zvector/vec-test-mask-1.c      |   25 +
 gcc/testsuite/lib/target-supports.exp              |    3 +-
 67 files changed, 12449 insertions(+), 645 deletions(-)
 create mode 100644 gcc/config/s390/s390-builtin-types.def
 create mode 100644 gcc/config/s390/s390-builtins.def
 create mode 100644 gcc/config/s390/s390-builtins.h
 create mode 100644 gcc/config/s390/s390-c.c
 create mode 100644 gcc/config/s390/t-s390
 create mode 100644 gcc/config/s390/vecintrin.h
 create mode 100644 gcc/config/s390/vector.md
 create mode 100644 gcc/config/s390/vx-builtins.md
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-3.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-4.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-align-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-3.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-4.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-5.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-6.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-single-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-single-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-struct-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-clobber-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-cmp-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-cmp-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-dbl-math-compile-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-genmask-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-genmask-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-init-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-int-math-compile-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-scalar-cmp-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-shift-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-sub-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-dbl-math-compile-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-elem-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-genbytemask-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-genmask-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-lcbb-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-overloading-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-overloading-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-overloading-3.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-overloading-4.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-test-mask-1.c

-- 
1.7.9.5


* [PATCH 01/13] recog: Increased max number of alternatives.
  2015-05-11 13:23 [PATCH 00/13] S/390 Implement support for IBM z13 Andreas Krebbel
@ 2015-05-11 13:23 ` Andreas Krebbel
  2015-05-11 14:01   ` Segher Boessenkool
                     ` (2 more replies)
  2015-05-11 13:24 ` [PATCH 11/13] Testsuite S/390 vector types are only 8 byte aligned Andreas Krebbel
                   ` (11 subsequent siblings)
  12 siblings, 3 replies; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-11 13:23 UTC
  To: gcc-patches

With vector facility support, the z13 mov patterns have more than 30
alternatives.
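
The comment in recog.h asks for a mask type with at least
MAX_RECOG_ALTERNATIVES + 1 bits.  The sketch below shows one way such a
requirement could be checked at compile time.  It is not the code in
recog.h: MAX_ALTS, alt_mask, alt_bit and all_alts are hypothetical names,
and a 64-bit type is used on the assumption that unsigned int is only
32 bits wide, which would be too narrow for a limit of 35.

```c
#include <limits.h>
#include <stdint.h>

/* Mirrors the new limit; hypothetical name for this sketch.  */
#define MAX_ALTS 35

/* One bit per alternative plus one spare bit for "uninitialized".  */
typedef uint64_t alt_mask;

_Static_assert (sizeof (alt_mask) * CHAR_BIT >= MAX_ALTS + 1,
		"mask type too narrow for the number of alternatives");

/* Mask with only the bit for alternative ALT set.  */
alt_mask
alt_bit (unsigned alt)
{
  return (alt_mask) 1 << alt;
}

/* Mask with the bits for all MAX_ALTS alternatives set.  */
alt_mask
all_alts (void)
{
  return ((alt_mask) 1 << MAX_ALTS) - 1;
}
```

With this layout the spare bit sits directly above the highest
alternative, so it never collides with a valid alternative bit.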

gcc/
	* recog.h: Increase MAX_RECOG_ALTERNATIVES.
---
 gcc/recog.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/recog.h b/gcc/recog.h
index 8a38b26..4d8ca0c 100644
--- a/gcc/recog.h
+++ b/gcc/recog.h
@@ -23,7 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 /* Random number that should be large enough for all purposes.  Also define
    a type that has at least MAX_RECOG_ALTERNATIVES + 1 bits, with the extra
    bit giving an invalid value that can be used to mean "uninitialized".  */
-#define MAX_RECOG_ALTERNATIVES 30
+#define MAX_RECOG_ALTERNATIVES 35
 typedef unsigned int alternative_mask;
 
 /* A mask of all alternatives.  */
-- 
1.7.9.5


* [PATCH 09/13] S/390 Add zvector testcases.
  2015-05-11 13:23 [PATCH 00/13] S/390 Implement support for IBM z13 Andreas Krebbel
                   ` (10 preceding siblings ...)
  2015-05-11 13:24 ` [PATCH 05/13] S/390 Vector base support Andreas Krebbel
@ 2015-05-11 13:24 ` Andreas Krebbel
  2015-05-11 13:41 ` [PATCH 08/13] S/390 zvector builtin support Andreas Krebbel
  12 siblings, 0 replies; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-11 13:24 UTC
  To: gcc-patches

gcc/testsuite/
	* gcc.target/s390/zvector/vec-dbl-math-compile-1.c: New test.
	* gcc.target/s390/zvector/vec-genbytemask-1.c: New test.
	* gcc.target/s390/zvector/vec-genmask-1.c: New test.
	* gcc.target/s390/zvector/vec-lcbb-1.c: New test.
	* gcc.target/s390/zvector/vec-overloading-1.c: New test.
	* gcc.target/s390/zvector/vec-overloading-2.c: New test.
	* gcc.target/s390/zvector/vec-overloading-3.c: New test.
	* gcc.target/s390/zvector/vec-overloading-4.c: New test.
	* gcc.target/s390/zvector/vec-test-mask-1.c: New test.
	* gcc.target/s390/zvector/vec-elem-1.c: New test.
---
 .../s390/zvector/vec-dbl-math-compile-1.c          |   67 +++++++++++++++++
 gcc/testsuite/gcc.target/s390/zvector/vec-elem-1.c |   11 +++
 .../gcc.target/s390/zvector/vec-genbytemask-1.c    |   21 ++++++
 .../gcc.target/s390/zvector/vec-genmask-1.c        |   24 ++++++
 gcc/testsuite/gcc.target/s390/zvector/vec-lcbb-1.c |   31 ++++++++
 .../gcc.target/s390/zvector/vec-overloading-1.c    |   77 ++++++++++++++++++++
 .../gcc.target/s390/zvector/vec-overloading-2.c    |   54 ++++++++++++++
 .../gcc.target/s390/zvector/vec-overloading-3.c    |   19 +++++
 .../gcc.target/s390/zvector/vec-overloading-4.c    |   18 +++++
 .../gcc.target/s390/zvector/vec-test-mask-1.c      |   25 +++++++
 10 files changed, 347 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-dbl-math-compile-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-elem-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-genbytemask-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-genmask-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-lcbb-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-overloading-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-overloading-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-overloading-3.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-overloading-4.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-test-mask-1.c

diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-dbl-math-compile-1.c b/gcc/testsuite/gcc.target/s390/zvector/vec-dbl-math-compile-1.c
new file mode 100644
index 0000000..31b277b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-dbl-math-compile-1.c
@@ -0,0 +1,67 @@
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13 -mzvector --save-temps" } */
+
+/* { dg-final { scan-assembler-times "vfcedb\t" 1 } } */
+/* { dg-final { scan-assembler-times "vfchdb\t" 2 } } */
+/* { dg-final { scan-assembler-times "vfchedb\t" 2 } } */
+
+/* { dg-final { scan-assembler-times "vfcedbs\t" 2 } } */
+/* { dg-final { scan-assembler-times "vfchdbs\t" 2 } } */
+
+/* { dg-final { cleanup-saved-temps } } */
+
+#include <vecintrin.h>
+
+vector bool long long
+cmpeq (vector double a, vector double b)
+{
+  return vec_cmpeq (a, b); /* vfcedb */
+}
+
+vector bool long long
+cmpgt (vector double a, vector double b)
+{
+  return vec_cmpgt (a, b); /* vfchdb */
+}
+
+vector bool long long
+cmpge (vector double a, vector double b)
+{
+  return vec_cmpge (a, b); /* vfchedb */
+}
+
+vector bool long long
+cmplt (vector double a, vector double b)
+{
+  return vec_cmplt (a, b); /* vfchdb */
+}
+
+vector bool long long
+cmple (vector double a, vector double b)
+{
+  return vec_cmple (a, b); /* vfchedb */
+}
+
+int
+all_eq (vector double a, vector double b)
+{
+  return vec_all_eq (a, b);
+}
+
+int
+any_eq (vector double a, vector double b)
+{
+  return vec_any_eq (a, b);
+}
+
+int
+all_lt (vector double a, vector double b)
+{
+  return vec_all_lt (a, b);
+}
+
+int
+any_lt (vector double a, vector double b)
+{
+  return vec_any_lt (a, b);
+}
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-elem-1.c b/gcc/testsuite/gcc.target/s390/zvector/vec-elem-1.c
new file mode 100644
index 0000000..c8578bf8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-elem-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13 -mzvector" } */
+
+/* { dg-final { scan-assembler "nilf\t%r2,15" } } */
+/* { dg-final { scan-assembler "vlgvb" } } */
+
+signed char
+foo(unsigned char uc)
+{
+  return __builtin_s390_vec_extract((__vector signed char){ 0 }, uc);
+}
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-genbytemask-1.c b/gcc/testsuite/gcc.target/s390/zvector/vec-genbytemask-1.c
new file mode 100644
index 0000000..09471f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-genbytemask-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch -march=z13 -mzvector" } */
+
+#include <vecintrin.h>
+
+
+vector unsigned char a, b, c, d;
+
+int
+foo ()
+{
+  a = vec_genmask (0);
+  b = vec_genmask (65535);
+  c = vec_genmask (43605);
+  d = vec_genmask (37830);
+}
+
+/* { dg-final { scan-assembler-times "vzero" 1 } } */
+/* { dg-final { scan-assembler-times "vone" 1 } } */
+/* { dg-final { scan-assembler-times "vgbm\t%v.*,43605" 1 } } */
+/* { dg-final { scan-assembler-times "vgbm\t%v.*,37830" 1 } } */
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-genmask-1.c b/gcc/testsuite/gcc.target/s390/zvector/vec-genmask-1.c
new file mode 100644
index 0000000..745c1ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-genmask-1.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch -march=z13 -mzvector" } */
+
+#include <vecintrin.h>
+
+
+vector unsigned int a, b, c, d, e, f;
+
+int
+foo ()
+{
+  a = vec_genmasks_32 (0, 31);
+  b = vec_genmasks_32 (0, 0);
+  c = vec_genmasks_32 (31, 31);
+  d = vec_genmasks_32 (5, 5);
+  e = vec_genmasks_32 (31, 0);
+  f = vec_genmasks_32 (6, 5);
+}
+/* { dg-final { scan-assembler-times "vone" 1 } } */
+/* { dg-final { scan-assembler-times "vgmf\t%v.*,0,0" 1 } } */
+/* { dg-final { scan-assembler-times "vgmf\t%v.*,31,31" 1 } } */
+/* { dg-final { scan-assembler-times "vgmf\t%v.*,5,5" 1 } } */
+/* { dg-final { scan-assembler-times "vgmf\t%v.*,31,0" 1 } } */
+/* { dg-final { scan-assembler-times "vone" 1 } } */
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-lcbb-1.c b/gcc/testsuite/gcc.target/s390/zvector/vec-lcbb-1.c
new file mode 100644
index 0000000..3588b61
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-lcbb-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13 -mzvector" } */
+
+/* { dg-final { scan-assembler-times "\tlcbb\t" 4 } } */
+
+#include <vecintrin.h>
+
+/* CC will be extracted into a GPR and returned.  */
+int
+foo1 (void *ptr)
+{
+  return __lcbb (ptr, 64);
+}
+
+int
+foo2 (void *ptr)
+{
+  return __lcbb (ptr, 128) > 16;
+}
+
+int
+foo3 (void *ptr)
+{
+  return __lcbb (ptr, 256) == 16;
+}
+
+int
+foo4 (void *ptr)
+{
+  return __lcbb (ptr, 512) < 16;
+}
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-overloading-1.c b/gcc/testsuite/gcc.target/s390/zvector/vec-overloading-1.c
new file mode 100644
index 0000000..ca3a943
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-overloading-1.c
@@ -0,0 +1,77 @@
+/* Test whether overloading works as expected.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-march=z13 -mzarch -mzvector -fdump-tree-original" } */
+
+__vector int var_v4si;
+__vector unsigned var_uv4si;
+__vector bool var_bv4si;
+__vector long long var_v2di;
+__vector unsigned long long var_uv2di;
+__vector bool long long var_bv2di;
+__vector double var_v2df;
+
+int *intptr;
+unsigned *uintptr;
+double *dblptr;
+unsigned long long ull;
+const int *cintptr;
+long long* llptr;
+unsigned long long* ullptr;
+
+typedef __vector int v4si;
+typedef __vector unsigned int uv4si;
+
+v4si var2_v4si;
+uv4si var2_uv4si;
+
+void
+foo ()
+{
+  __builtin_s390_vec_scatter_element (var_v4si,  var_uv4si, intptr, (unsigned long long)0);
+  __builtin_s390_vec_scatter_element (var2_v4si, var2_uv4si, intptr, (unsigned long long)0);
+  __builtin_s390_vec_scatter_element (var_bv4si, var_uv4si, uintptr, (unsigned long long)0);
+  __builtin_s390_vec_scatter_element (var_uv4si, var_uv4si, uintptr, (unsigned long long)0);
+  __builtin_s390_vec_scatter_element (var_v2di,  var_uv2di, llptr, (unsigned long long)0);
+  __builtin_s390_vec_scatter_element (var_bv2di, var_uv2di, ullptr, (unsigned long long)0);
+  __builtin_s390_vec_scatter_element (var_uv2di, var_uv2di, ullptr, (unsigned long long)0);
+  __builtin_s390_vec_scatter_element (var_v2df,  var_uv2di, dblptr, (unsigned long long)0);
+
+  /* While the last argument is an int there is a way to convert it to
+     unsigned long long, so this variant is supposed to match.  */
+  __builtin_s390_vec_scatter_element (var_v4si,  var_uv4si, intptr, 0);
+
+  __builtin_s390_vec_insert_and_zero (intptr);
+  __builtin_s390_vec_insert_and_zero (cintptr);
+
+  __builtin_s390_vec_promote ((signed char)1, 1);
+  __builtin_s390_vec_promote ((unsigned char)1, 1);
+  __builtin_s390_vec_promote ((short int)1, 1);
+  __builtin_s390_vec_promote ((unsigned short int)1, 1);
+  __builtin_s390_vec_promote ((int)1, 1);
+  __builtin_s390_vec_promote ((unsigned)1, 1);
+  __builtin_s390_vec_promote ((long long)1, 1);
+  __builtin_s390_vec_promote ((unsigned long long)1, 1);
+  __builtin_s390_vec_promote ((double)1, 1);
+
+  /* This is supposed to match vec_promote_s32 */
+  __builtin_s390_vec_promote (1, (signed char) -1);
+
+  /* Constants in C usually are considered int.  */
+  __builtin_s390_vec_promote (1, 1);
+
+  /* And (unsigned) long if they are too big for int.  */
+  __builtin_s390_vec_promote (1ULL << 32, 1);
+  __builtin_s390_vec_promote (1LL << 32, 1);
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_s390_vscef " 5 "original" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_s390_vsceg " 4 "original" } } */
+
+/* { dg-final { scan-tree-dump-times "__builtin_s390_vllezf " 2 "original" } } */
+
+/* { dg-final { scan-tree-dump-times "__builtin_s390_vlvgb_noin " 2 "original" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_s390_vlvgh_noin " 2 "original" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_s390_vlvgf_noin " 4 "original" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_s390_vlvgg_noin " 4 "original" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_s390_vlvgg_dbl_noin " 1 "original" } } */
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-overloading-2.c b/gcc/testsuite/gcc.target/s390/zvector/vec-overloading-2.c
new file mode 100644
index 0000000..fd66e02
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-overloading-2.c
@@ -0,0 +1,54 @@
+/* Test whether overloading works as expected.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-march=z13 -mzarch -mzvector" } */
+
+__vector int v4si;
+__vector unsigned uv4si;
+__vector bool bv4si;
+__vector long long v2di;
+__vector unsigned long long uv2di;
+__vector bool long long bv2di;
+__vector double v2df;
+int *intptr;
+unsigned *uintptr;
+double *dblptr;
+long long ll;
+unsigned long long ull;
+const int *cintptr;
+long long* llptr;
+unsigned long long* ullptr;
+
+void
+foo ()
+{
+  __builtin_s390_vec_scatter_element (v4si,  uv4si, (int*)0, 0); /* ok */
+  __builtin_s390_vec_insert_and_zero (intptr); /* ok */
+
+  /* The unsigned pointer must not match the signed pointer.  */
+  __builtin_s390_vec_scatter_element (v4si, uv4si, uintptr, 0); /* { dg-error "invalid parameter combination for intrinsic" } */
+
+  /* Make sure signed int pointers don't match unsigned int pointers.  */
+  __builtin_s390_vec_scatter_element (bv4si, uv4si, intptr, 0); /* { dg-error "invalid parameter combination for intrinsic" } */
+
+  /* Const pointers do not match unqualified operands.  */
+  __builtin_s390_vec_scatter_element (v4si, uv4si, cintptr, 0); /* { dg-error "invalid parameter combination for intrinsic" } */
+
+  /* Volatile pointers do not match unqualified operands.  */
+  __builtin_s390_vec_scatter_element (v4si, uv4si, cintptr, 0); /* { dg-error "invalid parameter combination for intrinsic" } */
+
+  /* The third operand needs to be double *.  */
+  __builtin_s390_vec_scatter_element (v2df, uv4si, intptr, 0); /* { dg-error "invalid parameter combination for intrinsic" } */
+
+  /* This is an ambiguous overload.  */
+  __builtin_s390_vec_scatter_element (v4si, uv4si, 0, 0); /* { dg-error "invalid parameter combination for intrinsic" } */
+
+  /* Pointer to vector must not match.  */
+  __builtin_s390_vec_scatter_element (v4si, uv4si, &v4si, 0); /* { dg-error "invalid parameter combination for intrinsic" } */
+
+  /* Don't accept const int* for int*.  */
+  __builtin_s390_vec_scatter_element (v4si,  uv4si, cintptr, 0); /* { dg-error "invalid parameter combination for intrinsic" } */
+
+  __builtin_s390_vec_load_pair (ll, ull); /* { dg-error "ambiguous overload for intrinsic" } */
+  __builtin_s390_vec_load_pair (ull, ll); /* { dg-error "ambiguous overload for intrinsic" } */
+}
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-overloading-3.c b/gcc/testsuite/gcc.target/s390/zvector/vec-overloading-3.c
new file mode 100644
index 0000000..761e5b6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-overloading-3.c
@@ -0,0 +1,19 @@
+/* Check for error messages supposed to be issued during overloading.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-march=z13 -mzarch -mzvector" } */
+
+__vector int v4si;
+__vector unsigned uv4si;
+
+int *intptr;
+unsigned long long ull;
+const unsigned int *ucintptr;
+
+void
+foo ()
+{
+  /* A backend check makes sure the fourth operand is a literal.  */
+  __builtin_s390_vec_gather_element (uv4si, uv4si, ucintptr, 256); /* { dg-error "constant argument 4 for builtin.*is out of range for target type" } */
+  __builtin_s390_vec_gather_element (uv4si, uv4si, ucintptr, 5); /* { dg-error "constant argument 4 for builtin.*is out of range" } */
+}
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-overloading-4.c b/gcc/testsuite/gcc.target/s390/zvector/vec-overloading-4.c
new file mode 100644
index 0000000..66912f9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-overloading-4.c
@@ -0,0 +1,18 @@
+/* Check for error messages supposed to be issued during builtin expansion.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-march=z13 -mzarch -mzvector" } */
+
+__vector int v4si;
+__vector unsigned uv4si;
+
+int *intptr;
+unsigned long long ull;
+const unsigned int *ucintptr;
+
+void
+foo ()
+{
+  /* A backend check makes sure the fourth operand is a literal.  */
+  __builtin_s390_vec_scatter_element (v4si, uv4si, intptr, ull); /* { dg-error "constant value required for builtin" } */
+}
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-test-mask-1.c b/gcc/testsuite/gcc.target/s390/zvector/vec-test-mask-1.c
new file mode 100644
index 0000000..418d5b2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-test-mask-1.c
@@ -0,0 +1,25 @@
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13 -mzvector" } */
+
+/* { dg-final { scan-assembler-times "vtm" 2 } } */
+/* { dg-final { scan-assembler-times "ipm" 1 } } */
+
+#include <vecintrin.h>
+
+/* CC will be extracted into a GPR and returned.  */
+int
+foo (vector unsigned int a, vector unsigned b)
+{
+  return vec_test_mask (a, b);
+}
+
+extern void baz (void);
+
+/* In that case the ipm/srl is supposed to be optimized out by
+   combine/s390_canonicalize_comparison.  */
+int
+bar (vector unsigned int a, vector unsigned b)
+{
+  if (vec_test_mask (a, b) == 2)
+    baz ();
+}
-- 
1.7.9.5


* [PATCH 10/13] Testsuite These testcases require disabling hardware vector support on S/390.
  2015-05-11 13:23 [PATCH 00/13] S/390 Implement support for IBM z13 Andreas Krebbel
                   ` (4 preceding siblings ...)
  2015-05-11 13:24 ` [PATCH 04/13] S/390 Add -march/-mtune=z13 option Andreas Krebbel
@ 2015-05-11 13:24 ` Andreas Krebbel
  2015-05-11 17:05   ` Jeff Law
  2015-05-11 13:24 ` [PATCH 13/13] S/390 Invalid vector binary ops Andreas Krebbel
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-11 13:24 UTC
  To: gcc-patches

gcc/testsuite/
	* gcc.dg/tree-ssa/gen-vect-11b.c: Disable vector
	  instructions on s390*.
	* gcc.dg/tree-ssa/gen-vect-11c.c: Likewise.
---
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11b.c |    1 +
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11c.c |    1 +
 2 files changed, 2 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11b.c b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11b.c
index 50dea9c..41d0e0c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11b.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11b.c
@@ -1,5 +1,6 @@
 /* { dg-do run { target vect_cmdline_needed } } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -mno-vx" { target { s390*-*-* } } } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -mno-sse" { target { i?86-*-* x86_64-*-* } } } */
 
 #include <stdlib.h>
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11c.c b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11c.c
index f3ada99..cf0aef7 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11c.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11c.c
@@ -1,5 +1,6 @@
 /* { dg-do run { target vect_cmdline_needed } } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -mno-vx" { target { s390*-*-* } } } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details -mno-sse" { target { i?86-*-* x86_64-*-* } } } */
 
 #include <stdlib.h>
-- 
1.7.9.5


* [PATCH 03/13] S/390 Fix secondary reload issue with store/load relative operands.
  2015-05-11 13:23 [PATCH 00/13] S/390 Implement support for IBM z13 Andreas Krebbel
                   ` (7 preceding siblings ...)
  2015-05-11 13:24 ` [PATCH 07/13] S/390 Add vector scalar instruction support Andreas Krebbel
@ 2015-05-11 13:24 ` Andreas Krebbel
  2015-05-11 13:24 ` [RFC 12/13] S/390 Vector ABI GNU Attribute Andreas Krebbel
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-11 13:24 UTC
  To: gcc-patches

We need a scratch register when loading from or storing to a symbolic
memory reference for which the load/store relative instructions cannot
be used.  However, the check currently fails to handle floating point
modes in GPRs correctly.
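
The reworked condition can be modelled outside of GCC roughly as
follows.  This is a simplified sketch, not the backend code: struct
access, its fields and needs_scratch are hypothetical, a one-byte access
stands in for QImode, and the symbol itself is assumed to be aligned so
that only the addend decides the alignment check.

```c
#include <stdbool.h>

/* Hypothetical, simplified description of one symbolic memory access.
   None of these names exist in GCC.  */
struct access
{
  unsigned size;	/* access width in bytes */
  bool uses_gpr;	/* register class intersects the general registers */
  unsigned long addend;	/* constant offset added to the symbol */
};

/* Return true if the load/store relative instructions cannot be used
   and a scratch register is needed to form the address.  */
bool
needs_scratch (const struct access *a, unsigned word_size)
{
  return a->size == 1			/* single-byte (QImode-like) access */
	 || !a->uses_gpr		/* value does not live in a GPR */
	 || a->size > word_size		/* wider than one register */
	 || a->addend % a->size != 0;	/* addend not aligned to the width */
}
```

In this model the result depends on the register class and the access
size rather than on whether the mode is a floating point mode, so an
aligned 8-byte value held in a GPR does not need a scratch register.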

gcc/
	* config/s390/s390.c (s390_secondary_reload): Fix check for
	load/store relative.
---
 gcc/config/s390/s390.c |   16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 7d16048..cc37618 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -3141,17 +3141,15 @@ s390_secondary_reload (bool in_p, rtx x, reg_class_t rclass_i,
 	sri->icode = ((mode == DImode) ? CODE_FOR_reloaddi_larl_odd_addend_z10
 		      : CODE_FOR_reloadsi_larl_odd_addend_z10);
 
-      /* On z10 we need a scratch register when moving QI, TI or floating
-	 point mode values from or to a memory location with a SYMBOL_REF
-	 or if the symref addend of a SI or DI move is not aligned to the
-	 width of the access.  */
+      /* Handle all the (mem (symref)) accesses we cannot use the z10
+	 instructions for.  */
       if (MEM_P (x)
 	  && s390_loadrelative_operand_p (XEXP (x, 0), NULL, NULL)
-	  && (mode == QImode || mode == TImode || FLOAT_MODE_P (mode)
-	      || (!TARGET_ZARCH && mode == DImode)
-	      || ((mode == HImode || mode == SImode || mode == DImode)
-		  && (!s390_check_symref_alignment (XEXP (x, 0),
-						    GET_MODE_SIZE (mode))))))
+	  && (mode == QImode
+	      || !reg_classes_intersect_p (GENERAL_REGS, rclass)
+	      || GET_MODE_SIZE (mode) > UNITS_PER_WORD
+	      || !s390_check_symref_alignment (XEXP (x, 0),
+					       GET_MODE_SIZE (mode))))
 	{
 #define __SECONDARY_RELOAD_CASE(M,m)					\
 	  case M##mode:							\
-- 
1.7.9.5


* [PATCH 05/13] S/390 Vector base support.
  2015-05-11 13:23 [PATCH 00/13] S/390 Implement support for IBM z13 Andreas Krebbel
                   ` (9 preceding siblings ...)
  2015-05-11 13:24 ` [RFC 12/13] S/390 Vector ABI GNU Attribute Andreas Krebbel
@ 2015-05-11 13:24 ` Andreas Krebbel
  2015-06-04 23:31   ` [BUILDROBOT] (was: [PATCH 05/13] S/390 Vector base support.) Jan-Benedict Glaw
  2015-05-11 13:24 ` [PATCH 09/13] S/390 Add zvector testcases Andreas Krebbel
  2015-05-11 13:41 ` [PATCH 08/13] S/390 zvector builtin support Andreas Krebbel
  12 siblings, 1 reply; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-11 13:24 UTC (permalink / raw)
  To: gcc-patches

gcc/
	* config/s390/constraints.md (j00, jm1, jxx, jyy, v): New
	constraints.
	* config/s390/predicates.md (const0_operand, constm1_operand)
	(consttable_operand): Accept vector operands.
	* config/s390/s390-modes.def: Add supported vector modes.
	* config/s390/s390-protos.h (s390_cannot_change_mode_class)
	(s390_function_arg_vector, s390_contiguous_bitmask_vector_p)
	(s390_bytemask_vector_p, s390_expand_vec_strlen)
	(s390_expand_vec_compare, s390_expand_vcond)
	(s390_expand_vec_init): Add prototypes.
	* config/s390/s390.c (VEC_ARG_NUM_REG): New macro.
	(s390_vector_mode_supported_p): New function.
	(s390_contiguous_bitmask_p): Mask out the irrelevant bits.
	(s390_contiguous_bitmask_vector_p): New function.
	(s390_bytemask_vector_p): New function.
	(s390_split_ok_p): Vector regs don't work either.
	(regclass_map): Add VEC_REGS.
	(s390_legitimate_constant_p): Handle vector constants.
	(s390_cannot_force_const_mem): Handle CONST_VECTOR.
	(legitimate_reload_vector_constant_p): New function.
	(s390_preferred_reload_class): Handle CONST_VECTOR.
	(s390_reload_symref_address): Likewise.
	(s390_secondary_reload): Vector memory instructions only support
	short displacements.  Rename reload*_nonoffmem* to reload*_la*.
	(s390_emit_ccraw_jump): New function.
	(s390_expand_vec_strlen): New function.
	(s390_expand_vec_compare): New function.
	(s390_expand_vcond): New function.
	(s390_expand_vec_init): New function.
	(s390_dwarf_frame_reg_mode): New function.
	(print_operand): Handle addresses with 'O' and 'R' constraints.
	(NR_C_MODES, constant_modes): Add vector modes.
	(s390_output_pool_entry): Handle vector constants.
	(s390_hard_regno_mode_ok): Handle vector registers.
	(s390_class_max_nregs): Likewise.
	(s390_cannot_change_mode_class): New function.
	(s390_invalid_arg_for_unprototyped_fn): New function.
	(s390_function_arg_vector): New function.
	(s390_function_arg_float): Remove size variable.
	(s390_pass_by_reference): Handle vector arguments.
	(s390_function_arg_advance): Likewise.
	(s390_function_arg): Likewise.
	(s390_return_in_memory): Vector values are returned in a VR if
	possible.
	(s390_function_and_libcall_value): Handle vector arguments.
	(s390_gimplify_va_arg): Likewise.
	(s390_call_saved_register_used): Consider the arguments named.
	(s390_conditional_register_usage): Disable v16-v31 for non-vec
	targets.
	(s390_preferred_simd_mode): New function.
	(s390_support_vector_misalignment): New function.
	(s390_vector_alignment): New function.
	(TARGET_STRICT_ARGUMENT_NAMING, TARGET_DWARF_FRAME_REG_MODE)
	(TARGET_VECTOR_MODE_SUPPORTED_P)
	(TARGET_INVALID_ARG_FOR_UNPROTOTYPED_FN)
	(TARGET_VECTORIZE_PREFERRED_SIMD_MODE)
	(TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT)
	(TARGET_VECTOR_ALIGNMENT): Define target macro.
	* config/s390/s390.h (FUNCTION_ARG_PADDING): Define macro.
	(FIRST_PSEUDO_REGISTER): Increase value.
	(VECTOR_NOFP_REGNO_P, VECTOR_REGNO_P, VECTOR_NOFP_REG_P)
	(VECTOR_REG_P): Define macros.
	(FIXED_REGISTERS, CALL_USED_REGISTERS)
	(CALL_REALLY_USED_REGISTERS, REG_ALLOC_ORDER)
	(HARD_REGNO_CALL_PART_CLOBBERED, REG_CLASS_NAMES)
	(FUNCTION_ARG_REGNO_P, FUNCTION_VALUE_REGNO_P, REGISTER_NAMES):
	Add vector registers.
	(CANNOT_CHANGE_MODE_CLASS): Call C function.
	(enum reg_class): Add VEC_REGS, ADDR_VEC_REGS, GENERAL_VEC_REGS.
	(SECONDARY_MEMORY_NEEDED): Allow SF<->SI mode moves without
	memory.
	(DBX_REGISTER_NUMBER, FIRST_VEC_ARG_REGNO, LAST_VEC_ARG_REGNO)
	(SHORT_DISP_IN_RANGE, VECTOR_STORE_FLAG_VALUE): Define macro.
	* config/s390/s390.md (UNSPEC_VEC_*): New constants.
	(VR*_REGNUM): New constants.
	(ALL): New mode iterator.
	(INTALL): Remove mode iterator.
	Include vector.md.
	(movti): Implement TImode moves for VRs.
	Disable TImode splitter for VR targets.
	Implement splitting TImode GPR<->VR moves.
	(reload*_tomem_z10, reload*_toreg_z10): Replace INTALL with ALL.
	(reload<mode>_nonoffmem_in, reload<mode>_nonoffmem_out): Rename to
	reload<mode>_la_in, reload<mode>_la_out.
	(*movdi_64, *movsi_zarch, *movhi, *movqi, *mov<mode>_64dfp)
	(*mov<mode>_64, *mov<mode>_31): Add vector instructions.
	(TD/TF mode splitter): Enable for GPRs only (formerly !FP).
	(mov<mode> SF SD): Prefer lder, lde for loading.
	Add lrl and strl instructions.
	Add vector instructions.
	(strlen<mode>): Rename old strlen<mode> to strlen_srst<mode>.
	Call s390_expand_vec_strlen on z13.
	(*cc_to_int): Change predicate to nonimmediate_operand.
	(addti3): Rename to *addti3.  New expander.
	(subti3): Rename to *subti3.  New expander.
	* config/s390/vector.md: New file.
---
 gcc/config/s390/constraints.md |   28 +
 gcc/config/s390/predicates.md  |   12 +-
 gcc/config/s390/s390-modes.def |   21 +
 gcc/config/s390/s390-protos.h  |    9 +
 gcc/config/s390/s390.c         | 1120 +++++++++++++++++++++++++++++++++---
 gcc/config/s390/s390.h         |  174 ++++--
 gcc/config/s390/s390.md        |  386 +++++++++----
 gcc/config/s390/vector.md      | 1226 ++++++++++++++++++++++++++++++++++++++++
 8 files changed, 2716 insertions(+), 260 deletions(-)
 create mode 100644 gcc/config/s390/vector.md

diff --git a/gcc/config/s390/constraints.md b/gcc/config/s390/constraints.md
index 25b0c98..66d4ace 100644
--- a/gcc/config/s390/constraints.md
+++ b/gcc/config/s390/constraints.md
@@ -29,7 +29,13 @@
 ;;    c -- Condition code register 33.
 ;;    d -- Any register from 0 to 15.
 ;;    f -- Floating point registers.
+;;    j -- Multiple letter constraint for constant scalar and vector values
+;;         j00: constant zero scalar or vector
+;;         jm1: constant scalar or vector with all bits set
+;;         jxx: contiguous bitmask of 0 or 1 in all vector elements
+;;         jyy: constant consisting of byte chunks being either 0 or 0xff
 ;;    t -- Access registers 36 and 37.
+;;    v -- Vector registers v0-v31.
 ;;    C -- A signed 8-bit constant (-128..127)
 ;;    D -- An unsigned 16-bit constant (0..65535)
 ;;    G -- Const double zero operand
@@ -102,6 +108,23 @@
   "FP_REGS"
   "Floating point registers")
 
+(define_constraint "j00"
+  "Zero scalar or vector constant"
+  (match_test "op == CONST0_RTX (GET_MODE (op))"))
+
+(define_constraint "jm1"
+  "All one bit scalar or vector constant"
+  (match_test "op == CONSTM1_RTX (GET_MODE (op))"))
+
+(define_constraint "jxx"
+  "@internal"
+  (and (match_code "const_vector")
+       (match_test "s390_contiguous_bitmask_vector_p (op, NULL, NULL)")))
+
+(define_constraint "jyy"
+  "@internal"
+  (and (match_code "const_vector")
+       (match_test "s390_bytemask_vector_p (op, NULL)")))
 
 (define_register_constraint "t"
   "ACCESS_REGS"
@@ -109,6 +132,11 @@
    Access registers 36 and 37")
 
 
+(define_register_constraint "v"
+  "VEC_REGS"
+  "Vector registers v0-v31")
+
+
 ;;
 ;;  General constraints for constants.
 ;;
diff --git a/gcc/config/s390/predicates.md b/gcc/config/s390/predicates.md
index 4d3fd97..46619b9 100644
--- a/gcc/config/s390/predicates.md
+++ b/gcc/config/s390/predicates.md
@@ -24,16 +24,20 @@
 
 ;; operands --------------------------------------------------------------
 
-;; Return true if OP a (const_int 0) operand.
-
+;; Return true if OP is a const 0 operand (int/float/vector).
 (define_predicate "const0_operand"
-  (and (match_code "const_int, const_double")
+  (and (match_code "const_int,const_double,const_vector")
        (match_test "op == CONST0_RTX (mode)")))
 
+;; Return true if OP is an all ones operand (int/float/vector).
+(define_predicate "constm1_operand"
+  (and (match_code "const_int, const_double,const_vector")
+       (match_test "op == CONSTM1_RTX (mode)")))
+
 ;; Return true if OP is constant.
 
 (define_special_predicate "consttable_operand"
-  (and (match_code "symbol_ref, label_ref, const, const_int, const_double")
+  (and (match_code "symbol_ref, label_ref, const, const_int, const_double, const_vector")
        (match_test "CONSTANT_P (op)")))
 
 ;; Return true if OP is a valid S-type operand.
diff --git a/gcc/config/s390/s390-modes.def b/gcc/config/s390/s390-modes.def
index 49a0684..a40559e 100644
--- a/gcc/config/s390/s390-modes.def
+++ b/gcc/config/s390/s390-modes.def
@@ -181,3 +181,24 @@ CC_MODE (CCT1);
 CC_MODE (CCT2);
 CC_MODE (CCT3);
 CC_MODE (CCRAW);
+
+/* Vector modes.  */
+
+VECTOR_MODES (INT, 2);        /*                 V2QI */
+VECTOR_MODES (INT, 4);        /*            V4QI V2HI */
+VECTOR_MODES (INT, 8);        /*       V8QI V4HI V2SI */
+VECTOR_MODES (INT, 16);       /* V16QI V8HI V4SI V2DI */
+
+VECTOR_MODE (FLOAT, SF, 2);   /* V2SF */
+VECTOR_MODE (FLOAT, SF, 4);   /* V4SF */
+VECTOR_MODE (FLOAT, DF, 2);   /* V2DF */
+
+VECTOR_MODE (INT, QI, 1);     /* V1QI */
+VECTOR_MODE (INT, HI, 1);     /* V1HI */
+VECTOR_MODE (INT, SI, 1);     /* V1SI */
+VECTOR_MODE (INT, DI, 1);     /* V1DI */
+VECTOR_MODE (INT, TI, 1);     /* V1TI */
+
+VECTOR_MODE (FLOAT, SF, 1);   /* V1SF */
+VECTOR_MODE (FLOAT, DF, 1);   /* V1DF */
+VECTOR_MODE (FLOAT, TF, 1);   /* V1TF */
diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index 92a2c00..b23806f 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -43,6 +43,9 @@ extern void s390_set_has_landing_pad_p (bool);
 extern bool s390_hard_regno_mode_ok (unsigned int, machine_mode);
 extern bool s390_hard_regno_rename_ok (unsigned int, unsigned int);
 extern int s390_class_max_nregs (enum reg_class, machine_mode);
+extern int s390_cannot_change_mode_class (machine_mode, machine_mode,
+					  enum reg_class);
+extern bool s390_function_arg_vector (machine_mode, const_tree);
 
 #ifdef RTX_CODE
 extern int s390_extra_constraint_str (rtx, int, const char *);
@@ -51,6 +54,8 @@ extern int s390_const_double_ok_for_constraint_p (rtx, int, const char *);
 extern int s390_single_part (rtx, machine_mode, machine_mode, int);
 extern unsigned HOST_WIDE_INT s390_extract_part (rtx, machine_mode, int);
 extern bool s390_contiguous_bitmask_p (unsigned HOST_WIDE_INT, int, int *, int *);
+extern bool s390_contiguous_bitmask_vector_p (rtx, int *, int *);
+extern bool s390_bytemask_vector_p (rtx, unsigned *);
 extern bool s390_split_ok_p (rtx, rtx, machine_mode, int);
 extern bool s390_overlap_p (rtx, rtx, HOST_WIDE_INT);
 extern bool s390_offset_p (rtx, rtx, rtx);
@@ -83,6 +88,7 @@ extern void s390_load_address (rtx, rtx);
 extern bool s390_expand_movmem (rtx, rtx, rtx);
 extern void s390_expand_setmem (rtx, rtx, rtx);
 extern bool s390_expand_cmpmem (rtx, rtx, rtx, rtx);
+extern void s390_expand_vec_strlen (rtx, rtx, rtx);
 extern bool s390_expand_addcc (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
 extern bool s390_expand_insv (rtx, rtx, rtx, rtx);
 extern void s390_expand_cs_hqi (machine_mode, rtx, rtx, rtx,
@@ -90,6 +96,9 @@ extern void s390_expand_cs_hqi (machine_mode, rtx, rtx, rtx,
 extern void s390_expand_atomic (machine_mode, enum rtx_code,
 				rtx, rtx, rtx, bool);
 extern void s390_expand_tbegin (rtx, rtx, rtx, bool);
+extern void s390_expand_vec_compare (rtx, enum rtx_code, rtx, rtx);
+extern void s390_expand_vcond (rtx, rtx, rtx, enum rtx_code, rtx, rtx);
+extern void s390_expand_vec_init (rtx, rtx);
 extern rtx s390_return_addr_rtx (int, rtx);
 extern rtx s390_back_chain_rtx (void);
 extern rtx_insn *s390_emit_call (rtx, rtx, rtx, rtx);
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 843a860..11fed14 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -97,6 +97,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "context.h"
 #include "builtins.h"
 #include "rtl-iter.h"
+#include "intl.h"
+#include "plugin-api.h"
+#include "ipa-ref.h"
+#include "cgraph.h"
 
 /* Define the specific costs for a given cpu.  */
 
@@ -440,6 +444,7 @@ struct GTY(()) machine_function
 /* Number of GPRs and FPRs used for argument passing.  */
 #define GP_ARG_NUM_REG 5
 #define FP_ARG_NUM_REG (TARGET_64BIT? 4 : 2)
+#define VEC_ARG_NUM_REG 8
 
 /* A couple of shortcuts.  */
 #define CONST_OK_FOR_J(x) \
@@ -576,6 +581,35 @@ s390_scalar_mode_supported_p (machine_mode mode)
   return default_scalar_mode_supported_p (mode);
 }
 
+/* Return true if the back end supports vector mode MODE.  */
+static bool
+s390_vector_mode_supported_p (machine_mode mode)
+{
+  machine_mode inner;
+
+  if (!VECTOR_MODE_P (mode)
+      || !TARGET_VX
+      || GET_MODE_SIZE (mode) > 16)
+    return false;
+
+  inner = GET_MODE_INNER (mode);
+
+  switch (inner)
+    {
+    case QImode:
+    case HImode:
+    case SImode:
+    case DImode:
+    case TImode:
+    case SFmode:
+    case DFmode:
+    case TFmode:
+      return true;
+    default:
+      return false;
+    }
+}
+
 /* Set the has_landing_pad_p flag in struct machine_function to VALUE.  */
 
 void
@@ -1473,6 +1507,9 @@ s390_contiguous_bitmask_p (unsigned HOST_WIDE_INT in, int size,
   /* Calculate a mask for all bits beyond the contiguous bits.  */
   mask = (-1LL & ~(((1ULL << (tmp_length + tmp_pos - 1)) << 1) - 1));
 
+  if ((unsigned)size < sizeof (HOST_WIDE_INT) * BITS_PER_UNIT)
+    mask &= (HOST_WIDE_INT_1U << size) - 1;
+
   if (mask & in)
     return false;
 
@@ -1488,6 +1525,101 @@ s390_contiguous_bitmask_p (unsigned HOST_WIDE_INT in, int size,
   return true;
 }
 
+/* Return true if OP contains the same contiguous bitfield in *all*
+   its elements.  START and END can be used to obtain the start and
+   end position of the bitfield.
+
+   START/END give the position of the first/last bit of the bitfield
+   counting from the lowest order bit starting with zero.  In order to
+   use these values for S/390 instructions they have to be converted to
+   "bits big endian" style.  */
+
+bool
+s390_contiguous_bitmask_vector_p (rtx op, int *start, int *end)
+{
+  unsigned HOST_WIDE_INT mask;
+  int length, size;
+
+  if (!VECTOR_MODE_P (GET_MODE (op))
+      || GET_CODE (op) != CONST_VECTOR
+      || !CONST_INT_P (XVECEXP (op, 0, 0)))
+    return false;
+
+  if (GET_MODE_NUNITS (GET_MODE (op)) > 1)
+    {
+      int i;
+
+      for (i = 1; i < GET_MODE_NUNITS (GET_MODE (op)); ++i)
+	if (!rtx_equal_p (XVECEXP (op, 0, i), XVECEXP (op, 0, 0)))
+	  return false;
+    }
+
+  size = GET_MODE_UNIT_BITSIZE (GET_MODE (op));
+  mask = UINTVAL (XVECEXP (op, 0, 0));
+  if (s390_contiguous_bitmask_p (mask, size, start,
+				 end != NULL ? &length : NULL))
+    {
+      if (end != NULL)
+	*end = *start + length - 1;
+      return true;
+    }
+  /* 0xff00000f style immediates can be covered by swapping start and
+     end indices in vgm.  */
+  if (s390_contiguous_bitmask_p (~mask, size, start,
+				 end != NULL ? &length : NULL))
+    {
+      if (end != NULL)
+	*end = *start - 1;
+      if (start != NULL)
+	*start = *start + length;
+      return true;
+    }
+  return false;
+}
+
+/* Return true if OP consists only of byte chunks being either 0 or
+   0xff.  If MASK is !=NULL a byte mask is generated which is
+   appropriate for the vector generate byte mask instruction.  */
+
+bool
+s390_bytemask_vector_p (rtx op, unsigned *mask)
+{
+  int i;
+  unsigned tmp_mask = 0;
+  int nunit, unit_size;
+
+  if (!VECTOR_MODE_P (GET_MODE (op))
+      || GET_CODE (op) != CONST_VECTOR
+      || !CONST_INT_P (XVECEXP (op, 0, 0)))
+    return false;
+
+  nunit = GET_MODE_NUNITS (GET_MODE (op));
+  unit_size = GET_MODE_UNIT_SIZE (GET_MODE (op));
+
+  for (i = 0; i < nunit; i++)
+    {
+      unsigned HOST_WIDE_INT c;
+      int j;
+
+      if (!CONST_INT_P (XVECEXP (op, 0, i)))
+	return false;
+
+      c = UINTVAL (XVECEXP (op, 0, i));
+      for (j = 0; j < unit_size; j++)
+	{
+	  if ((c & 0xff) != 0 && (c & 0xff) != 0xff)
+	    return false;
+	  tmp_mask |= (c & 1) << ((nunit - 1 - i) * unit_size + j);
+	  c = c >> BITS_PER_UNIT;
+	}
+    }
+
+  if (mask != NULL)
+    *mask = tmp_mask;
+
+  return true;
+}
+
 /* Check whether a rotate of ROTL followed by an AND of CONTIG is
    equivalent to a shift followed by the AND.  In particular, CONTIG
    should not overlap the (rotated) bit 0/bit 63 gap.  Negative values
@@ -1513,8 +1645,8 @@ s390_extzv_shift_ok (int bitsize, int rotl, unsigned HOST_WIDE_INT contig)
 bool
 s390_split_ok_p (rtx dst, rtx src, machine_mode mode, int first_subword)
 {
-  /* Floating point registers cannot be split.  */
-  if (FP_REG_P (src) || FP_REG_P (dst))
+  /* Floating point and vector registers cannot be split.  */
+  if (FP_REG_P (src) || FP_REG_P (dst) || VECTOR_REG_P (src) || VECTOR_REG_P (dst))
     return false;
 
   /* We don't need to split if operands are directly accessible.  */
@@ -1705,16 +1837,20 @@ s390_init_machine_status (void)
 /* Map for smallest class containing reg regno.  */
 
 const enum reg_class regclass_map[FIRST_PSEUDO_REGISTER] =
-{ GENERAL_REGS, ADDR_REGS, ADDR_REGS, ADDR_REGS,
-  ADDR_REGS,    ADDR_REGS, ADDR_REGS, ADDR_REGS,
-  ADDR_REGS,    ADDR_REGS, ADDR_REGS, ADDR_REGS,
-  ADDR_REGS,    ADDR_REGS, ADDR_REGS, ADDR_REGS,
-  FP_REGS,      FP_REGS,   FP_REGS,   FP_REGS,
-  FP_REGS,      FP_REGS,   FP_REGS,   FP_REGS,
-  FP_REGS,      FP_REGS,   FP_REGS,   FP_REGS,
-  FP_REGS,      FP_REGS,   FP_REGS,   FP_REGS,
-  ADDR_REGS,    CC_REGS,   ADDR_REGS, ADDR_REGS,
-  ACCESS_REGS,	ACCESS_REGS
+{ GENERAL_REGS, ADDR_REGS, ADDR_REGS, ADDR_REGS,  /*  0 */
+  ADDR_REGS,    ADDR_REGS, ADDR_REGS, ADDR_REGS,  /*  4 */
+  ADDR_REGS,    ADDR_REGS, ADDR_REGS, ADDR_REGS,  /*  8 */
+  ADDR_REGS,    ADDR_REGS, ADDR_REGS, ADDR_REGS,  /* 12 */
+  FP_REGS,      FP_REGS,   FP_REGS,   FP_REGS,    /* 16 */
+  FP_REGS,      FP_REGS,   FP_REGS,   FP_REGS,    /* 20 */
+  FP_REGS,      FP_REGS,   FP_REGS,   FP_REGS,    /* 24 */
+  FP_REGS,      FP_REGS,   FP_REGS,   FP_REGS,    /* 28 */
+  ADDR_REGS,    CC_REGS,   ADDR_REGS, ADDR_REGS,  /* 32 */
+  ACCESS_REGS,	ACCESS_REGS, VEC_REGS, VEC_REGS,  /* 36 */
+  VEC_REGS, VEC_REGS, VEC_REGS, VEC_REGS,         /* 40 */
+  VEC_REGS, VEC_REGS, VEC_REGS, VEC_REGS,         /* 44 */
+  VEC_REGS, VEC_REGS, VEC_REGS, VEC_REGS,         /* 48 */
+  VEC_REGS, VEC_REGS                              /* 52 */
 };
 
 /* Return attribute type of insn.  */
@@ -2775,6 +2911,17 @@ legitimate_pic_operand_p (rtx op)
 static bool
 s390_legitimate_constant_p (machine_mode mode, rtx op)
 {
+  if (VECTOR_MODE_P (mode) && GET_CODE (op) == CONST_VECTOR)
+    {
+      if (GET_MODE_SIZE (mode) != 16)
+	return 0;
+
+      if (!const0_operand (op, mode)
+	  && !s390_contiguous_bitmask_vector_p (op, NULL, NULL)
+	  && !s390_bytemask_vector_p (op, NULL))
+	return 0;
+    }
+
   /* Accept all non-symbolic constants.  */
   if (!SYMBOLIC_CONST (op))
     return 1;
@@ -2811,6 +2958,7 @@ s390_cannot_force_const_mem (machine_mode mode, rtx x)
     {
     case CONST_INT:
     case CONST_DOUBLE:
+    case CONST_VECTOR:
       /* Accept all non-symbolic constants.  */
       return false;
 
@@ -2943,6 +3091,27 @@ legitimate_reload_fp_constant_p (rtx op)
   return false;
 }
 
+/* Returns true if the constant value OP is a legitimate vector operand
+   during and after reload.
+   This function accepts all constants which can be loaded directly
+   into a VR.  */
+
+static bool
+legitimate_reload_vector_constant_p (rtx op)
+{
+  /* FIXME: Support constant vectors with all the same 16 bit unsigned
+     operands.  These can be loaded with vrepi.  */
+
+  if (TARGET_VX && GET_MODE_SIZE (GET_MODE (op)) == 16
+      && (const0_operand (op, GET_MODE (op))
+	  || constm1_operand (op, GET_MODE (op))
+	  || s390_contiguous_bitmask_vector_p (op, NULL, NULL)
+	  || s390_bytemask_vector_p (op, NULL)))
+    return true;
+
+  return false;
+}
+
 /* Given an rtx OP being reloaded into a reg required to be in class RCLASS,
    return the class of reg to actually use.  */
 
@@ -2953,6 +3122,7 @@ s390_preferred_reload_class (rtx op, reg_class_t rclass)
     {
       /* Constants we cannot reload into general registers
 	 must be forced into the literal pool.  */
+      case CONST_VECTOR:
       case CONST_DOUBLE:
       case CONST_INT:
 	if (reg_class_subset_p (GENERAL_REGS, rclass)
@@ -2964,6 +3134,10 @@ s390_preferred_reload_class (rtx op, reg_class_t rclass)
 	else if (reg_class_subset_p (FP_REGS, rclass)
 		 && legitimate_reload_fp_constant_p (op))
 	  return FP_REGS;
+	else if (reg_class_subset_p (VEC_REGS, rclass)
+		 && legitimate_reload_vector_constant_p (op))
+	  return VEC_REGS;
+
 	return NO_REGS;
 
       /* If a symbolic constant or a PLUS is reloaded,
@@ -3087,6 +3261,7 @@ s390_reload_symref_address (rtx reg, rtx mem, rtx scratch, bool tomem)
   /* Reload might have pulled a constant out of the literal pool.
      Force it back in.  */
   if (CONST_INT_P (mem) || GET_CODE (mem) == CONST_DOUBLE
+      || GET_CODE (mem) == CONST_VECTOR
       || GET_CODE (mem) == CONST)
     mem = force_const_mem (GET_MODE (reg), mem);
 
@@ -3126,6 +3301,30 @@ s390_secondary_reload (bool in_p, rtx x, reg_class_t rclass_i,
   if (reg_classes_intersect_p (CC_REGS, rclass))
     return GENERAL_REGS;
 
+  if (TARGET_VX)
+    {
+      /* The vst/vl vector move instructions allow only for short
+	 displacements.  */
+      if (MEM_P (x)
+	  && GET_CODE (XEXP (x, 0)) == PLUS
+	  && GET_CODE (XEXP (XEXP (x, 0), 1)) == CONST_INT
+	  && !SHORT_DISP_IN_RANGE(INTVAL (XEXP (XEXP (x, 0), 1)))
+	  && reg_class_subset_p (rclass, VEC_REGS)
+	  && (!reg_class_subset_p (rclass, FP_REGS)
+	      || (GET_MODE_SIZE (mode) > 8
+		  && s390_class_max_nregs (FP_REGS, mode) == 1)))
+	{
+	  if (in_p)
+	    sri->icode = (TARGET_64BIT ?
+			  CODE_FOR_reloaddi_la_in :
+			  CODE_FOR_reloadsi_la_in);
+	  else
+	    sri->icode = (TARGET_64BIT ?
+			  CODE_FOR_reloaddi_la_out :
+			  CODE_FOR_reloadsi_la_out);
+	}
+    }
+
   if (TARGET_Z10)
     {
       HOST_WIDE_INT offset;
@@ -3174,7 +3373,27 @@ s390_secondary_reload (bool in_p, rtx x, reg_class_t rclass_i,
 	      __SECONDARY_RELOAD_CASE (SD, sd);
 	      __SECONDARY_RELOAD_CASE (DD, dd);
 	      __SECONDARY_RELOAD_CASE (TD, td);
-
+	      __SECONDARY_RELOAD_CASE (V1QI, v1qi);
+	      __SECONDARY_RELOAD_CASE (V2QI, v2qi);
+	      __SECONDARY_RELOAD_CASE (V4QI, v4qi);
+	      __SECONDARY_RELOAD_CASE (V8QI, v8qi);
+	      __SECONDARY_RELOAD_CASE (V16QI, v16qi);
+	      __SECONDARY_RELOAD_CASE (V1HI, v1hi);
+	      __SECONDARY_RELOAD_CASE (V2HI, v2hi);
+	      __SECONDARY_RELOAD_CASE (V4HI, v4hi);
+	      __SECONDARY_RELOAD_CASE (V8HI, v8hi);
+	      __SECONDARY_RELOAD_CASE (V1SI, v1si);
+	      __SECONDARY_RELOAD_CASE (V2SI, v2si);
+	      __SECONDARY_RELOAD_CASE (V4SI, v4si);
+	      __SECONDARY_RELOAD_CASE (V1DI, v1di);
+	      __SECONDARY_RELOAD_CASE (V2DI, v2di);
+	      __SECONDARY_RELOAD_CASE (V1TI, v1ti);
+	      __SECONDARY_RELOAD_CASE (V1SF, v1sf);
+	      __SECONDARY_RELOAD_CASE (V2SF, v2sf);
+	      __SECONDARY_RELOAD_CASE (V4SF, v4sf);
+	      __SECONDARY_RELOAD_CASE (V1DF, v1df);
+	      __SECONDARY_RELOAD_CASE (V2DF, v2df);
+	      __SECONDARY_RELOAD_CASE (V1TF, v1tf);
 	    default:
 	      gcc_unreachable ();
 	    }
@@ -3213,12 +3432,12 @@ s390_secondary_reload (bool in_p, rtx x, reg_class_t rclass_i,
 	{
 	  if (in_p)
 	    sri->icode = (TARGET_64BIT ?
-			  CODE_FOR_reloaddi_nonoffmem_in :
-			  CODE_FOR_reloadsi_nonoffmem_in);
+			  CODE_FOR_reloaddi_la_in :
+			  CODE_FOR_reloadsi_la_in);
 	  else
 	    sri->icode = (TARGET_64BIT ?
-			  CODE_FOR_reloaddi_nonoffmem_out :
-			  CODE_FOR_reloadsi_nonoffmem_out);
+			  CODE_FOR_reloaddi_la_out :
+			  CODE_FOR_reloadsi_la_out);
 	}
     }
 
@@ -4452,6 +4671,138 @@ s390_expand_cmpmem (rtx target, rtx op0, rtx op1, rtx len)
   return true;
 }
 
+/* Emit a conditional jump to LABEL for condition code mask MASK using
+   comparison operator COMPARISON.  Return the emitted jump insn.  */
+
+static rtx
+s390_emit_ccraw_jump (HOST_WIDE_INT mask, enum rtx_code comparison, rtx label)
+{
+  rtx temp;
+
+  gcc_assert (comparison == EQ || comparison == NE);
+  gcc_assert (mask > 0 && mask < 15);
+
+  temp = gen_rtx_fmt_ee (comparison, VOIDmode,
+			 gen_rtx_REG (CCRAWmode, CC_REGNUM), GEN_INT (mask));
+  temp = gen_rtx_IF_THEN_ELSE (VOIDmode, temp,
+			       gen_rtx_LABEL_REF (VOIDmode, label), pc_rtx);
+  temp = gen_rtx_SET (VOIDmode, pc_rtx, temp);
+  return emit_jump_insn (temp);
+}
+
+/* Emit the instructions to implement strlen of STRING and store the
+   result in TARGET.  The string has the known ALIGNMENT.  This
+   version uses vector instructions and is therefore not appropriate
+   for targets prior to z13.  */
+
+void
+s390_expand_vec_strlen (rtx target, rtx string, rtx alignment)
+{
+  int very_unlikely = REG_BR_PROB_BASE / 100 - 1;
+  int very_likely = REG_BR_PROB_BASE - 1;
+  rtx highest_index_to_load_reg = gen_reg_rtx (Pmode);
+  rtx str_reg = gen_reg_rtx (V16QImode);
+  rtx str_addr_base_reg = gen_reg_rtx (Pmode);
+  rtx str_idx_reg = gen_reg_rtx (Pmode);
+  rtx result_reg = gen_reg_rtx (V16QImode);
+  rtx is_aligned_label = gen_label_rtx ();
+  rtx into_loop_label = NULL_RTX;
+  rtx loop_start_label = gen_label_rtx ();
+  rtx temp;
+  rtx len = gen_reg_rtx (QImode);
+  rtx cond;
+
+  s390_load_address (str_addr_base_reg, XEXP (string, 0));
+  emit_move_insn (str_idx_reg, const0_rtx);
+
+  if (INTVAL (alignment) < 16)
+    {
+      /* If the address happens to be aligned properly, jump
+	 directly to the aligned loop.  */
+      emit_cmp_and_jump_insns (gen_rtx_AND (Pmode,
+					    str_addr_base_reg, GEN_INT (15)),
+			       const0_rtx, EQ, NULL_RTX,
+			       Pmode, 1, is_aligned_label);
+
+      temp = gen_reg_rtx (Pmode);
+      temp = expand_binop (Pmode, and_optab, str_addr_base_reg,
+			   GEN_INT (15), temp, 1, OPTAB_DIRECT);
+      gcc_assert (REG_P (temp));
+      highest_index_to_load_reg =
+	expand_binop (Pmode, sub_optab, GEN_INT (15), temp,
+		      highest_index_to_load_reg, 1, OPTAB_DIRECT);
+      gcc_assert (REG_P (highest_index_to_load_reg));
+      emit_insn (gen_vllv16qi (str_reg,
+		   convert_to_mode (SImode, highest_index_to_load_reg, 1),
+		   gen_rtx_MEM (BLKmode, str_addr_base_reg)));
+
+      into_loop_label = gen_label_rtx ();
+      s390_emit_jump (into_loop_label, NULL_RTX);
+      emit_barrier ();
+    }
+
+  emit_label (is_aligned_label);
+  LABEL_NUSES (is_aligned_label) = INTVAL (alignment) < 16 ? 2 : 1;
+
+  /* From this point on we only perform 16 byte aligned
+     loads.  */
+  emit_move_insn (highest_index_to_load_reg, GEN_INT (15));
+
+  emit_label (loop_start_label);
+  LABEL_NUSES (loop_start_label) = 1;
+
+  /* Load 16 bytes of the string into VR.  */
+  emit_move_insn (str_reg,
+		  gen_rtx_MEM (V16QImode,
+			       gen_rtx_PLUS (Pmode, str_idx_reg,
+					     str_addr_base_reg)));
+  if (into_loop_label != NULL_RTX)
+    {
+      emit_label (into_loop_label);
+      LABEL_NUSES (into_loop_label) = 1;
+    }
+
+  /* Increment string index by 16 bytes.  */
+  expand_binop (Pmode, add_optab, str_idx_reg, GEN_INT (16),
+		str_idx_reg, 1, OPTAB_DIRECT);
+
+  emit_insn (gen_vec_vfenesv16qi (result_reg, str_reg, str_reg,
+				  GEN_INT (VSTRING_FLAG_ZS | VSTRING_FLAG_CS)));
+
+  add_int_reg_note (s390_emit_ccraw_jump (8, NE, loop_start_label),
+		    REG_BR_PROB, very_likely);
+  emit_insn (gen_vec_extractv16qi (len, result_reg, GEN_INT (7)));
+
+  /* If the string pointer wasn't aligned we have loaded less than 16
+     bytes and the remaining bytes got filled with zeros (by vll).
+     Now we have to check whether the resulting index lies within the
+     bytes actually part of the string.  */
+
+  cond = s390_emit_compare (GT, convert_to_mode (Pmode, len, 1),
+			    highest_index_to_load_reg);
+  s390_load_address (highest_index_to_load_reg,
+		     gen_rtx_PLUS (Pmode, highest_index_to_load_reg,
+				   const1_rtx));
+  if (TARGET_64BIT)
+    emit_insn (gen_movdicc (str_idx_reg, cond,
+			    highest_index_to_load_reg, str_idx_reg));
+  else
+    emit_insn (gen_movsicc (str_idx_reg, cond,
+			    highest_index_to_load_reg, str_idx_reg));
+
+  add_int_reg_note (s390_emit_jump (is_aligned_label, cond), REG_BR_PROB,
+		    very_unlikely);
+
+  expand_binop (Pmode, add_optab, str_idx_reg,
+		GEN_INT (-16), str_idx_reg, 1, OPTAB_DIRECT);
+  /* FIXME: len is already zero extended - so avoid the llgcr emitted
+     here.  */
+  temp = expand_binop (Pmode, add_optab, str_idx_reg,
+		       convert_to_mode (Pmode, len, 1),
+		       target, 1, OPTAB_DIRECT);
+  if (temp != target)
+    emit_move_insn (target, temp);
+}
 
 /* Expand conditional increment or decrement using alc/slb instructions.
    Should generate code setting DST to either SRC or SRC + INCREMENT,
@@ -4806,6 +5157,216 @@ s390_expand_mask_and_shift (rtx val, machine_mode mode, rtx count)
 			      NULL_RTX, 1, OPTAB_DIRECT);
 }
 
+/* Generate a vector comparison COND of CMP_OP1 and CMP_OP2 and store
+   the result in TARGET.  */
+
+void
+s390_expand_vec_compare (rtx target, enum rtx_code cond,
+			 rtx cmp_op1, rtx cmp_op2)
+{
+  machine_mode mode = GET_MODE (target);
+  bool neg_p = false, swap_p = false;
+  rtx tmp;
+
+  if (GET_MODE (cmp_op1) == V2DFmode)
+    {
+      switch (cond)
+	{
+	  /* NE a != b -> !(a == b) */
+	case NE:   cond = EQ; neg_p = true;                break;
+	  /* UNGT a u> b -> !(b >= a) */
+	case UNGT: cond = GE; neg_p = true; swap_p = true; break;
+	  /* UNGE a u>= b -> !(b > a) */
+	case UNGE: cond = GT; neg_p = true; swap_p = true; break;
+	  /* LE: a <= b -> b >= a */
+	case LE:   cond = GE;               swap_p = true; break;
+	  /* UNLE: a u<= b -> !(a > b) */
+	case UNLE: cond = GT; neg_p = true;                break;
+	  /* LT: a < b -> b > a */
+	case LT:   cond = GT;               swap_p = true; break;
+	  /* UNLT: a u< b -> !(a >= b) */
+	case UNLT: cond = GE; neg_p = true;                break;
+	case UNEQ:
+	  emit_insn (gen_vec_cmpuneqv2df (target, cmp_op1, cmp_op2));
+	  return;
+	case LTGT:
+	  emit_insn (gen_vec_cmpltgtv2df (target, cmp_op1, cmp_op2));
+	  return;
+	case ORDERED:
+	  emit_insn (gen_vec_orderedv2df (target, cmp_op1, cmp_op2));
+	  return;
+	case UNORDERED:
+	  emit_insn (gen_vec_unorderedv2df (target, cmp_op1, cmp_op2));
+	  return;
+	default: break;
+	}
+    }
+  else
+    {
+      switch (cond)
+	{
+	  /* NE: a != b -> !(a == b) */
+	case NE:  cond = EQ;  neg_p = true;                break;
+	  /* GE: a >= b -> !(b > a) */
+	case GE:  cond = GT;  neg_p = true; swap_p = true; break;
+	  /* GEU: a >= b -> !(b > a) */
+	case GEU: cond = GTU; neg_p = true; swap_p = true; break;
+	  /* LE: a <= b -> !(a > b) */
+	case LE:  cond = GT;  neg_p = true;                break;
+	  /* LEU: a <= b -> !(a > b) */
+	case LEU: cond = GTU; neg_p = true;                break;
+	  /* LT: a < b -> b > a */
+	case LT:  cond = GT;                swap_p = true; break;
+	  /* LTU: a < b -> b > a */
+	case LTU: cond = GTU;               swap_p = true; break;
+	default: break;
+	}
+    }
+
+  if (swap_p)
+    {
+      tmp = cmp_op1; cmp_op1 = cmp_op2; cmp_op2 = tmp;
+    }
+
+  emit_insn (gen_rtx_SET (mode,
+			  target, gen_rtx_fmt_ee (cond,
+						  mode,
+						  cmp_op1, cmp_op2)));
+  if (neg_p)
+    emit_insn (gen_rtx_SET (mode, target, gen_rtx_NOT (mode, target)));
+}
+
+/* Generate a vector comparison expression loading either elements of
+   THEN or ELS into TARGET depending on the comparison COND of CMP_OP1
+   and CMP_OP2.  */
+
+void
+s390_expand_vcond (rtx target, rtx then, rtx els,
+		   enum rtx_code cond, rtx cmp_op1, rtx cmp_op2)
+{
+  rtx tmp;
+  machine_mode result_mode;
+  rtx result_target;
+
+  /* We always use an integral type vector to hold the comparison
+     result.  */
+  result_mode = GET_MODE (cmp_op1) == V2DFmode ? V2DImode : GET_MODE (cmp_op1);
+  result_target = gen_reg_rtx (result_mode);
+
+  /* Alternatively this could be done by reload by lowering the cmp*
+     predicates.  But it appears to be better for scheduling etc. to
+     have that done early.  */
+  if (!REG_P (cmp_op1))
+    cmp_op1 = force_reg (GET_MODE (target), cmp_op1);
+
+  if (!REG_P (cmp_op2))
+    cmp_op2 = force_reg (GET_MODE (target), cmp_op2);
+
+  s390_expand_vec_compare (result_target, cond,
+			   cmp_op1, cmp_op2);
+
+  /* If the results are supposed to be either -1 or 0 we are done
+     since this is what our compare instructions generate anyway.  */
+  if (constm1_operand (then, GET_MODE (then))
+      && const0_operand (els, GET_MODE (els)))
+    {
+      emit_move_insn (target, gen_rtx_SUBREG (GET_MODE (target),
+					      result_target, 0));
+      return;
+    }
+
+  /* Otherwise we will do a vsel afterwards.  */
+  /* This gets triggered e.g.
+     with gcc.c-torture/compile/pr53410-1.c */
+  if (!REG_P (then))
+    then = force_reg (GET_MODE (target), then);
+
+  if (!REG_P (els))
+    els = force_reg (GET_MODE (target), els);
+
+  tmp = gen_rtx_fmt_ee (EQ, VOIDmode,
+			result_target,
+			CONST0_RTX (result_mode));
+
+  /* We compared the result against zero above so we have to swap then
+     and els here.  */
+  tmp = gen_rtx_IF_THEN_ELSE (GET_MODE (target), tmp, els, then);
+
+  gcc_assert (GET_MODE (target) == GET_MODE (then));
+  emit_insn (gen_rtx_SET (GET_MODE (target), target, tmp));
+}
+
+/* Emit the RTX necessary to initialize the vector TARGET with values
+   in VALS.  */
+void
+s390_expand_vec_init (rtx target, rtx vals)
+{
+  machine_mode mode = GET_MODE (target);
+  machine_mode inner_mode = GET_MODE_INNER (mode);
+  int n_elts = GET_MODE_NUNITS (mode);
+  bool all_same = true, all_regs = true, all_const_int = true;
+  rtx x;
+  int i;
+
+  for (i = 0; i < n_elts; ++i)
+    {
+      x = XVECEXP (vals, 0, i);
+
+      if (!CONST_INT_P (x))
+	all_const_int = false;
+
+      if (i > 0 && !rtx_equal_p (x, XVECEXP (vals, 0, 0)))
+	all_same = false;
+
+      if (!REG_P (x))
+	all_regs = false;
+    }
+
+  /* Use vector gen mask or vector gen byte mask if possible.  */
+  if (all_same && all_const_int
+      && (XVECEXP (vals, 0, 0) == const0_rtx
+	  || s390_contiguous_bitmask_vector_p (XVECEXP (vals, 0, 0),
+					       NULL, NULL)
+	  || s390_bytemask_vector_p (XVECEXP (vals, 0, 0), NULL)))
+    {
+      emit_insn (gen_rtx_SET (mode, target,
+			      gen_rtx_CONST_VECTOR (mode, XVEC (vals, 0))));
+      return;
+    }
+
+  if (all_same)
+    {
+      emit_insn (gen_rtx_SET (mode, target,
+			      gen_rtx_VEC_DUPLICATE (mode,
+						     XVECEXP (vals, 0, 0))));
+      return;
+    }
+
+  if (all_regs && REG_P (target) && n_elts == 2 && inner_mode == DImode)
+    {
+      /* Use vector load pair.  */
+      emit_insn (gen_rtx_SET (mode, target,
+			      gen_rtx_VEC_CONCAT (mode,
+						  XVECEXP (vals, 0, 0),
+						  XVECEXP (vals, 0, 1))));
+      return;
+    }
+
+  /* We are about to set the vector elements one by one.  Zero out the
+     full register first in order to help the data flow framework to
+     detect it as a full VR set.  */
+  emit_insn (gen_rtx_SET (mode, target, CONST0_RTX (mode)));
+
+  /* Unfortunately the vec_init expander is not allowed to fail.  So
+     we have to implement the fallback ourselves.  */
+  for (i = 0; i < n_elts; i++)
+    emit_insn (gen_rtx_SET (mode, target,
+			    gen_rtx_UNSPEC (mode,
+					    gen_rtvec (3, XVECEXP (vals, 0, i),
+						       GEN_INT (i), target),
+					    UNSPEC_VEC_SET)));
+}
+
 /* Structure to hold the initial parameters for a compare_and_swap operation
    in HImode and QImode.  */
 
@@ -5101,6 +5662,20 @@ s390_output_dwarf_dtprel (FILE *file, int size, rtx x)
   fputs ("@DTPOFF", file);
 }
 
+/* Return the proper mode for REGNO being represented in the dwarf
+   unwind table.  */
+machine_mode
+s390_dwarf_frame_reg_mode (int regno)
+{
+  machine_mode save_mode = default_dwarf_frame_reg_mode (regno);
+
+  /* The rightmost 64 bits of vector registers are call-clobbered.  */
+  if (GET_MODE_SIZE (save_mode) > 8)
+    save_mode = DImode;
+
+  return save_mode;
+}
+
 #ifdef TARGET_ALTERNATE_LONG_DOUBLE_MANGLING
 /* Implement TARGET_MANGLE_TYPE.  */
 
@@ -5427,24 +6002,26 @@ print_operand_address (FILE *file, rtx addr)
     'J': print tls_load/tls_gdcall/tls_ldcall suffix
     'M': print the second word of a TImode operand.
     'N': print the second word of a DImode operand.
-    'O': print only the displacement of a memory reference.
-    'R': print only the base register of a memory reference.
+    'O': print only the displacement of a memory reference or address.
+    'R': print only the base register of a memory reference or address.
     'S': print S-type memory reference (base+displacement).
     'Y': print shift count operand.
 
     'b': print integer X as if it's an unsigned byte.
     'c': print integer X as if it's an signed byte.
-    'e': "end" of DImode contiguous bitmask X.
-    'f': "end" of SImode contiguous bitmask X.
+    'e': "end" of contiguous bitmask X in either DImode or vector inner mode.
+    'f': "end" of contiguous bitmask X in SImode.
     'h': print integer X as if it's a signed halfword.
     'i': print the first nonzero HImode part of X.
     'j': print the first HImode part unequal to -1 of X.
     'k': print the first nonzero SImode part of X.
     'm': print the first SImode part unequal to -1 of X.
     'o': print integer X as if it's an unsigned 32bit word.
-    's': "start" of DImode contiguous bitmask X.
-    't': "start" of SImode contiguous bitmask X.
+    's': "start" of contiguous bitmask X in either DImode or vector inner mode.
+    't': CONST_INT: "start" of contiguous bitmask X in SImode.
+         CONST_VECTOR: Generate a bitmask for vgbm instruction.
     'x': print integer X as if it's an unsigned halfword.
+    'v': print register number as vector register (v1 instead of f1).
 */
 
 void
@@ -5503,14 +6080,7 @@ print_operand (FILE *file, rtx x, int code)
         struct s390_address ad;
 	int ret;
 
-	if (!MEM_P (x))
-	  {
-	    output_operand_lossage ("memory reference expected for "
-				    "'O' output modifier");
-	    return;
-	  }
-
-	ret = s390_decompose_address (XEXP (x, 0), &ad);
+	ret = s390_decompose_address (MEM_P (x) ? XEXP (x, 0) : x, &ad);
 
 	if (!ret
 	    || (ad.base && !REGNO_OK_FOR_BASE_P (REGNO (ad.base)))
@@ -5532,14 +6102,7 @@ print_operand (FILE *file, rtx x, int code)
         struct s390_address ad;
 	int ret;
 
-	if (!MEM_P (x))
-	  {
-	    output_operand_lossage ("memory reference expected for "
-				    "'R' output modifier");
-	    return;
-	  }
-
-	ret = s390_decompose_address (XEXP (x, 0), &ad);
+	ret = s390_decompose_address (MEM_P (x) ? XEXP (x, 0) : x, &ad);
 
 	if (!ret
 	    || (ad.base && !REGNO_OK_FOR_BASE_P (REGNO (ad.base)))
@@ -5617,7 +6180,17 @@ print_operand (FILE *file, rtx x, int code)
   switch (GET_CODE (x))
     {
     case REG:
-      fprintf (file, "%s", reg_names[REGNO (x)]);
+      /* Print FP regs as fx instead of vx when they are accessed
+	 through non-vector mode.  */
+      if (code == 'v'
+	  || VECTOR_NOFP_REG_P (x)
+	  || (FP_REG_P (x) && VECTOR_MODE_P (GET_MODE (x)))
+	  || (VECTOR_REG_P (x)
+	      && (GET_MODE_SIZE (GET_MODE (x)) /
+		  s390_class_max_nregs (FP_REGS, GET_MODE (x))) > 8))
+	fprintf (file, "%%v%s", reg_names[REGNO (x)] + 2);
+      else
+	fprintf (file, "%s", reg_names[REGNO (x)]);
       break;
 
     case MEM:
@@ -5704,6 +6277,39 @@ print_operand (FILE *file, rtx x, int code)
 				    code);
 	}
       break;
+    case CONST_VECTOR:
+      switch (code)
+	{
+	case 'e':
+	case 's':
+	  {
+	    int start, stop, inner_len;
+	    bool ok;
+
+	    inner_len = GET_MODE_UNIT_BITSIZE (GET_MODE (x));
+	    ok = s390_contiguous_bitmask_vector_p (x, &start, &stop);
+	    gcc_assert (ok);
+	    if (code == 's')
+	      ival = inner_len - stop - 1;
+	    else
+	      ival = inner_len - start - 1;
+	    fprintf (file, HOST_WIDE_INT_PRINT_DEC, ival);
+	  }
+	  break;
+	case 't':
+	  {
+	    unsigned mask;
+	    bool ok = s390_bytemask_vector_p (x, &mask);
+	    gcc_assert (ok);
+	    fprintf (file, "%u", mask);
+	  }
+	  break;
+
+	default:
+	  output_operand_lossage ("invalid constant vector for output "
+				  "modifier '%c'", code);
+	}
+      break;
 
     default:
       if (code == 0)
@@ -6257,14 +6863,19 @@ replace_ltrel_base (rtx *x)
 /* We keep a list of constants which we have to add to internal
    constant tables in the middle of large functions.  */
 
-#define NR_C_MODES 11
+#define NR_C_MODES 31
 machine_mode constant_modes[NR_C_MODES] =
 {
   TFmode, TImode, TDmode,
+  V16QImode, V8HImode, V4SImode, V2DImode, V4SFmode, V2DFmode, V1TFmode,
   DFmode, DImode, DDmode,
+  V8QImode, V4HImode, V2SImode, V1DImode, V2SFmode, V1DFmode,
   SFmode, SImode, SDmode,
+  V4QImode, V2HImode, V1SImode,  V1SFmode,
   HImode,
-  QImode
+  V2QImode, V1HImode,
+  QImode,
+  V1QImode
 };
 
 struct constant
@@ -7279,6 +7890,23 @@ s390_output_pool_entry (rtx exp, machine_mode mode, unsigned int align)
       mark_symbol_refs_as_used (exp);
       break;
 
+    case MODE_VECTOR_INT:
+    case MODE_VECTOR_FLOAT:
+      {
+	int i;
+	machine_mode inner_mode;
+	gcc_assert (GET_CODE (exp) == CONST_VECTOR);
+
+	inner_mode = GET_MODE_INNER (GET_MODE (exp));
+	for (i = 0; i < XVECLEN (exp, 0); i++)
+	  s390_output_pool_entry (XVECEXP (exp, 0, i),
+				  inner_mode,
+				  i == 0
+				  ? align
+				  : GET_MODE_BITSIZE (inner_mode));
+      }
+      break;
+
     default:
       gcc_unreachable ();
     }
@@ -8090,9 +8718,25 @@ s390_optimize_nonescaping_tx (void)
 bool
 s390_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
 {
+  if (!TARGET_VX && VECTOR_NOFP_REGNO_P (regno))
+    return false;
+
   switch (REGNO_REG_CLASS (regno))
     {
+    case VEC_REGS:
+      return ((GET_MODE_CLASS (mode) == MODE_INT
+	       && s390_class_max_nregs (VEC_REGS, mode) == 1)
+	      || mode == DFmode
+	      || s390_vector_mode_supported_p (mode));
+      break;
     case FP_REGS:
+      if (TARGET_VX
+	  && ((GET_MODE_CLASS (mode) == MODE_INT
+	       && s390_class_max_nregs (FP_REGS, mode) == 1)
+	      || mode == DFmode
+	      || s390_vector_mode_supported_p (mode)))
+	return true;
+
       if (REGNO_PAIR_OK (regno, mode))
 	{
 	  if (mode == SImode || mode == DImode)
@@ -8179,19 +8823,86 @@ s390_hard_regno_scratch_ok (unsigned int regno)
 int
 s390_class_max_nregs (enum reg_class rclass, machine_mode mode)
 {
+  int reg_size;
+  bool reg_pair_required_p = false;
+
   switch (rclass)
     {
     case FP_REGS:
+    case VEC_REGS:
+      reg_size = TARGET_VX ? 16 : 8;
+
+      /* TF and TD modes would fit into a VR but we put them into a
+	 register pair since we do not have 128bit FP instructions on
+	 full VRs.  */
+      if (TARGET_VX
+	  && SCALAR_FLOAT_MODE_P (mode)
+	  && GET_MODE_SIZE (mode) >= 16)
+	reg_pair_required_p = true;
+
+      /* Even if complex types would fit into a single FPR/VR we force
+	 them into a register pair to deal with the parts more easily.
+	 (FIXME: What about complex ints?)  */
       if (GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT)
-	return 2 * ((GET_MODE_SIZE (mode) / 2 + 8 - 1) / 8);
-      else
-	return (GET_MODE_SIZE (mode) + 8 - 1) / 8;
+	reg_pair_required_p = true;
+      break;
     case ACCESS_REGS:
-      return (GET_MODE_SIZE (mode) + 4 - 1) / 4;
+      reg_size = 4;
+      break;
     default:
+      reg_size = UNITS_PER_WORD;
       break;
     }
-  return (GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+
+  if (reg_pair_required_p)
+    return 2 * ((GET_MODE_SIZE (mode) / 2 + reg_size - 1) / reg_size);
+
+  return (GET_MODE_SIZE (mode) + reg_size - 1) / reg_size;
+}
+
+/* Return nonzero if changing mode from FROM_MODE to TO_MODE should
+   not be allowed for register class RCLASS.  */
+
+int
+s390_cannot_change_mode_class (machine_mode from_mode,
+			       machine_mode to_mode,
+			       enum reg_class rclass)
+{
+  machine_mode small_mode;
+  machine_mode big_mode;
+
+  if (GET_MODE_SIZE (from_mode) == GET_MODE_SIZE (to_mode))
+    return 0;
+
+  if (GET_MODE_SIZE (from_mode) < GET_MODE_SIZE (to_mode))
+    {
+      small_mode = from_mode;
+      big_mode = to_mode;
+    }
+  else
+    {
+      small_mode = to_mode;
+      big_mode = from_mode;
+    }
+
+  /* Values residing in VRs are little-endian style.  All modes are
+     placed left-aligned in a VR.  This means that we cannot allow
+     switching between modes with differing sizes.  Also if the vector
+     facility is available we still place TFmode values in VR register
+     pairs, since the only instructions we have operating on TFmodes
+     only deal with register pairs.  Therefore we have to allow DFmode
+     subregs of TFmodes to enable the TFmode splitters.  */
+  if (reg_classes_intersect_p (VEC_REGS, rclass)
+      && (GET_MODE_SIZE (small_mode) < 8
+	  || s390_class_max_nregs (VEC_REGS, big_mode) == 1))
+    return 1;
+
+  /* Likewise for access registers, since they have only half the
+     word size on 64-bit.  */
+  if (reg_classes_intersect_p (ACCESS_REGS, rclass))
+    return 1;
+
+  return 0;
 }
 
 /* Return true if we use LRA instead of reload pass.  */
@@ -9223,6 +9934,23 @@ s390_can_use_return_insn (void)
   return cfun_frame_layout.frame_size == 0;
 }
 
+/* The VX ABI differs for vararg functions.  Therefore we need the
+   prototype of the callee to be available when passing vector type
+   values.  */
+static const char *
+s390_invalid_arg_for_unprototyped_fn (const_tree typelist, const_tree funcdecl, const_tree val)
+{
+  return ((TARGET_VX_ABI
+	   && typelist == 0
+	   && VECTOR_TYPE_P (TREE_TYPE (val))
+	   && (funcdecl == NULL_TREE
+	       || (TREE_CODE (funcdecl) == FUNCTION_DECL
+		   && DECL_BUILT_IN_CLASS (funcdecl) != BUILT_IN_MD)))
+	  ? N_("Vector argument passed to unprototyped function")
+	  : NULL);
+}
+
+
 /* Return the size in bytes of a function argument of
    type TYPE and/or mode MODE.  At least one of TYPE or
    MODE must be specified.  */
@@ -9242,13 +9970,61 @@ s390_function_arg_size (machine_mode mode, const_tree type)
 }
 
 /* Return true if a function argument of type TYPE and mode MODE
+   is to be passed in a vector register, if available.  */
+
+bool
+s390_function_arg_vector (machine_mode mode, const_tree type)
+{
+  if (!TARGET_VX_ABI)
+    return false;
+
+  if (s390_function_arg_size (mode, type) > 16)
+    return false;
+
+  /* No type info available for some library calls ...  */
+  if (!type)
+    return VECTOR_MODE_P (mode);
+
+  /* The ABI says that record types with a single member are treated
+     just like that member would be.  */
+  while (TREE_CODE (type) == RECORD_TYPE)
+    {
+      tree field, single = NULL_TREE;
+
+      for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
+	{
+	  if (TREE_CODE (field) != FIELD_DECL)
+	    continue;
+
+	  if (single == NULL_TREE)
+	    single = TREE_TYPE (field);
+	  else
+	    return false;
+	}
+
+      if (single == NULL_TREE)
+	return false;
+      else
+	{
+	  /* If the field declaration adds extra bytes due to
+	     e.g. padding this is not accepted as vector type.  */
+	  if (int_size_in_bytes (single) <= 0
+	      || int_size_in_bytes (single) != int_size_in_bytes (type))
+	    return false;
+	  type = single;
+	}
+    }
+
+  return VECTOR_TYPE_P (type);
+}
+
+/* Return true if a function argument of type TYPE and mode MODE
    is to be passed in a floating-point register, if available.  */
 
 static bool
 s390_function_arg_float (machine_mode mode, const_tree type)
 {
-  int size = s390_function_arg_size (mode, type);
-  if (size > 8)
+  if (s390_function_arg_size (mode, type) > 8)
     return false;
 
   /* Soft-float changes the ABI: no floating-point registers are used.  */
@@ -9331,20 +10107,24 @@ s390_pass_by_reference (cumulative_args_t ca ATTRIBUTE_UNUSED,
 			bool named ATTRIBUTE_UNUSED)
 {
   int size = s390_function_arg_size (mode, type);
+
+  if (s390_function_arg_vector (mode, type))
+    return false;
+
   if (size > 8)
     return true;
 
   if (type)
     {
       if (AGGREGATE_TYPE_P (type) && exact_log2 (size) < 0)
-        return 1;
+        return true;
 
       if (TREE_CODE (type) == COMPLEX_TYPE
 	  || TREE_CODE (type) == VECTOR_TYPE)
-        return 1;
+	return true;
     }
 
-  return 0;
+  return false;
 }
 
 /* Update the data in CUM to advance over an argument of mode MODE and
@@ -9355,11 +10135,21 @@ s390_pass_by_reference (cumulative_args_t ca ATTRIBUTE_UNUSED,
 
 static void
 s390_function_arg_advance (cumulative_args_t cum_v, machine_mode mode,
-			   const_tree type, bool named ATTRIBUTE_UNUSED)
+			   const_tree type, bool named)
 {
   CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
 
-  if (s390_function_arg_float (mode, type))
+  if (s390_function_arg_vector (mode, type))
+    {
+      /* We are called for unnamed vector stdarg arguments which are
+	 passed on the stack.  In this case this hook does not have to
+	 do anything since stack arguments are tracked by common
+	 code.  */
+      if (!named)
+	return;
+      cum->vrs += 1;
+    }
+  else if (s390_function_arg_float (mode, type))
     {
       cum->fprs += 1;
     }
@@ -9393,14 +10183,24 @@ s390_function_arg_advance (cumulative_args_t cum_v, machine_mode mode,
 
 static rtx
 s390_function_arg (cumulative_args_t cum_v, machine_mode mode,
-		   const_tree type, bool named ATTRIBUTE_UNUSED)
+		   const_tree type, bool named)
 {
   CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
 
-  if (s390_function_arg_float (mode, type))
+
+  if (s390_function_arg_vector (mode, type))
+    {
+      /* Vector arguments being part of the ellipsis are passed on the
+	 stack.  */
+      if (!named || (cum->vrs + 1 > VEC_ARG_NUM_REG))
+	return NULL_RTX;
+
+      return gen_rtx_REG (mode, cum->vrs + FIRST_VEC_ARG_REGNO);
+    }
+  else if (s390_function_arg_float (mode, type))
     {
       if (cum->fprs + 1 > FP_ARG_NUM_REG)
-	return 0;
+	return NULL_RTX;
       else
 	return gen_rtx_REG (mode, cum->fprs + 16);
     }
@@ -9410,7 +10210,7 @@ s390_function_arg (cumulative_args_t cum_v, machine_mode mode,
       int n_gprs = (size + UNITS_PER_LONG - 1) / UNITS_PER_LONG;
 
       if (cum->gprs + n_gprs > GP_ARG_NUM_REG)
-	return 0;
+	return NULL_RTX;
       else if (n_gprs == 1 || UNITS_PER_WORD == UNITS_PER_LONG)
 	return gen_rtx_REG (mode, cum->gprs + 2);
       else if (n_gprs == 2)
@@ -9453,11 +10253,17 @@ s390_return_in_memory (const_tree type, const_tree fundecl ATTRIBUTE_UNUSED)
       || TREE_CODE (type) == REAL_TYPE)
     return int_size_in_bytes (type) > 8;
 
+  /* Vector types which fit into a VR.  */
+  if (TARGET_VX_ABI
+      && VECTOR_TYPE_P (type)
+      && int_size_in_bytes (type) <= 16)
+    return false;
+
   /* Aggregates and similar constructs are always returned
      in memory.  */
   if (AGGREGATE_TYPE_P (type)
       || TREE_CODE (type) == COMPLEX_TYPE
-      || TREE_CODE (type) == VECTOR_TYPE)
+      || VECTOR_TYPE_P (type))
     return true;
 
   /* ??? We get called on all sorts of random stuff from
@@ -9495,6 +10301,12 @@ s390_function_and_libcall_value (machine_mode mode,
 				 const_tree fntype_or_decl,
 				 bool outgoing ATTRIBUTE_UNUSED)
 {
+  /* For vector return types it is important to use the RET_TYPE
+     argument whenever available since the middle-end might have
+     changed the mode to a scalar mode.  */
+  bool vector_ret_type_p = ((ret_type && VECTOR_TYPE_P (ret_type))
+			    || (!ret_type && VECTOR_MODE_P (mode)));
+
   /* For normal functions perform the promotion as
      promote_function_mode would do.  */
   if (ret_type)
@@ -9504,10 +10316,14 @@ s390_function_and_libcall_value (machine_mode mode,
 				    fntype_or_decl, 1);
     }
 
-  gcc_assert (GET_MODE_CLASS (mode) == MODE_INT || SCALAR_FLOAT_MODE_P (mode));
-  gcc_assert (GET_MODE_SIZE (mode) <= 8);
+  gcc_assert (GET_MODE_CLASS (mode) == MODE_INT
+	      || SCALAR_FLOAT_MODE_P (mode)
+	      || (TARGET_VX_ABI && vector_ret_type_p));
+  gcc_assert (GET_MODE_SIZE (mode) <= (TARGET_VX_ABI ? 16 : 8));
 
-  if (TARGET_HARD_FLOAT && SCALAR_FLOAT_MODE_P (mode))
+  if (TARGET_VX_ABI && vector_ret_type_p)
+    return gen_rtx_REG (mode, FIRST_VEC_ARG_REGNO);
+  else if (TARGET_HARD_FLOAT && SCALAR_FLOAT_MODE_P (mode))
     return gen_rtx_REG (mode, 16);
   else if (GET_MODE_SIZE (mode) <= UNITS_PER_LONG
 	   || UNITS_PER_LONG == UNITS_PER_WORD)
@@ -9671,9 +10487,13 @@ s390_va_start (tree valist, rtx nextarg ATTRIBUTE_UNUSED)
       expand_expr (t, const0_rtx, VOIDmode, EXPAND_NORMAL);
     }
 
-  /* Find the overflow area.  */
+  /* Find the overflow area.
+     FIXME: This currently is too pessimistic when the vector ABI is
+     enabled.  In that case we *always* set up the overflow area
+     pointer.  */
   if (n_gpr + cfun->va_list_gpr_size > GP_ARG_NUM_REG
-      || n_fpr + cfun->va_list_fpr_size > FP_ARG_NUM_REG)
+      || n_fpr + cfun->va_list_fpr_size > FP_ARG_NUM_REG
+      || TARGET_VX_ABI)
     {
       t = make_tree (TREE_TYPE (ovf), virtual_incoming_args_rtx);
 
@@ -9715,6 +10535,9 @@ s390_va_start (tree valist, rtx nextarg ATTRIBUTE_UNUSED)
        ret = args.reg_save_area[args.gpr+8]
      else
        ret = *args.overflow_arg_area++;
+   } else if (vector value) {
+       ret = *args.overflow_arg_area;
+       args.overflow_arg_area += size / 8;
    } else if (float value) {
      if (args.fgpr < 2)
        ret = args.reg_save_area[args.fpr+64]
@@ -9734,7 +10557,10 @@ s390_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p,
   tree f_gpr, f_fpr, f_ovf, f_sav;
   tree gpr, fpr, ovf, sav, reg, t, u;
   int indirect_p, size, n_reg, sav_ofs, sav_scale, max_reg;
-  tree lab_false, lab_over, addr;
+  tree lab_false, lab_over;
+  tree addr = create_tmp_var (ptr_type_node, "addr");
+  bool left_align_p; /* How a value < UNITS_PER_LONG is aligned within
+			a stack slot.  */
 
   f_gpr = TYPE_FIELDS (TREE_TYPE (va_list_type_node));
   f_fpr = DECL_CHAIN (f_gpr);
@@ -9773,6 +10599,23 @@ s390_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p,
       sav_scale = UNITS_PER_LONG;
       size = UNITS_PER_LONG;
       max_reg = GP_ARG_NUM_REG - n_reg;
+      left_align_p = false;
+    }
+  else if (s390_function_arg_vector (TYPE_MODE (type), type))
+    {
+      if (TARGET_DEBUG_ARG)
+	{
+	  fprintf (stderr, "va_arg: vector type");
+	  debug_tree (type);
+	}
+
+      indirect_p = 0;
+      reg = NULL_TREE;
+      n_reg = 0;
+      sav_ofs = 0;
+      sav_scale = 8;
+      max_reg = 0;
+      left_align_p = true;
     }
   else if (s390_function_arg_float (TYPE_MODE (type), type))
     {
@@ -9789,6 +10632,7 @@ s390_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p,
       sav_ofs = 16 * UNITS_PER_LONG;
       sav_scale = 8;
       max_reg = FP_ARG_NUM_REG - n_reg;
+      left_align_p = false;
     }
   else
     {
@@ -9813,53 +10657,74 @@ s390_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p,
 
       sav_scale = UNITS_PER_LONG;
       max_reg = GP_ARG_NUM_REG - n_reg;
+      left_align_p = false;
     }
 
   /* Pull the value out of the saved registers ...  */
 
-  lab_false = create_artificial_label (UNKNOWN_LOCATION);
-  lab_over = create_artificial_label (UNKNOWN_LOCATION);
-  addr = create_tmp_var (ptr_type_node, "addr");
+  if (reg != NULL_TREE)
+    {
+      /*
+	if (reg > ((typeof (reg))max_reg))
+          goto lab_false;
 
-  t = fold_convert (TREE_TYPE (reg), size_int (max_reg));
-  t = build2 (GT_EXPR, boolean_type_node, reg, t);
-  u = build1 (GOTO_EXPR, void_type_node, lab_false);
-  t = build3 (COND_EXPR, void_type_node, t, u, NULL_TREE);
-  gimplify_and_add (t, pre_p);
+        addr = sav + sav_ofs + reg * sav_scale;
 
-  t = fold_build_pointer_plus_hwi (sav, sav_ofs);
-  u = build2 (MULT_EXPR, TREE_TYPE (reg), reg,
-	      fold_convert (TREE_TYPE (reg), size_int (sav_scale)));
-  t = fold_build_pointer_plus (t, u);
+	goto lab_over;
 
-  gimplify_assign (addr, t, pre_p);
+        lab_false:
+      */
+
+      lab_false = create_artificial_label (UNKNOWN_LOCATION);
+      lab_over = create_artificial_label (UNKNOWN_LOCATION);
+
+      t = fold_convert (TREE_TYPE (reg), size_int (max_reg));
+      t = build2 (GT_EXPR, boolean_type_node, reg, t);
+      u = build1 (GOTO_EXPR, void_type_node, lab_false);
+      t = build3 (COND_EXPR, void_type_node, t, u, NULL_TREE);
+      gimplify_and_add (t, pre_p);
 
-  gimple_seq_add_stmt (pre_p, gimple_build_goto (lab_over));
+      t = fold_build_pointer_plus_hwi (sav, sav_ofs);
+      u = build2 (MULT_EXPR, TREE_TYPE (reg), reg,
+		  fold_convert (TREE_TYPE (reg), size_int (sav_scale)));
+      t = fold_build_pointer_plus (t, u);
 
-  gimple_seq_add_stmt (pre_p, gimple_build_label (lab_false));
+      gimplify_assign (addr, t, pre_p);
 
+      gimple_seq_add_stmt (pre_p, gimple_build_goto (lab_over));
+
+      gimple_seq_add_stmt (pre_p, gimple_build_label (lab_false));
+    }
 
   /* ... Otherwise out of the overflow area.  */
 
   t = ovf;
-  if (size < UNITS_PER_LONG)
+  if (size < UNITS_PER_LONG && !left_align_p)
     t = fold_build_pointer_plus_hwi (t, UNITS_PER_LONG - size);
 
   gimplify_expr (&t, pre_p, NULL, is_gimple_val, fb_rvalue);
 
   gimplify_assign (addr, t, pre_p);
 
-  t = fold_build_pointer_plus_hwi (t, size);
+  if (size < UNITS_PER_LONG && left_align_p)
+    t = fold_build_pointer_plus_hwi (t, UNITS_PER_LONG);
+  else
+    t = fold_build_pointer_plus_hwi (t, size);
+
   gimplify_assign (ovf, t, pre_p);
 
-  gimple_seq_add_stmt (pre_p, gimple_build_label (lab_over));
+  if (reg != NULL_TREE)
+    gimple_seq_add_stmt (pre_p, gimple_build_label (lab_over));
 
 
   /* Increment register save count.  */
 
-  u = build2 (PREINCREMENT_EXPR, TREE_TYPE (reg), reg,
-	      fold_convert (TREE_TYPE (reg), size_int (n_reg)));
-  gimplify_and_add (u, pre_p);
+  if (n_reg > 0)
+    {
+      u = build2 (PREINCREMENT_EXPR, TREE_TYPE (reg), reg,
+		  fold_convert (TREE_TYPE (reg), size_int (n_reg)));
+      gimplify_and_add (u, pre_p);
+    }
 
   if (indirect_p)
     {
@@ -10660,15 +11525,18 @@ s390_call_saved_register_used (tree call_expr)
       mode = TYPE_MODE (type);
       gcc_assert (mode);
 
+      /* We assume that in the target function all parameters are
+	 named.  This only has an impact on vector argument register
+	 usage, none of which is call-saved.  */
       if (pass_by_reference (&cum_v, mode, type, true))
  	{
  	  mode = Pmode;
  	  type = build_pointer_type (type);
  	}
 
-       parm_rtx = s390_function_arg (cum, mode, type, 0);
+       parm_rtx = s390_function_arg (cum, mode, type, true);
 
-       s390_function_arg_advance (cum, mode, type, 0);
+       s390_function_arg_advance (cum, mode, type, true);
 
        if (!parm_rtx)
 	 continue;
@@ -10875,6 +11743,13 @@ s390_conditional_register_usage (void)
       for (i = FPR0_REGNUM; i <= FPR15_REGNUM; i++)
 	call_used_regs[i] = fixed_regs[i] = 1;
     }
+
+  /* Disable v16 - v31 for non-vector target.  */
+  if (!TARGET_VX)
+    {
+      for (i = VR16_REGNUM; i <= VR31_REGNUM; i = NEXT_REGNO (i))
+	fixed_regs[i] = call_used_regs[i] = call_really_used_regs[i] = 1;
+    }
 }
 
 /* Corresponding function to eh_return expander.  */
@@ -12245,6 +13120,55 @@ s390_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
 #undef FPC_DXC_SHIFT
 }
 
+/* Return the vector mode to be used for inner mode MODE when doing
+   vectorization.  */
+static machine_mode
+s390_preferred_simd_mode (machine_mode mode)
+{
+  if (TARGET_VX)
+    switch (mode)
+      {
+      case DFmode:
+	return V2DFmode;
+      case DImode:
+	return V2DImode;
+      case SImode:
+	return V4SImode;
+      case HImode:
+	return V8HImode;
+      case QImode:
+	return V16QImode;
+      default:;
+      }
+  return word_mode;
+}
+
+/* Our hardware does not require vectors to be strictly aligned.  */
+static bool
+s390_support_vector_misalignment (machine_mode mode ATTRIBUTE_UNUSED,
+				  const_tree type ATTRIBUTE_UNUSED,
+				  int misalignment ATTRIBUTE_UNUSED,
+				  bool is_packed ATTRIBUTE_UNUSED)
+{
+  return true;
+}
+
+/* The vector ABI requires vector types to be aligned on an 8 byte
+   boundary (our stack alignment).  However, we allow this to be
+   overridden by the user, although this definitely breaks the ABI.  */
+static HOST_WIDE_INT
+s390_vector_alignment (const_tree type)
+{
+  if (!TARGET_VX_ABI)
+    return default_vector_alignment (type);
+
+  if (TYPE_USER_ALIGN (type))
+    return TYPE_ALIGN (type);
+
+  return MIN (64, tree_to_shwi (TYPE_SIZE (type)));
+}
+
+
 /* Initialize GCC target structure.  */
 
 #undef  TARGET_ASM_ALIGNED_HI_OP
@@ -12353,6 +13277,8 @@ s390_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
 #define TARGET_FUNCTION_VALUE s390_function_value
 #undef TARGET_LIBCALL_VALUE
 #define TARGET_LIBCALL_VALUE s390_libcall_value
+#undef TARGET_STRICT_ARGUMENT_NAMING
+#define TARGET_STRICT_ARGUMENT_NAMING hook_bool_CUMULATIVE_ARGS_true
 
 #undef TARGET_KEEP_LEAF_WHEN_PROFILED
 #define TARGET_KEEP_LEAF_WHEN_PROFILED s390_keep_leaf_when_profiled
@@ -12371,6 +13297,9 @@ s390_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
 #define TARGET_ASM_OUTPUT_DWARF_DTPREL s390_output_dwarf_dtprel
 #endif
 
+#undef TARGET_DWARF_FRAME_REG_MODE
+#define TARGET_DWARF_FRAME_REG_MODE s390_dwarf_frame_reg_mode
+
 #ifdef TARGET_ALTERNATE_LONG_DOUBLE_MANGLING
 #undef TARGET_MANGLE_TYPE
 #define TARGET_MANGLE_TYPE s390_mangle_type
@@ -12379,6 +13308,9 @@ s390_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
 #undef TARGET_SCALAR_MODE_SUPPORTED_P
 #define TARGET_SCALAR_MODE_SUPPORTED_P s390_scalar_mode_supported_p
 
+#undef TARGET_VECTOR_MODE_SUPPORTED_P
+#define TARGET_VECTOR_MODE_SUPPORTED_P s390_vector_mode_supported_p
+
 #undef  TARGET_PREFERRED_RELOAD_CLASS
 #define TARGET_PREFERRED_RELOAD_CLASS s390_preferred_reload_class
 
@@ -12439,6 +13371,18 @@ s390_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
 #undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV
 #define TARGET_ATOMIC_ASSIGN_EXPAND_FENV s390_atomic_assign_expand_fenv
 
+#undef TARGET_INVALID_ARG_FOR_UNPROTOTYPED_FN
+#define TARGET_INVALID_ARG_FOR_UNPROTOTYPED_FN s390_invalid_arg_for_unprototyped_fn
+
+#undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE
+#define TARGET_VECTORIZE_PREFERRED_SIMD_MODE s390_preferred_simd_mode
+
+#undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
+#define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT s390_support_vector_misalignment
+
+#undef TARGET_VECTOR_ALIGNMENT
+#define TARGET_VECTOR_ALIGNMENT s390_vector_alignment
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-s390.h"
diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
index 7130275..5568037 100644
--- a/gcc/config/s390/s390.h
+++ b/gcc/config/s390/s390.h
@@ -199,6 +199,13 @@ enum processor_flags
 
 #define STACK_SIZE_MODE (Pmode)
 
+/* Vector arguments are left-justified when placed on the stack during
+   parameter passing.  */
+#define FUNCTION_ARG_PADDING(MODE, TYPE)			\
+  (s390_function_arg_vector ((MODE), (TYPE))			\
+   ? upward							\
+   : DEFAULT_FUNCTION_ARG_PADDING ((MODE), (TYPE)))
+
 #ifndef IN_LIBGCC2
 
 /* Width of a word, in units (bytes).  */
@@ -296,9 +303,11 @@ enum processor_flags
    Reg 35: Return address pointer
 
    Registers 36 and 37 are mapped to access registers
-   0 and 1, used to implement thread-local storage.  */
+   0 and 1, used to implement thread-local storage.
+
+   Reg 38-53: Vector registers v16-v31  */
 
-#define FIRST_PSEUDO_REGISTER 38
+#define FIRST_PSEUDO_REGISTER 54
 
 /* Standard register usage.  */
 #define GENERAL_REGNO_P(N)	((int)(N) >= 0 && (N) < 16)
@@ -307,6 +316,8 @@ enum processor_flags
 #define CC_REGNO_P(N)		((N) == 33)
 #define FRAME_REGNO_P(N)	((N) == 32 || (N) == 34 || (N) == 35)
 #define ACCESS_REGNO_P(N)	((N) == 36 || (N) == 37)
+#define VECTOR_NOFP_REGNO_P(N)  ((N) >= 38 && (N) <= 53)
+#define VECTOR_REGNO_P(N)       (FP_REGNO_P (N) || VECTOR_NOFP_REGNO_P (N))
 
 #define GENERAL_REG_P(X)	(REG_P (X) && GENERAL_REGNO_P (REGNO (X)))
 #define ADDR_REG_P(X)		(REG_P (X) && ADDR_REGNO_P (REGNO (X)))
@@ -314,6 +325,8 @@ enum processor_flags
 #define CC_REG_P(X)		(REG_P (X) && CC_REGNO_P (REGNO (X)))
 #define FRAME_REG_P(X)		(REG_P (X) && FRAME_REGNO_P (REGNO (X)))
 #define ACCESS_REG_P(X)		(REG_P (X) && ACCESS_REGNO_P (REGNO (X)))
+#define VECTOR_NOFP_REG_P(X)    (REG_P (X) && VECTOR_NOFP_REGNO_P (REGNO (X)))
+#define VECTOR_REG_P(X)         (REG_P (X) && VECTOR_REGNO_P (REGNO (X)))
 
 /* Set up fixed registers and calling convention:
 
@@ -328,7 +341,9 @@ enum processor_flags
 
    On 31-bit, FPRs 18-19 are call-clobbered;
    on 64-bit, FPRs 24-31 are call-clobbered.
-   The remaining FPRs are call-saved.  */
+   The remaining FPRs are call-saved.
+
+   All non-FP vector registers (v16-v31) are call-clobbered.  */
 
 #define FIXED_REGISTERS				\
 { 0, 0, 0, 0, 					\
@@ -340,7 +355,11 @@ enum processor_flags
   0, 0, 0, 0, 					\
   0, 0, 0, 0, 					\
   1, 1, 1, 1,					\
-  1, 1 }
+  1, 1,						\
+  0, 0, 0, 0, 					\
+  0, 0, 0, 0, 					\
+  0, 0, 0, 0, 					\
+  0, 0, 0, 0 }
 
 #define CALL_USED_REGISTERS			\
 { 1, 1, 1, 1, 					\
@@ -352,26 +371,35 @@ enum processor_flags
   1, 1, 1, 1, 					\
   1, 1, 1, 1, 					\
   1, 1, 1, 1,					\
-  1, 1 }
+  1, 1,					        \
+  1, 1, 1, 1, 					\
+  1, 1, 1, 1,					\
+  1, 1, 1, 1, 					\
+  1, 1, 1, 1 }
 
 #define CALL_REALLY_USED_REGISTERS		\
-{ 1, 1, 1, 1, 					\
+{ 1, 1, 1, 1, 	/* r0 - r15 */			\
   1, 1, 0, 0, 					\
   0, 0, 0, 0, 					\
   0, 0, 0, 0,					\
+  1, 1, 1, 1, 	/* f0 (16) - f15 (31) */	\
   1, 1, 1, 1, 					\
   1, 1, 1, 1, 					\
   1, 1, 1, 1, 					\
-  1, 1, 1, 1, 					\
+  1, 1, 1, 1,	/* arg, cc, fp, ret addr */	\
+  0, 0,		/* a0 (36), a1 (37) */	        \
+  1, 1, 1, 1, 	/* v16 (38) - v23 (45) */	\
   1, 1, 1, 1,					\
-  0, 0 }
+  1, 1, 1, 1, 	/* v24 (46) - v31 (53) */	\
+  1, 1, 1, 1 }
 
 /* Preferred register allocation order.  */
-#define REG_ALLOC_ORDER                                         \
-{  1, 2, 3, 4, 5, 0, 12, 11, 10, 9, 8, 7, 6, 14, 13,            \
-   16, 17, 18, 19, 20, 21, 22, 23,                              \
-   24, 25, 26, 27, 28, 29, 30, 31,                              \
-   15, 32, 33, 34, 35, 36, 37 }
+#define REG_ALLOC_ORDER							\
+  {  1, 2, 3, 4, 5, 0, 12, 11, 10, 9, 8, 7, 6, 14, 13,			\
+     16, 17, 18, 19, 20, 21, 22, 23,					\
+     24, 25, 26, 27, 28, 29, 30, 31,					\
+     38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 	\
+     15, 32, 33, 34, 35, 36, 37 }
 
 
 /* Fitting values into registers.  */
@@ -411,26 +439,22 @@ enum processor_flags
    but conforms to the 31-bit ABI, GPRs can hold 8 bytes;
    the ABI guarantees only that the lower 4 bytes are
    saved across calls, however.  */
-#define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE)		\
-  (!TARGET_64BIT && TARGET_ZARCH				\
-   && GET_MODE_SIZE (MODE) > 4					\
-   && (((REGNO) >= 6 && (REGNO) <= 15) || (REGNO) == 32))
+#define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE)			\
+  ((!TARGET_64BIT && TARGET_ZARCH					\
+    && GET_MODE_SIZE (MODE) > 4						\
+    && (((REGNO) >= 6 && (REGNO) <= 15) || (REGNO) == 32))		\
+   || (TARGET_VX							\
+       && GET_MODE_SIZE (MODE) > 8					\
+       && (((TARGET_64BIT && (REGNO) >= 24 && (REGNO) <= 31))		\
+	   || (!TARGET_64BIT && ((REGNO) == 18 || (REGNO) == 19)))))
 
 /* Maximum number of registers to represent a value of mode MODE
    in a register of class CLASS.  */
 #define CLASS_MAX_NREGS(CLASS, MODE)   					\
   s390_class_max_nregs ((CLASS), (MODE))
 
-/* If a 4-byte value is loaded into a FPR, it is placed into the
-   *upper* half of the register, not the lower.  Therefore, we
-   cannot use SUBREGs to switch between modes in FP registers.
-   Likewise for access registers, since they have only half the
-   word size on 64-bit.  */
 #define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS)		        \
-  (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO)			        \
-   ? ((reg_classes_intersect_p (FP_REGS, CLASS)				\
-       && (GET_MODE_SIZE (FROM) < 8 || GET_MODE_SIZE (TO) < 8))		\
-      || reg_classes_intersect_p (ACCESS_REGS, CLASS)) : 0)
+  s390_cannot_change_mode_class ((FROM), (TO), (CLASS))
 
 /* Register classes.  */
 
@@ -458,6 +482,7 @@ enum reg_class
   NO_REGS, CC_REGS, ADDR_REGS, GENERAL_REGS, ACCESS_REGS,
   ADDR_CC_REGS, GENERAL_CC_REGS,
   FP_REGS, ADDR_FP_REGS, GENERAL_FP_REGS,
+  VEC_REGS, ADDR_VEC_REGS, GENERAL_VEC_REGS,
   ALL_REGS, LIM_REG_CLASSES
 };
 #define N_REG_CLASSES (int) LIM_REG_CLASSES
@@ -465,11 +490,13 @@ enum reg_class
 #define REG_CLASS_NAMES							\
 { "NO_REGS", "CC_REGS", "ADDR_REGS", "GENERAL_REGS", "ACCESS_REGS",	\
   "ADDR_CC_REGS", "GENERAL_CC_REGS",					\
-  "FP_REGS", "ADDR_FP_REGS", "GENERAL_FP_REGS", "ALL_REGS" }
+  "FP_REGS", "ADDR_FP_REGS", "GENERAL_FP_REGS",				\
+  "VEC_REGS", "ADDR_VEC_REGS", "GENERAL_VEC_REGS",			\
+  "ALL_REGS" }
 
 /* Class -> register mapping.  */
-#define REG_CLASS_CONTENTS \
-{				       			\
+#define REG_CLASS_CONTENTS				\
+{							\
   { 0x00000000, 0x00000000 },	/* NO_REGS */		\
   { 0x00000000, 0x00000002 },	/* CC_REGS */		\
   { 0x0000fffe, 0x0000000d },	/* ADDR_REGS */		\
@@ -480,7 +507,10 @@ enum reg_class
   { 0xffff0000, 0x00000000 },	/* FP_REGS */		\
   { 0xfffffffe, 0x0000000d },	/* ADDR_FP_REGS */	\
   { 0xffffffff, 0x0000000d },	/* GENERAL_FP_REGS */	\
-  { 0xffffffff, 0x0000003f },	/* ALL_REGS */		\
+  { 0xffff0000, 0x003fffc0 },	/* VEC_REGS */		\
+  { 0xfffffffe, 0x003fffcd },	/* ADDR_VEC_REGS */	\
+  { 0xffffffff, 0x003fffcd },	/* GENERAL_VEC_REGS */	\
+  { 0xffffffff, 0x003fffff },	/* ALL_REGS */		\
 }
 
 /* In some case register allocation order is not enough for IRA to
@@ -511,14 +541,27 @@ extern const enum reg_class regclass_map[FIRST_PSEUDO_REGISTER];
 #define REGNO_OK_FOR_BASE_P(REGNO) REGNO_OK_FOR_INDEX_P (REGNO)
 
 
-/* We need secondary memory to move data between GPRs and FPRs.  With
-   DFP the ldgr lgdr instructions are available.  But these
-   instructions do not handle GPR pairs so it is not possible for 31
-   bit.  */
-#define SECONDARY_MEMORY_NEEDED(CLASS1, CLASS2, MODE) \
- ((CLASS1) != (CLASS2)                                \
-  && ((CLASS1) == FP_REGS || (CLASS2) == FP_REGS)     \
-  && (!TARGET_DFP || !TARGET_64BIT || GET_MODE_SIZE (MODE) != 8))
+/* We need secondary memory to move data between GPRs and FPRs.
+
+   - With DFP the ldgr and lgdr instructions are available.  Due to
+     the different alignment we cannot use them for SFmode.  On 31
+     bit a 64 bit value in GPRs would be a register pair, so there
+     we still need to go via memory.
+
+   - With z13 we can do the SF/SImode moves with vlgvf.  Because
+     FPRs and VRs overlap, we still disallow TF/TD modes in full
+     VRs, so on z13 these moves are done via memory, just as
+     before.
+
+     FIXME: Should we try splitting it into two vlgvg's/vlvg's instead?  */
+#define SECONDARY_MEMORY_NEEDED(CLASS1, CLASS2, MODE)			\
+  (((reg_classes_intersect_p (CLASS1, VEC_REGS)				\
+     && reg_classes_intersect_p (CLASS2, GENERAL_REGS))			\
+    || (reg_classes_intersect_p (CLASS1, GENERAL_REGS)			\
+	&& reg_classes_intersect_p (CLASS2, VEC_REGS)))			\
+   && (!TARGET_DFP || !TARGET_64BIT || GET_MODE_SIZE (MODE) != 8)	\
+   && (!TARGET_VX || (SCALAR_FLOAT_MODE_P (MODE)			\
+			  && GET_MODE_SIZE (MODE) > 8)))
 
 /* Get_secondary_mem widens its argument to BITS_PER_WORD which loses on 64bit
    because the movsi and movsf patterns don't handle r/f moves.  */
@@ -612,6 +655,11 @@ extern const enum reg_class regclass_map[FIRST_PSEUDO_REGISTER];
 /* Let the assembler generate debug line info.  */
 #define DWARF2_ASM_LINE_DEBUG_INFO 1
 
+/* Define the DWARF register mapping.
+   v16-v31 -> 68-83
+   rX      -> X      otherwise  */
+#define DBX_REGISTER_NUMBER(regno)			\
+  (((regno) >= 38 && (regno) <= 53) ? (regno) + 30 : (regno))
 
 /* Frame registers.  */
 
@@ -659,21 +707,29 @@ typedef struct s390_arg_structure
 {
   int gprs;			/* gpr so far */
   int fprs;			/* fpr so far */
+  int vrs;                      /* vr so far */
 }
 CUMULATIVE_ARGS;
 
 #define INIT_CUMULATIVE_ARGS(CUM, FNTYPE, LIBNAME, NN, N_NAMED_ARGS) \
-  ((CUM).gprs=0, (CUM).fprs=0)
+  ((CUM).gprs=0, (CUM).fprs=0, (CUM).vrs=0)
+
+#define FIRST_VEC_ARG_REGNO 46
+#define LAST_VEC_ARG_REGNO 53
 
 /* Arguments can be placed in general registers 2 to 6, or in floating
    point registers 0 and 2 for 31 bit and fprs 0, 2, 4 and 6 for 64
    bit.  */
-#define FUNCTION_ARG_REGNO_P(N) (((N) >=2 && (N) <7) || \
-  (N) == 16 || (N) == 17 || (TARGET_64BIT && ((N) == 18 || (N) == 19)))
+#define FUNCTION_ARG_REGNO_P(N)						\
+  (((N) >=2 && (N) < 7) || (N) == 16 || (N) == 17			\
+   || (TARGET_64BIT && ((N) == 18 || (N) == 19))			\
+   || (TARGET_VX && ((N) >= FIRST_VEC_ARG_REGNO && (N) <= LAST_VEC_ARG_REGNO)))
 
 
-/* Only gpr 2 and fpr 0 are ever used as return registers.  */
-#define FUNCTION_VALUE_REGNO_P(N) ((N) == 2 || (N) == 16)
+/* Only gpr 2, fpr 0, and v24 are ever used as return registers.  */
+#define FUNCTION_VALUE_REGNO_P(N)		\
+  ((N) == 2 || (N) == 16			\
+   || (TARGET_VX && (N) == FIRST_VEC_ARG_REGNO))
 
 
 /* Function entry and exit.  */
@@ -833,12 +889,20 @@ do {									\
 /* How to refer to registers in assembler output.  This sequence is
    indexed by compiler's hard-register-number (see above).  */
 #define REGISTER_NAMES							\
-{ "%r0",  "%r1",  "%r2",  "%r3",  "%r4",  "%r5",  "%r6",  "%r7",	\
-  "%r8",  "%r9",  "%r10", "%r11", "%r12", "%r13", "%r14", "%r15",	\
-  "%f0",  "%f2",  "%f4",  "%f6",  "%f1",  "%f3",  "%f5",  "%f7",	\
-  "%f8",  "%f10", "%f12", "%f14", "%f9",  "%f11", "%f13", "%f15",	\
-  "%ap",  "%cc",  "%fp",  "%rp",  "%a0",  "%a1"				\
-}
+  { "%r0",  "%r1",  "%r2",  "%r3",  "%r4",  "%r5",  "%r6",  "%r7",	\
+    "%r8",  "%r9",  "%r10", "%r11", "%r12", "%r13", "%r14", "%r15",	\
+    "%f0",  "%f2",  "%f4",  "%f6",  "%f1",  "%f3",  "%f5",  "%f7",	\
+    "%f8",  "%f10", "%f12", "%f14", "%f9",  "%f11", "%f13", "%f15",	\
+    "%ap",  "%cc",  "%fp",  "%rp",  "%a0",  "%a1",			\
+    "%v16", "%v18", "%v20", "%v22", "%v17", "%v19", "%v21", "%v23",	\
+    "%v24", "%v26", "%v28", "%v30", "%v25", "%v27", "%v29", "%v31"	\
+  }
+
+#define ADDITIONAL_REGISTER_NAMES					\
+  { { "v0", 16 }, { "v2",  17 }, { "v4",  18 }, { "v6",  19 },		\
+    { "v1", 20 }, { "v3",  21 }, { "v5",  22 }, { "v7",  23 },          \
+    { "v8", 24 }, { "v10", 25 }, { "v12", 26 }, { "v14", 27 },          \
+    { "v9", 28 }, { "v11", 29 }, { "v13", 30 }, { "v15", 31 } }
 
 /* Print operand X (an rtx) in assembler syntax to file FILE.  */
 #define PRINT_OPERAND(FILE, X, CODE) print_operand (FILE, X, CODE)
@@ -908,13 +972,21 @@ do {									\
 #define SYMBOL_REF_NOT_NATURALLY_ALIGNED_P(X) \
   ((SYMBOL_REF_FLAGS (X) & SYMBOL_FLAG_NOT_NATURALLY_ALIGNED))
 
+/* Check whether integer displacement is in range for a short displacement.  */
+#define SHORT_DISP_IN_RANGE(d) ((d) >= 0 && (d) <= 4095)
+
 /* Check whether integer displacement is in range.  */
 #define DISP_IN_RANGE(d) \
   (TARGET_LONG_DISPLACEMENT? ((d) >= -524288 && (d) <= 524287) \
-                           : ((d) >= 0 && (d) <= 4095))
+                           : SHORT_DISP_IN_RANGE(d))
 
 /* Reads can reuse write prefetches, used by tree-ssa-prefetch-loops.c.  */
 #define READ_CAN_USE_WRITE_PREFETCH 1
 
 extern const int processor_flags_table[];
-#endif
+
+/* The truth element value for vector comparisons.  Our instructions
+   always generate -1 in that case.  */
+#define VECTOR_STORE_FLAG_VALUE(MODE) CONSTM1_RTX (GET_MODE_INNER (MODE))
+
+#endif /* S390_H */
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 9b7c9d9..8680770 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -125,7 +125,23 @@
    UNSPEC_FPINT_CEIL
    UNSPEC_FPINT_NEARBYINT
    UNSPEC_FPINT_RINT
- ])
+
+   ; Vector
+   UNSPEC_VEC_EXTRACT
+   UNSPEC_VEC_SET
+   UNSPEC_VEC_PERM
+   UNSPEC_VEC_SRLB
+   UNSPEC_VEC_GENBYTEMASK
+   UNSPEC_VEC_VSUM
+   UNSPEC_VEC_VSUMG
+   UNSPEC_VEC_SMULT_EVEN
+   UNSPEC_VEC_UMULT_EVEN
+   UNSPEC_VEC_SMULT_ODD
+   UNSPEC_VEC_UMULT_ODD
+   UNSPEC_VEC_LOAD_LEN
+   UNSPEC_VEC_VFENE
+   UNSPEC_VEC_VFENECC
+])
 
 ;;
 ;; UNSPEC_VOLATILE usage
@@ -216,6 +232,11 @@
    (FPR13_REGNUM                30)
    (FPR14_REGNUM                27)
    (FPR15_REGNUM                31)
+   (VR0_REGNUM                  16)
+   (VR16_REGNUM                 38)
+   (VR23_REGNUM                 45)
+   (VR24_REGNUM                 46)
+   (VR31_REGNUM                 53)
   ])
 
 ;;
@@ -246,7 +267,7 @@
 ;; Used to determine defaults for length and other attribute values.
 
 (define_attr "op_type"
-  "NN,E,RR,RRE,RX,RS,RSI,RI,SI,S,SS,SSE,RXE,RSE,RIL,RIE,RXY,RSY,SIY,RRF,RRR,SIL,RRS,RIS"
+  "NN,E,RR,RRE,RX,RS,RSI,RI,SI,S,SS,SSE,RXE,RSE,RIL,RIE,RXY,RSY,SIY,RRF,RRR,SIL,RRS,RIS,VRI,VRR,VRS,VRV,VRX"
   (const_string "NN"))
 
 ;; Instruction type attribute used for scheduling.
@@ -403,12 +424,13 @@
 
 ;; Iterators
 
+(define_mode_iterator ALL [TI DI SI HI QI TF DF SF TD DD SD V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI V1DI V2DI V1SF V2SF V4SF V1TI V1DF V2DF V1TF])
+
 ;; These mode iterators allow floating point patterns to be generated from the
 ;; same template.
 (define_mode_iterator FP_ALL [TF DF SF (TD "TARGET_HARD_DFP") (DD "TARGET_HARD_DFP")
                               (SD "TARGET_HARD_DFP")])
 (define_mode_iterator FP [TF DF SF (TD "TARGET_HARD_DFP") (DD "TARGET_HARD_DFP")])
-(define_mode_iterator FPALL [TF DF SF TD DD SD])
 (define_mode_iterator BFP [TF DF SF])
 (define_mode_iterator DFP [TD DD])
 (define_mode_iterator DFP_ALL [TD DD SD])
@@ -444,7 +466,6 @@
 ;; This mode iterator allows the integer patterns to be defined from the
 ;; same template.
 (define_mode_iterator INT [(DI "TARGET_ZARCH") SI HI QI])
-(define_mode_iterator INTALL [TI DI SI HI QI])
 (define_mode_iterator DINT [(TI "TARGET_ZARCH") DI SI HI QI])
 
 ;; This iterator allows some 'ashift' and 'lshiftrt' pattern to be defined from
@@ -614,6 +635,8 @@
 ;; Allow return and simple_return to be defined from a single template.
 (define_code_iterator ANY_RETURN [return simple_return])
 
+(include "vector.md")
+
 ;;
 ;;- Compare instructions.
 ;;
@@ -1246,17 +1269,27 @@
 ; movti instruction pattern(s).
 ;
 
+; FIXME: More constants are possible by enabling jxx, jyy constraints
+; for TImode (use double-int for the calculations)
 (define_insn "movti"
-  [(set (match_operand:TI 0 "nonimmediate_operand" "=d,QS,d,o")
-        (match_operand:TI 1 "general_operand" "QS,d,dPRT,d"))]
+  [(set (match_operand:TI 0 "nonimmediate_operand" "=d,QS,v,  v,  v,v,d, v,QR,   d,o")
+        (match_operand:TI 1 "general_operand"      "QS, d,v,j00,jm1,d,v,QR, v,dPRT,d"))]
   "TARGET_ZARCH"
   "@
    lmg\t%0,%N0,%S1
    stmg\t%1,%N1,%S0
+   vlr\t%v0,%v1
+   vzero\t%v0
+   vone\t%v0
+   vlvgp\t%v0,%1,%N1
+   #
+   vl\t%v0,%1
+   vst\t%v1,%0
    #
    #"
-  [(set_attr "op_type" "RSY,RSY,*,*")
-   (set_attr "type" "lm,stm,*,*")])
+  [(set_attr "op_type" "RSY,RSY,VRR,VRI,VRI,VRR,*,VRX,VRX,*,*")
+   (set_attr "type" "lm,stm,*,*,*,*,*,*,*,*,*")
+   (set_attr "cpu_facility" "*,*,vec,vec,vec,vec,vec,vec,vec,*,*")])
 
 (define_split
   [(set (match_operand:TI 0 "nonimmediate_operand" "")
@@ -1286,10 +1319,14 @@
   operands[5] = operand_subword (operands[1], 0, 0, TImode);
 })
 
+; Use part of the TImode target reg to perform the address
+; calculation.  If the TImode value is supposed to be copied into a VR
+; this splitter is not necessary.
 (define_split
   [(set (match_operand:TI 0 "register_operand" "")
         (match_operand:TI 1 "memory_operand" ""))]
   "TARGET_ZARCH && reload_completed
+   && !VECTOR_REG_P (operands[0])
    && !s_operand (operands[1], VOIDmode)"
   [(set (match_dup 0) (match_dup 1))]
 {
@@ -1300,6 +1337,25 @@
 })
 
 
+; Split a VR -> GPR TImode move into two vector-load-GR-from-VR-element
+; instructions.  The high-order bits are moved with a plain DImode move,
+; the low-order bits via vec extract.  Both end up as vlgvg.
+(define_split
+  [(set (match_operand:TI 0 "register_operand" "")
+        (match_operand:TI 1 "register_operand" ""))]
+  "TARGET_VX && reload_completed
+   && GENERAL_REG_P (operands[0])
+   && VECTOR_REG_P (operands[1])"
+  [(set (match_dup 2) (match_dup 4))
+   (set (match_dup 3) (unspec:DI [(match_dup 5) (const_int 1)]
+				 UNSPEC_VEC_EXTRACT))]
+{
+  operands[2] = operand_subword (operands[0], 0, 0, TImode);
+  operands[3] = operand_subword (operands[0], 1, 0, TImode);
+  operands[4] = gen_rtx_REG (DImode, REGNO (operands[1]));
+  operands[5] = gen_rtx_REG (V2DImode, REGNO (operands[1]));
+})
+
 ;
 ; Patterns used for secondary reloads
 ;
@@ -1308,40 +1364,20 @@
 ; Unfortunately there is no such variant for QI, TI and FP mode moves.
 ; These patterns are also used for unaligned SI and DI accesses.
 
-(define_expand "reload<INTALL:mode><P:mode>_tomem_z10"
-  [(parallel [(match_operand:INTALL 0 "memory_operand"   "")
-	      (match_operand:INTALL 1 "register_operand" "=d")
-	      (match_operand:P 2 "register_operand" "=&a")])]
+(define_expand "reload<ALL:mode><P:mode>_tomem_z10"
+  [(parallel [(match_operand:ALL 0 "memory_operand"   "")
+	      (match_operand:ALL 1 "register_operand" "=d")
+	      (match_operand:P   2 "register_operand" "=&a")])]
   "TARGET_Z10"
 {
   s390_reload_symref_address (operands[1], operands[0], operands[2], 1);
   DONE;
 })
 
-(define_expand "reload<INTALL:mode><P:mode>_toreg_z10"
-  [(parallel [(match_operand:INTALL 0 "register_operand" "=d")
-	      (match_operand:INTALL 1 "memory_operand"   "")
-	      (match_operand:P 2 "register_operand" "=a")])]
-  "TARGET_Z10"
-{
-  s390_reload_symref_address (operands[0], operands[1], operands[2], 0);
-  DONE;
-})
-
-(define_expand "reload<FPALL:mode><P:mode>_tomem_z10"
-  [(parallel [(match_operand:FPALL 0 "memory_operand"   "")
-	      (match_operand:FPALL 1 "register_operand" "=d")
-	      (match_operand:P 2 "register_operand" "=&a")])]
-  "TARGET_Z10"
-{
-  s390_reload_symref_address (operands[1], operands[0], operands[2], 1);
-  DONE;
-})
-
-(define_expand "reload<FPALL:mode><P:mode>_toreg_z10"
-  [(parallel [(match_operand:FPALL 0 "register_operand" "=d")
-	      (match_operand:FPALL 1 "memory_operand"   "")
-	      (match_operand:P 2 "register_operand" "=a")])]
+(define_expand "reload<ALL:mode><P:mode>_toreg_z10"
+  [(parallel [(match_operand:ALL 0 "register_operand" "=d")
+	      (match_operand:ALL 1 "memory_operand"   "")
+	      (match_operand:P   2 "register_operand" "=a")])]
   "TARGET_Z10"
 {
   s390_reload_symref_address (operands[0], operands[1], operands[2], 0);
@@ -1370,9 +1406,16 @@
   DONE;
 })
 
-; Handles assessing a non-offsetable memory address
+; Not all indirect memory access instructions support the full
+; address format (long disp + index + base).  So whenever a move
+; from/to such an address is required and the instruction cannot
+; handle it, we first load the address into a scratch register and
+; use that as the new base register.
+; This is used in particular for:
+; - non-offsettable memory accesses for multiword moves
+; - full vector reg moves with long displacements
 
-(define_expand "reload<mode>_nonoffmem_in"
+(define_expand "reload<mode>_la_in"
   [(parallel [(match_operand 0   "register_operand" "")
               (match_operand 1   "" "")
               (match_operand:P 2 "register_operand" "=&a")])]
@@ -1385,7 +1428,7 @@
   DONE;
 })
 
-(define_expand "reload<mode>_nonoffmem_out"
+(define_expand "reload<mode>_la_out"
   [(parallel [(match_operand   0 "" "")
               (match_operand   1 "register_operand" "")
               (match_operand:P 2 "register_operand" "=&a")])]
@@ -1438,11 +1481,9 @@
 
 (define_insn "*movdi_64"
   [(set (match_operand:DI 0 "nonimmediate_operand"
-                            "=d,d,d,d,d,d,d,d,f,d,d,d,d,d,
-                             RT,!*f,!*f,!*f,!R,!T,b,Q,d,t,Q,t")
+         "=d,    d,    d,    d,    d, d,    d,    d,f,d,d,d,d, d,RT,!*f,!*f,!*f,!R,!T,b,Q,d,t,Q,t,v,v,v,d, v,QR")
         (match_operand:DI 1 "general_operand"
-                            "K,N0HD0,N1HD0,N2HD0,N3HD0,Os,N0SD0,N1SD0,d,f,L,b,d,RT,
-                             d,*f,R,T,*f,*f,d,K,t,d,t,Q"))]
+         " K,N0HD0,N1HD0,N2HD0,N3HD0,Os,N0SD0,N1SD0,d,f,L,b,d,RT, d, *f,  R,  T,*f,*f,d,K,t,d,t,Q,K,v,d,v,QR, v"))]
   "TARGET_ZARCH"
   "@
    lghi\t%0,%h1
@@ -1470,15 +1511,21 @@
    #
    #
    stam\t%1,%N1,%S0
-   lam\t%0,%N0,%S1"
+   lam\t%0,%N0,%S1
+   vleig\t%v0,%h1,0
+   vlr\t%v0,%v1
+   vlvgg\t%v0,%1,0
+   vlgvg\t%0,%v1,0
+   vleg\t%v0,%1,0
+   vsteg\t%v1,%0,0"
   [(set_attr "op_type" "RI,RI,RI,RI,RI,RIL,RIL,RIL,RRE,RRE,RXY,RIL,RRE,RXY,
-                        RXY,RR,RX,RXY,RX,RXY,RIL,SIL,*,*,RS,RS")
+                        RXY,RR,RX,RXY,RX,RXY,RIL,SIL,*,*,RS,RS,VRI,VRR,VRS,VRS,VRX,VRX")
    (set_attr "type" "*,*,*,*,*,*,*,*,floaddf,floaddf,la,larl,lr,load,store,
-                     floaddf,floaddf,floaddf,fstoredf,fstoredf,larl,*,*,*,
-                     *,*")
+                     floaddf,floaddf,floaddf,fstoredf,fstoredf,larl,*,*,*,*,
+                     *,*,*,*,*,*,*")
    (set_attr "cpu_facility" "*,*,*,*,*,extimm,extimm,extimm,dfp,dfp,longdisp,
                              z10,*,*,*,*,*,longdisp,*,longdisp,
-                             z10,z10,*,*,*,*")
+                             z10,z10,*,*,*,*,vec,vec,vec,vec,vec,vec")
    (set_attr "z10prop" "z10_fwd_A1,
                         z10_fwd_E1,
                         z10_fwd_E1,
@@ -1504,7 +1551,7 @@
                         *,
                         *,
                         *,
-                        *")
+                        *,*,*,*,*,*,*")
 ])
 
 (define_split
@@ -1696,9 +1743,9 @@
 
 (define_insn "*movsi_zarch"
   [(set (match_operand:SI 0 "nonimmediate_operand"
-			    "=d,d,d,d,d,d,d,d,d,R,T,!*f,!*f,!*f,!R,!T,d,t,Q,b,Q,t")
+	 "=d,    d,    d, d,d,d,d,d,d,R,T,!*f,!*f,!*f,!*f,!*f,!R,!T,d,t,Q,b,Q,t,v,v,v,d, v,QR")
         (match_operand:SI 1 "general_operand"
-			    "K,N0HS0,N1HS0,Os,L,b,d,R,T,d,d,*f,R,T,*f,*f,t,d,t,d,K,Q"))]
+	 " K,N0HS0,N1HS0,Os,L,b,d,R,T,d,d, *f, *f,  R,  R,  T,*f,*f,t,d,t,d,K,Q,K,v,d,v,QR, v"))]
   "TARGET_ZARCH"
   "@
    lhi\t%0,%h1
@@ -1712,7 +1759,9 @@
    ly\t%0,%1
    st\t%1,%0
    sty\t%1,%0
+   lder\t%0,%1
    ler\t%0,%1
+   lde\t%0,%1
    le\t%0,%1
    ley\t%0,%1
    ste\t%1,%0
@@ -1722,9 +1771,15 @@
    stam\t%1,%1,%S0
    strl\t%1,%0
    mvhi\t%0,%1
-   lam\t%0,%0,%S1"
+   lam\t%0,%0,%S1
+   vleif\t%v0,%h1,0
+   vlr\t%v0,%v1
+   vlvgf\t%v0,%1,0
+   vlgvf\t%0,%v1,0
+   vlef\t%v0,%1,0
+   vstef\t%v1,%0,0"
   [(set_attr "op_type" "RI,RI,RI,RIL,RXY,RIL,RR,RX,RXY,RX,RXY,
-                        RR,RX,RXY,RX,RXY,RRE,RRE,RS,RIL,SIL,RS")
+                        RRE,RR,RXE,RX,RXY,RX,RXY,RRE,RRE,RS,RIL,SIL,RS,VRI,VRR,VRS,VRS,VRX,VRX")
    (set_attr "type" "*,
                      *,
                      *,
@@ -1739,6 +1794,8 @@
                      floadsf,
                      floadsf,
                      floadsf,
+                     floadsf,
+                     floadsf,
                      fstoresf,
                      fstoresf,
                      *,
@@ -1746,9 +1803,9 @@
                      *,
                      larl,
                      *,
-                     *")
+                     *,*,*,*,*,*,*")
    (set_attr "cpu_facility" "*,*,*,extimm,longdisp,z10,*,*,longdisp,*,longdisp,
-                             *,*,longdisp,*,longdisp,*,*,*,z10,z10,*")
+                             vec,*,vec,*,longdisp,*,longdisp,*,*,*,z10,z10,*,vec,vec,vec,vec,vec,vec")
    (set_attr "z10prop" "z10_fwd_A1,
                         z10_fwd_E1,
                         z10_fwd_E1,
@@ -1765,42 +1822,38 @@
                         *,
                         *,
                         *,
+                        *,
+                        *,
                         z10_super_E1,
                         z10_super,
                         *,
                         z10_rec,
                         z10_super,
-                        *")])
+                        *,*,*,*,*,*,*")])
 
 (define_insn "*movsi_esa"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=d,d,d,R,!*f,!*f,!R,d,t,Q,t")
-        (match_operand:SI 1 "general_operand" "K,d,R,d,*f,R,*f,t,d,t,Q"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=d,d,d,R,!*f,!*f,!*f,!*f,!R,d,t,Q,t")
+        (match_operand:SI 1 "general_operand"       "K,d,R,d, *f, *f,  R,  R,*f,t,d,t,Q"))]
   "!TARGET_ZARCH"
   "@
    lhi\t%0,%h1
    lr\t%0,%1
    l\t%0,%1
    st\t%1,%0
+   lder\t%0,%1
    ler\t%0,%1
+   lde\t%0,%1
    le\t%0,%1
    ste\t%1,%0
    ear\t%0,%1
    sar\t%0,%1
    stam\t%1,%1,%S0
    lam\t%0,%0,%S1"
-  [(set_attr "op_type" "RI,RR,RX,RX,RR,RX,RX,RRE,RRE,RS,RS")
-   (set_attr "type" "*,lr,load,store,floadsf,floadsf,fstoresf,*,*,*,*")
-   (set_attr "z10prop" "z10_fwd_A1,
-                        z10_fr_E1,
-                        z10_fwd_A3,
-                        z10_rec,
-                        *,
-                        *,
-                        *,
-                        z10_super_E1,
-                        z10_super,
-                        *,
-                        *")
+  [(set_attr "op_type" "RI,RR,RX,RX,RRE,RR,RXE,RX,RX,RRE,RRE,RS,RS")
+   (set_attr "type" "*,lr,load,store,floadsf,floadsf,floadsf,floadsf,fstoresf,*,*,*,*")
+   (set_attr "z10prop" "z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_rec,*,*,*,*,*,z10_super_E1,
+                        z10_super,*,*")
+   (set_attr "cpu_facility" "*,*,*,*,vec,*,vec,*,*,*,*,*,*")
 ])
 
 (define_peephole2
@@ -1910,8 +1963,8 @@
 })
 
 (define_insn "*movhi"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=d,d,d,d,d,R,T,b,Q")
-        (match_operand:HI 1 "general_operand"      " d,n,R,T,b,d,d,d,K"))]
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=d,d,d,d,d,R,T,b,Q,v,v,v,d, v,QR")
+        (match_operand:HI 1 "general_operand"      " d,n,R,T,b,d,d,d,K,K,v,d,v,QR, v"))]
   ""
   "@
    lr\t%0,%1
@@ -1922,10 +1975,16 @@
    sth\t%1,%0
    sthy\t%1,%0
    sthrl\t%1,%0
-   mvhhi\t%0,%1"
-  [(set_attr "op_type"      "RR,RI,RX,RXY,RIL,RX,RXY,RIL,SIL")
-   (set_attr "type"         "lr,*,*,*,larl,store,store,store,*")
-   (set_attr "cpu_facility" "*,*,*,*,z10,*,*,z10,z10")
+   mvhhi\t%0,%1
+   vleih\t%v0,%h1,0
+   vlr\t%v0,%v1
+   vlvgh\t%v0,%1,0
+   vlgvh\t%0,%v1,0
+   vleh\t%v0,%1,0
+   vsteh\t%v1,%0,0"
+  [(set_attr "op_type"      "RR,RI,RX,RXY,RIL,RX,RXY,RIL,SIL,VRI,VRR,VRS,VRS,VRX,VRX")
+   (set_attr "type"         "lr,*,*,*,larl,store,store,store,*,*,*,*,*,*,*")
+   (set_attr "cpu_facility" "*,*,*,*,z10,*,*,z10,z10,vec,vec,vec,vec,vec,vec")
    (set_attr "z10prop" "z10_fr_E1,
                        z10_fwd_A1,
                        z10_super_E1,
@@ -1934,7 +1993,7 @@
                        z10_rec,
                        z10_rec,
                        z10_rec,
-                       z10_super")])
+                       z10_super,*,*,*,*,*,*")])
 
 (define_peephole2
   [(set (match_operand:HI 0 "register_operand" "")
@@ -1969,8 +2028,8 @@
 })
 
 (define_insn "*movqi"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=d,d,d,d,R,T,Q,S,?Q")
-        (match_operand:QI 1 "general_operand"      " d,n,R,T,d,d,n,n,?Q"))]
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=d,d,d,d,R,T,Q,S,?Q,v,v,v,d, v,QR")
+        (match_operand:QI 1 "general_operand"      " d,n,R,T,d,d,n,n,?Q,K,v,d,v,QR, v"))]
   ""
   "@
    lr\t%0,%1
@@ -1981,9 +2040,16 @@
    stcy\t%1,%0
    mvi\t%S0,%b1
    mviy\t%S0,%b1
-   #"
-  [(set_attr "op_type" "RR,RI,RX,RXY,RX,RXY,SI,SIY,SS")
-   (set_attr "type" "lr,*,*,*,store,store,store,store,*")
+   #
+   vleib\t%v0,%b1,0
+   vlr\t%v0,%v1
+   vlvgb\t%v0,%1,0
+   vlgvb\t%0,%v1,0
+   vleb\t%v0,%1,0
+   vsteb\t%v1,%0,0"
+  [(set_attr "op_type" "RR,RI,RX,RXY,RX,RXY,SI,SIY,SS,VRI,VRR,VRS,VRS,VRX,VRX")
+   (set_attr "type" "lr,*,*,*,store,store,store,store,*,*,*,*,*,*,*")
+   (set_attr "cpu_facility" "*,*,*,*,*,*,*,*,*,vec,vec,vec,vec,vec,vec")
    (set_attr "z10prop" "z10_fr_E1,
                         z10_fwd_A1,
                         z10_super_E1,
@@ -1992,7 +2058,7 @@
                         z10_rec,
                         z10_super,
                         z10_super,
-                        *")])
+                        *,*,*,*,*,*,*")])
 
 (define_peephole2
   [(set (match_operand:QI 0 "nonimmediate_operand" "")
@@ -2124,7 +2190,7 @@
   [(set (match_operand:TD_TF 0 "register_operand" "")
         (match_operand:TD_TF 1 "memory_operand"   ""))]
   "TARGET_ZARCH && reload_completed
-   && !FP_REG_P (operands[0])
+   && GENERAL_REG_P (operands[0])
    && !s_operand (operands[1], VOIDmode)"
   [(set (match_dup 0) (match_dup 1))]
 {
@@ -2180,9 +2246,9 @@
 
 (define_insn "*mov<mode>_64dfp"
   [(set (match_operand:DD_DF 0 "nonimmediate_operand"
-			       "=f,f,f,d,f,f,R,T,d,d, d,RT")
+			       "=f,f,f,d,f,f,R,T,d,d,d, d,b,RT,v,v,d,v,QR")
         (match_operand:DD_DF 1 "general_operand"
-			       " G,f,d,f,R,T,f,f,G,d,RT, d"))]
+			       " G,f,d,f,R,T,f,f,G,d,b,RT,d, d,v,d,v,QR,v"))]
   "TARGET_DFP"
   "@
    lzdr\t%0
@@ -2195,17 +2261,24 @@
    stdy\t%1,%0
    lghi\t%0,0
    lgr\t%0,%1
+   lgrl\t%0,%1
    lg\t%0,%1
-   stg\t%1,%0"
-  [(set_attr "op_type" "RRE,RR,RRE,RRE,RX,RXY,RX,RXY,RI,RRE,RXY,RXY")
+   stgrl\t%1,%0
+   stg\t%1,%0
+   vlr\t%v0,%v1
+   vlvgg\t%v0,%1,0
+   vlgvg\t%0,%v1,0
+   vleg\t%v0,%1,0
+   vsteg\t%v1,%0,0"
+  [(set_attr "op_type" "RRE,RR,RRE,RRE,RX,RXY,RX,RXY,RI,RRE,RIL,RXY,RIL,RXY,VRR,VRS,VRS,VRX,VRX")
    (set_attr "type" "fsimpdf,floaddf,floaddf,floaddf,floaddf,floaddf,
-                     fstoredf,fstoredf,*,lr,load,store")
-   (set_attr "z10prop" "*,*,*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_rec")
-   (set_attr "cpu_facility" "z196,*,*,*,*,*,*,*,*,*,*,*")])
+                     fstoredf,fstoredf,*,lr,load,load,store,store,*,*,*,load,store")
+   (set_attr "z10prop" "*,*,*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec,*,*,*,*,*")
+   (set_attr "cpu_facility" "z196,*,*,*,*,*,*,*,*,*,z10,*,z10,*,vec,vec,vec,vec,vec")])
 
 (define_insn "*mov<mode>_64"
-  [(set (match_operand:DD_DF 0 "nonimmediate_operand" "=f,f,f,f,R,T,d,d, d,RT")
-        (match_operand:DD_DF 1 "general_operand"      " G,f,R,T,f,f,G,d,RT, d"))]
+  [(set (match_operand:DD_DF 0 "nonimmediate_operand" "=f,f,f,f,R,T,d,d,d, d,b,RT,v,v,QR")
+        (match_operand:DD_DF 1 "general_operand"      " G,f,R,T,f,f,G,d,b,RT,d, d,v,QR,v"))]
   "TARGET_ZARCH"
   "@
    lzdr\t%0
@@ -2216,13 +2289,18 @@
    stdy\t%1,%0
    lghi\t%0,0
    lgr\t%0,%1
+   lgrl\t%0,%1
    lg\t%0,%1
-   stg\t%1,%0"
-  [(set_attr "op_type" "RRE,RR,RX,RXY,RX,RXY,RI,RRE,RXY,RXY")
+   stgrl\t%1,%0
+   stg\t%1,%0
+   vlr\t%v0,%v1
+   vleg\t%v0,%1,0
+   vsteg\t%v1,%0,0"
+  [(set_attr "op_type" "RRE,RR,RX,RXY,RX,RXY,RI,RRE,RIL,RXY,RIL,RXY,VRR,VRX,VRX")
    (set_attr "type"    "fsimpdf,fload<mode>,fload<mode>,fload<mode>,
-                        fstore<mode>,fstore<mode>,*,lr,load,store")
-   (set_attr "z10prop" "*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_rec")
-   (set_attr "cpu_facility" "z196,*,*,*,*,*,*,*,*,*")])
+                        fstore<mode>,fstore<mode>,*,lr,load,load,store,store,*,load,store")
+   (set_attr "z10prop" "*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec,*,*,*")
+   (set_attr "cpu_facility" "z196,*,*,*,*,*,*,*,z10,*,z10,*,vec,vec,vec")])
 
 (define_insn "*mov<mode>_31"
   [(set (match_operand:DD_DF 0 "nonimmediate_operand"
@@ -2295,28 +2373,38 @@
 
 (define_insn "mov<mode>"
   [(set (match_operand:SD_SF 0 "nonimmediate_operand"
-			       "=f,f,f,f,R,T,d,d,d,d,R,T")
+			       "=f,f,f,f,f,f,R,T,d,d,d,d,d,b,R,T,v,v,v,d,v,QR")
         (match_operand:SD_SF 1 "general_operand"
-			       " G,f,R,T,f,f,G,d,R,T,d,d"))]
+			       " G,f,f,R,R,T,f,f,G,d,b,R,T,d,d,d,v,G,d,v,QR,v"))]
   ""
   "@
    lzer\t%0
+   lder\t%0,%1
    ler\t%0,%1
+   lde\t%0,%1
    le\t%0,%1
    ley\t%0,%1
    ste\t%1,%0
    stey\t%1,%0
    lhi\t%0,0
    lr\t%0,%1
+   lrl\t%0,%1
    l\t%0,%1
    ly\t%0,%1
+   strl\t%1,%0
    st\t%1,%0
-   sty\t%1,%0"
-  [(set_attr "op_type" "RRE,RR,RX,RXY,RX,RXY,RI,RR,RX,RXY,RX,RXY")
-   (set_attr "type"    "fsimpsf,fload<mode>,fload<mode>,fload<mode>,
-                        fstore<mode>,fstore<mode>,*,lr,load,load,store,store")
-   (set_attr "z10prop" "*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec")
-   (set_attr "cpu_facility" "z196,*,*,*,*,*,*,*,*,*,*,*")])
+   sty\t%1,%0
+   vlr\t%v0,%v1
+   vleif\t%v0,0
+   vlvgf\t%v0,%1,0
+   vlgvf\t%0,%v1,0
+   vleg\t%0,%1,0
+   vsteg\t%1,%0,0"
+  [(set_attr "op_type" "RRE,RRE,RR,RXE,RX,RXY,RX,RXY,RI,RR,RIL,RX,RXY,RIL,RX,RXY,VRR,VRI,VRS,VRS,VRX,VRX")
+   (set_attr "type"    "fsimpsf,fsimpsf,fload<mode>,fload<mode>,fload<mode>,fload<mode>,
+                        fstore<mode>,fstore<mode>,*,lr,load,load,load,store,store,store,*,*,*,*,load,store")
+   (set_attr "z10prop" "*,*,*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec,z10_rec,*,*,*,*,*,*")
+   (set_attr "cpu_facility" "z196,vec,*,vec,*,*,*,*,*,*,z10,*,*,z10,*,*,vec,vec,vec,vec,vec,vec")])
 
 ;
 ; movcc instruction pattern
@@ -2607,6 +2695,22 @@
 ;
 
 (define_expand "strlen<mode>"
+  [(match_operand:P   0 "register_operand" "")  ; result
+   (match_operand:BLK 1 "memory_operand" "")    ; input string
+   (match_operand:SI  2 "immediate_operand" "") ; search character
+   (match_operand:SI  3 "immediate_operand" "")] ; known alignment
+  ""
+{
+  if (!TARGET_VX || operands[2] != const0_rtx)
+    emit_insn (gen_strlen_srst<mode> (operands[0], operands[1],
+				      operands[2], operands[3]));
+  else
+    s390_expand_vec_strlen (operands[0], operands[1], operands[3]);
+
+  DONE;
+})
+
+(define_expand "strlen_srst<mode>"
   [(set (reg:SI 0) (match_operand:SI 2 "immediate_operand" ""))
    (parallel
     [(set (match_dup 4)
@@ -2916,8 +3020,12 @@
   operands[2] = GEN_INT (S390_TDC_INFINITY);
 })
 
+; This extracts CC into a GPR properly shifted.  The actual IPM
+; instruction will be issued by reload.  The constraint of operand 1
+; forces reload to use a GPR.  So reload will issue a movcc insn for
+; copying CC into a GPR first.
 (define_insn_and_split "*cc_to_int"
-  [(set (match_operand:SI 0 "register_operand" "=d")
+  [(set (match_operand:SI 0 "nonimmediate_operand"     "=d")
         (unspec:SI [(match_operand 1 "register_operand" "0")]
                    UNSPEC_CC_TO_INT))]
   "operands != NULL"
@@ -4668,10 +4776,30 @@
 ; addti3 instruction pattern(s).
 ;
 
-(define_insn_and_split "addti3"
-  [(set (match_operand:TI 0 "register_operand" "=&d")
+(define_expand "addti3"
+  [(parallel
+    [(set (match_operand:TI          0 "register_operand"     "")
+	  (plus:TI (match_operand:TI 1 "nonimmediate_operand" "")
+		   (match_operand:TI 2 "general_operand"      "") ) )
+     (clobber (reg:CC CC_REGNUM))])]
+  "TARGET_ZARCH"
+{
+  /* For z13 we have vaq which doesn't set CC.  */
+  if (TARGET_VX)
+    {
+      emit_insn (gen_rtx_SET (TImode,
+			      operands[0],
+			      gen_rtx_PLUS (TImode,
+                                            copy_to_mode_reg (TImode, operands[1]),
+                                            copy_to_mode_reg (TImode, operands[2]))));
+      DONE;
+    }
+})
+
+(define_insn_and_split "*addti3"
+  [(set (match_operand:TI          0 "register_operand"    "=&d")
         (plus:TI (match_operand:TI 1 "nonimmediate_operand" "%0")
-                 (match_operand:TI 2 "general_operand" "do") ) )
+                 (match_operand:TI 2 "general_operand"      "do") ) )
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_ZARCH"
   "#"
@@ -4691,7 +4819,9 @@
    operands[5] = operand_subword (operands[2], 0, 0, TImode);
    operands[6] = operand_subword (operands[0], 1, 0, TImode);
    operands[7] = operand_subword (operands[1], 1, 0, TImode);
-   operands[8] = operand_subword (operands[2], 1, 0, TImode);")
+   operands[8] = operand_subword (operands[2], 1, 0, TImode);"
+  [(set_attr "op_type"  "*")
+   (set_attr "cpu_facility" "*")])
 
 ;
 ; adddi3 instruction pattern(s).
@@ -5129,10 +5259,30 @@
 ; subti3 instruction pattern(s).
 ;
 
-(define_insn_and_split "subti3"
-  [(set (match_operand:TI 0 "register_operand" "=&d")
-        (minus:TI (match_operand:TI 1 "register_operand" "0")
-                  (match_operand:TI 2 "general_operand" "do") ) )
+(define_expand "subti3"
+  [(parallel
+    [(set (match_operand:TI           0 "register_operand" "")
+	  (minus:TI (match_operand:TI 1 "register_operand" "")
+		    (match_operand:TI 2 "general_operand"  "") ) )
+     (clobber (reg:CC CC_REGNUM))])]
+  "TARGET_ZARCH"
+{
+  /* For z13 we have vsq which doesn't set CC.  */
+  if (TARGET_VX)
+    {
+      emit_insn (gen_rtx_SET (TImode,
+			      operands[0],
+			      gen_rtx_MINUS (TImode,
+                                            operands[1],
+                                            copy_to_mode_reg (TImode, operands[2]))));
+      DONE;
+    }
+})
+
+(define_insn_and_split "*subti3"
+  [(set (match_operand:TI           0 "register_operand" "=&d")
+        (minus:TI (match_operand:TI 1 "register_operand"   "0")
+                  (match_operand:TI 2 "general_operand"   "do") ) )
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_ZARCH"
   "#"
@@ -5151,7 +5301,9 @@
    operands[5] = operand_subword (operands[2], 0, 0, TImode);
    operands[6] = operand_subword (operands[0], 1, 0, TImode);
    operands[7] = operand_subword (operands[1], 1, 0, TImode);
-   operands[8] = operand_subword (operands[2], 1, 0, TImode);")
+   operands[8] = operand_subword (operands[2], 1, 0, TImode);"
+  [(set_attr "op_type"      "*")
+   (set_attr "cpu_facility" "*")])
 
 ;
 ; subdi3 instruction pattern(s).
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
new file mode 100644
index 0000000..f07f5a7
--- /dev/null
+++ b/gcc/config/s390/vector.md
@@ -0,0 +1,1226 @@
+;;- Instruction patterns for the System z vector facility
+;;  Copyright (C) 2015 Free Software Foundation, Inc.
+;;  Contributed by Andreas Krebbel (Andreas.Krebbel@de.ibm.com)
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify it under
+;; the terms of the GNU General Public License as published by the Free
+;; Software Foundation; either version 3, or (at your option) any later
+;; version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
+;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+;; for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+; All vector modes supported in a vector register
+(define_mode_iterator V
+  [V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI V1DI V2DI V1SF
+   V2SF V4SF V1DF V2DF])
+(define_mode_iterator VT
+  [V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI V1DI V2DI V1SF
+   V2SF V4SF V1DF V2DF V1TF V1TI TI])
+
+; All vector modes directly supported by the hardware having full vector reg size
+; V_HW2 is a duplicate of V_HW so that two iterators can expand
+; independently, e.g. for vcond
+(define_mode_iterator V_HW  [V16QI V8HI V4SI V2DI V2DF])
+(define_mode_iterator V_HW2 [V16QI V8HI V4SI V2DI V2DF])
+; Including TI for instructions that support it (va, vn, ...)
+(define_mode_iterator VT_HW [V16QI V8HI V4SI V2DI V2DF V1TI TI])
+
+; All full size integer vector modes supported in a vector register + TImode
+(define_mode_iterator VIT_HW    [V16QI V8HI V4SI V2DI V1TI TI])
+(define_mode_iterator VI_HW     [V16QI V8HI V4SI V2DI])
+(define_mode_iterator VI_HW_QHS [V16QI V8HI V4SI])
+(define_mode_iterator VI_HW_HS  [V8HI V4SI])
+(define_mode_iterator VI_HW_QH  [V16QI V8HI])
+
+; All integer vector modes supported in a vector register + TImode
+(define_mode_iterator VIT [V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI V1DI V2DI V1TI TI])
+(define_mode_iterator VI  [V2QI V4QI V8QI V16QI V2HI V4HI V8HI V2SI V4SI V2DI])
+(define_mode_iterator VI_QHS [V4QI V8QI V16QI V4HI V8HI V4SI])
+
+(define_mode_iterator V_8   [V1QI])
+(define_mode_iterator V_16  [V2QI  V1HI])
+(define_mode_iterator V_32  [V4QI  V2HI V1SI V1SF])
+(define_mode_iterator V_64  [V8QI  V4HI V2SI V2SF V1DI V1DF])
+(define_mode_iterator V_128 [V16QI V8HI V4SI V4SF V2DI V2DF V1TI V1TF])
+
+; A blank for vector modes and a * for TImode.  This is used to hide
+; the TImode expander name in case it is defined already.  See addti3
+; for an example.
+(define_mode_attr ti* [(V1QI "") (V2QI "") (V4QI "") (V8QI "") (V16QI "")
+		       (V1HI "") (V2HI "") (V4HI "") (V8HI "")
+		       (V1SI "") (V2SI "") (V4SI "")
+		       (V1DI "") (V2DI "")
+		       (V1TI "*") (TI "*")])
+
+; The element type of the vector.
+(define_mode_attr non_vec[(V1QI "QI") (V2QI "QI") (V4QI "QI") (V8QI "QI") (V16QI "QI")
+			  (V1HI "HI") (V2HI "HI") (V4HI "HI") (V8HI "HI")
+			  (V1SI "SI") (V2SI "SI") (V4SI "SI")
+			  (V1DI "DI") (V2DI "DI")
+			  (V1TI "TI")
+			  (V1SF "SF") (V2SF "SF") (V4SF "SF")
+			  (V1DF "DF") (V2DF "DF")
+			  (V1TF "TF")])
+
+; The instruction suffix
+(define_mode_attr bhfgq[(V1QI "b") (V2QI "b") (V4QI "b") (V8QI "b") (V16QI "b")
+			(V1HI "h") (V2HI "h") (V4HI "h") (V8HI "h")
+			(V1SI "f") (V2SI "f") (V4SI "f")
+			(V1DI "g") (V2DI "g")
+			(V1TI "q") (TI "q")
+			(V1SF "f") (V2SF "f") (V4SF "f")
+			(V1DF "g") (V2DF "g")
+			(V1TF "q")])
+
+; This is for vmalhw.  It gets a 'w' attached to avoid confusion with
+; multiply and add logical high vmalh.
+(define_mode_attr w [(V1QI "")  (V2QI "")  (V4QI "")  (V8QI "") (V16QI "")
+		     (V1HI "w") (V2HI "w") (V4HI "w") (V8HI "w")
+		     (V1SI "")  (V2SI "")  (V4SI "")
+		     (V1DI "")  (V2DI "")])
+
+; Resulting mode of a vector comparison.  For floating point modes an
+; integer vector mode with the same element size is picked.
+(define_mode_attr tointvec [(V1QI "V1QI") (V2QI "V2QI") (V4QI "V4QI") (V8QI "V8QI") (V16QI "V16QI")
+			    (V1HI "V1HI") (V2HI "V2HI") (V4HI "V4HI") (V8HI "V8HI")
+			    (V1SI "V1SI") (V2SI "V2SI") (V4SI "V4SI")
+			    (V1DI "V1DI") (V2DI "V2DI")
+			    (V1TI "V1TI")
+			    (V1SF "V1SI") (V2SF "V2SI") (V4SF "V4SI")
+			    (V1DF "V1DI") (V2DF "V2DI")
+			    (V1TF "V1TI")])
+
+; Vector with doubled element size.
+(define_mode_attr vec_double [(V2QI "V1HI") (V4QI "V2HI") (V8QI "V4HI") (V16QI "V8HI")
+			      (V2HI "V1SI") (V4HI "V2SI") (V8HI "V4SI")
+			      (V2SI "V1DI") (V4SI "V2DI")
+			      (V2DI "V1TI")
+			      (V2SF "V1DF") (V4SF "V2DF")])
+
+; Vector with half the element size.
+(define_mode_attr vec_half [(V1HI "V2QI") (V2HI "V4QI") (V4HI "V8QI") (V8HI "V16QI")
+			    (V1SI "V2HI") (V2SI "V4HI") (V4SI "V8HI")
+			    (V1DI "V2SI") (V2DI "V4SI")
+			    (V1TI "V2DI")
+			    (V1DF "V2SF") (V2DF "V4SF")
+			    (V1TF "V1DF")])
+
+; The comparisons not setting CC iterate over the rtx code.
+(define_code_iterator VFCMP_HW_OP [eq gt ge])
+(define_code_attr asm_fcmp_op [(eq "e") (gt "h") (ge "he")])
+
+
+
+; Comparison operators on int and fp compares which are directly
+; supported by the HW.
+(define_code_iterator VICMP_HW_OP [eq gt gtu])
+; For int insn_cmp_op can be used in the insn name as well as in the asm output.
+(define_code_attr insn_cmp_op [(eq "eq") (gt "h") (gtu "hl") (ge "he")])
+
+; Flags for vector string instructions (vfae all 4, vfee only ZS and CS, vstrc all 4)
+(define_constants
+  [(VSTRING_FLAG_IN         8)   ; invert result
+   (VSTRING_FLAG_RT         4)   ; result type
+   (VSTRING_FLAG_ZS         2)   ; zero search
+   (VSTRING_FLAG_CS         1)]) ; condition code set
+
+; Full HW vector size moves
+(define_insn "mov<mode>"
+  [(set (match_operand:V_128 0 "nonimmediate_operand" "=v, v,QR,  v,  v,  v,  v,v,d")
+	(match_operand:V_128 1 "general_operand"      " v,QR, v,j00,jm1,jyy,jxx,d,v"))]
+  "TARGET_VX"
+  "@
+   vlr\t%v0,%v1
+   vl\t%v0,%1
+   vst\t%v1,%0
+   vzero\t%v0
+   vone\t%v0
+   vgbm\t%v0,%t1
+   vgm<bhfgq>\t%v0,%s1,%e1
+   vlvgp\t%v0,%1,%N1
+   #"
+  [(set_attr "op_type" "VRR,VRX,VRX,VRI,VRI,VRI,VRI,VRR,*")])
+
+(define_split
+  [(set (match_operand:V_128 0 "register_operand" "")
+	(match_operand:V_128 1 "register_operand" ""))]
+  "TARGET_VX && GENERAL_REG_P (operands[0]) && VECTOR_REG_P (operands[1])"
+  [(set (match_dup 2)
+	(unspec:DI [(subreg:V2DI (match_dup 1) 0)
+		    (const_int 0)] UNSPEC_VEC_EXTRACT))
+   (set (match_dup 3)
+	(unspec:DI [(subreg:V2DI (match_dup 1) 0)
+		    (const_int 1)] UNSPEC_VEC_EXTRACT))]
+{
+  operands[2] = operand_subword (operands[0], 0, 0, <MODE>mode);
+  operands[3] = operand_subword (operands[0], 1, 0, <MODE>mode);
+})
+
+; Moves for smaller vector modes.
+
+; In these patterns only the vlr, vone, and vzero instructions write
+; VR bytes outside the mode.  This should be ok since we disallow
+; formerly bigger modes being accessed with smaller modes via
+; subreg. Note: The vone, vzero instructions could easily be replaced
+; with vlei which would only access the bytes belonging to the mode.
+; However, this would probably be slower.
+
+(define_insn "mov<mode>"
+  [(set (match_operand:V_8 0 "nonimmediate_operand" "=v,v,d, v,QR,  v,  v,  v,  v,d,  Q,  S,  Q,  S,  d,  d,d,d,d,R,T")
+        (match_operand:V_8 1 "general_operand"      " v,d,v,QR, v,j00,jm1,jyy,jxx,d,j00,j00,jm1,jm1,j00,jm1,R,T,b,d,d"))]
+  ""
+  "@
+   vlr\t%v0,%v1
+   vlvgb\t%v0,%1,0
+   vlgvb\t%0,%v1,0
+   vleb\t%v0,%1,0
+   vsteb\t%v1,%0,0
+   vzero\t%v0
+   vone\t%v0
+   vgbm\t%v0,%t1
+   vgm\t%v0,%s1,%e1
+   lr\t%0,%1
+   mvi\t%0,0
+   mviy\t%0,0
+   mvi\t%0,-1
+   mviy\t%0,-1
+   lhi\t%0,0
+   lhi\t%0,-1
+   lh\t%0,%1
+   lhy\t%0,%1
+   lhrl\t%0,%1
+   stc\t%1,%0
+   stcy\t%1,%0"
+  [(set_attr "op_type"      "VRR,VRS,VRS,VRX,VRX,VRI,VRI,VRI,VRI,RR,SI,SIY,SI,SIY,RI,RI,RX,RXY,RIL,RX,RXY")])
+
+(define_insn "mov<mode>"
+  [(set (match_operand:V_16 0 "nonimmediate_operand" "=v,v,d, v,QR,  v,  v,  v,  v,d,  Q,  Q,  d,  d,d,d,d,R,T,b")
+        (match_operand:V_16 1 "general_operand"      " v,d,v,QR, v,j00,jm1,jyy,jxx,d,j00,jm1,j00,jm1,R,T,b,d,d,d"))]
+  ""
+  "@
+   vlr\t%v0,%v1
+   vlvgh\t%v0,%1,0
+   vlgvh\t%0,%v1,0
+   vleh\t%v0,%1,0
+   vsteh\t%v1,%0,0
+   vzero\t%v0
+   vone\t%v0
+   vgbm\t%v0,%t1
+   vgm\t%v0,%s1,%e1
+   lr\t%0,%1
+   mvhhi\t%0,0
+   mvhhi\t%0,-1
+   lhi\t%0,0
+   lhi\t%0,-1
+   lh\t%0,%1
+   lhy\t%0,%1
+   lhrl\t%0,%1
+   sth\t%1,%0
+   sthy\t%1,%0
+   sthrl\t%1,%0"
+  [(set_attr "op_type"      "VRR,VRS,VRS,VRX,VRX,VRI,VRI,VRI,VRI,RR,SIL,SIL,RI,RI,RX,RXY,RIL,RX,RXY,RIL")])
+
+(define_insn "mov<mode>"
+  [(set (match_operand:V_32 0 "nonimmediate_operand" "=f,f,f,R,T,v,v,d, v,QR,  f,  v,  v,  v,  v,  Q,  Q,  d,  d,d,d,d,d,R,T,b")
+	(match_operand:V_32 1 "general_operand"      " f,R,T,f,f,v,d,v,QR, v,j00,j00,jm1,jyy,jxx,j00,jm1,j00,jm1,b,d,R,T,d,d,d"))]
+  "TARGET_VX"
+  "@
+   lder\t%v0,%v1
+   lde\t%0,%1
+   ley\t%0,%1
+   ste\t%1,%0
+   stey\t%1,%0
+   vlr\t%v0,%v1
+   vlvgf\t%v0,%1,0
+   vlgvf\t%0,%v1,0
+   vlef\t%v0,%1,0
+   vstef\t%1,%0,0
+   lzer\t%v0
+   vzero\t%v0
+   vone\t%v0
+   vgbm\t%v0,%t1
+   vgm\t%v0,%s1,%e1
+   mvhi\t%0,0
+   mvhi\t%0,-1
+   lhi\t%0,0
+   lhi\t%0,-1
+   lrl\t%0,%1
+   lr\t%0,%1
+   l\t%0,%1
+   ly\t%0,%1
+   st\t%1,%0
+   sty\t%1,%0
+   strl\t%1,%0"
+  [(set_attr "op_type" "RRE,RXE,RXY,RX,RXY,VRR,VRS,VRS,VRX,VRX,RRE,VRI,VRI,VRI,VRI,SIL,SIL,RI,RI,
+                        RIL,RR,RX,RXY,RX,RXY,RIL")])
+
+(define_insn "mov<mode>"
+  [(set (match_operand:V_64 0 "nonimmediate_operand"
+         "=f,f,f,R,T,v,v,d, v,QR,  f,  v,  v,  v,  v,  Q,  Q,  d,  d,f,d,d,d, d,RT,b")
+        (match_operand:V_64 1 "general_operand"
+         " f,R,T,f,f,v,d,v,QR, v,j00,j00,jm1,jyy,jxx,j00,jm1,j00,jm1,d,f,b,d,RT, d,d"))]
+  "TARGET_ZARCH"
+  "@
+   ldr\t%0,%1
+   ld\t%0,%1
+   ldy\t%0,%1
+   std\t%1,%0
+   stdy\t%1,%0
+   vlr\t%v0,%v1
+   vlvgg\t%v0,%1,0
+   vlgvg\t%0,%v1,0
+   vleg\t%v0,%1,0
+   vsteg\t%v1,%0,0
+   lzdr\t%0
+   vzero\t%v0
+   vone\t%v0
+   vgbm\t%v0,%t1
+   vgm\t%v0,%s1,%e1
+   mvghi\t%0,0
+   mvghi\t%0,-1
+   lghi\t%0,0
+   lghi\t%0,-1
+   ldgr\t%0,%1
+   lgdr\t%0,%1
+   lgrl\t%0,%1
+   lgr\t%0,%1
+   lg\t%0,%1
+   stg\t%1,%0
+   stgrl\t%1,%0"
+  [(set_attr "op_type" "RRE,RX,RXY,RX,RXY,VRR,VRS,VRS,VRX,VRX,RRE,VRI,VRI,VRI,VRI,
+                        SIL,SIL,RI,RI,RRE,RRE,RIL,RR,RXY,RXY,RIL")])
+
+
+; vec_load_lanes?
+
+; vec_store_lanes?
+
+; FIXME: Also support vector mode operands for operand 1
+; FIXME: A memory target operand would be useful; otherwise we end
+; up with vl, vlvgg, vst.  Shouldn't the middle-end be able to handle
+; that itself?
+(define_insn "*vec_set<mode>"
+  [(set (match_operand:V                    0 "register_operand"             "=v, v,v")
+	(unspec:V [(match_operand:<non_vec> 1 "general_operand"               "d,QR,K")
+		   (match_operand:DI        2 "shift_count_or_setmem_operand" "Y, I,I")
+		   (match_operand:V         3 "register_operand"              "0, 0,0")]
+		  UNSPEC_VEC_SET))]
+  "TARGET_VX"
+  "@
+   vlvg<bhfgq>\t%v0,%1,%Y2
+   vle<bhfgq>\t%v0,%1,%2
+   vlei<bhfgq>\t%v0,%1,%2"
+  [(set_attr "op_type" "VRS,VRX,VRI")])
+
+; vec_set is supposed to *modify* an existing vector so operand 0 is
+; duplicated as input operand.
+(define_expand "vec_set<mode>"
+  [(set (match_operand:V                    0 "register_operand"              "")
+	(unspec:V [(match_operand:<non_vec> 1 "general_operand"               "")
+		   (match_operand:SI        2 "shift_count_or_setmem_operand" "")
+		   (match_dup 0)]
+		   UNSPEC_VEC_SET))]
+  "TARGET_VX")
+
+; FIXME: Also support vector mode operands for operand 0
+; FIXME: This should be (vec_select ..) or similar, but that only allows constant selectors
+; This is used via RTL standard name as well as for expanding the builtin
+(define_insn "vec_extract<mode>"
+  [(set (match_operand:<non_vec> 0 "nonimmediate_operand"                        "=d,QR")
+	(unspec:<non_vec> [(match_operand:V  1 "register_operand"                " v, v")
+			   (match_operand:SI 2 "shift_count_or_setmem_operand"   " Y, I")]
+			  UNSPEC_VEC_EXTRACT))]
+  "TARGET_VX"
+  "@
+   vlgv<bhfgq>\t%0,%v1,%Y2
+   vste<bhfgq>\t%v1,%0,%2"
+  [(set_attr "op_type" "VRS,VRX")])
+
+(define_expand "vec_init<V_HW:mode>"
+  [(match_operand:V_HW 0 "register_operand" "")
+   (match_operand:V_HW 1 "nonmemory_operand" "")]
+  "TARGET_VX"
+{
+  s390_expand_vec_init (operands[0], operands[1]);
+  DONE;
+})
+
+; Replicate from vector element
+(define_insn "*vec_splat<mode>"
+  [(set (match_operand:V_HW   0 "register_operand" "=v")
+	(vec_duplicate:V_HW
+	 (vec_select:<non_vec>
+	  (match_operand:V_HW 1 "register_operand"  "v")
+	  (parallel
+	   [(match_operand:QI 2 "immediate_operand" "C")]))))]
+  "TARGET_VX"
+  "vrep<bhfgq>\t%v0,%v1,%2"
+  [(set_attr "op_type" "VRI")])
+
+(define_insn "*vec_splats<mode>"
+  [(set (match_operand:V_HW                          0 "register_operand" "=v,v,v,v")
+	(vec_duplicate:V_HW (match_operand:<non_vec> 1 "general_operand"  "QR,I,v,d")))]
+  "TARGET_VX"
+  "@
+   vlrep<bhfgq>\t%v0,%1
+   vrepi<bhfgq>\t%v0,%1
+   vrep<bhfgq>\t%v0,%v1,0
+   #"
+  [(set_attr "op_type" "VRX,VRI,VRI,*")])
+
+; vec_splats is supposed to replicate op1 into all elements of op0.
+; This splitter first sets the rightmost element of op0 to op1 and
+; then does a vec_splat to replicate that element into all other
+; elements.
+(define_split
+  [(set (match_operand:V_HW                          0 "register_operand" "")
+	(vec_duplicate:V_HW (match_operand:<non_vec> 1 "register_operand" "")))]
+  "TARGET_VX && GENERAL_REG_P (operands[1])"
+  [(set (match_dup 0)
+	(unspec:V_HW [(match_dup 1) (match_dup 2) (match_dup 0)] UNSPEC_VEC_SET))
+   (set (match_dup 0)
+	(vec_duplicate:V_HW
+	 (vec_select:<non_vec>
+	  (match_dup 0) (parallel [(match_dup 2)]))))]
+{
+  operands[2] = GEN_INT (GET_MODE_NUNITS (<MODE>mode) - 1);
+})
+
+(define_expand "vcond<V_HW:mode><V_HW2:mode>"
+  [(set (match_operand:V_HW 0 "register_operand" "")
+	(if_then_else:V_HW
+	 (match_operator 3 "comparison_operator"
+			 [(match_operand:V_HW2 4 "register_operand" "")
+			  (match_operand:V_HW2 5 "register_operand" "")])
+	 (match_operand:V_HW 1 "nonmemory_operand" "")
+	 (match_operand:V_HW 2 "nonmemory_operand" "")))]
+  "TARGET_VX && GET_MODE_NUNITS (<V_HW:MODE>mode) == GET_MODE_NUNITS (<V_HW2:MODE>mode)"
+{
+  s390_expand_vcond (operands[0], operands[1], operands[2],
+		     GET_CODE (operands[3]), operands[4], operands[5]);
+  DONE;
+})
+
+(define_expand "vcondu<V_HW:mode><V_HW2:mode>"
+  [(set (match_operand:V_HW 0 "register_operand" "")
+	(if_then_else:V_HW
+	 (match_operator 3 "comparison_operator"
+			 [(match_operand:V_HW2 4 "register_operand" "")
+			  (match_operand:V_HW2 5 "register_operand" "")])
+	 (match_operand:V_HW 1 "nonmemory_operand" "")
+	 (match_operand:V_HW 2 "nonmemory_operand" "")))]
+  "TARGET_VX && GET_MODE_NUNITS (<V_HW:MODE>mode) == GET_MODE_NUNITS (<V_HW2:MODE>mode)"
+{
+  s390_expand_vcond (operands[0], operands[1], operands[2],
+		     GET_CODE (operands[3]), operands[4], operands[5]);
+  DONE;
+})
+
+; We only have HW support for byte vectors.  The middle-end is
+; supposed to lower the mode if required.
+(define_insn "vec_permv16qi"
+  [(set (match_operand:V16QI 0 "register_operand"               "=v")
+	(unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
+		       (match_operand:V16QI 2 "register_operand" "v")
+		       (match_operand:V16QI 3 "register_operand" "v")]
+		      UNSPEC_VEC_PERM))]
+  "TARGET_VX"
+  "vperm\t%v0,%v1,%v2,%v3"
+  [(set_attr "op_type" "VRR")])
+
+; vec_perm_const for V2DI using vpdi?
+
+;;
+;; Vector integer arithmetic instructions
+;;
+
+; vab, vah, vaf, vag, vaq
+
+; We use nonimmediate_operand instead of register_operand since it is
+; better to have the reloads into VRs instead of splitting the
+; operation into two DImode ADDs.
+(define_insn "<ti*>add<mode>3"
+  [(set (match_operand:VIT           0 "nonimmediate_operand" "=v")
+	(plus:VIT (match_operand:VIT 1 "nonimmediate_operand"  "v")
+		  (match_operand:VIT 2 "nonimmediate_operand"  "v")))]
+  "TARGET_VX"
+  "va<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vsb, vsh, vsf, vsg, vsq
+(define_insn "<ti*>sub<mode>3"
+  [(set (match_operand:VIT            0 "nonimmediate_operand" "=v")
+	(minus:VIT (match_operand:VIT 1 "nonimmediate_operand"  "v")
+		   (match_operand:VIT 2 "nonimmediate_operand"  "v")))]
+  "TARGET_VX"
+  "vs<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vmlb, vmlhw, vmlf
+(define_insn "mul<mode>3"
+  [(set (match_operand:VI_QHS              0 "register_operand" "=v")
+	(mult:VI_QHS (match_operand:VI_QHS 1 "register_operand"  "v")
+		     (match_operand:VI_QHS 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vml<bhfgq><w>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vlcb, vlch, vlcf, vlcg
+(define_insn "neg<mode>2"
+  [(set (match_operand:VI         0 "register_operand" "=v")
+	(neg:VI (match_operand:VI 1 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vlc<bhfgq>\t%v0,%v1"
+  [(set_attr "op_type" "VRR")])
+
+; vlpb, vlph, vlpf, vlpg
+(define_insn "abs<mode>2"
+  [(set (match_operand:VI         0 "register_operand" "=v")
+	(abs:VI (match_operand:VI 1 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vlp<bhfgq>\t%v0,%v1"
+  [(set_attr "op_type" "VRR")])
+
+
+; Vector sum across
+
+; Sum across DImode parts of the 1st operand and add the rightmost
+; element of 2nd operand
+; vsumgh, vsumgf
+(define_insn "*vec_sum2<mode>"
+  [(set (match_operand:V2DI 0 "register_operand" "=v")
+	(unspec:V2DI [(match_operand:VI_HW_HS 1 "register_operand" "v")
+		      (match_operand:VI_HW_HS 2 "register_operand" "v")]
+		     UNSPEC_VEC_VSUMG))]
+  "TARGET_VX"
+  "vsumg<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vsumb, vsumh
+(define_insn "*vec_sum4<mode>"
+  [(set (match_operand:V4SI 0 "register_operand" "=v")
+	(unspec:V4SI [(match_operand:VI_HW_QH 1 "register_operand" "v")
+		      (match_operand:VI_HW_QH 2 "register_operand" "v")]
+		     UNSPEC_VEC_VSUM))]
+  "TARGET_VX"
+  "vsum<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+;;
+;; Vector bit instructions (int + fp)
+;;
+
+; Vector and
+
+(define_insn "and<mode>3"
+  [(set (match_operand:VT         0 "register_operand" "=v")
+	(and:VT (match_operand:VT 1 "register_operand"  "v")
+		(match_operand:VT 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vn\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+
+; Vector or
+
+(define_insn "ior<mode>3"
+  [(set (match_operand:VT         0 "register_operand" "=v")
+	(ior:VT (match_operand:VT 1 "register_operand"  "v")
+		(match_operand:VT 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vo\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+
+; Vector xor
+
+(define_insn "xor<mode>3"
+  [(set (match_operand:VT         0 "register_operand" "=v")
+	(xor:VT (match_operand:VT 1 "register_operand"  "v")
+		(match_operand:VT 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vx\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+
+; Bitwise inversion of a vector - used for vec_cmpne
+(define_insn "*not<mode>"
+  [(set (match_operand:VT         0 "register_operand" "=v")
+	(not:VT (match_operand:VT 1 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vnot\t%v0,%v1"
+  [(set_attr "op_type" "VRR")])
+
+; Vector population count
+
+(define_insn "popcountv16qi2"
+  [(set (match_operand:V16QI                0 "register_operand" "=v")
+	(unspec:V16QI [(match_operand:V16QI 1 "register_operand"  "v")]
+		      UNSPEC_POPCNT))]
+  "TARGET_VX"
+  "vpopct\t%v0,%v1,0"
+  [(set_attr "op_type" "VRR")])
+
+; vpopct only counts bits in byte elements.  Bigger element sizes need
+; to be emulated.  Word and doubleword elements can use the sum across
+; instructions.  For halfword sized elements we do a shift of a copy
+; of the result, add it to the result and extend it to halfword
+; element size (unpack).
+
+(define_expand "popcountv8hi2"
+  [(set (match_dup 2)
+	(unspec:V16QI [(subreg:V16QI (match_operand:V8HI 1 "register_operand" "v") 0)]
+		      UNSPEC_POPCNT))
+   ; Make a copy of the result
+   (set (match_dup 3) (match_dup 2))
+   ; Generate the shift count operand in a VR (8->byte 7)
+   (set (match_dup 4) (match_dup 5))
+   (set (match_dup 4) (unspec:V16QI [(const_int 8)
+				     (const_int 7)
+				     (match_dup 4)] UNSPEC_VEC_SET))
+   ; Vector shift right logical by one byte
+   (set (match_dup 3)
+	(unspec:V16QI [(match_dup 3) (match_dup 4)] UNSPEC_VEC_SRLB))
+   ; Add the shifted and the original result
+   (set (match_dup 2)
+	(plus:V16QI (match_dup 2) (match_dup 3)))
+   ; Generate mask for the odd numbered byte elements
+   (set (match_dup 3)
+	(const_vector:V16QI [(const_int 0) (const_int 255)
+			     (const_int 0) (const_int 255)
+			     (const_int 0) (const_int 255)
+			     (const_int 0) (const_int 255)
+			     (const_int 0) (const_int 255)
+			     (const_int 0) (const_int 255)
+			     (const_int 0) (const_int 255)
+			     (const_int 0) (const_int 255)]))
+   ; Zero out the even indexed bytes
+   (set (match_operand:V8HI 0 "register_operand" "=v")
+	(and:V8HI (subreg:V8HI (match_dup 2) 0)
+		  (subreg:V8HI (match_dup 3) 0)))
+]
+  "TARGET_VX"
+{
+  operands[2] = gen_reg_rtx (V16QImode);
+  operands[3] = gen_reg_rtx (V16QImode);
+  operands[4] = gen_reg_rtx (V16QImode);
+  operands[5] = CONST0_RTX (V16QImode);
+})
+
+(define_expand "popcountv4si2"
+  [(set (match_dup 2)
+	(unspec:V16QI [(subreg:V16QI (match_operand:V4SI 1 "register_operand" "v") 0)]
+		      UNSPEC_POPCNT))
+   (set (match_operand:V4SI 0 "register_operand" "=v")
+	(unspec:V4SI [(match_dup 2) (match_dup 3)]
+		     UNSPEC_VEC_VSUM))]
+  "TARGET_VX"
+{
+  operands[2] = gen_reg_rtx (V16QImode);
+  operands[3] = force_reg (V16QImode, CONST0_RTX (V16QImode));
+})
+
+(define_expand "popcountv2di2"
+  [(set (match_dup 2)
+	(unspec:V16QI [(subreg:V16QI (match_operand:V2DI 1 "register_operand" "v") 0)]
+		      UNSPEC_POPCNT))
+   (set (match_dup 3)
+	(unspec:V4SI [(match_dup 2) (match_dup 4)]
+		     UNSPEC_VEC_VSUM))
+   (set (match_operand:V2DI 0 "register_operand" "=v")
+	(unspec:V2DI [(match_dup 3) (match_dup 5)]
+		     UNSPEC_VEC_VSUMG))]
+  "TARGET_VX"
+{
+  operands[2] = gen_reg_rtx (V16QImode);
+  operands[3] = gen_reg_rtx (V4SImode);
+  operands[4] = force_reg (V16QImode, CONST0_RTX (V16QImode));
+  operands[5] = force_reg (V4SImode, CONST0_RTX (V4SImode));
+})
+
+; Count leading zeros
+(define_insn "clz<mode>2"
+  [(set (match_operand:V        0 "register_operand" "=v")
+	(clz:V (match_operand:V 1 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vclz<bhfgq>\t%v0,%v1"
+  [(set_attr "op_type" "VRR")])
+
+; Count trailing zeros
+(define_insn "ctz<mode>2"
+  [(set (match_operand:V        0 "register_operand" "=v")
+	(ctz:V (match_operand:V 1 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vctz<bhfgq>\t%v0,%v1"
+  [(set_attr "op_type" "VRR")])
+
+
+; Vector rotate instructions
+
+; Each vector element rotated by a scalar
+; verllb, verllh, verllf, verllg
+(define_insn "rotl<mode>3"
+  [(set (match_operand:VI            0 "register_operand"             "=v")
+	(rotate:VI (match_operand:VI 1 "register_operand"              "v")
+		   (match_operand:SI 2 "shift_count_or_setmem_operand" "Y")))]
+  "TARGET_VX"
+  "verll<bhfgq>\t%v0,%v1,%Y2"
+  [(set_attr "op_type" "VRS")])
+
+; Each vector element rotated by the corresponding vector element
+; verllvb, verllvh, verllvf, verllvg
+(define_insn "vrotl<mode>3"
+  [(set (match_operand:VI            0 "register_operand" "=v")
+	(rotate:VI (match_operand:VI 1 "register_operand"  "v")
+		   (match_operand:VI 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "verllv<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+
+; Shift each element by scalar value
+
+; veslb, veslh, veslf, veslg
+(define_insn "ashl<mode>3"
+  [(set (match_operand:VI            0 "register_operand"             "=v")
+	(ashift:VI (match_operand:VI 1 "register_operand"              "v")
+		   (match_operand:SI 2 "shift_count_or_setmem_operand" "Y")))]
+  "TARGET_VX"
+  "vesl<bhfgq>\t%v0,%v1,%Y2"
+  [(set_attr "op_type" "VRS")])
+
+; vesrab, vesrah, vesraf, vesrag
+(define_insn "ashr<mode>3"
+  [(set (match_operand:VI              0 "register_operand"             "=v")
+	(ashiftrt:VI (match_operand:VI 1 "register_operand"              "v")
+		     (match_operand:SI 2 "shift_count_or_setmem_operand" "Y")))]
+  "TARGET_VX"
+  "vesra<bhfgq>\t%v0,%v1,%Y2"
+  [(set_attr "op_type" "VRS")])
+
+; vesrlb, vesrlh, vesrlf, vesrlg
+(define_insn "lshr<mode>3"
+  [(set (match_operand:VI              0 "register_operand"             "=v")
+	(lshiftrt:VI (match_operand:VI 1 "register_operand"              "v")
+		     (match_operand:SI 2 "shift_count_or_setmem_operand" "Y")))]
+  "TARGET_VX"
+  "vesrl<bhfgq>\t%v0,%v1,%Y2"
+  [(set_attr "op_type" "VRS")])
+
+
+; Shift each element by corresponding vector element
+
+; veslvb, veslvh, veslvf, veslvg
+(define_insn "vashl<mode>3"
+  [(set (match_operand:VI            0 "register_operand" "=v")
+	(ashift:VI (match_operand:VI 1 "register_operand"  "v")
+		   (match_operand:VI 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "veslv<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vesravb, vesravh, vesravf, vesravg
+(define_insn "vashr<mode>3"
+  [(set (match_operand:VI              0 "register_operand" "=v")
+	(ashiftrt:VI (match_operand:VI 1 "register_operand"  "v")
+		     (match_operand:VI 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vesrav<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vesrlvb, vesrlvh, vesrlvf, vesrlvg
+(define_insn "vlshr<mode>3"
+  [(set (match_operand:VI              0 "register_operand" "=v")
+	(lshiftrt:VI (match_operand:VI 1 "register_operand"  "v")
+		     (match_operand:VI 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vesrlv<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; Vector shift right logical by byte
+
+; Pattern used by e.g. popcount
+(define_insn "*vec_srb<mode>"
+  [(set (match_operand:V_HW 0 "register_operand"                    "=v")
+	(unspec:V_HW [(match_operand:V_HW 1 "register_operand"       "v")
+		      (match_operand:<tointvec> 2 "register_operand" "v")]
+		     UNSPEC_VEC_SRLB))]
+  "TARGET_VX"
+  "vsrlb\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+
+; vmnb, vmnh, vmnf, vmng
+(define_insn "smin<mode>3"
+  [(set (match_operand:VI          0 "register_operand" "=v")
+	(smin:VI (match_operand:VI 1 "register_operand"  "v")
+		 (match_operand:VI 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vmn<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vmxb, vmxh, vmxf, vmxg
+(define_insn "smax<mode>3"
+  [(set (match_operand:VI          0 "register_operand" "=v")
+	(smax:VI (match_operand:VI 1 "register_operand"  "v")
+		 (match_operand:VI 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vmx<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vmnlb, vmnlh, vmnlf, vmnlg
+(define_insn "umin<mode>3"
+  [(set (match_operand:VI          0 "register_operand" "=v")
+	(umin:VI (match_operand:VI 1 "register_operand"  "v")
+		 (match_operand:VI 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vmnl<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vmxlb, vmxlh, vmxlf, vmxlg
+(define_insn "umax<mode>3"
+  [(set (match_operand:VI          0 "register_operand" "=v")
+	(umax:VI (match_operand:VI 1 "register_operand"  "v")
+		 (match_operand:VI 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vmxl<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vmeb, vmeh, vmef
+(define_insn "vec_widen_smult_even_<mode>"
+  [(set (match_operand:<vec_double>                    0 "register_operand" "=v")
+	(unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand"  "v")
+			      (match_operand:VI_QHS 2 "register_operand"  "v")]
+			     UNSPEC_VEC_SMULT_EVEN))]
+  "TARGET_VX"
+  "vme<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vmleb, vmleh, vmlef
+(define_insn "vec_widen_umult_even_<mode>"
+  [(set (match_operand:<vec_double>                 0 "register_operand" "=v")
+	(unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand"  "v")
+			      (match_operand:VI_QHS 2 "register_operand"  "v")]
+			     UNSPEC_VEC_UMULT_EVEN))]
+  "TARGET_VX"
+  "vmle<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vmob, vmoh, vmof
+(define_insn "vec_widen_smult_odd_<mode>"
+  [(set (match_operand:<vec_double>                 0 "register_operand" "=v")
+	(unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand"  "v")
+			      (match_operand:VI_QHS 2 "register_operand"  "v")]
+			     UNSPEC_VEC_SMULT_ODD))]
+  "TARGET_VX"
+  "vmo<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vmlob, vmloh, vmlof
+(define_insn "vec_widen_umult_odd_<mode>"
+  [(set (match_operand:<vec_double>                 0 "register_operand" "=v")
+	(unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand"  "v")
+			      (match_operand:VI_QHS 2 "register_operand"  "v")]
+			     UNSPEC_VEC_UMULT_ODD))]
+  "TARGET_VX"
+  "vmlo<bhfgq>\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; vec_widen_umult_hi
+; vec_widen_umult_lo
+; vec_widen_smult_hi
+; vec_widen_smult_lo
+
+; vec_widen_ushiftl_hi
+; vec_widen_ushiftl_lo
+; vec_widen_sshiftl_hi
+; vec_widen_sshiftl_lo
+
+;;
+;; Vector floating point arithmetic instructions
+;;
+
+(define_insn "addv2df3"
+  [(set (match_operand:V2DF            0 "register_operand" "=v")
+	(plus:V2DF (match_operand:V2DF 1 "register_operand"  "v")
+		   (match_operand:V2DF 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vfadb\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+(define_insn "subv2df3"
+  [(set (match_operand:V2DF             0 "register_operand" "=v")
+	(minus:V2DF (match_operand:V2DF 1 "register_operand"  "v")
+		    (match_operand:V2DF 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vfsdb\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+(define_insn "mulv2df3"
+  [(set (match_operand:V2DF            0 "register_operand" "=v")
+	(mult:V2DF (match_operand:V2DF 1 "register_operand"  "v")
+		   (match_operand:V2DF 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vfmdb\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+(define_insn "divv2df3"
+  [(set (match_operand:V2DF           0 "register_operand" "=v")
+	(div:V2DF (match_operand:V2DF 1 "register_operand"  "v")
+		  (match_operand:V2DF 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vfddb\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+(define_insn "sqrtv2df2"
+  [(set (match_operand:V2DF            0 "register_operand" "=v")
+	(sqrt:V2DF (match_operand:V2DF 1 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vfsqdb\t%v0,%v1"
+  [(set_attr "op_type" "VRR")])
+
+(define_insn "fmav2df4"
+  [(set (match_operand:V2DF           0 "register_operand" "=v")
+	(fma:V2DF (match_operand:V2DF 1 "register_operand"  "v")
+		  (match_operand:V2DF 2 "register_operand"  "v")
+		  (match_operand:V2DF 3 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vfmadb\t%v0,%v1,%v2,%v3"
+  [(set_attr "op_type" "VRR")])
+
+(define_insn "fmsv2df4"
+  [(set (match_operand:V2DF                     0 "register_operand" "=v")
+	(fma:V2DF (match_operand:V2DF           1 "register_operand"  "v")
+		  (match_operand:V2DF           2 "register_operand"  "v")
+		  (neg:V2DF (match_operand:V2DF 3 "register_operand"  "v"))))]
+  "TARGET_VX"
+  "vfmsdb\t%v0,%v1,%v2,%v3"
+  [(set_attr "op_type" "VRR")])
+
+(define_insn "negv2df2"
+  [(set (match_operand:V2DF           0 "register_operand" "=v")
+	(neg:V2DF (match_operand:V2DF 1 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vflcdb\t%v0,%v1"
+  [(set_attr "op_type" "VRR")])
+
+(define_insn "absv2df2"
+  [(set (match_operand:V2DF           0 "register_operand" "=v")
+	(abs:V2DF (match_operand:V2DF 1 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vflpdb\t%v0,%v1"
+  [(set_attr "op_type" "VRR")])
+
+(define_insn "*negabsv2df2"
+  [(set (match_operand:V2DF                     0 "register_operand" "=v")
+	(neg:V2DF (abs:V2DF (match_operand:V2DF 1 "register_operand"  "v"))))]
+  "TARGET_VX"
+  "vflndb\t%v0,%v1"
+  [(set_attr "op_type" "VRR")])
+
+; Emulate with compare + select
+(define_insn_and_split "smaxv2df3"
+  [(set (match_operand:V2DF            0 "register_operand" "=v")
+	(smax:V2DF (match_operand:V2DF 1 "register_operand"  "v")
+		   (match_operand:V2DF 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "#"
+  ""
+  [(set (match_dup 3)
+	(gt:V2DI (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+	(if_then_else:V2DF
+	 (eq (match_dup 3) (match_dup 4))
+	 (match_dup 2)
+	 (match_dup 1)))]
+{
+  operands[3] = gen_reg_rtx (V2DImode);
+  operands[4] = CONST0_RTX (V2DImode);
+})
+
+; Emulate with compare + select
+(define_insn_and_split "sminv2df3"
+  [(set (match_operand:V2DF            0 "register_operand" "=v")
+	(smin:V2DF (match_operand:V2DF 1 "register_operand"  "v")
+		   (match_operand:V2DF 2 "register_operand"  "v")))]
+  "TARGET_VX"
+  "#"
+  ""
+  [(set (match_dup 3)
+	(gt:V2DI (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+	(if_then_else:V2DF
+	 (eq (match_dup 3) (match_dup 4))
+	 (match_dup 1)
+	 (match_dup 2)))]
+{
+  operands[3] = gen_reg_rtx (V2DImode);
+  operands[4] = CONST0_RTX (V2DImode);
+})
+
+
+;;
+;; Integer compares
+;;
+
+(define_insn "*vec_cmp<VICMP_HW_OP:code><VI:mode>_nocc"
+  [(set (match_operand:VI                 2 "register_operand" "=v")
+	(VICMP_HW_OP:VI (match_operand:VI 0 "register_operand"  "v")
+			(match_operand:VI 1 "register_operand"  "v")))]
+  "TARGET_VX"
+  "vc<VICMP_HW_OP:insn_cmp_op><VI:bhfgq>\t%v2,%v0,%v1"
+  [(set_attr "op_type" "VRR")])
+
+
+;;
+;; Floating point compares
+;;
+
+; EQ, GT, GE
+(define_insn "*vec_cmp<VFCMP_HW_OP:code>v2df_nocc"
+  [(set (match_operand:V2DI                   0 "register_operand" "=v")
+	(VFCMP_HW_OP:V2DI (match_operand:V2DF 1 "register_operand"  "v")
+			  (match_operand:V2DF 2 "register_operand"  "v")))]
+   "TARGET_VX"
+   "vfc<VFCMP_HW_OP:asm_fcmp_op>db\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
+
+; Expanders for not directly supported comparisons
+
+; UNEQ a u== b -> !(a > b | b > a)
+(define_expand "vec_cmpuneqv2df"
+  [(set (match_operand:V2DI          0 "register_operand" "=v")
+	(gt:V2DI (match_operand:V2DF 1 "register_operand"  "v")
+		 (match_operand:V2DF 2 "register_operand"  "v")))
+   (set (match_dup 3)
+	(gt:V2DI (match_dup 2) (match_dup 1)))
+   (set (match_dup 0) (ior:V2DI (match_dup 0) (match_dup 3)))
+   (set (match_dup 0) (not:V2DI (match_dup 0)))]
+  "TARGET_VX"
+{
+  operands[3] = gen_reg_rtx (V2DImode);
+})
+
+; LTGT a <> b -> a > b | b > a
+(define_expand "vec_cmpltgtv2df"
+  [(set (match_operand:V2DI          0 "register_operand" "=v")
+	(gt:V2DI (match_operand:V2DF 1 "register_operand"  "v")
+		 (match_operand:V2DF 2 "register_operand"  "v")))
+   (set (match_dup 3) (gt:V2DI (match_dup 2) (match_dup 1)))
+   (set (match_dup 0) (ior:V2DI (match_dup 0) (match_dup 3)))]
+  "TARGET_VX"
+{
+  operands[3] = gen_reg_rtx (V2DImode);
+})
+
+; ORDERED (a, b): a >= b | b > a
+(define_expand "vec_orderedv2df"
+  [(set (match_operand:V2DI          0 "register_operand" "=v")
+	(ge:V2DI (match_operand:V2DF 1 "register_operand"  "v")
+		 (match_operand:V2DF 2 "register_operand"  "v")))
+   (set (match_dup 3) (gt:V2DI (match_dup 2) (match_dup 1)))
+   (set (match_dup 0) (ior:V2DI (match_dup 0) (match_dup 3)))]
+  "TARGET_VX"
+{
+  operands[3] = gen_reg_rtx (V2DImode);
+})
+
+; UNORDERED (a, b): !ORDERED (a, b)
+(define_expand "vec_unorderedv2df"
+  [(set (match_operand:V2DI          0 "register_operand" "=v")
+	(ge:V2DI (match_operand:V2DF 1 "register_operand"  "v")
+		 (match_operand:V2DF 2 "register_operand"  "v")))
+   (set (match_dup 3) (gt:V2DI (match_dup 2) (match_dup 1)))
+   (set (match_dup 0) (ior:V2DI (match_dup 0) (match_dup 3)))
+   (set (match_dup 0) (not:V2DI (match_dup 0)))]
+  "TARGET_VX"
+{
+  operands[3] = gen_reg_rtx (V2DImode);
+})
+
+(define_insn "*vec_load_pairv2di"
+  [(set (match_operand:V2DI                0 "register_operand" "=v")
+	(vec_concat:V2DI (match_operand:DI 1 "register_operand"  "d")
+			 (match_operand:DI 2 "register_operand"  "d")))]
+  "TARGET_VX"
+  "vlvgp\t%v0,%1,%2"
+  [(set_attr "op_type" "VRR")])
+
+(define_insn "vllv16qi"
+  [(set (match_operand:V16QI              0 "register_operand" "=v")
+	(unspec:V16QI [(match_operand:SI  1 "register_operand"  "d")
+		       (match_operand:BLK 2 "memory_operand"    "Q")]
+		      UNSPEC_VEC_LOAD_LEN))]
+  "TARGET_VX"
+  "vll\t%v0,%1,%2"
+  [(set_attr "op_type" "VRS")])
+
+; vfenebs, vfenehs, vfenefs
+; vfenezbs, vfenezhs, vfenezfs
+(define_insn "vec_vfenes<mode>"
+  [(set (match_operand:VI_HW_QHS 0 "register_operand" "=v")
+	(unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "v")
+			   (match_operand:VI_HW_QHS 2 "register_operand" "v")
+			   (match_operand:QI 3 "immediate_operand" "C")]
+			  UNSPEC_VEC_VFENE))
+   (set (reg:CCRAW CC_REGNUM)
+	(unspec:CCRAW [(match_dup 1)
+		       (match_dup 2)
+		       (match_dup 3)]
+		      UNSPEC_VEC_VFENECC))]
+  "TARGET_VX"
+{
+  unsigned HOST_WIDE_INT flags = INTVAL (operands[3]);
+
+  gcc_assert (!(flags & ~(VSTRING_FLAG_ZS | VSTRING_FLAG_CS)));
+  flags &= ~VSTRING_FLAG_CS;
+
+  if (flags == VSTRING_FLAG_ZS)
+    return "vfenez<bhfgq>s\t%v0,%v1,%v2";
+  return "vfene<bhfgq>s\t%v0,%v1,%v2";
+}
+  [(set_attr "op_type" "VRR")])
+
+
+; Vector select
+
+; The following splitters simplify vec_sel for constant 0 or -1
+; selection sources.  This is required to generate efficient code for
+; vcond.
+
+; a = b == c;
+(define_split
+  [(set (match_operand:V 0 "register_operand" "")
+	(if_then_else:V
+	 (eq (match_operand:<tointvec> 3 "register_operand" "")
+	     (match_operand:V 4 "const0_operand" ""))
+	 (match_operand:V 1 "const0_operand" "")
+	 (match_operand:V 2 "constm1_operand" "")))]
+  "TARGET_VX"
+  [(set (match_dup 0) (match_dup 3))]
+{
+  PUT_MODE (operands[3], <V:MODE>mode);
+})
+
+; a = ~(b == c)
+(define_split
+  [(set (match_operand:V 0 "register_operand" "")
+	(if_then_else:V
+	 (eq (match_operand:<tointvec> 3 "register_operand" "")
+	     (match_operand:V 4 "const0_operand" ""))
+	 (match_operand:V 1 "constm1_operand" "")
+	 (match_operand:V 2 "const0_operand" "")))]
+  "TARGET_VX"
+  [(set (match_dup 0) (not:V (match_dup 3)))]
+{
+  PUT_MODE (operands[3], <V:MODE>mode);
+})
+
+; a = b != c
+(define_split
+  [(set (match_operand:V 0 "register_operand" "")
+	(if_then_else:V
+	 (ne (match_operand:<tointvec> 3 "register_operand" "")
+	     (match_operand:V 4 "const0_operand" ""))
+	 (match_operand:V 1 "constm1_operand" "")
+	 (match_operand:V 2 "const0_operand" "")))]
+  "TARGET_VX"
+  [(set (match_dup 0) (match_dup 3))]
+{
+  PUT_MODE (operands[3], <V:MODE>mode);
+})
+
+; a = ~(b != c)
+(define_split
+  [(set (match_operand:V 0 "register_operand" "")
+	(if_then_else:V
+	 (ne (match_operand:<tointvec> 3 "register_operand" "")
+	     (match_operand:V 4 "const0_operand" ""))
+	 (match_operand:V 1 "const0_operand" "")
+	 (match_operand:V 2 "constm1_operand" "")))]
+  "TARGET_VX"
+  [(set (match_dup 0) (not:V (match_dup 3)))]
+{
+  PUT_MODE (operands[3], <V:MODE>mode);
+})
+
+; op0 = op3 == 0 ? op1 : op2
+(define_insn "*vec_sel0<mode>"
+  [(set (match_operand:V 0 "register_operand" "=v")
+	(if_then_else:V
+	 (eq (match_operand:<tointvec> 3 "register_operand" "v")
+	     (match_operand:<tointvec> 4 "const0_operand" ""))
+	 (match_operand:V 1 "register_operand" "v")
+	 (match_operand:V 2 "register_operand" "v")))]
+  "TARGET_VX"
+  "vsel\t%v0,%2,%1,%3"
+  [(set_attr "op_type" "VRR")])
+
+; op0 = ~op3 == 0 ? op1 : op2
+(define_insn "*vec_sel0<mode>"
+  [(set (match_operand:V 0 "register_operand" "=v")
+	(if_then_else:V
+	 (eq (not:<tointvec> (match_operand:<tointvec> 3 "register_operand" "v"))
+	     (match_operand:<tointvec> 4 "const0_operand" ""))
+	 (match_operand:V 1 "register_operand" "v")
+	 (match_operand:V 2 "register_operand" "v")))]
+  "TARGET_VX"
+  "vsel\t%v0,%1,%2,%3"
+  [(set_attr "op_type" "VRR")])
+
+; op0 = op3 == -1 ? op1 : op2
+(define_insn "*vec_sel1<mode>"
+  [(set (match_operand:V 0 "register_operand" "=v")
+	(if_then_else:V
+	 (eq (match_operand:<tointvec> 3 "register_operand" "v")
+	     (match_operand:<tointvec> 4 "constm1_operand" ""))
+	 (match_operand:V 1 "register_operand" "v")
+	 (match_operand:V 2 "register_operand" "v")))]
+  "TARGET_VX"
+  "vsel\t%v0,%1,%2,%3"
+  [(set_attr "op_type" "VRR")])
+
+; op0 = ~op3 == -1 ? op1 : op2
+(define_insn "*vec_sel1<mode>"
+  [(set (match_operand:V 0 "register_operand" "=v")
+	(if_then_else:V
+	 (eq (not:<tointvec> (match_operand:<tointvec> 3 "register_operand" "v"))
+	     (match_operand:<tointvec> 4 "constm1_operand" ""))
+	 (match_operand:V 1 "register_operand" "v")
+	 (match_operand:V 2 "register_operand" "v")))]
+  "TARGET_VX"
+  "vsel\t%v0,%2,%1,%3"
+  [(set_attr "op_type" "VRR")])
+
+
+
+; reduc_smin
+; reduc_smax
+; reduc_umin
+; reduc_umax
+
+; vec_shl vrep + vsl
+; vec_shr
+
+; vec_pack_trunc
+; vec_pack_ssat
+; vec_pack_usat
+; vec_pack_sfix_trunc
+; vec_pack_ufix_trunc
+; vec_unpacks_hi
+; vec_unpacks_low
+; vec_unpacku_hi
+; vec_unpacku_low
+; vec_unpacks_float_hi
+; vec_unpacks_float_lo
+; vec_unpacku_float_hi
+; vec_unpacku_float_lo
-- 
1.7.9.5

* [PATCH 13/13] S/390 Invalid vector binary ops
  2015-05-11 13:23 [PATCH 00/13] S/390 Implement support for IBM z13 Andreas Krebbel
                   ` (5 preceding siblings ...)
  2015-05-11 13:24 ` [PATCH 10/13] Testsuite These testcases require disabling hardware vector support on S/390 Andreas Krebbel
@ 2015-05-11 13:24 ` Andreas Krebbel
  2015-05-11 13:24 ` [PATCH 07/13] S/390 Add vector scalar instruction support Andreas Krebbel
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-11 13:24 UTC (permalink / raw)
  To: gcc-patches

This is a first attempt at implementing at least some of the
requirements for the vector bool type documented for IBM XLC.

With this patch, error messages are issued for invalid uses of
vector bool types in binary operators.

Vector bool types are marked opaque in order to prevent the
front end from complaining about "vector bool long" vs "vector bool
long long" combinations on 64 bit.  The opaque flag basically
suppresses any type checking.  However, we still want vector bool to be
accepted only in contexts specified in the documentation (to be
published soon).  Implementing the invalid binary op hook does this
for binary operators at least.  But this is far from being complete :(

gcc/
	* config/s390/s390.c (s390_vector_bool_type_p): New function.
	(s390_invalid_binary_op): New function.
	(TARGET_INVALID_BINARY_OP): Define macro.
---
 gcc/config/s390/s390.c |   61 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index e1ae1ed..a64836e 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -13764,6 +13764,64 @@ s390_asm_file_end (void)
 #endif
 }
 
+/* Return true if TYPE is a vector bool type.  */
+static inline bool
+s390_vector_bool_type_p (const_tree type)
+{
+  return TYPE_VECTOR_OPAQUE (type);
+}
+
+/* Return the diagnostic message string if the binary operation OP is
+   not permitted on TYPE1 and TYPE2, NULL otherwise.  */
+static const char*
+s390_invalid_binary_op (int op, const_tree type1, const_tree type2)
+{
+  bool bool1_p, bool2_p;
+  bool plusminus_p;
+  bool muldiv_p;
+  bool compare_p;
+  machine_mode mode1, mode2;
+
+  if (!TARGET_ZVECTOR)
+    return NULL;
+
+  if (!VECTOR_TYPE_P (type1) || !VECTOR_TYPE_P (type2))
+    return NULL;
+
+  bool1_p = s390_vector_bool_type_p (type1);
+  bool2_p = s390_vector_bool_type_p (type2);
+
+  /* Mixing signed and unsigned types is forbidden for all
+     operators.  */
+  if (!bool1_p && !bool2_p
+      && TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
+    return N_("types differ in signedness");
+
+  plusminus_p = (op == PLUS_EXPR || op == MINUS_EXPR);
+  muldiv_p = (op == MULT_EXPR || op == RDIV_EXPR || op == TRUNC_DIV_EXPR
+	      || op == CEIL_DIV_EXPR || op == FLOOR_DIV_EXPR
+	      || op == ROUND_DIV_EXPR);
+  compare_p = (op == LT_EXPR || op == LE_EXPR || op == GT_EXPR || op == GE_EXPR
+	       || op == EQ_EXPR || op == NE_EXPR);
+
+  if (bool1_p && bool2_p && (plusminus_p || muldiv_p))
+    return N_("binary operator does not support two vector bool operands");
+
+  if (bool1_p != bool2_p && (muldiv_p || compare_p))
+    return N_("binary operator does not support vector bool operand");
+
+  mode1 = TYPE_MODE (type1);
+  mode2 = TYPE_MODE (type2);
+
+  if (bool1_p != bool2_p && plusminus_p
+      && (GET_MODE_CLASS (mode1) == MODE_VECTOR_FLOAT
+	  || GET_MODE_CLASS (mode2) == MODE_VECTOR_FLOAT))
+    return N_("binary operator does not support mixing vector "
+	      "bool with floating point vector operands");
+
+  return NULL;
+}
+
 /* Initialize GCC target structure.  */
 
 #undef  TARGET_ASM_ALIGNED_HI_OP
@@ -13981,6 +14039,9 @@ s390_asm_file_end (void)
 #undef TARGET_ASM_FILE_END
 #define TARGET_ASM_FILE_END s390_asm_file_end
 
+#undef TARGET_INVALID_BINARY_OP
+#define TARGET_INVALID_BINARY_OP s390_invalid_binary_op
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-s390.h"
-- 
1.7.9.5

* [PATCH 02/13] optabs: Fix vec_perm -> V16QI middle end lowering.
  2015-05-11 13:23 [PATCH 00/13] S/390 Implement support for IBM z13 Andreas Krebbel
                   ` (2 preceding siblings ...)
  2015-05-11 13:24 ` [PATCH 06/13] Vector base support - testcases Andreas Krebbel
@ 2015-05-11 13:24 ` Andreas Krebbel
  2015-05-11 17:20   ` Jeff Law
  2015-05-18 17:36   ` Richard Henderson
  2015-05-11 13:24 ` [PATCH 04/13] S/390 Add -march/-mtune=z13 option Andreas Krebbel
                   ` (8 subsequent siblings)
  12 siblings, 2 replies; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-11 13:24 UTC (permalink / raw)
  To: gcc-patches

The current implementation reuses the location of the selection
pattern as the target when generating the scaled pattern.  This fails
if the pattern resides in a read-only location.  With this patch a
new temporary register is allocated for that purpose.

gcc/
	* optabs.c (expand_vec_perm): Allocate a temp reg for the new
	select pattern.
---
 gcc/optabs.c |   18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/gcc/optabs.c b/gcc/optabs.c
index 983c8d9..8926efa 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6784,14 +6784,18 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, rtx sel, rtx target)
     {
       /* Multiply each element by its byte size.  */
       machine_mode selmode = GET_MODE (sel);
+      /* We cannot re-use SEL as a temp operand since it might be in
+	 read-only storage.  */
+      rtx sel_reg = gen_reg_rtx (selmode);
+
       if (u == 2)
-	sel = expand_simple_binop (selmode, PLUS, sel, sel,
-				   sel, 0, OPTAB_DIRECT);
+	sel_reg = expand_simple_binop (selmode, PLUS, sel, sel,
+				       sel_reg, 0, OPTAB_DIRECT);
       else
-	sel = expand_simple_binop (selmode, ASHIFT, sel,
-				   GEN_INT (exact_log2 (u)),
-				   sel, 0, OPTAB_DIRECT);
-      gcc_assert (sel != NULL);
+	sel_reg = expand_simple_binop (selmode, ASHIFT, sel,
+				       GEN_INT (exact_log2 (u)),
+				       sel_reg, 0, OPTAB_DIRECT);
+      gcc_assert (sel_reg != NULL);
 
       /* Broadcast the low byte each element into each of its bytes.  */
       vec = rtvec_alloc (w);
@@ -6803,7 +6807,7 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, rtx sel, rtx target)
 	  RTVEC_ELT (vec, i) = GEN_INT (this_e);
 	}
       tmp = gen_rtx_CONST_VECTOR (qimode, vec);
-      sel = gen_lowpart (qimode, sel);
+      sel = gen_lowpart (qimode, sel_reg);
       sel = expand_vec_perm (qimode, sel, sel, tmp, NULL);
       gcc_assert (sel != NULL);
 
-- 
1.7.9.5

* [PATCH 06/13] Vector base support - testcases
  2015-05-11 13:23 [PATCH 00/13] S/390 Implement support for IBM z13 Andreas Krebbel
  2015-05-11 13:23 ` [PATCH 01/13] recog: Increased max number of alternatives Andreas Krebbel
  2015-05-11 13:24 ` [PATCH 11/13] Testsuite S/390 vector types are only 8 byte aligned Andreas Krebbel
@ 2015-05-11 13:24 ` Andreas Krebbel
  2015-05-11 13:24 ` [PATCH 02/13] optabs: Fix vec_perm -> V16QI middle end lowering Andreas Krebbel
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-11 13:24 UTC (permalink / raw)
  To: gcc-patches

gcc/testsuite/
	* gcc.target/s390/s390.exp
	(check_effective_target_vector): New check.
	* gcc.target/s390/vector/vec-abi-1.c: New test.
	* gcc.target/s390/vector/vec-abi-2.c: New test.
	* gcc.target/s390/vector/vec-abi-3.c: New test.
	* gcc.target/s390/vector/vec-abi-4.c: New test.
	* gcc.target/s390/vector/vec-abi-align-1.c: New test.
	* gcc.target/s390/vector/vec-abi-single-1.c: New test.
	* gcc.target/s390/vector/vec-abi-single-2.c: New test.
	* gcc.target/s390/vector/vec-abi-struct-1.c: New test.
	* gcc.target/s390/vector/vec-abi-vararg-1.c: New test.
	* gcc.target/s390/vector/vec-abi-vararg-2.c: New test.
	* gcc.target/s390/vector/vec-clobber-1.c: New test.
	* gcc.target/s390/vector/vec-cmp-1.c: New test.
	* gcc.target/s390/vector/vec-cmp-2.c: New test.
	* gcc.target/s390/vector/vec-dbl-math-compile-1.c: New test.
	* gcc.target/s390/vector/vec-genbytemask-1.c: New test.
	* gcc.target/s390/vector/vec-genbytemask-2.c: New test.
	* gcc.target/s390/vector/vec-genmask-1.c: New test.
	* gcc.target/s390/vector/vec-genmask-2.c: New test.
	* gcc.target/s390/vector/vec-init-1.c: New test.
	* gcc.target/s390/vector/vec-int-math-compile-1.c: New test.
	* gcc.target/s390/vector/vec-shift-1.c: New test.
	* gcc.target/s390/vector/vec-sub-1.c: New test.
---
 gcc/testsuite/gcc.target/s390/s390.exp             |   18 ++++
 gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c   |   17 +++
 gcc/testsuite/gcc.target/s390/vector/vec-abi-2.c   |   15 +++
 gcc/testsuite/gcc.target/s390/vector/vec-abi-3.c   |  101 ++++++++++++++++++
 gcc/testsuite/gcc.target/s390/vector/vec-abi-4.c   |   19 ++++
 .../gcc.target/s390/vector/vec-abi-align-1.c       |   48 +++++++++
 .../gcc.target/s390/vector/vec-abi-single-1.c      |   24 +++++
 .../gcc.target/s390/vector/vec-abi-single-2.c      |   12 +++
 .../gcc.target/s390/vector/vec-abi-struct-1.c      |   37 +++++++
 .../gcc.target/s390/vector/vec-abi-vararg-1.c      |   60 +++++++++++
 .../gcc.target/s390/vector/vec-abi-vararg-2.c      |   18 ++++
 .../gcc.target/s390/vector/vec-clobber-1.c         |   38 +++++++
 gcc/testsuite/gcc.target/s390/vector/vec-cmp-1.c   |   45 ++++++++
 gcc/testsuite/gcc.target/s390/vector/vec-cmp-2.c   |   38 +++++++
 .../s390/vector/vec-dbl-math-compile-1.c           |   48 +++++++++
 .../gcc.target/s390/vector/vec-genbytemask-1.c     |   70 +++++++++++++
 .../gcc.target/s390/vector/vec-genbytemask-2.c     |   46 +++++++++
 .../gcc.target/s390/vector/vec-genmask-1.c         |   70 +++++++++++++
 .../gcc.target/s390/vector/vec-genmask-2.c         |   46 +++++++++
 gcc/testsuite/gcc.target/s390/vector/vec-init-1.c  |   68 ++++++++++++
 .../s390/vector/vec-int-math-compile-1.c           |   40 ++++++++
 gcc/testsuite/gcc.target/s390/vector/vec-shift-1.c |  108 ++++++++++++++++++++
 gcc/testsuite/gcc.target/s390/vector/vec-sub-1.c   |   51 +++++++++
 23 files changed, 1037 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-3.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-4.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-align-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-single-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-single-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-struct-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-clobber-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-cmp-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-cmp-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-dbl-math-compile-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-genmask-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-genmask-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-init-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-int-math-compile-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-shift-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-sub-1.c

diff --git a/gcc/testsuite/gcc.target/s390/s390.exp b/gcc/testsuite/gcc.target/s390/s390.exp
index 431e2c0..eb1d73b 100644
--- a/gcc/testsuite/gcc.target/s390/s390.exp
+++ b/gcc/testsuite/gcc.target/s390/s390.exp
@@ -37,6 +37,21 @@ proc check_effective_target_htm { } {
     }] "-march=zEC12 -mzarch" ] } { return 0 } else { return 1 }
 }
 
+# Return 1 if vector (va - vector add) instructions are understood by
+# the assembler and can be executed.  This also covers checking for
+# the VX kernel feature.  A kernel without that feature does not
+# enable the vector facility and the following check will die with a
+# signal.
+proc check_effective_target_vector { } {
+    if { ![check_runtime s390_check_vector [subst {
+	int main (void)
+	{
+	    asm ("va %%v24, %%v26, %%v28, 3" : : : "v24", "v26", "v28");
+	    return 0;
+	}
+    }] "-march=z13 -mzarch" ] } { return 0 } else { return 1 }
+}
+
 # If a testcase doesn't have special options, use these.
 global DEFAULT_CFLAGS
 if ![info exists DEFAULT_CFLAGS] then {
@@ -50,5 +65,8 @@ dg-init
 dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] \
 	"" $DEFAULT_CFLAGS
 
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*vector*/*.\[cS\]]] \
+	"" $DEFAULT_CFLAGS
+
 # All done.
 dg-finish
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c
new file mode 100644
index 0000000..5484664
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c
@@ -0,0 +1,17 @@
+/* Check calling convention in the vector ABI.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* Make sure the last argument is fetched from the argument overflow area.  */
+/* { dg-final { scan-assembler "vl\t%v\[0-9\]*,160\\(%r15\\)" { target lp64 } } } */
+/* { dg-final { scan-assembler "vl\t%v\[0-9\]*,96\\(%r15\\)" { target ilp32 } } } */
+
+typedef double v2df __attribute__((vector_size(16)));
+
+v2df
+add (v2df a, v2df b, v2df c, v2df d,
+     v2df e, v2df f, v2df g, v2df h, v2df i)
+{
+  return a + b + c + d + e + f + g + h + i;
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-2.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-2.c
new file mode 100644
index 0000000..62663d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-2.c
@@ -0,0 +1,15 @@
+/* Check calling convention in the vector ABI.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* This needs to be v24 = v24 * v26 + v28 */
+/* { dg-final { scan-assembler "vfmadb\t%v24,%v24,%v26,%v28" } } */
+
+typedef double v2df __attribute__((vector_size(16)));
+
+v2df
+madd (v2df a, v2df b, v2df c)
+{
+  return a * b + c;
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-3.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-3.c
new file mode 100644
index 0000000..4be2360
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-3.c
@@ -0,0 +1,101 @@
+/* Check calling convention in the vector ABI regarding vector like structs.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* addA */
+/* { dg-final { scan-assembler-times "vfadb\t%v24,%v24,%v26" 1 } } */
+
+/* addB and addE */
+/* { dg-final { scan-assembler-times "vah\t%v24,%v\[0-9\]*,%v\[0-9\]*" 2 } } */
+
+/* addC */
+/* { dg-final { scan-assembler-times "vag\t%v24,%v\[0-9\]*,%v\[0-9\]*" 1 } } */
+
+/* addB and addC are expected to read the arguments via pointers in r2 and r3 */
+/* { dg-final { scan-assembler-times "vl\t%v\[0-9\]*,0\\(%r2\\)" 2 } } */
+/* { dg-final { scan-assembler-times "vl\t%v\[0-9\]*,0\\(%r3\\)" 2 } } */
+
+/* addD */
+/* { dg-final { scan-assembler-times "vaf\t%v24,%v24,%v26" 1 } } */
+
+/* addE */
+/* { dg-final { scan-assembler-times "vah\t%v24,%v24,%v26" 1 } } */
+
+/* addF */
+/* { dg-final { scan-assembler-times "vab\t%v24,%v\[0-9\]*,%v\[0-9\]*" 1 } } */
+/* { dg-final { scan-assembler-times "srlg\t%r\[0-9\]*,%r2,32" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "srlg\t%r\[0-9\]*,%r3,32" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "llgfr\t%.*,%r2" 1 { target { ! lp64 } } } } */
+/* { dg-final { scan-assembler-times "llgfr\t%.*,%r4" 1 { target { ! lp64 } } } } */
+
+
+typedef double v2df __attribute__((vector_size(16)));
+typedef long long v2di __attribute__((vector_size(16)));
+typedef int v4si __attribute__((vector_size(16)));
+typedef short v8hi __attribute__((vector_size(16)));
+
+typedef short v2hi __attribute__((vector_size(4)));
+typedef char v4qi __attribute__((vector_size(4)));
+
+/* Vector like structs are passed in VRs.  */
+struct A { v2df a; };
+
+v2df
+addA (struct A a, struct A b)
+{
+  return a.a + b.a;
+}
+
+/* Only single element vectors qualify as vector type parms.  This one
+   is passed as a struct. Since it is bigger than 8 bytes it is passed
+   on the stack with the reference being put into r2/r3.  */
+struct B { v8hi a; char b;};
+
+v8hi
+addB (struct B a, struct B b)
+{
+  return a.a + b.a;
+}
+
+/* The resulting struct is bigger than 16 bytes and therefore passed
+   on the stack with the references residing in r2/r3.  */
+struct C { v2di __attribute__((aligned(32))) a; };
+
+v2di
+addC (struct C a, struct C b)
+{
+  return a.a + b.a;
+}
+
+/* The attribute here does not have any effect. So this struct stays
+   vector like and hence is passed in a VR.  */
+struct D { v4si __attribute__((aligned(16))) a; };
+
+v4si
+addD (struct D a, struct D b)
+{
+  return a.a + b.a;
+}
+
+
+/* Smaller vectors are passed in vector registers. This also applies
+   for vector like structs.  */
+struct E { v2hi a; };
+
+v2hi
+addE (struct E a, struct E b)
+{
+  return a.a + b.a;
+}
+
+/* This struct is not passed in VRs because of padding.  But since it
+   fits in a GPR and has a power of two size, it is passed in
+   GPRs.  */
+struct F { v4qi __attribute__((aligned(8))) a; };
+
+v4qi
+addF (struct F a, struct F b)
+{
+  return a.a + b.a;
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-4.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-4.c
new file mode 100644
index 0000000..fea44f9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-4.c
@@ -0,0 +1,19 @@
+/* Check calling convention in the vector ABI.  Smaller vectors need to
+   be placed left-justified in the stack slot.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* { dg-final { scan-assembler-times "lde\t%.*,160\\\(%r15\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "lde\t%.*,168\\\(%r15\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "lde\t%.*,96\\\(%r15\\\)" 1 { target { ! lp64 } } } } */
+/* { dg-final { scan-assembler-times "lde\t%.*,100\\\(%r15\\\)" 1 { target { ! lp64 } } } } */
+
+typedef char __attribute__((vector_size(4))) v4qi;
+
+v4qi
+foo (v4qi a, v4qi b, v4qi c, v4qi d, v4qi e,
+     v4qi f, v4qi g, v4qi h, v4qi i, v4qi j)
+{
+  return (a + b + c + d + e + f + g + h + i + j);
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-align-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-align-1.c
new file mode 100644
index 0000000..10e5617
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-align-1.c
@@ -0,0 +1,48 @@
+/* Check alignment convention in the vector ABI.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+#include <stddef.h>
+
+/* Vector types get an 8 byte alignment.  */
+typedef double v2df __attribute__((vector_size(16)));
+typedef struct
+{
+  char a;
+  v2df b;
+} A;
+char c1[offsetof (A, b) == 8 ? 0 : -1];
+
+/* Smaller vectors allow for smaller alignments.  */
+typedef char v4qi __attribute__((vector_size(4)));
+typedef struct
+{
+  char a;
+  v4qi b;
+} B;
+char c2[offsetof (B, b) == 4 ? 0 : -1];
+
+
+typedef double v4df __attribute__((vector_size(32)));
+typedef struct
+{
+  char a;
+  v4df b;
+} C;
+char c3[offsetof (C, b) == 8 ? 0 : -1];
+
+/* However, we allow the programmer to choose a bigger alignment.  */
+typedef struct
+{
+  char a;
+  v2df b __attribute__((aligned(16)));
+} D;
+char c4[offsetof (D, b) == 16 ? 0 : -1];
+
+typedef struct
+{
+  char a;
+  v2df b;
+} __attribute__((packed)) E;
+char c5[offsetof (E, b) == 1 ? 0 : -1];
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-single-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-single-1.c
new file mode 100644
index 0000000..b6cb0fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-single-1.c
@@ -0,0 +1,24 @@
+/* Check calling convention in the vector ABI for single element vectors.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* { dg-final { scan-assembler-times "vlr\t%v24,%v26" 7 } } */
+
+typedef int  __attribute__((vector_size(16))) v4si;
+
+typedef char __attribute__((vector_size(1))) v1qi;
+typedef short int __attribute__((vector_size(2))) v1hi;
+typedef int __attribute__((vector_size(4))) v1si;
+typedef long long __attribute__((vector_size(8))) v1di;
+typedef float __attribute__((vector_size(4))) v1sf;
+typedef double __attribute__((vector_size(8))) v1df;
+typedef long double __attribute__((vector_size(16))) v1tf;
+
+v1qi foo1 (v4si a, v1qi b) { return b; }
+v1hi foo2 (v4si a, v1hi b) { return b; }
+v1si foo3 (v4si a, v1si b) { return b; }
+v1di foo4 (v4si a, v1di b) { return b; }
+v1sf foo5 (v4si a, v1sf b) { return b; }
+v1df foo6 (v4si a, v1df b) { return b; }
+v1tf foo7 (v4si a, v1tf b) { return b; }
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-single-2.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-single-2.c
new file mode 100644
index 0000000..4829f02
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-single-2.c
@@ -0,0 +1,12 @@
+/* Check calling convention in the vector ABI for single element vectors.  */
+
+/* { dg-do compile { target { lp64 } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* { dg-final { scan-assembler-times "vlr\t%v24,%v26" 1 } } */
+
+typedef int  __attribute__((vector_size(16))) v4si;
+
+typedef __int128_t __attribute__((vector_size(16))) v1ti;
+
+v1ti foo (v4si a, v1ti b) { return b; }
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-struct-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-struct-1.c
new file mode 100644
index 0000000..7324ffa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-struct-1.c
@@ -0,0 +1,37 @@
+/* Check calling convention in the vector ABI.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* c.i and c.j are passed by reference since a struct with two
+   elements is not a vector type argument.  */
+/* { dg-final { scan-assembler "ld\t%v\[0-9\]*,0\\(%r3\\)" } } */
+/* { dg-final { scan-assembler "ld\t%v\[0-9\]*,8\\(%r3\\)" } } */
+
+/* just_v2si is passed in a vector reg if it is an incoming arg.
+   However, as return value it is passed via hidden first pointer
+   argument.  */
+/* { dg-final { scan-assembler "std\t%v\[0-9\]*,0\\(%r2\\)" } } */
+
+/* { dg-final { scan-assembler "gnu_attribute 8, 2" } } */
+
+typedef int __attribute__ ((vector_size(8))) v2si;
+
+struct just_v2si
+{
+  v2si i;
+};
+
+struct two_v2si
+{
+  v2si i, j;
+};
+
+struct just_v2si
+add_structvecs (v2si a, struct just_v2si b, struct two_v2si c)
+{
+  struct just_v2si res;
+
+  res.i = a + b.i + c.i + c.j;
+  return res;
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-1.c
new file mode 100644
index 0000000..7927fa1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-1.c
@@ -0,0 +1,60 @@
+/* Check calling convention with variable argument lists in the vector
+   ABI.  */
+
+/* { dg-do run { target { s390*-*-* } } } */
+/* { dg-require-effective-target vector } */
+/* { dg-options "-O3 -mzarch -march=z13 --save-temps" } */
+
+/* Make sure arguments are fetched from the argument overflow area.  */
+/* { dg-final { scan-assembler "vl\t%v\[0-9\]*,352\\(%r15\\)" { target lp64 } } } */
+/* { dg-final { scan-assembler "ld\t%v\[0-9\]*,368\\(%r15\\)" { target lp64 } } } */
+/* { dg-final { scan-assembler "vl\t%v\[0-9\]*,376\\(%r15\\)" { target lp64 } } } */
+/* { dg-final { scan-assembler "ld\t%v\[0-9\]*,392\\(%r15\\)" { target lp64 } } } */
+
+/* { dg-final { scan-assembler "vl\t%v\[0-9\]*,208\\(%r15\\)" { target ilp32 } } } */
+/* { dg-final { scan-assembler "ld\t%v\[0-9\]*,224\\(%r15\\)" { target ilp32 } } } */
+/* { dg-final { scan-assembler "vl\t%v\[0-9\]*,232\\(%r15\\)" { target ilp32 } } } */
+/* { dg-final { scan-assembler "ld\t%v\[0-9\]*,248\\(%r15\\)" { target ilp32 } } } */
+
+/* { dg-final { cleanup-saved-temps } } */
+
+#include <stdarg.h>
+
+extern void abort (void);
+
+typedef long long v2di __attribute__((vector_size(16)));
+typedef int v2si __attribute__((vector_size(8)));
+
+v2di __attribute__((noinline))
+add (int a, ...)
+{
+  int i;
+  va_list va;
+  v2di di_result = { 0, 0 };
+  v2si si_result = (v2si){ 0, 0 };
+
+  va_start (va, a);
+
+  di_result += va_arg (va, v2di);
+  si_result += va_arg (va, v2si);
+  di_result += va_arg (va, v2di);
+  si_result += va_arg (va, v2si);
+
+  va_end (va);
+
+  di_result[0] += si_result[0];
+  di_result[1] += si_result[1];
+
+  return di_result;
+}
+
+int
+main ()
+{
+  v2di r = add (4, (v2di){ 11, 21 }, (v2si){ 12, 22 }, (v2di){ 13, 23 }, (v2si){ 14, 24 });
+
+  if (r[0] != 50 || r[1] != 90)
+    abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-2.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-2.c
new file mode 100644
index 0000000..8df4d58
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-2.c
@@ -0,0 +1,18 @@
+/* Check calling convention in the vector ABI.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13 -Wno-implicit-function-declaration" } */
+
+
+typedef long v2di __attribute__((vector_size(16)));
+extern v2di foo1 (int, v2di);
+extern v2di foo2 (int, int);
+extern v2di foo3 (int, ...);
+
+v2di bar1 (int a)  { return foo2 (1, a); }
+v2di bar2 (int a)  { return foo3 (1, a); }
+v2di bar3 (v2di a) { return foo1 (1, a); }
+v2di bar4 (v2di a) { return foo3 (1, a); }
+
+int bar5 (int a)  { return foo4 (1, a); }
+int bar6 (v2di a) { return foo4 (1, a); } /* { dg-error "Vector argument passed to unprototyped function" } */
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-clobber-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-clobber-1.c
new file mode 100644
index 0000000..413b6a0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-clobber-1.c
@@ -0,0 +1,38 @@
+/* { dg-do run { target { s390*-*-* } } } */
+/* { dg-require-effective-target vector } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* For FP zero checks we use the ltdbr instruction.  Since this is a
+   load and test it actually writes the FPR.  Whenever an FPR gets
+   written the rest of the overlapping VR is clobbered.  */
+typedef double __attribute__((vector_size(16))) v2df;
+
+v2df a = { 1.0, 2.0 };
+
+extern void abort (void);
+
+void __attribute__((noinline))
+foo (v2df a)
+{
+  v2df b = { 1.0, 3.0 };
+
+  b -= a;
+
+  /* Take away all the VRs not overlapping with FPRs.  */
+  asm volatile ("" : : :
+		"v16","v17","v18","v19",
+		"v20","v21","v22","v23",
+		"v24","v25","v26","v27",
+		"v28","v29","v30","v31");
+  if (b[0] != 0.0) /* ltdbr */
+    abort ();
+  if (b[1] != 1.0)
+    abort ();
+}
+
+int
+main ()
+{
+  foo (a);
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-cmp-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-cmp-1.c
new file mode 100644
index 0000000..f46910f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-cmp-1.c
@@ -0,0 +1,45 @@
+/* Check that the proper unsigned compare instructions are being generated.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* { dg-final { scan-assembler-times "vchlb" 1 } } */
+/* { dg-final { scan-assembler-times "vchlh" 1 } } */
+/* { dg-final { scan-assembler-times "vchlf" 1 } } */
+/* { dg-final { scan-assembler-times "vchlg" 1 } } */
+
+typedef __attribute__((vector_size(16))) signed char v16qi;
+typedef __attribute__((vector_size(16))) unsigned char uv16qi;
+
+typedef __attribute__((vector_size(16))) signed short v8hi;
+typedef __attribute__((vector_size(16))) unsigned short uv8hi;
+
+typedef __attribute__((vector_size(16))) signed int v4si;
+typedef __attribute__((vector_size(16))) unsigned int uv4si;
+
+typedef __attribute__((vector_size(16))) signed long long v2di;
+typedef __attribute__((vector_size(16))) unsigned long long uv2di;
+
+v16qi
+f (uv16qi a, uv16qi b)
+{
+  return a > b;
+}
+
+v8hi
+g (uv8hi a, uv8hi b)
+{
+  return a > b;
+}
+
+v4si
+h (uv4si a, uv4si b)
+{
+  return a > b;
+}
+
+v2di
+i (uv2di a, uv2di b)
+{
+  return a > b;
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-cmp-2.c b/gcc/testsuite/gcc.target/s390/vector/vec-cmp-2.c
new file mode 100644
index 0000000..999f72c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-cmp-2.c
@@ -0,0 +1,38 @@
+/* Check that the proper signed compare instructions are being generated.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* { dg-final { scan-assembler-times "vchb" 1 } } */
+/* { dg-final { scan-assembler-times "vchh" 1 } } */
+/* { dg-final { scan-assembler-times "vchf" 1 } } */
+/* { dg-final { scan-assembler-times "vchg" 1 } } */
+
+typedef __attribute__((vector_size(16))) signed char v16qi;
+typedef __attribute__((vector_size(16))) signed short v8hi;
+typedef __attribute__((vector_size(16))) signed int v4si;
+typedef __attribute__((vector_size(16))) signed long long v2di;
+
+v16qi
+f (v16qi a, v16qi b)
+{
+  return a > b;
+}
+
+v8hi
+g (v8hi a, v8hi b)
+{
+  return a > b;
+}
+
+v4si
+h (v4si a, v4si b)
+{
+  return a > b;
+}
+
+v2di
+i (v2di a, v2di b)
+{
+  return a > b;
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-dbl-math-compile-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-dbl-math-compile-1.c
new file mode 100644
index 0000000..f53fb11
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-dbl-math-compile-1.c
@@ -0,0 +1,48 @@
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13 --save-temps" } */
+
+typedef __attribute__((vector_size(16))) double v2df;
+
+v2df
+adddbl (v2df a, v2df b)
+{
+  return a + b;
+}
+/* { dg-final { scan-assembler-times "vfadb" 1 } } */
+
+v2df
+subdbl (v2df a, v2df b)
+{
+  return a - b;
+}
+/* { dg-final { scan-assembler-times "vfsdb" 1 } } */
+
+v2df
+muldbl (v2df a, v2df b)
+{
+  return a * b;
+}
+/* { dg-final { scan-assembler-times "vfmdb" 1 } } */
+
+v2df
+divdbl (v2df a, v2df b)
+{
+  return a / b;
+}
+/* { dg-final { scan-assembler-times "vfd" 1 } } */
+
+v2df
+fmadbl (v2df a, v2df b, v2df c)
+{
+  return a * b + c;
+}
+/* { dg-final { scan-assembler-times "vfma" 1 } } */
+
+v2df
+fmsdbl (v2df a, v2df b, v2df c)
+{
+  return a * b - c;
+}
+/* { dg-final { scan-assembler-times "vfms" 1 } } */
+
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-1.c
new file mode 100644
index 0000000..dfe19f1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-1.c
@@ -0,0 +1,70 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mzarch -march=z13 --save-temps" } */
+/* { dg-require-effective-target vector } */
+
+typedef unsigned char     uv16qi __attribute__((vector_size(16)));
+typedef unsigned short     uv8hi __attribute__((vector_size(16)));
+typedef unsigned int       uv4si __attribute__((vector_size(16)));
+typedef unsigned long long uv2di __attribute__((vector_size(16)));
+
+uv2di __attribute__((noinline))
+foo1 ()
+{
+  return (uv2di){ 0xff00ff00ff00ff00, 0x00ff00ff00ff00ff };
+}
+/* { dg-final { scan-assembler-times "vgbm\t%v24,43605" 1 } } */
+
+uv4si __attribute__((noinline))
+foo2 ()
+{
+  return (uv4si){ 0xff0000ff, 0x0000ffff, 0xffff0000, 0x00ffff00 };
+}
+/* { dg-final { scan-assembler-times "vgbm\t%v24,37830" 1 } } */
+
+uv8hi __attribute__((noinline))
+foo3a ()
+{
+  return (uv8hi){ 0xff00, 0xff00, 0xff00, 0xff00,
+      0xff00, 0xff00, 0xff00, 0xff00 };
+}
+/* { dg-final { scan-assembler-times "vgbm\t%v24,43690" 1 } } */
+
+uv8hi __attribute__((noinline))
+foo3b ()
+{
+  return (uv8hi){ 0x00ff, 0x00ff, 0x00ff, 0x00ff,
+      0x00ff, 0x00ff, 0x00ff, 0x00ff };
+}
+/* { dg-final { scan-assembler-times "vgbm\t%v24,21845" 1 } } */
+
+uv16qi __attribute__((noinline))
+foo4 ()
+{
+  return (uv16qi){ 0xff, 0xff, 0xff, 0xff,
+      0, 0, 0, 0,
+      0xff, 0, 0xff, 0,
+      0, 0xff, 0, 0xff };
+}
+/* { dg-final { scan-assembler-times "vgbm\t%v24,61605" 1 } } */
+
+int
+main ()
+{
+  if (foo1()[1] != 0x00ff00ff00ff00ffULL)
+    __builtin_abort ();
+
+  if (foo2()[1] != 0x0000ffff)
+    __builtin_abort ();
+
+  if (foo3a()[1] != 0xff00)
+    __builtin_abort ();
+
+  if (foo3b()[1] != 0x00ff)
+    __builtin_abort ();
+
+  if (foo4()[1] != 0xff)
+    __builtin_abort ();
+  return 0;
+}
+
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-2.c b/gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-2.c
new file mode 100644
index 0000000..83c64a7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-2.c
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+typedef unsigned char     uv16qi __attribute__((vector_size(16)));
+typedef unsigned short     uv8hi __attribute__((vector_size(16)));
+typedef unsigned int       uv4si __attribute__((vector_size(16)));
+typedef unsigned long long uv2di __attribute__((vector_size(16)));
+
+/* The elements differ.  */
+uv2di __attribute__((noinline))
+foo1 ()
+{
+  return (uv2di){ 0x001fffffffffff00, 0x0000ffffffffff00 };
+}
+
+/* Non-contiguous bitmasks */
+
+uv4si __attribute__((noinline))
+foo2 ()
+{
+  return (uv4si){ 0xff00100f, 0xff00100f, 0xff00100f, 0xff00100f };
+}
+
+uv8hi __attribute__((noinline))
+foo3a ()
+{
+  return (uv8hi){ 0xf700, 0xf700, 0xf700, 0xf700,
+      0xf700, 0xf700, 0xf700, 0xf700 };
+}
+
+uv8hi __attribute__((noinline))
+foo3b ()
+{
+  return (uv8hi){ 0x10ff, 0x10ff, 0x10ff, 0x10ff,
+      0x10ff, 0x10ff, 0x10ff, 0x10ff };
+}
+
+uv16qi __attribute__((noinline))
+foo4 ()
+{
+  return (uv16qi){ 0x82, 0x82, 0x82, 0x82,
+      0x82, 0x82, 0x82, 0x82,
+      0x82, 0x82, 0x82, 0x82,
+      0x82, 0x82, 0x82, 0x82 };
+}
+/* { dg-final { scan-assembler-not "vgbm" } } */
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-genmask-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-genmask-1.c
new file mode 100644
index 0000000..8149e22
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-genmask-1.c
@@ -0,0 +1,70 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mzarch -march=z13 --save-temps" } */
+/* { dg-require-effective-target vector } */
+
+typedef unsigned char     uv16qi __attribute__((vector_size(16)));
+typedef unsigned short     uv8hi __attribute__((vector_size(16)));
+typedef unsigned int       uv4si __attribute__((vector_size(16)));
+typedef unsigned long long uv2di __attribute__((vector_size(16)));
+
+uv2di __attribute__((noinline))
+foo1 ()
+{
+  return (uv2di){ 0x000fffffffffff00, 0x000fffffffffff00 };
+}
+/* { dg-final { scan-assembler-times "vgmg\t%v24,12,55" 1 } } */
+
+uv4si __attribute__((noinline))
+foo2 ()
+{
+  return (uv4si){ 0xff00000f, 0xff00000f, 0xff00000f, 0xff00000f };
+}
+/* { dg-final { scan-assembler-times "vgmf\t%v24,28,7" 1 } } */
+
+uv8hi __attribute__((noinline))
+foo3a ()
+{
+  return (uv8hi){ 0xfff0, 0xfff0, 0xfff0, 0xfff0,
+      0xfff0, 0xfff0, 0xfff0, 0xfff0 };
+}
+/* { dg-final { scan-assembler-times "vgmh\t%v24,0,11" 1 } } */
+
+uv8hi __attribute__((noinline))
+foo3b ()
+{
+  return (uv8hi){ 0x0fff, 0x0fff, 0x0fff, 0x0fff,
+      0x0fff, 0x0fff, 0x0fff, 0x0fff };
+}
+/* { dg-final { scan-assembler-times "vgmh\t%v24,4,15" 1 } } */
+
+uv16qi __attribute__((noinline))
+foo4 ()
+{
+  return (uv16qi){ 0x8, 0x8, 0x8, 0x8,
+      0x8, 0x8, 0x8, 0x8,
+      0x8, 0x8, 0x8, 0x8,
+      0x8, 0x8, 0x8, 0x8 };
+}
+/* { dg-final { scan-assembler-times "vgmb\t%v24,4,4" 1 } } */
+
+int
+main ()
+{
+  if (foo1()[1] != 0x000fffffffffff00ULL)
+    __builtin_abort ();
+
+  if (foo2()[1] != 0xff00000f)
+    __builtin_abort ();
+
+  if (foo3a()[1] != 0xfff0)
+    __builtin_abort ();
+
+  if (foo3b()[1] != 0x0fff)
+    __builtin_abort ();
+
+  if (foo4()[1] != 0x8)
+    __builtin_abort ();
+  return 0;
+}
+
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-genmask-2.c b/gcc/testsuite/gcc.target/s390/vector/vec-genmask-2.c
new file mode 100644
index 0000000..e3ae341
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-genmask-2.c
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+typedef unsigned char     uv16qi __attribute__((vector_size(16)));
+typedef unsigned short     uv8hi __attribute__((vector_size(16)));
+typedef unsigned int       uv4si __attribute__((vector_size(16)));
+typedef unsigned long long uv2di __attribute__((vector_size(16)));
+
+/* The elements differ.  */
+uv2di __attribute__((noinline))
+foo1 ()
+{
+  return (uv2di){ 0x000fffffffffff00, 0x0000ffffffffff00 };
+}
+
+/* Non-contiguous bitmasks */
+
+uv4si __attribute__((noinline))
+foo2 ()
+{
+  return (uv4si){ 0xff00100f, 0xff00100f, 0xff00100f, 0xff00100f };
+}
+
+uv8hi __attribute__((noinline))
+foo3a ()
+{
+  return (uv8hi){ 0xf700, 0xf700, 0xf700, 0xf700,
+      0xf700, 0xf700, 0xf700, 0xf700 };
+}
+
+uv8hi __attribute__((noinline))
+foo3b ()
+{
+  return (uv8hi){ 0x10ff, 0x10ff, 0x10ff, 0x10ff,
+      0x10ff, 0x10ff, 0x10ff, 0x10ff };
+}
+
+uv16qi __attribute__((noinline))
+foo4 ()
+{
+  return (uv16qi){ 0x82, 0x82, 0x82, 0x82,
+      0x82, 0x82, 0x82, 0x82,
+      0x82, 0x82, 0x82, 0x82,
+      0x82, 0x82, 0x82, 0x82 };
+}
+/* { dg-final { scan-assembler-not "vgm" } } */
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-init-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-init-1.c
new file mode 100644
index 0000000..4deb6b8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-init-1.c
@@ -0,0 +1,68 @@
+/* Check that the vec_init expander does its job.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+
+
+
+
+typedef __attribute__((vector_size(16))) signed int v4si;
+
+extern v4si G;
+
+v4si
+f (signed int a)
+{
+  return G == a;
+}
+/* { dg-final { scan-assembler-times "vrepf" 1 } } */
+
+v4si
+g (signed int *a)
+{
+  return G == *a;
+}
+/* { dg-final { scan-assembler-times "vlrepf" 1 } } */
+
+v4si
+h ()
+{
+  return G == 1;
+}
+/* { dg-final { scan-assembler-times "vgmf\t%v.*,31,31" 1 } } */
+
+v4si
+i ()
+{
+  return G == -1;
+}
+/* { dg-final { scan-assembler-times "vone" 1 } } */
+
+v4si
+j ()
+{
+  return G == 0;
+}
+/* { dg-final { scan-assembler-times "vzero" 1 } } */
+
+v4si
+k ()
+{
+  return G == (v4si){ 0xff80, 0xff80, 0xff80, 0xff80 };
+}
+/* { dg-final { scan-assembler-times "vgmf\t%v.*,16,24" 1 } } */
+
+v4si
+l ()
+{
+  return G == (v4si){ 0xf000000f, 0xf000000f, 0xf000000f, 0xf000000f };
+}
+/* { dg-final { scan-assembler-times "vgmf\t%v.*,28,3" 1 } } */
+
+v4si
+m ()
+{
+  return G == (v4si){ 0x00ff00ff, 0x0000ffff, 0xffff0000, 0xff00ff00 };
+}
+/* { dg-final { scan-assembler-times "vgbm\t%v.*,21450" 1 } } */
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-int-math-compile-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-int-math-compile-1.c
new file mode 100644
index 0000000..f6c38d9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-int-math-compile-1.c
@@ -0,0 +1,40 @@
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+typedef __attribute__((vector_size(16))) signed int v4si;
+
+v4si
+adddbl (v4si a, v4si b)
+{
+  return a + b;
+}
+
+v4si
+subdbl (v4si a, v4si b)
+{
+  return a - b;
+}
+
+v4si
+muldbl (v4si a, v4si b)
+{
+  return a * b;
+}
+
+v4si
+divdbl (v4si a, v4si b)
+{
+  return a / b;
+}
+
+v4si
+fmadbl (v4si a, v4si b, v4si c)
+{
+  return a * b + c;
+}
+
+v4si
+fmsdbl (v4si a, v4si b, v4si c)
+{
+  return a * b - c;
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-shift-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-shift-1.c
new file mode 100644
index 0000000..3517fea
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-shift-1.c
@@ -0,0 +1,108 @@
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* { dg-final { scan-assembler-times "veslb" 2 } } */
+/* { dg-final { scan-assembler-times "veslh" 2 } } */
+/* { dg-final { scan-assembler-times "veslf" 2 } } */
+/* { dg-final { scan-assembler-times "veslg" 2 } } */
+
+/* { dg-final { scan-assembler-times "vesrab" 1 } } */
+/* { dg-final { scan-assembler-times "vesrah" 1 } } */
+/* { dg-final { scan-assembler-times "vesraf" 1 } } */
+/* { dg-final { scan-assembler-times "vesrag" 1 } } */
+
+/* { dg-final { scan-assembler-times "vesrlb" 1 } } */
+/* { dg-final { scan-assembler-times "vesrlh" 1 } } */
+/* { dg-final { scan-assembler-times "vesrlf" 1 } } */
+/* { dg-final { scan-assembler-times "vesrlg" 1 } } */
+
+/* { dg-final { scan-assembler-times "veslvb" 2 } } */
+/* { dg-final { scan-assembler-times "veslvh" 2 } } */
+/* { dg-final { scan-assembler-times "veslvf" 2 } } */
+/* { dg-final { scan-assembler-times "veslvg" 2 } } */
+
+/* { dg-final { scan-assembler-times "vesravb" 1 } } */
+/* { dg-final { scan-assembler-times "vesravh" 1 } } */
+/* { dg-final { scan-assembler-times "vesravf" 1 } } */
+/* { dg-final { scan-assembler-times "vesravg" 1 } } */
+
+/* { dg-final { scan-assembler-times "vesrlvb" 1 } } */
+/* { dg-final { scan-assembler-times "vesrlvh" 1 } } */
+/* { dg-final { scan-assembler-times "vesrlvf" 1 } } */
+/* { dg-final { scan-assembler-times "vesrlvg" 1 } } */
+
+typedef __attribute__((vector_size(16))) signed char v16qi;
+typedef __attribute__((vector_size(16))) unsigned char uv16qi;
+
+typedef __attribute__((vector_size(16))) signed short v8hi;
+typedef __attribute__((vector_size(16))) unsigned short uv8hi;
+
+typedef __attribute__((vector_size(16))) signed int v4si;
+typedef __attribute__((vector_size(16))) unsigned int uv4si;
+
+typedef __attribute__((vector_size(16))) signed long long v2di;
+typedef __attribute__((vector_size(16))) unsigned long long uv2di;
+
+uv16qi g_uvqi0, g_uvqi1, g_uvqi2;
+v16qi g_vqi0, g_vqi1, g_vqi2;
+
+uv8hi g_uvhi0, g_uvhi1, g_uvhi2;
+v8hi g_vhi0, g_vhi1, g_vhi2;
+
+uv4si g_uvsi0, g_uvsi1, g_uvsi2;
+v4si g_vsi0, g_vsi1, g_vsi2;
+
+uv2di g_uvdi0, g_uvdi1, g_uvdi2;
+v2di g_vdi0, g_vdi1, g_vdi2;
+
+void
+shift_left_by_scalar (int s)
+{
+  g_uvqi0 = g_uvqi1 << s;
+  g_vqi0 = g_vqi1 << s;
+  g_uvhi0 = g_uvhi1 << s;
+  g_vhi0 = g_vhi1 << s;
+  g_uvsi0 = g_uvsi1 << s;
+  g_vsi0 = g_vsi1 << s;
+  g_uvdi0 = g_uvdi1 << s;
+  g_vdi0 = g_vdi1 << s;
+}
+
+void
+shift_right_by_scalar (int s)
+{
+  g_uvqi0 = g_uvqi1 >> s;
+  g_vqi0 = g_vqi1 >> s;
+  g_uvhi0 = g_uvhi1 >> s;
+  g_vhi0 = g_vhi1 >> s;
+  g_uvsi0 = g_uvsi1 >> s;
+  g_vsi0 = g_vsi1 >> s;
+  g_uvdi0 = g_uvdi1 >> s;
+  g_vdi0 = g_vdi1 >> s;
+}
+
+void
+shift_left_by_vector ()
+{
+  g_uvqi0 = g_uvqi1 << g_uvqi2;
+  g_vqi0 = g_vqi1 << g_vqi2;
+  g_uvhi0 = g_uvhi1 << g_uvhi2;
+  g_vhi0 = g_vhi1 << g_vhi2;
+  g_uvsi0 = g_uvsi1 << g_uvsi2;
+  g_vsi0 = g_vsi1 << g_vsi2;
+  g_uvdi0 = g_uvdi1 << g_uvdi2;
+  g_vdi0 = g_vdi1 << g_vdi2;
+}
+
+void
+shift_right_by_vector ()
+{
+  g_uvqi0 = g_uvqi1 >> g_uvqi2;
+  g_vqi0 = g_vqi1 >> g_vqi2;
+  g_uvhi0 = g_uvhi1 >> g_uvhi2;
+  g_vhi0 = g_vhi1 >> g_vhi2;
+  g_uvsi0 = g_uvsi1 >> g_uvsi2;
+  g_vsi0 = g_vsi1 >> g_vsi2;
+  g_uvdi0 = g_uvdi1 >> g_uvdi2;
+  g_vdi0 = g_vdi1 >> g_vdi2;
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-sub-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-sub-1.c
new file mode 100644
index 0000000..3fe33dd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-sub-1.c
@@ -0,0 +1,51 @@
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* { dg-final { scan-assembler-times "vsb" 2 } } */
+/* { dg-final { scan-assembler-times "vsh" 2 } } */
+/* { dg-final { scan-assembler-times "vsf" 2 } } */
+/* { dg-final { scan-assembler-times "vsg" 2 } } */
+/* { dg-final { scan-assembler-times "vfs" 1 } } */
+
+
+typedef unsigned char     uv16qi __attribute__((vector_size(16)));
+typedef signed char        v16qi __attribute__((vector_size(16)));
+typedef unsigned short     uv8hi __attribute__((vector_size(16)));
+typedef signed short        v8hi __attribute__((vector_size(16)));
+typedef unsigned int       uv4si __attribute__((vector_size(16)));
+typedef signed int          v4si __attribute__((vector_size(16)));
+typedef unsigned long long uv2di __attribute__((vector_size(16)));
+typedef signed long long    v2di __attribute__((vector_size(16)));
+typedef double              v2df __attribute__((vector_size(16)));
+
+uv16qi g_uvqi0, g_uvqi1, g_uvqi2;
+v16qi g_vqi0, g_vqi1, g_vqi2;
+
+uv8hi g_uvhi0, g_uvhi1, g_uvhi2;
+v8hi g_vhi0, g_vhi1, g_vhi2;
+
+uv4si g_uvsi0, g_uvsi1, g_uvsi2;
+v4si g_vsi0, g_vsi1, g_vsi2;
+
+uv2di g_uvdi0, g_uvdi1, g_uvdi2;
+v2di g_vdi0, g_vdi1, g_vdi2;
+
+v2df g_vdf0, g_vdf1, g_vdf2;
+
+void
+sub1 ()
+{
+  g_vqi0 = g_vqi1 - g_vqi2;
+  g_uvqi0 = g_uvqi1 - g_uvqi2;
+
+  g_vhi0 = g_vhi1 - g_vhi2;
+  g_uvhi0 = g_uvhi1 - g_uvhi2;
+
+  g_vsi0 = g_vsi1 - g_vsi2;
+  g_uvsi0 = g_uvsi1 - g_uvsi2;
+
+  g_vdi0 = g_vdi1 - g_vdi2;
+  g_uvdi0 = g_uvdi1 - g_uvdi2;
+
+  g_vdf0 = g_vdf1 - g_vdf2;
+}
-- 
1.7.9.5

* [PATCH 11/13] Testsuite S/390 vector types are only 8 byte aligned.
  2015-05-11 13:23 [PATCH 00/13] S/390 Implement support for IBM z13 Andreas Krebbel
  2015-05-11 13:23 ` [PATCH 01/13] recog: Increased max number of alternatives Andreas Krebbel
@ 2015-05-11 13:24 ` Andreas Krebbel
  2015-05-11 17:05   ` Jeff Law
  2015-05-11 13:24 ` [PATCH 06/13] Vector base support - testcases Andreas Krebbel
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-11 13:24 UTC (permalink / raw)
  To: gcc-patches

gcc/testsuite/
	* lib/target-supports.exp: Vectors do not always have natural
          alignment on s390*.
---
 gcc/testsuite/lib/target-supports.exp |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index c5d0ffe..155cefa 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -4347,7 +4347,8 @@ proc check_effective_target_vect_natural_alignment { } {
     } else {
         set et_vect_natural_alignment_saved 1
         if { [check_effective_target_arm_eabi]
-	     || [istarget nvptx-*-*] } {
+	     || [istarget nvptx-*-*]
+	     || [istarget s390*-*-*] } {
             set et_vect_natural_alignment_saved 0
         }
     }
-- 
1.7.9.5

* [RFC 12/13] S/390 Vector ABI GNU Attribute.
  2015-05-11 13:23 [PATCH 00/13] S/390 Implement support for IBM z13 Andreas Krebbel
                   ` (8 preceding siblings ...)
  2015-05-11 13:24 ` [PATCH 03/13] S/390 Fix secondary reload issue with store/load relative operands Andreas Krebbel
@ 2015-05-11 13:24 ` Andreas Krebbel
  2015-05-19 18:18   ` [PING] " Andreas Krebbel
  2015-05-11 13:24 ` [PATCH 05/13] S/390 Vector base support Andreas Krebbel
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-11 13:24 UTC (permalink / raw)
  To: gcc-patches

With this patch .gnu_attribute is used to mark binaries with a vector
ABI tag.  This is required since the z13 vector support breaks the ABI
of existing vector_size attribute generated vector types:

1. vector_size(16) and bigger vectors are aligned to 8 byte
boundaries (formerly vectors were always naturally aligned)

2. vector_size(16) or smaller vectors are passed via VR if available
or by value on the stack (formerly vectors were passed on the stack by
reference).
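
The alignment change in item 1 can be observed from C with a small
probe along the lines of vec-abi-align-1.c.  This is only a sketch:
the names probe, vector_member_offset and vector_alignment are made up
for illustration and are not part of the patch.  It assumes a compiler
supporting the GNU vector_size attribute and the __alignof__ extension.

```c
#include <stddef.h>

typedef int v4si __attribute__((vector_size(16)));

/* A char followed by a 16 byte vector.  The offset of y equals the
   alignment the ABI in use assigns to the vector type.  */
struct probe
{
  char x;
  v4si y;
};

/* Offset of the vector member.  Going by the description above, 8 is
   expected with the z13 vector ABI and 16 with natural alignment.  */
size_t
vector_member_offset (void)
{
  return offsetof (struct probe, y);
}

/* Alignment the compiler assigns to the vector type itself.  */
size_t
vector_alignment (void)
{
  return __alignof__ (v4si);
}
```

Calling both functions from a test program and comparing the results
across -mvx and -mno-vx builds should show whether two objects agree
on the layout.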

The .gnu_attribute will be used by ld to emit a warning if binaries
with incompatible ABIs are being linked together:
https://sourceware.org/ml/binutils/2015-04/msg00316.html

And it will be used by GDB to perform inferior function calls using the
vector ABI that matches the binary being debugged:
https://sourceware.org/ml/gdb-patches/2015-04/msg00833.html

The current implementation tries to only set the attribute if the
vector types are really used in ABI relevant contexts in order to
avoid false positives during linking.

However, this unfortunately has some limitations, as in the following
case where an ABI relevant context cannot be detected properly:

typedef int __attribute__((vector_size(16))) v4si;
struct A
{
  char x;
  v4si y;
};
char a[sizeof(struct A)];

The number of elements in a depends on the ABI (24 with -mvx and 32
with -mno-vx).  However, the implementation is not able to detect this
since the struct type is not used anywhere else and consequently does
not survive until the checking code is able to see it.
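
The 24 vs. 32 difference can be reproduced on any target with a small
sketch that does not need vector types at all.  The two stand-in
structs below are assumptions made for this illustration: they only
mimic a 16 byte object with natural (16 byte) alignment and with the
8 byte alignment of the new ABI.

```c
#include <stddef.h>

/* Made-up stand-ins for a 16 byte vector under the two ABIs.  */
struct vec16_natural { char bytes[16]; } __attribute__((aligned(16)));
struct vec16_align8  { char bytes[16]; } __attribute__((aligned(8)));

/* Same shape as struct A from the example above.  */
struct A_natural { char x; struct vec16_natural y; };
struct A_align8  { char x; struct vec16_align8 y; };

/* Natural alignment: y starts at offset 16, total size 32.  */
_Static_assert (offsetof (struct A_natural, y) == 16, "offset of y");
_Static_assert (sizeof (struct A_natural) == 32, "size of A");

/* 8 byte alignment: y starts at offset 8, total size 24.  */
_Static_assert (offsetof (struct A_align8, y) == 8, "offset of y");
_Static_assert (sizeof (struct A_align8) == 24, "size of A");

/* The array from the example, once per layout.  */
char a_natural[sizeof (struct A_natural)];
char a_align8[sizeof (struct A_align8)];
```

The padding after x is what differs, which is why an object like a is
ABI dependent even though no vector value is ever passed anywhere.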

Ideas about how to improve the implementation without creating too
many false positives are welcome.

In particular, we do not want to set the attribute for local uses of
vector types, since such uses are to be expected with ifunc
optimizations.

gcc/
	* config/s390/s390.c (s390_vector_abi): New variable definition.
	(s390_check_type_for_vector_abi): New function.
	(TARGET_ASM_FILE_END): New macro definition.
	(s390_asm_file_end): New function.
	(s390_function_arg): Call s390_check_type_for_vector_abi.
	(s390_gimplify_va_arg): Likewise.
	* configure: Regenerate.
	* configure.ac: Check for .gnu_attribute Binutils feature.

gcc/testsuite/
	* gcc.target/s390/vector/vec-abi-1.c: Add gnu attribute check.
	* gcc.target/s390/vector/vec-abi-attr-1.c: New test.
	* gcc.target/s390/vector/vec-abi-attr-2.c: New test.
	* gcc.target/s390/vector/vec-abi-attr-3.c: New test.
	* gcc.target/s390/vector/vec-abi-attr-4.c: New test.
	* gcc.target/s390/vector/vec-abi-attr-5.c: New test.
	* gcc.target/s390/vector/vec-abi-attr-6.c: New test.
---
 gcc/config/s390/s390.c                             |  120 ++++++++++++++++++++
 gcc/configure                                      |   36 ++++++
 gcc/configure.ac                                   |    7 ++
 gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c   |    1 +
 .../gcc.target/s390/vector/vec-abi-attr-1.c        |   18 +++
 .../gcc.target/s390/vector/vec-abi-attr-2.c        |   53 +++++++++
 .../gcc.target/s390/vector/vec-abi-attr-3.c        |   18 +++
 .../gcc.target/s390/vector/vec-abi-attr-4.c        |   17 +++
 .../gcc.target/s390/vector/vec-abi-attr-5.c        |   19 ++++
 .../gcc.target/s390/vector/vec-abi-attr-6.c        |   24 ++++
 10 files changed, 313 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-3.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-4.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-5.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-6.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index e682516..e1ae1ed 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -466,6 +466,97 @@ struct GTY(()) machine_function
 #define PREDICT_DISTANCE (TARGET_Z10 ? 384 : 2048)
 
 
+/* Indicate which ABI has been used for passing vector args.
+   0 - no vector type arguments have been passed where the ABI is relevant
+   1 - the old ABI has been used
+   2 - a vector type argument has been passed either in a vector register
+       or on the stack by value  */
+static int s390_vector_abi = 0;
+
+/* Set the vector ABI marker if TYPE is subject to the vector ABI
+   switch.  The vector ABI affects only vector data types.  There are
+   two aspects of the vector ABI relevant here:
+
+   1. vectors >= 16 bytes have an alignment of 8 bytes with the new
+   ABI and natural alignment with the old.
+
+   2. vector <= 16 bytes are passed in VRs or by value on the stack
+   with the new ABI but by reference on the stack with the old.
+
+   If ARG_P is true TYPE is used for a function argument or return
+   value.  The ABI marker then is set for all vector data types.  If
+   ARG_P is false only type 1 vectors are being checked.  */
+
+static void
+s390_check_type_for_vector_abi (const_tree type, bool arg_p, bool in_struct_p)
+{
+  static hash_set<const_tree> visited_types_hash;
+
+  if (s390_vector_abi)
+    return;
+
+  if (type == NULL_TREE || TREE_CODE (type) == ERROR_MARK)
+    return;
+
+  if (visited_types_hash.contains (type))
+    return;
+
+  visited_types_hash.add (type);
+
+  if (VECTOR_TYPE_P (type))
+    {
+      int type_size = int_size_in_bytes (type);
+
+      /* Outside arguments only the alignment is changing and this
+	 only happens for vector types >= 16 bytes.  */
+      if (!arg_p && type_size < 16)
+	return;
+
+      /* In arguments vector types > 16 are passed as before (GCC
+	 never enforced the bigger alignment for arguments which was
+	 required by the old vector ABI).  However, it might still be
+	 ABI relevant due to the changed alignment if it is a struct
+	 member.  */
+      if (arg_p && type_size > 16 && !in_struct_p)
+	return;
+
+      s390_vector_abi = TARGET_VX_ABI ? 2 : 1;
+    }
+  else if (POINTER_TYPE_P (type) || TREE_CODE (type) == ARRAY_TYPE)
+    {
+      /* ARRAY_TYPE: Since with neither of the ABIs we have more than
+	 natural alignment there will never be ABI dependent padding
+	 in an array type.  That's why we do not set in_struct_p to
+	 true here.  */
+      s390_check_type_for_vector_abi (TREE_TYPE (type), arg_p, in_struct_p);
+    }
+  else if (TREE_CODE (type) == FUNCTION_TYPE || TREE_CODE (type) == METHOD_TYPE)
+    {
+      tree arg_chain;
+
+      /* Check the return type.  */
+      s390_check_type_for_vector_abi (TREE_TYPE (type), true, false);
+
+      for (arg_chain = TYPE_ARG_TYPES (type);
+	   arg_chain;
+	   arg_chain = TREE_CHAIN (arg_chain))
+	s390_check_type_for_vector_abi (TREE_VALUE (arg_chain), true, false);
+    }
+  else if (RECORD_OR_UNION_TYPE_P (type))
+    {
+      tree field;
+
+      for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
+	{
+	  if (TREE_CODE (field) != FIELD_DECL)
+	    continue;
+
+	  s390_check_type_for_vector_abi (TREE_TYPE (field), arg_p, true);
+	}
+    }
+}
+
+
 /* System z builtins.  */
 
 #include "s390-builtins.h"
@@ -10900,6 +10991,8 @@ s390_function_arg (cumulative_args_t cum_v, machine_mode mode,
 {
   CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
 
+  if (!named)
+    s390_check_type_for_vector_abi (type, true, false);
 
   if (s390_function_arg_vector (mode, type))
     {
@@ -11292,6 +11385,8 @@ s390_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p,
 
   size = int_size_in_bytes (type);
 
+  s390_check_type_for_vector_abi (type, true, false);
+
   if (pass_by_reference (NULL, TYPE_MODE (type), type, false))
     {
       if (TARGET_DEBUG_ARG)
@@ -13646,6 +13741,28 @@ s390_vector_alignment (const_tree type)
   return MIN (64, tree_to_shwi (TYPE_SIZE (type)));
 }
 
+/* Implement TARGET_ASM_FILE_END.  */
+static void
+s390_asm_file_end (void)
+{
+#ifdef HAVE_AS_GNU_ATTRIBUTE
+  varpool_node *vnode;
+  cgraph_node *cnode;
+
+  FOR_EACH_VARIABLE (vnode)
+    if (TREE_PUBLIC (vnode->decl))
+      s390_check_type_for_vector_abi (TREE_TYPE (vnode->decl), false, false);
+
+  FOR_EACH_FUNCTION (cnode)
+    if (TREE_PUBLIC (cnode->decl))
+      s390_check_type_for_vector_abi (TREE_TYPE (cnode->decl), false, false);
+
+
+  if (s390_vector_abi != 0)
+    fprintf (asm_out_file, "\t.gnu_attribute 8, %d\n",
+	     s390_vector_abi);
+#endif
+}
 
 /* Initialize GCC target structure.  */
 
@@ -13861,6 +13978,9 @@ s390_vector_alignment (const_tree type)
 #undef TARGET_VECTOR_ALIGNMENT
 #define TARGET_VECTOR_ALIGNMENT s390_vector_alignment
 
+#undef TARGET_ASM_FILE_END
+#define TARGET_ASM_FILE_END s390_asm_file_end
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-s390.h"
diff --git a/gcc/configure b/gcc/configure
index 9523773..a6d8e11 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -26544,6 +26544,42 @@ fi
       as_fn_error "Requesting --with-nan= requires assembler support for -mnan=" "$LINENO" 5
     fi
     ;;
+    s390*-*-*)
+    { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for .gnu_attribute support" >&5
+$as_echo_n "checking assembler for .gnu_attribute support... " >&6; }
+if test "${gcc_cv_as_s390_gnu_attribute+set}" = set; then :
+  $as_echo_n "(cached) " >&6
+else
+  gcc_cv_as_s390_gnu_attribute=no
+    if test $in_tree_gas = yes; then
+    if test $gcc_cv_gas_vers -ge `expr \( \( 2 \* 1000 \) + 18 \) \* 1000 + 0`
+  then gcc_cv_as_s390_gnu_attribute=yes
+fi
+  elif test x$gcc_cv_as != x; then
+    $as_echo '.gnu_attribute 8,1' > conftest.s
+    if { ac_try='$gcc_cv_as $gcc_cv_as_flags  -o conftest.o conftest.s >&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }
+    then
+	gcc_cv_as_s390_gnu_attribute=yes
+    else
+      echo "configure: failed program was" >&5
+      cat conftest.s >&5
+    fi
+    rm -f conftest.o conftest.s
+  fi
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_s390_gnu_attribute" >&5
+$as_echo "$gcc_cv_as_s390_gnu_attribute" >&6; }
+if test $gcc_cv_as_s390_gnu_attribute = yes; then
+
+$as_echo "#define HAVE_AS_GNU_ATTRIBUTE 1" >>confdefs.h
+
+fi
+    ;;
 esac
 
 # Mips and HP-UX need the GNU assembler.
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 68b0ee8..577437e 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -4426,6 +4426,13 @@ pointers into PC-relative form.])
 	[Requesting --with-nan= requires assembler support for -mnan=])
     fi
     ;;
+    s390*-*-*)
+    gcc_GAS_CHECK_FEATURE([.gnu_attribute support],
+      gcc_cv_as_s390_gnu_attribute, [2,18,0],,
+      [.gnu_attribute 8,1],,
+      [AC_DEFINE(HAVE_AS_GNU_ATTRIBUTE, 1,
+	  [Define if your assembler supports .gnu_attribute.])])
+    ;;
 esac
 
 # Mips and HP-UX need the GNU assembler.
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c
index 5484664..db18e5e 100644
--- a/gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c
@@ -6,6 +6,7 @@
 /* Make sure the last argument is fetched from the argument overflow area.  */
 /* { dg-final { scan-assembler "vl\t%v\[0-9\]*,160\\(%r15\\)" { target lp64 } } } */
 /* { dg-final { scan-assembler "vl\t%v\[0-9\]*,96\\(%r15\\)" { target ilp32 } } } */
+/* { dg-final { scan-assembler "gnu_attribute 8, 2" } } */
 
 typedef double v2df __attribute__((vector_size(16)));
 
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-1.c
new file mode 100644
index 0000000..a06b338
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-1.c
@@ -0,0 +1,18 @@
+/* Check calling convention in the vector ABI.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13 -mno-vx" } */
+
+/* The function passes arguments whose calling conventions change with
+   -mvx/-mno-vx.  In that case GCC has to emit the ABI attribute to
+   allow GDB and Binutils to detect this.  */
+/* { dg-final { scan-assembler "gnu_attribute 8, 1" } } */
+
+typedef double v2df __attribute__((vector_size(16)));
+
+v2df
+add (v2df a, v2df b, v2df c, v2df d,
+     v2df e, v2df f, v2df g, v2df h, v2df i)
+{
+  return a + b + c + d + e + f + g + h + i;
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-2.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-2.c
new file mode 100644
index 0000000..97b9748
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-2.c
@@ -0,0 +1,53 @@
+/* Check calling convention in the vector ABI.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* No abi attribute should be emitted when nothing relevant happened.  */
+/* { dg-final { scan-assembler-not "gnu_attribute" } } */
+
+#include <stdarg.h>
+
+/* Local use is ok.  */
+
+typedef int v4si __attribute__((vector_size(16)));
+
+static
+v4si __attribute__((__noinline__))
+foo (v4si a)
+{
+  return a + (v4si){ 1, 2, 3, 4 };
+}
+
+int
+bar (int a)
+{
+  return foo ((v4si){ 1, 1, 1, 1 })[1];
+}
+
+/* Big vector type only used as function argument and return value
+   without being a struct/union member.  The alignment change is not
+   relevant here.  */
+
+typedef double v4df __attribute__((vector_size(32)));
+
+v4df
+add (v4df a, v4df b, v4df c, v4df d,
+     v4df e, v4df f, v4df g, v4df h, v4df i)
+{
+  return a + b + c + d + e + f + g + h + i;
+}
+
+double
+bar2 (int n, ...)
+{
+  double ret;
+  v4df a;
+  va_list va;
+
+  va_start (va, n);
+  ret = va_arg (va, v4df)[2];
+  va_end (va);
+
+  return ret;
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-3.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-3.c
new file mode 100644
index 0000000..f3dc368
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-3.c
@@ -0,0 +1,18 @@
+/* Check calling convention in the vector ABI.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* { dg-final { scan-assembler "gnu_attribute 8, 2" } } */
+
+typedef double v4df __attribute__((vector_size(32)));
+typedef struct { v4df a; } s;
+
+s
+add (v4df a, v4df b, v4df c, v4df d,
+     v4df e, v4df f, v4df g, v4df h, v4df i)
+{
+  s t;
+  t.a = a + b + c + d + e + f + g + h + i;
+  return t;
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-4.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-4.c
new file mode 100644
index 0000000..ad9b29a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-4.c
@@ -0,0 +1,17 @@
+/* Check calling convention in the vector ABI.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* { dg-final { scan-assembler "gnu_attribute 8, 2" } } */
+
+typedef int __attribute__((vector_size(16))) v4si;
+
+extern void bar (v4si);
+
+void
+foo (int a)
+{
+  v4si b = (v4si){ a, a, a, a };
+  bar (b);
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-5.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-5.c
new file mode 100644
index 0000000..fb5de4e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-5.c
@@ -0,0 +1,19 @@
+/* Check calling convention in the vector ABI.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* { dg-final { scan-assembler "gnu_attribute 8, 2" } } */
+
+#include <stdarg.h>
+
+typedef int __attribute__((vector_size(16))) v4si;
+
+extern void bar (int, ...);
+
+void
+foo (int a)
+{
+  v4si b = (v4si){ a, a, a, a };
+  bar (1, b);
+}
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-6.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-6.c
new file mode 100644
index 0000000..9134fa7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-6.c
@@ -0,0 +1,24 @@
+/* Check calling convention in the vector ABI.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* { dg-final { scan-assembler "gnu_attribute 8, 2" } } */
+
+#include <stdarg.h>
+
+typedef int __attribute__((vector_size(16))) v4si;
+
+int
+bar (int n, ...)
+{
+  int ret;
+  v4si a;
+  va_list va;
+
+  va_start (va, n);
+  ret = va_arg (va, v4si)[2];
+  va_end (va);
+
+  return ret;
+}
-- 
1.7.9.5

* [PATCH 04/13] S/390 Add -march/-mtune=z13 option.
  2015-05-11 13:23 [PATCH 00/13] S/390 Implement support for IBM z13 Andreas Krebbel
                   ` (3 preceding siblings ...)
  2015-05-11 13:24 ` [PATCH 02/13] optabs: Fix vec_perm -> V16QI middle end lowering Andreas Krebbel
@ 2015-05-11 13:24 ` Andreas Krebbel
  2015-05-11 13:24 ` [PATCH 10/13] Testsuite These testcases require disabling hardware vector support on S/390 Andreas Krebbel
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-11 13:24 UTC (permalink / raw)
  To: gcc-patches

gcc/
	* common/config/s390/s390-common.c (processor_flags_table): Add
	z13.
	* config.gcc: Add z13.
	* config/s390/s390-opts.h (enum processor_type): Add
	PROCESSOR_2964_Z13.
	* config/s390/s390.c (s390_adjust_priority): Check for
	PROCESSOR_2964_Z13.
	(s390_reorg): Likewise.
	(s390_sched_reorder): Likewise.
	(s390_sched_variable_issue): Likewise.
	(s390_loop_unroll_adjust): Likewise.
	(s390_option_override): Likewise. Default to -mvx when available.
	* config/s390/s390.h (enum processor_flags): Add PF_Z13 and PF_VX.
	(TARGET_CPU_Z13, TARGET_CPU_VX, TARGET_Z13, TARGET_VX)
	(TARGET_VX_ABI): Define macros.
	(TARGET_DEFAULT): Add MASK_OPT_VX.
	* config/s390/s390.md ("cpu" attribute): Add z13.
	("cpu_facility" attribute): Add vec.
	* config/s390/s390.opt (processor_type): Add z13.
	(mvx): New option.
---
 gcc/common/config/s390/s390-common.c |    3 +++
 gcc/config.gcc                       |    2 +-
 gcc/config/s390/s390-opts.h          |    1 +
 gcc/config/s390/s390.c               |   35 ++++++++++++++++++++++++++++------
 gcc/config/s390/s390.h               |   19 ++++++++++++++++--
 gcc/config/s390/s390.md              |    8 ++++++--
 gcc/config/s390/s390.opt             |    7 +++++++
 7 files changed, 64 insertions(+), 11 deletions(-)

diff --git a/gcc/common/config/s390/s390-common.c b/gcc/common/config/s390/s390-common.c
index 7181beb..43459c8 100644
--- a/gcc/common/config/s390/s390-common.c
+++ b/gcc/common/config/s390/s390-common.c
@@ -42,7 +42,10 @@ EXPORTED_CONST int processor_flags_table[] =
     /* z196 */   PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
                  | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196,
     /* zEC12 */  PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
+                 | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX,
+    /* z13 */    PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
                  | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX
+                 | PF_Z13 | PF_VX
   };
 
 /* Change optimizations to be performed, depending on the
diff --git a/gcc/config.gcc b/gcc/config.gcc
index a1df043..a2af2a0 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4071,7 +4071,7 @@ case "${target}" in
 		for which in arch tune; do
 			eval "val=\$with_$which"
 			case ${val} in
-			"" | g5 | g6 | z900 | z990 | z9-109 | z9-ec | z10 | z196 | zEC12)
+			"" | g5 | g6 | z900 | z990 | z9-109 | z9-ec | z10 | z196 | zEC12 | z13)
 				# OK
 				;;
 			*)
diff --git a/gcc/config/s390/s390-opts.h b/gcc/config/s390/s390-opts.h
index cb9ebc7..5bde333 100644
--- a/gcc/config/s390/s390-opts.h
+++ b/gcc/config/s390/s390-opts.h
@@ -35,6 +35,7 @@ enum processor_type
   PROCESSOR_2097_Z10,
   PROCESSOR_2817_Z196,
   PROCESSOR_2827_ZEC12,
+  PROCESSOR_2964_Z13,
   PROCESSOR_max
 };
 
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index cc37618..843a860 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -5851,7 +5851,8 @@ s390_adjust_priority (rtx_insn *insn, int priority)
       && s390_tune != PROCESSOR_2094_Z9_109
       && s390_tune != PROCESSOR_2097_Z10
       && s390_tune != PROCESSOR_2817_Z196
-      && s390_tune != PROCESSOR_2827_ZEC12)
+      && s390_tune != PROCESSOR_2827_ZEC12
+      && s390_tune != PROCESSOR_2964_Z13)
     return priority;
 
   switch (s390_safe_attr_type (insn))
@@ -11451,7 +11452,8 @@ s390_reorg (void)
   /* Walk over the insns and do some >=z10 specific changes.  */
   if (s390_tune == PROCESSOR_2097_Z10
       || s390_tune == PROCESSOR_2817_Z196
-      || s390_tune == PROCESSOR_2827_ZEC12)
+      || s390_tune == PROCESSOR_2827_ZEC12
+      || s390_tune == PROCESSOR_2964_Z13)
     {
       rtx_insn *insn;
       bool insn_added_p = false;
@@ -11700,7 +11702,8 @@ s390_sched_reorder (FILE *file, int verbose,
     if (reload_completed && *nreadyp > 1)
       s390_z10_prevent_earlyload_conflicts (ready, nreadyp);
 
-  if (s390_tune == PROCESSOR_2827_ZEC12
+  if ((s390_tune == PROCESSOR_2827_ZEC12
+       || s390_tune == PROCESSOR_2964_Z13)
       && reload_completed
       && *nreadyp > 1)
     {
@@ -11783,7 +11786,8 @@ s390_sched_variable_issue (FILE *file, int verbose, rtx_insn *insn, int more)
 {
   last_scheduled_insn = insn;
 
-  if (s390_tune == PROCESSOR_2827_ZEC12
+  if ((s390_tune == PROCESSOR_2827_ZEC12
+       || s390_tune == PROCESSOR_2964_Z13)
       && reload_completed
       && recog_memoized (insn) >= 0)
     {
@@ -11863,7 +11867,8 @@ s390_loop_unroll_adjust (unsigned nunroll, struct loop *loop)
 
   if (s390_tune != PROCESSOR_2097_Z10
       && s390_tune != PROCESSOR_2817_Z196
-      && s390_tune != PROCESSOR_2827_ZEC12)
+      && s390_tune != PROCESSOR_2827_ZEC12
+      && s390_tune != PROCESSOR_2964_Z13)
     return nunroll;
 
   /* Count the number of memory references within the loop body.  */
@@ -11994,6 +11999,22 @@ s390_option_override (void)
   if (!(target_flags_explicit & MASK_OPT_HTM) && TARGET_CPU_HTM && TARGET_ZARCH)
     target_flags |= MASK_OPT_HTM;
 
+  if (target_flags_explicit & MASK_OPT_VX)
+    {
+      if (TARGET_OPT_VX)
+	{
+	  if (!TARGET_CPU_VX)
+	    error ("hardware vector support not available on %s",
+		   s390_arch_string);
+	  if (TARGET_SOFT_FLOAT)
+	    error ("hardware vector support not available with -msoft-float");
+	}
+    }
+  else if (TARGET_CPU_VX)
+    /* Enable vector support if available and not explicitly disabled
+       by user.  E.g. with -m31 -march=z13 -mzarch */
+    target_flags |= MASK_OPT_VX;
+
   if (TARGET_HARD_DFP && !TARGET_DFP)
     {
       if (target_flags_explicit & MASK_HARD_DFP)
@@ -12033,6 +12054,7 @@ s390_option_override (void)
       s390_cost = &z196_cost;
       break;
     case PROCESSOR_2827_ZEC12:
+    case PROCESSOR_2964_Z13:
       s390_cost = &zEC12_cost;
       break;
     default:
@@ -12060,7 +12082,8 @@ s390_option_override (void)
 
   if (s390_tune == PROCESSOR_2097_Z10
       || s390_tune == PROCESSOR_2817_Z196
-      || s390_tune == PROCESSOR_2827_ZEC12)
+      || s390_tune == PROCESSOR_2827_ZEC12
+      || s390_tune == PROCESSOR_2964_Z13)
     {
       maybe_set_param_value (PARAM_MAX_UNROLLED_INSNS, 100,
 			     global_options.x_param_values,
diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
index 4953075..7130275 100644
--- a/gcc/config/s390/s390.h
+++ b/gcc/config/s390/s390.h
@@ -35,7 +35,9 @@ enum processor_flags
   PF_Z10 = 32,
   PF_Z196 = 64,
   PF_ZEC12 = 128,
-  PF_TX = 256
+  PF_TX = 256,
+  PF_Z13 = 512,
+  PF_VX = 1024
 };
 
 /* This is necessary to avoid a warning about comparing different enum
@@ -64,6 +66,10 @@ enum processor_flags
  	(s390_arch_flags & PF_ZEC12)
 #define TARGET_CPU_HTM \
  	(s390_arch_flags & PF_TX)
+#define TARGET_CPU_Z13 \
+        (s390_arch_flags & PF_Z13)
+#define TARGET_CPU_VX \
+        (s390_arch_flags & PF_VX)
 
 /* These flags indicate that the generated code should run on a cpu
    providing the respective hardware facility when run in
@@ -82,7 +88,15 @@ enum processor_flags
 #define TARGET_ZEC12 \
        (TARGET_ZARCH && TARGET_CPU_ZEC12)
 #define TARGET_HTM (TARGET_OPT_HTM)
+#define TARGET_Z13 \
+       (TARGET_ZARCH && TARGET_CPU_Z13)
+#define TARGET_VX \
+       (TARGET_ZARCH && TARGET_CPU_VX && TARGET_OPT_VX && TARGET_HARD_FLOAT)
 
+/* Use the ABI introduced with IBM z13:
+   - pass vector arguments <= 16 bytes in VRs
+   - align *all* vector types to 8 bytes  */
+#define TARGET_VX_ABI TARGET_VX
 
 #define TARGET_AVOID_CMP_AND_BRANCH (s390_tune == PROCESSOR_2817_Z196)
 
@@ -115,7 +129,8 @@ enum processor_flags
   while (0)
 
 #ifdef DEFAULT_TARGET_64BIT
-#define TARGET_DEFAULT             (MASK_64BIT | MASK_ZARCH | MASK_HARD_DFP | MASK_OPT_HTM)
+#define TARGET_DEFAULT     (MASK_64BIT | MASK_ZARCH | MASK_HARD_DFP	\
+                            | MASK_OPT_HTM | MASK_OPT_VX)
 #else
 #define TARGET_DEFAULT             0
 #endif
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 76dca0a..9b7c9d9 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -324,11 +324,11 @@
 ;; distinguish between g5 and g6, but there are differences between the two
 ;; CPUs could in theory be modeled.
 
-(define_attr "cpu" "g5,g6,z900,z990,z9_109,z9_ec,z10,z196,zEC12"
+(define_attr "cpu" "g5,g6,z900,z990,z9_109,z9_ec,z10,z196,zEC12,z13"
   (const (symbol_ref "s390_tune_attr")))
 
 (define_attr "cpu_facility"
-  "standard,ieee,zarch,cpu_zarch,longdisp,extimm,dfp,z10,z196,zEC12"
+  "standard,ieee,zarch,cpu_zarch,longdisp,extimm,dfp,z10,z196,zEC12,vec"
   (const_string "standard"))
 
 (define_attr "enabled" ""
@@ -369,6 +369,10 @@
 
          (and (eq_attr "cpu_facility" "zEC12")
               (match_test "TARGET_ZEC12"))
+	 (const_int 1)
+
+         (and (eq_attr "cpu_facility" "vec")
+              (match_test "TARGET_VX"))
 	 (const_int 1)]
 	(const_int 0)))
 
diff --git a/gcc/config/s390/s390.opt b/gcc/config/s390/s390.opt
index 22f1ff5..0ff897b 100644
--- a/gcc/config/s390/s390.opt
+++ b/gcc/config/s390/s390.opt
@@ -76,6 +76,9 @@ Enum(processor_type) String(z196) Value(PROCESSOR_2817_Z196)
 EnumValue
 Enum(processor_type) String(zEC12) Value(PROCESSOR_2827_ZEC12)
 
+EnumValue
+Enum(processor_type) String(z13) Value(PROCESSOR_2964_Z13)
+
 mbackchain
 Target Report Mask(BACKCHAIN)
 Maintain backchain pointer
@@ -118,6 +121,10 @@ mhtm
 Target Report Mask(OPT_HTM)
 Use hardware transactional execution instructions
 
+mvx
+Target Report Mask(OPT_VX)
+Use hardware vector facility instructions and enable the vector ABI
+
 mpacked-stack
 Target Report Mask(PACKED_STACK)
 Use packed stack layout
-- 
1.7.9.5

* [PATCH 07/13] S/390 Add vector scalar instruction support.
  2015-05-11 13:23 [PATCH 00/13] S/390 Implement support for IBM z13 Andreas Krebbel
                   ` (6 preceding siblings ...)
  2015-05-11 13:24 ` [PATCH 13/13] S/390 Invalid vector binary ops Andreas Krebbel
@ 2015-05-11 13:24 ` Andreas Krebbel
  2015-05-11 13:24 ` [PATCH 03/13] S/390 Fix secondary reload issue with store/load relative operands Andreas Krebbel
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-11 13:24 UTC (permalink / raw)
  To: gcc-patches

With this patch GCC makes use of the vector instructions which are
available in single element mode.  By using these instructions scalar
double operations can use 32 registers.
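
For comparisons the patch only has three compare flavours to work with
(equal, high, and high-or-equal, see the CCVEQ/CCVFH/CCVFHE modes
below), so the remaining codes are expressed by negating the result
and/or swapping the operands.  The following is a minimal sketch of
that idea in plain C; the enum and function names are invented for
this example, and it models the mapping used in
s390_expand_vec_compare_scalar rather than the RTL itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical comparison codes mirroring the RTL codes involved.  */
enum cmp { CMP_EQ, CMP_NE, CMP_GT, CMP_GE, CMP_UNLE, CMP_UNLT,
	   CMP_LT, CMP_LE, CMP_UNGT, CMP_UNGE };

/* Reference semantics: UNxx is true if either operand is a NaN.  */
static bool
eval_direct (enum cmp c, double a, double b)
{
  bool un = (a != a) || (b != b);

  switch (c)
    {
    case CMP_EQ:   return a == b;
    case CMP_NE:   return a != b;
    case CMP_GT:   return a > b;
    case CMP_GE:   return a >= b;
    case CMP_LT:   return a < b;
    case CMP_LE:   return a <= b;
    case CMP_UNLE: return un || a <= b;
    case CMP_UNLT: return un || a < b;
    case CMP_UNGT: return un || a > b;
    case CMP_UNGE: return un || a >= b;
    }
  return false;
}

/* Same result using only ==, > and >= plus negation and swapping.  */
static bool
eval_via_three_compares (enum cmp c, double a, double b)
{
  switch (c)
    {
    case CMP_EQ:   return a == b;        /* equal */
    case CMP_NE:   return !(a == b);
    case CMP_GT:   return a > b;         /* high */
    case CMP_UNLE: return !(a > b);
    case CMP_GE:   return a >= b;        /* high or equal */
    case CMP_UNLT: return !(a >= b);
    case CMP_LT:   return b > a;         /* swapped, high */
    case CMP_UNGE: return !(b > a);
    case CMP_LE:   return b >= a;        /* swapped, high or equal */
    case CMP_UNGT: return !(b >= a);
    }
  return false;
}

/* Compare both implementations on a few values including a NaN.  */
static void
check_compare_mapping (void)
{
  double vals[] = { -1.0, 0.0, 2.5, 0.0 / 0.0 };

  for (int c = CMP_EQ; c <= CMP_UNGE; c++)
    for (int i = 0; i < 4; i++)
      for (int j = 0; j < 4; j++)
	assert (eval_direct ((enum cmp) c, vals[i], vals[j])
		== eval_via_three_compares ((enum cmp) c, vals[i], vals[j]));
}
```

Calling check_compare_mapping () runs through all ten codes.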

gcc/
	* config/s390/s390-modes.def: Add new modes CCVEQ, CCVFH, and
	CCVFHE.
	* config/s390/s390.c (s390_match_ccmode_set): Handle new modes.
	(s390_select_ccmode): Likewise.
	(s390_canonicalize_comparison): Swap operands if necessary.
	(s390_expand_vec_compare_scalar): Expand DFmode compare using
	single element vector instructions.
	(s390_emit_compare): Call s390_expand_vec_compare_scalar.
	(s390_branch_condition_mask): Generate CC masks for the new modes.
	* config/s390/s390.md (v0, vf, vd): New mode attributes.
	(VFCMP, asm_fcmp, insn_cmp): New mode iterator and attributes.
	(*vec_cmp<insn_cmp>df_cconly, *fixuns_truncdfdi2_z13)
	(*fix_trunc<BFP:mode><GPR:mode>2_bfp, *floatunsdidf2_z13)
	(*floatuns<GPR:mode><FP:mode>2, *extendsfdf2_z13)
	(*extend<DSF:mode><BFP:mode>2): New insn definitions.
	(fix_trunc<BFP:mode><GPR:mode>2_bfp, floatuns<GPR:mode><FP:mode>2)
	(extend<DSF:mode><BFP:mode>2): Turn into expander.
	(floatdi<mode>2, truncdfsf2, add<mode>3, sub<mode>3, mul<mode>3)
	(div<mode>3, *neg<mode>2, *abs<mode>2, *negabs<mode>2)
	(sqrt<mode>2): Add vector instruction.

gcc/testsuite/
	* gcc.target/s390/vector/vec-scalar-cmp-1.c: New test.
---
 gcc/config/s390/s390-modes.def                     |   10 +
 gcc/config/s390/s390.c                             |  131 ++++++++-
 gcc/config/s390/s390.md                            |  304 ++++++++++++++------
 .../gcc.target/s390/vector/vec-scalar-cmp-1.c      |   49 ++++
 4 files changed, 403 insertions(+), 91 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-scalar-cmp-1.c

diff --git a/gcc/config/s390/s390-modes.def b/gcc/config/s390/s390-modes.def
index a40559e..26c0a81 100644
--- a/gcc/config/s390/s390-modes.def
+++ b/gcc/config/s390/s390-modes.def
@@ -84,7 +84,12 @@ Requested mode            -> Destination CC register mode
 CCS, CCU, CCT, CCSR, CCUR -> CCZ
 CCA                       -> CCAP, CCAN
 
+Vector comparison modes
 
+CCVEQ  	  EQ	  - 	       - 	   NE	      (VCEQ)
+
+CCVFH	  GT	  -   	       -   	   UNLE	      (VFCH)
+CCVFHE	  GE	  -   	       -   	   UNLT	      (VFCHE)
 *** Comments ***
 
 CCAP, CCAN
@@ -182,6 +187,11 @@ CC_MODE (CCT2);
 CC_MODE (CCT3);
 CC_MODE (CCRAW);
 
+CC_MODE (CCVEQ);
+CC_MODE (CCVFH);
+CC_MODE (CCVFHE);
+
+
 /* Vector modes.  */
 
 VECTOR_MODES (INT, 2);        /*                 V2QI */
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 11fed14..848cc0c 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -681,6 +681,9 @@ s390_match_ccmode_set (rtx set, machine_mode req_mode)
     case CCT1mode:
     case CCT2mode:
     case CCT3mode:
+    case CCVEQmode:
+    case CCVFHmode:
+    case CCVFHEmode:
       if (req_mode != set_mode)
         return 0;
       break;
@@ -781,6 +784,29 @@ s390_tm_ccmode (rtx op1, rtx op2, bool mixed)
 machine_mode
 s390_select_ccmode (enum rtx_code code, rtx op0, rtx op1)
 {
+  if (TARGET_VX
+      && register_operand (op0, DFmode)
+      && register_operand (op1, DFmode))
+    {
+      /* LT, LE, UNGT, UNGE require swapping OP0 and OP1.  Either
+	 s390_emit_compare or s390_canonicalize_comparison will take
+	 care of it.  */
+      switch (code)
+	{
+	case EQ:
+	case NE:
+	  return CCVEQmode;
+	case GT:
+	case UNLE:
+	  return CCVFHmode;
+	case GE:
+	case UNLT:
+	  return CCVFHEmode;
+	default:
+	  ;
+	}
+    }
+
   switch (code)
     {
       case EQ:
@@ -1058,8 +1084,74 @@ s390_canonicalize_comparison (int *code, rtx *op0, rtx *op1,
       rtx tem = *op0; *op0 = *op1; *op1 = tem;
       *code = (int)swap_condition ((enum rtx_code)*code);
     }
+
+  /* Using the scalar variants of vector instructions for 64 bit FP
+     comparisons might require swapping the operands.  */
+  if (TARGET_VX
+      && register_operand (*op0, DFmode)
+      && register_operand (*op1, DFmode)
+      && (*code == LT || *code == LE || *code == UNGT || *code == UNGE))
+    {
+      rtx tmp;
+
+      switch (*code)
+	{
+	case LT:   *code = GT; break;
+	case LE:   *code = GE; break;
+	case UNGT: *code = UNLE; break;
+	case UNGE: *code = UNLT; break;
+	default: ;
+	}
+      tmp = *op0; *op0 = *op1; *op1 = tmp;
+    }
 }
 
+/* Helper function for s390_emit_compare.  If possible emit a 64 bit
+   FP compare using the single element variant of vector instructions.
+   Replace CODE with the comparison code to be used in the CC reg
+   compare and return the condition code register RTX in CC.  */
+
+static bool
+s390_expand_vec_compare_scalar (enum rtx_code *code, rtx cmp1, rtx cmp2,
+				rtx *cc)
+{
+  machine_mode cmp_mode;
+  bool swap_p = false;
+
+  switch (*code)
+    {
+    case EQ:   cmp_mode = CCVEQmode;  break;
+    case NE:   cmp_mode = CCVEQmode;  break;
+    case GT:   cmp_mode = CCVFHmode;  break;
+    case GE:   cmp_mode = CCVFHEmode; break;
+    case UNLE: cmp_mode = CCVFHmode;  break;
+    case UNLT: cmp_mode = CCVFHEmode; break;
+    case LT:   cmp_mode = CCVFHmode;  *code = GT;   swap_p = true; break;
+    case LE:   cmp_mode = CCVFHEmode; *code = GE;   swap_p = true; break;
+    case UNGE: cmp_mode = CCVFHmode;  *code = UNLE; swap_p = true; break;
+    case UNGT: cmp_mode = CCVFHEmode; *code = UNLT; swap_p = true; break;
+    default: return false;
+    }
+
+  if (swap_p)
+    {
+      rtx tmp = cmp2;
+      cmp2 = cmp1;
+      cmp1 = tmp;
+    }
+  *cc = gen_rtx_REG (cmp_mode, CC_REGNUM);
+  emit_insn (gen_rtx_PARALLEL (VOIDmode,
+	       gen_rtvec (2,
+			  gen_rtx_SET (VOIDmode,
+				       *cc,
+				       gen_rtx_COMPARE (cmp_mode, cmp1,
+							cmp2)),
+			  gen_rtx_CLOBBER (VOIDmode,
+					   gen_rtx_SCRATCH (V2DImode)))));
+  return true;
+}
+
+
 /* Emit a compare instruction suitable to implement the comparison
    OP0 CODE OP1.  Return the correct condition RTL to be placed in
    the IF_THEN_ELSE of the conditional branch testing the result.  */
@@ -1070,10 +1162,18 @@ s390_emit_compare (enum rtx_code code, rtx op0, rtx op1)
   machine_mode mode = s390_select_ccmode (code, op0, op1);
   rtx cc;
 
-  /* Do not output a redundant compare instruction if a compare_and_swap
-     pattern already computed the result and the machine modes are compatible.  */
-  if (GET_MODE_CLASS (GET_MODE (op0)) == MODE_CC)
+  if (TARGET_VX
+      && register_operand (op0, DFmode)
+      && register_operand (op1, DFmode)
+      && s390_expand_vec_compare_scalar (&code, op0, op1, &cc))
+    {
+      /* Work has been done by s390_expand_vec_compare_scalar already.  */
+    }
+  else if (GET_MODE_CLASS (GET_MODE (op0)) == MODE_CC)
     {
+      /* Do not output a redundant compare instruction if a
+	 compare_and_swap pattern already computed the result and the
+	 machine modes are compatible.  */
       gcc_assert (s390_cc_modes_compatible (GET_MODE (op0), mode)
 		  == GET_MODE (op0));
       cc = op0;
@@ -1308,6 +1408,31 @@ s390_branch_condition_mask (rtx code)
         }
       break;
 
+      /* Vector comparison modes.  */
+
+    case CCVEQmode:
+      switch (GET_CODE (code))
+	{
+	case EQ:        return CC0;
+	case NE:        return CC3;
+	default:        return -1;
+	}
+      /* FP vector compare modes.  */
+
+    case CCVFHmode:
+      switch (GET_CODE (code))
+	{
+	case GT:        return CC0;
+	case UNLE:      return CC3;
+	default:        return -1;
+	}
+    case CCVFHEmode:
+      switch (GET_CODE (code))
+	{
+	case GE:        return CC0;
+	case UNLT:      return CC3;
+	default:        return -1;
+	}
     case CCRAWmode:
       switch (GET_CODE (code))
 	{
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 8680770..40be3be 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -524,6 +524,14 @@
 ;; first and the second operand match for bfp modes.
 (define_mode_attr f0 [(TF "0") (DF "0") (SF "0") (TD "f") (DD "f") (DD "f")])
 
+;; This attribute is used to merge the scalar vector instructions into
+;; the FP patterns.  For non-supported modes (all but DF) it expands
+;; to constraints which are supposed to be matched by an earlier
+;; variant.
+(define_mode_attr v0      [(TF "0") (DF "v") (SF "0") (TD "0") (DD "0") (DD "0") (TI "0") (DI "v") (SI "0")])
+(define_mode_attr vf      [(TF "f") (DF "v") (SF "f") (TD "f") (DD "f") (DD "f") (TI "f") (DI "v") (SI "f")])
+(define_mode_attr vd      [(TF "d") (DF "v") (SF "d") (TD "d") (DD "d") (DD "d") (TI "d") (DI "v") (SI "d")])
+
 ;; This attribute is used in the operand list of the instruction to have an
 ;; additional operand for the dfp instructions.
 (define_mode_attr op1 [(TF "") (DF "") (SF "")
@@ -635,6 +643,17 @@
 ;; Allow return and simple_return to be defined from a single template.
 (define_code_iterator ANY_RETURN [return simple_return])
 
+
+
+; Condition code modes generated by vector fp comparisons.  These will
+; be used also in single element mode.
+(define_mode_iterator VFCMP [CCVEQ CCVFH CCVFHE])
+; Used with VFCMP to expand part of the mnemonic
+; For fp we have a mismatch: eq in the insn name - e in asm
+(define_mode_attr asm_fcmp [(CCVEQ "e") (CCVFH "h") (CCVFHE "he")])
+(define_mode_attr insn_cmp [(CCVEQ "eq") (CCVFH "h") (CCVFHE "he")])
+
+
 (include "vector.md")
 
 ;;
@@ -1144,6 +1163,15 @@
    [(set_attr "op_type" "RRE,RXE")
     (set_attr "type"  "fsimp<mode>")])
 
+; wfcedbs, wfchdbs, wfchedbs
+(define_insn "*vec_cmp<insn_cmp>df_cconly"
+  [(set (reg:VFCMP CC_REGNUM)
+	(compare:VFCMP (match_operand:DF 0 "register_operand" "v")
+		       (match_operand:DF 1 "register_operand" "v")))
+   (clobber (match_scratch:V2DI 2 "=v"))]
+  "TARGET_Z13 && TARGET_HARD_FLOAT"
+  "wfc<asm_fcmp>dbs\t%v2,%v0,%v1"
+  [(set_attr "op_type" "VRR")])
 
 ; Compare and Branch instructions
 
@@ -4361,14 +4389,27 @@
 
 ; fixuns_trunc(tf|df|sf|td|dd)(di|si)2 instruction patterns.
 
+(define_insn "*fixuns_truncdfdi2_z13"
+  [(set (match_operand:DI                  0 "register_operand" "=d,v")
+	(unsigned_fix:DI (match_operand:DF 1 "register_operand"  "f,v")))
+   (unspec:DI [(match_operand:DI           2 "immediate_operand" "K,K")] UNSPEC_ROUND)
+   (clobber (reg:CC CC_REGNUM))]
+   "TARGET_Z13 && TARGET_HARD_FLOAT"
+   "@
+    clgdbr\t%0,%h2,%1,0
+    wclgdb\t%v0,%v1,0,%h2"
+   [(set_attr "op_type" "RRF,VRR")
+    (set_attr "type"    "ftoi")])
+
 ; clfebr, clfdbr, clfxbr, clgebr, clgdbr, clgxbr
 ;         clfdtr, clfxtr,         clgdtr, clgxtr
 (define_insn "*fixuns_trunc<FP:mode><GPR:mode>2_z196"
-  [(set (match_operand:GPR 0 "register_operand" "=r")
-	(unsigned_fix:GPR (match_operand:FP 1 "register_operand" "f")))
-   (unspec:GPR [(match_operand:GPR 2 "immediate_operand" "K")] UNSPEC_ROUND)
+  [(set (match_operand:GPR                  0 "register_operand" "=d")
+	(unsigned_fix:GPR (match_operand:FP 1 "register_operand"  "f")))
+   (unspec:GPR [(match_operand:GPR          2 "immediate_operand" "K")] UNSPEC_ROUND)
    (clobber (reg:CC CC_REGNUM))]
-   "TARGET_Z196"
+   "TARGET_Z196 && TARGET_HARD_FLOAT
+    && (!TARGET_Z13 || <GPR:MODE>mode != DImode || <FP:MODE>mode != DFmode)"
    "cl<GPR:gf><FP:xde><FP:bt>r\t%0,%h2,%1,0"
    [(set_attr "op_type" "RRF")
     (set_attr "type"    "ftoi")])
@@ -4383,18 +4424,37 @@
   DONE;
 })
 
+(define_insn "*fix_truncdfdi2_bfp_z13"
+  [(set (match_operand:DI         0 "register_operand" "=d,v")
+        (fix:DI (match_operand:DF 1 "register_operand"  "f,v")))
+   (unspec:DI [(match_operand:DI  2 "immediate_operand" "K,K")] UNSPEC_ROUND)
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_Z13 && TARGET_HARD_FLOAT"
+  "@
+   cgdbr\t%0,%h2,%1
+   wcgdb\t%v0,%v1,0,%h2"
+  [(set_attr "op_type" "RRE,VRR")
+   (set_attr "type"    "ftoi")])
+
 ; cgxbr, cgdbr, cgebr, cfxbr, cfdbr, cfebr
-(define_insn "fix_trunc<BFP:mode><GPR:mode>2_bfp"
-  [(set (match_operand:GPR 0 "register_operand" "=d")
-        (fix:GPR (match_operand:BFP 1 "register_operand" "f")))
-   (unspec:GPR [(match_operand:GPR 2 "immediate_operand" "K")] UNSPEC_ROUND)
+(define_insn "*fix_trunc<BFP:mode><GPR:mode>2_bfp"
+  [(set (match_operand:GPR          0 "register_operand" "=d")
+        (fix:GPR (match_operand:BFP 1 "register_operand"  "f")))
+   (unspec:GPR [(match_operand:GPR  2 "immediate_operand" "K")] UNSPEC_ROUND)
    (clobber (reg:CC CC_REGNUM))]
-  "TARGET_HARD_FLOAT"
+  "TARGET_HARD_FLOAT
+    && (!TARGET_VX || <GPR:MODE>mode != DImode || <BFP:MODE>mode != DFmode)"
   "c<GPR:gf><BFP:xde>br\t%0,%h2,%1"
   [(set_attr "op_type" "RRE")
    (set_attr "type"    "ftoi")])
 
-
+(define_expand "fix_trunc<BFP:mode><GPR:mode>2_bfp"
+  [(parallel
+    [(set (match_operand:GPR          0 "register_operand" "=d")
+	  (fix:GPR (match_operand:BFP 1 "register_operand"  "f")))
+     (unspec:GPR [(match_operand:GPR  2 "immediate_operand" "K")] UNSPEC_ROUND)
+     (clobber (reg:CC CC_REGNUM))])]
+  "TARGET_HARD_FLOAT")
 ;
 ; fix_trunc(td|dd)di2 instruction pattern(s).
 ;
@@ -4441,12 +4501,15 @@
 
 ; cxgbr, cdgbr, cegbr, cxgtr, cdgtr
 (define_insn "floatdi<mode>2"
-  [(set (match_operand:FP 0 "register_operand" "=f")
-        (float:FP (match_operand:DI 1 "register_operand" "d")))]
+  [(set (match_operand:FP           0 "register_operand" "=f,<vf>")
+        (float:FP (match_operand:DI 1 "register_operand"  "d,<vd>")))]
   "TARGET_ZARCH && TARGET_HARD_FLOAT"
-  "c<xde>g<bt>r\t%0,%1"
-  [(set_attr "op_type" "RRE")
-   (set_attr "type"    "itof<mode>" )])
+  "@
+   c<xde>g<bt>r\t%0,%1
+   wcdgb\t%v0,%v1,0,0"
+  [(set_attr "op_type"      "RRE,VRR")
+   (set_attr "type"         "itof<mode>" )
+   (set_attr "cpu_facility" "*,vec")])
 
 ; cxfbr, cdfbr, cefbr
 (define_insn "floatsi<mode>2"
@@ -4470,27 +4533,47 @@
 ; floatuns(si|di)(tf|df|sf|td|dd)2 instruction pattern(s).
 ;
 
+(define_insn "*floatunsdidf2_z13"
+  [(set (match_operand:DF                    0 "register_operand" "=f,v")
+        (unsigned_float:DF (match_operand:DI 1 "register_operand"  "d,v")))]
+  "TARGET_Z13 && TARGET_HARD_FLOAT"
+  "@
+   cdlgbr\t%0,0,%1,0
+   wcdlgb\t%v0,%v1,0,0"
+  [(set_attr "op_type" "RRE,VRR")
+   (set_attr "type"    "itofdf")])
+
 ; cxlgbr, cdlgbr, celgbr, cxlgtr, cdlgtr
 ; cxlfbr, cdlfbr, celfbr, cxlftr, cdlftr
-(define_insn "floatuns<GPR:mode><FP:mode>2"
-  [(set (match_operand:FP 0 "register_operand" "=f")
-        (unsigned_float:FP (match_operand:GPR 1 "register_operand" "d")))]
-  "TARGET_Z196 && TARGET_HARD_FLOAT"
+(define_insn "*floatuns<GPR:mode><FP:mode>2"
+  [(set (match_operand:FP                     0 "register_operand" "=f")
+        (unsigned_float:FP (match_operand:GPR 1 "register_operand"  "d")))]
+  "TARGET_Z196 && TARGET_HARD_FLOAT
+   && (!TARGET_VX || <FP:MODE>mode != DFmode || <GPR:MODE>mode != DImode)"
   "c<FP:xde>l<GPR:gf><FP:bt>r\t%0,0,%1,0"
   [(set_attr "op_type" "RRE")
-   (set_attr "type"    "itof<FP:mode>" )])
+   (set_attr "type"    "itof<FP:mode>")])
+
+(define_expand "floatuns<GPR:mode><FP:mode>2"
+  [(set (match_operand:FP                     0 "register_operand" "")
+        (unsigned_float:FP (match_operand:GPR 1 "register_operand" "")))]
+  "TARGET_Z196 && TARGET_HARD_FLOAT")
 
 ;
 ; truncdfsf2 instruction pattern(s).
 ;
 
 (define_insn "truncdfsf2"
-  [(set (match_operand:SF 0 "register_operand" "=f")
-        (float_truncate:SF (match_operand:DF 1 "register_operand" "f")))]
+  [(set (match_operand:SF                    0 "register_operand" "=f,v")
+        (float_truncate:SF (match_operand:DF 1 "register_operand"  "f,v")))]
   "TARGET_HARD_FLOAT"
-  "ledbr\t%0,%1"
-  [(set_attr "op_type"  "RRE")
-   (set_attr "type"   "ftruncdf")])
+  "@
+   ledbr\t%0,%1
+   wledb\t%v0,%v1,0,0" ; IEEE inexact exception not suppressed
+                       ; According to BFP rounding mode
+  [(set_attr "op_type"      "RRE,VRR")
+   (set_attr "type"         "ftruncdf")
+   (set_attr "cpu_facility" "*,vec")])
 
 ;
 ; trunctf(df|sf)2 instruction pattern(s).
@@ -4543,17 +4626,35 @@
 ; extend(sf|df)(df|tf)2 instruction pattern(s).
 ;
 
+(define_insn "*extendsfdf2_z13"
+  [(set (match_operand:DF                  0 "register_operand"     "=f,f,v")
+        (float_extend:DF (match_operand:SF 1 "nonimmediate_operand"  "f,R,v")))]
+  "TARGET_Z13 && TARGET_HARD_FLOAT"
+  "@
+   ldebr\t%0,%1
+   ldeb\t%0,%1
+   wldeb\t%v0,%v1"
+  [(set_attr "op_type" "RRE,RXE,VRR")
+   (set_attr "type"    "fsimpdf, floaddf,fsimpdf")])
+
 ; ldebr, ldeb, lxdbr, lxdb, lxebr, lxeb
-(define_insn "extend<DSF:mode><BFP:mode>2"
-  [(set (match_operand:BFP 0 "register_operand" "=f,f")
+(define_insn "*extend<DSF:mode><BFP:mode>2"
+  [(set (match_operand:BFP                   0 "register_operand"     "=f,f")
         (float_extend:BFP (match_operand:DSF 1 "nonimmediate_operand"  "f,R")))]
   "TARGET_HARD_FLOAT
-   && GET_MODE_SIZE (<BFP:MODE>mode) > GET_MODE_SIZE (<DSF:MODE>mode)"
+   && GET_MODE_SIZE (<BFP:MODE>mode) > GET_MODE_SIZE (<DSF:MODE>mode)
+   && (!TARGET_VX || <BFP:MODE>mode != DFmode || <DSF:MODE>mode != SFmode)"
   "@
    l<BFP:xde><DSF:xde>br\t%0,%1
    l<BFP:xde><DSF:xde>b\t%0,%1"
-  [(set_attr "op_type"  "RRE,RXE")
-   (set_attr "type"   "fsimp<BFP:mode>, fload<BFP:mode>")])
+  [(set_attr "op_type" "RRE,RXE")
+   (set_attr "type"    "fsimp<BFP:mode>, fload<BFP:mode>")])
+
+(define_expand "extend<DSF:mode><BFP:mode>2"
+  [(set (match_operand:BFP                   0 "register_operand"     "")
+        (float_extend:BFP (match_operand:DSF 1 "nonimmediate_operand" "")))]
+  "TARGET_HARD_FLOAT
+   && GET_MODE_SIZE (<BFP:MODE>mode) > GET_MODE_SIZE (<DSF:MODE>mode)")
 
 ;
 ; extendddtd2 and extendsddd2 instruction pattern(s).
@@ -5158,17 +5259,20 @@
 ;
 
 ; axbr, adbr, aebr, axb, adb, aeb, adtr, axtr
+; FIXME: wfadb does not clobber cc
 (define_insn "add<mode>3"
-  [(set (match_operand:FP 0 "register_operand"              "=f,   f")
-        (plus:FP (match_operand:FP 1 "nonimmediate_operand" "%<f0>,0")
-		 (match_operand:FP 2 "general_operand"      " f,<Rf>")))
+  [(set (match_operand:FP 0 "register_operand"                 "=f,   f,<vf>")
+        (plus:FP (match_operand:FP 1 "nonimmediate_operand" "%<f0>,   0,<v0>")
+		 (match_operand:FP 2 "general_operand"          "f,<Rf>,<vf>")))
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_HARD_FLOAT"
   "@
    a<xde><bt>r\t%0,<op1>%2
-   a<xde>b\t%0,%2"
-  [(set_attr "op_type"  "<RRer>,RXE")
-   (set_attr "type"     "fsimp<mode>")])
+   a<xde>b\t%0,%2
+   wfadb\t%v0,%v1,%v2"
+  [(set_attr "op_type"      "<RRer>,RXE,VRR")
+   (set_attr "type"         "fsimp<mode>")
+   (set_attr "cpu_facility" "*,*,vec")])
 
 ; axbr, adbr, aebr, axb, adb, aeb, adtr, axtr
 (define_insn "*add<mode>3_cc"
@@ -5582,16 +5686,18 @@
 
 ; sxbr, sdbr, sebr, sdb, seb, sxtr, sdtr
 (define_insn "sub<mode>3"
-  [(set (match_operand:FP 0 "register_operand"            "=f,  f")
-        (minus:FP (match_operand:FP 1 "register_operand" "<f0>,0")
-                  (match_operand:FP 2 "general_operand"  "f,<Rf>")))
+  [(set (match_operand:FP           0 "register_operand"   "=f,   f,<vf>")
+        (minus:FP (match_operand:FP 1 "register_operand" "<f0>,   0,<v0>")
+                  (match_operand:FP 2 "general_operand"     "f,<Rf>,<vf>")))
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_HARD_FLOAT"
   "@
    s<xde><bt>r\t%0,<op1>%2
-   s<xde>b\t%0,%2"
-  [(set_attr "op_type"  "<RRer>,RXE")
-   (set_attr "type"     "fsimp<mode>")])
+   s<xde>b\t%0,%2
+   wfsdb\t%v0,%v1,%v2"
+  [(set_attr "op_type"      "<RRer>,RXE,VRR")
+   (set_attr "type"         "fsimp<mode>")
+   (set_attr "cpu_facility" "*,*,vec")])
 
 ; sxbr, sdbr, sebr, sdb, seb, sxtr, sdtr
 (define_insn "*sub<mode>3_cc"
@@ -5997,41 +6103,47 @@
 
 ; mxbr, mdbr, meebr, mxb, mxb, meeb, mdtr, mxtr
 (define_insn "mul<mode>3"
-  [(set (match_operand:FP 0 "register_operand"              "=f,f")
-        (mult:FP (match_operand:FP 1 "nonimmediate_operand" "%<f0>,0")
-                 (match_operand:FP 2 "general_operand"      "f,<Rf>")))]
+  [(set (match_operand:FP          0 "register_operand"        "=f,   f,<vf>")
+        (mult:FP (match_operand:FP 1 "nonimmediate_operand" "%<f0>,   0,<v0>")
+                 (match_operand:FP 2 "general_operand"          "f,<Rf>,<vf>")))]
   "TARGET_HARD_FLOAT"
   "@
    m<xdee><bt>r\t%0,<op1>%2
-   m<xdee>b\t%0,%2"
-  [(set_attr "op_type"  "<RRer>,RXE")
-   (set_attr "type"     "fmul<mode>")])
+   m<xdee>b\t%0,%2
+   wfmdb\t%v0,%v1,%v2"
+  [(set_attr "op_type"      "<RRer>,RXE,VRR")
+   (set_attr "type"         "fmul<mode>")
+   (set_attr "cpu_facility" "*,*,vec")])
 
 ; madbr, maebr, maxb, madb, maeb
 (define_insn "fma<mode>4"
-  [(set (match_operand:DSF 0 "register_operand" "=f,f")
-	(fma:DSF (match_operand:DSF 1 "nonimmediate_operand" "%f,f")
-		 (match_operand:DSF 2 "nonimmediate_operand" "f,R")
-		 (match_operand:DSF 3 "register_operand" "0,0")))]
+  [(set (match_operand:DSF          0 "register_operand"     "=f,f,<vf>")
+	(fma:DSF (match_operand:DSF 1 "nonimmediate_operand" "%f,f,<vf>")
+		 (match_operand:DSF 2 "nonimmediate_operand"  "f,R,<vf>")
+		 (match_operand:DSF 3 "register_operand"      "0,0,<v0>")))]
   "TARGET_HARD_FLOAT"
   "@
    ma<xde>br\t%0,%1,%2
-   ma<xde>b\t%0,%1,%2"
-  [(set_attr "op_type"  "RRE,RXE")
-   (set_attr "type"     "fmadd<mode>")])
+   ma<xde>b\t%0,%1,%2
+   wfmadb\t%v0,%v1,%v2,%v3"
+  [(set_attr "op_type"      "RRE,RXE,VRR")
+   (set_attr "type"         "fmadd<mode>")
+   (set_attr "cpu_facility" "*,*,vec")])
 
 ; msxbr, msdbr, msebr, msxb, msdb, mseb
 (define_insn "fms<mode>4"
-  [(set (match_operand:DSF 0 "register_operand" "=f,f")
-	(fma:DSF (match_operand:DSF 1 "nonimmediate_operand" "%f,f")
-		 (match_operand:DSF 2 "nonimmediate_operand" "f,R")
-		 (neg:DSF (match_operand:DSF 3 "register_operand" "0,0"))))]
+  [(set (match_operand:DSF                   0 "register_operand"     "=f,f,<vf>")
+	(fma:DSF (match_operand:DSF          1 "nonimmediate_operand" "%f,f,<vf>")
+		 (match_operand:DSF          2 "nonimmediate_operand"  "f,R,<vf>")
+		 (neg:DSF (match_operand:DSF 3 "register_operand"      "0,0,<v0>"))))]
   "TARGET_HARD_FLOAT"
   "@
    ms<xde>br\t%0,%1,%2
-   ms<xde>b\t%0,%1,%2"
-  [(set_attr "op_type"  "RRE,RXE")
-   (set_attr "type"     "fmadd<mode>")])
+   ms<xde>b\t%0,%1,%2
+   wfmsdb\t%v0,%v1,%v2,%v3"
+  [(set_attr "op_type"      "RRE,RXE,VRR")
+   (set_attr "type"         "fmadd<mode>")
+   (set_attr "cpu_facility" "*,*,vec")])
 
 ;;
 ;;- Divide and modulo instructions.
@@ -6457,15 +6569,17 @@
 
 ; dxbr, ddbr, debr, dxb, ddb, deb, ddtr, dxtr
 (define_insn "div<mode>3"
-  [(set (match_operand:FP 0 "register_operand"          "=f,f")
-        (div:FP (match_operand:FP 1 "register_operand" "<f0>,0")
-                 (match_operand:FP 2 "general_operand"  "f,<Rf>")))]
+  [(set (match_operand:FP         0 "register_operand"   "=f,   f,<vf>")
+        (div:FP (match_operand:FP 1 "register_operand" "<f0>,   0,<v0>")
+		(match_operand:FP 2 "general_operand"     "f,<Rf>,<vf>")))]
   "TARGET_HARD_FLOAT"
   "@
    d<xde><bt>r\t%0,<op1>%2
-   d<xde>b\t%0,%2"
-  [(set_attr "op_type"  "<RRer>,RXE")
-   (set_attr "type"     "fdiv<mode>")])
+   d<xde>b\t%0,%2
+   wfddb\t%v0,%v1,%v2"
+  [(set_attr "op_type"      "<RRer>,RXE,VRR")
+   (set_attr "type"         "fdiv<mode>")
+   (set_attr "cpu_facility" "*,*,vec")])
 
 
 ;;
@@ -7674,14 +7788,18 @@
    (set_attr "type"     "fsimp<mode>")])
 
 ; lcxbr, lcdbr, lcebr
+; FIXME: wflcdb does not clobber cc
 (define_insn "*neg<mode>2"
-  [(set (match_operand:BFP 0 "register_operand" "=f")
-        (neg:BFP (match_operand:BFP 1 "register_operand" "f")))
+  [(set (match_operand:BFP          0 "register_operand" "=f,<vf>")
+        (neg:BFP (match_operand:BFP 1 "register_operand"  "f,<vf>")))
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_HARD_FLOAT"
-  "lc<xde>br\t%0,%1"
-  [(set_attr "op_type"  "RRE")
-   (set_attr "type"     "fsimp<mode>")])
+  "@
+   lc<xde>br\t%0,%1
+   wflcdb\t%0,%1"
+  [(set_attr "op_type"      "RRE,VRR")
+   (set_attr "cpu_facility" "*,vec")
+   (set_attr "type"         "fsimp<mode>,*")])
 
 
 ;;
@@ -7792,14 +7910,18 @@
    (set_attr "type"     "fsimp<mode>")])
 
 ; lpxbr, lpdbr, lpebr
+; FIXME: wflpdb does not clobber cc
 (define_insn "*abs<mode>2"
-  [(set (match_operand:BFP 0 "register_operand" "=f")
-        (abs:BFP (match_operand:BFP 1 "register_operand" "f")))
+  [(set (match_operand:BFP          0 "register_operand" "=f,<vf>")
+        (abs:BFP (match_operand:BFP 1 "register_operand"  "f,<vf>")))
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_HARD_FLOAT"
-  "lp<xde>br\t%0,%1"
-  [(set_attr "op_type"  "RRE")
-   (set_attr "type"     "fsimp<mode>")])
+  "@
+    lp<xde>br\t%0,%1
+    wflpdb\t%0,%1"
+  [(set_attr "op_type"      "RRE,VRR")
+   (set_attr "cpu_facility" "*,vec")
+   (set_attr "type"         "fsimp<mode>,*")])
 
 
 ;;
@@ -7903,14 +8025,18 @@
    (set_attr "type"     "fsimp<mode>")])
 
 ; lnxbr, lndbr, lnebr
+; FIXME: wflndb does not clobber cc
 (define_insn "*negabs<mode>2"
-  [(set (match_operand:BFP 0 "register_operand" "=f")
-        (neg:BFP (abs:BFP (match_operand:BFP 1 "register_operand" "f"))))
+  [(set (match_operand:BFP                   0 "register_operand" "=f,<vf>")
+        (neg:BFP (abs:BFP (match_operand:BFP 1 "register_operand"  "f,<vf>"))))
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_HARD_FLOAT"
-  "ln<xde>br\t%0,%1"
-  [(set_attr "op_type"  "RRE")
-   (set_attr "type"     "fsimp<mode>")])
+  "@
+   ln<xde>br\t%0,%1
+   wflndb\t%0,%1"
+  [(set_attr "op_type"      "RRE,VRR")
+   (set_attr "cpu_facility" "*,vec")
+   (set_attr "type"         "fsimp<mode>,*")])
 
 ;;
 ;;- Square root instructions.
@@ -7922,14 +8048,16 @@
 
 ; sqxbr, sqdbr, sqebr, sqdb, sqeb
 (define_insn "sqrt<mode>2"
-  [(set (match_operand:BFP 0 "register_operand" "=f,f")
-	(sqrt:BFP (match_operand:BFP 1 "general_operand" "f,<Rf>")))]
+  [(set (match_operand:BFP           0 "register_operand" "=f,   f,<vf>")
+	(sqrt:BFP (match_operand:BFP 1 "general_operand"   "f,<Rf>,<vf>")))]
   "TARGET_HARD_FLOAT"
   "@
    sq<xde>br\t%0,%1
-   sq<xde>b\t%0,%1"
-  [(set_attr "op_type" "RRE,RXE")
-   (set_attr "type" "fsqrt<mode>")])
+   sq<xde>b\t%0,%1
+   wfsqdb\t%v0,%v1"
+  [(set_attr "op_type"      "RRE,RXE,VRR")
+   (set_attr "type"         "fsqrt<mode>")
+   (set_attr "cpu_facility" "*,*,vec")])
 
 
 ;;
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-scalar-cmp-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-scalar-cmp-1.c
new file mode 100644
index 0000000..9f3e2b6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-scalar-cmp-1.c
@@ -0,0 +1,49 @@
+/* Check that we use the scalar variants of vector compares.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+/* { dg-final { scan-assembler-times "wfcedbs\t%v0,%v0,%v2" 2 } } */
+/* { dg-final { scan-assembler-times "wfchdbs\t%v0,%v0,%v2" 1 } } */
+/* { dg-final { scan-assembler-times "wfchedbs\t%v0,%v2,%v0" 1 } } */
+/* { dg-final { scan-assembler-times "wfchdbs\t%v0,%v2,%v0" 1 } } */
+/* { dg-final { scan-assembler-times "wfchedbs\t%v0,%v2,%v0" 1 } } */
+/* { dg-final { scan-assembler-times "je" 5 } } */
+/* { dg-final { scan-assembler-times "jo" 1 } } */
+
+
+double
+eq (double a, double b)
+{
+  return a == b;
+}
+
+double
+ne (double a, double b)
+{
+  return a != b;
+}
+
+double
+gt (double a, double b)
+{
+  return a > b;
+}
+
+double
+ge (double a, double b)
+{
+  return a >= b;
+}
+
+double
+lt (double a, double b)
+{
+  return a < b;
+}
+
+double
+le (double a, double b)
+{
+  return a <= b;
+}
-- 
1.7.9.5
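
For readers following the scalar compare change above, here is a minimal sketch in plain C of the mapping that s390_expand_vec_compare_scalar applies. It is not part of the patch. The enum names, struct vec_cmp, canonicalize, eval_vec_cmp, reference and count_mismatches are hypothetical. The model assumes the single-element compare reports "predicate holds" as CC0 and "predicate does not hold" as CC3, which is how s390_branch_condition_mask reads the new modes.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the rtx codes and CC modes in the patch.  */
enum cmp_code { EQ, NE, GT, GE, LT, LE, UNLE, UNLT, UNGE, UNGT };
enum cc_mode { CCVEQ, CCVFH, CCVFHE };

struct vec_cmp { enum cc_mode mode; enum cmp_code code; bool swap; };

/* Mirrors the switch in s390_expand_vec_compare_scalar: pick a CC mode,
   rewrite the code, and note whether the operands must be swapped.  */
struct vec_cmp
canonicalize (enum cmp_code code)
{
  switch (code)
    {
    case EQ:   return (struct vec_cmp) { CCVEQ,  EQ,   false };
    case NE:   return (struct vec_cmp) { CCVEQ,  NE,   false };
    case GT:   return (struct vec_cmp) { CCVFH,  GT,   false };
    case GE:   return (struct vec_cmp) { CCVFHE, GE,   false };
    case UNLE: return (struct vec_cmp) { CCVFH,  UNLE, false };
    case UNLT: return (struct vec_cmp) { CCVFHE, UNLT, false };
    case LT:   return (struct vec_cmp) { CCVFH,  GT,   true };
    case LE:   return (struct vec_cmp) { CCVFHE, GE,   true };
    case UNGE: return (struct vec_cmp) { CCVFH,  UNLE, true };
    case UNGT: return (struct vec_cmp) { CCVFHE, UNLT, true };
    }
  return (struct vec_cmp) { CCVEQ, EQ, false };
}

/* Model of compare plus branch.  Assumption: EQ, GT and GE test for
   "predicate true" (CC0); NE, UNLE and UNLT test for "predicate
   false" (CC3).  */
bool
eval_vec_cmp (enum cmp_code code, double a, double b)
{
  struct vec_cmp c = canonicalize (code);
  double x = c.swap ? b : a;
  double y = c.swap ? a : b;
  bool hit;

  switch (c.mode)
    {
    case CCVEQ: hit = x == y; break;
    case CCVFH: hit = x > y;  break;
    default:    hit = x >= y; break;
    }
  return (c.code == EQ || c.code == GT || c.code == GE) ? hit : !hit;
}

/* The meaning of each original comparison code, written directly in C.  */
bool
reference (enum cmp_code code, double a, double b)
{
  switch (code)
    {
    case EQ:   return a == b;
    case NE:   return a != b;
    case GT:   return a > b;
    case GE:   return a >= b;
    case LT:   return a < b;
    case LE:   return a <= b;
    case UNLE: return !(a > b);
    case UNLT: return !(a >= b);
    case UNGE: return !(a < b);
    case UNGT: return !(a <= b);
    }
  return false;
}

/* Compare the model against the reference for ordered values and NaN.  */
int
count_mismatches (void)
{
  const double vals[] = { -1.0, 0.0, 1.0, NAN };
  int bad = 0;

  for (int code = EQ; code <= UNGT; code++)
    for (int i = 0; i < 4; i++)
      for (int j = 0; j < 4; j++)
	if (eval_vec_cmp ((enum cmp_code) code, vals[i], vals[j])
	    != reference ((enum cmp_code) code, vals[i], vals[j]))
	  bad++;
  return bad;
}
```

If the model is right, count_mismatches () finds no disagreement, which illustrates why three CC modes plus an optional operand swap are enough to cover all ten codes.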

* [PATCH 08/13] S/390 zvector builtin support.
  2015-05-11 13:23 [PATCH 00/13] S/390 Implement support for IBM z13 Andreas Krebbel
                   ` (11 preceding siblings ...)
  2015-05-11 13:24 ` [PATCH 09/13] S/390 Add zvector testcases Andreas Krebbel
@ 2015-05-11 13:41 ` Andreas Krebbel
  2015-05-11 17:07   ` Jeff Law
  12 siblings, 1 reply; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-11 13:41 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3342 bytes --]

With this patch GCC implements an Altivec style set of builtins to
make use of vector instructions in C/C++ code.  This is provided for
compatibility with the IBM XL compiler.
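
To give a rough idea of what vector code in C looks like, here is a small sketch. It deliberately does not use the builtins from vecintrin.h. Instead it uses the generic GCC vector extension (the vector_size attribute) as a portable stand-in, so it builds on any target. The names v2df, scale_add and sum2 are made up for this illustration.

```c
/* Two doubles in one 128 bit vector, declared with the generic GCC
   vector extension.  This is a stand-in, not the zvector syntax.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Element-wise a * s + b on both lanes.  */
v2df
scale_add (v2df a, v2df b, double s)
{
  v2df k = { s, s };
  return a * k + b;
}

/* Horizontal sum of the two lanes.  */
double
sum2 (v2df v)
{
  return v[0] + v[1];
}
```

With a = { 1.0, 2.0 }, b = { 0.5, 0.5 } and s = 2.0 the result vector is { 2.5, 4.5 }.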

gcc/
	* config.gcc: Add vecintrin.h to extra_headers.  Add s390-c.o to
	c_target_objs and cxx_target_objs.  Add t-s390 to tmake_file.
	* config/s390/s390-builtin-types.def: New file.
	* config/s390/s390-builtins.def: New file.
	* config/s390/s390-builtins.h: New file.
	* config/s390/s390-c.c: New file.
	* config/s390/s390-modes.def: Add modes CCVEQANY, CCVH,
	CCVHANY, CCVHU, CCVHUANY, CCVFHANY, CCVFHEANY.
	* config/s390/s390-protos.h (s390_expand_vec_compare_cc)
	(s390_cpu_cpp_builtins, s390_register_target_pragmas): Add
	prototypes.
	* config/s390/s390.c (s390-builtins.h, s390-builtins.def):
	Include.
	(flags_builtin, flags_overloaded_builtin_var, s390_builtin_types)
	(s390_builtin_fn_types, s390_builtin_decls, code_for_builtin): New
	variable definitions.
	(s390_const_operand_ok): New function.
	(s390_expand_builtin): Rewrite.
	(s390_init_builtins): New function.
	(s390_handle_vectorbool_attribute): New function.
	(s390_attribute_table): Add s390_vector_bool attribute.
	(s390_match_ccmode_set): Handle new cc modes CCVH, CCVHU.
	(s390_branch_condition_mask): Generate masks for new modes.
	(s390_expand_vec_compare_cc): New function.
	(s390_mangle_type): Add mangling for vector bool types.
	(enum s390_builtin): Remove.
	(s390_atomic_assign_expand_fenv): Rename constants for sfpc and
	efpc builtins.
	* config/s390/s390.h (TARGET_CPU_CPP_BUILTINS): Call
	s390_cpu_cpp_builtins.
	(REGISTER_TARGET_PRAGMAS): New macro.
	* config/s390/s390.md: Define more UNSPEC_VEC_* constants.
	(insn_cmp mode attribute): Add new CC modes.
	(s390_sfpc, s390_efpc): Rename patterns to sfpc and efpc.
	(lcbb): New pattern definition.
	* config/s390/s390intrin.h: Include vecintrin.h.
	* config/s390/t-s390: New file.
	* config/s390/vecintrin.h: New file.
	* config/s390/vector.md: Include vx-builtins.md.
	* config/s390/vx-builtins.md: New file.
---
 gcc/config.gcc                         |   24 +-
 gcc/config/s390/s390-builtin-types.def |  747 ++++++++++
 gcc/config/s390/s390-builtins.def      | 2486 ++++++++++++++++++++++++++++++++
 gcc/config/s390/s390-builtins.h        |  160 ++
 gcc/config/s390/s390-c.c               |  907 ++++++++++++
 gcc/config/s390/s390-modes.def         |   30 +
 gcc/config/s390/s390-protos.h          |    8 +
 gcc/config/s390/s390.c                 |  833 ++++++++---
 gcc/config/s390/s390.h                 |   27 +-
 gcc/config/s390/s390.md                |  118 +-
 gcc/config/s390/s390.opt               |    4 +
 gcc/config/s390/s390intrin.h           |    3 +
 gcc/config/s390/t-s390                 |   27 +
 gcc/config/s390/vecintrin.h            |  311 ++++
 gcc/config/s390/vector.md              |    2 +
 gcc/config/s390/vx-builtins.md         | 2081 ++++++++++++++++++++++++++
 16 files changed, 7494 insertions(+), 274 deletions(-)
 create mode 100644 gcc/config/s390/s390-builtin-types.def
 create mode 100644 gcc/config/s390/s390-builtins.def
 create mode 100644 gcc/config/s390/s390-builtins.h
 create mode 100644 gcc/config/s390/s390-c.c
 create mode 100644 gcc/config/s390/t-s390
 create mode 100644 gcc/config/s390/vecintrin.h
 create mode 100644 gcc/config/s390/vx-builtins.md

[-- Attachment #2: 0008-S-390-zvector-builtin-support.patch.gz --]
[-- Type: application/octet-stream, Size: 56973 bytes --]

* Re: [PATCH 01/13] recog: Increased max number of alternatives.
  2015-05-11 13:23 ` [PATCH 01/13] recog: Increased max number of alternatives Andreas Krebbel
@ 2015-05-11 14:01   ` Segher Boessenkool
  2015-05-11 14:46     ` Andreas Krebbel
  2015-05-11 17:03   ` Jeff Law
  2015-05-18 13:47   ` [PATCH 01/13] recog: Increased max number of alternatives - v2 Andreas Krebbel
  2 siblings, 1 reply; 37+ messages in thread
From: Segher Boessenkool @ 2015-05-11 14:01 UTC (permalink / raw)
  To: Andreas Krebbel; +Cc: gcc-patches

On Mon, May 11, 2015 at 03:23:29PM +0200, Andreas Krebbel wrote:
> With the vector facility support z13 mov patterns have more than 30
> alternatives.

Wow, that is a lot!

> --- a/gcc/recog.h
> +++ b/gcc/recog.h
> @@ -23,7 +23,7 @@ along with GCC; see the file COPYING3.  If not see
>  /* Random number that should be large enough for all purposes.  Also define
>     a type that has at least MAX_RECOG_ALTERNATIVES + 1 bits, with the extra
>     bit giving an invalid value that can be used to mean "uninitialized".  */
> -#define MAX_RECOG_ALTERNATIVES 30
> +#define MAX_RECOG_ALTERNATIVES 35
>  typedef unsigned int alternative_mask;

"int" isn't at least 36 bits.


Segher

* Re: [PATCH 01/13] recog: Increased max number of alternatives.
  2015-05-11 14:01   ` Segher Boessenkool
@ 2015-05-11 14:46     ` Andreas Krebbel
  0 siblings, 0 replies; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-11 14:46 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches

On 05/11/2015 04:01 PM, Segher Boessenkool wrote:
> On Mon, May 11, 2015 at 03:23:29PM +0200, Andreas Krebbel wrote:
>> With the vector facility support z13 mov patterns have more than 30
>> alternatives.
> 
> Wow, that is a lot!
> 
>> --- a/gcc/recog.h
>> +++ b/gcc/recog.h
>> @@ -23,7 +23,7 @@ along with GCC; see the file COPYING3.  If not see
>>  /* Random number that should be large enough for all purposes.  Also define
>>     a type that has at least MAX_RECOG_ALTERNATIVES + 1 bits, with the extra
>>     bit giving an invalid value that can be used to mean "uninitialized".  */
>> -#define MAX_RECOG_ALTERNATIVES 30
>> +#define MAX_RECOG_ALTERNATIVES 35
>>  typedef unsigned int alternative_mask;
> 
> "int" isn't at least 36 bits.

Right. That should be unsigned HOST_WIDE_INT instead.
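
A minimal sketch of the width requirement, assuming unsigned HOST_WIDE_INT is 64 bits wide on the host. uint64_t stands in for it here, and MASK_BITS, alternative_bit and mask_type_fits are hypothetical helpers that are not in recog.h:

```c
#include <limits.h>
#include <stdint.h>

#define MAX_RECOG_ALTERNATIVES 35

/* Stand-in for unsigned HOST_WIDE_INT (assumed to be 64 bits).  */
typedef uint64_t alternative_mask;

#define MASK_BITS(type) ((int) (sizeof (type) * CHAR_BIT))

/* The mask needs one bit per alternative plus the extra bit that
   marks an "uninitialized" value.  */
_Static_assert (MASK_BITS (alternative_mask) >= MAX_RECOG_ALTERNATIVES + 1,
		"alternative_mask is too narrow");

/* Mask bit for alternative N.  */
alternative_mask
alternative_bit (int n)
{
  return (alternative_mask) 1 << n;
}

/* Nonzero if a mask type of WIDTH_IN_BITS bits would be wide enough.  */
int
mask_type_fits (int width_in_bits)
{
  return width_in_bits >= MAX_RECOG_ALTERNATIVES + 1;
}
```

A 32 bit unsigned int fails this check, since 36 bits are needed.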

Thanks!

-Andreas-

* Re: [PATCH 01/13] recog: Increased max number of alternatives.
  2015-05-11 13:23 ` [PATCH 01/13] recog: Increased max number of alternatives Andreas Krebbel
  2015-05-11 14:01   ` Segher Boessenkool
@ 2015-05-11 17:03   ` Jeff Law
  2015-05-18 13:47   ` [PATCH 01/13] recog: Increased max number of alternatives - v2 Andreas Krebbel
  2 siblings, 0 replies; 37+ messages in thread
From: Jeff Law @ 2015-05-11 17:03 UTC (permalink / raw)
  To: Andreas Krebbel, gcc-patches

On 05/11/2015 07:23 AM, Andreas Krebbel wrote:
> With the vector facility support z13 mov patterns have more than 30
> alternatives.
>
> gcc/
> 	* recog.h: Increase MAX_RECOG_ALTERNATIVES.
> ---
>   gcc/recog.h |    2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/recog.h b/gcc/recog.h
> index 8a38b26..4d8ca0c 100644
> --- a/gcc/recog.h
> +++ b/gcc/recog.h
> @@ -23,7 +23,7 @@ along with GCC; see the file COPYING3.  If not see
>   /* Random number that should be large enough for all purposes.  Also define
>      a type that has at least MAX_RECOG_ALTERNATIVES + 1 bits, with the extra
>      bit giving an invalid value that can be used to mean "uninitialized".  */
> -#define MAX_RECOG_ALTERNATIVES 30
> +#define MAX_RECOG_ALTERNATIVES 35
>   typedef unsigned int alternative_mask;
Yow! Won't this break on 32bit integer targets?

Which leads me to ponder how difficult it would be to catch this in the 
generators so that a port which had too many alternatives would at least 
issue a sensible error during the build process.
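
A rough sketch of what such a check could look like. This is not code from the generators. It assumes the check runs on the constraint string of a single operand, where alternatives are separated by commas, and count_alternatives and check_alternatives are hypothetical names:

```c
#include <stddef.h>
#include <stdio.h>

#define MAX_RECOG_ALTERNATIVES 35

/* Count the comma-separated alternatives in one constraint string.
   An empty or missing string counts as zero alternatives.  */
int
count_alternatives (const char *constraint)
{
  int n;

  if (constraint == NULL || *constraint == '\0')
    return 0;

  n = 1;
  for (; *constraint != '\0'; constraint++)
    if (*constraint == ',')
      n++;
  return n;
}

/* Return 0 if the pattern is fine.  Otherwise print a diagnostic that
   names the pattern and return -1.  */
int
check_alternatives (const char *insn_name, const char *constraint)
{
  int n = count_alternatives (constraint);

  if (n > MAX_RECOG_ALTERNATIVES)
    {
      fprintf (stderr,
	       "%s: %d alternatives, but MAX_RECOG_ALTERNATIVES is %d\n",
	       insn_name, n, MAX_RECOG_ALTERNATIVES);
      return -1;
    }
  return 0;
}
```

A constraint such as "=f,f,v" has three alternatives and passes. A pattern with 36 alternatives would be rejected with a message naming the pattern.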

Jeff

* Re: [PATCH 10/13] Testsuite These testcases require disabling hardware vector support on S/390.
  2015-05-11 13:24 ` [PATCH 10/13] Testsuite These testcases require disabling hardware vector support on S/390 Andreas Krebbel
@ 2015-05-11 17:05   ` Jeff Law
  0 siblings, 0 replies; 37+ messages in thread
From: Jeff Law @ 2015-05-11 17:05 UTC (permalink / raw)
  To: Andreas Krebbel, gcc-patches

On 05/11/2015 07:23 AM, Andreas Krebbel wrote:
> gcc/testsuite/
> 	* gcc.dg/tree-ssa/gen-vect-11b.c: Disable vector
> 	  instructions on s390*.
> 	* gcc.dg/tree-ssa/gen-vect-11c.c: Likewise.
Fine with me.  Seems to me that generally adding a clause like this to a 
test ought to fall in the scope of what port maintainers can do without 
approval.

Jeff

* Re: [PATCH 11/13] Testsuite S/390 vector types are only 8 byte aligned.
  2015-05-11 13:24 ` [PATCH 11/13] Testsuite S/390 vector types are only 8 byte aligned Andreas Krebbel
@ 2015-05-11 17:05   ` Jeff Law
  0 siblings, 0 replies; 37+ messages in thread
From: Jeff Law @ 2015-05-11 17:05 UTC (permalink / raw)
  To: Andreas Krebbel, gcc-patches

On 05/11/2015 07:23 AM, Andreas Krebbel wrote:
> gcc/testsuite/
> 	* lib/target-supports.exp: Vectors do not always have natural
>            alignment on s390*.
OK.
jeff

* Re: [PATCH 08/13] S/390 zvector builtin support.
  2015-05-11 13:41 ` [PATCH 08/13] S/390 zvector builtin support Andreas Krebbel
@ 2015-05-11 17:07   ` Jeff Law
  0 siblings, 0 replies; 37+ messages in thread
From: Jeff Law @ 2015-05-11 17:07 UTC (permalink / raw)
  To: Andreas Krebbel, gcc-patches

On 05/11/2015 07:40 AM, Andreas Krebbel wrote:
> With this patch GCC implements an Altivec style set of builtins to
> make use of vector instructions in C/C++ code.  This is provided for
> compatibility with the IBM XL compiler.
>
> gcc/
> 	* config.gcc: Add vecintrin.h to extra_headers.  Add s390-c.o to
> 	c_target_objs and cxx_target_objs.  Add t-s390 to tmake_file.
> 	* config/s390/s390-builtin-types.def: New file.
> 	* config/s390/s390-builtins.def: New file.
> 	* config/s390/s390-builtins.h: New file.
> 	* config/s390/s390-c.c: New file.
> 	* config/s390/s390-modes.def: Add modes CCVEQANY, CCVH,
> 	CCVHANY, CCVHU, CCVHUANY, CCVFHANY, CCVFHEANY.
> 	* config/s390/s390-protos.h (s390_expand_vec_compare_cc)
> 	(s390_cpu_cpp_builtins, s390_register_target_pragmas): Add
> 	prototypes.
> 	* config/s390/s390.c (s390-builtins.h, s390-builtins.def):
> 	Include.
> 	(flags_builtin, flags_overloaded_builtin_var, s390_builtin_types)
> 	(s390_builtin_fn_types, s390_builtin_decls, code_for_builtin): New
> 	variable definitions.
> 	(s390_const_operand_ok): New function.
> 	(s390_expand_builtin): Rewrite.
> 	(s390_init_builtins): New function.
> 	(s390_handle_vectorbool_attribute): New function.
> 	(s390_attribute_table): Add s390_vector_bool attribute.
> 	(s390_match_ccmode_set): Handle new cc modes CCVH, CCVHU.
> 	(s390_branch_condition_mask): Generate masks for new modes.
> 	(s390_expand_vec_compare_cc): New function.
> 	(s390_mangle_type): Add mangling for vector bool types.
> 	(enum s390_builtin): Remove.
> 	(s390_atomic_assign_expand_fenv): Rename constants for sfpc and
> 	efpc builtins.
> 	* config/s390/s390.h (TARGET_CPU_CPP_BUILTINS): Call
> 	s390_cpu_cpp_builtins.
> 	(REGISTER_TARGET_PRAGMAS): New macro.
> 	* config/s390/s390.md: Define more UNSPEC_VEC_* constants.
> 	(insn_cmp mode attribute): Add new CC modes.
> 	(s390_sfpc, s390_efpc): Rename patterns to sfpc and efpc.
> 	(lcbb): New pattern definition.
> 	* config/s390/s390intrin.h: Include vecintrin.h.
> 	* config/s390/t-s390: New file.
> 	* config/s390/vecintrin.h: New file.
> 	* config/s390/vector.md: Include vx-builtins.md.
> 	* config/s390/vx-builtins.md: New file.
Just in case there's any question, the config.gcc bits here fall in the 
port maintainer's area.  So you can self-approve this entire patch :-)

jeff


* Re: [PATCH 02/13] optabs: Fix vec_perm -> V16QI middle end lowering.
  2015-05-11 13:24 ` [PATCH 02/13] optabs: Fix vec_perm -> V16QI middle end lowering Andreas Krebbel
@ 2015-05-11 17:20   ` Jeff Law
  2015-05-18 17:36   ` Richard Henderson
  1 sibling, 0 replies; 37+ messages in thread
From: Jeff Law @ 2015-05-11 17:20 UTC (permalink / raw)
  To: Andreas Krebbel, gcc-patches

On 05/11/2015 07:23 AM, Andreas Krebbel wrote:
> The current implementation re-uses the location of the selection
> pattern to generate a new one.  This fails if the pattern resides in a
> read-only location.  With the patch a new temporary register is
> allocated for that purpose.
>
> gcc/
> 	* optabs.c (expand_vec_perm): Allocate a temp reg for the new
>            select pattern.
This is probably a good idea even if SEL is not in read-only memory.
When possible we still want rtl objects to have a single def.

Presumably it passes a bootstrap and regression test.

OK for the trunk.

As for the branches, that's the call of the release managers.

jeff


* [PATCH 01/13] recog: Increased max number of alternatives - v2
  2015-05-11 13:23 ` [PATCH 01/13] recog: Increased max number of alternatives Andreas Krebbel
  2015-05-11 14:01   ` Segher Boessenkool
  2015-05-11 17:03   ` Jeff Law
@ 2015-05-18 13:47   ` Andreas Krebbel
  2015-05-18 14:39     ` Richard Biener
  2 siblings, 1 reply; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-18 13:47 UTC (permalink / raw)
  To: gcc-patches

The new version also changes the type for the alternative_mask to unsigned HOST_WIDE_INT.

Bootstrapped without regressions on x86-64.

Ok to apply?

Bye,

-Andreas-

gcc/
        * recog.h: Increase MAX_RECOG_ALTERNATIVES.
---
 gcc/recog.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/recog.h b/gcc/recog.h
index 463c748..07700bd 100644
--- a/gcc/recog.h
+++ b/gcc/recog.h
@@ -23,8 +23,8 @@ along with GCC; see the file COPYING3.  If not see
 /* Random number that should be large enough for all purposes.  Also define
    a type that has at least MAX_RECOG_ALTERNATIVES + 1 bits, with the extra
    bit giving an invalid value that can be used to mean "uninitialized".  */
-#define MAX_RECOG_ALTERNATIVES 30
-typedef unsigned int alternative_mask;
+#define MAX_RECOG_ALTERNATIVES 35
+typedef unsigned HOST_WIDE_INT alternative_mask;

 /* A mask of all alternatives.  */
 #define ALL_ALTERNATIVES ((alternative_mask) -1)


* Re: [PATCH 01/13] recog: Increased max number of alternatives - v2
  2015-05-18 13:47   ` [PATCH 01/13] recog: Increased max number of alternatives - v2 Andreas Krebbel
@ 2015-05-18 14:39     ` Richard Biener
  2015-05-19  8:41       ` Andreas Krebbel
  0 siblings, 1 reply; 37+ messages in thread
From: Richard Biener @ 2015-05-18 14:39 UTC (permalink / raw)
  To: Andreas Krebbel; +Cc: GCC Patches

On Mon, May 18, 2015 at 3:41 PM, Andreas Krebbel
<krebbel@linux.vnet.ibm.com> wrote:
> The new version also changes the type for the alternative_mask to unsigned HOST_WIDE_INT.
>
> Bootstrapped without regressions on x86-64.
>
> Ok to apply?

Please use uint64_t instead.

Richard.

> Bye,
>
> -Andreas-
>
> gcc/
>         * recog.h: Increase MAX_RECOG_ALTERNATIVES.
> ---
>  gcc/recog.h |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/recog.h b/gcc/recog.h
> index 463c748..07700bd 100644
> --- a/gcc/recog.h
> +++ b/gcc/recog.h
> @@ -23,8 +23,8 @@ along with GCC; see the file COPYING3.  If not see
>  /* Random number that should be large enough for all purposes.  Also define
>     a type that has at least MAX_RECOG_ALTERNATIVES + 1 bits, with the extra
>     bit giving an invalid value that can be used to mean "uninitialized".  */
> -#define MAX_RECOG_ALTERNATIVES 30
> -typedef unsigned int alternative_mask;
> +#define MAX_RECOG_ALTERNATIVES 35
> +typedef unsigned HOST_WIDE_INT alternative_mask;
>
>  /* A mask of all alternatives.  */
>  #define ALL_ALTERNATIVES ((alternative_mask) -1)
>


* Re: [PATCH 02/13] optabs: Fix vec_perm -> V16QI middle end lowering.
  2015-05-11 13:24 ` [PATCH 02/13] optabs: Fix vec_perm -> V16QI middle end lowering Andreas Krebbel
  2015-05-11 17:20   ` Jeff Law
@ 2015-05-18 17:36   ` Richard Henderson
  2015-05-19  8:45     ` Andreas Krebbel
  1 sibling, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2015-05-18 17:36 UTC (permalink / raw)
  To: Andreas Krebbel, gcc-patches

On 05/11/2015 06:23 AM, Andreas Krebbel wrote:
> @@ -6784,14 +6784,18 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, rtx sel, rtx target)
>      {
>        /* Multiply each element by its byte size.  */
>        machine_mode selmode = GET_MODE (sel);
> +      /* We cannot re-use SEL as a temp operand since it might be in
> +	 read-only storage.  */
> +      rtx sel_reg = gen_reg_rtx (selmode);
> +
>        if (u == 2)
> -	sel = expand_simple_binop (selmode, PLUS, sel, sel,
> -				   sel, 0, OPTAB_DIRECT);
> +	sel_reg = expand_simple_binop (selmode, PLUS, sel, sel,
> +				       sel_reg, 0, OPTAB_DIRECT);
>        else

You needn't allocate sel_reg explicitly; expand_simple_binop will do that for
you if the TARGET parameter is NULL.

Thus this patch should be an 8 character change on those two calls.


r~
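
To illustrate the point for readers who do not know the expander
interface: below is a toy analogue in plain C, not GCC code.  The
names toy_binop_plus, sel_in_rodata and doubled_selector are made up
for this sketch.  It only models the convention described above, that
a NULL target makes the callee allocate its own result location, so
the caller never writes into the (possibly read-only) input.

```c
#include <stdlib.h>

/* Toy stand-in for an expander routine (hypothetical, not GCC's API):
   stores *a + *b into *target, or into a freshly allocated slot when
   target is NULL, and returns the location that holds the result.  */
int *toy_binop_plus(const int *a, const int *b, int *target)
{
    if (target == NULL)
        target = malloc(sizeof *target);
    if (target != NULL)
        *target = *a + *b;
    return target;
}

/* Plays the role of a selector that lives in read-only storage.  */
const int sel_in_rodata = 21;

/* Doubles the selector without ever using it as the output location.
   Returns -1 if the allocation failed.  */
int doubled_selector(void)
{
    int *tmp = toy_binop_plus(&sel_in_rodata, &sel_in_rodata, NULL);
    int result = (tmp != NULL) ? *tmp : -1;

    free(tmp);
    return result;
}
```

Passing &sel_in_rodata as the target would not even compile here
because of the const qualifier; in RTL there is no such check, which
is why the original code could end up writing to a read-only location.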


* Re: [PATCH 01/13] recog: Increased max number of alternatives - v2
  2015-05-18 14:39     ` Richard Biener
@ 2015-05-19  8:41       ` Andreas Krebbel
  2015-05-19 10:13         ` Richard Biener
  2015-05-22  8:26         ` Andreas Krebbel
  0 siblings, 2 replies; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-19  8:41 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On 05/18/2015 04:19 PM, Richard Biener wrote:
> On Mon, May 18, 2015 at 3:41 PM, Andreas Krebbel
> <krebbel@linux.vnet.ibm.com> wrote:
>> The new version also changes the type for the alternative_mask to unsigned HOST_WIDE_INT.
>>
>> Bootstrapped without regressions on x86-64.
>>
>> Ok to apply?
> 
> Please use uint64_t instead.

Done. Ok with that change?

-Andreas-



* Re: [PATCH 02/13] optabs: Fix vec_perm -> V16QI middle end lowering.
  2015-05-18 17:36   ` Richard Henderson
@ 2015-05-19  8:45     ` Andreas Krebbel
  2015-05-19 15:02       ` Richard Henderson
  0 siblings, 1 reply; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-19  8:45 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches

On 05/18/2015 07:35 PM, Richard Henderson wrote:
> On 05/11/2015 06:23 AM, Andreas Krebbel wrote:
>> @@ -6784,14 +6784,18 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, rtx sel, rtx target)
>>      {
>>        /* Multiply each element by its byte size.  */
>>        machine_mode selmode = GET_MODE (sel);
>> +      /* We cannot re-use SEL as a temp operand since it might be in
>> +	 read-only storage.  */
>> +      rtx sel_reg = gen_reg_rtx (selmode);
>> +
>>        if (u == 2)
>> -	sel = expand_simple_binop (selmode, PLUS, sel, sel,
>> -				   sel, 0, OPTAB_DIRECT);
>> +	sel_reg = expand_simple_binop (selmode, PLUS, sel, sel,
>> +				       sel_reg, 0, OPTAB_DIRECT);
>>        else
> 
> You needn't allocate sel_reg explicitly; expand_simple_binop will do that for
> you if the TARGET parameter is NULL.
> 
> Thus this patch should be an 8 character change on those two calls.

Right. Thanks!

Ok to apply with that change?

-Andreas-


* Re: [PATCH 01/13] recog: Increased max number of alternatives - v2
  2015-05-19  8:41       ` Andreas Krebbel
@ 2015-05-19 10:13         ` Richard Biener
  2015-05-22  8:26         ` Andreas Krebbel
  1 sibling, 0 replies; 37+ messages in thread
From: Richard Biener @ 2015-05-19 10:13 UTC (permalink / raw)
  To: Andreas Krebbel; +Cc: GCC Patches

On Tue, May 19, 2015 at 10:40 AM, Andreas Krebbel
<krebbel@linux.vnet.ibm.com> wrote:
> On 05/18/2015 04:19 PM, Richard Biener wrote:
>> On Mon, May 18, 2015 at 3:41 PM, Andreas Krebbel
>> <krebbel@linux.vnet.ibm.com> wrote:
>>> The new version also changes the type for the alternative_mask to unsigned HOST_WIDE_INT.
>>>
>>> Bootstrapped without regressions on x86-64.
>>>
>>> Ok to apply?
>>
>> Please use uint64_t instead.
>
> Done. Ok with that change?

Yes.

Richard.

> -Andreas-
>
>


* Re: [PATCH 02/13] optabs: Fix vec_perm -> V16QI middle end lowering.
  2015-05-19  8:45     ` Andreas Krebbel
@ 2015-05-19 15:02       ` Richard Henderson
  2015-05-22  8:12         ` Andreas Krebbel
  0 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2015-05-19 15:02 UTC (permalink / raw)
  To: Andreas Krebbel, gcc-patches

On 05/19/2015 01:41 AM, Andreas Krebbel wrote:
> On 05/18/2015 07:35 PM, Richard Henderson wrote:
>> On 05/11/2015 06:23 AM, Andreas Krebbel wrote:
>>> @@ -6784,14 +6784,18 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, rtx sel, rtx target)
>>>      {
>>>        /* Multiply each element by its byte size.  */
>>>        machine_mode selmode = GET_MODE (sel);
>>> +      /* We cannot re-use SEL as a temp operand since it might be in
>>> +	 read-only storage.  */
>>> +      rtx sel_reg = gen_reg_rtx (selmode);
>>> +
>>>        if (u == 2)
>>> -	sel = expand_simple_binop (selmode, PLUS, sel, sel,
>>> -				   sel, 0, OPTAB_DIRECT);
>>> +	sel_reg = expand_simple_binop (selmode, PLUS, sel, sel,
>>> +				       sel_reg, 0, OPTAB_DIRECT);
>>>        else
>>
>> You needn't allocate sel_reg explicitly; expand_simple_binop will do that for
>> you if the TARGET parameter is NULL.
>>
>> Thus this patch should be an 8 character change on those two calls.
> 
> Right. Thanks!
> 
> Ok to apply with that change?

Yes, thanks.


r~


* [PING] [RFC 12/13] S/390 Vector ABI GNU Attribute.
  2015-05-11 13:24 ` [RFC 12/13] S/390 Vector ABI GNU Attribute Andreas Krebbel
@ 2015-05-19 18:18   ` Andreas Krebbel
  0 siblings, 0 replies; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-19 18:18 UTC (permalink / raw)
  To: gcc-patches

On 05/11/2015 03:23 PM, Andreas Krebbel wrote:
> With this patch .gnu_attribute is used to mark binaries with a vector
> ABI tag.  This is required since the z13 vector support breaks the ABI
> of existing vector_size attribute generated vector types:
> 
> 1. vector_size(16) and bigger vectors are aligned to 8 byte
> boundaries (formerly vectors were always naturally aligned)
> 
> 2. vector_size(16) or smaller vectors are passed via VR if available
> or by value on the stack (formerly vectors were passed on the stack by
> reference).
> 
> The .gnu_attribute will be used by ld to emit a warning if binaries
> with incompatible ABIs are being linked together:
> https://sourceware.org/ml/binutils/2015-04/msg00316.html
> 
> And it will be used by GDB to perform inferior function calls using a
> vector ABI which fits to the binary being debugged:
> https://sourceware.org/ml/gdb-patches/2015-04/msg00833.html
> 
> The current implementation tries to only set the attribute if the
> vector types are really used in ABI relevant contexts in order to
> avoid false positives during linking.
> 
> However, this unfortunately has some limitations like in the following
> case where an ABI relevant context cannot be detected properly:
> 
> typedef int __attribute__((vector_size(16))) v4si;
> struct A
> {
>   char x;
>   v4si y;
> };
> char a[sizeof(struct A)];
> 
> The number of elements in a depends on the ABI (24 with -mvx and 32
> with -mno-vx).  However, the implementation is not able to detect this
> since the struct type is not used anywhere else and consequently does
> not survive until the checking code is able to see it.
> 
> Ideas about how to improve the implementation without creating too
> many false positives are welcome.
> 
> In particular we do not want to set the attribute for local uses of
> vector types as they would be natural for ifunc optimizations.

Any ideas on how this could be improved? This is the only patch of the IBM z13 series that I have not applied yet.

-Andreas-
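
For anyone who wants to see the layout difference from the quoted
example without an s390 machine, here is a small sketch.  It is an
approximation only: the two ABIs are emulated with explicit aligned
attributes on the typedef, and the names v4si_natural, v4si_align8,
A_natural and A_align8 are invented for the sketch.  It assumes GCC
or a compatible compiler.

```c
#include <stddef.h>

/* 16-byte vector with natural (16-byte) alignment, standing in for
   the layout described for -mno-vx.  */
typedef int v4si_natural __attribute__((vector_size(16), aligned(16)));

/* Same vector, but only 8-byte aligned, standing in for the layout
   described for -mvx.  On a typedef, aligned may lower alignment.  */
typedef int v4si_align8 __attribute__((vector_size(16), aligned(8)));

struct A_natural { char x; v4si_natural y; };
struct A_align8  { char x; v4si_align8  y; };

/* Sizes and member offsets of the two layouts.  */
size_t size_natural(void)   { return sizeof(struct A_natural); }
size_t size_align8(void)    { return sizeof(struct A_align8); }
size_t offset_natural(void) { return offsetof(struct A_natural, y); }
size_t offset_align8(void)  { return offsetof(struct A_align8, y); }
```

With 8-byte alignment y starts at offset 8 and the struct is 24 bytes;
with natural alignment y starts at offset 16 and the struct is 32
bytes.  These are the two values quoted above for the number of
elements in a.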


* Re: [PATCH 02/13] optabs: Fix vec_perm -> V16QI middle end lowering.
  2015-05-19 15:02       ` Richard Henderson
@ 2015-05-22  8:12         ` Andreas Krebbel
  0 siblings, 0 replies; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-22  8:12 UTC (permalink / raw)
  To: gcc-patches

On Tue, May 19, 2015 at 07:48:29AM -0700, Richard Henderson wrote:
> > Ok to apply with that change?
> 
> Yes, thanks.

I've applied the following.

Bye,

-Andreas-

gcc/
	* optabs.c (expand_vec_perm): Don't re-use SEL as target operand.
---
 gcc/optabs.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/optabs.c b/gcc/optabs.c
index bd03fc1..bc19029 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6796,11 +6796,11 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, rtx sel, rtx target)
       machine_mode selmode = GET_MODE (sel);
       if (u == 2)
 	sel = expand_simple_binop (selmode, PLUS, sel, sel,
-				   sel, 0, OPTAB_DIRECT);
+				   NULL, 0, OPTAB_DIRECT);
       else
 	sel = expand_simple_binop (selmode, ASHIFT, sel,
 				   GEN_INT (exact_log2 (u)),
-				   sel, 0, OPTAB_DIRECT);
+				   NULL, 0, OPTAB_DIRECT);
       gcc_assert (sel != NULL);
 
       /* Broadcast the low byte each element into each of its bytes.  */
-- 
1.7.9.5


* Re: [PATCH 01/13] recog: Increased max number of alternatives - v2
  2015-05-19  8:41       ` Andreas Krebbel
  2015-05-19 10:13         ` Richard Biener
@ 2015-05-22  8:26         ` Andreas Krebbel
  2015-06-01  8:22           ` Jakub Jelinek
  1 sibling, 1 reply; 37+ messages in thread
From: Andreas Krebbel @ 2015-05-22  8:26 UTC (permalink / raw)
  To: gcc-patches

On Tue, May 19, 2015 at 10:40:26AM +0200, Andreas Krebbel wrote:
> On 05/18/2015 04:19 PM, Richard Biener wrote:
> > Please use uint64_t instead.
> 
> Done. Ok with that change?

I've applied the following patch.

Bye,

-Andreas-

gcc/
	* recog.h: Increase MAX_RECOG_ALTERNATIVES.
	Change type of alternative_mask to uint64_t.
---
 gcc/recog.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/recog.h b/gcc/recog.h
index 463c748..3a09304 100644
--- a/gcc/recog.h
+++ b/gcc/recog.h
@@ -23,8 +23,8 @@ along with GCC; see the file COPYING3.  If not see
 /* Random number that should be large enough for all purposes.  Also define
    a type that has at least MAX_RECOG_ALTERNATIVES + 1 bits, with the extra
    bit giving an invalid value that can be used to mean "uninitialized".  */
-#define MAX_RECOG_ALTERNATIVES 30
-typedef unsigned int alternative_mask;
+#define MAX_RECOG_ALTERNATIVES 35
+typedef uint64_t alternative_mask;
 
 /* A mask of all alternatives.  */
 #define ALL_ALTERNATIVES ((alternative_mask) -1)
-- 
1.7.9.5


* Re: [PATCH 01/13] recog: Increased max number of alternatives - v2
  2015-05-22  8:26         ` Andreas Krebbel
@ 2015-06-01  8:22           ` Jakub Jelinek
  2015-06-08 13:35             ` Andreas Krebbel
  0 siblings, 1 reply; 37+ messages in thread
From: Jakub Jelinek @ 2015-06-01  8:22 UTC (permalink / raw)
  To: Andreas Krebbel; +Cc: gcc-patches

On Fri, May 22, 2015 at 09:54:00AM +0200, Andreas Krebbel wrote:
> On Tue, May 19, 2015 at 10:40:26AM +0200, Andreas Krebbel wrote:
> > On 05/18/2015 04:19 PM, Richard Biener wrote:
> > > Please use uint64_t instead.
> > 
> > Done. Ok with that change?
> 
> I've applied the following patch.

Note that on current trunk a cross compiler from x86_64-linux to
s390x-linux (admittedly just make cc1 of an older configured tree,
but with libcpp (normal and build) rebuilt) fails miserably with
genattrtab: invalid alternative specified for pattern number 1015

> 	* recog.h: Increase MAX_RECOG_ALTERNATIVES.
> 	Change type of alternative_mask to uint64_t.

From a quick look at genattrtab.c, there are many further spots
which rely on MAX_RECOG_ALTERNATIVES fitting into int bits.

With this quick patch make cc1 at least succeeds, but I have no idea
whether I've caught all the spots which work with bitmasks of alternatives.

--- gcc/genattrtab.c.jj	2015-01-09 21:59:45.000000000 +0100
+++ gcc/genattrtab.c	2015-06-01 10:15:50.797576547 +0200
@@ -230,7 +230,7 @@ static int *insn_n_alternatives;
 /* Stores, for each insn code, a bitmap that has bits on for each possible
    alternative.  */
 
-static int *insn_alternatives;
+static uint64_t *insn_alternatives;
 
 /* Used to simplify expressions.  */
 
@@ -258,7 +258,7 @@ static char *attr_printf           (unsi
   ATTRIBUTE_PRINTF_2;
 static rtx make_numeric_value      (int);
 static struct attr_desc *find_attr (const char **, int);
-static rtx mk_attr_alt             (int);
+static rtx mk_attr_alt             (uint64_t);
 static char *next_comma_elt	   (const char **);
 static rtx insert_right_side	   (enum rtx_code, rtx, rtx, int, int);
 static rtx copy_boolean		   (rtx);
@@ -769,7 +769,7 @@ check_attr_test (rtx exp, int is_const,
 	  if (attr == NULL)
 	    {
 	      if (! strcmp (XSTR (exp, 0), "alternative"))
-		return mk_attr_alt (1 << atoi (XSTR (exp, 1)));
+		return mk_attr_alt (((uint64_t) 1) << atoi (XSTR (exp, 1)));
 	      else
 		fatal ("unknown attribute `%s' in EQ_ATTR", XSTR (exp, 0));
 	    }
@@ -815,7 +815,7 @@ check_attr_test (rtx exp, int is_const,
 
 	      name_ptr = XSTR (exp, 1);
 	      while ((p = next_comma_elt (&name_ptr)) != NULL)
-		set |= 1 << atoi (p);
+		set |= ((uint64_t) 1) << atoi (p);
 
 	      return mk_attr_alt (set);
 	    }
@@ -1292,7 +1292,7 @@ static struct attr_value *
 get_attr_value (rtx value, struct attr_desc *attr, int insn_code)
 {
   struct attr_value *av;
-  int num_alt = 0;
+  uint64_t num_alt = 0;
 
   value = make_canonical (attr, value);
   if (compares_alternatives_p (value))
@@ -1934,7 +1934,7 @@ insert_right_side (enum rtx_code code, r
    This routine is passed an expression and either AND or IOR.  It returns a
    bitmask indicating which alternatives are mentioned within EXP.  */
 
-static int
+static uint64_t
 compute_alternative_mask (rtx exp, enum rtx_code code)
 {
   const char *string;
@@ -1965,15 +1965,15 @@ compute_alternative_mask (rtx exp, enum
     return 0;
 
   if (string[1] == 0)
-    return 1 << (string[0] - '0');
-  return 1 << atoi (string);
+    return ((uint64_t) 1) << (string[0] - '0');
+  return ((uint64_t) 1) << atoi (string);
 }
 
 /* Given I, a single-bit mask, return RTX to compare the `alternative'
    attribute with the value represented by that bit.  */
 
 static rtx
-make_alternative_compare (int mask)
+make_alternative_compare (uint64_t mask)
 {
   return mk_attr_alt (mask);
 }
@@ -2472,7 +2472,7 @@ attr_alt_complement (rtx s)
    in E.  */
 
 static rtx
-mk_attr_alt (int e)
+mk_attr_alt (uint64_t e)
 {
   rtx result = rtx_alloc (EQ_ATTR_ALT);
 
@@ -2499,7 +2499,7 @@ simplify_test_exp (rtx exp, int insn_cod
   struct attr_value *av;
   struct insn_ent *ie;
   struct attr_value_list *iv;
-  int i;
+  uint64_t i;
   rtx newexp = exp;
   bool left_alt, right_alt;
 
@@ -2779,7 +2779,7 @@ simplify_test_exp (rtx exp, int insn_cod
     case EQ_ATTR:
       if (XSTR (exp, 0) == alternative_name)
 	{
-	  newexp = mk_attr_alt (1 << atoi (XSTR (exp, 1)));
+	  newexp = mk_attr_alt (((uint64_t) 1) << atoi (XSTR (exp, 1)));
 	  break;
 	}
 
@@ -5263,10 +5263,11 @@ main (int argc, char **argv)
     expand_delays ();
 
   /* Make `insn_alternatives'.  */
-  insn_alternatives = oballocvec (int, insn_code_number);
+  insn_alternatives = oballocvec (uint64_t, insn_code_number);
   for (id = defs; id; id = id->next)
     if (id->insn_code >= 0)
-      insn_alternatives[id->insn_code] = (1 << id->num_alternatives) - 1;
+      insn_alternatives[id->insn_code]
+	= (((uint64_t) 1) << id->num_alternatives) - 1;
 
   /* Make `insn_n_alternatives'.  */
   insn_n_alternatives = oballocvec (int, insn_code_number);


	Jakub
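
As a standalone illustration of why the casts in the patch above
matter (a sketch only; alternative_bit and all_alternatives are
made-up helper names, not functions in genattrtab.c):

```c
#include <stdint.h>

/* Bit for alternative number ALT in a 64-bit mask.  The cast widens
   the operand before the shift.  A plain `1 << alt` is an int shift
   and is undefined for alt >= 31 when int has 32 bits, which is the
   case the patch has to cover once there can be 35 alternatives.
   Valid here for 0 <= alt < 64.  */
uint64_t alternative_bit(int alt)
{
    return ((uint64_t) 1) << alt;
}

/* Mask with the low N bits set, one bit per alternative.
   Valid for 0 <= n < 64.  */
uint64_t all_alternatives(int n)
{
    return (((uint64_t) 1) << n) - 1;
}
```

With 35 alternatives the highest bit in use is bit 34, which no longer
fits into a 32-bit int, so every variable, parameter and return type
that carries such a mask has to be widened along with the shifts.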


* [BUILDROBOT] (was: [PATCH 05/13] S/390 Vector base support.)
  2015-05-11 13:24 ` [PATCH 05/13] S/390 Vector base support Andreas Krebbel
@ 2015-06-04 23:31   ` Jan-Benedict Glaw
  2015-06-08 13:36     ` [BUILDROBOT] Andreas Krebbel
  0 siblings, 1 reply; 37+ messages in thread
From: Jan-Benedict Glaw @ 2015-06-04 23:31 UTC (permalink / raw)
  To: Andreas Krebbel; +Cc: gcc-patches


Hi Andreas,

On Mon, 2015-05-11 15:23:33 +0200, Andreas Krebbel <krebbel@linux.vnet.ibm.com> wrote:
> gcc/
> 	* config/s390/constraints.md (j00, jm1, jxx, jyy, v): New
> 	constraints.
> 	* config/s390/predicates.md (const0_operand, constm1_operand)
> 	(constable_operand): Accept vector operands.
> 	* config/s390/s390-modes.def: Add supported vector modes.
> 	* config/s390/s390-protos.h (s390_cannot_change_mode_class)
[...]

Starting with this patch, it seems my buildrobot is no longer able to
build an s390{,x}-linux compiler:

g++   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -DGENERATOR_FILE -fno-PIE -static-libstdc++ -static-libgcc   -o build/genattrtab \
    build/genattrtab.o build/rtl.o build/read-rtl.o build/ggc-none.o build/vec.o build/min-insn-modes.o build/gensupport.o build/print-rtl.o build/hash-table.o build/read-md.o build/errors.o ../build-x86_64-unknown-linux-gnu/libiberty/libiberty.a
build/genattrtab /home/jbglaw/repos/gcc/gcc/common.md /home/jbglaw/repos/gcc/gcc/config/s390/s390.md insn-conditions.md \
-Atmp-attrtab.c -Dtmp-dfatab.c -Ltmp-latencytab.c
genattrtab: invalid alternative specified for pattern number 1015
Makefile:2167: recipe for target 's-attrtab' failed
make[1]: *** [s-attrtab] Error 1
make[1]: Leaving directory '/home/jbglaw/build/s390x-linux/build-gcc/gcc'
Makefile:4119: recipe for target 'all-gcc' failed
make: *** [all-gcc] Error 2

The above error is taken from this build:

	http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=444690

MfG, JBG

-- 
      Jan-Benedict Glaw      jbglaw@lug-owl.de              +49-172-7608481
Signature of:                     Eine Freie Meinung in einem Freien Kopf
the second  :                   für einen Freien Staat voll Freier Bürger.



* Re: [PATCH 01/13] recog: Increased max number of alternatives - v2
  2015-06-01  8:22           ` Jakub Jelinek
@ 2015-06-08 13:35             ` Andreas Krebbel
  2015-06-08 13:40               ` Jakub Jelinek
  0 siblings, 1 reply; 37+ messages in thread
From: Andreas Krebbel @ 2015-06-08 13:35 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On 06/01/2015 10:22 AM, Jakub Jelinek wrote:
> On Fri, May 22, 2015 at 09:54:00AM +0200, Andreas Krebbel wrote:
>> On Tue, May 19, 2015 at 10:40:26AM +0200, Andreas Krebbel wrote:
>>> On 05/18/2015 04:19 PM, Richard Biener wrote:
>>>> Please use uint64_t instead.
>>>
>>> Done. Ok with that change?
>>
>> I've applied the following patch.
> 
> Note that on current trunk cross compiler from x86_64-linux to
> s390x-linux (admittedly just make cc1 of an older configured tree,
> but with libcpp (normal and build) rebuilt) fails miserably with
> genattrtab: invalid alternative specified for pattern number 1015
> 
>> 	* recog.h: Increase MAX_RECOG_ALTERNATIVES.
>> 	Change type of alternative_mask to uint64_t.
> 
> From quick look at genattrtab.c, there are many further spots
> which rely on MAX_RECOG_ALTERNATIVES fitting into int bits.
> 
> With this quick patch make cc1 at least succeeds, but no idea whether
> I've caught all the spots which work with bitmasks of alternatives.

I've regtested your patch on S/390 without seeing any problems. Could you please commit it to mainline?

Thanks!

Bye,

-Andreas-


* Re: [BUILDROBOT]
  2015-06-04 23:31   ` [BUILDROBOT] (was: [PATCH 05/13] S/390 Vector base support.) Jan-Benedict Glaw
@ 2015-06-08 13:36     ` Andreas Krebbel
  0 siblings, 0 replies; 37+ messages in thread
From: Andreas Krebbel @ 2015-06-08 13:36 UTC (permalink / raw)
  To: gcc-patches

On 06/05/2015 01:04 AM, Jan-Benedict Glaw wrote:
> Hi Andreas,
> 
> On Mon, 2015-05-11 15:23:33 +0200, Andreas Krebbel <krebbel@linux.vnet.ibm.com> wrote:
>> gcc/
>> 	* config/s390/constraints.md (j00, jm1, jxx, jyy, v): New
>> 	constraints.
>> 	* config/s390/predicates.md (const0_operand, constm1_operand)
>> 	(constable_operand): Accept vector operands.
>> 	* config/s390/s390-modes.def: Add supported vector modes.
>> 	* config/s390/s390-protos.h (s390_cannot_change_mode_class)
> [...]
> 
> Starting with this patch, it seems my buildrobot won't be able to
> build a s390{,x}-linux compiler:
> 
> g++   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -DGENERATOR_FILE -fno-PIE -static-libstdc++ -static-libgcc   -o build/genattrtab \
>     build/genattrtab.o build/rtl.o build/read-rtl.o build/ggc-none.o build/vec.o build/min-insn-modes.o build/gensupport.o build/print-rtl.o build/hash-table.o build/read-md.o build/errors.o ../build-x86_64-unknown-linux-gnu/libiberty/libiberty.a
> build/genattrtab /home/jbglaw/repos/gcc/gcc/common.md /home/jbglaw/repos/gcc/gcc/config/s390/s390.md insn-conditions.md \
> -Atmp-attrtab.c -Dtmp-dfatab.c -Ltmp-latencytab.c
> genattrtab: invalid alternative specified for pattern number 1015
> Makefile:2167: recipe for target 's-attrtab' failed
> make[1]: *** [s-attrtab] Error 1
> make[1]: Leaving directory '/home/jbglaw/build/s390x-linux/build-gcc/gcc'
> Makefile:4119: recipe for target 'all-gcc' failed
> make: *** [all-gcc] Error 2
> 
> The above error is taken from this build:
> 
> 	http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=444690

This has been reported by Jakub already. He also proposed a fix in his email:
https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00012.html

Bye,

-Andreas-


* Re: [PATCH 01/13] recog: Increased max number of alternatives - v2
  2015-06-08 13:35             ` Andreas Krebbel
@ 2015-06-08 13:40               ` Jakub Jelinek
  2015-07-02 11:52                 ` Andreas Krebbel
  0 siblings, 1 reply; 37+ messages in thread
From: Jakub Jelinek @ 2015-06-08 13:40 UTC (permalink / raw)
  To: Andreas Krebbel; +Cc: gcc-patches

On Mon, Jun 08, 2015 at 03:32:50PM +0200, Andreas Krebbel wrote:
> On 06/01/2015 10:22 AM, Jakub Jelinek wrote:
> > On Fri, May 22, 2015 at 09:54:00AM +0200, Andreas Krebbel wrote:
> >> On Tue, May 19, 2015 at 10:40:26AM +0200, Andreas Krebbel wrote:
> >>> On 05/18/2015 04:19 PM, Richard Biener wrote:
> >>>> Please use uint64_t instead.
> >>>
> >>> Done. Ok with that change?
> >>
> >> I've applied the following patch.
> > 
> > Note that on current trunk cross compiler from x86_64-linux to
> > s390x-linux (admittedly just make cc1 of an older configured tree,
> > but with libcpp (normal and build) rebuilt) fails miserably with
> > genattrtab: invalid alternative specified for pattern number 1015
> > 
> >> 	* recog.h: Increase MAX_RECOG_ALTERNATIVES.
> >> 	Change type of alternative_mask to uint64_t.
> > 
> > From quick look at genattrtab.c, there are many further spots
> > which rely on MAX_RECOG_ALTERNATIVES fitting into int bits.
> > 
> > With this quick patch make cc1 at least succeeds, but no idea whether
> > I've caught all the spots which work with bitmasks of alternatives.
> 
> I've regtested your patch on S/390 without seeing any problems. Could you please commit it to mainline?

Ok, I will.  Have you checked whether these are all the spots
that need changing for this in the gen* tools?
Perhaps try -fsanitize=undefined and/or valgrind.  I admit I haven't
spent too much time on it.

	Jakub


* Re: [PATCH 01/13] recog: Increased max number of alternatives - v2
  2015-06-08 13:40               ` Jakub Jelinek
@ 2015-07-02 11:52                 ` Andreas Krebbel
  0 siblings, 0 replies; 37+ messages in thread
From: Andreas Krebbel @ 2015-07-02 11:52 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On Mon, Jun 08, 2015 at 03:38:03PM +0200, Jakub Jelinek wrote:
> On Mon, Jun 08, 2015 at 03:32:50PM +0200, Andreas Krebbel wrote:
> > On 06/01/2015 10:22 AM, Jakub Jelinek wrote:
> > > On Fri, May 22, 2015 at 09:54:00AM +0200, Andreas Krebbel wrote:
> > >> On Tue, May 19, 2015 at 10:40:26AM +0200, Andreas Krebbel wrote:
> > >>> On 05/18/2015 04:19 PM, Richard Biener wrote:
> > >>>> Please use uint64_t instead.
> > >>>
> > >>> Done. Ok with that change?
> > >>
> > >> I've applied the following patch.
> > > 
> > > Note that on current trunk cross compiler from x86_64-linux to
> > > s390x-linux (admittedly just make cc1 of an older configured tree,
> > > but with libcpp (normal and build) rebuilt) fails miserably with
> > > genattrtab: invalid alternative specified for pattern number 1015
> > > 
> > >> 	* recog.h: Increase MAX_RECOG_ALTERNATIVES.
> > >> 	Change type of alternative_mask to uint64_t.
> > > 
> > > From quick look at genattrtab.c, there are many further spots
> > > which rely on MAX_RECOG_ALTERNATIVES fitting into int bits.
> > > 
> > > With this quick patch make cc1 at least succeeds, but no idea whether
> > > I've caught all the spots which work with bitmasks of alternatives.
> > 
> > I've regtested your patch on S/390 without seeing any problems. Could you please commit it to mainline?
> 
> Ok, I will.  Have you looked around if these are all the spots
> that need changing for this in the gen* tools?
> Perhaps trying -fsanitize=undefined and/or valgrind.  I admit I haven't
> spent too much time on it.

Could you please apply this to GCC 5 branch as well? I'm about to
apply the z13 backports now.

Bye,

-Andreas-
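
The quoted concern about code that relies on MAX_RECOG_ALTERNATIVES
fitting into int bits can be illustrated with a minimal sketch. The names
below (MAX_ALTS, alt_mask, alt_bit, alt_enabled) and the limit of 35 are
hypothetical stand-ins chosen for illustration; they are not the
definitions from recog.h or genattrtab.c. The point is only that once the
number of alternatives can exceed the width of int, every mask
computation has to be done in a 64-bit type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit, larger than the 32 bits of a typical int.  */
#define MAX_ALTS 35

/* Hypothetical mask type, wide enough for MAX_ALTS bits.  */
typedef uint64_t alt_mask;

/* Return the mask bit for alternative ALT.  The cast widens the
   constant before shifting.  A plain `1 << alt' shifts an int, which
   is undefined behaviour for alt >= 32 where int is 32 bits wide;
   -fsanitize=undefined reports such shifts at run time.  */
static alt_mask
alt_bit (int alt)
{
  assert (alt >= 0 && alt < MAX_ALTS);
  return (alt_mask) 1 << alt;
}

/* Return nonzero if alternative ALT is set in MASK.  */
static int
alt_enabled (alt_mask mask, int alt)
{
  return (mask & alt_bit (alt)) != 0;
}
```

With these helpers, a mask built as alt_bit (3) | alt_bit (34) keeps both
bits, whereas storing the same value in an int would lose bit 34.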


Thread overview: 37+ messages
2015-05-11 13:23 [PATCH 00/13] S/390 Implement support for IBM z13 Andreas Krebbel
2015-05-11 13:23 ` [PATCH 01/13] recog: Increased max number of alternatives Andreas Krebbel
2015-05-11 14:01   ` Segher Boessenkool
2015-05-11 14:46     ` Andreas Krebbel
2015-05-11 17:03   ` Jeff Law
2015-05-18 13:47   ` [PATCH 01/13] recog: Increased max number of alternatives - v2 Andreas Krebbel
2015-05-18 14:39     ` Richard Biener
2015-05-19  8:41       ` Andreas Krebbel
2015-05-19 10:13         ` Richard Biener
2015-05-22  8:26         ` Andreas Krebbel
2015-06-01  8:22           ` Jakub Jelinek
2015-06-08 13:35             ` Andreas Krebbel
2015-06-08 13:40               ` Jakub Jelinek
2015-07-02 11:52                 ` Andreas Krebbel
2015-05-11 13:24 ` [PATCH 11/13] Testsuite S/390 vector types are only 8 byte aligned Andreas Krebbel
2015-05-11 17:05   ` Jeff Law
2015-05-11 13:24 ` [PATCH 06/13] Vector base support - testcases Andreas Krebbel
2015-05-11 13:24 ` [PATCH 02/13] optabs: Fix vec_perm -> V16QI middle end lowering Andreas Krebbel
2015-05-11 17:20   ` Jeff Law
2015-05-18 17:36   ` Richard Henderson
2015-05-19  8:45     ` Andreas Krebbel
2015-05-19 15:02       ` Richard Henderson
2015-05-22  8:12         ` Andreas Krebbel
2015-05-11 13:24 ` [PATCH 04/13] S/390 Add -march/-mtune=z13 option Andreas Krebbel
2015-05-11 13:24 ` [PATCH 10/13] Testsuite These testcases require disabling hardware vector support on S/390 Andreas Krebbel
2015-05-11 17:05   ` Jeff Law
2015-05-11 13:24 ` [PATCH 13/13] S/390 Invalid vector binary ops Andreas Krebbel
2015-05-11 13:24 ` [PATCH 07/13] S/390 Add vector scalar instruction support Andreas Krebbel
2015-05-11 13:24 ` [PATCH 03/13] S/390 Fix secondary reload issue with store/load relative operands Andreas Krebbel
2015-05-11 13:24 ` [RFC 12/13] S/390 Vector ABI GNU Attribute Andreas Krebbel
2015-05-19 18:18   ` [PING] " Andreas Krebbel
2015-05-11 13:24 ` [PATCH 05/13] S/390 Vector base support Andreas Krebbel
2015-06-04 23:31   ` [BUILDROBOT] (was: [PATCH 05/13] S/390 Vector base support.) Jan-Benedict Glaw
2015-06-08 13:36     ` [BUILDROBOT] Andreas Krebbel
2015-05-11 13:24 ` [PATCH 09/13] S/390 Add zvector testcases Andreas Krebbel
2015-05-11 13:41 ` [PATCH 08/13] S/390 zvector builtin support Andreas Krebbel
2015-05-11 17:07   ` Jeff Law
