[Committed] S/390: Add arch12 support

public inbox for gcc-patches@gcc.gnu.org

* [Committed] S/390: Add arch12 support
@ 2017-03-24 14:11 Andreas Krebbel
  2017-03-24 14:11 ` [PATCH 07/16] S/390: Use wfc for scalar vector compares Andreas Krebbel
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:11 UTC
  To: gcc-patches

This patch series adds support for a new architecture level of S/390.

The most important feature of the new instruction set is the support
of single and extended precision floating point vector operations.

Binutils support is part of the 2.28 release:
https://sourceware.org/ml/binutils/2017-02/msg00301.html

Note: arch12 is NOT the official name of the new CPU.  It just
continues the series of archXX options supported as alternate names.
The archXX terminology refers to the edition number of the Principles
of Operation manual.  The official CPU name will be added later, while
arch12 will remain supported for backwards compatibility.

Andreas Krebbel (16):
  S/390: Rename cpu facility vec to vx.
  S/390: Improve support of 128 bit vectors in GPRs
  S/390: vec_init improvements
  S/390: movsf/sd pattern fixes.
  S/390: movdf improvements
  S/390: Move and rename vector check.
  S/390: Use wfc for scalar vector compares
  S/390: Rearrange fixuns_trunc pattern definitions.
  S/390: arch12: Add arch12 option.
  S/390: arch12: Add support for new vector bit operations.
  S/390: arch12: New vector popcount variants
  S/390: arch12: Add vllezlf instruction.
  S/390: arch12: Add indirect branch pattern
  S/390: arch12: Support the mul/add/subtract instructions.
  S/390: arch12: Support new vector floating point modes.
  S/390: arch12: New builtins.

 gcc/ChangeLog                                      |  223 ++
 gcc/common/config/s390/s390-common.c               |    5 +-
 gcc/config.gcc                                     |    2 +-
 gcc/config/s390/2964.md                            |    8 +-
 gcc/config/s390/constraints.md                     |   10 +-
 gcc/config/s390/driver-native.c                    |    3 +
 gcc/config/s390/s390-builtin-types.def             |  129 +-
 gcc/config/s390/s390-builtins.def                  | 3504 +++++++++++---------
 gcc/config/s390/s390-builtins.h                    |    2 +
 gcc/config/s390/s390-c.c                           |   41 +-
 gcc/config/s390/s390-opts.h                        |    1 +
 gcc/config/s390/s390.c                             |  206 +-
 gcc/config/s390/s390.h                             |   25 +-
 gcc/config/s390/s390.md                            |  663 ++--
 gcc/config/s390/s390.opt                           |    3 +
 gcc/config/s390/vecintrin.h                        |  125 +-
 gcc/config/s390/vector.md                          |  522 ++-
 gcc/config/s390/vx-builtins.md                     |  547 +--
 gcc/testsuite/ChangeLog                            |   62 +
 gcc/testsuite/gcc.target/s390/arch12/aghsghmgh-1.c |   23 +
 gcc/testsuite/gcc.target/s390/arch12/mul-1.c       |   30 +
 gcc/testsuite/gcc.target/s390/arch12/mul-2.c       |   16 +
 gcc/testsuite/gcc.target/s390/htm-builtins-z13-1.c |    2 +-
 gcc/testsuite/gcc.target/s390/s390.exp             |   22 +-
 .../gcc.target/s390/target-attribute/tattr-3.c     |    3 +-
 .../gcc.target/s390/target-attribute/tattr-4.c     |    6 +-
 .../s390/target-attribute/tpragma-struct-vx-1.c    |    2 +-
 .../s390/target-attribute/tpragma-struct-vx-2.c    |    2 +-
 gcc/testsuite/gcc.target/s390/vector/stpcpy-1.c    |    2 +-
 .../gcc.target/s390/vector/vec-abi-vararg-1.c      |    2 +-
 .../gcc.target/s390/vector/vec-clobber-1.c         |    2 +-
 .../gcc.target/s390/vector/vec-genbytemask-1.c     |    2 +-
 .../gcc.target/s390/vector/vec-genmask-1.c         |    2 +-
 gcc/testsuite/gcc.target/s390/vector/vec-init-2.c  |   48 +
 .../gcc.target/s390/vector/vec-nopeel-1.c          |    2 +-
 .../gcc.target/s390/vector/vec-scalar-cmp-1.c      |   31 +-
 gcc/testsuite/gcc.target/s390/vector/vec-vrepi-1.c |    2 +-
 gcc/testsuite/gcc.target/s390/vxe/bitops-1.c       |   52 +
 gcc/testsuite/gcc.target/s390/vxe/negfma-1.c       |   49 +
 gcc/testsuite/gcc.target/s390/vxe/popcount-1.c     |   88 +
 gcc/testsuite/gcc.target/s390/vxe/vllezlf-1.c      |   30 +
 gcc/testsuite/lib/target-supports.exp              |   35 +
 42 files changed, 4083 insertions(+), 2451 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/arch12/aghsghmgh-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/arch12/mul-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/arch12/mul-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-init-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/bitops-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/negfma-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/popcount-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/vllezlf-1.c

-- 
2.9.1

* [PATCH 12/16] S/390: arch12: Add vllezlf instruction.
  2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
  2017-03-24 14:11 ` [PATCH 07/16] S/390: Use wfc for scalar vector compares Andreas Krebbel
  2017-03-24 14:11 ` [PATCH 02/16] S/390: Improve support of 128 bit vectors in GPRs Andreas Krebbel
@ 2017-03-24 14:11 ` Andreas Krebbel
  2017-03-24 14:11 ` [PATCH 06/16] S/390: Move and rename vector check Andreas Krebbel
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:11 UTC
  To: gcc-patches

This adds support for the vector load logical element and zero
instruction (vllezlf) and makes sure it is used when a vector is
initialized with its first element loaded from memory and all
remaining elements set to 0.
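
As a rough illustration of the source-level shape this targets, here is
a minimal sketch using the generic GCC vector extensions.  The function
name load_first_and_zero is hypothetical and not part of the patch.
Whether vllezlf is actually emitted depends on the target options (for
example -march=arch12) and the optimization level; on other targets the
same code simply compiles to whatever the backend provides.

```c
/* Sketch only: portable GCC/Clang vector extension code.
   load_first_and_zero is a hypothetical name, not part of the patch.  */
typedef unsigned int uv4si __attribute__ ((vector_size (16)));

/* Element 0 is loaded from memory; elements 1..3 are constant zero.
   This is the initializer form the new code in s390_expand_vec_init
   looks for (first element a memory operand, all others const0).  */
uv4si
load_first_and_zero (const unsigned int *p)
{
  return (uv4si) { *p, 0, 0, 0 };
}
```

The new test vllezlf-1.c uses the same kind of initializer and scans
the assembler output for the instruction.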

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/s390.c (s390_expand_vec_init): Use vllezl
	instruction if possible.
	* config/s390/vector.md (vec_halfnumelts): New mode
	attribute.
	("*vec_vllezlf<mode>"): New pattern.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* gcc.target/s390/vxe/vllezlf-1.c: New test.
---
 gcc/ChangeLog                                 |  8 +++++++
 gcc/config/s390/s390.c                        | 28 +++++++++++++++++++++++++
 gcc/config/s390/vector.md                     | 17 +++++++++++++++
 gcc/testsuite/ChangeLog                       |  4 ++++
 gcc/testsuite/gcc.target/s390/vxe/vllezlf-1.c | 30 +++++++++++++++++++++++++++
 5 files changed, 87 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/vllezlf-1.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d516b4d..a48b743 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,13 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* config/s390/s390.c (s390_expand_vec_init): Use vllezl
+	instruction if possible.
+	* config/s390/vector.md (vec_halfnumelts): New mode
+	attribute.
+	("*vec_vllezlf<mode>"): New pattern.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* config/s390/vector.md ("popcountv16qi2", "popcountv8hi2")
 	("popcountv4si2", "popcountv2di2"): Rename to ...
 	("popcount<mode>2", "popcountv8hi2_vx", "popcountv4si2_vx")
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 416a15e..e800323 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -6552,6 +6552,34 @@ s390_expand_vec_init (rtx target, rtx vals)
       return;
     }
 
+  /* Use vector load logical element and zero.  */
+  if (TARGET_VXE && (mode == V4SImode || mode == V4SFmode))
+    {
+      bool found = true;
+
+      x = XVECEXP (vals, 0, 0);
+      if (memory_operand (x, inner_mode))
+	{
+	  for (i = 1; i < n_elts; ++i)
+	    found = found && XVECEXP (vals, 0, i) == const0_rtx;
+
+	  if (found)
+	    {
+	      machine_mode half_mode = (inner_mode == SFmode
+					? V2SFmode : V2SImode);
+	      emit_insn (gen_rtx_SET (target,
+			      gen_rtx_VEC_CONCAT (mode,
+						  gen_rtx_VEC_CONCAT (half_mode,
+								      x,
+								      const0_rtx),
+						  gen_rtx_VEC_CONCAT (half_mode,
+								      const0_rtx,
+								      const0_rtx))));
+	      return;
+	    }
+	}
+    }
+
   /* We are about to set the vector elements one by one.  Zero out the
      full register first in order to help the data flow framework to
      detect it as full VR set.  */
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index d4c0e95..6a726a3 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -44,6 +44,7 @@
 (define_mode_iterator VI_HW_HSD [V8HI  V4SI V2DI])
 (define_mode_iterator VI_HW_HS  [V8HI  V4SI])
 (define_mode_iterator VI_HW_QH  [V16QI V8HI])
+(define_mode_iterator VI_HW_4   [V4SI V4SF])
 
 ; All integer vector modes supported in a vector register + TImode
 (define_mode_iterator VIT [V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI V1DI V2DI V1TI TI])
@@ -127,6 +128,9 @@
    (V2DI "V2SI")
    (V2DF "V2SF")])
 
+(define_mode_attr vec_halfnumelts
+  [(V4SF "V2SF") (V4SI "V2SI")])
+
 ; The comparisons not setting CC iterate over the rtx code.
 (define_code_iterator VFCMP_HW_OP [eq gt ge])
 (define_code_attr asm_fcmp_op [(eq "e") (gt "h") (ge "he")])
@@ -451,6 +455,19 @@
   DONE;
 })
 
+(define_insn "*vec_vllezlf<mode>"
+  [(set (match_operand:VI_HW_4              0 "register_operand" "=v")
+	(vec_concat:VI_HW_4
+	 (vec_concat:<vec_halfnumelts>
+	  (match_operand:<non_vec> 1 "memory_operand"    "R")
+	  (const_int 0))
+	 (vec_concat:<vec_halfnumelts>
+	  (const_int 0)
+	  (const_int 0))))]
+  "TARGET_VXE"
+  "vllezlf\t%v0,%1"
+  [(set_attr "op_type" "VRX")])
+
 ; Replicate from vector element
 ; vrepb, vreph, vrepf, vrepg
 (define_insn "*vec_splat<mode>"
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 6d178c5..4efc391 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,9 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* gcc.target/s390/vxe/vllezlf-1.c: New test.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* gcc.target/s390/vxe/popcount-1.c: New test.
 
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
diff --git a/gcc/testsuite/gcc.target/s390/vxe/vllezlf-1.c b/gcc/testsuite/gcc.target/s390/vxe/vllezlf-1.c
new file mode 100644
index 0000000..14ea4f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vxe/vllezlf-1.c
@@ -0,0 +1,30 @@
+/* Make sure the vector load and zero instruction is being used for
+   initializing a 32 bit vector with the first element taken from
+   memory.  */
+
+/* { dg-do run } */
+/* { dg-options "-O3 -mzarch -march=arch12 --save-temps" } */
+/* { dg-require-effective-target s390_vxe } */
+
+typedef unsigned int       uv4si __attribute__((vector_size(16)));
+
+uv4si __attribute__((noinline))
+foo (int *a)
+{
+  return (uv4si){ *a, 0, 0, 0 };
+}
+
+int
+main ()
+{
+  int b = 4;
+  uv4si a = (uv4si){ 1, 2, 3, 4 };
+
+  a = foo (&b);
+
+  if (a[0] != 4 || a[1] != 0 || a[2] != 0 || a[3] != 0)
+    __builtin_abort ();
+
+  return 0;
+}
+/* { dg-final { scan-assembler-times "vllezlf\t%v24,0\\(%r2\\)" 1 } } */
-- 
2.9.1

* [PATCH 07/16] S/390: Use wfc for scalar vector compares
  2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
@ 2017-03-24 14:11 ` Andreas Krebbel
  2017-03-24 14:11 ` [PATCH 02/16] S/390: Improve support of 128 bit vectors in GPRs Andreas Krebbel
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:11 UTC
  To: gcc-patches

The z13 vector support also used the vector style comparison
instructions for scalar compares in vector registers.  However, it is
much simpler to use the scalar compare instruction (wfcdb) for that
purpose.  The advantage is that this instruction generates a CC result
the same way our other compares do, so a fair amount of special-case
code can be removed from the backend.
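
For orientation, a minimal sketch of the kind of scalar double compares
affected.  The function names are hypothetical.  The last function
shows, at the source level, the operand swap the removed code had to
perform: the vector style compares only provide eq, gt and ge, so
a < b had to be rewritten as b > a.  With a compare that sets the
condition code directly this rewrite is no longer needed.

```c
/* Sketch only: plain scalar compares; names are hypothetical.  */

/* Equality compare on two doubles.  */
int
cmp_eq (double a, double b)
{
  return a == b;
}

/* Less-than compare as written in the source.  */
int
cmp_lt (double a, double b)
{
  return a < b;
}

/* The same less-than compare expressed as a greater-than compare with
   swapped operands, mirroring the LT -> GT canonicalization that the
   patch removes from the backend.  */
int
cmp_lt_via_swapped_gt (double a, double b)
{
  return b > a;
}
```

Both less-than forms give the same result for ordered operands and for
NaNs, which is why the swap was a valid canonicalization.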

Regression tested on s390x.

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/2964.md: Remove the single element vector compare
	instructions which are no longer used.
	* config/s390/s390.c (s390_select_ccmode): Remove handling of
	vector CCmodes.
	(s390_canonicalize_comparison): Remove handling of DFmode
	compares.
	(s390_expand_vec_compare_scalar): Remove function.
	(s390_emit_compare): Don't call s390_expand_vec_compare_scalar.
	* config/s390/s390.md ("*vec_cmp<insn_cmp>df_cconly"): Remove
	pattern.
	("*cmp<mode>_ccs"): Add wfcdb instruction.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* gcc.target/s390/vector/vec-scalar-cmp-1.c: Adjust for the
	comparison instructions used from now on.
---
 gcc/ChangeLog                                      |  14 +++
 gcc/config/s390/2964.md                            |   8 +-
 gcc/config/s390/s390.c                             | 102 +--------------------
 gcc/config/s390/s390.md                            |  26 ++----
 gcc/testsuite/ChangeLog                            |   5 +
 .../gcc.target/s390/vector/vec-scalar-cmp-1.c      |  31 +++++--
 6 files changed, 57 insertions(+), 129 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 4dd2be6..fef571c 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,19 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* config/s390/2964.md: Remove the single element vector compare
+	instructions which are no longer used.
+	* config/s390/s390.c (s390_select_ccmode): Remove handling of
+	vector CCmodes.
+	(s390_canonicalize_comparison): Remove handling of DFmode
+	compares.
+	(s390_expand_vec_compare_scalar): Remove function.
+	(s390_emit_compare): Don't call s390_expand_vec_compare_scalar.
+	* config/s390/s390.md ("*vec_cmp<insn_cmp>df_cconly"): Remove
+	pattern.
+	("*cmp<mode>_ccs"): Add wfcdb instruction.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* config/s390/s390.md ("mov<mode>_64dfp" DD_DF): Use vleig for loading a
 	FP zero.
 	("*mov<mode>_64" DD_DF): Remove the vector instructions. These
diff --git a/gcc/config/s390/2964.md b/gcc/config/s390/2964.md
index 374e2e3..d9b6729 100644
--- a/gcc/config/s390/2964.md
+++ b/gcc/config/s390/2964.md
@@ -88,7 +88,7 @@ vsh,vsl,vsq,lxebr,cdtr,fiebr,vupllb,vupllf,vupllh,vmrhb,madbr,vtm,vmrhf,\
 vmrhg,vmrhh,axtr,fiebra,vleb,cxtr,vlef,vleg,vleh,vpkf,vpkg,vpkh,vmlob,vmlof,\
 vmloh,lxdb,ldeb,mdtr,vceqfs,adb,wflndb,lxeb,vn,vo,vchlb,vx,mxtr,vchlf,vchlg,\
 vchlh,vfcedbs,vfcedb,vceqgs,cxbr,msdbr,vcdgb,debr,vceqhs,meeb,lcxbr,vavglb,\
-vavglf,vavglg,vavglh,wfcedbs,vmrlb,vmrlf,vmrlg,vmrlh,wfchedbs,vmxb,tcdb,\
+vavglf,vavglg,vavglh,vmrlb,vmrlf,vmrlg,vmrlh,vmxb,tcdb,\
 vmahh,vsrlb,wcgdb,lcdbr,vistrbs,vrepb,wfmdb,vrepf,vrepg,vreph,ler,wcdlgb,\
 ley,vistrb,vistrf,vistrh,tceb,wfsqdb,sqeb,vsumqf,vsumqg,vesrlb,vfeezbs,\
 maebr,vesrlf,vesrlg,vesrlh,vmeb,vmef,vmeh,meebr,vflcdb,wfmadb,vperm,sxtr,\
@@ -96,7 +96,7 @@ vclzf,vgm,vgmb,vgmf,vgmg,vgmh,tdcxt,vzero,msebr,veslb,veslf,veslg,vfenezb,\
 vfenezf,vfenezh,vistrfs,vchf,vchg,vchh,vmhb,vmhf,vmhh,cdb,veslvb,ledbr,\
 veslvf,veslvg,veslvh,wclgdb,vfmdb,vmnlb,vmnlf,vmnlg,vmnlh,vclzb,vfeezfs,\
 vclzg,vclzh,mdb,vmxlb,vmxlf,vmxlg,vmxlh,ltdtr,vsbcbiq,ceb,wfddb,sebr,vistrhs,\
-lxdtr,lcebr,vab,vaf,vag,vah,ltxtr,vlpf,vlpg,vsegb,vaq,vsegf,vsegh,wfchdbs,\
+lxdtr,lcebr,vab,vaf,vag,vah,ltxtr,vlpf,vlpg,vsegb,vaq,vsegf,vsegh,\
 sdtr,cdbr,vfeezhs,le,wldeb,vfmadb,vchlbs,vacccq,vmaleb,vsel,vmalef,vmaleh,\
 vflndb,mdbr,vmlb,wflpdb,ldetr,vpksfs,vpksf,vpksg,vpksh,sqdb,mxbr,sqdbr,\
 vmaeb,veslh,vmaef,vpklsf,vpklsg,vpklsh,verllb,vchb,ddtr,verllf,verllg,verllh,\
@@ -164,7 +164,7 @@ vsl,vsq,lxebr,cdtr,fiebr,vupllb,vupllf,vupllh,vmrhb,madbr,vtm,vmrhf,vmrhg,\
 vmrhh,axtr,fiebra,vleb,cxtr,vlef,vleg,vleh,vpkf,vpkg,vpkh,vmlob,vmlof,vmloh,\
 lxdb,ldeb,vceqfs,adb,wflndb,lxeb,vn,vo,vchlb,vx,vchlf,vchlg,vchlh,vfcedbs,\
 vfcedb,vceqgs,cxbr,msdbr,vcdgb,vceqhs,meeb,lcxbr,vavglb,vavglf,vavglg,vavglh,\
-wfcedbs,vmrlb,vmrlf,vmrlg,vmrlh,wfchedbs,vmxb,tcdb,vmahh,vsrlb,wcgdb,lcdbr,\
+vmrlb,vmrlf,vmrlg,vmrlh,vmxb,tcdb,vmahh,vsrlb,wcgdb,lcdbr,\
 vistrbs,vrepb,wfmdb,vrepf,vrepg,vreph,ler,wcdlgb,ley,vistrb,vistrf,vistrh,\
 tceb,vsumqf,vsumqg,vesrlb,vfeezbs,maebr,vesrlf,vesrlg,vesrlh,vmeb,vmef,\
 vmeh,meebr,vflcdb,wfmadb,vperm,sxtr,vclzf,vgm,vgmb,vgmf,vgmg,vgmh,tdcxt,\
@@ -172,7 +172,7 @@ vzero,msebr,veslb,veslf,veslg,vfenezb,vfenezf,vfenezh,vistrfs,vchf,vchg,\
 vchh,vmhb,vmhf,vmhh,cdb,veslvb,ledbr,veslvf,veslvg,veslvh,wclgdb,vfmdb,\
 vmnlb,vmnlf,vmnlg,vmnlh,vclzb,vfeezfs,vclzg,vclzh,mdb,vmxlb,vmxlf,vmxlg,\
 vmxlh,ltdtr,vsbcbiq,ceb,sebr,vistrhs,lxdtr,lcebr,vab,vaf,vag,vah,ltxtr,\
-vlpf,vlpg,vsegb,vaq,vsegf,vsegh,wfchdbs,sdtr,cdbr,vfeezhs,le,wldeb,vfmadb,\
+vlpf,vlpg,vsegb,vaq,vsegf,vsegh,sdtr,cdbr,vfeezhs,le,wldeb,vfmadb,\
 vchlbs,vacccq,vmaleb,vsel,vmalef,vmaleh,vflndb,mdbr,vmlb,wflpdb,ldetr,vpksfs,\
 vpksf,vpksg,vpksh,vmaeb,veslh,vmaef,vpklsf,vpklsg,vpklsh,verllb,vchb,verllf,\
 verllg,verllh,wfsdb,maeb,vclgdb,vftcidb,vpksgs,vmxf,vmxg,vmxh,fidbra,vmnb,\
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 65a7546..eac39c5 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -1402,29 +1402,6 @@ s390_tm_ccmode (rtx op1, rtx op2, bool mixed)
 machine_mode
 s390_select_ccmode (enum rtx_code code, rtx op0, rtx op1)
 {
-  if (TARGET_VX
-      && register_operand (op0, DFmode)
-      && register_operand (op1, DFmode))
-    {
-      /* LT, LE, UNGT, UNGE require swapping OP0 and OP1.  Either
-	 s390_emit_compare or s390_canonicalize_comparison will take
-	 care of it.  */
-      switch (code)
-	{
-	case EQ:
-	case NE:
-	  return CCVEQmode;
-	case GT:
-	case UNLE:
-	  return CCVFHmode;
-	case GE:
-	case UNLT:
-	  return CCVFHEmode;
-	default:
-	  ;
-	}
-    }
-
   switch (code)
     {
       case EQ:
@@ -1703,26 +1680,6 @@ s390_canonicalize_comparison (int *code, rtx *op0, rtx *op1,
       *code = (int)swap_condition ((enum rtx_code)*code);
     }
 
-  /* Using the scalar variants of vector instructions for 64 bit FP
-     comparisons might require swapping the operands.  */
-  if (TARGET_VX
-      && register_operand (*op0, DFmode)
-      && register_operand (*op1, DFmode)
-      && (*code == LT || *code == LE || *code == UNGT || *code == UNGE))
-    {
-      rtx tmp;
-
-      switch (*code)
-	{
-	case LT:   *code = GT; break;
-	case LE:   *code = GE; break;
-	case UNGT: *code = UNLE; break;
-	case UNGE: *code = UNLT; break;
-	default: ;
-	}
-      tmp = *op0; *op0 = *op1; *op1 = tmp;
-    }
-
   /* A comparison result is compared against zero.  Replace it with
      the (perhaps inverted) original comparison.
      This probably should be done by simplify_relational_operation.  */
@@ -1749,56 +1706,6 @@ s390_canonicalize_comparison (int *code, rtx *op0, rtx *op1,
     }
 }
 
-/* Helper function for s390_emit_compare.  If possible emit a 64 bit
-   FP compare using the single element variant of vector instructions.
-   Replace CODE with the comparison code to be used in the CC reg
-   compare and return the condition code register RTX in CC.  */
-
-static bool
-s390_expand_vec_compare_scalar (enum rtx_code *code, rtx cmp1, rtx cmp2,
-				rtx *cc)
-{
-  machine_mode cmp_mode;
-  bool swap_p = false;
-
-  switch (*code)
-    {
-    case EQ:   cmp_mode = CCVEQmode;  break;
-    case NE:   cmp_mode = CCVEQmode;  break;
-    case GT:   cmp_mode = CCVFHmode;  break;
-    case GE:   cmp_mode = CCVFHEmode; break;
-    case UNLE: cmp_mode = CCVFHmode;  break;
-    case UNLT: cmp_mode = CCVFHEmode; break;
-    case LT:   cmp_mode = CCVFHmode;  *code = GT;   swap_p = true; break;
-    case LE:   cmp_mode = CCVFHEmode; *code = GE;   swap_p = true; break;
-    case UNGE: cmp_mode = CCVFHmode;  *code = UNLE; swap_p = true; break;
-    case UNGT: cmp_mode = CCVFHEmode; *code = UNLT; swap_p = true; break;
-    default: return false;
-    }
-
-  if (swap_p)
-    {
-      rtx tmp = cmp2;
-      cmp2 = cmp1;
-      cmp1 = tmp;
-    }
-
-  emit_insn (gen_rtx_PARALLEL (VOIDmode,
-	       gen_rtvec (2,
-			  gen_rtx_SET (gen_rtx_REG (cmp_mode, CC_REGNUM),
-				       gen_rtx_COMPARE (cmp_mode, cmp1,
-							cmp2)),
-			  gen_rtx_CLOBBER (VOIDmode,
-					   gen_rtx_SCRATCH (V2DImode)))));
-
-  /* This is the cc reg how it will be used in the cc mode consumer.
-     It either needs to be CCVFALL or CCVFANY.  However, CC1 will
-     never be set by the scalar variants.  So it actually doesn't
-     matter which one we choose here.  */
-  *cc = gen_rtx_REG (CCVFALLmode, CC_REGNUM);
-  return true;
-}
-
 
 /* Emit a compare instruction suitable to implement the comparison
    OP0 CODE OP1.  Return the correct condition RTL to be placed in
@@ -1810,14 +1717,7 @@ s390_emit_compare (enum rtx_code code, rtx op0, rtx op1)
   machine_mode mode = s390_select_ccmode (code, op0, op1);
   rtx cc;
 
-  if (TARGET_VX
-      && register_operand (op0, DFmode)
-      && register_operand (op1, DFmode)
-      && s390_expand_vec_compare_scalar (&code, op0, op1, &cc))
-    {
-      /* Work has been done by s390_expand_vec_compare_scalar already.  */
-    }
-  else if (GET_MODE_CLASS (GET_MODE (op0)) == MODE_CC)
+  if (GET_MODE_CLASS (GET_MODE (op0)) == MODE_CC)
     {
       /* Do not output a redundant compare instruction if a
 	 compare_and_swap pattern already computed the result and the
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 554fb37..e72d5be 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -1317,28 +1317,20 @@
  })
 
 
-; cxtr, cxbr, cdtr, cdbr, cebr, cdb, ceb
+; cxtr, cdtr, cxbr, cdbr, cebr, cdb, ceb, wfcdb
 (define_insn "*cmp<mode>_ccs"
   [(set (reg CC_REGNUM)
-        (compare (match_operand:FP 0 "register_operand" "f,f")
-                 (match_operand:FP 1 "general_operand"  "f,R")))]
+        (compare (match_operand:FP 0 "register_operand" "f,f,v")
+                 (match_operand:FP 1 "general_operand"  "f,R,v")))]
   "s390_match_ccmode(insn, CCSmode) && TARGET_HARD_FLOAT"
   "@
    c<xde><bt>r\t%0,%1
-   c<xde>b\t%0,%1"
-   [(set_attr "op_type" "RRE,RXE")
-    (set_attr "type"  "fsimp<mode>")
-    (set_attr "enabled" "*,<DSF>")])
-
-; wfcedbs, wfchdbs, wfchedbs
-(define_insn "*vec_cmp<insn_cmp>df_cconly"
-  [(set (reg:VFCMP CC_REGNUM)
-	(compare:VFCMP (match_operand:DF 0 "register_operand" "v")
-		       (match_operand:DF 1 "register_operand" "v")))
-   (clobber (match_scratch:V2DI 2 "=v"))]
-  "TARGET_VX && TARGET_HARD_FLOAT"
-  "wfc<asm_fcmp>dbs\t%v2,%v0,%v1"
-  [(set_attr "op_type" "VRR")])
+   c<xde>b\t%0,%1
+   wfcdb\t%0,%1"
+  [(set_attr "op_type" "RRE,RXE,VRR")
+   (set_attr "cpu_facility" "*,*,vx")
+   (set_attr "enabled" "*,<DSF>,<DFDI>")])
+
 
 ; Compare and Branch instructions
 
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 0f0877c..e6f0e2b 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,10 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* gcc.target/s390/vector/vec-scalar-cmp-1.c: Adjust for the
+	comparison instructions used from now on.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* gcc.target/s390/s390.exp (check_effective_target_vector):
 	Include target-supports.exp and move target_vector check routine
 	...
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-scalar-cmp-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-scalar-cmp-1.c
index 46a261f..ea51d0c 100644
--- a/gcc/testsuite/gcc.target/s390/vector/vec-scalar-cmp-1.c
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-scalar-cmp-1.c
@@ -6,48 +6,65 @@
 int
 eq (double a, double b)
 {
+  asm ("" : : :
+       "f0", "f1",  "f2",  "f3",  "f4" , "f5",  "f6",  "f7",
+       "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15");
   return a == b;
 }
 
-/* { dg-final { scan-assembler "eq:\n\twfcedbs\t%v\[0-9\]*,%v0,%v2\n\tlhi\t%r2,1\n\tlochine\t%r2,0" } } */
+/* { dg-final { scan-assembler "eq:\n\[^:\]*\twfcdb\t%v\[0-9\]*,%v\[0-9\]*\n\t\[^:\]+\tlochine\t%r2,0" } } */
 
 int
 ne (double a, double b)
 {
+  asm ("" : : :
+       "f0", "f1",  "f2",  "f3",  "f4" , "f5",  "f6",  "f7",
+       "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15");
   return a != b;
 }
 
-/* { dg-final { scan-assembler "ne:\n\twfcedbs\t%v\[0-9\]*,%v0,%v2\n\tlhi\t%r2,1\n\tlochie\t%r2,0" } } */
+/* { dg-final { scan-assembler "ne:\n\[^:\]*\twfcdb\t%v\[0-9\]*,%v\[0-9\]*\n\t\[^:\]+\tlochie\t%r2,0" } } */
 
 int
 gt (double a, double b)
 {
+  asm ("" : : :
+       "f0", "f1",  "f2",  "f3",  "f4" , "f5",  "f6",  "f7",
+       "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15");
   return a > b;
 }
 
-/* { dg-final { scan-assembler "gt:\n\twfchdbs\t%v\[0-9\]*,%v0,%v2\n\tlhi\t%r2,1\n\tlochine\t%r2,0" } } */
+/* { dg-final { scan-assembler "gt:\n\[^:\]*\twfcdb\t%v\[0-9\]*,%v\[0-9\]*\n\t\[^:\]+\tlochinh\t%r2,0" } } */
 
 int
 ge (double a, double b)
 {
+  asm ("" : : :
+       "f0", "f1",  "f2",  "f3",  "f4" , "f5",  "f6",  "f7",
+       "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15");
   return a >= b;
 }
 
-/* { dg-final { scan-assembler "ge:\n\twfchedbs\t%v\[0-9\]*,%v0,%v2\n\tlhi\t%r2,1\n\tlochine\t%r2,0" } } */
+/* { dg-final { scan-assembler "ge:\n\[^:\]*\twfcdb\t%v\[0-9\]*,%v\[0-9\]*\n\t\[^:\]+\tlochinhe\t%r2,0" } } */
 
 int
 lt (double a, double b)
 {
+  asm ("" : : :
+       "f0", "f1",  "f2",  "f3",  "f4" , "f5",  "f6",  "f7",
+       "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15");
   return a < b;
 }
 
-/* { dg-final { scan-assembler "lt:\n\twfchdbs\t%v\[0-9\]*,%v2,%v0\n\tlhi\t%r2,1\n\tlochine\t%r2,0" } } */
+/* { dg-final { scan-assembler "lt:\n\[^:\]*\twfcdb\t%v\[0-9\]*,%v\[0-9\]*\n\t\[^:\]+\tlochinl\t%r2,0" } } */
 
 int
 le (double a, double b)
 {
+  asm ("" : : :
+       "f0", "f1",  "f2",  "f3",  "f4" , "f5",  "f6",  "f7",
+       "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15");
   return a <= b;
 }
 
-/* { dg-final { scan-assembler "le:\n\twfchedbs\t%v\[0-9\]*,%v2,%v0\n\tlhi\t%r2,1\n\tlochine\t%r2,0" } } */
-
+/* { dg-final { scan-assembler "le:\n\[^:\]*\twfcdb\t%v\[0-9\]*,%v\[0-9\]*\n\t\[^:\]+\tlochinle\t%r2,0" } } */
-- 
2.9.1

* [PATCH 03/16] S/390: vec_init improvements
  2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
                   ` (3 preceding siblings ...)
  2017-03-24 14:11 ` [PATCH 06/16] S/390: Move and rename vector check Andreas Krebbel
@ 2017-03-24 14:11 ` Andreas Krebbel
  2017-03-24 14:13 ` [PATCH 14/16] S/390: arch12: Support the mul/add/subtract instructions Andreas Krebbel
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:11 UTC
  To: gcc-patches

This enables the vec_init pattern also for V4SF, V1TI, and V1TF.
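
A minimal sketch of initializers for two of the vector types involved,
written with the generic GCC vector extensions.  The function names are
hypothetical.  Whether a given initializer is expanded through the
vec_init pattern, and whether the two-element case with 64 bit elements
ends up as a vector load pair, depends on the target options and on
where the optimizers place the operands; that is an assumption here,
not something this sketch verifies.

```c
/* Sketch only: portable GCC/Clang vector extension code; the function
   names are hypothetical.  */
typedef float  v4sf __attribute__ ((vector_size (16)));
typedef double v2df __attribute__ ((vector_size (16)));

/* Four single precision elements taken from scalar arguments.  */
v4sf
make_v4sf (float a, float b, float c, float d)
{
  return (v4sf) { a, b, c, d };
}

/* Two 64 bit elements: the shape for which s390_expand_vec_init now
   checks GET_MODE_SIZE (inner_mode) == 8 instead of DImode only.  */
v2df
make_v2df (double first, double second)
{
  return (v2df) { first, second };
}
```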

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* gcc.target/s390/vector/vec-init-2.c: New test.

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/s390.c (s390_expand_vec_init): Enable vector load
	pair for all vector types with 64 bit elements.
	* config/s390/vx-builtins.md (V_HW_64): Move mode iterator to ...
	* config/s390/vector.md (V_HW_64): ... here.
	(V_128_NOSINGLE): New mode iterator.
	("vec_init<V_HW:mode>"): Use V_128 as mode iterator.
	("*vec_splat<mode>"): Use V_128_NOSINGLE mode iterator.
	("*vec_tf_to_v1tf", "*vec_ti_to_v1ti"): New pattern definitions.
	("*vec_load_pairv2di"): Change to ...
	("*vec_load_pair<mode>"): ... this one.
---
 gcc/ChangeLog                                     | 13 ++++
 gcc/config/s390/s390.c                            |  5 +-
 gcc/config/s390/vector.md                         | 72 +++++++++++++++++------
 gcc/config/s390/vx-builtins.md                    |  1 -
 gcc/testsuite/ChangeLog                           |  4 ++
 gcc/testsuite/gcc.target/s390/vector/vec-init-2.c | 48 +++++++++++++++
 6 files changed, 122 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-init-2.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 292e946..d29c06b 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,18 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* config/s390/s390.c (s390_expand_vec_init): Enable vector load
+	pair for all vector types with 64 bit elements.
+	* config/s390/vx-builtins.md (V_HW_64): Move mode iterator to ...
+	* config/s390/vector.md (V_HW_64): ... here.
+	(V_128_NOSINGLE): New mode iterator.
+	("vec_init<V_HW:mode>"): Use V_128 as mode iterator.
+	("*vec_splat<mode>"): Use V_128_NOSINGLE mode iterator.
+	("*vec_tf_to_v1tf", "*vec_ti_to_v1ti"): New pattern definitions.
+	("*vec_load_pairv2di"): Change to ...
+	("*vec_load_pair<mode>"): ... this one.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* config/s390/constraints.md: Add comments.
 	(jKK): Reject element sizes > 8 bytes.
 	* config/s390/s390.c (s390_split_ok_p): Enable splitting also for
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index f3cebd6..65a7546 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -6617,7 +6617,10 @@ s390_expand_vec_init (rtx target, rtx vals)
       return;
     }
 
-  if (all_regs && REG_P (target) && n_elts == 2 && inner_mode == DImode)
+  if (all_regs
+      && REG_P (target)
+      && n_elts == 2
+      && GET_MODE_SIZE (inner_mode) == 8)
     {
       /* Use vector load pair.  */
       emit_insn (gen_rtx_SET (target,
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 38905e8..7ddeb9a 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -31,6 +31,9 @@
 ; independently e.g. vcond
 (define_mode_iterator V_HW  [V16QI V8HI V4SI V2DI V2DF])
 (define_mode_iterator V_HW2 [V16QI V8HI V4SI V2DI V2DF])
+
+(define_mode_iterator V_HW_64 [V2DI V2DF])
+
 ; Including TI for instructions that support it (va, vn, ...)
 (define_mode_iterator VT_HW [V16QI V8HI V4SI V2DI V2DF V1TI TI])
 
@@ -53,6 +56,8 @@
 (define_mode_iterator V_64  [V8QI  V4HI V2SI V2SF V1DI V1DF])
 (define_mode_iterator V_128 [V16QI V8HI V4SI V4SF V2DI V2DF V1TI V1TF])
 
+(define_mode_iterator V_128_NOSINGLE [V16QI V8HI V4SI V4SF V2DI V2DF])
+
 ; A blank for vector modes and a * for TImode.  This is used to hide
 ; the TImode expander name in case it is defined already.  See addti3
 ; for an example.
@@ -437,9 +442,9 @@
   "vlgv<bhfgq>\t%0,%v1,%Y3(%2)"
   [(set_attr "op_type" "VRS")])
 
-(define_expand "vec_init<V_HW:mode>"
-  [(match_operand:V_HW 0 "register_operand" "")
-   (match_operand:V_HW 1 "nonmemory_operand" "")]
+(define_expand "vec_init<mode>"
+  [(match_operand:V_128 0 "register_operand" "")
+   (match_operand:V_128 1 "nonmemory_operand" "")]
   "TARGET_VX"
 {
   s390_expand_vec_init (operands[0], operands[1]);
@@ -449,20 +454,20 @@
 ; Replicate from vector element
 ; vrepb, vreph, vrepf, vrepg
 (define_insn "*vec_splat<mode>"
-  [(set (match_operand:V_HW   0 "register_operand" "=v")
-	(vec_duplicate:V_HW
+  [(set (match_operand:V_128_NOSINGLE   0 "register_operand" "=v")
+	(vec_duplicate:V_128_NOSINGLE
 	 (vec_select:<non_vec>
-	  (match_operand:V_HW 1 "register_operand"  "v")
+	  (match_operand:V_128_NOSINGLE 1 "register_operand"  "v")
 	  (parallel
 	   [(match_operand:QI 2 "const_mask_operand" "C")]))))]
-  "TARGET_VX && UINTVAL (operands[2]) < GET_MODE_NUNITS (<V_HW:MODE>mode)"
+  "TARGET_VX && UINTVAL (operands[2]) < GET_MODE_NUNITS (<MODE>mode)"
   "vrep<bhfgq>\t%v0,%v1,%2"
   [(set_attr "op_type" "VRI")])
 
 ; vlrepb, vlreph, vlrepf, vlrepg, vrepib, vrepih, vrepif, vrepig, vrepb, vreph, vrepf, vrepg
 (define_insn "*vec_splats<mode>"
-  [(set (match_operand:V_HW                          0 "register_operand" "=v,v,v,v")
-	(vec_duplicate:V_HW (match_operand:<non_vec> 1 "general_operand"  " R,K,v,d")))]
+  [(set (match_operand:V_128_NOSINGLE                          0 "register_operand" "=v,v,v,v")
+	(vec_duplicate:V_128_NOSINGLE (match_operand:<non_vec> 1 "general_operand"  " R,K,v,d")))]
   "TARGET_VX"
   "@
    vlrep<bhfgq>\t%v0,%1
@@ -471,18 +476,45 @@
    #"
   [(set_attr "op_type" "VRX,VRI,VRI,*")])
 
+; A TFmode operand resides in FPR register pairs while V1TF is in a
+; single vector register.
+(define_insn "*vec_tf_to_v1tf"
+  [(set (match_operand:V1TF                   0 "nonimmediate_operand" "=v,v,R,v,v")
+	(vec_duplicate:V1TF (match_operand:TF 1 "general_operand"       "v,R,v,G,d")))]
+  "TARGET_VX"
+  "@
+   vmrhg\t%v0,%1,%N1
+   vl\t%v0,%1
+   vst\t%v1,%0
+   vzero\t%v0
+   vlvgp\t%v0,%1,%N1"
+  [(set_attr "op_type" "VRR,VRX,VRX,VRI,VRR")])
+
+(define_insn "*vec_ti_to_v1ti"
+  [(set (match_operand:V1TI                   0 "nonimmediate_operand" "=v,v,R,  v,  v,v")
+	(vec_duplicate:V1TI (match_operand:TI 1 "general_operand"       "v,R,v,j00,jm1,d")))]
+  "TARGET_VX"
+  "@
+   vlr\t%v0,%v1
+   vl\t%v0,%1
+   vst\t%v1,%0
+   vzero\t%v0
+   vone\t%v0
+   vlvgp\t%v0,%1,%N1"
+  [(set_attr "op_type" "VRR,VRX,VRX,VRI,VRI,VRR")])
+
 ; vec_splats is supposed to replicate op1 into all elements of op0
 ; This splitter first sets the rightmost element of op0 to op1 and
 ; then does a vec_splat to replicate that element into all other
 ; elements.
 (define_split
-  [(set (match_operand:V_HW                          0 "register_operand" "")
-	(vec_duplicate:V_HW (match_operand:<non_vec> 1 "register_operand" "")))]
+  [(set (match_operand:V_128_NOSINGLE                          0 "register_operand" "")
+	(vec_duplicate:V_128_NOSINGLE (match_operand:<non_vec> 1 "register_operand" "")))]
   "TARGET_VX && GENERAL_REG_P (operands[1])"
   [(set (match_dup 0)
-	(unspec:V_HW [(match_dup 1) (match_dup 2) (match_dup 0)] UNSPEC_VEC_SET))
+	(unspec:V_128_NOSINGLE [(match_dup 1) (match_dup 2) (match_dup 0)] UNSPEC_VEC_SET))
    (set (match_dup 0)
-	(vec_duplicate:V_HW
+	(vec_duplicate:V_128_NOSINGLE
 	 (vec_select:<non_vec>
 	  (match_dup 0) (parallel [(match_dup 2)]))))]
 {
@@ -1129,13 +1161,15 @@
   operands[3] = gen_reg_rtx (V2DImode);
 })
 
-(define_insn "*vec_load_pairv2di"
-  [(set (match_operand:V2DI                0 "register_operand" "=v")
-	(vec_concat:V2DI (match_operand:DI 1 "register_operand"  "d")
-			 (match_operand:DI 2 "register_operand"  "d")))]
+(define_insn "*vec_load_pair<mode>"
+  [(set (match_operand:V_HW_64                       0 "register_operand" "=v,v")
+	(vec_concat:V_HW_64 (match_operand:<non_vec> 1 "register_operand"  "d,v")
+			    (match_operand:<non_vec> 2 "register_operand"  "d,v")))]
   "TARGET_VX"
-  "vlvgp\t%v0,%1,%2"
-  [(set_attr "op_type" "VRR")])
+  "@
+   vlvgp\t%v0,%1,%2
+   vmrhg\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR,VRR")])
 
 (define_insn "vllv16qi"
   [(set (match_operand:V16QI              0 "register_operand" "=v")
diff --git a/gcc/config/s390/vx-builtins.md b/gcc/config/s390/vx-builtins.md
index 6aff378..48164da 100644
--- a/gcc/config/s390/vx-builtins.md
+++ b/gcc/config/s390/vx-builtins.md
@@ -20,7 +20,6 @@
 
 ; The patterns in this file are enabled with -mzvector
 
-(define_mode_iterator V_HW_64 [V2DI V2DF])
 (define_mode_iterator V_HW_32_64 [V4SI V2DI V2DF])
 (define_mode_iterator VI_HW_SD [V4SI V2DI])
 (define_mode_iterator V_HW_HSD [V8HI V4SI V2DI V2DF])
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 29318bb..727bb45 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,9 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* gcc.target/s390/vector/vec-init-2.c: New test.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* gcc.dg/ubsan/pr79904-2.c: New test.
 
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-init-2.c b/gcc/testsuite/gcc.target/s390/vector/vec-init-2.c
new file mode 100644
index 0000000..e497210
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-init-2.c
@@ -0,0 +1,48 @@
+/* Check that the vec_init expander does its job.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+/* { dg-require-effective-target int128 } */
+
+
+
+
+typedef __attribute__((vector_size(16))) double v2df;
+typedef __attribute__((vector_size(16))) long long v2di;
+
+typedef __attribute__((vector_size(16))) long double v1tf;
+typedef __attribute__((vector_size(16))) __int128 v1ti;
+
+v1tf gld;
+
+v1tf
+f (long double a)
+{
+  return (v1tf){ a };
+}
+
+v1ti
+g (__int128 a)
+{
+  return (v1ti){ a };
+}
+/* { dg-final { scan-assembler-times "vl\t" 2 } } */
+
+v1tf
+h ()
+{
+  long double a;
+  asm volatile ("" : "=f" (a));
+  return (v1tf){ a };
+}
+
+/* { dg-final { scan-assembler-times "vmrhg\t" 1 } } */
+
+v1ti
+i ()
+{
+  __int128 a;
+  asm volatile ("" : "=d" (a));
+  return (v1ti){ a };
+}
+/* { dg-final { scan-assembler-times "vlvgp\t" 1 } } */
-- 
2.9.1


* [PATCH 06/16] S/390: Move and rename vector check.
  2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
                   ` (2 preceding siblings ...)
  2017-03-24 14:11 ` [PATCH 12/16] S/390: arch12: Add vllezlf instruction Andreas Krebbel
@ 2017-03-24 14:11 ` Andreas Krebbel
  2017-03-24 14:11 ` [PATCH 03/16] S/390: vec_init improvements Andreas Krebbel
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:11 UTC (permalink / raw)
  To: gcc-patches

Move the target support routine for the vector facility to the common
code file.  This is required to enable the generic vectorization tests
on S/390.  While doing this, the overly generic name of the check
(vector) is changed to s390_vx.  The renaming requires modifying all
the testcases currently using that check.
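
For illustration only, a run test guarded by the renamed check might
look like the sketch below.  The functions add_v4si and sum_v4si are
hypothetical and not part of this patch; the dg directives mirror the
ones used by the existing testcases touched here.

```c
/* { dg-do run } */
/* { dg-require-effective-target s390_vx } */
/* { dg-options "-O3 -mzarch -march=z13" } */

/* Hypothetical example, not part of the patch.  */
typedef int __attribute__((vector_size(16))) v4si;

/* Element-wise add of two 128 bit integer vectors.  */
v4si
add_v4si (v4si a, v4si b)
{
  return a + b;
}

/* Horizontal sum so the result can be checked as a scalar.  */
int
sum_v4si (v4si v)
{
  return v[0] + v[1] + v[2] + v[3];
}
```

With the check living in lib/target-supports.exp, the same
dg-require-effective-target line should also be usable by tests outside
gcc.target/s390.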

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* gcc.target/s390/s390.exp (check_effective_target_vector):
	Include target-supports.exp and move target_vector check routine
	...
	* lib/target-supports.exp (check_effective_target_s390_vx): ... to
	here and rename it.
	* gcc.target/s390/htm-builtins-z13-1.c: Rename effective target
	check from vector to s390_vx.
	* gcc.target/s390/target-attribute/tpragma-struct-vx-1.c: Likewise.
	* gcc.target/s390/target-attribute/tpragma-struct-vx-2.c: Likewise.
	* gcc.target/s390/vector/stpcpy-1.c: Likewise.
	* gcc.target/s390/vector/vec-abi-vararg-1.c: Likewise.
	* gcc.target/s390/vector/vec-clobber-1.c: Likewise.
	* gcc.target/s390/vector/vec-genbytemask-1.c: Likewise.
	* gcc.target/s390/vector/vec-genmask-1.c: Likewise.
	* gcc.target/s390/vector/vec-nopeel-1.c: Likewise.
	* gcc.target/s390/vector/vec-vrepi-1.c: Likewise.
---
 gcc/testsuite/ChangeLog                               | 19 +++++++++++++++++++
 gcc/testsuite/gcc.target/s390/htm-builtins-z13-1.c    |  2 +-
 gcc/testsuite/gcc.target/s390/s390.exp                | 16 +---------------
 .../s390/target-attribute/tpragma-struct-vx-1.c       |  2 +-
 .../s390/target-attribute/tpragma-struct-vx-2.c       |  2 +-
 gcc/testsuite/gcc.target/s390/vector/stpcpy-1.c       |  2 +-
 .../gcc.target/s390/vector/vec-abi-vararg-1.c         |  2 +-
 gcc/testsuite/gcc.target/s390/vector/vec-clobber-1.c  |  2 +-
 .../gcc.target/s390/vector/vec-genbytemask-1.c        |  2 +-
 gcc/testsuite/gcc.target/s390/vector/vec-genmask-1.c  |  2 +-
 gcc/testsuite/gcc.target/s390/vector/vec-nopeel-1.c   |  2 +-
 gcc/testsuite/gcc.target/s390/vector/vec-vrepi-1.c    |  2 +-
 gcc/testsuite/lib/target-supports.exp                 | 18 ++++++++++++++++++
 13 files changed, 48 insertions(+), 25 deletions(-)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 727bb45..0f0877c 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,24 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* gcc.target/s390/s390.exp (check_effective_target_vector):
+	Include target-supports.exp and move target_vector check routine
+	...
+	* lib/target-supports.exp (check_effective_target_s390_vx): ... to
+	here and rename it.
+	* gcc.target/s390/htm-builtins-z13-1.c: Rename effective target
+	check from vector to s390_vx.
+	* gcc.target/s390/target-attribute/tpragma-struct-vx-1.c: Likewise.
+	* gcc.target/s390/target-attribute/tpragma-struct-vx-2.c: Likewise.
+	* gcc.target/s390/vector/stpcpy-1.c: Likewise.
+	* gcc.target/s390/vector/vec-abi-vararg-1.c: Likewise.
+	* gcc.target/s390/vector/vec-clobber-1.c: Likewise.
+	* gcc.target/s390/vector/vec-genbytemask-1.c: Likewise.
+	* gcc.target/s390/vector/vec-genmask-1.c: Likewise.
+	* gcc.target/s390/vector/vec-nopeel-1.c: Likewise.
+	* gcc.target/s390/vector/vec-vrepi-1.c: Likewise.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* gcc.target/s390/vector/vec-init-2.c: New test.
 
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
diff --git a/gcc/testsuite/gcc.target/s390/htm-builtins-z13-1.c b/gcc/testsuite/gcc.target/s390/htm-builtins-z13-1.c
index 7879c36..aaca1f4 100644
--- a/gcc/testsuite/gcc.target/s390/htm-builtins-z13-1.c
+++ b/gcc/testsuite/gcc.target/s390/htm-builtins-z13-1.c
@@ -1,7 +1,7 @@
 /* Verify if VRs are saved and restored.  */
 
 /* { dg-do run } */
-/* { dg-require-effective-target vector } */
+/* { dg-require-effective-target s390_vx } */
 /* { dg-options "-O3 -march=z13 -mzarch" } */
 
 typedef int __attribute__((vector_size(16))) v4si;
diff --git a/gcc/testsuite/gcc.target/s390/s390.exp b/gcc/testsuite/gcc.target/s390/s390.exp
index cab68e8..d7a61f4 100644
--- a/gcc/testsuite/gcc.target/s390/s390.exp
+++ b/gcc/testsuite/gcc.target/s390/s390.exp
@@ -26,6 +26,7 @@ if ![istarget s390*-*-*] then {
 
 # Load support procs.
 load_lib gcc-dg.exp
+load_lib target-supports.exp
 
 # Return 1 if the the assembler understands .machine and .machinemode.  The
 # target attribute needs that feature to work.
@@ -55,21 +56,6 @@ proc check_effective_target_htm { } {
     }] "-march=zEC12 -mzarch" ] } { return 0 } else { return 1 }
 }
 
-# Return 1 if vector (va - vector add) instructions are understood by
-# the assembler and can be executed.  This also covers checking for
-# the VX kernel feature.  A kernel without that feature does not
-# enable the vector facility and the following check will die with a
-# signal.
-proc check_effective_target_vector { } {
-    if { ![check_runtime s390_check_vector [subst {
-	int main (void)
-	{
-	    asm ("va %%v24, %%v26, %%v28, 3" : : : "v24", "v26", "v28");
-	    return 0;
-	}
-    }] "-march=z13 -mzarch" ] } { return 0 } else { return 1 }
-}
-
 global s390_cached_flags
 set s390_cached_flags ""
 global s390_cached_value
diff --git a/gcc/testsuite/gcc.target/s390/target-attribute/tpragma-struct-vx-1.c b/gcc/testsuite/gcc.target/s390/target-attribute/tpragma-struct-vx-1.c
index d471033..a0f4d1c 100644
--- a/gcc/testsuite/gcc.target/s390/target-attribute/tpragma-struct-vx-1.c
+++ b/gcc/testsuite/gcc.target/s390/target-attribute/tpragma-struct-vx-1.c
@@ -2,7 +2,7 @@
 
 /* { dg-do run } */
 /* { dg-require-effective-target target_attribute } */
-/* { dg-require-effective-target vector } */
+/* { dg-require-effective-target s390_vx } */
 /* { dg-options "-march=z900 -mno-vx -mzarch" } */
 
 #define V16 __attribute__ ((vector_size(16)))
diff --git a/gcc/testsuite/gcc.target/s390/target-attribute/tpragma-struct-vx-2.c b/gcc/testsuite/gcc.target/s390/target-attribute/tpragma-struct-vx-2.c
index a238dce..652b122 100644
--- a/gcc/testsuite/gcc.target/s390/target-attribute/tpragma-struct-vx-2.c
+++ b/gcc/testsuite/gcc.target/s390/target-attribute/tpragma-struct-vx-2.c
@@ -2,7 +2,7 @@
 
 /* { dg-do run } */
 /* { dg-require-effective-target target_attribute } */
-/* { dg-require-effective-target vector } */
+/* { dg-require-effective-target s390_vx } */
 /* { dg-options "-march=z13 -mvx -mzarch" } */
 
 #define V16 __attribute__ ((vector_size(16)))
diff --git a/gcc/testsuite/gcc.target/s390/vector/stpcpy-1.c b/gcc/testsuite/gcc.target/s390/vector/stpcpy-1.c
index 91c1f7c..aed20e5 100644
--- a/gcc/testsuite/gcc.target/s390/vector/stpcpy-1.c
+++ b/gcc/testsuite/gcc.target/s390/vector/stpcpy-1.c
@@ -4,7 +4,7 @@
    strings.  */
 
 /* { dg-do run } */
-/* { dg-require-effective-target vector } */
+/* { dg-require-effective-target s390_vx } */
 /* { dg-options "-O3 -mzarch -march=z13" } */
 
 #include <stdio.h>
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-1.c
index 59740c5..9d4d5bd 100644
--- a/gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-1.c
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-1.c
@@ -2,7 +2,7 @@
    ABI.  */
 
 /* { dg-do run { target { s390*-*-* } } } */
-/* { dg-require-effective-target vector } */
+/* { dg-require-effective-target s390_vx } */
 /* { dg-options "-O3 -mzarch -march=z13 --save-temps" } */
 
 /* Make sure arguments are fetched from the argument overflow area.  */
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-clobber-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-clobber-1.c
index 413b6a0..c55cc68 100644
--- a/gcc/testsuite/gcc.target/s390/vector/vec-clobber-1.c
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-clobber-1.c
@@ -1,5 +1,5 @@
 /* { dg-do run { target { s390*-*-* } } } */
-/* { dg-require-effective-target vector } */
+/* { dg-require-effective-target s390_vx } */
 /* { dg-options "-O3 -mzarch -march=z13" } */
 
 /* For FP zero checks we use the ltdbr instruction.  Since this is an
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-1.c
index 26c189a..30ef05e 100644
--- a/gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-1.c
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-1.c
@@ -1,6 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-O3 -mzarch -march=z13 --save-temps" } */
-/* { dg-require-effective-target vector } */
+/* { dg-require-effective-target s390_vx } */
 /* { dg-require-effective-target int128 } */
 
 typedef unsigned char     uv16qi __attribute__((vector_size(16)));
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-genmask-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-genmask-1.c
index 6093422..f303087 100644
--- a/gcc/testsuite/gcc.target/s390/vector/vec-genmask-1.c
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-genmask-1.c
@@ -1,6 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-O3 -mzarch -march=z13 --save-temps" } */
-/* { dg-require-effective-target vector } */
+/* { dg-require-effective-target s390_vx } */
 
 typedef unsigned char     uv16qi __attribute__((vector_size(16)));
 typedef unsigned short     uv8hi __attribute__((vector_size(16)));
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-1.c
index 581c371..6c9a2e1 100644
--- a/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-1.c
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O3 -mzarch -march=z13" } */
-/* { dg-require-effective-target vector } */
+/* { dg-require-effective-target s390_vx } */
 
 int
 foo (int * restrict a, int n)
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-vrepi-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-vrepi-1.c
index 27bf39e..bfb9974 100644
--- a/gcc/testsuite/gcc.target/s390/vector/vec-vrepi-1.c
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-vrepi-1.c
@@ -1,6 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-O3 -mzarch -march=z13 --save-temps" } */
-/* { dg-require-effective-target vector } */
+/* { dg-require-effective-target s390_vx } */
 
 typedef unsigned char     uv16qi __attribute__((vector_size(16)));
 typedef unsigned short     uv8hi __attribute__((vector_size(16)));
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 152b7d9..290c527 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -8209,6 +8209,24 @@ proc check_effective_target_profile_update_atomic {} {
     } "-fprofile-update=atomic -fprofile-generate"]
 }
 
+# Return 1 if vector (va - vector add) instructions are understood by
+# the assembler and can be executed.  This also covers checking for
+# the VX kernel feature.  A kernel without that feature does not
+# enable the vector facility and the following check will die with a
+# signal.
+proc check_effective_target_s390_vx { } {
+    if ![istarget s390*-*-*] then {
+	return 0;
+    }
+
+    return [check_runtime s390_check_vx {
+	int main (void)
+	{
+	    asm ("va %%v24, %%v26, %%v28, 3" : : : "v24", "v26", "v28");
+	    return 0;
+	}
+    } "-march=z13 -mzarch" ]
+}
 #For versions of ARM architectures that have hardware div insn,
 #disable the divmod transform
 
-- 
2.9.1


* [PATCH 02/16] S/390: Improve support of 128 bit vectors in GPRs
  2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
  2017-03-24 14:11 ` [PATCH 07/16] S/390: Use wfc for scalar vector compares Andreas Krebbel
@ 2017-03-24 14:11 ` Andreas Krebbel
  2017-03-24 14:11 ` [PATCH 12/16] S/390: arch12: Add vllezlf instruction Andreas Krebbel
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:11 UTC (permalink / raw)
  To: gcc-patches

This patch improves the handling of 128 bit vectors residing in GPRs
by adding more alternatives to the move pattern.

Regression tested on s390x.
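
A minimal sketch of source code that can exercise such moves, assuming
a build for a CPU level without the vector facility (for example
-march=z196, chosen here only for illustration).  In that case a 16
byte vector value cannot live in a vector register, so a copy has to go
through GPR pairs or memory.  The function names are hypothetical, and
no claim is made about which alternative the register allocator picks.

```c
/* Hypothetical example, not part of the patch.  */
typedef long long v2di __attribute__((vector_size(16)));

/* Copy one 128 bit vector value; without VX this cannot use a VR.  */
void
copy_v2di (v2di *dst, const v2di *src)
{
  *dst = *src;
}

/* Self-check: returns 1 if both 64 bit elements survived the copy.  */
int
copy_roundtrip_ok (void)
{
  v2di src = { 1, -2 };
  v2di dst = { 0, 0 };

  copy_v2di (&dst, &src);
  return dst[0] == 1 && dst[1] == -2;
}
```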

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/constraints.md: Add comments.
	(jKK): Reject element sizes > 8 bytes.
	* config/s390/s390.c (s390_split_ok_p): Enable splitting also for
	s_operands.
	* config/s390/s390.md: Add the s_operand checks formerly in
	s390_split_ok_p to various splitters where they are still
	required.
	* config/s390/vector.md ("mov<mode>" V_128): Add GPR alternatives
	for 128 bit vectors.  Plus two splitters.
---
 gcc/ChangeLog                  | 12 ++++++++++
 gcc/config/s390/constraints.md | 10 +++++++--
 gcc/config/s390/s390.c         |  4 ----
 gcc/config/s390/s390.md        | 16 +++++++++++++
 gcc/config/s390/vector.md      | 51 ++++++++++++++++++++++++++++++++++++++----
 5 files changed, 83 insertions(+), 10 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 7a83d1b..292e946 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,17 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* config/s390/constraints.md: Add comments.
+	(jKK): Reject element sizes > 8 bytes.
+	* config/s390/s390.c (s390_split_ok_p): Enable splitting also for
+	s_operands.
+	* config/s390/s390.md: Add the s_operand checks formerly in
+	s390_split_ok_p to various splitters where they are still
+	required.
+	* config/s390/vector.md ("mov<mode>" V_128): Add GPR alternatives
+	for 128 bit vectors.  Plus two splitters.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* config/s390/s390.md: Rename the cpu facilty vec to vx throughout
 	the file.
 
diff --git a/gcc/config/s390/constraints.md b/gcc/config/s390/constraints.md
index 536f485..95c6a8f 100644
--- a/gcc/config/s390/constraints.md
+++ b/gcc/config/s390/constraints.md
@@ -410,20 +410,26 @@
   "All one bit scalar or vector constant"
   (match_test "op == CONSTM1_RTX (GET_MODE (op))"))
 
+; vector generate mask operand - support for up to 64 bit elements
 (define_constraint "jxx"
   "@internal"
   (and (match_code "const_vector")
        (match_test "s390_contiguous_bitmask_vector_p (op, NULL, NULL)")))
 
+; vector generate byte mask operand - this is only supposed to deal
+; with real vectors 128 bit values of being either 0 or -1 are handled
+; with j00 and jm1
 (define_constraint "jyy"
   "@internal"
   (and (match_code "const_vector")
        (match_test "s390_bytemask_vector_p (op, NULL)")))
 
+; vector replicate immediate operand - support for up to 64 bit elements
 (define_constraint "jKK"
   "@internal"
-  (and (and (match_code "const_vector")
-	    (match_test "const_vec_duplicate_p (op)"))
+  (and (and (and (match_code "const_vector")
+		 (match_test "const_vec_duplicate_p (op)"))
+	    (match_test "GET_MODE_UNIT_SIZE (GET_MODE (op)) <= 8"))
        (match_test "satisfies_constraint_K (XVECEXP (op, 0, 0))")))
 
 (define_constraint "jm6"
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 27640ad..f3cebd6 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -2494,10 +2494,6 @@ s390_split_ok_p (rtx dst, rtx src, machine_mode mode, int first_subword)
   if (FP_REG_P (src) || FP_REG_P (dst) || VECTOR_REG_P (src) || VECTOR_REG_P (dst))
     return false;
 
-  /* We don't need to split if operands are directly accessible.  */
-  if (s_operand (src, mode) || s_operand (dst, mode))
-    return false;
-
   /* Non-offsettable memory references cannot be split.  */
   if ((GET_CODE (src) == MEM && !offsettable_memref_p (src))
       || (GET_CODE (dst) == MEM && !offsettable_memref_p (dst)))
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 660b5f9..555a779 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -1490,6 +1490,8 @@
   [(set (match_operand:TI 0 "nonimmediate_operand" "")
         (match_operand:TI 1 "general_operand" ""))]
   "TARGET_ZARCH && reload_completed
+   && !s_operand (operands[0], TImode)
+   && !s_operand (operands[1], TImode)
    && s390_split_ok_p (operands[0], operands[1], TImode, 0)"
   [(set (match_dup 2) (match_dup 4))
    (set (match_dup 3) (match_dup 5))]
@@ -1504,6 +1506,8 @@
   [(set (match_operand:TI 0 "nonimmediate_operand" "")
         (match_operand:TI 1 "general_operand" ""))]
   "TARGET_ZARCH && reload_completed
+   && !s_operand (operands[0], TImode)
+   && !s_operand (operands[1], TImode)
    && s390_split_ok_p (operands[0], operands[1], TImode, 1)"
   [(set (match_dup 2) (match_dup 4))
    (set (match_dup 3) (match_dup 5))]
@@ -1824,6 +1828,8 @@
   [(set (match_operand:DI 0 "nonimmediate_operand" "")
         (match_operand:DI 1 "general_operand" ""))]
   "!TARGET_ZARCH && reload_completed
+   && !s_operand (operands[0], DImode)
+   && !s_operand (operands[1], DImode)
    && s390_split_ok_p (operands[0], operands[1], DImode, 0)"
   [(set (match_dup 2) (match_dup 4))
    (set (match_dup 3) (match_dup 5))]
@@ -1838,6 +1844,8 @@
   [(set (match_operand:DI 0 "nonimmediate_operand" "")
         (match_operand:DI 1 "general_operand" ""))]
   "!TARGET_ZARCH && reload_completed
+   && !s_operand (operands[0], DImode)
+   && !s_operand (operands[1], DImode)
    && s390_split_ok_p (operands[0], operands[1], DImode, 1)"
   [(set (match_dup 2) (match_dup 4))
    (set (match_dup 3) (match_dup 5))]
@@ -2364,6 +2372,8 @@
   [(set (match_operand:TD_TF 0 "nonimmediate_operand" "")
         (match_operand:TD_TF 1 "general_operand"      ""))]
   "TARGET_ZARCH && reload_completed
+   && !s_operand (operands[0], <MODE>mode)
+   && !s_operand (operands[1], <MODE>mode)
    && s390_split_ok_p (operands[0], operands[1], <MODE>mode, 0)"
   [(set (match_dup 2) (match_dup 4))
    (set (match_dup 3) (match_dup 5))]
@@ -2378,6 +2388,8 @@
   [(set (match_operand:TD_TF 0 "nonimmediate_operand" "")
         (match_operand:TD_TF 1 "general_operand"      ""))]
   "TARGET_ZARCH && reload_completed
+   && !s_operand (operands[0], <MODE>mode)
+   && !s_operand (operands[1], <MODE>mode)
    && s390_split_ok_p (operands[0], operands[1], <MODE>mode, 1)"
   [(set (match_dup 2) (match_dup 4))
    (set (match_dup 3) (match_dup 5))]
@@ -2532,6 +2544,8 @@
   [(set (match_operand:DD_DF 0 "nonimmediate_operand" "")
         (match_operand:DD_DF 1 "general_operand" ""))]
   "!TARGET_ZARCH && reload_completed
+   && !s_operand (operands[0], <MODE>mode)
+   && !s_operand (operands[1], <MODE>mode)
    && s390_split_ok_p (operands[0], operands[1], <MODE>mode, 0)"
   [(set (match_dup 2) (match_dup 4))
    (set (match_dup 3) (match_dup 5))]
@@ -2546,6 +2560,8 @@
   [(set (match_operand:DD_DF 0 "nonimmediate_operand" "")
         (match_operand:DD_DF 1 "general_operand" ""))]
   "!TARGET_ZARCH && reload_completed
+   && !s_operand (operands[0], <MODE>mode)
+   && !s_operand (operands[1], <MODE>mode)
    && s390_split_ok_p (operands[0], operands[1], <MODE>mode, 1)"
   [(set (match_dup 2) (match_dup 4))
    (set (match_dup 3) (match_dup 5))]
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 4b5d43b..38905e8 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -144,11 +144,18 @@
 (include "vx-builtins.md")
 
 ; Full HW vector size moves
+
+; We don't use lm/stm for 128 bit moves since these are slower than
+; splitting it into separate moves.
+
+; FIXME: More constants are possible by enabling jxx, jyy constraints
+; for TImode (use double-int for the calculations)
+
 ; vgmb, vgmh, vgmf, vgmg, vrepib, vrepih, vrepif, vrepig
 (define_insn "mov<mode>"
-  [(set (match_operand:V_128 0 "nonimmediate_operand" "=v,v,R,  v,  v,  v,  v,  v,v,d")
-	(match_operand:V_128 1 "general_operand"      " v,R,v,j00,jm1,jyy,jxx,jKK,d,v"))]
-  "TARGET_VX"
+  [(set (match_operand:V_128 0 "nonimmediate_operand" "=v,v,R,  v,  v,  v,  v,  v,v,*d,*d,?o")
+	(match_operand:V_128 1 "general_operand"      " v,R,v,j00,jm1,jyy,jxx,jKK,d, v,dT,*d"))]
+  ""
   "@
    vlr\t%v0,%v1
    vl\t%v0,%1
@@ -159,9 +166,13 @@
    vgm<bhfgq>\t%v0,%s1,%e1
    vrepi<bhfgq>\t%v0,%h1
    vlvgp\t%v0,%1,%N1
+   #
+   #
    #"
-  [(set_attr "op_type" "VRR,VRX,VRX,VRI,VRI,VRI,VRI,VRI,VRR,*")])
+  [(set_attr "cpu_facility" "vx,vx,vx,vx,vx,vx,vx,vx,vx,vx,*,*")
+   (set_attr "op_type"      "VRR,VRX,VRX,VRI,VRI,VRI,VRI,VRI,VRR,*,*,*")])
 
+; VR -> GPR, no instruction so split it into 64 element sets.
 (define_split
   [(set (match_operand:V_128 0 "register_operand" "")
 	(match_operand:V_128 1 "register_operand" ""))]
@@ -177,6 +188,38 @@
   operands[3] = operand_subword (operands[0], 1, 0, <MODE>mode);
 })
 
+; Split the 128 bit GPR move into two word mode moves
+; s390_split_ok_p decides which part needs to be moved first.
+
+(define_split
+  [(set (match_operand:V_128 0 "nonimmediate_operand" "")
+        (match_operand:V_128 1 "general_operand" ""))]
+  "reload_completed
+   && s390_split_ok_p (operands[0], operands[1], <MODE>mode, 0)"
+  [(set (match_dup 2) (match_dup 4))
+   (set (match_dup 3) (match_dup 5))]
+{
+  operands[2] = operand_subword (operands[0], 0, 0, <MODE>mode);
+  operands[3] = operand_subword (operands[0], 1, 0, <MODE>mode);
+  operands[4] = operand_subword (operands[1], 0, 0, <MODE>mode);
+  operands[5] = operand_subword (operands[1], 1, 0, <MODE>mode);
+})
+
+(define_split
+  [(set (match_operand:V_128 0 "nonimmediate_operand" "")
+        (match_operand:V_128 1 "general_operand" ""))]
+  "reload_completed
+   && s390_split_ok_p (operands[0], operands[1], <MODE>mode, 1)"
+  [(set (match_dup 2) (match_dup 4))
+   (set (match_dup 3) (match_dup 5))]
+{
+  operands[2] = operand_subword (operands[0], 1, 0, <MODE>mode);
+  operands[3] = operand_subword (operands[0], 0, 0, <MODE>mode);
+  operands[4] = operand_subword (operands[1], 1, 0, <MODE>mode);
+  operands[5] = operand_subword (operands[1], 0, 0, <MODE>mode);
+})
+
+
 ; Moves for smaller vector modes.
 
 ; In these patterns only the vlr, vone, and vzero instructions write
-- 
2.9.1


* [PATCH 05/16] S/390: movdf improvements
  2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
                   ` (12 preceding siblings ...)
  2017-03-24 14:13 ` [PATCH 10/16] S/390: arch12: Add support for new vector bit operations Andreas Krebbel
@ 2017-03-24 14:13 ` Andreas Krebbel
  2017-03-24 14:22 ` [PATCH 13/16] S/390: arch12: Add indirect branch pattern Andreas Krebbel
  14 siblings, 0 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:13 UTC (permalink / raw)
  To: gcc-patches

This patch adds the vector load element immediate instruction (vleig)
to the movdf/dd pattern for loading an FP zero, and it removes the
vector instructions from the mov<mode>_64 pattern.  These were
pointless there because z13 support implies DFP support, so these
instructions will always be matched in the mov<mode>_64dfp pattern
instead.

Regression tested on s390x
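
As a rough illustration, the source-level situation the new alternative
targets is materializing a double precision zero.  The function name
below is hypothetical; whether the compiler ends up emitting lzdr,
vleig or something else depends on register allocation and is not
asserted here.

```c
/* Hypothetical example, not part of the patch.  */

/* Materialize an FP zero in a register.  */
double
fp_zero (void)
{
  return 0.0;
}
```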

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/s390.md ("mov<mode>_64dfp" DD_DF): Use vleig for loading a
	FP zero.
	("*mov<mode>_64" DD_DF): Remove the vector instructions. These
	will anyway by matched by mov<mode>_64dfp.
---
 gcc/ChangeLog           |  7 +++++++
 gcc/config/s390/s390.md | 30 ++++++++++++++----------------
 2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d29c06b..4dd2be6 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,12 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* config/s390/s390.md ("mov<mode>_64dfp" DD_DF): Use vleig for loading a
+	FP zero.
+	("*mov<mode>_64" DD_DF): Remove the vector instructions. These
+	will anyway by matched by mov<mode>_64dfp.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* config/s390/s390.c (s390_expand_vec_init): Enable vector load
 	pair for all vector types with 64 bit elements.
 	* config/s390/vx-builtins.md (V_HW_64): Move mode iterator to ...
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 75b15df..554fb37 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -2460,9 +2460,9 @@
 
 (define_insn "*mov<mode>_64dfp"
   [(set (match_operand:DD_DF 0 "nonimmediate_operand"
-			       "=f,f,f,d,f,f,R,T,d,d,d,d,b,T,v,v,d,v,R")
+			       "=f,f,f,d,f,f,R,T,d,d,d,d,b,T,v,v,v,d,v,R")
         (match_operand:DD_DF 1 "general_operand"
-			       " G,f,d,f,R,T,f,f,G,d,b,T,d,d,v,d,v,R,v"))]
+			       " G,f,d,f,R,T,f,f,G,d,b,T,d,d,v,G,d,v,R,v"))]
   "TARGET_DFP"
   "@
    lzdr\t%0
@@ -2480,19 +2480,20 @@
    stgrl\t%1,%0
    stg\t%1,%0
    vlr\t%v0,%v1
+   vleig\t%v0,0,0
    vlvgg\t%v0,%1,0
    vlgvg\t%0,%v1,0
    vleg\t%0,%1,0
    vsteg\t%1,%0,0"
-  [(set_attr "op_type" "RRE,RR,RRE,RRE,RX,RXY,RX,RXY,RI,RRE,RIL,RXY,RIL,RXY,VRR,VRS,VRS,VRX,VRX")
+  [(set_attr "op_type" "RRE,RR,RRE,RRE,RX,RXY,RX,RXY,RI,RRE,RIL,RXY,RIL,RXY,VRR,VRI,VRS,VRS,VRX,VRX")
    (set_attr "type" "fsimpdf,floaddf,floaddf,floaddf,floaddf,floaddf,
-                     fstoredf,fstoredf,*,lr,load,load,store,store,*,*,*,load,store")
-   (set_attr "z10prop" "*,*,*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec,*,*,*,*,*")
-   (set_attr "cpu_facility" "z196,*,*,*,*,longdisp,*,longdisp,*,*,z10,*,z10,*,vx,vx,vx,vx,vx")])
+                     fstoredf,fstoredf,*,lr,load,load,store,store,*,*,*,*,load,store")
+   (set_attr "z10prop" "*,*,*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec,*,*,*,*,*,*")
+   (set_attr "cpu_facility" "z196,*,*,*,*,longdisp,*,longdisp,*,*,z10,*,z10,*,vx,vx,vx,vx,vx,vx")])
 
 (define_insn "*mov<mode>_64"
-  [(set (match_operand:DD_DF 0 "nonimmediate_operand" "=f,f,f,f,R,T,d,d,d,d,b,T,v,v,R")
-        (match_operand:DD_DF 1 "general_operand"      " G,f,R,T,f,f,G,d,b,T,d,d,v,R,v"))]
+  [(set (match_operand:DD_DF 0 "nonimmediate_operand" "=f,f,f,f,R,T,d,d,d,d,b,T")
+        (match_operand:DD_DF 1 "general_operand"      " G,f,R,T,f,f,G,d,b,T,d,d"))]
   "TARGET_ZARCH"
   "@
    lzdr\t%0
@@ -2506,15 +2507,12 @@
    lgrl\t%0,%1
    lg\t%0,%1
    stgrl\t%1,%0
-   stg\t%1,%0
-   vlr\t%v0,%v1
-   vleg\t%v0,%1,0
-   vsteg\t%v1,%0,0"
-  [(set_attr "op_type" "RRE,RR,RX,RXY,RX,RXY,RI,RRE,RIL,RXY,RIL,RXY,VRR,VRX,VRX")
+   stg\t%1,%0"
+  [(set_attr "op_type" "RRE,RR,RX,RXY,RX,RXY,RI,RRE,RIL,RXY,RIL,RXY")
    (set_attr "type"    "fsimpdf,fload<mode>,fload<mode>,fload<mode>,
-                        fstore<mode>,fstore<mode>,*,lr,load,load,store,store,*,load,store")
-   (set_attr "z10prop" "*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec,*,*,*")
-   (set_attr "cpu_facility" "z196,*,*,longdisp,*,longdisp,*,*,z10,*,z10,*,vx,vx,vx")])
+                        fstore<mode>,fstore<mode>,*,lr,load,load,store,store")
+   (set_attr "z10prop" "*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec")
+   (set_attr "cpu_facility" "z196,*,*,longdisp,*,longdisp,*,*,z10,*,z10,*")])
 
 (define_insn "*mov<mode>_31"
   [(set (match_operand:DD_DF 0 "nonimmediate_operand"
-- 
2.9.1


* [PATCH 04/16] S/390: movsf/sd pattern fixes.
  2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
                   ` (7 preceding siblings ...)
  2017-03-24 14:13 ` [PATCH 08/16] S/390: Rearrange fixuns_trunc pattern definitions Andreas Krebbel
@ 2017-03-24 14:13 ` Andreas Krebbel
  2017-03-24 14:13 ` [PATCH 11/16] S/390: arch12: New vector popcount variants Andreas Krebbel
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:13 UTC (permalink / raw)
  To: gcc-patches

The SD/SFmode move pattern used the wrong mnemonics for vector load
element and vector store element (vleg/vsteg instead of vlef/vstef).
In addition, the vector load element immediate instruction (vleif)
was missing an operand.

Regression tested on s390x.

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/s390.md ("mov<mode>" SD_SF): Change vleg/vsteg to
	vlef/vstef.  Add missing operand to vleif.
---
 gcc/config/s390/s390.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 555a779..75b15df 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -2613,11 +2613,11 @@
    st\t%1,%0
    sty\t%1,%0
    vlr\t%v0,%v1
-   vleif\t%v0,0
+   vleif\t%v0,0,0
    vlvgf\t%v0,%1,0
    vlgvf\t%0,%v1,0
-   vleg\t%0,%1,0
-   vsteg\t%1,%0,0"
+   vlef\t%0,%1,0
+   vstef\t%1,%0,0"
   [(set_attr "op_type" "RRE,RR,RR,RXE,RX,RXY,RX,RXY,RI,RR,RIL,RX,RXY,RIL,RX,RXY,VRR,VRI,VRS,VRS,VRX,VRX")
    (set_attr "type"    "fsimpsf,fsimpsf,fload<mode>,fload<mode>,fload<mode>,fload<mode>,
                         fstore<mode>,fstore<mode>,*,lr,load,load,load,store,store,store,*,*,*,*,load,store")
-- 
2.9.1


* [PATCH 01/16] S/390: Rename cpu facility vec to vx.
  2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
                   ` (10 preceding siblings ...)
  2017-03-24 14:13 ` [PATCH 15/16] S/390: arch12: Support new vector floating point modes Andreas Krebbel
@ 2017-03-24 14:13 ` Andreas Krebbel
  2017-03-24 14:13 ` [PATCH 10/16] S/390: arch12: Add support for new vector bit operations Andreas Krebbel
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:13 UTC (permalink / raw)
  To: gcc-patches

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/s390.md: Rename the cpu facility vec to vx throughout
	the file.
---
 gcc/ChangeLog           |  5 +++++
 gcc/config/s390/s390.md | 46 +++++++++++++++++++++++-----------------------
 2 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9cb2271..7a83d1b 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,10 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* config/s390/s390.md: Rename the cpu facilty vec to vx throughout
+	the file.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	PR target/79893
 	* config/s390/s390-c.c (s390_adjust_builtin_arglist): Issue an
 	error if the boundary argument is not constant.
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 19daf31..660b5f9 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -482,7 +482,7 @@
   (const (symbol_ref "s390_tune_attr")))
 
 (define_attr "cpu_facility"
-  "standard,ieee,zarch,cpu_zarch,longdisp,extimm,dfp,z10,z196,zEC12,vec,z13"
+  "standard,ieee,zarch,cpu_zarch,longdisp,extimm,dfp,z10,z196,zEC12,vx,z13"
   (const_string "standard"))
 
 (define_attr "enabled" ""
@@ -525,7 +525,7 @@
               (match_test "TARGET_ZEC12"))
 	 (const_int 1)
 
-         (and (eq_attr "cpu_facility" "vec")
+         (and (eq_attr "cpu_facility" "vx")
               (match_test "TARGET_VX"))
 	 (const_int 1)
 
@@ -1484,7 +1484,7 @@
    #"
   [(set_attr "op_type" "RSY,RSY,VRR,VRI,VRI,VRR,*,VRX,VRX,*,*")
    (set_attr "type" "lm,stm,*,*,*,*,*,*,*,*,*")
-   (set_attr "cpu_facility" "*,*,vec,vec,vec,vec,vec,vec,vec,*,*")])
+   (set_attr "cpu_facility" "*,*,vx,vx,vx,vx,vx,vx,vx,*,*")])
 
 (define_split
   [(set (match_operand:TI 0 "nonimmediate_operand" "")
@@ -1720,7 +1720,7 @@
                      *,*,*,*,*,*,*")
    (set_attr "cpu_facility" "*,*,*,*,*,extimm,extimm,extimm,dfp,dfp,longdisp,
                              z10,*,*,*,*,*,longdisp,*,longdisp,
-                             z10,z10,*,*,*,*,vec,vec,vec,vec,vec,vec")
+                             z10,z10,*,*,*,*,vx,vx,vx,vx,vx,vx")
    (set_attr "z10prop" "z10_fwd_A1,
                         z10_fwd_E1,
                         z10_fwd_E1,
@@ -2001,7 +2001,7 @@
                      *,
                      *,*,*,*,*,*,*")
    (set_attr "cpu_facility" "*,*,*,extimm,longdisp,z10,*,*,longdisp,*,longdisp,
-                             vec,*,vec,*,longdisp,*,longdisp,*,*,*,z10,z10,*,vec,vec,vec,vec,vec,vec")
+                             vx,*,vx,*,longdisp,*,longdisp,*,*,*,z10,z10,*,vx,vx,vx,vx,vx,vx")
    (set_attr "z10prop" "z10_fwd_A1,
                         z10_fwd_E1,
                         z10_fwd_E1,
@@ -2049,7 +2049,7 @@
    (set_attr "type" "*,lr,load,store,floadsf,floadsf,floadsf,floadsf,fstoresf,*,*,*,*")
    (set_attr "z10prop" "z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_rec,*,*,*,*,*,z10_super_E1,
                         z10_super,*,*")
-   (set_attr "cpu_facility" "*,*,*,*,vec,*,vec,*,*,*,*,*,*")
+   (set_attr "cpu_facility" "*,*,*,*,vx,*,vx,*,*,*,*,*,*")
 ])
 
 (define_peephole2
@@ -2183,7 +2183,7 @@
    vsteh\t%v1,%0,0"
   [(set_attr "op_type"      "RR,RI,RX,RXY,RIL,RX,RXY,RIL,SIL,VRI,VRR,VRS,VRS,VRX,VRX")
    (set_attr "type"         "lr,*,*,*,larl,store,store,store,*,*,*,*,*,*,*")
-   (set_attr "cpu_facility" "*,*,*,longdisp,z10,*,longdisp,z10,z10,vec,vec,vec,vec,vec,vec")
+   (set_attr "cpu_facility" "*,*,*,longdisp,z10,*,longdisp,z10,z10,vx,vx,vx,vx,vx,vx")
    (set_attr "z10prop" "z10_fr_E1,
                        z10_fwd_A1,
                        z10_super_E1,
@@ -2248,7 +2248,7 @@
    vsteb\t%v1,%0,0"
   [(set_attr "op_type" "RR,RI,RX,RXY,RX,RXY,SI,SIY,SS,VRI,VRR,VRS,VRS,VRX,VRX")
    (set_attr "type" "lr,*,*,*,store,store,store,store,*,*,*,*,*,*,*")
-   (set_attr "cpu_facility" "*,*,*,longdisp,*,longdisp,*,longdisp,*,vec,vec,vec,vec,vec,vec")
+   (set_attr "cpu_facility" "*,*,*,longdisp,*,longdisp,*,longdisp,*,vx,vx,vx,vx,vx,vx")
    (set_attr "z10prop" "z10_fr_E1,
                         z10_fwd_A1,
                         z10_super_E1,
@@ -2476,7 +2476,7 @@
    (set_attr "type" "fsimpdf,floaddf,floaddf,floaddf,floaddf,floaddf,
                      fstoredf,fstoredf,*,lr,load,load,store,store,*,*,*,load,store")
    (set_attr "z10prop" "*,*,*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec,*,*,*,*,*")
-   (set_attr "cpu_facility" "z196,*,*,*,*,longdisp,*,longdisp,*,*,z10,*,z10,*,vec,vec,vec,vec,vec")])
+   (set_attr "cpu_facility" "z196,*,*,*,*,longdisp,*,longdisp,*,*,z10,*,z10,*,vx,vx,vx,vx,vx")])
 
 (define_insn "*mov<mode>_64"
   [(set (match_operand:DD_DF 0 "nonimmediate_operand" "=f,f,f,f,R,T,d,d,d,d,b,T,v,v,R")
@@ -2502,7 +2502,7 @@
    (set_attr "type"    "fsimpdf,fload<mode>,fload<mode>,fload<mode>,
                         fstore<mode>,fstore<mode>,*,lr,load,load,store,store,*,load,store")
    (set_attr "z10prop" "*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec,*,*,*")
-   (set_attr "cpu_facility" "z196,*,*,longdisp,*,longdisp,*,*,z10,*,z10,*,vec,vec,vec")])
+   (set_attr "cpu_facility" "z196,*,*,longdisp,*,longdisp,*,*,z10,*,z10,*,vx,vx,vx")])
 
 (define_insn "*mov<mode>_31"
   [(set (match_operand:DD_DF 0 "nonimmediate_operand"
@@ -2606,7 +2606,7 @@
    (set_attr "type"    "fsimpsf,fsimpsf,fload<mode>,fload<mode>,fload<mode>,fload<mode>,
                         fstore<mode>,fstore<mode>,*,lr,load,load,load,store,store,store,*,*,*,*,load,store")
    (set_attr "z10prop" "*,*,*,*,*,*,*,*,z10_fwd_A1,z10_fr_E1,z10_fr_E1,z10_fwd_A3,z10_fwd_A3,z10_rec,z10_rec,z10_rec,*,*,*,*,*,*")
-   (set_attr "cpu_facility" "z196,vec,*,vec,*,longdisp,*,longdisp,*,*,z10,*,longdisp,z10,*,longdisp,vec,vec,vec,vec,vec,vec")])
+   (set_attr "cpu_facility" "z196,vx,*,vx,*,longdisp,*,longdisp,*,*,z10,*,longdisp,z10,*,longdisp,vx,vx,vx,vx,vx,vx")])
 
 ;
 ; movcc instruction pattern
@@ -4983,7 +4983,7 @@
    wcdgb\t%v0,%v1,0,0"
   [(set_attr "op_type"      "RRE,VRR")
    (set_attr "type"         "itof<mode>" )
-   (set_attr "cpu_facility" "*,vec")
+   (set_attr "cpu_facility" "*,vx")
    (set_attr "enabled"      "*,<DFDI>")])
 
 ; cxfbr, cdfbr, cefbr
@@ -5048,7 +5048,7 @@
                        ; According to BFP rounding mode
   [(set_attr "op_type"      "RRE,VRR")
    (set_attr "type"         "ftruncdf")
-   (set_attr "cpu_facility" "*,vec")])
+   (set_attr "cpu_facility" "*,vx")])
 
 ;
 ; trunctf(df|sf)2 instruction pattern(s).
@@ -5767,7 +5767,7 @@
    wfadb\t%v0,%v1,%v2"
   [(set_attr "op_type"      "RRF,RRE,RXE,VRR")
    (set_attr "type"         "fsimp<mode>")
-   (set_attr "cpu_facility" "*,*,*,vec")
+   (set_attr "cpu_facility" "*,*,*,vx")
    (set_attr "enabled"      "<nBFP>,<nDFP>,<DSF>,<DFDI>")])
 
 ; axbr, adbr, aebr, axb, adb, aeb, adtr, axtr
@@ -6198,7 +6198,7 @@
    wfsdb\t%v0,%v1,%v2"
   [(set_attr "op_type"      "RRF,RRE,RXE,VRR")
    (set_attr "type"         "fsimp<mode>")
-   (set_attr "cpu_facility" "*,*,*,vec")
+   (set_attr "cpu_facility" "*,*,*,vx")
    (set_attr "enabled"      "<nBFP>,<nDFP>,<DSF>,<DFDI>")])
 
 ; sxbr, sdbr, sebr, sdb, seb, sxtr, sdtr
@@ -6628,7 +6628,7 @@
    wfmdb\t%v0,%v1,%v2"
   [(set_attr "op_type"      "RRF,RRE,RXE,VRR")
    (set_attr "type"         "fmul<mode>")
-   (set_attr "cpu_facility" "*,*,*,vec")
+   (set_attr "cpu_facility" "*,*,*,vx")
    (set_attr "enabled"      "<nBFP>,<nDFP>,<DSF>,<DFDI>")])
 
 ; madbr, maebr, maxb, madb, maeb
@@ -6644,7 +6644,7 @@
    wfmadb\t%v0,%v1,%v2,%v3"
   [(set_attr "op_type"      "RRE,RXE,VRR")
    (set_attr "type"         "fmadd<mode>")
-   (set_attr "cpu_facility" "*,*,vec")
+   (set_attr "cpu_facility" "*,*,vx")
    (set_attr "enabled"      "*,*,<DFDI>")])
 
 ; msxbr, msdbr, msebr, msxb, msdb, mseb
@@ -6660,7 +6660,7 @@
    wfmsdb\t%v0,%v1,%v2,%v3"
   [(set_attr "op_type"      "RRE,RXE,VRR")
    (set_attr "type"         "fmadd<mode>")
-   (set_attr "cpu_facility" "*,*,vec")
+   (set_attr "cpu_facility" "*,*,vx")
    (set_attr "enabled"      "*,*,<DFDI>")])
 
 ;;
@@ -7104,7 +7104,7 @@
    wfddb\t%v0,%v1,%v2"
   [(set_attr "op_type"      "RRF,RRE,RXE,VRR")
    (set_attr "type"         "fdiv<mode>")
-   (set_attr "cpu_facility" "*,*,*,vec")
+   (set_attr "cpu_facility" "*,*,*,vx")
    (set_attr "enabled"      "<nBFP>,<nDFP>,<DSF>,<DFDI>")])
 
 
@@ -8353,7 +8353,7 @@
    lc<xde>br\t%0,%1
    wflcdb\t%0,%1"
   [(set_attr "op_type"      "RRE,VRR")
-   (set_attr "cpu_facility" "*,vec")
+   (set_attr "cpu_facility" "*,vx")
    (set_attr "type"         "fsimp<mode>,*")
    (set_attr "enabled"      "*,<DFDI>")])
 
@@ -8476,7 +8476,7 @@
     lp<xde>br\t%0,%1
     wflpdb\t%0,%1"
   [(set_attr "op_type"      "RRE,VRR")
-   (set_attr "cpu_facility" "*,vec")
+   (set_attr "cpu_facility" "*,vx")
    (set_attr "type"         "fsimp<mode>,*")
    (set_attr "enabled"      "*,<DFDI>")])
 
@@ -8592,7 +8592,7 @@
    ln<xde>br\t%0,%1
    wflndb\t%0,%1"
   [(set_attr "op_type"      "RRE,VRR")
-   (set_attr "cpu_facility" "*,vec")
+   (set_attr "cpu_facility" "*,vx")
    (set_attr "type"         "fsimp<mode>,*")
    (set_attr "enabled"      "*,<DFDI>")])
 
@@ -8615,7 +8615,7 @@
    wfsqdb\t%v0,%v1"
   [(set_attr "op_type"      "RRE,RXE,VRR")
    (set_attr "type"         "fsqrt<mode>")
-   (set_attr "cpu_facility" "*,*,vec")
+   (set_attr "cpu_facility" "*,*,vx")
    (set_attr "enabled"      "*,<DSF>,<DFDI>")])
 
 
-- 
2.9.1


* [PATCH 09/16] S/390: arch12: Add arch12 option.
  2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
                   ` (5 preceding siblings ...)
  2017-03-24 14:13 ` [PATCH 14/16] S/390: arch12: Support the mul/add/subtract instructions Andreas Krebbel
@ 2017-03-24 14:13 ` Andreas Krebbel
  2017-03-24 14:13 ` [PATCH 08/16] S/390: Rearrange fixuns_trunc pattern definitions Andreas Krebbel
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:13 UTC (permalink / raw)
  To: gcc-patches

This patch covers the mechanical work of making the new architecture
option arch12 available wherever it will be needed later.
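
As a rough illustration of how the new facility bits are meant to
combine, here is a minimal standalone sketch.  The PF_* values and the
shape of the checks follow the s390.h hunk below; struct target_sketch
and the target_* functions are hypothetical stand-ins for the real
TARGET_* macros and option variables, not code from this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Facility bits with the values used in the s390.h hunk.  */
enum processor_flags_sketch
{
  PF_Z13 = 512,
  PF_VX = 1024,
  PF_ARCH12 = 2048,
  PF_VXE = 4096
};

/* Hypothetical stand-in for the option state the real macros read
   (s390_arch_flags, -mzarch, -mvx, hard float).  */
struct target_sketch
{
  int arch_flags;
  bool zarch;
  bool opt_vx;
  bool hard_float;
};

/* Mirrors TARGET_VX: the CPU must provide the vector facility and the
   option must not have been disabled.  */
bool
target_vx (const struct target_sketch *t)
{
  return t->zarch && (t->arch_flags & PF_VX) != 0
	 && t->opt_vx && t->hard_float;
}

/* Mirrors TARGET_VXE: the vector enhancements additionally require
   everything TARGET_VX requires.  */
bool
target_vxe (const struct target_sketch *t)
{
  return target_vx (t) && (t->arch_flags & PF_VXE) != 0;
}

/* Mirrors TARGET_ARCH12: non-vector arch12 features need only zarch
   mode and the CPU flag.  */
bool
target_arch12 (const struct target_sketch *t)
{
  return t->zarch && (t->arch_flags & PF_ARCH12) != 0;
}
```

The point of the split is that disabling the vector facility (for
example with -mno-vx) also turns off the vxe alternatives, while the
arch12 alternatives stay enabled.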

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* gcc.target/s390/s390.exp: Run tests in arch12 and vxe dirs.
	* lib/target-supports.exp: Add effective target check s390_vxe.

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* common/config/s390/s390-common.c (processor_flags_table): Add
	arch12.
	* config.gcc: Add arch12.
	* config/s390/driver-native.c (s390_host_detect_local_cpu):
	Default to arch12 for unknown CPU model numbers.
	* config/s390/s390-builtins.def: Add B_VXE builtin flag.
	* config/s390/s390-c.c (s390_cpu_cpp_builtins_internal): Adjust
	PROCESSOR_max sanity check.
	* config/s390/s390-opts.h (enum processor_type): Add
	PROCESSOR_ARCH12.
	* config/s390/s390.c (processor_table): Add arch12.
	(s390_expand_builtin): Add check for B_VXE flag.
	(s390_issue_rate): Add PROCESSOR_ARCH12.
	(s390_get_sched_attrmask): Likewise.
	(s390_get_unit_mask): Likewise.
	(s390_sched_score): Enable z13 scheduling for arch12.
	(s390_sched_reorder): Likewise.
	(s390_sched_variable_issue): Likewise.
	* config/s390/s390.h (enum processor_flags): Add PF_ARCH12 and
	PF_VXE.
	(s390_tune_attr): Use z13 scheduling also for arch12.
	(TARGET_CPU_ARCH12, TARGET_CPU_ARCH12_P, TARGET_CPU_VXE)
	(TARGET_CPU_VXE_P, TARGET_ARCH12, TARGET_ARCH12_P, TARGET_VXE)
	(TARGET_VXE_P): New macros.
	* config/s390/s390.md: Add arch12 to cpu attribute.  Add arch12
	and vxe to cpu_facility.  Add arch12 and vxe to enabled attribute.
	* config/s390/s390.opt: Add arch12 as processor_type.
---
 gcc/ChangeLog                          | 30 ++++++++++++++++++++++++++++++
 gcc/common/config/s390/s390-common.c   |  5 ++++-
 gcc/config.gcc                         |  2 +-
 gcc/config/s390/driver-native.c        |  3 +++
 gcc/config/s390/s390-builtins.def      |  3 ++-
 gcc/config/s390/s390-c.c               |  2 +-
 gcc/config/s390/s390-opts.h            |  1 +
 gcc/config/s390/s390.c                 | 22 ++++++++++++++++------
 gcc/config/s390/s390.h                 | 25 +++++++++++++++++++++----
 gcc/config/s390/s390.md                | 12 ++++++++++--
 gcc/config/s390/s390.opt               |  3 +++
 gcc/testsuite/ChangeLog                |  5 +++++
 gcc/testsuite/gcc.target/s390/s390.exp |  6 ++++++
 gcc/testsuite/lib/target-supports.exp  | 17 +++++++++++++++++
 14 files changed, 120 insertions(+), 16 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 779101e..89e7906 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,35 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* common/config/s390/s390-common.c (processor_flags_table): Add
+	arch12.
+	* config.gcc: Add arch12.
+	* config/s390/driver-native.c (s390_host_detect_local_cpu):
+	Default to arch12 for unknown CPU model numbers.
+	* config/s390/s390-builtins.def: Add B_VXE builtin flag.
+	* config/s390/s390-c.c (s390_cpu_cpp_builtins_internal): Adjust
+	PROCESSOR_max sanity check.
+	* config/s390/s390-opts.h (enum processor_type): Add
+	PROCESSOR_ARCH12.
+	* config/s390/s390.c (processor_table): Add arch12.
+	(s390_expand_builtin): Add check for B_VXE flag.
+	(s390_issue_rate): Add PROCESSOR_ARCH12.
+	(s390_get_sched_attrmask): Likewise.
+	(s390_get_unit_mask): Likewise.
+	(s390_sched_score): Enable z13 scheduling for arch12.
+	(s390_sched_reorder): Likewise.
+	(s390_sched_variable_issue): Likewise.
+	* config/s390/s390.h (enum processor_flags): Add PF_ARCH12 and
+	PF_VXE.
+	(s390_tune_attr): Use z13 scheduling also for arch12.
+	(TARGET_CPU_ARCH12, TARGET_CPU_ARCH12_P, TARGET_CPU_VXE)
+	(TARGET_CPU_VXE_P, TARGET_ARCH12, TARGET_ARCH12_P, TARGET_VXE)
+	(TARGET_VXE_P): New macros.
+	* config/s390/s390.md: Add arch12 to cpu attribute.  Add arch12
+	and vxe to cpu_facility.  Add arch12 and vxe to enabled attribute.
+	* config/s390/s390.opt: Add arch12 as processor_type.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* config/s390/s390.md
 	("fixuns_truncdddi2", "fixuns_trunctddi2")
 	("fixuns_trunc<BFP:mode><GPR:mode>2"): Merge into ...
diff --git a/gcc/common/config/s390/s390-common.c b/gcc/common/config/s390/s390-common.c
index 47f13e1..10418a3 100644
--- a/gcc/common/config/s390/s390-common.c
+++ b/gcc/common/config/s390/s390-common.c
@@ -45,7 +45,10 @@ EXPORTED_CONST int processor_flags_table[] =
                  | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX,
     /* z13 */    PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
                  | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX
-                 | PF_Z13 | PF_VX
+                 | PF_Z13 | PF_VX,
+    /* arch12 */ PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
+                 | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX
+                 | PF_Z13 | PF_VX | PF_VXE | PF_ARCH12
   };
 
 /* Change optimizations to be performed, depending on the
diff --git a/gcc/config.gcc b/gcc/config.gcc
index f7f6967..b8bb4d6 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4331,7 +4331,7 @@ case "${target}" in
 		for which in arch tune; do
 			eval "val=\$with_$which"
 			case ${val} in
-			"" | native | g5 | g6 | z900 | z990 | z9-109 | z9-ec | z10 | z196 | zEC12 | z13 | arch3 | arch5 | arch6 | arch7 | arch8 | arch9 | arch10 | arch11)
+			"" | native | g5 | g6 | z900 | z990 | z9-109 | z9-ec | z10 | z196 | zEC12 | z13 | arch3 | arch5 | arch6 | arch7 | arch8 | arch9 | arch10 | arch11 | arch12)
 				# OK
 				;;
 			*)
diff --git a/gcc/config/s390/driver-native.c b/gcc/config/s390/driver-native.c
index 5e70845..4bcddb4 100644
--- a/gcc/config/s390/driver-native.c
+++ b/gcc/config/s390/driver-native.c
@@ -114,6 +114,9 @@ s390_host_detect_local_cpu (int argc, const char **argv)
 	    case 0x2964:
 	      cpu = "z13";
 	      break;
+	    default:
+	      cpu = "arch12";
+	      break;
 	    }
 	}
       if (has_features == 0 && strncmp (buf, "features", 8) == 0)
diff --git a/gcc/config/s390/s390-builtins.def b/gcc/config/s390/s390-builtins.def
index 137aab5..27cb6a8 100644
--- a/gcc/config/s390/s390-builtins.def
+++ b/gcc/config/s390/s390-builtins.def
@@ -270,6 +270,7 @@
 #undef B_INT
 #undef B_HTM
 #undef B_VX
+#undef B_VXE
 
 #undef BFLAGS_MASK_INIT
 #define BFLAGS_MASK_INIT (B_INT)
@@ -277,7 +278,7 @@
 #define B_INT   (1 << 0)  /* Internal builtins.  This builtin cannot be used in user programs.  */
 #define B_HTM   (1 << 1)  /* Builtins requiring the transactional execution facility.  */
 #define B_VX    (1 << 2)  /* Builtins requiring the z13 vector extensions.  */
-
+#define B_VXE   (1 << 3)  /* Builtins requiring the arch12 vector extensions.  */
 
 /* B_DEF defines a standard (not overloaded) builtin
    B_DEF (<builtin name>, <RTL expander name>, <function attributes>, <builtin flags>, <operand flags, see above>, <fntype>)
diff --git a/gcc/config/s390/s390-c.c b/gcc/config/s390/s390-c.c
index 0521e1e..019d86e 100644
--- a/gcc/config/s390/s390-c.c
+++ b/gcc/config/s390/s390-c.c
@@ -339,7 +339,7 @@ s390_cpu_cpp_builtins_internal (cpp_reader *pfile,
       /* Z9_EC has the same level as Z9_109.  */
       arch_level--;
     /* Review when a new arch is added and increase the value.  */
-    char dummy[23 - 2 * PROCESSOR_max] __attribute__((unused));
+    char dummy[(PROCESSOR_max > 12) ? -1 : 1] __attribute__((unused));
     sprintf (macro_def, "__ARCH__=%d", arch_level);
     cpp_undef (pfile, "__ARCH__");
     cpp_define (pfile, macro_def);
diff --git a/gcc/config/s390/s390-opts.h b/gcc/config/s390/s390-opts.h
index 98e2810..65ac4f8 100644
--- a/gcc/config/s390/s390-opts.h
+++ b/gcc/config/s390/s390-opts.h
@@ -38,6 +38,7 @@ enum processor_type
   PROCESSOR_2817_Z196,
   PROCESSOR_2827_ZEC12,
   PROCESSOR_2964_Z13,
+  PROCESSOR_ARCH12,
   PROCESSOR_NATIVE,
   PROCESSOR_max
 };
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index eac39c5..c94edcc 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -334,6 +334,7 @@ const processor_table[] =
   { "z196",   PROCESSOR_2817_Z196,   &z196_cost },
   { "zEC12",  PROCESSOR_2827_ZEC12,  &zEC12_cost },
   { "z13",    PROCESSOR_2964_Z13,    &zEC12_cost },
+  { "arch12", PROCESSOR_ARCH12,      &zEC12_cost },
   { "native", PROCESSOR_NATIVE,      NULL }
 };
 
@@ -824,12 +825,18 @@ s390_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
 		 "(default with -march=zEC12 and higher).", fndecl);
 	  return const0_rtx;
 	}
-      if ((bflags & B_VX) && !TARGET_VX)
+      if (((bflags & B_VX) || (bflags & B_VXE)) && !TARGET_VX)
 	{
 	  error ("builtin %qF is not supported without -mvx "
 		 "(default with -march=z13 and higher).", fndecl);
 	  return const0_rtx;
 	}
+
+      if ((bflags & B_VXE) && !TARGET_VXE)
+	{
+	  error ("Builtin %qF requires arch12 or higher.", fndecl);
+	  return const0_rtx;
+	}
     }
   if (fcode >= S390_OVERLOADED_BUILTIN_VAR_OFFSET
       && fcode < S390_ALL_BUILTIN_MAX)
@@ -7781,6 +7788,7 @@ s390_issue_rate (void)
 	 instruction gets issued per cycle.  */
     case PROCESSOR_2827_ZEC12:
     case PROCESSOR_2964_Z13:
+    case PROCESSOR_ARCH12:
     default:
       return 1;
     }
@@ -13987,6 +13995,7 @@ s390_get_sched_attrmask (rtx_insn *insn)
 	mask |= S390_SCHED_ATTR_MASK_GROUPALONE;
       break;
     case PROCESSOR_2964_Z13:
+    case PROCESSOR_ARCH12:
       if (get_attr_z13_cracked (insn))
 	mask |= S390_SCHED_ATTR_MASK_CRACKED;
       if (get_attr_z13_expanded (insn))
@@ -14010,6 +14019,7 @@ s390_get_unit_mask (rtx_insn *insn, int *units)
   switch (s390_tune)
     {
     case PROCESSOR_2964_Z13:
+    case PROCESSOR_ARCH12:
       *units = 3;
       if (get_attr_z13_unit_lsu (insn))
 	mask |= 1 << 0;
@@ -14082,7 +14092,7 @@ s390_sched_score (rtx_insn *insn)
       break;
     }
 
-  if (s390_tune == PROCESSOR_2964_Z13)
+  if (s390_tune >= PROCESSOR_2964_Z13)
     {
       int units, i;
       unsigned unit_mask, m = 1;
@@ -14187,7 +14197,7 @@ s390_sched_reorder (FILE *file, int verbose,
 	      PRINT_SCHED_ATTR (S390_SCHED_ATTR_MASK_ENDGROUP, endgroup);
 	      PRINT_SCHED_ATTR (S390_SCHED_ATTR_MASK_GROUPALONE, groupalone);
 #undef PRINT_SCHED_ATTR
-	      if (s390_tune == PROCESSOR_2964_Z13)
+	      if (s390_tune >= PROCESSOR_2964_Z13)
 		{
 		  unsigned int unit_mask, m = 1;
 		  int units, j;
@@ -14250,7 +14260,7 @@ s390_sched_variable_issue (FILE *file, int verbose, rtx_insn *insn, int more)
 	    }
 	}
 
-      if (s390_tune == PROCESSOR_2964_Z13)
+      if (s390_tune >= PROCESSOR_2964_Z13)
 	{
 	  int units, i;
 	  unsigned unit_mask, m = 1;
@@ -14279,7 +14289,7 @@ s390_sched_variable_issue (FILE *file, int verbose, rtx_insn *insn, int more)
 	  PRINT_SCHED_ATTR (S390_SCHED_ATTR_MASK_GROUPALONE, groupalone);
 #undef PRINT_SCHED_ATTR
 
-	  if (s390_tune == PROCESSOR_2964_Z13)
+	  if (s390_tune >= PROCESSOR_2964_Z13)
 	    {
 	      unsigned int unit_mask, m = 1;
 	      int units, j;
@@ -14293,7 +14303,7 @@ s390_sched_variable_issue (FILE *file, int verbose, rtx_insn *insn, int more)
 	    }
 	  fprintf (file, " sched state: %d\n", s390_sched_state);
 
-	  if (s390_tune == PROCESSOR_2964_Z13)
+	  if (s390_tune >= PROCESSOR_2964_Z13)
 	    {
 	      int units, j;
 
diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
index 5828041..a372981 100644
--- a/gcc/config/s390/s390.h
+++ b/gcc/config/s390/s390.h
@@ -37,12 +37,14 @@ enum processor_flags
   PF_ZEC12 = 128,
   PF_TX = 256,
   PF_Z13 = 512,
-  PF_VX = 1024
+  PF_VX = 1024,
+  PF_ARCH12 = 2048,
+  PF_VXE = 4096
 };
 
 /* This is necessary to avoid a warning about comparing different enum
    types.  */
-#define s390_tune_attr ((enum attr_cpu)s390_tune)
+#define s390_tune_attr ((enum attr_cpu)(s390_tune > PROCESSOR_2964_Z13 ? PROCESSOR_2964_Z13 : s390_tune ))
 
 /* These flags indicate that the generated code should run on a cpu
    providing the respective hardware facility regardless of the
@@ -87,11 +89,19 @@ enum processor_flags
 #define TARGET_CPU_Z13 \
 	(s390_arch_flags & PF_Z13)
 #define TARGET_CPU_Z13_P(opts) \
-        (opts->x_s390_arch_flags & PF_Z13)
+	(opts->x_s390_arch_flags & PF_Z13)
 #define TARGET_CPU_VX \
-        (s390_arch_flags & PF_VX)
+	(s390_arch_flags & PF_VX)
 #define TARGET_CPU_VX_P(opts) \
 	(opts->x_s390_arch_flags & PF_VX)
+#define TARGET_CPU_ARCH12 \
+	(s390_arch_flags & PF_ARCH12)
+#define TARGET_CPU_ARCH12_P(opts) \
+	(opts->x_s390_arch_flags & PF_ARCH12)
+#define TARGET_CPU_VXE \
+	(s390_arch_flags & PF_VXE)
+#define TARGET_CPU_VXE_P(opts) \
+	(opts->x_s390_arch_flags & PF_VXE)
 
 #define TARGET_HARD_FLOAT_P(opts) (!TARGET_SOFT_FLOAT_P(opts))
 
@@ -137,6 +147,13 @@ enum processor_flags
 	(TARGET_ZARCH_P (opts->x_target_flags) && TARGET_CPU_VX_P (opts) \
 	 && TARGET_OPT_VX_P (opts->x_target_flags) \
 	 && TARGET_HARD_FLOAT_P (opts->x_target_flags))
+#define TARGET_ARCH12 (TARGET_ZARCH && TARGET_CPU_ARCH12)
+#define TARGET_ARCH12_P(opts)						\
+	(TARGET_ZARCH_P (opts->x_target_flags) && TARGET_CPU_ARCH12_P (opts))
+#define TARGET_VXE				\
+	(TARGET_VX && TARGET_CPU_VXE)
+#define TARGET_VXE_P(opts)						\
+	(TARGET_VX_P (opts) && TARGET_CPU_VXE_P (opts))
 
 #ifdef HAVE_AS_MACHINE_MACHINEMODE
 #define S390_USE_TARGET_ATTRIBUTE 1
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index d4d3781..53c8fed 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -478,11 +478,11 @@
 ;; distinguish between g5 and g6, but there are differences between the two
 ;; CPUs could in theory be modeled.
 
-(define_attr "cpu" "g5,g6,z900,z990,z9_109,z9_ec,z10,z196,zEC12,z13"
+(define_attr "cpu" "g5,g6,z900,z990,z9_109,z9_ec,z10,z196,zEC12,z13,arch12"
   (const (symbol_ref "s390_tune_attr")))
 
 (define_attr "cpu_facility"
-  "standard,ieee,zarch,cpu_zarch,longdisp,extimm,dfp,z10,z196,zEC12,vx,z13"
+  "standard,ieee,zarch,cpu_zarch,longdisp,extimm,dfp,z10,z196,zEC12,vx,z13,arch12,vxe"
   (const_string "standard"))
 
 (define_attr "enabled" ""
@@ -532,6 +532,14 @@
          (and (eq_attr "cpu_facility" "z13")
               (match_test "TARGET_Z13"))
 	 (const_int 1)
+
+         (and (eq_attr "cpu_facility" "arch12")
+              (match_test "TARGET_ARCH12"))
+	 (const_int 1)
+
+         (and (eq_attr "cpu_facility" "vxe")
+	      (match_test "TARGET_VXE"))
+	 (const_int 1)
 	 ]
 	(const_int 0)))
 
diff --git a/gcc/config/s390/s390.opt b/gcc/config/s390/s390.opt
index 494124f..d0a0d46 100644
--- a/gcc/config/s390/s390.opt
+++ b/gcc/config/s390/s390.opt
@@ -113,6 +113,9 @@ EnumValue
 Enum(processor_type) String(arch11) Value(PROCESSOR_2964_Z13)
 
 EnumValue
+Enum(processor_type) String(arch12) Value(PROCESSOR_ARCH12)
+
+EnumValue
 Enum(processor_type) String(native) Value(PROCESSOR_NATIVE) DriverOnly
 
 mbackchain
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index e6f0e2b..9ca13ab 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,10 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* gcc.target/s390/s390.exp: Run tests in arch12 and vxe dirs.
+	* lib/target-supports.exp: Add effective target check s390_vxe.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* gcc.target/s390/vector/vec-scalar-cmp-1.c: Adjust for the
 	comparison instructions used from now on.
 
diff --git a/gcc/testsuite/gcc.target/s390/s390.exp b/gcc/testsuite/gcc.target/s390/s390.exp
index d7a61f4..420aff1 100644
--- a/gcc/testsuite/gcc.target/s390/s390.exp
+++ b/gcc/testsuite/gcc.target/s390/s390.exp
@@ -209,6 +209,12 @@ dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*vector*/*.{c,S,C}]] \
 dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/target-attribute/*.{c,S,C}]] \
 	"" $DEFAULT_CFLAGS
 
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/arch12/*.{c,S,C}]] \
+	"" "-O3 -march=arch12 -mzarch"
+
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/vxe/*.{c,S}]] \
+	"" "-O3 -march=arch12 -mzarch"
+
 dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/md/*.{c,S,C}]] \
 	"" $DEFAULT_CFLAGS
 
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 290c527..342af27 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -8227,6 +8227,23 @@ proc check_effective_target_s390_vx { } {
 	}
     } "-march=z13 -mzarch" ]
 }
+
+# Same as above but for the arch12 vector enhancement facility. Test
+# is performed with the vector nand instruction.
+proc check_effective_target_s390_vxe { } {
+    if ![istarget s390*-*-*] then {
+	return 0;
+    }
+
+    return [check_runtime s390_check_vxe {
+	int main (void)
+	{
+	    asm ("vnn %%v24, %%v26, %%v28" : : : "v24", "v26", "v28");
+	    return 0;
+	}
+    } "-march=arch12 -mzarch" ]
+}
+
 #For versions of ARM architectures that have hardware div insn,
 #disable the divmod transform
 
-- 
2.9.1


* [PATCH 08/16] S/390: Rearrange fixuns_trunc pattern definitions.
  2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
                   ` (6 preceding siblings ...)
  2017-03-24 14:13 ` [PATCH 09/16] S/390: arch12: Add arch12 option Andreas Krebbel
@ 2017-03-24 14:13 ` Andreas Krebbel
  2017-03-24 14:13 ` [PATCH 04/16] S/390: movsf/sd pattern fixes Andreas Krebbel
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:13 UTC (permalink / raw)
  To: gcc-patches

This reworks the fixuns_trunc* patterns, which had become quite
confusing after z13 support was added.  There is now a single RTL
standard name expander ("fixuns_trunc<FP:mode><GPR:mode>2"), which
dispatches either to the emulation variants (*_emu) for machines
before z196 or to the hardware implementations.
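
For readers unfamiliar with the emulation path, the idea behind the
*_emu variants can be sketched in plain C.  This is only an
illustration using binary double; the function name is made up, and
the real expanders emit RTL (and, for DFP, do the comparison and
subtraction in TDmode).  The sketch assumes 0 <= x < 2^64; other
inputs are undefined here.

```c
#include <assert.h>
#include <stdint.h>

/* Emulate an unsigned 64-bit conversion using only a signed one.
   Hypothetical helper, not part of the patch.  */
uint64_t
fixuns_truncdfdi_emu_sketch (double x)
{
  const double two_63 = 9223372036854775808.0;	/* 2^63 */
  const double two_64 = 18446744073709551616.0;	/* 2^64 */

  /* Values below 2^63 fit into the signed range directly.  */
  if (x < two_63)
    return (uint64_t) (int64_t) x;

  /* Otherwise shift the value into [-2^63, 0), convert as signed, and
     let the cast back to unsigned wrap modulo 2^64.  */
  return (uint64_t) (int64_t) (x - two_64);
}
```

On z196 and later the hardware provides unsigned conversions directly,
so the expander only takes this route when !TARGET_Z196.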

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/s390.md
	("fixuns_truncdddi2", "fixuns_trunctddi2")
	("fixuns_trunc<BFP:mode><GPR:mode>2"): Merge into ...
	("fixuns_trunc<FP:mode><GPR:mode>2"): New expander.

	("fixuns_trunc<BFP:mode><GPR:mode>2", "fixuns_trunc<mode>si2"):
	Rename expanders to ...

	("fixuns_trunc<BFP:mode><GPR:mode>2_emu")
	("fixuns_truncdddi2_emu"): ... these.

	("fixuns_trunc<mode>si2_emu"): New expander.

	("*fixuns_truncdfdi2_z13"): Rename to ...
	("*fixuns_truncdfdi2_vx"): ... this.
---
 gcc/ChangeLog           |  18 ++++
 gcc/config/s390/s390.md | 253 +++++++++++++++++++++++++++---------------------
 2 files changed, 161 insertions(+), 110 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index fef571c..779101e 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,23 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* config/s390/s390.md
+	("fixuns_truncdddi2", "fixuns_trunctddi2")
+	("fixuns_trunc<BFP:mode><GPR:mode>2"): Merge into ...
+	("fixuns_trunc<FP:mode><GPR:mode>2"): New expander.
+
+	("fixuns_trunc<BFP:mode><GPR:mode>2", "fixuns_trunc<mode>si2"):
+	Rename expanders to ...
+
+	("fixuns_trunc<BFP:mode><GPR:mode>2_emu")
+	("fixuns_truncdddi2_emu"): ... these.
+
+	("fixuns_trunc<mode>si2_emu"): New expander.
+
+	("*fixuns_truncdfdi2_z13"): Rename to ...
+	("*fixuns_truncdfdi2_vx"): ... this.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* config/s390/2964.md: Remove the single element vector compare
 	instructions which are no longer used.
 	* config/s390/s390.c (s390_select_ccmode): Remove handling of
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index e72d5be..d4d3781 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -4735,152 +4735,185 @@
   "operands[2] = gen_lowpart (QImode, operands[0]);")
 
 ;
-; fixuns_trunc(dd|td)di2 instruction pattern(s).
+; fixuns_trunc(dd|td|sf|df|tf)(si|di)2 expander
 ;
 
-(define_expand "fixuns_truncdddi2"
+; This is the only entry point for fixuns_trunc.  It multiplexes the
+; expansion to either the *_emu expanders below for pre z196 machines
+; or emits the default pattern otherwise.
+(define_expand "fixuns_trunc<FP:mode><GPR:mode>2"
   [(parallel
-    [(set (match_operand:DI 0 "register_operand" "")
-	  (unsigned_fix:DI (match_operand:DD 1 "register_operand" "")))
-     (unspec:DI [(const_int DFP_RND_TOWARD_0)] UNSPEC_ROUND)
+    [(set (match_operand:GPR 0 "register_operand" "")
+	  (unsigned_fix:GPR (match_operand:FP 1 "register_operand" "")))
+     (unspec:GPR [(match_dup 2)] UNSPEC_ROUND)
      (clobber (reg:CC CC_REGNUM))])]
-
-  "TARGET_HARD_DFP"
+  "TARGET_HARD_FLOAT"
 {
   if (!TARGET_Z196)
     {
-      rtx_code_label *label1 = gen_label_rtx ();
-      rtx_code_label *label2 = gen_label_rtx ();
-      rtx temp = gen_reg_rtx (TDmode);
-      REAL_VALUE_TYPE cmp, sub;
-
-      decimal_real_from_string (&cmp, "9223372036854775808.0");  /* 2^63 */
-      decimal_real_from_string (&sub, "18446744073709551616.0"); /* 2^64 */
-
-      /* 2^63 can't be represented as 64bit DFP number with full precision.  The
-         solution is doing the check and the subtraction in TD mode and using a
-         TD -> DI convert afterwards.  */
-      emit_insn (gen_extendddtd2 (temp, operands[1]));
-      temp = force_reg (TDmode, temp);
-      emit_cmp_and_jump_insns (temp,
-	    const_double_from_real_value (cmp, TDmode),
-	    LT, NULL_RTX, VOIDmode, 0, label1);
-      emit_insn (gen_subtd3 (temp, temp,
-	    const_double_from_real_value (sub, TDmode)));
-      emit_insn (gen_fix_trunctddi2_dfp (operands[0], temp,
-					 GEN_INT (DFP_RND_TOWARD_MINF)));
-      emit_jump (label2);
-
-      emit_label (label1);
-      emit_insn (gen_fix_truncdddi2_dfp (operands[0], operands[1],
-					 GEN_INT (DFP_RND_TOWARD_0)));
-      emit_label (label2);
+      /* We don't provide emulation for TD|DD->SI.  */
+      if (GET_MODE_CLASS (<FP:MODE>mode) == MODE_DECIMAL_FLOAT
+	  && <GPR:MODE>mode == SImode)
+	FAIL;
+      emit_insn (gen_fixuns_trunc<FP:mode><GPR:mode>2_emu (operands[0],
+							       operands[1]));
       DONE;
     }
+
+  if (GET_MODE_CLASS (<FP:MODE>mode) == MODE_DECIMAL_FLOAT)
+    operands[2] = GEN_INT (DFP_RND_TOWARD_0);
+  else
+    operands[2] = GEN_INT (BFP_RND_TOWARD_0);
+})
+
+; (sf|df|tf)->unsigned (si|di)
+
+; Emulate the unsigned conversion with the signed version for pre z196
+; machines.
+(define_expand "fixuns_trunc<BFP:mode><GPR:mode>2_emu"
+  [(parallel
+    [(set (match_operand:GPR 0 "register_operand" "")
+	  (unsigned_fix:GPR (match_operand:BFP 1 "register_operand" "")))
+     (unspec:GPR [(const_int BFP_RND_TOWARD_0)] UNSPEC_ROUND)
+     (clobber (reg:CC CC_REGNUM))])]
+  "!TARGET_Z196 && TARGET_HARD_FLOAT"
+{
+  rtx_code_label *label1 = gen_label_rtx ();
+  rtx_code_label *label2 = gen_label_rtx ();
+  rtx temp = gen_reg_rtx (<BFP:MODE>mode);
+  REAL_VALUE_TYPE cmp, sub;
+
+  operands[1] = force_reg (<BFP:MODE>mode, operands[1]);
+  real_2expN (&cmp, <GPR:bitsize> - 1, <BFP:MODE>mode);
+  real_2expN (&sub, <GPR:bitsize>, <BFP:MODE>mode);
+
+  emit_cmp_and_jump_insns (operands[1],
+			   const_double_from_real_value (cmp, <BFP:MODE>mode),
+			   LT, NULL_RTX, VOIDmode, 0, label1);
+  emit_insn (gen_sub<BFP:mode>3 (temp, operands[1],
+	       const_double_from_real_value (sub, <BFP:MODE>mode)));
+  emit_insn (gen_fix_trunc<BFP:mode><GPR:mode>2_bfp (operands[0], temp,
+	       GEN_INT (BFP_RND_TOWARD_MINF)));
+  emit_jump (label2);
+
+  emit_label (label1);
+  emit_insn (gen_fix_trunc<BFP:mode><GPR:mode>2_bfp (operands[0],
+							 operands[1],
+							 GEN_INT (BFP_RND_TOWARD_0)));
+  emit_label (label2);
+  DONE;
 })
 
-(define_expand "fixuns_trunctddi2"
+; dd->unsigned di
+
+; Emulate the unsigned conversion with the signed version for pre z196
+; machines.
+(define_expand "fixuns_truncdddi2_emu"
   [(parallel
     [(set (match_operand:DI 0 "register_operand" "")
-	  (unsigned_fix:DI (match_operand:TD 1 "register_operand" "")))
+	  (unsigned_fix:DI (match_operand:DD 1 "register_operand" "")))
      (unspec:DI [(const_int DFP_RND_TOWARD_0)] UNSPEC_ROUND)
      (clobber (reg:CC CC_REGNUM))])]
 
-  "TARGET_HARD_DFP"
-{
-  if (!TARGET_Z196)
-    {
-      rtx_code_label *label1 = gen_label_rtx ();
-      rtx_code_label *label2 = gen_label_rtx ();
-      rtx temp = gen_reg_rtx (TDmode);
-      REAL_VALUE_TYPE cmp, sub;
-
-      operands[1] = force_reg (TDmode, operands[1]);
-      decimal_real_from_string (&cmp, "9223372036854775808.0");  /* 2^63 */
-      decimal_real_from_string (&sub, "18446744073709551616.0"); /* 2^64 */
-
-      emit_cmp_and_jump_insns (operands[1],
-	    const_double_from_real_value (cmp, TDmode),
-	    LT, NULL_RTX, VOIDmode, 0, label1);
-      emit_insn (gen_subtd3 (temp, operands[1],
-	    const_double_from_real_value (sub, TDmode)));
-      emit_insn (gen_fix_trunctddi2_dfp (operands[0], temp,
-					 GEN_INT (DFP_RND_TOWARD_MINF)));
-      emit_jump (label2);
-
-      emit_label (label1);
-      emit_insn (gen_fix_trunctddi2_dfp (operands[0], operands[1],
-					 GEN_INT (DFP_RND_TOWARD_0)));
-      emit_label (label2);
-      DONE;
-    }
+  "!TARGET_Z196 && TARGET_HARD_DFP"
+{
+  rtx_code_label *label1 = gen_label_rtx ();
+  rtx_code_label *label2 = gen_label_rtx ();
+  rtx temp = gen_reg_rtx (TDmode);
+  REAL_VALUE_TYPE cmp, sub;
+
+  decimal_real_from_string (&cmp, "9223372036854775808.0");  /* 2^63 */
+  decimal_real_from_string (&sub, "18446744073709551616.0"); /* 2^64 */
+
+  /* 2^63 can't be represented as 64bit DFP number with full precision.  The
+     solution is doing the check and the subtraction in TD mode and using a
+     TD -> DI convert afterwards.  */
+  emit_insn (gen_extendddtd2 (temp, operands[1]));
+  temp = force_reg (TDmode, temp);
+  emit_cmp_and_jump_insns (temp,
+			   const_double_from_real_value (cmp, TDmode),
+			   LT, NULL_RTX, VOIDmode, 0, label1);
+  emit_insn (gen_subtd3 (temp, temp,
+			 const_double_from_real_value (sub, TDmode)));
+  emit_insn (gen_fix_trunctddi2_dfp (operands[0], temp,
+				     GEN_INT (DFP_RND_TOWARD_MINF)));
+  emit_jump (label2);
+
+  emit_label (label1);
+  emit_insn (gen_fix_truncdddi2_dfp (operands[0], operands[1],
+				     GEN_INT (DFP_RND_TOWARD_0)));
+  emit_label (label2);
+  DONE;
 })
 
-;
-; fixuns_trunc(sf|df|tf)(si|di)2 and fix_trunc(sf|df|tf)(si|di)2
-; instruction pattern(s).
-;
+; td->unsigned di
 
-(define_expand "fixuns_trunc<BFP:mode><GPR:mode>2"
+; Emulate the unsigned conversion with the signed version for pre z196
+; machines.
+(define_expand "fixuns_trunctddi2_emu"
   [(parallel
-    [(set (match_operand:GPR 0 "register_operand" "")
-	  (unsigned_fix:GPR (match_operand:BFP 1 "register_operand" "")))
-     (unspec:GPR [(const_int BFP_RND_TOWARD_0)] UNSPEC_ROUND)
+    [(set (match_operand:DI 0 "register_operand" "")
+	  (unsigned_fix:DI (match_operand:TD 1 "register_operand" "")))
+     (unspec:DI [(const_int DFP_RND_TOWARD_0)] UNSPEC_ROUND)
      (clobber (reg:CC CC_REGNUM))])]
-  "TARGET_HARD_FLOAT"
-{
-  if (!TARGET_Z196)
-    {
-      rtx_code_label *label1 = gen_label_rtx ();
-      rtx_code_label *label2 = gen_label_rtx ();
-      rtx temp = gen_reg_rtx (<BFP:MODE>mode);
-      REAL_VALUE_TYPE cmp, sub;
-
-      operands[1] = force_reg (<BFP:MODE>mode, operands[1]);
-      real_2expN (&cmp, <GPR:bitsize> - 1, <BFP:MODE>mode);
-      real_2expN (&sub, <GPR:bitsize>, <BFP:MODE>mode);
-
-      emit_cmp_and_jump_insns (operands[1],
-	    const_double_from_real_value (cmp, <BFP:MODE>mode),
-	    LT, NULL_RTX, VOIDmode, 0, label1);
-      emit_insn (gen_sub<BFP:mode>3 (temp, operands[1],
-	    const_double_from_real_value (sub, <BFP:MODE>mode)));
-      emit_insn (gen_fix_trunc<BFP:mode><GPR:mode>2_bfp (operands[0], temp,
-	    GEN_INT (BFP_RND_TOWARD_MINF)));
-      emit_jump (label2);
 
-      emit_label (label1);
-      emit_insn (gen_fix_trunc<BFP:mode><GPR:mode>2_bfp (operands[0],
-	    operands[1], GEN_INT (BFP_RND_TOWARD_0)));
-      emit_label (label2);
-      DONE;
-    }
+  "!TARGET_Z196 && TARGET_HARD_DFP"
+{
+  rtx_code_label *label1 = gen_label_rtx ();
+  rtx_code_label *label2 = gen_label_rtx ();
+  rtx temp = gen_reg_rtx (TDmode);
+  REAL_VALUE_TYPE cmp, sub;
+
+  operands[1] = force_reg (TDmode, operands[1]);
+  decimal_real_from_string (&cmp, "9223372036854775808.0");  /* 2^63 */
+  decimal_real_from_string (&sub, "18446744073709551616.0"); /* 2^64 */
+
+  emit_cmp_and_jump_insns (operands[1],
+			   const_double_from_real_value (cmp, TDmode),
+			   LT, NULL_RTX, VOIDmode, 0, label1);
+  emit_insn (gen_subtd3 (temp, operands[1],
+			 const_double_from_real_value (sub, TDmode)));
+  emit_insn (gen_fix_trunctddi2_dfp (operands[0], temp,
+				     GEN_INT (DFP_RND_TOWARD_MINF)));
+  emit_jump (label2);
+
+  emit_label (label1);
+  emit_insn (gen_fix_trunctddi2_dfp (operands[0], operands[1],
+				     GEN_INT (DFP_RND_TOWARD_0)));
+  emit_label (label2);
+  DONE;
 })
 
-; fixuns_trunc(td|dd)si2 expander
-(define_expand "fixuns_trunc<mode>si2"
+; Just a dummy to make the code in the first expander a bit easier.
+(define_expand "fixuns_trunc<mode>si2_emu"
   [(parallel
     [(set (match_operand:SI 0 "register_operand" "")
 	  (unsigned_fix:SI (match_operand:DFP 1 "register_operand" "")))
-     (unspec:SI [(const_int DFP_RND_TOWARD_0)] UNSPEC_ROUND)
+     (unspec:DI [(const_int DFP_RND_TOWARD_0)] UNSPEC_ROUND)
      (clobber (reg:CC CC_REGNUM))])]
-  "TARGET_Z196 && TARGET_HARD_DFP"
-  "")
+
+  "!TARGET_Z196 && TARGET_HARD_DFP"
+ {
+   FAIL;
+ })
+
 
 ; fixuns_trunc(tf|df|sf|td|dd)(di|si)2 instruction patterns.
 
-(define_insn "*fixuns_truncdfdi2_z13"
+; df -> unsigned di
+(define_insn "*fixuns_truncdfdi2_vx"
   [(set (match_operand:DI                  0 "register_operand" "=d,v")
 	(unsigned_fix:DI (match_operand:DF 1 "register_operand"  "f,v")))
    (unspec:DI [(match_operand:DI           2 "immediate_operand" "K,K")] UNSPEC_ROUND)
    (clobber (reg:CC CC_REGNUM))]
-   "TARGET_VX && TARGET_HARD_FLOAT"
-   "@
-    clgdbr\t%0,%h2,%1,0
-    wclgdb\t%v0,%v1,0,%h2"
-   [(set_attr "op_type" "RRF,VRR")
-    (set_attr "type"    "ftoi")])
+  "TARGET_VX && TARGET_HARD_FLOAT"
+  "@
+   clgdbr\t%0,%h2,%1,0
+   wclgdb\t%v0,%v1,0,%h2"
+  [(set_attr "op_type" "RRF,VRR")
+   (set_attr "type"    "ftoi")])
 
+; (dd|td|sf|df|tf)->unsigned (di|si)
 ; clfebr, clfdbr, clfxbr, clgebr, clgdbr, clgxbr
 ;         clfdtr, clfxtr,         clgdtr, clgxtr
 (define_insn "*fixuns_trunc<FP:mode><GPR:mode>2_z196"
-- 
2.9.1

* [PATCH 15/16] S/390: arch12: Support new vector floating point modes.
  2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
                   ` (9 preceding siblings ...)
  2017-03-24 14:13 ` [PATCH 11/16] S/390: arch12: New vector popcount variants Andreas Krebbel
@ 2017-03-24 14:13 ` Andreas Krebbel
  2017-03-24 14:13 ` [PATCH 01/16] S/390: Rename cpu facility vec to vx Andreas Krebbel
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:13 UTC (permalink / raw)
  To: gcc-patches

This patch adds support for the new floating-point vector element
modes (SF and TF) introduced with arch12.
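
As a rough illustration of the kind of source code this affects, here
is a small C sketch using GCC's generic vector extension with four
single-precision elements (V4SF in GCC mode terms).  It is not the
negfma-1.c test added by this patch; the type name v4sf and the
function name neg_fma_v4sf are made up for the example.  Whether the
compiler selects one of the new patterns for it depends on the target
options and optimization level, so treat that as an assumption.

```c
/* Four floats in one 16-byte vector (GCC vector extension).  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Hypothetical example function: element-wise -(a * b + c).  */
v4sf
neg_fma_v4sf (v4sf a, v4sf b, v4sf c)
{
  return -(a * b + c);
}
```

The same source compiles on any target that supports the vector
extension; without hardware support for the vector mode, GCC lowers
the operations to scalar code.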

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/s390.c (s390_expand_vec_compare): Support other
	vector floating point modes than just V2DF.
	(s390_expand_vcond): Likewise.
	(s390_hard_regno_mode_ok): Allow SFmode values in VRs.
	(s390_cannot_change_mode_class): Prevent mode changes between TF
	and V1TF in vector registers.
	* config/s390/s390.md (DF, SF): New mode attributes.
	("*cmp<mode>_ccs", "add<mode>3", "sub<mode>3", "mul<mode>3")
	("fma<mode>4", "fms<mode>4", "div<mode>3", "*neg<mode>2"): Add
	SFmode support for VRs.
	* config/s390/vector.md (V_HW, V_HW2, VT_HW, ti*, nonvec): Add new
	vector fp modes.
	(VFT, VF_HW): New mode iterators.
	(vw, sdx): New mode attributes.
	("addv2df3", "subv2df3", "mulv2df3", "divv2df3", "sqrtv2df2")
	("fmav2df4","fmsv2df4", "negv2df2", "absv2df2", "*negabsv2df2")
	("smaxv2df3", "sminv2df3", "*vec_cmp<VFCMP_HW_OP:code>v2df_nocc")
	("vec_cmpuneqv2df", "vec_cmpltgtv2df", "vec_orderedv2df")
	("vec_unorderedv2df"): Adjust the v2df only patterns to support
	also the new vector floating point modes.  Renaming to ...

	("add<mode>3", "sub<mode>3", "mul<mode>3", "div<mode>3")
	("sqrt<mode>2", "fma<mode>4", "fms<mode>4", "neg<mode>2")
	("abs<mode>2", "negabs<mode>2", "smax<mode>3")
	("smin<mode>3", "*vec_cmp<VFCMP_HW_OP:code><mode>_nocc")
	("vec_cmpuneq<mode>", "vec_cmpltgt<mode>", "vec_ordered<mode>")
	("vec_unordered<mode>"): ... these.

	("neg_fma<mode>4", "neg_fms<mode>4", "*smax<mode>3_vxe")
	("*smin<mode>3_vxe", "*sminv2df3_vx", "*vec_extendv4sf")
	("*vec_extendv2df"): New insn definitions.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* gcc.target/s390/vxe/negfma-1.c: New test.
---
 gcc/ChangeLog                                |  34 +++
 gcc/config/s390/s390-builtins.def            |   2 +-
 gcc/config/s390/s390.c                       |  13 +-
 gcc/config/s390/s390.md                      | 144 ++++++------
 gcc/config/s390/vector.md                    | 313 ++++++++++++++++++---------
 gcc/testsuite/ChangeLog                      |   4 +
 gcc/testsuite/gcc.target/s390/vxe/negfma-1.c |  49 +++++
 7 files changed, 388 insertions(+), 171 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/negfma-1.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3753ad6..bd60982 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,39 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* config/s390/s390.c (s390_expand_vec_compare): Support other
+	vector floating point modes than just V2DF.
+	(s390_expand_vcond): Likewise.
+	(s390_hard_regno_mode_ok): Allow SFmode values in VRs.
+	(s390_cannot_change_mode_class): Prevent mode changes between TF
+	and V1TF in vector registers.
+	* config/s390/s390.md (DF, SF): New mode attributes.
+	("*cmp<mode>_ccs", "add<mode>3", "sub<mode>3", "mul<mode>3")
+	("fma<mode>4", "fms<mode>4", "div<mode>3", "*neg<mode>2"): Add
+	SFmode support for VRs.
+	* config/s390/vector.md (V_HW, V_HW2, VT_HW, ti*, nonvec): Add new
+	vector fp modes.
+	(VFT, VF_HW): New mode iterators.
+	(vw, sdx): New mode attributes.
+	("addv2df3", "subv2df3", "mulv2df3", "divv2df3", "sqrtv2df2")
+	("fmav2df4","fmsv2df4", "negv2df2", "absv2df2", "*negabsv2df2")
+	("smaxv2df3", "sminv2df3", "*vec_cmp<VFCMP_HW_OP:code>v2df_nocc")
+	("vec_cmpuneqv2df", "vec_cmpltgtv2df", "vec_orderedv2df")
+	("vec_unorderedv2df"): Adjust the v2df only patterns to support
+	also the new vector floating point modes.  Renaming to ...
+
+	("add<mode>3", "sub<mode>3", "mul<mode>3", "div<mode>3")
+	("sqrt<mode>2", "fma<mode>4", "fms<mode>4", "neg<mode>2")
+	("abs<mode>2", "negabs<mode>2", "smax<mode>3")
+	("smin<mode>3", "*vec_cmp<VFCMP_HW_OP:code><mode>_nocc")
+	("vec_cmpuneq<mode>", "vec_cmpltgt<mode>", "vec_ordered<mode>")
+	("vec_unordered<mode>"): ... these.
+
+	("neg_fma<mode>4", "neg_fms<mode>4", "*smax<mode>3_vxe")
+	("*smin<mode>3_vxe", "*sminv2df3_vx", "*vec_extendv4sf")
+	("*vec_extendv2df"): New insn definitions.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* config/s390/s390.md ("*adddi3_sign", "*subdi3_sign", "mulditi3")
 	("mulditi3_2", "*muldi3_sign"): New patterns.
 	("muldi3", "*muldi3", "mulsi3", "*mulsi3"): Add an expander and
diff --git a/gcc/config/s390/s390-builtins.def b/gcc/config/s390/s390-builtins.def
index 27cb6a8..bb2d743 100644
--- a/gcc/config/s390/s390-builtins.def
+++ b/gcc/config/s390/s390-builtins.def
@@ -2496,7 +2496,7 @@ B_DEF      (s390_vec_ctsl,              vec_ctsl,           0,
 B_DEF      (s390_vec_ctul,              vec_ctul,           0,                  B_VX,               O2_U3,              BT_FN_UV2DI_V2DF_INT)                    /* vclgdb */
 B_DEF      (s390_vcgdb,                 vec_df_to_di_s64,   0,                  B_VX,               O2_U3,              BT_FN_V2DI_V2DF_INT)                     /* vcgdb */
 B_DEF      (s390_vclgdb,                vec_df_to_di_u64,   0,                  B_VX,               O2_U3,              BT_FN_UV2DI_V2DF_INT)                    /* vclgdb */
-B_DEF      (s390_vfidb,                 vfidb,              0,                  B_VX,               O2_U4 | O3_U3,      BT_FN_V2DF_V2DF_UCHAR_UCHAR)
+B_DEF      (s390_vfidb,                 vfiv2df,            0,                  B_VX,               O2_U4 | O3_U3,      BT_FN_V2DF_V2DF_UCHAR_UCHAR)
 B_DEF      (s390_vec_ld2f,              vec_ld2f,           0,                  B_VX,               0,                  BT_FN_V2DF_FLTCONSTPTR)                  /* vldeb */
 B_DEF      (s390_vec_st2f,              vec_st2f,           0,                  B_VX,               0,                  BT_FN_VOID_V2DF_FLTPTR)                  /* vledb */
 B_DEF      (s390_vfmadb,                fmav2df4,           0,                  B_VX,               0,                  BT_FN_V2DF_V2DF_V2DF_V2DF)
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index e800323..1d26979 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -6201,7 +6201,7 @@ s390_expand_vec_compare (rtx target, enum rtx_code cond,
   bool neg_p = false, swap_p = false;
   rtx tmp;
 
-  if (GET_MODE (cmp_op1) == V2DFmode)
+  if (GET_MODE_CLASS (GET_MODE (cmp_op1)) == MODE_VECTOR_FLOAT)
     {
       switch (cond)
 	{
@@ -6447,7 +6447,8 @@ s390_expand_vcond (rtx target, rtx then, rtx els,
 
   /* We always use an integral type vector to hold the comparison
      result.  */
-  result_mode = cmp_mode == V2DFmode ? V2DImode : cmp_mode;
+  result_mode = mode_for_vector (int_mode_for_mode (GET_MODE_INNER (cmp_mode)),
+				 GET_MODE_NUNITS (cmp_mode));
   result_target = gen_reg_rtx (result_mode);
 
   /* We allow vector immediates as comparison operands that
@@ -10112,6 +10113,7 @@ s390_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
       return ((GET_MODE_CLASS (mode) == MODE_INT
 	       && s390_class_max_nregs (VEC_REGS, mode) == 1)
 	      || mode == DFmode
+	      || (TARGET_VXE && mode == SFmode)
 	      || s390_vector_mode_supported_p (mode));
       break;
     case FP_REGS:
@@ -10256,6 +10258,13 @@ s390_cannot_change_mode_class (machine_mode from_mode,
   machine_mode small_mode;
   machine_mode big_mode;
 
+  /* V1TF and TF have different representations in vector
+     registers.  */
+  if (reg_classes_intersect_p (VEC_REGS, rclass)
+      && ((from_mode == V1TFmode && to_mode == TFmode)
+	  || (from_mode == TFmode && to_mode == V1TFmode)))
+    return 1;
+
   if (GET_MODE_SIZE (from_mode) == GET_MODE_SIZE (to_mode))
     return 0;
 
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 93a0bc6..7e9add7 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -674,6 +674,12 @@
 (define_mode_attr DFDI [(TF "0") (DF "*") (SF "0")
 			(TD "0") (DD "0") (DD "0")
 			(TI "0") (DI "*") (SI "0")])
+(define_mode_attr DF [(TF "0") (DF "*") (SF "0")
+		      (TD "0") (DD "0") (DD "0")
+		      (TI "0") (DI "0") (SI "0")])
+(define_mode_attr SF [(TF "0") (DF "0") (SF "*")
+		      (TD "0") (DD "0") (DD "0")
+		      (TI "0") (DI "0") (SI "0")])
 
 ;; This attribute is used in the operand constraint list
 ;; for instructions dealing with the sign bit of 32 or 64bit fp values.
@@ -1325,20 +1331,21 @@
  })
 
 
-; cxtr, cdtr, cxbr, cdbr, cebr, cdb, ceb, wfcdb
+; VX: TFmode in FPR pairs: use cxbr instead of wfcxb
+; cxtr, cdtr, cxbr, cdbr, cebr, cdb, ceb, wfcsb, wfcdb
 (define_insn "*cmp<mode>_ccs"
   [(set (reg CC_REGNUM)
-        (compare (match_operand:FP 0 "register_operand" "f,f,v")
-                 (match_operand:FP 1 "general_operand"  "f,R,v")))]
+        (compare (match_operand:FP 0 "register_operand" "f,f,v,v")
+                 (match_operand:FP 1 "general_operand"  "f,R,v,v")))]
   "s390_match_ccmode(insn, CCSmode) && TARGET_HARD_FLOAT"
   "@
    c<xde><bt>r\t%0,%1
    c<xde>b\t%0,%1
-   wfcdb\t%0,%1"
-  [(set_attr "op_type" "RRE,RXE,VRR")
-   (set_attr "cpu_facility" "*,*,vx")
-   (set_attr "enabled" "*,<DSF>,<DFDI>")])
-
+   wfcdb\t%0,%1
+   wfcsb\t%0,%1"
+  [(set_attr "op_type" "RRE,RXE,VRR,VRR")
+   (set_attr "cpu_facility" "*,*,vx,vxe")
+   (set_attr "enabled" "*,<DSF>,<DF>,<SF>")])
 
 ; Compare and Branch instructions
 
@@ -5159,6 +5166,7 @@
 ; extend(sf|df)(df|tf)2 instruction pattern(s).
 ;
 
+; wflls
 (define_insn "*extendsfdf2_z13"
   [(set (match_operand:DF                  0 "register_operand"     "=f,f,v")
         (float_extend:DF (match_operand:SF 1 "nonimmediate_operand"  "f,R,v")))]
@@ -5811,20 +5819,21 @@
 ; axbr, adbr, aebr, axb, adb, aeb, adtr, axtr
 ; FIXME: wfadb does not clobber cc
 (define_insn "add<mode>3"
-  [(set (match_operand:FP 0 "register_operand"              "=f,f,f,v")
-        (plus:FP (match_operand:FP 1 "nonimmediate_operand" "%f,0,0,v")
-		 (match_operand:FP 2 "general_operand"       "f,f,R,v")))
+  [(set (match_operand:FP          0 "register_operand"     "=f,f,f,v,v")
+        (plus:FP (match_operand:FP 1 "nonimmediate_operand" "%f,0,0,v,v")
+		 (match_operand:FP 2 "general_operand"       "f,f,R,v,v")))
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_HARD_FLOAT"
   "@
    a<xde>tr\t%0,%1,%2
    a<xde>br\t%0,%2
    a<xde>b\t%0,%2
-   wfadb\t%v0,%v1,%v2"
-  [(set_attr "op_type"      "RRF,RRE,RXE,VRR")
+   wfadb\t%v0,%v1,%v2
+   wfasb\t%v0,%v1,%v2"
+  [(set_attr "op_type"      "RRF,RRE,RXE,VRR,VRR")
    (set_attr "type"         "fsimp<mode>")
-   (set_attr "cpu_facility" "*,*,*,vx")
-   (set_attr "enabled"      "<nBFP>,<nDFP>,<DSF>,<DFDI>")])
+   (set_attr "cpu_facility" "*,*,*,vx,vxe")
+   (set_attr "enabled"      "<nBFP>,<nDFP>,<DSF>,<DF>,<SF>")])
 
 ; axbr, adbr, aebr, axb, adb, aeb, adtr, axtr
 (define_insn "*add<mode>3_cc"
@@ -6249,28 +6258,30 @@
 ; sub(tf|df|sf|td|dd)3 instruction pattern(s).
 ;
 
+; FIXME: (clobber (match_scratch:CC 3 "=c,c,c,X,X")) does not work - why?
 ; sxbr, sdbr, sebr, sdb, seb, sxtr, sdtr
 (define_insn "sub<mode>3"
-  [(set (match_operand:FP           0 "register_operand" "=f,f,f,v")
-        (minus:FP (match_operand:FP 1 "register_operand"  "f,0,0,v")
-                  (match_operand:FP 2 "general_operand"   "f,f,R,v")))
+  [(set (match_operand:FP           0 "register_operand" "=f,f,f,v,v")
+        (minus:FP (match_operand:FP 1 "register_operand"  "f,0,0,v,v")
+		  (match_operand:FP 2 "general_operand"   "f,f,R,v,v")))
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_HARD_FLOAT"
   "@
    s<xde>tr\t%0,%1,%2
    s<xde>br\t%0,%2
    s<xde>b\t%0,%2
-   wfsdb\t%v0,%v1,%v2"
-  [(set_attr "op_type"      "RRF,RRE,RXE,VRR")
+   wfsdb\t%v0,%v1,%v2
+   wfssb\t%v0,%v1,%v2"
+  [(set_attr "op_type"      "RRF,RRE,RXE,VRR,VRR")
    (set_attr "type"         "fsimp<mode>")
-   (set_attr "cpu_facility" "*,*,*,vx")
-   (set_attr "enabled"      "<nBFP>,<nDFP>,<DSF>,<DFDI>")])
+   (set_attr "cpu_facility" "*,*,*,vx,vxe")
+   (set_attr "enabled"      "<nBFP>,<nDFP>,<DSF>,<DF>,<SF>")])
 
 ; sxbr, sdbr, sebr, sdb, seb, sxtr, sdtr
 (define_insn "*sub<mode>3_cc"
   [(set (reg CC_REGNUM)
 	(compare (minus:FP (match_operand:FP 1 "nonimmediate_operand" "f,0,0")
-                           (match_operand:FP 2 "general_operand"      "f,f,R"))
+			   (match_operand:FP 2 "general_operand"      "f,f,R"))
 		 (match_operand:FP 3 "const0_operand" "")))
    (set (match_operand:FP 0 "register_operand" "=f,f,f")
 	(minus:FP (match_dup 1) (match_dup 2)))]
@@ -6736,51 +6747,54 @@
 
 ; mxbr, mdbr, meebr, mxb, mxb, meeb, mdtr, mxtr
 (define_insn "mul<mode>3"
-  [(set (match_operand:FP          0 "register_operand"     "=f,f,f,v")
-        (mult:FP (match_operand:FP 1 "nonimmediate_operand" "%f,0,0,v")
-                 (match_operand:FP 2 "general_operand"       "f,f,R,v")))]
+  [(set (match_operand:FP          0 "register_operand"     "=f,f,f,v,v")
+        (mult:FP (match_operand:FP 1 "nonimmediate_operand" "%f,0,0,v,v")
+		 (match_operand:FP 2 "general_operand"       "f,f,R,v,v")))]
   "TARGET_HARD_FLOAT"
   "@
    m<xdee>tr\t%0,%1,%2
    m<xdee>br\t%0,%2
    m<xdee>b\t%0,%2
-   wfmdb\t%v0,%v1,%v2"
-  [(set_attr "op_type"      "RRF,RRE,RXE,VRR")
+   wfmdb\t%v0,%v1,%v2
+   wfmsb\t%v0,%v1,%v2"
+  [(set_attr "op_type"      "RRF,RRE,RXE,VRR,VRR")
    (set_attr "type"         "fmul<mode>")
-   (set_attr "cpu_facility" "*,*,*,vx")
-   (set_attr "enabled"      "<nBFP>,<nDFP>,<DSF>,<DFDI>")])
+   (set_attr "cpu_facility" "*,*,*,vx,vxe")
+   (set_attr "enabled"      "<nBFP>,<nDFP>,<DSF>,<DF>,<SF>")])
 
 ; madbr, maebr, maxb, madb, maeb
 (define_insn "fma<mode>4"
-  [(set (match_operand:DSF          0 "register_operand"     "=f,f,v")
-	(fma:DSF (match_operand:DSF 1 "nonimmediate_operand" "%f,f,v")
-		 (match_operand:DSF 2 "nonimmediate_operand"  "f,R,v")
-		 (match_operand:DSF 3 "register_operand"      "0,0,v")))]
+  [(set (match_operand:DSF          0 "register_operand"     "=f,f,v,v")
+	(fma:DSF (match_operand:DSF 1 "nonimmediate_operand" "%f,f,v,v")
+		 (match_operand:DSF 2 "nonimmediate_operand"  "f,R,v,v")
+		 (match_operand:DSF 3 "register_operand"      "0,0,v,v")))]
   "TARGET_HARD_FLOAT"
   "@
    ma<xde>br\t%0,%1,%2
    ma<xde>b\t%0,%1,%2
-   wfmadb\t%v0,%v1,%v2,%v3"
-  [(set_attr "op_type"      "RRE,RXE,VRR")
+   wfmadb\t%v0,%v1,%v2,%v3
+   wfmasb\t%v0,%v1,%v2,%v3"
+  [(set_attr "op_type"      "RRE,RXE,VRR,VRR")
    (set_attr "type"         "fmadd<mode>")
-   (set_attr "cpu_facility" "*,*,vx")
-   (set_attr "enabled"      "*,*,<DFDI>")])
+   (set_attr "cpu_facility" "*,*,vx,vxe")
+   (set_attr "enabled"      "*,*,<DF>,<SF>")])
 
 ; msxbr, msdbr, msebr, msxb, msdb, mseb
 (define_insn "fms<mode>4"
-  [(set (match_operand:DSF                   0 "register_operand"     "=f,f,v")
-	(fma:DSF (match_operand:DSF          1 "nonimmediate_operand" "%f,f,v")
-		 (match_operand:DSF          2 "nonimmediate_operand"  "f,R,v")
-		 (neg:DSF (match_operand:DSF 3 "register_operand"      "0,0,v"))))]
+  [(set (match_operand:DSF                   0 "register_operand"     "=f,f,v,v")
+	(fma:DSF (match_operand:DSF          1 "nonimmediate_operand" "%f,f,v,v")
+		 (match_operand:DSF          2 "nonimmediate_operand"  "f,R,v,v")
+		 (neg:DSF (match_operand:DSF 3 "register_operand"      "0,0,v,v"))))]
   "TARGET_HARD_FLOAT"
   "@
    ms<xde>br\t%0,%1,%2
    ms<xde>b\t%0,%1,%2
-   wfmsdb\t%v0,%v1,%v2,%v3"
-  [(set_attr "op_type"      "RRE,RXE,VRR")
+   wfmsdb\t%v0,%v1,%v2,%v3
+   wfmssb\t%v0,%v1,%v2,%v3"
+  [(set_attr "op_type"      "RRE,RXE,VRR,VRR")
    (set_attr "type"         "fmadd<mode>")
-   (set_attr "cpu_facility" "*,*,vx")
-   (set_attr "enabled"      "*,*,<DFDI>")])
+   (set_attr "cpu_facility" "*,*,vx,vxe")
+   (set_attr "enabled"      "*,*,<DF>,<SF>")])
 
 ;;
 ;;- Divide and modulo instructions.
@@ -7212,19 +7226,20 @@
 
 ; dxbr, ddbr, debr, dxb, ddb, deb, ddtr, dxtr
 (define_insn "div<mode>3"
-  [(set (match_operand:FP         0 "register_operand" "=f,f,f,v")
-        (div:FP (match_operand:FP 1 "register_operand"  "f,0,0,v")
-		(match_operand:FP 2 "general_operand"   "f,f,R,v")))]
+  [(set (match_operand:FP         0 "register_operand" "=f,f,f,v,v")
+        (div:FP (match_operand:FP 1 "register_operand"  "f,0,0,v,v")
+		(match_operand:FP 2 "general_operand"   "f,f,R,v,v")))]
   "TARGET_HARD_FLOAT"
   "@
    d<xde>tr\t%0,%1,%2
    d<xde>br\t%0,%2
    d<xde>b\t%0,%2
-   wfddb\t%v0,%v1,%v2"
-  [(set_attr "op_type"      "RRF,RRE,RXE,VRR")
+   wfddb\t%v0,%v1,%v2
+   wfdsb\t%v0,%v1,%v2"
+  [(set_attr "op_type"      "RRF,RRE,RXE,VRR,VRR")
    (set_attr "type"         "fdiv<mode>")
-   (set_attr "cpu_facility" "*,*,*,vx")
-   (set_attr "enabled"      "<nBFP>,<nDFP>,<DSF>,<DFDI>")])
+   (set_attr "cpu_facility" "*,*,*,vx,vxe")
+   (set_attr "enabled"      "<nBFP>,<nDFP>,<DSF>,<DF>,<SF>")])
 
 
 ;;
@@ -8423,11 +8438,10 @@
 
 (define_expand "neg<mode>2"
   [(parallel
-    [(set (match_operand:BFP 0 "register_operand" "=f")
-          (neg:BFP (match_operand:BFP 1 "register_operand" "f")))
+    [(set (match_operand:BFP          0 "register_operand")
+          (neg:BFP (match_operand:BFP 1 "register_operand")))
      (clobber (reg:CC CC_REGNUM))])]
-  "TARGET_HARD_FLOAT"
-  "")
+  "TARGET_HARD_FLOAT")
 
 ; lcxbr, lcdbr, lcebr
 (define_insn "*neg<mode>2_cc"
@@ -8463,18 +8477,20 @@
 
 ; lcxbr, lcdbr, lcebr
 ; FIXME: wflcdb does not clobber cc
+; FIXME: Does wflcdb ever match here?
 (define_insn "*neg<mode>2"
-  [(set (match_operand:BFP          0 "register_operand" "=f,v")
-        (neg:BFP (match_operand:BFP 1 "register_operand"  "f,v")))
+  [(set (match_operand:BFP          0 "register_operand" "=f,v,v")
+        (neg:BFP (match_operand:BFP 1 "register_operand"  "f,v,v")))
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_HARD_FLOAT"
   "@
    lc<xde>br\t%0,%1
-   wflcdb\t%0,%1"
-  [(set_attr "op_type"      "RRE,VRR")
-   (set_attr "cpu_facility" "*,vx")
-   (set_attr "type"         "fsimp<mode>,*")
-   (set_attr "enabled"      "*,<DFDI>")])
+   wflcdb\t%0,%1
+   wflcsb\t%0,%1"
+  [(set_attr "op_type"      "RRE,VRR,VRR")
+   (set_attr "cpu_facility" "*,vx,vxe")
+   (set_attr "type"         "fsimp<mode>,*,*")
+   (set_attr "enabled"      "*,<DF>,<SF>")])
 
 
 ;;
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 6a726a3..7535b9d 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -26,16 +26,16 @@
   [V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI V1DI V2DI V1SF
    V2SF V4SF V1DF V2DF V1TF V1TI TI])
 
-; All vector modes directly supported by the hardware having full vector reg size
+; All modes directly supported by the hardware having full vector reg size
 ; V_HW2 is duplicate of V_HW for having two iterators expanding
 ; independently e.g. vcond
-(define_mode_iterator V_HW  [V16QI V8HI V4SI V2DI V2DF])
-(define_mode_iterator V_HW2 [V16QI V8HI V4SI V2DI V2DF])
+(define_mode_iterator V_HW  [V16QI V8HI V4SI V2DI V2DF (V4SF "TARGET_VXE") (V1TF "TARGET_VXE")])
+(define_mode_iterator V_HW2 [V16QI V8HI V4SI V2DI V2DF (V4SF "TARGET_VXE") (V1TF "TARGET_VXE")])
 
 (define_mode_iterator V_HW_64 [V2DI V2DF])
 
 ; Including TI for instructions that support it (va, vn, ...)
-(define_mode_iterator VT_HW [V16QI V8HI V4SI V2DI V2DF V1TI TI])
+(define_mode_iterator VT_HW [V16QI V8HI V4SI V2DI V2DF V1TI TI (V4SF "TARGET_VXE") (V1TF "TARGET_VXE")])
 
 ; All full size integer vector modes supported in a vector register + TImode
 (define_mode_iterator VIT_HW    [V16QI V8HI V4SI V2DI V1TI TI])
@@ -51,6 +51,15 @@
 (define_mode_iterator VI  [V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI V1DI V2DI])
 (define_mode_iterator VI_QHS [V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI])
 
+(define_mode_iterator VFT [(V1SF "TARGET_VXE") (V2SF "TARGET_VXE") (V4SF "TARGET_VXE")
+			   V1DF V2DF
+			   (V1TF "TARGET_VXE")])
+
+; FP vector modes directly supported by the HW.  This does not include
+; vector modes using only part of a vector register and should be used
+; for instructions which might trigger IEEE exceptions.
+(define_mode_iterator VF_HW [(V4SF "TARGET_VXE") V2DF (V1TF "TARGET_VXE")])
+
 (define_mode_iterator V_8   [V1QI])
 (define_mode_iterator V_16  [V2QI  V1HI])
 (define_mode_iterator V_32  [V4QI  V2HI V1SI V1SF])
@@ -59,26 +68,30 @@
 
 (define_mode_iterator V_128_NOSINGLE [V16QI V8HI V4SI V4SF V2DI V2DF])
 
-; A blank for vector modes and a * for TImode.  This is used to hide
-; the TImode expander name in case it is defined already.  See addti3
-; for an example.
-(define_mode_attr ti* [(V1QI "") (V2QI "") (V4QI "") (V8QI "") (V16QI "")
-		       (V1HI "") (V2HI "") (V4HI "") (V8HI "")
-		       (V1SI "") (V2SI "") (V4SI "")
-		       (V1DI "") (V2DI "")
-		       (V1TI "*") (TI "*")])
+; Empty string for all but TImode.  This is used to hide the TImode
+; expander name in case it is defined already.  See addti3 for an
+; example.
+(define_mode_attr ti* [(V1QI "")  (V2QI "") (V4QI "") (V8QI "") (V16QI "")
+		       (V1HI "")  (V2HI "") (V4HI "") (V8HI "")
+		       (V1SI "")  (V2SI "") (V4SI "")
+		       (V1DI "")  (V2DI "")
+		       (V1TI "")  (TI "*")
+		       (V1SF "")  (V2SF "") (V4SF "")
+		       (V1DF "")  (V2DF "")
+		       (V1TF "")  (TF "")])
 
 ; The element type of the vector.
 (define_mode_attr non_vec[(V1QI "QI") (V2QI "QI") (V4QI "QI") (V8QI "QI") (V16QI "QI")
 			  (V1HI "HI") (V2HI "HI") (V4HI "HI") (V8HI "HI")
 			  (V1SI "SI") (V2SI "SI") (V4SI "SI")
 			  (V1DI "DI") (V2DI "DI")
-			  (V1TI "TI")
+			  (V1TI "TI") (TI "TI")
 			  (V1SF "SF") (V2SF "SF") (V4SF "SF")
 			  (V1DF "DF") (V2DF "DF")
-			  (V1TF "TF")])
+			  (V1TF "TF") (TF "TF")])
 
-; The instruction suffix
+; The instruction suffix for integer instructions and instructions
+; which do not care about whether it is floating point or integer.
 (define_mode_attr bhfgq[(V1QI "b") (V2QI "b") (V4QI "b") (V8QI "b") (V16QI "b")
 			(V1HI "h") (V2HI "h") (V4HI "h") (V8HI "h")
 			(V1SI "f") (V2SI "f") (V4SI "f")
@@ -105,6 +118,13 @@
 			    (V1SF "V1SI") (V2SF "V2SI") (V4SF "V4SI")
 			    (V1DF "V1DI") (V2DF "V2DI")
 			    (V1TF "V1TI")])
+(define_mode_attr vw [(SF "w") (V1SF "w") (V2SF "v") (V4SF "v")
+		      (DF "w") (V1DF "w") (V2DF "v")
+		      (TF "w") (V1TF "w")])
+
+(define_mode_attr sdx [(SF "s") (V1SF "s") (V2SF "s") (V4SF "s")
+		       (DF "d") (V1DF "d") (V2DF "d")
+		       (TF "x") (V1TF "x")])
 
 ; Vector with doubled element size.
 (define_mode_attr vec_double [(V1QI "V1HI") (V2QI "V1HI") (V4QI "V2HI") (V8QI "V4HI") (V16QI "V8HI")
@@ -1029,92 +1049,139 @@
 ;; Vector floating point arithmetic instructions
 ;;
 
-(define_insn "addv2df3"
-  [(set (match_operand:V2DF            0 "register_operand" "=v")
-	(plus:V2DF (match_operand:V2DF 1 "register_operand" "%v")
-		   (match_operand:V2DF 2 "register_operand"  "v")))]
+; vfasb, vfadb, wfasb, wfadb, wfaxb
+(define_insn "add<mode>3"
+  [(set (match_operand:VF_HW             0 "register_operand" "=v")
+	(plus:VF_HW (match_operand:VF_HW 1 "register_operand" "%v")
+		    (match_operand:VF_HW 2 "register_operand"  "v")))]
   "TARGET_VX"
-  "vfadb\t%v0,%v1,%v2"
+  "<vw>fa<sdx>b\t%v0,%v1,%v2"
   [(set_attr "op_type" "VRR")])
 
-(define_insn "subv2df3"
-  [(set (match_operand:V2DF             0 "register_operand" "=v")
-	(minus:V2DF (match_operand:V2DF 1 "register_operand" "%v")
-		    (match_operand:V2DF 2 "register_operand"  "v")))]
+; vfssb, vfsdb, wfssb, wfsdb, wfsxb
+(define_insn "sub<mode>3"
+  [(set (match_operand:VF_HW              0 "register_operand" "=v")
+	(minus:VF_HW (match_operand:VF_HW 1 "register_operand" "%v")
+		     (match_operand:VF_HW 2 "register_operand"  "v")))]
   "TARGET_VX"
-  "vfsdb\t%v0,%v1,%v2"
+  "<vw>fs<sdx>b\t%v0,%v1,%v2"
   [(set_attr "op_type" "VRR")])
 
-(define_insn "mulv2df3"
-  [(set (match_operand:V2DF            0 "register_operand" "=v")
-	(mult:V2DF (match_operand:V2DF 1 "register_operand" "%v")
-		   (match_operand:V2DF 2 "register_operand"  "v")))]
+; vfmsb, vfmdb, wfmsb, wfmdb, wfmxb
+(define_insn "mul<mode>3"
+  [(set (match_operand:VF_HW             0 "register_operand" "=v")
+	(mult:VF_HW (match_operand:VF_HW 1 "register_operand" "%v")
+		    (match_operand:VF_HW 2 "register_operand"  "v")))]
   "TARGET_VX"
-  "vfmdb\t%v0,%v1,%v2"
+  "<vw>fm<sdx>b\t%v0,%v1,%v2"
   [(set_attr "op_type" "VRR")])
 
-(define_insn "divv2df3"
-  [(set (match_operand:V2DF           0 "register_operand" "=v")
-	(div:V2DF (match_operand:V2DF 1 "register_operand"  "v")
-		  (match_operand:V2DF 2 "register_operand"  "v")))]
+; vfdsb, vfddb, wfdsb, wfddb, wfdxb
+(define_insn "div<mode>3"
+  [(set (match_operand:VF_HW            0 "register_operand" "=v")
+	(div:VF_HW (match_operand:VF_HW 1 "register_operand"  "v")
+		   (match_operand:VF_HW 2 "register_operand"  "v")))]
   "TARGET_VX"
-  "vfddb\t%v0,%v1,%v2"
+  "<vw>fd<sdx>b\t%v0,%v1,%v2"
   [(set_attr "op_type" "VRR")])
 
-(define_insn "sqrtv2df2"
-  [(set (match_operand:V2DF            0 "register_operand" "=v")
-	(sqrt:V2DF (match_operand:V2DF 1 "register_operand"  "v")))]
+; vfsqsb, vfsqdb, wfsqsb, wfsqdb, wfsqxb
+(define_insn "sqrt<mode>2"
+  [(set (match_operand:VF_HW           0 "register_operand" "=v")
+	(sqrt:VF_HW (match_operand:VF_HW 1 "register_operand"  "v")))]
   "TARGET_VX"
-  "vfsqdb\t%v0,%v1"
+  "<vw>fsq<sdx>b\t%v0,%v1"
   [(set_attr "op_type" "VRR")])
 
-(define_insn "fmav2df4"
-  [(set (match_operand:V2DF           0 "register_operand" "=v")
-	(fma:V2DF (match_operand:V2DF 1 "register_operand" "%v")
-		  (match_operand:V2DF 2 "register_operand"  "v")
-		  (match_operand:V2DF 3 "register_operand"  "v")))]
+; vfmasb, vfmadb, wfmasb, wfmadb, wfmaxb
+(define_insn "fma<mode>4"
+  [(set (match_operand:VF_HW            0 "register_operand" "=v")
+	(fma:VF_HW (match_operand:VF_HW 1 "register_operand" "%v")
+		   (match_operand:VF_HW 2 "register_operand"  "v")
+		   (match_operand:VF_HW 3 "register_operand"  "v")))]
   "TARGET_VX"
-  "vfmadb\t%v0,%v1,%v2,%v3"
+  "<vw>fma<sdx>b\t%v0,%v1,%v2,%v3"
   [(set_attr "op_type" "VRR")])
 
-(define_insn "fmsv2df4"
-  [(set (match_operand:V2DF                     0 "register_operand" "=v")
-	(fma:V2DF (match_operand:V2DF           1 "register_operand" "%v")
-		  (match_operand:V2DF           2 "register_operand"  "v")
-		  (neg:V2DF (match_operand:V2DF 3 "register_operand"  "v"))))]
+; vfmssb, vfmsdb, wfmssb, wfmsdb, wfmsxb
+(define_insn "fms<mode>4"
+  [(set (match_operand:VF_HW                     0 "register_operand" "=v")
+	(fma:VF_HW (match_operand:VF_HW          1 "register_operand" "%v")
+		   (match_operand:VF_HW          2 "register_operand"  "v")
+		 (neg:VF_HW (match_operand:VF_HW 3 "register_operand"  "v"))))]
   "TARGET_VX"
-  "vfmsdb\t%v0,%v1,%v2,%v3"
+  "<vw>fms<sdx>b\t%v0,%v1,%v2,%v3"
+  [(set_attr "op_type" "VRR")])
+
+; vfnmasb, vfnmadb, wfnmasb, wfnmadb, wfnmaxb
+(define_insn "neg_fma<mode>4"
+  [(set (match_operand:VF_HW             0 "register_operand" "=v")
+	(neg:VF_HW
+	 (fma:VF_HW (match_operand:VF_HW 1 "register_operand" "%v")
+		    (match_operand:VF_HW 2 "register_operand"  "v")
+		    (match_operand:VF_HW 3 "register_operand"  "v"))))]
+  "TARGET_VXE"
+  "<vw>fnma<sdx>b\t%v0,%v1,%v2,%v3"
+  [(set_attr "op_type" "VRR")])
+
+; vfnmssb, vfnmsdb, wfnmssb, wfnmsdb, wfnmsxb
+(define_insn "neg_fms<mode>4"
+  [(set (match_operand:VF_HW                      0 "register_operand" "=v")
+	(neg:VF_HW
+	 (fma:VF_HW (match_operand:VF_HW          1 "register_operand" "%v")
+		    (match_operand:VF_HW          2 "register_operand"  "v")
+		  (neg:VF_HW (match_operand:VF_HW 3 "register_operand"  "v")))))]
+  "TARGET_VXE"
+  "<vw>fnms<sdx>b\t%v0,%v1,%v2,%v3"
   [(set_attr "op_type" "VRR")])
 
-(define_insn "negv2df2"
-  [(set (match_operand:V2DF           0 "register_operand" "=v")
-	(neg:V2DF (match_operand:V2DF 1 "register_operand"  "v")))]
+; vflcsb, vflcdb, wflcsb, wflcdb, wflcxb
+(define_insn "neg<mode>2"
+  [(set (match_operand:VFT          0 "register_operand" "=v")
+	(neg:VFT (match_operand:VFT 1 "register_operand"  "v")))]
   "TARGET_VX"
-  "vflcdb\t%v0,%v1"
+  "<vw>flc<sdx>b\t%v0,%v1"
   [(set_attr "op_type" "VRR")])
 
-(define_insn "absv2df2"
-  [(set (match_operand:V2DF           0 "register_operand" "=v")
-	(abs:V2DF (match_operand:V2DF 1 "register_operand"  "v")))]
+; vflpsb, vflpdb, wflpsb, wflpdb, wflpxb
+(define_insn "abs<mode>2"
+  [(set (match_operand:VFT          0 "register_operand" "=v")
+	(abs:VFT (match_operand:VFT 1 "register_operand"  "v")))]
   "TARGET_VX"
-  "vflpdb\t%v0,%v1"
+  "<vw>flp<sdx>b\t%v0,%v1"
   [(set_attr "op_type" "VRR")])
 
-(define_insn "*negabsv2df2"
-  [(set (match_operand:V2DF                     0 "register_operand" "=v")
-	(neg:V2DF (abs:V2DF (match_operand:V2DF 1 "register_operand"  "v"))))]
+; vflnsb, vflndb, wflnsb, wflndb, wflnxb
+(define_insn "negabs<mode>2"
+  [(set (match_operand:VFT                   0 "register_operand" "=v")
+	(neg:VFT (abs:VFT (match_operand:VFT 1 "register_operand"  "v"))))]
   "TARGET_VX"
-  "vflndb\t%v0,%v1"
+  "<vw>fln<sdx>b\t%v0,%v1"
+  [(set_attr "op_type" "VRR")])
+
+(define_expand "smax<mode>3"
+  [(set (match_operand:VF_HW             0 "register_operand")
+	(smax:VF_HW (match_operand:VF_HW 1 "register_operand")
+		    (match_operand:VF_HW 2 "register_operand")))]
+  "TARGET_VX")
+
+; vfmaxsb, vfmaxdb, wfmaxsb, wfmaxdb, wfmaxxb
+(define_insn "*smax<mode>3_vxe"
+  [(set (match_operand:VF_HW             0 "register_operand" "=v")
+	(smax:VF_HW (match_operand:VF_HW 1 "register_operand" "%v")
+		    (match_operand:VF_HW 2 "register_operand"  "v")))]
+  "TARGET_VXE"
+  "<vw>fmax<sdx>b\t%v0,%v1,%v2,4"
   [(set_attr "op_type" "VRR")])
 
 ; Emulate with compare + select
-(define_insn_and_split "smaxv2df3"
+(define_insn_and_split "*smaxv2df3_vx"
   [(set (match_operand:V2DF            0 "register_operand" "=v")
 	(smax:V2DF (match_operand:V2DF 1 "register_operand" "%v")
 		   (match_operand:V2DF 2 "register_operand"  "v")))]
-  "TARGET_VX"
+  "TARGET_VX && !TARGET_VXE"
   "#"
-  ""
+  "&& 1"
   [(set (match_dup 3)
 	(gt:V2DI (match_dup 1) (match_dup 2)))
    (set (match_dup 0)
@@ -1127,14 +1194,29 @@
   operands[4] = CONST0_RTX (V2DImode);
 })
 
+(define_expand "smin<mode>3"
+  [(set (match_operand:VF_HW             0 "register_operand")
+	(smin:VF_HW (match_operand:VF_HW 1 "register_operand")
+		    (match_operand:VF_HW 2 "register_operand")))]
+  "TARGET_VX")
+
+; vfminsb, vfmindb, wfminsb, wfmindb, wfminxb
+(define_insn "*smin<mode>3_vxe"
+  [(set (match_operand:VF_HW             0 "register_operand" "=v")
+	(smin:VF_HW (match_operand:VF_HW 1 "register_operand" "%v")
+		    (match_operand:VF_HW 2 "register_operand"  "v")))]
+  "TARGET_VXE"
+  "<vw>fmin<sdx>b\t%v0,%v1,%v2,4"
+  [(set_attr "op_type" "VRR")])
+
 ; Emulate with compare + select
-(define_insn_and_split "sminv2df3"
+(define_insn_and_split "*sminv2df3_vx"
   [(set (match_operand:V2DF            0 "register_operand" "=v")
 	(smin:V2DF (match_operand:V2DF 1 "register_operand" "%v")
 		   (match_operand:V2DF 2 "register_operand"  "v")))]
-  "TARGET_VX"
+  "TARGET_VX && !TARGET_VXE"
   "#"
-  ""
+  "&& 1"
   [(set (match_dup 3)
 	(gt:V2DI (match_dup 1) (match_dup 2)))
    (set (match_dup 0)
@@ -1166,65 +1248,66 @@
 ;;
 
 ; EQ, GT, GE
-(define_insn "*vec_cmp<VFCMP_HW_OP:code>v2df_nocc"
-  [(set (match_operand:V2DI                   0 "register_operand" "=v")
-	(VFCMP_HW_OP:V2DI (match_operand:V2DF 1 "register_operand"  "v")
-			  (match_operand:V2DF 2 "register_operand"  "v")))]
+; vfcesb, vfcedb, wfcexb, vfchsb, vfchdb, wfchxb, vfchesb, vfchedb, wfchexb
+(define_insn "*vec_cmp<VFCMP_HW_OP:code><mode>_nocc"
+  [(set (match_operand:<tointvec>                  0 "register_operand" "=v")
+	(VFCMP_HW_OP:<tointvec> (match_operand:VFT 1 "register_operand"  "v")
+			     (match_operand:VFT 2 "register_operand"  "v")))]
    "TARGET_VX"
-   "vfc<VFCMP_HW_OP:asm_fcmp_op>db\t%v0,%v1,%v2"
+   "<vw>fc<VFCMP_HW_OP:asm_fcmp_op><sdx>b\t%v0,%v1,%v2"
   [(set_attr "op_type" "VRR")])
 
 ; Expanders for not directly supported comparisons
 
 ; UNEQ a u== b -> !(a > b | b > a)
-(define_expand "vec_cmpuneqv2df"
-  [(set (match_operand:V2DI          0 "register_operand" "=v")
-	(gt:V2DI (match_operand:V2DF 1 "register_operand"  "v")
-		 (match_operand:V2DF 2 "register_operand"  "v")))
+(define_expand "vec_cmpuneq<mode>"
+  [(set (match_operand:<tointvec>         0 "register_operand" "=v")
+	(gt:<tointvec> (match_operand:VFT 1 "register_operand"  "v")
+		    (match_operand:VFT 2 "register_operand"  "v")))
    (set (match_dup 3)
-	(gt:V2DI (match_dup 2) (match_dup 1)))
-   (set (match_dup 0) (ior:V2DI (match_dup 0) (match_dup 3)))
-   (set (match_dup 0) (not:V2DI (match_dup 0)))]
+	(gt:<tointvec> (match_dup 2) (match_dup 1)))
+   (set (match_dup 0) (ior:<tointvec> (match_dup 0) (match_dup 3)))
+   (set (match_dup 0) (not:<tointvec> (match_dup 0)))]
   "TARGET_VX"
 {
-  operands[3] = gen_reg_rtx (V2DImode);
+  operands[3] = gen_reg_rtx (<tointvec>mode);
 })
 
 ; LTGT a <> b -> a > b | b > a
-(define_expand "vec_cmpltgtv2df"
-  [(set (match_operand:V2DI          0 "register_operand" "=v")
-	(gt:V2DI (match_operand:V2DF 1 "register_operand"  "v")
-		 (match_operand:V2DF 2 "register_operand"  "v")))
-   (set (match_dup 3) (gt:V2DI (match_dup 2) (match_dup 1)))
-   (set (match_dup 0) (ior:V2DI (match_dup 0) (match_dup 3)))]
+(define_expand "vec_cmpltgt<mode>"
+  [(set (match_operand:<tointvec>         0 "register_operand" "=v")
+	(gt:<tointvec> (match_operand:VFT 1 "register_operand"  "v")
+		    (match_operand:VFT 2 "register_operand"  "v")))
+   (set (match_dup 3) (gt:<tointvec> (match_dup 2) (match_dup 1)))
+   (set (match_dup 0) (ior:<tointvec> (match_dup 0) (match_dup 3)))]
   "TARGET_VX"
 {
-  operands[3] = gen_reg_rtx (V2DImode);
+  operands[3] = gen_reg_rtx (<tointvec>mode);
 })
 
 ; ORDERED (a, b): a >= b | b > a
-(define_expand "vec_orderedv2df"
-  [(set (match_operand:V2DI          0 "register_operand" "=v")
-	(ge:V2DI (match_operand:V2DF 1 "register_operand"  "v")
-		 (match_operand:V2DF 2 "register_operand"  "v")))
-   (set (match_dup 3) (gt:V2DI (match_dup 2) (match_dup 1)))
-   (set (match_dup 0) (ior:V2DI (match_dup 0) (match_dup 3)))]
+(define_expand "vec_ordered<mode>"
+  [(set (match_operand:<tointvec>          0 "register_operand" "=v")
+	(ge:<tointvec> (match_operand:VFT 1 "register_operand"  "v")
+		 (match_operand:VFT 2 "register_operand"  "v")))
+   (set (match_dup 3) (gt:<tointvec> (match_dup 2) (match_dup 1)))
+   (set (match_dup 0) (ior:<tointvec> (match_dup 0) (match_dup 3)))]
   "TARGET_VX"
 {
-  operands[3] = gen_reg_rtx (V2DImode);
+  operands[3] = gen_reg_rtx (<tointvec>mode);
 })
 
 ; UNORDERED (a, b): !ORDERED (a, b)
-(define_expand "vec_unorderedv2df"
-  [(set (match_operand:V2DI          0 "register_operand" "=v")
-	(ge:V2DI (match_operand:V2DF 1 "register_operand"  "v")
-		 (match_operand:V2DF 2 "register_operand"  "v")))
-   (set (match_dup 3) (gt:V2DI (match_dup 2) (match_dup 1)))
-   (set (match_dup 0) (ior:V2DI (match_dup 0) (match_dup 3)))
-   (set (match_dup 0) (not:V2DI (match_dup 0)))]
+(define_expand "vec_unordered<mode>"
+  [(set (match_operand:<tointvec>          0 "register_operand" "=v")
+	(ge:<tointvec> (match_operand:VFT 1 "register_operand"  "v")
+		 (match_operand:VFT 2 "register_operand"  "v")))
+   (set (match_dup 3) (gt:<tointvec> (match_dup 2) (match_dup 1)))
+   (set (match_dup 0) (ior:<tointvec> (match_dup 0) (match_dup 3)))
+   (set (match_dup 0) (not:<tointvec> (match_dup 0)))]
   "TARGET_VX"
 {
-  operands[3] = gen_reg_rtx (V2DImode);
+  operands[3] = gen_reg_rtx (<tointvec>mode);
 })
 
 (define_insn "*vec_load_pair<mode>"
@@ -1563,6 +1646,28 @@
   "vupllf\t%0,%1"
   [(set_attr "op_type" "VRR")])
 
+;; vector load lengthened
+
+; vflls
+(define_insn "*vec_extendv4sf"
+  [(set (match_operand:V2DF 0 "register_operand" "=v")
+	(float_extend:V2DF
+	 (vec_select:V2SF
+	  (match_operand:V4SF 1 "register_operand" "v")
+	  (parallel [(const_int 0) (const_int 2)]))))]
+  "TARGET_VX"
+  "vldeb\t%v0,%v1"
+  [(set_attr "op_type" "VRR")])
+
+(define_insn "*vec_extendv2df"
+  [(set (match_operand:V1TF 0 "register_operand" "=v")
+	(float_extend:V1TF
+	 (vec_select:V1DF
+	  (match_operand:V2DF 1 "register_operand" "v")
+	  (parallel [(const_int 0)]))))]
+  "TARGET_VXE"
+  "wflld\t%v0,%v1"
+  [(set_attr "op_type" "VRR")])
 
 ; reduc_smin
 ; reduc_smax
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 9247f53..b843b78 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,9 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* gcc.target/s390/vxe/negfma-1.c: New test.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* gcc.target/s390/arch12/aghsghmgh-1.c: New test.
 	* gcc.target/s390/arch12/mul-1.c: New test.
 	* gcc.target/s390/arch12/mul-2.c: New test.
diff --git a/gcc/testsuite/gcc.target/s390/vxe/negfma-1.c b/gcc/testsuite/gcc.target/s390/vxe/negfma-1.c
new file mode 100644
index 0000000..4c976b0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vxe/negfma-1.c
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch -march=arch12" } */
+
+typedef float       v4sf __attribute__((vector_size(16)));
+typedef double      v2df __attribute__((vector_size(16)));
+typedef long double v1tf __attribute__((vector_size(16)));
+
+v4sf
+neg_vfnmasb (v4sf a, v4sf b, v4sf c)
+{
+  return -(a * b + c);
+}
+/* { dg-final { scan-assembler-times "vfnmasb\t%v24,%v24,%v26,%v28" 1 } } */
+
+v2df
+neg_vfnmadb (v2df a, v2df b, v2df c)
+{
+  return -(a * b + c);
+}
+/* { dg-final { scan-assembler-times "vfnmadb\t%v24,%v24,%v26,%v28" 1 } } */
+
+v1tf
+neg_wfnmaxb (v1tf a, v1tf b, v1tf c)
+{
+  return -(a * b + c);
+}
+/* { dg-final { scan-assembler-times "wfnmaxb\t%v24,%v24,%v26,%v28" 1 } } */
+
+
+v4sf
+neg_vfnmssb (v4sf a, v4sf b, v4sf c)
+{
+  return -(a * b - c);
+}
+/* { dg-final { scan-assembler-times "vfnmssb\t%v24,%v24,%v26,%v28" 1 } } */
+
+v2df
+neg_vfnmsdb (v2df a, v2df b, v2df c)
+{
+  return -(a * b - c);
+}
+/* { dg-final { scan-assembler-times "vfnmsdb\t%v24,%v24,%v26,%v28" 1 } } */
+
+v1tf
+neg_wfnmsxb (v1tf a, v1tf b, v1tf c)
+{
+  return -(a * b - c);
+}
+/* { dg-final { scan-assembler-times "wfnmsxb\t%v24,%v24,%v26,%v28" 1 } } */
-- 
2.9.1
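
For readers following the negfma-1.c test above, here is a minimal scalar
sketch of what the two new negated multiply-and-add patterns compute per
element.  The function names are hypothetical and not part of the patch.
The hardware instructions are fused (a single rounding step), whereas this
plain C model may round twice, so it only illustrates the sign conventions
of the RTL shapes (neg (fma a b c)) and (neg (fma a b (neg c))).

```c
/* Scalar model of one element of the neg_fma<mode>4 pattern:
   the negated result of a * b + c.  Hypothetical helper name.  */
double
model_neg_fma (double a, double b, double c)
{
  return -(a * b + c);
}

/* Scalar model of one element of the neg_fms<mode>4 pattern:
   the negated result of a * b - c.  Hypothetical helper name.  */
double
model_neg_fms (double a, double b, double c)
{
  return -(a * b - c);
}
```

With small integer inputs both expressions are exact, so the model can be
checked without worrying about rounding differences.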


* [PATCH 10/16] S/390: arch12: Add support for new vector bit operations.
  2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
                   ` (11 preceding siblings ...)
  2017-03-24 14:13 ` [PATCH 01/16] S/390: Rename cpu facility vec to vx Andreas Krebbel
@ 2017-03-24 14:13 ` Andreas Krebbel
  2017-03-24 14:13 ` [PATCH 05/16] S/390: movdf improvements Andreas Krebbel
  2017-03-24 14:22 ` [PATCH 13/16] S/390: arch12: Add indirect branch pattern Andreas Krebbel
  14 siblings, 0 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:13 UTC (permalink / raw)
  To: gcc-patches

This patch adds support for the new bit operations introduced with
arch12.

The patch also renames the one's complement pattern to its standard RTL
pattern name (one_cmpl<mode>2).
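
As an illustration only, the following scalar C sketch models one 32-bit
lane of each of the three new operations.  The helper names are
hypothetical.  It also shows why the ~AND pattern is written as an OR of
two complements: by De Morgan's law both forms give the same result, and
the ChangeLog below refers to that form as the canonical form of ~AND.

```c
#include <stdint.h>

/* NAND written the direct way: ~(a & b).  */
uint32_t
nand_direct (uint32_t a, uint32_t b)
{
  return ~(a & b);
}

/* NAND in the De Morgan form: ~a | ~b.  This mirrors the
   (ior (not a) (not b)) shape used by the new pattern.  */
uint32_t
nand_demorgan (uint32_t a, uint32_t b)
{
  return ~a | ~b;
}

/* OR with complement: a | ~b.  */
uint32_t
or_complement (uint32_t a, uint32_t b)
{
  return a | ~b;
}

/* NOT XOR: ~(a ^ b).  */
uint32_t
not_xor_lane (uint32_t a, uint32_t b)
{
  return ~(a ^ b);
}
```

The input values used in bitops-1.c (for example 1 and 2 in the second
lane) can be fed through these helpers to reproduce the expected results
that the test checks.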

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/s390.c (s390_rtx_costs): Return low costs for the
	canonical form of ~AND to make sure the new instruction will be
	used.
	* config/s390/vector.md ("notand<mode>3", "ior_not<mode>3")
	("notxor<mode>3"): Add new pattern definitions.
	("*not<mode>"): Rename to ...
	("one_cmpl<mode>2"): ... this.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* gcc.target/s390/vxe/bitops-1.c: New test.
---
 gcc/config/s390/s390.c                       | 15 ++++++++
 gcc/config/s390/vector.md                    | 31 +++++++++++++++--
 gcc/testsuite/ChangeLog                      |  4 +++
 gcc/testsuite/gcc.target/s390/vxe/bitops-1.c | 52 ++++++++++++++++++++++++++++
 4 files changed, 100 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/bitops-1.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index c94edcc..416a15e 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -3373,6 +3373,21 @@ s390_rtx_costs (rtx x, machine_mode mode, int outer_code,
 	  *total = COSTS_N_INSNS (2);
 	  return true;
 	}
+
+      /* ~AND on a 128 bit mode.  This can be done using a vector
+	 instruction.  */
+      if (TARGET_VXE
+	  && GET_CODE (XEXP (x, 0)) == NOT
+	  && GET_CODE (XEXP (x, 1)) == NOT
+	  && REG_P (XEXP (XEXP (x, 0), 0))
+	  && REG_P (XEXP (XEXP (x, 1), 0))
+	  && GET_MODE_SIZE (GET_MODE (XEXP (XEXP (x, 0), 0))) == 16
+	  && s390_hard_regno_mode_ok (VR0_REGNUM,
+				      GET_MODE (XEXP (XEXP (x, 0), 0))))
+	{
+	  *total = COSTS_N_INSNS (1);
+	  return true;
+	}
       /* fallthrough */
     case ASHIFT:
     case ASHIFTRT:
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 7ddeb9a..68a8ed0 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -655,6 +655,15 @@
   "vn\t%v0,%v1,%v2"
   [(set_attr "op_type" "VRR")])
 
+; Vector not and
+
+(define_insn "notand<mode>3"
+  [(set (match_operand:VT                 0 "register_operand" "=v")
+	(ior:VT (not:VT (match_operand:VT 1 "register_operand" "%v"))
+		(not:VT	(match_operand:VT 2 "register_operand"  "v"))))]
+  "TARGET_VXE"
+  "vnn\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
 
 ; Vector or
 
@@ -666,6 +675,15 @@
   "vo\t%v0,%v1,%v2"
   [(set_attr "op_type" "VRR")])
 
+; Vector or with complement
+
+(define_insn "ior_not<mode>3"
+  [(set (match_operand:VT                 0 "register_operand" "=v")
+	(ior:VT (not:VT (match_operand:VT 2 "register_operand"  "v"))
+		(match_operand:VT         1 "register_operand" "%v")))]
+  "TARGET_VXE"
+  "voc\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
 
 ; Vector xor
 
@@ -677,9 +695,18 @@
   "vx\t%v0,%v1,%v2"
   [(set_attr "op_type" "VRR")])
 
+; Vector not xor
+
+(define_insn "notxor<mode>3"
+  [(set (match_operand:VT                 0 "register_operand" "=v")
+	(not:VT (xor:VT (match_operand:VT 1 "register_operand" "%v")
+			(match_operand:VT 2 "register_operand"  "v"))))]
+  "TARGET_VXE"
+  "vnx\t%v0,%v1,%v2"
+  [(set_attr "op_type" "VRR")])
 
-; Bitwise inversion of a vector - used for vec_cmpne
-(define_insn "*not<mode>"
+; Bitwise inversion of a vector
+(define_insn "one_cmpl<mode>2"
   [(set (match_operand:VT         0 "register_operand" "=v")
 	(not:VT (match_operand:VT 1 "register_operand"  "v")))]
   "TARGET_VX"
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 9ca13ab..bbdd3c8 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,9 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* gcc.target/s390/vxe/bitops-1.c: New test.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* gcc.target/s390/s390.exp: Run tests in arch12 and vxe dirs.
 	* lib/target-supports.exp: Add effective target check s390_vxe.
 
diff --git a/gcc/testsuite/gcc.target/s390/vxe/bitops-1.c b/gcc/testsuite/gcc.target/s390/vxe/bitops-1.c
new file mode 100644
index 0000000..bdf7457
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vxe/bitops-1.c
@@ -0,0 +1,52 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mzarch -march=arch12 --save-temps" } */
+/* { dg-require-effective-target s390_vxe } */
+
+typedef unsigned int       uv4si __attribute__((vector_size(16)));
+
+uv4si __attribute__((noinline))
+not_xor (uv4si a, uv4si b)
+{
+  return ~(a ^ b);
+}
+/* { dg-final { scan-assembler-times "vnx\t%v24,%v24,%v26" 1 } } */
+
+uv4si __attribute__((noinline))
+not_and (uv4si a, uv4si b)
+{
+  return ~(a & b);
+}
+/* { dg-final { scan-assembler-times "vnn\t%v24,%v24,%v26" 1 } } */
+
+uv4si __attribute__((noinline))
+or_not (uv4si a, uv4si b)
+{
+  return a | ~b;
+}
+/* { dg-final { scan-assembler-times "voc\t%v24,%v24,%v26" 1 } } */
+
+
+int
+main ()
+{
+  uv4si a = (uv4si){ 42, 1, 0, 2 };
+  uv4si b = (uv4si){ 42, 2, 0, 2 };
+  uv4si c;
+
+  c = not_xor (a, b);
+
+  if (c[0] != ~0 || c[1] != ~3 || c[2] != ~0 || c[3] != ~0)
+    __builtin_abort ();
+
+  c = not_and (a, b);
+
+  if (c[0] != ~42 || c[1] != ~0 || c[2] != ~0 || c[3] != ~2)
+    __builtin_abort ();
+
+  c = or_not (a, b);
+
+  if (c[0] != ~0 || c[1] != ~2 || c[2] != ~0 || c[3] != ~0)
+    __builtin_abort ();
+
+  return 0;
+}
-- 
2.9.1


* [PATCH 11/16] S/390: arch12: New vector popcount variants
  2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
                   ` (8 preceding siblings ...)
  2017-03-24 14:13 ` [PATCH 04/16] S/390: movsf/sd pattern fixes Andreas Krebbel
@ 2017-03-24 14:13 ` Andreas Krebbel
  2017-03-24 14:13 ` [PATCH 15/16] S/390: arch12: Support new vector floating point modes Andreas Krebbel
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:13 UTC (permalink / raw)
  To: gcc-patches

arch12 provides vector population count instructions for element sizes
larger than a byte (halfword, word, and doubleword).
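
To make the difference concrete, here is a rough scalar sketch (helper
names are hypothetical, not from the patch).  Before arch12 only a
byte-wise count is available, so the existing expanders (renamed to *_vx
in this patch) build the count of a wider element by summing the counts
of its bytes.  With arch12 the wider count comes from a single
instruction.

```c
#include <stdint.h>

/* Population count of a single byte, the only element size the
   pre-arch12 instruction handles.  */
unsigned
popcount8 (uint8_t x)
{
  unsigned n = 0;
  while (x)
    {
      x &= (uint8_t) (x - 1);	/* Clear the lowest set bit.  */
      n++;
    }
  return n;
}

/* Halfword count emulated as the sum of the two byte counts.  */
unsigned
popcount16_from_bytes (uint16_t x)
{
  return popcount8 ((uint8_t) (x & 0xff)) + popcount8 ((uint8_t) (x >> 8));
}

/* Word count emulated as the sum of the four byte counts.  */
unsigned
popcount32_from_bytes (uint32_t x)
{
  unsigned n = 0;
  for (int i = 0; i < 4; i++)
    n += popcount8 ((uint8_t) (x >> (8 * i)));
  return n;
}
```

The values 42, 1, ~0 and 2 used by popcount-1.c give 3, 1, the full
element width, and 1 respectively under this model.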

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* gcc.target/s390/vxe/popcount-1.c: New test.

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/vector.md ("popcountv16qi2", "popcountv8hi2")
	("popcountv4si2", "popcountv2di2"): Rename to ...
	("popcount<mode>2", "popcountv8hi2_vx", "popcountv4si2_vx")
	("popcountv2di2_vx"): ... these and add !TARGET_VXE to the
	condition.
	("popcount<mode>2_vxe"): New pattern.
---
 gcc/ChangeLog                                  |  9 +++
 gcc/config/s390/vector.md                      | 38 ++++++++---
 gcc/testsuite/ChangeLog                        |  4 ++
 gcc/testsuite/gcc.target/s390/vxe/popcount-1.c | 88 ++++++++++++++++++++++++++
 4 files changed, 131 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vxe/popcount-1.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 89e7906..d516b4d 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,14 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* config/s390/vector.md ("popcountv16qi2", "popcountv8hi2")
+	("popcountv4si2", "popcountv2di2"): Rename to ...
+	("popcount<mode>2", "popcountv8hi2_vx", "popcountv4si2_vx")
+	("popcountv2di2_vx"): ... these and add !TARGET_VXE to the
+	condition.
+	("popcount<mode>2_vxe"): New pattern.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* common/config/s390/s390-common.c (processor_flags_table): Add
 	arch12.
 	* config.gcc: Add arch12.
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 68a8ed0..d4c0e95 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -715,11 +715,33 @@
 
 ; Vector population count
 
-(define_insn "popcountv16qi2"
+(define_expand "popcount<mode>2"
+  [(set (match_operand:VI_HW                0 "register_operand" "=v")
+	(unspec:VI_HW [(match_operand:VI_HW 1 "register_operand"  "v")]
+		      UNSPEC_POPCNT))]
+  "TARGET_VX"
+{
+  if (TARGET_VXE)
+    emit_insn (gen_popcount<mode>2_vxe (operands[0], operands[1]));
+  else
+    emit_insn (gen_popcount<mode>2_vx (operands[0], operands[1]));
+  DONE;
+})
+
+; vpopctb, vpopcth, vpopctf, vpopctg
+(define_insn "popcount<mode>2_vxe"
+  [(set (match_operand:VI_HW                0 "register_operand" "=v")
+	(unspec:VI_HW [(match_operand:VI_HW 1 "register_operand"  "v")]
+		      UNSPEC_POPCNT))]
+  "TARGET_VXE"
+  "vpopct<bhfgq>\t%v0,%v1"
+  [(set_attr "op_type" "VRR")])
+
+(define_insn "popcountv16qi2_vx"
   [(set (match_operand:V16QI                0 "register_operand" "=v")
 	(unspec:V16QI [(match_operand:V16QI 1 "register_operand"  "v")]
 		      UNSPEC_POPCNT))]
-  "TARGET_VX"
+  "TARGET_VX && !TARGET_VXE"
   "vpopct\t%v0,%v1,0"
   [(set_attr "op_type" "VRR")])
 
@@ -729,7 +751,7 @@
 ; of the result, add it to the result and extend it to halfword
 ; element size (unpack).
 
-(define_expand "popcountv8hi2"
+(define_expand "popcountv8hi2_vx"
   [(set (match_dup 2)
 	(unspec:V16QI [(subreg:V16QI (match_operand:V8HI 1 "register_operand" "v") 0)]
 		      UNSPEC_POPCNT))
@@ -761,7 +783,7 @@
 	(and:V8HI (subreg:V8HI (match_dup 2) 0)
 		  (subreg:V8HI (match_dup 3) 0)))
 ]
-  "TARGET_VX"
+  "TARGET_VX && !TARGET_VXE"
 {
   operands[2] = gen_reg_rtx (V16QImode);
   operands[3] = gen_reg_rtx (V16QImode);
@@ -769,20 +791,20 @@
   operands[5] = CONST0_RTX (V16QImode);
 })
 
-(define_expand "popcountv4si2"
+(define_expand "popcountv4si2_vx"
   [(set (match_dup 2)
 	(unspec:V16QI [(subreg:V16QI (match_operand:V4SI 1 "register_operand" "v") 0)]
 		      UNSPEC_POPCNT))
    (set (match_operand:V4SI 0 "register_operand" "=v")
 	(unspec:V4SI [(match_dup 2) (match_dup 3)]
 		     UNSPEC_VEC_VSUM))]
-  "TARGET_VX"
+  "TARGET_VX && !TARGET_VXE"
 {
   operands[2] = gen_reg_rtx (V16QImode);
   operands[3] = force_reg (V16QImode, CONST0_RTX (V16QImode));
 })
 
-(define_expand "popcountv2di2"
+(define_expand "popcountv2di2_vx"
   [(set (match_dup 2)
 	(unspec:V16QI [(subreg:V16QI (match_operand:V2DI 1 "register_operand" "v") 0)]
 		      UNSPEC_POPCNT))
@@ -792,7 +814,7 @@
    (set (match_operand:V2DI 0 "register_operand" "=v")
 	(unspec:V2DI [(match_dup 3) (match_dup 5)]
 		     UNSPEC_VEC_VSUMG))]
-  "TARGET_VX"
+  "TARGET_VX && !TARGET_VXE"
 {
   operands[2] = gen_reg_rtx (V16QImode);
   operands[3] = gen_reg_rtx (V4SImode);
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index bbdd3c8..6d178c5 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,9 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* gcc.target/s390/vxe/popcount-1.c: New test.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* gcc.target/s390/vxe/bitops-1.c: New test.
 
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
diff --git a/gcc/testsuite/gcc.target/s390/vxe/popcount-1.c b/gcc/testsuite/gcc.target/s390/vxe/popcount-1.c
new file mode 100644
index 0000000..9ea835a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vxe/popcount-1.c
@@ -0,0 +1,88 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mzarch -march=arch12 --save-temps" } */
+/* { dg-require-effective-target s390_vxe } */
+
+/* Vectorization currently only works for v4si.  v8hi at least uses 2x
+   vpopctf but no vpopcth.  */
+
+typedef unsigned char     uv16qi __attribute__((vector_size(16)));
+typedef unsigned short     uv8hi __attribute__((vector_size(16)));
+typedef unsigned int       uv4si __attribute__((vector_size(16)));
+typedef unsigned long long uv2di __attribute__((vector_size(16)));
+
+uv16qi __attribute__((noinline))
+vpopctb (uv16qi a)
+{
+  uv16qi r;
+  int i;
+
+  for (i = 0; i < 16; i++)
+    r[i] = __builtin_popcount (a[i]);
+
+  return r;
+}
+/* { dg-final { scan-assembler "vpopctb\t%v24,%v24" { xfail *-*-* } } } */
+
+uv8hi __attribute__((noinline))
+vpopcth (uv8hi a)
+{
+  uv8hi r;
+  int i;
+
+  for (i = 0; i < 8; i++)
+    r[i] = __builtin_popcount (a[i]);
+
+  return r;
+}
+/* { dg-final { scan-assembler "vpopcth\t%v24,%v24" { xfail *-*-* } } } */
+
+uv4si __attribute__((noinline))
+vpopctf (uv4si a)
+{
+  uv4si r;
+  int i;
+
+  for (i = 0; i < 4; i++)
+    r[i] = __builtin_popcount (a[i]);
+
+  return r;
+}
+/* { dg-final { scan-assembler "vpopctf\t%v24,%v24" } } */
+
+uv2di __attribute__((noinline))
+vpopctg (uv2di a)
+{
+  uv2di r;
+  int i;
+
+  for (i = 0; i < 2; i++)
+    r[i] = __builtin_popcount (a[i]);
+
+  return r;
+}
+/* { dg-final { scan-assembler "vpopctg\t%v24,%v24" { xfail *-*-* } } } */
+
+int
+main ()
+{
+  uv16qi a = (uv16qi){ 42, 1, ~0, 2, 42, 1, ~0, 2, 42, 1, ~0, 2, 42, 1, ~0, 2 };
+  if (__builtin_s390_vec_any_ne (vpopctb (a),
+				 (uv16qi){ 3, 1, 8, 1, 3, 1, 8, 1,
+					   3, 1, 8, 1, 3, 1, 8, 1 }))
+    __builtin_abort ();
+
+  if (__builtin_s390_vec_any_ne (vpopcth ((uv8hi){ 42, 1, ~0, 2, 42, 1, ~0, 2 }),
+				 (uv8hi){ 3, 1, 16, 1, 3, 1, 16, 1 }))
+    __builtin_abort ();
+
+  if (__builtin_s390_vec_any_ne (vpopctf ((uv4si){ 42, 1, ~0, 2 }),
+				 (uv4si){ 3, 1, 32, 1 }))
+    __builtin_abort ();
+
+  if (__builtin_s390_vec_any_ne (vpopctg ((uv2di){ 42, 1 }),
+					  (uv2di){ 3, 1 }))
+      __builtin_abort ();
+
+
+  return 0;
+}
-- 
2.9.1


* [PATCH 14/16] S/390: arch12: Support the mul/add/subtract instructions.
  2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
                   ` (4 preceding siblings ...)
  2017-03-24 14:11 ` [PATCH 03/16] S/390: vec_init improvements Andreas Krebbel
@ 2017-03-24 14:13 ` Andreas Krebbel
  2017-03-24 14:13 ` [PATCH 09/16] S/390: arch12: Add arch12 option Andreas Krebbel
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:13 UTC (permalink / raw)
  To: gcc-patches

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/s390.md ("*adddi3_sign", "*subdi3_sign", "mulditi3")
	("mulditi3_2", "*muldi3_sign"): New patterns.
	("muldi3", "*muldi3", "mulsi3", "*mulsi3"): Add an expander and
	rename the pattern definition.

gcc/testsuite/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* gcc.target/s390/arch12/aghsghmgh-1.c: New test.
	* gcc.target/s390/arch12/mul-1.c: New test.
	* gcc.target/s390/arch12/mul-2.c: New test.
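
For readers who want to see the source-level shapes these patterns are aimed at, here is a minimal sketch in C. The function names (add_half, sub_half, mul_half, mul_wide, check_arithmetic) are hypothetical and not part of the patch; __int128 is a GNU extension available on 64-bit targets. Whether the compiler actually picks agh/sgh/mgh or mgrk/mg depends on -march=arch12 and on the surrounding code, so this only demonstrates the arithmetic the patterns describe.

```c
#include <assert.h>

/* DImode value plus/minus/times a sign-extended halfword loaded from
   memory: the RTL shape of "*adddi3_sign", "*subdi3_sign" and the
   HImode "*muldi3_sign" pattern.  */
long long
add_half (long long a, const short *p)
{
  return a + *p;
}

long long
sub_half (long long a, const short *p)
{
  return a - *p;
}

long long
mul_half (long long a, const short *p)
{
  return a * *p;
}

/* Signed 64x64->128 bit multiplication: both operands are
   sign-extended to TImode first, which is the shape of "mulditi3".  */
__int128
mul_wide (long long a, long long b)
{
  return (__int128) a * (__int128) b;
}

/* Small self-check of the semantics above.  */
void
check_arithmetic (void)
{
  short h = -3;

  assert (add_half (100, &h) == 97);
  assert (sub_half (100, &h) == 103);
  assert (mul_half (100, &h) == -300);

  /* 2^62 * 4 = 2^64 does not fit into 64 bits but fits into 128.  */
  assert (mul_wide (1LL << 62, 4) == ((__int128) 1 << 64));
  assert (mul_wide (-1, 1LL << 62) == -((__int128) 1 << 62));
}
```

The halfword operand is deliberately negative in the self-check so that a zero extension instead of a sign extension would be noticed.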
---
 gcc/ChangeLog                                      |  7 ++
 gcc/config/s390/s390.md                            | 98 +++++++++++++++++++---
 gcc/testsuite/ChangeLog                            |  6 ++
 gcc/testsuite/gcc.target/s390/arch12/aghsghmgh-1.c | 23 +++++
 gcc/testsuite/gcc.target/s390/arch12/mul-1.c       | 30 +++++++
 gcc/testsuite/gcc.target/s390/arch12/mul-2.c       | 16 ++++
 6 files changed, 167 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/arch12/aghsghmgh-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/arch12/mul-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/arch12/mul-2.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index b3f3d95..3753ad6 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,12 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* config/s390/s390.md ("*adddi3_sign", "*subdi3_sign", "mulditi3")
+	("mulditi3_2", "*muldi3_sign"): New patterns.
+	("muldi3", "*muldi3", "mulsi3", "*mulsi3"): Add an expander and
+	rename the pattern definition.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* config/s390/s390.md ("indirect_jump"): Turn insn definition into
 	expander.
 	("*indirect_jump", "*indirect2_jump"): New pattern definitions.
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 32753ef..93a0bc6 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -5795,6 +5795,15 @@
    (set_attr "cpu_facility" "*,z196,extimm,z10")
    (set_attr "z10prop" "z10_super_E1,*,z10_super_E1,z10_super_E1")])
 
+(define_insn "*adddi3_sign"
+  [(set (match_operand:DI                          0 "register_operand" "=d")
+        (plus:DI (sign_extend:DI (match_operand:HI 2 "memory_operand"    "T"))
+		 (match_operand:DI                 1 "register_operand"  "0")))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_ARCH12"
+  "agh\t%0,%2"
+  [(set_attr "op_type"  "RXY")])
+
 ;
 ; add(tf|df|sf|td|dd)3 instruction pattern(s).
 ;
@@ -6226,6 +6235,15 @@
    (set_attr "cpu_facility" "*,z196,*,longdisp")
    (set_attr "z10prop" "z10_super_c_E1,*,z10_super_E1,z10_super_E1")])
 
+(define_insn "*subdi3_sign"
+  [(set (match_operand:DI                           0 "register_operand" "=d")
+        (minus:DI (match_operand:DI                 1 "register_operand"  "0")
+                  (sign_extend:DI (match_operand:HI 2 "memory_operand"    "T"))))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_ARCH12"
+  "sgh\t%0,%2"
+  [(set_attr "op_type"  "RXY")])
+
 
 ;
 ; sub(tf|df|sf|td|dd)3 instruction pattern(s).
@@ -6565,6 +6583,14 @@
 ; muldi3 instruction pattern(s).
 ;
 
+(define_expand "muldi3"
+  [(parallel
+    [(set (match_operand:DI          0 "register_operand")
+	  (mult:DI (match_operand:DI 1 "nonimmediate_operand")
+		   (match_operand:DI 2 "general_operand")))
+     (clobber (reg:CC CC_REGNUM))])]
+  "TARGET_ZARCH")
+
 (define_insn "*muldi3_sign"
   [(set (match_operand:DI 0 "register_operand" "=d,d")
         (mult:DI (sign_extend:DI (match_operand:SI 2 "general_operand" "d,T"))
@@ -6576,24 +6602,68 @@
   [(set_attr "op_type"      "RRE,RXY")
    (set_attr "type"         "imuldi")])
 
-(define_insn "muldi3"
-  [(set (match_operand:DI 0 "register_operand" "=d,d,d,d")
-        (mult:DI (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,0")
-                 (match_operand:DI 2 "general_operand" "d,K,T,Os")))]
+(define_insn "*muldi3"
+  [(set (match_operand:DI          0 "register_operand"     "=d,d,d,d,d")
+	(mult:DI (match_operand:DI 1 "nonimmediate_operand" "%0,d,0,0,0")
+		 (match_operand:DI 2 "general_operand"       "d,d,K,T,Os")))
+   (clobber (match_scratch:CC      3                        "=X,c,X,X,X"))]
   "TARGET_ZARCH"
   "@
    msgr\t%0,%2
+   msgrkc\t%0,%1,%2
    mghi\t%0,%h2
    msg\t%0,%2
    msgfi\t%0,%2"
-  [(set_attr "op_type"      "RRE,RI,RXY,RIL")
+  [(set_attr "op_type"      "RRE,RRF,RI,RXY,RIL")
    (set_attr "type"         "imuldi")
-   (set_attr "cpu_facility" "*,*,*,z10")])
+   (set_attr "cpu_facility" "*,arch12,*,*,z10")])
+
+(define_insn "mulditi3"
+  [(set (match_operand:TI 0 "register_operand"               "=d,d")
+        (mult:TI (sign_extend:TI
+		  (match_operand:DI 1 "register_operand"     "%d,0"))
+		 (sign_extend:TI
+		  (match_operand:DI 2 "nonimmediate_operand" " d,T"))))]
+  "TARGET_ARCH12"
+  "@
+   mgrk\t%0,%1,%2
+   mg\t%0,%2"
+  [(set_attr "op_type"  "RRF,RXY")])
+
+; Combine likes op1 and op2 to be swapped sometimes.
+(define_insn "mulditi3_2"
+  [(set (match_operand:TI 0 "register_operand"               "=d,d")
+        (mult:TI (sign_extend:TI
+		  (match_operand:DI 1 "nonimmediate_operand" "%d,T"))
+		 (sign_extend:TI
+		  (match_operand:DI 2 "register_operand"     " d,0"))))]
+  "TARGET_ARCH12"
+  "@
+   mgrk\t%0,%1,%2
+   mg\t%0,%1"
+  [(set_attr "op_type"  "RRF,RXY")])
+
+(define_insn "*muldi3_sign"
+  [(set (match_operand:DI                          0 "register_operand" "=d")
+        (mult:DI (sign_extend:DI (match_operand:HI 2 "memory_operand"    "T"))
+                 (match_operand:DI                 1 "register_operand"  "0")))]
+  "TARGET_ARCH12"
+  "mgh\t%0,%2"
+  [(set_attr "op_type" "RXY")])
+
 
 ;
 ; mulsi3 instruction pattern(s).
 ;
 
+(define_expand "mulsi3"
+  [(parallel
+    [(set (match_operand:SI           0 "register_operand"     "=d,d,d,d,d,d")
+	  (mult:SI  (match_operand:SI 1 "nonimmediate_operand" "%0,d,0,0,0,0")
+		    (match_operand:SI 2 "general_operand"       "d,d,K,R,T,Os")))
+     (clobber (reg:CC CC_REGNUM))])]
+  "")
+
 (define_insn "*mulsi3_sign"
   [(set (match_operand:SI 0 "register_operand" "=d,d")
         (mult:SI (sign_extend:SI (match_operand:HI 2 "memory_operand" "R,T"))
@@ -6606,20 +6676,22 @@
    (set_attr "type"         "imulhi")
    (set_attr "cpu_facility" "*,z10")])
 
-(define_insn "mulsi3"
-  [(set (match_operand:SI 0 "register_operand" "=d,d,d,d,d")
-        (mult:SI  (match_operand:SI 1 "nonimmediate_operand" "%0,0,0,0,0")
-                  (match_operand:SI 2 "general_operand" "d,K,R,T,Os")))]
+(define_insn "*mulsi3"
+  [(set (match_operand:SI           0 "register_operand"     "=d,d,d,d,d,d")
+        (mult:SI  (match_operand:SI 1 "nonimmediate_operand" "%0,d,0,0,0,0")
+                  (match_operand:SI 2 "general_operand"       "d,d,K,R,T,Os")))
+   (clobber (match_scratch:CC       3                        "=X,c,X,X,X,X"))]
   ""
   "@
    msr\t%0,%2
+   msrkc\t%0,%1,%2
    mhi\t%0,%h2
    ms\t%0,%2
    msy\t%0,%2
    msfi\t%0,%2"
-  [(set_attr "op_type"      "RRE,RI,RX,RXY,RIL")
-   (set_attr "type"         "imulsi,imulhi,imulsi,imulsi,imulsi")
-   (set_attr "cpu_facility" "*,*,*,longdisp,z10")])
+  [(set_attr "op_type"      "RRE,RRF,RI,RX,RXY,RIL")
+   (set_attr "type"         "imulsi,*,imulhi,imulsi,imulsi,imulsi")
+   (set_attr "cpu_facility" "*,arch12,*,*,longdisp,z10")])
 
 ;
 ; mulsidi3 instruction pattern(s).
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 4efc391..9247f53 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,11 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* gcc.target/s390/arch12/aghsghmgh-1.c: New test.
+	* gcc.target/s390/arch12/mul-1.c: New test.
+	* gcc.target/s390/arch12/mul-2.c: New test.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* gcc.target/s390/vxe/vllezlf-1.c: New test.
 
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
diff --git a/gcc/testsuite/gcc.target/s390/arch12/aghsghmgh-1.c b/gcc/testsuite/gcc.target/s390/arch12/aghsghmgh-1.c
new file mode 100644
index 0000000..fc844c3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/arch12/aghsghmgh-1.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+
+long long
+agh (long long a, short int *p)
+{
+  return a + *p;
+}
+
+long long
+sgh (long long a, short int *p)
+{
+  return a - *p;
+}
+
+long long
+mgh (long long a, short int *p)
+{
+  return a * *p;
+}
+
+/* { dg-final { scan-assembler-times "\tagh\t" 1 } } */
+/* { dg-final { scan-assembler-times "\tsgh\t" 1 } } */
+/* { dg-final { scan-assembler-times "\tmgh\t" 1 } } */
diff --git a/gcc/testsuite/gcc.target/s390/arch12/mul-1.c b/gcc/testsuite/gcc.target/s390/arch12/mul-1.c
new file mode 100644
index 0000000..ef39535
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/arch12/mul-1.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+
+int
+msrkc (int unused, int a, int b)
+{
+  return a * b;
+}
+
+long long
+msgrkc (int unused, long long a, long long b)
+{
+  return a * b;
+}
+
+/* Make sure the 2 operand versions are still being used.  */
+
+int
+msr (int a, int b)
+{
+  return a * b;
+}
+
+long long
+msgr (long long a, long long b)
+{
+  return a * b;
+}
+
+/* { dg-final { scan-assembler-times "\tmsrkc\t" 1 } } */
+/* { dg-final { scan-assembler-times "\tmsgrkc\t" 1 } } */
diff --git a/gcc/testsuite/gcc.target/s390/arch12/mul-2.c b/gcc/testsuite/gcc.target/s390/arch12/mul-2.c
new file mode 100644
index 0000000..ad3b11e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/arch12/mul-2.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target int128 } } */
+
+__int128
+mgrk (long long a, long long b)
+{
+  return (__int128)a * (__int128)b;
+}
+
+__int128
+mg (long long a, long long *b)
+{
+  return (__int128)a * (__int128)*b;
+}
+
+/* { dg-final { scan-assembler-times "\tmgrk\t" 1 } } */
+/* { dg-final { scan-assembler-times "\tmg\t" 1 } } */
-- 
2.9.1


* [PATCH 13/16] S/390: arch12: Add indirect branch pattern
  2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
                   ` (13 preceding siblings ...)
  2017-03-24 14:13 ` [PATCH 05/16] S/390: movdf improvements Andreas Krebbel
@ 2017-03-24 14:22 ` Andreas Krebbel
  14 siblings, 0 replies; 16+ messages in thread
From: Andreas Krebbel @ 2017-03-24 14:22 UTC (permalink / raw)
  To: gcc-patches

This adds support for the arch12 branch indirect instruction (bi),
which takes the branch target address from a memory operand.
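
As an illustration of where an indirect jump with a memory operand can come from, here is a minimal sketch using the GNU C "labels as values" extension. The function name dispatch and the three-entry table are hypothetical and not taken from the patch. A computed goto is normally expanded through the indirect_jump pattern; whether the target address is then kept in memory (and bi chosen) or loaded into a register first (br) is up to the compiler and is only an assumption here.

```c
#include <assert.h>

/* Hypothetical three-opcode dispatcher; names are illustrative only.  */
int
dispatch (unsigned int op, int x)
{
  /* Table of label addresses.  Being static, the branch targets live
     in memory rather than in registers.  */
  static void *const targets[] = { &&do_inc, &&do_dec, &&do_neg };

  assert (op < sizeof targets / sizeof targets[0]);

  /* Computed goto: the branch target is read from the table.  */
  goto *targets[op];

 do_inc:
  return x + 1;
 do_dec:
  return x - 1;
 do_neg:
  return -x;
}
```

For example, dispatch (0, 5) takes the do_inc branch and dispatch (2, 5) takes the do_neg branch.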

gcc/ChangeLog:

2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/s390.md ("indirect_jump"): Turn insn definition into
	expander.
	("*indirect_jump", "*indirect2_jump"): New pattern definitions.
---
 gcc/ChangeLog           |  6 ++++++
 gcc/config/s390/s390.md | 48 +++++++++++++++++++++++++++++++++++++-----------
 2 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index a48b743..b3f3d95 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,11 @@
 2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
 
+	* config/s390/s390.md ("indirect_jump"): Turn insn definition into
+	expander.
+	("*indirect_jump", "*indirect2_jump"): New pattern definitions.
+
+2017-03-24  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>
+
 	* config/s390/s390.c (s390_expand_vec_init): Use vllezl
 	instruction if possible.
 	* config/s390/vector.md (vec_halfnumelts): New mode
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 53c8fed..32753ef 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -9509,20 +9509,46 @@
 ; indirect-jump instruction pattern(s).
 ;
 
-(define_insn "indirect_jump"
- [(set (pc) (match_operand 0 "address_operand" "ZR"))]
+(define_expand "indirect_jump"
+  [(set (pc) (match_operand 0 "nonimmediate_operand" ""))]
   ""
 {
-  if (get_attr_op_type (insn) == OP_TYPE_RR)
-    return "br\t%0";
+  if (address_operand (operands[0], GET_MODE (operands[0])))
+    ;
+  else if (TARGET_ARCH12
+	   && GET_MODE (operands[0]) == Pmode
+	   && memory_operand (operands[0], Pmode))
+    ;
   else
-    return "b\t%a0";
-}
-  [(set (attr "op_type")
-        (if_then_else (match_operand 0 "register_operand" "")
-                      (const_string "RR") (const_string "RX")))
-   (set_attr "type"  "branch")
-   (set_attr "atype" "agen")])
+    operands[0] = force_reg (Pmode, operands[0]);
+})
+
+(define_insn "*indirect_jump"
+  [(set (pc)
+	(match_operand 0 "address_operand" "a,ZR"))]
+ ""
+ "@
+  br\t%0
+  b\t%a0"
+ [(set_attr "op_type" "RR,RX")
+  (set_attr "type"  "branch")
+  (set_attr "atype" "agen")
+  (set_attr "cpu_facility" "*")])
+
+; FIXME: LRA does not appear to be able to deal with MEMs being
+; checked against address constraints like ZR above.  So make this a
+; separate pattern for now.
+(define_insn "*indirect2_jump"
+  [(set (pc)
+	(match_operand 0 "nonimmediate_operand" "a,T"))]
+ ""
+ "@
+  br\t%0
+  bi\t%0"
+ [(set_attr "op_type" "RR,RXY")
+  (set_attr "type"  "branch")
+  (set_attr "atype" "agen")
+  (set_attr "cpu_facility" "*,arch12")])
 
 ;
 ; casesi instruction pattern(s).
-- 
2.9.1


Thread overview: 16+ messages
2017-03-24 14:11 [Committed] S/390: Add arch12 support Andreas Krebbel
2017-03-24 14:11 ` [PATCH 07/16] S/390: Use wfc for scalar vector compares Andreas Krebbel
2017-03-24 14:11 ` [PATCH 02/16] S/390: Improve support of 128 bit vectors in GPRs Andreas Krebbel
2017-03-24 14:11 ` [PATCH 12/16] S/390: arch12: Add vllezlf instruction Andreas Krebbel
2017-03-24 14:11 ` [PATCH 06/16] S/390: Move and rename vector check Andreas Krebbel
2017-03-24 14:11 ` [PATCH 03/16] S/390: vec_init improvements Andreas Krebbel
2017-03-24 14:13 ` [PATCH 14/16] S/390: arch12: Support the mul/add/subtract instructions Andreas Krebbel
2017-03-24 14:13 ` [PATCH 09/16] S/390: arch12: Add arch12 option Andreas Krebbel
2017-03-24 14:13 ` [PATCH 08/16] S/390: Rearrange fixuns_trunc pattern definitions Andreas Krebbel
2017-03-24 14:13 ` [PATCH 04/16] S/390: movsf/sd pattern fixes Andreas Krebbel
2017-03-24 14:13 ` [PATCH 11/16] S/390: arch12: New vector popcount variants Andreas Krebbel
2017-03-24 14:13 ` [PATCH 15/16] S/390: arch12: Support new vector floating point modes Andreas Krebbel
2017-03-24 14:13 ` [PATCH 01/16] S/390: Rename cpu facility vec to vx Andreas Krebbel
2017-03-24 14:13 ` [PATCH 10/16] S/390: arch12: Add support for new vector bit operations Andreas Krebbel
2017-03-24 14:13 ` [PATCH 05/16] S/390: movdf improvements Andreas Krebbel
2017-03-24 14:22 ` [PATCH 13/16] S/390: arch12: Add indirect branch pattern Andreas Krebbel
