public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes
@ 2016-02-17 18:51 Andreas Krebbel
  2016-02-17 18:51 ` [PATCH 3/9] S/390: z13 inline stpcpy implementation Andreas Krebbel
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: Andreas Krebbel @ 2016-02-17 18:51 UTC (permalink / raw)
  To: gcc-patches

I have had this patchset in my local tree for quite a while now.
Posting it was held up by some internal process hurdles.  I'm aware
it isn't stage 4 material.  I would nevertheless like to commit it
because:

* It is z13 only and z13 support is new in GCC 6 anyway.  The risk of
  causing regressions for other CPU levels is small (hopefully).

* It is required to get rid of some nasty performance regressions
  which can otherwise be observed with -march=z13.

Any objections?

Bye,

-Andreas-

Andreas Krebbel (9):
  S/390: Add IBM z13 pipeline description
  S/390: z13 lcbb fix address operand.
  S/390: z13 inline stpcpy implementation.
  S/390: Adjust movstr-1.c testcase to work with the z13 stpcpy
    implementation.
  S/390: z13 fix mode in vcond expansion
  S/390: Add vec_sub_u128 to vecintrin.h
  S/390: z13 Change predicates of 128 bit add sub.
  S/390: Add single element vector types to iterators.
  S/390: z13 Add missing commutative operand markers.

 gcc/config/s390/2827.md                            |   9 +-
 gcc/config/s390/2964.md                            |  64 ++++
 gcc/config/s390/s390-protos.h                      |   1 +
 gcc/config/s390/s390.c                             | 381 +++++++++++++++++----
 gcc/config/s390/s390.md                            |  19 +-
 gcc/config/s390/vecintrin.h                        |   1 +
 gcc/config/s390/vector.md                          |  60 ++--
 gcc/config/s390/vx-builtins.md                     |  56 +--
 gcc/testsuite/gcc.target/s390/md/movstr-1.c        |   2 +-
 gcc/testsuite/gcc.target/s390/md/movstr-2.c        |  98 ++++++
 gcc/testsuite/gcc.target/s390/vector/int128-1.c    |  47 +++
 gcc/testsuite/gcc.target/s390/vector/vec-vcond-1.c |  23 ++
 12 files changed, 628 insertions(+), 133 deletions(-)
 create mode 100644 gcc/config/s390/2964.md
 create mode 100644 gcc/testsuite/gcc.target/s390/md/movstr-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/int128-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-vcond-1.c

-- 
1.9.1

* [PATCH 5/9] S/390: z13 fix mode in vcond expansion
  2016-02-17 18:51 [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes Andreas Krebbel
                   ` (4 preceding siblings ...)
  2016-02-17 18:51 ` [PATCH 8/9] S/390: Add single element vector types to iterators Andreas Krebbel
@ 2016-02-17 18:51 ` Andreas Krebbel
  2016-02-17 18:51 ` [PATCH 4/9] S/390: Adjust movstr-1.c testcase to work with the z13 stpcpy implementation Andreas Krebbel
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Andreas Krebbel @ 2016-02-17 18:51 UTC (permalink / raw)
  To: gcc-patches

For floating point vector compares the target mode is an integer mode,
which was accidentally used as the register mode when forcing the
compare operands into registers.

gcc/ChangeLog:

2016-02-17  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/s390.c (s390_expand_vcond): Use the compare operand
	mode.

gcc/testsuite/ChangeLog:

2016-02-17  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* gcc.target/s390/vector/vec-vcond-1.c: New test.
---
 gcc/config/s390/s390.c                             |  4 ++--
 gcc/testsuite/gcc.target/s390/vector/vec-vcond-1.c | 23 ++++++++++++++++++++++
 2 files changed, 25 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-vcond-1.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index da05a04..cd53b15 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -6329,10 +6329,10 @@ s390_expand_vcond (rtx target, rtx then, rtx els,
      can be handled by the optimization above but not by the
      following code.  Hence, force them into registers here.  */
   if (!REG_P (cmp_op1))
-    cmp_op1 = force_reg (target_mode, cmp_op1);
+    cmp_op1 = force_reg (GET_MODE (cmp_op1), cmp_op1);
 
   if (!REG_P (cmp_op2))
-    cmp_op2 = force_reg (target_mode, cmp_op2);
+    cmp_op2 = force_reg (GET_MODE (cmp_op2), cmp_op2);
 
   s390_expand_vec_compare (result_target, cond,
 			   cmp_op1, cmp_op2);
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-vcond-1.c b/gcc/testsuite/gcc.target/s390/vector/vec-vcond-1.c
new file mode 100644
index 0000000..ec65c6f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-vcond-1.c
@@ -0,0 +1,23 @@
+/* A const vector operand is forced into a register in
+   s390_expand_vcond.
+   This testcase once failed because the target mode (v2di) was picked
+   for the reg instead of the mode of the other comparison
+   operand.  */
+
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+typedef __attribute__((vector_size(16))) long   v2di;
+typedef __attribute__((vector_size(16))) double v2df;
+
+v2di
+foo (v2df a)
+{
+  return a == (v2df){ 0.0, 0.0 };
+}
+
+v2di
+bar (v2df a)
+{
+  return (v2df){ 1.0, 1.0 } == (v2df){ 0.0, 0.0 };
+}
-- 
1.9.1
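
To illustrate the mode mismatch this patch addresses: with the GNU C
vector extensions, comparing two vectors of double yields a vector of
equally sized signed integers, so the result (target) mode differs from
the mode of the compare operands.  The following is only a sketch in
plain GNU C and is not S/390 specific; the function name eq_zero is made
up for illustration.

```c
/* GNU C vector extensions: comparing two v2df values produces a v2di
   mask with -1 in lanes where the comparison holds and 0 elsewhere.  */
typedef long long v2di __attribute__ ((vector_size (16)));
typedef double v2df __attribute__ ((vector_size (16)));

v2di
eq_zero (v2df a)
{
  v2df zero = { 0.0, 0.0 };

  /* The operands have a floating point vector type, the result an
     integer vector type.  */
  return a == zero;
}
```

A register holding one of the compare operands therefore has to use the
floating point vector mode, not the integer mode of the result.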

* [PATCH 6/9] S/390: Add vec_sub_u128 to vecintrin.h
  2016-02-17 18:51 [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes Andreas Krebbel
  2016-02-17 18:51 ` [PATCH 3/9] S/390: z13 inline stpcpy implementation Andreas Krebbel
@ 2016-02-17 18:51 ` Andreas Krebbel
  2016-02-17 18:51 ` [PATCH 2/9] S/390: z13 lcbb fix address operand Andreas Krebbel
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Andreas Krebbel @ 2016-02-17 18:51 UTC (permalink / raw)
  To: gcc-patches

This adds a missing macro to the vecintrin.h header file.

gcc/ChangeLog:

2016-02-17  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/vecintrin.h (vec_sub_u128): Define missing macro.
---
 gcc/config/s390/vecintrin.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/config/s390/vecintrin.h b/gcc/config/s390/vecintrin.h
index b9742ec..ab82e7a 100644
--- a/gcc/config/s390/vecintrin.h
+++ b/gcc/config/s390/vecintrin.h
@@ -80,6 +80,7 @@ __lcbb(const void *ptr, int bndry)
 #define vec_checksum __builtin_s390_vcksm
 #define vec_gfmsum_128 __builtin_s390_vgfmg
 #define vec_gfmsum_accum_128 __builtin_s390_vgfmag
+#define vec_sub_u128 __builtin_s390_vsq
 #define vec_subc_u128 __builtin_s390_vscbiq
 #define vec_sube_u128 __builtin_s390_vsbiq
 #define vec_subec_u128 __builtin_s390_vsbcbiq
-- 
1.9.1
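
As a rough portable model of what the new macro is expected to compute,
assuming vec_sub_u128 yields the plain 128 bit difference modulo 2^128
(which is what the builtin name vsq suggests), the operation can be
sketched with two 64 bit halves.  The struct u128 and the function
model_sub_u128 are hypothetical names used only for this illustration;
they are not part of vecintrin.h.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128 bit value split into two halves.  */
struct u128
{
  uint64_t hi;
  uint64_t lo;
};

/* Compute a - b modulo 2^128.  */
struct u128
model_sub_u128 (struct u128 a, struct u128 b)
{
  struct u128 r;

  r.lo = a.lo - b.lo;
  /* A borrow out of the low half occurs exactly when a.lo < b.lo.  */
  r.hi = a.hi - b.hi - (a.lo < b.lo);
  return r;
}
```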

* [PATCH 3/9] S/390: z13 inline stpcpy implementation.
  2016-02-17 18:51 [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes Andreas Krebbel
@ 2016-02-17 18:51 ` Andreas Krebbel
  2016-02-17 18:51 ` [PATCH 6/9] S/390: Add vec_sub_u128 to vecintrin.h Andreas Krebbel
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Andreas Krebbel @ 2016-02-17 18:51 UTC (permalink / raw)
  To: gcc-patches

A handwritten loop for stpcpy using the new z13 vector instructions
appears to be much faster than the millicoded instruction.  However,
the implementation is much longer and therefore will only be enabled
when optimizing for speed.

gcc/testsuite/ChangeLog:

2016-02-17  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* gcc.target/s390/md/movstr-2.c: New test.

gcc/ChangeLog:

2016-02-17  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/s390-protos.h: Add s390_expand_vec_movstr prototype.
	* config/s390/s390.c (s390_expand_vec_movstr): New function.
	* config/s390/s390.md ("movstr<P:mode>"): Call
	s390_expand_vec_movstr.
---
 gcc/config/s390/s390-protos.h               |   1 +
 gcc/config/s390/s390.c                      | 118 ++++++++++++++++++++++++++++
 gcc/config/s390/s390.md                     |  12 ++-
 gcc/testsuite/gcc.target/s390/md/movstr-2.c |  98 +++++++++++++++++++++++
 4 files changed, 227 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/md/movstr-2.c
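
The control flow of the expansion can be modelled in scalar C.  This is
a simplified sketch, not the emitted code: the helper zero_index stands
in for the vector "find element not equal" step, memcpy stands in for
the vector stores, and the quick check here only looks up to the next 16
byte boundary whereas the expander loads up to the 4K boundary.  All
names are invented for the illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Index of the first zero byte in p[0..n-1], or n if there is none.  */
static size_t
zero_index (const char *p, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (p[i] == 0)
      break;
  return i;
}

char *
chunked_stpcpy (char *dst, const char *src)
{
  /* Bytes from src up to the next 16 byte boundary (1..16).  */
  size_t head = 16 - ((uintptr_t) src & 15);
  size_t pos = zero_index (src, head);
  size_t offset;

  /* Quick check: the string ends within the first chunk.  */
  if (pos < head)
    {
      memcpy (dst, src, pos + 1);
      return dst + pos;
    }

  /* Copy the unaligned head; src + offset is now 16 byte aligned.  */
  memcpy (dst, src, head);
  offset = head;

  /* Main loop: one aligned 16 byte chunk per iteration.  */
  for (;;)
    {
      pos = zero_index (src + offset, 16);
      if (pos < 16)
	break;
      memcpy (dst + offset, src + offset, 16);
      offset += 16;
    }

  /* Regular exit: store the tail including the terminating zero.  */
  memcpy (dst + offset, src + offset, pos + 1);
  return dst + offset + pos;
}
```

Because every chunk after the head starts on a 16 byte boundary, a full
16 byte load of such a chunk cannot cross a page boundary.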

diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index 09032c9..792eaa7 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -109,6 +109,7 @@ extern bool s390_expand_movmem (rtx, rtx, rtx);
 extern void s390_expand_setmem (rtx, rtx, rtx);
 extern bool s390_expand_cmpmem (rtx, rtx, rtx, rtx);
 extern void s390_expand_vec_strlen (rtx, rtx, rtx);
+extern void s390_expand_vec_movstr (rtx, rtx, rtx);
 extern bool s390_expand_addcc (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
 extern bool s390_expand_insv (rtx, rtx, rtx, rtx);
 extern void s390_expand_cs_hqi (machine_mode, rtx, rtx, rtx,
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index c2e59f5..da05a04 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -5622,6 +5622,124 @@ s390_expand_vec_strlen (rtx target, rtx string, rtx alignment)
     emit_move_insn (target, temp);
 }
 
+void
+s390_expand_vec_movstr (rtx result, rtx dst, rtx src)
+{
+  int very_unlikely = REG_BR_PROB_BASE / 100 - 1;
+  rtx temp = gen_reg_rtx (Pmode);
+  rtx src_addr = XEXP (src, 0);
+  rtx dst_addr = XEXP (dst, 0);
+  rtx src_addr_reg = gen_reg_rtx (Pmode);
+  rtx dst_addr_reg = gen_reg_rtx (Pmode);
+  rtx offset = gen_reg_rtx (Pmode);
+  rtx vsrc = gen_reg_rtx (V16QImode);
+  rtx vpos = gen_reg_rtx (V16QImode);
+  rtx loadlen = gen_reg_rtx (SImode);
+  rtx gpos_qi = gen_reg_rtx (QImode);
+  rtx gpos = gen_reg_rtx (SImode);
+  rtx done_label = gen_label_rtx ();
+  rtx loop_label = gen_label_rtx ();
+  rtx exit_label = gen_label_rtx ();
+  rtx full_label = gen_label_rtx ();
+
+  /* Perform a quick check for string ending on the first up to 16
+     bytes and exit early if successful.  */
+
+  emit_insn (gen_vlbb (vsrc, src, GEN_INT (6)));
+  emit_insn (gen_lcbb (loadlen, src_addr, GEN_INT (6)));
+  emit_insn (gen_vfenezv16qi (vpos, vsrc, vsrc));
+  emit_insn (gen_vec_extractv16qi (gpos_qi, vpos, GEN_INT (7)));
+  emit_move_insn (gpos, gen_rtx_SUBREG (SImode, gpos_qi, 0));
+  /* gpos is the byte index if a zero was found and 16 otherwise.
+     So if it is lower than the loaded bytes we have a hit.  */
+  emit_cmp_and_jump_insns (gpos, loadlen, GE, NULL_RTX, SImode, 1,
+			   full_label);
+  emit_insn (gen_vstlv16qi (vsrc, gpos, dst));
+
+  force_expand_binop (Pmode, add_optab, dst_addr, gpos, result,
+		      1, OPTAB_DIRECT);
+  emit_jump (exit_label);
+  emit_barrier ();
+
+  emit_label (full_label);
+  LABEL_NUSES (full_label) = 1;
+
+  /* Calculate `offset' so that src + offset points to the last byte
+     before 16 byte alignment.  */
+
+  /* temp = src_addr & 0xf */
+  force_expand_binop (Pmode, and_optab, src_addr, GEN_INT (15), temp,
+		      1, OPTAB_DIRECT);
+
+  /* offset = 0xf - temp */
+  emit_move_insn (offset, GEN_INT (15));
+  force_expand_binop (Pmode, sub_optab, offset, temp, offset,
+		      1, OPTAB_DIRECT);
+
+  /* Store `offset' bytes in the destination string.  The quick check
+     has loaded at least `offset' bytes into vsrc.  */
+
+  emit_insn (gen_vstlv16qi (vsrc, gen_lowpart (SImode, offset), dst));
+
+  /* Advance to the next byte to be loaded.  */
+  force_expand_binop (Pmode, add_optab, offset, const1_rtx, offset,
+		      1, OPTAB_DIRECT);
+
+  /* Make sure the addresses are single regs which can be used as a
+     base.  */
+  emit_move_insn (src_addr_reg, src_addr);
+  emit_move_insn (dst_addr_reg, dst_addr);
+
+  /* MAIN LOOP */
+
+  emit_label (loop_label);
+  LABEL_NUSES (loop_label) = 1;
+
+  emit_move_insn (vsrc,
+		  gen_rtx_MEM (V16QImode,
+			       gen_rtx_PLUS (Pmode, src_addr_reg, offset)));
+
+  emit_insn (gen_vec_vfenesv16qi (vpos, vsrc, vsrc,
+				  GEN_INT (VSTRING_FLAG_ZS | VSTRING_FLAG_CS)));
+  add_int_reg_note (s390_emit_ccraw_jump (8, EQ, done_label),
+		    REG_BR_PROB, very_unlikely);
+
+  emit_move_insn (gen_rtx_MEM (V16QImode,
+			       gen_rtx_PLUS (Pmode, dst_addr_reg, offset)),
+		  vsrc);
+  /* offset += 16 */
+  force_expand_binop (Pmode, add_optab, offset, GEN_INT (16),
+		      offset,  1, OPTAB_DIRECT);
+
+  emit_jump (loop_label);
+  emit_barrier ();
+
+  /* REGULAR EXIT */
+
+  /* We are done.  Add the offset of the zero character to the dst_addr
+     pointer to get the result.  */
+
+  emit_label (done_label);
+  LABEL_NUSES (done_label) = 1;
+
+  force_expand_binop (Pmode, add_optab, dst_addr_reg, offset, dst_addr_reg,
+		      1, OPTAB_DIRECT);
+
+  emit_insn (gen_vec_extractv16qi (gpos_qi, vpos, GEN_INT (7)));
+  emit_move_insn (gpos, gen_rtx_SUBREG (SImode, gpos_qi, 0));
+
+  emit_insn (gen_vstlv16qi (vsrc, gpos, gen_rtx_MEM (BLKmode, dst_addr_reg)));
+
+  force_expand_binop (Pmode, add_optab, dst_addr_reg, gpos, result,
+		      1, OPTAB_DIRECT);
+
+  /* EARLY EXIT */
+
+  emit_label (exit_label);
+  LABEL_NUSES (exit_label) = 1;
+}
+
+
 /* Expand conditional increment or decrement using alc/slb instructions.
    Should generate code setting DST to either SRC or SRC + INCREMENT,
    depending on the result of the comparison CMP_OP0 CMP_CODE CMP_OP1.
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 55ae705..2c90eae 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -2953,8 +2953,16 @@
      (clobber (reg:CC CC_REGNUM))])]
   ""
 {
-  rtx addr1 = gen_reg_rtx (Pmode);
-  rtx addr2 = gen_reg_rtx (Pmode);
+  rtx addr1, addr2;
+
+  if (TARGET_VX && optimize_function_for_speed_p (cfun))
+    {
+      s390_expand_vec_movstr (operands[0], operands[1], operands[2]);
+      DONE;
+    }
+
+  addr1 = gen_reg_rtx (Pmode);
+  addr2 = gen_reg_rtx (Pmode);
 
   emit_move_insn (addr1, force_operand (XEXP (operands[1], 0), NULL_RTX));
   emit_move_insn (addr2, force_operand (XEXP (operands[2], 0), NULL_RTX));
diff --git a/gcc/testsuite/gcc.target/s390/md/movstr-2.c b/gcc/testsuite/gcc.target/s390/md/movstr-2.c
new file mode 100644
index 0000000..1b977a2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/md/movstr-2.c
@@ -0,0 +1,98 @@
+/* The z13 stpcpy implementation plays some alignment tricks for good
+   performance.  This test tries to make sure it works correctly and
+   does not access bytes beyond the source and destination
+   strings.  */
+
+/* { dg-do run } */
+
+#include <stdio.h>
+#include <sys/mman.h>
+
+#define PAGE_SIZE 4096
+
+struct {
+  char unused[PAGE_SIZE - 32];
+  char m32[15]; /* page bndry - 32 */
+  char m17[1];
+  char m16[1];
+  char m15[14];
+  char m1[1];
+  char next_page[PAGE_SIZE];
+} s, d __attribute__((aligned(PAGE_SIZE)));
+
+char *__attribute__((noinline))
+my_stpcpy(char *dest, const char *src)
+{
+  return __builtin_stpcpy (dest, src);
+}
+
+void __attribute__ ((noinline))
+check (char *dest, char *src, size_t len)
+{
+  char *result;
+
+  result = my_stpcpy (dest, src);
+  if (result != dest + len)
+    __builtin_abort ();
+  if (__builtin_memcmp (src, dest, len) != 0)
+    __builtin_abort ();
+}
+
+int
+main ()
+{
+  char *src[5] = { s.m32, s.m17, s.m16, s.m15, s.m1 };
+  char *dst[5] = { d.m32, d.m17, d.m16, d.m15, d.m1 };
+  int len[8] = { 33, 32, 31, 17, 16, 15, 1, 0 };
+  int i, j, k;
+  char backup;
+
+  for (i = 0; i < sizeof (s); i++)
+    ((char*)&s)[i] = i % 26 + 97;
+
+  for (i = 0; i < 5; i++)
+    for (j = 0; j < 5; j++)
+      for (k = 0; k < 8; k++)
+	{
+	  backup = src[j][len[k]];
+	  src[j][len[k]] = 0;
+	  __builtin_memset (&d, 0, sizeof (d));
+	  check (dst[i], src[j], len[k]);
+	  src[j][len[k]] = backup;
+	}
+
+  /* Make all source strings end before the page boundary.  */
+  backup = s.m1[0];
+  s.m1[0] = 0;
+
+  if (mprotect (&s.next_page, PAGE_SIZE, PROT_NONE) == -1)
+    perror ("mprotect src");
+
+  for (i = 0; i < 5; i++)
+    for (j = 0; j < 5; j++)
+      check (dst[i], src[j],
+	     PAGE_SIZE - ((unsigned long)src[j] & ((1UL << 12) - 1)) - 1);
+
+  if (mprotect (&s.next_page, PAGE_SIZE, PROT_READ | PROT_WRITE) == -1)
+    perror ("mprotect src");
+
+  s.m1[0] = backup;
+
+  if (mprotect (&d.next_page, PAGE_SIZE, PROT_NONE) == -1)
+    perror ("mprotect dst");
+
+  for (i = 0; i < 5; i++)
+    for (j = 0; j < 5; j++)
+      {
+	int len = PAGE_SIZE - ((unsigned long)dst[i] & ((1UL << 12) - 1)) - 1;
+	char backup = src[j][len];
+
+	src[j][len] = 0;
+	__builtin_memset (&d, 0,
+			  (unsigned long)&d.next_page - (unsigned long)&d);
+	check (dst[i], src[j], len);
+	src[j][len] = backup;
+      }
+
+  return 0;
+}
-- 
1.9.1
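
The guard page technique used by movstr-2.c can be shown in isolation.
The sketch below is an assumption-laden variant: it uses an anonymous
mmap instead of the page aligned static struct from the test, and the
helper names map_with_guard and place_before_guard are made up.  Any
access at or beyond the returned guard address would fault.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map two pages and make the second one inaccessible.  Returns the
   address of the inaccessible page, or NULL on failure.  */
char *
map_with_guard (size_t *page_size_out)
{
  long ps = sysconf (_SC_PAGESIZE);
  char *base;

  if (ps <= 0)
    return NULL;

  base = mmap (NULL, 2 * (size_t) ps, PROT_READ | PROT_WRITE,
	       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED)
    return NULL;

  if (mprotect (base + ps, (size_t) ps, PROT_NONE) != 0)
    {
      munmap (base, 2 * (size_t) ps);
      return NULL;
    }

  *page_size_out = (size_t) ps;
  return base + ps;
}

/* Build a string of LEN characters whose terminating zero is the last
   accessible byte before GUARD.  LEN + 1 must not exceed the page
   size.  */
char *
place_before_guard (char *guard, size_t len)
{
  char *s = guard - (len + 1);

  memset (s, 'a', len);
  s[len] = '\0';
  return s;
}
```

A string routine that reads even a single byte past the terminator of
such a string crashes instead of silently succeeding.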

* [PATCH 7/9] S/390: z13 Change predicates of 128 bit add sub.
  2016-02-17 18:51 [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes Andreas Krebbel
                   ` (6 preceding siblings ...)
  2016-02-17 18:51 ` [PATCH 4/9] S/390: Adjust movstr-1.c testcase to work with the z13 stpcpy implementation Andreas Krebbel
@ 2016-02-17 18:51 ` Andreas Krebbel
  2016-02-17 18:51 ` [PATCH 9/9] S/390: z13 Add missing commutative operand markers Andreas Krebbel
  2016-02-18  8:38 ` [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes Richard Biener
  9 siblings, 0 replies; 11+ messages in thread
From: Andreas Krebbel @ 2016-02-17 18:51 UTC (permalink / raw)
  To: gcc-patches

So far the 128 bit add/sub instructions were not used if the second
operand was a constant because the operand predicate rejected it.

gcc/testsuite/ChangeLog:

2016-02-17  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* gcc.target/s390/vector/int128-1.c: New test.

gcc/ChangeLog:

2016-02-17  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/vector.md ("<ti*>add<mode>3", "<ti*>sub<mode>3"):
	Change the predicate of op2 from nonimmediate to general and let
	reload fix it if necessary.
---
 gcc/config/s390/vector.md                       |  4 +--
 gcc/testsuite/gcc.target/s390/vector/int128-1.c | 47 +++++++++++++++++++++++++
 2 files changed, 49 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/int128-1.c

diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 2302a8f..cdb9ba6 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -454,7 +454,7 @@
 (define_insn "<ti*>add<mode>3"
   [(set (match_operand:VIT           0 "nonimmediate_operand" "=v")
 	(plus:VIT (match_operand:VIT 1 "nonimmediate_operand"  "v")
-		  (match_operand:VIT 2 "nonimmediate_operand"  "v")))]
+		  (match_operand:VIT 2 "general_operand"  "v")))]
   "TARGET_VX"
   "va<bhfgq>\t%v0,%v1,%v2"
   [(set_attr "op_type" "VRR")])
@@ -463,7 +463,7 @@
 (define_insn "<ti*>sub<mode>3"
   [(set (match_operand:VIT            0 "nonimmediate_operand" "=v")
 	(minus:VIT (match_operand:VIT 1 "nonimmediate_operand"  "v")
-		   (match_operand:VIT 2 "nonimmediate_operand"  "v")))]
+		   (match_operand:VIT 2 "general_operand"  "v")))]
   "TARGET_VX"
   "vs<bhfgq>\t%v0,%v1,%v2"
   [(set_attr "op_type" "VRR")])
diff --git a/gcc/testsuite/gcc.target/s390/vector/int128-1.c b/gcc/testsuite/gcc.target/s390/vector/int128-1.c
new file mode 100644
index 0000000..b4a16b8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/int128-1.c
@@ -0,0 +1,47 @@
+/* Check that vaq/vsq are used for int128 operations.  */
+
+/* { dg-do compile { target { lp64 } } } */
+/* { dg-options "-O3 -mzarch -march=z13" } */
+
+
+const __int128 c = (__int128)0x0123456789abcd55 + ((__int128)7 << 64);
+
+
+__int128
+addreg(__int128 a, __int128 b)
+{
+  return a + b;
+}
+
+__int128
+addconst(__int128 a)
+{
+  return a + c;
+}
+
+__int128
+addmem(__int128 *a, __int128_t *b)
+{
+  return *a + *b;
+}
+
+__int128
+subreg(__int128 a, __int128 b)
+{
+  return a - b;
+}
+
+__int128
+subconst(__int128 a)
+{
+  return a - c; /* This becomes vaq as well.  */
+}
+
+__int128
+submem(__int128 *a, __int128_t *b)
+{
+  return *a - *b;
+}
+
+/* { dg-final { scan-assembler-times "vaq" 4 } } */
+/* { dg-final { scan-assembler-times "vsq" 2 } } */
-- 
1.9.1
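
The testcase expects four vaq but only two vsq because subtracting a
constant can be rewritten as adding the negated constant.  A minimal
sketch of that identity follows.  It assumes a target with __int128
support and uses the unsigned type so that wraparound is well defined;
the function names are invented for the illustration.

```c
/* Requires __int128 support, e.g. GCC on a 64 bit target.  */
typedef unsigned __int128 u128;

static const u128 c = ((u128) 7 << 64) + 0x0123456789abcd55ULL;

/* The subtraction as written in the source.  */
u128
sub_const (u128 a)
{
  return a - c;
}

/* The same value computed as an addition of the negated constant.  */
u128
add_negated_const (u128 a)
{
  return a + ((u128) 0 - c);
}
```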

* [PATCH 9/9] S/390: z13 Add missing commutative operand markers.
  2016-02-17 18:51 [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes Andreas Krebbel
                   ` (7 preceding siblings ...)
  2016-02-17 18:51 ` [PATCH 7/9] S/390: z13 Change predicates of 128 bit add sub Andreas Krebbel
@ 2016-02-17 18:51 ` Andreas Krebbel
  2016-02-18  8:38 ` [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes Richard Biener
  9 siblings, 0 replies; 11+ messages in thread
From: Andreas Krebbel @ 2016-02-17 18:51 UTC (permalink / raw)
  To: gcc-patches

gcc/ChangeLog:

2016-02-17  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/vector.md: Add missing commutative operand markers
	to the patterns which qualify for one.
	* config/s390/vx-builtins.md: Likewise.
---
 gcc/config/s390/vector.md      | 44 +++++++++++++++++++++---------------------
 gcc/config/s390/vx-builtins.md | 44 +++++++++++++++++++++---------------------
 2 files changed, 44 insertions(+), 44 deletions(-)
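
For context, the % constraint modifier declares an operand commutative
with the operand that follows it, so the register allocator is free to
swap the two.  That is only safe when the result does not depend on the
operand order.  The plain C sketch below, with made-up helper names,
shows the distinction; since minus is order dependent, the % added to
subv2df3 in this patch may be worth double-checking.

```c
/* Addition is commutative, subtraction is not.  */
double
add (double a, double b)
{
  return a + b;
}

double
sub (double a, double b)
{
  return a - b;
}

/* Return 1 if OP yields the same result with its operands swapped.  */
int
swap_safe (double (*op) (double, double), double a, double b)
{
  return op (a, b) == op (b, a);
}
```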

diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 3101057..cc3287c 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -453,8 +453,8 @@
 ; operation into two DImode ADDs.
 (define_insn "<ti*>add<mode>3"
   [(set (match_operand:VIT           0 "nonimmediate_operand" "=v")
-	(plus:VIT (match_operand:VIT 1 "nonimmediate_operand"  "v")
-		  (match_operand:VIT 2 "general_operand"  "v")))]
+	(plus:VIT (match_operand:VIT 1 "nonimmediate_operand" "%v")
+		  (match_operand:VIT 2 "general_operand"       "v")))]
   "TARGET_VX"
   "va<bhfgq>\t%v0,%v1,%v2"
   [(set_attr "op_type" "VRR")])
@@ -471,7 +471,7 @@
 ; vmlb, vmlhw, vmlf
 (define_insn "mul<mode>3"
   [(set (match_operand:VI_QHS              0 "register_operand" "=v")
-	(mult:VI_QHS (match_operand:VI_QHS 1 "register_operand"  "v")
+	(mult:VI_QHS (match_operand:VI_QHS 1 "register_operand" "%v")
 		     (match_operand:VI_QHS 2 "register_operand"  "v")))]
   "TARGET_VX"
   "vml<bhfgq><w>\t%v0,%v1,%v2"
@@ -526,7 +526,7 @@
 
 (define_insn "and<mode>3"
   [(set (match_operand:VT         0 "register_operand" "=v")
-	(and:VT (match_operand:VT 1 "register_operand"  "v")
+	(and:VT (match_operand:VT 1 "register_operand" "%v")
 		(match_operand:VT 2 "register_operand"  "v")))]
   "TARGET_VX"
   "vn\t%v0,%v1,%v2"
@@ -537,7 +537,7 @@
 
 (define_insn "ior<mode>3"
   [(set (match_operand:VT         0 "register_operand" "=v")
-	(ior:VT (match_operand:VT 1 "register_operand"  "v")
+	(ior:VT (match_operand:VT 1 "register_operand" "%v")
 		(match_operand:VT 2 "register_operand"  "v")))]
   "TARGET_VX"
   "vo\t%v0,%v1,%v2"
@@ -548,7 +548,7 @@
 
 (define_insn "xor<mode>3"
   [(set (match_operand:VT         0 "register_operand" "=v")
-	(xor:VT (match_operand:VT 1 "register_operand"  "v")
+	(xor:VT (match_operand:VT 1 "register_operand" "%v")
 		(match_operand:VT 2 "register_operand"  "v")))]
   "TARGET_VX"
   "vx\t%v0,%v1,%v2"
@@ -765,7 +765,7 @@
 ; vmnb, vmnh, vmnf, vmng
 (define_insn "smin<mode>3"
   [(set (match_operand:VI          0 "register_operand" "=v")
-	(smin:VI (match_operand:VI 1 "register_operand"  "v")
+	(smin:VI (match_operand:VI 1 "register_operand" "%v")
 		 (match_operand:VI 2 "register_operand"  "v")))]
   "TARGET_VX"
   "vmn<bhfgq>\t%v0,%v1,%v2"
@@ -774,7 +774,7 @@
 ; vmxb, vmxh, vmxf, vmxg
 (define_insn "smax<mode>3"
   [(set (match_operand:VI          0 "register_operand" "=v")
-	(smax:VI (match_operand:VI 1 "register_operand"  "v")
+	(smax:VI (match_operand:VI 1 "register_operand" "%v")
 		 (match_operand:VI 2 "register_operand"  "v")))]
   "TARGET_VX"
   "vmx<bhfgq>\t%v0,%v1,%v2"
@@ -783,7 +783,7 @@
 ; vmnlb, vmnlh, vmnlf, vmnlg
 (define_insn "umin<mode>3"
   [(set (match_operand:VI          0 "register_operand" "=v")
-	(umin:VI (match_operand:VI 1 "register_operand"  "v")
+	(umin:VI (match_operand:VI 1 "register_operand" "%v")
 		 (match_operand:VI 2 "register_operand"  "v")))]
   "TARGET_VX"
   "vmnl<bhfgq>\t%v0,%v1,%v2"
@@ -792,7 +792,7 @@
 ; vmxlb, vmxlh, vmxlf, vmxlg
 (define_insn "umax<mode>3"
   [(set (match_operand:VI          0 "register_operand" "=v")
-	(umax:VI (match_operand:VI 1 "register_operand"  "v")
+	(umax:VI (match_operand:VI 1 "register_operand" "%v")
 		 (match_operand:VI 2 "register_operand"  "v")))]
   "TARGET_VX"
   "vmxl<bhfgq>\t%v0,%v1,%v2"
@@ -800,8 +800,8 @@
 
 ; vmeb, vmeh, vmef
 (define_insn "vec_widen_smult_even_<mode>"
-  [(set (match_operand:<vec_double>                    0 "register_operand" "=v")
-	(unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand"  "v")
+  [(set (match_operand:<vec_double>                 0 "register_operand" "=v")
+	(unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand" "%v")
 			      (match_operand:VI_QHS 2 "register_operand"  "v")]
 			     UNSPEC_VEC_SMULT_EVEN))]
   "TARGET_VX"
@@ -811,7 +811,7 @@
 ; vmleb, vmleh, vmlef
 (define_insn "vec_widen_umult_even_<mode>"
   [(set (match_operand:<vec_double>                 0 "register_operand" "=v")
-	(unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand"  "v")
+	(unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand" "%v")
 			      (match_operand:VI_QHS 2 "register_operand"  "v")]
 			     UNSPEC_VEC_UMULT_EVEN))]
   "TARGET_VX"
@@ -821,7 +821,7 @@
 ; vmob, vmoh, vmof
 (define_insn "vec_widen_smult_odd_<mode>"
   [(set (match_operand:<vec_double>                 0 "register_operand" "=v")
-	(unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand"  "v")
+	(unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand" "%v")
 			      (match_operand:VI_QHS 2 "register_operand"  "v")]
 			     UNSPEC_VEC_SMULT_ODD))]
   "TARGET_VX"
@@ -831,7 +831,7 @@
 ; vmlob, vmloh, vmlof
 (define_insn "vec_widen_umult_odd_<mode>"
   [(set (match_operand:<vec_double>                 0 "register_operand" "=v")
-	(unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand"  "v")
+	(unspec:<vec_double> [(match_operand:VI_QHS 1 "register_operand" "%v")
 			      (match_operand:VI_QHS 2 "register_operand"  "v")]
 			     UNSPEC_VEC_UMULT_ODD))]
   "TARGET_VX"
@@ -854,7 +854,7 @@
 
 (define_insn "addv2df3"
   [(set (match_operand:V2DF            0 "register_operand" "=v")
-	(plus:V2DF (match_operand:V2DF 1 "register_operand"  "v")
+	(plus:V2DF (match_operand:V2DF 1 "register_operand" "%v")
 		   (match_operand:V2DF 2 "register_operand"  "v")))]
   "TARGET_VX"
   "vfadb\t%v0,%v1,%v2"
@@ -862,7 +862,7 @@
 
 (define_insn "subv2df3"
   [(set (match_operand:V2DF             0 "register_operand" "=v")
-	(minus:V2DF (match_operand:V2DF 1 "register_operand"  "v")
+	(minus:V2DF (match_operand:V2DF 1 "register_operand" "%v")
 		    (match_operand:V2DF 2 "register_operand"  "v")))]
   "TARGET_VX"
   "vfsdb\t%v0,%v1,%v2"
@@ -870,7 +870,7 @@
 
 (define_insn "mulv2df3"
   [(set (match_operand:V2DF            0 "register_operand" "=v")
-	(mult:V2DF (match_operand:V2DF 1 "register_operand"  "v")
+	(mult:V2DF (match_operand:V2DF 1 "register_operand" "%v")
 		   (match_operand:V2DF 2 "register_operand"  "v")))]
   "TARGET_VX"
   "vfmdb\t%v0,%v1,%v2"
@@ -893,7 +893,7 @@
 
 (define_insn "fmav2df4"
   [(set (match_operand:V2DF           0 "register_operand" "=v")
-	(fma:V2DF (match_operand:V2DF 1 "register_operand"  "v")
+	(fma:V2DF (match_operand:V2DF 1 "register_operand" "%v")
 		  (match_operand:V2DF 2 "register_operand"  "v")
 		  (match_operand:V2DF 3 "register_operand"  "v")))]
   "TARGET_VX"
@@ -902,7 +902,7 @@
 
 (define_insn "fmsv2df4"
   [(set (match_operand:V2DF                     0 "register_operand" "=v")
-	(fma:V2DF (match_operand:V2DF           1 "register_operand"  "v")
+	(fma:V2DF (match_operand:V2DF           1 "register_operand" "%v")
 		  (match_operand:V2DF           2 "register_operand"  "v")
 		  (neg:V2DF (match_operand:V2DF 3 "register_operand"  "v"))))]
   "TARGET_VX"
@@ -933,7 +933,7 @@
 ; Emulate with compare + select
 (define_insn_and_split "smaxv2df3"
   [(set (match_operand:V2DF            0 "register_operand" "=v")
-	(smax:V2DF (match_operand:V2DF 1 "register_operand"  "v")
+	(smax:V2DF (match_operand:V2DF 1 "register_operand" "%v")
 		   (match_operand:V2DF 2 "register_operand"  "v")))]
   "TARGET_VX"
   "#"
@@ -953,7 +953,7 @@
 ; Emulate with compare + select
 (define_insn_and_split "sminv2df3"
   [(set (match_operand:V2DF            0 "register_operand" "=v")
-	(smin:V2DF (match_operand:V2DF 1 "register_operand"  "v")
+	(smin:V2DF (match_operand:V2DF 1 "register_operand" "%v")
 		   (match_operand:V2DF 2 "register_operand"  "v")))]
   "TARGET_VX"
   "#"
diff --git a/gcc/config/s390/vx-builtins.md b/gcc/config/s390/vx-builtins.md
index 65e683c9..489bbee 100644
--- a/gcc/config/s390/vx-builtins.md
+++ b/gcc/config/s390/vx-builtins.md
@@ -575,7 +575,7 @@
 
 (define_insn "vec_addc<mode>"
   [(set (match_operand:VI_HW                0 "register_operand" "=v")
-	(unspec:VI_HW [(match_operand:VI_HW 1 "register_operand"  "v")
+	(unspec:VI_HW [(match_operand:VI_HW 1 "register_operand" "%v")
 		       (match_operand:VI_HW 2 "register_operand"  "v")]
 		      UNSPEC_VEC_ADDC))]
   "TARGET_VX"
@@ -584,7 +584,7 @@
 
 (define_insn "vec_addc_u128"
   [(set (match_operand:V16QI                0 "register_operand" "=v")
-	(unspec:V16QI [(match_operand:V16QI 1 "register_operand"  "v")
+	(unspec:V16QI [(match_operand:V16QI 1 "register_operand" "%v")
 		       (match_operand:V16QI 2 "register_operand"  "v")]
 		      UNSPEC_VEC_ADDC_U128))]
   "TARGET_VX"
@@ -596,7 +596,7 @@
 
 (define_insn "vec_adde_u128"
   [(set (match_operand:V16QI                0 "register_operand" "=v")
-	(unspec:V16QI [(match_operand:V16QI 1 "register_operand"  "v")
+	(unspec:V16QI [(match_operand:V16QI 1 "register_operand" "%v")
 		       (match_operand:V16QI 2 "register_operand"  "v")
 		       (match_operand:V16QI 3 "register_operand"  "v")]
 		      UNSPEC_VEC_ADDE_U128))]
@@ -609,7 +609,7 @@
 
 (define_insn "vec_addec_u128"
   [(set (match_operand:V16QI                0 "register_operand" "=v")
-	(unspec:V16QI [(match_operand:V16QI 1 "register_operand"  "v")
+	(unspec:V16QI [(match_operand:V16QI 1 "register_operand" "%v")
 		       (match_operand:V16QI 2 "register_operand"  "v")
 		       (match_operand:V16QI 3 "register_operand"  "v")]
 		      UNSPEC_VEC_ADDEC_U128))]
@@ -672,7 +672,7 @@
 
 (define_insn "vec_avg<mode>"
   [(set (match_operand:VI_HW                0 "register_operand" "=v")
-	(unspec:VI_HW [(match_operand:VI_HW 1 "register_operand"  "v")
+	(unspec:VI_HW [(match_operand:VI_HW 1 "register_operand" "%v")
 		       (match_operand:VI_HW 2 "register_operand"  "v")]
 		      UNSPEC_VEC_AVG))]
   "TARGET_VX"
@@ -683,7 +683,7 @@
 
 (define_insn "vec_avgu<mode>"
   [(set (match_operand:VI_HW                0 "register_operand" "=v")
-	(unspec:VI_HW [(match_operand:VI_HW 1 "register_operand"  "v")
+	(unspec:VI_HW [(match_operand:VI_HW 1 "register_operand" "%v")
 		       (match_operand:VI_HW 2 "register_operand"  "v")]
 		      UNSPEC_VEC_AVGU))]
   "TARGET_VX"
@@ -871,9 +871,9 @@
 ; vmalb, vmalh, vmalf, vmalg
 (define_insn "vec_vmal<mode>"
   [(set (match_operand:VI_HW_QHS 0 "register_operand" "=v")
-	(unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "v")
-			   (match_operand:VI_HW_QHS 2 "register_operand" "v")
-			   (match_operand:VI_HW_QHS 3 "register_operand" "v")]
+	(unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "%v")
+			   (match_operand:VI_HW_QHS 2 "register_operand"  "v")
+			   (match_operand:VI_HW_QHS 3 "register_operand"  "v")]
 			  UNSPEC_VEC_VMAL))]
   "TARGET_VX"
   "vmal<bhfgq><w>\t%v0,%v1,%v2,%v3"
@@ -884,9 +884,9 @@
 ; vmahb; vmahh, vmahf, vmahg
 (define_insn "vec_vmah<mode>"
   [(set (match_operand:VI_HW_QHS 0 "register_operand" "=v")
-	(unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "v")
-			   (match_operand:VI_HW_QHS 2 "register_operand" "v")
-			   (match_operand:VI_HW_QHS 3 "register_operand" "v")]
+	(unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "%v")
+			   (match_operand:VI_HW_QHS 2 "register_operand"  "v")
+			   (match_operand:VI_HW_QHS 3 "register_operand"  "v")]
 			  UNSPEC_VEC_VMAH))]
   "TARGET_VX"
   "vmah<bhfgq>\t%v0,%v1,%v2,%v3"
@@ -895,9 +895,9 @@
 ; vmalhb; vmalhh, vmalhf, vmalhg
 (define_insn "vec_vmalh<mode>"
   [(set (match_operand:VI_HW_QHS 0 "register_operand" "=v")
-	(unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "v")
-			   (match_operand:VI_HW_QHS 2 "register_operand" "v")
-			   (match_operand:VI_HW_QHS 3 "register_operand" "v")]
+	(unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "%v")
+			   (match_operand:VI_HW_QHS 2 "register_operand"  "v")
+			   (match_operand:VI_HW_QHS 3 "register_operand"  "v")]
 			  UNSPEC_VEC_VMALH))]
   "TARGET_VX"
   "vmalh<bhfgq>\t%v0,%v1,%v2,%v3"
@@ -908,8 +908,8 @@
 ; vmaeb; vmaeh, vmaef, vmaeg
 (define_insn "vec_vmae<mode>"
   [(set (match_operand:<vec_double> 0 "register_operand" "=v")
-	(unspec:<vec_double> [(match_operand:VI_HW_QHS 1 "register_operand" "v")
-			      (match_operand:VI_HW_QHS 2 "register_operand" "v")
+	(unspec:<vec_double> [(match_operand:VI_HW_QHS 1 "register_operand"   "%v")
+			      (match_operand:VI_HW_QHS 2 "register_operand"    "v")
 			      (match_operand:<vec_double> 3 "register_operand" "v")]
 			     UNSPEC_VEC_VMAE))]
   "TARGET_VX"
@@ -919,7 +919,7 @@
 ; vmaleb; vmaleh, vmalef, vmaleg
 (define_insn "vec_vmale<mode>"
   [(set (match_operand:<vec_double> 0 "register_operand" "=v")
-	(unspec:<vec_double> [(match_operand:VI_HW_QHS 1 "register_operand" "v")
+	(unspec:<vec_double> [(match_operand:VI_HW_QHS 1 "register_operand" "%v")
 			      (match_operand:VI_HW_QHS 2 "register_operand" "v")
 			      (match_operand:<vec_double> 3 "register_operand" "v")]
 			     UNSPEC_VEC_VMALE))]
@@ -932,7 +932,7 @@
 ; vmaob; vmaoh, vmaof, vmaog
 (define_insn "vec_vmao<mode>"
   [(set (match_operand:<vec_double> 0 "register_operand" "=v")
-	(unspec:<vec_double> [(match_operand:VI_HW_QHS 1 "register_operand" "v")
+	(unspec:<vec_double> [(match_operand:VI_HW_QHS 1 "register_operand" "%v")
 			      (match_operand:VI_HW_QHS 2 "register_operand" "v")
 			      (match_operand:<vec_double> 3 "register_operand" "v")]
 			     UNSPEC_VEC_VMAO))]
@@ -959,7 +959,7 @@
 ; vmhb, vmhh, vmhf
 (define_insn "vec_smulh<mode>"
   [(set (match_operand:VI_HW_QHS 0 "register_operand" "=v")
-	(unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "v")
+	(unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "%v")
 			   (match_operand:VI_HW_QHS 2 "register_operand" "v")]
 			  UNSPEC_VEC_SMULT_HI))]
   "TARGET_VX"
@@ -969,7 +969,7 @@
 ; vmlhb, vmlhh, vmlhf
 (define_insn "vec_umulh<mode>"
   [(set (match_operand:VI_HW_QHS 0 "register_operand" "=v")
-	(unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "v")
+	(unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "%v")
 			   (match_operand:VI_HW_QHS 2 "register_operand" "v")]
 			  UNSPEC_VEC_UMULT_HI))]
   "TARGET_VX"
@@ -987,7 +987,7 @@
 
 (define_insn "vec_nor<mode>3"
   [(set (match_operand:VT_HW 0 "register_operand" "=v")
-	(not:VT_HW (ior:VT_HW (match_operand:VT_HW 1 "register_operand" "v")
+	(not:VT_HW (ior:VT_HW (match_operand:VT_HW 1 "register_operand" "%v")
 			      (match_operand:VT_HW 2 "register_operand" "v"))))]
   "TARGET_VX"
   "vno\t%v0,%v1,%v2"
-- 
1.9.1

* [PATCH 2/9] S/390: z13 lcbb fix address operand.
  2016-02-17 18:51 [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes Andreas Krebbel
  2016-02-17 18:51 ` [PATCH 3/9] S/390: z13 inline stpcpy implementation Andreas Krebbel
  2016-02-17 18:51 ` [PATCH 6/9] S/390: Add vec_sub_u128 to vecintrin.h Andreas Krebbel
@ 2016-02-17 18:51 ` Andreas Krebbel
  2016-02-17 18:51 ` [PATCH 1/9] S/390: Add IBM z13 pipeline description Andreas Krebbel
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Andreas Krebbel @ 2016-02-17 18:51 UTC (permalink / raw)
  To: gcc-patches

gcc/ChangeLog:

2016-02-17  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/s390.md: Add missing output modifier for operand 1
	so that it is printed as an address.
---
 gcc/config/s390/s390.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 9d76e61..55ae705 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -10913,11 +10913,11 @@
 
 (define_insn "lcbb"
   [(set (match_operand:SI             0 "register_operand"  "=d")
-	(unspec:SI [(match_operand:SI 1 "address_operand" "ZQZR")
+	(unspec:SI [(match_operand    1 "address_operand" "ZQZR")
 		    (match_operand:SI 2 "immediate_operand"  "C")] UNSPEC_LCBB))
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_Z13"
-  "lcbb\t%0,%1,%b2"
+  "lcbb\t%0,%a1,%b2"
   [(set_attr "op_type" "VRX")])
 
 ; Handle -fsplit-stack.
-- 
1.9.1

* [PATCH 8/9] S/390: Add single element vector types to iterators.
  2016-02-17 18:51 [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes Andreas Krebbel
                   ` (3 preceding siblings ...)
  2016-02-17 18:51 ` [PATCH 1/9] S/390: Add IBM z13 pipeline description Andreas Krebbel
@ 2016-02-17 18:51 ` Andreas Krebbel
  2016-02-17 18:51 ` [PATCH 5/9] S/390: z13 fix mode in vcond expansion Andreas Krebbel
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Andreas Krebbel @ 2016-02-17 18:51 UTC (permalink / raw)
  To: gcc-patches

gcc/ChangeLog:

2016-02-17  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/vector.md (VI, VI_QHS): Add single element vector
	types to mode iterators.
	(vec_double): ... and mode attribute.
	* config/s390/vx-builtins.md (non_vec_int): Likewise.
---
 gcc/config/s390/vector.md      | 14 +++++++-------
 gcc/config/s390/vx-builtins.md | 12 ++++++------
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index cdb9ba6..3101057 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -43,8 +43,8 @@
 
 ; All integer vector modes supported in a vector register + TImode
 (define_mode_iterator VIT [V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI V1DI V2DI V1TI TI])
-(define_mode_iterator VI  [V2QI V4QI V8QI V16QI V2HI V4HI V8HI V2SI V4SI V2DI])
-(define_mode_iterator VI_QHS [V4QI V8QI V16QI V4HI V8HI V4SI])
+(define_mode_iterator VI  [V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI V1DI V2DI])
+(define_mode_iterator VI_QHS [V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI])
 
 (define_mode_iterator V_8   [V1QI])
 (define_mode_iterator V_16  [V2QI  V1HI])
@@ -100,11 +100,11 @@
 			    (V1TF "V1TI")])
 
 ; Vector with doubled element size.
-(define_mode_attr vec_double [(V2QI "V1HI") (V4QI "V2HI") (V8QI "V4HI") (V16QI "V8HI")
-			      (V2HI "V1SI") (V4HI "V2SI") (V8HI "V4SI")
-			      (V2SI "V1DI") (V4SI "V2DI")
-			      (V2DI "V1TI")
-			      (V2SF "V1DF") (V4SF "V2DF")])
+(define_mode_attr vec_double [(V1QI "V1HI") (V2QI "V1HI") (V4QI "V2HI") (V8QI "V4HI") (V16QI "V8HI")
+			      (V1HI "V1SI") (V2HI "V1SI") (V4HI "V2SI") (V8HI "V4SI")
+			      (V1SI "V1DI") (V2SI "V1DI") (V4SI "V2DI")
+			      (V1DI "V1TI") (V2DI "V1TI")
+			      (V1SF "V1DF") (V2SF "V1DF") (V4SF "V2DF")])
 
 ; Vector with half the element size.
 (define_mode_attr vec_half [(V1HI "V2QI") (V2HI "V4QI") (V4HI "V8QI") (V8HI "V16QI")
diff --git a/gcc/config/s390/vx-builtins.md b/gcc/config/s390/vx-builtins.md
index 81a2d07..65e683c9 100644
--- a/gcc/config/s390/vx-builtins.md
+++ b/gcc/config/s390/vx-builtins.md
@@ -28,12 +28,12 @@
 
 ; The element type of the vector with floating point modes translated
 ; to int modes of the same size.
-(define_mode_attr non_vec_int[(V2QI "QI") (V4QI "QI") (V8QI "QI") (V16QI "QI")
-			      (V2HI "HI") (V4HI "HI") (V8HI "HI")
-			      (V2SI "SI") (V4SI "SI")
-			      (V2DI "DI")
-			      (V2SF "SI") (V4SF "SI")
-			      (V2DF "DI")])
+(define_mode_attr non_vec_int[(V1QI "QI") (V2QI "QI") (V4QI "QI") (V8QI "QI") (V16QI "QI")
+			      (V1HI "HI") (V2HI "HI") (V4HI "HI") (V8HI "HI")
+			      (V1SI "SI") (V2SI "SI") (V4SI "SI")
+			      (V1DI "DI") (V2DI "DI")
+			      (V1SF "SI") (V2SF "SI") (V4SF "SI")
+			      (V1DF "DI") (V2DF "DI")])
 
 ; Condition code modes generated by int comparisons
 (define_mode_iterator VICMP [CCVEQ CCVH CCVHU])
-- 
1.9.1

* [PATCH 1/9] S/390: Add IBM z13 pipeline description
  2016-02-17 18:51 [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes Andreas Krebbel
                   ` (2 preceding siblings ...)
  2016-02-17 18:51 ` [PATCH 2/9] S/390: z13 lcbb fix address operand Andreas Krebbel
@ 2016-02-17 18:51 ` Andreas Krebbel
  2016-02-17 18:51 ` [PATCH 8/9] S/390: Add single element vector types to iterators Andreas Krebbel
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Andreas Krebbel @ 2016-02-17 18:51 UTC (permalink / raw)
  To: gcc-patches

This patch adds proper support for the -mtune=z13 option by adding a
z13 pipeline description.  As with zEC12, we mostly rely on the sched
reorder hooks to implement a grouping strategy.  However, this time we
also keep an eye on the instruction mix within the out-of-order window
so that the hardware can exploit its different units.
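
To illustrate the instruction mix part, here is a minimal stand-alone
sketch of how such a score could be computed.  It is not the code from
the patch: only the three macro names and their values are taken from
s390.c, while mix_score and its parameters are hypothetical names used
for illustration.  The clamp to MAX_SCHED_MIX_DISTANCE is an
assumption of this sketch.

```c
#include <assert.h>

#define MAX_SCHED_UNITS 3
#define MAX_SCHED_MIX_SCORE 8
#define MAX_SCHED_MIX_DISTANCE 100

/* Hypothetical model of the mix score.  UNIT_MASK has bit I set if the
   insn uses unit I.  DISTANCE[I] is the number of insns issued since
   unit I was last used.  */
static int
mix_score (unsigned int unit_mask, const int *distance, int units)
{
  int score = 0;
  unsigned int m = 1;
  int i;

  assert (units <= MAX_SCHED_UNITS);
  for (i = 0; i < units; i++, m <<= 1)
    if (m & unit_mask)
      {
	int d = distance[i];

	/* Assumption of this sketch: saturate the distance so a single
	   unit never contributes more than MAX_SCHED_MIX_SCORE.  */
	if (d > MAX_SCHED_MIX_DISTANCE)
	  d = MAX_SCHED_MIX_DISTANCE;
	score += d * MAX_SCHED_MIX_SCORE / MAX_SCHED_MIX_DISTANCE;
      }
  return score;
}
```

With these values an insn that only uses the unit at index 1, last
used 50 insns ago, gets 50 * 8 / 100 = 4 added to its score.  The
longer a unit has been idle, the more an insn for that unit is
preferred.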

gcc/ChangeLog:

2016-02-17  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* config/s390/2827.md: Rename ooo_* insn attributes to zEC12_*.
	* config/s390/2964.md: New file.
	* config/s390/s390.c (s390_get_sched_attrmask): Use the right set
	of insn grouping attributes depending on the CPU level.
	(s390_get_unit_mask): New function.
	(s390_sched_score): Remove the OOO from the scheduling macros.
	Add loop to calculate a score for the instruction mix.
	(s390_sched_reorder): Likewise plus improve debug output.
	(s390_sched_variable_issue): Rename macros as above.  Calculate
	the unit distances after actually scheduling an insn.  Improve
	debug output.
	(s390_sched_init): Clear last_scheduled_unit_distance array.
	* config/s390/s390.md: Include 2964.md.
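
The unit distance bookkeeping mentioned for s390_sched_variable_issue
could look roughly like the sketch below.  This is an assumption about
the general technique, not a copy of the patch: note_issued and
unit_distance are hypothetical names, and only MAX_SCHED_UNITS and
MAX_SCHED_MIX_DISTANCE come from s390.c.

```c
#include <assert.h>

#define MAX_SCHED_UNITS 3
#define MAX_SCHED_MIX_DISTANCE 100

/* Hypothetical per-unit counters: insns issued since unit I was last
   used.  */
static int unit_distance[MAX_SCHED_UNITS];

/* Record that an insn using the units in UNIT_MASK has been issued.
   Units it uses are reset to zero, all others age by one step up to
   MAX_SCHED_MIX_DISTANCE.  */
static void
note_issued (unsigned int unit_mask, int units)
{
  unsigned int m = 1;
  int i;

  assert (units <= MAX_SCHED_UNITS);
  for (i = 0; i < units; i++, m <<= 1)
    if (m & unit_mask)
      unit_distance[i] = 0;
    else if (unit_distance[i] < MAX_SCHED_MIX_DISTANCE)
      unit_distance[i]++;
}
```

Updating the counters only after an insn has actually been issued
keeps them in sync with the real schedule rather than with candidates
that were merely scored.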
---
 gcc/config/s390/2827.md |   9 +-
 gcc/config/s390/2964.md | 232 +++++++++++++++++++++++++++++++++++++++++++
 gcc/config/s390/s390.c  | 259 ++++++++++++++++++++++++++++++++++++------------
 gcc/config/s390/s390.md |   3 +
 4 files changed, 435 insertions(+), 68 deletions(-)
 create mode 100644 gcc/config/s390/2964.md

diff --git a/gcc/config/s390/2827.md b/gcc/config/s390/2827.md
index 7baf990..21a5ee9 100644
--- a/gcc/config/s390/2827.md
+++ b/gcc/config/s390/2827.md
@@ -18,20 +18,19 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; <http://www.gnu.org/licenses/>.
 
-
-(define_attr "ooo_cracked" ""
+(define_attr "zEC12_cracked" ""
   (cond [(eq_attr "mnemonic" "cgdbr,clfxtr,cdgtr,celfbr,cxgtr,clfebr,clc,lngfr,cs,cfxbr,xc,clfdbr,basr,ex,cxlgtr,clfdtr,srdl,lpgfr,cdlgbr,cgxtr,cxlftr,nc,cxftr,cdfbr,clfxbr,cdftr,clgxbr,cgdtr,cxlgbr,mvc,clgdtr,cegbr,cfebr,cdlftr,sldl,cdlgtr,csg,chhsi,clgebr,cxgbr,cxfbr,cdlfbr,cgebr,lzxr,oc,cdgbr,brasl,cgxbr,cxlfbr,clgxtr,exrl,cfdbr,celgbr,clgdbr,lxr,cpsdr,lcgfr,bras,srda,cefbr") (const_int 1)]
         (const_int 0)))
 
-(define_attr "ooo_expanded" ""
+(define_attr "zEC12_expanded" ""
   (cond [(eq_attr "mnemonic" "dlr,dsgr,d,dsgf,stam,dsgfr,dlgr,dsg,cds,dr,stm,mvc,dl,cdsg,stmy,dlg,stmg,lam") (const_int 1)]
         (const_int 0)))
 
-(define_attr "ooo_endgroup" ""
+(define_attr "zEC12_endgroup" ""
   (cond [(eq_attr "mnemonic" "ipm") (const_int 1)]
         (const_int 0)))
 
-(define_attr "ooo_groupalone" ""
+(define_attr "zEC12_groupalone" ""
   (cond [(eq_attr "mnemonic" "lnxbr,madb,ltxtr,clc,axtr,msebr,slbgr,xc,alcr,lpxbr,slbr,maebr,mlg,mfy,lxdtr,maeb,lxeb,nc,mxtr,sxtr,dxbr,alc,msdbr,ltxbr,lxdb,madbr,lxdbr,lxebr,mvc,m,mseb,mlr,mlgr,slb,tcxb,msdb,sqxbr,alcgr,oc,flogr,alcg,mxbr,dxtr,axbr,mr,sxbr,slbg,ml,lcxbr,bcr_flush") (const_int 1)]
         (const_int 0)))
 
diff --git a/gcc/config/s390/2964.md b/gcc/config/s390/2964.md
new file mode 100644
index 0000000..d2211e1
--- /dev/null
+++ b/gcc/config/s390/2964.md
@@ -0,0 +1,232 @@
+;; Scheduling description for z13.
+;;   Copyright (C) 2016 Free Software Foundation, Inc.
+;;   Contributed by Andreas Krebbel (Andreas.Krebbel@de.ibm.com)
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify it under
+;; the terms of the GNU General Public License as published by the Free
+;; Software Foundation; either version 3, or (at your option) any later
+;; version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
+;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+;; for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+
+; generator options: vector_ecycs=12 cracked_ecycs=6 scale_ecycs=5
+
+(define_attr "z13_cracked" ""
+  (cond [(eq_attr "mnemonic" "celgbr,vscef,vsceg,exrl,clfebr,cefbr,chhsi,\
+vgef,vgeg,cdlftr,lcgfr,cfdbr,cgdbr,lzxr,cfxbr,rnsbg,cgdtr,cegbr,rxsbg,ex,\
+cgxtr,clfxtr,cdlgtr,brasl,efpc,cfebr,tbeginc,celfbr,clgxbr,vsteb,vsteh,\
+clfdtr,cdfbr,lngfr,clgebr,stpq,cs,lpgfr,cdlgbr,lpq,cdgtr,d,cgxbr,cdftr,\
+rosbg,clgdbr,cdgbr,bras,tbegin,clfdbr,cdlfbr,cgebr,clfxbr,lxr,csy,csg,clgdtr,\
+clgxtr") (const_int 1)]
+        (const_int 0)))
+
+(define_attr "z13_expanded" ""
+  (cond [(eq_attr "mnemonic" "cxlftr,cdsg,cdsy,stam,lam,dsgf,lmg,cxlgtr,\
+dl,cxftr,sldl,dsg,cxlfbr,cxgtr,stmg,stmy,stm,lm,cds,lmy,cxfbr,cxlgbr,srda,\
+srdl,cxgbr,dlg") (const_int 1)]
+        (const_int 0)))
+
+(define_attr "z13_groupalone" ""
+  (cond [(eq_attr "mnemonic" "mvc,dxbr,lxebr,axtr,cxtr,alcr,lxdb,lxeb,mxtr,\
+mfy,cxbr,dsgr,lcxbr,slb,mr,dr,alc,slbr,maebr,mlgr,dsgfr,sxtr,tdcxt,tabort,\
+msebr,lxdtr,ltxtr,slbg,ml,mxbr,maeb,oc,dxtr,msdb,sqxbr,mseb,xc,m,clc,mlg,\
+mlr,fixbra,alcgr,nc,sfpc,dlgr,fixbr,slbgr,fixtr,lpxbr,axbr,lxdbr,ltxbr,\
+tcxb,dlr,lnxbr,sxbr,flogr,alcg,tend,madb,bcr_flush") (const_int 1)]
+        (const_int 0)))
+
+(define_attr "z13_endgroup" ""
+  (cond [(eq_attr "mnemonic" "ipm") (const_int 1)]
+        (const_int 0)))
+
+(define_attr "z13_unit_lsu" ""
+  (cond [(eq_attr "mnemonic" "vlbb,mvc,llgc,llc,llhrl,vl,llghrl,vlrepf,\
+vlrepg,vlreph,lde,ldy,tabort,l,llh,ld,lg,ly,vlrepb,vllezb,vllezf,vllezg,\
+vllezh,oc,xc,clc,lrl,ear,nc,lgrl,sfpc,llgf,llgfrl,llgh,llgt,lcbb,vll,sar") (const_int 1)]
+        (const_int 0)))
+
+(define_attr "z13_unit_fxu" ""
+  (cond [(eq_attr "mnemonic" "s,lcgr,x,nop,oiy,ppa,ng,msy,sgrk,vstl,aghik,\
+msgf,ipm,mvi,stocg,rll,srlg,cghsi,clgit,srlk,alrk,sg,sh,sl,st,sy,vst,ark,\
+xgr,agsi,tm,nrk,shy,llhr,agf,alcr,slgfr,sr,clgrt,laa,lder,sgf,lan,llilf,\
+llilh,ag,llill,lay,al,n,laxg,ar,ahi,sgr,ntstg,ay,stcy,nopr,mfy,ngrk,lbr,\
+br,dsgr,stdy,ork,ldgr,lcr,cg,ch,lgfrl,cl,stoc,cr,agfr,stgrl,cy,alfi,xg,\
+cgfi,xi,clfhsi,cgfr,xr,slb,mghi,clfi,slg,clhhsi,agfi,clfit,sly,mr,ldr,nihf,\
+nihh,algfi,dr,nihl,algf,algfr,algr,clgf,clgr,clgt,aghi,alc,alg,locg,alr,\
+locr,cghi,aly,alghsik,slbr,clgfrl,mhy,cit,nr,ny,xiy,mlgr,sthy,cly,dsgfr,\
+rllg,cgit,lgb,lgf,clgrl,lgh,lrvgr,cliy,cgrl,lgr,slrk,clrt,icy,laog,og,agr,\
+mvhi,lhrl,or,lhr,vlvgp,lhy,nilf,oy,nilh,nill,lcdfr,mviy,tmhh,tmhl,sthrl,\
+ltgf,ltgr,srk,clghrl,ahy,vstef,vsteg,ah,vlgvb,llgcr,tmh,tml,clmy,slr,cfi,\
+stc,std,ste,stg,sth,locgr,slbg,sty,tmlh,la,lb,mvghi,lh,risbgn,lrvg,lr,asi,\
+lt,ahik,lrvr,cgf,cgh,cgr,clhrl,lzdr,tmll,mh,ml,vlvgb,ms,lrv,vlvgf,xgrk,\
+vlvgg,llgfr,vlvgh,slfi,chi,chy,mhi,lzer,alhsik,ni,ltgfr,loc,icm,oi,cgfrl,\
+agrk,lgat,oilh,llghr,lghrl,oill,xihf,lpgr,cgrt,clrl,sgfr,lpr,lgbr,strl,\
+algrk,alsi,srak,slgf,a,b,c,slgr,m,o,algsi,icmh,srag,iilf,ogrk,clg,icmy,\
+cli,clm,clr,clt,slgrk,mlg,lao,mlr,risbg,mvhhi,lat,etnd,lax,iihf,sra,alcgr,\
+msgr,clghsi,stey,ngr,xilf,laag,oihf,oihh,oihl,ltg,ltr,niy,lgfi,dlgr,lgfr,\
+slgfi,llcr,slbgr,chrl,lgdr,pfpo,lang,basr,sllg,sllk,lghi,lghr,vlgvf,vlgvg,\
+vlgvh,vlr,chsi,lngr,cghrl,srl,lhi,oilf,crl,crt,afi,xrk,llgtr,llihf,llihh,\
+llihl,dlr,msgfi,msgfr,msg,flogr,xy,msr,clgfi,clgfr,ogr,popcnt,alcg,lndfr,\
+larl,sll,tmy,msfi,ic,lpdfr,tend,lnr") (const_int 1)]
+        (const_int 0)))
+
+(define_attr "z13_unit_vfu" ""
+  (cond [(eq_attr "mnemonic" "seb,vcksm,vfadb,vleib,vchgs,vleif,vleig,vleih,\
+vgbm,verimb,vone,verimf,verimg,verimh,dxbr,verllvb,lpebr,verllvf,verllvg,\
+verllvh,vfeneb,wcdgb,vfenef,vfeneh,vchhs,vctzb,vctzf,vctzg,vctzh,vlcb,aeb,\
+vlcf,vlcg,vlch,vfmsdb,vgfmab,ltebr,vgfmaf,vgfmag,vgfmah,vmaeh,vsb,vsf,vsg,\
+vsh,vsl,vsq,lxebr,cdtr,fiebr,vupllb,vupllf,vupllh,vmrhb,madbr,vtm,vmrhf,\
+vmrhg,vmrhh,axtr,fiebra,vleb,cxtr,vlef,vleg,vleh,vpkf,vpkg,vpkh,vmlob,vmlof,\
+vmloh,lxdb,ldeb,mdtr,vceqfs,adb,wflndb,lxeb,vn,vo,vchlb,vx,mxtr,vchlf,vchlg,\
+vchlh,vfcedbs,vfcedb,vceqgs,cxbr,msdbr,vcdgb,debr,vceqhs,meeb,lcxbr,vavglb,\
+vavglf,vavglg,vavglh,wfcedbs,vmrlb,vmrlf,vmrlg,vmrlh,wfchedbs,vmxb,tcdb,\
+vmahh,vsrlb,wcgdb,lcdbr,vistrbs,vrepb,wfmdb,vrepf,vrepg,vreph,ler,wcdlgb,\
+ley,vistrb,vistrf,vistrh,tceb,wfsqdb,sqeb,vsumqf,vsumqg,vesrlb,vfeezbs,\
+maebr,vesrlf,vesrlg,vesrlh,vmeb,vmef,vmeh,meebr,vflcdb,wfmadb,vperm,sxtr,\
+vclzf,vgm,vgmb,vgmf,vgmg,vgmh,tdcxt,vzero,msebr,veslb,veslf,veslg,vfenezb,\
+vfenezf,vfenezh,vistrfs,vchf,vchg,vchh,vmhb,vmhf,vmhh,cdb,veslvb,ledbr,\
+veslvf,veslvg,veslvh,wclgdb,vfmdb,vmnlb,vmnlf,vmnlg,vmnlh,vclzb,vfeezfs,\
+vclzg,vclzh,mdb,vmxlb,vmxlf,vmxlg,vmxlh,ltdtr,vsbcbiq,ceb,wfddb,sebr,vistrhs,\
+lxdtr,lcebr,vab,vaf,vag,vah,ltxtr,vlpf,vlpg,vsegb,vaq,vsegf,vsegh,wfchdbs,\
+sdtr,cdbr,vfeezhs,le,wldeb,vfmadb,vchlbs,vacccq,vmaleb,vsel,vmalef,vmaleh,\
+vflndb,mdbr,vmlb,wflpdb,ldetr,vpksfs,vpksf,vpksg,vpksh,sqdb,mxbr,sqdbr,\
+vmaeb,veslh,vmaef,vpklsf,vpklsg,vpklsh,verllb,vchb,ddtr,verllf,verllg,verllh,\
+wfsdb,maeb,vclgdb,vftcidb,vpksgs,vmxf,vmxg,vmxh,fidbra,vmnb,vmnf,vmng,vfchedbs,\
+lnebr,vfidb,dxtr,ddb,msdb,vmalhb,vfddb,vmalhf,vmalhh,vpkshs,vfsdb,sqxbr,\
+vmalhw,ltdbr,vmob,vmof,vmoh,deb,vchlfs,mseb,vcdlgb,vlpb,wfmsdb,vlph,vmahb,\
+vldeb,vmahf,vgfmb,fidbr,vfsqdb,aebr,wledb,vchlgs,vesravb,vfchdbs,cebr,vesravf,\
+vesravg,vesravh,vcgdb,fixbra,vrepib,vrepif,vrepig,vrepih,tdcdt,vchlhs,vceqb,\
+vscbib,vceqf,vceqg,vscbif,vscbig,vscbih,vmlhw,vscbiq,vuphb,vuphf,vuphh,\
+vfchedb,tdcet,vslb,vpklsfs,adbr,sqebr,vfchdb,fixbr,vpklsgs,vsldb,vmleb,\
+vmlef,vmleh,cpsdr,vmalb,vmalf,vavgb,vmlf,vavgf,vavgg,vavgh,vgfmf,vgfmg,\
+vgfmh,fidtr,vpklshs,lndbr,vno,lpdbr,vacq,vledb,vchbs,vfeeb,vfeef,vfeeh,\
+fixtr,vaccb,wfadb,vaccf,vaccg,vacch,vnot,vmalob,vaccq,vmalof,vmaloh,lpxbr,\
+ledtr,vuplb,vuplf,axbr,lxdbr,ltxbr,vpopct,vpdi,vmlhb,vmlhf,vmlhh,sdbr,vnc,\
+vsumb,vsrab,vsumh,vmaob,vmaof,vmaoh,vesrlvb,vesrlvf,vesrlvg,vesrlvh,tcxb,\
+vceqbs,vceqh,lnxbr,sxbr,vesrab,wflcdb,vesraf,vesrag,vesrah,vflpdb,vmnh,\
+vsbiq,adtr,vsra,vsrl,vuplhb,sdb,vuplhf,vuplhh,vsumgf,vsumgh,ldebr,vuplhw,\
+vchfs,madb,ddbr") (const_int 1)]
+        (const_int 0)))
+
+(define_insn_reservation "z13_0" 0
+  (and (eq_attr "cpu" "z13")
+       (eq_attr "mnemonic" "s,lcgr,x,nop,oiy,vlbb,ppa,ng,sgrk,vstl,aghik,\
+mvc,ipm,llgc,mvi,stocg,rll,jg,srlg,cghsi,clgit,srlk,alrk,sg,sh,sl,st,sy,\
+vst,ark,xgr,agsi,tm,nrk,shy,llhr,agf,alcr,slgfr,sr,clgrt,llc,laa,lder,sgf,\
+lan,llhrl,llilf,llilh,ag,llill,lay,al,n,laxg,ar,ahi,sgr,ntstg,ay,stcy,vl,\
+nopr,ngrk,lbr,br,stdy,ork,ldgr,lcr,cg,ch,llghrl,lgfrl,cl,stoc,cr,agfr,stgrl,\
+cy,alfi,xg,cgfi,xi,vlrepf,vlrepg,vlreph,clfhsi,cgfr,xr,slb,mghi,clfi,slg,\
+lde,clhhsi,agfi,clfit,sly,ldr,ldy,nihf,nihh,algfi,nihl,algf,algfr,algr,\
+clgf,clgr,clgt,aghi,alc,alg,locg,alr,locr,cghi,aly,alghsik,slbr,clgfrl,\
+mhy,cit,nr,ny,xiy,sthy,cly,rllg,cgit,lgb,lgf,clgrl,lgh,lrvgr,cliy,cgrl,\
+lgr,slrk,clrt,icy,laog,og,agr,mvhi,lhrl,or,lhr,vlvgp,lhy,nilf,oy,nilh,tabort,\
+nill,lcdfr,mviy,tmhh,tmhl,sthrl,ltgf,ltgr,srk,clghrl,ahy,vstef,vsteg,ah,\
+vlgvb,llgcr,tmh,tml,clmy,slr,cfi,stc,std,ste,stg,sth,l,locgr,llh,slbg,sty,\
+tmlh,la,lb,ld,mvghi,lg,lh,risbgn,lrvg,lr,asi,lt,ahik,ly,lrvr,vlrepb,vllezb,\
+cgf,cgh,vllezf,vllezg,vllezh,cgr,clhrl,lzdr,tmll,mh,vlvgb,lrv,vlvgf,xgrk,\
+vlvgg,llgfr,vlvgh,slfi,chi,chy,mhi,lzer,alhsik,ni,ltgfr,loc,icm,oc,oi,cgfrl,\
+agrk,lgat,oilh,llghr,lghrl,oill,xihf,lpgr,cgrt,clrl,sgfr,lpr,lgbr,strl,\
+algrk,alsi,srak,brcl,slgf,xc,a,b,c,slgr,j,o,algsi,icmh,srag,iilf,ogrk,clc,\
+clg,icmy,cli,clm,clr,clt,slgrk,lrl,lao,risbg,mvhhi,lat,etnd,lax,iihf,sra,\
+alcgr,clghsi,ear,nc,lgrl,stey,ngr,xilf,laag,oihf,oihh,oihl,ltg,ltr,niy,\
+lgfi,sfpc,lgfr,slgfi,llcr,llgf,llgfrl,llgh,slbgr,llgt,chrl,lgdr,pfpo,lang,\
+basr,lcbb,sllg,sllk,lghi,vll,lghr,vlgvf,vlgvg,vlgvh,vlr,chsi,lngr,cghrl,\
+srl,sar,lhi,oilf,crl,crt,afi,xrk,llgtr,llihf,llihh,llihl,xy,clgfi,clgfr,\
+ogr,popcnt,alcg,lndfr,larl,sll,tmy,ic,lpdfr,tend,lnr,bcr_flush")) "nothing")
+
+(define_insn_reservation "z13_1" 1
+  (and (eq_attr "cpu" "z13")
+       (eq_attr "mnemonic" "celgbr,vscef,vsceg,msy,msgf,cxlftr,cdsg,cdsy,\
+exrl,clfebr,cefbr,chhsi,stam,vgef,vgeg,cdlftr,lam,mfy,lcgfr,cfdbr,dsgf,\
+cgdbr,lzxr,lmg,cfxbr,rnsbg,cxlgtr,mr,dl,cxftr,sldl,cgdtr,cegbr,rxsbg,ex,\
+cgxtr,clfxtr,mlgr,cdlgtr,brasl,dsg,efpc,cfebr,tbeginc,celfbr,clgxbr,vsteb,\
+vsteh,cxlfbr,clfdtr,cxgtr,stmg,stmy,stm,lm,cds,cdfbr,ml,ms,lngfr,clgebr,\
+stpq,lmy,cs,lpgfr,cdlgbr,lpq,cxfbr,cxlgbr,cdgtr,d,m,mlg,mlr,cgxbr,cdftr,\
+msgr,rosbg,clgdbr,cdgbr,srda,bras,srdl,tbegin,clfdbr,cdlfbr,cxgbr,cgebr,\
+dlg,clfxbr,lxr,csy,msgfi,msgfr,msg,flogr,msr,csg,msfi,clgdtr,clgxtr")) "nothing")
+
+(define_insn_reservation "z13_2" 2
+  (and (eq_attr "cpu" "z13")
+       (eq_attr "mnemonic" "seb,vcksm,vfadb,vleib,vchgs,vleif,vleig,vleih,\
+vgbm,verimb,vone,verimf,verimg,verimh,verllvb,lpebr,verllvf,verllvg,verllvh,\
+vfeneb,wcdgb,vfenef,vfeneh,vchhs,vctzb,vctzf,vctzg,vctzh,vlcb,aeb,vlcf,\
+vlcg,vlch,vfmsdb,vgfmab,ltebr,vgfmaf,vgfmag,vgfmah,vmaeh,vsb,vsf,vsg,vsh,\
+vsl,vsq,lxebr,cdtr,fiebr,vupllb,vupllf,vupllh,vmrhb,madbr,vtm,vmrhf,vmrhg,\
+vmrhh,axtr,fiebra,vleb,cxtr,vlef,vleg,vleh,vpkf,vpkg,vpkh,vmlob,vmlof,vmloh,\
+lxdb,ldeb,vceqfs,adb,wflndb,lxeb,vn,vo,vchlb,vx,vchlf,vchlg,vchlh,vfcedbs,\
+vfcedb,vceqgs,cxbr,msdbr,vcdgb,vceqhs,meeb,lcxbr,vavglb,vavglf,vavglg,vavglh,\
+wfcedbs,vmrlb,vmrlf,vmrlg,vmrlh,wfchedbs,vmxb,tcdb,vmahh,vsrlb,wcgdb,lcdbr,\
+vistrbs,vrepb,wfmdb,vrepf,vrepg,vreph,ler,wcdlgb,ley,vistrb,vistrf,vistrh,\
+tceb,vsumqf,vsumqg,vesrlb,vfeezbs,maebr,vesrlf,vesrlg,vesrlh,vmeb,vmef,\
+vmeh,meebr,vflcdb,wfmadb,vperm,sxtr,vclzf,vgm,vgmb,vgmf,vgmg,vgmh,tdcxt,\
+vzero,msebr,veslb,veslf,veslg,vfenezb,vfenezf,vfenezh,vistrfs,vchf,vchg,\
+vchh,vmhb,vmhf,vmhh,cdb,veslvb,ledbr,veslvf,veslvg,veslvh,wclgdb,vfmdb,\
+vmnlb,vmnlf,vmnlg,vmnlh,vclzb,vfeezfs,vclzg,vclzh,mdb,vmxlb,vmxlf,vmxlg,\
+vmxlh,ltdtr,vsbcbiq,ceb,sebr,vistrhs,lxdtr,lcebr,vab,vaf,vag,vah,ltxtr,\
+vlpf,vlpg,vsegb,vaq,vsegf,vsegh,wfchdbs,sdtr,cdbr,vfeezhs,le,wldeb,vfmadb,\
+vchlbs,vacccq,vmaleb,vsel,vmalef,vmaleh,vflndb,mdbr,vmlb,wflpdb,ldetr,vpksfs,\
+vpksf,vpksg,vpksh,vmaeb,veslh,vmaef,vpklsf,vpklsg,vpklsh,verllb,vchb,verllf,\
+verllg,verllh,wfsdb,maeb,vclgdb,vftcidb,vpksgs,vmxf,vmxg,vmxh,fidbra,vmnb,\
+vmnf,vmng,vfchedbs,lnebr,vfidb,msdb,vmalhb,vmalhf,vmalhh,vpkshs,vfsdb,vmalhw,\
+ltdbr,vmob,vmof,vmoh,vchlfs,mseb,vcdlgb,vlpb,wfmsdb,vlph,vmahb,vldeb,vmahf,\
+vgfmb,fidbr,aebr,wledb,vchlgs,vesravb,vfchdbs,cebr,vesravf,vesravg,vesravh,\
+vcgdb,fixbra,vrepib,vrepif,vrepig,vrepih,tdcdt,vchlhs,vceqb,vscbib,vceqf,\
+vceqg,vscbif,vscbig,vscbih,vmlhw,vscbiq,vuphb,vuphf,vuphh,vfchedb,tdcet,\
+vslb,vpklsfs,adbr,vfchdb,fixbr,vpklsgs,vsldb,vmleb,vmlef,vmleh,cpsdr,vmalb,\
+vmalf,vavgb,vmlf,vavgf,vavgg,vavgh,vgfmf,vgfmg,vgfmh,fidtr,vpklshs,lndbr,\
+vno,lpdbr,vacq,vledb,vchbs,vfeeb,vfeef,vfeeh,fixtr,vaccb,wfadb,vaccf,vaccg,\
+vacch,vnot,vmalob,vaccq,vmalof,vmaloh,lpxbr,vuplb,vuplf,axbr,lxdbr,ltxbr,\
+vpopct,vpdi,vmlhb,vmlhf,vmlhh,sdbr,vnc,vsumb,vsrab,vsumh,vmaob,vmaof,vmaoh,\
+vesrlvb,vesrlvf,vesrlvg,vesrlvh,tcxb,vceqbs,vceqh,lnxbr,sxbr,vesrab,wflcdb,\
+vesraf,vesrag,vesrah,vflpdb,vmnh,vsbiq,adtr,vsra,vsrl,vuplhb,sdb,vuplhf,\
+vuplhh,vsumgf,vsumgh,ldebr,vuplhw,vchfs,madb")) "nothing")
+
+(define_insn_reservation "z13_3" 3
+  (and (eq_attr "cpu" "z13")
+       (eq_attr "mnemonic" "ledtr")) "nothing")
+
+(define_insn_reservation "z13_4" 4
+  (and (eq_attr "cpu" "z13")
+       (eq_attr "mnemonic" "dr,mxbr,dlr")) "nothing")
+
+(define_insn_reservation "z13_6" 6
+  (and (eq_attr "cpu" "z13")
+       (eq_attr "mnemonic" "debr,sqeb,deb,sqebr")) "nothing")
+
+(define_insn_reservation "z13_7" 7
+  (and (eq_attr "cpu" "z13")
+       (eq_attr "mnemonic" "mdtr")) "nothing")
+
+(define_insn_reservation "z13_8" 8
+  (and (eq_attr "cpu" "z13")
+       (eq_attr "mnemonic" "wfddb,ddb,vfddb,ddbr")) "nothing")
+
+(define_insn_reservation "z13_9" 9
+  (and (eq_attr "cpu" "z13")
+       (eq_attr "mnemonic" "dsgr,wfsqdb,dsgfr,sqdb,sqdbr,vfsqdb")) "nothing")
+
+(define_insn_reservation "z13_13" 13
+  (and (eq_attr "cpu" "z13")
+       (eq_attr "mnemonic" "mxtr,ddtr")) "nothing")
+
+(define_insn_reservation "z13_16" 16
+  (and (eq_attr "cpu" "z13")
+       (eq_attr "mnemonic" "sqxbr")) "nothing")
+
+(define_insn_reservation "z13_17" 17
+  (and (eq_attr "cpu" "z13")
+       (eq_attr "mnemonic" "dxtr")) "nothing")
+
+(define_insn_reservation "z13_20" 20
+  (and (eq_attr "cpu" "z13")
+       (eq_attr "mnemonic" "dxbr,dlgr")) "nothing")
+
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index b1ab0c0..c2e59f5 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -340,6 +340,19 @@ extern int reload_completed;
 
 /* Kept up to date using the SCHED_VARIABLE_ISSUE hook.  */
 static rtx_insn *last_scheduled_insn;
+#define MAX_SCHED_UNITS 3
+static int last_scheduled_unit_distance[MAX_SCHED_UNITS];
+
+/* The maximum score added for an instruction whose unit hasn't been
+   in use for MAX_SCHED_MIX_DISTANCE steps.  Increase this value to
+   give instruction mix scheduling more priority over instruction
+   grouping.  */
+#define MAX_SCHED_MIX_SCORE      8
+
+/* The maximum distance up to which individual scores will be
+   calculated.  Everything beyond this gives MAX_SCHED_MIX_SCORE.
+   Increase this with the OOO windows size of the machine.  */
+#define MAX_SCHED_MIX_DISTANCE 100
 
 /* Structure used to hold the components of a S/390 memory
    address.  A legitimate address on S/390 is of the general
@@ -13560,27 +13573,66 @@ s390_z10_prevent_earlyload_conflicts (rtx_insn **ready, int *nready_p)
 
 static int s390_sched_state;
 
-#define S390_OOO_SCHED_STATE_NORMAL  3
-#define S390_OOO_SCHED_STATE_CRACKED 4
+#define S390_SCHED_STATE_NORMAL  3
+#define S390_SCHED_STATE_CRACKED 4
 
-#define S390_OOO_SCHED_ATTR_MASK_CRACKED    0x1
-#define S390_OOO_SCHED_ATTR_MASK_EXPANDED   0x2
-#define S390_OOO_SCHED_ATTR_MASK_ENDGROUP   0x4
-#define S390_OOO_SCHED_ATTR_MASK_GROUPALONE 0x8
+#define S390_SCHED_ATTR_MASK_CRACKED    0x1
+#define S390_SCHED_ATTR_MASK_EXPANDED   0x2
+#define S390_SCHED_ATTR_MASK_ENDGROUP   0x4
+#define S390_SCHED_ATTR_MASK_GROUPALONE 0x8
 
 static unsigned int
 s390_get_sched_attrmask (rtx_insn *insn)
 {
   unsigned int mask = 0;
 
-  if (get_attr_ooo_cracked (insn))
-    mask |= S390_OOO_SCHED_ATTR_MASK_CRACKED;
-  if (get_attr_ooo_expanded (insn))
-    mask |= S390_OOO_SCHED_ATTR_MASK_EXPANDED;
-  if (get_attr_ooo_endgroup (insn))
-    mask |= S390_OOO_SCHED_ATTR_MASK_ENDGROUP;
-  if (get_attr_ooo_groupalone (insn))
-    mask |= S390_OOO_SCHED_ATTR_MASK_GROUPALONE;
+  switch (s390_tune)
+    {
+    case PROCESSOR_2827_ZEC12:
+      if (get_attr_zEC12_cracked (insn))
+	mask |= S390_SCHED_ATTR_MASK_CRACKED;
+      if (get_attr_zEC12_expanded (insn))
+	mask |= S390_SCHED_ATTR_MASK_EXPANDED;
+      if (get_attr_zEC12_endgroup (insn))
+	mask |= S390_SCHED_ATTR_MASK_ENDGROUP;
+      if (get_attr_zEC12_groupalone (insn))
+	mask |= S390_SCHED_ATTR_MASK_GROUPALONE;
+      break;
+    case PROCESSOR_2964_Z13:
+      if (get_attr_z13_cracked (insn))
+	mask |= S390_SCHED_ATTR_MASK_CRACKED;
+      if (get_attr_z13_expanded (insn))
+	mask |= S390_SCHED_ATTR_MASK_EXPANDED;
+      if (get_attr_z13_endgroup (insn))
+	mask |= S390_SCHED_ATTR_MASK_ENDGROUP;
+      if (get_attr_z13_groupalone (insn))
+	mask |= S390_SCHED_ATTR_MASK_GROUPALONE;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  return mask;
+}
+
+static unsigned int
+s390_get_unit_mask (rtx_insn *insn, int *units)
+{
+  unsigned int mask = 0;
+
+  switch (s390_tune)
+    {
+    case PROCESSOR_2964_Z13:
+      *units = 3;
+      if (get_attr_z13_unit_lsu (insn))
+	mask |= 1 << 0;
+      if (get_attr_z13_unit_fxu (insn))
+	mask |= 1 << 1;
+      if (get_attr_z13_unit_vfu (insn))
+	mask |= 1 << 2;
+      break;
+    default:
+      gcc_unreachable ();
+    }
   return mask;
 }
 
@@ -13598,48 +13650,66 @@ s390_sched_score (rtx_insn *insn)
     case 0:
       /* Try to put insns into the first slot which would otherwise
 	 break a group.  */
-      if ((mask & S390_OOO_SCHED_ATTR_MASK_CRACKED) != 0
-	  || (mask & S390_OOO_SCHED_ATTR_MASK_EXPANDED) != 0)
+      if ((mask & S390_SCHED_ATTR_MASK_CRACKED) != 0
+	  || (mask & S390_SCHED_ATTR_MASK_EXPANDED) != 0)
 	score += 5;
-      if ((mask & S390_OOO_SCHED_ATTR_MASK_GROUPALONE) != 0)
+      if ((mask & S390_SCHED_ATTR_MASK_GROUPALONE) != 0)
 	score += 10;
     case 1:
       /* Prefer not cracked insns while trying to put together a
 	 group.  */
-      if ((mask & S390_OOO_SCHED_ATTR_MASK_CRACKED) == 0
-	  && (mask & S390_OOO_SCHED_ATTR_MASK_EXPANDED) == 0
-	  && (mask & S390_OOO_SCHED_ATTR_MASK_GROUPALONE) == 0)
+      if ((mask & S390_SCHED_ATTR_MASK_CRACKED) == 0
+	  && (mask & S390_SCHED_ATTR_MASK_EXPANDED) == 0
+	  && (mask & S390_SCHED_ATTR_MASK_GROUPALONE) == 0)
 	score += 10;
-      if ((mask & S390_OOO_SCHED_ATTR_MASK_ENDGROUP) == 0)
+      if ((mask & S390_SCHED_ATTR_MASK_ENDGROUP) == 0)
 	score += 5;
       break;
     case 2:
       /* Prefer not cracked insns while trying to put together a
 	 group.  */
-      if ((mask & S390_OOO_SCHED_ATTR_MASK_CRACKED) == 0
-	  && (mask & S390_OOO_SCHED_ATTR_MASK_EXPANDED) == 0
-	  && (mask & S390_OOO_SCHED_ATTR_MASK_GROUPALONE) == 0)
+      if ((mask & S390_SCHED_ATTR_MASK_CRACKED) == 0
+	  && (mask & S390_SCHED_ATTR_MASK_EXPANDED) == 0
+	  && (mask & S390_SCHED_ATTR_MASK_GROUPALONE) == 0)
 	score += 10;
       /* Prefer endgroup insns in the last slot.  */
-      if ((mask & S390_OOO_SCHED_ATTR_MASK_ENDGROUP) != 0)
+      if ((mask & S390_SCHED_ATTR_MASK_ENDGROUP) != 0)
 	score += 10;
       break;
-    case S390_OOO_SCHED_STATE_NORMAL:
+    case S390_SCHED_STATE_NORMAL:
       /* Prefer not cracked insns if the last was not cracked.  */
-      if ((mask & S390_OOO_SCHED_ATTR_MASK_CRACKED) == 0
-	  && (mask & S390_OOO_SCHED_ATTR_MASK_EXPANDED) == 0)
+      if ((mask & S390_SCHED_ATTR_MASK_CRACKED) == 0
+	  && (mask & S390_SCHED_ATTR_MASK_EXPANDED) == 0)
 	score += 5;
-      if ((mask & S390_OOO_SCHED_ATTR_MASK_GROUPALONE) != 0)
+      if ((mask & S390_SCHED_ATTR_MASK_GROUPALONE) != 0)
 	score += 10;
       break;
-    case S390_OOO_SCHED_STATE_CRACKED:
+    case S390_SCHED_STATE_CRACKED:
       /* Try to keep cracked insns together to prevent them from
 	 interrupting groups.  */
-      if ((mask & S390_OOO_SCHED_ATTR_MASK_CRACKED) != 0
-	  || (mask & S390_OOO_SCHED_ATTR_MASK_EXPANDED) != 0)
+      if ((mask & S390_SCHED_ATTR_MASK_CRACKED) != 0
+	  || (mask & S390_SCHED_ATTR_MASK_EXPANDED) != 0)
 	score += 5;
       break;
     }
+
+  if (s390_tune == PROCESSOR_2964_Z13)
+    {
+      int units, i;
+      unsigned unit_mask, m = 1;
+
+      unit_mask = s390_get_unit_mask (insn, &units);
+      gcc_assert (units <= MAX_SCHED_UNITS);
+
+      /* Add a score in range 0..MAX_SCHED_MIX_SCORE depending on how long
+	 ago the last insn of this unit type got scheduled.  This is
+	 supposed to help providing a proper instruction mix to the
+	 CPU.  */
+      for (i = 0; i < units; i++, m <<= 1)
+	if (m & unit_mask)
+	  score += (last_scheduled_unit_distance[i] * MAX_SCHED_MIX_SCORE /
+		    MAX_SCHED_MIX_DISTANCE);
+    }
   return score;
 }
 
@@ -13695,12 +13765,12 @@ s390_sched_reorder (FILE *file, int verbose,
 
 	      if (verbose > 5)
 		fprintf (file,
-			 "move insn %d to the top of list\n",
+			 ";;\t\tBACKEND: move insn %d to the top of list\n",
 			 INSN_UID (ready[last_index]));
 	    }
 	  else if (verbose > 5)
 	    fprintf (file,
-		     "best insn %d already on top\n",
+		     ";;\t\tBACKEND: best insn %d already on top\n",
 		     INSN_UID (ready[last_index]));
 	}
 
@@ -13711,16 +13781,35 @@ s390_sched_reorder (FILE *file, int verbose,
 
 	  for (i = last_index; i >= 0; i--)
 	    {
-	      if (recog_memoized (ready[i]) < 0)
+	      unsigned int sched_mask;
+	      rtx_insn *insn = ready[i];
+
+	      if (recog_memoized (insn) < 0)
 		continue;
-	      fprintf (file, "insn %d score: %d: ", INSN_UID (ready[i]),
-		       s390_sched_score (ready[i]));
-#define PRINT_OOO_ATTR(ATTR) fprintf (file, "%s ", get_attr_##ATTR (ready[i]) ? #ATTR : "!" #ATTR);
-	      PRINT_OOO_ATTR (ooo_cracked);
-	      PRINT_OOO_ATTR (ooo_expanded);
-	      PRINT_OOO_ATTR (ooo_endgroup);
-	      PRINT_OOO_ATTR (ooo_groupalone);
-#undef PRINT_OOO_ATTR
+
+	      sched_mask = s390_get_sched_attrmask (insn);
+	      fprintf (file, ";;\t\tBACKEND: insn %d score: %d: ",
+		       INSN_UID (insn),
+		       s390_sched_score (insn));
+#define PRINT_SCHED_ATTR(M, ATTR) fprintf (file, "%s ",\
+					   ((M) & sched_mask) ? #ATTR : "");
+	      PRINT_SCHED_ATTR (S390_SCHED_ATTR_MASK_CRACKED, cracked);
+	      PRINT_SCHED_ATTR (S390_SCHED_ATTR_MASK_EXPANDED, expanded);
+	      PRINT_SCHED_ATTR (S390_SCHED_ATTR_MASK_ENDGROUP, endgroup);
+	      PRINT_SCHED_ATTR (S390_SCHED_ATTR_MASK_GROUPALONE, groupalone);
+#undef PRINT_SCHED_ATTR
+	      if (s390_tune == PROCESSOR_2964_Z13)
+		{
+		  unsigned int unit_mask, m = 1;
+		  int units, j;
+
+		  unit_mask  = s390_get_unit_mask (insn, &units);
+		  fprintf (file, "(units:");
+		  for (j = 0; j < units; j++, m <<= 1)
+		    if (m & unit_mask)
+		      fprintf (file, " u%d", j);
+		  fprintf (file, ")");
+		}
 	      fprintf (file, "\n");
 	    }
 	}
@@ -13745,12 +13834,12 @@ s390_sched_variable_issue (FILE *file, int verbose, rtx_insn *insn, int more)
     {
       unsigned int mask = s390_get_sched_attrmask (insn);
 
-      if ((mask & S390_OOO_SCHED_ATTR_MASK_CRACKED) != 0
-	  || (mask & S390_OOO_SCHED_ATTR_MASK_EXPANDED) != 0)
-	s390_sched_state = S390_OOO_SCHED_STATE_CRACKED;
-      else if ((mask & S390_OOO_SCHED_ATTR_MASK_ENDGROUP) != 0
-	       || (mask & S390_OOO_SCHED_ATTR_MASK_GROUPALONE) != 0)
-	s390_sched_state = S390_OOO_SCHED_STATE_NORMAL;
+      if ((mask & S390_SCHED_ATTR_MASK_CRACKED) != 0
+	  || (mask & S390_SCHED_ATTR_MASK_EXPANDED) != 0)
+	s390_sched_state = S390_SCHED_STATE_CRACKED;
+      else if ((mask & S390_SCHED_ATTR_MASK_ENDGROUP) != 0
+	       || (mask & S390_SCHED_ATTR_MASK_GROUPALONE) != 0)
+	s390_sched_state = S390_SCHED_STATE_NORMAL;
       else
 	{
 	  /* Only normal insns are left (mask == 0).  */
@@ -13759,30 +13848,73 @@ s390_sched_variable_issue (FILE *file, int verbose, rtx_insn *insn, int more)
 	    case 0:
 	    case 1:
 	    case 2:
-	    case S390_OOO_SCHED_STATE_NORMAL:
-	      if (s390_sched_state == S390_OOO_SCHED_STATE_NORMAL)
+	    case S390_SCHED_STATE_NORMAL:
+	      if (s390_sched_state == S390_SCHED_STATE_NORMAL)
 		s390_sched_state = 1;
 	      else
 		s390_sched_state++;
 
 	      break;
-	    case S390_OOO_SCHED_STATE_CRACKED:
-	      s390_sched_state = S390_OOO_SCHED_STATE_NORMAL;
+	    case S390_SCHED_STATE_CRACKED:
+	      s390_sched_state = S390_SCHED_STATE_NORMAL;
 	      break;
 	    }
 	}
+
+      if (s390_tune == PROCESSOR_2964_Z13)
+	{
+	  int units, i;
+	  unsigned unit_mask, m = 1;
+
+	  unit_mask = s390_get_unit_mask (insn, &units);
+	  gcc_assert (units <= MAX_SCHED_UNITS);
+
+	  for (i = 0; i < units; i++, m <<= 1)
+	    if (m & unit_mask)
+	      last_scheduled_unit_distance[i] = 0;
+	    else if (last_scheduled_unit_distance[i] < MAX_SCHED_MIX_DISTANCE)
+	      last_scheduled_unit_distance[i]++;
+	}
+
       if (verbose > 5)
 	{
-	  fprintf (file, "insn %d: ", INSN_UID (insn));
-#define PRINT_OOO_ATTR(ATTR)						\
-	  fprintf (file, "%s ", get_attr_##ATTR (insn) ? #ATTR : "");
-	  PRINT_OOO_ATTR (ooo_cracked);
-	  PRINT_OOO_ATTR (ooo_expanded);
-	  PRINT_OOO_ATTR (ooo_endgroup);
-	  PRINT_OOO_ATTR (ooo_groupalone);
-#undef PRINT_OOO_ATTR
-	  fprintf (file, "\n");
-	  fprintf (file, "sched state: %d\n", s390_sched_state);
+	  unsigned int sched_mask;
+
+	  sched_mask = s390_get_sched_attrmask (insn);
+
+	  fprintf (file, ";;\t\tBACKEND: insn %d: ", INSN_UID (insn));
+#define PRINT_SCHED_ATTR(M, ATTR) fprintf (file, "%s ", ((M) & sched_mask) ? #ATTR : "");
+	  PRINT_SCHED_ATTR (S390_SCHED_ATTR_MASK_CRACKED, cracked);
+	  PRINT_SCHED_ATTR (S390_SCHED_ATTR_MASK_EXPANDED, expanded);
+	  PRINT_SCHED_ATTR (S390_SCHED_ATTR_MASK_ENDGROUP, endgroup);
+	  PRINT_SCHED_ATTR (S390_SCHED_ATTR_MASK_GROUPALONE, groupalone);
+#undef PRINT_SCHED_ATTR
+
+	  if (s390_tune == PROCESSOR_2964_Z13)
+	    {
+	      unsigned int unit_mask, m = 1;
+	      int units, j;
+
+	      unit_mask  = s390_get_unit_mask (insn, &units);
+	      fprintf (file, "(units:");
+	      for (j = 0; j < units; j++, m <<= 1)
+		if (m & unit_mask)
+		  fprintf (file, " %d", j);
+	      fprintf (file, ")");
+	    }
+	  fprintf (file, " sched state: %d\n", s390_sched_state);
+
+	  if (s390_tune == PROCESSOR_2964_Z13)
+	    {
+	      int units, j;
+
+	      s390_get_unit_mask (insn, &units);
+
+	      fprintf (file, ";;\t\tBACKEND: units unused for: ");
+	      for (j = 0; j < units; j++)
+		fprintf (file, "%d:%d ", j, last_scheduled_unit_distance[j]);
+	      fprintf (file, "\n");
+	    }
 	}
     }
 
@@ -13799,6 +13931,7 @@ s390_sched_init (FILE *file ATTRIBUTE_UNUSED,
 		 int max_ready ATTRIBUTE_UNUSED)
 {
   last_scheduled_insn = NULL;
+  memset (last_scheduled_unit_distance, 0, MAX_SCHED_UNITS * sizeof (int));
   s390_sched_state = 0;
 }
 
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 6f0e172..9d76e61 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -512,6 +512,9 @@
 ;; Pipeline description for zEC12
 (include "2827.md")
 
+;; Pipeline description for z13
+(include "2964.md")
+
 ;; Predicates
 (include "predicates.md")
 
-- 
1.9.1
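
The instruction-mix heuristic this patch adds for z13 can be tried outside of GCC. The following is a minimal standalone sketch, not the code from s390.c: the three constant values are assumptions chosen for illustration (their definitions are not part of this excerpt), and mix_score, note_issue and reset_mix_state are hypothetical names standing in for the corresponding parts of s390_sched_score, s390_sched_variable_issue and s390_sched_init.

```c
#include <assert.h>
#include <string.h>

/* Assumed values for illustration only; the real definitions are
   outside this excerpt.  */
#define MAX_SCHED_UNITS 3
#define MAX_SCHED_MIX_SCORE 8
#define MAX_SCHED_MIX_DISTANCE 100

/* For each unit, the number of insns issued since it was last used,
   saturating at MAX_SCHED_MIX_DISTANCE.  */
static int last_scheduled_unit_distance[MAX_SCHED_UNITS];

/* Hypothetical counterpart of the memset added to s390_sched_init.  */
static void
reset_mix_state (void)
{
  memset (last_scheduled_unit_distance, 0,
	  sizeof last_scheduled_unit_distance);
}

/* Bonus in the range 0..MAX_SCHED_MIX_SCORE per unit: the longer a
   unit named in UNIT_MASK has been idle, the higher the bonus.  */
static int
mix_score (unsigned unit_mask, int units)
{
  int score = 0;
  unsigned m = 1;

  assert (units <= MAX_SCHED_UNITS);
  for (int i = 0; i < units; i++, m <<= 1)
    if (m & unit_mask)
      score += (last_scheduled_unit_distance[i] * MAX_SCHED_MIX_SCORE
		/ MAX_SCHED_MIX_DISTANCE);
  return score;
}

/* Record that an insn using UNIT_MASK was issued: units it uses are
   reset to distance 0, all other units age by one step.  */
static void
note_issue (unsigned unit_mask, int units)
{
  unsigned m = 1;

  assert (units <= MAX_SCHED_UNITS);
  for (int i = 0; i < units; i++, m <<= 1)
    if (m & unit_mask)
      last_scheduled_unit_distance[i] = 0;
    else if (last_scheduled_unit_distance[i] < MAX_SCHED_MIX_DISTANCE)
      last_scheduled_unit_distance[i]++;
}
```

With these assumed constants, calling reset_mix_state () and then note_issue (0x1, MAX_SCHED_UNITS) fifty times leaves unit 0 at distance 0 and units 1 and 2 at distance 50. An insn for unit 1 then gets a bonus of 50 * 8 / 100 = 4, while another unit 0 insn gets none, so the scheduler is nudged towards the unit that has been idle.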

* [PATCH 4/9] S/390: Adjust movstr-1.c testcase to work with the z13 stpcpy implementation.
  2016-02-17 18:51 [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes Andreas Krebbel
                   ` (5 preceding siblings ...)
  2016-02-17 18:51 ` [PATCH 5/9] S/390: z13 fix mode in vcond expansion Andreas Krebbel
@ 2016-02-17 18:51 ` Andreas Krebbel
  2016-02-17 18:51 ` [PATCH 7/9] S/390: z13 Change predicates of 128 bit add sub Andreas Krebbel
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Andreas Krebbel @ 2016-02-17 18:51 UTC (permalink / raw)
  To: gcc-patches

2016-02-17  Andreas Krebbel  <krebbel@linux.vnet.ibm.com>

	* gcc.target/s390/md/movstr-1.c: Also accept the z13 string
	instruction pattern name so that the testcase does not fail with
	-march=z13.
---
 gcc/testsuite/gcc.target/s390/md/movstr-1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/s390/md/movstr-1.c b/gcc/testsuite/gcc.target/s390/md/movstr-1.c
index 7da749b..da98415 100644
--- a/gcc/testsuite/gcc.target/s390/md/movstr-1.c
+++ b/gcc/testsuite/gcc.target/s390/md/movstr-1.c
@@ -9,7 +9,7 @@ void test(char *dest, const char *src)
   __builtin_stpcpy (dest, src);
 }
 
-/* { dg-final { scan-assembler-times {{[*]movstr}} 1 } } */
+/* { dg-final { scan-assembler-times {{[*]movstr}|{vec_vfenesv16qi}} 1 } } */
 
 #define LEN 200
 char buf[LEN];
-- 
1.9.1

* Re: [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes
  2016-02-17 18:51 [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes Andreas Krebbel
                   ` (8 preceding siblings ...)
  2016-02-17 18:51 ` [PATCH 9/9] S/390: z13 Add missing commutative operand markers Andreas Krebbel
@ 2016-02-18  8:38 ` Richard Biener
  9 siblings, 0 replies; 11+ messages in thread
From: Richard Biener @ 2016-02-18  8:38 UTC (permalink / raw)
  To: Andreas Krebbel; +Cc: GCC Patches

On Wed, Feb 17, 2016 at 7:51 PM, Andreas Krebbel
<krebbel@linux.vnet.ibm.com> wrote:
> I'm having this patchset in my local tree for quite a while now.
> Posting it was so far prevented by some internal process hurdles.  I'm
> aware it isn't stage 4 material.  I nevertheless would like to commit
> this since:
>
> * It is z13 only and z13 support was new in GCC 6 anyway.  The risk to
>   cause regressions for other cpu levels is small (hopefully).
>
> * It is required to get rid of some nasty performance regressions
>   which can be observed with -march=z13 otherwise.
>
> Any objections?

The bugfixes are obviously fine; the rest is up to the s390x maintainers.

Richard.

> Bye,
>
> -Andreas-
>
> Andreas Krebbel (9):
>   S/390: Add IBM z13 pipeline description
>   S/390: z13 lcbb fix address operand.
>   S/390: z13 inline stpcpy implementation.
>   S/390: Adjust movstr-1.c testcase to work with the z13 stpcpy
>     implementation.
>   S/390: z13 fix mode in vcond expansion
>   S/390: Add vec_sub_u128 to vecintrin.h
>   S/390: z13 Change predicates of 128 bit add sub.
>   S/390: Add single element vector types to iterators.
>   S/390: z13 Add missing commutative operand markers.
>
>  gcc/config/s390/2827.md                            |   9 +-
>  gcc/config/s390/2964.md                            |  64 ++++
>  gcc/config/s390/s390-protos.h                      |   1 +
>  gcc/config/s390/s390.c                             | 381 +++++++++++++++++----
>  gcc/config/s390/s390.md                            |  19 +-
>  gcc/config/s390/vecintrin.h                        |   1 +
>  gcc/config/s390/vector.md                          |  60 ++--
>  gcc/config/s390/vx-builtins.md                     |  56 +--
>  gcc/testsuite/gcc.target/s390/md/movstr-1.c        |   2 +-
>  gcc/testsuite/gcc.target/s390/md/movstr-2.c        |  98 ++++++
>  gcc/testsuite/gcc.target/s390/vector/int128-1.c    |  47 +++
>  gcc/testsuite/gcc.target/s390/vector/vec-vcond-1.c |  23 ++
>  12 files changed, 628 insertions(+), 133 deletions(-)
>  create mode 100644 gcc/config/s390/2964.md
>  create mode 100644 gcc/testsuite/gcc.target/s390/md/movstr-2.c
>  create mode 100644 gcc/testsuite/gcc.target/s390/vector/int128-1.c
>  create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-vcond-1.c
>
> --
> 1.9.1
>

Thread overview: 11+ messages
2016-02-17 18:51 [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes Andreas Krebbel
2016-02-17 18:51 ` [PATCH 3/9] S/390: z13 inline stpcpy implementation Andreas Krebbel
2016-02-17 18:51 ` [PATCH 6/9] S/390: Add vec_sub_u128 to vecintrin.h Andreas Krebbel
2016-02-17 18:51 ` [PATCH 2/9] S/390: z13 lcbb fix address operand Andreas Krebbel
2016-02-17 18:51 ` [PATCH 1/9] S/390: Add IBM z13 pipeline description Andreas Krebbel
2016-02-17 18:51 ` [PATCH 8/9] S/390: Add single element vector types to iterators Andreas Krebbel
2016-02-17 18:51 ` [PATCH 5/9] S/390: z13 fix mode in vcond expansion Andreas Krebbel
2016-02-17 18:51 ` [PATCH 4/9] S/390: Adjust movstr-1.c testcase to work with the z13 stpcpy implementation Andreas Krebbel
2016-02-17 18:51 ` [PATCH 7/9] S/390: z13 Change predicates of 128 bit add sub Andreas Krebbel
2016-02-17 18:51 ` [PATCH 9/9] S/390: z13 Add missing commutative operand markers Andreas Krebbel
2016-02-18  8:38 ` [PATCH 0/9] S/390: z13 pipeline description, stpcpy + bugfixes Richard Biener