public inbox for gcc-patches@gcc.gnu.org
* [PATCH] Various store-merging improvements (PR tree-optimization/88709, PR tree-optimization/90271)
@ 2019-05-06 13:22 Jakub Jelinek
  2019-05-06 18:49 ` Richard Biener
  0 siblings, 1 reply; 2+ messages in thread
From: Jakub Jelinek @ 2019-05-06 13:22 UTC (permalink / raw)
  To: Richard Biener, Jeff Law; +Cc: gcc-patches

Hi!

The following patch adds several store-merging improvements.
Previously, only stores whose lhs was an
ARRAY_REF, ARRAY_RANGE_REF, MEM_REF, COMPONENT_REF or BIT_FIELD_REF
were considered; as the testcases show, it is useful to consider
stores into a whole decl as well.
Another thing is that we only considered stores whose rhs was a
constant with a fixed-size mode that native_encode_expr can handle.
This patch extends that to handle = {} stores (but not clobbers).  While
native_encode_expr doesn't handle those, they can be handled very easily
(it is a memory clearing), and we don't need the machine mode
restrictions for that case either.
With those two changes, we can handle stuff like:
  var = {};
  var.b = 1;
  var.d = 2;
and turn that into
  MEM [(struct S *)&var] = some_constant;
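A concrete instance of this shape (essentially what the new
store_merging_27.c test below exercises; the function and helper names
here are only illustrative) is:
  struct S { char buf[8]; };
  void use (struct S *);

  void
  foo (void)
  {
    struct S s = {};   /* store of an empty CONSTRUCTOR into the whole decl */
    s.buf[1] = 1;
    s.buf[3] = 2;
    use (&s);          /* keep s live so the stores aren't removed */
  }
where the clearing and the two byte stores can now be replaced by one or
two wider constant stores instead of three separate ones.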
We do have a param for how many stores we consider together, and when all
the stores had a fixed size, that naturally implied a reasonably small size
of the whole store group area, for which we e.g. precompute the bytes.  With
= {} having an almost arbitrary size, I feared we could easily trigger
out-of-memory errors at compile time just by having var = {} for, say, a
2GB variable followed by var.a = 1;, allocating 4GB of RAM (or even larger
cases for 64-bit hosts+targets).  So, the patch adds another parameter
(with a default) to limit the size of a group; above that boundary we just
don't merge multiple stores into a group.
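The new --param store-merging-max-size can be adjusted on the command line
if the 65536-byte default ever turns out to be too small or too large; the
value below is just an illustration:
  gcc -O2 --param store-merging-max-size=4096 foo.c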
Another thing is, if we have a large store first, typically = {} of the
whole object, followed by some further stores, then either the initial
clearing is small and we manage to turn the stores into even smaller ones
that cover the complete object, or the = {} is large and we'd often just
punt because the result would contain more stores than the original (we
count the = {} as one store, and although it can sometimes be expanded by
pieces, at other times it is far more efficient to clear everything and
overwrite a few things).  This means that in various cases the patch could
even regress the store merging done, because previously we wouldn't even
consider the = {}, while now we place it into the same store group: before,
the = {} stayed separate and we merged the rest, while now we just punt.
This patch solves that by special-casing a clear of the whole area followed
by some stores into that area.  We check which alternative yields fewer
stores: handling the = {} together with the rest, or treating the = {}
separately from the rest.  In the latter case we can do whatever we used to
do before, but can actually do even better: we know the memory area is
already pre-cleared, so when computing how many stores we will emit, how
large they are and how they are aligned, we can treat all the zero bytes in
the combined value to be stored as padding bytes (gaps); zeros have already
been stored there, so we don't need to overwrite them, but if it is
beneficial we can overwrite them with further zeros.  When later actually
emitting the individual stores, we don't need to treat those bytes as
padding, so we can avoid a read-modify-write cycle; we know those bytes are
zero.
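A minimal sketch of this case (the new store_merging_29.c test below is the
full version; function and helper names here are only illustrative):
  struct T { char a[1024]; };
  void use (struct T *);

  void
  foo (void)
  {
    struct T t = {};   /* = {} of the whole object, kept as the first store */
    t.a[52] = 2;       /* the zero bytes around these can be treated as     */
    t.a[54] = 1;       /* gaps, because the area is known to be zero        */
    t.a[95] = 3;
    t.a[96] = 4;
    use (&t);
  }
The = {} stays as a single clearing of the whole object, and the four byte
stores (plus intervening known-zero bytes, where that gives better size or
alignment) can then be merged into a few wider stores instead of being
emitted one by one.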

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-05-06  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/88709
	PR tree-optimization/90271
	* params.def (PARAM_STORE_MERGING_MAX_SIZE): New parameter.
	* gimple-ssa-store-merging.c (encode_tree_to_bitpos): Handle
	non-clobber CONSTRUCTORs with no elts.  Remove useless tmp_int
	variable.
	(imm_store_chain_info::coalesce_immediate_stores): Punt if the size
	of the store merging group is larger than
	PARAM_STORE_MERGING_MAX_SIZE parameter.
	(split_group): Add bzero_first argument.  If set, always emit first
	the first store which must be = {} of the whole area and then for the
	rest of the stores consider all zero bytes as padding.
	(imm_store_chain_info::output_merged_store): Check if first store
	is = {} of the whole area and if yes, determine which setting of
	bzero_first for split_group gives smaller number of stores.  Adjust
	split_group callers.
	(lhs_valid_for_store_merging_p): Allow decls.
	(rhs_valid_for_store_merging_p): Allow non-clobber CONSTRUCTORs with
	no elts.
	(pass_store_merging::process_store): Likewise.

	* gcc.dg/store_merging_26.c: New test.
	* gcc.dg/store_merging_27.c: New test.
	* gcc.dg/store_merging_28.c: New test.
	* gcc.dg/store_merging_29.c: New test.

--- gcc/params.def.jj	2019-05-06 09:45:57.014019177 +0200
+++ gcc/params.def	2019-05-06 10:06:33.036183432 +0200
@@ -1205,6 +1205,11 @@ DEFPARAM (PARAM_MAX_STORES_TO_MERGE,
 	  "store merging pass.",
 	  64, 2, 0)
 
+DEFPARAM (PARAM_STORE_MERGING_MAX_SIZE,
+	  "store-merging-max-size",
+	  "Maximum size of a single store merging region in bytes.",
+	  65536, 1, 1)
+
 DEFPARAM (PARAM_MAX_TAIL_MERGE_ITERATIONS,
           "max-tail-merge-iterations",
           "Maximum amount of iterations of the pass over a function.",
--- gcc/gimple-ssa-store-merging.c.jj	2019-05-06 09:45:57.905004886 +0200
+++ gcc/gimple-ssa-store-merging.c	2019-05-06 10:06:33.037183416 +0200
@@ -1615,13 +1615,31 @@ encode_tree_to_bitpos (tree expr, unsign
 		       unsigned int total_bytes)
 {
   unsigned int first_byte = bitpos / BITS_PER_UNIT;
-  tree tmp_int = expr;
   bool sub_byte_op_p = ((bitlen % BITS_PER_UNIT)
 			|| (bitpos % BITS_PER_UNIT)
 			|| !int_mode_for_size (bitlen, 0).exists ());
+  bool empty_ctor_p
+    = (TREE_CODE (expr) == CONSTRUCTOR
+       && CONSTRUCTOR_NELTS (expr) == 0
+       && TYPE_SIZE_UNIT (TREE_TYPE (expr))
+		       && tree_fits_uhwi_p (TYPE_SIZE_UNIT (TREE_TYPE (expr))));
 
   if (!sub_byte_op_p)
-    return native_encode_expr (tmp_int, ptr + first_byte, total_bytes) != 0;
+    {
+      if (first_byte >= total_bytes)
+	return false;
+      total_bytes -= first_byte;
+      if (empty_ctor_p)
+	{
+	  unsigned HOST_WIDE_INT rhs_bytes
+	    = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (expr)));
+	  if (rhs_bytes > total_bytes)
+	    return false;
+	  memset (ptr + first_byte, '\0', rhs_bytes);
+	  return true;
+	}
+      return native_encode_expr (expr, ptr + first_byte, total_bytes) != 0;
+    }
 
   /* LITTLE-ENDIAN
      We are writing a non byte-sized quantity or at a position that is not
@@ -1667,14 +1685,29 @@ encode_tree_to_bitpos (tree expr, unsign
 
   /* We must be dealing with fixed-size data at this point, since the
      total size is also fixed.  */
-  fixed_size_mode mode = as_a <fixed_size_mode> (TYPE_MODE (TREE_TYPE (expr)));
+  unsigned int byte_size;
+  if (empty_ctor_p)
+    {
+      unsigned HOST_WIDE_INT rhs_bytes
+	= tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (expr)));
+      if (rhs_bytes > total_bytes)
+	return false;
+      byte_size = rhs_bytes;
+    }
+  else
+    {
+      fixed_size_mode mode
+	= as_a <fixed_size_mode> (TYPE_MODE (TREE_TYPE (expr)));
+      byte_size = GET_MODE_SIZE (mode);
+    }
   /* Allocate an extra byte so that we have space to shift into.  */
-  unsigned int byte_size = GET_MODE_SIZE (mode) + 1;
+  byte_size++;
   unsigned char *tmpbuf = XALLOCAVEC (unsigned char, byte_size);
   memset (tmpbuf, '\0', byte_size);
   /* The store detection code should only have allowed constants that are
-     accepted by native_encode_expr.  */
-  if (native_encode_expr (expr, tmpbuf, byte_size - 1) == 0)
+     accepted by native_encode_expr or empty ctors.  */
+  if (!empty_ctor_p
+      && native_encode_expr (expr, tmpbuf, byte_size - 1) == 0)
     gcc_unreachable ();
 
   /* The native_encode_expr machinery uses TYPE_MODE to determine how many
@@ -2671,6 +2704,8 @@ imm_store_chain_info::coalesce_immediate
 
   FOR_EACH_VEC_ELT (m_store_info, i, info)
     {
+      unsigned HOST_WIDE_INT new_bitregion_start, new_bitregion_end;
+
       if (i <= ignore)
 	goto done;
 
@@ -2702,7 +2737,14 @@ imm_store_chain_info::coalesce_immediate
 	    }
 	}
 
-      if (info->order >= merged_store->first_nonmergeable_order)
+      new_bitregion_start
+	= MIN (merged_store->bitregion_start, info->bitregion_start);
+      new_bitregion_end
+	= MAX (merged_store->bitregion_end, info->bitregion_end);
+
+      if (info->order >= merged_store->first_nonmergeable_order
+	  || (((new_bitregion_end - new_bitregion_start + 1) / BITS_PER_UNIT)
+	      > (unsigned) PARAM_VALUE (PARAM_STORE_MERGING_MAX_SIZE)))
 	;
 
       /* |---store 1---|
@@ -3184,12 +3226,15 @@ count_multiple_uses (store_immediate_inf
    Return number of new stores.
    If ALLOW_UNALIGNED_STORE is false, then all stores must be aligned.
    If ALLOW_UNALIGNED_LOAD is false, then all loads must be aligned.
+   BZERO_FIRST may be true only when the first store covers the whole group
+   and clears it; if BZERO_FIRST is true, keep that first store in the set
+   unmodified and emit further stores for the overrides only.
    If SPLIT_STORES is NULL, it is just a dry run to count number of
    new stores.  */
 
 static unsigned int
 split_group (merged_store_group *group, bool allow_unaligned_store,
-	     bool allow_unaligned_load,
+	     bool allow_unaligned_load, bool bzero_first,
 	     vec<struct split_store *> *split_stores,
 	     unsigned *total_orig,
 	     unsigned *total_new)
@@ -3207,6 +3252,7 @@ split_group (merged_store_group *group,
   if (group->stores[0]->rhs_code == LROTATE_EXPR
       || group->stores[0]->rhs_code == NOP_EXPR)
     {
+      gcc_assert (!bzero_first);
       /* For bswap framework using sets of stores, all the checking
 	 has been done earlier in try_coalesce_bswap and needs to be
 	 emitted as a single store.  */
@@ -3278,10 +3324,26 @@ split_group (merged_store_group *group,
       if (group->load_align[i])
 	group_load_align = MIN (group_load_align, group->load_align[i]);
 
+  if (bzero_first)
+    {
+      first = 1;
+      ret = 1;
+      if (split_stores)
+	{
+	  struct split_store *store
+	    = new split_store (bytepos, group->stores[0]->bitsize, align_base);
+	  store->orig_stores.safe_push (group->stores[0]);
+	  store->orig = true;
+	  any_orig = true;
+	  split_stores->safe_push (store);
+	}
+    }
+
   while (size > 0)
     {
       if ((allow_unaligned_store || group_align <= BITS_PER_UNIT)
-	  && group->mask[try_pos - bytepos] == (unsigned char) ~0U)
+	  && (group->mask[try_pos - bytepos] == (unsigned char) ~0U
+	      || (bzero_first && group->val[try_pos - bytepos] == 0)))
 	{
 	  /* Skip padding bytes.  */
 	  ++try_pos;
@@ -3348,7 +3410,9 @@ split_group (merged_store_group *group,
       /* Now look for whole padding bytes at the end of that bitsize.  */
       for (nonmasked = try_size / BITS_PER_UNIT; nonmasked > 0; --nonmasked)
 	if (group->mask[try_pos - bytepos + nonmasked - 1]
-	    != (unsigned char) ~0U)
+	    != (unsigned char) ~0U
+	    && (!bzero_first
+		|| group->val[try_pos - bytepos + nonmasked - 1] != 0))
 	  break;
       if (nonmasked == 0)
 	{
@@ -3367,7 +3431,9 @@ split_group (merged_store_group *group,
 	  /* Now look for whole padding bytes at the start of that bitsize.  */
 	  unsigned int try_bytesize = try_size / BITS_PER_UNIT, masked;
 	  for (masked = 0; masked < try_bytesize; ++masked)
-	    if (group->mask[try_pos - bytepos + masked] != (unsigned char) ~0U)
+	    if (group->mask[try_pos - bytepos + masked] != (unsigned char) ~0U
+		&& (!bzero_first
+		    || group->val[try_pos - bytepos + masked] != 0))
 	      break;
 	  masked *= BITS_PER_UNIT;
 	  gcc_assert (masked < try_size);
@@ -3583,20 +3649,44 @@ imm_store_chain_info::output_merged_stor
   bool allow_unaligned_store
     = !STRICT_ALIGNMENT && PARAM_VALUE (PARAM_STORE_MERGING_ALLOW_UNALIGNED);
   bool allow_unaligned_load = allow_unaligned_store;
-  if (allow_unaligned_store)
+  bool bzero_first = false;
+  if (group->stores[0]->rhs_code == INTEGER_CST
+      && TREE_CODE (gimple_assign_rhs1 (group->stores[0]->stmt)) == CONSTRUCTOR
+      && CONSTRUCTOR_NELTS (gimple_assign_rhs1 (group->stores[0]->stmt)) == 0
+      && group->start == group->stores[0]->bitpos
+      && group->width == group->stores[0]->bitsize
+      && (group->start % BITS_PER_UNIT) == 0
+      && (group->width % BITS_PER_UNIT) == 0)
+    bzero_first = true;
+  if (allow_unaligned_store || bzero_first)
     {
       /* If unaligned stores are allowed, see how many stores we'd emit
 	 for unaligned and how many stores we'd emit for aligned stores.
-	 Only use unaligned stores if it allows fewer stores than aligned.  */
-      unsigned aligned_cnt
-	= split_group (group, false, allow_unaligned_load, NULL, NULL, NULL);
-      unsigned unaligned_cnt
-	= split_group (group, true, allow_unaligned_load, NULL, NULL, NULL);
-      if (aligned_cnt <= unaligned_cnt)
+	 Only use unaligned stores if it allows fewer stores than aligned.
+	 Similarly, if there is a whole region clear first, prefer expanding
+	 it together compared to expanding clear first followed by merged
+	 further stores.  */
+      unsigned cnt[4] = { ~0, ~0, ~0, ~0 };
+      int pass_min = 0;
+      for (int pass = 0; pass < 4; ++pass)
+	{
+	  if (!allow_unaligned_store && (pass & 1) != 0)
+	    continue;
+	  if (!bzero_first && (pass & 2) != 0)
+	    continue;
+	  cnt[pass] = split_group (group, (pass & 1) != 0,
+				   allow_unaligned_load, (pass & 2) != 0,
+				   NULL, NULL, NULL);
+	  if (cnt[pass] < cnt[pass_min])
+	    pass_min = pass;
+	}
+      if ((pass_min & 1) == 0)
 	allow_unaligned_store = false;
+      if ((pass_min & 2) == 0)
+	bzero_first = false;
     }
   unsigned total_orig, total_new;
-  split_group (group, allow_unaligned_store, allow_unaligned_load,
+  split_group (group, allow_unaligned_store, allow_unaligned_load, bzero_first,
 	       &split_stores, &total_orig, &total_new);
 
   if (split_stores.length () >= orig_num_stmts)
@@ -4164,7 +4254,8 @@ lhs_valid_for_store_merging_p (tree lhs)
   tree_code code = TREE_CODE (lhs);
 
   if (code == ARRAY_REF || code == ARRAY_RANGE_REF || code == MEM_REF
-      || code == COMPONENT_REF || code == BIT_FIELD_REF)
+      || code == COMPONENT_REF || code == BIT_FIELD_REF
+      || DECL_P (lhs))
     return true;
 
   return false;
@@ -4178,6 +4269,12 @@ static bool
 rhs_valid_for_store_merging_p (tree rhs)
 {
   unsigned HOST_WIDE_INT size;
+  if (TREE_CODE (rhs) == CONSTRUCTOR
+      && !TREE_CLOBBER_P (rhs)
+      && CONSTRUCTOR_NELTS (rhs) == 0
+      && TYPE_SIZE_UNIT (TREE_TYPE (rhs))
+      && tree_fits_uhwi_p (TYPE_SIZE_UNIT (TREE_TYPE (rhs))))
+    return true;
   return (GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (rhs))).is_constant (&size)
 	  && native_encode_expr (rhs, NULL, size) != 0);
 }
@@ -4363,7 +4460,9 @@ pass_store_merging::process_store (gimpl
   bool invalid = (base_addr == NULL_TREE
 		  || (maybe_gt (bitsize,
 				(unsigned int) MAX_BITSIZE_MODE_ANY_INT)
-		      && (TREE_CODE (rhs) != INTEGER_CST)));
+		      && TREE_CODE (rhs) != INTEGER_CST
+		      && (TREE_CODE (rhs) != CONSTRUCTOR
+			  || CONSTRUCTOR_NELTS (rhs) != 0)));
   enum tree_code rhs_code = ERROR_MARK;
   bool bit_not_p = false;
   struct symbolic_number n;
--- gcc/testsuite/gcc.dg/store_merging_26.c.jj	2019-05-06 12:28:59.513106665 +0200
+++ gcc/testsuite/gcc.dg/store_merging_26.c	2019-05-06 12:40:29.773998446 +0200
@@ -0,0 +1,36 @@
+/* PR tree-optimization/90271 */
+/* { dg-do run { target int32 } } */
+/* { dg-require-effective-target store_merge } */
+/* { dg-options "-O2 -fdump-tree-store-merging-details" } */
+/* { dg-final { scan-tree-dump "New sequence of 1 stores to replace old one of 2 stores" "store-merging" } } */
+
+__attribute__((noipa)) void
+foo (int *x)
+{
+  asm volatile ("" : : "r" (x) : "memory");
+}
+
+__attribute__((noipa)) int
+bar ()
+{
+  int x;
+  foo (&x);
+  x = 3;
+  ((char *) &x)[1] = 1;
+  foo (&x);
+  return x;
+}
+
+int
+main ()
+{
+  int x;
+  foo (&x);
+  x = 3;
+  foo (&x);
+  ((char *) &x)[1] = 1;
+  foo (&x);
+  if (x != bar ())
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.dg/store_merging_27.c.jj	2019-05-06 13:53:20.396771304 +0200
+++ gcc/testsuite/gcc.dg/store_merging_27.c	2019-05-06 13:55:24.178785079 +0200
@@ -0,0 +1,26 @@
+/* PR tree-optimization/88709 */
+/* { dg-do run } */
+/* { dg-require-effective-target store_merge } */
+/* { dg-options "-O2 -fdump-tree-store-merging-details" } */
+/* { dg-final { scan-tree-dump "New sequence of \[12] stores to replace old one of 3 stores" "store-merging" } } */
+
+struct S { char buf[8]; };
+
+__attribute__((noipa)) void
+bar (struct S *x)
+{
+  int i;
+  for (i = 0; i < 8; i++)
+    if (x->buf[i] != ((i == 1) + (i == 3) * 2))
+      __builtin_abort ();
+}
+
+int
+main ()
+{
+  struct S s = {};
+  s.buf[1] = 1;
+  s.buf[3] = 2;
+  bar (&s);
+  return 0;
+}
--- gcc/testsuite/gcc.dg/store_merging_28.c.jj	2019-05-06 13:55:40.531522687 +0200
+++ gcc/testsuite/gcc.dg/store_merging_28.c	2019-05-06 14:00:20.408031875 +0200
@@ -0,0 +1,44 @@
+/* PR tree-optimization/88709 */
+/* { dg-do compile } */
+/* { dg-require-effective-target store_merge } */
+/* { dg-options "-O2 -fno-ipa-icf -fdump-tree-store-merging-details" } */
+/* { dg-final { scan-tree-dump-times "New sequence of \[24] stores to replace old one of 16 stores" 8 "store-merging" { target { i?86-*-* x86_64-*-* } } } } */
+/* { dg-final { scan-tree-dump-times "New sequence of \[24] stores to replace old one of 6 stores" 1 "store-merging" } } */
+
+typedef struct S { char data[16]; } S;
+void optimize_me (S);
+void optimize_me3 (S, S, S);
+
+void
+good ()
+{
+  optimize_me ((S) { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 });
+}
+
+void
+bad ()
+{
+  optimize_me ((S) { 1, 2, 3, 4, 5 });
+}
+
+void
+why ()
+{
+  optimize_me ((S) { 1, 2, 3, 4, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 });
+}
+
+void
+srsly ()
+{
+  optimize_me3 ((S) { 1, 2, 3, 4, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
+		(S) { 11, 12, 13, 14, 15, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10 },
+		(S) { 21, 22, 23, 24, 25, 20, 20, 20, 10, 20, 20, 20, 20, 20, 20 });
+}
+
+void
+srsly_not_one_missing ()
+{
+  optimize_me3 ((S) { 1, 2, 3, 4, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
+		(S) { 11, 12, 13, 14, 15, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10 },
+		(S) { 21, 22, 23, 24, 25, 20, 20, 20, 10, 20, 20, 20, 20, 20, 20, 11 });
+}
--- gcc/testsuite/gcc.dg/store_merging_29.c.jj	2019-05-06 14:08:12.484457081 +0200
+++ gcc/testsuite/gcc.dg/store_merging_29.c	2019-05-06 14:11:22.739404299 +0200
@@ -0,0 +1,33 @@
+/* PR tree-optimization/88709 */
+/* { dg-do run { target int32 } } */
+/* { dg-require-effective-target store_merge } */
+/* { dg-options "-O2 -fdump-tree-store-merging-details" } */
+/* { dg-final { scan-tree-dump "New sequence of 3 stores to replace old one of 6 stores" "store-merging" { target le } } } */
+/* { dg-final { scan-tree-dump "New sequence of \[34] stores to replace old one of 6 stores" "store-merging" { target be } } } */
+
+struct T { char a[1024]; };
+
+__attribute__((noipa)) void
+bar (struct T *t)
+{
+  int x = 0x506;
+  if (__builtin_memcmp (&t->a[97], &x, sizeof (x)))
+    __builtin_abort ();
+  __builtin_memset (&t->a[97], '\0', sizeof (x));
+  for (int i = 0; i < 8; ++i)
+    if (t->a[i] != ((i == 54) + 2 * (i == 52) + 3 * (i == 95) + 4 * (i == 96)))
+      __builtin_abort ();
+}
+
+int
+main ()
+{
+  struct T t = {};
+  t.a[54] = 1;
+  t.a[52] = 2;
+  t.a[95] = 3;
+  t.a[96] = 4;
+  int x = 0x506;
+  __builtin_memcpy (&t.a[97], &x, sizeof (x));
+  bar (&t);
+}

	Jakub


* Re: [PATCH] Various store-merging improvements (PR tree-optimization/88709, PR tree-optimization/90271)
  2019-05-06 13:22 [PATCH] Various store-merging improvements (PR tree-optimization/88709, PR tree-optimization/90271) Jakub Jelinek
@ 2019-05-06 18:49 ` Richard Biener
  0 siblings, 0 replies; 2+ messages in thread
From: Richard Biener @ 2019-05-06 18:49 UTC (permalink / raw)
  To: Jakub Jelinek, Jeff Law; +Cc: gcc-patches

On May 6, 2019 3:22:42 PM GMT+02:00, Jakub Jelinek <jakub@redhat.com> wrote:
>[...]
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok. 

Richard. 

